GIS-based analysis of obesity and the built environment in the US.

Author: Xu, Yanqing
  1. Introduction

    The current obesity epidemic has become a significant contributing factor to several leading causes of morbidity and mortality, including heart disease, stroke, diabetes, and some cancers (Zhang, Lu, and Holt 2011). If the prevalence of obesity continues to rise, more than 75% of the US population will be overweight within a few years, and more than 40% will be obese. The cost of treating obesity-related illnesses was estimated at $147 billion in 2008. The U.S. Department of Health and Human Services (HHS) awarded more than $119 million to states and US territories to stimulate obesity research and support public health efforts aimed at increasing physical activity and reducing obesity (e.g., Casagrande et al. 2011; Chi et al. 2013; Flegal et al. 2010; Lopez 2007; Maroko et al. 2009; Rose et al. 2009; Wang, Wen, and Xu 2013; Yamada et al. 2012). Obesity has also become a worldwide research hotspot, with case studies in the United Kingdom (Edwards et al. 2010; Fraser et al. 2012), Taiwan (Wen, Chen, and Tsai 2010; Chen and Truong 2012), Greece (Chalkias et al. 2013), and the Netherlands (Dijkstra et al. 2013).

    Previous research in public health, transportation, and urban planning has highlighted the important relationship between environmental factors and people's physical activity at a variety of spatial scales (Feng et al. 2010; Li et al. 2008; Rutt and Coleman 2005; Yamada et al. 2012). For example, researchers have concluded that built-environment attributes, especially walkability, are consistently related to physical activity in general, and to 'active transportation' in particular (Casagrande et al. 2011; Smith et al. 2008). Owen et al. (2004) suggested that accessibility of recreation facilities, opportunities for activities, and aesthetics were related to physical activities such as walking. Saelens, Sallis, and Frank (2003) found that ease of pedestrian access to nearby destinations was related to active transportation choices, particularly walking. In addition, Geographic Information Systems (GIS) have recently been used in public health studies, for example to measure the spatial distribution of accessibility to public resources (e.g., Giles-Corti and Donovan 2002).

    Regression models were used to study the relationship between obesity and such environmental factors as fast-food density (Rose et al. 2009), land-use pattern (Heath et al. 2006; Duncan et al. 2010), poverty (Maroko et al. 2009), and walkability (Casagrande et al. 2011). However, in public health studies covering large areas, such as the entire US, regression models could be spatially non-stationary, meaning that the coefficients of the regression model vary over space (Brunsdon, Fotheringham, and Charlton 1998). In this case, local regression models such as Geographically Weighted Regression (GWR; Fotheringham, Brunsdon, and Charlton 2002) could be used to avoid the 'ecological fallacy' problem (Holt et al. 1996) and explain the variability of obesity. In addition, we could gain a better understanding of the phenomenon by interpreting the spatial pattern of the coefficients (Brunsdon, Fotheringham, and Charlton 1998). Maroko et al. (2009) examined the relationship between park accessibility and socioeconomic status characteristics such as poverty, language barrier, population density, and percent of minority ethnic groups in New York City by using global and GWR models. They found only a weak relationship between park accessibility and physical activity variables and obesity rate. Their results suggest the existence of spatial non-stationarity in the regression models. GWR has been demonstrated to be an effective tool to analyze obesity in a geographical context (Chalkias et al. 2013; Chen and Truong 2012; Chi et al. 2013; Dijkstra et al. 2013; Edwards et al. 2010; Fraser et al. 2012; Wen, Chen, and Tsai 2010). However, very few studies have addressed the obesity problem across the continental US. Chi et al. (2013) used GWR and k-means clustering analysis to examine the association of the food environment and some other socioeconomic variables with obesity in the US. Their work set a basis for a new analysis framework, namely using agglomerates to explain the spatial patterns of the regression coefficients. The built environment factors, however, were not their focus. In public health, the built environment is the key to integrating the physical environments of communities with the health and wellness of residents. The Centers for Disease Control and Prevention (CDC), the World Health Organization (WHO), and other health organizations have recognized the importance of walkability for reducing obesity.
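
    To make the idea of spatially varying coefficients concrete, the following is a minimal sketch of a geographically weighted fit, not the implementation used in this study. It assumes a single predictor, a Gaussian distance kernel, and a fixed bandwidth; the function name gwr_simple and the toy data are hypothetical.

```python
import math


def gwr_simple(coords, x, y, bandwidth):
    """Fit y ~ intercept + slope * x separately at every location.

    coords    : list of (east, north) pairs, one per observation
    x, y      : predictor and response values, same length as coords
    bandwidth : kernel bandwidth in the same units as coords (assumed fixed)
    Returns a list of (intercept, slope) pairs, one per location.
    """
    results = []
    for (ci, cj) in coords:
        # Gaussian kernel: observations near (ci, cj) get more weight.
        w = [math.exp(-0.5 * (math.hypot(ci - pi, cj - pj) / bandwidth) ** 2)
             for (pi, pj) in coords]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        if sxx == 0:
            raise ValueError("predictor has no local variation")
        slope = sxy / sxx
        results.append((my - slope * mx, slope))
    return results


# Toy data: two distant clusters with opposite relationships.
coords = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [2, 4, 6, -1, -2, -3]
local = gwr_simple(coords, x, y, bandwidth=5.0)
print([round(slope, 3) for _, slope in local])
```

    A single global regression on these toy data would report one averaged slope, whereas the local fit recovers a positive slope in one cluster and a negative slope in the other, which is the kind of non-stationarity discussed above.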

    In this research, we focus on the built environment, which is measured by street connectivity and the walk score, in order to conduct a nationwide analysis of the obesity problem in the US. Street connectivity is a proxy for walkability. Walk score is the measurement of physical activity, i.e., 'access' to nearby amenities on foot. We hope to contribute to the existing body of knowledge by answering the following research questions: 'Are these built environment variables closely and consistently correlated with the obesity problem throughout the US?' and 'If not, can geographic analysis based on a local regression model help policy making?' In this paper, we present procedures for variable selection, model interpretation, and a regionalized analysis of the built environment and the obesity problem in the US at the county scale. We then provide some insights and a discussion of the methodology.

  2. Variables and data sources

    2.1. Choice of variables

    In the regression analysis, the dependent variable is the obesity rate, and the independent variables include four built environment variables--walkability, urbanicity (Vlahov and Galea 2002), street connectivity, and the ratio of fast-food/full-service restaurants (fast-food restaurant ratio)--and two sociodemographic variables: poverty rate and ethnic heterogeneity. These variables have been discussed in previous research. The counties in Alaska and Hawaii were excluded from the data analysis because they are spatial outliers for the local regression model. Obesity rate is calculated from the obesity data provided by the Diabetes Interactive Atlases ( atlas/) along with other data such as diagnosed diabetes and leisure-time physical inactivity at county level. Self-reported weights and heights were converted to BMI: BMI = mass (kg) / [height (m)]^2. Respondents were considered obese if their BMI values were over 30 (Flegal et al. 2010; CDC 2013). The obesity rate is the ratio of the obese population to the total population in each county. Three built environment variables were used in the analysis: street connectivity, walk score, and the ratio of fast-food/full-service restaurants.
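
    The BMI formula and the obesity-rate ratio described above can be sketched as follows. This is an illustration only: the function names and the sample respondents are hypothetical, and the cutoff is applied as "over 30" exactly as stated in the text.

```python
def bmi(mass_kg, height_m):
    """Body mass index: mass in kilograms divided by squared height in metres."""
    return mass_kg / height_m ** 2


def obesity_rate(respondents, threshold=30.0):
    """Share of respondents whose BMI is over the threshold.

    respondents : list of (mass_kg, height_m) pairs
    threshold   : BMI cutoff; strict '>' follows the wording of the text.
                  Use '>=' instead if the cutoff is meant to be inclusive.
    """
    if not respondents:
        raise ValueError("no respondents")
    obese = sum(1 for mass, height in respondents if bmi(mass, height) > threshold)
    return obese / len(respondents)


# Hypothetical respondents for one county: (mass in kg, height in m).
sample = [(70, 1.75), (100, 1.75), (95, 1.60), (60, 1.70)]
print(round(bmi(70, 1.75), 2))
print(obesity_rate(sample))
```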

    2.1.1. Street connectivity

    Connectivity is defined by the number of intersections along a certain street network or in an area. Strictly, two-way connections are not intersections. Therefore, only those intersections with three connecting edges and the starting or ending nodes of the street network were included in the connectivity index calculation (Wang, Wen, and Xu 2013). We used intersection density to measure street connectivity.

    SC_i = number of intersections / area    (1)
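
    As a minimal sketch of Equation (1), the snippet below counts qualifying nodes from a list of street segments and divides by area. It assumes each segment is given as a pair of node identifiers, and it reads "three connecting edges" as three or more; both are assumptions, and the function names are hypothetical.

```python
from collections import Counter


def count_intersections(segments):
    """Count nodes that qualify as intersections.

    segments : list of (node_a, node_b) pairs, one per street segment
    A node with exactly two connecting segments is a pass-through point and
    is skipped. Nodes with one segment (start or end of the network) and
    nodes with three or more segments (assumed reading) are counted.
    """
    degree = Counter()
    for a, b in segments:
        degree[a] += 1
        degree[b] += 1
    return sum(1 for d in degree.values() if d == 1 or d >= 3)


def intersection_density(segments, area):
    """Equation (1): number of intersections divided by area."""
    if area <= 0:
        raise ValueError("area must be positive")
    return count_intersections(segments) / area


# Hypothetical network: B only joins two segments, so it is not counted.
streets = [("A", "B"), ("B", "C"), ("C", "D"), ("C", "E")]
print(count_intersections(streets))
print(intersection_density(streets, area=2.0))
```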

    Intersection density corresponds closely to block size--the greater the intersection density, the smaller the blocks. Small blocks make a neighborhood walkable. Street network density and intersection density are highly and positively correlated with each other (Aurbach 2010). Different areas have different patterns of intersection density; the differences become larger as street network density decreases from urban to suburban and then rural areas. Intersection density is measured at the census tract level and then aggregated to the county level, adjusted by population.
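
    The text does not spell out how the population adjustment is done. One plausible reading is a population-weighted mean of the tract values, sketched below as an assumption rather than the procedure actually used; the function name, county identifiers, and numbers are hypothetical.

```python
from collections import defaultdict


def aggregate_to_county(tracts):
    """Population-weighted mean of tract-level intersection density per county.

    tracts : iterable of (county_id, population, density) tuples
    Returns a dict mapping county_id to the weighted mean density.
    Counties whose tracts have zero total population are omitted.
    """
    weighted_sum = defaultdict(float)
    total_pop = defaultdict(float)
    for county_id, population, density in tracts:
        weighted_sum[county_id] += population * density
        total_pop[county_id] += population
    return {c: weighted_sum[c] / total_pop[c]
            for c in weighted_sum if total_pop[c] > 0}


# Hypothetical tracts: (county id, tract population, intersections per unit area).
tracts = [("001", 1000, 50.0), ("001", 3000, 10.0), ("002", 500, 20.0)]
print(aggregate_to_county(tracts))
```

    Weighting by population keeps a sparsely populated tract with an unusual street pattern from dominating the county value, which is one reason such an adjustment might be preferred over a simple average.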

    2.1.2. Walkability

    Walkability is measured by the Walk Score (http://www.), which calculates walking distance from a point of interest to nearby amenities. The Walk Score algorithm has been used in many public health studies (Brewster et al. 2009; Cortright 2009; Duncan et al. 2011; Jones 2010; Kirby et al. 2012; Kumar 2009; Li, Wen, and Henry 2014; Rauterkus, Thrall, and Hangen 2010; Zhu and Lee 2008). Brewster et al. (2009) showed that neighborhood Walk Score is correlated with the level of physical exercise, and hence could predict the levels of obesity, hypertension, and diabetes. Jones (2010) studied the Walk Score and its association with activity levels and found that the Walk Score is correlated with the GIS-derived walkability index (r = 0.63 p

    The Walk Score algorithm requires user input to locate amenities such as restaurants, grocery stores, schools, parks, and movie theaters, which in this research are sourced from public domain map providers--Google,, Open Street Map, and Localeze. The algorithm calculates a weighted average of Euclidean distances from a point of interest to the amenities. The weights are determined by facility type priority and a certain distance decay function (Front Seat 2014). Walk Score ranges from 0 (the lowest) to 100 (the highest). Locations with a Walk Score of 0-49 are car dependent: 0-24 means almost all errands require a car, and 25-49 means a few amenities are within walking distance. A Walk Score of 50-69 means some amenities are within walking distance, 70-89 means most errands can be accomplished on foot, and a score between 90 and 100 is a walker's paradise where daily errands do not require a car (Front Seat 2014).
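
    The score bands described above can be encoded as a small lookup, for instance when summarizing scores by class. This sketch only restates the ranges given in the text; the function name is hypothetical and the labels are paraphrased from the band descriptions.

```python
def walk_score_category(score):
    """Map a Walk Score (0-100) to the descriptive band given in the text."""
    if not 0 <= score <= 100:
        raise ValueError("Walk Score must be between 0 and 100")
    if score < 25:
        return "car dependent: almost all errands require a car"
    if score < 50:
        return "car dependent: a few amenities within walking distance"
    if score < 70:
        return "some amenities within walking distance"
    if score < 90:
        return "most errands can be accomplished on foot"
    return "walker's paradise"


for s in (10, 37, 55, 82, 95):
    print(s, "->", walk_score_category(s))
```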

    Front Seat (2014) provides an application programming interface (API) to query the Walk Score database through URL calls, eliminating the need to interact manually with the website interface. We developed a Python program that automatically requests walk...
