Abstract
The declining pattern of population density from city centres to the outskirts has been widely observed in American cities. Such a pattern reflects a trade-off between housing price/commuting cost and employment. However, most previous studies in urban population density functions are based on the Euclidean distance, and do not consider commuting cost in cities. This study provides an empirical evaluation of the classic population density functions in 382 metropolitan statistical areas (MSA) in the USA using travel times to city centres as the independent variable. The major findings of the study are: (1) the negative exponential function has the overall best fit for population density in the MSAs; (2) the Gaussian and exponential functions tend to fit larger MSAs, while the power function has better performance for small MSAs; (3) most of the MSAs appear to show a decentralisation trend during 1990–2016, and larger MSAs tend to have a higher rate of decentralisation. This study leverages crowdsourced geospatial data to provide empirical evidence of the classic urban economic models. The findings will increase our understanding about urban morphology, population–job displacement and urban decentralisation. The findings also provide baseline information to monitor and predict the changing trend of urban population distribution that could be driven by future environmental and technological changes.
Introduction
Urban population density is a fundamental topic of continual interest in urban economics, geography and urban planning. Accurate quantification of urban population density can increase the understanding about the spatial relation between population and employment (Small and Song, 1994) and provide support for modelling urban accessibility (Guy, 1983), decentralisation (Garcia-López and Muñiz, 2013; Griffith and Wong, 2007; Mills, 1970), urban sprawl (Qiang and Lam, 2016) and land cover change (Jiao, 2015). This research stream dates back to the pioneering work by Clark (1951), which uses negative exponential function to describe the declining trend of population density from city centres. In urban economic theories, this pattern is often interpreted as a sign that employment is concentrated in urban centres so residential areas near the centres are more desirable because the cost of home–work commuting is low (Mills, 1970; Muth, 1969). The seminal work of Clark aroused broad interests in mathematical modelling of urban population density (e.g. Batty and Sik Kim, 1992; Chen, 2010; Griffith, 1981b; Muñiz et al., 2003; Wang and Zhou, 1999). Early competitors of Clark’s exponential model include the inverse square function, which was conceptualised from the gravity model in social physics (Stewart and Warntz, 1958). Batty and Sik Kim (1992) argued that the inverse power function is more theoretically solid for urban population density given the fractal characteristics of urban morphology. Additionally, Newling (1969) suggested to add a quadratic component to the negative exponential function to describe the U shape of the population crater near a city centre. Analogously, Guy (1983) and Ingram (1971) proposed to use the Gaussian function (also a quadratic exponential function) to model accessibility of population to shopping centres. Song (1996) demonstrated that the Gaussian function outperformed the exponential and power function in modelling population density in the Reno-Sparks metropolitan area. A comprehensive review of studies about population density function can be found in Martori et al. (2002).
In addition to the monocentric functions, conceptual and empirical studies were conducted to model polycentric cities in new urban economics frameworks. Noticeable work in this direction includes the model proposed by Griffith (1981a, 1981b), which incorporates trend surface analysis to characterise population density gradients in polycentric urban systems where satellite city centres emerge in peripheral areas. Other model specifications incorporate a spatial autoregression model to address the issue of spatial autocorrelation (Anselin and Can, 1986), which can be considered a local form of polycentricity. Later, Griffith and Wong (2007) refined their model by taking into account spatial autocorrelation and using Minkowskian distance to describe the elasticity of space. Muñiz et al. (2003) argued that a cubic spline function can provide a more realistic portrayal of the urban population pattern in polycentric cities. Other work modelling urban population density in the polycentric framework includes (Chen, 1997; Feng et al., 2009; Gordon et al., 1986; Muñiz et al., 2008).
Despite the fact that the declining pattern of population density from city centres (or sub-centres) is often interpreted as the trade-off between commuting cost and housing price, most of the previous studies are based on the Euclidean distance, which does not perfectly represent real commuting cost. Most travelling in cities is in restricted roads and may vary with the road conditions and network (Wang, 1998), traffic speed, land cover and terrain (Griffith and Wong, 2007). Population density functions estimated from different cities are not comparable because of the variation in these conditions. Given the scarcity of groundtruth data, there is a general lack of empirical validation of the population density functions based on more realistic metrics of commuting cost such as travel distance or travel time. Moreover, the previous studies only focused on one city or a small number of cities. The largest sample size we identified in the literature is in the study by Griffith and Wong (2007), which modelled population density of the 20 largest metropolitan areas in the USA. Other than this, there is no nationwide assessment of the population density functions with a large sample of cities in different sizes. Furthermore, most empirical studies of population density function were conducted in the late 1990s and early 2000s, using even earlier data sets. Given the rapid urban population growth, transportation and technological development of the past decades, a study with up-to-date data is needed to re-evaluate the classical models in modelling the current shapes of US cities of different sizes.
To fill the above-mentioned gaps in the literature, this study provides an empirical evaluation of population density functions in the 382 Metropolitan Statistical Areas (MSAs) in the USA using travel times to MSA centres as the independent variable. The travel times were acquired from the crowdsourced mobility data in Google Maps, which can be considered a more realistic metric of commuting cost than the Euclidean distance. Three commonly used population density functions were evaluated and their goodness-of-fit compared among the MSAs. The overall best-fit function was selected to analyse the variation of population distribution among the MSAs as well as the temporal change of population distribution from 1990 to 2016. In general, this study aims to address the following questions: (1) are the classic population density functions still valid when real travel time is used as the independent variable? (2) Which function can best describe the current form of US cities? (3) Can all US cities of different sizes be modelled by the same function? (4) What are the temporal changes of population distribution in US cities from 1990 to 2016?
Data acquisition and processing
Metropolitan statistical area
The 382 Metropolitan Statistical Areas defined by the United States Office of Management and Budget were analysed in this study. According to the US Census Bureau (2010), an MSA is defined as a Core Based Statistical Area associated with at least one urbanised area that has a population of at least 50,000. An MSA comprises the central county or counties containing the core, plus adjacent outlying counties having a high degree of social and economic integration with the central county or counties as measured through commuting. The identification of the centre of a city is a challenge. Most studies consider the city centre as a point and apply various methods to estimate the location of the point. For instance, Batty and Sik Kim (1992) applied centrographic analysis to locate the centre of London, which is the mean centre estimated from a set of population-weighted points. This centrographic approach assumes that the city centre is located among the most populated districts, which is not always the case. In some cities, the employment centre may deviate from the population centre. In other studies, city centres are defined as the centroid of the Central Business District (CBD) (e.g. Chen, 2010; Muñiz et al., 2003) or landmarks that are widely perceived as the city centres (e.g. Wang and Zhou, 1999; Wang et al., 2018).
In this study, given the large number of cities, manual identification of the CBD centres or central landmarks was impractical and could be subjective, based on authors’ personal opinions. Instead, the geocoding API of Google Maps® was used for automated identification of the centres of the MSAs. The geocoding process includes the following steps. First, the name of the principal city in an MSA is extracted from the full MSA name. For instance, ‘New York’ was extracted from the full MSA name ‘New York-Newark-Jersey City, NY-NJ-PA’. Next, the principal city name is concatenated with ‘Downtown’ and the state abbreviation, leading to the string ‘New York Downtown, NY’ denoting the centre of the New York MSA. Finally, the concatenated strings of all MSA centres were used as inputs for the geocoding API, which returned the geometric centroids (longitude and latitude) of the downtown areas of the MSAs. The returned geometric centroids were defined as the centres of the MSAs.
Population data
Population data from the census tracts of 1990 and 2016 were used to estimate the population density functions. We assume that a 26-year interval is sufficiently long to capture the fundamental changes in population pattern in US cities, avoiding short-term fluctuations. Before 1990, some newly developed cities were still rural areas and these do not have enough census tracts for the quantitative modelling. The 2016 population data are from the 2012 to 2016 5-Year Data of the American Community Survey (available at: https://www.census.gov/), which were the most recent population data at the time of study and temporally close to the travel time data. The 1990 data were obtained from the National Historical Geographic Information System (NHGIS) hosted in the Minnesota Population Center (available at: https://www.nhgis.org/). Land cover data were used to eliminate uninhabitable areas (including water body, aquatic bed, emergent herbaceous wetland and perennial ice/snow) from the total areas of the census tracts. The land cover data for the contiguous USA and Alaska were obtained from the 2011 National Land Cover Database (NLCD, available at: https://www.mrlc.gov/). The land cover data for Hawaii were acquired from the 2010 data set of NOAA’s Coastal Change Analysis Program (C-CAP, available at: https://coast.noaa.gov/ccapftp/#/). The areas of the census tracts excluding the uninhabitable areas were used as the denominator in the calculation of population density.
Travel time
Travel times from census tract centroids (origin) to the geocoded MSA centres (destination) were acquired from the Direction API of Google Maps. The API takes the coordinates of the origin and destination as the input, and exports the driving time between them as the output. The travel times were requested with the default setting of the API, which returns driving times on the shortest paths in average traffic conditions. Figure 1a illustrates the travel times from the census tract centroids to the geocoded centre of the New York MSA. The travel times returned by the API are estimated from the aggregated movements of individuals using Google Maps during their travels. The travel times take into account the road network topological structure, connectivity and accessibility, as well as average traffic congestion on the road. Thus, travel time can be considered a more realistic metric of commuting cost than Euclidean distance. Figure 1b shows that the travel times from the census tracts to the centre of New York do not have a perfect linear relationship with the Euclidean distances. A clear shift in the relationship at about 40 min (10 km) from the centre can be visually identified, which indicates a systematic slowing of travel speed (km/min) near the city centre. The relationship between travel time and Euclidean distance is influenced by the density and configuration of the road network, traffic conditions and environmental obstacles (e.g. water bodies or steep terrain), which vary in different cities. Such a non-linear relationship justifies the use of travel time as the independent variable in population density functions.

(a) Driving time from census tracts to the centre of the New York MSA. (b) Scatter plot and regression lines between travel times and Euclidean distances from the census tracts to the centre of the New York MSA. The solid line denotes the regression line of census tracts within 40 min to the centre, while the dashed line represents census tracts further than 40 min to the centre.
As the census tract boundaries in 1990 and 2016 are not identical, the driving times were queried for both data sets, leading to 115,422 (61,629 for 2016 and 53,793 for 1990) requests to the Google direction API. According to the new billing model of the Google Maps APIs since July 2018 (Google, 2019), the total cost of the requests is US$577.11 (US$ 0.0005/request × 115,442). Google offers US$300 free activation credits and US$200 monthly free credits per billing-activated account. The total requests for the census tracts can be paid off with the free credits of two Google accounts. Additional funding support is needed for the higher cost of analyses at finer spatial resolutions (e.g. block groups and blocks).
Analysis
Functions
The negative exponential, inverse power and Gaussian function were evaluated in this study. The independent variable of the functions (denoted as x) is travel time to a city centre and the dependent variable (y) is population density (Table 1). The negative exponential function (Clark, 1951) and inverse power function (Batty and Sik Kim, 1992) both describe a monotonic decrease of population density from the centre, where the coefficient a describes the scale of the relationship (i.e. the height of the curve) and b determines the rate of decline (i.e. the shape of the curve). With its origin in social physics, the inverse power function was being widely exploited in gravitational models of traffic flow and spatial interaction (Foot, 2017), where the parameter b describes the elasticity of the curve. In contrast, the b parameter in the exponential function indicates density gradient, which is the percentage change in density (y) for a marginal increase in distance (dx) (Batty and Sik Kim, 1992). Density gradient has successfully characterised most North American cities in the 20th century. The value of the density gradient varies from city to city with city size, topography, political and socio-economic contexts. A comprehensive review of the conceptualisation and applications of density gradient can be found in (McDonald, 1997). Additionally, the Gaussian function composes the exponential function with a concave quadratic component to describe a bell-shaped curve. The coefficient a is a scale factor describing the height of the bell-shaped curve, b indicates the position of the curve peak in the x dimension, c defines the shape of the curve. The flexibility of the Gaussian function allows it to describe a non-monotonous population pattern in a city (e.g. a central population crater).
The three evaluated functions and coefficient constraints.
Data aggregation
Owing to the non-random distribution of census tracts, the estimation of population density functions using the original sample of census tracts may lead to a severe upward bias (Frankena, 1978). As shown in Figure 2a, the number of census tracts in the New York MSA peaks at around 40 min and declines when moving in both directions. Similar distributions can be found in other MSAs. The function estimated with the census tracts may over-fit the middle range of travel times around 40 min where the sample frequency is high, and under-fit the two tails. Several approaches were suggested to mitigate the sampling bias, including weighting spatial units in proportion to their areas in the regression (e.g. Frankena, 1978; Wang and Zhou, 1999). Taking the New York MSA as an example (see Figure 2a), the weighted approach would offset the bias by assigning higher weights to the large census tracts distant (in travel time) from the centre. The census tracts near the centre (<30 min) are still under-represented in the function estimation. Additionally, the common indices of goodness-of-fit (such as R2 and RMSE) are not applicable for functions estimated using the weighted approach. Wang et al. (2018) suggested a Monte Carlo simulation approach to resample population from census units to uniform spatial units (e.g. square and hexagon tessellation) to mitigate the effect of the uneven sampling. Resampling the population of the 382 cities using Wang et al.’s approach would significantly increase the computational workload and cost of the Google API requests. Thus, the sensitivity of the estimated functions to different spatial units will be evaluated in future studies with additional resources.

(a) Counts and average area of census tracts in different travel times to the centre of the MSA of New York. (b) Effect of the central population crater on the fitting of the exponential function in the New York MSA.
To mitigate the issue of the non-random sample, census tracts were aggregated into 5-min intervals of travel time to the MSA centres. The population density in a travel time interval is calculated as the quotient of the total population of census tracts divided by total land area of the census tracts in the interval. The calculation is expressed as equation (1), where Pi is the population of a census tract i in the travel time interval t, Ai is the land area of the census tract i, n is the number of census tracts whose centroids are in the interval t. As travel time is dependent on road network and topography, an aggregated area is typically an irregular shape and can be non-contiguous, which is different from the circular rings of distance intervals. Using this approach, each 5-min interval has only one data point and population density is equally represented throughout the range of travel time. Moreover, the aggregation can eliminate local outliers and reflect the general trend of population density around the principal city centre. The function estimation is limited to census tracts within 90-min travel time from the centres because of low variation of population density beyond this point.
Coefficient estimation
The negative exponential and inverse power function both describe a monotonous decline of population from city centres. However, a central crater (depression) of population density can be found in many cities (Muñiz et al., 2003; Newling, 1969) because of the spatial extent of the CBD where the primary land use is commercial and the residential population is relatively small. The central crater may pull down curves of the exponential and power equation near the city centre, leading to poorer fit in the outskirts. Some researchers suggest removing the first data point near city centres in the function estimation (Banks, 2013; Chen, 2010; Clark, 1951). However, population craters of some MSAs extend more than one travel time interval from the centres and the spatial extent of the craters varies in different cities. To eliminate the effect of the population craters in the 382 MSAs, the estimation of all the three functions was set to start from the interval at the peak of population density. Intervals from the centre to the peak were excluded. Figure 2b illustrates the effect of the population crater on the estimated functions. After eliminating the data points in the crater, the derived function (solid curve) better describes the declining pattern of population density outside of the crater.
The coefficients of the functions were estimated using the non-linear least square (NLS) method in the Matlab® Curve-Fitting Toolbox. Two rounds of fitting were carried out. First, an exploratory fitting process was performed without bounding intervals of the coefficients. Coefficients of the functions were calibrated for each MSA. The coefficients that significantly deviated from the majority distribution were identified as outliers. Bounding intervals of the coefficients are defined within reasonable ranges excluding the outliers. Second, the coefficients of the functions were estimated again with the bounding intervals derived in the first round. The bounding intervals can narrow the search space and facilitate convergence of the fitting programme. The coefficients derived in the second round of estimation were adopted for further analysis. The coefficient bounding intervals can be found in Table 2.
Results of function fitting for the MSAs.
The coefficients of the three functions were first estimated using the 2016 census tract population data. The goodness-of-fit of the three functions was evaluated. Student’s t-tests were used to compare the population of MSAs best fitted in different functions. The function with the overall best fit for the MSAs was identified as the primary function. Then, the coefficients of the primary function were estimated again using the 1990 population data. The differences of the coefficients between the two years were compared to analyse the temporal change of urban population distribution in US cities.
Results
Goodness-of-fit
The curves of the estimated functions in the 12 largest MSAs ranked by population are illustrated in Figure 3, in which a general declining trend of population density from the city centres can be observed. Central population craters are prominent in some MSAs such as New York, Los Angeles, Dallas and Phoenix. Comparing the goodness-of-fit, the negative exponential function has the lowest average RMSE in the 382 MSAs, followed by the Gaussian and inverse power function (Table 2); 38% of the MSAs were best fitted (lowest RMSE) in the exponential function, followed by 33% and 29% best fitted in the Gaussian and inverse power functions, respectively. The mean population of the MSAs best fitted in the Gaussian function is the largest, followed by the exponential and then the power function (Table 2). The result of the Student’s t-test indicates that the populations of the MSAs best fitted in the exponential and Gaussian functions are significantly (p < 0.05) larger than the MSAs fitted in the power function. The populations of the MSAs fitted in the exponential and Gaussian function are not significantly different in the t-test (p = 0.5427). Additionally, the negative exponential function has systematically underestimated the population density near the centres of large cities. The average residual (actual density –predicted density) of the exponential function in the first interval (0–5 min) near the centre is negative. Such a negative residual was found in 204 of the 382 MSAs (53.4%). These MSAs have an average population of 786,658 compared with the average population of 629,062 of the MSAs with a positive residual. This result indicates that the negative exponential function, despite the overall best fit, cannot well describe the central population craters in the large cities.

Curves of the estimated functions (exponential, power and Gaussian) of the 12 largest MSAs ordered by 2016 population.
Changes of centrality
The negative exponential function, which has overall the best fit (lowest RMSE and fitting most MSAs), was selected as the primary function to analyse urban population distribution in the MSAs. In the negative exponential function, the coefficient a defines the theoretical population density in the centre of the city (i.e. travel time = 0). The coefficient b indicates the rate of distance decay of population density (i.e. density gradient) from the centre. Thus, b is often used to describe the centrality of a city (e.g. Bunting et al., 2002; Mills and Tan, 1980). As b is constrained to negative, a high absolute value of b (denoted as
Regression analysis was conducted to analyse the relation between the MSA populations and the coefficient

The original distribution of MSA populations (left) and the distribution of natural logarithms of the populations (right).

(a) Regression analysis of population logarithms and absolute values of the coefficient b (

Centralised and decentralised MSAs between 1990 and 2016. The background is kernel density created from the MSAs coded with binary values (1: decentralised and −1: centralised).
Discussion
This study used travel time as the independent variable to evaluate the three classic population density functions. Compared with Euclidean distance, travel time is a more meaningful measure of commuting cost, which takes into account the conditions of the road network, traffic, land cover and topography. Population density functions based on travel times can provide more comparable information about population patterns among cities in different conditions. This study also demonstrated the advantages of crowdsourced data, including low cost, easy acquisition and feasibility for large-scale analysis. The increasing availability of such data in the Web 2.0 era will greatly benefit future studies of urban population distribution at finer spatial and temporal resolutions. Beyond the study of population density, the demonstrated methods and data can be applied to analyse the relations of other socio-economic patterns (e.g. income, race and health) with travel time to city centres.
The results of the analysis suggest that different functions should be applied to describe cities of different sizes. In general, the Gaussian and exponential functions have a better fit for larger MSAs while the power function better fits smaller MSAs. Population in small cities is highly concentrated near the centres and declines sharply within a short travel time from the centres. This pattern can be better described by the inverse power function. In large cities, population density declines relatively more slowly from the centres, which were better fitted in the exponential and Gaussian functions. Compared with the negative exponential function, the Gaussian function can better describe the population density in large cities where the central crater is prominent. This is also confirmed by the negative residuals near the city centres of the negative exponential function.
The result reveals a decreasing trend of centrality (i.e. decreasing
The observed trend of urban sprawl may cause adverse environmental and societal consequences. Excessive urban sprawl would encroach on agricultural land, fragment critical habitats and threaten endangered ecological systems (Sheridan, 2007). Additionally, urban sprawl increases long home–work commuting, and generates traffic congestion, which contributes to air pollution and increased energy consumption (Martinuzzi et al., 2007). The continuous suburban growth in large cities may reduce the incentive for re-development of land closer to city centres, leading to the decay of downtown areas (Brueckner, 2000) and concentration of poverty in the central urban areas (Downs, 1999). As well as the central population crater revealed in this study, other types of craters (e.g. income, health and race) are likely to develop with the continuous outspreading of population. The outflowing population towards low-density suburban and rural areas may reduce social interaction, weakening people’s bonds to places that underpin a sustainable community. Not only in the large cities; the sprawling trend is likely to accelerate in the small cities too as population grows. To mitigate the adverse impact of this trend, appropriate zoning regulations (e.g. farmland and open space preservation) and policies (e.g. development tax and congestion tax) should be implemented to regulate land use in a more sustainable way. The methods and analysis results in this study provide important baseline information for monitoring the trend of urban sprawl and evaluating the effectiveness of the strategies.
Population growth, rising household incomes and transportation improvements are deemed to have been the driving factors of urban sprawl in the past century (Mieszkowski and Mills, 1993). In the future, urban sprawl can be driven by technological revolutions. For instance, connected and autonomous vehicles (AVs) are expected to further boost human mobility, reduce travel costs and change travel behaviour. According to the US Department of Transportation (USDOT) Research and Innovative Technology Administration (RITA), 81% of all vehicle-involved crashes can be avoided or mitigated based on connected vehicle technologies. Existing studies (Texas A&M Transportation Institute, 2013) indicate that 75% of vehicles will be AVs by 2040. Travel costs can be significantly reduced through shortened travel times when AVs are used. Moreover, the advancing communication technologies (e.g. the 5G cellular network) point towards a future in which telecommuting and video-conferencing would reduce the importance of geographical proximity to city centres (Castells, 1991; Muñiz and Garcia-López, 2010). These technological changes would further accelerate urban decentralisation and land cover change.
Several issues should be noted when interpreting the results. First, the travel times in this study, which were retrieved in July 2018, reflect the road and traffic conditions at that time. There is uncertainty in the estimated functions in 1990, as well as the comparison of urban centrality between 1990 and 2016. Given the worsening traffic conditions and slowing driving speeds in US cities in the past decades (Schrank et al., 2015), the actual travel times in 1990 are generally lower than under current conditions. Thus, this study has possibly underestimated the rate of decentralisation of the MSAs during the period. Ideally, the 1990 functions should be adjusted by a factor of the travel time reduction to describe more accurate population patterns at that time. However, this factor varies in different cities and in different places in the cities. Additionally, the crowdsourced data are often criticised for their low accuracy and biased sampling. The quality of the data collected from the Google Maps APIs need systematic validation with groundtruth data. Second, this study is limited to monocentric urban functions. In future studies, the polycentric model should be adopted to model the increasing diversity and complexity of modern cities. Local deviations caused by spatial autocorrelations should be addressed to improve the function fitting. Third, the aggregation unit of population may affect the function estimation (as shown in Wang et al., 2018). The sensitivity of the results to the change of aggregation units (e.g. block group or uniform units) should be analysed in future studies.
Conclusion
This study provides an empirical evaluation of the classic population density functions in the 382 MSAs in the USA using real travel times to city centres as the independent variable. The study demonstrated the utility of emerging data sources (i.e. crowdsourced data) for validating the classic urban economic theories. The major findings of the study include that (1) the negative exponential function has the overall best fit for population density in the MSAs; (2) the Gaussian and exponential functions tend to fit larger MSAs, while the power function has better performance in modelling small MSAs; (3) the majority of the MSAs appeared to have a decentralisation trend during 1990–2016, and larger MSAs tend to have a higher rate of decentralisation. This study leverages crowdsourced geospatial data to provide empirical evidence of the classic urban economic models. The findings will increase understanding about urban morphology, population–job displacement, and urban decentralisation. The derived functions provide baseline information to monitor and predict the changing trend of urban population distribution that could be driven by future environmental and technological changes.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
