Abstract
We describe a simple spatial model of urban growth for systems of cities at the macroscopic scale, which combines direct interaction between cities and an indirect effect of physical network flows as population growth drivers. The model is parametrized on population data for the French system of cities between 1831 and 1999, in which strong non-stationarity in correlation patterns suggest to apply the model on local time windows. The corresponding calibration of the model using genetic algorithms provides the evolution of interaction processes and network effects in time. Furthermore, the fit improvement when adding network module appears effective when controlling for additional parameters, what confirms the ability of the model to unveil network effects in the system of cities.
Keywords
Introduction
Cities are paradoxically unsustainable and source of negative externalities, but also the best chance to reach sustainability and resilience to climate change (Glaeser, 2011). The dynamics of urban systems at a macroscopic scale, and more precisely drivers of urban growth, are crucial processes that need to be understood to meet these potentialities. A better knowledge of how cities differentiate, interact and grow is thus a relevant topic both from a theoretical perspective and for policy applications. Indeed, Pumain et al. (2009) suggest that cities are incubators of social change, their fate being closely linked to the one of societies.
Various disciplines have studied models of urban growth with different objectives and taking diverse aspects into account. For example, economics are still reluctant to include spatial interactions in models (Krugman, 1998) but are extremely detailed on market processes, even for models in economic geography. On the other hand, geography focuses more on territorial specificities and interactions in space but will have more difficulties to produce general conclusions. The example of these two disciplines and their misunderstandings (Marchionni, 2004) shows how it is difficult to make bridges, how it needs exceptional efforts to translate from one to the other (as P Hall did for Von Thunen work (Taylor, 2016)), and therefore how it is far from evident to grasp the complexity of urban systems in an integrated way.
The simplest model to explain urban growth, the Gibrat model, assumes random growth rates for cities. It has been shown by Gabaix (1999) that this model asymptotically produce the expected rank-size law (Zipf’s law) for system of cities, which is considered as one of the most regular stylized facts, at least in its generalized scaling law formulation (Nitsch, 2005). Explaining urban scaling laws is closely related to the understanding of urban growth, as Bettencourt et al. (2008) suggest that these reflect underlying universal processes and that all cities are scaled version of each other. This approach however does not reflect the complex relation between economic agents for which Storper and Scott (2009) advocate. Using a bottom-up reconstruction of urban areas with dynamical microscopic population data, Rozenfeld et al. (2008) show indeed that positive deviations to the rank-size law systematically exist, and that these must be an effect of spatial interactions between urban areas.
Complexity approaches are good candidates to integrate these interactions into models. Andersson et al. (2006) introduce for example a model of urban economy as a growing complex network of relations. The Evolutive Urban Theory, introduced by Pumain (1997), focuses on cities as co-evolving entities and produces explanations for growth at the scale of the system of cities. Pumain et al. (2006) show that scaling laws could be a consequence of functional differentiations and of the diffusion of innovation between cities. The positioning regarding universality of laws is more moderate than theories of scaling (West, 2017), as Pumain (2012b) highlights that ergodicity can difficultly be assumed in the frame of complex territorial systems. One crucial feature of this paradigm is the importance of interactions between agents, generally the cities, to produce the emergent patterns at the scale of the system. Pumain and Sanders (2013) investigate the advantages of agent-based models in comparison to more classical equation systems, and this methodological aspect is in accordance with the theoretical positioning, as it allows to take into account the heterogeneity of possible interactions, the geographical particularities, and to naturally include emergence between levels and render multi-scale patterns.
In this paper we aim at exploring further the assumption, central to Pumain’s Evolutive Urban Theory, that spatial interactions between cities are significant drivers of their growth. More precisely, we consider both abstract interactions and flow interactions mediated through the physical networks, mainly the transportation networks. We extend existing models accordingly. Our contribution is twofold: (i) we show that very basic interaction models based on population only can be fitted to empirical data and that adjusted parameter values can be directly interpreted; and (ii) we introduce a novel methodology to quantify overfitting in models of simulation, as an extension of Information Criteria for statistical models, in which application to our calibrated models confirms that fit improvement is not only due to additional parameters, but that the extended model effectively capture more information on system processes. This unveils network effects in an indirect way. We first review modeling approaches to urban growth based on spatial interactions.
Urban growth and spatial interactions
First of all, we must precise that we consider only models at the macroscopic scale, ruling out the numerous and rich approaches at the mesoscopic scale, that include for example cellular automata models, models of urban morphogenesis or land-use change models. Some include explicitly spatial interactions: Rybski et al. (2013) show that a gravity-based aggregation of population in a lattice produces realistic metropolitan patterns. Such kind of models operates however at an other level of detail than the one we consider. We also naturally rule out economic models that do not explicitly include spatial interactions.
Several models of urban growth at the macro scale have insisted on the role of space and spatial interactions. Frasco et al. (2014) introduce a spatialized model of social network growth that provides a good agreement with observed data both for city size hierarchy and the relation between population and density. Bretagnolle et al. (2000) propose a spatial extension of the Gibrat model. The gravity-based interaction model that Sanders (1992) uses to apply concepts of synergetics to cities is also close to this idea of interdependent urban growth, contained physically in the phenomenon of migration between cities.
A more refined extension with economic cycles and innovation waves is developed by Favaro and Pumain (2011), yielding a version of the core of Simpop models (Pumain, 2012a) in terms of dynamical systems. This family of models has started with a toy-model based on economic interactions between cities as agents that yield hierarchical patterns at the scale of the system (Sanders et al., 1997). The Simpop2 model, still based on spatial interaction for commercial exchanges, including successive innovation waves, unveils structural differences between the European and the US urban systems (Bretagnolle and Pumain, 2010). The SimpopLocal model (Pumain and Reuillon, 2017a) simulates the emergence of initial settlement patterns. The Marius model (Cottineau, 2014) couples population and economic growth with interactions between cities, allowing to accurately reproduce real trajectories on the former Soviet Union after calibration with processes multi-modeling.
Urban growth and transportation networks
Under similar assumptions of previously reviewed models, the inclusion of transportation networks has been rarely pursued, contrary to the mesoscopic scale at which relations between networks and territories have been widely studied by LUTI models for example (Chang, 2006). Transportation network growth models (Xie and Levinson, 2009), prolific in economics and physics, are designed to reproduce network topology and do not focus mainly on the influence of networks on cities. Bigotte et al. (2010) study an optimization model for network design combining the effects of urban hierarchy and of transportation network hierarchy. Baptiste (1999) models the dynamical interplay between network links capacity and city growth on a subset of the French city system. The SimpopNet model (Schmitt, 2014) goes a step further in modeling the co-evolution between cities and transportation networks, as it allows new network links to be created in time. These examples show the difficulty of coupling these two aspects of urban systems in models of growth, and we will for this reason take into account network effects in a simplified way as detailed further.
The rest of this paper is organized as follows: our model is introduced and formally described in the next section; we then describe the results obtained through exploration and calibration of the model on data for French cities, in particular the unveiling of network effects significantly influencing growth processes thanks to a novel methodology introduced. We finally discuss implications of these results.
Model description
Rationale
Some confusion may arise when surveying stochastic and deterministic models of urban growth. To what extent is a proposed model “complex” and is the simulation of stochasticity necessary? Concerning the Gibrat model and most of its extensions, independence assumptions and linearity produce a totally predictable behavior, which is thus not complex in the sense of exhibiting emergence, in the sense of weak emergence (Bedau, 2002). In particular, the full distribution of random growth models can be analytically determined at any time (Gabaix, 1999), and in the case of studying only the first moment, a simple recurrence relation avoids to proceed to any Monte-Carlo simulation. Under these assumptions, it is natural to work with a deterministic model, as it is done for example for the Marius model (Cottineau, 2014). We will work with that hypothesis, capturing complexity through non-linearity.
We work on simple territorial systems assumed as regional city systems, in which cities are basic entities. The time scale corresponds to the characteristic scale associated to this spatial scale, i.e. around one or two centuries. Spatial interactions are captured through gravity–type interactions, this simple formulation having the advantage of being simple and of capturing the first law of Tobler, namely that interaction strength fades with distance. Other approaches introduced recently perform similarly at this scale (Masucci et al., 2013).
Model description
We consider a deterministic extension of the Gibrat model, what is equivalent to consider only expectancies in time. Let
Note that in that case, stochastic and deterministic versions are not equivalent anymore, precisely because of the non-linearity, but we stick to a simple deterministic version for the sake of simplicity. The specification of the interdependent growth rate is thus given by
The network effect term
The first component corresponds to the pure Gibrat model, that we obtain by setting the weights
Model parameter space
Model parameters’ summary. a
We give the parameters’ names, notations, associated processes, possible interpretations, typical variation ranges, and units.
We propose to interpret the distance decay parameter in the following way. Let us fix an arbitrary fraction α and typical spatial ranges for a local urban system dL and for a long range urban system dR, consider a city i and two neighbors
Finally, we will consider only positive weights wG and wN, to follow empirical observations as detailed below. Numerical values for the weights will be given normalized by number of cities implied in the process, i.e.
Data
Our model is assumed as hybrid as it relies on semi-parametrization on real data. It could be possible to study it as a full toy-model, initial configuration and physical environment being constructed as synthetic data. We however aim at unveiling stylized facts on real data rather than on model behavior in itself, and setup therefore the model from the data we now describe.
Population data
We work with the Pumain-INED historical database for French cities (Pumain and Riandey, 1986), which give populations of Aires Urbaines (INSEE definition) at time intervals of five years, from 1831 to 1999 (31 observations in time). The latest version of the database integrates Urban Areas, allowing to follow them on a long time-period, according to the long time ontology for cities given by Bretagnolle (2009), that constructs a functional definition of cities as entities with boundaries evolving in time. We work on the 50 bigger cities in 1999. We furthermore isolate periods of similar length excluding wars, obtaining nine periods of 20 years 1 on which adjustment of the model will be done, taking into account non-stationarity in time.
Physical flows
The model takes into account stylized physical flows, in order to be rather independent of network effective shape. These are therefore assumed to follow the geographical shortest path according to terrain slope. This assumption avoids to obtain geographical absurdities such as cities with a difficult access having an overestimated growth rate. Using a 1 km resolution Digital Elevation Model, we construct an impedance field, following Collischonn and Pilar (2000), which is given by
Indicators of model performance
We work on an explanatory rather than an exploratory model. Therefore, indicators to evaluate model outputs are not directly linked to intrinsic properties of trajectories or obtained final states, but rather to a distance to the phenomenon we want to explain, i.e. the data. Given real population • Overall model performance, given by logarithm of the mean-square error in space and time
• Average local model performance, given by the mean-square error on logarithms
Both are actually complementary, as using only
Results
Stylized facts
Basic stylized facts can be extracted from such a database, as it has already been widely explored in the literature (Guérin-Pace and Pumain, 1990). We retrieve better fits of log-normal distributions of growth rates at all dates compared to normal fits, and also the fact that growth rates are mainly positive, on the cities we consider and when removing wars.
An interesting feature to look at in relation with our considerations on spatial interactions is correlations between time-series, and more particularly their variation as a function of distance. We consider 50 years overlapping time-windows to have enough temporal observations, finishing respectively in 1881, 1906, 1931, 1962, 1999, and estimate on each, for each couple of cities (i, j), the correlation between log-returns
This method, used mainly in econophysics (Mantegna and Stanley, 1999), reveals dynamical interactions without being biased by sizes.
We show in Figure 1 the smoothed correlations curves as a function of distance, for each time period. First of all, the strong differences between each time period confirm the non-stationarity of growth rates over the whole time period, and justify the use of local fit in time for the model. We can also interpret these patterns in terms of historical events for the system of city and the transportation network. The dynamic of the system begins with a flat correlation in 1881, around 0.2, that could be spurious due to simultaneous similar growth for all cities. It then stays flat but goes to zero, witnessing strong differentiations in growth patterns between 1881 and 1906. After 1931, the effect of the distance is clear with decreasing curves, starting between 0.4 and 0.5. We postulate that this evolution must be partly linked to the evolution of the transportation network: considering railway network for example (Thévenin et al., 2013), the initial overall development may have fostered long range interactions flattening thus the correlation curves, whereas its maturation over time has conducted to the return of more classical interactions decreasing quickly with distance.
Time–series correlations as a function of distance. Solid line corresponds to smoothed correlations, computed between each pairs of normalized log-returns of population time-series, on successive periods given by curve color.
Model exploration
Implementation
Data preprocessing, result processing and models profiling are implemented in R. For performance reasons and an easier integration into the OpenMole software for model exploration (Reuillon et al., 2013), a scala version was also developed. The question of a trade-off between implementation performance and interoperability is a typical issue in this kind of model, as a fully blind exploration and calibration can be misleading for further research directions or thematic interpretations. A NetLogo implementation, allowing interactive exploration and dynamical visualization, was also developed for this reason. Source code for models, cleaned raw data, simulation data, and results used in this paper are available on the open repository of the project at https://github.com/JusteRaimbault/InteractionGibrat.
We show in Figure 2 an example of model output. Cities color gives city-level fit error and their size the population. Outliers can therefore easily be spotted (as Saint-Nazaire having the worst fit in the example shown) and possible regional effects identified. We illustrate in pink an example of geographical shortest path, from Rouen to Marseille, which reasonably corresponds to the actual current shortest path by highway. Top right plot shows trajectory in time for a given city, whereas the bottom right plot gives overall fit quality in time, by plotting simulated data against real data. The closest the curve is to the diagonal, the better the fit.
Example of output of the model. The graphical interface allows to explore interactively on which cities’ changes operate after a parameter change, what is necessary to interpret raw calibration results. The map gives adjustment errors by city (color) and their population (size). We illustrate in pink the geographical shortest path between Rouen and Marseille. The plot in top right panel gives in time the trajectory of a selected city, comparing simulated population with real population. The bottom right plot gives for each date the simulated data against real data: the closest the curve is to the diagonal, the better the fit.
Behavior patterns
First model explorations, by simply sweeping fixed grids of the parameter space, already suggest the presence of network effects, in the sense that physical flow effectively has an influence on growth rates. We show in Figure 3 a configuration of such a case. At fixed gravity parameters and growth rate, we study variations of the parameters wN, dN, and γN and the corresponding response of Evidence of network effects revealed by model exploration. Left plot gives 
Calibrating the Gravity Model
We now use the model to indirectly extract information on processes in time. Indeed, under assumption of non-stationarity, the temporal evolution of locally adjusted parameters shows the evolution of the corresponding aspect of the processes.
In a first experiment, we set wN = 0 and calibrate the model with four parameters on the nine successive 20-year time windows. The optimization problem associated to model calibration does not present features allowing an easy solving (closed-form of a likelihood function, convexity or sparsity of the optimization problem), we must rely on alternative techniques to solve it. Brute force grid search is rapidly limited by the dimensionality curse. Classical methods (Batty and Mackie, 1972) such as gradient descent fail because of the rather complicated shape of the optimization landscape. Calibration using genetic algorithms (GA) is an efficient solution to find approximate solutions in a reasonable time. OpenMole embeds a collection of such meta-heuristics for different purposes: Schmitt et al. (2015) demonstrate the capabilities of these methods to calibrate models of simulation. In our case, it furthermore allows to do a bi-objective calibration on
We use the standard steady state GA provided by OpenMole (Pumain and Reuillon, 2017b), distributed on 25 islands, with a population of 200 individuals and 100 generations. We show in Figure 4 the calibration results on successive periods, by plotting final populations in the objective space. As expected, Pareto fronts that correspond to compromises between the two opposite objectives are the rule. It means that the model cannot be accurate both globally and locally, and an intermediate solution has to be found. This may due to the fact that interaction range changes with city size (i.e. that terms in the potential are no longer separable), that we keep as a possible model development. The shapes of the Pareto front are revealing the chaotic optimization landscape, as for some periods such as 1921–1936 or 1962–1982 fronts are not regular and sparse. The change in shapes also translates different dynamical regimes across the periods: for 1881–1901, the quasi-vertical shape followed by an isolated front at high Calibrating the Gravity Model. Pareto-front on successive periods. Each panel gives the Pareto front between the two objectives 
We show in Figure 5 the values of fitted parameters in time, averaged over the Pareto front and for best single-objective solutions. First, the two peak patterns for r0 correspond roughly to the patterns observed in average growth rates. The evolution of wG has a similar shape but lagged by 20 years: it can be interpreted as a repercussion of endogenous growth on interaction patterns in the following years, which is consistent with an interpretation of the interaction process in terms of migrations. The values of dG, with an increase until 1900 followed by a progressive decrease, is consistent with the behavior of empirical correlations commented above: the first 50 years windows in Figure 1 have a higher interaction range corresponding to flat correlation curves. Finally, the level of hierarchy γG has regularly decreased, corresponding to an attenuation of the power of large cities that can be understood in terms of the progressive decentralization in France that has been fostered by the administration.
Calibrated parameters for Gravity Model only. Each plot gives fitted values in time t for each parameter: in order from left to right and top to bottom, growth rate r0, gravity weight wG, gravity decay dG, gravity gamma γG. Red and Green curves correspond to best points for 
Unveiling network effects
We now turn to the calibration of the full model on successive periods, in order to interpret parameters linked to network flows and gain insight into network effects. The full calibration is done in a similar way with seven parameters being free. We plot in Figure 6 the fitted values in time for some of these parameters.
Calibrated parameters for the full model. We plot values of 
The behavior of growth rate and of the gravity weight relative to growth rate, that is similar to the Gravity Model only, confirms that network effects are indeed at the second order and that endogenous growth and direct interactions are main drivers. Network effects are however not negligible, as they improve the fit as shown before in model exploration, capturing therein second-order processes. The evolution of dN, corresponding to the range on which network influences the territories it goes through, shows a minimum in 1921–1936 to stabilize again later, but at a value lower than past values. This could correspond to the “tunnel effect”, when high-speed transportation does not stop in territories it goes through. Indeed, the evolution of railway has witnessed a high decrease for local lines at a date similar to the minimum, and later the emergence of specific high speed lines, explaining this lower final value. Hierarchy of flows has slightly decreased as for gravity, but is extremely high. This means that only flows between larger cities have a significant effect. This way, the model gives indirect information on the processes linked to network effects.
Evaluating model performance
We focus in this last experiment on quantifying the “performance” of the model, taking into account not only its predictive capabilities, but also its structure. More precisely, we want to tackle the issue of overfitting, which has been for long recognized in the field of machine learning for example (Dietterich, 1995), but for which there is a lack of methods for models of simulation. We need to introduce a tool to confirm that the improvement in model fit is not only artificially due to additional parameters.
The Akaike Information Criterion (AIC) provides, for statistical models for which a likelihood function is available, the gain in information between two models (Akaike, 1998), correcting fit improvement for number of parameters. Similar methods include the Bayesian Information Criterion (BIC), which relies on slightly different assumptions. Biernacki et al. (2000) propose an integrated likelihood as a generalization of these criteria for unsupervised classification. Pohle et al. (2017) show that in the case of selecting the number of states in Hidden Markov Models, real cases induces too much pitfalls for standard methods to work robustly, and suggest pragmatic selection based on their results and expert judgment. In our case, the problem is that it is not even possible to define the criteria.
The method we propose is based on the intuitive idea of approaching models of simulation by statistical models and using the corresponding AIC under certain validity conditions. Bastani et al. (2017) use a similar trick of considering the models as black boxes and approaching them to gain insights, in their case to extract interpretable structure as decision trees.
Let (X, Y) be the data and observations. We consider computational models as functions
We follow the intuition that if statistical models are good approximations of the simulation models compared to models discrepancy to reality, namely
In practice we calibrate the gravity only model and the full model on the full time span, and choose two intermediate solutions giving
Empirical AIC results.
Discussion
Multi-modeling
We obtain for most of periods and calibration points large values of γN in the calibration of the full model, what suggests that a model with only major flows could be a possible alternative. An additional parameter determining the cutoff value for flows would have to be introduced, and to what extent this variation improves model performance should be tested in future developments. More generally, this raises the issue of more systematic multi-modeling approaches and of benchmarking models. Cottineau et al. (2015) introduce a modular framework to compare concurrent hypothesis to explain urban growth. In the same spirit, an immediate extension of our work would be to test different functional specifications and benchmark more diverse network processes, such as the diffusion of innovation as done by Favaro and Pumain (2011) or the creation of social ties (Frasco et al., 2014).
Theoretical implications
Our results support the hypothesis that physical transportation networks are necessary to explain the morphogenesis of territorial systems, in the sense that some aspects are fully contained within networks and cannot be approximated by abstract proxies. We showed indeed on a relatively simple case that the integration of physical networks into some models effectively increases their explanative power even when controlling for overfitting. This can be understood as a direction to expand Pumain’s Evolutive Urban Theory (Pumain, 1997). This theory considers networks as carriers of interactions in systems of cities but does not put a particular emphasis on their physical aspect and the possible spatial patterns resulting from it such as bifurcations or network induced differentiations. The development of a sub-theory focusing on these aspects is an interesting direction suggested by our empirical and modeling results.
Urban system specificity
The model has not yet been tested on other urban systems and other temporalities, and further work should investigate which conclusions we obtained here are specific to the French urban system on this period, and which are more general to any system of cities. Applying the model to other system of cities also recalls the difficulty of defining urban systems. In our case, a strong bias should arise by considering France only, as Lille must be highly influenced by Brussels for example. The extent and scale of such models is always a delicate question. We rely here on the administrative coherence and the consistence of the database, but sensitivity to system definition and extent should also be further tested.
Towards co-evolutive models
Our focus on network effects remains quite limited since (i) we do not consider an effective infrastructure but abstract flows only, and (ii) we do not take into account the possible network evolution, due to technical progresses (Bretagnolle et al., 2000) and infrastructure growth in time. An interesting development would be first the application of our model with real network data, using effective distance matrices in time, computed, e.g. with the train network used by Thévenin et al. (2013).
Then, allowing the network to dynamically evolve in time, as a function of flows, would yield a model of co-evolution between cities and transportation networks for a system of cities. This co-evolution has been highlighted empirically by Bretagnolle (2009). This kind of model is not frequent, and Schmitt (2014) provides with the SimpopNet model, one of the few examples. It is shown by Raimbault (2017) that disciplinary compartmentalization may be at the origin of the relative absence of such type of models in the literature. Indeed, it would imply to include heterogenous processes such as economic rules to drive network growth that are quite far from the approach taken. It would however allow to investigate to what extent the refinement of network spatial structure and network dynamics can improve the explanation of urban system dynamics.
Conclusion
We have introduced a spatial model of growth for a system of cities at the macroscopic scale, including second-order network effects among endogenous growth and direct interaction growth drivers. The model is parametrized on real data for the French city system between 1831 and 1999. The calibration of the model in time provides interpretations for the evolution of processes of interaction within the system of cities. We furthermore show that the model effectively unveils network effects by controlling for overfitting. This work paves the way for more complicated models with dynamical networks that would capture the co-evolution between transportation network and territories.
Footnotes
Acknowledgements
Results obtained in this paper were computed on the vo.complex-system.eu virtual organization of the European Grid Infrastructure (
). We thank the European Grid Infrastructure and its supporting National Grid Initiatives (France-Grilles in particular) for providing the technical support and infrastructure.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
