Assessment of model validation outcomes of a new recursive spatial equilibrium model for the Greater Beijing

Abstract

Robust calibration and validation of applied urban models are prerequisites for their successful, policy-cogent use. This is particularly important today when expert assessment is questioned and closely scrutinized. This paper proposes a new model calibration-validation strategy based on a spatial equilibrium model that incorporates multiple time horizons, such that the predictive capabilities of the model can be empirically tested. The model is implemented for the Greater Beijing city region and the model validation strategy is demonstrated over the Census years 2000 to 2010. Through forward/backward forecasting, the model validation helps to verify the stability of the model parameters as well as the predictive capabilities of the recursive equilibrium framework. The proposed modelling strategy sets a new standard for verifying and validating recursive equilibrium models. We also consider the wider implications of the approach.

Keywords

Model validation, spatial equilibrium, land use transportation interaction (LUTI) models, recursive dynamics, model calibration

Introduction

Urban land use and transport interaction (LUTI) models have been the mainstay of practical policy analyses over the past decades. Originating from early applications of spatial interaction (Batty and Mackie, 1972; Echenique et al., 1969; Lowry, 1964), the model structure and equations have undergone remarkable transformations through, e.g. the incorporation of general equilibrium theory in urban and regional economics (Anas and Liu, 2007; Bröcker, 1998) on one end of the spectrum to the adoption of disaggregate, microscopic and agent-based spatial simulation on the other (Heppenstall et al., 2011). In contrast, research into the predictive capabilities of these models has progressed less quickly. Model validation in its strict sense (i.e. comparison between model outputs and observed data that have not been used in model calibration) is just as rarely practised today as decades ago. In the current climate where expert opinion is questioned, it is vital to develop model validation strategies that are robust and comprehensible. This is particularly challenging in fast-growing cities in developing countries where the pressures for development are high while data provision for modelling is poor.

A sound understanding of the predictive capabilities of a LUTI model is a prerequisite for its use in practical projects regarding investment or regulation in cities. Systematic reviews of urban applied models, such as Wegener (1994, 2004), have always given due prominence to model calibration and validation. Since such reviews already exist, we do not intend to carry out a full literature survey. Instead, we focus on specific issues regarding validating the model over time, which is an under-researched field.

On cross-sectional model validation, Wegener (1994, 2004) points out that if model calibration is limited to only one temporal cross-section, then it provides little more than an illusion of precision. This is because model fitting techniques today have made it easier than ever before to reproduce any observed temporal cross-section. The pursuit of ever better goodness-of-fit at that cross-section alone does not necessarily indicate the actual predictive capabilities of a LUTI model. He further suggests that a model’s performance should be validated by comparing the model results with observed data over a relatively long period.

As an indication of what policy analysts may think, Volterra et al. (2007) provide an extensive review of the urban land-use and transport modelling practice in London, UK. They conclude from the London experience that model validation is particularly difficult given both data and time constraints, albeit the task is ‘completely necessary’ in a policy context. They also propose that a feasible approach to validate urban models is to create retrospective (historical) forecasts and compare them to observed histories.

Although there is consensus among the leading modellers about what should be done in model validation, there are theoretical as well as practical barriers to turning this aspiration into practice in most LUTI applications. Finding suitable time series data is a common, practical challenge. However, as a number of LUTI models have been developed over the past years, the data and model predictions produced by those projects represent a valuable resource both for supporting long time series of land use and transport data and for carrying out retrospective analysis of the models’ predictive performance. For cities that do not yet have their own LUTI models, it would be beneficial to design new modelling projects so that a time series of land use and transport data and predictions is gradually accumulated.

For LUTI models that are based on static general equilibrium, there is also an issue of how to compare model results that are not time specific with observations that are collected for specific temporal cross-sections. Similarly, for microscopic, agent-based non-equilibrium dynamic models, the issue is how to translate the probabilistic predictions of, e.g. the transitional states of building use into an evolutionary perspective of how buildings, infrastructure and activities change over time in a real context.

A rare and inspiring example of validating large-scale urban land-use and transport models is reported in Miller et al. (2012). They present the historical validation of their ILUTE model, an agent-based microsimulation model, for the greater Toronto area over a time span of 20 years (1986–2006). Model forecasts on demographics and the housing market are compared with historical data and a decent level of goodness of fit is achieved over multiple time horizons. This work shows that the technical difficulties of model validation are not insurmountable, and the field will greatly benefit from a closer examination of the predictive capabilities of models over time.

Our paper aims to fill a gap in the literature by proposing a new model calibration-validation strategy based on a spatial equilibrium model that incorporates multiple time horizons. The proposed strategy makes use of observed land use, buildings, transport and urban activity data across multiple time horizons such that the predictive capabilities of the model can be empirically tested.

The paper is organized as follows. The next section introduces the core structure of the model application for Greater Beijing. The ‘Model calibration’ section discusses data and model calibration, the ‘Model validation’ section presents the model validation strategy and its application to Greater Beijing, and the ‘Conclusions’ section considers wider implications of the findings.

Model description

The distinct feature of the Recursive-dynamic Spatial Equilibrium (RSE) model is the link between Spatial Equilibrium (SE) and Recursive Dynamic (RD) models, which enables the simulation of urban change processes that vary over time scales (Simmonds et al., 2013; Wegener et al., 1986). Some urban change processes (e.g. workplace relocation and transport choices) adapt quickly to circumstance changes and thus are amenable to equilibrium modelling, while others are more inertia-prone and may take many years or even decades to adjust (e.g. building stock development and transport supply). In this section, we focus on new advancements in the RSE model since its early prototype in Jin et al. (2013), and introduce how the RSE model is calibrated and validated over multiple time horizons. More details about the model including the equations and equilibrium conditions are provided in the online Supplementary Material.

The theoretical RSE model was first proposed in Jin et al. (2013). In that paper, the recursive-dynamic feature of the model was demonstrated on a hypothetical peninsular city, where no empirical data were used for model calibration or validation. The RSE Greater Beijing model presented here differs from Jin et al. (2013) by incorporating the following advancements. First, we complete the general equilibrium structure of the SE model by incorporating a new equilibrium condition for the labour market. In Jin et al. (2013), the derived workplace wage is vaguely defined without equilibrating the zonal labour demand and supply. In this paper, the SE framework entails simultaneous equilibrium in production, floorspace and labour markets, subject to supply constraints and policy interventions. Second, we replace the multinomial-logit discrete choice model with a nested-logit model for simulating the employment-residence paired location choices; the nested-logit model is theoretically more consistent, because it captures the correlations that exist among location alternatives. Third, generic methods for establishing and calibrating the RD models for floorspace development are developed, which are supported by economic theory and amenable to statistical analysis; the proposed models account for not only the physical durability of buildings but also the regional heterogeneity and one-off policy interventions. Fourth, the recursive dynamics are extended from modelling building floorspace growth to the residence relocation of non-employed households.
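The difference between the two choice structures can be illustrated with a generic textbook nested logit. This is a minimal sketch and not the RSE model's actual equations, which are given in the Supplementary Material; the function name, the utilities and the dispersion values are all hypothetical.

```python
import math

def nested_logit(V, W, lam_lower, lam_upper):
    """Joint probabilities P[j][i] of employment zone j and residence zone i.

    V[j][i]: hypothetical utility of residing in i given a job in j.
    W[j]:    hypothetical utility specific to employment zone j.
    lam_lower, lam_upper: dispersion parameters of the residence (lower)
    and employment (upper) levels. Generic textbook form (assumption);
    this parameterisation normally requires lam_upper <= lam_lower.
    """
    # Lower level: exponentiated utilities within each employment nest j
    exp_v = [[math.exp(lam_lower * v) for v in row] for row in V]
    nest_sums = [sum(row) for row in exp_v]
    # Logsum: expected maximum utility of the residence choice in nest j
    logsums = [math.log(s) / lam_lower for s in nest_sums]
    # Upper level: choice among employment zones
    exp_u = [math.exp(lam_upper * (w + l)) for w, l in zip(W, logsums)]
    total = sum(exp_u)
    # Joint probability = P(j) * P(i | j)
    return [
        [(exp_u[j] / total) * (exp_v[j][i] / nest_sums[j]) for i in range(len(V[j]))]
        for j in range(len(V))
    ]

# Hypothetical example: 2 employment zones, 3 residence zones
V = [[0.2, -0.1, 0.0], [0.5, 0.3, -0.4]]
W = [0.1, -0.2]
P = nested_logit(V, W, lam_lower=2.0, lam_upper=1.0)
print(sum(sum(row) for row in P))  # joint probabilities sum to one (up to rounding)
```

When the two dispersion parameters are equal, this form collapses to a multinomial logit over the joint utilities V[j][i] + W[j], which is why the nested structure can be read as a generalisation that allows correlation among alternatives within a nest.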

To demonstrate how the RSE model is calibrated and validated over multiple time horizons before being used for model forecasting, Figure 1 illustrates the information flows in the RSE model. Three time horizons are presented in the figure from time t to t + 2, where time t and t + 1 are historical in the sense that known data are available and time t + 2 is the future year. The SE model calibration at base year t follows the standard procedure as per conventional static equilibrium models. The novelty of the RSE model is that for the model calibrated at time t, its predictive capabilities can be empirically tested through forward forecasting from time t to t + 1, by comparing the model outputs with the known data set for time t + 1. To build up a serial track record, the forecasted results for time t + 2 can also be validated when the observed data set for time t + 2 becomes available.

Figure 1.

Main information flows within and between recursive spatial equilibria.

In practice, it is rarely feasible to trace back more than one historical period because of data problems and limited modeller resources. An alternative way to validate the SE model for time t + 1 is to forecast backwards to time t, using the zonal stock constraints of time t as inputs. This process is counterfactual in nature, but it serves a dual purpose: (1) it helps modellers to crosscheck the data sets over two consecutive cross sections; and (2) as the forward and backward forecasts essentially involve two sets of model parameters, by comparing the performance of the two models, modellers can better understand the background changes in reality as well as the role of key parameters in model predictions. The proposed modelling method sets a new standard for verifying and validating recursive equilibrium models.

Once the SE model is calibrated at base year t, the RD models are then calibrated using model outputs from the SE model for time t, observed changes in zonal stock and knowledge of policy interventions (including one-off events) from t to t + 1. The SE and RD model calibration may have to be repeated many times in a calibration-validation loop until a satisfactory goodness of fit has been achieved. After the model validation, the RSE model will operate in forecast mode with model parameters retained for further years.

Model application for Greater Beijing

The RSE Greater Beijing Model is the first empirical application of the RSE model since Jin et al. (2013). The model is designed for examining medium- to long-term impacts of major urban land-use and transport development options. Its potential use in practical policy analysis is reported in Jin et al. (2017). The model covers the administrative area of Beijing, Tianjin and Hebei in China (locally known as the ‘Jing-Jin-Ji’ city region), which has a total population of approximately 110 million in 2014 across a geographical area of over 216 thousand km². The Greater Beijing city region is represented by a total of 209 model zones (Figure 2). We define the 130 zones of Beijing Municipality as core zones of study and the 79 zones of Tianjin and Hebei as peripheral zones. For the core zones in Beijing, the zoning is based on the administrative boundaries of Jiedao, which is the smallest areal administrative unit in China and the finest geographic level for Census statistics.

Figure 2.

Zoning map of Greater Beijing (employment density in 2010).

Table 1 summarizes the segmentations in the RSE Greater Beijing model. The current version of the model does not differentiate industrial sectors due to lack of industrial data. For the same reason, the intermediate inputs for production are omitted in the model, albeit this input factor has been incorporated in the theoretical model. Nonetheless the RSE model adopts relatively fine segmentations of residents and housing floorspace in core zones. The residents in the core zones are categorized into four types according to their socio-economic background using the EGP schema (Goldthorpe et al., 1980; Rong, 2016), namely the high, middle, low socio-economic group plus the non-employed group. In the peripheral zones, we only differentiate between the employed and the non-employed group. The employed residents (ER) in peripheral zones are assumed to have the same socio-economic characteristics as the middle group in core zones. Housing floorspace in the core zones is divided into three types (large, middle and small) according to the average floorspace area per household member. In peripheral zones, only one composite type of housing floorspace is considered.

Table 1.

Segmentations in the RSE Greater Beijing model.

	Industry type (r)	Resident type (f)	Housing floorspace type (m)	Business floorspace type (k)
Core zones	1	4	3	1
Peripheral zones	1	2	1	1

RSE: Recursive-dynamic Spatial Equilibrium.

In the RSE Greater Beijing model, residents as well as jobs are mobile across the city region subject to market inertia and specific boundary settings. The housing and business floorspace development, transport infrastructure supply and the relocation of non-employed households are treated as stock constraints. Such constraints are unchangeable in static equilibrium, implying the inertia-prone nature of the respective market, but are subsequently updated through the RD models in a recursive manner.

Model calibration

Before embarking on the discussion on model validation, we first introduce the calibration of the RSE Greater Beijing model and the associated data inputs. Specifically we start from an overview of the growth in Greater Beijing from 2000 to 2010 using aggregate statistics. Then the items for calibrating the SE and RD models are presented in turn.

Data

To prepare the data sets for the RSE Greater Beijing model, we combine various data sources, including official statistics, censuses, surveys and online open data. Data that are unavailable are either estimated from supporting sources or replaced by assumptions. In such cases, we crosscheck the estimates against aggregate statistics and modellers’ local knowledge.

To understand the aggregate patterns and scale of changes between the two census years (2000 and 2010) in the Greater Beijing city region, we compare the demographics and the building floorspace stock in Table 2. For simplicity, the zones are aggregated into seven categories (five categories in Beijing, plus Tianjin and Hebei). Note that full employment is assumed in the model, such that the total number of ER in Greater Beijing is equal to the number of employed workers (EW). From 2000 to 2010, total employment in Greater Beijing increased by 27%. A much higher growth rate (61%) is seen in the housing market, where the housing stock in the suburbs and planned new towns of Beijing has more than doubled.

Table 2.

Demographics and building stock changes in Greater Beijing (2000–2010).

		Beijing					Tianjin	Hebei	Total
		Centre	Near suburbs	New towns	Far suburbs	EPA	Tianjin	Hebei	Total
Demographics (thousand person)
Total employed residents	2000	3,031	1,286	1,036	1,228	536	4,869	33,514	45,500
	2010	4,244	2,967	1,880	2,162	553	7,287	38,651	57,744
	% Change^a	40%	131%	82%	76%	3%	50%	15%	27%
Total non-employed residents	2000	829	326	235	313	120	2,442	6,486	10,751
	2010	816	555	355	388	100	3,654	7,480	13,348
	% Change	−2%	70%	51%	24%	−17%	50%	15%	24%
Total employed workers	2000	3,247	1,215	1,130	1,076	448	4,869	33,514	45,500
	2010	5,881	2,038	1,761	1,655	470	7,287	38,651	57,744
	% Change	81%	68%	56%	54%	5%	50%	15%	27%
Building floorspace (million m²)
Total housing floorspace	2000	110	49	46	56	24	286	1,646	2,216
	2010	184	139	105	116	31	547	2,436	3,558
	% Change	67%	188%	129%	107%	27%	91%	48%	61%
Total business floorspace	2000	65	24	23	22	9	97	670	910
	2010	118	42	35	33	9	147	773	1,157
	% Change	81%	73%	56%	54%	5%	50%	15%	27%

EPA: ecological protection areas.

^a % Change: based on the 2000 value.
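As a quick check of the arithmetic behind the "% Change" rows, the region-wide totals in Table 2 can be recomputed as follows. This is a minimal sketch; the function name `pct_change` is ours and not part of the model.

```python
def pct_change(base, later):
    """Percentage change relative to the base-year (2000) value."""
    return 100.0 * (later - base) / base

# Region-wide totals taken from Table 2: (2000 value, 2010 value)
totals = {
    "employed residents (thousand)": (45_500, 57_744),
    "housing floorspace (million m2)": (2_216, 3_558),
    "business floorspace (million m2)": (910, 1_157),
}

for name, (y2000, y2010) in totals.items():
    print(f"{name}: {pct_change(y2000, y2010):.0f}%")
# employed residents (thousand): 27%
# housing floorspace (million m2): 61%
# business floorspace (million m2): 27%
```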

Calibrating the SE model

Table 3 summarizes the items calibrated in the SE model, including the parameters involved and the data inputs required. All the items and data inputs, unless stated otherwise, apply to the calibration for both 2000 and 2010. Initial parameter values are obtained from partial equilibrium analysis, the literature and modellers’ estimates. Model parameters are calibrated in an iterative and sequential manner in order to prevent divergence. Given the interdependence of variables and parameters in the equilibrium framework, the calibration algorithm iterates until all calibration criteria are met simultaneously at a user-defined level of accuracy (relative error ≤ 10⁻⁵).
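The idea of iterating until a relative-error criterion is met can be sketched with a deliberately simplified example: calibrating zonal residual attractiveness terms in a single logit share model so that modelled shares reproduce observed shares. This is not the RSE calibration algorithm. The share model, the damped update rule, the function names and the data are all assumptions made for illustration; only the tolerance of 10⁻⁵ comes from the text.

```python
import math

def logit_shares(utility, residual, lam):
    """Zonal shares from a simple logit with residual attractiveness terms."""
    weights = [math.exp(lam * (u + e)) for u, e in zip(utility, residual)]
    total = sum(weights)
    return [w / total for w in weights]

def calibrate_residuals(utility, observed, lam, tol=1e-5, damping=0.5, max_iter=1000):
    """Adjust residual terms until the largest relative error is <= tol.

    The damped update (an assumption, not the paper's rule) takes partial
    steps, in the spirit of avoiding divergence in an iterative scheme.
    Residuals are identified only up to an additive constant.
    """
    residual = [0.0] * len(utility)
    for iteration in range(1, max_iter + 1):
        modelled = logit_shares(utility, residual, lam)
        rel_err = max(abs(m - o) / o for m, o in zip(modelled, observed))
        if rel_err <= tol:
            return residual, iteration, rel_err
        residual = [
            e + damping * math.log(o / m) / lam
            for e, o, m in zip(residual, observed, modelled)
        ]
    raise RuntimeError("calibration did not converge")

# Hypothetical utilities and observed shares for four zones
utility = [1.0, 0.5, 0.2, 0.0]
observed = [0.4, 0.3, 0.2, 0.1]
residual, n_iter, err = calibrate_residuals(utility, observed, lam=2.0)
print(n_iter, err)
```

In a full equilibrium model several such criteria interact, so each parameter update shifts the targets of the others; that interdependence is what makes repeated passes necessary.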

Table 3.

Parameters and data inputs for Spatial Equilibrium model calibration.

Item	Parameter to be estimated		Data input for calibration
Item	Parameter	Dimension	Data description	Dimension
Production & labour^a
Zonal percentage share of production output value over the city region	$E_{fz}$ zonal residual attractiveness for consumer type f in location z	[f × z]	Zonal output share estimated with district-level final consumption data	[z]^b
Regional average wage income (Yuan/annum) per labour type	$θ_{r}$ elasticity of substitution for labour varieties	1	Regional average wage income per labour type	[f]
	$κ_{fj}$ input-specific parameters for labour varieties	[f × j]	Regional average wage income per labour type	[f]
	$κ_{fj}$ input-specific parameters for labour varieties	[f × j]	Zonal number of workers per type	[f × j]
Daily average labour working time per labour type	$γ_{f}$ utility coefficient for leisure time	[f]	Average labour working time per labour type (available for 2010 only)	[f]
Employment-residence location choices
Employment-residence joint location choice of employed residents	$λ_{f}^{I}$ dispersion parameter for residence location choice	[f]	Estimated commuting matrix based on observed aggregate transport statistics	[ $f \times i \times j$ ]
	$E_{fi \| j}$ zonal residual attractiveness for residence location i, given employment location j	[ $f \times i \times j$ ]	Estimated commuting matrix based on observed aggregate transport statistics	[ $f \times i \times j$ ]
	$E_{fj}$ zonal residual attractiveness for employment location j	[ $f \times j$ ]	Zonal number of workers per type	[ $f \times j$ ]
Housing
For 2000, regional aggregate housing type choice per household type; for 2010, zonal housing rent pattern and regional average housing rent per housing type	$ı_{mfi}$ input-specific parameters for housing varieties	[ $m \times f \times i$ ]	Housing consumption per housing type per SE group (available for Beijing 2000 only)	[ $m \times f$ ]
			Zonal housing stock per type	[ $m \times i$ ]
			Zonal housing rent per housing type (available for Beijing 2010 only)	[ $m \times i$ ]
	$β_{f}$ utility coefficient for housing	$[f]$	Housing consumption per housing type per SE group (available for Beijing 2000 only)	[ $m \times f$ ]
	$β_{f}$ utility coefficient for housing	$[f]$	Zonal housing rent per housing type (available for Beijing 2010 only)	[ $m \times i$ ]

^a The intermediate demand for production, which is part of the theoretical model, is omitted from this application due to lack of data.

^b The estimated zonal final consumption data do not include the breakdown of different socio-economic groups. Therefore, we assume uniform parameters for all socio-economic groups.

Several points should be noted. First, for production and labour, we calibrate the zonal production output by the percentage share over the region, rather than using the absolute output value. This is because the actual production output or sales data are not currently available in Greater Beijing and we use the processed final consumption data as a proxy. Second, as no zonal wage data are available in Greater Beijing, we make use of the regional average wage for each labour type to calibrate wages at the aggregate level. The zonal wage pattern is manually examined with modellers’ local knowledge. Third, in the RSE model, employed consumers can trade off between working and leisure in terms of time utilization under budget and time constraints. The value of unit leisure time is measured by the opportunity cost, namely the hourly wage. The trade-off behaviour is calibrated with the Time Utilization Survey (National Bureau of Statistics (NBS), 2008) data in Beijing. Fourth, for calibrating the housing market, the zonal housing rent data are only available for the year 2010, and thus the housing rent calibration applies to the 2010-year model only. Nonetheless, for the 2000-year model, an alternative data set (i.e. housing consumption data per housing type per household type) is available to calibrate the housing type choices at the aggregate level. In addition, we do not calibrate the constant elasticity of substitution $σ_{f}$ for housing varieties due to limited data availability, and we take parameter values from the literature (e.g. Anas and Kim, 1996). Where there are no commonly accepted parameters, we carry out sensitivity tests and adopt parameter values by judgement (Wan, 2016).

Calibrating the RD models

Table 4 summarizes the parameters calibrated in the RD models and the corresponding data inputs. Note that some parameter values are derived with model specifications and assumptions. Systematic sensitivity tests have been conducted to understand the impacts of key parameters. The technical details of the sensitivity tests are documented in Wan (2016).

Table 4.

Parameters and data inputs for Recursive Dynamic model calibration.

Model	Parameter to be estimated		Data input for calibration
Model	Parameter	Dimension	Data description	Dimension
Business floorspace (BFS) growth model	$ϒ_{B}$ exogenous depletion rate	1	Model specification	-
	$Λ_{B}$ portion of natural growth over the aggregate growth	1	Model specification	-
	$λ_{B}$ dispersion parameter	1	Model specification	-
	$β$ regression coefficients for BFS growth model	5	Selected outputs of base-year (2000) Spatial Equilibrium model	-
	$E_{k, i \in k \| B}$ municipal/provincial level attractiveness term for BFS growth	3	Processed data on BFS growth during the period 2000 to 2010	3
Housing floorspace growth model	$ϒ_{b}$ exogenous depletion rate	1	Model specification	-
	$Λ_{b}$ portion of natural growth over the aggregate growth	1	Model specification	-
	$λ_{b}$ dispersion parameter	1	Model specification	-
	$α$ regression coefficients for housing growth model	5	Selected outputs of base-year (2000) Spatial Equilibrium model	5
	$E_{k, i \in k \| b}$ municipal/provincial level attractiveness term for housing growth	3	Processed data on housing growth during the period 2000 to 2010	3
Residence relocation model for non-employed residents	$ω_{F}$ lag coefficient	1	Model specification	-
	$λ_{F}$ dispersion parameter	1	Model specification	-
	$E_{i \| F}$ zonal residual attractiveness for residence location i for non-employed residents	[i]	Observed zonal number of non-employed residents in 2000 and 2010	[i]

The proposed stock-updating mechanism draws not only on the endogenous market variables from the SE model, but also on the durability of development and one-off policy interventions. We account for the spatial heterogeneity in floorspace growth by incorporating a set of residual terms for municipal/provincial locations ($E_{k}$). This new model specification facilitates the model application to large city regions, where significant heterogeneity across the cluster of cities often exists for historical, geographical and institutional reasons. This new model parameter is identified through the proposed calibration-validation loop, where predictive discrepancies between the modelled and the observed are investigated to improve the model performance. In the next section, we present the model validation through forward and backward forecasting.

Model validation

The calibrated RSE model needs to be validated before forecasting. For validation, we first run the calibrated SE model for 2000 to predict forwards to 2010, and compare the model predictions with the known data set. The forward forecast validates the predictive capabilities of the calibrated SE model for 2000. We then use the calibrated SE model for 2010 to predict backwards to 2000. This hypothetical backward forecast serves as an alternative way to validate the SE model for 2010 while its forward forecast for 2020 cannot yet be checked against observed data.

Validating the RD models ideally requires data for at least one more transitional period other than the 2000–2010 period, which is currently not available. Alternatively, we use the in-sample validation method, i.e. the model is estimated with a partial data set, while the remainder is used to validate the calibrated model. In the next section, we discuss the validation of the SE and RD models in turn.
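The partial-data approach described above can be sketched with a one-regressor least-squares fit: estimate on a subset of zones, then measure the error on the zones held back. This is only an illustration under stated assumptions. The RD growth models use several regressors and other terms, and the data, the split rule and the function names here are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

# Hypothetical zonal data: x is an explanatory variable, y is floorspace growth
x = [float(i) for i in range(1, 21)]
y = [3.0 + 0.8 * xi + (0.5 if i % 2 == 0 else -0.5) for i, xi in enumerate(x)]

# Hold back every fourth zone for validation; estimate on the rest
holdout = [i for i in range(len(x)) if i % 4 == 0]
train = [i for i in range(len(x)) if i % 4 != 0]

slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
predicted = [intercept + slope * x[i] for i in holdout]
holdout_rmse = rmse([y[i] for i in holdout], predicted)
print(slope, intercept, holdout_rmse)
```

The held-back error is the quantity of interest: it is computed on zones that played no part in the estimation, so it is not flattered by the fitting itself.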

SE model: Forward forecast from 2000 to 2010

In the forward forecast, the parameterization of the calibrated SE model for 2000 is retained, and the known boundary conditions and stock constraints for 2010 are used as inputs. Table 5 summarizes the data inputs for the forward forecast.

Table 5.

Data inputs for the forward forecast (from 2000 to 2010).

Data input	Dimension
Socio-demographic aggregates
$H_{f}$^a City-region total of employed resident type f	[f = 3]
$J_{f}$ City-region total of employed worker type f	[f = 3]
Stock constraints
$B_{ki}$ Zonal stock of business floorspace type k	[ $k = 1; i = 209$ ]
$b_{mi}$ Zonal stock of housing floorspace type m	[ $m = 3; i = 209$ ]
$G_{fij}$ Time cost of transport for consumer type f	[ $f = 4; i = 209; j = 209$ ]
$g_{fij}$ Monetary cost of transport for consumer type f	[ $f = 4; i = 209; j = 209$ ]

^a For the non-employed residents, we take the zonal totals as inputs because the residence relocation of the non-employed residents is modelled with RD models rather than the SE model. Thus we do not consider it for validation.

Given the exogenous inputs, the calibrated SE model for 2000 predicts the zonal prices, rents and quantities for the year 2010. In this paper, the model validation is focused on the spatial distribution of employed residents (ER) and workers (EW). Specifically we compare the modelled zonal number of ER and EW with the known data set for 2010 through scatter plots (Figure 3). In the scatter plot, the thick red line is the y = x reference line representing a perfect fit.

Figure 3.

Validation through the forward forecast – EW/ER observed vs. modelled.

The scatter plots show that the modelled zonal numbers of ER and EW are approximately linearly correlated with the observed data, with R-squared over 0.8. In terms of the root mean square error (RMSE), the Middle-SE group presents significantly higher fitting errors than the other groups. This is because all ER in peripheral zones are assumed to have the same socio-economic characteristics as the Middle group. Given that the peripheral zones are geographically aggregated, the large resident/worker numbers result in a large RMSE value for the Middle group. Such a bias does not affect the High and Low groups. To separate the impact of peripheral zones, additional R-squared and RMSE values are provided for the Middle group, which are derived for the core zones only. We investigate the outliers in the scatter plots and try to find the sources of discrepancy. In particular, we focus on the possible policy interventions and background changes that are exogenous to the proposed model. The investigation notes on outliers are summarized in Table 6.
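The two goodness-of-fit measures, and the way a few large aggregated zones can dominate RMSE, can be illustrated as follows. This is a minimal sketch: we assume R-squared is the squared Pearson correlation between observed and modelled values, and the zonal counts are invented for illustration rather than taken from the model.

```python
import math

def r_squared(observed, modelled):
    """Squared Pearson correlation between observed and modelled values."""
    n = len(observed)
    mean_o, mean_m = sum(observed) / n, sum(modelled) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(observed, modelled))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_m = sum((m - mean_m) ** 2 for m in modelled)
    return cov * cov / (var_o * var_m)

def rmse(observed, modelled):
    """Root mean square error, i.e. spread around the y = x line."""
    n = len(observed)
    return math.sqrt(sum((m - o) ** 2 for o, m in zip(observed, modelled)) / n)

# Hypothetical zonal counts (thousand persons); the last zone stands in
# for a large, geographically aggregated peripheral zone
observed = [12.0, 30.0, 45.0, 8.0, 400.0]
modelled = [10.0, 33.0, 41.0, 9.0, 430.0]

print("all zones :", r_squared(observed, modelled), rmse(observed, modelled))
print("core only :", r_squared(observed[:4], modelled[:4]), rmse(observed[:4], modelled[:4]))
```

Because RMSE is expressed in the units of the data, one large zone with a modest relative error can outweigh many small zones, which is the reason for reporting core-only statistics alongside the full set.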

Table 6.

Notes on outliers – forward forecast.

Outlier zone	ER/EW type	Notes
Zone 4 Xuanwu district	High ER Underestimated	This zone is one of the six historic districts in Beijing and has undergone large-scale urban regeneration since early 2000. The improved residence attractiveness is absent in the 2000-year model.
Zone 29 Ganjiakou	High ER Overestimated	The concentration of High ER in Zone 29 started before 2000, but the launch of the Beijing Central Business District (CBD) in late 2000 caused a dramatic increase of High EW. Overestimation is likely to be caused by the large residence attractiveness term inherited from 2000.
Zone 36 Zizhuyuan	High ER Overestimated	Adjacent to the Zhongguancun Hi-tech Development Area, High ER were highly concentrated in Zone 36 in 2000. However, the relative concentration level actually decreased from 2000 to 2010. Overestimation is thus caused by the large residence attractiveness term in 2000.
Zone 50 Qinghua University	High ER Overestimated	High EWs were highly concentrated around Zone 50 in 2000. This trend continued from 2000 to 2010, but the relative concentration level actually decreased from 2000 to 2010. Overestimation is likely to be caused by the large attractiveness term in 2000.
Zone 8 Nanyuan County	Low ER Overestimated	The rise of the clothing wholesale industry transformed this area from a suburb in 2000 into a booming business hub in 2010. A large number of Low ER were thus relocated. Overestimation is likely to be caused by the retained attractiveness term from 2000.
Zone 89 Dongxiaokou	Low ER Underestimated	Tiantongyuan, one of the largest affordable housing sites in Beijing, was located in Dongxiaokou. The local population has increased dramatically since 1999.
Zone 12 Fangzhuang	Low EW Underestimated	Fangzhuang used to be a famous residence location for High ER. The emerging retail business and local services may attract the inflow of Low EW.

ER: employed residents; EW: employed workers.

Two major sources of discrepancy are identified: (1) the use of constant parameters, particularly the residual terms reflecting locational idiosyncratic attractiveness; and (2) exogenous policy interventions and other one-off events. Apart from these outliers, the model calibrated to the census year 2000 is shown to be capable of generating predictions for the year 2010 that are consistent with the observed patterns. This implies that the SE framework captures the fundamental market mechanisms that are relatively persistent over time. In the next section we conduct the backward-forecast validation from year 2010 to 2000.

SE model: Backward forecast from 2010 to 2000

In the backward forecast, we use the SE model calibrated for 2010 to predict the market conditions of 2000. The data inputs required are the same as listed in Table 5, but are for the year 2000 instead of 2010. We compare the predicted zonal number of ER and EW for 2000 with the known data set through scatter plots (Figure 4).

Figure 4.

Validation through the backward forecast – EW/ER observed vs. modelled.

A linear relationship between the modelled and observed zonal numbers of ER and EW can be identified, with an average R-squared over 0.8. For the Middle group, additional R-squared and RMSE values are provided, which are derived for the core zones only. Among the three socio-economic groups, the High group sees relatively larger predictive errors than the others, which is consistent with the forward-validation results. The investigation notes related to the outliers are summarized in Table 7.

Table 7.

Notes on outliers – backward forecast.

Outlier zone	ER/EW type	Notes
Zone 1 Xicheng District	High ER Overestimated	The development of Beijing Financial Street and the urban regeneration improve the attractiveness of Zone 1 for High ER in 2010. Overestimation is thus caused by the large residence attractiveness term retained from 2010.
Zone 2 Dongcheng District	High ER Overestimated	Dongcheng is one of the six historic districts in Beijing and has seen large-scale urban regeneration since early 2000, which improves the local attractiveness for High ER in 2010. Overestimation is thus caused by the large residence attractiveness term in 2010.
Zone 4 Xuanwu District	High ER Overestimated	Same as above
Zone 29 Ganjiakou	High ER Underestimated	The Beijing Central Business District (CBD), launched in late 2000, caused a dramatic increase in High EW but the relative level of ER concentration in 2010 is actually lower than that in 2000, which causes underestimation.
Zone 36 Zizhuyuan	High ER Underestimated	Zone 36 is adjacent to the Zhongguancun Hi-tech Development Area, where High EWs were highly concentrated since 2000 but the relative level of ER concentration in 2010 is actually lower than that in 2000, which causes underestimation.
Zone 50 Qinghua University	High ER Underestimated	High EWs were highly concentrated around Zone 50 since 2000; the relatively higher concentration level in 2000 requires a larger attractiveness term to reproduce the observed number of High ER.
Zone 199 Tianjin Centre	Middle ER Middle EW Both overestimated	The core area of Tianjin has undergone large-scale urban reconstruction since early 2000. The housing stock size, however, remained stable, implying a larger attractiveness term in 2010. Overestimation is thus caused by the large residence attractiveness term inherited from 2010.
Zone 1 Xicheng District	Low ER Low EW Both overestimated	The development of Beijing Financial Street from 2000 to 2010 triggered the growth of the local service industry, which may attract the inflow of Low EW. The retained attractiveness from 2010 causes overestimation in 2000.

ER: employed residents; EW: employed workers.

Comparing the forward and backward forecast tests, we find that some outliers are consistent in the sense that they are present in both tests. Zones that are overestimated in the forward forecast for 2010 are generally underestimated in the backward forecast for 2000. The opposite direction of the errors implies that the proposed model is effective in representing exogenous policy changes between static equilibria. We find that such policy interventions change not only the physical stock of building floorspace, but also the unobserved choice patterns (represented by the residual attractiveness term in location choice models). In model calibration, the residual attractiveness terms in location choice models come into effect and serve as a useful complement to the endogenous utility measure in order to reproduce the observed location choice patterns. The residual terms are time specific, and a model calibrated over multiple time horizons naturally involves multiple sets of such parameters. Examining the change of the residual terms over time may provide new insights into the wider impacts of stock constraints and policy interventions. In terms of modelling implications, our forward and backward forecasts show that the choice of base year is likely to have a significant impact on the outcome of model predictions. A model calibrated to the most recent data would provide a more precise handle on model parameterization, and thus reduce the potential modelling error in long-term forecasts.
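The role of the residual attractiveness term can be illustrated with a small numerical sketch. It assumes a plain multinomial logit for location choice in which the residual enters the utility additively; the actual functional form of the SE model is not given in this section, so the functions `logit_shares` and `calibrate_residuals`, the three zones and all values below are hypothetical. The residuals are calibrated so that the 2010 shares are reproduced exactly, and are then retained in a backward forecast for 2000.

```python
import math


def logit_shares(utilities, residuals):
    """Multinomial logit shares with additive residual attractiveness terms."""
    weights = [math.exp(v + e) for v, e in zip(utilities, residuals)]
    total = sum(weights)
    return [w / total for w in weights]


def calibrate_residuals(utilities, observed_shares):
    """Residuals that make the logit reproduce the observed shares exactly.

    Residuals are identified only up to an additive constant, so they are
    normalised here to have zero mean.
    """
    raw = [math.log(s) - v for s, v in zip(observed_shares, utilities)]
    mean_raw = sum(raw) / len(raw)
    return [r - mean_raw for r in raw]


# Hypothetical three-zone example (values are illustrative only).
utilities_2010 = [1.0, 0.5, 0.0]
observed_2010 = [0.6, 0.3, 0.1]
residuals_2010 = calibrate_residuals(utilities_2010, observed_2010)

# Calibration check: the 2010 shares are reproduced by construction.
reproduced_2010 = logit_shares(utilities_2010, residuals_2010)

# Backward forecast: 2000 utilities combined with residuals retained from 2010.
utilities_2000 = [0.6, 0.5, 0.0]
backcast_2000 = logit_shares(utilities_2000, residuals_2010)
print([round(s, 3) for s in backcast_2000])
```

In this toy setting, any attractiveness that the first zone gained between 2000 and 2010 for reasons outside the utility measure is carried back into the 2000 prediction through the retained residual, which is the mechanism behind the overestimation noted for several zones in Table 7.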

RD model: In-sample validation

In this section, in-sample validation is conducted on the RD model for housing floorspace growth. The in-sample validation method denotes model calibration with part of the data set and model validation with the remainder. Specifically, the housing growth model is first calibrated with the data set for Beijing only. This is because (1) Beijing is the key study area in the Greater Beijing city region; and (2) the data set for Beijing is verified through crosschecking data from various sources and by modellers’ accumulated local knowledge. The calibrated model is then used to predict the housing growth for the rest of the city region (i.e. peripheral zones in Tianjin and Hebei). The predicted growth is compared with the known data.

Note that the proposed model does not model the city-region wide aggregate growth of housing. Instead we take the regional total housing growth from exogenous projections. The RD model predicts the zonal percentage share of the aggregate growth using a logit-type probabilistic model. We present the scatter plot of the modelled zonal housing growth versus the observed growth in Figure 5 for Beijing and in Figure 6 for Tianjin and Hebei. Note that the errors in Figure 5 are the regression residuals of the calibrated RD model, while Figure 6 presents the prediction errors revealed by the in-sample validation.
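The allocation step described above can be sketched as follows. This is a minimal illustration, assuming a standard logit share in which each zone receives a score and the exogenous regional total is split in proportion to the exponentiated scores. The score function, the coefficient values `beta_growth` and `beta_containment`, and the three zones are hypothetical; they stand in for the calibrated RD model, whose specification is not reproduced in this section.

```python
import math


def zonal_score(base_score, growth_dummy, containment_dummy,
                beta_growth=0.8, beta_containment=-0.8):
    """Hypothetical zonal score with two binary policy dummies.

    The coefficient values are placeholders, not calibrated parameters.
    """
    return (base_score
            + beta_growth * growth_dummy
            + beta_containment * containment_dummy)


def allocate_growth(total_growth, zonal_scores):
    """Split an exogenous regional total across zones with logit-type shares."""
    max_score = max(zonal_scores.values())  # subtract max for numerical stability
    weights = {z: math.exp(s - max_score) for z, s in zonal_scores.items()}
    denom = sum(weights.values())
    return {z: total_growth * w / denom for z, w in weights.items()}


# Three otherwise identical zones that differ only in their policy dummies.
scores = {
    "A": zonal_score(0.0, growth_dummy=1, containment_dummy=0),
    "B": zonal_score(0.0, growth_dummy=0, containment_dummy=0),
    "C": zonal_score(0.0, growth_dummy=0, containment_dummy=1),
}
growth = allocate_growth(1000.0, scores)
for zone, value in growth.items():
    print(zone, round(value, 1))
```

Because the shares sum to one, the zonal predictions always add up to the exogenous regional total, so the prediction errors concern only the spatial distribution of growth and not its aggregate level.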

Figure 5.

Housing floorspace growth model – observed vs. modelled – Beijing.

Figure 6.

Housing floorspace growth model – observed vs. modelled – Tianjin & Hebei.

Overall, the housing growth model estimated with the Beijing data shows satisfactory predictive capability when applied to other locations. The investigation notes on outliers are summarized in Table 8, where we try to explain the errors, with particular focus on policy interventions that may induce unusual growth or stagnation during the transition period.

Table 8.

Notes on outliers – housing floorspace growth.

	Residual type	Policy variable (ℶ $_{i \| b}$ and $_{i \| b}$ )^a	Comment
Beijing^b
Zone: 89 Dongxiaokou	Underestimated	ℶ $_{i \| b} = 1$ $_{i \| b} = 0$	Tiantongyuan, one of the largest affordable housing sites in Beijing, was launched in 1999.
Zone: 90 Huilongguan	Underestimated	ℶ $_{i \| b} = 1$ $_{i \| b} = 0$	Huilongguan, a planned satellite town in the north of Beijing, was established in 2000.
Zone: 95 Beiqijia	Underestimated	ℶ $_{i \| b} = 1$ $_{i \| b} = 0$	A State-level policy initiative of Exemplar Town Development was launched from 2000.
Tianjin
Zone: 205 Tanggu District	Underestimated	ℶ $_{i \| b} = 1$ $_{i \| b} = 0$	Binhai New Area, a state-level development area, was established in 2009.
Hebei
Zone: 185 Hengshui City	Overestimated	ℶ $_{i \| b} = 0$ $_{i \| b} = 0$	No policy trend or one-off event identified
Zone: 192 Xuanhua City	Underestimated	ℶ $_{i \| b} = 0$ $_{i \| b} = 0$	No policy trend or one-off event identified

^a ℶ $_{i | b}$ and $_{i | b}$ are dummy variables reflecting zonal housing policy (ℶ $_{i | b} = 1$ means policy-oriented growth identified, otherwise ℶ $_{i | b} = 0$ ; $_{i | b} = 1$ means containment policy for housing identified, otherwise $_{i | b} = 0$ ).

^b The listed outliers in Beijing are excluded in the calibration process in order to prevent distorted fitting.

The investigation shows that the outliers in Figures 5 and 6 are mainly caused by exogenous policy interventions of extraordinary scale and intensity. In the RD model for housing growth, exogenous policy interventions are represented by binary dummy variables ℶ $_{i | b}$ and $_{i | b}$ . However, such policy variables are essentially calibrated with the data for Beijing only. For locations within Beijing, such variables, once enabled, represent a uniform and average level of policy effectiveness. When the same parameterization is applied to locations outside Beijing, prediction errors may occur because the effectiveness of local policies is likely to vary significantly among locations. In addition, modellers' information about local housing policies is imperfect; thus the setting of the policy variables may be partial, which also leads to prediction discrepancies.

Despite the discrepancies between the modelled and observed growth, the SE model for 2010 is calibrated using the observed floorspace data. Thus, the prediction errors of the RD models for floorspace growth do not affect the SE model for 2010. The RD models start to operate in forecast mode from 2020 onwards. In forecast mode, the model parameterization is retained, while the setting of the policy variables is subject to a scenario-specific policy scheme.

Conclusions

In this paper, we present a new LUTI model calibration and validation strategy and its application to the Greater Beijing city region. The strategy involves: (1) the calibration of the SE model for the Census years 2000 and 2010, and the calibration of the RD models for the period 2000–2010; and (2) the validation of the SE models through forward and backward forecasting and the in-sample validation of the RD models. The strategy can be used for more time horizons in the future when further observed data accumulate. The proposed modelling strategy sets a new standard for verifying and validating recursive equilibrium models.

Technically, static equilibrium models can be well calibrated for a single cross section in the sense that the calibrated model is able to reproduce the observed patterns with high accuracy. But the cross-sectional reproduction does not guarantee the predictability of the model, nor the correctness of model parameters or the data. The model validation via forward/backward forecasting thus helps to verify the stability of the model parameters. Meanwhile, the exercise also helps to identify possible calibration errors, which are difficult to detect for models with only one cross section. By investigating the source of errors in model validation, modellers obtain a better understanding of the model’s capability in terms of what can be explained by the model and what cannot, which inspires new variables and mechanisms to be considered to improve the model representation. This learning process should become a routine feature in model calibration.

The model calibration and validation results shed new light on the nature and precision of the RSE model. The model calibrated for the Census year 2000 is shown to be capable of generating good predictions for the year 2010 that are consistent with the observed patterns. In a hypothetical backward prediction exercise, the model calibrated for the Census year 2010 has also demonstrated the stability of the model parameters over time. This indicates that the recursive SE framework is capable of capturing the bulk of the fundamental mechanisms of the interactions among land use and transport activities through reasonably parsimonious equations.

In this paper, model validation is limited to the spatial distribution of ER and EW. In future research, model validation can be extended to other market variables, such as production output and price patterns. The model validation may also be improved through an increase in frequency. For example, in Beijing there has been a schedule of five-yearly comprehensive transport surveys, which collect both land use and transport data. However, to date such data are not available to researchers outside China. In fact, many of the non-traditional statistics are likely to have disclosure restrictions attached to them, and it would be necessary to involve those who own the data sources and explore new ways to use such data in a proper manner. Increased validation frequency will also undoubtedly bring new research issues regarding model design, as more and more shorter-term dynamics are revealed. These shorter-term dynamics may involve complex interactions of various urban systems and are not yet well understood by modellers (Batty, 2012).

Finally, the significance of model validation goes beyond assessment of the predictive performance of any single model. It helps establish falsifiability of urban model theories through routine procedures, where the model’s predictive power is quantified with empirical evidence. The revealed gaps between the modelled and the observed indicate directions for model advancements. Small but cumulative steps could thus be made to contribute to the conviction that developing urban models is not merely an act of faith, but an act of science in understanding urban complexity.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Li Wan wishes to acknowledge the China Scholarship Council and the Cambridge Overseas Trust for the financial support for his doctoral study at Cambridge. Both authors wish to acknowledge the funding support for modelling methodological developments from EPSRC Global Challenge Research Fund project “Urban poverty in Chinese cities” and from a special fund of Ministry of Education of China Key Laboratory of Eco Planning & Green Building at Tsinghua University, Beijing. The usual disclaimers apply and the authors alone are responsible for views and any remaining errors.

Li Wan, PhD (Cantab), is a Research Associate at the Martin Centre for Architectural and Urban Studies, University of Cambridge. He is interested in developing advanced computer models of cities, particularly the spatial equilibrium models for assessing large-scale urban development policies.

Ying Jin, PhD (Cantab), is a Senior Lecturer at the Department of Architecture, University of Cambridge. He is the incoming director of the Martin Centre for Architectural and Urban Studies. He is interested in understanding how technology, policy and human behaviour affect the development of cities and their infrastructure, and in using this knowledge for creating new design solutions.

References

Anas, Kim (1996) General equilibrium models of polycentric urban land use with endogenous congestion and job agglomeration. Journal of Urban Economics 40(2): 232–256.

Anas, Liu (2007) A regional economy, land use, and transportation model (RELU-TRAN): Formulation, algorithm design, and testing. Journal of Regional Science 47(3): 415–455.

Batty (2012) Managing complexity, reworking prediction. Environment and Planning B: Planning and Design 39(4): 607–608.

Batty, Mackie (1972) The calibration of gravity, entropy, and related models of spatial interaction. Environment and Planning A 4(2): 205–233.

Bröcker (1998) Operational spatial computable general equilibrium modeling. The Annals of Regional Science 32(3): 367–387.

Echenique, Crowther, Lindsay (1969) A spatial model of urban stock and activity. Regional Studies 3(3): 281–312.

Goldthorpe, Llewellyn, Payne (1980) Social Mobility and Class Structure in Modern Britain, Oxford: Clarendon Press.

Heppenstall, Crooks, See, et al. (2011) Agent-based Models of Geographical Systems, London, UK: Springer Science & Business Media.

Jin Y, Denman S, Deng D, et al. (2017) Environmental impacts of transformative land use and transport developments in the Greater Beijing Region: Insights from a new dynamic spatial equilibrium model. Transportation Research Part D: Transport and Environment 52: 548–561.

Jin, Echenique, Hargreaves (2013) A recursive spatial equilibrium model for planning large-scale urban change. Environment and Planning B: Planning and Design 40(6): 1027–1050.

Lowry (1964) A Model of Metropolis, Santa Monica, CA: Rand Corporation.

Miller, Farooq, Chingcuanco, et al. (2012) Historical validation of integrated transport-land use model system. Transportation Research Record: Journal of the Transportation Research Board 2255(1): 91–99.

National Bureau of Statistics (NBS) (2008) Beijing Residents’ Time Utilization Survey. National Bureau of Statistics of the People’s Republic of China.

Rong X (2016) Housing the poor in the outskirts of a city – The case of Beijing. University of Cambridge.

Simmonds, Waddell, Wegener (2013) Equilibrium versus dynamics in urban modelling. Environment and Planning B: Planning and Design 40(6): 1051–1070.

Volterra, et al. (2007) Modelling Transport and the Economy in London. Available at: https://www.london.gov.uk/sites/default/files/gla_migrate_files_destination/tandem_framework_20071221.pdf.

Wan L (2016) A recursive spatial equilibrium model for the Beijing–Tianjin–Hebei city region. PhD dissertation, The Martin Centre for Architecture and Urban Studies, University of Cambridge.

Wegener (1994) Operational urban models state of the art. Journal of the American Planning Association 60(1): 17–29.

Wegener (2004) Overview of land-use transport models. Handbook of Transport Geography and Spatial Systems 5: 127–146.

Wegener, Gnad, Vannahme (1986) The time scale of urban change. In: Hutchinson, Batty (eds) Advances in Urban Systems Modelling, Amsterdam: North-Holland, pp. 145–197.
