Abstract
Hedonic estimations of the effect of transport infrastructure on property prices vary widely. This high variability demonstrates a deficit in our understanding of these relationships, limits the utility of econometrics for the valuation of urban property markets, and limits the development and implementation of effective and fair market-based policy tools. Several avenues may lead to improved consistency: re-consideration of accessibility, inclusion of urban design characteristics, assessment of spatial dependence and spatial heterogeneity, and consideration of geographic scale. This paper outlines the rationale and opportunities for inclusion of, and presents empirical tests for, these assertions using a case study in western Sydney, Australia. Results show a number of urban design characteristics to be significant determinants of residential property price. Street connectivity and higher density in areas surrounding residences negatively impact price, whereas higher density close to train stations positively impacts price in one model. Park-and-ride stations led to decreases in property values. Smaller study area results indicate a nonlinear relationship between distance to train station and property price and a disamenity impact for residences within 400 m of train stations. Relative accessibility measured as frequency of peak hour trains is a significant and positive determinant of price in the larger study area. Incorporation of a price trend surface and estimation using a spatial error model reduce the extent to which spatial autocorrelation overstates the effect of a train station on prices. These conceptual and empirical improvements further develop our understanding of the effect of transport infrastructure on property values.
Introduction
Hedonic price modelling (HPM) is a primary means of estimating the relationship between transportation infrastructure development and land and building value changes. The majority of studies indicate that the introduction of transit services leads to increasing land values, although empirical results are highly variable and inconsistent in North America (Debrezion et al., 2007; Higgins and Kanaroglou, 2016) and internationally (Bartholomew and Ewing, 2011; Mohammad et al., 2013). Variability in results is found in studies documenting case examples and literature reviews that aggregate the results of individual studies. The impacts of different types of projects (e.g. light rail versus heavy rail) also vary tremendously in the literature. Bartholomew and Ewing (2011) summarise the literature relating transit infrastructure to property prices stating it is ‘… both mature and yet still in its infancy’ (p. 30). Further, ‘… because much of that literature ignores the roles that urban form and development design play in real estate values (and transit ridership), its explanatory power is severely limited’ (p. 30).
Responding to such calls to reduce modelling inconsistency, this investigation empirically explores the possibility that variability in hedonic price models may be reduced through refined characterisation of accessibility, inclusion of urban design characteristics, greater interrogation of methods including assessment of spatial dependence and spatial heterogeneity, as well as thoughtful consideration of geographic scale. This paper explores these possibilities using a case study of commuter rail infrastructure in the western suburbs of Sydney, Australia. Development of our conceptual understanding, along with empirical testing of these ideas, will yield greater understanding of the factors that contribute to results inconsistency.
Background
HPM is a form of multi-variable regression that uses a market indicator (e.g. sale price) as the dependent variable and quantifiable measures of property characteristics or attributes as the independent variables. For residential properties these include:
structural attributes such as the building’s quality, age and size;
location attributes such as nearest transport connections, jobs centres and services; and
neighbourhood attributes such as crime levels and socioeconomic profile.
There is considerable recent scholarship that provides insight into the relationship between property-specific structural characteristics and housing prices, including Bartholomew and Ewing (2011), de Hann and Diewert (2013), Mulley (2014) and Diewert et al. (2015). These papers note the influence of type of structure (e.g. house or apartment), structure size, liveable floor space, property size, building materials, number of bedrooms, number of bathrooms, overall number of rooms, number of parking spaces, building age, presence of a garage, presence of a swimming pool, air conditioning, presence and/or type of heating, and number of floors. Attributes of land that may influence land price and/or overall property sales price include the shape of a lot, landscaping, elevation, slope and aspect.
Locational attributes are typically proximity variables and/or geographical coordinates. Location characteristics that may impact property prices include distance to: businesses, commercial districts or shops; schools; parks (Bartholomew and Ewing, 2011; Diewert et al., 2015); other amenities (de Hann and Diewert, 2013); employment opportunities (often the central business district); and the coast or other scenic views (Baker and Nunns, 2015; McLeod, 1984). Proximity benefits may be modified by the attributes of those features providing the benefit. For example, Bartholomew and Ewing (2011) point out the benefits of being close to parks and open space, which are based on proximity as well as characteristics of those amenities, such as size. Location-based characteristics may also result in property price decreases. Potential examples include distance to rail lines and automobile traffic because of noise, vibration and light (Bartholomew and Ewing, 2011), to high-voltage overhead transmission lines (Wadley et al., 2017) and to landfills. There are potential decreases in property values due to increased crime near transport hubs (Bartholomew and Ewing, 2011; Mohammad et al., 2013).
Variables that cover areas larger than individual parcels may be referred to as neighbourhood attributes. Examples include: demographic characteristics (Baker and Nunns, 2015); environmental quality (Bastian et al., 2002); density; crime rates; noise and air pollution (Bartholomew and Ewing, 2011; Han et al., 2017); and zoning. Other potentially relevant area characteristics include: low-income rental housing (Davison et al., 2017); permitted building height (Kittrell, 2012; Lieske et al., 2018); tax rate; school quality; and other socioeconomic indicators. Neighbourhood attributes that likely influence prices but may present modelling challenges include the regulatory environment, economic conditions, availability of financing, and other real estate market conditions (Baker and Nunns, 2015). Area-based locational variables may be included in an HPM either by ascertaining the value of the variable where it overlaps a property (e.g. the neighbourhood crime rate) or as a dummy variable (e.g. zoning). Neighbourhood attributes may also result in property price decreases, for example in areas of high crime or air pollution. In a longitudinal analysis, consideration of the timing of the announcement of plans to build transportation infrastructure and how the local real estate market responds is also important.
Despite – or perhaps because of – HPM attempting to incorporate this broad variety of factors that influence property prices, there can be a high degree of variability in the modelled effect of any one such factor. The academic literature reports high levels of variability in value uplift associated with transportation infrastructure development. Debrezion et al. (2007) summarised 57 HPM studies looking at the impact of distance to rail stations on property values. They find an average residential property price increase of 4.2% within 0.25 miles of a rail station and that residential property prices increase by 2.4% for every 250 m closer to a rail station. They also note that commuter rail impacts property values more than light or other heavy rail. Mohammad et al. (2013) reviewed 23 studies, finding what they characterise as ‘a great deal of variability in the estimated change in values arising from rail investments’ (p. 158). Higgins and Kanaroglou’s (2016) review found a generally positive relationship between proximity and value uplift within a range of findings including negative impacts on land values from transit. Smith and Gihring (2017) summarise more than 100 studies looking at the impact of transit service on property values. They conclude that proximity to transit often, but not always, increases property values. These increases are often large enough to pay for a portion of transit infrastructure development.
Research in Australia, the focus of this research’s empirical core, also presents mixed results. McIntosh et al. (2014) investigate the impact of transit on urban land markets in Perth with a focus on new fast rail services, finding land price increases up to 40%. Mulley (2014) looks at land value uplift for bus rapid transit in western Sydney. Her global model results indicate that houses within 100 m of transit incurred a statistically significant price decrease of 9.2%. McIntosh et al. (2016) look at the entire Sydney region from 2000 through 2014 using a series of cross-sectional models finding, overall, value uplift associated with commuter rail of 4.5% within 400 m of stations, 1.3% from 400 m to 800 m and 0.3% from 800 m to 1600 m.
A number of authors put forward reasons for HPM results’ variability. Bartholomew and Ewing (2011) highlight dissimilarities in methods, market conditions, level of transit service and time horizons as contributing factors. Higgins and Kanaroglou (2016) suggest results variability derives from excessive focus on proximity, as well as, ‘a lack of empirical specificity from the use of proximity’ (p. 610). They argue proximity as a proxy for other factors yields omitted variables bias and may inadvertently result in omission of relevant characteristics, including design, that are unrelated to distance and lead to model misspecification. Higgins and Kanaroglou (2016) highlight relative accessibility and transit-oriented development as specific characteristics that are often omitted from analysis. Higgins and Kanaroglou (2016) also highlight methodological differences that could contribute to results variability including regression functional form, the exact nature of the dependent variables (e.g. sales price versus rental rate), study timing relative to infrastructure project announcement and study sample sizes. Following this literature, we consider four potential sources of variability in the effect of transport infrastructure on property value: accessibility; urban design; spatial effects; and geographic scale.
Accessibility
In HPM three phenomena are often conflated with the word ‘accessibility’. First is the idea of proximity, usually measured as distance from residence to train station. Second is destination desirability and variety, or simply where one is able to travel on public transport. Third is relative accessibility, which is defined in two distinctly different ways, as characteristics that differ between transit stations and as the differing levels of accessibility offered by different transport modes.
Most studies invoke the notion of accessibility as proximity of residence to transit station (Higgins and Kanaroglou, 2016; Mavoa et al., 2012; Páez et al., 2012), typically measured as direct or network distance from each property to the nearest transit station or stop. Ewing and Cervero (2010) indicate that distance from residence to transport node may be measured as transportation route density, such as distance between transit nodes or stations per unit area. From the literature, the area of influence of an individual transit node ranges, at the long end, from 1000 m to 2000 m for residential properties and from 400 m to 1200 m for commercial properties, with effects over that distance not necessarily being linear (Mohammad et al., 2013).
Accessibility as destination desirability is the range of destinations including employment opportunities made available by the public transportation network. This is referred to by Handy (1993) as regional accessibility and by Ewing and Cervero (2010) as destination accessibility. Lei and Church (2010) use the term ‘system facilitated accessibility’ to refer to the level of service and destination desirability. These authors present means of measuring destination accessibility including distance to CBD, employment opportunities or number of attractions within a given travel time, and a gravity model of trip attraction. Mavoa et al. (2012) implement a public transit and walking accessibility index. Handy (1993) uses the term ‘local and community accessibility’ to refer to the distance from a residence to nearby shops and other amenities.
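The gravity-style measure of destination accessibility mentioned above can be illustrated with a minimal sketch. It assumes a negative-exponential decay of attraction with travel cost, which is one common formulation and not necessarily the one used by the cited authors. The function name, the decay parameter `beta` and the example figures are hypothetical.

```python
import math


def gravity_accessibility(opportunities, travel_costs, beta=0.1):
    """Sum destination opportunities, each discounted by travel cost.

    opportunities: opportunities (e.g. jobs) at each destination
    travel_costs:  travel cost (e.g. minutes) from the origin to each destination
    beta:          assumed decay parameter; larger values penalise cost more
    """
    if len(opportunities) != len(travel_costs):
        raise ValueError("one travel cost is required per destination")
    return sum(o * math.exp(-beta * c)
               for o, c in zip(opportunities, travel_costs))


# Illustrative values only: three destinations reachable by train.
jobs = [50000, 20000, 5000]
minutes = [30.0, 10.0, 5.0]
score = gravity_accessibility(jobs, minutes)
```

A cumulative-opportunities measure (the number of attractions within a given travel time) is the special case in which each destination counts fully inside a cost threshold and not at all beyond it.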
Relative accessibility focuses on the characteristics of the place where one can board public transportation including level of service. Higgins and Kanaroglou (2016) define relative accessibility as the, ‘context-dependent … characteristics of particular transit lines in different cities and regions’ (p. 621). Lei and Church (2010) define relative accessibility as a comparison between modes or types of users. Factors potentially impacting relative accessibility include type of transit system (e.g. commuter rail, light rail or bus rapid transit) (Debrezion et al., 2007; Mohammad et al., 2013), quality of service, frequency, speed, comfort, as well as cost, time and convenience trade-offs between the use of public transportation and driving (Bartholomew and Ewing, 2011; Mohammad et al., 2013). Age of the rail system (Mohammad et al., 2013) and, by extension, the quality in terms of cleanliness and comfort of the rail system may also be important. Mavoa et al. (2012) use the term ‘access’ to describe quantifying level of service using a transit frequency measure. Higgins and Kanaroglou (2016) suggest transit ridership is an indicator of relative accessibility.
Urban design
HPMs may be improved when they include measures of quality of the built environment including urban design. This encompasses notions of transit-oriented development as defined by Higgins and Kanaroglou (2017): ‘high-density, mixed-use, amenity rich, and pedestrian-friendly built environment around rapid transit stations’ (p. 2). Urban design attributes have been found to have varying effects, with Duncan (2011), for example, finding the quality of the pedestrian environment can both raise and lower property prices in areas proximal to transit stations.
Ewing and Cervero (2010) highlight differences between dense urban street grids and often poorly connected suburban streets as a basis for consideration of road network connectivity as an urban design element. Matthews and Turnbull (2007) found higher street connectivity to be associated with increased property prices in pedestrian-oriented development. Frank et al. (2008) measure street connectivity as intersection density. Duncan (2011) used an intersection count variable, street intersections within 400 m of a parcel, as a means of assessing the quality of the built environment for pedestrians. Ewing and Cervero (2010) suggest operationalising connectivity with measures of block size, four-way intersections and numbers of intersections per unit area.
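As a rough illustration of the connectivity measures described above, the sketch below counts intersections within 400 m of a parcel and converts the count to a density per square kilometre. It assumes projected coordinates in metres and straight-line distance; the cited studies may use network distance or other catchment definitions. All names and coordinates are hypothetical.

```python
import math


def intersections_within(parcel, intersections, radius_m=400.0):
    """Count intersections within radius_m (straight-line) of a parcel."""
    px, py = parcel
    return sum(1 for x, y in intersections
               if math.hypot(x - px, y - py) <= radius_m)


def intersection_density(count, radius_m=400.0):
    """Convert a count inside a circular catchment to intersections per km^2."""
    area_km2 = math.pi * (radius_m / 1000.0) ** 2
    return count / area_km2


# Illustrative coordinates in metres (assumed projected coordinate system).
parcel = (0.0, 0.0)
nodes = [(0.0, 100.0), (300.0, 0.0), (400.0, 0.0), (500.0, 0.0)]
count = intersections_within(parcel, nodes)
density = intersection_density(count)
```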
Consideration of urban design in HPM often includes the notion of density, for which there are two divergent characterisations. Density is sometimes used to indicate an increased quantity of amenities within a given area and, alternatively, is used to simply indicate a greater concentration of people. The former is likely to increase property values, whereas the latter is likely to decrease them. Nelson and Kim (2016) add the distinction of origin (around residences) and destination density. Increases in destination density are associated with increases in employment opportunities and other destination options and amenities.
Urban design may also be included in HPMs by quantifying land-use mix. Consumer preferences for walkability, pedestrian-friendly environments and transit-oriented development (Bartholomew and Ewing, 2011) suggest that greater land-use mix and aspects of street connectivity would be positively associated with property prices. Land-use mix is referred to as ‘diversity’ by a number of authors. As Ewing and Cervero (2010) note, studies measure diversity with entropy measures as well as jobs–housing balance and jobs–population ratios.
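An entropy measure of land-use mix can be sketched as follows. This uses a widely used normalised form, in which the index is 0 for a single land use and 1 when all categories have equal area; it is an assumption for illustration rather than the specific formula of any study cited here. The function name and category areas are hypothetical.

```python
import math


def land_use_entropy(areas):
    """Normalised entropy of land-use areas: 0 = single use, 1 = even mix.

    areas: area of each land-use category in the neighbourhood; the number
           of categories (including empty ones) sets the normalisation.
    """
    total = sum(areas)
    if total <= 0:
        raise ValueError("total land-use area must be positive")
    k = len(areas)
    if k < 2:
        return 0.0
    shares = [a / total for a in areas if a > 0]  # skip empty categories
    return -sum(p * math.log(p) for p in shares) / math.log(k)


# Illustrative areas (hectares) for residential, commercial, open space.
mix = land_use_entropy([12.0, 3.0, 5.0])
```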
Inclusion of urban design in HPMs may be facilitated with a distinction made by Bartholomew and Ewing (2011) between transit adjacent development and transit-oriented development. They present an example of the former as the addition of a train station to an auto-oriented suburb, which likely has a negligible impact on proximal land values. The latter are districts close to transit stations that incorporate pedestrian accessibility, higher quality urban design and mixed-use development. Along these lines, Kahn (2007) researched the impact of rail system development in US cities from 1970 to 2000 focusing on a distinction between ‘park-and-ride’ and ‘walk-and-ride’ stations. He finds the former tend to decrease home prices while the latter lead to home price increases. Duncan (2011) suggests the area near stations dedicated to parking may also impact nearby property values as more parking is negatively associated with price. Schuetz (2015) indicates that surface parking, parking garages and park-and-ride type stations create a barrier between transit riders and economic activity near stations. Contrasting urban with suburban stations, Higgins and Kanaroglou (2017) found that value uplift from transit-oriented development varied by station type with the greatest uplift taking place in inner urban and urban areas. For mixed-use areas, price impacts are minimal in auto-oriented development and both positive and negative (due to the benefits and drawbacks of proximity to retail) in more pedestrian-oriented developments (Matthews and Turnbull, 2007).
Spatial effects
The inherent spatial nature of HPM suggests the potential confounding influence of spatial effects, spatial dependence and spatial heterogeneity, in regression modelling. Yet, spatial effects are considered in only a modest number of studies. Higgins and Kanaroglou (2016) review over 130 analyses since the year 2000 and find only eight that address problems of spatial dependency and spatial heterogeneity with spatial regression models. Spatial dependence in an econometric model implies violation of Gauss-Markov regression assumptions resulting in incorrect standard errors as well as biased and inefficient estimators (Diao, 2015; Hill, 2011). Hill (2011) also notes a primary benefit of accounting for spatial dependence is that so doing helps compensate for problems due to omitted variables.
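One common diagnostic for spatial dependence is Moran's I computed on regression residuals. The sketch below is a minimal version assuming a user-supplied spatial weights matrix; the source does not state that this particular statistic was used, and the function name and example values are hypothetical.

```python
import numpy as np


def morans_i(values, weights):
    """Moran's I for a vector of values and an n x n spatial weights matrix.

    Positive results indicate that similar values cluster among neighbours;
    negative results indicate that neighbours tend to differ.
    """
    z = np.asarray(values, dtype=float)
    z = z - z.mean()                      # deviations from the mean
    w = np.asarray(weights, dtype=float)
    n = z.size
    return (n / w.sum()) * (z @ w @ z) / (z @ z)


# Four locations along a line; each is a neighbour of the adjacent ones.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
residuals = [1.0, 2.0, 3.0, 4.0]          # illustrative, smoothly trending
i_stat = morans_i(residuals, w)
```

In practice the statistic is compared against its distribution under spatial randomness, for example by permutation, before concluding that residuals are spatially autocorrelated.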
Geographic scale
Geographic scale, always present but infrequently discussed in HPMs, may contribute to results variability. Two aspects are considered here: the construction of variables, including area-based and location-based variables; and the boundaries of the model. As an example of the former, Atkinson-Palombo (2010) used census tracts as the unit of analysis for land-use mix, but these are arbitrary. The modifiable areal unit problem (MAUP) suggests the need to model geographic data at the scale at which processes are taking place, but this scale may differ for different independent variables as well as within and between communities. In addition to the MAUP, the concept of geographic scale also relates to the size of the study area in an investigation. The scale of HPM analysis on transport infrastructure is often based on distances from stations (e.g. 400 m, 800 m, 1600 m) where these distances are based on generalised notions of walking travel time of approximately 5 minutes per 400 m. Even here the definition of increments and outer limits are, at worst, arbitrary or, at best, based on heuristics rather than empirically driven choices.
Operationalising these factors, and testing their interaction
Developing an empirical strategy
High variability in HPM results demonstrates a deficit in our understanding of the relationships between elements of the built environment, including transport infrastructure, and property prices. This variability limits the utility of econometrics for the valuation of urban property markets, and limits the development and implementation of effective and fair market-based policy tools. Higgins and Kanaroglou (2016) highlight a lack of specificity in modellers’ understanding and use of proximity, as well as the omission of design considerations, as potentially contributing to HPM results’ variability. Both Bartholomew and Ewing (2011) and Higgins and Kanaroglou (2016) note methodological considerations contributing to HPM results’ variability, with the latter highlighting the need to incorporate spatial effects in estimations. We augment these recommendations with the need to carefully consider and empirically test notions of geographic scale. Over time, this may lead to a better understanding of the components of housing value as related to transportation infrastructure development. The empirical component of this research is organised around the following questions:
Are urban design characteristics and differing notions of accessibility significant in hedonic regressions?
Does including urban design characteristics, differing notions of accessibility, explicit consideration of scale and spatial effects improve regression results?
Can variability in HPM results be reduced by including urban design, accessibility, scale and spatial effects?
Study area and methods
This paper provides background information and an empirical estimation of four models guided by these questions using a case study of commuter rail in the western suburbs of Sydney, Australia. The intent is to compare models that include urban design attributes with ones that exclude them and compare models that use a ‘station catchment’ scale with a ‘whole area’ scale. The case study is built around the Parramatta City Council local government area (Figure 1). Parramatta is the Sydney region’s third major employment centre. Most commuting flows are from Parramatta to Sydney’s primary business district some 15 km to the east. The study area for this research was based on the 2015 boundary of the Parramatta City Council but extends to all lots in the circumcircle of the council boundary. This includes potentially relevant sales in close proximity to the council that would otherwise be excluded with strict adherence to the political boundary.

Figure 1. The Parramatta circle study area in western Sydney, New South Wales, Australia.
The sale prices of residential properties sold between 1 July 2014 and 30 June 2015 were sourced from the Australian Property Monitors (APM) Sydney geocoded sales data. Sales price is measured in nominal Australian dollars with a recorded sale date. The 12-month timeframe minimises the influence of temporal trends in the property market. A year variable further differentiates temporal effects. The APM data set also contains structural attributes. Parcel data were from the NSW cadastral database. Empirical testing and preliminary modelling dictated the variables used in the final models. These include:
dwelling type, lot size and number of bathrooms; likely serving as proxies for the size of the dwelling, following Mulley (2014)
presence of a fireplace and walk-in wardrobe; likely proxies for build quality
presence of a balcony, courtyard and internal laundry; likely proxies for apartments or townhouses over freestanding houses
the number of residential properties under strata title for the sold property; a proxy for the type of apartment complex.
Location attributes are based on distance to points of interest, sourced from the NSW topographical database. From a long list of potential location attributes, independent variables included were distances to the nearest electrical transmission lines, primary schools (usually up to 6th grade), high schools (usually 7th–12th grade) and train stations.
Neighbourhood attributes were sourced from a similarly large set of possible metrics, primarily from the 2011 Australian Census at the SA1 census geography. About the size of a city block, SA1s are the smallest polygons for which a broad selection of census socioeconomic data are available. Other attributes were sourced from NSW planning data sets. Those incorporated in the final models were land-use zoning (with higher densities associated with higher redevelopment potential), median rent, rental financial stress and mortgage financial stress. Although individual choices will shape proportional housing spend, high rental stress when measured as an aggregate neighbourhood rate among low-income households is a good indicator of low affordability (ABS, 2010) and so puts downward pressure on property price.
Accessibility as proximity measured as distance from residential properties to train stations is incorporated in the model as either direct distance or as distance buffers. Early estimations tested a variety of accessibility conceptualisations including variables capturing the distance to the Sydney CBD and a separate variable indicating the distance to the Parramatta CBD. We interpret the lack of significance of these variables to indicate a homogeneity of destination opportunities for residents of the study area and that accessibility as destination desirability is controlled for in this modelling. The number of peak hour trains is included in the model as a measure of relative accessibility.
Urban design variables were calculated using the Australia Urban Research Infrastructure Network (AURIN) walkability tool. The tool measures aspects of walkability that are analogous to the measures of urban design discussed in the hedonic literature including street connectivity (design), density and land-use mix (diversity). Street connectivity in AURIN is measured as the count of three-way or greater intersections per square kilometre (Giles-Corti et al., 2014). The ‘density’ attribute modelled is the AURIN walkability tool output ‘averagedensity’, which is people per hectare in each walkability catchment (Pettit et al., 2015). Land-use mix measures the extent to which areas of different land uses within each defined neighbourhood are equal (Pettit et al., 2015). It is calculated by extracting selected relevant land uses for each walkability catchment, then calculating the ratio of the total area of land-use cover divided by the summed area of the different land uses in the polygon (Giles-Corti et al., 2014). A park-and-ride dummy variable captures the distinction between park-and-ride and walk-and-ride stations.
Consideration of geographic scale in the models included two alternative spatial extents (Figure 2): the entire study area (referred to as the ‘Parramatta circle’) and within 2000 m of train stations (referred to as ‘station buffers’). The latter was based on the statistical significance of distance buffers up to 2000 m in preliminary modelling. Neighbourhoods were modelled differently at these two different scales: an overlay of 400 m² grid cells that cover the entire study area (Parramatta circle models) and buffered polygons within 2000 m network distance of train stations at 100 m increments (station buffer models) (Figure 3). The Parramatta circle may offer the most accurate assessment of local conditions throughout a study area while the station buffers may be most relevant to station-based analysis.
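Assigning a property to a 100 m station buffer can be sketched as below. The function takes a precomputed distance to the nearest station, so it is agnostic about whether that distance is direct or network-based; computing network distance itself is outside this sketch. The function name and band labelling (upper edge of each band) are hypothetical conventions.

```python
import math


def distance_band(distance_m, width_m=100.0, max_m=2000.0):
    """Return the upper edge of the buffer containing distance_m.

    Properties beyond max_m fall outside the station buffers and get None.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m > max_m:
        return None
    # A property exactly at the station is placed in the first band.
    return max(width_m, math.ceil(distance_m / width_m) * width_m)


# Illustrative distances in metres from property to nearest station.
bands = [distance_band(d) for d in (0.0, 250.0, 400.0, 2000.0, 2001.0)]
```

Each band can then enter a regression as a dummy variable, which allows the distance effect to be nonlinear.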

Figure 2. Modelled sales by lot at two scales of analysis: Parramatta circle (left) and 2000 m station buffers (right).

Figure 3. Selected independent variables: (A) predicted sales price trend surface, (B) station buffer density, (C) station buffer connectivity, (D) peak period trains and park-and-ride, (E) Parramatta circle density and (F) significant station buffers.
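The hedonic price model used in this study regards each property value as a package of three attributes: structural (S), location (L) and neighbourhood (N). This approach enables the development of a model that best estimates prices for properties by determining the weightings for the set of attributes across a number of sold properties. The sales price of each sold property (V) is the dependent variable and the main attributes (S, L, N) in the three categories are the independent variables (Equation 1):
$$V = f(S, L, N) \qquad (1)$$
That is, the modelling can determine the coefficients ($\beta$) of the attributes (Equation 2):
$$V = \beta_0 + \beta_1 S + \beta_2 L + \beta_3 N + \varepsilon \qquad (2)$$
where V is property price; S, L and N are the structural, location and neighbourhood attributes; $\beta_0$ is the intercept; $\beta_1$, $\beta_2$ and $\beta_3$ are the attribute coefficients; and $\varepsilon$ is the error term.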
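A hedonic regression of price on structural, location and neighbourhood attributes can be sketched with ordinary least squares. The example below uses entirely synthetic data and one hypothetical variable per attribute category; it illustrates the general technique only and does not reproduce the study's variables, functional form or estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic attributes (all values are invented for illustration).
floor_area = rng.uniform(60, 250, n)       # structural (S), square metres
dist_station = rng.uniform(100, 2000, n)   # location (L), metres
median_rent = rng.uniform(300, 700, n)     # neighbourhood (N), dollars/week

# Assumed "true" coefficients used only to generate the synthetic prices.
true_beta = np.array([200000.0, 3000.0, -50.0, 400.0])
X = np.column_stack([np.ones(n), floor_area, dist_station, median_rent])
price = X @ true_beta + rng.normal(0, 10000, n)

# OLS estimate: intercept followed by one coefficient per attribute.
beta_hat, *_ = np.linalg.lstsq(X, price, rcond=None)
```

Each estimated coefficient is the implicit price of one unit of its attribute holding the others fixed; here the distance coefficient is negative by construction.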
Preliminary estimations of equation (2) indicated substantial issues with multicollinearity, heteroscedasticity and spatial autocorrelation. De Hann and Diewert (2013) indicate multicollinearity as a common problem in hedonic regressions, with Kuminoff et al. (2010) noting that factors incorporated in HPM are rarely independent and in fact go up and down together. Multicollinearity was addressed first by calculating the variance inflation factors (VIF) for each potential independent variable and dropping most independent variables where VIF values are greater than 10. O’Brien (2007) notes VIF thresholds are a widely used approach but that excluding variables from a model should be theoretically based. Variables that measure price drivers that are not otherwise included in a model may be retained and their performance evaluated using confidence intervals and/or t-values. In the Parramatta data set the APM data are under the VIF threshold. Most of the census data, on the other hand, are over the VIF threshold. The distance to train (direct) and the individual buffers at 800 m as well as between 1000 m and 1800 m inclusive were also above the threshold.
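The VIF screening step can be sketched as follows, using the standard definition VIF = 1 / (1 − R²), where R² comes from regressing each candidate variable on all the others. The function name and the synthetic data are hypothetical; the threshold of 10 is the one stated in the text.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (no intercept column)."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])  # add an intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        centred = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centred @ centred)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


# Synthetic example: x3 is nearly a copy of x1, x2 is unrelated.
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + 0.1 * rng.normal(size=500)
factors = vif(np.column_stack([x1, x2, x3]))
flagged = factors > 10   # candidates for review, not automatic removal
```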
We began addressing spatial autocorrelation found in preliminary models by ascertaining the spatial structure of the price data. We developed a quadratic trend surface (Anselin, 2005) using sales price as the dependent variable and the polynomial and interaction terms of the coordinates of the observations as the independent variables (X², Y² and XY). Regression results show the polynomial and interaction terms are highly significant determinants of sales price with an adjusted R squared of 0.24. The price trend surface is shown in Figure 3A where darker shades indicate increasing predicted sales prices. The model presents a distinct pattern with predicted prices increasing from west to east on the western side of the map and increasing from south-west to north-east on the eastern side of the map. In order to incorporate the spatial heterogeneity of sales prices in the study area the polynomial and interaction terms were included in the hedonic regressions. A summary of the data input into the hedonic price model along with descriptive statistics and data sources is presented in Table 1. Figure 3 presents the spatial distributions of the price trend surface (Figure 3A), density at both the station buffers and study area scales (Figure 3B, 3E, respectively), connectivity at the station area scale (Figure 3C) as well as peak hour trains and park-and-ride stations (Figure 3D).
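The trend surface regression can be sketched as a least-squares fit of price on the squared and interaction terms of the coordinates. The sketch follows the terms named in the text (the squares of X and Y and their product) plus an intercept, which is an assumption; function names and data are hypothetical.

```python
import numpy as np


def fit_trend_surface(x, y, price):
    """Fit price = b0 + b1*x^2 + b2*y^2 + b3*x*y by least squares."""
    x, y, price = (np.asarray(a, dtype=float) for a in (x, y, price))
    A = np.column_stack([np.ones_like(x), x ** 2, y ** 2, x * y])
    coef, *_ = np.linalg.lstsq(A, price, rcond=None)
    return coef


def predict_trend_surface(coef, x, y):
    """Evaluate the fitted surface at coordinates (x, y)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return coef[0] + coef[1] * x ** 2 + coef[2] * y ** 2 + coef[3] * x * y


# Synthetic check: recover a known surface from noise-free observations.
rng = np.random.default_rng(2)
xs = rng.uniform(-1, 1, 50)
ys = rng.uniform(-1, 1, 50)
prices = 1.0 + 2.0 * xs ** 2 + 3.0 * ys ** 2 - 1.0 * xs * ys
coef = fit_trend_surface(xs, ys, prices)
```

With real coordinates, centring or rescaling X and Y before squaring is a common precaution against very large values in the design matrix.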
Table 1. Input data and descriptive statistics.
Notes: (a) only in Parramatta circle models; (b) only in station buffer models; (c) only in urban design models; GIS/CViz/AURIN: derived.
Results
Hedonic regression results are presented in Table 2 for four models at two spatial scales: the entirety of the Parramatta circle and the station buffers scale. Models at each scale are presented both with and without design features incorporated as independent variables. Model 1 presents results for the Parramatta circle, model 2 presents results for the Parramatta circle and includes design elements as independent variables, model 3 presents results at the station buffers scale and model 4 presents results at the station buffers scale that includes design elements. All four models are estimated as spatial error models (SEM) based on post-estimation tests in preliminary modelling, which indicate the appropriateness of SEMs over spatial lag and OLS regressions.
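For reference, the textbook form of the spatial error model, written with the attribute notation used in this paper, is sketched below. The symbols $W$ (a spatial weights matrix), $\lambda$ (the spatial error coefficient), $\rho$ (the spatial lag coefficient) and $u$ (the spatially correlated disturbance) are standard notation assumed here, not taken from the source.

```latex
% Spatial error model: spatial dependence enters through the disturbance.
\begin{align}
V &= \beta_0 + \beta_1 S + \beta_2 L + \beta_3 N + u, \\
u &= \lambda W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I).
\end{align}

% Spatial lag model, for comparison: dependence enters through neighbouring prices.
\begin{equation}
V = \rho W V + \beta_0 + \beta_1 S + \beta_2 L + \beta_3 N + \varepsilon.
\end{equation}
```

When $\lambda = 0$ the spatial error model reduces to the ordinary regression, which is why post-estimation tests on $\lambda$ and $\rho$ are used to choose between OLS, spatial error and spatial lag specifications.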
Table 2. Hedonic regression results.
Notes: *p < 0.05, **p < 0.01, ***p < 0.001.
VIF≥10. □ Not in model.
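For readers less familiar with the spatial error model, its standard textbook form is sketched below. The notation is an assumption introduced for illustration: y is the vector of sales prices (or a transformation of them), X the matrix of hedonic regressors, W a spatial weights matrix whose construction is not described in this section, and λ the spatial autoregressive coefficient reported as Lambda.

```latex
\begin{aligned}
y &= X\beta + u, \\
u &= \lambda W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^{2} I), \\
\text{so that}\quad y &= X\beta + (I - \lambda W)^{-1}\varepsilon .
\end{aligned}
```

When λ = 0 the specification collapses to OLS, which is why a significant λ is read as evidence that spatial dependence remains in the error term of the non-spatial model.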
The influence of the independent variables capturing structural attributes is consistent across the four models, and these variables all present their expected signs. The financial year variable is significant and positive in all four models. This variable captures a booming market and indicates the importance of incorporating broader market trends. The time variable could be made more nuanced by indicating quarterly results. However, time as a dummy variable is less susceptible to issues of multicollinearity (Diewert, 2003). Panel models could also address temporal trends more thoroughly. In the Parramatta circle models, incorporating density reduces the effect of strata title, suggesting the latter captures density as congestion. Zoning was represented by a series of dummy variables, not all of which were significant. However, including zoning and the strata variable was found to improve performance and stability in preliminary models, and removing them disrupted post-estimation tests. This suggests that when these variables are removed the model suffers from omitted variable bias. The trend surface variables are highly significant in all four models. As suggested by the VIF data presented in Table 2, inclusion of these variables increased multicollinearity in the models. However, post-estimation tests of preliminary models show that removing X², Y² and XY resulted in inferior models. Relative accessibility measured as frequency of peak hour trains is a significant and positive determinant of price at the Parramatta circle scale. The peak hour train variable demonstrates that the value of accessibility to commuter rail is determined not just by the distance of a property to a train station but by the level of service (as frequency) that station provides.
Among the urban design variables, connectivity was significant and negatively signed in model 4. Counter to broader expectations around walkability, this suggests greater numbers of intersections may not be a universally desirable characteristic. This result is similar to that found by Duncan (2011) and in one of the samples of Matthews and Turnbull (2007), in which greater connectivity was associated with lower house prices. Higher density in the areas surrounding residences was found to negatively impact price in model 2, whereas higher station area densities in model 4 positively impact property price. The positive impact of density near train stations is congruent with Atkinson-Palombo (2010), who found value uplift increases near stations with higher density walk-and-ride, high-amenity, mixed-use neighbourhoods. Land-use mix was not a statistically significant determinant of sales price in either model 2 or model 4 and was excluded from final estimations. Connectivity at the Parramatta circle scale tested significant but disrupted post-estimation tests and was excluded from the final estimations. As suggested in the literature and shown in model 4, park-and-ride stations led to decreases in property values.
The measure of system accessibility, the direct distance to a train station, is negatively signed in the Parramatta circle models. At this scale property prices increase with proximity to train stations. The station buffer models present a more nuanced relationship between distance to a train station and residential property price. In the station buffer models residences within 400 m of a train station suffer a price reduction, likely due to the disamenity effect of the station. On the other hand, residences 900 m and 1900 m from train stations command price premiums. These results indicate a nonlinear relationship between distance to a train station and property price, as well as a disamenity impact from the train station for residences within 400 m. This difference across scales and the nonlinear results of the station buffers models suggest that value uplift occurs because of transportation infrastructure, but that proximity-based assessment is best investigated at multiple scales with consideration given to nonlinear impacts associated with distance. Notably, the results for both the direct distance and distance band measures of system accessibility are lower than the averages from the literature presented by Debrezion et al. (2007) and lower than the findings of McIntosh et al. (2016) for the Sydney region.
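The contrast between a single distance variable and distance bands can be made concrete with a short sketch. The band edges below are hypothetical and chosen only to echo the 400 m, 900 m and 1900 m distances mentioned above; the exact band definitions used in the station buffer models are not given in this section, and the function name is invented for illustration.

```python
from bisect import bisect_left

# Assumed upper edges of the distance bands, in metres:
# (0, 400], (400, 900], (900, 1900]; anything beyond 1900 m is the
# reference category and receives all zeros.
BAND_EDGES_M = [400, 900, 1900]

def band_dummies(distance_m, edges=BAND_EDGES_M):
    """Return one 0/1 indicator per band for a residence's distance to a station."""
    idx = bisect_left(edges, distance_m)  # index of the first edge >= distance
    return [1 if i == idx else 0 for i in range(len(edges))]

# Hypothetical residences at various distances from their nearest station.
distances = [250, 400, 850, 1500, 1950]
dummies = [band_dummies(d) for d in distances]
```

Because each band receives its own coefficient in the regression, the estimated price effect is free to change sign between bands, which a single linear distance term cannot do.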
Results of all four models include a highly significant spatial autoregressive coefficient (Lambda). The Moran’s I test statistic assessing the clustering and distribution of residuals in the SEM for all four models indicates that inclusion of the spatial autoregressive error term in the model has eliminated spatial autocorrelation. Values of the Wald test, Likelihood Ratio Test, and Lagrange Multiplier test statistics are in the expected order for all models. While all four models present evidence of heteroscedasticity, residual plots suggest this is the result of a number of outliers rather than distinct trends in the pattern of residuals. The lower AIC and Breusch-Pagan test statistics in the station buffers models suggest the 2000 m station buffers to be a more appropriate scale of analysis than the entire study area.
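As an illustration of the Moran's I statistic referred to above, the following sketch computes it for a small set of values under a given spatial weights matrix. The four-location chain and its binary contiguity weights are hypothetical, and the function name is invented for illustration. Formal inference on regression residuals also requires the appropriate expectation and variance, which this sketch omits.

```python
def morans_i(values, weights):
    """Moran's I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2),
    where z are deviations from the mean and S0 is the sum of all weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den

# Hypothetical example: four locations along a line, each a neighbour of the
# adjacent location(s) only (binary contiguity weights).
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [1.0, 1.0, -1.0, -1.0]    # similar residuals sit next to each other
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbouring residuals have opposite signs

i_clustered = morans_i(clustered, W)      # 1/3: positive spatial autocorrelation
i_alternating = morans_i(alternating, W)  # -1: negative spatial autocorrelation
```

Values near zero (more precisely, near the expectation of -1/(n - 1)) are consistent with no remaining spatial pattern in the residuals, which is the outcome reported for the four SEMs.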
Discussion and conclusion
The aims of this paper were to initiate investigation of three research questions addressing variability in hedonic regression focusing on accessibility, urban design, geographic scale and spatial effects. The urban design characteristics considered include street connectivity, density, land-use mix and station type. The empirical investigation is based on a series of HPMs that estimate changes in residential property value near transportation infrastructure in western Sydney at two scales. All four models performed consistently and with the expected signs for the independent variables representing property characteristics.
This paper presents three notions of accessibility: (1) system accessibility or proximity to transit nodes, (2) destination accessibility (which may be differentiated between system-facilitated accessibility and local, vis-à-vis residences, accessibility), and (3) relative accessibility, which may be differentiated between characteristics of transit stations and lines or between transportation modes. This research evaluates the first and third empirically. Destination accessibility is controlled for by the homogeneity in destination opportunities available in the study area. System accessibility measured as direct distance from residences to train stations is significant at the Parramatta circle scale and, using selected empirically determined distance buffers, at the station buffers scale. At the Parramatta circle scale, residential sales prices increase with proximity to stations. The station buffers scale presents a more nuanced view. Results show a decrease in sales prices within 400 m of a station and increases in prices at 900 m and 1900 m from stations. By incorporating multiple distance buffers as independent variables rather than a single distance variable, we are able to pick up nonlinearities in the influence of distance to station. Nonlinearities may also be relevant to other location variables such as distance to school, where, for example, one may want to be near a school but not adjacent. This is an area for further research. Relative accessibility is measured here as a count of trains departing each station for central Sydney during the morning peak period. Peak hour trains are correlated with higher prices in the Parramatta circle models. This suggests level of service should be considered in discussions of HPM. Relative accessibility as differentiation between transport modes, measured as distance from residence to highway on/off ramps, was tested but was not significant in any model.
Together, these three notions augment the suggestion of Higgins and Kanaroglou (2016) for more carefully considered and described notions of accessibility in HPM. A more detailed and systematic investigation of relative accessibility as differentiation between transport modes is an opportunity for further research.
The inclusion of urban design in HPMs highlights the potential variability of property values due to local context. Results indicate both positive and negative price impacts of density and a negative impact of connectivity on price. Density is a significant determinant of price at each scale, although differing signs suggest a disamenity value of density near residences and an amenity value of density near train stations. These findings are congruent with the literature that identifies both positive and negative aspects of density due to amenity and congestion. However, the literature on connectivity and land-use mix often presents these as universally valued aspects of urban design. The negative sign on connectivity may seem surprising. While one would expect connectivity to contribute positively to price, it may be that people prefer to live along quiet streets rather than at busy intersections. Our results suggest that the idea that connectivity is universally valued should not be accepted uncritically.
Park-and-ride stations showed a substantial negative impact on property price in model 4. This finding is congruent with Kahn (2007), who found park-and-ride stations tend to decrease home prices, and Duncan (2011), who observed a negative relationship between station area parking and property prices near park-and-ride stations in neighbourhoods with poor pedestrian quality. We add that the park-and-ride variable may also be an accessibility proxy, as it could indicate that one does not need to live near a transport node in order to take advantage of the transport network. This would lower prices near park-and-ride stations.
This research highlights two aspects of geographic scale. The first is the MAUP, which is relevant to neighbourhood metrics that can be measured at different scales, and these scales are often arbitrary. The result is that measured effects (and variability in measured effects) might be a function of the scale of the modelling rather than a real difference. A second issue with scale is the size of the study area. There will be variability in the value of a train station depending on local and regional context, suggesting differently sized study areas are likely to present different influences for specific independent variables. For both the MAUP and study area size it is critical to operate at scales that best represent the systems and processes at play. For the MAUP this may mean conceptual development as well as experimentation with slightly different versions of specific independent variables. At the study area scale, modelling beyond the area of influence of a transport node is likely to yield muddled results. However, if the research aim is to compare prices near stations (controlling for the differences in the qualities of those stations) with prices farther from a station, this may be appropriate. This difference across scales and the nonlinear results of the station buffers models suggest that value uplift occurs because of transportation infrastructure, but that proximity-based assessment is best investigated at multiple scales with consideration given to nonlinear impacts associated with distance. In this case, the lower AIC and Breusch-Pagan test statistics in the station buffers models suggest the 2000 m station buffers to be a more appropriate scale of analysis than the entire study area.
Preliminary regression diagnostics confirm that spatial effects needed to be controlled for in the modelling process. Developing a price trend surface was a key step in addressing spatial effects in this study. The price trend surface accounts for part of the spatial distribution of the dependent variable. In effect, the price trend surface compensates for key data sets typically incorporated into HPMs that do not exist for greater Sydney (e.g. building area, year built, building quality and materials), as well as for market conditions that have not been fully analysed or understood. The price trend surface improved the regression diagnostics of initial OLS regressions to the point where they indicate the appropriateness of a SEM over OLS or a spatial lag model. A drawback to the use of the trend surface is that it increases multicollinearity. The SEM is particularly valuable for hedonic regression as it reduces the potential to overstate the link between spatially autocorrelated variables. Together, the trend surface and the SEM reduce the extent to which spatial autocorrelation overstates the effect of a train station on house values. Employing the trend surface and SEM in tandem offers as complete as practicable an accounting for spatial effects. This is not standard practice in HPM, and there is considerable opportunity for refinement in future research.
These results affirmatively answer research question (1). Design elements suggested by the literature, including street connectivity, density and type of train station (park-and-ride), are found to be statistically significant determinants of property price in this research. There is some evidence supporting an affirmative answer to research question (2), that urban design characteristics, differing notions of accessibility, and explicit consideration of scale and spatial effects improve regression results. At both scales, comparing models 1 and 2 and comparing models 3 and 4, the inclusion of urban design elements slightly improves modelling outcomes. More dramatic is the improvement in the model from moving from the Parramatta circle scale to the station buffer scale. Comparing the Parramatta circle models (1 and 2) with the station buffers models (3 and 4) indicates that focusing quantitative analysis closer to stations improves AIC and heteroscedasticity test results. Both this improvement and the nonlinear relationship between price and distance to a train station found in the station buffers models corroborate Mulley (2014), who argues that global values linking property characteristics, including distance to a train station, may be misleading and there may be value in more disaggregated analysis that considers variation across space. Recommendations are to communicate the scale of analysis of independent variables as well as to consider and empirically test areas of influence across multiple scales in preliminary models.
The larger research question is (3): can variability in HPM results be reduced by including urban design, accessibility, scale and spatial effects? A recommendation from this research that could lead to greater consistency in HPM outputs is to model empirically determined areas of influence rather than entire study areas. Definitively answering question (3) will require considerably more theoretical and empirical work, including comparison of results across studies. It remains to be seen whether incorporation of factors found to be significant here could reduce inconsistency in econometric modelling of housing prices more broadly. Continued investigation of alternative specifications of design variables is also a recommended area for future research. Specifically, this would involve standardising a suite of urban form variables, including both standardised measurements and expected influences on sales prices.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work has been supported by FrontierSI.
