Abstract
Bidding for proximity to a good school can lead to a pattern of spatial distribution in which households with similar socio-economic status and willingness-to-pay for school quality cluster together. In this paper, we adopt a three-level hierarchical framework using residential house prices in Orange County, California, in 2001 and 2011, to estimate how much homebuyers pay for school quality. Our data show that, during this period, the Academic Performance Index (API) scores of elementary schools in Orange County increased by 16.4% yet converged while the house prices rose by 50.3%. The variation in house prices attributed to school district boundaries was at the same level in both years, but the variation in the API scores shrank. Using a hierarchical random effects model, our estimation results show that, on average, a 10% increase in the API raised the house prices by 1.9% in 2001 and by 3.4% in 2011. Ten years apart, a one standard deviation increase in school quality in the sample increased house prices by a surprisingly similar percentage: 2.7% in 2001, and 2.6% in 2011, respectively. Our findings also reveal that, in both years, there was a significant spatial heterogeneity of school premiums in house prices across school districts. This research provides a spatial understanding of the education capitalisation effects and sheds light on the effectiveness of urban education policy.
Introduction
When homeowners purchase a house or apartment, they also pay for the public amenities in the neighbourhood, such as public school quality, the premiums of which are capitalised in the house prices. To estimate these premiums, hedonic analysis (Rosen, 1974) has been applied, most conventionally by means of an ordinary least squares (OLS) regression model (Leech and Campos, 2003). However, OLS estimation has been challenged because house prices and public amenities are often spatially correlated rather than independent of each other (Black, 1999; Thorsnes and Reifel, 2007). By assuming that all observations are independent, OLS generates a single relationship for all the data, which can lead to misleading interpretations of local relationships (Fotheringham et al., 2002).
Alternative estimation methods of hedonic analysis have been applied in school quality capitalisation studies (Black, 1999; Gibbons and Machin, 2003; Gibbons et al., 2013; Haurin and Brasington, 1996). Although these estimation methods have their own merits, the studies that apply them have not shown the spatial heterogeneity of the school quality capitalisation effect across school districts from a regional perspective. We argue that the regional variation of public education premiums across school districts is important because decisions on government spending on public schools and on education policy are made at the school district level.
In this study, we hypothesise that there is spatial heterogeneity of the school quality capitalisation in house prices across school districts. In other words, homeowners in different school districts pay different levels of premiums in house prices for public education. We will examine the plausible clustering of relatively homogeneous households in terms of their preferences for school quality and willingness-to-pay. Houses within the catchment areas of high-quality schools are often priced higher than comparable houses outside the catchment areas, all else being equal (Black, 1999; Bogart and Cromwell, 1997). This is because homebuyers are aware of school quality when they make their purchase decisions (Weimer and Wolkoff, 2001) and parents choose to live close to good schools to ensure their children’s admittance (Barrow, 2002; Calvo, 2007). In fact, previous studies have confirmed that there are strong boundary effects in school capitalisation; nevertheless, the geographical scope of these studies is relatively narrow, which limits the possibility of examining potential heterogeneity across a region (Bayer et al., 2007; Black, 1999; Bogart and Cromwell, 1997; Kane et al., 2006). To reveal the spatial heterogeneity of the capitalisation effects of school quality in a region, we apply a random effects model in which the coefficient of the test score is allowed to vary across school districts. This allows us to explore whether homeowners in different school jurisdictions pay a different proportion of their house price for public schools.
In addition, we conceptualise a hierarchical structure of houses: houses are clustered by school districts. In each district houses are further clustered by neighbourhoods, which are approximated by census tracts. This hierarchy is formed because households with a similar socio-economic status and willingness-to-pay for neighbourhood characteristics (e.g. racial composition, average income level) bid for their residential location choice (Alonso, 1964). Therefore, households are expected to be more homogeneous within and more heterogeneous across different geographic and administrative boundaries such as school districts and neighbourhoods, thereby forming a hierarchy. Indeed, previous housing studies have used the boundary of relatively homogeneous neighbourhoods to define submarkets and to explain house price differentials (Bourassa et al., 2010).
To accommodate the hierarchy of spatial data, a multilevel model (or a hierarchical model) was previously recommended because of its capability of picking up the unobserved spatial contextual interactions of individual objects from the same spatial units (Fotheringham et al., 2002). For the more efficient use of the information in the data, a three-level multilevel model (as opposed to a two-level model) can be adopted to estimate the price a homebuyer would pay for public school quality: level 1 denotes observations at the most detailed level of the data, represented by the basic units of analysis in this study – houses; level 2 observations represent clusters of housing units – neighbourhoods (approximated by census tracts); and level 3 represents clusters of level 2 units (or groups of clusters) – school districts.
In this paper, we will use a three-level hierarchical random effects model to estimate the premium in house prices for public elementary schools and to reveal their spatial heterogeneity. Our study area is Orange County, a suburban county in the Los Angeles region of California, for both 2001 and 2011. These two years were selected to show the premium in two respective time periods. It is noteworthy that the housing markets in 2001 and 2011 might have valued the same residential unit differently because the markets may have been in different equilibria owing to new land development and/or unpredictable events such as the 2008 global financial crisis. For this reason, the observations from these two years are estimated in separate models.
The findings in this research demonstrate that: first, a considerable variance in house prices stems from differences between neighbourhoods and between school districts; second, a one standard deviation rise in the test score increased the house prices by a surprisingly similar percentage in 2001 and 2011; and third, housing prices and the shadow price of public education are not homogeneous across boundaries but, overall, homeowners from more expensive school districts paid a higher premium for school quality. The substantial variations in the test score and its premium in house prices across the district boundaries reflect the regional gap in public education and shed light on the effectiveness of recent education policy.
Review of hedonic price estimation
The traditional approach to hedonic price estimation – OLS – has been criticised because of the plausible autocorrelation of the error terms in housing data (Can and Megbolugbe, 1997; Fingleton, 2006; Orford, 2000). The OLS assumption that the error terms are independent is often violated in studies using spatial data, which are spatially dependent by nature (Fotheringham et al., 2002; Páez et al., 2008). Alternatives to the OLS estimation have been developed in capitalisation studies, including generalised least squares (GLS) (Haurin and Brasington, 1996), boundary discontinuity design (Bayer et al., 2007; Black, 1999; Gibbons et al., 2013), non- and semi-parametric/locally weighted regression (Gibbons and Machin, 2003; Redfearn, 2009), geospatial models (Bourassa et al., 2010), and multilevel estimation (Goodman and Thibodeau, 1998, 2003).
Empirical studies of different areas using different methods may yield effects of varying magnitude. For example, Haurin and Brasington (1996) applied GLS to test for the capitalisation effect of two local public services in Ohio in 1991. The results showed that a 1% increase in the pass rate of 9th graders increased house prices by US$380–US$400, equivalent to 0.5% of the mean house price. Black (1999) designed a boundary discontinuity regression to evaluate the parental valuations of elementary school quality in three suburban counties in Massachusetts from 1993 to 1995. Black’s results showed that parents were willing to pay 2.5% more for a house for a 5% increase in test scores. Gibbons et al. (2013) generalised Black’s approach by weighting the observations by the inverse distance between the paired sales in the UK from 2000 to 2006. As the distance moves towards zero, the variance of school quality of the pair is expected to move towards a constant. They found that house prices rose by approximately 3% for a one standard deviation increase in the value-added test scores. Gibbons and Machin (2003) adopted a non-parametric estimation to evaluate the effects of primary school performance on property prices in less well-defined school catchment areas in London, UK, from 1996 to 1999. They found that a one percentage point increase in school performance pushed up property prices by 0.67%. Comprehensive reviews of recent advances in school quality capitalisation in house prices can be found in works by Gibbons and Machin (2008) and Nguyen-Hoang and Yinger (2011).
The aforementioned alternative estimation approaches, together with the traditional OLS estimation approach, can be differentiated from each other by one fundamental – the assumption of the data structure. The OLS approach assumes that observations are independent from each other and that the error terms are homoskedastic. GLS relaxes such an assumption by allowing a certain degree of correlation between the observations, but the model does not have a pre-determined structure of the correlation. The boundary discontinuity approach assumes that the observations are discontinuous across certain boundaries. Thus, the difference-in-difference approach can be applied when a buffered zone from both sides of the boundary is used to control for variations in the sample. The non-parametric and geospatial models essentially concur with the first law of geography by assuming that there is spatial dependence in the data: an observation is influenced by its neighbouring observations; hence, a weighting scheme is applied to account for the influence of distance or contiguity.
The multilevel approach not only acknowledges that there is spatial dependence between the observations, but it also assumes that the correlation structure is determined by the context as bound by the a priori boundaries. The a priori definition of discrete boundaries at each aggregation level is usually a pre-defined boundary derived from social science research, such as an administrative, political or statistical boundary. The imposition of the a priori boundaries implies that people’s behaviours are different across the boundary and, therefore, that people’s attitudes and preferences towards residential location choice are also different (Fotheringham et al., 2002).
In fact, multilevel models are often applicable in urban studies because individual objects often cluster at different geographic aggregation levels and these models can account for both individualism and holism (Courgeau, 2003). The multilevel approach is able to separate the individual effects from the place/contextual effects by allowing the individual spatial objects to have within-group interactions (Duncan et al., 1998; Goldstein, 1987). At each level, homogeneous errors are assumed.
Methodology
Hierarchical random effects model
Hierarchical models originate from linear mixed models (LMM), in which both fixed and random effects associated with one or more random factors can be included (West et al., 2007). They differ from conventional hedonic analysis estimation – the OLS regression model – in that there are no random effects in a regular linear regression model (equation 1). In other words, the intercept and the slope are fixed for all observations, regardless of the underlying context.
where i is level 1 (e.g. house), j is level 2 (e.g. census tract: representing neighbourhood level in this paper), and k is level 3 (e.g. school district).
In the following, we will first specify a model with a random intercept while assuming that the coefficients are fixed (equation 2). Later, in view of our hypothesis that homebuyers in different school districts have different levels of willingness-to-pay for public education, we will allow the test score coefficient to vary across the school districts (equation 3).
Random intercept model
A three-level random intercept model can be specified, as follows:
where
The intraclass correlation coefficients (ICCs) (denoted as
Random intercept random slope model
In the random intercept model (equation 2), only the intercepts are allowed to vary across the district and tract boundaries, but not the coefficients of the independent variables. Nevertheless, homebuyers in different districts (or submarkets) may exhibit different levels of willingness-to-pay for school quality. Those who have a similar demand are likely to cluster in the same district. Such variation across districts can be accommodated by allowing the slope of the Academic Performance Index (API) variable to vary at the district level. Thus, the coefficient of the test score is decomposed into fixed and random parts (equation 3).
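As a sketch only, a generic three-level specification consistent with the description above can be written as follows. The symbols are assumptions that follow standard multilevel notation and may differ from those used in equations 2 and 3: $P_{ijk}$ is the price of house $i$ in tract $j$ and district $k$; $X$, $N$ and $D$ are the house-, neighbourhood- and district-level covariates; $v$, $u$ and $e$ are the district, tract and house error terms. The variance shares shown are one common convention for the ICC and are normally computed from a null model without covariates.

```latex
% Random intercept model (in the spirit of equation 2):
% the intercept varies by tract j and by district k
\ln P_{ijk} = \beta_0 + \beta_1 \ln \mathrm{API}_{ijk}
  + \boldsymbol{\beta}_X^{\top} X_{ijk}
  + \boldsymbol{\beta}_N^{\top} N_{jk}
  + \boldsymbol{\beta}_D^{\top} D_{k}
  + v_{0k} + u_{0jk} + e_{ijk},
\qquad
v_{0k} \sim N(0, \sigma^2_{v0}),\;
u_{0jk} \sim N(0, \sigma^2_{u0}),\;
e_{ijk} \sim N(0, \sigma^2_{e})

% Share of the total variance at each level (one common ICC convention)
\rho_{\mathrm{district}} = \frac{\sigma^2_{v0}}{\sigma^2_{v0} + \sigma^2_{u0} + \sigma^2_{e}},
\qquad
\rho_{\mathrm{tract}} = \frac{\sigma^2_{u0}}{\sigma^2_{v0} + \sigma^2_{u0} + \sigma^2_{e}}

% Random intercept random slope model (in the spirit of equation 3):
% the API coefficient is split into a fixed part \beta_1 and a district-specific part v_{1k}
\ln P_{ijk} = \beta_0 + (\beta_1 + v_{1k}) \ln \mathrm{API}_{ijk}
  + \boldsymbol{\beta}_X^{\top} X_{ijk}
  + \boldsymbol{\beta}_N^{\top} N_{jk}
  + \boldsymbol{\beta}_D^{\top} D_{k}
  + v_{0k} + u_{0jk} + e_{ijk},
\qquad
v_{1k} \sim N(0, \sigma^2_{v1})
```

In this form, the total API coefficient of district $k$ is $\beta_1 + v_{1k}$, so a district with a negative $v_{1k}$ pays less than the regional average for school quality.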
Measures of school quality
Parents value the various characteristics of a school and view school quality in different ways. Indicators of school quality may include test scores, reputation, per-pupil expenditure, peer group effect, and so forth (Brasington, 1999; Downes and Zabel, 2002; Rothstein, 2006). Among these indicators, the most widely used and consistent indicator is test scores. Arguably, test scores may reflect the socio-economic status of the students rather than their learning experience; however, this indicator is still a good measure of quality because it signals academic achievement. Moreover, it is publicly available and is often the only tangible evidence of quality that many parents have (Calvo, 2007).
In this paper, school quality will be approximated by a proficiency score – the API, which first became available on 1 July 1999. This score measures a school’s academic performance and growth based on the test scores of those students in Grades 2 to 11 who participate in the Standardized Testing and Reporting (STAR) Program and the California High School Exit Examination. The API scores are calculated by the California Department of Education (CDE) and disseminated directly to schools and districts. The API scores are posted on the CDE’s and most schools’ websites, thereby making the API one of the most commonly used measures of academic performance.
Two variables are derived from the test scores. One is the district-weighted API score, obtained by weighting each school’s API score by its enrolment, and the other is the API score of the nearest elementary school within the district. While the district-weighted API score captures school premiums in the district, the API score of the nearest elementary school captures more disaggregated school information and reflects the school premiums in land values more precisely. In other words, the estimate of the former variable can be interpreted as the capitalised price for an entry ticket to a school district, whereas the latter variable can be interpreted as the price of the quality of the neighbourhood school.
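The enrolment weighting described above can be sketched in a few lines of Python. The function name and the example figures are hypothetical and serve only to illustrate the calculation; they are not drawn from the study data.

```python
def district_weighted_api(schools):
    """Return the enrolment-weighted mean API score of one district.

    schools: list of (api_score, enrolment) pairs, one pair per school.
    """
    total_enrolment = sum(enrolment for _, enrolment in schools)
    if total_enrolment <= 0:
        raise ValueError("district has no enrolled pupils")
    # Each school's score counts in proportion to its share of district enrolment.
    return sum(api * enrolment for api, enrolment in schools) / total_enrolment


# Hypothetical district with three schools (illustrative numbers only).
example_district = [(800, 500), (700, 300), (900, 200)]
weighted_api = district_weighted_api(example_district)
```

Larger schools pull the district score towards their own API, which is why the weighted score can differ from the simple average of school scores.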
Variables
The key variable – the quality of the nearest school to each house – is considered as a house level (level 1) variable. It is assumed that, at the time of purchase of their home, parents expect their children to attend the neighbourhood school (i.e. the nearest school within the school district) because of guaranteed admission and minimal commuting to school. Therefore, the quality of the nearest school is likely to be capitalised in house prices. To account for the distance decay effect in respect of the location of the house and school, the distance to the nearest school in a residential district will be entered in the model as inverse distance. Levels 2 and 3 refer to the neighbourhood (N) and district levels (D), respectively. The analysis unit of the neighbourhood level is proxied by the census tracts. At the neighbourhood level, several variables from the US Census are included, such as population density, ethnicity and median household income. At the school district level, information such as weighted API scores, expenditure per pupil and the number of pupils per teacher is added (Bayer et al., 2007; Brasington, 1999; Hilber and Mayer, 2009; Ihlanfeldt, 2007; Kane et al., 2006). It should be noted that the multilevel approach is not immune to the problem of endogeneity. The estimation of capitalisation effects, as in other papers, may suffer from omitted variable bias (Nguyen-Hoang and Yinger, 2011).
Data in this study are drawn from three sources. House prices, location and unit structure characteristics are obtained from DataQuick, a nationwide supplier of real estate information and analytics. The attributes of neighbourhoods, through the proxy of census tracts, are drawn from the US Census. The individual school and school district information of 2001 and 2011 was retrieved from the CDE.
Study area
The study area is Orange County, a suburban county located in the southwestern part of the Los Angeles region. Its population grew from approximately 4,933,000 in 2001 to 5,163,000 in 2011. The county consists of 12 unified school districts, 12 elementary school districts, and three high school districts. Since the objective of this research is to examine the capitalisation of public elementary schools in house prices, only the boundaries of unified and elementary school districts, totalling 24, will be used (Figure 1).

Figure 1. Twenty-four elementary and unified school districts in Orange County.
There were 370 elementary schools in 24 elementary and unified districts in 2001 and 391 in 2011. While each district has a reputation for its education quality, within most districts there is still a noticeable variation in school performance (Figure 2). The spatial distribution of school quality in Orange County is not homogeneous. On the one hand, there is a marked variation in school quality in some districts. On the other hand, over the years, the discrepancy in the API across the region has shrunk, as indicated by a more compressed spectrum and a reduced standard deviation of the county.

Figure 2. Academic Performance Index (API) of elementary schools in Orange County.
The final data set contains 29,135 observations from 2001 and 18,622 from 2011, consisting of single-family residences and condominiums. An ANOVA test was performed to examine how much school district boundaries contribute to the variances in house prices and test scores. Results from the ANOVA test confirm that the means of both house prices and API scores differ significantly across the school district boundaries. This boundary effect accounts for 16% and 46% of the variation in the 2001 house prices and API scores, respectively. Interestingly, in the 2011 data the boundary effect still accounts for 16% of the variation in house prices but only 33% of the variation in API scores, which signals that the boundary effect on the latter measure is shrinking.
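One way to obtain such a share from a one-way ANOVA is the ratio of the between-group sum of squares to the total sum of squares. The sketch below assumes this definition, which may not be exactly the one used for the figures above; the function name and the toy data are hypothetical.

```python
from statistics import mean


def between_group_share(groups):
    """Return SS_between / SS_total for values grouped by district.

    groups: dict mapping a district label to a list of numeric values.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0:
        raise ValueError("no variation in the data")
    # Between-group part: squared distance of each district mean from the
    # grand mean, weighted by the number of observations in the district.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total


# Toy data for two hypothetical districts (not from the study sample).
toy_prices = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}
share = between_group_share(toy_prices)
```

A share near 1 means that district membership explains most of the variation, whereas a share near 0 means that most of the variation lies within districts.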
Although the numbers of housing observations from these two years are quite different, which reflects the impact of the financial tsunami of 2008–2009 and a lower level of transactions and consumer confidence, the respective district’s share in the final sample remains relatively consistent (Table 1). Our data show that, between 2001 and 2011, the API scores of the elementary schools in Orange County increased by 16.4% whereas the house prices increased by 50.3%. In general, the districts with the most expensive houses were more likely to be associated with higher API scores, whereas the least expensive districts were mostly associated with lower API scores. Nevertheless, the relationship between average house prices and school quality at the district level is not necessarily linear. Some inexpensive districts, such as Brea-Olinda Unified, Cypress Elementary, Fountain Valley Elementary, and Saddleback Valley Unified, are attached to quality schools. These districts all had weighted API scores of over 800 in 2001, yet the average house price was approximately only US$300,000. Considering that these districts did not have smaller housing units (i.e., square footage) compared with other districts, the house prices per unit in these districts were also relatively low.
Table 1. Statistical summary of house prices and schools in 24 school districts in Orange County (descending in average house price of 2001).
Notes: a The weighted API score of a district is obtained by weighting each school’s API score by its enrolment, i.e. it is the enrolment-weighted mean of the API scores of the schools in the district.
b This column shows the standard deviation of school API scores by district.
The statistics of dependent and independent variables are summarised in Table 2. Comparing transactions from 2001 and 2011, the house attributes are quite similar and comparable. Although it can be argued that the housing market in 2011 had not recovered from the 2008 global financial crisis and that some submarkets recovered faster than others, the average home transactions of our sample in both years had similar structural characteristics. To test for spatial autocorrelation in house prices and the API scores, Moran’s I (Getis and Ord, 1992) was calculated using a 10% random sample. Our results reject the null hypothesis of zero spatial autocorrelation in these two variables in both 2001 and 2011. The test result is not surprising because the OLS assumption of independent error terms is often violated in studies using spatial data, which are spatially dependent by nature (Páez et al., 2008). For hierarchically structured spatial data at different geographic aggregation levels, a multilevel model is recommended because it allows the error terms to be correlated within the boundaries (Courgeau, 2003).
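For readers unfamiliar with the statistic, a minimal sketch of global Moran’s I follows. The spatial weights used in the study are not stated, so the binary adjacency weights for points on a line are purely an assumption for illustration, as are the function names and toy values.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    x_bar = sum(values) / n
    dev = [v - x_bar for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    # Cross-products of deviations for every pair of locations, weighted by w_ij.
    num = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def chain_weights(n):
    """Binary weights for n points on a line: only adjacent points are neighbours."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


# Similar values sit next to each other in the first series and alternate in the second.
clustered = morans_i([1, 1, 1, 5, 5, 5], chain_weights(6))
alternating = morans_i([1, 5, 1, 5, 1, 5], chain_weights(6))
```

The clustered series yields a positive statistic and the alternating series a negative one, which is the contrast the test exploits: house prices and API scores that resemble those of their neighbours produce a positive Moran’s I.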
Table 2. Statistical summary of variables.
Notes: a These variables will be transformed by natural log in regression models. b This variable will be inverted and then transformed by natural log in regression models. c A quarter bath is usually found in older homes and usually contains only a toilet or shower stall.
Results
A total of five models are estimated: besides the multilevel models – HML 1 (random intercept model) and HML 2 (random intercept and random slope model) – we also estimate three alternative models: OLS; EIV (errors-in-variables); and IV (instrumental variables). Results of the estimations are summarised in Table 3. Since the dependent variable is transformed by the natural log, the estimates of independent variables that are also in natural log form can be interpreted as elasticities.
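The elasticity reading can be made concrete with a short sketch. The coefficient of 0.19 is a hypothetical value, chosen only because it matches a 1.9% price response to a 10% API increase under the usual linear approximation; it is not taken from Table 3.

```python
def pct_change_linear(elasticity, pct_change_in_x):
    """Approximate % change in price: elasticity times the % change in x."""
    return elasticity * pct_change_in_x


def pct_change_exact(elasticity, pct_change_in_x):
    """Exact % change in price implied by a log-log coefficient."""
    return ((1 + pct_change_in_x / 100) ** elasticity - 1) * 100


HYPOTHETICAL_ELASTICITY = 0.19  # assumption for illustration only

linear_effect = pct_change_linear(HYPOTHETICAL_ELASTICITY, 10)
exact_effect = pct_change_exact(HYPOTHETICAL_ELASTICITY, 10)
```

For small changes the two readings nearly coincide; for a 10% change the exact effect is slightly below the linear approximation.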
Table 3. Summary of estimation results.
Notes: a variables in natural log; b variable (inverse) in natural log; standard errors in parentheses.
*** significance at the 1% level, ** significance at the 5% level and * significance at the 10% level.
OLS serves as a baseline, whereas the EIV and IV models serve as comparisons that acknowledge the possibility of endogeneity arising from simultaneity (owing to unobservables that determine housing values) and/or from measurement error in the school quality variables. In the EIV model, we conduct a sensitivity analysis for potential measurement error in the measure of school quality – the API score variable. The potential measurement error stems from the fact that the ideal variable for school quality is unobserved because it is too comprehensive to measure, so our proxy may mismeasure school quality even though the score itself may be accurate. When the reliability of the API score variable is assumed to be 95%, the coefficient of this variable is larger than that from the OLS regression. In fact, the lower the assumed reliability of the school quality measure, the more the estimate of the capitalisation effect is inflated.
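The direction of this adjustment can be illustrated with the textbook single-regressor case of classical measurement error, in which the OLS slope is attenuated by the reliability ratio. This is a simplification and not the estimator used here: with several regressors the correction involves the full covariance matrix. The function name and the coefficient value are hypothetical.

```python
def disattenuated_slope(beta_ols, reliability):
    """Correct a single-regressor OLS slope for classical measurement error.

    reliability: assumed share of the observed regressor's variance that is
    true signal, in the interval (0, 1].
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return beta_ols / reliability


HYPOTHETICAL_BETA_OLS = 0.2  # assumption for illustration only

corrected_95 = disattenuated_slope(HYPOTHETICAL_BETA_OLS, 0.95)
corrected_80 = disattenuated_slope(HYPOTHETICAL_BETA_OLS, 0.80)
```

In this simple case, a lower assumed reliability always yields a larger corrected coefficient, which mirrors the pattern reported for the EIV estimates.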
IV estimation is the standard way to address the issue of endogeneity. However, valid instruments that satisfy all criteria are often difficult to find. Hence, we apply Lewbel’s (2012) method, which identifies the structural parameters from heteroskedasticity restrictions on the current set of regressors instead of relying on external instruments. To implement this method, we run a Stata command developed by Baum and Schaffer (2012). The coefficient of the API score from the IV estimation is smaller in magnitude than that from the OLS regression, especially for the models which use the 2011 data. This is not unexpected because school quality is usually confounded with various unobserved neighbourhood qualities; therefore, its estimate is prone to bias if some confounding neighbourhood variables are missing. Moreover, school quality is a normal good, the consumption of which increases with neighbourhood income level and local public amenities. If this type of confounding information is not included in the model, the estimate of school quality tends to be upward-biased.
HML estimation is another alternative to OLS in hedonic analysis. From a three-level null model (results not reported herein), the ICC indicator calculated through the parameters
Estimates from the random intercept model (HML 1) suggest that a 10% increase in the API raised the house prices by 2.1% in 2001 and by 4.5% in 2011. Lastly, the random intercept and random slope model (HML 2) allows the API coefficient to vary across school districts, reflecting the homebuyers’ similar level of willingness-to-pay for school quality within each district. It shows that a 10% increase in the API score of the nearest elementary school pushed up the house prices by approximately 1.9% in 2001 and by approximately 3.5% in 2011. Most of the other estimates from HML 2 appear consistent with the outcomes from HML 1. The largest difference lies in the estimate of the API test score for the 2011 data, for which the estimate from HML 2 is considerably smaller than that from HML 1. This implies that the random part of the 2011 school quality premium may be relatively large in some school districts, which skews the lump-sum regional mean (i.e. the coefficient from HML 1) and makes it greater than the decomposed fixed part of the regional mean (i.e. the coefficient from HML 2). This postulation can be confirmed by the random effects of the API test score for each school district estimated by using the final model – HML 2 (Table 4). To a certain extent, the random coefficient appears to be correlated with the average house price. In general, homebuyers who can afford to live in the best-performing districts tend to pay a higher proportion of the house price for school quality, and households from the lowest-performing districts tend to pay less than the average (Figure 3(a) and (c)). When combined with the fixed part of the test score coefficient, the total API coefficients reveal the premium for the test score for each district. When the grand total of house prices is taken into account, the absolute values of school quality premiums span a wide spectrum (Figure 3(b) and (d)).
Table 4. Premiums of test scores using three-level random intercept random slope model.

Figure 3. House prices and capitalisation effect.
To better demonstrate the relationship between house prices and school quality premiums, linear and quadratic lines are added to Figure 3(a–d). For the relationship between the random API coefficient and the housing price of the district (Figure 3(a) and (c)), the linear fitted line of the 2001 data has a larger slope, thereby demonstrating a higher degree of positive correlation. The 2011 results are less linear and predictable probably because the housing market may not be in its usual equilibrium; it may still be recovering from the credit meltdown in 2008–2009. However, when it comes to the relationship between the absolute premiums of the API score and the house prices (Figure 3(b) and (d)), the data for both 2001 and 2011 show a largely linear pattern. For the 2001 data, the two most expensive school districts, namely, Laguna Beach Unified (US$3704) and Newport-Mesa Unified (US$2962), are associated with the largest school premiums in absolute terms, whereas some of the least expensive districts, namely, Anaheim (US$66), Santa Ana Unified (US$127) and Garden Grove Unified (US$89), have the lowest school premiums. On average, homebuyers in Orange County paid a school premium of US$672 for a 1% increase in the 2001 API score. For the 2011 results, we can still see that, besides the top districts (e.g. Laguna Beach Unified and Newport-Mesa Unified), districts in the upper-middle section (e.g. Capistrano Unified, Tustin Unified, Irvine Unified and Brea-Olinda Unified) are associated with above-average school premiums, which is similar to the 2001 pattern. In contrast, the bottom districts are associated with below-average school premiums. On average, homebuyers paid US$1963 for a 1% increase in the 2011 API score. Moreover, the random effects model may result in a negative school premium in a small number of districts. 
This is not totally unexpected because quite a number of random API coefficients are negative, and this can lead to a negative premium if its absolute value is greater than the fixed part of the API coefficient. One possible explanation is that not all homebuyers value test scores and that the housing and/or neighbourhood attributes they value more may be negatively correlated with the API scores.
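The arithmetic behind a district’s absolute premium, including the negative case, can be sketched as follows. The sketch assumes that the premium for a 1% rise in the API is the district’s total coefficient (fixed plus random part) multiplied by 1% of its average house price; the exact calculation behind the reported dollar figures is not spelled out, and all numbers below are hypothetical.

```python
def district_premium(fixed_coef, random_coef, avg_price, pct_increase=1.0):
    """Dollar premium for a given % rise in the API in one district.

    fixed_coef:  regional (fixed) part of the API elasticity
    random_coef: district-specific (random) deviation from the fixed part
    avg_price:   average house price in the district, in US$
    """
    total_coef = fixed_coef + random_coef
    # Linear approximation: the elasticity times the % change gives the % price change.
    return total_coef * (pct_increase / 100) * avg_price


# Hypothetical districts (illustrative values, not estimates from Table 4).
expensive_district = district_premium(0.19, 0.10, 500_000)
negative_case = district_premium(0.19, -0.25, 200_000)
```

The second case is negative because the random part is larger in absolute value than the fixed part, which is the situation described above.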
For factors other than the API score, most of our level 1 variables are statistically significant at the 5% level and their signs are consistent across all five models. The only exception is the distance to nearest school variable, which is unstable across models and years, thereby suggesting that the distance decay effect may not be a good explanatory factor for housing prices. For the level 2 variables, only the median income is statistically significant across all models in both years. For the 2001 data, the other two important level 2 variables are population density and percentage of African-Americans, with expected signs; for the 2011 data, however, they are insignificant in the HML models, and the percentage of African-Americans even has an unexpected sign in the three comparison models. Among the level 3 variables, none is statistically significant at the 5% level in the HML models. Although expenditure and the test score are sometimes referred to as the input and the output factors of school quality, respectively (Brasington, 1999), expenditure turns out to be statistically insignificant (at the 5% level) in the three-level models. The results should be interpreted in the context of our model design. We set up the hierarchical framework assuming that housing prices are influenced by attributes at all three levels and that they are correlated within school districts and neighbourhoods. Under this framework, our results suggest that the discrepancy in housing prices in Orange County is explained primarily by the housing and neighbourhood information.
Comparing the 2001 and 2011 results across the five models that we estimated, the hierarchical models yield consistent coefficient estimates with the expected signs that are statistically significant at the 5% level, although some of these coefficients are significant in only one of the two years. In contrast, the three comparison models (i.e. OLS, EIV and IV) produce opposite signs for some estimates. In this study, we prefer the hierarchical estimation approach over the other tested models primarily because it can accommodate the hierarchical organisation, and hence the spatial autocorrelation, of the spatial data used in education and housing research (Goldstein, 1987; Orford, 2000). Between the two HML models, we prefer the random intercept random slope model (HML2) because it allows the homebuyers’ willingness-to-pay for school quality to vary across school districts, which is plausibly a more realistic description of economic behaviour than fixing the land premium across the region. Indeed, our results confirm our hypothesis that the land premium for public education varies substantially from district to district. This variation in turn reveals the high correlation between the premium and the average house price at the district level, which provides a useful piece of evidence for explaining the widening gap in public education resources between high-performing and low-performing schools.
Conclusions
In this study, the hierarchical estimation framework is adopted to account for the clustered and autocorrelated nature of the spatial data. Our results show that homebuyers value access to a good school. Over the period from 2001 to 2011, house prices in Orange County increased by 50.3% and the API scores of elementary schools increased by 16.4%. As both the house prices and the API scores have noticeably increased, the school premium in house prices has also significantly increased during this period. The ANOVA shows that the boundary effect accounts for the same share, 16%, of the variation in house prices in both years, whereas its share of the variation in API scores fell from 46% to 33%.
In our three-level random intercept random slope model (HML2), a 10% increase in the API raised house prices by 1.9% in 2001 and by 3.5% in 2011. To put this finding into context, a one standard deviation increase in school quality in the sample (+14.4% for 2001 and +7.38% for 2011) increased house prices by a surprisingly similar percentage: 2.7% in 2001 and 2.6% in 2011, respectively, which is approximately mid-range in the more recent literature on the capitalisation of school quality that uses the boundary discontinuity approach or semi-parametric methods.
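The arithmetic behind the one standard deviation figures can be sketched as follows, assuming that the reported effects scale linearly with the size of the API change (an approximation). The symbols β, for the API coefficient implied by the 10% effect, and %ΔP, for the percentage change in house prices, are introduced here for illustration only.

```latex
\begin{align*}
\%\Delta P &\approx \beta \cdot \%\Delta\mathrm{API}, \\
\text{2001:}\quad \%\Delta P &\approx 0.19 \times 14.4\% \approx 2.7\%, \\
\text{2011:}\quad \%\Delta P &\approx 0.35 \times 7.38\% \approx 2.6\%.
\end{align*}
```

Under this reading, the larger 2011 coefficient is offset by the smaller 2011 standard deviation of the API scores, which is why the two effects are so close.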
Moreover, with data drawn from two years a decade apart, the random part of the API score coefficient still shows a striking heterogeneity across school districts; meanwhile, we continue to observe a linear pattern between school premiums and house prices (Figure 3(a–d)). Homeowners pay a higher premium for public education in the best-performing school districts. In this respect, we are not surprised to see that the schools with good performance are often situated in neighbourhoods and school districts of high socio-economic status, which points to social and contextual influences on test scores (Ball et al., 1996; Catsambis and Beveridge, 2001). In 2011, the house prices in the five best-performing school districts in Orange County ranged from US$477,854 to US$1,109,350 whilst the prices in the five lowest-performing districts ranged from US$267,042 to US$313,830. Such a strong linkage between geography and social class suggests that obstacles to closing school performance gaps, and hence to achieving a higher level of education equity, will persist unless efforts are made towards neighbourhood redevelopment, and for as long as school choice remains limited to the residential school district.
The findings of this study have two main implications for urban policy. First, the absolute values of school premiums span a remarkably wide spectrum across geographical space and their relationship with house prices is largely linear, as shown in Figure 3(a–d). House price capitalisation encourages homeowners to invest in durable public goods (Hilber and Mayer, 2009). The higher the premiums, the stronger the incentive homeowners have to improve school quality in their own areas. This difference in incentives may lead to a widening gap in school quality because homeowners in low-income neighbourhoods have much less incentive to capitalise school premiums in house prices, and this may not be beneficial for community development in the long run. Hence, our findings justify government intervention to improve public education quality in low-performing districts. Such public policy may be more effective if it takes a community development approach and fosters school–community collaboration rather than focusing narrowly on education reform within schools (Warren, 2005). In other words, education improvement in low-income neighbourhoods would be better linked to urban renewal and community reform.
Second, given the high public education premiums paid by homebuyers, we may expect the boundary effects to persist. These boundary effects reflect the common interests and preferences of rent bidders, which in turn reinforce the autonomy of each school district. Urban education policies, such as the interdistrict transfer programme that aims to improve social mobility and equality by offering more school choices (Gewirtz et al., 1995), may encounter resistance from existing landowners if such policies threaten to weaken the boundary effects and depreciate the school premiums that the homebuyers have paid. In this sense, our research helps to explain the obstacles to the implementation of the interdistrict transfer programme and why its scope has been relatively limited since its commencement and is likely to remain so in the near future.
Acknowledgements
The author would like to thank Genevieve Giuliano, Christian Redfearn, Geert Ridder and the anonymous referees for their constructive advice and comments on earlier versions of this manuscript.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
