Abstract
The aim of this paper is to identify convergence clubs in house prices among Spanish regions over the period 1995:Q1 to 2007:Q4 and to investigate which factors are responsible for club formation. Employing the regression-based convergence test proposed by Phillips and Sul (2007), we find that regional house prices do not converge to a common trend, which confirms the existence of some degree of segmentation in the Spanish housing market. The results support the existence of convergence clubs, indicating that Spanish regions form four separate groups that converge to different house price levels. Results of an ordered logit model suggest that differences in population growth, size of the rental market, initial housing supply and geographical situation have played a crucial role in determining club membership.
Introduction
From the mid-1990s to 2007 the Spanish housing market was characterised by an extraordinary boom, which multiplied house prices by a factor of more than three. Between 1995 and 2007 house prices rose by 9.7% per year on average, with annual growth peaking at over 17% in 2003 and 2004. At the regional level, however, the impact of the housing boom was quite heterogeneous and differences in house prices widened considerably. While in some provinces house prices more than quadrupled, with growth rates exceeding 20% in several years, in other provinces they increased far more moderately.
The housing bubble played a crucial role in the impact on the Spanish economy of the international financial crisis that began in 2008. The causes of the boom have therefore been a focus of research and policy debate in recent years. Several factors have been associated with this phenomenon (García-Montalvo, 2009, 2010): low interest rates, which fostered borrowing; the relaxation of mortgage lending conditions; high employment rates; rising income per capita; and demographic growth, owing largely to immigration and to the entry of the baby boom generation into the market. Although a myriad of studies have analysed the role of these and other factors in house price dynamics during the boom at the national level, little is known about the causes of the evolution of regional house prices and of the increased disparities between provinces.
Against this background, the aim of this paper is twofold. First, we examine the evidence for convergence clusters across Spanish provinces on the basis of house price trends over the period 1995:Q1–2007:Q4. To assess convergence in house prices among Spanish provinces, we employ the methodology introduced by Phillips and Sul (2007), which identifies the number and membership of convergence clubs using a regression-based convergence test. In contrast to other approaches, in which regions are grouped a priori and the cluster outcomes are therefore to some extent predetermined by the arbitrarily selected club formation variables and their threshold levels, the methodology of Phillips and Sul (2007) determines convergence clubs endogenously. Moreover, following the method proposed by Phillips and Sul (2009), we explore whether there is evidence of club merging and of transitioning between groups of provinces belonging to different clubs. Second, the potential formation of clubs suggests that there might be common factors among groups of provinces leading them to converge to a similar house price level. Hence, apart from studying whether some Spanish provinces share common convergence patterns in house prices, this research aims to provide some insight into the factors behind the formation of clubs across Spanish regions. To improve our understanding of why house prices increased faster in some provinces than in others, we estimate several ordered logit models and analyse which factors played an important role in determining club membership. Methodologically, our paper is closely related to the work of Kim and Rous (2012), who analyse house price convergence in panels of states and metropolitan areas in the US using the convergence test of Phillips and Sul (2007).
In order to describe the resulting cluster patterns, the authors employ a multinomial logit model and conclude that housing supply regulation together with climate are the main factors determining convergence clubs.
Literature review on regional house price convergence
Nowadays, there is an established and growing empirical literature on convergence in regional house prices. However, using a battery of econometric approaches to detect comovements between regional house prices, the existing literature has concentrated mainly on examining convergence in UK regional house prices and, more recently, in US regional house prices, whilst for other developed countries the literature is rather scarce.
Regarding the UK, existing studies have failed to reach a consensus on whether or not regional house prices converge, offering mixed evidence. On the one hand, studies by Giussani and Hadjimatheou (1991), MacDonald and Taylor (1993), Alexander and Barrow (1994), Cook (2003, 2005), Cook and Thomas (2003) and Holmes and Grimes (2008) find support for convergence in regional house prices and for the so-called ripple effect, which describes a tendency for shocks to UK regional house prices to originate in the London region and then diffuse spatially outward from the southeast. For example, MacDonald and Taylor (1993) employ standard cointegration analysis and find long-run interregional stability in UK house prices, but only weak evidence in support of the ripple effect. Applying asymmetric unit root tests to study the stationarity of the UK regional to national house price ratios, Cook (2003) detects the presence of asymmetric adjustment between regions and suggests that reversion to equilibrium occurs faster (more slowly) during periods in which house prices in the southeast decrease (increase) relative to other regions. The author arrives at a similar conclusion in Cook (2005), employing a threshold autoregressive cointegration model. On the other hand, studies by Rosenthal (1986), Drake (1995), Ashworth and Parker (1997) and Abbott and De Vita (2013) find little evidence of the robustness of the ripple effect and of convergence in UK regional house prices. Rosenthal (1986) uses cross-spectral analysis of regional house price transactions to cast doubt on the convergence hypothesis. Drake (1995) finds regional differences in UK house price dynamics employing time-varying parameter estimation.
Following a pairwise approach (Pesaran, 2007) to study the stationarity of all possible pairs of house price differentials across UK regions, Abbott and De Vita (2013) conclude that there is no evidence of long-run convergence among regional house prices or of an equilibrium relationship towards which UK regional house prices have a tendency to gravitate. Nevertheless, Cook (2012) shows that convergence is only observed over the housing market cycle, with overwhelming evidence of convergence detected particularly during the downturn.
With respect to the USA, a large number of studies have been undertaken, mainly from the late 1990s onwards. Pollakowski and Ray (1997) investigate the relationship among house price changes in nine US census divisions and metropolitan areas and find evidence of spatial house price diffusion. Zohrabyan et al. (2008) apply cointegration techniques and find that house price dynamics among US census divisions are highly correlated and that the housing market is mainly led by regions that are influential in financial and economic terms. Clark and Coggin (2009) apply structural time series analysis and unit root tests for absolute and relative convergence among US regions. The authors find mixed evidence of regional house price convergence. Holly et al. (2010) estimate an error correction model for a panel of 49 US states and identify a significant spatial effect associated with contiguity. Holmes and Grimes (2011) apply a pairwise approach (Pesaran, 2007) to study the stationarity of all possible pairs of house price differentials across US states and find strong evidence of convergence, where the speed of adjustment between pairs of states seems to be inversely related to their distance. Barros et al. (2012) investigate the relationship between US state house prices and overall US house prices using fractional integration and cointegration techniques, raising doubts regarding long-run convergence in US state house prices and the presence of the ripple effect.
There are also several studies that analyse house price club convergence among US regions using the approach of Phillips and Sul (2007). The study by Apergis and Payne (2012) examines the convergence of US house prices by state over the quarterly period 1975:Q1 to 2010:Q4 and finds three different convergence clubs. Kim and Rous (2012) analyse house price convergence in a panel of 48 US states from 1975:Q1 to 2009:Q2 and in four US metropolitan area/division panels for different time periods. Interestingly, the authors show that club members are not necessarily geographical neighbours, so that conventional definitions of economic regions may not be the best choice for studying regional differences in house prices. In addition, to describe the resulting cluster patterns, the authors employ a multinomial logit model and conclude that housing supply regulation together with climate are the main factors determining convergence clubs. Lastly, Montañes and Olmos (2013) investigate the presence of club convergence among 19 US metropolitan areas for the period 2000:M1–2013:M3. Allowing the sample end point to vary, the authors note that the number and composition of clubs differ when the sample includes information after 2010, concluding that the bursting of the housing price bubble significantly altered the US housing market.
To the best of our knowledge, the only published study that investigates convergence in Spanish house prices at the regional level is the work by Larraz-Iribas and Alfaro-Navarro (2008). Using cointegration techniques, the authors analyse the behaviour of house prices among 17 Spanish NUTS2 regions (Autonomous Communities) for the period 1995:Q1–2006:Q4. Their results provide evidence of cointegration, which suggests a broad grouping of regions based on physical proximity or similar economic characteristics. There are, however, several other studies of house price behaviour in Spain. For example, García-Montalvo (2001) examines the trend in Spanish house prices from the late 1980s to 2000, and estimates a model to analyse the factors that determine house price growth rates among Spanish Autonomous Communities. Martínez and Maza (2003) analyse several determinants of the evolution of Spanish house prices during the period 1978–2002. The authors find that the relaxation of credit constraints, reflected both in the growth of housing loans and in the fall of nominal interest rates on mortgage loans, accounts for a substantial portion of house price growth. García-Montalvo (2010) analyses the effect of several factors, such as land use regulation, immigration and employment growth, on the growth of house prices in Spain at the municipality level during the recent housing boom. The author concludes that neither land availability nor the growth in the number of immigrants has any statistically significant explanatory power for the growth rate of prices at the municipal level. Moreover, the only factors with a significant effect on the growth of house prices are the proportion of rental housing and the initial price level.
Gonzalez and Ortega (2013) study the effect of immigration on house prices and residential construction in Spain over the period 1998–2008 to find that immigration accounted for roughly one-third of the housing boom, both in terms of prices and new construction.
Empirical methodology: A regression-based convergence test
In this section we present the econometric methodology used to analyse the existence of house price convergence clubs among Spanish provinces. The methodology was developed by Phillips and Sul (2007) in order to test for sigma convergence in a panel of countries. In our case, the point of departure is the decomposition of the panel of log house prices, logPit, into two components: a systematic one, git, which includes the permanent common components that give rise to cross-section dependence, and a transitory one, ait:

$$\log P_{it} = g_{it} + a_{it} \qquad (1)$$
To separate common from idiosyncratic components in the panel, equation (1) is reformulated as:

$$\log P_{it} = \left(\frac{g_{it} + a_{it}}{\mu_t}\right)\mu_t = \delta_{it}\,\mu_t \qquad (2)$$
where μt is a common component and δit is an idiosyncratic component, both of which are time varying. Thus, the idiosyncratic component, δit, measures the individual distance between the common component, μt, and logPit. Phillips and Sul (2007) model the time-varying behaviour of δit in semiparametric form as:

$$\delta_{it} = \delta_i + \sigma_i\,\xi_{it}\,L(t)^{-1}\,t^{-\alpha} \qquad (3)$$

where δi is fixed, σi is an idiosyncratic scale parameter, ξit∼iid (0,1) across i but weakly dependent on t, and L(t) is a slowly varying function (like log t) for which L(t) →∞ as t→∞. Equation (3) ensures that δit converges to δi for all α ≥ 0, which becomes the null hypothesis of interest. This formulation enables Phillips and Sul (2007) to develop an econometric test of convergence, by testing whether the factor loadings δit converge to a constant δ. For this purpose, the authors define the relative transition parameter hit as:

$$h_{it} = \frac{\log P_{it}}{N^{-1}\sum_{i=1}^{N}\log P_{it}} = \frac{\delta_{it}}{N^{-1}\sum_{i=1}^{N}\delta_{it}} \qquad (4)$$
which eliminates the common component by taking ratios and measures the transition path of the house price in province i relative to the panel average. Thus it measures individual house price behaviour in relation to other regions and describes the relative departure of the house price in province i from the common growth path, μt. When all house prices move towards the same transition path, i.e. δit converges to a constant δ, the relative transition parameters hit converge to unity. In this case the cross-sectional variance of hit, denoted Ht, converges to zero:

$$H_t = \frac{1}{N}\sum_{i=1}^{N}\left(h_{it} - 1\right)^2 \to 0 \quad \text{as } t \to \infty \qquad (5)$$
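Using the semiparametric model represented in equation (3) above, the null hypothesis of convergence can be written as:

$$H_0:\ \delta_i = \delta \ \text{ and } \ \alpha \ge 0 \qquad (6)$$

and the alternative:

$$H_A:\ \delta_i \ne \delta \ \text{ for all } i \ \text{ or } \ \alpha < 0 \qquad (7)$$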
The null hypothesis is tested using the following log t regression:

$$\log\left(\frac{H_1}{H_t}\right) - 2\log L(t) = a + b\log t + u_t, \qquad t = [rT],\, [rT]+1,\, \ldots,\, T \qquad (8)$$

where Ht is the cross-sectional variance of hit, L(t) = log (t+1), the coefficient on log t is b = 2α, and the regression is run after discarding the initial fraction r > 0 of the sample. Using the t-statistic tb, robust to heteroscedasticity and autocorrelation (HAC), the null hypothesis of convergence is rejected in a one-sided test when tb < −1.65 (5% significance level). It is worth noting that the coefficient on log t has a relevant economic interpretation, not only in its sign but also in its magnitude. Specifically, its magnitude is directly related to the rate of convergence, so that the higher the value of b, the faster the rate of convergence. Based on Monte Carlo simulations, Phillips and Sul (2007) recommend setting r = 0.3.
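To make the mechanics of the log t regression concrete, the following Python sketch computes the relative transition paths, their cross-sectional variance and the regression t-statistic for a simulated panel. It is an illustration only, not the code behind the results reported here. The Newey-West (Bartlett kernel) variance and its bandwidth rule are assumed HAC choices, and `simulate_panel`, the common trend and all parameter values are hypothetical.

```python
import math
import random

def log_t_test(panel, r=0.3):
    """panel[i][t] holds log P_it; returns (b_hat, t_stat) of the log t regression."""
    n, T = len(panel), len(panel[0])
    # Cross-sectional variance H_t of the relative transition paths h_it.
    H = []
    for t in range(T):
        mean_t = sum(row[t] for row in panel) / n
        H.append(sum((row[t] / mean_t - 1.0) ** 2 for row in panel) / n)
    # Regression sample t = [rT], ..., T, using a 1-based time index.
    ts = range(max(int(r * T), 1), T + 1)
    x = [math.log(t) for t in ts]
    y = [math.log(H[0] / H[t - 1]) - 2.0 * math.log(math.log(t + 1)) for t in ts]
    m = len(x)
    x_bar, y_bar = sum(x) / m, sum(y) / m
    xd = [xi - x_bar for xi in x]
    sxx = sum(v * v for v in xd)
    b = sum(v * yi for v, yi in zip(xd, y)) / sxx
    a = y_bar - b * x_bar
    # Newey-West long-run variance of the slope score (assumed HAC estimator).
    s = [v * (yi - a - b * xi) for v, xi, yi in zip(xd, x, y)]
    bw = int(4 * (m / 100.0) ** (2.0 / 9.0))
    lrv = sum(v * v for v in s)
    for lag in range(1, bw + 1):
        w = 1.0 - lag / (bw + 1.0)
        lrv += 2.0 * w * sum(s[t] * s[t - lag] for t in range(lag, m))
    return b, b * sxx / math.sqrt(lrv)

def simulate_panel(deltas, T=100, alpha=0.5, sigma=0.1, seed=0):
    """Hypothetical data: log P_it = delta_it * mu_t with decaying idiosyncratic noise."""
    rng = random.Random(seed)
    panel = []
    for d in deltas:
        row = []
        for t in range(1, T + 1):
            delta_it = d + sigma * rng.gauss(0.0, 1.0) / (math.log(t + 1) * t ** alpha)
            row.append(delta_it * (5.0 + 0.02 * t))  # assumed common trend mu_t
        panel.append(row)
    return panel

# Identical long-run loadings (convergence) versus two distinct loadings (divergence).
b_conv, t_conv = log_t_test(simulate_panel([1.0] * 50))
b_div, t_div = log_t_test(simulate_panel([0.8] * 25 + [1.2] * 25))
```

In the first panel all loadings share the same limit, so the t-statistic should lie above −1.65; in the second the loadings settle at two different values, and the statistic should fall well below the critical value.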
This procedure has several features that make it very useful in applied work. First, the test does not rely on any particular assumption concerning trend stationarity or stochastic nonstationarity in individual house prices or in the common trend, μt. Second, the nonlinear form of equation (3) is sufficiently general to include the possibility of transitional heterogeneity or even transitionally divergent individual behaviour. Thus, the method makes it possible to detect convergence even in the case of transitional divergence, where other methods such as cointegration tests for long-run analysis and stationary time series tests may fail. 1
In the empirical application of the log t-statistic to identify convergence clubs, Phillips and Sul (2007) suggest using the following algorithm:
Step 1 (Ordering): the panel members are ordered according to the last observation.
Step 2 (Core Group Formation): a core group of provinces is identified on the basis of the maximum tk with tk > −1.65, from the sequential log t regressions based on the k highest members for 2 ≤ k ≤ N.
Step 3 (Club Membership): each individual region not included in the core group is evaluated, one at a time, for membership in this group. A new province is included if the associated t-statistic is greater than zero.
Step 4 (Recursion and Stopping): the log t regression is applied to those provinces not selected in Step 3. If the null of convergence is not rejected for this complement group, then they form a second convergence club. If rejected, Steps 1–3 are repeated in order to detect subconvergence clusters. If no core group is found in Step 2, then the house price of provinces from the complement group display a divergent behaviour and the algorithm stops.
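The four steps above can be sketched in Python as follows. This is a simplified reading of the algorithm as described here, not the exact implementation used in the paper: the log t statistic is passed in as a callable `t_stat`, the core group is chosen as the top-k set with the largest statistic above the critical value, and the names `find_clubs`, `stub_t_stat`, `group` and the example data are hypothetical.

```python
def find_clubs(last_obs, t_stat, crit=-1.65, c_star=0.0):
    """last_obs: unit -> final observation; t_stat: callable(list of units) -> log t statistic.

    Returns (clubs, leftover), where leftover holds units that join no club.
    """
    # Step 1: order the panel members by their last observation, highest first.
    remaining = sorted(last_obs, key=last_obs.get, reverse=True)
    clubs = []
    while len(remaining) >= 2:
        # Step 4: if everything still unassigned converges, it forms one club.
        if t_stat(remaining) > crit:
            clubs.append(remaining)
            remaining = []
            break
        # Step 2: core group = top-k members with the largest t_k above crit.
        best_k, best_t = None, crit
        for k in range(2, len(remaining) + 1):
            t_k = t_stat(remaining[:k])
            if t_k > best_t:
                best_k, best_t = k, t_k
        if best_k is None:
            break  # no core group: the remaining units diverge
        core = remaining[:best_k]
        # Step 3: add each other unit, one at a time, if its t-statistic exceeds c_star.
        club = core + [u for u in remaining[best_k:] if t_stat(core + [u]) > c_star]
        clubs.append(club)
        remaining = [u for u in remaining if u not in club]
    return clubs, remaining

# Hypothetical stand-in for the log t statistic: units "converge" only when they
# share the same (normally unobserved) group label.
group = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 3}

def stub_t_stat(members):
    return 5.0 if len({group[u] for u in members}) == 1 else -10.0

last_obs = {"A": 7.9, "B": 7.8, "C": 7.7, "D": 7.2, "E": 7.1, "F": 6.5}
clubs, leftover = find_clubs(last_obs, stub_t_stat)
```

With the stub statistic the procedure recovers the two groups with at least two members as clubs and leaves the single unit F unassigned, which corresponds to the divergent case in Step 4.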
Empirical results
Data
The data used in this study are the quarterly log house price (euros per square metre) for 50 Spanish provinces 2 for the period 1995:Q1 to 2007:Q4. The data are provided by the Spanish Ministry of Public Works and are computed on the basis of around 200,000 evaluations per quarter performed by various appraisal companies throughout Spain. These evaluations distinguish between houses in terms of type (state-subsidised and non-state-subsidised), location (NUTS3, NUTS2 and municipality), price stratum and the built-up area. Published weighted mean prices are computed by aggregating the prices corresponding to the lower location levels using the bottom-up technique.
For the clustering algorithm, and following Phillips and Sul (2007), 3 we applied the Hodrick-Prescott filter (Hodrick and Prescott, 1997) to the house price data in order to remove the cyclical components. As recommended by the authors, we also discarded the initial fraction (r = 0.3) of the time series data. Since local labour markets are closely related to the housing market, we select provinces as the geographical unit of analysis (local housing markets), because they adequately delimit the boundaries of local labour markets. Recent research by the Organization for Economic Cooperation and Development (OECD) has identified metropolitan areas in Spain, defined as those areas where labour linkages are very high (OECD, 2012). These areas are built by clustering urban municipalities with high levels of commuting flows. The majority of the metropolitan areas identified by the OECD correspond to provincial capitals. 4
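For readers unfamiliar with the filter, the trend extraction can be sketched as follows. This is a minimal pure-Python sketch that solves the Hodrick-Prescott first-order conditions directly. The smoothing parameter of 1600 is the value conventionally used for quarterly data and is an assumption here, since the text does not report the value used; `hp_trend` and the example series are hypothetical.

```python
import math

def hp_trend(y, lam=1600.0):
    """Hodrick-Prescott trend: solve (I + lam * D'D) tau = y, with D the second-difference operator."""
    T = len(y)
    A = [[1.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
    for start in range(T - 2):
        d = {start: 1.0, start + 1: -2.0, start + 2: 1.0}  # one row of D
        for i, di in d.items():
            for j, dj in d.items():
                A[i][j] += lam * di * dj
    # Gaussian elimination with partial pivoting on the augmented system [A | y].
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for col in range(T):
        piv = max(range(col, T), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, T):
            f = M[i][col] / M[col][col]
            for j in range(col, T + 1):
                M[i][j] -= f * M[col][j]
    # Back substitution.
    tau = [0.0] * T
    for i in range(T - 1, -1, -1):
        tau[i] = (M[i][T] - sum(M[i][j] * tau[j] for j in range(i + 1, T))) / M[i][i]
    return tau

# A purely linear series has no cycle, so the filter returns it unchanged.
linear = [2.0 + 0.5 * t for t in range(12)]
linear_trend = hp_trend(linear)

# For any series the cyclical component (series minus trend) sums to zero.
wavy = [0.1 * t + math.sin(t) for t in range(12)]
wavy_trend = hp_trend(wavy)
```

The dense solver is adequate for a series of 52 quarters; for long series a banded solver would be the more natural choice.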
Figure 1 plots the coefficient of variation of house prices among Spanish provinces.

Figure 1. House price dispersion among Spanish provinces, 1995:Q1–2007:Q4.
The figure shows no evidence of sigma convergence between 1995:Q1 and 2007:Q4, since the dispersion of house prices between provinces has widened. According to the plot, the dispersion of house prices remained fairly constant during 1995–1997, at a value of around 0.27. However, between the first quarter of 1998 and the first quarter of 2003 the coefficient of variation increased by a factor of 1.3, and it then decreased from a value above 0.34 in the first quarter of 2003 to 0.30 in the fourth quarter of 2007.
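The dispersion measure behind Figure 1 can be computed along the following lines. This sketch assumes that the coefficient of variation is the population standard deviation divided by the mean, since the text does not say whether the population or sample formula was used; the price values are made up for illustration.

```python
import math

def coefficient_of_variation(prices):
    """Cross-sectional standard deviation (population formula) divided by the mean."""
    mean = sum(prices) / len(prices)
    variance = sum((p - mean) ** 2 for p in prices) / len(prices)
    return math.sqrt(variance) / mean

# Hypothetical prices for eight provinces in a single quarter.
example_prices = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
cv = coefficient_of_variation(example_prices)  # mean 5, standard deviation 2
```

Applying the function quarter by quarter to the provincial price levels would trace out a series of the kind plotted in the figure.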
Club convergence: The log t test
When the log t test 5 is applied to house prices across the 50 Spanish provinces, the point estimate of b (equation 8) is −0.53 and the t-statistic (−11.60) indicates that the parameter is significantly less than zero, suggesting divergence of the full group. This result confirms the increase in house price dispersion among Spanish provinces observed in Figure 1.
We then apply the clustering algorithm described in the Empirical methodology section to examine whether there are any subgroups of provinces that converge. After ordering the provinces by their last time series observation and determining the core group (Steps 1 and 2), we found a first convergence club (Club 1) that consists of 18 provinces: Alicante, Almería, Cádiz, Castellón, Córdoba, Gerona, Guadalajara, Guipúzcoa, Huelva, Huesca, Madrid, Málaga, Murcia, Sevilla, Tarragona, Toledo, Vizcaya and Zaragoza. The log t-statistic for the rest of the provinces (Step 4) was −5.79 (b = −0.30). Since the null hypothesis of convergence is rejected, these provinces do not form a complementary club, and we proceed to search for a new, smaller convergence club. The second convergence club (Club 2) consists of eight provinces: Álava, Albacete, Baleares, Barcelona, Cantabria, Granada, Pontevedra and Valencia. For the remaining provinces the log t-statistic was −2.41 (b = −0.13), which lies below the 5% critical value of −1.65, so the null of convergence is again rejected. Following the club convergence algorithm, a third club (Club 3) was found, comprising 16 provinces: Asturias, Ávila, Badajoz, Burgos, Ciudad Real, Cuenca, A Coruña, Jaén, Lleida, Navarra, Las Palmas, La Rioja, Salamanca, Santa Cruz de Tenerife, Segovia and Valladolid. Finally, the null of convergence was not rejected for the eight remaining provinces: Cáceres, León, Lugo, Ourense, Palencia, Soria, Teruel and Zamora. These provinces form the fourth convergence club (Club 4).
The results of the log t regressions for each of the four clubs identified above, along with several descriptive statistics, are presented in Panel A of Table 1. Panel B of Table 1 shows club membership. For the log t regressions we report the estimated parameters and the corresponding t-statistics in parentheses, while for the descriptive statistics we present the mean value for each club and the standard deviation. It is worth noting that the speed of convergence towards the average is by far the largest in Club 4 (α = 0.25), indicating that house prices in provinces belonging to the fourth club are approaching one another faster in relative terms. Club 3 exhibits the lowest speed of convergence (α = 0.07), and the first and second clubs present a similar speed of convergence (α = 0.09). The reported descriptive statistics show that Club 1 consists of provinces with an initial house price below the average that experienced a large house price increase during the period under scrutiny. The house prices of provinces belonging to Club 2 were also heavily affected by the housing boom, but their initial house price was already high in relative terms at the beginning of the period. The initial house price of provinces in Club 3 was rather similar to that of provinces in Club 1. However, the impact of the housing boom was less severe than in provinces belonging to Club 1 and Club 2. Lastly, provinces in Club 4 are characterised by an initial house price far below the average, while the effect of the housing bubble was moderate, with relatively lower housing inflation rates. In addition, Figure 2 provides a graphic illustration of house price club membership among Spanish provinces. Interestingly, with the exception of provinces belonging to Club 4, the clubs seem to be spatially concentrated to some extent: there is some evidence that neighbouring provinces tend to cluster together.
This applies most notably to the Autonomous Communities of Andalucía, Castilla y León and Galicia. This pattern is confirmed by both the Moran’s I and the Geary’s C spatial autocorrelation test statistics, which are equal to 0.23 (p-value = 0.003) and 0.72 (p-value = 0.002), respectively. Thus, the null hypothesis of no spatial autocorrelation in club membership is rejected in both cases at the 5% level. 6
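As a reminder of what the first of these statistics measures, the following sketch computes Moran's I for a small hypothetical example. The binary contiguity matrix for four units arranged in a line is an assumption made for illustration; the spatial weights used for the 50 provinces are not reproduced here.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Hypothetical layout: four units in a row, each bordering only its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 2, 2], chain)    # neighbours mostly share a club
alternating = morans_i([1, 2, 1, 2], chain)  # neighbours never share a club
```

Positive values, as in the clustered case, indicate that neighbouring units tend to have similar values, which is the pattern reported for club membership.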
Table 1. Convergence club classification.
Notes: No.: number of member provinces. The log t test statistic is distributed as a one-sided t-statistic with a 5% critical value of −1.65.

Figure 2. House price convergence clubs among Spanish provinces.
Figure 3 illustrates the relative transition paths of the four clubs, calculated as the cross-sectional mean of the relative transition paths of the members of each club. Under convergence for the full panel of provinces, the relative transition paths should tend to unity, that is, all regional house prices converge to the same level. Under club convergence, however, the relative transition paths of the members of each club tend to different constants. This regularity can be clearly seen in Figure 3: the first and second clubs lie distinctly above the average, whereas the third and fourth clubs remain below unity, and there is no evidence of a convergence process between clubs, with the exception of Club 1 and Club 2. Moreover, although the second and third clubs start from a similar value, the average transition path of the latter displays a marked downward trend, dropping further below the average, while the average transition path of the former trends clearly upward.

Figure 3. Relative transition paths of Spanish province clubs, 1995:Q1–2007:Q4.
Additionally, Figure 4 plots the relative transition path of each province's house price by club. Several regularities are observed. First, and as expected, in all clubs there is some evidence of a catch-up process (absolute β-convergence), whereby house prices in provinces with a lower initial level approach those in provinces with a higher initial level. To further confirm these patterns, and following traditional cross-country growth regressions, we estimated for each club, by Ordinary Least Squares, a regression of the difference between the log house price in 2007 and in 1995 on a constant term and the log house price in 1995. The coefficient on the initial house price is negative in all four cases, suggesting the presence of β-convergence. Moreover, the coefficient is highly significant (1% level) in Club 1, with a value of −0.26 and a t-statistic of −4.58, and in Club 4, with a value of −0.47 and a t-statistic of −4.46. However, for Club 2 and Club 3 the coefficient on the initial house price is only marginally significant (15% level), with values of −0.12 and −0.13 respectively and a t-statistic of −1.66 in both cases. Second, the relative transition paths within clubs present quite heterogeneous patterns, implying that the manner of house price transition and convergence differed considerably across provinces within a given club during the period analysed.
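The within-club β-convergence regression described above amounts to a simple bivariate OLS, which can be sketched as follows. The function name and the four data points are hypothetical, and conventional (non-robust) standard errors are assumed because the text does not state which were used.

```python
import math

def beta_convergence(log_p_start, log_p_end):
    """OLS of (log_p_end - log_p_start) on a constant and log_p_start.

    Returns the slope on the initial log price and its conventional t-statistic.
    """
    x = log_p_start
    y = [end - start for start, end in zip(log_p_start, log_p_end)]
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    const = y_bar - beta * x_bar
    ssr = sum((yi - const - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return beta, beta / se

# Made-up club of four provinces: lower initial prices go with larger increases.
start = [1.0, 2.0, 3.0, 4.0]
end = [2.0, 2.9, 3.5, 4.4]
beta, t_stat = beta_convergence(start, end)
```

A negative slope, as in this example, is the catch-up pattern referred to in the text.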

Figure 4. Relative transition paths of Spanish provinces by club, 1995:Q1–2007:Q4.
Further results and robustness analysis
In addition to the previous analysis, in this section we deal with some extensions and investigate the robustness of the results with respect to the methodology employed as well as sample size.
Club merging and transitioning
An interesting issue to explore is whether there is evidence of club merging and transitioning between groups of provinces belonging to different clubs. Following the method proposed by Phillips and Sul (2009), the test results of club merging for all possible combinations of two and three consecutive clubs are reported in columns (1) through (3) of Table 2. We reject the null hypothesis of club merging for all combinations with the exception of club merging between Club 1 and Club 2. In this case, the estimated value of b is 0.096 and the corresponding t-statistic is 1.208, suggesting that the first two clubs can form a large convergence club of 26 provinces. However, the speed of convergence among the 26 provinces is very slow, as indicated by the fact that the estimated b is not statistically different from zero. To explore transitioning between groups of provinces belonging to different clubs, we check whether the λ fraction of the provinces in an upper club with the lowest house price converge with the λ fraction of the provinces in a lower club with the highest house price. We set λ = 0.5 and apply the log t test for all combinations of two consecutive clubs. The presented results in columns (4) through (6) of Table 2 reveal that it is not possible to reject the convergence hypothesis for the first two pairs of consecutive clubs. This finding indicates that during the period of analysis, there is a tendency for some provinces to move from one convergence club to the other, between Club 1 and Club 2 and between Club 2 and Club 3.
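The construction of the transition groups can be illustrated with a short sketch. It assumes that provinces are ranked by their final-period house price and that the number of provinces taken from each club is the integer part of λ times the club size (with a minimum of one). Neither detail is stated in the text, and the function and province labels are hypothetical.

```python
def transition_group(upper_club, lower_club, last_price, lam=0.5):
    """Cheapest lam-fraction of the upper club plus the dearest lam-fraction of the lower club."""
    n_up = max(1, int(lam * len(upper_club)))    # assumed rounding rule
    n_low = max(1, int(lam * len(lower_club)))
    cheapest_upper = sorted(upper_club, key=last_price.get)[:n_up]
    dearest_lower = sorted(lower_club, key=last_price.get, reverse=True)[:n_low]
    return cheapest_upper + dearest_lower

# Hypothetical clubs and final-period log prices.
upper = ["u1", "u2", "u3", "u4"]
lower = ["l1", "l2"]
last_price = {"u1": 9.0, "u2": 8.0, "u3": 7.0, "u4": 6.0, "l1": 5.0, "l2": 4.0}
candidates = transition_group(upper, lower, last_price)
```

The log t test is then applied to the provinces in `candidates`; failing to reject convergence for this mixed group is what the text interprets as evidence of transitioning between the two clubs.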
Table 2. Test results of club merging.
Notes: No.: number of member provinces. The test statistic is distributed as a one-sided t-statistic with a 5% critical value of −1.65. Rejection of the null hypothesis of club merging is denoted by *.
Extending the sample size
Although our main concern is to identify convergence clubs during the housing boom, it is interesting to analyse whether the results are sensitive to extending the sample to the latest available data. 7 Table 3 reports the results of the cluster analysis for two different sample endpoints, 2009:Q4 and 2011:Q4. When the period after the housing boom is included, the number of clubs falls, indicating a clear alteration of the clustering results. Interestingly, these results suggest that the collapse of the housing bubble notably reduced the observed segmentation in the Spanish housing market. Specifically, when the sample ends in 2011:Q4 we find evidence of two different clubs: a relatively small club (Club B.1) of 16 provinces with a higher speed of convergence towards the average (α = 0.13) and a larger club (Club B.2) of 34 provinces with a lower speed of convergence (α = 0.06). A comparison of the composition of these two clubs with our previous findings (Table 1) shows that all provinces belonging to Club 2 (with the exception of Barcelona, Pontevedra and Valencia), Club 3 and Club 4, along with several provinces of Club 1, are now grouped together in a single club (Club B.2).
Table 3. Effect of changing the sample period on club membership.
Notes: No.: number of member provinces. The log t test statistic (in parentheses) is distributed as a one-sided t-statistic with a 5% critical value of −1.65.
Stochastic convergence
As a robustness check of the clustering results of the previous section, we examine the long-run relationships over time between regional house prices within each club. Specifically, we use the time series methodology suggested in Carlino and Mills (1993) and test for stochastic convergence. Stochastic convergence implies that shocks to a province's house price relative to the average of the club members are temporary. We examine the null hypothesis that house prices are diverging by means of unit root tests, where for each province the variable tested is the logarithm of the relative house price. Within each club, stochastic convergence exists when the null hypothesis of a unit root is rejected, indicating that relative house prices are trend stationary and thus that house prices converge. Failure to reject the null hypothesis, on the other hand, indicates that the effect of a shock is permanent, suggesting divergence of the series from the club average.
Instead of using the traditional approaches for testing unit roots, we utilise the minimum Lagrange multiplier (LM) unit root test proposed by Lee and Strazicich (2003). The advantage of this procedure is that it allows for two structural breaks in level and trend, and determines the break points endogenously from the data. The LM test allows for breaks under both the null and alternative hypothesis so that rejection of the null unambiguously implies trend stationarity. The detailed results of the LM unit root test 8 are reported in Table 4. With the exception of Club 2, the findings suggest that the majority of the relative house prices are stationary and provide significant evidence of a long-run relationship between regional house prices within each club. We find evidence of house price stochastic convergence for 14 out of the 18 provinces in Club 1 at the 5% level and for 13 out of the 16 provinces in Club 3. Also, the null hypothesis of a unit root is rejected for all log relative house price series in Club 4. However, we only find evidence of stochastic convergence for four out of the eight cases in Club 2.
Table 4. Two-break minimum LM unit root test and panel unit root tests, 1995:Q1–2007:Q4.
Notes: All estimated models include two changes in level and trend (Model C in Lee and Strazicich, 2003). The number in [ ] indicates the optimal number of lagged terms included in the unit root test. The critical values reported by Lee and Strazicich (2003: table 2) were used to evaluate the significance of the LM test statistic. Statistical significance at the 5% level is indicated by an asterisk (*). For the reported break dates the significance is determined using a conventional t-statistic.
The geographical and socioeconomic meaning of convergent clubs
As mentioned before (see sections Introduction and Empirical methodology), the procedure of Phillips and Sul (2007) has several appealing features compared with other methodologies to test for club convergence, and is econometrically powerful (see Phillips and Sul, 2007, 2009). However, although the starting point of their methodology is the standard neoclassical economic growth model allowing for heterogeneous technological progress, their testing procedure is somewhat atheoretical because it requires no prior specific assumptions regarding potential geographical or socioeconomic convergence club associations.
To analyse whether the test results make sense geographically and from a socioeconomic point of view, we specify a logit model where, for each possible pair of provinces, the dependent variable is a dummy that takes the value of one if the two provinces belong to the same club, and zero otherwise. We use four sets of explanatory variables. First, we use three variables related to geographical closeness between each pair of provinces: sharing a border, being in the same Autonomous Community (NUTS2 region) and the distance between provinces. We also use a binary variable that takes value one if both provinces are coastal and zero otherwise. Second, we introduce two variables for economic differences between provinces: one for differences in the size of the economy (real Gross Domestic Product) and one for differences in its rate of growth. The third set of variables accounts for demographic closeness, and includes population and an index of similarity in the population structure by age following the work of Finger and Kreinin (1979). Lastly, we also control for differences in the house market between provinces, including the size of the rental market, the house stock and the number of vacant homes. All variables are measured at the beginning of the period of analysis.
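The pairwise design can be sketched as follows. This is a minimal illustration, assuming that club assignments are stored in a dictionary and that the similarity index takes the usual Finger and Kreinin form, that is, the sum over age groups of the smaller of the two shares; the names `build_pairs` and `similarity_index` and the toy inputs are hypothetical.

```python
from itertools import combinations

def build_pairs(club_of):
    """One row per unordered pair of provinces with a same-club dummy."""
    rows = []
    for a, b in combinations(sorted(club_of), 2):
        rows.append({"pair": (a, b), "same_club": int(club_of[a] == club_of[b])})
    return rows

def similarity_index(shares_a, shares_b):
    """Finger-Kreinin-type index: sum of the smaller share per age group.

    Shares are fractions summing to one, so the index lies in [0, 1]
    and equals 1 when the two age structures are identical.
    """
    return sum(min(sa, sb) for sa, sb in zip(shares_a, shares_b))

# Toy club assignment for 50 hypothetical provinces (not the paper's result).
club_of = {f"prov_{i:02d}": 1 + i % 4 for i in range(50)}
pairs = build_pairs(club_of)
n_pairs = len(pairs)  # 50 * 49 / 2 unordered pairs

same = similarity_index([0.2, 0.5, 0.3], [0.2, 0.5, 0.3])
diff = similarity_index([0.2, 0.5, 0.3], [0.4, 0.4, 0.2])
```

With 50 provinces this construction yields 1225 unordered pairs, which matches the number of observations reported for the logit regressions.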
The results of this regression are presented in Table 5. For ease of interpretation, we report the estimated coefficients and their corresponding robust standard errors in parentheses, along with the marginal effects. First, we estimate the model with only the set of geographical variables. Afterwards, one by one, we include the sets of variables related to economic, demographic and house market closeness between provinces. Column (1) shows a positive correlation between being in the same Autonomous Community and the probability of belonging to the same club. The marginal effect implies that a pair of provinces from the same Autonomous Community is 14 percentage points more likely to be in the same club. This result suggests that there is some evidence of house price contagion or ripple effect between provinces in the same Autonomous Community, a fact that we already pointed out in section Literature review (Figure 2). However, the evidence is not robust, since the statistical significance of the estimated coefficient vanishes when we include additional sets of control variables. Column (1) also shows that a pair of coastal provinces is 7 percentage points more likely to be in the same club. However, the estimated coefficients on sharing a border and distance are not statistically significant at conventional levels. Column (2) adds the variables that control for differences in economic size and its rate of growth. Both coefficients are negative and significant, but only the parameter estimate on differences in economic size remains significant when other sets of variables are included. Specifically, the marginal effect for this variable suggests that a one log point difference in the size of the economy reduces the probability of being in the same club by 19 percentage points. This value falls to 12 percentage points when the full set of variables is included.
Column (3) indicates a positive correlation between the Demographic Similarity Index and the probability of being in the same club. Thus, two provinces where the age structure of the population is similar have a larger probability of belonging to the same club, a fact that could be explained by similarities in housing demand. The estimated coefficient for this variable remains highly significant when the model includes the full set of explanatory variables. Lastly, Column (4) incorporates differences in the house market. The only variable that is statistically significant is the difference in vacant homes. The corresponding estimated coefficient is negative suggesting that larger differences in the number of vacant homes over the total house stock reduce the probability of being in the same club. Specifically, a one percentage point difference in the number of vacant homes decreases the probability of being in the same club by 2 percentage points.
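How a logit coefficient translates into a marginal effect can be illustrated with a short sketch. It assumes that marginal effects are evaluated at a single point, for example the sample mean of the regressors; the coefficient and index values below are hypothetical and are not the estimates in Table 5.

```python
import math

def logistic(z):
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))

def marginal_effect_continuous(beta_k, index):
    """dP/dx_k = beta_k * p * (1 - p), evaluated at linear index x'beta."""
    p = logistic(index)
    return beta_k * p * (1.0 - p)

def marginal_effect_dummy(beta_k, index_without):
    """Discrete change in P when a dummy switches from 0 to 1."""
    return logistic(index_without + beta_k) - logistic(index_without)

# Hypothetical values: coefficient -0.8 at linear index 0 (so p = 0.5).
me_cont = marginal_effect_continuous(-0.8, 0.0)
# Hypothetical dummy with coefficient 0.6 and baseline index -1.
me_dummy = marginal_effect_dummy(0.6, -1.0)
```

Multiplying these values by 100 expresses them in percentage points, the unit used in the discussion above.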
Table 5. Estimation results from logit model.
Notes: N = 1225. All regressions include a constant term. Robust standard errors in parentheses. Statistical significance is indicated by ** at 1%, * at 5%. The value of R2 corresponds to the pseudo-R2 proposed by McKelvey and Zavoina (1975).
Overall, these figures suggest that provinces in the same particular club share a few common characteristics. Mainly, they have a similar economic size, a similar demographic composition by age and a similar number of vacant homes over the total house stock.
Factors driving club membership
The results on club convergence show that house prices in Spain are not converging to a single level across regions, but that there are subgroups of provinces within which house prices tend to converge to a common level. Moreover, the differences in the house price converging levels across clubs indicate that the impact of the housing boom in Spain was rather heterogeneous at the regional level. Thus, it is of interest to investigate the characteristics of convergent provinces and the factors driving the formation of different convergence clubs.
To analyse how regional characteristics relate to house price club membership, we estimate an ordered logit model that predicts the likelihood that any given province is a member of each convergence club. In our model, the dependent variable represents the club to which a province belongs, and is treated as an ordinal variable since the observed clubs can be ranked according to the converging house price level of the provinces in each club. We analyse club membership on the basis of our initial finding of four different clubs.
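The structure of an ordered logit can be sketched as below. This is a generic textbook formulation, assuming four ordered categories separated by three cut points; the cut points and index values are hypothetical, and the function `category_probabilities` is illustrative rather than the authors' code.

```python
import math

def logistic(z):
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(index, cutpoints):
    """P(y = j) = F(k_j - x'b) - F(k_{j-1} - x'b) for ordered categories.

    `index` is the linear index x'b and `cutpoints` is an increasing
    list of thresholds; J - 1 thresholds give J categories.
    """
    cdf = [logistic(k - index) for k in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for value in cdf:
        probs.append(value - previous)
        previous = value
    return probs

cuts = [-1.0, 0.0, 1.5]  # hypothetical thresholds
low = category_probabilities(-0.5, cuts)
high = category_probabilities(1.0, cuts)
```

A larger linear index shifts probability mass towards the highest category, that is, towards the club ranked highest by its converging house price level.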
In the literature on house price determinants, house price dynamics are usually modelled in terms of changes in housing demand and supply. On the demand side, determinants can be classified into economic, demographic and social factors. Among others, key economic factors include household income and the cost of renting relative to owning in the market. Demographic and social factors include immigration, population growth and population composition. On the supply side, the key factors considered are construction costs, the existing housing stock and several characteristics linked to the territory, such as land availability and climate. Building on this literature, we consider several indicators to analyse which factors are driving club membership. Table A.2 in the Appendix provides the definition of the variables, the corresponding sources and the mean values. Among the demand factors, our first variable is the growth rate of total population. Since increasing house prices may induce reallocation of population between provinces, we use the annual average growth rate during the 1990s to avoid endogeneity problems. Second, to account for differences between native and foreign population growth, we also consider the annual average growth rates of native and foreign population during the 1990s. Third, we include the share of population aged 25 or more with a college degree over total population in 1995, as a proxy of initial potential demand. Fourth, because of the lack of data on the cost of renting at the province level, we use the number of rental houses over owned houses. This variable accounts for differences in the size of the rental market between provinces, and thus we expect that those provinces with a larger rental market were less influenced by the housing boom.
The data were obtained from the Spanish Population and Housing Census carried out in the years 1991 and 2001, and we use the average of the two values to proxy for initial rental market size. Our last variable from the demand side is the rate of growth of GDP per capita. Since data on regional income per capita are only available from 1995 onwards and since there could be an endogeneity problem between house prices and income growth, we use the annual average rate of growth during the second half of the 1990s. Thus we exclude the period (2000–2007) with higher housing inflation rates. From the supply side, in order to account for the size of the initial house supply, we include the house stock per 100 inhabitants in 1995 and the number of vacant homes over the total stock. Third, the number of new houses built during 1995–2007 per 100 inhabitants in 1995 was included to control for the increase in house supply during the analysed period. Lastly, we add a dummy variable to account for coastal provinces.
The estimation results from an ordered logit model are presented in Table 6. In each case we present the parameter estimates and their corresponding robust standard errors in parentheses, the resulting R2 and the value of the Wald test for the null hypothesis that all estimated coefficients are zero. In column (1) we report the estimation results of our first specification. As expected, the parameter estimate for total population growth is positive and significant at the 1% level, indicating that a province with higher population growth is more likely to belong to a club with a higher house price converging level. The coefficient on the share of population aged 25 and above with a college degree is positive but not statistically significant. Interestingly, the parameter estimate for the size of the rental market is negative and highly significant (1% level). This result suggests that those provinces with a larger initial house rental market are less likely to belong to a club with a higher house price converging level, and thus were less influenced by the housing boom. Differences in income per capita growth between Spanish provinces do not appear to be a key factor determining club membership: even though the coefficient of income growth is positive, it is not statistically significant at conventional levels. From the supply side we find that the parameter estimates for initial house stock and vacant homes are negative and statistically significant, indicating that a higher initial house supply reduces the probability of belonging to a club with a higher house price converging level. Hence, the house price increase was much more moderate in those provinces where the initial house supply was already large. The estimated coefficients on houses built during 1995–2007 per 100 initial inhabitants and on the coast dummy are both positive but not statistically significant.
Table 6. Estimation results from ordered logit model.
Notes: N = 50. The dependent variable takes value 1 for provinces in Club 4, value 2 for provinces in Club 3, value 3 for provinces in Club 2 and value 4 for provinces in Club 1. Columns (1) to (3) report parameter estimates. Columns (4) to (7) report marginal effects calculated at mean values for the estimated model in Column (2). All regressions include a constant term. Robust standard errors in parentheses. Statistical significance is indicated by ** at 1%, * at 5%. The value of R2 corresponds to the pseudo-R2 proposed by McKelvey and Zavoina (1975).
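For reference, the McKelvey and Zavoina pseudo-R2 can be computed from the fitted latent index. The sketch below uses the standard definition for logit-type models, in which the error variance of the latent variable is fixed at π²/3; the fitted index values are hypothetical, and the use of the population variance (rather than the sample variance) is an assumption.

```python
import math
import statistics

def mckelvey_zavoina_r2(fitted_index):
    """Var(fitted latent index) / (Var(fitted latent index) + pi^2 / 3)."""
    explained = statistics.pvariance(fitted_index)
    return explained / (explained + math.pi ** 2 / 3.0)

# Hypothetical fitted values x'b for five observations.
r2 = mckelvey_zavoina_r2([-2.0, -1.0, 0.0, 1.0, 2.0])
# A model whose fitted index never varies explains nothing.
r2_flat = mckelvey_zavoina_r2([1.0, 1.0, 1.0])
```

The measure lies between zero and one and rises as the fitted index explains a larger share of the variance of the latent variable.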
In the specification in column (2), we exclude total population growth and use instead the growth rate of native population and the growth rate of foreign population. Our main results are unaffected. The coefficient on the growth rate of native population is positive and significant at the 5% level. The estimated coefficient on foreign population growth is also positive, but it is not significant at conventional levels. The estimated coefficients on the size of the rental market, the initial house stock and the number of vacant homes over the total stock remain negative and statistically significant.
In column (3), we exclude the coast dummy variable and introduce instead three dummy variables to account for different coastal regions: the Mediterranean, the Atlantic and the Cantabrian coast. The coefficient estimates on these three variables are positive, but only the parameter on the Mediterranean dummy variable is statistically significant at the 5% level. In fact, most provinces on the Mediterranean coast belong to Club 1 (Figure 3), where the housing boom was relatively larger. Furthermore, the annual average growth rate of house prices in coastal provinces in the Mediterranean was above 11% in most cases, while on the Atlantic and Cantabrian coasts the average growth rate was around 9%. This fact could plausibly be explained by the increasing demand for vacation homes in the Mediterranean coastal provinces during the analysed period, both from national residents and from foreigners, mainly from the European Union (EU).
To facilitate the interpretation of the impact of the explanatory variables on the probability of membership in a specific club, we report in columns (4) to (7) the resulting marginal effects from the estimated model in column (2). These were computed at the mean of all explanatory variables. The marginal effects show the change in the probability of belonging to a specific club given a small change in the explanatory variables. By far, the variable with the largest impact on club membership is the growth rate of native population. More specifically, a one-point increase in the annual average growth rate of native population increases the probability of being in Club 1 by 50 percentage points and decreases the probability of belonging to Club 3 and Club 4 by 78 and 6 percentage points, respectively. This result may be explained by the strong correlation between the growth of native population and the house price growth rate across Spanish provinces. As can be seen in Figure 5, the house price growth rate during 1995–2007 was larger in those provinces with higher population growth in the period prior to the housing boom. Regarding the size of the rental market, a one-point increase in the ratio of rental homes over the total number of owned houses decreases the probability of belonging to Club 1 by 2.6 percentage points and increases the probability of belonging to Club 3 by 4 percentage points. For the initial house stock variable, each additional home per 100 inhabitants decreases the probability of being in Club 2 by one percentage point and increases the probability of being in Club 3 by 2.2 percentage points. Lastly, the marginal effects for vacant homes imply that a one-point increase in the ratio of vacant homes over total stock decreases the probability of being in Club 1 by 6.6 percentage points and increases the probability of being in Club 3 by 10.3 percentage points.
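The marginal effects of an ordered logit can be sketched in the same spirit. The snippet assumes the standard expression for a continuous regressor, evaluated at a given linear index such as the one obtained at the sample means; all numbers are hypothetical and do not reproduce Table 6.

```python
import math

def logistic_pdf(z):
    """Logistic density, f(z) = F(z) * (1 - F(z))."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)

def ordered_logit_marginal_effects(beta_k, index, cutpoints):
    """dP(y=j)/dx_k = beta_k * [f(k_{j-1} - x'b) - f(k_j - x'b)].

    The density terms at the outer limits (minus and plus infinity)
    are zero, so the effects across categories sum to zero.
    """
    dens = [0.0] + [logistic_pdf(k - index) for k in cutpoints] + [0.0]
    return [beta_k * (dens[j] - dens[j + 1]) for j in range(len(dens) - 1)]

# Hypothetical: positive coefficient, index 0.2, three thresholds.
effects = ordered_logit_marginal_effects(1.5, 0.2, [-1.0, 0.0, 1.5])
```

Because the effects sum to zero across categories, a higher probability of belonging to one club is always offset by lower probabilities of belonging to the others, which is the pattern described in the text.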

Figure 5. House price growth rate and population growth in Spain.
Conclusions
This paper has looked for evidence of convergence clusters among Spanish regions on the basis of house price trends over the period 1995:Q1–2007:Q4, using the official source of information. We have employed a novel methodology introduced by Phillips and Sul (2007), which allows the identification of the number and membership of convergence clubs using a regression-based convergence test.
The results on club convergence show that house prices of Spanish regions were not converging to a single level, but that there exist subgroups of regions within which house prices tend to converge to a common level. This result suggests the existence of some degree of segmentation in the Spanish housing market. More specifically, our results support the existence of four separate groups that converge to different house price levels. Furthermore, house price dynamics and housing inflation rates were rather heterogeneous between clubs, implying that the impact of the housing boom differed among provinces belonging to different clubs. Following the method proposed by Phillips and Sul (2009), we also find evidence of club merging between provinces belonging to the first two clubs found.
When the sample includes data after 2007, the results provide evidence that the bursting of the housing bubble has altered the Spanish housing market, since the cluster analysis offers different outcomes. Specifically, the results suggest that the collapse of the housing bubble notably reduced the observed segmentation in the Spanish housing market.
As a further analysis of the clustering results, we also examined the long-run relationships over time between regional house prices within each club by means of unit root tests. In general, we find evidence of stochastic convergence across Spanish provinces within each club, indicating that shocks to a province's house price relative to the average of the club members are temporary. The fact that the house prices of provinces in the same club tend to move together provides evidence in favour of a ripple effect among the members of each club.
The methodology of Phillips and Sul (2007) enables the endogenous determination of convergence clubs. This is an appealing feature compared with other approaches in which regions are grouped a priori, so that the cluster outcomes are to some extent predetermined by the arbitrarily selected variables for club formation and their threshold levels. However, this is also a disadvantage of their methodology, since the test results are somewhat atheoretical. The results from a logit model, estimated to analyse whether the test results make sense geographically and from a socioeconomic point of view, indicate that, over the analysed period, provinces belonging to the same club have a similar economic size, a similar demographic composition by age and a similar number of vacant homes over the total house stock. Moreover, provinces from the same Autonomous Community are more likely to be in the same club, suggesting that a certain level of price contagion exists among the provinces from the same Autonomous Community. Nevertheless, this evidence is rather weak. At this point, and given that the period of analysis presents certain peculiarities because of the housing boom, future analysis of regional house price club convergence in Spain, for a longer sample period, should emphasise the characteristics of regional clubs.
Results from an ordered logit model suggest that differences in population growth, size of the rental market, initial house supply and geographical situation have played a crucial role in determining club membership over the period 1995:Q1–2007:Q4. In fact, those provinces with larger population growth were more likely to belong to a club with a higher house price converging level. In addition, the results confirm that the housing boom was much more pronounced in coastal provinces, mainly on the Mediterranean coast. However, provinces where the rental market was relatively large and the initial house supply was large were less likely to belong to a club with a higher house price converging level; in these provinces the impact of the housing boom was far more moderate.
From a policy point of view, the results concerning the size of the rental market are quite interesting. Compared with other countries in the EU, the rental market in Spain is far less developed. In Spain, owner-occupancy rates are around 82% and the rental share is around 11%, while in the EU as a whole 29% of all dwellings belong to the rental market. Apart from the fact that in Spain property ownership is widely viewed as superior to renting, almost as a matter of social status, the historical housing policy could be largely responsible for the small size of the rental sector. In particular, fiscal deductions and incentives, along with a relatively large subsidised owner-occupancy sector, have favoured home ownership over renting (López-García, 1992, 2001, 2003; Ortega et al., 2011). Thus, a housing policy that encourages rental housing could protect the housing market in Spain from large housing inflation rates such as those experienced from the mid 1990s to 2007. Also, large population growth could be considered a sign of a possible future large increase in house prices.
Lastly, an important issue to be raised is that convergence clubs do not coincide with commonly defined regions in Spain. Although we find a positive correlation between being in the same Autonomous Community and the probability of belonging to the same club, the evidence is not robust. Thus, any analysis of housing markets or any housing market policy at the NUTS2 regional level may not be appropriate. Instead, defining housing markets by convergence clubs may lead to improved results, since provinces in the same club share some common characteristics and similar house price dynamics. Once more, we must remark that further research is clearly needed on house price club convergence for a longer sample period, since our results are obtained for a very particular period.
Footnotes
Appendix
Table A.2. Variable definitions and sources for the ordered logit model.
| Variable | Corr. | Mean | Definition | Source |
|---|---|---|---|---|
| Total pop. growth | 0.64 | 0.32 | Annual growth rate (%) of total population, 1990s average. | Population Census, INE |
| Native pop. growth | 0.66 | 0.08 | Annual growth rate (%) of native population, 1990s average. | Population Census, INE |
| Foreign pop. growth | 0.08 | 19.10 | Annual growth rate (%) of immigrant population, 1990s average. | Population Census, INE |
| College population | 0.08 | 10.93 | Pop. with college degree aged 25 and above over total population (%), 1995. | Survey of Working Population (EPA), INE |
| Rental market | 0.12 | 13.25 | Number of rental houses over owned houses (%), 1991 and 2001 average. | Population and Housing Census (1991, 2001), INE |
| Income growth | 0.39 | 3.49 | Annual growth rate (%) of real GDP per capita, 1995–2000 average. | Regional Accounts, INE |
| House stock | −0.36 | 25.61 | House stock per 100 inhabitants, 1995. | Spanish Ministry of Public Works |
| Vacant homes | −0.38 | 14.43 | Number of vacant homes over total stock (%), 1991 and 2001 average. | Population and Housing Census (1991, 2001), INE |
| Built homes | 0.40 | 139.74 | Built homes between 1995 and 2007 per 100 inhabitants in 1995. | Spanish Ministry of Public Works |
| Coast | 0.40 | 0.44 | Dummy variable: 1 = coastal province and 0 = otherwise. | Spanish Ministry of Public Works |
Notes: Correlation: correlation coefficient between each variable and club membership. INE: Instituto Nacional de Estadística.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
