Abstract
This article checks for the robustness of the estimate of the impact of market access (MA) on the regional variability of human capital, derived from the New Economic Geography literature. The hypothesis is that the estimate of the coefficient of the measure of MA is actually capturing the effect of regional differences in the industrial mix and the spatial dependence in the distribution of human capital. Results for the Spanish provinces indicate that the estimated impact of MA vanishes and becomes nonsignificant once these two elements are included in the empirical analysis.
Introduction
Contributions to the literature in the last decades have shown that regional disparities are associated with differences in the endowment of some socioeconomic characteristics in each region. Among them, human capital, and in particular the educational attainment of the population, has been claimed to be an important ingredient of differences in regional economic growth. Endogenous growth models stress that human capital is the element that stimulates the diffusion of knowledge and technological development. Lucas (1988) and Romer (1990) emphasize the importance of human capital for explaining why some economies are more developed than others. In this sense, Barro and Sala-i-Martin (2004) also consider human capital an important factor for explaining economic convergence across countries and across regions.
From a regional perspective, human capital has been shown to be a key ingredient for regional growth in different economies (e.g., Rodríguez-Pose and Vilalta-Bufí 2005; Di Liberto 2008; López-Bazo and Moreno 2008; Bronzini and Piselli 2009). Hence, improving our knowledge of the determinants of the spatial distribution of human capital will contribute to a better understanding of the origin of regional inequality in productivity, income per capita and, thus, long-run welfare. This is of particular interest in the case of Spain, as the sustained growth experienced by the Spanish economy from the country’s accession to the European Union (EU) in the mid-eighties until the financial crisis at the end of the last decade did not cause a significant decrease in regional disparities in the main macroeconomic indicators (e.g., Cuadrado, Garcia-Greciano, and Raymond 1999; De la Fuente 2002). This is particularly true for human capital: despite the continuous increase in the level of schooling in the last decades, the Spanish provinces today still show marked differences in the endowment of this type of capital.
The New Economic Geography (NEG) has suggested a connection between the endowment of human capital in each economy and the spatial distribution of economic activity. Initially, the two-sector model of Krugman (1991) and the more recent augmented model of Fujita, Krugman, and Venables (1999) focused just on the location of production and, hence, on the distribution of economic growth among localities. From these types of models, it is possible to derive a relationship between the spatial concentration of economic activity and factor prices. Specifically, wages are associated with the so-called market access (MA), which is the distance-weighted sum of the purchasing power of the system of economies. The model predicts that locating in high MA areas will allow firms to pay higher wages to their workers, since it allows them to face lower transport costs and cost savings from large-scale production. The existing empirical evidence supports the prediction of the theoretical model, given that results confirm a strong and significant impact of MA on wages, proxied by per capita income, both for samples of countries and regions (e.g., Redding and Venables 2004; Breinlich 2006).
However, the understanding of the endogenous accumulation of factors of production was not investigated in detail in these seminal papers. That is to say, the earlier contributions to the NEG analyzed the spatial distribution of economic activity without paying particular attention to the impact of agglomeration on the accumulation of the factors supposed to determine economic growth. It is more recently that the accumulation of human capital was endogenized within the framework of an NEG model by Redding and Schott (2003). Under the assumption that the endowment of human capital will be larger in the areas offering higher returns to this factor, the model predicts a higher endowment of such capital in economies with a better access to markets and suppliers. This is so because in the model, the relative wage of skilled labor, and thus the economic incentive to invest in human capital, increases with MA and supply access (SA).
Following an empirical strategy similar to that of studies checking for the relationship between wages and MA, Redding and Schott (2003) provided evidence supporting the positive impact of MA on human capital for a sample of countries. In the same vein, López-Rodríguez, Faiña, and López-Rodríguez (2007) tested this hypothesis for a sample of EU regions, obtaining a positive and significant correlation between MA and different measures of educational attainment. However, these two exercises can be criticized, as the empirical specification they used does not control for factors that are also likely to affect the spatial distribution of human capital. In fact, López-Rodríguez (2007) checked for the robustness of the estimate of the impact of MA in the case of the EU regions. He showed that the estimated impact decreased markedly (to less than one-third, from around 0.9 to 0.3) but remained significant when additional control variables were included (employment in high-tech sectors, labor productivity, number of patents, and a dummy variable accounting for peripherality). Redding and Schott (2003) also included in the regression indicators thought to be important in cross-country studies of development (the risk of expropriation by the government, the percentage of countries’ land that is tropical, and dummies for socialist rule and external wars). The estimate of the impact of MA was halved (from about 0.6 to 0.3) and was significant only at the 5 percent level.
In this article, we aim to contribute to the robustness checking of the MA–human capital relationship in a regional setting. It is our belief that the estimate of the coefficient of the measure of MA is actually capturing the effect of regional differences in the industrial mix and the spatial dependence in the distribution of human capital. Our hypothesis is that the omission of such factors in previous studies biases the estimate of the coefficient associated with MA. Specifically, this will be the case if, as expected, the sectoral composition of each region is correlated with the measure of MA, and if this measure is capturing at least part of the spatial dependence that is likely to characterize the regional distribution of human capital. Niebuhr (2006) and Kosfeld and Eckey (2008) raised a similar criticism in the case of the relationship between wages and MA. In particular, Niebuhr (2006) showed that controlling for additional conditioning variables decreases the power of MA in explaining regional wages. Alternatively, the exercise in this article can be viewed as an attempt to shed more light on the origin of such a link. From this perspective, we provide evidence suggesting that the industrial mix and the spatial distribution of human capital are consistent candidates to explain at least part of the positive association between human capital and MA.
It can be argued that the regional distribution of human capital is also likely to be shaped by the spatial distribution of amenities (e.g., Glaeser, Kolko, and Saiz 2001; Florida 2002), and therefore that amenities must be included in our empirical analysis. 1 However, Storper and Scott (2009) have recently questioned the contribution of amenities, stressing that human capital is far more conditioned by economic factors, such as employment opportunities in each region, than by amenities. In any case, it must be stressed that in this piece of research we do not aim at disentangling the relative contribution of alternative theories, but simply at checking the robustness of previous results supporting the link between human capital and MA derived from the NEG model.
We test our hypothesis using data for the set of Spanish provinces. First, in the second section, we study the dispersion of human capital among the provinces of Spain, using the average years of schooling as a proxy for the endowment of human capital. The results of the spatial descriptive analysis confirm remarkable regional disparities and strong spatial dependence in the distribution of human capital. Next, we estimate the coefficient of a simple specification, which reveals the positive and significant effect of MA on the measure of human capital. The theoretical arguments from the NEG that support these empirical results are sketched in the third section, while in the fourth section we discuss the effect of not controlling for regional differences in the sectoral composition and for spatial dependence. Based on these arguments, the original NEG specification is augmented and the estimates obtained with alternative specifications are compared with those originally obtained from the baseline model. In the fifth section, we provide an extensive robustness check, including results from panel data estimators with and without spatial effects, and from an alternative measure of human capital that is closer to the theoretical magnitude in the NEG model. Finally, the sixth section concludes.
The Geography of Human Capital in Spain
Preliminary Evidence
The evidence provided in this article was obtained from data for the forty-seven continental provinces in Spain, for the period between 1995 and 2007. 2 We use the average years of schooling of the working population living in each province as a traditional measure of the endowment of human capital. The source of the data is the Instituto Valenciano de Investigaciones Económicas (IVIE)–Bancaja Human Capital data set for Spain (see Serrano and Soler 2008 for the description of the methodology used to build the data set). 3 In a first step, we focus the analysis on the cross sections for the initial and final years in the period under analysis. This is so because we want to maximize the comparability of the results with those provided by previous studies that have only exploited the cross-section dimension. However, in the last section of results, we also provide results obtained using the entire yearly panel data set, which allows controlling for unobservable regional effects that are likely to affect the link between MA and human capital.
The spatial distribution of the average years of schooling in Spain for 1995 and 2007 is depicted in Figure 1. The maps confirm the existence of marked differences in the endowment of human capital across the Spanish provinces and show how they persist over time despite the increase in the endowment for all the provinces. However, the most interesting feature for our analysis is that there is a geographical pattern in the dispersion of human capital with, broadly speaking, higher levels in the Northern and lower levels in the Southern provinces. Again, this pattern seems to persist despite the general increase in the level of education over the period under analysis.

Spatial distribution of the average years of schooling in Spain. Source: IVIE–Bancaja Human Capital Data set.
As stated above, the prediction of the NEG is that a great deal of the spatial pattern of the distribution of human capital in the Spanish provinces has to do with the geographic location of each province. Geography, location or, in other words, relative remoteness can be proxied by the MA measure suggested initially by Harris (1954). As discussed in this seminal contribution and later revisited by influential NEG models, MA can be proxied by the distance-weighted sum of the purchasing power of the economies. Therefore, the MA of a province in Spain will be positively associated with the purchasing power of the remaining provinces but negatively related to the distance between them:
$$MA_i = \sum_{j} \frac{Y_j}{D_{ij}} \qquad (1)$$
where $Y_j$ is the gross value added in province $j$ and $D_{ij}$ is the distance by road between each pair of provinces $i$ and $j$. 4
The internal distance of each province is calculated following the suggestion in Head and Mayer (2006), that is, $D_{ii} = \frac{2}{3}\sqrt{Area_i/\pi}$, where $Area_i$ is the area of province $i$.

Spatial distribution of (the log of) market access in Spain. Source: Spanish National Institute for Statistics (INE) and authors’ own calculations.
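The construction of the MA measure can be sketched in a few lines of code. The example is only illustrative: the three provinces, their gross value added, road distances, and areas are made-up numbers, the function names are hypothetical, and the internal distance uses the disk-based approximation (two-thirds of the radius of a circle with the same area as the province) commonly associated with Head and Mayer, which is an assumption about the exact formula applied.

```python
import math

def internal_distance(area_km2):
    """Approximate within-province distance as (2/3) * sqrt(area / pi).

    Assumption: disk-based approximation commonly associated with
    Head and Mayer; the exact formula used in the article may differ.
    """
    return (2.0 / 3.0) * math.sqrt(area_km2 / math.pi)

def market_access(gva, road_dist, areas):
    """Harris-type market access: MA_i = sum_j Y_j / D_ij.

    gva       -- gross value added Y_j, one value per province
    road_dist -- square matrix of road distances D_ij (diagonal is ignored)
    areas     -- province areas, used only for the own-province term D_ii
    """
    n = len(gva)
    ma = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            d = internal_distance(areas[i]) if i == j else road_dist[i][j]
            total += gva[j] / d
        ma.append(total)
    return ma

# Hypothetical three-province example (made-up numbers, not Spanish data).
gva = [100.0, 50.0, 20.0]
road_dist = [[0.0, 10.0, 40.0],
             [10.0, 0.0, 25.0],
             [40.0, 25.0, 0.0]]
# Areas chosen so that the internal distances are 2, 4 and 8.
areas = [9.0 * math.pi, 36.0 * math.pi, 144.0 * math.pi]

ma = market_access(gva, road_dist, areas)   # approximately [55.5, 23.3, 7.0]
log_ma = [math.log(v) for v in ma]          # the log of MA is what is mapped
```

In this toy case the province with the largest own market and the shortest distances to the others obtains the highest MA, which is the sense in which the measure captures centrality as opposed to remoteness.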
As for the relationship between the distribution of human capital and that of MA, the comparison of Figures 1 and 2 reveals a connection between the two magnitudes, which is, however, far from perfect. In general terms, provinces with large endowments of human capital are not in the economic periphery, while regions in the periphery tend to be those with the lower endowments. But figures for some provinces contradict this general statement in both cases. This is confirmed by the information depicted in Figure 3. In both years, there exists a positive relationship between human capital and market access, but the amount of dispersion in such relationship is far from negligible. For example, it can be observed that there are provinces with similarly low values of MA that have rather different endowments of human capital. In addition, the distribution of both magnitudes is likely to be characterized by spatial dependence, something that must also be considered when analyzing formally the impact of MA on the endowment of human capital.

Human capital and market access in the Spanish provinces. Source: Institute for Statistics (INE), IVIE-Bancaja Human Capital Dataset, and authors’ own calculations.
Estimation of the Baseline Model
As a first step in our study of the robustness of the estimated impact of MA on the spatial distribution of human capital, we estimate a simple specification that will be used as a benchmark:
$$\ln HK = \alpha + \beta \ln MA + \epsilon \qquad (2)$$
where $HK$ denotes the column vector with values of the average years of schooling in the economies under analysis, $MA$ is the corresponding vector of market access, and $\epsilon$ is assumed (so far) to be a well-behaved error term. $\beta$ is the parameter that captures the impact of MA on human capital.
The ordinary least squares (OLS) estimates of the parameters in equation (2) for the two years under analysis are reproduced in the first two columns of Table 1. 5 Results are in agreement with the scatterplots in Figure 3, confirming the existence of a positive correlation between the two variables. Hence, the evidence from Spain also indicates that provinces with higher MA are endowed with higher levels of human capital. In other words, remoteness works against the incentive to accumulate human capital. The impact of MA was, however, decreasing over the period analyzed, as shown by the lower estimate of the coefficient in 2007 with respect to that for 1995.
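A minimal sketch of how the benchmark coefficient could be computed is given next. It assumes that the specification is linear in the logs of both variables, and it uses hypothetical data constructed so that the elasticity equals 0.1; neither the numbers nor the function name come from the article.

```python
import math

def ols_loglog(hk, ma):
    """OLS of ln(HK) on a constant and ln(MA).

    Returns (alpha, beta, se_beta), where beta is the elasticity of
    human capital with respect to market access.
    """
    y = [math.log(v) for v in hk]
    x = [math.log(v) for v in ma]
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / (n - 2)   # unbiased error variance
    se_beta = math.sqrt(sigma2 / sxx)
    return alpha, beta, se_beta

# Hypothetical data: ln(HK) = 2 + 0.1 * ln(MA) holds exactly by construction.
ma_example = [10.0, 20.0, 40.0, 80.0, 160.0]
hk_example = [math.exp(2.0) * m ** 0.1 for m in ma_example]

alpha_hat, beta_hat, se_beta = ols_loglog(hk_example, ma_example)
```

Because the example data contain no noise, the estimated slope recovers the constructed elasticity and the standard error is essentially zero; with real provincial data the residuals would also be the input for the diagnostic tests discussed below.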
To account for the possibility of a different response to the domestic MA of the provinces bordering France and Portugal, the table also includes two additional columns that summarize the results obtained when the benchmark model distinguishes provinces sharing a border with these countries from those that do not. 6 We included two dummy variables that equal one for provinces sharing a border with one of the above-mentioned countries and zero otherwise. In addition, the interactions of the border dummies with the MA variable were included in the regression to allow for a different impact of MA. Results suggest that there are no significant differences between provinces sharing a border with France or Portugal and those that do not. The only difference that deserves to be mentioned is the slight decrease in the estimated coefficient of MA in 1995, from 0.113 in the original specification to 0.093 in the one including the controls for the bordering provinces.
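For concreteness, the specification with border controls can be written as follows. This is only a sketch consistent with the verbal description: the symbols $B^{FR}_i$ and $B^{PT}_i$ for the two border dummies and $\gamma_c$ and $\delta_c$ for their coefficients are notation assumed here, and the log-linear form is also an assumption.

$$\ln HK_i = \alpha + \beta \ln MA_i + \sum_{c \in \{FR,\,PT\}} \left( \gamma_c B^{c}_i + \delta_c B^{c}_i \ln MA_i \right) + \epsilon_i$$

Under this formulation, the impact of MA is $\beta$ for interior provinces and $\beta + \delta_c$ for provinces bordering country $c$, so that the absence of significant differences corresponds to not rejecting $\gamma_c = \delta_c = 0$.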
Results of the Estimation of the Baseline Model.
Note: AIC = Akaike information criterion; ln L = logarithm of the likelihoods; LM-ERR= Lagrange Multiplier test for spatial error dependence; LM-LAG= Lagrange Multiplier test for spatial lag dependence.
*, **, and ***Represent significance at 10 percent, 5 percent, and 1 percent, respectively.
Standard errors for coefficient estimates are in parentheses. p Values for the statistics are in brackets.
Table 1 also includes results for some diagnostic checks. While the Breusch–Pagan test indicates that there are no symptoms of heteroscedasticity in any of the estimated baseline models, the battery of spatial dependence tests reveals that the baseline human capital–MA model is likely to be (spatially) misspecified. The results of these spatial dependence tests will be discussed in detail in the fourth section, as they will motivate our claim for the estimation of a spatial specification of the human capital–MA model. Before that, we frame the results of the baseline specification within an NEG model extended to account for the endogenous accumulation of human capital in each region.
NEG’s Explanation: Human Capital and Geography
Not only are the findings of the previous section intuitively reasonable, but the NEG framework also allows the link between human capital and remoteness to be derived quite straightforwardly from a theoretical perspective. The models of Krugman (1991) and Fujita, Krugman, and Venables (1999) did not include the accumulation of human capital. It is in Redding and Schott (2003) that an endogenous mechanism for the accumulation of human capital was considered, which in conjunction with standard arguments of the NEG gave rise to a reduced form linking the skill wage premium in every economy to its MA and SA. 7
Next, we briefly sketch the main elements of the model in Redding and Schott (2003), stressing the derivations that support the empirical specification in equation (2). 8 The economy is composed of $R$ regions, indexed by $i \in \{1, \dots, R\}$. There are $L_i$ consumers in each region, each having one unit of labor. This unit of labor is initially unskilled. Individuals choose endogenously whether or not to invest in becoming skilled. Consumer preferences are identical and homothetic, defined over the consumption of agricultural and manufacturing goods. The agricultural sector produces under constant returns to scale, while the manufacturing industry operates with increasing returns to scale.
The critical part of the model is constructed over the individuals’ human capital investment choice, which is formulated as:
where
Hence,
Next, Redding and Schott (2003) make use of the NEG framework to link relative wages to the geography of economic activity. The wage equation is derived from the equilibrium in the manufacturing sector (zero-profit condition):
where α, β, and (1 − α − β) are the factor shares of skilled workers, unskilled workers, and intermediate goods, respectively, σ represents the elasticity of substitution, $c_i$ denotes the marginal input requirement, and $G_j$ is the price index for manufacturing goods. On the right-hand side of equation (5), $E_j$ represents the total consumption of manufacturing goods in region $j$, whereas
Defining $MA_i$ and $SA_i$ of region $i$ as:
the wage equation can be written as:
where ξ absorbs constant terms. Therefore, the wage equation can be expressed as a function of MA and SA. Manufacturing firms in regions with easy access to the market and to suppliers can increase the maximum wages that they can afford to pay.
Combining the zero-profit conditions of the constant returns to scale sector (agriculture) and of manufacturing 9 with the skill indifference condition in equation (4), Redding and Schott (2003) are able to characterize the equilibrium relationship between geographical location and endogenous human capital investments. Taking logarithms and totally differentiating each profit condition results in:
From these expressions, it can be deduced that if a region becomes remote (decreasing its access to the market and to suppliers) and assuming that manufacturing production is skill intensive, then the new equilibrium must be such that the relative wage of skilled workers is lower. 10 Turning back to the critical ability condition, this decline in the relative wage of skilled workers means a lower incentive to invest in human capital. Accordingly, the number of skilled workers is also expected to decline in that region. This is the argument that supports the connection between the spatial distribution of human capital and MA in equation (2), as the relative wage of skilled workers is predicted to be lower in the remote regions and, hence, the critical level of ability (
Missing Links: Sectoral Composition and Spatial Dependence
The NEG model by Redding and Schott (2003) sketched in the previous section provides a theoretical justification for the empirical evidence reported in the second section, namely that the endowment of human capital is higher in some specific locations (the core economic provinces in Spain) and lower in the peripheral areas. However, the baseline model in equation (2) does not account for other potential determinants of the process of accumulation of human capital at the regional level. Actually, the theoretical model also includes other mechanisms that affect the critical level of ability. Besides the impact of MA and SA, the supply of skilled workers is monotonically decreasing in the level of productivity in the constant returns to scale sector, in the cost parameter of manufacturing production ($c_i$), and in the cost of education ($h_i$). On the other hand, technology transfers to a less developed region $i$ reduce its $c_i$, raising the maximum wage that its manufacturing firms can afford to pay to skilled and unskilled workers, given its current MA and SA. Since manufacturing is skill intensive, this causes an increase in the relative wage of skilled workers and thus a higher endowment of human capital. 11 For that reason, empirical specifications such as that in equation (2), which do not include variables proxying for these other factors, are likely to produce biased estimates of the impact of MA (and SA) on human capital.
This concern has already been pointed out in the recent empirical literature investigating the impact of MA on the dispersion of regional wages. For instance, Breinlich (2006) controls for the direct distance between the capital of each region and Luxemburg, as the economic activity center of Europe, and for human and physical capital stocks in his study of the relationship between regional wages and MA. Similarly, Niebuhr (2006) and Kosfeld and Eckey (2008) mention that MA’s impact on wage dispersion can be influenced by the sectoral composition of the labor force and by spatial dependence.
However, despite the arguments derived from the theoretical model, and the evidence from studies focusing their attention on wages and MA, López-Rodríguez, Faiña, and López-Rodríguez (2007) control only for the direct distance between each region and Luxemburg in their analysis of the link between human capital and MA across the EU regions. Interestingly, in a closely related article, López-Rodríguez (2007) showed that the estimate of the impact of MA remains significant (although decreasing in size) when some other variables are included in the model. In sharp contrast, in the rest of this section, we show how simply controlling for the industrial mix (as a rough proxy for the factors described above) and for spatial dependence (which is also likely to account for the impact of some of these factors) in the baseline human capital equation modifies the conclusion on the impact of MA on the regional distribution of human capital.
Sectoral Composition
It is well known that different economic activities tend to demand workers with distinct education levels. Accordingly, our hypothesis is that the industrial mix affects the regional distribution of human capital, as some sectors are skill intensive while others employ low-skilled workers. In the case of the Spanish provinces, there are strong disparities in the share of each sector in the economy. Therefore, we expect provinces specialized in some particular industries to show higher endowments of human capital. As an example, Figure 4 maps the spatial distribution of the employment share of the manufacturing and the nonmarket service sector. 12 The picture revealed by the maps is already well known: the manufacturing sector is more important in the Northeast of the country (along the Mediterranean coast and the Ebro Valley), and in some provinces in the center. Meanwhile, the share of the nonmarket service sector is higher in the Southwest and in Madrid, the province of the capital city.

Spatial distribution of the employment shares in manufacturing and nonmarket services in Spain (percentage of total employment). Source: Institute for Statistics (INE).
A regression of (the log of) the years of schooling on the sectoral employment shares for the two years under analysis confirms the role played by the industrial mix in explaining the variation across provinces in the endowment of human capital. In 1995, the adjusted R² of such a regression equals .57, with an F statistic for the joint significance of the coefficients of the sectoral shares of 13.40, and a p value of 0. In 2007, the explanatory power is even higher, with an adjusted R² of .63, and an F statistic of 16.89 (p value = 0). Therefore, our view is that variability in the industrial mix must not be excluded from the empirical model. Actually, results at the end of this section confirm that its omission biases the estimate of the impact of MA.
Spatial Dependence
Our second concern is related to spatial dependence. An exploratory spatial data analysis (ESDA) reveals that the human capital indicator is characterized by significant spatial dependence. Global spatial autocorrelation has been tested by means of Moran’s I (see, for instance, Anselin 1993):
$$I = \frac{n}{s} \, \frac{\sum_{i}\sum_{j} w_{ij} z_i z_j}{\sum_{i} z_i^2}$$
where $n$ represents the number of provinces, $z$ is the standardized value of the variable under analysis, $s$ is the sum of all the elements in the weight matrix, and $w_{ij}$ is the generic element of $W$, a row-standardized spatial weight matrix defined as:
Two different matrices of spatial weights have been used in the ESDA. First, a contiguity weight matrix, where $w_{ij} = 1$ if provinces $i$ and $j$ are neighbors, and $w_{ij} = 0$ otherwise. Next, an inverse distance weight matrix with elements defined by:
where
The first two rows of Table 2 reproduce the results for the Moran’s test for the average years of schooling in the two years under analysis, and using the two weight matrices. In all cases, the null of absence of spatial dependence in the human capital variable is strongly rejected. A more detailed analysis, by means of the computation of measures of local spatial dependence, reveals a clear North–South divide, where hot spots of high endowments of human capital appear in the North, and groups of provinces with much lower endowments appear in the South (Moran’s Scatterplot in Figure 5).
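The computation of the global statistic can be illustrated with a short sketch. It assumes a row-standardized contiguity matrix and uses a hypothetical layout of four provinces placed along a line, with made-up values; since the statistic is a ratio, dividing the deviations from the mean by the standard deviation would cancel out, so the code works directly with deviations.

```python
def row_standardize(w):
    """Scale each row of a weight matrix so that it sums to one."""
    out = []
    for row in w:
        s = sum(row)
        out.append([v / s if s > 0 else 0.0 for v in row])
    return out

def morans_i(values, w):
    """Moran's I = (n / s) * sum_ij w_ij z_i z_j / sum_i z_i^2.

    values -- the variable under analysis, one value per province
    w      -- spatial weight matrix (s is the sum of all its elements)
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]   # deviations; the scale cancels in the ratio
    s = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s) * num / den

# Hypothetical example: four provinces on a line, each a neighbor of the next.
contiguity = [[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]]
w_line = row_standardize(contiguity)

i_gradient = morans_i([1.0, 2.0, 3.0, 4.0], w_line)       # smooth gradient
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0], w_line)  # checkerboard
```

A smooth gradient, as in the North–South pattern described above, yields a positive value (0.4 in this toy case), whereas alternating high and low neighbors yield a negative one (here −1). Inference in the article relies on the standardized statistic reported in Table 2, which this sketch does not compute.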

Moran scatterplots for the average years of schooling in Spain. Source: IVIE-Bancaja Human Capital data set and authors’ own calculations.
Results of the Global Spatial Autocorrelation Test (Moran’s I).
Note: *** represents significance at 1 percent. Standard errors are in parentheses.
A similar analysis for the MA variable shows that its spatial distribution is far from random as well. As shown in the last two rows of Table 2, the Moran’s I test clearly rejects the null hypothesis of absence of spatial dependence in both years and for the two weight matrices. However, the contribution of each area to the global spatial dependence differs from the one observed for the human capital indicator. Values of the local Moran’s I in Figure 6 reveal that there is no clear North–South pattern in this case. Instead, there seems to be a sort of East–West divide that, in any case, does not match the structure of dependence observed for the average years of schooling. As a consequence, we should not expect MA to account for the pattern of spatial dependence detected in the human capital indicators in a regression such as that in our baseline specification. On the contrary, spatial autocorrelation is likely to be present in the residuals of the OLS estimation of equation (2). This is confirmed by the results of the Moran’s I and the battery of Lagrange Multiplier (LM) tests of spatial dependence reported in Table 1. The tests conclude in favor of the presence of significant residual spatial dependence, either nuisance or substantive, which means that the OLS estimator would provide an inefficient and even biased estimate of the coefficient that summarizes the relationship between human capital and MA.
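As an illustration of how residual spatial dependence can be tested, the sketch below computes the standard LM statistic for spatial error dependence, $LM_{ERR} = (e'We/\hat{\sigma}^2)^2 / \mathrm{tr}(W'W + W^2)$, which is asymptotically chi-square with one degree of freedom under the null. It is assumed, not confirmed by the text, that the LM-ERR entry in Table 1 follows this textbook form; the residuals and the weight matrix in the example are hypothetical.

```python
import math

def lm_error(resid, w):
    """LM test for spatial error dependence in OLS residuals.

    resid -- OLS residuals, one per province
    w     -- spatial weight matrix
    Returns (statistic, p_value) using the chi-square(1) distribution.
    """
    n = len(resid)
    sigma2 = sum(e * e for e in resid) / n               # ML error variance
    we = [sum(w[i][j] * resid[j] for j in range(n)) for i in range(n)]
    e_we = sum(resid[i] * we[i] for i in range(n))        # e'We
    # tr(W'W + W W) = sum_ij w_ij^2 + sum_ij w_ij * w_ji
    t = sum(w[i][j] ** 2 + w[i][j] * w[j][i]
            for i in range(n) for j in range(n))
    stat = (e_we / sigma2) ** 2 / t
    p_value = math.erfc(math.sqrt(stat / 2.0))           # chi-square(1) upper tail
    return stat, p_value

# Hypothetical residuals for four provinces on a line (row-standardized weights).
w_example = [[0.0, 1.0, 0.0, 0.0],
             [0.5, 0.0, 0.5, 0.0],
             [0.0, 0.5, 0.0, 0.5],
             [0.0, 0.0, 1.0, 0.0]]
resid_example = [-1.5, -0.5, 0.5, 1.5]

lm_stat, lm_p = lm_error(resid_example, w_example)
```

With only four observations the statistic is small and the null is not rejected, which is unsurprising; the point of the sketch is the construction of the test, not the toy result.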

Moran scatterplots for the (log of the) market access in Spain. Source: Institute for Statistics (INE) and authors’ own calculations.
Results of the Tests for the Joint Significance of the Sectoral and Spatial Coefficients.
Note: SAR = spatial autoregressive model; ERR = spatial error model.
Values of the likelihood ratio test for the significance of the sectoral composition variables and/or the spatial effects.
** and *** represent significance at 5 percent and 1 percent, respectively.
p Values for the statistics are in brackets.
Extended Empirical Specification
Considering the descriptive evidence provided so far, and the role played by the other elements in the theoretical model described in the third section, it is our belief that the empirical specification used for testing the connection between human capital endowments and MA must account for regional differences in the industrial mix and for spatial dependence. In the rest of this section, we show the effect of neglecting both phenomena in the case of the Spanish provinces.
As a first step, the baseline specification is augmented to control for the sectoral composition of each region:
$$\ln HK = \alpha + \beta \ln MA + SE\,\varphi + \epsilon \qquad (12)$$
where $SE$ is a matrix whose columns correspond to the share of each sector in total employment, excluding one (agriculture) to avoid perfect collinearity. $\varphi$ is the corresponding vector of parameters associated with the effect of the sectoral composition.
Next, two specifications have been considered to control for spatial dependence: the spatial autoregressive model (SAR):
$$\ln HK = \alpha + \rho W \ln HK + \beta \ln MA + SE\,\varphi + \upsilon \qquad (13)$$
and the spatial error model (SEM): 13
$$\ln HK = \alpha + \beta \ln MA + SE\,\varphi + \epsilon, \qquad \epsilon = \lambda W \epsilon + \upsilon \qquad (14)$$
where $\rho$ and $\lambda$ are the spatial coefficients, and $\upsilon$ is a well-behaved error term.
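To illustrate how a spatial lag model of the form y = ρWy + Xβ + υ can be estimated by maximum likelihood, the sketch below maximizes the concentrated log-likelihood, −(n/2) ln σ̂²(ρ) + ln|I − ρW|, over a grid of values of ρ. Several simplifications are assumptions of this example and not features of the article: a single regressor plus a constant instead of MA and the sectoral shares, a ring-shaped neighborhood structure, simulated data, and a plain grid search in place of a numerical optimizer.

```python
import math
import random

def log_det(a):
    """log|det(a)| by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] for row in a]
    total = 0.0
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[p] = m[p], m[k]
        pivot = m[k][k]
        total += math.log(abs(pivot))
        for r in range(k + 1, n):
            f = m[r][k] / pivot
            if f != 0.0:
                for c in range(k + 1, n):
                    m[r][c] -= f * m[k][c]
    return total

def ols_residuals(y, x):
    """Residuals from regressing y on a constant and one regressor x."""
    n = len(y)
    xb, yb = sum(x) / n, sum(y) / n
    sxx = sum((xi - xb) ** 2 for xi in x)
    b = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / sxx
    a = yb - b * xb
    return [yi - a - b * xi for xi, yi in zip(x, y)]

def sar_rho_ml(y, x, w, step=0.01):
    """Grid-search ML estimate of rho in y = rho*W*y + a + b*x + u."""
    n = len(y)
    wy = [sum(w[i][j] * y[j] for j in range(n)) for i in range(n)]
    e0 = ols_residuals(y, x)     # y purged of the regressors
    el = ols_residuals(wy, x)    # W*y purged of the regressors
    best_rho, best_ll = None, -math.inf
    k = int(round(0.99 / step))
    for g in range(-k, k + 1):
        rho = g * step
        ssr = sum((u - rho * v) ** 2 for u, v in zip(e0, el))
        a_mat = [[(1.0 if i == j else 0.0) - rho * w[i][j] for j in range(n)]
                 for i in range(n)]
        ll = -0.5 * n * math.log(ssr / n) + log_det(a_mat)  # constants dropped
        if ll > best_ll:
            best_rho, best_ll = rho, ll
    return best_rho

def simulate_sar(rho, x, w, noise, iters=500):
    """Solve y = rho*W*y + 1 + x + noise by fixed-point iteration (|rho| < 1)."""
    n = len(x)
    base = [1.0 + x[i] + noise[i] for i in range(n)]
    y = base[:]
    for _ in range(iters):
        y = [base[i] + rho * sum(w[i][j] * y[j] for j in range(n))
             for i in range(n)]
    return y

# Simulated example: 25 units on a ring, two neighbors each, true rho = 0.5.
n_units = 25
w_ring = [[0.0] * n_units for _ in range(n_units)]
for i in range(n_units):
    w_ring[i][(i - 1) % n_units] = 0.5
    w_ring[i][(i + 1) % n_units] = 0.5
rng = random.Random(0)
x_sim = [rng.gauss(0.0, 1.0) for _ in range(n_units)]
noise_sim = [rng.gauss(0.0, 0.01) for _ in range(n_units)]
y_sim = simulate_sar(0.5, x_sim, w_ring, noise_sim)
rho_hat = sar_rho_ml(y_sim, x_sim, w_ring)
```

The log-determinant term is what distinguishes the ML estimator from simply adding Wy as a regressor in OLS, which would be inconsistent because the spatial lag is correlated with the error term. The SEM can be handled along similar lines by applying the filter I − λW to both sides of the regression.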
The results of the estimation of the parameters in equation (12) are reported in the first two columns of Table 3, while those for the spatial models in equations (13) and (14) are shown in the subsequent columns of that table. 14 As for the impact of the inclusion of variables conditioning for the industrial mix, the results in Table 3 are quite clear. The magnitude of the coefficient associated with MA decreases for the two years under analysis. Actually, the nonsignificance of the effect of MA on the human capital endowment cannot be rejected at the usual significance levels in 2007, while in 1995 the effect is significant only at the 5 percent level. This finding confirms our concerns about the importance of the inclusion of a proxy for regional differences in the sectoral composition.
Results of the Estimation of the Model Including Controls for the Sectoral Composition and Spatial Dependence.
Note: AIC = Akaike information criterion; ln L = logarithm of the likelihoods; LM-ERR= Lagrange Multiplier test for spatial error dependence; LM-LAG= Lagrange Multiplier test for spatial lag dependence; SC= Schwarz criterion.
*, **, and *** represent significance at 10 percent, 5 percent, and 1 percent, respectively. Standard errors for coefficient estimates are in parentheses. p Values for the statistics are in brackets.
Nonetheless, the Moran's I test and the LM tests for the models estimated with the sectoral composition controls still reject their null hypotheses of no spatial dependence. That is to say, adding the sectoral composition does not account (at least not fully) for the spatial autocorrelation in the distribution of human capital across the Spanish provinces. Therefore, estimating a spatial specification (the SAR and/or the SEM models) is required to guarantee robust inference on the effect of MA. The last group of columns in Table 3 summarizes the estimation results of the two alternative spatial models, showing that the spatial parameter is strongly significant in all cases and large in magnitude. We have also tested for the joint significance of the coefficients associated with the variables proxying for the sectoral composition and for the spatial effects. The results of the likelihood ratio tests are reproduced in Table 4. In building these tests, the log-likelihoods (ln L) of the appropriate specifications in each case (from Tables 1 and 3) have been used. The null hypothesis that the coefficients are jointly zero is rejected in all cases, confirming that both the sectoral variables and the spatial effects are significant in explaining the variability in the regional distribution of the endowment of human capital.
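The likelihood ratio tests mentioned above compare the log-likelihood of a restricted model with that of an expanded one. The following is a minimal sketch of that computation, not the authors' code: the function names are hypothetical, the log-likelihood values are made up for illustration, and the chi-square tail probability uses the standard closed form for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in chi = sqrt(x)
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2 * pdf * total


def likelihood_ratio_test(ln_l_restricted, ln_l_unrestricted, n_restrictions):
    """Return the LR statistic 2*(lnL_u - lnL_r) and its chi-square p-value."""
    lr = 2.0 * (ln_l_unrestricted - ln_l_restricted)
    return lr, chi2_sf(lr, n_restrictions)


# Hypothetical log-likelihoods (NOT taken from the article's tables).
lr_stat, p_value = likelihood_ratio_test(-10.0, -4.0, n_restrictions=3)
print(f"LR = {lr_stat:.2f}, p-value = {p_value:.4f}")
```

The number of restrictions equals the number of coefficients set to zero in the restricted model, for example the sectoral shares plus the spatial parameter when the baseline is compared with a spatial model that includes the industrial mix.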
In addition, the results of the LM tests of residual spatial dependence (in the SAR model) and of substantive spatial dependence (in the SEM) indicate that these models no longer exhibit significant spatial autocorrelation. As for the effect of MA, the results strongly support our hypothesis: the reduction in its size and significance is even more pronounced when it is estimated accounting for spatial dependence, by means of either the SAR or the SEM specification. Indeed, these results suggest an almost negligible role of MA in explaining regional differences in human capital endowment once the sectoral composition and spatial dependence are accounted for.
It could be argued, however, that MA is likely to be correlated with the proxies for the industrial mix, and also with the spatial lags in equations (13) and (14). As a result, strong collinearity could be causing the nonsignificance of the coefficient of our variable of interest. In other words, part of the explanatory power of MA could be absorbed by the additional control variables in our study. While recognizing this possibility, we would like to stress that the logic of the argument can be reversed, supporting our hypothesis that the result favorable to the NEG arguments in López-Rodríguez, Faiña, and López-Rodríguez (2007) might be (at least partly) due to the omission of a proxy for the sectoral composition and of spatial effects in their analysis. To shed some more light on this issue, we compare the values of the Akaike and the Schwarz information criteria, as statistical measures that can help us select the most appropriate specification. These two measures are reproduced in the bottom panel of Tables 1 and 3. In all cases, the values are lower for the specifications including controls for the sectoral composition and the spatial effects, supporting our claim that inference on the effect of MA on human capital should be based on an expanded model including these two elements.
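The comparison of information criteria can be sketched as below. This assumes the textbook definitions (AIC = -2 ln L + 2k and SC = -2 ln L + k ln n); the software used for the article may scale them differently, and all numerical inputs here are hypothetical rather than values from Tables 1 and 3.

```python
import math


def information_criteria(ln_l, n_params, n_obs):
    """Return (AIC, SC) under the textbook definitions; lower is better."""
    aic = -2.0 * ln_l + 2.0 * n_params
    sc = -2.0 * ln_l + n_params * math.log(n_obs)
    return aic, sc


# Hypothetical inputs: a baseline model and an expanded model on 50 units.
baseline = information_criteria(ln_l=-10.0, n_params=2, n_obs=50)
expanded = information_criteria(ln_l=5.0, n_params=7, n_obs=50)

for name, (aic, sc) in (("baseline", baseline), ("expanded", expanded)):
    print(f"{name}: AIC = {aic:.2f}, SC = {sc:.2f}")
```

Because both criteria penalize the number of parameters, a lower value for the expanded model indicates that the gain in fit more than compensates for the additional controls.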
Robustness Checks
Panel Data
The evidence reported in the previous sections has been obtained from the information contained in the cross section of Spanish provinces in each of the two years analyzed. As mentioned above, this has been the practice so far in studies examining the impact of MA on human capital. However, since yearly data for each province are available for the period between 1995 and 2007, additional information can be obtained by exploiting the panel structure of the data. This type of approach has been applied in a recent contribution by Boulhol and de Serres (2010) to analyze the effect of MA and SA on gross domestic product (GDP) per capita, but as far as we know it has not yet been used to check for the connection between accessibility and human capital. Besides increasing the number of observations, using the panel data set allows us to control for unobservable regional effects that could be shaping the regional distribution of human capital in Spain.
Using the panel data also allows us to define a set of instruments for MA that is used to obtain an instrumental variable estimator of the impact of remoteness on human capital. Although in this case endogeneity is less likely to be a serious issue than in the case of studies dealing with the impact on GDP per capita, this section also provides results on the basis of an IV estimator for panel data. The set of instruments is similar to that proposed in Boulhol and de Serres (2010), obtained by interacting the sum of the distances of each province to all other provinces with the time dummies defined for each year included in the period under analysis.
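The construction of the instruments described above, the sum of distances of each province interacted with year dummies, can be sketched as follows. The function names, province labels, and distances are hypothetical, and the sketch does not reproduce the exact instrument set of Boulhol and de Serres (2010).

```python
def sum_of_distances(dist):
    """Sum of distances from each province to all other provinces."""
    return {p: sum(d for q, d in row.items() if q != p) for p, row in dist.items()}


def build_instruments(total_distance, years):
    """Map (province, year) to a vector with one entry per year dummy.

    Entry t equals the province's total distance when the observation
    belongs to year t and zero otherwise.
    """
    instruments = {}
    for province, distance in total_distance.items():
        for year in years:
            instruments[(province, year)] = [
                distance if year == dummy_year else 0.0 for dummy_year in years
            ]
    return instruments


# Hypothetical pairwise distances (in km) for three made-up provinces.
dist = {
    "A": {"A": 0.0, "B": 100.0, "C": 250.0},
    "B": {"A": 100.0, "B": 0.0, "C": 180.0},
    "C": {"A": 250.0, "B": 180.0, "C": 0.0},
}
totals = sum_of_distances(dist)
z = build_instruments(totals, years=[1995, 1996, 1997])
print(z[("A", 1996)])
```

The sum of distances is time-invariant, so on its own it would be absorbed by the province fixed effects; interacting it with the year dummies is what gives the instruments variation over time.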
Results of the fixed-effects panel data estimator for the benchmark model and for the model including controls for differences in the industrial mix are summarized in Table 5. The first set of columns corresponds to the standard least squares (within) estimator, while the last set shows the results obtained using the IV estimator (two-stage least squares) for the fixed-effects model. It should be noted first that, as expected, the fixed-effects model was preferred to a random-effects model, confirming that the unobserved province effects are correlated with the MA measure and with the employment shares in each sector. It should also be mentioned that significant year fixed effects are included in all the specifications.
The least squares estimates of the benchmark specification in the levels of the variables provide a significant negative value for the impact of MA, which is clearly against the prediction of the NEG model. Therefore, the within-province variation in the data, once the factors that are common to all provinces are taken into account (time effects), suggests that, for a representative Spanish province over the period under analysis, the endowment of human capital decreased when its accessibility to the market increased, and vice versa. This is in sharp contrast with the results based on the variation across provinces (the one exploited in the cross-section estimates), and thus with the previous evidence in the literature. In any case, the impact turns out to be nonsignificant when the industrial mix controls are included in the model. As in Boulhol and de Serres (2010), we have also estimated the model with a first-order autoregressive process in the error term, and in first differences of the variables. In the latter case, only the year fixed effects were included, as first differencing eliminates the time-invariant province effects. In both specifications, the estimated impact of MA remains negative but becomes nonsignificant.
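The within estimator with province and time effects relies only on the variation that remains after removing province means and year means. The sketch below illustrates this for a balanced panel with a single regressor; the data are simulated with a hypothetical slope of -0.5 and do not correspond to the Spanish provinces.

```python
def two_way_demean(panel):
    """Subtract unit means and period means, add back the grand mean.

    panel[i][t] holds the value for unit i in period t (balanced panel).
    """
    n_units, n_periods = len(panel), len(panel[0])
    unit_means = [sum(row) / n_periods for row in panel]
    period_means = [sum(row[t] for row in panel) / n_units for t in range(n_periods)]
    grand_mean = sum(unit_means) / n_units
    return [
        [panel[i][t] - unit_means[i] - period_means[t] + grand_mean
         for t in range(n_periods)]
        for i in range(n_units)
    ]


def within_slope(y, x):
    """OLS slope of demeaned y on demeaned x (two-way fixed effects)."""
    y_dm, x_dm = two_way_demean(y), two_way_demean(x)
    s_xy = sum(a * b for row_y, row_x in zip(y_dm, x_dm) for a, b in zip(row_y, row_x))
    s_xx = sum(b * b for row_x in x_dm for b in row_x)
    return s_xy / s_xx


# Hypothetical balanced panel: 3 units, 3 periods, no noise.
x = [[1.0, 2.0, 4.0], [2.0, 5.0, 3.0], [0.0, 1.0, 7.0]]
unit_effects = [10.0, -5.0, 3.0]
period_effects = [0.0, 1.0, 2.5]
true_beta = -0.5
y = [
    [true_beta * x[i][t] + unit_effects[i] + period_effects[t] for t in range(3)]
    for i in range(3)
]
print(within_slope(y, x))
```

Since the unit and period effects are removed by the transformation, the estimate depends only on how changes in the regressor within each unit line up with changes in the outcome, which is why it can differ in sign from a cross-section estimate.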
Table 5. Results of the Estimation of the Panel Data Fixed-effects Model.
Note: *, **, and *** represent significance at 10 percent, 5 percent, and 1 percent, respectively. Standard errors for coefficient estimates are in parentheses. p Values for the statistics are in brackets.
(a) Estimate of the coefficient of the AR(1) process in the error term. (b) Value of the Anderson canonical correlation Lagrange Multiplier (LM) statistic, and significance from the corresponding χ²(L) distribution, with L the number of instruments. (c) Value of the Cragg-Donald Wald F statistic, and in parentheses the corresponding percentage of the maximal relative bias of the IV estimator over the ordinary least squares (OLS) estimator under the null hypothesis of weak instruments. (d) Sargan's overidentification test of all the instruments.
As for the results obtained when MA is treated as an endogenous variable and the time-varying instruments based on the sum of distances for each province are used, the last four columns in Table 5 show that the impact of MA is estimated to be positive but far from statistically significant in all cases. The Sargan test does not reject the exogeneity of the instruments. However, the results of the weak identification test for all the specifications point to a large bias in the IV estimates (it is at least 30 percent larger than that associated with the least squares estimates), raising serious doubts about their accuracy.
Finally, the estimates corresponding to the panel data specifications accounting for spatial dependence are summarized in Table 6. First, we have computed the spatial LM tests for the specification that includes the MA variable and the controls for the industrial mix. Then, both the spatial lag and the spatial error specifications for the model with fixed province and time effects, in levels and in first differences, have been estimated. The results indicate that there are significant spatial effects in the specification in levels but not in first differences. As for the impact of MA, the coefficient is negative for the specification in levels, and positive when the model is first differenced. However, it is significant only for the spatial models in levels.
Table 6. Results of the Estimation of the Spatial Panel Data Fixed Effects Model.
Note: *, **, and *** represent significance at 10 percent, 5 percent, and 1 percent, respectively. p Values for the statistics are in brackets. LM-ERR = Lagrange Multiplier test for spatial error dependence; LM-LAG = Lagrange Multiplier test for spatial lag dependence.
Standard errors for coefficient estimates are in parentheses.
In all, the panel data estimates that control for fixed regional and time effects do not support the prediction of the NEG model of a positive impact of MA on the regional endowment of human capital. There is even some evidence pointing to a negative relationship between the improvement in accessibility for the representative Spanish province over the period under analysis and the change in its endowment of human capital.
An Alternative Measure of Human Capital
As an additional robustness check, this section shows the results obtained using an alternative measure of human capital that is more closely related to that in the theoretical NEG model. It is the per capita value of human capital, computed as the weighted sum of all workers, where the weights are the ratio of their wage to the wage of a zero-schooling worker with no experience. This measure, proposed in Mulligan and Sala-i-Martin (1997), is provided in the IVIE-Bancaja Human Capital Data set and is described in detail in Serrano and Pastor (2002); see also Mulligan and Sala-i-Martin (2000). In brief, the method used to compute that measure of human capital starts from the estimation of the schooling (and experience) wage premium using microdata from a representative sample of workers (the Spanish wave of the Earnings Structure Survey). Then, defining w_sea as the wage of an individual of gender s, with educational attainment e, and belonging to age cohort a, relative to that of an individual without any schooling and younger than twenty years, it is stated that h_sea = w_sea. That is to say, the relative wage of a worker with characteristics s, e, and a equals that worker's endowment of human capital relative to an individual with no schooling and younger than twenty years old (h_sea).
The total amount of human capital in a region is thus obtained by aggregating across individuals.
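A hedged sketch of this aggregation, assuming that L_{sea,r} denotes the number of workers of gender s, educational attainment e, and age cohort a in region r, and that N_r is the reference population used to express the measure in per capita terms. Both symbols are notational assumptions, and the exact expression in the cited sources may differ.

```latex
\begin{equation*}
H_r = \sum_{s}\sum_{e}\sum_{a} h_{sea}\, L_{sea,r},
\qquad
\frac{H_r}{N_r} = \frac{1}{N_r}\sum_{s}\sum_{e}\sum_{a} w_{sea}\, L_{sea,r}
\end{equation*}
```

The second expression uses the identity between the relative endowment and the relative wage of each type of worker stated above, so that each worker counts in proportion to his or her wage relative to the zero-schooling reference worker.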
As indicated in Mulligan and Sala-i-Martin (1997) and in Serrano and Pastor (2002), this alternative measure of human capital solves some of the shortcomings of the average years of schooling. And most interestingly for our study, as it is based on the relative wage of skilled workers to unskilled workers, this measure is more closely linked to the individuals’ economic incentive to acquire education and thus to the magnitude in the theoretical model sketched in the third section.
We have replicated the estimations for the cross section and panel data using the per capita value of human capital as the dependent variable. Results for the cross sections of 1995 and 2007, summarized in Table 7, suggest a positive impact of MA on this alternative measure of human capital somewhat stronger than that obtained for the years of schooling. Actually, the coefficient of MA remains significant at 5 percent in the spatial lag model with industrial mix controls and at 10 percent in the SEM. The estimated value in those specifications is clearly above (in some cases even doubling) the one obtained in the fourth section for the measure of human capital based on education attainment.
Table 7. Cross-Section Results Using the Per Capita Value of Human Capital.
Note: *, **, and *** represent significance at 10 percent, 5 percent, and 1 percent, respectively. Standard errors for coefficient estimates are in parentheses. p Values for the statistics are in brackets.
However, the estimates from the panel data set shown in Table 8 do not support a positive impact of MA. On the contrary, estimates from the spatial models with the shares of sectoral employment, and fixed province and time effects, point to a significant negative effect of MA. The estimate of the effect remains negative, though not significant, in the specifications in first differences. Therefore, as in the case of years of schooling, the estimates based on the within-province variation, and controlling for time effects, do not support the prediction of the NEG model.
Table 8. Results of the Estimation of the Spatial Panel Data Fixed Effects Model for the Per Capita Value of Human Capital.
Note: *, **, and *** represent significance at 10 percent, 5 percent, and 1 percent, respectively. p Values for the statistics are in brackets. LM-ERR = Lagrange Multiplier test for spatial error dependence; LM-LAG = Lagrange Multiplier test for spatial lag dependence.
Standard errors for coefficient estimates are in parentheses.
Conclusion
The hypothesis in this article has been that the inference on the impact of MA on the regional distribution of human capital provided in previous contributions to the literature is likely to be nonrobust because it is based on a rather simple specification that does not account for regional differences in the sectoral composition and for spatial dependence in the distribution of human capital.
Our results for the cross sections of Spanish provinces in different years confirm that once we include the sectoral composition of employment and control for spatial dependence, the role of MA decreases sharply, and even vanishes and becomes statistically insignificant. Indeed, we can conclude that spatial effects and differences in the demand for human capital across sectors play a much more prominent role than the traditional measure used to proxy for the accessibility to the market of each region. Besides, the evidence obtained from a panel data setting, controlling for fixed regional and time effects, is clearly against the prediction of the NEG model, as the estimated impact of MA is even negative and significant in some specifications. The interpretation of such a result must take into account that the specification controlling for fixed regional effects exploits the variation in the time-series dimension of the data. Therefore, it can be argued that these estimates correspond to the short-run impact of MA on human capital, whereas those based on the cross-section dimension provide an estimate of the long-run effect. Additional results, not reported in this article but available from the authors, from panel data models that do not include fixed effects, and from the application of the between estimator (which exploits only variability in the cross-section dimension), lead to conclusions that are similar to those derived from the cross-sectional analysis in the fourth section.
Our conclusion is in line with that in Fingleton (2006, 2011). He indicates that there are alternative (or at least complementary) plausible theories to those from the NEG when explaining local wage variations. It is also consistent with the smaller role played by the NEG elements at the regional level when compared to the country scale, derived from results in Brakman, Garretsen, and Van Marrewijk (2009). In any case, it is our belief that additional elements should be combined with those from the NEG model in order to obtain empirical specifications that provide robust inference on the real impact of MA on the regional differences in human capital endowments.
The evidence obtained in this article also suggests that it would be worthwhile to implement a more direct test of the connection between regional differences in the incentives to invest in human capital and MA, based on a measure of the returns to human capital instead of its endowment. The cross-section estimate of the impact of MA on the per capita value of human capital, as a measure of the wage premium of skilled labor, is stronger than that obtained with the measure of the endowment of education (average years of schooling). In our opinion, this is a more appropriate way of testing the implication of the wage equation in the NEG model (equation 6 in the third section), where the estimated return to education in each region would capture the skill wage premium.
Although the objective of our contribution has been simply to check the robustness of the results in the previous literature on the positive and significant impact of market accessibility on the spatial distribution of human capital, we cannot conclude without recognizing that the analysis in this article has not provided evidence on the mechanisms behind the effect of the industrial mix and the spatial interactions. A deeper analysis, connecting these magnitudes with those in the theoretical model related to the production technology and to the incentive to accumulate human capital, is in order and is part of our future research agenda.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: BC-K acknowledges the scientific research support from the Scientific and Technological Research Council of Turkey (TUBITAK), code BIDEB-2214. EL-B acknowledges financial support from the Spanish Ministry of Science and Innovation, National Program of R&D, ECO2011-30260-C03-03, and from the European Community’s Seventh Framework Program (FP7-SSH-2010-2.2-1) under grant agreement n° 266834, SEARCH Project.
