Abstract
Higher nominal wages in urban areas are a well-documented phenomenon, and they imply higher productivity of urban workers. Yankow (2006) and Wheeler (2006) show that these gains come through a variety of sources, including static agglomeration economies and dynamic learning and matching efficiencies in cities. Yet, earlier articles offer little evidence of how the effects of learning and matching on urban wage differentials vary by city size. This article allows for the relative importance of these productivity advantages to differ according to the size of the city and finds significant differences between small, medium, and large cities. We find that learning efficiencies are most important in medium-sized cities, while a mix of learning and matching efficiencies is important in the largest and smallest cities.
Introduction
An old German saying goes Stadtluft macht frei (“city air makes you free”). 1 Although the original saying concerned feudal obligations, the idea that city air is somehow different persists both in the broader culture and, more recently, in the urban economic literature. Glaeser (1999) and Glaeser and Maré (2001) offer theory and evidence suggesting that urban areas increase wages through a process of learning. However, Glaeser (1999) acknowledges that the pattern of evidence found in Glaeser and Maré (2001) is also consistent with better labor market matches in dense urban labor markets, which are realized gradually through more intensive or productive job search. Indeed, Wheeler (2006) and Yankow (2006) find evidence supporting this hypothesis.
In terms of our understanding of cities and their economic function, the two theories offer very different views. While Glaeser (1999) makes a persuasive defense of his assumptions, the learning externality essentially assumes that there is something different about cities in the Marshallian or Jacobsian way: that the urban atmosphere somehow imparts knowledge to those who breathe it. On the other hand, the matching mechanism for enhanced productivity of urban workers requires only that city labor markets be thicker on both the supply and demand side, which is something already known to be true. However, the relative importance of these urban wage growth effects need not be uniform across all locations. Most prior works, like Glaeser and Maré (2001), Wheeler (2006), and Yankow (2006), offer little evidence on how these effects on growth differ by city size. This article extends these analyses to examine how the relative importance of agglomeration, learning, and matching economies varies with city size.
The rest of the article is organized as follows. The section on Background briefly discusses several explanations for urban wage premia and differential urban wage growth. The next sections on Empirical Framework and Data discuss the empirical model and describe the data, followed by the section on Results, which are consistent with the earlier work by Wheeler (2006) and Yankow (2006) but expose significant differences in the sources of urban wage premia across city sizes. These results suggest that the functions of cities vary with their size. Middle-sized cities seem to generate the largest learning effects, while small and large cities are characterized by a mix of matching and learning effects. The Discussion section comments on these findings and the final section concludes.
Background
Wages are higher in urban areas and higher still in larger urban areas. From the perspective of workers, it is not difficult to see why this would be the case. To attract workers to large cities, employers must compensate them for the high cost of living and possibly also for congestion externalities associated with these areas. Looking at the question from the labor demand side does not yield as obvious an explanation. Why should profit-maximizing firms be willing to pay workers more in large cities than in smaller cities or rural areas? Since the costs of doing business besides wages (e.g., land costs, congestion costs, regulatory costs, and taxes) are also higher in urbanized areas, it must be the case that workers in large cities are more productive than workers in smaller cities. Why?
There are several possible answers. Following Glaeser and Resseger (2010), these theories can be categorized as knowledge-based and not knowledge-based. The former include the two theories of interest here—city air (learning) and city markets (matching) —discussed below. The other theories include many intuitive explanations. It could be the case that city workers are more able than rural workers. Sorting of more productive workers into larger cities could explain the wage premium, but would require explanation itself. If urban residents earn more because of their talents, and not because of their residence, then they should be able to move to the country or a smaller city (with lower costs of living) and have a higher material quality of life. For the ability-sorting explanation to hold up in locational equilibrium, high-skill individuals would have to have an unobserved taste for urban areas. Heuermann et al. (2010) summarize the several studies measuring the extent to which sorting accounts for the urban wage premium, concluding generally that worker sorting accounts for much but not all of the premium. Glaeser and Maré (2001) and Yankow (2006) find controlling for worker fixed effects reduces the premium by over half. Sorting on observable skill and unobservables continue to receive attention in the literature (see, e.g., Adamson, Clark, and Partridge 2004; Gould 2007; Baum-Snow and Pavan, forthcoming).
It is also possible that urban wages are higher because of the large capital stocks that are at workers' disposal in urban employment, relative to small towns or rural areas. Under normal assumptions, larger capital stocks increase the marginal product of labor, and thus generate higher wages in short-run equilibrium. However, in the long run the capital stock is endogenous. If capital is mobile, it is hard to see why employers would pay workers more in cities when lower wages could be paid to rural workers, had the capital been invested in the rural area instead. The empirical implications of a capital-deepening explanation of urban wage premia are that migrants to urban areas should experience immediate wage gains, and urban–rural migrants should experience immediate wage losses. Glaeser and Maré (2001) find instead that urban migrants appear to increase their wage gradually, while the wage losses of urban–rural migrants are not persistent. Furthermore, they find that the urban wage premium is largest for workers with the most experience, an effect that does not seem consistent with a straight capital-deepening story. Glaeser and Resseger (2010) explore this possibility further, finding no evidence to support this hypothesis.
A related explanation for the urban wage premium is that urban firms benefit from large stocks of unpriced inputs like public infrastructure. These roads, schools, ports, and utilities could increase the productivity of firms that have access to them. If there are economies of scale in the provision of such inputs, then dense urban areas could retain their advantage over rural areas and the urban wage premium could constitute a locational equilibrium. Similar effects could derive from unproduced productive amenities such as coastal location or access to natural resources. However, the impact of public capital on business productivity is in dispute, with some papers finding no net effect of public infrastructure expenditures (Holtz-Eakin and Lovely 1996). Even if public infrastructure is productive, this source of the urban wage premium would have empirical implications identical to those of the private capital story above (immediate wage gains for urban in-migrants, immediate wage losses to urban out-migrants), so the evidence cited above about gradual wage gains also undermines a public-infrastructure or productive-amenity–based explanation for urban wage premia.
The remaining explanations for the urban wage premium are all forms of agglomeration economies. 2 Fujita, Krugman, and Venables (1999) describe the three major agglomeration economies, which they attribute to Marshall, as access to customers, access to specialized inputs, and human capital spillovers. Krugman (1991) models the first of these types of agglomeration economies. While Krugman is at the head of a very exciting literature, 3 the implications for the urban wage premium are similar to those of the public and private capital interpretations above. In these models, transport costs plus economies of scale make location near consumers valuable. Workers also find it attractive to locate near the production centers because that lowers their cost of living. 4 In all of this, the agglomeration forces rest completely within the firm. It is the firm’s access to consumers and scale economies that give rise to the agglomeration tendencies. Mobile workers are homogeneous. If this is the case, then any mobile worker can move to the site of agglomeration, take a job, and earn a wage as high as any long-term resident of the agglomerated area. It is not the workers, but the firm and the agglomeration that raise productivity. Thus, the predictions of immediate wage gains and losses persist. 5
The knowledge-based theories of agglomeration economies often start with human capital spillovers, as Glaeser (1999) models. He includes the oft-quoted passage from Marshall that in cities the “mysteries of the trade become no mystery: but are as it were, in the air.” It is not simply the air’s saturation with trade secrets that increases productivity in cities, however. In Glaeser’s model, learning is achieved through interactions with more skilled individuals. Cities foster more learning than rural areas because interactions are more frequent in cities. Given certain simplifying assumptions, Glaeser derives privately and socially optimal city sizes and skill distributions. Although the theory is in the context of long-run equilibrium, Glaeser draws out the dynamic implications for the urban wage premium. If urban workers earn more because they have learned and are learning through interactions with peers, two implications are obvious. First, the wage premium should develop gradually as migrants to cities learn their skills from their seniors. Second, the urban wage premium should increase with experience. 6 Glaeser and Maré (2001) take these predictions to several data sets and find evidence that broadly supports the learning hypothesis. Results found by other authors support this interpretation as well. 7 Thus, while Krugman (1991) and the consequent new economic geography literature model Marshall’s first agglomeration economy, Glaeser (1999) models his third.
What Glaeser (1999) acknowledges is that these same patterns could be explained by Marshall’s second source of agglomeration economies: thicker markets for specialized inputs. Specifically, if firms require certain kinds of labor, then having a larger labor market should improve the quality of employment matches. Helsley and Strange (1990) model this kind of labor-matching, finding that privately optimal cities would be too large (have too many firms). For simplicity, Helsley and Strange assume a kind of perfect information: once a person moves to a city, it is costless for him to find the employer that will provide the highest quality match. 8 This is probably a reasonably accurate description of a worker’s situation in the long run. However, in the short run, workers probably take some time to search out and discover this best match. This near-term search will be characterized by the kind of rapid job turnover documented in Topel and Ward (1992) and in the urban context by Wheeler (2008) and Finney and Kohlhase (2008). This period of searching for better matches and accruing gradual wage increases through job mobility could take some years (indeed, an optimist might believe that there is always a better match somewhere out there). Thus, the Helsley and Strange (1990) model of agglomeration arising from thick city input markets or better matching would give rise to wage patterns very similar to the learning model in Glaeser (1999): wage premia should accrue gradually and should be increasing with time spent in the city.
This article addresses the mechanisms by which these wage premia gradually accrue to city dwellers. The observed urban wage premium may arise from wage level effects, from wage growth rate effects, or from both. The learning and matching hypotheses both predict urban wage gains over time, while other theories suggest immediate wage differentials. Of particular interest is the growth aspect: getting closer to the microfoundations of how the growth differential comes about in cities (learning vs. matching), and whether the roles of “city air” and “city markets” in generating this differential vary with the scale of the metropolitan area. Most prior work that evaluates the learning and the matching hypotheses either restricts cities' effect on wage growth (via learning or matching) to be the same regardless of city size (e.g., Glaeser and Maré 2001; Gould 2007) or imposes a monotonic relationship between city size and these wage dynamics (e.g., Wheeler 2006; Glaeser and Resseger 2010). The estimation approach taken here improves on this by including flexible city-size interactions.
The prior literature differentiating between the learning and matching hypotheses across city size is quite limited. Empirical analyses distinguish between the two hypotheses by separating “within-job” wage growth from “between-job” wage growth. Although imperfectly, within-job raises reflect learning while between-job wage gains support the matching or coordination hypothesis. Wheeler (2006) and Yankow (2006) examine urban wage premia and urban wage growth effects and find evidence for both the learning and the matching efficiency arguments: urban residents earn more because their wages grow faster, and their wages grow faster in part because job changes are more frequent and more rewarding financially. 9 Yet, the accumulated evidence on learning versus matching across urban scales remains cloudy. For between-job wage gains, Wheeler (2006) finds significant urban effects but Yankow (2006) does not. Moreover, some of Wheeler’s models and all of his fixed-effects models fail to find significant effects of urbanicity on between-job wage gains. For within-job wage growth, some studies find significantly higher rates in cities (Glaeser and Maré 2001; Glaeser and Resseger 2010), but others show no association between growth rates and city population (Glaeser and Resseger 2010; Wheeler 2006). The different measures of urban size used by these researchers suggest that the relationship between city size and learning and matching may be rather complicated.
This article reproduces these earlier results and shows how the urban wage premia (in particular, the part deriving from differential wage growth rates) depend on the place of the city within the urban hierarchy. This focus on the interactions between city size and work experience and job mobility contrasts with the emphases of other recent studies of the urban wage premium, which focus on other themes like wage distributions (Matano and Naticchioni 2011), sorting (Gould 2007; Baum-Snow and Pavan, forthcoming), and education (Adamson, Clark, and Partridge 2004). The results here give a relatively direct test of the learning and matching hypotheses behind the urban wage premium. Incorporating both in the same model also offers a measure of their relative contributions and of how their importance varies with city size. The estimation approach and larger data set used here yield more precise estimates than previous studies. These new results provide further evidence for this important research area, which has thus far produced many insignificant and inconsistent results. This inquiry improves our understanding of the dynamics and implications of urban growth, which could allow economic development and employment policies to be differentiated by city size. How cities' roles in worker productivity, as a forum for matching (city markets) or learning (city air), vary with size can guide urban policy to foster the sorts of economic environments most suited to a city’s scale.
Empirical Framework
The presence of an urban wage premium can be confirmed quite simply by regressing individual i’s log wage at time t on his contemporaneous urban residence, as in equation (1):
The addition of individual fixed effects ($\mu_i$) to the equation allows for the possibility of selection into cities based on unobservable characteristics such as skill, ambition, work ethic, or other wage- or productivity-enhancing characteristics unobserved by the econometrician.
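The role of the individual fixed effects can be illustrated with a minimal Python sketch of the within (demeaning) estimator on synthetic data. Everything in it is an assumption made for illustration: the variable names, the sample sizes, the "true" premium of 0.05, and the sorting rule are hypothetical and are not taken from the NLSY79 or from the estimates reported in this article.

```python
import numpy as np

def within_estimator(y, X, ids):
    """Fixed-effects (within) estimate: demean y and X by individual, then run OLS."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    ids = np.asarray(ids)
    y_dm, X_dm = y.copy(), X.copy()
    for i in np.unique(ids):
        mask = ids == i
        y_dm[mask] -= y[mask].mean()          # removes the person-specific mean
        X_dm[mask] -= X[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta

# Synthetic panel (hypothetical): 200 people observed for 6 years each.
rng = np.random.default_rng(0)
n_people, n_years = 200, 6
ids = np.repeat(np.arange(n_people), n_years)
ability = rng.normal(size=n_people)           # unobserved, time-invariant effect

# Sorting: higher-ability people are more likely to be observed in urban areas.
urban = (rng.normal(size=ids.size) + ability[ids] > 0).astype(float)

true_premium = 0.05                           # assumed urban log-wage premium
log_wage = (2.0 + true_premium * urban + ability[ids]
            + 0.01 * rng.normal(size=ids.size))

# Pooled OLS (intercept + urban) confounds the premium with sorting on ability.
X_pooled = np.column_stack([np.ones(ids.size), urban])
pooled = np.linalg.lstsq(X_pooled, log_wage, rcond=None)[0][1]

# The within estimator uses only changes in urban status for the same person.
fe = within_estimator(log_wage, urban[:, None], ids)[0]
print(f"pooled OLS: {pooled:.3f}, fixed effects: {fe:.3f}")
```

In this constructed example the pooled coefficient absorbs the ability difference between urban and rural residents, while the within estimate is identified only from people who change urban status, which is the same source of identification the fixed-effects models in this article rely on.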
Differentiating between the several theories for the urban wage premium is the primary focus of this discussion. Following Glaeser and Maré (2001), interactions are used to identify the different avenues of the urban effect in the wage equation.
The learning effect of cities on productivity should be apparent by an increased return to work experience in cities, relative to rural areas. But since tenure and experience will increase equally when an employee does not switch positions, a pure Glaeserian learning effect would be represented by
The two hypotheses are not mutually exclusive. If matching and learning are both important, it is expected that
The fixed-effects estimator in equation (4) allows identification of the urban wage premium as both a wage-level effect (
Finally, one objection to the methodology used in this article is that urban residence is not assigned randomly, and thus there is the possibility of endogeneity bias in these estimates. While the identification of the urban effects through within-individual variation addresses this to some extent, it is of course still likely that urban–rural and rural–urban migration is not exogenous, either. If all changes were from rural to urban, the effect of such endogeneity would be to bias all of the estimated urban and urban interaction coefficients away from zero: those who expect (or realize) a large urban wage premium will be more likely to move to cities. However, the flow of individuals from cities to rural areas will induce the opposite bias. These flows are of similar magnitude in these data (about 4,000 rural-to-urban moves vs. about 3,300 urban-to-rural moves), so the bias is likely quite small. On the other hand, it is unlikely that experience, education, tenure, or any of the other time-varying variables are exogenous either. Ideally, one would find compelling instruments for all these variables. At least for the case of urban status, such a compelling instrument does not exist in the National Longitudinal Survey of Youth (NLSY) data. The results to follow are thus based on simple Mincerian regressions and are best interpreted as a good first cut at the question, on the hope that most of the endogeneity in these variables is driven by time-invariant factors such as tastes, ability, or family background.
Data
The NLSY 1979 (NLSY79) restricted-use geocoded and work history files are used to estimate the above equations. The NLSY79 offers a long panel data set of a nationally representative sample. The initial cohort of 12,686 people, aged fourteen to twenty-two when interviewed in 1979, was surveyed annually through 1994 and biennially after that. Unlike the many previous studies of the urban wage premium that use the NLSY79 survey for years 1979–94, this study extends the panel through 2004. The vector of control variables includes cumulative experience (in weeks), tenure with current employer (in weeks), age, sex, race, marital status (= 1 for married individuals, = 0 for all other individuals), and the percentile score on the Armed Forces Qualification Test (AFQT). A battery of occupation, industry, and year indicators is sometimes included. Data from the entire panel (1979–2004) are used, with the requirement that in each person-year the respondent work at least thirty-five hours per week, have valid geographic and occupational information, and have a “reasonable” reported wage. 10 The industry codes come from either the 1970 Standard Industrial Classification (SIC) codes, or the revised 2000 codes, depending on the year of the survey (2002 and 2004 waves use the more current codes). A hybrid code system grouping industries into sixteen categories for all survey years was generated. 11 Similarly, occupation codes are reported in 1970 codes for all survey years except 2002. The 2000 codes were assigned to 1970 bins by an educated guess based on a comparison of the 1970 and 2000 census occupation codes. The assignment rules for occupation and industry are available to those interested.
To ensure comparability across the sample, only work experience gained after the age of 18 was counted. 12 The experience variable was generated by summing the reported number of weeks worked since the previous interview. The interaction between urban status and experience was done during the data processing: it is not simply the product of contemporaneous urban status and cumulative work experience. Rather, this interaction was generated by recomputing the cumulative experience variable assigning zero to any weeks where work was done outside an urban area. 13 The interactions of tenure and education with urban status, on the other hand, were computed in the standard way.
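The difference between the urban experience variable described above and a simple product of urban status and cumulative experience can be seen in a short Python sketch. The function name, the interview-level layout, and the example numbers are hypothetical, and the sketch assumes that the urban status recorded at each interview applies to all weeks worked since the previous interview, which the text does not state.

```python
def cumulative_experience(weeks_worked, in_urban):
    """Running totals of all weeks worked and of weeks worked while urban.

    weeks_worked: weeks worked since the previous interview (one entry per interview)
    in_urban:     whether those weeks are counted as urban (assumed constant
                  within each interview interval)
    """
    total, urban = 0, 0
    total_path, urban_path = [], []
    for weeks, is_urban in zip(weeks_worked, in_urban):
        total += weeks
        if is_urban:
            urban += weeks        # rural weeks contribute zero urban experience
        total_path.append(total)
        urban_path.append(urban)
    return total_path, urban_path

# Hypothetical worker: rural, then two urban interviews, then rural again.
weeks = [40, 52, 50, 48]
urban_flags = [False, True, True, False]

total_exp, urban_exp = cumulative_experience(weeks, urban_flags)

# The naive interaction: contemporaneous urban status times cumulative experience.
naive = [t * int(u) for t, u in zip(total_exp, urban_flags)]

print(total_exp)   # [40, 92, 142, 190]
print(urban_exp)   # [0, 52, 102, 102]
print(naive)       # [0, 92, 142, 0]
```

The naive product credits rural weeks as urban experience once the worker moves to a city and drops all urban experience when the worker leaves, whereas the recomputed variable keeps the urban weeks that were actually accumulated.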
Urban status was derived from the reported county of residence, in conjunction with the USDA ERS’s Rural/Urban Continuum scale. 14 These data are convenient because they provide measures of urbanity derived from the 1970 to 2000 censuses. For the years 1979–85, 1980 county urbanity was used. For 1986–95, 1990 county urbanity codes were used. For the remainder of the data, urbanity codes from the 2000 census were used. The 2000 codes are presented in Figure 1. These codes allow for three gradations of urban status, which are used throughout the rest of the article. The most restrictive, Big Urban, includes only metropolitan areas with populations of more than one million (black shapes in Figure 1). The next category, Medium and Big Urban, includes all those large cities, plus counties in cities with populations greater than 250,000 but less than one million (dark gray shapes in Figure 1). The most inclusive, Urban, includes all metropolitan areas with populations exceeding 50,000. Table 1 presents the sample means for the sample of worker-years in the NLSY from 1979 through 2004, as well as some comparisons across sizes of urban areas. Note that the urban wage premium is apparent in the group averages: big-city residents make eighteen log points more than medium-city residents, twenty-three log points more than small-city residents, and thirty-four log points more than rural residents in this sample. It should be noted that the urban indicators are constructed differently from typical, mutually exclusive dummy variables. As Table 1 makes clear, big cities take a value of unity for all three urban scale variables, while small cities have only the Urban value equal to one. This construction makes it easy to detect significant incremental effects of city size, this article’s emphasis, in the results shown below.
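The nested construction of the three urban indicators can be sketched in Python as follows. The function and key names are hypothetical, and the sketch takes a metropolitan population as its input for simplicity; the article itself assigns status from the USDA ERS continuum codes rather than from raw population counts.

```python
def urban_indicators(metro_population):
    """Nested (not mutually exclusive) urban dummies.

    metro_population: population of the metropolitan area containing the
    county of residence, or 0 for a nonmetropolitan (rural) county.
    Thresholds follow the category definitions given in the text.
    """
    return {
        "urban": int(metro_population > 50_000),
        "medium_big_urban": int(metro_population > 250_000),
        "big_urban": int(metro_population > 1_000_000),
    }

for label, pop in [("rural", 0), ("small", 100_000),
                   ("medium", 500_000), ("large", 2_000_000)]:
    print(label, urban_indicators(pop))
```

Because a large city switches on all three indicators, each coefficient in a regression on these variables measures the increment over the next smaller size class rather than the gap relative to rural areas.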

Figure 1. 2000-Census-based urban–rural codes. Note: Metropolitan areas with populations over one million shaded in black. Metropolitan areas with populations between 250,000 and one million shaded with dark gray. Metropolitan areas with populations between 50,000 and 250,000 shaded with white.
Table 1. Descriptive Statistics.
Note. Standard deviations below each average.
Results
The existence of the urban wage premium is confirmed in the first column of Table 2, which takes equation (1) to the data, with the threefold distinction among metropolitan areas. 15 The results show that, over the 1979–2004 period, full-time workers in urban areas earn around a 12.5 percent wage premium, 16 while workers in cities with populations greater than 250,000 earn about an additional 4.5 percent over small cities, and workers in the largest cities (metropolitan areas with populations over one million) earn an additional 20 percent over medium cities. While these coefficients are highly significant and economically substantial, the predictive power of this basic equation is very low. Columns 2 and 3 of Table 2 report estimates from equation (2), which control for a variety of individual characteristics (column 2) and a set of occupation, industry, and year fixed effects (column 3). These characteristics increase the predictive power substantially, have expected signs, and are highly significant. Note that while experience (Exp) and tenure (Ten) are measured in weeks, the coefficients in Tables 2–4 have been scaled to represent a year’s worth of experience in general or at a single firm. 17 The inclusion of these observable characteristics reduces but does not eliminate the urban wage premium. Observationally equivalent workers in urban areas make around 5 percent more than those in rural areas, with an additional 4 percent premium in medium-sized or larger cities, and an additional 14 percent in the largest urban areas. This suggests that sorting into urban areas based on observables accounts for some of the urban wage premium, but not all of it.
Table 2. Urban Level Effects without Interactions.
Note. OLS = ordinary least squares, FE = fixed effects.
*Significant at the 5 percent level.
**Significant at the 1 percent level.
Columns 4–7 of Table 2 report estimates from the individual fixed-effects regression (equation 3) with various additional controls for year, occupation, and industry. The time-varying control variables retain their significance. However, the addition of the individual fixed effects further reduces the urban wage premium, suggesting again that some of the urban wage premium is an artifact of high-ability individuals sorting into cities. In the smallest cities, the urban wage premium becomes insignificant. In medium-sized cities, the premium is driven down to only 2.5 percent in the model with industry, occupation, and year controls, and the large-city premium is driven down to an additional 6.5 percent over medium-sized cities. Both these urban premia are highly significant. Interestingly, the addition of the year fixed effects reduces the medium-city premium (compare columns 4 and 5), while the addition of the occupation and industry effects lowers only the small-city premium (compare columns 4 and 6). The large-city premium is robust to the inclusion of these additional effects. 18 The substantial reduction in the urban wage premium after the inclusion of the individual fixed effects means that sorting of unobservedly able earners into cities is an important component of the observed wage differentials in cities. Whether this sorting derives from a preference among the highly able to live in cities, or from higher returns to unobservable ability in cities, these results cannot address. However, the medium- and large-city premia are still significant: the urban wage premium does not appear to be solely an artifact of sorting.
Table 3 introduces the experience, tenure, and education interaction terms as in equation (4). In Table 3, urban residence is entered as a single dummy variable. For exposition’s sake, regressions both with and without the interactions are reported. 19 Columns 1 and 2 treat the Urb dummy variable as Urban, so all metropolitan areas are considered urban. Columns 3 and 4 define Urb as Medium & Big Urban, and columns 5 and 6 code Urb = Big Urban (i.e., only cities with populations over one million). Under every definition of urban residence, the coefficients on urban residence, urban experience, and urban tenure are significant, while the urban education interaction is insignificant. The tenure and experience interactions suggest that both learning and matching are important factors in differential urban wage growth. While this broad pattern is consistent across definitions of urban residence, 20 the importance of learning relative to matching varies considerably (ranging from about .5 to .75, by this rough measure). These results are consistent with those reported in Yankow (2006) and Wheeler (2006), although the specification does not facilitate the comparison of effects across city sizes.
Table 3. Fixed Effects Estimates using Different Measures of Urban, with and without Interactions.
Note. Fixed effects estimates with unreported controls for occupation, industry, and year.
*Significant at the 5 percent level.
**Significant at the 1 percent level.
Table 4 enters urban residence as a vector of three urban sizes as described above. The urban dummies are the same ones used in the previous tables, so that the total urbanization effect for the largest cities (and the total urban interaction for these cities) is thus the sum of all three urban coefficients (or urban interaction coefficients). Column 1 reports results from a fully specified ordinary least squares (OLS) model, while columns 2–5 report coefficients from fixed-effect models with various controls for industry, occupation, and year.
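How the nested coefficients add up to a total effect for each city size can be shown with a small Python sketch. The increments used are the rounded percentages quoted earlier in the text for the OLS estimates with controls in Table 2 (about 5, 4, and 14 percent), treated here as additive log-point increments, which is an approximation; they are not Table 4 coefficients, and the dictionary keys are hypothetical names.

```python
from itertools import accumulate

# Incremental premia, each relative to the next smaller size class
# (rounded figures quoted in the text, used purely for illustration).
incremental = {
    "urban": 0.05,             # small cities relative to rural areas
    "medium_big_urban": 0.04,  # medium cities relative to small cities
    "big_urban": 0.14,         # large cities relative to medium cities
}

# Total premium relative to rural areas is the running sum of the increments.
totals = dict(zip(["small", "medium", "large"],
                  accumulate(incremental.values())))

for size, premium in totals.items():
    print(f"{size}: {premium:.2f}")
```

The same running sum applies to the urban interaction terms: the total urban experience or tenure effect for the largest cities is the sum of the three corresponding interaction coefficients.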
Table 4. Estimates with Flexible Specification of Urban Status.
Note. OLS = ordinary least squares.
*Significant at the 5 percent level.
**Significant at the 1 percent level.
Before turning to the learning versus matching issue, the other results bear some description (focusing on the fixed-effects models). The main effects of experience, tenure, and education have the expected signs. Factoring in the interaction terms, education, tenure, and experience all have positive effects in rural areas and in all city sizes. Relative to rural areas, the additional urban return to education is either negative (for smaller cities) or insignificant (for the largest cities). Relative to small cities, however, the additional returns to education in the largest cities (Education × Big Urban + Education × Medium and Big Urban) are significantly greater than zero across all models in Table 4 at the 5 percent level or better. This result resembles that of Adamson, Clark, and Partridge (2004). Overall, while there are some interesting patterns across city sizes, there is little evidence of enhanced returns to education in cities relative to rural areas. These data thus fail to support at least one interpretation of the Duranton and Puga (2001) “nursery cities” model.
Table 4 offers a quite detailed description of how the sources of urban wage growth vary across city sizes. “Growth” here refers to the association of within-worker variation in wages and within-person variation in time spent working. While the results vary somewhat between OLS and fixed-effects estimates, within the fixed-effects models they are quite consistent. In the smallest cities, the urban-tenure coefficients are over 60 percent of the urban experience coefficients. 21 By the rough measure used here, learning makes up only about a third of the urban wage growth effect. The marginal effect of moving from a small city to a medium city is actually perverse in terms of matching, so that learning accounts for the entire urban wage growth effect in medium cities. 22 However, in the largest cities, matching again becomes important. In fact, the marginal increase in wage growth associated with moving from a medium to a large city arises completely from the matching effect. The large-city tenure coefficients are about the same magnitude as the large-city experience coefficients, meaning that (relative to medium cities) experience gained at one firm in a large city has no effect on wages. 23 The total urban growth effect in large cities (relative to rural areas) is about 60 percent learning, 40 percent matching. 24
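One reading of the rough measure used in this section can be written as a short Python sketch. It assumes, based on the description above, that the urban tenure interaction enters with the opposite sign to the urban experience interaction, so that a year spent at the same employer earns the sum of the two (learning), while the part of the experience return that tenure offsets is attributed to job changes (matching). The function name and the coefficient values are hypothetical and are not taken from Table 4.

```python
def learning_matching_shares(b_urban_exp, b_urban_ten):
    """Split the urban experience return into learning and matching shares.

    b_urban_exp: coefficient on the urban x experience interaction
    b_urban_ten: coefficient on the urban x tenure interaction
                 (assumed negative when matching matters)
    """
    if b_urban_exp == 0:
        raise ValueError("urban experience coefficient must be nonzero")
    # Within a job, experience and tenure rise together, so the net
    # within-job urban return is the sum of the two coefficients.
    learning = (b_urban_exp + b_urban_ten) / b_urban_exp
    matching = 1.0 - learning
    return learning, matching

# Hypothetical small-city case: tenure offsets 65 percent of experience.
learning, matching = learning_matching_shares(0.010, -0.0065)
print(f"learning: {learning:.2f}, matching: {matching:.2f}")

# Hypothetical pure-learning and pure-matching benchmarks.
print(learning_matching_shares(0.010, 0.0))      # all learning
print(learning_matching_shares(0.010, -0.010))   # all matching
```

Under this reading, a tenure interaction that offsets a little over 60 percent of the experience interaction leaves learning with roughly a third of the urban growth effect, and a tenure interaction of the same magnitude as the experience interaction leaves no within-job urban return at all.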
Discussion
These results can be interpreted in terms of urbanization and localization economies. Assuming that Glaeserian learning occurs mostly within industries, 25 the learning dynamic can be seen as a localization economy, while the matching dynamic seems to represent an urbanization economy, with the larger variety of jobs resulting from the urbanization of a variety of firms and industries. The results suggest that smaller cities are the result of strong urbanization economies, while medium-sized cities are the result of deep localization economies and the largest urban areas result from a mix of both. Compared to rural areas, small cities offer a much wider variety of jobs, but they do not offer the deep industry-specific knowledge pools from which young workers can learn much. In medium-sized cities, on the other hand, the specialization in one or a few industries does not significantly broaden the array of possible jobs, but these cities host concentrated stocks of workers with deep knowledge in the relevant subsectors. Finally, relative to medium-sized cities, the largest cities offer intense urbanization economies. The largest cities offer every kind of employment imaginable, improving matches, but the increased variety does not engender additional learning (relative to medium-sized cities). That additional learning does not occur could be a result of the knowledge base in larger cities being spread widely across industries rather than more deeply, of congestion or rapidly declining returns in learning mechanisms, or of time constraints. After several hundred thousand residents, adding more residents does not increase the likelihood of learning because most of a person’s interactions occur among a smaller group of close friends, family, and colleagues.
These results paint an interesting picture for cities. Localization advantages for learning persist across city scales but plateau at the largest city size. Urbanization advantages for matching manifest for small cities and are particularly strong in large cities (relative to smaller cities or rural areas). That the age-earnings profile grows steeper from small to large cities is expected and consistent with Baum-Snow and Pavan’s (forthcoming) findings. The fading of the matching effect for medium-sized cities is also consistent with Yankow (2006), who shows negligible matching effects in moving from small to medium cities and no increased propensity to change jobs in medium-sized markets compared to smaller towns. Relatively specialized medium-sized cities offer better learning opportunities but thinner job markets, consistent with a version of Duranton and Puga’s (2001) model where the largest (and smallest) cities are “nurseries” for innovation and specialized firms relocate to mid-sized cities to exploit that learning.
The difference across city sizes in the importance of urbanization versus localization economies is also consistent with the differences in urban wage effects across education levels. Medium-sized cities concentrated in a few industries are likely production-oriented cities, with high demand for production workers (high school graduates) but no substantial increased demand for high-skill workers. Highly urbanized cities, on the other hand, have greater demand for all sorts of laborers. Thus, high school graduates with no urban experience earn an urban wage premium in small, medium, and large cities, while inexperienced college degree holders earn a premium only in the largest cities.
Taking the results from Tables 2 to 4 together, the story emerges that wage differences (and thus productivity differences) across city sizes arise from a variety of factors. Sorting based on observable and unobservable characteristics explains a substantial amount of urban wage differences. Similarly, straight agglomeration economies are important across all urban sizes. For college graduates, large cities offer substantial wage premiums. Finally, there is evidence supporting enhanced learning along the lines of Glaeser (1999) in all city sizes, while substantial productivity increases due to enhanced match quality along the lines of Helsley and Strange (1990) also seem quite important, at least in the smallest and largest cities.
These results are consistent with many previous findings (e.g., Glaeser and Maré 2001; Gabe 2004; Adamson, Clark, and Partridge 2004; Glaeser and Resseger 2010) and offer some new results in two ways. First, some significant effects are found where previous studies (Yankow 2006; Wheeler 2006) found insignificant results. This may be due to a longer panel or other data construction differences, alternative model specifications, or different identifying assumptions. Violations of the fixed-effects assumptions and different counterfactuals may also account for some of the difference. Second, Table 4 reveals complex relationships between city size and urban wage growth effects that eluded previous studies that imposed more restrictive specifications. The flexible estimator used here shows an intercept shift (for wage-level effects), convexity (for matching effects), and concavity (for learning effects) over city size.
Conclusion
Intuitively, it is known that there is something different about cities. How and why they are different is the subject of perennial debate. Recent theory has begun to formalize the intuitions laid out by Alfred Marshall in the nineteenth century and Jane Jacobs in the mid-twentieth century. This article has mainly tried to shed light on the source of one of these differences: the fast wage growth that urban workers experience.
Several plausible explanations for this phenomenon are supported by the data. The reduction in the urban level effect when observable and unobservable characteristics are controlled for suggests that there is sorting into large labor markets. The persistently significant coefficients on the uninteracted urban status variables suggest that productive amenities or agglomeration economies deriving from scale economies in the production of goods, services, and/or public infrastructure may also be contributing to high urban wages. However, it is likely that at least some of these level effects derive from the matching and learning dynamics discussed in the text. The wage data used in the article come from surveys conducted some time after respondents would have moved to or from the city. If rural–urban migrants are able to find better matches in cities, even with their first job, then some of the urban level effect would come from a matching effect. Similarly, if the respondents have had time to live and learn in the city before the survey was conducted, some of the urban level effect could be coming from learning as modeled by Glaeser (1999).
The results also suggest that the dynamics of wage growth differ depending on the size of the city. The smallest cities offer richer and more varied markets than rural areas, but not much opportunity to gain expertise from knowledgeable coworkers or neighbors. In contrast, medium-sized cities offer intense concentrations of industry-relevant knowledge, which leads to substantial wage growth as workers pick up useful skills. In the largest cities, workers retain the learning advantages of medium-sized cities but get the added benefit of extremely thick labor markets, which allow them to find progressively better matches to their skills. These patterns can be understood as urbanization and localization effects. The results shed light on how the importance of these effects varies across the urban hierarchy, and on the varying functions of cities of different sizes.
While the article has concentrated on comparing two models of urban wage growth, the models themselves can be interpreted in various ways. For instance, the learning model could be interpreted as meaning that in cities it is easier to learn about better job opportunities. Conversely, one could interpret matching as a match to the firm that employs people from whom a person can learn the most. These models are not necessarily mutually exclusive, as the empirical results show. Plenty is different about cities, be it air, markets, or some third factor. It should not be surprising that the measurable differences, such as wage or productivity differences, are made up of many small factors. What may be more surprising is that almost all these factors seem to point in the same direction, toward an advantage for urban residence and production.
Footnotes
Acknowledgment
The authors would like to thank Joseph Persky and Geoffrey Turnbull for helpful comments. All errors are our own.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
