Abstract
This article explores the geography of public libraries in the United States. The distribution of libraries is examined using maps and spatial statistics to analyze spatial patterns. Methods for delineating and studying library service areas used in previous LIS research are modified and applied using geographic information systems to study variations in library accessibility by state and by socio-economic group at the national level. A history of library development is situated within the broader economic and demographic history of the US to provide insight into contemporary patterns, and Louis Round Wilson’s Geography of Reading is used as a focal point for studying historical trends. Findings show that while national library coverage is extensive, the percentage of the population that lives in a library’s geographic service area varies considerably by region and state, with Southern and Western states having lower values than Northeastern and Midwestern states.
Introduction
The United States is a nation of library users. According to the Institute of Museum and Library Services in 2009 there were 1.59 billion visits to US public libraries and 2.41 billion items were checked out (Miller et al., 2011). Visits and circulation have continued to increase over the past 10 years. The last national survey of US households in 2002 revealed that 48% of all households used a public library within the past year and 31% used one within the past month (Glander and Dam, 2007).
Libraries are one of the few public, communal spaces that all residents can freely use for education and enjoyment. They can be crucial resources during difficult economic times for job seeking and training, but are also one of the first public resources to face budget cuts (Mantel, 2011; Prentice, 2011). This is certainly true in recent history, as local media across the US documented how Americans turned to public libraries for help in the midst of the Great Recession while the libraries faced shrinking staff, tighter budgets, shorter hours, and branch closures (Brustein, 2009; Dvorak, 2011; Jackson, 2009). The Free Library of Philadelphia is one of many library systems that have released cost-benefit studies to demonstrate the economic and social value that libraries provide. Personal interviews with librarians and patrons revealed that while patrons continued to use libraries for accessing books, movies, and the Internet, there was an increasing demand for job searching and training, preschool and after school activities, and ESL (English as a second language) materials for non-native speakers. Both librarians and patrons agreed that the library had positive impacts on children’s performance in school and on the neighborhood as a place of stability and safety (Diamond, 2010).
This latter point serves to emphasize that libraries are more than just a collection of information resources and services; they are also public places that contribute to the social capital of their communities (Hillenbrand, 2005; Johnson, 2010). There are several case studies where the nature of libraries as social, public spaces is explored (Aabø and Audunson, 2012; Buschman and Leckie, 2007). The importance of libraries to democracy is frequently invoked in the US library literature and mass media, stressing Jefferson and Madison’s philosophies that a well-informed public is necessary for a functioning democracy (Buschman, 2007). Buschman (2003, 2007) has identified deeper philosophical connections to Habermas’ conception of the public sphere, arguing that public libraries are vital to a democratic society because they provide equal and unrestricted access to knowledge preserved in a public space that can be used for rational discourse and debate. Equal access to libraries is important for fulfilling this democratic function, and the American Library Association (ALA, 2012) holds this as a key tenet: ‘Equity of access means that all people have the information they need regardless of age, education, ethnicity, language, income, physical limitations or geographic barriers’. Given the values that libraries provide to communities and the real threat of public library closures, their geographic distribution is of great importance.
This study examines the geography of public libraries in the United States in order to identify variations in library distribution based on geographic region or socio-economic group. While there has been an increasing number of studies that measure public library access and use within the US, few of these studies are at the national level and fewer focus on analyzing the actual geographic distribution of the libraries themselves. A national framework should be useful in providing context for local library issues. Public library outlets are mapped using geographic information systems (GIS) to examine the distribution of libraries and to delineate library service areas to measure spatial equity of library access among various socio-economic groups and between various geographic regions. Spatial statistics will be used to quantify the clustering or dispersion of library outlets and service areas, and the current geographic distribution will be placed in historical context.
Literature review
GIS has been used increasingly within library and information science (LIS) research to measure and analyze library services. This literature was recently surveyed in depth by Bishop and Mandel (2010), who classified articles into two groups based on whether GIS was used to analyze library services, facilities, and use, or to manage library facilities and collections. Articles in the former group are of interest here and can be further sub-divided into different categories: using GIS as a site selection tool for new libraries (Bishop, 2008; Hertel and Sprague, 2007), using GIS to understand library patrons in order to provide relevant services and resources (Kinikin, 2004; Koontz et al., 2005; Ottensmann, 1997; Park, 2012), and using GIS to measure equity of access (Japzon and Gong, 2005; Jue et al., 1999; Koontz et al., 2005, 2009; Sin, 2011). Studies that focused on site selection or library patrons typically mapped registered borrowers or circulation data relative to library locations in order to understand or define a library’s market area. All studies used demographic data from the US Census Bureau to define the constituents of a library’s market in various ways.
Defining geographic service areas
The literature recognizes a distinction between a library’s legal service area and its geographic market area. The legal service area is the official space served by the library: the local jurisdiction that provides funding and the maximum area from which it draws patrons. The geographic market area represents the realistic area within which residents are willing and most likely to travel to reach the closest library, influenced by convenience of location and by cultural and topographical barriers (Jue et al., 1999; Koontz et al., 2009). The term ‘geographic service area’, abbreviated to GSA, will be used throughout this paper to refer to this realistic catchment zone. The word ‘service’ is used in lieu of ‘market’ to reflect the library’s role as a public good as opposed to a private enterprise.
Several studies have used a Euclidean distance of one to two miles (1.6 to 3.2 km) from the library to define geographic service areas (GSAs) (Koontz et al., 2005, 2009; Sin, 2011) with circular buffers drawn around each library with a radius at the given distance to capture areas that the library effectively serves. Small census statistical units that fall within the buffer are selected or apportioned to determine the population of the GSA. The choice of these distances is influenced by historical precedent in the US set by the ALA (Koontz, 2007) and by extensive work done in the late 1970s and early 1980s by Palmer and Hayes. They surveyed and summarized literature from several countries on the effect of distance on public library use up to that time (Palmer, 1981) and built upon it with a detailed survey and analysis of the effect of distance on central and branch library use in Los Angeles (Hayes and Palmer, 1983). These studies helped establish within the LIS literature that library use decays with distance, and much subsequent work has tied physical access to the library with the ability of people to access its resources (Park, 2012). Such observations are consistent with conventions in the field of geography where libraries would be considered as impure public goods. Pure public goods, like national defense, are non-exclusionary and free to anyone in society, while: impure public goods are provided at fixed locations or along fixed routes. A park is a public good, and so is a public transport service. However, due to distance decay, the further a person lives from these immobile public goods, the less is the usage of them, and their (potential) benefit. (Gregory et al., 2009: 599)
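The fixed-radius selection described above can be sketched in a few lines. This is only an illustration under stated assumptions: coordinates are treated as planar and measured in miles, census units are represented by a single point each, and the names `within_buffer`, `units_in_buffer`, and the sample coordinates are hypothetical rather than taken from any of the cited studies.

```python
import math

def within_buffer(library_xy, point_xy, radius_miles):
    """Return True if a point lies inside a circular buffer around a library.

    Coordinates are assumed to be planar (projected) and expressed in miles.
    """
    return math.dist(library_xy, point_xy) <= radius_miles

# Hypothetical library location and census-unit points (illustrative values only).
library = (0.0, 0.0)
unit_points = {
    "unit_a": (0.5, 0.5),
    "unit_b": (1.5, 1.0),
    "unit_c": (3.0, 0.2),
}

def units_in_buffer(radius_miles):
    """List the census units whose point falls inside the buffer."""
    return sorted(
        name for name, xy in unit_points.items()
        if within_buffer(library, xy, radius_miles)
    )

one_mile = units_in_buffer(1.0)
two_mile = units_in_buffer(2.0)
```

Widening the radius from one to two miles adds units to the service area, which is why the choice of distance matters for any downstream population count.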
Plotting library registration or circulation data and associating it with existing administrative areas (Shoham et al., 1990) or delineating areas based on the distribution of borrowers or circulated items (Kinikin, 2004; Ottensmann, 1997) are alternatives to fixed radii approaches for defining GSAs. The advantage of the circulation delineation approach is that it defines areas based on actual use. The resulting statistics provide insight beyond their local settings: Ottensmann’s (1997) study of the Indianapolis Public Library System found that the median distance from a registered patron’s residence to their nearest library ranged from two-thirds of a mile to a mile for small inner-city branches, two to three miles for suburban branches, and five miles for the main central library. Of all materials 47% circulated within two miles and 66% circulated within three miles. Such findings reinforce the notion of distance decay and demonstrate differences in library distribution and use based on the varying density of communities. A shortcoming of this method is that circulation statistics do not capture all of the library’s services, and the library’s role of providing non-circulating amenities such as computer access, educational and cultural events, and public space has increased (Koontz et al., 2005).
Thiessen or Voronoi polygons have also been used to define GSAs. A boundary is drawn between a library and each of its neighbors at their midpoint distances to form polygons around each library. Census units that fall within this polygon can be selected or apportioned to calculate the population of the library’s GSA. This approach was used to study library use among various socio-economic groups in New York City (Japzon and Gong, 2005) and Lake County, Florida (Park, 2012). Since this approach unequivocally assigns all residents to a service area based on least distance, it is impossible to study differences between inclusion and exclusion since all areas are included in a GSA (Jue et al., 1999). In reality the distances may be beyond what a person is willing to travel; Park (2012) found that only 65% of registered borrowers used the library that was within the polygon GSA where they resided, while the remainder traveled to a library outside their area.
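Assigning a location to its polygon is equivalent to assigning it to its nearest library, so the approach can be illustrated without constructing the polygons themselves. The sketch below assumes planar coordinates in miles; the branch names and coordinates are hypothetical.

```python
import math

def assign_to_nearest(point, libraries):
    """Return the name of the library closest to the point.

    This is the same assignment a Thiessen (Voronoi) polygon would give:
    every location belongs to the polygon of its nearest library.
    """
    return min(libraries, key=lambda name: math.dist(point, libraries[name]))

# Hypothetical branch locations (planar coordinates in miles).
libraries = {"north_branch": (0.0, 10.0), "south_branch": (0.0, 0.0)}

near_south = assign_to_nearest((1.0, 4.0), libraries)
near_north = assign_to_nearest((2.0, 6.0), libraries)
# A location 90 miles from any branch is still assigned to one.
far_away = assign_to_nearest((0.0, 100.0), libraries)
```

The last assignment shows the limitation noted above: no location is ever left outside a service area, however far it is from the nearest library.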
Similar methods have been used to delineate effective service areas for other public goods such as schools, parks, and hospitals, in order to analyze their distribution and measure equity of access for various population groups. Pearce et al. (2008) provide a comprehensive, international summary of this literature as a preface to their study on access to community resources in New Zealand.
National United States studies on public libraries
Jue et al. (1999) summarized the key factors that impact library access and employed several spatial methods to measure it nationally in order to identify whether public libraries were adequately serving impoverished areas. Three methods were used to calculate whether an area was served or unserved: (1) counting census tracts where a library was located, (2) using a fixed-radius approach to select all tracts within a certain distance of each library, and (3) using a gravity model to draw Thiessen polygons around each library and select tracts within each. When tracts were selected in methods 2 and 3 the population for each tract was apportioned based on the percentage of land that fell within each service area; if 50% of a tract fell within a given library area, 50% of its population was apportioned to that area. The study found that public libraries were distributed rather equitably, but there was an under-representation of libraries in the most extreme poverty areas, particularly when the first method (single census tract) was employed.
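The land-area apportionment used in methods 2 and 3 reduces to a single proportion. A minimal sketch, with a hypothetical function name and made-up tract figures:

```python
def apportion_population(tract_population, tract_area, area_in_service_area):
    """Allocate a tract's population in proportion to the share of its
    land area that falls inside a library service area."""
    if tract_area <= 0:
        raise ValueError("tract_area must be positive")
    if not 0 <= area_in_service_area <= tract_area:
        raise ValueError("overlap must lie between 0 and the tract area")
    return tract_population * (area_in_service_area / tract_area)

# Half of a 10-square-mile tract with 4,000 residents lies in the service area.
allocated = apportion_population(4000, 10.0, 5.0)
```

The calculation implicitly assumes that residents are spread evenly across the tract, which is the weakness of area-based apportionment.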
Sin (2011) employed a fixed-radii, two-mile buffer census tract apportionment method to compare library system-level data on circulation, staffing, and funding from the Public Library System dataset with demographic data from the 2000 Census. The demographic characteristics for each library outlet in the system were averaged and paired to the library system data in a multiple regression analysis. Findings showed that a library’s funding was associated with the income levels of its surrounding neighborhood, and that overall access to electronic resources, to library staff with an MLIS, and to library programs was inequitably distributed. A study on library closures between 1999 and 2003 (Koontz et al., 2009) used smaller census blocks rather than tracts and employed a fixed-radii one-mile apportionment method, to reflect that groups like children, the elderly, and those in poverty are least able to travel great distances and would be more disproportionately impacted by a library closure. The study found that African Americans tended to be more disproportionately affected relative to their percentage of the total population, and most neighborhoods where libraries closed tended to have higher numbers of people in poverty.
These national studies did not focus on the actual geographic distribution of library resources; some discussed inequities at a large regional level. Perhaps the last broad national study was the Geography of Reading by Louis Round Wilson in 1938. Wilson set out to understand the extent and causes of inequality in access to libraries and library resources between states and regions. His empirical approach explored the geographic distribution of libraries using data from the ALA, the US Office of Education, and the 1930 Census. While his goal was to study the distribution of all types of libraries and media (academic libraries, bookstores, etc.) a large portion of the study was devoted to public libraries. The original impetus for the work was his recognition of the disparity in literacy rates and library access in his home state of North Carolina relative to the rest of the country (Martin, 1986).
Given the limitations of data at that time Wilson relied on legal library service areas to measure access. In 1935 he found that approximately 45 million Americans (37% of the population) lived outside a library legal service area, while the other 78 million (63%) were living in library service areas; there were approximately 6000 public libraries at that time. He found that service levels varied significantly by geographic region, from 87% on the West Coast, 78% in the Northeast, 72% in the Midwest, 47% in the Mountain West, 36% in the Southwest, to 35% in the Southeast (Wilson, 1938: 16). The majority of the population with no library service lived in rural areas and small towns.
Subsequent updates to Wilson’s work (Downs, 1957, 1974) were not as comprehensive in scope and focused on studying the national urban hierarchy of libraries by counting volumes held in academic and public libraries in large cities. The closest contemporary national studies are perhaps the Institute of Museum and Library Services (IMLS) annual reports that accompany the national Public Library Survey (PLS) data. The PLS is essentially an annual census of libraries with a response rate of nearly 100%. The IMLS compiles the survey data taken from each library system and publishes summary data for states, systems, and outlets; library outlets represent individual library buildings or branches. The reports summarize long-term trends and include state-level summaries. In 2009 there were 9225 public library systems in the United States that operated 16,698 individual library outlets or branches (Miller et al., 2011). Recent reports include national maps of libraries by state. For example, a map of libraries per capita shows a high concentration of libraries (nine or more per 100,000 residents) in northern New England and the central Great Plains states and a low per capita concentration (two or fewer) in the South and on the West Coast (Miller et al., 2011: 5).
Methodology
This study will use the IMLS PLS data for public library outlets to understand how libraries in the US are distributed, and will use variations of some of the spatial methods described to calculate library GSAs and illustrate differences in access across the country. The Geography of Reading will be a focal point for providing historical context as to how the distribution of libraries has evolved.
PLS data from 2009 (IMLS, 2009) was filtered to remove library outlets located in the US territories, branches coded as closed, and outlets that were bookmobiles or books-by-mail centers. Bookmobiles and books-by-mail services were excluded because they cannot be tied to one physical space and they lack several amenities that physical libraries provide (public space, computers, educational programming). The remaining 16,700 libraries were plotted in GIS using the latitude and longitude coordinates stored in each record. The open source GIS packages GRASS GIS 6.4 and QGIS 1.7 were used for spatial processing and mapping, and spatial statistics were generated using Open GeoDa. Data was stored in a SQLite relational database and the Python programming language was used for data processing.
The first part of the study examines the geographic distribution of public libraries. Several methods were used to characterize the degree to which libraries are concentrated or scattered across the country. Once plotted in GIS, the geographic centroid or midpoint of the libraries was plotted and compared to the national population center and to centroids from Wilson’s study in the 1930s to see how they align with the national population distribution. Measures of global and local spatial autocorrelation, spatial statistical methods for measuring whether areas have similar or dissimilar values that are not likely the result of random chance, were used to determine whether there were significant clusters of libraries by county, and possible explanations for observed patterns were explored.
The second part of the study examines the populations served by public libraries within GSAs and how these populations are distributed. Census tract data from the 2005–2009 American Community Survey (US Census Bureau, 2009a) was processed and joined to census tract boundaries in GIS. 1 Some processing was done to reconcile the census data to the tracts, given discrepancies in how the 2005–2009 ACS data was coded relative to the geography (for details see US Census Bureau, 2009b). Variables were chosen to measure equity of access by race, educational attainment, income, and labor force status. Other key library constituents such as children and residents who are not US citizens were also included (the latter as a proxy for recent immigrants; includes legal permanent residents, legal temporary migrants, and undocumented migrants).
Two variations on previous methods for selecting census areas that fall within library GSAs were used: variable buffer size and census tract selection based on population centroids. Instead of using a single buffer for all libraries, buffers of varying size were assigned based on each library’s locale code. The National Center for Education Statistics created locale codes to characterize neighborhoods where public schools are located. The IMLS used these codes and applied a similar methodology to classify public libraries (Manjarrez et al., 2011). There are four categories of locale (City, Suburb, Town, and Rural), each of which is subdivided based on its population (large, mid-size, and small for cities and suburbs) or its distance from urban areas (fringe, distant, and remote for towns and rural areas). Libraries in cities were assigned an area using a one-mile radius from the library, libraries in suburbs and in fringe towns and rural areas (which represent ex-urbs) were assigned a two-mile radius, and libraries in distant and remote towns and rural areas were assigned a three-mile radius (1.6, 3.2, and 4.8 km respectively). These distances are based on the literature and reflect the need, ability, and willingness to travel greater distances as one moves from urban to rural areas. One exception was made for libraries that were designated as central libraries in the largest city locale; they were given the maximum radius of three miles (75 libraries met this criterion). Rather than being first among equals, central libraries in America’s largest cities have larger collections and specialized services, and as such they have a greater pull. This proved to be the case in Ottensmann’s study in Indianapolis (1997) and Hayes and Palmer’s study in Los Angeles (1983). Overall, out of the 16,700 public libraries, 16% were assigned one-mile buffers, 33% were assigned two-mile buffers, and 51% were assigned three-mile buffers.
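The buffer assignment rules above can be written as a small lookup function. The sketch is an illustration only: it uses plain text labels for the locale category and subtype rather than the coded values stored in the PLS file, and the function name and the `is_central` flag are hypothetical.

```python
def buffer_radius_miles(locale, subtype, is_central=False):
    """Return the buffer radius in miles for a library outlet.

    locale:  "city", "suburb", "town", or "rural"
    subtype: "large", "mid-size", "small" (cities and suburbs) or
             "fringe", "distant", "remote" (towns and rural areas)
    """
    locale = locale.lower()
    subtype = subtype.lower()
    if locale == "city":
        # Exception: central libraries in the largest city locale get the maximum.
        if is_central and subtype == "large":
            return 3
        return 1
    if locale == "suburb":
        return 2
    if locale in ("town", "rural"):
        if subtype == "fringe":
            return 2
        if subtype in ("distant", "remote"):
            return 3
    raise ValueError(f"unrecognized locale: {locale!r}, {subtype!r}")

branch_radius = buffer_radius_miles("city", "large")
central_radius = buffer_radius_miles("city", "large", is_central=True)
exurb_radius = buffer_radius_miles("rural", "fringe")
remote_radius = buffer_radius_miles("town", "remote")
```

Keeping the rules in one function makes it straightforward to rerun the analysis with different radii as a sensitivity check.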
A second variation is the method for selecting how tracts should be included in the GSA. Previous studies apportioned the population of the tract based on its percentage of land within the buffer (Jue et al., 1999; Sin, 2011). One problem with this approach is that it assumes the population within each census tract is evenly distributed, which is not always the case. An additional problem is that the 2000 Census was the last decennial census to provide detailed characteristics of the population such as education level, income, and employment status; the 2010 Census is limited to basic demographic variables like total population, age, and race. The detailed variables that are no longer collected in the decennial census are provided in the American Community Survey (ACS), a large sample survey with data presented as estimates with margins of error at a 90% confidence level. When a new ACS value is created by aggregating or splitting values, a new margin of error must be calculated. Apportioning this sample data for small areas is problematic, as the margins of error, particularly for small subgroups of the population, may be larger than the values themselves. Census tracts are the smallest area for which most ACS data is reported.
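To make the margin-of-error issue concrete, the sketch below applies the approximation commonly given in Census Bureau guidance for ACS users: the margin of error of a sum of estimates is the square root of the sum of the squared margins. It is offered as an illustration of why aggregated or split estimates need recalculated margins, not as the procedure used in this study; the function names and sample figures are hypothetical.

```python
import math

Z_90 = 1.645  # z-value associated with the 90% confidence level used for ACS margins

def moe_of_sum(moes):
    """Approximate margin of error for a sum of ACS estimates."""
    return math.sqrt(sum(m * m for m in moes))

def coefficient_of_variation(estimate, moe):
    """Standard error (MOE / 1.645) relative to the estimate."""
    return (moe / Z_90) / estimate

# Two hypothetical tract estimates for a small subgroup, with their margins.
estimates = [120, 80]
margins = [30, 40]

combined_estimate = sum(estimates)
combined_moe = moe_of_sum(margins)
combined_cv = coefficient_of_variation(combined_estimate, combined_moe)
```

In this made-up case the combined margin (50) is a quarter of the combined estimate (200); for smaller subgroups the margin can approach or exceed the estimate itself.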
Instead of apportioning the population of tracts based on land area, population centroids were used. Population centroids represent the center of a population’s distribution within a tract and account for how a population is clustered or dispersed to some degree; they have been used in other studies that measure distances from populations to public services (Pearce et al., 2008; Talen and Anselin, 1998). In this study tracts were selected as being in a GSA if the tract had a library in it or if the population centroid of the tract fell within a library buffer. Population centroids were obtained as coordinates and plotted in GIS (US Census Bureau, 2000). 2
This selection process is illustrated for a rural area of Ohio in Figure 1. Library A has several tracts within its GSA; the population centroids of each tract are clustered near the edge of each tract, indicating a concentrated area of settlement where the library is located. In contrast Library B has a single-tract GSA. The population centroid of Library B’s tract falls outside the buffer of the library and is toward the geographic center of the tract, suggesting the population of this tract is more evenly distributed. Even though its centroid is outside the buffer, the tract is counted as being served since it has a library. The tract immediately to the north of it is counted as not being served even though a portion of it falls within a library buffer, since it does not have a library in it and its population centroid is not contained within the buffer. Libraries A and B have three-mile buffers, indicating that they are either distant or remote towns or rural areas. Library C has a two-mile buffer, since it is classified as either a suburban or a fringe area. In this example and throughout the country, the area of the selected census tracts is often larger than the area in the library buffer, with portions of tracts extending beyond the buffer. In effect this means that the actual area, and thus population, captured within the library GSA is often more liberal than the stated one-, two-, and three-mile buffer areas.

Figure 1. Example of library buffers and census tract selection.
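The two selection rules illustrated in Figure 1 can be expressed compactly. The sketch assumes planar coordinates in miles; the tract names, field names, and coordinates are hypothetical and loosely mirror the Library B example.

```python
import math

def tract_is_served(tract, libraries):
    """A tract is in a GSA if it contains a library, or if its population
    centroid falls inside any library buffer."""
    if tract["library_count"] > 0:
        return True
    return any(
        math.dist(tract["pop_centroid"], lib["xy"]) <= lib["radius_miles"]
        for lib in libraries
    )

# One rural library with a three-mile buffer.
libraries = [{"name": "B", "xy": (0.0, 0.0), "radius_miles": 3.0}]

tracts = {
    # Contains the library, although its centroid lies outside the buffer.
    "tract_with_library": {"library_count": 1, "pop_centroid": (4.0, 1.0)},
    # Partly overlaps the buffer, but has no library and its centroid is outside.
    "tract_to_north": {"library_count": 0, "pop_centroid": (0.5, 5.0)},
    # No library, but the centroid falls inside the buffer.
    "tract_nearby": {"library_count": 0, "pop_centroid": (1.0, 2.0)},
}

served = {name: tract_is_served(t, libraries) for name, t in tracts.items()}
```

Because whole tracts are selected, no estimate is split, so the published ACS values and margins of error can be used without apportionment.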
Data from the selected tracts was aggregated into library GSAs for the nation as a whole and for individual states. Access to libraries for various socio-economic groups was calculated based on the percentage of each group that was within library GSAs relative to the total population, and access at the state level was evaluated based on the percentage of the state’s population living within a GSA relative to the national population. Global and local spatial autocorrelation were calculated at the state level for the percentage of the state’s population in a library GSA to identify significant regional clusters. Several population and fiscal variables were paired with states’ library GSA population in an ordinary least squares regression and a spatial regression to identify predictors of this library population percentage.
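The group-level access measure is a ratio of sums over tracts. A minimal sketch with hypothetical field names and made-up counts:

```python
def share_in_gsa(tracts, group):
    """Percentage of a group's population living in tracts selected for a GSA."""
    total = sum(t[group] for t in tracts)
    served = sum(t[group] for t in tracts if t["in_gsa"])
    return 100.0 * served / total

# Three hypothetical tracts; "in_gsa" marks tracts selected into a library GSA.
tracts = [
    {"in_gsa": True, "total_pop": 4000, "children": 1000},
    {"in_gsa": True, "total_pop": 2000, "children": 300},
    {"in_gsa": False, "total_pop": 4000, "children": 700},
]

pct_total = share_in_gsa(tracts, "total_pop")
pct_children = share_in_gsa(tracts, "children")
```

Comparing a group's percentage with the percentage for the total population indicates whether that group is over- or under-represented inside service areas.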
There are some study limitations to note. For each library in the PLS dataset the accuracy of a library’s coordinates is recorded as a match to a street-level address or to various levels of ZIP code (US postal code). A match to a standard five-digit ZIP code is least ideal, as the coordinates for the library are estimated as being at the center of its ZIP code. In these cases tracts selected for a library GSA may be incorrect. To improve the data, libraries matched to a ZIP code were uploaded to Texas A&M’s online geocoding service (Goldberg, 2012) and re-matched to street addresses; this improved the coordinates for 282 libraries from a ZIP-code to a street-level match. For the final dataset 15,590 libraries were matched to street level; these ideal matches represent 93% of the total libraries in the dataset of 16,700.
In this study distances for establishing library GSAs are Euclidean, straight-line distances. While road network distances may provide a more realistic representation of actual library travel (Park, 2012), a national US comparison between driving and straight-line distance from tract centroids to public hospitals found that the difference between the two was negligible. At local scales outliers around large lakes or mountains were evident, but these were washed out at the national level. The authors concluded that straight-line distance was acceptable for non-emergency travel measures at the national scale (Boscoe et al., 2012).
This study focuses solely on whether a community has a library or not in order to delineate service areas and to estimate service to different groups. However, one cannot assume that every area and group is receiving equal quality of service. Quality will vary based on the hours that a library is open, the size of the collection, the number of staff, the physical condition of the building, the number of programs offered, and by different levels of funding. The PLS reports characteristics of library service at the library system level but not at the individual outlet level. This study considers the mere presence of a library as a public benefit and assumes that libraries function like other impure public goods; the greatest benefits are realized locally and access and use decay with distance. While some limitations of distance for a subset of library users can be overcome with technology like electronic databases and e-books, the library’s physical resources and its intrinsic value as a place for all users are limited by geographic area.
Results
Library geographic distribution
The contemporary distribution of the 16,700 US public libraries is the outcome of individual decisions made in thousands of communities over 150 years. The decision to build, maintain, or close a public library is invariably a local one. States play some role in decision making and are the jurisdiction where the legal status of public libraries is primarily defined, but most powers are ceded to local government (Prentice, 2011). Among public libraries 85% are public agencies connected to some form of local government, and the vast majority of a library’s operating revenue comes from local sources (Miller et al., 2011). Funds from the federal government have had an impact on library construction and maintenance over the course of the 20th century, but have become increasingly scarce.
Three methods were used to characterize the national geographic distribution of public libraries: (1) comparing the mean center of libraries to the mean population center, (2) aggregating libraries by county and mapping the distribution, and (3) calculating spatial autocorrelation. Plotting the geographic midpoint or center of the US library distribution and comparing it to the mean center of the population provides a general indicator of how libraries are distributed relative to people. 3 The two distributions do not precisely correspond; the population center in 2010 is located in south-central Missouri, approximately 186 miles away from the 2009 geographic center of libraries in west-central Illinois (Figure 2). The distribution of libraries is oriented more to the north and slightly to the east relative to the population center. Wilson (1938: 40–42) observed a similar phenomenon in 1930, where the library center in east-central Illinois was about 109 miles north and slightly west of the population center in south-western Indiana. Over the past 80 years the population center has migrated about 276 miles to the southwest, while the library center has moved only about 115 miles to the west and slightly to the south. While the population of the US has been migrating in a southerly and westerly direction, the library distribution has not been keeping pace with this movement.

Figure 2. Population and library centroids: 1930 and 2010.
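A mean center is the average of the coordinates of a set of points, optionally weighted (for example by population). The sketch below treats coordinates as planar, which is a simplification; it is not the procedure used to locate the centers shown in Figure 2, and the function name and sample points are hypothetical.

```python
def mean_center(points, weights=None):
    """Return the (optionally weighted) mean center of planar points."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    x = sum(w * p[0] for p, w in zip(points, weights)) / total
    y = sum(w * p[1] for p, w in zip(points, weights)) / total
    return (x, y)

# Unweighted center of four hypothetical library locations.
library_center = mean_center([(0, 0), (2, 0), (2, 2), (0, 2)])

# Weighted center of two hypothetical settlements; the second has three
# times the population of the first and pulls the center toward it.
population_center = mean_center([(0, 0), (10, 0)], weights=[1, 3])
```

The gap between an unweighted center of facilities and a population-weighted center of residents is the kind of offset the comparison above is measuring.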
The mean number of libraries per county 4 is approximately five and the standard deviation is nine; 2094 of the nation’s 3141 counties (67%) have fewer than the national average, and 46 rural counties have no libraries. Counties in the top class, with 51 to 245 libraries (Figure 3), are urban counties with large central cities. Libraries are concentrated in densely populated regions of the Northeast, West Coast, and Florida, and in several metropolitan areas scattered across the country. Despite the high total population of the South many of its counties have fewer than the national county average of five libraries, similar to sparsely populated counties in the northern Plains and Mountain states in the West. A number of rural counties in the Midwest, northern New England, and several West Coast states are above average. Is there a relationship between population density and libraries? Simple Pearson’s correlations show that while the number of libraries and total county population are strongly correlated at .89, the number of libraries and county population density are more weakly correlated at .31. This suggests that rural counties, no matter how sparsely populated, will typically have at least one library to serve their area, while counties with a higher population will have more libraries but only up to a certain point irrespective of their density.

Figure 3. Libraries per county.
Measures of spatial autocorrelation can help identify the degree to which values for areas are spatially dependent, or whether areas that are adjacent to each other have similar values. Moran’s I is a global measure of spatial autocorrelation that characterizes an entire distribution; it can indicate the degree to which counties have a similar number of libraries relative to their neighbors. A Moran’s I value of zero indicates a random pattern; values from 0 to 1 show that neighboring values are similar (positive spatial autocorrelation), and values from 0 to -1 show that neighboring values are dissimilar (negative spatial autocorrelation) (Lloyd, 2010). Open GeoDa was used to calculate spatial autocorrelation using rook’s contiguity for determining the adjacency of counties. A statistically significant (at .002) Moran’s I of .444 indicates moderately strong positive spatial autocorrelation in the number of libraries per county: counties are likely to have a number of libraries similar to that of their neighbors.
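To make the statistic concrete, the following sketch computes Moran's I on a small rectangular grid, where rook's contiguity means cells that share an edge. It is an illustration under stated assumptions, not a reproduction of the Open GeoDa calculation: the grid stands in for county polygons, the weights are assumed to be row-standardized, and both function names are hypothetical.

```python
def rook_neighbors(rows, cols):
    """Map each grid cell index to the cells sharing an edge with it."""
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            neighbors[r * cols + c] = [
                rr * cols + cc
                for rr, cc in candidates
                if 0 <= rr < rows and 0 <= cc < cols
            ]
    return neighbors

def morans_i(values, neighbors):
    """Global Moran's I with row-standardized weights.

    Assumes every area has at least one neighbor. With row-standardized
    weights the usual n / S0 scaling factor equals 1, so the statistic is
    the sum of (deviation x mean neighbor deviation) over the sum of
    squared deviations.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    numerator = sum(
        z[i] * sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        for i in range(n)
    )
    denominator = sum(d * d for d in z)
    return numerator / denominator
```

On a checkerboard of alternating values every neighbor is dissimilar and the statistic reaches -1, while a grid split into a low half and a high half yields a positive value.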
Measures of local indicators of spatial autocorrelation (LISA) can be used to identify statistically significant clusters of related values or outliers. Significance is determined using Monte Carlo methods; the observed values (in this case, number of libraries by county) are randomly and repeatedly redistributed across counties to create a reference distribution under the null hypothesis of no spatial association. The observed statistic is compared to this reference distribution to determine if a significant difference exists (Cromley and McLafferty, 2012: 161–163). Counties that have a high number of libraries (above the mean) and are associated with high library values in surrounding counties are classified as High-High, while counties with a low number of libraries (below the mean) associated with low values in surrounding counties are classified as Low-Low. Counties classified as High-Low and Low-High are outliers; High-Low values are individual counties that have a significantly higher number of libraries relative to the mean and their neighbors while Low-High values are individual counties that have a significantly lower number of libraries relative to the mean and their neighbors (Talen and Anselin, 1998).
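The classification and permutation logic described above can be sketched as follows. This is a simplified illustration, not the Open GeoDa implementation: it assumes row-standardized weights and a conditional permutation scheme (each area's own value is held fixed while neighbor values are redrawn from the remaining areas), it uses a two-sided pseudo p-value based on absolute values, and the function name, default of 999 permutations, and seed are all hypothetical choices.

```python
import random

def local_moran(values, neighbors, permutations=999, seed=0):
    """Return (statistic, quadrant, pseudo_p) for each area.

    `neighbors` maps each index to a non-empty list of adjacent indices.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n          # variance used for scaling
    rng = random.Random(seed)
    results = []
    for i in range(n):
        k = len(neighbors[i])
        lag = sum(z[j] for j in neighbors[i]) / k   # mean neighbor deviation
        stat = z[i] * lag / m2
        # Reference distribution: redraw k neighbor values from the other areas.
        others = z[:i] + z[i + 1:]
        extreme = 0
        for _ in range(permutations):
            simulated_lag = sum(rng.sample(others, k)) / k
            if abs(z[i] * simulated_lag / m2) >= abs(stat):
                extreme += 1
        pseudo_p = (extreme + 1) / (permutations + 1)
        own = "High" if z[i] > 0 else "Low"
        surrounding = "High" if lag > 0 else "Low"
        results.append((stat, own + "-" + surrounding, pseudo_p))
    return results
```

Areas would be mapped as cluster members or outliers only when the pseudo p-value falls below the chosen threshold, such as .05 for the 95% level used in the maps.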
Figure 4 is a LISA map of libraries by county. There is a statistically significant (95%) concentration of counties with low numbers of libraries that stretches in a band from the Dakotas and Montana in the north to Texas in the south, roughly along the boundary between the Great Plains and the Rocky Mountains. This sparsely populated western area of the Plains, known as the High Plains, is characterized by low rainfall, submarginal land, and declining population (Rowley, 1998). Additional clusters of counties with few libraries appear in Kentucky and Georgia and cover most of these states in their entirety. There are smaller concentrations in northern Virginia and northern Missouri. There are two large, significant concentrations of counties with a high number of libraries; the first is in the Northeast Corridor, New England, and Western New York and the second is in Southern California and Western Arizona. Additional pockets of high concentration are scattered in urban areas across the country.

Figure 4. LISA for libraries per county.
Library geographic service areas
Based on the study criteria for measuring library GSAs, with variable library buffers and population centroid selection, 43,620 census tracts (66% of the total) were counted as being in a library GSA, 21,443 (33%) were outside, and 259 (< 1%) were counted as empty (zero population). Approximately 196 million people (65% of the total US population) lived in a library GSA and 106 million (35%) lived outside a library GSA. The percentages of various socio-economic groups that live in a library GSA are reported in Table 1 and the largest variations from the total population are in the categories for race. The percentage of Native Americans that lived in a library GSA (57%) was much lower than the general population; this is influenced by shortcomings in the PLS survey, which did not count public libraries on reservations comprehensively (Manjarrez et al., 2011: 2).
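The centroid-selection step behind these counts can be sketched as a point-in-buffer test: a tract counts as inside a GSA when its population centroid falls within the buffer of at least one library. The code below is a minimal illustration under assumptions. The function names and record layouts are hypothetical, the buffer radius is simply whatever value is attached to each library (the study's actual variable-buffer rules are not reproduced here), and distances are great-circle miles using an assumed Earth radius.

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2, radius_miles=3958.8):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_miles * math.asin(math.sqrt(h))

def share_in_gsa(tracts, libraries):
    """Share of population whose tract centroid lies in any library buffer.

    tracts:    iterable of (centroid_lat, centroid_lon, population)
    libraries: iterable of (lat, lon, buffer_miles), one radius per library
    """
    libraries = list(libraries)
    inside = total = 0
    for t_lat, t_lon, population in tracts:
        total += population
        if any(
            haversine_miles(t_lat, t_lon, l_lat, l_lon) <= buffer_miles
            for l_lat, l_lon, buffer_miles in libraries
        ):
            inside += population
    return inside / total if total else 0.0
```

Running the same function on subsets of the population (for example, tract counts for a single racial or income group) would yield group-level percentages of the kind reported in Table 1.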
Table 1. Population within library geographic service areas.
Note: *Significant difference at .10. The confidence level for US Census ACS estimates is 90%, and the margin of error represents the precision of the estimate.
The percentage of Asians (71%), Hispanics (71%), and African Americans (69%) that lived in a library GSA was higher than the general population (65%) while the percentage of whites was a little lower than the average (63%). Hawaiian and Pacific Islanders had the largest margin of error, at 68% plus or minus 6%, given the small size of this group and its concentration in two states (over half lived in California or Hawaii). At 73% the non-US citizen population had the highest positive difference compared to the general population. There were no appreciable differences between the general population and children, educational attainment, or labor force status; the exception in these categories was the percentage of the population with less than a 12th grade education (68%). Based on income, households in the lowest bracket were more likely to be served by a library (70%), and the likelihood of being served by a library generally decreased as income increased. Most of the differences between individual groups and the general population were statistically significant. The confidence level for all US Census ACS estimates is 90%, and the margin of error represents the precision of the estimate.
There is a much greater discrepancy by geographic area (Table 2). The percentage of state populations that live in a library GSA ranges from a high of 97% in the District of Columbia to a low of 42% in North Carolina; the state mean is 66% with a standard deviation of 13%. Mapped using natural breaks (Figure 5), the states with the lowest percentages are entirely in the South. The District of Columbia and Vermont are outliers in the top class, while states in the second highest class are clustered in the Northeast, plus Illinois. The District of Columbia and Vermont are both two standard deviations above the mean, while all of the states in the second highest class are between one and two deviations above the mean. Practically all of the states in the lowest category are more than one standard deviation below the mean. A LISA map of the states (Figure 6) shows a solid, significant (at 95%) cluster of states in the South where the percentage of the population living in a library GSA is significantly lower than the national mean and neighboring states, and a significant concentration of higher values in part of the Northeast. Global spatial autocorrelation is strong (Moran’s I is .597) with a significant (.002) degree of spatial dependency.
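As a worked check on the standard-deviation comparisons above, a state's distance from the mean can be expressed as a standardized score. The calculation below uses the rounded state mean (66%) and standard deviation (13%) given in the text, so the results are approximate; the symbol z is introduced here only for illustration.

```latex
\[
z = \frac{x - \bar{x}}{s}, \qquad
z_{\mathrm{DC}} = \frac{97 - 66}{13} \approx 2.4, \qquad
z_{\mathrm{NC}} = \frac{42 - 66}{13} \approx -1.8
\]
```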
Table 2. State population within library geographic service areas.
Note: *The confidence level for US Census ACS estimates is 90%. Margins of error (representing the precision of the estimate) range from 0.1% to 0.6% for all states; exceptions are Wyoming (0.7%) and District of Columbia (1.1%).

Figure 5. Percentage of population in library GSAs.

Figure 6. LISA for % of population in library GSAs.
What could account for this state-level variation? States vary widely in size, population density, and socio-economic composition. It is reasonable to assume that states with higher population densities and more libraries per unit of population (i.e. libraries per 100,000 people) would have higher percentages of population in a library GSA. Table 3 shows that these two population figures are moderately correlated with library GSA population. Fiscal factors are correlated more strongly, with median household income representing an ability to pay for services and library expenditures per capita representing a willingness to pay.
Table 3. Correlations for % of state population in library geographic service areas.
Note: *All significant at .001.
The strength of these correlations warranted further investigation, in part because these variables are themselves related. Using an ordinary least squares (OLS) regression, nearly two-thirds of the state-level variation in the percentage of the population living in a library GSA (the dependent variable) can be explained by these four independent variables; the adjusted R-Squared is .64 and all the variables are statistically significant (Table 4).
Table 4. OLS regression for % of state population in library geographic service areas.
Notes: Mean dependent variable = .656, standard deviation dependent variable = .130, R-squared = .671, Adjusted R Squared = .643, standard error of regression = .079, F = 23.467, probability F = .000.
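The adjusted R-squared and F statistic in these notes can be approximately reproduced from the reported R-squared. The check below assumes n = 51 observations (the 50 states plus the District of Columbia) and k = 4 independent variables; the small differences from the reported .643 and 23.467 are consistent with R-squared having been rounded to three decimals.

```latex
\[
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
          = 1 - (1 - .671)\,\frac{50}{46} \approx .642
\]
\[
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
  = \frac{.671 / 4}{.329 / 46} \approx 23.45
\]
```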
Open GeoDa provides the ability to incorporate spatial dependency in the regression model, as a high degree of autocorrelation can impact the independence of the variables. We have seen that spatial autocorrelation is evident in the library GSA distribution; in other words, states often have values that are similar to neighboring states, and there are significant regional clusters of high and low values. Based on the spatial dependency measures that Open GeoDa automatically generates for an OLS, the suggested course of action was to run a spatial lag model to incorporate spatial effects (Anselin, 2005: 199). More of the variation is explained when spatial dependence is accounted for; R-Squared is .73 and the impact of spatial effects is significant with a coefficient (Rho) of .20 (Table 5). The coefficients and the significance of the independent variables remain relatively constant; population density is a little more significant and expenditures a little less. This analysis demonstrates that these four factors are significant indicators of a state’s library GSA population, but the impact of neighboring states’ values is more important for predicting percentage of population in a library GSA, since this is the variable that is most sensitive to the spatial dependency specification. Other factors like socio-economic and demographic characteristics could be included in this model that might explain differential access, but that remains for future research.
Table 5. Spatial lag model for % of state population in library geographic service areas.
Note: Lag coefficient (Rho) = .204, R-squared = .731, standard error of regression = .068
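For reference, a spatial lag model is conventionally written as shown below. The notation is the standard textbook form and is an assumption here rather than the article's own: y is the vector of state percentages, X holds the four independent variables with coefficients β, W is a spatial weights matrix (assumed to be row-standardized), ε is the error term, and ρ is the lag coefficient reported in the note as .204.

```latex
\[
y = \rho W y + X\beta + \varepsilon
\quad\Longleftrightarrow\quad
y = (I - \rho W)^{-1}\,(X\beta + \varepsilon)
\]
```

With row-standardized weights, the term Wy is the average value among each state's neighbors, so a positive ρ indicates that a state's percentage tends to move with the percentages of adjacent states after the four independent variables are taken into account.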
Discussion
Libraries are widely distributed across the US in 98% of all counties, and two-thirds of Americans live in close proximity to a library. Despite this rather comprehensive coverage there is inequity in the national distribution: there are significant clusters of counties in many Southern states and the High Plains with fewer libraries than the national average, Southern and Western states have lower percentages of their population living in library GSAs, and the overall mean center of the library distribution is further north and east than the nation’s population center.
The percentage of different socio-economic groups that live within a library GSA is not appreciably different from the national average in many instances, with the exception of race, non-US citizens, and those in the lowest income bracket. In this sense and at this scale, the distribution of US public libraries is relatively equitable. However, equity can be defined in various ways, as: equal services for all, services equal to needs, services equal to demand, services equal to preference, and services related to willingness to pay (Lucy, 1981). In this case equity is being met at the national scale in terms of equal service to most groups, with service being simply defined as the availability of a library within a reasonable distance, not accounting for differences in quality or level of service. Furthermore as scale changes the degree of equity can change; this is evident in the inequity of library access by state.
Historical basis for geographic variation of libraries
In a comprehensive study of access to public and community resources in New Zealand (Pearce et al., 2008), the authors stressed that current patterns are not merely the result of contemporary policies, but are formed by different histories of regional development and that factors like the evolution of transport systems, agricultural and industrial development, and historical patterns of land use all have an impact on access to contemporary resources. The prevailing social and political values of different time periods also influence the provision of public services. The history of public library development is tied to the history of urban and economic development, and both have an impact on the current distribution of US libraries. An exploration of this history follows, to provide an interpretation of contemporary patterns.
The current distribution of libraries is similar to the one illustrated by Wilson in 1938. In examining causal factors for why the South had lower levels of library service, Wilson pointed to several socio-economic indicators. First, he noted that there was an overlap between areas of low library service and areas of sub-marginal land (either eroded soil or high plateau areas with scant rainfall). This, in conjunction with the rural nature of the South and its economic concentration on tenant farming, meant that the South was less able to support public services like libraries. He also observed that the South had population imbalances relative to the rest of the country; Southern states had a much higher percentage of children and greater outmigration of young adults of working age, which meant that the tax base for providing public services was smaller. The South had a higher African American population relative to the rest of the country, and in an era of segregation that meant providing separate services for whites and blacks at a higher cost, or lesser or no services to blacks at all (Battles, 2009). Finally, when comparing library service to economic factors like per capita income, income tax payments, and access to other public services like health care, Wilson found strong positive correlations. The patterns of strong library service related to these different measures tended to follow the same geographic order from greatest to least: West Coast, Northeast, Midwest, Mountain West, Southwest, and Southeast.
Many of these factors posited by Wilson seem less plausible for explaining conditions in the early 21st century. During the second half of the 20th century the economy and population of the South boomed, as Americans migrated to the South and West out of the Northeast and Midwest. The erosion of population imbalances, the end of segregation in law, and the urbanization of the South and its transition to a manufacturing and service-based economy transformed the region. However, as we shall see, historical changes in urban development, demographics, and the valuation of public services have shaped the contemporary distribution.
The library landscape Wilson surveyed had been formed by the Progressive Era, where there was a belief in building libraries and other public goods for the purpose of elevating the education of working-class people, integrating recent immigrants into American society, and enhancing civilization as a whole (Prentice, 2011). The Carnegie public libraries, financed by business mogul and philanthropist Andrew Carnegie, were built in 47 states and the District of Columbia between 1896 and 1923 and represented an enormous contribution to public library infrastructure. Half of the 1678 libraries were constructed in states throughout the Midwest; 19% were built in the West, 17% in the Northeast, and only 13% in the South. On a per capita basis (i.e. total number of Carnegie libraries built per 100,000 people in 1910) the West (especially the Mountain states) and Midwest were larger beneficiaries than the Northeast and South. Wilson (1938: 172–174) noted that a lack of state or local funds required to match a percentage of the Carnegie funds for the purposes of annual maintenance contributed to the low concentration of new libraries in certain states. Rural areas of the Midwest and West capitalized on this opportunity to invest in library infrastructure while the South did not. The Northeast already had a head start; it was home to the first public libraries founded in New England in the mid-19th century, and prior to the Carnegie era two-thirds of the country’s libraries were in the Northeast (Prentice, 2011: 4–5).
The lack of library building in the South was influenced by broad social and economic trends. After the Civil War the Southern economy was in ruins, and the South remained economically stagnant and isolated from the national economy for almost a century while the rest of the country continued to develop and grow (Heilbroner and Singer, 1994; Klein, 2004). The Southern economy continued to revolve around cotton and plantation crops, agriculturally based on sharecropping and industrially based on textile production, and thus it was susceptible to swings in commodity prices (Heim, 2000). The Southern population continued to grow due to natural increase but there was little in-migration; the black population was virtually immobile, and segments of the non-immigrant white population migrated to the West (Klein, 2004). In contrast, the North and Midwest continued to industrialize rapidly and their urban areas swelled with foreign immigrants, while the non-immigrant white population continued settling the West.
During this era of growth in the late 19th and early 20th centuries the national population became increasingly urban, while the South remained predominantly rural. The burgeoning big cities were high in population density and limited in geographic area by available forms of transportation; roads were often poor and traffic clogged and most urbanites traveled by foot. Railroad development encouraged dense but scattered town settlements, as it was impractical for steam engines of this era to make frequent starts and stops. Later developments, such as horse-drawn and electric trolleys and the widespread paving of roads, led to urban expansion but still favored density and compactness (Monkkonen, 1988; Vance, 1990). These settlement patterns and forms of transport, in conjunction with Progressive Era library building policies, led to a concentration of libraries in cities and towns that could be accessed easily over short distances. By 1911 the ALA (American Library Association) was recommending that library systems build branches to serve individual communities and that libraries be located optimally one mile apart (Koontz, 2007). The rest of the country, relative to the South, was able to capitalize on library development during an era when urban patterns favored density and the provision of public services became a prevailing social belief.
During Wilson’s era the New Deal programs expanded library construction, particularly in more rural areas (Wilson, 1938: 37–39). Library growth was interrupted by the war but accelerated during the 1950s and 1960s, aided by an economic and population boom and by legislation to increase federal funding for library construction; during this period many Southern states took the lead in taking advantage of these programs (Merrifield, 1995). The Library Services and Construction Act of 1964 provided funding for building or remodeling over 1800 libraries throughout the country (Prentice, 2011: 7).
The fortunes of the South started changing after the Second World War, spurred by several factors: large federal projects that created employment, the end of the sharecropping system and reliance on agriculture, the migration of manufacturing jobs from the older, high-wage union areas of the Northeast and the Midwest, the end of segregation, the construction of the Interstate Highway System, and the growing ubiquity of air conditioning (Easterlin, 2000; Heim, 2000; Klein, 2004; Schulman, 1991). All regions of the country grew rapidly during the post-war years with the West firmly in the lead, but after 1970 population growth increased considerably in the South and slowed substantially in the Northeast and Midwest (Hobbs and Stoops, 2002).
During this period of Western and Southern growth the predominant urban land use patterns were suburban and increasingly ex-urban sprawl driven by automobiles and highways, a pattern that required fewer libraries per square mile relative to earlier periods. New urban patterns that began forming in the early 20th century with the introduction of heavy intra-urban rail transit, the automobile, and the widespread paving of roads accelerated rapidly after the Second World War. Cities shifted from being densely populated cores with tight suburban hinterlands to sprawling, polycentric or edge cities with lower densities (Easterlin, 2000). Library planning became more sophisticated and less ad hoc during this era, and gradually shifted away from focusing on large central city locations to establishing suburban library branches at intersections of major roadways near retail areas, with three or four miles between libraries (Koontz, 2007).
Despite the transformation of the South from Wilson’s time, imbalances persist. The urbanization and economic growth of the South during the late 20th century were geographically uneven; there were imbalances between incoming capital and the existing labor markets and many rural areas remained impoverished (Heim, 2000). Southern states continue to have lower median income relative to the rest of the country (Noss, 2011), and lower library expenditures as well (Miller et al., 2011: 125–26).
The late 20th-century period of economic and population growth in the South and West coincided with more decentralized development and with declining commitments to public services. The prevailing social and political ideology of this time period was profoundly different from the modern liberalism of the mid-20th century and progressivism of the late 19th and early 20th centuries. From the 1970s and 1980s forward neoliberalism became predominant, with a focus on rolling back government, cutting taxes, and privatizing public services (Harvey, 2005). During the 1980s federal funding for domestic programs was cut to shrink government and there was an increasing lack of investment in public infrastructure (Heilbroner and Singer, 1994). The new neoliberal perspective, where civic values are reduced to economic considerations, led to cuts in the federal depository library program in the 1980s and the growth of the private-sector information industry, which would increasingly consume library funding in exchange for proprietary electronic resources (Buschman, 2003). Buschman’s (2003) work situates library issues within neoliberalism, demonstrating that despite the booming economy of the 1990s, public library funding grew slowly and inconsistently, and did not meet the expectation that libraries would be the backbone of the new Information Age. While the number of public libraries grew from 15,481 in 1989 (Podolsky, 1991) to 16,700 in 2009, a net change of 1219 or 8%, the population of the United States expanded by 24% during that same period.
The South did make gains in public library infrastructure during the 20-year period from 1990 to 2010, but these gains were uneven across the region. The South had a net increase of 556 libraries, but over two-thirds of the increase occurred in Texas, Florida, Virginia and Georgia while states in the Deep South and Mid-Atlantic had net losses. Growth in the West was even more lopsided; despite its booming population 80% of the net increase of 490 libraries occurred in California and Arizona. Gains in the Midwest were more evenly distributed compared to other regions with a net increase of 260, while the Northeast suffered a net loss of 87 libraries, with the highest losses in Massachusetts and New York. Table 6 demonstrates that there is little relationship between population change and library growth or loss at the state level. Despite population growth in all 50 states between 1990 and 2010 (DC had a small loss), 14 states had a net loss in public libraries, 36 had net gains of varying degree, and one had no change.
Table 6. Net change in public libraries by state 1989 to 2009.
While the West benefited from early investments in library infrastructure, in recent decades library construction has not kept pace with rapid population growth. The five fastest growing states between 1990 and 2010 were in the West, but only Arizona had substantial library growth. Library growth in Nevada, Utah, and Colorado was tepid compared to population growth and Idaho had a net loss in libraries. A booming population, decentralized suburban development, and a lack of investment in library infrastructure over the past 20 years may explain why many Western states rank just above Southern ones in terms of the percentage of the population that lives in a library GSA.
Variation in library access by socio-economic group
The percentage of different population groups that are within library GSAs can be understood to some degree based on the national distribution of those populations. Unlike the white and to some extent the black population, other racial groups tend to be clustered in certain geographic areas. For example, in 2010 75% of the Asian population lived in 10 states (Hoeffel et al., 2012: 7–9) and 75% of all Hispanics lived in just eight states (Ennis et al., 2011: 5–6). In most of these states, the percentage of the population in library GSAs is near average or above average.
More importantly, Asians and Hispanics are more heavily concentrated in urban areas and particularly in central cities (Klein, 2004), where libraries are more numerous. Based on the tract-level 2005–2009 ACS data, approximately 26% of all Americans live in the counties of the United States that contain the 50 largest cities. Of the Asian, Hispanic, and non-US citizen population, 46% lives in these large city-counties, compared to 36% of blacks and only 18% of whites (Table 7). Of the population of these large city-counties, 71% lives within a library GSA compared to 65% of the general population. In every instance, the percentage of each group that lives in a library GSA in these city-counties is greater than the national percentages. Nationally 71% of Asians and Hispanics live in a library GSA, while 75% of Asians and Hispanics in these big city-counties live in library GSAs. For the black and white population the national percentages are 69% and 63% while the big city-county percentages are 77% and 67% respectively. Big cities are better served by libraries, and thus groups that are concentrated in these areas are more likely to be in a library GSA.
Table 7. Population in library geographic service areas for 50 largest city-counties.
Note: The confidence level for US Census ACS estimates is 90%, and the margin of error represents the precision of the estimate.
This distribution has also been shaped by history; from the mid- to late 20th century white Americans migrated from central cities to suburbs and from the Northeast and Midwest to the South and West, while black Americans moved from rural areas of the South to central cities across the country, particularly in the Northeast, Midwest, and West Coast. Immigration from Latin America and Asia swelled after 1970, with many immigrants arriving in major metropolitan areas and in central cities in particular (McDonald, 2008; Teaford, 2006). Essentially, minority groups occupied the regions and older parts of metro areas that were well served by library infrastructure created from the late 19th to mid-20th centuries, while white Americans migrated to areas that were less well served, during a time when public finances and attitudes toward public institutions shifted and urban development became less dense.
The low percentage of Native Americans living in library GSAs is due in part to omissions in the PLS dataset, but the geographic distribution of Native Americans is likely a factor. The percentage of the population in library GSAs is below average in most of the top states where Native Americans live (Norris et al., 2012: 6–8). The largest reservations with high concentrations of Native Americans are located in parts of Alaska, Arizona, North Dakota, South Dakota, Montana, and Oklahoma where the number of libraries is significantly lower than surrounding areas (particularly in the High Plains). In addition, only 17% of Native Americans live in the 50 largest city-counties where library access is higher, the lowest percentage of any racial group. Other research has shown that Native Americans are three times more likely than whites to live over 10 miles away from the closest library, and that there are a number of distinct cultural and socio-economic barriers that limit or reduce Native American access to libraries (Burke, 2007).
Conclusion
The United States has extensive public library coverage; 98% of all US counties have at least one library and the average number of libraries per county is five. At the county level there is a strong correlation between the number of libraries and population, but a weak correlation between number of libraries and population density, suggesting that most counties will have a library regardless of how sparsely populated they are. Two out of three people in the nation live within a library GSA, and the percentage of various socio-economic groups that live in a library GSA is around the national average or better, with a few notable exceptions. The percentage of most minority groups and non-US citizens that live within a library GSA is much higher, due to the similar concentration of these groups and libraries in densely populated urban areas. One exception is Native Americans whose percentage in library GSAs is noticeably lower than average, due to their lower urban population and to shortcomings in the data.
Geographically libraries are distributed unevenly in different regions and states. Nationally the population is distributed more to the west and south relative to the distribution of public libraries. There is a large, significant cluster of counties in the High Plains and the South that have a low number of libraries relative to the national mean, while there is a significant concentration of counties in the Northeast, Southern California, and various metropolitan areas throughout the country with higher numbers of libraries. The percentage of the population that lives within a library GSA is much lower in the South and West, with statistically significant spatial autocorrelation of lower percentages in Southern states and higher percentages in the Northeastern states. This is in spite of the fact that the number of libraries in many Southern states has grown in the past 20 years, while many Northeastern states suffered net losses. The percentage of a state’s population that lives in a library GSA is significantly related to the values in neighboring states as well as to state-level variations in: population density, libraries per capita, library expenditures per capita, and median household income. Patterns of historical urban and library development, contemporary issues concerning the ability and willingness to pay for library services, and the individual decisions of thousands of communities have shaped the current distribution of US libraries.
A broad, scoping study like this one should hopefully lay the groundwork for future research in many directions and provide a national frame of reference for regional studies. One path would follow refinements in methodology, such as alterations in the approach for measuring distances or calculating geographic service areas. A different path would explore the national or regional patterns for particular groups or areas to add more depth to these findings. For example, at this broad level it is meaningful to identify that Native Americans may be under-served relative to other groups, based on a mismatch between the Native American population and the location of libraries. However, this does not indicate cause and can lead to a circular argument, i.e. Native Americans do not have adequate access to libraries because they do not live where libraries are located. Research that studies the history, unique needs, and data for this group vis à vis public libraries can provide the detailed insight that a broader study does not (for example see Burke, 2007). The distinction between library access at the national and regional levels for urban versus rural areas could be more fully explored, as could the differences in access in urban areas in different regions of the country that developed during different periods of history. If the data is available, the methods employed here could also be used to study library distributions in other countries.
While late 20th century population trends have generally continued, the turbulent economy of the first decade of the 21st century has rendered some changes. Population growth and internal migration have slowed, new suburban growth has momentarily halted while some central cities and older suburbs have seen increases, and minority populations are becoming more decentralized (Frey, 2011, 2012). One national trend that is clear is that public library use continues to increase (Miller et al., 2011). Whether the US is able, or willing, to maintain and enhance its vast library infrastructure remains to be seen, considering the deteriorating finances of local and state governments and the continuing acceptance of, or acquiescence to, neoliberalism in civic discourse.
Funding
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
