Abstract
Background:
Childhood lead poisoning remains a critical public health problem in the United States. The neurotoxic effects of early life lead exposure have been associated with numerous behavioral issues, including violence. To explore this association further, we evaluated whether there was spatial association between topsoil lead levels and violent crime in Jefferson County, Kentucky.
Methods:
Our study collected topsoil samples from four pre-specified areas: one high crime area (Study Area) and three normal-to-low crime areas (Northeast, Southeast, Southwest). A spatial error model was used to compare topsoil lead content between the four collection areas. A Bayesian sparse spatial generalized linear mixed model was used to examine the relationship between topsoil lead levels and violent crime.
Results:
The Study Area had an 8.25-fold increase in topsoil lead concentration compared with the referent Southeast Control Area (β: 8.25, 95% confidence interval [CI]: 5.12–13.27). In addition, for every 100-unit increase in the estimated mean topsoil lead concentration per census tract, there was a 62% increased risk for violent crime (risk ratio: 1.62, 95% CI: 1.59–1.68). Model results were attenuated but remained significant after adjusting for pertinent confounders (risk ratio: 1.05, 95% CI: 1.03–1.08).
Discussion:
Results from our study suggest that lead contaminated topsoil may be an important modifiable risk factor for violent crime. Although there are individual-level and sociological risk factors associated with violence, environmental lead remains a known developmental neurotoxicant that may increase the risk for violent behavior.
Conclusion:
Remediating lead contaminated environments may be an important consideration for legislative policies aimed at preventing violent crime.
INTRODUCTION
Scientific and epidemiological studies have linked early life lead exposure to subsequent criminality. 1 This association, which has since garnered the title “The Lead Crime Hypothesis,” has been observed across several academic domains, through different methodological approaches.2,3 Prenatal and postnatal blood lead levels, 4 as well as adolescent bone lead levels,5,6 have been associated with delinquent and antisocial behaviors in adolescents. In addition, an association between early life lead exposure and arrests for violent offences in adulthood has been observed. 7 Several ecological studies have linked atmospheric2,3,8 and water 9 lead contamination to trends in violent crime, and more recently, aggregate blood lead levels have been associated with the geospatial incidence of violent crime in St. Louis, Missouri.10,11
The relationship between early life lead exposure and criminality is predicated on the neurotoxic effects of lead. Findings from the Cincinnati Lead Study indicate that early life lead exposure is associated with reduced adult brain volume, 12 as well as with altered white matter organization and myelination. 13 These lead-associated alterations were found in numerous parts of the brain, including the prefrontal cortex, 12 which is believed to process various neuronal inputs that give rise to executive functioning. 14 Alterations in impulse inhibition, a metric of executive functioning, 15 as well as attenuation and dysregulation of the prefrontal cortex have been associated with antisocial behaviors. 16
Although both children and adults are at risk from the adverse effects of lead, children are especially vulnerable given their greater propensity for incidental ingestion from age-related hand-to-mouth behaviors, as well as increased gastrointestinal absorption. After gastrointestinal absorption, lead can cross the blood-brain barrier into the central nervous system. 17 Since development of the prefrontal cortex is incomplete during the peak hand-to-mouth activities of early childhood, 18 lead contaminated topsoil may be an under-appreciated risk factor for lead-associated brain injury and related neuropsychiatric sequelae.
Historically, state lead abatement programs that are governed by the Environmental Protection Agency (EPA) have primarily focused on remnant lead paint exposure in aged residential structures. 19 However, it has been reported that lead contaminated topsoil is an important pathway for lead exposure. Prior studies have found a direct relationship between topsoil lead content and pediatric blood lead levels.20,21,22
Although remnant lead paint is an important risk factor for pediatric lead exposure, 23 topsoil lead content is estimated to have three times greater bioavailability compared to lead from paint, 24 and thus a stronger relationship to pediatric blood lead levels. 25
The reported evidence that links lead contaminated topsoil to childhood blood lead levels, in conjunction with research that links pediatric lead exposure to subsequent violent behavior, raises concern over the potential criminogenic effect of lead contaminated topsoil. To date, there has not been an evaluation of the geospatial association of lead contaminated topsoil and the incidence of Federal Bureau of Investigation (FBI) designated violent crime, defined as aggravated assault, robbery, homicide, and rape. 26
To address this gap in the literature, we evaluated whether neighborhoods with significantly higher rates of violent crime had increased levels of lead contaminated topsoil compared with neighborhoods reported to have normal-to-low rates of violent crime in Jefferson County, Kentucky. In addition, we assessed whether topsoil lead concentration was a predictor for violent crime, adjusting for pertinent census-tract-level covariates.
METHODS
The design of our study was based upon an a priori risk assessment of violent crime in Jefferson County. To evaluate the spatial distribution of violent crime, a series of pin maps was created using the latitude and longitude coordinates of all violent crime events reported by the Louisville Metro Police Department from 2012 through 2016. All shapefiles for spatial analysis were downloaded from the U.S. Census Bureau's TIGER/Line shapefile repository. 27 All maps were produced using Esri ArcGIS v.10.4.
A review of the criminological and sociological literature suggests that violent crime is usually non-randomly distributed across populations and geographic areas.28,29,30,31,32 Cluster analysis for each violent crime type was carried out in SaTScan v.9.1.1 via the Kulldorff spatial scan statistic, a validated method for cluster analysis. 33 To control for the background population, the estimated census-tract-level population was downloaded from the 2014 American Community Survey 5-year estimates. The high crime “Study Area” was established as the shared space of the highest risk for each violent crime type as identified by the Kulldorff spatial scan statistic.
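As a rough illustration of the quantity the spatial scan statistic maximizes, the sketch below computes the Poisson log-likelihood ratio for a single candidate zone. The function name and the example counts are hypothetical and are not taken from the study. SaTScan additionally scans many candidate windows and assesses significance by Monte Carlo replication, both of which are omitted here.

```python
import math

def poisson_scan_llr(cases_in, expected_in, cases_total):
    """Log-likelihood ratio for one candidate zone under a Poisson scan model.

    cases_in:    observed events inside the zone
    expected_in: events expected inside the zone given its population share
    cases_total: observed events in the whole study region
    """
    cases_out = cases_total - cases_in
    expected_out = cases_total - expected_in
    # Only zones with more events than expected count as high-risk clusters.
    if cases_in <= expected_in:
        return 0.0
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / expected_out)
    return llr

# Hypothetical zone holding 10% of the population but 30% of 1,000 events.
llr_example = poisson_scan_llr(cases_in=300, expected_in=100.0, cases_total=1000)
```

The zone with the largest ratio across all candidate windows is the most likely cluster; a zone with fewer events than expected scores zero when only high-risk clusters are of interest.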
Three disconnected Control Areas (Northeast, Southeast and Southwest Control Areas) were established in regions with no increased risk for any violent crime type. The Study Area and three Control Areas are collectively known as the “topsoil collection areas” for this study (Fig. 1a).

The three Control Areas were selected to cover a range of urban and suburban environments, to include differences in socioeconomic and sociodemographic populations. In addition, the three Control Areas were selected for their differing cardinal directions from the Study Area, to account for historical wind patterns that may have influenced topsoil lead contamination, which likely occurred via atmospheric diffusion during the leaded gasoline era.
Between 2012 and 2016, a total of 12,710 violent crime events occurred within the four topsoil collection areas. Of these violent crime events, 9038 (71.11%) occurred within the designated Study Area. The Southeast Control Area had the fewest violent crime events, accounting for 664 (5.22%) reported crimes. Descriptive statistics for crime data conditioned on the four topsoil collection areas are shown in Table 1.
Table 1. Federal Bureau of Investigation Designated Violent Crime Events From 2012 Through 2016, per Topsoil Collection Area
Table values are not standardized by population; table column “N (%)” represents the number and corresponding percentage of violent crimes per topsoil collection area; the remaining descriptive statistics were calculated using the violent crime events per census tract by topsoil collection area.
Max, maximum value; Med, median; Min, minimum; N, number of violent crime events; P25, 25th percentile; P75, 75th percentile.
Exposure assessment
Topsoil samples were collected from “edge zones” from May through July 2018. Edge zones, sometimes referred to as utility strips, are defined as the government owned land adjacent to roads. One-inch core topsoil samples, weighing ∼10 g each, were extracted using the ESS Lock N Load™ Soil Syringe produced by AMS Incorporated. Topsoil lead concentration (reported as mg/kg) was determined by Inductively Coupled Plasma Mass Spectrometry.
Edge zone soil samples
Multiple soil samples were collected from each census tract within each topsoil collection area. A random sample of streets in each collection area within each census tract was established for the collection of edge zone samples using the sample function in R, version 3.3.1. 34 An approximately equal number of samples were collected from each census tract within the respective topsoil collection area.
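The study drew its street sample with the sample function in R. A loosely analogous draw in Python might look like the sketch below. The tract identifiers, street names, and function name are hypothetical, and the fixed seed is an assumption added for reproducibility.

```python
import random

def sample_streets(streets_by_tract, n_per_tract, seed=0):
    """Draw the same number of streets, without replacement, from every tract."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated exactly
    return {
        tract: rng.sample(streets, min(n_per_tract, len(streets)))
        for tract, streets in streets_by_tract.items()
    }

# Hypothetical street lists; real lists would come from the tract shapefiles.
streets_by_tract = {
    "tract_A": ["Oak St", "Elm St", "Main St", "1st Ave", "2nd Ave"],
    "tract_B": ["Pine St", "Maple Ave", "River Rd", "Hill Rd"],
}
selected = sample_streets(streets_by_tract, n_per_tract=3)
```

Sampling the same number of streets per tract mirrors the stated goal of collecting an approximately equal number of samples from each census tract.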
A total of 300 edge zone samples were collected, 120 from the Study Area, and 60 from each Control Area. An increased number of soil samples from the Study Area was collected to better evaluate the topsoil lead concentration within the primary area of concern and to account for the anticipated higher variability in this region, relative to the Control Areas.
Dependent variable, primary predictor, and covariates
The dependent variable, “violent crime events,” included any violent crime that occurred in the Study Area and the Control Areas during 2012–2016. To evaluate the predictive effect of topsoil lead concentration on the incidence of violent crime, an interpolated mean topsoil lead concentration variable was created by Empirical Bayesian Kriging within Esri ArcGIS v.10.4. Among many spatial interpolation approaches, Empirical Bayesian Kriging permits accurate interpolation of spatially intensive data (e.g., soil lead contamination) and provides dependable diagnosis of the uncertainty of model predictions. 35
This method used the 300 topsoil lead samples collected from the edge zones to model the topsoil lead concentration at all locations within the four topsoil collection areas. These were averaged across all locations within each census tract to produce an estimate for the mean topsoil lead concentration of that census tract. The estimated mean topsoil lead concentration was then used as the primary predictor for the analysis.
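The aggregation step described above amounts to a grouped mean. The sketch below assumes the interpolated surface has already been exported as (tract, predicted value) pairs; the function name and the example values are hypothetical, and the kriging itself is not reproduced.

```python
from collections import defaultdict
from statistics import mean

def tract_means(predictions):
    """Average interpolated lead values (mg/kg) over the locations in each tract.

    predictions: iterable of (tract_id, predicted_lead_mg_per_kg) pairs
    """
    by_tract = defaultdict(list)
    for tract_id, lead in predictions:
        by_tract[tract_id].append(lead)
    return {tract_id: mean(values) for tract_id, values in by_tract.items()}

# Hypothetical interpolated values at five grid locations in two tracts.
grid = [("tract_A", 120.0), ("tract_A", 180.0), ("tract_A", 150.0),
        ("tract_B", 40.0), ("tract_B", 60.0)]
means = tract_means(grid)
```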
A review of the literature showed that important census-tract-level predictors of crime and/or socioeconomic deprivation include median household income, and percent of: female head of household; households living in poverty; households with dependents under 18 years of age; households receiving food stamps; black population; population with a bachelor's degree; and male population 15–24 years of age.36,37,38,39,40 Principal components (PCs) analysis was used to create a “census-tract features index” from these census-tract-level predictor variables, excluding the male population 15–24 years of age.
This variable was excluded from the index because recent literature indicates that the percentage of the male population between 15 and 24 years of age is, on its own, a significant predictor of violent crime;39,40 it was instead entered into the full model as a separate covariate.
The final census-tract features index included in the full model was constrained to the first three PCs, which explained 90% of the variance within the seven variables. The rotation scores of the retained PCs guided the interpretation of each component. The first PC (PC1), referred to as “neighborhood advantage,” was associated with high median household income, high percentage of bachelor's degrees, low percentage of black population, low percentage of female head of household, low percentage of households receiving food stamps, and low percentage of households living in poverty (all roughly equal in weight).
The majority of variance in PC2, referred to as “family neighborhoods,” was explained by high median household income and high percentage of households with children less than 18 years of age. The majority of the remaining variance in PC3 was explained by the percent of the population without a bachelor's degree, and it is referred to as “low college graduates.” Although there are other data dimensionality reduction methods, 41 PCs analysis was used for ease of comparison between different ecological studies.
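The 90% figure can be read through the usual eigenvalue identity for PCs analysis. As a sketch, assume the seven variables were standardized before decomposition (the text does not say) and write \lambda_j for the j-th eigenvalue of their correlation matrix (notation introduced here for illustration only). The share of variance retained by the first three PCs is then:

```latex
\frac{\lambda_1 + \lambda_2 + \lambda_3}{\sum_{j=1}^{7} \lambda_j} \approx 0.90,
\qquad
\sum_{j=1}^{7} \lambda_j = 7 \quad \text{(standardized variables)}
```

Under that assumption, the three retained eigenvalues would sum to roughly 6.3 of the 7 units of total variance.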
It is important to note that we did not include percent pre-1950 housing as an independent variable in our model because it was collinear with several other independent variables, including percent of female head of household, households living in poverty, and black population. Although pre-1950 housing is a known risk factor for environmental lead exposure, 42 percent households living in poverty and percent black population are also reliable predictors of childhood lead poisoning.43,44 This phenomenon is likely due to a history of redlining, which systematically divested from Black communities during the first half of the twentieth century. 45 The effects of this systematic racism are still felt in historically Black communities, especially among families living in poverty. 46
Statistical methods
To evaluate for spatial dependence, a spatially weighted Moran's I test was employed. Residual spatial dependence was identified among our topsoil samples. Therefore, the independence assumption of analysis of variance was not met. To evaluate topsoil lead concentration between the four topsoil collection areas, spatially weighted linear models were used. Model specification was determined by the Lagrange Multiplier test.47,48
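For readers unfamiliar with Moran's I, the sketch below computes the global statistic from a list of values and a spatial weights matrix. It is a minimal illustration with hypothetical data and a simple binary neighbor matrix, which may differ from the weighting used in the study. The significance test is not shown.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i with spatial weights w_ij (w_ii = 0)."""
    n = len(values)
    x_bar = sum(values) / n
    dev = [x - x_bar for x in values]
    w_sum = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of sites.
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_sum) * cross / sum(d * d for d in dev)

# Hypothetical transect of four sites; neighbors are adjacent sites.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = morans_i([10.0, 12.0, 30.0, 32.0], w)    # similar values sit together
alternating = morans_i([10.0, 30.0, 10.0, 30.0], w)  # neighbors differ
```

Positive values indicate that neighboring sites tend to have similar measurements and negative values indicate that neighbors tend to differ. Values near the expectation of -1/(n - 1) are consistent with spatial randomness.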
Based on the results from the Lagrange Multiplier test, a spatial error model was used to evaluate edge zone lead concentration between the four topsoil collection areas. The dependent variable “Soil Lead Concentration” was log transformed in the analysis, and regression coefficients (β) and confidence intervals (CI) were back-transformed using the exponential function to return effects to the original measurement scale.
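For reference, a spatial error model is commonly written as below. This is the generic textbook form rather than the authors' exact specification. The symbols are notation assumed here: y is the soil lead concentration, X holds the collection-area indicators, W is the spatial weights matrix, and \lambda is the spatial error coefficient.

```latex
\log(y) = X\beta + u, \qquad u = \lambda W u + \varepsilon, \qquad
\varepsilon \sim \mathcal{N}(0, \sigma^2 I)

% Back-transformation of an area coefficient \beta_k with log-scale CI [a_k, b_k]:
\text{fold difference} = e^{\beta_k}, \qquad
\text{95\% CI} = \left[ e^{a_k},\; e^{b_k} \right]
```

Because the outcome is on the log scale, an exponentiated coefficient is read as a ratio relative to the referent area. For example, a reported fold difference of 8.25 corresponds to a log-scale coefficient of ln(8.25) ≈ 2.11.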
A Bayesian Sparse Spatial Generalized Linear Mixed Model (SGLMM) was used to examine the statistical relationship between violent crimes and the estimated topsoil lead concentration per census-tract, while adjusting for the census-tract features index and spatial confounding. Review of the literature suggests that a Bayesian SGLMM outperforms traditional SGLMM, as well as restricted spatial regression. 49 The sparse SGLMM included an adjacency matrix, which accounted for the inherent spatial dependence within the four topsoil collection areas.
Spatial adjacency was determined by queen contiguity, which defined census tracts as neighbors in the presence of a shared border or corner. 50 A Poisson sparse SGLMM was used with an offset, defined as the 2014 American Community Survey 5-year estimates census-tract-level population data. This offset effectively transformed the response variable, violent crime events per census tract, into per capita rates of violent crimes.
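Queen contiguity can be sketched as follows, under the simplifying assumption that each tract is a polygon given as a list of vertices and that tracts sharing a border also share vertices (as in a topologically clean shapefile). The function name and the toy square "tracts" are hypothetical.

```python
def queen_adjacency(polygons):
    """Binary adjacency matrix: 1 when two polygons share at least one vertex.

    Sharing a single vertex (a corner) is enough under the queen criterion.
    """
    ids = sorted(polygons)
    vertex_sets = {pid: set(polygons[pid]) for pid in ids}
    matrix = [
        [int(a != b and bool(vertex_sets[a] & vertex_sets[b])) for b in ids]
        for a in ids
    ]
    return ids, matrix

def square(x, y):
    """Unit square with its lower-left corner at (x, y)."""
    return [(x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1)]

# Hypothetical tracts: a 2 x 2 block of squares plus one detached square.
tracts = {"SW": square(0, 0), "SE": square(1, 0),
          "NW": square(0, 1), "NE": square(1, 1),
          "FAR": square(5, 5)}
ids, adjacency = queen_adjacency(tracts)
```

In this toy layout, "SW" and "NE" touch only at a corner yet still count as neighbors, which is the difference between queen and rook contiguity. The detached square has no neighbors.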
To evaluate the predictive effect of the estimated mean topsoil lead concentration on violent crime rate, two models were used: an unadjusted model and a full model. Both models used the estimated mean topsoil lead concentration for each census tract as the primary predictor. The full model included the percentage of young males 15–24 years of age per census tract, as well as the census-tract features index to account for confounding between topsoil lead, violent crime, and other neighborhood features.
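In generic form, a Poisson model with a population offset and a spatial random effect can be written as below. This is a sketch of the model class, not the authors' exact specification, and all symbols are notation assumed here: y_i is the violent crime count in tract i, N_i is the tract population, Pb_i is the estimated mean topsoil lead concentration, z_i collects the adjustment variables (absent from the unadjusted model), and s_i is a spatially structured random effect built from the adjacency matrix.

```latex
y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \log N_i + \beta_0 + \beta_1\, \mathrm{Pb}_i
           + \mathbf{z}_i^{\top} \boldsymbol{\gamma} + s_i

% Moving the offset to the left-hand side shows the per capita interpretation:
\log\!\left( \frac{\mu_i}{N_i} \right) = \beta_0 + \beta_1\, \mathrm{Pb}_i
           + \mathbf{z}_i^{\top} \boldsymbol{\gamma} + s_i
```

The second line shows why the offset turns a model for counts into a model for per capita rates.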
Sub-analysis for both model types was performed to evaluate the effect of the interpolated mean topsoil lead concentration on each violent crime type: aggravated assault, robbery, forcible rape, and homicide. All analyses were performed using R statistical software, version 3.3.1.
This study was reviewed by the University of Louisville Institutional Review Board and was deemed to be non-human subjects research.
RESULTS
A higher median lead concentration was found in Study Area samples than in Control Area samples (Table 2). Of the 300 topsoil samples taken, 15 had lead levels exceeding 400 mg/kg, the threshold for soil safety in play areas, as regulated by the EPA's Final Rule. 51 Of these 15 samples, 14 were found in the Study Area. Therefore, of the 120 samples taken from the Study Area, ∼12% exceeded the EPA's guidelines for safety in play areas.
Table 2. Topsoil Lead Concentration by Collection Area
Table values reported in mg/kg.
n, number of samples.
Results from the spatial error model showed that the mean topsoil lead concentration was significantly higher in the Study Area compared with the mean topsoil lead concentration in the referent Southeast Control Area (Table 3). Model results estimated an 8.25-fold increase in Study Area topsoil lead concentration compared with the referent Southeast Control Area (Study Area exponentiated results: β: 8.25, 95% CI: 5.12–13.27). Model results also estimated a 2.43-fold increase in Northeast topsoil lead concentration compared with the referent Southeast Control Area (Northeast Area exponentiated results: β: 2.43, 95% CI: 1.55–3.81).
Table 3. Spatial Error Model Results for Topsoil Lead Concentrations by Collection Area
Table coefficients have been exponentiated back to their original measurement scale; Southeast Collection Area served as the model referent.
CI, confidence interval.
Results from the unadjusted SGLMM showed that for every 100-unit increase (lead coefficients from the SGLMM models were multiplied by 100 before exponentiation) in estimated mean topsoil lead concentration, there was a 1.62 (95% CI: 1.59–1.68) times increase in the relative risk of violent crime. Results from the full SGLMM showed that for every 100-unit increase in estimated mean topsoil lead concentration, there was a 1.05 (95% CI: 1.03–1.08) times increase in the relative risk of violent crime when controlling for the neighborhood characteristics (Table 4).
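The rescaling described above (multiplying the coefficient by 100 before exponentiation) can be illustrated with a short sketch. It assumes a log link, so that the relative risk for a k-unit increase is exp(k × β). The per-unit coefficient below is backed out from the reported unadjusted estimate for illustration and is not a value reported in the study; the function name is hypothetical.

```python
import math

def rr_per_increment(beta_per_unit, increment=100.0):
    """Relative risk for an `increment`-unit rise in the predictor (log link)."""
    return math.exp(beta_per_unit * increment)

# Per-mg/kg coefficient implied by the reported unadjusted relative risk of 1.62.
beta_unadjusted = math.log(1.62) / 100.0
rr_100 = rr_per_increment(beta_unadjusted)      # relative risk per 100 mg/kg
rr_1 = rr_per_increment(beta_unadjusted, 1.0)   # relative risk per single mg/kg
```

Reporting the effect per 100 mg/kg rather than per 1 mg/kg changes only the scale of presentation, since the per-unit relative risk sits very close to 1 and is harder to read.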
Table 4. Bayesian Sparse Spatial Generalized Linear Mixed Models for Crime Risk Using Lead Levels and Other Characteristics
RR are for 100 mg/kg increase in the topsoil lead levels; table results rounded to two decimal places.
CI, confidence interval; PC, principal component; RR, relative risk.
Results from the unadjusted SGLMM sub-analysis showed that every 100-unit increase in estimated mean topsoil lead concentration was associated with a 1.75 (95% CI: 1.63–1.89) times increase in the relative risk of homicide, a 1.63 (95% CI: 1.58–1.69) times increase in the relative risk of aggravated assault, a 1.61 (95% CI: 1.56–1.68) times increase in the relative risk of robbery, and a 1.55 (95% CI: 1.46–1.65) times increase in the relative risk of forcible rape.
Results from the full SGLMM showed that for demographically similar neighborhoods, every 100-unit increase in estimated mean topsoil lead concentration was associated with a 1.20 (95% CI: 1.07–1.34) times increase in the relative risk of homicide, a 1.12 (95% CI: 1.04–1.23) times increase in the relative risk of forcible rape, a 1.05 (95% CI: 1.01–1.10) times increase in the relative risk of robbery, and a 1.04 (95% CI: 1.01–1.07) times increase in the relative risk of aggravated assault (Table 4).
DISCUSSION
Prior ecological research has found that atmospheric3,52,53 and water lead 9 contamination may be important sources of lead exposure, as well as potential risk factors that underpin the biological mechanisms of the lead-crime hypothesis. This study represents a first ecological evaluation of the spatial association between topsoil lead contamination and the incidence of FBI designated violent crime. Our study found a non-random distribution of topsoil lead contamination across the four topsoil collection areas, and a relationship between topsoil lead concentration and the incidence of violent crime.
The inverse soil-lead concentration gradient from the urban-to-suburban environments as predicted by our models is congruent with prior research (Fig. 1b).54,55 This phenomenon is likely due to the relationship between population density and corresponding consumption of lead-based products during the first half of the twentieth century. The distribution of pre-1950 housing, a risk factor for remnant lead paint, 42 as well as the atmospheric diffusion of leaded automobile exhaust from the urban center, 56 may explain why the urban Study Area and nearby Northeast Control Area had respective 8.25- and 2.43-fold increases in topsoil lead levels compared with the suburban referent Southeast Control Area (Table 3) (Fig. 1c, d).
Although the true difference in topsoil lead contamination between urban and suburban environments is not completely understood, a recent meta-analysis 53 suggests that urban environments have a threefold increase in topsoil lead contamination compared to their suburban counterparts. This may explain why leaded house dust, a known pathway for lead exposure, was found to have a dose-response with historical patterns of traffic density. 57
This association is believed to be due to the leaded exhaust-soil-dust pathway, wherein soils contaminated during the leaded gasoline era are tracked into the house, where they then become a primary constituent of floor house dust.24,58 These findings underscore an ongoing environmental hazard between topsoil lead contamination and indoor lead exposure, especially among pediatric populations with age-related hand-to-mouth behaviors.
Although increased levels of topsoil lead contamination were geospatially correlated to the urban Study Area, it is important to acknowledge that research from other academic domains has identified numerous measures of societal disadvantage that are also correlated to urban environments.28,59 These aggregate measures, such as metrics of poverty and education, are also associated with violent crime.38,60 Our results found that edge zone topsoil lead contamination is a significant predictor of violent crime after adjusting for these aggregate measures, as well as for the inherent spatial dependency among the four topsoil collection areas.
The results of our study build on prior research that found significant associations between aggregate blood lead levels and violent crime.10,11 Although our study used a similar methodological approach to this research, we found that environmental topsoil lead contamination, a known risk factor for elevated blood lead levels, 61 predicted the incidence of violent crime. This is an important finding for several reasons.
First, it highlights a potential environmental agent in the etiology for violence that may be mitigated through primary prevention strategies. 62 Second, given the ubiquitous nature and increased bioavailability of topsoil lead in urban environments, 24 our findings suggest ongoing negative consequences for unabated contamination.
Our findings lend credence to prior environmental studies that found significant associations between atmospheric3,50,51 and municipal water 9 lead contamination, and trends in violent crime. Although air, water, and now topsoil lead exposure have been associated with violent crime, little is known about the additive effect of lead exposure within these environmental media on the incidence of violent crime. Future research should simultaneously measure exposure to lead within different environmental media to better specify the predictive effect of early life lead exposure on measures of crime and/or neuropsychiatric pathologies.
Our study has several strengths worth mentioning. In an effort to eliminate selection bias of topsoil collection locations, we followed a prespecified sampling protocol. For example, all topsoil samples came from predefined collection areas based on a spatial analysis of violent crime via the Kulldorff spatial scan statistic. An approximately equal number of topsoil samples was taken from each census tract, within each topsoil collection area. We also used randomly selected street locations within each census tract to prevent oversampling in neighborhoods with a high proportion of pre-1950 housing.
In addition, the Louisville Metro Police Department—Crime Information Center was able to provide the exact latitude and longitude coordinates of all reported violent crimes in Jefferson County. These data enhanced the precision of crime cluster specification via the Kulldorff spatial scan statistic, which allowed for larger control areas that were more socioeconomically heterogeneous.
The use of spatial error modeling and SGLMM allowed for evaluation of the spatial relationship between topsoil lead concentration and violent crime, while controlling for spatial autocorrelation. The use of these models improved regression inference compared with traditional modeling under an independence assumption, thereby reducing the chance of type I error. 63
This study is also subject to several limitations. It is possible that there could be differential reporting practices of the violent crime measure across the four topsoil collection areas. Although this study had access to all the known crime events documented by the Louisville Metro Police Department between 2012 and 2016, it is possible that some violent crimes may not have been reported to the police.
Approximately 65% of rapes and sexual assaults, and 7 million cases of child maltreatment, 64 including forms of physical and sexual assault, 65 go unreported each year in the United States. If these findings are generalizable to Jefferson County, and there are differences in reporting rates across census tracts, then the “true” geospatial distribution of FBI designated violent crime may be different from what was used to establish the Study and Control Areas.
In addition, although 300 edge zone topsoil samples were collected for this study, they may be a poor estimation of the actual topsoil lead burden in Jefferson County. This is an important consideration among pre-1950 housing, where topsoil lead contamination may be worse in soils more proximate to housing structures, as compared with lead contamination measured from the street curb (i.e., edge zones).
Therefore, the estimated topsoil lead concentration used for modeling purposes may underestimate topsoil lead contamination, especially in the older urban neighborhoods of Jefferson County.
Further, this study did not conduct a sensitivity analysis concerning the modifiable areal unit problem as described by Openshaw. 66 Although all analyses conducted in this study were performed at the census-tract-level, it is possible to aggregate data by a different areal unit. For example, census block groups or other grid-cell modeling could have been used instead of census tracts. 67
Although other areal units are available, census tracts were used throughout this study due to their ease of interpretation within county boundaries, shape consistency over time, and increased precision (i.e., smaller standard errors) relative to their constituent census block groups.68,69
Finally, although our full Bayesian SGLMM included eight variables associated with deprivation and/or violence, it is possible that residual confounding from other pertinent variables impacted the results. For example, we did not have access to aggregate measures of childhood or lifetime exposure to family and community violence.
However, if these unobserved confounders are also spatially correlated, some of their explanatory effects may be accounted for by controlling for spatial autocorrelation. Nevertheless, the results from our full Bayesian SGLMM, though statistically significant, should be interpreted with caution.
CONCLUSION
This study provides new evidence in support of the lead-crime hypothesis. Although there are likely other unknown risk factors for violence, results from this study found a spatial relationship between the estimated mean topsoil lead concentration, demographic measures of socioeconomic deprivation, and the incidence of FBI designated violent crime. These results suggest that lead contaminated topsoil may be an important modifiable risk factor for the genesis of violence.
Footnotes
AUTHORS' CONTRIBUTIONS
B.E.G.: Conceptualization (lead), investigation (lead), project administration (lead), and writing original draft (lead); K.B.B.: Methodology (equal), supervision (equal), and writing—review and editing (lead); S.D.B.: Formal analysis (supporting), methodology (equal), and writing—review and editing (equal); J.G.: Data curation (lead), formal analysis (lead), methodology (equal), and writing—review and editing (equal); H.Z.: Data curation (supporting), methodology (equal), and supervision (equal); K.M.Z.: Methodology (equal), supervision (equal), and writing—review and editing (equal).
AUTHOR DISCLOSURE STATEMENT
No competing financial interests exist.
FUNDING INFORMATION
No funding was received for this article.
