Abstract
Background:
Recreating in waterbodies impacted by combined sewer overflows (CSOs) can present health risks due to exposure to microbial pathogens. This study compares population characteristics of those living within walking distance to CSO-impacted versus nonimpacted waters in Philadelphia to determine whether these populations differ by race, ethnicity, and sociodemographic characteristics.
Methods:
Adults recreating at or near natural water bodies in Philadelphia completed a questionnaire that assessed the average walking distance to each site. Walking distance boundaries informed by questionnaire responses were created around each waterbody in Philadelphia, and population-level census data corresponding with block groups included within each buffer were used to characterize those living near CSO-impacted and nonimpacted waterbodies.
Results:
Compared with populations residing in census block groups within walking distance to a nonimpacted waterway, populations living within the same distance to a CSO-impacted waterway were more likely to comprise Hispanic residents (standardized adjusted prevalence ratio [APR] = 1.13) and those living in poverty (APR = 1.21) and less likely to comprise White residents (APR = 0.76).
Conclusion:
These findings suggest that communities of color and those experiencing poverty are disproportionally impacted by environmental hazards.
INTRODUCTION
Exposure to natural waterways or “blue spaces” may have health benefits, particularly for urban populations, by improving mental health, promoting physical activity, and enhancing overall quality of life. 1 However, direct contact with contaminated waterways through activities such as swimming, fishing, and wading can expose individuals to waterborne pathogens that can result in illness. 2 Previous research estimated that ∼4 billion surface water recreation events (e.g., swimming, paddling, boating, and fishing) occur annually in the United States, resulting in ∼90 million illnesses. 3
The risk of developing illness after recreational activities is impacted by the concentration of waterborne pathogens in the environment. Although waterborne pathogens may be introduced into natural waterbodies through multiple pathways including urban surface water runoff, the concentration of pathogens in water bodies impacted by combined sewer overflows (CSOs) can be particularly high. 4 In fact, a previous study measured 10-fold higher concentrations of human sewage markers in waterbodies impacted by CSOs than in waterbodies impacted by rainfall without a CSO. 5
Previous research has investigated relationships between CSOs and gastrointestinal illness, using residential proximity as a proxy of exposure. These studies found that populations living near CSO-impacted waterbodies had an increased risk of emergency department visits for gastrointestinal illness after rain events in Massachusetts 6 and that children living within 500 m of a combined sewer outfall had an increased risk of an emergency department visit within 7 days of a sewer overflow in Cincinnati. 7
Although these studies have established relationships between exposure to CSO-impacted water and increased risk of gastrointestinal illnesses, current research is lacking on whether certain demographic groups may be more likely to be exposed to CSO-impacted waterbodies and subsequently at higher risk of waterborne disease.
Geographic information systems (GIS) methods have been used in epidemiological studies to measure inequitable exposure to environmental hazards, such as to landfills and highways. 8 These studies often find that minority or low-income communities are more likely to reside near hazards, which can negatively impact health. 9 These methods have also been used to investigate how residential proximity to community assets, such as parks and urban green space, may promote healthy behaviors, such as physical activity, which can in turn reduce obesity and improve mental health. 10 , 11
Although past research has shown that environmental hazards are often not equitably distributed across communities, there has been little research on the distribution of CSO-impacted versus nonimpacted water bodies. To fill this gap, this study asks the following research question: do populations living near CSO-impacted waters differ from those living near nonimpacted waters in Philadelphia by race, ethnicity, and socioeconomic variables? We hypothesize that populations living in census block groups or tracts located near CSO-impacted and nonimpacted water bodies will differ with respect to race, ethnicity, and socioeconomic status.
METHODS
Estimated acceptable travel distance to sites for recreation
To estimate acceptable travel distances to natural waterbodies in Philadelphia, interviewers administered a questionnaire to adults (n = 56) recreating (i.e., making some contact with the water through swimming, fishing, playing with a dog, or wading) at three sites along Cobbs Creek, Tacony Creek, and the Delaware River in Philadelphia. Sites chosen were CSO-impacted areas where recreation is known to occur (based on observations and a previous hidden camera study 12 ). Questionnaires were administered between June and August 2019 during at least 1 day per week at each site (total of 39 days in 2019). Questionnaires asked about travel method (car, public transit train/bus, or walking), how long it took them to travel there, and the approximate distance they traveled.
All individuals who appeared eligible were approached for recruitment by a member of the research team. Individuals were eligible to complete the questionnaire if they were recreating at or near the study sites and were 18 years of age or older. Consent was obtained verbally by a member of the research team. All questions were read aloud and responses were recorded on site using paper and pencil. Questionnaires did not collect any identifying information. Researchers administering the questionnaire received training on the ethical conduct of human subjects research (Collaborative Institutional Training Initiative training). The data collection protocol and measures were approved by Temple University's Institutional Review Board (Protocol No. 25159).
GIS methods
Using data reported in the questionnaires, we computed the mean, 5th, and 95th percentile travel distances to each site. To determine the distance participants walked to the sites when only travel time was reported (or when distance was reported but not plausible, e.g., one participant reported walking 20 km in 25 minutes), walking time was converted to distance based on an average pedestrian walking speed of 1.42 m/s or 0.0852 km/min. 13 The mean, 5th percentile, and 95th percentile distances participants reported walking to the site were all used in the analysis.
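As a minimal sketch of the time-to-distance conversion described above, the following Python function applies the 1.42 m/s walking speed to a reported travel time. The function name and the `max_plausible_ratio` cutoff used to flag implausible reported distances are hypothetical; the text does not state the rule used to judge plausibility.

```python
WALKING_SPEED_M_PER_S = 1.42  # average pedestrian walking speed cited in the text
WALKING_SPEED_KM_PER_MIN = WALKING_SPEED_M_PER_S * 60 / 1000  # about 0.0852 km/min


def walking_distance_km(minutes, reported_km=None, max_plausible_ratio=2.0):
    """Return a walking distance in km for one questionnaire response.

    A reported distance is kept only when it is present and does not exceed
    the time-based distance by more than `max_plausible_ratio` (an assumed,
    illustrative cutoff). Otherwise the walking time is converted to distance.
    """
    time_based_km = minutes * WALKING_SPEED_KM_PER_MIN
    if reported_km is not None and reported_km <= max_plausible_ratio * time_based_km:
        return reported_km
    return time_based_km
```

Under these assumptions, the response mentioned in the text (20 km reported for a 25-minute walk) would be replaced by the time-based value of roughly 2.1 km.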
Potential recreation sites in Philadelphia were defined as public park spaces that contained or were next to a natural waterbody. Sites were identified using data on the locations of public parks, 14 natural waterbodies, 15 and locations of combined sewer outfalls in Philadelphia (which were used to define areas downstream of a combined sewer outfall as CSO impacted). 16 Road network buffers were created around each site using 2018 ArcGIS StreetMap data and using street junctions within 100 m of each park as the entry point for each waterbody. 17 All GIS analyses were conducted in ArcMap version 10.7.1.
Outcome variables
Shapefiles for U.S. census block groups and tracts included were joined with the shapefile containing the road network travel distance buffers to identify census block groups and/or tracts included within the road network travel buffers of CSO-impacted and nonimpacted water bodies. 18 Census block groups and/or tracts included in road network buffers around both CSO-impacted and nonimpacted sites were considered CSO impacted for the analysis.
Exposure variables
The most recent (2017) American Community Survey (ACS) 5-year estimates were retrieved at the census block group level, when available. 19 These data include total population, percentage per racial group (categorized into “White,” “Black,” and “Other,” where the “Other” category included all race categories that included <10% of the total population), percentage who are Hispanic/Latinx, percentage with male/female gender identities, median age, and median household income. Other variables were only available at the census tract level, including the percentage living below the poverty level in the past 12 months and the Gini index of income inequality. The Gini index is a relative measure of income inequality, where higher scores indicate more inequality. 20
U.S. census data were joined to the outcome using two different methods: (1) proportional split and (2) modified Poisson regression. The “proportional split” method assigned proportions of each block group/tract to the CSO-impacted or nonimpacted outcome category based on the proportion of the geographic unit included within each travel distance buffer. ACS estimates for continuous variables (e.g., median age and median income) or the total numbers in each category for categorical variables (e.g., gender and race) were multiplied by the proportion of the block group/tract included within each travel distance buffer. 21
Data (including ACS estimates and margins of errors) were aggregated to get combined population characteristics. This method allows for individual block groups or tracts to contribute some proportion to both CSO-impacted and nonimpacted outcome categories (i.e., if part of the area falls within the CSO-impacted and another part falls within the nonimpacted buffer).
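The proportional split for count variables can be illustrated with a short Python sketch. The dictionary keys and the example values are hypothetical. Scaling each margin of error (MOE) by the same area share and combining MOEs by root sum of squares is an assumption based on the usual Census Bureau approximation for summed estimates; the text does not specify how MOEs were apportioned.

```python
from math import sqrt


def proportional_split(block_groups):
    """Aggregate area-weighted estimates and MOEs for each outcome category.

    Each block group is a dict with an ACS count ("estimate"), its margin of
    error ("moe"), and the fraction of its area inside the CSO-impacted
    ("share_cso") and nonimpacted ("share_non") buffers.
    """
    totals = {"cso": 0.0, "non": 0.0}
    moe_sq = {"cso": 0.0, "non": 0.0}
    for bg in block_groups:
        for category in ("cso", "non"):
            share = bg["share_" + category]
            totals[category] += bg["estimate"] * share
            # Assumed rule: scale the MOE by the area share, then combine
            # across units by root sum of squares.
            moe_sq[category] += (bg["moe"] * share) ** 2
    return {c: (totals[c], sqrt(moe_sq[c])) for c in totals}


# Hypothetical block groups: the first lies partly in both buffers.
example = [
    {"estimate": 1000, "moe": 120, "share_cso": 0.75, "share_non": 0.25},
    {"estimate": 400, "moe": 80, "share_cso": 0.0, "share_non": 0.5},
]
result = proportional_split(example)
```

Here the first block group contributes to both categories, which mirrors how a single unit can be divided between the CSO-impacted and nonimpacted buffers.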
The second method uses modified Poisson regression to examine independent associations between demographic or economic variables and whether block groups (or tracts) had >50% of their area within the reported travel distance buffer to CSO-impacted waters or nonimpacted waters. In this method, block groups or tracts with ≤50% of their area overlapping the buffers were excluded.
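The area threshold used for the regression outcome, together with the earlier rule that units falling in both buffer types count as CSO impacted, might be expressed as follows. This is a sketch only: the function name and the representation of overlap as two area fractions are assumptions.

```python
def classify_unit(share_cso, share_non, threshold=0.5):
    """Assign a block group or tract to an outcome category.

    `share_cso` and `share_non` are the fractions of the unit's area inside
    the CSO-impacted and nonimpacted buffers. The CSO check comes first, so a
    unit exceeding the threshold for both buffers is treated as CSO impacted.
    Units at or below the threshold for both are excluded (None).
    """
    if share_cso > threshold:
        return "cso"
    if share_non > threshold:
        return "non"
    return None
```

Passing a different `threshold` corresponds to the kind of sensitivity analysis on the classification cutoff that the study reports in its supplementary material.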
Exclusion criteria and data management
Of the 1336 census block groups in Philadelphia, 8 were nonresidential (total population = 0 in ACS data) and were removed from the analysis. For some census block groups, median income (n = 137) or Gini index (n = 1) data were missing (for 10.3% of census block groups). For this reason, multiple imputation (MI) was done using the other ACS variables in the model as well as ACS variables expected to be predictive of median income (i.e., percentage with different education levels, percentage of the civilian population that is unemployed, percentage vacant homes, median home value, and percentage of the population receiving SNAP/food stamps). MI was performed using the “mice” package in R. All analyses were conducted using pooled results from these imputed data sets to account for missing data.
Statistical analysis
All statistical analyses were done in R version 3.5.2. To aggregate census block group or tract data across buffer areas (proportional split method), the “tidycensus” package in R was used. 22 Comparisons were made between the two combined areas (CSO impacted and nonimpacted) using z-tests. 23
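For readers unfamiliar with comparing two ACS estimates, the sketch below shows a standard two-sided z-test on independent estimates. It assumes the published ACS margins of error are at the 90% confidence level (hence the 1.645 divisor) and that the two areas can be treated as independent; the function names are illustrative, and the analysis in this study was run in R rather than Python.

```python
from math import erf, sqrt


def moe_to_se(moe, z_crit=1.645):
    """Convert a 90% ACS margin of error to a standard error."""
    return moe / z_crit


def z_test(est1, se1, est2, se2):
    """Two-sided z-test for the difference between two independent estimates."""
    z = (est1 - est2) / sqrt(se1 ** 2 + se2 ** 2)
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value
```

A call such as `z_test(est_cso, moe_to_se(moe_cso), est_non, moe_to_se(moe_non))` would return the test statistic and p-value for one variable, where the four inputs are the aggregated values for the two combined areas.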
Modified Poisson regression models with robust error variance using the sandwich estimator 24 were used to determine which variables were independently associated with census block groups (or tracts) having >50% of their area within the reported travel distance buffer to CSO-impacted waters. For adjusted models, Poisson regression with generalized estimating equations (GEE) was used to adjust for clustering at the census tract level. 25 Because the race variables (“White,” “Black,” or “Other” race) were collinear and could not be included together in the same model, the adjusted model included only the percentage White variable to account for race.
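To make the modified Poisson approach concrete, the following pure-Python sketch fits a log-link Poisson model with a single predictor to a binary outcome and computes a sandwich (robust) standard error. It is deliberately simplified: it handles one predictor only, uses a fixed number of Newton-Raphson steps, and does not implement the GEE adjustment for clustering within census tracts. The function name and example data are hypothetical.

```python
from math import exp, sqrt


def modified_poisson(x, y, iterations=50):
    """Fit log E[y] = b0 + b1 * x and return (prevalence ratio, robust SE of b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        mu = [exp(b0 + b1 * xi) for xi in x]
        # Score vector of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix A = X' diag(mu) X
        a00 = sum(mu)
        a01 = sum(xi * mi for xi, mi in zip(x, mu))
        a11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = a00 * a11 - a01 * a01
        # Newton-Raphson update: b <- b + A^{-1} g
        b0 += (a11 * g0 - a01 * g1) / det
        b1 += (a00 * g1 - a01 * g0) / det

    # Recompute A and the "meat" B = X' diag((y - mu)^2) X at the final estimates
    mu = [exp(b0 + b1 * xi) for xi in x]
    a00 = sum(mu)
    a01 = sum(xi * mi for xi, mi in zip(x, mu))
    a11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
    det = a00 * a11 - a01 * a01
    r2 = [(yi - mi) ** 2 for yi, mi in zip(y, mu)]
    m00 = sum(r2)
    m01 = sum(xi * ri for xi, ri in zip(x, r2))
    m11 = sum(xi * xi * ri for xi, ri in zip(x, r2))
    # Slope element of the sandwich covariance A^{-1} B A^{-1}
    var_b1 = (a01 ** 2 * m00 - 2 * a01 * a00 * m01 + a00 ** 2 * m11) / det ** 2
    return exp(b1), sqrt(var_b1)


# Hypothetical data: x = 1 marks a higher-poverty unit, y = 1 marks a unit
# classified as CSO impacted.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 0, 0, 0, 1, 1, 0, 0]
pr, robust_se = modified_poisson(x, y)
```

With a binary predictor, the fitted prevalence ratio equals the ratio of the two group prevalences (here 0.50 / 0.25), which offers a quick check on the fit.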
Sensitivity analyses
Sensitivity analyses were conducted to determine the impact of classifying areas within both CSO-impacted and nonimpacted travel distance buffers as CSO impacted and to determine whether the threshold used to classify an area as CSO impacted or nonimpacted (i.e., percentage of the areal unit within the reported travel distance buffer) influenced results. Results of these analyses and detailed methods are reported in the Supplementary Data.
RESULTS
Travel distance to recreation sites
A total of 56 questionnaires were completed by adults recreating at or near the study sites during the 2018 and 2019 summer seasons (June–August): 35 provided information on the method used to travel to the site (15 reported walking [45%], 7 reported biking [21%], 18 reported driving [55%], and 1 reported taking a rideshare service to the site [3%]). If applicable, participants were also asked about their travel methods/distances to other water bodies where they recreate in Philadelphia. A total of 21 (36%) reported going to other sites in the city. Among these, eight reported walking (38%), five reported biking (24%), and eight reported driving to the site (38%).
Based on these responses, walking and driving were the two most popular forms of transportation used. However, a large proportion of those recreating at the sites were children who were there without an adult present (and, therefore, could not complete a questionnaire). For this reason, and because walking is the most accessible mode of transportation across age groups, 26 , 27 road network travel buffers were computed using the travel time and travel distance participants reported walking to the sites. After removing one outlier, the distance participants reported walking was approximately normally distributed with a mean value of 1.4 km (95% confidence interval: 800 m–1.6 km).
GIS analysis results
A map displaying network buffers created around CSO-impacted and nonimpacted areas using the mean, 5th, and 95th percentile values for distances participants reported walking to the sites is displayed in Figure 1. Figure 1 also displays sites where questionnaire data were collected.

Figure 1. Map of buffer areas around CSO-impacted and nonimpacted waterbodies in Philadelphia. CSO, combined sewer overflow.
Table 1 displays demographic and economic characteristics of populations living within the 800 m, 1.4 km, and 1.6 km CSO-impacted or nonimpacted buffers using the “proportional split” method. Using the estimates and standard errors reported in the ACS data, areas within the CSO-impacted buffers were more densely populated, had a higher percentage Black population, a higher percentage of the population in “Other” race categories, and a higher percentage Hispanic/Latinx population than those in the nonimpacted buffers. In addition, areas within the CSO-impacted buffers had a lower median age, a higher percentage of their population living below the poverty level, a lower median household income, and more neighborhood income inequality (Table 1). These results were statistically significant (p < 0.05).
Table 1. 2017 American Community Survey 5-Year Estimates and Standard Errors for Areas Located Within the Combined Sewer Overflow-Impacted and Nonimpacted Travel Distance Buffers Using the Proportional Split Method
All variables statistically differ between CSO-impacted and nonimpacted buffers across all road network distances considered. p < 0.05.
Includes racial groups that represent <10% of the population (Asian, American Indian/Alaskan Native, Native Hawaiian/Pacific Islander, or some Other race).
Data only available at the census tract level.
CSO, Combined Sewer Overflow; Est, estimate; SE, standard error.
Results of the unadjusted modified Poisson regression models are reported in Table 2. Compared with census block groups/tracts with >50% of their area within the nonimpacted buffers, those with >50% of their area within the CSO-impacted buffers had higher population densities, a lower percentage White population, a higher percentage Black or “Other” race population, a higher percentage Hispanic/Latinx population, a lower median age, a higher percentage of the population living below the poverty level, a lower median household income, and a higher measure of income inequality (Table 2, p < 0.05).
Table 2. Results of Unadjusted Models Comparing Census Block Groups That Had >50% of Their Area Within the Reported Travel Distance Buffer to Combined Sewer Overflow-Impacted Waters with Those That Had >50% of Their Area Within the Reported Travel Distance Buffer to Nonimpacted Waters
All estimates are standardized.
Includes racial groups that represent <10% of the population (Asian, American Indian/Alaskan Native, Native Hawaiian/Pacific Islander, or some Other race).
Data only available at the census tract level.
CI, confidence interval; PR, prevalence ratio.
Results of adjusted modified Poisson regression models that accounted for clustering at the census tract level using GEE are summarized in Table 3. There were statistically significant differences across all buffer sizes for the population density, percentage White, percentage Hispanic/Latinx, and the percentage living below the poverty line. Differences for median age and Gini index were not statistically significant for any buffer size. Finally, for the median income variable, statistically significant differences were observed for the 800 m buffer area (adjusted prevalence ratio = 1.33) but not for the 1.4 or 1.6 km buffer areas (Table 3).
Table 3. Results of Adjusted Models Comparing Census Block Groups with >50% of Their Area Within the Travel Distance Buffer to Combined Sewer Overflow-Impacted Waters with Those with >50% of Their Area Within the Reported Travel Distance Buffer to Nonimpacted Waters
All estimates are standardized.
Data only available at the census tract level.
APR, adjusted prevalence ratio.
DISCUSSION
Compared with census block groups/tracts with >50% of their area within the reported walking distance of a nonimpacted waterbody, census block groups/tracts with >50% of their area within the same distances of a CSO-impacted waterbody comprised more minority residents (non-White and/or Hispanic/Latinx) and more residents living below the poverty line, revealing a potential environmental justice issue. Findings are consistent with other studies that have demonstrated that environmental hazards are more likely to be in or near minority or low-income communities 28 , 29 , 30 , 31 and with studies that have found demographic differences in residential proximity to natural waterways. 32 , 33
This study highlights a potential inequity in the distribution of CSO-impacted water bodies in Philadelphia and demonstrates the importance of considering park quality when assessing the relationships between urban parks, green space, or blue spaces and positive health outcomes. 34 , 35 , 36 Many studies have reported inequitable distribution of urban parks and green spaces across communities. 37 , 38 However, few have considered local differences in the safety, quality, size, infrastructure, and other more local factors that may alter the effect that these spaces have on health.
In this study, both populations that were compared were within walking distance to an urban park and green space; however, there were differences in the water quality available at these sites due to impact of upstream CSOs. Although recreation (swimming or wading) in clean water bodies may provide health benefits, 39 recreation in contaminated water can also increase exposure to waterborne pathogens that can lead to illness. 40 For this reason, these findings may suggest a need for future research to consider quality of urban parks and green spaces when measuring their impact on public health. 41
An important assumption of this analysis is that residential proximity increases the likelihood of being exposed to natural waterbodies. However, the demographic characteristics of those who are exposed to these waterbodies may differ from that of the surrounding community. Past research has found that local social, demographic, and environmental factors may influence the relationships between individuals and their local waterways. 42 For example, one study found that while populations with higher socioeconomic status live further from waterways, they tend to spend more time at the waterway than populations that lived closer. 43
Therefore, it is possible that the population that is exposed to the water bodies differs from the population living within an acceptable travel distance, resulting in a potential misclassification bias. Because there is no evidence to suggest that the subset of the surrounding population that recreates would differ between those living near CSO-impacted versus nonimpacted sites, this misclassification bias is likely to be nondifferential.
Findings of this analysis align with previous investigations in New York City, 44 Indianapolis, 45 and the Mystic River Watershed. 46 These studies also found that active CSOs disproportionally affect lower income and higher minority populations within their watersheds. Although these studies support the findings of this analysis, a study done in Atlanta noted that although CSO outfalls are located primarily in neighborhoods with high poverty, the measured association between CSOs and emergency department visits for gastrointestinal illness was stronger in areas with lower poverty rates than in areas with higher poverty rates. 47
Importantly, these researchers note that those living in areas with lower poverty rates may be more likely to seek care when experiencing acute gastrointestinal illness; therefore, illnesses among those living in areas with higher poverty rates may go undiagnosed and be under-represented in that analysis. 48 , 49
One potential limitation of this study is that questionnaire data were collected from a convenience sample of individuals ≥18 years of age to define an acceptable travel distance. In addition, data could not be collected from children without their guardians present. Previous studies have reported a wide range of distances that individuals are willing to walk to a park, from 0.5 to 5 km. 50 , 51 , 52 Our travel buffers are in line with these studies, and a strength is that we collected local data at three CSO-impacted sites in Philadelphia where recreation is known to occur.
Furthermore, the results of our sensitivity analyses show that our findings are robust to changes in the methods used to classify the areas assigned to each outcome group (e.g., within walking distance of a CSO-impacted vs. nonimpacted waterway). More research is needed to determine whether these results are generalizable to other cities impacted by combined sewer systems.
CONCLUSIONS
Findings from this study show that within Philadelphia, neighborhoods within reported walking distances to CSO-impacted waterbodies were more likely to comprise minority and lower income populations than areas within reported walking distances to nonimpacted waterbodies. These results suggest that there is a potential inequity in the distribution of CSOs in Philadelphia and that individuals located near these water bodies may be more likely to be exposed to waterborne pathogens through recreation and thus may be experiencing a higher burden of waterborne disease. This study provides further evidence for the need to accelerate equitable efforts to reduce the health impacts of CSOs and improve the health of underserved and vulnerable populations who reside in Philadelphia and other cities. 53 , 54
Footnotes
ACKNOWLEDGMENTS
This study was supported by funding provided by the Temple University Graduate School and the Temple University College of Public Health, Philadelphia, PA.
AUTHORS' CONTRIBUTIONS
S.M.M. contributed to conceptualization, methodology, software, formal analysis, investigation, writing—original draft, and visualization. A.E.R. and D.L.C. were involved in conceptualization, methodology, and writing—review and editing. H.M.M. carried out conceptualization, methodology, writing—review and editing, resources, and supervision.
AUTHOR DISCLOSURE STATEMENT
The authors declare no conflicts of interest or competing interests.
FUNDING INFORMATION
This study was supported in part by funding provided by the Temple University Graduate School, the Temple University College of Public Health, Philadelphia, PA, and Drexel's Academy of Natural Sciences through funding from the William Penn Foundation.
