Abstract
Shiga toxin–producing Escherichia coli (STEC) infections are an important health burden for human populations in Ontario and worldwide. We assessed 452 STEC cases that were reported to Ontario's reportable disease surveillance system between 2015 and 2017. A retrospective scan statistic using a Poisson model was used to detect high-rate STEC clusters at the forward sortation area (FSA; the first three characters of a postal code) level. A significant spatial cluster in the southwest region of Ontario was identified. A case–case logistic regression analysis was applied to compare FSA-level socioeconomic and demographic characteristics between STEC cases inside the spatial cluster and cases outside of the cluster. Cases included in the spatial cluster had higher odds of living in FSAs with a low median family income, low proportion of lone-parent families, and low proportion of the visible minority population. In addition, STEC cases inside the cluster had higher odds of coming from rural FSAs. Our study demonstrated that STEC cases were spatially clustered in Ontario and that their clustering was associated with FSA-level socioeconomic and demographic determinants.
Introduction
Shiga toxin–producing Escherichia coli (STEC) infections are an important human health burden in Ontario, Canada, and worldwide. In the last decade, the emergence of non-O157 STEC serogroups and decline of the previously predominant O157 serogroup have been observed (Gould et al., 2013; Bruyand et al., 2018).
Clinical signs of STEC infections can vary from minor gastrointestinal distress to more severe outcomes, including the hemolytic uremic syndrome (HUS), end-stage renal disease, and death (Adams et al., 2019; Bruyand et al., 2019). Young children and older adults are more prone to develop severe clinical symptoms and their risk of contracting an STEC infection is enhanced by the low infectious dose of the bacteria (Bruyand et al., 2018; Pedersen et al., 2018).
Globally, STEC infections cause an estimated 2,801,000 acute illnesses each year, including 3,890 cases of HUS, 270 cases of end-stage renal disease, and 230 deaths (Majowicz et al., 2014). In Canada, among domestically acquired, foodborne STEC cases, an estimated 12,827 cases (39.47 per 100,000) of O157 STEC and 20,523 cases (63.15 per 100,000) of non-O157 STEC occur annually (Thomas et al., 2013).
Foodborne exposure, particularly consumption of STEC-contaminated beef products, is an important source of STEC infections in North America (Pires et al., 2019). However, several STEC infection outbreaks have recently been associated with the consumption of fresh vegetables (Herman et al., 2015; Luna-Guevara et al., 2019) and flour (Crowe et al., 2017). While consuming contaminated food poses an infection risk, other exposures such as contact with infected livestock and their contaminated environment in petting zoos (Conrad et al., 2017; Schlager et al., 2018) and farms (Friesema et al., 2011; Whitfield et al., 2017), contact with infected pets (Bentancor et al., 2012; Whitfield et al., 2017), and drinking contaminated water (Coleman et al., 2013) should also be considered.
Spatial scan statistics have been used in public health research in Canada to identify high-infection rate areas. These include salmonellosis in Ontario (Varga et al., 2015, 2020; Paphitis et al., 2020); Campylobacter infections in Manitoba (Green et al., 2006); Campylobacter, E. coli, Salmonella, and Shigella infections in New Brunswick (Valcour et al., 2016); STEC O157 infections in Alberta (Pearl et al., 2006); and notifiable gastrointestinal illnesses in the Northwest Territories (Pardhan-Ali et al., 2012).
The socioeconomic and demographic determinants of foodborne diseases are an area of increasing research interest worldwide (Simonsen et al., 2008; Varga et al., 2013). Previous studies reported high STEC infection rates in subpopulations with high socioeconomic status (SES) (Chang et al., 2009; Jalava et al., 2011; Whitney et al., 2015), while other studies have reported the opposite—lower infection rates among those with higher SES (Sakuma et al., 2006) or no relationship between SES and STEC infection rates (Simonsen et al., 2008; Pearl et al., 2009).
Our study uses an integrative, stepwise framework that combines STEC infection data obtained from Ontario's reportable disease surveillance system with auxiliary socioeconomic and demographic data obtained from the Canadian Census of Population. First, we applied a retrospective spatial scan statistic using a Poisson model to detect clustering of STEC cases at the level of the forward sortation area (FSA; represented by the first three characters of the postal code). Next, a case–case logistic regression analysis compared FSA-level socioeconomic and demographic factors between STEC cases that were included in spatial clusters and cases located outside of the clusters. Detecting spatial clusters with high STEC infection rates and identifying area-level socioeconomic risk factors will aid public health stakeholders in designing targeted prevention and control programs.
Materials and Methods
Study area and setting
Our study area was Ontario, the most populous province in Canada (Fig. 1), with a population of 13,875,394 people in 2016 (Statistics Canada, 2016a).

Fig. 1. Location of the study area.
The analysis was performed at the FSA level. FSAs are represented by the first three characters of the postal code. Cartographical boundary files and population estimates for each FSA were acquired from the 2016 Census of Population (Statistics Canada, 2016b, c). For our study, we included 504 FSAs, in which the 2016 population ranged from 1,090 to 111,370 with a mean of 26,673.
Case data
Case data on all reported STEC infections between January 1, 2015, and December 31, 2017, in Ontario were obtained from the Ontario Ministry of Health and Long-Term Care and extracted from the reportable disease database by Public Health Ontario staff on December 10, 2018. The data contained each STEC case's FSA and disease onset date. The data were deidentified and cleared of client name, exact address, and exact age before they were provided to the authors. Ethics clearance was obtained through the University of Waterloo Ethics Committee (ORE # 40133), which determined that informed consent was not needed.
Socioeconomic data
We used publicly available FSA-level socioeconomic and demographic data from the 2016 population census provided by Statistics Canada (Statistics Canada, 2016c).
Statistical analyses
Descriptive statistics
All STEC cases were geocoded and spatially joined to the FSA level. Incidence rates (IRs) per 100,000 persons were calculated by dividing the FSA-level total number of STEC cases for the whole study period by the FSA-level population estimates. A choropleth map was created to display the distribution of FSA-level IRs by using natural breaks to classify IRs into five categories.
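The rate calculation above can be illustrated with a minimal sketch. It assumes the denominator is the 2016 FSA population multiplied by the number of study years (person-years), which is one reading of the "3-year population estimates" mentioned in the figure legend and not a confirmed detail of the analysis. The function name and the example counts are hypothetical.

```python
def incidence_rate_per_100k(cases, population, years=3):
    """Mean annual incidence rate per 100,000 persons.

    Assumption: person-years are approximated as population * years.
    """
    if population <= 0 or years <= 0:
        raise ValueError("population and years must be positive")
    person_years = population * years
    return cases / person_years * 100_000


# Hypothetical FSA: 6 cases over the 3-year period, population 40,000
rate = incidence_rate_per_100k(cases=6, population=40_000)
print(round(rate, 2))
```

An FSA with no reported cases yields a rate of 0, which matches the lower bound of the reported IR range.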
Scan statistics
We conducted a retrospective scan statistic using SaTScan software, version 9.6 (Kulldorff and Information Management Services, Inc., 2018), to detect spatial clusters with high STEC infection rates. A Poisson model, a circular scanning window, and Monte Carlo hypothesis testing with 999 replications were used (Kulldorff, 1997). The geographical center (centroid) of each FSA represented the smallest spatial unit.
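To make the test statistic concrete, the sketch below shows how a single candidate window could be scored under a discrete Poisson model and how a rank-based Monte Carlo p-value is obtained. It is a simplified illustration and not the SaTScan implementation: it omits the search over circle centers and radii and the random data generation, and all function names are hypothetical.

```python
import math


def expected_cases(window_population, total_population, total_cases):
    """Expected cases in a window under the null of one uniform rate."""
    return total_cases * window_population / total_population


def poisson_llr(observed, expected, total_cases):
    """Log likelihood ratio of a window when scanning for high rates.

    Returns 0.0 when the window has no excess of cases.
    """
    if observed <= expected:
        return 0.0
    llr = observed * math.log(observed / expected)
    outside = total_cases - observed
    if outside > 0:
        llr += outside * math.log(outside / (total_cases - expected))
    return llr


def monte_carlo_p_value(observed_llr, simulated_max_llrs):
    """Rank of the observed statistic among observed + simulated values."""
    rank = 1 + sum(1 for s in simulated_max_llrs if s >= observed_llr)
    return rank / (len(simulated_max_llrs) + 1)
```

With 999 replications, the smallest attainable p-value is 1/1000 = 0.001, which occurs when the observed statistic exceeds every simulated maximum.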
The optimal scanning window size and the corresponding collection of nonoverlapping clusters were selected using the Gini coefficient: the window size that yielded the largest coefficient was chosen (Han et al., 2016; Kim and Jung, 2017).
We calculated the relative risk (RR) for each FSA belonging to a significant spatial cluster to avoid assuming that RRs were similar across all FSAs within a cluster (Desjardins et al., 2018; Varga et al., 2020). Statistically significant spatial high-rate STEC clusters and the RRs of FSAs included in significant clusters were mapped in ArcGIS 10.7.1 (ESRI, Inc., Redlands, CA).
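As an illustration of an area-specific RR, the following sketch uses the common scan-statistic definition: the ratio of observed to expected cases inside an area, divided by the same ratio outside it. Whether the FSA-level RRs in this study were computed exactly this way is an assumption; the function name and example values are hypothetical.

```python
import math


def relative_risk(observed, expected, total_cases):
    """RR = (observed / expected) inside, divided by the same ratio outside."""
    if expected <= 0 or expected >= total_cases:
        raise ValueError("expected must lie strictly between 0 and total_cases")
    inside = observed / expected
    outside = (total_cases - observed) / (total_cases - expected)
    if outside == 0:
        return math.inf  # every case falls inside the area
    return inside / outside


# Hypothetical area: 10 observed vs. 5 expected cases out of 100 in total
print(round(relative_risk(10, 5.0, 100), 2))
```

Under this definition, an FSA with no reported cases has an RR of 0 even when it lies inside a significant cluster.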
We described the number of cases in each FSA belonging to a significant cluster and evaluated whether these FSAs were from rural or urban areas, where rural areas are those with a “0” in the second character of the FSA (e.g., N0B) (Statistics Canada, 2016c).
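The rural/urban rule above is simple enough to express directly. The helper below is a minimal sketch with a hypothetical function name; it only checks the second character of a three-character FSA code.

```python
def is_rural_fsa(fsa):
    """Return True when the second character of the FSA code is '0'."""
    code = fsa.strip().upper()
    if len(code) != 3:
        raise ValueError("an FSA code has exactly three characters")
    return code[1] == "0"


for code in ("N0B", "N5A"):
    print(code, "rural" if is_rural_fsa(code) else "urban")
```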
Case–case logistic regression analysis
A database containing information on STEC case origin (inside or outside of the spatial cluster) and FSA-level SES and demographic variables was created in Excel for Microsoft 365 and subsequently imported into STATA Intercooled 14.2 statistical software for analysis.
To identify socioeconomic factors associated with the spatial clustering of STEC cases in Ontario, a case–case analysis (Varga et al., 2012; Pogreba-Brown et al., 2018) was conducted. A logistic regression model was built, in which the binary outcome (dependent) variable signified whether an STEC case was included (yes = 1) or not (no = 0) in the spatial cluster. Three demographic variables (rural/urban status, population density, and age categories) were considered a priori as confounding variables based on the results of the descriptive statistics and a previous Ontario study (Michel et al., 1999) and were kept in the final multivariable model regardless of their significance. The multivariable logistic regression model included the outcome variable, three demographic variables, and the FSA-level socioeconomic categorical variables.
For each socioeconomic variable, FSAs were divided into three equal groups—low, medium, and high—based on the variable's distribution across all FSAs. For categorical variables, the medium category was selected as the reference category.
The following FSA-level socioeconomic variables were included in the model. Average median family income was calculated by dividing the median income of census families (related members within households) by the total number of households. The proportion of lone-parent families was calculated by dividing the number of lone-parent families by the total population. A lone-parent family was defined as a family with only one parent, male or female, who has no spouse in the household and has at least one child under the age of 25. The proportion of the visible minority population was calculated by dividing the total number of visible minorities by the total population. Visible minority, according to the Employment Equity Act, describes a person, other than an Aboriginal person, who is non-Caucasian in race or nonwhite in color.
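The grouping of FSAs into low, medium, and high categories can be sketched as follows. The exact cut-point rule used in the analysis is not stated, so this example assumes tertile cut points from Python's statistics.quantiles with its default method; the function names and input values are hypothetical.

```python
from statistics import quantiles


def proportion(count, total_population):
    """Share of a population subgroup, e.g., lone-parent families."""
    return count / total_population


def tertile_labels(values):
    """Label each value low, medium, or high using tertile cut points."""
    lower_cut, upper_cut = quantiles(values, n=3)
    labels = []
    for value in values:
        if value <= lower_cut:
            labels.append("low")
        elif value <= upper_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


# Hypothetical FSA-level values of one socioeconomic variable
print(tertile_labels([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

Tied values at a cut point all fall into the lower group here, so the three groups can be slightly unequal in size on real data.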
For each variable, odds ratios (ORs) and their corresponding 95% confidence intervals were calculated. Variables with Wald's p < 0.05 were considered significant. For categorical variables, the OR represented the odds of a category of interest compared with the odds of the reference category, keeping all the other variables in the model constant. The overall model fit was evaluated with the Hosmer and Lemeshow goodness-of-fit test (Hosmer et al., 2013).
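For readers less familiar with how these quantities relate, the sketch below converts a logistic regression coefficient and its standard error into an OR, a Wald 95% confidence interval, and a two-sided Wald p-value. It is a generic illustration and not output from the study's STATA model; the function name and input values are hypothetical.

```python
import math


def odds_ratio_summary(beta, se, z_crit=1.96):
    """OR, Wald confidence interval, and two-sided Wald p-value."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    return {
        "or": math.exp(beta),
        "ci_lower": math.exp(beta - z_crit * se),
        "ci_upper": math.exp(beta + z_crit * se),
        # Two-sided normal tail probability: 2 * (1 - Phi(|z|))
        "p_value": math.erfc(abs(z) / math.sqrt(2)),
    }


# Hypothetical coefficient for a "low" category vs. the "medium" reference
summary = odds_ratio_summary(beta=0.693, se=0.2)
print({name: round(value, 3) for name, value in summary.items()})
```

A confidence interval that excludes 1 corresponds to a Wald p-value below 0.05 when the critical value is 1.96.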
Results
Descriptive statistics results
A total of 452 laboratory-confirmed STEC cases were reported between January 1, 2015, and December 31, 2017, in the province of Ontario. The number of cases in each FSA ranged from 0 to 32. The FSA-level mean STEC IRs ranged from 0 to 5.76 per 100,000 persons (Fig. 2). High-IR FSAs (IR >2.50) were detected in several southwestern and a few eastern Ontario regions.

Fig. 2. Distribution of STEC infection rates in Ontario's FSAs (n = 452), 2015–2017. Mean FSA-level incidence rates were calculated by dividing the FSA-level total number of cases for the 3-year study period by the 3-year population estimates and multiplying it by 100,000. FSAs from Northern Ontario are not included in this map because no cases were diagnosed from this region. FSA, forward sortation area; STEC, Shiga toxin–producing E. coli.
Scan statistic results
The highest Gini coefficient (0.19) was detected at a 4% scanning window size; therefore, we used this scanning window for our retrospective spatial scan statistic (Supplementary Fig. S1).
One significant high-rate spatial STEC cluster was identified (Table 1; Figs. 3 and 4). The spatial cluster included 104 STEC cases across 18 FSAs in the southwest region of Ontario and had an RR of 8.03.

Fig. 3. The number of STEC cases included in the spatial cluster. Results based on a retrospective, discrete Poisson model using the SaTScan™ software, scanning for STEC clusters with high rates. Spatial scale: centroid of an FSA. Circular scanning window size of 4% of the population at risk. STEC cases per FSA inside the spatial cluster are indicated in shapes, ranging from 0 to 32 cases. FSA, forward sortation area; STEC, Shiga toxin–producing E. coli.

Fig. 4. Distribution of relative risks of STEC infections among FSAs included in the spatial cluster. Results based on a retrospective, discrete Poisson model using the SaTScan™ software, scanning for STEC clusters with high rates. Spatial scale: centroid of an FSA. Circular scanning window size of 4% of the population at risk. Relative risk of STEC infections per FSA inside the spatial cluster is indicated in shapes, ranging from 0 to 15.7. FSA, forward sortation area; STEC, Shiga toxin–producing E. coli.
Table 1. The Spatial Cluster of Shiga Toxin–Producing Escherichia coli Infections in Ontario, Canada, 2015–2017 (footnotes a–c)
a. Retrospective spatial analysis, scanning for clusters with high rates, using a Poisson model with the SaTScan™ software.
b. Spatial scale: centroid of an FSA.
c. Circular scanning window size of 4% of the population at risk.
FSA, forward sortation area.
Of the 18 FSAs included in the cluster, 4 FSAs reported an RR of 0 and 14 had an RR >1, which ranged from 2.41 to 15.67 (Fig. 4).
Of the 104 STEC cases included in the significant spatial cluster, 78 cases (8.07 cases/100,000 persons) were from rural FSAs (32 cases in N0G, 11 cases in N0K, 11 cases in N0M, 10 cases in N0B, 8 cases in N0C, 6 cases in N0H, and 0 cases in L0N), while 26 cases (5.39 cases/100,000 persons) were from urban FSAs (6 cases in N5A; between 1 and 5 cases in each of N4K, N4W, N7A, L9Y, N3B, N4N, and N4X; and 0 cases in N2Z, N4L, and N4Z).
Case–case logistic regression analysis results
Table 2 shows the significant FSA-level socioeconomic factors.
Table 2. Results of the Case–Case Logistic Regression Analysis (n = 452 Cases from 504 Forward Sortation Areas) (footnotes a, b)
a. Outcome variable signified whether an STEC case was included (yes = 1) or not (no = 0) in the high-STEC infection rate cluster.
b. Rural/urban status, population density, and age categories were considered a priori as confounding variables and kept in the final multivariable model.
CI, confidence interval; OR, odds ratio; STEC, Shiga toxin–producing E. coli.
Cases included in the spatial cluster had higher odds of coming from FSAs with a low average median family income, low proportion of lone-parent families, and low proportion of visible minority populations, after accounting for FSA rurality, population density, and age distribution.
Discussion
In our study, the availability of detailed geospatial data made it possible to link, by location, an STEC disease surveillance dataset with auxiliary socioeconomic and demographic data. We used an integrative analytical approach, combining a retrospective scan statistic with case–case logistic regression analysis to identify high-STEC infection rate clusters, and uncovered area-level socioeconomic and demographic factors that were associated with STEC case inclusion in the spatial cluster.
Case–case analysis is useful for evaluating disease surveillance data retrospectively because, in contrast to case–control studies, it does not require enrollment of controls from the general population and therefore needs less time and fewer resources to complete. In addition, in a case–case analysis, cases obtained from the same disease surveillance system can be compared based on their specific characteristics, including their location (i.e., inside or outside of a spatial high-infection rate cluster) and their socioeconomic and demographic characteristics.
We used a spatial scan statistic, which is useful for investigating large disease surveillance datasets to identify clusters without any previous hypotheses on the location, time, or size of those clusters (Kulldorff et al., 2009). The spatial scan statistic was applied to identify areas with high infection rates where future prevention and control studies should focus or where future individual-level studies should be conducted to identify novel risk factors and decrease the burden of STEC infections (Desjardins et al., 2018; Varga et al., 2020). To identify purely spatial clusters, a scanning window size of 50% of the population at risk has been proposed as a default option (Kulldorff, 1997; Kulldorff et al., 2009). However, setting too large a scanning window size might identify a single large cluster instead of several smaller or medium-sized clusters. To overcome this issue, we used the Gini coefficient to select the optimal scanning window size (Han et al., 2016; Kim and Jung, 2017). In addition, the scan statistic might include areas with low infection rates inside a significant spatial cluster if they are located close to high-infection rate areas. To address this issue, we reported the number of cases and the RR for each FSA that was included in the spatial cluster to assist public health authorities in performing targeted local interventions (Tadesse et al., 2018; Varga et al., 2020).
In the choropleth map, an area in southwestern Ontario with high IRs was identified, which was later confirmed as a significant spatial cluster that included mainly rural areas. Living in a rural area has been identified previously as a risk factor for STEC infections (Michel et al., 1999; Byrne et al., 2015; Karmali, 2017; Elson et al., 2018; Klumb et al., 2020). Rural areas in southwestern Ontario have a high livestock farm density, and contact with infected livestock, especially cattle, with exposure to their contaminated environment has been described as a major source for STEC infections (Locking et al., 2001; Valcour et al., 2002; Jaros et al., 2013; Klumb et al., 2020). Rural areas in southwestern Ontario also have a large number of private wells and small, public groundwater systems, which are not regularly treated and tested for pathogens, and were described previously as a risk for waterborne STEC disease outbreaks (Coleman et al., 2013; Invik et al., 2019; Reynolds et al., 2020).
It is worthwhile to note that the spatial cluster detected in our study was earlier identified as an area of increased risk for waterborne STEC outbreaks (Krolik et al., 2016) and included the township of Walkerton, where a large STEC outbreak occurred in 2000 (Bruce–Grey–Owen Sound Health Unit, 2000; Hrudey et al., 2003). In addition, the spatial cluster consisted of an area with several lakes and rivers and agricultural lands that have been demonstrated to be important environmental sources for STEC infections (Valcour et al., 2002; Johnson et al., 2014; Byrne et al., 2015; Heiman et al., 2015; Brubacher et al., 2020). Exposure to the contaminated environment and contact with infected livestock species on farms (Whitfield et al., 2017; Mughini-Gras et al., 2018; Klumb et al., 2020) and petting zoos (Adams et al., 2016; Conrad et al., 2017) have increased in importance as STEC infection sources since improved meat food safety measures in slaughter plants reduced STEC contamination of beef products; consequently, beef-associated foodborne STEC outbreaks in Ontario decreased (Pollari et al., 2017).
Future studies are needed in southwestern Ontario to identify environmental and animal contact-related exposure sources. In addition, there is a need for additional disease prevention and health promotion programs because, despite the efforts of public health stakeholders, the rate of STEC infections in this region has not decreased over the past decade.
Socioeconomic and demographic factors have been described previously as important risk factors for STEC infections. In our study, the odds of inclusion of STEC cases in the spatial cluster were higher if they originated from areas with low median income. This finding agrees with some previous studies. In Japan, lower average income and a higher number of beef cattle per individual in an area were associated with increased STEC infections (Sakuma et al., 2006), and in Finland, the proportion of low-income households with children was a risk factor for high STEC infection incidence (Jalava et al., 2011). On the other hand, several studies have found the opposite. In Connecticut, United States, high STEC infection rates were observed in census tracts with a low proportion of people living below the poverty level (Whitney et al., 2015). Furthermore, a large U.S. study evaluated surveillance data from the Foodborne Diseases Active Surveillance Network (FoodNet) sites and found an increased risk of STEC infections in areas with low poverty rates (Hadler et al., 2018), and in England, high SES (i.e., least deprived) groups had higher odds of STEC infections than low SES groups (Adams et al., 2019).
Cases included in the spatial cluster were more likely to originate from areas with a low proportion of visible minority populations. This result could be explained by the rural location of the spatial cluster, knowing that visible minority populations concentrate in large urban centers (Wang and Hu, 2013). Note also that “visible minority” does not capture other dimensions of ethnocultural or linguistic diversity, which might be important.
Increased foodborne infection rates in low SES populations, and among specific racial/ethnic groups, could be explained by lower quality foods available in local retail food markets (Koro et al., 2010).
Several limitations should be considered when interpreting our findings. Enteric disease surveillance is known to underestimate the true population-level disease IRs, and young children and older adults are overrepresented in these systems; therefore, generalization of our study findings to the whole Ontario population should be done with caution (Majowicz et al., 2005; MacDougall et al., 2008). In addition, analyzing risk factors at a higher geographic level might generate different findings because of the modifiable areal unit problem (Weisent et al., 2011). We believe that by using a sufficiently small geographic area (i.e., FSA), we reduced the impact of these problems.
Conclusions
STEC cases originating from areas with low family income and a low proportion of visible minority populations had higher odds of being included in the high-rate spatial cluster. Areas and socioeconomic groups identified in our study should be targeted with disease prevention and control programs to decrease the health impact of STEC infections. Our methodology can be integrated into public health practice and applied to other enteric disease outbreak investigations in Ontario and across North America.
Footnotes
Disclosure Statement
No competing financial interests exist.
Funding Information
Csaba Varga was supported by a start-up fund of the Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign. The funders were not involved in the study design; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
Supplementary Material
Supplementary Figure S1
