Abstract
Pulsed-field gel electrophoresis (PFGE) profiles for Escherichia coli O157:H7 in Canada lack pattern diversity and therefore do not consistently provide optimal discrimination; the resulting difficulty in differentiating temporally and/or geographically associated sporadic cases from potential outbreak cases can at times impede investigations. To address this limitation, DNA sequence-based methods such as multilocus variable-number tandem-repeat analysis (MLVA) have been explored. To assess the performance of MLVA as a supplemental method to PFGE from the Canadian perspective, a retrospective analysis of all E. coli O157:H7 isolated in Canada from January 2008 to December 2012 (inclusive) was conducted. A total of 2285 E. coli O157:H7 isolates and 63 clusters of cases (by PFGE) were selected for the study. Based on the qualitative analysis, the addition of MLVA improved the categorization of cases for 60% of clusters, and no change was observed for ∼40% of clusters investigated. In the latter situations, MLVA serves to confirm PFGE results, but may not add further information per se. The findings of this study demonstrate that MLVA data, when used in combination with PFGE-based analyses, provide additional resolution for the detection of clusters lacking PFGE diversity and show good epidemiological concordance. In addition, MLVA is able to identify cluster-associated isolates with variant PFGE pattern combinations that may have been previously missed by PFGE alone. Optimal laboratory surveillance in Canada is achieved with the application of PFGE and MLVA in tandem for routine surveillance, cluster detection, and outbreak response.
Introduction
The national distribution of food products, along with international travel, has made it possible for human foodborne infections to originate in a region different from where the onset of illness was first observed (Barrett et al., 2006; Bavaro, 2012); thus, outbreak detection and investigation require subtyping to distinguish epidemiologically related and unrelated cases and to link cases of illness to a source. A timely response to potential outbreaks of E. coli O157:H7 is of paramount importance and is largely dependent on the availability of highly discriminatory and reliable subtyping methods (Lindstedt et al., 2003; Gerner-Smidt et al., 2006; Sabat et al., 2013).
Proven as an effective laboratory-based cluster detection tool, pulsed-field gel electrophoresis (PFGE) has been used with great success by PulseNet Canada (PNC) since 2004 for detecting clusters and responding to outbreaks of E. coli O157:H7. The robust historical PFGE data contained within the national database, comprising more than 10,000 surveillance, outbreak, and food isolates, provide the underlying foundation of the molecular subtyping network in Canada. PFGE pattern interpretation is based on several criteria, which are published in Health Canada's Weight of Evidence document (
To enhance cluster detection and outbreak response (i.e., improve categorization of cases as cluster/outbreak-related based on laboratory evidence), DNA sequence-based methods, such as multilocus variable-number tandem-repeat analysis (MLVA), have been explored. In principle, MLVA is based on multiplex PCR amplification of highly repetitive DNA sequences organized in tandem within the genome (i.e., variable number tandem repeats [VNTRs]) (Nakamura et al., 1987; van Belkum et al., 1998) followed by sizing or sequencing of the amplification product using capillary electrophoresis (Lindstedt et al., 2003; Noller et al., 2003; Hyytia-Trees et al., 2006). MLVA provides high-throughput, unambiguous data and high discrimination because its target sequences (i.e., VNTRs) are highly variable and allow alleles to be designated in a discrete, rather than continuous, data set.
MLVA protocols have been developed and implemented in several countries for many foodborne bacterial pathogens (Nadon et al., 2013). In the early 2000s, reports began to emerge that indicated MLVA could be useful in discriminating among epidemiologically unrelated E. coli O157:H7 isolates for which PFGE provided limited discrimination (Noller et al., 2003; Keys et al., 2005; Hyytia-Trees et al., 2006).
Despite the usefulness of the subtyping technique, historical MLVA data are lacking in Canada. Evaluations of the method's performance had primarily been conducted in other countries, and the method had not been validated on strains of E. coli O157:H7 that occur in Canada. To further improve detection and response for E. coli O157:H7 outbreaks in Canada, the performance of MLVA as a supplemental method to PFGE and its ability to influence the categorization of cases as “cluster/outbreak-related” and “not cluster/outbreak-related” were retrospectively evaluated for all cases of E. coli O157:H7 in Canada from January 2008 to December 2012.
Materials and Methods
Isolate collection
A total of 3341 isolates of E. coli O157:H7 had been uploaded to the PNC National E. coli PFGE database from January 2008 to December 2012 (inclusive). Of these, 2285 E. coli O157:H7 isolates were available at the National Microbiology Laboratory (NML) for retrospective MLVA (the remaining isolates were excluded: 747 were not available, 133 were duplicates, and 176 had been subsequently identified as non-O157 E. coli following their initial upload). The source of isolates included human clinical (n = 2104), food (n = 166), and other (not identified or environmental, n = 15). The isolates available for this study provided coverage of ∼86% of all laboratory-confirmed E. coli O157 cases reported to the National Enteric Surveillance Program in Canada (i.e., 2672 laboratory-confirmed cases) over the 5-year study period (data not shown).
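The isolate accounting above can be checked with a few lines of arithmetic. The following Python sketch uses only the counts reported in this section; the variable names are illustrative and are not part of the study's workflow.

```python
# Counts as reported in the isolate collection paragraph.
uploaded = 3341
excluded = {
    "not available": 747,
    "duplicate": 133,
    "non-O157 E. coli": 176,
}
by_source = {"human clinical": 2104, "food": 166, "other": 15}
nesp_confirmed_cases = 2672  # laboratory-confirmed cases over the 5-year period

# Isolates remaining after exclusions, and the share of reported cases they cover.
available = uploaded - sum(excluded.values())
coverage = available / nesp_confirmed_cases

print(f"available for MLVA: {available}")
print(f"sum over sources:   {sum(by_source.values())}")
print(f"coverage:           {coverage:.1%}")
```

The exclusions and the source breakdown both reconcile to the 2285 isolates available for MLVA, and the coverage rounds to the ∼86% stated in the text.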
Pulsed-field gel electrophoresis
PFGE was performed by PNC steering committee member laboratories, including the provincial public health laboratories and the Canadian Food Inspection Agency, using standardized PulseNet International protocols (
Multilocus variable-number tandem-repeat analysis
A total of 349 E. coli O157:H7 isolates out of the total available isolates (n = 2285) had been previously subtyped by MLVA at the NML during early pilot studies for method optimization. MLVA was performed on the remaining 1936 isolates according to the PulseNet USA standardized protocol for the Applied Biosystems Genetic Analyzer 3130xl and 3730xl DNA Analyzer platforms (available at
Selection criteria for clusters in the retrospective analyses
A total of 197 clusters of E. coli O157:H7 had been identified and tracked through PNC from January 2008 to December 2012. To define clusters, the upload date (i.e., the date the PFGE subtyping information was uploaded to the national PNC database) was used, which is common practice in PNC laboratory analyses as illness onset date is usually not available and collection/isolation dates are not consistently provided. For the study, a cluster was defined as two or more isolates uploaded within a 60-day window with highly similar/indistinguishable PFGE patterns satisfying at least one of the following criteria: (1) isolates or cases originated from more than one jurisdiction/province; (2) isolates or cases originated in a single province and were linked to a food recall; or (3) isolates or cases originated in a single province and were related to a concurrent international cluster. A total of 63 E. coli O157:H7 clusters, comprising ∼69% of the total isolates included in the study (n = 1576), fulfilled the selection criteria and were selected for retrospective analysis.
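The cluster definition above can be illustrated with a minimal Python sketch. It is not the procedure used by PNC. It makes two simplifying assumptions: "highly similar/indistinguishable" is reduced to an exact match on a hypothetical pattern label, and the 60-day window is measured from the first upload in each group. The `Isolate` fields, function names, and flags are all illustrative.

```python
from dataclasses import dataclass
from datetime import date, timedelta

WINDOW = timedelta(days=60)  # 60-day upload window from the cluster definition


@dataclass(frozen=True)
class Isolate:
    isolate_id: str
    pfge_pattern: str  # hypothetical label for the XbaI/BlnI pattern combination
    upload_date: date  # date of upload to the national database
    province: str


def candidate_clusters(isolates):
    """Group isolates that share a PFGE pattern and were uploaded within
    60 days of the first upload in the group (assumed window anchoring)."""
    by_pattern = {}
    for iso in sorted(isolates, key=lambda i: i.upload_date):
        by_pattern.setdefault(iso.pfge_pattern, []).append(iso)

    clusters = []
    for group in by_pattern.values():
        current = [group[0]]
        for iso in group[1:]:
            if iso.upload_date - current[0].upload_date <= WINDOW:
                current.append(iso)
            else:
                if len(current) >= 2:  # a cluster needs two or more isolates
                    clusters.append(current)
                current = [iso]
        if len(current) >= 2:
            clusters.append(current)
    return clusters


def meets_selection_criteria(cluster, food_recall=False, international_link=False):
    """Criterion (1): more than one province; criteria (2) and (3) are
    passed in as flags because they rely on external epidemiological data."""
    multi_province = len({iso.province for iso in cluster}) > 1
    return multi_province or food_recall or international_link
```

In practice, pattern similarity would be judged by trained analysts rather than by exact label equality, so this sketch only conveys the temporal and jurisdictional logic of the selection criteria.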
Retrospective analyses
All available metadata, including PFGE results (XbaI and BlnI), were compiled for all 1576 isolates comprising the 63 selected E. coli O157:H7 clusters, and any available epidemiological summary reports were retrieved (data not shown). To search for MLVA matches to each cluster, the database was queried (i.e., using PFGE pattern upload date) to identify isolates with MLVA profiles that matched the cluster during the window of time that spanned the date of the cluster's first upload to 60 days after the last cluster isolate upload. MLVA profiles were considered a match if they differed by no more than one repeat at up to three loci, or by three repeats at one locus, and showed no variation at VNTR 34 in comparison to the main MLVA cluster profile. To assess the similarity of MLVA profiles within each of the 63 clusters, the same criteria for matching were applied. Variability among MLVA profiles was determined by calculating the number of repeats occurring at each VNTR locus in comparison to the main cluster profile, where “main cluster profile” was defined as the most prevalent MLVA profile observed among the isolates within each cluster. Minimum spanning trees were constructed using the advanced cluster analysis tool for categorical data with single and double locus variance priority rules in BioNumerics v6.01 (Applied Maths).
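The matching rule described above lends itself to a short worked example. The Python sketch below is one possible reading of the criteria, not the study's implementation: a profile matches if VNTR 34 is unchanged and either at most three loci differ by one repeat each, or a single locus differs by at most three repeats. Apart from VNTR 34, the locus names and repeat counts are placeholders, and the function names are assumptions.

```python
from collections import Counter

FIXED_LOCUS = "VNTR_34"  # no variation is tolerated at this locus (from the text)


def main_cluster_profile(profiles):
    """Return the most prevalent MLVA profile among a cluster's isolates."""
    counts = Counter(tuple(sorted(p.items())) for p in profiles)
    most_common_items, _ = counts.most_common(1)[0]
    return dict(most_common_items)


def is_mlva_match(profile, main_profile):
    """Apply the matching criteria relative to the main cluster profile.

    Both arguments map locus name -> repeat count and must share the same loci.
    """
    # Absolute repeat difference at every locus where the profiles disagree.
    diffs = {
        locus: abs(profile[locus] - main_profile[locus])
        for locus in main_profile
        if profile[locus] != main_profile[locus]
    }
    if FIXED_LOCUS in diffs:
        return False
    # Rule A: no more than one repeat of difference at up to three loci.
    single_repeat_rule = len(diffs) <= 3 and all(d <= 1 for d in diffs.values())
    # Rule B: up to three repeats of difference at a single locus.
    single_locus_rule = len(diffs) == 1 and all(d <= 3 for d in diffs.values())
    return single_repeat_rule or single_locus_rule


# Placeholder profiles for illustration only.
main = {"VNTR_3": 10, "VNTR_9": 8, "VNTR_34": 7, "VNTR_25": 5}
variant = {"VNTR_3": 11, "VNTR_9": 8, "VNTR_34": 7, "VNTR_25": 5}
print(is_mlva_match(variant, main))
```

An identical profile matches trivially, because it has no differing loci. Any change at VNTR 34 is rejected before the other two rules are considered.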
In the context of cluster and outbreak investigations, sensitivity measures the ability of a typing method to correctly identify all cases that are truly cluster/outbreak related, while conversely, specificity relates to the ability of a typing method to correctly identify all cases that are truly unrelated to the cluster/outbreak to rule them out. This concept was applied in a further analysis of a subset of 16 well-characterized clusters with available outbreak investigation data to compare the categorization of cases by PFGE alone with that by PFGE+MLVA. All isolates uploaded to PNC within the time span of each cluster were included in the analysis; isolates that matched by neither PFGE nor MLVA were considered unrelated to the cluster/outbreak.
Statistical analysis of PFGE and MLVA
To supplement the qualitative data, a quantitative comparison of the laboratory results obtained from each typing method under investigation was performed using the Comparing Partitions website (
Results
Impact of MLVA as a supplemental method on cluster detection
The addition of MLVA impacted the laboratory evidence for 60% of the clusters analyzed in the study. For ∼16% of the clusters, adding MLVA resulted in the inclusion of isolates in the cluster that were previously missed by PFGE alone; between one and four additional isolates would have been ruled in for those clusters. The PFGE pattern combinations of isolates that were ruled in by the addition of MLVA were similar to the main cluster PFGE patterns by a minimum of 74.5% and a maximum of 97.8%, which was calculated using the UPGMA algorithm in BioNumerics v.6.01 (data not shown). This range of similarities would suggest that during cluster identification and outbreak investigations, PFGE patterns that are clearly distinct from one another may in fact be related. Thus, MLVA has the capacity to refine the scope of a cluster/outbreak as this typing system is able to identify isolates with variant PFGE pattern combinations that may have been previously missed by PFGE alone (Fig. 1).

FIG. 1. MST depicting 46 MLVA profiles for 92 clinical Escherichia coli O157:H7 isolates uploaded to the National PulseNet Canada database during the time frame of a 2012 outbreak associated with the consumption of romaine lettuce. Each circle in the MST corresponds to a unique MLVA profile and is color coded based on the PFGE pattern. The size of the circle corresponds to the number of isolates, and the solid lines connecting the circles indicate relatedness by showing the number of differing loci between profiles. Temporally associated sporadic isolates that were ruled out by both PFGE and MLVA are colored gray; isolates demonstrating indistinguishable PFGE patterns from that of the outbreak pattern, but highly variable MLVA profiles, are indicated by a dotted line outlining the circle. Isolates that would have been ruled out by PFGE alone are highlighted in yellow. Clustering was performed in BioNumerics v.6.01 using the advanced cluster analysis tool for categorical data with single and double locus variance priority rules. MLVA, multilocus variable-number tandem-repeat analysis; MST, minimum spanning tree; PFGE, pulsed-field gel electrophoresis.
Conversely, the use of MLVA also excluded isolates that were likely erroneously included by PFGE alone in ∼28% of clusters investigated in the study. In each of those clusters, at least one isolate was identified as not related to the cluster by the addition of MLVA. More interestingly, MLVA indicated that six clusters (previously characterized by PFGE) may not have been representative of “true” clusters at all, based on the high level of variability observed in the MLVA profiles. Finally, for 16% of the clusters, PFGE+MLVA both ruled in cases that had been missed and ruled out cases that had been erroneously included based on PFGE alone. Although the results provided by MLVA refined the scope for a majority of the clusters investigated, no difference was observed in the categorization of cases by PFGE compared to PFGE+MLVA for ∼40% of clusters.
PNC relies on molecular subtyping data to identify clusters of cases with matching profiles; these clusters may then be classified as outbreaks with follow-up or investigation once assessed by epidemiologists. To further compare case categorization by PFGE alone with that by PFGE+MLVA, a total of 986 clinical isolates associated with 16 well-characterized outbreaks were categorized as “outbreak related” or “not outbreak related” by each approach (Table 1). Both methods classified 181 cases as outbreak related and 788 cases as nonoutbreak related, indicating these cases were most likely categorized correctly. PFGE+MLVA assigned 10 cases as not outbreak related, whereas PFGE alone would have included them. In contrast, PFGE+MLVA identified 7 cases as outbreak related that had previously been excluded by PFGE alone.
Isolates were considered to be outbreak related (i.e., true positive) or nonoutbreak related (i.e., true negative) if identified by both PFGE and PFGE+MLVA.
MLVA, multilocus variable-number tandem-repeat analysis; PFGE, pulsed-field gel electrophoresis.
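The four counts reported above form a 2 × 2 cross-classification of the 986 isolates. The short Python sketch below tabulates them and derives the marginal totals and the overall agreement between the two approaches. The agreement figure is a derived illustration and was not reported in the study, and the variable names are assumptions.

```python
# Cross-classification of the 986 isolates, using the counts given in the text.
both_related = 181         # outbreak related by PFGE and by PFGE+MLVA
both_unrelated = 788       # not outbreak related by either approach
pfge_only_related = 10     # included by PFGE alone, excluded by PFGE+MLVA
combined_only_related = 7  # excluded by PFGE alone, included by PFGE+MLVA

total = both_related + both_unrelated + pfge_only_related + combined_only_related

# Marginal totals: how many isolates each approach would call outbreak related.
pfge_related = both_related + pfge_only_related
combined_related = both_related + combined_only_related

# Proportion of isolates on which the two approaches agree.
agreement = (both_related + both_unrelated) / total

print(f"total isolates:               {total}")
print(f"related by PFGE alone:        {pfge_related}")
print(f"related by PFGE+MLVA:         {combined_related}")
print(f"agreement between approaches: {agreement:.1%}")
```

The 17 discordant isolates are the ones on which the addition of MLVA changes the categorization. Without an independent reference standard, this tabulation describes agreement between the approaches rather than the sensitivity or specificity of either one.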
Statistical assessment of PFGE and PFGE+MLVA
A statistically significant difference was observed between the SID for PFGE alone and PFGE+MLVA (p < 0.001), indicating PFGE+MLVA is more discriminatory than PFGE alone (Table 2). In addition, a statistically significant difference between the AW coefficients (p < 0.001) was observed; therefore, combining the results derived by PFGE and those derived by MLVA is likely to offer additional information to cluster detection and outbreak investigations over either single method alone. These results were consistent with the findings from the qualitative retrospective cluster analyses.
All statistical values were computed using the Comparing Partitions website (
p-Value ≤0.05 is significant.
AW, adjusted Wallace coefficient; CI, confidence interval; MLVA, multilocus variable-number tandem-repeat analysis; PFGE, pulsed-field gel electrophoresis; SID, Simpson's Index of Diversity.
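For readers unfamiliar with the two statistics, the following Python sketch computes Simpson's Index of Diversity and the adjusted Wallace coefficient from their standard pairwise definitions. It is a minimal illustration and not the implementation behind the Comparing Partitions website, and it omits confidence intervals and p-values. The function names and the toy type assignments are assumptions and are not study data.

```python
from collections import Counter
from itertools import combinations


def simpsons_index_of_diversity(types):
    """SID = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)); requires N >= 2."""
    n = len(types)
    counts = Counter(types).values()
    return 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


def wallace(a_types, b_types):
    """Probability that two isolates sharing a type by method A also share
    a type by method B. Undefined if no pair shares a type by method A."""
    same_a = same_both = 0
    for i, j in combinations(range(len(a_types)), 2):
        if a_types[i] == a_types[j]:
            same_a += 1
            if b_types[i] == b_types[j]:
                same_both += 1
    return same_both / same_a


def adjusted_wallace(a_types, b_types):
    """Wallace coefficient corrected for chance agreement, where the
    expected value under independence is 1 - SID of method B."""
    w = wallace(a_types, b_types)
    w_expected = 1 - simpsons_index_of_diversity(b_types)
    return (w - w_expected) / (1 - w_expected)


# Toy type assignments for six isolates (illustrative only).
pfge = ["A", "A", "A", "A", "B", "B"]
pfge_mlva = ["A1", "A1", "A2", "A2", "B1", "B1"]

print(simpsons_index_of_diversity(pfge))
print(simpsons_index_of_diversity(pfge_mlva))
print(adjusted_wallace(pfge, pfge_mlva))
print(adjusted_wallace(pfge_mlva, pfge))
```

In the toy data, the combined typing splits one PFGE type in two. Its SID is therefore higher, the adjusted Wallace coefficient from PFGE to the combined method is well below 1, and the coefficient in the reverse direction is 1, since isolates that share a combined type always share a PFGE type.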
Discussion
In this study, the combination of PFGE+MLVA was found to provide optimal resolution for cluster detection by refining the scope of a cluster/outbreak for a majority of the retrospective clusters investigated compared to PFGE alone; this was concordant with supplemental statistical findings. Although MLVA changed the categorization of isolates for the majority of the clusters investigated (60%), no change was observed for ∼40% of all clusters. Based on information obtained from the national PNC database, the PFGE patterns associated with the 40% of unchanged clusters were characterized as “new” (i.e., not previously isolated) or rare pattern combinations; thus, finding any matches based on PFGE alone would have produced a strong enough signal to indicate these isolates were likely to have originated from a common source. In such situations, MLVA could serve to confirm PFGE results, but may not add further information per se. This study did not assess MLVA for cluster detection alone. Although it is possible to standardize the method, this takes careful planning and coordination to ensure interlaboratory results can be compared (Nadon et al., 2013). More importantly, since the foundation of the PNC network is based on extensive PFGE data, a complete overhaul of the surveillance network would be required, which is not feasible or practical as more discriminatory methods, such as whole genome sequencing, are being explored and will likely replace both methods in the future. For this reason, MLVA is highly unlikely to entirely replace PFGE, but it could serve as a valuable supplemental tool during cluster detection and outbreak response.
The relationship between sensitivity and specificity exists in all laboratory tests. Laboratory thresholds used to rule cases as “in” or “out” that are too stringent could exclude related cases, limiting or potentially altering the scope of the outbreak; criteria that are not stringent enough could lead to erroneously ruling in nonrelated cases, potentially diluting investigators' ability to find common exposures (Besser, 2011). It is in this context that the performance of PFGE+MLVA was assessed to describe the ability of the combined methods to sort cases into the correct category (or what is perceived to be correct based on available data) compared to PFGE alone. Based on the results presented in this study, the relevance of MLVA data when used in combination with PFGE-based analyses proves to be promising as it was able to correctly cluster isolates belonging to 16 well-characterized outbreaks, as well as rule out a substantial number of sporadic isolates (i.e., nonoutbreak cases). For the purposes of cluster/outbreak investigations of E. coli O157:H7 in Canada, the categorization of cases by PFGE+MLVA appears to have higher concordance with epidemiological evidence than PFGE alone. Therefore, the greatest benefits of utilizing MLVA as a supplemental test may be realized during routine surveillance, when epidemiological information is incomplete or unavailable as the data provide an extra layer of resolution to cluster detection and outbreak investigations.
The findings of the retrospective study have already impacted operational procedures for laboratory surveillance in Canada. MLVA has proven to be integral in separating outbreak-related cases from temporally associated sporadic cases of E. coli O157:H7 (Fig. 2); based on these results, the method is now applied to every case of E. coli O157:H7 in Canada in addition to PFGE.

FIG. 2. MST depicting 37 MLVA profiles for 58 clinical Escherichia coli O157:H7 isolates uploaded to the National PulseNet Canada database during a 2012 outbreak associated with the consumption of beef products. Each circle in the MST corresponds to a unique MLVA profile and is color coded based on the PFGE pattern. The size of the circle corresponds to the number of isolates, and the solid lines connecting the circles indicate relatedness by showing the number of differing loci between profiles. Temporally associated sporadic isolates that were ruled out by both PFGE and MLVA are colored gray; isolates demonstrating indistinguishable PFGE patterns from that of the outbreak pattern, but highly variable MLVA profiles, are indicated by a dotted line outlining the circle. Clustering was performed in BioNumerics v.6.01 using the advanced cluster analysis tool for categorical data with single and double locus variance priority rules. MLVA, multilocus variable-number tandem-repeat analysis; MST, minimum spanning tree; PFGE, pulsed-field gel electrophoresis.
Conclusion
The application of MLVA to E. coli O157:H7 for routine surveillance proved to be an excellent molecular epidemiological tool to complement PFGE as it provided additional resolution to cluster detection and outbreak investigations for which PFGE offered limited discrimination. Even though the addition of MLVA may not always change the number of isolates included in a cluster/outbreak, the subtyping tool has a significant impact on the accuracy of cluster detection and therefore provides a clearer picture of the overall scope of the investigation. Thus, optimal laboratory surveillance in Canada is achieved with the application of PFGE and MLVA in tandem for routine surveillance, cluster detection, and outbreak support.
Acknowledgments
The authors thank the PulseNet Canada Steering Committee member laboratories (Canadian Public Health Laboratory Network) for providing the isolates used in this study, for thoughtful discussion, and for reviewing the article. They are also grateful to the National Microbiology Laboratory Enteric Diseases Section personnel for their technical assistance (Kristine Cruz, Chelsey Goodman, Ashley Kearney, Christy-Lynn Peterson, Keri Trout-Yakel, Kathleen Whyte) and to the PulseNet Canada database managers for analyzing the PFGE patterns (Erin Ballegeer, Connie Blakeston, Cynthia Misfeldt). They would also like to thank the Outbreak Management Division epidemiologists at the Public Health Agency of Canada for providing them with epidemiological outbreak reports. Finally, they would like to extend their appreciation to the co-advisor, Dr. Matthew Gilmour, and committee members, Dr. John Wylie and Dr. Lawrence Elliott, for their guidance during the graduate thesis project from which this article was derived.
Disclosure Statement
No competing financial interests exist.
