Abstract
Background:
Electronic health and administrative data are increasingly being used for identifying surgical site infections (SSI). We found an unexpectedly high number of patients who could not be classified definitively as having an infection or not. To further explore this, we present an electronic classification algorithm for conservative case finding and identify alterations that would adapt the method for other purposes.
Methods:
Two computer algorithms were created to identify SSI. One model used a strict National Healthcare Safety Network (NHSN)-based SSI algorithm, which was applied to all 443,284 discharges from four hospitals in Manhattan, NY, from 2009 through 2012. The second model was applied only to discharges that had NHSN-defined SSI procedures during the same period.
Results:
The strict SSI algorithm classified SSI status for 27.3% of discharges, leaving a high number of indeterminate cases. In contrast, the modified, less strict model classified 97.2% of discharges with NHSN-approved SSI procedures.
Conclusion:
Electronic records provide several options for aiding with the identification of infections in healthcare settings and can be tailored to suit specific uses. While algorithms for SSI classification should reflect the NHSN definition, our research emphasizes how variations of model building can affect the number of indeterminate cases that may necessitate manual review.
As part of a federally funded study (Health Information Technology to Reduce Healthcare-Associated Infections, NR010822), we developed computerized classification algorithms to identify four types of HAIs using electronically available data [5]. Electronic data were suitable for identifying pneumonia, BSI, and urinary tract infection (UTI), but for SSIs, we found a high number of patients who could not be classified definitively as having an infection or not. To further explore this limitation and optimize SSI detection for specific uses, we present an electronic classification algorithm for conservative case finding and identify alterations that would adapt the method for other purposes.
Methods
We developed a data mart that included electronic health and administrative records for patients discharged from four academically affiliated acute care hospitals in Manhattan, NY [5]. All patients discharged from 2009 through 2012 were included. The four hospitals are part of the same network and share information technology systems including a commercially available electronic medical record and charting system, an admission-discharge-transfer (ADT) system, and a clinical data warehouse storing information from several smaller sources such as clinical laboratory records.
Based on HAI surveillance guidelines published by the Centers for Disease Control and Prevention's (CDC) National Healthcare Safety Network (NHSN) [6], electronic algorithms for identifying BSI, SSI, UTI, and pneumonia were developed by an interdisciplinary team that included an infectious disease physician, an infection prevention nurse, an epidemiologist, a programmer/data manager, and an information technology systems manager with expertise in the use of hospital administrative data. The algorithms, described in Table 1, use a combination of International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) procedure and diagnosis codes, date and time-stamped clinical microbiology culture results, and admission and discharge dates. The classification algorithms were designed to be conservative in their identification of patients who likely had infections and those who likely did not have infections by creating an “indeterminate” category for those whose records contained incomplete, insufficient, or conflicting information.
ICD-9-CM = International Classification of Diseases, 9th Revision, Clinical Modification; NHSN = National Healthcare Safety Network; CFU = colony forming unit.
The completed algorithms were applied to all patient discharges occurring during the study period. For each infection type, we counted the number and percent of patient discharges that were classified as indeterminate. For SSIs, we used a flow diagram to determine the number and percent of patient discharges that were classified as having an infection, not having an infection, and having indeterminate status at each stage of the algorithm.
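The three-way classification and the tally described above can be sketched roughly as follows. This is a minimal illustration, not the study's algorithm: the `Discharge` fields, the function names, and the rules for "infected" and "not infected" are assumptions made for the example, and the sketch considers only discharges that already have a qualifying surgical procedure. The two explicitly marked indeterminate rules follow the cases reported in the Results.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Discharge:
    """Hypothetical, simplified record for one discharge with a surgical procedure."""
    cultures_taken: int      # post-operative incision cultures collected
    cultures_positive: int   # how many of those cultures were positive
    infection_code: bool     # post-operative infection diagnosis code present


def classify_ssi(d: Discharge) -> str:
    """Return 'infected', 'not infected', or 'indeterminate'."""
    if d.cultures_taken == 0:
        # Reported rule: infection coded but no incision culture -> indeterminate.
        # Assumption: no culture and no infection code -> not infected.
        return "indeterminate" if d.infection_code else "not infected"
    if d.cultures_positive > 0 and d.infection_code:
        # Assumption: culture result and diagnosis code agree on infection.
        return "infected"
    # Reported rule: cultures taken, all negative, no infection code -> indeterminate.
    # Assumption: any other disagreement between sources is also indeterminate.
    return "indeterminate"


def tally(discharges):
    """Count and percent of discharges in each status category."""
    counts = Counter(classify_ssi(d) for d in discharges)
    total = sum(counts.values())
    return {status: (n, 100.0 * n / total) for status, n in counts.items()}
```

Keeping the indeterminate category explicit, rather than forcing every record into infected or not infected, is what makes the scheme conservative: records with missing or conflicting evidence are set aside for manual review.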
Results
A total of 443,284 discharges occurred during the four-year study period and were included in our analyses. Our algorithms were able to classify BSI status for 97.5%, UTI status for 95.8%, and pneumonia status for 94.4% of discharges (n = 10,930, n = 18,501, and n = 24,980 indeterminate discharges, respectively). When applying the strict NHSN guidelines, SSI status was classified for 27.3% of discharges (n = 322,423 indeterminate discharges). Figure 1 illustrates the number of indeterminate discharges resulting from each step of the SSI classification algorithm. Some discharges were indeterminate because at least one post-operative incision culture was taken, but all results were negative, and no clinical diagnosis of post-operative infection was recorded in the discharge codes (0.7%, n = 2,185). A smaller percentage (0.3%, n = 1,297) was indeterminate, because there was a discrepancy between the microbiology record and ICD-9-CM diagnosis codes (i.e., a post-operative infection was documented but no incision culture was taken).

Figure 1. Classification algorithm for identifying surgical site infections (SSI) from electronic data using strict NHSN definitions. White boxes represent the number of patient discharges identified as having an SSI. Black boxes represent the number of patient discharges identified as not having an SSI. Dark gray boxes represent the number of patient discharges that could not be definitively classified using the algorithm.
Figure 2 illustrates the number of indeterminate discharges resulting from each step of the SSI classification algorithm when the NHSN definition is modified to consider only patients who underwent an NHSN-defined SSI procedure (n = 124,343). Using the modified algorithm, the number of indeterminate discharges was reduced substantially (n = 3,482).

Figure 2. Classification algorithm for identifying surgical site infections (SSI) from electronic data using a modified NHSN definition that considers only patients who underwent an NHSN-defined SSI procedure. White boxes represent the number of patient discharges identified as having an SSI. Black boxes represent the number of patient discharges identified as not having an SSI. Dark gray boxes represent the number of patient discharges that could not be definitively classified using the algorithm.
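The classified percentages reported in this section follow directly from the totals and the indeterminate counts. The short check below recomputes them. The function name and the dictionary layout are illustrative choices; the counts are the ones given in the text.

```python
def percent_classified(total: int, indeterminate: int) -> float:
    """Percent of discharges assigned a definite status (infected or not infected)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - indeterminate) / total


ALL_DISCHARGES = 443_284        # all discharges, 2009 through 2012
NHSN_PROCEDURES = 124_343       # discharges with an NHSN-defined SSI procedure

# (denominator, indeterminate count) as reported in the Results
reported = {
    "BSI": (ALL_DISCHARGES, 10_930),
    "UTI": (ALL_DISCHARGES, 18_501),
    "pneumonia": (ALL_DISCHARGES, 24_980),
    "SSI, strict": (ALL_DISCHARGES, 322_423),
    "SSI, modified": (NHSN_PROCEDURES, 3_482),
}

classified = {
    name: round(percent_classified(total, indet), 1)
    for name, (total, indet) in reported.items()
}
# classified == {"BSI": 97.5, "UTI": 95.8, "pneumonia": 94.4,
#                "SSI, strict": 27.3, "SSI, modified": 97.2}
```

Note that the strict and modified SSI figures use different denominators, so the change from 27.3% to 97.2% reflects both the narrower population and the smaller number of indeterminate discharges.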
Discussion
Electronic records provide several options for aiding with the identification of infections in healthcare settings and can be tailored to suit specific uses [2]. Nevertheless, work is still needed to assess and improve the validity and reliability of electronic administrative data and identify the most useful, parsimonious, and accurate data elements for SSI surveillance.
Administrative data have been used for surveillance of various types of infections with mixed results to date. A decade ago, researchers reported a positive predictive value of 20% for HAIs identified by administrative data as compared with 100% identified through active surveillance by an experienced infection prevention professional [7]. More recently, Snyders et al. [8] found that electronic algorithms to identify central line associated BSI required adjustment for various populations, and others have noted discrepancies in diagnosing SSIs, depending on definitions used [9,10]. Singh et al. [11] found a significant discrepancy between SSI rates reported to a national United Kingdom surveillance system when compared with rates identified by a retrospective review of electronic medical records, with higher rates identified electronically. It was not possible to discern, however, whether the discrepancy was attributable to under-reporting or to under-identification of cases.
Others have evaluated the use of ICD-CM (currently ICD-10-CM) codes for post-operative infection, although there are limitations to relying solely on discharge diagnosis codes because sensitivity and specificity may not be adequate [12]. Another method would be to identify patients who had incision cultures performed during their admission. Because incision cultures may be collected for reasons unrelated to SSI or may not be collected at all, however, this approach would still yield a large proportion of patient discharges for whom manual chart review would be required and the positive predictive value of this added step would likely be low.
In a recent systematic review of 57 studies using electronic surveillance for HAIs, sensitivities and predictive values were highly variable, and the studies were characterized by considerable methodologic heterogeneity. Hence, the authors recommended careful use of such data and continued work to improve algorithms [13]. In January 2016, the CDC's NHSN Patient Safety Component Manual published updated guidelines that provide more specificity for identifying and monitoring SSI and re-emphasized the importance of using epidemiologically sound infection definitions and effective surveillance methods [14]. The European Centre for Disease Prevention and Control also publishes an annual epidemiologic report on SSIs [15] using slightly different case definitions [16].
Our study adds to the extant and burgeoning literature on use of administrative data for case finding and identification of SSIs. The algorithm presented in this study was initially developed for the purpose of a research study in which the goal was to match patients who had infections with those who did not. Therefore, our classification schema was designed to be conservative in the identification of infected and non-infected patient discharges to minimize false positives and false negatives. The objectives of electronic case finding for the purpose of institutional surveillance, however, are different. Infection control staff members need to apply a more inclusive case definition, one that distinguishes definitive from possible infections to flag those that would require manual chart review and clinical adjudication. The ultimate goal of surveillance methods is to provide valid, reliable, efficient, real-time SSI data that would minimize the resources required for manual record searches and other surveillance activities that now require a large proportion of time from infection prevention and control staff [17,18].
In 2014, Woeltje et al. [19] published recommended data elements for effective electronic surveillance of selected HAIs and the possible complications associated with each. For SSI surveillance, a combination of microbiologic cultures, procedure and diagnosis billing codes, and ADT data were identified as key elements. Ultimately, electronic surveillance that allows use of data from multiple sources including surgical, laboratory, radiologic and medication records, physician and nursing notes as well as ICD-10-CM codes shows great promise for more precise and efficient case finding.
Like all studies involving electronic record review, our investigation has some limitations. Although the classifications were thoroughly reviewed by several members of our institution's infection control team and an initial validation study was performed [20], further validation studies with larger and more geographically representative samples are in order. Finally, the largest proportion of SSIs generally occur after the patient's discharge from the hospital, and hence those identified among inpatients represent only a small proportion of the total incidence. Our aim, however, was not to assess incidence of SSI but rather to apply the NHSN definitions of SSI and determine the extent to which those infections that do manifest among inpatients can be detected using electronically collected data.
Even when essential data elements are available, reliance on electronic review has several limitations because of the complexity of clinical presentation and diagnostic criteria for SSIs, and further work is needed to take maximum advantage of the large volumes of electronic data now available. Despite limitations of electronic monitoring for SSI, progress has been made in data mining, programming, and standardization of definitions. Hence, electronic monitoring is useful for identifying patients likely to have an SSI so that follow-up by infection prevention and control staff is expedited. Ultimately, increasingly sensitive and sophisticated surveillance and reporting methods will free up more time for other prevention activities such as staff and patient education.
Footnotes
Acknowledgments
This study was funded in part by a grant from the National Institute of Nursing Research, R01 NR010822-07S1.
Author Disclosure Statement
No competing financial interests exist.
