Abstract
Background:
Refinement of criteria for both screening and initiation of empiric therapy in ventilator-associated pneumonia (VAP) will minimize antibiotic overuse. We hypothesized that variables within the commonly used Clinical Pulmonary Infection Score (CPIS) have unfavorable test performance characteristics.
Methods:
Consecutive bronchoalveolar lavage (BAL) cultures obtained from surgical intensive care unit patients were abstracted (2009–2012). Ventilator-associated pneumonia was defined as ≥10⁵ cfu/mL. The CPIS both without (CPISclinical) and with (CPISclinical+GS) the result of gram stain (GS) was calculated. Test performance characteristics for the sample, as well as several subgroups, were compared.
Results:
One thousand thirteen lower respiratory tract cultures from 497 patients were analyzed; 438 (43.2%) of cultures were classified as VAP, and 310 of 497 patients (62.4%) had ≥1 episode of VAP. Both CPISclinical and CPISclinical+GS had poor discrimination for VAP (Receiver-operating characteristic area under the curve=0.55 and 0.66, respectively). Sensitivity of CPISclinical using a threshold of >6 was 21%; the lowest threshold for CPISclinical for which the sensitivity was at least 85% was 3. The highest sensitivity among the individual CPIS components was new CXR infiltrate (91.1%). Among the subset of cultures sent during the early VAP window (days intubated 2–5), organisms on GS had a sensitivity of 93.3%. The CPISclinical, CPISclinical+GS, organisms, and neutrophils on GS parameters all became less accurate in both the late VAP window and when screening for recurrent VAP. Every case of VAP had at least one of the following: 1) fever; 2) new CXR infiltrate, or 3) organisms on GS.
Conclusion:
In this series of BALs, traditional screening tools for VAP missed the majority of microbiologically confirmed cases. Screening based on either new CXR infiltrate or fever yielded an acceptably high sensitivity. The only scenario identified in which empiric antibiotics could be withheld safely was the absence of organisms on GS in the early VAP window.
The decisions to both screen patients for VAP and initiate empiric antimicrobial therapy in patients selected for lower respiratory tract culture are related strongly to antibiotic use. The most commonly employed screening tool for VAP in both medical and surgical ICUs remains a modification of the Clinical Pulmonary Infection Score (CPIS), described originally by Pugin et al. in a cohort of surgical ICU patients [7]. Specifically, omission of the microbiologic variable leaves five clinical variables, known as CPISclinical, which is the currently employed tool within our, and many other, surgical ICUs as well as recent clinical trials relating to VAP [8–10].
Since its initial description, several small series have reported poor test characteristics of CPISclinical for diagnosing VAP [11,12]. However, small sample sizes, mixed or exclusively medical patient populations, and lack of subgroup analyses have limited the generalizability of these findings to the modern surgical ICU patient. The utility of incorporating the results of the gram stain into the CPIS also remains unclear [11]. More recently, the value of CPISclinical in determining resolution of VAP in trauma patients (and specifically for guiding discontinuation of antibiotics) has also been questioned [13–15]. Although the ability of CPISclinical to distinguish infection from the systemic inflammatory response syndrome is poor, less is known regarding its ability to rule out VAP when low; it is this sensitivity that is most relevant to the aforementioned decisions to both screen and initiate empiric antimicrobial therapy.
A series of instances from our surgical ICU in which patients with relatively low CPISclinical ultimately were found to have microbiologic VAP prompted this critical investigation of our current screening algorithms. We hypothesized that: 1) CPISclinical possesses poor discriminative ability for VAP, and 2) the discriminative ability of CPISclinical varies by clinical scenario.
Patients and Methods
This was a cross-sectional analysis of lower respiratory tract samples obtained from ventilated surgical ICU patients at our State-certified, American College of Surgeons-verified Level I trauma center from October 2009 to December 2012. Our ICU is a closed 24-bed unit staffed by in-house, board-certified surgical intensivists.
Microbiology records were queried for all respiratory tract cultures from the surgical ICU over the study period. Specimens were excluded from analysis for any of the following reasons: 1) Patient age <14 y; 2) non-surgical patient; 3) respiratory culture obtained via either sputum culture or tracheal aspirate; and 4) patient either not ventilated or ventilated <48 h at the time of respiratory culture.
Over the study period, both the indications for obtaining a lower respiratory tract culture in ventilated patients and the decision to initiate empiric therapy were at the discretion of the attending intensivist. In general the following protocol was followed: Patients ≥48 h from either trauma or surgery and with a CPISclinical >6 points were considered for lower respiratory tract sample via either blind (“mini”) bronchoalveolar lavage (BAL) or bronchoscopic BAL. In general, bronchoscopic BAL was reserved for patients with a separate indication for bronchoscopy (e.g., lobar collapse). Empiric antimicrobial therapy was initiated after the sample was obtained, and de-escalated based on final culture results within 72 h. Attending intensivists may have chosen to deviate from any of these steps based on clinical circumstances. A common example of this deviation would involve a patient with CPISclinical ≤6 with grossly purulent, foul-smelling airway secretions.
The primary outcome variable was VAP, defined as ≥10⁵ colony-forming units (CFU)/mL on final culture of the BAL [16]. This threshold was lowered to ≥10⁴ CFU/mL if the sample was obtained while the patient was on antimicrobial therapy. Early VAP was defined as occurring 2–5 d after intubation; late VAP was defined as occurring >5 d after intubation [1]. Recurrent VAP was defined using the same criteria as the initial episode, and included any culture obtained >3 d from the culture that defined the index episode of VAP. Outcome variables included the prevalence of both VAP and recurrent VAP, hospital length of stay (LOS), ICU LOS, and mortality.
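As an illustration only, the culture-based definitions above can be written as a short Python sketch. The function names are hypothetical and not part of the study; the handling of cultures obtained before day 2 of intubation (returned as no window) is an assumption based on the exclusion of patients ventilated <48 h.

```python
from typing import Optional


def vap_threshold_cfu_per_ml(on_antimicrobials: bool) -> int:
    """Quantitative culture threshold: 10^5 CFU/mL, lowered to 10^4 on antimicrobial therapy."""
    return 10**4 if on_antimicrobials else 10**5


def is_vap(cfu_per_ml: float, on_antimicrobials: bool) -> bool:
    """True if the final BAL culture meets the study definition of VAP."""
    return cfu_per_ml >= vap_threshold_cfu_per_ml(on_antimicrobials)


def vap_window(days_intubated: float) -> Optional[str]:
    """Early window is days 2-5 after intubation; late window is >5 days.

    Assumption: cultures before day 2 fall outside both windows and return None.
    """
    if days_intubated < 2:
        return None
    return "early" if days_intubated <= 5 else "late"
```

For example, a culture growing 5 × 10⁴ CFU/mL would count as VAP under this sketch only if the patient was on antimicrobial therapy when the sample was obtained.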
Clinical variables included age (years), gender, admission diagnosis (trauma, emergency general surgery, elective general surgery, and other), time from ICU admission to respiratory culture (days), days intubated at the time of respiratory culture, operation within 72 h of respiratory culture, method of specimen acquisition (blind vs. bronchoscopic BAL), respiratory culture obtained while on antibiotics, and presence of both organisms and numerous polymorphonucleocytes (PMNs) on gram stain. Gram stain PMNs were considered unimportant if reported as either <25, none, rare, or few, and numerous if reported as >25, moderate, or many. In the instance of respiratory cultures from patients who had a prior episode of VAP, the time from the prior respiratory culture to the repeat respiratory culture, as well as the duration of antimicrobial therapy for the index episode of VAP (days) were abstracted.
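The dichotomization of gram stain PMN reports described above could be expressed as follows. This is a minimal sketch: the function name is hypothetical, and rejecting any report string outside the listed categories is an assumption, since the text does not say how other values were handled.

```python
# Report categories as listed in the text.
NOT_NUMEROUS = {"<25", "none", "rare", "few"}
NUMEROUS = {">25", "moderate", "many"}


def pmns_numerous(report: str) -> bool:
    """Map a laboratory PMN report string to the numerous / not numerous dichotomy."""
    key = report.strip().lower()
    if key in NUMEROUS:
        return True
    if key in NOT_NUMEROUS:
        return False
    # Assumption: anything not listed in the text is treated as unclassifiable.
    raise ValueError(f"unrecognized PMN report: {report!r}")
```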
We examined several variations of the CPIS, as described originally by Pugin et al. [7] (Table 1). The original CPIS included five clinical variables [temperature, white blood cell count (WBC), tracheal secretions, ratio of PaO2 to FIO2 (P:F), and chest radiograph findings] and one microbiologic variable (culture of tracheal aspirate and its relation to the gram stain) [7]. One limitation of this score as a screening tool involves the unavailability of culture information at the time that the decisions to both obtain a lower respiratory tract culture and start empiric antibiotics must be made. Because of this limitation, several authors have modified the score to exclude the microbiologic variable [14,17]. This modified score, termed CPISclinical, includes only the five clinical variables and ranges from 0–10 points, with a threshold of >6 points being suggestive of VAP.
Used only to calculate CPIS+GS
CPIS=clinical pulmonary infection score; GS=gram stain.
Other authors have described increased accuracy of the CPIS when incorporating the results of the gram stain [11], which would be available shortly after the acquisition of the BAL and could thus guide the decision to prescribe empiric antibiotics. We therefore modified the CPIS further to include the results of the gram stain, with zero additional points for no organisms and two additional points for ≥1 organism, regardless of density. In accordance with the initial description by Pugin et al. [7], we used a threshold of >6 as suggestive of VAP. This modification is referred to herein as CPIS+GS and ranges from 0–12 (Table 1).
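The two score variants can be summarized in a brief Python sketch. It assumes each of the five clinical components has already been scored 0, 1, or 2 points per Table 1 (an inference from the stated 0–10 range); the component keys and function names are hypothetical.

```python
# Hypothetical keys for the five clinical components of CPISclinical.
CLINICAL_COMPONENTS = ("temperature", "wbc", "secretions", "pf_ratio", "chest_radiograph")


def cpis_clinical(points: dict) -> int:
    """Sum the five clinical component points (assumed 0-2 each, total 0-10)."""
    total = 0
    for name in CLINICAL_COMPONENTS:
        value = points[name]
        if value not in (0, 1, 2):
            raise ValueError(f"{name} must be scored 0, 1, or 2, got {value!r}")
        total += value
    return total


def cpis_plus_gs(points: dict, organisms_on_gram_stain: bool) -> int:
    """CPIS+GS: add 2 points for >=1 organism on gram stain, 0 for none (total 0-12)."""
    return cpis_clinical(points) + (2 if organisms_on_gram_stain else 0)


def suggests_vap(score: int, threshold: int = 6) -> bool:
    """A score strictly above the threshold is taken as suggestive of VAP."""
    return score > threshold
```

Under this sketch, a patient scoring one point on every clinical component has a CPISclinical of 5 and a CPIS+GS of 7 if organisms are seen, which crosses the >6 threshold only in the second case.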
Information necessary to calculate the CPIS was abstracted by two of the investigators (FMP and MRR). The worst values were used (those associated with the highest number of points) for each variable over the 24 h prior to the respiratory tract sample being obtained. Tracheal secretions were scored based on daily progress notes from the respiratory therapist. Chest radiograph findings were scored based on attending radiologist final reads.
All statistical analyses were performed using SAS version 9.1 (SAS Inc., Cary, NC). Data are expressed as median (range) or number (%). The medians of continuous variables were compared using the Wilcoxon rank sum test because most variables were not distributed normally. Normality was assessed using the Kolmogorov-Smirnov test. Proportions of categorical variables were compared using the chi-square test, unless expected cell counts were <10, in which case Fisher exact test was used. Differences in continuous variables across a variable with >2 levels were compared using analysis of variance (ANOVA).
Test performance characteristics included sensitivity [true positives/(true positives+false negatives)], specificity [true negatives/(true negatives+false positives)], positive predictive value [PPV, true positives/(true positives+false positives)], negative predictive value [NPV, true negatives/(true negatives+false negatives)], and the receiver-operating characteristic curve area under the curve (ROC AUC, association of predicted probabilities and observed responses c statistic). The alpha error level was set at 0.05, with p<0.05 considered significant statistically.
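For readers who wish to reproduce these ratios, a minimal Python sketch follows. The function names are illustrative and were not part of the SAS analysis; the worked figure uses counts reported in the Results (271 of 343 VAP cases with CPISclinical ≤6, leaving 72 above the threshold).

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positives / (true positives + false negatives)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negatives / (true negatives + false positives)."""
    return tn / (tn + fp)


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: true positives / (true positives + false positives)."""
    return tp / (tp + fp)


def npv(tn: int, fn: int) -> float:
    """Negative predictive value: true negatives / (true negatives + false negatives)."""
    return tn / (tn + fn)


# Worked example from the reported counts: 72 detected and 271 missed of 343 VAP cases.
cpis_sensitivity_pct = round(100 * sensitivity(tp=72, fn=271), 1)  # 21.0
```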
Results
The derivation of the final sample is shown in Figure 1. A total of 1,013 lower respiratory tract cultures from 497 patients were available for analysis. Of these 497 patients, 310 had at least one episode of VAP during their SICU course, for an overall prevalence of VAP in this patient population of 62.4%. The median number of cultures per patient was two (range 1–10). Clinical variables related to the cultures and stratified by VAP are presented in Table 2.

Derivation of study sample.
VAP=ventilator-associated pneumonia; OR=operating room; CPIS=clinical pulmonary infection score; BAL=bronchoalveolar lavage; GS=gram stain; PMN=polymorphonuclear leukocytes; LOS=length of stay; Resp=respiratory.
Information was available to calculate a complete CPIS in 748 of the 1,013 respiratory cultures (73.9%); the likelihood of the CPIS calculation being complete was significantly greater for instances of VAP as opposed to instances of no VAP (78.5% vs. 70.4%, respectively, p<0.01). The CPISclinical score was distributed normally among the sample (Fig. 2); 617 of the 748 respiratory cultures (82.5%) were drawn in response to a CPISclinical ≤6. The majority of these cases (506/617, 82.0%) involved a CPIS in the 4–6 range. These findings were mirrored when we limited the analysis to cases of VAP; CPISclinical was distributed normally (p>0.05) and 221/270 patients (81.9%) in the CPISclinical ≤6 group had a score of 4–6.

Distribution of CPISclinical scores for patients with lower respiratory tract cultures. Only instances in which complete information was available to calculate the score are included (n=748). Kolmogorov-Smirnov normality test p>0.05. CPIS=clinical pulmonary infection score.
A comparison of clinical variables from patients with CPISclinical ≤6 vs. CPISclinical >6 is shown in Table 3. Among the variables examined, only the likelihood of a prior episode of VAP differed between groups, being significantly greater in the CPISclinical >6 group as compared with the CPISclinical ≤6 group (30.6% vs. 18.8%, respectively, p=0.03). Causative organism for VAP also did not differ as a function of CPIS group (data not shown).
VAP=ventilator-associated pneumonia; OR=operating room; CPIS=clinical pulmonary infection score; BAL=bronchoalveolar lavage; GS=gram stain; GS PMN=gram stain neutrophils; LOS=length of stay.
The CPISclinical score was unable to discriminate between patients with and without VAP. The median CPISclinical was 5 for both instances of VAP and no VAP (p=0.98). The mean CPISclinical was 5.1 in instances of VAP as compared with 4.9 in instances of no VAP (p=0.05). There was no significant relation (either linear or logarithmic) between the total CPISclinical and the likelihood of VAP (p=0.22, Fig. 3). Of the 438 cases of VAP, 271 (61.9%) had a CPISclinical ≤6.

Association between CPISclinical score and likelihood of VAP (p=0.22).
The ROC AUC for CPISclinical was 0.56, indicating a predictive ability no different than chance. This finding did not change when performing subgroup analyses of initial respiratory cultures (0.59), the early VAP window (0.56), the late VAP window (0.64), or screening for recurrent VAP (0.54). Furthermore, incorporating the results of the gram stain for organisms (CPIS+GS) did not increase the ROC AUC meaningfully; it remained in the 0.60–0.65 range in all of the aforementioned clinical scenarios (data not shown). Finally, we tested CPISclinical in two additional scenarios: 1) When using ≥10⁴ cfu/mL as a diagnostic threshold for VAP (as opposed to ≥10⁵), the ROC AUC for CPISclinical was 0.53; 2) when limiting the sample to cultures obtained bronchoscopically, the ROC AUC for CPISclinical was 0.57.
Additional test performance characteristics for CPISclinical (threshold >6), CPIS+GS (threshold >6), presence of organisms on gram stain, and presence of PMNs on gram stain for the entire sample, as well as subgroup analyses, are summarized in Table 4. At the traditional threshold of >6, the sensitivity and specificity of CPISclinical were 21.0% and 85.7%, respectively. The NPV and PPV were both approximately 50%. Screening based on CPISclinical >6 alone would have missed 271 of 343 cases of VAP (79.0%). Lowering the CPISclinical threshold to five resulted in a sensitivity of 42.0%, to four 68.1%, to three 85.4%, and to two 94.1%. These numbers remained similar in each subgroup analysis. Incorporating the results of the gram stain into CPISclinical (CPIS+GS) increased the sensitivity from 21.0% to 63.3%, although the test remained relatively insensitive. Specificity fell from 85.7% to 64.4%.
PPV=positive predictive value; NPV=negative predictive value; Resp=respiratory; PMN=polymorphonuclear leukocyte.
The presence of organisms on gram stain was moderately sensitive for diagnosing VAP (sensitivity=86.1%); the sensitivity was highest in the case of diagnosing early VAP (93.3%) and lowest in the case of diagnosing recurrent VAP (77.5%). Within the early VAP window, the NPV was 84.4%. Specificity was poor (<50%) in all subgroup analyses. The presence of PMNs on gram stain was moderately sensitive for diagnosing VAP (sensitivity=77.2%); the sensitivity did not change substantially in any of the sub-group analyses. Specificity was low (<40%) for all analyses.
We next examined the test performance characteristics of each of the individual five components of the CPISclinical. Each component was dichotomized as either normal (scored zero) or abnormal (scored >0) (Fig. 4). Tracheal secretions, WBC, and P:F were insensitive for diagnosing VAP (sensitivity <70%). The maximum temperature was moderately sensitive (sensitivity=79.6%) and the CXR findings were highly sensitive (sensitivity=91.1%). Finally, all individual components were poorly specific for diagnosing VAP (specificity <50%).

Test performance characteristics of individual CPIS components.
Every case of VAP was associated with at least one of the following three variables: 1) Fever, 2) abnormal CXR, and 3) presence of organisms on gram stain. With respect to screening algorithms, near-perfect sensitivity was observed in cases where the patient had either a fever or new CXR findings (sensitivity=98.0%). The NPV of this strategy was 82.0%.
Discussion
In this large cross-sectional analysis of lower respiratory tract cultures from ventilated surgical ICU patients, we attempted to identify sensitive screening tools for VAP, such that critically ill patients with VAP would not be missed, and empiric antibiotics might be withheld in certain patients selected for lower respiratory tract cultures. Suspicion for VAP begins with derangement of one or more of the clinical variables grouped into CPISclinical. However, neither CPISclinical nor CPIS+GS performed favorably in any clinical scenario, returning ROC AUC values similar to chance. Furthermore, the traditional cutoff point of six (used in both practice and clinical trials) was highly insensitive. The majority of cases of VAP (79.0%) would have been missed if this cutoff alone were used to decide on whom to obtain a lower respiratory tract culture. Rather, acceptable (>80%) sensitivity occurred at the much lower threshold of three. The most sensitive of the individual CPISclinical components were CXR (91.1%) and Tmax (80.0%). Finally, in the subgroup of the early VAP window (days intubated 2–5), the presence of organisms on gram stain was highly sensitive for VAP (93.3%), suggesting that empiric antibiotic therapy might be safely withheld in patients in this window whose gram stain shows no organisms. This tactic could have a substantial impact on antibiotic use, as approximately one third of lower respiratory tract cultures from patients in the early VAP window did not demonstrate organisms on gram stain.
To our surprise, the majority of lower respiratory tract samples in this series were sent from patients with a CPISclinical ≤6, in violation of our own ICU general guidelines. Several possible explanations for this discrepancy exist. First, most lower respiratory tract cultures from the CPISclinical ≤6 group were obtained on patients with a CPISclinical of either five or six. In these cases, a miscalculation (or disagreement) of as little as one point would have placed many patients over the threshold of six. This subjectivity of CPISclinical, and in particular the respiratory secretions and CXR variables, has been documented previously [12]. Similarly, patients with “borderline” CPISclinical for whom the clinical suspicion of pneumonia was otherwise high (e.g., septic shock) may have been selected for lower respiratory tract culture. However, we were unable to detect any differences between patients with CPISclinical ≤6 and patients with CPISclinical >6, including acuity, causative organism, days of pulmonary failure, and mortality. Finally, the two authors who abstracted the data retrospectively may have systematically underestimated CPISclinical, although this seems unlikely given a small number of consistent reviewers with access to the complete medical record, including final radiographic interpretations.
A related issue involves the large number of patients in this series with a relatively low CPISclinical and microbiologic evidence of VAP. Do these patients actually have VAP and warrant treatment? Most patients with VAP and CPISclinical ≤6 had a score of either five or six, suggesting derangement of at least three clinical variables. For example, a patient with a new CXR infiltrate, fever, and leukocytosis would qualify. It is our contention that, given this clinical scenario, and in addition to microbiologic confirmation, most surgical intensivists would classify this hypothetical patient as having VAP. Secondly, a relatively conservative microbiologic definition of VAP was used in this study (≥10⁵ cfu/mL) [16]. Lastly, quantitative cultures from the lower respiratory tract correlate well with quantitative microbiology from lung parenchymal tissue, which remains the gold standard for diagnosing VAP [18,19].
On the basis of the results of this study we propose the following screening algorithm for suspected VAP: Surgical ICU patients ventilated >48 h are screened for lower respiratory tract culture based on the presence of either fever or new CXR infiltrate. Patients are then stratified into the early (days intubated 2–5) or late (days intubated >5) VAP window. In the early VAP window, empiric antimicrobial therapy is withheld if there are no organisms on the gram stain. In the late VAP window, empiric antimicrobial therapy is initiated regardless of the results of the gram stain. Adoption of this strategy would decrease the false negative rate in our ICU to between 0% and 3%. Diagnosis of VAP by this strategy would thus involve both derangement of one or more clinical variables and microbiologic confirmation.
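The proposed algorithm can be sketched as two decision functions in Python. This is an illustration under stated assumptions, not a validated clinical tool: the function and parameter names are hypothetical, fever and new CXR infiltrate are passed in as already-determined booleans (their cutoffs are not restated here), and the second function assumes the patient has already been selected for culture.

```python
def screen_for_culture(hours_ventilated: float, fever: bool, new_cxr_infiltrate: bool) -> bool:
    """Step 1: patients ventilated >48 h are cultured if they have fever OR a new CXR infiltrate."""
    return hours_ventilated > 48 and (fever or new_cxr_infiltrate)


def start_empiric_therapy(days_intubated: float, organisms_on_gram_stain: bool) -> bool:
    """Step 2: decide on empiric antimicrobials once the gram stain result is available.

    Early window (days intubated 2-5): treat only if organisms are seen.
    Late window (days intubated >5): treat regardless of the gram stain.
    Assumption: the caller has already applied screen_for_culture.
    """
    in_early_window = 2 <= days_intubated <= 5
    if in_early_window:
        return organisms_on_gram_stain
    return True
```

In this sketch, a febrile patient on day 3 of intubation with no organisms on gram stain would be cultured but not started on empiric therapy, whereas the same findings on day 8 would prompt empiric therapy pending final culture results.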
Importantly, this approach would markedly decrease specificity, and at first may appear to result in an overall increase in empiric antibiotic prescription. However, in the case of VAP, high specificity is achieved by following the results of the lower respiratory tract culture, and either discontinuing or de-escalating antibiotic therapy promptly, as appropriate. This follow-up is imperative as both over-diagnosis and overtreatment of VAP are common [5]. Most current data suggest that relatively short (1–3 d) courses of antimicrobial therapy are less harmful than full (10–14 d) courses of empiric therapy that are not ultimately guided by culture results [20,21]. For this reason, adoption of a more sensitive screening trigger may actually decrease overall antibiotic use by avoiding long courses of empiric antibiotic therapy for patients with CPISclinical ≤6 and unrecognized VAP. In order to test this hypothesis, we plan to collect data on antibiotic usage in our surgical ICU following adoption of the more liberal screening algorithm. Finally, because delays in appropriate antimicrobial therapy have been associated with adverse outcomes in patients with VAP [3], a highly sensitive screening tool is mandatory, even at the expense of low specificity.
This study is limited by its retrospective nature and particularly our inability to abstract the CPISclinical as calculated in real time, as well as the actual indication that prompted the lower respiratory tract culture. Similarly, information to calculate a complete CPISclinical was unavailable in approximately 25% of lower respiratory tract cultures. Changes in both providers and protocols for care of patients with VAP may have affected both the decision to obtain a lower respiratory tract culture and prescribe empiric antibiotics over the course of the study period. Our sample consisted of patients selected for BAL, not all ventilated patients. For this reason, we did not observe the usual association between duration of mechanical ventilation and VAP. Finally, we do not follow biomarkers of infection, such as procalcitonin, routinely in our ICU, although the value of such measurements in VAP has not been established [22].
In conclusion, in this large cross-sectional analysis of ventilated surgical ICU patients selected for lower respiratory tract culture, CPIS possessed poor discriminative ability for VAP. Acceptable sensitivity for CPISclinical occurred at a much lower threshold than that used traditionally. Fever, new CXR findings, and the presence of organisms on gram stain in the early VAP window all were highly sensitive for diagnosing VAP, and we have incorporated these variables into a proposed screening algorithm. On the basis of these findings, we recommend abandonment of the CPIS as a screening tool for VAP in the surgical ICU.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
