Abstract
Background:
The Gatti and the bilateral internal mammary artery (BIMA) scores were created to predict the risk of deep sternal wound infection (DSWI) after bilateral internal thoracic artery (BITA) grafting.
Methods:
Both scores were evaluated retrospectively in two consecutive series of patients undergoing isolated multi-vessel coronary surgical procedures—i.e., the Trieste (n = 1,122; BITA use, 52.1%; rate of DSWI, 5.7%) and the Besançon cohort (n = 721; BITA use, 100%; rate of DSWI, 2.5%). Baseline patient characteristics were compared between the two validation samples. For each score, the accuracy of prediction and predictive power were assessed by the area under the receiver-operating characteristic curve (AUC) and the Goodman-Kruskal gamma coefficient, respectively.
Results:
There were significant differences between the two series in terms of age, gender, New York Heart Association functional class, chronic lung disease, left ventricular function, surgical priority, and the surgical techniques used. In the Trieste series, accuracy of prediction of the Gatti score for DSWI was higher than that of the BIMA score (AUC, 0.729 vs. 0.620, p = 0.0033). The difference was not significant, however, in the Besançon series (AUC, 0.845 vs. 0.853, p = 0.880) and when only BITA patients of the Trieste series were considered for analysis (AUC, 0.738 vs. 0.665, p = 0.157). In both series, predictive power was at least moderate for the Gatti score and low for the BIMA score.
Conclusions:
The Gatti and the BIMA scores seem to be useful for pre-operative evaluation of the risk of DSWI after BITA grafting. Further validation studies should be performed.
Use of the left internal thoracic artery (ITA) to bypass the left anterior descending coronary artery is a well-established procedure in coronary surgical procedures. The increased ITA resistance to intimal hyperplasia and medial calcification compared with venous and other arterial grafts could improve late outcomes after operation [1]. Although no significant difference was found in the rate of all-cause death at 10 years in patients assigned randomly to undergo either bilateral (BITA) or single ITA (SITA) grafting in the Arterial Revascularization Trial [2], several observational studies and meta-analyses have since reported increased long-term survival with BITA use [3–7].
Nonetheless, the rate of BITA use remains low worldwide, ranging from 4% in North America [8–10] to 10% in Europe [11]. The perceived increased risk of sternal surgical site complications, primarily infections, is the major reason for not using BITA grafts [9]. In fact, BITA harvesting is an independent predictor of sternal complications [2,10–12], including deep sternal wound infection (DSWI), which is the most serious form of sternal infection after sternotomy, and a strong risk factor for early and late death after cardiac operation [13,14].
In this context, to reduce the rate of DSWI after coronary operation without losing the presumed long-term survival benefit of BITA use, the first weighted scoring system to predict the risk of DSWI after BITA grafting, namely the Gatti score, was developed in 2015 by Gatti and colleagues [15]. The score outperformed existing scoring systems for predicting sternal wound infection after coronary operation and was validated internally (using bootstrapping) [15] and externally in three different studies [16–18].
In February 2018, a new scoring system was published by Raja and Benedetto [19] to guide decision making for BITA use—namely, the bilateral internal mammary artery (BIMA) score. Although the BIMA score has been validated internally (also using bootstrapping), no external validation has been performed to date [19].
The purpose of this study, therefore, was to perform the first external validation of the BIMA score and to compare the performance of the Gatti and the BIMA scores for the pre-operative prediction of the risk of DSWI after coronary operation.
Methods
The Gatti and BIMA scores were compared in two large, consecutive series of patients undergoing isolated multi-vessel coronary bypass surgical procedures—namely, the Trieste and Besançon cohorts. The Trieste validation sample was composed of 1,122 patients receiving either one (n = 537, 47.9%) or two ITA grafts (n = 585, 52.1%) and operated on between January 2014 and March 2019 at the cardio-thoracic and vascular department of the University Hospital of Trieste, Italy. The Besançon validation sample was composed of 721 BITA patients operated on between January 2015 and December 2017 at the department of thoracic and cardio-vascular surgery of the University Hospital Jean Minjoz of Besançon, France.
Details pertaining to the patients and their disease were recorded prospectively in a computerized data registry. For both samples, post-discharge surveillance of surgical sites was performed for every patient in a specifically dedicated surgical outpatient clinic. All patients who experienced surgical site complications were referred to this outpatient clinic, regardless of the time elapsed since hospital discharge. Sternal wound infections were defined according to the 2017 Centers for Disease Control and Prevention (CDC) classification [20]. Patients in whom superficial sternal wound infections developed were excluded from this analysis.
The baseline characteristics of patients in the original series from which the Gatti and the BIMA scores were derived, as well as the criteria for score attribution and DSWI risk calculation, were obtained from the corresponding published original articles [15,19].
The study was conducted in accordance with the principles of the Declaration of Helsinki. The local ethical committee in each hospital approved the study, and the requirement for individual patient consent was waived.
Statistical methods
Quantitative variables are expressed as mean ± standard deviation, or median (interquartile range), and categorical variables as number (percentage). Baseline characteristics were compared between the two cohorts using the chi-square test. The accuracy of prediction (discriminatory power) of each score was assessed using receiver-operating characteristic curves with calculation of the area under the curve (AUC) and 95% confidence interval (95% CI). According to arbitrary guidelines [21], the accuracy of prediction was defined as low (AUC, 0.5 to 0.7), moderate (AUC, 0.7 to 0.9), and high (AUC, 0.9 to 1).
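As an illustration of the discrimination measure described above, the following minimal sketch computes an AUC as the Mann-Whitney probability that a randomly chosen patient with DSWI has a higher score than a randomly chosen patient without DSWI, and then applies the low/moderate/high labels. The function names and the toy data are hypothetical. Because the published bands share their boundary values, the sketch assumes that a boundary value (0.7 or 0.9) belongs to the upper category. The study itself used SPSS, so this is not the authors' implementation.

```python
def auc_mann_whitney(scores, outcomes):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    scores   -- risk score per patient (higher = higher predicted risk)
    outcomes -- 1 if the patient had the event (e.g. DSWI), else 0
    Ties between an event and a non-event count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classify_auc(auc):
    """Label an AUC using the bands quoted in the text.

    Assumption: boundary values (0.7, 0.9) fall in the upper band.
    """
    if auc >= 0.9:
        return "high"
    if auc >= 0.7:
        return "moderate"
    return "low"


# Hypothetical toy data: four patients, two with the event.
toy_scores = [0, 1, 1, 2]
toy_outcomes = [0, 0, 1, 1]
toy_auc = auc_mann_whitney(toy_scores, toy_outcomes)
toy_label = classify_auc(toy_auc)
```

The pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of the size analyzed here; a rank-based formulation would be preferable for much larger samples.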
The two scores were compared using the Hanley-McNeil method. For both scores, goodness-of-fit (calibration) in BITA patients was assessed with the Hosmer-Lemeshow test: high chi-square values (p < 0.05) indicate poor fit, and low chi-square values (with p closer to 1) indicate good fit of the logistic regression model. The predictive power of both scores was assessed using the Goodman-Kruskal gamma coefficient. According to Haley [22], the predictive power was defined as low (gamma <0.3), moderate (gamma, 0.3 to 0.5), and high (gamma >0.5). A p < 0.05 was considered statistically significant. Analyses were performed using SPSS software for Windows, version 13.0 (SPSS, Inc., Chicago, IL).
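The Goodman-Kruskal gamma mentioned above can be sketched as follows. Gamma is the difference between the numbers of concordant and discordant pairs divided by their sum, with tied pairs ignored. The function names and the toy data are hypothetical, and the thresholds are those quoted in the text. This is a minimal sketch, not the routine used in the study.

```python
def goodman_kruskal_gamma(x, y):
    """Goodman-Kruskal gamma: (C - D) / (C + D).

    C = number of concordant pairs, D = number of discordant pairs.
    Pairs tied on x or on y are ignored.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = 0
    discordant = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    if concordant + discordant == 0:
        raise ValueError("all pairs are tied; gamma is undefined")
    return (concordant - discordant) / (concordant + discordant)


def classify_gamma(gamma):
    """Label gamma with the thresholds quoted in the text:
    low (< 0.3), moderate (0.3 to 0.5), high (> 0.5)."""
    if gamma > 0.5:
        return "high"
    if gamma >= 0.3:
        return "moderate"
    return "low"


# Hypothetical toy data: score value and outcome (1 = event) for four patients.
toy_scores = [1, 2, 3, 4]
toy_outcomes = [0, 1, 0, 1]
toy_gamma = goodman_kruskal_gamma(toy_scores, toy_outcomes)
```

In the toy data there are three concordant pairs, one discordant pair, and two pairs tied on the outcome, so gamma is (3 - 1) / (3 + 1) = 0.5.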
Results
The two original series used to develop the Gatti and BIMA scores differed significantly in terms of baseline characteristics. These differences, as well as the surgical techniques and rate of DSWI in the two original development series, are summarized in Supplementary Table S1. The comparison of the two original series and the two validation samples of the present study is presented in Supplementary Tables S2 and S3. In all patients of both validation series, ITAs were harvested as skeletonized grafts.
Gatti score versus BIMA score: The Trieste validation series
A DSWI occurred in 64 (5.7%) cases overall: in 4.3% (n = 25) of BITA patients and in 7.3% (n = 39) of SITA patients (Table 1).
Baseline Characteristics of Patients in the Two Validation Series *
NYHA = New York Heart Association; CHF = congestive heart failure; LVEF = left ventricular ejection fraction; IABP = intra-aortic balloon pumping; BITA = bilateral internal thoracic artery; SITA = single internal thoracic artery; DSWI = deep sternal wound infection.
Data are expressed as number of patients with the percentage in parentheses. Age and body mass index are reported as range between quartiles of the Trieste series, in brackets.
All patients: The Besançon validation series versus the Trieste validation series.
BITA patients only: The Besançon validation series versus the Trieste validation series.
The between-score difference in discriminatory power was significant in the overall series (p = 0.0033), but was not significant in BITA (p = 0.157) or SITA patients (p = 0.09). In the overall series, there was no statistically significant between-score difference in the accuracy of prediction for DSWI in patients with off-pump operation only (Gatti score, AUC, 0.826, 95% CI, 0.700 to 0.915 vs. BIMA score, AUC, 0.674, 95% CI, 0.534 to 0.794; p = 0.132), whereas the difference was significant for patients with on-pump operation only (AUC, 0.721, 95% CI, 0.693 to 0.748 vs. AUC, 0.605, 95% CI, 0.575 to 0.634; p = 0.0067) (Fig. 1A–D).

Gatti score versus bilateral internal mammary artery (BIMA) score. Evaluation of accuracy of prediction of deep sternal wound infection in the Trieste validation series.
In BITA patients, the goodness-of-fit of the Gatti score (chi-square, 1.53, degrees of freedom, 4, p = 0.822) was better than that of the BIMA score (chi-square, 8.67, degrees of freedom, 8, p = 0.371). The predictive power was moderate (Goodman-Kruskal gamma, 0.53) for the Gatti score and low (Goodman-Kruskal gamma, 0.24) for the BIMA score.
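The p values reported for the Hosmer-Lemeshow statistics can be checked from the chi-square value and the degrees of freedom alone. The sketch below uses the closed-form upper-tail probability of the chi-square distribution, which holds only for an even number of degrees of freedom (as in both cases here). The function name is hypothetical, and the sketch is a consistency check rather than the software routine used in the study.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Valid only for even df = 2m, where
        P(X >= x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Statistics reported in the text for BITA patients of the Trieste series.
p_gatti = chi2_upper_tail_even_df(1.53, 4)
p_bima = chi2_upper_tail_even_df(8.67, 8)
```

Both values come out close to the reported p values (about 0.82 and 0.37); small differences in the third decimal are expected because the chi-square statistics are themselves rounded to two decimals.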
Gatti score versus BIMA score: The Besançon validation series
A DSWI occurred in 18 (2.5%) cases (Table 1). For both scores, the accuracy of prediction was moderate (Gatti score, AUC, 0.845, 95% CI, 0.817 to 0.871 vs. BIMA score, AUC, 0.853, 95% CI, 0.825 to 0.878; p = 0.880) (Fig. 2).
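A comparison of two AUCs obtained in the same patients, in the spirit of the Hanley-McNeil method, can be sketched as follows. The standard error uses the Hanley-McNeil approximation, and the z statistic accounts for the correlation r between the two AUC estimates. The value of r is not reported in the article, so the one used below is purely hypothetical, and the sketch is not expected to reproduce the published p values or confidence intervals. The function names are also hypothetical.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Hanley-McNeil approximation of the standard error of an AUC.

    n_pos -- number of patients with the event
    n_neg -- number of patients without the event
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def compare_correlated_aucs(auc1, auc2, n_pos, n_neg, r):
    """z test for two AUCs estimated in the same patients.

    r -- correlation between the two AUC estimates (an input here;
         in practice it is derived from the data).
    Returns (z, two-sided p value from the standard normal).
    """
    se1 = hanley_mcneil_se(auc1, n_pos, n_neg)
    se2 = hanley_mcneil_se(auc2, n_pos, n_neg)
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2
    if var_diff <= 0:
        raise ValueError("non-positive variance of the AUC difference")
    z = (auc1 - auc2) / math.sqrt(var_diff)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# AUCs and event counts from the Besancon series (18 DSWI among 721 patients);
# r = 0.5 is a hypothetical placeholder, not a value from the study.
z, p = compare_correlated_aucs(0.845, 0.853, n_pos=18, n_neg=703, r=0.5)
```

With so few events, the standard error of each AUC is large relative to the 0.008 difference between the two scores, which is consistent with the non-significant comparison reported above.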

Gatti score versus bilateral internal mammary artery (BIMA) score. Evaluation of accuracy of prediction of deep sternal wound infection in the Besançon validation series. AUC = area under the curve.
Discussion
This study is the first to compare the only two existing scoring systems specifically devised to predict the risk of DSWI after BITA grafting, namely the Gatti [15] and BIMA scores [19]. The Gatti score was developed originally from a series of 2,872 BITA patients and included nine discrete predictive variables: female gender, body mass index >30 kg/m², diabetes mellitus managed orally, diabetes managed with insulin, poor glycemic control, chronic lung disease, chronic dialysis, congestive heart failure, and urgent surgical priority.
In its original development series, the score showed high predictive power (Goodman–Kruskal gamma, 0.76) and moderate accuracy of prediction (AUC, 0.72, 95% CI, 0.70 to 0.73) [15]. The accuracy of prediction was moderate also in three validation samples composed of 304 (AUC, 0.82, 95% CI, 0.72 to 0.91) [16], 255 (AUC, 0.78, 95% CI, 0.64 to 0.92) [17], and 53 (AUC, 0.84, 95% CI, 0.71 to 0.92) prospectively enrolled patients [18].
The BIMA score was developed originally in a series of 5,234 BITA/SITA patients and included five predictors: age, female gender, body mass index, diabetes managed with insulin, and left ventricular ejection fraction <30%. Age and body mass index were considered as continuous predictive variables. Body mass index was scored on two different scales according to BITA status. The model showed moderate discriminatory power (AUC, 0.75) [19]. Yet, to date, no external validation study has been performed.
In the present study, both scores were evaluated in a series of more than 1,100 BITA/SITA patients undergoing isolated multi-vessel coronary operation at the Trieste University Hospital, Italy, and in a second cohort of more than 700 French BITA patients (Besançon University Hospital). In both validation series, ITAs were harvested as skeletonized grafts for every patient.
In the Trieste series, the discriminatory power of each score was calculated for the overall series, and for BITA and SITA patients, separately; the goodness-of-fit of both scores was assessed in BITA patients; the difference between the actual rate and the expected risk of DSWI by both scores was compared; and the predictive power of each score was estimated. Accuracy of prediction and predictive power of the two scores were assessed in the French patients using the same methodology.
The most relevant finding of this comparative study was that the Gatti score showed moderate discriminatory power for BITA patients in both validation samples, while the discriminatory power of the BIMA score was low in Italian patients and moderate in those of the French series. The difference in accuracy of prediction between the two scores in the Trieste series was significant only for the overall series (but not for BITA or SITA patients alone) and only for on-pump surgery. In BITA patients, the Gatti score was superior to the BIMA score in terms of goodness-of-fit. In addition, in both validation samples, the predictive power was moderate or high for the Gatti score and low for the BIMA score.
There are at least two significant reasons to account for the superiority of the Gatti score over the BIMA score in the Trieste series. First, both scores were assessed at the center (Trieste) where the Gatti score was devised, although the present validation sample does not include the cohort of patients used for the development of the Gatti score. Practitioners in Trieste may have acquired skills in the peri-operative treatment of patients with specific comorbidities.
Second, the two original series were very different in terms of almost all the most relevant baseline characteristics, as well as the rates of use of ITA skeletonized technique and BITA grafting, and incidence of DSWI. Conversely, there was no significant difference in discriminatory power between the two scores in the French series. Both scores showed good calibration (goodness-of-fit) for BITA patients. Finally, in both validation series, the predictive power was moderate/high for the Gatti score and low for the BIMA score.
This study has several limitations that deserve to be acknowledged. First, for each predictive system, score attribution and DSWI risk calculation were performed retrospectively and separately by two senior authors (AP, GG). In the event of discordant evaluations, a consensus was reached with the other authors of the present article. Second, data concerning baseline characteristics and predictors of sternal wound infections such as current smoking, peripheral vascular disease, and renal impairment were not available for both series. Third, there is a possibility that there were differences in the definitions used for some pre-operative variables, as well as for the classification of sternal wound infections in the two participating centers, even though internationally agreed definitions were used [20].
Fourth, although the surgeons' notes on surgical site revisions have been reviewed to ensure that the definitions were in accordance with the CDC classification of the sternal wound infections [20], there is the possibility that some superficial incisional infections were misclassified as deep incisional infections. Fifth, details pertaining to sternal wiring techniques and peri-operative management of sternal wounds were not reported. Sixth, no information on the prophylactic use of antibiotic agents was considered.
Finally, a clarification. Although it was devised specifically for BITA patients (as stated by the authors [19]), the BIMA score was created from a population of subjects who underwent either BITA or SITA grafting [19]. Therefore, to perform score validation, we adopted a combined cohort of BITA and SITA patients (i.e., the Trieste validation series); consequently, the performance of the two scores was explored even in SITA patients. Nevertheless, in further analyses in this article, both scores were evaluated in the BITA patients of the Trieste series alone, as well as in the Besançon validation series, which is composed of only BITA patients.
Conclusion
According to the results of the present study, both the Gatti and the BIMA scores seem to be useful for pre-operative evaluation of the risk of DSWI after BITA grafting and could be adopted in clinical practice to guide decision making for BITA use. Further validation studies on larger cohorts of patients should be performed, however, especially for the BIMA score.
References
