Abstract
The Glasgow Coma Scale (GCS) and the Abbreviated Injury Score of the head region (HAIS) are validated prognostic factors in traumatic brain injury (TBI). The aim of this study was to compare the prognostic performance of an alternative predictive model including motor GCS, pupillary reactivity, age, HAIS, and presence of multi-trauma for short-term mortality with a reference predictive model including motor GCS, pupil reaction, and age (IMPACT core model). A secondary analysis of a prospective epidemiological cohort study in Switzerland including patients after severe TBI (HAIS >3) with the outcome death at 14 days was performed. Performance of prediction, accuracy of discrimination (area under the receiver operating characteristic curve [AUROC]), calibration, and validity of the two predictive models were investigated. The cohort included 808 patients (median age, 56; interquartile range, 33–71), median GCS at hospital admission 3 (3–14), abnormal pupil reaction 29%, with a death rate of 29.7% at 14 days. The alternative predictive model had a higher accuracy of discrimination to predict death at 14 days than the reference predictive model (AUROC 0.852, 95% confidence interval [CI] 0.824–0.880 vs. AUROC 0.826, 95% CI 0.795–0.857; p < 0.0001). The calibration of the alternative predictive model was equivalent to that of the reference predictive model (Chi2 8.52, Hosmer-Lemeshow p = 0.345 vs. Chi2 8.66, Hosmer-Lemeshow p = 0.372). The optimism-corrected value of AUROC for the alternative predictive model was 0.845. After severe TBI, a higher performance of prediction for short-term mortality was observed with the alternative predictive model, compared with the reference predictive model.
Introduction
In the acute care trauma setting of level 1 trauma centers, severe TBI is the cause of trauma-related mortality in approximately two-thirds of cases. 7 The proportion of trauma-related mortality attributable to TBI was lower (39%) when administrative data were used. 8 The outcome “mortality at 14 days” is a relevant quality indicator for the performance of the pre- and in-hospital acute care system, including care of TBI. 9,10 For instance, if locally observed mortality is higher than estimated, local clinical pathways of acute pre- and in-hospital care can be critically reviewed and specific quality programs can be established. 11 However, the performance of current models predicting mortality following severe TBI is not excellent.
The Glasgow Coma Scale (GCS) is an established predictor of death and unfavorable outcome. 12 Although GCS alone correlates with survival and functional outcome, this correlation is weak and inconsistent. 13 –15 The GCS seems to be a controversial prognostic factor for outcome in elderly patients, potentially related to a discrepancy between GCS and anatomic TBI severity in this specific population. 16,17 The performance of prediction is improved if supplementary independent variables are included, such as pupillary reactivity and age. Several models have been developed that take into account these and further factors in an attempt to predict outcomes in patients with TBI. The performance of some of these models is good (for instance, the Corticoid Randomisation After Significant Head injury [CRASH] and International Mission on Prognosis and Analysis of Clinical Trials [IMPACT] models) and they have good generalizability. 10,18 –20 One major difference between the CRASH and IMPACT (core) model is the implementation of the GCS: the CRASH model uses all three subscales of GCS, 21 while the IMPACT (core) model uses only the motor subscale. 22 Motor subscale of GCS used in the IMPACT model may be more accurate at hospital admission in patients after severe TBI because this neurological assessment is potentially less biased than the total GCS. Typical biases of the assessment with GCS are sedation and/or alcohol consumption before injury, endotracheal intubation, and facial trauma. 23 Further, the IMPACT model was validated for 14-day mortality prediction. 10
One possibility to improve prediction performance early after TBI is the addition of head computed tomography (CT) findings to clinical findings. The Nijmegen prediction model has tested this approach using a mix of clinical parameters and the presence or absence of head CT alterations. The performance of prediction for mortality at 6 months with these additional variables was good (area under the curve 0.86, 95% confidence interval [CI] 0.82–0.90 in the external validation model). 24
Another possibility to improve prediction performance is the addition of the Abbreviated Injury Score (AIS). AIS is a well-defined anatomical or structural scoring system of trauma with less interference of external factors, such as sedation, alcohol, intubation, and facial trauma. According to AIS, the severity of each injury in the body regions (head, face, neck, thorax, abdomen, spine, upper extremities, and lower extremities) is graded on a scale from 1 to 6 points. In most but not all cases, accurate grading requires diagnostic imaging. 25,26 As with the GCS, interrater agreement on the degree of severity may be limited. 27 Injuries of grade 3 or higher are considered to be relevant or severe. A score of 6 denotes an unsurvivable injury. The AIS is updated periodically to reflect changes in mortality and is the basis for other scoring systems such as the Injury Severity Score (ISS), 28 the Trauma and Injury Severity Score, 29 and the Revised Injury Severity Classification. 30 The AIS of the head region (HAIS) is a validated prognostic factor in TBI. 31 In the few studies using both initial GCS and HAIS, the relationship between these two scores was limited and the accuracy of mortality prediction differed between the scores. 17,32 –35 In addition, major extracranial injury could be a further prognostic factor of mortality in TBI patients. However, there is some evidence that at least in patients with severe TBI, major extracranial injuries have a minor prognostic effect on 6-month mortality. 36
Prediction models in patients after severe TBI that included HAIS and major extracranial injury as prognostic factors and mortality as outcome were rarely investigated. 32,37 Prediction models that have investigated HAIS and mortality did not describe in detail the process of discrimination, calibration, and validation. 38,39 We hypothesized that the inclusion of the anatomical description of the injury including its grades of severity using the AIS system could improve the performance of a “traditional” prediction model in TBI patients for mortality at 14 days.
The aim of this study was to compare the predictive performance of an alternative predictive model including motor GCS, pupil reaction, age, HAIS, and presence of multi-trauma for short-term mortality and a reference predictive model including motor GCS, pupil reaction, and age (IMPACT core model).
Methods
This article has been developed based on the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guidelines. 40
Source of data
We re-analyzed the database of a prospective epidemiological cohort study including patients after severe TBI with a follow-up from the time of accident until 14 days post-injury or earlier death. 5 The study was approved by the ethics committees of the participating trauma centers. As a result of their neurological condition, patients were unable to give informed consent before enrollment. The local study coordinators contacted their legal representatives (proxies) to inform them of the study within 14 days of the TBI. Patients and/or proxies received detailed written information on the study and were asked for consent. In the case of withdrawal, further follow-up was discontinued. At the patient's request, the collected data were removed from the database and destroyed.
Study population and inclusion criteria
We included patients aged ≥16 years who had sustained severe TBI from either blunt or penetrating trauma in Switzerland. Severe TBI was defined by a HAIS of more than 3. HAIS was assessed based on the diagnoses made by the neurosurgeons or radiologists in charge and established after computed tomography imaging of the head. The worst CT scan in the first 24 h was assessed using a standardized data sheet based on the HAIS. Patients who died before neurosurgical or radiological diagnosis were included if the history of the trauma and trauma signs of severe head injury were documented by the out-of-hospital emergency medical services. Patients with unclear brain trauma history (e.g., comatose patients found in a public area without witnessed injury from bystanders) or no signs of brain trauma (e.g., fatal multi-trauma patients with abdominal and thoracic injuries without visible injuries to the head) were excluded. Patients who died on scene and patients with missing prediction factors also were excluded.
Outcome: Mortality at 14 days
Included predictors for mortality at 14 days were: 1. Patient characteristics: age. 2. Initial physiological and biological variables: GCS, motor subscale score of GCS, and pupil reaction at hospital admission in the emergency department (ED). 3. Severity of TBI: HAIS scoring based on clinical assessment and a cerebral CT scan taken within 24 h of the injury. 4. Severity of concomitant injuries: the ISS, which includes concomitant injuries, was calculated within the 24 h following the injury event; injuries were classified as mono-trauma or multiple trauma. Multiple trauma was defined as AIS >2 in another body region.
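The predictor definitions above can be made concrete with a minimal sketch. The record layout and the names `Patient`, `is_severe_tbi`, `has_multiple_trauma`, and `motor_category` are illustrative assumptions and do not come from the study database; only the thresholds (HAIS >3, AIS >2 in another body region) and the motor GCS categories (1–2, 3–4, 5–6) are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Patient:
    """Hypothetical record holding the predictors named in the text."""
    age: int
    motor_gcs: int                 # motor subscale of GCS at ED admission (1-6)
    abnormal_pupils: bool          # pupil reaction at ED admission
    ais_by_region: Dict[str, int]  # highest AIS grade per body region


def is_severe_tbi(p: Patient) -> bool:
    # Inclusion criterion from the text: HAIS of more than 3.
    return p.ais_by_region.get("head", 0) > 3


def has_multiple_trauma(p: Patient) -> bool:
    # Definition from the text: AIS > 2 in a body region other than the head.
    return any(grade > 2
               for region, grade in p.ais_by_region.items()
               if region != "head")


def motor_category(motor_gcs: int) -> str:
    # Three categories used for the motor subscale: 1-2, 3-4, 5-6.
    if not 1 <= motor_gcs <= 6:
        raise ValueError("motor GCS must be between 1 and 6")
    if motor_gcs <= 2:
        return "1-2"
    if motor_gcs <= 4:
        return "3-4"
    return "5-6"
```

For example, a patient with HAIS 5 and a thoracic AIS of 3 would satisfy both `is_severe_tbi` and `has_multiple_trauma` under these assumptions.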
Sample size and missing data
The sample size was pre-specified. 5 We excluded 11 patients who died on scene and 102 patients with missing predictive factors. Seventy-seven had missing motor GCS scores, and 31 patients had missing pupil reactions (Fig. 1; Supplementary Table 1; see online supplementary material).

Fig. 1. Flow chart of enrolled and included patients.
Statistical analysis
Patients' baseline characteristics were described by medians and interquartile ranges (IQRs) for continuous variables and frequencies and percentages for categorical variables. Descriptive statistics were computed for the entire population and for two subgroups: survivors versus non-survivors at 14 days. The predictor “age” was presented as a distribution (median, IQR). The predictors “GCS” and “motor GCS” were presented as a distribution and in three categories each (3–8, 9–12, 13–15 and 1–2, 3–4, 5–6, respectively). The predictor “HAIS” was presented as a distribution and in three categories (HAIS 4, HAIS 5, HAIS 6). Differences between survivors and non-survivors were assessed by non-parametric Wilcoxon rank-sum tests for continuous variables and by Chi2 tests for categorical variables (univariate analyses).
Development of the models
There was an a priori decision to develop a predictive model based on data from the ED and to include the subscale “motor” of the GCS score only, similar to the reference predictive model 22 ; further, motor score was associated with long-term functional outcome. 41 The model included the predictors age, motor GCS at ED admission, pupil reaction at ED admission, HAIS, and multiple trauma. The predictor “age” was used continuously, the other predictors were used in categories.
First, an analysis of risk factors for death at 14 days based on the univariate analyses was performed, including HAIS and multiple trauma. Multivariate logistic regression models were used with backward selection based on the likelihood ratio (LR) Chi2 test to select the final alternative predictive model.
Second, an analysis based on the reference predictive model (IMPACT core model) was performed using a multivariate logistic regression model; this model includes motor GCS score at ED admission, pupillary reactivity at ED admission, and age as predictors, and death at 14 days as outcome. 22 This reference predictive model was used as described in the published validation study. 22 This reference predictive model was refitted for our cohort.
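As a sketch of the two model structures described above, the logistic regressions can be written as follows. The indicator coding of the categorical predictors and the symbols ($\beta$, $j$, $k$, $l$) are assumptions made for illustration; the article states only which predictors enter each model and that age was used continuously.

```latex
% Reference predictive model (IMPACT core predictors, refitted):
\operatorname{logit} P(\text{death at 14 days})
  = \beta_0 + \beta_{\text{age}}\,\text{age}
  + \sum_{j} \beta_{\text{motor},j}\,\mathbf{1}[\text{motor GCS category } j]
  + \sum_{k} \beta_{\text{pupil},k}\,\mathbf{1}[\text{pupil reaction category } k]

% Alternative predictive model: the same terms plus HAIS and multiple trauma:
\operatorname{logit} P(\text{death at 14 days})
  = \beta_0 + \beta_{\text{age}}\,\text{age}
  + \sum_{j} \beta_{\text{motor},j}\,\mathbf{1}[\text{motor GCS category } j]
  + \sum_{k} \beta_{\text{pupil},k}\,\mathbf{1}[\text{pupil reaction category } k]
  + \sum_{l} \beta_{\text{HAIS},l}\,\mathbf{1}[\text{HAIS} = l]
  + \beta_{\text{MT}}\,\mathbf{1}[\text{multiple trauma}]
```

Written this way, the reference model is nested in the alternative model, which is what allows the two to be compared with a likelihood ratio test.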
The performance of the two prediction models (alternative predictive model and reference predictive model) was assessed by determining the explained variation (by Nagelkerke's pseudo R2 and deviance), their discrimination (area under the receiver operating characteristic curve [AUROC]), and their calibration (by calibration slope and intercept).
Nagelkerke's R2 measures the explained variation of the model. 42 The deviance is a method to assess the “goodness-of-fit” of a model. It is defined as twice the difference between the log likelihoods of two models. This quantity compares the values predicted by the fitted model and those predicted by “the most complete model we could fit.” Evidence for model lack-of-fit occurs when the value of deviance is large.
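The quantities described above have standard textbook forms, given here as a reminder rather than as the exact computation performed by the statistical software. The symbols are assumptions for illustration: $L_0$ is the likelihood of the intercept-only model, $L_M$ that of the fitted model, $L_S$ that of the saturated (“most complete”) model, and $n$ the number of patients.

```latex
% Deviance of a fitted model M against the saturated model S:
D_M = -2\,\bigl(\ln L_M - \ln L_S\bigr)

% Likelihood ratio statistic for two nested models
% (reference model nested in the alternative model):
\mathrm{LR}\ \chi^2 = D_{\text{ref}} - D_{\text{alt}}
                    = 2\,\bigl(\ln L_{\text{alt}} - \ln L_{\text{ref}}\bigr)

% Nagelkerke's R^2 (Cox-Snell R^2 rescaled to a maximum of 1):
R^2_{N} = \frac{1 - \left(L_0 / L_M\right)^{2/n}}{1 - L_0^{\,2/n}}
```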
Model discrimination
The AUROCs of the two fitted models (reference predictive model and alternative predictive model) were calculated to evaluate their discriminative ability or predictive validity. AUROC values range from 0.5 to 1.0, where 1.0 indicates perfect discrimination and 0.5 indicates that the discrimination ability of the model is no better than chance. Generally, AUROCs >0.90 are considered excellent, >0.80 good, >0.70 modest, and ≤0.70 poor. 43 The AUROC of the alternative predictive model and the AUROC of the reference predictive model were compared. A p value <0.05 was taken to indicate that the two AUROCs differ (one AUROC is better than the other). 44
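To illustrate what the AUROC measures, the following minimal sketch computes it as the probability that a randomly chosen non-survivor received a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function name `auroc` and its arguments are hypothetical; the sketch does not reproduce the statistical test used to compare the two AUROCs.

```python
from typing import Sequence


def auroc(scores: Sequence[float], died: Sequence[int]) -> float:
    """AUROC as the share of (non-survivor, survivor) pairs ranked correctly.

    scores: predicted risk of death for each patient
    died:   1 if the patient died within 14 days, 0 otherwise
    """
    dead = [s for s, d in zip(scores, died) if d == 1]
    alive = [s for s, d in zip(scores, died) if d == 0]
    if not dead or not alive:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for s_dead in dead:
        for s_alive in alive:
            if s_dead > s_alive:
                wins += 1.0      # pair ranked correctly
            elif s_dead == s_alive:
                wins += 0.5      # tie counts as half
    return wins / (len(dead) * len(alive))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formula would give the same value more efficiently.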
Calibration of models
We evaluated the calibration of the two fitted models (alternative predictive model and reference predictive model), that is, the concordance between predicted and observed outcome over the entire risk spectrum. This concordance was assessed by the Hosmer-Lemeshow test. 43 The test divides the patients into ten equally sized groups (deciles) based on predicted mortality risk and compares observed with expected deaths in each group, summarized in a Chi2 statistic. The smaller the Chi2 value and the larger the p value, the better the calibration.
A graphical assessment of calibration was performed for the alternative predictive model, with the predicted death rate at 14 days on the x-axis and the observed death rate at 14 days on the y-axis. Perfect predictions should be on the 45° line. Observed death rate at 14 days was plotted by decile of predicted death rate (graphical illustration of the Hosmer-Lemeshow goodness-of-fit test).
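The decile-based calibration check can be sketched as follows. The function name `hosmer_lemeshow` and its arguments are hypothetical, and the grouping rule (sorting by predicted risk and cutting into groups of near-equal size) is an assumption; statistical packages may handle ties and group boundaries differently. The sketch returns the Chi2 statistic, which is conventionally compared with a Chi2 distribution with (number of groups − 2) degrees of freedom, together with the per-group points for a calibration plot.

```python
from typing import List, Sequence, Tuple


def hosmer_lemeshow(
    pred: Sequence[float], died: Sequence[int], groups: int = 10
) -> Tuple[float, List[Tuple[float, float]]]:
    """Return (Chi2 statistic, [(mean predicted risk, observed death rate)])."""
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    n = len(order)
    chi2 = 0.0
    points: List[Tuple[float, float]] = []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue  # fewer patients than groups
        size = len(idx)
        expected = sum(pred[i] for i in idx)   # expected deaths in the group
        observed = sum(died[i] for i in idx)   # observed deaths in the group
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        if denom > 0:
            chi2 += (observed - expected) ** 2 / denom
        # x = predicted death rate, y = observed death rate
        points.append((mean_p, observed / size))
    return chi2, points
```

Plotting the returned points against the 45° line gives the graphical assessment described above.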
Validation of the alternative predictive model
Bootstrapping with 2000 repetitions was used to correct the AUROC of the alternative predictive model for optimism. 45 This method yields an optimism-corrected AUROC that was compared with the AUROC of the initial alternative predictive model: the closer the two AUROCs are, the less overfitting is expected.
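A minimal sketch of bootstrap optimism correction is given below, assuming the commonly used procedure in which the model is refitted in each bootstrap sample and its performance in that sample is compared with its performance in the original data; the article does not spell out the exact steps. The names `optimism_corrected`, `fit`, and `metric` are hypothetical placeholders for the model-fitting routine and the performance measure (for example, a logistic regression and the AUROC).

```python
import random
from typing import Callable, Sequence, TypeVar

Row = TypeVar("Row")
Model = TypeVar("Model")


def optimism_corrected(
    rows: Sequence[Row],
    fit: Callable[[Sequence[Row]], Model],
    metric: Callable[[Model, Sequence[Row]], float],
    n_boot: int = 2000,
    seed: int = 0,
) -> float:
    """Apparent performance minus the average bootstrap optimism."""
    if n_boot < 1:
        raise ValueError("n_boot must be at least 1")
    rng = random.Random(seed)
    apparent = metric(fit(rows), rows)
    total_optimism = 0.0
    for _ in range(n_boot):
        # Resample patients with replacement, same size as the original cohort.
        sample = [rows[rng.randrange(len(rows))] for _ in rows]
        model = fit(sample)
        # Optimism: performance in the bootstrap sample minus performance
        # of the same model when applied to the original data.
        total_optimism += metric(model, sample) - metric(model, rows)
    return apparent - total_optimism / n_boot
```

Passing the fitting routine and the metric as arguments keeps the resampling logic independent of any particular model. A metric such as the AUROC is undefined for a bootstrap sample that contains only survivors or only non-survivors, so a real analysis would need to handle that case.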
Statistical analyses were performed using STATA Release 12.1 (Stata Statistical Software: Release 12.1 Stata Corporation, College Station, TX). Statistical significance was set at p < 0.05 for each analysis.
Results
Baseline characteristics
A total of 808 patients were included in this study (Fig. 1). The median age was 56 (IQR 33–71), the median GCS at ED admission was 3 (IQR 3–14), and 29% had abnormal pupil reaction; the overall mortality rate at 14 days was 29.7% (240 of 808 patients; Table 1). Non-survivors were significantly older, had a higher severity of brain injury, and had a higher rate of multiple trauma in the univariate analyses (Table 1). The distribution of the patients to the trauma centers was heterogeneous, with a high rate of direct admission (70% to 88%; Supplementary Table 2; see online supplementary material).
TBI, traumatic brain injury; IQR, interquartile range; GCS, Glasgow Coma Scale; ED, emergency department; HAIS, Abbreviated Injury Score of the head region.
Model development
Using the backward selection, the best alternative predictive model included the five tested predictive factors: age, motor GCS score, pupil reaction, HAIS, and multiple trauma (Table 2).
OR, odds ratio; CI, confidence interval; GCS, Glasgow Coma Scale; ED, emergency department; HAIS, Abbreviated Injury Score of the head region.
The alternative predictive model significantly increased the explained variation measured by Nagelkerke's R2, compared with the reference predictive model: 43% versus 37%. The deviance of the alternative predictive model was statistically significantly improved, compared with the deviance of the reference predictive model (LR Chi2 50.3, p < 0.0001).
Model discrimination
Both models had good discrimination for the prediction of death at 14 days, with AUROCs ≥0.800 (Fig. 2). The alternative predictive model had an improved predictive validity for 14-day mortality, compared with the reference predictive model (0.852, 95% CI 0.824–0.880 vs. 0.826, 95% CI 0.795–0.857; p < 0.0001; Table 3).

Fig. 2. Accuracy of discrimination (area under the receiver operating characteristic curve [AUROC]) for the alternative predictive model (APM) and the reference predictive model (RPM).
CI, confidence interval; AUROC, area under the receiver operating characteristic curve.
Calibration of models
The graphical concordance between predicted and observed outcome over the entire risk spectrum was high for the alternative predictive model; a trend toward a lower graphical concordance can be suspected for the reference predictive model (Fig. 3A, 3B). However, this concordance assessed by the Hosmer-Lemeshow test was similar for both models (Table 3).

Fig. 3A. Calibration of the alternative predictive model (APM).

Fig. 3B. Calibration of the reference predictive model (RPM).
Internal validation of the alternative predictive model
The optimism-corrected value of AUROC for the alternative predictive model was 0.845, which is close to the initial AUROC for the alternative predictive model (0.852). Therefore, a low risk of overfitting is expected for an external population with regard to the accuracy of discrimination.
Discussion
This large epidemiological cohort multi-center study observed that the alternative predictive model, with the addition of an anatomical or structural scoring system of trauma (HAIS and ISS), compared with the reference predictive model (IMPACT core model), had a statistically higher predictive validity.
Interpretation
We hypothesized that the reference model including motor GCS, age, and pupillary reactivity at ED admission had a lower predictive performance for short-term mortality related to the limited context validity of the GCS for brain injury and the absence of trauma assessment in other body regions.
The limited context validity of the GCS, including the motor GCS, for brain injury may be related to several external factors: alcohol, sedation, cognitive limitations of the (elderly) patient, multiple injuries, and limited training of health care providers in the use of the GCS. 46,47 The level of consciousness measured with the GCS is easy to identify if the clinical presentation is extreme (e.g., profound coma or full consciousness); however, the interrater variability is large for intermediate scores. 46 –48 Therefore, the addition of HAIS, which includes values from the cerebral CT scan and clinical assessment, may improve predictive performance. This last assumption is in line with the findings of a prediction model developed in Nijmegen. 24
About one-third of the patients with TBI also had the criteria for a multiple trauma. 5 Therefore, mortality may not only be related to TBI but also may be related to trauma in other body regions. 49
In our study, multiple trauma was an independent risk factor. The exact relevance of the risk factor “extracranial injury” for outcome is controversial in the scientific literature. For instance, in one detailed analysis of the CRASH data, extracranial injury was identified as a highly relevant risk factor 50 ; however, in a data meta-analysis, major extracranial injury was less relevant in severe TBI for mortality. 36 Potentially, these differences may be related to three collinear factors: a) the time of inclusion in an investigation; b) the extracranial injury severity required for inclusion; and c) the inclusion of hypotensive events (often related to major extracranial injury). Early inclusion and a high degree of extracranial injury severity as inclusion criteria without considering hypotension may increase the relevance of “extracranial injury” as an independent predictor of mortality.
Comparison with existing studies predicting short-term mortality after TBI
In a single-center retrospective study with about 400 patients, four independent predictors for poor outcome measured with the Glasgow Outcome Scale at hospital discharge (GOS 1–2) were identified using a binary logistic regression analysis: age, GCS, ISS, and high HAIS (≥5). 51 However, this study did not include a predictive model to estimate model performance and, further, length of hospital stay was not mentioned. Despite these limitations, predictors similar to those of the present study, except pupillary reactivity to light (which was not tested), were observed.
In a trauma register analysis with 7764 patients, four independent predictors were associated with in-hospital death using a stepwise logistic regression analysis: age, GCS, HAIS, and type of injury (penetrating vs. blunt). 13 ISS and pupillary reactivity were not included in this investigation and model performance was assessed in a secondary publication. However, accuracy of discrimination (AUROC) was not reported. 32
In a large prospective cohort of eight level I trauma centers in the United States (TBI Clinical Trial Network) with 2808 patients, the investigators did not observe any added benefit of combining GCS or the motor subscale of GCS with HAIS for the prediction of 2-week cumulative mortality, although GCS or motor subscale and HAIS were independent predictors of mortality after adjustment for age, hypotension, and center differences. 35 However, pupillary reactivity and ISS were not included in this prospective investigation and a formal testing of model performance was not presented.
Strengths
This study has some strengths. First, this investigation was performed on prospectively collected data for the purpose of this outcome. 5 Second, this study was a national multi-center study including a considerable sample size with a heterogeneous population. Third, the study has included a high number of elderly patients, an increasing population in this setting extending the validity over a broad range of age groups. Fourth, the study included a score system of trauma that is used in many trauma centers for accounting and quality control, and can therefore be included without any difficulty. 52 Fifth, the study had simple times for identification of predictors and for outcome (mortality at 14 days), which increase the robustness of the model. Sixth, the alternative predictive model has a low risk of overfitting confirming the robustness of the model.
Limitations
This investigation has several limitations. First, the study included only patients with a HAIS >3; therefore, this risk score model cannot be used for patients with HAIS ≤3. However, mortality is lower in patients with mild TBI. 53
Second, the study focused on simple predictors that clinicians could use in daily practice in the first 24 h after arrival in trauma center, at least in high income countries with available CT scans. The predictive validity would be improved with results of blood analyses during ED stay. 48
Third, the study defined death at 14 days as outcome. This outcome is easy to measure and has a high content validity. However, the alternative predictive model cannot be used for long-term outcome or for other outcomes.
Fourth, the time of the risk factor assessments was slightly different between these two models. The impact of this time difference on the outcome “mortality at 14 days” is probably of low relevance.
Fifth, the clinical relevance of the difference between the alternative predictive model and the reference model may be considered as debatable; however, it is a further step to improved early prediction in severe TBI patients.
Future implications
The use of prognostic models in patients with severe TBI is essential due to its high mortality risk. It is important that the patient, their proxies, and health care providers are informed accurately and objectively on the risk of early mortality. Communication, including results of prognostic models, may improve clinical decision-making, management and realistic expectations for the near future.
Our alternative predictive model should be validated in other cohorts before implementation in clinical practice.
Conclusion
Among patients with severe TBI, the predictive validity for death at 14 days of the alternative predictive model, which additionally includes the trauma scoring system HAIS/ISS, was higher than that of the reference predictive model including motor GCS, pupil reaction to light, and age. Before this alternative predictive model can be used in clinical practice, a validation study is needed to show generalizability.
Footnotes
Acknowledgments
Funding for this study was in part provided by the Swiss Accident Insurance Fund. We thank Briget Benn for the English editing of the manuscript. We are grateful to all collaborators of the 10 participating trauma centers who made this investigation possible.
Author Disclosure Statement
No competing financial interests exist.
References
