Abstract
The Glasgow Coma Scale (GCS) score is used in clinical practice for patient assessment and communication among clinicians, and also in outcome prediction models such as the Trauma and Injury Severity Score (TRISS). The objective of this study was to determine which GCS subscore is best associated with outcome, taking time of assessment into account. Records of patients with brain injury who presented after 1989 were extracted from the Trauma Audit and Research Network (TARN) database. Using logistic regression, a baseline model was derived with age, Injury Severity Score (ISS), and year of injury as covariates and survival at discharge as the dependent variable. Total GCS, its subscores, and their combinations at various time points were separately added to the baseline model to compare their effect on model performance. The dataset contained 21,657 cases. The total GCS score at scene and its subscores had significantly lower predictive power than those recorded on arrival at the Emergency Department (ED) (scene total GCS: Area Under the Curve [AUC] of 0.89; 95% confidence interval [CI]: 0.89–0.90, and Nagelkerke R² of 0.55; admission total GCS: AUC of 0.91; 95% CI: 0.91–0.92, and Nagelkerke R² of 0.59). Eye and verbal subscores performed significantly worse than total GCS, motor subscore, and various combinations of subscores. Motor subscore and total GCS appeared to have similar predictive performance (admission total and motor GCS both had an AUC of 0.91 [95% CI: 0.91–0.92] and Nagelkerke R² of 0.59 and 0.58, respectively). Motor subscore contains most of the predictive power of the total score. GCS on arrival is a significantly better predictor of outcome than that recorded at scene.
Introduction
There are some disadvantages with the measurement of GCS. It is not straightforward to learn, 5 and at times it might be impossible to measure quickly, which is an issue in emergency situations. Further, the interrater reliability of total GCS is only moderate; Gill and associates 6 observed a 68% probability that two GCS scores measured by two observers at the same time will differ in at least one subscore. It is also well known that the verbal subscore cannot be assessed reliably in sedated or intubated patients. The same applies to the motor and eye subscores when neuromuscular blockade is used. When GCS is used in prognostic models by trauma registries, it is as yet unclear which time point of GCS assessment holds better prognostic value. GCS is usually measured both at scene (where the injury is incurred) and on admission to the Emergency Department (ED), and trauma registries often hold the record at both time points.
Healey and colleagues 7 reported that the motor subscore contains most of the predictive strength of GCS through a careful statistical analysis of a North American dataset of general trauma patients. Ross and coworkers 8 reported that the motor subscore may have similar diagnostic characteristics to total GCS for identifying severe structural brain damage. Both studies, however, were performed on North American trauma population samples. This is relevant because U.S. trauma patients and care systems differ significantly from those of Europe (including the United Kingdom [UK]), and both aspects influence patient outcome and the relationship between GCS and outcome.
The differences mainly relate to the patterns of either injury or care. Most notably, penetrating injuries occur far less commonly in Europe than in the United States. Penetrating injuries have a different pathophysiology and management from those of non-penetrating injuries. Regarding the care pattern, British trauma care has evolved over the years. Initially, trauma care was only a part of the emergency care system, in which all trauma patients were transferred to the nearest hospital, irrespective of the specialties available on site. This was a major problem in cases of multiple trauma where the receiving hospital did not have neurosurgical care. As such, the rate of secondary transfer for neurocare was significantly higher than in other countries, where trauma victims arrived at a trauma center with all surgical specialties available. 9 This would significantly compromise the outcome of Traumatic Brain Injury (TBI) victims. 10 Overall, the performance of trauma care in the UK has been the subject of criticism compared with other international systems, including those in the United States. Only recently, the Royal College of Surgeons of England introduced guidelines for the improvement of the trauma care system in the UK. 11
The objectives of this study were twofold: to analyze the prognostic power of various GCS subscores in patients with TBI under the British trauma care system taking other important prognosticators into account and to investigate which time point of GCS measurement (at scene versus on admission to ED) has more prognostic strength.
Methods
A subset of patients with TBI whose records were held by the Trauma Audit and Research Network (TARN) was studied. TARN is a non-profit, self-funded trauma registry that is part of the University of Manchester and is based at the Salford Royal Hospital, UK. It is currently the largest trauma registry in Europe and receives data on trauma patients from across England and Wales and increasingly from elsewhere in Europe (currently Dublin, Waterford (Eire), Copenhagen, and Bern). Subscription to TARN was initially voluntary; the Royal College of Surgeons of England recommended that all English hospitals receiving trauma patients (i.e., those with an ED) submit their data to TARN, and submission is now mandated by the Department of Health. The TARN registry started with only 13 participating hospitals in 1989 and has attracted more hospitals since; by 2008, more than 60% of hospitals across England and Wales submitted their data to TARN.
The TARN inclusion criteria are that the injured patient reaches the hospital alive and meets at least one of the following: (1) a hospital stay of more than 3 days, (2) care in the intensive care unit, (3) interhospital transfer, or (4) death at any time in hospital. The information is extracted from patients' medical notes or other available electronic sources by the data collector(s) at the participating hospital. Subsequently, TARN staff members code each injury using the Abbreviated Injury Scale (AIS). 12 The inclusion criterion for this study was a brain injury with an AIS severity score of 3 or above in a TARN patient. Patients with head injuries of AIS score 1 or 2 were excluded, because these scores refer to mild head injuries, such as simple or unspecified skull fractures.
For multivariate analysis, logistic regression was used. Logistic regression requires a linear relationship between each continuous variable and the log odds of the outcome of interest, 13 and fractional polynomial transformation was used to meet this requirement. 14 In this method, each continuous variable is transformed into one or more other variables, referred to as the “functional form” of the original variable. The transformation is a power transformation, and the candidate powers are −3, −2, −1, −0.5, 0, 0.5, 1, 2, and 3, where 0 denotes the natural logarithm (log_e) and 1 denotes no transformation (linear). The best transformation is the candidate power (or powers) that yields a significant improvement in goodness of fit (referred to as “gain”) over the model that holds no transformation. Table 1 presents the power transformations used for each variable in the study.
ISS, Injury Severity Score; GCS, Glasgow Coma Scale.
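The power transformation described above can be illustrated with a minimal sketch. It only shows how one value maps onto each candidate power, with power 0 read as the natural logarithm; the function names are hypothetical, and the selection step (comparing the deviance of each candidate fit against the untransformed model) is not reproduced here. The study itself ran this step in Stata.

```python
import math

# Candidate powers as listed in the Methods; 0 denotes log_e, 1 denotes linear.
FP_POWERS = (-3, -2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, power):
    """Return the fractional polynomial term of x for a single power.

    x must be strictly positive; power 0 is read as log_e(x) by convention.
    """
    if x <= 0:
        raise ValueError("fractional polynomials need x > 0")
    return math.log(x) if power == 0 else x ** power


def candidate_terms(x):
    """Map every candidate power to the transformed value of x."""
    return {p: fp_transform(x, p) for p in FP_POWERS}


# Illustrative value only, not taken from the dataset.
terms = candidate_terms(4.0)
```

In a full analysis each candidate term would replace the raw covariate in the logistic model, and the power with the largest significant gain in fit would be kept.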
Age, Injury Severity Score (ISS), and the year in which the TBI occurred were the covariates for which the prognostic strength of GCS was adjusted. Year of injury was selected as a confounder because TARN holds data from 1988, when TBI management was far less advanced than in later years. Using logistic regression, a baseline model was derived with age, ISS, and year as predictors and discharge outcome (survival) as the dependent variable. Total GCS, its subscores, and their combinations were added separately to the baseline model to assess their effect on model performance. The combinations of GCS subscores were the sums of motor and eye subscores, motor and verbal subscores, and eye and verbal subscores.
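The modeling plan above (a baseline model, then one GCS term added at a time for each time point) can be enumerated with a short sketch. The column names such as `scene_eye` and the label format are placeholders chosen for illustration, not field names from TARN, and no model fitting is shown.

```python
from itertools import combinations

SUBSCORES = ("eye", "verbal", "motor")
TIME_POINTS = ("scene", "admission")
BASELINE = ("age", "iss", "year")  # placeholder covariate names


def gcs_predictors():
    """Seven GCS terms: total, three single subscores, three pairwise sums."""
    singles = [(s,) for s in SUBSCORES]
    pairs = list(combinations(SUBSCORES, 2))
    return [("total",)] + singles + pairs


def model_specs():
    """Return (label, covariates, gcs_columns) for the baseline and each GCS model."""
    specs = [("baseline", BASELINE, ())]
    for when in TIME_POINTS:
        for parts in gcs_predictors():
            label = f"{when} " + "+".join(parts)
            columns = tuple(f"{when}_{p}" for p in parts)
            specs.append((label, BASELINE, columns))
    return specs


def gcs_value(record, columns):
    """Value of the added GCS term: a single score or the sum of two subscores."""
    return sum(record[c] for c in columns)


specs = model_specs()
```

Counting the entries gives one baseline model plus seven GCS terms at each of the two time points.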
Overall, 15 models were constructed (one baseline model, seven models with admission total GCS, subscores, or combinations of subscores, and seven models with scene total GCS, subscores, or combinations of subscores). Area Under the Curve (AUC), classification accuracy, Nagelkerke R², and the p value of the Hosmer-Lemeshow (HL) statistic were taken as measures of the performance of each model. Regarding missing information, a missing total GCS score was imputed with the sum of its subscores where these were available in the dataset. Similarly, if total GCS was recorded as 15 and one or more subscores were missing, then the missing subscore(s) were imputed with the full score. No other imputation strategies were implemented.
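The two imputation rules can be written out as a small sketch. It assumes that `None` marks a missing value and that a missing total is filled only when all three subscores are present; the function name and argument order are illustrative, not part of the original analysis.

```python
# Maximum value of each GCS subscore (eye 4, verbal 5, motor 6; total 15).
FULL_SCORES = {"eye": 4, "verbal": 5, "motor": 6}


def impute_gcs(total, eye, verbal, motor):
    """Apply the two imputation rules; None marks a missing value.

    Returns (total, eye, verbal, motor); anything the rules cannot fill stays None.
    """
    subs = {"eye": eye, "verbal": verbal, "motor": motor}

    # Rule 1: a missing total becomes the sum of the subscores
    # (assumption: only when all three subscores are recorded).
    if total is None and all(v is not None for v in subs.values()):
        total = sum(subs.values())

    # Rule 2: a recorded total of 15 implies every subscore is at its maximum.
    if total == 15:
        subs = {k: FULL_SCORES[k] if v is None else v for k, v in subs.items()}

    return total, subs["eye"], subs["verbal"], subs["motor"]
```

For example, `impute_gcs(None, 3, 4, 5)` fills the total as 12, while `impute_gcs(9, None, 2, 5)` is returned unchanged because neither rule applies.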
This resulted in a variable amount of missing information in each model on total GCS, single subscores, or their combinations, which made the comparison between models less reliable. To address this problem, the models were run and compared again on a separate dataset that contained no missing values for either total GCS or the subscores at scene or on admission (first sensitivity analysis). A second sensitivity analysis was performed only on records of patients who sustained their injuries between 1998 and 2008. The years after 1998 were assumed to be the period when modern advancements in the diagnosis and management of TBI were introduced.
While the fractional polynomials transformations were performed in Stata software, logistic regression was run in Statistical Package for the Social Sciences (SPSS).
Results
Using the inclusion criteria, a dataset of 21,657 TBI cases was extracted, containing all brain injury records in TARN from January 1988 to April 2008 (Table 2). The median age was 34.4 years (interquartile range: 20–57), and 73.3% of the population were male. The median ISS was 24. The median total GCS was 9 at scene and was higher on admission, at 11. Sixty-nine percent of patients survived to discharge. The amount of missing information varied across the GCS subscores and across the two time points of measurement (at scene or on admission). Overall, there were more missing GCS scores at scene than on admission.
Adding total GCS, its subscores, or the various combinations of the subscores resulted in a significant decrease in the deviance of the baseline model in every case. Also, in each model the effect on outcome of every covariate included in the model was significant (p < 0.05).
Table 3 presents the performance of each constructed model in terms of AUC, Nagelkerke R², classification accuracy, and HL statistic. The baseline model is the model that contains only age, ISS, and the year of injury. Other models are named according to the GCS, subscore, or combination of subscores added to the baseline model. The performance of the baseline model improved after addition of GCS, subscores, or combinations of subscores according to AUC, classification accuracy, and Nagelkerke R². In addition, any subscore (apart from the eye subscore) or combination of subscores had broadly the same predictive strength as judged by AUC, classification accuracy, and Nagelkerke R². Comparing the admission and scene scores, each model containing admission scores outperformed its counterpart model with scene scores in all three measures, and the AUC differences were statistically significant. For example, the AUC of the admission “total GCS” model was significantly higher than that of the scene “total GCS” model (confidence intervals: 0.909–0.918 versus 0.888–0.899, respectively).
N: number of cases included in the modeling.
AUC, Area Under the Curve; HL, Hosmer Lemeshow; GCS, Glasgow Coma Scale.
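Three of the performance measures reported in Table 3 can be computed from observed outcomes and predicted probabilities, as in the minimal sketch below. The function names, the 0.5 classification cutoff, and the toy numbers are assumptions for illustration only and are not the study data; the Hosmer-Lemeshow statistic is left out, and the published figures were produced in SPSS.

```python
import math


def auc(y_true, p_hat):
    """AUC as the chance that a random survivor is ranked above a random non-survivor."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    # Ties between a survivor and a non-survivor count as half a win.
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def log_likelihood(y_true, p_hat):
    """Bernoulli log-likelihood of the outcomes under the predicted probabilities."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(y_true, p_hat))


def nagelkerke_r2(y_true, p_hat):
    """Cox-Snell R2 rescaled by its maximum, relative to an intercept-only model."""
    n = len(y_true)
    base_rate = sum(y_true) / n
    ll_null = log_likelihood(y_true, [base_rate] * n)
    ll_model = log_likelihood(y_true, p_hat)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    return cox_snell / (1.0 - math.exp(2.0 * ll_null / n))


def accuracy(y_true, p_hat, cutoff=0.5):
    """Share of cases classified correctly at the given probability cutoff."""
    hits = sum((p >= cutoff) == (y == 1) for y, p in zip(y_true, p_hat))
    return hits / len(y_true)


# Toy example: 1 = survived, 0 = died; probabilities are made up.
y = [1, 1, 1, 0, 0, 1]
p = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4]
toy_auc, toy_r2, toy_acc = auc(y, p), nagelkerke_r2(y, p), accuracy(y, p)
```

Comparing these measures between a model with a scene score and its counterpart with an admission score mirrors the paired comparison in the table.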
Sensitivity analysis
Table 4 presents the results of the first sensitivity analysis. This analysis was performed after excluding cases with at least one missing subscore at either scene or admission. As such, all compared models were derived from the same number of cases (9352). As seen in this table, the subscores and their combinations, apart from the eye subscore, still hold the same predictive strength as total GCS. Comparing the admission and scene scores, admission scores still appear to be stronger predictors of outcome, although, unlike in Table 3, the confidence intervals of the scene and admission AUCs slightly overlap for the motor and eye subscores. We performed the Hanley and McNeil test 16 to explore whether the differences remain significant despite this slight overlap. The test gave the following p values for the difference between the scene and admission AUCs: 0.0031 (for motor subscore) and 0.05 (for eye subscore).
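A comparison of two AUCs in the spirit of Hanley and McNeil can be sketched as follows. This is an illustration under stated assumptions rather than the authors' exact procedure: it uses the Hanley and McNeil (1982) approximation for the standard error of an AUC, and the correlation `r` between the two AUC estimates (relevant when both are derived from the same cases) is passed in as a parameter that the caller must supply. The function names and the input numbers are hypothetical.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC using the Hanley and McNeil (1982) approximation."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (auc * (1.0 - auc)
                + (n_pos - 1) * (q1 - auc ** 2)
                + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return math.sqrt(variance)


def compare_aucs(auc_a, auc_b, n_pos, n_neg, r=0.0):
    """Two-sided z test for the difference between two AUCs.

    r is the assumed correlation between the two estimates; r = 0 treats
    them as independent, which is conservative for paired data.
    """
    se_a = hanley_mcneil_se(auc_a, n_pos, n_neg)
    se_b = hanley_mcneil_se(auc_b, n_pos, n_neg)
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2 - 2.0 * r * se_a * se_b)
    z = (auc_a - auc_b) / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return z, p_value


# Hypothetical inputs, not values from Table 4.
z, p_value = compare_aucs(0.91, 0.89, n_pos=600, n_neg=300, r=0.5)
```

A larger assumed correlation shrinks the standard error of the difference, which is why two AUCs with slightly overlapping confidence intervals can still differ significantly.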
Table 5 presents the second sensitivity analysis in which only submissions from 1998 until 2008 were analyzed. The baseline model in this analysis only includes age and ISS and excludes the submission year. As seen, the results are comparable to those of the main analysis and the first sensitivity analysis.
The baseline model in this analysis does not contain year as a confounder (N: number of cases included in the modeling).
Discussion
In this study, we have compared the prognostic power of total GCS, its subscores, and various combinations of its subscores through multivariate analysis of a large British dataset of TBI cases. A baseline model was constructed with age, ISS, and year of presentation. Subsequently, the improvement in the model performance was investigated after addition of GCS, subscores, or combinations of subscores. It appears that motor or verbal subscores or any combination of subscores, including eye, hold the same prognostic strength as total GCS. Similarly, the predictive strength of total GCS, subscores, or their combinations is better for admission scores than scene scores.
We acknowledge a number of limitations. First, this analysis used an existing dataset of TBI patients retrieved from TARN. Therefore, it is unclear how local protocols handled GCS collection when the condition of the patient, such as intubation or paralysis, did not permit measurement of one subscore. In such cases, the unmeasurable subscore might have been assigned the lowest score or regarded as missing. Second, the predictive value of GCS was adjusted only for age, ISS, and year. Pupillary reactivity is also an important predictor in TBI but was not accounted for in this analysis, because recording of pupillary reactivity has only recently commenced for TBI submissions in TARN. Had pupillary reactivity been included in the modeling procedure, the dataset would have been significantly smaller, yielding less reliable results.
With regard to the similar predictive strength of motor and total GCS, the results of our study are consistent with the findings of Healey and colleagues. 7 In our study, however, GCS is adjusted for other TBI predictors, i.e., age, ISS, and year. Moreover, Healey and coworkers 7 performed their analysis on general trauma patients with no exclusion of intoxication or shock, which can affect the level of consciousness, whereas we performed our analysis on a TBI population in which all patients had sustained a brain injury documented by AIS codes. Perhaps this also explains why 90% of the population sample in the study by Healey and associates 7 had a GCS score of 15, whereas our dataset contains more varied GCS scores (e.g., median admission total GCS: 11, interquartile range: 6–15).
We cannot explain why admission GCS scores have more predictive strength than scene scores. It might be because of the effects of alcohol or other drugs, which are diminished by the time the patient arrives at the hospital. GCS on admission might be more representative of the true level of consciousness caused by the injury per se. 17 Also, it might highlight inaccurate recording of GCS at scene, which might be because of environmental difficulties or skill level of attending personnel. Whatever the reason for the difference in scene and admission GCS predictability, however, it might have pragmatic implications on clinical decisions for management and therapeutic interventions based on GCS. We suggest that GCS on admission should be taken into account rather than GCS at scene.
We observed that a model that contains total GCS holds similar prognostic performance to a model that contains only the motor or verbal subscore or any combination of these. This suggests that measuring one or maximally two subscores may suffice. On the one hand, omitting one subscore might result in an improvement in the overall interrater reliability (between personnel with the same level of experience/skill), which is poor for total GCS but better for the subscores. 6 Also, measuring fewer subscores would be easier to teach, learn, and implement than total GCS because the error rate of eye and verbal GCS is high among unskilled observers compared with skilled ones. 5 On the other hand, although the GCS scale is designed to measure the depth of unconsciousness, which also relates to the outcome, in practice it is not only used for the purpose of prognosis. The results of our study demonstrate a possible similar prognostic strength for total and motor GCS, but this cannot be generalized reliably to other applications of GCS, such as day-to-day monitoring of patient alertness (as occurs in the intensive care units) or a clinical decision on intervention. Further, total GCS with respect to its descriptive capability holds more information content compared with a subscore or a combination of two subscores.
In fact, each total GCS score can be the sum of various combinations of subscores, and each combination might have a significantly different mortality rate. Therefore, we can assume that for each total GCS with a given motor subscore, changes in the eye and verbal subscores would result in different mortality rates. If verbal and eye subscores are not measured, then the influence of eye and verbal response on outcome within the group of patients with the same motor subscore is ignored. The added value of the eye and verbal subscores is considered to lie mainly in trauma patients with more moderate degrees of injury. It may be that the advantages of measuring the motor subscore alone, namely that it is simpler and perhaps more reliable, do not outweigh its disadvantages. Had the motor subscore significantly outperformed the total GCS in outcome prediction, omission of eye and verbal subscores might then have been suggested in clinical practice.
The similar prognostic value of total GCS to motor and verbal subscores or any combinations of them, however, is reassuring when missing information is a problem. In such situations, the analysis may be safely performed on only one subscore or a combination of two subscores when the proportion of missing total GCS scores is higher. This may also apply in the clinical situation.
It is important that the results of our study be validated in a different set of TBI cases from a different setting (country) and for a different type and time point of outcome. The quality of trauma care is one factor that affects the outcome and as such has a confounding role. Regarding the outcome, survival to discharge is not the only goal of TBI care, because recovery of function as close as possible to the pre-injury level is also important. For example, the Glasgow Outcome Scale (GOS) is a well-known tool for assessing the level of function after TBI that has been used in many prognostic studies. 18–20 Unfortunately, only 5% of recorded TBI patients in TARN during our study period had GOS available. Because this information was missing for the vast majority of cases, the analysis was performed only for survival prediction. It is important that the prognostic value of GCS (total, subscores, or combinations of subscores) be validated for GOS as well. To the best knowledge of the authors, this validation of GCS subscores and their various combinations is still lacking in the literature.
Conclusion
In a large population of TBI patients whose injuries were managed within England and Wales over the last 20 years, the total GCS, motor or verbal subscores, or any combinations of the subscores may have similar predictive strength. With regard to admission and scene GCS scores, admission scores significantly outperform scene scores for outcome prediction.
Footnotes
Acknowledgment
We would like to thank TARN members of staff and participating hospitals for the collection and submission of the data. This work was in part funded by the Trauma Audit and Research Network (TARN) and Overseas Research Students (ORS) Award Scheme, University of Manchester.
Author Disclosure Statement
No competing financial interests exist.
