Abstract
The prediction of functional outcome after mild traumatic brain injury (mTBI) is challenging. Conventional magnetic resonance imaging (MRI) does not do a good job of explaining the variance in outcome, as many patients with incomplete recovery will have normal-appearing clinical neuroimaging. More advanced quantitative techniques such as diffusion MRI (dMRI), can detect microstructural changes not otherwise visible, and so may offer a way to improve outcome prediction. In this study, we explore the potential of linear support vector classifiers (linearSVCs) to identify dMRI biomarkers that can predict recovery after mTBI. Simultaneously, the harmonization of fractional anisotropy (FA) and mean diffusivity (MD) via ComBat was evaluated and compared for the classification performances of the linearSVCs. We included dMRI scans of 179 mTBI patients and 85 controls from the Collaborative European NeuroTrauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI), a multi-center prospective cohort study, up to 21 days post-injury. Patients were dichotomized according to their Extended Glasgow Outcome Scale (GOSE) scores at 6 months into complete (n = 92; GOSE = 8) and incomplete (n = 87; GOSE <8) recovery. FA and MD maps were registered to a common space and harmonized via the ComBat algorithm. LinearSVCs were applied to distinguish: (1) mTBI patients from controls and (2) mTBI patients with complete from those with incomplete recovery. The linearSVCs were trained on (1) age and sex only, (2) non-harmonized, (3) two-category-harmonized ComBat, and (4) three-category-harmonized ComBat FA and MD images combined with age and sex. White matter FA and MD voxels and regions of interest (ROIs) within the John Hopkins University (JHU) atlas were examined. Recursive feature elimination was used to identify the 10% most discriminative voxels or the 10 most discriminative ROIs for each implementation. mTBI patients displayed significantly higher MD and lower FA values than controls for the discriminative voxels and ROIs. For the analysis between mTBI patients and controls, the three-category-harmonized ComBat FA and MD voxel-wise linearSVC provided significantly higher classification scores (81.4% accuracy, 93.3% sensitivity, 80.3% F1-score, and 0.88 area under the curve [AUC], p < 0.05) compared with the classification based on age and sex only and the ROI approaches (accuracies: 59.8% and 64.8%, respectively). Similar to the analysis between mTBI patients and controls, the three-category-harmonized ComBat FA and MD maps voxelwise approach yields statistically significant prediction scores between mTBI patients with complete and those with incomplete recovery (71.8% specificity, 66.2% F1-score and 0.71 AUC, p < 0.05), which provided a modest increase in the classification score (accuracy: 66.4%) compared with the classification based on age and sex only and ROI-wise approaches (accuracy: 61.4% and 64.7%, respectively). This study showed that ComBat harmonized FA and MD may provide additional information for diagnosis and prognosis of mTBI in a multi-modal machine learning approach. These findings demonstrate that dMRI may assist in the early detection of patients at risk of incomplete recovery from mTBI.
Introduction
Traumatic brain injury (TBI) is a global public health problem, with nearly 70,000,000 TBI cases worldwide every year, and over 2,500,000 cases in Europe. 1 The majority of TBIs (> 85%) may be classified as mild. The term “mild” underestimates the effect of this brain injury, with up to 50% of the patients having persisting symptoms, which may include cognitive, psychological, and somatic problems that can last for months to years after injury. 2 –4 The development of accurate, data-driven methods for classifying and characterizing (image-based brain) neuroimaging features of mild TBI (mTBI) could help identify those at risk of developing deficits that affect functional outcome, stratify for early treatment, and design future clinical trials.
The presence and type of lesions visualized on conventional computed tomography (CT) and structural magnetic resonance imaging (MRI) scans do not explain all variance in outcome observed after mTBI. 5 –8 Indeed scans may appear to be normal even in patients with persisting symptoms. The more advanced diffusion MRI (dMRI) has been shown to better detect subtle abnormalities associated with mTBI, because dMRI is susceptible to microstructural changes in particularly in brain white matter (WM). 5,9 –11 Detecting these WM alterations may offer the potential to improve outcome prediction for mTBI patients.
Fractional anisotropy (FA) and mean diffusivity (MD) are the most commonly used diffusion parameters, representing the fraction of the total diffusion attributable to anisotropic diffusion and the overall extent of diffusivity along the direction of WM tracts, respectively. Abnormalities in FA and MD in mTBI patients with incomplete recovery, compared to patients with complete recovery and controls have been reported to be primarily located in the corpus callosum, right anterior thalamic radiations, superior longitudinal fasciculus, and inferior longitudinal, fronto-occipital fasciculi and cerebellum. 12 –15 Some mTBI studies have reported that abnormalities in diffusion parameters are associated with clinically relevant outcomes, including cognitive and functional impairment. 5,16,17
Many studies on recovery from mTBI rely on small sample sizes, leading to low reproducibility of results. Nonetheless, research is evolving toward large multi-center studies with the aim of increasing statistical power and generalizability of results. The success of a joint analysis is highly dependent on the implementation of harmonization procedures to increase the comparability of the multi-scanner data. 18 It has been shown that the variability in diffusion metrics in the corpus callosum among controls, mTBI, and moderate TBI patients, are of the same order of magnitude as intra-scanner changes. 19 Therefore, it is crucial to reduce the variability of diffusion data across multiple scanners before classification implementations. Hence, there is a substantial need for robust harmonization techniques. 20,21 The overall concept of harmonization is to apply mathematical concepts to reduce unwanted site variability while maintaining the biological information to further evaluate outcome and recovery of mTBI patients from their imaging data.
Many prediction models have been developed to investigate and anticipate the outcome of patients who have sustained an mTBI, predominantly using regression techniques. 22 –24 However, none of these models has both good discrimination and prediction and/or are not yet suitable for use in clinical practice. 23 In recent years, there has been a shift in clinical prediction models toward the study of dynamic models, whereas in the case of imaging prediction models, because of the static nature of imaging, the exploration of new indices using machine learning (ML) techniques offers a new promising diagnostic path to improve outcome prognostication. 25,26 ML methods implemented with multiple dMRI-derived indices may help to infer the pathophysiological features of WM changes and provide more specific biomarkers for WM neuropathology in mTBI patients.
Support vector machines (SVMs) are popular ML models that have demonstrated considerable potential for robustly identifying neuroimaging biomarkers of neurological dysfunctions and diseases. 27 –29 SVMs can achieve reliable performance by determining a hyper-plane that divides the samples into two groups. Compared with conventional regression-based methods, ML-based approaches are able to reveal subtle and complex patterns that can be used to classify various psychiatric and neurodegenerative disorders, such as schizophrenia, 28 dementia, 30 and Parkinson's disease. 31 In particular, support vector classifiers (SVCs) have been applied to non-harmonized dMRI and resting-state functional connectivity data to detect mTBI, identifying biomarkers to characterize mTBI and suggesting that ML can reveal underlying mTBI-related neuropathology. 32 –35 Nonetheless, harmonization has been an important step that has been shown to remove unwanted scanner biases and improve the image analysis, which is recommended in multi-site investigations. 18
The aim of the study was to explore whether linearSVCs can aid in classification of mTBI patients and predict recovery. Simultaneously, the effect of ComBat harmonization was evaluated on the performance of linearSVCs. Two key analyses were performed. First, an analysis to investigate whether the use of linearSVCs allows for distinguishing between mTBI patients and controls; and second whether linearSVCs trained on ComBat harmonized dMRI metrics obtained within 21 days of injury are able to predict the clinical outcome of mTBI patients at 6 months.
Methods
Participants
All eligible subjects were included from the prospective observational study Collaborative European NeuroTrauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI) (December 19, 2014, to December 17, 2017;
Inclusion criteria for this analysis were being ≥16 years of age who had sustained an mTBI (defined as Glasgow Coma Score [GCS] on presentation of 13–15), requiring a head CT according to local criteria on initial presentation, and having had an MRI within 21 days of the injury. Outcome at 6 months was measured using the Extended Glasgow Outcome Scale (GOSE), a classification based on function, independence, and participation. 38 The preferred method for assessment was by interview in person. However, to maximize the follow-up rate, postal and Web-based questionnaires and administration by telephone were also allowed. These different methods for assessment of GOSE were found to have good agreement in the larger CENTER-TBI core data set. 39 Quantitative imaging analysis did not occur until after follow-up had been completed. Hence, raters were blinded to the neuroimaging results. GOSE was used to dichotomize between complete (GOSE = 8) and incomplete (GOSE <8) recovery.
Image acquisition and analysis
The initial cohort consisting of 194 eligible patients (15 scanners, 10 sites) underwent MRI at 3 Tesla with each scanner using similar acquisition protocols. The full scanning protocol details for all sites and scanners can be found at
Sequences were processed using a TBI-specific pipeline. All DTI data were corrected for noise, 41,42 Gibbs ringing artifacts, 43 head motion and eddy current artifacts, 44 and inhomogeneities in the magnetic field. 45 Diffusion tensors were obtained by weighted least squares fitting to derive MD and FA maps using the FMRIB Software Library. 46 After diffusion tensor model fitting, each FA map was spatially normalized to the FA template of the John Hopkins University (JHU ICBM-DTI-81) Atlas 47 via non-linear registration using advanced normalization tools (ANTs). 48,49 The transformations were subsequently used to project corresponding MD maps to the same space.
Image data and pipeline outputs for controls and patients were visually inspected to ensure that no scans with artifacts were included. The diffusion parameter maps (FA and MD) were visually inspected and two mTBI patients and one control were excluded because of blurring in the dMRI parameter maps caused by excessive head motion artifacts. Zero mean normalized cross-correlation (ZNCC) was computed between the JHU FA template and the aligned FA maps to quantify registration accuracy. Subjects were excluded from further analysis if the ZNCC similarity metric fell below a pre-defined threshold (ZNCC <0.8). Therefore, 13 mTBI patients and three controls were excluded based on this criterion. After identifying the eligible diffusion maps, data were retained and considered to reflect true variation or pathology. The final data cohort consisted of 179 patients (15 scanners, 10 sites) and 85 controls (12 scanners, 9 sites; comparable age and sex), with a minimum of two controls and/or mTBI patients per scanner, maximizing the number of subjects for the present study. Patients were dichotomized according to their GOSE scores at 6 months into those with complete (n = 92; GOSE = 8) and those with incomplete (n = 87; GOSE <8) recovery. A flow chart of patients can be seen in Supplementary Figure S1.
WM voxels of the spatially normalized FA and MD maps were harmonized on a voxelwise level using ComBat. 50 ComBat facilitates adjustment of quantitative diffusion maps while taking into account the effect of possible confounding factors. In this study, age, sex, and disease status served as biological covariates while fitting the harmonization model, which should be protected during the removal of scanner/site effects. FA and MD maps were ComBat harmonized in two ways: (1) using two categories for disease status (controls and mTBI patients), and (2) using three categories for disease status (controls, mTBI patients with complete recovery, and mTBI patients with incomplete recovery). The categories were chosen to identify the optimal use of ComBat harmonization to address the overall aims of the study.
Machine learning
LinearSVCs were applied to differentiate mTBI patients from controls and mTBI patients with complete from those with incomplete recovery using the dMRI maps, sex, and age. Sex and age were included as they have previously been shown to affect both the quantitative metrics obtained with DTI (FA and MD), as well as outcome after mild TBI. 51 –54 The linearSVCs were either trained on the voxel- or ROI-wise FA and MD images (1) non-harmonized, (2) ComBat harmonized with two categories, and (3) ComBat harmonized with three categories combined with age and sex. FA and MD maps were examined on a voxel- and ROI-wise level within the areas of the JHU ICBM-DTI-81 Atlas (Fig. 1). For the voxelwise approach, each FA and MD voxel intensity within the atlas served as individual features. For the ROI-wise approach, FA or MD intensities were averaged within each of the 48 JHU regions, such that we end up with 96 (2 x 48) image derived regional features to train the linearSVC.

Overlap between Montreal Neurological Institute (MNI) brain template and the Johns Hopkins University (JHU) white matter (WM) atlas regions (GCC, genu corpus callosum; BCC, body corpus callosum; SCC, splenium corpus callosum; Fx, fornix; mCP, middle cerebellar peduncle; CP, right and left [R/L] cerebral peduncle; iCP, inferior cerebellar peduncle; sCP, superior cerebellar peduncle; CST, corticospinal tract; ACR, anterior corona radiata; PCR, posterior corona radiata; SLF, superior longitudinal fasciculus; EC, external capsule; pIC, posterior limb of internal capsule; cCg, cingulate gyrus part of the cingulum; hCg, hippocampus part of the cingulum; PTR, posterior thalamic radiation; and Tp, tapetum).
Analyses were conducted with the ‘sklearn’ package of Python (version 3.5.0) 55 with the linear kernel SVC and default regularization (C = 1). To estimate the linearSVCs generalizability, a stratified fourfold cross-validation (CV) was performed (for an overview of the proposed classification framework see Fig. 2). For this, the data set was divided into four subsets, each including all disease categories (e.g., controls and mTBI patients with different recovery). For each fold, three subsets were chosen to train the model, which was then validated on the remaining subset.

Flow chart of the analysis and prediction framework.
Recursive feature elimination (RFE) 56 is an iterative feature selection algorithm, which ranks features (e.g., voxels or ROIs within the JHU Atlas) in a training data set based on their relevance for predicting the target variable (e.g., controls and mTBI patients or mTBI recovery at 6 months post-injury). RFE was used to identify the 10% most discriminative JHU voxels or the ten most discriminative JHU ROIs to classify mTBI separately from controls, and to predict mTBI recovery. For the voxelwise approach, the 10% most discriminative WM voxels within the JHU regions were identified in each of the four folds. Similarly, for the ROI-wise approach, from the 48 JHU regions, the 10 most discriminative ROIs were selected in each of the folds in the cross-validation method. Next, for each fold, the classification scores (accuracy, sensitivity, specificity, and F1-score) of the different linearSVC models were calculated based on the results obtained from the RFE-selected 10% most discriminative JHU voxels or the RFE-selected 10 most discriminative JHU ROIs exclusively, while the remaining voxels or ROIs were excluded from the analysis.
Statistical analysis
Analysis of variance (ANOVA) was performed to compare the demographics among the groups (controls, mTBI complete and incomplete recovery). 57 Groups were considered statistically different if p < 0.05.
The linearSVCs were compared based on accuracy, sensitivity, specificity, and F1-score, calculating the mean and standard error across the fourfold cross-validation based on the positive and negative predicted values (Table 1).
Meaning of Positive and Negative Predictions
mTBI, mild traumatic brain injury.
The voxels (voxelwise approach) or brain regions (ROI-wise approach) that significantly contributed to the classification between the two groups identified via RFE were used for further analysis. For the voxelwise approach, discriminative voxels identified in at least two folds were mapped to the JHU Atlas. The relative voxel count was calculated as the percentage of discriminative voxels in the JHU regions relative to all voxels in the WM structure.
Additionally, the mean value across those discriminative FA and MD voxels was calculated and compared between groups via Student's t test. For the ROI-wise approach, the 10 most discriminative JHU ROIs were selected via the RFE algorithm in each of the four folds and retained as the final discriminative ROIs if identified in at least two folds. The average FA and MD for the 48 JHU regions were compared between groups using Student's t test. The diffusion metrics between groups were considered statistically different if p < 0.05, after controlling for false discovery rate (FDR). 58
Lastly, to evaluate the classification and prediction performances for the different linearSVCs, the receiver operating characteristic (ROC) curves were obtained. Classification accuracy was also assessed using the AUC. The paired non-parametric DeLong test was used to compare the difference AUCs among the implemented linearSVC methods across the four folds. 59
Results
Demographic data are listed in Table 2. The cohort consists of 85 controls and 179 mTBI patients, 92 with complete recovery (GOSE = 8) and 87 with incomplete recovery (GOSE <8). Age differed significantly among the three groups: controls and mTBI patients with complete and incomplete recovery (p < 0.05). Two thirds of each patient group were male (66 male and 26 female for complete recovery and 56 male and 31 female for incomplete recovery). There was no significant difference for the time between injury and scan for the patient cohorts (mean number of days was 8.5 for complete recovery and 9.0 for incomplete recovery). For both complete and incomplete recovery, the majority of mTBI patients had an initial GCS of 15 (76 out of 92 mTBI with complete recovery and 64 out of 87 with incomplete recovery). The most prevalent mechanisms of injury for both patient groups were road collisions and falls (40 and 32 for complete and 40 and 36 for incomplete recovery, respectively). Patients with incomplete recovery after an mTBI had a (prevailing) GOSE score of 7 after 6 months (42 out of 87 mTBI patients).
Demographic and Clinical Data for the Included Patients
mTBI, mild traumatic brain injury; SD, standard deviation; GCS, Glasgow Coma Scale Score; GOSE, Extended Glasgow Outcome Scale.
Intracranial abnormalities are described in Supplementary Table S1. The number of subjects per scanner was comparable across 14 of the 15 scanners (Supplementary Table S2, average of seven controls, eight complete, and six incomplete recovery per scanner), with exception of scanner #1 which scanned the highest number of mTBI patients.
The results will be presented separately for the classification between controls and mTBI patients and the prediction between complete (GOSE = 8) and incomplete (GOSE <8) recovery of the mTBI patients.
Classification of mTBI patients versus controls
Classifying mTBI patients versus controls based on age and sex yielded a classification accuracy of 59.8 (± 1.0)%. The addition of imaging features FA/MD yielded a higher classification performance than using age and sex alone. The voxelwise approaches provided a classification accuracy of 73.5 (± 3.5)% for non-harmonized FA and MD, 80.7 (± 2.1)% for the maps harmonized with ComBat using two categories, and 81.4 (± 1.7)% for the maps harmonized with ComBat using three categories (Table 3). For the ROI-wise approach, the classification scores obtained were 65.2 (± 3.9)% for non-harmonized diffusion maps, 62.9 (± 2.2)% for the maps harmonized with ComBat using two categories and 64.8 (± 2.9)% for the maps harmonized with ComBat using three categories for the ROI-wise approach (Table 3). Comparing the multiple classification features listed in Table 3 and shown in Figure 3, the implantation based on ComBat 3 categories voxelwise approach including age and sex was the one that provided the highest performance scores when classifying controls and mTBI patients (93.3% sensitivity, 81.4% accuracy, 80.3% F1-score, and 0.88 AUC).

Receiver operating characteristic (ROC) curves for the fourfold cross-validation linearSVCs classification between controls and mild traumatic brain injury (mTBI) patients based on:
Averaged Performance Statistics of the Fourfold Cross-Validation for the Classification Between mTBI and Controls (Average ± Standard Error)
mTBI, mild traumatic brain injury; AUC, area under the curve; FA, fractional anisotropy; MD, mean diffusivity; JHU, Johns Hopkins University; ROI, region of interest.
Based on the paired DeLong test (Table 4), comparing the AUCs across the CV folds for each of the linearSVCs implementations, comparing age and sex alone as input, and using age and sex combined with FA and MD voxel- or ROI-wise, the use of diffusion imaging significantly improves (p < 0.05) the classification between controls and mTBI patients. When looking at the voxelwise comparison among harmonization approaches, there is a significant improvement (p < 0.05) in the use of ComBat harmonization for the use of two or three categories in the disease status, but no significant differences for the ROI-wise approach. Finally, when comparing the same harmonization approaches for both voxel- and ROI-wise implementations, there are significant differences for both ComBat harmonization approaches but none for the non-harmonized FA/MD. Therefore, comparing the linearSVC implementations, ComBat harmonized using three disease status categories FA and MD maps in the voxel-wise linearSVC significantly improves the differentiation between mTBI and controls (p < 0.05).
DeLong Test Comparison Between Implementations for the Classification Between Controls and mTBI Patients: Delong z Value, p Value, and Statistical Significance
p < 0.05, ** p < 0.01, *** p < 0.001.
mTBI, mild traumatic brain injury; FA, fractional anisotropy; MD, mean diffusivity; ROI, region of interest.
The most predictive voxels were consistent across the voxels selected when using ComBat harmonization using two or three categories or non-harmonized data. On the other hand, for the ROI-wise classification, most predictive ROIs selected based on ComBat harmonized FA and MD maps using two or three categories were consistent, but for non-harmonized maps, the areas identified were not similar to those for the harmonized data.
As can be seen in Figure 4, for the three-category-harmonized ComBat voxelwise approach (highest classification scores) the most predictive brain tracts were identified in the middle cerebellar peduncle, corpus callosum, external capsule, superior longitudinal fasciculus, and posterior thalamic radiation. In the three-category-harmonized ComBat ROI-wise approach, as can be seen in Figure 5, the most predictive brain tracts identified were the corpus callosum, anterior corona radiata, external capsule, posterior thalamic radiation, superior longitudinal fasciculus, pontine crossing fibers, cerebral peduncles, and superior corona radiata.

Location of the 10% most predictive three-category-harmonized ComBat fractional anisotropy (FA) and mean diffusivity (MD) voxels identified via a recursive feature elimination (RFE) algorithm. Voxels identified in at least two folds were mapped to the Johns Hopkins University (JHU) atlas

Location of the most predictive three-category harmonized ComBat fractional anisotropy (FA) and mean diffusivity (MD) Johns Hopkins University (JHU) regions of interest (ROIs) for control versus mild traumatic brain injury (mTBI) classification. The most predictive JHU tracts identified were: genu of the corpus callosum (GCC), body of the corpus callosum (BCC), splenium of the corpus callosum (SCC), anterior corona radiata (ACR), external capsule (EC), posterior thalamic radiation (pTR), superior longitudinal fasciculus (SLF), pontine crossing fibers (PCF), cerebral peduncles (CP), and superior corona radiata (SCR).
Figure 6 shows a violin plot of the mean values and standard deviations of ComBat using three categories FA and MD over the discriminative voxels and ROIs for TBI patients and healthy control groups. For the linearSVCs based on dMRI metrics combined with age and sex, mTBI patients displayed higher MD and lower FA values for the most discriminative voxels and ROIs identified through RFEs. The analysis of the most discriminative voxels and ROIs of the DTI diffusion indices demonstrated possible differences in the WM microstructure in the TBI patient group compared with healthy controls. Significantly decreased FA and increased MD were present in several major WM tracts in the TBI group compared with the healthy control group (FDR corrected p value <0.05).

Mean three-category-harmonized ComBat fractional anisotropy (FA) and mean diffusivity (MD) for the most discriminative voxels (
Prediction between mTBI patients with complete (GOSE = 8) and incomplete (GOSE <8) recovery
The use of age and sex for the differentiation between mTBI patients with complete versus those with incomplete recovery yielded a classification accuracy of 61.4 (± 5.2)%. The addition of imaging features (FA/MD) provided higher classification performance than age and sex alone, when the imaging maps were optimally harmonized, yielding a classification accuracy of 66.2 (± 6.8)% for the three-category-harmonized ComBat FA and MD voxelwise. In contrast, for the non-harmonized FA and MD voxelwise an accuracy of 56.0 (± 7.7)% was obtained, and for the two-category- harmonized ComBat FA and MD voxelwise, 44.7 (± 4.1)% was obtained, demonstrating the impact of harmonization on the prediction results. A similar result was obtained when using the ROI-wise approach: 60.1 (± 4.4)% for the non-harmonized FA and MD ROI-wise, 54.9 (± 3.7)% for the two-category-harmonized ComBat FA and MD ROI-wise, and 64.8 (± 3.3)% for the three-category- harmonized ComBat FA and MD ROI-wise, Table 5. Similarly to the previous classification between controls and patients, comparing the multiple classification features listed in Table 5 and shown in Figure 7, the three-category-harmonized voxelwise approach combined with age and sex was the one that provided the highest performance scores when predicting between mTBI patients with complete and those with incomplete recovery (71.8% specificity, 66.4% accuracy, 66.2% F1-score, and 0.71 AUC).

Receiver operating characteristic (ROC) curves for the fourfold cross-validation linearSVCs classification between mild traumatic brain injury (mTBI) patients with complete and incomplete recovery based on:
Averaged Performance Statistics of the Fourfold Cross-Validation for the Prediction Between mTBI Complete and Incomplete Recovery (Average ± Standard Error)
mTBI, mild traumatic brain injury; AUC, area under the curve; FA, fractional anisotropy; MD, mean diffusivity; JHU, Johns Hopkins University; ROI, region of interest.
The paired DeLong test, Table 6, revealed significant differences (p < 0.05) between the approaches: (1) age and sex alone and the ComBat harmonized with three categories FA and MD voxelwise combined with age and sex linearSVCs, (2) age and sex alone and ComBat harmonized with three categories FA and MD ROI-wise combined with age and sex linearSVCs, (3) non-harmonized and ComBat harmonized with three categories FA and MD voxelwise combined with age and sex linearSVCs, (4) ComBat harmonized with two or three categories FA and MD voxelwise combined with age and sex linearSVCs, and (5) Combat harmonized with two categories FA and MD ROI- and voxelwise combined with age and sex linearSVCs. Thus, the combination of age and sex in addition to the use of ComBat harmonized with three categories FA and MD in the voxelwise approach provided a statistically significant improvement in the prediction between mTBI patients with complete and those with incomplete recovery.
Delong Test Comparison Between Implementations for the Classification Between mTBI Patients With Complete and Those with Incomplete Recovery: Delong Z-Value, p Value, and Statistical Significance
p < 0.05, ** p < 0.01, *** p < 0.001.
mTBI, mild traumatic brain injury; FA, fractional anisotropy; MD, mean diffusivity; ROI, region of interest.
Similar discriminative voxels and ROIs were identified independently of the harmonization approach. Moreover, the distributions of the discriminative FA and MD as measured with voxelwise or ROI methods in complete and incomplete mTBI recovery were similar when compared with the findings for discriminating between controls and mTBI for the harmonization approaches (Figs. 8 and 9). With the ComBat harmonization with three categories voxelwise approach, the most predictive brain tracts identified were in the middle cerebellar peduncle, corpus callosum, superior longitudinal fasciculus, external capsule, and anterior corona radiata (Fig. 8). Whereas for the ComBat harmonization with three categories ROI-wise approach, the most predictive brain tracts identified were the corpus callosum, external capsule, superior corona radiata, sagittal stratum, middle cerebellar peduncle, medial lemniscus, and cingulum (Fig. 9).

Location of the 10% most predictive three-category harmonized ComBat fractional anisotropy (FA) and mean diffusivity (MD) voxels identified via recursive feature elimination (RFE) algorithm. Voxels identified in at least two folds were mapped to the Johns Hopkins University (JHU) Atlas

Most predictive three-category-harmonized ComBat fractional anisotropy (FA) and mean diffusivity (MD) Johns Hopkins University (JHU) regions of interest (ROIs) for the mild traumatic brain injury (mTBI) complete versus incomplete recovery prediction. The most predictive identified JHU tracts were: genu of the corpus callosum (GCC), body of the corpus callosum (BCC), external capsule (EC), superior corona radiata (SCR), sagittal stratum (SS), middle cerebellar peduncles (mCP), medial lemniscus (ML), and cingulum (Cg).
Significantly increased FA and decreased MD were present in some major WM JHU tracts in patients with complete versus those with incomplete recovery (FDR corrected p value <0.05, Fig. 10). These differences for complete and incomplete recovery are expected to be related to microstructural neuropathological alterations in WM bundles in the semi-acute phase. 60

Mean ComBat harmonization with three categories fractional anisotropy (FA) and mean diffusivity (MD) for the most discriminative voxels (
Discussion
In this study, ComBat harmonized FA and MD were demonstrated to provide additional information for recovery prediction in mTBI using a multi-modal ML approach, based on dMRI acquired within the first 21 days post-injury.
The results of the predictions based on the dMRI quantitative metrics were compared with the baseline predictions based on age and sex only. Age and sex are factors that are likely associated with recovery after an mTBI. Both variables interact and are related to cortical maturation, biological response, and social modifiers. 52,60,61 Therefore, using age and sex only in a predictive model is a useful baseline comparison for research and provides an early assessment of injury severity, but it is not sufficiently accurate to guide decision making in the clinical setting.
We applied ComBat harmonization to remove unwanted sources of variability, while preserving variations caused by other biologically relevant covariates, such as age, sex, and disease status, in the data, because the method accounts for systemic intensity variations caused by inter-scanner biases. 18 Nonetheless, the choice of parameters for the harmonization, such as disease status, influences the accuracy of the classification groups. It should be noted that ComBat harmonization is a retrospective harmonization tool, and that therefore our data set could be regarded as a best-case simulation of how SVCs would perform when using perfectly harmonized data sets.
The linear SVC trained on age and sex and ComBat three-category-harmonized voxels of FA and MD to differentiate mTBI patients from controls yielded an accuracy of 81.4 (± 3.4)% and an AUC of 0.88 (± 0.04). Distinguishing between mTBI patients with complete and those with incomplete recovery was demonstrated to be more challenging, yielding a prediction accuracy of 66.4 (± 13.7)% and an AUC of 0.71 (± 0.12) for the age and sex and ComBat three-category-harmonized FA and MD voxelwise approach (Fig. 7). Our results are comparable to the UPFRONT model, which is the current gold standard for the prediction of outcome for complete (GOSE = 8) and incomplete (GOSE <8) recovery in mTBI, providing an of AUC 0.7. 62
The discriminative voxels identified on the DTI maps that differed between mTBI patients and controls were predominantly located in the corpus callosum, middle cerebral peduncle, right and left cerebral peduncles, external capsule, fornix, and right and left tapetum. The identified discriminative brain regions associated with mTBI with complete or incomplete recovery based on the GOSE 6 months post-injury were mainly the corpus callosum, middle cerebral peduncles, external capsule, and the right and left anterior corona radiata. These areas mostly overlapped with the regions identified by the linearSVC that differentiates mTBI patients from controls.
Many JHU regions that showed different diffusion metrics in mTBI and control subjects were also shown to be discriminative when classifying patients with different outcomes. This implies that the DTI differences found to be associated with mTBI are also important for the assessment of recovery post-injury. Further, the cerebellar peduncle, brainstem, and corpus callosum have previously been found to have prognostic value in moderate and severe TBI, 63 –65 and their identification as discriminative for mTBI in this analysis adds weight to the biological plausibility of our findings. This is also consistent with the hypothesis that the pathological and biomechanical mechanisms in mTBI may be similar (at least in part) to those in severe TBI; albeit that differences may be more subtle, hence harder to detect.
The regions identified as predictive in the current study have been previously associated with mTBI outcome. Reduced FA is the most common finding in the acute and semi-acute phases of mTBI when comparing patients and controls. 11 However, some studies in the acute/early subacute phase of mTBI have shown significantly increased FA in major WM tracts, consistent with our findings of an increased FA and reduced MD in some major WN JHU tracts in patients with complete versus incomplete recovery (Fig. 10). 66,67 An investigation using whole-brain voxelwise non-parametric statistical comparison evaluated mTBI patients with complete or incomplete recovery, and found higher MD values voxelwise in patients with incomplete recovery than in patients with complete recovery and controls, in the corpus callosum, right anterior thalamic radiations, superior longitudinal fasciculus, and inferior longitudinal and fronto-occipital fasciculi at 7–28 days after injury. 12 Ling and colleagues found increased FA and decreased radial diffusivity voxelwise within the genu of the corpus callosum, in a cohort of 28 mTBI patients with complete or incomplete recovery who underwent MRI 15.6–4.3 days after injury. 13 In contrast, an ROI approach found no significant difference in FA or MD in the genu, body, or splenium of the corpus callosum in 60 mTBI patients with complete and incomplete recovery (on the more severe end of the mTBI spectrum), in comparison with 34 controls. 14 Yuh and colleagues found that WM FA was significantly reduced in mTBI patients who had positive acute traumatic intracranial abnormality on conventional MRI, but not in negative MRI mTBI patients, compared with controls. 15 In addition, regions of reduced FA in mTBI patients were modest, but statistically significant, predictors of unfavorable 3- and 6-month outcomes. The FA alterations are likely related to microstructural neuropathological changes in the WM major tracts, which may include cytotoxic edema, changes in water content within the myelin sheath, and inflammation. 66 –68
Using the mean FA/MD measurement across all discriminative voxels (Figs. 6A/B and 10A/B), and specific discriminative ROIs (Figs. 6C/D and 10C/D), we found significant (p < 0.05) between-group differences based on the Student's t test calculated with the average FA and MD values. Thus, the differences in the WM dMRI metrics for the discriminative areas identified by the method were demonstrated to be significantly associated with the classification between mTBI patients and controls and with the prediction between complete or incomplete recovery for those mTBI patients. Moreover, the distributions of FA and MD, even if visually similar between the groups, showed significant p values (p < 0.05) in the Student's t test in such comparisons in the neuroanatomical regions of WM known to be vulnerable to axonal injury.
The DeLong test comparisons between the linearSVC models are demonstrated in Tables 4 and 6, for the classification between controls and mTBI patients and the classification between mTBI patients with complete and those with incomplete recovery, respectively. Model performance is related to the use of either single intensities (predictive voxels) or the average voxel intensity in selected regions (predictive ROIs), as well as to the choice of harmonization approach (no harmonization or ComBat-harmonized FA and MD maps with either two or three disease status categories). Based on the paired DeLong test, the ComBat harmonized with three categories FA and MD voxelwise combined with age and sex linearSVC implementation provides a significantly better classification between mTBI and controls than do the other methods (Table 4, p < 0.05), which is not an unexpected result, because the optimal harmonization of single voxels seems to highlight the group differences by preserving the underlying biological changes caused by trauma. 69 Moreover, a similar trend is demonstrated in the prediction between mTBI patients with complete and those with incomplete recovery. The combination of age and sex in addition to the use of ComBat harmonized with three categories FA and MD in the voxelwise approach provided a statistically significant improvement compared with other models (Table 6, p < 0.05), providing evidence that the single-voxel ComBat harmonization successfully removes scanner effects in diffusion data, preserving the mTBI effects in the brain. 69
The findings provide evidence for the use of DTI to aid identification of patients at risk of incomplete recovery after mTBI. The prediction of recovery remains an extremely challenging and complex question, which is influenced by multiple pre-injury factors. 70 DTI significantly improved performance of models predicting complete versus incomplete recovery, showing that there is promise in combining this advanced imaging technology with linearSVCs. However, even the best DTI model only achieved an accuracy of 66%, which is insufficient for clinical practice. Future research therefore needs to address the challenges for outcome prediction after mTBI generally (such as having large enough samples to control for a wide range of pre-morbid factors) and the use of DTI in particular (identifying the optimal timing for imaging and optimal acquisition parameters and harmonization strategies).
Therefore, the results of this study should be considered in the context of certain limitations. First, partial volume effects can lead to abnormal DTI indices in the case of inaccurate registration of individual images into standard space. To reduce the likelihood of such an error, we registered the diffusion maps using the JHU FA map and performed visual checks and correlation evaluations as a strict inclusion criterion for the quantitative dMRI maps in common space. Despite our diligent processing steps and quality control, subtle misregistration between diffusion maps and the atlas cannot be ruled out entirely.
It is important to validate an automated method based on advanced MRI techniques, such as dMRI, which can be used as a predictive tool for a wide spectrum of mild TBI outcomes. Therefore, future work with even larger mTBI patient data sets would be beneficial to increase cohort size for a comprehensive and generalizable predictive model. Additionally, the use of ComBat harmonization should be thought through carefully, as it can be tricky to decide which biological covariates to use with the intention of keeping image alterations related to the mTBI disorder. In our case, we used outcome categories as a biological covariate, but other possibilities are to include variables that become available immediately after the patient is scanned, such as the presence of any intracranial abnormalities including microhemorrhages. In addition, other biomarkers may add prognostic information; for example, proteomic biomarkers including glial fibrillary acid protein (GFAP) or ubiquitin C-terminal hydrolase L1 (UCH-L1). 71,72 The development of more complex prognostic models including this additional information would be best explored in future studies with larger numbers of patients. Given that GOSE may miss more subtle deficits, such studies should also explore more granular outcomes including cognitive and mental health outcomes, the latter being particularly salient in patients at the higher end of functional outcome as may be expected after mTBI. 73 Finally, for our study we have only used two biological features (age and sex), which were not removed as confounding factors in the analysis. Although we demonstrated that age and sex seem predictive and that the diffusion metrics FA and MD improve such predictions, further investigation of multiple diffusion metrics (e.g., radial diffusivity, axial diffusivity, mean kurtosis) and their association with direct alterations caused by mTBI would be beneficial for better understanding the underlying neuropathological changes that occur after mTBI. Based on the available measurements from the CENTER-TBI database, we did not include the scan time post-injury, severity of extracranial injury, education, or a history of mental health problems as variables in our model for the reasons outlined in the following paragraphs.
The time between injury and imaging ranged from 1 to 21 days in our study, and it is known that diffusion metrics may change during this time frame. 66 Because the time to imaging was not significantly different between patients with complete and those with incomplete recovery in our cohort, it would not affect our conclusion that DTI has some prognostic value after TBI. However, it is likely that DTI would perform better if all patients were imaged at the same time point, ideally closer to the time of injury.++ 5
Further, at present there is no accepted framework for handling missing data when applying machine learning algorithms, such as the SVM used in the present analysis. We therefore had to restrict the covariates in our model to data that were available for all patients. Future studies should strive to assess if DTI adds prognostic value also, when other variables are included such as severity of extracranial injury, education, or a history of mental health problems.
Despite the mentioned limitations, our model still adds value to the investigation of mTBI recovery and provides important insights that may be built on in bigger cohorts.
Conclusion
This study shows that ComBat-harmonized dMRI metrics provide additional information related to WM differences, which are demonstrated to be relevant for the classification of mTBI patients, suggesting that DTI may be a predictive marker of recovery in mTBI. We also demonstrated the potential utility of ComBat harmonization on FA and MD maps voxelwise combined with age and sex for mTBI recovery prediction using linearSVCs. Predictive models capable of identifying mTBI patients with the potential for incomplete recovery could facilitate the design of future clinical trials and stratification for treatment planning.
Transparency, Rigor, and Reproducibility Summary
This study was pre-registered at the CENTER-TBI study site (
Footnotes
Acknowledgments
We thank Dr. Ella Roelant and Dr. Erik Fransen from StatUa (University of Antwerp, Belgium) for providing statistical advice.
Authors' Contributions
M.S.P. was responsible for investigation, formal analysis and writing – original draft (lead); S.W. was responsible for methodology, and writing – review and editing; E.N.K. was responsible for data curation, methodology, and writing – review and editing; S.R. was responsible for methodology and writing – review and editing; R.P. was responsible for methodology; M.M.C. was responsible for conceptualization, methodology, and writing – review and editing; B.G. was responsible for methodology; G.W. was responsible for conceptualization; A.V. was responsible for conceptualization and writing – review and editing; J.P.P. was responsible for conceptualization and writing – review and editing; A.H. was responsible for conceptualization and writing – review and editing; JS was responsible for conceptualization and writing – review and editing; PJG was responsible for conceptualization, supervision and writing – review and editing; AJD was responsible for conceptualization, supervision and writing – review and editing; DKM was responsible for conceptualization; J.S. was responsible for funding, conceptualization, supervision, and writing – review and editing; P.V.D. was responsible for funding, resources, supervision, and writing – review and editing; and V.F.J.N. was responsible for funding, conceptualization, supervision, and writing – review and editing.
Data Sharing
Data access is conditional on an approved study proposal; there are no end dates to the availability. The CENTER-TBI data used in this study are available to researchers who provide a methodologically sound study proposal that is approved by the CENTER-TBI management committee to achieve the aims in the approved proposal. Proposals may be submitted online to CENTER-TBI (
Funding Information
The CENTER-TBI study was supported by the European Union 7th Framework Programme (EC grant 602150), with additional project support from OneMind, the Hannelore Kohl Foundation, NeuroTrauma Sciences. and Integra Neurosciences. This project has received funding from the European Union's Horizon 2020 research and innovation programme under Marie Skłodowska-Curie grant agreement No 764513. Individual sources of funding were the Engineering and Physical Sciences Research Council (EP/R511547/1, to B.G.), a Wellcome Trust PhD Fellowship (S.R.), the Academy of Finland (Grant 17379 to J.P.P.), the National Institute for Health Research UK (D.K.M. and V.F.J.N), and the Academy of Medical Sciences/The Health Foundation (V.F.J.N.).
Author Disclosure Statement
No competing financial interests exist.
Supplementary Material
Supplementary Figure S1
Supplementary Table S1
Supplementary Table S2
Supplementary Author List
TRIPOD Checklist
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
