Group-Based Trajectory Analysis Applications for Prognostic Biomarker Model Development in Severe TBI: A Practical Example

Abstract

Over the last decade, biomarker research has identified potential biomarkers for the diagnosis, prognosis, and management of traumatic brain injury (TBI). Several cerebrospinal fluid (CSF) and serum biomarkers have shown promise in predicting long-term outcome after severe TBI. Despite this increased focus on identifying biomarkers for outcome prognostication after a severe TBI, several challenges still exist in effectively modeling the significant heterogeneity observed in TBI-related pathology, as well as the biomarker-outcome relationships. Biomarker data collected over time are usually summarized into single-point estimates (e.g., average or peak biomarker levels), which are, in turn, used to examine the relationships between biomarker levels and outcomes. Further, many biomarker studies to date have focused on the prediction power of biomarkers without controlling for potential clinical and demographic confounders that have been previously shown to affect long-term outcome. In this article, we demonstrate the application of a practical approach to delineate and describe distinct subpopulations having similar longitudinal biomarker profiles and to model the relationships between these biomarker profiles and outcomes while taking into account potential confounding factors. As an example, we demonstrate a group-based modeling technique to identify temporal S100 calcium-binding protein B (S100b) profiles, measured from CSF over the first week post-injury, in a sample of adult subjects with TBI, and we use multivariate logistic regression to show that the prediction power of S100b biomarker profiles can be superior to the prediction power of single-point estimates.

Introduction

Biomarker research has the capacity to unlock important clues about the molecular biology of injury, disease, plasticity, and recovery. In traumatic brain injury (TBI), there is a tremendous unmet need to deal effectively with the significant heterogeneity of TBI-related pathology and personalize treatments in a manner that optimizes recovery. Yet, despite many patients having similar injury factors and clinical care after their TBI, recovery and outcomes can be very different. One possible rationale is that individual differences in outcome may be the result of unique and changing profiles in an injury-induced proteome. To date, the TBI biomarker literature primarily has focused on biomarker levels in the first few days after injury; however, biomarkers may also be informative about pathology associated with ongoing neurodegenerative processes as well as the restorative processes that influence recovery. Also, pathophysiological profiles observed early after injury may affect late-TBI pathology and risk for complications in the postacute or -chronic phases after injury.^1,2 Thus, there is a strong clinical rationale for assessing longitudinal biomarker profiles in TBI.

Over the last decade, a significant focus has centered on identifying and utilizing novel biomarkers for the diagnosis, prognosis, and management of TBI. A search of “TBI and biomarker” on March 23, 2012 resulted in 9960 and 545 hits in Google Scholar and PubMed, respectively. Several cerebrospinal fluid (CSF) and serum biomarkers have shown promise in predicting long-term outcome after severe TBI.^3–13 One particular marker of interest in TBI is S100 calcium-binding protein B (S100b), a protein expressed in mature astrocytes whose foot processes contribute to the blood–brain barrier (BBB). S100b easily extravasates into the serum as a result of BBB compromise, making it a potentially logical choice as a candidate biomarker for TBI diagnosis and prognosis.^14–17

Despite this increased focus on identifying biomarkers for prognosis estimation after a severe TBI, several challenges still exist in effectively modeling biomarker-outcome relationships. Traditionally, biomarker data collected over time are summarized into single-point estimates. These point estimates are then used to examine the relationship between biomarker levels and outcome of interest. Some studies have used average biomarker levels to predict TBI outcome.^4,6,7,11 Others have used peak levels,^8,18 levels obtained during the first day after injury,^5,19,20 whereas others have used arbitrary cutoffs.^5,9–11,20

Recent research from our group has assessed temporal biomarker profiles in evaluating TBI prognosis.^12,13,21,22 Based on this recent work, we hypothesized that temporal biomarker profiles can be more informative than single-point estimates in predicting outcome. Work has also identified demographic, premorbid, and psychosocial factors associated with functional impairments, disability, and community integration for those with TBI. However, it is unknown whether these factors are linked to temporal biomarker profiles.²³ Additionally, injury severity, as measured by the Glasgow Coma Scale (GCS) and the Injury Severity Score, is a sensitive predictor of outcome in multiple populations with TBI.²⁴ However, many biomarker studies to date have focused on the prediction power of the biomarkers without controlling for potential clinical and demographic confounders, such as injury severity and age.^5,6,8,18 Hence, more work is needed to determine the unique diagnostic and prognostic utility of biomarkers in TBI when effects of standard clinical and demographic variables are taken into account.

A systematic and practical approach in biomarker research is needed to evaluate temporal trends, determine factors affecting these trends, and assess outcome differences associated with these temporal biomarker patterns. Our immediate aim for this study was to demonstrate the application of a contemporary modeling approach to delineate distinct subpopulations with TBI having unique temporal biomarker profiles and model the relationships between biomarker profiles and outcomes while taking into account appropriate covariates. Our long-term aim is to establish effective biomarker modeling algorithms for diagnosis, prognosis, and management of patients with TBI. Previously, we have applied group-based trajectory modeling to discern unique biomarker trends after TBI.^12,13 The specific aims of this article are to demonstrate a step-by-step process for this approach by (1) summarizing approaches for describing temporal trends of biomarker levels across time in persons with TBI and (2) comparing prediction potential of biomarker profiles to methodologies that use a single measure in predicting outcome after TBI. To provide a practical example of this novel approach in biomarker research, we use S100b data that were measured from CSF samples collected from adult subjects with severe TBI over the first 6 days after their injury.

Methods

Population description

This study was approved by the University of Pittsburgh's Institutional Review Board (Pittsburgh, PA). Our population included 138 adults with severe (GCS≤8), closed-head TBI for whom CSF was collected as a part of their intensive care unit management. People with penetrating trauma as the source of their injury were not included for analysis. Sample collection procedures and S100b measurements used in this analysis are described elsewhere.²⁵ Beginning within 12 hours of injury, CSF samples were collected up to twice-daily for 6 days by an external ventricular drain with a run-off bag placed for clinical care. Upon collection of each bag, CSF samples were centrifuged, aliquoted, and then stored at −80°C until batch analysis.

A total of 501 samples were collected from these subjects enrolled at our level 1 trauma center. Subjects were included in statistical analysis if they had data on at least 2 days within the 6-day sampling period. Sample collection, processing methods, and biomarker assessment are provided in our companion article.²⁵

Group-based trajectory modeling

Identification of distinct subpopulations that have unique biomarker profiles over time can be accomplished using a contemporary statistical technique called group-based trajectory modeling (GBTM). GBTM is a statistical approach designed to identify clusters of individuals following a similar progression of some behavior (in this case, biomarker trajectories) over time.²⁶ The methodology assumes that the population is composed of a finite number of distinct groups that can be defined by their biomarker profiles or trajectories.²⁶ In this section, we present a brief discussion about the process involved in trajectory model development, as it applies to biomarker profiles. A complete theoretical discussion of these concepts is presented elsewhere.²⁷ Additionally, an online source (http://www.andrew.cmu.edu/user/bjones/) provides the SAS code and examples for reference.

Data distribution

PROC TRAJ²⁸ is a SAS procedure that fits GBTM. It provides the ability to model three different data distributions for the variable of interest: (1) counts; (2) continuous data; and (3) dichotomous data. Thus, when using PROC TRAJ, one must first decide the appropriate data distribution before fitting a trajectory model. PROC TRAJ allows for zero-inflated Poisson (ZIP), censored normal (CNORM), and Bernoulli distributions. The ZIP model is used for count data, CNORM distribution is used for continuous data, and Bernoulli distribution is appropriate for dichotomous data. A distribution that allows for censoring is particularly useful when working with biomarker data sets, because the data can cluster at the minimum of the (biomarker) measurement scale or at the measurement scale maximum or both.²⁹ If there is no clustering, a normal distribution model can be specified by identifying a minimum and maximum that is outside the range of the observed biomarker values.

In our example, S100b concentration is a continuous variable, with a minimum detection level resulting from assay detection limits. Because of this characteristic of the data, subjects have data that tend to cluster at the minimum value, which can lead to a skewed distribution. As such, we chose to use CNORM distribution. As with many biomarkers measured in subjects after TBI, S100b levels from the same individual can widely vary across time, and levels from different subjects can substantially vary. As a result, especially when the sample size is not large, distribution is usually not normal. To address these issues associated with data distribution, we applied the natural log transformation to our data set.
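To make the role of censoring concrete, the following minimal sketch shows how a single log-transformed observation contributes to a censored normal likelihood. It is an illustration under stated assumptions, not the PROC TRAJ implementation: the function names, the detection limit, the mean and standard deviation, and the three example concentrations are all hypothetical and are not study data.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def censored_normal_loglik(y, mu, sigma, lower, upper):
    """Log-likelihood contribution of one observation under a censored normal.

    y, mu, lower, and upper must all be on the same scale (here, natural log).
    """
    if y <= lower:
        # Left-censored: we only know the latent value is at or below the floor.
        return math.log(normal_cdf((lower - mu) / sigma))
    if y >= upper:
        # Right-censored: we only know the latent value is at or above the ceiling.
        return math.log(1.0 - normal_cdf((upper - mu) / sigma))
    # Uncensored: ordinary normal log-density.
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

# Hypothetical values for illustration only (not study data).
detection_limit = 0.02            # assumed assay floor, ng/mL
raw_levels = [0.02, 0.35, 1.80]   # the first value sits at the floor
log_floor = math.log(detection_limit)
log_ceiling = math.log(1000.0)    # placed far above the data: no right-censoring
log_levels = [math.log(v) for v in raw_levels]

contributions = [
    censored_normal_loglik(y, mu=-1.0, sigma=1.5, lower=log_floor, upper=log_ceiling)
    for y in log_levels
]
```

Placing the ceiling well outside the observed range mirrors the approach described above for specifying an ordinary normal model when no clustering occurs at that end of the scale.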

Trajectory model building

Group-based modeling assumes that the population is composed of a finite number of distinct groups. One of the key decisions when identifying trajectory groups in a population is determining the number of groups that best fit the data. One must also decide on the highest polynomial order that best characterizes the path that biomarker levels for each trajectory group take over time. Polynomial order relates to the shape of the trajectory. The first-order or linear polynomial suggests a linearly decreasing or increasing trajectory. The quadratic polynomial, or second order, suggests a trajectory that has one turning point. For example, levels can initially increase and then decrease after a peak is reached. The cubic polynomial, or third-order polynomial, suggests a trajectory that can have two turning points, a maximum and a minimum concentration, for example. Clinical expertise regarding TBI pathophysiology and the expected progressive path for biomarkers to take over time can be an important component when determining polynomial order.
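As a sketch of how polynomial order enters the model, the latent (log-transformed) biomarker level for subject i at time t, given membership in group j, can be written in the general form below. The notation (β, ε, σ) is assumed here for illustration and follows the usual presentation of group-based trajectory models rather than anything specific to this data set.

```latex
y^{*j}_{it} = \beta^{j}_{0} + \beta^{j}_{1}\, t + \beta^{j}_{2}\, t^{2}
            + \beta^{j}_{3}\, t^{3} + \beta^{j}_{4}\, t^{4} + \varepsilon_{it},
\qquad \varepsilon_{it} \sim N(0, \sigma^{2})
```

A linear trajectory corresponds to setting the coefficients on t², t³, and t⁴ to zero for that group, a quadratic trajectory to setting the coefficients on t³ and t⁴ to zero, and so on. Each group has its own set of coefficients, so different groups can have different orders.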

Different models with a varying number of groups and shapes have to be compared to find the model that best fits the biomarker data. Several model-fit indices exist to help determine the best model, but one commonly used index is the Bayesian information criterion (BIC). In general, BIC measures improvement in model fit gained by adding more parameters (e.g., more groups and more-complex trajectory shapes), but also emphasizes model simplicity by applying a penalty for complex models. When comparing two possible trajectory models (e.g., with different number of groups and/or trajectory shapes [order of polynomials]), the model with the highest BIC value would be chosen (the BIC values reported in this article are negative, so the highest value is the one closest to zero). Thus, if two models fit the data similarly, but one is more complex (i.e., has more groups or higher polynomial order) than the other, the simpler model should be chosen. A thorough discussion of BIC can be found elsewhere.³⁰

Because BIC is likely to change as the number of groups or the shape of the trajectories is changed, a decision has to be made about what constitutes a meaningful change in BIC value when comparing two models with a different number of groups or trajectory shapes or both. One proposed measure for assessing meaningful change²⁷ is the Bayes factor. For two models (1 and 2), a Bayes factor is the ratio of the probability of model 1 being the correct model to the probability of model 2 being the correct model.²⁷ Thus, if two models have equal probability of being correct, the Bayes factor would be 1. Values less than 1 favor model 2, whereas values greater than 1 imply that model 1 has a higher probability of being the correct model. The Bayes factor between two different models is estimated by exp(BIC_1 − BIC_2), where BIC_1 and BIC_2 represent the BIC values for models 1 and 2, respectively. When comparing two models, a 10-fold difference in Bayes factor is considered a meaningful difference.²⁷
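As a worked illustration of this approximation, the four-group and three-group quartic models reported in Table 2 have BIC values of −801.42 and −798.05, respectively. Taking the four-group model as model 1 gives:

```latex
B_{12} \approx e^{\mathrm{BIC}_1 - \mathrm{BIC}_2}
\quad\Longrightarrow\quad
B_{4\,\text{vs}\,3} \approx e^{-801.42 - (-798.05)} = e^{-3.37} \approx 0.034
```

A value this far below 1 favors the three-group model, in line with the entry of roughly 3/100 shown in Table 2.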

For our example, we started with a model that had the highest polynomial order (quartic) included. As described below, these polynomial terms are often reduced when refining the model. We should point out that, currently, PROC TRAJ does not allow a polynomial order greater than four (quartic). Thus, we typically begin with a model consisting of one group with a quartic degree polynomial, and then we increase the group numbers until the number of groups that best fit the data is identified using a combination of BIC and Bayes factors. Once the number of groups is identified, we then reduce the polynomial orders until the highest order polynomial for each group is significant at the significance level alpha (α)=0.05.
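The group-number search described above can be sketched as a simple forward procedure. This is one possible reading, not the authors' exact procedure. The stopping rule (accept a larger model only if its approximate Bayes factor against the current model is at least 10) and the function name are assumptions made for illustration, and in practice the choice also draws on clinical judgment. The BIC inputs are the all-quartic values reported in Table 2.

```python
import math

def choose_number_of_groups(bic_by_groups, threshold=10.0):
    """Return the number of groups chosen by adding groups one at a time.

    bic_by_groups maps number of groups -> BIC, on the scale where higher
    (closer to zero) values indicate better fit.
    A larger model is accepted only if its approximate Bayes factor against
    the currently chosen model, exp(BIC_new - BIC_current), reaches `threshold`.
    """
    sizes = sorted(bic_by_groups)
    chosen = sizes[0]
    for k in sizes[1:]:
        log_bayes_factor = bic_by_groups[k] - bic_by_groups[chosen]
        # Compare on the log scale to avoid overflow for very large factors.
        if log_bayes_factor >= math.log(threshold):
            chosen = k
        else:
            break
    return chosen

# BIC values for the all-quartic models reported in Table 2.
table2_bic = {1: -943.07, 2: -844.66, 3: -798.05, 4: -801.42}
best = choose_number_of_groups(table2_bic)
```

Refinement of the polynomial orders within the chosen number of groups is a separate step and is not covered by this sketch.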

Evaluating trajectory model fit

Group-based modeling assigns each subject a posterior probability, which measures an individual's probability of belonging to a particular group given his or her measured biomarker levels across time.²⁷ Each individual is then assigned to the group where the posterior probability of membership is highest (i.e., maximum probability rule). In addition to assigning individuals into distinct groups, the posterior membership probabilities are the basis for judging the adequacy of the model. A brief discussion of model diagnostics is presented in this section. A detailed presentation can be found elsewhere.²⁷

1. Average group posterior probability (AvePP): AvePP_j is the average posterior probability for group j. If individuals are assigned to distinct groups with no ambiguity, the AvePP_j would be 1 for each group. Thus, the closer the AvePP_j are to 1, the better the model fit. An AvePP greater than 0.7 for all groups is generally recommended. In our published work, we have observed AvePP much greater than 0.7, suggesting that subjects with TBI can be very accurately placed into a trajectory group.^12,13

2. Odds of correct classification (OCC): for a trajectory group j, OCC_j is given by

$$ \mathrm{OCC}_j = \frac{\mathrm{AvePP}_j / (1 - \mathrm{AvePP}_j)}{\pi_j / (1 - \pi_j)} $$

where AvePP_j is the average group posterior probability and π_j is the estimated size of trajectory group j, that is, the probability that a randomly selected individual belongs to group j. In the above equation, the numerator represents the odds of correct classification based on the maximum probability rule, and the denominator represents the odds of correct classification based on random assignment. So, if the maximum probability rule is not better than random guessing, the OCC would equal 1 for a given trajectory group. For a model that fits the data well, the numerator should be much bigger than the denominator, leading to an OCC value much greater than 1. Generally, an OCC of 5 or more is recommended for all groups.

3. The difference between the estimated group probability π_j and the proportion P_j assigned to the group using the maximum probability rule: π_j is the population size of trajectory group j estimated by the model, and P_j is the actual proportion of individuals who are assigned to group j. When a model fits the data well, these two quantities are similar.
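The three diagnostics above can be computed directly from the matrix of posterior membership probabilities. The following is a minimal sketch, assuming the posteriors are available as one row per subject. The function name and the toy input are hypothetical and are not study data. The sketch does not guard against an empty group or an AvePP of exactly 1, either of which would cause a division by zero.

```python
def model_adequacy(posteriors, estimated_pi):
    """Compute AvePP, OCC, and |pi - P| for each trajectory group.

    posteriors: list of per-subject lists of posterior membership probabilities.
    estimated_pi: model-estimated group membership probabilities.
    """
    n_groups = len(estimated_pi)
    # Maximum probability rule: assign each subject to the most likely group.
    assigned = [max(range(n_groups), key=lambda j: row[j]) for row in posteriors]
    results = []
    for j in range(n_groups):
        # Posterior probabilities for group j among subjects assigned to group j.
        members = [row[j] for row, g in zip(posteriors, assigned) if g == j]
        ave_pp = sum(members) / len(members)
        proportion = len(members) / len(posteriors)
        odds_assigned = ave_pp / (1.0 - ave_pp)
        odds_random = estimated_pi[j] / (1.0 - estimated_pi[j])
        results.append({
            "AvePP": ave_pp,
            "OCC": odds_assigned / odds_random,
            "abs_pi_minus_P": abs(estimated_pi[j] - proportion),
        })
    return assigned, results

# Toy input for illustration only: four subjects, two groups.
toy_posteriors = [[0.90, 0.10], [0.80, 0.20], [0.30, 0.70], [0.05, 0.95]]
toy_pi = [0.5, 0.5]
assigned, results = model_adequacy(toy_posteriors, toy_pi)
```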

Dealing with missing data

Missing data is often a problem in longitudinal studies. PROC TRAJ uses the maximum likelihood method to estimate parameters, including group sizes and shapes of trajectories. Subjects with missing data are included in the analysis, but only available data for each subject are used. These parameter estimates can be biased if missing data are not random. Thus, missing data patterns should be explored and their effects on biomarker profiles and outcomes investigated.

Assessing the effect of trajectory groups on global outcome

After the best trajectory model was identified for our example with S100b, based on the steps discussed above, we evaluated the predictive power of trajectory groups compared to average weekly S100b level. Below, we show that, after controlling for demographic and clinical variables, trajectory groups are superior to average levels in predicting outcome. Trajectory groups were also compared to common demographic and clinical variables, including gender, age, injury severity, mechanism of injury, and initial computed tomography findings. Outcomes were measured using the Glasgow Outcome Score (GOS), collected 6 months after injury, and acute care mortality. The GOS is a frequently used outcome measure developed by Jennett and Bond (1975).³¹ It consists of five categories: good recovery (category 5); moderate disability (category 4); severe disability (category 3); persistent vegetative state (category 2); or death (category 1). For this article, we collapsed GOS scores into three categories (category 1 versus categories 2 and 3 versus 4 and 5) and used multivariate ordinal logistic regression to examine whether trajectory groups further discriminate outcome, after controlling for other covariate factors. A logistic regression model to predict outcome was built using trajectory groups and clinical and demographic variables, and a separate, comparative model was built by replacing trajectory groups with the average of the first week's S100b levels.
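The collapsing of the five GOS categories into three can be expressed as a simple mapping. This is a minimal sketch with a hypothetical function name; the category labels follow those used in Table 4.

```python
def collapse_gos(gos):
    """Map a five-category GOS score (1-5) to the three analysis categories."""
    if gos == 1:
        return "dead"
    if gos in (2, 3):
        return "vegetative state/severe disability"
    if gos in (4, 5):
        return "moderate disability/good recovery"
    raise ValueError(f"GOS must be an integer from 1 to 5, got {gos!r}")

collapsed = [collapse_gos(score) for score in (1, 2, 3, 4, 5)]
```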

Results

This study included 138 subjects who had suffered a severe TBI. The majority of these subjects (71.5%) had suffered their injuries from automobile or motorcycle accidents, and 22 (16.1%) were injured as a result of fall or jump. The median GCS score for all subjects was 6. The age range for this sample was 16–74 years (average, 35.6±1.3). Twenty-eight subjects (20.3%) were female. Less than one fifth (17.4%) of the total sample died during acute care, and 40 subjects (33.9%) had moderate disability or good recovery 6 months after their injuries.

Our data set included subjects who did not have biomarker data on all 6 days. More than 75% of our subjects had biomarker data on 4 days or less, and 23.2% had data on only 2 days. Only 10 (7.3%) subjects had all 6 days of data (Table 1). However, missing data in clinical data sets are commonly found, particularly with critically ill populations who may be receiving emergent surgeries and procedures. Additionally, samples were obtained by passive drainage; thus, some missing samples could be the result of variation in the amount of CSF passive draining from patients over the course of their intracranial pressure (ICP) monitoring window. We investigated patterns of missing data to determine whether “missingness” affected biomarker profiles and/or outcomes. Three dummy variables were created for this purpose: (1) whether subjects had missing data early (first 2 days after injury); (2) whether they had missing data late (days 4 and 5 after their injuries); and (3) whether subjects had fewer than 4 of the 6 sampling days, regardless of when missing data occurred. There were 28 (20.3%) subjects who had early missing data, and 38 (27.5%) had late missing data. Further, 64 (46.4%) subjects had fewer than 4 data points. Despite this degree of missingness, there were no associations between any of these dummy variables and trajectory groups or outcomes. Thus, patterns of missing data did not affect subjects' biomarker profile or their recovery status. This suggests that the assumption of data missing at random may not be violated.
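The three missingness indicators can be sketched as below. The exact coding rules are not given in the text, so this sketch makes assumptions that should be checked against the actual analysis: a window is flagged as missing only when no sample at all is available in it, and the early and late windows are taken literally as sampling days 1 to 2 and 4 to 5. The function name and the example subject are hypothetical.

```python
def missingness_flags(days_with_data, early_days=(1, 2), late_days=(4, 5), min_days=4):
    """Return three missingness indicators for one subject.

    days_with_data: iterable of sampling days (1-6) on which a sample exists.
    Assumption: a window counts as missing only if it contains no samples.
    """
    observed = set(days_with_data)
    return {
        "early_missing": not observed.intersection(early_days),
        "late_missing": not observed.intersection(late_days),
        "fewer_than_min_days": len(observed) < min_days,
    }

# Hypothetical subject with samples on days 3, 4, and 6 only.
flags = missingness_flags([3, 4, 6])
```

Each indicator can then be cross-tabulated against trajectory group and outcome to look for associations.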

Table 1A.

Missing Data Distribution: Number of Data Points Per Subject

Number of data points	Samples available for analysis (%)
2	32 (23.19)
3	32 (23.19)
4	41 (29.71)
5	23 (16.67)
6	10 (7.25)
Total	138 (100)

Table 1B.

Missing Data Distribution: Samples Available by Day

Day of sample	Samples available for analysis, N (%)
1	51 (37.0)
2	107 (77.5)
3	100 (72.5)
4	108 (78.3)
5	87 (63.0)
6	48 (34.8)

Trajectory model development

As mentioned above, we started with a model that had one trajectory group and a quartic polynomial order, and we repeatedly increased the number of groups in the model. Models with a different number of groups were compared based on BIC and Bayes factor, as described above. The results of this process are summarized in Table 2. Once the number of groups was determined, we refined our model by adjusting the order of the polynomial for each group. BIC values for the model with one and two trajectory groups were −943.07 and −844.66, respectively, and the Bayes factor comparing these two models was >1000. Thus, the model with two trajectory groups was considered superior to the model with one group. The BIC value for the three-group model was −798.05, with a Bayes factor >1000 when compared to the two-group model, leading to the conclusion that the three-group model fits the data better than the two-group model. We then fitted a four-group model, and the BIC value decreased (−801.42) when compared to the three-group model. Thus, we concluded that a three-group model fit the data better than a four-group model. The three-group model was then refined until the highest polynomial's coefficient for each trajectory group was significantly different from zero. The final model had a cubic order for the “low” group, a second order for the “intermediate” group, and a slowly decreasing linear “high” group profile (Fig. 1). The BIC value for this model was −786.97, and the Bayes factor was >1000, when compared to the three-group model with a quartic order polynomial for each group. Using the maximum probability rule, 25 (18.1%) patients were assigned to the low group, 80 (58.0%) to the intermediate group, and 33 (23.9%) to the high group. For comparative purposes and to examine the effects of missingness on TRAJ formation, a subset of subjects having at least 4 data points was used and TRAJ groups were defined. The results were very similar to the full model presented in this article (see Fig. 2). Specifically, the best model was composed of three groups, with the majority of patients assigned to the intermediate group. The shapes of the trajectories in this smaller sample were also similar to the trajectories in the whole sample, with the low group having a rapidly declining profile and the high group having levels that remained high for the entire sampling time. Finally, in the smaller sample, being in the high group was associated with high rates of poor outcome similar to that observed in the full sample (graphical representation of these data not shown).

FIG. 1.

Trajectory groups for S100b profiles over time with percent membership for each trajectory group. The y-axis represents the natural log-transformed S100b levels. Three trajectory groups were identified: group 1, low group; group 2, intermediate group; group 3, high group.

FIG. 2.

Trajectory groups for S100b profiles over time for subjects with 4 or more data points. The y-axis represents the natural log-transformed S100b levels. Three trajectory groups were identified: group 1, low group; group 2, intermediate group; group 3, high group.

Table 2.

Model Selection Results

Number of groups	Polynomial order	BIC	Bayes factor
1	4	−943.07
2	4, 4	−844.66	>1000
3	4, 4, 4	−798.05	>1000
4	4, 4, 4, 4	−801.42	3/100
3	3, 2, 1	−786.97	>1000^a

The BIC of the four-group model decreased, compared to the three-group model, and the groups had overlapping confidence intervals.

^a The last model is compared to the three-group (4, 4, 4) model.

BIC, Bayesian information criterion.

We evaluated the fit of the model using fit indices presented in the model diagnostics section, and results are presented in Table 3. For all three trajectory groups, the lowest average posterior probability was 0.92, far greater than the recommended value of 0.7. This means that the model assigned patients to different trajectory groups with little ambiguity. Further, the lowest value for the OCC was 10, which is also greater than the recommendation of 5 as a general guideline for GBTM.²⁷ Finally, the probability of group membership (as estimated from the model) and the proportion assigned to each group using the maximum probability rule, are almost identical for each group.

Table 3.

Model Adequacy Results

Trajectory group	AvePP	OCC	|π−P|
Low	0.96	109	0.00
Intermediate	0.93	10	0.01
High	0.92	35	0.01

AvePP, average posterior probability; OCC, odds of correct classification; P, actual proportion of subjects assigned to each trajectory group using the maximum probability rule; π, model-estimated probability of group membership.

Outcome prediction

Clinical and demographic associations with trajectory groups

The bivariate results assessing relationships between S100b trajectory groups and demographic and clinical variables are presented in Table 4. Gender distributions were significantly different between the final three trajectory groups. The low group was primarily composed of male subjects (92.0%), whereas female subjects comprised 36.4% of those in the high group. Further, subjects assigned to the high group were, on average, significantly older than subjects in at least one of the other two groups (p=0.004). S100b trajectory groups were also significantly associated with GOS and mortality outcomes. For example, a higher proportion of subjects in the high group died during acute care, compared to subjects in the low group (39.4 versus 4%). Further, a significantly lower proportion of subjects in the high group had good outcome 6 months after injury, compared to subjects in the low group (15.2 versus 47.4%). However, there was no significant relationship between trajectory groups and either mechanism or radiological type of injury.

Table 4.

Demographic and Clinical Variables by Trajectory Groups

	S100b trajectory groups
Subjects' characteristics	Low group (N=25)	Intermediate group (N=80)	High group (N=33)	Total sample (N=138)	Statistics
Gender, N (%)
Female	2 (8.0)	14 (17.5)	12 (36.4)	28 (20.3)	X² ₂=7.82, p =0.022
Male	23 (92.0)	66 (82.5)	21 (63.6)	110 (79.7)
Age, mean±SEM	36.3±3.0	32.3±1.6	42.8±2.9	35.6±1.0	F_2,135=5.69, p =0.004
Age range	16–72	16–74	16–70	16–74
GCS, median	7	6	6	6	X² ₂=2.9, p=0.234
Mechanism of injury
Automobile/motorcycle	22 (91.7)	57 (71.3)	19 (57.6)	98 (71.5)	X² ₄=8.74, p=0.057
Fall/jump	2 (8.3)	13 (16.3)	7 (21.2)	22 (16.1)
Other	0 (0.0)	10 (12.5)	7 (21.2)	17 (12.4)
Radiological injury type, N (%) present
Subdural hematoma	14 (66.7)	47 (59.5)	20 (60.6)	81 (60.9)	X² ₂=0.37, p=0.833
Subarachnoid hemorrhage	17 (81.0)	55 (69.6)	26 (78.8)	98 (73.7)	X² ₂=1.73, p=0.421
Diffuse axonal injury	5 (23.8)	31 (39.2)	7 (21.2)	43 (32.3)	X² ₂=4.42, p=0.110
Epidural hematoma	5 (23.8)	8 (10.1)	4 (12.1)	17 (12.8)	X² ₂=2.44, p=0.295
Contusion	7 (33.3)	33 (41.8)	18 (54.6)	58 (43.6)	X² ₂=2.62, p=0.270
Intraventricular hemorrhage	5 (23.8)	26 (32.9)	12 (36.4)	43 (32.3)	X² ₂=0.99, p=0.610
Intracerebral hemorrhage	10 (47.6)	22 (27.9)	14 (42.4)	46 (34.6)	X² ₂=4.02, p=0.134
Acute care mortality
Dead	1 (4.0)	10 (12.5)	13 (39.4)	24 (17.4)	X² ₂=15.57, p <0.001
Alive	24 (96.0)	70 (87.5)	20 (60.6)	114 (82.6)
Six-month GOS
Dead	2 (10.5)	11 (16.7)	17 (51.5)	30 (25.4)	X² ₄=17.62, p =0.002
Vegetative state/severe disability	8 (42.1)	29 (43.9)	11 (33.3)	48 (40.7)
Moderate disability/good recovery	9 (47.4)	26 (39.4)	5 (15.2)	40 (33.9)

SEM, standard error of the mean; GCS, Glasgow Coma Scale; GOS, Glasgow Outcome Score.

Bolded values represent statistically significant comparisons where alpha <0.05.

Multivariate logistic regression modeling

Our aim here was to predict global outcome after TBI using S100b and demographic and clinical variables. Two different logistic regression models were used: One used clinical and demographic variables and S100b trajectory groups, and the other used the same clinical and demographic variables, but used average S100b levels observed over the first week after injury. For both models, independent variables that were associated with GOS in bivariate analysis, based on a p<0.2 cutoff, were included. A backward step-wise selection was then used to identify variables affecting outcome at the significance level α=0.05. Average S100b did not affect outcome, but it was kept in the model for comparison purposes. The final multivariate results are summarized in Tables 5 and 6. After controlling for demographic and clinical variables, S100b trajectory groups remained significantly associated with GOS (Table 5). In fact, for subjects in the intermediate group, the odds of having good outcome were three times the odds of having good outcome for subjects in the high group (p=0.008). Subjects in the low group had even better outcome 6 months after their injuries. The odds of having good outcome for subjects in this group were six times the odds of having good outcome for subjects in the high group (p=0.007). However, average S100b level was not associated with outcome after controlling for clinical and demographic variables (p=0.229; Table 6).
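For readers who want to see how an odds ratio, its confidence interval, and its p-value relate, the sketch below back-calculates the log-odds coefficient, standard error, and two-sided p-value from a reported odds ratio and 95% CI. It assumes a symmetric Wald interval on the log-odds scale, which is a common default but is not stated in this article. The function name is hypothetical. The inputs are the low-group values reported in Table 5.

```python
import math

def wald_summary(odds_ratio, ci_lower, ci_upper, z_crit=1.959964):
    """Recover log-odds, standard error, z, and two-sided p from an OR and 95% CI.

    Assumes the interval is a symmetric Wald interval on the log-odds scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p_value

# Low versus high trajectory group, as reported in Table 5.
beta, se, z, p_value = wald_summary(5.92, 1.64, 21.41)
```

Under this assumption, the recovered p-value is close to the reported value of 0.007. Small differences are expected because the published odds ratio and interval are rounded.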

Table 5.

Multivariate Logistic Regression Predicting Outcome Using Clinical/Demographic Variables and CSF S100b Trajectory Groups

Independent variable	Odds ratio	95% CI	p-value
Age	0.64	(0.50, 0.83)	0.001
Injury severity (GCS)	1.32	(1.04, 1.70)	0.022
Subdural hematoma	2.99	(1.35, 6.59)	0.007
S100b low group^a	5.92	(1.64, 21.41)	0.007
S100b intermediate group^a	3.29	(1.36, 8.07)	0.008

^a The high group is the reference category. The low and intermediate groups are being compared to the high group.

CI, confidence interval; GCS, Glasgow Coma Scale.

Bolded values represent statistically significant comparisons where alpha <0.05.

Table 6.

Multivariate Logistic Regression Predicting Outcome Using Clinical/Demographic Variables and Average First Week After Injury CSF S100b Levels

Independent variable	Odds ratio	95% CI	p-value
Age	0.62	(0.47, 0.80)	<0.001
Injury severity (GCS)	1.39	(1.10, 1.77)	0.006
Subdural hematoma	2.41	(1.11, 5.26)	0.027
Contusion	2.09	(1.00, 4.35)	0.049
S100b average levels	0.95	(0.86, 1.04)	0.229

Average S100b are used instead of trajectory groups.

CI, confidence interval; GCS, Glasgow Coma Scale.

Bolded values represent statistically significant comparisons where alpha <0.05.

Discussion

Our results highlight the advantages of group-based analysis for longitudinal biomarker modeling and outcome prognosis after TBI. Using this statistical method, we tested for heterogeneity in patterns of change in S100b levels during the first week after injury and identified three groups that were qualitatively different, based on demographic variables as well as outcomes. The low group (18.2%) comprised patients whose levels rapidly declined during the first few days. The majority of subjects (56.7%) fell in the intermediate group, which comprised subjects whose levels steadily decreased over time. This declining pattern is similar to that noted when graphing subject S100b levels over the entire population.²⁵ Subjects in the high group (25.1%) represent a somewhat atypical pattern of change because their levels remained high during the entire sampling period. In fact, S100b levels for the high group 6 days after injury were comparable, on average, to levels of patients in the low group right after their injuries. It is interesting that these three groups had significantly different acute mortality rates and global outcome 6 months after injury. A significantly higher percentage of patients who were assigned to the high and atypical group died during hospital stay (39.4%), compared to 12.5 and 4% for the intermediate and low groups, respectively. Further, high group subjects were less likely to have good outcome 6 months after injury (15.2%, compared to 47.4% for the low group). Finally, women were more likely to be placed in the atypical group (42.9%, compared to 19.1% for men) and patients in this group were significantly older (42.8±2.9 versus 32.3±1.6 and 36.3±3.0 years for the intermediate and low groups, respectively), suggesting that age may influence injury severity and/or play a role in the BBB pathology that accompanies injury, thus influencing the evolution of S100b levels over time. The implications of this model are discussed further elsewhere.²⁵

It is also interesting to note that, without controlling for demographic or clinical variables, higher average S100b levels were associated with worse outcome 6 months after injury (4.34±0.97 versus 2.04±0.44 ng/mL for those who had good recovery). However, average S100b levels had no significant effect in our multivariate regression model after potential confounders were controlled for, suggesting that capturing the heterogeneity in biomarker patterns over time is an important reason why trajectory groups were better able to discriminate outcome in this example. Thus, using average levels alone would have led to the conclusion that S100b does not affect outcome after TBI. Our results show that subjects in the intermediate group were three times more likely to have better outcome and those in the low group were six times more likely to have better outcome, compared to the high group, even after confounders are taken into account in the multivariate logistic regression model. The varied pathology that underlies the biomarker profile associated with each group warrants further study. However, one must carefully consider the research question at hand and whether an assessment of longitudinal profiles will appropriately address this question. For example, the prognostic value of admission-based biomarkers would not be pertinent to a longitudinal profile approach.
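To make the "times more likely" statements concrete, the following minimal Python sketch computes an unadjusted odds ratio from the good-outcome percentages reported earlier in this Discussion (47.4% for the low group and 15.2% for the high group). It assumes that these statements refer to odds ratios from the logistic regression model. The function names are illustrative only and are not part of the original analysis.

```python
def odds(p):
    """Convert a proportion (strictly between 0 and 1) into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_group, p_reference):
    """Odds of the event in one group relative to a reference group."""
    return odds(p_group) / odds(p_reference)


# Proportions with good outcome 6 months after injury, as reported in the text.
p_low = 0.474   # low trajectory group
p_high = 0.152  # high trajectory group (used as the reference)

# Unadjusted: no demographic or clinical confounders are taken into account.
unadjusted_or = odds_ratio(p_low, p_high)
print(f"Unadjusted odds ratio, low vs. high group: {unadjusted_or:.2f}")
```

The unadjusted value is close to 5. It need not match the adjusted estimate from the multivariate model, because adjustment for confounders can move the odds ratio in either direction.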

Although not discussed in depth in this article, several longitudinal statistical methods exist to model data collected over time. These include mixed-effects models,³² hierarchical modeling,³³ latent curve analysis,³⁴ and growth-curve modeling techniques.³⁵ However, these methods model the overall mean pattern and deviations from the overall mean over time. Because subject-derived biomarker assessments may follow different patterns, a group-based modeling approach allows the flexibility to exploit this feature of biomarker evolution after TBI and identify qualitatively distinct subpopulations, rather than estimating the overall biomarker pattern.

It should be noted that these groups provide estimations of complex distributions, and group membership should not be taken as an absolute certainty, even in high-fit models. When developing trajectory models, missing data points should be addressed. In the current example, only 10 patients had all data points, which we consider a limitation. Replicating these findings in similar populations, as well as generalizing this modeling approach to other populations, such as those with mild TBI, blast injury, or other types of acquired brain injury, will therefore be essential to better understand the utility of this approach for outcome prognostication and management across the spectrum of TBI.
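The point that group membership is probabilistic can be illustrated with a minimal Python sketch of posterior membership probabilities in a finite mixture of trajectories. The group proportions are those reported in this Discussion. Everything else is an assumption made for illustration: the linear trajectory coefficients, the common residual standard deviation, the normal error model, the day indexing, and the example subjects are hypothetical and are not the fitted values from this study. Missing days are simply skipped, so each subject contributes only the observations that are available.

```python
import math


def log_normal_pdf(y, mu, sigma):
    """Log density of a normal distribution, computed on the log scale."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def group_mean(coefs, day):
    """Evaluate a polynomial trajectory b0 + b1*day + b2*day**2 + ..."""
    return sum(b * day ** k for k, b in enumerate(coefs))


def posterior_membership(observations, groups, sigma):
    """Return P(group | observed data) for one subject.

    observations: dict mapping day -> value, with None for a missing day
    groups: dict mapping group name -> (proportion, polynomial coefficients)
    """
    log_joint = {}
    for name, (proportion, coefs) in groups.items():
        total = math.log(proportion)
        for day, y in observations.items():
            if y is None:  # missing observation: contributes nothing
                continue
            total += log_normal_pdf(y, group_mean(coefs, day), sigma)
        log_joint[name] = total
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_joint.values())
    weights = {g: math.exp(v - top) for g, v in log_joint.items()}
    norm = sum(weights.values())
    return {g: w / norm for g, w in weights.items()}


# Proportions from the text; trajectory coefficients are hypothetical.
groups = {
    "low": (0.182, (0.5, -0.10)),
    "intermediate": (0.567, (1.5, -0.15)),
    "high": (0.251, (2.0, 0.0)),
}
SIGMA = 0.5  # hypothetical common residual standard deviation

# Hypothetical subject with two missing days.
subject = {0: 1.9, 1: None, 2: 2.1, 3: None, 4: 1.8, 5: 2.0}
# Hypothetical subject with a single early observation.
sparse_subject = {0: 1.75}

for label, obs in (("several days", subject), ("one day", sparse_subject)):
    probs = posterior_membership(obs, groups, SIGMA)
    summary = ", ".join(f"{g}: {p:.3f}" for g, p in probs.items())
    print(f"{label}: {summary}")
```

With several observations, the posterior concentrates on one group. With a single observation, it stays spread across groups, which is why assignments should be read together with their posterior probabilities.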

Future directions

In our group-based modeling, we used S100b data collected during the first 6 days after injury. One interesting question is whether we can effectively predict trajectory groups using fewer days, using some type of dynamic pattern-recognition approach. In other words, our goal is to investigate whether biomarker levels obtained during the first few days after injury can effectively predict the trajectory groups identified using the full sampling period. This approach has significant implications with regard to earlier clinical prognosis and the development of clinical treatment algorithms, for which a rapid assessment of treatment effectiveness could be generated using a biomarker approach.
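One rough way to explore this question is sketched below in Python. The sketch simulates subjects around hypothetical group mean curves, assigns each subject to the nearest curve by least squares using only the first k days, and reports how often that early assignment agrees with the assignment based on all 6 days. The curves, noise level, sample size, and nearest-curve rule are all assumptions made for illustration. The rule is a simplification, not a group-based trajectory model and not the dynamic pattern-recognition approach referred to above.

```python
import random

N_DAYS = 6

# Hypothetical group mean curves over 6 sampling days (not fitted values).
GROUP_CURVES = {
    "low": [0.5 - 0.10 * d for d in range(N_DAYS)],
    "intermediate": [1.5 - 0.15 * d for d in range(N_DAYS)],
    "high": [2.0 for _ in range(N_DAYS)],
}


def assign(values, n_days):
    """Assign a subject to the closest mean curve over the first n_days."""
    def sse(curve):
        return sum((values[d] - curve[d]) ** 2 for d in range(n_days))
    return min(GROUP_CURVES, key=lambda g: sse(GROUP_CURVES[g]))


def simulate(n_subjects, noise_sd, seed=0):
    """Draw subjects as a group mean curve plus independent normal noise."""
    rng = random.Random(seed)
    names = list(GROUP_CURVES)
    subjects = []
    for _ in range(n_subjects):
        group = rng.choice(names)
        subjects.append(
            [mu + rng.gauss(0.0, noise_sd) for mu in GROUP_CURVES[group]]
        )
    return subjects


def agreement_by_window(subjects):
    """Fraction of subjects whose early assignment matches the full one."""
    full = [assign(s, N_DAYS) for s in subjects]
    result = {}
    for k in range(1, N_DAYS + 1):
        early = [assign(s, k) for s in subjects]
        matches = sum(e == f for e, f in zip(early, full))
        result[k] = matches / len(subjects)
    return result


agreement = agreement_by_window(simulate(n_subjects=200, noise_sd=0.5))
for k, fraction in agreement.items():
    print(f"first {k} day(s): {fraction:.2f} agreement with full period")
```

In real data, the comparison would be against the groups estimated by the trajectory model itself, and the agreement would depend on how early the group curves separate.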

In addition, this approach may be useful for future assessments of multiple biomarker profiles, or profile combinations, for both prognostication and patient management. Also, there may be some utility in using biomarker-based trajectory groups, in combination with genetic and/or epigenetic information, to further characterize secondary injury and recovery mechanisms as well as to assess prognosis and/or treatment effects using a gene stratification approach. Finally, longitudinal physiological data (e.g., ICP monitoring, brain tissue oxygenation, and quantitative electroencephalography techniques) may also be appropriate for GBTM approaches, both to further describe physiological correlates of secondary injury over time and for use alone, and/or in combination with biomarker data, for prognostication and management purposes.

Acknowledgments

This study was supported by DOD W81XWH-071-0701 (to A.F., A.K.W., C.N., H.O., K.A., and A.G.) and R49 CCR 323155-03 (to A.F. and A.K.W.).

Author Disclosure Statement

No competing financial interests exist.

References

1. Wagner A.K. 2010. TBI translational rehabilitation research in the 21st century: exploring a rehabilomics research model. Eur. J. Phys. Rehabil. Med., 46:549–556.

2. Wagner A.K. 2011. Rehabilomics: a conceptual framework to drive biologics research. PM R, 3:S28–S30.

3. Berger R.P., Pierce M.C., Wisniewski S.R., Adelson P.D., Clark R.S., Ruppel R.A., Kochanek P.M. 2002. Neuron-specific enolase and S100B in cerebrospinal fluid after severe traumatic brain injury in infants and children. Pediatrics, 109:E31.

4. Ucar, Baykal, Akyuz, Dosemeci, Toptas 2004. Comparison of serum and cerebrospinal fluid protein S-100b levels after severe head injury and their prognostic importance. J. Trauma, 57:95–98.

5. Rainey, Lesko, Sacho, Lecky, Childs 2009. Predicting outcome after severe traumatic brain injury using the serum S100B biomarker: results using a single (24h) time-point. Resuscitation, 80:341–345.

6. Naeimi Z.S., Weinhofer, Sarahrudi, Heinz, Vécsei 2006. Predictive value of S-100B protein and neuron specific-enolase as markers of traumatic brain damage in clinical use. Brain Inj., 20:463–468.

7. Pelinka L.E., Kroepfl, Leixnering, Buchinger, Raabe, Redl 2004. GFAP versus S100B in serum after traumatic brain injury: relationship to brain damage and outcome. J. Neurotrauma, 21:1553–1561.

8. Nylén, Ost, Csajbok L.Z., Nilsson, Hall, Blennow, Nellgård, Rosengren 2008. Serum levels of S100B, S100A1B, and S100BB are all related to outcome after severe traumatic brain injury. Acta Neurochir., 150:221–227.

9. Raabe, Grolms, Sorge, Zimmermann, Seifert 1999. Serum S-100B protein in severe head injury. Neurosurgery, 45:477–483.

10. Woertgen, Rothoerl R.D., Metz, Brawanski 1999. Comparison of clinical, radiologic, and serum marker as prognostic factors after severe head injury. J. Trauma, 47:1126–1130.

11. Chatfield D.A., Zemlan F.P., Day D.J., Menon D.K. 2002. Discordant temporal patterns of S100 beta and cleaved tau protein elevation after head injury: a pilot study. Br. J. Neurosurg., 16:471–476.

12. Wagner A.K., Amin, Niyonkuru, Postal B.A., McCullough E.H., Ozawa, Dixon C.E., Bayir, Clark R.S., Kochanek P.M., Fabio 2011a. CSF Bcl-2 and cytochrome C temporal profiles in outcome prediction for adults with severe TBI. J. Cereb. Blood Flow Metab., 31:1886–1896.

13. Wagner A.K., McCullough E.H., Niyonkuru, Ozawa, Loucks T.L., Dobos J.A., Brett C.A., Santarsieri, Dixon C.E., Berga S.L., Fabio 2011b. Acute serum hormone levels: characterization and prognosis after severe traumatic brain injury. J. Neurotrauma, 28:871–888.

14. Kirchhoff, Buhmann, Braunstein, Leidel B.A., Vogel, Kreimeier, Mutschler, Biberthaler 2008. Cerebrospinal S100-B: a potential marker for progressive intracranial hemorrhage in patients with severe traumatic brain injury. Eur. J. Med. Res., 13:511–516.

15. Marchi, Rasmussen, Kapural, Fazio, Kight, Mayberg M.R., Kanner, Ayumar, Albensi, Cavaglia, Janigro 2003. Peripheral markers of brain damage and blood-brain barrier dysfunction. Restor. Neurol. Neurosci., 21:109–121.

16. Kapural, Krizanac-Bengez L.J., Barnett, Perl, Masaryk, Apollo, Rasmussen, Mayberg M.R., Janigro 2002. Serum S-100 beta as a possible marker of blood–brain barrier disruption. Brain Res., 940:102–104.

17. Raabe, Kopetsch, Woszczyk, Lang, Gerlach, Zimmermann, Seifert 2003. Serum S-100B protein as a molecular marker in severe traumatic brain injury. Restor. Neurol. Neurosci., 21:159–169.

18. Hayakata, Shiozaki, Tasaki, Ikegawa, Inoue, Toshiyuki, Hosotubo, Kieko, Yamashita, Tanaka, Shimazu, Sugimoto 2004. Changes in CSF S100B and cytokine concentrations in early-phase severe traumatic brain injury. Shock, 22:102–107.

19. Berger R.P., Beers S.R., Richichi, Wiesman, Adelson P.D. 2007. Serum biomarker concentrations and outcome after pediatric traumatic brain injury. J. Neurotrauma, 24:1793–1801.

20. Townend W.J., Guy M.J., Pani M.A., Martin, Yates D.W. 2002. Head injury outcome prediction in the emergency department: a role for protein S-100B? J. Neurol. Neurosurg. Psychiatry, 73:542–546.

21. Berger R.P., Bazaco M.C., Wagner A.K., Kochanek P.M., Fabio 2010. Trajectory analysis of serum biomarker. Dev. Neurosci., 32:396–405.

22. Salonia, Empey P.E., Poloyac S.M., Wisniewski S.R., Klamerus, Ozawa, Wagner A.K., Ruppel, Bell M.J., Feldman, Adelson P.D., Clark R.S., Kochanek P.M. 2010. Endothelin-1 is increased in cerebrospinal fluid and associated with unfavorable outcomes in children after severe traumatic brain injury. J. Neurotrauma, 27:1819–1825.

23. Hall K.M., Wallbom A.S., Englander 1998. Premorbid history and traumatic brain injury. NeuroRehabilitation, 10:3–12.

24. Wagner A.K., Hammond F.M., Sasser H.C., Wiercisiewski, Norton H.J. 2000. Use of injury severity variables in determining disability and community integration after traumatic brain injury. J. Trauma, 49:411–419.

25. Goyal, Niyonkuru, Carter M.D., Fabio, Berger R.P., Wagner A.K. 2012. Comparative assessment of serum and CSF S100B profiles in outcome prediction. J. Neurotrauma [Epub ahead of print].

26. Nagin D.S., Odgers C.L. 2010. Group-based trajectory modeling in clinical research. Annu. Rev. Clin. Psychol., 6:109–138.

27. Nagin 2005. Group-Based Modeling of Development. Harvard University Press: Cambridge, MA.

28. Jones B.L., Nagin D.S., Roeder 2001. A SAS procedure based on mixture models for estimating developmental trajectories. Sociol. Methods Res., 29:374–393.

29. Kotz, Johnson N.L., Balakrishnan 2000. Continuous Multivariate Distributions. Wiley: New York.

30. Kass R.E., Raftery A.E. 1995. Bayes factors. J. Am. Stat. Assoc., 90:773–795.

31. Jennett, Bond 1975. Assessment of outcome after severe traumatic brain damage. Lancet, 1:480–484.

32. Demidenko 2004. Mixed Models: Theory and Applications. John Wiley & Sons, Inc.: Hoboken, NJ.

33. Bryk A.S., Raudenbush S.W. 1987. Application of hierarchical linear models to assessing change. Psychol. Bull., 101:147–158.

34. Meredith, Tisak 1990. Latent curve analysis. Psychometrika, 55:107–122.

35. Duncan T.E., Duncan S.C. 2004. An introduction to latent growth curve modeling. Behav. Ther., 35:333–363.