Abstract
Purpose/Objective:
This study examined the clinical utility of a single item for anxiety from the Neurobehavioral Symptom Inventory (NSI) in determining the need for mental health referral for veterans with traumatic brain injury (TBI).
Research Method/Design:
Three hundred eighty veterans referred for TBI evaluation were administered the NSI and a common anxiety screening measure (Beck Anxiety Inventory; BAI). Receiver Operating Characteristic (ROC) curve analyses were conducted to determine ideal BAI total cutoff scores for a single item of the NSI pertaining to anxiety (i.e., “anxious or tense”).
Results:
Using multiclass ROC curve analyses, NSI scores of 3 and 4 for the sample were comparable to scores of 11 and 22 on the BAI, respectively. Post hoc ROC curve analyses were then conducted on the sample after removal of potentially invalid NSI protocols (i.e., Validity-10 scores greater than 22), and NSI scores 3 and 4 corresponded with scores of 11 and 20, respectively.
Conclusion/Implications:
A minimum score of 3 (severe) on the NSI item was deemed sufficient to indicate the need for further mental health referral without warranting additional screening for anxiety. Further analyses also revealed that removal of positive Validity-10 protocols did not significantly change ROC curve findings, suggesting that the particular NSI item for anxiety can still be used for clinical purposes despite an otherwise invalid protocol. Implications for treatment and recommendations pertaining to when additional screening might be required are discussed.
Introduction
Veterans returning from Operation Enduring Freedom (OEF), Operation Iraqi Freedom (OIF), and Operation New Dawn (OND) have reported high prevalence rates of cognitive/postconcussive, emotional, and physical symptoms. For example, in an often-cited study by Lew et al. (2009), veterans who were seen at a Department of Veterans Affairs Polytrauma Network Site reported high prevalence of postconcussive (66.8%), emotional (e.g., posttraumatic stress disorder; 68.2%), and physical (e.g., chronic pain; 81.5%) symptoms. In general, mild traumatic brain injury (mTBI)/postconcussive symptoms have been noted to be common problems among returning OEF/OIF soldiers, although prevalence estimates vary widely (from about 11% to 22%; Hoge et al., 2008; Polusny et al., 2011; Terrio et al., 2009). In response to the needs of veterans who have suspected mTBI due to blast exposure or other forms of trauma, the Department of Veterans Affairs mandated the polytrauma system of care. As part of this system of care, specialized clinics were established in VA hospitals throughout the country (Department of Veterans Affairs, 2009, 2013). Within the polytrauma system of care, there is a desire to provide important services that meet veterans’ needs in the most efficient and least burdensome manner possible (Flaherty et al., 2018). Resources such as time and finances are limited in the healthcare industry, and comprehensive testing may not always be considered feasible or medically necessary. Therefore, it is important to determine what must be done to maximize the quality of veterans’ appointments/screening times if resources are limited.
As part of the requirement of the polytrauma clinic, an evaluation is required that includes use of the Neurobehavioral Symptom Inventory (NSI; Cicerone & Kalmar, 1995; Department of Veterans Affairs, 2017; Meterko et al., 2012). The NSI consists of 22 items designed to provide the clinician with information on three different types of symptoms including affective/stress/psychological, physical/somatic, and cognitive symptoms (Vanderploeg et al., 2014). Assessment of emotional factors for polytrauma patients is essential due to the high comorbidity of psychiatric symptoms reported in persons receiving care for TBI.
Beck and Steer (1993) developed the Beck Anxiety Inventory (BAI), which is a self-report measure consisting of 21 items. This measure was initially developed in order to assist clinicians with assessing anxiety in psychiatric populations (Beck et al., 1988). However, it has been found useful in a variety of settings including use with veterans (Oehlert et al., 2019; Palmer et al., 2017) and in polytrauma settings (Palmer, Happe, Paxson, Jurek, Graca, et al., 2016; Palmer, Happe, Paxson, Jurek, & Olson, 2016). Research has indicated that scores in the moderate range (i.e., scores of 16 or higher) may identify clinically meaningful symptoms of anxiety (Bardhoshi et al., 2016), and Oehlert et al. (2019) have suggested in preliminary research that a cutoff score of 18 may be appropriate to identify clinically meaningful symptoms of anxiety in veterans.
Previous research suggests that mental health symptoms have an impact on overall NSI scores, leading to higher overall scores. For example, Palmer, Happe, Paxson, Jurek, Graca, et al. (2016) found that higher anxiety scores accounted for the most variance in predicting overall scores on the NSI. For anxiety, a very strong positive correlation was found between the BAI (Beck & Steer, 1993) total score and item 19 (i.e., “anxious or tense”) of the NSI. Further, the BAI total score had strong positive relationships with several items on the NSI, including “feeling dizzy;” “loss of balance;” “poor coordination, clumsy;” “loss of appetite or increased appetite;” “poor concentration, can’t pay attention;” “forgetfulness, can’t remember things;” “difficulty making decisions;” “slowed thinking, difficulty getting organized, can’t finish things;” “fatigue, loss of energy, getting tired easily;” “feeling depressed or sad;” “irritability, easily annoyed;” and “poor frustration tolerance, feeling easily overwhelmed by things” (Palmer, Happe, Paxson, Jurek, Graca, et al., 2016).
From a practical standpoint, little is known regarding what higher scores on specific items of the NSI might suggest in terms of clinical utility. Specifically, there is limited research available regarding functionality of specific mental health items on the NSI (e.g., anxiety, depression). One recent study conducted by Flaherty et al. (2018) reported that higher scores on the NSI for anxiety and depression (i.e., “severe” or “very severe” symptoms) would indicate further referral for mental health services without need for further screening. However, research has not been conducted in the polytrauma setting when considering the validity of NSI profiles. Vanderploeg et al. (2014) developed an embedded validity scale for the NSI to assist in identifying invalid response styles. Flaherty et al. (2018) removed invalid profiles from their analysis when developing recommendations for cutoff scores. Comparisons of sample profiles of veterans with and without invalid Validity-10 scale protocols of the NSI have not, to these authors’ knowledge, been considered in the context of reviewing influence on specific items. Given that the Validity-10 scale does not include item 19, is this particular item significantly altered by an otherwise invalidated profile?
The purpose of this study was to examine the clinical utility of item 19 of the NSI “feeling anxious or tense” in the context of total scores on the BAI. Based on previous literature, it would be expected that scores of “severe” (3) or “very severe” (4) would be most informative in making decisions for referral for mental health follow-up. Further, this article attempted to determine if removal of invalidated profiles (via identification of positive Validity-10 profiles) would seriously alter cutoff scores, thus altering the decision-making process.
Methods
Participants
This study used records from a sample of 380 veterans who received services at a Midwestern VA medical center. The sample of OEF/OIF/OND veterans used for the study was referred to an outpatient polytrauma clinic for evaluation. The patients were routinely administered the BAI and NSI as part of the evaluation process. All of the screening instruments used in the evaluation were administered, scored, and interpreted by a licensed psychologist.
This research study was reviewed and approved by the health care system’s affiliate Institutional Review Board (IRB), as well as its own local Research and Development Committee (RDC). A comprehensive review of the proposed research study was conducted by the IRB and RDC with full consideration for the safety of human subjects. After review of the minimal risk study, the IRB granted waiver of informed consent. Data containing all personally identifiable health information was de-identified prior to analysis.
Demographic characteristics of the veterans for this research study revealed that the sample was predominantly male (n = 367; 96.6%). Mean age of the veteran polytrauma sample was 31.0 years (SD = 7.56), and ethnicity of this sample was predominantly white (n = 357; 93.9%). Information provided from patient clinical interviews and results of TBI evaluations (conducted by a licensed psychologist and psychiatrist) revealed that the majority of the veterans in the sample met criteria for concussion (n = 364; 95.8%). Various branches of the military were represented; and the majority of subjects served in the Army National Guard/Army Reserves (n = 172; 45.3%) or Army (n = 133; 35.0%). A number of comorbid conditions were noted, with anxiety (n = 65, 17.1%), PTSD (n = 203, 53.4%), and depression (n = 182, 47.9%) being reported as the top mental health conditions. Alcohol abuse/dependence (n = 125, 32.9%) and other substance use disorders were noted by a significant portion of the sample (n = 45, 11.8%).
Measures
The BAI is a common self-report screening measure for anxiety that consists of 21 items assessing subjective and somatic symptoms associated with anxiety. The screening instrument requires the patient to subjectively rate a number of symptoms on a scale from minimal/no symptoms (i.e., “Not at all;” score of 0) to more severe symptoms (i.e., “Severely;” score of 3). The patient is asked to subjectively rate the symptoms in the context of personal experiences within the past week. According to the developers of the BAI, it has been demonstrated to have good reliability and validity. Internal reliability is excellent for this anxiety screening measure (
The NSI (Cicerone & Kalmar, 1995) is a self-report assessment used by the Department of Veterans Affairs as part of their evaluation process (commonly called a second level TBI evaluation) for veterans who have reportedly experienced TBI (Department of Veterans Affairs, 2017). The NSI consists of 22 questions, each answered on a Likert scale from 0 (i.e., none/minimal symptoms) to 4 (very severe). Item 19 in the NSI, feeling anxious or tense, is the primary focus of this article. The NSI also has an embedded validity scale called the Validity-10 (Vanderploeg et al., 2014). The 10 items on the scale include items 1 (feeling dizzy), 2 (loss of balance), 3 (poor coordination/clumsy), 5 (nausea), 6 (vision problems, blurring, trouble seeing), 8 (hearing difficulty), 9 (sensitivity to noise), 11 (change in ability to taste and/or smell), 15 (difficulty making decisions), and 16 (slowed thinking/difficulty getting organized/can’t finish things). Protocols with a total score greater than 22 across these items are considered Validity-10 positive and may be classified as overreporting of symptoms.
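To make the scoring rule concrete, the following is a minimal sketch of how a Validity-10 total could be computed from the item numbers and cutoff given above. It is an illustration only, not the scoring procedure used in this study; the function names, the dictionary layout for responses, and the example protocol are hypothetical.

```python
# NSI item numbers on the Validity-10 scale, as listed in the text.
VALIDITY_10_ITEMS = (1, 2, 3, 5, 6, 8, 9, 11, 15, 16)
VALIDITY_10_CUTOFF = 22  # totals greater than 22 are treated as positive


def validity_10_total(responses):
    """Sum the ten Validity-10 items from a dict {item_number: rating 0-4}."""
    for item in VALIDITY_10_ITEMS:
        rating = responses[item]
        if not 0 <= rating <= 4:
            raise ValueError(f"item {item} rating {rating} is outside 0-4")
    return sum(responses[item] for item in VALIDITY_10_ITEMS)


def is_validity_10_positive(responses):
    """Flag a protocol whose Validity-10 total exceeds the cutoff."""
    return validity_10_total(responses) > VALIDITY_10_CUTOFF


# Hypothetical protocols (not study data): all 22 items rated 2, then all rated 3.
moderate = {item: 2 for item in range(1, 23)}
severe = {item: 3 for item in range(1, 23)}
print(validity_10_total(moderate), is_validity_10_positive(moderate))
print(validity_10_total(severe), is_validity_10_positive(severe))
```

Note that item 19 does not enter the total, which is why its rating can be examined separately from the validity flag.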
Statistical analysis
All statistics were computed using R 3.6.1 (R Core Team, 2019) on a Windows system. Software packages used for the calculations included plyr (Wickham, 2011), dplyr (Wickham et al., 2019), gmodels (Warnes et al., 2018), psych (Revelle, 2018), and pROC (Robin et al., 2011). Receiver Operating Characteristic (ROC) curve analysis was used to determine the overall accuracy of the BAI in predicting symptoms on NSI item 19 (i.e., anxiety). For each NSI score, the sensitivities, specificities, and the threshold with the best sensitivity and specificity pair were calculated. A small number of cases (n = 11) were excluded due to missing data, leaving a sample of 369 cases for analysis.
Typically, a ROC curve analysis uses a single binary outcome to fit the logistic regression model. However, in our case NSI item 19 had five classes (scores of 0, 1, 2, 3, and 4). Therefore, a multiclass ROC curve analysis was warranted. Some approaches include framing the problem in terms of multiple binary classifiers (one vs. all or one vs. one methodologies) or representing the curves as surfaces in 3-dimensional space (Wandishin & Mullen, 2009). We decided to implement a one vs. all analysis, which meant there were five ROC curves (one for each class). This approach required fewer computations and was easier to interpret. For example, for veterans scoring 0 on NSI item 19, we set each observation of item 19 to TRUE (1) if the score was 0 or FALSE (0) if the score was 1, 2, 3, or 4. We then regressed these TRUE/FALSE values on the BAI total scores of the total sample to obtain our first model.
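The one vs. all recoding can be sketched as follows. This is a minimal Python illustration, not the R/pROC code used for the analyses; the function names and the toy scores are hypothetical. It computes the AUC directly from the BAI totals using the rank-based (Mann-Whitney) definition, which gives the same area as a ROC curve built on fitted values from a single-predictor logistic regression whenever those fitted values increase with the predictor.

```python
def one_vs_all_labels(item_scores, target):
    """Recode a 0-4 item: 1 (TRUE) for the target score, 0 (FALSE) otherwise."""
    return [1 if s == target else 0 for s in item_scores]


def auc(predictor, labels):
    """Rank-based AUC: chance that a positive case has a higher predictor
    value than a negative case, with ties counted as one half."""
    pos = [x for x, y in zip(predictor, labels) if y == 1]
    neg = [x for x, y in zip(predictor, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (not study data).
nsi = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]         # NSI item 19 ratings
bai = [2, 5, 8, 4, 12, 9, 15, 20, 30, 25]    # BAI total scores

for target in range(5):
    labels = one_vs_all_labels(nsi, target)
    print(target, round(auc(bai, labels), 4))
```

In this sketch, an AUC below 0.5 for a low item score simply means that lower BAI totals go with that score; taking 1 minus the AUC expresses the same discrimination with the direction reversed.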
After the models were fit for the five binary classifiers, we determined the sensitivities and specificities for each NSI score. We reran the above analyses excluding veterans with NSI Validity-10 scores of 23 or higher to account for potential symptom overreporting. We compared ROC curves from the corresponding classes before and after Validity-10 scores were removed using pROC’s test for two ROC curves (Robin et al., 2011). The one vs. all classification method is good for overall misclassification rate (Hand & Till, 2001).
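As an illustration of how a "best" sensitivity/specificity pair can be selected for one binary class, the sketch below scans candidate cutoffs and keeps the one with the largest value of sensitivity + specificity - 1 (Youden's index). The text does not state which criterion defined the best pair, so the use of Youden's index here is an assumption, and the function name and toy data are hypothetical.

```python
def best_cutoff(predictor, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's index.

    A case is classified positive when its predictor value is >= cutoff.
    Ties are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best = None
    for cutoff in sorted(set(predictor)):
        tp = sum(1 for x, y in zip(predictor, labels) if y == 1 and x >= cutoff)
        tn = sum(1 for x, y in zip(predictor, labels) if y == 0 and x < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        youden = sens + spec - 1
        if best is None or youden > best[0]:
            best = (youden, cutoff, sens, spec)
    return best[1:]


# Toy values for illustration only (not study data).
bai_totals = [3, 7, 9, 12, 14, 18, 21, 25, 30, 33]
is_score_4 = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]   # 1 if NSI item 19 == 4

cutoff, sens, spec = best_cutoff(bai_totals, is_score_4)
print(cutoff, sens, spec)
```

Other criteria (for example, the point closest to the top-left corner of the ROC plot) can select a different cutoff from the same curve, so the criterion should be reported alongside the cutoff.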
Results
The area under the ROC curve (AUC) for each NSI item score (i.e., 0–4) for anxiety was examined for the entire sample (n = 369) and is shown in Table 3. We tested the null hypothesis that each AUC was equal to 0.5 (i.e., chance-level discrimination). NSI scores 0 (AUC = 0.8971, p < 0.001) and 4 (AUC = 0.8740, p < 0.001) were the most accurately predicted, and scores 1 (AUC = 0.7552, p < 0.001), 3 (AUC = 0.6565, p < 0.001), and 2 (AUC = 0.5820, p = 0.0153) were least accurately predicted. Figure 1 displays the results of the different ROC curves for NSI scores 2, 3, and 4, which were the scores of interest. For each score, sensitivities and specificities (coordinates of the ROC curve) were calculated with the best sensitivity/specificity pairs reported in Table 1. The ROC curve was inverted below the diagonal line for NSI scores 0, 1, and 2 to report lower coordinates of the ROC curve as cutoff scores were expected to be in ascending order. For a score of 3, results revealed excellent sensitivity (0.9394) and modest specificity (0.3667). The best sensitivity/specificity pair for an NSI score of 3 matched with a BAI score of 11 for the entire sample. For a score of 4, results revealed good sensitivity (0.8657) and good specificity (0.7417). The best sensitivity/specificity pair for an NSI score of 4 matched with a BAI score of 22 for the entire sample.

Figure 1. ROC curves for the prediction of NSI Item 19 by BAI scores.
Table 1. Best trade-offs for 1–specificity and sensitivity by NSI score (n = 369), entire sample.
Next, protocols that were invalid according to Validity-10 criteria were removed from the entire sample, and ROC curve analyses were run with the remaining cases (n = 320). Results of analyses are shown in Table 2. NSI scores 0 (AUC = 0.8807, p < 0.001) and 4 (AUC = 0.8623, p < 0.001) were the most accurately predicted (i.e., 0 = no/minimal anxiety and 4 = significant anxiety); and scores 1 (AUC = 0.7138, p < 0.001), 3 (AUC = 0.6942, p < 0.001) and 2 (AUC = 0.5228, p = 0.5151) were least accurately predicted. For a score of 3, results revealed excellent sensitivity (0.9268) and modest specificity (0.416). The best sensitivity/specificity pair for an NSI score of 3 matched with a BAI score of 11 for the reduced sample. For a score of 4, results revealed good sensitivity (0.8684) and good specificity (0.7234). The best sensitivity/specificity pair for an NSI score of 4 matched with a BAI score of 20 for the reduced sample (n = 320). Figure 2 shows the ROC curves for NSI scores 2, 3, and 4 with positive Validity-10 protocols removed.
Table 2. Best trade-offs for 1–specificity and sensitivity with NSI item 19 scores excluding Validity-10 scores >22 (n = 320).

Figure 2. ROC curves for the prediction of NSI Item 19 by BAI scores excluding Validity-10 scores >22.
After testing for two paired ROC curves with each NSI score pair, we determined that none of the paired areas significantly differed at the 95% confidence level when comparing profiles of the entire sample with the subsample remaining after removal of positive Validity-10 protocols. Results are presented in Table 3. For the individual AUCs for each NSI score, however, all values were significantly different from 0.5 at the 95% confidence level in the entire sample. After removing positive Validity-10 protocols, all AUCs except that for NSI score 2 were significantly different from 0.5 at the 95% confidence level. The AUC for NSI score 3 was the only area that increased with the invalid protocols removed.
Table 3. Test for two ROC curves by NSI score comparing before and after removing Validity-10 scores >22.
a. AUCs indicate predictive power that persons predicted to score 0 and 1 actually scored 0 and 1 (have low or minimal anxiety).
Discussion
Use of the NSI for evaluation of veterans referred for polytrauma services can be beneficial for screening of possible clinical anxiety via use of item 19 (i.e., “anxious or tense”). Specific attention to this can assist the provider with making decisions regarding additional mental health referral. Further, findings of our study revealed that trade-offs corresponding to the anxiety item on the NSI were similar regardless of whether one considered protocols with or without Validity-10 positive scores. Therefore, our findings suggest that clinicians could still make decisions for mental health referral for anxiety by considering response on item 19 (anxious or tense) even though the profile might be otherwise invalidated.
The BAI total score has strong predictive power for NSI scores 0 and 4 (AUC = 0.8971 and 0.8740, respectively, in the entire sample). The data for NSI scores 0 and 4 suggest that people who report severe anxiety on the BAI are unlikely to underreport it on NSI item 19, and that those who report minimal anxiety are unlikely to overreport it. The BAI total score has moderate predictive power for NSI scores 1 and 3. The increase in predictive power for NSI score 3 after removal of invalid protocols led us to recommend referral for patients with this score or above without the need for further screening. Scores lower than 3 would indicate more in-depth screening (perhaps with a BAI or other screening instrument for anxiety) in the polytrauma setting in order to help determine whether significant symptoms exist for referral. Recommended cutoff scores on the BAI in previous research suggested that scores between 16 and 18 may indicate clinically meaningful symptoms of anxiety. Consistent with Flaherty et al. (2018), scores on the NSI of 3 (severe) or 4 (very severe) proved the best predictors of clinical anxiety when considering patient response on item 19, which would indicate need for referral to a mental health practitioner. For an NSI score of 3, a score of 11 on the BAI (i.e., mid to upper range for mild anxiety) provided the best trade-off for sensitivity and specificity. For an NSI score of 4, a score of 22 on the BAI (i.e., upper range for moderate anxiety) provided the best trade-off for sensitivity and specificity. Looking at the cutoffs for BAI total scores, an NSI score of 3 would suggest that a referral could confidently be made without need for further screening in the polytrauma setting because the corresponding BAI cutoff already falls within the “mild anxiety” range of the BAI total score.
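The decision rule described above can be written out as a short sketch. It encodes only the recommendation stated in the text (item 19 score of 3 or 4: refer; lower scores: screen further) and is not a clinical tool; the function name and the returned labels are hypothetical.

```python
def item19_recommendation(score):
    """Map an NSI item 19 rating (0-4) to the next step suggested in the text."""
    if score not in (0, 1, 2, 3, 4):
        raise ValueError("NSI item 19 is rated on a 0-4 scale")
    if score >= 3:
        # Severe (3) or very severe (4): referral without further screening.
        return "refer to mental health"
    # Scores of 0-2: additional anxiety screening (e.g., with the BAI).
    return "screen further for anxiety"


for rating in range(5):
    print(rating, item19_recommendation(rating))
```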
The AUC corresponding to score 2 prior to removal of positive Validity-10 protocols is statistically significantly different from 0.5 but not practically significant in this case. Removing the invalid protocols makes this clear: the logistic regression model has a deviance of 396.41, which exceeds the critical value χ2(318) = 360.59, indicating that the goodness-of-fit is poor. Using the BAI total score to predict an NSI score of 2 is unreliable, and the clinician would likely need further anxiety screening in order to determine if referral is necessary. Predictions made for this score are correct or incorrect at roughly the rate expected by chance.
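The comparison of the deviance with the chi-square critical value can be reproduced approximately as follows. This sketch assumes a 0.05 significance level (the level is not stated with the reported value) and uses the Wilson-Hilferty approximation to the chi-square quantile so that only the Python standard library is needed; the function name is hypothetical.

```python
from statistics import NormalDist


def chi2_critical(df, alpha=0.05):
    """Approximate upper-tail chi-square critical value (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1 - alpha)   # standard normal quantile
    c = 2 / (9 * df)
    return df * (1 - c + z * c ** 0.5) ** 3


residual_deviance = 396.41   # reported in the text
df = 318                     # reported in the text

critical = chi2_critical(df)
# A deviance above the critical value suggests a poorly fitting model.
print(round(critical, 2), residual_deviance > critical)
```

With 318 degrees of freedom the approximation lands very close to the reported critical value of 360.59, and the deviance of 396.41 exceeds it.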
It was surprising to note that no significant differences were found between ROC curves with regard to item 19, despite removal of invalid NSIs. There are a couple of possible reasons for this finding. First, it might be possible that veterans who come for services to the clinic focus more on “perceived” symptoms that should be reported in the clinic in relation to TBI. Therefore, a veteran might focus more on symptoms that they would expect a provider to be looking for with regard to head injury (e.g., dizziness, slowed thinking) versus mental health symptoms that are common with this population. Another possible reason for the absence of differences is that previous research related to development of the Validity-10 embedded scale did not find mental health items (i.e., anxiety and depression) to be sufficient indicators for inclusion in the scale. Our study suggests that this particular item was minimally affected by otherwise invalid responses when looking at the entire NSI score.
There were some limitations to our study. First, our sample consisted of veterans with subjective report of history of TBI. Diagnosis is made based upon completion of a second level TBI evaluation, which relies to a significant extent on self-report. Second, the sample was predominantly white and male. Further research is needed with other cultural groups. A third limitation concerns the data presented in Tables 1 and 2. Levels of NSI item 19 cannot be mapped exactly onto BAI total score ranges because NSI item 19 has 5 scores whereas the BAI has 4 main severity levels. This allows for some flexibility in the interpretation of our cutoff scores for the BAI total score. We decided to match the “none,” “mild,” and “moderate” levels with each other on both tests. However, we broke the severe category of the BAI down to match the ranges with scores 3 and 4 on the NSI. Finally, we did not split the data into training and test sets, so had it not been for previous research corroborating our results, we would not know how generalizable our results were to other data.
Further research is also needed regarding the use of the NSI in polytrauma settings. Specifically, examination of this instrument is needed for veterans with comorbid conditions and how those comorbid conditions can affect response patterns on the NSI. Similar research in regards to other NSI mental health items, such as depression, would also be helpful.
The one vs. all approach has its advantages and disadvantages. Lower computational expense and easier interpretation of the data are this method’s strong points. However, increasing the frequency of a class’s occurrence, which inevitably affected classes uncorrelated with our high Validity-10 scores, will increase the one vs. all AUC without improving our ability to predict values accurately (Wandishin & Mullen, 2009). The results of the test between the ROC curves did not show statistically significant differences between the AUCs for each NSI score pair, so this concern may not apply in our case.
Overall, the use of the NSI for evaluation of veterans referred for polytrauma services can be helpful for clinicians when screening for presence of clinical anxiety via use of item 19 (i.e., “anxious or tense”). Specific attention to the patient response on this item can assist the provider with making decisions regarding additional mental health referral. Further, findings of our study revealed that veteran response to the anxiety item on the NSI was similar regardless of whether one considered protocols with or without Validity-10 positive scores. Therefore, our findings suggest that clinicians can still make informed decisions regarding mental health referral by considering the patient response on item 19 (anxious or tense) even though the profile might be otherwise invalidated.
Footnotes
Acknowledgements
Preliminary data were presented as poster presentations at the 37th Annual Conference of the National Academy of Neuropsychology in Boston, MA; and the 39th Annual Conference of the National Academy of Neuropsychology in San Diego, CA. Special acknowledgment to Drs. Tracy Biehn, Gina Falcone, Autumn Arch, and Mr. Stephen Olson for contributory work on the poster presentations. The authors declare there is no conflict of interest.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This manuscript is the result of work supported with resources and the use of facilities at the St. Cloud VA Health Care System. The contents of this manuscript do not represent the views of the Department of Veterans Affairs or the United States Government.
