Abstract
A growing number of young adults are presenting in clinical settings with self-reported symptoms of ADHD, seeking to qualify for supports and services (Hagar & Goldstein, 2001; Harrison, 2004; McGuire, 1998; Weyandt & Dupaul, 2013). This increase may be due to greater awareness about ADHD or better availability of accommodations for individuals diagnosed with the disorder. The Diagnostic and Statistical Manual of Mental Disorders (5th ed.; American Psychiatric Association [APA], 2013) presents clear diagnostic guidelines that require the presence of lifelong symptoms in addition to cross-setting impairment due to the symptoms, after ruling out other potential causes for the symptoms (McCann & Roy-Byrne, 2004; Gathje, Lewandowski, & Gordon, 2008).
The diagnosis of ADHD in adults is more difficult than in children because of the need to establish both the current and historical presence of the syndrome (Kolar et al., 2008; Shaffer, 1994). Adult retrospective recall of childhood symptoms is unreliable (Mannuzza, Klein, Klein, Bessler, & Shrout, 2002), making it difficult to determine, with a high degree of confidence, whether an adult met the diagnostic criteria for ADHD in childhood. In the absence of school reports about past behavior, retrospective self- or third-party reporting cannot be corroborated. This is problematic because current symptoms of ADHD alone are insufficient to make a diagnosis, related to the fact that individual symptoms of ADHD are often present in the general population (Harrison, 2004; Lewandowski, Lovett, Codding, & Gordon, 2008; Suhr, Zimak, Buelow, & Fox, 2009). Postsecondary students, in particular, endorse a high frequency and severity of symptoms associated with ADHD. For example, 81.8% of healthy undergraduate students endorse difficulties concentrating when reading, 63.6% endorse becoming tired easily, and 62.5% endorse being impatient (Wong, Regennitter, & Barrios, 1994). Moreover, the severity of symptoms reported by nondisabled college students related to memory problems, loss of interest, irritability, and fatigue does not differ significantly from individuals with brain injury (Gouvier, Uddo-Crane, & Brown, 1988).
The fact that university students demonstrate symptoms of ADHD at high rates and severity may relate to university life itself, a uniquely stressful time that includes new demands related to academics, social functioning, and living independently (Robotham & Julian, 2006). For example, it is well established that stress can affect cognitive processes, such as memory (Lupien et al., 2005; McEwen & Sapolsky, 1995; Newcomer et al., 1999). Consistently, poorer coping ability, an increased number of negative life events, and feeling overwhelmed were significant predictors of high scores on self-reported measures of ADHD symptoms among undergraduates (Harrison, Alexander, & Armstrong, 2013; Lovibond & Lovibond, 1995). Thus, misdiagnosis of ADHD seems especially likely in students who are experiencing elevated stress or have had many negative life events.
More importantly, many symptoms associated with ADHD are present in a number of other psychological disorders (Suhr, Hammers, Dobbins-Buckland, Zimak, & Hughes, 2008). For instance, individuals suffering from depression or anxiety frequently experience difficulty with memory, agitation, or inability to concentrate (APA, 2000). Van Voorhees, Hardy, and Kollins (2011) found that many individuals with diagnosed psychiatric disorders also produce elevated scores on the Conners’ Adult ADHD Rating Scale (CAARS; Conners, Erhardt, & Sparrow, 1999). Consistently, first-year psychology students with no history of ADHD exhibit scores on the CAARS, including the ADHD Index, which are associated positively with symptoms of stress, anxiety, and depression, as measured by the Depression, Anxiety and Stress Scale (DASS; Alexander & Harrison, 2013; Harrison et al., 2013).
Assessment Using CAARS
Despite admonitions to the contrary, research has shown that many clinicians employ only symptom checklists when making the diagnosis of ADHD (Joy, Julius, Akter, & Baron, 2010; McCann & Roy-Byrne, 2004; Nelson, Whipple, Lindstrom, & Foels, 2014; Weyandt & Dupaul, 2013). Hence, the need for information regarding diagnostic sensitivity of such checklists is apparent. In a review of 14 self-report questionnaires for ADHD, the CAARS was endorsed as having strong psychometric qualities and content validity (Taylor, Deb, & Unwin, 2011). The CAARS is a questionnaire that asks individuals to rate how frequently they experience symptoms associated with ADHD, where a higher score indicates more problems (Gallagher & Blader, 2001). The CAARS measures four main factors: inattention/memory problems, hyperactivity/restlessness, impulsivity/emotional lability, and problems with self-concept. The scale also contains an ADHD Index and three DSM-IV (4th ed.; APA, 1994) ADHD symptom subscales: Inattentive Symptoms, Hyperactive/Impulsive Symptoms, and ADHD Symptoms Total. The test manual asserts that when making a diagnosis of ADHD, one must not only examine individual score elevations among the subscales but also consider the number and pattern of subscale elevations. Specifically, one subscale t score above 65 is said to indicate marginal support for a diagnosis, with a greater number of subscale elevations indicating a higher likelihood of moderate to severe problems (Conners et al., 1999).
To establish criterion validity of the CAARS, 39 adults who met DSM-IV criteria for ADHD were compared with 39 normal adults (Erhardt, Epstein, Conners, Parker, & Sitarenios, 1999). The ADHD group was mixed, consisting of 23 inattentive, five hyperactive, and 11 combined. The ADHD group differed significantly from the healthy group on the four main factors (i.e., non-DSM-IV scales) of the CAARS (Erhardt et al., 1999). A discriminant function analysis using all eight CAARS subscales revealed that overall sensitivity was 82%, specificity was 87%, positive predictive value (PPV) was 87%, and negative predictive value (NPV) was 83%. The discriminant function incorrectly classified normal individuals as having ADHD (i.e., false positive) at a rate of 13%.
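The four indices reported above all derive from a 2 x 2 table of test result against true diagnosis. The following Python sketch shows the definitions. The cell counts are hypothetical: they are not taken from Erhardt et al. (1999) and were chosen only because, for two groups of 39, they land within about one percentage point of the reported figures. The class and field names are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2 x 2 diagnostic table (test result vs. true status)."""
    tp: int  # true ADHD, classified as ADHD
    fn: int  # true ADHD, classified as non-ADHD
    tn: int  # non-ADHD, classified as non-ADHD
    fp: int  # non-ADHD, classified as ADHD

    @property
    def sensitivity(self) -> float:
        # Proportion of true cases that the test detects.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of non-cases that the test correctly clears.
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        # Probability that a positive result is a true case.
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        # Probability that a negative result is a true non-case.
        return self.tn / (self.tn + self.fn)

    @property
    def false_positive_rate(self) -> float:
        return 1.0 - self.specificity


# Hypothetical counts (assumption, not reported data): 39 ADHD, 39 controls.
illustrative = ConfusionCounts(tp=32, fn=7, tn=34, fp=5)

for name in ("sensitivity", "specificity", "ppv", "npv", "false_positive_rate"):
    print(f"{name}: {getattr(illustrative, name):.3f}")
```

Because PPV and NPV are computed down the columns of the table rather than along the rows, they depend on how many true cases the sample contains, whereas sensitivity and specificity do not.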
Although the CAARS’ psychometric properties are encouraging and the measure distinguishes well between adults known to have ADHD and a non-clinical control group (Gallagher & Blader, 2001), it must be noted that the base rate of ADHD in their small sample was 50%, which would tend to inflate the classification scores obtained. Furthermore, little research has been conducted to evaluate the ability of the CAARS to distinguish between those who have ADHD and those with other reasons for presenting with attention and/or memory problems, such as other psychiatric problems, general life stresses, or even feigning. In clinical settings, one is not asked to distinguish between normal, non-symptomatic individuals and individuals with a true disorder. Rather, most clinicians undertake evaluations with symptomatic individuals to determine the cause of reported symptoms and to differentiate between potential diagnoses in a clinical population.
The current study set out to determine the sensitivity, specificity, and positive and negative predictive values of the CAARS rating scale when undertaking diagnostic evaluations in a clinical setting. We evaluated the ability of each of the CAARS subscales, both individually and combined, to discriminate between postsecondary students diagnosed with ADHD and clinical controls who underwent an evaluation to investigate possible ADHD but did not meet diagnostic criteria for this disorder. Given that the test manual identifies that individuals with a higher number of symptomatic subscale scores are more likely to meet criteria for ADHD, we also wished to investigate the diagnostic efficiency of this method of diagnostic reasoning. Last, we sought to investigate the ability of the ADHD Index to discriminate true ADHD from other clinical presentations, as the test manual states that this index consists of the best set of items for identifying adults “at risk” for ADHD (Conners et al., 1999).
Method
Participants
Participants (N = 756) were postsecondary students receiving a psychoeducational assessment at a regional assessment center in Ontario, Canada. All were community college- or university-level students who either required updated documentation of previously diagnosed ADHD or had been referred for an evaluation of their reported attention or learning problems. They were assessed by clinical psychologists at a university-based regional assessment center between 2007 and 2014 and consented to have their data used for research purposes. Diagnosis of ADHD was made using the clinical criteria outlined in DSM-IV-TR (4th ed., text rev.; APA, 2000), including objective evidence that the symptoms were present and caused substantial impairment both in childhood and currently. While neuropsychological and psychoeducational test data were not used in making the diagnosis, they were used to quantify the extent to which the disorder currently impaired the individual in academic or other life functions.
Those diagnosed with ADHD (n = 249; 150 men, 99 women; M age = 21.2, SD = 4.8) met DSM-IV diagnostic criteria for this disorder (APA, 2000), in that they provided evidence to corroborate lifetime impairment, had self-reported deficits in keeping with observed and documented behavioral problems, and provided evidence from reliable collateral informants to confirm that their self-reported impairments were both present and severe. The ADHD group also met the criteria outlined by Slick et al. (1999) regarding symptom credibility, as they had obtained a passing score on a well-validated symptom validity test (SVT; most commonly the Green Word Memory Test [Green, 2003], but occasionally the Medical Symptom Validity Test [MSVT; Green, 2004], Test of Memory Malingering [TOMM; Tombaugh, 1996], or the Victoria Symptom Validity Test [VSVT; Slick, Hopp, Strauss, & Thompson, 2005]).
The Clinical Control group (n = 507; 189 men, 318 women; M age = 22.2, SD = 6.3) consisted of individuals referred between 2007 and 2014 for assessment of ADHD or academic learning problems, all of whom were complaining of problems with attention and concentration but were not found to meet the diagnostic criteria for ADHD. Like the ADHD group, all consented to have their data used for research purposes. Of these, 195 were diagnosed with a learning disability, 116 were given no diagnosis, 29 were diagnosed with a mood disorder, 29 were diagnosed with an anxiety disorder, 26 were diagnosed with a personality disorder, and the balance had other problems (e.g., brain injury, lower overall cognitive ability). Only participants who had obtained a passing score on the Word Memory Test (Green, 2003) or another well-validated SVT were included in the Clinical Control group.
Materials
All participants completed the CAARS–Self-Report version (Conners, Erhardt, & Sparrow, 1998). The CAARS is a 66-item scale, in which items are rated on a 4-point scale (0 = not at all, 1 = just a little, 2 = pretty much, 3 = very much). The CAARS allows for the calculation of eight different indices, with some items contributing to more than one scale. Aside from its four principal factors (inattention/memory problems, hyperactivity/restlessness, impulsivity/emotional lability, and problems with self-concept), the CAARS provides scores on four factor-derived subscales: three scales that correspond to the DSM-IV symptoms, namely, Inattentive Symptoms, Hyperactive-Impulsive Symptoms, and ADHD Symptoms Total, and a general ADHD Index that is said to measure the “overall level of ADHD symptoms” (Conners et al., 1998, p. 23). The CAARS test manual states that “this index is the best screen for identifying those ‘at risk’ for ADHD” (Conners et al., 1998, p. 23). The ADHD Index is reported to have 71% sensitivity and 75% specificity (Conners et al., 1998). Sensitivity and specificity of other subscales are not reported. The manual does not stipulate a specific cut-off score that may be taken to indicate ADHD but recommends that a score over a t value of 65 on any subscale might indicate an area of clinically significant problems and that t scores more than 70 or 75 should be used as a cut-off for inferring clinically significant problems. It also states that individuals obtaining t scores of more than 70 on the ADHD Index are likely to meet the diagnostic criteria for ADHD. However, it also cautions that t scores above 80 on any of the CAARS subscales should be considered as possible indicators of symptom exaggeration.
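The interpretive thresholds described above can be expressed as a small lookup. This is a minimal Python sketch, not the manual's scoring procedure: the band labels, function names, and the example profile are invented for illustration, and only the numeric thresholds (65, 70, 80) come from the description of the manual given here.

```python
from typing import Dict

# Thresholds as described in the text; everything else is an assumption.
LENIENT_CUTOFF = 65       # may indicate clinically significant problems
STRICT_CUTOFF = 70        # stricter cut-off for inferring significant problems
EXAGGERATION_CUTOFF = 80  # possible indicator of symptom exaggeration


def interpret_t_score(t_score: float) -> str:
    """Map one subscale t score to a descriptive band (labels are illustrative)."""
    if t_score > EXAGGERATION_CUTOFF:
        return "possible symptom exaggeration"
    if t_score > STRICT_CUTOFF:
        return "elevated (strict cut-off)"
    if t_score > LENIENT_CUTOFF:
        return "elevated (lenient cut-off)"
    return "not elevated"


def count_elevated(profile: Dict[str, float], cutoff: float) -> int:
    """Count the subscales whose t score exceeds the given cut-off."""
    return sum(1 for t_score in profile.values() if t_score > cutoff)


# Hypothetical respondent; these t scores are invented, not study data.
example_profile = {
    "Inattention/Memory Problems": 72,
    "Hyperactivity/Restlessness": 58,
    "Impulsivity/Emotional Lability": 66,
    "Problems With Self-Concept": 81,
    "DSM-IV Inattentive Symptoms": 69,
    "DSM-IV Hyperactive-Impulsive Symptoms": 55,
    "DSM-IV ADHD Symptoms Total": 64,
    "ADHD Index": 71,
}

for subscale, t_score in example_profile.items():
    print(f"{subscale}: t = {t_score} -> {interpret_t_score(t_score)}")

print("Elevated at 65:", count_elevated(example_profile, LENIENT_CUTOFF))
print("Elevated at 70:", count_elevated(example_profile, STRICT_CUTOFF))
```

The comparisons are strict (greater than), following the wording "over" and "more than" in the description of the manual; a score exactly at a threshold is treated as not exceeding it.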
Procedure
As part of the informed consent process, clients referred for assessment were informed at the start of the assessment that their data would be entered into a database and used, without any identifying information, in a study investigating test accuracy. Clients were told that they did not have to agree to allow their deidentified data to be used in any research study and that they would still be provided with a complete assessment regardless of their decision. All clients who agreed to have their anonymous data used for this research were included in this ethics-approved study.
While the CAARS does not contain a symptom validity scale per se, it does have an Inconsistency Index that identifies if similar items are endorsed at similar levels. Scores of eight or more on this index are said to reflect response inconsistency great enough to invalidate interpretation. Hence, individuals whose consistency score exceeded this cut-off were removed, leaving 404 in the Clinical Control group and 201 in the ADHD group.
Clients referred for assessment underwent a full neuropsychological assessment, including mental health screening surveys and tests of symptom validity. In addition, participants were asked to provide report cards from childhood, have their parents/caregiver complete a retrospective rating of their childhood behaviors prior to age 12, complete both self- and observer-versions of the CAARS, and provide evidence to document substantial impairment in more than one major life activity prior to age 12. This allowed a determination as to whether they met all of the five criteria for diagnosis as outlined in DSM-IV.
Results
As shown in Table 1, the ADHD group had more men relative to women compared with the Clinical Control group, χ²(1) = 35.6, p < .001. The proportion of participants who failed an SVT or were not given one was comparable between the groups (Clinical Controls: 40.2%; ADHD group: 41.0%), and these participants were removed from further analyses. Among those who passed an SVT, the Clinical Control group was older on average than the ADHD group, t(605) = 2.3, p = .022, and this difference remained once we removed participants who had achieved a score of eight or more on the CAARS Inconsistency Index, t(525) = 2.4, p = .016.
Table 1. Mean Age and Gender Proportion of Groups Who Passed a Symptom Validity Test and Who Also Scored Below the Cut for CAARS Consistency.
Note. SVT = symptom validity test; CAARS = Conners’ Adult ADHD Rating Scale.
A MANOVA using the CAARS subscales as the dependent measures and gender and group as the independent measures revealed a main effect for group, Λ = .98, F(8, 591) = 14.7, p < .001, and gender, Λ = .40, F(8, 591) = 49.3, p < .001, but no interaction between the factors, Λ = 0.01, F(8, 591) = .57, p = .803. Subsequent one-way ANOVAs were performed on each of the eight dependent measures separately. As shown in Table 2, compared with the Clinical Control group, the ADHD group returned higher mean scores on all subscales of the CAARS, p < .001, except the Problems With Self-Concept scale, p = .436. Mean scores for women were larger than for men on the Inattention/Memory Problems, Hyperactivity/Restlessness, Impulsivity/Emotional Lability, and the ADHD Index subscales, p values ranging from .005 to < .001. In contrast, men returned larger mean scores for the DSM-IV subscales of Inattentive Symptoms, Hyperactive/Impulsive Symptoms, and ADHD Symptoms Total, p values ranging from .002 to < .001. There was no difference between men and women on the Problems With Self-Concept scale, p = .436.
Table 2. Mean (SE) on CAARS Subscales as a Function of Group and Sex.
Note. CAARS = Conners’ Adult ADHD Rating Scale; DSM-IV = Diagnostic and Statistical Manual of Mental Disorders (4th ed.; APA, 1994); Sx = Symptoms.
We investigated the percentage of respondents in each group who scored in the ADHD range for both cut-off values of 65 and 70 as discussed above. As can be seen in Table 3, a greater proportion of participants in the ADHD group attained CAARS subscale scores above 70 than those in the Clinical Control group for most subscales. Note that a higher percentage of scores were above the cut-off values in the Clinical Control group than in the ADHD group for the Problems With Self-Concept subscale. There was no significant difference between the two groups on the Impulsivity/Emotional Lability subscale. The lower cut-off value of 65 showed similar results.
Table 3. Percentage of Respondents Scoring Above Defined Cut-Offs as a Function of Group.
Note. Men had significant differences between the ADHD group and Clinical Control group for all CAARS subscales except the Hyperactivity/Restlessness subscale, irrespective of cut-off. Men also had a significant difference in proportion scoring above 70 for CAARS Impulsivity/Emotional Lability subscale and the CAARS ADHD Index. CAARS Problems With Self-Concept subscale shows a difference between ADHD and Clinical Control only when women and men are grouped together; separately, the difference is not significant for either sex. CAARS = Conners’ Adult ADHD Rating Scale; DSM-IV = Diagnostic and Statistical Manual of Mental Disorders (4th ed.; APA, 1994); Sx = Symptoms.
The sensitivity, specificity, PPV, and NPV for each of the CAARS subscales using t score cut-offs of 65 and 70 are shown in Table 4. As may be seen, the subscales in general, including the ADHD Index, show weak sensitivity (cut-off = 65: .16-.72; cut-off = 70: .61-.91) and low to moderate specificity (cut-off = 65: .07-.59; cut-off = 70: .73-.96). Furthermore, the PPVs of the CAARS subscales were low, ranging between 20% and 62% depending on subscale and cut score employed. Worth noting, too, is that a high score on the ADHD Index had between a 47% and 51% chance of correctly classifying an individual as having ADHD. Given that the base rate of ADHD in the present sample was approximately 30%, this finding indicates that the positive predictive power of the ADHD Index would be reduced in samples with a lower base rate. NPV was weak, with a negative score offering only a 65% to 81% chance that the individual does not have ADHD.
Table 4. Sensitivity, Specificity, PPV, and NPV for Identifying ADHD Among Clinical Controls for Each Subscale of the CAARS.
Note. PPV = positive predictive value; NPV = negative predictive value; CAARS = Conners’ Adult ADHD Rating Scale; DSM-IV = Diagnostic and Statistical Manual of Mental Disorders (4th ed.; APA, 1994); Sx = symptoms.
Although participants who received a diagnosis of ADHD obtained significantly higher scores on seven of the eight subscales of the CAARS (the difference was not significant for the subscale, Problems With Self-Concept; see Table 2), discriminant validity of the subscales was weak. When t scores from all eight subscales were entered into a discriminant function analysis (DFA), they correctly discriminated 71.1% and 72.2% of the individuals with cut-off scores of 65 and 70, respectively. Even though 30.3% of the current sample had ADHD, when using a cut score of 65 and including all subscales in the function, the CAARS had an overall sensitivity of .57, specificity of .78, PPV of .56, and NPV of .79. At a population prevalence of 20%, PPV was .39 and NPV was .88. At a population prevalence of 10%, PPV was .22 and NPV was .94. The test fared no better with a cut score of 70, which yielded sensitivity of .46, specificity of .85, PPV of .60, and NPV of .76 in a DFA. At a population prevalence of 20%, PPV was .43 and NPV was .86. At a population prevalence of 10%, PPV was .25 and NPV was .93. Overall, the false negative rates for this DFA were 43% and 54%, and the false positive rates were 22% and 15% at cut scores of 65 and 70, respectively.
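The prevalence-adjusted predictive values in the preceding paragraph follow from Bayes' theorem applied to a fixed sensitivity and specificity. The Python sketch below is an illustration under that assumption, not the authors' analysis code; the function name is hypothetical, and the inputs are the rounded sensitivity and specificity values quoted above.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence.

    All arguments are proportions between 0 and 1. Sensitivity and
    specificity are assumed not to change with prevalence.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded DFA results quoted in the text, keyed by t score cut-off.
dfa_results = {65: (0.57, 0.78), 70: (0.46, 0.85)}

for cutoff, (sens, spec) in dfa_results.items():
    for prevalence in (0.20, 0.10):
        ppv, npv = predictive_values(sens, spec, prevalence)
        print(
            f"cut-off {cutoff}, prevalence {prevalence:.0%}: "
            f"PPV = {ppv:.2f}, NPV = {npv:.2f}"
        )
```

Because the inputs are rounded to two decimals, values computed this way can differ slightly from figures derived from raw classification counts. The pattern is the one the paragraph describes: as prevalence falls, PPV drops sharply while NPV rises.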
Furthermore, employing the number of elevated subscales as an indication of ADHD also produced high false negative and false positive rates, although the rates depended upon gender. For the number of CAARS subscales greater than t = 65, sensitivity was .30 (women: .20, men: .34), specificity was .86 (women: .92, men: .82), PPV was .51 (women: .43, men: .59), and NPV was .71 (women: .79, men: .61). The number of subscales greater than 65 correctly classified 67.6% of all cases (women: 74.8%, men: 60.4%). Using the number of CAARS subscales greater than t = 70 yielded sensitivity of .22 (women: .09, men: .36), specificity of .90 (women: .94, men: .84), PPV of .52 (women: .30, men: .63), and NPV of .70 (women: .77, men: .62), and the score correctly classified 67.6% of all cases (women: 73.6%, men: 62.5%).
Discussion
This study set out to investigate the sensitivity, specificity, and positive and negative predictive values of the CAARS and to determine how well this self-report measure discriminates between those individuals with ADHD and other symptomatic young adults who did not meet DSM-IV criteria for this diagnosis, many of whom were diagnosed with other mental health conditions or psychological disorders. Employing various methods of interpreting the CAARS scores yielded similar results: At best, only 72% of all individuals were identified correctly by the CAARS, and anywhere from 20% to 45% of Clinical Controls were falsely identified as having ADHD. Similarly, a high score on any of the CAARS subscales had weak predictive value. For instance, a high score on the ADHD Index, said to be the best indicator of ADHD status, had only a 47% to 51% chance of accurately predicting a true case of ADHD in our sample (see PPV scores in Table 4). This is noteworthy, as the prevalence of ADHD in this sample was just over 30%. Given that the population prevalence of ADHD in adults is estimated at below 10% (Simon, Czobor, Bálint, Mészáros, & Bitter, 2009), our results suggest that a score above 65 on any of the CAARS scales has, at best, a 22% chance of correctly identifying a true case of ADHD, and a score above 70 has a 25% chance, with false positive rates of 22% and 15%, respectively. These findings are consistent with many recent studies (e.g., Harrison et al., 2013), which show that individuals who experience high levels of anxiety, depression, or stress frequently endorse a high number of symptoms on ADHD self-report scales even when they have no previous history of ADHD symptoms. In other words, ADHD symptoms are commonly reported in clinical populations and are not specific to this diagnosis.
The sensitivity values returned also speak to the problem of false negative scores, with between 43% and 54% of ADHD individuals being falsely identified as non-impaired. While a good screening test should err on the side of caution and overidentify a number of non-symptomatic patients as possibly ill, it is important that the test not miss or fail to identify those who truly have the disorder in question. In a sample with a 30% base rate, a score below 65 conferred only a 78% probability that the individual did not have ADHD. This finding is not unexpected, as one might well suspect that individuals with true ADHD may not always be accurate self-reporters of symptoms and may, due to executive functioning deficits, underestimate the severity of their problems (Canadian ADHD Resource Alliance, 2011). In our study, NPVs rose as the population prevalence decreased, meaning that in the general population, a low score is likely to correctly classify a non-ADHD individual. In a clinical sample, however, low scores alone are not sufficient to rule out the possibility of ADHD.
Accurate diagnosis is essential for correct treatment and also for prognosis. The findings from this study underscore the need to ensure that all of the DSM diagnostic criteria are established prior to making this diagnosis, as high scores from a self-report checklist alone will significantly overidentify individuals with other clinical problems and underdiagnose true cases of ADHD. Furthermore, many of the clinical conditions that present with symptoms similar to ADHD are not lifelong disorders and, as such, may be amenable to treatment. By contrast, mislabeling an individual as having ADHD rather than addressing the true cause of symptoms may not only delay appropriate treatment but also expose the individual to inappropriate and potentially life-threatening treatment. For instance, given the relatively rare but significant adverse effects associated with medication treatments for ADHD (including but not limited to cardiovascular adverse events such as hypertension, arrhythmia, and sudden death; sleep disruption; epilepsy; psychiatric events and suicidality; tics; and possible liver damage with long-term use; Graham et al., 2011; Schneider & Enenbach, 2014) as well as concerns around substance abuse, misuse, and diversion (Graham et al., 2011; Schneider & Enenbach, 2014), the importance of accurate diagnosis cannot be stressed enough.
Findings from this study demonstrate that self-report measures like the CAARS are neither sensitive nor specific to ADHD. As a diagnostic instrument, the CAARS had an unacceptably high false positive rate and may also fail to properly identify those with true ADHD. This finding is even more worrisome, given recent research showing how easily symptoms of ADHD may be feigned on self-report inventories (e.g., Harrison, Edwards, & Parker, 2007; Jachimowicz & Geiselman, 2004) and the reported 20% to 47% rate of symptom exaggeration in a postsecondary population (e.g., Harrison, 2006; Sullivan, May, & Galbally, 2007). As such, self-report should not be the sole criterion by which the diagnosis of ADHD is determined. While the CAARS may be an adequate initial screen to rule out ADHD, a positive result does not guarantee a diagnosis of ADHD. Given recent reports that the majority of clinicians are still employing screening measures such as the CAARS as diagnostic measures (e.g., Gordon et al., 2006; Joy et al., 2010; Nelson et al., 2014), these results underscore the need to ensure that all DSM diagnostic criteria are met, rather than diagnosing a young adult with ADHD based solely on a high CAARS score.
As a final note, our results should generalize well to a clinic-based setting where postsecondary students are evaluated, including students in graduate programs. These findings may not, however, generalize to assessment of individuals outside of this age range or to those no longer in school. Further studies using adults who are no longer in school should be undertaken to help expand our understanding of the diagnostic accuracy of the CAARS in older individuals.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
