Abstract
Objective:
Continuous performance tests are widely used to aid diagnostic decision making and measure symptom reduction in adult ADHD clinical populations. The diagnostic accuracy of the Quantified Behavior Test plus (QbTest+), developed to identify ADHD populations as an objective measure of ADHD symptoms, was explored.
Methods:
The utility of the QbTest+ was investigated in a clinical cohort of 69 adult patients referred to a specialist ADHD clinic in the UK.
Results:
Scores from the QbTest+ failed to differentiate between patients diagnosed with ADHD and those who did not receive a diagnosis after full clinical assessment.
Conclusions:
Based on our findings, we recommend clinicians are cautious when interpreting results of the QbTest+ in clinical populations. This study highlights the need for investigation into the lack of validation of commonly used objective measures in ADHD populations.
Introduction
ADHD is a pervasive and debilitating neurodevelopmental disorder characterized by symptoms of inattention, hyperactivity, and impulsivity (American Psychiatric Association, 2013) and elements of executive dysfunction (Willcutt et al., 2005). ADHD creates significant difficulties in daily functioning in areas such as education, employment, or social networking (Barkley & Murphy, 2010; Hall et al., 2016; Kooij et al., 2019; Ogrim et al., 2012; Oie et al., 2011). The prevalence of ADHD is thought to be approximately 3% to 5% in children, and 2.5% in adults, with symptoms continuing into adulthood 60% to 90% of the time (Agnew-Blais et al., 2016; Ogrim et al., 2012; Polanczyk et al., 2014; Sibley et al., 2022; Simon et al., 2009). Presentation of ADHD in adulthood is often more diverse than in childhood (Hirsch et al., 2018) and this can make diagnosing ADHD at this point in the lifespan difficult, as clinicians require accurate retrospective information pertaining to presence and onset of symptoms during childhood, which may be difficult for the patient to provide (Faraone et al., 2006).
There is no single assessment or diagnostic tool available to diagnose ADHD. Diagnosis is determined using clinician judgment, usually with input from caregivers and teachers, alongside a variety of direct clinical observations (Reh et al., 2015). However, in cases of adult diagnosis this can prove difficult, especially because research suggests clinical rating tools tend to lack acceptable levels of sensitivity (Groom et al., 2016). The development of clinical tools that provide objective measures of the main ADHD symptoms is advantageous to specialist clinics. This is because symptoms of ADHD often overlap with other disorders and may complicate the diagnostic process (Biederman et al., 2010). A further consideration is that self-report measures used to aid clinicians in diagnosing ADHD may be unreliable and subject to bias (Edwards et al., 2007). This means clinicians often have vague or unreliable information, which is especially difficult when measuring ambiguous symptoms such as inattention, hyperactivity, and impulsive behaviors (Uno et al., 2006).
The continuous performance test (CPT) is well established, having been used in clinical and laboratory settings for over 60 years (Shaked et al., 2020). The CPT is a neuropsychological task most commonly known for quantifying executive functions such as sustained attention (through vigilance) and impulsivity (through response inhibition), and is widely employed clinically as a diagnostic aid (Epstein et al., 2003; McGee et al., 2000; Nichols & Waschbusch, 2004; Rapport et al., 2000). It is intended as an objective measure of treatment efficacy in both childhood and adulthood ADHD (Johansson et al., 2021; Uno et al., 2006). The theory behind the use of the CPT in ADHD is that the test should be able to differentiate ADHD from other mental health disorders by quantifying cognitive traits in isolation, therefore measuring symptoms that are specific to the condition. However, whilst the CPT has been successfully utilized to differentiate ADHD from neurotypical development in childhood cohorts (Berger et al., 2017; Epstein et al., 2003; González-Castro et al., 2013; Pollak et al., 2010; Tallberg et al., 2019) and in adult populations (Schoechlin & Engel, 2005), the CPT has not had the same success in differentiating between ADHD cohorts and other psychiatric disorders (Riccio & Reynolds, 2001; Solanto et al., 2004).
Whilst there are many versions of CPTs (Parsons et al., 2019), in principle it is a computerized task where a continuous stream of stimuli (auditory or visual, or both) is presented to the participant rapidly on a monitor (Hall et al., 2016). The CPT requires the participant to respond to certain stimuli and refrain from responding to other types or “non-target” stimuli. The CPT measures ADHD symptoms by recording omission errors and reaction time, which are representative of sustained attention ability, and commission errors and variability in reaction times, which measure impulsive behaviors (Berger & Cassuto, 2014). It is commonly employed in clinical settings, with various CPTs demonstrating reliability in measuring ADHD symptoms (Emser et al., 2018). For instance, Conners’ (2014) Continuous Performance Test, a popular CPT used widely in clinical cohorts, has shown higher test-retest reliability than gold standard questionnaires used to aid ADHD diagnosis (Soreni et al., 2009). Losier et al.’s (1996) meta-analysis concluded that children with ADHD generally perform worse on CPTs than neurotypical children. However, findings are not consistent. Other studies have shown CPT performance does not always differentiate between ADHD and controls (McGee et al., 2000; Solanto et al., 2004). Therefore, CPTs, whilst demonstrating some dependability, are not advised to be used in isolation for diagnostic or medication titration purposes, as questions surrounding their reliability persist (Hervey et al., 2004).
The Quantified Behavior Test (QbTest) (https://www.qbtech.com/adhd-tests) is a norm measured CPT that has been developed to measure all three core symptoms of ADHD separately. It is designed to be used alongside clinical interviews and other tools, not as a stand-alone test. Unlike traditional CPTs, the QbTest also measures motor activity using infrared tracking, therefore capturing all three core symptoms of ADHD (Ulberstad, 2012), which is valuable as the omission of hyperactivity measures is a regular criticism of other CPTs (Reh et al., 2015). A number of studies have validated the QbTest as a diagnostic tool for ADHD with sensitivity reported between 86% and 90% (Edebol et al., 2012; Ulberstad, 2012). The developers of the QbTest suggest it should be a key component in clinical evaluation and treatment efficacy during follow-up, which according to the developers will be cost-effective for healthcare services (Ulberstad, 2012). In terms of clinical utility, findings are varied. Vogt and Shameli (2011) found that including the QbTest in clinical diagnosis of ADHD added “robustness” and strengthened clinical decision making. Groom et al. (2016) found the QbTest could discriminate between ADHD and ASD with 90% accuracy. Hult et al. (2018) assessed diagnostic accuracy of the QbTest in a sample of 124 children with ADHD compared to 58 controls (81% diagnosed with ASD), concluding QbTest parameters for inattention and hyperactivity differentiated between groups, but not impulsivity. They reported sensitivity between 47% and 67% and specificity between 72% and 84%, concluding that the effect of the QbTest was “moderate” and “unsatisfactory.”
The question of whether the CPTs have clinical utility is unresolved, especially when employed to differentiate clinical groups. Whilst the use of CPTs in isolation is not supported to make a diagnosis, the question of whether CPTs can offer clinically meaningful information is undecided. This study aimed to evaluate the validity of a CPT as presented by QbTest+ as an objective measure of ADHD symptoms in a clinical diagnostic pathway in a sample of adult patients referred to a specialist NHS clinic for possible diagnosis of ADHD.
Methods
Participants
The sample comprised 69 adults referred for ADHD assessment to a Specialist Adult ADHD and Autism Service, South West Yorkshire Partnership NHS Foundation Trust, UK, between 2017 and 2018. The Adult ADHD and Autism Service is a specialist service in diagnosing ADHD and Autism in adulthood. Patients without intellectual disability are referred to the service by health care professionals only, who deem it appropriate based on history and current difficulties. Inclusion criteria dictated that participants were over the age of 18 years (no cut-off), had a good comprehension of the English language, and IQ within normal range (>70). Patients accessing the service are routinely informed that their data can be used for research purposes and have the opportunity to opt out. For this project, the need for ethics approval was waived by SWYPFT Research and Development Department as the data was gathered retrospectively and was collected as part of the clinical operations of the service. The SWYPFT Caldicott Guardian endorsed access to data following Caldicott Principles. Data was gathered from electronic records. Gender was measured by asking the participants to report male, female, or prefer not to say. The sample consisted of 45 (65.2%) cis males and 24 (34.8%) cis females, with no participants choosing not to disclose gender. Mean age was 33 years (SD ±9.9, Range = 42).
Diagnostic Assessment Tools
QbTest
The QbTest is a computerized CPT which measures inattention, impulsivity, and hyperactivity. The test combines a CPT with activity levels, which are measured by an infrared motion tracking camera. It can be used in children aged 6 to 12 years, and in adolescents and adults aged 12 to 60 years. The difference between the versions of the test is that the children’s version consists of a go-no-go paradigm whereas the adult version consists of an unconditional identical pair paradigm to avoid floor to ceiling effects (Ulberstad, 2012). The adult version was used in this study (QbTest+). The test takes on average 20 minutes to complete. Participants are asked to sit 1 m from a monitor, to which the infrared motion tracking camera is attached, and to hold a handheld responder. Participants are instructed (by standardized instruction on the screen, and verbally) that there will be time for a 5-minute practice before they begin, and that accuracy and speed are the objective. The QbTest consists of 600 stimuli presented on the monitor; each stimulus is presented for 200 ms, followed by an interval of 2000 ms. Stimuli consist of red or blue circles and squares. Participants are instructed to press the responder only when the stimulus they see matches the previous stimulus in color or shape. Attention is measured by number of correctly identified targets, reaction time, and variability of reaction time. Impulsivity is measured by incorrect responses, and hyperactivity is measured by the motion-tracking system using the infrared camera. This captures movement by tracking a reflective headband worn by the participant. The camera captures movement throughout the whole of the task at a frequency of 50 samples a second and with spatial resolution of 1/27 mm per infrared camera unit (Groom et al., 2016; Ulberstad, 2012). Scores are derived by transforming raw data to z-scores, with higher scores indicative of greater probability of ADHD.
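The task structure described above can be illustrated with a minimal sketch. It is not the QbTest software: the names `Stimulus`, `is_target`, `count_errors`, and `nominal_duration_minutes` are hypothetical, the target rule follows the wording in the text (a match with the previous stimulus in color or shape), and the labelling of missed targets as omissions and responses to non-targets as commissions is an assumption borrowed from the general CPT literature cited in the Introduction.

```python
from dataclasses import dataclass

# Timing values taken from the task description in the text.
N_STIMULI = 600
PRESENTATION_MS = 200
INTERVAL_MS = 2000


@dataclass(frozen=True)
class Stimulus:
    color: str  # "red" or "blue"
    shape: str  # "circle" or "square"


def is_target(previous: Stimulus, current: Stimulus) -> bool:
    """Assumed target rule: current matches the previous stimulus in color or shape."""
    return previous.color == current.color or previous.shape == current.shape


def count_errors(stimuli: list[Stimulus], responded: list[bool]) -> tuple[int, int]:
    """Return (omissions, commissions) for a sequence of stimuli and button presses."""
    omissions = 0
    commissions = 0
    # The first stimulus has no predecessor, so it cannot be a target.
    for i in range(1, len(stimuli)):
        target = is_target(stimuli[i - 1], stimuli[i])
        if target and not responded[i]:
            omissions += 1      # missed target
        elif not target and responded[i]:
            commissions += 1    # response to a non-target
    return omissions, commissions


def nominal_duration_minutes() -> float:
    """Nominal run time implied by 600 stimuli at 200 ms + 2000 ms each."""
    return N_STIMULI * (PRESENTATION_MS + INTERVAL_MS) / 60_000


# Hypothetical five-stimulus example (not study data).
sequence = [
    Stimulus("red", "circle"),
    Stimulus("red", "square"),   # target (same color), missed below
    Stimulus("blue", "circle"),  # non-target, pressed below
    Stimulus("blue", "circle"),  # target, correctly pressed
    Stimulus("red", "square"),   # non-target, correctly ignored
]
presses = [False, False, True, True, False]
print(count_errors(sequence, presses))  # (1, 1)
print(nominal_duration_minutes())       # 22.0
```

The nominal run time from these figures is 22 minutes, which is in line with the reported average of about 20 minutes.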
A QbTest (total) score is calculated as the mean of the three scores: Q-Inattention, Q-Impulsivity, and Q-Activity. Normative data suggest that features of ADHD are likely to be present if a score of ≥1.5 is observed (Ulberstad, 2012). Available data indicate a sensitivity of 86% and specificity of 83% (Edebol et al., 2013). The professionals who administer the QbTest have undertaken formal training and are competent in administering and scoring the assessment.
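The scoring rule described here can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the normative mean and standard deviation passed to `z_score` are placeholders rather than published norms, and the proprietary scoring used by the test itself may differ in detail.

```python
from statistics import mean

THRESHOLD = 1.5  # cut-off reported in the text (Ulberstad, 2012)


def z_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Standardize a raw measure against normative values (placeholders here)."""
    return (raw - norm_mean) / norm_sd


def qb_total(q_inattention: float, q_impulsivity: float, q_activity: float) -> float:
    """QbTest (total) as the mean of the three Q-scores, as described in the text."""
    return mean([q_inattention, q_impulsivity, q_activity])


def above_threshold(total: float, threshold: float = THRESHOLD) -> bool:
    """True when the total meets or exceeds the cut-off (>= 1.5)."""
    return total >= threshold


# Hypothetical Q-scores for one patient (not study data).
total = qb_total(q_inattention=2.0, q_impulsivity=1.0, q_activity=1.5)
print(total, above_threshold(total))  # 1.5 True
```

Because the total is a simple mean, a high score on one parameter can be offset by low scores on the others, which is one reason the three Q-scores are also reported separately.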
The diagnostic interview for ADHD in adults (DIVA)
The Diagnostic Interview for ADHD in Adults 2.0 was developed by Kooij and Francken (2010). The DIVA measures core ADHD symptoms as recognized by the DSM-5 (American Psychiatric Association, 2013). Semi-structured interviews are carried out by qualified healthcare professionals with specialist knowledge of ADHD. The interview takes on average 1.5 hours. The assessment is determined using five criteria, which include: the presence of symptoms in accordance with the DSM-IV, which is divided into two parts which indicate presence of symptoms of Attention-Deficit and Hyperactivity-Impulsivity in both childhood (5–12 years) and adulthood (Criterion A), age of onset and impairment (Criterion B), symptoms in two or more areas in childhood and adulthood (Criterion C and D), and consideration that symptoms are not better explained by another psychiatric disorder (Criterion E). Current and retrospective information is provided by the patient and is supplemented by someone who knows the patient at the time of interview and during childhood (if possible, usually a parent or close relative). Rule of thumb dictates that if the patient and parent/close relative are not in agreement, the patient should be the informant. For Criterion A, six or more criteria for either inattention and/or hyperactivity/impulsivity in childhood and adulthood indicate that a diagnosis of ADHD is plausible. The professionals who administer the DIVA have undertaken formal training and are competent in administering and scoring the assessment.
Diagnostic Process
As part of routine clinical evaluation of adults referred to a Specialist Adult ADHD Pathway, the participants undergo a thorough psychiatric assessment by a doctor with expertise in ADHD and General Psychiatry. Apart from the QbTest and the DIVA, the assessment included other sources of information including full psychiatric history, mental state examination, observations during assessments, and informant history. As a result of this assessment process, the assessor was able to establish impairment linked to ADHD by assessing symptom settings and symptom timeline, and to rule out other mental health diagnoses which could better explain the presentation. As part of this psychiatric assessment, any mental health comorbidity present was recorded in the notes.
Data Collection
Data was recorded using an internal electronic spreadsheet specific for this project. Data was recorded by the healthcare professionals carrying out the QbTest assessments.
Data Analysis
Data from electronic records was explored using descriptive and inferential tests (independent t-tests) in Statistical Package for Social Sciences (SPSS) Version 27.
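For readers who wish to reproduce this kind of group comparison outside SPSS, the following is a minimal sketch of an independent t-test in its unequal-variance (Welch) form. The choice of the Welch form is an assumption: the fractional degrees of freedom reported in the Results, such as t(30.900), are consistent with it, but the text does not state which form was used. The function name `welch_t_test` and the example values are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_test(group_a: list[float], group_b: list[float]) -> tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical symptom counts for two small groups (not study data).
diagnosed = [2, 4, 6, 8]
not_diagnosed = [1, 2, 3, 4]
t_stat, df = welch_t_test(diagnosed, not_diagnosed)
print(f"t({df:.3f}) = {t_stat:.3f}")  # t(4.412) = 1.732
```

The sketch stops at the test statistic and degrees of freedom; obtaining a p value additionally requires the t distribution, which statistical packages provide.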
Results
The study included 69 participants (2 participants removed due to incomplete QbTest data). Overall, 38 patients (55.1%) received a final diagnostic outcome of ADHD by clinical consensus as described above. Specifically, 20 patients (29%) were diagnosed with ADHD without comorbidity, 18 (26.1%) were diagnosed with ADHD with comorbidity, and 31 (44.9%) were not diagnosed with ADHD after full assessment. For males, the diagnostic rate was 53.3% and for females it was 58.3% (ns). Patients who received an overall diagnostic outcome of ADHD (with or without comorbidity) scored higher on all Criterion A measures of the DIVA assessment than those who did not receive a diagnosis: number of childhood symptoms of attention-deficit (t(30.900) = 4.606, p < .01), adult symptoms of attention-deficit (t(30.285) = 5.696, p < .01), childhood symptoms of hyperactivity-impulsivity (t(60) = 3.530, p < .01), and adult symptoms of hyperactivity-impulsivity (t(63) = 4.801, p < .01) (see Table 1).
Table 1. Mean (±SD) Scores for DIVA Criterion A by Diagnostic Outcome.
Note. Criterion A measures number of symptoms of ADHD present in childhood and adulthood. Scores of ≥6 are scored positively, indicative of the presence of ADHD.
Significance <.01.
Those who received a clinical diagnosis of ADHD did not score significantly higher on the QbTest (Total) than those who did not receive a diagnosis (see Table 2). This applied when the sample was grouped into positive and negative diagnosis groups, and when grouped according to the presence of comorbidity (ns). In terms of meeting the threshold score for the QbTest, overall, 43 patients (62.3%) scored above the diagnostic threshold for Qb Total (≥1.5) (Median = 1.87, Range = 3.90). Of those who received a diagnosis, 26 patients (70.3% within group) scored above the Qb Total threshold compared to 17 (56.7% within group) of the non-ADHD group (ns). There were no differences found for age or sex between ADHD and non-ADHD outcome groups (ns), except for scores on Q-Activity, where females scored significantly higher than males, t(64.9) = −2.862, p < .01 (see Table 3).
Table 2. Mean (±SD) Scores for Qb Test-Total, Q-Activity, Q-Inattention, and Q-Impulsivity by Diagnostic Group After Full Clinical Assessment.
Table 3. Mean (±SD) Scores for Qb Test-Total, Q-Activity, Q-Inattention, and Q-Impulsivity by Gender.
Significance <.01.
Sensitivity, Specificity, and Predictive Values
The QbTest (total) demonstrated 70% sensitivity at detecting the presence of ADHD in those who received a clinical diagnosis, but only 43% specificity at detecting the absence of ADHD in those who did not receive a clinical diagnosis. Positive predictive value (PPV) indicated that a patient who scored above the QbTest threshold cut-off (≥1.5) had a 60% chance of receiving a clinical diagnosis. Negative predictive value (NPV) indicated that 54% of those who did not score above the threshold did not receive a clinical diagnosis.
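These four values follow from a 2 × 2 table of QbTest result against clinical diagnosis, and the calculation can be sketched as below. The cell counts are an inference rather than figures stated in the text: 26 above-threshold patients at 70.3% within group implies 37 diagnosed patients with QbTest data, and 17 at 56.7% implies 30 undiagnosed patients, giving 11 false negatives and 13 true negatives. The names `ConfusionCounts` and `diagnostic_accuracy` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    true_pos: int   # diagnosed, QbTest total >= 1.5
    false_neg: int  # diagnosed, QbTest total < 1.5
    false_pos: int  # not diagnosed, QbTest total >= 1.5
    true_neg: int   # not diagnosed, QbTest total < 1.5


def diagnostic_accuracy(c: ConfusionCounts) -> dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV as proportions."""
    return {
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        "npv": c.true_neg / (c.true_neg + c.false_neg),
    }


# Counts inferred from the reported within-group percentages (an assumption).
counts = ConfusionCounts(true_pos=26, false_neg=11, false_pos=17, true_neg=13)
for name, value in diagnostic_accuracy(counts).items():
    print(f"{name}: {value:.1%}")
# sensitivity: 70.3%
# specificity: 43.3%
# ppv: 60.5%
# npv: 54.2%
```

Under these inferred counts the output agrees with the rounded values reported above (70%, 43%, 60%, and 54%).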
Discussion
Diagnosing ADHD is subjective and dependent on clinical observations which can be subject to biases and inconsistencies. A well validated, objective test to supplement the diagnostic pathway of ADHD would be highly desirable to specialist clinics. The QbTest is unique in that, unlike traditional CPTs, it claims the ability to measure all three main symptoms of ADHD: inattention, hyperactivity, and impulsivity.
Results from this study suggest the use of the QbTest in our pathway was not effective. The QbTest was unable to differentiate (on each level, and collectively) between those who received a diagnostic outcome of ADHD and those who did not. At face value, sensitivity levels were poor but acceptable (70%); however, levels of specificity were not acceptable (43%). The importance of these findings was supported by the results of the DIVA, where patients diagnosed with ADHD scored significantly higher than those who did not receive a diagnosis after full clinical assessment. We observed an effect of gender for measured performance of Q-Activity. Whilst we are unsure of the reason for this observation, the effects of gender on CPTs in ADHD populations are observed in other studies. For instance, Hirsch and Christiansen (2017) report that males demonstrated higher levels of hyperactivity in their sample of (collectively) 1,070 outpatients. Hasson and Fine (2012) in their review found that boys tend to be more impulsive than girls, but no differences are apparent for measures of inattention. Further research is warranted into the gender differences in ADHD for measures of hyperactivity, which would add to the discussion around possible differing normative values for males and females on CPTs.
In the clinical documentation which supplements the QbTest, the developers state that “it is. . .important that QbTest can differentiate patients with ADHD from normative individuals.” Indeed, there is evidence to support the differentiation of ADHD and neurotypical profiles (Edebol et al., 2013). However, realistically, if QbTest is to be applied to clinical cohorts such as ours, it would need to reliably differentiate between those with ADHD and those exhibiting traits of ADHD, or other psychiatric groups. Our sample is representative of heterogeneous cohorts that will be received in clinical settings.
Whilst the performance of the QbTest was not effective in this cohort, there is evidence to suggest CPTs, generally, are clinically valuable in childhood and adulthood cohorts (Edebol et al., 2013; Emser et al., 2018; Hirsch & Christiansen, 2017; Lis et al., 2010; Teicher et al., 2012; Ulberstad, 2012; Zelnik et al., 2012). Whilst this evidence is useful, Johansson et al. (2021) note as a limitation that some of the studies supporting the QbTest were published by the authors of the QbTest (see Ulberstad, 2012) or received financial support from the developers (Groom et al., 2016). This is not necessarily a detriment; however, Johansson et al. (2021) suggest this is potentially problematic due to a significant association between positive findings and financial ties in clinical studies. Indeed, publication bias could also play a part here.
Our study suggests that the clinical validity of the QbTest is limited at best in comorbid samples (Baader et al., 2020; Baggio et al., 2020; Brunkhorst-Kanaan et al., 2020; Hall et al., 2016; Hult et al., 2018; Nichols & Waschbusch, 2004; Reh et al., 2015; Riccio & Reynolds, 2001; Söderström et al., 2014; Teicher et al., 2012). Issues pertaining to low sensitivity and specificity levels are widely acknowledged (Díaz-Orueta et al., 2014; Rizzo et al., 2000; Rodríguez et al., 2016). Whilst sensitivity levels have been shown to be acceptable, specificity has not (Riccio & Reynolds, 2001), which is the pattern we have observed here. Troublingly, even where the inclusion of the QbTest in clinical observations is supported, adding the QbTest to a predictive model to differentiate between ASD and ADHD adult patients (which included Conners Adult ADHD Rating Scale–subscale E and Autism Quotient) yielded only a “modest” improvement to the model (Groom et al., 2016).
It seems the QbTest has most difficulty differentiating clinical groups. For instance, the QbTest has previously demonstrated 86% sensitivity and 83% specificity when differentiating between ADHD and neurotypical controls, yet sensitivity drops to 36% when differentiating ADHD and borderline personality disorder controls (Edebol et al., 2012, 2013).
Other studies such as Baader et al. (2020) found a correlation between ADHD symptoms and Q-Activity, yet no other associations. Similarly, in a study which supports the use of the QbTest to identify ADHD, activity was found to be 3.5 times higher in ADHD than in neurotypicals, but inattention and impulsivity were not (Lis et al., 2010), suggesting usefulness of the test is limited. Moreover, Johansson et al. (2021) concluded that in a sample of 340 patients the QbTest was unable to differentiate between ADHD, other neurodevelopmental disorders, and neurotypical development, suggesting that clinicians take caution when using tests such as the QbTest to aid diagnostic or clinical decision making.
A potential explanation for these inconsistent findings is the lack of ecological validity in the test condition. It is possible that tests such as the QbTest fail to capture the random distractions that are present in real life situations, and it is this type of attention that is not quantified (Barkley, 1991). More recently, in order to address this, CPTs have evolved to use virtual reality paradigms where participants immerse themselves in virtual worlds which mirror real life settings, with results in favor of better sensitivity and specificity for ADHD populations than traditional CPTs (Rodríguez et al., 2018). Perhaps the QbTest and similar tests would be most useful as a measure of executive function (as part of the diagnostic pathway) rather than being concerned solely with diagnostic outcomes.
Interestingly, a multi-method randomized controlled trial investigated the feasibility and acceptability of employing the QbTest in UK specialist services, finding through interviews with clinicians that they were overwhelmingly in favor of the addition of the QbTest to clinical pathways. One of the major themes derived from the data surrounded the impression that the QbTest provided an objective, observable measure of ADHD which was beneficial for both clinicians and patients (similar findings were reported by Williams et al., 2021). Another main theme alluded to the pressure clinics felt from funders and commissioners to shorten the time between referral and diagnosis, meaning that tests such as the QbTest are valuable in facilitating this. Whilst findings such as this are understandable and logical, it seems important to point out here that, based on analysis of validation studies such as this one, the usefulness of the QbTest remains a “perceived validity,” and more research is certainly required before clinics can give weight to the results of the QbTest. As Reh et al. (2015) point out, the QbTest is widely marketed in the US and Europe as a diagnostic and titration tool without thorough validation (Vogt & Shameli, 2011), an issue that we would agree is highly problematic based on the findings of this study, and of those similar. Essentially, a single test or measure should not be used to diagnose or measure symptom reduction in ADHD; however, like all clinical assessments, the QbTest should be evidenced to be reliable in clinical cohorts. The test is intended by the developers for use as an aid alongside clinician judgment; therefore, questions surrounding validity must be resolved.
Limitations
A strength of our study is that it is representative of typical referral cases for ADHD, so we can confidently generalize to our service. However, it is limited to a single center, so our findings are not generalizable to whole populations. Studies including larger samples would also be valuable going forward. This study employed the QbTest+ for adolescents and adults between the ages of 12 and 60 years; therefore, results are only comparable with studies which did the same.
Conclusions
Our findings are meaningful in that they are representative of typical clinical cohorts and they have significant implications for specialist clinics. Not only do studies such as this one raise the question of whether employing the QbTest (or similar CPTs) is cost effective for services, they also highlight the importance of the expert judgment of experienced healthcare practitioners in diagnostic decision making. It is disconcerting that the QbTest failed to differentiate between ADHD and non-ADHD in our sample, yet we suggest that there is the potential for the QbTest to be useful in samples such as ours, by way of measuring symptom reduction or medication monitoring in individual cases (Gualtieri & Johnson, 2005; Wehmeier et al., 2014). Whilst CPTs such as the QbTest may have benefits in some cohorts or individual circumstances, diagnosis using clinical observations and history taking by specialist practitioners remains the gold-standard method. Tests designed for clinical utility must be able to differentiate between clinical groups (Forbes, 1998). Further research is required, especially if the QbTest is to be routinely employed in clinical cohorts.
Footnotes
Author’s Note
Marios Adamou is also affiliated to South West Yorkshire Partnership NHS Foundation Trust, Wakefield, UK.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
