Abstract
Introduction
ADHD is a neurodevelopmental disorder characterized by inattention, hyperactivity, and impulsivity (Epstein & Loren, 2013). Children with ADHD typically exhibit emotional and behavioral symptoms, which negatively affect academic performance and interfere with or reduce social functioning. If untreated, ADHD continues into adulthood and increases risks for dysfunction, including poor occupational performance, unemployment, traffic accidents, interpersonal conflict, obesity, and suicide (Cheung et al., 2015). The need for an objective diagnostic tool is growing (Bruchmuller, Margraf, & Schneider, 2012), yet few biological diagnostic markers of ADHD, and few confirmatory neuropsychological or laboratory tests, exist.
In clinical settings, the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) is used to diagnose ADHD in children and adolescents who exhibit attention deficits, hyperactivity, and impulsivity. Clinicians conduct comprehensive assessments to gather information and diagnose ADHD, including caregiver/teacher interviews, medical history, caregiver/doctor behavioral assessments, and neuropsychological tests.
The Continuous Performance Test (CPT) is the most widely used assessment tool for evaluating the neuropsychology of children exhibiting ADHD symptoms. Conners’ CPT (Conners, 1994), the Gordon Diagnostic System (Gordon, 1986), and the Test of Variables of Attention (TOVA; Greenberg & Kindschi, 1996) are most commonly used for ADHD diagnosis. The tests primarily assess selective attention, sustained attention, and behavioral inhibition, and determine the degree of cognitive impairment (Corkum & Siegel, 1993). CPT scores correlate with ADHD symptom severity measured via behavioral assessments, and ADHD symptom improvement tends to lead to CPT score improvement. Consequently, the CPT is regarded as an effective tool for assessing cognitive functions and drug treatment effects in children with ADHD (Corkum & Siegel, 1993; Kronenberger & Meyer, 2001; C. A. Riccio, Reynolds, Lowe, & Moore, 2002). However, at present, there are questions about its utility as a tool for distinguishing ADHD children from the general population.
The Advanced Test of Attention (ATA) is one of the most commonly used CPTs in Korea. Shin and colleagues suggested a T-score-based ADHD diagnostic model using a cutoff value of 2.0 SD (Hong, Shin, & Cho, 1999). However, only 30 ADHD boys aged 7 to 9 were included in the discriminant validity test, limiting the generalizability of the results. To address the model's limitations, Shin and colleagues conducted a follow-up study involving 1,091 ADHD children and proposed new ADHD diagnostic models incorporating machine learning and pattern recognition in 2008 (H. J. Lee, Cho, & Shin, 2008). However, because certain machine-learning methods exhibit greater false positive and false negative rates than T-score standards, it was difficult to conclude that this method was superior. In addition, the machine-learning model can potentially over-fit, and applying it in real-life clinical settings is challenging.
Therefore, the present study tested the ATA on healthy and ADHD children diagnosed by a semi-structured diagnostic interview to evaluate the clinical application of ATA standards. To determine the feasibility of the ADHD and healthy control group comparisons, we examined the ATA variable distributions and the sensitivity and specificity of the suggested ADHD diagnostic cutoff point. This study aims to assess the clinical utility of the ATA in discriminating between ADHD and healthy control groups.
Method
Participants
ADHD group
Participants were consecutively recruited between August 2012 and May 2014 at the outpatient clinic of pediatric psychiatry at Asan Medical Center, Seoul, Korea. ADHD participants met the following criteria: (a) aged between 6 and 12 years and (b) diagnosed with ADHD as per diagnostic criteria from the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR; American Psychiatric Association, 2000) and the Kiddie-Schedule for Affective Disorders and Schizophrenia–Present and Lifetime version (K-SADS-PL). Individuals were excluded if they met one or more of the following criteria: (a) intelligence quotient lower than 70; (b) a history of ADHD drug treatment (e.g., central nervous system stimulants, atomoxetine, or clonidine) within the past 3 months; (c) past and/or current history of schizophrenia, organic mental disorder, or pervasive developmental disorder; and (d) presence of seizure or other neurologic disorders.
Control group
The control group was recruited through an Internet bulletin board at [name removed for blinded review]. The control group met the following criteria: (a) aged between 6 and 12 years and (b) a negative diagnosis of ADHD according to DSM-IV-TR and K-SADS-PL. Comorbid disorders that did not require pharmacological treatment, such as tics and depressive or anxiety disorders, were allowed. The exclusion criteria were the same as for the ADHD group.
The study was approved by the institutional review board (IRB) of the [name removed for blinded review]. Written informed consent was obtained from parents and written assent from participants.
Instruments
The Korean version of the K-SADS-PL (K-SADS-PL-K)
Kaufman and colleagues (1997) developed the K-SADS-PL, which is a semi-structured diagnostic interview used to assess a child’s current and lifetime diagnoses through parent and child interviews. The validity and reliability of the Korean version of the K-SADS-PL (K-SADS-PL-K) have been well established for the assessment of ADHD, tic disorders, and oppositional defiant disorder (Kim et al., 2004).
ADHD and comorbid psychiatric disorders were diagnosed by board-certified child and adolescent psychiatrists (H.J.L. and H.W.K.) according to the DSM-IV-TR and were confirmed by the K-SADS-PL-K. Two raters independently rated 20% of the K-SADS-PL-K interviews, and kappa coefficients ranged from 0.86 to 1.00. Discrepancies were resolved by consensus discussion between the psychiatrists.
ADHD subtypes were defined according to the DSM-IV-TR criteria. A child who meets three or more diagnostic criteria and exhibits prominent attention deficit or hyperactivity and impulsivity that impairs social, academic, and interpersonal functioning is diagnosed as having ADHD not otherwise specified (NOS).
The ATA
The ATA comprises a visual test and an auditory test, which measure responses to a mix of target and non-target stimuli. The test-takers should inhibit their response when the non-target stimulus is presented. Target stimuli are non-linguistic to minimize language, learning, and cultural influences (e.g., triangles as visual target stimuli, “beep-beep-beep” as auditory target stimuli). Frequency of target stimulus presentation changes from 22% at onset, to 50% in the middle, and 78% toward the end. The stimulus presentation interval was 2 s, and each stimulus was presented for 0.1 s. Each test (visual/auditory) administration time was 10 min for 6-year-olds (beginning/middle/end: 3.3 min each, total time: 20 min), and 15 min for those aged above 7 years (beginning/middle/end: 5 min each, total time: 30 min). After test completion, omission errors, commission errors, reaction time (RT), and reaction time variability (RTV) were recorded as z scores which were standardized by age and sex. A z score less than 1.0 indicates “normal,” greater than 1.0 and less than 1.5 indicates “suspected,” and greater than 1.5 indicates “ADHD.” The sensitivity index (d’) and response bias (β) are also measured.
Omission errors are expressed as a percentage of target identification. Commission errors are represented as a percentage ratio of non-target identification. RT measures speed of information processing and response (ms). RTV measures the consistency of pressing a microswitch. Based on signal detection theory, the sensitivity index (d’) reflects the individual’s ability to discriminate target stimuli from non-target ones. The response criterion (β) is the ratio of signal (hit) to noise (false alarm) and represents whether an individual tends to respond conservatively or adventurously when performing a task.
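The two signal detection indices can be illustrated with a minimal sketch. It assumes the standard equal-variance Gaussian model, in which d’ is the difference between the z-transformed hit and false-alarm rates and β is the likelihood ratio at the response criterion. The source does not state how the ATA computes these indices internally, so the function name `sdt_indices` and the formulas are illustrative assumptions, not the ATA's implementation.

```python
import math
from statistics import NormalDist


def sdt_indices(hit_rate, false_alarm_rate):
    """Return (d_prime, beta) under the equal-variance Gaussian model.

    Assumption: both rates lie strictly between 0 and 1; rates of
    exactly 0 or 1 would need a correction before the z-transform.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa  # larger = better target/non-target discrimination
    # Likelihood ratio at the criterion: > 1 conservative, < 1 liberal.
    beta = math.exp((z_fa ** 2 - z_hit ** 2) / 2)
    return d_prime, beta


# Hypothetical child: 90% hits on targets, 10% false alarms on non-targets.
d_prime, beta = sdt_indices(0.90, 0.10)
print(f"d' = {d_prime:.2f}, beta = {beta:.2f}")
```

In this sketch a symmetric pattern of hits and false alarms gives β = 1 (no bias), while a child who rarely responds to non-targets but also misses many targets would obtain β above 1, corresponding to the conservative tendency described above.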
The ATA was administered by an investigator who was blinded to the child's assessment. For those who underwent repeated testing, the second ATA took place 2 weeks later at the same time and place and with the same investigator as the first test (interval range: 12 to 23 days).
Korean Educational Developmental Institute–Wechsler Intelligence Scale for Children–Revised (KEDI-WISC-R)
The KEDI-WISC-R was developed by modifying the American WISC-R to accommodate Korean culture. It yields overall intelligence, verbal intelligence, performance intelligence, and 11 other sub-test results for children aged 5 to 15 (α = .89; Park, Yoon, & Park, 1991).
Data Analysis
A chi-square test was used to compare categorical variables between ADHD and control groups. A t test was used for between-group continuous variable comparisons. A goodness-of-fit (GF) test was used to examine the distributions of the two groups. An ANCOVA was used to compare between-group ATA variables using age and gender as covariates. When a significant difference was found, the post hoc Tukey test was performed. Sensitivity, specificity, false positive value, false negative value, and diagnostic accuracy of the ATA variables were calculated when cutoff values were set at 1.0 SD (“suspected”), 1.5 SD (“ADHD”), and 2.0 SD (“ADHD,” based on ADHD Diagnostic System standards). Sensitivity and specificity differences at each cutoff point were compared using the McNemar test. Discriminant analysis was performed to examine if the ATA accurately distinguished between the ADHD and control groups. Finally, an intra-class correlation coefficient (ICC) was calculated to assess the test–retest consistency and variability of the ATA. Predictive Analytic SoftWare (PASW) statistics for Windows, Version 18.0 (SPSS Inc., Chicago, IL) was used for analysis, and the significance level was set at p < .05.
Results
Demographic Characteristics
The study included 114 children (79 ADHD, 35 controls). The ADHD group was slightly younger than the control group (p = .012) and included a higher proportion of boys (p = .005). IQ did not differ significantly between the two groups (p = .092).
Within the ADHD group, 36 children (45.6%) were classified as “predominantly inattentive,” 29 (36.6%) as “combined,” four (5.1%) as “predominantly hyperactive–impulsive type,” and 10 (12.7%) as “ADHD NOS.” In the ADHD group, the most prevalent comorbid disorder was “oppositional defiant disorder” (n = 9, 11.4%), followed by separation anxiety disorder (n = 4, 5.1%), tic disorder (n = 4, 5.1%), and depressive disorder (n = 1, 1.3%). In the control group, one child was diagnosed with generalized anxiety disorder. Statistical analysis indicated no between-group differences in comorbid disorders (Table 1).
Demographic and Clinical Characteristics of the ADHD and Control Group.
Note. FSIQ = full-scale intelligence quotient; NOS = not otherwise specified; ODD = oppositional defiant disorder; GAD = generalized anxiety disorder.
Analyzed by independent t test.
Analyzed by chi-square test.
Boldfaced values are significant at p < 0.05.
ADHD and Control Group GF Tests
A GF test was performed for frequency distributions of the eight ATA visual and auditory test variables (four variables for each modality: omission errors, commission errors, RT, and RTV). Between-group differences in the frequency distributions occurred in omission errors (p < .001), commission errors (p = .013), and RTV in the visual (p < .001) and auditory (p < .001) tests (Figures 1 and 2).

Goodness-of-fit test of ATA visual test variables between ADHD and control groups.

Goodness-of-fit test of ATA auditory test variables between ADHD and control groups.
Comparisons Between the ATA Results for the ADHD and Control Groups
Because there were significant differences in age and gender distribution between the ADHD and control groups, ANCOVA adjusted for age and gender was used. In the visual test, the ADHD group exhibited higher rates of omission errors (p = .032), commission errors (p = .022), and RTV (p = .030), as well as lower sensitivity (p = .019) compared with the control group. In the auditory test, the ADHD group showed higher RTV (p < .001) compared with the control group (Table 2). In the analysis excluding the 10 children with an ADHD NOS diagnosis, the ADHD group did not perform as well as the control group on RTV in the visual test (p = .033), RTV in the auditory test (p < .001), and sensitivity (p = .039).
Comparisons of the ADHD and Control Group Results on the ATA.
Note. ATA = Advanced Test of Attention.
Statistical significance was tested by ANCOVA adjusted for age and sex.
Boldfaced values are significant at p < 0.05.
Comparisons Between the ATA Results for the ADHD Subgroups and Control Group
Significant differences in age (p = .010) and gender distribution (p = .001) between the ADHD subgroups and control group were observed, so the comparisons among the groups were controlled for age and gender.
The comparisons revealed differences between the control group and ADHD subgroups in commission errors in both the visual and auditory tests (p = .032 and .019, respectively) and RTV in the auditory test (p = .003; Table 3). Post hoc analysis found that the combined group had a higher commission error rate than the control group in both the visual and auditory tests, and a significant difference in RTV was observed in both the predominantly inattentive and combined groups in the auditory test.
Comparison of the Results of the ATA by ADHD Subtypes.
Note. ATA = Advanced Test of Attention; IA = inattentive; HI = hyperactivity impulsivity type; COM = Combined type; FSIQ = full-scale intelligence quotient.
Statistical significance was tested by ANCOVA adjusted for age and sex. If the group effect was significant (p < .05), post hoc Tukey tests were performed to clarify the main effect.
a = inattentive type; b = combined group and hyperactive–impulsive type; c = control. Boldfaced values are significant at p < 0.05.
Comparison Between Sensitivity, Specificity, and Accuracy Based on the SD Cutoff Points
When any of the eight ATA variables exceeded the 1.5 SD cutoff, results indicated 72.8% accuracy, 84.8% sensitivity, and 45.7% specificity (positive predictive value [PPV] = 77.9%, negative predictive value [NPV] = 57.1%). A 1.0 SD cutoff (“suspected”) led to 73.7% accuracy, 93.6% sensitivity, and 28.6% specificity. These values did not vary significantly from those at the 1.5 SD cutoff (“ADHD”). The area under the curve (AUC) values for the 1.5 SD and 1.0 SD criteria were .653 and .611, respectively. When a cutoff of 2.0 SD was applied, 71.6% accuracy, 54.3% sensitivity, and 81.2% specificity were found. Compared with the 1.5 SD cutoff, sensitivity was lower (p = .002) and specificity was greater (p < .001; Table 4).
Sensitivity, Specificity, and Accuracy Using Different ATA Cutoff Scores to Predict Corresponding K-SADS-PL Diagnoses.
Note. ATA = Advanced Test of Attention; K-SADS-PL = Kiddie-Schedule for Affective Disorders and Schizophrenia–Present and Lifetime version; PPV = positive predictive value; NPV = negative predictive value.
McNemar’s test was performed to compare sensitivity and specificity, using the cutoff of 1.5 SD above the mean as the point of reference.
Boldfaced values are significant at p < 0.05.
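The relationship between the reported indices at the 1.5 SD cutoff can be checked with a short sketch. The cell counts below are not reported in the source; they are back-calculated here from the group sizes (79 ADHD, 35 controls) and the reported percentages, and should be read as an assumption. The function name `diagnostic_metrics` is likewise illustrative. The sketch also shows that the reported AUC of .653 coincides with the mean of sensitivity and specificity, which is what a ROC curve with a single cutoff point yields; whether the authors computed it this way is not stated.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute standard diagnostic accuracy indices from a 2x2 table."""
    sensitivity = tp / (tp + fn)  # ADHD children flagged by the test
    specificity = tn / (tn + fp)  # controls not flagged by the test
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Area under a ROC curve that has only one cutoff point.
        "single_cutoff_auc": (sensitivity + specificity) / 2,
    }


# Assumed counts at the 1.5 SD cutoff (inferred, not taken from Table 4):
# 67 of 79 ADHD children and 19 of 35 controls exceed the cutoff.
at_1_5_sd = diagnostic_metrics(tp=67, fn=12, tn=16, fp=19)
for name, value in at_1_5_sd.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts the sketch returns the same rounded values as the text for sensitivity, specificity, accuracy, PPV, and NPV. The low NPV follows directly from the small number of controls who score below the cutoff.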
ATA Discriminant Analysis
A stepwise discriminant analysis was conducted for the eight ATA variables. Sensitivity index and response bias were excluded from the analysis because of potential between-variable multicollinearity. The results revealed that the auditory test RTV (standardized discriminant coefficient [SDC] = .819, p < .001) was the most reliable variable for discriminating between the ADHD and control groups, followed by visual test commission errors (SDC = .291, p < .001).
The following formula (F) was deduced from the analysis:
Based on this formula, children with F > 0 were classified as ADHD, and children with F < 0 were classified as controls with 64.9% accuracy. The function produced 58.2% sensitivity and 80% specificity, and its centroid was −.685 in the control group and .304 in the ADHD group. Cross-validation using the leave-one-out procedure confirmed that 64.9% of the cross-validated cases were correctly classified. Discriminant analysis conducted with all eight ATA variables had 69.2% accuracy.
Comparison of the ICC Values of the ATA Variables
To assess test–retest reliability, the ATA was repeated after 2 weeks on 15 healthy controls and 24 ADHD children. The ICCs were calculated to measure test–retest reliability. Typically, an ICC greater than .7 indicates high reliability, while an ICC between .5 and .6 indicates moderate reliability. In the control group, the commission errors of the visual test (r = .623, p = .039) and response criterion (r = .786, p = .023) had a statistically significant test–retest correlation. However, in the ADHD group, no ICC greater than .5 was found, indicating low individual test–retest reliability and high between-test variability (Table 5).
Test–Retest Reliability Using ICC Values of ATA Variables.
Note. ATA = Advanced Test of Attention; ICC = intra-class correlation coefficient.
The ICC will be high when there is little variation between the raters’ scores for each item given (above .7 = strong; .5-.6 = moderate; <.5 = weak).
Boldfaced values are significant at p < 0.05.
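To make the test–retest statistic concrete, the following sketch computes an ICC for two administrations of the same variable. The source does not state which ICC form was used, so the two-way, single-measurement, consistency form (often written ICC(3,1)) is an assumption here, as are the function name `icc_consistency` and the example scores.

```python
def icc_consistency(first, second):
    """ICC(3,1): two-way model, single measurement, consistency.

    Assumption: `first` and `second` hold the same children's scores,
    in the same order, from the first and second administration.
    """
    pairs = list(zip(first, second))
    n, k = len(pairs), 2  # n children, k = 2 sessions
    grand = sum(a + b for a, b in pairs) / (n * k)
    child_means = [(a + b) / k for a, b in pairs]
    session_means = [sum(first) / n, sum(second) / n]

    # Partition the total sum of squares (two-way ANOVA without replication).
    ss_children = k * sum((m - grand) ** 2 for m in child_means)
    ss_sessions = n * sum((m - grand) ** 2 for m in session_means)
    ss_total = sum((x - grand) ** 2 for pair in pairs for x in pair)
    ss_error = ss_total - ss_children - ss_sessions

    ms_children = ss_children / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_children - ms_error) / (ms_children + (k - 1) * ms_error)


# Hypothetical z scores for five children tested twice.
test_1 = [1, 2, 3, 4, 5]
test_2 = [2, 1, 4, 3, 5]
print(round(icc_consistency(test_1, test_2), 2))
```

Because this form measures consistency rather than absolute agreement, a uniform practice effect (every child improving by the same amount) does not lower the coefficient; only changes in the children's relative ordering and spacing do. If all scores are identical the denominator is zero, so the sketch is not defined for that case.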
Discussion
Our study investigated the clinical utility of the ATA in discriminating between the ADHD and healthy control groups. The current diagnostic algorithm of the ATA based on a cutoff point showed high sensitivity and low specificity. Discriminant analysis results indicated that, of the ATA variables, the RTV of the auditory test and commission errors of the visual test reliably distinguished between the ADHD and control group children. However, because the distribution of some ATA variables varied between the ADHD and control groups, caution is advised when interpreting T-score results. Moreover, repeated ADHD group tests yielded highly variable results, suggesting that care should be exercised in one-time result interpretation.
When looking at the ADHD diagnostic classifications based on the ATA cutoff points, a 1.5 SD cutoff (the ADHD diagnostic standard) yielded high sensitivity (84.8%) but low specificity (45.7%). Applying a 2.0 SD cutoff yielded low sensitivity (54.3%) and higher specificity (81.2%). Among the CPTs currently used worldwide, the TOVA has shown sensitivity and specificity of 91.1% and 21.6%, respectively. In a CPT meta-analysis, Pineda, Puerta, Aguirre, Garcia-Barrera, and Kamphaus (2007) found that it had 61.9% accuracy in distinguishing ADHD from normal children, which is similar to our result (64.9%). In addition, in the ATA GF test, some variables showed significant differences in between-group distributions, ruling out homogeneity. Converting ADHD children’s T-scores based on the control group results (with a different distribution) will influence values, thereby making it difficult to achieve a fair between-group comparison. Therefore, diagnosing ADHD based on variable T-scores presumed to be evenly distributed across groups may be limited.
Previous studies reported that CPT can hardly distinguish cases where attention problems are rooted in organic causes and that further evidence is required to use CPT for ADHD diagnosis (McGee, Clark, & Symons, 2000; Riccio, Reynolds, & Lowe, 2001). Consequently, they suggested that the tool should not be used for diagnosis but rather for unspecific assessment, as it can produce similar results with children who have impaired attention due to reasons other than ADHD. To address this limitation, Yoon and colleagues (2008) reported that calculating a diagnostic normative score (clinical T-score) based on an ADHD group’s distribution is more useful when discriminating an ADHD group from other clinical groups (depressive disorders and anxiety disorders) and that this approach can have value for disorder severity assessment in the ADHD group. However, it is uncertain whether it would aid ADHD diagnosis. In addition, if the distribution were different between clinical groups, T-score comparisons would not yield meaningful results. Consequently, ADHD diagnosis must occur in conjunction with clinical observation and evaluation. Therefore, a number of clinical guidelines, including those of the American Academy of Pediatrics, do not recognize CPT in children’s ADHD diagnosis (Pliszka et al., 2006; Wolraich et al., 2011).
Results from existing studies pertaining to repeated CPT testing of ADHD and control groups vary considerably. Soreni, Crosbie, Ickowicz, and Schachar (2009) found an ICC of .72 as measured by CPT in the ADHD group and an ICC of .41 as measured by the IOWA Conners Rating Scale. Consequently, they argued that repeating the CPT rather than relying on behavioral symptoms results in higher reliability. Conversely, results of repeatedly administered CPTs in normal children yielded a moderate ICC between .33 and .65. Subsequent retests found results outside of the 90% confidence interval of the initial test in more than 30% of the tests, suggesting that test result interpretation requires caution for both normal and ADHD children (Erdodi et al., 2014; Zabel, von Thomsen, Cole, Martin, & Mahone, 2009). However, because most previous studies were conducted separately for children with ADHD and control groups and used different assessment tools, direct comparisons between ADHD and control groups are difficult. The present study found a low ICC in the ADHD group and inconsistent retest performance, indicating that individual between-test variability was high. Consistently, Llorente and colleagues (2001) reported that the high variability of ADHD children’s CPT tests indicates notable internal validity and correlation with ADHD symptoms, supporting our findings. Consequently, care should be used in ADHD diagnosis based on one-time administration of a CPT (including the ATA).
In the ATA result comparisons, differences were found in omission errors, commission errors, RTV, and the sensitivity of the visual test. Conversely, the only auditory test difference was in RTV. Thus, greater group differences were found for the visual than for the auditory test. Research reports that auditory tests are more sensitive when testing older compared with younger children, which may explain the present results, as the present study predominantly involved young children (Shin, Cho, Chun, & Hong, 2000). Because auditory tests are more difficult than visual tests, they are reported as more useful for testing children with mild attention problems. Although the current study excluded ADHD NOS children from analysis, no auditory test result differences were identified.
ADHD children’s RTV was significantly greater than that of the control group, and discriminant analysis revealed it was the most sensitive between-group distinguishing variable. Consistently, studies suggest RTV as a neuropsychological ADHD endophenotype (Hervey et al., 2006) that can meaningfully distinguish ADHD children from normal children. Likewise, these studies report that RTV is statistically significant when distinguishing ADHD children from children with depressive disorders, learning disorders, and intellectual disabilities, indicating that it is an appropriate variable for screening children with different types of attention problems (Castellanos, Sonuga-Barke, Milham, & Tannock, 2006). In addition, its relevance in medication effect and behavior control has been reported (S. H. Lee et al., 2009), and efforts are in progress to use it as a mathematical/statistic diagnostic method (Lin, Hwang-Gu, & Gau, 2015).
In our study, the ADHD group’s visual commission error rate was greater than the control group’s and was the second most sensitive variable for between-group discrimination. The high commission error rate supports Barkley’s hypothesis suggesting that impaired inhibition represents ADHD-related behavior pathology (Barkley, 1997). ADHD subgroups and control group comparisons revealed that commission errors are sensitive for distinguishing ADHD subtypes because the variable tends to reflect the cognitive characteristics of the combined subtype. Discrepancies in commission error studies might result from a lack of ADHD subtype discrimination. In the ADHD subgroup analysis conducted without controlling for gender and age, the inattentive group had a high omission error rate, while the combined group had a high commission error rate. These findings are consistent with existing studies indicating that individual CPT variability reflects the cognitive characteristics of the ADHD subtype.
The limitations of the current study are as follows. First, the small sample size meant that sufficient statistical power could not be reached. Second, because the children’s age range was limited, caution should be exercised in generalizing the findings to other age groups. A control group of 35 children is too small to be representative of typically developing children, and the control group’s age and gender were also not matched with the ADHD group. Therefore, the reader should be careful when generalizing the study results. Further studies with a larger sample size and a gender-matched control group are necessary. Third, analysis excluded the 10 children with ADHD NOS, although analysis revealed no significant differences when these children were included. Fourth, an indicator of severity (e.g., Children’s Global Assessment Scale; CGAS) was not measured for the groups in this study. However, all ADHD children who participated met the DSM-IV impairment criteria. Fifth, this study compared children with ADHD and typically developing children; thus, our study could not answer whether the ATA can differentiate attention problems among children with various psychiatric disorders, including depressive disorders, anxiety disorders, and tic disorder. Finally, the ICC values may have been affected by the test–retest interval, which varied between children.
In conclusion, the present study examined the clinical application of the ATA as an ADHD diagnostic tool. The results confirmed variability in the ATA variable distribution between the ADHD and control groups, suggesting between-group heterogeneity. Consequently, T-scores based on normal children’s results may be limited in ADHD diagnosis. Despite high sensitivity, the current ATA ADHD diagnostic criteria have low specificity. In addition, discriminant analysis found that the ATA had 64.9% accuracy for classifying ADHD relative to normal children. Thus, although ATA results can help us understand neuropsychological observations and complement clinical diagnosis, the ATA is an insufficient diagnostic tool that cannot be used independently. Based on the RTV that represents the intra-subject variability of the ADHD group and the low ICC that represents inter-test variability, it can be concluded that caution should be exercised when using one or two ATA variables or a single test for ADHD diagnosis.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by a grant (2012-0653) from the Asan Institute for Life Sciences, Seoul, Korea.
