Abstract
The Behavior Rating Inventory of Executive Function, Second Edition (BRIEF2; Gioia, Isquith, Guy, & Kenworthy, 2015) is the recent revision of the original Behavior Rating Inventory of Executive Function (BRIEF; Gioia, Isquith, Guy, & Kenworthy, 2000), an informant-report rating scale designed to assess executive behaviors in children and adolescents. The original BRIEF has been shown to be sensitive to clinical dysfunction across diagnostic groups. Specifically, considering the presence of persisting executive function deficits in many youth with ADHD (Barkley & Fischer, 2011; Seidman, 2006; Willcutt, Doyle, Nigg, Faraone, & Pennington, 2005), the BRIEF has been shown to discriminate between children with ADHD and typically developing controls (Mahone, Cirino, et al., 2002; Reddy, Hale, & Brodzinsky, 2011; Toplak, Bucciarelli, Jain, & Tannock, 2009) as well as between ADHD and clinically referred children without ADHD (McCandliss & O’Laughlin, 2007) or those with other clinical disorders (e.g., Gioia, Isquith, Kenworthy, & Barton, 2002; Mahone, Cirino, et al., 2002; Mahone, Zabel, Levey, Verda, & Kinsman, 2002; Vriezen & Pigott, 2002). Although subtypes of ADHD appear notably unstable over time (Lahey, Pelham, Loney, Lee, & Willcutt, 2005) suggesting that presentations may reflect the most severe current concerns rather than durable diagnostic differences, several studies have shown that the BRIEF also discriminates between (current) subtypes of ADHD; children with the combined presentation of ADHD tend to be rated by caregivers as higher on the Behavioral Regulation Index (BRI) and/or Inhibit scale of the BRIEF than are children with predominantly inattentive ADHD (Gioia et al., 2002; McCandliss & O’Laughlin, 2007), potentially reflecting the behavioral inhibition deficits characteristic of those with more hyperactive symptoms (e.g., Barkley, 1997; Fischer, Barkley, Smallish, & Fletcher, 2005). 
Likewise, children with the inattentive presentation of ADHD tend to be rated as showing a higher level of behaviors indicative of difficulty with working memory and cognitive control (e.g., the Metacognitive Index and/or Working Memory scale; Gioia et al., 2002; McCandliss & O’Laughlin, 2007; Reddy et al., 2011), though this finding has not been consistent across studies (e.g., McCandliss & O’Laughlin, 2007).
The recent revision of the measure, the BRIEF2 (Gioia et al., 2015), has resulted in a reduced number of items, the addition of a new index and renaming of the previous Metacognition Index, and rearrangement of scale content resulting in creation of two scales out of the previous Monitor scale items. Specifically, the new measure now consists of nine scales, with the Inhibit and Self-Monitor scales comprising the Behavior Regulation Index (BRI), the Shift and Emotional Control scales comprising the new Emotional Regulation Index (ERI), and the Initiate, Working Memory, Plan/Organize, Task Monitor, and Organization of Materials scales comprising the Cognitive Regulation Index (CRI; the renamed Metacognition Index). The removal of the Shift and Emotional Control scales from the BRI and addition of the Self-Monitor scale to this index suggests that prior work supporting the utility of the BRI in discriminating between ADHD and non-ADHD groups and among ADHD subtypes may no longer be directly applicable. Validation of the new scale arrangement is needed to determine the degree to which the BRI continues to be useful for such purposes. Examination of the BRIEF2 in clinical samples is needed to determine the degree to which prior work remains useful and the confidence with which clinicians can use the revision to support diagnostic decision-making.
The original measure was designed to capture informant reports of children’s day-to-day executive functioning in “real world” situations (Gioia, Kenworthy, & Isquith, 2010). Published work examining the BRIEF and other rating scales of executive behavior suggests that traditional performance-based measures of executive function are not consistently correlated with ratings of everyday functioning provided by caregivers (e.g., McAuley, Chen, Goos, Schachar, & Crosbie, 2010). Given this limited pattern of association, rating measures like the BRIEF appear to measure constructs distinct from those assessed by performance-based tests or other broad-band behavior rating scales (Jarratt, Riccio, & Siekierski, 2005) and add clinical value to assessment of everyday application of executive skills and their real-world behavioral impact (Barkley & Fischer, 2011; Goldberg & Podell, 2000; Isquith, Roth, & Gioia, 2013). Specifically, the BRIEF2 authors assert that the new measure was designed to show increased sensitivity to everyday executive dysfunction “in key clinical groups, such as ADHD” (Gioia et al., 2015, p. 3). The test manual describes BRIEF2 parent ratings in a sample of children with ADHD combined presentation (ADHD-C; n = 218) and inattentive presentation (ADHD-I; n = 159), relative to parent ratings of typically developing children drawn from the normative sample. Overall, the ADHD-C group showed significantly higher scores across all BRIEF2 scales and indices than controls, with the highest scale elevations on the Inhibit, Working Memory, and Shift scales and the highest index elevation on the BRI (Gioia et al., 2015, p. 140). Similarly, the ADHD-I group in this sample also showed significantly higher scores across scales and indices than controls, with highest scale elevations reported on Working Memory, Initiate, and Plan/Organize, and the CRI the most elevated index (Gioia et al., 2015).
In a smaller research subsample (Isquith, Kenealy, Roth, & Gioia, 2015, reported in Gioia et al., 2015), parent ratings of children with ADHD-C were reported to be significantly higher than those of children with ADHD-I on Inhibit, Self-Monitor, Shift, and Emotional Control. Taken together, the authors assert that children with ADHD-C have “much higher average T scores on the Inhibit and Self-Monitor scales than children diagnosed with ADHD-I and . . . both groups on average have much higher Working Memory, Plan/Organize, and Task Monitor scales than their typically developing peers” (Gioia et al., 2015, p. 140).
These preliminary data from the BRIEF2 authors are promising; however, further external validation of the new measure is needed to determine its utility for clinical purposes. As noted, given the reassignment of scales to indices, prior work regarding the utility of the BRI in particular may not be directly applicable to the revision of the BRIEF. Additionally, most, if not all, of the prior studies investigating performance of either version of the measure in ADHD have examined only children with the inattentive or combined presentations of ADHD (e.g., Gioia et al., 2015; Gioia et al., 2002; McCandliss & O’Laughlin, 2007). Furthermore, meta-analytic and review studies (Doyle, 2006; Willcutt et al., 2005) have found a pattern of broad executive dysfunction among youth with ADHD, with notable heterogeneity, suggesting that more work is needed to characterize this variability in functioning among individuals with ADHD. To clarify the role of day-to-day inhibitory deficits as compared with attentional dysregulation, examination of patterns among children with all three presentations, including the predominantly hyperactive-impulsive presentation, is important.
The present study investigated the preliminary validity of the BRIEF2 in a large clinically referred sample of children and adolescents by examining patterns among the scales and indices of the BRIEF2 as well as sensitivity and specificity of selected scales in clinically referred youth with and without symptoms of ADHD (including individuals with symptoms of all three presentations). We hypothesized that the revisions to the measure would provide evidence for discrimination between children with high levels of ADHD symptoms and referred children without ADHD, but also between ADHD symptom presentations. Furthermore, we hypothesized a “double dissociation” model: children with predominantly inattentive ADHD symptoms would show higher ratings on scales comprising the CRI relative to children with the predominantly hyperactive-impulsive presentation, whereas children with the predominantly hyperactive-impulsive presentation would show higher ratings on the scales comprising the BRI.
Methods
Procedures
As part of routine clinical practice at a large outpatient neuropsychology assessment center, parents of children referred to the clinic are typically asked to complete behavioral rating scales and all data are then entered into a clinical database by department clinicians via the hospital electronic health record. These data are securely maintained by the hospital’s Information Systems Department. Data were collected over a 4-year period (October 2010-October 2014).
Following approval from the hospital Institutional Review Board, the clinical database was queried, and a limited data set was constructed of patients between 5 and 18 years of age for whom valid scores were available on all measures of interest (specified below). There were no exclusionary criteria beyond complete data on these measures, except as specified in procedures for group selection below.
Participants
Participants included parents of 1,969 youth, ages 5 to 18 years (M = 10.45, SD = 3.22), referred for outpatient neuropsychological assessment in a large outpatient neuropsychology clinic (see Table 1). Cases were selected for inclusion if complete caregiver ratings (i.e., not more than one missing item per scale) were available on the BRIEF2 and ADHD Rating Scale-IV (ADHDRS), and criteria for membership in one of the four groups were met as specified below. The majority of the children being rated were male (62.5%) and Caucasian (59.63%); 24.46% were African American, 5.45% were multi-racial, 2.3% were Asian American, and 4.8% were of unknown racial background, while 2.25% reported Hispanic ethnicity.
Sample Characteristics.
Note. Superscripts indicate significant differences between groups (identical letters mean groups are not statistically different [a, a; group means do not differ]; different letters indicate significant differences between groups [a, b, c; all groups are different]). IA only = inattentive symptoms group (e.g., ≥6 symptoms of inattention, ≤3 symptoms of hyperactivity/impulsivity); HI only = hyperactive-impulsive symptoms group (e.g., ≥6 symptoms of hyperactivity/impulsivity, ≤3 symptoms of inattention); comb = Combined symptoms group (e.g., ≥6 symptoms of both inattention and hyperactivity/impulsivity); ADHDRS = Attention Deficit Hyperactivity Disorder Rating Scale; ADHDRS IA = Inattention scale sum; ADHDRS HI = Hyperactive/Impulsive scale sum; BRIEF2 = Behavior Rating Inventory of Executive Function, Second Edition; BRI = Behavioral Regulation Index; ERI = Emotion Regulation Index; CRI = Cognitive Regulation Index; GEC = Global Executive Composite.
Group assignment
Children were categorized using modified Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013) ADHD symptom criteria, including caregiver-report symptom count on the ADHD Rating Scale-IV (DuPaul, Power, Anastopoulos, & Reid, 1998) and ratings of impairment (rating of 4 or greater) in at least one domain of daily function on the Impairment Rating Scale (IRS; Fabiano et al., 2006). Participants in each of the ADHD symptoms groups were required to meet symptom count criteria (≥6 symptoms in the relevant domain(s) of inattention, hyperactivity/impulsivity, or both) with impairment reported in at least one functional domain. However, consistent with a recent latent class analysis suggesting that the severe inattentive characterization is best predicted by at least six symptoms of inattention but fewer than three symptoms of hyperactivity/impulsivity (Volk, Todorov, Hay, & Todd, 2009), the IA only symptom group and HI only symptom group reflected restrictive symptom presentations (e.g., not simply subthreshold combined presentation) and were restricted to no more than three symptoms of the other symptom domain (i.e., children in the IA only group were reported to display at least six symptoms of inattention but no more than three symptoms of hyperactivity/impulsivity, whereas children in the HI only group were required to have at least six symptoms of hyperactivity/impulsivity, but no more than three symptoms of inattention). These groups are not diagnostic groups, per se, but rather reflect children exhibiting specific patterns of symptoms of ADHD. Four groups (n = 1,381) were created from the larger clinical sample, reflecting symptom presentations: a restricted inattentive group (IA only; n = 395), a restricted hyperactive/impulsive group (HI only; n = 77), a combined symptoms group (Combined; n = 315), and a non-ADHD clinical comparison (n = 594). 
The non-ADHD clinical comparison group was selected from the same referred cohort, but participants were only included if they were rated as exhibiting three or fewer symptoms of inattention and/or hyperactivity/impulsivity on the ADHDRS. Children were not included in the analyses if they did not meet the above group assignment criteria (n = 588), either because they were rated by parents as showing substantial subthreshold symptoms (five symptoms of inattention, hyperactivity, or both), because at least six symptoms were endorsed within one domain and four or five symptoms were endorsed in the other, or because clinical symptoms were endorsed without impairment (IRS).
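The group assignment rules described above can be summarized as a short decision procedure. The following is a minimal sketch, not the authors' actual code: the function name, argument names, and group labels are hypothetical. It also assumes that the non-ADHD group requires three or fewer symptoms in both domains and that impairment ratings are not used for that group, which the text does not state explicitly.

```python
from typing import Optional, Sequence

SYMPTOM_THRESHOLD = 6   # at least six symptoms in a domain
RESTRICTED_MAX = 3      # no more than three symptoms in the other domain
IMPAIRMENT_CUTOFF = 4   # IRS rating of 4 or greater counts as impairment


def assign_group(ia_count: int, hi_count: int,
                 irs_ratings: Sequence[int]) -> Optional[str]:
    """Return a symptom group label, or None if the case is excluded."""
    # Assumption: non-ADHD means three or fewer symptoms in BOTH domains,
    # regardless of impairment ratings.
    if ia_count <= RESTRICTED_MAX and hi_count <= RESTRICTED_MAX:
        return "non-ADHD"

    # All ADHD symptom groups require impairment in at least one domain.
    impaired = any(r >= IMPAIRMENT_CUTOFF for r in irs_ratings)
    if not impaired:
        return None

    ia_high = ia_count >= SYMPTOM_THRESHOLD
    hi_high = hi_count >= SYMPTOM_THRESHOLD

    if ia_high and hi_high:
        return "Combined"
    if ia_high and hi_count <= RESTRICTED_MAX:
        return "IA only"
    if hi_high and ia_count <= RESTRICTED_MAX:
        return "HI only"

    # Subthreshold or mixed patterns (e.g., six symptoms in one domain and
    # four or five in the other) are excluded from the analyses.
    return None
```

Under these assumptions, a child with seven inattentive symptoms, two hyperactive/impulsive symptoms, and at least one IRS rating of 4 or higher would fall in the IA only group, whereas a child with six symptoms in one domain and five in the other would be excluded.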
Measures
ADHDRS
Caregivers completed the ADHDRS (DuPaul et al., 1998), an 18-item measure, reflecting the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) and DSM-5 ADHD diagnostic criteria. Item content reflects DSM-IV symptom criteria and ratings are based on the child’s behavior over the past 6 months, using a 4-point scale (0 = not at all; 1 = sometimes; 2 = often; 3 = very much). Subscales correspond to the DSM-IV Inattentive and Hyperactive/Impulsive criteria. Total subscale scores were obtained by adding item ratings (range: 0-27 for each). The ADHDRS has been shown to demonstrate adequate reliability and validity (DuPaul et al., 1998); internal consistency estimates for the parent-report version ranged from .86 for the Inattention scale to .88 for the Hyperactivity/Impulsivity scale, with test-retest reliability over short periods of time ranging from .78 for Inattention to .86 for Hyperactivity/Impulsivity. In the current sample, internal consistency for the ADHDRS subscales was good (Inattention rα = .859; Hyperactivity/Impulsivity rα = .856).
IRS
The IRS (Fabiano et al., 2006) is an informant report measure of a youth’s impairment in several domains of functioning. The parent version of the measure comprises seven items (six domain items and a summary rating); five of these items were collected as part of the present study. Parents were asked to rate, on a 7-point Likert scale, the extent to which their child’s problems are impacting their academic progress, peer relationships, relationships with caregivers, self-esteem, and home life. For the purposes of the present study, a score at or above the midpoint of 4 on this 7-point scale was considered an endorsement of functional impairment in a given domain.
BRIEF2 parent-report form
The BRIEF2 (Gioia et al., 2015) is a revision of the BRIEF, a rating scale assessing everyday behaviors reflecting executive functions across the school-age span (ages 5-18). The BRIEF2 parent-report form contains 63 items within nine theoretically derived clinical scales: Inhibit, Self-Monitor, Shift, Emotional Control, Initiate, Working Memory, Plan/Organize, Task Monitor, and Organization of Materials. The nine clinical scales form three Composite or Index scores: the BRI, ERI, and CRI as well as the overall Global Executive Composite (GEC) summary score. All Parent Form coefficient alpha values for index scores were reported to fall above .90, with coefficients for individual scales ranging from .80 (Monitor) to .91 (Emotional Control) in the standardization sample (Gioia et al., 2015, p. 101). The manual reports test-retest reliability estimates from a sample of 163 parent ratings; the mean test-retest correlation coefficient for clinical scales was .79 (range = .67-.92) over an average time interval of 2.9 weeks (Gioia et al., 2015, p. 111). According to the test manual (Gioia et al., 2015, p. 144), the Working Memory scale reportedly showed the largest effect size in comparisons between typically developing controls and youth with ADHD (either inattentive or combined presentation).
For the present study, item level data used for analysis were obtained via the original BRIEF parent rating form. As all of the item content for the revised BRIEF2 is derived from existing items on the original BRIEF, it was possible to extract the items present on the BRIEF2 from original BRIEF data stored in the computer scoring program. The BRIEF2 publisher (PAR, Inc.) then arranged these extracted items into the new BRIEF2 clinical scales and indices. These new clinical scale and index raw scores were compared with the BRIEF2 normative sample to obtain age- and sex-based T scores for clinical scales, indices, and the overall composite. Correlations between corresponding BRIEF and BRIEF2 scales were strong, with all values above .90 with exception of Organization of Materials (.83; due at least in part to substantive item changes between versions) and Monitor (.85 with Self-Monitor, .78 with Task Monitor) scales. The overall GEC correlated highly across measures (r = .975), with the original Metacognition Index correlating .96 with the CRI and the original BRI correlating slightly lower with the revised BRI (.85).
Analysis Plan
Following group assignment, detailed above, examination of BRIEF2 scale internal consistency was completed via calculation of Cronbach’s alpha. Next, two multivariate analyses of variance (MANOVAs) were conducted to examine (a) performance of the BRIEF2 index scores across groups, and (b) performance of the clinical scales across groups, while correcting for multiple simultaneous comparisons. Post-hoc tests, correcting for multiple comparisons (Bonferroni correction), were examined to identify scales that discriminated between clinical groups. Although BRIEF2 scores were age normed, given the broad school-age range of the sample, an additional follow-up multivariate analysis of covariance (MANCOVA) was conducted to determine whether the pattern of between-group differences among clinical scales remained consistent when controlling for age.
Next, as it is important to know the likelihood of specific impairments in clinical groups relative to typically developing children (e.g., Chelune, 2010), clinical base rates of elevated scores across specific scales were calculated across groups, to identify approximate prevalence of such elevations in a mixed clinical sample. Finally, discriminative value for accurately classifying symptom presentations was examined via receiver operating characteristic (ROC) curve analyses.
Results
Reliability
In this mixed clinical sample, the BRIEF2 exhibited excellent internal consistency (Cronbach’s α = .965) for the items contributing to the GEC. Scale reliabilities (α) ranged from .772 (Initiate, five items) to .923 (Emotional Control, eight items; see Table 2).
Internal Consistency (Cronbach’s Alpha) by BRIEF2 Scale.
Note. BRIEF2 = Behavior Rating Inventory of Executive Function, Second Edition.
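For readers less familiar with the statistic, Cronbach's alpha for a scale of k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below illustrates that standard formula on made-up ratings. The function name and the toy data are hypothetical, and this is not the software used in the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(item_scores[0])
    # Transpose so that each tuple holds one item's ratings across respondents.
    items = list(zip(*item_scores))
    sum_item_vars = sum(variance(col) for col in items)
    # Variance of each respondent's total (summed) scale score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_item_vars / total_var)


# Hypothetical ratings: three respondents, three items that rise and fall together.
toy_ratings = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy_ratings)
```

When items covary perfectly, as in the toy data, alpha reaches its maximum of 1; values fall toward 0 as items become less related to one another.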
Clinical Group Profiles
Index score profiles
As expected, there was a large overall effect of group on the BRIEF2 indices, F(12, 4128) = 139.84, p < .0001;
Clinical scale profiles
There was a significant overall effect of group across the BRIEF2 clinical scales, F(27, 4113) = 67.11, p < .0001;

Profiles of mean BRIEF2 subscale and index scores across clinical groups.
Among children in the HI only group, ratings of more than 2 SD above the mean were observed only on the Inhibit scale, with mean ratings of >1.5 SD on Shift and Emotional Control. Ratings on Initiate, Plan/Organize, and Organization of Materials were largely within normal limits (on average not more than 0.5 SD above the mean). Among children with greater inattentive symptomatology (IA only group), elevations of >2 SD were seen only on the Working Memory scale, with elevations of >1.5 SD on Initiate, Plan/Organize, Organization of Materials, and Task Monitor. Ratings on the Inhibit and Emotional Control scales fell within 1 SD of the mean for this group. As expected then, children with high levels of both inattentive and hyperactive symptoms (Combined group) showed elevated mean scores (>1.5 SD) across all subscales, with mean ratings of >2 SD on Inhibit, Shift, and Working Memory. Effect sizes for across group comparisons
Examining the consistency of findings after controlling for age (MANCOVA), both group
Clinical Utility of Scale Elevations
Base rates of elevated scores are important to consider in evaluating performance for clinical purposes (Chelune, 2010) and the proportion of children in each group with clinical scale and index T scores ≥70 is presented in Table 3. Children who were reported to exhibit clinically elevated levels of ADHD symptoms were consistently more likely to exhibit elevations of ≥2 SD above the normative mean across BRIEF2 scales than those without (non-ADHD; all p < .01), with the exception of the Working Memory and Organization of Materials scales (HI only group not different from the non-ADHD group; see Table 3). With regard to within symptom group comparisons, referred children showing high levels of hyperactivity/impulsivity symptoms (either HI only or Combined groups) were more likely than those with primarily inattentive symptoms (IA only) to show elevations greater than 2 SD (T ≥ 70) on the Inhibit scale (both p < .001; see Table 3). In contrast, elevations ≥2 SD on the Working Memory scale were more likely in children with greater inattention (IA only and Combined presentation vs. HI only; both p < .001). Consistent with these findings, the BRI composite score was more frequently elevated (T ≥ 70) in youth with hyperactive symptoms (HI only or Combined vs. IA only; both p < .001), whereas the CRI was more frequently elevated in youth with greater inattentiveness (IA only or Combined vs. HI only; both p < .001).
Base Rates of Scale Elevations T ≥ 70.
Note. Superscripts indicate significant differences between groups (identical letters mean group proportions are not statistically different [a, a; group proportions do not differ]; different letters indicate significant differences between group proportions [a, b, c; all groups are different, with a representing the lowest proportion]). IA only = inattentive only group; HI only = hyperactive/impulsive only group; Comb = combined group; EmoCtrl = Emotional Control; WM = Working Memory; Plan/Org = Plan/Organize; OrgMaterials = Organization of Materials; BRI = Behavioral Regulation Index; ERI = Emotional Regulation Index; CRI = Cognitive Regulation Index.
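The base rates above rest on the T-score metric, in which the normative mean is 50 and the standard deviation is 10, so that T ≥ 70 corresponds to an elevation of at least 2 SD. A minimal sketch of the conversion and of a base-rate calculation follows. The function names and the example scores are hypothetical, and actual BRIEF2 T scores are obtained from the publisher's age- and sex-based norms rather than computed this way.

```python
def t_to_sd_units(t_score):
    """Distance of a T score from the normative mean, in SD units."""
    return (t_score - 50) / 10


def base_rate(t_scores, cutoff=70.0):
    """Proportion of a group with T scores at or above the cutoff."""
    elevated = sum(1 for t in t_scores if t >= cutoff)
    return elevated / len(t_scores)


# Hypothetical T scores for one scale in one group.
example_scores = [72, 65, 70, 50]
rate = base_rate(example_scores)
```

In this toy group, two of the four scores reach the cutoff, so the base rate of elevation is 0.5.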
The scales associated with the new ERI showed a less consistent pattern: children rated as displaying high levels of inattention and hyperactivity (i.e., Combined group) were more likely than both the IA only and HI only groups to show clinically significant elevations on the Shift scale (both p = .009; IA only and HI only groups not significantly different from each other), with the Combined and HI only groups more likely than the IA only group to show elevations on the Emotional Control scale (p = .009). As such, the new ERI was more likely to be elevated in children with greater hyperactivity (i.e., Combined and HI only groups vs. the IA only group; both p < .001).
Between-Group Discrimination
The overall GEC composite accurately discriminated children with elevated symptoms of ADHD (any presentation) from those without (area under the curve [AUC] = .888; see Table 4). Discriminative power of specific clinical scales for identifying any symptom presentation versus the non-ADHD referred group and HI only versus IA only was examined across those scales showing the largest between group (i.e., ADHD symptoms vs. non-ADHD) effect size: Inhibit, Working Memory, and Organization of Materials.
Classification Accuracy Measures for Discriminating Between Groups With Selected Scales at T = 70.
Note. IA only = inattentive only group; HI only = hyperactive/impulsive only group; Sens = sensitivity; Spec = specificity; CA = classification accuracy; PPV = positive predictive value; NPV = negative predictive value; AUC = area under the curve; WM = Working Memory; Org = Organization of Materials; GEC = Global Executive Composite.
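The AUC values reported here can be read as the probability that a randomly chosen child from the target group has a higher scale score than a randomly chosen child from the comparison group, with ties counted as one half. The sketch below computes the AUC directly from that definition on made-up scores. The function name and data are hypothetical, and statistical packages typically use an equivalent rank-based calculation.

```python
def auc(pos_scores, neg_scores):
    """AUC as the proportion of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0      # positive case scored higher
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical T scores for a target group and a comparison group.
toy_auc = auc([60, 70], [60, 50])
```

An AUC of .5 indicates chance-level discrimination and 1.0 indicates perfect separation of the two groups.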
ADHD vs. non–ADHD
With regard to discriminating children in any of the ADHD symptom groups from the non-ADHD group, the Inhibit scale showed a good AUC (.806) and excellent specificity (see Table 4); over 96% of children in the non-ADHD group (i.e., those not reported to show clinically significant ADHD symptoms) had Inhibit scores below 70. However, the Inhibit scale correctly classified only 63.07% of the sample at T = 70 and only 69.8% of the sample at T = 65 (sensitivity = 53.88%, specificity = 90.91%).
In our clinical sample, the Working Memory scale showed a very good AUC of .872 and correctly classified 75.67% (T = 70) and 79.58% (T = 65) of the sample as showing elevated symptoms of ADHD or not. Examining specificity, almost 88% of children in the non-ADHD group had Working Memory scores below 70.
The Organization of Materials scale exhibited the third largest effect size in this clinical sample, with a good AUC (.834) and excellent specificity (95.79% at T = 70) for discriminating between children rated with elevated ADHD symptoms and those without.
HI only vs. IA only
With regard to discriminating children with predominantly hyperactive symptoms from those with predominantly inattentive symptoms, the BRIEF2 Inhibit scale showed good discrimination (AUC = .874) and accurately classified 83.9% of these youth at a T score of 70. Sensitivity was marginal with stronger specificity, and sensitivity improved at a T score of 65 (sensitivity = 80.52%; specificity = 74.68%; 75.64% correctly classified). Discriminating between restricted symptom presentations, the Working Memory scale accurately classified 72.25% (T = 70) and 81.36% (T = 65), with an AUC of .844. In contrast, the Organization of Materials scale exhibited a somewhat lower AUC (.784) and classification accuracy (49.79%) when discriminating between children with an HI only versus IA only presentation. However, specificity remained high (94.81%), indicating that almost 95% of children in the HI only group had Organization of Materials scale scores below 70.
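To make the relationship among these indices concrete, the sketch below computes sensitivity, specificity, classification accuracy, PPV, and NPV from the four cells of a two-by-two classification table. The cell counts are not taken from Table 4. They are an assumption, back-calculated from the reported group sizes (HI only n = 77; IA only n = 395) so that they reproduce the Inhibit figures at T = 65, with the HI only group treated as the positive class. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard accuracy indices from the cells of a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly flagged
        "specificity": tn / (tn + fp),   # negatives correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),           # flagged cases that are positive
        "npv": tn / (tn + fn),           # cleared cases that are negative
    }


# Assumed counts (illustration only): 62 of 77 HI only children at or above
# the cutoff, and 295 of 395 IA only children below it.
m = classification_metrics(tp=62, fn=15, tn=295, fp=100)
pct = {name: round(value * 100, 2) for name, value in m.items()}
```

With these assumed counts, sensitivity, specificity, and accuracy match the reported 80.52%, 74.68%, and 75.64%. The same counts show why specificity and predictive values should not be read interchangeably: specificity is conditioned on true group membership, whereas PPV and NPV are conditioned on the test result and depend on the relative sizes of the groups.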
Discussion
These findings provide very promising initial evidence of construct validity and clinical utility of the BRIEF2 in clinically referred children. Caregiver ratings on the BRIEF2 scales were specifically associated with reports of inattentive and hyperactive ADHD symptoms, such that the Inhibit scale was most likely to be elevated in children with greater hyperactivity, whereas the Working Memory scale was more likely to be elevated in children with high levels of inattention. Likewise, children with combined symptoms were likely to have elevated scores on both Inhibit and Working Memory. Notably, these scales were less likely to be elevated by high levels of symptoms in the opposite domains, suggesting specificity of prediction as well as evidence of convergent and discriminant validity of the revised BRIEF2 scales. As hypothesized, the BRIEF2 scales showed a pattern of double dissociation between the restricted presentations of ADHD symptoms, highlighted by the differences between the IA only and HI only groups and complemented by the Combined group in each case. As few, if any, existing studies on the original measure included samples of youth with a hyperactive only presentation, the present study also provides a unique examination of the BRIEF2 profile in children with the restricted hyperactive presentation.
These data further suggest that the BRIEF2 discriminates well between specific clinical groups. The overall composite score (GEC) is sensitive to the everyday executive dysfunction associated with ADHD, and discriminates well between referred children with high levels of ADHD symptoms and those without. However, the GEC does not discriminate well between ADHD presentations, and the profile across the clinical scales appears more useful for describing the differential impact of inattentive versus hyperactive symptoms. These findings are consistent with preliminary validity evidence for the measure from the test manual (Gioia et al., 2015), but extend those initial findings into a large sample of clinically referred children including those with a pattern of restricted hyperactive-impulsive symptoms.
Although some recent work suggests that ratings on the BRIEF may reflect broad behavioral impairment rather than executive dysfunction per se (McAuley et al., 2010), the present findings highlight a degree of precision in the pattern of overlap between the BRIEF2 scales and behavioral ratings of ADHD symptomatology. Specifically, the Inhibit and Organization of Materials scales show excellent specificity for discriminating between children with and without ADHD, whereas the Organization of Materials scale shows excellent specificity for discriminating between restricted inattentive and hyperactive ADHD presentations. Taken together, these findings suggest a pattern of shared functional impairment between the BRIEF2 and parent ratings of ADHD symptoms that is consistent with findings observed using the original measure in McAuley et al. (2010). It should be noted, however, that some of the item content on the BRIEF2 Inhibit and Working Memory scales corresponds closely to the DSM diagnostic criteria for ADHD hyperactive-impulsive and inattentive presentations, respectively. As such, the classification accuracy may simply reflect a degree of rater consistency across similar items. However, the present findings also highlight a pattern of dissociation in day-to-day behavioral impact of ADHD symptoms as measured by specific BRIEF2 scales. Although the pattern of elevations on the Inhibit and Working Memory scales in part reflects the core weaknesses thought to characterize the hyperactive and inattentive presentations of ADHD, respectively, this examination adds to our understanding of the relationships between ADHD symptoms and functional impact on informant reports of day-to-day executive behavior.
In the present sample of clinically referred youth, sensitivity of the BRIEF2 scales was consistently poorer than specificity, suggesting that the clinical scales may more accurately help to rule out ADHD symptomatology than rule it in. Furthermore, the overall classification accuracy across scales examined did not reach substantively above 75% (examining a cut score of T = 70) for discriminating those with any elevated ADHD symptoms from the non-ADHD comparison group. These findings are generally consistent with those presented in the test manual (Gioia et al., 2015, p. 260) using a cut score of T = 70 on the Inhibit scale to discriminate between children with inattentive versus combined type ADHD (classification accuracy: 68.4%-72.7%), but somewhat lower than the classification accuracy presented for the Working Memory scale in discriminating between children with ADHD and typical controls (T = 65; classification accuracy: 83.1%-84.8%). The lower sensitivity and classification accuracy in the present sample for discriminating between children with high levels of ADHD symptomatology and those without may reflect the referred nature of the present sample, in which a variety of symptom presentations, in addition to ADHD, may disrupt effective daily executive behavior. While the BRIEF2 may show greater sensitivity and classification accuracy in discriminating between typical controls and youth with ADHD, the purpose of clinical assessment is not usually so discrete.
Interestingly, in spite of the revisions to the BRIEF2 scale and index structure, the BRI and ERI do not appear to differ from one another in what they measure in this clinically referred sample. Within these groups, it is not clear that the modifications to the BRIEF2 index structure have resulted in better articulation of the constructs of behavioral self-regulation versus affective self-regulation, at least in children presenting with symptoms of ADHD. This differentiation, and the discriminatory power of the ERI, may be more evident in other clinical groups such as children with anxiety disorders or Autism Spectrum Disorders, given movement of the Shift scale to the ERI (e.g., Gioia et al., 2002; Hovik et al., 2017). Further work in other clinical groups will be needed to clarify the clinical utility of this new Index and how well it can be differentiated from the BRI.
This study is one of the first to examine the construct validity of the new BRIEF2. Strengths of the study include the large sample size and the use of a clinically referred cohort to examine ADHD symptom severity, including a clinically referred, non-ADHD comparison group. Furthermore, examination of restricted inattentive and hyperactive symptom presentations allowed for a more careful examination of the association between BRIEF2 scales and specific types of ADHD symptomatology. At the same time, limitations of the study include the clinical nature of the sample, which limited the availability of data from teachers/school settings for each participant. As such, group assignment was made based upon parent ratings of symptoms and impairment, as information about cross-setting symptom presentation from teachers was not consistently available. Additionally, we recognize that the use of restricted presentations is not entirely consistent with clinical diagnostic practice (e.g., exclusion of children with six symptoms in one domain and five in the other from analyses), which can be expected to impact the reported classification accuracy values. Furthermore, the clinical nature of the sample precluded comparison of the measure’s predictive power in discriminating referred from non-referred (e.g., typically developing) children. As such, additional examination of the BRIEF2’s validity in large samples of referred and typically developing children, with cross-informant symptom reports of both ADHD symptoms and executive dysfunction, will help to support and replicate these initial findings.
Taken together, the present study provides initial evidence for reliability and validity of the revised BRIEF2 for assessing behavioral symptoms of childhood impairment in day-to-day settings. While not designed to directly assess symptoms of ADHD, the BRIEF2 appears able to characterize accurately the pattern of self-regulatory weaknesses in children with clinically elevated ADHD symptoms, providing an important source of clinical information complementary to ADHD symptom ratings. Given these findings, the additional benefits of a shorter length and reduced administration time appear to have made the BRIEF2 a useful improvement over the original.
Footnotes
Acknowledgements
The authors gratefully acknowledge the assistance of Peter Isquith, PhD, and Jennifer Greene at PAR, Inc., in transcription of existing BRIEF item data onto the BRIEF2 item structure and provision of normed BRIEF2 scores from these original data.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the Kennedy Krieger Institute’s Intellectual and Developmental Disabilities Research Center U54-HD-079123.
