Abstract
Objective:
This study aimed to (1) examine benchmarks for the benefits of the Daily Report Card (DRC) within a therapeutic recreation setting, that is, the Summer Treatment Program (STP) and (2) explore differences in baseline characteristics and treatment outcomes among optimal and suboptimal responders. Benchmarks were examined for children’s DRC target behaviors using standardized mean difference (SMD) effect sizes (ES) across 2-week periods of the STP.
Method:
Participants were 38 children attending an STP.
Results:
Aside from teasing, all DRC targets showed improvement by the second 2-week period that was sustained through the third 2-week period. Optimal responders demonstrated greater improvement in parent-rated impairment and camp behaviors than suboptimal responders. Some baseline differences between responder groups were found.
Conclusion:
This study provides the first benchmarks for change in DRC targets within a therapeutic recreational setting, offering guidelines for treatment expectations. Implications for clinical decision-making, treatment planning, and future research are discussed.
A significant number of children with symptoms of ADHD experience impairment in academic, social, and/or behavioral functioning (e.g., Power et al., 2017). The daily report card (DRC) is among the most studied and effective behavioral interventions for addressing these impairments experienced by elementary school-aged children with or at risk for ADHD (Iznardo et al., 2020; Pyle & Fabiano, 2017; Vannest et al., 2010). Specifically, the DRC shapes a child’s behavior using feedback at the point of performance and contingency management principles (i.e., the contingent application of rewards for achieving a goal). When using a DRC, teachers or clinicians operationalize and target specific behaviors for change (e.g., noncompliance, interruptions), establish daily goals for success (e.g., six or fewer interruptions), give feedback to the child at the point of performance (“That was an interruption and will result in a mark on your card.”), and collaborate with the child’s parent to provide rewards contingent upon the child’s goal achievement. The goals for each target behavior are gradually changed over time (e.g., from six or fewer violations to four to two to one) to shape the child’s behavior until it matches age-appropriate expectations for the setting. Because the DRC can be used to shape a variety of behaviors (e.g., academic productivity, prosocial behavior such as helping, excessive complaining in the context of sports games), it represents an intervention that can be used in a variety of different contexts (e.g., Fabiano et al., 2010; Owens et al., 2012).
Indeed, the DRC has been used in therapeutic recreational settings and is an integral component of the Summer Treatment Program for ADHD (STP; Pelham et al., 2010). The STP is a well-established, evidence-based intensive treatment program for children with ADHD (for theoretical rationale, see Fabiano et al., 2014) that includes several intervention components delivered in the context of a recreational camp setting. Throughout the day, a behavioral point system is used to increase prosocial behaviors and decrease negative behaviors. Counselors use the point system data from the first week of camp to guide the selection of behaviors to be targeted on the child’s DRC (e.g., disruptive behaviors with the highest frequency or that are most impairing to the child’s ability to get along well with others and/or participate). In the four decades since its development, the STP has come to be implemented as a routine clinical service in numerous hospital and non-hospital settings across the United States (e.g., California, Florida, Kansas, Massachusetts, Ohio, Washington) and internationally. Given the expansion of the model as a routine clinical service, the STP represents an important context in which to study treatment outcomes for children with or at risk for ADHD.
Benchmarks for Behavioral Interventions
One method for understanding what can be expected in a given clinical context is benchmarking. A benchmark represents a comparator value that serves as a standard for judging the impact of a given service (Hunsley & Lee, 2007). For example, the amount of pre-post change achieved in an efficacy study can be used as an indicator (or benchmark) of what could be expected in an effectiveness trial or routine clinical practice (Spilka & Dobson, 2015; Weersing, 2005). Benchmarks also provide an indicator for how much change can be expected in a given behavior under specific treatment conditions within a given time frame (e.g., Holdaway et al., 2020; Owens et al., 2012). When benchmarks are available over multiple time points, such trajectories can assist clinicians with clinical decision making. For example, if short cycle (e.g., monthly) benchmarks for change are established, they can be used to (a) determine if a child is having an adequate initial response to an intervention, (b) inform whether the intervention should be reduced or intensified, and (c) monitor for adverse responses. This is an important endeavor, as reducing intervention intensity as soon as is warranted could reduce resource expenditures (e.g., personnel time devoted to a more intensive intervention than necessary). Similarly, increasing intervention intensity as soon as is warranted could result in the child receiving optimal care (rather than taking a “wait and see” approach). Setting-specific benchmarks allow us to evaluate factors that predict treatment success and subsequently use those factors to inform early treatment planning and personalize care over time. Two classroom studies reveal benchmarks for the magnitude of change in DRC target behaviors that can be expected monthly over a 4-month period (Holdaway et al., 2020; Owens et al., 2012). 
However, given the short duration of therapeutic summer camps like the STP (e.g., 6–8 weeks), benchmarks for shorter time frames (every 2 weeks) are needed for clinical decision making in this setting.
Benchmarks for the DRC in School Settings
Researchers have begun to establish benchmarks for DRC-specific behavior change in the classroom. For example, Owens et al. (2012) examined monthly benchmarks within a sample of elementary school students with or at risk of ADHD using standardized mean-difference (SMD) effect sizes (ES) over a period of 4 months. The most common target behaviors included interruptions, touching others (aggression), disrespect, off-task behavior, classroom rule violations, and being out of seat. Change in these behaviors showed large improvement in Month 1 of intervention (SMD ES = 0.78) with additional small incremental improvement from Month 1 to Month 2 (SMD ES = 0.22), maintenance between Month 2 and Month 3 (SMD ES = −0.02), and small incremental improvement between Month 3 and Month 4 (SMD ES = 0.21). The cumulative change in behavior across 4 months was large in magnitude (SMD ES = 1.16). Owens et al. (2012) described that students reduced their average daily instances of each target behavior by nearly half over this 4-month time frame (e.g., nine interruptions dropped to four and four instances of touching others to less than one). These findings were replicated by Holdaway et al. (2020) in a more diverse sample of elementary school students, further supporting the utility of school-specific benchmarks to evaluate a child’s response to treatment and make treatment decisions. Importantly, educators can now determine if a child’s early response to intervention is typical (i.e., a moderate to large change in the first month), make adjustments accordingly, and monitor for adverse responses. However, the extent to which these benchmarks generalize to recreational settings, like the STP, has not been evaluated.
Critically, benchmarks allow us to determine whether a child is responding to the DRC intervention as expected. It is also important to consider whether there are differential responses to the intervention and if those responses are related to change on distal-level outcomes (e.g., changes in symptoms and impairment). Holdaway et al. (2020) examined differences in treatment outcomes by categorizing children as optimal responders (i.e., cumulative SMD ES of 1.0 at Month 4 on at least 50% of academic or behavior targets) and non-optimal responders (i.e., did not meet this threshold). The authors did not find statistically significant differences in post-treatment teacher-rated symptoms and impairment between optimal and suboptimal responders in either category (academic or behavior). However, trends for the behavioral target response groups suggested that optimal behavior responders had lower post-treatment ratings of inattention (between-group effect size d = 0.95), and impairment in academic (d = 1.73), classroom (d = 1.93), and overall functioning (d = 0.81) than suboptimal behavioral responders. Thus, there is promise that change in proximal DRC behavioral targets is associated with distal ratings of symptoms and impairment, but replication is warranted. Trends also suggested that teacher implementation (i.e., percentage of target child rule violations to which they responded appropriately) was higher among optimal responders than suboptimal responders, suggesting that implementation integrity may affect benchmarks. This is consistent with other studies that show that greater intensity of use (e.g., DRC implemented for the entire day versus a portion of the day) is associated with better DRC outcomes (Vannest et al., 2010).
It is interesting to note that, across both studies (Holdaway et al., 2020; Owens et al., 2012), the largest change in behavior occurred between baseline and Month 1 and smaller changes occurred in subsequent months. This is an important pattern and highlights the utility of benchmarking as it can guide expectations (among clinicians, parents, and teachers) about how much change to expect. Although it is not entirely understood why this pattern occurred, we speculate that the large initial change in behavior represents the power of feedback at the point of performance and contingency management in shaping behavior. In addition, many children with ADHD are unaware of their own behavior (see Owens et al., 2007) and the impact of their behavior on others (e.g., interruptions, complaining). The DRC represents a concrete mechanism for raising the child’s awareness of these behaviors (i.e., the goal is reviewed daily; feedback is given at the point of performance). For these reasons, the initial response compared to baseline is large. However, making additional improvement after this initial change (i.e., to the normative range) may require more effort and seems to take time. We anticipate that these principles will apply to the benchmarks achieved in a recreational setting as well.
DRC Benchmarks in Therapeutic Recreational Settings
Establishing benchmarks for DRC outcomes within the STP is important for several reasons. First, relative to the school year, the STP is shorter in duration. Thus, clinicians need to be able to identify and adapt treatment plans quickly for suboptimal responders. Second, although the camp milieu represents a universal intervention for all campers, the DRC represents the primary mechanism for personalized intervention. With identified benchmarks, clinicians could more efficiently optimize the DRC to match each child’s initial response. Third, because implementation integrity may affect benchmarks (e.g., Holdaway et al., 2020; Vannest et al., 2010), we cannot assume that benchmarks from the school setting will apply to the STP setting. Namely, because STP counselors receive intensive training and supervision, implementation integrity may be higher than that observed in the school setting and behavioral change may occur faster or in greater magnitude. Lastly, it would be beneficial to know if there are baseline characteristics (e.g., parent and teacher ratings of symptoms and/or impairment) that are differentially associated with subsequent treatment response, permitting early identification of children needing more intensive care. Taken together, context-specific benchmarks have the potential to enhance the efficiency and effectiveness of STPs across the nation.
Current Study
Although researchers have identified monthly benchmarks for the effectiveness of the DRC in the school setting, benchmarks for clinical progress have not been established in the STP. Given that the STP is a well-established, evidence-based treatment offered in multiple states around the country, studying the DRC in this setting can be useful to clinicians as they make data-driven decisions regarding treatment. The development of context-specific DRC benchmarks can guide decisions about whether the intervention should be faded, maintained, or modified. Further, it is important to consider characteristics of responders to the DRC within this setting. The goals of this study were to (1) examine benchmarks associated with the DRC across each 2-week period of the STP using standardized mean difference effect sizes and (2) explore possible differences in baseline characteristics and treatment outcomes among optimal and suboptimal DRC responders.
Method
Participants
Participants were 38 children who participated in a 7-week STP in the Midwestern United States (see Table 1 for characteristics). In total, 86.4% of families attending the program consented to participate in this study. To be eligible, children had to either (a) meet criteria for ADHD or (b) be considered at risk for ADHD (i.e., at least four symptoms plus impairment). Because this program was offered primarily as a clinical service, we did not conduct structured diagnostic interviews. Instead, parent and teacher ratings of symptoms and impairment were used to determine whether children met criteria for ADHD. Children with comorbid diagnoses (e.g., oppositional defiant disorder, anxiety disorders, autism spectrum disorders) participated in the program. Exclusionary criteria included behaviors or diagnoses that impacted the child’s ability to safely participate in group activities (e.g., history of severe aggression or elopement, sexualized behavior, and psychosis).
Demographic Characteristics of Participants (N = 38).
Note. Values are n (%) unless indicated otherwise. Parent and teacher ratings used to describe symptomology were missing for 2.5% (n = 1) of the sample. Parent and teacher ratings of symptoms and impairment were used to determine ADHD, ODD, and CD diagnoses.
Grade represents the most recently completed grade before enrollment in the program.
Procedures
The study was approved by the institutional review boards at Children’s Mercy Kansas City and Ohio University. Parents provided written consent and children provided assent for their data to be used in the current study. Clinical procedures did not differ for children participating in the study. The STP was provided as a clinical service and was advertised at local clinics and schools. Interested families contacted the clinic, engaged in an initial phone screen with a psychosocial nursing coordinator, and completed parent and teacher rating scales. If no exclusion criteria were met, children and parents were invited to participate in an eligibility evaluation which involved a review of rating scales and a psychosocial history interview. The eligibility evaluations were conducted by three licensed clinical psychologists.
Summer treatment program
Intervention components used in the STP, including social reinforcement, effective commands, a token-economy behavioral point system, time out, DRC, and a home reward system, were implemented over a 7-week period during recreational activities, social skills training, and academic activities (for details see Pelham et al., 2010). Camp hours were 8:30 am to 4:00 pm daily and children spent 4 hours daily in recreational activities (e.g., sports games or skill drill periods). Throughout daily camp activities, counselors tracked frequencies of child behaviors consistent with the behavioral point system in 15-minute intervals. Frequencies of the point system behaviors were recorded and entered daily into a clinical database. Counselors were undergraduate students who participated in a 2-week training consistent with standard training procedures (Pelham et al., 2010). Advanced-level clinicians completed fidelity observations and counselors completed quizzes periodically throughout the summer to assess their knowledge of the manualized procedures. During the last week of the STP, parent ratings of symptoms and impairment were collected as treatment outcome measures.
DRC intervention
DRCs were created and modified following the standardized shaping procedures in the STP manual. These procedures are consistent with principles followed in DRC benchmarking studies (Holdaway et al., 2020; Owens et al., 2012). During the first week of camp, counselors identified and defined three to four behaviors that likely contributed to child impairment. Daily behavioral frequencies of these target behaviors during the first week (i.e., the baseline phase) were used to establish the initial goal criterion for each behavior during the second week of camp (i.e., beginning of intervention). Generally, initial goals and later modifications were shaped by setting the goal to be a 20% improvement over the baseline rate (or most recent goal) of the target behavior. Goals for each behavior were evaluated every week by doctoral-level clinicians and modified until the child’s behavior matched age-appropriate expectations for the setting using normative data provided in the program manual. Regarding the termination of behavior targets, we followed standardized guidelines, which included termination following mastery or termination to address a higher priority behavior. The DRC was implemented for each child for the 6 weeks of camp following the collection of baseline data during the first week.
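The 20% shaping rule described above can be illustrated with a brief sketch. It is an illustration under stated assumptions rather than the STP manual's procedure: the function name `next_goal` and the `lower_is_better` flag are hypothetical, the result is left unrounded, and in practice clinicians also consulted the manual's normative data when revising goals.

```python
def next_goal(reference_rate, improvement=0.20, lower_is_better=True):
    """Return a goal that is `improvement` better than the reference rate.

    reference_rate: baseline daily frequency, or the most recent goal.
    lower_is_better: True for behaviors to reduce (e.g., interruptions),
        False for behaviors to increase (e.g., contributions).
    Rounding to a whole count is left to the clinician (assumption).
    """
    if reference_rate < 0:
        raise ValueError("reference_rate must be non-negative")
    if lower_is_better:
        return reference_rate * (1 - improvement)
    return reference_rate * (1 + improvement)


# Hypothetical example: a baseline of 10 interruptions per day.
first_goal = next_goal(10)           # roughly 8 or fewer per day
second_goal = next_goal(first_goal)  # roughly 6.4 or fewer per day
```

Applying the function repeatedly to the most recent goal yields the stepwise tightening of criteria that shaping relies on.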
Beginning in the second week, counselors reviewed the goals with the child at the start of the day, gave feedback to the child at the point of performance throughout the day, reinforced alternative desirable behaviors, and reviewed success at the end of the day with the child and the parents. Performance on the DRC throughout the week was the main determinant of access to camp activities each Friday. On Fridays, children either earned access to a special activity (met ≥75% of DRC targets for at least 3 days), regular camp activities (met ≥50% of DRC targets on at least 3 days), or 30 minutes of chores followed by regular camp activities (met <50% of DRC targets on 3 days). During group parent training sessions, parents were taught how to provide contingent rewards for camp behavior and clinical staff provided feedback to parents as they established their home reward system. Subsequently, parents were prompted daily by their child’s lead counselor to provide rewards based on their child’s performance.
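The Friday activity rule described above can be expressed as a simple decision function. The sketch below is hypothetical: the function name and return labels are illustrative, the input is assumed to be the proportion of DRC targets met on each counted day of the week, and any week that reaches neither of the two higher tiers is assumed to fall into the chores tier.

```python
def friday_activity(daily_share_met):
    """Map one week of daily DRC results to a Friday activity tier.

    daily_share_met: proportion of DRC targets met on each counted
        camp day of the week (values from 0.0 to 1.0).
    """
    days_at_75 = sum(1 for share in daily_share_met if share >= 0.75)
    days_at_50 = sum(1 for share in daily_share_met if share >= 0.50)
    if days_at_75 >= 3:
        return "special activity"
    if days_at_50 >= 3:
        return "regular camp activities"
    # Assumption: weeks below both thresholds fall into this tier.
    return "chores, then regular camp activities"
```

Checking the stricter tier first ensures that a child who qualifies for the special activity is not assigned the regular tier.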
Measures
DRC data
Descriptive information regarding the implementation of the DRC was gathered, including length of implementation, number of target behaviors, and types of target behaviors. Within the current sample, the following point system behaviors were included as DRC target behaviors: (1) attention questions answered correctly; (2) contributing to a group discussion; (3) poor sportsmanship violations; (4) intentional aggression; (5) noncompliance; (6) verbal abuse to staff; (7) teasing; (8) interruptions; (9) complaining; and (10) classroom rule violations. Because DRC implementation began in the second week, if a target was applied for the duration of camp, there would be data for 22 days (i.e., days implemented following the baseline period, minus holidays). Behavioral frequencies for each of a child’s target behaviors over the course of the 7-week camp were used in the calculations for benchmarks (Aim 1).
Disruptive Behavior Disorder Rating Scale (DBD)
Prior to the eligibility evaluation, parents and teachers completed the DBD (Pelham et al., 1992), a 45-item measure of ADHD symptoms including inattention, hyperactivity, and impulsivity. The measure also assesses for comorbid symptoms related to oppositional defiant disorder (ODD) and conduct disorder (CD). Respondents rate the frequency of the symptoms on a 4-point scale (0 = not at all to 3 = very much) and symptoms are considered to be present for ratings with a 2 or 3. The DBD has demonstrated strong psychometric properties (Pelham et al., 2005). The DBD was used in the current study to describe the symptoms of ADHD, ODD, and CD in the sample. Parent and teacher ratings obtained pre-treatment (i.e., during the eligibility evaluation) and parent ratings obtained at post-treatment (last week of camp) were used to examine differences in baseline ADHD symptoms and treatment outcomes between optimal and suboptimal DRC responders (Aim 2).
Impairment Rating Scale (IRS)
Prior to the eligibility evaluation, parents and teachers completed the IRS (Fabiano et al., 2006), a 7-item measure of impairment related to peer relationships, sibling relationships, relationships with caregivers/teachers, academics, self-esteem, and overall family and school functioning. Respondents rate areas of impairment on a continuous 7-point scale (0 = no problem to 6 = extreme problem) and impairment is considered to be present for ratings greater than or equal to 3. The IRS has demonstrated strong psychometric properties (Fabiano et al., 2006). The IRS was used in the current study in conjunction with the DBD to describe children in the sample with elevated levels of ADHD. To be considered as having elevated levels of ADHD, children had to exhibit marked impairment (i.e., a score of 3 or higher) across settings as rated by parents and teachers. Parent and teacher ratings obtained pre-treatment (i.e., during the eligibility evaluation) and parent ratings obtained at post-treatment (i.e., during last week of camp) were used to examine differences in baseline impairment and treatment outcomes between optimal and suboptimal DRC responders (Aim 2).
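The scoring thresholds described for the DBD and IRS can be summarized in a short sketch. The function names are hypothetical, and the code applies to a single rater's responses; how parent and teacher reports were combined is not specified here, so no combination rule is implemented. The `min_symptoms` default follows the at-risk criterion stated under Participants (at least four symptoms plus impairment).

```python
def symptom_count(dbd_ratings):
    """Count DBD items rated 2 or 3 on the 0-3 scale (symptom present)."""
    return sum(1 for rating in dbd_ratings if rating >= 2)


def impairment_present(irs_ratings):
    """Return True if any IRS domain is rated 3 or higher (0-6 scale)."""
    return any(rating >= 3 for rating in irs_ratings)


def at_risk_for_adhd(adhd_item_ratings, irs_ratings, min_symptoms=4):
    """Apply the at-risk rule to one rater's ADHD items and IRS ratings."""
    return (symptom_count(adhd_item_ratings) >= min_symptoms
            and impairment_present(irs_ratings))
```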
STP point system behaviors
For our analyses, the following daily frequencies of point system behaviors were analyzed in Aim 2: (1) percentage of intervals earned following activity rules; (2) rule violations; (3) classroom rule violations; (3) noncompliance; (4) interruptions; (5) complaining; (6) conduct problems (i.e., lying, stealing, destruction of property, physical aggression); (7) negative verbalizations (i.e., verbal abuse to staff, teasing, cursing/swearing); and (8) attention. To examine differences in baseline behavior and treatment outcomes related to camp behavior between optimal and suboptimal DRC responders (Aim 2), we calculated the average number of the above-mentioned behaviors for each child for each week.
Data Preparation and Analytic Strategies
To describe the use of the DRC within the current sample, we calculated descriptive statistics for the average number of target behaviors on each DRC, duration of implementation, and the types of target behaviors. To be included in the analyses to identify benchmarks (Aim 1), behavioral targets had to have a nonzero baseline-period standard deviation, which is required to compute effect sizes. To compare changes in target behavior to available benchmarks (Holdaway et al., 2020; Owens et al., 2012), we calculated standardized mean difference effect sizes (SMD ES) across three 2-week periods of the STP. Thus, each SMD ES represents the difference between the mean of a given 2-week period and the mean of the baseline period, divided by the standard deviation of the baseline period. Effect sizes were interpreted according to Cohen’s (1988) standards, where <0.20 is considered a small change, 0.20 to 0.60 a moderate change, 0.60 to 0.80 a large change, and above 0.80 a large to very large change.
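The SMD ES computation described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function name and the `lower_is_better` flag are hypothetical, the sample standard deviation is assumed, and the sign is oriented so that larger values indicate improvement for behaviors targeted for reduction.

```python
from statistics import mean, stdev


def smd_es(baseline, period, lower_is_better=True):
    """Standardized mean difference of a period against baseline.

    baseline, period: daily frequencies of one target behavior.
    Returns None when the baseline SD is undefined or zero, mirroring
    the exclusion of such targets from the benchmark analyses.
    """
    if len(baseline) < 2 or not period:
        return None
    baseline_sd = stdev(baseline)  # sample SD (assumption)
    if baseline_sd == 0:
        return None
    difference = mean(period) - mean(baseline)
    if lower_is_better:
        # Flip the sign so that a reduction counts as improvement.
        difference = -difference
    return difference / baseline_sd


# Hypothetical data: daily interruptions at baseline and in one period.
es = smd_es([8, 10, 12, 10, 10], [6, 8, 7, 7])
```

Dividing by the baseline standard deviation alone, rather than a pooled value, keeps the denominator fixed across periods so that effect sizes from different periods are directly comparable.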
To categorize optimal responders and suboptimal responders (Aim 2), we completed a two-step process similar to that used by Holdaway et al. (2020). First, we examined each target behavior and categorized the target as having an optimal response if it had a cumulative SMD ES greater than or equal to 1 when the target was terminated. Then, each child was categorized as an optimal responder if at least 50% of their target behaviors had been identified as showing an optimal response in the first step. All target behaviors for which an effect size was calculated (i.e., those with a nonzero baseline-period standard deviation) were included in the categorization of optimal and suboptimal responders. This process resulted in 15 children (39%) categorized as optimal responders and 23 children (61%) categorized as suboptimal responders.
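The two-step categorization can be sketched as below. The function name and defaults are hypothetical; targets without a computable effect size are represented as `None` and skipped, and a child with no scorable targets is left unclassified (an assumption, since that case is not described).

```python
def is_optimal_responder(target_effect_sizes, threshold=1.0, min_share=0.5):
    """Classify a child from the cumulative SMD ES of each DRC target.

    target_effect_sizes: one cumulative SMD ES per target, or None
        when no effect size could be calculated for that target.
    Returns True (optimal), False (suboptimal), or None (unclassified).
    """
    scored = [es for es in target_effect_sizes if es is not None]
    if not scored:
        return None
    # Step 1: flag each target with an optimal response.
    optimal_targets = sum(1 for es in scored if es >= threshold)
    # Step 2: require at least half of the scored targets to be optimal.
    return optimal_targets / len(scored) >= min_share
```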
To examine differences in baseline characteristics between optimal and suboptimal responders (Aim 2), we conducted independent samples t-tests and chi-square tests on pre-treatment variables and camp behaviors during the first week. To examine differences in treatment outcomes between responder groups (Aim 2), we conducted three 2 (Time: pre-treatment/post-treatment) × 2 (Group: optimal/suboptimal) multi-variate analyses of variance (MANOVA) tests: one that included the four symptom subscales from the parent-rated DBD, one that included five impairment scores from the parent-rated IRS, and one that examined eight point system behaviors (described above). Lastly, we calculated Cohen’s d effect sizes representing pre- to post-treatment change in symptoms, impairment, and camp behaviors.
Results
Description of DRC Target Behaviors
Children had an average of 7.71 (SD = 1.68) DRC target behaviors (range = 5–12). We analyzed data for the most common target behaviors: attention questions answered correctly (n = 32; average duration = 17.66 days), contributions to group discussions (n = 30; average duration = 11.23 days), interruptions (n = 27; average duration = 16.33 days), teasing (n = 22; average duration = 15.45 days), and classroom rule violations (n = 21; average duration = 16.64 days). Other behaviors (e.g., complaining, verbal abuse, noncompliance, poor sportsmanship, property destruction, and aggression) were less frequently targeted (all ns < 12); thus, these target behaviors were not included in the analyses. Regarding the termination of targets, 34% of target behaviors were terminated due to mastery and only 4.8% were terminated to address another behavior prior to mastery.
Aim 1: Effect Size Benchmarks for DRC Target Behaviors
The average SMD ESs for each 2-week period are presented in Table 2. In Period 1, improvements were seen for three of five target behavior types (contributions, interruptions, classroom rule violations). Aside from teasing, all target behaviors showed improvement in Periods 2 and 3 (ES ranged from 0.17 to 5.90).
Standardized Mean Difference Effect Sizes (SMD ES) for the Most Common DRC Behavior Targets Across 2-Week Periods of the STP.
Note. The number of eligible behavioral targets varies by 2-week period. Larger SMD ESs indicate greater improvement.
Attention questions answered correctly: Ns for Periods 1, 2, and 3 were 32, 31, and 23, respectively.
Contributions to group discussions: Ns for Periods 1, 2, and 3 were 30, 16, and 8, respectively.
Interruptions: Ns for Periods 1, 2, and 3 were 27, 24, and 18, respectively.
Teasing: Ns for Periods 1, 2, and 3 were 22, 18, and 11, respectively.
Classroom rule violations: Ns for Periods 1, 2, and 3 were 21, 18, and 15, respectively.
Aim 2: Differences Between Optimal and Suboptimal Responders
Differences in baseline characteristics
Independent samples t-tests and chi-square tests were conducted to examine differences in baseline characteristics for optimal and suboptimal responders (see Table 3). There were no significant differences in child age, t(36) = 0.59, p = .56. Girls were more likely to be suboptimal responders than optimal responders (Pearson chi-square = 3.91, p < .05) compared to boys who were more evenly distributed across groups. Significant differences in pre-treatment teacher-reported symptoms revealed that teachers’ ratings for optimal responders were more severe than for suboptimal responders for (1) hyperactivity/impulsivity, t(34) = −2.8, p < .01; (2) ODD symptoms, t(34) = −0.45, p < .05; and (3) CD symptoms, t(34) = −2.35, p < .05. The groups were not significantly different at baseline on any domain of parent- or teacher-rated impairment, or on parent-rated symptoms (see Table 3). Further, there were no significant differences in camp behavior (i.e., following activity rules, rule violations, noncompliance, complaining, interruptions, conduct problems, negative verbalizations, and attention) during the first week between the suboptimal and optimal responder groups (see Table 4).
Parent- and Teacher-Rated Symptoms and Impairment and Camp Behavior Among Optimal and Suboptimal Responders.
Note. FAR = percentage of intervals earned following activity rules; RV = rule violations; NC = noncompliance; INT = interruptions; C/W = complaining/whining; CDP = Conduct problems; NEGV = negative verbalizations; ATT = percentage of standardized attention questions answered correctly.
Denotes significant differences (p < .05) between groups at baseline.
Denotes significant main effect of time.
Denotes significant group × time effect.
Means and Standard Deviations of STP Point System Behaviors for DRC Responder Groups Across Time.
Note. FAR = Percentage of intervals earned following activity rules; RV = rule violations; NC = noncompliance; INT = Interruptions; C/W = complaining/whining; CDP = conduct problems; NEGV = negative verbalizations; ATT = percentage of standardized attention questions answered correctly.
Differences in treatment outcomes
For parent-rated symptoms, there was no statistically significant multi-variate group by time interaction effect, F(4, 30) = 1.01, p = .42, partial η2 = 0.12. There was a significant multi-variate main effect of time, F(4, 30) = 11.83, p < .01, partial η2 = 0.61. Univariate follow-up ANOVAs indicated that there was a significant reduction in parent-reported (1) inattention, F(1, 33) = 37.49, p < .05, partial η2 = 0.53; (2) hyperactivity/impulsivity, F(1, 33) = 43.07, p < .05, partial η2 = 0.56; and (3) ODD symptoms, F(1, 33) = 10.62, p < .05, partial η2 = 0.24 for both optimal and suboptimal responders (see Table 3 for effect sizes representing the magnitude of within-group change over time).
For parent-reported impairment, there was a statistically significant multi-variate group by time interaction, F(5, 23) = 3.13, p < .05, partial η2 = 0.41. Univariate follow-up ANOVAs indicated that the significant interaction was observed for (1) impairment in peer relationships, F(1, 27) = 6.96, p < .05, partial η2 = 0.21; (2) impairment in self-esteem, F(1, 27) = 7.70, p < .05, partial η2 = 0.22; and (3) overall impairment, F(1, 27) = 4.93, p < .05, partial η2 = 0.15. Optimal responders experienced a greater reduction in these domains of impairment over time than did suboptimal responders (see Figure 1; see Table 3 for effect sizes).

Change in Parent-Rated Impairment Among Optimal and Suboptimal Responders.
Regarding camp behaviors, there was a statistically significant multi-variate group by time interaction, F(8, 28) = 5.56, p < .01, partial η2 = 0.61. Univariate follow-up ANOVAs indicated that the significant interaction was observed for (1) following activity rules, F(1, 35) = 12.99, p < .01, partial η2 = 0.27; (2) complaining, F(1, 35) = 10.62, p < .01, partial η2 = 0.23; and (3) negative verbalizations, F(1, 35) = 5.48, p < .05, partial η2 = 0.14. Optimal responders demonstrated a greater increase in following activity rules over time than suboptimal responders. Optimal responders also demonstrated a reduction in complaining and negative verbalizations, whereas suboptimal responders showed an increase in these behaviors (see Table 3 for effect sizes and Table 4 for means and standard deviations of behavior frequencies for each week).
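The partial η2 values reported in this section can be cross-checked against the accompanying F statistics with the standard identity partial η2 = (F × df1) / (F × df1 + df2). The sketch below is offered as a reader's consistency check rather than the authors' analysis code; the function name is hypothetical, and for the multivariate tests the identity is assumed to hold because each test compares two groups across two time points.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Group x time interaction for camp behaviors, F(8, 28) = 5.56.
print(round(partial_eta_squared(5.56, 8, 28), 2))   # 0.61
# Univariate follow-up for following activity rules, F(1, 35) = 12.99.
print(round(partial_eta_squared(12.99, 1, 35), 2))  # 0.27
```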
Discussion
The current study advances our knowledge about the effectiveness of behavioral interventions across clinical contexts by providing the first benchmarks for the benefits of the DRC in the STP. Aside from teasing, all target behaviors showed moderate to large improvement by the second 2-week period with continued improvement through the third period. Some baseline differences between optimal and suboptimal responders were found, and optimal DRC responders demonstrated greater improvement in parent-rated impairment and camp behaviors than did suboptimal responders. These findings offer a reference point for behavior change that can be expected of children with a DRC in therapeutic recreational settings and have the potential to assist STP practitioners in determining when additional individualized approaches may be needed based on early response to intervention. We discuss these results in relation to previous DRC benchmarking studies, interpret them in the context of the STP therapeutic milieu, and discuss implications for research and practice.
Similar to previous DRC benchmark studies (Holdaway et al., 2020; Owens et al., 2012), immediate improvement and moderate to very large effect sizes were observed for three of five DRC target behaviors (i.e., contributions, interruptions, and classroom rule violations). These findings suggest that if a child has not made sufficient progress on their target behaviors (in the context of standard shaping procedures) by the third week of the STP, clinicians would be wise to consider intensifying the DRC intervention (e.g., making the goals more salient through the use of pictures, modifying schedules of feedback or reward) and/or adding other interventions. For attention check questions, small to moderate improvements were observed, but not until the second and third 2-week periods of the STP. This may epitomize the challenges that children with ADHD have in paying attention to activity details in a new setting and the time it takes for children to acclimate to such details. Future studies should examine whether greater or more rapid change in this behavior can be achieved by altering the methods used by the STP (e.g., how staff orient children to activities; greater focus on teaching skills for paying attention).
In contrast to all other target behaviors, teasing worsened over the course of the camp. The reason for the worsening is unclear. Increases in teasing behaviors over time may be related to deviance training (Dishion et al., 1996); however, the likelihood of deviance training is significantly reduced in the context of intensive behavioral treatment like that of the STP (Helseth et al., 2015). It is possible that teasing increased as a function of peer familiarity over time. Namely, in the context of competitive games, children may be more likely to tease as they get to know one another better. Alternatively, increases in frequencies of teasing behavior may reflect improvements in counselors’ ability to detect nonverbal forms of teasing (e.g., excessive, unwanted hugging, relational aggression) as they become increasingly familiar with children over time and as these behaviors become increasingly intolerable for peers. Within classroom settings, researchers have found that additional interventions that target peer processes (e.g., social devaluation and exclusion) may be needed to improve social behavior in children with ADHD (e.g., Mikami & Normand, 2015). Further research is needed to determine whether the teasing behavior response to the DRC is similar across recreational settings and, if so, to identify effective strategies for reducing these problematic behaviors.
Optimal DRC responders in the current study had higher teacher-rated symptoms of hyperactivity/impulsivity, ODD, and CD at baseline compared to suboptimal responders. There were no baseline differences between optimal and suboptimal responders in age, teacher- and parent-rated impairment, or camp behavior during the first week. However, a significant effect for gender was observed such that girls were more likely to be classified as suboptimal responders. While it is encouraging that children with both high and low levels of impairment at baseline respond optimally to the DRC in the STP, it is concerning that females were less likely to be classified as optimal responders. Perhaps this finding occurred because girls exhibited fewer of the disruptive behaviors included in the current study at the outset of treatment, making it difficult for them to achieve the very large effect sizes of improvement across more than half of their behavioral targets required to meet our conservative definition of optimal response. Future studies are needed to evaluate gender differences in treatment trajectories over time and continue to explore whether there are other measures of baseline characteristics that are better predictors of subsequent treatment response.
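The floor-effect explanation offered above can be illustrated with a minimal sketch. It assumes a standardized mean difference of the form (baseline mean - later mean) / SD for behaviors that are expected to decrease, and it treats the cutoff for a "very large" effect as a caller-supplied parameter. Neither the exact effect size formula nor the cutoff is restated in this section, so the function names, the threshold, and every number below are hypothetical and chosen only for illustration.

```python
def smd(baseline_mean, later_mean, sd):
    """Standardized mean difference for a behavior expected to decrease.

    Positive values indicate improvement (fewer occurrences later).
    The choice of standardizer (sd) is an assumption of this sketch.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (baseline_mean - later_mean) / sd


def is_optimal_responder(effect_sizes, very_large_threshold):
    """True if more than half of the targets meet the (hypothetical) threshold."""
    met = sum(es >= very_large_threshold for es in effect_sizes)
    return met > len(effect_sizes) / 2


# Hypothetical daily frequencies: same standardizer, different starting points.
high_baseline_es = smd(baseline_mean=10.0, later_mean=2.0, sd=4.0)
low_baseline_es = smd(baseline_mean=1.0, later_mean=0.0, sd=4.0)

# A child who starts near zero cannot produce a large effect size,
# even after eliminating the behavior entirely.
print(high_baseline_es, low_baseline_es)
```

Under these assumptions, a child whose baseline frequency is already near zero has a ceiling on the attainable effect size, which is one way a conservative responder definition could classify low-frequency children as suboptimal regardless of their actual progress. For behaviors expected to increase (e.g., contributions), the sign of the difference would be reversed.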
The current study demonstrates that there are some similarities in treatment response to the DRC in the school and STP settings (i.e., immediate moderate to very large improvements in behavior, with additional small improvements over the course of treatment). However, there are some significant differences in terms of baseline characteristics of optimal and suboptimal responders, indicating that school-based benchmarks are not necessarily applicable to the STP setting. Replication of these findings within another STP site is needed to better understand differences in treatment response to the DRC in that context. Importantly, significant improvements in parent-reported symptoms, impairment, and camp behaviors were observed for both optimal and suboptimal responders, although the magnitude of improvement was greater for optimal responders. Notably, optimal DRC responders demonstrated significant improvement on parent-reported impairment in multiple domains that reached the point of non-clinical concern (e.g., scores less than a 3 on the IRS; see Figure 1), whereas suboptimal responders did not. This suggests that change in proximal indicators of target behavior may well represent what can be expected of change in distal indicators of symptom and impairment rating scales. These findings provide additional support for the promise of personalized interventions, such as the DRC, in which treatment plans are adjusted based on individual response over time.
Limitations
Although the current study is the first of its kind to develop benchmarks for the DRC in a therapeutic recreational setting, there are limitations to consider. First, the sample size was small (n = 38), and children were not randomly assigned to the DRC intervention, which limits our ability to interpret differences between optimal and suboptimal responders. Second, it is impossible to know whether medication changes contributed to the current study’s findings. Some children within the study were actively participating in medication consultations throughout camp. Third, a small number of children with severe behaviors received specialized programing in addition to the DRC (e.g., use of tangible reinforcement, such as poker chips, with some young children to communicate contingencies). It is possible that specialized programing contributed to treatment outcomes above and beyond the DRC for these children. Fourth, the number of cases within each analysis differed, reflecting the naturalistic study design. Future studies that use a broader range of target behaviors with a consistent number of cases across analyses need to be conducted. Lastly, within our clinical database, we only had access to limited data from the classroom (i.e., overall rule violations) and did not have specific information about the nature of the rule violations that would be helpful in comparing benchmarks across contexts (e.g., recreational and classroom settings).
Conclusion
The current study provides the first benchmarks for the DRC intervention within a therapeutic recreational setting (i.e., the STP) and reveals characteristics that differ between optimal and suboptimal responders in this setting, extending the previous school-based DRC literature. The findings also provide confidence that change in proximal behaviors is associated with change in distal behaviors. Current technologies (see Mixon et al., 2019) could be adapted and leveraged by clinicians to make access to effect size benchmarks more efficient and to promote data-driven intervention decisions. Additional research is needed to explore (a) strategies that can be used in the STP to produce greater change in attention and teasing, (b) whether females are truly at risk for attenuated treatment response or whether these findings are specific to the current study, and (c) whether other baseline measures (e.g., parenting efficacy; medication history) better predict subsequent treatment response.
Footnotes
Author Note
Allison K. Zoromski is now affiliated with the Department of Pediatrics, Division of Behavioral Medicine and Clinical Psychology, Cincinnati Children’s Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH 45229-3039, USA.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
