Abstract
To address the variability of naturalistic developmental behavioral intervention outcomes, the current study sought to isolate the effects of the instructional strategies of caregiver-mediated naturalistic developmental behavioral interventions. In this comparative efficacy trial, mothers of 111 autistic children (18–48 months) were randomized to learn one of two sets of naturalistic developmental behavioral intervention language facilitation strategies (responsive or directive). We aimed to characterize the effect of strategy type on language outcomes and explore the extent to which joint engagement outcomes mediated language outcomes. Children in the directive condition had significantly greater scores across multiple language assessments. At follow-up, the effect of strategy type on the frequency of spontaneous directed communication acts was fully mediated by coordinated joint engagement (indirect effect = −2.070, 95% CI = [−4.394, −0.06], p < 0.05). Thus, children may benefit from caregiver prompts to facilitate long-term language outcomes. The current study is an initial step in the identification of the mechanisms of caregiver-mediated NDBIs.
Lay abstract
Caregiver-mediated early interventions support caregivers’ use of strategies to improve their young autistic child’s communication. In the current clinical trial, we sought to isolate the most effective strategies to improve short-term and long-term child communication outcomes. Results demonstrated how children may benefit from caregiver prompts to facilitate long-term language outcomes. In conclusion, the current study improves our understanding of how early intervention facilitates child communication outcomes.
Caregiver-mediated naturalistic developmental behavioral interventions (NDBIs) are a commonly implemented class of early interventions to improve autistic toddlers’ social communication and language development (Schreibman et al., 2015). Such outcomes have cascading effects on academic achievement, occupational success, and quality of life (Billstedt et al., 2005; Zaidman-Zait et al., 2021). Despite the widespread use of caregiver-mediated NDBIs, child outcomes remain variable (Crank et al., 2021; Fuller & Kaiser, 2019; Sandbank et al., 2020; Tiede & Walton, 2019). To address the variability of child outcomes, Vivanti et al. (2018) called for the identification of intervention factors and caregiver–child factors that may influence the efficacy of caregiver-mediated NDBIs. The overall goal of this line of research is to understand how and why NDBIs facilitate outcomes and to identify which caregiver–child dyads may benefit from NDBIs.
Caregiver-mediated NDBIs are based on developmental and behavioral learning theories. The theoretical framework informs the two broad sets of instructional strategies implemented across all NDBIs: responsive and directive language facilitation strategies (Schreibman et al., 2015). Responsive strategies are based on developmental theory (Carpendale & Lewis, 2004; Tomasello, 2010; Vygotsky & Cole, 1978), in which the caregiver follows the child’s lead and responds to the child’s communication. Directive strategies are based on behavioral learning theory (Carnett et al., 2019; DeSouza et al., 2017; Lovaas & Smith, 1989; Skinner, 1957), in which the caregiver prompts for child language using a three-part contingency approach. Given that all NDBIs include both responsive and directive strategies, it is critical to understand the relative impact of each set of strategies in facilitating child outcomes. In fact, Schreibman et al. (2015) called for the empirical analysis of the active ingredients of multicomponent NDBIs to determine the impact of strategies included in NDBIs. The current study employs a novel, experimental approach to understanding the relative impact of NDBI strategies by randomly assigning caregivers to learn responsive or directive language facilitation strategies.
Caregivers’ propensity to learn and implement different language facilitation strategies may contribute to variability in child outcomes. Thus, the primary aim of the current clinical trial addressed the extent to which the strategy type (i.e. responsive or directive) influenced caregivers’ use of language facilitation strategies (Roberts et al., 2022). Results of the primary aim of the clinical trial demonstrated that caregivers taught to use responsive strategies demonstrated greater proficiency in strategy use than caregivers taught to use directive strategies (d = 0.90, 95% CI = [0.47, 1.32]). In order to understand which caregivers may benefit most from learning responsive or directive strategies, we explored the extent to which caregivers’ strategy use varied by caregivers’ learning style characteristics, as quantified by the broad autism phenotype (BAP; i.e. presence or absence of subclinical characteristics of autism). Caregivers’ use of taught strategies did not differ by BAP status and BAP status did not moderate the effect of strategy type on caregivers’ use of strategies. Taken together, the findings suggest that intervention factors, such as the strategy taught, may influence caregivers’ use of language facilitation strategies, regardless of their learning style.
The current study reports the secondary outcomes of the clinical trial, which aimed to assess the extent to which strategy type (i.e. responsive or directive) influenced child outcomes. First, we aimed to evaluate the impact of responsive and directive strategies on spoken language outcomes, a critical distal (i.e. long-term) outcome. As responsive strategies emphasize child-led interactions, children may have more opportunities to initiate spontaneous directed communication related to their interests. Conversely, as directive strategies emphasize caregiver-led interactions, the child may not have as many opportunities to initiate spontaneous directed communication. Rather, child communication is elicited in response to caregiver-led prompting episodes. As spontaneous directed communication skills are associated with language skills later in life, the frequency of spontaneous directed child communication acts was chosen as the primary dependent variable. Here, we consider spontaneous directed communication as a measure of the social use of spoken language.
Studies of NDBIs include a variety of outcomes that differ based on the outcome proximity (i.e. the degree to which outcomes of the study relate to the exact target of the intervention), and a recent meta-analysis demonstrated that NDBI effects vary based on the proximity of study outcomes (Crank et al., 2021). NDBIs directly target proximal (i.e. short-term) outcomes that have downstream effects on distal communication and language skills as well as skills in other developmental domains (Schreibman et al., 2015). There is a paucity of research examining the extent to which proximal child outcomes serve as mediators of distal child outcomes, despite recent calls for such evaluations (Sandbank et al., 2020; Schreibman et al., 2015). Brian et al. (2022) found that child responsivity (the proximal outcome) was associated with gains in later expressive language and caregiver-reported functional communication (the distal outcome). Pickles et al. (2015) found that child initiations (the proximal outcome) mediated the effect of the intervention on child Autism Diagnostic Observation Schedule-General (ADOS-G) scores (the distal outcome). In addition, Shih et al. (2021) found that joint engagement concurrently mediated the effect of the intervention on the child’s initiation of joint attention skills. To extend this emerging field of research, we conducted post hoc analyses to determine the main effect of the intervention on proximal outcomes (i.e. joint engagement) and the extent to which proximal outcomes mediate distal outcomes.
Joint engagement (i.e. sustained periods of shared attention) is an optimal, interactional context for language learning and therefore is a common proximal outcome of NDBIs (Crank et al., 2021; Tiede & Walton, 2019). Only one study (which included 85 toddler–caregiver dyads) compared the effects of responsive and directive interaction styles on joint engagement. Responsive caregiver interaction styles were associated with a greater frequency of child-initiated joint engagement episodes; directive caregiver interaction styles were associated with a greater frequency of caregiver-initiated joint engagement episodes (Patterson et al., 2014). Although this study provides preliminary evidence of the potential effects of responsive and directive interaction styles on joint engagement, the implications of the study were limited given the observational study design. We extend this previous work by comparing the effects of responsive and directive strategies on joint engagement in the context of a comparative efficacy trial.
Joint engagement reflects the nature of the learning contexts facilitated by NDBIs. Given that each type of intervention strategy elicits different caregiver and child behaviors, it is critical to understand the extent to which intervention strategies alter the quality of the learning context, as reflected by joint engagement. Preliminary evaluations of the relative impact of strategy use on child outcomes suggest that specific language facilitation strategies, such as mirrored pacing, facilitate joint engagement outcomes (Gulsrud et al., 2016). Here, we consider the effects of two sets of language facilitation strategies (i.e. responsive or directive) on joint engagement, specifically coordinated and supported joint engagement (Adamson et al., 2004, 2012; Bakeman & Adamson, 1984). Directive strategies, which encourage caregivers to prompt for directed communication, may facilitate a greater frequency of coordinated engagement states (i.e. periods of shared attention to an activity during which the child is actively coordinating attention to the caregiver). However, given that directive language facilitation strategies may interrupt and redirect the child’s attention, directive strategies may not facilitate a greater mean length of coordinated engagement states. For example, if the caregiver implements a communication temptation, such as pausing within a play routine, the child may communicate directly to request continuation of the routine. Once the child communicates, the play routine continues, and the caregiver is no longer eliciting directed communication from the child. As a result, the coordinated engagement state is discontinued. Therefore, caregiver prompting may elicit frequent episodes of coordinated engagement; however, directive strategies may not facilitate sustained periods of coordinated engagement states.
In contrast, because responsive strategies encourage caregivers to follow the child’s lead and maintain mutual focus, responsive strategies may facilitate a greater frequency and mean length of supported engagement states (i.e. periods of shared attention to an activity during which the child is not actively coordinating attention to the caregiver). Understanding the extent to which responsive and directive strategies impact specific aspects of joint engagement is critical to characterizing the language learning context facilitated by each NDBI strategy. Differential effects of language facilitation strategies on the language learning context may have significant downstream effects on spoken language outcomes. Exploring the extent to which spoken language outcomes are mediated by joint engagement outcomes is critical to determining short-term intervention targets, an important next step toward the advancement of NDBIs and the capacity to improve child outcomes (Pickles et al., 2015).
As recommended by Schreibman et al. (2015), we take a dismantling approach to understanding the mechanisms of NDBIs. Rather than examining intervention strategies as part of a full, manualized intervention, we examine the mechanisms of NDBIs by separating the shared components of NDBI interventions (i.e. responsive and directive strategies). The study explored the following research questions.
Preregistered
To what extent do language skills differ by type of caregiver-mediated language facilitation strategy (i.e. responsive or directive) immediately after intervention and 3 months later? We hypothesized that responsive language facilitation strategies would facilitate a greater frequency of spontaneous directed communication acts.
Post hoc
To what extent does joint engagement differ by type of caregiver-mediated language facilitation strategy (i.e. responsive or directive) immediately after intervention? We hypothesized that responsive language facilitation strategies would facilitate a greater frequency and mean length of supported engagement states. We hypothesized that directive language facilitation strategies would facilitate a greater frequency, but not mean length, of coordinated engagement states.
To what extent does joint engagement immediately after intervention mediate language intervention outcomes 3 months later? We hypothesized that language outcomes would be mediated by joint engagement directly after intervention.
Method
Trial design
Recruitment of 119 mother–child dyads took place between 28 January 2016 and 5 March 2020 in Chicago, Illinois, and the surrounding suburbs. Dyads were enrolled (n = 119) and randomized (n = 111) to either the responsive or directive experimental condition by the study project manager after completing baseline activities (NCT02632773, clinicaltrials.gov). The randomization sequence was computer generated by the senior project statistician and contained random block sizes with a 1:1 assignment ratio within blocks. Based on the primary aims of the clinical trial (Roberts et al., 2022), randomization was stratified on child verbal status and maternal BAP status (i.e. presence or absence of subclinical characteristics of autism). Maternal BAP status was determined using two measures: (a) the Modified Personality Assessment Schedule-Revised (MPAS-R; Tyrer, 1988) and (b) the Pragmatic Rating Scale (PRS; Landa et al., 1992); see Roberts et al. (2022) for more details. There were no associations between maternal BAP status and child outcomes at any timepoint; therefore, BAP status was not included in the secondary outcome analyses reported in the current article. Child verbal status was categorically defined by item A1 on the Autism Diagnostic Observation Schedule (score of 1–2 = verbal; score of 3–4 = preverbal; ADOS-2; Lord et al., 2012).
Participants
Child eligibility criteria were as follows: (a) age between 18 and 48 months, (b) an autism diagnosis as determined by the ADOS-2, (c) no other diagnosis likely to impact development, and (d) a language level that did not meet criteria for flexible phrase speech per the ADOS-2. For all children 30 months or younger, the Toddler Module was administered. For all children 31 months or older, Module 1 was administered. The ADOS-2 was chosen to confirm or determine the child’s diagnostic status, given the measure’s diagnostic accuracy (Lebersfeld et al., 2021), as indicated by high sensitivity (0.88) and specificity (0.91) for this age group (Luyster et al., 2009). At the time of enrollment, 88 of the 111 randomized participants had a previously established medical diagnosis of autism, which was confirmed using the ADOS-2; 23 of the 111 participants did not have a previously established medical diagnosis of autism. The presence of characteristics consistent with autism was determined using the ADOS-2 along with the clinical judgment of a developmental therapist (a state-certified early interventionist and evaluator) and the senior author (a speech–language pathologist, developmental therapist, and state-certified early interventionist and evaluator with a PhD in Special Education), both of whom have extensive experience working on a multidisciplinary team as a part of autism medical diagnostic evaluations. Mother eligibility criteria were as follows: (a) having learned to speak English before 12 years of age, (b) using English with their child at least 50% of the time, and (c) having no diagnosis that affected cognition or personality. All mother inclusion criteria were established to ensure the validity of the BAP measures.
Only mothers were included because differences in interaction styles exist between mothers and fathers (Flippin & Watson, 2011) and mothers are typically the parent who elects to participate in caregiver-mediated interventions (Kaiser & Roberts, 2013); see Table 1 for demographic characteristics.
Table 1. Baseline demographic characteristics.
ADOS-2 comparison scores were calculated based on recommendations in Esler et al. (2015). ADOS: Autism Diagnostic Observation Schedule-2nd Edition; RBSR: Repetitive Behavior Scale-Revised; MSEL: Mullen Scales of Early Learning Visual Reception Scale; d: Cohen’s d; RR: relative risk; V: Cramer’s V.
All mothers signed an informed written consent approved by Northwestern University’s Institutional Review Board. Collected data were stored in REDCap databases (Harris et al., 2009).
Adverse events and protocol deviations (see Supplementary Table 1) did not differ between the two groups; therefore, they were not accounted for in the analyses. The manual with standard operating procedures is available by request using this link. The pre-trial power analysis was based on the primary aim of the clinical trial. As such, the sample size of 108 was determined assuming 80% power to detect an effect size of d = 0.54 (derived from a pilot study) in strategy use between intervention groups. The actual sample included 119 dyads with a signed consent form, of whom 111 were randomized, 96 completed assessments immediately after intervention (86.49% from T0), and 84 completed follow-up assessments (75.68% from T0; 87.50% from T1; see Figure 1). Given the actual sample size of 111 with respect to the spontaneous directed communication acts (defined below) during the language sample, 63% power was achieved to detect the reported effect size of d = 0.44 at T1 and 73% power was achieved to detect the reported effect size of d = 0.49 at T2. With respect to the spontaneous directed communication acts during the mother–child interaction, 31% power was achieved to detect the reported effect size of d = 0.28 at T1 and 55% power was achieved to detect the reported effect size of d = 0.40 at T2. Joint engagement main effects were defined as exploratory, post hoc analyses; therefore, adequate power may not have been achieved. Power for joint engagement analyses ranged from 50% to 98%. Groups did not significantly differ on any demographic or dependent variables at baseline, except for supported joint engagement frequency (d = 0.46). All analyses controlled for the dependent variable at baseline to minimize the effects of baseline differences.
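The sample-size and achieved-power figures above can be approximated with a standard two-group calculation. The following is a minimal sketch under stated assumptions: it uses the normal approximation for a two-sided test at alpha = 0.05, the function names are illustrative, and the 55/56 group split is hypothetical because group sizes are not given in this section. The software and exact method used for the trial's power analysis are not stated, so small discrepancies from the reported percentages are expected.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def n_per_group(d, power=0.80, alpha=0.05):
    """Per-group sample size for a two-sided, two-group comparison
    (normal approximation; an assumption, not the trial's documented method)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


def achieved_power(d, n1, n2, alpha=0.05):
    """Approximate power to detect effect size d with group sizes n1 and n2."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = abs(d) * sqrt(n1 * n2 / (n1 + n2))
    return _STD_NORMAL.cdf(noncentrality - z_alpha)


# d = 0.54 with 80% power: 54 per group, i.e. 108 in total.
print(2 * n_per_group(0.54))

# Hypothetical 55/56 split of the 111 randomized dyads.
for d in (0.44, 0.49, 0.28, 0.40):
    print(d, round(achieved_power(d, 55, 56), 2))
```

Under this approximation, the planned total matches the reported 108, and the achieved-power values fall within about one percentage point of those reported.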

Figure 1. CONSORT chart.
Strategy types
Mothers were taught either responsive or directive language facilitation strategies over 8 weekly 1-h appointments in the dyad’s home. The first session consisted of a PowerPoint workshop outlining the respective strategy type. Subsequent sessions followed a Teach–Model–Coach–Review instructional format (Roberts et al., 2014). Research interventionists (i.e. a developmental therapist and speech–language pathologists) trained to research fidelity delivered the intervention. Interventionists achieved fidelity through the delivery of three consecutive sessions at or above 80% fidelity. Two sessions for each participant and 20% of workshops were randomly rated to monitor ongoing fidelity. All sessions remained at or above 80% fidelity (Responsive: M = 96.51%, SD = 4.08%; Directive: M = 93.08%, SD = 5.23%).
Responsive strategies are based on a developmental model in which caregivers provide input appropriate for their child’s developmental level. Mothers in the responsive condition were taught to respond to all child communication by commenting on their child’s focus of attention using verbal utterances and gestures. Directive strategies are based on a behavioral learning theory in which a caregiver teaches communication through structured prompts and direct instruction. Mothers in the directive condition were taught to set up communication opportunities related to their child’s interests, prompt their child to communicate, and reinforce communication attempts; see Supplementary Table 2 for strategy definitions.
Outcomes
Child outcomes were measured at baseline (T0), immediately after intervention (T1), and 3 months after intervention (T2). Research staff (naïve to experimental condition) completed assessments, transcription using SALT procedures (Miller & Iglesias, 2012), and behavioral coding using Mangold Interact Software (Mangold, 2020). Assessors were trained to research fidelity by achieving three consecutive assessment administrations at 80% fidelity or above.
Transcribers and coders were trained to research reliability by achieving 80% agreement or above with a master transcriber/coder on three consecutive samples.
Language outcomes were measured during two naturalistic contexts, a language sample and a caregiver–child interaction, designed to assess (a) spontaneous directed communication acts and (b) number of different words (NDW). The language sample was collected during a video-recorded 20-min play-based observation with the child and an unfamiliar assessor using a standard set of toys. The assessor refrained from using content-rich language to maximize opportunities for spontaneous child communication and provided five communication temptations to elicit communication. The mother–child interaction was collected during a video-recorded 10-min play-based observation using a standard set of toys. Mothers were encouraged to play with their child as they would normally. Behavioral codes were assigned to each child utterance to measure (a) spontaneous directed communication acts and (b) the total NDW said by the child. Each child utterance was coded to determine whether it was (a) spontaneous and (b) directed to yield the total number of spontaneous directed communication acts. Child communication acts were coded as spontaneous if they were not prompted, imitated, or elicited by the adult. For example, if a child independently showed their caregiver a ball and said “ball,” this would be considered a spontaneous communicative act. Conversely, if the child was playing with the ball and their caregiver asked, “What is that?” and the child replied “ball,” this would not be considered a spontaneous communicative act. Child communication acts were coded as directed if (a) the child made clear eye contact paired with a vocalization, gesture, or other symbolic form, (b) the child used a joint engagement gesture (i.e. point, show, give), (c) the child referenced the adult by name or pronoun, or (d) the child’s communication occurred as a result of the adult’s prompt. NDW during the language sample and mother–child interaction was used as a measure of child expressive vocabulary.
A full description of these codes is available at this link https://redcap.nubic.northwestern.edu/redcap/surveys/?s=8WXRRKNCH4. Reliability was completed for 20% of the language samples (ICC = 0.98, 95% CI = [0.96, 0.98]) and mother–child interactions (ICC = 0.97, 95% CI = [0.95, 0.98]).
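The two-part coding rule described above (spontaneous and directed) can be illustrated with a short sketch. This is an assumption-laden illustration, not the study's coding manual: the `ChildAct` class and its field names are hypothetical, and the actual codes were assigned by trained human coders from video.

```python
from dataclasses import dataclass


@dataclass
class ChildAct:
    """One coded child communication act (all field names are hypothetical)."""
    prompted: bool = False                 # occurred as a result of an adult prompt
    imitated: bool = False                 # imitation of the adult
    elicited: bool = False                 # otherwise elicited by the adult
    eye_contact_with_form: bool = False    # eye contact paired with a vocalization, gesture, or symbol
    joint_engagement_gesture: bool = False # point, show, or give
    references_adult: bool = False         # adult referenced by name or pronoun


def is_spontaneous(act):
    # Spontaneous: not prompted, imitated, or elicited by the adult.
    return not (act.prompted or act.imitated or act.elicited)


def is_directed(act):
    # Directed: any one of criteria (a) through (d) in the text.
    return (act.eye_contact_with_form or act.joint_engagement_gesture
            or act.references_adult or act.prompted)


def count_spontaneous_directed(acts):
    return sum(1 for act in acts if is_spontaneous(act) and is_directed(act))


# The two "ball" examples from the text.
shows_ball = ChildAct(joint_engagement_gesture=True)          # child shows ball and says "ball"
answers_question = ChildAct(elicited=True, eye_contact_with_form=True)  # reply to "What is that?"
print(count_spontaneous_directed([shows_ball, answers_question]))
```

Note that a prompted act satisfies the directed criterion but not the spontaneous criterion, so it never contributes to the spontaneous directed count.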
Language outcomes were also characterized using parent-report and semi-structured assessments. Mothers completed the MacArthur-Bates Communicative Development Inventories-Words and Gestures (MCDI-WG; Fenson et al., 2007), a vocabulary checklist consisting of 396 words across 19 categories. Mothers indicated whether their child said each word to yield the total words said. The Communication and Symbolic Behavior Scales Developmental Profile (CSBS) is a semi-structured assessment of communication, which consists of six sampling opportunities to elicit child communication and yields a total raw score (Wetherby & Prizant, 2002). Reliability was completed for 20% of CSBS administrations (ICC = 0.98, 95% CI = [0.96, 0.98]).
Joint engagement was measured from a 10-min mother–child interaction (see description above). Language outcomes and joint engagement codes were extracted from the same mother–child interaction. The interaction was coded using a 5-s interval code adapted from Bakeman and Adamson (1984), which defines joint engagement as sustained periods of interaction between the caregiver and the child. Each interval was coded to reflect whether the engagement state was joint engagement or not joint engagement. Periods of joint engagement were categorized as coordinated or supported engagement. Intervals were coded as coordinated engagement if the dyad was actively engaged in the same activity and the child explicitly acknowledged their mother, as evidenced by the presence of directed eye gaze or communication (see definition of directed communication above; Adamson et al., 2004, 2012; Bakeman & Adamson, 1984). Intervals were coded as supported joint engagement if the dyad was actively engaged in the same activity, but the child did not explicitly acknowledge their mother, as evidenced by the lack of directed eye gaze or directed communication (Adamson et al., 2004, 2012). Intervals were coded as not joint engagement if the dyad was not actively engaged in the same activity, which includes episodes of object engagement, onlooking, and unengaged (Adamson et al., 2004; Bakeman & Adamson, 1984). Frequency of coordinated and supported engagement states was defined as the number of 5-s intervals throughout the interaction. Mean length of coordinated and supported engagement states was defined as the average number of consecutive 5-s intervals throughout the interaction. Reliability was calculated using the ICC function of the psych package in R (Revelle, 2022).
Reliability coding was completed for 20% of all coded samples by one of two reliability coders, and ICCs reported here reflect the association between ratings of the primary and reliability coder (coordinated joint engagement ICC = 0.92, 95% CI = [0.87, 0.95]; supported joint engagement ICC = 0.77, 95% CI = [0.64, 0.86]).
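To make the two joint engagement metrics concrete, the sketch below derives frequency and mean length from a sequence of interval codes. It is an illustration under assumptions: the single-letter codes ("C" for coordinated, "S" for supported, "N" for not joint engagement) and the function name are hypothetical, and the study's codes were produced in Mangold Interact rather than with this code.

```python
from itertools import groupby


def engagement_summary(intervals, state):
    """Return (frequency, mean_length) for one engagement state.

    intervals: one code per 5-s interval, in chronological order.
    frequency: number of intervals coded as `state`.
    mean_length: average number of consecutive intervals per episode of `state`.
    """
    run_lengths = [len(list(run)) for code, run in groupby(intervals) if code == state]
    frequency = sum(run_lengths)
    mean_length = frequency / len(run_lengths) if run_lengths else 0.0
    return frequency, mean_length


# Hypothetical 40-s excerpt (a full 10-min interaction would have 120 intervals).
excerpt = ["C", "C", "S", "N", "C", "S", "S", "S"]
print(engagement_summary(excerpt, "C"))  # (3, 1.5): two episodes of lengths 2 and 1
print(engagement_summary(excerpt, "S"))  # (4, 2.0): two episodes of lengths 1 and 3
```

The excerpt shows how the two metrics can diverge: frequent but brief coordinated episodes raise frequency without raising mean length, which is the pattern hypothesized for the directive condition.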
Statistical methods
As shown in Figure 1, 111 participants were randomized. Dropped participants did not differ from enrolled participants on any baseline measures, and no discernable pattern of missingness was identified. At T0, joint engagement observations for one participant, the MCDI for three participants, and the CSBS for two participants were missing. Analyses were completed under the assumption of intent-to-treat analysis, such that the last observed data point was carried forward for participants with missing data at T1 or T2.
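The last-observation-carried-forward rule described above can be sketched as follows. This is a minimal illustration, assuming each participant's scores are stored in timepoint order (T0, T1, T2) with `None` marking a missing assessment; the function name and data layout are hypothetical.

```python
def carry_forward(scores):
    """Fill missing values with the most recent observed value.

    scores: values ordered T0, T1, T2, with None for missing assessments.
    A missing value with no earlier observation stays None.
    """
    filled = []
    last_observed = None
    for value in scores:
        if value is not None:
            last_observed = value
        filled.append(last_observed)
    return filled


print(carry_forward([12, None, None]))  # [12, 12, 12]
print(carry_forward([12, 15, None]))    # [12, 15, 15]
```

Because a missing T0 value cannot be filled by this rule, such cases remain missing, which is consistent with the exclusion of participants lacking a T0 assessment from the corresponding mediation analysis.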
Multiple linear regression models were used to test the main effects of experimental condition on child outcomes that met the necessary assumptions for ordinary least squares (OLS) regression, controlling for the dependent variables at baseline and child verbal status. Given that the chronological age of children enrolled in the current study exceeded the chronological age range for the CSBS, raw scores, rather than standard scores, were included. Thus, age was included as a covariate when modeling CSBS raw score as the dependent variable. Heteroskedasticity was present in residuals for select child outcome models; thus, robust variance estimators were used in all models for methodological consistency. The MCDI total words said count variable was over-dispersed, failed to meet linear modeling assumptions, and required negative binomial regression models. Joint engagement frequency, for both coordinated and supported engagement states, was over-dispersed, failed to meet linear modeling assumptions, and required quantile regression models (Congdon, 2017; Frumento & Salvati, 2021; Koenker, 2023; Winkelmann, 2006).
Model-based causal mediation analyses were conducted using the mediation R package (Tingley et al., 2014) for outcomes that differed significantly at T2. The mediator and outcome models were specified using the regression models included in the main effects analyses. Then, the mediate function was used to test the causal mediation. To test the significance of the indirect effect, the bootstrapped unstandardized indirect effect was computed for each of the 1000 bootstrapped samples. Participants with a missing T0 assessment were excluded from the corresponding mediation analysis.
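The logic of a bootstrapped indirect effect can be illustrated with a simplified sketch. It is not the procedure used in the study: the analyses relied on the mediation R package with the main-effects models (including covariates and, for the mediator, quantile regression), whereas this sketch substitutes a product-of-coefficients estimate from two ordinary least squares models with no covariates and a percentile bootstrap. All function names are hypothetical.

```python
import random


def indirect_effect(x, m, y):
    """Product-of-coefficients indirect effect a*b.

    a: slope of mediator m on condition x.
    b: slope of outcome y on m, adjusting for x.
    """
    n = len(x)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    smm = sum((mi - mean_m) ** 2 for mi in m)
    sxm = sum((xi - mean_x) * (mi - mean_m) for xi, mi in zip(x, m))
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    smy = sum((mi - mean_m) * (yi - mean_y) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=1000, seed=0):
    """95% percentile bootstrap interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample dyads with replacement
        try:
            estimates.append(indirect_effect([x[i] for i in idx],
                                             [m[i] for i in idx],
                                             [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # skip degenerate resamples (e.g. only one condition drawn)
    if not estimates:
        raise ValueError("no usable bootstrap samples")
    estimates.sort()
    k = len(estimates)
    return estimates[int(0.025 * k)], estimates[max(int(0.975 * k) - 1, 0)]
```

In this simplified setting, an interval that excludes zero is taken as evidence of mediation, paralleling the interpretation of the bootstrapped indirect effect reported in the Results.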
Community involvement
There was no community involvement in the reported study.
Results
Language outcomes
At T1 during the language sample, children in the directive condition produced significantly more spontaneous directed communication acts (d = −0.44, 95% CI = [−0.82, −0.06], B = −4.17, p = 0.026) and NDW (d = −0.55, 95% CI = [−0.94, −0.17], B = −6.99, p = 0.007). At T2, children in the directive condition produced significantly more spontaneous directed communication acts during the language sample (d = −0.49, 95% CI = [−0.88, −0.11], B = −4.79, p = 0.015) and mother–child interaction (d = −0.40, 95% CI = [−0.79, −0.02], B = −2.37, p = 0.034), said significantly more words per caregiver-report on the MCDI (d = −0.41, 95% CI = [−0.83, 0.01], B = −0.46, p = 0.049), and scored higher on the CSBS (d = −0.40, 95% CI = [−0.78, −0.01], B = −7.72, p = 0.041); see Table 2 for descriptive data and Table 3 for regression tables of language outcomes at T1 and T2.
Table 2. Descriptive data of outcomes at baseline (T0), post-intervention (T1), and follow-up (T2).
SD: standard deviation; NDW: number of different words; MCDI: MacArthur-Bates Communicative Development Inventories; CSBS: Communication and Symbolic Behavior Scales Developmental Profile.
Table 3. Language outcomes main effect at post-intervention (T1) and follow-up (T2).
SE: standard error; MCDI: MacArthur-Bates Communicative Development Inventories; CSBS: Communication and Symbolic Behavior Scales Developmental Profile; NDW: number of different words.
1 = Preverbal.
1 = Responsive.
*p < 0.05, **p < 0.01, ***p < 0.001.
Joint engagement outcomes
The responsive condition had a significantly greater frequency (B = 24.43, p < 0.001) and mean length (d = 0.80, 95% CI = [0.40, 1.20], B = 1.85, p < 0.001) of supported joint engagement. The directive condition had a significantly greater frequency of coordinated joint engagement states (B = −12.48, p < 0.001). Coordinated joint engagement mean length did not significantly differ between experimental conditions (d = −0.38, 95% CI = [−0.77, 0.01], B = −0.69, p = 0.05); see Table 2 for descriptive data and Table 4 for regression tables of joint engagement outcomes at T1.
Table 4. Joint engagement main effect at post-intervention (T1).
SE: standard error.
1 = Preverbal.
1 = Responsive.
**p < 0.01, ***p < 0.001.
Mediation analysis
Post hoc mediation analyses aimed to understand the extent to which joint engagement mediated the effect of language facilitation strategy on language outcomes at T2. Given that language outcomes at T2 were significantly greater for the directive group and frequency of coordinated joint engagement at T1 was significantly greater in the directive group, the frequency of coordinated joint engagement is a plausible mediator of language outcomes at follow-up. Mediation models were conducted for outcomes that significantly differed at T2 using frequency of coordinated joint engagement as the mediator.
For the mediation model predicting spontaneous directed communication acts during the mother–child interaction at T2, the bootstrapped unstandardized indirect effect was −2.070 (95% CI = [−4.394, −0.06], p < 0.05). The proportion of the effect of strategy type on the spontaneous directed communication acts that goes through coordinated joint engagement frequency was 0.913 (95% CI = [0.182, 4.46], p < 0.05). Thus, joint engagement at T1 fully mediated the effect of strategy type on spontaneous directed communication acts during the mother–child interaction at T2 (see Figure 2). Coordinated joint engagement at T1 did not mediate the effect of strategy type on spontaneous directed communication acts during the language sample, MCDI scores, or CSBS scores at T2.

Figure 2. Joint engagement at post-intervention fully mediated the effect of strategy type on spontaneous directed utterances during the mother–child interaction at follow-up.
Discussion
The overall goal of the current study was to characterize the mechanisms of caregiver-mediated NDBI strategies on proximal and distal child outcomes. Immediately after intervention and at follow-up, language outcomes on a variety of assessments were greater in the directive condition. Therefore, we conducted exploratory, post hoc analyses to understand the mechanisms through which these outcomes were facilitated. Directly following the intervention, the frequency and mean length of supported engagement states were greater in the responsive condition. This follows our prediction that responsive strategies encourage caregivers to support their child’s engagement by maintaining mutual focus. In addition, the frequency of coordinated engagement states was greater in the directive condition. This was also expected given that directive strategies encourage caregivers to prompt for child directed communication, a key characteristic of coordinated joint engagement states. Moreover, the frequency of coordinated joint engagement at post-intervention fully mediated the effect of strategy type on spontaneous directed communication acts at follow-up. Therefore, children may benefit from caregiver prompts early on to facilitate long-term independent language outcomes. These results identify the respective contributions of responsive and directive language facilitation strategies on proximal and distal child outcomes.
Previously, only one study compared the effects of responsive and directive interaction styles on joint engagement. Patterson et al. (2014) found that responsive interaction styles were associated with a greater frequency of child-initiated engagement episodes. The current study expanded this work by examining the extent to which differences in joint engagement were associated with language outcomes. Because previous research indicates that responsive strategies facilitate a greater number of child initiations (Aldred et al., 2004; Green et al., 2017; Kasari et al., 2008; Rahman et al., 2016; Whitehouse et al., 2021), it was predicted that responsive strategies would result in a greater number of spontaneous directed communication acts. However, directive strategies facilitated a greater number of spontaneous directed communication acts. One explanation for this finding may be that the reinforcement and practice of elicited directed communication leads to language learning that can be generalized to spontaneous directed communication. In other words, practicing with caregiver support leads to independent communication later. This finding is in line with previous research which demonstrates that encouraging child responses can lead to generalized gains in child communication (Brian et al., 2022). In addition, there is some evidence to suggest that a subgroup of autistic toddlers benefit from directive interventions (Sandbank et al., 2020). Future research should include moderation analyses to understand which children may benefit most from the implementation of directive strategies and, thus, individualize early interventions based on the child’s clinical profile.
Although these findings support the efficacy of directive strategies, it is critical to understand stakeholders’ acceptability of directive strategies. In a recent commentary, Schuck et al. (2021) discussed the potential for NDBIs to bridge the gap between intervention approaches informed by behavioral learning theory and the neurodiversity movement. Directive strategies implemented in the context of NDBIs aim to reform strictly behavioral interventions through the incorporation of naturalistic, child-led learning contexts that seek to facilitate functional, meaningful outcomes (Schuck et al., 2021). Nevertheless, firsthand accounts of stakeholders’ perspectives on the current implementation of NDBIs are limited. As such, future research should incorporate the perspectives of autistic people and stakeholders to characterize the social validity of NDBI approaches (Roche et al., 2021).
A core component of NDBIs is to facilitate learning through teaching foundational skills that have cascading effects on other developmental skills. We demonstrated the developmental cascade of proximal outcomes on distal outcomes by using a mediation analysis. Mediation analyses tested the extent to which intervention effects at follow-up were mediated by joint engagement immediately after intervention. Results suggest that spontaneous directed communication acts during the mother–child interaction at follow-up were mediated by coordinated joint engagement immediately after intervention. In other words, children benefit from caregiver prompting to facilitate short-term outcomes of coordinated joint engagement states, which in turn facilitate long-term outcomes of child-directed spontaneous communication acts. It is important to distinguish between these two outcome measures. Coordinated joint engagement differs from spontaneous directed communication acts because (a) child communication during coordinated joint engagement states may be imitated or elicited (rather than spontaneous), (b) coordinated joint engagement episodes include the use of eye contact alone to facilitate joint referencing, and (c) coordinated joint engagement is a dyadic measure that reflects both caregiver and child behavior. Results of the current study suggest that language learning opportunities are facilitated during periods of coordinated joint engagement, which is in line with recent research highlighting the influence of mutual gaze on social communication outcomes (Rollins et al., 2021).
However, joint engagement did not mediate additional intervention effects reflected by the language sample, MCDI, or CSBS. While the mother–child interaction is a context-bound assessment (i.e. a context similar to the one in which the intervention was implemented), additional assessments were included to measure how the effects generalized to different contexts (e.g. interactions with an unfamiliar partner). Given that there was a main effect but no mediation, these results may indicate a different mechanism through which intervention effects generalize to various contexts. Therefore, future research needs to characterize the mechanisms through which outcomes generalize to different contexts.
Although responsive language facilitation strategies were not predictive of language outcomes, responsive strategies facilitated sustained interactions between the dyad (as indicated by the mean length of supported joint engagement states). Sustained supported joint engagement states reflect the caregiver’s ability to attend to their child’s interests and facilitate learning opportunities. In fact, in contrast with our findings, previous research demonstrated associations between supported joint engagement and language outcomes (Adamson et al., 2009; Bottema-Beutel et al., 2014). Supported engagement states, particularly those filled with high-quality linguistic input, may be especially important for language learning. Furthermore, responsive strategies are associated not only with improvement in language and social communication skills (Davis et al., 2022), but also with critical family-centered outcomes (e.g. socioemotional well-being, externalizing behaviors, caregiver–child attachment; Brophy-Herb et al., 2011; Raval et al., 2001; Roskam et al., 2015). One potential explanation of these findings in the current study is that treatment effects of responsive strategies on spoken language outcomes may not be evident within the 3-month follow-up period. While not reflected in the current study, some studies (Patterson et al., 2014; Shih et al., 2021) consider whether engagement states were child-initiated or adult-initiated and demonstrated that responsive strategies were associated with child-initiated engagement states. Future research may consider additional factors to characterize the quality of interactions facilitated by NDBI strategies, which may in turn have downstream effects on language outcomes. Alternatively, it may be that the quality of the linguistic input embedded within the supported engagement states was not facilitative of language outcomes.
Our future research aims to understand the extent to which language facilitation strategies may differentially influence the linguistic input embedded within the language learning opportunities, and thus affect child language outcomes. Future research is necessary to understand the long-term effects of responsive strategies on child language outcomes.
The goal of NDBIs is to support caregivers’ implementation of responsive and directive language facilitation strategies, as the integration of such intervention strategies may have synergistic effects. Given that the purpose of this article was to understand the unique effects of two theoretically contrasted sets of language facilitation strategies, such effects are not reflected in the current study. Moreover, the current article sought to address the comparative effects of two sets of strategies that are embedded within NDBI intervention packages. As such, we elected not to include a control group. Rather, our goal was to employ a dismantling approach to NDBI strategies, as recommended by Schreibman et al. (2015), to shed light on the respective contributions of language facilitation strategies within NDBI intervention packages. Future research should expand on the current study by evaluating the independent and synergistic effects of NDBI strategies on language learning for autistic children.
Limitations of the current study include limited inclusion criteria and statistical power that, although justified by the primary aim of the clinical trial, limit the generalizability of the findings. While adequate power is not expected for post hoc analyses, the power achieved for the preregistered analyses of language outcomes should be considered. Group differences at baseline were present for supported joint engagement frequency; however, such differences were controlled for in the analysis to minimize threats to internal validity. For 23 of the 111 children enrolled in the current study, the presence of characteristics consistent with autism was determined using the ADOS-2 in conjunction with the clinical judgment of clinicians with extensive experience working on a multidisciplinary team as a part of autism medical diagnostic evaluation. A limitation to consider is the fact that these children had not yet received an official medical diagnosis of autism. Finally, the current study aimed to characterize differences in child outcomes between two groups: caregivers who learned directive strategies and caregivers who learned responsive strategies. Results of the current study demonstrated that children in the directive group obtained better language outcomes than children in the responsive group. However, findings of Roberts et al. (2022) demonstrated that caregivers in the responsive group used strategies more frequently and accurately than caregivers in the directive group, regardless of their learning style (as indicated by BAP status). Taken together, a critical next step to characterizing the mechanisms of caregiver-mediated NDBIs is understanding the extent to which caregiver fidelity (i.e. the frequency and accuracy with which they implement the language facilitation strategies) is associated with child outcomes.
Although beyond the scope and, thus, a limitation of the current article, our future research aims to explore the extent to which caregivers’ proficiency in implementing language facilitation strategies is associated with child outcomes and the degree to which these associations vary depending on the type of language facilitation strategies caregivers were taught. These limitations should be considered with respect to the strengths of the current project. This is the first comparative efficacy trial to isolate the effect of NDBI language facilitation strategies on proximal and distal outcomes. The intervention effects were characterized within a large sample from representative racial, ethnic, and socioeconomic backgrounds. Results of this study advance our understanding of the mechanism through which early interventions impact child outcomes for autistic toddlers.
Supplemental Material
sj-doc-1-aut-10.1177_13623613231213283 – Supplemental material for Characterizing mechanisms of caregiver-mediated naturalistic developmental behavioral interventions for autistic toddlers: A randomized clinical trial
Supplemental material, sj-doc-1-aut-10.1177_13623613231213283 for Characterizing mechanisms of caregiver-mediated naturalistic developmental behavioral interventions for autistic toddlers: A randomized clinical trial by Maranda K Jones, Bailey J Sone, Jeffrey Grauzer, Laura Sudec, Aaron Kaat and Megan Y Roberts in Autism
sj-docx-2-aut-10.1177_13623613231213283 – Supplemental material for Characterizing mechanisms of caregiver-mediated naturalistic developmental behavioral interventions for autistic toddlers: A randomized clinical trial
Supplemental material, sj-docx-2-aut-10.1177_13623613231213283 for Characterizing mechanisms of caregiver-mediated naturalistic developmental behavioral interventions for autistic toddlers: A randomized clinical trial by Maranda K Jones, Bailey J Sone, Jeffrey Grauzer, Laura Sudec, Aaron Kaat and Megan Y Roberts in Autism
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study was funded by the National Institute on Deafness and Other Communication Disorders (R01DC014709 PI: M.Y.R.).
Supplemental material
Supplemental material for this article is available online.
References
