Abstract
The purpose of this meta-analysis was to summarize single-case intervention research studies in which students with disabilities received function-based intervention (FBI) within inclusive school settings to address challenging behavior. A total of 27 studies were identified and systematically reviewed to determine the overall effect of FBI on challenging and appropriate behavior and whether study characteristics moderated intervention outcomes. In addition, we summarized the following: (a) characteristics of study participants and settings, (b) characteristics of FBI applied within the studies, and (c) quality of the studies. Overall, FBI led to improved behavior in a variety of inclusive school settings. Interventions delivered after a teacher-administered functional behavior assessment and within the context of a whole group instructional arrangement resulted in significant reductions in challenging behavior and improvements in appropriate behavior, respectively. Implications for practice, future directions for research, and limitations are described.
In response to a growing emphasis on improving school culture and applying research-driven practices to improve student outcomes, schools have utilized multitiered systems of supports to address the academic, social, and behavioral needs of students with and without disabilities (Fuchs & Fuchs, 2006; Sugai & Horner, 2002). In particular, School-Wide Positive Behavioral Interventions and Support (SWPBIS) has emerged as a multitiered system of supports to prevent challenging behavior and promote socially appropriate behavior across all students within a given school (Sugai & Horner, 2002). SWPBIS is characterized by a continuum of supports across the following three levels: (a) Primary: All students in a school are taught school-wide behavioral expectations and receive reinforcement for engaging in expected behavior; (b) Secondary: Students with at-risk behavior receive additional specialized, targeted group interventions that promote appropriate behavior; and (c) Tertiary: Students with high-risk behavior receive specialized, individualized behavioral interventions (Sugai & Horner, 2006).
The ultimate outcome of SWPBIS is to nurture a school climate that supports the engagement and learning of all students. Inclusion of students with disabilities, although not necessarily a targeted outcome of SWPBIS, may improve within a multitiered system of supports (Kurth & Enyart, 2016; Sailor et al., 2006; Shogren et al., 2015). In fact, both students with and without disabilities have reported various elements of SWPBIS as factors that contribute to more equitable, enhanced school membership for all students (Shogren et al., 2015). Moreover, others have identified behavior instruction delivered within multitiered systems of supports as a necessary component for successful inclusive schools (see, for example, http://www.swiftschools.org).
The number of students with disabilities who access the general education curriculum in inclusive school settings has steadily increased over the past 20 years (McLeskey, Landers, Williamson, & Hoppey, 2012), a trend that can be attributed to advocacy efforts of parents, practitioners, researchers, and self-advocates alike. Across disability categories, students are reported to spend some or most of their school day receiving instruction from general educators alongside peers without disabilities in inclusive classrooms (Newman et al., 2011). As a result, short- and long-term benefits of inclusive education have been documented for both students with and without disabilities, including skill and knowledge acquisition, development of friendships and classroom membership, increased peer and teacher awareness of diversity, and promising postsecondary outcomes (Mavropoulou & Sideridis, 2014; Rojewski, Lee, & Gregg, 2015; Salend & Duhaney, 1999). Nonetheless, several challenges have emerged as barriers to inclusion, including lack of time and resources for team collaboration and limited in-service training on addressing students’ individual needs across academic, social, and behavioral domains (Wagner et al., 2006). Together, these structural constraints become even more detrimental when students demonstrate challenging behavior (Bambara, Goh, Kern, & Caskie, 2012; Bambara, Nonnemacher, & Kern, 2009). Lack of support to address students’ challenging behavior can prevent students from meaningful inclusive opportunities and, in some cases, adversely sway an educational team’s decision to consider the general education setting as the least restrictive placement (Lane, Carter, Common, & Jordan, 2012; Lohrmann & Bambara, 2006). It is likely that this challenge is more pronounced for students who engage in high-priority challenging behavior requiring intensive, individualized intervention (e.g., tertiary level of SWPBIS). 
Documenting the significance of these barriers, Bambara and colleagues conducted interviews (2009) and a large-scale survey (2012) to examine the experience of educational teams who implemented tertiary-level interventions. Common barriers were reported across stakeholders, including structural limitations (e.g., insufficient administrative supports and planning time) and attitudinal challenges (e.g., lack of belief in educating all students and delivering interventions in inclusive school settings).
However, emerging evidence suggests that implementation of behavioral interventions, specifically function-based intervention (FBI) and those at the tertiary level of SWPBIS, may contribute to better inclusive experiences for students with disabilities (e.g., Freeman et al., 2006; Gann, Ferro, Umbreit, & Liaupsin, 2014; Lane et al., 2007). For example, Freeman and colleagues (2006) worked with a school team to develop individualized supports for an elementary school student with a learning disability (LD); the student subsequently showed academic and social progress in an inclusive classroom. Lane and colleagues (2007) implemented tertiary intervention plans with two students who did not respond to either primary or secondary behavioral intervention supports. One of the participants was a middle school–aged boy with an LD, who subsequently increased his class participation and task completion in an inclusive science class after teachers delivered a multicomponent FBI. In addition, Gann and colleagues (2014), though not framed within an SWPBIS framework, improved the academic engagement of a middle school student with Asperger syndrome across inclusive classrooms following the implementation of FBI.
Interventions implemented at the tertiary level of SWPBIS are often guided by functional behavior assessment (FBA) whereby the evaluator (a) identifies events or conditions that predict and maintain challenging behavior and (b) develops a hypothesis of the potential function(s) of the behavior (Kern, O’Neill, & Starosta, 2005). Previous research on FBI (i.e., behavioral interventions driven by the results of FBA) provides clear evidence supporting the effectiveness of these strategies versus traditional behavior management approaches (e.g., Ingram, Lewis-Palmer, & Sugai, 2005). Despite this evidence, however, little is known about the overall effectiveness of FBI applied within inclusive settings for individuals with disabilities. Individualized FBI is often multifaceted and comprised of several intervention components. The complex nature of these supports raises additional questions regarding whether and how individualized, intensive supports can be applied successfully within inclusive settings by individuals who are typically present in those settings and who may not have the knowledge, skills, and training necessary to implement complex behavioral interventions (e.g., general education teachers, paraprofessionals). Establishing the effectiveness of FBI in inclusive settings has the potential to motivate schools and districts to implement best practices that are preventive in nature and contribute to a more inclusive school culture.
Researchers have conducted both systematic literature reviews and meta-analyses of studies in which FBI was applied to address unresolved challenging behavior (e.g., Braddock, 1999; Gage, Lewis, & Stichter, 2012; Goh & Bambara, 2012). In contrast to traditional literature reviews that descriptively summarize a body of literature, meta-analyses rely on a quantitative approach for estimating intervention effect across a group of similar studies (Hedges & Valentine, 2009). However, many of the available reviews have summarized intervention outcomes across a broad array of participants (e.g., adults and children, students with and without disabilities), focused narrowly on specific disability categories (e.g., emotional behavioral disorders [EBDs], Gage et al., 2012; intellectual disability, Marquis et al., 2000; attention deficit hyperactivity disorder [ADHD], Miller & Lee, 2013), and/or did not report results by inclusive and noninclusive intervention settings (e.g., Gresham et al., 2004).
Gresham and colleagues (2004) conducted a comprehensive review of behavioral interventions published in the Journal of Applied Behavior Analysis from 1991 to 1999. Among the 150 school-based intervention studies identified, slightly over half of the studies (52%) did not involve conducting FBA. After comparing the effect sizes and the percentage of nonoverlapping data (PND) points, the authors did not find meaningful differences between FBI and non-FBI, but found that interventions based on functional analysis produced larger effects. In another meta-analysis, Gage and colleagues (2012) identified 69 single-case studies that investigated the effects of FBI on students with or at risk for EBDs. Despite the overall effectiveness found across different assessment procedures, the researchers concluded that FBI with experimental FBA might be more effective in reducing challenging behavior of students with or at risk for EBD compared with descriptive FBA. Furthermore, interventions implemented within general education settings, as compared with special education settings, resulted in better outcomes for students.
Goh and Bambara (2012) reviewed 83 studies in which students received individualized FBI across different school settings. Results indicated that FBI was effective in addressing challenging behavior across a diverse body of students, including those with and without disabilities, and settings, including inclusive and noninclusive school settings. The authors examined effect size by student disability and setting, and concluded that FBI can be applied successfully within inclusive settings. However, given that this review broadly focused on all students within schools, information about specific participant and intervention characteristics associated with students with disabilities in inclusive settings was unavailable. Most recently, Miller and Lee (2013) reviewed studies involving FBI and non-FBI for students with a specific diagnosis of ADHD. The results of this review suggested that FBI was more effective for this particular population of students, though information about the effectiveness of these procedures in inclusive school settings was not reported.
Given that these reviews do not focus specifically on FBI in inclusive settings, a need exists to use meta-analytic techniques to understand the overall effectiveness of FBI in inclusive settings for students with a variety of disabilities and potential characteristics that contribute to desirable intervention outcomes. The results of such a review may inform the development of more effective interventions, which, in turn, may contribute to more meaningful inclusive experiences for individuals with disabilities. The purpose of this meta-analysis was to synthesize single-case intervention research studies in which students with disabilities received FBI within inclusive school settings to address challenging behavior. Specifically, this meta-analysis was designed to address the following research questions:
Method
Literature Search
Three strategies were used to identify studies included in the review: (a) electronic reference database search, (b) hand search of journals, and (c) ancestry search (i.e., review of reference lists from relevant published literature reviews and relevant articles). Electronic databases included PsycINFO, ERIC, and MEDLINE. In addition, we searched ProQuest Dissertations and Theses, a database that disseminates and archives thesis papers and dissertations, to address potential publication bias (Cooper, Hedges, & Valentine, 2009). Search parameters were as follows: (challenging behavior OR maladaptive behavior OR aberrant behavior OR problem behavior OR destructive behavior OR disruptive behavior OR self-injurious behavior OR inappropriate behavior OR aggression OR noncompliance) AND (functional assessment OR function-driven intervention OR function-based intervention OR functional behavior assessment OR functional analysis) in all text fields. We did not apply search terms specific to inclusive school settings, as this may have inadvertently limited the outcome of the searches. We also reviewed the reference lists of related literature reviews and meta-analyses identified through electronic searches (e.g., Braddock, 1999; Goh & Bambara, 2012) and hand searched 14 journals associated with research in the area of behavioral intervention and special education (a list of journals is available upon request). All searches were limited to the last 20 years of research, with references published between 1994 and 2014. Collectively, these search strategies yielded a total of 3,311 sources, with results across each search strategy as follows: electronic reference search (n = 1,620), hand search (n = 1,671), and ancestry search (n = 20).
After consulting the abstract and eliminating duplicates, irrelevant sources (e.g., book chapters, commentaries), and studies that were unrelated to the focus of the search, a total of 352 potentially relevant studies remained in the literature database.
Selection Criteria
To determine whether the remaining 352 studies met the requirements for inclusion, we reviewed the abstract of each article to apply the following inclusion criteria: (a) The study utilized an experimental single-case research design (i.e., multiple baseline and probe designs, reversal and withdrawal designs, and comparative intervention designs with three or more data points in the initial control phase; Gast & Ledford, 2014), (b) the study included one or more students with a reported disability that fell within one or more categories of disability under the Individuals with Disabilities Education Act (IDEA), (c) the study participants received FBI with one or more relevant dependent measures of challenging or appropriate behavior, and (d) FBI was delivered within an inclusive school setting. When the abstract did not provide sufficient information to make a selection decision, we accessed the full article or contacted the author(s) of the study in question. Studies were excluded if intervention outcomes were evaluated with a nonexperimental case study design (i.e., AB design) or group comparison design. We defined FBI as any intervention strategy that was based on the information derived from an FBA (Ingram et al., 2005). Therefore, studies were excluded if the intervention strategies were not directly linked to the function of the challenging behavior, as determined through the reported results of an FBA. Inclusive school settings included any setting within early childhood, elementary, middle, and/or high school where peers without disabilities were present. However, studies were eliminated when participants received FBI in segregated areas (e.g., partitioned area of classroom), despite the presence of peers without disabilities. Because we calculated effect sizes based on the raw values of graphed data, studies that did not include graphed data or presented unclear data that precluded such calculations were excluded.
To evaluate interrater reliability, a second coder independently applied the inclusion criteria to 30% of the articles (n = 105) selected at random. The primary coder (the first author) was an assistant professor in special education, and the secondary coder (the third author) was a speech–language pathologist specializing in behavioral intervention; both had experience implementing behavioral intervention within inclusive school settings. Interrater reliability was calculated across each of the four inclusion criteria by dividing the number of agreements by the number of agreements plus disagreements and multiplying by 100 to obtain a percentage of agreement. The mean interrater reliability across the inclusion criteria was 94%. The coders discussed disagreements and determined whether to include the articles in question. Ultimately, we identified a total of 27 studies meeting the criteria for inclusion.
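The percentage-of-agreement calculation described above can be illustrated with a minimal sketch. The function name and the include/exclude decisions below are hypothetical and are not drawn from the reviewed studies; the sketch only restates the formula (agreements divided by agreements plus disagreements, multiplied by 100).

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items on which two coders gave the same decision."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same set of items.")
    if not coder_a:
        raise ValueError("At least one rated item is required.")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    disagreements = len(coder_a) - agreements
    return 100 * agreements / (agreements + disagreements)


# Hypothetical include/exclude decisions for one inclusion criterion
# across ten articles (illustrative values only, not study data).
primary = [True, True, False, True, False, True, True, False, True, True]
secondary = [True, True, False, True, True, True, True, False, True, True]

agreement = percent_agreement(primary, secondary)
print(f"{agreement:.0f}% agreement")
```

In this illustration the coders disagree on one of ten articles. A mean across the four inclusion criteria would simply average four such percentages.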
Coding Procedures
Coding instrument
After applying the selection criteria, we coded the remaining 27 included articles using a coding form developed specifically for this review (complete form with corresponding coding definitions is available from author). It should be noted that individual study participants were treated as the unit of analysis, and thus, codes were identified for each qualifying participant across each coding item. The coding form incorporated various participant, intervention, and quality of study characteristics consistent with those included in relevant literature reviews and evaluated as potential moderating variables within meta-analyses in the field of behavioral intervention and special education (e.g., Goh & Bambara, 2012; Walker & Snell, 2013). Utilizing similar coding schemes allowed for a comparison of results from the current review with those from other reviews. The form included coding items categorized into six sections: (a) student participant characteristics (number, gender, age, school level, ethnicity, disability diagnosis), (b) intervention characteristics (behavior assessment, behavior assessor, type of intervention, delivery method, dosage [frequency, duration], setting, interventionist, interventionist training, training delivery), (c) intervention outcomes (measures of challenging behavior [destructive, disruptive, or distracting behavior] and appropriate behavior [e.g., on-task behavior, replacement behavior]), (d) research design, (e) quality of study, and (f) effect size. In the case of this review, effect size refers to “a quantitative index of practical significance that estimates the meaningfulness of change associated with an intervention” (Vannest & Ninci, 2015, p. 403). As such, the term “effect size” will be used throughout the review to align with terminology used in other single-case research reviews and related publications.
The quality of study characteristics were based in part on those investigated in previous literature reviews (e.g., Walker & Snell, 2013) and guidelines for single-case research set forth by the What Works Clearinghouse (WWC; Kratochwill et al., 2010). The purpose of applying these guidelines was to provide a descriptive analysis of study quality across external, internal, and social validity indicators. The procedures for applying the WWC standards replicated those outlined by Maggin, Briesch, and Chafouleas (2013). However, it is important to note that not all students within a given study qualified for inclusion in the current review; therefore, in some cases, application of the standards excluded quality demonstrations that were not considered for the review. It also should be noted that studies not meeting the WWC design standards were not considered for evaluation against the WWC evidence standards.
Meta-analytic researchers have relied on several a priori and post hoc approaches to handle study quality (Cooper et al., 2009), with some utilizing study quality as an inclusion/exclusion criterion and others including all studies regardless of potential bias (e.g., Goh & Bambara, 2012; Walker & Snell, 2013). The latter approach may be particularly useful when meta-analyses utilize mixed methods approaches whereby both descriptive and inferential analyses are combined to summarize studies, with one goal being to describe the status of research relative to study quality. Another strategy is to conduct post hoc analyses to examine the effect of study quality on study outcomes (i.e., effect size). Because of our mixed methods focus, we did not exclude from our analyses cases in which quality indicators were absent (a decision rule established prior to conducting the review to address potential bias; Cooper et al., 2009), and we found that study quality did not affect intervention outcomes.
Effect size calculation
Prior to calculating effect sizes, we identified the numeric value of raw data from each participant’s graph using UnGraph for Windows (2004), a software program that allows users to extract numerical data from graphs when such information is unavailable. Shadish et al. (2009) reported UnGraph to have high reliability and validity. Raw data values were then entered into either SPSS or an online calculator (Vannest, Parker, & Gonen, 2011) for the purposes of calculating effect sizes. To estimate the effect of FBI, we calculated effect sizes for all measures of challenging and appropriate behavior across each study participant. Not all studies included measures of challenging and appropriate behavior; thus, in some cases, effect size was calculated for only one measure per participant.
Typically, the success of a single-case research intervention is determined by visual inspection of graphed data; changes in level, slope, trend, consistency, and overlap across baseline and intervention conditions, along with the immediacy of change, are all considered (Gast & Ledford, 2014). However, visual inspection alone is not sufficient for quantitatively summarizing the overall effect of a common intervention across multiple studies for the purposes of a meta-analysis (Vannest & Ninci, 2015). The technique for calculating effect size for single-case research data has been widely debated, yet several recent approaches address many concerns associated with quantifying single-case intervention effect and have been successfully used by researchers to conduct meta-analytic studies. We applied two relatively new approaches to estimate intervention effect across each study participant: (a) Nonoverlap of All Pairs (NAP; Parker & Vannest, 2009) and (b) Tau-U (Parker, Vannest, Davis, & Sauber, 2011). The NAP index relies on a pairwise comparison of data across study phases and is calculated as the number of comparison pairs showing no overlap, divided by the total number of comparisons. We utilized SPSS 22.0 for Windows to estimate NAP using the receiver operating characteristic (ROC) diagnostic test (Parker & Vannest, 2009). Parker and Vannest (2009) suggested the following interpretation guidelines: weak effect: 0 to 0.65, medium effect: 0.66 to 0.92, and large or strong effect: 0.93 to 1.00. Although NAP has been shown to equal or outperform other nonoverlap indexes (e.g., PND; Scruggs, Mastropieri, & Casto, 1987), the Tau-U index, a more sophisticated approach, considers both nonoverlapping data across phases and the trend from the intervention phase (Parker et al., 2011).
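To make the NAP calculation concrete, the following minimal sketch computes the index directly from pairwise comparisons rather than through the SPSS ROC procedure used in this review. The function names and data are hypothetical. Counting a tied pair as half a nonoverlapping pair is an assumption based on the common NAP convention, and values falling between the published ranges (e.g., between 0.65 and 0.66) are assigned to the lower category here.

```python
from itertools import product


def nap(baseline, intervention, increase_is_improvement=True):
    """Nonoverlap of All Pairs for one baseline/intervention contrast."""
    if not baseline or not intervention:
        raise ValueError("Both phases need at least one data point.")
    score = 0.0
    for a, b in product(baseline, intervention):
        diff = (b - a) if increase_is_improvement else (a - b)
        if diff > 0:
            score += 1.0  # intervention point improves on baseline point
        elif diff == 0:
            score += 0.5  # tie counted as half a pair (assumed convention)
    return score / (len(baseline) * len(intervention))


def interpret_nap(value):
    """Label a NAP value using the ranges quoted from Parker and Vannest."""
    if value >= 0.93:
        return "large or strong"
    if value >= 0.66:
        return "medium"
    return "weak"


# Hypothetical counts of challenging behavior per session (not study data);
# a decrease is the desired direction, so increase_is_improvement is False.
baseline_sessions = [8, 7, 9, 6]
intervention_sessions = [3, 6, 2, 1, 2]

nap_value = nap(baseline_sessions, intervention_sessions,
                increase_is_improvement=False)
print(round(nap_value, 3), interpret_nap(nap_value))
```

Here 19 of the 20 pairs show improvement and one pair is tied, so the index is 19.5 divided by 20.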
After each pairwise comparison is determined as positive (i.e., desirable improvement), negative, or tied, Tau-U is calculated as the difference between the positive pairs and negative pairs, divided by the total number of pairs. We used a web-based calculator, Single Case Research (Vannest et al., 2011), to calculate Tau-U. Vannest and Ninci (2015) provided the following interpretation guidelines: small change: <0.20, moderate change: 0.20 to 0.60, large change: 0.60 to 0.80, and large to very large change: >0.80.
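The pairwise definition above can be sketched as follows. This is a simplified illustration with hypothetical names and data: it computes only the nonoverlap component (positive pairs minus negative pairs, divided by all pairs) and does not reproduce any trend adjustment that the web-based calculator may apply. Because the published ranges share the boundary value 0.60, assigning exactly 0.60 to the "large" category is an assumption.

```python
def tau_nonoverlap(baseline, intervention, increase_is_improvement=True):
    """(positive pairs - negative pairs) / total pairs; ties count as zero."""
    if not baseline or not intervention:
        raise ValueError("Both phases need at least one data point.")
    positive = negative = 0
    for a in baseline:
        for b in intervention:
            diff = (b - a) if increase_is_improvement else (a - b)
            if diff > 0:
                positive += 1
            elif diff < 0:
                negative += 1
    return (positive - negative) / (len(baseline) * len(intervention))


def interpret_tau_u(value):
    """Label a Tau-U value using the ranges quoted from Vannest and Ninci."""
    if value > 0.80:
        return "large to very large"
    if value >= 0.60:  # boundary assignment is an assumption
        return "large"
    if value >= 0.20:
        return "moderate"
    return "small"


# Hypothetical percentages of on-task intervals (not study data);
# an increase is the desired direction for appropriate behavior.
baseline_sessions = [20, 35, 30, 25]
intervention_sessions = [60, 30, 75, 80, 70]

tau_value = tau_nonoverlap(baseline_sessions, intervention_sessions)
print(round(tau_value, 2), interpret_tau_u(tau_value))
```

In this illustration 18 pairs are positive, one is negative, and one is tied, giving 17 divided by 20.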
To address the unique features of each type of single-case research design, we applied guidelines described by Walker and Snell (2013). Each control phase (e.g., baseline) and intervention phase was contrasted to examine the extent of behavior change between phases. Functional analysis data meeting those conditions outlined by Walker and Snell were used in the absence of baseline data. The following strategies guided this process: (a) ABAB/reversal design or variations thereof—NAP and Tau-U were calculated for each AB and BA pairing to measure the introduction and withdrawal of intervention (Parker, Vannest, & Brown, 2009), (b) multiple baseline or probe design—NAP and Tau-U were calculated for each tier consisting of control and intervention phases with the resulting effect sizes averaged, and (c) alternating treatment designs—NAP and Tau-U were calculated for each relevant treatment with resulting effect sizes averaged. To account for the inverse contrasts of ABAB/reversal designs, we changed the test direction of the ROC analysis for NAP scores and recorded the absolute value of Tau-U scores produced by the calculator.
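The design-specific rules above might be sketched as shown below, using NAP only. All names and data are hypothetical, and the tie handling (half credit) is an assumed convention. The sketch averages tiers of a multiple baseline design and, for a reversal design, flips the expected direction for each withdrawal (BA) pairing, which mirrors changing the test direction of the ROC analysis; it returns the individual AB and BA values without combining them.

```python
from statistics import mean


def nap(first_phase, second_phase, increase_expected=True):
    """Share of pairs in which the second phase moves in the expected direction."""
    score = 0.0
    for a in first_phase:
        for b in second_phase:
            diff = (b - a) if increase_expected else (a - b)
            score += 1.0 if diff > 0 else 0.5 if diff == 0 else 0.0
    return score / (len(first_phase) * len(second_phase))


def nap_multiple_baseline(tiers, increase_is_improvement=True):
    """Average NAP across tiers; each tier is (baseline, intervention)."""
    return mean(nap(a, b, increase_is_improvement) for a, b in tiers)


def nap_reversal(phases, increase_is_improvement=True):
    """NAP for each adjacent pairing in a sequence such as A-B-A-B.

    phases is a list of (label, data) with labels "A" or "B". For a B-to-A
    pairing (withdrawal), behavior is expected to move away from improvement,
    so the expected direction is flipped.
    """
    results = []
    for (l1, d1), (l2, d2) in zip(phases, phases[1:]):
        introducing = (l1, l2) == ("A", "B")
        direction = increase_is_improvement if introducing else not increase_is_improvement
        results.append((l1 + l2, nap(d1, d2, direction)))
    return results


# Hypothetical challenging-behavior counts (decrease is improvement).
abab = [("A", [8, 9, 7]), ("B", [3, 2, 7]), ("A", [7, 8, 6]), ("B", [2, 1, 2])]
reversal_scores = nap_reversal(abab, increase_is_improvement=False)

tiers = [([5, 6, 5], [2, 1, 2, 1]), ([6, 7, 6, 7], [3, 6, 2])]
mb_score = nap_multiple_baseline(tiers, increase_is_improvement=False)

print(reversal_scores)
print(round(mb_score, 3))
```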
Coding reliability
Initially, all coders met to review the coding instrument and discuss coding definitions. Next, each coder independently coded a subset of articles (n = 3), resulting in 90% agreement among coders; coders discussed discrepancies in coding outcomes before moving forward with coding assignments. Two secondary coders (the second and third authors) independently coded an additional 52% (n = 14) and 26% (n = 7) of the articles selected at random, respectively, to analyze interrater reliability across descriptive study characteristics. The additional secondary coder (the second author) was an assistant professor in special education with behavioral intervention research experience. In addition, this coder independently calculated NAP and Tau-U scores for 27% (n = 12) of study participants. Interrater reliability was calculated across each coding item by dividing the number of agreements by the number of agreements plus disagreements and multiplying by 100 to obtain a percentage of agreement. The mean interrater reliability across all descriptive coding items was 96% (range = 78%–100%); agreement on NAP and Tau-U scores was 100%. The coders met to discuss disagreements and reached a consensus as to which code to apply for analysis.
Analysis
Data analysis consisted of both descriptive and nonparametric analyses. To estimate the overall effect of FBI across qualifying study participants, we calculated average NAP and Tau-U scores for measures of challenging and appropriate behavior in SPSS. The Kruskal–Wallis one-way ANOVA was utilized to detect significant differences in intervention effect across coded participant and study characteristics. Coding characteristics with eight or fewer ns were excluded from these analyses (Walker & Snell, 2013); as such, entire coding items were, at times, eliminated from moderator analyses (e.g., student ethnicity/race, intervention setting, interventionist training). A total of 10 (challenging behavior) and seven (appropriate behavior) analyses were conducted across qualifying coding categories. Finally, we calculated descriptive statistics (frequency, percentage) for each quality of study indicator in SPSS.
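For readers unfamiliar with the Kruskal–Wallis test, the sketch below illustrates the moderator comparison in plain Python; the analyses in this review were run in SPSS, and this code is not that procedure. The scores, level names, and helper names are hypothetical. The threshold min_n = 8 reflects one reading of the sample-size rule, and the p-value helper is valid only when two levels are compared (df = 1).

```python
import math
from collections import Counter


def kruskal_wallis(groups):
    """Return (H, df) for two or more groups, with the usual tie correction."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    counts = Counter(pooled)
    if len(groups) < 2 or len(counts) < 2:
        raise ValueError("Need at least two groups and two distinct values.")
    # Assign each distinct value the mean of the ranks it occupies.
    rank_of, next_rank = {}, 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = next_rank + (t - 1) / 2
        next_rank += t
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    tie_term = sum(t ** 3 - t for t in counts.values())
    return h / (1 - tie_term / (n_total ** 3 - n_total)), len(groups) - 1


def p_value_df1(h):
    """Upper-tail chi-square probability; valid only for df = 1."""
    return math.erfc(math.sqrt(h / 2))


def qualifying_levels(scores_by_level, min_n=8):
    """Drop moderator levels with too few participants (threshold assumed)."""
    return {k: v for k, v in scores_by_level.items() if len(v) >= min_n}


# Hypothetical NAP scores grouped by who conducted the FBA (not study data).
scores_by_assessor = {
    "teacher": [0.95, 1.00, 0.88, 0.97, 0.92, 1.00, 0.90, 0.99],
    "experimenter_or_therapist": [0.80, 0.75, 0.91, 0.84, 0.70, 0.89, 0.78, 0.86, 0.93],
    "paraprofessional": [0.85],
}
levels = qualifying_levels(scores_by_assessor)
h, df = kruskal_wallis(list(levels.values()))
p = p_value_df1(h)  # valid here because two levels remain (df = 1)
print(f"H = {h:.2f}, df = {df}, p = {p:.3f}")
```

The single-participant level is dropped before testing, which mirrors how entire coding items could be eliminated from the moderator analyses.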
Results
We have organized the results of this meta-analysis into four main sections: (a) overall effect of intervention, (b) moderator analyses findings, (c) descriptive analyses findings, and (d) quality of study indicators.
Overall Effect
Overall, FBI implemented in inclusive settings was found to have a positive effect on both challenging and appropriate behavior of students with disabilities. The average NAP score across the 45 participants was .86 (SD = .24; range = .00–1.00) for challenging behavior and .90 (SD = .12; range = .58–1.00) for appropriate behavior. According to the guidelines set forth by Parker and Vannest (2009), these average NAP scores represent the upper end of a moderate intervention effect. Similarly, the average Tau-U score was .86 (SD = .23; range = .00–1.00) for challenging behavior and .90 (SD = .13; range = .58–1.00) for appropriate behavior. The average Tau-U scores can be interpreted as a large to very large improvement (Vannest & Ninci, 2015).
Moderator Analyses Findings
To determine whether study characteristics, if any, contributed to more or less pronounced intervention effects, we conducted moderator analyses of all qualifying participant and intervention coding items and corresponding characteristics. These analyses were applied across NAP and Tau-U scores of both challenging and appropriate behavior. Only characteristics that were coded across eight or more participants were included in the analyses; as a result, in some cases, several coding characteristics or entire coding item categories were eliminated from the analyses. Findings for challenging behavior and appropriate behavior are presented in Tables 1 and 2, respectively.
Table 1. Moderator Analysis Findings for Challenging Behavior.
Note. χ2 values are derived from the Kruskal–Wallis test. NAP = Nonoverlap of All Pairs; ID = intellectual disability; DD = developmental delay; LD = learning disability; FBA = functional behavior assessment.
Not all studies provided graphed data of challenging behavior; thus, M NAP and Tau-U scores do not reflect challenging behavior across all participants within a category. A total of 18 studies and 32 participants were included in this analysis.
p < .01 (alpha level refers to comparison across moderator levels).
Table 2. Moderator Analysis Findings for Appropriate Behavior.
Note. χ2 values are derived from the Kruskal–Wallis test. NAP = Nonoverlap of All Pairs; FBA = functional behavior assessment.
Not all studies provided graphed data of appropriate behavior; thus, M NAP and Tau-U scores do not reflect appropriate behavior across all participants within a category. A total of 17 studies and 31 participants were included in this analysis.
p < .06 (alpha level refers to comparison across moderator levels).
Most study characteristics did not contribute to statistically significant differences in FBI outcomes. However, the effect of the individual conducting the FBA (specifically, their role) on students’ challenging behavior, as calculated by NAP, was statistically significant, χ2(1, N = 32) = 7.08, p < .01. NAP scores were significantly higher when the classroom teacher (M = .91) conducted the FBA as compared with an experimenter or therapist (M = .81). Furthermore, the effect of the instructional arrangement on appropriate behavior as calculated by NAP was statistically significant, χ2(1, N = 27) = 3.48, p < .06, suggesting that NAP scores for participants receiving intervention in whole group arrangements (M = .95) were significantly higher than NAP scores for participants receiving intervention in small group instructional arrangements (M = .84). However, similar results were not found when conducting these analyses with Tau-U.
Descriptive Analyses Findings
Descriptive results are reported as the number and percentage of participants associated with specific coding characteristics, as the participant was treated as the unit of analysis. Not all studies reported information relevant to all coding items and, consequently, some coding characteristics may have been coded as “cannot determine” for any given participant; therefore, some percentages do not total to 100%. However, it was also possible to code one or more characteristics for any given participant (e.g., disability diagnosis). As such, totals for any given coding category may exceed 100%.
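The reporting rule above, in which a participant may carry more than one code for an item or none at all, can be illustrated with a short sketch. The function name and the codes are hypothetical; the example shows only why percentages for a coding item can total more or less than 100%.

```python
from collections import Counter


def characteristic_summary(codes_per_participant):
    """Return {code: (n, percent)} with the participant as the unit of analysis."""
    total = len(codes_per_participant)
    counts = Counter(
        code for codes in codes_per_participant for code in set(codes)
    )
    return {code: (n, round(100 * n / total)) for code, n in counts.items()}


# Hypothetical FBA-method codes for five participants (not study data).
# An empty set stands for "cannot determine".
fba_methods = [
    {"descriptive"},
    {"descriptive", "experimental"},
    {"descriptive"},
    {"descriptive", "experimental"},
    set(),
]

summary = characteristic_summary(fba_methods)
total_percent = sum(percent for _, percent in summary.values())
print(summary, total_percent)
```

Two participants contribute to both categories, so the percentages sum to more than 100% even though one participant has no code.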
Student participants
A total of 45 students with disabilities received FBI within inclusive school settings across the 27 included studies. A large percentage of these students were male (n = 33, 73%), with fewer females (n = 12, 27%) receiving intervention. The average age of students was 7.9 years (range = 2–17 years). Students received special education services within elementary school (n = 20, 44%), early childhood (n = 14, 31%), middle school (n = 7, 16%), and high school (n = 2, 4%) settings. Of the 19 students for whom ethnicity was described, 13 (68%) were Caucasian, four (21%) African American, and two (11%) Hispanic. Disability diagnoses were reported as follows: (a) developmental delay/intellectual disability (n = 12, 27%), (b) autism spectrum disorders (n = 12, 27%), (c) LD (n = 10, 22%), (d) other (e.g., ADHD; n = 6, 13%), and (e) specific genetic syndrome (e.g., Down syndrome; n = 3, 7%). We were unable to determine the disability diagnosis for three of the participants.
Intervention characteristics
Participants’ behavior was assessed prior to the implementation of FBI using descriptive (n = 44, 98%) and experimental (n = 18, 40%) FBA methods. Descriptive approaches included both indirect (e.g., record review, interview; n = 43, 98%) and direct (e.g., direct observation; n = 35, 80%) strategies; experimental approaches included traditional analog functional analysis (Iwata, Dorsey, Slifer, Bauman, & Richman, 1982) and variations of such (e.g., trial-based functional analysis; see Rispoli, Ninci, Neely, & Zaini, 2014). Typically, an experimenter or therapist (n = 38, 84%) conducted the FBA; teachers (n = 15, 33%), paraprofessionals (n = 1, 2%), and parents (n = 1, 2%) were less likely to serve in this role.
The FBA results were used to guide the development of FBI that was applied in inclusive school settings; a large number of students (n = 29, 64%) received interventions comprised of multiple components (e.g., antecedent- and consequence-based strategies), whereas fewer students received intervention that relied on consequence-based (n = 12, 27%) or antecedent-based (n = 4, 9%) strategies alone. Interventions were implemented across a wide range of inclusive settings, such as classrooms during nonacademic activities (e.g., nap time, free time; n = 23, 51%), specific academic classes (e.g., math, social studies, science, language arts; n = 22, 49%), and elective classes (e.g., music, art, physical education; n = 7, 16%). Within these inclusive settings, participants received intervention within small group (n = 12, 27%), whole group (n = 12, 27%), and one-on-one (n = 11, 24%) instructional arrangements. The instructional arrangement was not described for slightly less than half (n = 21, 47%) of participants.
For most students, the classroom teacher (n = 39, 87%) and/or paraprofessional (n = 12, 27%) implemented FBI; experimenters or therapists (n = 8, 18%) and peers (n = 1, 2%) were less often reported as interventionists. In a majority of cases (n = 28, 62%), the interventionist, excluding experimenters or therapists, received training (e.g., workshop, classroom-based coaching) to support their implementation of FBI. Intervention dosage (i.e., frequency, duration) was reported for only 14 participants, with just one or two dosage characteristics (i.e., sessions per day, trials per session, session days per week, duration in days, weeks, months, or years) being described.
Intervention outcomes and research design
Studies most often included measures of both challenging and appropriate behavior across individual participants (n = 18, 40%) or challenging behavior alone (n = 14, 31%). Although each participant received FBI to address challenging behavior, some studies included only a measure of appropriate behavior for individual participants (n = 13, 29%). Measures of challenging behavior were distributed proportionally across three priority levels: (a) disruptive behavior (does not immediately endanger student or others, but interferes with everyday activities and experiences; n = 30, 67%), (b) distracting behavior (deviates from what is typically expected from an individual of the same age, but does not substantially interfere with everyday activities and experiences; n = 29, 64%), and (c) destructive behavior (harmful or threatens safety of students or others; n = 22, 49%). The single-case research designs most often applied to evaluate intervention effect for study participants were multiple baseline designs or variations thereof (e.g., multiple probe design; n = 27, 60%) and reversal designs or variations thereof (n = 18, 40%). Intervention effect was evaluated by a changing criterion design for only two (4%) participants.
Quality of Study Indicators
Initially, we coded each study to assess external validity (generalization and maintenance) and social validity across participants. Acceptable skill generalization (to new partners, settings, stimuli, responses, etc.) was reported across 13 (29%) participants, with acceptable skill maintenance (measured at least 3 months post intervention) not reported. Social validity relevant to the value or practical nature of the intervention was measured across 27 (60%) participants; in all cases, the intervention was reported to be socially valid. We then applied the WWC standards to each participant. Results for the WWC design standards are as follows: meets standards with reservations (n = 26, 58%), does not meet standards (n = 12, 27%), and meets standards (n = 7, 15%). Of those meeting or meeting standards with reservations, over half (n = 21, 64%) met the evidence standards with reservations; fewer met (n = 6, 18%) or did not meet (n = 6, 18%) evidence standards. A description of these results across studies is presented in Table 3.
Table 3. Summary of Study Quality Characteristics, Including Number of Participants Who Met Inclusion Criteria, Intervention Type, FBA, SV, and WWC DS and ES.
Note. 0 = does not meet standards; 1 = meets standards with reservations; 2 = meets standards. FBA = functional behavior assessment; SV = social validity; WWC = What Works Clearinghouse; DS = design standards; ES = evidence standards.
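The WWC tallies reported in the text can be combined with simple arithmetic. The sketch below uses only the participant counts stated above (45 participants; design and evidence standard counts) and shows how the combined "with or without reservations" percentages follow from them. The variable names and the helper `pct` are hypothetical.

```python
TOTAL_PARTICIPANTS = 45  # participants across the 27 included studies

# Counts reported in the text for WWC design standards (all 45 participants).
design = {"meets with reservations": 26, "does not meet": 12, "meets": 7}

# Counts reported for WWC evidence standards, applied only to participants
# whose design met standards with or without reservations.
evidence = {"meets with reservations": 21, "meets": 6, "does not meet": 6}


def pct(n, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * n / denominator)


design_pass = design["meets"] + design["meets with reservations"]
evidence_pass = evidence["meets"] + evidence["meets with reservations"]

print(design_pass, pct(design_pass, TOTAL_PARTICIPANTS))  # design standards, out of 45
print(evidence_pass, pct(evidence_pass, design_pass))     # evidence standards, out of design_pass
```

Note that the evidence standard percentages use the 33 participants who passed the design standards as the denominator, not all 45 participants.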
Discussion
We conducted this meta-analysis to evaluate the effects of FBI implemented in inclusive school settings for students with disabilities and, further, to identify participant and intervention characteristics that contributed to such effects. Findings from this meta-analysis extend the existing knowledge base on FBI for, and the inclusion of, students with disabilities in several notable ways that inform practice and future research, as described below.
Overall Effects of FBI
Overall, FBI implemented across student participants with different disabilities and within a variety of inclusive school settings resulted in moderate to strong intervention effects, suggesting that FBI can be effective and feasible in inclusive settings (e.g., Gage et al., 2012; Goh & Bambara, 2012), even for students with the highest priority behavior (e.g., physical aggression, self-injury) or who have more significant disabilities (e.g., intellectual disability, autism). This finding is particularly encouraging, as it highlights the potential of FBI to support students with disabilities in inclusive classrooms and to promote a more inclusive culture among schools implementing SWPBIS (Lane et al., 2012). Given that behavioral expectations may vary across grade levels and settings, it is critical that students with disabilities have the opportunity to learn and master skills required for classroom success in the least restrictive environments (Lane et al., 2012).
Characteristics That Moderated FBI Outcomes
We found that reductions in challenging behavior were significantly larger when FBAs were conducted by teachers as opposed to experimenters and therapists. Similarly, Marquis et al. (2000) found that intervention implemented by typical interventionists (e.g., classroom teachers) led to more pronounced intervention outcomes. Possible explanations of these findings include the existing relationship between the focus students and educators and the educators’ prior knowledge of student characteristics, conditions contributing to the challenging behavior, and setting(s) in which the behavior occurs. It is also possible that the presence of an experimenter or therapist conducting an FBA in the inclusive setting contributed to observer reactivity, whereby behavior is altered due to an awareness of being observed, thus resulting in unreliable FBA data. Regardless of the potential factors involved, these findings highlight the practicability of FBI for educators who typically lack the knowledge, skills, and prior training necessary to address high-priority challenging behavior. It is important to note that these particular findings were not replicated when moderator analyses were conducted with Tau-U scores. Researchers will need to explore the inconsistencies of these two effect size indices to better understand the strengths and limitations of each, and how this might affect interpretation of meta-analytic analyses, especially when comparing results with other reviews in which less sophisticated effect size indices have been applied.
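One reason the two indices can diverge is that Tau-U can adjust for trend in the baseline phase, whereas NAP does not. The sketch below assumes one commonly described formulation, in which the Kendall S statistic for baseline trend is subtracted from the Kendall S statistic for the phase comparison and the result is divided by the number of baseline-intervention pairs. Other variants exist, and the source does not state which variant or software was used, so the function names and data here are hypothetical illustrations only.

```python
def sign(x):
    """Return 1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kendall_s_between(baseline, intervention):
    """Kendall S for the phase comparison: improving pairs minus worsening pairs."""
    return sum(sign(b - a) for a in baseline for b in intervention)


def kendall_s_within(phase):
    """Kendall S for monotonic trend inside one phase (later point vs. earlier point)."""
    n = len(phase)
    return sum(sign(phase[j] - phase[i]) for i in range(n) for j in range(i + 1, n))


def tau_nonoverlap(baseline, intervention):
    """Tau for A vs. B without trend correction; equals 2 * NAP - 1."""
    return kendall_s_between(baseline, intervention) / (len(baseline) * len(intervention))


def tau_u(baseline, intervention):
    """Tau-U (A vs. B) with baseline trend subtracted, under the assumed formulation."""
    s_corrected = kendall_s_between(baseline, intervention) - kendall_s_within(baseline)
    return s_corrected / (len(baseline) * len(intervention))


# Hypothetical appropriate-behavior data (higher is better) with an improving baseline.
baseline_sessions = [2, 3, 4, 5]
intervention_sessions = [5, 6, 7, 8, 6]
print(tau_nonoverlap(baseline_sessions, intervention_sessions))
print(tau_u(baseline_sessions, intervention_sessions))
```

In this hypothetical case, the behavior was already improving during baseline, so the trend-corrected index is noticeably smaller than the uncorrected one. A moderator effect that appears with NAP could therefore shrink or disappear with Tau-U when baseline trends differ across groups.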
Our analysis also indicated a significant difference in appropriate behavior when intervention was implemented within a whole group rather than a small group arrangement, with more substantial improvement in appropriate behavior in a whole group instructional arrangement composed of peers without disabilities. This finding is noteworthy, as whole group instruction serves as a typical teaching arrangement in inclusive settings (Chung, Carter, & Sisco, 2012), and suggests that educators are able to balance teaching responsibilities and implementation of behavioral interventions during whole group activities. FBI may have been more successful in whole group arrangements due to other contextual factors such as natural social consequences associated with peer acceptance that may exert a certain level of control over socially appropriate behavior.
Consistent with some meta-analyses (e.g., Goh & Bambara, 2012; Walker & Snell, 2013), results of this review suggest that a significant difference does not exist in behavioral outcomes in relation to FBA type. That is, FBI was equally effective whether based on the results of descriptive FBA or experimental FBA. However, others have found interventions based on experimental FBA to be more effective (e.g., Gage et al., 2012). A more in-depth analysis is necessary to identify potential variables contributing to these conflicting conclusions (e.g., disability type, preintervention challenging behavior, assessor role). Nonetheless, this finding is encouraging, as descriptive FBA is considered less intrusive and requires a less sophisticated methodology as compared with experimental FBA strategies, and therefore is likely more feasible for teachers and paraprofessionals to implement in inclusive settings. However, it should be noted that when the behavior has no clear function based on a descriptive FBA, experimental FBA (e.g., Iwata et al., 1982; Rispoli et al., 2014) should be considered.
Characteristics of FBI in Inclusive Settings
There are several characteristics of the reviewed literature that are noteworthy. First, and similar to the results of Goh and Bambara (2012) and Gresham et al. (2004), a majority of FBI implemented in inclusive settings involved multiple components with a combination of antecedent-, teaching-, and/or consequence-based strategies. The selection of multiple strategies to address challenging behavior may reflect the complex nature of inclusive school environments and the growing emphasis on preventive and instructional (e.g., teaching replacement behavior, coping strategies) strategies in the field of SWPBIS (e.g., Dunlap et al., 2010). It will be important for researchers to continue identifying the specific elements of a multicomponent intervention that are necessary to produce desired results (i.e., component analysis). Doing so may result in the identification of socially valid strategies that general educators and paraprofessionals can efficiently implement in the inclusive classroom, where competing responsibilities are likely to affect successful implementation of FBI. FBI has received strong research support across a variety of recipients, with much of the research having been conducted in more segregated settings such as self-contained classrooms, therapy rooms, or treatment facilities. Therefore, this finding is particularly promising because it not only supports the generalizability of FBI to more inclusive settings but also strengthens the applicability of tertiary-level interventions of SWPBIS, which often rely on FBA, in inclusive settings. In addition, if challenging behavior no longer serves as a barrier to inclusion, students may experience more meaningful inclusive opportunities, including academic and social achievement.
Second, results of this review indicated that implementers of FBI in inclusive school settings were mostly school personnel (i.e., classroom teachers, paraprofessionals). This suggests that, with appropriate training and support, educators were able to implement FBI with fidelity, which subsequently resulted in reductions in challenging behavior and improvements in desirable behavior. It further supports the notion that interventions delivered by typical interventionists may contribute to better student outcomes (e.g., Marquis et al., 2000). We did not report the type and dosage of training provided to implementers in the reviewed studies. However, due to concerns regarding limited resources in schools (e.g., budget, time, personnel; Bambara et al., 2012), it will be necessary for researchers to identify the most efficient and effective training strategies for educators addressing high-priority challenging behavior in inclusive school settings. Furthermore, schools should consider building capacity for continued training of teachers and paraprofessionals by assigning supervisory and coaching roles to special educators, school psychologists, and others who are typically charged with conducting FBAs (e.g., train the trainer approach).
Third, and somewhat surprisingly, only two of the reviewed studies involved high school students with disabilities receiving FBI in inclusive settings. It is possible that older students with disabilities are less likely to be included, especially those with challenging behavior. Interestingly, Walker and Snell (2013) found that communication-based behavioral interventions were less effective for participants 18 years and older, suggesting that older students may have well-established patterns of behavior that are more resistant to intervention and serve as an additional barrier to inclusive experiences at the high school level. We also found this outcome to be concerning, as high school SWPBIS and inclusive educational teams may not be well informed about tertiary, individualized behavioral strategies to address students’ challenging behavior in inclusive settings. As such, we encourage schools to consider continued training in FBI across all grade levels and researchers to investigate best practices for training high school educators in FBI.
A final noteworthy finding relates to the quality of the reviewed studies. Only 60% of FBIs were assessed for social validity (i.e., value and/or practical nature of the intervention). It is important to understand whether these interventions are socially valid, especially when students are at risk for placement change due to behavior and educators are not equipped with the skills and knowledge necessary to intervene successfully. In addition, fewer measures of generalization (29%) and no measures of maintenance were reported. It is critical to determine whether reductions in challenging behavior and improvements in socially appropriate behavior generalize to new partners, settings, and stimuli, and whether such outcomes are maintained longitudinally. Future studies should include such measures to ensure that FBI within inclusive settings is feasible and practical and results in skill generalization and maintenance.
Our application of the WWC standards for single-case research suggested that 73% of cases met the design standards with or without reservations, with a majority (82%) of those cases having met evidence standards with or without reservations. Application of these standards can be helpful in determining whether studies exploring the effects of FBI are of high quality; however, additional quality research, meeting both the design and evidence standards, is needed to build upon the existing evidence base. Most importantly, readers must exercise caution when interpreting these results. Despite having reached acceptable levels of interrater reliability, the coders’ interpretation of data may have influenced their application of the standards (Maggin et al., 2013).
Limitations
Several limitations should be considered when interpreting the results of this review. The small number of cases analyzed in the review limits the generalizability of the results. Furthermore, we excluded student data when challenging or appropriate behavior outcomes were not graphically displayed; as such, the exclusion of descriptive data for these participants might have compromised the overall scope of the findings. We utilized two effect size metrics (i.e., NAP, Tau-U) to estimate intervention effect across study participants. In doing so, we found significant effects for two variables using NAP scores. However, it is unclear as to whether these statistically significant results can be substantiated, as analyses using Tau-U did not yield the same findings. We also eliminated from analyses coding characteristics or entire coding item categories due to a small number of participants representing such characteristics (e.g., one-on-one instruction for instructional arrangement analysis). Exclusion of these characteristics significantly limits our understanding of participant, setting, and other intervention characteristics that may lead to better FBI outcomes. As additional investigators explore FBI in inclusive settings, analyses of these excluded variables will be necessary. A final noteworthy limitation involves the way in which we handled cases with poor study quality. We included all studies in our analyses, regardless of study quality. Because including studies with weaker quality involves combining different levels of evidence and increases the risk for bias (Cooper et al., 2009), results from this study should be interpreted with caution. However, the fact that some cases did not meet evidence standards emphasizes the importance of continued high-quality research so as to establish a stronger evidence base in this particular area of behavioral intervention.
Conclusion
We conducted a meta-analysis to examine the effectiveness of FBI delivered in inclusive school settings on behavioral outcomes of students with disabilities. Across 27 reviewed studies with a total of 45 students, FBI was shown to effectively reduce challenging behavior and increase appropriate behavior of participating students. In particular, the intervention effect was significantly more pronounced when classroom teachers conducted the FBA and when FBI was implemented within a whole group arrangement. Our findings not only strengthen the evidence base of FBI but also highlight the feasibility of FBI in inclusive settings. We hope to encourage educational teams to continue delivering appropriate individualized behavioral supports to advance inclusion outcomes of students with disabilities.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
