Abstract
We conducted a descriptive analysis of single-case research design (SCRD) studies on safety skills instruction (SSI) for individuals with autism spectrum disorder (ASD). After identifying studies through electronic databases and reference lists, we used What Works Clearinghouse (WWC) Standards to evaluate each study. We analyzed studies in terms of various descriptive variables, calculated effect sizes through improvement rate difference (IRD), and aggregated effect sizes across studies to produce an omnibus effect size. Results showed that 18 of 29 studies were rated as meeting WWC design standards (MS) or meeting design standards with reservations (MS-R), and that various types of SSI were effective in teaching various skills. Of these 18 studies, 12 resulted in a large effect, and we found a behavioral skills training (BST) package to be evidence-based when we applied a 5-3-20 rule. Implications for researchers and practitioners are discussed.
Children with autism spectrum disorder (ASD) and other developmental disabilities face 2 to 3 times the risk of injury or abuse compared with same age peers in the general population (e.g., Agran & Krump, 2010; Calavari & Romanczyk, 2012; Lee et al., 2008; Volkmar & Wiesner, 2009). This can be attributed to various factors, including difficulty in social interactions and communication, level of cognitive functioning, lack of generalization (Doyle & Doyle-Iland, 2004), and failure to learn new skills not taught systematically (Summers et al., 2011). An unsafe response to strangers who intend harm may occur due to compliance training (e.g., being taught to comply with demands of teachers, staff, or therapists; Lumley & Miltenberger, 1997). Finally, safety skills can be difficult to teach as students may not have the opportunity to practice them on a regular basis.
Prevalence rates for ASD have risen over recent decades (Fombonne, 2003; Matson & Kozlowski, 2011), increasing demands on educational, health care, and social services. Research has shown that students with ASD may need specially designed instructional strategies to acquire and generalize new content and skills (Courtade et al., 2015; Ryan et al., 2011). Since safety risks for individuals with ASD are high, it is imperative that professionals, as well as parents, implement evidence-based practices (EBPs) identified through rigorous experimental research, including single-case research design (SCRD). This can improve long-term independence as well as related social outcomes while reducing risks associated with ineffective treatments (e.g., child safety, instructor burn out).
In spite of the need, teaching safety skills to individuals with ASD is often neglected (Agran & Krump, 2010; Brown-Lavoie et al., 2014; Kenny et al., 2013). Limited research on safety skills instruction (SSI) has shown that, when taught systematically, individuals with ASD can learn various safety skills. Behavioral skills training (BST; Garcia et al., 2016), video modeling and prompting (Akmanoglu & Tekin-Iftar, 2011), and other prompting strategies (Harriage et al., 2016) are examples of interventions used for teaching safety skills. While outcomes are promising, research reveals that teachers and parents of children with ASD do not provide systematic instruction on safety skills as needed (Sirin & Tekin-Iftar, 2016). Sirin and Tekin-Iftar documented that parents and teachers of children with ASD considered SSI as limited to providing warnings to stay away from risks and/or eliminating risks rather than implementing effective instructional procedures to teach safety skills. Both groups indicated the need for programs and materials to guide them in how to teach these skills.
Despite promising SSI outcomes, further research across a number of variables (e.g., ages, circumstance) is needed to identify effective interventions to guide teachers, therapists, and families. Almost all reviews of studies on SSI to date have focused on teaching individuals with intellectual disability (e.g., Dixon et al., 2010; Wright & Wolery, 2011). Wiseman et al. (2017) recently provided promising evidence for conducting SSI with individuals with ASD based on 11 SCRD studies published between 1993 and 2014; they reported medium to large effect sizes across interventions and no difference in effectiveness across methods or setting, concluding that more research is needed.
We designed the current study to contribute to the SSI literature in various ways. First, Wiseman et al. (2017) included only published SCRD studies and excluded graduate studies (i.e., theses and dissertations); hence, their findings should be interpreted cautiously due to potential bias known as the “file drawer problem” (Rosenthal, 1979). Although some studies conducted by graduate students may go unpublished, these studies can play a significant role (e.g., serve as case studies for future research, describe ineffective procedures). At the same time, journal editors tend to publish only studies with positive outcomes. In particular, Shadish et al. (2016) noted that researchers are more likely to submit and recommend publication of SCRD studies that have large effect sizes; thus, studies with smaller effect sizes may never be recognized. Including unpublished studies in the analysis, regardless of whether their findings are positive or show a large effect size, can prevent a potential bias toward specific procedures. Second, Wiseman et al. included research studies published between 1990 and 2016; however, autism was first defined as infantile autism in the Diagnostic and Statistical Manual of Mental Disorders (3rd ed.; DSM-III; American Psychiatric Association, 1980) in 1980. Therefore, we also included studies published prior to 1990. Third, we used What Works Clearinghouse (WWC; Kratochwill et al., 2013) Standards to evaluate each study in terms of rigor. Fourth, we analyzed more variables (e.g., gender, assessment method for targeting skills, criteria, interventionists) to draw comprehensive conclusions. Last, there are several nonparametric techniques to calculate effect sizes of SCRD studies. While improvement rate difference (IRD) provides reliable findings, Wiseman et al. used another nonparametric technique, Tau U. There is a debate in special education as to the most appropriate method for synthesizing SCRD (Parker et al., 2011).
IRD (Chen et al., 2016) provides the difference in successful performance between baseline and intervention conditions (Parker et al., 2009). Parker et al. stated the advantages of IRD as (a) accessible interpretation as to the difference in improvement rates between baseline and treatment conditions, (b) simple hand-calculation, (c) compatibility with the percentage of nonoverlapping data (PND) from visual analysis, (d) known sampling distribution so confidence intervals are available, (e) proven track record (as risk difference) of a great number of evidence-based medical research studies, (f) few data distribution assumptions, and (g) application to complex SCRD and multiple data series. Therefore, this analysis contributes to the SSI literature by using IRD. In summary, we designed the present study to extend the literature on SSI through the above stated points. There are two major approaches to designing a study to identify an EBP: (a) examining the independent variable (e.g., time delay procedure) or (b) examining the dependent variable (e.g., participants’ outcome measures). We chose to identify EBPs according to the outcome measures (i.e., safety skills). The purpose of this study was to conduct an analysis of published SCRD studies and unpublished graduate studies to determine whether there is an EBP for conducting SSI with individuals with ASD. To do this, we (a) used the WWC Standards; (b) conducted a comprehensive descriptive analysis of research studies on SSI for demographics, procedural variables, and outcomes; and (c) analyzed effect size using IRD.
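As an illustration of the "difference in improvement rates" interpretation described above, the following minimal Python sketch computes IRD from counts of improved data points. It assumes the improved points in each phase have already been identified; the overlap-based procedure that Parker et al. (2009) describe for identifying them is not reproduced here, and the function and parameter names are hypothetical.

```python
def improvement_rate(improved: int, total: int) -> float:
    """Proportion of data points in one phase that count as improved."""
    if total <= 0:
        raise ValueError("a phase needs at least one data point")
    if not 0 <= improved <= total:
        raise ValueError("improved must be between 0 and total")
    return improved / total


def ird(improved_baseline: int, n_baseline: int,
        improved_treatment: int, n_treatment: int) -> float:
    """Improvement rate difference: treatment rate minus baseline rate."""
    return (improvement_rate(improved_treatment, n_treatment)
            - improvement_rate(improved_baseline, n_baseline))


# Made-up illustration: 0 of 5 baseline points and 8 of 10 treatment
# points are counted as improved.
example_ird = ird(0, 5, 8, 10)
```

With these made-up counts, the treatment improvement rate is 0.8 and the baseline rate is 0, so the difference is 0.8 (i.e., 80%).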
Method
Search Procedures
We conducted a systematic review to locate studies investigating SSI for individuals with ASD from January 1980 to January 2018 via Academic Search Complete, ArticleFirst, EBSCOhost, JSTOR, ProQuest, PsycINFO, ScienceDirect, Theses Global, Worldcat.org, and Web of Science using the keywords safety skills, fire safety, abduction, gun safety, water safety, accident, home safety, pedestrian skills, social safety, first-aid, telephone skills, lures of strangers, community safety skills, sexual abuse prevention, sexual abuse, autism, autism spectrum disorder, pervasive developmental disorders (final search on January 16, 2018). Then, we reviewed reference lists of identified articles as well as a meta-analysis conducted by Wiseman et al. (2017) to identify additional studies. Finally, we completed an ancestral search of the reference lists of the additional identified studies. Although we located unpublished graduate studies through searches of ProQuest and Theses Global, we are aware that we may not have found all unpublished graduate studies.
Inclusion and Exclusion Criteria
We included studies that met the following criteria: (a) published in English in an internationally disseminated peer-reviewed journal; (b) unpublished graduate studies in English; (c) participant(s) diagnosed with autism, ASD, pervasive developmental disorders (PDD), pervasive developmental disorders not otherwise specified (PDD-NOS), Asperger syndrome (AS), or autistic disorder (AD); (d) use of SCRD; and (e) focus on teaching safety skills. Of the 52 studies we located, we excluded 16 for the following reasons: (a) focus on descriptive and group experimental research methods (n = 11), (b) case study (n = 2), or (c) review of literature (n = 3). This left 36 studies. We excluded two additional studies that were available as both dissertations and published journal articles, keeping the published versions. Finally, we excluded three additional studies due to (a) being a multitreatment comparison SCRD (n = 1), (b) not directly focusing on instruction of safety (n = 1), or (c) lacking information about data collection and analysis (n = 1). Hence, we excluded a total of 21 studies. Two researchers coded all studies according to inclusion criteria to determine studies retained for further analysis with 98.7% (range: 80%–100%) consistency. Disagreement between the coders occurred for two studies we excluded due to “not directly focusing on instruction of safety” based on the researchers’ opinions. Finally, we retained 29 studies for further analyses. Supplemental Figure 1 displays processes and number of studies identified for inclusion and analysis.
Procedures for Evaluating Quality Indicators of Studies
We used the WWC Standards to evaluate design quality of the 29 identified studies. We determined the presence and absence of each indicator within eight categories: (a) systematic manipulation of independent variable, (b) collection of interobserver agreement (IOA) data for at least 20% of all sessions, (c) IOA of at least 80%, (d) at least three demonstrations of effect, (e) at least five data points per condition (to meet standards), (f) at least three data points per condition (to meet standards with reservation), (g) classification of design standards, and (h) classification of evidence for effectiveness. Prior to evaluating articles, we discussed and listed decision rules for each indicator. Then, two experienced researchers (who had the same coding responsibility in another published systematic review research project) independently coded one study and reached 100% consensus through discussion of examples and nonexamples of each indicator. Finally, these two researchers coded quality indicators (QIs) of each article (see Supplemental Table 1). They examined each tier in a study to determine an indicator’s presence, coding “yes” if all tiers met the indicator in the study. If a study failed to meet an indicator in a single tier, they coded that indicator as “no” for the study. Classification of standards (item “g” in the preceding narrative and the second to last column in Supplemental Table 1) was coded according to the following definitions. Studies that met criteria “a” to “f” were coded as meet standards (MS); studies that met criteria “a” to “f” except “e” were coded as meet standards with reservation (MS-R); and studies that met criterion “e” but did not meet at least one of the other criteria between “a” and “f” were coded as does not meet standards (nMS) in Supplemental Table 1. Classification of evidence of effectiveness was coded according to visual analysis of studies categorized as MS and MS-R.
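The coding rules in this section can be sketched in a few lines of Python. This is a simplified reading, not the coding protocol itself: indicator keys "a" through "f" follow the list above, a study-level indicator is "yes" only when every tier meets it, and any study that is neither MS nor MS-R is treated as nMS (an assumption made for the sketch). The function names are hypothetical.

```python
from typing import Iterable, Mapping


def study_level_indicator(tier_results: Iterable[bool]) -> bool:
    """An indicator is coded 'yes' for a study only if every tier meets it."""
    return all(tier_results)


def classify_design(indicators: Mapping[str, bool]) -> str:
    """Classify a study from study-level indicators keyed 'a' through 'f'."""
    missing = [key for key in "abcdef" if key not in indicators]
    if missing:
        raise KeyError(f"missing indicators: {missing}")
    if all(indicators[key] for key in "abcdef"):
        return "MS"      # all indicators met, including five points per condition
    if all(indicators[key] for key in "abcdf") and not indicators["e"]:
        return "MS-R"    # only the five-data-point indicator (e) is unmet
    return "nMS"         # assumption: every other pattern does not meet standards


# Made-up illustration: three tiers, one of which has fewer than five data points.
indicator_e = study_level_indicator([True, True, False])
example = classify_design(
    {"a": True, "b": True, "c": True, "d": True, "e": indicator_e, "f": True}
)
```

In this made-up case, the single short tier sets indicator "e" to "no" for the whole study, so the study is classified as MS-R rather than MS.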
We retained studies which met the QIs in all tiers for visual analysis and descriptive analysis. To assess effects, we considered six outcome measures (i.e., level, trend, variability, immediacy of effects, overlap, consistency of data patterns across similar phases) within and between conditions (Kratochwill et al., 2013). If a study provided demonstration of an effect in all outcome measures, we categorized it as strong evidence. If a study provided three demonstrations of an effect and also included at least one demonstration of noneffect, we categorized it as moderate evidence. If a study did not provide at least three, temporally distinct, demonstrations of an effect, we categorized it as no evidence. Demonstration of an effect through visual analysis in all outcomes for the studies is presented in Supplemental Table 2. We retained studies we categorized as having either strong or moderate evidence for calculation of effect size estimations.
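The three evidence categories above can be expressed as a small decision rule. The sketch below is one reading of them, assuming each study has already been summarized as a count of demonstrations of effect and a count of demonstrations of noneffect from visual analysis. Treating "strong" as at least three demonstrations with no noneffects is an assumption of this sketch, and the function name is hypothetical.

```python
def classify_evidence(effects: int, noneffects: int) -> str:
    """Map counts of demonstrated effects and noneffects to an evidence category."""
    if effects < 0 or noneffects < 0:
        raise ValueError("counts cannot be negative")
    if effects < 3:
        return "no evidence"        # fewer than three demonstrations of an effect
    if noneffects >= 1:
        return "moderate evidence"  # three or more effects plus at least one noneffect
    return "strong evidence"        # three or more effects and no noneffects


# Made-up illustrations, not data from the reviewed studies.
labels = [classify_evidence(3, 0), classify_evidence(3, 1), classify_evidence(2, 0)]
```

The three made-up cases map to strong, moderate, and no evidence, respectively.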
Procedures for Conducting Descriptive Analysis of SSI Studies
Two researchers coded the following data for the descriptive analysis for each study that met the QIs recommended by Kratochwill et al. (2013) and coded as MS or MS-R: (a) characteristics of participants (i.e., number, age, gender, disability); (b) target safety skills (i.e., targeted safety skills, assessment of safety skills, criteria); (c) settings and instructional arrangement; (d) research design and reliability; (e) intervention description (e.g., prompt, reinforcers, implementer); (f) social validity, maintenance, and generalization; and (g) overall outcome. Table 1 displays the compiled data.
Table 1. Coding Demographical Parameters for the Selected Articles.
Note. DV = dependent variable; IV = independent variable; M = male; F = female; ASD = autism spectrum disorder; MR = mental retardation; Comm. sett. = community settings; MP/p = multiple probe design across participants; MP/B = multiple probe design across behavior; NC-MB/p = nonconcurrent multiple baseline design across participants; BST = behavioral skills training; AS = Asperger syndrome; PDD = pervasive developmental disorder; VR = virtual reality; SLI = speech and language impairment; ADHD = attention-deficit/hyperactivity disorder; MB/p = multiple baseline design across participants; NCMs = naive community members; MB/B = multiple baseline design across behavior; MD = multiple disabilities; JS = Jacobsen syndrome; TD = typical development; PDD-NOS = pervasive developmental disorder-not otherwise specified; CCD = changing-criterion design; BTP = behavioral treatment package; GS = graduate students; MMR = moderate mental retardation; MP/PRAT = multiple probe replicated across task; CTD = constant time delay.
Intervention Effect Calculations
We used the web-based IRD calculator at http://www.singlecaseresearch.org (Vannest et al., 2011) to calculate IRD effect size for baseline-intervention comparison. Based on guidelines, we considered IRD scores at or below 50% as “small effect,” between 50% and 70% as “moderate effect,” and at or above 70% as “large effect” (Parker et al., 2009). We examined each single-case tier within a study to calculate the IRD score through a data extraction process using the software program BIOSOFT UnGraph5. Two researchers familiar with using UnGraph5 taught the rest of the researchers to operate the software with 100% accuracy by working on articles outside this study. Then, two researchers digitized data in each tier using UnGraph5 across all studies and exported extracted data into a Microsoft Excel file for calculating IRD scores (Vannest et al., 2011).
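The interpretation guidelines above translate directly into a threshold rule. The following sketch assumes IRD is expressed as a percentage; the function name is hypothetical and is not part of the web-based calculator.

```python
def ird_effect_label(ird_percent: float) -> str:
    """Label an IRD score (in percent) using the cut-offs stated in the text."""
    if ird_percent <= 50:
        return "small effect"      # at or below 50%
    if ird_percent < 70:
        return "moderate effect"   # between 50% and 70%
    return "large effect"          # at or above 70%


# Made-up scores chosen to sit on and between the boundaries.
labels = [ird_effect_label(score) for score in (50, 60, 70)]
```

Note that both boundary values are assigned explicitly: 50% falls in the small category and 70% in the large category, as stated in the guidelines.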
Determination of an Evidence-Base for SSI
We evaluated the studies rated as MS and MS-R together against the criteria for EBPs recommended by Kratochwill et al. (2013). This included three criteria (5-3-20 rule): (a) minimum of five studies categorized as MS and MS-R, (b) practice conducted by at least three groups of researchers with no overlapping authorship from three geographic regions, and (c) total number of participants included in combined studies equaling at least 20.
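The 5-3-20 rule can be checked mechanically once each qualifying study is summarized. The sketch below is a simplification: it assumes each study has already been assigned a research-team label such that different labels imply no overlapping authorship, and a geographic-region label. The `Study` class, its fields, and the example records are hypothetical and are not data from the reviewed studies.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Study:
    team: str          # hypothetical label; distinct labels mean no shared authors
    region: str        # hypothetical geographic-region label
    participants: int  # number of participants in the study


def meets_5_3_20(studies: Iterable[Study]) -> bool:
    """Check the three criteria for studies already rated MS or MS-R."""
    studies = list(studies)
    enough_studies = len(studies) >= 5
    enough_teams = len({s.team for s in studies}) >= 3
    enough_regions = len({s.region for s in studies}) >= 3
    enough_participants = sum(s.participants for s in studies) >= 20
    return (enough_studies and enough_teams
            and enough_regions and enough_participants)


# Made-up illustration: five studies, five teams, three regions, 22 participants.
example = [
    Study("team_a", "region_1", 4),
    Study("team_b", "region_2", 5),
    Study("team_c", "region_3", 3),
    Study("team_d", "region_1", 6),
    Study("team_e", "region_2", 4),
]
result = meets_5_3_20(example)
```

Dropping any one study from this made-up set leaves only four studies, so the first criterion would no longer be met.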
Reliability
First, two researchers obtained 100% agreement (98.7% before obtaining consensus between two researchers) regarding inclusion and exclusion of studies in this review. Subsequently, we conducted five reliability analyses that included (a) QIs, (b) visual analysis, (c) UnGraph5 digitized data, (d) descriptive analysis, and (e) IRD effect size calculation. The two researchers collected reliability data independently. We used a point by point method to determine the percentage of interrater reliability by dividing number of agreements by total number of agreements plus disagreements and multiplying by 100. In cases of disagreement during QI analysis and descriptive analysis, the two researchers reexamined the coded articles and achieved consensus on each parameter of the QIs and descriptive analysis. For the QIs, two researchers independently coded 100% (n = 29) of the studies and obtained a mean of 97.8% (range: 75%–100%) agreement. Then, the two researchers independently conducted visual analyses of 100% (n = 18) of the studies and obtained a mean of 98.1% (range: 83.3%–100%) agreement. The two researchers also independently digitized 100% (n = 12) of the studies. Reliability analysis for digitizing the data (n = 12) resulted in a mean of 99.9% (range: 93–100%) agreement. Given human error involved in using UnGraph5 (i.e., if mouse cursor was slightly off mid-point of data point, rounding error could change value of data point), we operationalized agreement as the value of two data points being identical or one unit apart (i.e., below or above). For example, if one researcher coded a data point as 25, the other researcher could code the same data point as 24 or 26 and still be counted as correct in the reliability analysis. We reviewed 18 articles for descriptive analysis. Two researchers independently coded 100% of these studies, obtaining a mean of 99.1% (range: 92%–100%) agreement. 
Finally, two researchers independently calculated IRD effect sizes for 100% of the studies (n = 12) retained for effect size calculation, resulting in 100% agreement.
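The point-by-point agreement calculation described in this section, including the one-unit tolerance used for digitized data, can be sketched as follows. The function names are hypothetical, and the example values are made up apart from the 25 versus 24 case, which mirrors the example given in the text.

```python
from typing import Sequence


def points_agree(a: float, b: float, tolerance: float = 1.0) -> bool:
    """Two digitized values agree if they are identical or within the tolerance."""
    return abs(a - b) <= tolerance


def percent_agreement(coder_a: Sequence[float], coder_b: Sequence[float],
                      tolerance: float = 1.0) -> float:
    """Agreements divided by agreements plus disagreements, multiplied by 100."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must supply the same nonzero number of points")
    agreements = sum(points_agree(a, b, tolerance)
                     for a, b in zip(coder_a, coder_b))
    return agreements / len(coder_a) * 100


# 25 vs. 24 counts as agreement; 60 vs. 63 is more than one unit apart.
agreement = percent_agreement([25, 40, 60, 80], [24, 40, 63, 80])
```

Here three of the four points agree, giving 75% agreement. Setting `tolerance=0` would require identical values, which may be more appropriate for categorical coding such as the QIs.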
Results
Quality Indicators of Single-Case Research Studies
As reported in the methods and shown in Supplemental Figure 1, we found 29 studies that met the criteria to be included in our analysis. Data on the quality of the SCRD studies and classification of evidence of effectiveness through visual analysis can be found in Supplemental Tables 1 and 2, respectively. (Due to page constraints, we give one example of the result of each variable in our analysis here, and others can be found in the Tables.) Of 29 studies, the researchers rated two (6.9%) as meet standards (e.g., Rossi et al., 2017) and 16 (55.2%) as meet standards with reservations (e.g., Ergenekon, 2012). We rated 11 (37.9%) as does not meet standards (e.g., Hawkins, 2016). The most common reason we did not rate studies as meet standards or meet standards with reservations was failure to obtain sufficient data points (fewer than three) in each condition (n = 7; 63.6%; e.g., Rodriguez, 2016). Other reasons were failure to collect IOA data for 20% of each condition (n = 3; 27.3%; e.g., Morgan, 2017), having an unacceptable level of IOA data (below 80%; n = 3; 27.3%; e.g., Hawkins, 2016), and failure to show at least three demonstrations of effect (n = 3; 27.3%; e.g., Sokolosky, 2011). Also, two studies (18.2%; e.g., Morgan, 2017) failed to meet three or more criteria in the standards.
Visual analysis findings across 18 studies rated as meet standards and meet standards with reservations showed half of the studies (n = 9; 50%; e.g., Summers et al., 2011) as having strong evidence under classification of evidence of effectiveness. We categorized three studies (16.7%) as having moderate evidence (e.g., Winterling et al., 1992) and six studies (33.3%) as having no evidence (e.g., King & Miltenberger, 2017). We did not include six studies categorized as “meet standards with reservations” either in the descriptive analysis or in effect size analysis due to several failures in visual analysis such as having no data pattern across behaviors or participants (e.g., Akmanoglu & Tekin-Iftar, 2011) or having no immediate effect and large overlap between baseline and intervention conditions (e.g., Johnston, 2010). Visual analysis can be found in Supplemental Table 2.
Descriptive Analysis of SSI Articles
We included 18 studies that met the QIs recommended by Kratochwill et al. (2013) in the descriptive analysis. Demographic, procedural, and outcome characteristics of the studies are in Table 1. Four studies (22.2%; e.g., Tucker, 2016) were unpublished graduate studies, and the remaining studies (n = 14; 77.8%) were published in internationally disseminated peer-reviewed journals.
Participants
The reviewed studies included a total of 62 participants: 36 male (58%; n = 13; e.g., Honsberger, 2015) and 10 female (16.1%; n = 6; e.g., Goldsmith, 2008). Gender was not identified for 16 participants (25.8%) in five studies (e.g., Rossi et al., 2017). Thirty-one (50%) participants were school age (7–15 years; n = 12; e.g., Ergenekon, 2012), 18 participants (27.4%) were preschool age (2–6 years; n = 7; e.g., Garcia et al., 2016), and 13 participants (20.9%) were adolescents or young adults (n = 5; e.g., Winterling et al., 1992). Investigators predominantly examined the effects of SSI with individuals with an autism diagnosis (n = 40) in 14 studies (e.g., Summers et al., 2011) and individuals having comorbidity (i.e., at least one additional label; n = 15) in seven studies (e.g., Kearney et al., 2018). Moreover, studies included individuals with Asperger syndrome (n = 3) in two studies (e.g., Goldsmith, 2008); pervasive developmental disabilities (n = 2) in two studies (e.g., Goldsmith, 2008); pervasive developmental disabilities not otherwise specified (n = 1) in one study (Levy et al., 2017); and typical development (n = 1) in one study (Levy et al., 2017).
Skills taught
Investigators used SSI to teach pedestrian skills (n = 3; e.g., Honsberger, 2015), abduction skills (n = 3; e.g., Gunby & Rapp, 2014), domestic safety skills and/or home accident prevention skills (n = 2; e.g., Summers et al., 2011), water safety skills (n = 2; e.g., Levy et al., 2017), seeking help when lost (n = 2; e.g., Hoch et al., 2009), first-aid skills (n = 2; e.g., Kearney et al., 2018), safety response in the presence and absence of dangerous fire starting stimuli and poisonous liquid stimuli (n = 1; Rossi et al., 2017), poison prevention skills (n = 1; King & Miltenberger, 2017), fire safety skills (n = 1; Garcia et al., 2016), and sexual abuse protection skill (n = 1; Johnston, 2010).
Assessment of skills taught and criteria for acquisition
Investigators used three strategies for assessing skills in their studies: (a) task analysis (n = 10; e.g., Goldsmith, 2008), (b) scoring system (n = 7; e.g., Gunby & Rapp, 2014), and (c) task analysis with scoring system (n = 1; Harriage et al., 2016). Investigators defined criteria in two ways: (a) performing steps of task analysis correctly (n = 11; e.g., Goldsmith, 2008) and (b) meeting certain score in scoring systems (n = 7; e.g., Ledbetter-Cho et al., 2016). Except for a study by Kearney et al. (2018) that set criteria for acquisition as 86% correct, the investigators (n = 9) set acquisition criteria as 100% correct on task analysis steps (e.g., Goldsmith, 2008). Investigators also set acquisition criteria as the full score of scoring system (n = 6; e.g., Johnston, 2010). Harriage et al. (2016) used correct responding for each step of the task analysis with a total score of 25 (5 points given to every correct step).
Settings and teaching format
Investigators conducted SSI in various settings. SSI predominantly occurred in combined settings, such as a university unit and community setting, or a classroom, home, and treatment center (n = 7; e.g., Garcia et al., 2016). The investigators also provided SSI at home (n = 3; e.g., Ergenekon, 2012); in community settings, such as streets and/or home (n = 3; e.g., Harriage et al., 2016); in school units, such as parking lot, cafeteria, or classroom (n = 3; e.g., Honsberger, 2015); and at a swimming pool (n = 2; e.g., Levy et al., 2017). All investigators used a one-on-one instructional arrangement.
Research design and reliability
Investigators predominantly used multiple baseline design across participants (n = 11; e.g., Harriage et al., 2016); of these, four used nonconcurrent multiple baseline design across participants (e.g., Garcia et al., 2016). They also used multiple probe design across participants (n = 2; e.g., Honsberger, 2015), multiple baseline design across behaviors (n = 2; e.g., Tucker, 2016), multiple probe design across participants replicated across behaviors (n = 1; Winterling et al., 1992), multiple baseline design across participants and changing criterion design (n = 1; Levy et al., 2017), and multiple probe design across behaviors (n = 1; Ergenekon, 2012). They conducted dependent variable reliability analysis in all studies (n = 18) and independent variable reliability analysis in the majority of studies (n = 11; e.g., Tucker, 2016). They reported dependent and independent variable reliability as over 90% agreement.
Intervention description
SSI consisted of various interventions, including BST (n = 8; e.g., Gunby & Rapp, 2014); most-to-least and least-to-most prompting procedures (n = 2; e.g., Harriage et al., 2016); video modeling (n = 1; Honsberger, 2015); literacy-based behavioral intervention (n = 1; Kearney et al., 2018); behavioral treatment package consisting of shaping, prompting, and reinforcement (n = 1; Levy et al., 2017); instructional package consisting of video modeling, graduated guidance, and community-based instruction (n = 1; Akmanoglu & Tekin-Iftar, 2011); story reading and video modeling (n = 1; Ergenekon, 2012); sexual abuse prevention program (board game plus informational story book; n = 1; Johnston, 2010); video modeling and in situ training (n =1; King & Miltenberger, 2017); and verbal instruction and least-to-most prompting (n = 1; Taylor et al., 2004).
Interventionist
Investigators delivered intervention (n = 10; e.g., Goldsmith, 2008) in the majority of the studies. Others identified included teachers (n = 2; e.g., Levy et al., 2017), therapists (n = 2; e.g., Ledbetter-Cho et al., 2016), peers (n = 1; Kearney et al., 2018), parent and investigator (n = 1; Johnston, 2010), teacher and graduate student (n = 1; Taylor et al., 2004), and parent (n = 1; Harriage et al., 2016).
Social validity
Investigators reported SSI to be socially valid in studies in which they collected these data. While 10 investigations (e.g., Johnston, 2010) included social validity data, eight did not (e.g., Garcia et al., 2016). Investigators in 10 studies analyzed social validity data collected from participants’ parents (n = 6; e.g., Gunby & Rapp, 2014), special education teachers and graduate students (n = 1; Rossi et al., 2017), participants themselves and other professionals (n = 1; Kearney et al., 2018), peers (n = 1; Ergenekon, 2012), and community members (n = 1; Hoch et al., 2009).
Maintenance
Analyses showed SSI as effective in promoting maintenance of acquired safety skills. Investigators addressed maintenance (n = 14; e.g., Harriage et al., 2016) in the majority of the studies by collecting data between 1 and 6 weeks (n = 13; e.g., Kearney et al., 2018) or 6 and 24 months (n = 1; Levy et al., 2017) after training. Four studies did not address maintenance (e.g., Summers et al., 2011).
Generalization
Most investigations addressed generalization of SSI (n = 12; e.g., Goldsmith, 2008), including across settings and persons (n = 2; e.g., Hoch et al., 2009), settings (n = 4; e.g., Honsberger, 2015), settings and materials (n = 2; e.g., Harriage et al., 2016), settings and stimuli (n = 1; Rossi et al., 2017), and multiple exemplars (n = 3; e.g., Ergenekon, 2012). One third (n = 6) did not address generalization (e.g., Tucker, 2016).
Overall outcomes
Investigators reported that SSI provided positive outcomes in the majority of studies across all participants (n = 14; e.g., Hoch et al., 2009) and was effective for four participants with modifications (n = 3; e.g., Winterling et al., 1992). In one study (Ledbetter-Cho et al., 2016), the intervention was not effective for one participant. SSI was effective for 57 out of 62 participants across studies.
Determination of an EBP
In citing Miltenberger (2003), Goldsmith (2008) stated that a formalized BST package is a four part teaching strategy that involves (1) clear explicit instructions for appropriate behavior, (2) modeling or demonstration of appropriate behavior, (3) rehearsal or practice of the appropriate behavior, and (4) feedback on the performance that occurred during rehearsal. (p. 9)
Based on this review, a BST package can be considered evidence based for teaching safety skills to individuals with ASD. First, the criterion requiring a minimum of five studies categorized as MS and MS-R was met in that seven studies had acceptable methodological rigor to support a BST package (i.e., Goldsmith, 2008; Gunby & Rapp, 2014; Ledbetter-Cho et al., 2016; Rossi et al., 2017; Summers et al., 2011; Tucker, 2016; Winterling et al., 1992). Second, the criterion requiring that studies be conducted by at least three researcher groups with no overlapping authorship from three different geographic regions was met in these seven studies as they were conducted by seven different research groups from different regions in the United States. Third, the criterion requiring that results be demonstrated across a minimum of 20 participants was met in that results in the above cited seven studies were demonstrated across 28 participants. Of these, 17 had a diagnosis of ASD only (e.g., Gunby & Rapp, 2014), eight had comorbidity including autism (e.g., Tucker, 2016), and three participants (e.g., Goldsmith, 2008) had a single label, such as AS (n = 2) or PDD (n = 1).
Effects of BST Package on SSI
We determined the effects of a BST package by using IRD calculations for further analysis in this systematic review. We applied IRD to the 12 studies with the classifications of “meet standards” or “meet standards with reservations” and classification of evidence of effectiveness criteria recommended by Kratochwill et al. (2013). Supplemental Table 3 displays the IRD scores calculated across the 12 studies using baseline-intervention comparisons, as well as the number of tiers analyzed for these comparisons. IRD results from baseline-intervention comparison suggest that a BST package as well as other interventions and/or instructional packages reviewed in this study have a “large effect” for SSI with individuals with ASD.
Discussion
The purpose of this study was to conduct a comprehensive descriptive review and analysis of published SCRD studies and graduate studies to determine EBPs for SSI for individuals with ASD. Eighteen studies, almost two thirds of located studies (n = 29), met a sufficient number of criteria identified by Kratochwill et al. (2013) to “meet design standards” or “meet design standards with reservations.” Of these 18, two thirds (n = 12) were rated as having strong or moderate evidence under classifications of evidence of effectiveness. IRD findings showed that the interventions in these studies seemed to have a “large effect” in teaching safety skills to individuals with ASD. In more than half of the studies (7 out of 12), investigators used a BST package to teach safety skills, providing sufficient data to support this intervention being an EBP. These studies show that a BST package appears to be an EBP to teach various safety skills (e.g., street crossing, abduction prevention, water safety, household safety) to individuals with ASD of varying ages (preschool to young adulthood). We addressed demographic, methodological, and outcome characteristics of the studies through the descriptive analysis of 18 studies that met a sufficient number of the criteria identified by Kratochwill et al. (2013). Investigators identified the majority of the participants as having ASD, followed by comorbidity with ASD. The participants learned to perform various safety skills (e.g., abduction prevention, first-aid, household safety), investigators predominantly used task analysis as the assessment method, and investigators reported acquisition criteria in all studies. Across studies, investigators implemented intervention in a one-on-one instructional arrangement in either a single setting or combined settings and reported reliability analysis.
They addressed social validity, generalization, and maintenance effects of SSI in the majority of the reviewed studies and obtained promising findings in these parameters. Investigators reported positive effects of SSI in all studies, although a few of the participants needed modifications.
Among the studies analyzed, two thirds met the QIs raised by Kratochwill et al. (2013); however, these studies failed to meet the criterion of five data points per condition. It could be that investigators did not want to collect data for a longer period of time because of the extended exposure to possible dangers in SSI. In 2005, Horner et al. suggested that studies have five baseline points per condition. Our review included only two studies published before 2005, but others may have been completed before that date and published later. Another reason for the failure to meet five data points could be that the time available to collect them can be limited by the constraints of a school schedule or by limited access to a home setting. Other reasons may be the need to collect IOA data in 20% of each condition, have acceptable IOA data, and show at least three demonstrations of effect.
Analysis of the studies showed a BST package to be an EBP for teaching various safety skills to individuals with ASD. We recognize that a BST package can be composed of specific components that may vary; however, all of the BST packages consisted of a variation of the four basic components for a BST package identified by Miltenberger (2003; i.e., explanation, demonstration, practice, feedback). For example, some used video modeling (Ledbetter-Cho et al., 2016) while others used live modeling (Tucker, 2016) when demonstrating appropriate behavior. It is noteworthy that although the BST package has evidence for teaching the safety skills in the reviewed studies, it may not work with all types of safety skills.
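The 5-3-20 rule named in the abstract can be expressed as a simple threshold check. The sketch below assumes the common reading of the rule: at least five qualifying studies, conducted by at least three independent research teams, with at least 20 cases combined. The class, function, and study records are hypothetical placeholders and do not reproduce data from the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    label: str   # hypothetical identifier for a qualifying study
    team: str    # research team (assumed independent of other teams)
    cases: int   # number of cases (e.g., participants or tiers)


def meets_5_3_20(studies, min_studies=5, min_teams=3, min_cases=20):
    """Return True when all three thresholds of the rule are met."""
    n_studies = len(studies)
    n_teams = len({s.team for s in studies})
    n_cases = sum(s.cases for s in studies)
    return (
        n_studies >= min_studies
        and n_teams >= min_teams
        and n_cases >= min_cases
    )


# Hypothetical records, for illustration only.
example_studies = [
    Study("Study A", "Team 1", 3),
    Study("Study B", "Team 1", 3),
    Study("Study C", "Team 2", 3),
    Study("Study D", "Team 2", 3),
    Study("Study E", "Team 3", 3),
    Study("Study F", "Team 3", 3),
    Study("Study G", "Team 4", 3),
]
print(meets_5_3_20(example_studies))  # True
```

Keeping the three thresholds as parameters makes it easy to see which one fails when a set of studies falls short.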
Although QI analysis did not require us to examine whether independent variable reliability was collected in the studies, the collection of independent variable reliability data was one of the parameters we reviewed during the descriptive analysis. The inclusion of both dependent and independent variable data across investigations serves to strengthen believability in the effectiveness of SSI in general and specifically in the BST packages used in the studies. The investigators obtained high fidelity in all of the studies.
All of the investigators set stringent criteria for the acquisition of safety skills. Unlike other skill areas, setting a criterion for safety skills is crucial due to their nature. Because safety skills can be life-saving, most should be acquired with 100% accuracy. Acquiring these skills to a less stringent criterion may not be sufficient for protection from danger. Findings showed that, in all except one study, criterion was 100% accuracy. Many investigators also addressed generalization by assessing the effects of SSI beyond original instructional settings to determine if the participants could transfer acquired skills to novel settings. Given the nature of safety skills, this is important.
Several additional parameters of the studies should be considered. No study required a minimum of five data points to establish stability of data during the baseline condition. This was most evident on the first tier of implementation, as subsequent tiers most often consisted of five or more baseline data points over time. In spite of the minimal number of baseline data points used to establish stability in the initial tier across studies, the investigators in all studies implemented intervention after stability was established without a therapeutic trend across at least three data points. Another important point is that all investigators presented SSI in a one-on-one instructional arrangement. Group instruction should be an option in case there is a shortage of teachers or if more individuals with ASD need to learn safety skills. The literature shows that individuals with ASD taught in group arrangements can learn various skills from others through observational learning (Tekin-Iftar & Birkan, 2010). Therefore, group arrangements can increase the efficiency of instruction.
In conclusion, it is clear that ongoing research on SSI is merited, but we suggest instructors should feel confident in using a BST package consisting of various components to teach various safety skills to individuals with ASD because this can be a powerful as well as flexible instructional strategy. On the other hand, it is also important to recognize that other intervention procedures have some evidence for teaching safety skills (e.g., prompting procedures, video modeling, literacy-based interventions) but not a sufficient level to meet current guidelines.
Limitations and Future Directions
Several factors limit the interpretation of the findings in the present study. It is possible that we missed some published studies and graduate studies on SSI for individuals with ASD, and our findings are limited to studies designed with SCRD. Another limitation is that we could not analyze the differential effects of the components of a BST package to determine which component was most crucial and responsible for the results in the studies. We also found it difficult to discern between various labels given to participants, especially in the case of comorbid labels (e.g., ASD and intellectual disability, attention-deficit/hyperactivity disorder). Another limitation is that we included only SCRD studies and excluded other designs (e.g., group designs) that would possibly have contributed to our analysis when identifying EBPs. It is important to remember that the ASD population is highly heterogeneous, making random selection and forming equal groups a problem when designing group studies. In addition, group studies consider average scores of groups, making it impossible to detect individual performance of the participants; thus, what appears to be effective may be inaccurate and fail to reflect individual differences. Last, we aggregated the studies that met standards and met standards with reservations in our analysis by following the guidelines of Kratochwill et al. (2013), and including the studies that met standards with reservations could lessen the overall impact of BST, as well as other interventions.
We recommend designing studies to further investigate the component analysis of BST packages. Although this analysis builds a strong case for the effectiveness of the SSI, continued research on its parameters and employment by new groups of researchers in other geographic areas will serve to strengthen the argument for a BST package with specific components as an EBP. Moreover, this study also revealed additional instructional procedures in teaching safety skills to individuals with ASD that may be effective. Therefore, we recommend further research investigating these procedures to build even stronger conclusions regarding the evidence of specific strategies for teaching safety skills.
Supplemental Material
Supplemental material, JSED_19_10_197_R1_Supplemental_Materials for Systematic Review of Safety Skill Interventions for Individuals With Autism Spectrum Disorder by Elif Tekin-Iftar, Seray Olcay, Nursinem Sirin, Hatice Bilmez, H. Deniz Degirmenci and Belva C. Collins in The Journal of Special Education
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by a grant from Anadolu University Research Fund (Project No: 1608E588).
Supplemental Material
Supplemental material for this article is available online.