Abstract
Math outcomes for students with disabilities, in particular students with emotional disturbance (ED), are bleak, warranting intervention strategies that have research to support their utility. The purpose of this meta-analysis was to examine the literature on math interventions used with students with ED to improve math outcomes. Our statistical analysis included 17 studies, categorized as addressing fractions, number sense, geometry and measurement, algebra, word problems, and “other.” Although only four of the included studies met all of the CEC-EBP (Council for Exceptional Children’s Standards for Evidence-Based Practices in Special Education) quality indicators, results of effect size calculations suggest large effects for all interventions. Results of publication bias analyses were mixed. Limitations, directions for future research in this field, and implications for practice are discussed.
Science, technology, engineering, and mathematics (STEM) instruction is vital to America’s continued growth and standing in the world economy (Gonzalez & Kuenzi, 2012). Current trends have suggested that more and more talent in the STEM areas is coming from outside of the United States, with numbers steadily increasing from 25.2% in 2003 to 28.8% in 2015 (National Science Board, 2008). In addition, many current workers in STEM fields are reaching retirement age, while the number of jobs in these fields is tripling in comparison to other fields. These combined factors will further the tendency to hire skilled workers from other countries unless a concerted effort is made to bolster the education of American citizens in STEM fields (National Mathematics Advisory Panel [NMAP], 2008). Failing to do so will weaken the United States’ economy and, by extension, its independence and security (NMAP, 2008).
Mathematics proficiency is a vital component of STEM education, as math fluency is often a necessary component of the other STEM areas (e.g., physics). In addition, mathematics literacy is seen as a necessary aspect of adult independence (NMAP, 2008). In 2007, a state-led initiative to develop the Common Core State Standards began with a goal of facilitating real-world learning to support students in successful graduation, college preparation, career readiness, and everyday life (Common Core State Standards Initiative, 2010).
Despite this fact, many students are not meeting benchmarks of adequate math proficiency. According to the National Assessment of Educational Progress (NAEP; 2015), scores in mathematics of both fourth- and eighth-grade students have remained relatively unchanged, and below proficient, since the authoring of the NMAP (2008). This is a sobering trend considering that many high-paying jobs, as well as many skilled labor jobs, require a foundation of math skills. Likewise, more than half of Americans have problems making relatively simple calculations, including interest paid on loans and tips at restaurants (Phillips, 2007). The ability to calculate fractions is another skill where Americans consistently fall short, according to the NAEP (2015).
Considering that many Americans continue to struggle with early math skills, it follows that later algebraic skills also tend to show deficiency. Indeed, mathematics scores have stayed consistently below proficient for 12th-grade students since 2005, with a significant decrease from 2013 to 2015 (NAEP, 2015). These trends do not bode well for the postsecondary plans of students graduating from American secondary institutions. To shore up a student’s ability to perform well in algebra, the NMAP (2008) discussed the “critical foundations” (p. 20) of mathematics: number sense, fractions, and geometry and measurement. Number sense refers to a wide range of abilities, from the capacity to immediately identify quantities to an understanding of the distributive property. Fractions refer to the segmentation of whole numbers represented by traditional fractions, decimals, and percentages, as well as the ability to apply basic arithmetic models to them. Geometry and measurement involves the ability to calculate the perimeter and area of two- and three-dimensional shapes, as well as the slope of lines and the relationships among shapes.
Students With Disabilities and Mathematics
Since 2005, the NAEP has reported that, on average, American students have been below mathematics proficiency in Grades 4, 8, and 12. Unfortunately, students with disabilities generally score well below the national mean, and below the basic level in mathematics (NAEP, 2015). Consistent with the poor progress made by typical students, students with disabilities have not significantly improved since 2011. In fact, since 2011, the performance of eighth-grade students with disabilities has significantly decreased.
The Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013) recognizes impairments in mathematics as a specific academic domain under the specific learning disabilities diagnosis. In particular, these students may have difficulties with number sense, memorization, accurate/fluent calculation, and/or accurate math reasoning. Other disorders commonly co-occur with specific learning disabilities, including attention deficit hyperactivity disorder, autism spectrum disorder, communication disorders, developmental delays, intellectual disabilities, emotional disturbance (ED) and other mental disorders (e.g., anxiety, depression; DSM-5, American Psychiatric Association, 2013). For example, students with ED routinely score in the 25th percentile on math assessments (e.g., Trout, Nordness, Pierce, & Epstein, 2003). Researchers have documented that impairments in math can be attributed to several core cognitive processes, including working memory, executive function, language, and processing speed (Fuchs et al., 2006). Clearly, there is a need to find evidence-based practices to support the math proficiency of students with disabilities as outcomes without intervention are bleak.
Students with ED may be particularly susceptible to difficulty with mathematics problems due to the high rate of co-morbid conditions they possess (Merikangas et al., 2010). For example, children with certain mental health issues may experience greater deficits with executive functioning, including working memory (Snyder, 2013), which may impede their ability to make sense of some abstract concepts including mathematics concepts (Bull, Espy, & Wiebe, 2008). Children with ED tend to do more poorly in mathematics than their typical peers and deficits tend to increase with age (Ennis, Evanovich, Losinski, Jolivette, & Kimball, 2017). In addition, Ennis and colleagues (2017) found that certain social skills deficits (e.g., problem behavior) predicted deficits in mathematics. It also appears that as students are served in more restrictive settings, the gap between them and their nondisabled peers grows at a higher rate. These trends suggest a need for investigations into evidence-based practices in mathematics to help these students be successful in their respective settings.
Previous Reviews
There have been five recent reviews examining mathematics interventions for students with ED. However, only two could be considered meta-analyses, in that they measured the treatment effects of the math interventions in the included studies (Ralston, Benner, Tsai, Riccomini, & Nelson, 2014; Templeton, Neel, & Blood, 2008). The meta-analysis conducted by Templeton and colleagues (2008) utilized the percent of nonoverlapping data (PND; Scruggs, Mastropieri, & Casto, 1987), a widely accepted method for indicating an intervention’s effectiveness. However, PND is not without faults, with critics highlighting its susceptibility to outliers and questioning the method as a true measure of treatment magnitude (Wolery, Busick, Reichow, & Barton, 2010). The synthesis by Ralston and colleagues (2014) attempted to address some of the concerns with the PND statistic by also calculating the improvement rate difference (IRD; Parker, Vannest, & Brown, 2009), which is the difference between the improvement rates of the intervention and baseline phases. However, neither review calculated a between-case standardized mean difference (BC-SMD) effect size. The BC-SMD allows researchers to compare single-case design (SCD) studies with group designs, as well as to combine multiple SCD studies into subgroups to compare the results (Shadish, Hedges, & Pustejovsky, 2014). In addition, only one review, Mulcahy, Krezmien, and Travers (2016), coded articles for meeting accepted standards of quality using a hybrid of guidelines including Horner et al. (2005); Kratochwill et al. (2010); O’Neill, McDonnell, Billingsley, and Jenson (2011); Tawney and Gast (1984); and Council for Exceptional Children’s (CEC; 2014) evidence-based practice guidelines.
With respect to the quality of the reviews, only one provided the exact keywords used during the electronic database search (Hodge, Riccomini, Buford, & Herbst, 2006), a necessary component for replication. The other four reviews included examples of the combination of search terms, but did not provide a replicable Boolean phrase (Mulcahy et al., 2016; Mulcahy, Maccini, Wright, & Miller, 2014; Ralston et al., 2014; Templeton et al., 2008). However, none of the search methods within any of the reviews can be considered replicable, as none included the date the search was conducted, thus inhibiting replicability and the ability of researchers to put the review in context. In essence, the lack of a systematic search methodology makes the previous reviews prone to bias (D. J. Cook, Mulrow, & Haynes, 1997). Furthermore, none of the previous reviews addressed publication bias through the application of statistical tests, despite the evidence that research in the social sciences is especially prone to publishing only positive results (B. G. Cook & Odom, 2013; Maag & Losinski, 2015). Finally, only one review attempted to address bias within individual studies through the application of a set of standards (Mulcahy et al., 2016).
In sum, there are currently no meta-analyses that address publication bias, examine the quality of the studies through the application of widely accepted standards, and calculate effect sizes of the included studies. Therefore, the purpose of this meta-analysis is to systematically review and meta-analyze the existing literature on instructional math interventions for elementary and secondary students with ED to improve mathematic outcomes in the classroom. The current meta-analysis addresses the following research questions:
Method
To answer the research questions, a comprehensive search of the existing literature on math interventions for students with ED was conducted. A database search of Academic Search Premier, Education Full Text, ERIC, and PsycINFO using the Boolean phrase (“math*” OR “algebra” OR “geometry” OR “fraction*” OR “number sense” OR “word problem*”) AND (“behavior* disorder” OR “behaviour* disorder” OR “emotional disturbance” OR “EBD”) was conducted on January 31, 2018, and included all previous dates. A second search, of ProQuest Dissertations and Theses Global, was conducted on March 16, 2018, using the following Boolean phrase: ab(“math*”) AND ab(“emotional disturbance” OR “EBD” OR “behavior* disorder”). In addition, hand searches of the programs from the Midwest Symposium for Leadership in Behavior Disorders (MSLBD) and the journals Journal of Emotional and Behavioral Disorders, Behavioral Disorders, Exceptional Children, and The Journal of Special Education were conducted from 2013 to 2018 to ensure recent studies were not missed. The MSLBD program was searched because it is a common outlet for research on children with behavioral challenges and is the only organization that currently makes its program accessible for past years. In addition, the journals were searched because they frequently publish research on interventions for students with disabilities, particularly students with ED. Finally, an ancestral search of the previous reviews on math with students with ED was conducted by screening the reference list of included articles for any additional studies that met the inclusion criteria of this analysis.
Inclusion Criteria
Two authors screened all titles and abstracts of articles obtained in the search using the following inclusion criteria. To be included in the synthesis, studies had to (a) present findings from an experimental quantitative (group or single-case design) study (SCD studies had to include a graph of results and at least 3 data points in each phase with three replications); (b) have an instructionally based (as opposed to behaviorally oriented) math intervention as an independent variable; (c) include a dependent variable that measured mathematics achievement; (d) involve an intervention that took place in a school setting (Grades K–12); (e) include participants with ED; and (f) provide adequate information to calculate an effect size.
Coding Procedures
Studies that met the inclusion criteria were coded by four researchers in two dyads. Specifically, the dyads worked in teams with both reading the article, one coding on an Excel spreadsheet as the other highlighted information in the text. Working in this way allowed the teams to resolve discrepancies within the dyad coding before meeting with the other team. Following dyad coding, members from each dyad met to compare results. Initial interrater agreement between the dyads was arrived at by comparing the coding sheets and dividing the number of agreements by the number of opportunities for agreement. Percent of agreement at this step was 91.94%. In the event of disagreements between dyads, a member from each dyad discussed the issue and reached a final agreement.
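The agreement calculation described above (agreements divided by opportunities for agreement) can be illustrated with a minimal sketch. The function name and the two coding sheets below are hypothetical and are not data from this study; the sketch also assumes simple item-by-item agreement with no chance correction.

```python
def percent_agreement(codes_a, codes_b):
    """Percent agreement between two coding sheets covering the same items."""
    if len(codes_a) != len(codes_b):
        raise ValueError("coding sheets must cover the same items")
    if not codes_a:
        raise ValueError("no coded items to compare")
    # An agreement is counted whenever both dyads entered the same code.
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * agreements / len(codes_a)


# Hypothetical coding sheets (illustration only, not study data).
dyad_1 = ["SCD", "fractions", "CCC", "teacher", "met"]
dyad_2 = ["SCD", "fractions", "CCC", "researcher", "met"]

agreement = percent_agreement(dyad_1, dyad_2)
print(f"{agreement:.2f}%")  # 4 of 5 items agree
```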
The following variables were coded for each study. First, participant characteristics were coded including the number of participants, grade, age, gender, and disability. Then, the type of design was coded as randomized control trial, quasi-experiment, or SCD. Next, the outcome variable was categorized as one of the following: (a) fractions, (b) geometry, (c) number sense, (d) word problems, and (e) other. Finally, the independent variable was coded into one of the following categories, based on previous reviews of the research and the NMAP (2008) report: (a) strategy instruction; (b) peer tutoring; (c) anchored instruction; (d) cover, copy, and compare (CCC); and (e) direct instruction. For example, studies coded as strategy instruction included interventions utilizing mnemonic devices, step-by-step strategy instruction, and schema-based instruction. CCC was the only intervention given its own construct because it did not fit neatly into any other category. Although some previous reviews included CCC in the strategy instruction category, it was determined there were enough differences to justify its own category.
To assess study quality, each study was rated using the Council for Exceptional Children’s Standards for Classifying the Evidence-Base of Practice in Special Education (CEC-EBP; CEC, 2014). Quality indicator (QI) 1.0 refers to the context and setting, requiring a study to clearly articulate at least one demographic variable (1.1). QI 2.0 refers to the description of the participants, again, requiring at least one demographic variable (2.1) and adequate information to determine the students’ disability status (i.e., ED; 2.2). QI 3.0 refers to the intervention agent, requiring at least one demographic variable (3.1) and enough detail to ascertain qualifications, or include detailed training procedures (3.2). QI 4.0 refers to the description of the practice, requiring studies to describe the procedures (4.1) and materials used (4.2) with adequate detail to allow for replication. QI 5.0 refers to implementation fidelity. To meet this indicator, studies must describe fidelity related to adherence (5.1) and dosage (5.2), and explicitly state that fidelity was assessed throughout the intervention (5.3). QI 6.0 refers to the internal validity of the study. QI 6.1 to 6.3 apply to all studies and require a demonstration of systematic manipulation of the independent variable (6.1), a detailed description of the control/baseline phases (6.2), and an explicit statement, or an easy inference, that baseline or control conditions did not have exposure to the intervention (6.3). Single-case research design studies must adhere to QI 6.5, requiring the use of a design that allows for three demonstrations of effect; 6.6, requiring at least 3 baseline data points with a stable trend; and 6.7, requiring the use of a single-case research design that controls for threats to internal validity.
Group design studies must adhere to QI 6.4, requiring a detailed description of assignment to groups using random assignment or a demonstration that groups were equivalent; 6.8, requiring studies to explicitly state the overall attrition and that it is at an acceptable level; and 6.9, requiring the same for differential attrition. QI 7.0 refers to the outcome/dependent variables and requires studies to have socially important outcomes (7.1), adequately describe the dependent variables (7.2), report results of all dependent variables measured (7.3), report measurement with appropriate frequency and timing (7.4), and measure interrater reliability and/or interobserver agreement (7.5). QI 7.6 only refers to group design studies and requires that studies report evidence of validity (construct, content, and social validity). QI 8.0 refers to data analysis and reporting. QI 8.1 and 8.3 refer to group design studies and require the use of appropriate analyses and reporting of effect sizes (or providing data to allow for the calculation of an effect size), respectively. QI 8.2 refers to single-case research designs and is met by including a graph that allows for clear interpretation of the findings. More detailed information on the CEC-EBPs can be found at https://www.cec.sped.org/~/media/Images/Standards/CEC%20EBP%20Standards%20cover/CECs%20Evidence%20Based%20Practice%20Standards.pdf. For a study to be considered methodologically sound under the CEC-EBPs, all domains and subdomains must be met. When rating studies, the coders examined the article to determine if the indicator was explicitly stated. If the indicator was not clearly addressed within the study, the research team used guidance from the CEC-EBP to make an informed decision on whether the indicator was met (B. G. Cook et al., 2015).
It is typical for research syntheses to eliminate those studies that do not meet criteria from further analyses. However, we have included each to account for publication bias, and because previous research (Losinski, Cuenca-Carlino, Zablocki, & Teagarden, 2014) has shown that meeting all of the domains in QIs is challenging (e.g., Spooner, Root, Saunders, & Browder, 2018), especially in the literature that precedes the standards.
Data Analysis
The current research synthesis included no group designs (see Figure 1 for a flowchart of the inclusion process). As a result, the only effect sizes calculated were for SCD studies, which were analyzed using three outcome metrics: (a) visual analysis, (b) percentage of nonoverlapping data (PND; Scruggs et al., 1987), and (c) the BC-SMD (Shadish et al., 2014). Prior to analysis, decisions were made regarding the collection of baseline and intervention data. A few studies utilized multiple treatments or variations of interventions. We chose to analyze only the complete treatment package, not component parts, against baseline data. In addition, if a study investigated two or three different interventions before choosing a final intervention, we only used the final intervention chosen. For example, Jolivette, Wehby, and Hirsch (1999) utilized error analysis and preference assessments to identify three possible interventions for students to use. The most promising intervention was then implemented and measured. In this case, only the final intervention was measured against the baseline. Follow-up and generalization points were also excluded from analysis.

Figure 1. Flow chart of inclusion process for synthesis and meta-analysis.
Visual analysis
Each graph was analyzed to determine the response rate (RR) of the intervention, following the guidelines described in the Procedures and Standards Handbook (Version 3.0) from the What Works Clearinghouse (2014). Each case was examined for changes in trend, level, and variability to determine if a functional relation existed. An overall RR was calculated for each study, by summing the total number of responders in a study and dividing by the total participants in the intervention.
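As a brief illustration of the RR calculation, the following sketch divides the number of responders by the number of participants. The responder judgments themselves come from visual analysis and cannot be automated here; the case labels and values are hypothetical.

```python
def response_rate(responders, participants):
    """Proportion of participants judged to be responders."""
    if participants <= 0:
        raise ValueError("a study needs at least one participant")
    if not 0 <= responders <= participants:
        raise ValueError("responders must be between 0 and participants")
    return responders / participants


# Hypothetical visual-analysis judgments for one study (True = responder).
cases = {"Case 1": True, "Case 2": True, "Case 3": False, "Case 4": True}

rr = response_rate(sum(cases.values()), len(cases))
print(f"RR = {rr:.0%}")  # 3 of 4 cases responded
```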
PND
PND was calculated for each study by determining the number of treatment points that exceeded the highest baseline point and dividing that number by the total treatment points (Scruggs et al., 1987). PND is reported as a percentage and is interpreted based on the guidelines provided by Scruggs and Mastropieri (1998): PND > 90% suggests a very effective intervention, 90% > PND > 70% suggests an effective intervention, and a PND < 70% suggests a questionable or ineffective intervention. Overall, PND values were averaged for each construct and the overall PND was calculated without weighting.
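The PND calculation and its interpretation bands can be sketched as follows. The sketch assumes that higher scores indicate improvement (as with math performance) and, because the guidelines above leave the exact boundary values unassigned, it places PND values of exactly 70% and 90% in the "effective" band; that boundary choice and the example data are assumptions, not part of the source.

```python
def pnd(baseline, treatment):
    """Percentage of treatment points that exceed the highest baseline point."""
    if not baseline or not treatment:
        raise ValueError("both phases need at least one data point")
    ceiling = max(baseline)
    exceeding = sum(1 for point in treatment if point > ceiling)
    return 100.0 * exceeding / len(treatment)


def interpret_pnd(value):
    """Label a PND value using the bands described in the text."""
    if value > 90:
        return "very effective"
    if value >= 70:  # assumption: boundary values fall in this band
        return "effective"
    return "questionable or ineffective"


# Hypothetical percent-correct scores for one case (not study data).
baseline_scores = [20, 30, 25]
treatment_scores = [40, 28, 55, 60]

value = pnd(baseline_scores, treatment_scores)
print(value, interpret_pnd(value))  # 3 of 4 treatment points exceed 30
```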
BC-SMD
To calculate the BC-SMD for each study, data from graphs were extracted using Engauge Digitizer (Version 10.3). Then, data were input into an open-source, web-based program called scdhlm. Data were detrended by assigning a session number to each case, which was utilized as the detrending variable. To calculate the BC-SMD, there must be at least three baseline and three intervention data points, as well as three cases within the study.
Computed scores from the scdhlm program were then input into Comprehensive Meta-Analysis (CMA; Version 2.2.064) and grouped according to preidentified constructs based on the outcome measures of the study by means of a random effects model. Using previous reviews and the NMAP Report, four outcome constructs were developed: Number Sense, Fractions, Word Problems, and Other. Interpretation of the computed BC-SMD score follows the guidelines set by Cohen (1988) where BC-SMD < 0.20 indicates a small effect, and BC-SMD > 0.80 indicates a large effect. The BC-SMD statistic has been identified as a robust effect size consistent with the group standard mean difference (Shadish, Hedges, Horner, & Odom, 2015) and will be primarily utilized when drawing conclusions in this meta-analysis.
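For readers unfamiliar with random effects pooling, the following is a minimal sketch of one common estimator, the DerSimonian-Laird method of moments. It is offered only as an illustration of the general technique; the exact computations performed by CMA are not described here, and the function name and input values are hypothetical.

```python
import math


def random_effects_pool(effects, std_errors):
    """Pool effect sizes with DerSimonian-Laird random-effects weights.

    Returns (pooled effect, standard error of pooled effect, tau squared).
    """
    k = len(effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two effects with matching standard errors")
    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) estimate, needed for the Q statistic.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Between-study variance (tau squared), truncated at zero.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau squared to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled_se, tau2


# Hypothetical study-level effect sizes and standard errors.
pooled, pooled_se, tau2 = random_effects_pool([1.2, 2.5, 3.1], [0.4, 0.6, 0.5])
print(f"pooled = {pooled:.3f}, SE = {pooled_se:.3f}, tau^2 = {tau2:.3f}")
```

When the study effects are heterogeneous, tau squared is positive and the weights become more nearly equal across studies, which widens the confidence interval relative to a fixed-effect model.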
Publication bias
Prior to accepting the treatment effect expressed in a meta-analysis as an accurate approximation of the impact of mathematics interventions for students with ED, it must be determined if the included studies are representative of all studies conducted on the topic. Publication bias, or the tendency to exclude the publication of studies with null results, is a serious concern in the social sciences (B. G. Cook, 2014; Maag & Losinski, 2015) and in SCD research (Shadish, Zelinsky, Vevea, & Kratochwill, 2016). To address publication bias, the present meta-analysis utilized statistical tests to assess for potential bias. Two publication bias analyses were conducted using CMA (Version 2.2.064): Egger’s regression of the intercepts test (ERI; Egger, Smith, Schneider, & Minder, 1997) and Duval and Tweedie’s (2000) trim and fill method (T&F). ERI assesses funnel plot asymmetry by regressing each study’s standardized effect (the effect size divided by its standard error) on its precision (the inverse of the standard error). If the intercept is near zero, it is unlikely bias is present, whereas an intercept that differs significantly from zero suggests there may be publication bias within the analysis. The T&F method utilizes a funnel plot of all included studies. It is expected that when no bias is present, the funnel plot will be symmetric. A nonsymmetric funnel plot indicates bias may be present, in which case effects are imputed until symmetry is achieved and the effect size is recalculated.
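To make the regression intercept test concrete, the sketch below follows the usual formulation of Egger's test: each study's standardized effect (effect size divided by standard error) is regressed on its precision (one divided by the standard error) using ordinary least squares, and the intercept is examined. This is an illustration of the general technique rather than a reproduction of CMA's output; the function name and data are hypothetical, and obtaining a p value would additionally require a t distribution with k - 2 degrees of freedom.

```python
import math


def egger_intercept(effects, std_errors):
    """Egger's regression: returns (intercept, SE of intercept, t statistic)."""
    k = len(effects)
    if k < 3 or k != len(std_errors):
        raise ValueError("need at least three studies with matching standard errors")
    y = [e / se for e, se in zip(effects, std_errors)]  # standardized effects
    x = [1.0 / se for se in std_errors]                 # precisions

    x_bar = sum(x) / k
    y_bar = sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance with k - 2 degrees of freedom.
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    resid_var = ss_resid / (k - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Hypothetical effect sizes and standard errors (not study data).
demo_effects = [3.1, 1.95, 1.55, 1.36]
demo_ses = [1.0, 0.5, 0.25, 0.2]
intercept, se_intercept, t_stat = egger_intercept(demo_effects, demo_ses)
print(f"intercept = {intercept:.2f}, t = {t_stat:.2f}")
```

In this hypothetical data set the smaller, less precise studies report the larger effects, which is the pattern that pushes the intercept away from zero.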
Results
The initial search generated 823 results. After duplicates were removed, 713 articles remained. Two of the authors screened the titles and abstracts agreeing on 46 potential articles for inclusion. Articles were then screened for inclusion by two dyads of researchers working together. Following the initial screening, the dyads met to compare results, unanimously identifying 14 articles meeting inclusion criteria. A hand search of recent literature was conducted as well as an ancestral search of previous reviews. Sixteen additional studies were identified through the ancestral search and were evaluated for inclusion using the same method as previously described. Zero studies were identified during the hand search, resulting in a total of 17 studies included in the synthesis.
Within the 17 included articles, there were a total of 66 participants. Descriptions of participants and the type of math intervention can be found in Table 1. Of the 17 studies that reported the age of participants, the median age was 11.5 (range = 8.6–16). There were 16 studies that provided the gender of students, with males being more prevalent than females. The most common setting for the intervention was a special day school (n = 10), followed by a self-contained classroom (n = 4), and resource room (n = 3). There were slightly more researchers as intervention agents (n = 9) than teachers (n = 8). The intervention in each study was sorted into a preidentified construct with strategy instruction (n = 6) being the most common approach followed by CCC (n = 4), peer tutoring (n = 3), direct instruction (n = 3), and anchored instruction (n = 1). Finally, the areas of mathematics addressed by the study were sorted into the following outcome constructs: number sense (n = 7), fractions (n = 4), word problems (n = 4), geometry and measurement (n = 1), and other (n = 1).
Table 1. Study Characteristics.
Note. SC = self-contained; SCD = single-case design; SD = special day school; M = male; RR = resource room; DNS = did not specify.
Quality of Included Studies
Of the 17 studies, four met all of CEC’s (2014) evidence-based standards (Alter, 2012; Alter, Brown, & Pyle, 2011; Hott, Evmenova, & Brigham, 2014; Williams, 2015). Of the remaining studies, three met seven indicators, five met six indicators, four met five indicators, and one met only four indicators (see Figure 2). The following indicators were met by all studies: 1.1, 4.1, 4.2, 6.8, 6.9, 7.1, 7.3, 7.6, and 8.1. Commonly missed were descriptions of fidelity related to adherence (5.1 = 66.67% met) and assessment of fidelity regularly throughout the intervention (5.3 = 60.87% met). Other frequently omitted indicators were controlling for threats to internal validity (6.7 = 52.17% met) and systematically manipulating the independent variable (6.1 = 66.67% met), both of which are directly linked to inclusion of adherence fidelity (Indicator 5.1). Another frequently omitted indicator was 3.2 (70.83% met), which requires the inclusion of information on the interventionist’s qualifications to implement the intervention.

Figure 2. Quality coding results from the included studies.
Synthesis of Study Effects
Visual analysis of graphed data showed a high RR, with 75 of the 76 cases responding to the math intervention presented in the study. The average PND was 91.5%, indicating the included math interventions were effective in increasing mathematics performance. The PND from all included studies ranged from 75% (Peltier & Vannest, 2017; Skinner, Turco, Beatty, & Rasavage, 1989) to 100% (Brasch, Williams, & McLaughlin, 2008; Cade & Gunter, 2002; Cieslar et al., 2008; Davis & Hajicek, 1985; Jolivette et al., 1999; Mulcahy & Krezmien, 2009; Skinner, Ford, & Yunker, 1991), all above the suggested threshold of 70% used to identify effective interventions.
The omnibus effect size was quite large (BC-SMD = 2.78, SE = 0.476, z = 5.832, p < .001) based on the interpretations recommended by Cohen (1988). Individual study effect sizes ranged from 0.330 (SE = 0.110, z = 3.000, p = .003; Franca & Kerr, 1990) to 26.260 (SE = 3.910, z = 6.716, p < .001; Cade & Gunter, 2002). Because the Cade and Gunter effect size was a considerable outlier, we ran a second analysis omitting this study, and again found a large effect size (BC-SMD = 2.44, SE = 0.438, z = 5.567, p < .001). Finally, we calculated an omnibus effect size for only those studies that met all of the QIs (BC-SMD = 2.21, SE = 0.327, z = 6.762, p < .001). Table 2 shows the results of the 11 studies for which the BC-SMD effect sizes were calculated. A common reason for excluding cases from effect size analysis was a failure to provide three demonstrations of effect (e.g., Skinner et al., 1991). Studies were grouped by six predetermined constructs based upon the math skills the intervention was designed to teach: fractions, number sense, geometry and measurement, algebra, word problems, and other. The largest effects were found for number sense (BC-SMD = 4.271, SE = 1.018, z = 4.195, p < .001), followed by fractions (BC-SMD = 2.852, SE = 1.176, z = 2.195, p = .028), word problems (BC-SMD = 2.103, SE = 0.265, z = 7.924, p < .001), and other (BC-SMD = 2.060, SE = 2.012, z = 1.024, p = .306).
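The relationship among the reported effect sizes, standard errors, z values, and p values can be checked with a short sketch. It assumes a standard Wald-type test (z equals the effect size divided by its standard error, with a two-tailed p value from the normal distribution); the function name is hypothetical, and the example uses the values reported above for Franca and Kerr (1990).

```python
import math


def z_and_p(effect, std_error):
    """Return the z statistic and two-tailed normal p value for an effect size."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    z = effect / std_error
    # Two-tailed p value: P(|Z| > |z|) for a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Values reported in the text for Franca and Kerr (1990).
z, p = z_and_p(0.330, 0.110)
print(f"z = {z:.3f}, p = {p:.3f}")  # z = 3.000, p = 0.003
```

Small discrepancies between hand-computed and reported z values are expected when the published effect sizes and standard errors have been rounded.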
Table 2. Effect Sizes.
Note. RR = resource room; PND = percent of nonoverlapping data; BC-SMD = between-case standard mean difference; CI = confidence interval; QI = quality indicator.
a Average of the two treatments. b Those studies that met all QIs. c Analysis without Cade and Gunter (2002).
Publication Bias
The BC-SMD metric was used to conduct the publication bias analyses. Results of the ERI (intercept = 6.63; p < .01) suggest publication bias may be present. However, Duval and Tweedie’s (2000) trim and fill method yielded an adjusted value equal to that obtained in the original analysis, indicating that the funnel plot was not significantly asymmetrical. In sum, the publication bias analyses were mixed, with the regression of the intercept test showing the possibility of bias and the trim and fill method showing little bias.
Discussion
In the present meta-analysis, we investigated the effects and evidence-base of mathematics interventions to increase the mathematics achievement of students with disabilities. Overall, significant effects were observed, suggesting that interventions targeting mathematics achievement in students with ED can increase these students’ academic success. Few studies in the current meta-analysis met the QIs established by CEC (Alter, 2012; Alter et al., 2011; Hott et al., 2014; Williams, 2015), suggesting there is a lot of work yet to be done in the area of ED and mathematics education. We will discuss these results with reference to the research questions, followed by implications of the meta-analysis and what the results mean to the education of students with ED. Finally, limitations and implications for future research are provided.
Effectiveness of Math Interventions for Students With ED
Our findings are consistent with other systematic reviews and meta-analyses of math interventions for students with ED. To begin, Mulcahy et al. (2016) also found that many of the 19 studies included in their review did not meet standards for rigorous research. This is not uncommon in reviews addressing other bodies of literature among students with and at-risk for ED (e.g., Ennis, Royer, Lane, & Griffith, 2017; Losinski, Wiseman, White, & Balluch, 2016) especially for articles predating the 2005 special issue of Exceptional Children (e.g., Odom et al., 2005).
Similarly, Ralston et al. (2014) and Templeton et al. (2008) found that most published studies had high PND and IRD rates. Despite a paucity of research in math interventions for students with ED, it is encouraging to see that the existing strategies being evaluated have demonstrated effectiveness for this population of students. There are several important things to note within this body of literature. To begin, the median age across all studies was 11.5, with a number of studies conducted in middle and high schools. This is an overall strength for this body of research, as historically academic intervention research for students with ED, as with many other areas of special education research, has focused on younger students. However, one weakness worth noting is that three studies did not specify the ages of student participants. As we seek to better define participating students, this is an important demographic variable to include.
Also of note are the least restrictive environment settings represented in this body of literature. The majority of studies took place within a special day school (n = 10) for students with disabilities. Along the continuum, this is a restrictive setting, suggesting that many students in these settings may possess significant behavioral challenges that have prevented them from being successful in less restrictive settings. This is not surprising as in 2013, 17.1% of students with ED received services in nontraditional settings (U.S. Department of Education, 2015). However, this is also encouraging as, historically, more restrictive settings have maintained a focus on getting student behavior under control with less of an emphasis on academics (Tobin & Sprague, 2000). This is also promising as very few practices have research-supported utility in nontraditional settings and future research in these settings is needed (Ennis, Harris, Lane, & Mason, 2014). A related and encouraging finding is that over half of the studies were conducted using a classroom teacher as an interventionist. As we seek to bring evidence-based practices into the classroom, investigations that equip teachers to implement key strategies are a necessary step (Odom et al., 2005). However, only 70.83% of studies met QI 3.2, which requires studies to clearly define how the interventionist is qualified to implement the intervention. We suggest that future studies clarify what training or prerequisite skills interventionists, whether researchers or teachers, have.
One overall limitation of this body of literature is the failure of many studies to adequately assess treatment fidelity or describe how it was assessed (QI 5.1), or to report that fidelity was assessed throughout the intervention rather than only for certain phases (QI 5.3). Although it is possible that these procedures were in place, in many instances the lack of an explicit description resulted in these indicators being coded as not present. Future researchers should include explicit statements of these procedures. This is especially important because QI 6.1 (systematic manipulation of the independent variable) and QI 6.7 (control for threats to internal validity) are directly linked to QI 5.1. In addition, some studies did not meet QI 6.7 because they used nontraditional designs that failed to control for threats to internal validity (e.g., two-tiered multiple baseline).
Evidence-Based Practices in Math for Students With ED
Given the limited number of studies meeting the CEC-EBP standards and the diverse approaches to math instruction included in the review, we cannot determine that any one practice has a sufficient evidence base. To be considered an evidence-based practice, five single-case research design studies with a minimum of three cases, meeting all QIs, totaling 20 or more participants, and showing primarily positive outcomes are required. However, to be considered a potentially evidence-based practice, only two such single-case research design studies are needed, a threshold that strategy instruction meets (i.e., Alter, 2012; Alter et al., 2011). Therefore, we can say that strategy instruction is a potentially evidence-based practice for math instruction, in particular word problems, for students with ED.
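The decision rule described in the preceding paragraph can be sketched in a few lines of code. This is an illustrative simplification rather than the CEC (2014) procedure itself: the class and function names are hypothetical, the three-case minimum is assumed to apply to each study, and "primarily positive outcomes" is assumed to mean that more than half of the methodologically sound studies report positive effects.

```python
from dataclasses import dataclass


@dataclass
class Study:
    label: str               # hypothetical identifier, e.g., "Study A"
    n_cases: int             # number of single-case participants
    meets_all_qis: bool      # True if every quality indicator was met
    positive_outcome: bool   # True if effects were positive


def classify_practice(studies):
    """Classify a practice from its single-case studies (simplified sketch)."""
    # Only studies meeting all QIs with at least three cases count (assumption:
    # the three-case minimum is applied per study).
    sound = [s for s in studies if s.meets_all_qis and s.n_cases >= 3]
    positive = [s for s in sound if s.positive_outcome]
    # Assumption: "primarily positive" means a majority of sound studies.
    primarily_positive = len(sound) > 0 and len(positive) > len(sound) / 2
    participants = sum(s.n_cases for s in positive)

    if primarily_positive and len(positive) >= 5 and participants >= 20:
        return "evidence-based"
    if primarily_positive and len(positive) >= 2:
        return "potentially evidence-based"
    return "insufficient evidence"


# Made-up example: two sound, positive studies with three cases each.
two_studies = [
    Study("Study A", n_cases=3, meets_all_qis=True, positive_outcome=True),
    Study("Study B", n_cases=3, meets_all_qis=True, positive_outcome=True),
]
result = classify_practice(two_studies)
```

Under these assumptions, two sound studies with positive outcomes are enough for the "potentially evidence-based" label, whereas the full "evidence-based" label additionally requires at least five such studies and 20 or more participants in total.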
Three articles used strategy instruction to address both number sense (Cade & Gunter, 2002; Jolivette et al., 1999) and fractions (Davis & Hajicek, 1985). However, the strategies used across these articles were diverse. Strategy instruction, in particular self-regulated strategy development (SRSD), is an evidence-based practice for teaching writing to students with ED (Ennis & Jolivette, 2014; Losinski et al., 2014) and is also being used to increase the reading comprehension of students with ED (e.g., Sanders, Ennis, & Losinski, 2018). In their analysis, Ralston and colleagues (2014) urged teachers to use SRSD to teach cognitive strategies in math. We echo the need for the use of strategy instruction, noting that more research on using SRSD to teach math is needed.
Furthermore, in light of the lack of any clear evidence-based practices in math for students with ED, we suggest that practitioners consider using the strategies included in these articles, but with caution given the lack of sufficient evidence. Many of these strategies have proven utility with other populations of students with disabilities and utilize effective intervention methods, including interventions that are teacher mediated, peer mediated, and student mediated. For example, two studies looked at the effect of direct instruction, a teacher-mediated intervention, to teach number sense (Brasch et al., 2008) and geometry and measurement (Mulcahy & Krezmien, 2009). Direct instruction has been widely researched for students with a range of disabilities and has demonstrated utility for mathematics instruction (Przychodzin, Marchand-Martella, Martella, & Azim, 2004). Similarly, two studies utilized peer-tutoring strategies to teach fractions (Franca & Kerr, 1990) and math vocabulary (Hott et al., 2014). Many effective strategies, such as peer-assisted learning strategies (PALS) math (Calhoon & Fuchs, 2003), utilize peer tutoring to teach math to students with disabilities and, therefore, may have success with students with ED. Finally, four studies looked at the CCC strategy to improve outcomes for problems involving fractions (Cieslar et al., 2011) and number sense (Skinner et al., 1993; Skinner et al., 1991; Skinner et al., 1989). CCC is a student-directed strategy that has promise for improving math fact fluency for students with a range of disabilities (Stocker & Kubina, 2016).
Limitations and Future Directions
There are a number of limitations to the current meta-analysis that should be addressed. First, when analyzing cases, we examined only the complete intervention package and did not analyze specific components. Therefore, the effect that individual components had on mathematical achievement is unclear. Second, coding of articles is a subjective activity, particularly with regard to the CEC (2014) standards, and other researchers may have arrived at different conclusions than those reported here. For example, our coding required that authors explicitly state the necessary components rather than giving them the “benefit of the doubt.” Third, it is possible that we missed articles investigating mathematics for students with ED in the search process; however, our methods were extensive, and the analyses of publication bias likely would account for any missing studies. Finally, we did not examine the effects of behavioral interventions on mathematics achievement; thus, the extent to which behavioral interventions impact the academic achievement of children with ED is unclear.
With respect to mathematics for students with ED, future research should include and/or document strategies to support behavior during math instruction. As many students with ED have co-occurring academic and behavioral challenges, using behavioral strategies during math instruction can contribute to higher academic outcomes (Sanders et al., 2018). Currently, no intervention construct can be described as an evidence-based practice. The closest practice is strategy instruction for word problem solving (e.g., Alter et al., 2011). Thus, future researchers should replicate these methods to confirm the findings and to build that evidence base. Future research should also focus on the concepts outlined by NMAP (2008), specifically the areas of fractions and geometry and measurement. Currently, there are no quality studies for students with ED examining geometry and measurement, with only one study (Mulcahy & Krezmien, 2009) investigating this construct. However, that study neither provided the information needed to calculate effects nor met all QIs. With regard to fractions, future researchers should conduct more replications investigating peer tutoring, CCC, and strategy instruction, as each of these methods showed promise, but none had an adequate number of studies to support its use as an evidence-based practice.
Conclusion
The evidence for mathematics interventions for students with ED is promising, yet sporadic. Indeed, the current meta-analysis identified 17 studies carried out over the last three decades. These interventions have ranged from strategy instruction to peer tutoring and have covered much of the range of prealgebra skills discussed by NMAP (2008), demonstrating large effects. However, no strategy can be considered an evidence-based practice because of the generally low quality of the included studies and the lack of replications of specific interventions. Thus, we highly recommend that the field begin replicating and extending the literature base of math interventions for students with ED.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
