Abstract
Students with emotional and behavioral disorders (EBD) exhibit problem behaviors that potentially result in lower performance in reading and related content areas. Researchers and policy makers have increasingly emphasized the need for evidence-based practices (EBPs) in reading. However, conclusions made regarding the effectiveness of the interventions strongly depend on the rigor of systematic reviews and meta-analyses used to identify intervention research. This article applied a set of established quality indicators to literature reviews of reading instruction for children with EBD. Systematic reviews and meta-analyses published in refereed journals between 1996 and 2018 were eligible for inclusion. Identified reviews (n = 17) generally exhibited a range of methodological strengths; however, authors did not consistently describe coding procedures or assess the quality of primary studies. Implications for the identification of EBP follow a discussion of findings.
Approximately 5.1% of children aged between 4 and 17 years have been identified as having emotional and behavioral disorders (EBD; Henning-Smith & Alang, 2016). Students with EBD often exhibit behavioral problems that negatively affect their academic performance and have lower rates of academic achievement in math, reading, history, and science compared with students with other high-incidence disabilities (Gage et al., 2014; Kauffman, 2001; Nelson et al., 2004). Children with EBD are also at increased risk of dropping out of school or engaging in risky behaviors that can span into adulthood (Wagner et al., 2003). Osher et al. (2003) reported that students with EBD are more likely to engage in drug abuse and violence, and be arrested within 3 years of leaving school. Risky behavior, poor performance in school, and tenuous social relationships often result in a lower quality of life for individuals with EBD.
In addition to general academic problems, students with EBD often exhibit deficits in reading. Wagner, Kutash, Duchnowski, Epstein, and Sumi (2005) posited that students with EBD read approximately 2.2 grades below grade level. In a study examining the academic achievement of K-12 students with EBD, Nelson and colleagues (2004) indicated 83% of children with EBD scored below students in the typical group on a standardized reading measure. Due to their importance in academic performance, fundamental reading deficits eventually cascade into multiple content areas including science, literature, and history (Nelson et al., 2004). Challenges with reading become particularly acute when students transition to middle or high school, where content becomes more dense, complex, and difficult to comprehend (Hall et al., 2013).
The past two decades have witnessed a surge in the volume of published studies investigating practices meant to address reading skills of students with EBD (e.g., Cullen et al., 2014; Oakes et al., 2010; Palmer et al., 2014). Yet reading continues to challenge this group of students (Mitchell et al., 2019; Wanzek et al., 2014). Consequently, experts in special education have emphasized the need to adopt evidence-based practices (EBPs) to address the reading difficulties of children with EBD (Zaheer et al., 2019). EBPs are methods evaluated through experimental studies capable of demonstrating functional relationships between interventions and outcomes. Laws and professional standards pertaining to education explicitly require the use of EBP; nonetheless, teachers in special education rarely adopt empirically supported instructional practices or consult research in identifying instructional methods to use in their classrooms (Cook et al., 2015; Sciuchetti, McKenna, & Flower, 2016).
Challenges associated with establishing EBPs represent one of the chief obstacles to their use (Cook et al., 2015; Sciuchetti et al., 2016). In special education and related fields, systematic reviews and meta-analyses have become popular approaches for identifying EBP (King et al., 2020). Systematic reviews and meta-analyses involve the transparent analysis of literature on a specific topic, thereby enabling broad conclusions about the effects of interventions as well as subsequent replication of review procedures (Maggin et al., 2011, 2017). Meta-analyses also feature statistical summaries of overall intervention effects on outcomes of interest. In demonstrating effects across studies, systematic reviews and meta-analyses support one of the defining characteristics of EBP (Cook & Cook, 2011).
EBP must be supported by several studies featuring multiple participants and settings (Cook & Cook, 2011). Systematic reviews and meta-analyses entail gathering and summarizing all research concerning a particular intervention. Authors may also attempt to determine whether intervention effectiveness varies with changes in settings or methodological factors. Several literature reviews synthesize findings on reading interventions for students with EBD (e.g., Ruhl & Berlinghoff, 1992). However, findings regarding the intervention efficacy often depend on the methodological rigor of the review (King et al., 2020).
Efficacy of Systematic Reviews and Meta-Analyses in the Identification of EBP
The importance of systematic reviews and meta-analyses in identifying EBP in human service fields has resulted in enormous interest in their execution (McNamara & Scales, 2011). As part of the EBP initiative, there have been efforts to develop guidelines for conducting research syntheses (Talbott et al., 2018). Transparent descriptions of review procedures assist consumers in detecting bias and understanding contradictory overlapping reviews, or separate reviews pertaining to the same subject (e.g., Siontis et al., 2013). Methodological differences have resulted in discordant reviews on a number of topics including social skills interventions (e.g., Wang et al., 2013; cf. Ledford et al., 2018) and the prevalence of replication studies in special education (e.g., Lemons et al., 2016; cf. Therrien et al., 2016). Consequently, researchers have increasingly recognized the effect literature review procedures (e.g., search terms used, quality assessment) may have on outcomes (King et al., 2020).
Specific facets of literature reviews have implications for the identification of EBPs and the validity of outcomes. Clear research objectives, operationalized search criteria (e.g., definition of the independent variable), and transparent coding schemes allow consumers to properly contextualize the suitability of procedures given the questions guiding the review (Talbott et al., 2018). Search methods (e.g., databases) reflect the extent to which authors have identified material relevant to questions of interest (Delaney & Tamás, 2018). Similarly, decisions regarding material to include on the basis of source (e.g., peer-reviewed journal), publication date, study design (i.e., group or single-case; Shadish et al., 2015), and language of origin can bias the review. Specifically, publication bias, or the tendency of positive findings to be overrepresented in certain bodies of literature, often influences the findings of reviews that exclude certain types of research (Cook & Therrien, 2017; Gage et al., 2017; King et al., 2020). Methodological rigor of primary studies also represents an important consideration in reviewing literature, as experiments that do not control for threats to validity have the potential to distort outcomes (Petticrew, 2015). Transparent analysis plans further demonstrate the rationale for conclusions reached by the authors and allow consumers to assess the accuracy of reported findings (Talbott et al., 2018).
Efforts to assess the quality of systematic reviews and meta-analyses have flourished in medicine (Liberati et al., 2009) and are most often associated with the Cochrane Collaboration, an organization dedicated to evaluating systematic reviews in health care (Chalmers et al., 2002). Similar efforts in special education have only recently emerged. Talbott and colleagues (2018) developed a set of quality indicators to guide the assessment of eight methodological elements of literature reviews and meta-analyses (e.g., eligibility criteria, search procedures). Using these criteria, Maggin and colleagues (2017) evaluated systematic reviews published in Behavioral Disorders. They found that most reviews addressed several procedural elements, but also found limited reporting of coders’ qualifications, language restrictions, and the reliability of search procedures. A broad assessment of reviews (n = 1,196) published in special education journals conducted by King et al. (2020) similarly identified a limited amount of transparency as well as criteria with the potential to bias results (e.g., reliance on peer-reviewed research). In light of their findings, the authors encouraged additional evaluation of subject-specific reviews, as attention to methodological rigor has the potential to alter the significance of previous research.
Purpose
Methods employed in systematic reviews of reading interventions for students with EBD have political and practical implications. Given the important role literature syntheses play in identifying EBPs, the procedures reviewers use in identifying and analyzing reading intervention research warrant investigation (Maggin et al., 2017). To date, researchers have not scrutinized reviews of intervention research in this area. This study evaluated the methodology of systematic reviews and meta-analyses concerning reading interventions for students with EBD. Specific questions included (a) what were the characteristics (e.g., included study designs, independent and dependent variables) of reviews and meta-analyses, and (b) to what extent did reviewers address the various aspects of quality indicators for systematic reviews?
Method
Literature Search
The literature review entailed database, ancestral, and manual search methods. All search procedures originally occurred in 2016 and were updated in 2018 to ensure the inclusion of the most recent systematic reviews and meta-analyses. We conducted a database search of abstracts for articles published in peer-reviewed journals between 1996 and 2018 using PsycINFO, PsycARTICLES, and ERIC databases with the following Boolean string: (AB (“meta-analys*” OR “meta analys*” OR “quantitative review*” OR “quantitative analys*” OR “review” OR “synthes*”)) AND (AB (“emotion* behavior* dis*” OR “ebd” OR “behavior* dis*” OR “emotion* dis*”)) AND (AB (“read*” OR “litera*” OR “comprehen*” OR “phon*” OR “alphabet*” OR “fluen*” OR “vocabulary”)).
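The structure of the search string can be made explicit with a short sketch. The term lists below are copied from the Boolean string above; the function and variable names are hypothetical, straight quotation marks stand in for the typeset ones, and the AB prefix is assumed to restrict each clause to the abstract field, as the description of the abstract search suggests.

```python
# Term lists copied from the reported search string; names are illustrative only.
REVIEW_TERMS = ["meta-analys*", "meta analys*", "quantitative review*",
                "quantitative analys*", "review", "synthes*"]
POPULATION_TERMS = ["emotion* behavior* dis*", "ebd", "behavior* dis*",
                    "emotion* dis*"]
READING_TERMS = ["read*", "litera*", "comprehen*", "phon*", "alphabet*",
                 "fluen*", "vocabulary"]


def abstract_clause(terms):
    """OR together quoted terms inside a single abstract-field (AB) clause."""
    quoted = " OR ".join(f'"{term}"' for term in terms)
    return f"(AB ({quoted}))"


def build_query(*term_groups):
    """AND together one abstract-field clause per concept group."""
    return " AND ".join(abstract_clause(group) for group in term_groups)


query = build_query(REVIEW_TERMS, POPULATION_TERMS, READING_TERMS)
print(query)
```

Keeping one list per concept (review type, population, reading outcome) makes it straightforward to see that an abstract had to match at least one term from every group to be returned.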
A total of 700 abstracts were identified from the initial search, 52 of which were removed due to duplication. Of the remaining 648 abstracts, 22 met initial inclusion criteria. An ancestral search of studies retrieved from the database search, in which the first author examined reference lists of studies found in the database search, identified an additional four studies. Finally, we performed a hand search of Behavior Modification, Journal of Emotional and Behavioral Disorders, and Behavioral Disorders from the years 1996 to 2018. The hand search did not identify any additional studies.
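The abstract-level counts reported for the database search can be checked with simple arithmetic. The sketch below uses only figures stated in the text (700 identified, 52 duplicates, 22 meeting initial criteria); the function name and dictionary keys are hypothetical, and the sketch deliberately stops before full-text screening.

```python
def screening_flow(identified, duplicates, met_initial_criteria):
    """Return abstract-level screening counts for a database search."""
    unique = identified - duplicates
    if unique < 0 or met_initial_criteria > unique:
        raise ValueError("counts are inconsistent")
    return {
        "unique_abstracts": unique,
        "excluded_at_abstract": unique - met_initial_criteria,
        "advanced_to_full_text": met_initial_criteria,
    }


flow = screening_flow(identified=700, duplicates=52, met_initial_criteria=22)
print(flow)
```

The resulting values (648 unique abstracts, 626 excluded) agree with the counts given in the search and inclusion descriptions.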
Inclusion Criteria
Two independent coders reviewed abstracts identified through the database search using a set of initial inclusion criteria. Studies meeting initial inclusion criteria (a) explicitly identified as a literature review, synthesis, meta-analysis, quantitative analysis, or quantitative synthesis in the title or abstract; (b) targeted literature concerning students with EBD; and (c) appeared in English-language, peer-reviewed journals. We retrieved the full text of studies for which qualification could not be concluded from the abstract. Application of the initial inclusion criteria resulted in the elimination of 626 studies, leaving 22 to undergo additional screening. We retrieved the full texts and subjected them to further examination. Included reviews systematically evaluated interventions, defined as adaptations to instructional presentation, content, or delivery with the potential to improve reading or literacy skills for students with or at risk for EBD.
Author identification of students as being with or at risk for EBD was sufficient for inclusion. Studies featuring multiple disability categories met criteria for inclusion provided (a) a primary emphasis on students with EBD appeared in the title or abstract and (b) the authors reported results for students with EBD separately. Examples of specific reading skills included phonemic awareness, alphabetic principles, fluency, vocabulary, and comprehension as well as measures nominally described as pertaining to reading (e.g., reading worksheets). Reviews targeting interventions on academic outcomes had to separately report results pertaining to reading. We restricted the publication date due to major changes in standards for systematic reviews (e.g., Maggin et al., 2017; Schlosser, 2006) as well as the review protocols of the What Works Clearinghouse (WWC; Kratochwill et al., 2010), which restrict literature searches to the 20 years prior to the date of the search. Following application of additional inclusion criteria, eight articles were removed because they did not address reading outcomes, were not systematic reviews, or did not address reading interventions (e.g., Kostewicz & Kubina, 2008).
Coding Procedures
A series of codes guided data extraction. Codes pertained to the characteristics (e.g., methodology, interventions) or quality of the reviews. We adapted quality assessment codes featured in Talbott et al. (2018).
Review characteristics
The first set of codes assessed characteristics of primary studies as well as the methods used in the review. Codes related to primary studies included participant diagnoses and ages, interventions, dependent variables, the number of participants, and outcomes. Diagnostic codes referred to disability categories encompassed by the reviews. Studies in which authors required participants to be identified as having EBD, possess an official diagnosis, or attend a facility for individuals with EBD were identified as including children with EBD. Studies that included children without a diagnosis or who otherwise exhibited problem behavior were identified as featuring children with or at risk for EBD. We coded age of participants based on the reports of the authors.
Focus of intervention codes identified whether reviews concerned reading interventions exclusively or academic interventions more broadly. Main categories of interventions included direct instruction, peer-mediated, self-management, and technology-based strategies. We defined direct instruction as any instructional approach structured, sequenced, and led by teachers (e.g., school-based classroom instruction, repeated reading, group instruction). Direct instruction also included teacher-initiated changes to prompt sequences, instructional antecedents, or consequences. Peer-mediated interventions involved the presentation or administration of content by other students. Self-management strategies encompassed self-regulatory behaviors to independently complete tasks (e.g., self-monitoring). Dependent variable codes denoted the target outcome variables featured in the reviewed studies. The categories of reading outcomes referred to specific reading skills set forth by the National Reading Panel (i.e., phonological awareness, alphabetic principles, vocabulary, fluency, and comprehension; Shanahan, 2005). The code various applied to reviews targeting wide ranges of dependent variables related to reading.
Additional codes pertained to characteristics of the review, including methodology, the study designs and sources included in the review, the use of quality assessment procedures, and outcome analyses. Methodology codes identified reviews as systematic or meta-analytic. Systematic reviews reported search procedures followed in the identification of studies. Meta-analyses included the terms “meta-analysis” or “quantitative synthesis/review” in the title and statistically synthesized the effects of included studies. The frequent use of single-case and group designs, two distinct forms of experimental inquiry often difficult to reconcile, represents a unique feature of special education research (Shadish et al., 2015). Design codes assessed whether authors included single-case or group designs, regardless of whether authors reported a priori criteria regarding designs featured in the review. However, we also noted whether authors explicitly identified study design as a criterion for inclusion as part of the assessment of eligibility criteria. Source codes described whether authors reported exclusion criteria on the basis of article origin as well as the inclusion of peer-reviewed articles, dissertations, or other material. Quality codes represented a binary assessment of whether or not authors assessed research quality. We further recorded the number of reading studies and, whenever possible, the number of participants included in each review. Study outcomes reflected either effect sizes or narrative description of reading studies provided by the authors.
Quality indicators
The second set of codes pertained to the quality indicators developed by Talbott and colleagues (2018). The evaluation tool developed by Talbott et al. (2018) consists of codes that describe the presence or absence of detail relating to eight domains, including research questions, eligibility criteria, search procedures, retrieval procedures, systematic screening, coding scheme procedures, coding scheme content, and data analysis. A more detailed description of these codes appears in Talbott et al. (2018).
Analysis
Given the descriptive aim of the study, analyses involved reporting the percentage of reviews meeting criteria or otherwise providing reported information. We entered all codes into a spreadsheet during coding and calculated descriptive statistics using Microsoft Excel. Missing information was noted as “not reported” with the exception of information pertaining to study participants. When it was not possible to provide the number of participants included in reading studies, we reported the total number of participants featured in the review.
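The descriptive analysis amounts to tallying codes and expressing each count as a percentage of reviews. The original calculations were performed in Microsoft Excel; the sketch below is only an equivalent illustration in Python, with hypothetical function names, using the 16-to-1 split between systematic reviews and meta-analyses reported in the Results as example input.

```python
import math
from collections import Counter


def percent(count, total):
    """Percentage rounded half up to the nearest whole number."""
    return math.floor(100 * count / total + 0.5)


def summarize(codes):
    """Map each code to (number of reviews, percentage of all reviews)."""
    total = len(codes)
    return {code: (n, percent(n, total)) for code, n in Counter(codes).items()}


# Example input: methodology codes for the 17 included reviews.
methodology_codes = ["systematic"] * 16 + ["meta-analysis"]
summary = summarize(methodology_codes)
print(summary)
```

Missing information would simply enter the tally as its own code (e.g., "not reported"), so that percentages remain relative to all reviews unless a subset is analyzed explicitly.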
Interrater Agreement
Two coders, a PhD in special education with experience assessing literature reviews (i.e., the first author) and an MA-level graduate student in psychology with prior research training (i.e., the third author), engaged in training to familiarize themselves with the abstract selection, characteristic codes, and the quality indicators for literature reviews. The graduate student reviewed a codebook corresponding with inclusion criteria, characteristic codes, and the Talbott et al. (2018) indicators prior to training. Thereafter, the two reviewers independently coded three practice articles and compared the results. Any disagreements were discussed until consensus was reached.
We defined agreement for abstracts as reviewers reaching the same opinion regarding article inclusion. Interrater agreement (IRR) for all identified abstracts, determined by dividing the number of agreements by the total number of identified abstracts and multiplying by 100, was 97.5%. The two coders independently coded 50% of the articles and compared results after completing each article. Agreement was defined as both reviewers identifying an identical code for a single category. Average IRR calculated using the point-by-point method (Ledford & Gast, 2017) was 96.6% (Range = 92–100%; SD = 3.4). Disagreements were discussed until consensus was reached.
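The agreement calculation described above (agreements divided by the total number of comparisons, multiplied by 100) can be sketched as follows. The function name and all coder data are hypothetical and serve only to illustrate the arithmetic; they are not the study's codes.

```python
import statistics


def percent_agreement(codes_a, codes_b):
    """Point-by-point agreement: identical codes / total comparisons x 100."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("coders must rate the same non-empty set of items")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * agreements / len(codes_a)


# Hypothetical codes from two coders for eight categories of one article.
coder_a = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]
coder_b = ["yes", "no", "yes", "no", "no", "yes", "no", "yes"]
article_irr = percent_agreement(coder_a, coder_b)  # 7 of 8 codes match

# Hypothetical per-article values, summarized as a mean, range, and SD.
per_article = [100.0, 92.0, 96.0]
mean_irr = statistics.mean(per_article)
sd_irr = statistics.stdev(per_article)
print(article_irr, mean_irr, min(per_article), max(per_article), sd_irr)
```

The same function applies to abstract screening, where each item is a single include or exclude decision rather than a coding category.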
Results
The search resulted in the identification of 17 studies. A summary of characteristics codes appears in Table 1, whereas the results for the quality assessment appear in Table 2. On average, a single systematic review appeared each year between the oldest article identified in the search (Coleman & Vaughn, 2000) and 2018 (range = 0–4).
Table 2. Summary of Quality Indicators.
Table 1. Summary of Studies.
Note. Studies focused on broad academic interventions are italicized. Studies with an asterisk did not provide a priori criteria related to designs eligible for review. DV = dependent variable; QA = quality assessment; MA = meta-analysis; AR = with or at risk for EBD; EBD = diagnosed with EBD; NR = not reported; PM = peer-mediated; DI = direct instruction; SM = self-management; Comp = comprehension; PA = phonological awareness; ORF = oral reading fluency; G = group design; SCD = single-case design; PR = peer reviewed; ES = effect size; PND = percentage of non-overlapping data; IRD = improvement rate difference; S = systematic; M-S = middle-secondary; VR = various reading variables; B = behavioral; O = other; CEC = Council for Exceptional Children; SMD = standardized mean difference; TI = technology interventions; WWC = What Works Clearinghouse.
Review Characteristics
Diagnosis and age of participants
Fifty-nine percent of reviews (n = 10) established a narrow definition of EBD related to nominal diagnosis or placement in a specified setting. The remaining reviews (41%, n = 7) employed a broader definition encompassing individuals without a formal diagnosis. Authors did not report the age group of participants in 35% of reviews (n = 6). Of the remaining 11 reviews, 18% (n = 2) targeted elementary-aged children, 36% (n = 4) concerned children in middle or high school, and 45% (n = 5) reported large age ranges (e.g., 4–17; 5–21).
Focus of intervention
The majority of reviews (65%, n = 11) focused explicitly on reading interventions, whereas the remaining 35% (n = 6) involved generic academic skills or interventions that encompassed a reading component. Reviews featuring a variety of components often included teacher-directed reading instruction (53%, n = 9), peer-mediated strategies (47%, n = 8), and self-monitoring/self-management (29%, n = 5). Other practices appearing less frequently included contingency-based strategies (18%, n = 3), corrective/repeated reading (12%, n = 2), and technology-based interventions (6%, n = 1). Authors exclusively examined self-management in 6% of the reviews (n = 1), whereas peer-mediated reading interventions represented the exclusive focus in 18% (n = 3) of reviews.
Dependent variables
Interventions in 35% (n = 6) of reviews specifically targeted reading outcomes, whereas 41% (n = 7) reviewed reading along with other academic outcomes. Authors reviewed studies concerning reading and behavior outcomes in 24% (n = 4) of articles. Most reviews listed at least two different types of reading outcomes, and these included oral reading fluency (65%, n = 11), comprehension (53%, n = 9), phonological awareness (24%, n = 4), spelling (18%, n = 3), and nonspecified reading outcomes (18%, n = 3).
Methodology and research designs
Authors conducted systematic reviews most frequently (94%, n = 16) and performed a meta-analysis in a single instance. Only 24% of reviews assessed single-case design (SCD) studies exclusively (n = 4), whereas 71% (n = 12) featured both group and SCDs. Authors did not provide information pertaining to the study designs prior to or following the search in 6% (n = 1) of reviews.
Article sources
Ninety-four percent of reviews (n = 16) reported criteria involving the sources of eligible articles. Of these, 94% (n = 15) exclusively reviewed refereed journal articles. The remaining review (6%, n = 1) included dissertations as well as peer-reviewed sources.
Quality assessment
Twelve percent of reviews (n = 2) assessed the quality of primary studies (Dunn et al., 2017; McKenna et al., 2017). McKenna et al. (2017) applied the WWC quality indicators (Kratochwill et al., 2010) to studies concerning all reading skills for students with EBD, finding that most studies (66%) did not meet design standards. Dunn et al. (2017) evaluated 24 studies of peer-mediated interventions using the quality indicators of the Council for Exceptional Children (CEC; Cook et al., 2015). On average, studies met 77% of the indicators. The authors concluded that, overall, peer-mediated interventions met criteria for EBP.
Outcomes
In 94% (n = 16) of reviews, authors reported information regarding the outcomes of studies. Of these, 31% (n = 5) provided a narrative description of outcomes. The remaining reviews featured quantitative effect sizes (69%; n = 11). These included indices typically associated with SCD, such as nonoverlap effect sizes (e.g., Tau-U; Parker et al., 2011) as well as Swanson’s methodology (Swanson & Sachse-Lee, 2000), which appeared in 47% of reviews (n = 8). Variations of the standardized mean difference (e.g., Glass’ Delta; Cummings, 2011) typically associated with group designs appeared in 24% of reviews (n = 4).
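For readers less familiar with the two families of effect sizes named above, the following sketch illustrates one index from each: the percentage of non-overlapping data (PND), a simple nonoverlap index for single-case designs, and Glass's delta, a standardized mean difference scaled by the control group's standard deviation. These are standard textbook definitions rather than the procedures of any particular review; the function names and all data are hypothetical, and the PND version shown assumes the target behavior is expected to increase.

```python
import statistics


def pnd(baseline, intervention):
    """Percentage of intervention points exceeding the highest baseline point."""
    ceiling = max(baseline)
    above = sum(point > ceiling for point in intervention)
    return 100 * above / len(intervention)


def glass_delta(treatment, control):
    """Mean difference divided by the control group's sample standard deviation."""
    difference = statistics.mean(treatment) - statistics.mean(control)
    return difference / statistics.stdev(control)


# Hypothetical words-correct-per-minute data for one single-case participant.
pnd_value = pnd(baseline=[10, 12, 11], intervention=[13, 15, 12, 16])

# Hypothetical posttest scores for a small group comparison.
delta_value = glass_delta(treatment=[14, 16, 18], control=[10, 12, 14])
print(pnd_value, delta_value)
```

Because the two indices rest on different data structures (repeated measures within a participant versus scores across groups), values from one cannot be read on the scale of the other, which is part of what makes combining single-case and group findings difficult.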
Quality Indicators
Eligibility criteria
Eligibility codes pertained to whether reviewers specified inclusion or exclusion criteria for the primary studies. Reviewers specified the variable characteristics (e.g., interventions, dependent variables) as well as participants’ characteristics in all reviews (100%; n = 17). Eligibility information regarding targeted research designs and the time frames for publication each appeared in 65% (n = 11) of the reviews. Authors reported language restrictions in 24% (n = 4) of articles. Although 59% (n = 10) of the reviews provided eligibility criteria for research designs a priori, 94% (n = 16) listed the designs of included studies in the results.
Search procedures
Search codes pertained to the transparency of article identification procedures. Authors reported information regarding each of the following search procedures: databases (100%, n = 17), hand searches (82%, n = 14), reviews of citations (94%, n = 16), reference lists of previous reviews (65%, n = 11), and consultation with other authors (6%; n = 1). Authors also reported reviewing titles and abstracts of articles in 65% (n = 11), searcher qualifications in 35% (n = 6), and search agreement in 24% (n = 4) of the reviews.
Retrieval procedures
This category of codes pertained to reports of the article retrieval process. Information regarding total citations returned from the electronic database search was reported in 71% (n = 12) of the reviews, whereas authors in 65% (n = 11) of reviews reported the total number of studies screened out from the database search. Fifty-nine percent of reviews (n = 10) featured information regarding the total number of articles retrieved, whereas 47% (n = 8) presented the total number of articles excluded.
Systematic screening
Screening codes described the selection of articles meeting inclusion criteria. The majority of reviews (n = 15; 88%) provided the total number of studies included in the review. Information regarding coder training and expertise was reported in 41% (n = 7) of the reviews, whereas reliability and disagreement resolution procedures were each reported in 41% (n = 7) of the reviews.
Coding scheme procedures
These indicators pertained to procedures authors used to extract information from primary sources. Authors in fewer than half of the reviews (47%; n = 8) reported coder expertise, whereas 12% (n = 2) reported coder training. Over half of the reviews (53%; n = 9) included information regarding the proportion of studies that were double coded, whereas 65% (n = 11) of the reviews reported reliability procedures and disagreement resolution methods. Only 12% of the reviews (n = 2) featured information pertaining to response categories used in coding, whereas none of the reviews included information concerning how authors addressed missing information.
Coding scheme content
Coding scheme content indicators provided information regarding the data extracted from the primary studies. Examples of such information include participant characteristics, key variable features, and methodological quality. Authors in 82% of the reviews (n = 14) provided information regarding participant characteristics and key variable features. Two reviews (12%) assessed the methodological quality of studies.
Data analysis plan
This set of codes pertained to whether reviewers reported a plan related to aggregating and interpreting results. Fewer than half of the reviews (35%; n = 6) specified a plan detailing how data were to be analyzed. In addition, only 24% (n = 4) of the reviews reported information regarding the aggregation of the coded variables across studies.
Discussion
This article assessed the methodology of systematic reviews concerning reading interventions for students with EBD. Systematic reviews and meta-analyses are important tools for identifying EBPs and disseminating study findings throughout the field. Thus, procedural characteristics of reviews have potential implications for special education research and practice. Reviews exhibited heterogeneity in terms of students eligible for the EBD designation and age groups targeted. The impact of direct instruction on reading featured prominently; however, many reviews (a) focused on the application of specific interventions to a wide range of skills or (b) incorporated dependent measures corresponding with disparate reading variables. That reviews often did not attempt a formal synthesis of outcomes via meta-analysis accentuates difficulties with drawing conclusions regarding intervention efficacy. Although inclusive in terms of study design, reviews relied almost exclusively on peer-reviewed material, indicating that conclusions may be subject to publication bias (Cook & Therrien, 2017). Instances in which authors applied quality indicators resulted in the elimination of multiple studies, suggesting quality considerations may mitigate positive quantified or narrative outcomes reported elsewhere in the literature. Application of quality indicators developed by Talbott et al. (2018) suggests a tendency toward transparency, though omissions potentially diminish the confidence that consumers may place in the reviews.
Consumers may have difficulty linking guidance featured in reviews to specific students given uncertainty regarding the EBD population (Lloyd et al., 2019). Advocates insist students identified as exhibiting defiance, conduct disorder, or other mental health issues currently not encompassed by IDEA represent underserved populations (Kauffman et al., 2009). Reviews reflecting this assumption or that otherwise include children reported to exhibit problem behaviors (i.e., at risk) would fundamentally address different populations than reviews hewing to a narrow conception of EBD. The increasing use of more precise identification methods (e.g., Kilgus et al., 2013) in primary studies will likely allow future reviewers to be more discriminating when selecting studies on the basis of population. Currently, the lack of consistency in what constitutes EBD represents a clear threat to external validity.
The reviews identified in the current study differ in many key respects from reviews appearing in special education more generally (King et al., 2020). Whereas approximately 20% of intervention-focused systematic reviews in special education (n = 505) include some form of meta-analysis, only a single article in the current review explicitly applied such procedures. Similarly, 34.3% of intervention reviews in special education applied quality indicators, compared with only two reviews in the current sample (11.8%). This omission appears to be considerable given the large number of reading studies that, when assessed, do not meet quality standards (e.g., McKenna et al., 2017).
Discretion may be warranted in noting the absence of methodological details in studies published prior to the dissemination of evidence standards, however (e.g., Horner et al., 2005). The absence of information now considered essential to study quality (e.g., training of interventionists) from published reports does not necessarily suggest that authors in the past failed to account for such factors in conducting an experiment. Similarly, the emergence of quality indicators in special education (e.g., Horner et al., 2005) after or at approximately the same time as 53% (n = 9) of identified reviews likely explains the infrequent use of quality assessment. Of the reviews published within 5 years prior to the search (i.e., 2013–2018), 50% included quality assessment procedures. This proportion is consistent with trends in special education and suggests quality assessment may appear more frequently in future research syntheses (King et al., 2020).
In contrast, reviews tended to rely on peer-reviewed articles despite awareness of publication bias within the field prior to our search range (e.g., Fuchs & Fuchs, 1986). This suggests that, on average, reviews concerning reading instruction for children with EBD are more problematic than those for other populations and further indicates this area may benefit from renewed attention. Moreover, whether omissions of methodological details result from journal page limitations or differences in evidentiary standards over time, consumers often have no choice but to evaluate research based on the information available in a published manuscript. Despite consistently positive findings, we advise consumers to exercise caution when consulting reviews of the reading research for this population.
Application of quality indicators for literature reviews revealed considerable variation in the information reported by authors of systematic reviews and meta-analyses. Reviewers often reported details related to important methodological areas (e.g., inclusion criteria for primary studies) and described procedures followed in conducting searches. The explication of eligibility criteria in most reviews is encouraging, in that consumers could identify studies within the purview of the review (Talbott et al., 2018). The ostensible triangulation of literature search methods (i.e., database search, hand search, ancestral search) further attests to the strength of reviews involving reading instruction for children with EBD (Booth, 2010).
Notwithstanding these qualities, several other areas remained opaque. Less frequently reported details included searcher qualifications and agreement on inclusion criteria. These details are critical for precisely the same reasons that similar information is included in primary studies; specifically, they contribute to the credibility of measurement procedures used in obtaining findings (Chalmers et al., 2002). Reporting of training procedures for individuals who coded the articles for data collection was also largely absent, which casts doubt on the reliability, validity, and replicability of the data collection process. Finally, authors provided limited information concerning analysis as well as methods to address missing data. Although often associated with meta-analyses, explicit information regarding data analysis can be key in interpreting outcomes for systematic reviews (Talbott et al., 2018).
Limitations
This review focused on the methodology of literature reviews involving reading interventions for children with emotional disturbance. Despite a history of similar work across multiple disciplines (e.g., Liberati et al., 2009; Mackay et al., 2003), all of the identified reviews were published prior to the quality standards for literature reviews in special education (e.g., Talbott et al., 2018). It is therefore not surprising that none of the reviews met contemporary standards. We would note, however, that the purpose of applying quality standards is not to condemn previous scholarship. Rather, quality assessment occurs to determine the presence or absence of objective features related to the veracity of findings. Issues such as publication bias were identified as threats to literature syntheses prior to the creation of quality indicators, and would represent a threat to validity regardless of whether formal quality standards for literature reviews were developed (Cook, 2014). The age of targeted studies, relative to quality standards, is therefore of secondary importance given that the quality assessment process exists to educate consumers about the limitations of reported research findings. We believe studies examining overlapping reviews will facilitate the publication of syntheses that build upon the accomplishments of previous scholarship and improve guidance received by consumers.
Although we reported outcomes of the reviews, our primary purpose was to examine the procedural variables with the potential to bias outcomes. The present study therefore has little direct utility for practitioners. This review may nonetheless permit more critical consumption of resources that attempt to disseminate effective instructional practices. The search described in this review, though intensive, may not have identified all sources relevant to the questions of interest. Exclusion criteria eliminated reviews concerning reading that did not explicitly concern students with EBD but which may have had bearing on instruction for this population. The explicit emphasis on reviews pertaining to EBD is nonetheless justified given (a) our questions of interest and (b) the breadth of material related to reading difficulties more generally. We also restricted our search to peer-reviewed, English-language publications, which may have biased our findings (King et al., 2020). As this review represents a transparent, first attempt at assessing literature syntheses in the area of reading education for students with EBD, we nonetheless believe the work contributes to the literature. We encourage future scholars to address the shortcomings of this study in subsequent work.
Future Directions
The current study revealed a number of issues with literature reviews concerning reading instruction for students with EBD. Due to the wide range of interventions, dependent variables, and other factors across reviews, it was not possible to examine the influence of individual methodological elements on outcomes. This presents an opportunity for future researchers to analyze primary studies in a manner more consistent with the current standards of review (e.g., Talbott et al., 2018) while simultaneously examining the influence of distinct procedural variations on outcomes (e.g., Egger et al., 1997). Given our findings, the analysis could concern a single dimension of literacy (e.g., fluency) and compare results based on (a) the definition of EBD used in selecting studies, (b) the inclusion of dissertations, (c) the application of different quality standards (e.g., Cook et al., 2015; Kratochwill et al., 2010), and (d) the use of different effect sizes. Such work would provide updated guidance regarding students with EBD while providing a demonstration of factors conceptually acknowledged as critical to the conduct of literature reviews (King et al., 2020).
As with other areas in special education, the state of reviews invites attempts to conduct additional research syntheses of the reading research involving students with EBD. We have a number of recommendations for researchers hoping to contribute to this area. Reviewers should consider limiting their focus to one or two operationally defined literacy skills to avoid aggregating research that, although ostensibly related to reading, addresses fundamentally different skills (Sharpe, 1997). Authors should also review dissertations related to this line of research to mitigate the threat of publication bias (Cook & Therrien, 2017). The historic skepticism regarding the quality of dissertations may be unfounded due to the well-documented limitations of peer review (e.g., unreliability, susceptibility to fraud; Cook, 2014) and the potential to control for quality through the post hoc application of quality indicators (Petticrew, 2015). Including dissertations is ultimately important because (a) dissertation authors generally report smaller effects than those appearing in published research (Ferguson & Brannick, 2012), (b) disclosure of procedures is unhindered by factors such as page limitations, (c) results are generally subject to review by academic experts, and (d) the questions and methodology related to reading interventions for children with EBD are conducive to the low-cost, small-scale research often pursued by doctoral students (Hartling et al., 2017).
Leaving aside the importance of including dissertations, we do not believe authors should necessarily attempt to retrieve unindexed gray literature, as partial inclusion of this work may bias results (Ferguson & Brannick, 2012). In addition, authors should not feel compelled to conduct meta-analyses, as this procedure should only be performed when warranted by the available studies. However, we do recommend calculating effect sizes. Given the prominence of both single-case and group designs in this body of literature, between-case effect sizes, which permit the synthesis of the disparate designs, may be an appropriate addition to more typical indices (Shadish et al., 2015). Finally, we recommend adhering to guidelines for high-quality literature reviews during the planning, executing, and reporting stages (Talbott et al., 2018).
Given their importance in identifying effective special education practices, conducting systematic reviews and meta-analyses in accordance with quality standards provides a critical service to both researchers and practitioners. We found reviews concerning reading interventions for students with EBD exhibited several strengths (e.g., triangulation of literature search methods). The reviews identified in the current study, therefore, do contribute significantly to inventories of effective reading interventions for students with EBD. Lapses in reports regarding methodological details in some reviews and the absence of quality assessment procedures for primary studies represent notable limitations of the research, however. Future reviewers have an opportunity to build upon these earlier efforts through a commitment to quality and transparency in reviews and meta-analyses.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
