Abstract
More than two-thirds of middle school students do not read proficiently. Research has shown that targeted interventions using explicit instruction methods can improve reading outcomes for struggling readers. A central feature of explicit instruction is the systematic implementation of instructional interactions, but it is not clear what specific instructional interaction practices lead to stronger outcomes for middle school readers. This study used a regression discontinuity design to compare the frequency and impact of instructional interactions experienced by U.S. eighth-grade students who received a targeted reading intervention (n = 1,461) with those who did not (n = 4,292). Results indicated that students who received intervention experienced far more instructional interactions with their teachers than did students who did not. However, the association between rates of interaction and student need in the intervention group was minimal, and the relationship between the rate of instructional interactions and reading growth was mixed. Implications for intervening with struggling students in the middle grades are discussed.
On the latest National Assessment of Educational Progress (NAEP) report, 69% of eighth-grade students lacked proficient reading skills (i.e., the skills necessary to independently read and understand grade-level texts), and 30% did not have even basic reading skills (U.S. Department of Education et al., 2022), a 3% decrease in the percentage of students in both the proficient and basic skill categories compared with the previous NAEP assessment (i.e., 2019, prior to the onset of the COVID-19 pandemic). This long-standing pattern has worsened, despite decades of research that has firmly established what content students need to be taught to read proficiently (Foorman et al., 2016; Kamil et al., 2008; National Reading Panel, 2001; Snow et al., 1998; Vaughn et al., 2022). Furthermore, there is broad consensus that explicit instruction represents the most effective and efficient way to teach those skills, especially for students with reading difficulties (Kamil et al., 2008; Vaughn et al., 2022). However, research on interventions delivered in authentic school settings suggests that these best practices may not be widely implemented in schools (Boulay et al., 2015; McKenna et al., 2015, 2021; Vaughn et al., 2002), providing one possible explanation for consistently poor NAEP results and highlighting the need for more research on the reading instruction schools provide.
Explicit instruction provides structure and organization to the interactions between teachers and students as the latter learn and master new skills (Carnine et al., 2004). A defining characteristic of explicit instruction is the frequency and quality of those intentional and direct interactions (Vaughn et al., 2011, 2013). These instructional interactions include demonstrations, directed questions, student responses, and corrective or confirmatory feedback. As described by Carnine et al., teachers who engage in high-quality instructional interactions explain and clearly demonstrate the skills and strategies students must learn and closely guide students in the initial application of those skills. As students develop proficiency, they apply what they learn in a variety of carefully sequenced contexts to deepen their understanding and proficiency. Throughout this process, teachers provide feedback to students on the accuracy, depth, and quality of their efforts and give students additional explicit instruction and practice in areas that are difficult for them. Students who continue to struggle receive additional instruction and support, oftentimes in the context of formal interventions.
Traditionally, factors such as group size, intervention duration, and intervention frequency have been used to increase the intensity of instruction and support struggling students (Fuchs et al., 2008; Harn et al., 2008; Simmons et al., 2008). Researchers have also begun to investigate the impact of variables that more directly reflect the quantity and quality of instructional interactions during explicit instruction, such as counts of teacher models and student response opportunities (e.g., Doabler et al., 2015; Nelson-Walker et al., 2013). To date, these detailed aspects of instructional interactions have been investigated at the elementary school level but have not, to our knowledge, been studied with older students.
Thus, the purpose of this study was to examine instructional interactions during reading instruction in middle school. We were primarily interested in the impact of these interactions on students with reading difficulties, so our efforts targeted the two middle school instructional contexts in which struggling readers are likely to receive some degree of reading or literacy instruction: targeted reading intervention classes and English language arts classes (ELA; Kamil et al., 2008). ELA classes served as the counterfactual to the reading interventions, representing the reading instruction that struggling readers would likely receive in the absence of intervention. Our objectives were to compare instructional interactions for students who received a reading intervention with those of comparable students who did not and, for students with reading difficulties, to evaluate the extent to which the frequency and quality of instructional interactions were associated with their instructional need and growth in reading.
Middle School Reading Interventions in Controlled Studies
Quantitative syntheses of research studies have shown that targeted interventions in middle school can have a positive impact on the reading proficiency of adolescent struggling readers. For example, Scammacca et al. (2015) conducted a meta-analysis of 67 studies of interventions in Grades 4 to 12 published since 1980 and reported an average effect size of 0.49 (Hedges’ g), with positive effects on all types of measures (i.e., reading comprehension, vocabulary, and word study). For middle schools specifically, the overall effect size was 0.57 (k = 30). Other published reviews also support the use of reading interventions with adolescent struggling readers (e.g., Edmonds et al., 2009; Slavin et al., 2008; Wanzek et al., 2013), with several suggesting that effects are most pronounced for those interventions that target essential content areas (e.g., phonological awareness, phonics, fluency, reading comprehension, and vocabulary) and use explicit instruction to teach that content (e.g., Biancarosa & Snow, 2006; Kamil et al., 2008; National Association of State Directors of Special Education, 2007; Vaughn et al., 2022).
In addition, there is growing consensus that if schools were more systematic in their use of explicit instruction, “levels of adolescent literacy would improve” (Biancarosa & Snow, 2006, p. 1), and when examples of quality instruction are provided in these reports, the importance of high-quality interactions between teachers and students is apparent. For example, the 2008 IES Practice Guide on improving adolescent literacy recommended that schools implement five evidence-based instructional practices, three of which focused on the need for explicit instruction when teaching reading comprehension and vocabulary, and when providing interventions to struggling readers (Kamil et al., 2008). Components of explicit instruction are also strongly recommended in influential reports such as Academic Literacy Instruction for Adolescents (Torgesen et al., 2007) and Reading Next (Biancarosa & Snow, 2006). A recent IES Practice Guide on reading interventions for struggling readers in Grades 4 to 9 (Vaughn et al., 2022) further emphasized the importance of targeted instruction and the use of instructional interactions in explicit instruction. Based on an analysis of 38 studies that met What Works Clearinghouse standards for quality (What Works Clearinghouse, 2022), the authors recommended that interventions to improve reading performance address students’ particular content needs, highlighting three specific areas: decoding, fluency, and comprehension. Importantly, the guide provides extensive examples from all three content areas showing how instructional interactions are an essential component of explicit instruction. For instance, in one example focused on teaching comprehension strategies, the teacher and students engage in 19 verbal interaction exchanges to figure out the meaning of the word “remote” in a text passage.
These conclusions are consistent with research on the effects of instructional interactions in early literacy and math, which found positive relationships between the frequency of explicit instructional interactions and student outcomes (Doabler et al., 2015; Nelson-Walker et al., 2013).
Authentic Setting Considerations
Intervention Studies
Troublingly, research has shown that middle school reading interventions found to be effective in controlled studies may not have the same impact when implemented in more authentic school contexts. Scammacca et al. (2015) foreshadowed this possibility in their meta-analysis of intervention effects in Grades 4 to 12. Effect sizes were systematically larger when research staff implemented interventions than when practicing teachers did. The average effect size associated with researcher implementation was 0.68, whereas for teacher implementation, it was 0.35. More ominously, in the Striving Readers grant program, Boulay et al. (2015) summarized the effects of 17 randomized controlled trials (RCTs) in which 16 school districts were paired with an external evaluator to study the effects of district-selected and implemented reading interventions. Only three interventions had positive or potentially positive effects on literacy outcomes, and these effects were small (ESs < .21; Boulay et al., 2015).
Research by Slavin et al. (2008) may shed light on why so many of the programs in the Striving Readers evaluation had nonsignificant effects. Based on an analysis of 33 adolescent reading studies, Slavin et al. (2008) found that “programs designed to change daily teacher practices had substantially greater research support than those focused on curriculum or technology alone” (p. 290). In other words, interventions that focused on what teachers and students did during instruction, including their interactions, had a stronger impact than interventions that implemented a particular curriculum.
Observation Studies
Several observation studies in elementary and middle school settings also provide evidence that reading instruction and intervention practices in authentic school settings are often less explicit compared with research settings (McKenna et al., 2015, 2021; Vaughn et al., 2002), and that this disparity may be especially problematic for struggling readers. At the elementary school level, Vaughn et al. conducted an extensive summary of observation research targeting reading instruction and intervention for students with learning disabilities (LD) and emotional behavior disorders (EBD). Across 16 studies, the authors found that although substantial time was allocated to reading instruction, the quality of that instruction was low, and specifically noted the lack of interactions between teachers and students. Instead, students spent a great deal of time on their own, waiting, in transition from one activity to another, and engaged in independent seatwork.
More recently, McKenna et al. (2015) summarized the results of eight elementary and three middle school observation studies focused on the degree to which instruction for students with LD was aligned with research-supported practices. In reading, findings were presented in relation to reading content (e.g., phonics, vocabulary, and comprehension) and instructional approaches (e.g., individualized instruction, whole group instruction, and explicit instruction). The authors noted some instructional strengths in terms of vocabulary and fluency instruction but also found that students had minimal opportunities to read connected text, received inadequate phonics instruction, and spent relatively little time on reading comprehension instruction.
McKenna et al. (2021) extended this work by synthesizing an additional 11 observation studies conducted between 1984 and 2019 focused on reading instruction provided to students with or at risk for EBD. Although grade levels were not reported, students’ ages suggest that nearly all of them were in elementary school. Each study included instruction in at least one essential reading content area (e.g., phonological awareness, phonics, fluency, comprehension, and vocabulary) but the studies varied significantly in the amount of time devoted to specific areas. The authors suggested no conclusions could be drawn regarding the extent to which students with EBD have access to effective, explicit reading instruction.
At the middle school level, Ciullo et al. (2016) observed reading instruction in three schools to investigate the extent to which observed instructional practices for students with learning disabilities aligned with researcher recommendations. The most frequently occurring categories did target essential content, although in some lessons that content was not taught at all. Frequency counts were used to measure specific aspects of explicit instructional practices, including instructional interactions. The most frequently occurring practices were checks for student understanding, feedback to students, and student practice opportunities.
The Effects of Instructional Interactions on Student Outcomes
Data on the impact of explicit instruction on student outcomes are not common in middle school settings, but several early grade studies provide some guidance regarding the potential effects of targeted instructional interactions. For example, Connor et al. (2009) found that in the early elementary grades, students taught by teachers who used more explicit teaching methods aligned to students’ needs had greater gains than did students taught by teachers who used less explicit methods. Smolkowski and Gunn (2012) and Nelson-Walker et al. (2013) systematically examined the association between the quantity and quality of instructional interactions and student outcomes using an observation protocol to quantify four components of instructional interactions during reading instruction: (a) teacher demonstrations, (b) teacher feedback, (c) student practice, and (d) student errors. In both studies, higher rates of student practice were associated with gains on reading outcomes. Furthermore, the analyses reported by Nelson-Walker et al. were conducted as part of a larger intervention study in which positive treatment effects were observed on a range of reading measures (Smith et al., 2016). Differences by condition regarding the frequency of student practice opportunities were large (Hedges’ g = 1.63), and ratings of explicit instruction quality were higher in treatment classrooms (Hedges’ g = 1.31).
In summary, prior research on adolescent literacy indicates that interventions for struggling readers that use explicit approaches to instruction can effectively increase reading outcomes. However, when implemented in broader, more authentic school contexts—when practicing teachers are providing the interventions—the effects are generally more muted, and observation research suggests two likely explanations for those differences. First, the content of reading interventions that are delivered in authentic contexts may vary substantially from that delivered in controlled studies. Second, the use of explicit instruction and high-quality instructional interactions may be less robust. These findings highlight the need for more research on the quality and impact of instructional interactions on student outcomes in middle school.
Purpose of the Study
As detailed above, research on the impact of instructional interactions between middle school teachers and students is limited. Thus, the purpose of this study was to systematically examine the types and frequency of instructional interactions between eighth-grade teachers and students, including students with reading difficulties, and investigate the extent to which differences in the frequency of these interactions were associated with reading outcomes. This research extends an earlier study (the Middle School Intervention Project, MSIP) that examined the impact of school-delivered reading interventions on the reading skills of eighth-grade students compared with similarly performing students who did not receive the intervention (Fien et al., 2018). Students were assigned to either the intervention or comparison group based on a cut score, and regression discontinuity analyses showed that the average impact of the interventions on student reading scores was not statistically significant.
We used the same sample of students to investigate three related questions regarding instructional interactions. First, we examined whether the quantity and quality of instructional interactions experienced by students who received a reading intervention differed from those for similarly performing students who did not receive intervention. We hypothesized that students who received a reading intervention would experience more instructional interactions than students who did not. Second, we investigated whether the quantity and quality of instructional interactions varied as a function of level of students’ reading risk. We hypothesized that intervention students with more substantial reading difficulties would experience more interactions than intervention students with less substantial reading difficulties. Third, we examined the extent to which the frequency and quality of interactions were associated with increases in reading achievement for students who received a reading intervention.
Method
Participants
Participants were eighth-grade teachers and students from six school districts in the U.S. Pacific Northwest, located in or adjacent to one of two large metropolitan areas. Districts served between 5,659 and 39,941 students (National Center for Education Statistics [NCES], 2015) and were recruited to participate because they had previously established interventions to improve reading outcomes for struggling readers in middle school. Of the 41 schools that served eighth-grade students, 25 (61%) participated in the study. Schools were not eligible to participate if they did not offer a traditional, comprehensive middle school curriculum (n = 10) or if fewer than 10% of students qualified for the intervention (n = 6). Participating schools each served between 339 and 1,049 students. Across the 25 schools, 188 classes provided targeted supplemental reading support for struggling readers in eighth grade, and 102 classes provided English language arts (ELA) instruction.
All students with a valid score on one of the two assessments used for condition assignment were included in the study (N = 5,753). This sample was comparable with both state and national averages on a range of demographic variables. For example, 22% of students in the study were Hispanic, and 17% represented other minority groups. This compares to 22% and 25% Hispanic, and 13% and 25% Other minority, at the state and national levels, respectively. Similarly, 59% of the sample was eligible for free or reduced-price lunch, compared with 54% and 52% eligibility at the state and national levels, respectively. Additional details regarding participants and methods are reported in Fien et al. (2018).
Measures of Reading Proficiency
Participating schools administered two standardized measures of reading proficiency: (a) the Oregon Assessment of Knowledge and Skills, Reading/Literature subtest (OAKS-R, Oregon Department of Education [ODE], 2012); and (b) easyCBM Passage Reading Fluency, a measure of oral reading fluency (Alonzo et al., 2006). Students were tested at the end of seventh grade to determine condition assignment, and again in eighth grade to assess reading proficiency.
Oregon Assessment of Knowledge and Skills Reading/Literature (OAKS-Reading)
At the time of the study, the OAKS-R was the state test of student reading proficiency. The OAKS-R was an untimed, multiple-choice, computer adaptive test of comprehension (80% of items), and vocabulary knowledge (20% of items; ODE, 2012). Reliability was high across a broad range of student abilities (ODE, 2007). Student performance was reported as an Item Response Theory Rasch-scaled score centered on 200 with a standard deviation of 10.
easyCBM Passage Reading Fluency
easyCBM Passage Reading Fluency (PRF; Alonzo et al., 2006) is a standardized, individually administered measure of oral reading fluency in which students read a passage aloud for 1 minute. The number of words read correctly in 1 minute was used for analysis. Alonzo and Tindal (2008) reported that the average correlation between a reference PRF passage and 19 other seventh-grade passages was .89. PRF was moderately predictive of performance on the OAKS-R (r = .68) and accounted for 15% of variance in OAKS-R performance (Anderson et al., 2011).
Measures of Reading Instruction and Intervention
To estimate the reading instruction students received over the year, project and school staff documented the following for each reading intervention and ELA class: (a) days per week; (b) average lesson duration; and (c) frequency of use of published programs. Project staff also documented student scheduling changes throughout the year and collected the average attendance rate for each student. This information was then combined at the individual student level to generate an estimate of reading instruction and intervention dosage for each student.
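The article does not give the formula used to combine these inputs. As a minimal sketch only, assuming that dosage is expressed in minutes and that scheduled minutes are scaled by each student's attendance rate, the combination might look like the following. The function names, the enrollment-segment structure, and the numbers in the usage note are hypothetical and are not taken from the study.

```python
def segment_minutes(days_per_week, minutes_per_lesson, weeks_enrolled):
    """Scheduled minutes for one enrollment segment (hypothetical definition)."""
    return days_per_week * minutes_per_lesson * weeks_enrolled


def estimated_dosage_minutes(segments, attendance_rate):
    """Estimate delivered instructional minutes for one student.

    segments: iterable of (days_per_week, minutes_per_lesson, weeks_enrolled)
        tuples, one per schedule the student followed during the year, so
        that mid-year scheduling changes are represented as extra segments.
    attendance_rate: proportion of scheduled days attended, between 0 and 1.
    """
    if not 0.0 <= attendance_rate <= 1.0:
        raise ValueError("attendance_rate must be between 0 and 1")
    scheduled = sum(segment_minutes(*segment) for segment in segments)
    # Assumption: attendance scales scheduled minutes proportionally.
    return scheduled * attendance_rate
```

For a hypothetical student who spent 10 weeks in a class meeting 5 days per week and then 10 weeks in a class meeting 4 days per week, both for 50 minutes, `estimated_dosage_minutes([(5, 50, 10), (4, 50, 10)], 0.5)` scales the 4,500 scheduled minutes by the attendance rate.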
MSIP Classroom Observation Tool
Trained research staff conducted direct observations of each reading intervention and ELA class an average of three times during the year, for a total of 934 observations. To ensure that observations collected information about teacher–student interactions, content coverage, and quality ratings tailored to middle school instruction, observers used the project-developed Middle School Intervention Project (MSIP) Classroom Observation Tool (MSIP-COT; Fien et al., 2018), which was based on a critical analysis of other published direct observation instruments. Observations lasted the entirety of instruction, which, on average, was 58 minutes.
Structure of the MSIP-COT
The MSIP-COT has three main components. The first measures the frequency of teacher–student and student–student interactions during instruction. This component was derived from the COSTI (Smolkowski & Gunn, 2012), a low-inference instrument field-tested and validated using more than 1,000 observations of elementary reading and mathematics classrooms. Previously reported predictive validity coefficients with reading and mathematics outcomes range from .25 to .55 (Smolkowski & Gunn, 2012).
The MSIP-COT was adapted from the COSTI by including additional interaction types not typically observed in elementary classrooms. It is used to document, in real time, frequencies of seven types of content-relevant, explicit instructional interactions. Three interactions center on teacher behaviors: overt demonstrations and explanations, questions directed at one or more students, and corrective or confirmatory feedback. The other four interactions focus on student behaviors: an individual student who responds to a teacher prompt; two or more students who respond together to a teacher prompt; a student response to a peer not monitored by the teacher; and a student-initiated question or comment acknowledged by the teacher. An overall interaction composite is calculated by summing the frequency of these seven interaction types.
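The composite described above is a simple sum over the seven interaction types. The sketch below illustrates that tally in Python; the short code labels are hypothetical stand-ins for the MSIP-COT categories, and the per-minute rate is an assumed normalization that the text above does not specify.

```python
from collections import Counter

# The seven interaction types described above; these short labels are
# hypothetical and do not come from the MSIP-COT itself.
INTERACTION_TYPES = (
    "teacher_demonstration",   # overt demonstrations and explanations
    "teacher_question",        # questions directed at one or more students
    "teacher_feedback",        # corrective or confirmatory feedback
    "individual_response",     # one student responds to a teacher prompt
    "group_response",          # two or more students respond together
    "peer_response",           # response to a peer, not teacher monitored
    "student_initiated",       # student question or comment acknowledged
)


def interaction_composite(events):
    """Return (per-type counts, overall composite) for one observation."""
    counts = Counter(events)
    unknown = set(counts) - set(INTERACTION_TYPES)
    if unknown:
        raise ValueError(f"unrecognized interaction codes: {sorted(unknown)}")
    per_type = {name: counts.get(name, 0) for name in INTERACTION_TYPES}
    return per_type, sum(per_type.values())


def rate_per_minute(composite, minutes_observed):
    """Assumed normalization: interactions per minute of observed instruction."""
    if minutes_observed <= 0:
        raise ValueError("minutes_observed must be positive")
    return composite / minutes_observed
```

Given a list of coded events from one lesson, `interaction_composite` returns the count for each type along with their sum, and `rate_per_minute` makes observations of different lengths comparable.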
The second MSIP-COT component documents the content domains in which reading instruction takes place: reading words, reading connected text, reading comprehension, writing, vocabulary, other literacy activities (e.g., spelling), and other nonliteracy activities (e.g., classroom management). This section also identifies time spent in each of four grouping structures: independent work, small group work (i.e., 2–7 students) without teacher instruction, small group work that is teacher led, and large group instruction (i.e., eight or more students). Content domains were based on Kamil et al. (2008), and grouping structures were developed by project staff based on a review of widely used middle school reading interventions.
The third MSIP-COT component is a moderate-inference measure of the quality of instruction. Modeled after other broad measures of instructional quality, including the Classroom Assessment Scoring System (CLASS; Pianta & Hamre, 2009) and the Ratings of Classroom Management and Instructional Supports (RCMIS; Doabler & Nelson-Walker, 2009), this component contains 11 items rated on a 4-point scale. A rating of 1 indicates the item was never present, and a rating of 4 indicates the item was highly present. Items were adapted from the RCMIS by project staff to align with middle school instruction. Features addressed include classroom climate, classroom organization, classroom management, student engagement, instructional delivery, and teaching for reading proficiency. Observers completed their ratings at the conclusion of the observation period.
Observation Training
Observers conducted observations over 2 years, when the study sample was in seventh grade (Year 1) and in eighth grade (Year 2). Training and reliability data are reported for Year 1; the same procedures were used in Year 2. Classroom observations were conducted by 23 trained observers, including two project staff, four district employees, and 17 trained data collectors. Observation training occurred prior to each of the three observation rounds. In the fall, observers received 12 hours of training over two consecutive days. An additional 6 hours of refresher training were provided prior to each of the winter and spring observations. Each training involved review of the observation instrument, coding practice using video clips, and reliability documentation.
Interobserver Reliability
At each of the three observation points, reliability estimates were calculated for initial training (i.e., checkout), during an initial field observation, and overall maintenance checks. The maintenance checks represented 20% of all classroom observations. Across the nine reliability opportunities (fall, winter, spring by three types of reliability checks), reliability estimates for teacher–student interactions ranged from 89.61% to 99.59%, with an average of 92.98%. On the 11 classroom quality indicators, total score reliability ranged from 91.71% to 100%, with an average of 95.28%.
Reading Instruction and Intervention
Participating schools had supplemental reading intervention classes in place prior to the start of the study, the purpose of which was to improve the reading achievement of struggling readers. As a condition of participation, schools agreed to provide these reading interventions only to those students who scored below the reading cut point determined at the end of seventh grade. Districts and schools made all intervention decisions (i.e., the research team was not involved in the decision-making process), and interventions varied widely with respect to intervention dosage, curricula used, and ratio of students to teachers.
We observed reading instruction in both reading intervention and ELA classes. Our rationale for observing ELA instruction was that outside of reading intervention classes, ELA classes were the most likely setting in which struggling readers would receive some level of reading support (Kamil et al., 2008). Thus, ELA instruction served as our best estimate of the reading support students would receive in the absence of a reading intervention, although we did not expect instruction in ELA classes to be as explicit as the instruction provided in reading intervention classes. On average, reading intervention classes met slightly more than 4 days per week (M = 4.3, SD = .82). In three districts, all reading intervention classes met 5 days per week. In the remaining three districts, reading intervention classes met, on average, 4.97 (SD = .18), 4.83 (SD = .48), and 4.07 days per week (SD = 1.42). The average reading intervention session lasted 58 minutes (SD = 21), with district averages ranging from 47 minutes (SD = 8) to 70 minutes (SD = 27).
Two of the six districts reported using a published program in every reading intervention class and three others reported using a published program in between 73% and 92% of intervention classes. In the sixth district, only 42% of reading interventions reported using a published program. Across schools, the most frequently reported intervention programs were teacher-created materials (30.1% of classes), Language! (20.0% of classes), Step Up to Writing (9.4%), and Read 180 (8.9%). Across districts, reading intervention classes had an average student to teacher ratio of 11.5 to 1 (SD = 5.79). In contrast, ELA classes had an average student to teacher ratio of 28.2 to 1 (SD = 6.43).
Assignment of Students to Condition
A standardized cut score based on the OAKS and PRF measures was used to assign students to condition. Prior to the start of intervention, project staff provided a rank-ordered list of student scores to each school. Each school then selected the cut point their school would use to assign students to condition, usually determined by the number of students they could serve in intervention classes. Students with scores at or below this point were assigned to the intervention condition, and students with scores above it were assigned to the comparison condition. The use of separate cut points for each school was necessary because a single cut point for all schools would have required some schools to serve a very different proportion of students than they would normally. To account for this, we centered cut scores within each school so that, for analysis and interpretation purposes, all schools had a cut point of zero. Comparable with the use of school mean centering in multilevel modeling (Raudenbush & Bryk, 2002), we included the school cut point value in all models as an additional school-level predictor.
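The centering and assignment rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code: the function name, the data layout, and the example scores are hypothetical.

```python
def center_and_assign(scores, school_cut):
    """Center assignment scores on a school's cut point and assign condition.

    scores: dict mapping student id -> standardized assignment score
        for students in one school (hypothetical layout).
    school_cut: the cut point that school selected.

    Returns a dict mapping student id -> (centered score, condition).
    After centering, every school's cut point sits at zero, so scores at
    or below zero fall in the intervention condition.
    """
    assigned = {}
    for student_id, score in scores.items():
        centered = score - school_cut
        condition = "intervention" if centered <= 0 else "comparison"
        assigned[student_id] = (centered, condition)
    return assigned
```

With a hypothetical school cut of -0.5, a student scoring exactly -0.5 has a centered score of 0 and is assigned to intervention, whereas a student scoring 0.25 has a centered score of 0.75 and is assigned to comparison. The uncentered `school_cut` value would be retained separately as the school-level predictor mentioned above.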
Analyses
In addition to descriptive and visual examination of the data, we conducted a series of regression discontinuity (RD) and linear mixed model (LMM) analyses to (a) test for differences in the composite interaction rate and quality rating variables conditional on students’ scores on the assignment variable; and (b) for students who received a reading intervention, estimate the extent to which those variables were associated with increases in reading achievement. An essential requirement of RD designs is the correct specification of functional form (Bloom, 2012). If the relationship is modeled as linear when the data are curvilinear, a gap at the cut point may be estimated when no gap exists, or vice versa. Consequently, we fit our RD analyses using generalized additive models (GAMs), which relax the assumption of linearity by replacing one or more model parameters with individually estimated smoothing functions. For smoothed terms, the functional form is determined, in part, by the data themselves (Chib & Greenberg, 2014).
We fit all GAMs in the statistical computing environment R (R Core Team, 2020) using the gamm4 package (Wood & Scheipl, 2020) and thin-plate spline smooths. The amount of smoothing was estimated via generalized cross-validation procedures (Wood, 2006), with nonlinearity estimated by the effective degrees of freedom (EDF). Higher EDF values indicate greater nonlinearity; an EDF of 1.0 indicates a linear relationship (Shadish et al., 2014). For parsimony, we re-estimated all smooths with an EDF of 1.0 using linear parameters, which increases power and reduces the likelihood of sample-specific results (James et al., 2013).
Because the data were inherently multilevel (i.e., students nested within schools), we estimated the RD model using a multilevel GAM. At the student level, the model was defined as
where yij represents the outcome (i.e., interaction composite or quality rating) for student i in school j. The LEC and AC variables were dummy-coded vectors representing whether students’ scores on the assignment variable, pre, were less than or equal to the cut (LEC), or above the cut (AC), for school j. Both variables were included as interactions with the assignment variable, allowing for the estimation of separate smooths, sp(), for students on either side of the cut point. At the school level, the model was defined as
where cutj represents the location of the cut score for school j, γ00 represents the model intercept (indicating average scores for students above the cut), γ10 represents the average regression discontinuity gap, and u0j and u1j represent school-level deviations from these averages.
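The display equations for this model are not reproduced in this version of the text. One plausible form that is consistent with the definitions above is sketched below. It is a hedged reconstruction, not the authors' typeset equations: the symbols β0j, β1j, γ01, and eij are assumed notation, and whether the cut score also predicted the school-level gap cannot be determined from the text.

```latex
% Hedged reconstruction from the verbal definitions; assumed notation:
% \beta_{0j}, \beta_{1j}, \gamma_{01}, e_{ij}.
\begin{align}
y_{ij} &= \beta_{0j} + \beta_{1j}\,\mathrm{LEC}_{ij}
        + sp(\mathrm{pre}_{ij} \times \mathrm{LEC}_{ij})
        + sp(\mathrm{pre}_{ij} \times \mathrm{AC}_{ij}) + e_{ij} \\
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,\mathrm{cut}_{j} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + u_{1j}
\end{align}
```

Under this form, the coefficient on the LEC indicator carries the discontinuity at the cut, so γ10 is the average gap across schools and u1j is a school's deviation from it.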
In a regular, or sharp, RD design, the γ10 term represents the average treatment effect. In our sample, however, 9% of students who scored above the cut point received a reading intervention, and 18% of students who scored below the cut point did not receive a reading intervention, making the design a fuzzy RD. Consequently, we first tested for a gap in the probability of receiving intervention at the cut point (Bloom, 2012) using a model equivalent to Equations (1) to (3) but with a multilevel logistic GAM in which the outcome was intervention receipt. We then divided the sharp RD estimate, γ10, by the estimated probability gap to determine the fuzzy RD point estimate. All other analyses were conducted using supplemental R packages: LMM analyses were conducted using lme4 (Bates et al., 2015), effect sizes were estimated using effectsize (Ben-Shachar et al., 2020), and figures were generated using ggplot2 (Wickham, 2016) and patchwork (Pedersen, 2020).
Results
Research Question 1: Differences Between the Intervention and Comparison Conditions
We examined instructional intensity using three sets of variables: frequency counts of interactions, observer ratings of instructional quality, and observed estimates of the proportion of class time spent on various literacy content areas. Instructional interaction and quality rating data are summarized by condition in Table 1. As hypothesized, students who received a reading intervention experienced, on average, a higher rate of instructional interactions on all variables than did their comparison peers. The average difference in the interaction composite (see Row 8, Table 1) is large (Hedges’ g = 1.11). Differences were also large on several individual variables, particularly teacher feedback (Hedges’ g = 1.08, an average of 73% more feedback per hour), teacher questions (Hedges’ g = 1.06, 65% more questions per hour), and student individual responses (Hedges’ g = 0.86, 66% more individual responses per hour).
Table 1. Descriptive Statistics by Condition and Effect Sizes for Estimated Rates of Instructional Interactions Per Hour and Quality Ratings.
Note. n = number of students. Hedges’ g represents the standardized effect of the treatment condition relative to the control condition.
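For readers unfamiliar with the effect size named in the note, the sketch below shows one common way to compute Hedges’ g from group summary statistics. It uses the standard pooled-SD definition with the usual small-sample approximation, which may differ slightly from the correction applied by the effectsize package. The means and standard deviations are hypothetical, not values from Table 1; only the group sizes come from the article.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference with a small-sample bias correction.

    Uses the pooled standard deviation and the common approximation
    J = 1 - 3 / (4 * df - 1), where df = n_t + n_c - 2.
    """
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    cohens_d = (mean_t - mean_c) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)
    return cohens_d * correction


# Hypothetical interaction rates per hour (NOT the study's Table 1 values);
# the group sizes are the ones reported in the abstract.
g = hedges_g(mean_t=100.0, sd_t=40.0, n_t=1461, mean_c=60.0, sd_c=35.0, n_c=4292)
print(f"Hedges' g = {g:.2f}")
```

With samples this large the correction factor is very close to 1, so g and Cohen’s d are nearly identical.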
In contrast, quality rating (see Row 9, Table 1) and content coverage (not shown in table form) were, on average, extremely similar across conditions. In terms of content coverage, in both conditions, students spent less than 1% of observed time on word level reading activities. Students in both conditions also spent the same percentage of instructional time on vocabulary (9%) and other literacy activities not coded under a primary category (20%). Students in both conditions also spent nearly the same amount of time on the primary coded categories of reading connected text (15% for intervention students; 13% for comparison students) and reading comprehension (27% vs. 25%), and on nonliteracy activities (11% vs. 12%). Quality of instruction ratings were also very similar across conditions (reading intervention M = 35.41; ELA M = 35.48).
The left side of Figure 1 displays fitted GAMs for the interaction composite (top left quadrant) and quality rating (bottom left quadrant) superimposed over the observed data. The quality rating model reduced to a linear fit on both sides of the cut (EDF = 1.00), whereas there was some nonlinearity on both sides of the interaction composite model (EDF below the cut = 2.16, EDF above the cut = 1.98). The quality rating model was thus refit with both slopes constrained to be linear. The mean probability of receiving an intervention was 0.73 for students who scored below the cut and 0.25 for students who scored above the cut, resulting in a statistically significant mean probability gap of 0.48 (SE = 0.05, z = 9.90, p < .001). The raw gap at the cut for the interaction composite was 38.58, which, when divided by the probability gap, resulted in a significant fuzzy estimate of 79.68 (SE = 21.52, z = 3.07, p < .001), confirming that the rate of instructional interactions was statistically greater for students near the cut who received an intervention than for students near the cut who did not. In contrast, the raw gap at the cut for quality rating was −0.001, resulting in a nonsignificant fuzzy estimate of −0.003 (SE = 0.55, z = −0.005, p = .99).
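The fuzzy RD calculation described above is a simple ratio, sketched here with the rounded values reported in the text. The function name is illustrative. Because the inputs are rounded, the results differ slightly from the published estimates (79.68 and −0.003), which were presumably computed from unrounded quantities.

```python
def fuzzy_rd_estimate(sharp_gap, probability_gap):
    """Scale the sharp RD gap by the jump in treatment probability at the cut."""
    if probability_gap == 0:
        raise ValueError("probability gap must be nonzero")
    return sharp_gap / probability_gap


# Rounded values as reported in the text.
p_below, p_above = 0.73, 0.25
probability_gap = p_below - p_above  # about 0.48

interaction_fuzzy = fuzzy_rd_estimate(38.58, probability_gap)
quality_fuzzy = fuzzy_rd_estimate(-0.001, probability_gap)

print(f"interaction composite: {interaction_fuzzy:.2f}")  # roughly 80.4 from rounded inputs
print(f"quality rating: {quality_fuzzy:.4f}")             # roughly -0.002 from rounded inputs
```

Dividing by a probability gap smaller than 1 inflates the raw gap, which reflects the fact that only some students on each side of the cut actually received the condition they were assigned.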

Figure 1. Interaction Composite and Quality Rating Scores by Instructional Need.
Research Question 2: Relation Between Student Need and Interaction Frequency
Variability in the intensity of reading instruction is depicted visually in Figure 1 for the interaction composite and quality rating scores. The left panel displays a scatter plot for each outcome, with student level estimates by condition. The right panel displays a density plot for each outcome, showing the proportion of students with each interaction rate and quality rating estimate, modeled separately by condition. Students’ scores on the centered assignment variable are plotted on the x-axis, and scores on the interaction composite and quality rating variables are plotted on the y-axis. The left side of the scatter plots shows students below the cut and the right shows students above the cut. The cloud of points on both sides of the cut in both plots illustrates the substantial variability in student level estimates of interaction rate and quality rating across the full range of instructional need, but shows no clear relationship between need and either variable.
Across conditions, the line representing the interaction composite for students who received a reading intervention is higher at every point on the x-axis than the corresponding line for students who did not, showing that intervention students experienced, on average, a higher rate of instructional interactions than did comparison students. This same pattern is apparent in the corresponding density plot, where a higher proportion of comparison students have scores at the bottom (i.e., left side) of the distribution, and a higher proportion of intervention students are at the top, including a notable proportion of intervention students who experienced higher rates of instructional interactions than did any of the comparison students. No such differences are apparent in the corresponding figure for quality rating.
Research Question 3: Association Between Interaction Frequency and Student Gains
Results from the GAMs illustrate the nested nature of the instructional interaction data, and thus, the need for a multilevel modeling approach. In the interaction composite model, school accounted for 36% of the variance, and the cut score accounted for an additional 28%. In the quality rating model, school accounted for 56% of the variance, and the cut score accounted for an additional 9%. Thus, to answer our third question, we used linear mixed model (LMM) regressions to evaluate, for intervention students only, associations among observed quality ratings, instructional interaction rates, and gains on OAKS and PRF.
Correlations for Students in the Intervention Condition
Generally, quality rating was significantly, but weakly, correlated with all instructional rate variables (range = .00–.37) but uncorrelated with reading outcomes (range = −.01–.06; see Supplemental Table S1). Similarly, instructional interaction rates were generally significantly and moderately correlated with each other (range = .03–.88), but not, or only weakly, correlated with reading outcomes (range = −.17–.09).
OAKS Gains
In the unconditional OAKS gain score model, the estimated school variance was only 0.34, whereas the estimated residual variance was 29.09, indicating minimal between-schools variability (ρ = 0.01). However, given the inherently nested nature of the data, we nevertheless estimated the effect of instructional quality and instructional interactions on OAKS gains using a mixed model approach. We found a small, nonsignificant main effect of quality rating on OAKS gains (B = 0.11, p = .06, d = 0.06); a trivial, nonsignificant main effect for the interaction composite (B = −0.01, p = .89, d = −0.004); and a trivial, nonsignificant interaction effect between the two observation variables (B = −0.01, p = .72, d = −0.01; see Supplemental Table S2). These results suggest that, as hypothesized, intervention students with higher gains were slightly more likely to have received instruction in classes with higher average quality ratings, but the differences were minimal. No such relationship was found for rates of instructional interactions.
PRF Gains
For the unconditional PRF gain score model, the estimated school variance was 12.4, whereas the estimated residual variance was 286.9, again indicating very little between-schools variability (ρ = 0.04). However, as with the OAKS gain model, we estimated the effect of instructional quality and instructional interactions on PRF gains using a mixed model approach. We found nonsignificant effects for all parameters: quality rating (B = 0.07, p = .74, d = 0.01), interaction composite (B = 0.16, p = .68, d = 0.01), and the interaction between the two variables (B = 0.03, p = .85, d = 0.01; see Supplemental Table S2). That is, both higher average quality ratings and higher rates of instructional interactions were associated with slightly higher PRF gains, but neither relationship was statistically significant.
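The between-schools variability statistic (ρ) reported for the two unconditional gain models follows directly from the variance components given in the text. A minimal sketch of that arithmetic, with an illustrative function name:

```python
def intraclass_correlation(school_variance, residual_variance):
    """Share of total variance that lies between schools."""
    return school_variance / (school_variance + residual_variance)


# Variance components as reported for the unconditional gain score models.
oaks_icc = intraclass_correlation(0.34, 29.09)
prf_icc = intraclass_correlation(12.4, 286.9)

print(round(oaks_icc, 2))  # 0.01
print(round(prf_icc, 2))   # 0.04
```

Both values round to the ρ estimates reported above, confirming that nearly all of the variance in gains lay within schools rather than between them.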
Discussion
In this study, we were interested in better understanding three aspects of the academic instructional interactions that occur between middle school students with reading difficulties and their teachers. First, we expected to find that students who received a reading intervention would experience more interactions than their peers who did not receive an intervention. Second, we expected that in intervention classes, the frequency of those interactions would be based on need. That is, we expected that students with more severe reading difficulties would experience higher rates of interactions than students with less severe reading difficulties. Third, we wanted to know whether intervention students who experienced more interactions demonstrated stronger reading growth than students who experienced fewer interactions.
Regarding the first question, our hypothesis was supported. Students who received a targeted reading intervention experienced substantially more instructional interactions than students who did not receive an intervention. This pattern was observed on both a composite measure of instructional interactions and the components that made up that composite. This finding aligns with related work in early reading (e.g., Nelson-Walker et al., 2013) and early mathematics (e.g., Doabler et al., 2015). The value of explicit instruction with older struggling readers has been demonstrated in controlled research studies (Kamil et al., 2008; Vaughn et al., 2022), but specific features of instructional interactions as part of explicit instruction for older students have not been studied. Results from this study suggest more instructional interactions may be occurring for students who receive reading interventions.
More instructional interactions in reading intervention classes may be related to the more remedial nature of instruction one might expect in these classes. However, a related expectation would be observed differences in literacy content during instruction, with more focus on basic skill instruction in intervention classes, which we did not observe. Instructional interaction rate may also be related to the smaller class sizes observed in the intervention classes. The student–teacher ratio in intervention classes was 11.5 to 1; in ELA classes it was 28.2 to 1. It is plausible that these smaller class sizes were more conducive to higher rates of interaction.
It is worth emphasizing that students who received a reading intervention experienced similar content coverage to students who did not. We were unsurprised that struggling readers who did not receive an intervention (i.e., comparison students) spent less than 1% of their time on word level reading instruction and activities. We were quite surprised, however, that this focus was equally low for students in reading intervention classrooms, and that, proportionally, those students also spent relatively little time reading connected text. We expected that reading intervention classes would spend significantly more time on more basic reading skills, but for students in both settings, the largest activity category was reading comprehension, and across all categories (e.g., reading comprehension, vocabulary instruction, and reading connected text), the percentages of time spent were very similar for both groups of students. This suggests that, at least on average, reading interventions were not differentiating content by student need.
Regarding the second research question, our hypothesis was not confirmed. That is, we did not find a clear association between reading problem severity and interaction frequency. This lack of association is illustrated in Figure 1 by the diffuse scatter of points around the regression line relating risk level (measured by reading score at the beginning of the year) to instructional interaction frequency. We are unsure why this association was not stronger but suspect that it may be related to how middle school staff operationalize and implement reading intervention support for struggling readers. That is, schools plan reading interventions within the constraints of an overall school schedule, and it appears that, at least in this study, schools used a relatively coarse conceptualization of reading intervention, one in which those students who were assigned to the intervention received whatever the school used as its reading intervention. As such, gradations of support based on reading problem severity (i.e., need) may not be consistently factored into how schools provide support for students.
This possibility is supported by the fact that considerable variance in instructional interaction frequency was accounted for by which school students attended. Clearly, there is significant variability in the skills of struggling readers in middle school, and students with more severe reading difficulties may need more work in more specific areas, such as word level reading skills (e.g., multisyllabic word reading), a content area that was almost never observed, even in intervention classes. Struggling readers with stronger skills may need more work on other skills, such as comprehension, which, although it was the most frequently observed content area, still represented only a quarter of instructional time, at 27%. In other words, it does not appear that schools were approaching reading problem severity systematically, even though schools could determine problem severity via state assessments and other data sources (e.g., screening data). There are several possible explanations for this. For one, it is possible that schools attended to differences in problem severity in ways that were not captured by our observation system or in ways different from the quantification of instructional interactions between teachers and students. However, it may also be that teachers are not prepared adequately to design and implement the kinds of individualized interventions necessary to support students with diverse needs (Vaughn et al., 2012).
Results of the third research question suggest there was no association between frequency of instructional interactions and reading growth for students in reading interventions. However, considering the weak associations found with respect to our second research question, this finding should be interpreted cautiously. That is, the lack of association may be less concerning than it would have been had we found a strong connection between interaction frequency and reading problem severity. The lack of association may also be due to the observation tool not adequately capturing associations between interactions and growth.
Observers also rated the quality of reading instruction, and here, two findings are important. First, ratings of quality were not systematically higher in classrooms where frequency of interactions was greater. That is, quality ratings were not simply a proxy for interaction frequency (or vice versa). Observers rated something about instruction quality that appears to be independent of the quantitative information provided by instructional interactions. However, the association between quality and reading growth was also weak, so it remains unclear how quality in this study would manifest itself in terms of benefit to students. Second, there were no differences in quality ratings between the two types of classroom, ELA and reading intervention. We draw attention to this because it suggests that whatever level of instructional quality students received in reading intervention classes, a commensurate level was also being provided in ELA classes, which is where comparison students received what constituted their most explicit reading instruction support (most students with reading difficulties were also in ELA classes).
Implications for Research and Practice
Taken together, what do the findings suggest? On a national level, an essential context variable is that 69% of eighth-grade students are not proficient readers. This means, among other things, that content-area teachers can be confident that many of their students lack the reading skills necessary to understand and adequately learn grade level, subject-area content by reading grade-level texts on their own. For these struggling readers to adequately learn expected academic content, they must receive additional instructional support that their reading proficient peers do not require. It seems sound based on previous research findings that this support should come in the form of explicit and systematic supplemental reading interventions during the school day.
Findings from this study suggest several additional considerations. First, results of the first research question suggest that reading interventions are characterized by more frequent instructional interactions with teachers than are ELA classes. We believe this is an important instructional objective, but one that should be accompanied by evidence showing a positive association between interaction frequency and reading outcomes. In this study, we observed no association between interaction frequency and outcomes. One explanation for this may be the lack of association between problem severity and instructional interactions. That is, students with more severe reading problems received the same degree of instructional interactions with their teacher, and with peers, on average, as students with less severe difficulties. We suggest schools should attend to this discrepancy in designing reading interventions for students and provide professional development and support to teachers in matching student need to intervention (Vaughn et al., 2012).
We also suggest that a better alignment between need and frequency of instructional interactions may produce evidence that reading interventions delivered in authentic school settings can improve outcomes for struggling students, similar to that found in controlled studies (e.g., Scammacca et al., 2015). We did not see this association in this study, perhaps because interaction frequency did not appear to be related to problem severity (Vaughn et al., 2022). In summary, middle schools should continue to provide reading interventions to struggling readers; better link intervention intensity, including interactions between teachers and students, to student need; and carefully assess whether interventions are improving rates of student reading growth. We also suggest that supplemental reading support in content area classes must be part of the solution to the reading problem in secondary settings.
Empirical evidence is strong that reading interventions can work for struggling middle school students (Biancarosa & Snow, 2006; Scammacca et al., 2015; Vaughn et al., 2022). This part of the solution to the middle school reading problem should continue to be strongly encouraged. However, additional efforts are needed to implement and sustain effective reading interventions on a wide scale in middle schools and identify the mechanisms that make such interventions effective. Therefore, it is essential that schools rigorously evaluate the quality and impact of interventions they provide. The expectation should be that for most struggling readers, interventions are closing the gap with their on-track peers.
Limitations
Several limitations to this study may have impacted its findings. First, the study rests on the assumption that three observations per classroom accurately represent the instruction and supports students received across the entire school year. Observations were sequenced to measure instruction at the beginning, middle, and end of the school year, but given the nature of observation data, even information generated by low-inference observation tools such as the MSIP-COT, it may be that additional observations would be required to obtain a sufficiently stable estimate of classroom instructional practice.
Second, observers coded instructional interactions as discrete elements (e.g., the number of teacher models and opportunities to respond). In Smolkowski and Gunn (2012), this tactic was used successfully to demonstrate the association between instruction and student reading performance. However, a related way of conceptualizing instructional interaction frequency is to quantify interactions as sequences of instructional interaction patterns rather than as discrete elements. For example, a single sequence might include teachers demonstrating something for students before asking them to do it, students responding to those requests, and teachers providing feedback to students in relation to their completion of the request. Future research might examine the utility of measuring sequences of linked interaction elements.
Conclusion
We were hopeful that this study might point to a robust relationship between observable instructional interactions between teachers and students and both reading problem severity and subsequent student reading growth. This association has been observed in the early elementary grades, and similar results in middle school would provide important information about what implementation factors should be prioritized and measured to determine whether interventions are being delivered as intended. Although this relationship was not observed, students who received a reading intervention did experience substantially more interactions with their teachers than did their peers who did not receive an intervention. If beneficial instructional interactions between teachers and students can be identified, middle schools might be better able to provide reading intervention supports that decrease the reading gap between students with reading difficulties and their on-track peers.
Supplemental Material
Supplemental material, sj-docx-1-ldx-10.1177_00222194231211948 for Measuring Instructional Interactions During Reading Instruction for Students Receiving Intervention in Middle School by Scott K. Baker, Patrick C. Kennedy, Dean Richards, Nancy J. Nelson, Hank Fien and Christian T. Doabler in Journal of Learning Disabilities
Footnotes
Correction (December 2023):
Article updated online to correct reference McKenna et al., 2021.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This research was supported by the Institute of Education Sciences, U.S. Department of Education, through Grants R305A150037 and R305E100041 to the University of Oregon. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education.