Abstract
This study extends prior research by manipulating both intervention and skill difficulty using a multiple baseline across participants design with changing phases in a virtual tutoring environment. Participants were four U.S. students from third and fifth grades for whom appropriate and challenging instructional targets were selected following diagnostic assessment using curriculum-based measurement. Instructional strategies were selected to align or misalign with those instructional targets. The multiple baseline design was used to determine the functional relationship between the instructional strategies (acquisition or fluency-building) with appropriate and challenging skills. Results suggested that indicated intervention strategies aligned with students’ skill proficiency resulted in improvements but that contraindicated intervention strategies that were misaligned with students’ skill proficiency did not. Furthermore, most students rated the contraindicated intervention strategies as less acceptable or reported higher levels of math anxiety.
A defining feature of multi-tiered systems of support (MTSS) is access to intensified instruction based on student need. Students who do not experience learning success at lower tiers of instruction move to higher, more intensive tiers of instruction. When MTSS was a new framework, intensification was defined largely by temporal features with more frequent, longer session lengths, and more weeks of intervention (Vaughn et al., 2003). In addition to these temporal features, more frequent progress monitoring was considered more intensive as was an intervention delivered under more costly arrangements (e.g., one on one vs. in small groups) (Batsche et al., 2006). As MTSS has developed, research has informed our understanding of intensification. Intensification is usually measured via student learning, with stronger learning gains reflecting a more intensive instructional arrangement. Unfortunately, few studies have empirically examined instructional arrangement using data-based decision-making. Moreover, given the persistently poor math proficiency displayed by U.S. students (National Center for Education Statistics [NCES], 2022), coupled with unique learning losses associated with school closing and alternative reopening plans resulting from the novel coronavirus (Dorn et al., 2021; Lewis et al., 2021), studies examining the contributions of the instructional arrangement for improving intensification efforts in math are particularly needed.
Intervention Intensification
Whereas it seemed logical that more intensive instruction might require more time for delivery, defining intensity as a longer duration intervention has not produced stronger gains on student learning in math (Codding et al., 2016; Schutte et al., 2015). Temporal characteristics of intervention like session length and duration are no longer considered the best ways or even optimal ways to operationalize or intensify instruction. Features like intervention dosage (the frequency with which an intervention’s active ingredients are experienced by students) have emerged as more central to intensification (Yoder & Woynaroski, 2015). Dosage reflects the frequency or magnitude of the intervention ingredients experienced by a learner during intervention. For example, if the intervention targets fluency-building using partner work and the active ingredient is opportunities to respond, then dosage can be captured as the number of opportunities to respond experienced by students. If the intervention involves taking turns, then such an intervention might result in half the opportunities to respond compared with an individually delivered intervention during which the child responds at every opportunity (e.g., response cards). Alternatively, active ingredients may be added to the intervention, for example, adding immediate corrective feedback to explicit timing (Duhon et al., 2014). Recent work underscores that students will respond to different dosages of treatment and provides a model for determining the dose needed to attain the needed learning improvements for students vis-à-vis a dose–response curve analysis (Duhon et al., 2020).
Another emerging feature of intensification centers on alignment of the intervention to a student’s specific learning needs with more individualized interventions reflecting more intensive intervention efforts. The concept of aligning instruction with student need based on student data is not a new idea and, in fact, is the theoretical basis for the well-documented effect of formative assessment on student learning (Fuchs et al., 1991). Specifically, formative assessment enables the teacher to alter instruction to better meet the learner’s needs (Fuchs et al., 1989). Recent models of intensification from MTSS policy groups, for example, the National Center on Intensive Intervention’s Data-Based Individualization Process, reflect a growing recognition that an important dimension of intensification is the alignment of measured student need with intervention (Lemons et al., 2018).
Instructional Hierarchy
One way of aligning instruction with student proficiency is the Instructional Hierarchy (IH; Haring & Eaton, 1978). The IH is a model for measuring student performance, knowing a student’s stage of learning, and choosing instructional tactics that are most likely to be effective for the learner. The IH suggests that learning progresses through four distinct stages: acquisition, fluency-building, generalization, and adaptation. Generalization and adaptation as originally described by Haring and Eaton are both forms of generalization (stimulus and response) so we will collapse those stages in our presentation here. Understanding the stage of learning within which a specific learner is functioning at a specific point in instruction is determined by measuring student performance. Student performance during the acquisition stage of learning is referred to as “frustration” and is characterized by a high likelihood of errors, incomplete understanding, and slow performance. In the fluency-building stage of learning, student performance is referred to as “instructional” and is characterized by high accuracy but slow and effortful performance. The student should be able to demonstrate complete and accurate understanding, but the student will have to dedicate a great deal of cognitive effort and resources to solve problems. In the generalization stage of learning, student performance is referred to as mastery and is characterized by fluent, facile, adaptable, and flexible skill performance. The student’s performance is now both accurate and speeded or “fluent” (Binder, 1996).
The IH further details that specific instructional tactics are needed within each stage of learning. In the acquisition stage, the goal of instruction is to establish accurate and complete understanding. Thus, effective tactics include prompting, cuing, and supporting correct student responses using adult guidance to permit detection of errors and immediate corrective feedback. In the fluency-building stage of learning, the goal of instruction is to build speed or ease of responding. Thus, effective fluency-building tactics include a high dosage of opportunities to respond with delayed corrective feedback and goals for improved performance. The goal of generalization instruction is to encourage the use of the learned skill broadly, under different conditions (e.g., related but more challenging or applied tasks) and modification of the learned skill to solve problems more efficiently or differently. Thus, effective tactics include varying the task, providing application opportunities, and monitoring and supporting correct responding.
One meta-analysis documented that when students received the indicated intervention, performance improved and when students received the contraindicated intervention, performance worsened, calling this a “skill by treatment interaction” (Burns et al., 2010). However, this study did not manipulate alignment or test the IH effect directly. Maki and colleagues (2021) conducted an experiment manipulating instructional match directly and used diagnostic assessment to determine the stage of skill development within the IH to determine whether five students required acquisition or fluency intervention. Within a multiple-baseline design across students, the indicated intervention resulted in improvements on the multiplication measure, whereas the contraindicated intervention did not. The indicated intervention also yielded better accuracy on multiplication facts than the contraindicated intervention for those students falling in the acquisition stage of skill development, but this was not the case for two students who displayed high levels of accuracy, supporting the IH-guided alignment of intervention.
Math Instruction and Math Anxiety
The performance of students in the 10th and 25th percentiles on the National Assessment of Educational Progress recently decreased, suggesting that students who struggle to learn math are not receiving appropriate instructional support (NCES, 2022). Many students do not display whole number proficiency by fifth grade, which is an urgent problem given that whole number knowledge is necessary to access and engage in advanced math content (e.g., Namkung et al., 2018). Students experiencing math difficulties are more likely to drop out of school, less likely to be prepared for professional careers, and less likely to be employed (Fan & Wolters, 2014). Low math performance may lead to avoidance of math tasks, coursework, and science, technology, engineering, and mathematics (STEM) careers, which are in high demand with the increasing reliance on technology and artificial intelligence (Ashcraft & Moore, 2009; Eutsler et al., 2020). Conversely, students with early math proficiency complete advanced high school math courses, increasing the likelihood of college graduation and leading to higher lifetime earnings (Lee, 2012).
School closures and alternative reopening plans, driven by onset of the COVID-19 pandemic, exacerbated these problems as U.S. math learning losses exceeded typical summer losses and were more substantial than with any other subject (Kuhfeld et al., 2020; Renaissance Learning, Inc., 2021). Projected and realized drops in math performance have pointed to the most significant impacts for students in elementary grades. Longitudinal studies demonstrated that math performance in 2020, compared with 2019, was as much as 10 percentile points lower, a smaller proportion of students made gains over the year, and twice as many students’ performance dropped one quintile (Lewis et al., 2021). Data collected from a few sources suggest that students lost, on average, 4 to 5 months of progress in math (Dorn et al., 2021; Lewis et al., 2021). These learning losses are likely compounded for students with or at risk of math learning disabilities due to disruptions in the quality, frequency, and access to specialized services or individualized education plans generated by the remote learning environment (National Academies of Sciences, Engineering, and Medicine [NASEM], 2020).
Experts speculate that one factor responsible for this concerning slide is math anxiety (Sawchuk & Sparks, 2020). According to a nationally representative survey, 67% of U.S. teachers reported students’ math anxiety was a significant classroom challenge (Sparks, 2020). A moderate negative bidirectional relationship is consistently found between math achievement and math anxiety (Barroso et al., 2021). That is, low math performance is associated with higher math anxiety and higher math anxiety is associated with lower math performance. The two primary theories, Reduced Competency Theory (Maloney, 2016) and Disruption Theory (Ashcraft & Kirk, 2001), explaining the connection between math anxiety and math achievement suggest that either poor math performance elicits less enjoyment of math tasks and lower engagement with math work or high levels of math anxiety lead to low math performance due to negative thoughts and worries that impair cognitive capacity (Gunderson et al., 2018).
Three recent meta-analyses suggested that math content may contribute to variability in the strength of the relationship between math achievement and math anxiety (Barroso et al., 2021; Li et al., 2021; Namkung et al., 2019). In their meta-analysis, Namkung et al. (2019) found a stronger relationship between math anxiety and math achievement when tasks included challenging math material containing multiple steps than with simple math tasks. Another meta-analysis, authored by Li et al. (2021), specifically looking at the relationship between math anxiety and motivation, illustrated that perceptions regarding students’ competence in math and value of math were negatively associated with math anxiety. The authors suggested providing math material with a moderate level of difficulty (not too easy or too hard) as one way to intervene to address this relationship. We are unaware of any studies that have examined math anxiety in context of well-aligned instructional support versus misaligned instructional support or considered the impact of skill difficulty.
Virtual Tutoring
Some experts suggest virtual tutoring may be one cost-effective way to address the learning gaps observed in math following repercussions from COVID-19 (Kraft & Falken, 2021). Virtual tutoring is conducted synchronously through a computer-based platform such as Zoom. There are few empirical studies on the effectiveness of virtual tutoring, but preliminary work holds promise. For example, one study paired middle schoolers with university students who provided 3 or 6 hr of one-to-one homework tutoring for about 5 weeks. Tutoring positively affected gains on the standardized outcome measure, and better outcomes were observed with more hours of tutoring (Carlana & La Ferrara, 2021). In another study, which also paired undergraduate university students with middle schoolers to address schoolwork or homework, tutoring was conducted one to one via Zoom twice weekly for 30 min but did not produce a significant effect relative to the control condition (Kraft et al., 2022). These authors noted a positive impact but indicated that more intensive dosing (increases in frequency or duration) may be required to achieve a significant impact on student performance. We are aware of only one study that provided specific online math tutoring. Gortazar and colleagues (2022) found significant increases in test scores and year-end math grades among middle schoolers who received three sessions weekly for 50 min over 8 weeks after school using a two-to-one ratio of students to tutor. These studies varied in dose and focus but generally yielded positive impacts.
Purpose
Individualizing an intervention to address students’ learning needs is a contemporary method of operationalizing instructional intensification. Therefore, it is important to evaluate the framework proposed by Haring and Eaton (1978) and apply it specifically to math. The purpose of this study was to experimentally manipulate intervention alignment to determine if indicated instruction produced stronger learning gains than did contraindicated instruction. The practical implications of such findings inform MTSS work by validating a specific method for intensifying instruction for struggling learners.
This study extends prior research by manipulating both intervention tactics and skill difficulty using a multiple baseline across participants design with changing phases in a virtual tutoring environment. First, we examined the intervention (i.e., acquisition or fluency) aligned with student performance, according to the IH, on the skill that diagnostic testing indicated was the appropriate instructional target. The purpose of this phase was to verify the treatment effect of aligning the skill target and intervention with student proficiency. Next, a contraindicated intervention (i.e., acquisition or fluency) misaligned with skill development on a challenging target skill was delivered. The purpose of this phase was to evaluate the effect of a misaligned skill and intervention on student performance. In the third phase, students received the indicated intervention to address the challenging skill. The purpose of this phase was to evaluate the effect of the correctly aligned intervention on a skill that was too difficult. Given that previous research suggested that the relationship between math achievement and math anxiety is stronger when students are given challenging instructional tasks, participants self-reported their math anxiety before and after every intervention session, and student acceptance of the presented intervention options was collected following each study phase. Primary and exploratory research questions are listed below:
Method
Participants and Setting
Four students, three third graders (Nora, Brice, and Alexis) and one fifth grader (Mallory), from a public elementary school serving students in Grades 3 through 5 in a small suburban district in the Northeastern United States participated in the study. All participant names are pseudonyms. Three students identified as girls (Nora, Alexis, and Mallory) and one as a boy (Brice). All students were White, English was their primary language, and no students had an identified disability, nor were they receiving special education programming or math supports. School demographics were as follows: 89% of students were White, 5% Latinx, 3% Asian, 1.5% African American/Black, and 1.4% Multirace; 3% of students were English learners, 21% were economically disadvantaged, and nearly 20% of students had an identified disability.
Students were recruited by the director of pupil services and school principal, who used fall universal screening data (iReady) to identify students who they believed would benefit from targeted (Tier 3) supplemental services. From the school’s universal screening procedures, nine students were identified as eligible for participation and caregivers of six students provided informed consent. Follow-up screening on number operations using SpringMath measures determined that four students scored in the frustration range, signifying the need for intervention (VanDerHeyden et al., 2019), whereas the remaining two students demonstrated performance on grade-level skills that was commensurate with expectations.
Instructional Context
The school was delivering instruction in person, 4 days per week at the time of this study (January through March 2021) and students had been in school under that arrangement since September 15, 2020. Class sizes were reduced by half to permit social distancing between students during class. There were 10 third-grade and 11 fifth-grade classes with an average of 14 students per class. In 2019, 49% of third-grade students and 54% of fifth-grade students met or exceeded the proficiency criterion on the state’s year-end test, which was comparable to the state average. The school used Everyday Mathematics (McGraw Hill, 2016) as their core curriculum and supplemented with several online tools to build fact fluency and to differentiate instruction. The school used iReady to universally screen all students in mathematics and two full-time math interventionists provided supplemental instruction in small groups to students who were determined by the school’s decision team to be struggling in mathematics. Participants in this study were not receiving math support from the school during the study.
Measures
Appropriate and challenging skills selected from SpringMath were administered each session to monitor student progress. SpringMath screening and diagnostic testing were used to identify the appropriate and challenging skill for each student. The authors developed a one-item rating scale to measure math anxiety that was also given every session. Acceptability of each intervention condition was measured at the end of each phase of the study.
SpringMath
SpringMath is an online assessment and intervention tool (Education Research & Consulting, Inc., 2013). The fall screening measures were administered to students according to their grade levels. Grade 3 screening measures included Fact Families Addition & Subtraction 0 to 20; 3-Digit Addition With & Without Regrouping; and 3-Digit Subtraction With & Without Regrouping. Grade 5 assessments included the following: Fact Families for Multiplication & Division 0 to 12; Add & Subtract with Decimals to the Hundredths; Multiply 2-Digit by 2-Digit Numbers With & Without Regrouping; and Find Least Common Denominator. The diagnostic assessments included successively easier but related subskill measures to the first screening skill for which the student scored in the risk range, which was the first screening skill for all students in all grades. For the third-grade students, Sums to 20 was the first diagnostic measure administered and all students fell in the instructional range for this skill; therefore, no additional measures were administered. For the fifth-grade student, Mallory, eight measures were administered (multiplication facts 0–12, division facts 0–12, division facts 0–5, multiplication facts 0–9, multiplication facts 5–9, multiplication facts 0–5, sums to 12, sums to 6) before the appropriate target skill could be identified. The challenging skill was a winter screening skill.
All progress monitoring measures were generated in SpringMath according to tested parameters. Each measure included between 49 and 68 problems reflecting mastery plus 20% to minimize potential ceiling effects on scores. For each measure, problems were stratified by operation and difficulty and then randomized within each stratification to fill the assessment. All measures were subjected to development procedures designed to maximize equivalence across generated forms, which have been described elsewhere. Studies examining reliability of scores reported delayed alternate form reliabilities of r = .77 to .85 (VanDerHeyden et al., 2019) for generated measures. A large-scale generalizability study reported that 77.10% to 88.69% of the variance in student scores was accounted for at the student level and remaining sources of variance (including probe form, day, and error) were trivial (Solomon et al., 2022). All measures were generated with answer keys and included standard directions for administration. For this study, all measures were timed at 2 min each. All scores are reported as answers correct per 2 min. Prior to the start of this study, a small generalizability study was conducted to verify the reliability of these measures when administered virtually. A basic fact skill and a complex operation skill were administered via paper/pencil or the computer. Researchers administered four alternate forms of each measure under both computer and paper/pencil conditions (VanDerHeyden et al., 2023) and analyzed reliability within a two-facet (i.e., assessment format and probe) partially nested G-study followed by four one-facet (i.e., probe) G-studies to examine reliability by probe type but within computer or paper/pencil conditions.
Results indicated strong reliability of measurement for both the complex and basic skill operation via paper/pencil and under virtual conditions (65% and 72% of the variance in scores was explained at the student level when these measures were administered via the computer; both measures exceeded generalizability and dependability of .75 with the administration of two probes under both assessment conditions). Results were consistent across assessment formats (virtual and paper/pencil) for the basic fact operation. Because all measures were administered virtually in this study, virtual assessment data could be considered reliable across phases.
Math Anxiety
The authors created a one-item rating scale for which students reported their own math anxiety on a scale ranging from 0 (worried) to 10 (comfortable). The scale was issued through Seesaw as a worksheet and students circled their answers using a virtual pen. The students did not share their screen via Zoom during this time. The scale consisted of a colored bar ranging from different gradations of red for the worried portion (0–2) of the scale, fading to orange and yellow (3–7), and with different gradations of green for the comfortable portion of the scale (8–10). Face emojis were also used to anchor 0 (frown), 5 (neutral), and 10 (smile) and were located at the top of the colored bar. Students were asked “On this scale [interventionist pointed to colored bar], how anxious are you about working on math?” All ratings were reverse scored for analyses so higher scores reflected greater anxiety.
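The reverse scoring of the anxiety ratings can be illustrated with a minimal sketch. The function name is hypothetical, and the sketch assumes the conventional reversal (scale minimum plus scale maximum minus the rating); the exact computation is not reported here.

```python
SCALE_MIN = 0   # "worried" anchor on the student-facing scale
SCALE_MAX = 10  # "comfortable" anchor on the student-facing scale


def reverse_score(rating: int) -> int:
    """Convert a student rating (0 = worried, 10 = comfortable) into an
    anxiety score on which higher values reflect greater anxiety.

    Assumption: reversal is computed as SCALE_MIN + SCALE_MAX - rating.
    """
    if not SCALE_MIN <= rating <= SCALE_MAX:
        raise ValueError(f"rating must be between {SCALE_MIN} and {SCALE_MAX}")
    return SCALE_MIN + SCALE_MAX - rating
```

Under this assumption, a student who circles 8 (in the green, comfortable portion of the bar) receives an anxiety score of 2, and the neutral midpoint of 5 is unchanged.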
Children’s Intervention Rating Profile
Intervention acceptability was measured using an adapted version of the seven-item Children’s Intervention Rating Profile (CIRP; Witt & Elliott, 1985), which uses a Likert-type scale ranging from 1 (I do not agree) to 6 (agree). Adaptations were made only to reflect the intervention features and content area, as was intended in the development of this rating scale. The CIRP is the most frequently used measure in intervention research (Silva et al., 2020). The CIRP assesses students’ beliefs about an intervention (e.g., the intervention was fair, the intervention was too hard, the intervention may cause problems with my friends, there are better ways to learn than with this intervention, the intervention activities are good to use with other students, I like the intervention activities, I think the intervention will help me do better in school). In accordance with common practices (Silva et al., 2020), the total acceptability scores and mean rating scores were reported; higher scores indicate more acceptability. Three versions of the CIRP were created, one for each intervention condition. At the end of each intervention phase, the student accessed the CIRP through the online platform. The interventionist provided a brief description of the intervention phase prior to the student completing the rating scale. The CIRP has an average coefficient alpha of .86 (Turco & Elliott, 1986).
Interventions
Two types of interventions, designed to build either skill acquisition or fluency, were administered to all students. For Brice, Nora, and Alexis, performance on the target skill fell in the instructional range which suggests that a fluency-building intervention be delivered; thus, the fluency intervention served as the indicated or matched intervention. For the challenging skill, which fell in the frustration range of performance among all three students, the fluency intervention was delivered as the contraindicated intervention, given that the acquisition intervention would be the appropriately matched intervention (see Table 1). For Mallory, whose performance for the target and challenging skills fell in the frustration range, an acquisition intervention was delivered as the indicated intervention, whereas the fluency intervention served as the contraindicated intervention. For all students, an acquisition intervention was administered during the final intervention phase because this intervention was indicated for the challenging skill.
Table 1. Participants’ Appropriate and Challenging Skills and Associated Intervention Categories.
Note. For the appropriate skill, the indicated intervention was provided, which consisted of the aligned instructional tactic. For the challenging skill and contraindicated intervention category, initially the misaligned instructional tactic was provided. Then, for the challenging skill, the indicated intervention category, consisting of the aligned instructional tactic, was provided.
All intervention packages contained procedural and conceptual knowledge activities as well as a reward component, which was included in all intervention conditions to promote engagement. If a student beat their prior score, an announcement delivered via Seesaw notified the student that they had received a virtual token. When five tokens were earned, a mystery prize (e.g., markers, code detectors, stickers, key chains) was mailed to the student’s home.
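The reward contingency can be sketched as a small bookkeeping function. This is an illustration only: the function name is hypothetical, and the sketch assumes that a tied score earns no token and that the token count resets once a prize is earned, neither of which is stated in the procedures.

```python
from typing import Tuple

TOKENS_PER_PRIZE = 5  # five tokens were exchanged for one mystery prize


def update_tokens(tokens: int, prior_score: int, new_score: int) -> Tuple[int, bool]:
    """Return (tokens carried forward, whether a prize was earned).

    Assumptions: only a strictly higher score earns a token, and the
    count resets after a prize is awarded.
    """
    if new_score > prior_score:
        tokens += 1
    if tokens >= TOKENS_PER_PRIZE:
        return tokens - TOKENS_PER_PRIZE, True
    return tokens, False
```

For example, a student holding four tokens who improves from 30 to 31 answers correct would earn the fifth token and a prize, with the count returning to zero under the reset assumption.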
Fluency Intervention
The fluency intervention package required 20 min and consisted of three components: (a) timed trial with scoring, error correction, graphing, and reward (7 min), (b) word problem-solving (7 min), and (c) review game (6 min). Students were presented with a worksheet with math problems written on it in a vertical format. After completing an example problem with the interventionist, who provided immediate error correction if necessary, students answered aloud as many problems as possible in 2 min. Before answering the problems aloud, students were encouraged to beat their best score and a graph of the prior score was provided.
During the first session, the highest baseline score was provided to the student as the score to beat. When the 2 min was complete, the interventionist provided the score and the student colored in the graph. Students used the virtual white board option or blank space on the worksheet to rework any missed problems. Next, the interventionist and student worked together to complete three to five word problems using the white board or a blank activity sheet. Finally, the interventionist and student played the card game “war.” The interventionist drew one fact card from each of the two stacks, which were written on different colored paper, and showed the student the cards. The student indicated which player had the higher value card and the interventionist kept track of winning hands. If the values were tied, the interventionist drew four cards for each player, showing only the fourth card. The player with the higher value final card won all the cards. Facts written on the cards matched the skill the student was working on (e.g., 0–20 if the student was working on sums to 20).
Acquisition Intervention
The acquisition intervention package was 20 min and consisted of two components: (a) cover-copy-compare with graphing (10 min) and (b) conceptual understanding activities (10 min). Students were shown a worksheet with six columns. The first three columns in each row contained a problem without the answer (worked problem), a problem with the answer (check problem), and a box for students to indicate the match. Procedures from Skinner et al. (1997) were adapted for the online format in which the worked problem with the answer was hidden using a virtual text box. Students wrote the answer to one problem at a time, uncovered the worked problem, and made a check mark if solutions matched. If the solutions did not match, the interventionist assisted the student to solve the problem again. After 10 min, the interventionist provided the score and the student colored in a graph. To address conceptual understanding, various activities were embedded, as appropriate for each skill, illustrating the relationship between math operations, practicing with commutative and associative properties, using number lines and multiplication arrays, rewriting number sentences into equations, finding missing numbers, and determining multiple ways to make different quantities.
Interventionist Training
The interventionist was a first-year female school psychology doctoral student who was Lebanese and bilingual. Prior to entry in the doctoral program, the interventionist was a math teacher at a charter school. She received training in the assessment and intervention materials by attending a 2-hr workshop generated by SpringMath. The interventionist practiced the assessment and intervention procedures with the first author and was required to achieve 100% assessment and intervention fidelity before the study commenced.
Procedures
All sessions were completed virtually during after-school hours when students logged into the Zoom platform from their homes. Intervention and assessment materials were delivered through a paid subscription to Seesaw. The Seesaw platform permitted all intervention and assessment materials to be uploaded as activities and students used virtual pens, pencils, or markers to write on the screen with their track pad or mouse cursor. Screens were shared via Zoom with the interventionist, who also made use of the white board option available in Zoom.
Preintervention Assessment
Preintervention assessment included two phases: (a) administration of the SpringMath fall package of screening measures for each student’s appropriate grade level and (b) diagnostic assessment to identify the skill targets for each student’s intervention (VanDerHeyden et al., 2021). For screening, students in Grade 3 were administered three measures: Fact Families Addition & Subtraction 0 to 20; 3-Digit Addition With & Without Regrouping; and 3-Digit Subtraction With & Without Regrouping. These measures reflected skills that students were expected to have acquired and for which diagnostic assessment could sample back through successively easier content to quantify student skill proficiency and identify specific skill gaps. The student (Mallory) in Grade 5 was administered four measures: Fact Families for Multiplication & Division 0 to 12; Add & Subtract with Decimals to the Hundredths; Multiply 2-Digit by 2-Digit Numbers With & Without Regrouping; and Find Least Common Denominator. All students included in the study scored in the frustration range according to SpringMath cut scores on the fall screening for their respective grade levels.
The second phase of preintervention assessment was diagnostic. Each screening skill has an associated sequence of follow-up assessments that sample incrementally prerequisite skills based on specific decision rules in SpringMath. Broadly, there are two possibilities for intervention. If the student’s performance is in the instructional range on a prerequisite skill, the student will be slated for a fluency intervention for that skill. If the student’s performance is in the frustration range on a given skill but in the mastery range on the most immediate prerequisite, the student will be slated for an acquisition intervention on the more challenging skill. The diagnostic assessment accomplishes two things: identifying the skill that should be targeted during intervention and determining whether the student needs acquisition or fluency intervention.
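The two broad decision rules described above can be sketched in code. This is only an illustrative sketch and not the SpringMath decision logic itself: the function name, the range labels, and the choice to return None for cases outside the two rules are assumptions made for this example.

```python
from typing import Optional

# Hypothetical labels for the three proficiency ranges named in the text.
RANGES = ("frustration", "instructional", "mastery")


def choose_intervention(skill_range: str,
                        prerequisite_range: Optional[str] = None) -> Optional[str]:
    """Return 'fluency', 'acquisition', or None when neither rule applies.

    skill_range: range of the student's score on the skill being assessed.
    prerequisite_range: range on the most immediate prerequisite skill, if known.
    """
    if skill_range not in RANGES:
        raise ValueError(f"unknown range: {skill_range!r}")
    if prerequisite_range is not None and prerequisite_range not in RANGES:
        raise ValueError(f"unknown range: {prerequisite_range!r}")

    # Rule 1: instructional-range performance points to fluency building.
    if skill_range == "instructional":
        return "fluency"
    # Rule 2: frustration on the skill, mastery on its immediate prerequisite,
    # points to acquisition instruction on the more challenging skill.
    if skill_range == "frustration" and prerequisite_range == "mastery":
        return "acquisition"
    # Anything else is outside the two rules summarized in the text.
    return None


print(choose_intervention("instructional"))           # fluency
print(choose_intervention("frustration", "mastery"))  # acquisition
```

A None result signals that further sampling of prerequisite skills would be needed before a tactic could be selected under these two rules.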
The screening skill for which the student performed in the frustration range and was the earliest skill in the scope and sequence chart served as the starting place for the diagnostic assessment. The intervention target was a related, prerequisite skill. For this study, we also chose a challenging skill so that we could evaluate misaligned intervention. In all cases, the challenging skill was a winter screening skill for the student’s grade level (Fact Families Multiplication & Division 0–9 for Grade 3 students and Add & Subtract Fractions with Unlike Denominators for Grade 5). The contraindicated intervention tactic was fluency-building (because we knew students had not acquired that skill). Targeting a winter skill for intervention resembles the way that many schools plan intervention for students because core instruction would be teaching those skills at midyear and these students would be struggling with such skills.
Baseline
The appropriate and challenging skills for each participant were administered by the interventionist without feedback via Seesaw and using the Zoom platform. Nora and Mallory had five baseline sessions, Alexis had eight sessions, and Brice had 11 baseline sessions.
Indicated
The appropriate skill was addressed with the indicated intervention. The purpose of this phase was to verify the treatment effect of aligning the appropriate skill and the intervention with students’ stage of skill proficiency. For Nora, Brice, and Alexis, the appropriate skill was sums to 20 and the fluency intervention was indicated. For Mallory, the appropriate skill was sums to 6 and because her performance fell in the frustration range, an acquisition intervention was indicated. Throughout this phase, the interventionist continued to administer progress monitoring probes assessing the appropriate and challenging skills. The anxiety scale was administered before and after each intervention session.
Contraindicated
The contraindicated intervention was administered to address the challenging skill. The purpose of this phase was to evaluate the effect of challenging skills and intervention on student performance. For all four students, the contraindicated intervention was the fluency intervention. Throughout this phase, the interventionist continued to administer progress monitoring probes assessing the appropriate and challenging skills. The anxiety scale was administered before and after each intervention session.
Indicated Intervention With Challenging Skill
During the final intervention phase, students received the indicated intervention to address the grade-level skill that fell in the frustration range (challenging skill). The purpose of this phase was to evaluate the effect of a correctly aligned intervention on a too difficult skill. For Nora, Brice, and Alexis, the acquisition intervention was delivered to improve multiplication and division fact families with numbers 0 to 9. For Mallory, the acquisition intervention was delivered to improve add and subtract fractions with unlike denominators. Throughout this phase, the interventionist continued to administer progress monitoring probes assessing the appropriate and challenging skills. The anxiety scale was administered before and after each intervention session.
Experimental Design and Analysis
A multiple baseline across participants design with changing phases (baseline, indicated, contraindicated, indicated intervention for challenging skill) was employed. Three replications of the effect, rather than four, were observed because Nora and Mallory had the same intervention timeline. Data were examined using visual analysis of trend, level, and variability during phases and when phases changed (Gast & Ledford, 2014). Level changes between phases were assessed by visually inspecting the magnitude of the difference between the last point in the baseline phase and the first point in the treatment phase. Trend and variability were assessed by visually inspecting overall patterns in the data during phases and determining whether trends were accelerating, decelerating, or stabilizing (Gast & Ledford, 2014).
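Although the analysis in this study was visual, the level-change comparison described above has a simple numerical counterpart. The sketch below is offered only as an illustration: the function names and the example scores are hypothetical, and the least-squares slope is an assumed supplementary index of trend direction rather than a procedure used by the authors.

```python
from statistics import mean


def level_change(prev_phase, next_phase):
    """Difference between the first point of the next phase and the
    last point of the previous phase (positive means an increase)."""
    if not prev_phase or not next_phase:
        raise ValueError("both phases need at least one data point")
    return next_phase[0] - prev_phase[-1]


def ols_slope(scores):
    """Least-squares slope of scores across sessions 0, 1, 2, ...
    Assumed here as a rough numeric aid for judging trend direction."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two data points")
    x_mean = (n - 1) / 2
    y_mean = mean(scores)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(scores))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Hypothetical answers-correct scores, not data from the study.
baseline = [20, 18, 19, 17, 16]
treatment = [24, 25, 25, 26]

print(level_change(baseline, treatment))  # 8
print(ols_slope(baseline) < 0)            # True: decreasing baseline trend
```

Numerical summaries of this kind would supplement, not replace, inspection of the graphed data.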
Interscorer Agreement
Interscorer agreement was assessed by having an independent rater, who was a White woman in her first year of the doctoral program in school psychology, independently score SpringMath progress monitoring probes across seven sessions representing all study phases for each student. Copies of the SpringMath progress monitoring probes were downloaded from Seesaw for the rater to score. An average of 28% (range, 24% to 32%) of SpringMath progress monitoring probes were scored for interscorer agreement. Agreement was calculated by dividing the number of agreements by the number of agreements plus disagreements. Mean percent agreement for the SpringMath probes containing the appropriate skill was 98.4% (range, 94%–100%). Mean percent agreement for SpringMath probes containing the challenging skill was 97.8% (range, 91%–100%). Discrepancies in scores were due to the readability of numerals, given variable legibility of numerals produced by the online pen.
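The agreement formula above (agreements divided by agreements plus disagreements) can be expressed as a short function. This is a minimal sketch; the function name, the item-by-item unit of comparison, and the example scores are assumptions made for illustration and are not taken from the study.

```python
def percent_agreement(scorer_a, scorer_b):
    """Percent agreement between two scorers' item-level scores.

    Each argument is a sequence of per-item scores for the same probe,
    for example 1 for an answer scored correct and 0 for incorrect.
    """
    if len(scorer_a) != len(scorer_b) or len(scorer_a) == 0:
        raise ValueError("scorers must rate the same, non-empty set of items")
    agreements = sum(a == b for a, b in zip(scorer_a, scorer_b))
    disagreements = len(scorer_a) - agreements
    return 100.0 * agreements / (agreements + disagreements)


# Hypothetical probe with four items; the scorers differ on one item.
print(percent_agreement([1, 1, 0, 1], [1, 1, 1, 1]))  # 75.0
```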
Treatment Adherence
An independent observer, a White woman in the master’s-level school psychology program, assessed the interventionist’s adherence to the intervention protocols over the course of the study. Adherence was observed via Zoom with the independent observer entering Zoom without using video and audio. She recorded whether the 12 to 15 steps on each treatment protocol were fully completed as written, partially completed, or omitted. The independent observer had prior coursework on math interventions and was trained in each intervention protocol by the first author until she achieved 100% fidelity. The first author provided feedback to the interventionist following review of the integrity data, as necessary.
For Nora, treatment adherence was observed for 19% of sessions and was 100% across all phases. For Alexis, treatment adherence was observed for 16% of sessions and was 100%, 100%, 100%, and 93% for the baseline, indicated, contraindicated, and indicated tactic with challenging skill phases, respectively. For Brice, treatment adherence was observed for 33% of sessions and was 100%, 100%, 100%, and 84% for the baseline, indicated, contraindicated, and indicated tactic with challenging skill phases, respectively. For Mallory, adherence was observed for 29% of sessions and was 100%, 97.25%, 93%, and 100% for the baseline, indicated, contraindicated, and indicated tactic with challenging skill phases, respectively. Estimates of adherence can be considered conservative because omitted steps were the result of connectivity issues at the start of or during the intervention session that required the interventionist to shorten intervention components to maintain the time allocated by caregivers for intervention delivery.
Acceptability
Scores on the CIRP were collected for each of the three intervention phases. During the indicated phase, total acceptability scores (n = 4) were high, ranging from 34 (Brice) to 42 (Mallory), with a mean item rating score of 5.46 (SD = 1.03) across all students. During the contraindicated phase, acceptability ratings (n = 4) were lower, albeit still in the acceptable range, ranging from 20 (Alexis) to 42 (Mallory), with a mean item rating score of 4.63 (SD = 1.73). During the indicated intervention with a challenging skill phase, acceptability scores (n = 4) were high, ranging from 37 (Brice) to 42 (Mallory), with a mean item score of 5.61 (SD = 1.03).
Results
Outcomes for the appropriate and challenging skills are described below according to each phase of the study and illustrated in Figure 1 for math outcomes and Figure 2 for math anxiety self-ratings. Table 2 displays the means and standard deviations for each student across each phase of the study according to the two math outcome measures and math anxiety ratings.

Figure 1. Changes in Answers Correct as a Function of Condition.
Figure 2. Changes in Math Anxiety Ratings as a Function of Condition.
Table 2. Means and Standard Deviations Across Phases for CBM and Math Anxiety Scores.
Note. Math anxiety items were reverse scored. CBM = curriculum-based measures.
Appropriate Skill
During baseline, Nora, Brice, and Alexis displayed performance that fell in the instructional range. Mallory’s performance was in the frustration range. The baseline trend decreased for Nora and Alexis and remained stable for Brice and Mallory. When the indicated phase was implemented, Nora and Alexis displayed an immediate increase in the level of performance which remained stable throughout that intervention phase. An increasing trend was observed for the performance of Mallory and Brice. During the contraindicated phase, an immediate level change, indicating lower performance than in the prior phase, was observed for Alexis, Mallory, and Brice and performance remained lower and stable. Nora’s performance continued to improve during this phase. During the final phase when the intervention was aligned with the challenging skill, performance for Alexis and Brice was consistent with the prior intervention phases albeit slightly variable. For Nora, performance on the appropriate skill displayed an increasing trend, whereas Mallory’s performance was variable.
Challenging Skill
During baseline, all four students displayed performance in the frustration range, as expected based on the preintervention assessment. For all students, performance was low and stable. Mallory was unable to answer any items correctly. During the indicated phase, performance continued to be low and stable for all students, as expected, given that the challenging skill was not the focus of the intervention. During the contraindicated phase, three students continued to display low and stable performance, whereas Nora’s performance displayed a slightly increasing trend. During the final intervention phase, the performance of Nora and Brice displayed an increasing trend, Alexis’ performance was variable but higher than in all other intervention phases, and Mallory’s performance showed no change.
Math Anxiety
During baseline, pre- and post-session math anxiety ratings were low and stable (in the comfortable range) for Mallory and Brice. For Nora, pre- and post-session ratings displayed an increasing trend, except for the final data point in the phase. Pre-session ratings were generally higher than post-session ratings and fell in the midrange of scores. For Alexis, a decreasing trend in baseline was observed for both pre- and post-session ratings. Pre-session ratings were higher than post-session ratings.
During the indicated phase, Brice and Alexis displayed low and stable pre- and post-session ratings. Nora’s pre- and post-session ratings were variable and higher than baseline. Post-session ratings were the same as or lower than pre-session ratings with one exception. For Mallory, pre- and post-session ratings displayed a decreasing trend with pre-session ratings higher than post-session ratings and both sets of ratings higher than baseline.
During the contraindicated phase, pre- and post-session ratings for Alexis were low and stable. For Mallory, pre-session ratings were higher than post-session ratings, displaying an increasing trend consistent with the level of performance from the prior phase. Nora displayed pre-session ratings that were higher than post-session ratings; her scores were stable and in the midrange of the scale. Whereas pre-session ratings for Brice were low, post-session ratings were high, reflecting an immediate level change indicating higher math anxiety. Post-session scores decreased after two sessions.
During the final phase, Alexis and Brice generated pre- and post-session ratings that were low and stable. Nora’s pre- and post-session ratings reflected an immediate level change indicating less anxiety than the prior phase. Pre-session ratings reflected a decreasing trend, whereas post-session ratings were more variable. For Mallory, pre-session ratings were low and stable with one exception. Post-session ratings were the same as or lower than pre-session ratings with one exception.
Discussion
The purpose of this study was to manipulate both intervention and skill difficulty using a multiple baseline across participants design with changing phases in a virtual tutoring environment. For the two indicated phases in this study, during which the indicated intervention was aligned with students’ skill proficiency to address either an appropriate or a challenging skill, improvements in math performance were observed with all four students. When the contraindicated intervention was used to address the challenging skill, stable performance or slight decreases in performance were observed for all participants. Taken together, generating intervention plans (to build acquisition or fluency) that were aligned with students’ level of skill proficiency (frustration or instructional) led to better outcomes than a misaligned intervention, even when the target skill was much too difficult for the participant. Two students (Nora and Alexis) rated the contraindicated intervention phase as less acceptable than both indicated intervention phases and two students (Nora and Brice) reported more anxiety during the contraindicated phase.
Math Outcomes
Evidence suggests that intervention intensification efforts should extend beyond total intervention duration, instructor qualifications, and group size, given that these variables are not related to math achievement outcomes (Pellegrini et al., 2021). Rather, emphasis should be on the quality and amount of practice opportunities or learning trials embedded within the intervention session. The IH provides a framework with burgeoning evidence to support the idea that students’ math performance will improve when diagnostic assessment is used to align the stage of skill development with intervention tactics (Maki et al., 2021). By doing so, students will be exposed to the appropriate type of learning opportunities.
This study suggests that when students are provided with an intervention package containing intervention strategies (acquisition or fluency-building) that are aligned with their stage of skill development as determined through diagnostic assessment using curriculum-based measures (CBM), performance improves. This finding is consistent with one other study that examined aligned and misaligned interventions in math (Maki et al., 2021) but also extends this research by demonstrating that the same outcome can be observed when an indicated intervention is aligned to address a challenging skill. Notably, students in both conditions received scripted activities to build associated conceptual understanding, so the only difference between the fluency and acquisition intervention was the use of a high dosage of practice opportunities for fluency and more immediate and elaborate error correction for acquisition. In this study, three students made gains when presented with an intervention aligned to meet their stage of skill development on a challenging skill. The fourth student, Mallory, did not display any improvements, a result that is likely because of the large gap (several grade levels) between her current performance and the challenging skill. This may suggest a limit on the extent to which an indicated intervention will lead to improved performance on a related, challenging skill.
When the contraindicated phase was implemented, the performance of three students on the appropriate skill remained stable or declined, as might be expected because the instructional tactics designed to improve the students’ performance were replaced with tactics that were not designed to address their skill proficiency. Nora was the only student for whom improvement continued. During the indicated phase, which was fluency-building, Nora’s performance was approaching the mastery range, and therefore repeated opportunities to practice, which were offered through the daily progress monitoring probes, were likely sufficient to continue to improve her performance on the appropriate skill. For example, Nelson et al. (2020, 2021) demonstrated that kindergarten and third-grade students who were exited from intervention displayed greater performance gains when given one oral reading fluency progress monitoring probe per week than those who did not receive this maintenance intervention.
Math Anxiety
Idiosyncratic patterns for performance were observed on student self-reported math anxiety ratings. Ratings were collected from students before and after each session across all phases of the study. Alexis reported math anxiety during baseline but with exposure to instruction, regardless of intervention alignment or skill difficulty, her ratings displayed no indication of math anxiety for the remainder of the study. Brice generally rated himself as nonanxious across intervention phases except during the contraindicated phase. Brice’s mean ratings were higher during post-session than pre-session, perhaps indicating some discomfort with the misaligned intervention condition. Nora’s self-ratings of math anxiety ranged from just below neutral to just above neutral throughout the study but, like Brice, her mean post-session ratings were highest during the contraindicated phase.
Finally, Mallory rated her math anxiety as lowest during the baseline phase (across pre- and post-session scores). Her pre-session anxiety scores reflected more anxiety than post-session scores across intervention phases, albeit in the low range to midrange, with the highest ratings occurring during the indicated intervention with the challenging skill. Accordingly, it may be the case that Mallory’s ratings reflected concerns about skill difficulty, which lessened to some extent once intervention supports were provided. Given that Mallory’s instructional level was far below her grade level, it is logical that her math anxiety ratings were higher when she anticipated working on skills that exceeded her instructional level. This pattern is consistent with theoretical hypotheses that lower math competency results in higher math anxiety (Maloney, 2016). One finding of this study is that math anxiety is idiosyncratic and may be mitigated for some students by simple exposure to instruction, but for others only with instruction that is well attuned to their learning needs.
Limitations and Future Directions
The conclusions drawn from this study should be considered in context of its limitations.
First, although level and trend changes were observed in the expected direction during the indicated intervention phases, these improvements were more subtle than anticipated. This is likely due to the online context in which this study was carried out. Due to inconsistent broadband connectivity during some sessions, the time allocated to each intervention component was shortened in some sessions. Other sessions were canceled because of poor connectivity. For Brice, the end of the indicated phase coincided with a week-long school break, which may have affected the maintenance of performance with the instructionally appropriate skill in the subsequent phase of the study. Thus, these findings might be considered conservative estimates of the effect. Second, although this study included four students, our design includes only three replications of the effect because Nora and Mallory started the intervention simultaneously. Prior to commencement of this study, we had hoped to run two multiple baseline designs across third- and fifth-grade students, respectively, with two different interventionists. However, two other fifth graders recommended by the school did not meet our study criteria based on our screening and diagnostic assessment. Furthermore, the families of Mallory and Nora indicated the same time preferences for tutoring. Third, fewer observations were conducted to assess treatment adherence for Nora and Alexis because of scheduling challenges. Fourth, the math anxiety scale was researcher-created; although published scales are available for children, no one-item scales that we are aware of were available for this age level (Barroso et al., 2021).
Fifth, the external validity of this study is limited to the characteristics of the recruited sample, who were from a school consisting of predominantly White, monolingual students with high socioeconomic backgrounds. Sixth, although this study occurred during the novel coronavirus pandemic, schooling for participating students followed a hybrid model consisting of 4 days of in-person instruction in small classes of fewer than 15 students and one remote schooling day.
Conclusion
This study extends prior research by manipulating both intervention and skill difficulty using a multiple baseline across participants design with changing phases in a virtual tutoring environment. Verifying that aligned instruction produces stronger learning than misaligned instruction provides important context for discussions about access to more rigorous content and recent demonstrations showing that access on its own does not bring about stronger learning (Koon & Davis, 2019). This study, along with that from Maki et al. (2021), provides further evidence that aligning instructional math strategies with students’ stage of skill development results in positive outcomes. This study illustrated that when instructional strategies are misaligned with students’ stage of skill development, even when the instructional target is appropriate, students’ math performance will not improve. Furthermore, as suggested in this study, students may exhibit higher levels of anxiety and lower acceptability of misaligned instructional practices.
The interaction between skill difficulty of instructional content and the alignment of the intervention strategies with students’ skill proficiency is important for educators to consider when intensifying the intervention approach for students who struggle to learn math skills.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
Funding for this project was provided by Sourcewell Technology, a nonprofit dedicated to improving the use of technology in education and also the publisher of SpringMath. The second author of the article is the founder of SpringMath and she derives financial benefit from the use of SpringMath assessments and interventions in schools.
