Abstract
We synthesized studies published since 2000 that assessed the effects of using virtual manipulatives to increase the mathematical accuracy of students with disabilities. We extracted a total of 1,796 raw data points from 114 cases across 35 single-case studies. By applying three-level multilevel modeling, we analyzed both immediate effects and trends during the intervention phase as well as moderation effects related to student characteristics (case level) and intervention features (study level). Both the average immediate effect and trend during the intervention were statistically significant. The average immediate effect varied significantly by student grade, disability type, developer, device, type of virtual manipulative, and visual model embedded in virtual manipulatives. Neither student characteristics nor intervention feature–related moderators significantly influenced the average trend during the use of virtual manipulatives.
Many elementary and secondary school students, including those with disabilities, obtain low scores on standardized mathematics tests. According to the National Assessment of Educational Progress results (Nation’s Report Card, 2019), only 19% of fourth-grade students with disabilities performed at or above a proficient level in 2009; this performance declined slightly to 17% in 2019. Eighth-grade students with disabilities exhibited more limited mathematics performance than the elementary group; only 8% performed at or above the proficient level in 2009, and this rate remained unchanged in 2019. The persistent struggle faced by students with disabilities demonstrates an urgent need for educators to determine how to provide these students with intensive, tailored mathematics interventions.
To this end, educators have implemented various digital technologies as instructional tools for teaching mathematics (e.g., Özdemir, 2017; Reneau, 2012). Researchers (e.g., Kiru et al., 2018; Ok et al., 2020) have validated the positive effects of digital technology in improving the performance of students who struggle with mathematics. The software used for mathematics education includes various programs, such as auto-generated practice exercises; tutorials providing guidance and human-like tutoring support; games providing time constraints; simulations replicating real or imaginary systems; problem-solving activities that include specific steps and additional opportunities to practice (Doering & Veletsianos, 2009); and programs providing virtual calculators or virtual manipulatives (VMs). VMs are technology-based interactive visual models that allow students to manipulate virtual objects to understand mathematical concepts (Moyer-Packenham & Bolyard, 2016).
VMs and Recommendations for Use in Teaching Mathematics
VMs were initially developed as Flash or Java programs (see the Utah State University’s [n.d.] National Library of Virtual Manipulatives) and have been available since the late 1990s. Since the implementation of HTML5 web standards, VMs have also become available on mobile devices (see Math Learning Center, n.d.). The key component of VMs is that the representations are not static but are instead designed to be manipulated. Highlighting the function of interactive visualization of images, Moyer-Packenham and Bolyard (2016) suggested two other types of software that are suitable for VM environments, single representation (displaying only interactive visual images) and multirepresentation (adding numerical or textual information to single-representation images), in addition to existing instructional software types, such as tutorials, games, simulations, and problem solving (Doering & Veletsianos, 2009). For the design of visual models embedded in VMs, Beckmann (2013) and the National Council of Teachers of Mathematics (2010) suggested several appropriate visual models, including set models (a countable discrete quantity), area models (a continuous quantity comparing the magnitude of area), linear models (a continuous quantity comparing lengths), base-10 blocks (an object representing place value of 10 numerals or digits in a number), algebra scales (balance of two sides presenting numerical values for algebraic equations), and multiple models (more than two visual models).
Previous Literature on the Use of VMs in Mathematics Interventions
In the years since VMs were introduced, research has indicated that their use is effective in teaching mathematics to students with disabilities (Bouck & Park, 2018; Satsangi & Miller, 2017; Shin et al., 2017). Four reviews and syntheses have discussed the use of VMs in mathematics interventions. First, Moyer-Packenham and Westenskow (2013) reviewed 66 research reports as part of their examination of the effects of VMs on learning. They observed a moderate overall effect (Cohen’s d = 0.35) of VMs alone or in combination with other instructional treatments (e.g., physical manipulatives and textbooks). However, because Moyer-Packenham and Westenskow’s meta-analysis involved interventions with all students, conclusions about the effects of VMs on specific groups were not possible.
More recently, other researchers (Bouck & Park, 2018; Lafay et al., 2019; Peltier et al., 2020) reviewed studies of VM use in mathematics interventions that included the joint use of virtual and physical manipulatives for students with disabilities. Lafay et al. (2019) synthesized 38 articles to examine the effects of different types of mathematics manipulatives (e.g., physical, pictorial, and virtual) and discovered considerable variations in the types of manipulatives. Bouck and Park (2018) noted an increase in VM use in the education field and found that five of the 36 articles examined the academic effects of incorporating VMs (e.g., concrete-representational-abstract). Peltier et al. (2020) quantified the effects of mathematics manipulatives on learning by applying a meta-analytic method to 53 single-case design (SCD) articles or dissertations targeting students with disabilities or mathematics difficulties. Although Peltier et al. did not find any significant moderation effects associated with the manipulative type, they observed a higher average effect for VMs (Tau-U = 0.98) than for physical manipulatives (Tau-U = 0.89).
Moderators Influencing the Effects of VMs
Previous research revealed several potential moderators that might influence the effects of VMs on mathematics teaching. Such effects may differ depending on student characteristics (e.g., student grade and disability type) and intervention features (e.g., pretraining purpose, instructional methods of mathematics interventions, developer, device, VM type, and visual model embedded in VMs). For example, Moyer-Packenham and Westenskow (2013) found mixed effects of VM use across grade levels, with moderate effects for preschoolers through fourth graders, small or no effects for middle school students, and large effects for high school students. Ok and Kim (2017) indicated that the effects of using technology (e.g., iPads and iPods) might differ among students with different types of disabilities; they noted that most studies have focused on teaching students with low-incidence disabilities (e.g., autism spectrum disorders) and that research including individuals with other types of disabilities is limited.
Furthermore, researchers who have incorporated technology into academic content, including mathematics, suggested that intervention features can serve as potential moderators. For example, Meyer et al. (2019) noted that a period of pretraining before using a virtual reality platform added significantly to the degree of knowledge attained and reduced the cognitive load necessary for multimedia lessons. Additionally, researchers incorporating technology in their mathematics interventions suggested three different types of instructional methods, each associated with different levels of student achievement: teacher-led instruction (providing explicit teacher modeling with a combination of multiple practice opportunities; Bouck, Park, Levy, et al., 2020), teacher-guided instruction (providing scaffolds and prompts when students struggle with independent mathematical problem solving; Satsangi, Hammer, & Hogan, 2018), or technology-assisted instruction (supporting students’ mathematical learning through software programs, commonly combined with instructors’ prompts for instructional steps; Shin & Bryant, 2017). Commercial applications that have been continuously field-tested by researchers maintained the high quality specified by the tool (Koumpouros & Kafazis, 2019). The type of device (e.g., computer or iPad) also led to variations in the mathematical achievement of students with learning disabilities or mathematics difficulties (Shin, 2018; Sung et al., 2016). Other reviews (Ok et al., 2020; Seo & Bryant, 2009) have demonstrated that students’ mathematical achievement and attitudes varied depending on the type of educational technology used (e.g., games, drill and practice, and tutorials). The game type was particularly effective in sustaining students’ motivation for learning, whereas the tutorial type was effective in providing support and feedback within the program.
Furthermore, the visual model used can be a potential moderator of the effects of mathematics interventions for students with disabilities. Mazzocco et al. (2013) noted that one consistent challenge for students with disabilities is visualizing mathematical concepts. Thus, educators have implemented multiple visual models (e.g., linear, area, and set models) for various topics to assist students in making sense of different mathematical concepts (van Garderen, 2006). Hamdan and Gunderson (2017) recommended that teachers incorporate a number line (linear model) when teaching fractions to counter students’ tendency to rely on whole-number counting.
Rationale for the Study
Despite the number of syntheses on technology implementation for mathematics interventions for students with disabilities (Kiru et al., 2018; Ok et al., 2020; Seo & Bryant, 2009) and syntheses on the use of mathematics manipulatives (Bouck & Park, 2018; Lafay et al., 2019; Peltier et al., 2020), no previous meta-analyses have investigated the moderation effects of VMs on students with disabilities in terms of student characteristics and intervention features. To fill this gap in the literature, we implemented three-level multilevel modeling and analyzed the effects of VMs used by students with disabilities.
We utilized only studies that implemented SCDs, a quantitative experimental design in which each student served as their own control (Horner et al., 2005). Previously, there have been two group studies (Bouck et al., 2015; Xin et al., 2017) that have incorporated VMs in their mathematics interventions for students with disabilities. However, the design structure of these studies did not provide case-level data involving repeated measurements on different occasions to examine changes in levels and trends during the intervention; therefore, we excluded these group studies.
The purpose of this study was to synthesize the effects of using VMs to increase the mathematical accuracy of students with disabilities. Using three-level multilevel modeling, we examined the immediate effects and trends during the intervention as well as the moderation effects of case-level (student characteristics) and study-level (intervention features) variables. Multilevel models for individual case data allowed for flexibility and addressed methodological considerations within SCD studies, such as changes in levels and trends between baseline and intervention phases (Baek et al., 2020), variability in intervention effects through moderators at both the case and study levels (Moeyaert et al., 2020), and within-group errors that can be autocorrelated (Petit-Bois et al., 2016). Three primary research questions guided the current study:
What are the average immediate effect and the average trend during the use of VMs for the mathematical accuracy of students with disabilities?
To what extent is the average immediate effect of using VMs moderated by student characteristics and intervention features?
To what extent is the average trend during the use of VMs moderated by student characteristics and intervention features?
Method
Inclusion Criteria and Search Strategies
We selected studies that met four inclusion criteria: (a) studies included students with disabilities in Grades K–12, (b) the research designs of the studies were SCDs (reversal or ABAB, multiple baseline or multiple probe, changing criterion, and alternating treatment with baselines), (c) the independent variable was the use of VMs (operationally defined as interactive visual representations on a digital screen that are used for mathematical teaching and learning tools), and (d) the dependent variable was mathematical accuracy (correct percentages on outcome measures in mathematics, depicted in graphical data).
We identified the studies in four steps. Figure S1 (in the online supplemental materials) depicts the Preferred Reporting Items for Systematic Reviews and Meta-Analyses flowchart (Moher et al., 2009). The first step was a search of peer-reviewed articles and dissertations published between January 2000 and September 2020. We designated the year 2000 as the starting point because we sought to examine the latest trends in VM use in our field of interest and because, beginning in 2000, we observed a gradual increase in articles about VM use in special education journals. We searched four electronic databases, Academic Search Complete (n = 2,919), Education Source (n = 1,360), PsycInfo (n = 1,235), and ERIC (n = 1,347), with the keywords “virtual manipulative*,” “game,” “simulation,” “interactive*,” “touchscreen,” “digital,” “mobile,” “tablet,” “iPad,” “computer,” or “app*,” and “mathematic*” and “disab*,” and archived a total of 6,861 articles. After the removal of 2,536 duplicates, 4,325 studies remained. The second step was to review the titles, keywords, and abstracts of the 4,325 studies; 3,942 studies were excluded for being unrelated to the research topic. The third step was to retrieve full-text copies of the remaining 383 studies and to carefully review them to determine whether they met the inclusion criteria. A total of 352 of the 383 studies were excluded for the following reasons: (a) they did not target students with disabilities (n = 11), (b) they did not implement single-case research designs (n = 98), (c) they did not use VMs within mathematics interventions (n = 87), or (d) they did not focus on mathematical accuracy (n = 156). Thus, the full-text review resulted in a total of 31 studies. The fourth and final step was to conduct an ancestral search of the 31 included studies. 
After a careful review of their reference lists, we found five additional studies that met the inclusion criteria, resulting in a total of 36 studies. Two authors checked the interrater reliability for the selection of included studies and achieved 100% agreement using the percentage agreement calculation between the two coders. After checking for outliers (perfect fit with residual standard error as almost zero; 0% accuracy in the baseline and 100% accuracy in the treatment phases), one study was removed, resulting in 35 studies for the data analysis.
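The screening counts reported above can be checked with a short tally. The following Python sketch is only an illustration of that arithmetic; the function name and dictionary keys are hypothetical and are not part of the authors' materials.

```python
# Counts taken from the search description above.
database_hits = {
    "Academic Search Complete": 2919,
    "Education Source": 1360,
    "PsycInfo": 1235,
    "ERIC": 1347,
}
full_text_exclusions = {
    "did not target students with disabilities": 11,
    "did not use a single-case design": 98,
    "did not use VMs in the mathematics intervention": 87,
    "did not focus on mathematical accuracy": 156,
}


def screening_flow(hits, duplicates, title_abstract_excluded,
                   full_text_excluded, ancestral_added, outliers_removed):
    """Return the number of records remaining after each screening stage."""
    identified = sum(hits.values())
    deduplicated = identified - duplicates
    full_text_reviewed = deduplicated - title_abstract_excluded
    eligible = full_text_reviewed - sum(full_text_excluded.values())
    included = eligible + ancestral_added
    analyzed = included - outliers_removed
    return {
        "identified": identified,
        "deduplicated": deduplicated,
        "full_text_reviewed": full_text_reviewed,
        "eligible": eligible,
        "included": included,
        "analyzed": analyzed,
    }


flow = screening_flow(database_hits, 2536, 3942, full_text_exclusions, 5, 1)
for stage, count in flow.items():
    print(f"{stage}: {count}")
```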
Coding Procedures
Using an Excel spreadsheet, we developed a coding tool by reviewing previous studies of the use of VMs (e.g., Moyer-Packenham & Bolyard, 2016) and three-level multilevel modeling for SCD data and effects (e.g., Moeyaert et al., 2021). The primary coder was the first author, who has more than 10 years of teaching and research experience in the field of special education. The first author provided coding training to another coauthor in two hour-long sessions, the first focused on how to code each categorical variable, and the second focused on how to code the SCD raw data. The two coders conducted four individual remote follow-ups to establish an agreement about the meaning of each coding variable. After the training, the coders independently coded two articles and achieved 100% interrater reliability for all categories of the coding tool. The primary coder coded student characteristics, intervention feature–related variables, study quality, and SCD raw data. The independent coder coded approximately 34% (n = 12) of the 35 included studies to check the interrater reliability of coding. By applying the percentage agreement calculation established by the two coders, we achieved 98% agreement for student characteristics and intervention feature–related variables and 100% agreement for study quality and SCD raw data.
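The percentage agreement calculation mentioned above is the share of coding decisions on which two coders match. A minimal Python sketch follows; the coded values are hypothetical and were chosen only so that one disagreement in 50 decisions yields 98%.

```python
def percentage_agreement(codes_a, codes_b):
    """Percentage of items on which two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical example: 50 coding decisions with a single disagreement.
coder_a = ["LD", "ID", "ASD", "EBD", "OHI"] * 10
coder_b = list(coder_a)
coder_b[7] = "LD"  # coder A recorded "ASD" for this item

agreement = percentage_agreement(coder_a, coder_b)
print(f"Agreement: {agreement:.1f}%")
```

Percentage agreement does not correct for chance agreement, which is one reason some syntheses also report a chance-corrected index.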
Student characteristics and intervention features
To examine the overall features of the 35 included studies, we coded student characteristics and intervention features (see Tables S1 to S3 in the online supplemental materials for the summary of each coding). Each categorical variable included in the moderator analysis was dummy coded. To code student characteristics, we identified students’ gender, ethnicity, grade level (Grades 1 to 5 as elementary school, Grades 6 to 8 as middle school, and Grades 9 to 12 as high school), and disability (learning disability [LD], intellectual disability [ID], autism spectrum disorder [ASD], emotional and behavioral disorders [EBD], and other health impairments [OHI]). For students identified with multiple disabilities, we coded only the primary disability.
Next, we identified the research designs and the Common Core State Standards in Mathematics (CCSSM) content standards (National Governors Association Center for Best Practices [NGA Center] & Council of Chief State School Officers [CCSSO], 2010) for grade-level mathematics topics. We coded additional intervention features that were proposed as potential moderators: pretraining purpose (no pretraining, device use only, and device use with teacher-led instruction), instructional method (technology-assisted instruction with teachers’ prompts, teacher-guided instruction, and teacher-led instruction), VM developer (researcher developed and commercially developed), device for VM use (iPad and computer), VM type (single representation, multirepresentation, tutorial, and game), and the visual model embedded in the VMs (set model, area model, linear model, base-10 block, algebra scale, and multiple models).
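Dummy coding of a categorical moderator can be sketched as follows. This is a Python illustration rather than the authors' R code; the helper name is hypothetical, and the reference categories (elementary school, LD) follow the reference group that the article designates for its moderator models.

```python
GRADE_LEVELS = ["elementary", "middle", "high"]
DISABILITY_TYPES = ["LD", "ID", "ASD", "EBD", "OHI"]


def dummy_code(value, levels, reference):
    """Return one 0/1 indicator per non-reference level of a categorical variable."""
    if value not in levels:
        raise ValueError(f"Unknown level: {value!r}")
    if reference not in levels:
        raise ValueError(f"Unknown reference level: {reference!r}")
    return {level: int(value == level) for level in levels if level != reference}


# A high school student with EBD, coded against the reference categories.
grade_dummies = dummy_code("high", GRADE_LEVELS, reference="elementary")
disability_dummies = dummy_code("EBD", DISABILITY_TYPES, reference="LD")
print(grade_dummies)
print(disability_dummies)
```

A case in the reference category receives zeros on every indicator, so each moderator coefficient is read as a difference from that reference group.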
Assessment of study quality
Before analyzing the data for three-level multilevel modeling, we assessed the quality of the included studies using the quality indicators (QIs) established by the Council for Exceptional Children (CEC; 2014). The eight CEC QI categories for SCD studies were context and setting (1.1), participants (2.1 and 2.2), intervention agent (3.1 and 3.2), description of practice (4.1 and 4.2), implementation fidelity (5.1, 5.2, and 5.3), internal validity (6.1, 6.2, 6.3, 6.5, 6.6, and 6.7), outcome measures and dependent variables (7.1, 7.2, 7.3, 7.4, and 7.5), and data analysis (8.2). We coded each QI that was met as 1, and we coded each criterion that was not met as 0, leading to a total of 22 possible points. Table S4 (in the online supplemental materials) displays study quality by the QIs of the CEC (2014), which was quite high. Out of a total of 22 possible points, study quality scores ranged from 19 to 22 points: 19 (n = 1), 20 (n = 2), 21 (n = 11), and all 22 (n = 21). Of the 35 studies, 13 did not meet the QI related to the intervention agent. The fidelity of implementation was 100% (n = 27) or 80% to 100% (n = 4); in one study, fidelity was descriptively reported as being high in the study, and three studies did not properly report implementation fidelity information. Interrater reliability was 100% (n = 23) or 83% to 100% (n = 10); one study evaluated interrater reliability yet did not report the actual value, and the other remaining study did not collect interrater reliability. Although there were a few missing variables described already, these were not the primary concern in the current study; therefore, we decided to include all 35 studies for data analysis.
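The 0/1 scoring of quality indicators described above can be sketched in Python. The indicator identifiers are those listed in the text; the function name and the example study are hypothetical.

```python
# Quality indicator identifiers as listed above, grouped by CEC category.
CEC_QUALITY_INDICATORS = {
    "context and setting": ["1.1"],
    "participants": ["2.1", "2.2"],
    "intervention agent": ["3.1", "3.2"],
    "description of practice": ["4.1", "4.2"],
    "implementation fidelity": ["5.1", "5.2", "5.3"],
    "internal validity": ["6.1", "6.2", "6.3", "6.5", "6.6", "6.7"],
    "outcome measures and dependent variables": ["7.1", "7.2", "7.3", "7.4", "7.5"],
    "data analysis": ["8.2"],
}
ALL_INDICATORS = [qi for ids in CEC_QUALITY_INDICATORS.values() for qi in ids]


def quality_score(indicators_met):
    """Score 1 for each indicator met and 0 otherwise, then sum the points."""
    unknown = set(indicators_met) - set(ALL_INDICATORS)
    if unknown:
        raise ValueError(f"Unknown indicators: {sorted(unknown)}")
    return sum(1 for qi in ALL_INDICATORS if qi in indicators_met)


# Hypothetical study that meets every indicator except the two on the intervention agent.
hypothetical_met = set(ALL_INDICATORS) - {"3.1", "3.2"}
score = quality_score(hypothetical_met)
print(f"{score} of {len(ALL_INDICATORS)} points")
```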
Extraction of SCD raw data
Regarding student outcomes, each SCD data point was extracted using the GetData Graph Digitizer 2.26 web-based application (GetData Graph Digitizer, 2013). The reliability and validity of this software were significantly high, r = 1.00, p < .001 (Shin & Jung, 2018). Shin and Jung (2018) calculated correlation coefficients between coders’ extracted data points to examine the interrater reliability. They also evaluated the concurrent validity between data points from previously published articles and hypothetical graphs.
To ensure accurate data collection, we checked the baseline phase first and then moved to the intervention phase. We extracted the data using the following steps: opening a captured graph image from the folder, setting the minimum and maximum coordinate values of the x- and y-axes, selecting the data in the baseline phase and transforming them into numerical values, and selecting the data in the intervention phase and transforming them into numerical values. Finally, the y-axis values were copied and pasted into the Excel file.
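The axis-calibration step above rests on a linear mapping from pixel positions to data values. The Python sketch below illustrates that general idea only; it is not a description of how GetData Graph Digitizer is implemented, and all pixel coordinates are hypothetical.

```python
def make_axis_mapper(pixel_min, pixel_max, value_min, value_max):
    """Build a function that linearly maps a pixel coordinate to a data value.

    pixel_min and pixel_max are the pixel positions of the axis points whose
    data values are value_min and value_max, respectively.
    """
    if pixel_min == pixel_max:
        raise ValueError("Calibration points must differ.")
    scale = (value_max - value_min) / (pixel_max - pixel_min)

    def to_value(pixel):
        return value_min + (pixel - pixel_min) * scale

    return to_value


# Hypothetical calibration: on the image, 0% sits at y-pixel 400 and 100% at y-pixel 50
# (image y-coordinates grow downward, so the scale is negative).
y_to_percent = make_axis_mapper(pixel_min=400, pixel_max=50, value_min=0.0, value_max=100.0)

baseline_pixels = [365, 382, 365]        # hypothetical clicked baseline points
intervention_pixels = [120, 85, 67, 50]  # hypothetical clicked intervention points

baseline_values = [round(y_to_percent(p), 1) for p in baseline_pixels]
intervention_values = [round(y_to_percent(p), 1) for p in intervention_pixels]
print(baseline_values)
print(intervention_values)
```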
Data Analysis
SCD data are hierarchically structured and can be analyzed using three-level multilevel modeling (Moeyaert et al., 2021; Pustejovsky & Ferron, 2017). The three levels are repeated-measurement occasions (Level 1) nested within cases (Level 2), which are in turn nested within studies (Level 3). In analyzing SCD data, visual analysis techniques are frequently used to determine the quality and effects of interventions (CEC, 2014). To quantify the between-phase changes in level and trend that visual analysis approaches examine (What Works Clearinghouse, 2017), we focused on the analysis of immediate effects and trends during the use of VMs.
As the focus of the current study was to determine mathematical accuracy, all outcome measures were instructional probes of correct percentages or correct scores. The scores from 12 of the 35 studies (34.3%) were first converted to percentages to apply uniform metric measures across studies. Because the raw SCD data were all correct percentages ranging from 0% to 100%, we did not standardize each case data point but, rather, used raw data for three-level multilevel modeling. To postulate individuals’ immediate effects and trends during the use of VMs, we applied a piecewise linear regression approach by separately modeling time trends in the baseline and intervention phases (Singer & Willett, 2003). Specifically, the following equation was used in Model 1, which includes no moderators:
$$y_{ijk} = \beta_{00jk} + \beta_{01jk}\,\text{Intervention}_{jk} + \beta_{02jk}\,(\text{Intervention}_{jk} \times \text{Time}_{ijk}) + e_{ijk}$$
where $y_{ijk}$ is the mathematical accuracy at the $i$th measurement occasion ($i = 0, 1, \ldots, I$) for the $j$th case ($j = 1, 2, \ldots, J$) in study $k$, $\beta_{00jk}$ is the baseline level, $\beta_{01jk}$ is the change in level when the intervention phase starts (immediate effect), $\beta_{02jk}$ is the change in slope between the baseline and the intervention phase (trend during the use of VMs), and $e_{ijk}$ is an error term that is assumed to be independent and normally distributed with a mean of 0 and variance of $\sigma^2$. $\text{Intervention}_{jk}$ was dummy coded (0 for baseline phase, 1 for intervention phase), and $\text{Time}_{ijk}$ was centered at the first measurement occasion in the intervention phase. Assuming that errors were correlated across repeated observations (autocorrelation; Baek et al., 2020; Petit-Bois et al., 2016), we applied a first-order autoregressive structure, corCAR1 (Pinheiro et al., 2020). We also assumed heterogeneous variance across phases (Moeyaert et al., 2020), incorporating the varIdent variance structure (Pinheiro et al., 2020). Based on the initial examination of SCD data, we found stable baseline data and assumed no trend in the baseline phase. Therefore, we assumed only that baseline levels, immediate effects, and trends during an intervention would vary across cases and studies. Table 1 depicts the parameters and standard errors of Model 1’s fixed and random effects.
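The piecewise coding described above can be illustrated for a single case. The Python sketch below is an assumption-laden illustration, not the authors' nlme code: the helper names are hypothetical, the coefficient values are made up, and the trend is modeled through an intervention-by-time product so that no slope applies in the baseline phase. The last helper shows the correlation implied by a continuous-time first-order autoregressive structure, which decays as a power of the time lag.

```python
def design_rows(n_baseline, n_intervention):
    """Build the phase dummy and centered time variable for one case."""
    rows = []
    for session in range(n_baseline + n_intervention):
        intervention = int(session >= n_baseline)  # 0 = baseline, 1 = intervention
        time = session - n_baseline                # 0 at the first intervention session
        rows.append({
            "session": session + 1,
            "intervention": intervention,
            "time": time,
            "interaction": intervention * time,    # zero throughout baseline
        })
    return rows


def predicted_accuracy(row, baseline_level, immediate_effect, trend):
    """Fixed-part prediction: level shift plus slope during the intervention only."""
    return (baseline_level
            + immediate_effect * row["intervention"]
            + trend * row["interaction"])


def car1_correlation(phi, t1, t2):
    """Correlation between errors at times t1 and t2 under a continuous AR(1) structure."""
    return phi ** abs(t1 - t2)


# Hypothetical case: 3 baseline and 4 intervention sessions, made-up coefficients.
rows = design_rows(n_baseline=3, n_intervention=4)
predictions = [predicted_accuracy(r, baseline_level=10.0, immediate_effect=60.0, trend=2.0)
               for r in rows]
for row, pred in zip(rows, predictions):
    print(row["session"], row["intervention"], row["time"], pred)
```

Centering time at the first intervention session is what lets the phase coefficient be read as the change in level at the moment the intervention starts.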
Table 1. Parameters for the Three-Level Multilevel Model Without Moderators.
Note. Standard errors are in parentheses. The means and variances of the intercept (baseline level) are not presented.
*p < .05. **p < .01. ***p < .001.
Furthermore, the between-case and between-study variances of immediate effects and trends during the use of VMs were examined. As depicted in Figure S2 and Figure S3 (in the online supplemental materials), the random effects of immediate effects and trends during the intervention phase were widely dispersed. Therefore, as a follow-up analysis, we investigated whether case-level (student characteristics) and study-level (intervention features) moderators could explain these variances.
To examine moderation effects, an interaction between immediate effects and dummy-coded moderators was added to Model 2. In a follow-up analysis, another interaction between trends during interventions and dummy-coded moderators was added to Model 3. Table 2 depicts the parameters and standard errors of all fixed effects as well as variances across cases and studies. For the analysis of three-level multilevel modeling, we primarily used the lme function from the nlme R package (Pinheiro et al., 2020). Standard errors in the standard deviation scale were calculated using the delta method function (Bruin, 2006) from the msm R package (Jackson, 2011).
Table 2. Parameters for the Three-Level Multilevel Model With Moderators.
Note. Standard errors are in parentheses. The means and variances of the intercept (baseline level) are not presented.
Interactions between immediate effects and moderators.
Interactions between trends during intervention and moderators.
The reference group was given to elementary school students with learning disabilities who received technology-assisted instruction with teachers’ prompts, but without pretraining, using iPads to run the researcher-developed multirepresentation-with-sets virtual manipulative.
Publication Bias
To reduce publication bias, we included unpublished dissertations within our target studies. Any potential publication bias was assessed using funnel plot asymmetry with Egger’s statistical test (Egger et al., 1997). As depicted in Figure S4 and Figure S5 (in the online supplemental materials), funnel plots were created using the metafor R package (Viechtbauer, 2010). An asymmetric plot indicating potential publication bias was observed for immediate effects (t = −3.77, p < .001). Regarding the trends during the intervention, a somewhat symmetrical plot was observed with no significant publication bias (t = 1.53, p = .14). This mixed result regarding publication bias requires careful attention when interpreting the results. The extracted raw data points, coded data, and R code for data analysis were posted through an online data repository (Shin et al., 2021).
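The logic behind a regression test for funnel plot asymmetry can be sketched in plain Python. This follows the classical formulation of Egger's test (regressing each standardized effect on its precision and testing whether the intercept differs from zero); it is not necessarily the exact specification fitted with metafor in this study, and the effect sizes and standard errors below are hypothetical.

```python
import math


def egger_regression(effects, standard_errors):
    """Classical Egger regression: standardized effect on precision, via ordinary least squares."""
    n = len(effects)
    if n != len(standard_errors) or n < 3:
        raise ValueError("Need at least three effects with matching standard errors.")
    z = [y / se for y, se in zip(effects, standard_errors)]  # standardized effects
    x = [1.0 / se for se in standard_errors]                 # precisions
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = sxy / sxx
    intercept = z_bar - slope * x_bar
    sse = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    residual_variance = sse / (n - 2)
    se_intercept = math.sqrt(residual_variance * (1.0 / n + x_bar ** 2 / sxx))
    t_value = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "slope": slope, "t": t_value, "df": n - 2}


# Hypothetical data in which less precise estimates report larger effects.
effects = [2.0, 2.5, 3.5, 5.0, 7.0]
standard_errors = [0.5, 1.0, 1.5, 2.0, 2.5]
result = egger_regression(effects, standard_errors)
print(result)
```

An intercept far from zero, judged against a t distribution with the reported degrees of freedom, signals asymmetry; the p value is omitted here because it would require a t-distribution routine outside the standard library.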
Results
Summary of Student Characteristics and Intervention Features
A total of 114 students participated across 35 studies (seven dissertations and 28 peer-reviewed journal articles), published between 2012 and 2020 (see Table S1 in the online supplemental materials). Participants included 86 male and 28 female students: 21 in elementary school, 72 in middle school, and 21 in high school. Of these, 68 students were classified as White, 20 as Black, 13 as Hispanic, and four as multiracial (race-ethnicity was not reported for nine students). Additionally, 34 students were identified as having LD; 26, ID; 32, ASD; 11, EBD; and 11, OHI.
As depicted in Table S2 and Table S3 (in the online supplemental materials), 21 of the 35 studies employed a multiple-probe design across subjects (n = 17) or behaviors (n = 4), eight used a multiple-baseline design across subjects (one also employed an alternating treatment design), and six others implemented an alternating-treatment design. Regarding the CCSSM content standards (NGA Center & CCSSO, 2010), the largest number of studies focused on numbers and operations–fractions (e.g., equivalent fractions, adding and subtracting fractions, multiplying fractions; n = 10), followed by number and operations in base 10 (e.g., place value, adding and subtracting whole numbers with regrouping; n = 6) mixed with other domains (n = 1), operations and algebraic thinking (e.g., single-digit and double-digit multiplication and division; n = 7), algebra–reasoning with equations and inequalities (e.g., solving linear algebraic equations; n = 3), measurement and data (solving area and perimeter) or counting and cardinality (single-digit number comparison; n = 3), expressions and equations (e.g., one-step division algebraic equations; n = 1), the number system (e.g., adding integers; n = 2), geometry (e.g., partition a rectangle; n = 1), and ratios and proportional relationships (n = 1).
Regarding intervention features, about half of the reviewed studies (n = 19) provided no pretraining before the intervention; in 13 other studies, interventionists delivered pretraining on both device use and instruction with (n = 10) or without (n = 3) explicit instruction; in the remaining three studies, device use was the only purpose of pretraining. Interventionists implemented VMs through teacher-guided instruction with (n = 11) or without (n = 3) incorporating the system of least prompts, teacher-led instruction with (n = 15) or without (n = 1) the use of explicit instruction, and as part of technology-assisted instruction with teachers’ prompts (n = 5). When teacher-guided instruction was incorporated, instructional prompts were given to each student at the independent testing time with records of students’ independence on mathematical task completion. In most studies, commercially developed VM programs were used (n = 21), although 13 other studies used researcher-developed VMs, and one study included both VM programs. The interventions using VMs were conducted via iPads (n = 24) or computers (n = 11). The most frequently incorporated type of VM was multirepresentations (n = 20), followed by tutorials (n = 6), single representations (n = 4), and games (n = 3). In the other two studies, a single representation was used in combination with multirepresentation or a tutorial. The most frequently observed visual model embedded in VMs was the linear model (n = 11), followed by the set model (n = 8), base-10 block (n = 5; one with a combination of set models), area model (n = 4), multiple models within VM apps (n = 4), and algebra scale (n = 3).
The Immediate Effect and Trend During the Use of VMs
In total, 1,796 single-case raw data points for 114 cases across 35 SCD studies were analyzed. Model 1 in Table 1 depicts the parameter estimates of a three-level multilevel model that assumes autocorrelation. According to the multilevel model without moderators, the average immediate effect was statistically significant, γ0100 = 70.95, p < .001. Specifically, when VMs were used, mathematical accuracy increased by an average of 70.95 percentage points from the baseline level at the first intervention session. Furthermore, the average trend during the use of VMs was also statistically significant and positive (γ0200 = 1.73, p < .001). There was a significant interaction between the use of VMs and time; with each additional session during the intervention phase, mathematical accuracy increased by an average of 1.73 percentage points.
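As a worked reading of these two estimates, the expected gain over the baseline level can be written as a function of the number of sessions $t$ completed after the first intervention session. This is only a sketch: the symbol $t$ and the notation $\widehat{\Delta}$ are introduced here for illustration, the fixed effects are treated as exact, and the between-case and between-study variation is ignored.

```latex
\widehat{\Delta}(t) = 70.95 + 1.73\,t
\qquad\text{so that}\qquad
\widehat{\Delta}(0) = 70.95,
\quad
\widehat{\Delta}(5) = 70.95 + 1.73 \times 5 = 79.60 .
```

Because accuracy is bounded at 100%, a linear projection of this kind is meaningful only while the baseline level plus the projected gain stays within that bound.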
The between-case (
The Influence of Moderators on the Immediate Effect of Using VMs
As depicted in Model 2 in Table 2, we added case-level (student characteristics) and study-level (intervention features) moderators. Because all moderators were categorical variables, the reference group was designated to have the following aspects: elementary school students with LD who received technology-assisted instruction with teachers’ prompts, but without pretraining, using iPads to run the researcher-developed multirepresentation-with-sets VM.
We found statistically significant interactions between the immediate effect of using VMs and two student characteristics: student grade and disability type. When controlling for main effects and other variables in the model (e.g., disability type, developer, and device), the average immediate effect of using VMs was significantly greater for high school students than for elementary school students (γ0400 = 27.78, p = .04). The average immediate effect was smaller for middle school students than for elementary school students, but this difference was not significant. Additionally, although the differences were not statistically significant, the average immediate effect was larger for students with LD than for students with ID, ASD, and OHI. In contrast, the average immediate effect of using VMs was significantly greater for students with EBD than for those with LD (γ0700 = 30.83, p = .02).
There were also statistically significant interactions between the immediate effect of using VMs and intervention feature–related variables: developer, device, VM type, and the visual model embedded in VMs. Notably, a larger average immediate effect was associated with commercially developed VMs than with researcher-developed VMs (γ1300 = 46.60, p < .001). Furthermore, the use of a computer was associated with a significantly larger average immediate effect than the use of an iPad (γ1400 = 28.00, p = .02). A larger average immediate effect was observed for the multirepresentation type than for the single-representation, tutorial, and game VM types. In particular, the average immediate effect was significantly smaller for tutorials than for the multirepresentation VM type (γ1600 = −44.15, p = .01). The average immediate effect of the set model was larger than the effects of the area model, linear model, and multiple models but smaller than the effects of the base-10 block and algebra scale. In particular, the average immediate effect was significantly smaller for multiple models than for the set model (γ2200 = −45.95, p = .004).
After controlling for all other variables, there was no significant difference in the average immediate effects of using VMs between studies that provided pretraining (on device use alone or on device use combined with instruction) and those that did not. Moreover, although the average immediate effects were slightly larger for technology-assisted instruction with teachers’ prompts than for teacher-guided instruction (some with the system of least prompts) and teacher-led instruction, including explicit instruction, these differences were not statistically significant.
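Because the moderators are dummy coded, each reported coefficient can be read as a shift in the average immediate effect relative to the reference group. The sketch below illustrates this reading with the six significant Model 2 coefficients reported above. It is a simplification: the nonsignificant contrasts, whose estimates are not listed in the text, are treated as zero, and the dictionary keys and the function name `shift_from_reference` are hypothetical labels introduced only for this illustration.

```python
# Significant moderator coefficients reported for Model 2. Each value is the
# shift in the average immediate effect relative to the reference category.
SHIFTS = {
    ("grade", "high school"): 27.78,       # vs. elementary school
    ("disability", "EBD"): 30.83,          # vs. LD
    ("developer", "commercial"): 46.60,    # vs. researcher-developed
    ("device", "computer"): 28.00,         # vs. iPad
    ("vm_type", "tutorial"): -44.15,       # vs. multirepresentation
    ("visual_model", "multiple"): -45.95,  # vs. set model
}


def shift_from_reference(profile: dict) -> float:
    """Sum the dummy-coded shifts for a student/intervention profile.

    Categories absent from SHIFTS (the reference categories and the
    nonsignificant contrasts not listed here) contribute 0.0, which is
    a simplifying assumption.
    """
    return sum(SHIFTS.get(item, 0.0) for item in profile.items())


# A high school student using a tutorial-type VM, otherwise at reference.
profile = {"grade": "high school", "vm_type": "tutorial"}
print(round(shift_from_reference(profile), 2))
```

For this profile, the positive shift for high school students (27.78) is more than offset by the negative shift for the tutorial VM type (−44.15), for a net shift of −16.37 relative to the reference group.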
After adding moderators to the analysis in Model 2, both the immediate effect of using VMs (γ0100 = 88.21, p = .02) and the trend during the use of VMs (γ0200 = 1.99, p < .001) remained significant. Relative to Model 1, there was a 1.87% increase in the between-case variance (
Influences of Moderators on Trends During the Use of VMs
As displayed in Model 3 in Table 2, we examined another three-level multilevel model, one including all moderators, to analyze the variation of the trend effect of using VMs on the mathematical accuracy of students with disabilities. We assigned the same reference group as Model 2 to examine the interactions between trend effects and moderators.
None of the moderators related to student characteristics (student grade and disability type) significantly influenced the trend during the use of VMs. Although the average trend was slightly larger for middle and high school students than for elementary school students, these differences were not statistically significant after controlling for other variables. The difference in the average trend for students with LD compared with students with other disabilities was also minimal.
None of the intervention feature–related moderators (pretraining purpose, instructional method, VM developer, device for the use of VMs, VM type, and visual model) significantly influenced the trend during the use of VMs. In other words, there were no statistically significant interactions between moderators and the trend during the intervention. Although not statistically significant, after controlling for all other variables, a slightly larger average trend during the use of VMs was associated with groups provided with pretraining on the device or on both device use and instruction than with groups provided with no pretraining. Moreover, a slightly larger average trend during the intervention phase was associated with teacher-guided instruction and teacher-led instruction than with technology-assisted instruction. Over time, researcher-developed VMs were slightly more effective in improving the mathematical accuracy of students with disabilities than commercially developed VMs. A slightly higher average trend was associated with the use of iPads than with the use of computers. Additionally, the multirepresentation VM type was associated with a slightly higher average trend than the single-representation and tutorial VM types but a slightly smaller average trend than the game VM type. The average trend was slightly higher for the set model than for the area model, linear model, and base-10 block. Relative to the algebra scale and multiple models embedded in VMs, a lower average trend was observed for the set model.
By adding moderators to the analysis in Model 3, the trend during the use of VMs turned out to be nonsignificant (γ0200 = −1.01, p = .88), but the average immediate effect remained significant (γ0100 = 70.44, p < .001). Compared with Model 1, there was a 1.43% reduction in the between-case variance (
Discussion
The purpose of the current study was to synthesize the effects of using VMs on the mathematical accuracy of students with disabilities. Applying three-level multilevel modeling, we investigated trends during the intervention phase as well as immediate intervention effects among SCD studies. The influences of moderators related to student characteristics (case level) and intervention features (study level) on immediate effects and trends during the use of VMs were also evaluated.
The Immediate Effect and Trend During the Use of VMs
The results indicate that the use of VMs was highly effective in improving the mathematical accuracy of students with disabilities. There was a statistically significant average immediate effect of using VMs on mathematical accuracy from the baseline to the intervention phase. This finding is consistent with previous syntheses of research on mathematics manipulatives (Bouck & Park, 2018; Lafay et al., 2019; Peltier et al., 2020) and technology-mediated interventions (Kiru et al., 2018; Ok et al., 2020), which have found such interventions to be generally effective for improving the mathematical outcomes of students with disabilities. These previous syntheses included VMs as a subset of the targeted mathematics interventions and validated the positive effects of mathematics manipulatives or technology-mediated mathematics interventions. Additionally, the use of VMs produced a significantly positive average trend during the intervention, indicating growth over time during interventions that incorporated VMs in mathematics lessons. These findings add value to the existing syntheses and reviews on the effects of using VMs on students with disabilities (Bouck & Park, 2018; Lafay et al., 2019; Peltier et al., 2020), particularly in terms of positive changes in trends and levels between the baseline and intervention phases.
The Influence of Moderators on the Immediate Effect of Using VMs
To reduce the unexplained variance between cases and studies concerning the immediate effects of using VMs for mathematical accuracy, we added student characteristics and intervention feature–related moderators. Although not all categorical variables turned out to be significant moderators influencing the immediate effects of using VMs, we found that student grade, disability type, developer, device, VM type, and visual models embedded in VMs are potential moderators.
After controlling for all other variables in the model, student grade and disability type were significant moderators influencing the immediate effect of using VMs. Consistent with the findings of Moyer-Packenham and Westenskow (2013), high school students showed the largest immediate effects, with a significantly greater effect than elementary school students. Moreover, students with EBD exhibited the highest average immediate effect of using VMs, a significantly greater improvement than even those with LD. This finding differs somewhat from that of Peltier et al. (2020), who synthesized studies using both physical and virtual manipulatives and found that students with EBD exhibited the least improvement compared with students with LD, OHI, ID, and ASD. In the present study, however, when we disaggregated these mathematical manipulatives and evaluated the effect of VMs alone, we found the opposite result for this student population.
The average immediate effect significantly differed by intervention features (developer, device, VM type, and visual model embedded in VM). After controlling for the main effects and other variables (e.g., student disability, device, and VM type), the use of commercially developed VMs had a larger average immediate effect on mathematical accuracy than the use of researcher-developed VMs. This finding indicates that commercially available applications (e.g., Langhorne, 2017), which receive continuous updates and validations by many other researchers and users, might be able to maintain higher quality and more accessibility for users than researcher-developed applications that are not easily accessible (e.g., Özdemir, 2017).
Students with disabilities exhibited statistically significant and larger average immediate effects when they used VMs with a computer than with an iPad. This finding is somewhat contradictory to previous reviews on the effects of device type (Ok et al., 2020; Sung et al., 2016). Sung et al. (2016) found more effective usage of handheld devices, including iPads, than that of laptops as teaching and learning tools across subjects (e.g., mathematics, science). However, when exclusively focused on the mathematics domain, Ok et al. (2020) did not find a significant difference in the mathematical improvement of students with LD. Thus, the moderation effect by different device types should be further validated within various technology-mediated learning settings, including the use of VMs.
The higher average immediate intervention effect with the use of the multirepresentation VM type compared with the other types of VMs (i.e., single representation, tutorial, and game) was notable. When selecting visual models for students struggling with mathematics, the built-in function of multirepresentations, such as pictorial, numeric, and text representations, can accommodate the problem-solving process and extend learners’ experiences to mathematical connections across various representations (Shin et al., 2017). When compared with the immediate effect of using the tutorial type (e.g., providing feedback or tutoring), this difference was statistically significant. Although the tutorial VM type includes an additional function of feedback beyond the features of the multirepresentation type, the current findings indicate that the additional feature does not always lead to better mathematical outcomes.
The visual model embedded in VMs was another moderator influencing the immediate effect on students’ mathematical accuracy. The use of the set model embedded in VMs had a larger average immediate effect than the effect of the area model, linear model, and multiple models. As shown in the present study, the set model was applied to various VM types and mathematical topics. For example, the set model was embedded in the single-representation VM type (e.g., Color Tiles; Brainingcamp) in teaching multiplication and division (Bouck, Park, & Shurr, 2019) or multirepresentation VM type (e.g., two-color counter) in teaching integer addition (Bouck & Park, 2020). The effect of using set-model VMs was statistically significant and larger compared with the effect of using multiple models. This finding indicates that providing more than two different visual models (e.g., linear model and set model) does not guarantee better immediate effects.
The Influence of Moderators on Trends During the Use of VMs
With the incorporation of three-level multilevel modeling, we were able to examine how the trend changes over time during the use of VMs in the intervention phase. Observing a change in slope to a describable direction, as well as a shift in level, is an essential component of the visual analysis of SCD studies (Ledford & Gast, 2018).
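To make the two components concrete, the sketch below estimates a shift in level and a slope for a single hypothetical case by ordinary least squares. This is a simplified single-case analogue, not the three-level model used in this synthesis: it ignores autocorrelation and the nesting of measurements within cases and studies. The data and the function name `level_and_trend` are hypothetical, and time is assumed to be coded as 0 at the first intervention session.

```python
def mean(values):
    return sum(values) / len(values)


def level_and_trend(baseline, intervention):
    """Return (baseline_level, immediate_effect, trend) for one case.

    Fits accuracy = b0 + b1*phase + b2*phase*time by least squares, where
    phase is 0 in baseline and 1 in intervention, and time is 0 at the
    first intervention session. Because the baseline part of the model has
    only an intercept, the fit splits into a baseline mean and a straight
    line through the intervention points.
    Requires at least two intervention sessions.
    """
    b0 = mean(baseline)
    times = list(range(len(intervention)))
    t_bar, y_bar = mean(times), mean(intervention)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, intervention))
    b2 = sxy / sxx                  # slope during the intervention phase
    b1 = (y_bar - b2 * t_bar) - b0  # shift in level at the first session
    return b0, b1, b2


# Hypothetical percent-correct scores for one student.
baseline = [10, 20, 10, 20]
intervention = [70, 75, 80, 85, 90]
print(level_and_trend(baseline, intervention))
```

For these hypothetical data, the baseline level is 15, the shift in level at the first intervention session is 55, and accuracy rises by 5 per session during the intervention.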
After controlling for all other variables, the trend during the use of VMs did not significantly differ by student grade or disability status. However, the current multilevel modeling indicated some notable patterns in the students’ performances. Although the differences were statistically nonsignificant, middle and high school students with disabilities demonstrated slightly higher average trends during the intervention phase than elementary school students. This finding is consistent with Peltier et al.’s (2020) study, in which secondary students with disabilities or mathematics difficulties exhibited larger effects than those at the elementary level when using mathematical manipulatives.
Furthermore, consistent with the moderation effect on the immediate effect, the growth of students with EBD over time was larger than the growth of students with LD. Although statistically not significant, during the intervention, there was a notable improvement in students with ASD. Students with ASD exhibited a smaller average immediate effect than those with LD. However, during the intervention phase, over time, they exhibited a slightly larger average trend than students with LD.
After controlling for main effects and other moderators, study-level moderators did not show significant interactions with students’ growth of mathematical accuracy over time. Despite the overall nonsignificant moderation effects, some findings were notable. The average trend during the use of VMs was slightly larger when researchers provided pretraining. This finding is consistent with Meyer et al. (2019) regarding the importance of providing pretraining in technology-mediated learning environments to familiarize learners with the use of technology before the actual intervention. In the present study, researchers (e.g., Shin et al., 2017; Xin et al., 2020) who implemented technology-assisted instruction with teachers’ prompts primarily provided pretraining on how to use the device. Other researchers (e.g., Bouck, Shurr, et al., 2020; Saunders, 2014) provided pretraining on both device use and how to solve mathematical problems with task analysis steps. Despite a larger average immediate effect of using technology-assisted instruction with teachers’ prompts, the average trend during the use of VMs was smaller than when using the other instructional methods, either teacher-guided (e.g., Jimenez & Besaw, 2020) or teacher-led mathematics instruction (e.g., Park et al., 2021).
The influence of the developer and device type of VMs on trends during the use of VMs was contrary to their influence on the immediate effects. Students with disabilities demonstrated slightly more positive growth when they used researcher-developed VMs than when they used commercially developed VMs. Although the difference was nonsignificant, the trend during the use of VMs was more positive with the use of an iPad than with the use of a computer. The multirepresentation VM type was more effective than the single-representation and tutorial types in improving students’ mathematical growth over time. Moyer-Packenham and Westenskow (2013) pointed out various desirable features of VMs, such as allowing students to simultaneously link two or more representations (e.g., graphic or numeric information). Zbiek et al. (2007) also noted that with the implementation of interactive and dynamic technology tools, such as VMs, students can increase their representational fluency.
The effect of the visual model embedded in VMs on trends during the intervention phase differed from the visual model’s impact on the immediate effects. Although the base-10 block model was associated with a slightly higher average immediate effect than the set model, the set model was more effective than the base-10 block in improving students’ mathematical growth during the intervention. Furthermore, although the incorporation of multiple models led to a significantly lower immediate effect than the set model, a slightly higher average trend was found for the use of multiple models during the intervention phase. These deviations in the findings corroborate the need to examine trends during the use of VMs, even after the immediate effects have been observed following the introduction of mathematics interventions.
Limitations and Future Research
The current study has several limitations that should be taken into account in future research. First, the 35 included studies used various SCDs (multiple-baseline and multiple-probe designs across subjects or behaviors and alternating-treatment designs). This variety indicates that studies can demonstrate intervention effects across various student populations and mathematics topics, depending on the study design. Thus, future research needs to consider how students’ immediate learning and academic growth can differ by type of SCD. Furthermore, for Reneau’s (2012) study, we disaggregated data and selectively coded only the two students with disabilities, excluding the other three cases of students without disabilities. When we evaluated the studies’ QIs by CEC (2014), we considered these variations across studies. In addition, the results of the QI evaluation can vary according to different QI guidelines. In future research, QIs should be further compared across multiple guidelines to validate whether consistent results are obtained across various QI standards (e.g., CEC, What Works Clearinghouse).
Second, the included studies focused on a limited range of students, which limits the generalization of the current study’s findings across students from various demographics. Specifically, the researchers in the included studies primarily targeted middle school students (n = 20) rather than elementary school students (n = 8) or high school students (n = 7). Furthermore, the participants in the studies were predominantly White and had LD, ASD, or ID. There is a need to investigate the effects of using VMs with students from other racial and ethnic groups (e.g., Black, Hispanic, Asian) and with students who have other types of disabilities (e.g., EBD, OHI).
Third, there was a limited focus on CCSSM content standards (NGA Center & CCSSO, 2010). Of the 35 studies, researchers most frequently targeted numbers and operations–fractions (n = 10) or numbers and operations in base 10 (n = 7). However, limited research has been conducted on other mathematics concepts and skills, including expressions and equations, the number system, geometry, ratios, and proportional relationships.
Fourth, because the primary concern of the current study was to examine the average level and trend changes during the use of VMs over time, we did not conduct formal visual analysis for each study (e.g., What Works Clearinghouse, 2017). As a result, whether a functional relation exists between mathematics interventions with VMs and mathematical accuracy for each study and case remains unknown. Future syntheses need to conduct a formal visual analysis (level, trend, variability, immediacy of the effect, overlap, and consistency of data in similar phases) for each study and case, assessing data patterns within and across phases (What Works Clearinghouse, 2017).
Last, in the current multilevel modeling, pretraining and instructional methods were found to have no statistically significant moderating effects on either the immediate effects or the trends during the use of VMs. However, these results could have been influenced by the relatively small sample sizes. The model may have lacked sufficient power to detect interactions between these moderators and each intervention effect (Aguinis, 1995). In future research, researchers should continue to implement VMs, comparing the effects of the different purposes of pretraining before the intervention and exploring the influence of various types of instructional methods in the educational technology environment (teacher-guided, teacher-led, or technology-assisted instruction) in an effort to further validate the role of moderators.
Practical Implications
The results of this study revealed that the use of VMs during mathematics instruction with students with disabilities had an immediate positive influence on mathematical accuracy and supported students’ mathematical growth over time during mathematics interventions. The effects of student characteristics and study feature–related moderators on immediate effects and trends during the use of VMs were somewhat different. For example, commercially developed VMs and computer devices were associated with significantly larger immediate effects on mathematical accuracy than researcher-developed VMs and iPads. However, when observing students’ mathematical growth over time during the use of VMs, the opposite results were found. Thus, when planning mathematics interventions incorporating VMs, teachers need to consider both immediate improvement and gradual growth over time and accommodate various design features in their mathematics lessons.
As learning tools, VMs provide benefits for students with disabilities. As an alternative to physical manipulatives, VMs are considered practical tools for teaching mathematics to students both with and without disabilities in inclusive settings (Bouck & Park, 2018; Moyer-Packenham & Westenskow, 2013). The findings of the present study highlighted favorable effects on students’ growth over time as well as increased mathematical accuracy through the use of VMs. VMs can be used flexibly during mathematics instruction because they provide objects in various sizes and colors and in unlimited quantities.
VMs are a promising and viable option for applying technology in K–12 schools in a flexible and convenient way. Furthermore, several VMs are free or low-cost, enabling easy access for educators for planning differentiated instruction for diverse learners. As previously noted, various types and visual models embedded in VMs can be used flexibly in mathematics interventions. Teachers can apply authoring tools (e.g., SMART Notebook software) to create interactive materials (Root et al., 2020) across mathematical domains. From the perspective of assistive technology, VMs can provide a new means of representation that interactively supports students’ visualization of mathematical concepts.
Supplemental Material
Supplemental material, sj-zip-1-ecx-10.1177_00144029211007150 for Effects of Using Virtual Manipulatives for Students With Disabilities: Three-Level Multilevel Modeling for Single-Case Data by Mikyung Shin, Jiyeon Park, Rene Grimes and Diane P. Bryant in Exceptional Children
Footnotes
Authors’ Note
Our work was supported by the Killgore Faculty Research Grant, West Texas A&M University.
