Abstract
Single-case experimental designs (SCEDs) are frequently used to evaluate whether a functional relation exists between interventions and student outcomes. A critical factor in decision making is the evaluation of graphical data, typically displayed in time-series graphs. Distortion in the graphical display of data can lead to invalid decisions on whether a functional relation exists, as well as to overestimation of the magnitude of an effect. Previous research has identified two potentially analysis-altering elements that, when manipulated, alter visual analysts’ decisions regarding the presence of a functional relation and the magnitude of effect. The purpose of this review was to evaluate the graphical display of data from SCEDs in the field of emotional and behavioral disorders (EBD). The review covered 40 SCEDs, including 258 graphs, published in Behavioral Disorders and Journal of Emotional and Behavioral Disorders over the last 10 years (2010–2019). We identified large variation in the axis proportions of reviewed graphs, as measured using standardized x:y and the data points per x- to y-axis ratio (DPPXYR). A majority of graphs included an ordinate scaling procedure that aligns with findings from preliminary research on this analysis-altering element. We provide recommendations to the field on designing graphs to enhance the validity of visual analysis.
Single-case experimental designs (SCEDs) are a collection of flexible and rigorous experimental procedures used to evaluate effects of interventions on socially important outcomes for individuals. SCEDs originated from the fields of experimental analysis of behavior and applied behavior analysis (Barlow & Hersen, 1973); however, the use of SCEDs has spread to a variety of fields: special education (Horner et al., 2005), counseling (Lundervold & Belwood, 2000), school psychology (Radley et al., 2020), and medicine (Percha et al., 2019). A unique contribution of SCEDs is the ability for practitioners and researchers to evaluate participant responsiveness to interventions in applied settings (Maggin et al., 2018; Rick, 1976). SCEDs are a rigorous methodology used to determine whether a functional relation is present between an intervention and student outcome(s) (Horner et al., 2005; Odom et al., 2005). Because of the complex behavioral needs of students with emotional and behavioral disorders (EBD), SCEDs are needed to provide the field with a nuanced view of intervention effectiveness at the individual student level (Conroy et al., 2008; Spear et al., 2013). SCEDs also play a critical role in the identification of evidence-based practices (EBPs) in special education (Garwood et al., 2019; Hagan-Burke et al., 2007) and their inclusion in systematic reviews of intervention research for students with EBD (Sreckovic et al., 2017; Watts et al., 2019) to inform what works, for whom, and under what conditions. For the field of EBD, it is imperative that the identification of EBPs is grounded in rigorous methodology reducing threats of Type I and Type II errors.
Analysis-Altering and Aesthetic-Altering Elements of Graph Construction
Most frequently, time-series graphs are used to report outcomes from SCEDs (Cooper et al., 2020; Kratochwill et al., 2014; Odom et al., 2018). In constructing a time-series graph, there are many different components that the designer must consider (e.g., x-axis scaling, y-axis scaling, line thickness, size and shape of data points, labels). Dart and Radley (2018) proposed a scheme for thinking about these components as either analysis-altering or aesthetic-altering. Analysis-altering components are defined as those elements that, when manipulated, alter a visual analyst’s decision about the presence of a functional relation and the magnitude of treatment effect. Aesthetic-altering components are defined as those elements that affect the ease with which a graph can be interpreted. Aesthetic-altering elements, when altered, are not expected to sway a visual analyst’s decision on mean level, trend, or variability of the data series and thus should not impact decisions regarding a functional relation and magnitude of treatment effect. Although standardization of aesthetic-altering components is recommended (Dart & Radley, 2018), we focused our discussion on analysis-altering components because of their impact on visual analysis.
X:Y Ratio
Several sources have provided recommendations for an appropriate x:y axis ratio that will minimize the distortion of data, with an x:y ratio between 8:5 and 3:2 (i.e., standardized x:y ratio between 1.6 and 1.5) as preferred (Cooper et al., 2020; Parsonson & Baer, 1978). However, to date, we are unaware of empirical literature that demonstrates the x:y ratio impacts results of visual analysis. Prior research has evaluated the x:y proportion of SCEDs. Kubina and colleagues (2017) selected 11 journals publishing work in the field of behavior analysis (n = 6) and behavior related to education (n = 5). Neither Behavioral Disorders (BD) nor the Journal of Emotional and Behavioral Disorders (JEBD) were included. The team randomly selected one issue from every 2-year block from the journal inception through 2011. A total of 4,313 graphs were evaluated as part of the study. The research team did not provide the distribution of x:y ratios per graph but reported only 15% of graphs fell between x:y ratio of 8:5 and 4:3 (i.e., standardized x:y ratio between 1.6 and 1.5).
Ledford and colleagues (2019) evaluated the graphical display of SCEDs published in 12 journals in the field of special education. Neither BD nor JEBD was included in this review. The team included published articles using an SCED with time-series graphs from the previous 5 years; a total of 267 articles including 1,123 graphs were evaluated. The team investigated the x:y ratio across types of SCEDs and found that mean ratios were similar for multiple-baseline designs (standardized x:y = 2.91), multiple-probe designs (standardized x:y = 2.76), and ABAB designs (standardized x:y = 2.88). Alternating treatment designs (standardized x:y = 1.99) and adapted alternating treatment designs (standardized x:y = 2.08) produced smaller ratios, meaning the length of the x-axis was shorter in comparison with the height of the y-axis.
Data Points per X- to Y-Axis Ratio (DPPXYR)
Despite several sources emphasizing the importance of axis proportions to visual analysis and providing the recommendation to set the x:y ratio between 8:5 and 3:2 (i.e., standardized x:y ratio between 1.6 and 1.5; Cooper et al., 2020; Parsonson & Baer, 1978), no experimental evidence was available to substantiate the claim or recommendation. Radley and colleagues (2018) aimed to address this need. The team highlighted why the number of data points on the x-axis should be considered in addition to the x:y ratio. The team developed the DPPXYR, which takes into account the x:y ratio and the number of possible data points that could be plotted along the x-axis. The following formula can be used to calculate the DPPXYR: (length of x/length of y)/possible number of data points to be plotted on x-axis. The team identified that the mean DPPXYR for AB cases within multiple-baseline designs was 0.14. The team then created eight unique data sets and manipulated the graphs to display five different DPPXYR values (i.e., –1.0 SD, –0.5 SD, DPPXYR = 0.14, +0.5 SD, +1.0 SD) for a total of 40 graphs. A total of 29 experienced visual analysts evaluated graphs to determine whether a functional relation was present, and if so, their confidence in this decision on a 0–10 scale. Results suggest that reducing the DPPXYR to 0.5 and 1.0 SD below the mean increased visual analysts’ confidence in attributing a functional relation, even for graphs constructed to represent no functional relation and a small effect. Radley and colleagues (2018) reported that the optimal range for accurate visual analysis was a DPPXYR between 0.14 and 0.16, with 0.14 as the bare minimum to reduce Type I error rates. This study provides preliminary evidence that axis proportions (as measured via DPPXYR) can impact visual analysis.
Future research needs to evaluate whether this finding replicates to real data sets and whether this holds true for other commonly used SCEDs (i.e., ABAB design, alternating treatment design). For a concrete example, we refer readers to the top graph (i.e., DPPXYR = 0.16) and the bottom-left graph (i.e., DPPXYR = 0.08) in Figure 1. As the DPPXYR value is reduced, the trend of data is distorted and appears steeper, which potentially influences visual analysts’ decisions regarding a functional relation and the magnitude of intervention effect.
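The DPPXYR formula can be expressed as a short Python sketch. The function name and the axis lengths below are illustrative assumptions, not values from Radley et al. (2018) or from any reviewed graph; the lengths were chosen only so that the two results equal the DPPXYR values given for the Figure 1 panels (0.16 and 0.08).

```python
def dppxyr(x_length: float, y_length: float, possible_points: int) -> float:
    """(length of x / length of y) / possible number of data points on the x-axis."""
    if x_length <= 0 or y_length <= 0 or possible_points <= 0:
        raise ValueError("axis lengths and possible_points must be positive")
    return (x_length / y_length) / possible_points


# Hypothetical panels: same y-axis height and same 20 possible sessions,
# but two different x-axis lengths (any consistent unit works, e.g., pixels).
wide = dppxyr(x_length=640, y_length=200, possible_points=20)    # 3.2 / 20
narrow = dppxyr(x_length=320, y_length=200, possible_points=20)  # 1.6 / 20

print(round(wide, 2), round(narrow, 2))  # 0.16 0.08
```

Halving the x-axis length while holding the y-axis height and the number of possible data points constant halves the DPPXYR, which is the kind of horizontal compression that makes a trend look steeper.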

Figure 1. Examples of a truncated y-axis and manipulated DPPXYR using an identical data set.
Ordinate Scaling
Ordinate scaling is one element identified as a potential analysis-altering component. In an early evaluation of interrater reliability (IRR), 10% of visual analysts commented that the scaling of the ordinate affected their visual analysis process (De Prospero & Cohen, 1979). Dart and Radley (2017) investigated this empirically. They asked 32 experienced visual analysts to evaluate 32 graphs representing ABAB designs. The researchers systematically manipulated the ordinate scaling of the graphs to determine the degree to which this characteristic influenced visual analysis. Specifically, the researchers displayed the data on an ordinate scale representing the total possible range (0%–100%) and various restricted ranges (i.e., 0%–80%, 0%–60%, 0%–40%). The team found that a restricted ordinate scale was associated with an overestimation of treatment effect. Type I errors increased as the truncation of the ordinate increased: 4.7% Type I error rate for 0% to 80%, 6.3% Type I error rate for 0% to 60%, and 21.9% Type I error rate for 0% to 40%. Although this is an important finding to inform graph construction, further research needs to evaluate whether it replicates to real data sets, whether it holds true for other commonly used SCEDs (e.g., multiple-baseline design, alternating treatment design), and whether ordinate scaling would impact visual analysts’ decisions for data without a natural upper bound limit (e.g., rate, count). For a concrete example, we refer readers to the top graph (i.e., ordinate scaled to 0%–100%) and the second graph (i.e., ordinate truncated to 0%–70%) in Figure 1. As the ordinate scale is truncated, the trend of data is distorted and appears steeper, which potentially influences visual analysts’ decisions regarding a functional relation and the magnitude of intervention effect.
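One way to see why truncation makes a trend appear steeper is to compute the angle at which a line segment is drawn on the page. The sketch below assumes linear axes and a rectangular plotting area; the function name, axis dimensions, and data change are hypothetical and are not taken from Dart and Radley (2017). It describes drawing geometry only and says nothing about how analysts respond to it.

```python
import math


def rendered_angle(dy, dx, y_range, x_range, height, width):
    """Angle (degrees) at which a change of dy over dx sessions is drawn.

    y_range and x_range are the spans of the axes in data units;
    height and width are the axis lengths in display units.
    """
    rise = dy / y_range * height   # vertical distance on the page
    run = dx / x_range * width     # horizontal distance on the page
    return math.degrees(math.atan2(rise, run))


# Hypothetical trend: a 20-point gain (e.g., 10% to 30%) across 10 of 20 sessions,
# drawn in the same 400 x 200 plotting area with two ordinate scales.
full = rendered_angle(dy=20, dx=10, y_range=100, x_range=20, height=200, width=400)
truncated = rendered_angle(dy=20, dx=10, y_range=40, x_range=20, height=200, width=400)

print(f"0%-100% ordinate: {full:.1f} degrees")       # 11.3 degrees
print(f"0%-40% ordinate:  {truncated:.1f} degrees")  # 26.6 degrees
```

With identical data, cutting the ordinate range from 100 to 40 multiplies the drawn rise by 2.5, so the same trend is rendered at more than twice the angle.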
Purpose of the Current Study
The simplicity of displaying data in graphical form and making decisions has its allure; however, the use of visual analysis has received warranted criticism from the field (Byun et al., 2017; Wolfe et al., 2019). In an early study, De Prospero and Cohen (1979) found low IRR (M = .61) among 250 visual analysts’ evaluations of 36 ABAB designs. This finding was supported by a later study by Gibson and Ottenbacher (1988), who found low IRR (range = .52–.66) across visual analysts’ evaluations of 24 graphs representing an AB design. Extending the previous study, Ottenbacher (1993) used graphs from 14 studies and asked 789 analysts to evaluate the graphs for a functional relation; mean IRR was .58, which corresponded with previous experiments. Last, Ninci and colleagues (2015) conducted a meta-analysis that included 19 studies published between 1984 and 2010 and reported that mean IRR across studies was .76. Collectively, these studies highlight low IRR among visual analysts, which in turn threatens the perceived credibility of SCEDs. The field is investigating ways to improve visual analysis through a variety of avenues: (a) identifying effective training procedures for visual analysts (Wolfe & Slocum, 2015), (b) creating structured visual analysis protocols to aid the process (Ledford et al., 2018), and (c) identifying statistical procedures and tools to aid in determining a functional relation (Manolov & Vannest, 2019). However, another critical element that will aid in protecting the integrity of the visual analysis process and needs further attention is graph construction. We aimed to provide empirical data to fill this void.
This current brief report aimed to provide additional evidence on the graphical display of SCEDs. First, we evaluated the graphs reported in SCEDs published in the two flagship journals (i.e., BD and JEBD) for students with EBD. BD and JEBD were not included in either previous review of graphical display of SCEDs (Kubina et al., 2017; Ledford et al., 2019). SCEDs are used frequently in the field of EBD to evaluate intervention effects; thus, this brief report would provide usable information for researchers and editorial board members. Second, we evaluated the method used to scale the ordinate, which Kubina et al. (2017) and Ledford et al. (2019) did not address. Truncating the y-axis is proposed as a potential analysis-altering element that can increase Type I error rates for visual analysts (Dart & Radley, 2017). Last, we provide the distribution of x:y proportions using standardized x:y and the DPPXYR. The two previous reviews did not evaluate graph construction using the DPPXYR, which has preliminary evidence as a potential analysis-altering element (Radley et al., 2018). For the current brief report, we evaluated two journals (BD and JEBD) focused on students with EBD during the years 2010–2019. Although this is a more limited sampling of journals than previous reviews (Kubina et al., 2017; Ledford et al., 2019), we evaluated a full 10 years (cf. Ledford et al., 2019, evaluated 5 years) and evaluated every issue published in the journals (cf. Kubina et al., 2017, randomly selected one issue per year per journal). We believe targeting the two journals and year span allows us to gather a sample that provides usable information to our targeted population of SCED researchers in the field of EBD. The following research questions informed our investigation:
Method
Search Procedure
Data from the current project were collected as part of another systematic review synthesizing intervention research published in BD and JEBD from 2010 to 2019 (Garwood et al., 2020). Stage 1 of the process included title and abstract screening. We accessed both journals’ table of contents using PsycINFO and read the Title/Abstract of every article published from 2010 (BD: Volume 35, Issue 2; JEBD: Volume 18, Issue 1) through 2019 (BD: Volume 45, Issue 1; JEBD: Volume 27, Issue 4). We reviewed 370 articles, excluding editorial articles, for possible inclusion. Stage 2 involved screening articles to identify whether they met our inclusion criteria: (a) used an SCED, (b) reported data in a graphical display, and (c) included at least one student with or at risk for EBD. We identified 40 SCEDs, including 258 graphs (i.e., ABAB, alternating treatment design, or an AB as part of a multiple-baseline design), that met our inclusion criteria. A total of 25% of full texts were evaluated by two screeners; inter-observer agreement (IOA) for inclusion criteria was 100%.
Coding for Graph Construction
We developed a coding guide to extract relevant characteristics of the graphical display (available upon request). First, two aspects of the x:y proportion were estimated. The x:y ratio was estimated and standardized to report the ratio of the x-axis to 1 unit y, and the DPPXYR was calculated (see Procedures for Standardized X:Y and DPPXYR Estimation section for this process). Next, two aspects of the dependent variable(s) were coded: (a) a description of the dependent variable(s) provided on the ordinate and (b) the type of data reported (i.e., percent, count, rate). Last, we coded two aspects of the ordinate: (a) the scale reported on the graph and (b) our interpretation of how the ordinate maximum value was set. The first author coded 100% (k = 40, n = 258) of the graphs, and the second author coded 22.5% of studies (k = 9) that included 32.9% of graphs (n = 85). To calculate IOA for standardized x:y ratio and DPPXYR, we used a modified partial agreement procedure (smaller value/larger value × 100). To calculate IOA for all other variables, we used exact agreement. IOA was high for each variable: standardized x:y ratio (M = 98.3% [1.99%], range = 87.25%–100%), DPPXYR (M = 98.05% [2.40%], range = 86.3%–100%), dependent variable(s) description (M = 90.59%), dependent variable data type (M = 95.29%), and scale of ordinate (M = 96.47%). Eight disagreements were identified for dependent variable description: three from Ennis et al. (2017) on correct word sequences versus total words, and five from Mason et al. (2010) on quality versus number of total parts. For the dependent variable data type, four disagreements were identified from Mong et al. (2011) on count versus rate. For the scale of the ordinate, there were three disagreements identified, all on Ennis et al. (2017). All disagreements were discussed until a consensus was reached.
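The two agreement procedures named above can be sketched as follows. This is a minimal illustration, not the authors' scoring script: the function names, the example ratio values, and the category labels are hypothetical, and the sketch assumes positive numeric values for the partial agreement calculation.

```python
def partial_agreement(value_a: float, value_b: float) -> float:
    """Modified partial agreement: smaller value / larger value x 100."""
    if value_a <= 0 or value_b <= 0:
        raise ValueError("values must be positive")
    return min(value_a, value_b) / max(value_a, value_b) * 100


def exact_agreement(codes_a: list, codes_b: list) -> float:
    """Percentage of graphs on which both coders recorded the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("code lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a) * 100


# Hypothetical pair of standardized x:y estimates for one graph
print(round(partial_agreement(2.45, 2.50), 1))  # 98.0

# 85 double-coded graphs with 8 disagreements, as reported for the
# dependent variable description (labels "A" and "B" are placeholders)
coder_1 = ["A"] * 85
coder_2 = ["A"] * 77 + ["B"] * 8
print(round(exact_agreement(coder_1, coder_2), 2))  # 90.59
```

Applied to the reported disagreement counts, exact agreement over 85 graphs gives 90.59% for eight disagreements, 95.29% for four, and 96.47% for three, which matches the values listed above.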
Procedures for Standardized X:Y and DPPXYR Estimation
We followed the procedures described by Gould and colleagues (2018) to estimate the standardized x:y ratio and DPPXYR. First, we took a screenshot of each graph using a Mac (Shift-Command-4) that spanned the length of the x-axis and height of the y-axis for each ABAB design, alternating treatment design, or individual AB phase contrast within a multiple-baseline design or multiple-probe design. Each screenshot was saved as a .png file to a folder specified for the project. To obtain the length of the x- and y-axis, we coded the dimensions of the screenshot, available by right clicking and selecting get info, and entered these values into an Excel sheet. We divided the x value by the y value to standardize the ratio to 1 y: # of units x. Then, we used the standardized x:y ratio obtained for the graph and divided it by the possible number of data points that could have been plotted on the x-axis given the scale to estimate the DPPXYR (see formula provided in the introduction). To determine the number of possible data points that could be plotted along the x-axis, we replicated procedures by Radley et al. (2018). We consulted the x-axis to determine the maximum value on the axis. If a break was present in the axis, the possible number of data points was determined by counting possible locations for data to fall along the x-axis; thus, the possible number of data points was used and not the number of data points plotted.
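The same estimate could be scripted rather than read from the Get Info panel. The sketch below is a substitute for, not a description of, the manual procedure above: it reads the pixel width and height stored in the IHDR chunk at the start of a .png file and applies the two divisions. The function names, the synthetic header, the 756 × 300 pixel size, and the 18 possible data points are all hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes):
    """Return (width, height) in pixels from the first 24 bytes of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("data does not start with a PNG signature and IHDR chunk")
    # Width and height are 4-byte big-endian integers at offsets 16 and 20.
    return struct.unpack(">II", data[16:24])


def estimate_ratios(data: bytes, possible_points: int):
    """Return (standardized x:y, DPPXYR) for a screenshot cropped to the axes."""
    width, height = png_dimensions(data)
    standardized_xy = width / height          # units of x per 1 unit of y
    return standardized_xy, standardized_xy / possible_points


# Synthetic header standing in for a cropped screenshot (hypothetical size).
# For a real file: data = open("graph.png", "rb").read(24)
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 756, 300)

xy, dpp = estimate_ratios(header, possible_points=18)
print(round(xy, 2), round(dpp, 2))  # 2.52 0.14
```

Because both values are ratios, the result does not depend on whether the screenshot was captured at standard or high pixel density, provided the crop spans exactly the x-axis length and y-axis height.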
Results
Distribution of Standardized x:y Ratios
The mean value of x, for standardized x:y ratio, was 3.00 (SD = 1.80) and the median was 2.52. The range across graphs was large: 1.24–9.66. The distribution of x values was positively skewed (skewness = 1.91 [SE = 0.15]) with extreme values at the tails (kurtosis = 3.38 [SE = 0.30]). The largest share of x values for the standardized x:y ratio (n = 99, 38%) fell between 1.00 and 1.99. The supplemental files contain two histograms of standardized x:y with varying levels of specificity and a scatterplot displaying the distribution of x:y ratio by year. Mean standardized x:y values are reported by study in Table 1.
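The reported standard errors depend only on the number of graphs. Assuming they were computed with the conventional small-sample formulas for the standard error of skewness and of kurtosis (the source does not name the software or formulas used), n = 258 gives values that agree with those reported above. The function names are illustrative.

```python
import math


def se_skewness(n: int) -> float:
    """Conventional standard error of sample skewness for n observations."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n: int) -> float:
    """Conventional standard error of sample excess kurtosis for n observations."""
    return 2 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


n_graphs = 258  # number of graphs in the review
print(f"{se_skewness(n_graphs):.2f}")  # 0.15
print(f"{se_kurtosis(n_graphs):.2f}")  # 0.30
```

Dividing each statistic by its standard error (e.g., 1.91 / 0.15) gives a rough sense of how far the distribution departs from symmetry, which supports the description of the x values as positively skewed.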
Table 1. Standardized X:Y and DPPXYR per Study.
Note. Cooper et al. (2020) recommended standardized x:y between 1.5 and 1.6. Radley et al. (2018) recommended a DPPXYR between 0.14 and 0.16. DPPXYR = data points per x- to y-axis ratio.
The most frequently used design type was multiple-baseline design or multiple-probe design (62.5%), followed by ABAB design (15%) and alternating treatment design (15%). A total of 25 studies, including 162 graphs, used a multiple-baseline design or multiple-probe design across participants, settings, or behaviors/skills. The mean standardized x:y ratio across AB graphs included in multiple-baseline designs was 3.32 (SD = 1.57). The data were positively skewed with large variation identified across graphs (see Figure 2). A total of six studies, including 25 graphs, used an ABAB design. The mean standardized x:y ratio across ABAB graphs was 2.07 (SD = 0.45). The data were slightly positively skewed and had smaller variation across graphs than multiple-baseline designs (see Figure 2). A total of six studies, including 49 graphs, used an alternating treatment design. The mean standardized x:y ratio across alternating treatment design graphs was 1.74 (SD = 0.45). The data were slightly negatively skewed and had smaller variation across graphs than multiple-baseline designs, and similar variation to ABAB designs (see Figure 2).

Figure 2. Distribution of standardized x:y by type of single-case research design.
Distribution of DPPXYR
The mean value of DPPXYR was 0.12 (SD = 0.09) and the median was 0.10. The range across graphs was large: 0.02–0.41. The distribution of DPPXYR values was positively skewed (skewness = 1.57 [SE = 0.15]) with extreme values at the tails (kurtosis = 2.02 [SE = 0.30]). The largest share of DPPXYR values (n = 65, 25.2%) fell between 0.10 and 0.15. The supplemental files contain two histograms of DPPXYR with varying levels of specificity and a scatterplot displaying the distribution of DPPXYR by year. Mean DPPXYR values are reported by study in Table 1.
The mean DPPXYR both within and across design types varied. The mean DPPXYR across AB graphs included in multiple-baseline designs was 0.14 (SD = 0.09). The data were positively skewed with large variation identified across graphs (see Figure 3). We identified that 101 of 162 graphs (62.3%) fell below the preliminary minimum DPPXYR (0.14) proposed by Radley and colleagues (2018) to reduce Type I error rates. The mean DPPXYR across ABAB graphs was 0.07 (SD = 0.02). The data were slightly positively skewed with relatively consistent values observed across graphs. The mean DPPXYR was substantially smaller than for graphs reported in the multiple-baseline design family. All ABAB graphs fell below the proposed minimum DPPXYR (0.14), although this recommendation was based on findings from multiple-baseline designs and has not been experimentally tested for ABAB designs. The mean DPPXYR across alternating treatment designs was 0.07 (SD = 0.03). The mean value was comparable with ABAB designs and substantially smaller than multiple-baseline designs. The data were not skewed, although larger dispersion was identified than for ABAB designs. All alternating treatment design graphs fell below the proposed minimum DPPXYR (0.14), although this recommendation was based on findings from multiple-baseline designs and has not been experimentally tested for alternating treatment designs.

Figure 3. Distribution of DPPXYR by type of single-case research design.
Ordinate Scaling
The most frequent type of data measurement procedure for the dependent variables was the use of percent (n = 113 graphs). A promising finding was that 110 of the 113 graphs using percent scaled the ordinate from 0% to 100%. Three graphs included a truncated ordinate with the maximum value set at 60% (n = 2) and 80% (n = 1). The second most frequent type of data measurement procedure for the dependent variables was a count procedure (n = 100). In many of the graphs (n = 42) that included a count procedure, there was a natural upper bound maximum value because scores were based on a rubric. This would allow the team to set an ordinate max representative of the total possible range of values, much like percent data. Of the 42 graphs that used count with an upper bound limit, 31 (73.8%) set the ordinate max to align with the maximum score of the dependent variable. Research teams in six (14.3%) of these graphs set the ordinate max higher than the maximum possible value. Five graphs (11.9%) included a truncated ordinate with the max value set at the maximum observed value. For count data without a natural upper bound limit, the most common procedure employed was to set the ordinate maximum value slightly above the maximum observed value for the individual case. Rate was the last type of data measurement procedure, used in 52 graphs. The most frequent method for setting the ordinate max was to set the maximum value slightly above the maximum observed value.
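For dependent variables with a natural upper bound, the three ordinate-maximum outcomes described above amount to a simple comparison. The sketch below is one hypothetical way to express that comparison and tally it; the function name, category labels, and the (ordinate maximum, maximum possible score) pairs are placeholders constructed to mirror the reported counts for the 42 bounded count graphs, not the coded data set.

```python
from collections import Counter


def classify_ordinate_max(ordinate_max: float, possible_max: float) -> str:
    """Label a graph by comparing its ordinate maximum with the DV's maximum possible score."""
    if ordinate_max == possible_max:
        return "matches maximum possible"
    if ordinate_max > possible_max:
        return "above maximum possible"
    return "truncated"


# Placeholder pairs: 31 matching, 6 above, and 5 truncated graphs (42 total)
graphs = [(10, 10)] * 31 + [(12, 10)] * 6 + [(8, 10)] * 5

counts = Counter(classify_ordinate_max(o, p) for o, p in graphs)
shares = {label: round(100 * k / len(graphs), 1) for label, k in counts.items()}
print(shares)
```

The resulting shares (73.8%, 14.3%, and 11.9%) sum to 100% across the 42 graphs, consistent with the percentages reported in this section.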
Discussion
SCEDs have a robust history and serve a critical role in guiding research and practice for students with EBD (Maggin et al., 2016; Schloss et al., 1981; Skiba & Casey, 1985). Accurate and reliable visual analysis of time-series data is necessary to examine the findings from SCEDs. Visual analysis is used to determine the presence of causal relations between independent and dependent variables as well as the magnitude of effects (Kratochwill et al., 2013). Preliminary research has identified two potential analysis-altering elements that transform data impacting conclusions drawn from visual analysis (Dart & Radley, 2017, 2018; Radley et al., 2018). These two elements are the scaling of the ordinate axis and the DPPXYR, a metric to evaluate the proportions of the x:y axis along with the number of potential data points. We focused our investigation on these two elements when evaluating all SCED graphs printed in BD and JEBD from 2010 to 2019 (see Table 2 and Table 3).
Characteristics of the Dependent Variable(s) and Scaling of the Ordinate Axis.
Note. DV = dependent variable; CWPM = correct words read per minute; OTRs = opportunities to respond; CPMP = correct problems per minute; EPM = errors per minute; ODRs = office discipline referrals; CDPM = correct digits per minute; BSP = behavior-specific praise; CWS = correct word sequences; DPR = daily progress report; PECS = picture exchange communication system.
One potential analysis-altering element we investigated was the DPPXYR. In their original investigation, Radley and colleagues (2018) calculated the DPPXYR for all graphs printed in SCEDs published in five school psychology journals between the years 2010 and 2015. The team identified 205 studies, including 295 panels of graphs to evaluate, which is comparable with the number of graphs included in our present review (n = 258). Radley and colleagues reported that the mean DPPXYR across graphs was 0.12, which is identical to the mean DPPXYR value from our sample. In addition, the mean DPPXYR for graphs reported in multiple-baseline designs in BD and JEBD was 0.14, which is identical to Radley and colleagues’ sample. The mean DPPXYR for ABAB designs was comparable between the BD and JEBD sample (M = 0.07) and Radley and colleagues’ sample (M = 0.08). The one difference identified was that the mean DPPXYR for alternating treatment designs was smaller for the BD and JEBD sample (M = 0.07) compared with Radley and colleagues’ sample (M = 0.10). Given the preliminary data obtained, Radley and colleagues recommended a minimum threshold for DPPXYR (0.14) when constructing panels to be included in multiple-baseline designs. We identified that 101 (62.3%) graphs included in multiple-baseline designs were below this threshold, which may inflate Type I errors for visual analysis. This was not surprising because the DPPXYR was developed and published in 2018, and we included studies published from 2010 to 2019. Further research is needed to substantiate recommendations for constructing graphs included in multiple-baseline designs with a DPPXYR between 0.14 and 0.16. Last, no research has evaluated how the manipulation of the DPPXYR impacts visual analysis for ABAB or alternating treatment designs, two other prevalent designs in the field of EBD.
Given that the DPPXYR was proposed recently, we also investigated the distribution of standardized x:y values across graphs. Although no empirical research is available for standardized x:y, recommendations have been prevalent in the literature for decades (Parsonson & Baer, 1978). In addition, previous reviews of graph construction reported these data, which allowed us to compare the findings. Ledford and colleagues (2019) evaluated 1,123 graphs reported in 267 articles across 12 special education journals (BD and JEBD were not included). Ledford and colleagues reported a smaller mean standardized x:y for multiple-baseline designs (M = 2.91) and multiple-probe designs (M = 2.76) compared with multiple-baseline design graphs in BD and JEBD (M = 3.32). This means that, on average, multiple-baseline design graphs in BD and JEBD were constructed with a longer x-axis relative to the height of the y-axis than those in Ledford and colleagues’ sample. The opposite was identified for ABAB designs; graphs in BD and JEBD had a smaller mean standardized x:y (M = 2.07) compared with Ledford and colleagues’ sample (M = 2.88). The standardized x:y values for alternating treatment designs were more similar, with the BD and JEBD sample showing a smaller value (M = 1.74) compared with Ledford and colleagues’ sample (M = 1.99). Last, it does not appear that researchers publishing in BD and JEBD are using the guidelines of constructing an x:y ratio between 8:5 and 3:2 (Cooper et al., 2020; Kubina et al., 2017; Parsonson & Baer, 1978). We identified that only 3% of graphs fell within this range. This is not surprising because these recommendations do not have empirical support, and Ledford and colleagues (2019) identified that 50 editorial board members reported a preference for graphic displays outside the 8:5 and 3:2 ratios.
A second potential analysis-altering element we evaluated was the ordinate scaling. Dart and Radley (2017) identified that truncating the ordinate to an 80%, 60%, or 40% maximum value inflated Type I error rates, with a larger truncation leading to a larger error rate. The two previous reviews of graph construction did not evaluate the scaling of the ordinate or whether truncation was prevalent. In the BD and JEBD sample, findings suggest that the scaling of the ordinate across a majority of graphs was appropriate. Over 95% (n = 110) of graphs that used percent data set the ordinate from 0% to 100%. Only three graphs truncated the ordinate, setting the ordinate max at 80% (n = 1) and 60% (n = 2). Second, for count data with a natural upper bound limit, 73.8% of graphs set the ordinate max at the maximum possible value, which would also reduce Type I error rates resulting from a truncated graph. Last, for count or rate data without a natural upper bound limit, a majority of graphs set the maximum ordinate slightly above the maximum observed value. Similar to the DPPXYR, the ordinate scaling can only be classified as a potential analysis-altering element. The single study evaluated percent data along the ordinate and only had analysts evaluate ABAB designs. Future research should investigate whether count data with a natural upper bound maximum produce similar findings, as well as expand investigations to other commonly used SCEDs (i.e., multiple-baseline designs, alternating treatment designs). Furthermore, no research has provided recommendations on how to scale an ordinate for data without a natural upper bound maximum.
Limitations
Three limitations are associated with this investigation. First, we focused on the evaluation of SCEDs that implemented an intervention to improve outcomes for students with or at risk for EBD, with an exclusive focus on the outlets of BD and JEBD. These are the flagship journals in the field; however, there are other journals with published studies that would have met our inclusion criteria. Our aim is to reach those conducting SCED research in the field of EBD; thus, the exclusive focus on these two outlets and submission to BD seemed appropriate. Second, we did not evaluate the degree to which the DPPXYR and ordinate scaling influenced conclusions drawn during visual analysis. Instead, we sought only to provide data to describe the distribution of DPPXYR values across graphs and the approaches researchers used to scale the ordinate. Third, we only evaluated graphs from published studies because we were unable to access graphs from investigations rejected during the peer-review process. Thus, we were unable to compare the characteristics of published with unpublished studies (e.g., Do DPPXYR and ordinate scaling differ between published and unpublished SCEDs?).
Implications for the Field
We believe there are several critical implications for the field moving forward. First and foremost, additional research is needed to identify whether preliminary findings regarding the DPPXYR and ordinate scaling replicate. Radley and colleagues (2018) investigated the DPPXYR only for multiple-baseline designs; systematic replication is needed to investigate the DPPXYR across ABAB designs and alternating treatment designs. In addition, ordinate scaling needs to be investigated across design types (i.e., alternating treatment designs, multiple-baseline designs) and across different measurement procedures (i.e., count/rate data). Second, the response-guided approach (Ferron et al., 2017) is the predominant method used within SCEDs to make decisions about when to implement, or withdraw, the intervention. If the research team is evaluating graphed data in real time with a truncated ordinate or a DPPXYR value below 0.14, then the decisions to implement or withdraw an intervention could be inappropriate and could compromise the internal validity of the design. Researchers are encouraged to think critically about graph construction during the data collection process as well as during manuscript preparation. Reviewers and editors should be aware of these potential analysis-altering elements when evaluating graphs displayed in SCEDs. The presentation of graphs, with the intention of reducing Type I error rates, is critical given the increase in systematic reviews in the field of special education to provide recommendations on EBPs (Talbott et al., 2018). Researchers who use empirically supported guidelines for graph construction may enhance the perceived credibility of SCEDs and visual analysis.
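For research teams graphing data in real time, the DPPXYR check mentioned above can be automated. The Python sketch below is illustrative and rests on one assumption: that the DPPXYR is computed as the x-axis length divided by the y-axis length, with that quotient then divided by the number of data points that can be plotted along the x-axis. This definition is not stated in this section and should be verified against Radley and colleagues (2018). The function names are hypothetical; the 0.14 threshold is the value cited above.

```python
def dppxyr(x_axis_length: float, y_axis_length: float, max_points_on_x: int) -> float:
    """Data points per x- to y-axis ratio, under the ASSUMED definition
    (x length / y length) / number of plottable data points on the x-axis.
    Axis lengths may be in any unit as long as both use the same one.
    """
    if x_axis_length <= 0 or y_axis_length <= 0:
        raise ValueError("axis lengths must be positive")
    if max_points_on_x <= 0:
        raise ValueError("max_points_on_x must be positive")
    return (x_axis_length / y_axis_length) / max_points_on_x


def below_threshold(value: float, threshold: float = 0.14) -> bool:
    """True when the DPPXYR falls below the threshold cited in the text."""
    return value < threshold


if __name__ == "__main__":
    # Hypothetical graph: 6-inch x-axis, 2-inch y-axis, room for 20 sessions.
    wide = dppxyr(6.0, 2.0, 20)
    # Hypothetical graph: same y-axis and sessions, but a 4-inch x-axis.
    narrow = dppxyr(4.0, 2.0, 20)
    for label, value in (("wide", wide), ("narrow", narrow)):
        print(f"{label}: DPPXYR = {value:.2f}, below 0.14: {below_threshold(value)}")
```

Under this assumed definition, compressing the x-axis while holding the y-axis and the number of sessions constant lowers the DPPXYR, so a team can check the value before relying on a graph for response-guided decisions.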
Supplemental Material
sj-pdf-1-bhd-10.1177_0198742920982587 and sj-pdf-2-bhd-10.1177_0198742920982587: Supplemental material for Brief Report: Ordinate Scaling and Axis Proportions of Single-Case Graphs in Two Prominent EBD Journals From 2010 to 2019 by Corey Peltier, John W. McKenna, Tracy E. Sinclair, Justin Garwood and Kimberly J. Vannest in Behavioral Disorders
Footnotes
Author’s Note
Tracy E. Sinclair is now affiliated with University of Connecticut.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available on the Behavioral Disorders website with the online version of this article.
References
