Abstract
Behavioral interventions to decrease problem behavior often involve the use of single-case experimental designs in which an individual’s responding during a treatment condition is compared to responding during a control or baseline condition. It is possible that during the initial introduction of treatment, problem behavior continues to occur at baseline rates before behavior reduction is observed; this phenomenon is called a transition state. Brogan et al. (2019) evaluated the prevalence of transition states in the Journal of Applied Behavior Analysis and found that they occurred in 5.3% of published behavior reduction interventions. The current study replicated and extended Brogan et al. by evaluating the prevalence of transition states in unpublished clinical data of patients admitted to an inpatient hospital for the treatment of severe problem behavior. In a retrospective consecutive-controlled case series, transition states were observed in 3% of treatment applications for an average duration of 4.8 sessions. We discuss factors that may affect transitional behavior between phases and relevant implications for practice and research.
Single-case experimental designs (SCEDs) involve the repeated measurement of behavior across time and conditions. Behavioral data are analyzed on an ongoing basis while clinical and experimental analyses are conducted (e.g., Falligant et al., 2022). Visual analysis is the primary method that clinicians and researchers use for evaluating behavioral data within SCEDs (e.g., Baer, 1977; Fisher & Lerman, 2014; Smith, 2012). Clinicians and researchers make important decisions based on their visual interpretation of the data within and across SCED phases, such as conducting additional sessions, beginning new conditions, reversing conditions, or implementing procedural modifications when necessary. Thus, in the applied domain, visual analysis is critically important for making informed judgments about the controlling variables of behavior, and it directly relates to the application, ongoing modification, and efficacy of behavioral treatment evaluations (e.g., Falligant et al., 2022). The principles guiding the visual analysis and interpretation of SCED data are well-established (e.g., Kazdin, 2010), and many have described the advantages of using visual inspection relative to traditional statistical methods with SCED data (cf. Fisher & Lerman, 2014).
Recently, Brogan et al. (2019) proposed that the presence of “transition states” could potentially obscure the interpretation of SCED data. Transition states are the continuation of a baseline response pattern into the subsequent treatment phase. In other words, behavior continues to occur at baseline levels when an intervention is first implemented before a treatment effect is eventually observed. Transition states are problematic from the perspective of visual analysis because they have the potential to mask apparent treatment effects; this may lead to erroneous conclusions that otherwise effective interventions are ineffective (i.e., Type II errors). Moreover, structured criteria and other common methods for quantifying effect sizes (e.g., percentage of nonoverlapping data points; PND) may not account for or eliminate these transition state effects, which could lead to inaccurate conclusions regarding the effects of independent variables on behavior (Brogan et al., 2019).
To evaluate how common this phenomenon is in practice, Brogan et al. (2019) quantified the prevalence of transition states within the studies published in the Journal of Applied Behavior Analysis (JABA) from 1994 to 2014. They found that transition states occurred within 7.4% of the published treatment evaluations. More specifically, transition states were observed in only 5.3% of behavior reduction interventions. These preliminary results are promising and suggest that transition states occur infrequently. However, as described by Brogan et al., a significant limitation of their study is that it analyzed only published data. Data obtained from published studies are often subject to case-selection bias, in which only cases with favorable outcomes are included by authors in their manuscripts submitted for publication. Published data are also subject to potential publication bias, in which journals may be more likely to publish positive findings or those with large effects over negative findings or those with smaller effects (e.g., Gage & Lewis, 2014; Sham & Smith, 2014). It is plausible that published data systematically differ from unpublished data with respect to effect size, series stability, and/or degree of transitory behavior between phases. In other words, published data may be less likely to include transition states compared to unpublished clinical data. Therefore, a more useful representation of the prevalence of transition states is likely found in the evaluation of unpublished clinical datasets. Thus, the purpose of the current study was to replicate and extend the procedures described by Brogan et al. by quantifying the prevalence of transition states in unpublished clinical data.
Data Acquisition, Preparation, and Analysis
The current analysis was conducted via a retrospective consecutive-controlled case series (CCCS; Hagopian, 2020). Data were obtained from the consecutively encountered records of patients admitted to an inpatient program (see Hagopian et al., 2022) for the assessment and treatment of severe problem behavior from 2014 to 2017 (Table 1). This sample included 82 males (73.2%) and 30 females (26.8%), with a median age of 14 years (range: 4–25 years). About 92 participants (88.4%) were diagnosed with autism spectrum disorder (ASD), 101 participants (90.2%) were diagnosed with an intellectual disability, and 89 participants (79.5%) were diagnosed with both ASD and an intellectual disability.
Table 1. Participant Demographics.
Note. ID = Intellectual Disability; F = Female; M = Male; N/A = No race and ethnicity were disclosed or no intellectual disability was diagnosed. Behavioral data for a subset of these participants were described previously (see Falligant, McNulty, Hausman, et al., 2020).
Similar to Brogan et al. (2019), we excluded applications that used response blocking interventions because the response patterns observed are not representative when the opportunity for problem behavior to occur is removed. We also excluded applications that did not include at least one baseline-treatment comparison. Recall, a transition state is the temporary continuation of baseline patterns of responding when a treatment is initially introduced. As such, there needs to be at least one baseline and one treatment condition for a transition state to occur. For reversal and multiple baseline (MBL) designs, baseline-treatment comparisons were only used if the treatment phase immediately followed the baseline phase. If a patient’s treatment evaluation included multiple individual treatments (e.g., ABAC or MBL across three behaviors), then we treated the new baseline-treatment comparisons (e.g., A-C phase comparison, or MBL 2 and MBL 3) as different applications. For multielement designs, a baseline-treatment comparison met criteria if both occurred as separate conditions. If there were multiple treatment conditions in the multielement design, then those were treated as different applications with the same baseline comparison.
Then, we extracted the applicable data series and entered them into an automated spreadsheet (Microsoft Excel 2016) that applied the Brogan et al. inclusion criteria. For an application to meet the inclusion criteria, there had to be a demonstration of (1) experimental control and (2) a treatment effect. A demonstration of experimental control was defined as the presence of three consecutive data points in the baseline phase of a baseline-treatment comparison. A demonstration of a treatment effect was defined as the presence of three consecutive treatment data points occurring below the range of baseline. Therefore, an application was required to have a minimum of three data points in both the baseline and treatment phases to meet either criterion for inclusion in this analysis. If a baseline-treatment comparison did not meet these criteria but had two comparisons of the same conditions (e.g., ABAB), then observers determined whether the second comparison met the inclusion criteria (i.e., the analysis was performed on the second A-B phase).
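The two inclusion criteria can be expressed as a short routine. The following Python sketch is illustrative only and is not the spreadsheet used in the analysis. The function names are hypothetical. It assumes that each phase is a list of per-session response rates, that experimental control amounts to a baseline phase with at least three data points, and that "below the range of baseline" means strictly below the lowest baseline data point.

```python
from typing import Sequence

RUN_LENGTH = 3  # three consecutive data points, per the criteria above


def has_experimental_control(baseline: Sequence[float]) -> bool:
    """Assumed reading: the baseline phase holds at least three data points."""
    return len(baseline) >= RUN_LENGTH


def has_treatment_effect(baseline: Sequence[float],
                         treatment: Sequence[float]) -> bool:
    """True if three consecutive treatment points fall below the baseline range."""
    if not baseline:
        return False
    floor = min(baseline)  # lower bound of the baseline range
    run = 0
    for rate in treatment:
        # Extend the run while points stay below baseline; otherwise reset it.
        run = run + 1 if rate < floor else 0
        if run >= RUN_LENGTH:
            return True
    return False


def meets_inclusion_criteria(baseline: Sequence[float],
                             treatment: Sequence[float]) -> bool:
    """Both criteria must hold for an application to enter the analysis."""
    return (has_experimental_control(baseline)
            and has_treatment_effect(baseline, treatment))


# Hypothetical data series (responses per minute), not study data.
print(meets_inclusion_criteria([8, 10, 9], [9, 8, 3, 2, 1]))  # True
print(meets_inclusion_criteria([8, 10, 9], [7, 9, 6, 5, 9]))  # False
```

In the second hypothetical series, treatment points fall below baseline but never three times in a row, so the application would be excluded under this reading.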
Next, we determined whether transition states occurred in the treatment applications that met the inclusion criteria (see examples in Figure 1). Each treatment evaluation was intended to decrease problem behavior. Therefore, a transition state was defined as problem behavior occurring within the range of baseline in each of the first three treatment sessions. The duration of a transition state was measured by counting the number of initial, consecutive treatment sessions in which problem behavior remained within the range of baseline.
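The transition state definition and its duration measure can be sketched in a few lines of Python. This is a hedged illustration rather than the procedure implemented in the spreadsheet. The function names are hypothetical, and "within the range of baseline" is assumed to mean between the lowest and highest baseline data points, inclusive.

```python
from typing import Sequence


def transition_duration(baseline: Sequence[float],
                        treatment: Sequence[float]) -> int:
    """Count initial, consecutive treatment sessions within the baseline range."""
    if not baseline:
        return 0
    low, high = min(baseline), max(baseline)  # assumed inclusive bounds
    duration = 0
    for rate in treatment:
        if low <= rate <= high:
            duration += 1
        else:
            break  # the first session outside the range ends the count
    return duration


def has_transition_state(baseline: Sequence[float],
                         treatment: Sequence[float],
                         min_sessions: int = 3) -> bool:
    """A transition state requires the first three sessions to stay in range."""
    return transition_duration(baseline, treatment) >= min_sessions


# Hypothetical data series (responses per minute), not study data.
baseline = [8, 10, 9]
print(transition_duration(baseline, [9, 8, 10, 9, 3, 2, 1]))  # 4
print(has_transition_state(baseline, [9, 8, 10, 9, 3, 2, 1]))  # True
print(has_transition_state(baseline, [3, 2, 1, 0]))            # False
```

Because the count stops at the first session outside the baseline range, a later return to baseline levels does not extend the measured duration.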

Figure 1. Hypothetical data series with and without transition states.
Results and Discussion
Across the 112 cases identified between 2014 and 2017, there were 465 behavior reduction treatment evaluations (hereafter referred to as applications). Of the identified applications, 30 incorporated response blocking and 128 did not include a baseline-treatment comparison. Of the remaining 313 applications, 12 did not demonstrate experimental control and 167 did not demonstrate a treatment effect. Therefore, the final sample in the analysis included 75 cases and 134 applications (see Figure 2). Using the same criteria described by Brogan et al., transition states occurred in only 4 applications (3%) and lasted an average of 4.8 sessions. These results are similar to those obtained by Brogan et al., who observed transition states within 5.3% of behavior reduction applications, with an average duration of 4.9 sessions across all applications.

Figure 2. Flow chart of applications included and excluded.
Notably, if there was a single baseline session in which behavior did not occur, then the lower bound of the baseline range would be zero. The treatment condition would therefore have to include three data points below zero to demonstrate a treatment effect. As this is impossible, all applications with a zero session during baseline were excluded from this analysis. We adhered to this methodology to remain consistent with the criteria set by Brogan et al. (2019). However, as a subsequent analysis, we evaluated the same treatment applications and excluded any baseline sessions in which problem behavior did not occur. In the subsequent analysis, a demonstration of experimental control was defined as three consecutive, non-zero data points in the baseline phase of a baseline-treatment comparison; and a demonstration of a treatment effect was defined as the presence of three consecutive treatment data points occurring below the range of non-zero baseline data points. With the modified inclusion criteria of the supplemental analysis, the prevalence of transition states was 3% (9 of 202), with an average duration of 3.6 sessions. These results appear to be consistent with the original analysis that used the procedures described by Brogan et al. Indeed, results of a chi-square analysis revealed no difference in the prevalence of transition states between studies, χ²(1) = 0.25, p = .62.
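The effect of a zero baseline session on the inclusion criteria can be illustrated with a brief Python sketch. It is an assumption-laden illustration, not the spreadsheet logic used in the study. The function and parameter names are hypothetical, response rates are assumed to be non-negative, and "below the range" is assumed to mean strictly below the lowest retained baseline data point.

```python
from typing import Iterable, Sequence

RUN_LENGTH = 3  # three consecutive data points


def longest_run(flags: Iterable[bool]) -> int:
    """Length of the longest streak of consecutive True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def experimental_control(baseline: Sequence[float],
                         drop_zero_baseline: bool = False) -> bool:
    """Original rule: three baseline points. Modified rule: three consecutive non-zero points."""
    if drop_zero_baseline:
        return longest_run(rate != 0 for rate in baseline) >= RUN_LENGTH
    return len(baseline) >= RUN_LENGTH


def treatment_effect(baseline: Sequence[float],
                     treatment: Sequence[float],
                     drop_zero_baseline: bool = False) -> bool:
    """Three consecutive treatment points below the (optionally non-zero) baseline range."""
    reference = [r for r in baseline if r != 0] if drop_zero_baseline else list(baseline)
    if not reference:
        return False
    floor = min(reference)
    return longest_run(rate < floor for rate in treatment) >= RUN_LENGTH


# Hypothetical data series with one zero baseline session, not study data.
baseline, treatment = [0, 4, 6, 5], [5, 2, 1, 0]
print(treatment_effect(baseline, treatment))                           # False
print(treatment_effect(baseline, treatment, drop_zero_baseline=True))  # True
```

Under the original rule the baseline floor is zero, so no non-negative treatment value can fall below it; dropping the zero session raises the floor to 4 and the same series then shows a treatment effect.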
Transition states may mask real treatment effects, leading one to erroneously conclude that an intervention is not effective. Thus, transition states are related to Type II error with respect to treatment effects. Note, it is possible that the transition states observed in the current study reflect false positives insofar as they are associated with natural variation in otherwise orderly data series. Here, “false positives” refers to concluding that a transition state has occurred within a data series when the data merely reflect uncontrolled variation and are not “true” transition states. Again, these false positives should not be confused with the Type II errors that transition states can produce by masking treatment effects. Lanovaz et al. (2019) and Falligant, McNulty, Hausman, et al. (2020) evaluated the proportion of false positives for baseline phases of various lengths across published and unpublished clinical datasets, respectively. Specifically, they divided single-phase baselines into two new phases (i.e., Phases A and B) and quantified whether a change, or false treatment effect, was observed across the phases using the dual-criteria method (Fisher et al., 2003). Although they occurred at relatively low rates, false positives were still observed across both evaluations. Therefore, it is possible that the transition states observed in the current study are not specific to behavior patterns that are observed at the beginning of treatment phases; rather, they may represent false positives that are likely to occur within phases, independent of any context changes. To explore this hypothesis in a supplemental analysis, we extracted the first nine sessions of every baseline series across each application (provided that the baseline series contained at least nine data points). Then, each series was divided into two phases: sessions 1 to 3 constituted Phase A (e.g., baseline) and sessions 4 to 9 constituted Phase B (e.g., treatment).
According to the criteria set by Brogan et al. (2019), transition states are observed within the first three data points of treatment followed by at least three data points that demonstrate a treatment effect. Therefore, the prevalence of transition states was evaluated in sessions 4 to 6 across each baseline application. In this sample, 109 applications had at least nine data points and only one transition state (<1%) was observed. Thus, assuming this is the probability of observing an illusory transition state—one that is not caused by a change in the independent variable, but rather is the product of chance variation within a single data series—the binomial probability of the observed prevalence of transition states in the current study is very low (p < .001).
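The pseudo-phase check on baseline-only series, together with a binomial tail probability, can be sketched as follows. This Python example is a rough illustration under stated assumptions and is not the authors' analysis code. The function names are hypothetical; "within the range" is assumed to be inclusive of the Phase A minimum and maximum, and a treatment effect is assumed to mean that sessions 7 to 9 all fall strictly below the Phase A range. The text does not state which counts entered the binomial test, so the helper is shown only with toy values.

```python
from math import comb
from typing import Optional, Sequence


def illusory_transition_state(baseline: Sequence[float]) -> Optional[bool]:
    """Check the first nine baseline sessions for a chance 'transition state'.

    Returns None when the series has fewer than nine data points.
    """
    if len(baseline) < 9:
        return None
    phase_a = baseline[0:3]   # sessions 1-3, treated as pseudo-baseline
    early_b = baseline[3:6]   # sessions 4-6, where a transition state would sit
    late_b = baseline[6:9]    # sessions 7-9, where the pseudo-effect must appear
    low, high = min(phase_a), max(phase_a)
    stays_in_range = all(low <= rate <= high for rate in early_b)
    then_drops = all(rate < low for rate in late_b)
    return stays_in_range and then_drops


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed directly from the definition."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Hypothetical baseline series, not study data.
print(illusory_transition_state([5, 7, 6, 6, 5, 7, 4, 3, 2]))  # True
print(illusory_transition_state([5, 7, 6, 2, 5, 7, 4, 3, 2]))  # False
print(illusory_transition_state([5, 7, 6, 6, 5]))              # None
print(binomial_upper_tail(1, 2, 0.5))                          # 0.75
```

In use, `p` would be the chance rate estimated from baseline-only series, `n` the number of applications analyzed, and `k` the number of transition states observed.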
These results suggest that transition states are relatively rare. Indeed, results of the current retrospective CCCS suggest that the prevalence of transition states in unpublished behavior reduction SCED data is commensurate with the estimates provided by Brogan et al. (2019). Therefore, it is unlikely that the low prevalence estimates reported by Brogan et al. are the product of case selection or publication bias. In fact, the obtained prevalence estimates in the current study were slightly lower than those reported by Brogan et al. It is unclear if the lower rate is attributable to differences in contextual factors (e.g., interventions for severe problem behavior vs. less-severe behavior), baseline data series characteristics (e.g., baseline data collected de novo vs. from a functional analysis test series; see Falligant, McNulty, Hausman, et al., 2020), or other possible factors. Regardless, these results serve to replicate and extend Brogan et al. to the clinical context that would be most affected by the presence of transition states.
As described previously, the principles guiding the visual analysis and interpretation of SCED data are well-established and enjoy considerable empirical support and applied utility (e.g., Kazdin, 2010). However, others have noted some limitations of visual analysis, namely poor interrater reliability (Ninci et al., 2015), reliance on large effect sizes (Harrington & Velicer, 2015), and other risks related to limited trend detection and Type I error (Fisch, 2001). Extant research suggests that a number of data series characteristics, including degree of autocorrelation, effect size, and other variables (e.g., idiosyncratic level, trend, variability), may affect the accuracy of visual analysis (Brossart et al., 2006). Fortunately, the use of structured visual criteria (e.g., Hagopian et al., 1997; Roane et al., 2013) can mitigate these limitations by providing objective quantitative guidelines that enhance the accuracy and reliability of visual inspection (e.g., Wolfe et al., 2018). There are many approaches for supplementing visual analysis using structured criteria and quantitative methods (see Falligant et al., 2022), including the conservative dual-criteria (Fisher et al., 2003), fail-safe k (Barnard-Brak et al., 2018), and ANSA methods (Hall et al., 2020), in addition to approaches based in machine learning (Lanovaz et al., 2020), and others (e.g., Lanovaz et al., 2019). To date, there has been very little systematic research investigating how the presence of transition states affects the accuracy of visual analysis and/or these structured criteria and supplemental quantitative techniques. Future research should more fully examine this issue, as clinicians and researchers may periodically encounter transition states within their SCEDs. Ideally, clinicians should be trained to quickly identify transition states when they are encountered in practice. Similarly, the accuracy of quantitative tools (e.g., fail-safe k) should remain durable when transition states are present.
Although transition states are relatively uncommon, the current results suggest that they do periodically occur. Future research in this area is needed to identify under which contexts a transition state is more or less likely to occur. For example, transition states may be more common for certain functional classes of behavior for which the establishing operations (EOs) build across successive sessions (e.g., attention); or they may be more likely to occur across different classes of behavioral treatment (e.g., noncontingent reinforcement; NCR). Indeed, it is hypothesized that NCR may be associated with less transitory behavior and more stable response patterns than other treatments because it directly alters the establishing operation for problem behavior (Falligant, McNulty, Hausman, et al., 2020; Thompson et al., 2003). Additional analyses of factors related to transitory behavior may prove to yield important insights into the variables that underlie behavior change, affect visual analysis of SCED data, and directly support the science and practice of applied behavior analysis.
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical Approval
Data for this record review were obtained in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
