Abstract
In this secondary systematic review of single-case and controlled group design intervention studies conducted with transition-age autistic youth, we examined features of 48 studies with 273 participants that measured at least one “problem behavior” outcome (Prospero registration number: 231764). We searched 11 databases for relevant studies, and the final search date was November 2022. Our primary aims were to determine how problem behaviors were defined and selected for reduction, how functions were determined, and the interventions used to address them. Studies were coded and codes were tabulated and converted to percentages to answer each research question. Thirty-eight percent of studies defined problem behavior, and 88% of studies implemented behavioral strategies to reduce problem behaviors. Behaviors with low potential for harm constituted the majority of the 67 outcome variables (61%), while behaviors with high potential for harm were a minority (39%). The most common intervention target was stereotypic behavior. Fewer than half of studies: reported procedures for selecting behaviors, reported procedures to determine behavior function, or ascribed functions to behaviors. We were unable to report on some demographic features of participants (e.g. race/ethnicity) because they were rarely reported in primary studies. We conclude that problem behavior is poorly conceptualized in this research.
Lay abstract
In a previous study, we looked at research done on strategies to support autistic people who were between 14 and 22 years old. For this study, we looked at all of the studies in our previous study that tried to decrease or stop autistic people from doing certain things. Many researchers call these things “problem behavior.” There were 48 studies that tried to reduce problem behavior, and most of them used strategies like prompting and reinforcement to try to get autistic people to change their behavior. We found many things wrong with these studies. Most of them did not define the group of behaviors they were trying to stop autistic people from doing. None of the studies looked at whether any side effects happened when they tried the strategy they were studying. Also, most of the studies tried to stop autistic people from doing behaviors that probably were not harmful, like stereotypic behavior. Most of the studies did not say how they decided that the behaviors they tried to stop were a problem for the autistic people in the study, and most studies did not try to figure out why the autistic people in the study did the behaviors the researchers were trying to stop them from doing.
Ameliorating “problem behavior” 1 has been a goal of autism interventions since their initial development, with scholars in applied behavior analysis (ABA) leading the way (Lovaas et al., 1974). In the United States, addressing problem behavior is considered so critical for educational access that it is codified in federal law; strategies to address problem behaviors are legally required if the behavior is perceived as disability-related, and negatively impacts school progress (IDEA 300.346(2)(i)). Caregivers of autistic children express significant concern with problem behavior, and it is reported to be a major factor in the quality of life of both autistic people and their families (Ruef & Turnbull, 2002). Although problem behavior is not specifically mentioned in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2022) description of autism, there is some evidence that it is quite common in autistic children (Nicholls et al., 2020). It may be especially common among autistic people with co-occurring intellectual disability, who appear to have greater incidences of problem behavior than those with autism alone or intellectual disability alone (Álvarez-Couto et al., 2023; Murphy et al., 2009). In this study, we examine intervention research designed for autistic youth that addresses problem behavior, to understand: the characteristics of participants included in this research, how problem behavior is conceptualized and identified, and the strategies used to address it.
Definitions of problem behavior vary within and across disciplinary boundaries, and many definitions are broad and/or vague, such that seemingly any behavior could be categorized as problematic. For example, Hanley and colleagues (2003) define problem behavior as:
. . . behavioral excess that is socially significant to the extent that someone complains of its occurrence. These behaviors are typically of sufficient intensity or frequency that the safety of the person or others is threatened, the ability of the person or others to acquire new skills is hindered, or more restrictive living arrangements are warranted. (Hanley et al., 2003, p. 150)
Of note about this definition is that it centers others’ perceptions of a behavior to determine if it is problematic. Issues of safety, learning opportunities, or independence, though “typical,” are not required to label a behavior a problem (however see Emerson, 2001 for a more constrained definition). In autism research, the specific kinds of conduct (what we refer to as “subdomains” of problem behavior) that have been examined using this framework include behaviors where potential for harm is evident, such as self-injury, aggression, and property destruction, but behaviors where overt harm is less clear, such as stereotypic behaviors (i.e. repetitive motor or vocal movements), disruption, and noncompliance (Machalicek et al., 2007; Matson & Nebel-Schwalm, 2007) have also been subsumed under the problem behavior construct. In other writing, Hanley and colleagues (2014) have gone so far as to consider the behavior profiles described in the DSM-5 that warrant an autism diagnosis (i.e. differences in social communication and restrictive and repetitive behavior) to be problem behaviors themselves, and behavior like self-injury and aggression are “additional” problem behaviors (p. 16).
Addressing problem behavior
Behavioral interventions are the most frequently studied approach for supporting autistic transition-age youth across all outcomes (Bottema-Beutel et al., 2023), and this likely holds for problem behavior. Many behavioral interventions rely on first establishing the function of a behavior considered to be problematic, through the administration of a Functional Behavior Assessment (FBA) (Machalicek et al., 2007). This assessment involves systematic observation, record review, and in some cases interviews with teachers and caregivers to determine the events that precede and follow the behavior, which allows the researcher to develop hypotheses about the behavior’s purpose. For example, if a problem behavior is typically followed by admonishment, the behavior could serve the purpose of eliciting attention. Once a set of possible functions is generated, they can be experimentally assessed in a functional analysis, where antecedents and consequences are systematically manipulated to determine if a specific set of conditions makes the behavior more likely to occur (see Beavers et al., 2013 for a review, and Dawson & Fletcher-Watson, 2022 for important critiques of this procedure).
ABA researchers have delineated a short list of potential functions, which include escape from task demands or other aversive events; access to attention, food, or preferred materials; and automatic reinforcement, which is when the behavior itself provides a sought-after sensory experience (Beavers et al., 2013; Hong et al., 2018; Iwata et al., 1994). Each problem behavior topography (measurable features of behavior) is thought to serve only a single, generally stable function. Some behaviors may serve more than one function, but this has traditionally been considered rare (Beavers & Iwata, 2011, but see Werner & Bienstein, 2022).
Once the function of the behavior is determined, procedures such as extinction (decoupling the problem behavior from the consequence that gives it value, for example, by withholding attention following a vocal outburst), response blocking (physically preventing the enactment of the behavior), response cost (removing a positive reinforcement), or punishment (e.g. introducing an aversive event, such as a slap or electric shock) are used to decrease the occurrence of the problem behavior (Doehring et al., 2014; Yadollahikhales et al., 2021). Simultaneously, procedures such as prompting and reinforcement are used to teach and promote the use of alternative behaviors that could serve the same purported function as the problem behavior (Gregori et al., 2020).
Adverse impacts of behavioral interventions
Almost as soon as interventions were developed to target problem behaviors, researchers acknowledged the possibility of adverse consequences to participating in such interventions. One such consequence of behavioral approaches is referred to as “extinction bursts,” which are temporary increases in the intensity, frequency, or duration of the problem behavior (Skinner, 1938; summarized in Lattal et al., 2020). Extinction bursts are estimated to occur about 25% of the time following extinction procedures (Lerman & Iwata, 1995; Muething et al., 2024). Extinction procedures can also cause increases in other problem behaviors that were already in an autistic person’s behavioral repertoire, or can cause new problem behaviors to emerge (Fisher et al., 2023). Finally, problem behaviors often resurge following the cessation of extinction interventions, even when those interventions are considerably lengthy (Greer et al., 2023). The reason for extinction bursts, and what should be done about their occurrence, are not agreed upon by ABA researchers (see Fisher et al., 2023 for different theorizations within the behaviorist paradigm).
Extinction procedures such as restraints or the physical prevention of problem behavior (DeRosa et al., 2016; Potter et al., 2013) may also result in adverse consequences such as physical injury or emotional distress (Kerns et al., 2022). However, the fallout of “extinction bursts” (beyond their occurrence) and other kinds of adverse events are not well understood in autistic populations. This is because adverse events associated with most interventions are not traditionally monitored or reported by researchers (Bottema-Beutel, Crowley et al., 2021, 2023; Dawson & Fletcher-Watson, 2022).
This study
Behaviors considered problematic may be an especially significant source of concern for transition-age autistic youth, because these behaviors are thought to increase in adolescence (Emerson et al., 2001) and remain through early adulthood (Rattaz et al., 2018). The occurrence and perception of these types of behavior may have impacts on the opportunities autistic youth are granted as they enter adulthood, and the amount of autonomy and independence they are afforded. However, there are multiple ways of conceptualizing and categorizing behaviors that have traditionally been labeled problematic, and many attempts at doing so have been contested by autistic people. Other reviews of problem behavior intervention research have focused on quantifying the effectiveness of interventions (Campbell, 2003; Heyvaert et al., 2014; Severini et al., 2018). In this secondary analysis of a review of intervention research conducted with transition-age autistic youth (Bottema-Beutel et al., 2023), we examined a subset of single-case and controlled group design studies that were designed to reduce specific forms of participants’ behavior. Focusing on transition-age youth is important because during this period, students receive specialized services to prepare them for adulthood, and the type and focus of interventions may differ as compared to younger autistic children. Our previous review showed that there are serious quality concerns in this group of studies that prevent drawing conclusions about intervention effectiveness. Rather than focus on effectiveness, we examined the following four study-level research questions:
Research Question 1 (RQ1). What are the demographic features of participants?
Research Question 2 (RQ2). What terms did the authors use to describe problem behavior, and what approaches did they use to conceptualize these behaviors? Specifically, we examined how frequently researchers provided a definition of the behavior, listed examples, and/or described immediate or long-term impacts of the behavior.
Research Question 3 (RQ3). What intervention approaches were used to reduce/change the behavior?
Research Question 4 (RQ4). Was there any consideration of adverse consequences of participating in intervention, including the possibility of extinction bursts?
In addition, we examined the following three variable-level research questions:
Research Question 5 (RQ5). What subdomains of problem behavior (e.g. self-injury and aggression) were targeted in these interventions?
Research Question 6 (RQ6). How were problem behaviors selected for intervention?
Research Question 7 (RQ7). What, if any, procedures were used to determine the function of behaviors targeted for reduction, and what functions were ascribed to each behavior?
Methods
The coding manual, coding spreadsheets, and PRISMA checklists are available on the Open Science Framework (OSF; https://osf.io/haxmj/).
Search, screening, and selection procedures
Search, screening, and selection procedures are detailed in our previously conducted review (Bottema-Beutel et al., 2023). The original search and selection process was initiated in November of 2020 and resulted in 193 articles. To update the search in November of 2022, we applied an identical search and selection process to capture relevant articles published since the original search. To screen articles, we first imported records into RefWorks (we used Zotero in the update) and removed duplicates. Abstracts were then imported into the web-based software Abstrackr to aid in title/abstract screening. Full texts of articles that made it through the title/abstract screening were double coded for inclusion/exclusion by two researchers, and agreement was 94%. Articles gathered in the original 2020 search in which at least one variable was categorized as “reduction in unwanted behavior” (which we refer to as “problem behavior” here) were selected for the current study (n = 44). In the follow-up search, 28 studies met the inclusion criteria described above, and 4 of these studies measured at least one problem behavior variable (agreement for selecting studies that included a problem behavior was 100%). Combined across search periods, 48 studies met inclusion criteria. See Figure 1 for a PRISMA diagram outlining article screening and selection.

Figure 1. PRISMA diagram.
Coding studies
A coding manual was developed deductively and was informed by reviewing literature on problem behavior in relation to our research questions. This manual was then refined inductively during an initial coding process involving five randomly selected studies, which underwent further refinements to resolve disagreements between coders. The manual was finalized when it was detailed enough to exhaustively code all five studies for each coding category. After training coders on these five studies, the remaining 43 studies were independently coded by two coders (Bottema-Beutel et al., 2023). After a first round of coding, agreement was reached (
Study-level coding
Participant demographics
Participant demographics were extracted from the original review, including chronological age, sex/gender, 2 co-occurring conditions, language assessment scores, cognitive assessment scores, and autism assessments. However, most of the studies did not report standardized assessment scores, and instead provided descriptive information about participants. For the current study, we extracted these descriptions from the text, and then determined whether they indicated participants used no or minimal speech, and whether participants had co-occurring intellectual disabilities or other conditions. We also coded whether participants were reported to live in an institutional setting.
Overall term to describe behaviors that should be targeted for reduction
The text was searched to determine if authors used a label such as a “problem behavior,” “aberrant behavior,” “challenging behavior,” or some other term as a superordinate category to which they ascribed the specific behavior(s) they examined.
Approach to conceptualizing problem behavior/subdomains
To determine how authors conceptualized the problem behavior subdomain that was measured in each study, we reviewed the introduction for any text related to the behavior subdomain, excluding text related to prevalence or interventions used to address the behavior. If the subdomain was not conceptualized, but the authors conceptualized problem behavior as a broad construct, we used this text. After copying and pasting relevant text into our coding spreadsheet, we coded the excerpt to determine if the authors: listed examples of behaviors in the subdomain, provided statements regarding the immediate or long-term impacts of the behaviors, or gave a definition of the behavior that would allow for categorization of any given behavior into the subdomain. These three approaches to conceptualizing the behavior were mutually exclusive, except in one case; when the authors provided an exhaustive list of all behaviors in the category, this was coded as both listing examples and providing a definition. See Table 1 for example quotes to illustrate each conceptualization approach.
Table 1. Examples of approaches to conceptualizing problem behavior.
Intervention approach
We extracted coding of the intervention approach (e.g. social cognitive, behavioral, and planning-based interventions) from the previous review. See Bottema-Beutel et al. (2023), also available on OSF, for definitions and examples of intervention approaches.
Adverse events
We extracted coding on adverse events from the previous review (Bottema-Beutel et al., 2023). We also used the PDF search function to search for the term “extinction burst,” to determine if authors included mitigation for the possibility of extinction bursts in their methods, or if they interpreted study results as indicating the occurrence of extinction bursts.
Variable-level coding
Behavior subdomain
Each variable was coded as belonging to a subdomain of problem behavior. An initial list of subdomains was devised based on our review of the problem behavior literature, and included physical aggression, verbal aggression, self-injury, pica, stereotypic behavior, rumination/vomiting, food refusal (which could be measured as amount of food consumed), sleep problems, irritability, elopement/avoidance related to phobia or anxiety, property destruction, a combination of behaviors, or other. After all included articles were coded, the “other” and “combination” categories were disaggregated for presentation of results. We also extracted information from the previous review (Bottema-Beutel et al., 2023) about the measurement approach for each behavior variable: whether discrete behaviors were operationalized and tallied, or whether broad measures were used to capture behavioral/processes (e.g. through standardized instruments).
Selecting behaviors and determining functions
For each variable, we extracted information about how the behavior was selected for reduction (e.g. consultation with parents and consultation with teachers), procedures conducted to identify the function of the behavior (e.g. FBA without functional analysis, functional analysis, informal observation, and interview), and the function the authors ascribed to the behavior (e.g. escape and attention).
Tabulating results
Counts and percentages were used to answer all research questions. The total number of studies was used as the denominator for study-level questions, and the total number of variables was used as the denominator for variable-level questions.
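The tabulation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' actual procedure: the function names (`percent`, `tabulate`) and the list-of-codes layout are hypothetical, and rounding halves up is an assumed convention. The example counts (45 single-case and 3 group design studies; 32 of 67 variables with a function-determination procedure) are the ones reported in the Results.

```python
import math
from collections import Counter


def percent(count, denominator):
    """Whole-number percentage, rounding halves up (12.5 -> 13)."""
    # Python's built-in round() rounds halves to the nearest even number,
    # so floor(x + 0.5) is used here as an assumed rounding convention.
    return math.floor(100 * count / denominator + 0.5)


def tabulate(codes):
    """Map each code to (count, percent of all coded units)."""
    counts = Counter(codes)
    denominator = len(codes)  # studies or variables, depending on the question
    return {code: (n, percent(n, denominator)) for code, n in counts.items()}


# Study-level question: one code per study (denominator = 48 studies).
study_design = ["single-case"] * 45 + ["group"] * 3
design_table = tabulate(study_design)

# Variable-level question: one code per variable (denominator = 67 variables).
function_assessed = ["assessed"] * 32 + ["not assessed"] * 35
function_table = tabulate(function_assessed)

print(design_table)
print(function_table)
```

Keeping one code per study (or per variable) in each list means the denominator follows directly from the unit of analysis, which is the distinction the two sets of research questions rely on.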
Community involvement
This research was not co-produced with autistic community members, but the focus of this article aligns with autistic people’s research priorities (Pellicano et al., 2014; Waddington et al., 2023).
Results
Study-level findings
Description of studies and participants
Forty-eight studies met inclusion criteria, representing nearly 25% of all studies conducted on transition-age autistic youth for any outcome. Of these, 45 (94%) were single-case design studies, and 3 (6%) were group design studies. Publication years spanned from 1984 to 2023, and the mean publication year was 2012. Twenty-nine (60%) studies were published in the last 10 years (2013–2023), 13 (27%) were published between 2002 and 2012, and 6 (13%) were published prior to 2002. There were 273 participants represented across the studies (214 male, 59 female). Twenty-four studies (50%) had samples that were 100% male, and the majority of female participants (n = 48) were from the three group design studies. Among the single-case design studies, sample sizes ranged from 1 to 6 participants: 33 had n = 1, 3 had n = 2, 8 had n = 3, and 1 had n = 6 (total n = 69). The three group design studies had sample sizes of 36, 41, and 127 (total n = 204). The average mean chronological age of participants was 16 years. Twenty-six studies (54%) reported on participants’ cognitive abilities. Of these, 20 specifically indicated or implied (based on descriptions) that at least one participant had co-occurring intellectual disability. Thirty-three studies (69%) reported participants’ language abilities. Of these, 16 studies (48%) described participants as using no or minimal speech, and 17 studies (52%) described participants as using more than minimal speech. Twelve studies (25%) indicated that at least one participant lived in an institutional setting, seven studies (15%) indicated that participants lived in their family home, and the remaining 29 studies (60%) did not describe participants’ living situation. See Table 2 for a breakdown of this information by study.
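As a quick consistency check on the sample sizes reported above, the totals can be recomputed from the per-study figures. The sketch below uses hypothetical variable names; the numbers are taken directly from the preceding paragraph.

```python
# Single-case designs: sample size -> number of studies with that sample size.
single_case_sizes = {1: 33, 2: 3, 3: 8, 6: 1}
# Group designs: one sample size per study.
group_sizes = [36, 41, 127]

n_single_case_studies = sum(single_case_sizes.values())
n_single_case_participants = sum(size * k for size, k in single_case_sizes.items())
n_group_participants = sum(group_sizes)
n_total_participants = n_single_case_participants + n_group_participants

# Female participants: 59 overall, 48 of them in the group design studies.
n_female_single_case = 59 - 48

print(n_single_case_studies)       # number of single-case studies
print(n_single_case_participants)  # participants in single-case studies
print(n_group_participants)        # participants in group design studies
print(n_total_participants)        # all participants
print(n_female_single_case)        # female participants in single-case studies
```

The recomputed totals agree with the reported 45 single-case studies, 69 and 204 participants by design type, and 273 participants overall (214 male plus 59 female).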
Table 2. Participant and intervention descriptions.
CA: Chronological Age; X: the participant was described as using no or little spoken language; CBT: Cognitive Behavioral Therapy; NR: Not Reported; —: the participant was described as using speech beyond a few words or phrases (but may still have had language disabilities); ADHD: Attention Deficit Hyperactivity Disorder; FBA: Functional Behavior Assessment; ODD: Oppositional Defiant Disorder; tDCS: Transcranial Direct-Current Stimulation.
Group design study; all other studies are single-case designs.
Intellectual disability was not specifically identified as a co-occurring disability, but description of the participants suggests intellectual disability.
Overall term to describe behavior
Problem behavior was the most used term (n = 21), followed by challenging behavior (n = 6), inappropriate behavior (n = 6), and aberrant behavior (n = 2). The terms dangerous, exceptional, and pathological were used in one study each, and the terms maladaptive and negative were used together in a study that also used the term inappropriate. Ten studies did not use an overall term beyond the term for the subdomain (e.g. stereotypic behavior).
Conceptualization of problem behavior subdomains
Thirty-nine studies (81%) used at least one of the three approaches to conceptualize the specific subdomain addressed in the study, five (10%) used at least one approach to conceptualize problem behavior as a broad construct (but did not conceptualize a subdomain), and four (8%) provided no conceptual information about either problem behavior as a broad construct, or the subdomain addressed in the study. Fourteen studies (29%) used only one of the three approaches to conceptualize problem behavior: two listed examples, three provided definitions, and nine described impacts. Nineteen studies (40%) used two approaches: two listed examples and provided a definition, five described impacts and provided a definition, and 12 listed examples and described impacts. Eight studies (17%) used all three approaches, and seven studies (15%) did not use any approach. Of this latter group, four did not provide any conceptual discussion of problem behavior that we could code. Three discussed problem behavior, but in a vague way that could not be categorized as any of the three strategies, for example, “Compliance among children with and without intellectual disabilities is a common concern for teachers” (Jessel et al., 2017, p. 248). Collapsed across studies that used one, two, or three approaches to conceptualizing problem behavior, describing impacts was most common (n = 34 studies), followed by listing examples (n = 24 studies). Providing a definition was least common (n = 18).
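The per-approach totals in the last two sentences follow from the combination counts reported earlier in the paragraph. A minimal sketch of that arithmetic, with hypothetical variable names and the counts taken from the text:

```python
# Number of studies using each combination of conceptualization approaches.
combination_counts = {
    frozenset(): 7,                                          # no approach
    frozenset({"examples"}): 2,
    frozenset({"definition"}): 3,
    frozenset({"impacts"}): 9,
    frozenset({"examples", "definition"}): 2,
    frozenset({"impacts", "definition"}): 5,
    frozenset({"examples", "impacts"}): 12,
    frozenset({"examples", "impacts", "definition"}): 8,
}

# Every study falls into exactly one combination, so the counts sum to 48.
n_studies = sum(combination_counts.values())

# A study counts toward an approach if that approach is in its combination.
per_approach = {
    approach: sum(n for combo, n in combination_counts.items() if approach in combo)
    for approach in ("impacts", "examples", "definition")
}

print(n_studies)
print(per_approach)
```

Summing over combinations gives 34 studies describing impacts, 24 listing examples, and 18 providing a definition, matching the figures above; 18 of 48 is also the 38% of studies reported in the abstract as defining problem behavior.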
Description of interventions
Forty-two studies (88%) tested exclusively behavioral strategies, and the remaining six tested a variety of other strategies. Refer again to Table 2 for a breakdown of this information by study.
Consideration of adverse events
We reported in our larger review that only two studies in our data set mentioned adverse events (Bottema-Beutel et al., 2023), and neither of those was a study that examined problem behavior. None of the studies reported monitoring participants for increases in new problem behaviors or in problem behaviors that were not targeted by the intervention but were already in the participants’ repertoire, and none included a priori plans for handling the possibility of extinction bursts that could result from participating in the intervention. Two of the 48 studies mentioned extinction bursts. One study that attempted to reduce stereotypic behavior using response blocking found that in one participant the behavior increased (Potter et al., 2013). In the discussion, the authors stated, “The persistent high level of stereotypy observed after blocking was introduced could have been an effect of extinction (i.e. an extinction burst) and potentially would have decreased over time” (Potter et al., 2013, pp. 412–413). In a study that examined the effects of delay to reinforcement, mand training, and extinction on self-injury and aggression, the authors observed increases in both behaviors during the intervention phase in which the participants were also presented with a task (Kern et al., 1997). The authors concluded, “. . . we hypothesized that this procedure was likely to minimize the occurrence of an extinction burst. However, as the data indicate this procedure was less effective during task situations.” (Kern et al., 1997, p. 285). Neither of these studies labeled extinction bursts as adverse events or recommended against using the intervention in light of their occurrence.
Variable-level findings
Across the 48 studies, there were 67 unique problem behavior variables. Sixty-four of these involved researcher-defined observational variables; the remaining three used standardized assessments (the Support Intensity Scale–A, Exceptional Behavioral Domain; The Social Responsiveness Scale–2, Restrictive and Repetitive Behavior Domain; and the Repetitive Behavior Scale–Revised), and all three standardized assessments were from group design studies.
Subdomains of behavior
See Table 3 for the frequency of behavior subdomains across all 67 problem behavior variables. In this table, we also provide recalculated subdomain frequencies when the five studies that used operational definitions that included more than one behavior subtype were disaggregated. We include the disaggregated counts separately, because constructing combination variables is not recommended in methodological guidance as it is generally believed that different behavior topographies serve different functions, and it is not possible to determine which behavior was influenced by the intervention when they are tallied together (Beavers & Iwata, 2011; Yoder et al., 2018). Finally, we grouped behavior subdomain categories into those in which the immediate risk for harm was high (self-injury, food refusal, physical aggression, elopement, rumination/vomiting, sleep problems, and property destruction) and those in which immediate risk for harm was low (all remaining subdomains). High-risk subdomains accounted for 39% of variables and low-risk subdomains accounted for 61% of variables. These results are similar when calculated at the study level (62% of studies focused on behaviors with low risk of harm).
Table 3. Frequencies of behavior subtypes across studies.
The number of variables when the 5 “combination” variables (variables where a combination of multiple sub-types of behavior were included in a single operational definition) were disaggregated.
Selecting behaviors for intervention
The most common method for selecting the behavior for intervention was consultation with parent/parent report (n = 15 studies), followed by a combination of various sources (n = 10), consultation with teachers/teacher report (n = 6), consultation with clinical or residential staff/clinical reports (n = 5), and researcher observation. Twenty-six studies did not describe any procedures for selecting behavior targets. We note that the three group design studies included broad outcome measures of restrictive and repetitive behaviors, and so may have been less apt to describe procedures for selecting behaviors because they did not select/operationalize discrete behaviors. Finally, none of the studies (including those that used a combination of sources) consulted the participants involved in the study to determine which behaviors to target.
Determining behavior functions
The authors conducted procedures to determine the functions of the behaviors that were targeted for reduction for 32 behavior variables (48%). See Table 4 for the procedures used across all variables. Authors assigned functions to 43 behavior variables. This number is larger than the number of variables for which procedures were conducted to determine the function because two variables were assigned functions without conducting assessments (both related to needle phobias) and there were three studies in which different functions were assigned to the same operational definition of a behavior variable across the different participants in the study. Across all 43 behavior variables for which a function was assigned, there were only four function categories: automatic/sensory reinforcement (12 instances), escape/avoidance (12 instances), attention (5 instances), and access to materials (2 instances). See Table 5 for detailed information on behavior functions.
Table 4. Procedures used to determine the function of problem behavior variables.
FBA: Functional behavior assessment.
Table 5. Functions ascribed to behaviors.
These were instances when a single operational definition of a problem behavior was attributed to multiple functions, within a single participant. In two instances, the authors did not specify what multiple functions the behavior served, in one instance the behavior was determined to function as escape and automatic reinforcement, and in one instance the functions were determined to be automatic reinforcement, attention, and escape.
Discussion
In this systematic review, we examined 48 intervention studies conducted with transition-age autistic youth, which targeted 67 behavior variables. We described the participants involved in the studies, the interventions used, the conceptualization of problem behaviors, and procedures for selecting and ascribing functions to behaviors that were targeted for remediation.
Study features
Our review reflects broader trends in autism intervention literature in that autistic girls and women are underrepresented in research, even beyond their underrepresentation in diagnosis (Shefcyk, 2015). This appears to be especially true in single-case design literature, in which only 11 girls or women participated across 45 studies. This gender disparity could be because the behavior of autistic men and boys is more readily interpreted as problematic, and autistic girls may be subject to different socialization processes that influence behavioral repertoires and the social significance placed on them 3 (Pearson & Rose, 2021).
The participants represented in our review diverge from a common assumption in the field, which is that autistic children and youth who use no or minimal speech to communicate are underrepresented in intervention research (Tager-Flusberg & Kasari, 2013). Of the studies that reported participants’ language, more than half indicated participants used no or minimal speech. Participants with co-occurring intellectual disability, and those with support needs perceived to be high enough to warrant living in institutional contexts, were also well represented. This is consistent with research showing that autistic people who have intellectual disability and who are non-speaking are characterized as having more behavioral challenges (Nicholls et al., 2020; Quetsch et al., 2023). However, we note that many of the behaviors that were most frequently targeted in the studies in our review (e.g. stereotypic behavior, “inappropriate” social behavior, and sleep problems) occur across the autism spectrum.
Consistent with our expectations, we found that authors only infrequently provided conceptual definitions of the behavior they studied and used a variety of different terms to refer to the overall group of behaviors (e.g. “problem,” “challenging,” and “inappropriate”). The most common approach to conceptualizing behaviors was researcher assertions about the potential negative impacts of various problem behavior subdomains. Appealing to hypothesized impacts without providing a definition is problematic, because it does not allow for a determination as to whether the behavior the authors operationalized and measured fits into the subdomain of behaviors they reference. For example, researchers could cite evidence that off-task behavior in classrooms is associated with poor academic performance, but then examine “looking at others” as an instance of off-task behavior (this is an actual example from our data set, see Riden et al., 2021, p. 402). Researcher-derived operational definitions used in the studies in our review rarely if ever are subject to construct validation, so the extent to which behaviors like “looking at others” will result in poor academic outcomes, or require remediation, is questionable. If we do not have a conceptual definition of what constitutes problem behavior or the various subdomains, it is unclear how to evaluate researchers’ operationalizations of behavior or contextualize study findings in relation to any prior evidence indicating the impacts of such behavior on learning and development.
In addition, the negative impacts that researchers have reported to result from problem behavior but that are not direct effects of the behavior itself, such as limiting autistic children’s access to education, contributing to unemployment, causing teacher burnout, and causing significant parental stress and depression (summarized in Conroy et al., 2005; Machalicek et al., 2007; Zaidman-Zait et al., 2014), have been demonstrated in correlational research. This means that the actual associations between problem behavior and these outcomes may not be causal, could be in the reverse direction or bidirectional, or could be more complex than a simple bivariate association (Kildahl et al., 2023; Zaidman-Zait et al., 2014). For example, there may be associations between autistic children’s “inappropriate” social behavior and unsatisfying social relationships, but this could be explained by ableism and stigma on the part of potential friends and may not necessarily mean that autistic children whose conduct is non-normative are incapable of developing social relationships, or that enforcing normative behavior will improve their relationships. Although this study was designed to capture any intervention strategy designed to reduce behaviors exhibited by transition-age autistic youth, most studies in this review used behavioral approaches. This aligns with how ABA has historically positioned itself as a discipline, and the kinds of outcomes ABA researchers view as in their purview. However, ABA approaches generally do not examine macro-scale social processes (e.g. norms, expectations, biases, schooling and medical systems, etc.) or interactional processes (e.g. the conduct of interaction partners across various timescales, including interventionists, that are not captured by considering only immediate behavioral antecedents and consequences).
Recognizing complexity in associations between problem behavior and important life outcomes suggests that direct intervention on the behavior—especially without systemic changes—may not always be successful in improving those outcomes and could in fact exacerbate longer term problems by further stigmatizing behaviors that are atypical but not harmful (Goffman, 1963; Kapp et al., 2019).
None of the studies included in the current review mentioned adverse events. This is especially problematic for interventions that attempt to reduce or eliminate behaviors, since behavioral researchers have already established that adverse consequences of implementing these approaches are common (Lerman & Iwata, 1995; Muething et al., 2024). Monitoring for these possibilities would require holistic assessment of autistic participants’ functioning, wellbeing, and social ecologies (e.g. beyond tallying behaviors targeted for reduction), and for much longer periods of time than the duration of an intervention. Extinction bursts were mentioned in two studies, but we find it noteworthy that in one study they appear to have been mentioned to salvage unpromising results (the increase in the problem behavior could have improved eventually if the findings did indeed represent an extinction burst), and the other study mentioned them to indicate that some settings may need additional procedures to improve the behavior. That is, neither study viewed extinction bursts as particularly problematic or recommended against implementing the intervention in light of their occurrence. An additional concern is that even procedures that have not been associated with extinction bursts, such as the use of food reinforcers or planned ignoring, have the potential to cause harm (Dawson & Fletcher-Watson, 2022), and this was never acknowledged in these studies.
Variable features
Inadequate conceptualization of problem behavior means that a wide range of behaviors, including those that are likely benign (e.g. stereotypic behavior, body position in class, appearance, tidying), are targeted for intervention. That is, researchers may come to view any behavior produced by autistic people as subject to remediation, regardless of whether there is strong evidence for immediate or long-term negative impacts, and without attention to the wider social and societal processes that influence researchers’ interpretations of behavior. The disproportionate focus on behaviors that are not harmful has been consistent across time and age groups: stereotypic behavior has been the most frequent form of problem behavior targeted in behavioral autism intervention research for nearly six decades (Campbell, 2003; Heyvaert et al., 2014). In the early days of autism research, stereotypic behavior was targeted because it was thought to be functionless and to impede learning, but the evidence for this is weak. For example, experimental studies found autistic children more readily complied with prompts to push a lever to obtain food if stereotypic behavior (e.g. rhythmic body rocking, waving of hands or objects near the eyes, gazing at an overhead light) was punished with slaps to the hands and a loud “No!” (Koegel & Covert, 1972). These studies have not considered the extent to which following instructions to receive food constitutes learning, or the psychological processes that may influence autistic children’s preference for stereotypic behavior as compared to the behavior requested by researchers (e.g. their arousal levels, motivation for the activities presented to them as part of these experiments, etc.).
The disproportionate focus on behaviors with low risk for harm indicates there may not be alignment between intervention researchers who focus on problem behavior and autistic people regarding the types of behaviors considered appropriate intervention targets. For example, Waddington and colleagues (2023) found that autistic adults viewed intervention goals aimed at reducing harmful behaviors as a high priority (in alignment with parents and professionals) but viewed interventions focused on reducing autistic characteristics (which include stereotypic and other restrictive and repetitive behavior) to be inappropriate. Questions about how wide a net should be cast around the “problem behavior” construct have been raised even within the ABA field (see Carr, 2007 for a behaviorist’s reflections). However, this sentiment does not seem to have substantially influenced intervention research.
There is also a lack of transparency in regard to how behaviors were selected for remediation, with more than half of studies failing to describe any procedures for selection. Equally problematic is the fact that the autistic participants themselves were never consulted. While researchers have asserted that ABA interventions target behaviors that are “socially significant” (e.g. Leaf et al., 2022), autistic people do not appear to have any involvement in making determinations about significance. Similarly, while systematic procedures to determine behavior functions are a cornerstone of ABA practices to address problem behavior (Doehring et al., 2014), we found that these procedures (even counting those that were not systematic) were implemented for fewer than half of variables. This proportion holds when variables from the handful of studies that did not use behavioral approaches are excluded and is consistent with previous reviews of interventions on problem behavior (e.g. Machalicek et al., 2007). Consultation with autistic people was also never a means by which researchers sought to determine behavior function, even though there is research indicating that many autistic people can describe the purposes their own behavior serves (Maddox et al., 2017). This means that, in these studies, researchers attempted to influence behavior with little if any understanding of the meaning and purposes it has for autistic people (e.g. for self-regulation, to assert autonomy), may have incorrectly assigned behavior functions, and designed interventions without ensuring that autistic participants would be supported in learning new behaviors that serve similar functions to those targeted for reduction. Finally, the short list of only four possible behavior functions, which are thought to have stable relationships with a given behavior over time and context, may drastically oversimplify human conduct.
Limitations
Findings from this study should be considered alongside at least three limitations. First, our review covered only research on transition-age autistic youth, and so cannot be generalized to problem behavior research on other age groups or with other disability populations. Second, we were limited to analyzing what was reported in the primary studies, which means we were unable to report on important features of participants such as race, ethnicity, and whether participants were transgender or had genders other than male or female. Third, we only examined studies that used designs that can provide evidence for causal or functional relations, which is consistent with the aims of the initial review. However, problem behavior could be conceptualized in other ways by researchers who use other designs for other purposes (e.g. for descriptive research).
Conclusion
Problem behavior is considered common in autistic people, and the specter of problem behavior, especially potentially harmful behaviors such as self-injury and aggression, is sometimes used to justify the provision of interventions that many autistic people find objectionable (e.g. Anderson, 2023; Hastings & Noone, 2005; Leaf et al., 2022). However, we have shown that for transition-age autistic youth, there is a disproportionate focus on eliminating behaviors that are not harmful, and that many autistic people view as tied to their identities (e.g. stimming). What is and is not considered “problem behavior” is of consequence to autistic people, because categorizing behavior in this way can restrict their opportunities and freedom of movement, and subject them to highly restrictive intervention programs (including institutionalization) that have high potential for harm (Dawson & Fletcher-Watson, 2022). Our study strongly suggests that more theoretically and empirically grounded approaches to understanding problem behavior, and the complexities that explain its occurrence, are urgently needed to inform intervention research.
Frameworks other than ABA could advance research into understanding and supporting autistic people who engage in behavior that could be distressing for them. For example, neurodiversity frameworks assert that neurological differences may underpin sensory, emotional, and social experiences that manifest in behavior perceived to be problematic. Behaviors like stimming are conceptualized as adaptive responses to an environmental and social milieu not designed with autistic people in mind, therefore combining both neurobiological/neurocognitive and sociological considerations (Kapp et al., 2019; Pellicano & den Houting, 2022). Kapp and colleagues (2019), for instance, describe how autistic people experience stimming as a means of self-regulation, which is especially important for coping with either emotional or sensory overwhelm. Stimming is also an important source of autistic people’s personal and community identity (Bascom, 2012; Kim & Bottema-Beutel, 2019). Mental health research on self-injury that has elicited autistic people’s perspectives suggests that it may be a means to cope with anxiety, frustration, or emotional pain, or a form of atonement for wrongdoing (Maddox et al., 2017; see also Samways et al., 2022 for similar findings in people with intellectual disability).
Alternative perspectives on problem behavior could point toward alternative intervention approaches to support autistic people who engage in behavior where direct intervention may be needed, such as self-injury (e.g. cognitive-affective, relational, and developmental approaches). At present, approaches other than ABA designed to address problem behavior are rare, and characterized by a variety of design flaws. In the future, alternative approaches could be better aligned with autistic people’s preferences and values regarding their rights, autonomy, and dignity (see Wolkorte et al., 2019 for similar conclusions based on findings from populations with intellectual disabilities). Systems change approaches, such as those informed by Nussbaum’s (2011) capabilities approach to human flourishing (described in Pellicano et al., 2022) and community psychology (Botha, 2023), as well as overhauls to existing mental health systems so that they are designed with autistic people in mind (Maddox et al., 2020) could be productive starting points.
Our findings have at least two implications for policy, especially those relevant to schooling and transition-age autistic youth. First, functional behavior assessments are a component of federal education policy in the United States, but were used to address problem behavior in only six studies in our data set. Functional analyses were more common, but there are ethical concerns with this process because it involves manipulating the environment to induce the behavior, which may be especially inappropriate in schools (Dawson & Fletcher-Watson, 2022). Second, the number of available studies on any given potentially harmful problem behavior is quite small, ranging from nine studies for food refusal to only two studies for sleep problems, and five and three studies for highly classroom-relevant issues such as self-injury and elopement, respectively. Taken together with the very likely but unexamined potential for harm in this area of intervention research and the quality problems that have already been identified in this literature (Bottema-Beutel et al., 2023), there is very little high-quality evidence supporting the emphasis on ABA strategies to address problem behaviors in schools for this population. The lack of evidence to support these policies suggests alternatives should be considered in future research, especially those that support systems change so that autistic people are not marginalized for their conduct (Ruef & Turnbull, 2002), and that consider processes such as emotional and sensory regulation (Quetsch et al., 2023).
Footnotes
Acknowledgements
The authors thank Audrey Bond for her assistance in screening studies.
Declaration of conflicting interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: K.B.-B. has previously received fees for consulting with school districts on intervention practices for autistic children and teaches courses on autism interventions in her role as an associate professor of special education. She has also accepted speaker fees to discuss her work on research quality, adverse events, and researcher conflicts of interest as they pertain to autism intervention research. She also receives royalties for a coedited book titled Clinical Guide to Early Interventions for Children with Autism, published by Springer. R.M. was previously employed as a special educator in a school that provided transition supports to autistic students. S.C.L. was formerly affiliated with an entity that trained students to become Board Certified Behavior Analysts and provided early intensive behavioral intervention.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
