Disproportionality Reduction in Exclusionary School Discipline: A Best-Evidence Synthesis

Abstract

A full canon of empirical literature shows that students who are African American, Latinx, or American Indian/Alaskan Native, and students who are male, diagnosed with disabilities, or from low socioeconomic backgrounds are more likely to experience exclusionary discipline practices in U.S. schools. Though there is a growing commitment to mitigating discipline disparities through alternative programming, it is clear that disproportionality in the application of harmful discipline practices persists. The purpose of this literature synthesis was to examine the effectiveness of empirically studied school-based interventions in reducing disproportionality in discipline practices. We analyzed articles that assessed both prevention and intervention program effects using at least one outcome variable representing exclusionary discipline, either in the form of office discipline referrals or suspension/expulsion rates. Included studies used experimental, quasi-experimental, or observational research designs that disaggregated student outcomes by race, ethnicity, gender, disability, or other sociodemographic categories. We identified 20 articles meeting inclusion criteria, four of which provided direct evidence of disproportionality reduction using interaction terms. Results indicate limited evidence that available programs reduce discipline disparities and that common programs may function as a protective factor for White and female students while failing to do so for marginalized students. Findings identify promising areas for future research.

Keywords

discipline gap, disproportionality, exclusionary discipline, suspension

Exclusionary discipline spans a spectrum of punitive practices, including office discipline referrals (ODRs), in-school suspensions, out-of-school suspensions, expulsions, and referrals to the juvenile justice system (Noltemeyer & Mcloughlin, 2010). Research has indicated that exclusionary discipline results in a host of negative outcomes, such as lost instructional time (Losen & Whitaker, 2017), lowered academic outcomes (Noltemeyer et al., 2015), and increased likelihood of truancy and dropout (Balfanz et al., 2015; Fabelo et al., 2011; Rumberger & Losen, 2017). Additionally, higher suspension and expulsion rates have been associated with reduced schoolwide academic performance (Skiba & Rausch, 2006) and poorer perceptions of school climate (Gregory et al., 2011).

Racial disparities in the disciplinary process have been well documented in the literature (Losen et al., 2015; Losen & Gillespie, 2012; Welsh & Little, 2018). African American students are more likely to receive an ODR than their White peers (Skiba et al., 2011), and evidence suggests that African American students are more likely to receive an ODR for subjective infractions—those that require teacher judgment, such as disruption or defiance—as opposed to objective infractions (e.g., tardiness or truancy) compared with White peers (Skiba et al., 2002; Smolkowski, Girvan, et al., 2016). Following an infraction, African American students are also more likely to receive harsher consequences than White peers, even when the behavior violation is similar (Okonofua & Eberhardt, 2015; Skiba et al., 2002). Using a nationally representative sample, Skiba et al. (2011) examined patterns of discipline disparities in 364 elementary and middle schools. They found that “both initial referral to the office and administrative decisions made as a result of that referral significantly contribute[d] to racial and ethnic disparities in school discipline” (p. 101), indicating that disparities exist at multiple points within the disciplinary process and that solutions must be applied across this process.

Latinx and American Indian/Alaska Native students are also more likely to experience exclusionary discipline compared with White peers (Losen et al., 2015; Losen & Gillespie, 2012). Although exclusionary discipline research on Latinx students has been less consistent, potentially due to impacts of localized political shifts (Mora, 2014), migration trends (Jaworsky et al., 2012), and contexts of reception (Rumbaut, 2005), research has found that Latinx students are underrepresented in the application of exclusionary discipline in early-school years and overrepresented in later years (Losen et al., 2015; Skiba et al., 2011). For American Indian students, research has pointed to persistent and substantial disparities, particularly in discipline outcomes (e.g., Nowicki, 2018); however, relatively small sample sizes have posed measurement challenges (Hussar et al., 2020), and few studies exist that have foregrounded outcomes for American Indian students in general. Studies have consistently described Asian and Asian American students as underrepresented in exclusionary discipline practices (see Losen et al., 2015), suggesting the impact of model minority stereotypes on disciplinary processes (see Ngo & Lee, 2007).

In addition to racial disparities, studies have also highlighted discrepancies in discipline practices for males (e.g., Bradshaw et al., 2010), students of low socioeconomic status (e.g., Petras et al., 2011), and students labeled with disabilities (e.g., Brobbey, 2018; Sullivan et al., 2013). Intersectionality (i.e., intersecting forms of oppression, exclusion, and erasure; see Annamma et al., 2018; Blake et al., 2017) refers to the compounding experiences of social injustice for students at the intersections of these groups (Artiles, 2011). Examining the experiences of students who exist at these intersections is critical in understanding the complex phenomena that underlie disparate practices (Crenshaw, 1991). For example, African American students with disabilities are suspended at higher rates than White students with disabilities (Losen, 2018; Losen & Gillespie, 2012), and African American males with disabilities are suspended at rates double those of White males with disabilities and triple those of White females without disabilities (Cruz & Rodl, 2018; Morris & Perry, 2017; Wallace et al., 2008), indicating the interplay among racism, ableism, and gender issues in school discipline systems (Annamma et al., 2018; Blake et al., 2017).

Reports have shown extensive variation in ODR, suspension, and expulsion rates between schools and districts, suggesting that local factors affect exclusionary risk (Losen et al., 2015; Skiba, 2015). School features, such as enrollment (Camacho & Krezmien, 2018) and diversity of the student body (Anyon et al., 2014), have been cited as significant factors predicting the use of exclusionary practices. Other work has implicated schools’ malleable features, such as average suspension rates (Theriot et al., 2010) and average school achievement (Rausch & Skiba, 2005), indicating school climate and organizational behavior as predictive factors. Studies have also examined teacher perspectives toward discipline (Skiba et al., 2014; Skiba & Knesting, 2001) and found that teachers’ classroom management skills (Skiba et al., 2014) highly predict referral rates, suggesting that practitioner behavior may be a lever for improving discipline disparities (e.g., Okonofua et al., 2016). As a whole, this body of work underscores the complexity of exclusionary discipline practices (see Welsh & Little, 2018, for a comprehensive review).

Proposed Solutions

School discipline practices exist on a continuum that spans from prevention to postinfraction intervention. In considering this continuum, Gregory et al. (2017) developed an integrated framework for application in discipline disparity reduction, which includes prevention, intervention strategies, and structures that offer both. The framework identifies 10 requisite components for addressing disparities across the discipline continuum: (a) supportive relationships, (b) bias-aware classrooms, (c) academic rigor, (d) culturally relevant teaching, (e) opportunities for learning and correcting behavior within the classroom, (f) data-based programming, (g) problem-solving approaches to discipline, (h) inclusion of student and family voice, (i) reintegration after conflict, and (j) multitiered systems of support. Gregory et al. emphasized that interventions focused only on narrow or singular aspects of the discipline continuum are unlikely to rectify entrenched inequities.

Previous Reviews

Welsh and Little (2018) conducted a comprehensive review focused on factors that contribute to discipline disparities and the nascent evidence for alternatives to exclusionary discipline. The authors uncovered a variety of programs designed to support schools and districts in reducing exclusionary discipline practices overall, including School-Wide Positive Behavior Interventions and Supports (SWPBIS; e.g., Gage et al., 2018), restorative justice (e.g., Fronius et al., 2019), and professional development interventions targeting teachers’ culturally responsive practices (e.g., Bottiani et al., 2018). The authors found that many of these programmatic approaches problematically focused on “fixing” student behavior and assimilating students to existing school culture, rather than adjusting existing structures to support the students’ varied socioemotional needs. The authors theorized that such programs remain unlikely to reduce disparities. However, their review did not examine specific quantitative outcomes (e.g., ODR rates) or program moderators and mediators (e.g., feasibility of implementation or program fidelity requirements). Additionally, though a multitude of systematic reviews exist on school discipline—particularly for SWPBIS and restorative justice—none have examined direct relationships between program implementation and disparity reduction.

SWPBIS Research

SWPBIS seeks to improve discipline practices across school systems through universal screening, a continuum of interventions and supports, and data-based progress monitoring. The program is grounded in many robust, high-quality studies indicating improved prosocial student behavior (e.g., Bradshaw et al., 2012), and many components of Gregory et al.’s (2017) framework are included in the program (e.g., multitiered systems and data-based decision making). However, SWPBIS research has rarely disaggregated impact by race, gender, disability label, or other sociodemographic groups known to experience higher rates of exclusionary discipline (e.g., Flannery et al., 2014). Studies that have included race, gender, and/or disability variables have indicated that SWPBIS may benefit White students more than other students (e.g., Vincent & Tobin, 2011) and females more than males (e.g., Bradshaw et al., 2012), whereas results have been inconclusive regarding students with disabilities (e.g., Vincent & Tobin, 2011).

In response to these findings, researchers have developed models for incorporating culturally responsive practices into SWPBIS (see Fallon et al., 2012), but evidence regarding the efficacy of such add-on programming in reducing discipline disparities is emergent. In Gage et al.’s (2018) review of SWPBIS’ impact on disciplinary exclusion, the authors found a statistically large treatment effect of SWPBIS on reducing suspensions overall. However, their review was based on just four experimental research studies that met criteria for rigor. Only one of these studies (Bradshaw et al., 2012) examined disproportionately disciplined student subgroups (the study examined differential effects by gender).

Restorative Justice Research

School-based restorative justice has been described as a collective approach to building mutual respect and inviting student participation in the development of school community (Buckley & Maxwell, 2007; Mansfield et al., 2018). Restorative justice also features many components of Gregory et al.’s (2017) framework, such as encouraging supportive relationships and problem-solving approaches to discipline. Pilot studies have shown efficacy in reducing racial discipline disparities (e.g., Augustine et al., 2018; Jain et al., 2014; Sumner et al., 2010). However, peer-reviewed studies have been largely qualitative in nature (e.g., González, 2012), which, though important for understanding the program, leaves open questions regarding direct efficacy. Song and Swearer (2016) described the ways in which restorative justice has become a part of the cultural zeitgeist, carrying forward a belief that it reduces disparities without a full canon of research to support these claims. Fronius et al.’s (2019) review of restorative justice in schools uncovered one randomized controlled trial (RCT; Augustine et al., 2018), implemented in one school district, that demonstrated a reduction in discipline disparities between African American and White students, and another observational study (Jain et al., 2014), also implemented in a single school district, that produced mixed results. Fronius et al. emphasized that “most programs are still at the infancy stage” (p. 21) and that further research was needed to understand the direct effects of restorative justice in reducing discipline disparities.

Wholistic Evidence

Both SWPBIS and restorative justice have been shown to reduce exclusionary punishment. However, previous systematic reviews have not examined whether these programs reduce disparities in such practices for marginalized groups. In their review, Welsh and Little (2018) detailed the vast scope of documented discipline disparities and emergent solutions, and they hypothesized why disparities remain despite the implementation of such programs. Their review did not examine for whom these interventions have been shown to work or the minimum implementation fidelity with which schools must implement them to reduce disparities (see Welsh & Little, 2018, Table S3). In other words, they examined all studies related to discipline reduction, including studies that did not disaggregate by demographic category. Welsh and Little (2018) asserted a need for further research to illuminate these unexplored areas in order to provide “a better sense of not only whether alternative approaches to exclusionary discipline are working but also why” (p. 785). They concluded that the phenomenon is multifaceted and remains rife with unanswered empirical questions, including whether it is possible for schools to address disparities with currently available programs. Given Welsh and Little’s foundational findings, in this review we examine the body of evidence available for empirically studied, school-based interventions in reducing disproportionality in school discipline practices and associated key program implementation features.

Implementation Challenges

It is critical to note that schoolwide programs, such as restorative justice and SWPBIS, come at a cost to schools and districts, which may attenuate the benefits reported in highly controlled research. Newmann et al. (2001) detailed the perils of initiatives that require intensive time and energy but lack immediate success, as these commonly incur high costs, cause teacher fatigue, and struggle to improve teaching and learning in a sustainable way (Forman et al., 2013). Relatedly, recent studies on SWPBIS have underscored the importance of implementation fidelity (e.g., Gage et al., 2018), as outcomes have varied as a function of implementation quality (Kim et al., 2018). Given the costs and training it takes to achieve fidelity for schoolwide programs, it remains unclear whether feasible solutions are available to schools seeking to reduce discipline disparities. If implementation fidelity is out of reach, then intended outcomes and scaling up remain untenable (Fixsen et al., 2013).

It is critical, then, to understand how disparity-reduction research considers program efficacy in relation to treatment fidelity. The proportion of studies that were demonstration programs—implemented by researchers to ensure implementation fidelity under controlled conditions—remains unexamined. Therefore, we sought to determine whether or not there was an efficacy difference between demonstration programs and those implemented on a routine basis in schools. In doing so, we relied on Wilson et al.’s (2003) definition of demonstration programs as, “those implemented and evaluated by a researcher mainly for research or demonstration purposes,” and routine practice programs as, “those in which the program being studied already exist[ed] in the school on an ongoing basis and the evaluation [was] conducted either by school-based or outside researchers” (p. 137). Wilson et al. pointed to the relevancy of this distinction in that some meta-analyses have yielded smaller effect sizes for routine practice programs than for demonstration programs. We sought to examine whether this discrepancy existed in the extant research on discipline disparity reduction.

Purpose of the Study

In alignment with Welsh and Little’s (2018) conclusions and Gregory et al.’s (2017) identified priorities for future research, this analysis reconceptualizes the literature on school discipline by answering the following questions: (a) Which programs designed to reduce exclusionary discipline are associated with reduced disproportionality for students who are male, African American, Latinx, or American Indian/Alaskan Native, labeled with a disability, and/or of low socioeconomic status? (b) Which components of Gregory et al.’s framework for equitable discipline are included in these programs? (c) How do studies consider treatment fidelity in the context of program application and efficacy? By extracting overall effect sizes, we examined whether efforts to reduce exclusionary discipline were associated with an equal reduction in discipline for students of all sociodemographic categories, or whether reduction occurred at different rates for different student populations. The analysis included both referral incidents and consequence application (e.g., in-school suspension, out-of-school suspension, expulsion). We connected empirically investigated programs to Gregory et al.’s 10 components to determine which areas of the framework have been substantiated, and we identified key themes in the literature describing the ways in which schools can implement programs in the most efficient and effective manner.

Method

Given that there were very few common design features across the studies, we conducted an integrative best-evidence synthesis (Slavin, 1986) rather than a meta-analysis. A best-evidence synthesis identifies studies using explicit inclusion criteria and reports on effect sizes while also reporting on key themes. We conducted a systematic review of empirical studies examining the relationship between school- or district-wide behavior support programs and disciplinary exclusion (i.e., the dependent variable). We conducted this review in three phases: (a) title and abstract search, (b) full text review, and (c) data extraction and literature synthesis.

Phase 1: Title and Abstract Search

We used a comprehensive search strategy to locate articles that were peer reviewed, published, and relevant to the field of schoolwide discipline. In consultation with a research librarian, we performed systematic searches using keywords. We began with general search terms for literature related to school discipline: We searched article abstracts for the term “school discipline” joined by the connector “AND” to the terms “race OR ethnic OR ethnicity OR diverse OR diversity.” We also used the term “school discipline” in combination with other sociodemographic category terms: “gender OR socioeconomic OR disability OR special education.” We then searched for specific terms related to school discipline, replacing “school discipline” with the terms “Positive Behavior Interventions and Supports,” “school wide positive behavior interventions,” “social emotional learning,” “restorative justice,” “restorative practices,” and “social justice discipline.” After reviewing these results, we performed additional searches using search terms found in those results that had potential for identifying programs meant to reduce exclusionary discipline, such as “teen court” and “socio-emotional learning.” We conducted our search throughout Spring 2019 and, thus, did not include articles published after May 2019.
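The Boolean search strings described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `TOPIC_TERMS`, `DEMOGRAPHIC_CLAUSES`, and `build_query` are hypothetical, the quoting style is generic rather than the syntax of any particular database, and we assume that each program term was paired with both sociodemographic clauses.

```python
from itertools import product

# Topic terms named in the text: the general term plus the six program terms
# that replaced it in later searches.
TOPIC_TERMS = [
    "school discipline",
    "Positive Behavior Interventions and Supports",
    "school wide positive behavior interventions",
    "social emotional learning",
    "restorative justice",
    "restorative practices",
    "social justice discipline",
]

# The two OR-clauses of sociodemographic terms named in the text.
DEMOGRAPHIC_CLAUSES = [
    ["race", "ethnic", "ethnicity", "diverse", "diversity"],
    ["gender", "socioeconomic", "disability", "special education"],
]


def build_query(topic, demographic_terms):
    """Join one quoted topic term to an OR-clause with AND (hypothetical syntax)."""
    # Quote multiword terms so they are treated as phrases.
    clause = " OR ".join(f'"{t}"' if " " in t else t for t in demographic_terms)
    return f'"{topic}" AND ({clause})'


# Assumption: every topic term is crossed with both demographic clauses.
queries = [build_query(t, d) for t, d in product(TOPIC_TERMS, DEMOGRAPHIC_CLAUSES)]

for q in queries:
    print(q)
```

Writing the strings out this way makes it easy to log exactly which combinations were run, which helps when a search needs to be repeated or audited.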

Next, we performed hand searches of identified articles’ reference lists to find additional studies potentially eligible for the synthesis. As noted earlier, some of the programs available have a substantial evidence base, including several literature reviews and meta-analyses (e.g., SWPBIS); thus, we conducted hand searches of these articles for relevant citations (Boneshefski & Runge, 2014; Bottiani et al., 2018; Bouchard & Wong, 2017; Cotter Stalker, 2017; Durlak et al., 2011; Fallon et al., 2012; Gage et al., 2018; Gregory et al., 2017; Mallett, 2016; Mitchell et al., 2018; Öğülmüş & Vuran, 2016; Welsh & Little, 2018). We conducted these ancestral searches of relevant article citation lists using Google Scholar. Databases used for the keyword searches included EBSCO, PsycINFO, ERIC, PubMed, and JSTOR. We set search criteria for articles published from 1990 to present, as school policies and programs meant to reduce exclusionary discipline emerged during this time (Kafka, 2011).

Next, we entered each article’s title and abstract into an Excel database and coded them with four inclusion criteria: (a) the study was empirical and published in a peer-reviewed journal (i.e., this review excluded book chapters, technical reports, master’s theses, dissertations, and conceptual papers); (b) the study examined outcomes in public school settings (i.e., studies examining juvenile justice or criminal justice systems were excluded); (c) the study used a quantitative or mixed-method research design (i.e., studies utilizing solely qualitative methods were excluded); and (d) the study included exclusionary discipline as an outcome variable. Any articles fitting the aforementioned inclusion criteria, or with unclear coding results based on the title and abstract, were kept for a full text review.

This initial search yielded 506 articles for potential inclusion in the study, 83 of which met criteria for a second round of review. Table 1 displays the number of articles at each phase of the retrieval process by search term. After applying Phase 1 criteria, the most common reason for exclusion of articles about “restorative practices,” “social justice,” or “disproportionality” was that the manuscript was conceptual or used qualitative methods (e.g., interview data regarding school/classroom climate or adult perceptions and attitudes). For articles about “SWPBIS,” “teen court,” and “socioemotional learning,” the most common reason for exclusion was the absence of exclusionary discipline as an outcome variable. For example, many teen court studies examined dependent variables related to substance abuse, and many SWPBIS studies examined fidelity measures or predictors of sustained program implementation.

Table 1

Article retrieval process

Main search term	Title and abstract (n)	Title and abstract (%)	Full text review (n)	Full text review (%)	Final sample (n)	Final sample (%)
SWPBIS	69	13.6	17	20.5	9	45.0
Restorative practices	83	16.4	17	20.5	5	25.0
SEL	28	5.5	4	4.8	1	5.0
Teen court	25	4.9	5	6.0	0	0.0
Social justice	162	32.0	10	12.1	2	10.0
Disproportionality	22	4.3	6	7.2	2	10.0
Meta-analyses search	74	14.6	22	26.5	1	5.0
Ancestral search	43	8.5	2	2.4	0	0.0
Total	506	100	83	100	20	100

Note. Ancestral and meta-analysis searches only included titles not previously identified in our original search. SWPBIS = School-Wide Positive Behavior Interventions and Supports; SEL = Socioemotional Learning.
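As a quick arithmetic check on the retrieval funnel, the column totals and percentage shares in Table 1 can be recomputed from the counts. The sketch below transcribes the counts from the table; the names `counts`, `totals`, and `column_shares` are illustrative, not part of the original analysis.

```python
# Counts transcribed from Table 1:
# (title and abstract, full text review, final sample).
counts = {
    "SWPBIS": (69, 17, 9),
    "Restorative practices": (83, 17, 5),
    "SEL": (28, 4, 1),
    "Teen court": (25, 5, 0),
    "Social justice": (162, 10, 2),
    "Disproportionality": (22, 6, 2),
    "Meta-analyses search": (74, 22, 1),
    "Ancestral search": (43, 2, 0),
}

# Column totals for the three phases.
totals = tuple(sum(column) for column in zip(*counts.values()))


def column_shares(counts, totals):
    """Percentage of each phase's total contributed by each search term."""
    return {
        term: tuple(round(100 * n / total, 1) for n, total in zip(row, totals))
        for term, row in counts.items()
    }


shares = column_shares(counts, totals)
print(totals)
for term, row in shares.items():
    print(term, row)
```

Because each percentage is rounded independently, a recomputed share can differ from a published one in the last decimal place.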

Phase 2: Full Text Review

During this phase, we reviewed the full texts of all studies that met eligibility in Phase 1 using the following inclusion criteria: (a) whether the study question, purpose, and hypothesis included effects for an outcome variable related to exclusionary discipline (i.e., ODR, suspensions, or expulsions); (b) whether the study included outcomes disaggregated by sociodemographic categories; and (c) whether the study used a methodological design appropriate to understanding intervention efficacy, including RCTs, quasi-experimental designs (QED), or observational research designs (ORD) with extensive control covariates that compared groups receiving one or more identifiable interventions with one or more control conditions. We included studies that presented both pre- and posttest measures on at least one qualifying outcome variable, or that used a pretest-posttest design in which measures of at least one qualifying outcome variable were taken before and after intervention on the same participants, including single-group designs and multiple-group designs involving different interventions. We also included observational studies that used administrative data sets with extensive control covariates that could be compared. Studies describing interventions without evaluation data were excluded.

Both primary researchers independently read and coded the 83 articles for inclusion, and we calculated interobserver agreement using Cohen’s kappa. Interobserver agreement for Phase 2 resulted in substantial agreement (κ = 0.77, CI = [0.61, 0.93]), with disagreement on five studies. After reviewing inclusion criteria and definitions, the authors independently reread articles on which disagreements occurred and subsequently achieved nearly perfect agreement (κ = 0.90, CI = [0.79, 1.01]), with disagreement remaining on three articles. We discussed these three articles until consensus was reached, which resulted in the inclusion of 20 studies.
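For readers unfamiliar with the statistic, Cohen’s kappa compares the observed proportion of agreement with the agreement expected by chance from each coder’s marginal proportions. The following is a minimal sketch using made-up include/exclude decisions, not the review’s coding data; the function name `cohens_kappa` and the example lists are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical codes on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items given the same code by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over codes.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    codes = set(counts_a) | set(counts_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in codes) / n**2
    if p_chance == 1:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical decisions for 10 articles; the raters disagree on the second one.
rater_a = ["include", "include", "exclude", "exclude", "exclude",
           "include", "exclude", "exclude", "include", "exclude"]
rater_b = ["include", "exclude", "exclude", "exclude", "exclude",
           "include", "exclude", "exclude", "include", "exclude"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

In this toy example the raters agree on 9 of 10 items (observed agreement 0.90), chance agreement is 0.54, and kappa is therefore (0.90 - 0.54) / (1 - 0.54), or about 0.78.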

Phase 3: Study Coding and Data Extraction

Both primary researchers coded eligible studies using a predetermined coding protocol (see Table 2). Our descriptive coding included a wide variety of study characteristics, including the publication year, the intervention type, and the sample population (i.e., age/grade, demographic characteristics). We included the design (i.e., RCT, QED, or ORD), measures (e.g., implementation tools), and attrition. We also coded for outcome (i.e., ODR, out-of-school suspension, other), sociodemographic disaggregation of outcomes (e.g., race, gender, disability status), and any other control variables included (e.g., socioeconomic status).

Table 2

Article coding and descriptions

Study	Design	Age range	Demographic groups	Program type	Program level	Outcome variable(s)	Fidelity measure	Demonstration or routine	Prevention or differential processing
Anyon et al. (2014)	ORD	K–12	White, African American, Latinx, Asian, AI/AN, gender, ELL, IEP	RJ	District policy change	ODR, OSS, expulsion	None	Routine	Prevention
Anyon et al. (2016)	ORD	K–12	White, African American, Latinx, Asian, AI/AN, gender, ELL, IEP	RJ	Teacher PD, individual coaching	ODR, OSS	None	Routine	Differential processing
Bradshaw et al. (2012)	RCT	K–6	African American or not, gender, IEP, FRL	SWPBIS	School level	ODR, OSS	SWPBIS measures	Demonstration	Prevention
Bradshaw et al. (2018)	RCT	K–8	African American, White, Latinx	CRT within SWPBIS	Teacher PD, individual coaching	ODR	SWPBIS measures	Demonstration	Prevention
C. R. Cook et al. (2018)	Pre-Post	K–6	African American, FRL	CRT within SWPBIS	Teacher PD: Action research	ODR	Researcher-created rubrics	Demonstration	Prevention
Cornell et al. (2012)	RCT	K–12	White or not, gender	Suspension alternative	Teacher PD: one-day training	Long-term OSS	Researcher-created metrics	Demonstration	Differential processing
Cornell et al. (2018)	ORD	K–12	White, African American, Latinx, gender, IEP, FRL	Suspension alternative	State policy change	OSS, alternative placement or expulsion	None	Routine	Differential processing
Cruz & Rodl (2018)	ORD	K–12	White, African American, Latinx, AI/AN, Asian, gender, IEP	SWPBIS	District	OSS	None	Routine	Prevention
Gage et al. (2019)	QED	K–12	White, African American, Latinx, IEP	SWPBIS	State policy change	ISS, OSS, expulsion, referral to law	SWPBIS measures	Routine	Prevention
Gregory, Clawson, et al. (2016)	ORD	9–12	White/Asian, African American/Latinx	RJ	Teacher PD with teacher coaching	ODR	Researcher-created surveys	Demonstration	Prevention
Gregory, Hafen, et al. (2016)	RCT	6–12	White, African American, Asian, Latinx	MTP-S	Teacher PD with coaching	ODR	None	Demonstration	Prevention
Gregory et al. (2018)	ORD	K–12	White, African American, Latinx, Asian, AI/AN	RJ	District	OSS	Not discussed	Demonstration	Differential Processing
Hashim et al. (2018)	ORD	K–12	African American, Latinx, Asian/White, other race, gender, IEP	SWPBIS, policy, RJ	District	OSS	None	Routine	Both
Mansfield et al. (2018)	ORD	9–12	White, African American, gender, IEP	RJ within MTSS framework	School	ISS, OSS	Not discussed	Unclear	Prevention
McIntosh et al. (2018)	ORD	K–8	African American, White	SWPBIS framework, data-based decision making	Teacher PD	ODR	Not discussed	Unclear	Prevention
Okonofua et al. (2016)	RCT	6–8	African American/Latinx or not, gender	Empathetic discipline	Teacher PD	OSS	Not discussed	Demonstration	Prevention
Osher et al. (2014)	ORD	K–12	African American, White, Latinx, gender, IEP	Suspension alternative, human-centered planning	School and district	ISS, OSS, expulsion	Reports from school leaders	Demonstration	Both
Scott et al. (2012)	ORD	9–12	Minority or not	SWPBIS framework, data-based decision making	Teacher PD	ODR	Not discussed	Demonstration	Prevention
Vincent et al. (2011)	ORD	K–8	White, African American	SWPBIS	School	ODR	SWPBIS measures	Routine	Prevention
Vincent & Tobin (2011)	ORD	K–12	White, Native American, Asian/PI, Latinx, African American, IEP, gender	SWPBIS	School	OSS	SWPBIS measures	Routine	Prevention

Note. RCT = randomized controlled trial; ORD = observational research design; QED = quasi-experimental design; IEP = individualized education program; FRL = free or reduced price lunch; ELL = English language learner; RJ = restorative justice; CRT = culturally responsive teaching; MTSS = multitiered systems of support; SWPBIS = School-Wide Positive Behavior Interventions and Supports; ODR = office discipline referral; OSS = out-of-school suspension; ISS = in-school suspension; PD = professional development; MTP-S = My Teaching Partner-Secondary; AI/AN = American Indian/Alaskan Native; PI = Pacific Islander.

To assess implementation feasibility, we coded for two features: (a) whether the program was demonstration or routine and (b) which measures of implementation fidelity—if any—were used. (See previous discussion regarding how demonstration programs may show higher efficacy than routine programs, whereas routine programs may provide more viable evidence for implementation feasibility; Wilson et al., 2003.) We also analyzed the ways that researchers conceptualized adherence to the program (i.e., implementation fidelity), as the frequency of schools attempting implementation but unable to reach fidelity may indicate a lack of feasibility.

Finally, in alignment with Gregory et al.’s (2017) framework, we analyzed the location on the discipline continuum at which the intervention occurred (i.e., prevention of an offense citation vs. differential processing following an offense) and which of the 10 equity components were present in the program or intervention. Both primary researchers independently read and coded the 20 articles and compared codebooks using MAXQDA 2020 (Version 20.3.0). When disagreements occurred, the authors independently reread articles and discussed the codes until consensus was reached.

After descriptive coding, we began data extraction. We tabulated whether each study included a regression coefficient for a specific demographic category (e.g., race, gender, disability) with an interaction term for the treatment under study (e.g., SWPBIS or restorative justice), given the study design. For example, ORDs might report a time × treatment × demographic category interaction, whereas RCTs might report a treatment × category interaction. These interaction terms provided statistical evidence of differences in slope for treated students in a demographic subgroup compared with treated students in the reported reference group. We also extracted data from each study that provided evidence of pre- and posttreatment impact for each demographic group; in other words, we extracted odds ratios or coefficients that indicated whether the treatment reduced exclusionary discipline for that group between the pre- and postintervention period, regardless of effect compared with the reference group. Although most studies reported the results of regression models as odds ratios or beta coefficients, studies varied widely by methodological features. For studies reporting beta coefficients, we exponentiated to odds ratios using conversion methods outlined by Borenstein et al. (2009). We arranged these results in Table 3 from the most to the least methodologically rigorous. We classified RCTs as the most rigorous way to provide evidence of treatment effects, so these were ranked the highest, followed by ORDs with extensive control covariates, and then studies that did not randomize or use control covariates and only reported descriptive results.
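The conversion from a regression coefficient to an odds ratio can be illustrated with a short sketch. It assumes the reported beta is a log-odds coefficient from a logistic model and that a Wald-type 95% interval is wanted; the function names and the example values are hypothetical and are not taken from any included study or from Borenstein et al. (2009).

```python
import math


def beta_to_odds_ratio(beta, se, z=1.96):
    """Exponentiate a log-odds coefficient and its Wald interval limits."""
    odds_ratio = math.exp(beta)
    ci_low = math.exp(beta - z * se)
    ci_high = math.exp(beta + z * se)
    return odds_ratio, ci_low, ci_high


def excludes_one(ci_low, ci_high):
    """True when the interval for an odds ratio does not contain 1 (no effect)."""
    return ci_low > 1 or ci_high < 1


# Hypothetical subgroup-by-treatment interaction coefficient and standard error.
beta, se = 0.10, 0.03
odds_ratio, ci_low, ci_high = beta_to_odds_ratio(beta, se)

print(f"OR = {odds_ratio:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
print("Interval excludes 1:", excludes_one(ci_low, ci_high))
```

For an interaction term, an odds ratio above 1 suggests that the treatment was associated with relatively higher odds of exclusion for the subgroup than for the reference group, and an odds ratio below 1 suggests the opposite. An interval that contains 1 is consistent with no detectable difference.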

Table 3

Program by category interaction effects

Study	Variable description	Odds ratio	CI, p	Intervention	Notes
Gregory, Hafen, et al. (2016)	African American × Treatment	0.94	[0.90, 0.97], <.01	MTP-S	RCT; Reference category is all students not African American
Bradshaw et al. (2012)	Male ODR × Treatment	1.27	[1.04, 1.56], <.05	SWPBIS	RCT; Study does not report a Race × Treatment term
	Male OSS × Treatment	1.29	[0.97, 1.73], ns
	SWD ODR × Treatment	0.96	[0.75, 1.24], ns
	SWD OSS × Treatment	1.34	[0.99, 1.79], ns
Anyon et al. (2016)	Native American ODR × Treatment	1.68	[0.31, 9.19], ns	RJ	ORD; Study sample includes students who were issued an ODR. The study examines differential processing after an infraction has occurred, rather than reduction in infractions overall.
	Native American OSS × Treatment	0.14	[0.01, 1.91], ns
	African American ODR × Treatment	1.41	[0.70, 2.86], ns
	African American OSS × Treatment	0.80	[0.36, 1.80], ns
	Latinx ODR × Treatment	1.08	[0.56, 2.09], ns
	Latinx OSS × Treatment	0.62	[0.31, 9.19], ns
	Asian ODR × Treatment	0.36	[0.29, 1.34], ns
	Asian OSS × Treatment	1.23	[0.13, 11.86], ns
Cruz & Rodl (2018)	African American × Treatment × Time	1.11	[1.05, 1.18], <.001	SWPBIS	ORD; Study examines 6 years of data, and regression coefficients include time points (years) by using a fixed effect. Study only examines OSS and does not examine ODR.
	Asian × PBIS × Time	1.01	[0.03], ns
	Latinx × PBIS × Time	1.04	[1.02, 1.07], <.001
	AI/AN × PBIS × Time	1.37	[0.13], <.01
	Decline × PBIS × Time	0.97	[0.06], ns
	Male × PBIS × Time	0.99	[1.01, 1.06], <.01
	SWD × PBIS × Time	1.01	[0.01], ns

Note. When only one number is reported in brackets, it represents a standard error rather than a confidence interval. RCT = randomized controlled trial; ORD = observational research design; ns = nonsignificant; OSS = out-of-school suspension; ODR = office discipline referral; PBIS = Positive Behavior Interventions and Supports; MTP-S = My Teaching Partner-Secondary; SWPBIS = School-Wide Positive Behavior Interventions and Supports; RJ = restorative justice; SWD = students with disabilities; AI/AN = American Indian/Alaskan Native.

Results

Descriptive Results

Our analytic sample contained nine articles on SWPBIS: five examined the efficacy of the program alone, two examined the efficacy of the program supplemented with cultural responsivity training and/or individual teacher coaching, and two examined the efficacy of an intervention that was not SWPBIS but included SWPBIS components (i.e., use of data-based decision making and universal language for behavior expectations). We also analyzed six studies focused on restorative justice, one of which evaluated any alternative to out-of-school suspension, including in-school suspension and restorative practices, and three of which analyzed the same data set collected from Denver Public Schools between 2011 and 2015. The remaining three studies also analyzed district-level data sets. We found just one study that referred to a social-emotional learning program (Osher et al., 2014), and this study included comprehensive, system-wide interventions in addition to the social-emotional learning component (e.g., student-centered planning teams, data-based decision making, and instructional coaches).¹ Five articles evaluated the impact of teacher professional development on disciplinary outcomes; two of these supported teacher application of the Virginia Student Threat Assessment following an offense (Cornell et al., 2012; Cornell et al., 2018). The remaining three studies implemented professional development for teachers designed to reduce racial discipline disparities by addressing teacher attitudes, perceptions, skills, and actions.

Estimates of Program Efficacy in Reducing Disproportionality

Our first research question asked which programs were associated with reduced disparities in exclusionary discipline application for students who are African American, Latinx, or American Indian/Alaskan Native,² male, labeled with a disability, or of low socioeconomic status. Given that the most direct way to examine differential impact was with studies that reported differences in slope between demographic groups before and after treatment, we tabulated studies that provided an interaction term between the category and treatment. We found only four studies that provided such a term, and these studies, reported in Table 3, yielded mixed results. Bradshaw et al. (2012) examined SWPBIS and found increased ODRs and suspensions by treatment for males and reduced ODRs but increased suspensions by treatment for students with disabilities, whereas Cruz and Rodl (2018) examined SWPBIS and found increased suspensions by treatment for African American and Latinx students. Anyon et al. (2016) examined restorative justice and found increased ODRs but reduced suspensions for African American students in treatment conditions.

The only program for which we found explicit disparity reduction by treatment was Gregory, Hafen, et al.’s (2016) study of a one-to-one coaching intervention, which found a larger reduction in exclusionary discipline by treatment for African American students compared with the overall student population. We found one additional study in which the authors conducted such an analysis but did not report the results due to nonsignificance (Gregory et al., 2018). Table 3 depicts the reported interaction terms for each study in ranked order of study methodology, and in what follows, we further substantiate these findings by discussing key themes found across the sample of 20 articles.
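For readers less familiar with interaction terms, the logic behind Table 3 can be written out for the simplest case. The notation below is ours and assumes a logistic model with a binary treatment indicator T and a binary subgroup indicator G; the individual studies used more elaborate specifications (e.g., added covariates or a time dimension).

```latex
\begin{align*}
\operatorname{logit} P(Y = 1) &= \beta_0 + \beta_1 T + \beta_2 G + \beta_3 (T \times G) \\
\mathrm{OR}_{\text{treatment} \mid G = 0} &= e^{\beta_1} \\
\mathrm{OR}_{\text{treatment} \mid G = 1} &= e^{\beta_1 + \beta_3} = e^{\beta_1} \cdot e^{\beta_3}
\end{align*}
```

Here Y is an exclusionary discipline outcome. The interaction odds ratio exp(β3) is the ratio of the treatment odds ratio in the subgroup to the treatment odds ratio in the reference group. A value below 1 indicates a larger reduction in the odds of discipline for the subgroup, and a value above 1 indicates a smaller reduction, or an increase, relative to the reference group.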

School-Wide Positive Behavior Interventions and Supports

The majority of studies, both in our initial search and final sample, examined SWPBIS (see Table 1). The five included studies indicated that (a) the efficacy of SWPBIS alone in reducing disparities was either inconsistent or ineffective and (b) the program may have benefited White, female students more so than other demographic groups. Only two studies provided direct interaction terms (Bradshaw et al., 2012; Cruz & Rodl, 2018), and findings suggested continued or worsening inequities for males, students with disabilities, African American, Latinx, and American Indian/Alaska Native students over the course of program implementation.

These results can be further explored alongside other studies that examined SWPBIS without interaction terms, which suggest that SWPBIS has provided general reductions in exclusionary discipline but has not reduced disparities among student groups. Vincent et al. (2011) found that implementing SWPBIS with fidelity did not affect the discipline disparity between African American and White students: although suspensions for both groups decreased at an equal rate, African American students remained overrepresented in suspensions. The same study found that, in non-SWPBIS schools, the gap between African American and White students widened over time due to an increase in the number of suspensions for African American students. In their QED, Gage et al. (2019) found that African American students and students with disabilities in schools implementing SWPBIS with fidelity had significantly fewer out-of-school suspensions than those in nonimplementing schools. The authors reported standardized mean difference effect sizes for all students (OR = 0.37, g = −0.55) compared with students with disabilities (OR = 0.36, g = −0.56) and African American students (OR = 0.57, g = −0.31), which indicated that the program reduced suspensions for African American students but that the effect was smaller for this group than for the overall student population. Though studies show that SWPBIS reduces exclusionary discipline overall, the available evidence does not demonstrate a reduction in the disciplinary disparities that negatively affect marginalized student subgroups.
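The pairing of odds ratios with standardized mean differences above is consistent with the logit conversion described by Borenstein et al. (2009), d = ln(OR) × √3/π. The sketch below applies that formula to the three odds ratios quoted in the text. It assumes this was the conversion used and ignores any small-sample correction that distinguishes Hedges' g from d; the function name is ours.

```python
import math

def odds_ratio_to_d(odds_ratio):
    """Logit method: standardized mean difference d = ln(OR) * sqrt(3) / pi."""
    return math.log(odds_ratio) * math.sqrt(3) / math.pi

# Odds ratios as quoted in the text for Gage et al. (2019).
reported = {
    "all students": 0.37,
    "students with disabilities": 0.36,
    "African American students": 0.57,
}
for group, odds_ratio in reported.items():
    print(f"{group}: OR = {odds_ratio:.2f}, d = {odds_ratio_to_d(odds_ratio):.2f}")
```

Rounded to two decimals, the converted values agree with the g values quoted above (−0.55, −0.56, and −0.31).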

Finally, some smaller descriptive studies (McIntosh et al., 2018; Scott et al., 2012) showed that a modified, equity-focused version of SWPBIS implemented in single schools may be effective in reducing disparities. In these studies, researchers supported schools’ use of a disaggregated data-tracking system to assist faculty in identifying problems, building awareness, developing goals for improvement, and tracking progress with data, all concepts derived from SWPBIS.³ Similar to the classic SWPBIS studies discussed in the previous paragraph, these two studies documented changes in ODRs across time for all students, but they differed in that they found a greater effect for African American students compared with White peers. However, because of their nonrandomized designs, these studies did not provide evidence of a direct causal connection between the programs implemented and reductions in discipline disparities.

Restorative Justice

Five ORD studies (see Table 2) indicated both the preventative benefits of strengthening teacher-student relationships and the intervention benefits of providing alternatives to suspension and expulsion; however, results were often complicated by teachers’ differential ability to implement recommended techniques. Similar to the findings in the SWPBIS literature, this body of research did not provide evidence that the program reduced discipline disparities despite reducing exclusionary discipline overall. Using the same data set from Denver Public Schools, three studies examined the impact of district policy changes that included staff training in restorative practices and policy recommendations that students be offered a restorative intervention following a discipline action. The only study in this group to report interaction terms was Anyon et al.’s (2016), which examined the differential processing of students (n = 9,921) after an infraction and found that being African American was associated with increased ODRs (OR = 1.41) but decreased out-of-school suspensions (OR = 0.80). However, both of these findings were nonsignificant, and African American students and students with disabilities remained overrepresented in exclusionary discipline despite longitudinal district-wide policy changes encouraging alternatives. Similarly, Gregory et al.’s (2018) findings, based on the same data set with a postinfraction student sample (n = 9,039), demonstrated that student participation in a restorative intervention reduced the odds that a student would be suspended. However, after controlling for other covariates, including participation in restorative interventions, discipline-referred African American students were still 11% more likely to receive an out-of-school suspension compared with discipline-referred White peers. In the third Denver study, Anyon et al. (2014) included the entire student sample (n = 87,997) to analyze infraction prevention and found that schools reduced overall out-of-school suspensions through a combination of both in-school suspension (i.e., another exclusionary practice) and restorative approaches. The authors did not include interaction effects but reported that even after controlling for participation in restorative justice, sociodemographic characteristics such as race/ethnicity, gender, special education status, and socioeconomic status were all significantly associated with increased ODRs and out-of-school suspensions.

Several studies combined concepts from SWPBIS and restorative justice (e.g., Hashim et al., 2018), and these suggested potential systemic improvements for marginalized student groups. In addition to the studies suggesting that restorative approaches can positively impact “differential processing” (Gregory et al., 2018, p. 168) after a referral, other studies examined the potential of restorative approaches to transform teacher-student relationships and thus prevent ODRs. Together, these studies suggested the benefits of strengthening classroom climate and teachers’ ability to implement restorative justice. Mansfield et al. (2018) examined one large district implementing the SaferSanerSchools Whole-School Change program (Mirsky, 2011), which focused on mutual respect and community among school stakeholders. The authors reported a reduction in suspensions for African American students and students with disabilities but noted that disparities were not eliminated entirely. Similarly, using teacher reports of restorative justice implementation in the classroom and student reports of perceived teacher respect, Gregory, Clawson, et al. (2016) found that “higher Restorative Justice implementers narrowed the racial discipline gap but did not eradicate it in their referral patterns” (p. 342). Gregory, Clawson, et al. noted that because the relationship between implementation and students’ perceived respect from teachers held across racial groups, the discipline gap may be better conceptualized as a “relationship gap” (p. 345).

Other Programs

Studies of programs other than SWPBIS and restorative justice mainly focused on professional development related to culturally responsive or empathetic practices. These studies indicated the potential of one-on-one teacher coaching in preventing disparities in ODRs, but they showed mixed results regarding the impact of brief professional development in prevention efforts. Gregory, Hafen, et al.’s (2016) and Bradshaw et al.’s (2018) RCTs both examined the effects of one-to-one coaching programs in which teachers received ongoing and personalized feedback on instructional segments. Teachers in these interventions showed significantly lower use of ODRs, and, using a direct interaction term (see Table 3), Gregory, Hafen, et al. (2016) found that teachers who incorporated higher order thinking into their instruction significantly reduced ODRs for African American students (OR = 0.98, p < .01), suggesting that improved pedagogy affected discipline disparities.

Aiming to examine more cost-effective and sustainable approaches, Bradshaw et al.’s (2018) study also included five 60-minute faculty trainings on cultural responsiveness for all teachers. Teachers who received this professional development improved their attitudes and beliefs about discipline but did not reduce their ODR frequency, indicating that without the individual coaching component, teachers struggled to implement suggested strategies. C. R. Cook et al. (2018) and Okonofua et al. (2016) also studied the impact of professional development programs. C. R. Cook et al.’s (2018) study assisted teachers in building respectful relationships and providing proportionate responses to student behavior in the classroom, whereas Okonofua et al.’s (2016) study supported teachers in adopting an empathetic mind-set when considering classroom discipline practices. Both studies reported promising results. In Okonofua et al.’s study, students in classrooms with a teacher who participated in the intervention were half as likely to be suspended over the school year compared with students in nontreatment classrooms (OR = 0.42, p < .001), and this effect held when controlling for student race, gender, and suspension in the prior year. However, neither study provided an interaction effect, and both sets of authors suggested the need for expanded future study.

Gregory’s Framework

Our second research question sought to identify which components of Gregory et al.’s (2017) framework were present across the studies and whether patterns emerged regarding these components and the strength of study findings. As noted earlier, Gregory et al.’s framework stipulated the need for preventative approaches that reduce ODRs, intervention approaches that improve differential processing of referred students, and integrated systems that address both (i.e., multitiered systems of support). We coded articles for explicit inclusion of Gregory et al.’s components. For example, although SWPBIS is a system that emphasizes multitiered systems of support, Vincent and Tobin’s (2011) study examined schools using the Effective Behavior Support Survey (Sugai et al., 2000), which addressed primarily Tier 1 implementation rather than implementation of the entire program. As described in Vincent and Tobin’s (2011) study, the survey emphasized clearly defined behavior expectations and use of data to drive decisions. Therefore, we coded this study as an approach that provided data-based inquiry, opportunities for correcting and learning from behavior, and problem-solving approaches to discipline. We did not code this study as including multitiered systems, as there was no evidence that participating schools implemented all three tiers. Figure 1 depicts each component of the framework and the location at which each study fell based on at least one relevant code in the authors’ description (i.e., codes were not mutually exclusive).

Figure 1.

Connections to Gregory et al.’s (2017) framework.

Preventative Approaches

We grouped preventative approaches into two categories: those in which the teacher provided preventative environmental adjustments, including (a) bias-aware practices, (b) academic rigor, and (c) culturally relevant teaching, and those in which interactions between teachers and students prevented the need for removal from the classroom, including (d) building supportive relationships and (e) opportunities for learning and correcting behavior within the classroom. Of the studies that examined preventative efforts (k = 11), the most abundant evidence existed for coaching or professional development addressing teachers’ ability to build supportive relationships and correct behavior with an empathetic lens. We found only one study that explicitly examined culturally responsive teaching and awareness of bias (Bradshaw et al., 2018), although several studies potentially included these components but placed explicit emphases on other factors. In these, researchers presented disaggregated data to faculty, which may be considered a way of building a more bias-aware teaching staff, as results from disaggregated data often led to faculty discussions about the reasons behind disproportionate ODRs (e.g., Scott et al., 2012). Finally, although Gregory’s framework suggested academic rigor and instructional responsiveness as a key component in rectifying inequities, we found just one study that included academic rigor (Gregory, Hafen, et al., 2016). Its results were promising, but evidence for this preventative component remains limited, which represents a notable gap in the literature.

Intervention Approaches

Most studies examining intervention approaches included (a) data-based programming (k = 8), (b) problem-solving approaches to discipline (k = 11), (c) inclusion of student and family voice (k = 6), and (d) reintegration after conflict (k = 9). By design, restorative justice addresses these aspects of Gregory et al.’s (2017) framework directly. Studies focused on this program included problem-solving approaches by students, families, faculty, and leadership following an infraction and emphasized plans for students to repair harm and rejoin the learning environment. Similarly, Cornell et al. (2012) and Cornell et al. (2018) examined the use of the Virginia Student Threat Assessment protocol to inform disciplinary responses for students who had made a threat of violence. Although threats of violence in schools remain rare, responses may be subject to bias and, therefore, can operate as a component of the school-to-prison pipeline. Compared with students in control conditions, students assessed under the protocol were more often directed into counseling rather than long-term suspension or expulsion. The authors found no significant disparities between African American, Latinx, and White students in long-term suspension rates after use of the protocol. These studies, along with the restorative justice research, provided evidence in support of problem-solving approaches to discipline in improving differential processing, for African American students in particular. Furthermore, these studies suggested that when interdisciplinary teams (e.g., school psychologists, teachers, families, and administrators) critically analyze discipline cases, their review may support informed decisions that reduce bias in the process of assigning consequences. However, our results indicate that these components alone are not sufficient to eradicate entrenched inequities, as they are ineffective at preventing ODRs for marginalized groups.

Whole-School Approaches

Although Gregory et al. (2017) outlined a limited description of systems that address both prevention and intervention (i.e., multitiered systems of support), we found several studies (k = 6) that comprised various systemic improvements meant to achieve reduction in ODRs and suspensions. Osher et al. (2014) studied the effects of comprehensive, district-wide interventions (e.g., student-centered planning teams, data-based decision making, and staffing schools with instructional coaches), and Hashim and colleagues (2018) studied a layered approach to discipline (i.e., the district began with SWPBIS, enacted policies banning suspension for defiance, and ultimately adopted a restorative justice philosophical approach). In accordance with Gregory’s recommendations, both studies examined districts that implemented multitiered systems, the use of data to refine interventions, alignment of prevention and intervention systems to address immediate needs, and support of schools and staff to implement research-based programs. In both studies, suspensions trended downward, but disparities remained for “frequently disciplined subgroups” (Hashim et al., 2018, p. 184). Furthermore, it remains unclear whether reductions were related to increased teacher capacity or the districts’ policy shifts, such as a ban that prevented suspension without necessarily altering school climate and classroom management.

Local Control and Sustained Uptake

Our final research question sought to identify the feasibility of identified programs for application in authentic school contexts. As Fixsen et al. (2013) noted, implementation is complex and context specific; there are critical differences in program feasibility given the context of everyday school settings and factors required for successful implementation and sustained uptake (e.g., training time, fiscal demands, human resource demands, program complexity). To examine the viability of programs in real-world applications, we assessed the ways in which studies trained faculty and leadership and enacted scalable, sustained uptake.

Our analysis showed that, though simple and cost-effective, traditional models of teacher professional development meant to add cultural responsivity to programs already being implemented in districts were less effective than ongoing, growth-in-practice approaches. However, it is critical to note that one such growth-in-practice approach, one-to-one coaching, was not sustainable. Gregory, Hafen, et al. (2016) and Bradshaw et al. (2018) both examined coaching models implemented by researcher-trained and researcher-supervised coaches, and study authors acknowledged the “additional cost and burden associated with coaching individual teachers, both in terms of teacher time and coach time” (Bradshaw et al., 2018, p. 122). Relatedly, Osher et al.’s (2014) study included support from coaches hired from within the district and similarly indicated that training costs and hiring difficulties created significant barriers for the one-to-one coaching portion of the intervention, likely contributing to the study’s null findings.

Although our findings indicate the complexity and system-wide investment required to disrupt entrenched discipline disparities, approaches must be jointly anchored in feasibility and effectiveness to optimize the likelihood of achieving equitable practices and outcomes in real-world school settings (C. R. Cook et al., 2018). Hashim et al. (2018) described a 10-year staged reform series in Los Angeles Unified School District that required extensive faculty and leadership training, an independently hired implementation auditor, a district-level task force, and a more than $4.9-million cost to the district. Even so, the largest reduction in disparities occurred only after the district prohibited student suspensions for willful defiance, rather than after implementation of SWPBIS and restorative interventions. Though this indicates the need for both policy and practice solutions, it is unclear whether the costly measures undertaken by the district improved climate and culture in conjunction with the suspension ban. On the other hand, although some studies’ interventions implemented shorter, more cost-effective teacher-training modules (e.g., Okonofua et al., 2016), studies in our sample that implemented such professional development demonstrated mixed or null results. It is critical for future research to grapple with this tension.

Treatment Versus Intent to Treat

To further examine evidence for feasibility and sustained uptake, we identified the differences in each study’s approach to application in authentic school contexts. We encountered critical differences between results of studies that used treatment-to-fidelity approaches (e.g., Gage et al., 2019) versus intent-to-treat approaches that ignored treatment adherence and prioritized the randomization process (e.g., Gregory, Hafen, et al., 2016). B. G. Cook and Odom (2013) argued that “implementation is the critical link between research and practice” (p. 138), and if schools struggle to implement with fidelity, program efficacy matters little. It is no surprise that researchers who took a treatment-to-fidelity approach—eliminating schools that tried a program but did not achieve fidelity (i.e., Gage et al., 2019)—demonstrated higher levels of efficacy than researchers who took an intent-to-treat approach and kept schools in the sample regardless of fidelity (i.e., Cruz & Rodl, 2018). It is well established that implementation fidelity is critical (Fixsen et al., 2013; Kim et al., 2018), and our results further demonstrate that schools may struggle to implement particular interventions without robust researcher support (e.g., C. R. Cook et al., 2018) or external grant funding (e.g., Mansfield et al., 2018). Both demonstration studies, which provide direct researcher support for the intervention, and routine studies, which only classify schools achieving a certain metric of fidelity as treated schools, can provide valuable evidence that a program may work. However, further research is needed to establish whether schools can implement these programs on a routine basis to fidelity without considerable outside support and funding.

Discussion

In recent decades, scholars have undertaken considerable effort to understand discipline disparities (Welsh & Little, 2018), yet there is a dearth of quality research focused explicitly on disparity-reducing interventions. Our findings indicate that research on school discipline has largely cohered not only around a “color-evasive” approach (Annamma et al., 2017) but also around a larger, neurotypical, and socially normative approach that avoids addressing the wide range of variability present in diverse classrooms. This has resulted in insufficient data regarding the extent to which embedded structural and personal biases affect intervention effectiveness. The studies we found that overtly examined differential reductions concluded with a common refrain: “Although our data suggests that the rate of suspension and expulsion decreased, disparities may remain” (Osher et al., 2014, p. 1). The primary purpose of this analysis was to provide relevant program direction to districts seeking to reduce disparities in exclusionary discipline. It is clear that schools have a variety of options for reducing ODRs, out-of-school suspensions, and expulsion rates, and it is encouraging that many of these options reduced exclusionary practices overall. However, it is also clear that schools lack programming to reduce disproportionality in exclusionary discipline practices.

Our analysis indicated that schools have several comprehensive programs available that address student-teacher relationships and teacher practice at various points on the discipline continuum. Programs under study demonstrated a shift from exclusively focusing on student behavior toward relationships within the school and community, as recommended in Gregory et al.’s (2017) framework. A small set of studies indicated that teacher coaching strategies may lead to reduced exclusionary discipline for all students, including those belonging to vulnerable groups, but few provided significant evidence of disparity reduction, and—critically—many require costly adoption measures and ongoing external support, which may limit true potential for impact (Smolkowski, Strycker, et al., 2016).

There is substantial evidence from rigorous research that SWPBIS reduces ODRs and out-of-school suspensions, and this is reflected in our analysis. Perhaps because SWPBIS is not designed with a specific equity focus, we did not find evidence that SWPBIS alone is effective at reducing disparities. In fact, we found evidence that it may exacerbate gaps in certain contexts. Vincent and Tobin (2011) posited that we still know very little about how SWPBIS implementation differs in culturally homogeneous versus culturally heterogeneous schools and that context may be an important consideration that has gone unaddressed in SWPBIS implementation. Welsh and Little (2018) discussed the conceptual underpinnings of programs primarily designed to address student behavior without consideration of larger underlying drivers of disproportionality, and they theorized that “addressing the biases and cultural clashes that may be driving discipline disparities” (p. 773) is a critical component in addressing entrenched inequities. McIntosh et al. (2018) and Scott et al. (2012) provided preliminary evidence that being intentional with school personnel regarding the goal of reducing inequitable discipline practices—building awareness around disparities and tracking progress with disaggregated data—can supplement an SWPBIS framework, but these strategies are still unlikely to address the underlying drivers that Welsh and Little emphasized.

We also confirmed a paucity of rigorous evidence for restorative justice’s capacity to reduce exclusionary discipline, both overall and in terms of disparities. We found no high-quality RCTs examining restorative justice, although some articles indicated that larger RCTs may be forthcoming. Studies that used extensive control covariates in an ORD were largely focused on the same data set (i.e., Denver Public Schools), and these examined differential processing rather than prevention. Though the differential processing line of research is critical, given that African American and Latinx students often receive harsher consequences after a disciplinary infraction, programs addressing postinfraction processing are situated on just one end of the prevention-intervention continuum. These studies did not examine the potential for restorative justice to prevent initial infractions, and, as was the case in Denver Public Schools, they often required a student to take sole responsibility for an incident through the reintegration process. In the Denver Public Schools studies, if students failed to accept responsibility because of perceived injustice or unfairness in the initial infraction assignment, they were assigned an exclusionary consequence. Restorative justice advances the idea that authentic spaces be built for students and staff to work through conflict, but these spaces are not power-free and can quickly become a site for surveillance when students are forced to share their motivations for certain behavior (Lustick, 2017). Whereas Mansfield et al. (2018) stated that a restorative approach prioritizes “engaging students socially in the school community” as opposed to “social control” (p. 306), our results suggest that, in forcing students to admit to and repair harm on the intervention end of the continuum without addressing fairness through prevention, a level of social control remains present in some of this research. Given the popularity of the program among social justice leaders and within the public lexicon, and the potential ability of the program to address student-teacher relationships and school climate, further research using rigorous methodology is critical.

Considering the aforementioned studies on restorative justice, and the two studies that examined the Virginia Student Threat Assessment protocol (Cornell et al., 2012; Cornell et al., 2018), there are options for schools seeking equity in differential processing following a serious infraction (e.g., a threat of violence). We found that schools may address this issue through use of uniform protocols that provide alternatives to the zero-tolerance narrative. Multidisciplinary, school-based teams trained to assess and respond to student infractions with a well-defined yet flexible process should apply such a protocol (Cornell et al., 2012), and the process should support students who have caused harm in the learning environment to repair the harm and remain in or return to the learning environment (Anyon et al., 2016). As mentioned, however, these studies did not provide guidance on prevention efforts, again indicating a focus on the reactive side of the school-discipline continuum. Schools that support more positive social bonds among practitioners and students and increase feelings of belonging for all students may reduce the need for these processing protocols.

In conjunction with Welsh and Little (2018), our results highlight the continued proliferation of “color-evasive” interventions available to schools. However, we identified a small set of studies (e.g., Gregory, Hafen, et al., 2016; Okonofua et al., 2016) that did not identify specific equity foci within their intervention frameworks but still found disparity reduction for African American students in particular. These studies showed that teachers who gained a deeper understanding of students as individuals, and who provided instruction that communicated high expectations for analytic thinking, can become a powerful driver for disparity reduction, even without explicit training to reduce implicit bias and increase cultural consciousness. These interventions should be implemented with caution, as we also found that key environmental factors in learning environments did not always provide a uniformly positive impact on different groups; the impact of risk and protective factors varies based on developmental timing, family and social circumstances, and niche-specific contexts that exist across cultures (Masten, 2015). For example, certain programs in the school context (e.g., SWPBIS absent culturally responsive pedagogy) may function as a protective factor for White and female students while failing to do so for non-White and male students. Welsh and Little (2018) questioned whether the conceptual underpinnings behind the array of available alternative approaches sufficiently address the sociocultural causes of discipline disparities. This is a critical question for future study, especially given that some scholars assert that interventions meant to “fix students of color” (Gorski, 2019, p. 58) perpetuate deficit ideologies and, thus, exacerbate racial inequities.

Finally, we found little evidence that studies have explicitly addressed students’ intersectional identities in the context of unfair discipline practices, and the ways in which oppressive policies and practices have disparate effects on groups whose needs are poorly served through policy and practice design (Crenshaw, 1991). Studies most commonly included disaggregated racial outcomes, and a small set of studies included disability labels or gender, but almost no studies included outcomes for the most vulnerable students (e.g., African American males with disabilities). We believe this relates to insufficient theorization in this literature of the underlying drivers of discipline disparities. Though Mansfield et al. (2018) included the importance of intersectionality in selecting interventions meant to support marginalized students, and Gregory, Clawson, et al. (2016) discussed the humanist origins behind restorative justice in regard to its use in building equity, clear grounding in an established theoretical framework is critical in guiding researchers to select interventions that address underlying drivers of disparities and, thus, affect outcomes. Future studies should be grounded in a theoretical or conceptual framework that provides a rationale for the selected intervention, analytic design, and included covariates. Doing so may more effectively target the intersectional impact of racism, ableism, and sexism that pervades inequitable systems within schools (Annamma, 2014) but remains unaddressed in existing programs. Given the complexity of inequities, as evidenced in the efficacy of programs in reducing discipline enactment but not disparities, it is likely that disparate practices are rooted in different student groups accessing fundamentally different school programs and resources in a given context (Carter et al., 2013; Carter et al., 2017; Orfield & Ee, 2014). Thus, clarifying the epistemological assumptions underlying proposed solutions is critical.

Limitations

This review is limited in that it did not include books, book chapters, dissertations, or other published works apart from peer-reviewed journal articles. Several such works, particularly doctoral dissertations on restorative justice, may have provided additional evidence for its efficacy (see Fronius et al., 2019, for a complete review). However, we chose not to include these studies, given that they had not undergone peer review. Additionally, Fronius et al. (2019) reported that there were several large-scale RCTs of restorative justice under way, but these had not been published at the time of this writing. Future work should consider these studies in an effort to reduce publication bias and increase our understanding of restorative justice’s efficacy in reducing disparities. This review also did not include qualitative studies because we aimed to focus on direct evidence of disparity reduction rather than the processes and perceptions experienced by practitioners implementing these interventions. However, we acknowledge that additional information related to discipline disproportionality is available from a wide range of sources, which are relevant in informing future implementation research.

In addition, this study was limited in that we were not able to conduct a full-scale meta-analysis. We reported odds ratios as they appeared in the original studies. Although odds ratios are a measure of effect size, in that they identify the strength and direction of a relationship, we recognize that they are unstandardized effect sizes. Studies that have different units of measure for the dependent variable (e.g., counts vs. binary indicators) are therefore difficult to compare. The limited number of available studies, along with their vastly different methodological designs, analyses, covariate adjustments, and samples, meant that a meta-analysis was not feasible. Future research should consider the smaller set of RCTs available and use meta-analytic procedures to explore moderators for each program type.
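
As an illustration of the odds ratios discussed above, the following minimal Python sketch computes an odds ratio and a Wald 95% confidence interval from a 2 × 2 table of discipline outcomes. The function name and all cell counts are hypothetical and were invented for this example; they are not drawn from any study in this review, and individual studies may have estimated their odds ratios differently (e.g., from covariate-adjusted models).

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: program students suspended      b: program students not suspended
    c: comparison students suspended   d: comparison students not suspended
    z: normal quantile for the interval (1.96 gives roughly 95%)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    log_or = math.log((a * d) / (b * c))
    # Standard error of the log odds ratio
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Hypothetical counts, invented for illustration only
or_, ci_low, ci_high = odds_ratio_with_ci(a=30, b=170, c=50, d=150)
print(f"OR = {or_:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

An odds ratio below 1 in this sketch would indicate lower odds of suspension for program students. Because the value depends on how the outcome was coded, odds ratios from a binary suspension indicator are not directly comparable with estimates based on referral counts.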

Conclusions and Future Directions

This study examined key components of disparity-reducing discipline programs, in alignment with Gregory et al.’s (2017) framework, and the available evidence underlying these programs’ impact on discipline disparities. Our analysis indicated a trend toward research on multilevel programs that address student-teacher relationships and teacher practices. We also uncovered several gaps in the literature that should be prioritized in future study. We found that few studies offered disaggregated estimates of program efficacy by demographic group, a critical aspect of research if we are to understand whether such programs actually address unfair practices. Future studies should include interaction terms wherever possible to further elucidate impact on marginalized groups. Additionally, we found that the ways in which treatment fidelity was considered affected interpretation of program efficacy. We therefore emphasize that future studies must be clear about how fidelity is measured and included in analyses. Although preliminary studies might examine treatment fidelity, evaluations must eventually consider routine implementation before a program can be considered effective and, ultimately, feasible. Program implementation research could begin with demonstration studies, and, when these show a program to be efficacious, researchers should then proceed to studies that examine implementation under typical school conditions before undertaking costly, highly controlled RCTs (Hill et al., 2013). Additionally, future studies should be clear about whether a program supports prevention or intervention, or whether the program addresses multiple aspects of the discipline continuum, in line with Gregory et al.’s (2017) framework.
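
The recommendation above to include interaction terms can be illustrated with a small Python sketch. It relies on a standard property of logistic regression: in a saturated model with program, group, and program × group terms, the exponentiated interaction coefficient equals the ratio of the two group-specific odds ratios. The helper function, group labels, and counts below are hypothetical and invented for illustration; they do not reproduce any reviewed study’s analysis.

```python
import math


def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds ratio comparing program (t) with comparison (c) students."""
    odds_t = events_t / (n_t - events_t)
    odds_c = events_c / (n_c - events_c)
    return odds_t / odds_c


# Hypothetical counts: students suspended out of students enrolled
or_group_a = odds_ratio(events_t=40, n_t=200, events_c=60, n_c=200)
or_group_b = odds_ratio(events_t=10, n_t=200, events_c=30, n_c=200)

# Ratio of odds ratios: the exponentiated interaction coefficient
# in a saturated logistic model (program, group, program x group)
ratio_of_ors = or_group_a / or_group_b
interaction_coef = math.log(ratio_of_ors)

print(f"OR, group A: {or_group_a:.2f}")
print(f"OR, group B: {or_group_b:.2f}")
print(f"Ratio of ORs: {ratio_of_ors:.2f}")
```

With these invented counts, both odds ratios fall below 1, so the program is associated with lower odds of suspension in each group. The ratio exceeds 1, however, meaning the reduction is weaker for group A than for group B. An overall effect estimate would not reveal this pattern, which is why disaggregated estimates matter for assessing disparity reduction.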

Our analysis also highlighted the importance of local context in understanding students’ social-emotional needs and development. Although we identified some studies that examined national- and state-level data, most analyzed a single district with unique needs and resources. For example, Osher et al.’s (2014) study of a Cleveland-area school district illustrated the need to understand an educational agency’s unique challenges before implementing a program meant to support the school, district, and community. Programs in urban California may serve needs and goals different from those of programs supporting students in rural Idaho. Studies that use learning-lab techniques (Bal et al., 2018) or design-based school improvement techniques (Mintrop, 2016) that engage with schools’ and districts’ contextual needs are an area for future research in reducing discipline disparities. Researchers must be explicit in describing the local contexts in which interventions are studied.

Finally, one of our most notable findings was that one study (Gregory, Hafen, et al., 2016) provided robust evidence of disparity reduction despite the intervention being focused on instructional practices rather than on equity-specific discipline practices. This is notable because few studies on exclusionary discipline have examined the relationship between classroom instruction and classroom management, both of which are critical factors in student engagement. The idea that teachers should hold high expectations for all students is well established in the empirical base for culturally responsive pedagogy (see Hammond, 2014; Ladson-Billings, 1995; Valenzuela, 1999). That teachers in Gregory, Hafen, et al.’s study integrated higher order thinking skills into their pedagogy and, in doing so, significantly reduced discipline disparities indicates the need for further analysis of increased academic rigor as a way to reduce discipline disparities. The authors hypothesized that, through “the opportunity to engage in cognitively demanding problem-solving tasks, Black students may detect their teachers’ high expectations and confidence in them as scholars” (Gregory, Hafen, et al., 2016, p. 186).

We believe it is a critical gap that recommendations for improving discipline practices have revolved around school and classroom climate, multitiered systems of support, and collaboration among practitioners (Morgan et al., 2014) without attending to depth and rigor of instruction. We cannot overstress the need for further research in this area, as it holds potential both for increasing positive relationships between teachers and students and for building academic opportunities for marginalized groups. It may be that “training teachers to strengthen the motivating and engaging qualities of instruction” (Anyon et al., 2016, p. 1688) functions as a prevention strategy. Though this is an understudied approach in the disparity reduction literature, we encourage schools and districts wishing to address discipline inequities to consider the integrated nature of pedagogy and social/emotional supports.

In addition to understanding the dynamic nature of schools as they comprise students’ sociocultural contexts, studies must consider the ways that programming may improve school and classroom climate and student-teacher interactions through instructional and pedagogical design. Though we found a lack of empirically supported solutions, there is reason for optimism as the field learns from this body of work and systematically tests innovative approaches in line with Gregory et al.’s (2017) framework. As disparities research identifies systemic and structural issues that pervade educational opportunity, this study provides insight into a small but critical aspect of school quality. Given the persistent racial and class stratification that exists in society (e.g., racial segregation, discriminatory housing policies, unequal access to health care; see Desmond & Emirbayer, 2015) and, in turn, affects how schools and districts operate and ultimately shapes educational opportunities and outcomes (Carter et al., 2013), future educational research has the potential to provide a rich understanding of interventions that build students’ identities as learners and allow teachers to bring empathy and understanding to the learning environment.

Authors

REBECCA A. CRUZ earned her PhD in special education from a joint doctoral program between University of California, Berkeley, and San Francisco State University (SFSU) and is an assistant professor of education at San Jose State University, 1 Washington Sq., San Jose 95112-3613, USA; email: rebecca.cruz01@sjsu.edu. Her research interests include sociology of education, education policy, quasi-experimental and longitudinal data analysis, educational stratification, educational equity, and critical disability theory. Prior to beginning her doctoral studies, Rebecca earned a master’s degree in special education from SFSU and worked in middle and high school settings to develop co-teaching and inclusion models.

ALLISON R. FIRESTONE is a PhD candidate in the University of California, Berkeley, and San Francisco State University’s joint doctoral program in special education (University of California Berkeley, 2121 Berkeley Way, 4th Floor, GSE, Berkeley, CA 94118; email: allisonfirestone@berkeley.edu). Allison’s research focuses on evidence-based practices for teacher preparation, with a specific emphasis on developing teachers’ competence in inclusive practices. Prior to beginning her doctoral studies, she earned a master’s degree in special education from the University of Oregon and worked as a special education teacher at the elementary school level.

JANELLE E. RODL earned her PhD in special education from a joint doctoral program between University of California, Los Angeles, and California State University, Los Angeles, and is an assistant professor of special education (mild/moderate disabilities) at San Francisco State University, 1600 Holloway Ave., San Francisco, CA 94132-1722, USA; email: jrodl@sfsu.edu. Janelle’s primary research interests include developing equitable measures of special education teacher efficacy, examining disproportionality in special education, and improving teacher preparation. Prior to earning her doctoral degree, she taught students with mild to moderate disabilities in an inclusive secondary setting.

References

Annamma

S. A.

(2014). Disabling juvenile justice: Engaging the stories of incarcerated young women of color with disabilities. Remedial and Special Education, 35(5), 313–324. https://doi.org/10.1177/0741932514526785

Annamma

S. A.

Ferri

B. A.

Connor

D. J.

(2018). Disability critical race theory: Exploring the intersectional lineage, emergence, and potential futures of DisCrit in education. Review of Research in Education, 42(1), 46–71. https://doi.org/10.3102/0091732X18759041

Annamma

S. A.

Jackson

D. D.

Morrison

(2017). Conceptualizing color-evasiveness: Using dis/ability critical race theory to expand a color-blind racial ideology in education and society. Race Ethnicity and Education, 20(2), 147–162. https://doi.org/10.1080/13613324.2016.1248837

*Anyon

Gregory

Stone

Farrar

Jenson

J. M.

McQueen

Downing

Greer

Simmons

(2016). Restorative interventions and school discipline sanctions in a large urban school district. American Education Research Journal, 53(6), 1663–1697. https://doi.org/10.3102/0002831216675719

*Anyon

Jenson

J. M.

Altschul

Farrar

McQueen

Greer

Downing

Simmons

(2014). The persistent effect of race and the promise of alternatives to suspension in school discipline outcomes. Children and Youth Services Review, 44, 379–386. https://doi.org/10.1016/j.childyouth.2014.06.025

Artiles

A. J.

(2011). Toward an interdisciplinary understanding of educational equity and difference: The case of the racialization of ability. Educational Researcher, 40(9), 431–445. https://doi.org/10.3102/0013189X11429391

Augustine

C. H.

Engberg

Grimm

G. E.

Lee

Wang

E. L.

Christianson

Joseph

A. A.

(2018). Can restorative practices improve school climate and curb suspensions? An evaluation of the impact of restorative practices in a mid-sized urban school district. RAND Corporation. https://www.rand.org/pubs/research_reports/RR2840.html

Bal

Afacan

Cakir

H. I.

(2018). Culturally responsive school discipline: Implementing Learning Lab at a high school for systemic transformation. American Educational Research Journal, 55(5), 1007–1050. https://doi.org/10.3102/0002831218768796

Balfanz

Byrnes

Fox

J. H.

(2015). Sent home and put off track: The antecedents, disproportionalities, and consequences of being suspended in the 9th grade. In Losen

(Ed.), Closing the school discipline gap: Equitable remedies for excessive exclusion (pp. 17–30). Teachers College Press.

10.

Blake

J. J.

Keith

V. M.

Luo

Salter

(2017). The role of colorism in explaining African American females’ suspension risk. School Psychology Quarterly, 32(1), 118–130. https://doi.org/10.1037/spq0000173

11.

Boneshefski

M. J.

Runge

T. J.

(2014). Addressing disproportionate discipline practices within a School-Wide Positive Behavioral Interventions and Supports framework: A practical guide for calculating and using disproportionality rates. Journal of Positive Behavior Interventions, 16(3), 149–158. https://doi.org/10.1177/1098300713484064

12.

Borenstein

Hedges

L. V.

Higgins

J. P. T.

Rothstein

H. R.

(2009). Introduction to meta-analysis. Wiley.

13.

Bottiani

J. H.

Larson

K. E.

Debnam

K. J.

Bischoff

C. M.

Bradshaw

C. P.

(2018). Promoting educators’ use of culturally responsive practices: A systematic review of inservice interventions. Journal of Teacher Education, 69(4), 367–385. https://doi.org/10.1177/0022487117722553

14.

Bouchard

Wong

J. S.

(2017). A jury of their peers: A meta-analysis of the effects of teen court on criminal recidivism. Journal of Youth and Adolescence, 46(7), 1472–1487. https://doi.org/10.1007/s10964-017-0667-7

15.

Bradshaw

C. P.

Mitchell

M. M.

O’Brennan

L. M.

Leaf

P. J.

(2010). Multilevel exploration of factors contributing to the overrepresentation of Black students in office disciplinary referrals. Journal of Educational Psychology, 102(2), 508–520. https://doi.org/10.1037/a0018450

16.

*Bradshaw

C. P.

Pas

E. T.

Bottiani

J. H.

Debnam

K. J.

Reinke

W. M.

Herman

K. C.

Rosenberg

M. S.

(2018). Promoting cultural responsivity and student engagement through Double Check coaching of classroom teachers: An efficacy study. School Psychology Review, 47(2), 118–134. https://doi.org/10.17105/SPR-2017-0119.V47-2

17.

*Bradshaw

C. P.

Waasdorp

T. E.

Leaf

P. J.

(2012). Effects of School-Wide Positive Behavioral Interventions and Supports on child behavior problems. Pediatrics, 130(5), e1136–e1145. https://doi.org/10.1542/peds.2012-0243

18.

Brobbey

(2018). Punishing the vulnerable: Exploring suspension rates for students with learning disabilities. Intervention in School and Clinic, 53(4), 216–219. https://doi.org/10.1177/1053451217712953

19.

Buckley

Maxwell

G. M.

(2007). Respectful schools: Restorative practices in education. A summary report. Office of the Children’s Commissioner and the Institute of Policy Studies, School of Government, Victoria University. https://transformingconflict.org/wp-content/uploads/2017/09/Respectful-Schools-report-Buckley-and-Maxwell-Australia.pdf

20.

Camacho

K. A.

Krezmien

M. P.

(2018). Individual- and school-level factors contributing to disproportionate suspension rates: A multilevel analysis of one state. Journal of Emotional and Behavioral Disorders, 27(4), 209–220. https://doi.org/10.1177/1063426618769065

21.

Carter

P. L.

Skiba

Arredondo

M. I.

Pollock

(2017). You can’t fix what you don’t look at: Acknowledging race in addressing racial discipline disparities. Urban Education, 52(2), 207–235. https://doi.org/10.1177/0042085916660350

22.

Carter

P. L.

Welner

K. G.

Ladson-Billings

(2013). Closing the opportunity gap: What America must do to give every child an even chance. Oxford University Press.

23.

Cook

B. G.

Odom

S. L.

(2013). Evidence-based practices and implementation science in special education. Exceptional Children, 79(3), 135–144. https://doi.org/10.1177/001440291307900201

24.

*Cook

C. R.

Duong

M. T.

McIntosh

Fiat

A. E.

Larson

Pullmann

M. D.

McGinnis

(2018). Addressing discipline disparities for Black male students: Linking malleable root causes to feasible and effective practices. School Psychology Review, 47(2), 135–152. https://doi.org/10.17105/SPR-2017-0026.V47-2

25.

*Cornell

Allen

Fan

(2012). A randomized controlled study of the Virginia Student Threat Assessment Guidelines in kindergarten through grade 12. School Psychology Review, 41(1), 100–115. https://doi.org/10.1080/02796015.2012.12087378

26.

*Cornell

Maeng

Huang

Shukla

Konold

(2018). Racial/ethnic parity in disciplinary consequences using student threat assessment. School Psychology Review, 47(2), 183–195. https://doi.org/10.17105/SPR-2017-0030.V47-2

27.

Cotter Stalker

. (2017). Teen court–school partnerships: Reducing disproportionality in school discipline. Children & Schools, 40(1), 17–24. https://doi.org/10.1093/cs/cdx024

28.

Crenshaw

(1991). Mapping the margins: Intersectionality, identity politics, and violence against women of color. Stanford Law Review, 43(6), 1241–1299. https://doi.org/10.2307/1229039

29.

*Cruz

R. A.

Rodl

J. E.

(2018). Crime and punishment: An examination of school context and student characteristics that predict out-of-school suspension. Children and Youth Services Review, 95, 226–234. https://doi.org/10.1016/j.childyouth.2018.11.007

30.

Desmond

Emirbayer

(2015). The racial order. University of Chicago Press.

31.

Durlak

J. A.

Weissberg

R. P.

Dymnicki

A. B.

Taylor

R. D.

Schellinger

K. B.

(2011). The impact of enhancing students’ social and emotional learning: A meta-analysis of school-based universal interventions. Child Development, 82(1), 405–432. https://doi.org/10.1111/j.1467-8624.2010.01564.x

32.

Fabelo

Thompson

M. D.

Plotkin

Carmichael

Marchbanks

M. P.

Booth

E. A.

(2011). Breaking schools’ rules: A statewide study of how school discipline relates to students’ success and juvenile justice involvement. Council of State Governments Justice Center. http://knowledgecenter.csg.org/drupal/system/files/Breaking_School_Rules.pdf

33.

Fallon

L. M.

O’Keeffe

B. V.

Sugai

(2012). Consideration of culture and context in school-wide positive behavior support: A review of current literature. Journal of Positive Behavior Interventions, 14(4), 209–219. https://doi.org/10.1177/1098300712442242

34.

Fixsen

Blase

Metz

Van Dyke

(2013). Statewide implementation of evidence-based programs. Exceptional Children, 79(3), 213–230. https://doi.org/10.1177/001440291307900206

35.

Flannery

K. B.

Fenning

Kato

M. M.

McIntosh

(2014). Efforts of school wide positive behavioral interventions and supports and fidelity of implementation on problem behavior in high schools. School Psychology Quarterly, 29(2), 111–124. https://doi.org/10.1037/spq0000039

36.

Flay

B. R.

Allred

C. G.

Ordway

(2001). Effects of the Positive Action program on achievement and discipline: Two matched-control comparisons. Prevention Science, 2(2), 71–89. https://doi.org/10.1023/a:1011591613728

37.

Forman

S. G.

Shapiro

E. S.

Codding

R. S.

Gonzales

J. E.

Reddy

L. A.

Rosenfield

S. A.

Sanetti

L. M. H.

Stoiber

K. C.

(2013). Implementation science and school psychology. School Psychology Quarterly, 28(2), 77–100. https://doi.org/10.1037/spq0000019

38.

Fronius

Darling-Hammond

Persson

Guckenburg

Hurley

Petrosino

(2019). Restorative justice in U.S. schools: An updated research review. WestEd Justice & Prevention Research Center. https://files.eric.ed.gov/fulltext/ED595733.pdf

39.

*Gage

N. A.

Grasley-Boy

Peshak George

Childs

Kincaid

(2019). A quasi-experimental design analysis of the effects of school-wide positive behavior interventions and supports on discipline in Florida. Journal of Positive Behavior Interventions, 21(1), 50–61. https://doi.org/10.1177/1098300718768208

40.

Gage

N. A.

Whitford

D. K.

Katsiyannis

(2018). A review of schoolwide positive behavior interventions and supports as a framework for reducing disciplinary exclusions. Journal of Special Education, 52(3), 142–151. https://doi.org/10.1177/0022466918767847

41.

González

(2012). Keeping kids in schools: Restorative justice, punitive discipline, and the school to prison pipeline. Journal of Law & Education, 41(2), 281–335.

42.

Gorski

(2019). Avoiding racial equity detours. Educational Leadership, 76(7), 56–61.

43.

*Gregory

Clawson

Davis

Gerewitz

(2016). The promise of restorative practices to transform teacher-student relationships and achieve equity in school discipline. Journal of Educational and Psychological Consultation, 26(4), 325–353. https://doi.org/10.1080/10474412.2014.929950

44.

Gregory

Cornell

Fan

(2011). The relationship of school structure and support to suspension rates for Black and White high school students. American Educational Research Journal, 48(4), 904–934. https://doi.org/10.3102/0002831211398531

45.

*Gregory

Hafen

C. A.

Ruzek

Mikami

A. Y.

Allen

J. P.

Pianta

R. C.

(2016). Closing the racial discipline gap in classrooms by changing teacher practice. School Psychology Review, 45(2), 171–191. https://doi.org/10.17105/SPR45-2.171-191

46.

*Gregory

Huang

F. L.

Anyon

Greer

Downing

(2018). An examination of restorative interventions and racial equity in out-of-school suspensions. School Psychology Review, 47(2), 167–182. https://doi.org/10.17105/SPR-2017-0073.V47-2

47.

Gregory

Skiba

R. J.

Mediratta

(2017). Eliminating disparities in school discipline: A framework for intervention. Review of Research in Education, 41(1), 253–278. https://doi.org/10.3102/0091732x17690499

48.

Hammond

(2014). Culturally responsive teaching and the brain: Promoting authentic engagement and rigor among culturally and linguistically diverse students. Corwin Press.

49.

*Hashim

Strunk

Dhaliwal

(2018). Justice for all? Suspension bans and restorative justice programs in the Los Angeles Unified School District. Peabody Journal of Education, 93(2), 174–189. https://doi.org/10.1080/0161956X.2018.1435040

50.

Hausman

Pierce

Briggs

(1996). Evaluation of comprehensive violence prevention education: Effects on student behavior. Journal of Adolescent Health, 19(2), 104–110. https://doi.org/10.1016/1054-139X(96)00128-0

51.

Hill

H. C.

Beisiegel

Jacob

(2013). Professional development research: Consensus, crossroads, and challenges. Educational Researcher, 42(9), 476–487. https://doi.org/10.3102/0013189x13512674

52.

Hussar

Zhang

Hein

Wang

Roberts

Cui

Smith

Bullock Mann

Barmer

Dilig

(2020). The Condition of Education 2020 (NCES 2020-144). U.S. Department of Education, National Center for Education Statistics. https://nces.ed.gov/pubs2020/2020144.pdf

53.

Jain

Bassey

Brown

M. A.

Kalra

(2014). Restorative justice in Oakland schools. Implementation and impacts: An effective strategy to reduce racially disproportionate discipline, suspension, and improve academic outcomes. Oakland Unified School District. https://www.ousd.org/cms/lib/CA01001176/Centricity/Domain/134/OUSD-RJ%20Report%20revised%20Final.pdf

54.

Jaworsky

Levitt

Cadge

Hejtmanek

Curran

(2012). New perspectives on immigrant contexts of reception. Nordic Journal of Migration Research, 2(1), 78–88. https://doi.org/10.2478/v10202-011-0029-6

55.

Kafka

(2011). The history of “zero tolerance” in American public schooling. Palgrave Macmillan.

56.

Kim

McIntosh

Mercer

S. H.

Nese

R. N. T.

(2018). Longitudinal associations between SWPBIS fidelity of implementation and behavior and academic outcomes. Behavioral Disorders, 43(3), 357–369. https://doi.org/10.1177/0198742917747589

57.

Ladson-Billings

(1995). Toward a theory of culturally relevant pedagogy. American Educational Research Journal, 32(3), 465–491. https://doi.org/10.3102/00028312032003465

58.

Losen

D. J.

(2018). Disabling punishment: The need for remedies to the disparate loss of instruction experienced by black students with disabilities. The Center for Civil Rights Remedies & The Charles Hamilton Houston Institute for Race and Justice.

59.

Losen

D. J.

Gillespie

(2012). Opportunities suspended: The disparate impact of disciplinary exclusion from school. The Civil Rights Project/Proyecto Derechos Civiles, University of California, Los Angeles. https://civilrightsproject.ucla.edu/resources/projects/center-for-civil-rights-remedies/school-to-prison-folder/federal-reports/upcoming-ccrr-research/losen-gillespie-opportunity-suspended-2012.pdf

60.

Losen

D. J.

Hodson

Keith

M. A.

II Morrison

Belway

(2015). Are we closing the school discipline gap? The Civil Rights Project/Proyecto Derechos Civiles, University of California, Los Angeles. https://www.civilrightsproject.ucla.edu/resources/projects/center-for-civil-rights-remedies/school-to-prison-folder/federal-reports/are-we-closing-the-school-discipline-gap/AreWeClosingTheSchoolDisciplineGap_FINAL221.pdf

61.

Losen

D. J.

Whitaker

(2017). Lost instruction: The disparate impact of the school discipline gap in California. Civil Rights Project-Proyecto Derechos Civiles, University of California, Los Angeles. https://files.eric.ed.gov/fulltext/ED578997.pdf

62.

Lustick

(2017). Making discipline relevant: Toward a theory of culturally responsive positive schoolwide discipline. Race Ethnicity and Education, 20(5), 681–695. https://doi.org/10.1080/13613324.2016.1150828

63.

Mallett

C. A.

(2016). The school-to-prison pipeline: A critical review of the punitive paradigm shift. Child and Adolescent Social Work Journal, 33(1), 15–24. https://doi.org/10.1007/s10560-015-0397-1

64.

*Mansfield

K. C.

Fowler

Rainbolt

(2018). The potential of restorative practices to ameliorate discipline gaps: The story of one high school’s leadership team. Educational Administration Quarterly, 54(2), 303–323. https://doi.org/10.1177/0013161X17751178

65.

Masten

A. S.

(2015). Ordinary magic: Resilience in development. Guilford.

66.

*McIntosh

Ellwood

McCall

Girvan

E. J.

(2018). Using discipline data to enhance equity in school discipline. Intervention in School and Clinic, 53(3), 146–152. https://doi.org/10.1177/1053451217702130

67.

McIntosh

Smolkowski

Gion

C. M.

Witherspoon

Bastable

Girvan

E. J.

(2020). Awareness is not enough: A double-blind randomized controlled trial of the effects of providing discipline disproportionality data reports to school administrators. Educational Researcher, 49(7), 533–537. https://doi.org/10.3102/0013189X20939937

68. Mintrop (2016). Design-based school improvement: A practical guide for education leaders. Harvard Education Press.

69. Mirsky (2011). Restorative practices: Giving everyone a voice to create safer saner school communities [Data set]. Prevention Researcher. https://doi.org/10.1037/e542592012-002

70. Mitchell, B. S., Hatton, & Lewis, T. J. (2018). An examination of the evidence-base of School-Wide Positive Behavior Interventions and Supports through two quality appraisal processes. Journal of Positive Behavior Interventions, 20(4), 239–250. https://doi.org/10.1177/1098300718768217

71. Mora, G. C. (2014). Cross-field effects and ethnic classification: The institutionalization of Hispanic panethnicity. American Sociological Review, 79(2), 183–210. https://doi.org/10.1177/0003122413509813

72. Morgan, Salomon, Plotkin, & Cohen (2014). The school discipline consensus report: Strategies from the field to keep students engaged in school and out of the juvenile justice system. Council of State Governments Justice Center. http://csgjusticecenter.org/wp-content/uploads/2014/06/The_School_Discipline_Consensus_Report.pdf

73. Morris, E. W., & Perry, B. L. (2017). Girls behaving badly? Race, gender, and subjective evaluation in the discipline of African American girls. Sociology of Education, 90(2), 127–148. https://doi.org/10.1177/0038040717694876

74. Newmann, F. M., Smith, Allensworth, & Bryk, A. S. (2001). Instructional program coherence: What it is and why it should guide school improvement policy. Educational Evaluation and Policy Analysis, 23(4), 297–321. https://doi.org/10.3102/01623737023004297

75. Ngo, & Lee, S. J. (2007). Complicating the image of model minority success: A review of Southeast Asian American education. Review of Educational Research, 77(4), 415–453. https://doi.org/10.3102/0034654307309918

76. Noltemeyer, & Mcloughlin, C. S. (2010). Patterns of exclusionary discipline by school typology, ethnicity, and their interaction. Penn GSE Perspectives on Urban Education, 7(1), 27–40.

77. Noltemeyer, A. L., Ward, R. M., & Mcloughlin, C. S. (2015). Relationship between school suspension and student outcomes: A meta-analysis. School Psychology Review, 44(2), 224–240. https://doi.org/10.17105/spr-14-0008.1

78. Nowicki, J. M. (2018). K–12 education: Discipline disparities for Black students, boys, and students with disabilities. Report to congressional requesters. GAO-18-258. U.S. Government Accountability Office. https://www.gao.gov/assets/700/690828.pdf

79. Öğülmüş, & Vuran (2016). Schoolwide positive behavioral interventions and support practices: Review of studies in the Journal of Positive Behavior Interventions. Educational Sciences: Theory & Practice, 16(5), 1693–1710. https://doi.org/10.12738/estp.2016.5.0264

80. Okonofua, J. A., & Eberhardt, J. L. (2015). Two strikes: Race and the disciplining of young students. Psychological Science, 26(5), 617–624. https://doi.org/10.1177/0956797615570365

81. *Okonofua, J. A., Paunesku, & Walton, G. M. (2016). Brief intervention to encourage empathic discipline cuts suspension rates in half among adolescents. Proceedings of the National Academy of Sciences of the United States of America, 113(19), 5221–5226. https://doi.org/10.1073/pnas.1523698113

82. Orfield (2014). Segregating California’s future: Inequality and its alternative 60 years after Brown v. Board of Education. The Civil Rights Project/Proyecto Derechos Civiles, University of California, Los Angeles. https://www.civilrightsproject.ucla.edu/research/k-12-education/integration-and-diversity/segregating-california2019s-future-inequality-and-its-alternative-60-years-after-brown-v.-board-of-education/orfield-ee-segregating-california-future-brown-at.pdf

83. *Osher, D. M., Poirier, J. M., Jarjoura, G. R., Brown, & Kendziora (2014). Avoid simple solutions and quick fixes: Lessons learned from a comprehensive districtwide approach to improving student behavior and school safety. Journal of Applied Research on Children: Informing Policy for Children at Risk, 5(2), 1–45. https://doi.org/10.1037/e534832013-001

84. Petras, Masyn, K. E., Buckley, J. A., Ialongo, N. S., & Kellam (2011). Who is most at risk for school removal? A multilevel discrete-time survival analysis of individual- and context-level influences. Journal of Educational Psychology, 103(1), 223–237. https://doi.org/10.1037/a0021545

85. Rausch, M. K., & Skiba, R. J. (2005, April). The academic cost of discipline: The relationship between suspension/expulsion and school achievement [Conference session]. American Educational Research Association Conference, Montreal, Canada. https://pdfs.semanticscholar.org/9d53/05902807041b69543b7e2d864d9e68fc56e4.pdf

86. Rumbaut, R. G. (2005). Sites of belonging: Acculturation, discrimination, and ethnic identity among children of immigrants. In Weisner, T. S. (Ed.), Discovering successful pathways in children’s development: Mixed methods in the study of childhood and family life (pp. 111–164). University of Chicago Press.

87. Rumberger, R. W., & Losen, D. J. (2017). The hidden costs of California’s harsh school discipline: And the localized economic benefits from suspending fewer high school students. The Civil Rights Project/Proyecto Derechos Civiles, University of California, Los Angeles. https://files.eric.ed.gov/fulltext/ED573326.pdf

88. *Scott, T. M., Hirn, R. G., & Barber (2012). Affecting disproportional outcomes by ethnicity and grade level: Using discipline data to guide practice in high school. Preventing School Failure: Alternative Education for Children and Youth, 56(2), 110–120. https://doi.org/10.1080/1045988X.2011.592168

89. Skiba, R. J. (2015). Interventions to address racial/ethnic disparities in school discipline: Can systems reform be race-neutral? In Bangs, & Davis, L. E. (Eds.), Race and social problems: Restructuring inequality (pp. 107–124). Springer Science and Business Media. https://doi.org/10.1007/978-1-4939-0863-9_7

90. Skiba, R. J., Chung, C. G., Trachok, Baker, T. L., Sheya, & Hughes, R. L. (2014). Parsing disciplinary disproportionality: Contributions of infraction, student, and school characteristics to out-of-school suspension and expulsion. American Educational Research Journal, 51(4), 640–670. https://doi.org/10.3102/0002831214541670

91. Skiba, R. J., Horner, R. H., Chung, C. G., Rausch, M. K., May, S. L., & Tobin, T. J. (2011). Race is not neutral: A national investigation of African American and Latinx disproportionality in school discipline. School Psychology Review, 40(1), 85–107. https://doi.org/10.1080/02796015.2011.12087730

92. Skiba, R. J., & Knesting (2001). Zero tolerance, zero evidence: An analysis of school disciplinary practice. In Skiba, R. J., & Noam, G. G. (Eds.), New directions for youth development (pp. 17–43). Jossey-Bass.

93. Skiba, R. J., Michael, R. S., Nardo, A. C., & Peterson, R. L. (2002). The color of discipline: Sources of racial and gender disproportionality in school punishment. Urban Review, 34, 317–342. https://doi.org/10.1023/A:1021320817372

94. Skiba, R. J., & Rausch, M. K. (2006). Zero tolerance, suspension, and expulsion: Questions of equity and effectiveness. In Evertson, C. M., & Weinstein, C. S. (Eds.), Handbook for classroom management: Research, practice, and contemporary issues (pp. 1063–1089). Lawrence Erlbaum.

95. Slavin, R. E. (1986). Best-evidence synthesis: An alternative to meta-analytic and traditional reviews. Educational Researcher, 15(9), 5–11. https://doi.org/10.3102/0013189x015009005

96. Smolkowski, Girvan, E. J., McIntosh, Nese, R. N., & Horner, R. H. (2016). Vulnerable decision points for disproportionate office discipline referrals: Comparisons of discipline for African American and White elementary school students. Behavioral Disorders, 41(4), 178–195. https://doi.org/10.17988/bedi-41-04-178-195.1

97. Smolkowski, Strycker, & Ward (2016). Scale-up of Safe & Civil Schools’ model for school-wide positive behavioral interventions and supports. Psychology in the Schools, 53(4), 339–358. https://doi.org/10.1002/pits.21908

98. Song, S. Y., & Swearer, S. M. (2016). The cart before the horse: The challenge and promise of restorative justice consultation in schools. Journal of Educational and Psychological Consultation, 26(4), 313–324. https://doi.org/10.1080/10474412.2016.1246972

99. Sugai, Horner, R. H., & Todd (2000). Effective Behavior Support Self-Assessment Survey (EBS-SAS). https://www.pbis.org/resource/sas

100. Sullivan, A. L., Klingbeil, D. A., & Van Norman, E. R. (2013). Beyond behavior: Multilevel analysis of the influence of sociodemographics and school characteristics on students’ risk of suspension. School Psychology Review, 42(1), 99–114. https://doi.org/10.1080/02796015.2013.12087493

101. Sumner, M. D., Silverman, C. J., & Frampton, M. L. (2010). School-based restorative justice as an alternative to zero-tolerance policies: Lessons from West Oakland. Thelton E. Henderson Center for Social Justice, University of California, Berkeley, School of Law. https://www.law.berkeley.edu/files/thcsj/10-2010_School-based_Restorative_Justice_As_an_Alternative_to_Zero-Tolerance_Policies.pdf

102. Theriot, M. T., Craun, S. W., & Dupper, D. R. (2010). Multilevel evaluation of factors predicting school exclusion among middle and high school students. Children and Youth Services Review, 32(1), 13–19. https://doi.org/10.1016/j.childyouth.2009.06.009

103. Valenzuela (1999). Subtractive schooling: Issues of caring in education of US-Mexican youth. State University of New York Press.

104. *Vincent, C. G., Swain-Bradway, Tobin, T. J., & May (2011). Disciplinary referrals for culturally and linguistically diverse students with and without disabilities: Patterns resulting from school-wide positive behavior support. Exceptionality, 19(3), 175–190. https://doi.org/10.1080/09362835.2011.579936

105. *Vincent, C. G., & Tobin, T. J. (2011). The relationship between implementation of School-Wide Positive Behavior Support (SWPBS) and disciplinary exclusion of students from various ethnic backgrounds with and without disabilities. Journal of Emotional and Behavioral Disorders, 19(4), 217–232. https://doi.org/10.1177/1063426610377329

106. Wallace, J. M., Goodkind, Wallace, C. M., & Bachman, J. G. (2008). Racial, ethnic, and gender differences in school discipline among U.S. high school students: 1991–2005. Negro Educational Review, 59(1–2), 47–62.

107. Welsh, R. O., & Little (2018). The school discipline dilemma: A comprehensive review of disparities and alternative approaches. Review of Educational Research, 88(5), 752–794. https://doi.org/10.3102/0034654318791582

108. Wilson, S. J., Lipsey, M. W., & Derzon, J. H. (2003). The effects of school-based intervention programs on aggressive behavior: A meta-analysis. Journal of Consulting and Clinical Psychology, 71(1), 136–149. https://doi.org/10.1037/0022-006X.71.1.136