Abstract
School disciplinary processes are an important mechanism of inequality in education. Most prior research in this area focuses on the significantly higher rates of punishment among African American boys, but in this article, we turn our attention to the discipline of African American girls. Using advanced multilevel models and a longitudinal data set of detailed school discipline records, we analyze interactions between race and gender on office referrals. The results show troubling and significant disparities in the punishment of African American girls. Controlling for background variables, black girls are three times more likely than white girls to receive an office referral; this difference is substantially wider than the gap between black boys and white boys. Moreover, black girls receive disproportionate referrals for infractions such as disruptive behavior, dress code violations, disobedience, and aggressive behavior. We argue that these infractions are subjective and influenced by gendered interpretations. Using the framework of intersectionality, we propose that school discipline penalizes African American girls for behaviors perceived to transgress normative standards of femininity.
Scholars of educational inequality have increasingly turned their attention to disparities in school discipline. Student discipline is a necessary condition for learning, but research indicates that who is punished and how one is punished differs strikingly by race, class, and gender. A 1975 report by the Children’s Defense Fund first brought these disparities to light, demonstrating that African American students were twice as likely as white students to receive a suspension (Children’s Defense Fund 1975). Unfortunately, little progress has been made since 1975 in mitigating school punishment disparities. In fact, school suspension rates have doubled since the 1970s, and African American students today are three times more likely than white students to receive a suspension (Losen et al. 2015). Research has identified similar racial inequalities in other disciplinary outcomes, such as expulsions (KewalRamani et al. 2007; Wallace et al. 2008), office referrals (Rocque 2010; Skiba et al. 2002), and classroom reprimands (Ferguson 2000).
Disparities in school punishment have a less straightforward relationship to gender inequality. Boys are disciplined more often and more severely than girls despite men as a whole maintaining greater social power. The picture deepens even more when race and gender are considered simultaneously. Black boys are punished at vastly disproportionate rates compared to other race-gender groups (see Noguera 2003; Wallace et al. 2008), suggesting that the intersection of race and gender reveals important patterns in school discipline. The voluminous scholarship on the disciplinary and educational plight of black boys has captured widespread public attention, prompting targeted policies, such as the federal My Brother’s Keeper initiative, to improve educational outcomes among boys of color.
However, scholars and advocates have directed comparatively little attention to patterns of discipline among girls. In this article, we apply an intersectional framework to demonstrate that black girls also suffer from school discipline processes but in different ways from black boys. Intersectionality suggests that complex inequalities emanate from distinct stereotypes and modes of oppression that result from overlapping systems of inequality (Collins 1990; Crenshaw 1991). Using a detailed, longitudinal data set derived from school district records, we examine the relationship between race and gender on the odds of students receiving the most basic form of school discipline: office referral. This analysis reveals striking interactions between race and gender. Consistent with previous research, we find that African American students are more likely to be punished overall. However, gender interactions show that black girls are much more likely than other girls to be cited for infractions such as dress code violations, disobedience, disruptive behavior, and aggressive behavior—and these gaps are far wider than the gaps between black boys and boys of other races for these offenses. These disparities cannot be understood by viewing race or gender separately. Instead, we show how the interweaving of race and gender produces unique, and troubling, punitive inequalities for black girls.
Background
Intersectionality
The framework of intersectionality seeks to ascertain complex permutations of race, class, and gender inequality. Intersectionality emerged from the critical scholarship of women of color such as Crenshaw (1989, 1991), hooks (1984), and Collins (1990), among others. Central to this framework is the assertion that inequalities and identities of race, class, and gender must be analyzed simultaneously, not in isolation. Women’s experiences, for example, are fundamentally transformed by intersections with race and class, rendering “essential” womanhood moot (Spelman 1988). Intersectional analysis searches for the complex ways in which inequalities stemming from race and gender are intertwined, interactive, and mutually constitutive (Shields 2008). Race and gender are not simply discrete variables that can be taken apart and added together. Instead, the meanings and effects of race occur only through gender, and in turn, the meanings and effects of gender occur only through race.
Despite its enormous utility, the methodological deployment of intersectionality continues to be relatively narrow (Choo and Ferree 2010). Scholarship using this term tends to be more theoretical than empirical, and the empirical research tends to use qualitative methods, such as historical-comparative analysis, in-depth interviewing, and ethnography (McCall 2005). Moreover, contemporary analyses of intersectionality tend to focus on subjective experience rather than examining institutional patterns of inequalities. The visceral, daily experiences of intersecting axes of inequality are obviously important, but it is also important to measure and document these inequalities as they appear in large-scale organizations. In this article, we move scholarship on intersectionality in important new directions through the use of sophisticated quantitative modeling techniques to capture race-gender patterns in school discipline.
Race, Gender, and School Punishment
Racial disparities in school punishment are well documented. In 2014, the U.S. Department of Education issued a set of guiding principles regarding school punishment, reviewing the literature on racial disparity in discipline, and reminding educators of the requirement to administer discipline fairly (U.S. Department of Education 2014). Persistent and severe punishment creates a wide range of negative effects. High levels of school suspension are linked to lower academic achievement at the individual and school levels (Perry and Morris 2014). Regularly disciplined students often feel spurned by educational institutions, prompting a cycle of disengagement that can include dropping out of high school and contact with the criminal justice system (Nicholson-Crotty, Birchmeier, and Valentine 2009; Peguero and Bracy 2015).
Although recent reports indicate that some schools have reduced exclusionary punishments, studies continue to reveal significant racial disproportionality (Losen et al. 2015). A burgeoning psychological literature suggests that such disparities may emerge from implicit biases. Implicit bias refers to preconscious, unacknowledged schemas that distort perceptions of racial outgroup members (Dovidio, Kawakami, and Gaertner 2002; Payne 2006). Such schemas arise without a person’s conscious awareness—and even against one’s stated intentions or beliefs—especially in ambiguous or tense circumstances (Payne 2006). Forsyth and colleagues (2015) find that African American students are punished primarily for subjective infractions such as disobedience or defiance, suggesting that implicit bias might influence interpretations of student behavior. Similarly, Skiba and colleagues (2011) find that educators punish African American and Latino students more severely than whites for the same or similar behavior, indicating that educators interpret transgressions more critically when they are exhibited by children of color (for analogous research on teacher perceptions, see Downey and Pribesh 2004; McGrady and Reynolds 2013).
The scholarship on implicit bias, subjective evaluation, and school discipline tends to focus on race, but complex biases emerge at the intersection of race and gender. Perceptions of masculinity, for example, vary importantly when combined with race, producing heightened social control and perceptions of dangerous “hypermasculinity” for young men of color (Collins 2005; Oeur 2016; Rios 2011). Less work examines the surveillance and punishment of young women of color, but studies do provide insight into the intersection of race and perceptions of femininity. Race may create space for alternative femininity, allowing black girls more leeway to challenge gender strictures (Fordham 1993), but African American girls are still evaluated according to white gender standards, especially within dominant institutions (Ispa-Landa 2013). Based on a study of classroom observations, Morris (2007) found that educators disciplined African American girls for assertive behavior interpreted as loud and overbearing. Latina and white girls in the same school did not receive similar admonishments to behave like “ladies,” even when they exhibited similar behavior and clothing (Morris 2007). African American girls were punished primarily for perceptions of gendered transgressions, but race shaped the enactment and perception of gender in the evaluations of these transgressions.
A handful of quantitative studies have discovered significantly disparate patterns of discipline for black girls. Wallace and colleagues (2008) find that black girls are over five times more likely than white girls to report being suspended or expelled. Using data from two nationally representative data sets, Hannon, DeFina, and Bruch (2013) show that African American girls are also more likely to report receiving a suspension than are white girls. Whereas such analyses focus on self-reported suspensions, Blake and colleagues (2011) analyze school records to examine reasons for black girls’ punishment. They find significant differences from white girls in disciplinary infractions such as defiance, inappropriate dress, and physical fighting. This punishment of African American girls, although typically less severe than punishment of African American boys, can have important negative effects. Recently, two public policy reports have built on such findings and attempted to move black girls’ detrimental educational and disciplinary experiences into the public dialogue (Crenshaw, Ocen, and Nanda 2015; Morris 2016).
This research forms an important foundation from which to examine African American girls’ disciplinary patterns. However, critical gaps remain. First, previous analyses have not used sophisticated modeling techniques, and they run the risk of confounding the relationship between race, gender, and discipline. Blake and colleagues (2011) use cross-sectional data and are unable to control for important covariates such as socioeconomic status, school location, special education status, and academic proficiency. Our multilevel modeling, using a large longitudinal data set from school district records, controls for these covariates as well as all time-invariant characteristics of students. Second, previous work has not examined a full range of disciplinary infractions. Wallace and colleagues (2008) and Hannon and colleagues (2013) examine self-reported suspension and expulsion, but they did not have direct access to school records to assess the reasons for suspension. Blake and colleagues (2011) focus only on a selection of disciplinary categories. Using school records of office disciplinary referrals, our analysis examines all types of violations, using the school district’s classification system to assess severity of violations. Examining patterns across different types of offenses, including offenses that are more or less serious and/or subjective, provides important insight about the potential for racial and gender bias. Finally, most previous work in this area compares black girls to white girls (for an exception, see Wallace et al. 2008). Inspired by the full complexity of intersectionality, our approach examines patterns both across and within gender, allowing a comparison of white boys and black girls, for example, on a range of disciplinary actions. Doing so provides critical information about groups most at risk for discipline at the intersection of race and gender.
In short, we believe that our analysis provides the most robust study of African American girls and school discipline to date. Using advanced multilevel methods that capitalize on the rich explanatory power of longitudinal and hierarchical data, we focus on the following questions:
Research Question 1: How does race or ethnicity moderate the effects of gender on the predicted odds of receiving an office referral for black girls?
Research Question 2: How do the moderating effects of race or ethnicity on gender differences in office referrals vary by the severity of the violation?
Research Question 3: What types of rule violations are disproportionately likely to be attributed to black girls?
Methods
This analysis draws on data from the Kentucky School Discipline Study (KSDS) compiled by the authors. We merged de-identified school records and supplementary data collected routinely from parents in a large, urban public school district. We obtained all data on school discipline directly from school records, eliminating any selection bias and social desirability effects that occur when students or parents report their own behavior. For each student offense resulting in any disciplinary action (e.g., office referral, detention, suspension, or expulsion), school personnel in this district are required to complete an electronic information form about the offense, all students involved, and any response by school officials. The district stores this information for the purposes of monitoring school safety and reporting discipline statistics to the state, and the process is well regulated. Protocols for determining and recording office disciplinary referrals can differ widely across schools (Irvin et al. 2004), but the district our sample is drawn from uses systematic reporting procedures and pre-established, mutually exclusive categories that are followed districtwide.
Our sample includes students in Grades 6 through 12 (middle and high school) who were enrolled in a district public school over a four-year period beginning in August 2007 and ending in June 2011. The full sample includes 53,323 students. However, we dropped 22,512 students (42 percent of the full sample) due to missing data on achievement scores. Much of the missing data is attributable to inconsistent testing by the school district prior to 2009 during the pilot phase. By 2009–2010, full implementation of the testing was in place. Because the piloting process was random, missing data are unlikely to lead to biases. Moreover, most missing data are for a particular year rather than across all years, meaning that missing achievement scores rarely eliminated a student from the analysis sample altogether. Sensitivity analyses indicate that all significant findings, including interaction models, are robust in the full sample (results available from the authors on request). Results from the full sample without controls for achievement indicate larger racial and gender differences (and interactions) than do those from the restricted sample where achievement is controlled, indicating that the confounding effect of achievement likely biases results in ways that exaggerate the very differences we are interested in modeling. Given the important role of achievement in rule violations and discipline and the correlation with race and gender, we believe it is critical to include test scores as a control variable; we therefore use the restricted sample in the findings presented here. We dropped an additional 518 student cases whose racial or ethnic categories had cell sizes too small to support interaction models.
The analysis sample includes 30,202 students nested in 22 middle and high schools, providing a total of 56,676 observations over four years of data. In the first year of the study, 49 percent of students in the sample are girls, and 51 percent are boys. The majority of these students are white (64 percent) or African American (24 percent); only 8 percent are Latino, and 4 percent are Asian. Among the sample, 39 percent of students qualify for free or reduced-price meals. These data, which are drawn from one school system, are not nationally representative of all public school children. Notably, a smaller percentage of the U.S. student population is non-Hispanic black (17 percent) compared to our sample, and a greater percentage is Latino (21 percent; National Center for Education Statistics 2014). However, African American populations tend to be concentrated in the southeast, where this school district is located. Consequently, these data may be reasonably representative of the southeastern United States. In addition, this sample is on par with national trends in discipline rates and with racial/ethnic and gender differences in patterns of school discipline, as reported in the national Household Education Surveys and other published work (Aud, Fox, and KewalRamani 2010; Perry and Morris 2014; U.S. Department of Education 2007). We can thus cautiously suggest that our findings will be applicable to other school districts nationally.
Measures
We examine two static characteristics of individual students as independent variables in multivariate models. Gender is coded as a binary variable (1 = female; 0 = male). We measure race in four categories and code it as binary indicators: white, African American, Latino, and Asian. White is the omitted category in all models. Because there were too few students of other races/ethnicities (e.g., Native American, Alaskan Native) to obtain accurate and stable estimates of their effects on outcomes, we omitted these students (n = 518) from all analyses.
We coded time using academic year beginning with 0 at baseline in 2008–2009 and ending with 2 in 2010–2011. We also calculated time-squared and time-cubed to assess the nonlinearity of the growth or decline in office referrals over time, but these are nonsignificant, indicating a linear model. We separated all other time-varying measures into their between-person and within-person variance to differentiate the degree to which outcomes are due to average differences between students across waves or differences over time in students’ characteristics compared to themselves at other waves (Raudenbush and Bryk 2002). Between-person variance is reflected in the average score for the four waves of the study, and it is held constant across observations nested within the same individual. Within-person variance is the average score subtracted from the value for the current wave; it measures how different a person is in a given wave from their own average. For binary variables, the between-person measure is equivalent to the proportion of waves in which each student had the characteristic in question. The within-person score is the difference between the binary indicator for a particular wave and the between-person proportion.
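The between-person and within-person decomposition described above can be illustrated with a minimal Python sketch. It assumes a simple list-of-dictionaries layout; the function name `decompose`, the field names `student`, `wave`, and `frl`, and the example student are hypothetical and are not drawn from the study data.

```python
from collections import defaultdict


def decompose(records, key):
    """Split a time-varying measure into between- and within-person parts.

    records: list of dicts holding 'student', 'wave', and the measure `key`.
    Returns copies of the rows with two added fields:
      <key>_bp  the student's mean across observed waves (between-person)
      <key>_wp  the wave value minus that mean (within-person)
    """
    values_by_student = defaultdict(list)
    for row in records:
        values_by_student[row["student"]].append(row[key])
    means = {s: sum(v) / len(v) for s, v in values_by_student.items()}

    out = []
    for row in records:
        bp = means[row["student"]]
        out.append({**row, f"{key}_bp": bp, f"{key}_wp": row[key] - bp})
    return out


# Hypothetical student observed in four waves who received free/reduced
# lunch (frl = 1) in the first wave only.
rows = [{"student": "A", "wave": w, "frl": v} for w, v in enumerate([1, 0, 0, 0])]
result = decompose(rows, "frl")
for r in result:
    print(r["wave"], r["frl_bp"], r["frl_wp"])
```

For a binary measure such as this one, the between-person value is the proportion of waves with the characteristic, and the within-person values sum to zero for each student.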
We use participation in the free or reduced meal program as a proxy measure of socioeconomic status (SES). Although not an ideal measure of SES, it is frequently used as a crude indicator of poverty in educational research. For this variable, between-person variance is the mean of free/reduced lunch status (coded 1 = yes; 0 = no) across four waves of data. This is equal to the proportion of waves in which each student was in the free/reduced lunch program. The within-person measure is the difference between the binary indicator for the current wave and the proportion of waves in which the student participated in free/reduced lunch. We measure receipt of special education services using binary coding, and it is decomposed into between- and within-person variation.
We use performance on tests in math and reading to control for student achievement, and these are drawn from official school records. Between 2007 and 2011 in the targeted school district, academic achievement was measured using two statewide standardized tests: Commonwealth Accountability Testing System (CATS) and Measures of Academic Progress (MAP). These tests are designed to assess students’ mastery of core curriculum content and help monitor students’ academic growth in reading and math, respectively. A dichotomous variable represents a student’s score: “proficient” or “distinguished” on this test is coded 1, and “novice” or “apprentice” (i.e., not meeting grade-level expectations) is coded 0. Students are coded as not proficient (0) if they do not meet grade-level requirements in both reading and math. In cases where CATS data are missing (all ninth graders and a small number of observations in other grades), we use MAP scores instead. Students are coded proficient if they fall above the 34th percentile on this test for their grade level because this is the percentile corresponding to proficiency on the CATS test. Although this measure is imperfect, it does reduce concerns about academic achievement confounding the relationship between race, gender, and office referrals for rule violations.
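One way to read the proficiency coding rule above is sketched below in Python. The sketch assumes that the fallback from CATS to MAP is applied separately for each subject and that a student must be proficient in both reading and math to be coded 1; both points are interpretive assumptions, and the function names are hypothetical.

```python
PROFICIENT_LEVELS = {"proficient", "distinguished"}
MAP_CUTOFF = 34  # percentile the text treats as equivalent to CATS proficiency


def subject_proficient(cats_level, map_percentile):
    """Return True/False for one subject, or None when no score exists.

    CATS is used when available; otherwise the MAP percentile is used
    (assumption: the fallback is applied per subject).
    """
    if cats_level is not None:
        return cats_level in PROFICIENT_LEVELS
    if map_percentile is not None:
        return map_percentile > MAP_CUTOFF
    return None


def proficient(reading, math):
    """Code a student-year as 1 (proficient) or 0 (not proficient).

    reading, math: (cats_level, map_percentile) tuples.
    Assumption: 1 requires proficiency in both subjects; None marks
    a missing achievement score.
    """
    flags = [subject_proficient(*reading), subject_proficient(*math)]
    if None in flags:
        return None
    return int(all(flags))


print(proficient(("proficient", None), ("apprentice", None)))
print(proficient((None, 40), (None, 35)))
```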
We measure rule violations resulting in office referral using a series of binary variables. We drew information on student office referrals from official school records. A minority of students experienced multiple office referrals in a given school year, but a substantial subset of those who received any referral received multiple referrals. However, there are insufficient cases to use a count variable, particularly when divided by rule violation and predicting some of the more severe and rare violations. One variable is simply a dichotomous indicator of receiving any office referral in a given academic year. We subsequently broke down this measure by type of rule violation according to the school district’s classification. We combined some very small categories where appropriate. Binary measures of rule violations include drug or alcohol possession or intoxication, assault, possession or use of a weapon, other major law violation (e.g., assault, bomb threat, larceny, or extortion), theft or possession of stolen property, fighting, bullying or harassment, vandalism or property damage, sexually inappropriate behavior, truancy, cheating, disobeying staff, possession or use of tobacco products, disruptive behavior, excessive tardiness, and other minor board violations (e.g., dress code violations, cell phone use, loitering, or use of vulgar or profane language). The school district further categorizes these offenses, from least to most severe, as Class I (disruptive behavior, excessive tardiness, or other minor board violation), Class II (truancy, cheating, or disobedience), Class III (theft, fighting, harassment, property damage, or inappropriate sexual behavior), or Class IV (drugs or alcohol, weapon, or other major law violation). Four binary indicators measure any office referral for each of these classes of violations.
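The construction of the binary outcome measures can be sketched in Python as follows. The violation labels are shortened paraphrases of the categories named above, and the function name `referral_indicators` is hypothetical. Assault and tobacco appear in the list of violations but not in the class listing, so the sketch leaves them unassigned rather than guessing a class.

```python
# Violation-to-class mapping as listed in the text. Assault and tobacco
# are not assigned to a class in the text and are left unmapped here.
SEVERITY_CLASS = {
    "disruptive behavior": "I",
    "excessive tardiness": "I",
    "other minor board violation": "I",
    "truancy": "II",
    "cheating": "II",
    "disobeying staff": "II",
    "theft": "III",
    "fighting": "III",
    "bullying or harassment": "III",
    "vandalism or property damage": "III",
    "sexually inappropriate behavior": "III",
    "drugs or alcohol": "IV",
    "weapon": "IV",
    "other major law violation": "IV",
}


def referral_indicators(violations):
    """Return binary indicators for one student-year.

    violations: iterable of violation labels recorded in that year.
    Repeated referrals collapse to a single 0/1 flag, mirroring the
    use of binary rather than count outcomes.
    """
    violations = list(violations)
    classes = {SEVERITY_CLASS[v] for v in violations if v in SEVERITY_CLASS}
    indicators = {"any_referral": int(bool(violations))}
    for c in ("I", "II", "III", "IV"):
        indicators[f"class_{c}"] = int(c in classes)
    return indicators


example = referral_indicators(
    ["disruptive behavior", "fighting", "disruptive behavior"]
)
print(example)
```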
Analyses
Analyses focus on identifying the association between gender, race and ethnicity, and office referrals for various types of rule violations. We model multivariate effects with multilevel mixed logistic regression models using Stata 13 (StataCorp 2013). These adjust for the hierarchical structure of the data and the interdependence among observations resulting from having multiple observations over time for each student and multiple students in schools. The models have a three-level structure: Level 1 observations over time are nested in Level 2 individual students, which are nested in Level 3 schools.
These models focus on predicting referral for rule violations, an individual-level outcome, using time-invariant and time-variant characteristics. Consequently, the models include a random intercept at Level 2. To control for unmeasured time- and student-invariant characteristics of schools, these models also include Level 3 fixed effects that are modeled using dichotomous school indicators (estimates not shown in tables). This strategy effectively estimates mechanisms of office referral for students in a particular school in comparison to other students in the same school. We control for variables such as the neighborhood in which the school is located and other potential confounding school-level effects that are time invariant, or that are likely to change very little over four years, because all comparisons are between students within the same school. This strategy also eliminates the small n problem at Level 3 (i.e., 22 schools) because school-level information is controlled in the fixed effects model rather than being used for prediction.
The basic mixed effects model with three levels predicting referral using two independent variables, for example, takes the following form:
In this model, i corresponds to time (Level 1), j to student (Level 2), and k to school (Level 3).
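As a rough sketch only, a three-level logistic model of this kind, with a student-level random intercept and dichotomous school indicators, could be written as follows. The symbols are illustrative assumptions rather than the authors' notation: X_1 stands for a time-varying predictor, X_2 for a time-invariant predictor, S_m for the indicator of school m (with one of the 22 schools omitted as the reference), and u for the student random intercept.

```latex
\[
\log\!\left(\frac{\Pr(Y_{ijk}=1)}{1-\Pr(Y_{ijk}=1)}\right)
  = \beta_0 + \beta_1 X_{1ijk} + \beta_2 X_{2jk}
  + \sum_{m=2}^{22} \gamma_m S_{mk} + u_{jk},
\qquad u_{jk} \sim N(0, \sigma_u^2)
\]
```

Under this form, exponentiating a coefficient such as \(\beta_2\) gives the odds ratio for that predictor, which is the quantity reported in the results.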
The first set of models examines the effects of gender, race and ethnicity, and control variables on the log odds of any office referral. A baseline model includes all key independent variables and controls, including sociodemographic characteristics that may confound the relationship between race or ethnicity and referrals. Next, an interaction model adds a multiplicative interaction term for gender and race/ethnicity; it tests whether gender moderates the effect of race on discipline outcomes. In all models, time-invariant characteristics (i.e., gender and race and ethnicity) are measured at Level 2, and time-variant characteristics (i.e., referral, socioeconomic status, special education status, and achievement) are measured at Level 1. All Level 1 variables are separated into between-student effects (e.g., Why are students different from each other, on average?) and within-student effects (e.g., Why are students different from themselves this year compared to other years?). A second set of models examines the severity of offenses to determine whether disproportionate referral results from the attribution of both major and minor rule violations. Finally, a third set of models investigates disproportionate office referrals for different types of rule violations. The number of observations varies slightly across models because a few schools had no reported incidents of some of the more severe violations in a given year, causing cases to be dropped due to perfect negative prediction.
We use graphs depicting odds ratios of gender-specific effects of race or ethnicity on referrals to facilitate interpretation. We present these only when the interaction is statistically significant. Statistical significance is based on the Delta method for testing group-specific effects because Chow-type tests of the equality of coefficients are inappropriate for logit models (Long 2009). Also, a number of covariates (e.g., race, socioeconomic status, and achievement) are correlated, introducing the possibility of multicollinearity. However, variance inflation factors (VIFs) do not exceed 1.46 for any model.
We conduct two additional sets of three-way interaction models to determine whether the moderating effects of race or ethnicity on gender are contingent on a third variable. Specifically, we test whether these patterns differ across middle and high schools given research suggesting that rates of discipline are higher in high schools relative to middle schools (Losen and Martinez 2013). In addition, we run three-way interactions to determine whether the moderating effects of race or ethnicity on gender differ by academic achievement. We find that referral rates are higher for all groups in high school compared to middle school and among students who score below proficient on academic achievement tests, as expected. However, predicted probabilities for race and gender subgroups by high school/middle school and proficiency are not significantly different. In other words, the race by gender interactions hold in middle and high school and among academically proficient and nonproficient students. Given the null findings, we do not present these results in the text or tables.
Results
Descriptive statistics in Table 1 suggest that 16 percent of public middle and high school students in this Kentucky district will receive an office referral in any given year. About 9 percent of students receive referrals for minor (Class I) rule violations such as disruptive behavior, dress code violations, and cell phone misuse; 6 percent of students are referred for Class II violations (e.g., truancy, disobedience); and 6 percent are referred for Class III violations (e.g., theft, fighting). Only a little over 1 percent of students are cited for major law violations (Class IV), including drug or alcohol use or weapons possession. Intraclass correlations range from .49 for Class I violations to .20 for Class IV violations, indicating there is substantial overlap in the students who are referred for rule violations across years of the study, particularly for more minor offenses (i.e., multiple referrals within and across years are not uncommon).
Table 1. Descriptive Sample Characteristics at Year One (n = 30,202; observations = 56,676).
Effects of Time and Control Variables on the Odds of Any Office Referral
Table 2 shows results from a mixed effects logistic regression of any office referral for a rule violation. The effect of time is linear and negative, meaning that this Kentucky school district administered fewer office referrals in middle and high schools in each subsequent year during the study period (odds ratio [OR] = .81; p < .001). Students on free/reduced lunch, both in comparison to other students not in this program (OR = 3.83; p < .001) and relative to themselves in years when they were not in this program (OR = 1.35; p < .001), are significantly more likely to receive an office referral. Additionally, students in the special education program, relative to those not in special education, are more likely to have received an office referral (OR = 1.68; p < .001). Finally, students who score proficient or above on standardized achievement tests are significantly less likely to receive an office referral than are students who are not proficient (OR = .22; p < .001) and compared to themselves in other years when they did not score proficient on these exams (OR = .64; p < .001).
Table 2. Mixed Effects Logistic Regression of Any Office Referral on Sociodemographic and Academic Characteristics, Kentucky Schools Study.
Note: Models control for dichotomous school indicators (fixed effects). BP = between-person variation; WP = within-person variation. Omitted category is white.
**p < .01. ***p < .001 (two-tailed tests).
Racial/Ethnic and Gender Effects on Any Office Referral
As Table 2 shows, compared to white students in the same middle or high school, rule violations are significantly more likely to be attributed to black students (OR = 2.29, p < .001) and less likely to be attributed to Latino (OR = .79, p < .01) and Asian (OR = .21, p < .001) students, net of controls. In addition, girls have significantly lower odds of receiving an office referral for a rule violation than do boys (OR = .38, p < .001). However, the effect of race is significantly moderated by gender. Specifically, black boys are about twice as likely as white boys to be referred for a rule violation in the same school (OR = 1.97, p < .001), whereas black girls have nearly three times the odds of receiving a referral compared to white girls (OR = 2.80, p < .001).
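The size of the interaction implied by these two group-specific odds ratios can be worked out with simple arithmetic, sketched below in Python. The calculation assumes a standard logit specification with gender coded 1 = female and a multiplicative race-by-gender term, and it uses the rounded odds ratios quoted above, so the result is approximate and is not a coefficient taken from the tables.

```python
import math

# Group-specific odds ratios reported in the text
# (black vs. white students of the same gender).
OR_BLACK_AMONG_BOYS = 1.97
OR_BLACK_AMONG_GIRLS = 2.80

# With gender coded 1 = female, the race effect among girls equals
# exp(b_black + b_black_x_female), so the interaction odds ratio is the
# ratio of the two group-specific odds ratios.
interaction_or = OR_BLACK_AMONG_GIRLS / OR_BLACK_AMONG_BOYS
interaction_logit = math.log(interaction_or)

print(f"implied interaction OR: {interaction_or:.2f}")
print(f"implied interaction coefficient (log odds): {interaction_logit:.2f}")
```

Read this way, the black-white gap in the odds of referral is roughly 1.4 times as large among girls as among boys.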
As Figure 1 shows, black students of both genders are disproportionately susceptible to being remanded to the office for a rule violation. The predicted probability of receiving an office referral for any rule violation in a given year is highest for black boys at .15. However, being black effectively negates girls’ lower probability of referral. Specifically, black girls and white boys both have a predicted probability of receiving any office referral of about .07. Moreover, Latino and Asian boys are less likely to be disciplined for a rule violation than are black girls (although the difference for Latino boys is nonsignificant). Finally, black girls are significantly and substantially more likely to be referred to the office than are girls of any other race, net of controls.

Figure 1. Predicted probability of an office referral for any rule violation by race/ethnicity and gender, Kentucky Schools Study.
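Odds ratios and predicted probabilities are on different scales, which is why an odds ratio of 2.80 does not mean that a referral is 2.8 times as probable. The sketch below converts a baseline probability and an odds ratio into a new probability. The baseline of .03 is a hypothetical value chosen only for illustration; the probabilities in Figure 1 come from the full model with controls and cannot be reproduced from this calculation. The function name is likewise an assumption.

```python
def probability_from_odds_ratio(baseline_probability, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)

hypothetical_baseline = 0.03  # illustrative only, not taken from the article
p = probability_from_odds_ratio(hypothetical_baseline, 2.80)
risk_ratio = p / hypothetical_baseline

print(round(p, 3))           # about 0.08
print(round(risk_ratio, 2))  # smaller than the odds ratio of 2.80
```

When the outcome is rare, as office referrals are here, the ratio of probabilities stays close to the odds ratio; the two diverge as the baseline probability grows.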
Racial/Ethnic and Gender Effects on Severity of Rule Violations
Table 3 shows results assessing the effects of race or ethnicity and gender on the severity of rule violations in middle and high schools in this Kentucky school district. Similar to findings for any office referral, girls are significantly less likely than boys to receive all types of office referrals, from the most minor violations (OR = .42, p < .001) to severe law violations (OR = .30, p < .001). Moreover, black students are disproportionately susceptible to receiving office referrals for minor and moderate Class I (OR = 2.54, p < .001), Class II (OR = 1.59, p < .001), and Class III (OR = 2.23, p < .001) rule violations compared to white students, but they are no more likely to have a Class IV (most severe) violation. Likewise, relative to white students, Latino and Asian students have significantly lower odds of receiving a referral for minor Class I (OR = .77, p < .01 and OR = .21, p < .001, respectively) or Class II (OR = .81, p < .05 and OR = .23, p < .001, respectively) violations, regardless of gender.
Table 3. Mixed Effects Logistic Regression of Class I–IV Violations on Sociodemographic and Academic Characteristics, Kentucky Schools Study.
Note: Models control for dichotomous school indicators (fixed effects). BP = between-person variation; WP = within-person variation.
Omitted category is white.
*p < .05. **p < .01. ***p < .001 (two-tailed tests).
The effects of race or ethnicity on Class I through IV violations by gender are presented in Figure 2; these results are based on interaction models. According to the odds ratios and confidence intervals depicted, the adverse effect of being black on the likelihood of receiving an office referral for minor and moderate violations is significantly larger for girls than for boys attending the same school. For example, black boys are a little over twice as likely as white boys to receive an office referral for a minor Class I violation (OR = 2.13, p < .001). In contrast, for girls, being black is associated with 3.26 times greater predicted odds of being disciplined for a minor offense (p < .001) compared to being white. We find a similar significant moderation for Class II violations, although the magnitude of the interaction is smaller. For Class III violations, all three Race × Gender interactions are significant. Specifically, the effect of being black for boys (OR = 1.83, p < .001) is significantly smaller than the effect for girls (OR = 3.09, p < .001) relative to same-gender white peers. However, whereas Latino boys are significantly less likely than white boys (OR = .63, p < .001) to be referred for a Class III violation, Latino girls and white girls do not differ in their odds of receiving a violation of this severity (OR = 1.15, ns). Finally, Asian students of both genders are less likely than their same-gender white peers to receive a Class III violation, but the protective effect of being Asian is stronger among girls than among boys (OR = .09, p < .001 and OR = .40, p < .001, respectively). Overall, racial disparities in the attribution of minor and moderate rule violations among public school students are larger for black girls than for black boys or for students of other races and ethnicities, regardless of gender.

Figure 2. Odds ratios and confidence intervals for the effects of race/ethnicity on any Class I through IV violations by gender, Kentucky Schools Study.
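For readers less familiar with how odds ratios and their intervals relate to logistic regression output, the sketch below exponentiates a log-odds coefficient and its Wald-type 95 percent bounds. This is a generic illustration: the article does not state here how its intervals were computed, the standard error of 0.10 is made up, and the function name `odds_ratio_with_ci` is hypothetical.

```python
import math

def odds_ratio_with_ci(coefficient, standard_error, z=1.96):
    """Return (odds ratio, lower bound, upper bound) from a logit coefficient.

    Uses a Wald interval on the log-odds scale, then exponentiates, so the
    interval is asymmetric around the odds ratio.
    """
    lower = math.exp(coefficient - z * standard_error)
    upper = math.exp(coefficient + z * standard_error)
    return math.exp(coefficient), lower, upper

# Coefficient chosen so the odds ratio matches the 3.26 reported for black
# girls (Class I); the standard error is an invented placeholder.
or_value, ci_low, ci_high = odds_ratio_with_ci(math.log(3.26), 0.10)
```

Because the bounds are computed on the log scale, the upper arm of each interval is longer than the lower arm when plotted on an odds ratio axis.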
Racial/Ethnic and Gender Effects on Types of Rule Violations
We find a clear pattern in the specific types of offenses that are disproportionately attributed to black girls relative to girls of other races or ethnicities in middle and high schools in this Kentucky school district. As Table 4 and Figure 3 show, the effect of being black is significantly larger for girls relative to boys in predicting citations for disruptive behavior (OR = 3.29, p < .001 and OR = 1.81, p < .001, respectively) and other minor offenses (OR = 2.03, p < .001 and OR = 1.33, p < .01, respectively), including dress code violations, inappropriate cell phone use, and loitering. Excessive tardiness is significantly more likely to be attributed to black students (OR = 2.92, p < .001) regardless of gender (i.e., no interaction). We find no significant gender moderation of race or ethnicity for Latino or Asian students for any minor Class I offenses, and with the exception of excessive tardiness, Latino and Asian students are less likely than white students to receive these kinds of citations.
Table 4. Mixed Effects Logistic Regression of Class I Violations on Sociodemographic and Academic Characteristics, Kentucky Schools Study.
Note: Models control for dichotomous school indicators (fixed effects). Interactions are presented where significant. Predictors and interaction terms are dropped where they perfectly predict 0. BP = between-person variation; WP = within-person variation.
Omitted category is white.
*p < .05. **p < .01. ***p < .001 (two-tailed tests).

Figure 3. Odds ratios and confidence intervals for the effects of race/ethnicity on Class I and II violations by gender, Kentucky Schools Study.
With respect to other relatively minor Class II violations (see Table 5 and Figure 3), we find a significant Race × Gender interaction only for disobedience: Black boys are 1.56 times as likely as white boys to be cited (p < .001), on average, whereas black girls are 2.53 times as likely as white girls to be cited (p < .001). Black students of both genders are no more likely than white students to receive an office referral for truancy, and black students are significantly less likely to be cited for using tobacco (OR = .33, p < .001). However, black boys and girls have disproportionate odds of receiving a referral for cheating or dishonesty (OR = 3.22, p < .001). We find no significant gender moderation of race or ethnicity for Latino or Asian students for any minor Class I or II offenses, and these students are less likely or no more likely than white students to receive these types of citations.
Table 5. Mixed Effects Logistic Regression of Class II Violations on Sociodemographic and Academic Characteristics, Kentucky Schools Study.
Note: Models control for dichotomous school indicators (fixed effects). Interactions are presented where significant. Predictors and interaction terms are dropped where they perfectly predict 0. BP = between-person variation; WP = within-person variation.
Omitted category is white.
**p < .01. ***p < .001 (two-tailed tests).
For moderately severe Class III violations (see Table 6 and Figure 4), results on gender moderation of racial and ethnic effects are mixed. With respect to referrals for aggressive behavior (fighting and bullying/harassment), the effect of being black is significantly larger for girls (OR = 3.06, p < .001 and OR = 3.01, p < .001, respectively) than for boys (OR = 1.67, p < .001 and OR = 1.61, p < .001, respectively). Moreover, black students of either gender are disproportionately likely to receive a referral for theft (OR = 1.66, p < .001) or inappropriate sexual behavior (OR = 2.85, p < .001), but they are no more likely to be cited for property damage or vandalism. Interestingly, Latino girls have significantly higher predicted odds of fighting violations relative to white girls (OR = 1.71, p < .001), but Latino boys are no more likely than white boys to receive a referral for fighting. However, for other types of Class III offenses, compared to white students, Latino and Asian students of either gender are less likely or equally likely to be cited.
Table 6. Mixed Effects Logistic Regression of Class III Violations on Sociodemographic and Academic Characteristics, Kentucky Schools Study.
Note: Models control for dichotomous school indicators (fixed effects). Interactions are presented where significant. Predictors and interaction terms are dropped where they perfectly predict 0. BP = between-person variation; WP = within-person variation.
Omitted category is white.
*p < .05. **p < .01. ***p < .001 (two-tailed tests).

Figure 4. Odds ratios and confidence intervals for the effects of race/ethnicity on Class III violations by gender, Kentucky Schools Study.
Table 7 presents results from models predicting office referrals for major law violations. We find no significant moderation of racial or ethnic effects by gender for these severe offenses. In fact, we find few significant racial or ethnic disparities, with the exception of black students of either gender being more susceptible to violations for “other” major offenses (OR = 1.82, p < .001)—largely assault—and Latino students being less vulnerable (OR = .30, p < .01). Asian students also have significantly lower odds of being cited for drug or alcohol offenses relative to white students (OR = .26, p < .01), all else being equal.
Table 7. Mixed Effects Logistic Regression of Class IV Violations on Sociodemographic and Academic Characteristics, Kentucky Schools Study.
Note: Models control for dichotomous school indicators (fixed effects). Interactions are not statistically significant. Predictors and interaction terms are dropped where they perfectly predict 0. BP = between-person variation; WP = within-person variation.
Omitted category is white.
*p < .05. **p < .01. ***p < .001 (two-tailed tests).
Taken together, several noteworthy findings emerge from these analyses of racial/ethnic and gender differences in referrals in public middle and high schools in Kentucky. First, in line with previous research, black students and boys are disproportionately cited for most kinds of rule violations and especially less severe violations. However, more complex patterns emerge where race and gender intersect, revealing that black girls are just as likely as white boys to receive a referral. Also, the influence of being black for girls compared to boys is most pronounced for minor offenses, and it does not extend to major law violations. Furthermore, when examining specific types of violations, we find that gender moderation of the effects of being black on office referrals is driven by offenses that are more subjective and are inconsistent with traditional norms of femininity. Specifically, black girls are disproportionately likely to be cited for disruptive behavior, disobedience, aggression, and other minor offenses, including dress code violations, in middle and high school.
Discussion and Conclusions
Drawing on district data from middle and high schools in Kentucky, we find evidence of interactions between race and gender in school punishment that are detrimental to African American girls. These interactions are captured through multilevel models that make comparisons across and within both race and gender, controlling for social class, indicators of academic achievement, and all time-invariant school-level conditions that might confound racial and gender patterns in school discipline. Black boys are twice as likely as white boys to receive a disciplinary referral in this population, but black girls are three times as likely as white girls to receive a referral. Overall, boys are more likely to receive an office referral, but when race is taken into account, black girls have the same probability of receiving an office referral as do white boys and a higher probability than Asian and Latino boys. We observe similar interactions between race and gender in the severity of offense. Black boys are generally twice as likely as white boys to be referred for a minor or moderate offense (Class I, II, or III violations), but black girls, again, are over three times as likely as white girls to receive such referrals.
Our analyses demonstrate that the relationship between race and the type of offense is magnified in this population of students when considering the intersection of race and gender. We observe few significant effects of race, or the intersection of gender and race, on the most severe, but also most clear-cut, Class IV violations (drug or alcohol possession, weapon possession, major law violation). Instead, consistent with other research (Blake et al. 2011), we find that black girls are disciplined primarily for less serious but more ambiguous offenses, such as disruptive behavior, dress code violations, disobedience, and aggressive behavior. Comparing the effects of race across gender groups reveals that the gap between black girls and white girls is significantly larger for these subjective offenses than is the gap between black boys and white boys.
Our findings on disciplinary disparities from this Kentucky school district have two important implications. First, the offenses incurred disproportionately by black girls, especially disobedience and disruptive behavior, are largely based on school officials’ interpretations of behavior. Similar to previous work (Downey and Pribesh 2004; Forsyth et al. 2015), our findings suggest that educators evaluate the behavior of African American students critically. However, our results reveal an important intersection with gender, suggesting that African American girls’ behavior is perceived as misbehavior far more often than is the behavior of other girls. We assert that the ambiguous and comparatively inconsequential nature of behaviors like disobedience and disruptiveness may create a space for unintentional, implicit racial and gender bias. That is, teachers and staff have discretion to either take official disciplinary action or resolve issues in the classroom, in some cases even letting misbehavior slide. By contrast, for less ambiguous and more serious behaviors (e.g., truancy, theft, substance use, possession of a weapon), we find either no effect of race in this population or a higher likelihood of citation among white students. For law violations or offenses that hold potential for threat of harm to self or others, an official disciplinary response is compulsory. Moreover, the types of reprimands black girls receive may seem relatively minor, but such admonishments can accumulate over time, leading to ambivalence toward school or perceptions of personal deficiency (Harrison forthcoming).
Second, our results suggest that the perceived misbehavior of African American girls is often behavior that breaches gender assumptions of standard femininity. To be sure, we can only speculate on the gendered dynamics of such violations with our data from one Kentucky school district, but previous research supports our explanation. Intersectional scholarship reveals that blackness compromises and modifies perceptions of appropriate femininity, which is coded as white (Collins 2005). This means black girls are more likely to be scrutinized and held accountable for gender non-normative behavior (Morris 2007). Because our analysis compares gender gaps across race, we argue that the infractions black girls are punished for reveal gender normative accountability (West and Zimmerman 1987). For example, in their review of research on masculinity, Schrock and Schwalbe (2009) cite physical aggression and resistance to control as actions widely symbolic of masculinity. The fact that black girls in our analysis receive more punishment than other girls for disobedience (resisting control) and aggressive behavior (physical aggression) indicates behavior inconsistent with normative femininity. Race thus appears to heighten perceptions of the nonpassive and therefore gender-inappropriate behavior of African American girls in middle and high school.
Because our data are based on office records and not in-depth observations, we are limited in our ability to determine the exact interactional processes that may be producing discipline disparities. Most importantly, we cannot determine whether African American girls’ behavior actually purposefully defies passive norms of femininity or if it is merely perceived as defying these norms. We suspect this dynamic is largely dialectical, involving elements of both. Future qualitative research in schools could explicate the routine processes through which staff and teachers monitor black girls’ behavior and how the girls themselves respond.
We conclude by highlighting the utility of intersectionality for our results, and we underscore the importance of this approach to studies of educational inequality more generally. Recently, critics have warned that intersectionality often appears as a hollow theoretical and methodological “buzzword” instead of an explanatory concept (Choo and Ferree 2010; Davis 2008). Analyses using an intersectional lens should not just mention integrated categories but must demonstrate how those categories combine in complex ways that fundamentally moderate their effects. Our analysis reveals how an integrated race-gender lens is indispensable to understanding how schools reproduce inequality through disciplinary processes. Black girls may not suffer from punitive measures, such as suspension, to the same degree as black boys, but they do suffer from the subtle, regular shaping of their behavior. Girls in general are less likely to be punished in school, perhaps because dominant models of femininity emphasize quiescence (Mickelson 1989). However, black girls in our data experienced much higher levels of punishment than other girls, signaling how race interacts with femininity. These findings indicate that the relationship between gender and discipline cannot be fully understood apart from race, and the relationship between race and discipline cannot be fully understood apart from gender.
Indeed, based on our results, the picture of punitive school processes is confusing and distorted unless one deploys an intersectional framework. For example, if our analysis focused solely on gender, we might reach the conclusion that gender presents an unequivocal disadvantage for boys, who are punished more frequently than girls. To the contrary, our analysis, which considers gender in combination with race, demonstrates that generalized and unidimensional characterizations of school disciplinary patterns are naïve and misleading. Comprehension of gender patterns in this population requires an analysis of race. Moreover, an intersectional understanding of these patterns is critical from a policy and programmatic standpoint. Educational reforms or interventions that focus exclusively on race or gender may misappropriate resources or insufficiently support African American girls or other race-gender subgroups that are disadvantaged in public school settings. Some policy advocates, for instance, use evidence of greater punishment to argue that schools treat boys unfairly, and therefore more educational resources should be directed to helping boys succeed. In practice, this gender-only logic would provide resources to white boys but not black girls.
In closing, we advocate for more research that strives to unpack the interwoven patterns of race and gender (along with other axes of inequality) in education. Such work can provide a complex but more comprehensive picture of educational inequality. We hope that our research not only spotlights the challenges faced by African American girls but also demonstrates the broader necessity and utility of intersectionality for educational research and policy.
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by a grant from the Spencer Foundation. The authors wish to thank Rebecca DiLoretto and the Children’s Law Center for their contributions to this project and for their commitment to equity and justice for all children in public education.
Research Ethics
This research was approved by a University Institutional Review Board and conducted in a way that is consistent with the American Sociological Association Code of Ethics. Data from school records were de-identified before we received them, and we have taken steps to protect the confidentiality of schools and the school district from which the data were obtained.
