Abstract
Research on teacher stability typically focuses on the extent to which teachers remain in the same school, district, or the teaching profession from one year to the next. I investigate another facet of stability—whether teachers remain in the grade they teach. Drawing on administrative data from a large district in California, I find that high shares of teachers switch grades. Disproportionately, these are early career teachers who come from low-achieving or high-minority schools. Teachers who switch grades leave schools at higher rates than their colleagues and exhibit lower impacts on their students’ achievement. For teachers who switch to a nonadjacent grade, these negative effects can wipe out any gains due to increased experience and can persist in the year after the switch occurs.
Introduction
Each year, school principals, leadership, and teaching teams make decisions about how best to allocate their scarce resources in the service of improving student outcomes. They may decide, for example, on how small or large to make classes, whom to hire, and what type of professional development to provide their staff. Despite fierce debates about which, if any, of these resources matter most (e.g., Greenwald, Hedges, & Laine, 1996; Hanushek, 1997), a broad consensus has emerged on the relative importance of investing in high-quality teachers (Kyriakides & Creemers, 2008; Nye, Konstantopoulos, & Hedges, 2004; Rivkin, Hanushek, & Kain, 2005). A key question, then, is how schools can allocate resources to get these high-quality teachers into classrooms. Research indicates that paying for education master’s degrees may not be a wise investment (Wayne & Youngs, 2003), but focusing instead on developing teacher productivity through experience, coaching, and relationships with high-quality colleagues can improve teacher quality and student achievement (Allen, Pianta, Gregory, Mikami, & Lun, 2011; Jackson & Bruegmann, 2009; Rockoff, 2004).
Another potential avenue not yet fully explored is teacher grade assignments. To the extent that researchers have studied teacher assignments, they have focused primarily on whether or not teachers hold qualifications in the subject they teach (e.g., Ingersoll, 2002). However, recent analyses indicate that teachers switch grades at high rates; in multiple districts across the United States, over 20% of teachers switch grades from one year to the next (Jacob & Rockoff, 2011). In addition, these trends may have negative consequences for teachers and students. Studies indicate that accountability policies likely incentivize school leaders to keep their most effective teachers in tested grades (Boyd, Lankford, Loeb, & Wyckoff, 2008; Chingos & West, 2011; Fuller & Ladd, 2012) and that switching grades is associated with declines in teachers’ effectiveness at raising student achievement (Jacob & Rockoff, 2011; Ost, 2014) and increases in teacher turnover (Ost & Schiman, 2015).
Attention to and modification of grade assignment policies and practices has the potential to be quite valuable to schools, particularly when compared to other programs seeking to harness teacher productivity to improve educational outcomes for students. Although individualized coaching can cost around $4,000 per teacher (Allen et al., 2011) and increasing the quality of teachers’ peers requires large-scale recruitment efforts, the decision of whether or not to have a teacher switch grades may have no direct and few indirect costs.
To explore the extent to which grade assignments can be a lever for change in the teacher pipeline and for student achievement, I draw on a 10-year panel of administrative data from a large urban school district in California. I focus on elementary school teachers who are most likely to teach one self-contained grade in a given year. In order to understand the context within which grade switching occurs, I first explore trends in teacher grade assignments. I focus on the extent to which grade reassignments occur more frequently for inexperienced teachers and those with low value-added scores in years prior to the switch, or for those teachers who work in high-risk schools (i.e., low-achieving, low-income, and/or high-minority schools). Second, I explore the relationship between grade switching and teachers’ long-term career trajectories. Specifically, I examine whether there is a relationship between switching grades and teachers’ growth in effectiveness at raising student achievement or their retention in a school or in the district.
In order to help inform decision-making processes, I focus on two additional features of grade reassignments: switching to an adjacent versus a nonadjacent grade and the lasting effects of switching grades. It is possible that teachers who switch to a grade far from their original assignment experience more disruption than those who switch to an adjacent grade. If this is the case, then school leaders may aim to avoid nonadjacent switches more than adjacent ones. Relatedly, if disruptions due to grade reassignment last for multiple years, this would be much more problematic than a scenario where disruptions fade out quickly.
Background
Recent research highlights a variety of explanations for why teachers switch grades from one year to the next. Administrators may reassign teachers to different grades based on need, due either to changes in cohort size or teacher turnover (Jacob & Rockoff, 2011). Relatedly, administrators may aim to match teachers to a specific group of students. An array of studies also suggests that grade assignments are related to accountability policies that emphasize tested grades over untested ones. Low-quality teachers are more likely than their colleagues to be moved to an untested grade, and these trends are more pronounced following the enactment of accountability policy (Boyd et al., 2008; Chingos & West, 2011; Fuller & Ladd, 2012). Furthermore, the decision to switch grades may be a voluntary one made by teachers who choose to work with a specific age group or want a change of pace (Jacob & Rockoff, 2011).
The multitude of factors for grade reassignment suggests no single outcome for teachers who experience this event. Many of the scenarios described above have theoretical benefits for teachers and their students. At the same time, qualitative research, theory, and intuition highlight a number of potential consequences. Interviews with teachers indicate that understanding of curricula and content is a primary factor in their job satisfaction, perceptions of their teaching skill, and decisions to stay in their school and in the profession (Johnson & Birkeland, 2002; Kauffman, Johnson, Kardos, Liu, & Peske, 2002). Therefore, adapting to new curricula, content, and possibly a new set of grade-level colleagues as a result of switching grades may create disruptions in their satisfaction and professional growth. Relatedly, research on teaching across subject areas highlights a need for teachers to negotiate differences in how content is delivered, including the level of cognitive demand (Graeber, Newton, & Chambliss, 2012). Differences in content standards and expectations across grade levels—even within the same subject—likely bring similar challenges. Finally, teachers who switch to a grade far from their original one also must adapt to differences in developmental needs of students at different ages (Fischer, 1980).
A handful of quantitative analyses indicate that these and other potential challenges associated with switching grades are negatively related to teacher effectiveness and retention. Drawing on data from North Carolina, Ost (2014) examines differences in elementary teachers’ returns to experience (that is, how much they contribute to gains in student achievement over time) for those who remain in the same grade versus those who switch. His value-added model with teacher fixed effects attempts to account for sorting of students into classrooms. Results indicate that teachers who repeatedly teach the same grade from one year to the next have returns to experience roughly one-third to one-half larger (i.e., 0.01 to 0.04 standard deviations in student achievement growth) than general returns to experience. Jacob and Rockoff (2011) discuss similar findings from unpublished analyses of New York City administrative data. However, in this policy-oriented discussion paper, the authors do not describe their sample or estimation strategy. Using similar data as above, Ost and Schiman (2015) find that teachers who switch grades also leave their school at the end of that year at rates over 3 percentage points higher than teachers who do not switch grades.
These negative trends may be most problematic when viewed from an educational equity perspective. Brummet, Gershenson, and Hayes (2013) find that grade switchers in Michigan tend to come from a unique subset of the workforce. Teachers who switch grades are more likely to work in schools that are low performing and have high percentages of minority students. Therefore, students in these schools who are assigned to a teacher who recently switched grades may be even worse off than students in other schools.
Despite a growing literature highlighting suboptimal outcomes associated with switching grades, questions remain about trends in grade reassignment and its effect on teachers and students. There may be additional predictors of grade reassignment, such as income level, which also are related to issues of equity. In addition, studies typically have focused on rates of switching from a tested to an untested grade because of its relevance to policy; however, they have not explored switching to adjacent versus nonadjacent grades, which may be more relevant to teachers’ experiences and their ability to grow as educators. Teachers who switch to a grade far away from their original one may experience a much steeper learning curve than those who switch to a grade close to their original assignment. Relatedly, the extent to which teachers recover losses to productivity in years after the switch occurs is not clear, nor is it clear whether higher rates of turnover fade out over time.
Therefore, I build on existing work by asking three sets of related research questions: (1) Do inexperienced teachers, those with low value-added scores, or those who work in high-risk schools (i.e., low-achieving, low-income, and/or high-minority schools) switch grades at higher rates than their colleagues in a way that may exacerbate inequality? (2) Is grade reassignment related to teachers’ long-term career trajectories, namely their productivity or retention in their school or in the district? (3) Do these trends differ for those who switch to a grade adjacent to their original assignment versus those who switch to a grade farther away? An additional contribution of this work is to combine these questions into a single set of analyses. As such, I am able to draw on descriptive patterns in grade reassignments to inform results exploring teachers’ longer term trajectories.
Methods
Data and Sample
In order to answer my research questions, I draw on administrative records from a large urban school district in California. This 10-year panel of data beginning in the 2002-03 school year includes human resource information for all teachers, demographic and test-score data (where relevant) for all students, and course files that allow me to connect teachers to students and identify the grades they teach.
Although I have access to information on all teachers and students in the district, I limit this sample in ways that are most conducive to answering my research questions. Across all analyses, I focus on elementary teachers who are most likely to teach one self-contained grade in a given year. Based on grade identification that I describe below, over 90% of all teacher-year observations at the elementary level are attached to just one grade, compared to roughly 60% and 20% of observations at the middle- and high-school levels, respectively. I also limit the sample to teachers in core academic subject areas (i.e., English language arts [ELA], math, science, social studies), excluding those who teach self-contained special education, physical education, art, music, or supplemental courses. I describe below additional restrictions for individual analyses, such as teachers who are observed as novices at some point in the dataset and those with test-score data. These samples include between roughly 2,000 and 22,000 teachers and between roughly 7,000 and 129,000 teacher-year observations.
For each of these teachers, I identify the grade(s) taught in each school year based on course rosters and student grade information provided by the district. Where applicable, I verify students’ grade level against information from the standardized tests they took at the end of each school year. For 89% of core subject courses, all students within a given course have the same grade designation. Therefore, I am confident that this grade level is attached correctly to the assigned teacher. An additional 8% of courses include students at different grade levels. Although some of these may be mixed-level courses (e.g., a combined fourth- and fifth-grade class), two-thirds are heavily skewed toward one grade. That is, the average grade level across individual students in the class is within 0.33 of the integer value. Therefore, in these instances, I use the modal value of student grade levels in the course as the class grade. I also consider alternative ways of addressing this issue—using smaller bandwidths around the integer value and excluding these cases altogether—and find that rates of switching do not change substantively. Finally, 3% of courses are missing grade data for all students. However, in all of these cases, I am able to determine teachers’ grade level in that year through a course in another subject.
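As an illustrative sketch only (not the processing code used for the analysis), the grade-identification rule described above could be expressed as follows. The function name, the return labels, and the tie-breaking rule for the mode are assumptions introduced for this example; the 0.33 bandwidth comes from the text.

```python
from collections import Counter


def assign_class_grade(student_grades, bandwidth=0.33):
    """Return (grade, rule) for one course roster.

    student_grades: list of integer grade levels, None where missing.
    bandwidth: maximum distance between the mean student grade and the
               nearest integer for a mixed roster to count as skewed.
    """
    observed = [g for g in student_grades if g is not None]
    if not observed:
        # No grade data: the grade must be recovered from another course.
        return None, "missing"
    counts = Counter(observed)
    if len(counts) == 1:
        # All students share one grade designation.
        return observed[0], "single_grade"
    mean_grade = sum(observed) / len(observed)
    if abs(mean_grade - round(mean_grade)) <= bandwidth:
        # Heavily skewed roster: use the modal student grade.
        # Assumption: ties are broken toward the grade closest to the mean.
        modal = max(counts, key=lambda g: (counts[g], -abs(g - mean_grade)))
        return modal, "modal"
    # Evenly mixed roster (e.g., a combined fourth/fifth-grade class).
    return None, "mixed"
```

Passing a smaller `bandwidth` mirrors the robustness check with narrower bands around the integer value, and dropping every roster labeled `"modal"` mirrors the check that excludes these cases altogether.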
I define grade switches in instances where teachers move to a self-contained grade in the current year from a different self-contained grade in the prior school year. As the vast majority of elementary teachers teach just one grade in both the current and prior years, these switches are straightforward to identify. For those few teachers who teach multiple grade levels either in the prior or current year, I identify switches in instances where a teacher adds a grade level to his or her teaching load. For example, a teacher who teaches fourth grade one year and both fourth and fifth the next is identified as switching grades. A teacher who teaches fourth and fifth grade in one year but then only teaches fourth grade the next year is not identified as switching grades, as no new grades are added to their course load.
I exclude from this definition “loopers” (i.e., teachers who move with students from one grade to the next), teachers who switch both grades and schools, and teachers who return to a grade that they taught in any prior year because of the unique nature of these types of switches; in particular, these switches may not lead to the same challenges associated with grade reassignment described above. This third exclusion also ensures that teachers who leave teaching for a short period of time (e.g., maternity leave) but then return to the classroom are not identified as switching grades. For those teachers whom I identify as switching grades in a given year, I further classify these as switches to an adjacent versus a nonadjacent grade.
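The switch definition above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function and argument names are hypothetical, grades are integers with kindergarten coded as 0, adjacency is measured from the nearest prior-year grade, and the exclusions for loopers and for teachers who change schools are not implemented because they require student rosters and school identifiers.

```python
def classify_switch(prior_grades, current_grades, ever_taught):
    """Classify one teacher-year as 'none', 'adjacent', or 'nonadjacent'.

    prior_grades:   set of grades taught in year t - 1
    current_grades: set of grades taught in year t
    ever_taught:    set of all grades taught in any year before t
    """
    # A switch requires a grade that is new relative to every prior year,
    # so returning to a previously taught grade is not counted.
    new_grades = current_grades - ever_taught
    # Assumption: no switch is recorded if the teacher was not observed
    # in the prior year.
    if not prior_grades or not new_grades:
        return "none"
    # Assumption: distance is measured from the nearest prior-year grade.
    distance = min(abs(n - p) for n in new_grades for p in prior_grades)
    return "adjacent" if distance == 1 else "nonadjacent"
```

For example, `classify_switch({4}, {4, 5}, {4})` returns `"adjacent"` because fifth grade is added to the teaching load, while `classify_switch({4, 5}, {4}, {4, 5})` returns `"none"` because no new grade is added.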
Analysis
Estimating Trends in Grade Reassignment
My first research question aims to understand trends in grade reassignments and the extent to which experiencing this event may be inequitably distributed amongst certain types of schools and teachers. First, I present rates of switching by school year, grade level that teachers switch from, and experience level. I also disaggregate rates by those who switch to an adjacent versus nonadjacent grade.
Next, I explore whether teachers who work in high-risk schools (i.e., low-achieving, low-income, and/or high-minority schools) switch grades at higher rates than teachers who work in other schools. I do so in a regression framework in order to control for differential rates of switching by teaching experience, given evidence that inexperienced teachers are more likely to work in high-risk schools (Clotfelter, Ladd, & Vigdor, 2005; Lankford, Loeb, & Wyckoff, 2002). Specifically, I estimate the following equation using Ordinary Least Squares (OLS):
where $SWITCH_{jt}$ is an indicator for whether or not teacher j switches grades in year t (from year t – 1), which I also disaggregate by adjacent and nonadjacent switches in separate models. I estimate expected values of rates of grade reassignment by quartiles of school characteristics.
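As a sketch, one specification consistent with this description is written out below. The quartile indicators $Q^{q}_{st}$, the coefficient names, and the choice of the bottom quartile as the omitted category (suggested by the note to Table 1) are assumptions for illustration, not necessarily the exact form of equation (1).

```latex
SWITCH_{jt} = \sum_{q=2}^{4} \beta_q \, Q^{q}_{st}
  + f(\mathit{EXPERIENCE}_{jt}) + \varepsilon_{jt}
```

Here $Q^{q}_{st}$ equals 1 if school s falls in quartile q of a given characteristic in year t, $f(\mathit{EXPERIENCE}_{jt})$ is a set of experience indicators, and each $\beta_q$ is the difference in switching rates between quartile q and the bottom quartile, expressed as a proportion.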
Finally, I examine whether teachers with low value-added scores switch grades at higher rates than their higher quality peers. To do so, I estimate the relationship between switching grades and teachers’ prior value-added score using a similar model as above:
Here, I replace school characteristics with indicators for teachers’ prior value-added quartile [see equation (3) below for the sort of value-added model from which scores are derived]. To increase the precision of my estimates, I utilize all years of data prior to t (Goldhaber & Hansen, 2012; Koedel & Betts, 2011; Schochet & Chiang, 2013). I focus on value-added scores in prior years in order to ensure appropriate directionality of this relationship. That is, I explore whether teachers who are observed as ineffective at raising student achievement across all prior years switch grades in year t at higher rates than their more effective colleagues. This analysis does not aim to estimate the effect of switching grades on teachers’ effectiveness. Because this sample consists of elementary teachers, many of whom teach both math and reading, I average value-added scores across subjects to create a composite measure of teachers’ overall effectiveness at raising student achievement. I do not run analyses separately by subject area, given that teachers who switch grades almost always do so in both subjects simultaneously. I control for a vector of school year indicators, $\eta_t$, given that value-added scores are normed within a given year. I also estimate models that include school fixed effects in order to account for various factors at the school level that contribute to grade reassignment.
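A sketch of a specification matching this description follows. The indicator name $VAQ^{q}_{j,<t}$ (teacher j's quartile of composite value-added, computed from all years before t) and the coefficient names are assumptions for illustration; the school fixed effects $\varphi_s$ appear only in the within-school models.

```latex
SWITCH_{jt} = \sum_{q=2}^{4} \beta_q \, VAQ^{q}_{j,<t}
  + \eta_t \; [\, + \varphi_s \,] + \varepsilon_{jt}
```

With the bottom quartile omitted, each $\beta_q$ compares switching rates for teachers in quartile q with those of the least effective teachers in the same school year.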
For all of these analyses, I only include teachers in their second year of teaching or higher, as these are teachers for whom it is possible to switch grades. As teachers in their first year in the classroom could not have switched grades from the prior year, including them in these analyses would artificially attenuate rates of switching. I also exclude school year 2002-03, as I am not able to observe a switch in this first year of available data.
Estimating the Relationship Between Switching Grades and Student Achievement
My second set of research questions examines the relationship between switching grades and teachers’ long-term careers. First, I estimate the relationship between switching grades and teachers’ effectiveness at raising student achievement over time. To do so, I specify a value-added model similar to those used by Kane et al. (2013) and Chetty, Friedman, and Rockoff (2014):
The outcome of interest is current-year test score, $A_{isgcjt}$, for student i in school s, grade g, and class c with teacher j at time t. Test scores are modeled as a function of students’ prior achievement, $A_{i,t-1}$. I control for vectors of student covariates, $X_{it}$, and peer covariates, $P_{ct}$, for all students within classroom c at time t. I include school fixed effects, $\varphi_s$, in order to account for some of the school-level factors that are related to grade reassignments and also affect student achievement. Results available upon request are quite similar when I control instead for observable school characteristics. I also include grade-by-year fixed effects, $\omega_{gt}$, to account for scaling of tests at this level. I cluster standard errors at the class level in order to account for the nested structure of the data.
Following Ost (2014) and Rockoff (2004), I include teacher fixed effects, $\tau_j$, to control for the fact that different teachers have different underlying effectiveness, regardless of experience. As Ost (2014) argues, teacher fixed effects account for the possibility that “unobserved teacher characteristics are correlated with grade-specific experience” (p. 128).
I add parameters for years of experience, $f(EXPERIENCE_{jt})$, to examine the extent to which teachers’ ability to raise student achievement improves over time. I include dummy variables for Year 2 of teaching and up, with Year 1 as the reference group. This allows me to identify average gains in student achievement attributable to each additional year of experience relative to average gains for novice teachers. Others who have specified similar models note nearly perfect collinearity between experience and year when teacher fixed effects are included (Harris & Sass, 2011; Ost, 2014; Rockoff, 2004). To address this, I replace experience dummies with experience levels in some instances. Although the authors cited above use a cutoff of 10 years of experience, I apply the cutoff at 7 years of experience given that I only am able to follow novice teachers into their 9th year. Teachers who have 7 or more years of experience are identified as having exactly 7 years of experience.
I parameterize decrements to returns to experience for those teachers who switch grades in a few key ways that align with my research questions. The simplest way to estimate this relationship would be to include a single dummy variable indicating whether or not a teacher switched grades from the prior school year. This is the same as the dummy variable $SWITCH_{jt}$, described above. For example, if a teacher switched grades from Year 1 to Year 2, this variable would take on a value of 1 in that teacher’s second year. One should interpret the coefficient on this binary variable as any losses (or gains) to student achievement above and beyond factors already included in the model, including teaching experience.
To examine whether these potential decrements vary by the year that a teacher switches grades, I create a dummy variable for each potential year of experience that a teacher could switch grades. I do so by interacting $SWITCH_{jt}$ with experience dummy variables. The result is a set of additional dummy variables, $SWITCH\_YEAR2_{jt}$ through $SWITCH\_YEAR7_{jt}$. This approach also allows me to capture instances where teachers switch grades more than once. As with teaching experience, teachers who switch grades in their seventh year of teaching or higher are collapsed into one category.
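A minimal sketch of how these switch-by-experience indicators could be built for a single teacher-year is shown below. The function name and dictionary layout are hypothetical; the top-coding at 7 years follows the text.

```python
MAX_EXPERIENCE = 7  # experience is top-coded at 7 years, as described above


def switch_year_indicators(experience, switched):
    """Build SWITCH_YEAR2 ... SWITCH_YEAR7 dummies for one teacher-year.

    experience: year of teaching (1 = novice)
    switched:   True if the teacher switched grades from the prior year
    """
    # Collapse the seventh year of teaching and higher into one category.
    capped = min(experience, MAX_EXPERIENCE)
    return {
        f"SWITCH_YEAR{k}": int(switched and capped == k)
        for k in range(2, MAX_EXPERIENCE + 1)
    }
```

At most one indicator equals 1 in any teacher-year, and a teacher who switches in both the second and fourth year contributes a 1 to `SWITCH_YEAR2` and to `SWITCH_YEAR4` in the respective years, which is how repeated switches are captured.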
To examine whether these potential decrements are lasting or fade out over time, I allow the indicators I just described to take on a value of 1 in the year of the switch and all subsequent years. Then, I interact them with the experience dummy variables. Because these variables still take on a value of 0 in years prior to the switch, I include the subscript t to indicate time variance. These interactions are included in equation (3) and are my main predictors of interest. For the sake of parsimony, I only follow teachers for 2 years after the switch could occur. That is, if a teacher switched grades in their second year, I estimate the decrements to returns to experience in that year, in their third year, and in their fourth year. These coefficients should be interpreted in the same way that I describe above, with each indicating losses (or gains) to student achievement above and beyond factors already included in the model. I only include these interactions, and not the main effect of switching grades, as they are exhaustive of all the ways that teachers might switch grades. Finally, in separate models I disaggregate these effects for teachers who switch to an adjacent grade versus a nonadjacent grade, which are mutually exclusive for each year of experience.
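Pulling these pieces together, a value-added specification consistent with the description of equation (3) might look like the sketch below. The coefficient names, the linear entry of prior achievement, and the indexing of the interaction terms by m (years since the switch) are assumptions for illustration, and the top-coding of experience at 7 years is ignored for readability.

```latex
A_{isgcjt} = \alpha A_{i,t-1} + \delta X_{it} + \gamma P_{ct}
  + f(\mathit{EXPERIENCE}_{jt})
  + \sum_{k=2}^{7} \sum_{m=0}^{2} \beta_{km}
      \left( \mathit{SWITCH\_YEAR}k_{jt} \times
             \mathbf{1}[\mathit{EXPERIENCE}_{jt} = k + m] \right)
  + \varphi_s + \omega_{gt} + \tau_j + \varepsilon_{isgcjt}
```

Under this notation, $\beta_{k0}$ captures the decrement in the year of a switch made in year k of teaching, while $\beta_{k1}$ and $\beta_{k2}$ capture any decrement that persists one and two years later.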
In this set of analyses, I restrict the sample to those teachers who are observed as novices at some point in the dataset, which ensures that I always am able to capture a grade switch if it occurs. If teachers enter my dataset in their third year of teaching, I cannot tell if they switched grades from their first to their second year, or from their second to their third. I also am limited to teachers who are observed teaching a tested subject and grade, and whose students have prior-year test scores. This includes teachers who teach math and/or ELA in Grades 3 through 5 in school years 2003-04 through 2011-12.
Estimating the Relationship Between Switching Grades and Retention
The hypothesized model to describe the relationship between switching grades and teacher retention is given by equation (4):
Here, the outcome of interest, $LEAVE_{jst}$, is one of two variables, indicating whether teacher j left his or her school, s, or left the district at the end of year t. In most instances, I identify these teachers who leave their school or the district through human resource records. As this information is not available in the first 2 years of available data, I define teachers who leave schools as those who are observed in a different school in the following year, t + 1. In these same 2 years, I define teachers who leave the district as those who are not observed in the dataset in any subsequent year, T > t. Setting up the variable this way ensures that teachers who take a leave for maternity or other reasons but then return to the district are not identified as leavers.
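The imputation rule for years without human resource records can be sketched as follows. The record layout and function name are hypothetical, and the sketch follows the wording above literally: a school leaver is someone observed in a different school in year t + 1, and a district leaver is someone never observed again.

```python
def flag_leavers(records):
    """records: list of (teacher_id, year, school_id), one per teacher-year.

    Returns {(teacher_id, year): (left_school, left_district)}. Both flags
    are None in the last panel year, where no later year can be observed.
    """
    last_year = max(year for _, year, _ in records)
    school_in = {(tid, year): school for tid, year, school in records}
    years_seen = {}
    for tid, year, _ in records:
        years_seen.setdefault(tid, set()).add(year)

    flags = {}
    for tid, year, school in records:
        if year == last_year:
            flags[(tid, year)] = (None, None)
            continue
        next_school = school_in.get((tid, year + 1))
        # School leaver: observed in a different school in year t + 1.
        left_school = next_school is not None and next_school != school
        # District leaver: not observed in any subsequent year T > t.
        left_district = not any(y > year for y in years_seen[tid])
        flags[(tid, year)] = (left_school, left_district)
    return flags
```

A teacher observed in 2003 and 2005 but not 2004 is not flagged as leaving the district in 2003, which is the behavior the text describes for temporary leaves.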
Equations for leaving a school versus leaving the district are estimated separately, with each modeling the linear probability that teacher j leaves the school or district as a function of switching grades. In a teacher-by-year dataset, this setup is similar to those used in discrete time hazard analyses (Singer & Willett, 1993). I estimate linear probability as opposed to hazard models given empirical justification (Heckman & Snyder, 1996) and ease of interpretation of estimates. As above, $SWITCH_{jt}$ is an indicator for whether teacher j switches grades in year t (from year t – 1) and is further disaggregated by adjacent and nonadjacent switches in separate analyses. I control for indicators for teachers’ year of experience, with first year teaching as the reference group. I also include indicators for the year that a teacher began teaching to control for idiosyncratic differences in leave rates for a given cohort due to policy shocks such as the Great Recession. Although experience dummies vary across years, cohort indicators do not; therefore, these variables are not collinear. Finally, like Ost and Schiman (2015), I include school fixed effects, $\varphi_s$, to account for differences across schools that might be related both to grade switching and to turnover, such as student and teacher composition.
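As a sketch, a linear probability model with the terms described above could be written as follows. The cohort term $\mathit{COHORT}_j$ and the error notation are assumptions for illustration rather than the exact form of equation (4).

```latex
LEAVE_{jst} = \beta \, SWITCH_{jt} + f(\mathit{EXPERIENCE}_{jt})
  + \mathit{COHORT}_j + \varphi_s + \varepsilon_{jst}
```

Because the outcome is binary, $\beta$ is read directly as a difference in the probability of leaving: a value of 0.03, for instance, would correspond to a 3 percentage point higher rate of leaving among teachers who switch grades.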
The coefficient $\beta$ on the indicator for whether or not teachers switch grades describes the percentage point difference in the probability of leaving for those teachers who switch grades compared to those who do not. Although this set of analyses only describes the difference in retention rates in the year of the switch itself, I also examine whether there is a relationship between grade switching and retention in any year after the switch. To do so, I estimate additional models where I replace my main predictor with a new variable that takes on a value of 1 in the year of a switch and all subsequent years.
Restrictions to the sample are similar to those described above in analyses that examine the relationship between switching and student achievement. However, I am able to include teachers outside of tested grades and subjects. In addition, I exclude school years 2010-11 and 2011-12, given lack of human resource data in these years and the fact that I am not able to impute data by following teachers into subsequent years.
Results
Trends in Grade Reassignment
I begin the Results section by describing trends in grade reassignment. Overall, I find that, year to year, elementary teachers switch the grade in which they teach at high rates upwards of 17% (see Figure 1); lower rates around 12%-14% after the 2007-08 school year may reflect more general trends in transfer and migration as a result of budget cuts and teacher firings related to the Great Recession. Interestingly, switching to an adjacent grade occurs slightly more frequently than switching to a nonadjacent grade through the 2007-08 school year, but this trend is reversed in the following school years.

Figure 1. Rates of grade switching by school year (top panel), grade level teachers switch from (middle panel), and teaching experience (bottom panel).
Disaggregating grade switches by grade level—that is, the grade teachers switch from—I find that rates steadily increase across the elementary grades. Roughly 11% of kindergarten teachers switch grades, compared to roughly 17% for fifth-grade teachers. By design of schools, I find lower rates of adjacent grade switching for kindergarten and fifth-grade teachers who teach at the extremes of the elementary grades and, therefore, only can switch to one adjacent grade.
Disaggregating rates of switching by teachers’ level of experience, I find that rates of switching increase between the second and fourth year of experience, from roughly 21% overall to 24%, and then decline steadily for teachers with more experience. By construction of my grade-switching variable, no teacher can switch grades in his or her first year of experience; therefore, this bin is excluded from the analysis. Teachers with 9 or more years of experience switch grades at much lower rates of roughly 11%. Within each experience level, rates of adjacent and nonadjacent grade switching are roughly equal.
In addition, I find that rates of switching differ depending on school and teacher characteristics. In Table 1, I present results for school characteristics that are estimated from a regression framework that controls for teaching experience, which is absorbed in the model. Teachers who work in the lowest achieving schools, operationalized as average prior-year achievement in math, switch grades from one year to the next at higher rates than those who work in the highest achieving schools. I find that roughly 16% of teachers who work in bottom-quartile schools (i.e., low-achieving) switch grades, compared to roughly 13% who work in top-quartile schools (i.e., high-achieving). This differential of over 3 percentage points is statistically significant. Differentials between the bottom quartile and the second and third quartiles are also statistically significant but smaller in magnitude, of 1.3 and 1.7 percentage points, respectively. When examining rates of switching to adjacent or nonadjacent grades, trends are similar, but differentials between quartiles are smaller in magnitude given lower rates of switching to an adjacent or nonadjacent grade overall.
Table 1. Rates of Grade Switching by School Characteristics
Note. Estimates are calculated from regression models that include fixed effects for teaching experience, which are absorbed in the model. Sample includes all elementary teachers who teach core academic subjects and are in their second year of teaching or higher.
*p < 0.05, **p < 0.01, ***p < 0.001, comparing the second through top quartiles to the bottom.
When I examine trends for teachers who work in low-income and high-minority schools, some differentials remain statistically significant but are substantively smaller in magnitude. Roughly 13% of teachers who work in high-income schools (i.e., bottom quartile of percent of students eligible for free- or reduced-price lunch) switch grades, compared to roughly 15% of teachers who work in low-income schools (i.e., top quartile of percent of students eligible for free- or reduced-price lunch). However, these differences disappear when controlling for average school math achievement. Finally, 13% of teachers who work in low-minority schools (i.e., bottom quartile of percent of non-White students) switch grades, compared to 14% of those who work in high-minority schools (i.e., top quartile of percent of non-White students). As above, differentials between teachers who work in top- versus bottom-quartile schools are similar when examining rates of switching to adjacent versus nonadjacent grades.
In Table 2, I present differences in rates of switching by teachers’ prior-year value-added quartile. Rates of switching are estimated from a regression framework that controls for school year given that value-added scores are normed within years, with 2003-04 as the baseline year. For teachers who have a prior-year value-added score in both math and ELA, I average these scores. Because grade reassignment is a school-level decision, I look at trends both across and within schools.
Table 2. Rates of Grade Switching by Teachers’ Prior Value-Added Quartile
Note. Estimates are calculated from regression models that include fixed effects for school year, with 2003-04 as the baseline year. Sample includes all elementary teachers who teach core academic subjects, are in their second year of teaching or higher, and have a value-added score in the year(s) prior to the switch. In instances where a teacher has value-added data in both reading and mathematics, I average these scores.
p < 0.10, *p < 0.05, **p < 0.01, ***p < 0.001, comparing the second through top quartiles to the bottom.
Here, I find that rates of switching are higher for teachers in the bottom quartile of prior-year value-added than for their more effective colleagues. Roughly 20% of bottom-quartile teachers switch grades compared to roughly 15% of top-quartile teachers. This trend remains when disaggregating by adjacent versus nonadjacent grade switches and when comparing teachers within schools.
Relationship Between Switching Grades and Teachers’ Long-Term Careers
In addition to observing differential rates of switching for specific types of teachers and schools, I examine the relationship between switching grades and teachers’ longer term careers. This is of particular importance given findings above that the most vulnerable teachers—early career teachers, those with low prior value-added scores, and those who work in high-risk schools—switch grades at the highest rates.
I begin this analysis by examining the relationship between grade switching and teachers’ effectiveness at raising student achievement over time—often referred to as teachers’ “returns to experience.” In Table 3, I present parameter estimates for average returns to experience and deviations from this trajectory for switching grades in a given year.
Table 3. Returns to Experience for Teachers Who Do and Do Not Switch Grades
Note. Estimates for adjacent and nonadjacent grade switches are from the same regression model. Each model controls for student and class characteristics, and includes teacher fixed effects, school fixed effects, and grade-by-year fixed effects. Robust standard errors clustered at the class level are reported in parentheses. Sample includes all elementary teachers who are observed as novices at some point in the data and whose students have test score data in the given subject in the current and prior year. All cells have 89 teachers or more.
p < .10. *p < .05. **p < .01. ***p < .001.
To interpret results related to switching grades, I first describe average returns to experience. Importantly, I note two trends. Consistent with prior research (Ost, 2014; Papay & Kraft, in press), I find larger returns to experience in math than in ELA. At the same time, the magnitudes of these estimates are substantively larger than those found in most analyses that examine teacher productivity over time (Clotfelter, Ladd, & Vigdor, 2006; Harris & Sass, 2011; Kraft & Papay, 2014; Ost, 2014; Papay & Kraft, in press; Rockoff, 2004; Wiswall, 2013). In these studies, average returns to experience generally rise no higher than 0.17 SD by the 10th year of experience (often with smaller returns in ELA). In comparison, I find average returns to experience in the seventh through ninth year of teaching of 0.321 SD and 0.138 SD in math and ELA, respectively.
One reason for these differences in estimates is the unique sample of teachers that I include in my analyses, which focuses on elementary teachers who are observed as novices at some point in my dataset. When I estimate returns to experience using the same analytic models for all teachers with current- and prior-year test-score data (third through eighth grade in math and third through ninth grade in ELA), I find estimates that are much more in line with other work (see Table A1 in the Appendix). Returns to experience for teachers in their seventh year of teaching or higher are 0.130 SD and 0.070 SD in math and ELA, respectively. These estimates rise to 0.184 SD and 0.073 SD, respectively, when I further limit the sample to teachers who are observed as novices at some point in the dataset. With the exception of Ost (2014), most studies do not make this same sample restriction, suggesting that including teachers with censored data may attenuate results.
A second reason for larger average returns to experience in my study is the fact that I account for the confounding effect of grade reassignment on teacher productivity. Again, with the exception of Ost (2014), other studies do not take this into account. The fact that teachers with grade-specific experience have larger returns than those who switch grades is consistent with theory, prior work, and findings that I describe in detail below. As a point of comparison, in Table A1, I show average returns to experience for elementary teachers who are observed as novices at some point in the dataset without controlling for switching grades. Indeed, these estimates (0.270 SD and 0.098 SD in math and ELA, respectively, for teachers in their seventh year of teaching or higher) are smaller than those I observe in Table 3 when I do control for switching grades (0.321 SD and 0.138 SD in math and ELA, respectively, for this same group of teachers). In light of these differences between my work and other studies, I describe trends both in absolute terms (i.e., magnitude of effect sizes related to switching grades) and relative terms (i.e., returns to experience for those who switch grades compared to average).
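The sequence of estimates discussed above can be lined up to show how much each change in sample or specification moves the estimate for teachers in their seventh year of teaching or higher. The short sketch below uses only figures reported in the text; the step labels and the function name are my own shorthand, and the successive differences are descriptive rather than a formal decomposition.

```python
# Returns to experience (SD) for teachers in their seventh year or higher,
# as reported in the text: (step label, math, ELA). Labels are shorthand.
STEPS = [
    ("all teachers with test-score data", 0.130, 0.070),
    ("+ observed as novices", 0.184, 0.073),
    ("+ elementary grades only", 0.270, 0.098),
    ("+ controls for switching grades", 0.321, 0.138),
]

def increments(steps):
    """Change in the math and ELA estimates at each successive step."""
    changes = []
    for (_, math_prev, ela_prev), (label, math_cur, ela_cur) in zip(steps, steps[1:]):
        changes.append((label, round(math_cur - math_prev, 3), round(ela_cur - ela_prev, 3)))
    return changes

for label, d_math, d_ela in increments(STEPS):
    print(f"{label}: math {d_math:+.3f} SD, ELA {d_ela:+.3f} SD")
```

Read this way, restricting the sample accounts for most of the gap in math between my estimates and those from the full sample, with controls for switching grades adding a further 0.051 SD.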
Here, I observe some statistically significant and some marginally significant decrements to average returns to experience in both math and ELA. For example, average returns to experience are 0.099 SD and 0.168 SD in Years 2 and 3, respectively. In both years, returns for those who switch grades are 0.036 SD, or one-third to one-fifth, lower. I find larger decrements in math of 0.059 SD and 0.070 SD for teachers who switch grades in their fifth or sixth year of teaching, respectively. These losses are one-and-a-half to two times larger than average returns to experience from the prior year (0.040 SD between Years 4 and 5, and 0.026 SD between Years 5 and 6), meaning that teachers who switch grades in these years lose ground relative to where they began before switching grades. For teachers who switch grades in their second or sixth year of teaching, these decrements persist into the following year.
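The relative comparisons in this paragraph follow from simple arithmetic on the reported estimates, sketched below. The variable and function names are hypothetical, and the values are taken directly from the text.

```python
# Average returns to experience (SD) in Years 2 and 3, and the decrement
# for switching grades in either year, as reported in the text.
AVERAGE_RETURN = {2: 0.099, 3: 0.168}
EARLY_DECREMENT = 0.036

# Year of switch -> (gain in average returns over the prior year, decrement for switching).
LATER_SWITCHES = {5: (0.040, 0.059), 6: (0.026, 0.070)}

def relative_decrement(decrement, average):
    """Decrement expressed as a share of the average return in that year."""
    return decrement / average

def net_change(gain, decrement):
    """Change relative to the prior year for a teacher who switches grades."""
    return round(gain - decrement, 3)

for year, average in AVERAGE_RETURN.items():
    print(f"Year {year}: decrement is {relative_decrement(EARLY_DECREMENT, average):.0%} of the average return")

for year, (gain, decrement) in LATER_SWITCHES.items():
    print(f"Year {year}: net change of {net_change(gain, decrement):+.3f} SD relative to the prior year")
```

A negative net change corresponds to losing ground: the decrement for switching exceeds the gain that the average teacher makes over the same year.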
In ELA, I only find decrements to returns to experience for those who switch grades in their sixth year of teaching of 0.047 SD. One reason for this likely is that average returns to experience are much smaller in ELA than they are in math. Thus, decrements also are smaller in magnitude and harder to distinguish from zero. I illustrate these trends for both ELA and math in Figure 2, only presenting trend lines that deviate from average returns to experience at conventional levels of statistical significance.

Figure 2. Returns to experience in math (left panel) and ELA (right panel) on average and for teachers who switch to any grade (top panel) and to adjacent versus nonadjacent grades (bottom panel).
I further explore these relationships by disaggregating returns to experience for those who switch to adjacent versus nonadjacent grades. Two trends emerge. First, although I do observe some statistically significant lower returns to experience for those who switch to an adjacent grade, I also observe some higher returns for these teachers. For teachers who switch grades in their fifth year of teaching, returns to experience in both math and ELA are statistically significantly larger than average returns in the year following the switch (i.e., Year 6). This suggests that there may be something unique about teachers who switch grades in this year, such as a motivation to switch. This could be the case for teachers who feel that they mastered teaching in one grade over their first few years in the classroom and want to explore a new grade level. Interestingly, though, teachers who switch to an adjacent grade in their sixth year of experience have statistically significant lower returns to experience in math.
A second key finding is that, on average, teachers who switch to a nonadjacent grade experience substantively larger decrements to returns to experience than those discussed earlier. These decrements often persist into future years. In math, teachers who switch to a nonadjacent grade in their third or fifth year exhibit returns to experience roughly half as large as average returns to experience in that year. For teachers in their third year, average returns are 0.173 SD, whereas those for teachers who switch to a nonadjacent grade are 0.078 SD lower or 0.095 SD overall. For teachers in their fifth year, average returns are 0.268 SD, whereas those for teachers who switch to a nonadjacent grade are 0.118 SD lower or 0.150 SD overall. In both cases, decrements of over 0.070 SD persist into the year following the switch. Teachers who switch to a nonadjacent grade in their seventh year of teaching or higher have returns 0.091 SD lower than average returns of 0.334 SD. As I cut off experience at Year 7, I cannot follow these trajectories into future years. In ELA, I find that teachers who switch to a nonadjacent grade in their fifth year erase almost all returns to experience in the following year. While average returns to experience in ELA in both Years 5 and 6 are 0.111 SD, returns for teachers who switch to a nonadjacent grade are 0.093 SD lower in Year 6, or 0.018 SD overall.
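The net returns quoted for teachers who switch to a nonadjacent grade can be checked with a short calculation. The sketch below uses the figures reported in this paragraph; the dictionary keys and function names are hypothetical labels chosen for illustration.

```python
# (average return, decrement for a nonadjacent switch), in SD, as reported in the text.
NONADJACENT = {
    "math, Year 3": (0.173, 0.078),
    "math, Year 5": (0.268, 0.118),
    "math, Year 7 or higher": (0.334, 0.091),
    "ELA, Year 6 (switch in Year 5)": (0.111, 0.093),
}

def net_return(average, decrement):
    """Return to experience for a switcher: the average return minus the decrement."""
    return round(average - decrement, 3)

def share_retained(average, decrement):
    """Switcher's return as a share of the average return."""
    return (average - decrement) / average

for label, (average, decrement) in NONADJACENT.items():
    print(f"{label}: net {net_return(average, decrement):.3f} SD, "
          f"{share_retained(average, decrement):.0%} of the average return")
```

For the third- and fifth-year math cases, switchers retain a little over half of the average return, which matches the description of returns roughly half as large as average.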
Finally, I examine the relationship between switching grades and teacher retention, both in a given school (see Table 4) and in the district (see Table 5). Findings indicate that teachers who switch grades transfer schools at the end of that year at a rate 2.7 percentage points higher than that of their colleagues who do not switch grades. This differential is almost 40% of the average transfer rate of 7.2% for this sample of first- through eighth-year teachers. These trends are similar for teachers who switch to adjacent and nonadjacent grades. I also find that teachers who switch grades transfer schools in any year after the switch at a rate roughly 1.7 percentage points higher than those who do not. The differential for those who switch to an adjacent grade is about half as large and no longer statistically significant. I find similar results when I control for prior value-added scores but do not show these results in Table 4, as these value-added scores are not a statistically significant predictor of school transfer rates when also controlling for teaching experience.
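Because the differential is stated in percentage points while its size is judged against a base rate in percent, the conversion is worth making explicit. The sketch below uses the two figures reported above; the function name is hypothetical.

```python
def relative_differential(differential_pp, base_rate_pct):
    """Express a percentage-point differential as a share of a base rate given in percent."""
    return differential_pp / base_rate_pct

# 2.7 percentage points against an average transfer rate of 7.2%.
share = relative_differential(2.7, 7.2)
print(f"{share:.1%} of the average transfer rate")
```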
Table 4. School Retention Rates for Teachers Who Do and Do Not Switch Grades
Note. Estimates in each column are from separate regression models that include fixed effects for entering cohort, with 2003-04 as the baseline cohort, and school fixed effects. Robust standard errors clustered at the school level are reported in parentheses. Sample includes all elementary teachers who teach core academic subjects and are observed as novices at some point in the data.
*p < .05. **p < .01. ***p < .001.
Table 5. District Retention Rates for Teachers Who Do and Do Not Switch Grades
Note. Estimates in each column are from separate regression models that include fixed effects for entering cohort, with 2003-04 as the baseline cohort, and school fixed effects. Robust standard errors clustered at the school level are reported in parentheses. Sample includes all elementary teachers who teach core academic subjects and are observed as novices at some point in the data.
p < .10. *p < .05. **p < .01. ***p < .001.
The relationship between switching grades and retention in the district is weaker. Teachers who switch to a nonadjacent grade leave the district at the end of the year at a rate 1.9 percentage points higher than those who do not switch grades. However, given relatively high attrition rates overall (between roughly 8% for first-year teachers and 19% for second-year teachers), this differential is not as substantively significant as the results above.
Discussion and Conclusion
Research on teacher stability typically focuses on the extent to which teachers remain in the same school, school district, or the teaching profession from one year to the next. In this paper, I investigate another potential facet of stability—whether teachers experience stability in the grade in which they teach. General trends in grade reassignment and the relationship to student achievement and teacher turnover are consistent with prior research (Brummet, Gershenson, & Hayes, 2013; Chingos & West, 2011; Jacob & Rockoff, 2011; Ost, 2014; Ost & Schiman, 2015).
To my knowledge, this study is the first to distinguish between adjacent and nonadjacent grade switching. I find that, in many cases, teachers who switch grades exhibit smaller returns to experience in the year of the switch relative to average. For those who switch to a nonadjacent grade, these decrements can wipe out any gains due to increased experience and can persist in the year after the switch occurs.
Though not a main focus of this paper, interestingly, I find returns to experience that are substantively larger in magnitude than those found in other studies (Clotfelter, Ladd, & Vigdor, 2006; Harris & Sass, 2011; Kraft & Papay, 2014; Ost, 2014; Papay & Kraft, in press; Rockoff, 2004; Wiswall, 2013), driven in large part by the estimation sample. This does not affect my interpretation of the results, as I describe the relationship between switching grades and returns to experience in both absolute and relative terms. That said, my findings suggest that this topic deserves further inquiry.
In light of disproportionate rates of switching that this and other studies highlight (Brummet, Gershenson, & Hayes, 2013), findings described above may be particularly troubling from an equity perspective. In addition to having potentially negative consequences for the most vulnerable teachers (i.e., early career teachers and those with low prior value-added scores), high rates of grade switching likely impact the most vulnerable students. Teachers who switch grades—and thus exhibit decrements in their returns to experience—are more likely to work in low-achieving and high-minority schools. In addition, high rates of switching to nontested grades (Boyd et al., 2008; Chingos & West, 2011; Fuller & Ladd, 2012) may result in concentration of low-quality teachers in the earliest elementary grades with potentially lasting effects on students at a formative stage in their academic and emotional development (Jennings & DiPrete, 2012).
At the same time, it is important to note that one must be cautious in interpreting results on the relationship between switching grades and teachers’ longer term career trajectories, given the noncausal nature of these analyses. Above, I describe a variety of factors that may motivate grade reassignments. In addition, the first set of descriptive analyses suggests that grade reassignments likely are not random. For example, findings that teachers with low prior value-added scores switch grades at higher rates than their more effective colleagues suggest that grade reassignments may be a strategic decision on the part of school leadership.
Like Ost (2014), I attempt to limit the degree of bias through strategic use of fixed effects that control for some of these factors. However, although Ost predominantly is concerned with sorting of students to teachers, I recognize that selection bias still likely plays a role. For example, if teacher motivation to switch grades varies over time, then teacher fixed effects cannot account for this. Similarly, school fixed effects control for observed and unobserved differences across schools but cannot account for within-school factors, such as grade reassignments motivated by matching of teachers to students.
Not being able to distinguish between forced and chosen migration is a limitation of this work. That said, given that selection into treatment can occur in multiple ways, the direction of bias is unclear. Forced migration may make decrements to returns to experience and retention rates larger than they would be otherwise, whereas chosen migration likely makes decrements smaller. Therefore, the extent of bias in my estimates depends in part on the proportion of switches that are forced versus chosen. Although I am unable to calculate or even estimate these proportions directly, I argue that trends from this work are unlikely to disappear completely. Future research may attempt to isolate the causal effect of switching grades on teacher- and student-level outcomes.
In addition, findings must be placed in a broader context of teaching and schools as organizations. As noted by prior research, in some instances switching grades likely is done strategically to avoid accountability policy. However, in other cases, switching grades may be done with the best interest of teachers in mind. For example, working in different grades may help teachers develop a deeper understanding of student development and learning trajectories. If this is true, then teachers who switch grades may also be less likely to experience “burn out.” This may be one reason why I observe some positive returns to experience for teachers who switch to an adjacent grade, relative to average returns to experience. It also may be why, in some instances, decrements to returns to experience for switching grades appear to fade in the year after the switch.
Taken together, these findings indicate a need for districts to investigate why teachers are switching grades at high rates and to consider the extent to which increasing stability in teachers’ grade assignments may benefit schools, teachers, and students. Doing so may be particularly relevant given that, compared to other policies and programs focused on improving educational outcomes for students, creating stability in teachers’ grade assignments poses a potentially cost-effective way of shifting the needle. Each year, districts spend millions of dollars recruiting, developing, and seeking to retain their teacher workforce (Barnes, Crowe, & Schaefer, 2007; Darling-Hammond, Wei, Andree, Richardson, & Orphanos, 2009; Miles, Odden, Fermanich, Archibald, & Gallagher, 2011). Yet these investments—particularly around teacher development—often do not correspond to significant improvements in teacher quality or student achievement at scale (Garet et al., 2011; Garet et al., 2008; Yoon, Duncan, Lee, Scarloss, & Shapley, 2007). Even factors that are related to teachers’ effectiveness and development trajectories come at high costs for the size of their effects. For example, one-on-one teacher coaching can raise student achievement by 0.22 SD in the postintervention year but costs roughly $4,000 per teacher (Allen et al., 2011). Interacting with high-quality teacher colleagues and working in a strong school environment can produce annual returns to experience of 0.04 SD and 0.003 SD, respectively (Jackson & Bruegmann, 2009; Kraft & Papay, 2014) but require large-scale recruitment efforts, building of school support networks, etc. Conversely, the decision of whether or not to have a teacher switch grades has no direct cost. In fact, in light of the relationship between grade reassignment and teacher retention in schools, the small effort of keeping teachers in the same grade may save money while also potentially raising student achievement. 
Therefore, continued research in this area may prove quite valuable to schools.
Appendix
Table A1. Sensitivity of Returns to Experience to Different Analysis Samples
| | Math: Third Through Eighth Grade Teachers | Math: Third Through Eighth Grade Teachers Who Are Observed as Novices | Math: Third Through Fifth Grade Teachers Who Are Observed as Novices | ELA: Third Through Ninth Grade Teachers | ELA: Third Through Ninth Grade Teachers Who Are Observed as Novices | ELA: Third Through Fifth Grade Teachers Who Are Observed as Novices |
|---|---|---|---|---|---|---|
| Second year teaching | 0.060*** | 0.069*** | 0.086*** | 0.030*** | 0.035*** | 0.040*** |
| | (0.005) | (0.009) | (0.012) | (0.003) | (0.006) | (0.010) |
| Third year teaching | 0.080*** | 0.100*** | 0.136*** | 0.040*** | 0.045*** | 0.064*** |
| | (0.005) | (0.015) | (0.022) | (0.004) | (0.010) | (0.017) |
| Fourth year teaching | 0.094*** | 0.125*** | 0.169*** | 0.051*** | 0.049*** | 0.061** |
| | (0.006) | (0.022) | (0.031) | (0.004) | (0.014) | (0.024) |
| Fifth year teaching | 0.106*** | 0.136*** | 0.203*** | 0.058*** | 0.059** | 0.072* |
| | (0.006) | (0.029) | (0.040) | (0.004) | (0.018) | (0.031) |
| Sixth year teaching | 0.122*** | 0.158*** | 0.229*** | 0.063*** | 0.055* | 0.078* |
| | (0.006) | (0.035) | (0.048) | (0.004) | (0.022) | (0.037) |
| Seventh year teaching | 0.130*** | 0.184*** | 0.270*** | 0.070*** | 0.073** | 0.098* |
| | (0.007) | (0.044) | (0.059) | (0.005) | (0.028) | (0.045) |
| Teacher observations | 16,094 | 2,986 | 1,920 | 16,795 | 3,137 | 1,922 |
| Teacher-year observations | 67,671 | 10,973 | 7,051 | 70,029 | 11,449 | 7,050 |
| Student-year observations | 2,166,595 | 455,008 | 159,903 | 2,082,156 | 407,204 | 159,400 |
Note. Estimates in each column are from separate regression models that control for student and class characteristics, and include teacher fixed effects, school fixed effects, and grade-by-year fixed effects. Robust standard errors clustered at the class level are reported in parentheses.
*p < .05. **p < .01. ***p < .001.
