Building Teacher Teams

Abstract

Student peer effects are well documented; however, we know far less about peer effects among teachers. We hypothesize that a relatively effective teacher can positively affect the performance of his or her peers, whereas a relatively ineffective teacher may negatively affect the performance of other teachers with whom he or she works closely. Using a decade of data on teacher transfers between schools, which change the peer group when transfer teachers enter grade-level teams in the new school, we find evidence of strong positive spillover effects associated with the introduction of peers who are more effective than the incumbent teacher himself or herself. However, the incumbent teacher’s students are not meaningfully disadvantaged by the entry of relatively ineffective peers. This finding provides initial evidence that mixing teachers with diverse performance levels can increase student achievement in the aggregate. These results are robust to several student sorting and teacher selection issues.

Keywords

teacher spillovers, peer effects, teacher transfer, teacher quality

Research provides persuasive evidence on teachers’ contributions to their students’ achievement (Aaronson, Barrow, & Sander, 2007; Koretz, 2002; McCaffrey, Lockwood, Koretz, Louis, & Hamilton, 2004; McCaffrey, Sass, Lockwood, & Mihaly, 2009; Rivkin, Hanushek, & Kain, 2005; Rockoff, 2004; Sanders & Rivers, 1996). Yet, alongside a large research base showing evidence of peer effects in other workplaces (see Herbst & Mas, 2015), an emerging body of research suggests that student achievement is a function not just of one’s own classroom teacher but of the combined effort of the classroom teacher and others with whom he or she works. The quality of a teacher’s colleagues, for example, is correlated with the test score gains made by that teacher’s students (Jackson & Bruegmann, 2009). Teachers’ instructional expertise diffuses through professional interactions and thus can change colleagues’ classroom practices (Sun, Penuel, Frank, Gallagher, & Youngs, 2013; Papay, Taylor, Tyler, & Laski, 2016). Teachers’ collaboration with one another within teams can increase their effectiveness as measured by student achievement (Ronfeldt, Farmer, McQueen, & Grissom, 2015).

Despite this initial evidence, our understanding of teacher spillover effects lacks clarity and detail. If spillover effects are nonnegligible, ignoring them means underestimating the impact of effective teachers. By focusing only on effects on students in a teacher’s own classroom, evaluations of efforts to increase teacher quality or its equitable distribution across schools, such as the U.S. Department of Education’s Talent Transfer Initiatives (Glazerman, Protik, Teh, Bruch, & Max, 2013), may underestimate those efforts’ total impacts. Assuming no spillover effects in models that states and districts use to measure teachers’ “value added” to student achievement may not be appropriate (Jackson & Bruegmann, 2009; Yuan, 2015). In addition, failure to recognize spillovers among teachers on grade-level or subject-area teams within a school may lead school leaders to miss an important opportunity to strategically build teacher teams in ways that augment student learning.

This study examines teacher spillover effects using longitudinal administrative data from Miami-Dade County Public Schools (M-DCPS). We apply insights from the economic literature on employee peer effects in other workplaces, which emphasizes the roles of social pressure and knowledge spillover as means for employees to affect the productivity of one another, to model spillovers in the context of teacher work. We then test these models using the case of teacher transfers from other schools in M-DCPS onto grade-level teams of existing teachers in elementary and middle schools. The idea behind this test is that if the presence of a more effective teacher on one’s team affects other teachers’ own performance, the arrival of a new peer provides an opportunity to observe evidence of this spillover. Specifically, we ask whether a transfer teacher’s entry into a grade-level team affects the achievement of students of incumbent teachers (i.e., those already in the school), and how these effects depend on the relative effectiveness of transfer and incumbent teachers.

We examine four different potential types of spillovers. First, we look at the average spillover effects of new transfer teachers. This “linear-in-means” model assumes that with the arrival of an effective peer, all incumbent teachers will improve, and conversely, the arrival of an ineffective peer will hurt all others’ outcomes. We then consider the nonlinearity of spillover effects depending on the difference in prior stable effectiveness between new transfers and incumbent teachers—the “relatively effective” and “relatively ineffective” models. The “relatively effective” approach models how incumbent teachers’ effectiveness changes in relationship to the degree to which the new peer is more effective than they are. This model could reflect knowledge transfer from more effective to less effective teachers. Similarly, the “relatively ineffective” approach measures the effect of the degree to which the new transfer is less effective. This model could capture a drain on incumbent teachers from having less effective teachers enter their grade. Finally, in contrast to the relative approach, we examine the variation of spillover effects depending on the absolute effectiveness of focal (incumbent) teachers. We use “focal” teachers interchangeably with “incumbent” teachers hereafter to refer to those who are already in the grade when the new transfer joins the team and whose students’ achievement gains are the outcome measures of the analysis. This “absolute effectiveness” model evaluates which types of teachers are more or less responsive to peers’ effectiveness. Less effective teachers may be more affected by the performance of new teachers, because they need greater support from their peers or are more easily influenced.

Although we find some evidence of positive “linear-in-means” effects, we find stronger evidence of positive spillover effects associated with the introduction of relatively effective peers into a teacher group. If a new peer teacher at the same grade level is about one standard deviation more effective than a student’s own teacher, that student’s math test scores would increase by 1.9% or 2.8% of a standard deviation. This spillover effect is about 23% or 29% of the student’s own teacher’s effect on his or her achievement gains. We also find that effects are asymmetrical; although teachers benefit from a relatively effective peer, their students are not meaningfully disadvantaged by the presence of a relatively ineffective peer. This finding implies that mixing teachers with diverse performance levels on grade-level teams may be a useful assignment strategy for increasing student achievement. In keeping with the importance of relatively effective peers, we also find some evidence that low-performing teachers are more responsive to the composition of their peer colleagues than high-performing teachers. Having an effective peer teacher particularly benefits students assigned to low-performing teachers.

In what follows, we first review the literature on spillover effects among employees in schools and other workplaces. Next, we describe the four types of spillover in more detail, motivated by possible spillover mechanisms among teachers. We then describe the data and analytic strategies for testing these models. Finally, we discuss the main findings and their policy and research implications.

Spillover Effects Among Employees in Schools and Other Workplaces

A large body of research finds evidence of peer effects on worker productivity in both high-skilled and low-skilled occupations and in a variety of experimental contexts (e.g., Battu, Belfield, & Sloane, 2003; Bauer & Vorell, 2010; Cornelissen, Dustmann, & Schönberg, 2013; De Grip & Sauermann, 2012; Herbst & Mas, 2015; Kuroda & Yamamoto, 2013; Stoyanov & Zubanov, 2012). Explanations for peer effects in the workplace center on two mechanisms: social pressure and knowledge transfer (Cornelissen et al., 2013; Frank, Lo, & Sun, 2014). Social pressure works either by providing relatively low-performing workers with incentives to work more to keep up with other coworkers or by making high-performing workers reduce their efforts to conform to the group norm. Knowledge transfer, by contrast, is a process in which workers learn job-relevant knowledge or skills that make them more productive from observing or interacting with coworkers.

Research findings are consistent with both mechanisms. For example, as evidence of social pressure, Falk and Ichino’s (2006) study of short-term workers stuffing envelopes showed that the presence of a more productive peer working nearby compelled less productive workers to work more quickly. Similarly, Mas and Moretti’s (2009) study of supermarket cashiers found that introducing a faster cashier into a shift increased the pace of scanning among others on the shift. These gains were limited to workers in the productive cashier’s line-of-sight—suggesting that the increase in peer productivity came from a kind of monitoring pressure—and were concentrated among those peers with whom he or she works more frequently. Kuroda and Yamamoto (2013) provided another example of conforming to the group norm. When managerial-level employees were transferred from Japan to European branches of the same global firms, these employees significantly reduced their work hours due to behavioral influences of locally hired staff. The size of this reduction in hours depended on the level of the interactions between the transfers and local peers.

Substantial research also shows evidence of spillovers consistent with the knowledge transfer mechanism, including studies of the transmission of knowledge learned during a formal training program to other employees (De Grip & Sauermann, 2012) and persistent gains to the productivity of Danish manufacturing firms from hiring high-skilled employees from more productive firms (Stoyanov & Zubanov, 2012). As another example, Cornelissen, Dustmann, and Schönberg (2013) analyzed peer effects on wages among workers in one German metropolitan area—the city of Munich—and showed that peer effects existed in both high-skilled (e.g., teachers, research scientists, doctors, and lawyers) and low-skilled occupations (e.g., cashiers, data entry workers, and agricultural helpers), although the effects on low-skilled workers were larger than those on high-skilled workers. Azoulay, Graff Zivin, and Wang (2010) similarly found evidence of knowledge spillovers among medical school faculty in their analysis of changes in coauthor publication output following the death of “superstar” colleagues.

Teaching is a high-skilled, knowledge-intensive profession that involves substantial on-the-job learning and collaboration, making teaching a conducive context for spillovers among teachers (Cornelissen et al., 2013). Our understanding of peer effects among teachers in schools is sparse, however. Existing studies suggest that teachers may influence each other’s teaching and performance through both the social pressure and knowledge transfer mechanisms. For example, a study of the implementation of a reading instruction reform demonstrates the influence of social pressure on teaching, finding that local norms of instructional practices in a teacher’s school and in her collegial subgroup can mediate her adoption of new teaching practices (Penuel, Frank, Sun, Kim, & Singleton, 2013). Research also provides salient evidence on teacher learning from peers through knowledge transfer. For instance, using longitudinal elementary school teacher and student data, Jackson and Bruegmann (2009) found that students have larger test score gains when their teachers have more effective colleagues, with the historical peer quality (i.e., estimated value added from an out-of-sample preperiod) for less experienced teachers explaining about 20% of students’ own-teacher effects. Evidence in their study further suggests that peer learning is the major venue for the transmission of the peer effects that they observe, because peer effects persist over time and historical peer quality explains away some of teachers’ contemporaneous contribution to own students’ achievement. Our analysis extends the Jackson and Bruegmann (2009) study by exploring alternatives to the “linear-in-means” framework such as assessing whether peer effects vary according to the relative effectiveness of individual teachers and their peers. 
This addition provides insights on whether peers present a zero-sum game or whether overall gains can be made by strategically assigning teachers to other teachers who will most benefit from their presence.

A related strand of research uses teacher network data to provide more direct evidence on knowledge spillover and the diffusion of instructional expertise among teachers. Sun, Penuel, Frank, Gallagher, and Youngs (2013) found that the spillover effects of professional development programs through teacher collaboration can be as large as the programs’ direct effects on participating teachers’ classroom instruction. Using data from an experimental study of a writing professional development program in 39 schools, the study shows that exposure to colleagues’ expertise gained from prior-year professional development significantly increases the breadth of the writing purposes taught by a teacher and the diversity of active learning strategies to engage students in the writing processes. Moreover, teachers whose prior implementation of a new intervention was far from the desired practices responded more to direct participation in organized professional development, whereas teachers whose prior implementation was more advanced responded more to the sharing of promising practices and engaging in in-depth discussion with colleagues (Penuel, Sun, Frank, & Gallagher, 2012). These findings provide initial evidence on the heterogeneity of peer influences depending on the level of focal teachers’ prior teaching practices; however, these studies do not include student learning outcome measures, so it is unclear whether the changes in teachers’ self-reported instructional practices translate into changes in student outcomes.

Modeling Teacher Spillover Effects

Research on colleague spillovers and the social pressure and knowledge transfer mechanisms motivates our analysis of teacher peer effects. Based on this research, we explore four potential models of teachers’ impacts on their colleagues’ performance.

First, the most common approach to modeling peer effects is the “linear-in-means” model (e.g., Graham, 2008; Sacerdote, 2001; Summers & Wolfe, 1977), which hypothesizes that an individual’s outcomes are a function of the average outcomes and characteristics of his or her peers. In a teacher grade-level team, the linear-in-means approach implies that with the arrival of an effective peer, all incumbent teachers in the team will improve their outcomes, and conversely, the arrival of an ineffective peer will hurt all others’ outcomes (Hoxby & Weingarth, 2005). This model is consistent with a mechanism of joint production in which various tasks that would promote student learning are distributed across teachers. For example, teachers may co-teach, co-plan, and share duties outside of their own classrooms (e.g., organizing math club). In many schools, teachers also work together to develop curriculum materials and analyze students’ assessment data (Ronfeldt et al., 2015). The addition of an effective peer could increase the overall productivity of joint activities, whereas adding a worse peer could reduce this collective productivity. However, although there is some theoretical defense for this linear-in-means model, it fails to capture the knowledge transfer or social pressure mechanisms that are at the heart of research on worker peer effects because it does not measure how strong the peer is relative to the focal worker.

Peer effects may be nonlinear, varying for different individuals (Carrell, Fullerton, & West, 2009; Hoxby & Weingarth, 2005; Imberman, Kugler, & Sacerdote, 2012; Mas & Moretti, 2009). We examine whether peer teachers may have different effects on their grade-level team colleagues depending on their “relative effectiveness.” What may matter for whether there is spillover is how much more or less effective the new colleague is than the teacher already in the team. Thus, as our second model, we consider the case that a “relatively effective” new peer—that is, a teacher transferring into the grade-level team with higher teaching effectiveness than a focal incumbent teacher—could affect the achievement of the focal teacher’s students. Although the introduction of a relatively effective peer could worsen outcomes when focal teachers engage in “invidious comparisons” that undermine their confidence or sense of efficacy and, in turn, their effort level (Hoxby & Weingarth, 2005), several other potential mechanisms make benefits from working with a more effective peer more likely. One such mechanism that is likely to be especially important in teacher grade-level teams is knowledge transfer or peer learning (Jackson & Bruegmann, 2009). Working with other teachers in teams or professional learning communities provides opportunities for information about effective instructional practices to be disseminated from one teacher to another (Sun et al., 2013; Ronfeldt et al., 2015). Working together in teams allows teachers to share curricular materials, to discuss strategies for instruction or classroom management, or to model teaching practices for one another (Coburn & Russell, 2008). Each of these transfer mechanisms would benefit less effective teachers with the opportunity to work with a relatively effective colleague. 
Moreover, the presence of a relatively effective colleague may increase a less effective teacher’s motivation to work harder or seek out new strategies or techniques to increase his or her own effectiveness, through either friendly competition with the colleague or the influence of this colleague’s enthusiasm for teaching. The social pressure of not wanting to be perceived by colleagues as less productive or uncooperative may motivate the less effective teacher to improve when a relatively effective colleague is present (Mas & Moretti, 2009).

Third, we consider the case that the entrance of a relatively ineffective new peer could affect a teacher’s students. Although it is possible that the arrival of a less effective peer could increase a colleague’s performance by motivating this colleague to work harder to compensate for the lower productivity of the new peer, this ineffective peer could have negative impacts. Knowledge transfer is asymmetric. Although there is always something that one teacher can learn from the other, knowledge typically flows from more knowledgeable or productive individuals to those who are less so (Conley & Udry, 2010). An ineffective peer is thus less likely to provide the more effective colleague with productivity-enhancing insights. However, this peer may impose costs on his or her more effective colleagues by taking up their time or attention in attempting to learn from them. At the same time, ineffective peers are less likely to affect their colleagues positively via prosocial pressure because they do not provide a positive reference point that motivates other teachers to emulate them.

Fourth and finally, we examine how peer effects vary depending on the “absolute effectiveness” of focal teachers. Although there are incentives for less productive incumbent teachers to “free-ride,” for example, by slowing the pace of their work when a productive peer comes in (Mas & Moretti, 2009), we hypothesize that many less effective teachers will work to minimize productivity differentials with their more effective peers, because they take pride in the social good of their profession. Because they are motivated by prosocial pressure and have more opportunities to receive knowledge, as described previously, we anticipate that less effective incumbent teachers are, on average, more likely to accept the positive influence of new peers. In contrast, effective incumbent teachers are likely to be less affected by their peers because they may be less motivated to turn to their peers for support or may have fewer opportunities to receive constructive help. These dynamics would lead to variation in spillover effects by the absolute effectiveness of the focal teachers.
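The four models differ mainly in how a new peer’s prior effectiveness enters the analysis. The sketch below shows one hypothetical way to build the corresponding regressors for a single focal teacher. The function name `peer_measures`, the truncated gap used for the “relatively effective” and “relatively ineffective” terms, and the `low_cutoff` threshold are illustrative assumptions, not the specification estimated in this study.

```python
from statistics import mean

def peer_measures(focal_va, new_peer_vas, low_cutoff=-0.5):
    """Build illustrative peer regressors for one focal (incumbent) teacher.

    focal_va: focal teacher's prior stable effectiveness (standardized).
    new_peer_vas: prior stable effectiveness of each new transfer peer.
    low_cutoff: hypothetical threshold for flagging a low-performing focal teacher.
    """
    peer_mean = mean(new_peer_vas)   # linear-in-means regressor
    gap = peer_mean - focal_va       # peer effectiveness minus focal effectiveness
    return {
        "peer_mean": peer_mean,
        "relatively_effective": max(gap, 0.0),     # how much better the peer is
        "relatively_ineffective": max(-gap, 0.0),  # how much worse the peer is
        "focal_is_low": focal_va < low_cutoff,     # absolute-effectiveness flag
    }

# Made-up values: a below-average incumbent joined by two stronger transfers.
m = peer_measures(focal_va=-0.6, new_peer_vas=[0.2, 0.6])
```

Under these assumptions, only one of the two relative terms is nonzero for any teacher pair, which is what allows the effects of stronger and weaker peers to differ in size.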

We focus specifically on spillover effects among math teachers (although we provide similar analyses of reading achievement in Appendix A, available in the online version of the journal). Mathematics teaching provides a context that likely is more conducive to spillover effects than teaching in other subjects. Research has documented the distributed nature of math teaching in many schools, with teachers working together to set goals, choose instructional activities, design assessment instruments, and interpret evidence of learning (Cobb, McClain, de Silva Lamberg, & Dean, 2003). More so than many other subjects, there is widespread agreement on appropriate content, sequence, and pedagogy, which means both greater opportunities to coordinate across classrooms and greater likelihood that teachers are following similar curricula and routines. This consensus can provide a helpful basis for peer learning (McLaughlin & Talbert, 2001; Siskin, 1994). Math teachers frequently have conversations with one another about teaching and students’ interactions with material across classrooms that provide opportunities to learn from one another (Horn, 2005; Horn & Little, 2010). These collegial interactions have been linked to improvement in math teachers’ instructional practice (Sun, Garrison, Larson, & Frank, 2014). Moreover, at least one prior study shows much greater impacts of collaboration with colleagues around instruction on effectiveness in math than in reading (Ronfeldt et al., 2015). Finally, as a practical matter, prior studies generally show that teachers have a stronger effect on math achievement than on reading (e.g., Nye, Konstantopoulos, & Hedges, 2004),¹ suggesting greater opportunity to detect colleagues’ effects on one another’s classrooms empirically.

Data and Sample

Our data come from M-DCPS, the fourth-largest school district in the United States, and cover the school years from 2003–2004 through 2012–2013. We focus on math teachers in Grades 3 through 8 who can be linked to students for whom we have state standardized test scores in math. The data cover about 1.15 million student-year observations over the 10 years.

Table 1 describes the sample. Approximately 9% of students are White, 25% are Black, and 65% are Hispanic; 49% are female, 13% have limited English proficiency, 72% are eligible for subsidized lunch (free and reduced-price lunch [FRPL]), and 12% have special education needs. Besides conventional elementary and middle schools, M-DCPS has K–8 schools and combination schools (middle and high schools). Across these different school types, a math teacher in elementary grades (3–5) typically works with one group of students across multiple subjects, whereas a math teacher in secondary grades (6–8) typically works with multiple groups of students within one subject area. Close to 60% of students are enrolled in elementary grades, with the rest in middle grades. We define a teacher’s primary grade level as the grade for which she teaches the largest number of students in a given year.²

Table 1

Descriptive Statistics for M-DCPS Students and Teachers

Variables	M	SD
Student-year observations
Math scores	0.015	0.975
Race/ethnicity: White	0.085	0.280
Race/ethnicity: Black	0.247	0.431
Race/ethnicity: Hispanic	0.647	0.478
Race/ethnicity: Other	0.021	0.142
Female	0.486	0.500
English language learner	0.125	0.330
Eligible for FRPL	0.722	0.448
Special education	0.117	0.321
N	1,150,468
Transfer teacher-year observations
Female	0.770	0.421
Race/ethnicity: White	0.368	0.482
Race/ethnicity: Black	0.344	0.475
Race/ethnicity: Hispanic	0.267	0.443
Race/ethnicity: Other	0.021	0.142
Advanced degree (master’s or higher)	0.432	0.495
Standardized value added	−0.205	0.947
Teaching experience in this school district (years)	7.629	6.614
Total days of absence	5.825	5.613
N	1,594
Incumbent teacher-year observations
Female	0.832	0.374
Race/ethnicity: White	0.456	0.498
Race/ethnicity: Black	0.296	0.456
Race/ethnicity: Hispanic	0.237	0.423
Race/ethnicity: Other	0.015	0.121
Advanced degree (master’s or higher)	0.453	0.498
Standardized value added	−0.046	0.937
Teaching experience in this school district (years)	11.169	8.515
Total days of absence	6.581	6.194
N	26,346

Note. M-DCPS = Miami-Dade County Public Schools; FRPL = free and reduced-price lunch.

We measure teachers’ annual performance in raising students’ math test scores using value-added scores. Our preferred value-added model estimates teacher-by-year fixed effects, accounting for students’ test scores in math and reading in the prior year, demographics, English proficiency, and disability status, as well as the averages of these variables at both classroom and school levels (see Appendix B, available in the online version of the journal). This model adjusts teacher effect estimates for nonrandom assignment to students based on students’ time-varying and invariant characteristics and school contexts. This model outperforms other popular value-added models and the student percentile growth model when nonrandom assignment of students exists (Chetty, Friedman, & Rockoff, 2014; Guarino, Reckase, Stacy, & Wooldridge, 2015). To further confirm that our spillover estimates do not vary depending on value-added models, we construct alternative value-added models with either student or school fixed effects. These alternative models yield similar estimates of spillover effects to our preferred model.³

After obtaining the teacher-by-year fixed effect estimates, we shrink the estimates using empirical Bayes (EB) methods to adjust for sampling and measurement errors and bring imprecise estimates closer to the mean (see Loeb, Kalogrides, & Béteille, 2012, for a description of the shrinking method). After shrinking the value-added estimates, we standardize them to have a mean of 0 and a standard deviation of 1 in each year to facilitate interpretation. We acknowledge that EB estimators do not always reduce the misclassification of teachers, particularly under nonrandom teacher assignment (Guarino, Maxfield, Reckase, Thompson, & Wooldridge, 2015). In our study, the Spearman’s rank correlation coefficient between the shrinkage estimates and teacher fixed effect estimates without shrinkage is .998. This strong correlation arises because we exclude teachers whose class size was fewer than 10 students per year. Unsurprisingly, when we replace the EB estimates with teacher fixed effects in Equations 1 to 3, the results (included in Appendix Table D1, available in the online version of the journal) are very consistent with the main results using shrinkage estimators.
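The shrink-then-standardize step can be illustrated with a minimal sketch. It assumes the common textbook form of EB shrinkage, in which each estimate is pulled toward the mean in proportion to its reliability, and a simple method-of-moments estimate of the signal variance; the exact procedure used here is the one described in Loeb et al. (2012) and may differ. The function names and all numbers are hypothetical.

```python
from statistics import mean, pstdev, pvariance

def eb_shrink(estimates, std_errors):
    """Shrink noisy teacher-by-year estimates toward their mean.

    Assumes a method-of-moments signal variance:
    var(estimates) - mean(se^2), floored at zero.
    """
    grand_mean = mean(estimates)
    noise_var = mean([se ** 2 for se in std_errors])
    signal_var = max(pvariance(estimates) - noise_var, 0.0)
    shrunk = []
    for est, se in zip(estimates, std_errors):
        reliability = signal_var / (signal_var + se ** 2)  # between 0 and 1
        shrunk.append(grand_mean + reliability * (est - grand_mean))
    return shrunk

def standardize(values):
    """Rescale to mean 0 and standard deviation 1 (applied within one year)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

# Made-up estimates for four teachers in one year; the third is imprecise.
raw = [0.30, -0.10, 0.05, -0.25]
ses = [0.05, 0.05, 0.20, 0.05]
shrunk = eb_shrink(raw, ses)
z = standardize(shrunk)
```

In this toy example the third teacher’s estimate, which has the largest standard error, is pulled toward the mean proportionally more than the others. When standard errors are small and similar across teachers, shrinkage barely changes the ranking.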

We then average three lagged value-added measures to account for concerns about year-to-year fluctuation of value-added measures due to the variation in true teacher performance over time and measurement error (Loeb & Candelaria, 2012). We call this aggregated measure teachers’ prior stable effectiveness. This prior stable effectiveness has at least two advantages. First, because it is measured before the peer shock of new transfers, it avoids the reflection problem in peer effect estimation (Manski, 1993), which we explain further in the next section. Second, this stable measure mitigates the spurious relationship between new transfers and incumbent teachers due to contemporaneous shocks to all teachers at a given point in time.
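As a minimal sketch of this averaging step, the function below computes a teacher’s prior stable effectiveness for year t from the scores in years t − 1, t − 2, and t − 3. The function name, the dictionary layout, and the choice to return `None` when any lag is missing are assumptions of this illustration; the values are made up.

```python
def prior_stable_effectiveness(va_by_year, year, n_lags=3):
    """Average a teacher's value-added over years t-1, ..., t-n_lags.

    va_by_year: dict mapping school year (int) to standardized value-added.
    Returns None when any lag is missing (an assumption of this sketch).
    """
    lags = [va_by_year.get(year - k) for k in range(1, n_lags + 1)]
    if any(v is None for v in lags):
        return None
    return sum(lags) / n_lags

# Hypothetical teacher: the 2010 score is excluded from the 2010 measure.
teacher_va = {2007: 0.10, 2008: -0.20, 2009: 0.40, 2010: 0.90}
stable_2010 = prior_stable_effectiveness(teacher_va, 2010)
```

Because only lagged scores enter the average, a shock that hits every teacher in year t (here, whatever produced the 2010 score) cannot mechanically link the peer measure to the current outcome.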

There are 1,594 teacher-year transfer observations in the data that have stable teacher effectiveness measures over these 10 years. As shown in Table 1, approximately 37% of these transfer teachers are White, 34% Black, and 26% Hispanic. The total percentage of non-White transfer teachers is a little more than 63%, about 8 percentage points higher than among incumbent teachers. About 77% of transfer teachers are female, compared with 83% of incumbent teachers. Transfer teachers average 7.6 years of working experience in this district, about 3.5 years less than the average for incumbent teachers. Moreover, transfer teachers are, on average, less effective than incumbent teachers (−0.21 vs. −0.05) and less likely to have an advanced degree (master’s or higher, 43% vs. 45%), but they have fewer days absent from work (5.8 vs. 6.6).

Analytic Strategies

We estimate spillover effects by leveraging the peer shock to incumbent teachers that occurs when new teachers transfer into a teacher group. Our main analyses focus on grade-level peers, that is, teachers who teach the same grade in the same school and year. This peer definition allows us to use different fixed effects in modeling spillover effects of new peer teachers on incumbent teachers’ student achievement. For example, we use (a) school-grade fixed effects to control for stable characteristics and practices in the given grade and school (e.g., stable peer effects among continuing teachers in a given school and grade), (b) school-year fixed effects to control for possible school-year shocks other than new peers’ entry (e.g., enhanced professional development or teacher collaboration in the school in a given year), (c) year fixed effects to control for year-to-year variations in district policies that may influence teacher collaboration and student achievement, and (d) grade fixed effects to control for grade-level differences that could affect both student achievement and teacher transfer behaviors.

Peer effects may expand beyond grade-level peers. This expansion may be particularly likely in schools with strong teacher collaborative activities. However, school-level peer estimates are subject to other yearly shocks to the schools that may coincide with the arrival of new peers and cannot be easily addressed using our data. We thus focus on grade-level peer effects. To demonstrate the possible spillover beyond the grade level, we show school-level estimates in Appendix C (available in the online version of the journal) with a caveat of weaker identification strategies.

There are three key challenges for identifying peer effects in the literature: common influences, reflection, and selection, all of which can lead to bias in the estimate of peer effects. If we were to use peer characteristics that were contemporaneous with the focal teacher effect, we would worry about common influences, for example, students having an illness at the time of the test or teachers’ co-participation in professional development programs. These common influences would affect the performance of both new transfers and incumbent teachers and would appear to be a peer effect. However, because we measure the peer characteristic (value added) prior to when the peer and focal teacher interact, these common influence problems should not bias our peer effect estimates. The reflection problem is similar. It refers to the scenario in which one individual’s outcome is influenced by others in a given period of time and influences others in the same period (Manski, 1993). Because we use the peer teacher’s value added prior to when he or she met the focal teacher, reflection is not a problem in our case.

The final potential source of bias—selection—is more difficult to address. Selection may bias the peer estimates in settings where peers self-select into peer groups in a manner that is unobserved to researchers. For example, as shown in Appendix Table D2 (available in the online version of the journal), teachers were, on average, more likely to transfer to higher achieving and more advantaged schools in M-DCPS. This pattern has long been identified in prior studies of teacher mobility (Guarino, Santibanez, & Daley, 2006; Lankford, Loeb, & Wyckoff, 2002; Sass, Hannaway, Xu, Figlio, & Feng, 2012). Moreover, principals may assign new transfers to peer groups with similar productivity. This selection could cause substantial upward bias in the estimated magnitude of peer effects (Sacerdote, 2011). By controlling for incumbent teachers’ prior stable effectiveness and by comparing grades within schools within a given year, we adjust for much of this selection. Moreover, we use (a) school-grade fixed effects to account for the time-invariant attractiveness of a particular grade in a school; (b) school-year fixed effects to account for time-varying attractiveness of a particular school in a given year, and the year-to-year variation in vacancies in a given school due to teacher turnover or the increase in student enrollment; and (c) time-varying lagged student achievement scores to account for the possibility of transfer teachers using such information to make their selection. In a later section, we conduct falsification tests to examine other potential mechanisms of teacher selection, confirming that any resulting biases have little impact on our estimates.

“Linear-in-Means”

To construct the “linear-in-means” model, we model student math test score as a function of his or her teachers’ prior effectiveness and the average prior stable effectiveness of new peer teachers. In particular, we model

$$A_{ijgst} = \alpha_0 + \alpha_1 A_{ijgst-1} + \alpha_2 A_{ijgst-1}^{\text{other}} + \gamma_1 X_{ijgst} + \gamma_2 C_{ijgst} + \beta_1 \times \theta_{jgst-1,2,3} + \beta_2 \times \theta_{kgst-1,2,3} + SG_{gs} + \pi_t + \varepsilon_{ijgst}, \tag{1}$$

where A_ijgst is the math exam score of student i, taught by incumbent teacher j in grade g, school s, and year t. The left-hand side includes only the test scores of incumbent teachers’ students, not those of new transfers’ own students, so that we can better attribute the gain in test scores to peer effects rather than to own teachers’ contribution to student achievement gains. A_ijgst−1 indicates this student’s prior-year math test score and $A_{ijgst-1}^{\text{other}}$ indicates his or her prior-year score in the other subject (e.g., reading). X_ijgst is a vector of student i’s characteristics, including poverty status, whether the student is an English language learner, the student’s race, gender, age, prior suspension, and prior absence. C_ijgst is a vector of student i’s classmates’ characteristics, such as percent of students eligible for subsidized lunch; percent of students who are English language learners; percent of Hispanic, African American, Asian, and White students; percent of female students; average age; average number of days suspended; average days absent; and the average and standard deviation of prior scores in math and reading. θ_jgst−1,2,3 is student i’s own teacher j’s value-added scores averaged over the prior 3 years (t − 1, t − 2, and t − 3), that is, the focal (incumbent) teacher’s prior stable effectiveness. θ_kgst−1,2,3 is the newcomer k’s value-added scores averaged over the 3 years prior to transferring into this school (t − 1, t − 2, and t − 3), and β₂ captures the “linear-in-means” estimate. SG_gs denotes school-grade fixed effects, π_t denotes year fixed effects, and ε_ijgst is the error term. We also estimate Equation 1 with school-year and grade fixed effects in place of school-grade and year fixed effects. We cluster the standard errors at the school-grade-year level.

“Relative Effectiveness”

We then examine how the peer effects vary depending on the difference in effectiveness between the transfer and focal teacher—student i’s own incumbent teacher j. We define “relative effectiveness” as how much more effective the new transfer k was over the preceding 3 years than the focal teacher. “Relative ineffectiveness” is then defined as how much less effective the new transfer k was than the focal teacher j. We estimate the effects of these two types of peers separately because we suspect differential effects of “relatively effective” and “relatively ineffective” peers.

$$\text{RelativeEffectiveness}_{k,jgst-1,2,3} \equiv D \times (\theta_{kgst-1,2,3} - \theta_{jgst-1,2,3}),$$

$$\text{RelativeIneffectiveness}_{k,jgst-1,2,3} \equiv (1 - D) \times (\theta_{kgst-1,2,3} - \theta_{jgst-1,2,3}),$$

where $D = 1$ if $(\theta_{kgst-1,2,3} - \theta_{jgst-1,2,3}) > 0$ and $D = 0$ if $(\theta_{kgst-1,2,3} - \theta_{jgst-1,2,3}) < 0$.
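The two definitions above amount to splitting the peer-minus-focal gap into its positive and negative parts. The following is a minimal sketch of that split; the function name and argument names are illustrative assumptions, not the authors' code, and the inputs are assumed to be prior stable effectiveness scores on a common scale.

```python
def split_relative(theta_k, theta_j):
    """Split the gap between new peer k and focal teacher j into two terms.

    Hypothetical helper: theta_k and theta_j stand for the prior stable
    effectiveness of the new transfer and the focal teacher, respectively.
    """
    gap = theta_k - theta_j
    # D * gap: equals the gap when the peer is more effective, else 0.
    relative_effectiveness = max(gap, 0.0)
    # (1 - D) * gap: equals the (negative) gap when the peer is less effective, else 0.
    relative_ineffectiveness = min(gap, 0.0)
    return relative_effectiveness, relative_ineffectiveness
```

For any pair of teachers at most one of the two terms is nonzero, and the two always sum to the raw gap, which is why the model can assign a separate slope to each side of zero.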

We estimate the effects of “relatively effective” and “relatively ineffective” peers using Equation 2.

$$\begin{aligned} A_{ijgst} = {} & \alpha_0 + \alpha_1 A_{ijgst-1} + \alpha_2 A_{ijgst-1}^{\text{reading}} + \gamma_1 X_{ijgst} + \gamma_2 C_{ijgst} + \beta_1 \times \theta_{jgst-1,2,3} \\ & + \beta_2 \times \text{RelativeEffectiveness}_{k,jgst-1,2,3} + \beta_3 \times \text{RelativeIneffectiveness}_{k,jgst-1,2,3} \\ & + SG_{gs} + \pi_t + \varepsilon_{ijgst}. \end{aligned} \tag{2}$$

“Absolute Effectiveness”

We then test for heterogeneous effects depending on the prior stable effectiveness of incumbent teachers using an interaction term between incumbent teacher j’s prior stable effectiveness and new peer k’s prior stable effectiveness. Equation 3 illustrates the estimation model.

$$\begin{aligned} A_{ijgst} = {} & \alpha_0 + \alpha_1 A_{ijgst-1} + \alpha_2 A_{ijgst-1}^{\text{reading}} + \gamma_1 X_{ijgst} + \gamma_2 C_{ijgst} + \beta_1 \times \theta_{jgst-1,2,3} + \beta_2 \times \theta_{kgst-1,2,3} \\ & + \beta_3 \times \theta_{jgst-1,2,3} \times \theta_{kgst-1,2,3} + SG_{gs} + \pi_t + \varepsilon_{ijgst}, \end{aligned} \tag{3}$$

where β₃ indicates how much the gain in student i’s test score attributable to new transfer teacher k changes with a one standard deviation increase in own teacher j’s prior stable effectiveness. This interaction term identifies which types of teachers benefit more or less from a high-performing peer.
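One way to read the interaction term is through the implied marginal effect of the new peer. The short derivation below is a sketch under the assumption that both effectiveness measures are expressed in standard deviation units; the numerical illustration borrows the Model 2 point estimates reported in Table 2 (.021 for the peer term and −.008 for the interaction) and ignores their sampling uncertainty.

```latex
% Marginal effect of the new peer's prior stable effectiveness in Equation 3
\frac{\partial A_{ijgst}}{\partial \theta_{kgst-1,2,3}}
  = \beta_2 + \beta_3 \times \theta_{jgst-1,2,3}

% Illustration with the Table 2, Model 2 point estimates
% (\beta_2 = .021, \beta_3 = -.008):
%   \theta_{j} = -1 :  .021 + (-.008)(-1) = .029
%   \theta_{j} =  0 :  .021
%   \theta_{j} = +1 :  .021 + (-.008)(+1) = .013
```

Under these assumptions, the predicted spillover shrinks as the focal teacher's own prior effectiveness rises, which is the pattern the interaction model is designed to detect.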

Although we focus on math in the main text for the purpose of brevity, we provide details of our analyses of reading achievement in Appendix A. Although overall our findings are weaker in reading than in math, we do find similar effects in the final “absolute effectiveness” model.

Results

Grade-Level Spillover Patterns

Table 2 presents the estimates of the spillover models. For linear-in-means estimates, a one standard deviation increase in the average prior stable effectiveness of the new transfer teachers is associated with achievement gains of 1% to 2% of a standard deviation for students taught by incumbent teachers in the same grade. These linear-in-means effects are between 15% and 29% of the own-teacher effects (.010/.068 or .020/.069). These percentages are consistent with Jackson and Bruegmann’s (2009) estimates of between 10% and 20% of the own-teacher effect.

Table 2

Estimated Grade-Level Spillover Effects

	Model 1	Model 2	Model 1	Model 2	Model 1	Model 2
“Linear-in-means”: Prior stable effectiveness of new transfer peers	.010^† (.006)	.020* (.009)			.010(.006)	.021* (.009)
“Relatively effective”: Positive values of the difference in prior stable effectiveness between new transfer peers and focal teachers			.019*** (.005)	.028*** (.008)
“Relatively ineffective”: Negative values of the difference in prior effectiveness between new transfers and focal teachers			.002(.006)	−.009(.008)
F statistics for the difference between “relatively effective” and “relatively ineffective”			3.77^†	6.88**
“Absolute effectiveness”: Prior stable effectiveness of new transfer peers × Prior stable effectiveness of focal teachers					−.006(.004)	−.008* (.004)
Students’ own (focal) teachers’ prior effectiveness	.068*** (.006)	.069*** (.006)	.081*** (.008)	.095*** (.011)	.069*** (.005)	.069*** (.005)
N	109,422	109,422	109,422	109,422	109,422	109,422
Adjusted R²	.687	.690	.687	.690	.687	.690

Note. Data from 2003–2004 to 2012–2013. All models include student and classroom covariates. Student covariates include eligibility for FRPL; whether the student is an English language learner; the student’s race/ethnicity, gender, age, prior suspension, prior absence; and prior math and reading test scores. Classroom covariates include % of FRPL, % of students who are English language learners, % of Hispanic, % of African American, % of Asian, % of White, % of female, average age, average days of prior suspension, average days of prior absence, and the average and standard deviation of students’ prior math and reading test scores. Model 1 includes school-grade, year fixed effects; Model 2 includes school-year, grade fixed effects. Standard errors are included in the parentheses and clustered at the school-grade-year level. FRPL = free or reduced-price lunch.

†p < .1. *p < .05. **p < .01. ***p < .001.

Table 2 also provides estimates that formalize the differential effects of having peers who are more or less effective than the focal (incumbent) teacher, that is, estimates from the “relative effectiveness” model. If a student in the class of an incumbent teacher has a new transfer teacher at the same grade level whose prior stable effectiveness is one standard deviation higher than that of his or her own teacher, this student’s math test score would increase by 1.9% or 2.8% of a standard deviation. This spillover effect is about 23% or 29% of the student’s own-teacher effect (.019/.081 or .028/.095).

In contrast, the “relatively ineffective” estimate is very close to zero and not statistically distinguishable from zero. In other words, if the transfer peer teacher is less effective than a student’s own teacher, the student’s achievement is not meaningfully affected. An F test shows that the “relatively effective” estimate differs from the “relatively ineffective” estimate, marginally in Model 1 (F = 3.77, p < .1) and significantly in Model 2 (F = 6.88, p < .01).

Finally, Table 2 gives the variation of spillover effects by the absolute effectiveness of incumbent teachers, as indicated by the interaction term between new transfers’ prior stable effectiveness and own teachers’ prior stable effectiveness. It measures whether more or less effective incumbent teachers are differentially affected by transferring teachers. The negative coefficients (statistically significant in Model 2 only) provide evidence that with a one standard deviation increase in own teachers’ prior effectiveness, the spillover effect from new transfer peers would decrease by about 0.6% or 0.8% of one standard deviation of student test scores. In other words, new peer teachers matter less for students whose own teachers were relatively more effective, or equivalently, they matter more for students whose own teachers were less effective.

To visually illustrate these heterogeneous spillover effects, Figure 1 plots the marginal effects of new peers on focal teachers’ student achievement (i.e., the predicted spillover effects) against focal teachers’ prior stable effectiveness. The figure confirms a substantial heterogeneity in how teachers respond to peers: The spillover is positive and larger for low-performing teachers and has little effect on the high-performing ones. Notably, the estimated effects are negative for just a very small number of cases and their 95% confidence intervals all include zero, suggesting that the effectiveness of high-performing teachers is, on average, not particularly hurt by the presence of low-performing peers.

Figure 1.

“Absolute effectiveness,” the relationship between focal-teacher–specific spillover effects and their own prior stable effectiveness.

Overall, the main findings show that teachers who are newly transferred to a grade affect the learning of students of incumbent teachers. These effects are bigger when the new teacher is more effective than the incumbent teacher, whereas the new teacher who is less effective has little impact on students’ learning in the incumbent teacher’s classrooms. The positive spillover effects are also bigger for less effective incumbent teachers.

Robustness and Falsification Tests

Teacher Sorting

Other possible shocks to the composition of grade-level peers could affect student learning and bias our estimates of peer spillover. First, it is possible that a novice teacher who just started her or his career was employed at the same time in the same grade and school as a new transfer entered the team. Equations 1 to 3 would drop this novice teacher and her students from the analysis, because a novice teacher did not have prior value-added scores on the right-hand side of the equations. For the same reason, the new transfer teachers without prior stable effectiveness, although being part of the new members of the teaching team, would be dropped out of the analysis, too. These other new peers, including both novice teachers and new transfers without prior stable effectiveness, could be the omitted factor that confounds the grade-level phenomenon of benefiting incumbent teachers, if the entrance of these other new teachers is correlated with the prior performance of new transfer teachers who had prior stable effectiveness. To account for the influence of other new teachers, we create a continuous variable indicating the number of other new teachers at the same grade and add that to Equations 1 to 3. Table 3 reports the results in the columns labeled “With other new teachers.” The point estimates and standard errors of “linear-in-means,” “relatively effective/ineffective,” and “absolute effectiveness” effects are quite consistent with corresponding estimates in Table 2, as are the adjusted R² values.⁴

Table 3

Robustness Check on Teacher Sorting

	With other new teachers		Only same-grade teachers
	Model 1	Model 2	Model 1	Model 2
Panel 1: “Linear-in-means”
“Linear-in-means”	.009(.006)	.020* (.010)	.016** (.005)	.032** (.010)
Students’ own (focal) teachers’ prior stable effectiveness	.069*** (.005)	.069*** (.006)	.077*** (.007)	.082*** (.008)
Number of other new teachers (novice teachers or new transfer teachers without prior stable effectiveness) at the grade level	−.006(.017)	−.002(.033)
N	109,422	109,422	56,469	56,469
Adjusted R²	.687	.690	.687	.690
Panel 2: “Relative effectiveness”
“Relatively effective”	.019*** (.005)	.028*** (.008)	.021** (.007)	.027** (.010)
“Relatively ineffective”	.002(.006)	−.009(.008)	−.003(.006)	−.023* (.010)
F statistics	3.658^†	6.56**	7.75**	9.9**
Students’ own (focal) teachers’ prior stable effectiveness	.081*** (.008)	.095*** (.011)	.092*** (.009)	.115*** (.014)
Number of other new teachers (novice teachers or new transfer teachers without prior stable effectiveness) at the grade level	−.005(.017)	−.002(.033)
N	109,422	109,422	56,469	56,469
Adjusted R²	.687	.690	.686	.690
Panel 3: “Absolute effectiveness”
“Linear-in-means”	.010(.006)	.021* (.009)	.014* (.006)	.032** (.010)
“Absolute effectiveness”: New transfer peers × Focal teachers	−.006(.004)	−.008* (.004)	−.009* (.004)	−.008^† (.004)
Students’ own (focal) teachers’ prior stable effectiveness	.069*** (.005)	.069*** (.005)	.075*** (.007)	.081*** (.008)
Number of other new teachers (novice teachers or new transfer teachers without prior stable effectiveness) at the grade level	−.006(.017)	−.003(.033)
N	109,422	109,422	56,469	56,469
Adjusted R²	.687	.690	.686	.690

Note. Data from 2003–2004 to 2012–2013. All models include student and classroom covariates. Student covariates include eligibility for FRPL, whether the student is an English language learner, the student’s race/ethnicity, gender, age, prior suspension, prior absence, and prior math and reading test scores. Classroom covariates include % of FRPL, % of students who are English language learners, % of Hispanic, % of African American, % of Asian, % of White, % of female, average age, average days of prior suspension, average days of prior absence, and the average and standard deviation of students’ prior math and reading test scores. Model 1 includes school-grade, year fixed effects; Model 2 includes school-year, grade fixed effects. Standard errors are included in the parentheses and clustered at the school-grade-year level.

†p < .1. *p < .05. **p < .01. ***p < .001.

Teachers churn within schools, with some teachers moving into a grade that they did not teach in the year before (entry) and others moving out (exit; Atteberry, Loeb, & Wyckoff, 2016). These churning teachers could affect students in much the same way as novice teachers do. To assess the degree to which entry to a given grade affects the spillover estimates, we re-estimate Equations 1 to 3 using only incumbent teachers who stayed in the same grade; the corresponding estimates appear in the columns labeled “Only same-grade teachers” in Table 3. Although the “linear-in-means,” “relatively effective,” and “absolute effectiveness” estimates are slightly larger than the corresponding estimates in Table 2 (as would be expected if peer effects occurred mostly within grade-level teams), their inferences are not different in any meaningful way. The estimate of “relatively ineffective” is negative but nonsignificant in Model 1, whereas the Model 2 estimate is significantly negative. However, these two estimates are not statistically significantly different from each other (z = 1.27, p = .2), based on Cohen and Cohen’s (1983) test for the difference between two regression coefficients from the same sample, which accounts for the covariance between the two coefficients. The significant coefficient in one model specification may not be practically meaningful; thus, we interpret the inference of “relatively ineffective” consistently with that in Table 2.
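The logic of comparing two coefficients while allowing for their covariance can be sketched as follows. This uses the generic z statistic for the difference between two correlated estimates; whether it matches the exact procedure in Cohen and Cohen (1983) is an assumption. The covariance between the Model 1 and Model 2 estimates is not reported, so the `cov` default of zero is only a placeholder and the resulting z is not expected to reproduce the reported 1.27.

```python
import math

def coefficient_difference_z(b1, se1, b2, se2, cov=0.0):
    """z statistic for b1 - b2; cov is the covariance between the two estimates.

    Hypothetical helper: cov = 0.0 is a placeholder because the covariance
    is not reported in the text.
    """
    variance = se1 ** 2 + se2 ** 2 - 2.0 * cov
    return (b1 - b2) / math.sqrt(variance)

def two_sided_p(z):
    """Two-sided p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# "Relatively ineffective" estimates, "Only same-grade teachers" columns of Table 3.
z_no_cov = coefficient_difference_z(-0.003, 0.006, -0.023, 0.010)

# The p value implied by the reported z of 1.27.
p_reported = two_sided_p(1.27)
```

Applying `two_sided_p` to the reported z of 1.27 gives a p value of roughly .20, in line with the p = .2 stated in the text.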

A third possible shock to a given grade in a given year that could bias our estimates is teachers’ exit. The main concern is that an ineffective teacher’s moving out of a grade and school in year t is followed by an effective new peer transferring in year t + 1. The increase in student achievement might then stem not from the spillover of the new effective transfer to incumbent teachers in year t + 1, but rather from the removal of an ineffective teacher from this grade and school in year t. If there is a systematic pattern in which departing teachers were, on average, less effective than staying teachers in year t, and a new transfer was more effective than incumbent teachers in year t + 1, the significant positive effect of “relatively effective” peers could be invalidated. To address this concern, we regress the difference in prior effectiveness between new transfers and staying teachers in year t + 1 on the difference in prior effectiveness between departed teachers and staying teachers in year t. The point estimate is small and not significantly different from zero (β = −.005, SE = .089, p = .959). Thus, we find no evidence that the arrival of a more effective new teacher to a particular grade in a school is related to the performance difference between departed and staying teachers in the prior year.

Student Sorting and Other Grade-Specific Interventions

Besides teacher sorting, it is possible that our results are confounded by dynamic student sorting (or tracking) or other related grade-level interventions that cannot be fully controlled by lagged test scores, individual characteristics, and classmates’ characteristics. For example, the “relatively effective” estimate can reflect that incumbent teachers get better students and also lobby for better new peers. This particular sorting would be problematic. We conduct a falsification test by regressing a student’s test score in year t − 1 on his or her future teacher’s value added in year t, controlling for this student’s teacher effect in year t − 1, characteristics of this student and his or her classmates’ characteristics, and school-grade, year fixed effects or school-year, grade fixed effects. If there was troubling unobservable student sorting, the coefficient of future teacher should be statistically significant. Although the coefficients on current teacher effect are about .11 and significant at the 1% level, the coefficient on future teacher’s effect is only .002 with p value greater than .8. Next, to assess whether incumbent teachers lobby for better new peers, we regress incumbent teachers’ prior stable performance on transfer teachers’ prior stable performance, controlling for school fixed effects. The small coefficient of −.016 is far from statistical significance (p = .307).⁵ Moreover, this point estimate suggests a negative, rather than positive, relationship, which does not support the possibility that effective incumbent teachers lobby for effective new peers.

Another possibility is that a principal might assign an effective new transfer to a poorly performing grade as part of a grade-specific improvement effort while implementing other interventions at the same time (Jackson & Bruegmann, 2009). To test whether these possibilities would invalidate the inference of peer spillover, we regress grade-level average test scores of students of incumbent teachers in years t − 1, t − 2, and t − 3, respectively, on the effectiveness of future new transfers in year t, after controlling for school-year or school-grade fixed effects and the same student characteristics as those included in Equations 1 and 2. The coefficients on future new transfers range from −.01 to .003, with p values between .5 and .9, and are thus far from statistical significance. Taken together, these falsification tests provide little evidence that our estimates reflect student sorting, lobbying for better new peers, or grade-specific interventions.

	(1)	(2)	(3)
“Linear-in-means”: Prior stable effectiveness of new transfer peers	.015(.017)		.019(.018)
“Relatively effective”: Positive values of the difference in prior stable effectiveness between new transfer peers and focal teachers		.031(.022)
“Relatively ineffective”: Negative values of the difference in prior effectiveness between new transfers and focal teachers		−.009(.014)
F statistics for the difference between “relatively effective” and “relatively ineffective”		1.99
“Absolute effectiveness”: Prior stable effectiveness of new transfer peers × Prior stable effectiveness of focal teachers			−.014(.025)
Students’ own (focal) teachers’ prior effectiveness	.038* (.014)	.066* (.025)	.038* (.014)
N	5,644	5,644	5,644
Adjusted R²	.663	.663	.663

Alternative Measures of Teachers’ Prior Stable Effectiveness

Another concern is that measuring prior stable effectiveness by averaging teachers’ value added over the prior 3 years restricts our sample to a group of teachers who are relatively experienced and thus restricts our inferences of peer effects to this peer set. This aggregated measure includes teachers who have three lagged value-added measures (about 21%), as well as those who have only two (27%) or one (52%). To examine how our estimates of spillover effects vary depending on the number of lagged value-added measures available, we use either the most recent lagged value-added measure or the two most recent lagged value-added measures in the estimation. These results, included in Appendix Table D3 (available in the online version of the journal), do not differ in any meaningful way from those presented in Table 2.

An alternative to averaging prior value added is to make use of the panel nature of our data to directly estimate time-varying prior value added for each teacher. Given the year span from 2004 through 2013, we specify eight estimates for each teacher, one for each window of up to 3 years (e.g., 2012–2010, 2011–2009, 2010–2008, 2009–2007, 2008–2006, 2007–2005, 2006–2004, 2005–2004). Covariates are the same as specified in Equation B1 in Appendix B. We obtain both teacher fixed effects and EB estimators, and exclude teachers with fewer than 10 students. We then replace the original prior stable effectiveness measure with the new measure in Equations 1 to 3. The inferences of the new results, reported in Appendix Table D4 (available in the online version of the journal), are consistent with our main estimates in Table 2.
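The eight estimation windows listed above can be enumerated mechanically. The following is a minimal sketch, assuming each window ends in a year from 2012 down to 2005 and reaches back at most two further years without going before 2004; the function name and parameters are illustrative and are not taken from the authors' code.

```python
def prior_windows(first_year=2004, last_end_year=2012, width=3):
    """Return (end_year, start_year) pairs, most recent window first.

    Hypothetical helper: windows are truncated at first_year, so the
    earliest window can be shorter than `width` years.
    """
    windows = []
    for end in range(last_end_year, first_year, -1):
        start = max(end - (width - 1), first_year)
        windows.append((end, start))
    return windows

windows = prior_windows()
```

With the default arguments the list starts at (2012, 2010) and ends at (2005, 2004), where the last window spans only two years because the data begin in 2004.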

In addition, to test the degree to which the new transfer teacher’s prior stable effectiveness reflects the quality of her or his previous school, we derive the measure of the average value-added of teachers in the prior-year school (excluding the new transfer teacher herself/himself) and include it as a control in Equations 1 to 3. As shown in Table D5, none of the corresponding coefficients are statistically significant. The “linear-in-means,” “relatively effective/ineffective,” and “absolute effectiveness” coefficients do not change in any meaningful ways from those estimates in Table 2. These imply that the new transfer teachers’ effect does not simply reflect the unobserved quality of their prior schools.

Finally, another proxy for peer quality could be teaching experience. For each school-year-grade cell, we compute the mean teacher experience of new transfer peers. Similar to our main approach, we construct measures of “relatively experienced” (i.e., the new peers are, on average, more experienced than incumbent focal teachers) and “relatively inexperienced” (i.e., the new peers are, on average, less experienced than incumbent focal teachers). Appendix Table D6, available in the online version of the journal, shows the results in two samples: The first sample includes all teachers with valid teaching experience and the second sample includes teachers with valid teaching experience and a prior stable effectiveness measure. In both samples, teachers’ experience is a very weak predictor of student achievement, as indicated by the very small coefficients of students’ own teachers’ experience. This result is consistent with a number of prior studies of teacher experience (e.g., Chetty et al., 2014; Jackson & Bruegmann, 2009; Kane, Rockoff, & Staiger, 2008). It is not surprising, then, that most of the spillover effects are statistically nonsignificant. However, the directions of the coefficients are similar to those presented in our main text. That is, the “linear-in-means” effects are positive. Relatively experienced peer teachers have positive spillover effects. Focal (incumbent) teachers who are more experienced are less likely to be influenced by new peer teachers, whereas junior focal teachers are more likely to be influenced by new peer teachers. However, we interpret these patterns with great caution because most of them are statistically nonsignificant.

Testing for Different Spillover Effects in Elementary and Middle Grades

Elementary and middle grades have different organizational structures that may influence peer formation and influence among teachers. First, elementary teachers are often assigned to work with a particular group of students, whereas teachers in middle grades are often responsible for multiple groups of students. Second, collaboration among elementary teachers is more common within grades, whereas collaboration among secondary teachers is more common across grade levels, but within subject areas. These differences in the sharing of students and collaboration structure between elementary and middle grades may suggest differential spillover effects in elementary and middle grades.

We estimate the spillover effects separately by elementary and middle grades and provide the results in Table 5. Most of the results are quite similar across grade levels and to the pooled estimates (i.e., the main effects), but there is more variation across models when examined separately by school levels. In particular, for elementary grades, Model 1 with school-grade and year fixed effects gives estimates that are very similar to the pooled estimates, whereas Model 2 with school-year and grade fixed effects gives estimates that are generally lower in magnitude, with the exception of the “relatively ineffective” and “absolute effectiveness” estimates. For the middle grades, however, the school-year and grade fixed effects estimates tend to be larger in magnitude than those with school-grade and year fixed effects. The reason for these differences may be that elementary grades typically have fewer new transfers in a given year and, thus, less variation in transfer teachers’ effectiveness within a school and year. Therefore, the school-year and grade fixed effects model, which relies on variation within a school-year, tends to generate smaller estimates than the school-grade and year fixed effects model, which relies on variation over time. In contrast, middle school grades may have more teachers transferring into a given school and year, and therefore Model 2, which uses the variation in transfer teachers’ effectiveness within a school and year, tends to generate larger estimates.

Table 5

Estimated Grade-Level Spillover Effects in Elementary and Secondary Schools Separately

	Elementary						Middle
	Model 1	Model 2	Model 1	Model 2	Model 1	Model 2	Model 1	Model 2	Model 1	Model 2	Model 1	Model 2
“Linear-in-means”	.021* (.008)	.001(.018)			.022*(.009)	.000(.018)	.005(.007)	.021^† (.013)			.005(.007)	.022^† (.012)
“Relatively effective”			.026** (.008)	.013(.014)					.016* (.007)	.029** (.010)
“Relatively ineffective”			−.008(.010)	.011(.017)					.006(.008)	−.010(.011)
F statistics			6.45*	0.01					0.94	4.50*
“Absolute effectiveness”: New transfer peers × Focal teachers					−.010(.006)	−.012^† (.007)					−.004(.004)	−.006(.004)
Students’ own (focal) teachers’ prior effectiveness	.090*** (.008)	.091*** (.008)	.112*** (.011)	.091*** (.020)	.089*** (.008)	.088*** (.008)	.052*** (.007)	.054*** (.007)	.064*** (.010)	.082*** (.015)	.055*** (.007)	.054*** (.007)
N	41,178	41,178	41,178	41,178	41,178	41,178	68,244	68,244	68,244	68,244	68,244	68,244
Adjusted R²	.666	.671	.666	.671	.666	.671	.698	.700	.697	.700	.697	.700

Note. Data from 2003–2004 to 2012–2013. All models include student and classroom covariates. Student covariates include eligibility for FRPL, whether the student is an English language learner, the student’s race/ethnicity, gender, age, prior suspension, prior absence, and prior math and reading test scores. Classroom covariates include % of FRPL, % of students who are English language learners, % of Hispanic, % of African American, % of Asian, % of White, % of female, average age, average days of prior suspension, average days of prior absence, and the average and standard deviation of students’ prior math and reading test scores. Model 1 includes school-grade, year fixed effects; Model 2 includes school-year, grade fixed effects. Standard errors are included in the parentheses and clustered at the school-grade-year level. FRPL = free or reduced-price lunch.

†p < .1. *p < .05. **p < .01. ***p < .001.

Discussion and Conclusion

Teaching has been described as an isolated practice with few interactions among teachers (e.g., Lortie, 1975). Yet, recent reforms have worked to increase teacher collaboration, and some recent work has demonstrated teachers’ strong influence on one another (e.g., Sun et al., 2013; Ronfeldt et al., 2015; Jackson & Bruegmann, 2009). Our study investigates the effects of new transfer teachers to grade-level teams, asking whether having a more effective teacher entering the grade improves the learning of students of incumbent teachers. Overall, we find strong and consistent evidence of positive spillover effects, as more effective entering teachers boost the learning of students of other teachers in the grade.

We find some evidence of peer effects from the baseline linear-in-means model in which the effects of a peer teacher are assumed to be constant for all teachers. With a one standard deviation increase in the average effectiveness of new peers, the teacher team will increase their average productivity by about 1% or 2% of a standard deviation of students’ math test scores (15%–29% of the teacher effect). Although this result implies that the positive effects of bringing an excellent new teacher into a school extend beyond the impacts on the students in his or her classroom, it does not provide evidence on whether these effects are more important for some teachers than for others, and, as a result, does not provide guidance about how existing teachers might be redistributed within a school to increase aggregate achievement. That is, in the zero-sum case, giving one teacher a more effective grade-level team peer means taking that peer away from another teacher, canceling the effects out; the school as a whole does not benefit from redistribution if teacher peer effects are linear (Hoxby, 2002; Hoxby & Weingarth, 2005). Alternatively, if a new excellent teacher benefits some teachers more than others, the school can make gains by assigning the new teacher strategically to a teacher team with teachers who would differentially benefit.

Consistent with other recent evidence of nonlinear peer effects among students and employees in other workplaces (e.g., Burke & Sass, 2013; Carrell, Sacerdote, & West, 2013; Hoxby & Weingarth, 2005; Imberman et al., 2012; Mas & Moretti, 2009), we uncover evidence of heterogeneous peer effects among teachers, which suggest that grade-level team composition is in fact not a zero-sum game. Specifically, if a student has a new peer teacher at the same grade level who is about one standard deviation more effective than his or her own teacher, this student would have an increase in math scores of 1.9% to 2.8% of a standard deviation, or roughly 25% of the student’s own-teacher effect. In contrast, students of incumbent teachers are not particularly disadvantaged by the presence of relatively ineffective teacher peers. Similarly, low-performing teachers seem more responsive to the composition of peers than high-performing teachers. With one standard deviation decrease in own teachers’ prior effectiveness, the spillover effect from new transfers would increase about 0.6% or 0.8% of one standard deviation of student test scores. These findings suggest that schools can see aggregate achievement gains from strategically placing ineffective teachers alongside more effective colleagues on grade-level teams.

Our estimates are robust to a range of specifications such as using school-grade fixed effects to account for the time-invariant characteristics of a particular school and grade, or using school-year fixed effects to account for time-varying characteristics of a particular school and yearly variations in school conditions. Moreover, we present a variety of falsification tests to show that the results are unlikely to be biased by nonrandom student sorting and by endogenous teachers’ movement across grades and schools.

To illustrate how these nonlinearities could lead to an aggregate gain in student achievement from strategic teacher placement, consider a simple back-of-the-envelope example. Take two grade-level teams with four members each (the median group size in our sample). Team A comprises relatively more effective teachers (e.g., Alice = 2 SDs above the mean, Bob = 1.5 SDs, Charlie = 1 SD, and Donna = 0, the average teacher), whereas Team B comprises relatively less effective teachers (e.g., Erica = 2 SDs below the mean, Francine = −1.5 SDs, Gary = −1 SD, and Horatio = 0). Using the coefficients from the relative effectiveness model, we calculate that the average joint effect on student achievement of both the own teacher and the peer teachers across these two teams is .012. Now, let us move Alice, the highest performing teacher in Team A, to Team B and move Erica, the lowest performing teacher in Team B, to Team A. One year later, the average joint effect of the own teacher and peer teacher spillovers on student achievement across these two teams will be .019, a gain of .007 over the original .012. Erica’s and Francine’s students, those most disadvantaged by their teacher placement, benefit the most.
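The arithmetic behind this kind of example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not a reproduction of the study's calculation: the four coefficients are hypothetical placeholders rather than the estimated coefficients of the relative effectiveness model, and the two spillover rules (a constant linear-in-means rule and an asymmetric rule that responds only to gaps with more effective peers) are simplified stand-ins. Only the team rosters and the Alice and Erica swap come from the example above.

```python
from statistics import mean

# All coefficients are hypothetical placeholders, NOT the paper's estimates.
OWN_EFFECT = 0.07  # student SDs per 1 SD of own-teacher effectiveness (assumed)
B_LINEAR = 0.02    # linear-in-means spillover per 1 SD of mean peer effectiveness (assumed)
B_MORE = 0.02      # spillover per 1 SD average gap to more effective peers (assumed)
B_LESS = 0.0       # spillover per 1 SD average gap to less effective peers (assumed)

# Rosters from the example in the text (effectiveness in teacher SD units).
TEAM_A = {"Alice": 2.0, "Bob": 1.5, "Charlie": 1.0, "Donna": 0.0}
TEAM_B = {"Erica": -2.0, "Francine": -1.5, "Gary": -1.0, "Horatio": 0.0}


def linear_spillover(own, peers):
    """Spillover proportional to mean peer effectiveness, the same for everyone."""
    return B_LINEAR * mean(peers)


def relative_spillover(own, peers):
    """Spillover that treats gaps to stronger and weaker peers differently."""
    gaps = [peer - own for peer in peers]
    more = mean([max(gap, 0.0) for gap in gaps])
    less = mean([min(gap, 0.0) for gap in gaps])
    return B_MORE * more + B_LESS * less


def joint_effects(teams, spillover):
    """Own effect plus peer spillover for every teacher, keyed by name."""
    effects = {}
    for team in teams:
        for name, own in team.items():
            peers = [v for other, v in team.items() if other != name]
            effects[name] = OWN_EFFECT * own + spillover(own, peers)
    return effects


def swap(team_a, team_b, name_a, name_b):
    """Return copies of both teams with name_a and name_b exchanged."""
    new_a = {k: v for k, v in team_a.items() if k != name_a}
    new_b = {k: v for k, v in team_b.items() if k != name_b}
    new_a[name_b] = team_b[name_b]
    new_b[name_a] = team_a[name_a]
    return new_a, new_b


before = [TEAM_A, TEAM_B]
after = list(swap(TEAM_A, TEAM_B, "Alice", "Erica"))

for label, rule in [("linear-in-means", linear_spillover),
                    ("relative effectiveness", relative_spillover)]:
    avg_before = mean(list(joint_effects(before, rule).values()))
    avg_after = mean(list(joint_effects(after, rule).values()))
    print(f"{label}: before={avg_before:.4f} after={avg_after:.4f}")
```

With these placeholder values, the linear rule leaves the cross-team average unchanged by the swap, whereas the asymmetric rule raises it from roughly 0.011 to roughly 0.019, with the largest individual gains going to Erica's and Francine's students. The exact figures depend entirely on the assumed coefficients.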

To further demonstrate the significance of strategically building teacher teams to promote positive spillovers, we compare our findings with those of many expensive district-led professional development activities. According to a report by The New Teacher Project (TNTP), school districts spent an average of US$18,000 on district-led professional development per teacher per year, and teachers spent an average of 19 school days each year in teacher development (TNTP, 2015). Yet, most rigorous evidence on professional development shows null or limited positive effects on student achievement (Garet et al., 2010, 2011; Jackson, Rockoff, & Staiger, 2014; Yoon, Duncan, Lee, Scarloss, & Shapley, 2007). Teachers often view these district-led professional development programs as lacking close connections with their classroom instruction or failing to help them understand how to improve (TNTP, 2015). In contrast, the strategic pairing of teachers requires minimal cost yet generates improvements in student achievement. Teacher teams with diverse performance levels create both pressure and opportunities for on-the-job learning. These findings are consistent with prior studies of teacher collegial networks in schools that facilitate knowledge diffusion and generate normative pressure for adoption of new instructional technology and practices in schools (e.g., Sun et al., 2013; Frank, Zhao, & Borman, 2004). Our findings are also consistent with those from a recent experiment showing substantial gains to teacher performance from pairing low-performing teachers with high-performing partners on the basis of teacher evaluation data for collaboration (Papay et al., 2016).

Our findings have several other potential implications. First, value-added models used by states and districts that assume no spillover effects among teachers will not capture teachers’ full contributions to student learning in their schools (see also Jackson & Bruegmann, 2009; Yuan, 2015). Second, the existence of spillovers raises questions about the utility of policies that reward only teachers’ contributions to their own students’ learning (e.g., individual-based merit pay) because this type of policy may discourage teachers from making positive impacts beyond their own classrooms. Third, the existence of spillover effects highlights additional benefits of efforts to retain a school’s most effective teachers, whose impacts are spread across multiple classrooms.

Our results also suggest several avenues for future research. Certainly, more needs to be known about the drivers of teacher spillovers and how to harness those drivers for improvement. We highlight two possible mechanisms for observed peer effects: social pressure and knowledge spillover. Our results point, at least in part, to the importance of knowledge spillovers for teachers, given that the positive effects of more effective teachers are much stronger than any negative effects of less effective teachers. However, additional work is needed to separate these two pathways more clearly and to explore ways of enhancing the positive knowledge transfers if those turn out to be as important as our evidence suggests. If knowledge spillover is the predominant mechanism, school leaders might place more emphasis on strategic professional learning communities to facilitate the diffusion of instructional expertise or structure other opportunities for teachers to share instructional ideas and feedback with one another (e.g., Sun et al., 2013). We know little about the conditions under which positive spillover effects can be magnified and sustained, such as in schools that have systemic structures to promote collaboration among teachers (e.g., coherent curriculum and common planning time). Future work in this area could inform professional development decisions by school and district leaders. Future work might also investigate the long-run effect of spillovers. Would spillover effects be augmented when teachers have a longer period of time to collaborate, or do spillover effects decay in the second and third years, after the shock of new peers joining the grade team has faded? These results could aid in decisions about how often to reassign teachers to different learning communities. Finally, this study draws on evidence from just one school district. We cannot be certain about the generalizability of these positive impacts to other settings.

Although future work clearly has the potential to shed far more light on teacher peer effects, the results of this study highlight that who a teacher’s colleagues are can matter substantially for achievement in his or her classroom. Talent management policies can utilize the importance of colleagues—and particularly the importance of access to skilled and effective teachers for teachers who are struggling in their classrooms—for the benefit of students.

Footnotes

Acknowledgements

The authors acknowledge the helpful comments and feedback from seminar participants of the Center for Education Policy Analysis at Stanford University and Labor and Development Economics at University of Washington, Seattle. They also thank Ilana Horn, Mimi Engel, Kenneth Frank, and Jing Liu for helpful conversations about theoretical framing and data analysis. Additionally, they thank the editor and anonymous reviewers for their very thoughtful suggestions.

Authors’ Note

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Min Sun’s contribution to this study is supported by a grant from the National Science Foundation under Grant No. DRL-1506494.

Authors

MIN SUN is an assistant professor of quantitative policy research in the College of Education at the University of Washington. Her research focuses on educator quality, school accountability, and school improvement.

SUSANNA LOEB is the Barnett Family Professor of Education at Stanford University. She specializes in education policy with a focus on school governance and finance and educator labor markets.

JASON A. GRISSOM is an associate professor of public policy and education at Peabody College, Vanderbilt University. His research interests include school leadership, educator labor markets, and K–12 politics and governance.

References

Aaronson; Barrow; Sander (2007). Teachers and student achievement in the Chicago public high schools. Journal of Labor Economics, 25, 95–135.

Atteberry; Loeb, S. L.; Wyckoff (2016). Teacher churning within schools: Impacts on student achievement. Educational Evaluation and Policy Analysis. doi:10.3102/0162373716659929

Azoulay; Graff Zivin; Wang (2010). Superstar extinction. Quarterly Journal of Economics, 125, 549–589.

Battu; Belfield, C. R.; Sloane, P. J. (2003). Human capital spillovers within the workplace: Evidence from Great Britain. Oxford Bulletin of Economics and Statistics, 65, 575–594.

Bauer, T. K.; Vorell (2010). External effects of education: Human capital spillovers in regions and firms (RUHR #195). Bochum, Germany: Ruhr-Universität Bochum.

Burke, M. A.; Sass, T. R. (2013). Classroom peer effects and student achievement. Journal of Labor Economics, 31, 51–82.

Carrell, S. E.; Fullerton, R. L.; West, J. E. (2009). Does your cohort matter? Measuring peer effects in college achievement. Journal of Labor Economics, 27, 439–464.

Carrell, S. E.; Sacerdote; West (2013). From natural variation to optimal policy? The importance of endogenous peer group formation. Econometrica, 81, 855–882.

Chetty; Friedman, J. N.; Rockoff, J. E. (2014). Measuring the impacts of teachers I: Evaluating bias in teacher value-added estimates. American Economic Review, 104, 2593–2632.

Cobb; McClain; de Silva Lamberg; Dean (2003). Situating teachers’ instructional practices in the institutional setting of the school and district. Educational Researcher, 32(6), 13–24.

Coburn, C. E.; Russell, J. L. (2008). District policy and teachers’ social networks. Educational Evaluation and Policy Analysis, 30, 203–235.

Cohen (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Mahwah, NJ: Lawrence Erlbaum.

Conley, T. G.; Udry, C. R. (2010). Learning about a new technology: Pineapple in Ghana. American Economic Review, 100, 35–69.

Cornelissen; Dustmann; Schönberg (2013). Peer effects in the workplace (IZA Discussion Paper No. 7617). Retrieved from http://ftp.iza.org/dp7617.pdf

De Grip; Sauermann (2012). The effects of training on own and co-worker productivity: Evidence from a field experiment. The Economic Journal, 122, 376–399.

Falk; Ichino (2006). Clean evidence on peer effects. Journal of Labor Economics, 24, 39–57.

Frank, K. A.; Sun (2014). Social network analysis of the influences of educational reforms on teachers’ practices and interactions. Zeitschrift für Erziehungswissenschaft, 17, 117–134.

Frank, K. A.; Zhao; Borman (2004). Social capital and the diffusion of innovations within organizations: The case of computer technology in schools. Sociology of Education, 77, 148–171.

Garet; Wayne; Stancavage; Taylor; Eaton; Walters; . . . Doolittle (2011). Middle school mathematics professional development impact study: Findings after the second year of implementation (NCEE 2011-4024). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Garet; Wayne; Stancavage; Taylor; Walters; Song; . . . Doolittle (2010). Middle school mathematics professional development impact study: Findings after the first year of implementation (NCEE 2010-4009). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Glazerman; Protik; Teh; Bruch; Max (2013). Transfer incentives for high-performing teachers: Final results from a multisite experiment (NCEE 2014-4003). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Graham, B. S. (2008). Identifying social interactions through conditional variance restrictions. Econometrica, 76, 643–660.

Grissom, J. A.; Loeb; Nakashima (2014). Strategic involuntary teacher transfers and teacher performance: Examining equity and efficiency. Journal of Policy Analysis and Management, 33(1), 112–140.

Guarino, C. M.; Maxfield; Reckase, M. D.; Thompson, P. N.; Wooldridge, J. M. (2015). An evaluation of Empirical Bayes’s estimation of value-added teacher performance measures. Journal of Educational and Behavioral Statistics, 40, 190–222.

Guarino, C. M.; Reckase; Stacy; Wooldridge (2015). A comparison of student growth percentile and value-added models of teacher performance. Statistics and Public Policy, 2, 1–11. doi:10.1080/2330443X.2015.1034820

Guarino, C. M.; Santibanez; Daley (2006). Teacher recruitment and retention: A review of the recent empirical literature. Review of Educational Research, 76, 173–208.

Herbst; Mas (2015). Peer effects on worker output in the laboratory generalize to the field. Science, 350, 545–549.

Horn, I. S. (2005). Learning on the job: A situated account of teacher learning in high school mathematics departments. Cognition and Instruction, 23, 207–236.

Horn, I. S.; Little, J. W. (2010, March). Attending to problems of practice: Routines and resources for professional learning in teachers’ workplace interactions. American Educational Research Journal, 47, 181–217.

Hoxby (2002). The power of peers? How does the makeup of a classroom influence achievement? Education Next, 2, 57–63.

Hoxby, C. M.; Weingarth (2005). Taking race out of the equation: School reassignment and the structure of peer effects. Retrieved from https://www.pausd.org/sites/default/files/pdf-faqs/attachments/TakingRaceOutOfTheEquation.pdf

Imberman, S. A.; Kugler, A. D.; Sacerdote, B. I. (2012). Katrina’s children: Evidence on the structure of peer effects from hurricane evacuees. American Economic Review, 102, 2048–2082.

Jackson, C. K.; Bruegmann (2009). Teaching students and teaching each other: The importance of peer learning for teachers. American Economic Journal: Applied Economics, 1(4), 85–108.

Jackson, C. K.; Rockoff, J. E.; Staiger, D. O. (2014). Teacher effects and teacher-related policies. Annual Review of Economics, 6, 801–825.

Kane, T. J.; Rockoff, J. E.; Staiger, D. O. (2008). What does certification tell us about teacher effectiveness? Evidence from New York City. Economics of Education Review, 27, 615–631.

Koretz, D. M. (2002). Limitations in the use of achievement tests as measures of educators’ productivity. Journal of Human Resources, 37, 752–777.

Kuroda; Yamamoto (2013). Do peers affect determination of work hours? Evidence based on unique employee data from global Japanese firms in Europe. Journal of Labor Research, 34, 359–388.

Lankford; Loeb; Wyckoff (2002). Teacher sorting and the plight of urban schools: A descriptive analysis. Educational Evaluation and Policy Analysis, 24, 37–62.

Loeb; Candelaria, C. A. (2012). How stable are value-added estimates across years, subjects, and student groups? The Carnegie Knowledge Network. Retrieved from http://www.carnegieknowledgenetwork.org/wp-content/uploads/2012/10/CKN_2012-10_Loeb.pdf

Loeb; Kalogrides; Béteille (2012). Effective schools: Teacher hiring, assignment, development, and retention. Education Finance and Policy, 7(3), 269–304.

Lortie, D. C. (1975). Schoolteacher: A sociological study. Chicago, IL: The University of Chicago Press.

Manski, C. F. (1993). Identification of endogenous social effects: The reflection problem. The Review of Economic Studies, 60, 531–542.

Mas; Moretti (2009). Peers at work. American Economic Review, 99, 112–145.

McCaffrey, D. F.; Lockwood, J. R.; Koretz, D. M.; Louis, T. A.; Hamilton, L. S. (2004). Models for value-added modeling of teacher effects. Journal of Educational and Behavioral Statistics, 29, 67–102.

McCaffrey, D. F.; Sass, T. R.; Lockwood, J. R.; Mihaly (2009). The intertemporal variability of teacher effect estimates. Education Finance and Policy, 4, 572–606.

McLaughlin, M. W.; Talbert, J. E. (2001). Professional communities and the work of high school teaching. Chicago, IL: The University of Chicago Press.

The New Teacher Project. (2015). The mirage: Confronting the hard truth about our quest for teacher development. Retrieved from http://tntp.org/assets/documents/TNTP-Mirage_2015.pdf

Nye; Konstantopoulos; Hedges, L. V. (2004). How large are teacher effects? Educational Evaluation and Policy Analysis, 26, 237–257.

Papay; Taylor, E. S.; Tyler; Laski (2016). Learning job skills from colleagues at work: Evidence from a field experiment using teacher performance data (NBER Working Paper No. 21986). Retrieved from http://www.nber.org/papers/w21986

Penuel, W. R.; Frank, K. A.; Sun; Kim, C. M.; Singleton, C. A. (2013). The organization as a filter of institutional diffusion. Teachers College Record, 115, 306–339.

Penuel, W. R.; Sun; Frank, K. A.; Gallagher, H. A. (2012). Using social network analysis to study how collegial interactions can augment teacher learning from external professional development. American Journal of Education, 119(1), 103–136.

Rivkin, S. G.; Hanushek, E. A.; Kain, J. F. (2005). Teachers, schools, and academic achievement. Econometrica, 73, 417–458.

Rockoff, J. E. (2004). The impact of individual teachers on student achievement: Evidence from panel data. American Economic Review, 94, 247–252.

Ronfeldt; Farmer, S. O.; McQueen; Grissom, J. A. (2015). Teacher collaboration in instructional teams and student achievement. American Educational Research Journal, 52, 475–514.

Sacerdote (2001). Peer effects with random assignment: Results for Dartmouth roommates. The Quarterly Journal of Economics, 116, 681–704.

Sacerdote (2011). Peer effects in education: How might they work, how big are they and how much do we know thus far? Handbook of the Economics of Education, 3, 249–277.

Sanders, W. L.; Rivers, J. C. (1996). Cumulative and residual effects of teachers on future student academic achievement (Research Progress Report). Knoxville: University of Tennessee Value-Added Research and Assessment Center.

Sass, T. R.; Hannaway; Figlio; Feng (2012). Value added of teachers in high-poverty schools and lower poverty schools. Journal of Urban Economics, 72, 104–122.

Siskin, L. S. (1994). Realms of knowledge: Academic departments in secondary schools. London, England: Routledge.

Stoyanov; Zubanov (2012). Productivity spillovers across firms through worker mobility. American Economic Journal: Applied Economics, 4, 168–198.

Summers, A. A.; Wolfe, B. L. (1977). Do schools make a difference? American Economic Review, 6, 639–652.

Sun; Garrison; Larson, C. J.; Frank, K. A. (2014). Exploring colleagues’ professional influences on mathematics teachers’ learning. Teachers College Record, 116, 305–335.

Sun; Penuel, W. R.; Frank, K. A.; Gallagher, H. A.; Youngs (2013). Shaping professional development to promote the diffusion of instructional expertise among teachers. Educational Evaluation and Policy Analysis, 35, 344–369.

Yoon, K. S.; Duncan; Lee, S. W.; Scarloss; Shapley (2007). Reviewing the evidence on how teacher professional development affects student achievement: Issues and answers report (REL 2007–No. 033). Washington, DC: U.S. Department of Education, Institute of Education Sciences.

Yuan (2015). A value-added study of teacher spillover effects across four core subjects in middle schools. Education Policy Analysis Archives, 23(38), 1–24.
