Abstract
While researchers have shown great interest in understanding teacher evaluation, little is known about how teachers’ actions and interactions surrounding evaluation affect the dual goals of evaluation—accountability and development. Using data collected during a yearlong ethnographic study at three schools (combined with follow-up interviews four years later), this study employs frame analysis to describe and explain how teachers formed a group perspective about the new evaluation policy, how this perspective informed their actions and interactions, and the consequences that these actions and interactions had on teacher collegiality, teacher learning, and instructional improvement.
Introduction
Recent federal and state policy has placed a renewed focus on teacher evaluation as a main strategy through which minimum standards of teaching practice can be established and enforced (Papay, 2012). In response to the federal Race to the Top initiative and waivers from No Child Left Behind (NCLB) accountability mandates, many states instituted new teacher evaluation policies that required districts to distinguish teacher performance by teachers’ measured impact on student achievement and their observed instructional prowess as measured against standardized observation protocols. Under the new laws, consistently underperforming teachers could be fired regardless of their years of service or prior employment status.
These new evaluation systems rest on the logic of “performance management” (Mintrop, Ordenes, Coghlan, Pryor, & Madero, 2018) characterized by participation in mandated activities (e.g., observations, conferences), engagement with prescribed artifacts (e.g., observation protocols), and accountability for performance (e.g., bonus pay, sanctions). For this reason, some scholars are highly critical of using teacher evaluation to improve teacher performance, calling evaluation “an instrument of industrial-era management, of well-informed managers directing the work of the laboring class toward greater efficiency” (Murphy, Hallinger, & Heck, 2013, p. 352).
Others are more optimistic and hope that teacher evaluation can serve the dual purpose of accountability and development, noting that new teacher evaluation systems may offer “cycles of observation, reflection, dialogue and feedback, and goal setting [that] can provide teachers with new ideas as well as frequent and relevant feedback to support their professional growth” (Kraft & Gilmour, 2016, p. 715). Thus, it may be possible that new teacher evaluation systems both increase accountability and promote development. The success of teacher evaluation as a means of school improvement, then, ultimately depends on how teachers and principals view these new systems and how they translate mandated activities and artifacts into substantive actions and interactions.
While several studies describe teachers’ perceptions of evaluation and a small but growing number of studies provide qualitative accounts of teachers’ and principals’ responses to new evaluation systems, to my knowledge no study provides a “process analysis” (Strauss & Corbin, 1998) that links contexts, actions and interactions, and the consequences these actions and interactions have for teacher learning and instructional improvement. Making these connections will illuminate the processes of evaluation and build our understanding of evaluation reform’s dual purpose.
This study builds on extant research by making these connections. For this study, I conducted a full school year of field research across three schools. While in the field, I observed classrooms, professional development sessions, grade-level team meetings, and staff meetings; interviewed principals, teachers, and students; and analyzed policy documents and school-generated artifacts. In what follows, I use frame analysis to describe and explain how the teachers’ common perspective toward evaluation influenced their actions and interactions and the consequences that these actions and interactions had for the implementation of teacher evaluation policy.
Literature Review
While this study connects teacher perspectives, actions/interactions, and consequences surrounding evaluation reform for the first time, it is couched in the growing body of literature that examines teachers’ perceptions about evaluation reforms and describes how these reforms are implemented in schools.
Teacher Perceptions of Formal Evaluation
In the past few decades, studies that have gauged teachers’ perceptions of evaluation reforms have revealed several patterns (Donaldson, 2012; Jiang, Sporte, & Luppescu, 2015; Kimball, 2002; Ovando, 2001; Peterson & Comeaux, 1990). Namely, teachers are generally favorable toward evaluation reform. They tend to believe that new observation protocols capture the essential elements of their teaching and enhance conversations about instruction (Halverson, Kelley, & Kimball, 2004; Jiang et al., 2015; Kimball, 2002; Peterson & Comeaux, 1990). Evidence also suggests that teachers can view the accountability aspects of evaluation policy favorably, provided that the teachers identified as underperforming during the process of evaluation align with their own beliefs about who the poor teachers are (Donaldson, 2012).
Despite the generally positive perceptions, however, there are specific conditions that seem to undermine teachers’ optimism. First, teachers’ perceptions of evaluation reform appear to be sensitive to evaluation’s intended purpose. Teachers feel that evaluation should be primarily used as a source of professional growth rather than for determining competency and making high-stakes personnel decisions (Mintrop et al., 2018; Peterson & Comeaux, 1990). When teachers view the evaluation as primarily about accountability, they are likely to disinvest from the process and any potential for professional growth is significantly inhibited (Mintrop et al., 2018; Peterson & Comeaux, 1990).
Second, teachers’ perceptions of evaluation reform are associated with their beliefs about how principals enact the evaluator role (Donaldson, 2012; Kimball, 2002; Ovando, 2001). Perceptions of administrator competence are particularly important, as teachers in several studies questioned principals’ preparedness to evaluate the teaching of particular content (Donaldson, 2012; Halverson et al., 2004; Kimball, 2002). Perceived administrator fairness is also an important issue. Perceptions of the evaluation system are diminished when teachers believe that principals capriciously target particular teachers and identify them as underperforming for reasons other than their professional competence (Donaldson, 2012; Ovando, 2001).
Finally, teacher perceptions of evaluation reforms are influenced by their own characteristics and particular contexts. For instance, elementary school and early career teachers view evaluation more favorably than high school and experienced teachers (Jiang et al., 2015). Veteran teachers in particular are likely to cite the protocols’ lack of contextual factors as a major drawback to their use (Peterson & Comeaux, 1990), and teachers at higher grade levels tend to be more suspicious of their administrators’ preparedness to evaluate them accurately (Kimball, 2002). In addition, teachers whose instructional philosophy aligns more closely with the observation protocol and those who are rated higher on the protocol dimensions have more favorable perceptions of the evaluation reform (Donaldson, 2012; Peterson & Comeaux, 1990).
Implementing New Teacher Evaluation Systems
Several recent qualitative studies have concentrated on the link between how principals and teachers make sense of evaluation reform and how well evaluation reform is enacted.
Sensemaking, Organizational Contexts, and Policy Stimuli
Recent research employs elements of sensemaking theory to investigate how individual cognition (prior knowledge and beliefs), situated cognition (social and organizational contexts), and policy stimuli (messages from the environment) seem to affect how principals enact evaluation reform and maintain either an accountability or a developmental focus (e.g., Donaldson & Mavrogordato, 2018; Donaldson & Woulfin, 2018; Kraft & Gilmour, 2016; Marsh, Bush-Mecenas, Strunk, Lincove, & Huguet, 2017; Reinhorn, Johnson, & Simon, 2017). For instance, Reinhorn et al. (2017) found that principals’ responses to evaluation were shaped by their preexisting beliefs and knowledge, their understanding of the social and organizational contexts, and their interpretation of external policy messages. At each of the six schools in their study, principal sensemaking resulted in a developmental focus that principals supported through frequent observations of instruction, helpful feedback, and integration of evaluation with other school improvement initiatives.
In a related study that examined principals’ work with previously identified underperforming teachers, Donaldson and Mavrogordato (2018) determined that implementation of evaluation reform was shaped by a blend of principals’ sensemaking, principals’ desire to build relational trust, and principals’ perceptions of organizational capacity. In this study, sensemaking helped principals distinguish between teachers who they felt deserved their low ratings and those who did not. Working with misidentified underperforming teachers was an opportunity for principals to build relational trust, as they demonstrated concern for teachers and focused on teacher development. However, maintaining a developmental focus came with some compromise, as principals built trust with underperforming teachers by relaxing evaluation demands or making the evaluation less formal. To a lesser extent, perceived organizational capacity also affected principal enactment of evaluation reform. Specifically, developing underperforming teachers required considerable investment of personal and organizational resources. Organizational capacity was also a factor in holding teachers accountable. Principals were reluctant to attempt to dismiss a teacher unless they were certain that district administrators would support this decision.
Kraft and Gilmour (2016) demonstrated that the principals within their sample interpreted the purpose of evaluation differently and that these interpretations led to sharp contrasts in how principals enacted teacher evaluation reform. Most principals in their sample highlighted the developmental aspect of evaluation and frequently engaged teachers in instructional discussions couched in the standards of the observation protocol. However, a sizable minority of principals prioritized the accountability aspect of teacher evaluation and sought to use the reform to remove teachers they perceived as underperforming. Consequently, these principals were far less likely to observe teachers and provide them feedback on their teaching. Furthermore, policy demands combined with organizational realities and personal limitations to shape implementation. Even principals who pursued a developmental focus of evaluation often believed that they lacked the proper training, felt pressed for time to meet with all teachers and complete associated paperwork, and had trouble evaluating teaching outside their area of expertise.
Although they used structure-agency theory rather than sensemaking theory to frame their study, Donaldson and Woulfin (2018) still shed light on the importance of principal interpretation of new policy demands. They determined that when enacting evaluation reform principals made mostly minor modifications to evaluation activities in ways that promoted development rather than accountability. Specifically, principals narrowed the focus of the evaluation rubric, adjusted the timing of the observations and conferences, omitted unpopular elements of the evaluation, and slackened evaluation requirements.
Focusing primarily on the importance of organizational contexts at the school level rather than on principals’ prior knowledge and beliefs, Marsh et al. (2017) determined that schools responded to evaluation reforms in three ways that were not mutually exclusive: distortion, compliance, and reflection. Primarily distortive schools adopted temporary behaviors that would improve teachers’ documented performance on evaluative observations but would not lead to substantive changes. In compliant schools, principals and teachers neither attempted to distort the reform nor used evaluation as an opportunity to improve. Finally, evaluation at reflective schools encouraged meaningful interactions and became a source for teacher learning and growth.
Marsh and colleagues then tied these three responses to differences in organizational contexts. They found that when leadership was distributed across several actors in a school (i.e., teachers were enlisted to conduct observations and offer one another feedback) and when the principal herself was an active instructional leader, schools were likely to respond reflectively to the teacher evaluation reform. Furthermore, in reflective schools collaboration was built into the formal organizational structure. Unlike at distortive and compliant schools, reflective schools provided teachers ample opportunities to meet with colleagues as part of regular organizational routines.
Like Marsh and colleagues, Mintrop et al. (2018) focused their inquiry at the school level. Using literature on teacher evaluation, shared cognition, artifact use, and work incentives to frame their study, they noted three distinct stages of evaluation reform implementation: consonance, dissonance, and resonance. During the consonant phase, school leaders and teachers believed that the new evaluation system aligned well with extant values and practices and that the activities and artifacts associated with the program could help improve teaching and learning. However, consonance eventually gave way to dissonance, as school personnel felt overburdened by the system’s requirements, each school provided only modest opportunities for teachers to learn, and the focus shifted from formative feedback to summative evaluation. Finally, a period of resonance prevailed when the schools returned to the formative, local teacher assessment.
Taken together, this set of studies suggests the importance of both sensemaking (particularly for principals) and organizational contexts. These studies also highlight the primacy of the developmental focus, although maintaining this focus is likely to require some compromise of evaluation reforms’ accountability demands (Donaldson & Mavrogordato, 2018; Kraft & Gilmour, 2016; Mintrop et al., 2018).
Gaps in Current Understanding of the Implementation of Evaluation Policy
Despite the contributions of the studies described above, much is still unknown about how the evaluative and developmental aspects of teacher evaluation reform policies are enacted. This study furthers the work described above in three ways. First, most studies have relied primarily or exclusively on one-time interviews and thus obscure our understanding of how time affects perceptions and interactions involving evaluation reforms (for an exception, see Mintrop et al., 2018). For this study, I interviewed teachers and principals several times over the course of 4 years to better understand how familiarity with evaluation and changes in principal leadership shaped evaluation policy implementation. Second, primary reliance on interviews inhibits a clear understanding of enactment, as researchers have rarely directly observed teachers’ and principals’ socially constrained actions. Finally, most of the qualitative implementation studies summarized above have focused primarily on the importance of principals’ actions in enacting evaluation reforms. While this focus is undoubtedly important, very little is known about how teachers understand their situation in terms of evaluation policies and how this understanding (i.e., perspective) shapes their actions and interactions. With this in mind, in what follows I use frame analysis to describe and explain the basic situation of evaluation reform at three schools (including the group perspective), the actions and interactions that flow from this perspective, and the consequences that these actions and interactions have for the implementation of teacher evaluation reforms.
Theoretical Framework: Using Frame Analysis to Understand Teacher Evaluation
Frame analysis has recently been used to explain how people understand and interpret policy messages and then shape these messages (e.g., Coburn, 2006; Coburn, Bae, & Turner, 2008; Woulfin, Donaldson, & Gonzales, 2016). These studies examine how messages are “framed” as they are communicated across and within institutions. In this way, framing is an extension of individual sensemaking and helps explain how people can make collective sense of incoming messages as they seek to gain control of problematic situations through diagnosing problems (diagnostic frames), influencing the course of collective action (prognostic frames), setting expectations for interactions (normative frames), or detailing how behavior will be monitored (regulatory frames). Through imposing their frames, then, people can define the situation and control collective action.
This version of frame analysis aligns well with the belief that certain people are particularly well situated and influential and can frame incoming messages and, by extension, impose their definition of a situation on others. Although they did not use frame analysis explicitly, several studies of teacher evaluation reform share the assumption that principals are able to interpret policy messages and impose their frames on teachers (e.g., Donaldson & Woulfin, 2018; Reinhorn et al., 2017). However, I employ frame analysis as originally conceived (Goffman, 1974), as I have no evidence that principals were able to tightly control policy messages or impose their definition of the situation through diagnostic, prognostic, normative, or regulatory frames.
As Goffman (1974) theorized, people use frames to organize their ongoing experience; frames help people “locate, perceive, identify, and label a seemingly infinite number of concrete occurrences” (Goffman, 1974, p. 21). Frames, then, are ways of seeing patterns in a social setting (e.g., typical relations, ways of being, meaning of symbols) that allow people to pull meaning from contexts. Furthermore, I assume that people mainly encounter situations that are already defined and they then use their frames to understand the situations in which they find themselves. As Goffman (1974) noted,
The definition of the situation is almost always to be found, but those who are in the situation ordinarily do not create this definition . . . ordinarily, all they do is to assess correctly what the situation ought to be for them and then act accordingly. (pp. 1–2)
Developing a Perspective
Social action can only be coherent to the extent that actors form a common definition or perspective of the situation from which they organize their own behavior. That is, although no single actor defines a given situation, actors form a common understanding of their situation through their involvement in problematic situations. The common definition includes coordinated views and plans of action held by a group of people (Becker, Geer, Hughes, & Strauss, 1961). People form a perspective when they notice and interpret the meaning of environmental objects in similar ways and reach explicit or tacit agreement about how best to proceed. In this way, perspectives both facilitate and constrain action. Coherent action is not possible without a group perspective while, at the same time, a group perspective will dictate that only certain actions will be appropriate. Once the perspective is established, actors have a frame of reference to guide and evaluate their own actions and the actions of others.
Projecting Into Situations
Goffman (1959, 1974) argued that after actors have reached a common perspective, they influence social situations through the information they reveal by their appearance, manner, and actions; information they take in from others; and the objects in the environment in the context of larger societal rules and expectations. That is, while they cannot control the perspective (i.e., the common definition of the situation) they can influence how they are perceived in reference to it. In any social situation actors “project” themselves into the socially desirable roles at the same time they interpret the projections of others. Meanwhile, each actor interprets others’ projections and evaluates these projections against the group perspective.
In sum, actors use their frames to understand social contexts and decide how best to respond. Over time, groups form a perspective (i.e., common definition of a situation) that makes socially coherent action possible. With the common perspective in mind, actors influence social situations primarily through the projections they “give off” by their appearance, talk, and actions. Throughout interaction, actors use the perspective as a source for monitoring their own actions and the actions of others in an ongoing “reflexive dialogue” (Mead, 1938).
Research Questions
The following questions stem from the review of the literature and the theoretical framework.
Developing a Perspective
What are the characteristics of the group perspectives that teachers form regarding teacher evaluation? How, if at all, do perspectives differ within and across contexts?
To what extent, if at all, do perspectives reflect the evaluative or developmental aspects of evaluation?
Projecting Into Situations
What implications does the group perspective have for actions and interactions related to teacher evaluation? Specifically, how do teachers maintain or improve their situation (i.e., maintain or improve their status) through projections both individually and collectively?
To what extent, if at all, do teachers’ actions and interactions support their learning and development?
Understanding the Consequences
What consequences do the group perspective and teachers’ efforts to maintain or improve their status through their projections have on teacher collegial relations, the sharing of resources, teacher learning, and instructional change?
Methodology
In this section, I detail the study context, sampling, data collection, data analysis, validity of the findings, and limitations.
Study Context
In July 2011, Michigan State House Bill 4627 was enacted into law. The law dictated that schools use an observation protocol and student achievement data to evaluate teachers and make personnel decisions. The law stipulated that teachers were to be observed multiple times annually and their instruction gauged against an observation protocol from a state-approved list or a locally designed protocol that met state requirements. The state also allowed local flexibility in how student growth scores would be calculated. Finally, the new law required that schools create composite scores of teacher performance to rate their teachers along four distinct performance bands. Teachers rated “ineffective” in three consecutive years faced mandatory termination.
Research Design
Sampling
Several evaluation reform studies have been conducted in innovative districts (Halverson et al., 2004; Kimball & Milanowski, 2009), reconstituted public or charter schools (Marsh et al., 2017; Mintrop et al., 2018), or in schools with good reputations for leadership (Reinhorn et al., 2017). As I was interested in how evaluation occurred in typical settings, the three schools selected for this study faced no special circumstances (e.g., state or district sanctions), were traditional, district-run public schools, and were not highly recognized for excellence in teacher evaluation. I selected middle schools in Michigan that differed in their region and proximity to large urban centers (see Table 1) from a larger sample of schools that were involved in a statewide reform (Formative Assessment for Michigan Educators [FAME]). However, the FAME program was not a school-wide initiative, and at each school only a fraction of teachers participated.
Table 1. Overview of Middle School Sample
As research has suggested, perceptions of evaluation vary with level of schooling (Jiang et al., 2015). I chose middle schools because what occurs at this level is interesting in its own right and because I believed that middle schools would likely represent a middle ground between elementary school and high school contexts.
In each school I identified focal teachers who participated in the FAME program voluntarily. Thus, the focal sample was likely biased toward teachers who were favorably disposed to instructional reforms. I also included each school’s principal in the sample. An overview of the participant sample is included in Table 2.
Table 2. Participant Sample Overview and Interview Summary Chart
Note. NA = not applicable.
Data Collection
Data collection reflects the goal of the study: to provide a longitudinal process analysis of teacher evaluation policy implementation that describes and explains evaluation’s impact on teacher accountability and development. This goal necessitated that I generate focused, researchable questions and then conduct a disciplined inquiry in a bounded community in which people assumed different, status-related roles, developed common meaning for symbolic action that emerged from sustained interaction, and crafted plans for action (Ogbu, 1981). In total, I spent over 100 full days in the field at three middle schools during the 2013–2014 school year conducting interviews, observing organizational routines, and analyzing documents. Triangulating data in this way and conducting my inquiry over the school year allowed me to generate and test ethnographic hypotheses, to check teachers’ and administrators’ interviews against their observed social behavior, and to bring greater clarity to observations. That is, triangulation aided validity, provided context for what I heard and saw, and helped expedite my understanding. I returned to each of the research sites during the 2017–2018 school year to interview key informants (eight teachers, two principals) again and compare my emerging findings against any changes that may have occurred over time.
Observations
I observed teacher meetings, professional development sessions, and classroom teaching at each of the three sites (see Table 3). I also spent time observing the halls during passing periods, sitting in the front office, eating lunch in the teachers’ lounge, and attending school events. During these observations, I wrote scratch notes in the field that I later elaborated more fully (Emerson, Fretz, & Shaw, 2011).
Table 3. Teacher Meeting Observation Overview
Interviews
I conducted a total of 75 ethnographic interviews across sites with 14 teachers and three principals (see Table 2), interviewing each informant 3 to 5 times over the course of the year. Initial, semistructured interviews established rapport with informants and inquired about their experiences in education and how they made sense of and responded to the new teacher evaluation policy. I used initial observations and interviews to generate “ethnographic hypotheses” (Spradley, 1979) that I followed up on in subsequent interviews. I remained in the field until I had thoroughly refined these hypotheses and reached data saturation (Glaser & Strauss, 1967). All interviews were audio recorded and transcribed in their entirety.
Documents
I collected public and private documents. Public documents included formal state and district teacher evaluation policy documents and observation protocol. Private documents included written correspondence between principals and teachers, interim evaluation reports, formal summative evaluations, teachers’ rebuttals to evaluation, and development plans.
Data Analysis
As soon as data collection started, formal analysis through the inductive process of the coding paradigm began (Strauss & Corbin, 1998). This process included open coding (the generation of categories of meaning from the fieldnotes and interview transcriptions), axial coding (the identification of key categories and the refinement of them), and selective coding (the interrelation of categories into an explanatory organizational scheme). I used Spradley’s (1979) semantic relationships to assist with open, axial, and selective coding. Semantic relationships name objects, events, and processes that people use to organize their social lives. Semantic relationships are also durable across contexts. Finally, semantic relationships are finite and easy to work with. Spradley identified nine semantic relationships: strict inclusion, spatial, cause–effect, rationale, location for action, function, means–end, sequence, and attribution.
During the three stages of coding, I looked for semantic relationships in interview transcripts and field notes. Embedding the search for and refinement of semantic relationships in the processes of open, axial, and selective coding led to several fruitful conceptual developments, including: types of teachers who do well on evaluation (strict inclusion), consequences of performing poorly on evaluation (cause–effect), reasons for wanting a high relative evaluation score (rationale), places where teachers talk to colleagues about evaluation (location for action), uses of evaluation protocol (function), ways to perform well on evaluation (means–end), steps in the evaluation process (sequence), and characteristics of the teacher perspective toward evaluation (attribution).
Through the process of open, axial, and selective coding I began to construct an organizational scheme or “paradigm” (Strauss & Corbin, 1998) that allowed me to develop and interrelate concepts and describe and explain my data. More specifically, I interrelated concepts through the conditions–actions–consequences paradigm described by Strauss and Corbin (1998). Ultimately, I refined and interrelated three primary semantic relationships and mapped these onto the conditions–actions–consequences paradigm as follows: characteristics of the teacher perspective toward evaluation—conditions; ways to perform well on evaluation—actions; and consequences of teacher actions and interactions surrounding evaluation—consequences. This conceptual paradigm led to the main findings of this research.
Establishing Validity
I established validity in several ways during data collection and analysis, including triangulation, prolonged engagement in the field, member checking, and searching for disconfirming evidence.
First, I verified data through continual comparison. Namely, I checked interviews against observations, observations against artifacts, artifacts against interviews, and so forth. I also spent extended time in the field generating and confirming or disconfirming emerging hypotheses. Interviewing and observing informants over several months strengthened the validity of the study in two ways. First, informants were unlikely to uphold false constructions of their typical behavior and actual feelings if interviewed and observed on multiple occasions (Becker, 1970). Second, because interviews happened close to the natural setting, the interviews likely extended the informant’s “discursive penetration.” Finally, I ensured the validity of this study through two closely related activities: member checking and searching for disconfirming evidence. When searching for disconfirming evidence, I generated hypotheses and then actively sought evidence to disprove them. For instance, early interviews and observations suggested that teachers did not use or learn from the observation protocol their districts provided. I searched for evidence that would disprove this hypothesis, initially finding none. However, over time it became clear that in certain circumstances (informally and privately) teachers could consult the observation protocol and use it to inform their teaching.
Member checking within and across interviews involved asking informants about the accuracy of my understandings. I also returned to each school after data collection and initial analysis to share with informants my understanding of the situation and ask about ways in which my understanding might be wrong. Member checking, then, was a way of soliciting disconfirming evidence. I later conducted a thorough search for disconfirming evidence by carefully examining each transcript and field note for any data that would cause me to disregard or (more likely) refine my conclusions. In this way, I refined the concepts and the organizational scheme I present in the findings.
Limitations
Despite extensive data collection, thorough data analysis, and careful construction of validity, this study is not without limitations. First, the purpose of this study is to generalize to theory rather than to a population (Firestone, 1993). This means that the conceptual relationships described herein hold within the sample but do not necessarily generalize to a larger population. Second, the sample teachers likely did not capture the full diversity of teachers in each of the three schools. As noted previously, the sample included only those teachers who were involved in another reform. Because teachers mostly volunteered to participate in this reform, the sample teachers were likely more reform-minded than the typical teacher at their school. Even so, in most other ways the sample teachers did not appear to be markedly different from the other teachers at their schools in their perspectives, actions, and interactions surrounding evaluation. Finally, this study has limitations surrounding data collection. As the sole researcher in the field, I sampled events that I thought were important to observe in order to understand evaluation. However, I could not observe more than one event at a time, and my choices about what to observe surely limited what I could observe and learn. Furthermore, I did not have access to private meetings between principals and teachers, although both principals and teachers later detailed these meetings during interviews and provided written evaluation documents.
Findings
I report the findings in a way consistent with the requirements for process analysis (Strauss & Corbin, 1998). Specifically, I describe the basic contexts (setting, teacher perspectives), actions and interactions (restricting information, aligning preferences and projections, establishing connectives), and the consequences that these actions and interactions had for teacher development and accountability.
Basic Context
Waller, Poe, and Middleton differed starkly in terms of principal leadership and local teacher labor market conditions. Waller was located in an urban district with no recent record of teacher layoffs. Furthermore, in 2013–2014 Waller’s principal, Ms. Shriver, had been at Waller for several years, dedicated herself to providing capable instructional leadership, and was widely respected by teachers. In contrast, Middleton’s district had been losing students for over a decade in 2013–2014 and laid off several teachers in the period prior to the first year of the study. To make matters worse, in 2013–2014 Middleton’s principal, Mrs. Novak, showed little or no interest in providing instructional leadership or conducting evaluation activities. The staff deeply distrusted Mrs. Novak and suspected that she evaluated them arbitrarily. Poe’s district was neither losing students nor gaining them and therefore the district was under little or no pressure to lay teachers off. In 2013–2014, Mr. Delancey, the school’s principal, was in his first year in the position and the staff was cautiously optimistic about how he would conduct evaluations. By 2017–2018, only Mr. Delancey remained as principal. Ms. Shriver moved on to become an assistant superintendent in the area and Mrs. Novak retired. Each left her respective school at the end of the 2015–2016 school year. An overview of the school principals is included in Table 4.
Table 4. Principal Sample, 2013 and 2017
The Teacher Perspective and Implications for Maintaining or Improving Status
Recall from the theoretical framework that when occupying similar roles (e.g., teacher) and facing common pressures and challenges (e.g., new evaluation policy) actors often form a common definition of their situations (i.e., a perspective) from which they plan for action and interpret both their own actions and the actions of others. Actors then maintain or improve their status by projecting themselves into the situation through their appearance, talk, and actions.
The Basic Group Perspective
Teachers across the sample shared a common perspective regarding evaluation, one that suggests that teachers were primarily focused on accountability. As will be explained, this accountability-centered focus had a profound influence on teachers’ actions and interactions surrounding the new reform. The basic group perspective can be stated simply: The main purpose of teacher evaluation is securing a record of high performance, particularly a high relative standing among peers, in order to ensure one’s employment security.
This perspective persisted across differences in principal leadership, local labor market conditions, and teachers’ own beliefs, values, experiences (i.e., perceptions), and relative standing among colleagues. Furthermore, the essential elements of the perspective emerged very early in data collection, suggesting that its formation occurred before introduction of the new evaluation policy. The evaluation policy was new, but it appeared that teachers brought their preexisting perspective toward evaluation to their new situation.
The perspective became manifest in hundreds of statements in which teachers voiced anxiety about not being recognized as relatively high scoring and about the implications that poor performance might have. In some instances, teachers expressed the perspective when describing potential consequences for individual teachers, as revealed by Mr. St. Johns from Middleton: Let’s say we have every teacher laid out on the paper like a timeline. The administrator is going to shuffle the teachers around wherever they think they are effectively. And this [relative comparison] is how we get our number [i.e., evaluation rating]. Now I have to find out how I get that number.
At other times, teachers focused on the school-wide consequences of the competition for high relative evaluation scores, as expressed by Mrs. Jackson, a low-scoring teacher at Waller: You hope [Ms. Shriver] sees you do something good. . . . We try to establish a culture in our classroom of students respecting one another and it’s not competitive, and it’s all about growing and showing growth and that is not the culture that’s been created within the staff . . . [Evaluation] becomes about who can be the golden children.
Differences among teachers and the direction of effort
In order to present themselves favorably in reference to the perspective (i.e., maintain or improve their status) teachers had to project themselves as highly capable teachers. Some teachers found this projection more difficult to sustain than others. Holding a common perspective did not mean that all teachers understood their particular situation in the same way, nor did it mean that all teachers shared the same feelings and attitudes (i.e., perceptions) about evaluation. As Becker et al. (1961) noted, “Clearly, a situation will not present the same problem to all people. Some will have a way to act in the situation so that it calls for no thought at all; the situation is not problematic for them” (p. 35). Thus, it was possible for teachers to hold a common group perspective about evaluation but to understand their own particular situations differently.
Teachers, then, exerted effort on evaluation activities in ways that reflected their understanding of their personal situation. Lowly rated teachers (see Table 5) were the most concerned with evaluation and they responded to evaluation activities intensely. For example, Ms. Carroll described evaluation as a “looming giant” that overshadowed other school reforms. Across interviews she explained how she consulted the observation protocol, listened carefully to messages from the principal, and consulted colleagues in an attempt to improve her evaluation score.
Table 5. Distribution of Teachers by Relative Score and Teacher Labor Market Conditions
Highly rated teachers varied. Some (e.g., Mr. Bridges, Mr. Trotter, Mrs. Hall, Mrs. Herman and Mr. St. Johns) viewed their relative status with a sense of permanence and they used their position to disregard evaluation activities to a large extent. For instance, Mrs. Hall at Waller said that she did not exert more than minimal effort completing evaluation activities: “I don’t do a dog and pony show. I don’t adjust my plans for the evaluation.” When asked in 2017 about whether he felt pressure from evaluation policy, Mr. Bridges responded: Me personally? No, but I don’t care. I don’t value the process . . . [and] my evaluator sat in my first meeting last year saying that I probably know more about teaching than he knows so I’m not worried about the process. It’s going to fall where it falls.
When later asked if he would change his approach to evaluation if he felt that his relative standing was in jeopardy or the district labor market conditions changed, Mr. Bridges replied, [My approach] would be adapted to my particular situation. I think everybody’s pressure that they’re feeling is based on the relationships that they have [with the principal] and how their district treats them. I’m in a unique position.
Others of high standing (e.g., Mrs. Quincy, Mrs. Reid, Mrs. Monahan, Mrs. Curtis) were uncertain about the stability of their high rating and were more aggressive in projecting themselves. In this way, their responses to evaluation were similar to those of lowly rated teachers.
The Elaborated Group Perspective
This section elaborates the basic group perspective in order to explain how teachers attempted to maintain their projections as highly effective teachers in their interactions with the principal and, by extension, formally document their high relative performance. The elaborated perspective includes teacher beliefs that evaluations were based primarily on principal preferences and biases that potentially led to subjective scoring and unfair treatment for particular teachers. For example, although she had established both a trusting relationship with Mr. Delancey and a high relative evaluation score over time, in 2017 Mrs. Reid still worried about principal subjectivity and fairness: I firmly believe that if an administrator did not like you, they could make your evaluation say anything they wanted. And under the current law if you’re the lowest evaluated in your position, you go. This entire system can be manipulated by an administrator to make whoever they want the lowest . . . if that’s the teacher they want to get rid of.
Teachers across the sample shared this view that their evaluation was heavily biased by principal preferences. Mr. St. Johns, a high relative scorer at Middleton, further elaborated this point: All the power is in the administrator’s court. If I do something that upsets the administrator, do I have a bull’s-eye on my back? Will they nitpick me more than others? Will it affect me on my evaluations? . . . I don’t particularly feel threatened in my job . . . but there have been people that have been completely blindsided around here.
Pleasing the principal, avoiding a “bull’s-eye” on their back, and avoiding identification as a teacher the principal “wants to get rid of” required teachers to align their projections with principal preferences that were most often only tangentially connected to the formal standards of their evaluation. For instance, when asked in 2017 about whether she could use the district’s observation protocol as a point of reference about how to improve her evaluation score, Ms. Stickle was doubtful because principals “may have things that they’re stressing.” Without prompting, she added that she would likely receive a poor evaluation in 2017–2018 because her style of organization conflicted with that of her current principal: I’ve also learned that I’m a piler. The things I’m working on, I make piles. That’s who I am . . . And they’re organized piles, but they don’t look like that to a filer. A filer is a person who cannot stand anything out of place. . . . When your evaluator is a filer and you’re a piler [as she claimed was the case], you look disorganized. And that can hurt your evaluation.
All teachers doubted the connection between the observational protocol and earning a high relative evaluation score. When asked about how to perform well on evaluation, teachers never mentioned needing to master the teaching as described in their districts’ observation protocol. However, they routinely mentioned that principals had specific, idiosyncratic preferences that, if not satisfied, would hurt their evaluation. Mrs. Cunningham from Poe explained further: I think there’s some things that are specific to [Mr. Delancey] and what he’s looking for in more of unconscious bias when he comes into a room. . . . For him, I think an unconscious bias is that well-run classrooms look quiet. Students sitting at a desk working as opposed to other activities that we want to incorporate like group learning and collaboration and that sometimes can look messy even though the learning is really good. I think sometimes there’s an unconscious bias of “that’s not as good as quiet, silent, seatwork.”
Teachers’ beliefs about the purpose of evaluation and their beliefs about their evaluators can be stated in a single elaborated perspective as follows: The primary purpose of evaluation is to earn a high score relative to one’s peers and in so doing secure one’s occupational future. In order to perform well on evaluation, teachers need to determine a principal’s preferences and then project themselves in ways that align with these preferences. Finally, principal preferences are likely to be only loosely connected to the observation protocol, if connected at all, and may actually conflict with a teacher’s own sense of best practices.
Maintaining or Improving Status Through Control of Information
With the elaborated group perspective in mind, teachers sought to maintain or improve their relative status by projecting themselves through their talk, action, and interaction. Maintaining or improving status involved a careful control of information in which teachers attended to information principals “gave off” about their preferences and information teachers “gave off” about their performance. Controlling information took three forms: restricting information, aligning preferences and projections, and establishing connectives.
Restricting Information
Within reason, teachers could restrict the information they provided to the principal, and many teachers did (at least to a moderate extent). When asked about her interactions with Ms. Shriver, Mrs. Jackson responded, “I will talk to [Ms. Shriver] about behavior things, but I don’t talk to her about curriculum or instruction. I don’t want to incriminate myself by asking questions.” Even Mrs. Hall, who had a high evaluation score and a close working relationship with Ms. Shriver, admitted some reservation about interacting with her. Mrs. Hall acknowledged, I’m pretty open with her, but I am much more open with [my department colleagues]. Maybe I tiptoe around things a little bit more with Ms. Shriver, but not always. If I go to her with an issue, then it is not necessarily the first time that I’m thinking about it.
Teachers at Middleton and Poe adopted a similar approach, albeit to a limited degree. Mrs. Quincy said, “I kind of distance myself from Mrs. Novak because of [evaluation]. The less that I interact with the principal, the less likely I am to offend or be noticed in a negative way.”
Restricting information, however, had limited benefits. Teachers could not restrict information about their performance entirely and there was no certainty that a smaller sample of performance would translate to a higher relative evaluation score. Restricting information would leave a void that principals would have to fill with their own inferences about teacher performance. Thus, restricting information could ultimately expose teachers to greater uncertainty rather than help them maintain their projections. Teachers therefore relied primarily on eliciting principal preferences and aligning their projections accordingly.
Aligning Preferences With Projections
Aligning preferences with projections began with eliciting information about principal preferences and extended through making modest adjustments that projected alignment with these preferences. This process became manifest during preparation for formal observations and when completing the annual self-evaluation.
Observations
Teachers elicited information from several sources to determine what principals wanted to see when they visited classrooms and they made accommodations to their practice accordingly. The most popular sources included casual observations, previous feedback, knowledge of the limitations of the evaluation process, staff meetings, professional development sessions, and one-on-one interactions. For instance, in 2018 when asked how she determined what the new principal at Waller valued, Ms. Stickle said simply, “You look. You get good at looking at their desk in their office and you see things.” The “things” that one might see could include the books or other materials that might influence principal preferences or provide clues about other elements of practice a principal might value (e.g., personal organization).
Teachers also attended to the feedback (e.g., “noticings” and “wonderings”) they received from previous observations. Ms. Carroll explained how she learned about Mr. Jordan’s preference for having learning targets posted in the classroom: It’s an expectation that learning targets are posted in our classroom and you can kind of pick up what they’re looking for in the noticing and wondering comments that they make. . . . So, I feel pretty clear on what he’s looking for.
Furthermore, teachers could familiarize themselves with the limitations that principals faced when conducting observations and combine this knowledge with information derived from other sources to project themselves as highly effective teachers. For example, Mrs. Reid at Poe used information from the observation protocol, requirements of the observation reporting form, and previous principal feedback to maintain the projection of excellence: I know the rubrics. I also know as a teacher what is easy for [Mr. Delancey] to look for. Because when he comes in he’s supposed to be scripting. And I know that’s pretty labor intensive and thought intensive. So if he’s putting his energy there, what are some things he can easily look for to knock me down on? So those are the things I always try to have. Like, I make sure I always have my objectives up. Or if the things that I’ve been criticized for in the past then I make sure that I do that. So, I know he’s going to look for the objective because I know they have a drop down box on [the observation form].
Staff meetings were yet another source of information about principal preferences. When asked how she changed her instructional practice in response to teacher evaluation policy, Mrs. Monahan at Poe said she posted learning objectives, a practice not directly addressed in her district’s observational protocol, but one Mr. Delancey stressed in staff meetings. She explained, The thing that I really strive to do better is to make sure that my targets are posted. Addressing those targets, maybe not on a daily basis or to at least have them posted on my agenda or something so kids can see them. What [Mr. Delancey] really wants is for us to be talking about them so it gets ingrained in [the students’] brains.
Teachers could also look to professional development opportunities as a source of information about principal preferences, as principals often melded the requirements of evaluation and their districts’ offerings for professional development. For example, in 2014 Mr. Delancey, Poe’s principal, contrasted his approach to combining district professional development and teacher evaluation with his predecessor’s: So with the last principal there wasn’t a lot of accountability. . . . There wasn’t a lot of PD brought in and said “now I’m going to hold you accountable.” PD would happen and then life just happened. . . . Next year we’re going to bring in thinking maps, so now [teachers] will be required to use them and I want to see them in [teachers’] lessons. Whatever PD we have, we want to hold teachers accountable in walkthroughs and observations.
Teachers were sensitive to the messages about principal preferences revealed through their participation in professional development. For instance, Mr. Bridges explained of his interactions with Mr. Skiles, the new principal: We never really had many talks [about instruction] in a one-on-one setting. We had big talks at staff meetings. . . . He’s been to some trainings that I’ve been interested in. Like Kagan’s . . . so we’ve talked about cooperative learning. There’s been good conversations.
If the principal led the professional development, as was the case with Ms. Shriver and the formative assessment learning team at Waller in 2014, this sent an even stronger message to teachers about what the principal valued. Aligning oneself with this professional development could improve one’s evaluation rating. When asked why she chose to join the formative assessment learning team, Mrs. Jackson at Waller explained, Complete and total honesty, I knew nothing about [formative assessment] when I joined. I needed more “school involvement” on my evaluation. I liked the idea of formative assessment. I know it’s really, really helpful in the classroom . . . but I knew I needed to get more involved if I wanted to have that check on my eval.
In total, teachers looked to a variety of sources to determine principal preferences and they made modest adjustments (e.g., tidying up their room, posting learning targets, joining a professional development offering) that aligned with these preferences. In other words, when being formally or informally observed, teachers aligned their projections with principal preferences in order to earn a high relative evaluation rating.
Self-evaluation
In addition to formal observations, each of the three schools required that teachers evaluate their own performance against the demands of the rubric. Although the self-evaluation did not require accommodations to one’s practice as observations did, the goal of the observation and self-evaluation activities was the same: find out what the principal wanted and then align projections accordingly.
Completing the self-evaluation activity created some consternation for teachers, as there was no uniform agreement about how best to project one’s performance to the principal. Teachers were unsure about whether to assign themselves universally high scores or to assign themselves scores that indicated room for, and commitment to, growth. Mrs. Reid, a high scoring teacher at Poe, described her uncertainty: I worry about the self-evaluation. I worry that if I say I’m really good at the beginning of the year that that will be used against me. So I don’t always say that at the beginning of the year but at the end of the year. . . . I’ll mark myself high. But I never really know what he’s going to do with it at the end of the year.
Other teachers shared this uncertainty but learned over time that the weight of the self-evaluation was not significant. In 2017, after several years of experience, Mrs. Cunningham at Poe was still uncertain about how to proceed, but she was no longer anxious about this uncertainty: I know there have been some varying opinions about how to attack [the self-evaluation]. I’ve varied, I think, each year on the way I go for it. I know last year we had a staff member who worked a lot with the leadership team and he . . . basically gave the advice that “go for the high rating because the burden of proof is on them to say that you’re not [highly effective].” So I kind of followed that and then nothing really came of it. . . . It doesn’t really factor into the end of the year evaluation.
In other cases, the consequences for the self-evaluation were higher. For instance, at Middleton teachers believed that the principal, Mrs. Novak, simply turned teachers’ self-evaluation scores into their final scores. In response, teachers assigned themselves universally high ratings. In 2014, Mrs. Quincy spoke of teachers’ reluctance to self-incriminate: When I self evaluate, I could do it for myself and be completely honest with myself. But I feel like if I do that with Mrs. Novak there is a chance of something that I notice goes unnoticed to her if I don’t point it out. And then I have a lesser evaluation than I would have compared to the other people that I’m competing with for jobs.
Middleton’s case is particularly instructive as it demonstrates how teachers aligned their projections with the principal’s preferences and the demands of the situation. Mrs. Novak retired at the end of the 2015–2016 school year, and when interviewed again in 2017–2018, Mrs. Quincy and the rest of the Middleton staff had adjusted how they completed the self-evaluation. Specifically, the staff had determined that the new principal, Mr. Jordan, wanted teachers to document areas for growth on the self-evaluation in order to show improvement over time. Of the whole enterprise of determining what the principal wanted and constructing the self-evaluation, in 2018 Mrs. Quincy said, The first year with [Mr. Jordan] . . . our building was like “you really want it to be as low as possible because it’s all about growth. So, it’s ok if it’s low.” . . . [Laughs] We [decided] if you know you’re doing something [at the beginning of the year] and he hasn’t noticed it . . . you’re going to show growth. So . . . [giving the principal what he wants] is all that really matters if you’re manipulating the system.
At the end-of-the-year evaluation meeting, Mr. Jordan did not mention teachers’ self-evaluation, so the following year the faculty lost interest in trying to determine what Mr. Jordan wanted or how they could use the self-evaluation to secure a higher final evaluation score. In 2017–2018, Mrs. Quincy simply resubmitted her self-evaluation form unchanged from the year before.
Understanding alignment: Principal preferences, teacher projections, and instructional change
As argued above, teachers attempted to secure high evaluation scores primarily by attending to principal messages about what they valued and then responding to these messages through projections about their practice. Teachers elicited information about principal preferences through a variety of interactions and, ultimately, these interactions served as the main foundation for the teachers’ conclusion that principal preferences were idiosyncratic and deviated from the demands of the observation protocol, as the following quote from Mrs. Cunningham at Poe suggests about feedback she received from walk-through observations: [Feedback includes] “noticings” and “wonderings” . . . Sometimes the “wonders” that I see are very abstract and . . . don’t necessarily have to do with what I was doing that lesson. They’re usually the way to talk about stuff you would have talked about in the old rubric as opposed to this. So I feel like the noticing and the wonderings are kind of [Mr. Delancey’s] way of talking about things that aren’t here [in the rubric].
This belief that principals had preferences that differed from the rubrics was not specific to Poe. Rather, teachers across the sample expressed the same belief about Mrs. Novak (2013–2014), Mr. Skiles (2017–2018), and Mr. Jordan (2017–2018). Only Ms. Shriver (2013–2014) earnestly tried to communicate ideas about practice that closely mirrored the description of instructional practice in the observational protocol the district was using. She knew both the observational protocol and the larger reform ideas exceptionally well and she tried to tie the school’s involvement in several instructional reform programs (most notably the formative assessment program) to the evaluation, as she explained: I try to make connections for teachers as part of educator evaluation. We use the [Framework for Teaching]. Within “Domain 3: Engagement of Students,” there is a piece about formative assessment and kids self-assessing and peer assessing . . . I’m like, “Hey guys these are things we talk about and do.” Unfortunately, I think evaluations are on everybody’s mind and those things they can control.
Ms. Shriver had a passion for helping teachers improve their instruction, but she lamented that teachers’ minds were on the “things they can control” which typically translated to making simple and easily enacted modifications to their teaching. She also recognized that despite her own passion for and knowledge about instruction, she had difficulty translating ideas about quality instruction to her teachers, who tended to focus on discrete teaching behaviors: I don’t I always do a good job to paint the big picture for them. So sometimes when I think I’m being clear, people . . . hear bits and pieces and don’t connect it to the big picture. They hear one part of it, but they don’t listen for the rest . . . and they walk away with just a miniscule part of something. Not the bigger idea that I wanted them to walk away with.
Observations in Waller’s classrooms suggest that despite Ms. Shriver’s best efforts, teachers did indeed “hear bits and pieces” of the “big picture” and they translated her messages as requiring only slight adjustments rather than the comprehensive change in practice that a broader understanding would entail. Consequently, Waller teachers made many of the same minor adjustments as Poe and Middleton teachers. When asked how he had changed his instruction in response to his interactions with Ms. Shriver around formative assessment, Mr. Bridges explained, Ms. Shriver and I talk a lot about formative assessment and those are the basic tenets of my whole classroom. We just completed five weeks [in the semester] and I have two grades because most of the feedback to students was formative and not summative.
Classroom observations and further interactions with Mr. Bridges suggest that he interpreted messages about formative assessment to mean that he should assign formal grades only infrequently. For Mr. Bridges, “formative” feedback meant “ungraded,” and there was no evidence that he changed his practice to provide feedback that advanced student learning, as was required in both the formative assessment program and the Framework for Teaching. Thus, while Ms. Shriver attempted to express her preferences as closely aligning to the observational protocol, teachers responded to these messages with modest, discrete changes to their instructional practice.
Establishing Connectives
Although teachers directed the most effort to understanding and responding to principal preferences, teachers’ annual evaluations were formally documented in the observation protocol. That is, ultimately, principals had to score teachers on how they performed on the dimensions of the observational protocol despite their idiosyncratic and loosely related preferences. Typically, the transition from preferences to the formal observation protocol went smoothly and teachers were satisfied with their rating. However, when teachers scored poorly relative to their peers, they attempted to negotiate their scores through deliberately establishing connectives between their practice and standards for exemplary performance outlined in the observation protocol. In frame analysis theory, connectives are essential to maintaining status, as they are the links between the actors and actions or words that are much more explicit than the comparatively more subtle projections (Goffman, 1974). Usually, there is little or no confusion about what actors are doing and saying and how these actions and words align with situational expectations, but there are times when actors must explicitly draw connections in order to maintain or improve status. Teachers sometimes felt the need to make connectives early in the year (self-evaluations), during the course of the year (walk-throughs and formal observations), and at the end of the year (after receiving their final evaluation rating).
Self-evaluation
First, teachers could deliberately establish connectives when completing the annual self-evaluation. For example, one way that teachers could establish connectives between their performance and the demands of the rubric was by providing a surfeit of evidence when completing the annual self-evaluation. Mrs. Curtis at Waller explained, Putting in all the evidence and data and saying why you’re great and why you do your job takes quite a bit of time and [the principal is] in here twice a year and she only sees two [lessons]. . . . I put more examples and evidence in there than maybe I need to. But I feel like I need to because this evaluation process is “if you’re not highly effective then we can basically do whatever we want with you.” . . . So, I’m going to put everything in there and examples and you feel like you have to explain yourself.
Observations
Second, teachers could make connectives during the observation period, as teachers tried to formally establish their performance in the terms of the observational protocol. Mr. Trotter, a teacher with a high relative score at Waller, reported spending little time and effort on evaluation activities or maintaining the projection of his high performance. Nevertheless, in 2014, he acknowledged the need for teachers to help principals make connections between the teacher’s instruction and the descriptions of high performance in the rubric: As far as the [Framework for Teaching], what [I] do already pretty much fits into that mold. You know, it’s just creating sort of an awareness and drawing some examples of what you do so the administrators can see it.
End-of-year meeting
As the year drew to a close and evaluations were being finalized, the effort to make connectives peaked in intensity. Teachers reported making connectives at the end of the year during the final evaluation meeting as a way of negotiating for a higher score. Mrs. Hall explained that the end-of-year meeting was a time “when we get to explain to our administrator something that is not seen in the observation. [Ms. Shriver] takes it into consideration with evidence that you provide to get that final score.” In 2018, Mrs. Quincy explained how the same phenomenon worked at Middleton in greater detail:
So [at the end of the year] we can see how we’ve been evaluated [and] on what parts of the tool . . . we can see where he’s noticed those things and where he hasn’t and then we sit down and he says, “Ok. I had you at a 2 here . . . do you have any evidence as to why this should be higher?” So, yeah, we were able to say, “Well, you weren’t here on these days but this is what we do normally, etc.”
And then what happens? Does the score change?
Oh, yeah. Absolutely. He gave us a lot of upfront warning. Like, “These are negotiable. I know I’m not in there enough to see everything so if you feel like we need to discuss this further that’s what this meeting at the end of year will be. . . . This isn’t final.”
Importantly, connectives did not require that teachers actually perform in the ways specified in the rubric. Rather, connectives required teachers to explicitly establish the connection in the administrator’s mind. Unlike aligning preferences with projections, connectives were specific to the formal scoring of the teachers along the dimensions of their districts’ observational protocol. Also, teachers used connectives to complement their projections or when they did not think their projections would be enough to secure a high score, or both.
Using Frame Analysis to Understand the Consequences of Teacher Evaluation
As described above, concern over the accountability aspect of evaluation shaped teachers’ group perspective and, consequently, teachers acted and interacted in ways that were likely to earn them a high evaluation score but would not necessarily lead to their development and instructional improvement. This section provides evidence of the ultimate consequences of the accountability-focused group perspective and the actions and interactions that flowed from it in two areas: (1) collegial relations and social capital resources and (2) teacher learning and instructional improvement.
Collegial Relations and the Flow of Social Capital Resources
Teachers helped one another maintain or improve their status, at least to a moderate extent, by developing a common approach to self-evaluation, advising one another about principal preferences, alerting others about the timing of upcoming observations, and forming collective approaches about how best to complete evaluation activities. Collegial assistance was mostly restricted to help of this kind. However, evaluation also provided opportunities for more substantive help from specially situated teachers. Finally, evaluation pressures inhibited the sharing of instructional resources and advice, but, at the same time, teachers built a sense of solidarity. Any hard feelings surrounding evaluation seemed to be short-lived.
Substantive help from specially situated teachers
In rare instances, teachers could provide one another substantive help, particularly if they were well positioned to do so. For example, Mrs. Herman, a teacher and instructional coach, recalled that at the end of the 2017 school year teachers who had been rated poorly were desperate to improve their score and they came to her for advice: There were a couple of teachers who came to me after . . . they got their [evaluation rating] to say, “How should I respond to this?” I would help them reflect on their practice and help them then respond. So I had a teacher who didn’t understand what the administrator had written so we did a reflective conversation about her classroom practice and she realized that she was really utilizing the gradual release model. So I said, “So in your response to him, you need to name that for him.” I don’t always know that administrators recognize best practice even when they come in and do those observations. So I felt like my job was to help [teachers] name those practices that they did in the classroom and they could name them for the administration.
Mr. Bridges at Waller was the president of the teachers’ union in his school district and teachers often came to him with concerns about their evaluation rating and, like Mrs. Herman, he provided teachers advice about changing the principal’s perceptions through more effective connectives: I just tell people to reflect and figure out why [their evaluation] is that way and based on that how can you change the [principal’s] perception? It really is what it is because [principals] don’t observe teachers enough to really get a good feel for what [teachers] are doing. I think principals look at teacher attendance or look at whether [teachers are] leaving at the last bell and running out with the students and make a judgment on that. But they might miss how much they do at home. Or they might miss students working with them at lunch, so I say “you’ve got to make your whole thing visible.”
Help could extend beyond making connectives. For example, Mrs. Herman understood that teachers were primarily coming to her to help them improve their rating, but they could also learn something about improving their instruction: I wasn’t just coaching them to manipulate the system. I was also coaching them around their instruction and strategies that they might try and it even opened up the door for me to do model lessons in a few classrooms.
Further evidence suggests that Mrs. Herman’s help did extend beyond helping others make connectives. For instance, in 2014, Ms. Carroll was searching for ways to improve her evaluation rating (she had been laid off briefly the year before because of a poor evaluation) and she approached Mrs. Herman for help. Mrs. Herman suggested that the two team teach a class during one of Mrs. Herman’s “coaching” hours. Ms. Carroll said, “that class that we teach together is like on-the-job training for me. Like I am hungry to learn more.” Ms. Carroll also reported interacting with Mrs. Herman as much as she could through the day and she joined the school’s formative assessment learning team that Mrs. Herman led. Ms. Carroll reported finding professional renewal in this effort to closely align herself with and learn from Mrs. Herman and she acknowledged that evaluation provided the key initial pressure that motivated her to seek local resources.
Competitive pressure and the sharing of instructional advice
While teachers provided colleagues mostly moderate and sometimes significant resources in performing well on evaluation and helping them improve their instruction, the competitive pressure of evaluation also strained collegial relationships, particularly among distal colleagues. Specifically, teachers reported a much more restricted flow of ideas about good instruction as a consequence of evaluation. Mrs. Jackson believed that innovative teaching ideas could earn one a high evaluation score (at least at Waller in 2014) and that, consequently, teachers were less likely to provide instructional advice. She explained, Sharing good ideas and things that are working in a class will almost take away from [other teachers’] ability to show off for administration. So if everybody’s doing it, it’s not something new and unique that will make them stand out in an evaluation . . . you need to be better than your peers.
Ms. Stickle, also at Waller, agreed. Evaluation was primarily about improving one’s relative standing, and, thus, was an activity performed mostly alone: Your colleagues can only be so helpful. In the end you get to May and you’re in competition with them for your job. So you don’t want to share your best—because [your colleague] is going to be an [evaluation rating] that could be greater than yours.
This circumscribed sharing, however, affected mostly weaker relationships. Teachers reported being less likely to share with others outside their immediate close-knit group, but solidarity within close groups increased. In other words, evaluation pressures made groups (usually at the department level) not only more insulated from other groups but also more internally cohesive.
Competitive pressure and the building of solidarity
Teachers often joked about evaluation with their colleagues by holding the process in derision, particularly after the negotiations over evaluation ratings were finalized. For example, at Waller in 2014, many of the teachers on staff wrote their final ratings on a sheet of paper, taped the paper to their chest, and then showed up to the end-of-the-year staff party. Staff used it as a focal point of levity throughout the evening.
At Middleton, Mrs. Quincy explained that hard feelings from being rated low relative to one’s peers quickly subsided once teachers knew that their jobs for the following year were not in jeopardy. Evaluation pressures, then, increased general solidarity among teachers, who held a common perspective about evaluation and a common disaffection for it. For instance, Ms. Carroll described how making light of evaluation during casual conversations with colleagues made her feel more at ease despite her job being in jeopardy because of her low rating: I’ve always been a person that stresses and lacks confidence. Even my colleagues joking about this stuff and hearing more people say, ‘I hope your professionalism is good. What are you going to give the kids if their scores go up?’ I hear everybody say it and make me feel more relaxed.
In sum, then, accountability pressures influenced the group perspective in reference to which teachers’ self-interested behavior made sense to their colleagues. Teachers helped one another maintain their projections through developing a common approach to evaluation activities, but at the same time teachers restricted sharing instructional advice with all but their closest colleagues. In rare instances, the pressure of evaluation could compel a low-scoring teacher to seek out resources in her local environment and teachers could potentially learn a great deal from these opportunities.
Evaluation Policy, Teacher Learning, and Instructional Improvement
The teachers’ perspective that focused on obtaining a high relative score combined with how teachers attempted to maintain or improve their status through aligning their projections with principal preferences had definite consequences for teacher learning and instructional improvement. Teachers did not commingle their efforts to improve their instruction with efforts to obtain a high relative evaluation score, as Mrs. Quincy at Middleton explained, I don’t feel like the evaluation process necessarily makes me a better teacher. I feel like I have to do certain things that are expected of me at that time. It is almost like a dog and pony show that I feel like I have to put on. And then I go back to what I think really is important on a daily basis. All in all, I want to do well on the evaluation so I can keep my job. [Improving instruction and doing well on evaluation] are almost two separate things.
Mr. Bridges at Waller explained how he believed his colleagues were actually neglecting their teaching in their pursuit of a high evaluation score: I’ve seen too many teachers spend too much time and energy on the evaluation and none on their teaching. And some will say, “Well, it will show up on their evaluation.” No it won’t. If you only observe me twice and . . . you can change your behavior just because somebody’s sitting in your room or not.
Because teachers attended more to principal preferences than to the description of good teaching outlined in their observation rubrics, evaluation could pull teachers in two different directions, potentially causing them to lose sight of improving their instructional practice. Mrs. Monahan from Poe elaborated on this point, “[Teachers can focus on] the pieces in the evaluation process when they should be focusing on different parts of what’s going on in the classroom.” However, she acknowledged that even this meager effort “could be good for some teachers who aren’t doing what they’re supposed to be doing.”
Embedded in these quotes and dozens of others like them are several commonly held beliefs and practices: (1) high performance on evaluation does not require instructional excellence, (2) teachers can and do spend energy on securing high evaluation scores even if that means neglecting the actual work of improving their instruction, (3) neglecting one’s actual teaching will not negatively affect one’s evaluation score, and (4) teachers can easily make in-the-moment adjustments that will satisfy principals.
The primacy of accountability concerns did not forbid teacher learning and development, however. Indeed, the above account does not warrant the conclusion that some teachers could not improve their instruction as a result of new evaluation policy. Many teachers believed that good ideas about instruction were embedded in their district’s observation protocol and they reported reflecting on the observational protocol regularly and enacting practices described therein. For example, Mrs. Cunningham reported that Mr. Delancey rarely, if ever, came to her classroom and the feedback he provided in years past was unconnected to the demands of the observation protocol. Nevertheless, she used the observation protocol as a source of ideas for improving her instruction even though she acknowledged that Mr. Delancey was unlikely to notice what, for her, were substantive instructional improvements. Mrs. Reid, also at Poe, shared this sentiment. She focused on elements of the protocol that she felt were more easily assessed by evaluators and attempted to improve on those dimensions even though she felt no direct pressure from Mr. Delancey to do so. She explained, When we first got [the observation protocol], I started thinking “That is a really good idea. Why am I not doing that?” And then try to incorporate those things. And I focus on things that I know he’s going to easily look for or would be easy for them to check off. But then there are things that I know are impossible. You just can’t get kids to that point. Those are kind of, to me, unassessable. So I set those off to the side.
Other teachers suggested that evaluation added to their other efforts to improve their instruction, even if to an uncertain degree. For instance, Mrs. Curtis explained, I’ve gotten better so I think I would attribute it to [learning from the Framework for Teaching]. I’m sure that there are other things, too, though just growing every year and learning on your own as you go. But formative assessment has helped me, cooperative learning, we’ve been doing a lot of that . . . but, I am more aware of what I’m doing because of evaluation.
A common theme emerged among teachers who used the observation protocol to improve their instruction. Namely, improvement in this way was a private matter of reflection and enactment, rather than an integral part of formal evaluation activities or evaluation-centered interactions with the principal. Teachers were also divided on the point of how much they learned about instruction from evaluation, as the following exchange with Mr. St. Johns makes clear:
Do you think about the observational protocol often?
No, you would drive yourself crazy if you did.
Do you think about the protocol when planning or teaching a lesson?
No. It’s totally separate in my mind.
In sum, the consequences of teacher evaluation on teacher learning and instructional improvement (given the accountability-centered group perspective and efforts to maintain or improve evaluation scores) are mixed. In one sense, teachers had to spend time and energy matching their projections to the principal’s idiosyncratic preferences and could separate them from the work of actual instructional improvement. In another sense, however, even merely matching projections to preferences could elicit modest instructional improvements. Furthermore, a sizable minority of the focal teachers used the evaluation protocol to privately reflect on and improve their teaching.
Discussion
This study contributes to the growing literature on evaluation policy implementation by describing and explaining the importance of the teacher perspective, how this perspective shapes teachers’ actions and interactions, and with what consequences. In general, I argue that the teachers’ perspective and subsequent actions and interactions determine the quality of implementation and that the perspective–action–interaction link is durable across contexts. These findings ultimately suggest that the centrality of the principal and the importance of organizational context, highlighted in previous studies, are likely overstated.
In this section, I consider how the findings described herein integrate with other recent findings on teacher evaluation policies in two areas: (1) the role of the principal, the importance of accountability, and implications for teacher development and (2) the role of organizational contexts in teachers’ responses to evaluation reform.
Teacher Evaluation: Accountability and the Role of the Principal
Two findings emerged consistently during data collection and later analysis. Teachers formed a common perspective about evaluation that stressed its potential for accountability and they did so independent of their principals’ attitudes, beliefs, or practices. Contrary to what previous research would suggest (e.g., Donaldson & Woulfin, 2018; Reinhorn et al., 2017), the principals in this study did not frame the policy and impose their frames on the teachers. Rather, the teachers’ understanding of and perspective toward evaluation seemed to form from their previous experiences with more traditional evaluation. In any event, none of the teachers in the study looked to their principal to translate messages about evaluation or determine its meaning for them.
Furthermore, using frame analysis to explain the implementation of teacher evaluation policy suggests that teachers’ primary concern when being evaluated was on the accountability aspect of evaluation. Teachers were mostly concerned with earning a high relative evaluation score and securing their long-term employment. With this perspective in mind, teachers directed their effort in accordance with how they understood their own particular circumstance in terms of their relative standing and local labor market conditions. Teachers with low relative ratings and those with high relative ratings but who were uncertain about the permanence of their position projected themselves through a careful control of information via restriction, alignment, and connectives. Because teachers believed that evaluations were biased by principal preferences that were largely unconnected to the evaluation protocol, teachers directed their energy to uncovering these preferences and then aligning their projections accordingly. When these projections were insufficient, teachers became more directive in making connections between their performance and the standards of high performance as described in the evaluation protocol.
The way that teachers attempted to maintain or improve their status had two consequences for collegial sharing and teacher learning. First, teachers responded to competitive pressure by providing colleagues modest help through crafting a collective approach to evaluation activities, notifying one another about principal preferences, and alerting their colleagues about the timing of (supposedly unannounced) formal observations. However, at the same time, they also restricted the sharing of instructional advice with all but their closest colleagues. In addition, as the effort to maintain or improve their status stressed some teachers at the end of the school year, they looked to especially well-situated colleagues (e.g., union president, instructional coach) who could provide more substantive help. Finally, once the evaluations were finalized, a general solidarity emerged among teachers and negative feelings toward colleagues diminished, at least until the following year.
Second, because teachers believed that attaining a high evaluation rating was primarily subject to aligning to idiosyncratic principal preferences, evaluation became about eliciting principal preferences and projecting oneself in alignment with these preferences through modest accommodations to practice rather than instructional improvement, per se. Nevertheless, even modest accommodations may have been an improvement over extant practices. Furthermore, evaluation seemed to apply the pressure needed for some teachers to seek out resources in their local environment, potentially leading to substantive improvements. Teachers could also use the observational protocol as a tool for improving their practice by privately mining the protocol for ideas about instruction and enacting practices described therein.
Organizational Contexts and Responses to Teacher Evaluation
Marsh et al. (2017) concluded that schools responded differently to teacher evaluation policy and these differences could best be captured by three, nonmutually exclusive, types: distortive, compliant, and reflective. Marsh and colleagues then argued that differences in organizational contexts across their sample schools explained differences in school responses. Specifically, when leadership was distributed across several people in a school (i.e., teachers were enlisted to conduct observations and offer feedback) and when the principal engaged teachers frequently about issues of teaching and learning, schools were likely to respond reflectively to teacher evaluation. Furthermore, when schools “scheduled purposeful and consistent time for teachers to meet” (Marsh et al., 2017, p. 23) they were more likely to respond reflectively.
The evidence from my frame analysis suggests a new understanding of teacher responses to evaluation policy. Namely, rather than being collectively determined at the school level by a set of organizational conditions, responses to evaluation emerged from a common group perspective and one’s understanding of her specific situation. As we have seen, teachers in similar relative positions across organizational contexts responded to evaluation in similar ways regardless of how the school distributed leadership (Waller) or provided time for teachers to collaborate (all three schools). Teachers with high relative ratings in favorable teacher labor markets (Mr. Bridges, Mr. Trotter, Mrs. Hall) tended to respond compliantly, merely going through the motions of evaluation. Highly rated teachers in less favorable teacher labor markets (Mrs. Monahan, Mr. St. Johns, Mrs. Herman) also responded compliantly so long as they felt their relative status was secure. For teachers who were high scoring but felt vulnerable to regression in the future and those who scored lower than average, neither compliance nor reflection would suffice. These teachers adopted a primarily distortive response to help them match their projections with the preferences of the principal and they made accommodations strictly to satisfy these preferences, even when they had to act contrary to their own beliefs about best practices. Poe, Waller, and Middleton, then, could not be accurately labeled as distortive, compliant, or reflective, as all three schools had distortive, compliant, and reflective responses within them.
Even categorizing the “primary leaning” of individual teachers into the distortive, compliant, and reflective types is difficult. Mrs. Herman, a part-time instructional coach and teacher, used teacher evaluation to increase her influence in teachers’ classrooms when they sought her out to help them dispute their evaluation ratings. In at least one case, Mrs. Herman helped provoke a reflective response even as she adopted a compliant response to her own personal evaluation. For her part, Ms. Carroll mixed reflection with distortion, as she still tried to match her projections with administrator preferences in the midst of her more substantive efforts to improve. Other teachers (e.g., Mrs. Reid, Mrs. Cunningham, Ms. Stickle) reflected on the observation protocol privately, but publicly distorted their performance in order to perform well on evaluation. Finally, it should be noted that teachers adopted distortive responses as the need arose, but on any given day, teachers were more or less compliant. That is, teachers projected themselves in alignment with administrator preferences as needed (e.g., a principal conducted a walk-through observation) and then they quickly went back to acting as they more normally would. With this in mind, the majority of teachers could most accurately be characterized as compliant with punctuated distortion.
Implications for Policy and Practice
This account of teacher evaluation reform implementation suggests that evaluation as a source for teacher learning and instructional improvement is quite limited (Murphy et al., 2013). The accountability focus regarding evaluation is long-standing and embedded in the teacher perspective from which actions and interactions flow. Specifically, evaluation becomes about securing a high score through modest alterations to instructional practices, if any. Nevertheless, teacher development and instructional improvement do sometimes occur in special situations.
Namely, this study suggests that the pressure of evaluation combined with access to local resources can result in considerable teacher development and instructional improvement. States and districts should consider coupling pressure with provision of resources including mentoring, coaching, and professional development for those teachers who are identified as needing the most intense help.
Conclusion
In sum, using frame analysis to analyze the implementation of teacher evaluation policy reveals a complex series of apparent contradictions that likely reflects evaluation’s dual purpose of accountability and development. In some ways, evaluation hindered sharing of instructional advice and aroused competition and suspicion among colleagues. In other ways, however, evaluation provided the impetus for some teachers to engage in the difficult work of substantive improvement, strengthened close relationships, and bolstered a general solidarity among colleagues. Additionally, teachers generally kept evaluation separate from the work of instructional improvement, attending instead to administrator preferences of uncertain value. At the same time, however, evaluation also provided teachers tools from which they could learn and improve.
