Abstract
Using five AERA presidential addresses over the past half century as landmarks, this essay traces the evolution of research on teaching and teacher education as well as some critical impacts the research has had on policy and practice related to teacher education and teacher evaluation in the United States. The discussion shows how these addresses both reflected the progress and challenges of research on teaching and teacher education at the times they were delivered and identified paths that the education research community could take to address the challenges. It traces key influences of these lines of work on the quality of teacher preparation, assessment of teaching effectiveness, and competing conceptions of teacher accountability. It ends with a discussion of the role of politics in setting educational policy, a call for education researchers to become more knowledgeable about and more capable of engaging in political and policy arenas productively, and a reminder that public scholarship is the goal of the 2016 AERA Annual Meeting and the topic of the next presidential address.
A half century ago, in his 1964 address to the Associated Organizations for Teacher Education, Nate Gage—known to many as the father of research on teaching—made an impassioned plea for research on and for teacher education (Gage, 1964). On the heels of James Bryant Conant’s (1963) review of teacher education, published less than a year earlier, Gage sought to shed light on how to strengthen the enterprise that Conant had so roundly critiqued. Harkening back to the Flexner Report that transformed medicine, Gage asserted that the practice of teacher education was generally stronger than that of medical education had been at the time of that report but that the scientific knowledge base to inform teacher education was much thinner.
By this he meant both research about effective teaching that could inform the content of teacher education (research for teacher education) and research about the results of different approaches to recruiting and preparing teachers that could inform how teachers are educated (research on teacher education).
The lack of research was not, in his view, for want of interest: Gage (1964) noted from his review, “Over the years, research on teacher education has at least been yearned for, even if too little of it has been done” (p. 1). At the same time, Gage noted that many dismissed the cause of knowledge for teaching, convinced that educational know-how was really nothing more than common sense. To make his point, he outlined a number of commonsense findings from education research; for example, that a teacher should subtract pretest scores from posttest scores in a given domain to find out how much children have learned from instruction and that generally, doing so will show that brighter children have gained more than others from that instruction. In addition, to strengthen a kind of behavior, it should be rewarded, and to eliminate a kind of behavior, it should be punished. He then went on to note that each of these obvious findings was, in fact, the exact opposite of what education research had found, making the case that what teachers need to learn in order to be effective is often far from intuitive.
Recognizing that new approaches to major problems typically bundle many changes into one reform, Gage called for new methodologies beyond the then popular correlations of discrete teaching behaviors with outcomes of interest. He noted that if experiments could be launched, holistic analysis of new preparation program models and interrupted time series designs that track outcomes before and after major changes could provide traction on critical problems like how to train and retain teachers for high need areas, an agenda that remains salient today.
Gage noted, however, that ideological partisanship was a major barrier to progress, wistfully quoting Flexner (1910), who had argued that, “Modern medicine . . . does not cure defects of knowledge by partisan heat; it is free of dogmatism and open-armed to demonstration from whatever quarter” (p. 53). By contrast, he doubted Conant’s (1963) call to action would accomplish as much change given the ideological intensity and disrespect for scientific methods prevalent in the context for education.
It is not clear to me from my own historical reading that scientific knowledge for medicine was as strong as Gage believed or that medical research was as nonpartisan. Indeed, the roles of the Carnegie and Rockefeller Foundations—including that of Abraham Flexner’s brother, Simon (head of the Rockefeller Institute for Medical Research)—in exerting significant influence over the professional accrediting body that took up the crusade of changing medical education were likely at least as important as the quality of the knowledge base.
However, Gage’s point about the disrespect for scientific knowledge regarding teaching and teacher education does ring true. That theme has echoed through the decades since. Even as progress has been made, new knowledge has frequently been ignored, misinterpreted, or misused—sometimes by teacher educators and more often by policymakers—with the result that the discourse and debates about teacher education today eerily resemble those of a half century ago.
The Growth of Knowledge for Teaching
Gage’s plea for investments in research for teacher education was heard. During the 1970s, when the National Institute of Education was in its heyday and education research received more federal funding than it ever has before or since, there was considerable effort devoted to understanding what kinds of teaching actions produced strong learning outcomes. A generation of “process-product” research, as it was called, used experimental, quasi-experimental, and correlational designs to uncover generalizations about what teaching behaviors were associated with what kinds of student outcomes—many of them measured in student pre- to posttest gains.
More ethnographic approaches, designed to understand teachers’ thinking and decision making, were added to the portfolio of research, led in part by the Institute for Research on Teaching at Michigan State University. A leader in that work was Lee Shulman, whose 1985 presidential address, “Those Who Understand: Knowledge Growth in Teaching” (Shulman, 1986), made the case both for more nuanced understandings of teachers’ decision making and for their application in specific content domains. Pedagogical content knowledge, the merger of knowledge about content and knowledge about pedagogy, had been absent both from much of teacher education—where generic rather than subject-specific teaching methods were taught in many preparation programs—and from nearly all of teacher evaluation, which had been guided by tallies of teaching behaviors derived largely from correlational research.
Shulman (1986) noted in his address that directly translating research to practice or policy is necessarily fraught with peril by the very nature of the enterprise:
To conduct a piece of research, scholars must necessarily narrow their scope, focus their view, and formulate a question far less complex than the form in which the world presents itself in practice. This holds for any piece of research; there are no exceptions. It is certainly true of the corpus of research on teaching effectiveness that serves as the basis for these contemporary approaches to teacher evaluation. (p. 6)
Having pronounced that early body of research on teacher effectiveness largely successful on its own terms, Shulman argued that its application was, nonetheless, problematic:
Research programs that arose in response to the dominance of process-product work accepted its definition of the problem and continued to treat teaching more or less generically, or at least as if the content of instruction were relatively unimportant. . . . In reading the literature of research on teaching, it is clear that central questions are unasked. The emphasis is on how teachers manage their classrooms, organize activities, allocate time and turns, structure assignments, ascribe praise and blame, formulate the levels of their questions, plan lessons, and judge general student understanding. What we miss are questions about the content of the lessons taught, the questions asked, and the explanations offered. From the perspectives of teacher development and teacher education, a host of questions arise. Where do teacher explanations come from? How do teachers decide what to teach, how to represent it, how to question students about it and how to deal with problems of misunderstanding? (p. 8)
In my own presidential address a decade later (Darling-Hammond, 1996), I noted that simplistic applications of the process-product research proved, in some cases, to be dangerous as a guide for policy, as, for example, when policymakers used lists of teaching behaviors as the basis for mandates. During the 1980s, “research-based” teacher evaluation instruments and teacher education requirements in many states had enforced a set of uniform teaching behaviors (often trivial but easy to measure, such as “keeps a brisk pace of instruction,” “manages routines,” and “writes behavioral objectives”) without regard to subject matter, curriculum, or student learning. These policy tools often had the bizarre effect of promoting teaching that was insensitive to learning while undermining successful, innovative teaching.
I remember how a school board member in Arizona once proudly confided to me that her board had just adopted a new “research-based” teacher evaluation scheme that had led them to fire one of the district’s most popular teachers—widely requested by parents and esteemed by colleagues—because he did not use the seven-step lesson plan required by the instrument. Across the country, Florida’s 1986 Teacher of the Year (also a runner-up in NASA’s Teacher in Space program) found that he could not pass review for a merit pay award according to Florida’s Performance Measurement System (FPMS), another “research-based” checklist, because his principal could not find enough of the required teaching behaviors to check off during the laboratory lesson he observed. Furthermore, the form required that the teacher be marked down for answering a question with a question, a practice forbidden by the FPMS, though popular with Socrates and some other well-respected teachers. I also noted that the FPMS approach to teaching was distinctly ill-suited to the development of students’ critical thinking abilities and out of synch with the then emerging research on student cognition.
Building on the concerns Shulman had raised, David Berliner’s (1986) presidential address in the following year also pointed out that simplistic efforts to apply behavioral research missed the heart of teacher decision making: That is, knowing what to do (e.g., homework reviews at the beginning of mathematics class) did not convey how to do it in a way that produced productive learning for the teacher and for students. He argued for studying expert teachers and contrasting their reasoning with that of novice teachers. This could enable researchers to understand how teachers collect information and reason about their next steps throughout the teaching process—and how this thinking guides future decisions—in ways that inform a more sophisticated approach to teacher education and teacher evaluation. Berliner noted as an example:
We need to find out how some teachers use opening homework reviews to serve multiple purposes, such as informing themselves about the difficulty of the assignment, identifying the students who are not prepared, are having trouble, or who are breezing through and could easily become bored. We also need to find out how the pace and nature of subsequent instruction is affected by the information collected. (p. 5)
Expert teachers, Berliner and his colleagues demonstrated, are continually learning about their students’ conceptions, misconceptions, and needs rather than just implementing routines without regard to student learning. This process is informed both by pedagogical content knowledge, described by Shulman, and by the practical knowledge teachers gain from their classrooms—a form of pedagogical learner knowledge (Grimmett & Mackinnon, 1992). Berliner’s work showed how the capacities of expert, experienced teachers operate and how these same abilities require development in the case of novices (beginners just out of teacher education programs) and “postulants” (veterans in the content field who have not been taught how to teach).
Careful work over several decades on teacher thinking and decision making (Clark, 1983), comparisons of expert and novice teachers (Berliner, 1986), studies of the nature of pedagogical content knowledge (Shulman, 1986), teachers’ practical and learner knowledge (Clandinin, 1986; Grimmett & Mackinnon, 1992), and studies of culturally responsive teaching (Banks & Banks, 1995; Cochran-Smith, 1997; Ladson-Billings, 1994; Lee, 1995) began to build a rich case knowledge of teaching that examined teaching actions and decisions in different contexts and for diverse learners. As these studies accumulated, they allowed for generalization not by ignoring context but by building a body of case knowledge that could be interpreted across contexts.
Connecting Teaching to Learning
Two critical advances in research on teaching and teacher education occurred in the 1990s and early 2000s. One was the initial codification of an increasingly well-defined body of research about what some have called “meaningful learning” or “learning for understanding.” Extending beyond the rote or algorithmic learning that enables recall and the solution of known problems, this kind of learning enables transfer and application of knowledge to new situations or problems. The publication of the National Research Council’s How People Learn (Bransford, Brown, & Cocking, 1999) was perhaps the most widely available summary of the accumulated research on the kinds of learning that produce what the authors termed adaptive expertise.
The other advance was the explicit connection of the study of learning to the study of teaching—with the goal of understanding how the two interact and how teaching might purposefully develop this more sophisticated kind of learning. Because the fields of psychology, on the one hand, and curriculum and instruction, on the other, were located in different parts of the academy and focused on different sites for scholarship, there was a divide between the two that had typically separated concerns about the kind and quality of learning from those about the kind and quality of teaching.
In 1994, in her presidential address titled “The Advancement of Learning,” Ann Brown (1994) described the last century’s research that had led to this deeper understanding of human learning. Two years later, I argued in my address (Darling-Hammond, 1996) that the problem of the next century would be “the advancement of teaching,” and its resolution would depend on our ability to develop knowledge for teaching that can support more complex, strategic learning—a kind of teaching that goes far beyond dispensing information, giving a test, and giving a grade. This challenge would require understanding of how to teach and organize schools in ways that respond to students’ diverse approaches to learning, are structured to take advantage of students’ unique starting points, and enable work aimed at more complex performances. Building on cognitive research examining the interaction between students’ needs and teaching strategies, former AERA president Robert Glaser (1990) had argued that 21st-century schools must shift from a selective mode—“characterized by minimal variation in the conditions for learning” in which “a narrow range of instructional options and a limited number of ways to succeed are available”—to an adaptive mode in which “the educational environment can provide for a range of opportunities for success. Modes of teaching are adjusted to individuals—their backgrounds, talents, interests, and the nature of past performance” (pp. 16–17). He suggested that adaptive education should focus on developing the potential of each individual to a high extent, a critical mission for a pluralistic society with increasing needs for talent development. Such powerful teaching and learning would require schools that value and evaluate serious intellectual performances, support responsive teaching, and allow teachers to build strong, long-term relationships with students and their parents.
While Shulman and his colleagues at Stanford investigated pedagogical reasoning for such serious intellectual performances by developing teaching cases deriving from conceptual problems that arise in specific subject areas, my colleagues and I at Teachers College were developing similar cases deriving from challenges that arise in teaching diverse learners (Darling-Hammond, Ancess, & Falk, 1995; Macdonald, 1995). We examined how expert teachers consciously seek to draw upon students’ experiences, interests, and approaches to learning as they also aim to teach for understanding and highly developed performances. Our studies focused on teachers in urban schools that include a wide range of language, cultural, economic, and family diversity and in classrooms that are heterogeneously grouped in terms of prior academic achievement.
These cases and others developed by many researchers (Ladson-Billings, 1994; Lee, 2007) suggest that teachers who succeed at developing deep understanding of challenging subjects for an array of students, including those traditionally thought to be “at risk,” exhibit some common practices:
They develop engaging tasks that give students meaningful work to do—projects and performances that tap modes of disciplinary inquiry: doing historical research; engaging in literary analysis; writing and publishing poetry, stories, and newspapers; investigating scientific questions and developing mathematical models applied to real-world tasks.
They design tasks to allow students choices and different entry points into the work. This helps motivate effort and allows students to build on their strengths and interests as they reach for new and more difficult understandings and skills.
They develop “two-way pedagogies” to find out what students are thinking, puzzling over, feeling, and struggling with. The tools of these pedagogies include student presentations, skillful discussions, journals and learning logs, debriefings, interviews, and conferences. Teachers consciously develop pedagogical knowledge about the specific learners in their classroom—including their funds of knowledge (Moll & Gonzalez, 2004)—to add to their knowledge about learning generally.
They constantly assess students to identify their strengths and learning approaches as well as their needs and to examine the effects of different instructional efforts. They understand assessment as a measure of their teaching as well as a measure of student learning. They publicly point to students’ different strengths and accomplishments, creating a platform for legitimation and growth for each student in the classroom.
They scaffold a process of successive conversations, steps, and learning experiences that take students from their very different starting points to a proficient performance—including many opportunities for approximation and practice, debriefing and conversing, sharing work in progress, and continual revision.
They help develop student confidence, motivation, and effort and assure that students feel connected and capable in school. Their strategies for supporting learning extend beyond technical teaching techniques. They practice what John Dewey called “manner” as method: Their commitment to student learning and success supports students in the risky quest for knowledge.
In combination, these various lines of research on teaching have illustrated how teachers use knowledge about learners and learning, subject matter, curriculum, and teaching; how they construct pedagogical content knowledge and pedagogical learner knowledge in ways that ultimately meet at the intersection of subjects and students; and how they vary their practice in different contexts, depending on their instructional goals, the demands of challenging content, and the needs of particular students and classes.
Applications of Research to Practice and Policy
The combination of these strands of research informed the efforts to professionalize teaching that were launched in the mid-1980s with the report of the Carnegie Task Force on Teaching as a Profession, the Holmes Group (1986), and the founding of the National Board for Professional Teaching Standards (NBPTS). A set of policy initiatives was launched to design professional standards, strengthen teacher education and certification, increase investments in induction mentoring and professional development, and transform roles for teachers (see e.g., National Commission on Teaching and America’s Future [NCTAF], 1996).
Reforms of teacher education were built on the American Association of Colleges for Teacher Education’s seminal effort to organize research on teaching for teacher educators, the Knowledge Base for the Beginning Teacher (AACTE, 1989), which was followed by the Teacher Educator’s Handbook in 1996 (AACTE, 1996). The National Board for Professional Teaching Standards built on research about learning and teaching in developing standards articulating what expert teachers should know and be able to do. Additionally, the Interstate New Teacher Assessment and Support Consortium (INTASC), a consortium of state education agencies and higher education institutions, developed model standards and assessments for licensing beginning teachers that rest on the same body of research.
These standards have become widespread. The INTASC standards were adopted by more than 40 states and integrated into licensing and accreditation standards for candidates and programs. They were also incorporated into the teacher education accreditation standards of the National Council for Accreditation of Teacher Education (now the Council for the Accreditation of Educator Preparation). Most teacher education institutions reported they had used these national and state standards as the foundation for their program designs and for teacher education outcome measures (Salzman, Denner, & Harris, 2002).
The standards developed by the National Board and INTASC incorporate knowledge about teaching and learning that views teaching as complex, contingent on students’ needs and instructional goals, and reciprocal—that is, continually shaped and reshaped by students’ responses to learning events. The standards take into explicit account the need for teachers to respond to a student body that is multicultural and multilingual and that includes diverse approaches to learning. The standards further define teaching as a collegial, professional activity that responds to considerations of subjects and students. By examining teaching in the light of learning, they put considerations of effectiveness at the center of practice. This view contrasts with that of the previous “technicist” era of teacher training and evaluation, in which teaching was seen as the implementation of set routines and formulas for behavior, unresponsive to the distinctive attributes of either clients or curriculum goals.
The adoption of these standards, coupled with development work led by Lee Shulman and colleagues in the Stanford Teacher Assessment Project, resulted in the design of the new portfolio approach to teacher assessment adopted by NBPTS. This assessment examines expert teaching practice within content areas by examining artifacts of teachers’ planning and teaching and their students’ learning, supported by the teachers’ commentary about their decisions. The assessment reveals not only what teachers do but why they do it, because it elicits teachers’ reasoning while examining their actions. Furthermore, the Board’s portfolio elements are grounded in classroom activities that engage students in knowledge construction through disciplinary inquiry—scientific investigation, mathematical reasoning and communication, interpretation of literary ideas—that are increasingly important in the emerging knowledge-based economy. The Board began assessing teachers against standards of accomplished practice in the early 1990s and has by now certified more than 100,000 teachers against these high standards.
Another important initiative in the quest to codify the knowledge base for teaching and teacher education was undertaken by the National Academy of Education (NAE) through its Committee on Teacher Education. In 2005, at the urging of the American Federation of Teachers, NAE pulled together a panel of scholars in the field of teaching and teacher education to seek to apply what was then known about effective teaching and teacher education to the practical problem of constructing a curriculum for preparing teachers. Building on the National Research Council’s summary of research about How People Learn (Bransford, Brown, & Cocking, 1999), the NAE panel examined research on teaching that supports the learning of higher-order skills and “adaptive expertise” outlined in How People Learn and in turn the research on teacher education that supports that kind of teaching.
Its report, Preparing Teachers for a Changing World: What Teachers Should Learn and Be Able To Do (Darling-Hammond & Bransford, 2005), was able to rely on a very substantial body of research on teaching that had accumulated in the 40 years since Gage’s speech but a much leaner body of research on how to support teachers in learning about how to teach. The Committee’s recommendations were informed by the professional standard setting initiatives described earlier and by seminal research compilations, such as the several Handbooks of Research on Teaching, sponsored by the American Educational Research Association, and the Handbooks of Research on Teacher Education, sponsored by the Association of Teacher Educators. These compilations helped to develop conceptual frameworks for synthesizing knowledge about learning, teaching, and the learning of teachers. The volume was later used by many dozens of universities in redesigning their programs.
Research on Outcomes
Starting in the late 1980s, the blizzard of efforts to ground teacher education in a stronger knowledge base resulted in improved perceptions of the quality of teacher preparation by candidates and others. Starting in the 1990s, surveys of new teacher education graduates found that more than 80% felt that they were well prepared for nearly all of the challenges of their work (California State University, 2002a, 2002b; Gray et al., 1993; Howey & Zimpher, 1993; Kentucky Institute for Education Research, 1997). A somewhat smaller majority (60% to 70%) reported feeling prepared to deal with the more complex needs of special education students and those with limited English proficiency, no doubt reflecting the efforts of preparation institutions that were leading the field in these areas.
Veteran teachers and principals—especially those working with the five-year programs and professional development school models developed by the Holmes Group of education deans—reported perceptions that their newly trained colleagues were much better prepared than they had been years earlier (for reviews, see Darling-Hammond, 2000a; Darling-Hammond & Bransford, 2005).
Various lines of research looking at teacher effectiveness have suggested that many kinds of teacher knowledge and experiences may contribute to teacher effects, including teachers’ general academic and verbal ability, subject matter knowledge, knowledge about teaching and learning, teaching experience, and the set of qualifications measured by teacher certification, which typically includes the preceding factors and others (for reviews, see Darling-Hammond, 2000a; Rice, 2003; Wilson, Floden, & Ferrini-Mundy, 2001). Equally important is the body of research finding that traits like adaptability and flexibility are also important to teacher effectiveness (for a review, see Schalock, 1979) as these signal the adaptive nature of teaching that is essential to its success.
Several large-scale studies examining teacher qualifications found that fully prepared teachers were more effective than those who entered without preparation (Boyd, Grossman, Lankford, Loeb, & Wyckoff, 2006; Clotfelter, Ladd, & Vigdor, 2007; Darling-Hammond, Holtzman, Gatlin, & Heilig, 2005). Meanwhile, smaller-scale studies began to isolate positive effects of particular innovations, such as coherent and carefully integrated coursework and clinical work, and training within professional development schools (for a summary, see Darling-Hammond & Bransford, 2005).
Another productive line of work has examined programs producing stronger outcomes for teachers and has identified a number of features that appear to be associated with these results, including opportunities to learn about content and content-specific teaching methods, a focus on helping candidates learn specific practices that they apply in classrooms where they are practice teaching alongside their coursework, carefully designed student teaching experiences, opportunities to study and develop curriculum, and performance assessments that evaluate teachers’ work with students (Boyd et al., 2008; Darling-Hammond, 2006).
In the past 15 years, a number of studies have found that the National Board assessment process identifies teachers who are more effective in raising student achievement than others who failed to achieve certification (for a review, see National Research Council, 2008). Equally important, many studies have found that teachers’ participation in the National Board process supports their professional learning and stimulates changes in their practice. Teachers note that the process of analyzing their own and their students’ work in light of standards enhances their abilities to assess student learning and evaluate the effects of their own actions while causing them to adopt new practices that are called for in the assessments (Athanases, 1994; NBPTS, 2001). Teachers report significant improvements in their knowledge and performance in each area assessed—planning, designing, and delivering instruction; managing the classroom; diagnosing and evaluating student learning; using subject matter knowledge; and participating in a learning community—and observational studies have documented that these changes do indeed occur (Lustick & Sykes, 2006; Sato, Wei, & Darling-Hammond, 2008).
Recently, similar performance assessments have been developed for evaluating the effectiveness of beginning teachers at the end of their preservice programs or in their initial teaching year. The assessments require teachers to document their plans and teaching for a unit of instruction, videotape and analyze lessons, and collect and evaluate evidence of student learning. As with the National Board assessments, beginning teachers’ ratings on both the Connecticut BEST assessment and the Performance Assessment for California Teachers (PACT) were found to significantly predict their students’ value-added achievement on the state reading tests (Newton, 2010; Wilson, Hallam, Moss, & Pecheone, 2007). These performance assessments also help beginning teachers improve their practice in ways that continue after the assessment experience has ended (Chung, 2008; Darling-Hammond, Newton, & Wei, 2013).
Politics, Policy, and Teacher Education
Despite a growing knowledge base about the development and assessment of teaching and teacher education and some promising directions in practice, policy moves have been widely disparate. While efforts were underway to better infuse knowledge for teaching into teachers’ preparation and development opportunities, a competing agenda was introduced to replace the traditional elements of professions—formal preparation, licensure, certification, and accreditation—with market mechanisms that would allow more open entry to teaching and greater ease of termination through elimination of tenure and greater power in the hands of districts to hire and fire teachers with fewer constraints (see e.g., Thomas B. Fordham Foundation, 1999). Advocates of this perspective have argued that teaching does not require highly specialized knowledge and skill and that such skills as there are can be learned largely on the job after teachers have been employed (e.g., Walsh, 2001).
Particularly contentious has been the debate about whether teacher preparation and certification are related to teacher effectiveness. For example, in his Annual Report on Teacher Quality (U.S. Department of Education, 2002), the U.S. Secretary of Education Rod Paige argued for the redefinition of teacher qualifications to include little specific preparation for teaching. Stating that current teacher certification systems are “broken” and that they impose “burdensome requirements” for education coursework comprising “the bulk of current teacher certification regimes” (p. 8), the report suggested that certification should be redefined to emphasize verbal ability and content knowledge and to de-emphasize requirements for education coursework, making student teaching and attendance at schools of education optional, and eliminating “other bureaucratic hurdles” (p. 19). Other commentators have also argued that certification of teachers should be abandoned by states in order to remove “regulatory barriers” to teaching (see e.g., Walsh, 2001). These arguments have been prominent in advocacy for alternative certification.
While debates about the value of teachers’ preparation and experience have often been conducted around technical analyses of studies on the topic (see e.g., Ballou & Podgursky, 1999; Darling-Hammond, 2000b, 2002; Darling-Hammond & Youngs, 2002; Walsh, 2001), they have strong social, political, and economic implications. For example, evidence on inequities in the distribution of fully qualified teachers has been prominent in a large number of school finance lawsuits (for an overview, see Darling-Hammond, 2010), and significant costs could be associated with creating the salaries, working conditions, and other incentives needed to supply qualified teachers to all communities.
In her presidential address on “The New Teacher Education,” Marilyn Cochran-Smith (2005) argued that one should not be surprised by these competing approaches: “Teaching and teacher education are inherently and unavoidably political in that they involve the negotiation of conflicting values about the purposes, roles, and content of schooling” (p. 3).
Indeed, Cochran-Smith suggested, “It may be useful to think of teacher education as consisting of plural universes wherein multiple and sometimes even contradictory reforms proceed simultaneously while other aspects of teacher education remain unchanged” (p. 4).
In what she called the “new teacher education,” positioned as a public policy problem, a variety of issues have been framed as forced choices for teacher preparation: “These include the conflict between diversification and selectivity of the teacher workforce, the valorization of subject matter at the expense of pedagogy, the competition between university and multiple other locations as the site for teacher preparation, and the contradictions of simultaneous regulation and deregulation” (Cochran-Smith, 2005, p. 4).
In this context, research has often become a weapon wielded to advance competing views of macro-level policy moves rather than a tool to inform the learning process for prospective teachers. One example has been the set of warring studies on the effectiveness of alternative routes to certification (see e.g., Darling-Hammond, Berry, & Thoreson, 2001; Darling-Hammond et al., 2005; Fetler, 1999; Goldhaber & Brewer, 2000; Raymond, Fletcher, & Luque, 2001). Though relevant to policy debates, most of this work has shed little light on the characteristics of programs—regardless of the pathway label attached to them—that influence teachers’ knowledge, skills, dispositions, and effectiveness in a wide range of contexts.
Another is the debate about whether teacher effectiveness should be judged based on student test scores. Federal incentives under Race to the Top and the Elementary and Secondary Education Act (ESEA) “flexibility” waivers required participating states to evaluate teachers by calculating gains in student test scores as the basis for making evaluation decisions about individual teachers. The U.S. Department of Education has also proposed to evaluate preparation programs by using value-added test measures for the students of teacher education graduates.
While these policy incentives were creating widespread changes in practice, dozens of studies were published demonstrating that value-added measures of individual teachers’ presumed “effectiveness” are highly unstable, have extremely wide error ranges, and exhibit bias against teachers with classrooms with very high-achieving or low-achieving students (for summaries, see Baker et al., 2010; Darling-Hammond, 2015; Haertel, 2013). These challenges are outlined in a special issue of Educational Researcher (March 2015) and in an AERA statement on the use of value-added models (VAMs) issued in November 2015. A similar statement from the American Statistical Association (2014) flags the risks of test-based teacher evaluation of this kind: “VAMs typically measure correlation, not causation: Effects—positive or negative—attributed to a teacher may actually be caused by other factors that are not captured in the model. . . . Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM scores can have unintended consequences that reduce quality” (p. 2).
The challenges states encountered in implementing this federally leveraged policy led to significant pushback in Congress. In unprecedented language, the recently passed reauthorization of ESEA (the Every Student Succeeds Act, signed into law on December 10, 2015) prohibits the Secretary of Education from prescribing any specific methods for teacher evaluation. Many states have already made it clear that they will rethink the test-based evaluation systems they had begun to implement. This move will likely undo—or at least complicate—the requirement for test-based evaluation of teacher education in regulations the Department had planned to promulgate.
Developing Effective Teaching and Teacher Education
The use of test-based teacher evaluation for individual personnel decisions is likely to be contested in both research and practice for some time, given that it was planted in more than 30 states over the past decade and new studies about its limitations are published nearly every week.
Although value-added metrics are problematic when used with tiny samples of students to try to draw inferences about the effects of individual teachers, value-added methods can be valuable when used in large-scale studies to examine associations between teacher characteristics or program strategies and outcomes. Indeed, many of the studies I earlier described in discussing teaching and teacher education outcomes have relied on value-added methods of analysis. In these cases, the usefulness of the analyses and the appropriateness of the inferences are enhanced by much larger and more comprehensive samples, the use of more sophisticated methods, the lack of high-stakes consequences (which corrupt measurement), and a level of humility in interpreting the meaning of the findings. At the same time, the shortcomings of most American tests for evaluating the kind of learning that supports critical thinking and problem solving place limitations on what we can learn about powerful teaching from studies that use the narrow achievement measures that are readily available (Darling-Hammond, 2010; Haertel, 2013).
As Cochran-Smith (2005) pointed out in her address, we may not want to return to the day when evidence about outcomes was irrelevant. Quoting Evertson, Hawley, and Zlotnik (1985), who observed that most proposals for teacher education reform were then “unburdened by evidence that the suggested changes [would] make a difference” in quality of teachers (p. 2), she suggested that it is important to continue pursuing data about outcomes.
At the same time, as we learned in the early days of research on teaching, when findings were oversimplified and thoughtlessly translated into policy, it is critically important that evidence be meaningful and that it be used judiciously, with sophistication and nuance. This body of evidence may include quantitative studies of the outcomes of particular approaches as well as more in-depth mixed methods studies tapping surveys that provide student/graduate/employer feedback, observations of candidates’ student teaching and later classroom practice, collection of deeper evidence about student learning in classrooms in relation to teaching strategies, and outcome data on where candidates teach, for how long, and with what results (for an example of such a multimethod approach, see Darling-Hammond et al., 2010).
These studies would profitably include teacher educators’ own assessments of the outcomes of their courses, clinical work, and overall programs as these self-studies, if carefully designed and systematically conducted, could have the additional benefit of stimulating an “inquiry stance” on practice that can lead to continuous improvement (for examples, see Cochran-Smith, 2005; Cochran-Smith & Lytle, 2004). The goal should be not just to distinguish one policy or pathway from another in gross terms but to engage in an “intentional and systematic effort to unlock the ‘black box’ of teacher education, turn the lights on inside it, and shine spotlights into its corners, rafters, and floorboards” (Cochran-Smith, 2005, p. 8).
We should not be deluded, however, into thinking that these efforts will eliminate politics from the production or use of teacher education research. As Ken Zeichner and Hilary Conklin (2015) illustrate in a recent article, “distortion and misuse of research” (p. x) to justify political goals continue to be a major aspect of policy debate and development in this field as in others. In this regard, it will behoove education researchers to become more knowledgeable about and more capable of engaging the political and policy arenas productively. Increasingly effective public scholarship that can accomplish these goals is, in fact, the goal of the 2016 AERA meeting and will be the topic of Jeannie Oakes’s presidential address. Stay tuned!
