Abstract
Using congressional testimony on teacher quality from 2003 to 2015 and analysis of 60 elite interviews, we show how the political economy of knowledge production influences idea uptake in education policy discourse. We develop and assess a conceptual framework showing the organizational and financial infrastructure that links research, ideas, and advocacy in politics. We find that congressional hearing witnesses representing groups that received philanthropic grants are more likely to support teacher evaluation policies, but specific mentions of research in testimony are not a factor. Overall, our study shows that funders and advocacy groups emphasized rapid uptake of ideas to reform teacher evaluation, which effectively influenced policymakers but limited the use of research in teacher evaluation policy discourse.
In the years following Gates’s speech, policy change focused on teacher quality was remarkably swift. From 2009 to 2013, 31 states adopted reforms that required the use of student test score data in teacher evaluations, incentivized by the Obama administration’s Race to the Top (RTTT) program (Bleiberg & Harbatkin, 2020). Yet federal policy shifted again by 2015 with the Every Student Succeeds Act (ESSA), which eliminated requirements for states to use standardized test scores in teacher evaluations. While the use of research evidence very likely played a role in these policy developments, the spread of information alone does not explain how certain policy preferences are adopted within a contested policy environment, while others are not. Issue debates are not only waged with information—organizations mobilize resources, form coalitions, and seek out high-profile venues to broadcast their perspectives. Elite interest group influence can be a key factor in federal policy outcomes generally, and in education policy specifically (Debray-Pelot & McGuinn, 2009; Gilens & Page, 2014; McGuinn, 2006). Many of the advocacy and research organizations in education policy debates—think tanks, interest groups, and academic institutions alike—rely on private philanthropy as a major source of funding support (McDonald, 2014; Reckhow & Snyder, 2014). Thus, the factors that drive the uptake of research in policy debates are not simply about the content, quality, or communication of research findings. Research production is thoroughly embedded in and influenced by funding streams and advocacy agendas. As Weiler (2009) puts it, “[the] sponsorship [of knowledge production] very often does have an economic and political agenda of its own under which the support and the production of new knowledge is being subsumed” (p. 212).
Scholars have shown that the production of policy analysis is highly politicized within education as well as other fields (Henig, 2008; Lubienski et al., 2009; Rich, 2004; Rogers-Dillon, 2004). Another stream of scholarship shows how philanthropy has influenced educational policy development in districts, states, and at the federal level (Reckhow, 2013; Scott, 2009; Tompkins-Stange, 2016). Finally, a third field of research examines the politics of the policy process and how issues such as timing and dominant beliefs can affect the uptake of ideas (Jenkins-Smith et al., 2014; Kingdon, 1984). We draw together these insights into the politicization of research, the importance of philanthropy in education policy, and theories of the policy process to develop and test a unifying conceptual framework on the political economy of knowledge production.
To explore and understand the organizational relationships and policy consequences of the political economy of knowledge production, our study addresses these two research questions:
To answer these questions, we focus on the case of teacher evaluation policy at the federal level—a dominant issue in educational policy debates since the passage of No Child Left Behind (NCLB). This case offers rich data for our study, including a history of active debate in Congress among a range of actors involved in trying to influence policy. We use a mixed-methods research design, combining quantitative discourse network analysis of congressional hearings with qualitative analysis of elite interviews.
Our empirical findings highlight the crucial role of resources in driving the uptake of certain types of research-based ideas in federal education policy debate. In addition, this work offers a theoretical contribution by integrating political science theories of the policy process with educational policy theories highlighting the roles of funders and different types of research. We combine these strands of theory into a testable conceptual framework. We show that while research is not frequently mentioned in policy discourse on teacher evaluation, it is used to legitimate preferred reforms backed by organizations with elite funding support. Funders, by and large, did not fund basic research by academics, but rather advocacy research by think tanks and nonprofits—research that is motivated by a clear policy agenda and promoted in policy debates, where it is cited as empirical justification for desired reforms. Funders’ support for advocacy research facilitated the repackaging of ideas from research that promoted a shared and taken-for-granted understanding: that evaluation reforms would dramatically improve teacher quality. These ideas were taken up by policymakers during an opportune political moment: the start of the Obama administration. Working in tandem with advocacy organizations and think tanks, philanthropic funders have the capacity to suffuse a policy debate with advocacy research that is favorable to the funder’s preferred policy direction, aligned with policymakers’ prevailing beliefs, and well positioned for uptake when policymakers are most open to adopting new ideas.
Conceptual Framework: The Political Economy of Knowledge Production
How do new ideas enter the policymaking process? Prior scholarly work points to two key features of American government that affect the uptake of new ideas: (a) the relative openness of American politics to external influence and (b) the limited internal capacity of government institutions to produce ideas and generate research on specific topics (Drutman & Teles, 2015; Mills & Selin, 2017; Weir, 1992). As a result, external organizations have an opportunity to produce and promote new research-based ideas that can gain traction in Congress and federal agencies. These external organizations—which may include think tanks, universities, research institutes, and advocacy organizations—play varying roles in producing, packaging, and promoting research. Sometimes, the most influential groups are particularly effective at packaging and marketing research in the political arena (Weir, 1992).
The uptake of ideas is closely tied to the resources available to organizations to promote those ideas. Moreover, resources are often linked to agendas and beliefs: the individuals and groups that invest significant resources are rarely interested in nonideological idea production. As Drutman and Teles (2015) describe it, “Washington is . . . awash in privately funded policy research. According to R. Kent Weaver and Andrew Rich, the number of Washington-based think tanks more than tripled between 1970 and 1996, from 100 to 306. James G. McGann at the Think Tanks and Civil Societies Program counted 1,828 think tanks in the United States in 2013. But fewer and fewer think tanks can claim the mantle of truly neutral expertise anymore. Instead, most are funded by industry, labor, or wealthy partisan donors whose official stance as ‘nonpartisan,’ necessary for tax status, is a transparent veil for their advocacy-first work product.”
Focusing on think tanks in education policy, DeBray and Houck (2011) describe the prominent voice of advocacy-oriented think tanks in Elementary and Secondary Education Act (ESEA) reauthorization debates, particularly the 2010 reauthorization debates. They observed that “think tanks’ effect on the policy process is one of adding to the sheer amount of information (some empirical research, some strictly advocacy) about NCLB’s effects or the likely effects of proposed changes” (DeBray & Houck, 2011, p. 330). Research on policy change in local school districts and in federal education policy shows the critical role of private philanthropy as both a funder and coordinator of think tanks and advocacy organizations (Henig, 2008; McDonnell & Weatherford, 2013; Scott & Jabbar, 2014). Henig (2008) also describes how research has a strategic function for philanthropies that focus on education policy: “research is seen as a tool that must support the foundation’s mission, not as a value in and of itself” (p. 168). Scott and Jabbar (2014) offer a “hub and spokes” model in which funders operate as a hub—actively developing and supporting advocacy agendas by distributing resources and promoting shared ideas among diverse organizations involved in policy advocacy. More broadly, scholars have found that intermediary actors, like philanthropies, are an important factor in the spread of research evidence and the uptake of new policy ideas across multiple levels of government (Finnigan & Daly, 2014; Lubienski et al., 2016).
Figure 1 displays our conceptual framework linking the funding and production of research to the pathways for information to reach policymakers through networked relationships between organizations. The figure is designed to resemble a network, with ties linking different types of organizations; however, this is a conceptual network, and our research will present much more detailed empirical network data. The ties linking categories of organizations represent the flow of ideas through different types of research as well as the flow of resources. Although this figure simplifies many complex processes and interactions, it is intended to highlight (a) the roles of different actors; (b) competing sources of research-based ideas that could reach policymakers; (c) mediating factors such as ideological beliefs and timing that can shape which types of ideas and when those ideas gain more traction; and (d) the connections between groups, showing how ideas are generated through a fundamentally social process, in which some ideas can gain popularity by emerging from multiple sources in the network and circulating through multiple exchanges. Our framework draws on the “hub and spokes” model outlined by Scott and Jabbar (2014). However, we add the following elements to our conceptual framework: specific types of research shared between different actors and specific concepts from policy process theory (including the importance of ideological beliefs and timing for idea uptake by policymakers). Overall, our conceptual framework is designed to represent the emergence and production of ideas, idea exchange, idea promotion, and the uptake of ideas by policymakers.

Figure 1. Political economy of knowledge production in the policy process.
The circles in the diagram represent different categories of actors involved in funding, producing, promoting, and receiving research-based ideas; the arrows indicate the flow of funding and different types of research between these actors. Following Scott and Jabbar (2014), we represent philanthropic funders in a coordinating role linking different types of organizations by providing resources (solid lines). Increasingly, university-based researchers have incentives to seek external funding for their research, and private philanthropic funders have become an important source as federal education research funding becomes less reliable (Feuer, 2016). Funders sometimes make use of university research to inform their agendas, but their application of this knowledge often focuses on funding the translation and advocacy of research ideas through organizations that are more closely linked to policymakers—think tanks and advocacy groups (Reckhow & Tompkins-Stange, 2015). Think tanks (such as Brookings and the American Enterprise Institute) focus heavily on producing reports and digesting research for policymakers. Advocacy organizations (such as unions and civil rights groups) are more focused on constituency representation or lobbying, but they often produce and repackage research reports to support their positions. We categorize both think tanks and advocacy organizations as synthesizers; according to Manna and Petrilli (2008), “synthesizers—[are] organizations that [pack] their own take on the teacher quality research into appealing, accessible, actionable, and ideologically persuasive documents with recommendations that policy-makers could understand, embrace, and then enact” (p. 77).
We also represent two ideal types of research-based information in policy debates: traditional research (dotted lines) and advocacy research (dashed lines). As Tseng (2012) observes, the definition of “research” is fluid and often varies among different users. Most policymakers have a broader conception of research than scholars, and advocacy reports that draw upon or summarize research evidence to explicitly endorse specific policy proposals may be regarded as research to policymakers, but not to scholars, due to a lack of neutrality or objective distance. These reports have a specific purpose—influencing policy debates. Drawing from the term used by Lubienski et al. (2009), we refer to these reports as advocacy research. Accordingly, we define advocacy research based on two key attributes. First, advocacy research is typically produced by think tanks or advocacy organizations, rather than exclusively by university-based researchers. Second, the conclusions produced by advocacy research are expressed with few caveats, even if causal factors are complex, and are linked directly to policy recommendations. In contrast, traditional research includes outputs such as academic journal articles and working papers, which are primarily written for a scholarly audience.
Some reports have mixed characteristics, partly resembling traditional research and partly resembling advocacy. As the diagram shows, both types of research have the potential to reach policymakers through discourse networks, but the pathways and funding streams may offer more coordination and more resources for certain types of information. Synthesizers are particularly prominent as producers of advocacy research, while universities and research institutes are the main producers of traditional research. How might these factors affect the uptake and use of research in policy debates?
The complexity of policymaking and the flood of information directed toward federal policymakers means that potential ideas must be winnowed down. Thus, our conceptual framework also incorporates theories of the policy process that shed light on the winnowing of ideas. As Kingdon (1984) explains, the ideas that survive this winnowing need to comport with prevailing ideological perspectives. But how are dominant ideologies and their associated policy preferences determined? A long tradition of research on politics and policy treats policy preferences as independently held by individual actors—rooted in the ideology or intrinsic interests of the actors in question (Ansolabehere et al., 2001; Olson, 1965). Yet a growing body of scholarship has started examining the interdependence of actors’ policy preferences in the policy process. These studies show how advocates, policymakers, and experts respond to the broader policy environment when developing preferences, by interacting with like-minded partners and sometimes updating preferences as alliances, information, political conditions, and resources change (Heaney, 2004; Leifeld, 2013; Lubell, 2013). As Metz et al. (2019) explain, “policymaking happens in a complex and intertwined setting that includes various public and private collective entities who aim to transform their preferences into public policy” (p. 3).
Building on this scholarship and our framework in Figure 1, each type of organization is positioned to strategically interact with ideas, deploy resources to promote certain proposals, and join together new ideas to expand agendas; yet some organizations occupy positions with greater influence, the potential to broadcast ideas more effectively, and greater access to policymakers to incorporate strategic information about prevailing ideologies (Galey-Horn et al., 2020). In other words, the positioning of organizations is not merely driven by research-based ideas—organizations are also working with ideologies and belief systems, which may rise or fall in popularity among the organizations interacting in the policy process. For example, our study focuses on the post-NCLB time period when teacher unions were facing greater pushback from traditional allies in the Democratic Party (DeBray & Houck, 2011), opening the door for changing ideological perspectives among Democratic policymakers. In the diagram, we represent the mediating role of policy beliefs and preferences, which filter the ideas exchanged among the politically strategic actors in our analysis—funders, think tanks, and advocacy organizations. Beliefs and preferences undoubtedly influence university-based researchers as well—shaping choices about research questions, methodologies, and more; however, we expect that the mediating role of policy beliefs and preferences is most relevant for the politically oriented actors.
Our analysis of policy beliefs is theoretically informed by the Advocacy Coalition Framework (ACF; Jenkins-Smith et al., 2014). ACF offers analytical tools to analyze how policy actors and beliefs interact in high-conflict policy areas. A key element of the ACF is its emphasis on the beliefs that policy actors hold, which comprise three tiers: deep core beliefs, which represent broad normative values; policy core beliefs, which are relevant to a specific topical area of policy; and secondary beliefs, or policy preferences, which concern the technical strategies and instruments that may be used to achieve a desired outcome.
Finally, the timing and introduction of new ideas can shape the uptake of certain ideas or the influence of different types of actors. Kingdon also theorized about the concept of a policy window, a limited period of time when policymakers are more open to new ideas—for instance, during the transition to a new presidential administration or the reauthorization of major legislation (Baumgartner & Jones, 1993; Kingdon, 1984; McGuinn, 2018). Thus, we position timing in the diagram as a factor that mediates the receptivity of policymakers to new ideas.
Our conceptual framework suggests two propositions for the survival and spread of new ideas. Broadly speaking, our propositions focus on factors that increase the chances that an idea will gain popularity, reach policymakers, and receive serious consideration for incorporation into new policy.
Proposition 1: Funders, think tanks, and advocacy groups link research and advocacy agendas by promoting advocacy research, which is used frequently in education policy debates. These groups synthesize policy beliefs with the translation of research findings to promote specific policy ideas.
Proposition 2: Policy ideas that are introduced during a policy window and aligned with beliefs and ideological perspectives that are rising in the political arena are more likely to gain popularity and to be taken up by policymakers.
Method
We use a mixed-method research design, wherein qualitative and quantitative analyses are conducted simultaneously, and results iteratively inform subsequent data collection and analysis. We selected an integrative approach known as convergence, in which “initial quantitative findings may influence the focus and kinds of qualitative data that are being collected or vice versa” (Fetters et al., 2013, p. 2137). Within the convergence design, we modeled our process on Mendlinger and Cwikel’s (2008) “double helix spiral” framework, wherein insights gained through one methodological paradigm are subsequently used to shape and refine data collection procedures. They use the metaphor of a DNA double helix to describe this process: “In our analogy, the rungs of the double helix bridge between the two approaches (the spirals) through the introjection of research methods and strategies from one research approach when working in the alternate research approach” (Mendlinger & Cwikel, 2008, p. 284).
We used this framework to inform our coding procedures for policy preferences in the congressional hearings, which allowed us to systematically identify the most prominent organizations that testified in Congress regarding teacher evaluation policies. On the quantitative rung of our double helix, we conducted discourse network analysis, a new technique that links social network analysis to content analysis, providing a way to combine the study of actor relationships with the content and sources of their policy beliefs and preferences (Leifeld, 2013). This method of analysis is particularly suitable for our conceptual framework, which represents the exchange of policy ideas (rooted in individual actors’ beliefs and preferences) as a social network of actors who influence one another through ideas and resources. Unlike critical discourse analysis, which focuses on qualitative coding of communications between actors, discourse network analysis is a quantitative coding procedure that systematically identifies policy preferences expressed in public discourse (such as congressional hearings or newspaper articles).
The type of discourse that we analyze is witness testimony in congressional hearings—a crucial component of the federal policymaking process where policymakers hear the viewpoints of advocates and experts, debate legislation, and vote on amendments and bills. Our use of congressional hearings for our primary source of data is informed by prior research showing that hearings are important venues for shaping policy outcomes. Congressional hearings are a formal structure for communicating policy information, and information provided in congressional testimony about policy effectiveness is positively associated with proposal enactment (Burstein & Hirsh, 2007). These hearings have a specific set of norms and social context guiding their proceedings; for example, legislators and their staff select the witnesses to invite, witnesses have limited time to present their statement, and legislators may be interested in identifying witnesses who could validate their own preferences (Perna et al., 2019). Furthermore, congressional hearings are the formal venue that allows legislators to conduct oversight of the executive branch, including the Department of Education (Aberbach, 2001).
We analyze hearings from 2003 to 2015, including the early implementation of NCLB as well as the reauthorization debates that led to the ESSA in 2015. Within this time period, we can observe one major change in the presidential administration, from George W. Bush to Barack Obama. Thus, we characterize the time period from the 2008 presidential election to the first 2 years of the Obama administration as a “policy window,” which might offer an opening for supporters of new policy ideas (Kingdon, 1984). We segment the analysis of the hearings into three time periods: (a) Bush administration/NCLB implementation (2003–2007); (b) policy window, covering the presidential election and new administration (2008–2010); and (c) Obama administration/ESSA (2011–2015). This part of the analysis offers an empirical test of the “timing” component in our conceptual framework.
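The three-period segmentation can be illustrated with a short sketch. The year boundaries come from the text; the period labels and the function name are hypothetical and chosen only for illustration.

```python
from typing import Optional

# Year boundaries are taken from the text; the labels are illustrative only.
PERIODS = [
    ("bush_nclb_implementation", 2003, 2007),
    ("policy_window", 2008, 2010),
    ("obama_essa", 2011, 2015),
]


def assign_period(year: int) -> Optional[str]:
    """Return the period label for a hearing year, or None if out of range."""
    for label, start, end in PERIODS:
        if start <= year <= end:
            return label
    return None
```

Under this sketch, a hearing held in 2009 would fall in the policy window, and any hearing outside 2003 to 2015 would be left unassigned.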
On the qualitative rung of the double helix, we conducted inductive analysis of elite interviews to examine the organizations, funding streams, and strategies that undergird the political economy of knowledge production about teacher evaluation. We held open-ended, semi-structured elite interviews with 60 key education policy actors, including national funders, intermediaries, researchers, and policy elites involved in teacher evaluation policy development to investigate how funders might affect the uptake of research into policy debates. Our interviews allowed us to contextualize and explore the relationships between actors in the teacher quality policy network in fine-grained form, bringing us closer to understanding the strategies of funders and synthesizers. The interviews also provide details on the interaction between the stream of ideas from research and the resources that help promote ideas in public debate.
We bridged these two strands by moving back and forth between them over the course of the project to fill gaps in our data and further explore emerging areas of interest. We developed our interview sample through our discourse network analysis, which enabled us to identify key individuals in the teacher quality policy network. Our interview analyses yielded information about topics that informed later quantitative analyses on the types of research mentioned in hearings. This approach allowed us to collect data synchronously and analyze the strands separately, after which we brought both together to draw meta-inferences, triangulate data, and check for complementarity to gain a more complex and complete picture of the subject matter. Our research design is thus both exploratory and confirmatory.
Data Collection and Analysis
Congressional Hearings
We gathered data for this study from 175 congressional hearings from January 1, 2003, to December 31, 2015, on the topic of teacher quality. We downloaded full transcripts of hearings from the U.S. Government Printing Office website based on the search phrase “teacher quality.” Our data include 552 observations of coded testimony, with 349 unique organizations/institutions represented, and 452 unique individual witnesses in the hearings (a few individuals appear multiple times but represent different organizations).
Our analysis of the hearings occurred in four steps. First, we coded witness testimony from our congressional hearing data to identify four key witness attributes: policy preferences, research use, organizational affiliation, and foundation/philanthropic funding. Second, we generated two-mode discourse networks of organizations and policy ideas. We then examined the effects of foundation funding on support for teacher quality policies using exponential random graph models (ERGMs). Third, we coded and analyzed our elite interview data. Fourth, we brought together our qualitative and quantitative analysis to look for patterns across the corpus of data.
Prior to coding the testimony, we conducted an initial scan of hearings to ensure that the hearings we coded would be relevant to our study. We excluded only hearings that lacked substantive discussion of teacher quality issues, for example, a hearing on the federal budget where a participant happened to mention federal funding for dozens of “teacher quality programs,” but with no additional discussion of the programs themselves. Once we confirmed that a hearing included even a minimal substantive discussion of teacher quality, we uploaded it into Discourse Network Analyzer software for coding (Leifeld, 2013).
A team of human coders read each congressional hearing and coded statements using a specified set of policy belief categories. The Online Appendix provides the complete codebook and details on the coding procedures. To organize our analysis and codebook, we apply the ACF, as described above. The ACF groups policy beliefs into three broad categories which we apply to develop our codebook. For this study, we analyze two deep core belief categories from our codebook: efficiency—an emphasis on economic cost–benefit and optimization of policy performance (Wood & Theobald, 2003)—and equality—the use of political authority to redistribute resources (Kirst & Wirt, 2009). Efficiency and equality are widely recognized among social scientists as policy goals that can present inherent tensions, that is, the pursuit of greater equality may reduce a system’s ability to achieve more efficient outcomes (Mitchell et al., 1993).
The efficiency belief category includes policy preference codes related to teacher evaluation. We include the equality belief category in our analysis because it offers a sharply contrasting set of preferences for us to compare with efficiency, with the possibility that these preferences have different mechanisms driving support. Table 1 illustrates how our data were coded within the deep core belief of efficiency related to teacher accountability and the deep core belief of equality as it relates to the composition of the teacher workforce (which is predominantly White and female compared with the student population). We include two examples of policy preferences with each belief category. The complete codebook, including the full list of coded policy preferences, can be found in the Online Appendix, along with specific examples of how we coded hearing text. For each policy core belief, there are specific policy preferences listed in the Online Appendix. Our coding system was designed to identify actors’ support for beliefs and policy preferences at the highest level of specificity presented in the discourse.
Table 1. Sample of Codebook Based on Advocacy Coalition Framework
In addition to coding for policy preferences and policy beliefs, we coded all witness testimony for references to research. We coded this in two ways. First, we identified anytime a witness mentioned research, even if the individual did not name a specific source. All mentions of research (both the mentions that include a source and mentions that do not) provide a count of the frequency of research mentions in a given statement. Second, when available, we coded each source of research mentioned by the witness, including all organizations, universities, government agencies, media, or any other organization or institution mentioned as a source of research evidence.
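The two research codes described above can be illustrated with a small tally. The records below are invented placeholders, not data from the study, and the tuple layout (statement identifier, named source or `None`) is an assumed representation of the human-coded output.

```python
from collections import Counter

# Hypothetical coded records: (statement_id, named source or None).
# A None source means the witness mentioned research without naming a source.
mentions = [
    ("s1", "University X"),
    ("s1", None),
    ("s2", "Think Tank Y"),
    ("s2", "University X"),
    ("s3", None),
]

# First code: frequency of research mentions per statement (sourced or not).
mention_counts = Counter(statement_id for statement_id, _ in mentions)

# Second code: how often each named source is cited across all statements.
source_counts = Counter(src for _, src in mentions if src is not None)
```

Keeping the two tallies separate mirrors the distinction in the text between how often research is invoked and which sources are invoked.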
Finally, based on the organizational affiliation provided in the hearing transcripts for each witness, we also coded whether that organization had received funding, either in the year of testimony or in the 2 preceding years, from two specific foundations, the Bill & Melinda Gates Foundation and the Eli & Edythe Broad Foundation. We collected this information by searching the Foundation Directory Online database of philanthropic grants. Based on prior research, these two funders were identified as particularly active supporters of teacher quality initiatives (Reckhow & Tompkins-Stange, 2018). Our coding of witness organizations with Gates or Broad Foundation funding support shows that 27% of hearing witnesses were Gates Foundation grantees and 13% were Broad Foundation grantees.
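The funding rule (a grant in the year of testimony or in the 2 preceding years) can be expressed as a simple window check. This is a minimal sketch: the function name, the `lookback` parameter, and the representation of grants as a collection of years are assumptions for illustration, not the authors' coding procedure.

```python
from typing import Iterable


def funded_in_window(
    testimony_year: int, grant_years: Iterable[int], lookback: int = 2
) -> bool:
    """True if any grant falls in [testimony_year - lookback, testimony_year]."""
    earliest = testimony_year - lookback
    return any(earliest <= year <= testimony_year for year in grant_years)
```

For a witness testifying in 2010, grants to the organization in 2008, 2009, or 2010 would count; a grant in 2007 or 2011 would not.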
Discourse Network Analysis
A discourse network is constructed by analyzing actors’ attitudes expressed in a public arena (such as congressional hearings) and creating ties between actors based on shared policy preferences, or support for the same policy instrument (Leifeld, 2013; Metz et al., 2019). For example, two actors who publicly state that teacher quality should be assessed with student growth scores would share a link (or tie) in a discourse network. These relationships can be operationalized as two-mode networks. We applied this technique to link hearing witnesses through shared policy preferences. One mode of our network is actors—the participants in congressional hearings. The second mode is policy preferences—the policies in our codebook.
Figure 2 provides an example of the two modes in a discourse network and the links between the actor mode and policy mode. In the example, two hearing witnesses both mentioned support for the policy belief: “Teachers must be evaluated and held accountable.” One of these witnesses has also mentioned a specific policy preference that is related to teacher evaluation: “Use evaluation systems with multiple measures.” Meanwhile, a third witness does not share agreement with any other witnesses but has voiced support for a separate policy related to equality beliefs: “Teachers need special preparation to teach in high-needs and/or diverse schools.”

Figure 2. Discourse network example.
When compiled as discourse networks, the preferences expressed in witness testimonies help unmask the complex structure of idea formation, idea popularity, and policy change in public debates. This data structure can be modeled to systematically analyze the relationships between actors and policy preferences, including social influence. Our conceptual framework presents idea uptake as a social process built upon interactions between actors, and network analysis allows us to model these processes (Heaney, 2004; Leifeld, 2013; Lubell, 2013).
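To make the two-mode structure concrete, the example described for Figure 2 can be written as a small actor-by-policy incidence matrix. This is a sketch only: the witness labels W1 to W3 are placeholders, and the projection step (linking two witnesses when they share a preference) is one common way to read such data, not necessarily the procedure used in our analysis.

```python
# Two-mode (actor x policy) ties mirroring the Figure 2 description.
# Witness labels W1-W3 are placeholders, not names from the hearing data.
ties = {
    "W1": {"Teachers must be evaluated and held accountable",
           "Use evaluation systems with multiple measures"},
    "W2": {"Teachers must be evaluated and held accountable"},
    "W3": {"Teachers need special preparation to teach in high-needs "
           "and/or diverse schools"},
}

actors = sorted(ties)
policies = sorted({p for prefs in ties.values() for p in prefs})

# Incidence matrix: rows are actors, columns are policy preferences.
incidence = [[1 if p in ties[a] else 0 for p in policies] for a in actors]


def shared_preferences(a: str, b: str) -> int:
    """Number of policy preferences that two actors both support."""
    return len(ties[a] & ties[b])


# Actor-by-actor projection: weight = number of shared preferences.
projection = {
    (a, b): shared_preferences(a, b)
    for i, a in enumerate(actors) for b in actors[i + 1:]
}

for pair, weight in projection.items():
    print(pair, weight)
```

In this toy network, only the first two witnesses are linked, through their shared support for evaluation and accountability.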
ERGMs
Congressional hearings are interactive events—hearing witnesses may know one another and react to statements and events occurring in the hearing. Thus, policy preferences are unlikely to be independent—mentions of a particular idea may spread or increase in frequency, for example, because others are discussing the same idea. For these reasons, we use an ERGM to analyze our discourse network data. ERGMs make it possible to model endogenous configurations in network structure (i.e., structural features of the network that might be predicted by interdependencies between actors and policy preferences in the network). We use the ergm package in R to estimate the models using Markov chain Monte Carlo maximum likelihood estimation (MCMC MLE; Hunter et al., 2008). We estimated ERGM models and compared results for three different time windows of the policy debate: 2003 to 2007; 2008 to 2010; and 2011 to 2015. The purpose of this analysis is to determine how network factors and actor attributes, as well as the dynamics between them, contributed to actors’ support for teacher evaluation preferences in each time period, and then to compare those results to see whether we can observe differences between time periods.
The dependent variable is the observed two-mode network of actors and policy preferences constructed from the congressional hearing testimony (also called the dependent network). The network in Figure 3 is one dependent network—showing the actors and policy beliefs/preferences they share. With the network serving as the dependent variable, we can assess how characteristics of actors and selection forces (such as preference popularity) affect the global structure of a network. In other words, why do we observe this specific configuration of actors sharing these policy preferences at this point in time? Do actors with similar funding sources or actors that mention research tend to promote the same ideas? With this model, we can directly test expectations from our conceptual framework related to funding support for organizations, research mentions, and the timing of idea introduction.

Figure 3. Two-mode network visualization: Teacher evaluation preferences, 2008 to 2010.
An ERGM estimates the probability that a specific network is observed, given other potential permutations of the network. For each model, 1,000 networks are simulated and compared with the original data (for an overview of the model, see Cranmer & Desmarais, 2011). Similar to logistic regression, the dependent variable is a binary variable for the presence (i.e., “1”) or absence (i.e., “0”) of a tie. We apply a two-mode ERGM, which estimates the probability of a tie between an actor and a policy preference. In traditional regression analysis, the dependent variable is assumed to be only influenced by exogenous variables (i.e., independent variables). In contrast, ERGM analysis accounts for interdependencies between observations. Our models include both endogenous terms related to network structure as well as exogenous terms for node-level covariates to estimate the role of actor-specific variables (Brandenberger et al., 2020). We describe each ERGM term below and then discuss the results of the analysis.
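For readers who want the formal statement, the general ERGM form can be written as follows. The notation here (y for the observed network, g for the vector of network statistics, θ for the coefficients, κ for the normalizing constant) is introduced for illustration and is an assumption on our part. It follows the standard presentation of the model rather than any specification unique to this study.

```latex
% Probability of observing network y among all possible networks in \mathcal{Y}.
P_\theta(Y = y) = \frac{\exp\{\theta^\top g(y)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{y' \in \mathcal{Y}} \exp\{\theta^\top g(y')\}

% Conditional log odds of a tie between actor i and policy preference j,
% holding the rest of the network y_{-ij} fixed. y^{+}_{ij} is the network
% with the tie present and y^{-}_{ij} the network with the tie absent.
\operatorname{logit} P_\theta(Y_{ij} = 1 \mid y_{-ij})
  = \theta^\top \bigl[ g(y^{+}_{ij}) - g(y^{-}_{ij}) \bigr]
```

The second expression is what makes the coefficients readable in the same way as logit coefficients: each coefficient multiplies the change in its statistic that results from adding the tie.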
Endogenous (Network) Terms
Preference overlap
This term captures the tendency of actors to have multiple policy preferences, using geometrically weighted degree counts for the first mode (actors) in the network. This term counts how many actor nodes have one connection to a policy preference, two connections, and so on, and places a lower weight on larger numbers of connections using a geometric decay parameter. The term indicates the extent to which each tie to a policy preference decreases the likelihood of an additional tie, and the strength of this effect itself decreases geometrically. Structurally, preference overlap controls for the activity effects of high degree nodes, which tend to be less prevalent than low degree nodes in policy networks. In policy networks, prominent actors are likely to be more active in certain arenas and, thus, have higher degrees.
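The geometric down-weighting described above can be illustrated with a short calculation. This sketch assumes one widely used form of the geometrically weighted degree statistic. The function name, the decay value, and the degree list are hypothetical, and the exact parameterization used by the estimation software may differ.

```python
import math
from collections import Counter
from typing import Iterable


def gw_degree(degrees: Iterable[int], decay: float) -> float:
    """Geometrically weighted degree statistic for one mode of a network.

    Assumed form: e^decay * sum_k [1 - (1 - e^-decay)^k] * D_k,
    where D_k is the number of nodes with exactly k ties.
    """
    counts = Counter(d for d in degrees if d > 0)
    shrink = 1.0 - math.exp(-decay)
    total = sum((1.0 - shrink ** k) * n_k for k, n_k in counts.items())
    return math.exp(decay) * total


# Hypothetical actor degrees (number of policy preferences per witness).
actor_degrees = [1, 2, 3]
stat = gw_degree(actor_degrees, decay=math.log(2))
print(round(stat, 3))
```

With a decay of log 2, a witness's first tie adds 1 to the statistic, the second adds 0.5, and the third adds 0.25, which is the sense in which larger numbers of connections receive lower weight.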
Preference popularity
This term counts how many actors refer to the same preference, when a particular focal actor mentions that preference. This is also known as a “star” term for the second (policy preference) mode. Preference popularity is indicative of the policy beliefs and preferences dominant in a particular policy network. Shifting patterns in the popularity of preferences over time can illustrate the uptake of popular policy ideas (e.g., Galey-Horn et al., 2020). Structurally, this term accounts for the tendency of policy network actors to cluster around popular policy preferences.
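A star count for the policy mode can be sketched as follows. This is an illustration under assumptions: we count k-stars centered on each policy preference (sets of k actors who all support the same preference), and the mapping of witnesses to preferences is invented. The precise star term used in estimation is not specified in the text, so this should be read as one plausible version.

```python
from math import comb
from typing import Dict, Set


def policy_star_count(ties: Dict[str, Set[str]], k: int = 2) -> int:
    """Count k-stars centered on policy preferences: a preference supported
    by d actors contributes C(d, k) distinct sets of k supporters."""
    support: Dict[str, int] = {}
    for prefs in ties.values():
        for p in prefs:
            support[p] = support.get(p, 0) + 1
    return sum(comb(d, k) for d in support.values())


# Hypothetical actor -> preferences mapping.
example = {
    "W1": {"evaluation", "multiple measures"},
    "W2": {"evaluation"},
    "W3": {"evaluation", "preparation"},
}
print(policy_star_count(example, k=2))
```

A preference supported by many actors contributes disproportionately to this count, which is why a positive coefficient on such a term indicates clustering around popular preferences.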
Actor-Level (Attribute) Terms
Research mentions
This is an attribute of actors in the network, based on actors that mention research (1) and actors that do not (0).
Gates or Broad funding
This is an attribute of actors in the network, based on actors that represent organizations that received funding from the Gates or Broad foundations in the last 2 years (1) and actors that did not (0).
Qualitative Interviews
To design our interview process, we made the methodological choice to use an inductive approach nested within a deductive framework, consistent with our double helix methodology. We initially constructed general interview protocols using the dependent variables that our discourse network analysis had revealed, which created several lines of inquiry, or key topics, to pursue (Merriam, 2009; see Online Appendix for a sample interview protocol). Within this deductive framework, however, we used a grounded approach to continuously refine the protocols by conducting analysis iteratively and in parallel. By grounded, we refer to a process wherein the researcher generates a data set through semi-structured open-ended interviewing, known as “intensive interviewing” (Charmaz, 2014, p. 18). As each interview generates more nuanced data, the researcher revises interview questions for future informants accordingly. This enables the researcher to discover new themes organically, grounded in the data, rather than primarily adhering to the original lines of inquiry.
In intensive interviewing, the researcher begins with open-ended questions, focusing on listening and responding to the areas of interest for the informant. These questions primarily address the views, feelings, and perspectives of informants, which are later analyzed through open-ended analytical coding to determine potential lines of inquiry for future interviews. Following the first several interviews, the researcher can begin to observe trends and themes in the data, and after the first five to 10 interviews, cross-cutting themes may become evident, which will later be collapsed into broad categories. The researcher then uses these broad categories to refine the protocol inquiry for future interviews (Merriam, 2009).
Following Charmaz (2014), we designed our interview protocols to solicit informants’ interpretations of policy discourse, including retrospective accounts of policy climates in the past as well as contemporary observations, all filtered through their own experiences. Because all the informants in our study are considered “elites,” we reviewed the literature on elite interviewing (Berry, 2002; Dexter, 1970; Goldstein, 2002), which recommends thorough preparation, allowing for flexibility in case an informant is unresponsive, establishing rapport, and attending to time constraints.
We conducted 60 in-depth qualitative interviews with actors who are centrally involved in the teacher quality policy network. Our initial sample was created based on the list of hearing witnesses, as we sought to speak with interviewees who were central in policy discourse, indicated by their presence at hearings. Interviewees were further identified by triangulating data. A compiled database of informants included the expert witnesses who testified in congressional hearings, the authors of major research reports on teacher quality that were cited in the hearings, leaders of key organizations involved in teacher quality policy advocacy, and media sources who wrote about teacher quality. Interviewees were recruited through personal contacts with orienting figures and by systematic recruitment procedures, including emails, calls, and face-to-face meetings.
Interview protocols were tailored to interviewees based on their category of profession or expertise (i.e., researchers, advocates, policy staff, media, funders). The protocols were further refined to individual interviewees based on extensive background research on their work and publicly stated views, containing specific probes for each question depending on the informant’s position within an organization. Approximately half of the interviews were conducted in person and half via phone. Interviews were tape-recorded with the permission of the individuals, except in some situations where sources were reluctant to share information candidly, in which case the researchers took extensive notes, including verbatim quotes, and documented their impressions immediately after the conclusion of the interview. Recorded interviews, which averaged approximately 60 minutes in length, were subsequently transcribed verbatim. Throughout the article, the interview respondents are referred to as “interviewees” or “sources” with a general description of their professional roles.
We coded the interviews using Dedoose qualitative analysis software, beginning with open coding. At several intervals, we refined the existing codes to collapse redundant categories and develop more fine-grained, multilevel codes. Throughout the process, we composed analytic memos to connect emerging themes as the sample grew. After the first round of coding, we revisited the interviews in a second round of focused coding, applying a consistent set of codes across the transcripts, particularly the earliest interviews that were coded in vivo.
Results
Funding, Advocacy, and Research
We begin our presentation of results with the quantitative and qualitative findings that address our first research questions: What is the role of different types of research in the debate and uptake of policy ideas? What kinds of organizations fund and produce research that is used most often in major policy debates and what type of research do they specialize in producing? Based on our conceptual framework, we expect that funders, think tanks, and advocacy groups link research and advocacy agendas by promoting advocacy research, and these ideas have a better chance of reaching policymakers. Our analysis begins with descriptive data that compare the role of different types of research mentioned by supporters of teacher quality reforms linked to efficiency (i.e., evaluation reform) and supporters of teacher quality reforms focused on equality. We show that think tanks were mentioned more than other research sources by supporters of teacher evaluation. Furthermore, supporters of teacher quality reform focused on evaluation mentioned research less often than those focused on equality. While research was not mentioned as frequently by the supporters of teacher evaluation, a small number of advocacy research reports on teacher evaluation by organizations such as EdTrust and TNTP were highly impactful and ultimately taken up by policymakers.
We begin presenting our results by looking closely at the mentions of research in congressional hearings by supporters of different policy beliefs and preferences. In particular, we are interested in noting any unique patterns in the frequency of research mentions among supporters of teacher evaluation. We compare these witnesses with supporters of teacher quality policies related to equality issues. When evaluating this comparison, it is important to note that the policy categories are not entirely mutually exclusive (in other words, a particular witness could mention both a teacher evaluation preference and equality preference during testimony, therefore appearing in analysis of each set of preferences). We are interested in the frequency and types of research these witnesses use in presenting the entirety of their argument—both in favor of policies like teacher evaluation and to frame that argument in relation to other policy ideas.
Figure 4 plots two trends over time with a combined bar chart and a line graph. The line graph indicates the number of hearing witnesses in each year who mentioned teacher evaluation preferences (the axis on the left side of the graph). This changes considerably over time, with a significant peak in 2010, following the Obama administration’s announcement of RTTT, which included incentives for states to adopt teacher quality reforms. The bar graph indicates the proportion of hearing witnesses who mention research in their testimony (the axis on the right side of the graph). Interestingly, there appears to be an inverse relationship between these two trends. Earlier in the time period (2004) and later in the time period (2014–2015), a higher proportion of witnesses who discuss teacher evaluation preferences also mention research in their testimony. For instance, in 2004, all six of the witnesses who mentioned evaluation preferences also referred to research in testimony. Yet during the middle part of the time frame, when teacher evaluation preferences were discussed most often by hearing witnesses, fewer than half of the witnesses were mentioning research in their testimony. For example, in 2011, 20 hearing witnesses discussed evaluation preferences, but only 25% mentioned research in their testimony.

Figure 4. Support for teacher evaluation policies and mentions of research.
Figure 5 offers a contrasting perspective, with a plot of trends over time in mentions of equality-related teacher policy preferences and mentions of research. The proportion of references to research is steadier among hearing witnesses who mentioned equality-related preferences. Although the amount of hearing discussion of equality-related preferences ebbs and flows somewhat from 1 year to the next, the rate of references to research by these hearing witnesses is not as varied as is the case among the witnesses who mention evaluation preferences.

Figure 5. Support for equality-related policies and mentions of research.
The descriptive findings so far suggest a puzzle—when teacher evaluation gained its highest level of prominence in congressional hearings, the supporters of these ideas mentioned research less frequently in congressional hearings. We might expect that the philanthropic funders supporting teacher evaluation ideas would want to ensure that plenty of research is available for use in venues like congressional hearings. Is there something else underlying these patterns in research mentions?
In addition to examining the frequency of research mentions in congressional testimony, we also coded the sources and types of sources mentioned by each witness. Figure 6 compares the frequency of different types of sources mentioned by witnesses who supported policy preferences within the two broad categories of policy beliefs (evaluation and equality). The chart shows that think tanks were the most frequent source of research mentioned by witnesses who preferred teacher evaluation policies (the list of think tanks mentioned by our witnesses is included in the Online Appendix). Meanwhile, witnesses with equality policy preferences mentioned university-based research sources most often. Witnesses with equality preferences also mentioned the federal government and National Academies as sources of research slightly more often than witnesses who voiced support for teacher evaluation policies, although the differences are not large.

Figure 6. Top research source categories by policy preferences.
What type of research do think tanks provide, given their prominent role as a source of information for supporters of teacher evaluation? We feature think tanks as synthesizers in our conceptual framework, for their role in formulating and promoting policy ideas that are packaged for policymakers (Manna & Petrilli, 2008). Synthesizers are also closely tied with the concept of advocacy research in our conceptual framework (Lubienski et al., 2009). Our interviews show that advocacy and research use were closely intertwined by the synthesizers that were most active in the teacher evaluation policy area. These think tanks and advocacy groups were generously supported by major philanthropic foundations. Funders and the producers of advocacy research helped lay the groundwork to make ideas about teacher evaluation and effectiveness readily accessible to policymakers, and the reports were used by some of the most prominent and influential policymakers in this arena.
Interviewees described foundation-funded advocacy research as a mechanism to insert ideas into political discourse in a more hands-on way, by distilling the complicated details of test score–based teacher evaluation into a format that policy elites could easily understand. By focusing on the production of a few influential studies, teacher evaluation policies could be promoted and amplified via a modest amount of advocacy research. Our findings suggest that these efforts, which included a focus on commissioning and supporting advocacy research produced by synthesizers, rather than universities, had a visible impact on the federal policy debate.
One of the most influential reports our interviewees mentioned was a 2004 report by EdTrust, a national advocacy organization in Washington, D.C., which was titled The Real Value of Teachers. EdTrust is consistently a top recipient of major foundation funds (Reckhow & Snyder, 2014). Coinciding with the timing of this report, EdTrust received a US$2 million grant from the Gates Foundation in 2003 and another US$1.2 million in 2004. The Real Value of Teachers surveyed numerous studies on teacher quality, but emphasized the work of Bill Sanders, an academic whose research is broadly credited as the basis for the first effort to use value-added scores to assess teachers at a state policy level through the Tennessee Value-Added Assessment System (TVAAS). As we show in our conceptual framework, ideas from academic research may be part of the exchange of policy ideas, but they are also being repackaged and promoted by synthesizer groups like EdTrust. The report highlighted the clarity of information TVAAS provided, stating that TVAAS credibly and consistently distinguished high-performing versus low-performing teachers: . . . Some people are already doing [value-added teacher evaluation], right now, with great success. Easily the best example is the system that’s currently up and running in Tennessee. Created by law in the early 1990s, the system is called TVAAS, the Tennessee Value Added Assessment System . . . More than a decade of results from TVAAS and other value-added systems has shown some remarkable things about teacher effectiveness. Perhaps most importantly, it shows that some teachers are simply much more effective than others.
The report also advocated for test score–based teacher evaluation to be adopted on a broad scale, calling on the moral sensibilities of readers to view it as a means to address systemic injustices in education by pairing the most effective teachers with the neediest students: The idea of effective teachers helping needy students has tremendous power. It re-affirms the promise of public education and its ability to make all the difference in students’ lives. It is a powerful solvent to the inertia and sense of helplessness that have infiltrated the ideas and culture of our public schools. It is a catalyst for radical improvement in almost every facet of education. Good teachers can close the achievement gap, if only we can find them and let them do their work.
The report represented a core aspect of EdTrust’s mission: advancing a collective national understanding about the key role of teacher quality in improving student outcomes. To that end, EdTrust was deliberate about identifying an evidence base for its advocacy and surveyed a broad body of research. One interviewee, the leader of a Washington, D.C., think tank, reflected on how EdTrust’s CEO, Kati Haycock, “discovered” TVAAS, and that her endorsement was highly influential in catalyzing support for test score–based evaluation methods at a national level, as opposed to contained within a single state: First, Kati Haycock at EdTrust discovered Bill Sanders—he’s the key, right?—and this incredible evidence that demonstrated that teachers varied so dramatically and they vary dramatically from classroom to classroom. [Because] at the time and surely even long after, so many of us were focused on schools as the unit of change. And suddenly you say: Oh my god—teachers matter a lot. Not if you look at the traditional credentials, but if you start to look at what they’re able to get in terms of student learning gains. So that was huge. And I think the work that EdTrust did in the late 90s to popularize some of those findings were really entirely from Sanders.
Our interviewees also frequently pointed to a 2009 report, entitled The Widget Effect, by TNTP, a national advocacy organization (also a synthesizer) with a mission of identifying and supporting effective teachers. Several education foundations including the Gates Foundation, the Walton Family Foundation, the Robertson Foundation, and the Joyce Foundation funded the report, which made the case for systematic teacher evaluation systems to distinguish between low-quality and high-quality teachers using test score–based evaluation methods. Once again, we observe critical activity in the area of our conceptual framework linking philanthropic resources to a synthesizer organization that produces advocacy research (Lubienski et al., 2009; Manna & Petrilli, 2008).
The report stated that “institutional indifference to variations in teacher performance” resulted in systems that perpetuated low-quality teaching across the country, and critiqued evaluation systems that relied predominantly on observational methods. Interviewees from the Department of Education praised the report’s clear proposal and message, which distilled a complicated issue into a more comprehensible format, in contrast to academic research. One academic reflected on this: They had such a good way of making the point in that Widget Effect paper—it was really compelling in the way that academics couldn’t be. They gave talking points, not technical appendices . . . No policymaker is going to read Econometrica.
The report had immediate influence after its release in January 2009, when federal officials were beginning deliberations about policy measures shaping the Obama administration’s teacher evaluation policy agenda. In February 2009, 1 month after the report became public, Secretary Duncan made the following statement about the report in a speech: These policies . . . have produced an industrial factory model of education that treats all teachers like interchangeable widgets. A recent report from the New Teacher Project found that almost all teachers are rated the same. Who in their right mind really believes that? We need to work together to change this.
Here we see a concrete example of idea uptake by a policymaker—the final step in our conceptual framework—with Secretary Duncan mentioning a specific set of ideas from a philanthropically funded advocacy research report. Similarly, a Department of Education staff member under Secretary Duncan described the influence of The Widget Effect in policy discussions during the design of RTTT: “There were real policy recommendations attached to it, giving a starting point to work from. It was super influential for that reason . . . [it] was the most impactful [report we read].” Another interviewee, who was a Democratic Party policy staffer in Congress during the same time period, offered a similar perspective: Fast forward a few years to Race to the Top . . . it was very clear that was going to be the Administration’s top priority and they were leaving a lot of the details to Congress. I think there were elements of that . . . I would also trace back ultimately to The Widget Effect.
Overall, these findings show that think tanks were mentioned more than other research sources by supporters of teacher evaluation, and that a small number of advocacy research reports by organizations such as EdTrust and TNTP were especially important. These “synthesizers”—think tanks and advocacy organizations that meld research with advocacy for new policies—are supported by major philanthropies. Our interviews also show that these reports were influential with key policymakers, but we do not yet have a full understanding of the contextual factors that might contribute to this idea uptake. In our next set of analyses, we turn to a systematic evaluation of beliefs and timing in the uptake of ideas and research.
Timing and Beliefs: How Do Ideas Gain Momentum?
Our prior analyses showed that funders are playing a role in contributing ideas to the policy process—often by promoting organizations that produce advocacy research. Yet the funding and promotion of ideas alone does not necessarily mean those ideas will gain traction. Our second research question broadens our lens on the policy process: How do other aspects of the policymaking process—such as ideological beliefs and timing—influence the role of research in policy debates and the uptake of new ideas? Based on our conceptual framework, our second proposition is that policymakers are more likely to be receptive to new policy ideas that are introduced during a policy window and aligned with beliefs and ideological perspectives that are rising in the political arena. Our quantitative and qualitative findings both highlight the importance of timing, particularly for the uptake of evaluation-related policy ideas early in the Obama administration. Foundation-funded actors appear to be especially effective at promoting these ideas during the policy window. Furthermore, our quantitative analysis shows the importance of the rising popularity of evaluation policy preferences, whereas our qualitative interviews suggest that the popularity of these evaluation policy beliefs and preferences actually created new challenges for evaluation proponents, particularly the Gates Foundation. Officials with the Gates Foundation sought to revise their initial approach to evaluation policy in response to new research, but they found that their options were limited by the popularity of the policies they had promoted earlier in the process.
To assess the relationships between funding, advocacy, and uptake of new ideas, we use ERGM to analyze the coded discourse network data from congressional hearings paired with qualitative interview analysis with elites who were involved in the process. The results of the two-mode ERGM analysis for each time period are presented in Table 2, with estimated coefficients and standard errors. Recall that we separated the dependent networks into three time periods, with the 2008 to 2010 network (Figure 3) representing a policy window—a change in presidential administration. The coefficients are like logit coefficients; they can be interpreted as conditional log odds of a tie between an actor and a policy preference (a dyad), but the probability of observing any tie is conditional on all the other dyad outcomes in the network. In all models, we include a term for edges, which accounts for the baseline odds of creating a tie in the network. Our first network includes all teacher evaluation preferences and actors from congressional hearings during 2003 to 2007; the second focuses on the policy window, from 2008 to 2010 (the network in the diagram in Figure 3); and the third network is based on hearing participants and teacher evaluation preferences from 2011 to 2015. Actors are the witnesses that testified in congressional hearings and policy preferences are their support for policies related to teacher evaluation identified in our coding of hearing testimony.
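The logit-style reading of the coefficients can be illustrated with a short conversion from conditional log odds to probabilities. This is a sketch only: the two coefficient values below are placeholders chosen for illustration and are not estimates from Table 2, and the function name is hypothetical. The conversion also assumes that adding the tie changes no other model statistic, which will not hold in general once endogenous terms are included.

```python
import math


def conditional_tie_probability(log_odds: float) -> float:
    """Logistic transform of a conditional log odds into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Placeholder coefficients, NOT values from Table 2.
edges_coef = -2.0    # baseline log odds of any actor-preference tie
funding_coef = 0.5   # change in log odds for an actor with foundation funding

baseline = conditional_tie_probability(edges_coef)
funded = conditional_tie_probability(edges_coef + funding_coef)
odds_ratio = math.exp(funding_coef)

print(f"baseline={baseline:.3f} funded={funded:.3f} odds_ratio={odds_ratio:.3f}")
```

Exponentiating a coefficient gives the multiplicative change in the conditional odds of a tie, holding the rest of the network fixed.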
Table 2. Bipartite ERGM: Teacher Evaluation Preferences in Congressional Hearings
Note. Table entries are unstandardized regression coefficients with standard errors in parentheses. ERGM = exponential random graph model; AIC = Akaike information criterion; BIC = Bayesian information criterion.
For a two-tailed test of significance, *p < .05. **p < .01.
First, the statistically significant results related to one of our node-level variables (Actor supported by Gates or Broad foundations) indicate that the emergence of teacher evaluation policies was partly driven by philanthropic funding. During the 2008 to 2010 policy window time period, hearing witnesses representing groups funded by the Gates or Broad foundations have more ties to teacher evaluation–related preferences, controlling for both node-level and network structure factors. Meanwhile, we find that research mentions (Actor mentions research) are not significantly associated with the teacher evaluation preference network during the earliest time period in our analysis, 2003 to 2007. While the lack of significance for research mentions prior to 2008 could be related to the more limited availability of teacher evaluation research earlier in the time period, we also did not find a positive and statistically significant relationship between research mentions and the teacher evaluation preference network for any time period in our analysis. However, in the last time period in the analysis (2011–2015), we do find a positive coefficient for research mentions. Altogether, these results indicate that research mentions by witnesses are not closely associated with teacher evaluation preferences in congressional hearings. Instead, during the time period with an open policy window, hearing witnesses with funding from the Gates Foundation and/or Broad Foundation are positively associated with support for teacher evaluation, controlling for other factors.
Why did a large number of foundation-funded organizations support the same preferences for teacher evaluation during a critical time period in the policy process? Interviewees shared that an elite group of peers representing foundations, their grantees, and sympathetic policy elites were deeply committed to teacher evaluation as a federal policy priority. For example, one interviewee, a thought leader in D.C., recalled how “the reform crowd” had received The Real Value of Teachers report: The stuff about if you had an effective teacher in three years versus an ineffective [one was]—hugely powerful. The paper that Kevin Carey did for Ed Trust . . . it helped to popularize the [idea], had a big impact on the conversation.
As our conceptual framework shows, foundations supported the key think tanks and advocacy organizations that were primary promoters of these ideas—ideas that were well aligned with the beliefs in holding schools accountable that had undergirded NCLB.
Our analysis demonstrates how the teacher quality agenda advanced within a closely knit cluster of elite politicians and funders who were strategically placed to advance teacher evaluation policies. Not only were these individuals positioned to work alongside policymakers; they had the opportunity presented by a policy window during a new presidential administration and the development of new policies, specifically RTTT. Interview data showed that many of the policymakers in the Obama administration also had close ties to the Gates Foundation. As the Superintendent of the Chicago Public Schools, Secretary Duncan had received support from the Gates Foundation, and his transition team and administration staff included many Gates officials, while Gates Foundation staffers and grantees alike provided input on the design of RTTT during the transition. Several assumed roles in the Department of Education, notably Jim Shelton, the former Education Program Director at Gates who became Assistant Deputy Secretary for Innovation and Improvement, and Joanne Weiss, a former partner of Gates’ grantee the New Schools Venture Fund, who became the Chief of Staff under Duncan and was the lead architect of RTTT. The transition team also included education leaders who informally advised the Department of Education on the rubric for awarding RTTT grants, including several with close ties to the Gates Foundation, who spoke with us off the record because their involvement was not official. Overall, the substantial overlap between personnel in the Gates Foundation and the Obama administration enhanced the focus on teacher evaluation in RTTT. This close level of coordination between philanthropic funders and policymakers goes beyond the expectations in our conceptual framework.
Our framework posits a more mediated form of exchange—with funders largely providing resources to organizations that promote ideas to policymakers; however, our findings here show that the interaction between individuals tied to private philanthropy and policymakers can be very direct and personal.
In addition, across the three models in Table 2, we find that the coefficient for preference popularity is positive and statistically significant. Actors are more likely to support teacher evaluation preferences that are popular with other actors. We also show a positive result for preference overlap during the 2003 to 2007 time period (although we had to drop this term from the 2011–2015 model, due to model estimation issues). These results indicate that preferential attachment (the more connected a node is, the more likely it is to receive new links) was an important characteristic of the teacher quality policy debate—actors supporting teacher evaluation were influenced by the support of others. Preference popularity is statistically significant in each time period of our analysis, suggesting that the popularity of ideas about teacher accountability became a feature of federal policy discourse post-NCLB. Our interview data meanwhile conveyed how policy elites experienced the self-amplifying effects of preference popularity, filling in the contours of preferential attachment to teacher quality policies observed in the ERGM analysis.
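The self-amplifying dynamic described above can be illustrated with a toy simulation. The sketch below is not the ERGM specification used in our analysis; it is a minimal illustration of a preferential-attachment rule, under the assumption that each new actor backs one policy preference with probability proportional to that preference's current number of supporters plus a baseline. The function names, the preference labels, and the `baseline` parameter are all hypothetical.

```python
import random
from collections import Counter


def attachment_weights(support_counts, preferences, baseline=1.0):
    """Weight of each preference: baseline plus its current supporter count."""
    return [baseline + support_counts[p] for p in preferences]


def simulate_support(preferences, n_actors, seed=0, baseline=1.0):
    """Each new actor backs one preference, chosen with probability
    proportional to its current weight (a preferential-attachment rule)."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    support_counts = Counter({p: 0 for p in preferences})
    for _ in range(n_actors):
        weights = attachment_weights(support_counts, preferences, baseline)
        chosen = rng.choices(preferences, weights=weights, k=1)[0]
        support_counts[chosen] += 1
    return support_counts


# Hypothetical preference labels, for illustration only.
preferences = ["test score-based evaluation", "multiple measures", "status quo"]
counts = simulate_support(preferences, n_actors=200, seed=1)
for pref, n in counts.most_common():
    print(pref, n)
```

Because early support raises the odds of later support, runs of this kind tend to end with support concentrated on whichever preference gained an early lead. That is one way to picture why an idea that is already popular can be difficult to displace, although the simulation says nothing about which idea wins in any real debate.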
Rising issue popularity in the discourse around teacher evaluation led to important schisms between the policy instruments later supported by funders and those that were pursued during the RTTT era. Interestingly, the convergence of support for test score–based teacher evaluation occurred at the same time the Gates Foundation was launching its funding of the Measures of Effective Teaching (MET) study in 2009, which would focus more heavily on evaluation with multiple methods (including student evaluations and observations), as opposed to solely using test score–based metrics. One former Gates Foundation official described the context of the public policy debate at the time, noting that there were two distinct perspectives at play in the broader field: a multiple-measure approach, which characterized the Gates Foundation’s strategy at that time, and a quantitative test score–based approach:
There was a measured group of people [who believed in] multiple measures, who said no, a single test score would never do it. Then there was another group that were pushing on “We don’t need anything but the test scores”—just value added and growth measures on student achievement; in absence of observations of practice. Their messaging was about student performance alone, and you had enough instances where that got lifted up and misinterpreted. It wasn’t ever meant to be the only factor.
However, before investing in the MET study, the Gates Foundation had provided considerable grant support for organizations producing advocacy research, such as EdTrust and TNTP, that promoted test score–based teacher evaluation. Even though the Gates Foundation pivoted to support for multiple measures, it was too late to tamp down the popularity of test score–based teacher evaluation in the policy discourse. Another former Gates Foundation staff member reflected on what the foundation could have done differently to avoid this tension:
The staging piece was really interesting—that is, [the foundation should] have staged research first, student/parent/teacher voice second, new policy practices and operations third. I think [the problem was] that all three were done simultaneously. It would have been better had this been a ten-year investment initiative as opposed to a three to five. [The foundation] could have done a better job had [it] done more extensive piloting first.
The political opportunity to advance the teacher evaluation agenda during a critical moment may have forestalled a more deliberative and research-based approach to policymaking, but not without unintended consequences. This shows how the underlying mechanisms of preferential attachment may be related to how research is, or rather is not, used. It also highlights the very real hazards of self-perpetuating ideas in policy debates, even for the advocates who support them.
Conclusion
Teacher evaluation systems depend on technical strategies and methods for measuring teacher impacts on individual students, and many scholars continue to work on these measurement challenges in peer-reviewed research as well as more accessible policy reports and books. There is a rich field of research in this area that has grown steadily since the state-level policy changes promoted by RTTT. Yet our findings suggest that the participants in the federal policy debate on teacher quality mentioned research relatively infrequently when speaking in support of teacher evaluation reforms.
Prior studies have shown that the research promoted in national policy debates is sometimes quite narrow and politicized, depending on the strategic considerations of actors involved in the debate (Feuer, 2016; Henig, 2009; Lubienski et al., 2009; Manna & Petrilli, 2008; Rich, 2004; Rogers-Dillon, 2004; Weir, 1992). Our study explores some possible explanations for the narrow and politicized use of research, through the lens of the political economy of knowledge production. Focusing on the case of teacher evaluation policy discourse, we show that philanthropic funders supported organizations that produced and promoted advocacy research reports focused on test score–based teacher evaluation. While this was extremely effective for promoting new ideas when a policy window emerged, we also observed less frequent mentions of research in congressional debate during the policy window.
Overall, our study identifies specific political and contextual factors that may limit the uptake of research in national education policy debate, including the strategic role of funders, the availability of a policy window, synthesizer activity around an issue debate, and the runaway popularity of certain ideas among socially connected actors. This does not imply that research is not used at all; however, as we show, the use of research can be highly integrated into a focused advocacy strategy, which may limit the breadth and depth of research leveraged in policy discourse. Furthermore, the rapid uptake of evaluation policy ideas limited opportunities to conduct pilot studies and incorporate research findings from smaller scale trials of evaluation systems. With this study, we hope to inform future approaches to connecting research with policymakers, to illuminate the trajectory through which research travels, and to show why researchers should be attentive to the broader political context of research use in their areas of expertise.
It is promising to note some new strategies in this area, in particular, the development of place-based research-practice partnerships in cities like Chicago, New York City, and New Orleans (Turley, 2016). These efforts aim to “produce local, context specific research” (p. 279), which could ensure that research is more directly responsive to the interests of local policymakers. Others have noted that research-practice partnerships can engage in “joint work” that incorporates school district goals directly into research agendas (Penuel et al., 2015). Returning to our conceptual framework, these efforts may draw university-based researchers and policymakers closer together. Nonetheless, the context of national politics is quite different from that of local districts, and our study points to key features of the national political landscape that may present dilemmas for researchers. Furthermore, even at the local level, research by Scott et al. (2017) shows that funders can play an outsized role in driving research and advocacy agendas.
For researchers who are concerned about communicating effectively to policymakers, our findings present many challenges—policy debates can be crowded with ideas, debates often move rapidly, and it may be especially difficult to connect research to a policy debate at the moment of decision-making. Nonetheless, we do see some lessons in these findings for researchers. In particular, researchers should be attentive to the use of research by the education policy funder community. If funders do not effectively and attentively use research, but instead focus on rapid uptake of new ideas, their power to leverage broad advocacy networks may limit the steady uptake of research into the policy process. This puts researchers, who are incentivized to seek grant funding for their work, in a difficult position. The power dynamics that allow funders to influence research and advocacy agendas are entrenched by many features of the American political system. Yet without addressing the role of funders in both research and advocacy, researchers would miss one of the most influential components of our current system of educational policymaking.
Supplemental Material
Supplemental material, sj-docx-1-epa-10.3102_01623737211003906 for How the Political Economy of Knowledge Production Shapes Education Policy: The Case of Teacher Evaluation in Federal Policy Discourse by Sarah Reckhow, Megan Tompkins-Stange and Sarah Galey-Horn in Educational Evaluation and Policy Analysis
Footnotes
Acknowledgements
We thank Jeffrey Snyder and our wonderful team of research assistants at Michigan State University and the University of Michigan for their contributions to this project.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Funding was provided by William T. Grant Foundation (Grant No. 183183).
Authors
SARAH RECKHOW is an associate professor in the Department of Political Science at Michigan State University. Her research interests include urban politics, education policy, nonprofits, and philanthropy.
MEGAN TOMPKINS-STANGE is an assistant professor of public policy at the University of Michigan’s Ford School. Her research focuses on private philanthropic foundations and their political influence on K–12 education policy.
SARAH GALEY-HORN is a postdoctoral fellow at the University of Edinburgh in the Moray House School of Education, specializing in social network interventions and policy implementation.