Abstract
This research note presents a new database of policy diffusion research results useful for research synthesis. This database is a compilation of the results from every event history analysis model of policy diffusion in the American states published between 1990 and 2018. The result is 507 models with 6,641 variables. The database is publicly available and can be used to answer numerous questions regarding the veracity of policy diffusion research claims. It also provides a systematic understanding of where there are gaps in diffusion research that can be filled by scholars from many subfields. This article briefly discusses the data collection and coding processes, what is available in the database, and how it can be used. It also provides an illustrative meta-analysis of the effect of legislative professionalism on innovation adoption.
Since the 1960s, political science, public administration, and public policy scholars have been working to understand how and why innovative policy ideas spread among the American states (Walker 1969). It is an important line of research that addresses both horizontal and vertical intergovernmental relations in the American federal system. One review identified over 700 published articles on policy diffusion, not only within the American federal system, but also globally (Graham, Shipan, and Volden 2013). Perhaps surprisingly, there have been few efforts to systematically analyze the results of this vibrant research program. Maggetti and Gilardi (2016) meta-analyzed 114 diffusion mechanism studies to illustrate inconsistency in measuring diffusion mechanisms, and a recent meta-analysis in the public management literature included diffusion studies (de Vries, Tummers, and Bekkers 2018). Alas, there has not been a broad accounting of the research findings for many of the internal and external determinants of innovation adoption studied by diffusion researchers. Narrative expert reviews serve a useful purpose for setting the research agenda (e.g., Karch 2007), but they are not systematic. Efforts within political science (Costa 2017) and in fields beyond (Liberati et al. 2009) demonstrate the growing importance of formal assessment of research findings. Furthermore, while data accumulation efforts like the State Policy Innovation and Diffusion (SPID) database are pushing this research beyond single policy models (Boehmke et al. 2020), it is important to also assess what the existing results tell us before developing similar models with larger datasets.
The aim of this research note is to present the Policy Diffusion Results (PDR) database to researchers. This database is not in competition with SPID, because the two serve different purposes. PDR allows researchers to meta-analyze previously published diffusion research. It also allows researchers interested in specific policies or intergovernmental relationships to systematically analyze past findings of relevance to their research. Systematic reviews and meta-analyses help identify gaps in knowledge and changes to prevailing theory where empirical support is lacking writ large. This note introduces a new database that will allow for multiple and varied systematic assessments of policy diffusion research findings. Before the available data are described, the search and coding process is briefly summarized. Then a meta-analysis of the effect of legislative professionalism on innovation adoption is included as an example of how the database can be used. The article ends by discussing the potential uses of these data and how to keep the database updated over time.
Article Search Process
A search for relevant literature was first conducted in 2014, with updated searches in 2016 and 2018. The specific target for each search was any article that included an event history analysis (EHA) of at least one policy innovation. Since Berry and Berry (1990) introduced EHA as a means to model both internal and external determinants of policy adoption, it has become widely used for empirical diffusion research. The common modeling approach also facilitates eventual meta-analysis of the findings. The first step of each search was a Web of Science (WOS) search on Berry and Berry (1990). Alas, WOS does not capture all potential books and articles that have cited Berry and Berry (1990) and/or used EHA. It is human curated, so not all journals appear in its searches. Furthermore, the platform does not include books. Thus, a Google Scholar search using the broader search terms of “berry and berry” and “policy diffusion” was the second stage of each search. The goal was to cast the net wide in capturing all published diffusion articles. In the end, only the published literature is included in this dataset, as peer review is a demarcation of quality (Cook et al. 1993). Further, focusing on the published literature illuminates the methodological choices made by authors and legitimized by reviewers and editors, who serve as publication gatekeepers. Conceptual papers and those that simply cite Berry and Berry as a reference to policy diffusion were also excluded. Finally, this database focuses on the tightly connected body of studies on the American states (Graham, Shipan, and Volden 2013), and thus does not include international diffusion literature.
Available Data
After each search, two units were coded: each article that included an EHA model and each specific model used in those papers. The article search procedure yielded 183 articles that contained at least one EHA model. In total, those papers contained 507 logit, probit, Cox proportional hazard, generalized estimating equation, and dyadic models. This includes both models presented as primary results in the body of the article/book and results presented in printed and online appendices. Entries were verified by three research assistants. These coding efforts resulted in two linked datasheets that form a single database. Both, as well as a codebook and a spreadsheet of included and excluded articles, are available at https://doi.org/10.7910/DVN/NASPUC. Table 1 presents the variables that were extracted from each model and contained in the two datasheets. The left column presents the model-level measures and the right column the variable-level measures. The Model Identifier variable in each of the two datasheets can be used to link them together.
Table 1. Model and Variable Measures Recorded in the Database.
* Assigned using the Pennsylvania Policy Agendas Project codebook (McLaughlin et al. 2010), which is adapted for state policy.
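The linkage between the two datasheets through the Model Identifier can be illustrated with a minimal sketch. The column names (`model_id`, `citation`, `model_type`, `variable_name`, `coefficient`) and the example rows are hypothetical placeholders; the actual field names are defined in the codebook distributed with the database.

```python
# Hypothetical rows standing in for the two datasheets. Real column names
# are defined in the database codebook and may differ from these.
models = [
    {"model_id": "M001", "citation": "Example A (2001)", "model_type": "logit"},
    {"model_id": "M002", "citation": "Example B (2010)", "model_type": "cox"},
]
variables = [
    {"model_id": "M001", "variable_name": "Neighbors", "coefficient": 0.42},
    {"model_id": "M001", "variable_name": "Professionalism", "coefficient": -0.10},
    {"model_id": "M002", "variable_name": "Neighbors", "coefficient": 1.30},
]


def link_datasheets(model_rows, variable_rows, key="model_id"):
    """Attach model-level fields to every variable-level row (a left join)."""
    by_id = {row[key]: row for row in model_rows}
    linked = []
    for var in variable_rows:
        # Variable rows without a matching model keep only their own fields.
        model = by_id.get(var[key], {})
        linked.append({**model, **var})
    return linked


linked = link_datasheets(models, variables)
```

Each linked row then carries both the reported effect and the modeling choices of the model it came from, which is the form most research syntheses would need.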
Model Datasheet
The model-level measures include basic citation information, an indicator of whether the publishing journal is categorized in “political science” by Clarivate Analytics, temporal and policy measures, and measures of modeling choices. Most of the models that are included in the database were estimated for a single policy. This is a common, though changing, practice in diffusion research (Boehmke 2009). Each policy was coded using a version of the Comparative Agendas Project major and sub-topic coding scheme modified for state data (McLaughlin et al. 2010). The appropriate subtopic was determined first and then the corresponding major topic assigned. No specific policy is recorded in the case of large-n EHA studies that use policies from multiple topics (e.g., Desmarais, Harden, and Boehmke 2015). For a host of reasons, not all studies include the full 50 states in their analysis; thus, there is a measure of the number of states included and a list of excluded states. Temporally, the first observed adoption year and last observed adoption year capture the length of the observation window for the study. Short observation windows do not always mean that the innovation diffused quickly, however. Many diffusion studies are conducted before a policy fully spreads to all 50 states. Moreover, many policies fail to be adopted by all states (e.g., Hannah and Mallinson 2018), creating a delicate balance for diffusion researchers studying policies as they spread.
In terms of modeling, four important choices are captured in the Model datasheet. First, diffusion models typically consider the geographical patterns of innovation adoption. The most common measure is whether neighboring states have previously adopted an innovation. This is thought to imply either learning, competition, or social contagion mechanisms of diffusion (Berry and Baybeck 2005; Pacheco 2012). However, researchers have also modeled adoption behavior in broader regions, among ideological peers, and more. The inclusion of alternative geographical patterns is also captured in the model-level data. Second, the type of model used (logit, probit, or other) is identified. Berry and Berry (1990) used a probit model, which precipitated the use of many probit models in subsequent studies, but it is not the only functional form that is appropriate for diffusion data (Buckley and Westerland 2004). The type of effects reported, meaning whether raw coefficients or hazard ratios are used, is the third choice. Finally, there are a variety of methods for modeling duration dependence in EHA data, such as counts, logs, polynomials, and splines of time (Beck, Katz, and Tucker 1998; Carter and Signorino 2010), so those choices are captured.
Variables Datasheet
Turning to the variable-level measures, the Variables datasheet contains descriptors, estimates and indicators of statistical significance, sample sizes, and indicators for clusters of variables that researchers may be particularly interested in. In terms of descriptors, the name used by the original article or book is recorded, so that each variable can be easily linked back to its source. Researchers do not always use easily interpretable variable names in their tables; thus, the scale of the variable and indicators of variable clusters (e.g., geographic, political, demographic, etc.) can help users understand what is being measured. The model-level data serve as an important linkage to the original article, so this information can be verified by accessing the original articles.
In terms of data useful for meta-analysis, the coefficient for each variable in included models is recorded (including whether it is raw or transformed), as are any reported standard errors, p-values, t-statistics, and/or z-scores. An effort was made to also provide an indicator of the level of statistical significance using categories typical in political science (p < 0.10, p < 0.05, p < 0.01, p < 0.001). These were recorded directly from the article when provided, otherwise they are based on calculations using the coefficient and standard error, t-statistic, or z-score. The number of observations is also included, as this is necessary for weighting in a meta-analysis.
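When an article reports only a coefficient and standard error, a significance category of the kind described above can be derived as in the following sketch. It assumes a raw (untransformed) coefficient and a two-sided test based on the standard normal distribution; this is an illustration of the general calculation, not necessarily the exact procedure used in coding the database, and the function names are hypothetical.

```python
import math


def two_sided_p(coefficient, standard_error):
    """Two-sided p-value for a raw coefficient, using a normal approximation."""
    z = coefficient / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def significance_category(p_value):
    """Map a p-value to the strictest conventional threshold it satisfies."""
    thresholds = (
        (0.001, "p < 0.001"),
        (0.01, "p < 0.01"),
        (0.05, "p < 0.05"),
        (0.10, "p < 0.10"),
    )
    for cutoff, label in thresholds:
        if p_value < cutoff:
            return label
    return "not significant"
```

For instance, `significance_category(two_sided_p(0.17, 0.10))` corresponds to a z-score of 1.7 and falls in the "p < 0.10" category. Effects reported as odds or hazard ratios would first need to be converted to the log scale before this calculation applies.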
Finally, a series of indicators are included for clusters of variables that may be of interest to diffusion scholars. Many of them are included in Table 1 and will be further discussed in Table 2 below. They include geographical variables, federal activity, ideological distance (Grossback, Nicholson-Crotty, and Peterson 2004), citizen and government ideology (Berry et al. 1998), other political variables like party control of government, slack resources, legislative professionalism (Squire 1992), and variables that are highly contextualized for the purpose of modeling specific types of policies. An example of a highly contextualized variable would be a measure of carbon dioxide emissions in a diffusion model of climate and energy policies (Bromley-Trujillo et al. 2016). Also clustered are model intercepts and other variables like interaction terms.
Table 2. Counts of Observations Within Each Variable Cluster.
Uses of the Data
There are myriad uses of this database not only for policy diffusion researchers, but researchers of state politics more broadly and those interested in concepts like polarization, ideology, institutional design, divided government, and more. Table 2 presents a summary of the clusters in the dataset. The first column reports the total number of models that include each variable type and the second reports the total number of effects recorded in the database. Some variables (e.g., South Dummy) only appear in a model once, but most categories can be represented several times in a single model (e.g., Slack Resources).
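The two counts described here, models containing a variable type and total effects of that type, can be tallied from variable-level rows as in this minimal sketch. The `cluster` and `model_id` field names and the example rows are hypothetical; in the actual datasheet the clusters are recorded as indicators, so some reshaping may be required first.

```python
def cluster_counts(variable_rows):
    """Return {cluster: (number of distinct models, number of effects)}."""
    models_per_cluster = {}
    effects_per_cluster = {}
    for row in variable_rows:
        cluster = row["cluster"]
        # A model is counted once per cluster, however many effects it has.
        models_per_cluster.setdefault(cluster, set()).add(row["model_id"])
        effects_per_cluster[cluster] = effects_per_cluster.get(cluster, 0) + 1
    return {
        cluster: (len(models_per_cluster[cluster]), effects_per_cluster[cluster])
        for cluster in effects_per_cluster
    }


# Hypothetical example: model M1 has two slack resource measures.
example_rows = [
    {"model_id": "M1", "cluster": "Slack Resources"},
    {"model_id": "M1", "cluster": "Slack Resources"},
    {"model_id": "M1", "cluster": "South Dummy"},
    {"model_id": "M2", "cluster": "Slack Resources"},
]
counts = cluster_counts(example_rows)
```

In this toy example, slack resources appear in two models with three effects, while the South dummy appears once in one model, mirroring the distinction between the two columns.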
Table 2 is a starting point for identifying possible research syntheses and identifying gaps in diffusion research. The specific clusters are not meant to be exhaustive but are instead illustrative of key variables in policy diffusion theory. For diffusion scholars, meta-analyses of neighbor adoptions, ideological distance, state ideology, slack resources, legislative professionalism, and other political/institutional variables will offer a systematic picture of the state of research findings. However, researchers specifically interested in legislative professionalism, interest groups, race and ethnicity, and more can use this database to assess their variables of interest. The database also reveals where there are important holes in our research. For example, very few models include a measure of federal activity, though there have been important findings regarding the effects of coercion through funding as well as salience raising through issue attention (Karch 2012; Baumgartner, Gray, and Lowery 2009; Welch and Thompson 1980).
Policy context variables are the most frequently included variable type across all diffusion models (1,889 variables). Other examples of contextual variables would include glaucoma incidence, cancer incidence, and marijuana usage in Hannah and Mallinson’s (2018) study of medical marijuana policy diffusion. Such variables can be focal for a researcher interested in the spread of a particular policy or they can capture a broader concept, such as the problem environment (Nice 1994), as in the preceding example. They can also be control variables, not the key variables of interest. They are grouped together because each is highly contextualized vis-à-vis the specific policy being modeled. Among political scientists, political variables are more often variables of interest and are the next largest group (1,298 variables). These measure things like interest group presence, control of government institutions, and direct democracy. Ideology (414) and measures of slack resources (532) are also included in many models. Perhaps unsurprisingly, neighbor adoptions are captured in some way in many of the models. In most cases, there is only one measure of this concept in each model. Many, though not nearly all, of the models (332) include a measure of duration dependence. These are only some of the potential clusters of variables that policy diffusion, state politics, interest group, responsiveness, and many other researchers may be interested in evaluating.
Example Meta-Analysis
A brief meta-analysis example illustrates how the database can be used. A full reporting on key external diffusion mechanisms (e.g., neighbor adoptions, relative ideology, federal influence, etc.) is beyond the scope of this research note. Those results can be found in Mallinson (2020a). While external forces are of greatest interest to diffusion researchers who are working to identify the specific mechanisms that drive innovation adoption (learning, competition, coercion, emulation, social contagion), researchers from many different fields are interested in variables that are otherwise often used as controls in diffusion models. Additionally, the database does capture research that examines the interactions between external and internal determinants (e.g., Clouser McCann, Shipan, and Volden 2015).
Legislative professionalism is one such variable. It has its own devoted body of research, and thus is a focal variable in some analyses, and it is used as a control variable in many other state politics models. While not exclusively measured as such, the Squire Index is widely used to capture the degree of professionalism in state legislatures (Squire 1992, 2007, 2017). King (2000) is the most common alternative. Using the U.S. Congress as the marker of the most professionalized legislature, Squire’s Index captures three attributes of state legislatures: pay, average days in session, and average staff per member (Squire 1992). Each measures the resources available to legislators, and thus the overall capacity to legislate each year. State legislatures are not fixed in their professionalism over time; in fact, they have been professionalizing at varying rates for more than 100 years (Squire 2012; King 2000).
In diffusion research, legislative professionalism is an important measure of state resources in many policy models. The availability of slack resources is vital for overcoming obstacles, like status quo bias and institutional inertia, that make innovation difficult (Berry and Berry 2018). Thus, variables like legislative professionalism are more than simply controls; they are important internal determinants in the Berry and Berry (1990) framework of internal and external adoption forces. Recent research, for example, reveals how less professionalized legislatures are more likely to copy legislative language (Jansa, Hansen, and Gray 2019), confirming anecdotes presented in Walker’s (1969) foundational study. As a measure of legislative resources, professionalism is expected to have a positive effect on innovation adoption. The true role of legislative professionalism, however, remains unclear in diffusion research. Large pooled event history models have found negative effects (Boehmke and Skinner 2012; Mallinson 2020b) or no effect (Mallinson 2019). Taken together, the above makes legislative professionalism a useful and interesting example of the meta-analysis capabilities of the PDR database. Meta-analysis allows researchers to determine if the existing body of EHA studies supports a general effect of legislative professionalism, as well as the direction and magnitude of that effect.
Of the 185 studies in the database, 51 contained models that measured legislative professionalism, most often with the Squire Index. As Table 2 reports, there are 126 models with 131 legislative professionalism effects reported. A total of 124 effects are included in the meta-analysis, namely those that reported either odds ratios or raw coefficients in their output. To calculate the average effect of legislative professionalism on innovation adoption, a random effects meta-analysis was conducted in R (Schwarzer, Carpenter, and Rücker 2015; Viechtbauer 2010). The Knapp-Hartung-Sidik-Jonkman adjustment was used (Hartung and Knapp 2001; Sidik and Jonkman 2002), and the results are weighted by the sample size of the original studies. The meta-analysis results in an odds ratio of 0.84 and a 95% confidence interval of [0.69, 1.01]. While this result is not statistically significant at conventional levels (p < 0.05), nearly all of the confidence interval falls below one, suggesting a negative effect (an odds ratio above 1 would be a positive effect). Nevertheless, the significant effects among the 124 total effects are split almost evenly between positive (14 percent) and negative (12 percent). Even the non-significant results are split (40 percent negative, 34 percent positive). Given that the results are weighted by sample size, the pull toward a negative average effect is likely driven by the results of one very large dyadic EHA (Hinkle 2015).
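To make the pooling step concrete, the following is a minimal sketch of a random effects meta-analysis of log odds ratios. It is not the procedure reported above: it swaps in the DerSimonian-Laird estimator with inverse-variance weights and a normal-based confidence interval, whereas the analysis in this note was run in R with the Knapp-Hartung-Sidik-Jonkman adjustment and sample-size weights. It therefore should not be expected to reproduce the 0.84 estimate. The function name and inputs are hypothetical.

```python
import math


def random_effects_pool(log_odds_ratios, std_errors):
    """DerSimonian-Laird random-effects pooling of log odds ratios."""
    variances = [se ** 2 for se in std_errors]
    k = len(log_odds_ratios)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_odds_ratios)) / sum_w
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_odds_ratios))

    # Between-study variance (tau squared), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau squared to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_star, log_odds_ratios)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))

    lower = pooled - 1.96 * se_pooled
    upper = pooled + 1.96 * se_pooled
    return {
        "odds_ratio": math.exp(pooled),
        "ci_95": (math.exp(lower), math.exp(upper)),
        "tau2": tau2,
    }
```

Raw logit coefficients can be passed directly as log odds ratios, while effects reported as odds ratios would be logged first, with their standard errors expressed on the log scale. A pooled odds ratio below 1 indicates a negative average effect, as in the interpretation above.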
The types of policies that exhibit a negative effect include human trafficking prevention, physical education requirements, state resistance to No Child Left Behind, local bills of rights, net metering, and state anti-bullying standards. Many of these policies tend to have less technical complexity. Policies that show a positive effect for legislative professionalism include eminent scholar programs, electric deregulation, medical savings accounts, wage pass-through for Medicaid reimbursement, correctional boot camps for adult offenders, higher education financing, smoking bans, voting reform, DUI interlocks, several tax credits, and performance funding for higher education. There is also a large pooled EHA that reports a significant and positive effect for professionalism (Desmarais, Harden, and Boehmke 2015). It is notable, however, that for 74 percent of the reported effects, legislative professionalism was not a statistically significant predictor of adoption. The meta-analysis thus suggests that there may in fact be no general effect of legislative professionalism across all types of policies. Its effect appears highly dependent on the policy or set of policies included in an analysis. This points to a fruitful avenue of additional research to understand exactly where, when, and why legislative professionalism plays a role in innovation adoption.
In addition to estimating weighted-average effects, researchers can use this database to identify the potential for publication bias. Figure 1 presents a funnel plot for the legislative professionalism effects included in the meta-analysis above. If there is reporting bias, meaning researchers tend to report only statistically significant effects (which biases meta-analyzed findings), then the combination of odds ratios and standard errors will not fan out evenly across this space. Figure 1 shows that the reported effects do appear to scatter throughout the funnel, meaning that variation in the reported effects appears to be due to sampling variation, not a “file drawer” effect (Rosenthal 1979).

Figure 1. Funnel plot of legislative professionalism effects.
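The logic of a funnel plot can be sketched in a few lines. Each effect is placed by its log odds ratio and standard error, and a pseudo 95 percent funnel is drawn around the pooled estimate; absent reporting bias and heterogeneity, most points should fall inside the funnel and spread symmetrically around its center. The function below is an illustrative assumption, not the plotting routine used for Figure 1.

```python
def funnel_points(log_odds_ratios, std_errors, pooled_log_or):
    """Locate each effect relative to a pseudo 95% funnel around the pooled value."""
    points = []
    for effect, se in zip(log_odds_ratios, std_errors):
        # The funnel widens as the standard error grows.
        lower = pooled_log_or - 1.96 * se
        upper = pooled_log_or + 1.96 * se
        points.append(
            {
                "log_or": effect,
                "se": se,
                "inside_funnel": lower <= effect <= upper,
                "side": "left" if effect < pooled_log_or else "right",
            }
        )
    return points


# Hypothetical effects around a pooled log odds ratio of zero.
example = funnel_points([0.0, 0.5, -0.1], [0.1, 0.1, 0.3], 0.0)
```

Comparing the number of points on each side of the pooled estimate, particularly among the imprecise effects with large standard errors, offers a rough check on the symmetry that a visual inspection of the plot is meant to assess.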
Updating the Database
While this database can be quite useful, it is going to become quickly dated. Also, it does not presently include unpublished studies that can be found online and in dissertation and thesis repositories (i.e., the “gray” literature). Updating the database is far easier in some respects than assembling it in the first place, but it will take the effort of more than one small research group. The data are provided publicly not only so that they are useful, but also so that they can be updated by other interested parties. If there is collaborative interest in updating the database with both newly published literature and the gray literature, consider this an invitation to collaborate and contact the author accordingly. Any efforts to update the database must be systematic and include as broad a search as possible. Only then can there be confidence that the database includes all relevant results. Perusing the model-level spreadsheet quickly dispels any notion that the study of policy diffusion is limited to the fields of political science, public policy, and public administration. Relevant articles that test diffusion theory have been published in a diverse array of scholarly venues and by many disparate disciplines. Thus, casting the net wide is vital for understanding this body of literature. That was the goal of this database and should continue to be the goal as research synthesis efforts progress.
Conclusion
The research on policy diffusion is both vibrant and vast, but there has been little systematic effort to understand its results. Narrative expert reviews serve an agenda setting role for researchers, but they cannot accurately assess the entire body of knowledge. Systematic reviews and meta-analyses provide a means for assessing patterns, trends, and important findings in research results, but accumulating the necessary data can be daunting. The intent of the Policy Diffusion Results database is to provide the means necessary for undertaking such research synthesis. Given the breadth of these findings, this is an effort larger than a single researcher. The database is made public with the hopes that it has uses that have not even been envisioned yet by the original compiler.
Acknowledgments
The author would like to thank Luke Yingling, Nick Turnier, and Tom Delany for their help in coding and checking the database. The author would also like to thank the anonymous reviewers for their constructive feedback. All errors in the manuscript and database, however, are the author's own.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
