Abstract
Theorizing under the rubric of paradigmatic ‘isms’ has made important conceptual contributions to International Relations, but the organization of the subfield around these isms is based on flawed readings of the philosophy of science and has run its course. A promising alternative is to build on the philosophical foundation of scientific realism and orient International Relations theorizing around the idea of explanation via reference to hypothesized causal mechanisms. Yet in order to transform the practice of International Relations theorizing and research, calls for ‘analytic eclecticism’ must not only demonstrate that scientific realism is a defensible epistemology amenable to diverse methods; they must provide a structured and memorable framework for diverse and cumulative theorizing and research, field-wide discourse, and compelling pedagogy. I introduce a ‘taxonomy of theories about causal mechanisms’ as a structured pluralist framework for encompassing the theories about mechanisms of power, institutions, and legitimacy that have been providing the explanatory content of the isms all along. This framework encourages middle-range or typological theorizing about combinations of causal mechanisms and their operation in recurrent contexts, and it offers a means of reinvigorating the dialogue between International Relations, the other subfields of political science, and the rest of the social sciences.
Introduction
The political science subfield of International Relations (IR) continues to undergo debates on whether and in what sense it is a ‘science,’ how it should organize its inquiry into international politics, and how it should build and justify its theories. On one level, an ‘inter-paradigm’ debate, while less prominent than during the 1990s, has continued to limp along among researchers who identify their work as fitting within the research agenda of a grand school of thought, or ‘ism,’ and the scholar most closely associated with it, including neorealism (Waltz, 1979), neoliberalism (Keohane, 1984), constructivism (Wendt, 1992), or occasionally Marxism (Wallerstein, 1974) or feminism (Tickner, 1992). Scholars participating in this debate have often acted as if their preferred ‘ism’ and its competitors were either ‘paradigms’ (following Kuhn, 1962) or ‘research programs’ (as defined by Lakatos, 1970), and some have explicitly framed their approach as paradigmatic or programmatic (Hopf, 1998).
A second level of the debate involves post-positivist critiques of IR as a ‘scientific’ enterprise (Lapid, 1989). While the vague label ‘post-positivist’ encompasses a diverse group of scholars, frequent post-positivist themes include arguments that observation is theory-laden (Kuhn, 1962), that knowledge claims are always part of mechanisms of power and that meaning is always social (Foucault, 1978), and that individual agents and social structures are mutually constitutive (Wendt, 1992). Taken together, these arguments indicate that the social sciences face even more daunting challenges than the physical sciences.
A third axis of contestation has been methodological, involving claims regarding the strengths and limits of statistical, formal, experimental, qualitative case study, narrative, and other methods. In the last two decades the argument that there is ‘one logic of inference’ and that this logic is ‘explicated and formalized clearly in discussions of quantitative research methods’ (King et al., 1994: 3) has generated a useful debate that has clarified the similarities, differences, uses, and limits of alternative methods (Brady and Collier, 2010; George and Bennett, 2005; Goertz and Mahoney, 2006).
These debates have each in their own way proved fruitful, increasing the theoretical, epistemological, and methodological diversity of the field (Jordan et al., 2009). The IR subfield has also achieved considerable progress in the last few decades in its theoretical and empirical understanding of important policy-relevant issues, including the inter-democratic peace, terrorism, peacekeeping, international trade, human rights, international law, international organizations, global environmental politics, economic sanctions, nuclear proliferation, military intervention, civil and ethnic conflicts, and many other topics.
Yet there is a widespread sense that this progress has arisen in spite of inter-paradigmatic debates rather than because of them. Several prominent scholars, including Rudra Sil and Peter Katzenstein, have argued that although research cast within the framework of paradigmatic debates has contributed useful concepts and findings, framing the IR field around inter-paradigmatic debates is ultimately distracting and even counterproductive (Sil and Katzenstein, 2010; see also David Lake, 2011, and in this special issue, and Patrick Thaddeus Jackson and Daniel Nexon, 2009, and in this special issue). These scholars agree that IR researchers have misapplied Kuhn’s notion of paradigms in ways that imply that grand theories of tightly connected ideas — the isms — are the central focus of IR theorizing, and that such isms should compete until one wins general consensus. Sil and Katzenstein argue that the remedy for this is to draw on pragmatist philosophers and build upon an ‘eclectic’ mix of theories and methods to better understand the world (Sil and Katzenstein, 2010). In this view, no single grand theory can capture the complexities of political life, and the real explanatory weight is carried by more fine-grained theories about ‘causal mechanisms.’
In this article I argue that those urging a pragmatic turn in IR are correct in their diagnosis of the drawbacks of paradigms and their prescription for using theories about causal mechanisms as the basis for explanatory progress in IR. Yet scholars are understandably reluctant to jettison the ‘isms’ and the inter-paradigmatic debate not only because they fear losing the theoretical and empirical contributions made in the name of the isms, but because framing the field around the isms has proven a useful shorthand for classroom teaching and field-wide discourse. The ‘eclectic’ label that Sil and Katzenstein propose can easily be misinterpreted in this regard, as the Merriam-Webster online dictionary defines ‘eclectic’ as ‘selecting what appears to be best in various doctrines, methods, or styles,’ as Sil and Katzenstein clearly intend, but it also includes as synonyms ‘indiscriminate’ and ‘ragtag.’ 1 By using the term ‘eclecticism’ and eschewing any analytic structure for situating and translating among different examples of IR research, Sil and Katzenstein miss an opportunity to enable a discourse that is structured as well as pluralistic, and that reaches beyond IR to the rest of the social sciences.
I maintain that in order to sustain the genuine contributions made under the guise of the inter-paradigmatic debate and at the same time get beyond it to focus on causal mechanisms rather than grand theoretical isms, four additional moves are necessary. First, given that mechanism-based approaches are generally embedded within a scientific realist philosophy of science, it is essential to clarify the philosophical and definitional issues associated with scientific realism, as well as the benefits — and costs — of making hypothesized causal mechanisms the locus of explanatory theories. As Christian Reus-Smit argues in this special issue, IR theory cannot sidestep metatheoretical debates. Second, it is important to take post-positivist critiques seriously and to articulate standards for theoretical progress, other than paradigmatic revolutions, that are defensible even if they are fallible. Third, achieving a shift toward mechanismic explanations requires outlining the contributions that diverse methods can make to the study of causal mechanisms. Finally, it is vital to demonstrate that a focus on mechanisms can serve two key functional roles that paradigms played for the IR subfield: first, providing a framework for cumulative theoretical progress; and, second, constituting a useful, vivid, and structured vocabulary for communicating findings to fellow scholars, students, political actors, and the public (see also Stefano Guzzini’s article in this special issue). I argue that the term ‘structured pluralism’ best captures this last move, as it conveys the sense that IR scholars can borrow the best ideas from different theoretical traditions and social science disciplines in ways that allow both intelligible discourse and cumulative progress.
After briefly outlining the problems associated with organizing the IR field around the ‘isms,’ this article addresses each of these four tasks in turn. First, it takes on the challenges of defining ‘causal mechanisms’ and using them as the basis of theoretical explanations. Second, it acknowledges the relevance and importance of post-positivist critiques of causal explanation, yet it argues that scientific realism and some approaches to interpretivism are compatible, and that there are standards upon which they can agree for judging explanatory progress. Third, it very briefly clarifies the complementary roles that alternative methods can play in elucidating theories about causal mechanisms. Finally, the article presents a taxonomy of theories about social mechanisms to provide a pluralistic but structured framework for cumulative theorizing about politics. This taxonomy provides a platform for developing typological theories — or what others in this special issue, following Robert Merton, have called middle-range theories — on the ways in which combinations of mechanisms interact to produce outcomes. Here, I join Lake in this special issue in urging that IR theorizing be centered around middle-range theories, and I take issue with Jackson and Nexon’s suggestion herein that such theorizing privileges correlational evidence, and their assertion that statistical evidence is inherently associated with Humean notions of causation. I argue that my taxonomy of mechanisms offers a conceptual bridge to the paradigmatic isms in IR, adopting and organizing their theoretical insights while leaving behind their paradigmatic pretensions. The article concludes that, among its other virtues, this taxonomy can help reinvigorate dialogues between IR theory and the fields of comparative and American politics, economics, sociology, psychology, and history, stimulating cross-disciplinary discourses that have been inhibited by the scholasticism of IR’s in-grown ‘isms.’
The problem with paradigms and research programs in IR
IR scholars rightly focus their efforts on their empirical and theoretical research and its policy implications, but this has often left them out of date in their understanding of debates in the philosophy of science. Many IR scholars have continued to espouse epistemological ideas that philosophers of science found to be deeply problematic decades ago. A recent survey of American scholars in the IR subfield, for example, reported that approximately two-thirds identified themselves as ‘positivists’ (Maliniak et al., 2011: 454), even though the philosophical schools of thought most closely associated with this label fell out of favor among philosophers of science more than a half-century ago. Survey respondents may have chosen to call themselves positivists in part because this was one of the few response choices offered to them in the survey, but the very fact that it was offered as an unproblematic option is indicative of the problem.
The IR field’s lingering paradigm wars are another manifestation of out-of-date and problematic views on the philosophy of science. With apologies to those who have heard the ‘Kuhn to Lakatos and beyond’ story many times before, Thomas Kuhn introduced the concept of paradigms in the early 1960s, defining them as problem-solving achievements ‘that some particular scientific community acknowledges for a time as supplying the foundation for its further practice’ (Kuhn, 1962: 10). In Kuhn’s view, scientific progress proceeds through two routes. The first of these is ‘normal science,’ in which paradigmatic views are widely shared within a community and scientists largely agree on what problems have been solved, what puzzles remain open, and what rules and standards should be used to judge progress on remaining problems. The second route for progress is via ‘scientific revolutions,’ in which the dominant paradigm encounters increasing anomalies, an alternative paradigm arises that claims to resolve these anomalies, and then either the existing paradigm proves able to resolve its anomalies or the alternative over time gains new adherents and converts and becomes dominant. Kuhn argued that because competing paradigms were incommensurate and claimed to solve different problems, ‘as in political revolutions, so in paradigm choice — there is no standard higher than the assent of the relevant community … this issue of paradigm choice can never be unequivocally settled by logic and experiment alone’ (Kuhn, 1962: 93).
There are two core problems with the ways in which the IR field has oversimplified and misappropriated Kuhn’s notion of paradigms. The first is that scholars have constructed IR’s leading ‘paradigms’ around groups of theories about kinds of causal mechanisms that are in fact not mutually exclusive (see also Jackson and Nexon, 2009). Although IR scholars disagree on the relative strength and scope conditions of mechanisms involving material power, institutional transactions costs, or social relations, few would deny that all these kinds of mechanisms matter in international politics. Yet scholars have often presented their findings as if one ‘paradigm,’ focusing on just one of these sets of mechanisms, should displace another. This not only misapplies the notion of paradigms, it misreads the work of Kenneth Waltz, Robert Keohane, and Alex Wendt, the three iconic representatives of the main ‘isms’ in IR. In fact, none of these scholars has been wedded to using only one ‘paradigm’ to explain international politics. Waltz’s early work emphasizes the importance of combining different images or levels of analysis (Waltz, 1959), Keohane has written about sociological as well as transactions costs approaches to institutions (Keohane, 1988), and Wendt has acknowledged the role of material power and the centrality of states in the international system as currently (albeit not timelessly) constituted (Wendt, 1995).
A second problem has been the attribution of explanatory power to IR’s paradigms themselves, rather than to the theories about mechanisms that comprise each paradigm. A close reading of Waltz, Keohane, or Wendt makes clear that the explanatory power of each of their grand theories relies on more specific theories about mechanisms. For Waltz, critical mechanisms include evolutionary selection of successful states by the system and emulation of these successful states by other states; for Keohane, mechanisms involving the use of institutions to lower the transactions costs of negotiating, monitoring, and enforcing rules are key; for Wendt, mechanisms of socialization and legitimization are central.
A third problem lies with Kuhn himself. Kuhn problematized but did not fully abandon empirical evidence as an important input to choices between paradigms, leaving his ultimate position ambiguous on what kind of evidence is useful in theory choice and how it can be used. In Structure of Scientific Revolutions, Kuhn states that ‘it makes a great deal of sense to ask which of two actual and competing theories fits the facts better’ before adding that ‘this formulation, however, makes the task of choosing between paradigms look both easier and more familiar than it is’ (1962: 147, emphasis in original). The passages that follow outline his famous arguments about the incommensurability of paradigms, but Kuhn never argues that evidence is irrelevant. Moreover, in his later writings, Kuhn rejected the interpretation that cross-paradigm communication is impossible. His view is more accurately read as pointing out the challenges and limits of such communication, rather than rejecting it altogether or suggesting that evidence is irrelevant (Jackson and Nexon, 2009; Patomaki and Wight, 2000: 225–227; Wight, 2006: 40–45).
Yet Kuhn’s emphasis on the sociological aspects of science and paradigm choice left him open to critiques by Imre Lakatos and others that Kuhn had portrayed judgments on scientific progress as more subjective and relativistic than they actually are. Lakatos sought to find an alternative to both Kuhnian relativism and to what Lakatos called ‘naïve falsificationism,’ or the view that theories should be falsified at the first sign of any anomaly. Lakatos agreed that theories cannot be easily falsified because they involve potentially false auxiliary assumptions about instruments of observation and other issues, so empirical anomalies could be attributed to flaws in these auxiliary assumptions rather than in theories themselves. Lakatos’s solution was to characterize theories as ‘research programs’ with ‘hard core’ assumptions that are not subject to direct empirical testing and ‘outer belt’ assumptions that can be tested empirically. In his view, theory testing involved a three-cornered fight between competing theories and the evidence. A theory could be judged as no longer producing scientific progress if it encountered a series of anomalies that it could not resolve, and if at the same time an alternative theory could not only resolve these anomalies but do so by producing ‘novel facts’ that survived empirical corroboration (Lakatos, 1970).
Lakatos was ambiguous on what would constitute ‘novel facts,’ but subsequent accounts have clarified two kinds: ‘use novelty,’ in which a theory proves consistent with facts that were not used to construct the theory, and ‘background theory novelty,’ in which a theory explains facts that no other theory can credibly purport to explain (Elman and Elman, 2003). I have argued elsewhere that this focus on novel facts is Lakatos’s most useful contribution (Bennett, 2003), and I highlight below the particular importance of use novelty, which is a key standard of theoretical progress upon which scientific realists and interpretivists can agree.
Some IR theorists have embraced Lakatos’s metatheory as an improvement over Kuhn’s, and reframed IR’s paradigms as Lakatosian research programs (see, for example, Hopf, 1998). Yet this raises problems of its own. First, a conference devoted to assessing Lakatos’s framework as a standard for judging progress in IR theory, and one that included Keohane and Waltz among its participants, concluded that the distinction of the hard core and outer belt of theories is ultimately arbitrary (Elman and Elman, 2003). Second, treating IR’s ‘isms’ as research programs creates the same problem as the misappropriation of Kuhn’s notion of paradigms: it wrongly suggests that the central organizing principle in IR should be big agglomerations of highly interdependent theories and that these agglomerations are somehow in contestation with one another (for similar critiques see Jackson and Nexon, 2009, and in this special issue). Some scholars have therefore suggested embracing Laudan’s looser concept of ‘research traditions’ as an alternative to paradigms or research programs (Laudan, 1977; Sil and Katzenstein, 2010). This helps get away from the false impression that IR’s isms should be thought of as competing world views, but Laudan provides little structure to his concept of research traditions and he is frustratingly vague on the issues of theoretical progress and theory choice.
Defining causal mechanisms
The way out of the cul-de-sac of paradigms and research programs lies not in a retreat to Laudanian ambiguity but in a parallel development in the philosophy of social science in the last several decades: the scientific realist turn toward theories about causal mechanisms as the locus of scientific explanations. 2 Theories about mechanisms are far narrower and more discrete than paradigms or research programs. The idea of explanation via reference to causal mechanisms builds on the work of ‘critical realists’ like Roy Bhaskar (1975) and Milja Kurki (2007) and ‘scientific realists’ like Colin Wight (2006, 2007). 3 Wight’s view, fully consonant with the present argument, is that scientific realism entails three commitments: ‘ontological realism (that there is a reality independent of the mind(s) that would wish to know it); epistemological relativism (that all beliefs are socially produced); and judgmental rationalism (that despite epistemological relativism, it is still possible, in principle, to choose between competing theories)’ (Wight, 2006: 26; for a summation of critical realism compatible with my argument, see Kurki, 2007: 364).
Scientific realists do not agree on all the particulars of the philosophy of science (Chernoff, 2002), but they generally share these three commitments, and in so doing they improve upon earlier philosophies of science in important ways. In particular, earlier efforts by Carl Hempel and Paul Oppenheim failed to achieve a satisfactory account of explanation via reference to ‘covering laws,’ or universal regularities. In this approach, to explain an outcome is to identify the covering law of which it is an instance. This is often termed a Humean view of causation and explanation, as it follows from David Hume’s view that causation has to do with constant conjunctions. A fatal flaw of the covering law approach is that Hempel and Oppenheim were unable to give an independent warrant for the covering laws themselves, as they had promised to do in a famous footnote (Hempel and Oppenheim, 1948, in Hempel, 1965: 273). In addition, the covering law approach cannot distinguish between causal relationships and non-causal regularities, including spurious correlations and regularities in which the hypothesized cause and the observed effect are reversed. To take an IR example, the covering law view would admit the correlational finding that democracies do not fight wars with other democracies as a suitable ‘explanation.’ A scientific realist approach requires instead a search for the mechanisms that might explain such a correlation. Following the latter view, scholars working on the democratic peace have indeed actively sought out such mechanisms (see, for example, Schultz, 2001).
After the inadequacy of the covering law account became apparent, the ‘scientific realist’ school gave up any attempt to justify covering laws and turned instead to the assumption of an ontological reality independent of our minds and our theories. In this view, we develop theories about the workings of the causal mechanisms that exist in the real world, and to the extent that our theories are accurate, they explain the outcomes we observe. This is a modest assumption, but it still poses several definitional and philosophical challenges.
A first challenge concerns the definition of ‘causal mechanisms.’ The most general definition is that causal mechanisms are processes in the world that generate outcomes, but within this general view scholars have offered more than a dozen different formulations (Mahoney, 2001: 578–580). 4 The most contested definitional questions are the distinction between mechanisms and theories and the issue of whether causal mechanisms are in some sense unobservable. On the first issue, scientific realists are clear and unequivocal: causal mechanisms exist in the world and theories exist in our heads (Bunge, 2004; Mayntz, 2004; Wight, 2006; for an excellent review of the literature on mechanisms see Hedstrom and Ylikoski, 2010). We have theories about how the mechanisms that generate outcomes work. Using language that treats theories and mechanisms as synonymous is an easy trap to fall into, but the distinction is critical.
Regarding the (un)observability of mechanisms, scientific realists defend the metatheoretical notion that ultimately unobservable entities have powers that generate the things we observe. In this view explanation via reference to unobservables raises important issues of observation, measurement, and inference, but it is unavoidable. Indeed, apart from abstractions like pure mathematics, nothing we ‘know’ about the world is directly accessible to our mind. Rather, we observe through the limited information taken in by our senses, through sources of information about things we have not witnessed ourselves (books, journals, news reports, reported speech, and other texts), and through instruments of observation (polls, TV cameras, surveys, etc.). Some sources and instruments are more reliable and direct than others, but all involve interpretation of some sort. In this regard Alexander George and I have suggested the metaphor of a movable horizon between the observable and the unobservable worlds (George and Bennett, 2005). Instruments of observation may improve to the point that we take them to be relatively straightforward, but this just pushes back the horizon that demarcates the many remaining things we cannot unproblematically observe. There are always obscured, discrete, or distant mechanisms that we cannot directly observe, including the ideas in other people’s heads. At the same time, our theories about mechanisms generate implications on what should be true in the observable world if the posited mechanisms and our instruments of observation operate in the manner that we theorize. We can test these observable implications against the predictions of our theories even though we cannot directly observe mechanisms or causation.
Thus Alexander George and I have defined causal mechanisms as ‘ultimately unobservable physical, social, or psychological processes through which agents with causal capacities operate, but only in specific contexts or conditions, to transfer energy, information, or matter to other entities,’ thereby changing the latter entities’ ‘characteristics, capacities, or propensities in ways that persist until subsequent causal mechanisms act upon it’ (George and Bennett, 2005: 137; for a similar definition see Salmon, 1990: 71). This definition locates mechanisms on the ontological level and acknowledges that they are ultimately unobservable. It also makes clear that theories about mechanisms are not simple universal regularities across all time and space, as the operation of any one mechanism may interact with those of many other mechanisms.
There remains the question of how explanation via reference to mechanisms is different from explanation via reference to ‘laws.’ The two are similar in the sense that explaining an outcome means showing that it was to be expected under the extant circumstances. Theories about mechanisms are not merely covering laws with narrower scope conditions, however. Here again, the crucial difference is that covering law explanations treat as causal many relationships that a mechanism view excludes. The covering law model allows ‘as if’ assumptions: under such an assumption, it does not matter whether entities at a lower level of analysis or finer degree of detail acted as the theory suggests; it only matters that patterns of outcomes look ‘as if’ this were true (Friedman, 1953). This returns to the Humean view equating explanation with constant conjunction or the ability to predict outcomes. In contrast, a mechanism-based approach to explanation rejects ‘as if’ assumptions and requires in principle that our theories about how the underlying mechanisms work must be consistent with the finest degree of detail that we care to observe. In other words, the processes that we observe in the space and time between the hypothesized cause and the observed outcome must fit our theories about how the explanatory mechanisms operate.
This does not mean as a practical matter that every piece of empirical research must delve into the microfoundational level. Theories can posit structural and macro-level mechanisms and processes, and in some cases the easiest means of testing theories, and the most powerful policy instruments, may be at this macro level. What it does mean is that theories about mechanisms are suspect if it can be shown that individual actors did not make the calculations, evince the preferences, respond to the stimuli, or engage in the behaviors posited by structural theories. As a pragmatic matter we cannot research the limitless possibilities of ‘mechanisms within mechanisms’ at infinite degrees of detail, yet there is no general rule on the level of detail at which we should stop our research. Choosing when to stop involves a kind of epistemic and professional wager. We always risk one of two possible mistakes: if we choose to stop too soon, we could later be proved wrong by research that goes to a more detailed level of analysis, but if we choose to stop too late, we will have wasted our time and efforts.
This last point draws attention to one of the main costs of focusing IR theorizing on causal mechanisms: a loss of parsimony. Even at the macro level, mechanismic theories usually have narrower scope conditions than those purported for the isms. Theorizing about mechanisms thus involves a very different trade-off between parsimony and verisimilitude from that favored by Waltz (1979). Yet this is a trade-off most IR scholars have proven willing to make. The isms are parsimonious but as a consequence they are highly indeterminate. Neorealism, for example, has not reached strong generalizations on when to expect balancing versus bandwagoning, whether bipolar or multipolar systems are more stable, and when concerns over relative gains will outweigh actors’ desire for absolute gains. These issues involve interactions of mechanisms and can only be theorized with any degree of accuracy by giving up some parsimony (for other critiques of an excessive focus on parsimony as a standard of theory choice, see Kurki, 2007: 372; Wight, 2006: 289).
Addressing post-positivist critiques of causal explanation of social phenomena
Is explanation via reference to mechanisms incompatible with constructivist or postmodern views of social and political life? Some constructivists and postmodernists make arguments that are ultimately irreconcilable with explanation via mechanisms. Scholars who are skeptical of any possibility of causal explanation will remain skeptical of explanation via mechanisms. In contrast, following Milja Kurki, I argue that scientific realism (or, in her view, critical realism) ‘provides a framework for “constitutive” IR theorists to re-engage with causal analysis’ (Kurki, 2007: 361). Kurki adds that ‘the norms, rules, and discourses that many constructivists, feminists and poststructuralists inquire into are, within the critical realist perspective, distinctly causal, although not causal in the positivist “when A, then B” sense’ (Kurki, 2007: 367; see also Patomaki and Wight, 2000; and see Banta, 2012, on methods of discourse analysis consistent with critical realism).
Many constructivists and postmodernists are not unalterably opposed to aspirations to causal explanation. In their view, even though agents and structures are mutually constitutive, it is possible both theoretically and empirically to separate periods, contexts, or steps in a historical process in which outcomes are driven mostly by structural constraints, including normative and discursive ones, or mostly by the actions of agents who have freedom of maneuver to make different choices within extant structures. Sometimes structures or norms change but actors stay largely the same, and at other times the reverse is true (Kowert and Legro, 2006). Similarly, Wight argues that it is difficult but not impossible to distinguish agents and structures ontologically and analytically, allowing a focus on the relations between them (2006: 296). Those constructivists who have embraced scientific realism have endorsed mechanism-based explanations (Wendt, 1992), and scientific realism is open to theories on the kinds of mechanisms that constructivists emphasize, including theories of persuasion, intersubjective meanings, discursive communication, learning, naming and shaming, framing, legitimacy, and norms of appropriateness.
In addition, although constructivist interpretivists and many scholars from other traditions agree that behavior, rhetoric, texts, and symbols are open to different interpretations, leading interpretivists suggest in their writings that behavior can be explained and that some explanations are better than others (Hopf, 2007). Here, both scientific realists and interpretivists have critiqued simple notions of fixed criteria for theory choice, but as Kurki argues, ‘even in the absence of fixed criteria we can still evaluate the contributions of, for example, Enloe’s account of gender relations or Campbell’s account of the Bosnian War. We can do so in reference to how convincingly they deal with (plurality of) evidence, explain the processes they focus on, reflect on possible biases in their accounts and engage in reasoned argumentation with other accounts’ (Kurki, 2007: 372). Not all interpretivists would agree with some of the more conventional standards for theory choice that Kurki offers, including ‘predictive capacity,’ ‘plausibility of theoretical assumptions, and the consistency, explanatory power and coherence of explanations’ (Kurki, 2007: 371), but interpretivists have developed extensive standards of good interpretive work, many of which are consistent with scientific realism (Hansen, 2006; Klotz and Lynch, 2007; Milliken, 1999; Price and Reus-Smit, 1998; Yanow and Schwartz-Shea, 2006).
Most importantly, interpretivists and traditional methodologists agree on the importance of trying to account for biases in research, including our own biases as researchers as well as biases in the sources of evidence available to us. In this regard, the practice of giving weight to evidence that has ‘use novelty’ can be given both a logical warrant and a justification that is more psychological, interpretivist, and scientific realist. Logically, use novelty is important because evidence used to construct a theory has zero probability of proving that theory wrong. Psychologically, use novelty is a useful standard for using evidence to assess theories because we know from numerous experiments that we are prone toward confirmation bias, or toward seeing, finding, or emphasizing evidence that our theories lead us to expect. Much of the advice in methodology textbooks, whether statistical, qualitative, or interpretivist, is focused on guarding against confirmation biases. For scientific realists, the fact that much evidence is use novel for any particular theorist is critical, because it is the independence of the mind from the ontological world that makes theory testing and theory choice possible. Observation is theory-laden but not theory-determined (Brewer, 2001). The social world that exists independently of our minds can surprise us and force us to rethink our theories and explanations, thereby allowing empirical tests that are at least somewhat independent of observer biases even though our biases still influence the questions we ask of the world and how we interpret the surprises it gives us. From a variety of epistemological perspectives, then, it makes sense to test our theories against evidence not used to derive these theories. 
In statistical methods, this involves the testing of models against out-of-sample data; in qualitative case studies this focuses on process tracing on evidence not already known to the researcher and studies of cases other than those from which theories were derived; in interpretivism this involves the interpretation of new texts; in formal modeling any form of statistical, case, or textual data not used to intuit or derive the model has use novelty.
Background theory novelty can be given an interpretivist justification as well. Political behavior can usually be given many different interpretations, but if only one of these interpretations has actual adherents among the community of scholars paying attention to the behavior that is to be explained, both traditional methodologists and interpretivists would be hard-pressed to argue that alternative explanations that no one actually believes are in some sense better than the accepted interpretation. An alternative explanation might later gain both adherents and superior empirical support, but by definition once this alternative gains supporters the accepted interpretation has lost its background novelty. More generally, while traditional methodologists are rightly concerned over mis-readings of Kuhn that imply that only the consensus of the relevant scientific community matters, or that suggest that this consensus is formed without any reliance on evidence and logics of inference, it is hard to argue that consensus, lack of consensus, or change in beliefs among puzzle-focused communities of researchers are irrelevant. Bayesian approaches to the philosophy of science, for example, allow roles for both an element of subjectivism in the initial assignment of prior probabilities to the likely truth of alternative theories and a healthy dose of inference from evidence in updating these probabilities (Earman, 1992). To take a concrete example from IR theory, it is notable both that empirical evidence has accumulated supporting the claim that democracies are much less likely to fight wars with each other than with non-democracies, and that the community of scholars researching this question — including many initial skeptics who became converts — has evolved toward greater consensus (though not unanimity) in the belief that this is not an accidental correlation (even if its explanatory mechanisms remain the subject of contestation and research). 
Whether one stresses the accumulating evidence or the convergence toward a consensus among the scholars reading this evidence despite their differing priors, this looks like progress.
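The Bayesian logic invoked here can be stated compactly. The following is a minimal sketch rather than a formalization drawn from Earman or from the democratic peace literature: T stands for any theory, E for a piece of evidence, and the assumption that successive pieces of evidence E_1, ..., E_n are conditionally independent given T and given not-T is a simplification introduced only for illustration.

```latex
% Odds form of Bayes' rule: posterior odds = likelihood ratio x prior odds
\[
\frac{P(T \mid E)}{P(\neg T \mid E)}
  = \frac{P(E \mid T)}{P(E \mid \neg T)} \times \frac{P(T)}{P(\neg T)}
\]

% With n pieces of evidence, assumed conditionally independent given T and given \neg T
\[
\frac{P(T \mid E_1, \dots, E_n)}{P(\neg T \mid E_1, \dots, E_n)}
  = \frac{P(T)}{P(\neg T)} \prod_{i=1}^{n} \frac{P(E_i \mid T)}{P(E_i \mid \neg T)}
\]
```

On this sketch, the prior odds carry the subjective element, while the likelihood ratios carry the inference from evidence. Two scholars who begin with different priors but read the evidence similarly will both see their posterior probabilities for T rise toward one as evidence favoring T accumulates, which is one way of picturing initial skeptics moving toward a consensus.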
Finally, postmodernists rightly remind us that power shapes and influences the study as well as the practice of politics. Critical realists agree that ‘being aware of the social and political underpinnings and biases is fundamental to any causal analysis because causal discourses — whether on differences of intelligence of races, reproductive biology or causes of crime — are deeply informed by sets of politically consequential discourses and assumptions’ (Kurki, 2007: 375). At the same time, the fact that social relations of power are pervasive, self-reproducing, and long-lasting is what makes it possible to develop theories about politics that are useful for meaningful periods of time even if they are not applicable for all time. It cannot be simultaneously true that power is pervasive and that all social structures, meanings, and identities are continuously open to being constituted in different ways. Social power constrains possible outcomes and reproduces patterns of behavior over time, and the nature of these constraints is properly a major focus of the study of international politics. Interpretivists rightly warn that this means that power infects the study of politics as well, and that moves toward consensus among scientific communities can be a function of the exercise of social power within these communities as well as of individual conversions based on new evidence and the Bayesian logic of inference. Yet it is not beyond the ability of a self-reflective and power-focused scholarly community to study and call attention to the potential biases that arise through these processes, which by definition have to be at least partly public.
In short, though there are no simple or infallible standards for theory choice, and although empirical evidence does not uniquely point to one theory, explanation, or narrative as being the ‘true’ one, useful standards exist for judging theoretical progress and assessing some interpretations and explanations to be superior to others. Interpretivists and traditional methodologists should find standards of use novelty and background theory novelty useful, even if they emphasize different justifications for each.
Causal mechanisms, modes of explanation, and political science methods
Each of the leading political science methods, along with the working epistemological assumptions of at least some of its practitioners, is compatible with explanation via reference to hypothesized causal mechanisms. I have addressed above the epistemological compatibility of interpretivism and scientific realism. Here I take issue with Jackson and Nexon’s argument in this special issue that middle-range theorizing privileges correlational evidence and that the statistical analysis of observational data necessarily entails a commitment to a Humean account of causation. In my conception, middle-range theories are not just theories about individual causal mechanisms, but theories about how combinations of mechanisms interact in specified and often recurrent scope conditions or contexts to produce outcomes, whether these contexts are defined as kinds of states or other units, specific periods of time or areas of social space, or recurrent problems around which research traditions have formed, such as ‘civil war,’ ‘democratic transitions,’ or ‘imperialism.’ This builds on scientific realist concepts of mechanisms that are explicitly anti-Humean and that are amenable to investigation via process tracing as well as interpretive methods (George and Bennett, 2005: 147, 205–232).
Experimental methods have an important role to play here in working to isolate the workings and effects of individual causal mechanisms (Jackson, 2011: 110). At the same time, for practical and ethical reasons, many theoretically interesting and policy-relevant phenomena are not amenable to experimental study. Studying both individual mechanisms and the interactions of mechanisms in recurrent contexts in the real world thus requires observational studies, including qualitative case studies, interpretive analyses, ‘natural experiments’ (Dunning, 2012), survey research, and many other forms of qualitative and statistical analysis.
In this context, there is nothing essential in statistical methodologies that ties them inherently to Humean conceptions of causation. There is indeed an affinity between the statistical analysis of observational data and Humean notions of explanation because both focus on correlations (Brady, 2008). Some researchers who use statistical methods on observational data conduct their research in ways that reflect a Humean epistemology, and are content with high correlations even if they cannot theorize plausible causal mechanisms that might explain these correlations. Other statistical researchers, however, reflect a scientific realist sensibility, working hard to theorize about causal mechanisms and to test the observable implications of their hypothesized causal mechanisms at the population level as well as at the level of individual cases. Kenneth Schultz, for example, posits the theory that democracies are less likely to bluff and more likely to prevail in coercive diplomacy. He theorizes that opposition parties will publicly express their support for or rejection of the ruling party’s foreign policies, thereby undermining as bluffs any threats that the opposition party does not support and reinforcing the credibility of any threats that it does. This is clearly a theory about mechanisms, and Schultz proceeds to model it formally, test it statistically against population-level observational data, and use process tracing to test the observable implications of his theorized mechanism in individual cases (Schultz, 2001).
The challenges of making valid causal inferences from statistical analysis of observational data are well known (Achen, 2002), but they do not mean that all such inferences are meaningless or that they are carried out in isolation from theories about causal mechanisms. Statistical models are theory-driven. Theories determine which variables to include, what functional form to use in the statistical model, what cases to include in the population, and how to conceptualize and measure the variables. If causal mechanisms are powerful, and if combinations of such mechanisms recur in well-defined contexts, then population-level correlations can provide evidence on the presence and operation of causal mechanisms. It is not correct to suggest, as Patrick Jackson does, that in a critical realist conception of causality ‘systematic cross-case covariation is, strictly speaking, irrelevant to a causal claim except under strictly controlled laboratory conditions’ (Jackson, 2011: 109, emphasis in the original). If this were true, the observational correlational data associating smoking and lung cancer, developed long before any plausible theories about mechanisms that might explain this correlation, would be irrelevant. Similarly, the strong correlational evidence of an inter-democratic peace would be irrelevant. Clearly, neither scientists nor governments viewed these correlations as irrelevant; rather, they viewed them as evidence of underlying and as yet not fully understood causal mechanisms, evidence that was sufficiently strong to justify changes in research agendas and public policies.
In short, common methods in political science, including but not limited to statistical analysis, formal modeling, discourse analysis, and case studies, can all contribute to the development and testing of theories about causal mechanisms. Statistical methods can test whether any population-level observable implications of hypothesized mechanisms are borne out. Formal models can deductively drive a researcher to theoretical insights about or testable implications of hypothesized mechanisms that the researcher otherwise might have missed. Discourse analysis can get at the intersubjective understandings and social processes that shape norms and determine their effects as enablers of and constraints on social actors. Process tracing in individual cases provides a powerful means of using both induction and deduction to develop and test theories about hypothesized mechanisms (Checkel, 2006; Collier, 2011; George and Bennett, 2005). Case studies afford an opportunity to examine closely sequences of actions and events, and the details that emerge, often unknown even to experts on the case prior to its intensive study, provide many opportunities to test the observable implications of alternative explanations that are ‘use novel’ for the researcher. Process tracing of sequences of events, ideas, and actions in cases also provides some leverage over the issue of causal direction (that is, did A cause B, or did B cause A?), which is a challenge for strictly correlational methods and Humean notions of causation.
A taxonomy of theories on social mechanisms and the development of typological theories
My argument thus far is that paradigmatic isms are an inappropriate locus of IR research, and that scholars from a wide range of epistemological and methodological traditions should be able to agree that their approaches to the study of IR are consistent with the idea of explanation via reference to theories about causal mechanisms. Yet IR scholars are unlikely to switch from a focus on isms to one on theories about mechanisms unless the latter approach can serve as a full substitute for the isms by providing a framework for cumulative theoretical progress, scholarly discourse, and effective pedagogy. This requires a mechanism-based framework that is simple enough to be useful in teaching, discourse, and the translation of theories about mechanisms from different scholarly communities, but sufficiently complex and open-ended to encompass not only the many and diverse theories about mechanisms that IR theorists have developed or borrowed from other disciplines, but even those theories that scholars might yet construct.
I propose a taxonomy of theories about social mechanisms as a means of meeting these demanding criteria. The first dimension of this taxonomy draws upon James Mahoney’s (2000) typology of sociological explanations of institutions, which includes those rooted in material power, efficiency, and legitimacy. This tripartite division of categories of mechanisms usefully mirrors the three leading ‘isms’ in the IR subfield: (neo)realism (with a focus on material power); (neo)liberalism (institutional efficiency); and constructivism (legitimacy). It thereby provides a bridge to the vast literature couched in terms of the isms, preserving this literature’s genuine contributions toward better theories on mechanisms of power, institutions, and social roles. Yet it does so without reifying theories about mechanisms into Kuhnian paradigms or Lakatosian research programs.
The second dimension of the taxonomy draws from constructivism (Wendt, 1992) and structuration theory (Giddens, 1984). This dimension focuses on the agent–structure relations that Wight deconstructs as the central nexus of IR theorizing (Wight, 2006). The taxonomy captures the four possible kinds of mechanisms through which agents and structures interact: agent to agent, structure to agent, agent to structure, and structure to structure (structures are conceived here in the constructivist sense of including both material and normative or ideational structures).
These two dimensions form the core of the taxonomy in Table 1. For illustrative purposes, I have added levels of analysis, fields of study associated with particular agent-centered and structural mechanisms, and illustrative but far from exhaustive examples of theories on mechanisms in each of the boxes in the taxonomy. The resulting theoretical space is varied and complex, but it also suggests that there is a manageable rather than infinite number of theories about mechanisms that students and scholars need to understand in order to participate in the field at a professional level.
Table 1. A taxonomy of theories on social mechanisms.
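The two core dimensions of the taxonomy can be laid out as a simple grid. The sketch below is illustrative only: it crosses the three categories of mechanisms with the four agent–structure relations named in the text, and the function names (`taxonomy_cells`, `unexamined_cells`) are hypothetical labels introduced here, not part of the original framework. It omits the levels of analysis, fields of study, and example theories that the table adds for illustration.

```python
from itertools import product

# First dimension: the three categories of mechanisms named in the text.
MECHANISM_CATEGORIES = ("material power", "efficiency", "legitimacy")

# Second dimension: the four agent-structure relations named in the text,
# written as (source, target) pairs.
RELATIONS = (
    ("agent", "agent"),
    ("structure", "agent"),
    ("agent", "structure"),
    ("structure", "structure"),
)


def taxonomy_cells():
    """Return every (category, source, target) cell of the two-dimensional grid."""
    return [
        (category, source, target)
        for category, (source, target) in product(MECHANISM_CATEGORIES, RELATIONS)
    ]


def unexamined_cells(considered):
    """Return the cells not covered by the explanations a study has considered."""
    covered = set(considered)
    return [cell for cell in taxonomy_cells() if cell not in covered]


cells = taxonomy_cells()
for category, source, target in cells:
    print(f"{category}: {source} -> {target}")
```

Used as a checklist, `unexamined_cells` would flag the categories of alternative explanation that a given study has not yet addressed.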
This taxonomy serves six purposes in fostering cumulative theorizing and knowledge about politics. First, it provides a checklist that scholars can use to make sure that they are not leaving out important kinds of potential alternative explanations of a phenomenon. Second, and related, researchers can develop increasingly comprehensive historical explanations of particular cases drawing on theories from any of the categories in the taxonomy. Third, scholars can drill down deeper into any one of the boxes in the taxonomy, refining or creating theories on the mechanisms in that category, or disaggregating theorized mechanisms into different subtypes. This includes the inductive use of case studies or narratives to develop new concepts and theories about mechanisms. Many of the most useful contributions during and since the height of the isms contestation, some framed in terms of the isms and others not, have taken the form of elucidating theories of individual causal mechanisms and/or using them to explain cases or populations. A few of the many examples include work on institutional mechanisms (Keohane, 1984; Koremenos et al., 2001; Milner, 1997), normative change (Finnemore and Sikkink, 1998), normative taboos (Tannenwald, 1999), rhetorical entrapment (Krebs and Jackson, 2007), collective action (Ostrom, 1990), the microdynamics of civil violence (Kalyvas, 2006; Posen, 1993; Wood, 2003), economic, political, and international resource curses (Colgan, 2013; Ross, 1999, 2001), inter-group violence as intra-group ‘bidding’ for leadership or efforts to spoil negotiated settlements (Kydd and Walter, 2002; Stedman, 1997), problems of credible commitments (Fearon, 1995), and the dynamics of advocacy networks (Keck and Sikkink, 1998).
Fourth, using statistical methods, researchers can develop estimates of the magnitude of the causal effects, in specified contexts or issue areas, of the variables identified in each of the theories about how causal mechanisms work. Fifth, researchers can refine the scope conditions of the theories in any of the categories, clarifying the conditions under which they are strongest and weakest and specifying more clearly the populations to which they apply. Researchers can use both positive cases and negative cases (cases that could have had the outcome of interest but did not) to test scope conditions empirically (Goertz and Mahoney, 2004).
Finally, scholars can use the taxonomy to develop middle-range or typological theories about how combinations of mechanisms interact in shaping outcomes for specified cases or populations. A typological theory is a theory that not only defines individual independent variables and the hypothesized causal mechanisms that shape their effects, but provides ‘contingent generalizations on how and under what conditions they [these variables] behave in specified conjunctions or configurations to produce effects on specified dependent variables’ (George and Bennett, 2005: 235). Typological theories can model complex causal relations, including non-linear relations, high-order interaction effects, and processes involving many variables. Examples from comparative politics as well as IR include theories on alliance burden-sharing (Bennett et al., 1994), national unification (Ziblatt, 2006), welfare capitalism (Esping-Andersen, 1990), national political economies (Hall and Soskice, 2001), and rebel–home state–host state relations (Bennett, 2012; for analysis of some of these examples and a long list of typologies in IR and comparative politics, see Collier et al., 2012).
The taxonomy thus encompasses the building blocks of theorized mechanisms that can be brought together in different conjunctions to develop typological theories on how combinations of variables behave. Typological theories allow for cumulative theorizing as scholars can add variables or re-conceptualize them to higher or lower levels of abstraction (Elman, 2005), and such theories can be fruitfully and cumulatively modified as they encounter anomalous cases or expand to encompass new types of cases (for discussion of the example of the evolution of typological theorizing on alliance burden-sharing, see Bennett, 2011). Adding variables of course adds to the complexity of the theory, but researchers can pare back this complexity by controlling for some of the variables and exploring only subsets of the full typological space in any one study.
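The idea of a typological space, and of paring it back by holding some variables constant, can be sketched in a few lines. The variable names and values below are hypothetical, invented purely for illustration and not taken from any of the typological theories cited above; the function `typological_space` is likewise an assumed helper.

```python
from itertools import product

# Hypothetical variables and values, invented purely for illustration.
VARIABLES = {
    "external_threat": ("low", "high"),
    "alliance_dependence": ("low", "high"),
    "domestic_constraint": ("low", "high"),
}


def typological_space(variables, held_constant=None):
    """Enumerate types (combinations of variable values).

    Variables listed in held_constant are fixed at the given value,
    which shrinks the space a single study has to explore.
    """
    held_constant = held_constant or {}
    names = list(variables)
    value_lists = [
        (held_constant[name],) if name in held_constant else variables[name]
        for name in names
    ]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


full_space = typological_space(VARIABLES)
subset = typological_space(VARIABLES, held_constant={"domestic_constraint": "low"})
print(len(full_space), len(subset))
```

With three dichotomous variables the full space has eight types; holding one variable constant halves it. Each added dichotomous variable doubles the full space, which is why exploring subsets becomes attractive as a theory grows.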
An additional advantage of shifting from the isms and rooting the IR field more clearly in theories about causal mechanisms is that this can re-energize interchanges among the IR subfield, the other subfields of political science, and the other social sciences. These dialogues should be a two-way street, with borrowing and learning in both directions. The IR field has already shared with the American politics subfield many theories about mechanisms of power and institutions, for example, but the taxonomy of mechanisms serves as a reminder that the study of American politics could benefit by paying closer attention to theories on legitimacy, persuasion, norms, socialization, and identity that have been developed more fully in IR. Cross-field discourse with comparative politics can also benefit by moving away from the tribal language of the isms, which is not widely used in comparative politics. Where the two subfields have intersected in ways that focus on causal mechanisms rather than paradigmatic isms, cross-pollination and collaboration have flourished. This is particularly true in the study of civil and ethnic conflicts and their interaction with foreign states and transnational actors. In this research program, explanations have focused on mechanisms involving greed, grievance, transaction costs, mobilization, framing, informational asymmetries and credible commitment problems, ethnic security dilemmas, principal–agent relations, and many other factors, and comparativists and IR scholars have collaborated and drawn readily on each other in their research (Checkel, 2012; Collier and Hoeffler, 2004; Fearon and Laitin, 2003; Kalyvas, 2003; King, 2004; Lichbach, 1998; Salehyan, 2009). Similarly, theoretical concepts on causal mechanisms translate far more readily between IR and economics, psychology, sociology, and history than does the ingrown and esoteric language of the isms.
Conclusions
The field of political science has moved increasingly over the last two decades toward mechanism-based explanations of complex phenomena. This shift is related to the development of variants of scientific realism in the philosophy of science, but it has been hampered by limited understanding among political scientists of how mechanism-based explanations differ from explanations built upon earlier philosophies of science. As Peter Hall has persuasively argued (2003), our ontological theories of how politics works, which increasingly embrace complexity, have outrun our epistemological notions of how to study politics, which still cling to the vestiges of forms of positivism that lost favor among philosophers decades ago.
A focus on explanation via reference to causal mechanisms offers one way of bringing our ontological assumptions, epistemological approaches, and research methods back into alignment. Yet political scientists have been hesitant to commit fully to this move because they have lacked a clear sense of the philosophical costs and benefits of mechanism-based explanations. The present article has argued that although mechanism-oriented explanations are not without their own drawbacks, they are an improvement over Kuhn’s concept of ‘paradigms’ and Lakatos’s notion of ‘research programs.’ Whereas Kuhnian paradigms and Lakatosian research programs both foundered, in different ways, on the difficulties of justifying large sets of partially testable interrelated ideas, the concept of theories about discrete causal mechanisms allows for middle-range typological theories that are more localized, if also more complex.
IR scholars also need assurance that explanation via mechanisms does not entirely lack the key attraction of paradigms or research programs: a structured discourse that provides a framework around which we can organize cumulative research findings. Why should we move away from the ‘isms’ — realism, liberalism, and constructivism in the IR subfield; rational choice, historical institutionalism, and other ‘isms’ in the study of American and comparative politics — and toward causal mechanisms if the latter contribute only to a hodgepodge of discrete explanations of individual cases? Here, the taxonomy of causal mechanisms introduced above shows how extant paradigms and research programs have implicitly relied on causal mechanisms all along and can be mapped onto an approach that focuses on explanatory mechanisms without reifying them into grand schools of thought. Cumulation and progress enter in as increasingly refined theories on individual mechanisms and as improvements in typological theories on how combinations of mechanisms interact to shape outcomes in problem-based research programs.
Mechanism-oriented theorizing poses important costs, particularly a loss of parsimony compared to extant paradigms and research programs. Still, researchers using this approach to theory-building can choose different trade-offs along the spectrum between parsimony and complexity. In the end, there is a strong philosophical basis for rooting the study of politics in theories about causal mechanisms, and it is possible to do so while maintaining a structured discourse and cumulating research findings.
Footnotes
Funding
An earlier version of this argument (Bennett, 2012) had financial support from the Norwegian Research Council, the Center for the Study of Civil War at the Peace Research Institute Oslo, Syracuse University, and the Simons International Endowment at Simon Fraser University.
