Abstract
Even after a quarter-century of debate in political science and sociology, representatives of configurational comparative methods (CCMs) and those of regressional analytic methods (RAMs) continue talking at cross purposes. In this article, we clear up three fundamental misunderstandings that have been widespread within and between the two communities, namely that (a) CCMs and RAMs use the same logic of inference, (b) the same hypotheses can be associated with one or the other set of methods, and (c) multiplicative RAM interactions and CCM conjunctions constitute the same concept of causal complexity. In providing the first systematic correction of these persistent misapprehensions, we seek to clarify formal differences between CCMs and RAMs. Our objective is to contribute to a more informed debate than has been the case so far, which should eventually lead to progress in dialogue and more accurate appraisals of the possibilities and limits of each set of methods.
Introduction
About a quarter-century ago, the publication of Charles Ragin’s (1987) The Comparative Method sparked a debate in the literature on political and sociological research methods that has lost none of its initial impetus. Quite the contrary, the two sequels Fuzzy Set Social Science (Ragin, 2000) and Redesigning Social Inquiry (Ragin, 2008) have brought the “Ragin Revolution” (Vaisey, 2009) to a point of unprecedented commotion at which proponents and opponents are vying for the methodological high ground more fiercely than ever before. 1 The trigger of this revolution was the introduction of a novel method named Qualitative Comparative Analysis (QCA), which, after a slow yet solid start in the early 1990s, passed the 100-articles-per-year mark in 2013, for the first time since its inaugural appearance in Ragin, Mayer, and Drass (1984). 2 According to its inventor, the primary motivation behind the development of QCA has been to “integrate the best features of the case-oriented approach with the best features of the variable-oriented approach” (Ragin, 1987, p. 84).
In this article, we do not endeavor to evaluate whether Ragin has succeeded on this front. Nor do we want to take sides in arguments over the vices and virtues of QCA. Instead, we pursue the following objective from the sidelines: to clear up three misunderstandings that have dominated the debate between representatives of configurational comparative methods (CCMs) such as QCA and those of regressional analytic methods (RAMs) ever since the publication of The Comparative Method. 3 These misunderstandings do not merely concern trivia at the methodological periphery but central issues. Contrary to expectations, however, the sources of these problems reside only partly in difficulties of communication between the two communities and mainly in knowledge gaps and ambiguous definitions of concepts within each of them. By filling these gaps and by clarifying concepts, we seek to clear the blockage in the debate, hopefully once and for all. We expect appraisals of the possibilities and limits of each set of methods to become more accurate as a consequence. The three aspects to be addressed are listed in Table 1.
Table 1. Three Aspects of the CCM–RAM Debate.
Note. CCM = configurational comparative method; RAM = regressional analytic method.
First, CCMs and RAMs build on disparate theories of mathematical structures whose syntax may often be equal but whose semantics remain incommensurable. While CCMs work under the axioms of a Boolean algebra, RAMs do so under those of a linear algebra. 4 This fundamental yet simple difference continues to be downplayed, misinterpreted, or even outright ignored (cf. Goertz & Mahoney, 2013a, pp. 280-81; 2013b, pp. 239-40). On the side of the proponents of RAMs, King, Keohane, and Verba (1994), for instance, have dismissed “Ragin’s ‘Boolean Algebra’ approach” summarily as containing “no new features or theoretical requirements” (pp. 50, 87-91), but proponents of CCMs share some responsibility for the current state of affairs. Against the background of Schneider and Wagemann’s (2012) otherwise laudable aim of “avoiding confusion and misinterpretation of set-theoretic methods” (p. 42), it is half unfortunate and half ironic to see the authors proclaim that “[t]he challenge in understanding set-theoretic methods is not so much in grasping the math” (pp. 16-17) when the small selection of Boolean math they introduce contains a considerable number of glaring errors. 5
Second, particular classes of hypotheses demand the application of either CCMs or RAMs because the associations they posit are based on either a Boolean or a linear-algebraic framework. As Braumoeller and Goertz (2000, p. 847) note, however, most members of either camp remain oblivious of the inseparability of particular classes of hypotheses and the appropriate set of methods for building and testing them. Two quotes from articles that both have appeared in top methods journals are illustrative. While Katz, vom Hau, and Mahoney (2005) conclude that “regression methods and fuzzy-set methods cannot test the same hypotheses because the two approaches’ contrasting understandings of causation lead them to formulate fundamentally different kinds of hypotheses” (p. 541), Clark, Gilligan, and Golder (2006) argue that “standard linear models that include interaction terms offer a better way” (p. 312) to test hypotheses about necessity and sufficiency. The opposition between these two statements could hardly be more direct.
Third, the issue of causal complexity has split the two communities (e.g., Brady, 2013; Clark et al., 2006; Vis, 2012). Some methodologists consider CCMs and RAMs to be closely related if not substitutable in analyzing causal complexity (Mahoney, 2008), others are of the opinion that the former have an edge over the latter (Wagemann & Schneider, 2010), and still others argue that CCMs are vastly inferior to RAMs (Clark et al., 2006). The order of these three aspects, from algebraic systems through hypothesis classes to the concept of causal complexity, is not random but reflects their level of generality. The mastery of analyzing causal complexity is impossible without an understanding of hypothesis classes, which is itself prevented by unfamiliarity with the formal algebraic rules that underlie them. 6 We thus contend that progress in the CCM–RAM debate is impossible without the joint dissolution of the confusion that surrounds these aspects individually.
The article is structured as follows. In the first section, we address the widely-held misconception that the same logic of inference applies to CCMs and RAMs by juxtaposing the formal differences in their respective languages. Surprisingly, these have so far never been laid out in the debate. Second, we systematize the different types of hypotheses each set of methods is associated with. It is usually recognized that CCMs can build and test hypotheses about relations of implication, and that RAMs are suitable for building and testing hypotheses about relations of covariation (e.g., Mahoney, 2007), but as yet an elaboration of these dissimilarities beyond the stage of mere recognition has not been presented. The differences between CCM conjunctions and RAM interactions are the topic of the third and final part, which integrates the core points of the preceding sections. In the conclusions, we recapitulate the argument, provide a short-term forecast of the direction in which the CCM–RAM debate will move over the coming years and issue some recommendations to influence its course.
Algebraic Systems
In their attempt to convince qualitative researchers of the universal applicability of the principles of RAMs, King et al. (1994) argue that “differences between the quantitative and qualitative traditions are only stylistic and are methodologically and substantively unimportant. All good research can be understood . . . to derive from the same underlying logic of inference” (p. 4). The authors insist on the existence of only one language for social-scientific inquiry and dismiss “Ragin’s ‘Boolean Algebra’ approach” summarily as containing “no new features or theoretical requirements” (King et al., 1994, pp. 50, 87-91). We confute this assertion by showing that Boolean algebra provides a self-contained logic of inference for CCMs along with mathematical machinery that is neither reducible to nor reconcilable with the logic underlying RAMs. However, representatives of CCMs share responsibility for the current state of affairs by having misrepresented central elements of Boolean algebra. Our main contention in this section is the following:
The failure to appreciate the existence of an alternative language by proponents of RAMs has prevented progress in the debate on the most basic level. As Mahoney and Goertz (2006) point out, the dissimilarity even between simple binary operators such as the logical AND and the arithmetic TIMES has contributed to “substantial confusion across the two traditions” (p. 235). So as to remedy this unfortunate state of affairs, we lay out the principal differences between and commonalities of Boolean and linear algebras. 8 While commonalities are strictly limited to syntactic features of the formalisms in which they are expressed, fundamentally different semantic interpretations are called for. These differences are so profound that the two systems give rise to formal languages that defy all attempts at comprehensive translation. Contrary to King et al. (1994), we thus not only argue that the theoretical requirements of “Ragin’s ‘Boolean Algebra’ approach” are at least as demanding as those of RAMs but also contradict Brady’s (2013) more conciliatory verdict that “[l]anguage differences can be important, but they can be transcended through careful translation” (p. 253).
Boolean and linear algebras are mathematical objects that consist of a first set
commutativity of “+” and “*”
associativity of “+” and “*”
distributivity of “+” over “*” and of “*” over “+”
as well as these special laws
Although this set is unnecessarily large for providing a minimal definition, we explicitly list (BA1) to (BA8) to facilitate comparison. 9
In contradistinction, for any linear algebra,
commutativity of “+” and “*”
associativity of “+” and “*”
distributivity of “*” over “+”
as well as these special laws
These purely syntactic definitions only partially determine the types of elements that
The most common formal languages that satisfy the axioms of a Boolean algebra are set theory, propositional logic, and switching-circuit theory. 11
In set theory, the identity elements are interpreted in terms of the universal set “
RAMs usually follow the standard arithmetic interpretation of a linear algebra, but the interpretation of the Boolean algebra that underlies CCMs has unfortunately often meandered between a set-theoretic and a logical interpretation, sometimes for no other reason than keyboard convenience (cf. Schneider & Wagemann, 2012, pp. 54-55). From a conceptual perspective, mixing interpretations of a Boolean algebra is unproblematic, although this practice has been partly responsible for the current state of confusion. In the remainder of this article, we render the Boolean algebra employed by CCMs consistently in terms of propositional logic.
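The syntactic overlap and the one syntactic difference between the two lists of laws above can be checked by brute force. The following is only a minimal sketch under stated assumptions: the generic operators are read as logical OR and AND on {0, 1} for the Boolean case and as ordinary addition and multiplication on a small sample of integers for the arithmetic case. The helper names and the integer sample are illustrative and not part of the original exposition.

```python
from itertools import product

def holds_everywhere(law, domain):
    """True if the three-place law holds for every triple drawn from domain."""
    return all(law(x, y, z) for x, y, z in product(domain, repeat=3))

def times_over_plus(plus, times):
    """Distributivity of "*" over "+": x * (y + z) = (x * y) + (x * z)."""
    return lambda x, y, z: times(x, plus(y, z)) == plus(times(x, y), times(x, z))

def plus_over_times(plus, times):
    """Distributivity of "+" over "*": x + (y * z) = (x + y) * (x + z)."""
    return lambda x, y, z: plus(x, times(y, z)) == times(plus(x, y), plus(x, z))

# Boolean reading (assumption): "+" is OR and "*" is AND on {0, 1}.
def bool_plus(x, y):
    return x | y

def bool_times(x, y):
    return x & y

# Arithmetic reading (assumption): ordinary addition and multiplication.
def arith_plus(x, y):
    return x + y

def arith_times(x, y):
    return x * y

BOOL = (0, 1)
INTS = (-2, -1, 0, 1, 2)  # a finite sample: it can refute a law, not prove it

results = {
    "boolean: * over +": holds_everywhere(times_over_plus(bool_plus, bool_times), BOOL),
    "boolean: + over *": holds_everywhere(plus_over_times(bool_plus, bool_times), BOOL),
    "arithmetic: * over +": holds_everywhere(times_over_plus(arith_plus, arith_times), INTS),
    "arithmetic: + over *": holds_everywhere(plus_over_times(arith_plus, arith_times), INTS),
}

for name, ok in results.items():
    print(f"{name}: {ok}")
```

Under these assumptions, only the arithmetic reading fails the distributivity of "+" over "*", which matches the difference between the two lists of laws.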
The central corollary of the semantic differences induced by Boolean and linear algebras is that CCMs and RAMs model causes and effects as disparate entities. 12 The objects x, y, and z in (BA1) to (BA8) represent conditions and outcomes for CCMs, whereas x, y, and z in (LA1) to (LA8) are understood to be regressors and regressands by RAMs. 13 This difference is not merely one of terminology as Schneider and Wagemann (2012, p. 55) suggest. A condition or an outcome always refers to one concrete value that a variable takes on. In contrast, a regressor or a regressand always refers to the variable itself. For example, RAMs deal with countries’ degree of social heterogeneity or their degree of electoral district magnitude, whereas CCMs process countries that show a high degree of social heterogeneity or a large magnitude of electoral districts. The distinction is subtle yet fundamental. Almost all contributions to the CCM–RAM debate of the last two decades have missed or misinterpreted this crucial difference. So as to keep these entities syntactically apart while remaining close to established notational conventions, we denote variables by italicized capital letters, for example, X, and conditions and outcomes by italicized capital letters to which a value indicator is appended in superscript, for example, X{.}. As Boolean algebra is limited to bivalent variables, we additionally introduce the following notational simplification for unary operations on simple terms: “¬X{1}” will be replaced by “X{0}” and “¬X{0}” by “X{1}”. 14
Further non-fundamental operations are definable based on fundamental operations. As a matter of fact, formal languages of Boolean and linear algebra typically feature a host thereof. Important non-fundamental operators in propositional logic are the left-to-right arrow “⇒”, and the likewise common but non-standard right-to-left arrow “⇐”. 15 They denote an implication and can be defined in two equivalent ways as given by definitions (DF1) and (DF2), (DF3) and (DF4), respectively. The corresponding notation with generic operators is provided in addition (in square brackets):
In natural language, X{1} ⇒ Y{1} reads as “If X{1} is the case, then Y{1} is the case as well”. According to (DF1) and (DF2), this sentence is equivalent in meaning to “It is not the case that X{1} and Y{0} are given” as well as to “X{0} or Y{1} is given”. However, the most common phrasing is “X{1} is sufficient for Y{1}”. In contrast, X{1} ⇐ Y{1} reads as “Only if X{1} is the case, then Y{1} is the case as well”. According to (DF3) and (DF4), equivalent phrasings are “It is not the case that X{0} and Y{1} are given” and “X{1} or Y{0} is given”, but the most common wording is “X{1} is necessary for Y{1}”.
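The verbal equivalences just stated can be verified with a truth table. The sketch below is an illustration only: the function names are hypothetical, and Python's `True` and `False` stand in for the values 1 and 0 of the conditions X and Y.

```python
from itertools import product

def sufficiency_negated_conjunction(x, y):
    """ "It is not the case that X{1} and Y{0} are given." """
    return not (x and not y)

def sufficiency_disjunction(x, y):
    """ "X{0} or Y{1} is given." """
    return (not x) or y

def necessity_negated_conjunction(x, y):
    """ "It is not the case that X{0} and Y{1} are given." """
    return not ((not x) and y)

def necessity_disjunction(x, y):
    """ "X{1} or Y{0} is given." """
    return x or (not y)

rows = list(product([False, True], repeat=2))

# Both phrasings of sufficiency agree on every row, and likewise for necessity.
sufficiency_forms_agree = all(
    sufficiency_negated_conjunction(x, y) == sufficiency_disjunction(x, y)
    for x, y in rows
)
necessity_forms_agree = all(
    necessity_negated_conjunction(x, y) == necessity_disjunction(x, y)
    for x, y in rows
)

# Necessity of X for Y is sufficiency of Y for X: only the arrow is reversed.
necessity_is_reversed_sufficiency = all(
    necessity_disjunction(x, y) == sufficiency_disjunction(y, x) for x, y in rows
)

# Rows ruled out by each statement.
violations_of_sufficiency = [(x, y) for x, y in rows if not sufficiency_disjunction(x, y)]
violations_of_necessity = [(x, y) for x, y in rows if not necessity_disjunction(x, y)]
```

Each statement excludes exactly one of the four rows: sufficiency rules out X{1} together with Y{0}, and necessity rules out X{0} together with Y{1}.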
Unfortunately, methodologists and users of CCMs have infused the implication operator with qualities it simply does not possess. For example, Schneider and Wagemann (2012, pp. 51-53) introduce the three fundamental operations “∨”, “∧”, and “¬” as suitable for constructing complex sets, whereas the implication operator is said to be appropriate for analyzing (causal) relations between sets. Similarly, Rihoux and De Meur (2009) introduce the operator as expressing “the (usually causal) link between a set of conditions on the one hand and the outcome we are trying to “explain” on the other” (p. 35, emphasis added). Yet, it is obvious from (DF1) and (DF2) that an implication states nothing beyond a negated conjunction in which the consequent is negated, or a disjunction in which the antecedent is negated. Implications are not in any way more amenable to causal interpretation than any of the three fundamental operations.
The implication operator forms the basis of the equivalence operator “⇔”, which is alternatively defined by (DF5) to (DF8):
It carries the meaning of “if, and only if”, so that “X{1} ⇔ Y{1}” reads as “X{1} is the case if, and only if, Y{1} is the case as well”. According to (DF6) and (DF8), equivalent phrasings are “It is neither the case that X{1} and Y{0} are given nor that X{0} and Y{1} are given” and “X{0} and Y{0} is given or X{1} and Y{1} is given”, but the most common wording is “X{1} is sufficient and necessary for Y{1}”. In this connection, it is important to note that logical equivalence “⇔” is not the same as arithmetic equality “=”. The latter is an operator that relates expressions referring to identical mathematical objects, as in “4 = 2 + 2”, whereas the former is an operator that relates expressions with identical truth values. As “4” and “2 + 2” are neither true nor false but simply alternative names that refer to the number 4, the expression “4 ⇔ 2 + 2” is ill-formed. The same holds for “X{1} = Y{1}” because “X{1}” and “Y{1}” do not refer to mathematical objects but to conditions that can be true or false. We will come back to this crucial difference in the section on causal complexity.
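That the alternative phrasings of equivalence (“if, and only if”) coincide can again be checked row by row. This is a minimal sketch with hypothetical function names, treating equivalence as the conjunction of sufficiency and necessity.

```python
from itertools import product

def sufficient(x, y):
    """x is sufficient for y, written as "not x or y"."""
    return (not x) or y

def necessary(x, y):
    """x is necessary for y, i.e. y is sufficient for x."""
    return sufficient(y, x)

def both_implications(x, y):
    """ "x is sufficient and necessary for y." """
    return sufficient(x, y) and necessary(x, y)

def neither_mismatch(x, y):
    """ "Neither (X{1} and Y{0}) nor (X{0} and Y{1}) is given." """
    return not (x and not y) and not ((not x) and y)

def matching_pairs(x, y):
    """ "(X{0} and Y{0}) or (X{1} and Y{1}) is given." """
    return ((not x) and (not y)) or (x and y)

rows = list(product([False, True], repeat=2))

equivalence_forms_agree = all(
    both_implications(x, y) == neither_mismatch(x, y) == matching_pairs(x, y)
    for x, y in rows
)

# Rows on which the equivalence is true: both conditions absent or both present.
true_rows = [(x, y) for x, y in rows if matching_pairs(x, y)]
```

The operands here are truth values throughout. The sketch relates expressions that can be true or false and does not relate names of numbers, in line with the distinction between equivalence and arithmetic equality drawn above.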
Definitions (DF1) and (DF2) exemplify the unbridgeable semantic dissimilarities that exist between Boolean and linear algebras also for non-fundamental operations. The two generic expressions x * (−y) = 0 and −x + y = 1 are not only well-formed in Boolean but also in linear-algebraic syntax. Notwithstanding this commonality, their truth conditions are entirely different. In linear algebra, x * (−y) = 0 holds if, and only if, at least one of x or y is 0, whereas −x + y = 1 holds if, and only if, y = x + 1. As a result, it is indeterminate how implication and equivalence should be translated into linear algebra. Such a translation would have to give preference to either definition (DF1) or (DF2), but when read linear-algebraically, neither x * (−y) = 0 nor −x + y = 1 express the meaning of sufficiency or necessity. With regard to the first definition, it follows that 0 is sufficient for every element of
The conclusion from our exposition above can only be that Boolean algebra and linear algebra, despite occasional similarities in syntax, give rise to languages that are semantically incommensurable. Unsurprisingly, none of the works which have propagated the unity of political methodology (e.g., Brady, 2013; Gerring, 2012; King et al., 1994; Mahoney, 2008) has been accompanied by an argument of how to reconcile the minimal set of axioms of each algebraic system under a generalized system.
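The divergence in truth conditions between the two readings of x * (−y) = 0 and −x + y = 1 discussed above can be made concrete. The sketch below is illustrative only. It assumes that the Boolean reading takes "−" as complement, "*" as AND and "+" as OR on {0, 1}, and that the linear-algebraic reading takes them as additive inverse, multiplication and addition on a small integer grid. All names are hypothetical.

```python
from itertools import product

GRID = range(-3, 4)  # finite integer sample for the arithmetic reading
BITS = (0, 1)

# Linear-algebraic reading.
def lin_first(x, y):
    return x * (-y) == 0

def lin_second(x, y):
    return -x + y == 1

# Boolean reading on {0, 1}.
def bool_first(x, y):
    return (x & (1 - y)) == 0

def bool_second(x, y):
    return ((1 - x) | y) == 1

# On the sample, x * (-y) = 0 holds exactly when x or y is 0 ...
first_iff_a_zero = all(
    lin_first(x, y) == (x == 0 or y == 0) for x, y in product(GRID, repeat=2)
)
# ... and -x + y = 1 holds exactly when y = x + 1.
second_iff_successor = all(
    lin_second(x, y) == (y == x + 1) for x, y in product(GRID, repeat=2)
)

# Read as Boolean expressions, the two forms agree on every row.
boolean_forms_agree = all(
    bool_first(x, y) == bool_second(x, y) for x, y in product(BITS, repeat=2)
)

# Read arithmetically, they disagree even when x and y are restricted to 0 and 1.
linear_disagreements = [
    (x, y) for x, y in product(BITS, repeat=2) if lin_first(x, y) != lin_second(x, y)
]
```

Under these assumptions, the two expressions are interchangeable only in the Boolean reading. In the arithmetic reading they pick out different sets of value pairs, so neither can serve as the translation of an implication.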
Hypothesis Classes
Many political methodologists expressly consider (causal) inference—the formulation and testing of (causal) hypotheses with empirical data—to be the primary goal of social research (Gerring, 2012; Goertz & Mahoney, 2012; King et al., 1994). 16 Although the topic of hypothesis formulation is part and parcel of elementary training in research design, uncertainty about the types of hypotheses associated with each set of methods persists (Braumoeller & Goertz, 2000, p. 847). For instance, Katz et al. (2005) conclude that “regression methods and fuzzy-set methods cannot test the same hypotheses because the two approaches’ contrasting understandings of causation lead them to formulate fundamentally different kinds of hypotheses” (p. 541), whereas Clark et al. (2006) argue that “standard linear models that include interaction terms offer a better way to test asymmetric hypotheses” (p. 312). 17 Before discussing the difference between RAM interactions and CCM conjunctions in the next section, we provide in this part a basic taxonomy of the different types of hypotheses social scientists formulate, based on the following claim:
The focus here is on simple conditions and regressors. 19 A three-level system structures the different types of hypotheses. On the first level, two hypothesis classes can be distinguished. Each of these classes comprises two functions on the second level, and to each of these functions, there exist two arguments on the third level that together form a pair of functional substitutes. This scheme is presented diagrammatically in Figure 1.

Figure 1. Hypothesis classes, functions, and arguments.
The two functions of implication hypotheses are sufficiency and necessity, whose two arguments are absence and presence. The two functions of covariation hypotheses are positivity and negativity, whose two arguments are increase and decrease. An implication hypothesis that features both sufficiency and necessity as functions to which absence and presence or presence and absence are supplied gives rise to an equivalence hypothesis. 20 A covariation hypothesis that features neither positivity nor negativity as functions, and thus neither increase nor decrease as arguments, generates an independence hypothesis. 21 In the remainder of this section, we elaborate on each combination of classes, functions, and arguments, first for implication and subsequently for covariation hypotheses.
A very common implication hypothesis involves the presence of condition X{1} and outcome Y{1} as arguments to the function of sufficiency. Strangely though, explicit sufficiency hypotheses seem difficult to find in the social sciences (cf. Goertz, 2003, p. 73). A few exceptions exist nonetheless. Gleditsch (1995), for instance, maintains that “nuclear deterrence may be interpreted as a sufficient condition for peace” (p. 543), and Landry, Davis, and Wang (2010) argue that electoral “competition defined as choice between candidates still is sufficient to engage voters” (p. 782).
Verbally, hypotheses of this type are thus usually put forward in one of the phrasings given by
Hypothesis type
This type is syntactically codified as Y{0} ⇒ X{0} or again, following (DF1) and (DF2) prior to invoking the law of commutativity in (BA1), as ¬(X{1} ∧ Y{0}) and X{0} ∨ Y{1}. Note that although
Hypotheses about the necessity of a condition for an outcome are considerably more common than those of
They are usually denoted by the Boolean-algebraic expression introduced in (DF3) as X{1} ⇐ Y{1}, and equivalent in content with the type given by the two forms in
The Boolean-algebraic expression Y{0} ⇐ X{0} codifies the content of
The expression X{1} ⇔ Y{1} introduced in (DF5) is used to denote this type. Hypotheses
In contradistinction to implication hypotheses, social scientists are considerably more practiced in using and interpreting covariation hypotheses. 23
A very common type passes the increase in the regressor X and the increase in the regressand Y as arguments to the function of positivity. These hypotheses are usually phrased as given by
Syntactically,
Note that the syntactic representation remains the same regardless of the arguments passed to the function of positivity. Hypotheses positing negativity are generally phrased most naturally in the form given by
Syntactically,
The simultaneous negation of
This type can be expressed linear-algebraically as ΔY/ΔX = 0 for discrete and ∂Y/∂X = 0 for instantaneous changes. In point of fact,

Figure 2. Visualization of basic implication and covariation hypotheses.
In summary, implication hypotheses always require variables to take on specific values that can be true or false for any given object. Covariation hypotheses only require variables to stand in some functional relationship. It is therefore uninformative to hypothesize that “the more of X{1}, the more of Y{1}”, or, conversely, that “if X, then Y”. A country having a high degree of social heterogeneity cannot simultaneously have a higher degree of social heterogeneity, just as the degree of social heterogeneity of a country cannot be necessary for the number of legislative parties. 27 A condition cannot increase or decrease (unlike regressors or regressands), and a regressor cannot be true or false (unlike conditions and outcomes). In consequence, implication hypotheses are based on a Boolean algebra and therefore associated with CCMs, whereas covariation hypotheses are based on a linear algebra and therefore associated with RAMs. Claiming that the latter are superior to the former for building or testing implication hypotheses (e.g., Clark et al., 2006; King et al., 1994) is thus tantamount to ignoring semantics.
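The distinction between a regressor (the variable itself) and a condition (one concrete value of a variable) can be illustrated with a toy example. Everything in the sketch below is hypothetical: the data are invented, the cut-offs are arbitrary, and the least-squares slope merely stands in for whatever estimator a RAM would use.

```python
# Hypothetical country data, invented purely for illustration.
heterogeneity = [1.0, 1.5, 2.0, 2.5, 3.0]  # regressor X: the variable itself
parties = [2.0, 2.5, 3.0, 3.5, 4.0]        # regressand Y: the variable itself

def slope(xs, ys):
    """Least-squares slope of ys on xs, an estimate of the change in Y per unit of X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation = sum((x - mean_x) ** 2 for x in xs)
    return covariation / variation

# Covariation hypothesis: a statement about how Y changes with X.
estimated_slope = slope(heterogeneity, parties)
positive_covariation = estimated_slope > 0

# Conditions and outcomes: each case either shows the value or does not.
HIGH_HETEROGENEITY_CUTOFF = 1.75  # hypothetical threshold
MULTI_PARTY_CUTOFF = 3.0          # hypothetical threshold
high_heterogeneity = [x > HIGH_HETEROGENEITY_CUTOFF for x in heterogeneity]  # X{1}
multi_party = [y >= MULTI_PARTY_CUTOFF for y in parties]                     # Y{1}

# Implication hypothesis: no case shows X{1} together with Y{0}.
sufficiency_holds = all(y for x, y in zip(high_heterogeneity, multi_party) if x)
```

The slope is a number attached to the pair of variables, whereas sufficiency is a truth value derived from the truth values of conditions in individual cases. Asking whether `high_heterogeneity` "increases" or whether `heterogeneity` is "true" has no meaning in this representation either.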
Causal Complexity
The third area of misunderstandings to be addressed concerns the difference between Boolean and linear-algebraic products. Following terminological conventions, we refer to the former as conjunctions and to the latter as interactions. More specifically, we demonstrate that the main difference between these two constructs resides in the fact that implication hypotheses involving conjunctions give rise to Boolean expressions that delineate multi-dimensional grids of spaces, whereas covariation hypotheses involving interactions give rise to linear-algebraic expressions that delineate discrete or continuous multi-dimensional planes. Contrary to received wisdom, the degree of complexity of either construct is irrelevant to this difference. Both conjunctions and interactions can be of any order within the constraints set by the number of arguments to their higher-level functions. Before we reanalyze an influential study in this connection, a review of the applied and methodological bodies of literature reveals the current state of confusion.
Amenta and Poulsen (1996), for instance, argue that
social spending outcomes are due to complex interactions . . . . Because of multicollinearity and losses of degrees of freedom . . . these interactions are sometimes ignored. Qualitative comparative analysis offers a solution . . . (pp. 55-56)
Similarly, Davidsson and Emmenegger (2013) emphasize that the “small number of observations would not allow for the inclusion of multiple interaction terms. In contrast, fsQCA can deal with complex causality even if the number of cases is relatively small” (p. 349). Heikkila (2004), in contrast, draws on QCA to identify “the interaction terms among variables . . . , which can complement the predicted interaction effects from the logit model” (p. 109). And for Grandori and Furnari (2008), the “choice of the data analysis method was driven by the need to detect interaction effects . . . ” (p. 473), but “three-way interactions currently represent a limit for regression analysis applications. . . . For these reasons, . . . we found Boolean comparative analysis . . . the most suitable method for our purposes”. 28 In summary, applied work seems to regard conjunctions and interactions as substitutes, but as the latter often create problems of a technical and/or interpretative nature, CCMs are considered an attractive alternative to RAMs.
In the methodological literature, Clark et al. (2006) argue that CCMs are dispensable because RAMs with interactions offer a superior means for testing (probabilistic) hypotheses about necessity and sufficiency relations, a view that appears to have recently convinced a number of scholars (Brady, 2013, pp. 258-263; Fiss, Sharapov, & Cronqvist, 2013, p. 192; Hug, 2013, p. 257). Mahoney and Goertz (2006) concede that “[t]his is not a completely unreasonable view . . . , for the logical AND is a first cousin of multiplication” (p. 235), to the effect that “as statistical comparativists start to use saturated interaction models in which all possible interactions are assessed and simplified in a top-down manner, we would essentially see an integration of QCA techniques and statistical methods” (Mahoney, 2008, p. 425). Griffin and Ragin (1994, p. 11) hold that QCA and logit regression are in fact alike, the only real difference being that the former is better at handling causal complexity. This counterargument to Clark et al. (2006) is supported by Vis (2012, p. 173) as well as Wagemann and Schneider (2010, p. 384), who consider interactions more problematic in interpretation and QCA as better suited for identifying causal complexity. Somewhere in between these standpoints, Grofman and Schneider (2009) argue that “once we have completed QCA we can use what we have learned to mimic its results with more traditional methods such as binary logistic regression . . . ” (p. 669). To summarize, methodologists’ opinions diverge greatly. Some consider CCMs and RAMs to be closely related if not substitutable in analyzing causal complexity, others believe CCMs to have an edge over RAMs, and still others not only take the opposite view but even argue that CCMs are vastly inferior to RAMs. In this section, we clarify the relation between conjunctions and interactions by integrating the key points from the two preceding sections. More specifically, we make the following assertion:
As Clark et al. (2006; hereafter CGG) present the most unequivocal dismissal of CCMs, we develop our argument on the basis of their influential essay. The authors present a seemingly attractive argument: “to determine whether X1 and/or X2 is necessary, sufficient, or necessary and sufficient for Y” (p. 320), the general interaction model with two regressors X1 and X2 and a regressand Y given in Equation (1) suffices: 29
Based on this model, CGG present eight combinations of coefficients, each of which is associated with a specific “valid conclusion” concerning the type of implication between each regressor and the regressand (CGG, Table 3, p. 322). 30
For example, the authors argue that the statistical insignificance of
Three problems arise with this logic. First, all of CGG’s conclusions about necessity and sufficiency relations on the basis of regression coefficients run counter to the fundamentals of Boolean algebras and Boolean-algebraic hypothesis formulation. Implication hypotheses are posited that establish associations between regressors and regressands but, as illustrated in (DF1) to (DF8), CCMs model causes and effects in terms of variables taking on specific values. Implicational hypotheses about variables that are functionally related through arithmetic equality are ill-formed. Second, and in temporary disregard of the previous point, coefficient combinations are presented only for conclusions about single variables, but not for binary operations such as X1 ∧ X2 or X1 ∨ X2, although these expressions are intrinsic to CCMs. Third and finally, CGG note that they have omitted negative coefficients for ease of presentation but do not indicate how such coefficients should actually be interpreted with respect to necessity and sufficiency. Irrespective of how this would be done, we have shown above that the unary operation “−” does not travel in translation between linear and Boolean algebras.
In spite of these problems, the authors go on to reanalyze the argument made by Duverger (1954) about the structure of a country’s party system. They distill three implication hypotheses from Duverger’s work, namely that multi-member districts (MMD{1}) and high social heterogeneity (SHG{1}) are individually necessary for a multi-party system (MPS{1}), and that the conjunction of high social heterogeneity and multi-member districts is sufficient for a multi-party system (CGG, 2006, pp. 322-324). 31
Following (DF3) and
To test (H1) to (H3), CGG estimate a model with dichotomized regressors for which “the connection between multiplicative interaction models and testing for necessary and/or sufficient conditions is clearest” (p. 325). This model is presented in Equation (2), where NP is the number of parties; MMD is a binary variable with integer 1 indicating multi-member districts and 0 single-member districts; and SHG is a binary variable with integer 1 indicating high social heterogeneity and 0 low social heterogeneity. 32 The dichotomization thresholds are derived both through data-based and theoretical criteria. A value of 1 is applied to differentiate single-member from multi-member districts, the sample median of 1.2775 to distinguish high from low social heterogeneity, and a lower bound of three parties is set to identify multi-party systems:
The authors interpret their results, which show
At the same time, however, CGG see “strong evidence that both multi-member districts and social heterogeneity are necessary, but not sufficient, for more legislative parties” (p. 325). If “more legislative parties” is interpreted to mean “multi-party systems”, as CGG explicitly do when they argue that “multi-member districts are necessary, but not sufficient, for multipartism” (p. 324), then these conclusions follow hypothesis type
In Table 2, we not only reassess (H1) to (H3) but also provide a comprehensive battery of tests for both MPS{1} and its Boolean negation MPS{0}, a two-party system, using conventional CCM inclusion score tests. 35 Results are presented for conjunctive (I(∧, ⇒)) and disjunctive sufficiency (I(∨, ⇒)) with respect to MPS{1} and MPS{0} as well as conjunctive (I(∧, ⇐)) and disjunctive necessity (I(∨, ⇐)) with respect to MPS{1}. 36 For computing inclusion scores, we apply the inclusion ratios for sufficiency and necessity given in Equations (3) and (4), respectively, where m_i is the membership of case i in the condition and the outcome (Smithson & Verkuilen, 2006, pp. 11, 65-68):
Table 2. Inclusion Score Tests for Outcome MPS{1} and Negated Outcome MPS{0}.
Note. MPS = multi-party system; C = condition; MMD = multi-member districts; SHG = social heterogeneity; n = number of cases; I(∧, ⇒) = inclusion score for conjunction and sufficiency, and so on.
*p < 0.10 for alternative hypothesis Incl < 0.75; **p < 0.05 for Incl < 0.75; ***p < 0.01 for Incl < 0.75; †p < 0.10 for Incl > 0.5; ††p < 0.05 for Incl > 0.5; †††p < 0.01 for Incl > 0.5.
With k variables j of pj values, there exist
At a minimum, the hypothesis that a condition is sufficient for an outcome would never be upheld if I were not significantly greater than 0.5, because in that case more evidence may exist for a sufficiency relation between the condition and the negation of the outcome. Moreover, inclusion scores of substantive significance should ideally exceed a value of 0.75 (Ragin, 2008, p. 46; Schneider & Wagemann, 2012, p. 129). At a score of exactly 0.75, there are three times as many observations that show the conjunction of the condition and the outcome as there are observations exhibiting the conjunction of the condition and the negation of the outcome.
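The two benchmarks can be illustrated with a minimal Python sketch. It assumes the usual min-based inclusion ratios for set memberships, which reduce to simple case counts for crisp (0/1) sets; Equations (3) and (4) are not reproduced here, so the exact form is an assumption. The function names and the toy data are hypothetical and are not CGG's data.

```python
from typing import Sequence


def inclusion_sufficiency(condition: Sequence[float], outcome: Sequence[float]) -> float:
    """Share of the condition's total membership that also lies in the outcome."""
    overlap = sum(min(c, o) for c, o in zip(condition, outcome))
    return overlap / sum(condition)  # undefined if no case shows the condition


def inclusion_necessity(condition: Sequence[float], outcome: Sequence[float]) -> float:
    """Share of the outcome's total membership that also lies in the condition."""
    overlap = sum(min(c, o) for c, o in zip(condition, outcome))
    return overlap / sum(outcome)  # undefined if no case shows the outcome


# Hypothetical crisp-set data (assumption, for illustration only):
# 8 cases show the condition, 6 of them also show the outcome,
# and 1 further case shows the outcome without the condition.
condition = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
outcome = [1, 1, 1, 1, 1, 1, 0, 0, 1, 0]

suff = inclusion_sufficiency(condition, outcome)  # 6 / 8
nec = inclusion_necessity(condition, outcome)     # 6 / 7

# Cases with the condition, split by whether the outcome is present.
corroborating = sum(1 for c, o in zip(condition, outcome) if c == 1 and o == 1)
contradicting = sum(1 for c, o in zip(condition, outcome) if c == 1 and o == 0)

print(suff, nec, corroborating / contradicting)
```

With these toy data, the sufficiency score sits exactly at the 0.75 benchmark, which corresponds to three corroborating cases for every contradicting one.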
It is incontestable that none of the four complex conditions C1 to C4 would be considered sufficient for MPS{1} by any standards of CCM research, which ultimately also means that (H3) should be rejected. Contrary to what CGG argue, high social heterogeneity in conjunction with the presence of multi-member districts is not sufficient for the presence of a multi-party system. All that can be inferred with regard to (H3) at conventional levels of statistical significance (p < 0.1) is that the share of cases showing both conditions that also show the outcome amounts to less than 0.75, which is equally true for all inclusion scores that exceed 0.5 (only C4) as well as those that do not (C1, C2, and C3). The only notable score in this column is that of C2. As it is 0, all cases that show the conjunction of the absence of multi-member districts and high social heterogeneity must be associated with two-party systems. The complementary inclusion score of 1 is not merely statistically indistinguishable from the threshold of 0.75 with 11 observations but significantly higher.
Similarly, (H2) should be rejected because high social heterogeneity, at an inclusion score of 0.45, is far from passing as a necessary condition for a multi-party system. If anything, there is more evidence for the hypothesis that high social heterogeneity represents a necessary condition for a two-party system. 37
Only (H1) receives corroboration, but not because the regression coefficients
Words do not readily convey the difference between the testing of causally complex hypotheses with CCMs and that with RAMs. Figure 3 thus provides a three-dimensional visualization of CGG’s data, the corresponding regression plane defined by Model (2) together with its 95% confidence interval, and the implication space of hypothesis (H3). Two different perspectives are provided for better orientation. Equation (2), which models the interaction between social heterogeneity and district magnitude, produces a regression plane of four points because each value of one variable can be combined with one of two values of the other variable. In contrast, the conjunction in (H3) hypothesized to be sufficient for a multi-party system produces a grid of four spaces along the dichotomization thresholds of the three variables. The two most important spaces have been enclosed by gray-transparent rectangles.

Figure 3. Predicted number of parties (squares) with 95% confidence intervals, and implication space of SHG{1} ∧ MMD{1} ∧ MPS{1} and SHG{1} ∧ MMD{1} ∧ MPS{0}.
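Why an interaction model with two dichotomous regressors yields a plane of only four points can be sketched in a few lines of Python. The sketch assumes that Equation (2) has the standard multiplicative form NP = b0 + b_mmd·MMD + b_shg·SHG + b_int·MMD·SHG, which is not reproduced here; the coefficient values are hypothetical and are not CGG's estimates.

```python
from itertools import product


def predicted_np(b0: float, b_mmd: float, b_shg: float, b_int: float,
                 mmd: int, shg: int) -> float:
    """Predicted number of parties under the assumed interaction model."""
    return b0 + b_mmd * mmd + b_shg * shg + b_int * mmd * shg


# Hypothetical coefficients, chosen only for illustration.
coef = dict(b0=2.0, b_mmd=0.5, b_shg=-0.25, b_int=1.5)

# Two binary regressors admit exactly 2 x 2 = 4 value combinations,
# so the fitted "plane" consists of four predicted points.
plane = {
    (mmd, shg): predicted_np(mmd=mmd, shg=shg, **coef)
    for mmd, shg in product((0, 1), repeat=2)
}

# Conditional effect of SHG at each value of MMD: the interaction term
# makes this difference depend on the other regressor.
effect_shg_given_mmd0 = plane[(0, 1)] - plane[(0, 0)]
effect_shg_given_mmd1 = plane[(1, 1)] - plane[(1, 0)]

for cell, value in sorted(plane.items()):
    print(cell, value)
print(effect_shg_given_mmd0, effect_shg_given_mmd1)
```

The quantity of interest here is how the effect of one regressor changes across values of the other, which is a statement about differences between predicted values rather than about set membership.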
Cases in the space created by SHG{1} ∧ MMD{1} ∧ MPS{1} corroborate (H3), whereas those in the space given by SHG{1} ∧ MMD{1} ∧ MPS{0}, and only those, act as falsifiers according to (DF1). Countries with low social heterogeneity or single-member districts or both, regardless of the number of parties, do not contradict the claim that the combination of high social heterogeneity and the presence of multi-member districts is sufficient for multi-party systems, although, according to CGG’s logic, any of these cases that show a multi-party system should undermine the hypothesis. Put again in graphical language, RAM researchers hunt for areas on regression planes that show discernible deviations from flatness (where one regressor has no effect on the regressand given a (range of) value(s) of the interacting regressor), whereas CCM researchers seek to collapse grids of implication spaces (where one condition has no verifiable effect on the outcome given the presence of the conjunction of other conditions).
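The sorting of cases into implication spaces can be sketched as follows. The thresholds are those reported for CGG's dichotomization, but the direction of each comparison (strictly above versus at or above the cutoff) is an assumption, as are the function names and the four toy cases, which are hypothetical and not drawn from CGG's data.

```python
from collections import Counter

MMD_CUTOFF = 1.0     # district magnitude above 1 counts as multi-member
SHG_CUTOFF = 1.2775  # sample median of social heterogeneity
NP_CUTOFF = 3.0      # lower bound of three parties for a multi-party system


def dichotomize(magnitude: float, heterogeneity: float, parties: float) -> tuple:
    """Map raw values to (MMD, SHG, MPS); comparison directions are assumed."""
    mmd = int(magnitude > MMD_CUTOFF)
    shg = int(heterogeneity > SHG_CUTOFF)  # assumption: strictly above the median
    mps = int(parties >= NP_CUTOFF)
    return mmd, shg, mps


def classify_h3(mmd: int, shg: int, mps: int) -> str:
    """Role of a case for 'SHG{1} and MMD{1} is sufficient for MPS{1}'."""
    if mmd == 1 and shg == 1:
        return "corroborates" if mps == 1 else "falsifies"
    # Cases lacking the conjunction say nothing about its sufficiency,
    # whatever their number of parties.
    return "irrelevant"


# Hypothetical cases: (district magnitude, social heterogeneity, parties).
cases = [
    (5.0, 1.9, 4.2),  # conjunction present, multi-party system
    (7.0, 1.5, 2.1),  # conjunction present, two-party system
    (1.0, 2.0, 3.5),  # single-member districts, multi-party system
    (1.0, 1.1, 2.0),  # neither condition, two-party system
]

tally = Counter(classify_h3(*dichotomize(*case)) for case in cases)
print(tally)
```

Note that the third toy case, a multi-party system without multi-member districts, is irrelevant for the sufficiency claim even though it would bear on a regression-based reading of the same data.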
In conclusion, irrespective of the setting of any of the 27 different constellations of regression coefficients in a RAM interaction model such as Equation (2), conjunctions and interactions are incommensurable constructs modeling causal complexity that can neither be substituted for each other nor remedy the shortcomings of the counterpart. 38 By extension, neither do saturated interaction models integrate conjunctions, as Mahoney (2008, p. 425) claims, nor can conjunctions be conceptually mimicked by interactions, as suggested by Grofman and Schneider (2009, p. 669).
Conclusion
Fundamental misunderstandings about algebraic systems, hypothesis classes, and the concept of causal complexity have been blocking progress in the debate between configurational comparativists and regressional analysts for more than a quarter-century. Contrary to expectations, the sources of these misunderstandings have not mainly resided in problems of communication between the two communities, but in knowledge gaps and ambiguous definitions of concepts within these two communities. The objective of this article has been to clear this blockage once and for all by showing how the differences arising under these aspects inform the distinct purposes of CCMs and RAMs.
It has first been demonstrated that CCMs are based on Boolean algebra, whereas RAMs work according to the laws of linear algebra, both of which give rise to semantically incommensurable languages despite occasional resemblances in syntax. If the debate is to progress at all, representatives of CCMs and those of RAMs cannot be spared from gaining more proficiency in the mathematical formalities of these languages. Casual dismissals of Boolean algebra by proponents of RAMs should not be tolerated any longer, just as elementary misinterpretations such as those concerning the Boolean implication operator need to be finally overcome by proponents of CCMs.
It has then been argued that hypotheses formulated in social-scientific research for purposes of (causal) inference generally divide into implication and covariation hypotheses, the former of which posit implicational associations between conditions and outcomes, and the latter of which posit covariational associations between regressors and regressands. Just as “condition” is not merely a CCM term for “regressor”, an “outcome” is something entirely different from a “regressand”. When researchers design their projects and formulate hypotheses, they need to be aware of the consequences their decisions entail in this connection.
Finally, we have juxtaposed conjunctions and interactions so as to emphasize the disparate concepts behind these constructs, and to argue that CCMs and RAMs cannot substitute for each other in analyzing causal complexity. Graphically speaking, conjunctions delineate grids of spaces, whereas interactions produce discrete or continuous planes. Methodologists should thus stop arguing about the superiority of one set of methods in dealing with causal complexity, and instead begin to appreciate their distinct capabilities, leveraging respective strengths wherever apposite.
Linear algebra is essential to an understanding of RAMs, just as Boolean algebra is indispensable for comprehending the principles of CCMs. Most methods curricula for political scientists and sociologists at university departments around the world, however, assume students to be familiar with the fundamentals of only the former. Unless the horizon is broadened, we will see more incendiary works being published over the coming years that reinforce the misunderstandings addressed in this article. To prevent such methodological inertia or even regress, methods curricula should thus feature introductions to propositional logic as that branch of Boolean algebra closest to the social sciences. The tools of formal logic are not the exclusive domain of analytic philosophers, electrical engineers, or genetic biologists. If political scientists and sociologists want to employ or judge these tools, they must eventually begin to acquaint themselves with them more thoroughly as well.
Footnotes
Acknowledgements
Previous versions of this article have been presented at the Annual General Conference of the European Political Science Association, Edinburgh, 19-21 June 2014 and the Annual Meeting of the American Political Science Association, Washington, DC, 28-31 August 2014. We thank Barry Cooper, Gary Goertz, two anonymous reviewers of CPS, the editors of CPS, Benjamin Ansell and David Samuels, and the participants at the aforementioned conferences for thought-provoking comments and very helpful suggestions.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Alrik Thiem and Michael Baumgartner gratefully acknowledge financial support from the Swiss National Science Foundation, award number PP00P1_144736/1.