Abstract
People see immorality in sin and sex, but is “purity” a unique type of moral content, with unique cognition? Domain-general accounts—and parsimony—suggest that all moral content is processed similarly and that “purity” is merely a descriptive label. Conversely, domain-specific theories (e.g., moral foundations theory [MFT]) argue for a special purity module. Consistent with domain-general accounts, we demonstrated that purity concerns are not distinguished from harm concerns—in either MFT or naturalistic scenarios—and that controlling for domain-general dimensions eliminates effects previously ascribed to moral “modules.” Here, we reaffirm the strength of our data, exploring how issues raised by Graham reflect only weaknesses in MFT. Importantly, we identify several clear contradictions between Graham’s comment and past-published accounts of MFT. To the extent that MFT stands by its published stimuli, methodologies, and theoretical assumptions, we believe that we have disconfirmed MFT on its own terms.
Biological science has revealed staggering diversity among organisms—biological pluralism—but all biological species derive from the same general process of evolution. Likewise, anthropology has revealed staggering diversity among moral judgments—moral pluralism—but all moral judgments may revolve around the same domain-general, harm-based dyadic moral template (Gray & Schein, 2012). Even the founder of moral pluralism, Richard Shweder (2012), advocates for “universality without uniformity” in which violations of “purity” can be understood via perceived harm.
Arguing against a common moral template, moral foundations theory (MFT) suggests that moral judgment occurs via discrete cognitive modules: “little switches in the brain of all animals” “triggered” by “specific moral inputs,” such as harm or purity (Haidt, 2012, p. 123), with “distinct cognitive computations” for each kind of moral content (Young & Saxe, 2011, p. 203). The main evidence for these claims comes from researcher-constructed scenarios of “harm” (e.g., murder) and “purity” (e.g., chicken masturbation) that reveal different patterns of judgment (Graham et al., 2013). However, our research finds that these scenarios fundamentally confound moral content (harm vs. purity) with domain-general dimensions, including severity and weirdness (among potential others; Gray & Wegner, 2011). In our controlled studies, purity per se demonstrates no special effect on moral cognition, nor does it appear to be distinct from harm—or even pass manipulation checks—all arguing against MFT modularity (Gray & Keeney, 2015).
In his comment, Graham (2015) criticized four elements of our research, namely, (1) our use of moral judgment scenarios as stimuli; (2) our reliance upon participant intuitions in categorizing moral content; (3) the high correlations between harm, purity, and severity; and (4) our description of MFT as domain specific. Although these criticisms seem superficially compelling, a closer examination reveals that they actually highlight weaknesses within MFT. As our stimuli, design choices, analysis logic, and theoretical descriptions all come directly from MFT, any flaws therein argue only against MFT. Even more problematic, Graham’s comment redefines MFT in ways that seem impossible to reconcile with past published accounts. Through our specific points, we elaborate on these inconsistencies and explain how they undermine the coherence of MFT.
Moral Judgment Scenarios
Graham criticizes our research for operationalizing purity primarily through MFT moral judgment scenarios, rather than other MFT stimuli. Although there may be other MFT items, that is beside the point: shouldn’t all MFT stimuli be free of sampling bias? More importantly, we specifically chose these scenarios in order to test MFT on its own terms. Graham and colleagues (2009) custom-developed these scenarios to represent purity and frequently cite them as providing strong support for MFT (Graham et al., 2013). If these scenarios are judged to be valid for past research (ostensibly supporting MFT), they must also be valid for our research, even though our research disconfirms MFT.
Scenario studies are not only the “most widely used by far” in the field (Graham et al., 2013, p. 70) but, critically, directly assess intuitive moral judgments regarding particular acts. Other MFT stimuli, especially the Moral Foundations Questionnaire (MFQ), do not assess intuitive moral judgments. Despite its name, the MFQ-Judgments scale assesses only the endorsement of general conservative values (“men and women have different roles to play in society,” Graham et al., 2011). In other words, the MFQ-Judgments scale simply repackages political identification rather than assessing moral cognition, which explains why its effect on actual moral judgments is fully mediated by right-wing authoritarianism (Kugler, Jost, & Noorbaloochi, 2014).
Even less useful for assessing intuitive moral judgment, the MFQ-Relevance scale asks participants to introspect about their own moral cognition (“what factors are relevant for your moral judgments?” Graham et al., 2011). Decades of research show that people lack introspective access to the reasons behind their judgments (Nisbett & Wilson, 1977), and as moral judgments are especially intuitive, deliberative moral reflection is unlikely to reveal anything more than lay theories and post hoc rationalizations (for a fuller explanation, see Haidt, 2001).
Reliance on Participant Intuitions
MFT posits that harm and purity are distinct concerns, but the high correlation of these variables in participants’ ratings (rs > .86) reveals a lack of distinctness. Graham criticizes this finding by suggesting that participants are unable to accurately identify harm and impurity. Why then do Graham and colleagues rely upon participants’ ratings of harm and impurity in their own studies? For example, the MFQ-Relevance instrument asks participants to rate whether “someone violated standards of purity and decency.” If participant identifications of purity are judged to be valid for past research (ostensibly supporting MFT), they must also be valid for our research, even though our research disconfirms MFT.
Importantly, while the MFQ-Relevance scale asks people to reflect on the reasons behind their judgments (which, as we argued above, is inappropriate), we merely asked people to rate the presence of purity or harm. This is no different from asking people to rate the presence of immorality (i.e., make a moral judgment). Moreover, in defining harm and purity for participants, we pulled words directly from the MFT “dictionaries” that ostensibly reflect laypeople’s understandings of moral content (Graham, Haidt, & Nosek, 2009). If MFT “dictionary” words are judged to be valid for past research (ostensibly supporting MFT), they must also be valid for our research, even though our research disconfirms MFT.
Graham also criticizes our reliance on participant intuitions in constructing our new “purity” scenarios. We agree that our participant-generated cases (e.g., prostitution and stripping) differ substantially from MFT researcher-devised cases (adding a tail via plastic surgery; Haidt, 2012). However, we view the emphasis on everyday morality as a strength of our stimuli. As Graham notes, these naturalistic scenarios fail to independently activate harm and purity—but so too do the bizarre scenarios of MFT. We suggest that this reliable lack of independent activation stems not from specific scenarios but instead from a lack of moral modularity, consistent with domain-general dyadic morality.
High Interconstruct Correlations
Not only did our data reveal extensive overlap between harm and purity, but ratings of harm and impurity were both highly correlated with judgments of immorality—that is, severity. Graham sees this as a problem, but it is problematic only for modular MFT. The harm-based template of dyadic morality suggests that perceived harm and perceived immorality should substantially overlap. That purity—as assessed with MFT items—is also correlated with harm and severity suggests that purity is either understood via harm (see Gray, Schein, & Ward, 2014) or is poorly defined, both of which challenge modular MFT.
Domain-General Modularity?
The most surprising criticism leveled by Graham was that we incorrectly suggested that modular MFT was inconsistent with domain-general accounts. Graham asserts that MFT is “perfectly consistent with domain-general as well as domain-specific processes.” This statement is a stark reversal for MFT. MFT researchers have repeatedly and explicitly argued against domain-general moral processes for more than a decade (Graham et al., 2013; Haidt & Joseph, 2004). Cognitive modules (encapsulated, domain-specific “switches”) are by definition opposed to domain-general processes that cut across content (Cameron, Lindquist, & Gray, 2015). How can Graham argue for the modularity of moral content (the specialness of purity per se) and yet accept that purity has no special effect on moral cognition beyond crosscutting domain-general dimensions?
Attempting to reconcile our domain-general findings with past published modular MFT claims, Graham suggests that there may be both “differences” and “similarities” across moral content. However, exactly which differences and similarities MFT predicts is left vague. In order for a theory to be both falsifiable and useful (i.e., pragmatically valid, Graham et al., 2013), it must specify exactly and a priori when one pattern of results (e.g., differences), rather than its complete opposite (e.g., similarities), is predicted to emerge. Unfortunately, Graham leaves these precise predictions unspecified, while past published formulations of MFT strongly argue only for differences (Graham et al., 2013).
Pluralism and Parsimony
MFT presumes ownership of four claims: “nativism, culture, intuition, and pluralism” (Graham et al., 2013, p. 62). We challenge this presumed ownership. Dyadic morality also asserts that morality can be innate (nativism), learned (culture), and intuitive (Gray, Young, & Waytz, 2012). Contrary to Graham’s (2015) mischaracterization, dyadic morality also embraces moral pluralism. Indeed, despite the anthropological roots of MFT, we suggest that dyadic morality is the true inheritor of pluralism because it also acknowledges harm pluralism, that is, legitimate variations in perceived harm. Dyadic morality acknowledges that Brahmans legitimately see harm when burial rites are violated (Shweder, 2012), and U.S. conservatives legitimately see harm in homosexuality (Gray et al., 2014), whereas harm-monist MFT rejects these perceptions as mere rationalizations (Haidt, 2001).
The pluralist dyad means that the “dyad versus MFT” debate is about the underpinnings of moral cognition—“templates versus modules.” It is not about “parsimony versus pluralism,” as Graham suggests. Dyadic morality is both parsimonious and pluralist. In Shweder’s words, dyadic morality has “universality without uniformity,” by combining rich moral diversity with a common cognitive template. The variability of perceived harm gives this template flexibility, but it also yields a clear testable hypothesis: diverse moral judgments should reliably co-occur with intuitive perceptions of harm. If someone views something as immoral, they should also perceive it as harmful—a prediction supported by recent research (Gray et al., 2014).
Conclusion
In sum, Graham’s comment provides a revised formulation of MFT that seems both internally inconsistent and impossible to reconcile with previously published accounts. Nevertheless, to the extent that MFT stands by its past-published scenarios, assumptions, and claims of modularity, we believe that we have disconfirmed MFT, and on its own terms.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
