Abstract
The role of emotion in moral judgment is currently a topic of much debate in moral psychology. One specific claim made by many researchers is that irrelevant feelings of disgust can amplify the severity of moral condemnation. Numerous researchers have found this effect, but there have also been several published failures to replicate it. Clarifying this issue would inform important theoretical debates among rival accounts of moral judgment. We meta-analyzed all available studies—published and unpublished—in which incidental disgust was manipulated prior to or concurrent with a moral judgment task (k = 50). We found evidence for a small amplification effect of disgust (d = 0.11), which is strongest for gustatory/olfactory modes of disgust induction. However, there is also some suggestion of publication bias in this literature, and when this is accounted for, the effect disappears entirely (d = −0.01). Moreover, prevalent confounds mean that the effect size that we estimate is best interpreted as an upper bound on the size of the amplification effect. On the basis of the results of this meta-analysis, we argue against strong claims about the causal role of affect in moral judgment and suggest a need for new, more rigorous research on this topic.
The psychological study of moral judgment has undergone something of a revolution in the past 25 years. Prior to this time, Lawrence Kohlberg’s (1971) pioneering theories dominated research in the field. Kohlberg viewed moral judgment as resulting primarily from reasoned deliberation—“the moral force in personality is cognitive,” he argued (p. 230). Kohlberg did not deny the role of affect in moral judgment (pp. 230–231), but he saw it as considerably more peripheral than cognition, a general perspective that we call rationalist. Lately, however, there has been a paradigm shift within moral psychology, such that many theorists now emphasize the role of automatic emotional processes in moral judgment, largely downplaying the role of reasoned deliberation. For instance, Haidt’s (2001) highly influential social intuitionist model posits that “moral intuitions (including moral emotions [emphasis added]) come first and directly cause [emphasis added] moral judgments” (p. 814). Moral reasoning, insofar as it occurs at all, is seen as post hoc, justifying a judgment that has already been made. According to this model, the reason why a person judges a proscribed act, such as incest, as morally wrong is that “one feels a quick flash of revulsion at the thought of incest, and one knows intuitively that something is wrong” (Haidt, 2001, p. 814). We refer to the general class of theories that posit a causal role of affect in moral judgment as neo-sentimentalist, because the scholars who argue for them position themselves as the intellectual (and empirical) descendants of sentimentalist philosophers, such as David Hume (1739/2003; see, e.g., Haidt, 2001, p. 816).
Both psychologists (e.g., Nabi, 2002; Pizarro & Bloom, 2003) and philosophers (e.g., Fine, 2006; Saltzstein & Kasachkoff, 2004) have challenged the neo-sentimentalist approach on theoretical grounds, and some researchers (e.g., Royzman, Goodwin, & Leeman, 2011; Royzman, Leeman, & Baron, 2009) have argued for a renewed emphasis on rationalist (or, more broadly, cognitive) inputs to moral judgment. The divide between rationalists and neo-sentimentalists arguably constitutes the major theoretical debate in modern moral judgment research (although cf. Monin, Pizarro, & Beer, 2007). This debate is important because it concerns issues that are central to understanding the human moral faculty. The competing accounts of human morality offered by the neo-sentimentalist and rationalist perspectives are strikingly different and give rise to very different conceptions of the origin, development, and malleability of human moral judgment. Advancing this debate is therefore of considerable interest to psychologists. In the present meta-analysis, we aim to contribute to this debate by examining the evidence for one key claim of the neo-sentimentalist approach—that unrelated physiological disgust makes judgments of moral transgressions harsher than they otherwise would be.
Why Focus on Disgust?
The emotion of disgust has received particular attention for its putative role in moral judgment. An early indication that disgust might play a role in moral judgment emerged from a seminal study in which people were found to treat disgusting, taboo acts that do not cause any harm—for example, having sexual intercourse with a dead chicken before cooking and eating it—as moral transgressions rather than as violations of social convention (Haidt, Koller, & Dias, 1993). This finding contradicted one of the foundational assumptions of Turiel’s (1983) social domain theory, according to which causing harm is a necessary and signature element of moral transgressions. This result showed that judgments of immorality can occur without the presence of harm and was therefore seen as a major challenge for the prevailing rationalist models of moral thinking, and it sparked a considerable amount of subsequent research on the role of disgust in moral judgment.
Disgust is, at first glance, an odd candidate for an emotion that would influence moral judgment because, as reviewed later, it is usually elicited by things that are contaminating but have little to do with morality—things such as feces, insects, and decaying corpses. In other words, disgust is prenormative—that is, it can be evoked in the absence of a violated moral prohibition (Royzman et al., 2009). In this way, it differs from other negative emotions, such as anger, which is more typically evoked in response to the appraisal of some offense or transgression, thus giving it an in-built normative character. However, this prenormative feature of disgust is precisely what affords it a major advantage when examining the role of emotion in moral judgment: It allows a cleaner test of the “pure” role of emotion. If disgust was found to influence moral judgments, this would provide clearer evidence for such a pure role of emotion than would demonstrating that a normatively laden emotion, such as anger, influences moral judgment. Studying disgust, therefore, offers the possibility of de-confounding affective and normative elements, thus potentially allowing clearer inferences about the role of affect per se in moral judgment.
What Is Disgust?
Before examining what role, if any, disgust plays in moral judgment, we briefly review what disgust is and what it does. Disgust is a withdrawal-motivating emotion, prompting avoidance of its eliciting stimulus (Ekman & Friesen, 1975). It is often accompanied by the sensation of nausea. Although there are divergent theories of the original purpose and function of disgust, there is considerable agreement on what sorts of stimuli typically elicit it: bodily fluids and waste, decay, filth, certain insects (e.g., cockroaches) and animals (e.g., rats), and certain classes of sexual acts (Oaten, Stevenson, & Case, 2009; Rozin, Haidt, & McCauley, 2008; Tybur, Lieberman, & Griskevicius, 2009; Tybur, Lieberman, Kurzban, & DeScioli, 2013).
Two prominent theories provide accounts of the origin and function of disgust. In both of these theories, disgust is seen as a multifaceted construct, involving a core disgust response, which can be applied to stimuli that do not intrinsically elicit disgust, such as moral offenses. The oral rejection theory of disgust (Rozin, Haidt, & Fincher, 2009; Rozin et al., 2008) posits that disgust has its roots in distaste, the rejection of bitter substances in the mouth. Bitter taste, which is an indicator of plant poisons, evokes a physiological rejection response, the “gape” response, meant to expel the offending taste object. This response is present from birth and has analogs in other mammals, such as rats (Travers & Norgren, 1986), which strongly suggests that it is an evolved response that protects the body from harmful or poisonous substances. Proponents of the oral rejection theory argue that this disgust response was co-opted to protect humans from other contaminating threats beyond oral toxins, including typical core disgust elicitors—such as feces, insects, wounds, and the like—but also people who commit immoral acts. These immoral acts need not involve any core disgust elicitors, though of course they could. On this theory, moral disgust is a preadaptation—the disgust response being expanded to serve functions for which it did not originally evolve (for a much more in-depth treatment of this theory, see Rozin et al., 2008; for a critique of this theory, see Tybur et al., 2013).
The other major theory of disgust posits that the disgust response does not have its roots in distaste but rather that it evolved specifically to help humans avoid disease. In this theory, disgust is elicited by sensory cues that indicate the probable presence of contagious pathogens (Oaten et al., 2009). Accordingly, activities that pose direct disease threats elicit the physiological disgust response (e.g., nausea, gagging, loss of appetite; see Royzman, Leeman, & Sabini, 2008) and are moralized. However, these disgust-eliciting stimuli do not have to be physically present to produce a disgust response—merely thinking about them is sufficient. Indeed, in this theory, the fact that mere thoughts of disgust elicitors can produce disgust is central to how disgust becomes broadly involved in moral judgment. Consequently, other activities that themselves do not pose direct disease threats—for instance, deviant sexual practices—can become moralized because they bring to mind thoughts of concrete disgust-eliciting stimuli via mental association. For example, necrophilia, which may in some instances consist only of sexual attraction to corpses, or of nonharmful contact with those corpses, evokes disgust and moral condemnation because it brings to mind the thought of dangerous contact with (potentially diseased) corpses.
A variant of this theory suggests that the disgust response originally evolved to motivate pathogen avoidance but was later co-opted to motivate avoidance of sexual partners with low reproductive value and, separately, to coordinate punishment of those who violate moral norms within groups (Tybur et al., 2013). In this theory, disgust can influence moral judgment because it is a cue to the fitness costs of engaging in an action. Therefore, when one feels disgust, one concludes that an act is wrong and therefore ought to be punished (Tybur et al., 2013). Regardless of the specifics, proponents of this theory generally agree that disgust has its roots in an adaptation that helps humans to avoid potential pathogens but can be co-opted to support condemnation in the moral domain.
Both of these theories—oral rejection and pathogen avoidance—provide an account of how disgust may become involved in moral judgment. Furthermore, as we show next, these theories are relevant to understanding the existing literature on disgust and moral judgment, particularly when considering potential moderators of existing findings.
How Might Disgust Be Involved in Moral Judgment?
Theorists of morality have also considered how disgust might be relevant to moral judgment. Pizarro, Inbar, and Helion (2011) articulated three ways that disgust could relate to moral judgment (for a similar treatment concerned with negative affect in general, see Prinz, 2006). The first possibility is that disgust results from at least some moral judgments—that is, when a person judges something to be a moral transgression, this evokes disgust, an idea we refer to as the elicitation hypothesis. As indicated earlier, this idea is supported by both the oral rejection and the pathogen avoidance theories of disgust. In the second possibility, this causal direction is reversed: The experience of disgust makes people’s moral appraisals more negative than they otherwise would be, a more controversial claim that we call the amplification hypothesis. In the third possibility, the experience of disgust makes people more likely to moralize actions. This moralization hypothesis, which is the most controversial of all, suggests that experienced disgust is, by itself, enough to produce moral condemnation of otherwise neutral acts—that is, disgust is sufficient for moral condemnation. These three hypotheses are summarized schematically in Figure 1.

Figure 1. Schematic representation of three possible relationships between disgust and moral judgment.
Evidence for the Elicitation Hypothesis
Before examining whether disgust exerts a causal influence on moral judgments, we first consider whether seeing or contemplating moral transgressions can produce disgust. If the elicitation hypothesis is not correct, then the stronger claims of amplification or moralization would seem unlikely because these claims presuppose the regular co-occurrence of disgust and moral judgment. That is, according to both of these stronger hypotheses, the reason why disgust influences moral judgment is either because it is seen as a reliable cue to the moral gravity of an offense (akin to the affect-as-information approach; see Schwarz & Clore, 1988) or because disgust is partly constitutive of moral disapprobation itself (see, e.g., Haidt, 2001; Prinz, 2006). Either way, one would expect disgust to be regularly elicited by appraisals of moral wrongness.
Three sources of evidence support the elicitation hypothesis. First, in response to at least some moral violations, individuals explicitly report feeling disgusted. Some studies suggest that this effect is limited to so-called purity violations—proscribed activities that involve the defilement or degradation of a person’s body or soul, including inappropriate sex acts, food taboos, misuse of the body, and crimes against nature (e.g., Haidt & Graham, 2007; Horberg, Oveis, Keltner, & Cohen, 2009, Study 1). Other researchers have found this effect to extend to a range of moral offenses (Hutcherson & Gross, 2011), although this latter finding may have arisen because participants reported feeling morally disgusted, rather than disgusted per se (see P. S. Russell, Piazza, & Giner-Sorolla, 2013).
Second, moral violations also appear to activate the concept of disgust implicitly. Having read about a moral violation, participants were more likely to complete ambiguous word stems (e.g., REVOL_ING) with disgust-related words (e.g., REVOLTING) rather than disgust-unrelated words (e.g., REVOLVING; Jones & Fitness, 2008). An ambiguity with this result, and others like it, is that in lay parlance the word “disgust” appears to capture elements of anger as well as core disgust (see, e.g., Nabi, 2002; J. A. Russell & Fehr, 1994), so it remains possible that the disgust evoked by moral offenses is primarily rooted in anger rather than core disgust. However, participants who were offered a gift to take home after the experiment were more likely to take a cleaning-related product when they had read about a moral violation, suggesting that they experienced core disgust, which they then desired to cleanse (Jones & Fitness, 2008, Study 1).
Third, moral transgressions, such as unfair offers in the ultimatum game, appear to activate facial expressions that are characteristic of core disgust rather than anger (Chapman, Kim, Susskind, & Anderson, 2009), which, in turn, predict the likelihood of rejecting such offers. Facial responses indicating disgust appear particularly prevalent in response to purity violations and violations of fairness norms rather than actions that are directly harmful (Cannon, Schnall, & White, 2011). However, whether such findings show that core disgust is directly activated by moral transgressions is disputed: Royzman and Kurzban (2011a, 2011b) have argued that facial expressions of this sort may constitute intentional signals of disapproval rather than direct read outs of emotional responses (though see Chapman & Anderson, 2011, for an alternative perspective).
In sum, there is accumulating, though imperfect, evidence that disgust can be elicited by an appraisal that a moral transgression has occurred, and some evidence suggests that this effect is more robust for purity offenses. This evidence thus establishes an important precondition for the amplification and moralization hypotheses. We now turn to evaluating the evidence for the primary focus of this article: the amplification hypothesis. There is some evidence that the experience of disgust can increase the severity of moral condemnation, as well as some conflicting evidence.
Evidence for the Amplification Hypothesis
Correlational evidence
Although in the present meta-analysis we are concerned with experimental studies of the amplification effect, relevant correlational results also exist, which are consistent with claims of a causal link between disgust and moral judgment. Some studies have shown that individuals’ trait sensitivity to disgust positively predicts the severity of their moral and punitive judgments (Chapman & Anderson, 2014; Jones & Fitness, 2008; but see Landy & Piazza, 2015), especially those regarding violations of bodily and sexual purity (Horberg et al., 2009). Disgust sensitivity also predicts conservative attitudes toward abortion and gay marriage (Inbar, Pizarro, & Bloom, 2009).
These correlational results are consistent with a dispositional form of the amplification hypothesis, at least in the domain of purity violations. However, they each leave open the possibility of reverse causation—it could be that the harshness of people’s moral judgments causes disgust sensitivity rather than the other way around. We are exposed to many disgust elicitors—for instance, body odor or dog feces—only when someone else breaches a social norm—for instance, by failing to maintain standards of personal hygiene or to clean up after his or her pet. People who are generally more severe in their moral judgments might be especially attuned to such hygiene-related offenses—both noticing and attending to such offenses more assiduously as well as judging them more severely—thereby experiencing an elevated level of disgust in their daily lives. Moreover, the reviewed correlational evidence leaves open any number of third variable explanations for the association between trait disgust and moral harshness. To name just one, it is known that political conservatism predicts disgust sensitivity (Inbar et al., 2009)—particularly in the domain of sexual disgust (Tybur, Merriman, Caldwell Hooper, McDonald, & Navarrete, 2010)—and that conservatives also tend to be more punitive in their moral thinking than liberals, at least under certain conditions (Gromet & Darley, 2011; Tetlock et al., 2007). The association between overall disgust sensitivity and moral harshness could therefore be driven by political ideology predicting both variables independently.
Experimental evidence
Because experimental designs can be used to rule out these alternative explanations and to provide direct evidence for a causal effect of disgust on moral judgment, we focus on these designs in the present meta-analysis. The amplification hypothesis is best tested in experiments in which disgust is manipulated with an emotion induction that is unrelated—or incidental (Bodenhausen, 1993; Loewenstein & Lerner, 2003)—to the moral judgment. Such designs can show that the mere subjective experience of physiological disgust amplifies moral judgments. Indeed, studies of incidental disgust arguably provide the best current evidence for the neo-sentimentalist idea that moral judgment is largely driven by emotion.
In several such studies researchers have found that incidental disgust amplifies moral judgment. In the earliest published study of this kind, participants were hypnotized to feel “a brief pang of disgust . . . a sickening feeling in your stomach” when reading an otherwise innocuous word (take or often; Wheatley & Haidt, 2005, p. 780). Participants then read vignettes describing moral transgressions, half of which contained the hypnotic trigger word. Participants rated behaviors described in vignettes that included the trigger word as more immoral than identical behaviors described in vignettes that did not include the trigger word (Wheatley & Haidt, 2005).
Since this study, several other methods of inducing disgust have been used, many of which were pioneered by Schnall, Haidt, Clore, and Jordan (2008). In one study, participants exposed to a noxious ambient odor (a commercially available novelty “flatulence spray”) rated a variety of transgressions as more immoral than participants not exposed to any odor. Similarly, in other studies, participants who had completed a study in a filthy work area, had written about a time they had encountered something physically disgusting, or had just watched a disgusting film clip (the infamous toilet scene from the movie Trainspotting; Boyle, 1996) all rated various moral transgressions as more wrong than control participants. These latter three results were observed only among participants high in private body consciousness (Miller, Murphy, & Buss, 1981), a dispositional tendency to attend to one’s internal bodily states (Schnall, Haidt, et al., 2008). Using the same video induction, Schnall, Benton, and Harvey (2008) observed less harsh moral judgments among participants who had washed their hands after the disgust induction than participants who had not—seemingly because the hand washing alleviated the felt disgust, thereby eliminating the amplification effect. Amplification effects in moral judgment have been replicated by researchers using other videos (Ugazio, Lamm, & Singer, 2012) and even bitter tastes (Eskine, Kacinik, & Prinz, 2011) as disgust inductions. Other researchers have replicated the amplification effect of disgust (Cheng, Ottati, & Price, 2013) while also observing similar amplifying effects for several other arousing emotions beyond disgust.
In several other studies, researchers have observed the amplification effect but only in a moderated way. For instance, Horberg et al. (2009) used the same video induction as Schnall, Haidt, et al. (2008) and replicated the amplification effect, but this was demonstrated only for purity violations; directly harmful actions were unaffected. Similarly, when participants listened to the sound of a person vomiting while making moral judgments, only judgments of the wrongness of purity offenses were amplified (Seidel & Prinz, 2013). In contrast, anger—induced by an irritating sound—amplified judgments of the wrongness of harmful actions and violations of fairness norms, but it did not affect judgments of purity violations. In another partial replication, the amplification effect was found to hold only among participants low in attentional control (Van Dillen, van der Wal, & van den Bos, 2012). It has also been found to replicate only among participants who were unskilled at differentiating among their emotional states (Cameron, Payne, & Doris, 2013). Finally, Japanese participants who were high in mindfulness—specifically, those who described themselves as aware and “present” in the moment—did not show the amplification effect, but participants who were low in mindfulness did (Sato & Sugiura, 2014).
Other researchers have also used more behavioral measures of moral judgment to investigate the amplification effect. In one study, participants who had viewed disgusting video clips were more likely to reject unfair offers in an ultimatum game—a form of costly punishment (Harlé & Sanfey, 2010). Though this is a behavioral response, it presumably depends on moral judgment to some degree and is thus pertinent to our investigation. Similarly, a study in which disgusting images were used as an induction showed that participants who were disgusted were more likely to reject unfair offers in the ultimatum game, but only when participants believed that they were playing with human partners; when they believed that they were playing with computer programs, disgust had no effect (Moretti & Di Pellegrino, 2010).
Countervailing evidence to amplification
Thus, there is considerable evidence from the published literature supporting the amplification hypothesis. Nonetheless, there have also been several published failures to replicate these results. One such failure came from a pilot study mentioned by Schnall, Haidt, et al. (2008), in which participants immersed their arms in a gooey, disgusting substance prior to making moral judgments. No effect of this disgust induction on moral judgment was observed. A failure to replicate the amplification effect when a flatulence spray was used as a disgust induction has also been reported (Ugazio et al., 2012). Moreover, in a direct replication of Schnall, Benton, and Harvey’s (2008) hand-washing study, Johnson, Cheung, and Donnellan (2014) failed to find any difference in judgments of moral severity between the hand-washing condition and the no-washing condition, despite having .99 power to detect such an effect. In a separate investigation comprising two independent studies, participants saw disgusting images, neutral images, or negative but nondisgusting images prior to a moral judgment task. The judgments of participants in the disgust condition did not differ from those of participants in the other two conditions (Case, Oaten, & Stevenson, 2012). In another study, David and Olatunji (2011) used an evaluative conditioning procedure to condition participants to feel disgust in response to an innocuous word (part). Participants then read several vignettes describing moral transgressions, half of which contained the conditioned word, and half of which did not. Participants’ ratings of the wrongness of the described transgressions did not differ as a function of the presence or absence of the conditioned word. 
Finally, when participants were subliminally primed with disgust facial expressions, they rated utilitarian harms—that is, directly killing one person to save several others—as less morally wrong (contrary to the findings of Schnall, Haidt, et al., 2008), though this was moderated by disgust sensitivity: Highly sensitive participants rated the harms as less wrong, whereas less sensitive participants rated them as more wrong (Ong, Mullette-Gillman, Kwok, & Lim, 2014).
In virtually all of these cases, the authors reported successful manipulation checks—that is, they did successfully induce feelings of disgust, and yet the experienced disgust did not reliably influence moral judgment. There were only two exceptions. First, Ong et al. (2014) found that disgust increased during the experiment in both their disgust-prime and neutral-prime conditions. Second, no manipulation check was reported in Schnall, Haidt, et al.’s (2008) pilot study. However, the disgust induction used in that study (plunging one’s hand into a gooey substance) was, prima facie, very potent and disgusting, so it is reasonable to assume that it induced disgust successfully. Indeed, the authors argued that this manipulation may have been too effective—that it was so salient a source of disgust that participants attributed all of their experienced disgust to the induction task and thus discounted it when making their moral judgments. In all other cases, the experimental manipulations successfully induced disgust.
In addition to these failures to replicate, other recent empirical results also call into question the role of disgust in moral judgment. When disgust is pitted against another moral emotion, such as sympathy, in a moral dilemma, the neo-sentimentalist account would seemingly predict that individuals’ moral judgments would be driven by whichever emotion is felt most strongly. In a study in which this prediction was tested, participants read dilemmatic vignettes in which one course of action required the protagonist to engage in a disgusting act (incest) to prevent a serious harm, whereas the other course of action was to refrain from the disgusting act at the cost of allowing serious harm (Royzman et al., 2011). Contrary to the neo-sentimentalist prediction, participants’ judgments of the morally right course of action were best predicted by their beliefs about which course of action would cause the least harm, broadly construed, rather than by their relative levels of subjective disgust and sympathy. A similar result showed that judgments about the immorality of a putatively harmless violation of social etiquette (spitting into a napkin at a dinner party) were better predicted by whether participants felt that someone was negatively affected by the action than by how disgusting they found it (Royzman et al., 2009).
In sum, these results suggest that the amplification hypothesis might not be supported as robustly as it might appear at first glance. A full examination of the available evidence should therefore yield a clearer picture of the overall support for the amplification hypothesis. There have only been a handful of studies in which the moralization hypothesis has been examined, but we also examine the current evidence for this in the meta-analysis.
Overview of the Meta-Analysis
Accordingly, we meta-analyzed experiments in which incidental disgust was induced in participants prior to or concurrent with their making moral judgments. Our aim was to estimate the magnitude of the amplifying effect of incidental disgust on moral judgment through a survey of all relevant literature, both published and unpublished, thus informing the broader theoretical debate in moral psychology between the neo-sentimentalist and rationalist positions. If strong evidence for the amplification hypothesis were found, this would clearly support the neo-sentimentalist position; however, if only weak or no evidence were found, this would suggest that a rethinking of the neo-sentimentalist approach is in order.
The meta-analysis is divided into four related sections. We first derive a point estimate for the size of the effect of incidental disgust on moral judgment from both published and unpublished studies (the amplification hypothesis). We then evaluate the published literature alone to see whether there is any indication of publication bias. Next, we examine the role of two theoretically important moderator variables: type of disgust induction and type of violation. Finally, we compute a point estimate for the size of the effect of incidental disgust on judgments of nonmoral actions as a test of the theoretically stronger moralization hypothesis.
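To make the first step concrete, a pooled point estimate is commonly obtained as an inverse-variance weighted average of the study-level effect sizes. The following is a minimal sketch under stated assumptions, not the analysis code used here: it assumes a DerSimonian-Laird random-effects model (the overview above does not name an estimator), and the function name `pool_effects` and the four input studies are hypothetical.

```python
import math

def pool_effects(effects, variances):
    """Pool study effect sizes (e.g., Cohen's d) by inverse-variance weighting.

    Returns the fixed-effect estimate, the DerSimonian-Laird estimate of
    between-study variance (tau2), and the random-effects estimate with
    its standard error and 95% confidence interval.
    """
    # Fixed-effect weights: the reciprocal of each study's sampling variance.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * di for wi, di in zip(w, effects)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (di - fixed) ** 2 for wi, di in zip(w, effects))
    df = len(effects) - 1

    # DerSimonian-Laird moment estimator of tau2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * di for wi, di in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return {
        "fixed": fixed,
        "tau2": tau2,
        "random": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
    }

# Hypothetical effect sizes and sampling variances for four studies.
example = pool_effects([0.40, 0.10, -0.05, 0.20], [0.04, 0.02, 0.03, 0.05])
print(example)
```

The same function could be applied separately to subsets of studies (e.g., by induction type or violation type) as a crude stand-in for the moderator analyses, although a formal moderator test would require a meta-regression or subgroup heterogeneity test.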
Literature Search
We began our search for relevant studies by consulting several online databases: Web of Knowledge, PsycINFO, PsycARTICLES, and the Social Science Research Network. In each of these four databases, we conducted a full-text search using the intentionally broad search terms disgust moral judgment. We then narrowed down the search results on the basis of the following criteria:
In the study, researchers had to manipulate the presence of disgust experimentally such that the resulting disgust was ostensibly unrelated to the moral judgments—that is, the disgust had to be incidental.
A study’s dependent variable(s) had to be some form of morally relevant judgment or action, such as ratings of immorality, assessments of character, recommendations for punishment, or actual punishment in economic games.
It was usually possible to determine whether an article contained at least one study that met these criteria by examining the abstract. When there was any ambiguity or doubt about an article, we examined the method sections to determine whether any studies met the criteria for inclusion. We also requested any unpublished data from the corresponding authors of all of the published studies and posted a call for unpublished data to several online forums (for details, see the Supplemental Material available online). This complete literature search produced a final set of 33 articles, containing 51 relevant studies (31 published, 20 unpublished) with a total N of 5,102 participants. A full list of all studies included in the meta-analysis—along with their publication statuses, moderator codings, sample sizes, and effect sizes—can be found in Table 1, and a forest plot presenting each study’s effect size and 95% confidence interval (CI) can be found in Figure 2.
Table 1. Studies Included in the Meta-Analysis, Moderators, Sample Sizes, and Effect Sizes
Note: For ease of interpretation, effect sizes are reported as uncorrected ds. Sample sizes and effect sizes are reported for each study overall (these effect sizes were used in the first and second sections of the meta-analysis) and are reported separately for nonpurity violations and purity violations (used in the third section of the meta-analysis) as well as nonmoral judgments (used in the fourth section of the meta-analysis). Dashes indicate that a study did not include a judgment of that type.
a The disgust induction in this study was a filthy work area. It was mostly visual but included olfactory and tactile elements as well.
b Data from one participant were missing.
c Four of the “thoughts” scenarios contained both moral and nonmoral elements (e.g., bad intent with no bad consequences, or vice versa); because these could not easily be classified as moral or nonmoral, they were not included in our analyses.
d In this study, participants self-generated transgressions that they had witnessed and then made character judgments of the perpetrators; because the transgressions were not controlled and likely varied greatly, this study was not used in the moderator analyses.

Figure 2. Forest plot showing effect sizes and confidence intervals for all published and unpublished studies of the amplification effect.
Because readers do not have easy access to the unpublished studies included in the meta-analysis, Table S1 in the Supplemental Material presents brief descriptions of these studies and indicates whether the researchers used pretests, manipulation checks, or both to ensure that their disgust manipulation did, in fact, induce disgust in participants. In the vast majority of published and unpublished studies, researchers reported successful pretests or manipulation checks, so failures to replicate the amplification effect in these studies cannot be attributed to a failure to induce disgust. Only a few researchers reported unsuccessful manipulation checks (e.g., Schnall, Haidt, et al., 2008, Study 2) or no such checks at all (e.g., Zhong, Strejcek, & Sivanathan, 2010).
Obtaining Effect Sizes
Because in this meta-analysis we were concerned with experimental studies, we converted all results in the meta-analysis to standardized mean difference scores (Lipsey & Wilson, 2001), better known as Cohen’s d. In most cases, d was calculated from reported t tests, F tests with one degree of freedom, sample Ns, means, and standard deviations or standard errors. When the necessary statistical information was not reported in an article or manuscript, we contacted the corresponding author and requested the information or the raw data necessary to calculate d. All effect sizes were scored such that positive numbers indicate support for the amplification hypothesis. That is, positive ds indicate higher levels of moral disapproval in the disgust condition(s) of a study than in the control condition(s), and negative ds indicate the opposite pattern. We selected the between-subjects d as our effect size primarily because the overwhelming majority of studies in the meta-analysis contained between-subjects designs, so it simplified the analysis to calculate the between-subjects d. For within-subject designs, we converted the repeated-measures effect size to a between-subjects d (Morris & DeShon, 2002).
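The conversions described above can be sketched in a few lines. This is a minimal illustration using the standard textbook formulas for two independent groups, not the authors' own computation; the function names, the `disgust_higher` flag, and the example numbers are hypothetical.

```python
import math


def d_from_t(t, n1, n2):
    """Cohen's d from an independent-samples t statistic and group sizes."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_f(f, n1, n2, disgust_higher=True):
    """Cohen's d from an F test with one numerator degree of freedom.

    F = t**2 discards the sign, so the direction of the difference must be
    supplied separately (hypothetical flag: disgust_higher).
    """
    t = math.sqrt(f)
    return d_from_t(t if disgust_higher else -t, n1, n2)


def d_from_means(m_disgust, sd_disgust, n_disgust, m_control, sd_control, n_control):
    """Cohen's d from group means and standard deviations (pooled SD)."""
    pooled_var = (
        (n_disgust - 1) * sd_disgust ** 2 + (n_control - 1) * sd_control ** 2
    ) / (n_disgust + n_control - 2)
    return (m_disgust - m_control) / math.sqrt(pooled_var)


# Hypothetical study: t(98) = 2.0 with 50 participants per condition.
print(round(d_from_t(2.0, 50, 50), 2))  # 0.4
```

Positive values follow the scoring convention in the text: harsher judgments in the disgust condition than in the control condition.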
We conducted all of the following analyses using both fixed- and random-effects models. We focus on the results of the random-effects models, as these models allow for variation in effect sizes among studies deriving from sources other than sampling variance. Given the wide range of methods used in the reviewed studies, the expected moderating effects of the variables mentioned earlier, and the observed heterogeneity of effect sizes (described later), it seemed more reasonable to assume a random-effects model. We conducted all reported analyses using the Comprehensive Meta-Analysis software package.
Point Estimate for the Size of the Amplification Effect
Our first goal was to obtain a point estimate for the size of the amplification effect. To do so, we first computed one effect size for each study in which this hypothesis was tested (k = 50). When studies included multiple types of moral judgment (e.g., Horberg et al., 2009), and the raw data were not available, we calculated a single effect size collapsing across the different moral judgment types using reported F tests with one degree of freedom. When a study included more than one control condition—for instance, a sadness condition and a no-emotion condition (e.g., Schnall, Haidt, et al., 2008, Study 4)—we computed a single effect size comparing the disgust condition with these two control conditions (Lipsey & Wilson, 2001). However, when a study included an additional condition of theoretical interest—such as a condition inducing another moral emotion, such as anger, that was predicted to produce different results than disgust—this condition was not compared with the disgust condition (and was not collapsed with the control condition). It was generally clear whether a condition was meant to be a control condition or to serve some other theoretical purpose. Similarly, when a study contained multiple levels or variants of its disgust induction (e.g., Schnall, Haidt, et al., 2008, Study 1), we computed a single effect size comparing these conditions with the control condition (Lipsey & Wilson, 2001). We did not include judgments of nonmoral actions when computing each study’s overall effect size. Before computing a mean effect size estimate, we applied the small-sample correction to all ds to obtain an unbiased effect size estimator (Hedges, 1981; Lipsey & Wilson, 2001).
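The small-sample correction mentioned above is commonly written as a multiplicative factor applied to d. The sketch below uses the usual approximation, 1 − 3/(4N − 9), together with the standard large-sample variance of d; the function names and example values are illustrative assumptions rather than anything taken from the reviewed studies.

```python
def hedges_correction(d, n1, n2):
    """Apply the small-sample correction to d (approximate form)."""
    n_total = n1 + n2
    return d * (1.0 - 3.0 / (4.0 * n_total - 9.0))


def sampling_variance(d, n1, n2):
    """Approximate sampling variance of a standardized mean difference."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))


# Hypothetical study with 20 participants per condition and d = 0.50.
g = hedges_correction(0.50, 20, 20)
v = sampling_variance(g, 20, 20)
print(round(g, 3), round(v, 4))
```

The correction shrinks d slightly, and the shrinkage matters most for the smallest studies.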
We calculated a point estimate for the size of the amplification effect as well as the 95% CI for this point estimate. Specifically, we calculated a weighted mean of the effect sizes in which each effect size was weighted by the inverse of the sampling variance of the study from which it was derived (Lipsey & Wilson, 2001). This point estimate for the effect size was statistically significant (d = 0.11, p = .002, 95% CI [0.04, 0.19]) but suggests a small effect (Cohen, 1992). There is thus evidence for the amplification hypothesis, but the effect appears to be fairly insubstantial. There was also significant heterogeneity among the calculated effect sizes, Q(49) = 143.66, p < .001, so we assessed the moderating influence of two preselected variables (described later).
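The inverse-variance weighting and the heterogeneity statistic Q can be illustrated as follows. This is a sketch under stated assumptions: the source reports only that Comprehensive Meta-Analysis was used, so the DerSimonian-Laird estimator of between-study variance shown here is an assumed (though common) choice, and the input values are made up.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance weighted mean, its standard error, and Cochran's Q."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total_w
    q = sum(w * (e - mean) ** 2 for w, e in zip(weights, effects))
    return mean, math.sqrt(1.0 / total_w), q


def random_effects(effects, variances):
    """Random-effects mean with a 95% CI (DerSimonian-Laird tau^2, assumed)."""
    _, _, q = fixed_effect(effects, variances)
    weights = [1.0 / v for v in variances]
    k = len(effects)
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Each study's weight now includes the between-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    mean = sum(w * e for w, e in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return mean, (mean - 1.96 * se, mean + 1.96 * se), tau2, q


# Hypothetical effect sizes and sampling variances, not values from Table 1.
mean, ci, tau2, q = random_effects([0.0, 0.6], [0.01, 0.01])
print(round(mean, 2), round(tau2, 2), round(q, 1))
```

Q is compared against a chi-square distribution with k − 1 degrees of freedom, which is why the text reports Q(49) for k = 50 studies.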
Assessing and Accounting for Publication Bias
Our second goal was to assess the likelihood that there is publication bias in this literature. As an initial test, we recomputed the effect size estimate described earlier using only studies in the published literature, and this estimate was somewhat larger (d = 0.17, p = .001, 95% CI [0.07, 0.27]). This indicates that the published studies tend to show larger effect sizes than the unpublished studies. This is perhaps not surprising. What is striking, however, is how small the effect sizes found by the unpublished studies tend to be. The weighted mean effect size calculated from the obtained unpublished results is effectively zero (d = 0.03, p = .59, 95% CI [−0.09, 0.16]). The weighted mean effect sizes of published and unpublished studies were marginally significantly different by a random-effects, inverse-variance weighted, one-way analysis of variance (ANOVA; Lipsey & Wilson, 2001), the meta-analytic analog of the typical one-way ANOVA, Q(1, 48) = 2.89, p = .09. In short, the published literature suggests a reliable, though small, effect, whereas the unpublished literature suggests no effect.
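As a rough check on this comparison, the between-groups Q can be approximated from the subgroup means and confidence intervals reported above. The sketch assumes that the intervals are symmetric normal-theory 95% CIs, so each standard error is the CI width divided by 2 × 1.96; because the inputs are rounded, the result only approximates the reported Q(1, 48) = 2.89 and is not the authors' computation.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Back out a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * z)


def q_between(means, ses):
    """Between-groups Q: weighted squared deviations of subgroup means."""
    weights = [1.0 / s ** 2 for s in ses]
    grand = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    q = sum(w * (m - grand) ** 2 for w, m in zip(weights, means))
    return q, len(means) - 1


def p_chi2_1df(q):
    """Upper-tail p value for a chi-square statistic with 1 df only."""
    return math.erfc(math.sqrt(q / 2.0))


# Reported subgroup estimates: published d = 0.17 [0.07, 0.27],
# unpublished d = 0.03 [-0.09, 0.16].
ses = [se_from_ci(0.07, 0.27), se_from_ci(-0.09, 0.16)]
q, df = q_between([0.17, 0.03], ses)
print(round(q, 1), df, round(p_chi2_1df(q), 2))
```

With these rounded inputs, Q comes out near 2.9 with p near .09, in line with the marginal difference described in the text.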
In Figure 3, we present a funnel plot of the published effect sizes. In the absence of publication bias, the number of studies with high sampling error—that is, studies toward the bottom of the plot—should be symmetrically distributed around the weighted mean effect size. On visual inspection, there appears to be a preponderance of low-powered studies on the right side of the graph. Consistent with this, Egger’s regression coefficient is significant, Intercept = 1.15, t(29) = 3.20, p = .003, indicating that studies with high sampling variance are associated with larger standardized effect sizes (effect size divided by sample standard error) than would be expected, suggesting bias in the data set (Egger, Smith, Schneider, & Minder, 1997).
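Egger's test amounts to an ordinary least-squares regression of each study's standardized effect (effect size divided by its standard error) on its precision (the reciprocal of the standard error); the intercept is the quantity of interest. The following is a minimal sketch with hypothetical inputs, not a reproduction of the analysis reported here.

```python
import math


def egger_test(effects, ses):
    """Return (intercept, t statistic, df) for Egger's regression test."""
    y = [e / s for e, s in zip(effects, ses)]  # standardized effects
    x = [1.0 / s for s in ses]                 # precisions
    k = len(y)
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with k - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(rss / (k - 2) * (1.0 / k + mean_x ** 2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else math.inf
    return intercept, t, k - 2


# Hypothetical data in which smaller studies (larger SEs) show larger effects.
effects = [0.31, 0.38, 0.475, 0.65]
ses = [0.10, 0.20, 0.25, 0.50]
intercept, t, df = egger_test(effects, ses)
print(round(intercept, 2), df)
```

An intercept near zero is what a symmetric funnel would produce; a positive intercept indicates that imprecise studies report disproportionately large effects. The degrees of freedom are k − 2, consistent with the t(29) reported for the 31 published studies.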

Figure 3. Funnel plot of published incidental disgust induction effect sizes.
Given that there is suggestive evidence that the published literature is unbalanced, the next question is, what becomes of the amplification effect when this is accounted for statistically? Using Duval and Tweedie’s (2000) trim-and-fill procedure, we imputed 10 missing studies to produce the expected symmetric funnel plot, and we estimated a much smaller, and nonsignificantly negative, adjusted weighted mean effect size (d = −0.01, 95% CI [−0.12, 0.10]). 5 Thus, the extent to which incidental disgust amplifies moral judgments appears to be overestimated in the published literature, and when this is corrected for, no significant effect is present.
Moderator Analyses
Our third goal was to assess the effects of two theoretically relevant moderators. The first moderator is the sensory modality by which disgust was induced in participants. The oral rejection theory’s conception of disgust as having originated as a gustatory response suggests that oral/nasal disgust inductions might be more potent sources of disgust than other sorts of induction, which would, in turn, exert stronger effects on moral judgments. The pathogen avoidance theory of disgust leads to a similar, though not identical, prediction, as it suggests that “certain perceptual modalities (notably smell and touch) may be especially associated with disgust, perhaps because of more privileged access to the insular cortex and anteriomedial temporal lobe structures” (Oaten et al., 2009, p. 313). Accordingly, we examined sensory modality as a potential moderator. We group olfactory inductions (i.e., ambient odors) and gustatory inductions (i.e., disgusting tastes) together because there is not a sufficient number of studies in which such inductions are used to treat them as individual categories, and because the senses of smell and taste are closely related.
The second moderator that we examine is whether the violation being judged falls into the purity (or divinity) domain. Rozin, Lowery, Imada, and Haidt (1999) found that people tended to associate only purity violations (and not other sorts of moral violations) with disgust reactions (though see Royzman, Atanasov, Landy, Parks, & Gepty, 2014, for a counterpoint). This finding provides some reason to think that moral judgments of purity violations might be especially influenced by experimental manipulations of disgust. In such cases, experienced disgust may be seen as particularly informative with respect to the gravity of the moral transgression (in line with the affect-as-information approach; Schwarz & Clore, 1983). Indeed, consistent with this speculation, researchers have recently reported evidence that purity violations are particularly susceptible to manipulations of disgust (as described earlier; e.g., Horberg et al., 2009; Seidel & Prinz, 2013), although there is also some countervailing evidence (Schnall, Haidt, et al., 2008). There are also some theoretical reasons to expect a particularly strong amplification effect on purity violations. For instance, if purity/divinity violations produce more intrinsic (or integral) disgust than do other sorts of moral violation (see Haidt & Graham, 2007), participants may be more likely to misattribute the disgust produced by an incidental manipulation to the transgression itself, thus amplifying their judgments of its wrongness (see also Cameron et al., 2013). We therefore predicted that purity/divinity violations would show greater amplification effects than would nonpurity violations.
We did not examine any individual difference measures that have been proposed as moderators of the amplification effect because such measures have been included in only a handful of studies (e.g., private body consciousness: Johnson et al., 2014; Schnall, Haidt, et al., 2008; attentional control: Van Dillen et al., 2012; emotional differentiation: Cameron et al., 2013; mindfulness: Sato & Sugiura, 2014), thereby rendering an analysis of them infeasible.
These moderator analyses required the creation of a new database of effect sizes in which separate effect sizes within each study were computed according to the type of moral transgression that participants judged (rather than computing a single effect size for each study, as in the analyses described earlier) and in which the effects were coded according to the nature of the sensory modality used to induce disgust, which varied across rather than within studies: visual, gustatory/olfactory, imagined experience, or other (see Table 1). Regarding transgression type, each judged violation was coded as either purity-related or non-purity-related. Calculating these effect sizes typically required obtaining additional statistical information or raw data from an article’s corresponding author, as many researchers did not separately analyze purity-related and non-purity-related transgressions. For some studies, these data were unavailable, meaning that such studies could not be included in the moderator analyses (see Table 1). The nature of a transgression was usually obvious, though there were certain ones that were more difficult to categorize. The handful of transgressions that proved impossible to categorize (e.g., driving to work instead of walking [Schnall, Haidt, et al., 2008], which might be seen as either nonmoral or as an offense against one’s community, by virtue of polluting it, or even as a purity violation in that it destroys the purity of nature) were not included in the database. 6 Overall, this enlarged database contained a total of 73 effect sizes (43 published, 30 unpublished). As in the analyses described earlier, the small-sample correction was applied to all of these effect sizes prior to analysis (Hedges, 1981; Lipsey & Wilson, 2001). The two moderators themselves were unrelated to one another, χ²(3) = 0.90, p = .83, which provides reassurance that any observed effects are not attributable to a lack of independence among them.
Because the two proposed moderators are categorical variables, we conducted analyses of both using random-effects, inverse-variance weighted, one-way ANOVAs (Lipsey & Wilson, 2001). The number of studies falling into each moderator category as well as the effect size estimates with 95% CIs are presented in Table 2.
Table 2. Number of Studies in Each Moderator Category (k), Estimated Effect Sizes for Each Category (d), and 95% Confidence Intervals (CIs) for Effect Size Estimates
There was a significant effect of the sensory modality of a study’s disgust induction on the resulting effect size, Q(3, 69) = 8.18, p = .04. As predicted by both the oral rejection and disease avoidance theories, gustatory and olfactory disgust inductions produced much larger amplification effects than did visual disgust inductions, which produced only small, yet still significant effects. Moreover, inductions that involved imagined experiences, reading about disgusting stimuli, or recalling disgusting events—which arguably constitute a single conceptual category of “imagined” or “mental” inductions—produced essentially no amplification effect. The heterogeneous “other” class of disgust inductions—consisting of hypnotic suggestions, evaluative conditioning, as well as auditory and tactile inductions—produced a very small weighted mean effect size, comparable with the effect size for imagined inductions.
The domain of the judged transgression (purity vs. nonpurity) did not explain a significant proportion of the observed heterogeneity in effect sizes, Q(1, 71) = 0.05, p = .82. Contrary to the predictions described earlier, and also contrary to a handful of published results (e.g., Horberg et al., 2009; Seidel & Prinz, 2013), the mean effect size for nonpurity violations was, if anything, slightly larger than the mean effect size for purity violations (see Table 2), suggesting that the amplification effect is not restricted to moral transgressions involving bodily purity, sexual purity, or crimes against nature.
Point Estimate for the Size of the Moralization Effect
Our fourth and final goal was to assess the existing evidence for the moralization hypothesis—the idea that experienced disgust is itself sufficient to produce condemnation of typically amoral actions. To do so, we constructed a further database of effect sizes for judgments of nonmoral actions (k = 13). We observed a significant moralization effect (d = 0.21, p = .01, 95% CI [0.05, 0.37]). This is again a rather small effect, and this result should be interpreted with caution because it is based on a small number of studies. Moreover, once again, the published studies (k = 6) suggest a larger effect size (d = 0.33, p = .02, 95% CI [0.06, 0.61]) than the unpublished studies (k = 7; d = 0.14, p = .18, 95% CI [−0.06, 0.34]). This difference is not significant, Q(1, 11) = 1.32, p = .25, probably owing to the small number of data points. Still, the unpublished studies show no significant effect, whereas the published studies show a small but reliable effect. There is, therefore, some preliminary evidence in support of the moralization hypothesis, but it is far from conclusive. Future research is needed to investigate this further.
Discussion
What role, if any, does affect play in moral judgment? Researchers working within a neo-sentimentalist framework have recently argued that affect is a major driver, perhaps the major driver, of moral judgment. A key piece of evidence for this claim is that moral condemnation is made harsher (that is, amplified) by negative affect that is unrelated to the moral issues at hand. Because of its prenormative character, disgust is an ideal emotion for testing this amplification hypothesis, and it has been the emotion most studied by moral psychologists. Indeed, numerous researchers have found evidence for the amplification hypothesis, but several other researchers have failed to find evidence for it. Our goal was to clarify the current state of the evidence for this effect.
To evaluate the extant evidence for the amplification hypothesis, we meta-analyzed all available studies (k = 50)—published (k = 31) and unpublished (k = 19)—in which the researchers experimentally manipulated incidental disgust prior to or concurrent with a moral judgment task. In this meta-analysis, we found a small but reliable amplifying effect of disgust (d = 0.11). However, we also found evidence suggesting publication bias in this literature. Indeed, even with the several published failures to replicate included, published studies produced a larger effect size than the overall estimated effect size (d = 0.17), whereas unpublished studies showed essentially no effect (d = 0.03); furthermore, when missing studies were accounted for mathematically, no effect was observed (d = −0.01). We also found preliminary evidence that the amplification effect is stronger for gustatory or olfactory disgust inductions, is weaker for visual disgust inductions, and is essentially nonexistent for imagined or mental inductions. Finally, we found evidence for a small but reliable effect of disgust on the severity of judgments of nonmoral actions (d = 0.21).
Relation to the affect and morality literature
The existence of a reliable amplification effect may be problematic for hardline rationalist theorists who would deny any role for affect in moral judgment. However, it is not clear that such theorists exist. Even Kohlberg (1971) afforded some role for affect in his model of moral judgment, channeled through cognitive processes (see pp. 230–231). At the same time, the relatively small size of the amplification effect seems to conflict with some of the stronger claims made by neo-sentimentalist researchers. For instance, it has sometimes been claimed that affective intuition is the primary determinant of moral judgment and that “for most people, most of the time, most of the action is in the quick, automatic, affective evaluations they make of people and events” (Schnall, Haidt, et al., 2008, p. 1097). Yet the estimated effect size of d = 0.11 is equivalent to a correlation coefficient of r = .06 (Rosenthal & DiMatteo, 2001) and explains only 0.3% of the variance in participants’ moral judgments, and the amplification effect disappears entirely when publication bias is accounted for. These data therefore undermine what is arguably the best evidence for a causal role of affect in moral judgment.
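The conversion from d to r can be sketched with the common equal-group-size formula, r = d / sqrt(d² + 4). Two caveats apply: the equal-n assumption is ours, and plugging in the rounded d = 0.11 gives r of about .055, slightly below the reported .06, presumably because the reported figure was computed from an unrounded d. The proportion of variance explained nonetheless comes out at about 0.3%.

```python
import math


def d_to_r(d):
    """Convert a standardized mean difference to r, assuming equal group sizes."""
    return d / math.sqrt(d ** 2 + 4.0)


r = d_to_r(0.11)
variance_explained = r ** 2
print(round(r, 3), f"{variance_explained:.1%}")
```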
A meta-analysis of this sort is not capable of decisively resolving the debate between neo-sentimentalists and rationalists. It does not (and could not) rule out the possibility that affect plays an important role in moral judgment. For instance, it could be that unmeasured affective reactions, separate from the manipulated disgust, are doing most of the work in driving participants’ moral judgments (e.g., anger; see P. S. Russell & Giner-Sorolla, 2011). Moreover, because we only examined evidence for the neo-sentimentalist position, we cannot directly argue for the rationalist position or any other alternative theory on the basis of our results. In particular, the present analysis also does not allow us to compare our estimate of the variance explained by disgust with similar estimates for other processes that might play a role in moral judgment, including cognitive processes such as assessments of harm (see, e.g., Royzman et al., 2011, 2009). We therefore do not know how comparatively large or small the amount of variance explained by manipulations of disgust is.
We did uncover evidence that gustatory and olfactory disgust inductions exert a reliable, small- to medium-sized effect on moral judgments. The reasons for this are not entirely clear. It could be that these are more potent or more direct ways of eliciting disgust and, therefore, are more effective in influencing moral judgment, as both the oral rejection and disease avoidance models seem to predict. Alternatively, these inductions might be less intrusive and less salient and, therefore, less likely to be discounted by participants. This latter explanation seems unlikely, however. The largest effect size in which a gustatory/olfactory induction was used (and the third largest effect size in the entire data set) came from a study in which disgust was induced by having participants consume a bitter beverage prior to making their moral judgments (Eskine et al., 2011). The source of participants’ disgust in this study is likely to have been quite salient to them (“I just drank that horrible, bitter liquid”), much more so than if their disgust had been produced by, for example, an ambient odor. This suggests that the larger effect of gustatory/olfactory disgust inductions on moral judgment is unlikely to result from these inductions being less salient than visual and mental inductions.
One potential limitation of our meta-analysis concerns the size of the reviewed literature. Although our reviewed literature of k = 50 studies might be considered small, we see two reasons why the present meta-analysis is important, notwithstanding this potential issue. First, studies in which researchers purport to show an amplification effect of disgust on moral judgment are widely cited and have informed a great deal of theorizing in moral psychology and even philosophy (e.g., Prinz, 2006). Second, widespread confounds throughout this literature suggest that more rigorous research is needed on the topic of how disgust affects moral judgment, and it seems preferable to examine the existing evidence critically now rather than wait for more studies with similar confounds to be conducted. We turn now to the issue of these confounds.
The confounding of disgust and disapproval
There is a general methodological problem with most studies in this literature that warrants further attention. Some researchers have noted that disgust inductions that involve observing (or imagining) other people engaging in disgusting behaviors might elicit moral disapproval directed at these people (Case et al., 2012; Royzman, 2014). Other researchers have suggested that certain disgust inductions could evoke moral disapproval directed at experimenters or others in the environment (Baron, Royzman, & Goodwin, 2013; Royzman, 2014)—a criticism that we think applies quite broadly.
To illustrate the first problem, researchers commonly use a scene from the film Trainspotting as a disgust induction; in this scene, a drug addict digs through a filthy public toilet to retrieve his drugs (Horberg et al., 2009, Study 2; Johnson et al., 2014, Study 2; Schnall, Benton, & Harvey, 2008, Study 2; Schnall, Haidt, et al., 2008, Study 4). This scene clearly elicits core disgust through its prominent and graphic depiction of human bodily waste, but it may also elicit moral disapproval directed at the drug addict himself. This moral disapproval arguably confounds the manipulation of disgust and makes it difficult to establish whether it is the disgust, per se, or the moral disapproval that is doing the work in amplifying the severity of subsequent moral judgments. By comparison, an example of an induction that does not include a person acting, and that is therefore free of this sort of moral “contamination,” is the presentation of images from the International Affective Picture System depicting rotting animal corpses, insects in contact with food, and similar nauseating content (Case et al., 2012).
It may also be that by intentionally exposing their participants to disgusting, inappropriate stimuli, the experimenters in these studies are also seen as breaching social norms, thereby inadvertently making themselves the target of participants’ moral disapproval. This activated moral disapproval may then prime greater disapprobation in the subsequent moral judgment task. A clear instance of this problem is the “dirty desk” manipulation (Schnall, Haidt, et al., 2008, Study 2). Participants were seated at a greasy and disgusting work space while filling out their questionnaires. They may well have felt angry and disapproving of the fact that the experimenters had been derelict in their duty to provide a clean environment for them (Baron et al., 2013).
This general problem is, we think, much more pervasive than has been previously recognized. Experimenters in disgust induction studies almost invariably expose participants to stimuli that are clearly going to evoke a negative emotional response and that may induce the unpleasant physiological experience of nausea. Moreover, they do so intentionally and without prior warning. Arguably, therefore, the experimenters are knowingly doing some small harm to their participants, who may feel that they have the right not to be exposed to such things or to at least be given full warning that they will be so exposed. Thus, this potential problem applies in a different and more subtle way to almost all of the disgust inductions that have been used in past research, including those involving disgusting videos and images.
To address it, future researchers need to devise new experimental procedures that reliably induce core disgust in ways that do not breach social norms of appropriateness—a task that will require considerable ingenuity. Furthermore, we think that a more thorough approach is needed during the debriefing stages of future experiments to examine the extent to which participants feel irritation or anger in response to the behaviors depicted in or implied by the primes or toward the experimenter directly. In particular, it would be important to examine the extent to which both the primes and the experimenter’s behavior elicit moral disapproval or produce affront of some kind.
Thus, we consider our estimated effect size of d = 0.11 to be a plausible upper bound on the size of the amplification effect because confounds may have inflated the effect sizes observed in many of the reviewed studies. However, even if researchers wished to examine the amplification hypothesis in a way that is free of such confounds, the present meta-analysis suggests that doing so may be infeasible because a study of this sort would require a very large sample to be sufficiently well powered. 7 The small size of the effect also suggests that studies in which the researchers purport to show the amplification effect with small sample sizes should be approached with skepticism, as they are likely underpowered.
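To give a sense of scale, the sample needed to detect an effect of this size can be approximated with the usual normal-approximation formula for a two-group comparison, n per group = 2(z₁₋α/₂ + z_power)² / d². The sketch below assumes a two-tailed α of .05 and 80% power; these settings are our assumptions and need not match the calculation in the authors' footnote.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants per condition for a two-group design."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # two-tailed critical value
    z_power = z.inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_power) ** 2 / d ** 2)


print(n_per_group(0.11))  # 1298 per condition under these assumptions
```

Under these assumptions, roughly 1,300 participants per condition (about 2,600 in total) would be needed, far more than is typical of the individual studies discussed here.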
The status of the moralization hypothesis
Although we have focused primarily on the amplification hypothesis in this meta-analysis, some of the relevant evidence also relates to the stronger moralization hypothesis. The moralization hypothesis asserts that actions that are morally irrelevant can be judged to be morally wrong in the presence of disgust. Although only a handful of studies have included judgments of such nonmoral actions, we found some preliminary evidence for a small moralization effect of disgust on judgments of otherwise neutral actions. It is therefore possible that incidental disgust, by itself, can lead to a small degree of moralization of otherwise innocuous actions. However, this result is based on a small number of studies (k = 13), and the methodological criticisms described earlier obviously apply to these studies as well. More research on this topic is clearly needed. A particularly important question for researchers to address is when disgust is sufficient for moralization, as there was significant heterogeneity among these effect sizes, Q(12) = 22.52, p = .03.
Conclusion
In the present meta-analysis, we have established a plausible upper bound on the extent to which incidental disgust amplifies moral judgment (and the extent to which it produces moralization of nonmoral actions), though confounds endemic to this literature leave the lower bound yet to be determined. We have also found evidence suggesting publication bias in this literature. The overrepresentation in the published literature of studies finding amplification effects might partially explain many researchers’ rapid acceptance of these findings and their enthusiasm for the neo-sentimentalist approach. On the basis of our results, we cannot definitively adjudicate between the rationalist and neo-sentimentalist positions, and doing so was not our aim. Rather, we hope that this review and meta-analysis can further the understanding of the role of disgust in moral judgment and can help to inspire the next wave of rigorous research on how affect and morality intertwine.
Footnotes
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
