Abstract
Extraneous neuroscience information improves ratings of scientific explanations, and affects mock juror decisions in many studies, but others have yielded little to no effect. To establish the magnitude of this effect, we conducted a random-effects meta-analysis using 60 experiments from 28 publications. We found a mild but highly significant effect, with substantial heterogeneity. Planned subgroup analyses revealed that within-subjects studies, where people can compare the same material with and without neuroscience, and those using text, have stronger effects than between-subjects designs, and studies using brain image stimuli. We serendipitously found that effect sizes were stronger on outcomes of evaluating satisfaction or metacomprehension, compared with jury verdicts or assessments of convincingness. In conclusion, there is more than one type of neuroscience explanations effect. Irrelevant neuroscience does have a seductive allure, especially on self-appraised satisfaction and understanding, and when presented as text.
Introduction
The seductive allure of neuroscience explanations, or SANE effect, is the finding that neuroscience explanations and brain imagery may have an oversized impact on judgments of the quality and convincingness of information. Importantly, irrelevant neuroscience or brain images may persuade people to agree with a position, buy a product (Ariely and Berns, 2010; Racine et al., 2010), or to side with one party in court (Baker et al., 2017; Greene and Cahill, 2012; McCabe et al., 2011; Phalen et al., 2021; Saks et al., 2014; Salerno and Bottoms, 2009). Increased media presentations of neuroscience may also change laypeople’s core understanding of themselves. This can be positive, such as by reducing the stigma of mental illness, if the public believes brain scans can confirm that symptoms are “real.” However, researchers have raised questions about neuroscience changing views on free will, or guilting parents into obsessively using “brain-based” developmental strategies on their children (O’Connor and Joffe, 2013a, 2013b, 2015). Moreover, the term “neuro-policy” was coined to describe often-inappropriate uses of neuroscience to advance social or political agenda (Racine et al., 2005).
This impact of neuroscience information may have been an unintended consequence of the “Decade of the Brain” in the 1990s, and the simultaneous emergence of modern brain imaging (Racine et al., 2005), which made neural correlates of behavior more accessible for the general public. At the same time, media presentations of neuroscience often de-emphasize criticism or limitations of the methods (Bourdaa et al., 2015; Dumas-Mallet et al., 2017; Racine et al., 2010). This seems to encourage a somewhat mystical sense of the insight about the human experience that can be gained by studying the brain, as indicated by early studies into the SANE effect (Gurley and Marcus, 2008; McCabe and Castel, 2008; Weisberg et al., 2008). Furthermore, it has been found that the explanatory power of neuroscience may be part of a larger allure of explaining phenomena using more reductionistic sciences (Hopkins et al., 2016, 2019). However, the general public seems to hold a special esteem for neuroscience over other fields (Fernandez-Duque et al., 2015; Hopkins et al., 2016), and sees a need for neuroscience methods in the study of psychology (Weisberg et al., 2018).
Neuroscience is more impactful in some instances than others
The SANE effect persists even after controlling for possible confounds such as length of the neuroscience explanation, and the use of jargon (Rhodes et al., 2014; Weisberg et al., 2015). However, numerous reports of null findings indicate limitations of the effect (Baker et al., 2013; Gruber and Dickerson, 2012; Marshall et al., 2017; Michael et al., 2013; Schweitzer et al., 2013; Van Elk, 2019; West et al., 2014). First, the effect on ratings of a scientific explanation appears to be limited to rescuing low ratings of poorly-reasoned or circular explanations, with no impact on a well-reasoned argument (Fernandez-Duque et al., 2015; Hopkins et al., 2016; Minahan and Siedlecki, 2016; Weisberg et al., 2008, 2015). This may be due to the incorrect perception that a phenomenon is better understood by referencing brain regions with which it correlates, even though “‘where’ is not the equivalent of ‘why’ or ‘how’” (Weisberg, 2008: 53). Second, neuroscientific evidence is not compelling when participants disagree with the findings (Scurich and Shniderman, 2014).
More broadly, it is important to understand what types of impact neuroscience may have on laypeople. Some of the strongest SANE effects are in improving ratings of satisfaction with explanations of psychological phenomena (Fernandez-Duque et al., 2015; Im et al., 2017; Weisberg et al., 2008, 2015). This not only suggests a preference for an increasingly reductionistic, neurological understanding of the human experience, but also that neuroscience information influences decision making that is based on scientific evidence. For instance, functional magnetic resonance imaging (fMRI) evidence was rated of higher quality than cognitive testing in determining that a politician was not competent to continue in office (Munro and Munro, 2014). Furthermore, neuroscience improves self-reported comprehension, without changing any objective measures of understanding (Ikeda et al., 2013). It could be predicted, therefore, that references to the brain can persuade laypeople to agree with positions or buy products that are “brain-based.” However, added neuroscience is often ineffective in changing agreement with scientific findings (Hook and Farah, 2013; Michael et al., 2013; Schweitzer et al., 2013), suggesting that improved ratings of satisfaction and metacomprehension may not always be sufficient to change attitudes or behavior.
It has been found that brain imagery is only persuasive when presented following a non-image control, which serves as a reference point, in within-subjects fashion. In an elegant design (Schweitzer et al., 2013), brain graphics improved ratings only when presented after the control condition, but not when presented first. This is noteworthy because it suggests that the ecological validity of the effect is limited; people rarely encounter the same material with and without extraneous neuroscience. One notable exception that can be envisioned is in the context of a courtroom, where different parties may present jurors with evidence that varies in the inclusion of neuroscience information or images. Indeed, brain images influenced sentences of death, indicating that mock jurors were persuaded by the party that offered the images, compared with evidence without them, which actually tended to persuade jurors against the side that offered it (Saks et al., 2014). However, it should be noted that this and other between-subjects designs have found a SANE effect (Greene and Cahill, 2012; Hopkins et al., 2016; Ikeda et al., 2013; Im et al., 2017; McCabe et al., 2011; Minahan and Siedlecki, 2016; Munro and Munro, 2014; Phalen et al., 2021; Racine et al., 2017; Rhodes et al., 2014; Saks et al., 2014; Weisberg et al., 2008, 2015). Therefore, while within-subjects designs often have higher effect sizes (e.g. Fernandez-Duque et al., 2015), the effect appears in between-subjects studies as well.
Another important consideration is whether the added neuroscience is graphic (i.e. brain imagery) or textual. As originally described (McCabe and Castel, 2008), images led to higher ratings of the science behind a cognitive neuroscience study. Adding brain pictures produces higher ratings of metacomprehension of scientific material, more so than a bar graph (Ikeda et al., 2013). Im et al. (2017) found that imagery and text led to higher ratings of educational research than either alone. This suggests a certain dependency on “dosage” (Im et al., 2017: 530); the more types of irrelevant neuroscience, the stronger the effect. In mock courtroom settings, lie detection using fMRI or electrophysiology was rated more highly than non-neuroscience methods (McCabe et al., 2011; West et al., 2014). Similarly, pain assessed through fMRI was rated more severe, and led to higher settlements than other methods (Phalen et al., 2021); taken together, this may indicate a belief that imaging is harder to trick, and therefore more trustworthy. However, several experiments failed to demonstrate any extra persuasiveness of images, beyond neuroscience text (Hook and Farah, 2013; Michael et al., 2013).
The role of expertise
Expert knowledge of a relevant field tends to confer immunity to the impact of extraneous neuroscience. This is true not only of neuroscientists (Weisberg et al., 2008), but also experts in personnel selection who evaluated a personality test with neuroscience information (Diekmann et al., 2015). However, this ability to identify irrelevant neuroscience appears to be very domain-specific, given that the SANE effect is seen in scientists from other fields (Hopkins et al., 2019), and appears to require at least an undergraduate degree in a relevant major. The SANE effect is found among those who demonstrate introductory-level neuroscience knowledge (Im et al., 2017; Rhodes et al., 2014; Weisberg et al., 2008), and confers upon laypeople a greater sense of comprehending the material, while not actually improving understanding (Ikeda et al., 2013; Rhodes et al., 2014).
This article is not the first to address parameters under which the SANE effect can be elicited (Aono et al., 2019; Baker et al., 2017; Farah and Hook, 2013; Michael et al., 2013). However, given the disparate findings, we believe it is necessary to meta-analyze the persuasive impact of neuroscience to determine its effect size. As a starting point, combining the original McCabe and Castel (2008) findings with 10 replications, Michael et al. (2013) found a trivial raw effect size of 0.07 points on a 4-point scale. It is unclear whether a larger meta-analysis of this and other relevant findings would produce a stronger effect.
The random-effects meta-analysis
We therefore conducted a meta-analysis to determine the strength of the SANE effect in non-experts, defined as those with less than an undergraduate degree in a relevant field. A meta-analysis statistically combines all relevant findings on a topic, and synthesizes them into a single average effect size. This should indicate the true nature of the phenomenon, including how strong the effect is likely to be, and whether we are confident it exists (i.e. whether the effect size is significantly greater than zero). Thus, meta-analysis is superior to an unstructured literature review, or any single primary research study, in resolving the true magnitude of the SANE effect. The random-effects procedure is needed because of the variety of ways that neuroscientific explanations have been manipulated, and differences in outcomes measured (Borenstein et al., 2009). Given these differences in the literature, we anticipated that there may be more than one type of SANE effect. Although there may be heterogeneity, most studies have theoretical and methodological ancestry in a few early reports (Gurley and Marcus, 2008; McCabe and Castel, 2008; Weisberg et al., 2008). Therefore, meta-analysis should reveal whether neuroscience unduly influences laypeople, or if some subsequent studies are correct that brain information is no more impactful than other types of arguments.
We also planned two subgroup analyses. First, we planned to compare within- and between-subjects designs. It has been predicted (Schweitzer et al., 2013) that the former type of studies would produce a stronger effect than studies in which participants rate only one stimulus. We sought to uncover the magnitude of this difference, and if there even is a SANE effect in between-subjects designs at all. Second, we predicted that studies featuring neuroscience as text would produce a stronger effect than images. The former kinds of studies often have larger effect sizes; furthermore, Michael et al. (2013) successfully replicated an effect of neuroscience text (Weisberg et al., 2008), but failed to reliably replicate a classic study of the impact of brain scan images (McCabe and Castel, 2008). We also predicted that combining both stimuli would have an additive SANE effect (Im et al., 2017).
Methods
This analysis follows PRISMA guidelines. This review was not registered; therefore, we clarify actions that were not planned in advance.
Searches
We searched the PsycInfo, PsycArticles, Medline, and Psychology & Behavioral Sciences Collection databases on 16 September 2021, and again on 5 December 2022. Search terms (number of results from the more recent search in parentheses) were as follows: seductive allure (436); “neuroscience explanation” (28); “brain image” or neuroimage AND judgment AND law or legal or jury or juror (4095). There were no limits on searches. The initial search found 32 of the 35 papers assessed for eligibility. The second search yielded three newer publications for consideration (Bulut et al., 2022; Perricone et al., 2022; Phalen et al., 2021). Initial screening was conducted by one author (PJM). Exclusion decisions after screening were made jointly between authors based on the following characteristics:
Population: We include only laypeople as study participants, because expertise protects against the SANE effect (Diekmann et al., 2015; Hopkins et al., 2019; Weisberg et al., 2008; cf. Bulut et al., 2022).
Intervention and Comparison: We considered studies that compared either neuroscience text, images, or both, to other kinds of information (e.g. descriptions of behavior). Studies were only included if they compared neuroscience information and/or images to similar material without them.
Outcome: Outcomes of interest were either ratings of satisfaction, quality, or convincingness of, or agreement with, the intervention stimulus, or some behavior or judgment (such as a courtroom verdict) that may be influenced by it.
As shown in Figure 1, the overwhelming majority of reports were excluded as clearly unsuitable upon initial screening. In most cases, they involved actual neuroscience methods, rather than perceptions of neuroscience. All other excluded references at this stage involved no data gathering (e.g. philosophical works). From the references of included papers, we identified 13 other relevant studies, and included results from a SANE study from our group that has not been submitted for publication, and which yielded no difference between a neuroscientific and a psychological explanation.

Figure 1. Flow diagram for identification and selection of included studies.
Two studies (Bulut et al., 2022; Hopkins et al., 2019) and one experiment from an otherwise-included paper (Weisberg et al., 2008) were excluded for sampling experts. In an additional study (Diekmann et al., 2015), we only used the cohort of psychology students and excluded the expert groups. We excluded Study 3 from Weisberg et al. (2015) because it reused data from its own Study 1. In one study (Aspinwall et al., 2012), neuroscientific information was offered by prosecution and defense, as aggravating and mitigating evidence, respectively, which could not be combined with other studies, in which neuroscience is offered only as mitigating evidence. Others, while relevant to understanding the SANE effect, were excluded because they did not include outcomes of interest (Berent and Platt, 2021; Sandoboe and Berent, 2021) or a non-neuroscience comparison (Scurich and Shniderman, 2014).
Several studies did not include enough data to calculate effect sizes. One of us (EMB) made at least two attempts, spaced by 3–5 weeks, to contact the corresponding author by email to obtain either the relevant statistics or the raw data. Seven studies were excluded because authors did not reply, or were unable to supply the needed data. One study (Schweitzer et al., 2011) did not include group sample sizes; these we estimated by dividing the overall sample size (N) by the number of groups (k).
Included data
We collected data on authors, year of publication, and experimental design (Supplemental Table 1). Mean differences were taken from neuroscience and non-neuroscience conditions, collapsed across levels of other factors (e.g. combining both “good” and “circular” explanations, as in Weisberg et al., 2015). For simplicity, in studies in which there were more than two conditions, we selected the “strongest” neuroscience and non-neuroscience conditions. For instance, in studies that had neuroscience text with and without images, the condition with imagery was used (e.g. Im et al., 2017). In the control group, a psychological explanation was preferred over a condition with no explanation (as in West et al., 2014). Studies were high quality, using either randomization, or counterbalancing in within-subjects designs.
Outcomes of interest were ratings of quality, satisfaction, or convincingness with regard to the stimulus. Convincingness could be assessed either by a direct rating, or agreement or belief in the main point of the information. For mock jury studies, we used guilty or not guilty verdicts, and where these were unavailable, length of sentences.
Because a few studies had small samples, we use Hedges’ g (and 95% CI) as a measure of effect size. In most cases, this was calculated from Cohen’s d. In studies using dichotomous outcomes (i.e. guilty/not guilty verdicts), we converted log odds ratios into Hedges’ g (Borenstein et al., 2009). Authors worked jointly on data gathering and analysis.
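The effect-size conversions described above can be sketched as follows. This is a minimal illustration, assuming the standard formulas given in Borenstein et al. (2009): the small-sample correction J = 1 − 3/(4df − 1) with df = n1 + n2 − 2 for two independent groups, and d = ln(OR) × √3/π for dichotomous outcomes. The function names and example numbers are hypothetical and are not taken from the included studies.

```python
import math

def hedges_g_from_d(d, n1, n2):
    """Apply the small-sample correction J to Cohen's d (two independent groups)."""
    df = n1 + n2 - 2
    j = 1 - 3 / (4 * df - 1)
    return j * d

def log_odds_ratio(a, b, c, d):
    """Log odds ratio from a 2x2 table.

    a, b = counts with / without the outcome in the neuroscience condition;
    c, d = the same counts in the control condition.
    """
    return math.log((a * d) / (b * c))

def d_from_log_odds_ratio(log_or):
    """Convert a log odds ratio to Cohen's d (logistic-distribution approximation)."""
    return log_or * math.sqrt(3) / math.pi

# Hypothetical mock-jury example: 30/50 vs. 20/50 verdicts of one type.
d = d_from_log_odds_ratio(log_odds_ratio(30, 20, 20, 30))
g = hedges_g_from_d(d, 50, 50)
print(round(d, 3), round(g, 3))
```

The correction factor J approaches 1 as samples grow, so g and d differ noticeably only in the smaller studies.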
Procedure
We planned a random-effects analysis, and combined studies with the inverse-variance method. Synthesis was undertaken using the dmetar package in R (Harrer et al., 2021), with some analysis using Meta-Essentials (Suurmond et al., 2017). Heterogeneity was assessed with I². We assessed publication bias using visual inspection of the funnel plot, as well as Rosenthal’s fail-safe N.
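For readers unfamiliar with the procedure, inverse-variance random-effects pooling and the I² statistic can be sketched as below. This is an illustration only: it assumes the DerSimonian-Laird estimator of the between-study variance (τ²), which may differ from the estimator used by the software named above, and a normal-theory 95% confidence interval. The function name and the input values are hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """Inverse-variance random-effects pooling.

    Assumes the DerSimonian-Laird tau^2 estimator (an assumption, not
    necessarily what the analysis software used).
    """
    k = len(effects)
    w = [1 / v for v in variances]                       # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                   # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {"g": pooled, "se": se,
            "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "Q": q, "tau2": tau2, "I2": i2}

# Hypothetical effect sizes (Hedges' g) and their sampling variances.
result = random_effects_pool([0.0, 0.5, 1.0], [0.01, 0.01, 0.01])
print(result["g"], result["tau2"], result["I2"])
```

Because τ² is added to every study's variance, the random-effects weights are more equal across studies than fixed-effect weights, and the pooled confidence interval widens as heterogeneity grows.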
Two kinds of subgroup analyses were planned. Differences between subgroups were assessed with analysis of variance (ANOVA) comparing variance (Q) between and within groups (Borenstein et al., 2009). In the first analysis, we hypothesized that studies presenting stimuli within-subjects would yield stronger effects than those that presented only one type of stimulus per participant. These between-subjects studies were far more numerous and were found to be heterogeneous. Therefore, between-subjects studies were further divided into subgroups in an unplanned exploratory analysis, as described below. Second, we hypothesized that studies comparing both neuroscience images and text to non-neuroscience control stimuli would have greater effect sizes than those with only text. Studies with only text should, in turn, have larger effects than those with only images.
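The between-subgroups test can be illustrated with a short sketch. It assumes the Q-between statistic described by Borenstein et al. (2009): the weighted sum of squared deviations of each subgroup's pooled effect from the weighted grand mean, referred to a chi-square distribution with (number of subgroups − 1) degrees of freedom. The helper names and the example subgroup values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable for integer df >= 1."""
    z = x / 2
    if df % 2 == 0:
        a, p = 1.0, math.exp(-z)              # closed form for df = 2
    else:
        a, p = 0.5, math.erfc(math.sqrt(z))   # closed form for df = 1
    while a < df / 2:                         # step up two df at a time
        p += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return p

def q_between(subgroups):
    """subgroups: list of (pooled_effect, standard_error), one per subgroup."""
    w = [1 / se ** 2 for _, se in subgroups]
    grand = sum(wi * g for wi, (g, _) in zip(w, subgroups)) / sum(w)
    q = sum(wi * (g - grand) ** 2 for wi, (g, _) in zip(w, subgroups))
    df = len(subgroups) - 1
    return q, df, chi2_sf(q, df)

# Hypothetical subgroup summaries: (pooled g, standard error).
q, df, p = q_between([(0.2, 0.1), (0.4, 0.1)])
print(round(q, 2), df, round(p, 3))
```

The same tail function can be used to check a reported statistic: for example, a Q of 6.32 on 1 degree of freedom corresponds to p of about .012.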
Results and discussion
There were 28 studies included, comprising a total of 60 individual experiments, with a combined 13,088 observations (Figure 2). Overall, the effect size was mild (Hedges’ g = 0.25, CI = [0.18, 0.31], PI = [−0.22, 0.72]), but highly significant (z = 7.34, p < .001). On the face of it, this confirms the notion that the SANE effect is real, but not very strong. However, there are three important considerations.

Figure 2. Forest plot of effect sizes. Error bars of individual studies represent standard error. Overall effect shown at bottom, with confidence (black) and prediction (gray) intervals. Note that removal of eight outliers changes the overall effect to g = 0.20, CI = [0.16, 0.25], PI = [0.00, 0.40].
First, there is evidence of publication bias. As seen in the funnel plot (Figure 3), there is a small number of studies with strong (g > 0.80) effects, and very few with high variance but no effect, as indicated by the lack of studies in the bottom left of the funnel. This was unexpected. There are several high-profile publications of null results; therefore, we surmised that there were very few studies still in the proverbial file drawer. We had access to data from only one relevant dissertation (Chambers, 2014), which failed to show a significant influence of brain imagery, and our own unpublished research (g = 0.04). There were two other dissertations, from which we could not access raw data. One found no effect of neuroscience added to explanations of psychological disorders in a small (N = 60) group of PhD and non-PhD students (Sapolsky, 2017), although PhD students would be considered experts, and therefore excluded from the meta-analysis. The other (Capestany, 2018) found an impact of brain-based and neuropsychological diagnoses on sentence length (p < .001, η² = .03). Given that this was the only significant result among the four unpublished studies we found, the overall effect size may be lower than what we report. Nevertheless, Rosenthal’s fail-safe N test revealed that 5270 null findings would be needed to reduce the overall effect to non-significance.
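As an illustration of the fail-safe N logic, the sketch below assumes Rosenthal's original formulation: the number of additional studies averaging z = 0 that would bring the combined (Stouffer) z below the one-tailed .05 criterion of 1.645. The exact variant implemented by the software used here is not stated, so this is an assumption, and the z-scores in the example are hypothetical.

```python
def rosenthal_failsafe_n(z_scores, z_alpha=1.645):
    """Number of unpublished null (z = 0) studies needed to make the
    combined result non-significant, per Rosenthal's formula:
        N_fs = (sum of z)^2 / z_alpha^2 - k
    """
    k = len(z_scores)
    n_fs = sum(z_scores) ** 2 / z_alpha ** 2 - k
    return max(0.0, n_fs)

# Hypothetical z-scores from three studies.
print(round(rosenthal_failsafe_n([2.0, 2.0, 2.0]), 1))
```

Because the statistic grows with the square of the summed z-scores, a large number of modest but consistent effects can yield a very large fail-safe N even when the average effect size is small.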

Figure 3. Funnel plot. Note the lack of studies at bottom left. This likely suggests extant low-powered null or negative results that have not been published, but may also indicate stronger effects in small studies of satisfaction or metacomprehension, which appear to have larger effects.
Second, visual inspection of Figures 2 and 3 indicated possible outliers. We therefore applied the find.outliers function from the dmetar package in R (Harrer et al., 2021), which identifies studies whose confidence intervals do not overlap with that of the overall mean effect size. There were eight such studies. Three of these reported negative effect sizes; that is, control stimuli were rated nonsignificantly higher than neuroscience. The remainder had standardized mean differences of 0.61 to 1.14. Removing these weakens the overall SANE effect further (N = 52 studies and 11,850 participants, g = 0.20, CI = [0.16, 0.25], PI = [0.00, 0.40]). At the same time, we had no a priori justification for removing outliers. Indeed, some of the strongest effects came from stimuli created by Weisberg et al. (2008), in which explanations of psychological phenomena are judged more satisfying when irrelevant neuroscience is added.
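The outlier rule described above (a study's confidence interval lying entirely outside that of the pooled effect) can be sketched as follows. This is a simplified stand-in for illustration, not the dmetar implementation; it assumes normal-theory 95% intervals for individual studies, and the study labels and values are hypothetical. Only the pooled interval [0.18, 0.31] comes from the results reported here.

```python
def flag_outliers(studies, pooled_ci, z=1.96):
    """Return labels of studies whose 95% CI does not overlap the pooled CI.

    studies: list of (label, effect_size, standard_error)
    pooled_ci: (lower, upper) bounds of the pooled effect's confidence interval
    """
    lower, upper = pooled_ci
    flagged = []
    for label, g, se in studies:
        study_lower, study_upper = g - z * se, g + z * se
        if study_upper < lower or study_lower > upper:
            flagged.append(label)
    return flagged

# Hypothetical studies compared against the reported pooled CI.
studies = [("A", 0.25, 0.10), ("B", 1.00, 0.20), ("C", -0.30, 0.20)]
print(flag_outliers(studies, (0.18, 0.31)))
```

Note that this rule can flag precise studies with fairly ordinary effect sizes, because a narrow interval is less likely to overlap the pooled interval.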
Finally, there is indication of heterogeneity. We anticipated that this may happen, as differences throughout the literature suggest that there is more than one type of SANE effect. This can be seen in the size of the prediction interval, before removal of outliers. This interval indicates the distribution of possible SANE effect sizes, and ranges from −0.22 to 0.72. Furthermore, most of the variability is due to heterogeneity (I² = 87.2%). Removing outliers narrows the prediction interval and reduces I² to 43.5%, which is substantially lower but still worth exploring. We undertook subgroup analyses to understand the source of the heterogeneity. Because a reduction in variability may ease identification of different groups of effects, we performed the following analyses after removing the outliers identified earlier. As may be expected, including them in the analysis increases effect sizes, but also variance; however, it does not substantively change the patterns that follow.
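To make the distinction between the confidence and prediction intervals concrete, the sketch below assumes the commonly used formula in which the prediction interval is the pooled effect plus or minus a t critical value (typically with k − 2 degrees of freedom) times the square root of τ² plus the squared standard error of the pooled effect. The critical value is passed in by the caller (for example, from a t table), and all numbers in the example are hypothetical.

```python
import math

def prediction_interval(pooled, se_pooled, tau2, t_crit):
    """Prediction interval for the true effect in a new, comparable study.

    pooled: pooled effect size; se_pooled: its standard error;
    tau2: between-study variance; t_crit: two-sided critical value
    (assumed to come from a t distribution with k - 2 df).
    """
    half_width = t_crit * math.sqrt(tau2 + se_pooled ** 2)
    return pooled - half_width, pooled + half_width

# Hypothetical values: the interval is driven mostly by tau2, not by se_pooled.
low, high = prediction_interval(pooled=0.5, se_pooled=0.1, tau2=0.03, t_crit=2.0)
print(round(low, 2), round(high, 2))
```

Unlike the confidence interval, the prediction interval does not shrink toward zero width as more studies are added, because τ² reflects real variation among effects rather than sampling error.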
Within- versus between-subjects designs
We proposed a priori that within-subjects designs would yield stronger effects than between-subjects studies. Schweitzer et al. (2013) found a moderate effect (d = 0.5) only when a brain illustration was presented after a control stimulus, to which it could be compared. Participants who saw the brain image first rated it the same as the subsequent control stimulus (d = −0.02). Combining these, we calculated an overall effect size of g = 0.24 for this experiment, closely matching the weighted average of all within-subjects studies (0.28), as shown in Table 1. This is significantly greater (Q(df = 1) = 6.32, p = .012) than the mean effect for between-subjects designs (g = 0.18, CI = [0.13, 0.24]). This suggests that the SANE effect is indeed stronger when people have a non-neuroscience reference, to which neuroscience can favorably compare. However, moderate heterogeneity persisted in the subgroup of between-subjects studies (I² = 40.4%), which we explored further.
Table 1. Within-subjects and between-subjects studies of differing outcomes.
CI: confidence interval; PI: prediction interval; LL: lower limit; UL: upper limit.
Impacts on different outcomes: Satisfaction, convincingness, and legal verdicts
After viewing the data, we made two observations: first, studies that asked whether neuroscience explanations were more satisfying, higher in quality, or increased self-reported understanding, seemed to yield stronger effects than studies that measured convincingness of the material, usually measured by agreement with the main point of the stimulus. The second observation was that studies about the persuasiveness of neuroscience in the courtroom vary in their findings, with some strong effects (e.g. Allen et al., 2019; Greene and Cahill, 2012), but several notable null results (e.g. Marshall et al., 2017; Schweitzer and Saks, 2011).
Taking these together, we divided our data set into four groups (Table 1): those measuring satisfaction, quality, or metacomprehension; those measuring convincingness or agreement; courtroom studies, usually measuring verdict type or sentence length; and within-subjects studies of all types. This framework was found to have significant between-group differences (Q(df = 3) = 25.94, p < .001).
Perhaps the most serendipitous discovery is that the SANE effect differs according to the measure taken. When grouping ratings of quality, satisfaction, and perceived comprehension, we obtained a more noteworthy effect size (g = 0.28), equal to that of within-subjects studies, with low heterogeneity (Table 1). We surmise that this is because irrelevant neuroscience alters participants’ affective reaction to the material. In contrast, neuroscience has a significant, but extremely weak (g = 0.09) ability to convince participants in assessments of agreement with the material. These often take the form “do you agree or disagree with . . .” (Michael et al., 2013) or “to what extent do you believe . . .” (Phalen et al., 2021; Schweitzer et al., 2013) conclusions or other statements of fact in the material. Agreement with statements such as “the scientific reasoning in the article made sense” (McCabe and Castel, 2008) also indicates that one has been convinced by the material. We propose that questions in the convincingness group assess cognitive appraisal of the material, and that neuroscience has little impact in this area. This may explain why laypeople are aware of brain optimization techniques reported in mass media, but usually report that they do not practice them (O’Connor and Joffe, 2015). This finding is in line with the small overall effect size on agreement found by Michael et al. (2013) for their 10 experiments, along with the study they modeled (McCabe and Castel, 2008, Exp. 3). When standardized, we calculate g = 0.13, well within the 95% CI for the convincingness subgroup. This is unsurprising, given that these comprise 11 of the 18 studies in the group. Nevertheless, the others range from −0.08 to 0.12, suggesting that effect sizes of convincingness are uniformly low.
This finding raises questions about the real-life impact of extraneous neuroscience. Taken together, we find that it can improve subjective reaction to scientific material, seen in ratings of participants’ reports of their own satisfaction and understanding. This subjective sense of comprehension of the brain may have a major impact on laypeople’s perceptions, especially their confidence that science supports their views about psychology and social issues (O’Connor and Joffe, 2014). However, this change in appraisal of one’s reaction to the material does not appear to confer an outsized impact of neuroscience in convincing or changing attitudes in line with presented material. In one case, neuroscience had no impact on convincingness, modestly increased ratings of stimulus quality, and substantially (d = 0.94) altered self-reported understanding, although it did not affect the ability to detect a flaw in the stimulus study. The authors referred to this as “an illusion of explanatory depth” (Rhodes et al., 2014, p. 1432). This inflated sense of comprehension without any evidence of improvement in objective assessments of understanding has been found by others (Ikeda et al., 2013). Furthermore, in one study that obtained large effect sizes (Fernandez-Duque et al., 2015), participants were told that the material was taken from “solid, replicable research” (p. 928), and they were only to judge the quality of explanations of the phenomena. This raises the possibility that neuroscience can improve subjective quality of the explanation most strongly when the veracity of the information is not up for debate (i.e. no convincing is needed). Moreover, combining satisfaction measures and within-subjects presentation very often produced moderate to large effects (0.33 < g < 1.14; Fernandez-Duque et al., 2015; Weisberg et al., 2008) including three of the experiments that were identified as outliers.
Revisiting publication bias
Identification of different types of SANE effects opens the possibility that Figure 3 may not indicate strong publication bias. Rather, it is possible that the smallest studies (thus, those at bottom with large standard error) instead represent a different type of effect (Borenstein, 2019). We reviewed all studies with a standard error of more than 0.20, and found that 7 of the 11 assessed quality or satisfaction. Therefore, it is possible that asymmetry in the funnel plot may be at least partly due to different types of SANE effects, although this was a post hoc discovery and should be interpreted with caution.
Courtroom studies
Heterogeneity among courtroom studies may reflect a complex set of effects of introducing brain-related material in criminal cases. In line with the proposed difference between quality and convincingness, it has been found that neuroscience explanations were rated higher quality than behavioral evidence in lie detection (McCabe et al., 2011), but did not affect guilty verdicts differently (West et al., 2014). Images of brain scans did increase ratings of pain severity, as well as compensation, in a mock case involving claims of physical or emotional pain (Phalen et al., 2021). This may indicate a belief that neuroimaging is more objective than other methods of pain assessment or lie detection, and harder to trick.
More generally, invoking brain differences such as lesions or other abnormalities in a violent criminal (and almost all courtroom studies reviewed here involve violent crimes) may lead to a so-called double-edged sword (DES; Aspinwall et al., 2012). The abnormality underlying the violent behavior may be seen as hard-wired, making the accused more dangerous, less treatable, and more likely to reoffend. At the same time, it may lessen ratings of culpability and responsibility for their actions. If true, neuroscience information or images may not reduce measures such as conviction or sentence length, because participants may see a strong need to warehouse the accused in the name of public safety (Baker et al., 2017). However, it would also imply that jurors would opt for less severe punishments, if defendants are seen as less responsible for their actions.
Most studies support this hypothesis. Greene and Cahill (2012) found that, among highly dangerous defendants, neuropsychological data dramatically mitigated recommendations for the death penalty over life imprisonment. This suggests leniency in the severity of punishment, because in either option, the defendant would not be able to reoffend. The finding that mock jurors given neuroscientific evidence opt for life imprisonment over the death penalty was replicated by others (Appelbaum et al., 2015; Saks et al., 2014). In some cases, null results involve guilty versus not guilty verdicts or sentence length, in which mitigating effects of neuroscience information may be balanced by motivation to keep the defendant away from the general public (Appelbaum et al., 2015; Mowle et al., 2016). Furthermore, neuroscience imaging alone was sufficient to reduce death penalty verdicts, and ratings of responsibility in the defendant (Saks et al., 2014), but increase likelihood of recidivism (Baker et al., 2013), in line with the DES hypothesis (Baker et al., 2017). This is in spite of the fact that brain imagery alone is usually found not to affect verdicts (Marshall et al., 2017; Phalen et al., 2021; Schweitzer et al., 2013, 2011; Schweitzer and Saks, 2011), as will be discussed further in the next section.
In a similar fashion, brain-based evidence leads to verdicts of not guilty by reason of insanity, or guilty but mentally ill, in studies in which participants may choose these options (Gurley and Marcus, 2008; Schweitzer and Saks, 2011). In an unpublished dissertation (Capestany, 2018), neuroscience decreased severity of punishment and ratings of moral responsibility in the accused. However, based on the DES hypothesis, neuroscientific evidence should increase ratings of violence risk and likelihood of recidivism, while lowering responsibility and self-control in the defendant, and some studies find no effect on these measures (LaDuke et al., 2018; Marshall et al., 2017; Schweitzer et al., 2011).
Two recent studies attempt to resolve these discrepancies. First, it was found that sentencing judgments in line with the DES hypothesis were not based on consequences like predicted treatability or effects on public safety, but rather deontological concerns about the ethics of punishing those less blameworthy or responsible, and about duty to provide treatment to those with mental illness (Allen et al., 2019). Second, researchers added neuroscientific evidence to a criminal case in a pretest–posttest design, and also varied stated reasons for incarceration (Perricone et al., 2022). There was no main effect of neuroscience on sentence length, as several others have found, but an interaction showing that those who read that the reason for incarceration is for public safety or to take time to rehabilitate them increased their recommended sentences, while sentences were decreased when the reason for a longer sentence was retribution. In line with the literature discussed here, much of the heterogeneity may be due to a decreased need for punitive measures, but greater concern for public welfare.
Impact of brain images
Given that effect sizes tend to be stronger when neuroscience text is added to written material, compared with images, we planned a priori to divide the data set into studies in which the primary manipulation was the inclusion of brain imagery, studies that varied neuroscience text, and those that compared both images and text to control conditions with neither. Our planned analysis indicates a significant difference among these subgroups (Q(df = 2) = 15.9, p < .001).
Subgroup data are shown in Table 2. There is a weak overall effect for imagery, compared with neuroscience text. Interestingly, the combination of the two manipulations does not produce a reliable additive effect. Instead, the effect size is similar to that of brain imaging alone, but is much more heterogeneous. While Marshall et al. (2017) found no impact of a 3D brain image and fMRI on guilty verdicts and sentence length, another study indicated that a combination of a brain scan picture and text improved quality and credibility of an article, while either condition alone did not (Im et al., 2017). These two studies are on opposite ends of this category in terms of effect size, and may reflect differences in study outcomes, as was discussed.
Table 2. Brain imagery and neuroscience text subgroups.
BI: brain imagery; CI: confidence interval; PI: prediction interval; LL: lower limit; UL: upper limit.
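As a check on the subgroup test reported above, the p value for Q(df = 2) = 15.9 can be recovered in closed form, because the upper tail of a chi-square distribution with 2 degrees of freedom is exp(−Q/2). The following Python sketch assumes, as is conventional for subgroup analyses, that the between-subgroup Q statistic is referred to a chi-square distribution with degrees of freedom equal to the number of subgroups minus one. The function name chi2_sf_df2 is a hypothetical helper, not part of the original analysis.

```python
import math


def chi2_sf_df2(q: float) -> float:
    """Upper-tail probability of a chi-square variable with exactly 2 df.

    For df = 2 the chi-square distribution is exponential with mean 2,
    so the survival function reduces to exp(-q / 2).
    """
    if q < 0:
        raise ValueError("Q must be non-negative")
    return math.exp(-q / 2.0)


# Reported between-subgroup statistic: three subgroups, hence df = 2.
q_between = 15.9
p_value = chi2_sf_df2(q_between)

print(f"Q = {q_between}, df = 2, p = {p_value:.5f}")
```

Under this assumption the tail probability is roughly .00035, consistent with the reported p < .001.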
It is less surprising that neuroscience text studies have stronger effects than those using images. There have been several failures (Gruber and Dickerson, 2012; Hook and Farah, 2013; Michael et al., 2013; Racine et al., 2017; Schweitzer et al., 2013) to replicate the initial brain imagery work of McCabe and Castel (2008) and McCabe et al. (2011). The most powerful effects produced by images alone are in increasing perceived comprehension of study material (Ikeda et al., 2013).
Other potential moderators of the SANE effect
In reconciling studies that do and do not find a SANE effect, it has been suggested that the impact of neuroscience imagery and information may be waning, due to greater public familiarity over time (Marshall et al., 2017). We tested this possibility by regressing all effect sizes in our sample on year of publication. Since the classic studies were published 15 years ago (Gurley and Marcus, 2008; McCabe and Castel, 2008; Weisberg, 2008; Weisberg et al., 2008), the SANE effect has diminished by only about 1.1% of a standard deviation per year (b = −0.011, 95% CI = [−0.035, 0.013]). The trend was not significant (F(1, 58) = 0.90, p = .348). Furthermore, it is unclear whether this decline, if present, is due to changes in perceptions about neuroscience, or arises because replications typically have lower effect sizes than earlier work (Open Science Collaboration, 2015).
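To illustrate the size of this trend, the per-year slope and its confidence limits can be scaled to the 15-year span since the classic studies. The following Python sketch assumes that the trend is linear over that period and that the interval scales in proportion to the number of years. The variable and function names are illustrative, not part of the original analysis.

```python
# Reported meta-regression estimates (standard deviation units per year).
slope = -0.011
ci_low, ci_high = -0.035, 0.013

# Span since the classic 2008 studies, as stated in the text.
years = 15


def project(per_year: float, n_years: int) -> float:
    """Scale a per-year change to a cumulative change, assuming linearity."""
    return per_year * n_years


total_change = project(slope, years)
total_low = project(ci_low, years)
total_high = project(ci_high, years)

print(f"Projected change over {years} years: {total_change:.3f} SD")
print(f"Scaled 95% CI: [{total_low:.3f}, {total_high:.3f}] SD")
```

On these assumptions, the point estimate corresponds to a cumulative decline of about 0.165 standard deviations, with an interval that runs from a decline of roughly half a standard deviation to a modest increase, which is why the trend cannot be distinguished from no change.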
As mentioned, expertise beyond the level of a few neuroscience courses moderates the SANE effect (Hopkins et al., 2019; Im et al., 2017; Weisberg et al., 2008). Other than this, very few individual differences seem to reliably predict susceptibility. The effect did not vary with age or overall education (Michael et al., 2013), Need for Cognition (NFC; Minahan and Siedlecki, 2016), attitudes about the field of study (Fernandez-Duque et al., 2015; Im et al., 2017), intuitive thinking (Van Elk, 2019), or performance on the Cognitive Reflection Test (CRT) or on tests of concrete or abstract syllogisms (Fernandez-Duque et al., 2015; Van Elk, 2019).
Dualism, or the belief that mind and body are distinct, also has minimal influence on the SANE effect (Hook and Farah, 2013). However, it has been proposed that the SANE effect may be caused by an implicit dualism (Sandoboe and Berent, 2021). In this account, a preference for neural explanations of behavior resolves the dissonance over how mind interacts with body, by substituting the material brain as causal agent. It has yet to be shown, though, that individuals with such implicit dualistic beliefs rate brain images or irrelevant neuroscience text more highly than people with less implicit dualism. The authors speculate that implicit dualism is very common, and unlikely to vary a great deal between individuals (Sandoboe and Berent, 2021).
All told, given the heterogeneity in effect sizes across the literature, it is sensible to hypothesize that some people would be more influenced by irrelevant neuroscience than others. However, with the exception of domain-specific expertise, no consistent individual difference has yet been found that predicts a predisposition toward the SANE effect. The types of materials and outcomes in each study are much more relevant to the strength of the effect.
Limitations and future directions
As mentioned, there was unexpectedly large heterogeneity. However, this supports the notion that there are several types of SANE effects, depending upon the stimulus, the study design, and whether the main outcome is satisfaction and perceived comprehension, or convincingness of the material.
Relatedly, effects in SANE studies may be limited by several aspects of their experimental designs that do not apply in real-world settings, where understanding of neuroscience may be less accurate, and communicated to persuade, or to align with established beliefs. Indeed, the SANE effect is strongest when the neuroscience matches participants’ opinions, and disappears when it disagrees with their views (Scurich and Shniderman, 2014). In one media analysis (O’Connor and Joffe, 2014), a finding of gender differences in neural connectivity that tested no cognitive or behavioral correlates (Ingalhalikar et al., 2014) was interpreted by some Internet commenters as a scientific basis for their gender stereotypes, and for traditional gender roles in work and parenting. Most of the studies analyzed in this article do not address this impact of neuroscience information.
There is also concern that mass media use neuroscience to promote changes in health and parenting such as brain training games, nutrition, and medication (O’Connor and Joffe, 2013a, 2013b, 2015). This is proposed to cause guilt and anxiety about the difficulty of maintaining endless initiatives for the health and success of oneself and one’s children, while underestimating factors outside personal control, such as genetics and inequality (O’Connor and Joffe, 2015). In addition, neuroscience is frequently used to sell products including “Neuro Drinks” and oxytocin supplements for social bonding (O’Connor and Joffe, 2016). These issues are exacerbated by the fact that null results and refutations of neuroscience findings are underreported (Dumas-Mallet et al., 2017; Racine et al., 2010). This may lead journalists to overestimate the certainty within the field, and the effectiveness of, for instance, psychotropic medication (Bourdaa et al., 2015).
These analyses of media and consumer behavior suggest that brain science can persuade in ways that were not detected in our discovery of the weak SANE effect on convincingness. Some SANE studies with null results (Hook and Farah, 2013; Michael et al., 2013; Schweitzer et al., 2013) tested McCabe and Castel’s (2008) original work, and involved evaluating scientific articles that may have had no clear impact on the lives or views of participants. Participants in laboratory studies may be influenced by demand characteristics to report some agreement with scientific findings of trivial interest to them, regardless of added neuroscience. This may not capture some persuasive aspects of neuroscience in the media, or in laypeople’s personal lives. It should be noted, however, that neuroscience explanations reliably alter ratings of satisfaction and understanding of phenomena such as head-bobbing in lizards (Hopkins et al., 2016; Weisberg et al., 2008, 2018) which laypeople presumably do not feel very strongly about. This is in line with the differences in outcomes we uncovered.
In the future, it is important to connect laboratory findings of the SANE effect with the real-world consequences of popular neuroscience beliefs. For instance, little is known about how consumers rate “brain-based” products in relation to those marketed without extraneous neuroscience. It is similarly important to use experimental methods to understand how SANE effects on satisfaction and feelings of comprehension (especially in the absence of any objective change in understanding) can be used to persuade or amplify beliefs in topics ranging from health to social or political issues.
Conclusion
Neuroscience information, both written and graphic, affects laypeople. In some cases, the effect is statistically significant, but trivial. This is most likely true for brain images per se, and in criminal proceedings where the information or imagery is deployed in an attempt to persuade jurors to release a defendant. Our exploratory analysis also indicates that we should continue to examine what it means to say that neuroscience explanations are seductive. Invoking brain information, while judged irrelevant by experts, seems to produce a feeling of satisfaction, familiarity, and understanding among laypeople. Effect sizes here are typically small or moderate, but not insubstantial, and occur without any objective changes in understanding. However, this inflated sense of satisfaction is often not sufficient to convince most people to agree with a fact or position to which neuroscience has been added, at least not in laboratory conditions using stimuli related to scientific findings. It is therefore less likely that the kinds of irrelevant neuroscience stimuli that have been studied could change attitudes or behavior in real-world situations. However, the apparent ubiquity of neuroscience messaging in marketing and social media suggests that there is at least a perception that brain information is persuasive, and that certain techniques are influential, in ways that researchers have yet to uncover.
Supplemental Material
sj-docx-1-pus-10.1177_09636625231205005 – Supplemental material for Neuroscience explanations really do satisfy: A systematic review and meta-analysis of the seductive allure of neuroscience
Supplemental material, sj-docx-1-pus-10.1177_09636625231205005 for Neuroscience explanations really do satisfy: A systematic review and meta-analysis of the seductive allure of neuroscience by Elizabeth M. Bennett and Peter J. McLaughlin in Public Understanding of Science
Footnotes
Acknowledgements
The authors thank Dr Daniel Bennett and Dr Marc Sylvester for technical assistance. They are also grateful for the assistance of several cited authors for providing their raw data and words of encouragement.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
