Abstract
Across six studies, people used a “bad is black” heuristic in social judgment and assumed that immoral acts were committed by people with darker skin tones, regardless of the racial background of those immoral actors. In archival studies of news articles written about Black and White celebrities in popular culture magazines (Study 1a) and American politicians (Study 1b), the more critical rather than complimentary the stories, the darker the skin tone of the photographs printed with the article. In the remaining four studies, participants associated immoral acts with darker skinned people when examining surveillance footage (Studies 2 and 4), and when matching headshots to good and bad actions (Studies 3 and 5). We additionally found that both race-based (Studies 2, 3, and 5) and shade-based (Studies 4 and 5) associations between badness and darkness determine whether people demonstrate the “bad is black” effect. We discuss implications for social perception and eyewitness identification.
On April 15, 2013, three people died and 264 were injured when two bombs exploded near the finish line of the Boston Marathon. Dozens of cameras captured the moments before and after the explosions. Three days later, the FBI released footage depicting the chief suspects: two brothers in their 20s. Numerous spokespeople publicly identified the brothers as dark skinned. CNN’s John King proclaimed that the FBI had arrested “a dark-skinned individual,” and Boston police officers made similar claims after the FBI captured the brothers (Terkel, 2013). Members of Reddit, an online community, identified alternate suspects who were almost universally dark-skinned (Shontell, 2013). In truth, though, the brothers were far from dark-skinned; they were born in Chechnya, in the Caucasus region of Eastern Europe, and so were literally Caucasian. This incident suggests that, even when people are highly motivated to uncover the truth, they might perceive the skin tone of wrongdoers to be darker than it actually is.
In contrast to CNN’s announcement, only an hour later “CBS News” (2013) reported that the not-in-custody suspect seen on the surveillance video was a “White man.” Why did CNN and CBS reporters allege starkly different descriptions of the same target despite looking at the same footage? In this article, we explore the underlying beliefs that shape the perceptual representations people form of others who have committed acts of malice or charity.
Theoretical Background: “Black Is Bad”
Associations of darkness with negativity, and lightness with positivity, pervade human history. These associations exist, perhaps, for two main reasons. The first possibility is a “shade-based” account. Specifically, the dangerous uncertainty of nighttime and the comparative safety of daytime may have led early humans to fear darkness and associate it with negativity (Schaller, Park, & Mueller, 2003; Williams & Morland, 1976). Without daylight, people were more susceptible to harm because their environment was less visible and so left them open to attacks. Holding the association of darkness and negativity may then have conferred an evolutionary advantage because people would have remained in secure places during times of darkness. These shade-based associations subsequently seeped into human culture. For thousands of years, people have worn white garments in religious practices to assert their moral purity (Eliade, 1996). Similarly, everyday language describes villains and scoundrels as “blackguards” and people who have attained knowledge and intellect as having “seen the light.”
The second possibility is a “race-based” account in which negative attitudes toward African Americans and motivations to maintain racial hierarchies in modern Western society reinforce these associations. Over the course of U.S. history, prejudicial attitudes and discrimination have led African Americans (relative to Whites) to occupy low status occupational positions and living situations, receive less access to educational and health resources, and be stereotyped as criminal (see Maddox, 2004; Maddox & Gray, 2002, for reviews). Overall, both “shade-based” and “race-based” factors are likely to impact how strongly a person associates darkness with negativity.
For both shade- and race-based reasons, general associations between darkness and negativity have become ingrained in human psychology. Indeed, researchers have found that people more quickly associate Blackness with badness, and Whiteness with goodness, whereas they struggle to mentally form the inverse associations (Meier, Robinson, & Clore, 2004; Sherman & Clore, 2009). Moreover, Westerners are more likely to guess that Chinese ideographs have negative meanings when they are printed in black font, and positive meanings when they are printed in white font (Lakens, Semin, & Foroni, 2012).
These associations between Black and bad influence the social evaluations and beliefs we hold about others. People who believed that Barack Obama’s likeness was best captured in artificially darkened photos evaluated him more negatively and were less likely to have voted for him in the 2008 election (West, Pearson, Dovidio, Johnson, & Phills, 2014). Similarly, when professional sports teams wore black uniforms, spectators believed they were behaving more aggressively, and referees penalized them more harshly (Frank & Gilovich, 1988).
Across paradigms, researchers consistently find an association between Blackness and badness. While these examples show that people perceive badness in darkness, the current research goes beyond existing literature in three ways. First, we reversed the usual “black is bad” association, and explored whether individuals believe “bad is black.” Do people represent actors who commit wrongful acts as having darker skin than actors who commit conspicuously moral acts? Testing whether morality affects the ways in which people represent the physical features of others is important given how strongly visual representations shape social interaction. People rely on physical features more so than traits or behaviors to differentiate people (Kessler & McKenna, 1978), to determine trustworthiness (Ratner, Dotsch, Wigboldus, van Knippenberg, & Amodio, 2014), to hire workers (Harrison & Thomas, 2009), and to assign legal punishments (Levinson & Young, 2010).
As a second advance, this research moves beyond a test of racial stereotype- and category-driven social evaluations. Much work has documented the consequences of physical stereotypicality for Black people. Blacks with prototypically Black physical features are more negatively evaluated (Livingston & Brewer, 2002), have fewer non-Black friends (Hebl, Williams, Sundermann, Kell, & Davies, 2012), and are more likely to be mistakenly shot when unarmed (Kahn & Davies, 2011; Ma & Correll, 2011). In addition, both Blacks and Whites with more Afrocentric facial features (e.g., larger lips, wider nose) receive harsher criminal sentences (Blair, Judd, & Chapleau, 2004; Eberhardt, Davies, Purdie-Vaughns, & Johnson, 2006), whereas police use less force on Whites with fewer Afrocentric facial features (Kahn, Goff, Lee, & Motamed, 2016). Closer to the present work, people who are primed to think about crime select darker skinned and more stereotypically Black targets from a line up (Eberhardt, Purdie, Goff, & Davies, 2004; Osborne & Davies, 2013). In contrast to earlier work, we do not manipulate physiognomy, but instead manipulate only skin tone. Although the two are related, and both facilitate racial categorization (Brown, Dane, & Durham, 1998), skin tone is independently tied to the activation of negative race-based stereotypes (Messing, Jabon, & Plaut, 2015) as well as negative evaluations of Black targets among White perceivers (Hagiwara, Kashy, & Cesario, 2012). To extend beyond the study of race and racial stereotypes, we explore factors that affect the shade of representations of White majority group members and Black minority group members. In this way, we are able to isolate the association between darkness and negative actions and test the “bad is black” directional hypothesis broadly, beyond the context of race.
As a third advance, this research examines the basic mechanisms that drive the “bad is black” association. Whereas most existing research focuses on the role of race-based associations, we also examine the unique role of shade-based “bad is black” associations in driving our effects. By separately testing the influence of valenced shade-based associations and race-based associations, we ask whether negative evaluations and actions toward darker skinned individuals may stem not solely from cultural stereotypes but from archaic, metaphoric links between darkness and badness.
Mechanisms of Perception: Race-Based and Shade-Based Associations
In this article, we argue that the “bad is black” effect will occur when people mentally associate darkness with negativity. We consider both race-based and shade-based factors as independent mechanisms behind the “bad is black” effect. The race-based hypothesis suggests that people who hold more negative views of darker skinned racial outgroups may be particularly likely to show these perceptual effects. Although people express both explicit (Nosek, Banaji, & Greenwald, 2002) and implicit (e.g., Devos & Banaji, 2005; Nosek et al., 2007) preferences for White- and light-skinned people over Black- and dark-skinned people, these preferences are not universal (e.g., Devine, 1989). Some people show strong anti-Black biases, whereas others are relatively egalitarian (e.g., Greenwald, McGhee, & Schwartz, 1998). If the strength of the relationship between morality and skin tone is tied to attitudes that people hold about racial groups, we expect people who hold stronger pro-White and anti-minority sentiments to more strongly link moral social evaluations and representations of skin tone.
In addition, we test a shade-based hypothesis, which posits that the degree to which people believe immoral deeds are perpetrated by darker figures is heightened by individual differences in the strength of cognitive associations between darkness and badness. As previously described, the shade-based hypothesis focuses on cognitive associations that stem from a broader taxonomy of good and bad, founded in evolutionary, religious, and cultural origins (see Meier et al., 2004; Sherman & Clore, 2009). The shade-based hypothesis suggests that representations of immoral actors as darker should be strongest among people who associate dark and bad more so than among people who do not.
Overview of Studies: “Bad Is Black”
We present two archival studies and four experiments to examine whether the morality of an actor’s behavior shapes perceptions of his skin tone. We test whether magazines and newspaper articles systematically pair negative or positive deeds with darker or lighter photographs of celebrities (Study 1a) and politicians (Study 1b). We then experimentally test these associations, presenting participants with vignettes or surveillance footage depicting target actions. We manipulated the immorality of targets’ actions, and participants indicated whether the action was conducted by the person portrayed in the darker or lighter photographs (Studies 2, 4, and 5) or indicated the representativeness of photographs depicting the target in varying skin tones (Study 3). Furthermore, we used a moderator-oriented approach to explore the mechanistic processes that produce the “bad is black” effect (Baron & Kenny, 1986); we tested whether participants’ race-based (Studies 2-5) and shade-based beliefs (Studies 4 and 5) moderated this tendency to assume immoral actions were performed by darker skinned individuals.
Study 1a: Celebrity Depictions in Magazines
In Study 1a, we asked whether visual depictions of celebrities are systematically biased in ways that support the “bad is black” association. Anecdotal allegations suggesting that magazines alter the skin tone of celebrities in photos accompanying articles are common. In 1994, Time Magazine was said to have darkened a cover image of accused murderer O. J. Simpson. In 2010, Elle Magazine was said to have lightened its cover image of Oscar-nominated actress Gabourey Sidibe. Critics contended that Time had darkened the image of Simpson to make him appear more menacing, and that Elle had lightened the image of Sidibe to make her more appealing. Although these anecdotal reports focus on Black celebrities, we examined the photos that accompanied articles about both Black and White celebrities. We expected the skin tone of celebrities to be darker when photographs accompanied critical articles rather than complimentary articles, regardless of the celebrity’s race or gender.
Method
Database development
To develop our database, we selected the five most popular White and Black, female and male celebrities from TV Guide’s “Most Popular Celebrities” list of 2013. We then searched Forbes’ Celebrity 100 list for 2011-2013. The Forbes list ranks celebrities by a power index that includes media attention, compiled earnings, social media impact, and impact on the entertainment industry and culture. We added to the list Black and White, female and male celebrities that appeared in the top 10 ranked positions for each year. Given that some celebrities appear on the list from year to year, and are cross-listed, these search criteria produced a total of 34 celebrities.
Next, we created a database of online news articles about each celebrity by first selecting 10 celebrity gossip news sources (see Online Supplement, for more information). We selected specific articles by pulling five articles from each of the 10 selected news sources according to the following criteria: The article included a photograph of the celebrity in which the face was visible (profiles permitted) and not obstructed by sunglasses, and the article’s content had to focus on the target celebrity. Articles that met these criteria were then filtered chronologically. We selected the most recent article, and then only one article per month moving back in time, to ensure we covered a diverse range of articles on each celebrity. The goal was to amass a total of 50 articles per celebrity. Not every news outlet covered every celebrity as extensively across time, thus our selection criteria provided a final sample of 1,550 articles.
Coding article valence
Trained research assistants coded text files containing only the written content of the articles, with photographs removed. After reading four extreme examples to serve as anchors, they coded article valence on a scale labeled −2 (very negative), 0 (neutral, reporting just the facts), and +2 (very positive).
Skin tone luminance
Research assistants assessed the luminosity of celebrities’ exposed skin in the photograph accompanying each article using Adobe Photoshop CS6. They used the quick selection tool to isolate only parts of the photo with exposed skin. They excluded eyes, lips, fashion accessories (e.g., sunglasses perched on a target’s head), hair, background, and clothes. As our measure of skin tone lightness, we used the mean luminosity value across all pixels in a given face. Values range from 0 (darkest) to 255 (lightest). We excluded four photographs (0.26% of all photographs) that contained mean skin tone values three standard deviations above or below the grand mean of all photographs.
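The luminance measure and the outlier rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure run in Photoshop: the Rec. 601 luma weights, the use of the sample standard deviation, and the function names (`pixel_luminance`, `mean_skin_luminance`, `drop_outliers`) are all assumptions introduced here, and the skin mask stands in for the manual quick-selection step.

```python
from statistics import mean, stdev

# Assumption: Rec. 601 luma weights. Photoshop's luminosity histogram may
# use a different weighting; the source does not specify one.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def pixel_luminance(rgb):
    """Weighted luminance of one (R, G, B) pixel on the 0-255 scale."""
    r, g, b = rgb
    wr, wg, wb = LUMA_WEIGHTS
    return wr * r + wg * g + wb * b


def mean_skin_luminance(pixels, skin_mask):
    """Average luminance over the pixels flagged as exposed skin.

    `skin_mask` is a hypothetical stand-in for the manual selection that
    excluded eyes, lips, accessories, hair, background, and clothes.
    """
    selected = [pixel_luminance(p) for p, keep in zip(pixels, skin_mask) if keep]
    if not selected:
        raise ValueError("skin mask selects no pixels")
    return mean(selected)


def drop_outliers(values, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the grand mean."""
    grand_mean = mean(values)
    sd = stdev(values)
    return [v for v in values if abs(v - grand_mean) <= n_sd * sd]
```

Applied to one mean value per photograph, `drop_outliers` mirrors the three-standard-deviation exclusion; whether the original analysis used the sample or population standard deviation is not stated.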
Control variables
Research assistants coded each photograph on additional dimensions that may have directly affected skin tone luminance. First, assistants evaluated photographs on professional composition. We explained that photographs can balance contrast, shading, and light to different degrees, and might be executed technically and without flaw while others seem fuzzy or unclear, and some photographs are framed better than others. Considering these factors, assistants coded the degree to which the photograph appeared professionally composed using a 1 (not at all) to 4 (extremely) scale. Second, assistants evaluated photographs on the celebrities’ physical appearance. We explained that people may be well dressed or disheveled, might appear caught off guard, or their facial expressions unusual. Considering these factors, assistants coded the degree to which the celebrity appeared “put together” using a 1 (not at all) to 4 (extremely) scale. When it was possible to determine, assistants coded whether the photograph was taken inside or outside (n = 1,534) and, if outside, whether during the day or night (n = 892).
Results and Discussion
Primary analysis
Because each celebrity had multiple photographs, we constructed a MIXED model that accounted for the non-independence of the skin tone values among each celebrity’s photographs (Fitzmaurice, Laird, & Ware, 2011). We specified a compound symmetry covariance matrix, which assumes that ratings within a celebrity are equally correlated. This procedure uses the Satterthwaite method to calculate degrees of freedom, which adjusts the degrees of freedom to account for non-independence among the repeated factor (photos within celebrity) and can result in fractional values (Kenny, Kashy, & Cook, 2006). To test our main prediction, we first constructed a model that included valence (grand-mean centered) as a predictor. We also included target race and gender as main effect predictors to examine the effect of valence above and beyond these variables. The mean luminance value for skin tone was the dependent variable.
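The main-effects model described above can be written out as follows. The notation is introduced here for illustration and is not taken from the original analysis: i indexes celebrities, j and k index photographs of the same celebrity, and ρ is the common within-celebrity correlation implied by compound symmetry.

```latex
\[
y_{ij} = \beta_0
       + \beta_1\,\mathrm{valence}_{ij}
       + \beta_2\,\mathrm{race}_{i}
       + \beta_3\,\mathrm{gender}_{i}
       + \varepsilon_{ij},
\qquad
\operatorname{Cov}(\varepsilon_{ij}, \varepsilon_{ik}) =
\begin{cases}
\sigma^{2} & \text{if } j = k,\\
\rho\,\sigma^{2} & \text{if } j \neq k.
\end{cases}
\]
```

Here y is the mean skin tone luminance of a photograph, valence is grand-mean centered, and residuals for photographs of different celebrities are assumed uncorrelated.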
There was a significant main effect of race, b = 12.54, SE = 2.53, t(30.76) = 4.97, p < .001, 95% confidence interval (CI) = [7.39, 17.69], indicating that White celebrities had higher luminance values (i.e., lighter skin tone) than Black celebrities. There was a main effect of gender, b = 7.33, SE = 2.46, t(30.86) = 2.99, p = .006, 95% CI = [2.32, 12.35], such that female celebrities were pictured with lighter skin tone than male celebrities. Importantly, as predicted, there was a main effect of article valence, b = 2.54, SE = 0.69, t(1532.53) = 3.69, p < .001, 95% CI = [1.19, 3.89]. Negative articles tended to be accompanied by photographs of celebrities depicted with darker skin tones. We also specified a random effect of celebrity, to account for the possibility that individual celebrities might drive our effects. The random effect of celebrity was not significant (p = .72), and article valence still predicted skin tone lightness (p < .001).
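As a rough arithmetic check, the reported interval for article valence can be approximately recovered from the coefficient and its standard error. The sketch below uses a normal approximation, which is an assumption that is only reasonable here because the degrees of freedom for valence are large (about 1,532); the helper name `wald_summary` is hypothetical.

```python
from statistics import NormalDist


def wald_summary(b, se, level=0.95):
    """Return the ratio b / se and a normal-approximation confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return b / se, (b - z * se, b + z * se)


# Reported values for the main effect of article valence in Study 1a.
ratio, (lower, upper) = wald_summary(2.54, 0.69)
print(round(ratio, 2), round(lower, 2), round(upper, 2))
```

The ratio computed from the rounded coefficient and standard error (about 3.68) differs slightly from the reported t of 3.69, as expected when the inputs are rounded. For the race and gender effects, whose degrees of freedom are near 31, a t critical value rather than 1.96 would be needed.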
We additionally constructed a model that tested for higher order interactions. The model included predictor variables of target race (White = 1, Black = −1), target gender (female = 1, male = −1), article valence (grand-mean centered), and all interactions. The interaction of valence and gender was significant, b = 1.82, SE = 0.71, t(1529.58) = 2.56, p = .01, 95% CI = [0.42, 3.21], suggesting that the relationship between lightness and favorability of the articles was especially pronounced among women. In addition, there was a marginal three-way interaction predicting luminance, b = −1.32, SE = 0.71, t(1529.58) = −1.85, p = .06, 95% CI = [−2.71, 0.08] (Figure 1). The general pattern suggests that although favorable articles are generally associated with pictures of lighter skin tone across all targets, this relationship may not occur among Black men, who seem to be consistently portrayed with darker skin tones.

Figure 1. Skin tone lightness (mean luminance; higher values indicate lighter skin tone) as a function of the valence of news articles (plotted at +1/−1 SD from the mean), celebrity race, and celebrity gender.
Adding control factors
We re-ran the planned main effects mixed model analysis to include additional predictor variables that might have plausibly affected skin tone luminance. The reported main effect of valence remained significant when adjusting for the photograph’s professional quality and how put together the celebrity appeared. In addition, we re-ran the original mixed model, including whether the photograph was taken outside or inside, and whether it was taken during the day or night when it was possible to determine time of day. Including these control factors, the main effect of valence remained significant (see Online Supplement).
Our results suggest that magazines systematically choose lighter celebrity photos to accompany complimentary gossip articles, and darker photos to accompany critical articles. It is difficult to determine, based on this study, which of these choices was made inadvertently and which was made intentionally—as happened in a number of prominent photoshopping scandals. Based on our large sample, we believe that editors made a large portion of these selections unconsciously.
Study 1b: Politician Depictions in Articles
In Study 1b, we sought to replicate the findings of Study 1a within the domain of politics. Both empirical and anecdotal evidence suggest that politicians’ skin tone is subject to distortion. Darker images of Obama were more frequently paired with negative ads during the 2008 presidential campaign (Messing et al., 2015), a coupling that Hillary Clinton’s campaign staff members were thought to have intentionally presented in ads aired during the presidential primaries. In Study 1b, we extended these findings outside of strategic political campaign ads and examined whether politicians’ photos in everyday news articles would be shaped by the content of the article. In addition, this study extends previous work by examining photographs of a variety of different politicians who vary in their race and gender. We expected the skin tone of politicians to be darker when photographs accompanied critical rather than complimentary articles. We also examined whether the unpredicted moderation by race and gender found in Study 1a would replicate.
Method
Database development
We selected politicians for inclusion by compiling a list of all Black members of Congress and Cabinet members of the Executive Branch from 1997 through 2014. We then selected White matches for each Black politician. Each match held the same position, represented the same state if applicable, shared the same gender and party affiliation (as categorized by CQ Press Electronic Library, 2013), and matched the most recent year in office. We included all White matches to a Black politician if they fit these criteria. If a White match could not be found within a Black politician’s state, we expanded our search for a match to neighboring states. Politicians who switched party affiliation while in office were eliminated. Following these procedures, the database included 153 unique politicians (16 Black women, 37 Black men, 27 White women, and 73 White men).
Selection of articles and photos
We selected online news articles by searching a politician’s name in the “news” section of Google. We included articles from news sources included in assessments of political bias of the source conducted by Tim Groseclose (Groseclose, 2011; Groseclose & Milyo, 2005) or AllSides.com (Gable et al., 2013). Groseclose “slant quotients” (values available in Groseclose, 2011), ranging from 0 (most conservative news bias) to 100 (most liberal news bias), are based on content analyses of charged political phrases that appeared in print from that source (see also Gentzkow & Shapiro, 2010). AllSides Bias ratings are based on crowd-sourced evaluations of bias in the news; sources are categorized as Left (coded as 10), Leans Left (30), Center (50), Leans Right (70), and Right (90). We excluded all editorials, op-eds, and blog posts given that the bias of the author could not be objectively assessed and may deviate from the bias of the source in general. Articles included in the database were of a political nature, and the selected politician was a primary character in the narrative. Each article had to include a color image of the target politician because black-and-white photographs may obscure perceptual distinctions in skin tone (Blair, Judd, Sadler, & Jenkins, 2002). The photograph had to present the politician’s face clearly and have no fewer than 72 pixels per inch (image DPI). Articles presenting videos rather than photos were excluded. All articles that fit these criteria were included in the database. Our database included 3,126 unique articles.
Skin tone luminance
Mean skin tone luminance was calculated in the same way as in Study 1a, with higher values indicating greater lightness. We excluded three photographs (0.10% of all photographs) that contained mean skin tone values three standard deviations above or below the grand mean of all photographs.
Coding article valence
Research assistants in the lab and online participants each coded a small but variable portion of the text files containing only the written content of the articles, with photographs removed. After reading four extreme examples to serve as anchors, they coded article valence on a scale anchored at −2 (very negative), 0 (neutral, reporting just the facts), and +2 (very positive). Articles could receive multiple ratings from coders, so long as the coder spent at least 10 s reading the article, a threshold meant to ensure depth of processing. Ratings were averaged for each article to create a single valence score for each article. We assessed interrater reliability by calculating an intraclass correlation (ICC; Shrout & Fleiss, 1979). Reliability across raters was highly significant, ICC = .29, SE = 0.03, z = 8.73, p < .001, 95% CI = [0.22, 0.35].
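The aggregation rule described above (keep only ratings with at least 10 s of reading time, then average within article) might look like the following minimal sketch. The tuple layout and the function name `article_valence` are assumptions made for illustration; the source does not describe how the ratings were stored.

```python
from collections import defaultdict
from statistics import mean

MIN_READING_SECONDS = 10  # threshold stated in the text


def article_valence(ratings):
    """Average valence per article, ignoring ratings read too quickly.

    `ratings` is an iterable of (article_id, score, seconds_spent) tuples,
    with score on the -2 to +2 scale. This layout is hypothetical.
    """
    by_article = defaultdict(list)
    for article_id, score, seconds in ratings:
        if seconds >= MIN_READING_SECONDS:
            by_article[article_id].append(score)
    return {article_id: mean(scores) for article_id, scores in by_article.items()}
```

Articles whose ratings all fall below the threshold simply do not appear in the result, which is one possible way to handle them; the source does not say what was done in that case.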
Control variables
Research assistants coded how put together the politician looked and the professional quality of each photograph in the same way as in Study 1a.
Results and Discussion
Primary analysis
Because each politician was depicted in multiple photos, we constructed a MIXED model and specified a compound symmetry covariance matrix. To test our main prediction, we first constructed a model that included valence (grand-mean centered) as a predictor. We also included target race and gender as main effect predictors to examine the effect of valence above and beyond these variables. The mean luminance value for skin tone was the dependent variable.
There was a significant main effect of race, b = 13.51, SE = 1.02, t(130.78) = 13.27, p < .001, 95% CI = [11.50, 15.52]. White politicians had lighter skin tone than Black politicians. The effect of gender was not significant (p = .76). Importantly, as predicted, there was a main effect of article valence, b = 0.73, SE = 0.34, t(3096.34) = 2.15, p = .03, 95% CI = [0.06, 1.40]. Negative articles tended to be accompanied by photographs of politicians depicted with darker skin tones. We also specified a random effect of politician, to account for the possibility that individual politicians might drive our effects. The random effect of politician was not significant (p = .20), and article valence still predicted skin tone lightness (p = .03).
We additionally constructed a model that tested for higher order interactions. The model included predictor variables of target race (White = 1, Black = −1), target gender (female = 1, male = −1), article valence (grand-mean centered), and all interactions. No interaction effects were significant (ps ≥ .12), indicating the association between positive valence and lighter skin tone representation was not significantly influenced by politician’s race or gender.
Adding control factors
We re-ran the planned main effects mixed model analysis, adjusting for ratings of the photograph’s professional quality and how put together the politician appeared. The main effect of valence became marginally significant when adjusting for these factors (see Online Supplement).
These findings conceptually replicate those of Study 1a. Politicians were presented as having lighter skin to the extent that the article paired with their photograph was more positive in its content. In addition, this association occurred regardless of the politician’s race or gender.
Study 2: Surveillance Footage and Symbolic Racism
Studies 1a and 1b provide correlational evidence suggesting that people link negative acts to dark skin. Study 2 and the studies that follow use experimental paradigms to test whether the visual representations of a target individual shift as a function of knowing whether he has engaged in an act of virtue or vice. Although person perception research has firmly established that individual and situational variables shape how actions are interpreted and evaluated (e.g., Allport & Postman, 1947; Darley & Gross, 1983), the perception of people’s physical features is often assumed to be a fixed and immutable starting point for later judgment. We ask whether people see the skin tone of target individuals in systematically biased ways that reflect the “bad is black” association. Related to our point, Osborne and Davies (2013) found that people remembered the perpetrator of a stereotypically Black crime as being more prototypically Black. We extend this work by manipulating the skin tone of potential perpetrators. In doing so, we are able to directly examine the effect of negative acts on darker skin tone and rule out the role of other prototypically Black characteristics (e.g., wider nose, thicker lips) that covary with skin tone. In addition, unlike Osborne and Davies (2013), we used a crime that we reasoned to be generally immoral and not strongly associated with race.
Moreover, Study 2 tested whether the mental representation formed of the target individual can be predicted by individual differences in race-based biases. We expected people who show stronger pro-White biases to endorse the link between morality and skin tone more strongly. Race-based biases should moderate the effect of social evaluations on representations of skin tone.
Method
Participants
One hundred fifty-three adult participants (85 females) completed a brief study in exchange for 50 cents using MTurk. According to participants’ self-reports, 39% were White, 39% Asian, 8% Hispanic or Latino, 2% Black, and 10% belonged to an unlisted racial or ethnic group. Participants’ race did not moderate the results in this or any of the remaining studies. A post hoc power analysis indicated that we possessed at least 75% power to detect our predicted effect of race-based beliefs on skin tone perceptions of immoral and moral actors.
Procedure
The experiment consisted of two separate components. First, participants completed a simple eyewitness identification task. They saw a grainy surveillance image that featured a man, and a brief written sentence that described the man’s actions shortly before or after the image was captured. Alongside the surveillance image, they saw clear, cropped headshots of two men. Participants completed two trials, each of which presented a separate surveillance image and a different pair of headshots. In one trial, the man captured on film was described as having committed a good act (returning a missing wallet containing US$1,200 to a local charity). In the other trial, the man captured on film was described as having committed a bad act (running down an elderly pedestrian with his car). We describe the actions as moral and immoral in this study and the ones that follow because participants themselves elect to use these terms to describe these actions (see Online Supplement for more information).
After viewing each piece of surveillance footage and reading what actions the man committed, participants moved to a separate web page and indicated which of the two men depicted in the clear headshots appeared in the surveillance image. They responded along a 6-point continuum that ranged from 1 (definitely Person A) to 6 (definitely Person B). In fact, neither Person A nor Person B was depicted in the surveillance image; all four headshots (two after each piece of surveillance footage) were taken from local political election websites across the United States. Using Photoshop, we deliberately darkened one of the images, whereas we deliberately lightened the other image (see Online Supplement). We counterbalanced which target was darkened, and which action was presented with these targets, across participants. We coded participants’ responses so higher scores reflected greater certainty that the person depicted in the surveillance footage matched the person in the darkened photo (which we call the Dark Representative Index).
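The response coding described above amounts to reverse-scoring the 6-point rating whenever Person A, rather than Person B, was the darkened target. A minimal sketch follows; the function name and the "A"/"B" labels for the counterbalancing condition are assumptions made for illustration.

```python
def dark_representative_index(response, darkened_person):
    """Recode a 1-6 rating so higher scores favor the darkened headshot.

    response: 1 (definitely Person A) to 6 (definitely Person B).
    darkened_person: "A" or "B", the headshot that was darkened for this
    participant (hypothetical labels for the counterbalancing condition).
    """
    if response not in range(1, 7):
        raise ValueError("response must be an integer from 1 to 6")
    if darkened_person == "B":
        return response      # scale already points toward the darkened target
    if darkened_person == "A":
        return 7 - response  # reverse-score so that 6 means "definitely dark"
    raise ValueError("darkened_person must be 'A' or 'B'")
```

For example, a participant who answers 2 (leaning toward Person A) when Person A was darkened receives an index of 5.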
In a second, unrelated component of the experiment, participants completed the Symbolic Racism scale (Henry & Sears, 2002), which measures anti-Black attitudes.
Results and Discussion
Participants were generally no more certain that the darker skinned man committed the immoral act (M = 3.51, SD = 1.11) than the moral act (M = 3.65, SD = 1.04),
To examine the relationship between symbolic racism and endorsement of the “bad is black” association, we regressed participants’ Dark Certainty Index scores on their symbolic racism scores. As expected, higher symbolic racism scores were associated with a stronger belief that the dark-skinned man committed the immoral (vs. moral) act, b = .92, SE = .35, 95% CI = [0.23, 1.60], t(151) = 2.66, p = .009, rsp = .21 (see Figure 2).

Figure 2. Influence of symbolic racism on participants’ skin-tone ratings of the good and bad actors in the surveillance images.
These results show that people with stronger anti-Black beliefs were more likely to link immoral actions to people with darker skin. Although this study provides support for our predictions, we included a limited array of moral actions in the study. As such, we conducted a second study designed to examine the same basic questions using a slightly different paradigm and a larger set of moral actions.
Study 3: Matching Skin Tones and Actions
In Study 2, the men depicted in the grainy surveillance footage were impossible to identify, which gave participants plenty of leeway when they completed the matching task. In Study 3, we conducted a more conservative test of our hypothesis by limiting the ambiguity of the target actors. By presenting an actual photograph of the target alongside a description of his actions, we limited the extent to which participants could distort the information that drove their representations of the target (Kunda, 1990; Sanitioso, Kunda, & Fong, 1990). Furthermore, we again tested the race-based hypothesis with a new measure of negative affect toward darker-skinned individuals, to more rigorously examine whether race-based beliefs explain the association between immoral deeds and darker actors.
Method
Participants
Ninety-one adults (45 females) completed an MTurk study in exchange for 50 cents. The majority of participants identified themselves as White (76%), and the remaining participants as Asian (7%), Black (9%), Hispanic or Latino (4%), or belonging to another ethnic group (4%). Post hoc power analyses indicated that we had at least 95% power to detect all significant predicted effects.
Procedure
Participants completed a two-part questionnaire. First, they were asked to decide how well three different photographs captured the likeness of a target individual. They completed this process twice: once for a target who had committed a morally good act and once for a target who had committed a morally bad act. In each case, they began by seeing a photo of the individual with a caption describing his action. We created several different versions of the study with different moral and immoral actions to ensure that our results generalized across a range of actions. The moral actions included risking his life to save three people from a burning house, jumping into a river to save two dogs from drowning, and establishing a charity to help children of fallen Iraq War veterans pay for college. The immoral actions included serially murdering seven people, locking his daughter in his basement, and adopting an orphaned brother and sister and abusing them over many years.
After reading about the individual and his behavior, participants rated how well three different photographs captured his true character. They also indicated how representative each photograph was of him (1 = not at all to 7 = a great deal). As responses to the character and representativeness questions were highly correlated, r = .77, p < .001, we averaged them to create a single score. The photos were headshots that captured the individual in three different poses. One of the photos was unchanged, one was darkened, and the other was lightened using Photoshop. For each action, we created a Dark Representativeness Index (adapted from Caruso, Mead, & Balcetis, 2009) to capture how much more representative participants believed the darker image was than the lighter image, controlling for their representativeness ratings on the unchanged image (i.e., [DARKrepresentativeness − LIGHTrepresentativeness] / UNCHANGEDrepresentativeness).
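As a rough illustration of the index arithmetic described above, the following sketch computes the Dark Representativeness Index for a single participant. The function name and the example ratings are hypothetical and are not taken from the study data; only the formula follows the text, and the 1 to 7 range for the averaged rating is an assumption.

```python
def dark_representativeness_index(dark, light, unchanged):
    """(DARK - LIGHT) / UNCHANGED, following the formula in the text.

    Each argument is one participant's averaged character/representativeness
    rating for that photo. Ratings are assumed to lie on a 1-7 scale, so the
    denominator is never zero.
    """
    if unchanged <= 0:
        raise ValueError("unchanged-photo rating must be positive")
    return (dark - light) / unchanged


# Hypothetical ratings for one participant (not real data).
immoral_index = dark_representativeness_index(dark=5.0, light=4.0, unchanged=4.0)
moral_index = dark_representativeness_index(dark=3.0, light=4.0, unchanged=4.0)

# Positive values mean the darkened photo was judged more representative.
print(immoral_index)                # 0.25
print(moral_index)                  # -0.25
print(immoral_index - moral_index)  # 0.5
```

The final line, the immoral-act index minus the moral-act index, corresponds to the difference score that serves as the outcome in the regression reported in the Results.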
Finally, participants completed several feeling thermometer scales, indicating how they felt about Whites, Blacks, Hispanics, Latinos, Arabs, and Muslims (from 0 = cold to 100 = warm). We combined the five ratings of non-White groups to form an index of how participants felt about darker skinned people, which we refer to as non-White for ease of discourse (α = .92).
Results and Discussion
Primary analyses
According to their Dark Representative Index scores, participants generally believed that darker skinned photos were more representative of immoral behaviors (M = .13, SD = 0.55) than of moral behaviors (M = −.07, SD = 0.23),
We expected participants to show this pattern more strongly when they generally felt warmer toward Whites than darker skinned groups, so we regressed the difference between their immoral-act and moral-act Dark Representative Index scores on the Non-White Warmth Index and the White Warmth Variable, both centered. Overall, the model was significant, R2 = .13, F(2, 88) = 6.55, p = .002. Warmer positive feelings toward Whites were associated with greater beliefs that the immoral (vs. moral) actor had darker skin, b = .01, SE = .003, 95% CI = [0.01, 0.02], t(88) = 3.62, p < .001, rsp = .36. In contrast, participants’ feelings toward non-Whites did not significantly influence their representation of the immoral (vs. moral) actor’s skin tone, b = −.004, SE = .003, 95% CI = [−0.01, 0.002], t(88) = −1.43, p = .16, rsp = −.14. These findings further support the “bad is black” effect, as well as the suggestion that this effect is exaggerated by race-based beliefs.
Study 4: Surveillance Footage and Color Associations
Studies 2 and 3 provided evidence that warmer feelings toward White individuals strengthen the degree to which people ascribe immoral acts to darker skinned people. However, more basic associations between lightness and goodness may also drive differences in the mental representation of others. Here, we begin to test the shade-based hypothesis, which suggests that immoral people are represented as darker skinned based on implicit cognitive associations that stem from sources of input unrelated to social judgments, such as fear of the dark.
Importantly, Study 4 first tests the unique effects of both the race-based and shade-based hypotheses on perceptual representations of others’ skin tone. As in Study 3, participants reported the degree to which they feel warmth toward Whites and an array of darker-skinned minority groups in general; however, in this study, we asked about warmth toward an array of darker skinned targets in addition to African Americans. This extends the race-based hypothesis to feelings and evaluations about darker skinned minority groups in general.
To assess the degree to which participants cognitively associated shades with valence, they indicated the color of the target’s soul on a spectrum that ranged from black through gray to white. Target soul color is a metaphorical representation of shade-based beliefs, fully and effectively distinct from skin tone. Accordingly, we predicted unique main effects of race-based beliefs and shade-based associations, such that participants who felt colder toward non-White Americans and those who reported that immoral (compared with moral) actions were conducted by a man with a darker (compared with lighter) soul would be more certain that a darker skinned man engaged in the vicious act and a lighter skinned man engaged in the virtuous act.
Method
Participants
In exchange for 50 cents, 136 MTurk workers (n = 90 female) participated. Of them, 51 were White (38%), 60 were Asian (44%), five were African American (4%), nine were Hispanic or Latino (7%), and 11 reported being more than one race or another race (8%). Post hoc power analyses indicated that observed power for all significant predicted effects ranged from 47%-92%.
Procedure
Participants indicated how well different photographs captured the likeness of a target individual. Just as in the paradigm created for Study 2, participants first saw a surveillance photograph of an individual with a caption describing his action. Participants completed two trials, one in which the action was moral and one in which the action was immoral. After reading about the individual and his behavior, participants saw two headshots depicting different men also used in Study 2, in which one man’s skin tone was darkened and the other was lightened. They indicated which of these two men was depicted in the surveillance footage using a 6-point scale ranging from 1 (definitely Person A) to 6 (definitely Person B). As in Study 2, we created a Dark Certainty Index by subtracting participants’ certainty that the moral action was conducted by the darker of the two men from their certainty that the immoral action was conducted by the darker of the two men.
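The scoring just described can be sketched as follows. This is a minimal illustration, not the authors' scoring script: the reverse-scoring rule (7 minus the response when the darkened photo is Person A), the function names, and the example responses are all assumptions introduced here.

```python
SCALE_MAX = 6  # 1 = definitely Person A, 6 = definitely Person B


def dark_certainty(response, dark_is_person_b):
    """Recode a 1-6 response so higher = more certain it was the darkened man.

    Assumption: when the darkened photo is Person A, the response is
    reverse-scored as 7 - response.
    """
    if not 1 <= response <= SCALE_MAX:
        raise ValueError("response must be between 1 and 6")
    return response if dark_is_person_b else (SCALE_MAX + 1) - response


def dark_certainty_index(immoral_score, moral_score):
    """Certainty for the immoral act minus certainty for the moral act."""
    return immoral_score - moral_score


# Hypothetical participant (not real data).
immoral = dark_certainty(5, dark_is_person_b=True)   # stays 5
moral = dark_certainty(4, dark_is_person_b=False)    # 7 - 4 = 3
print(dark_certainty_index(immoral, moral))          # 2
```

A positive index means the participant was more certain that the darkened man committed the immoral act than the moral act.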
Finally, participants completed measures that assessed the degree to which they cognitively associate negativity with darkness. They were reminded about each action committed by the man in the surveillance photograph and indicated the “color of his soul.” Participants saw a spectrum ranging from black through gray to white. They clicked on the shade in the spectrum that best reflected the color of the man’s soul who had performed the moral and immoral actions. We created a Cognitive Shade Association Index by subtracting the degree to which the moral action was committed by a darker souled man from the degree to which the immoral action was committed by a darker souled man (see Online Supplement).
To assess race-based beliefs, participants responded to feeling thermometers. They indicated how warmly they felt toward darker skinned people including African Americans, Muslim Americans, Latino Americans, Hispanic Americans, and Arab Americans (from 0 = cold to 100 = warm). They also indicated how warmly they felt about Caucasian/White Americans, which became our White Warmth Index. We created a Non-White Warmth Index by averaging feelings toward all non-White Americans.
Results and Discussion
In general, participants were no more certain that the darker skinned man committed the immoral act (M = 3.63, SD = 0.81) than the moral act (M = 3.71, SD = 0.87),
There was a main effect of the Cognitive Shade Association Index, b = .009, SE = .003, 95% CI = [0.004, 0.014], t(132) = 3.38, p = .001, rsp = .28. The more participants associated immoral rather than moral actions with darker rather than lighter colored souls, the more certain they were that the darker skinned man committed the immoral (vs. moral) act. There was also a main effect of the Non-White Warmth Index, b = −.01, SE = .006, 95% CI = [−0.02, 0.0001], t(132) = −1.99, p = .05, rsp = −.16. The more warmly participants felt toward non-Whites, the less certain they were that the immoral rather than moral actions were committed by darker skinned men. There was no main effect of the White Warmth Index, b = .002, SE = .005, 95% CI = [−0.01, 0.01], t(132) = 0.35, p = .73, rsp = .03.
Study 4 also gave us the opportunity to examine the independent contributions of race-based associations between dark skin and badness, on one hand, and shade-based associations between darkness and badness, on the other. We ran two separate hierarchical regressions to examine the unique contributions of each variable. In the first analysis, we entered the White warmth and non-White warmth indices in the first step, and the Cognitive Shade Association Index scores in the second step. The Cognitive Shade Association Index uniquely predicted participants’ association between bad acts and darker skinned targets, R2Δ = .078, FΔ(1, 132) = 11.45, p = .001, suggesting that the association between darker tone and badness is an independent predictor beyond race-based color associations. In the second analysis, we reversed the process, entering Cognitive Shade Association Index scores in the first step, and the two racial warmth indices in the second step. The race-based warmth variables did not significantly predict the association between bad acts and darker skinned targets over and above tone-badness associations, R2Δ = .031, FΔ(2, 132) = 2.27, p = .11. Accordingly, only participants’ shade-based beliefs uniquely and independently predicted their tendency to believe that darker skinned men were responsible for bad acts.
Study 5: Shade-Based Associations and Valence
In Study 5, we tested the unique influences of race-based and shade-based beliefs on “bad is black” perceptual representations. To assess whether these judgments reflected race-based beliefs, participants again responded to feeling thermometers for White and darker skinned Americans. To assess whether judgments reflected shade-based associations, we used a measure of these associations that is expressly independent of the target actor. We adapted a paradigm developed by Lakens et al. (2012) in which participants indicated which of two Chinese ideographs, one of which appeared in white font and the other in black font, depicted an English word.
We again predicted that darker skinned individuals would more often be identified as the perpetrators of immoral acts than lighter skinned individuals, who would more often be identified as having engaged in moral acts. We explored whether this pattern of judgment was moderated by race-based beliefs and shade-based associations.
Method
Participants
In exchange for 50 cents, 391 (n = 247 female) participants completed a survey on MTurk. Of these participants, the majority were White (79%), and the remaining participants were Asian (5%), African American (6%), Hispanic or Latino (5%), or indicated belonging to another racial or ethnic group (5%). Post hoc power analyses indicated that observed power for all significant predicted effects ranged from 51%-89%.
Procedure
Participants indicated which of two men who appeared simultaneously on the screen committed a particular action. The photographs depicted headshots of White males in which one man’s skin tone was darkened and the other was lightened using Photoshop. Whether each male was lightened or darkened was counterbalanced between participants. We created several different versions of the study with different photographs of men to ensure that our results were not specific to one set of photographs.
Participants completed two trials: one featuring one of three moral actions (e.g., a man who raised money for a youth charity), and one featuring one of three immoral actions (e.g., a man convicted of embezzlement). We created several versions of the survey where the specific moral and immoral actions were counterbalanced between subjects (see Online Supplement).
We created a Dark-Skinned Selection Index by subtracting the number of times participants selected the darker man when presented with the moral action from the times they selected the darker man for the immoral action.
To assess race-based associations, participants completed the same feeling thermometer measures as in Study 4. To assess cognitive associations with shades, participants completed a thin-slice language task. They saw an English word and indicated which of two Chinese ideographs depicted that English word. The English words were either positive (e.g., love, gift, and sunshine) or negative (e.g., hate, war, and frustrating). One ideograph appeared in white and the other in black font. Questions were depicted on a gray background. To compute a Cognitive Shade Association Index, we calculated the percentage of times participants selected the black ideograph for the positive English word and subtracted that from the percentage of times participants selected the black ideograph for the negative English word.
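The two Study 5 indices described above reduce to simple difference scores, which the sketch below makes concrete. The function names, the number of trials, and the example choices are hypothetical; the text does not report how many ideograph trials each participant completed.

```python
def percent_black_chosen(choices):
    """Percentage of trials on which the black ideograph was chosen.

    `choices` is a list of booleans, True when the black ideograph was picked.
    """
    if not choices:
        raise ValueError("need at least one trial")
    return 100.0 * sum(choices) / len(choices)


def cognitive_shade_association_index(negative_choices, positive_choices):
    """% black chosen for negative words minus % black chosen for positive words."""
    return percent_black_chosen(negative_choices) - percent_black_chosen(positive_choices)


def dark_skinned_selection_index(chose_dark_for_immoral, chose_dark_for_moral):
    """Selections of the darker man for the immoral act minus for the moral act.

    With one trial of each kind, the result is -1, 0, or 1.
    """
    return int(chose_dark_for_immoral) - int(chose_dark_for_moral)


# Hypothetical participant: 4 negative-word and 4 positive-word trials.
negative = [True, True, True, False]    # black chosen on 3 of 4 -> 75%
positive = [True, False, False, False]  # black chosen on 1 of 4 -> 25%
print(cognitive_shade_association_index(negative, positive))  # 50.0
print(dark_skinned_selection_index(True, False))              # 1
```

On both indices, positive values indicate a stronger pairing of darkness with the negative or immoral option.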
Results and Discussion
In general, participants were no more likely to select the darker skinned man for the immoral or moral act,
However, we anticipated that beliefs regarding who committed immoral and moral acts would be moderated by cognitive shade-based associations as well as race-based beliefs. To test this hypothesis, we ran a regression predicting the Dark-Skinned Selection Index from the Cognitive Shade Association Index, the Non-White Warmth Index, and the White Warmth variable, all centered. Overall, the model was significant, R2 = .04, F(3, 387) = 5.87, p = .001.
There was a main effect of the Cognitive Shade Association Index, b = .41, SE = .13, 95% CI = [0.16, 0.66], t(387) = 3.20, p = .001, rsp = .16. The more frequently participants indicated that negative English words were depicted by black Chinese characters, the more likely they were to select the darker man for the immoral (vs. moral) action. There was also a main effect of the Non-White Warmth Index, b = −.004, SE = .002, 95% CI = [−0.007, 0.00], t(387) = −1.98, p = .05, rsp = −.10. The more warmly participants felt toward non-Whites, the less likely they were to select the darker skinned man for the immoral (vs. moral) actions. As in Study 4, there was no main effect of the White warmth variable, b = −.002, SE = .002, 95% CI = [−0.01, 0.001], t(387) = −1.24, p = .22, rsp = −.06.
As in Study 4, we examined the independent contributions of race-based and shade-based “bad is dark” associations. We ran the same two hierarchical regression analyses. In the first analysis, we entered the White warmth variable and Non-White Warmth Index in the first step, and the Cognitive Shade Association Index scores in the second step. The Cognitive Shade Association Index uniquely predicted participants’ associations between bad acts and darker skinned targets, R2Δ = .025, FΔ(1, 387) = 10.23, p = .001, suggesting that the association between darker tone and badness is an independent predictor beyond race-based color associations. In the second analysis, we reversed the process, entering Cognitive Shade Association Index scores in the first step, and the two racial warmth indices in the second step. The race-based warmth variables were unique predictors of the association between bad acts and darker skinned targets, R2Δ = .02, FΔ(2, 387) = 3.78, p = .02. In contrast to the results in Study 4, both participants’ cognitive shade-based and race-based associations were unique and independent predictors of their tendency to believe that darker skinned men were responsible for bad acts.
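For readers who want to see how an R2 change maps onto its F test, the sketch below applies the standard F-change formula for nested regression models. It is an illustration under stated assumptions rather than a reconstruction of the authors' analysis: the function name is hypothetical, and the inputs are the rounded values reported for Study 5 (full-model R2 = .04, shade-step R2 change = .025, N = 391, three predictors).

```python
def f_change(r2_full, delta_r2, n, k_full, k_added):
    """F statistic for the R-squared increase when k_added predictors are added.

    r2_full  : R-squared of the model containing all k_full predictors
    delta_r2 : increase in R-squared attributable to the added block
    n        : number of participants
    Returns (F, df1, df2).
    """
    df1 = k_added
    df2 = n - k_full - 1
    f_value = (delta_r2 / df1) / ((1.0 - r2_full) / df2)
    return f_value, df1, df2


# Rounded values reported for Study 5 (the shade-based step, entered last).
f_value, df1, df2 = f_change(r2_full=0.04, delta_r2=0.025, n=391, k_full=3, k_added=1)
print(df1, df2)           # 1 387
print(round(f_value, 2))  # 10.08
```

The degrees of freedom match the reported FΔ(1, 387), and the F value lands near the reported 10.23; the remaining gap is plausibly due to rounding in the published R2 values.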
General Discussion
Psychologists have known for decades that people associate darker skin with negative personality traits (e.g., Devine, 1989; Greenwald et al., 1998). Our studies are among the first to show the reverse causal story: that judgments of skin tone are a consequence of learning about another’s moral character. Across six studies, we demonstrated that people associated immoral or dissolute behavior with darker skin tones. Specifically, we found that people characterized an individual’s skin tone as darker when he committed a morally bad act rather than a morally good act.
Meta-Analysis of Main and Moderator Effects
In the present research, we found that the emergence of the “bad is black” effect was contingent on how strongly people displayed a personal bias against dark-skinned minority groups (Studies 2, 3, and 5) or held cognitive beliefs pairing darkness with badness (Studies 4 and 5). We conducted a meta-analysis to statistically test the strength of both the main effect and the predicted moderated relationships of the “bad is black” effect. We used the effect sizes from Studies 2 to 5, which are reported as Cohen’s ds for main effects and rs for interactions. Studies 1a and 1b were not included in the meta-analysis given that there is not a generally agreed upon method for calculating standardized effect sizes in mixed models (e.g., Fitzmaurice et al., 2011). We calculated the variances of the ds using equations reported in Borenstein, Hedges, Higgins, and Rothstein (2009). We tested whether the average effect size across studies significantly differed from zero using Comprehensive Meta-Analysis software (CMA).
There was not a significant average main effect, d = −.06, SE = .05, z = −1.06, p = .29, 95% CI = [−0.16, 0.05]. For the moderator analyses, we separately examined the impact of race- and shade-based beliefs. For race-based beliefs, we separately tested the influences of feelings toward Whites (Studies 3-5) and feelings toward non-Whites, which included symbolic racism and feeling thermometers (Studies 2-5). Individuals with more positive feelings toward Whites were no more likely to associate darker skin with the immoral actor and lighter skin with the moral actor, r = .02, SE = .04, z = .56, p = .58, 95% CI = [−0.06, 0.10]. However, the more negatively people felt toward non-Whites, the more certain they were that immoral (vs. moral) actions were committed by darker skinned men, r = .14, SE = .04, z = 3.80, p < .001, 95% CI = [0.07, 0.21]. Finally, across studies, shade-based beliefs predicted a stronger association between darker skin and immoral deeds, r = .19, SE = .04, z = 4.42, p < .001, 95% CI = [0.11, 0.27]. These analyses indicate that both shade-based and race-based beliefs predict a stronger “bad is black” association.
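To make the pooling step concrete, the sketch below combines correlations with a fixed-effect model on Fisher's z scale. This is an assumed, commonly used approach and not a description of what the CMA software does internally; the function name is hypothetical. The inputs are the shade-based semi-partial correlations and sample sizes reported for Study 4 (rsp = .28, N = 136) and Study 5 (rsp = .16, N = 391).

```python
import math


def pool_correlations(studies):
    """Fixed-effect pooling of correlations via Fisher's z.

    `studies` is a list of (r, n) pairs. Each r is transformed with atanh,
    weighted by n - 3 (the inverse of the variance of Fisher's z), averaged,
    and transformed back. Returns (pooled_r, z_test, ci_low, ci_high).
    """
    weights = [n - 3 for _, n in studies]
    z_values = [math.atanh(r) for r, _ in studies]
    total_weight = sum(weights)
    z_mean = sum(w * z for w, z in zip(weights, z_values)) / total_weight
    se = 1.0 / math.sqrt(total_weight)
    ci_low = math.tanh(z_mean - 1.96 * se)
    ci_high = math.tanh(z_mean + 1.96 * se)
    return math.tanh(z_mean), z_mean / se, ci_low, ci_high


# Shade-based effects as reported in Studies 4 and 5.
pooled_r, z_test, ci_low, ci_high = pool_correlations([(0.28, 136), (0.16, 391)])
print(round(pooled_r, 2), round(z_test, 2))  # 0.19 4.42
print(round(ci_low, 2), round(ci_high, 2))   # 0.11 0.27
```

Under these assumptions the pooled estimate agrees with the shade-based values reported above (r = .19, z = 4.42, 95% CI = [0.11, 0.27]), although that agreement does not by itself confirm which model the software applied.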
We additionally conducted a meta-analysis examining the strength of the “bad is black” effect separately among people who held strong and weak race- and shade-based beliefs. The full results of the simple effect tests and meta-analysis are reported in the online supplemental materials. Among people who strongly associate darkness with negativity, there was an overall significant effect, r = .13, SE = .04, z = 3.47, p < .001, 95% CI = [0.05, 0.19], such that they mentally tied immoral (vs. moral) actions to people with darker skin. Among people who weakly associate darkness with negativity, there was an overall significant effect in the opposite direction, r = −.17, SE = .04, z = −4.64, p < .001, 95% CI = [−0.24, −0.10], such that they mentally tied moral (vs. immoral) actions to people with darker skin. These results indicate that people who strongly and weakly associate darkness with negativity both display bias in their social judgments, but in different ways: Those high in the association show a “bad is black” effect, whereas those low in the association show a “good is black” effect.
The Subjective Interpretation of Moral Actions
In the present research, we found that people who committed immoral actions were represented as having darker skin. One might wonder about the scope of immoral behaviors that would predict darker skin tone judgments. In our experimental studies, we manipulated the morality of an action using actions that were unambiguously negative (e.g., running over an elderly woman) and violated a value that people tend to strongly prioritize (i.e., harm). In other words, we used situations where people would generally agree that the action was moral or immoral. However, people sometimes diverge in their subjective evaluations of whether an action is immoral (Haidt, 2001), especially when the judgment at hand taps into values that people differentially prioritize (Haidt & Graham, 2007). Our theoretical argument predicts that people would evaluate a target person as possessing darker skin to the extent that they subjectively evaluate the person’s actions as immoral. An interesting question for future research would be to examine this prediction in a context where the action is morally ambiguous (e.g., not defending one’s ingroup, being sexually promiscuous).
Implications for Eyewitness Identification and Bias Reduction Strategies
The results of the present research have important practical implications for the probity of eyewitness identification. Our research suggests that merely knowing that an immoral crime has been committed may affect the accuracy of identifications. As the case of the “dark-skinned” Boston bombers illustrates, eyewitnesses may be apt to misremember wrongdoers as having darker skin than they actually do and, consequently, to choose innocent darker skinned people from a lineup rather than guilty lighter skinned alternatives. Knowledge of character, our data suggest, influences representations of skin tone. Given that criminal suspects are more likely to be condemned for belonging to darker skinned ethnic and racial groups (e.g., Eberhardt et al., 2006), the mere accusation of criminality might lead eyewitnesses to accuse a darker skinned foil. What seems to be a reciprocal relationship between morality and skin tone may lead people to see characters as more or less culpable, regardless of whether they are in fact guilty or innocent.
Our findings also hold interesting implications for research examining the factors that shape evaluations of racial minorities and how bias can be effectively reduced. Previous research examining perceptions and evaluations of African Americans has argued that outcomes predicted by skin tone (e.g., criminal sentencing decisions) are driven by the usage of cultural stereotypes (e.g., Eberhardt et al., 2006; Maddox & Gray, 2002). In turn, researchers have proposed that the effectiveness of bias reduction strategies is heavily predicated on minimizing the extent to which people rely on cultural stereotypes to make judgments (e.g., Fiske, 2000). Our findings and theoretical argument raise an additional perspective that shade-based associations, which develop independent of race and prejudicial attitudes, also impact judgments based on skin tone. Although we did not directly examine how cognitive associations of darkness and negativity shape perceptions of African Americans who vary in their skin tone in the present research, we believe that this would be a fruitful direction for future research. To the extent that shade-based associations also guide evaluations of racial minorities, researchers will need to consider these types of associations when developing bias reduction strategies.
Concluding Remarks
In the present research, we demonstrated that immoral actions are mentally tied to people who possess darker skin. We additionally demonstrated that the “bad is black” effect on social judgment is impacted both by attitudes toward racial groups and general cognitive associations between darkness and negativity. Our findings provide a novel perspective on how associations shape judgments and generate interesting directions for future research examining the dynamics of intergroup relations and social perception.
Footnotes
Acknowledgements
The authors thank Amy Alpert, Aneline Amalathas, Priscilla Chin, Natashia Corley, Jeremy Horne, Pamela Gomez, Dustin Grue, Michellee Mayers, Surya Menon, and Tai Williams for assistance designing stimuli and collecting data.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was conducted with support from National Science Foundation (NSF) Grant SBE 1460626, awarded to Balcetis.
