Abstract
Past research has found that members of stigmatized groups may feel more certain of poor performance when negative stereotypes are made accessible after finishing a task (i.e., stereotype validation). However, no research to date has identified the potential effects of activating positive stereotypes after performance. Based on past research and theory, we hypothesized that such positive stereotype validation may serve to bolster—rather than hinder—important beliefs related to one’s abilities and future task performance. Across three studies, the accessibility of positive group stereotypes was manipulated after participants completed an initial test on a topic. Consistent with predictions, the activation of positive, self-relevant stereotypes after the initial test was found to increase how certain participants were that they performed well on it. Furthermore, these increases in evaluative certainty predicted more positive ability beliefs, higher expectations for future performance, and better performance on a follow-up test that participants completed.
A broad literature suggests that group stereotypes can have a substantial influence in performance settings. For instance, research on stereotype threat has found that accessible negative stereotypes may often serve to hinder the performance of stigmatized group members (for a review, see Inzlicht & Schmader, 2012). Conversely, an accumulation of findings also suggests that a person’s performance may be facilitated, rather than undermined, when positive, self-relevant stereotypes become activated in some contexts. For example, research has shown that elderly participants performed better on a memory test when they were primed with positive traits associated with their group (e.g., wise, insightful; Levy, 1996). Similarly, studies have found that Asians performed better on a subsequent math test when their ethnic identity was made accessible in certain ways (e.g., Shih, Ambady, Richeson, Fujita, & Gray, 2002; Shih, Pittinsky, & Ambady, 1999).
To date, this research literature has been predominantly focused on stereotypes that become accessible before a person performs a task. However, recent work has begun to identify the potential implications of stereotype activation that occurs primarily after an individual has finished performing. In particular, studies have shown that activating negative stereotypes after an intellectual task may lead people to feel more certain that they performed poorly on it. And, in turn, these certainty-inducing effects (i.e., stereotype validation) were found to hold pernicious downstream consequences for people’s beliefs, interests, and future behaviors related to the performance domain (Clark, Thiem, Barden, Stuart, & Evans, 2015).
While these findings provide evidence of negative stereotype validation, research has yet to uncover the potential validating effects of positive stereotypes, which may differ greatly from those identified by previous research. Therefore, the aim of the current work was to investigate the outcomes and consequences of positive stereotype validation, as well as the conditions under which making positive stereotypes accessible may validate evaluations of one’s previous performance.
Research on self-validation (Petty, Briñol, & Tormala, 2002) has identified a number of variables that can influence how certain people are toward different types of social information. Furthermore, this literature suggests that many of these variables may influence certainty primarily when they are introduced after—rather than before—a person first evaluates a stimulus (for a review, see Briñol & Petty, 2009). For instance, research on persuasion has shown that people are more certain of their thoughts about an appeal when they are later induced to feel powerful as opposed to powerless (Briñol, Petty, Valle, Rucker, & Becerra, 2007), when they experience positive versus negative mood (Briñol, Petty, & Barden, 2007), and when they learn that the source of a previous message is high compared with low in credibility (Tormala, Briñol, & Petty, 2007; cf. Clark & Evans, 2014). Based on a wealth of research, effects on subjective feelings of certainty are considered to hold important implications for the durability of people’s perceptions and evaluations. In particular, evaluations that are held with high versus low certainty have been shown to exert a greater bias on beliefs (Marks & Miller, 1985), are more difficult to change (Bassili, 1996), and are more likely to guide relevant behaviors (e.g., Fazio & Zanna, 1978).
Building from research on self-validation, Clark and colleagues (2015) examined whether negative, self-relevant stereotypes may hold the potential to influence certainty in academic performance settings. It was hypothesized that, when a person has an unfavorable view of their previous performance, the activation of a negative stereotype can increase how certain he or she is toward this evaluation. For example, participants in one study completed a highly challenging math test that was then followed by a manipulation of gender stereotype accessibility (i.e., reporting their gender on a demographic question or not). For men who thought they did poorly on the test, making their gender salient had no effect on how certain they were about this negative evaluation of their performance. However, the manipulation was found to have a significant influence on women who believed they had done poorly. Compared with the control group, women were more certain they performed poorly when their gender was made salient after the test. And importantly, higher levels of certainty predicted diminished beliefs in their own math skills and abilities.
Clark et al. (2015) likened these findings to how convergent validity is acquired in science (D. T. Campbell & Fiske, 1959)—where confidence in a particular conclusion is enhanced when multiple pieces of evidence point to the same result (see Clark, Wegener, Briñol, & Petty, 2009; Clark, Wegener, Sawicki, Petty, & Briñol, 2013). For men, the stereotype that “men are good at math” was presumably inconsistent with the difficulty they experienced and the negative evaluation they produced about their own test performance. Thus, due to this lack of convergence, stereotype activation should have been unlikely to make them feel more certain that they performed poorly. However, for women who had a negative view of their performance, the stereotype that “women are bad at math” was consistent with their test experience. Therefore, activation of the gender stereotype may have provided an additional piece of information that reinforced evaluations and led women to feel more certain of their poor test performance.
While not identified by previous research, it stands to reason that positive performance stereotypes may also hold the potential to validate evaluations. However, the consequences and circumstances under which this may occur should be quite different than those associated with negative stereotype validation. In particular, following from a convergence-based conceptualization, positive stereotypes activated after performance should be most likely to validate evaluations when individuals believe that they have performed well on a task. Furthermore, enhanced certainty from positive stereotypes may also produce downstream consequences that are relatively favorable rather than adverse. For instance, consistent with the literature on certainty and evaluative strength (for a review, see Tormala & Rucker, 2007), feeling more certain about strong task performance may serve to elevate beliefs related to one’s abilities and facilitate future performance in a given domain.
Importantly, evidence for these hypothesized effects of positive stereotypes would also provide substantial clarity regarding what mechanisms may be responsible for stereotype validation phenomena more generally. Although several investigations support the convergence-based conceptualization described earlier, there is a plausible alternative explanation for the effects that have been demonstrated in the literature to date. More specifically, it could be argued that negative stereotype validation effects may be driven by self-serving biases in attribution (e.g., Greenberg, Pyszczynski, & Solomon, 1982). For instance, after performing poorly on a task, an accessible negative stereotype may serve as a compelling, external reason for one’s poor performance. And as a result, an individual may be especially likely to attribute their performance to the stereotype—thereby increasing certainty in their evaluations and guiding their beliefs, interests, and subsequent behaviors.
However, this attributional bias account should not hold in situations where an individual believes that he or she has performed well on a task. In sharp contrast to situations where one encounters failure, a wealth of research suggests that individuals tend to attribute their own successes to internal or personal characteristics as opposed to external circumstances (for a review, see W. K. Campbell & Sedikides, 1999). Thus, evidence that stereotypes may validate positive performance evaluations in the hypothesized ways would speak against the plausibility of self-serving biases as the mechanism responsible for stereotype validation.
Research Overview
The present research tested our predictions regarding how positive group stereotypes may validate performance evaluations. Three studies were conducted, and previous research (Clark & Thiem, 2017; Clark et al., 2015) was used to approximate sufficient sample sizes. Studies 1 and 2 examined positive stereotypes about math aptitude among samples of Asians (Study 1) and men (Study 2). Study 3 focused on positive stereotypes of women with regard to childcare abilities. In each study, the accessibility of group stereotypes was manipulated after participants completed an initial quiz on one of these topics. Participants then reported how certain they were about their performance, answered questions on beliefs about their abilities, and completed a follow-up quiz on the same subject.
Study 1
As previously discussed, past research has shown that the math performance of Asian participants can be enhanced when their ethnic identity is made accessible prior to a test (e.g., Shih et al., 1999). In Study 1, we recruited a sample of Asian participants and examined whether activating their ethnic identity after a math test may validate positive performance evaluations and, in turn, have a bolstering effect on relevant beliefs and future performance. Participants in the study completed an easy math test that was designed to elicit largely positive evaluations of one’s own performance. In this context, making participants’ Asian identity and associated stereotypes (i.e., “Asians are good at math”) accessible after the test should increase how certain they are about their previous performance (due to the convergence with their positive performance evaluations). As a result, their evaluations of strong performance should become reinforced and thus more likely to guide beliefs and later performance in a positive, stereotype-consistent direction.
Method
Participants and design
A total of 83 self-identified Asian undergraduates (46 women and 37 men; Mage = 19.28 years, SD = 1.51) at the University of Iowa participated in exchange for partial course credit in their introductory psychology class. Participants were randomly assigned to a condition in which their Asian identity was (identity-salient condition) or was not made salient (control condition) after completing a math quiz.
Procedure and materials
For each study session, a single participant was greeted by an Asian female experimenter and seated at a computer station. Participants were told that the study was about problem solving and that they would be asked to answer several multiple-choice questions. In an effort to avoid any pre-performance stereotype activation, these instructions did not make any explicit mention of math, gender, ethnicity, or race (adapted from Rydell, McConnell, & Beilock, 2009). Participants then completed a quiz comprised of 12 randomly presented math problems drawn from practice GRE tests. These questions were found to elicit approximately 75% accuracy from past GRE test-takers (Educational Testing Service, 1998). The goal was to create an easy test in which the vast majority of participants would perform well and produce positive evaluations of their performance.
Following completion of this quiz, participants rated their performance on a scaled measure. Next, participants answered a series of questions that were designed to manipulate the salience of their Asian identity. Following this manipulation, participants completed scaled measures of evaluative certainty, beliefs about their own math ability, and expectations regarding how they would perform on a similar math quiz in the future. Next, participants completed a second math quiz that was designed to be more difficult than the first. It contained six GRE questions that approximately 50% of past examinees answered correctly (Educational Testing Service, 1998). After finishing this second quiz, participants answered demographic questions (age, race, and gender) and were subsequently debriefed.
Independent variable
Identity salience manipulation
Upon completion of the first math quiz and rating their performance, participants answered a series of open-ended questions that were designed to manipulate the accessibility of their Asian identity. In the identity-salient condition, participants were given a series of questions including whether their parents or grandparents spoke languages other than English, what languages they knew, and how many generations of their family had lived in America (adapted from Shih et al., 1999).
Conversely, participants who were assigned to the control condition answered questions that were presumably unrelated to their Asian identity. These control condition items included the following: how frequently they go to the movies, how frequently they dine out, and how frequently they download music from the Internet (adapted from Shih et al., 2002).
Dependent measures
Pre-manipulation actual performance
The total number of correct answers on the 12 math questions was used to index actual performance.
Pre-manipulation perceived performance
Participants rated their performance on the following scale immediately after completing the math problems: “Overall, how well do you think you performed on the math test?” (1 = performed extremely poorly to 11 = performed extremely well).
Evaluative certainty
After the manipulation of Asian identity salience, participants reported their certainty about how they thought they had performed on the math quiz via four 11-point scales (1 = strongly disagree to 11 = strongly agree). Each item was introduced with “Please express how much you agree with the following statement:” and the statements were (1) “I am CERTAIN that I performed WELL on the quiz”; (2) “I am SURE that I performed WELL on the quiz”; (3) “I am CERTAIN that I performed POORLY on the quiz” (reverse-scored); and (4) “I am SURE that I performed POORLY on the quiz” (reverse-scored). Participants’ responses to these four measures were averaged to form a composite of evaluative certainty (α = .93; M = 7.52, SD = 2.61).
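As a minimal sketch of how such a composite can be computed, the Python snippet below reverse-scores the two POORLY items on the 1 to 11 scale (so that 1 becomes 11 and 11 becomes 1) and then averages the four responses. The function names and the example responses are hypothetical and are not taken from the study materials.

```python
def reverse_score(value, low=1, high=11):
    """Flip a rating on a low..high scale (1 -> 11, 11 -> 1 on an 11-point scale)."""
    return low + high - value


def certainty_composite(items, reversed_idx=(2, 3)):
    """Average the item ratings after reverse-scoring the positions in reversed_idx.

    items: ratings in the order (CERTAIN-WELL, SURE-WELL, CERTAIN-POORLY, SURE-POORLY).
    reversed_idx: zero-based positions of the reverse-scored items (an assumption
    about item order made for this illustration).
    """
    scored = [
        reverse_score(v) if i in reversed_idx else v
        for i, v in enumerate(items)
    ]
    return sum(scored) / len(scored)


# Hypothetical participant who feels fairly sure of having done well.
responses = [9, 10, 3, 2]
composite = certainty_composite(responses)
print(composite)  # 9.5
```

Higher composite values therefore indicate greater certainty of having performed well, regardless of how an individual item was worded.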
Math ability beliefs
Following the measures of evaluative certainty, participants responded to five measures targeting beliefs in their own math ability. These items read as follows: “Please rate your own math ability on the following scale” (1 = very low to 11 = very high); “Please rate your own math skills on the following scale” (1 = very low to 11 = very high); “To what extent do you believe that your math skills need improvement?” (1 = not at all to 11 = very much; reverse-scored); “Compared with other students, my math ability and skills are weak” (1 = strongly disagree to 11 = strongly agree; reverse-scored); and “Compared with other students, my math ability and skills are strong” (1 = strongly disagree to 11 = strongly agree). Responses were averaged to form an index of math ability beliefs (α = .89; M = 6.61, SD = 2.06).
Future performance expectations
Participants answered the following questions on 11-point scales: “Imagine taking a similar math test in the future. I predict that my performance would be:” (1 = very poor to 11 = very strong); and “ . . . How well do you think you would perform?” (1 = would perform very poorly to 11 = would perform very well). Responses were averaged to form a single composite (α = .96; M = 8.05, SD = 2.23).
Post-manipulation performance
The total number of correct responses to the six questions of the second math quiz served as the measure of post-manipulation actual performance (M = 2.86, SD = 1.77).
Results
Pre-manipulation performance
As previously described, our goal was to create an initial performance context in which the majority of participants would perform well and produce largely positive evaluations of their performance. Analyses of performance on the first math quiz were consistent with this aim. With regard to actual performance (M = 7.54, SD = 2.73), accuracy of responses was significantly above 50%, t(82) = 5.14, p < .001. For perceived performance (M = 7.34, SD = 2.75), evaluations were found to be significantly above the midpoint (6) of the rating scale, t(82) = 4.43, p < .001.
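The two one-sample t tests reported above can be approximately reproduced from the summary statistics alone. The sketch below assumes that 50% accuracy corresponds to 6 of the 12 items and uses the rounded means and standard deviations given in the text, so small rounding differences from the reported values are expected. The helper name is hypothetical.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """Return (t, df) for a one-sample t test computed from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


# Actual performance: number correct out of 12, tested against 6 (50% accuracy).
t_actual, df = one_sample_t(mean=7.54, sd=2.73, n=83, mu0=6.0)

# Perceived performance: 11-point rating, tested against the scale midpoint of 6.
t_perceived, _ = one_sample_t(mean=7.34, sd=2.75, n=83, mu0=6.0)

print(f"actual:    t({df}) = {t_actual:.2f}")
print(f"perceived: t({df}) = {t_perceived:.2f}")
```

With these rounded inputs the first statistic matches the reported t(82) = 5.14, and the second comes out within about .01 of the reported 4.43.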
Actual performance
The results of a one-way ANOVA revealed no difference across conditions on actual performance, Midentity-salient = 7.88 (SD = 2.71) versus Mcontrol = 7.23 (SD = 2.75), F(1, 81) = 1.15, p = .287.
Perceived performance
As with the findings on actual performance, perceived performance on the first math quiz also did not differ as a function of the identity salience manipulation, Midentity-salient = 7.75 (SD = 2.75) versus Mcontrol = 6.95 (SD = 2.73), F(1, 81) = 1.75, p = .189.
Evaluative certainty
As in past research (Clark et al., 2015; Clark, Thiem, Hoover, & Habashi, 2017), all analyses of the key dependent measures controlled for pre-manipulation actual and perceived performance. The results of a one-way ANCOVA revealed a marginally significant effect of the identity salience manipulation, F(1, 79) = 3.77, p = .056,
Direct and indirect effects on math beliefs and post-manipulation performance
Math ability beliefs
An ANCOVA that controlled for both actual and perceived performance on the first quiz showed no direct effect of the identity salience manipulation on participants’ beliefs in their own math abilities, Midentity-salient = 6.68 (SE = .27) versus Mcontrol = 6.54 (SE = .26), F < 1, p = .697. However, a significant effect of perceived performance was found, F(1, 79) = 27.16, p < .001,
The observed effect of the identity salience manipulation on evaluative certainty was predicted to carry important, downstream consequences for beliefs and behavior. Greater certainty—triggered by a salient positive stereotype—should strengthen participants’ positive evaluations of their performance (see Tormala & Rucker, 2007). In turn, this enhanced certainty should increase the extent to which performance evaluations may guide domain-relevant beliefs and behaviors in a positive direction (see Clark et al., 2015; Tormala & Rucker, 2007). In particular, we hypothesized that validation from making one’s Asian identity and associated positive stereotypes salient should predict increased beliefs in one’s math ability, higher expectations for future performance, and greater performance on the second math test.
These predictions were tested using the PROCESS macro developed by Hayes (2014). Model 4 was used and it allowed evaluative certainty to mediate the relationship between the identity salience manipulation (independent variable) and each math-related outcome (dependent variable) while also controlling for differences in perceived and actual performance on the first quiz. The data were treated as the population and 10,000 bootstrap samples were drawn (with replacement) to produce 95% bias-corrected confidence intervals.
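To illustrate the general logic of a bootstrapped test of an indirect effect, the following Python sketch estimates the product of the two mediation paths and a bias-corrected confidence interval. It is not the PROCESS macro and makes several simplifying assumptions: the two covariates are omitted, only 2,000 resamples are drawn to keep it fast, the data are synthetic, and all function and variable names are hypothetical.

```python
import random
from statistics import NormalDist, mean


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """a * b, where a is the slope of m on x and b is the partial slope
    of y on m when x is also in the model."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b


def bc_bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=0):
    """Point estimate and bias-corrected bootstrap CI for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    est = indirect_effect(x, m, y)
    boots = []
    while len(boots) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # skip degenerate resamples with a single condition
            continue
        boots.append(indirect_effect(xs, [m[i] for i in idx], [y[i] for i in idx]))
    boots.sort()
    nd = NormalDist()
    # Bias correction: how far the bootstrap distribution sits from the estimate.
    p_below = sum(b < est for b in boots) / n_boot
    p_below = min(max(p_below, 1 / n_boot), 1 - 1 / n_boot)
    z0 = nd.inv_cdf(p_below)
    alpha = 1 - level
    lo_p = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    hi_p = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    lo = boots[min(n_boot - 1, int(lo_p * n_boot))]
    hi = boots[min(n_boot - 1, int(hi_p * n_boot))]
    return est, lo, hi


# Synthetic data: x = condition (0/1), m = certainty, y = outcome.
data_rng = random.Random(1)
x = [i % 2 for i in range(80)]
m = [7 + 1.0 * xi + data_rng.gauss(0, 2) for xi in x]
y = [3 + 0.5 * mi + data_rng.gauss(0, 1.5) for mi in m]

est, lo, hi = bc_bootstrap_ci(x, m, y)
print(f"indirect effect = {est:.3f}, 95% BC CI [{lo:.3f}, {hi:.3f}]")
```

In this framework an indirect effect is treated as reliable when the interval excludes zero. A fuller version would add the covariates to both regressions, for example with a general least-squares routine.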
The results of these analyses are displayed in Table 1. As described in the previous ANCOVA tests, participants tended to be more certain that they performed well on the first math quiz when their Asian identity was made salient relative to the control condition. In turn, greater evaluative certainty significantly predicted higher beliefs in one’s math abilities. Moreover, this indirect effect was found to be statistically significant.
Table 1. Results of Bootstrapping Mediation Analyses in Study 1.
Note. Bold indicates reliable indirect effect, where BC CI does not include zero. BC CI = bias-corrected confidence interval.
p < .10. *p < .05. **p < .01. ***p < .001.
Future performance expectations
Results of an ANCOVA that controlled for actual and perceived performance on the first quiz showed a main effect of perceived performance on the first quiz, F(1, 79) = 22.17, p < .001,
Consistent with our hypotheses, the mediation analysis revealed a significant indirect effect of the identity salience manipulation on future performance expectations through evaluative certainty. As shown in Table 1, participants tended to feel more certain they had done well on the first math test when their Asian identity was made salient afterward compared to when it was not. In turn, higher certainty significantly predicted more positive expectations about how one might perform on a similar quiz in the future.
Post-manipulation performance
The number of correct answers on the second math quiz was subjected to an ANCOVA that controlled for actual and perceived performance on the first math quiz. A main effect of participants’ actual pre-manipulation performance was found, F(1, 79) = 41.71, p < .001,
The results of a mediation analysis did not support our predictions regarding the effects of evaluative certainty on post-manipulation performance. While evaluative certainty tended to differ as a function of the manipulation, it was a weak predictor of performance on the second math quiz (see Table 1).
Discussion
Study 1 provided initial evidence of the validating effects of positive group stereotypes. After completing an easy math test, Asian participants tended to be more certain that they had performed well when their Asian identity was made salient compared to a control condition. Furthermore, consistent with our conceptualization and past work on evaluative strength (see Petty & Krosnick, 1995; Tormala & Rucker, 2007), increased certainty triggered by the stereotype manipulation predicted elevated beliefs in one’s abilities and higher expectations for future achievement in math.
Study 2
The goal of Study 2 was to conceptually replicate and extend the previous findings to stereotypes about gender and math aptitude. In addition to the aforementioned work on positive stereotypes of Asians (e.g., Shih et al., 1999), other research has identified performance-boosting effects among men when gender stereotypes are made salient in some math contexts (see Walton & Cohen, 2003). With these findings and those of Study 1 in mind, it stands to reason that the activation of gender stereotypes may also have the potential to validate the positive evaluations that men may hold toward their previous math performance.
Method
Participants and design
One hundred eighty-two male undergraduates at the University of Iowa participated and received partial course credit in their introductory psychology class. Participants were randomly assigned to one of two gender stereotype conditions (gender stereotype–present vs. gender stereotype–absent). All participants completed an attention check that asked them to correctly recall information contained in the stereotype manipulation. A total of 24 participants were excluded based on incorrect responses to this check. Thus, the final sample included a total of 158 males (Mage = 19.02 years, SD = 1.80). Approximately 84% of these participants (132 of 158) were White.
Procedure and materials
The procedure was identical to that used in Study 1, with the following exceptions. First, all study sessions included one to three participants and were conducted by a White male experimenter. Following the introductory instructions, participants completed a GRE-type math test that was designed to be easier than the one used in Study 1. In particular, previous GRE test-takers were found to respond overall with approximately 90% accuracy to the 12 questions that comprised this quiz (Educational Testing Service, 1998). Upon completion of these math problems and rating their perceived performance, participants received information that was designed to manipulate the accessibility of the gender stereotype regarding math ability. Then, participants reported their evaluative certainty, math ability beliefs, future performance expectations, and completed the same second math quiz used in Study 1. Finally, participants responded to an attention check, answered demographic questions, and were debriefed.
Independent variable
Gender stereotype information
Following the first math quiz and rating of perceived performance, the accessibility of the gender stereotype regarding math ability was manipulated. In the gender stereotype–present condition, participants read the following: Research suggests that men tend to perform better than women on standardized tests of math ability. The research you are participating in is aimed at a better understanding of this.
In contrast, participants in the gender stereotype–absent condition were given the following information: Research suggests that performance on standardized tests of math ability tends to vary as a function of some personality variables. The research you are participating in is aimed at a better understanding of this.
Dependent measures
Pre-manipulation actual performance
The number of correct answers on the first math quiz served as the measure of pre-manipulation actual performance.
Pre-manipulation perceived performance
Perceived performance on the first math quiz was measured using the same 11-point scaled measure from Study 1.
Evaluative certainty
Participants responded to a total of six 11-point scales. The first four items were identical to those used previously, and the same two items that were reverse-scored in Study 1 were reverse-scored in Study 2. In addition, two new measures were added: “To what extent are you CERTAIN that your perceptions of your performance are accurate?” (1 = not at all certain to 11 = very certain), and “To what extent are you SURE that your perceptions of your performance are accurate?” (1 = not at all sure to 11 = very sure). Participants’ responses to these six measures were averaged to form a composite of evaluative certainty (α = .93, M = 9.32, SD = 1.59).
Math ability beliefs
The five scaled measures of ability beliefs were identical to those used in Study 1. Responses were averaged to form a composite of math ability beliefs (α = .86, M = 7.75, SD = 1.69).
Future performance expectations
These two measures were the same as those from Study 1 and responses were averaged to form a single index (α = .95, M = 9.66, SD = 1.46).
Post-manipulation performance
As in Study 1, the number of correct responses to the six questions of the second math quiz was used as the measure of post-manipulation actual performance (M = 2.87, SD = 1.82).
Attention check
After reporting their gender, race, and age, participants were asked the following question that was adapted from previous research (Clark et al., 2017): “After the FIRST math test, you were told which of the following?” The four response options to this question were as follows: (1) “Performance on standardized tests of math ability tends to vary as a function of some personality variables”; (2) “There are NO gender differences in performance on standardized tests of math ability”; (3) “Men and women tend to perform differently on standardized tests of math ability”; and (4) “Men tend to perform better than women on standardized tests of math ability.” As previously described, this item was used to evaluate the attentiveness of participants. All participants in the stereotype-absent condition who did not choose Option 1 and all participants in the stereotype-present condition who did not select Option 4 were removed from our sample (24 out of 182). Thus, the final sample consisted of data from 158 participants that were submitted to statistical analyses.
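The exclusion rule just described amounts to a simple filter keyed on condition. The sketch below shows one way it could be expressed; the record layout, field names, and example participants are hypothetical and are not the study data.

```python
# Option that matches what each condition was actually told after the first quiz.
EXPECTED_OPTION = {
    "stereotype-absent": 1,   # performance varies with personality variables
    "stereotype-present": 4,  # men tend to perform better than women
}


def passes_attention_check(condition, chosen_option):
    """True when the participant recalled the statement shown in their condition."""
    return EXPECTED_OPTION[condition] == chosen_option


# Hypothetical records for illustration only.
participants = [
    {"id": 1, "condition": "stereotype-present", "choice": 4},
    {"id": 2, "condition": "stereotype-present", "choice": 3},
    {"id": 3, "condition": "stereotype-absent", "choice": 1},
    {"id": 4, "condition": "stereotype-absent", "choice": 4},
]

retained = [
    p for p in participants
    if passes_attention_check(p["condition"], p["choice"])
]
retained_ids = [p["id"] for p in retained]
print(retained_ids)  # [1, 3]
```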
Results
Pre-manipulation performance
Analyses revealed that participants performed very well and viewed their performance as such on the first math quiz. In particular, actual performance (M = 11.04, SD = 1.40) was significantly above 85% accuracy, t(157) = 7.55, p < .001, and mean perceived performance (M = 9.22, SD = 1.80) was significantly above the midpoint of the rating scale, t(157) = 22.50, p < .001. Taken together, as in Study 1, these results are consistent with our goal of creating a context that should facilitate positive stereotype validation effects.
Actual performance
A one-way ANOVA conducted on the number of correct answers to the first math quiz showed no differences as a function of the gender stereotype manipulation, Mstereotype-present = 10.95 (SD = 1.60) versus Mstereotype-absent = 11.12 (SD = 1.18), F(1, 156) = .62, p = .431.
Perceived performance
Similar to the findings on actual performance, an ANOVA revealed no effects of the condition on participants’ evaluations of their performance on the first math quiz, Mstereotype-present = 9.13 (SD = 1.97) versus Mstereotype-absent = 9.30 (SD = 1.62), F(1, 156) = .34, p = .562.
Evaluative certainty
As in Study 1, all analyses of the key dependent measures controlled for pre-manipulation perceived and actual performance. A one-way ANCOVA on evaluative certainty revealed the predicted main effect of the stereotype manipulation, F(1, 154) = 5.22, p = .024,
Direct and indirect effects on math beliefs and post-manipulation performance
Math ability beliefs
An ANCOVA that controlled for actual and perceived performance on the first quiz showed an effect of the stereotype manipulation that approached significance, F(1, 154) = 3.68, p = .057,
The potential mediational role of evaluative certainty on the various measured outcomes was assessed using bootstrapping procedures that were identical to Study 1. With regard to math ability beliefs, our hypothesis was not supported. As presented in Table 2, the stereotype manipulation was found to elicit the predicted differences in evaluative certainty. However, evaluative certainty was a weak predictor of ability beliefs.
Table 2. Results of Bootstrapping Mediation Analyses in Study 2.
Note. Bold indicates reliable indirect effect, where BC CI does not include zero. BC CI = bias-corrected confidence interval.
p < .10. *p < .05. **p < .01. ***p < .001.
Future performance expectations
No effect of the gender stereotype manipulation emerged from an ANCOVA that controlled for actual and perceived performance on the first quiz, adjusted Mstereotype-present = 9.78 (SE = .13) versus adjusted Mstereotype-absent = 9.55 (SE = .13), F(1, 154) = 1.58, p = .211. However, the two covariates tended to influence these ratings: pre-manipulation perceived performance, F(1, 154) = 78.61, p < .001,
While no direct effect of the manipulation emerged, the results of the mediation analysis indicated the presence of an indirect effect through evaluative certainty (see Table 2). Participants were more certain they had performed well on the test when the gender stereotype was made accessible afterward, and this elevated certainty was a reliable predictor of greater expectations for future math performance.
Post-manipulation performance
An ANCOVA that controlled for pre-manipulation actual and perceived performance revealed that the main effect of the stereotype manipulation approached significance, F(1, 154) = 3.86, p = .051,
Consistent with our hypotheses, the mediation analysis revealed a significant indirect effect of the stereotype manipulation, through evaluative certainty, on participants’ performance on the second math test. As shown in Table 2, participants felt more certain they had done well on the first math test when the gender stereotype was made salient afterward compared to when it was not. In turn, higher levels of certainty evoked by this manipulation were found to be a significant predictor of better performance on the second math test.
Discussion
The findings of Study 2 offered additional evidence of positive stereotype validation. Male participants reported greater certainty that they performed well on a previous math test when the gender stereotype about math aptitude was made salient as opposed to when it was not. Similar to Study 1, this effect of the stereotype information was found to hold key downstream consequences. For instance, greater evaluative certainty was found to predict more positive expectations for how one might perform in the future. In addition, validation from the gender stereotype information was found to be a strong predictor of better actual performance on the second math test that participants completed.
Study 3
Study 3 was designed to extend the previous findings in some important ways. One goal was to provide support for the idea that convergence between a stereotype and a person’s performance evaluation is necessary for the predicted effects to emerge. Thus, we examined the influence of a common gender stereotype in a sample that included both women and men. With this in mind, a second aim was to conceptually replicate the previous effects found on math achievement in a very different domain: childcare performance. Research has found that women are typically viewed as having more interest in children (Prentice & Carranza, 2002) and are perceived to have greater abilities than men with respect to caregiving (Cejka & Eagly, 1999). Considering these associations, it is plausible that activation of this gender stereotype could serve to validate the positive evaluations that women may have toward their performance on a childcare task.
An online sample of male and female participants completed a quiz about childcare knowledge. Upon completion of this test, the salience of gender stereotypes about childcare performance and abilities was manipulated. Similar to the math tests used in Studies 1 and 2, this initial childcare quiz was designed to be easy—with the expectation that participants would produce largely positive evaluations of their performance. For women, the gender stereotype about childcare (“women are good at childcare”) should be consistent with the positive evaluations they had about their own quiz performance. Thus, when this stereotype is made salient, women should feel more certain of their strong performance and such validation should serve to bolster beliefs and later performance in this domain. On the contrary, if our convergence-based conceptualization is correct, a very different pattern of results should emerge among male participants. For men, the gender stereotype (“men are bad at childcare”) should not converge with their strong performance and positive evaluations. Therefore, making this stereotype salient should have little effect on the certainty they hold about their performance.
Method
Participants and design
Five hundred eighty-three U.S. citizens completed the study through Amazon Mechanical Turk in exchange for US$0.50. The accessibility of the gender stereotype about childcare was manipulated after participants completed an initial quiz on the topic. The study took the form of a 2 (gender: women vs. men) × 2 (gender stereotype information: present vs. absent) between-participants factorial design.
Participants completed an attention check that was very similar to the measure used in Study 2. A total of 89 participants were excluded from the sample because they failed to correctly identify the information they received as part of the gender stereotype manipulation. Therefore, the final sample consisted of 494 participants (269 women and 225 men; Mage = 34.81 years, SD = 11.57). Over 75% of these participants reported their race as White (377 of 494).
Procedure, materials, and measures
After some brief instructions, participants completed an initial quiz that targeted knowledge about caring for infants (i.e., pre-manipulation quiz). This quiz consisted of 10 randomly presented questions (five multiple-choice and five true-false) that were intended to be easy and to facilitate strong actual and perceived performance. Two example items were as follows: “A newborn starts gaining weight rapidly during his or her first day of life” (A = true, B = false) and “What is the best position for a baby to sleep in?” (A = on his or her back, B = on his or her tummy, C = on his or her side, D = sitting up). Immediately after completing this first quiz and rating their performance, participants received information that was designed to manipulate the salience of the gender stereotype about childcare abilities (stereotype-present vs. stereotype-absent). Following this manipulation, participants responded to measures of evaluative certainty and beliefs about their own knowledge and abilities regarding infant care. Next, participants were given a second, post-manipulation infant care quiz. This test contained five multiple-choice questions and was intended to be more difficult than the first quiz. For example, one item was “What can a baby do up until 7 months that an adult cannot?” (A = cross their eyes while sticking out their tongue, B = lick their elbow, C = breathe and swallow simultaneously, and D = sneeze with their eyes open). Upon completion of this second quiz, participants responded to an attention check, answered demographic questions, and were debriefed.
Independent variable
Gender stereotype information
Following the first infant care quiz and the measure of perceived performance, participants in the gender stereotype–present condition received the following passage: Research suggests that women tend to perform better than men on tests of infant care knowledge. The research you are participating in is aimed at a better understanding of this.
On the contrary, participants in the gender stereotype–absent condition were provided with the following: Research suggests that performance on tests of infant care knowledge tends to vary as a function of some personality variables. The research you are participating in is aimed at a better understanding of this.
Dependent measures
Pre-manipulation actual performance
The total number of correct answers on the first infant care quiz was used as the measure of pre-manipulation actual performance.
Pre-manipulation perceived performance
The measure of participants’ evaluations of the first quiz was identical to the 11-point scaled measure used in Studies 1 and 2 except that the words “math test” were replaced with “infant care quiz.”
Evaluative certainty
The six scaled measures of evaluative certainty were identical to those used in Study 2. Furthermore, a single index was created by averaging responses to these items (α = .93, M = 7.51, SD = 2.29).
Infant care ability beliefs
Participants’ beliefs about their own knowledge and ability regarding infant care were assessed via four items. These 11-point scaled measures were as follows: “Please rate your own infant care skills on the following scale” (1 = very weak to 11 = very strong); “Please rate your own infant care knowledge on the following scale” (1 = very low to 11 = very high); “I believe that I am very knowledgeable about infant care” (1 = strongly disagree to 11 = strongly agree); and “I believe that I could provide adequate care for an infant on my own” (1 = strongly disagree to 11 = strongly agree). A composite of ability beliefs was created by averaging participants’ responses to these items (α = .96, M = 7.06, SD = 2.88).
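As a minimal sketch of how such a composite and its internal consistency are typically computed, the code below averages item responses per participant and applies the standard Cronbach's alpha formula. The response values are invented for illustration and the function names are hypothetical; they are not the study data.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-item response lists
    (every list is in the same participant order)."""
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def composite(items):
    """Per-participant mean across all items."""
    return [mean(responses) for responses in zip(*items)]


# Hypothetical responses of five participants to four 11-point items.
items = [
    [9, 4, 7, 10, 2],
    [8, 5, 7, 11, 3],
    [9, 3, 6, 10, 2],
    [10, 4, 8, 11, 1],
]

alpha = cronbach_alpha(items)
scores = composite(items)
print(f"alpha = {alpha:.3f}, composite scores = {scores}")
```

Alpha approaches 1 as the items covary more strongly, which is the sense in which a value such as .96 justifies averaging the four items into one index.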
Post-manipulation performance
Total correct responses to the five questions of the second infant care quiz were used as the measure of post-manipulation actual performance (M = 3.06, SD = 1.10).
Attention check
Prior to receiving the debriefing information, participants responded to this question: “After the first infant care quiz, you were told which of the following?” The response choices were as follows: (1) “Performance on tests of infant care knowledge tends to vary as a function of some personality variables” (2) “There are NO gender differences in performance on tests of infant care knowledge” (3) “Men and women perform differently on tests of infant care knowledge” (4) “Women tend to perform better than men on tests of infant care knowledge” and (5) “Performance on this infant care quiz often varies based on the gender of the respondent AND your quiz performance will be compared with that of male respondents.” All participants in the gender stereotype–present condition who did not select Option 4 and participants in the gender stereotype–absent condition who did not choose Option 1 were excluded (89 out of 583). Hence, a final sample of 494 participants was subjected to statistical tests.
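The exclusion rule can be stated compactly in code. The sketch below is illustrative only: the record layout and the `passes_attention_check` helper are assumptions, while the option numbers and sample counts come from the description above.

```python
# Correct attention-check option for each condition, as described in the text.
CORRECT_OPTION = {"stereotype-present": 4, "stereotype-absent": 1}


def passes_attention_check(condition, selected_option):
    """True when the participant chose the option matching their condition."""
    return CORRECT_OPTION[condition] == selected_option


# Hypothetical records, for illustration only.
records = [
    {"id": 1, "condition": "stereotype-present", "choice": 4},
    {"id": 2, "condition": "stereotype-present", "choice": 3},
    {"id": 3, "condition": "stereotype-absent", "choice": 1},
    {"id": 4, "condition": "stereotype-absent", "choice": 4},
]
retained = [r["id"] for r in records
            if passes_attention_check(r["condition"], r["choice"])]
print(retained)

# Sample-size arithmetic reported in the text.
initial_n, excluded_n = 583, 89
final_n = initial_n - excluded_n
print(final_n)
```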
Results
Pre-manipulation performance
As expected, initial analyses showed that actual and perceived performance on the first infant care quiz was strong. The mean number of correct answers to this quiz (M = 7.83, SD = 1.40) was significantly above 75%, t(493) = 5.29, p < .001, and mean perceived performance (M = 7.12, SD = 2.44) was significantly above the scale midpoint, t(493) = 10.15, p < .001. Similar to the previous studies, these findings suggest that this performance situation should be conducive to positive stereotype validation effects.
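These one-sample tests can be approximately reproduced from the reported summary statistics. The sketch below assumes that 75% corresponds to 7.5 of the 10 quiz items and that the midpoint of the 11-point scale is 6 (anchors of 1 and 11); both reference values are assumptions rather than figures stated in the text.

```python
import math


def one_sample_t(sample_mean, sample_sd, n, reference):
    """t statistic for a sample mean tested against a fixed reference value."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - reference) / standard_error


n = 494  # final sample size, so df = n - 1 = 493

# Actual performance vs. 75% correct (assumed to be 7.5 of 10 items).
t_actual = one_sample_t(7.83, 1.40, n, 7.5)

# Perceived performance vs. the scale midpoint (assumed to be 6 on a 1-11 scale).
t_perceived = one_sample_t(7.12, 2.44, n, 6.0)

print(f"t_actual = {t_actual:.2f}, t_perceived = {t_perceived:.2f}")
```

With the rounded values given above, this yields t statistics of roughly 5.24 and 10.20, close to the reported 5.29 and 10.15. The small gaps are of the size that rounding M and SD to two decimals would produce.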
Actual performance
A 2 (gender: women vs. men) × 2 (gender stereotype information: present vs. absent) between-subjects ANOVA showed that women (M = 8.26, SD = 1.15) performed better than men (M = 7.33, SD = 1.51), F(1, 490) = 58.79, p < .001,
Perceived performance
A two-way ANOVA on the 11-point perceived performance rating revealed a main effect of gender, F(1, 490) = 44.62, p < .001,
Evaluative certainty
All analyses of evaluative certainty, ability beliefs, and post-manipulation performance controlled for differences in pre-manipulation actual and perceived performance. The results of a two-way ANCOVA on evaluative certainty showed a pattern that was consistent with the hypotheses. As displayed in Figure 1, certainty tended to be higher in the gender stereotype–present condition relative to the gender stereotype–absent condition, F(1, 488) = 3.30, p = .070,

Figure 1. Adjusted mean evaluative certainty in Study 3 as a function of participant gender and the gender stereotype manipulation (controlling for actual and perceived performance on the first infant care test).
Direct and indirect effects on infant care beliefs and post-manipulation performance
Infant care ability beliefs
A two-way ANCOVA that controlled for actual and perceived pre-manipulation performance showed a significant effect of perceived performance, F(1, 488) = 615.37, p < .001,
As in the previous studies, we hypothesized that greater certainty in one’s strong performance should have an enhancing effect on beliefs and behavior. In particular, after strong performance and predominantly positive evaluations, analyses showed that female participants expressed greater certainty that they performed well when the gender stereotype about childcare was made accessible compared with the control condition. Similar to the previous studies, such increases in evaluative certainty should hold positive implications for women’s beliefs about their childcare abilities and their follow-up quiz performance. In sharp contrast, the degree to which male participants were certain about their strong performance was not found to be influenced by the stereotype manipulation. Hence, evaluative certainty should not play the same mediational role among men.
These moderated-mediation predictions were tested using Model 59 of the PROCESS macro (Hayes, 2014). In this model, gender was allowed to moderate each relationship between the stereotype manipulation (independent variable), evaluative certainty (mediator), and each infant care outcome (dependent variable) while also controlling for pre-manipulation perceived and actual performance. The results of this bootstrapping procedure are displayed in Table 3. In particular, the evaluative certainty of women was found to be higher when the gender stereotype regarding infant care was made salient. In turn, higher certainty associated with the manipulation predicted more positive beliefs in one’s infant care ability, and this indirect effect was significant. Conversely, as expected, no evidence of an indirect effect on beliefs emerged among men.
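The idea of a conditional indirect effect can be illustrated with a simplified analogue. This is not PROCESS Model 59: the sketch estimates the product of paths separately within each gender group, uses simulated data, and omits the covariates and the bootstrap intervals. The function names and the path values used to generate the data are hypothetical.

```python
import random
from statistics import mean


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """Product a*b: slope of m on x times slope of y on m with x held constant."""
    vx, vm = cov(x, x), cov(m, m)
    cxm, cxy, cmy = cov(x, m), cov(x, y), cov(m, y)
    a = cxm / vx
    b = (cmy * vx - cxm * cxy) / (vx * vm - cxm ** 2)
    return a * b


def conditional_indirect_effects(rows):
    """rows holds (group, x, m, y) tuples; returns {group: a*b within that group}."""
    by_group = {}
    for group, x, m, y in rows:
        by_group.setdefault(group, []).append((x, m, y))
    return {group: indirect_effect(*zip(*values))
            for group, values in by_group.items()}


# Simulated data (assumption): the manipulation raises certainty only for women.
rng = random.Random(7)
rows = []
for group, a_path in (("women", 1.0), ("men", 0.0)):
    for i in range(400):
        x = i % 2                           # stereotype absent (0) or present (1)
        m = a_path * x + rng.gauss(0, 1)    # evaluative certainty
        y = 0.5 * m + rng.gauss(0, 1)       # outcome
        rows.append((group, x, m, y))

effects = conditional_indirect_effects(rows)
print({group: round(value, 3) for group, value in effects.items()})
```

In the simulated data the indirect effect is sizable for the group in which the manipulation shifts the mediator and near zero for the other group, which mirrors the pattern the moderated-mediation model is designed to detect.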
Table 3. Results of Bootstrapping Moderated-Mediation Analyses in Study 3.
Note. Bold indicates reliable indirect effect, where BC CI does not include zero. BC CI = bias-corrected confidence interval.
*p < .05. **p < .01. ***p < .001.
Post-manipulation performance
A two-way ANCOVA that controlled for actual and perceived pre-manipulation performance revealed a significant main effect of gender. Women (adjusted M = 3.20, SE = .07) performed better than men (adjusted M = 2.90, SE = .07), F(1, 488) = 8.56, p = .004,
The potential indirect effects on post-manipulation performance were examined using the same mediation model and bootstrapping procedures described earlier. As shown in Table 3, increased evaluative certainty from making the gender stereotype salient predicted greater performance on the second infant care quiz among women, and this mediational relationship was statistically significant. For men, evaluative certainty did not have a mediational effect on post-manipulation performance.
Discussion
Building from the findings of Studies 1 and 2, the results of Study 3 provided additional support for validation via positive stereotypes. Consistent with our convergence-based account, women were more certain that they had performed well on an easy quiz about infant care when the gender stereotype about childcare abilities was made accessible relative to the control. Importantly, this increased certainty elicited by the positive stereotype was found to hold downstream consequences for beliefs and behavior. More specifically, for women, higher certainty in strong performance predicted elevated beliefs in one’s abilities and better performance on the second infant care quiz that participants received. As predicted, however, these effects associated with the gender stereotype manipulation did not hold among male participants.
General Discussion
Within task performance settings, previous research has found that activating positive stereotypes can facilitate subsequent performance among members of some groups (e.g., Asians; see Shih et al., 1999). Although these effects have been demonstrated in several studies, it is an open question whether positive group stereotypes can have an influence when activated only after a person has finished performing a task. Building from recent research on stereotype validation (Clark et al., 2015), the current work investigated this possibility in the context of positive stereotypes about Asians in math (Study 1), men in math (Study 2), and women with respect to childcare performance (Study 3). The findings across the three studies showed evidence that largely positive evaluations of previous test performance were validated when a positive, self-relevant stereotype was activated following completion of the task.
In addition, this conclusion was further supported by a supplementary meta-analysis which combined the key effects of the stereotype manipulation on evaluative certainty across the three studies (i.e., the main effect of the stereotype manipulation in Studies 1 and 2 and the simple effect of the stereotype manipulation on evaluative certainty among female participants in Study 3). Following procedures specified by Goh, Hall, and Rosenthal (2016), the results of a Stouffer’s Z test revealed a significant effect of the stereotype manipulation among participants for whom positive performance stereotypes were self-relevant, Zcombined = 3.72, p < .001 (two-tailed). In particular, these participants were more certain that they had performed well on the previous test when a positive stereotype was made salient afterward compared with a control condition.
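A Stouffer combination of this kind can be sketched as follows. The sketch assumes the unweighted form of the test, and the per-study p values are hypothetical placeholders rather than the values from Studies 1 to 3; the function names are likewise illustrative.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def z_from_two_tailed_p(p, sign=1):
    """Convert a two-tailed p value into a signed standard-normal Z score."""
    return sign * STANDARD_NORMAL.inv_cdf(1 - p / 2)


def stouffer_z(z_scores):
    """Unweighted Stouffer combination: sum of Z scores divided by sqrt(k)."""
    return sum(z_scores) / math.sqrt(len(z_scores))


def two_tailed_p(z):
    """Two-tailed p value for a standard-normal Z score."""
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


# Hypothetical two-tailed p values from three studies, all in the same direction.
study_p_values = [0.04, 0.03, 0.05]
z_scores = [z_from_two_tailed_p(p) for p in study_p_values]
z_combined = stouffer_z(z_scores)
p_combined = two_tailed_p(z_combined)
print(f"Z_combined = {z_combined:.2f}, p = {p_combined:.4f}")

# The combined Z reported in the text, converted to a two-tailed p value.
print(f"p for Z = 3.72: {two_tailed_p(3.72):.4f}")
```

The second print statement shows that a combined Z of 3.72 corresponds to a two-tailed p below .001, in line with the reported result. It also illustrates why several individually modest effects in the same direction can yield strong combined evidence.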
Accordant with past research on evaluative strength (e.g., see Tormala & Rucker, 2007), these increases in certainty were associated with critical downstream consequences for beliefs and behavior. Greater certainty in strong performance on an initial test predicted higher beliefs in one’s abilities (Studies 1 and 3), enhanced expectations for future performance (Studies 1 and 2), and increased performance on a follow-up test (Studies 2 and 3). Furthermore, consistent with our hypotheses and past research on negative stereotypes (Clark et al., 2015; Clark et al., 2017), these validating effects only emerged when the positive stereotypes were directly relevant to a person’s group membership (Study 3).
Taken together, the observed effects also provide a unique contribution to our understanding of the processes that may drive stereotype validation. As previously discussed, evidence of stereotype validation in the extant literature is consistent with two competing explanations—a convergent validity-based conceptualization and a self-serving bias account. If self-serving biases are responsible for stereotype validation, individuals presumably attribute their poor performance to a salient negative stereotype (i.e., an external attribution)—thereby increasing how certain they are about how they performed. However, the literature on self-serving biases would strongly suggest that such effects should be limited to situations where people experience failure. In contexts where success is attained, research has shown that people are relatively unlikely to rely on external reasons—such as stereotypes—for their performance. Rather, under these circumstances, individuals have been shown to predominantly make personal or internal attributions for their accomplishments (see W. K. Campbell & Sedikides, 1999). With this in mind, the current evidence of stereotype validation in situations where individuals perform well speaks against the plausibility of self-serving biases driving the effects. Instead, the present findings add further support to the rationale that stereotype validation occurs due to a convergence between an accessible, self-relevant stereotype and an individual’s evaluation of his or her performance.
The findings of the current research should also hold important implications and facilitate other directions for future inquiry. For instance, follow-up investigations could examine additional boundary conditions for these and other positive validation effects. In Study 3, the convergence between a stereotype and an evaluation was found to be a key aspect of the observed effects—with women but not men influenced by the positive childcare stereotype. With this in mind, another determinant should be the extent to which a person believes he or she performed well or poorly on a previous task. In the current research, we aimed to create performance situations that would best facilitate the hypothesized positive validation effects. Thus, we gave participants easy tasks in an attempt to constrain evaluations of performance to be positive.
If our convergence-based account is correct, positive stereotype validation should be unlikely to occur when people hold more negative views of their task performance. While not directly tested in the current studies, previous research supports this claim. For example, in one study, Clark et al. (2015) had participants take a very difficult, rather than an easy, childcare test prior to manipulating the salience of stereotypes. Although this manipulation was identical to the one used in the current Study 3, it had no effect on evaluative certainty or downstream outcomes among women in the sample. In this case, participants held predominantly negative rather than positive evaluations of their performance. Therefore, the positive stereotype likely did not converge with the test experiences these women had. In addition, several other investigations have shown a similar lack of positive validation effects when tasks are difficult and performance evaluations are negative (Clark et al., 2015; Clark et al., 2017).
Beyond these and other situational factors, future work could also identify individual differences that may determine the likelihood of positive stereotype validation. One possibility is individual differences in stigma consciousness, the degree to which people are focused on and concerned about stereotypes of their group. Previous research suggests that higher levels of stigma consciousness are associated with greater motivation to avoid, reject, and react against stereotypes of one’s group (Pinel, 1999). Consistent with this literature, Clark and colleagues (2017) found that women high in stigma consciousness were less validated than women low in stigma consciousness by negative gender stereotypes that were made salient after poor performance in economics. It stands to reason that a similar relationship may also emerge under circumstances that facilitate positive rather than negative stereotype validation. In these situations, being highly concerned about stereotypes of one’s group may increase the likelihood that one downplays or discounts the validity of these beliefs, even if such associations are positive.
We look forward to examining these and other possibilities in follow-up work. In addition, it is our hope that the current findings propel innovative research into the myriad ways that group stereotypes can influence social judgment and behavior.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported in part by National Science Foundation Grant 1226417 awarded to Jason K. Clark.
Supplemental Material
Supplementary material is available online with this article.
