Abstract
Performance on implicit attitude measures is influenced both by the nature of activated evaluative associations and by people’s ability to regulate those associations as they respond. One consequence is that identical implicit attitude scores may conceal different underlying processes. This study demonstrated this phenomenon and also shed light on the nature of age differences in antiaging bias on implicit attitude measures. Although younger and older participants demonstrated equivalent levels of antiaging bias on an Implicit Association Test (IAT), application of the Quad model showed that antiold associations were less activated among older than younger adults, but that older adults were less able to overcome these associations in performing the task. Thus, the lack of age differences in IAT performance concealed differences in both underlying evaluative associations and the ability to control those associations. These findings have important implications for the measurement and interpretation of implicit attitudes.
Chris and Steve both indicate on a self-report measure that they have mildly negative attitudes toward older people. Does this mean that they are equally prejudiced? Chris has mildly negative attitudes toward older people and reports these attitudes accurately. Steve, on the other hand, has strong antiaging attitudes but modulates his responses so as not to appear prejudiced. The self-report measure has concealed differences between Chris and Steve in their underlying attitudes.
For researchers familiar with the “willing and able” problems, it comes as no surprise that self-reported attitudes may reflect both the attitude and the regulation of the expression of the attitude. Indeed, a desire to separate so-called true attitudes from impression management was central to the development of implicit attitude measures. These measures were initially conceived as process-pure measures of automatic associations that could not be controlled. However, a growing body of research has shown that performance on implicit measures reflects additional component processes, including the inhibition of associations, the detection of appropriate responses, and response biases (Allen, Sherman, & Klauer, 2010; Amodio et al., 2004; Bartholow, Dickter, & Sestir, 2006; Conrey, Sherman, Gawronski, Hugenberg, & Groom, 2005; Gonsalkorale, Allen, Sherman, & Klauer, 2010; Gonsalkorale, Sherman, Allen, Klauer, & Amodio, 2011; Gonsalkorale, Sherman, & Klauer, 2009; Gonsalkorale, von Hippel, Sherman, & Klauer, 2009; Payne, 2001; Sherman, 2006; Sherman et al., 2008). Other research has shown that nonassociative responses such as recoding (e.g., Chang & Mitchell, 2011; De Houwer, Geldof, & De Bruycker, 2005; Gast & Rothermund, 2010; Kinoshita & Peek-O’Leary, 2005, 2006; Meissner & Rothermund, 2013; Rothermund, Teige-Mocigemba, Gast, & Wentura, 2009; Rothermund & Wentura, 2001, 2004; Rothermund, Wentura, & De Houwer, 2005), task-set shifts and task-set simplification (Mierke & Klauer, 2001, 2003), and speed–accuracy trade-offs (e.g., Brendl, Markman, & Messner, 2001; Klauer, Voss, Schmitz, & Teige-Mocigemba, 2007) may also contribute to Implicit Association Test (IAT) effects. Thus, we cannot assume that responses on an implicit measure reflect only automatic associations. This means that two people can show equivalent performance on an implicit attitude measure for very different reasons.
Just like self-report measures, implicit measures can conceal differences in underlying attitudes.
To further illustrate this point, consider the Stroop (1935) task. A young child who knows colors but does not know how to read will likely perform very well on the task, making few errors. An adult with full reading ability may achieve the same level of success. However, these performances would be based on very different underlying processes. In the case of the adult, the automatic habit to read the word must be overcome in order to report the color of the ink accurately on incompatible trials (e.g., the word “blue” written in red ink). In contrast, the child has no automatic habit to overcome—she only sees the color of the ink. The same logic applies to many implicit measures of attitudes (which often employ the same compatibility logic as the Stroop task). For example, on an implicit measure of antiaging bias, automatically activated evaluative associations between old age and negativity must be overcome on incompatible trials that require participants to pair old age and positive stimuli. As such, the identical responses of the two individuals may reflect mildly biased associations in one case, but strong associations that are successfully overcome in the other.
There were two purposes of the present research. First, we attempted to empirically demonstrate this challenge for interpreting group differences in implicit attitudes. Second, in examining this problem, we sought to shed light on the nature of age differences in implicit antiaging bias. We chose to examine implicit antiaging bias because previous research suggested that this context may be one in which the opposing effects of activated associations and the ability to regulate those associations may be concealed on measures of implicit attitudes. In fact, several studies have reported equally strong proyoung bias among younger and older adults’ performance on an IAT (Greenwald, McGhee, & Schwartz, 1998) that measures implicit attitudes toward age (e.g., Hummert, Garstka, O’Brien, Greenwald, & Mellott, 2002; Jost, Banaji, & Nosek, 2004; Nosek et al., 2007). Other studies have even found slightly greater proyoung bias among older participants (Nosek, Banaji, & Greenwald, 2002). In some cases, these findings have been interpreted as evidence of system justification—the process by which status hierarchies and inequality between groups are justified and maintained (e.g., Jost & Banaji, 1994; Jost et al., 2004). According to this view, both younger and older adults have adopted the proyoung/antiold associations that are prevalent within society, with this evaluative preference for youth serving to legitimize the disadvantaged status of older adults.
However, although younger and older adults’ proyoung bias on the IAT may reflect equally biased evaluative associations in the two age groups, it may alternatively reflect mildly biased associations that are not successfully inhibited by older participants, but strong associations that are successfully inhibited by younger participants. Indeed, aging is associated with diminished inhibitory functioning (e.g., Connelly, Hasher, & Zacks, 1991; Hasher & Zacks, 1988), including the inhibition of race-based associations (e.g., Gonsalkorale, Sherman, et al., 2009; also see Stewart, von Hippel, & Radvansky, 2009 for evidence that aging is associated with diminished control of associations). This suggests that, even if older people possess more favorable associations with aging than do younger people, they may show similar levels of implicit antiaging bias due to an inability to inhibit whatever negative associations they possess. Such differences between younger and older adults could be revealed by disentangling the processes underlying implicit attitude measures.
The current study examined whether an absence of age-based differences in implicit antiaging bias conceals differences in underlying attitudinal component processes. Younger and older participants completed the young–old version of the IAT (Greenwald et al., 1998). We separated the component processes of the IAT by applying the Quadruple process model (Quad model; Conrey et al., 2005; Sherman et al., 2008). The Quad model is a multinomial model (see Batchelder & Riefer, 1999) designed to estimate the independent contributions of multiple processes from responses on implicit measures of bias (for reviews, see Sherman, 2006; Sherman et al., 2008). According to the model, responses on implicit measures reflect the operation of four qualitatively distinct processes: activation of associations (AC), detection of correct responses (D), overcoming bias (OB), and guessing (G). The AC parameter refers to the degree to which biased associations are activated when responding to a stimulus. All else being equal, the stronger the associations, the more likely they are to be activated and to influence responses. The D parameter reflects participants’ ability to detect the correct response in performing the task. Sometimes, the activated associations conflict with the detected correct response. For example, on incompatible trials of implicit attitude measures (e.g., trying to associate old faces with positive words in an IAT), activated associations (e.g., between older adults and negativity) conflict with detected correct responses. In such cases, the Quad model proposes that an OB process resolves the conflict. As such, the OB parameter refers to inhibitory processes that prevent activated associations from influencing behavior when they conflict with detected correct responses. Finally, the G parameter reflects general response tendencies that may occur when individuals have no associations that direct behavior and they are unable to detect the appropriate response. 
The Quad model and the construct validity of its parameters have been extensively validated in previous research (see Beer et al., 2008; Conrey et al., 2005; Gonsalkorale, Sherman, et al., 2009; Sherman et al., 2008).
Method
Participants
Participants were 93,067 (61.17% female) visitors to the IAT demonstration website (http://implicit.harvard.edu/; Nosek et al., 2002) between December 2002 and May 2006. Visitors in two age ranges were selected for analysis: younger participants aged 21–40 (N = 91,186) and older participants aged 65 and above (N = 1,881).
Materials and Procedure
After providing demographic information, participants completed the age attitude IAT. In the IAT, participants used two keys to categorize 12 target images (6 old faces, 6 young faces) and 16 evaluative words (8 pleasant, 8 unpleasant). They were instructed to make their classifications as quickly and accurately as possible. They first completed two 20-trial practice blocks, in which they discriminated pleasant from unpleasant words, and old from young faces. The third and fourth blocks were critical blocks consisting of 20 and 40 trials, respectively. Participants were instructed to press one key whenever they saw a picture of a young person or a pleasant word, and another key whenever they saw a picture of an old person or an unpleasant word. The keys used to categorize old and young faces were switched in the remaining blocks. The fifth block was a practice block in which participants discriminated old from young faces. In the last two blocks, “old” shared a response key with the evaluative dimension “pleasant.” Participants who respond more quickly when “old” shares a key with “unpleasant” (“compatible” trials) than when it shares a key with “pleasant” (“incompatible” trials) are thought to have an implicit preference for young relative to old people (Greenwald et al., 1998). Category labels remained on the top left and right of the screen throughout the task, while stimulus pictures and words appeared in the center of the screen. A red “X” appeared whenever participants made an error, and they were required to correct it before moving on to the next trial. The order of the critical blocks was counterbalanced across participants.
Results
IAT Scores
IAT scores were calculated according to the algorithm described by Greenwald, Nosek, and Banaji (2003). This algorithm was designed, in part, to control for differences in overall speed of responding, and has been shown to minimize the effects of age-related slowing on the IAT. Trial latencies greater than 10,000 ms were dropped from analysis prior to computing separate mean latencies for the compatible and incompatible blocks. Because the IAT contained a built-in error penalty, no further penalty was applied to error latencies (Greenwald, Nosek, & Banaji, 2003). The difference between the mean compatible and incompatible latencies was then divided by the pooled standard deviation of all critical trials to produce IAT scores, such that higher scores indicate stronger implicit proyoung preference. Despite the enormous sample size, there was no difference between older (M = .47, SD = .41) and younger (M = .48, SD = .38) participants in their IAT scores, t(93,065) = 1.18, p = .24. These results are consistent with previous findings that both younger and older adults show proyoung bias and that there are no age differences in the magnitude of this bias (e.g., Hummert et al., 2002; Jost et al., 2004; Nosek et al., 2007).
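The scoring steps described above can be sketched in a few lines of Python. This is a simplified illustration of the description in the text, not the full published algorithm: the function name, the argument names, and the use of the sample standard deviation over all retained critical trials are assumptions made for the example.

```python
from statistics import mean, stdev

def iat_score(compatible_ms, incompatible_ms, max_latency_ms=10_000):
    """Simplified IAT score for one participant (illustrative only).

    Latencies above max_latency_ms are dropped, block means are computed,
    and the mean difference is divided by the standard deviation of all
    retained critical trials. Positive values indicate faster responding
    on compatible trials, i.e., a proyoung preference.
    """
    compatible = [t for t in compatible_ms if t <= max_latency_ms]
    incompatible = [t for t in incompatible_ms if t <= max_latency_ms]
    # Assumption: "pooled" SD is taken over all retained critical trials.
    pooled_sd = stdev(compatible + incompatible)
    return (mean(incompatible) - mean(compatible)) / pooled_sd

# Hypothetical latencies in milliseconds; the 20,000 ms trial is dropped.
score = iat_score([600, 700, 800], [800, 900, 1000, 20_000])
```

Because the difference is scaled by the participant's own latency variability, a uniformly slower responder does not automatically receive a larger score.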
Modeling
The structure of the Quad model is depicted as a processing tree in Figure 1. In the tree, each path represents a likelihood. Processing parameters with lines leading to them are conditional upon all preceding parameters. For instance, OB is conditional upon both AC and D. The conditional relationships described by the model form a system of equations that predict the numbers of correct and incorrect responses in different conditions (e.g., compatible and incompatible trials). For example, an old face in an incompatible trial will be responded to correctly with the probability: AC × D × OB + (1 − AC) × D + (1 − AC) × (1 − D) × G. This equation sums the three possible paths by which a correct answer can be returned in this case. The first part of the equation, AC × D × OB, is the likelihood that the association between old and unpleasant is activated and that the correct answer can be detected and that the association is overcome in favor of the detected response. The second part of the equation, (1 − AC) × D, is the likelihood that the association is not activated and that the correct response can be detected. Finally, (1 − AC) × (1 − D) × G is the likelihood that the association is not activated and the correct answer cannot be detected and that the participant guesses correctly. The respective equations for each item category (e.g., old faces, young faces, pleasant words, and unpleasant words in both compatible and incompatible blocks) are then used to predict the observed proportions of errors in a given data set. The model’s predictions are then compared to the actual data to determine the model’s ability to account for the data. A chi-square (χ²) estimate is computed for the difference between the predicted and observed errors. In order to best approximate the model to the data, the parameter values are changed through maximum likelihood estimation until they produce the minimum possible value of the χ².
The final parameter values that result from this process are interpreted as relative levels of the processes (for further details about multinomial modeling, see Batchelder & Riefer, 1999; Sherman, Klauer, & Allen, 2010).
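As a minimal sketch, the equation given above for an old face on an incompatible trial can be written as a small Python function. The function name and the parameter values in the usage lines are hypothetical and chosen only to illustrate the arithmetic; they are not estimates from this study.

```python
def p_correct_incompatible(ac, d, ob, g):
    """Probability of a correct response to an old face on an incompatible
    trial, as the sum of the three correct-response paths in the text."""
    overcome = ac * d * ob                 # association active, detected, overcome
    detect_only = (1 - ac) * d             # association not active, detected
    lucky_guess = (1 - ac) * (1 - d) * g   # neither active nor detected, correct guess
    return overcome + detect_only + lucky_guess

# Hypothetical parameter values, for illustration only.
p = p_correct_incompatible(ac=0.1, d=0.9, ob=0.5, g=0.5)
p_error = 1 - p  # all remaining paths in the tree lead to an error

# With everything else fixed, a lower OB lowers accuracy on incompatible trials.
p_low_ob = p_correct_incompatible(ac=0.1, d=0.9, ob=0.0, g=0.5)
```

The last line shows why weak associations combined with poor overcoming of bias can produce the same accuracy as stronger associations that are successfully overcome.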

Figure 1. The Quadruple process model (Quad model). Each path represents a likelihood. Parameters with lines leading to them are conditional upon all preceding parameters. The table on the right side of the figure depicts correct (√) and incorrect (X) responses as a function of process pattern and trial type (Panel A for “old” targets and Panel B for “unpleasant” attributes). In this particular figure, the guessing bias refers to guessing with the positive (pleasant) key.
The overall error rate for the IAT was 6.55%. We conducted aggregate-level analyses by first estimating two AC parameters, one OB parameter, one D parameter, and one G parameter for each of the two age groups (i.e., one set of parameter estimates for younger adults and another set for older adults). One AC parameter measured the extent to which associations between “old” and “unpleasant” were activated in performing the task, and the other AC parameter measured the extent to which associations between “young” and “pleasant” were activated in performing the task. The G parameter was coded so that higher scores represent a bias toward guessing with the positive (pleasant) key.
One of the difficulties with modeling large data sets is that the χ² test is dependent on sample size, such that minute deviations from the model can jeopardize model fit when power is high (see Cohen, 1988). Not surprisingly, the aggregate-level analysis indicated that the Quad model did not fit our large data set, χ²(3) = 1,687.79, p < .0001, N = 11,168,040. However, the effect size of this difference between the data and the model’s predicted data was small, w = .012, indicating satisfactory fit when controlling for power.
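The effect size reported here can be reproduced with Cohen's w, computed as the square root of χ² divided by the number of observations. The short sketch below assumes that formula and uses the values reported above; the function name is illustrative. The reported N equals 93,067 participants times 120, which is consistent with 120 critical trials per participant.

```python
import math

def cohens_w(chi_square, n_observations):
    """Cohen's w effect size for a chi-square statistic: sqrt(chi2 / N)."""
    return math.sqrt(chi_square / n_observations)

n_trials = 93_067 * 120          # matches the reported N = 11,168,040
w = cohens_w(1_687.79, n_trials)
w_rounded = round(w, 3)
```

With more than 11 million observations, even a misfit this small yields a highly significant χ², which is why the effect size is the more informative index of fit in this case.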
Aggregate-level parameter estimates for the two groups are displayed in Table 1. Analyses of these aggregate-level parameter estimates showed that there were significant differences in all parameters at p < .0001. Estimates of old–unpleasant AC, Δχ²(1) = 85.75, w = .003, and young–pleasant AC, Δχ²(1) = 287.14, w = .005, were significantly lower for the older age group than for the younger age group, indicating that older participants had both less positive associations with youth and less negative associations with old age activated than younger participants. At the same time, OB was lower among the older adults, demonstrating that the older participants were less likely to inhibit their (less biased) activated associations than were younger participants, Δχ²(1) = 329.55, w = .006. There was also a significant difference in detection (D), indicating that older adults were better able to detect correct and incorrect responses on the IAT, Δχ²(1) = 884.42, w = .043. Positive guessing was also higher among the older adults, Δχ²(1) = 23.49, w = .001.
Table 1. Parameter Estimates for the Young–Old Implicit Association Test (IAT).
Note. AC = Activation of Associations; D = Detection; G = Guessing; OB = Overcoming Bias.
Given the robust finding that older adults tend to prioritize accuracy over speed in responding to tasks (e.g., Rabbitt, 1979; Salthouse, 1979), we next examined whether the age differences in parameter estimates reflect differences in speed–accuracy trade-offs. Using the EZ diffusion model (Wagenmakers, van der Maas, Dolan, & Grasman, 2008), we calculated for each participant an estimate of parameter a (boundary separation), which captures speed–accuracy trade-offs. We also calculated Quad model parameter estimates of AC, D, OB, and G for each participant. We then conducted analyses of covariance (ANCOVAs) with age group as the independent variable, boundary separation estimates as the covariate, and individual-level Quad model parameter estimates as the dependent variables. Boundary separation estimates were a significant covariate in all ANCOVAs, Fs > 4,201.62, ps < .0001. The age differences in old–unpleasant AC, F(1, 87,655) = 6.17, p = .01, OB, F(1, 87,655) = 24.31, p < .001, and G, F(1, 87,655) = 4.61, p < .05, were significant in these analyses. However, there were no significant age differences in young–pleasant AC, F(1, 87,655) = 0.04, p = .83, or D, F(1, 87,655) = 1.97, p = .16.
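To make the boundary separation estimate concrete, the following sketch computes EZ diffusion parameters from three summary statistics per participant: proportion correct, variance of response times, and mean response time. The text does not describe the authors' implementation, so this is an assumption-laden illustration that uses the standard closed-form EZ equations with the conventional scaling parameter s = 0.1 and response times in seconds; the function and variable names are hypothetical.

```python
import math

def ez_diffusion(prop_correct, rt_variance, rt_mean, s=0.1):
    """Closed-form EZ diffusion estimates (illustrative sketch).

    Returns (drift rate v, boundary separation a, nondecision time Ter).
    Accuracies of exactly 0, 0.5, or 1 need an edge correction first.
    """
    if prop_correct in (0.0, 0.5, 1.0):
        raise ValueError("apply an edge correction to prop_correct first")
    logit = math.log(prop_correct / (1.0 - prop_correct))
    x = logit * (logit * prop_correct ** 2 - logit * prop_correct
                 + prop_correct - 0.5) / rt_variance
    drift = math.copysign(1.0, prop_correct - 0.5) * s * x ** 0.25
    boundary = s ** 2 * logit / drift
    y = -drift * boundary / s ** 2
    mean_decision_time = (boundary / (2.0 * drift)) * (1.0 - math.exp(y)) / (1.0 + math.exp(y))
    return drift, boundary, rt_mean - mean_decision_time

# Hypothetical summary statistics for one participant (times in seconds).
v, a, ter = ez_diffusion(prop_correct=0.802, rt_variance=0.112, rt_mean=0.723)
```

For these inputs the boundary separation comes out at roughly 0.14. Larger values of a indicate more cautious responding, which is the quantity entered as the covariate in the ANCOVAs.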
Discussion
The current study makes two important points. First, the results demonstrate that implicit measures can obscure important differences in the processes underlying implicit attitudes. Consistent with previous research (e.g., Hummert et al., 2002; Jost et al., 2004; Nosek et al., 2007), scores on the implicit attitude measure indicated that younger and older people have equally negative attitudes toward aging. However, the Quad model analysis suggests that younger and older adults showed equivalent performance on the implicit measure for different reasons. Antiold associations were less activated among older adults, who also were less likely to overcome these associations than younger adults in responding to the task. This finding illustrates an important conceptual and empirical challenge for interpreting group differences in performance on implicit measures. Though it is now broadly acknowledged that performance on implicit measures involves a variety of processes in addition to association activation, it remains the case that the vast majority of researchers using these measures interpret the results to reflect the extent to which respondents possess biased evaluative associations. Previous research has shown that differences in implicit attitudes across contexts or groups of participants may reflect a number of distinct processes (for a review, see Sherman et al., 2008). The current results show that the absence of differences in implicit attitudes may conceal real differences in these underlying processes.
The second important conclusion from this research is that previous claims of equal antiaging bias among younger and older respondents have been overstated. Though the IAT scores of younger and older respondents are consistent with this conclusion, the modeling results demonstrate that older people, in fact, have less negative old–unpleasant associations than do younger people. These findings suggest a modification to conclusions of system justification in the case of antiaging attitudes—older adults appear to be just as biased as younger adults only because they are less likely to inhibit their (weaker) associations. However, it is important to note that both young and old participants show robust associations between aging and negativity, both in IAT scores and in the AC-old parameter estimate (which reliably differ from zero in all cases). As such, these data are consistent with the system justification view, even if the levels of bias among young and old people may differ.
Interestingly, the aggregate analyses indicated that older adults also showed higher detection and a more positive response bias. Both of these effects replicate earlier findings (Gonsalkorale, Sherman, et al., 2009), and neither of them can explain why older adults would demonstrate aging bias equal to younger adults, despite having less biased associations. That is, higher detection and a more positive response bias would lead to reduced, not increased, antiaging bias among older adults. The age-based increase in D is consistent with the widespread notion that older adults put more weight on the accuracy side of the speed–accuracy trade-off (e.g., Rabbitt, 1979; Salthouse, 1979). This interpretation is corroborated by the present covariance analysis involving boundary separation. Specifically, the effect on the D parameter was not robust when controlling for speed–accuracy trade-offs, suggesting that age differences in the D parameter may, in part, reflect the tendency for older adults to be more cautious in responding to the IAT. This finding highlights that processes other than those represented in the Quad model may influence IAT performance. Certainly, the Quad model is not intended to offer an exhaustive account of the processes that may contribute to implicit task performance (Sherman, 2006). Besides speed–accuracy trade-off settings (e.g., Brendl et al., 2001; Klauer et al., 2007), processes related to recoding (e.g., Chang & Mitchell, 2011; De Houwer et al., 2005; Gast & Rothermund, 2010; Kinoshita & Peek-O’Leary, 2005, 2006; Meissner & Rothermund, 2013; Rothermund et al., 2009; Rothermund & Wentura, 2001, 2004; Rothermund et al., 2005), as well as task-set shifts and task-set simplification (Mierke & Klauer, 2001, 2003), have been shown to influence IAT performance. Future research that directly examines the roles of these processes in age differences in implicit attitudes would be valuable.
The capacity to identify subgroups of people who differ in underlying processes, despite equivalent performance on implicit measures, has implications beyond measurement accuracy. For example, this may aid development of appropriate interventions. Strategies that focus on eradicating evaluative associations (e.g., in the domains of prejudice, substance addiction, etc.) may only be effective for individuals whose associations are strong to begin with. Similarly, interventions focusing on improving control over automatic habits may be more beneficial for individuals with weak control (as opposed to strong associations). The Quad model provides a unique tool for identifying when standard measures of implicit attitudes conceal differences in underlying processes and suggesting strategies to reduce bias.
Acknowledgments
We are grateful to Brian Nosek, Tony Greenwald, and Mahzarin Banaji for sharing data from Project Implicit. We thank Klaus Rothermund and two anonymous reviewers for their helpful feedback on an earlier draft of this article.
Declaration of Conflicting Interests
The author(s) declared no conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by a grant from the National Science Foundation (BCS 0820855) to the second author.
