Abstract
Why do some people recognize faces easily and others frequently make mistakes in recognizing faces? Classic behavioral work has shown that faces are processed in a distinctive holistic manner that is unlike the processing of objects. In the study reported here, we investigated whether individual differences in holistic face processing have a significant influence on face recognition. We found that the magnitude of face-specific recognition accuracy correlated with the extent to which participants processed faces holistically, as indexed by the composite-face effect and the whole-part effect. This association is due to face-specific processing in particular, not to a more general aspect of cognitive processing, such as general intelligence or global attention. This finding provides constraints on computational models of face recognition and may elucidate mechanisms underlying cognitive disorders, such as prosopagnosia and autism, that are associated with deficits in face recognition.
Faces are arguably the most important social stimuli; they contain rich information on identity, gender, mood, and age. Although face recognition is essential for daily activities, the ability to recognize faces varies considerably across individuals. Some people are able to recognize almost every face they see, even after just one exposure (Russell, Duchaine, & Nakayama, 2009), whereas other people complain about their failure to recognize even their family members’ or close friends’ faces (for reviews, see Behrmann & Avidan, 2005; Duchaine & Nakayama, 2006). In the study reported here, we investigated what cognitive mechanisms underlie individual differences in face recognition.
Previous studies have demonstrated that upright faces are processed as integrated wholes rather than as simple collections of face features (e.g., Maurer, Le Grand, & Mondloch, 2002; McKone, Martini, & Nakayama, 2001; Sergent, 1984; Tanaka & Farah, 1993; Young, Hellawell, & Hay, 1987). This holistic processing of faces is shown most clearly with two behavioral markers: the composite-face effect (CFE) and the whole-part effect (WPE). The CFE refers to the increase in response time for identifying the top half of a composite face (one whose top and bottom halves come from different faces) when the halves are aligned relative to when they are misaligned (Young et al., 1987). The WPE refers to the increase in accuracy in distinguishing which of two face parts (e.g., noses) appeared in a previously seen face when those parts are presented in the context of whole faces rather than in isolation (Tanaka & Farah, 1993).
Studies on the other-race effect (e.g., Michel, Rossion, Han, Chung, & Caldara, 2006) and prosopagnosia (e.g., Ramon, Busigny, & Rossion, 2010; Zhu, Li, Chow, & Liu, 2009) suggest that holistic face processing may be crucial for face recognition. However, little is known about whether individual differences in holistic face processing predict face recognition ability. Using the individual differences approach (Wilmer, 2008), two recent studies have attempted to answer this question but have provided contradictory results. Konar, Bennett, and Sekuler (2010) reported no correlation between holistic face processing and face recognition, whereas Richler, Cheung, and Gauthier (2011) found a positive correlation but only with a specific kind of CFE paradigm (i.e., the congruency-composite paradigm).
We argue that the results of both studies are inconclusive, because the researchers did not isolate processes that are specific to face recognition in their tasks. That is, the index of holistic face processing (i.e., the CFE) used in the two studies was based on the difference between performance with aligned and misaligned faces; however, the index of face recognition (e.g., face identification accuracy) was an absolute measure of identifying a target face among foil faces. The use of this absolute measure is problematic because it may have diluted the variance of the cognitive process of interest (i.e., the process specific to face recognition) with the variance of other domain-general processes (e.g., general visual-discrimination abilities, attention, decision making). When difference measures are used, however, the domain-general functions are subtracted out, and the process of interest is isolated. Thus, difference measures have been used in a variety of domains, especially when individual differences are considered; such domains include attention (Fan, McCandliss, Sommer, Raz, & Posner, 2002), face recognition (Zhu et al., 2010; see also Yovel & Kanwisher, 2008), theory of mind (Happé, Brownell, & Winner, 1999), and deception (Morgan, LeSage, & Kosslyn, 2009).
In the study reported here, we devised three behavioral measures, each of which contrasted face processing with processing of a matched nonface stimulus. We first used a difference measure between performance with faces and performance with flowers in an old/new recognition task as an index of face-specific recognition ability (FRA). Then, we examined whether individual differences in FRA were associated with individual differences in holistic face processing, as indexed by the CFE and the WPE. Finally, a variant of Navon’s (1977) task with hierarchical shapes was included to test whether global processing of visual stimuli is linked with face recognition (Behrmann, Avidan, Marotta, & Kimchi, 2005).
Method
Participants
Three hundred thirty-seven students (141 males, 196 females; mean age = 20.4 years, SD = 0.9 years) from Beijing Normal University participated in the study. All participants had normal or corrected-to-normal vision. The study was approved by the institutional review board of Beijing Normal University. Prior to testing, we obtained written informed consent from all participants.
Stimuli and procedure
Participants completed four computer-based tasks: an old/new recognition task, a composite-face task, a whole-part task, and a global-local task. In addition, they completed one paper-based test: Raven’s Advanced Progressive Matrices (Raven’s APM; Raven, Raven, & Court, 1998). Raven’s APM was administered on a different day from the computer-based tasks. The images in all face tasks were gray-scale adult Chinese faces with the external contours removed (leaving a roughly oval shape with no hair on the top and sides, with the addition of the neck in the old/new recognition task). These images were selected from an in-house database of adult Chinese faces.
Old/new recognition task
Thirty face images and 30 flower images were used in the old/new recognition task (Fig. 1a). Flower images were gray-scale pictures of common flowers with leaves and background removed. There were two blocks in this task: a face block and a flower block, which were counterbalanced across participants. Each block consisted of one study segment and one test segment. In the study segment, 10 images of one object category were shown for 1 s per image with an interstimulus interval (ISI) of 0.5 s, and these studied images were shown twice. In the test segment, 10 studied images were shown twice, randomly intermixed with 20 new images from the same category. On presentation of each image, participants were instructed to indicate whether the image had been shown in the study segment.

Fig. 1. Example stimuli and trial types. In the old/new recognition task (a), participants studied a single image (either a face or a flower); then they were shown a series of individual images of the corresponding type and asked to indicate which of the images had been shown in the study segment. In the composite-face task (b), participants saw two consecutively presented faces, each of which was composed of the top half of one face and the bottom half of a different face; the halves of both faces in each pair were either aligned or misaligned. Participants were instructed to judge whether the top halves of the faces in each pair were identical. In the whole-part task (c), participants first studied three faces with associated names. In each test trial, participants were shown either a pair of complete faces (whole condition) or a pair of partial faces (part condition). In both conditions, participants were asked to identify which of the two stimuli contained a target feature (e.g., eyes) that belonged to a specific face they had seen in the study phase. In the global-local task (d), participants viewed either a consistent shape, in which the local objects forming the shape had the same shape as the global object (e.g., local circles forming a global circle), or an inconsistent shape, in which the local shapes were different from the global shape (e.g., local squares forming a global circle). In separate blocks, participants were asked to identify the local shape or the global shape.
Composite-face task
For the composite-face task, we created face composites by splitting face images into halves horizontally across the middle of the nose and then combining the top half of one face and the bottom half of another face. The top half and the bottom half were presented either aligned or misaligned (Fig. 1b). Each trial started with a blank screen for 1 s, followed by the first composite face presented at the center of the screen for 0.8 s. Then, after an ISI of 0.5 s, the second composite face appeared for another 0.5 s. In each trial, both faces within a pair were either aligned or misaligned, and these two conditions were randomly intermixed. There were 40 trials for each condition, half of which consisted of face pairs that shared an identical top half (same trials), and half of which consisted of face pairs with different top halves (different trials). The bottom halves were always different. Participants were instructed to judge whether the top halves of the composite faces were identical.
Whole-part task
The whole-part task had one study segment and one test segment (Fig. 1c). In the study segment, participants were instructed to memorize three faces and their associated names. Each face-name pair was shown for 5 s with an ISI of 1 s. Only when the participants could correctly identify all face-name pairs were they allowed to enter the test phase. On each trial of the test phase, a question (e.g., “Which is Xiao Zhang’s nose?”) was presented, followed by a choice of two alternative pictures presented on the left and right sides of the screen. The display remained on the screen until the participants responded. There were two conditions, each consisting of 36 trials. For the part condition, the display contained two isolated features (e.g., two noses): One was from the target face (e.g., Xiao Zhang’s face), and the other was from one of the other studied faces. For the whole condition, the display contained two whole faces, with the target and a foil face differing only with respect to one face part. Stimuli were matched between the two conditions, such that each part (e.g., Xiao Zhang’s nose) tested in the whole condition was also tested in the part condition in a separate trial. The whole and part conditions were randomly intermixed.
Global-local task
The global-local task was a variant of Navon’s (1977) task. The stimuli were four hierarchical shapes of two types: consistent shapes, in which the global and the local objects forming the shapes shared an identity (e.g., local circles forming a global circle), and inconsistent shapes, in which the shapes at the two levels had different identities (e.g., local squares forming a global circle; Fig. 1d). There were two blocks, each of which contained 80 trials; each block was preceded by instructions to identify shapes at either the local or global level. In each block, there were 40 trials of consistent shapes and 40 trials of inconsistent shapes; the two types of trial were intermixed randomly. Each trial started with a blank screen for 0.5 s, followed by a central fixation cross for 0.7 s. Then, one of the four possible stimuli appeared for 0.15 s. Participants were instructed to indicate whether the shape they saw was a circle or a square as quickly as possible.
Raven’s APM
Raven’s APM contains 48 multiple-choice items of abstract reasoning, in which participants are asked to identify the missing figure required to complete a larger pattern. Because the participant group in our study was highly homogeneous (i.e., college students), the number of correctly answered items was used as a measure of each individual’s general cognitive ability.
Data analysis
For each measure, we calculated the difference between stimuli of interest and control stimuli as an index of cognitive functions specific to face processing. To index FRA, we used the A′ statistic from signal detection theory. Specifically, we subtracted A′ for flowers from A′ for faces in the old/new recognition task (Duchaine & Nakayama, 2005). A′ for each type of stimulus was calculated separately using the following formula: 1/2 + [p(hit) − p(false alarm)] × [1 + p(hit) − p(false alarm)]/{4 × p(hit) × [1 − p(false alarm)]}. The CFE, an index of holistic face processing, was calculated by entering response-time performance for aligned and for misaligned faces in the composite-face task in the following formula: (aligned − misaligned)/(aligned + misaligned) (de Heering & Rossion, 2008). (We also calculated the CFE using the same formula but with accuracy instead of response time. See Supplemental Analysis in the Supplemental Material available online for details.) Only performance on same trials was used for calculating the CFE, as is traditionally done (e.g., de Heering, Houthuys, & Rossion, 2007; de Heering & Rossion, 2008; Le Grand et al., 2006; Le Grand, Mondloch, Maurer, & Brent, 2004; Michel et al., 2006). Different trials were used as filler trials because there is no CFE on those trials (e.g., Le Grand et al., 2004).
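The A′ formula and the FRA difference score described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the example hit and false-alarm rates are hypothetical, and the formula as written in the text applies only when the hit rate is at least as large as the false-alarm rate.

```python
def a_prime(hit_rate: float, fa_rate: float) -> float:
    """A' as given in the text, valid for hit_rate >= fa_rate.

    A' = 1/2 + (H - F)(1 + H - F) / (4 H (1 - F))
    """
    if not (0.0 < hit_rate <= 1.0 and 0.0 <= fa_rate < 1.0):
        raise ValueError("need 0 < hit_rate <= 1 and 0 <= fa_rate < 1")
    if hit_rate < fa_rate:
        raise ValueError("the formula in the text covers only hit_rate >= fa_rate")
    diff = hit_rate - fa_rate
    return 0.5 + diff * (1.0 + diff) / (4.0 * hit_rate * (1.0 - fa_rate))


def face_recognition_ability(face_hit, face_fa, flower_hit, flower_fa):
    """FRA: A' for faces minus A' for flowers (difference measure)."""
    return a_prime(face_hit, face_fa) - a_prime(flower_hit, flower_fa)


# Hypothetical participant: rates are illustrative, not data from the study.
fra = face_recognition_ability(0.9, 0.1, 0.8, 0.2)
print(round(fra, 4))
```

Chance performance (equal hit and false-alarm rates) gives A′ = .5 and perfect performance gives A′ = 1, so FRA is positive when a participant discriminates faces better than flowers.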
Another index of holistic face processing, the WPE, was calculated by entering accuracy for whole faces and for partial faces on the whole-part task in the following formula: (whole − part)/(whole + part) (Zhu et al., 2009). Global-to-local interference (GLI), an index of the tendency to globally process general objects, was calculated by entering response times in the global-local task in the following formula: [consistent(global − local) − inconsistent(global − local)]/[consistent(global + local) + inconsistent(global + local)] (Behrmann et al., 2005). This measure was used as a control for face-specific holistic processing.
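The CFE, WPE, and GLI indices are all normalized differences, so they can be sketched with one small helper. The function names and the example response times and accuracies below are hypothetical and serve only to illustrate the formulas given in the text.

```python
def normalized_difference(a: float, b: float) -> float:
    """(a - b) / (a + b), the form shared by the CFE and WPE indices."""
    return (a - b) / (a + b)


def cfe(rt_aligned: float, rt_misaligned: float) -> float:
    """Composite-face effect from response times on same trials."""
    return normalized_difference(rt_aligned, rt_misaligned)


def wpe(acc_whole: float, acc_part: float) -> float:
    """Whole-part effect from accuracy in the whole and part conditions."""
    return normalized_difference(acc_whole, acc_part)


def gli(con_global: float, con_local: float,
        incon_global: float, incon_local: float) -> float:
    """Global-to-local interference from response times.

    [consistent(global - local) - inconsistent(global - local)]
    / [consistent(global + local) + inconsistent(global + local)]
    """
    numerator = (con_global - con_local) - (incon_global - incon_local)
    denominator = (con_global + con_local) + (incon_global + incon_local)
    return numerator / denominator


# Illustrative values only (response times in ms, accuracy as proportions).
print(round(cfe(800.0, 700.0), 4))
print(round(wpe(0.80, 0.70), 4))
print(round(gli(500.0, 520.0, 510.0, 580.0), 4))
```

Dividing by the sum scales each effect by the participant's overall speed or accuracy, so that a generally slow participant does not obtain a large index simply because all of his or her response times are long.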
Two different but related types of analyses were used. The first analysis focused on participants from the opposite ends of the distribution in the face recognition task. That is, participants with a relatively high score (i.e., 1 SD above the population mean) were compared with participants with a relatively low score (i.e., 1 SD below the population mean), thus forming a maximum phenotypic separation (Plomin, Haworth, & Davis, 2009). The criterion of selecting extreme participants (i.e., 1 SD above or below the mean) was based on the standard used in previous studies (e.g., Cheung, Rutherford, Mayes, & McPartland, 2010). The second analysis was a traditional correlational analysis that was based on data from the entire sample of participants tested. These two analyses are complementary to each other: The former relies on the difference in means, and the latter relies on the distribution of individual differences.
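The extreme-groups comparison can be sketched as below. Two points are assumptions rather than details given in the text: the sample standard deviation is used for the 1 SD cutoff, and the groups are compared with a pooled-variance t test (which is at least consistent with the degrees of freedom reported in the Results). All names and values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def extreme_groups(scores, n_sd=1.0):
    """Return indices of participants more than n_sd SDs above / below the mean."""
    m, s = mean(scores), stdev(scores)  # sample SD: an assumption
    high = [i for i, x in enumerate(scores) if (x - m) / s > n_sd]
    low = [i for i, x in enumerate(scores) if (x - m) / s < -n_sd]
    return high, low


def pooled_t(x, y):
    """Student's two-sample t with pooled variance; returns (t, df)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1.0 / nx + 1.0 / ny))
    return t, nx + ny - 2


# Toy scores standing in for FRA; a second toy measure is compared across groups.
fra_scores = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
other_measure = [1.0, 2.0, 9.0, 9.0, 9.0, 9.0, 9.0, 9.0, 9.0, 4.0, 6.0]
high, low = extreme_groups(fra_scores)
t, df = pooled_t([other_measure[i] for i in high], [other_measure[i] for i in low])
print(high, low, round(t, 3), df)
```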
Nine outlier participants (2.7% of the participant population) were excluded from further analyses. (Outliers were defined as scoring more than two interquartile ranges below the first quartile for accuracy-based absolute measures, or more than two interquartile ranges above the third quartile for response-time-based absolute measures.) The Spearman-Brown-corrected split-half reliability was calculated for each task. The upper boundary of the correlation between two difference measures was calculated as the square root of the product of the reliability scores for these two measures. True correlation coefficients (rcorrected) were estimated by dividing the raw correlation coefficients by the corresponding upper boundaries (Schmidt & Hunter, 1999).
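The outlier rule and the reliability correction can be sketched as follows. The quartile convention (Python's default "exclusive" method) is an assumption, because the text does not say how quartiles were computed, and the reliability values in the example are hypothetical rather than taken from Table 1.

```python
from math import sqrt
from statistics import quantiles


def low_outlier_cutoff(values, k=2.0):
    """Scores below Q1 - k * IQR are outliers (accuracy-based measures)."""
    q1, _, q3 = quantiles(values, n=4)  # quartile method is an assumption
    return q1 - k * (q3 - q1)


def high_outlier_cutoff(values, k=2.0):
    """Scores above Q3 + k * IQR are outliers (response-time-based measures)."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 + k * (q3 - q1)


def spearman_brown(r_half: float) -> float:
    """Step a split-half correlation up to full-test reliability."""
    return 2.0 * r_half / (1.0 + r_half)


def upper_boundary(rel_x: float, rel_y: float) -> float:
    """Largest correlation observable between two imperfectly reliable measures."""
    return sqrt(rel_x * rel_y)


def corrected_r(r_raw: float, rel_x: float, rel_y: float) -> float:
    """Raw correlation divided by its upper boundary (correction for attenuation)."""
    return r_raw / upper_boundary(rel_x, rel_y)


# Hypothetical reliabilities of .40 for both difference measures.
print(round(spearman_brown(0.5), 3))
print(round(corrected_r(0.13, 0.40, 0.40), 3))
```

Because difference scores tend to have low reliability, the upper boundary can be well below 1, which is why a raw correlation of modest size can correspond to a considerably larger corrected correlation.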
Results
We first tested the necessity of using a difference measure rather than an absolute measure to index FRA. Figure 2a shows the distribution of standardized A′ for faces in the old/new recognition task, and Figure 2b shows the distribution of the FRA based on the standardized difference between A′ for faces and A′ for flowers. There were large individual differences in the magnitudes of the absolute and difference measures. (Table 1 shows mean raw scores and reliability estimates for the old/new recognition task.)

Fig. 2. Results of the old/new face recognition task and the correlation between performance on this task and IQ score. The histograms show the distributions of participants’ standardized scores, as indexed by (a) an absolute measure (i.e., A′ for face stimuli) and (b) a difference measure (i.e., A′ for face stimuli − A′ for flower stimuli). For both measures, high-score and low-score groups were identified as those individuals who scored more than 1 standard deviation above or below the mean, respectively. The graph in (c) shows the correlation between each measure of face recognition and IQ (score on Raven’s Advanced Progressive Matrices, or APM; Raven, Raven, & Court, 1998) for each group. Error bars indicate ±1 SE. Asterisks indicate a significant difference between groups (**p < .001).
Table 1. Mean Scores and Reliability Estimates for Each Task in the Study
Note: Standard deviations are given in parentheses.
p < .0001.
We compared participants who had a relatively high score (i.e., 1 SD above the population mean) with participants who had a relatively low score (i.e., 1 SD below the population mean), separately for the absolute measure (above: n = 41; below: n = 51; Fig. 2a) and for the difference measure (above: n = 40; below: n = 49; Fig. 2b). Using the absolute measure, we found that the high-score group performed better on Raven’s APM than the low-score group did, t(90) = 4.31, p < .001; this result indicates that the absolute measure contains variance from more general cognitive functions (Fig. 2c).
In contrast, the analysis based on the difference measure (i.e., FRA) did not show any significant difference in Raven’s APM between groups (t < 1); this result shows that domain-general functions are subtracted out (Fig. 2c). In addition, using the difference measure, we found that the high-score group performed better in the face block of the old/new recognition task than the low-score group did, t(87) = 10.71, p < .001; this result shows that the information of interest (e.g., face recognition) is also preserved in the difference measure. In the following analysis, the difference measure was used as an index of face-specific recognition.
Second, participants exhibited both the CFE and the WPE. As expected, participants were slower in discriminating faces when the top halves and the bottom halves were aligned than when they were misaligned, t(327) = 14.32, p < .0001, Cohen’s d = 0.79 (Table 1). Similarly, participants were better at identifying a face part when it was presented in the context of the whole face than when it was presented in isolation, t(327) = 9.01, p < .0001, Cohen’s d = 0.50 (Table 1).
More important, significantly larger CFEs and WPEs were observed in the high-score group than in the low-score group (Fig. 3a); CFEs: t(87) = 2.10, p < .05, Cohen’s d = 0.45; WPEs: t(87) = 2.18, p < .05, Cohen’s d = 0.47. Similarly, the correlational analysis based on the entire sample of participants showed that individual differences in face recognition were associated with individual differences in holistic face processing (Fig. 3b); FRA and CFE: r = .13, 95% confidence interval (CI) = [.02, .24], p < .05, rcorrected = .32; FRA and WPE: r = .13, 95% CI = [.03, .24], p < .05, rcorrected = .36. (We also tested the relation between face recognition and holistic face processing with regression analysis. See Fig. S1 in the Supplemental Material.) On the basis of the corrected correlations, the CFE and the WPE accounted for approximately 10.2% and 13% of the variance in face recognition, respectively.
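As a rough check on these numbers, the sketch below squares the corrected correlations to obtain the variance accounted for and computes a 95% CI for r = .13 with the Fisher z transformation. The use of Fisher z is an assumption (the text does not say how the CIs were obtained), and n = 328 is inferred from the 337 participants minus the nine excluded outliers. Because the reported r is rounded, the interval should only approximate the published one.

```python
from math import atanh, sqrt, tanh


def fisher_ci(r: float, n: int, z_crit: float = 1.96):
    """Approximate 95% CI for a Pearson r via the Fisher z transformation."""
    z = atanh(r)
    se = 1.0 / sqrt(n - 3)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


n = 337 - 9  # participants remaining after outlier exclusion (inferred)
ci_low, ci_high = fisher_ci(0.13, n)

# Variance accounted for is the squared corrected correlation.
var_cfe = 0.32 ** 2  # about 10.2%
var_wpe = 0.36 ** 2  # about 13.0%

print(round(ci_low, 3), round(ci_high, 3))
print(round(var_cfe, 4), round(var_wpe, 4))
```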

Fig. 3. Magnitude of the composite-face effect, the whole-part effect, and global-local interference and their correlations with face-specific recognition ability (FRA). In (a), the magnitude of each effect is shown as a function of group (high FRA = 1 SD above the mean difference score for face recognition; low FRA = 1 SD below the mean). Error bars indicate ±1 SE. Asterisks indicate significant differences between groups (*p < .05). The graph in (b) shows the correlation between FRA and each effect, along with the upper boundaries of these correlations. Error bars indicate 95% confidence intervals. Asterisks indicate correlations significantly different from 0 (*p < .05).
Did the CFE and WPE make joint or distinct contributions to face recognition? There was no correlation between the CFE and WPE across individuals, r = .03, 95% CI = [−.08, .14], p = .55, which suggests that the CFE and WPE make distinct contributions to face recognition performance. This finding was further confirmed by a stepwise multiple regression analysis, in which either the CFE or the WPE alone was able to predict the FRA (Table 2). That is, the CFE and the WPE together accounted for significantly more variance in face recognition than either effect alone.
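Why two nearly uncorrelated predictors explain more variance together than either does alone can be illustrated with the standard formula for R² in a two-predictor regression. The sketch below plugs in the rounded raw correlations reported in the text; it is not a reproduction of Table 2, and the resulting values are approximate.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """R^2 for y regressed on two predictors, from pairwise correlations.

    R^2 = (r_y1^2 + r_y2^2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12^2)
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)


r_fra_cfe = 0.13  # FRA with CFE, as reported (rounded)
r_fra_wpe = 0.13  # FRA with WPE, as reported (rounded)
r_cfe_wpe = 0.03  # CFE with WPE, as reported (rounded)

r2_single = r_fra_cfe ** 2
r2_joint = r_squared_two_predictors(r_fra_cfe, r_fra_wpe, r_cfe_wpe)
print(round(r2_single, 4), round(r2_joint, 4))
```

With the correlation between the CFE and the WPE close to zero, the joint R² is nearly the sum of the two single-predictor values, which is the pattern expected when the two effects make distinct contributions.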
Table 2. Results From the Multiple Regression Predicting Face Recognition
p < .05.
Next, we tested whether the link between face recognition and holistic processing is face-specific. To address this issue, we examined whether the tendency to process general objects more globally than locally, as indexed by GLI, correlated with face recognition. The high-score group did not differ from the low-score group in the magnitude of GLI (t < 1; Fig. 3a), and the correlation between the magnitude of GLI and the FRA was essentially zero, r = −.01, 95% CI = [−.12, .10], p = .85, rcorrected = −.03 (Fig. 3b; see also Fig. S1 in the Supplemental Material). The null result cannot be accounted for by the lack of statistical power, because the size of the GLI measure was significant, t(327) = 14.75, p < .0001, Cohen’s d = 0.81 (Table 1); this result indicates that participants did prefer to process general objects more globally than locally. Therefore, it is face-specific holistic processing that influences face recognition.
It is also worth mentioning that when the absolute measure of face recognition was used, we replicated Konar et al.’s (2010) failure to find the association between holistic face processing and face recognition, even with a larger number of participants and more measures of holistic face processing (see Fig. S2 in the Supplemental Material). That is, no significant difference in holistic face processing was found between the high-score and low-score groups—CFE: t(90) = 1.21, p = .23; WPE: t < 1. Also, the correlation between the absolute measure of face recognition and the absolute measure of holistic processing was not significant—CFE: r = .03, 95% CI = [−.07, .14], p = .53, rcorrected = .07; WPE: r = .05, 95% CI = [−.06, .16], p = .35, rcorrected = .12.
Discussion
There is a consensus that faces are a special category of stimuli, and holistic processing is thought to be a critical factor that makes faces special. In the study reported here, we provide direct evidence linking holistic face processing with face recognition. We found that individuals who were better at recognizing faces were also better able to process faces holistically. This finding was reliable across different behavioral markers of holistic face processing—the CFE and the WPE—and across different analysis methods—the comparison between the high-score versus low-score groups and the correlational analysis across the participant population. In addition, the correlation between holistic face processing and face recognition was not mediated by general cognitive functions, as indexed by Raven’s APM scores and measures of GLI, but may instead reflect the unique way in which faces are processed. In short, our study demonstrates that the extent to which people process faces holistically rather than featurally has a significant influence on individual differences in face recognition, thus providing constraints on computational models of face recognition.
In this study, we devised three novel designs to reveal the link between holistic face processing and face recognition. First, unlike researchers in previous studies who used an absolute measure of face recognition performance (Konar et al., 2010; Richler et al., 2011), we isolated processes specific to face recognition by subtracting out variance reflecting domain-general cognitive processes (e.g., general visual-discrimination abilities, attention, decision making). This manipulation was critical, because only with the difference measure did we find a positive correlation between holistic face processing and face recognition. In addition, using the difference measure, we showed that the positive correlation does not have to rely on the specific paradigm (i.e., the congruency-composite paradigm) suggested by Richler et al. (2011). That is, the traditional CFE paradigm (e.g., Le Grand et al., 2004) used in the majority of previous studies is a valid and effective measure of holistic face processing.
Second, instead of using one measure of holistic face processing, we used two behavioral markers: the CFE and WPE. We found that both markers contributed significantly to face recognition performance, further validating the link between holistic processing and face recognition. More important, the CFE and WPE did not correlate with each other, which implies that holistic face processing may not be a unitary construct as previously assumed. Alternatively, it is also possible that holistic face processing could be a single process, but different tests capture only different fractions of it.
Finally, we examined the specificity of the link between holistic processing and face recognition by using Navon’s (1977) task with hierarchical shapes (i.e., to assess GLI). The absence of correlation between the magnitude of GLI and face recognition ability argues against the hypothesis that face-specific recognition demands a general level of global analysis (Behrmann et al., 2005); instead, the dissociation in correlations suggests that holistic processing is the very factor that makes faces a special category of stimuli (Busigny & Rossion, 2011).
Of course, holistic face processing is not the only factor that influences face recognition. In this study, we found that only a portion of the variance in face recognition can be accounted for by holistic face processing. Other forms of face-specific processing, such as processing of the first-order configuration (i.e., the “T” shape; Liu, Harris, & Kanwisher, 2010; Zhu et al., 2009) and the second-order configuration (i.e., spacing among face parts; Le Grand, Mondloch, Maurer, & Brent, 2001; Maurer et al., 2002; Yovel & Kanwisher, 2004) may influence face recognition as well (e.g., Rotshtein, Geng, Driver, & Dolan, 2007). In addition, the processing of face parts (e.g., eyes, nose, and mouth) may account for a significant portion of the unexplained variance in face recognition (e.g., Cabeza & Kato, 2000; Rotshtein et al., 2007). Finally, similar to the observation that neither CFE nor WPE alone was enough to capture the level of performance of holistic face processing, the old/new face recognition task may only capture a portion of variance in face recognition. Therefore, future studies with more measures (e.g., Garrido et al., 2009; Wilhelm et al., 2010) may help quantify individual differences in face recognition.
Our findings raise an interesting question about the causal relation between holistic processing and face recognition. First, studies on acquired prosopagnosia (e.g., Busigny & Rossion, 2011; Ramon et al., 2010) and developmental prosopagnosia (Zhu et al., 2009; but see Le Grand et al., 2006) suggest that individuals with prosopagnosia may suffer deficits in the holistic processing of faces as well. However, it is unclear whether the deficit in holistic face processing lies at the core of the mechanism that causes the deficit in face recognition in prosopagnosia and other face-recognition-related disorders, such as autism. If so, training patients on holistic face processing may improve their ability to recognize faces (see also DeGutis, Bentin, Robertson, & D’Esposito, 2007). In addition, holistic face processing is heritable (Zhu et al., 2010) and is mature at the age of four or even younger (de Heering et al., 2007; Tanaka, Kay, Grinnell, Stansfield, & Szechter, 1998). However, little is known about how holistic face processing shapes face recognition during development. Studies addressing these questions may further elucidate the role of holistic processing in face recognition.
Footnotes
Acknowledgements
Ruosi Wang and Jingguang Li contributed equally to this study. We thank two anonymous reviewers for invaluable comments on the manuscript.
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
This study was funded by the 100 Talents Program of The Chinese Academy of Sciences; the National Basic Research Program of China (2010CB833903, 2011CB505402); the Key Laboratory of Mental Health, Institute of Psychology; and the Fundamental Research Funds for the Central Universities (2009SD-3).
