Abstract
The face-inversion effect is the finding that picture-plane inversion disproportionately impairs face recognition compared to object recognition and is now attributed to greater orientation-sensitivity of holistic processing for faces but not common objects. Yet, expert dog judges have shown similar recognition deficits for inverted dogs and inverted faces, suggesting that holistic processing is not specific to faces but to the expert recognition of perceptually similar objects. Although processing changes in expert object recognition have since been extensively documented, no other studies have observed the distinct recognition deficits for inverted objects-of-expertise that people as face experts show for faces. However, few studies have examined experts who recognize individual objects similar to how people recognize individual faces. Here we tested experts who recognize individual budgerigar birds. The effect of inversion on viewpoint-invariant budgerigar and face recognition was compared for experts and novices. Consistent with the face-inversion effect, novices showed recognition deficits for inverted faces but not for inverted budgerigars. By contrast, experts showed equal recognition deficits for inverted faces and budgerigars. The results are consistent with the hypothesis that processes underlying the face-inversion effect are specific to the expert individuation of perceptually similar objects.
An enduring question in high-level vision is whether faces recruit cognitive processes that are exclusive to face recognition (face-specific hypothesis) or whether these same processes can be deployed for the recognition of nonface objects (process-specific hypothesis). In this article, the predictions of the face-specific hypothesis versus the process-specific hypothesis are investigated by testing a unique type of perceptual expert: people who specialize in the individuation of budgerigar birds.
The computational problem of face recognition is defined both by the perceptual demands of the task and by the perceptual similarity of faces. First, we are deeply motivated to attend to and recognize faces; they are the essence of our identity and the means by which we communicate intentions and emotions with others. Second, unlike most objects which are identified at the basic level (e.g., ‘chair’, ‘bird’, ‘car’; Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976), faces are identified at the subordinate level of the individual person (e.g., Barack Obama, Meryl Streep). At this level of classification, there is a greater degree of perceptual similarity because faces share the same parts (i.e., facial features) in a set geometric configuration. Consequently, this geometric similarity demands a finer perceptual analysis because faces cannot be differentiated on the basis of a single diagnostic feature (Grill-Spector & Kanwisher, 2005; Jolicoeur, Gluck, & Kosslyn, 1984). Third, faces are omnipresent in our everyday existence, so our perceptual experience is marked by the sheer volume of unique exemplars that we encounter. Despite these demands, most people are able to recognize a familiar face accurately, quickly and with little cognitive effort.
One solution to the face recognition problem is holistic processing. In contrast to the part-based approach used for object recognition (Biederman, 1987), face perception is characterized by a sensitivity to the shape of facial features and their spatial organization (e.g., Freire, Lee, & Symons, 2000; Haig, 1984; but see Burton, Schweinberger, Jenkins, & Kaufmann, 2015). We encode a face not in terms of its separate features and their configuration but as an integrated ‘whole’ face representation (Maurer, Le Grand, & Mondloch, 2002; Tanaka & Farah, 1993). The perception of faces is therefore described as occurring ‘holistically’. The primacy of this holistic strategy in face perception has been demonstrated in the face composite task, where selective attention to one half of a composite face is disrupted by the other task-irrelevant half (Young, Hellawell, & Hay, 1987) and in the part-whole task, where recognition of a face part (e.g., the eyes) is better in the whole face context than it is in isolation (Tanaka & Farah, 1993). Perhaps the most iconic demonstration of the power of holistic processing in face recognition is the face-inversion effect: a pronounced impairment in the perception of and memory for inverted faces that greatly exceeds the effect of inversion on object recognition (Yin, 1969). Inversion abolishes the holistic effects observed in both the face composite task (Rossion & Boremanse, 2008; Young et al., 1987; but see McKone et al., 2013; Richler, Mack, Palmeri, & Gauthier, 2011) and in the part-whole task (Tanaka & Sengco, 1997) – two tasks that directly manipulate holistic processing by separating (or recombining) face features and observing the resulting effect on facial feature recognition. Similarly, prosopagnosic patients who display face recognition deficits do not exhibit the classic face-inversion effect (Busigny & Rossion, 2010; Palermo, Willis, Rivolta, McKone, & Wilson, 2011; Ramon, Busigny, & Rossion, 2010).
The collective evidence from inversion studies strongly supports the view that recognition performance is diminished for inverted faces because inversion disrupts the holistic strategy that is used to process and recognize faces.
The fact that only faces appear to be processed holistically has led to the hypothesis of a face-specific processing mechanism (e.g., Kanwisher, 2000; McKone, Kanwisher, & Duchaine, 2007). An alternative hypothesis posits that holistic recognition is a process-specific strategy reserved for individuating visually similar objects and could, under the right conditions, be applied to nonface object recognition (Diamond & Carey, 1986; Gauthier & Tarr, 1997). For example, people with special interests (e.g., dog judges) who are motivated to identify objects at a subordinate level might be capable of using holistic strategies to discriminate perceptually similar exemplars. To examine the specificity of holistic processing, Diamond and Carey (1986) tested dog experts and control novice participants for their recognition of individual dogs and faces presented in their upright and inverted orientations. The critical finding was that dog experts showed an inversion effect (i.e., better recognition in the upright orientation relative to the inverted orientation) for dogs and faces, whereas the novice participants showed a reliable inversion effect for faces but not for dogs. Importantly, this effect was only observed for breeds of dog for which the judge was an expert (Experiment 3). Their results supported the view that, with sufficient experience and motivation, holistic processes underlying face perception could be recruited for the recognition of other perceptually similar objects.
Since Diamond and Carey’s study, research on expert object recognition has shown that inversion slows expert recognition processes (Ashworth, Vuong, Rossion, & Tarr, 2008; Rossion & Curran, 2010; Rossion, Gauthier, Goffaux, Tarr, & Crommelinck, 2002), alters processing efficiency (Gauthier, Curran, Curby, & Collins, 2003) and reduces visual short-term memory capacity (Curby, Glazek, & Gauthier, 2009). However, most studies have failed to find the distinct decline in recognition accuracy that characterizes the inversion effect for faces, as in the case of expert recognition of handwriting (Bruyer & Crispeels, 1992), fingerprints (Busey & Vanderkolk, 2005), learned Greebles (Gauthier & Tarr, 1997; Gauthier, Williams, Tarr, & Tanaka, 1998), or cars and birds (Gauthier, Skudlarski, Gore, & Anderson, 2000; Rossion & Curran, 2010). Others may have failed to show perceptual expertise effects (Weiss, Mardo, & Avidan, 2016) or did not compare the magnitude of inversion effects to that of faces (Moore, Cohen, & Ranganath, 2006; Xu, Liu, & Kanwisher, 2005).
To resolve these contradictory findings, Robbins and McKone (2007) attempted to replicate and extend the results of Diamond and Carey with Labrador dog experts, applying three key tests of holistic processing: recognition of inverted images, recognition of contrast-reversed images and a composite task. For all three tasks, they found that the manipulations intended to disrupt holistic processing had no greater effect on experts’ recognition of dogs than on novices’. However, two limitations in Robbins and McKone’s study compromise the interpretability of their results. First, the expert breeders, trainers and judges specialized in British-type Labradors, yet over half (38 of the 60) of the stimuli depicted the visually dissimilar American-type Labradors. According to Diamond and Carey (1986), this was a critical oversight because matching the test stimuli to the domain of expertise is paramount for engaging holistic processes. Second, the validation of perceptual expertise should be evidenced by greater recognition accuracy in the normal upright condition for experts relative to novices. Robbins and McKone report this expertise effect for 12 experts compared to age-matched controls (Robbins & McKone, 2007, p. 39). However, because the recognition performance was not reported for the full sample of the 15 experts included in their analyses, it is not clear whether the expertise requirement was met in their experiments.
Given the significance of the inversion effect in holistic face recognition, the inability to replicate inversion effects for expert recognition of nonface objects is a major weakness of the process-specific hypothesis (McKone et al., 2007). Yet, it is possible that this shortcoming lies in the forms of expertise that have been examined, involving subordinate (e.g., Tennessee warbler vs. MacGillivray’s warbler) but not identity-level recognition. In the current study, we examined a unique group of experts – budgerigar experts – whose expertise requires the recognition of individual budgerigar birds (‘budgies’). These experts breed show budgerigars as a hobby and typically keep between 50 and 500 birds. Birds are not normally named, but breeders recognize each bird with respect to its age, sex, personality characteristics and genetic lineage. Similar to faces, birds share basic features and markings that appear in similar spatial arrangement (see Figure 1). Given the number of birds kept by these breeders and the visual expertise required to individuate these perceptually similar birds, budgerigar expertise provides the ideal domain for testing the claims of the process-specific hypothesis. Thus, we tested budgerigar experts and novice controls for the recognition of budgerigars and faces presented in upright and inverted orientations. We examined three claims of the process-specific hypothesis: (a) experts should perform better than novices for budgerigars presented in the normal upright orientation, (b) experts should show decreased accuracy for recognizing budgerigars in the inverted orientation relative to the upright orientation and (c) experts should show stronger inversion effects on budgerigar recognition than novices.
Figure 1. Illustration of the common markings of the exhibition budgerigar (image courtesy of the World Budgerigar Organisation).
Method
Participants
Participants were budgerigar breeders and judges who had a minimum of 5 years of experience breeding and showing exhibition budgerigars (N = 8 males, Mage = 57.4 years, age range: 40–65 years). We recruited experts in British Columbia through advertisement in regional budgerigar clubs and through word of mouth. Our inclusion criteria were as follows: breeders had to have maintained aviaries with a minimum of 100 birds year round, have at least 5 years of breeding experience, be currently active in either participating in or judging budgerigar shows, and be no older than 70 years. The rare nature of this expertise limited our sample size, a common constraint in studies of exceptional perceptual skill. Age-matched novices (N = 8 males, Mage = 56.8 years, age range: 43–67 years) with no particular experience with birds were recruited as volunteers through advertisement in public areas and through friends and relatives of the experimenters. Novices were matched to experts on a one-to-one basis for age and sex – age comparison: t(14) = 0.12, p = .91 – but not education. Written informed consent was obtained from all participants.
Stimuli
Budgerigars
Images of 16 individual budgerigars were included, each photographed from two different viewpoints for a total of 32 unique images (see Figure 2(a)). To avoid biases in the stimulus set and to ensure consistency, all photographs were taken by the experimenter, and birds were sampled from five aviaries to capture a range of regional birds (only three of the breeders met inclusion criteria to participate in the study). All photographs were converted to greyscale and edited in Adobe Photoshop to remove all background and to normalize brightness and contrast across all images. Images subtended a visual angle of approximately 8.9° × 8.9° with participants sitting 100 cm from the screen.
Figure 2. Examples of same-trial stimuli in the upright and inverted orientations from the bird recognition task (a) and different-trial stimuli in the upright and inverted orientations from the face recognition task (b). Study and test images always differed in viewpoint (i.e., frontal or 3/4). Orientation (i.e., upright or inverted) was constant between the study and test images.
Figure 3. Individual (grey) and group mean (black) sensitivity d′ scores for experts and novices on the budgie and face same/different tasks. Error bars represent 95% CI for within-subject measures. *p < .05. ***p < .001.

Faces
Thirty-two greyscale photographs of 16 different individuals from the Karolinska Directed Emotional Faces database (Lundqvist, Flykt, & Öhman, 1998), with neutral expressions and without glasses, facial hair or make-up, were used (see Figure 2(b)). All face photographs were edited in Adobe Photoshop to remove hair and were cropped to the same overall shape. Images subtended a visual angle of approximately 6.0° × 8.6° with participants sitting 100 cm from the screen.
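The reported visual angles can be converted to on-screen sizes with the standard relation size = 2 × distance × tan(angle / 2). The following is a minimal sketch of that conversion, not a description of how the stimuli were actually sized; the function names are hypothetical, and only the angles and the 100 cm viewing distance come from the text.

```python
import math

VIEWING_DISTANCE_CM = 100.0  # from the text: participants sat 100 cm from the screen


def size_from_angle(angle_deg, distance_cm):
    """On-screen extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def angle_from_size(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an extent of size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Reported visual angles (width, height) in degrees for each stimulus class.
reported_angles = {"budgerigar": (8.9, 8.9), "face": (6.0, 8.6)}

for name, (width_deg, height_deg) in reported_angles.items():
    width_cm = size_from_angle(width_deg, VIEWING_DISTANCE_CM)
    height_cm = size_from_angle(height_deg, VIEWING_DISTANCE_CM)
    print(f"{name}: {width_cm:.1f} cm x {height_cm:.1f} cm")
```

Under these assumptions, the budgerigar images would measure approximately 15.6 cm on each side and the face images approximately 10.5 cm by 15.0 cm.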
Design and Procedure
In this experiment, expertise (expert and novice) was a between-group variable and stimulus type (budgerigar and face) and orientation (upright and inverted) were within-group variables. Each experimental trial began with a fixation cross presented at the centre of the screen (500 ms), followed by a budgerigar or face study stimulus (1,000 ms), a noise mask (2,000 ms) and then a budgerigar or face test stimulus. The test stimulus remained on the screen until the participant’s response. The participant’s task was to decide whether the budgerigar or face study and test stimuli were the same identity or different identities and to indicate their decision via a key press response. For ‘same’ trials, the same budgerigar or face was presented but at different viewpoints. For ‘different’ trials, a distractor selected to be visually similar to the study stimulus appeared as the test stimulus. The orientation of the study and test stimuli was the same (e.g., an upright trial consisted of an upright study stimulus and an upright test stimulus). Each budgerigar or face exemplar appeared once as a study stimulus and once as a distractor stimulus, and each same and different trial appeared once in the upright orientation and once in the inverted orientation, so that the only difference between upright and inverted conditions was orientation of presentation.
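The trial structure described above can be sketched as a small generator for one test (budgerigars or faces). This is an illustration under stated assumptions, not the authors' experiment script: the pairing rule for distractors, the fixed assignment of viewpoints to study and test, and all names are hypothetical. The text specifies only that distractors were chosen for visual similarity and that study and test viewpoints differed.

```python
import random
from collections import Counter

N_EXEMPLARS = 16
ORIENTATIONS = ("upright", "inverted")
TRIAL_TYPES = ("same", "different")


def build_test(stimulus_type, seed=0):
    """Return a shuffled trial list for one test (budgerigar or face)."""
    rng = random.Random(seed)
    trials = []
    for orientation in ORIENTATIONS:
        for exemplar in range(N_EXEMPLARS):
            for trial_type in TRIAL_TYPES:
                # Hypothetical distractor rule: exemplars are paired (0,1), (2,3), ...
                # as a stand-in for "visually similar", so each exemplar serves
                # as a distractor exactly once per orientation.
                test_id = exemplar if trial_type == "same" else exemplar ^ 1
                trials.append({
                    "stimulus_type": stimulus_type,
                    "orientation": orientation,  # identical for study and test
                    "trial_type": trial_type,
                    "study_id": exemplar,
                    "study_view": "frontal",  # assumption: view assignment is not reported
                    "test_id": test_id,
                    "test_view": "three_quarter",  # always differs from the study view
                })
    rng.shuffle(trials)  # trial order randomized within each test
    return trials


budgie_trials = build_test("budgerigar")
cell_counts = Counter((t["orientation"], t["trial_type"]) for t in budgie_trials)
print(len(budgie_trials), dict(cell_counts))
```

Because the upright and inverted lists are built from the same exemplar pairings, the two conditions differ only in orientation, which mirrors the design constraint stated in the text.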
The budgerigar and face trials were presented in two separate tests. Within each test, there were 32 ‘same’ trials (16 upright and 16 inverted trials) and 32 ‘different’ trials (16 upright and 16 inverted trials). Trial order was randomized within each test, and participants were given a break halfway through each test. The presentation order of the tests was counterbalanced across participants. The described experimental procedures were approved by the Human Ethics Research Board at the University of Victoria.
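With 16 ‘same’ and 16 ‘different’ trials in each orientation, every participant contributes a hit rate (responding ‘same’ on same trials) and a false-alarm rate (responding ‘same’ on different trials) per condition, from which a sensitivity score can be derived. The text does not state which d′ model or correction was used, so the sketch below assumes the simple formula d′ = z(H) − z(FA) with a log-linear correction for extreme rates. The function name and example counts are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each count and 1 to each
    total) keeps rates of 0 or 1 from producing infinite z-scores. The source
    does not say which correction, if any, was applied.
    """
    n_same = hits + misses
    n_different = false_alarms + correct_rejections
    hit_rate = (hits + 0.5) / (n_same + 1)
    fa_rate = (false_alarms + 0.5) / (n_different + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one participant in one condition (16 + 16 trials).
print(round(d_prime(hits=12, misses=4, false_alarms=5, correct_rejections=11), 2))
```

Other same/different models (for example, a differencing model) would yield different values from the same counts, so scores from this sketch are not directly comparable to the reported means.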
Results
Table 1. Recognition Performance of Budgerigar Experts and Novices for Upright and Inverted Budgerigar and Face Trials in Terms of Sensitivity (d′).
*p < .05. ***p < .001.
Table 1 shows mean sensitivity for both groups across orientation and stimulus conditions. A two-tailed independent-samples t test revealed that experts recognized budgerigars better than novices in the normal upright orientation, confirming the experts’ visual expertise, t(14) = 2.74, p = .016, Cohen’s d = 1.37. Critically, a paired-samples t test showed that experts’ recognition of inverted budgerigars (M = 0.57) was significantly impaired relative to their recognition of upright budgerigars (M = 1.25), t(7) = 3.42, p = .011, Cohen’s d = 1.19. Finally, an independent-samples t test on difference scores (upright minus inverted) compared the effect of inversion on expert and novice budgerigar recognition and showed that expert inversion effects (M = 0.68) significantly exceeded novice inversion effects (M = −0.12), t(14) = 2.56, p = .02, Cohen’s d = 1.28. Moreover, unlike expert recognition, novice budgerigar sensitivity did not differ across orientation conditions, t < 1.
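The two between-group effect sizes above are consistent with one common conversion from an independent-samples t statistic, d = t × √(1/n₁ + 1/n₂), with eight participants per group. The sketch below shows that arithmetic. It is an assumption that this conversion was used, since the text does not name the formula, and the function name is hypothetical. The paired-samples effect size depends on the choice of standardizer and is not reproduced here.

```python
import math


def cohens_d_from_independent_t(t_value, n1, n2):
    """Cohen's d recovered from an independent-samples t statistic.

    Uses d = t * sqrt(1/n1 + 1/n2), which assumes a pooled standard deviation.
    """
    return t_value * math.sqrt(1.0 / n1 + 1.0 / n2)


N_EXPERTS = 8
N_NOVICES = 8

# Upright budgerigar recognition, experts vs. novices: t(14) = 2.74
print(round(cohens_d_from_independent_t(2.74, N_EXPERTS, N_NOVICES), 2))  # 1.37

# Inversion-effect difference scores, experts vs. novices: t(14) = 2.56
print(round(cohens_d_from_independent_t(2.56, N_EXPERTS, N_NOVICES), 2))  # 1.28
```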
The data were submitted to a mixed factorial analysis of variance (ANOVA) with stimulus and orientation as within-subject factors and group as a between-subject factor. A main effect of orientation, F(1, 14) = 30.00, p < .001,
Although the three-way interaction was not significant, direct comparison showed that the effect of inversion on budgerigar recognition differed for experts and novices. To better examine the effect of expertise on orientation effects for each stimulus type, we performed separate ANOVAs for each stimulus type. For faces, a main effect of orientation was obtained, F(1, 14) = 37.81, p < .001,
Finally, effects of orientation on each stimulus type were compared in separate ANOVAs for each group. Novices showed a main effect of stimulus type, with faces (M = 1.05) better recognized than budgerigars (M = 0.43), F(1, 7) = 6.96, p = .03,
Although test stimuli included images of budgerigars belonging to three of the eight experts, removing trials containing images of an expert’s own birds did not change the pattern or significance of the results. For only one of these experts did the number of their own birds exceed 20% of the total stimuli and the results remained qualitatively the same when this expert was removed from analysis.
Discussion
Experts and novices were asked to recognize individual budgerigars and faces in their upright and inverted orientations. Both experts and novices demonstrated the classic face-inversion effect where recognition of inverted faces is impaired. Compared to novice performance, experts demonstrated superior recognition of individual budgerigars but only in the upright orientation. However, compared to their recognition of birds in the normal upright orientation, experts were impaired by budgerigar inversion, and this effect was the same as the one that people (as ‘face experts’) demonstrate for faces. By contrast, novices’ budgerigar recognition was statistically equivalent in the upright and inverted orientations.
The inversion paradigm has the critical advantage of allowing the test stimuli to be presented as intact, whole objects. This is especially useful when testing the recognition of objects (like budgerigars) that are not easily amenable to other tests of holistic processing because they have features that are not clearly demarcated (e.g., feather colouring patterns with no discrete boundaries), the relevant features are not already well known, or because the spatial dimensions of a feature vary from one exemplar to another.
A disadvantage, though, is that the inversion paradigm does not provide direct evidence that the upright advantage is mediated by the perceptual integration of parts into the holistic ‘gestalt’ that characterizes face perception. In principle, better performance for upright budgerigars could arise from better part-based processing in the normal upright orientation. Feature processing has been implicated in expert face recognition (Rhodes, Hayward, & Winkler, 2006), and feature recognition is also impacted by orientation changes (Riesenhuber, Jarudi, Gilad, & Sinha, 2004; Sekuler, Gasper, Gold, & Bennett, 2004; Yovel & Kanwisher, 2004). However, it is generally found that inversion has a greater impact on sensitivity to metric distances and the spatial information between those features (e.g., Barton, Keenan, & Bass, 2001; Collishaw & Hole, 2002; Freire et al., 2000; Leder, Candrian, Huber, & Bruce, 2001).
In addition, since Yin (1969) first proposed that inverting a face disrupts the ability to get ‘an impression of the whole picture’ (p. 145), several approaches have shown that the effect of inversion on recognition accuracy is attributable to decreased holistic processing. Inversion reduces holistic processing measured in paradigms that directly manipulate featural integration (Rossion & Boremanse, 2008; Tanaka & Sengco, 1997; Van Belle, De Graaf, Verfaillie, Rossion, & Lefèvre, 2010; Young et al., 1987). Conversely, prosopagnosic patients with impaired face recognition do not show the normal face-inversion effect (Busigny & Rossion, 2010; Palermo et al., 2011; Ramon et al., 2010).
In the case of face recognition, inversion may disrupt several processes; however, the collective evidence suggests that the magnitude of this effect is primarily caused by a reduction in the simultaneous feature processing that underlies holistic face perception in the normal upright orientation (Rossion, 2008, 2009). The disproportionate effect of inversion on faces may therefore reflect a greater reliance on holistic processes in the recognition of faces compared to common objects. However, because face recognition constitutes a well-developed ability to discriminate between highly similar exemplars, the skilled discrimination of visually similar objects may also rely more heavily on simultaneous feature processing for successful recognition. If so, it would be expected that the effect of inversion on expert object recognition would resemble the effect of inversion on face recognition.
Although other studies have failed to show face-like inversion effects for objects of expertise, the inversion effects observed in our budgerigar experts and Diamond and Carey’s (1986) dog experts are equivalent in magnitude to the face-inversion effect typically reported in the literature. A possible explanation for face-like inversion effects in the budgerigar and dog experts is that their visual expertise more closely matches the perceptual processing demands of face recognition. More specifically, budgerigar and dog experts are unique in that they recognize birds or dogs at the identity level, and individuation at the identity level is a central factor that sets face recognition apart from common object recognition. Whereas objects are most quickly recognized at a general category level (‘fish’; Rosch et al., 1976), faces are most readily recognized at the identity level (‘Meryl Streep’; Tanaka, 2001). Behavioural (Jolicoeur et al., 1984; Rosch et al., 1976), electrophysiological (Tanaka, Luu, Weisbrod, & Kiefer, 1999) and neuroimaging (Gauthier, Anderson, Tarr, Skudlarski, & Gore, 1997) evidence suggests that additional perceptual processing is needed for more specific levels of categorization, and the level of categorization is repeatedly shown to mediate the acquisition of perceptual expertise in training studies. That is, perceptual discrimination is enhanced only when training involves recognizing objects at specific category levels (e.g., ‘great horned owl’) compared to general category levels (e.g., ‘owl’) or mere exposure (McGugin, Tanaka, Lebrecht, Tarr, & Gauthier, 2011; Nishimura & Maurer, 2008; Tanaka, Curran, & Sheinberg, 2005; Wong, Palmeri, & Gauthier, 2009) – an effect obtained even without the use of labels (Bukach, Vickery, Kinka, & Gauthier, 2012). 
Thus, the identity level at which faces (and other objects) are recognized may be an important factor in engaging perceptual processing mechanisms that differ from the part-based processing used for common object recognition.
Face-like inversion effects in these experts might also be a consequence of perceptual processing demands required to discriminate stimuli with high visual similarity. Quick and accurate discrimination of highly similar objects, such as individual exemplars of budgerigars or Labrador dogs, may depend on the ability to integrate information distributed across a wider spatial area if there is not enough diagnosticity in local feature differences. Consistent with this, orientation effects have been shown to increase with visual similarity (Lawson & Jolicoeur, 1999; Murray, 1998).
Alternatively, it might be argued that inversion effects for nonface objects may be due to the stimuli having properties that approximate face features of eyes, nose and mouth (Kanwisher & Yovel, 2006). However, in the case of budgerigars, the face is usually obscured by feathers and lacks the classic face-like schema of eyes, nose and mouth (see Figures 1 and 2). Moreover, studies have shown that nonhuman faces (e.g., dog faces) and nonface objects that resemble faces (e.g., car fronts) do not engage holistic processing (Diamond & Carey, 1986; Scapinello & Yarmey, 1970; Tanaka & Gauthier, 1997). Hence, the face-inversion effect is restricted to the stimuli for which people are experts.
Our replication of Diamond and Carey’s result is noteworthy given that our approach addressed important limitations in their study. First, because Diamond and Carey used the same images during study and test, performance on their task may reflect image-based recognition rather than object recognition. Second, because stimuli were photographs of champion dogs taken from magazine publications, it is possible that the upright advantage that they observed in dog experts might have been due to experts having been familiar with the test stimuli (e.g., Ashworth et al., 2008; Husk, Bennett, & Sekuler, 2007). To address these issues, we tested budgerigar recognition beyond simple image matching by requiring recognition of individual birds across changes in viewpoint. We did not ask experts if they recognized any of the birds in the test images, but because the breeders only exhibit a select few of their birds at shows, it is highly unlikely that our experts were familiar with the particular birds in the stimuli and, more importantly, the test images themselves were completely novel.
In summary, the disproportionate inversion effect observed for faces has been a distinguishing quality of face recognition. Although it has rarely been observed in the recognition of nonface objects, we found that participants with enhanced abilities to discriminate between highly similar, individual birds show inversion effects resembling those obtained for faces. Similar effects of inversion on budgerigar and face recognition are consistent with the hypothesis that a common process underlies expert budgerigar recognition and face recognition and, given that this form of visual expertise closely mirrors the visual task of face recognition, support a process-specific explanation of the holistic processing observed for faces.
Acknowledgements
We thank the breeders and show judges of the Western Canada Budgerigar Association for their extensive help and support in allowing this research to be conducted. Thank you to the World Budgerigar Organisation for the use of their illustration of the budgerigar. We also thank Rachel Robbins and an anonymous reviewer for their helpful comments.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) grants awarded to James Tanaka and by the NSERC Undergraduate Student Research Award to Alison Campbell.
