Abstract
This article focuses on sequence learning on the Visual Expectation Paradigm (VExP) using human faces as stimulus material. For a sample of 133 Caucasian German infants assessed longitudinally at 3 and 6 months of age, a previous study has shown that the response latency of 6-month-old infants was shorter when the infants solved the task with Caucasian own-race faces than with African other-race faces. The advantage for own-race faces occurs at the same age at which the Other-Race-Effect (ORE) has been reported to emerge. As studies on ORE development have shown the phenomenon in infants from various cultural backgrounds, the follow-up question to be answered here is whether the performance differences on the VExP can also be found in non-Caucasian infants. As a complement to the German sample, 30 African infants from Cameroon were assessed longitudinally with the same VExP task at ages 3 and 6 months. Our results indicate that perception differences between own-race and other-race faces influence performance on the VExP in both samples. As expected, the Cameroonian infants improved performance on the VExP from 3 to 6 months only in their own-race African faces condition.
Introduction
The focus of this article is on sequence learning in different cultural environments (Cameroon and Germany) using familiar and unfamiliar stimulus categories (African vs. Caucasian faces). The Visual Expectation Paradigm (VExP) was used to assess sequence learning in infancy, which is the ability to recognize a rule in environmental contingencies and to develop associations and expectancies based on that rule. This dynamic perception-action task was developed by Haith and his colleagues (Haith, Hazan, & Goodman, 1988; Haith, Wentworth, & Canfield, 1993). The original VExP consists of a simple sequence in which stimuli appear alternately on the left and right sides of the infant’s visual field for 700 ms (Canfield, Smith, Brezsnyak, & Snow, 1997; Canfield, Wilken, Schmerl, & Smith, 1995). Haith et al. (1988) found that by 2 months of age infants are able to learn this sequential relation between stimuli. Rule learning is indexed by decreases in reaction time (RT) across trials. RT is measured by the latency of an infant’s eye movements which are pursued either in reaction to the onset of a stimulus (reflexive saccades) or in anticipation toward the position of the stimulus to appear next, once the sequence is learned (Canfield et al., 1997).
To shift visual attention on the VExP, an infant has to be able to focus attention on a stimulus and to engage at the stimulus’ position but also to disengage attention and be able to shift attention to another location. The latter is especially important for anticipatory saccades in which an infant disengages from a stimulus rapidly to shift attention to the next stimulus’ location. It has been shown that an inter-stimulus-interval (ISI) facilitates information processing and therefore, disengagement and reengagement in settings like the VExP (McConnell & Bryson, 2005). A saccade is defined as anticipatory when RT is less than 200 ms after the appearance of the new target stimulus because in this case, the internal motivation to initiate a saccade must have been generated before the onset of the new target stimulus (Haith et al., 1988).
Two main factors influence the reduction of saccade latency and RT. The first is the learning process underlying successful task performance: Infants learn the underlying rule of the stimulus sequence (alternate appearance of the stimulus on the left and on the right) and thus improve performance across trials. The second is the improvement of infants’ oculomotor abilities with age. Better control of eye movements enhances the fixation of stimuli and thus learning on the VExP (Canfield et al., 1997). Thus, the reduction of RT and the number of anticipatory responses in the course of the VExP should be greater in older than in younger infants. Canfield and colleagues found RT to decline with age in their longitudinal studies, but they did not find any increase in the number of anticipations on the VExP with age. Because the reduction of RT seems to be the more meaningful indicator of the learning process, it is the focus of the present longitudinal study. The possible influences of the stimulus material on RT reduction across trials were investigated. Similar to the original design, a VExP task was designed with a simple left–right sequence to investigate the influence of human faces as visual VExP stimuli.
When using faces in a visual learning or memory task for infants, some basic considerations on face perception in infancy have to be taken into account. Several studies have shown that from birth, infants prefer human faces or face patterns to other stimuli by fixating them longer: Valenza, Simion, Cassia, and Umiltà (1996), for example, presented face-like patterns and inverted face patterns to newborns. They reported that stimuli more similar to faces were preferred to inverted stimuli.
Research on face preferences has also shown preferences for different face categories. One preference found is for female faces (Quinn, Yahr, Kuhn, Slater, & Pascalis, 2002; Ramsey, Langlois, & Marti, 2005). This preference was found to be stable until the end of the first year of life (Quinn, 2002; Rennels & Davis, 2008). Quinn et al. (2002) explained face preferences by means of experience due to exposure to face categories. It was found that infants prefer faces depicting the gender of the primary caregiver (Quinn et al., 2008; Rennels & Davis, 2008). Based on these findings and based on the fact that in Western cultures, the mother is typically the primary caregiver during the first month of life, it can be assumed that infants acquire more experience with female faces and thus develop more detailed categories, that is, categories with more information on individual faces, for them. The better a face fits a category, the more attractive and therefore preferable it becomes (Rhodes, Sumich, & Byatt, 1999). Another face category preference in infancy is for own-race faces in contrast to other-race faces. This preference can also be explained by exposure to faces, which then enhances category development. Kelly and colleagues (2005) found that by 3 months, Caucasian infants developed a preference for own-race faces. Similarly, Chinese 3-month-old infants living in China preferred Chinese faces to other-race faces (Kelly, Liu, et al., 2007). Bar-Haim, Ziv, Lamy, and Hodes (2006) assessed preferences for African and Caucasian faces in 3-month-old Ethiopian infants living in Africa, infants of Ethiopian immigrants born in Israel, and Caucasian Israeli infants living in Israel (Caucasian environment). They also found that infants prefer faces of the categories to which they are exposed. African infants born in Israel did not show any preference for either category, probably because of their exposure to both categories since birth.
The preference for own-race over other-race faces found in 3-month-olds of different cultural environments (but tested in their own cultural environments) is consistent with the hypothesis of a perceptual narrowing process. One outcome of the early preference for own-race faces is the Other-Race-Effect (ORE) found in children and adults. The ORE refers to difficulties in distinguishing or recognizing faces that do not belong to the individual’s prototypical facial environment. Findings regarding the onset of this process of perceptual narrowing are, however, inconsistent. Sangrigoli and de Schonen (2004) habituated 3-month-old Caucasian infants to either Caucasian or Asian female faces. A trial pairing a familiar face with a new face of the same ethnic group was shown after the habituation phase. This procedure assessed the infant’s ability to differentiate between the familiar and the novel stimulus, and this ability could only be found in the own-race Caucasian faces condition. Similarly, Kelly, Quinn, et al. (2007) habituated 3-, 6-, and 9-month-old Caucasian infants to Caucasian, African, Middle Eastern, and Chinese faces. In contrast to Sangrigoli and de Schonen, they found that 3-month-old infants were able to identify a novel stimulus of the same ethnic category paired with the habituated one in all four conditions. The 6-month-olds’ recognition abilities were limited to Caucasian and Chinese faces, and the 9-month-olds’ to own-race Caucasian faces (Kelly, Quinn, et al., 2007). Kelly and colleagues replicated these findings for Chinese infants reared in China when they were habituated to either Chinese or Caucasian faces. At 3 months, the infants recognized individuals of both ethnicities. At 6 months, only marginal recognition was shown with Caucasian faces, and by 9 months, only the own-race Chinese faces were recognized (Kelly, Liu, et al., 2009).
In summary, the emergence of the ORE in infancy can be explained by a differential perception of own- versus other-race faces due to a perceptual narrowing dependent on the infant’s cultural environment. This process seems to start at about 3 months of age when infants show preferences for own-race to other-race faces. The preferences are followed by increasing difficulties in recognizing other-race faces, although results concerning the onset of this process are not consistent. At about half a year of age, these recognition difficulties (due to inexperience with other-race faces) still increase while the infants’ specialization process for own-race faces continues. The inability to recognize other-race faces seems to be fully developed by 9 months of age.
The present study investigates whether African versus Caucasian face stimuli presented to both African and Caucasian infants lead to performance differences on the VExP. In Fassbender et al. (2012), it was shown that the Caucasian infants assessed longitudinally at 3 and 6 months of age performed differently with own-race Caucasian and with other-race African faces. The performance improvement that can be expected with maturation with age was only found in the own-race Caucasian faces condition. German infants who learned the stimulus sequence with other-race African faces did not show faster RTs to stimulus shifts at 6 months. The present study reports follow-up data collected from a sample of Cameroonian infants also assessed longitudinally at 3 and 6 months of age. The VExP stimulus sequences applied in Cameroon were the same as the German infants had seen. For the Cameroonian infants, the African stimuli represent the own-race condition and the Caucasian stimuli represent the other-race condition, respectively. The data from the Cameroonian infants are compared with data from the German sample of Caucasian infants in an overall analysis.
As ORE influences tend to develop around 3 months of age and as they seem to derive from a differential processing of own-race and other-race faces, performance differences on the VExP with respect to own- and other-race faces were expected from 3 to 6 months. Thus, infants should perform differentially on the VExP with own-race versus other-race faces. Just as the Caucasian infants performed better only in the own-race Caucasian condition, the African infants should show improvement with age only in their own-race African faces condition. Thus, besides age (3 vs. 6 months) two additional variables are considered in the present study: stimulus class (African vs. Caucasian faces) and cultural background (Germany vs. Cameroon). For German infants, Caucasian faces are the own-race stimuli and African faces are the other-race stimuli, while it is vice versa for African infants. Thus, contrasting performance patterns are expected for German and Cameroonian infants with Caucasian versus African faces as the VExP stimuli. As a consequence, a significant three-way interaction is expected between age (3 months vs. 6 months), stimulus class (African vs. Caucasian faces), and cultural background (Germany vs. Cameroon) regarding infants’ performance.
Method
Participants
One hundred thirty-three Caucasian infants (63 boys and 70 girls) and 30 African infants (17 boys and 13 girls) were assessed longitudinally at 3 months (88-104 days, M = 95.7, SD = 4.0 for the Caucasian infants; 90-99 days, M = 93.8, SD = 2.3 for the African infants) and at 6 months (179-199 days, M = 187.2, SD = 4.3 for the Caucasian infants; 181-194 days, M = 185.7, SD = 3.6 for the African infants). The Caucasian infants were recruited from German urban middle- and upper-middle-class families by means of newspaper advertisements or letters of invitation sent to parents of newborn children. The parents were offered a small financial incentive to motivate participation in the repeated assessments. The African infants from the Cameroonian tribe Nso were recruited in Kikaikelaki at the local health center in which basic medical care for newborns and their mothers is provided. The project team contributed donations for the local health center such as an electricity generator, which not only assured electricity to run the lab but also provided (constant) supply for other laboratory rooms of the health center. As in Germany, the parents in Cameroon were offered a small financial incentive for participation. Written consent was obtained from all parents prior to participation. The African infants included in the final data set were unfamiliar with Caucasian faces just as the German infants included were unfamiliar with African faces. All infants were born at term and, apart from minor illnesses, had been healthy since birth. In both samples, infants were included only if the principal caregiver was a woman (typically the mother) until 6 months of age, to assure familiarity with female faces.
Another 59 infants (42 German, 17 Cameroonian) were excluded from the longitudinal data set either for that reason or because of missing data for either of the two assessments due to off-task behavior (i.e., not watching the screen, fussiness), illness, crying, technical difficulties, or because they failed to return for the second assessment.
Apparatus
For each laboratory, an upright rectangular enclosure measuring 94.5 cm × 96.5 cm × 164 cm and open on one side was constructed to reduce distraction. Its ceiling and its three walls were covered with a dim gray sound-absorbing rubber. The open front of the enclosure was sized so that an attendant, seated on a chair, could be comfortably positioned with an infant on her lap. The infant’s head position was about 60 cm from the presentation screen, which measured 40 cm × 70 cm. The VExP was presented using the Apple Keynote program. The enclosure ceiling contained a block, which prevented the attendant from seeing the screen. Small mirrors were positioned on the side of the block facing the screen so that the stimulus position could be recorded simultaneously with the infant’s looking behavior by a camcorder hidden above the screen. One enclosure including all technical instruments was shipped to Cameroon. During assessment, in each laboratory, the room light was dimmed so that the light intensity was about 26 lux at the infant’s head position, to keep the measurements comparable.
Stimuli
Photographs of the head and shoulders of six African and six Caucasian women between 20 and 30 years of age were the VExP stimuli. The Caucasian models were German university students; the African models were Nso women. They were photographed in Kumbo, the capital city of the Nso Kingdom of the Nso people, to assure that for the local infants, these models would represent culturally typical (but still unknown) faces. All models wore a black shirt and their hair was tied back so that the whole face and the ears were visible. As infants prefer smiling faces to faces with a neutral expression (Kuchuk, Vibbert, & Bornstein, 1986), each model was photographed with a slight smile to create stimuli that draw the infants’ attention. Each stimulus was photographed in a full face frontal orientation and additionally in two three-quarters profile orientations from both sides of the face which are known to be easily recognizable even for infants (Turati, Bulf, & Simion, 2008). Please see Figure 1 for an illustrative example. The three orientations were chosen to enable infants to recognize the identity of the face and not simply a single facial picture. In all three orientations, the model looked into the camera to create the effect of smiling at the infant. The stimuli were presented alternately on the left or on the right side of the enclosure’s screen with projection size 24.7 cm × 18.7 cm. The position of the stimuli was 3.9 cm from the left or right margin of the screen, 4.8 cm from the top, and 9.5 cm from the bottom. The left–right separation between the two positions was as large as the size of each stimulus, that is, 24.7 cm.

Figure 1. Examples of African and Caucasian face stimuli in frontal and three-quarters profile orientations.
Procedure
In Germany, parents with their infants were invited to a university laboratory; in Cameroon, mothers and infants were invited to the laboratory room set up at the local health center. At the start of the laboratory session the general procedure was explained, and the experimenter asked the attendant about the infant’s physiological data and family’s background information. During this time, the attendant was seated half inside the testing enclosure with the infant on her lap so that she and the infant could become acquainted. When the infant was attentive and alert, the room light was slowly dimmed, the attendant with infant was suitably positioned and asked to maintain quiet for 2 minutes. Then, the VExP procedure was initiated. The infant’s position in front of the screen and the camera were controlled by the experimenter seated behind the enclosure.
There were six different VExP presentations, three with African faces and three with Caucasian faces. In each presentation, either two African models or two Caucasian models were presented in all three orientations on a light gray background on the screen (six pictures in total). Each model was presented on only one side of the screen in an alternating left–right sequence of 18 trials. Thus, the two models of each presentation were also presented alternately. The presentation sequence of each VExP presentation (A for model on the left in three orientations, B for model on the right in three orientations, F for frontal, L for left profile, and R for right profile orientation) was AF–BF–AL–BR–AR–BL–AL–BF–AR–BL–AF–BR–AR–BL–AL–BF–AF–BR. The infants were randomly assigned to one of the face conditions. In the German sample, 66 infants were assigned to the own-race Caucasian faces condition and 67 infants to the other-race African faces condition. In the Cameroonian sample, 17 infants received the Caucasian other-race faces and 13 infants received African own-race faces. The condition was maintained at both assessments for each individual. However, a different set of stimuli was presented at the second assessment (e.g., VExP with Caucasian models 5 and 6 at 3 months and VExP with Caucasian models 3 and 4 at 6 months).
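The properties of the presentation sequence described above can be checked mechanically. The following is a minimal sketch, not part of the original procedure; the names `SEQUENCE`, `alternates`, and `picture_counts` are illustrative. It confirms that the printed sequence has 18 trials, strictly alternates between the two models, and shows each of the six pictures equally often.

```python
from collections import Counter

# Presentation sequence as printed in the text
# (A = model on the left, B = model on the right;
#  F = frontal, L = left profile, R = right profile).
SEQUENCE = "AF BF AL BR AR BL AL BF AR BL AF BR AR BL AL BF AF BR".split()


def alternates(seq):
    """True if consecutive trials always switch between model A and model B."""
    return all(a[0] != b[0] for a, b in zip(seq, seq[1:]))


# How often each of the six pictures occurs in the sequence.
picture_counts = Counter(SEQUENCE)

print(len(SEQUENCE))                   # number of trials
print(alternates(SEQUENCE))            # strict left-right alternation?
print(sorted(picture_counts.items()))  # occurrences per picture
```

Under this reading, every picture occurs three times, so the three orientations of each model are balanced within a presentation.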
Pretests had shown that there was no difference between infants assessed with a repeated right–left sequence and infants assessed with a repeated left–right sequence. The side of the screen on which the first stimulus appeared had no effect on infants’ performance on the VExP. Previous pretests had also shown that drop-out-rates were lower by presenting facial stimuli for 1,500 ms instead of 700 ms as described by Canfield et al. (1997). The presentation time was thus set to 1,500 ms, which is also in accordance with Domsch, Lohaus, and Thomas (2009). The ISI (1,000 ms) was set in accordance with Canfield et al. (1997).
Measurement
Data on infants’ eye movements on the VExP were obtained by a frame-by-frame video analysis (1 frame = 33 ms). To measure RT, the number of frames between the onset of a stimulus and the infant’s shift toward it was calculated. The smaller the RT value, the more rapidly the infants reacted to the onset of the stimulus. The videos were analyzed by two raters. Reliabilities for the German sample were calculated across 65 videos (1,170 trials) with comparable proportions of 3- and 6-month-old infants (in total 48.8% of the 3- and 6-month videos). Inter-rater correlations for assessing frame counts were 0.90 and 0.93 for the 3- and 6-month assessments, respectively. For the Cameroonian sample, inter-rater correlations were calculated across 17 videos (306 trials) for each assessment, which is 56.7% of the videos. Inter-rater correlations for frame counts in the Cameroonian sample were 0.77 for the 3-month assessment and 0.72 for the 6-month assessment. Please note that the raters found it more difficult to follow the Cameroonian infants’ pupils than the Caucasian infants’ pupils in the videos. The lighting conditions in the enclosure had been arranged for the study in Germany and were applied identically in Cameroon. Only during the video analysis did it become apparent that a slightly more intensive illumination would have been appropriate for the lower contrast of the pupil on a dark iris. This explains the lower, but still appropriate, reliabilities in the Cameroonian sample.
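Because RT is counted in video frames, any value in milliseconds follows from the stated frame duration of 33 ms. A minimal sketch of that conversion is given below; the function name `frames_to_ms` and the first two frame counts are illustrative assumptions, while 54.7 frames is the 6-month mean reported in the Results.

```python
FRAME_MS = 33  # duration of one video frame in ms, as stated in the text


def frames_to_ms(frames):
    """Convert a frame count to milliseconds, rounded to the nearest ms."""
    return round(frames * FRAME_MS)


# 6 and 30 are hypothetical frame counts; 54.7 is a mean reported in the Results.
for frames in (6, 30, 54.7):
    print(frames, "frames ->", frames_to_ms(frames), "ms")
```

The frame duration also bounds the timing resolution: latencies can only be resolved in steps of about 33 ms, and a 200 ms criterion corresponds to roughly six frames.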
From the 18 VExP trials, a maximum of 18 RTs per infant could be obtained. Because not all infants completed all trials, as a first step, the 18 trials were reduced to six blocks of typically three trials each to reduce within-infant RT variability. Trials 1 to 3 composed block 1, trials 4 to 6 composed block 2, and so on through trials 16 to 18, which composed block 6. Further calculations were based on the average within-block RT. Each infant’s performance measure was typically based on six RT blocks, but if a block included zero or only one RT trial (e.g., due to infant fussiness), that block was deleted. Thus, the performance measure for the infants could not always be based on six RT blocks. Infants from whom fewer than three valid blocks could be obtained were excluded from further analyses (off-task behavior as described above). This procedure avoided the need for imputing missing data. For the German sample, fewer than 10% of the assessments yielded fewer than three blocks, whereas for the Cameroonian sample, 12% of the assessments yielded fewer than three blocks.
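The blocking and exclusion rules described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' scoring code: the function names `block_means` and `usable`, the use of `None` for a trial without a codable RT, and the example RTs are all hypothetical.

```python
from statistics import mean


def block_means(trial_rts, block_size=3, min_trials=2):
    """Average RTs within consecutive blocks of trials.

    trial_rts: per-trial RTs in frames, with None for a missing trial.
    Returns {block_number: mean RT}. Blocks with fewer than `min_trials`
    valid RTs (that is, zero or one) are dropped.
    """
    blocks = {}
    for start in range(0, len(trial_rts), block_size):
        valid = [rt for rt in trial_rts[start:start + block_size] if rt is not None]
        if len(valid) >= min_trials:
            blocks[start // block_size + 1] = mean(valid)
    return blocks


def usable(blocks, min_blocks=3):
    """An assessment is kept only if at least `min_blocks` blocks are valid."""
    return len(blocks) >= min_blocks


# Hypothetical infant: 18 trial RTs in frames, None marks a missing trial.
rts = [62, 60, 58, 57, None, 55, None, None, 54, 50, 52, 51,
       None, None, None, 47, 49, 48]
blocks = block_means(rts)
print(blocks)
print(usable(blocks))
```

In this example, blocks 3 and 5 are dropped because they contain one and zero valid trials, respectively, and the assessment is retained because four valid blocks remain.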
Statistical Analyses
A nonlinear regression model was applied to regress the vector of the mean RTs for each infant against the corresponding block numbers. The model is E(Y|b) = α + β exp[−δ(b − 1)²] (Model 3 in Thomas & Gilmore, 2004), with Y denoting an infant’s mean RT response on block b. The model fits a monotone nonincreasing function with a positive asymptote to the data. It was constructed to reflect the theoretical features of the habituation process (cf. Thomas & Gilmore, 2004). Like habituation data, VExP responses are expected to decrease over trials. The model may be viewed as a data smoother at the individual infant level. Its three parameters allow sufficient flexibility to capture the main features of each infant’s VExP mean RTs, and in addition have explicit conceptual interpretations.
The three nonnegative parameters, α, β, and δ, the interpretation of which can be seen in the hypothetical fitted function in Figure 2, reflect features of the attention processes. Within the VExP framework, the parameters have explicit and useful interpretations. α represents the fastest RT response of the infant or, put differently, the asymptotic baseline RT. α + β (denoted below as αβ) expresses the maximum or slowest RT, which, under the model, occurs on the first block of trials. β represents the range of the infant’s RT responses, that is, the width of the infant’s estimated RT function. δ is the most important parameter for indicating the learning process of each individual infant. With values ranging from zero to one, it provides information on the RT decline; the larger δ, the greater the RT decline, with δ = 0 indicating no decline across the (up to six) blocks of RT trials. Parameters were estimated by least squares methods implemented in R (R Development Core Team, 2010). In addition to removing some within-individual error variability by smoothing each infant’s responses, this approach can identify features of the RT responses that cannot be identified unless RT is modeled at the level of the individual infant. By allowing each infant to have individual model parameter values, the procedure can be viewed as a random effects model framework.

Figure 2. Estimates α, β, and δ across trials.
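To make the per-infant fit concrete, the sketch below estimates α, β, and δ by least squares in plain Python. It is not the authors' R implementation. It assumes the exponent in the model is (b − 1) squared, searches δ on a grid from 0.01 to 1.00 (following the stated range of zero to one), and exploits the fact that the model is linear in α and β once δ is fixed. The function names and the generating values (α = 45, β = 15, δ = 0.30) are hypothetical.

```python
import math


def model(b, alpha, beta, delta):
    """Expected RT on block b: alpha + beta * exp(-delta * (b - 1)**2)."""
    return alpha + beta * math.exp(-delta * (b - 1) ** 2)


def fit_infant(blocks, rts, delta_grid=None):
    """Least-squares fit of the model by grid search over delta.

    For a fixed delta the model is linear in alpha and beta, so both are
    obtained in closed form by regressing RT on x = exp(-delta * (b - 1)**2).
    Candidate fits with a negative alpha or beta are skipped (assumption:
    this stands in for the nonnegativity constraint on the parameters).
    Returns (sse, alpha, beta, delta), or None if no admissible fit exists.
    """
    if delta_grid is None:
        delta_grid = [i / 100 for i in range(1, 101)]  # 0.01 .. 1.00
    n = len(blocks)
    y_bar = sum(rts) / n
    best = None
    for delta in delta_grid:
        x = [math.exp(-delta * (b - 1) ** 2) for b in blocks]
        x_bar = sum(x) / n
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        if sxx == 0:
            continue  # alpha and beta are not separately identifiable
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, rts))
        beta = sxy / sxx
        alpha = y_bar - beta * x_bar
        if alpha < 0 or beta < 0:
            continue
        sse = sum((yi - model(b, alpha, beta, delta)) ** 2
                  for b, yi in zip(blocks, rts))
        if best is None or sse < best[0]:
            best = (sse, alpha, beta, delta)
    return best


# Hypothetical, noise-free block means (in frames) generated from known
# parameter values, so that the fit can be checked against them.
blocks = [1, 2, 3, 4, 5, 6]
rts = [model(b, 45.0, 15.0, 0.30) for b in blocks]
sse, alpha, beta, delta = fit_infant(blocks, rts)
print(round(alpha, 2), round(beta, 2), delta)
```

With noise-free input the fit should recover the generating values up to rounding. Because the function takes the block numbers explicitly, it also works when some of the six blocks are missing for an infant.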
For hypothesis testing, in both groups (Cameroonian vs. German infants) analyses of variance with repeated measures using estimates of α, β, and δ as dependent variables were calculated for 3- and 6-month-old infants across the stimuli.
Results
In the following section, the data obtained from the Cameroonian sample are contrasted to data from the German sample presented in Fassbender et al. (2012).
For the three parameter estimates of α, β, and δ, intercorrelations were calculated. At both 3 and 6 months, between estimates of α and β, fairly large and significant negative correlations were found. For the German sample, they were r = −.67, p < .001 at the 3-month assessment and r = −.83, p < .001 at the 6-month assessment. For the Cameroonian sample, α and β correlated r = −.72 at 3 months and r = −.76 at 6 months, both p < .001. There are no significant differences between the two samples regarding the intercorrelations between α and β. The negative correlations between the two parameters can be expected in the framework of the model: A small α indicates fast asymptotic RTs in the last block of trials and therefore possibly higher β values, which reflect the range between slowest and fastest RT responses. Small values of β, however, indicate a relatively small within-subject range of RTs which, as the data indicate, are commensurate with relatively high α values. The latter negative correlation expresses a shallow response function in the model. Thus, independent of cultural background and age, faster infants also have a greater range of RT responses than slower infants.
In addition, for the German sample, there was a modest within-age correlation between α and δ at 3 months, r = .26, p < .01, but not at 6 months. For the Cameroonian sample, there was no significant correlation between δ and any other parameter. The rate of decline indexed by δ is generally uncorrelated with whether the infant is slow or fast asymptotically and it is generally uncorrelated with the width of the infant’s RT function β. This means that the parameter δ conveys important information on the attention process that cannot be expressed by the relation between maximum RT (indexed here by αβ) and minimum RT (indexed here by α) as in linear settings. The rate of decline is generally uncorrelated with other features of the infants’ RT response function and therefore adds further information.
In each cultural group, there are few across-age intercorrelations among parameter estimates. For the German sample, β at 3 months correlates with δ at 6 months, r = −.19, p < .05, and in the Cameroonian sample, α at 3 months correlates with β at 6 months, r = −.37, p < .05. As these correlations are not systematic, infant performance at one age is largely uncorrelated with performance at another. The lack of between-age stability, however, is not surprising for an infancy study. For example, infant alertness is not very stable at these early ages and therefore may have influenced infant behavior in the testing situations.
For hypothesis testing, three ANOVAs were conducted with the within-subjects factor age (3 vs. 6 months) and the between-subjects factors stimulus class (African vs. Caucasian faces) and cultural background (German infants vs. Cameroonian infants). Dependent variables were (a) the estimated αβ indicating the slowest RT in the first trial block, (b) α as the asymptotic fastest performance in the last trial block, and (c) the decline rate in RT (δ). Infant sex as an additional between-subjects factor did not yield main or interaction effects for any of the three parameters and will thus not be considered further in the results section.
Consider first the estimated slowest RT (αβ) across both the African and the Caucasian faces conditions. For the German sample, αβ was M = 60.8 (SD = 11.8) frames at 3 months and M = 54.7 (SD = 6.2) frames at 6 months, which correspond to about 2,006 ms and 1,805 ms in RT. For the Cameroonian sample, αβ was M = 61.3 (SD = 8.9) frames at 3 months and M = 54.7 (SD = 6.4) frames at 6 months, which correspond to about 2,023 ms and 1,805 ms, respectively. The differences between the German and the Cameroonian samples at 3 months are not significant. The decrease of αβ with age, however, is significant across the two samples, F(1,159) = 22.61, p < .001, η² = .124. This decrease, which expresses general maturation, depends neither on cultural background nor on stimulus class, as no interactions were found, F(1,159) = 0.010, ns (nonsignificant), for cultural background and F(1,159) = 0.001, ns, for stimulus class. Infants in both cultural samples improve performance on the VExP as a consequence of their maturation process.
The repeated measures ANOVA on α indicated that there was neither a main effect for age, F(1,159) = 1.99, ns, nor any interaction with the cultural background, F(1,159) = 0.55, ns, or stimulus class, F(1,159) = 0.56, ns. The asymptotic fastest response on VExP is hence invariant across the different age groups. However, a two-way interaction between cultural background and stimulus class (but no main effect for cultural background in general) was found for α, F(1,159) = 4.48, p < .05, η² = .027. Across both assessments, infants in Cameroon achieve smaller α values with own-race African faces (M = 46.4 frames, SD = 10.8, n = 13 for African faces versus M = 52.1 frames, SD = 4.2, n = 17 for Caucasian faces), whereas infants in Germany achieve smaller α values with own-race Caucasian faces (M = 45.7 frames, SD = 7.7, n = 66 for Caucasian faces versus M = 46.7 frames, SD = 7.6, n = 67 for African faces). Independent of the two age groups, when solving the VExP task with own-race faces, infants in both samples reach quicker RT responses at the end of the task than when solving the VExP task with other-race faces.
The third ANOVA with repeated measures was run on δ, indicating the curvilinear decrease of RT across blocks. The longitudinal data revealed no main effect of age on δ, F(1,159) = 0.89, ns, although an increase of δ with age can be expected on the VExP. There was, however, a significant between-subjects effect for cultural background, F(1,159) = 4.51, p < .05, η² = .028. Independent of age and of the stimuli, Cameroonian infants have higher δ values (M = 0.45, SD = 2.9) than German infants (M = 0.33, SD = 2.5). There were no further main effects or two-way interactions.
The most interesting effect for δ was, however, the expected three-way interaction between age, cultural background, and the stimulus class, F(1,159) = 4.97, p < .05, η² = .030. This effect remained stable when integrating α and β as covariates. Depending on whether infants saw African or Caucasian faces, the change of δ across age differs between German and Cameroonian infants. The different development of δ for the two stimulus classes can be seen in Figure 3a for the German sample and in Figure 3b for the Cameroonian sample. To check whether the expected differential δ-performance with age depends on the stimuli, post hoc one-tailed t tests were calculated for both cultural backgrounds. They tested whether the δ values of the 6-month assessment were higher than those of the 3-month assessment in the own-race faces conditions or lower in the other-race faces conditions, respectively. The German infants’ RT decline led to significantly higher δ values in the Caucasian condition, t(66) = −2.38, p < .01, whereas for the African faces, δ decreased and thus RT even increased with age, but this increase was not significant, t(67) = 1.46, ns. For the Cameroonian infants, the RT development shows a mirror-image pattern: Infants who watched own-race African faces achieved significantly higher δ values, t(13) = −1.88, p < .05, whereas again, for the other-race (Caucasian faces) condition, RT even increased slightly but not significantly from 3 to 6 months, t(17) = 0.39, ns.

Figure 3. Significant interaction of stimulus class (African vs. Caucasian) and age (3 and 6 months) with regard to δ in Germany (a) and Cameroon (b).
To analyze whether infants in general show better learning performance on the VExP with own-race faces than with other-race faces, infants were pooled into an own-race faces group and an other-race faces group, independent of their cultural background. At 3 months of age, there is no difference in performance (δ) with own-race versus other-race faces, F(1,162) = 1.70, ns. At 6 months, however, the learning curves are significantly steeper when infants perform the task with own-race face stimuli, F(1,162) = 7.62, p < .01, η2 = .045. Consequently, the learning improvement with age in the VExP is greater for infants who solve the task with own-race face stimuli, and this effect is independent of the infants’ cultural background.
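As a plausibility check on the effect sizes reported above, the following minimal sketch recomputes η2 from each F value and its degrees of freedom. It assumes that the reported η2 is a partial eta squared, which the text does not state explicitly; the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom.

    ASSUMPTION: the eta squared values in the text are partial eta squared.
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported eta squared), all taken from the text
reported = [
    ("cultural background (delta)", 4.51, 1, 159, 0.028),
    ("age x culture x stimulus class (delta)", 4.97, 1, 159, 0.030),
    ("own- vs. other-race faces at 6 months", 7.62, 1, 162, 0.045),
]

for label, f_value, df1, df2, eta_reported in reported:
    eta = partial_eta_squared(f_value, df1, df2)
    print(f"{label}: computed {eta:.3f}, reported {eta_reported:.3f}")
```

Under this assumption, the recomputed values agree with the reported ones to three decimal places.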
Discussion
The VExP is a sophisticated instrument to assess sequence learning in infancy. This study has analyzed the possible impact different stimuli might have on sequence learning success. When faces are used as VExP stimuli, the longitudinal data on German and Cameroonian infants at 3 and 6 months illustrate that infant face-perception and sequence-learning abilities interact. Based on findings on perceptual narrowing in favor of an infant’s environment, the differences between other-race and own-race face processing from about 6 months of age, which can explain the emerging ORE, were suggested to influence infants’ sequence-learning abilities. With age, RT to stimulus shifts was found to decline (Canfield et al., 1995). Thus, mean RT was assumed to decrease from 3 to 6 months. This decrease was expected in the own-race faces condition. For the other-race faces, by contrast, RT was expected not to decrease. In fact, the results even showed an increase with age.
Ferguson, Kulkofsky, Cashon, and Casasola (2009) found both own- and other-race faces to be processed holistically at 4 months of age. By 8 months, the infants still processed their own-race Caucasian faces holistically, but the other-race African faces were processed featurally. Featural face processing is assumed to be less efficient than holistic face processing, in which spatially separate features are aggregated into an integrated unit. Holistic processing uses fewer feature slots, which reduces the attentional resources required in working memory (Curby, Glazek, & Gauthier, 2009). As a possible interpretation, featural processing of other-race faces at 6 months could slow down sequence learning in the other-race faces condition from 3 to 6 months. This impairment does not occur for own-race faces, which are processed holistically.
The hypotheses, confirmed for the German sample in Fassbender et al. (2012), could also be confirmed here for the Cameroonian sample. In an overall analysis, a significant three-way interaction between age, the stimulus classes, and cultural background was found for δ, indicating that in both samples, the expected learning improvement only occurs in the own-race faces conditions. In the other-race faces conditions, however, RT even increased, although not significantly, in both infant samples. It is remarkable that this pattern could be found almost identically in both cultural backgrounds.
In the framework of the model, other findings on VExP sequence learning could be replicated. Independent of the stimuli and of cultural background, RT was found to decline with age, as reported by Canfield et al. (1995, 1997): αβ is significantly smaller at 6 months than at 3 months. This decrease shows no interaction with the stimuli or the infants’ cultural background and thus reflects general maturation. At 6 months, the infants react more quickly to stimulus shifts than at 3 months.
The parameter α reflects an individual infant’s fastest RT response: the smaller α, the faster the response. In the longitudinal analysis, for α, neither a main effect nor any interactions with cultural background or the stimuli were found. From 3 to 6 months, infants do not achieve a higher maximum speed in the course of the VExP assessment. Thus, it may be assumed that there is a maximum speed achievable in infancy which does not improve significantly during the interval from 3 to 6 months. Invariant across the two age groups, in the longitudinal analysis a significant interaction between cultural background and stimulus class was found: Cameroonian infants achieve quicker RT responses in the course of the experiment in their own-race African faces condition, whereas German infants achieve quicker RT responses in the course of the experiment in their own-race Caucasian faces condition. Infants in the present sample thus also show an advantage for own-race faces from the age at which a preference for own-race faces (Kelly et al., 2005) and recognition difficulties for other-race faces (Sangrigoli & de Schonen, 2004) have been reported.
Because the parameter α is independent of the kind of stimuli and the cultural background, it seems unlikely that the results can be explained by different engagement in the own-race versus other-race conditions. If the poorer recognition of other-race faces led to less engagement, because the infants saw alternating photos of highly similar-looking other-race faces, this would probably also affect their estimated fastest RT (α).
Regarding the analysis of RT reduction (δ), a significant between-subjects effect for cultural background was found. Cameroonian infants show higher δ values (M = 0.48) than German infants (M = 0.32). The reason for the lower δ in German infants may be that, in the sample of 133 German infants, n = 22 infants scored δ = 0 at the 3-month assessment and n = 28 infants scored δ = 0 at the 6-month assessment. In the framework of the model, these infants, although they watched the VExP attentively and were therefore not excluded as dropouts, did not reduce RT sufficiently to achieve a δ indicating learning improvement. In the Cameroonian sample, however, none of the 30 infants achieved δ = 0, indicating that all infants included in the final data set showed learning improvement. To check the robustness of the results, the three ANOVAs were recalculated excluding the n = 43 German infants with δ = 0 at either or both of the two assessments, which reduced the German sample to n = 90, so that δ ranged from 0.1 to 1.0 in both samples. All effects remained stable, including the significant three-way interaction among age, cultural background, and stimulus class, F(1,116) = 5.0, p < .05, η2 = .041. The significant between-subjects effect for cultural background in δ, however, disappeared after this exclusion, F(1,116) = 0.13, ns. When only successful infants (with δ > 0) are considered, there are no performance differences between the two samples on any of the parameters. Thus, the question to be answered here is why the German sample contains more nonlearners (in the framework of the model). One speculation is that the German infants with δ = 0 paid attention to the stimuli on the screen but watched them somewhat passively.
In the Cameroonian sample, by contrast, our observation was that an infant was either totally involved in the task or disliked it (and was consequently excluded from the sample). Because in German families it is common to have at least one television, and in most cases computers also, even very young infants can to a certain degree be considered familiar with screens on which something “happens.” In Cameroon, this is completely different. None of the families in our sample had a computer at home and only four families (infants were included in the final data set, two of them in the Caucasian, two in the African faces condition) had a television at home. It may be suggested that for that reason, there were no nonlearners, that is, passive watchers, included in the Cameroonian sample. For most infants, the VExP assessment at 3 months of age was the first contact with a screen. This might explain either their increased attention and readiness for learning or the fact that they did not want to watch the screen (dropout).
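The sample-size bookkeeping behind the robustness check described above can be retraced with a few lines of arithmetic. The sketch below uses only the counts given in the text. The overlap of infants with δ = 0 at both assessments is not reported and is derived here by inclusion-exclusion. The error degrees of freedom are assumed to equal the total N minus the four between-subjects cells (two cultural backgrounds by two stimulus classes), which is an assumption, although it reproduces the reported F(1,159) and F(1,116).

```python
# Counts taken from the text
N_GERMAN = 133
N_CAMEROONIAN = 30
ZERO_AT_3_MONTHS = 22   # German infants with delta = 0 at 3 months
ZERO_AT_6_MONTHS = 28   # German infants with delta = 0 at 6 months
EXCLUDED = 43           # German infants with delta = 0 at either or both assessments

# ASSUMPTION: 2 cultural backgrounds x 2 stimulus classes = 4 between-subjects cells
BETWEEN_SUBJECT_CELLS = 4


def zero_at_both() -> int:
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B| (derived, not reported)."""
    return ZERO_AT_3_MONTHS + ZERO_AT_6_MONTHS - EXCLUDED


def error_df(n_german: int, n_cameroonian: int) -> int:
    """Error degrees of freedom under the N-minus-cells assumption stated above."""
    return n_german + n_cameroonian - BETWEEN_SUBJECT_CELLS


german_remaining = N_GERMAN - EXCLUDED

print("delta = 0 at both assessments:", zero_at_both())
print("German infants remaining:", german_remaining)
print("error df, full sample:", error_df(N_GERMAN, N_CAMEROONIAN))
print("error df, reduced sample:", error_df(german_remaining, N_CAMEROONIAN))
```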
The results of the present study comparing infant VExP performance in Cameroon and Germany are interesting for different reasons. One is that, when dealing with VExP studies in infancy, one should consider that difficult (i.e., nonexperienced) stimuli might have an impact on contingency learning. As in Fassbender et al. (2012) for the German sample, the VExP also shows sensitivity to the stimulus material for African infants. Like the ORE, which was found to emerge around 6 months of age in infants from various cultural backgrounds, the influence of own-race versus other-race face stimuli on VExP performance was found to be a cross-cultural phenomenon. Independent of the cultural context, there is an advantage for own-race faces in the VExP task. As a consequence, to exclude possible ORE effects on sequence learning in a VExP, faces should be used as social stimuli only when they derive from the infants’ own cultural background.
In contrast to the VExP literature referred to in the “Introduction” section, a different approach to calculating infant performance on the task was pursued in the present study. To calculate RT reduction, we did not enter the infants’ raw data into the analysis of variance, as is usually done, because individual differences, which are common in infancy studies, are then confounded with residual variance. Instead, we modeled the data to obtain the parameters as performance measures at the level of the individual infant. This procedure likely yielded more statistical power because some within-infant error variation was removed. In particular, the parameter δ, which reflects the nonlinear learning curve of each infant, adds information on the learning process that cannot be obtained in a linear setting. We thus consider modeling frameworks such as the one used here to be useful in future studies.
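To illustrate what a per-infant modeling step of this kind might look like, the following sketch fits a learning curve to one infant's block-wise RTs. It is not the model used in this study, whose exact specification is given elsewhere. The exponential form RT(b) = α + β·exp(−δ·b), the grid of δ values from 0 to 1.0 in steps of 0.1, the requirement β > 0, and the function name are all assumptions made for illustration. They are chosen only so that α acts as the fastest (asymptotic) RT and δ = 0 corresponds to no RT reduction, as described in the text.

```python
import math


def fit_learning_curve(rts, deltas=None):
    """Fit RT(b) = alpha + beta * exp(-delta * b) to RTs for blocks b = 0, 1, 2, ...

    HYPOTHETICAL model form (see lead-in). delta is chosen by grid search;
    for each candidate delta, alpha and beta come from ordinary least squares.
    Requires at least two blocks.
    """
    if deltas is None:
        deltas = [d / 10 for d in range(0, 11)]  # 0.0, 0.1, ..., 1.0 (assumed grid)
    n = len(rts)
    mean_rt = sum(rts) / n
    # Baseline: delta = 0 means a flat curve (no RT reduction), alpha = mean RT.
    best = (sum((y - mean_rt) ** 2 for y in rts), mean_rt, 0.0, 0.0)
    for delta in deltas:
        if delta == 0:
            continue  # already covered by the flat baseline
        x = [math.exp(-delta * b) for b in range(n)]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_rt) for xi, yi in zip(x, rts))
        beta = sxy / sxx
        alpha = mean_rt - beta * mean_x
        sse = sum((alpha + beta * xi - yi) ** 2 for xi, yi in zip(x, rts))
        # Accept only decreasing curves (beta > 0) that fit better than the best so far.
        if beta > 0 and sse < best[0]:
            best = (sse, alpha, beta, delta)
    sse, alpha, beta, delta = best
    return {"alpha": alpha, "beta": beta, "delta": delta, "sse": sse}


# Synthetic, noise-free example data (not study data): alpha = 300, beta = 200, delta = 0.5
example_rts = [300 + 200 * math.exp(-0.5 * b) for b in range(6)]
print(fit_learning_curve(example_rts))
```

The resulting parameters (α, β, δ) would then serve as one row per infant and assessment in the subsequent analysis of variance, in place of the raw RTs.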
A possible limitation of the present study may be that the focus is restricted to faces as stimuli. There is no rationale to predict cultural differences in sequence learning in general. The cultural approach shows, however, that there may be culture-specific learning facilitations or impairments with specific stimuli. African and Caucasian faces are such stimuli: they are familiar or unfamiliar depending on the infant’s surroundings. Faces may, however, be special stimuli because face and nonface object recognition may be processed by partially independent neural mechanisms (Dailey & Cottrell, 1999). Thus, it remains unclear whether the effects on learning are restricted to familiar versus unfamiliar faces or whether they can also be found for familiar versus unfamiliar nonface objects. Moreover, only two ethnicities were used as own-race and other-race faces. As outlined in the “Introduction” section, habituation studies on the development of the ORE have shown that Asian face stimuli could still be recognized by 6-month-old Caucasian infants and Caucasian face stimuli by 6-month-old Asian infants, whereas African faces could not. It would therefore be interesting to analyze whether the performance differences on the VExP found in the present study could also be found if faces of other ethnicities were incorporated as additional VExP conditions. Another limitation may be that the longitudinal design of the present study consisted of only two assessments. As the ORE intensifies beyond 6 months of age, it would be interesting to analyze the effects of (various) other-race face categories on learning parameters at older ages. It should also be noted as a possible limitation that only female faces were used as stimuli. As stated in the “Method” section, all infants included in the final data set had a female primary caregiver. This can explain a preference for female faces, as outlined in the “Introduction” section.
An alternative explanation of the developmental change for own-race faces may be that infants speed up on female own-race faces alone, as only female faces represent the experienced face category. As a consequence, it would be interesting to replicate these findings with male faces. Another limitation may be seen in the comparatively small sample of Cameroonian infants. However, compared with many other infant studies, this is nevertheless a reasonable sample size, especially when taking into account the difficulties of data collection in Cameroon, such as an unreliable electricity supply and difficulties in contacting families without modern communication media. A strength of this study is that it is, to our knowledge, the first study to compare VExP data from a Western sample with data from a non-Western rural sample. Furthermore, even the small sample of Cameroonian infants was sufficient to show that the stimulus effect previously shown for German infants was, as expected, reversed for the Cameroonian infants.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by a grant from the German Research Foundation (Deutsche Forschungsgemeinschaft) to Heidi Keller, Monika Knopf, Arnold Lohaus, and Gudrun Schwarzer (KE 263/53-1, KN 275/6-1, LO 337/20-1, and SCHW 665/9-1).
