Abstract
Across two eye-tracking experiments, we showed that infants are sensitive to the statistical reliability of informative cues and selective in their use of information generated by such cues. We familiarized 8-month-olds with faces (Experiment 1) or arrows (Experiment 2) that cued the locations of animated animals with different degrees of reliability. The reliable cue always cued a box containing an animation, whereas the unreliable cue cued a box that contained an animation only 25% of the time. At test, infants searched longer in the boxes that were reliably cued, but did not search longer in the boxes that were unreliably cued. At generalization, when boxes were cued that never contained animations before, only infants in the face experiment followed the reliable cue. These results provide the first evidence that even young infants can track the reliability of potential informants and use this information judiciously to modify their future behavior.
Learning from other people is a fundamental part of social-cognitive development (Bandura, 1962). One essential skill in learning from others is being able to determine whether they are trustworthy. Research suggests that young children are selective in trusting other people’s testimony. Children are capable of selecting sources by tracking speakers’ past accuracy (e.g., Clément, Koenig, & Harris, 2004; Koenig, Clément, & Harris, 2004; Koenig & Woodward, 2010) and can infer others’ accuracy on the basis of probabilistic evidence (Pasquini, Corriveau, Koenig, & Harris, 2007). Young children also register multiple cues that point to a speaker’s potential accuracy, such as the speaker’s epistemic knowledge (e.g., Sabbagh & Baldwin, 2001) or group membership (e.g., Kinzler, Corriveau, & Harris, 2011). Research in this area suggests that children’s social learning about unobservable, linguistic, or cultural information emanates from their ability to learn from other people selectively (e.g., Harris & Koenig, 2006; Mascaro & Sperber, 2009; Sobel & Kushnir, 2013).
Much of the research on children’s “trust in testimony” (cf. Harris & Koenig, 2006) concentrates on preschool-aged children’s ability to learn from informants selectively and judiciously (e.g., Koenig & Harris, 2005; Koenig & Jaswal, 2011; Sobel & Corriveau, 2010). What about infants and toddlers? Some evidence of selective trust has been found in the second year of life (see Harris & Lane, 2013, for a review): Fourteen-month-olds will follow eye-gaze cues, and 16-month-olds will point more in the presence of adults who have correctly, rather than incorrectly or inconsistently, identified the locations or labels of objects (Begus & Southgate, 2012; Chow, Poulin-Dubois, & Lewis, 2008). Fourteen-month-olds also choose to imitate actors differently on the basis of each actor’s demonstrative knowledge of an object’s function or conventional usage (Poulin-Dubois, Brooker, & Polonia, 2011; Zmyj, Buttelmann, Carpenter, & Daum, 2010). However, in none of these cases did toddlers have to track the consistency of information or update their initial associations across trials; one person supplied the correct label, function, or location, while the other person never did. No studies have investigated infants’ capacity to track and respond selectively to informants’ relative reliability. Is this ability present in infancy, and how might it develop?
Successfully tracking an informant’s reliability depends on the efficient deployment of selective attention toward events that contain statistically reliable information and away from those that are inconsistent or irrelevant. From a young age, infants can learn statistical regularities among events (e.g., Haith, 1993; Kirkham, Slemmer, & Johnson, 2002), and they can use this information to distribute attention among multiple targets (Kidd, Piantadosi, & Aslin, 2012; Tummeltshammer & Kirkham, 2013). Further, young infants can allocate attention selectively to support task-relevant learning (S. P. Johnson, Slemmer, & Amso, 2004; Richardson & Kirkham, 2004; Tummeltshammer & Kirkham, 2013; Wu & Kirkham, 2010). Such learning abilities have been shown to affect the ways in which infants acquire linguistic (Graf Estes, Evans, Alibali, & Saffran, 2007; Saffran, Aslin, & Newport, 1996), social (S. Johnson, Slaughter, & Carey, 1998; Kaye & Fogel, 1980; Trevarthen, 1979), and causal (Sobel & Kirkham, 2006, 2012) information. Older infants are also capable of integrating statistical data with existing physical and social knowledge (e.g., Denison & Xu, 2010; Gweon & Schulz, 2011; Kushnir, Xu, & Wellman, 2010).
These data suggest that the ability to track an informant’s reliability in early childhood might emanate from the statistical-learning capacities already present in infancy. That is, infants’ sensitivity to statistical regularity may enable them to distinguish between reliable and unreliable informants by driving attention toward consistent, learnable events (as in Tummeltshammer & Kirkham, 2013). Such a mechanism could also be applied to novel or abstract cues, which would help infants to discover their usefulness without requiring a deeper understanding of communicative intentions or social exchanges.
Across two eye-tracking experiments, we investigated whether 8-month-olds can track the reliability of informants over time and use this information to guide their predictions. We considered whether this ability is limited to familiar cues that typically provide knowledge (faces in Experiment 1) or could be similarly observed with novel cues (arrows in Experiment 2). In each experiment, a reliable or unreliable stimulus cued the locations of four different animal animations. The reliable cue always indicated a box where the animal would appear, while the unreliable cue pointed to a box containing the animal only 25% of the time. After familiarization with the cues, infants viewed test and generalization trials, in both of which a location was cued and the corresponding animal sound played, but no animation appeared. On test trials, previously cued locations were cued; on generalization trials, novel locations were cued. If infants had learned to expect an animation in the cued box, then they should search longer in the cued box than in the uncued boxes. We hypothesized that infants would track the accuracy of the cues across trials and use this information to motivate differential search behaviors: When cued by the reliable stimulus, infants would follow the cue and search longer in the cued location, but when cued by the unreliable stimulus, they would not follow the cue and instead search randomly.
Experiment 1: Faces
Method
Participants
Twenty-four 8-month-old infants (11 females, 13 males; mean age = 8 months, 13.40 days; range = 7 months, 12 days to 9 months, 7 days) participated in the experiment. Four additional infants were excluded because of fussiness, inattention, or failure to complete the calibration procedure. Caregivers with infants were recruited on a voluntary basis via local advertisements. Informed consent was received from all caregivers, and infants received a small gift.
Apparatus and stimuli
Eye movements were recorded using a Tobii (Danderyd, Sweden) TX300 eye tracker with a 23-in. built-in monitor. Stimuli were presented using Tobii Studio presentation software, and sounds were played through external stereo speakers. Infants were monitored via a video camera built into the eye tracker, and their eye movements were observed through the Tobii Studio Live Viewer display. Two female actors were filmed, and their footage was edited into face-cue stimuli in Final Cut Express HD3. The animated clips were created using Macromedia Director MX 2004 and were combined with the face cues using Final Cut Express.
Infants saw a full-screen display (1,280 × 1,024 pixels) composed of four hollow square boxes with white borders, each of which was placed in one of the four corners of the display against a black background. Within each box, an animated animal appeared: a barking dog in Box 1, a croaking frog in Box 2, a gurgling fish in Box 3, and a chirping bird in Box 4. For each infant, each animal always appeared in the same corner of the display. The animations were preceded by centrally presented face cues. On each trial, one of two female faces appeared in the center of the display, smiled at the infant, and said, “Wow, look!” She then turned to one of the boxes and froze. An animal sound played, and after a 500-ms delay, the corresponding animal appeared in its box. The animal bounced or rotated within the box for 3.5 s, while the face remained frozen, as shown in Figure 1.

Fig. 1. Examples of familiarization trial blocks containing four reliable face cues (left) and four unreliable face cues (right) from Experiment 1. At the start of each trial, one of two female faces appeared and looked toward one of the boxes. An animal then appeared in one of the boxes. The reliable face always looked to the correct box (i.e., where the animal would appear), but the unreliable face looked to the correct box on only one out of every four trials.
Design and procedure
All infants were tested individually in a quiet room, seated on their caregiver’s lap approximately 60 cm away from the monitor. A five-point calibration sequence (the four corners and the center of the screen; for details, see von Hofsten, Dahlström, & Fredriksson, 2005) was used to obtain a reliable signal. Infants needed to fixate each point before the experimenter manually advanced the calibration sequence; if fewer than four points were accurately calibrated, the sequence was repeated.
Following successful calibration, all infants were familiarized with two faces—a reliable face and a different, unreliable face—in separate blocks. The identities of the faces and order of presentation were counterbalanced across infants. The reliable face always cued a box in which an animal animation would appear. The unreliable face cued a box containing an animal only 25% of the time; the other 75% of the time, an animal appeared in a box that did not correspond to where the face had looked. Critically, the reliable face only cued Boxes 1 and 2 (with the appropriate animal appearing in that box), while the unreliable face only cued Boxes 2 and 3 (with animals appearing in Box 1 or 3, so that the animal only matched the cue on 25% of trials). Thus, Box 1 was cued only by the reliable face, Box 2 was cued by both faces, Box 3 was cued only by the unreliable face, and Box 4 was never cued during familiarization.
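The cue-to-box contingencies described above can be illustrated with a short sketch. This is not the stimulus script used in the study: the two 4-trial blocks below are hypothetical trial orders, chosen only to satisfy the constraints stated in the text (the reliable cue points to Boxes 1 and 2 and is always correct; the unreliable cue points to Boxes 2 and 3 and is correct on one trial in four).

```python
from fractions import Fraction

# Hypothetical 4-trial blocks written as (cued_box, animal_box) pairs.
# The actual trial orders are not reported in the text; these are
# illustrative assumptions that respect the stated constraints.
RELIABLE_BLOCK = [(1, 1), (2, 2), (1, 1), (2, 2)]
UNRELIABLE_BLOCK = [(3, 3), (3, 1), (2, 1), (2, 3)]
ALL_BOXES = {1, 2, 3, 4}


def cue_validity(block):
    """Fraction of trials on which the animal appears in the cued box."""
    hits = sum(1 for cued, animal in block if cued == animal)
    return Fraction(hits, len(block))


def cued_boxes(block):
    """Set of boxes that the cue points to at least once in the block."""
    return {cued for cued, _ in block}


reliable_only = cued_boxes(RELIABLE_BLOCK) - cued_boxes(UNRELIABLE_BLOCK)
shared = cued_boxes(RELIABLE_BLOCK) & cued_boxes(UNRELIABLE_BLOCK)
unreliable_only = cued_boxes(UNRELIABLE_BLOCK) - cued_boxes(RELIABLE_BLOCK)
never_cued = ALL_BOXES - cued_boxes(RELIABLE_BLOCK) - cued_boxes(UNRELIABLE_BLOCK)

print(cue_validity(RELIABLE_BLOCK))    # 1
print(cue_validity(UNRELIABLE_BLOCK))  # 1/4
print(reliable_only, shared, unreliable_only, never_cued)  # {1} {2} {3} {4}
```

Under these assumed blocks, the sets recover the design summarized in the text: Box 1 is cued only by the reliable face, Box 2 by both faces, Box 3 only by the unreliable face, and Box 4 by neither.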
Following familiarization, infants viewed test trials and generalization trials. On a test trial, the face looked to the box it had previously cued (i.e., Box 1 for the reliable face and Box 3 for the unreliable face), and an animal sound played. The corresponding animation, however, did not appear. Instead, all four white boxes flashed briefly (200 ms) to encourage infants to make a saccade. On a generalization trial, the face looked to the box that had never been cued before (i.e., Box 4 for both faces), and a new animal sound played. Again, no animation appeared, but all four white boxes flashed briefly to encourage saccades.
Infants viewed four blocks of four familiarization trials each for both the reliable and unreliable faces, which appeared on alternating blocks. Familiarization was followed by two test blocks for each face cue. Each test block contained two familiarization trials interleaved with one test trial and one generalization trial; test and generalization trials were not presented back to back, so that infants would continue to expect the animations to appear. This entire sequence was then repeated, for a total of 40 familiarization (20 reliable, 20 unreliable), 4 test, and 4 generalization trials. Each trial lasted 8 s with 500 ms between trials, for a total experiment length of about 7.5 min.
Data analysis
Eye movements were recorded and filtered into discrete fixations using a spatial filter of 30 pixels and a temporal filter of 100 ms. On test and generalization trials, when all four boxes flashed but no animations appeared, accumulated looking times (i.e., the summed durations of all fixations) to each of the four boxes were measured as a proportion of total looking time.
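The looking-time measure can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the box coordinates in `AOIS` are hypothetical (the text does not report them), the input is assumed to be fixations that have already passed the fixation filter, and "total looking time" is taken here to mean the summed duration of all on-screen fixations in the trial, which is one possible reading of the text.

```python
# Hypothetical areas of interest (AOIs) on the 1,280 x 1,024 display, given as
# (left, top, right, bottom) in pixels. The real box coordinates are not
# reported in the text.
AOIS = {
    1: (0, 0, 400, 400),
    2: (880, 0, 1280, 400),
    3: (0, 624, 400, 1024),
    4: (880, 624, 1280, 1024),
}


def box_at(x, y):
    """Return the number of the box containing point (x, y), or None."""
    for box, (left, top, right, bottom) in AOIS.items():
        if left <= x < right and top <= y < bottom:
            return box
    return None


def looking_proportions(fixations):
    """Map each box to its share of total fixation time in one trial.

    `fixations` is a list of (x, y, duration_ms) tuples, assumed to be
    already filtered into discrete fixations.
    """
    per_box = {box: 0.0 for box in AOIS}
    total_time = 0.0
    for x, y, duration in fixations:
        total_time += duration
        box = box_at(x, y)
        if box is not None:
            per_box[box] += duration
    if total_time == 0:
        # No valid looking on this trial: report zero for every box.
        return per_box
    return {box: time / total_time for box, time in per_box.items()}


# Made-up fixations: one in Box 1, one at screen center, one in Box 4.
example_trial = [(100, 100, 300), (640, 512, 200), (1000, 900, 500)]
proportions = looking_proportions(example_trial)
print(proportions)
```

If total looking time were instead defined as looking to the four boxes only, the denominator would exclude fixations that fall outside every box, and the four proportions would sum to 1.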
Results
Figure 2 shows the mean proportion of looking time to each of the four boxes during the test trials. These proportions were analyzed with a 2 (reliability) × 4 (box) repeated measures analysis of variance (ANOVA). Results showed a significant main effect of box, F(3, 66) = 3.64, p = .017, ηp² = .14, as well as a significant Reliability × Box interaction, F(3, 66) = 3.55, p = .019, ηp² = .14. This interaction was unpacked using separate univariate ANOVAs for test trials with reliable and unreliable faces. On trials with the reliable face, a significant main effect of box was apparent, F(3, 66) = 8.32, p < .001, ηp² = .27, and post hoc comparisons indicated that infants looked longer at the cued box than at any other box, p < .040 (Bonferroni corrected). On trials with the unreliable face, no effect of box emerged, F(3, 66) = 0.21, p = .888, which indicates that infants did not look significantly longer at the cued box than at any other box. Finally, infants looked more to the cued box when it was cued by a reliable face than by an unreliable face, t(22) = 2.66, p = .014, d = 0.55.
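The effect sizes reported above can be related to the test statistics with a short sketch. Partial eta squared follows from the F ratio and its degrees of freedom by the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). For Cohen's d, the code assumes the paired-samples convention d = t / √n with n = df + 1; the text does not state which formula was used, so this is an assumption that happens to reproduce the reported value.

```python
import math


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_d_paired(t_value, df):
    """Cohen's d for a paired t test, assuming d = t / sqrt(n) with n = df + 1."""
    return t_value / math.sqrt(df + 1)


# Values taken from the test-trial analysis of Experiment 1.
print(round(partial_eta_squared(3.64, 3, 66), 2))  # 0.14 (main effect of box)
print(round(partial_eta_squared(8.32, 3, 66), 2))  # 0.27 (reliable-face trials)
print(round(cohens_d_paired(2.66, 22), 2))         # 0.55 (reliable vs. unreliable)
```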

Fig. 2. Mean proportion of looking time to each of the four boxes on test trials in Experiment 1 as a function of the reliability of face cues. In this illustration, box numbers are assigned arbitrarily, as the actual locations of the cued and noncued boxes were counterbalanced across infants. Error bars show standard errors of the mean.
Figure 3 shows the mean proportion of looking time to each of the four boxes during generalization trials. These proportions were analyzed with a 2 (reliability) × 4 (box) repeated measures ANOVA. Results showed a marginally significant main effect of box, F(3, 63) = 2.70, p = .053, ηp² = .11, as well as a significant Reliability × Box interaction, F(3, 63) = 9.83, p < .001, ηp² = .32. This interaction was explored using separate univariate ANOVAs for generalization trials with reliable faces and generalization trials with unreliable faces. On trials with the reliable face, a significant main effect of box emerged, F(3, 63) = 12.39, p < .001, ηp² = .38. Post hoc comparisons indicated that infants followed the gaze of the face to the box that had never been cued, looking longer at this new box than at any other box, p < .024 (Bonferroni corrected). On trials with the unreliable face, no effect of box was apparent, F(3, 63) = 0.40, p = .754, which indicates that infants did not follow the face’s gaze to the new box, nor did they look longer at any other box. Finally, infants looked more to the new box when it was cued by a reliable face than when it was cued by an unreliable face, t(21) = 4.20, p < .001, d = 0.89.

Fig. 3. Mean proportion of looking time to each of the four boxes on generalization trials in Experiment 1 as a function of the reliability of face cues. In this illustration, box numbers are assigned arbitrarily, as the actual locations of the cued and noncued boxes were counterbalanced across infants. Error bars show standard errors of the mean.
Discussion
Experiment 1 showed that 8-month-olds can monitor the reliability of individual faces and use this information to guide looking behavior. Infants searched consistently in the box cued by the reliable face, whether it cued a familiar location or a novel one. Infants did not follow the gaze of the unreliable face; instead, they searched at chance among all four boxes, despite having had the same amount of familiarization with the locations of the animals. This finding is consistent with a number of studies suggesting that appropriate cues can enhance infants’ processing and learning of cued events (Reid, Striano, Kaufman, & Johnson, 2004; Senju, Csibra, & Johnson, 2008; Wu & Kirkham, 2010; Yoon, Johnson, & Csibra, 2008). These data show that infants do not naively follow gaze; rather, they are sensitive to the relation between an adult’s gaze and the locations of objects (see also Senju et al., 2008), and they track this relation over time to evaluate the reliability of adults’ gaze behaviors. In addition, we posit that the mechanism by which children track other individuals’ reliability is based on the statistical-learning capacities already present in infancy.
This hypothesis suggests that infants can track the accuracy of any cue, even one with which they have no prior experience. They are not just tracking the reliability of social information, nor are they simply tracking faces because they are an attractive and highly familiar stimulus. In Experiment 2, we used the same procedure as in Experiment 1, except that we replaced the faces with two novel, abstract arrow cues. If infants’ responses resulted from prior experience with faces, we would expect infants to be unable to distinguish between reliable and unreliable novel cues. In contrast, if infants tracked the statistical regularity with which each cue predicted a spatial location, then they should also be able to evaluate the relative reliability of these novel cues.
Experiment 2: Arrows
Method
Participants
Twenty-four 8-month-old infants (10 females, 10 males; mean age = 8 months, 10.0 days; range = 7 months, 13 days to 9 months, 6 days) participated in the experiment. Five additional infants were excluded because of fussiness, inattention, or failure to complete the calibration procedure. Sample size was chosen to match that of Experiment 1. Families were recruited and provided consent as in Experiment 1.
Apparatus and stimuli
The eye tracker, software, and testing setup were identical to those used in Experiment 1. Infants saw the same full-screen display and the same animal animations. Instead of faces, however, centrally presented arrow cues preceded the animations. On each trial, one of two colorful shapes appeared in the center of the screen and either bounced up and down with a “boing” sound or shook side to side with a “ding-dong” sound. An extension then protruded from one corner of the shape, creating a directional arrow that pointed emphatically toward one of the boxes and froze (see Fig. 4). An animal sound played, and after a 500-ms delay, the corresponding animal appeared and moved within its box, as in Experiment 1.

Fig. 4. Examples of two familiarization trials from Experiment 2. On each trial, one of two shapes appeared and then morphed into an arrow that cued one of the boxes. For half of the infants, Arrow 1 (top row) reliably cued a box in which an animal appeared, and Arrow 2 (bottom row) unreliably cued a box in which an animal appeared (and vice versa for the other half of the infants).
Design and procedure
Infants were tested in the same manner as in Experiment 1. Infants were familiarized with a reliable arrow and an unreliable arrow on separate blocks (order counterbalanced across infants). The reliable arrow always pointed to the box in which an animal animation would appear, reliably cuing two different boxes on separate trials. The unreliable arrow also cued two different boxes on separate trials but pointed to an animal only 25% of the time. As in Experiment 1, one box was cued only by the reliable arrow, a second box was cued by both arrows, a third box was cued only by the unreliable arrow, and the last box was never cued. Infants then viewed similar test trials and generalization trials with these arrows. The same number of trials as in Experiment 1 were presented.
Data analysis
Eye movements were recorded, filtered, and analyzed as in Experiment 1.
Results
The mean proportion of looking time to each of the four boxes during test trials, displayed in Figure 5, was analyzed with a 2 (reliability) × 4 (box) repeated measures ANOVA. Results showed a significant main effect of box, F(3, 69) = 4.65, p = .005, ηp² = .17, as well as a significant Reliability × Box interaction, F(3, 69) = 8.32, p < .001, ηp² = .27. A univariate ANOVA with only reliably cued trials showed a significant main effect of box, F(3, 69) = 16.28, p < .001, ηp² = .42, and post hoc comparisons indicated that infants looked longer at the cued box than at any other box, p < .002 (Bonferroni corrected). However, a univariate ANOVA with only unreliably cued trials showed no effect of box, F(3, 69) = 0.32, p = .817, which indicates that infants did not look longer at the cued box, nor at any other single box. Finally, a planned comparison across reliably and unreliably cued test trials confirmed that infants looked more to the cued box when it was cued by a reliable arrow than when it was cued by an unreliable arrow, t(23) = 4.86, p < .001, d = 0.99.

Fig. 5. Mean proportion of looking time to each of the four boxes on test trials in Experiment 2 as a function of the reliability of arrow cues. In this illustration, box numbers are assigned arbitrarily, as the actual locations of the cued and noncued boxes were counterbalanced across infants. Error bars show standard errors of the mean.
The mean proportion of looking time to each of the four boxes during generalization trials, shown in Figure 6, was analyzed with a 2 (reliability) × 4 (box) repeated measures ANOVA. Results showed no main effect of box, F(3, 66) = 0.12, p = .946, and no Reliability × Box interaction, F(3, 66) = 1.78, p = .160. This indicates that infants did not look longer to the cued box on generalization trials and did not look differently at the boxes whether cued by a reliable or an unreliable arrow.

Fig. 6. Mean proportion of looking time to each of the four boxes on generalization trials in Experiment 2 as a function of the reliability of arrow cues. In this illustration, box numbers are assigned arbitrarily, as the actual locations of the cued and noncued boxes were counterbalanced across infants. Error bars show standard errors of the mean.
Results were compared across experiments using 2 (experiment) × 2 (reliability) × 4 (box) ANOVAs. For test trials, the only significant effects were a main effect of box, F(3, 135) = 8.16, p < .001, ηp² = .154, and a Reliability × Box interaction, F(3, 135) = 10.76, p < .001, ηp² = .193. No effects or interactions with experiment emerged (all ps > .148), which confirms that there were no differences in the extent to which infants selectively responded to face cues and arrow cues. For generalization trials, results showed a significant effect of experiment, F(1, 43) = 8.22, p = .006, ηp² = .160, and a significant Experiment × Reliability × Box interaction, F(3, 129) = 7.73, p < .001, ηp² = .152. This indicates a difference in how infants generalized information about the faces and arrows; namely, they followed the gaze of the reliable face to a new location but did not follow the reliable arrow.
Attention during familiarization was also compared across experiments by considering total looking time to the cues and total looking time to the whole screen, using 2 (experiment) × 2 (reliability) ANOVAs. For looking time to the cues, a trend toward an effect of experiment, F(1, 46) = 3.75, p = .059, ηp² = .075, indicated that infants looked slightly longer at face cues than at arrow cues. However, no effects emerged for total looking to the whole screen, which indicates that infants paid equivalent attention to familiarization trials whether viewing faces or arrows.
Discussion
Experiment 2 provides the first evidence that infants can track the statistical reliability of novel abstract spatial cues and use this information to orient attention selectively. On test trials, infants consistently followed the reliable arrow to the cued box and predicted the locations of the reliably cued animals, whereas they did not follow the unreliable arrow or learn the locations of the unreliably cued animals. Notably, Experiment 2 demonstrates that infants’ tracking and use of reliability is not limited to gaze following or social situations (although see Corkum & Moore, 1998; Deligianni, Senju, Gergely, & Csibra, 2011; S. Johnson et al., 1998).
The extent to which infants learned these abstract cues, however, was limited. On generalization trials, infants did not follow the reliable cue to a box that had not been cued previously (as they did in Experiment 1); instead, they searched randomly in the four boxes. This result suggests that infants had difficulty applying their knowledge of the arrows’ relative informativeness to a new situation, and it highlights an interesting difference between infants’ tracking of familiar and novel cues. It is consistent with previous findings suggesting that depth of learning may be greater with face cues than with abstract attentional cues (Wu & Kirkham, 2010).
General Discussion
The current study demonstrates that young infants can track the reliability of a potential informant and use this information to change their future behavior. Infants tracked the relation between individual cues and their targets, learning that one cue gave reliable information about the location of the target, and the other cue gave unreliable information. In two experiments, infants distinguished between cues that reliably and unreliably indicated familiar locations (i.e., on test trials), making predictive saccades to the locations indicated by the reliable cue but not to those indicated by the unreliable cue. However, infants generalized only the familiar face cues, such that when the reliable face cued a novel location, infants expected an animation to appear there.
Infants’ lack of generalization to the novel arrow cues suggests at least two distinct interpretations. The first is that infants have the capacity to learn the statistical regularities of both familiar and novel cues and their target spatial locations, but the familiarity of the cue stimulus affects infants’ generalization abilities. Eight months of exposure to adult faces directing gaze toward a variety of locations and objects in the world may have provided infants with enough experience to generalize their specific knowledge to a new situation, whereas further exposure would be necessary to make the same generalization with the arrow cues. Evidence for this possibility comes from imitation studies: Elsner and Pauen (2007) found that 12-month-olds could not generalize the efficacy of a novel object (producing a novel outcome) to a new situation, whereas 15-month-olds could. These data suggest that although infants can learn about novel statistical regularities, their generalization capacities are developing into the second year of life.
The second interpretation is that infants processed the cues and targets differently when viewing faces than when viewing arrows. Though looking times during familiarization showed only a modest trend, infants may have in fact paid better attention to the faces because of their salience. At the same time, the arrows may have required more attention for equivalent learning because of their novelty. Critically, differences in attention might have stemmed from infants understanding the identities of the face and arrow stimuli differently. In Experiment 1, the faces kept their identities (and thus histories of accuracy) throughout the procedure. In Experiment 2, infants might have interpreted each arrow that pointed in a different direction as a different object. Thus, during the generalization trials, when the arrows pointed to new locations, they would have treated them as new objects and thus had no basis for prediction. Future studies are necessary to investigate the familiarity, attentional, and perceptual differences in the cues (e.g., using a highly familiar stimulus that does not typically function as a spatial cue) that may be driving differences in depth of learning.
In their review, Harris and Lane (2013) argue that infants may “already understand how testimony works” and are selective in their choice of informants in the second year of life. However, they do not explain how this understanding develops or whether it may be present earlier in infancy, prior to infants’ active participation in communicative exchanges. The present study demonstrates that even young infants have the capacity to monitor the accuracy of other individuals and selectively use this information in planning responses. We suggest that this ability emanates from the sensitivity to statistical regularities that is already present very early in life, which guides attention to informative events and helps distinguish between accurate and inaccurate cues.
To conclude, the current study demonstrates that 8-month-old infants are keenly aware of the relative informativeness of both familiar and novel visual cues. Consistent with other recent work showing that infants modify their distribution of attention based on statistical coherence (Kidd et al., 2012; Tummeltshammer & Kirkham, 2013), these results support a characterization of infants as active information gatherers who use observed data to guide attention and action. Such early capacities to track the reliability of information in the world suggest that infants are capable of learning from other people judiciously at very early ages and that these capacities might be the building blocks of socially constructed knowledge.
Acknowledgements
We thank Irati Rodriguez Saez de Urabain, Leslie Tucker, and Marian Greensmith for their help with testing and recruitment.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Funding
This work was funded in part by grants from the Nuffield Foundation to Natasha Z. Kirkham, the National Science Foundation (Grant No. 1223777) to David M. Sobel, and the United Kingdom Medical Research Council (Grant No. G0701484) to Mark Johnson.
