Abstract
As a powerful social signal, a body, face, or gaze facing toward oneself holds an individual’s attention. We asked whether, going beyond an egocentric stance, facingness between others has a similar effect and why. In a preferential-looking time paradigm, human adults showed spontaneous preference to look at two bodies facing toward (vs. away from) each other (Experiment 1a, N = 24). Moreover, facing dyads were rated higher on social semantic dimensions, showing that facingness adds social value to stimuli (Experiment 1b, N = 138). The same visual preference was found in juvenile macaque monkeys (Experiment 2, N = 21). Finally, on the human development timescale, this preference emerged by 5 years, although young infants by 7 months of age already discriminate visual scenes on the basis of body positioning (Experiment 3, N = 120). We discuss how the preference for facing dyads—shared by human adults, young children, and macaques—can signal a new milestone in social cognition development, supporting processing and learning from third-party social interactions.
As an extremely relevant social communicative signal, a body, face, or gaze facing toward oneself holds an individual’s attention (Senju & Hasegawa, 2005). This response reflects a sensitivity to social communicative signals that emerges very early in life (Farroni et al., 2002) and supports subsequent social development (Baron-Cohen, 1995; Johnson et al., 2015). Here, we asked whether, going beyond an egocentric stance, facingness between others, that is, the mutual perceptual accessibility of two others, has a similar effect.
Research in the past years has shown that two static bodies, close and facing each other, are processed more efficiently than nonfacing bodies in visual perception. In particular, under low visibility conditions, a human body (but not an inanimate object) is more likely to be detected and recognized when it faces toward another body than when it faces away (Papeo & Abassi, 2019; Papeo et al., 2017; Vestner et al., 2019), yielding effects that suggest an impact of social interaction on the very early—preattentive or unconscious—stages of visual perception (Papeo et al., 2019; Xu et al., 2023), up to visual memory (Ding et al., 2017; Paparella & Papeo, 2022). The behavioral advantage in processing facing people has a counterpart in neuroimaging results showing that a person facing toward (vs. away from) another evokes stronger neural activation and distinctive neural activity patterns in visual cortex (Abassi & Papeo, 2020, 2022, 2024; Walbrin & Koldewyn, 2019).
In the growing literature, the question of why this advantage would exist is left to the intuition that social facingness, the mutual perceptual accessibility of two social entities, is reliably associated with social interaction. In effect, being face-to-face favors fundamental social behaviors, such as joint attention, gaze following, and communication. Thus, facing bodies may benefit from an attentional or perceptual advantage because they imply social interaction. This view fits within evolutionary theories, according to which social beings are equipped with mechanisms to preferentially orient attention toward socially relevant stimuli (New et al., 2007). These mechanisms, also invoked to explain visual preferences for eye gaze, faces, and biological motion, tend to be functional early in human development and to be shared with other social species, like monkeys and chicks (Buiatti et al., 2019; Farroni et al., 2005; Spadacenta et al., 2019; Vallortigara et al., 2005). They are thought to help animals to spot the presence of other living beings and initiate social interaction (Vallortigara, 2021). Social interaction, even when concerning others, would retain high social relevance because the way in which others interact provides important information for regulating one’s own behavior for social learning and knowledge.
On this reasoning, in a task in which subjects are free to explore the visual environment, facing bodies should be spontaneously preferred to nonfacing bodies. Moreover, if such bias is behaviorally relevant for socialization, it could be found in other socially gregarious species. Here, we asked, Do humans have a spontaneous preference for facing, over nonfacing, bodies? And does this preference appear in other social species? We addressed these questions using an identical preferential looking time paradigm on human adults (Experiment 1) and macaques (Macaca mulatta; Experiment 2), a group-living species with rich social life (Cheney et al., 1986; de Waal & Luttrell, 1988) and early visual preference for social stimuli, such as faces (Kuwahata et al., 2004) and direct gaze (Muschinski et al., 2016).
Statement of Relevance
Facing another is the most powerful signal of social engagement. The sensitivity to this signal has been extensively studied in first-person scenarios, where another’s attention is oriented toward oneself, but not in third-party scenarios, where another’s attention is oriented toward another. Here, we report the first study in which the same behavioral paradigm, based on looking time measurement, was used to test human adults, infants, children, and rhesus macaque monkeys. We characterize a new behavioral adaptation to a particularly relevant aspect of the social world, that is, a spontaneous visual preference for third-party scenarios with two conspecifics face-to-face (vs. the same two in another spatial configuration). These results extend the list of behavioral markers of social cognition, which are a primary tool to capture the milestones of social cognitive development and evaluate social sensitivity in typical and atypical developmental trajectories.
Because we provided an affirmative answer to both questions, we then asked, When does this pattern emerge on the human developmental timescale? As mentioned already, visual preferences for social stimuli have a history of appearing early in life. However, previous work on infants makes the developmental course of the facingness effect less certain. One study testing the effect of body positioning (facing vs. nonfacing) indeed found that young infants looked longer at nonfacing dyads (Goupil et al., 2022). Here (Experiment 3), we investigated whether this effect could be replicated and, if so, when during development the looking behavior reverses toward an adult-like preference for facingness. We addressed this by testing groups of infants and children between 7 months and 5 years with the same stimuli and paradigm as in Experiment 1. In sum, building on the evidence of perceptual adaptations for processing third-party face-to-face social interactions, this study addressed whether and when the preference for facingness emerges in humans and whether humans share this preference with monkeys.
Experiment 1a: Looking Times in Human Adults
Method
Participants
Twenty-four participants were tested (13 females; age range 19–34 years, M = 23.45, SD = 4.37). In the absence of similar prior experiments, this sample size was defined a priori with a power analysis estimating the minimal sample for detecting a medium effect size (Cohen’s d = .60) in a two-tailed t test (α = .05, power 1 − β = .80), using G*Power 3.1 (Faul et al., 2007). Participants reported no history of neurological or psychiatric disorders or medications, were right-handed, and had normal or corrected-to-normal vision. The present and following experiments on human participants were conducted according to the guidelines of the Declaration of Helsinki. Written informed consent was obtained from each participant before data collection. All procedures were approved by the local ethics committee (CPP sud-est II) and conducted at the Institut des Sciences Cognitives Marc Jeannerod. Participants were paid €5.
Stimuli
Stimuli consisted of displays showing two dyads of human bodies: one with bodies face-to-face (facing dyad) and the other with the same two bodies presented back-to-back (nonfacing dyad). Stimuli were created from grayscale renderings of 16 human bodies (eight unique bodies in lateral view and their mirrored versions), edited with Daz3D (Daz Productions, Salt Lake City, UT) and the Image Processing Toolbox of MATLAB (MathWorks, Natick, MA). Bodies had 16 different poses, all biomechanically possible. Sixteen unique facing body dyads were created combining each of the 16 bodies with a different body. Sixteen nonfacing dyads were created by simply swapping the position of the two bodies in each facing dyad. Across all dyads, the two bodies were at the same distance from each other considering both the center of bodies (5.3°) and their closest points (M = 3.12, SD = 0.69); facing versus nonfacing, t(30) = 0.03, p = .979. Thus, facing and nonfacing dyads differed only in the relative positioning of bodies. Each facing dyad was paired with its nonfacing counterpart, yielding 16 displays, used as stimuli during eye tracking. The facing dyad was on the left in 50% of displays. Dyads were displayed inside two rectangular areas highlighted with a different background color (lighter gray) relative to the screen background (darker gray). The two areas were separated by 19.9° so that the two bodies of a dyad were much closer to one another than to the bodies of the other dyad. A dyad subtended approximately 10.05° (SD = 0.71) of visual angle (for a single body, M = 3.52, SD = 1.11).
Procedure
Participants sat on a stool, in front of and 60 cm away from a Tobii T60XL eye-tracker screen in a dark, soundproof booth. The experiment began after the eye-tracker calibration. To ensure that participants paid attention to the stimuli, they were instructed to attend to each image for a subsequent memory task (see the Supplemental Material). The experiment involved 16 trials with upright displays and 16 trials with inverted displays, in a random order. Inverted displays were generated by rotating images by 180°. Because body inversion leaves the visual properties of the stimuli unchanged, but impacts the spatial relations between bodies and body parts (Papeo et al., 2017; Reed et al., 2003), inverted stimuli served to control that any difference between facing and nonfacing upright stimuli could be attributed to the difference in spatial relations between parts and not to other visual differences. Trials began automatically after participants fixated a cross blinking in the center of the screen for more than 100 ms. Then, the cross was replaced by stimulus display for 2,500 ms. Throughout the experiment, stimulus presentation, recording of eye-tracking data, and behavioral responses were controlled through PsyScope X (http://psy.ck.sissa.it/).
At the end of the eye-tracking session, participants were presented with facing and nonfacing dyads one by one (32 in total). Sixteen were presented during the eye-tracking experiment; the remaining 16 showed dyads from a novel set. This set was created by combining 16 new body postures (eight unique bodies in a lateral view and their mirrored images) in 16 face-to-face dyads and 16 back-to-back dyads, following the procedure detailed earlier. Trials were presented in a random order. For each dyad, participants had to report whether they had seen it before (i.e., during the previous experiment) or not by pressing one of two keys on a computer keyboard with the left or right index finger, respectively (key mapping to “yes” or “no” was counterbalanced across subjects). Images were displayed for 5 s, and participants had unlimited time to respond. Accuracy and response time (RT) were recorded. This task was included only to introduce an active task, beside the passive looking, which could encourage participants to attend to the stimuli during eye tracking and help assess their attention.
Analyses
The analysis focused on the time course of preferential looking. The time-course analysis has the potential to identify transient tendencies that could be missed when averaging over the arbitrary trial duration, to provide timing information (early spontaneous effect vs. slow effect), and to identify dynamic patterns (e.g., look first at one type of stimulus and then the other) or systematic biases (e.g., look first to the left and then to the right, regardless of the stimuli). Analyses were computed in R 4.0.2 (R Core Team, 2020), using eyetrackingR 0.1.8 (Dink & Ferguson, 2018) for processing eye-tracking data and ggplot2 3.3.6 (Wickham, 2016) for data visualization.
Preprocessing
In the 2,500 ms of each trial duration, series of up to five missing samples (less than 100 ms, the minimal fixation duration; van Renswoude et al., 2018; Wass et al., 2014) were linearly interpolated. Samples were coded with respect to whether the look was on-images (i.e., within the rectangular areas where dyads were shown) or off-images (i.e., missing samples; blinks; gaze in the center, on the background, or off-screen). On-images samples were further coded as located on the facing or the nonfacing dyad.
Informative time window
We used a previously established data-driven approach (see Goupil et al., 2022) to determine the informative time window (ITW), defined as the points in time in which the majority of participants looked at images in most trials. For each participant, at every time point, the proportion of off-image eye-tracking samples was subtracted from the proportion of on-image eye-tracking samples. A positive value indicated a larger number of looks on-images; a negative value indicated a larger number of looks off-images. For each point in time, the distribution of this score across participants was compared against chance (0) with a one-sample t test. A cluster-mass permutation test (Hochmann & Papeo, 2014; Maris & Oostenveld, 2007) identified the ITW as the largest cluster of adjacent time points, with ps < .01 (one tailed). All the subsequent analyses were run within the ITW.
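The logic of the ITW procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation (the analyses were run in R with eyetrackingR); the function names, the sign-flip permutation scheme, and the caller-supplied threshold `t_crit` are assumptions made for illustration. The input `scores[p][t]` is the on-image minus off-image proportion for participant p at time point t.

```python
import math
import random
from statistics import mean, stdev


def t_against_zero(values):
    """One-sample t statistic of `values` against 0."""
    m, sd = mean(values), stdev(values)
    if sd == 0:
        return 0.0 if m == 0 else math.copysign(math.inf, m)
    return m / (sd / math.sqrt(len(values)))


def clusters(t_values, t_crit):
    """Return (start, end_exclusive, mass) for each run of adjacent t > t_crit."""
    found, start = [], None
    # A trailing sentinel closes a cluster that runs to the last time point.
    for i, t in enumerate(list(t_values) + [-math.inf]):
        above = t > t_crit
        if above and start is None:
            start = i
        elif not above and start is not None:
            found.append((start, i, sum(t_values[start:i])))
            start = None
    return found


def largest_cluster(scores, t_crit, n_perm=1000, seed=0):
    """Largest positive cluster and its permutation p value.

    t_crit is the one-tailed critical t chosen by the caller
    (assumption: about 2.50 for df = 23 at p < .01).
    """
    n_time = len(scores[0])

    def t_course(data):
        return [t_against_zero([row[t] for row in data]) for t in range(n_time)]

    observed = clusters(t_course(scores), t_crit)
    if not observed:
        return None
    best = max(observed, key=lambda c: c[2])

    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        # Null hypothesis: scores are symmetric around 0, so each
        # participant's whole time course can have its sign flipped.
        flipped = []
        for row in scores:
            sign = rng.choice((-1, 1))
            flipped.append([sign * v for v in row])
        null_mass = max((c[2] for c in clusters(t_course(flipped), t_crit)),
                        default=0.0)
        if null_mass >= best[2]:
            exceed += 1
    return best[0], best[1], (exceed + 1) / (n_perm + 1)
```

The returned indices delimit the candidate window in samples; converting them to milliseconds depends on the sampling rate of the recording.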
Differential looking times
For each participant, for each time point within the ITW, differential looking time between facing and nonfacing bodies was computed, separately for upright and inverted displays, as the proportion of eye-tracking samples on the facing dyad minus the proportion of samples on the nonfacing dyad, divided by the proportion of samples on images (the sum of the two). Positive differences indicated a larger number of looks on the facing dyad; negative differences indicated a larger number of looks on the nonfacing dyad. If, for a given time point, a participant did not look at either of the images in any of the trials, they received a score of 0 for that time point. For each point in time, differential looking times were tested against chance (0) with a one-sample t test and a cluster-mass permutation test. Differential looking times for upright displays were then compared with differential looking times for inverted displays (paired t test, two tailed) and tested with a cluster-mass permutation test. In significant clusters, differential looking times were averaged by participant, for upright and inverted displays separately, and tested against chance (0) with a one-sample t test (two tailed).
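As a worked illustration of the score defined above, (facing − nonfacing) / (facing + nonfacing), the following Python sketch computes the differential looking time at each time point, including the rule that a time point with no looks on either image is scored 0. The function names are hypothetical and the sketch is not the authors' R code.

```python
from typing import List, Sequence


def differential_looking(p_facing: float, p_nonfacing: float) -> float:
    """Normalized difference between looks to facing and nonfacing dyads.

    Returns a value in [-1, 1]; positive values mean more looks to the
    facing dyad. A time point without any on-image look is scored 0.
    """
    total = p_facing + p_nonfacing
    if total == 0:
        return 0.0
    return (p_facing - p_nonfacing) / total


def time_course(facing: Sequence[float], nonfacing: Sequence[float]) -> List[float]:
    """Apply the score to paired per-time-point proportions."""
    return [differential_looking(f, n) for f, n in zip(facing, nonfacing)]
```

Because the score is normalized by the on-image proportion, it is insensitive to how much time a participant spent looking away from both dyads at that time point.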
Results
The results of the memory task confirmed that participants attended to the stimuli during eye tracking. Responses were overall very accurate (approximately 80% of accurate responses) with no difference between facing and nonfacing dyad in accuracy (Mfacing = 0.81 ± 0.18 SD; Mnonfacing = 0.77 ± 0.17), t(23) = 1.09, p = .289, Cohen’s d = 0.22; in RT (Mfacing = 2,555 ± 817; Mnonfacing = 2,570 ± 833), t(23) = −0.24, p = .815, Cohen’s d = 0.05; or in the d′ analysis (Mfacing = 1.53 ± 0.90; Mnonfacing = 1.30 ± 0.85), t(23) = 1.08, p = .290, Cohen’s d = 0.22.
We identified an ITW starting 350 ms after the trial onset and lasting until the end of the trial (p < .001). Time-course analyses within the ITW showed longer looking times for facing versus nonfacing dyads for upright displays, in an interval between 717 and 1,167 ms (p = .039). No significant difference was found at any point in time with inverted displays. Differential looking times for upright displays diverged from differential looking times for inverted displays between 467 and 950 ms (p = .026; Fig. 1B). Over this period (Fig. 1C), differential looking times were significantly positive when images were upright (M = 0.10 ± 0.18), t(23) = 2.60, p = .016, Cohen’s d = 0.53, indicating that participants looked more at facing than at nonfacing dyads. There was no significant effect when images were inverted (M = 0.02 ± 0.18), t(23) = 0.62, p = .543, Cohen’s d = −0.13.

Fig. 1. Preferential looking paradigm and results. (a) Stimuli. Examples of facing and nonfacing dyads presented to humans (left) and macaques (right). In each trial, a facing dyad and a nonfacing dyad featuring the very same bodies were presented simultaneously on the screen. (b) Results of the time-course analysis testing whether and when in a trial subjects looked more at facing or nonfacing dyads. For each time point, the curves show the proportion of looks to the facing dyads minus the proportion of looks to nonfacing dyads divided by the sum of the two. Horizontal dotted lines denote the chance level (0); positive values mean that subjects looked more to facing dyads; shaded areas around the curves denote standard errors from the mean; intervals highlighted by gray areas are those where significant differences between groups (left and central plots) or conditions (right plot) were found with cluster-mass permutation tests. From left to right: Results of infants (less than vs. more than 1 year), children (3 vs. 5 years), and adults (upright vs. inverted displays). Note that infants and children saw only upright displays. (c) Results of the analyses on the differential looking times averaged across all the time points in the intervals where significant differences were found (gray areas in Panel b). In the box plots, dots indicate means; thick horizontal bars, medians; lower and upper hinges, first and third quartiles, respectively; whiskers, the span encompassing values larger/smaller than 1.5 times the interquartile range; small dots, values beyond this range; and horizontal dotted lines, the chance level (0). From left to right: Results of infants, children, adults (upright and inverted stimuli), and macaques. Stars above boxes denote significant differences from chance (0) or between groups. *p < .05. **p < .01.
In summary, human adults showed a spontaneous preference to look at facing (vs. nonfacing) dyads. The early timing of this effect suggests a spontaneous and automatic capture of attention by facing dyads, congruent with results of visual search studies (Papeo et al., 2019). The lack of differences with inverted stimuli makes it unlikely that the preference arose from nonspecific low-level visual differences between facing and nonfacing dyads.
Experiment 1b: Rating Study
Method
Participants
To measure the extent to which facingness adds social value to stimuli, a total of 138 English-speaking participants (male and female human adults) were recruited for rating social semantic dimensions (meaningfulness of the scene, emotional content, and intentionality) of facing and nonfacing dyads as well as individual bodies. Participants were recruited and tested on Amazon Mechanical Turk. Data from one subject were discarded because of a technical failure. We considered this sample size large enough to measure differences in ratings across conditions. We confirmed this with a sensitivity analysis (G*Power 3.1; Faul et al., 2007) showing that the current sample size was sufficient to detect small differences in two-tailed paired t tests (t = 1.98, df = 137, d = .21, α = .05, power 1 − β = .80).
Stimuli
Body dyads were created as explained in Experiment 1a from 30 unique bodies in as many unique body poses, which were randomly paired to create 15 unique facing dyads. Fifteen nonfacing dyads were created by swapping the two bodies in each facing dyad. In each dyad, the centers of the two bodies were at the same distance from the center of the image (1.8°), which corresponded to the center of the screen. Moreover, the distance between the closest points of two bodies in a dyad was matched across facing and nonfacing dyads (facing, 1.22°; nonfacing, 1.24°), t(29) = 0.292, p = .772.
Procedure
From Amazon Mechanical Turk, participants were redirected to the online platform Testable.com (Rezlescu et al., 2020), where the experiment was implemented. During the experiment, participants saw facing dyads, nonfacing dyads, and each individual body that composed the dyads. They had to rate, on a 10-point Likert scale, each image with respect to each of three social semantic dimensions (meaningfulness, emotional content, and intentionality) and an arbitrary perceptual dimension (implied motion). We instructed the participants to judge, for meaningfulness, how much they thought the whole scene made sense; for emotion, how strong the emotional content of each scene was; for intentionality, how much each scene, as a whole, gave the impression that the individuals were acting intentionally; and for motion, how dynamic each scene looked to them. We did not provide any further definition or specification and never mentioned the facing/nonfacing manipulation, as we aimed to capture the participants’ general impression of the stimuli without any bias. The four dimensions were rated in four separate blocks (one dimension per block) of 20 stimuli each (five facing dyads, five nonfacing dyads, and 10 individual bodies). In each block, in each trial, a stimulus was presented for 1.5 s. A 10-point Likert scale was shown on the bottom of the screen and remained until the response. Participants had unlimited time to respond. Each participant saw one of three different lists of stimuli, in which each body appeared only once (i.e., in one of the three conditions). The order of blocks and the order of stimuli within a block were randomized. The rating itself was preceded by the calibration of the physical size of the stimuli on the participant’s screen (automated by Testable.com), the informed consent, and the display of task instructions. At the beginning of each block, participants were reminded of the instructions for that block (i.e., the dimension to evaluate).
Results
As the main purpose of this study was to test differences in the participants’ judgment of facing and nonfacing dyads, we first focused on pairwise comparisons (t test) between the two types of dyads, separately for each dimension. Results (Fig. 2) showed that facing dyads were rated significantly higher than nonfacing dyads for all social semantic dimensions: meaningfulness, t(137) = 6.72, p < .001, d = 0.57; emotional content, t(137) = 5.13, p < .001, d = 0.44; and intentionality, t(137) = 5.55, p < .001, d = 0.47; but not for implied motion, t(137) = 1.10, p = .272, d = 0.09. In sum, although facing and nonfacing dyads were rated as perceptually similar, as shown by ratings of implied motion, their representation substantially differed with respect to dimensions that are important in encoding social interactions. Put another way, the spatial relation between bodies in visually matched images changed the representation of conceptual dimensions such as meaningfulness, emotional content, and intentionality. These findings support our hypothesis that the aforementioned effect in the differential looking times is linked to a spontaneous preference for stimuli with higher social value.

Fig. 2. Results of the rating study evaluating three social semantic dimensions of the stimuli, namely meaningfulness (meaning), emotional content (emotion), and intentionality (intent), and one perceptual dimension, implied motion (motion). In box plots, large dots indicate means; thick horizontal bars, medians; lower and upper hinges, first and third quartiles, respectively; whiskers, the span encompassing values larger/smaller than 1.5 times the interquartile range; and small dots, values beyond this range. Stars highlight significant pairwise comparisons (*p < .001).
Extended results
We considered the ratings for individual bodies and tested how they related to rating of facing and nonfacing dyads. Results showed that across all the social semantic dimensions, facing dyads were rated higher than individual bodies: meaningfulness, t(137) = 7.05, p < .001, d = 0.60; emotional content, t(137) = 7.07, p < .001, d = 0.60; intentionality, t(137) = 7.55, p < .001, d = 0.64; whereas ratings did not differ between individual bodies and nonfacing dyads: meaningfulness, t(137) = −1.95, p = .053, d = −0.17; emotional content, t(137) = 1.44, p = .153, d = 0.12; intentionality, t(137) = 0.36, p = .719, d = 0.03. In contrast, individual bodies were rated lower for implied motion, relative to both facing, t(137) = 7.06, p < .001, d = 0.60, and nonfacing dyads, t(137) = 5.79, p < .001, d = 0.49.
Experiment 2: Macaques
Method
Subjects
Twenty-one juvenile rhesus macaques (Macaca mulatta; 11 females; approximate age 2.5 years) were tested in an indoor environment, in which they were free to move around without any restraining device. We note that we included juvenile macaques for opportunistic reasons, but because their age corresponded to the adolescence period, they can be considered as mature subjects, that is, closer to adults than to children. Therefore, we ran Experiment 2 with the objective to replicate in a nonhuman species the effect found in human adults (Experiment 1a). All subjects had previously participated in experiments with visual stimuli presented on the computer monitor and were familiar with the current setting. All housing and procedures conformed to guidelines for the care and use of laboratory animals (European Community Council Directive No. 86–609) and were approved by the local ethics board (October 3, 2018) and the French Ministry of Research (October 10, 2018) (see the Supplemental Material). The sample size could not be chosen, but we tested all the available subjects.
Stimuli
A set of 10 colored images was created, including five unique photographs of macaques (open licensed pictures available on Google Image) and their mirrored images (Fig. 1A). In each image, the monkey appeared on a white background in lateral view, sitting in a natural posture with neutral facial expression and gaze, head, and body oriented in the same direction (leftward or rightward). Twenty unique facing dyads (visual angle: 20 × 15.33°) were created by combining the 10 photographs (10 × 15.33°). Each body was presented once in each view (i.e., leftward or rightward) paired with another individual. The center of each individual body was at a distance of 5° of visual angle from the center of images, and the extremities of both bodies were separated by 2°. To create 20 nonfacing dyads, the position of the two bodies in each facing dyad was swapped. Stimuli for the experiment consisted of displays featuring a facing dyad and the corresponding nonfacing dyad image next to each other (the facing dyad was on the left in 50% of displays). Both dyads on a display showed the same monkeys. Each dyad appeared once on the left side of the screen and once on the right side. Dyads were equally distant from the center of the screen.
Procedure
Each subject was temporarily separated from their group and placed into the testing area, a large cage (87 × 100 × 120 cm) in which the animal was free to move, with a front delimited by a large-mesh metallic grid. A computer monitor (35 × 61 cm; 2560 × 1440 resolution) was placed 60 cm from the grid. Subjects were given about 5 min to habituate to the testing area before the experiment began. A moving geometric pattern accompanied by a nonbiological sound appeared in the center of the screen to attract the subject’s attention; when the animal looked toward the screen, a stimulus display was shown for 5 s. Stimulus presentation was triggered by the experimenter (H. Rayson or A. Massera), who monitored the animal’s behavior through a separate screen connected to a webcam (30 fps) placed on the top center of the stimulation screen. Video recording onset/offset was automatically triggered at the start/end of each stimulus presentation, controlled through PsychoPy v1.90.2 (Peirce et al., 2019). Each subject was presented with a maximum of 10 trials. The spatial arrangement of dyads on the first trial (e.g., facing dyad left, nonfacing dyad right) was counterbalanced across subjects. For a subject, the positioning of facing and nonfacing dyads on the display alternated across trials.
Analyses
This group of monkeys was not trained for eye-tracking experiments; they did not continuously attend to the screen and did not consistently provide a sufficient number of aligned data points to allow the implementation of a time-course analysis. Therefore, we implemented a standard cumulative looking time analysis as follows. Subjects’ gaze position was manually coded offline, frame by frame, by a researcher (H. Rayson) blind to the position of the two dyads on the screen. This researcher had been established as reliable using this coding scheme, with very good reliability scores (k = 0.84) obtained in a previous study with the same paired-stimuli presentation setup and the same coding scheme (Rayson et al., 2021). On each video frame, the coder decided whether the monkey looked at the right image, at the left image, in an ambiguous location, or in a task-irrelevant location (off the display). Each entry of the coding file indicated the number of consecutive frames during which the monkey looked in either direction. Next, this number was multiplied by the frame duration (in seconds) to obtain a looking time. Trials in which the monkey looked at the two dyads for less than 500 ms in total were discarded. Subjects with fewer than two trials were discarded. For the remaining monkeys, for each trial, differential looking time was computed as looking time to the facing dyad minus looking time to the nonfacing dyad, divided by the total looking time (sum of the two). For each subject, differential looking times were averaged across trials and tested against chance (difference = 0) with a one-sample t test, where positive values denoted longer looking times toward the facing dyad and negative values denoted longer looking times toward the nonfacing dyad.
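The cumulative looking time pipeline described above can be sketched in Python as follows. This is an illustrative reconstruction, not the authors' code: the function names and the input format (per-trial frame counts for each dyad) are assumptions, and the frame rate of 30 fps is taken from the webcam description in the Procedure.

```python
import math
from statistics import mean, stdev

FPS = 30            # webcam frame rate reported in the Procedure
MIN_LOOK_S = 0.5    # trials with less total looking time are discarded
MIN_TRIALS = 2      # subjects with fewer valid trials are discarded


def trial_score(frames_facing, frames_nonfacing, fps=FPS):
    """Differential looking time for one trial, or None if the trial is excluded."""
    look_facing = frames_facing / fps        # frame count -> seconds
    look_nonfacing = frames_nonfacing / fps
    total = look_facing + look_nonfacing
    if total < MIN_LOOK_S:
        return None
    return (look_facing - look_nonfacing) / total


def subject_score(trials):
    """Mean score over valid trials, or None if the subject is excluded.

    `trials` is a list of (frames_facing, frames_nonfacing) pairs.
    """
    scores = [s for s in (trial_score(f, n) for f, n in trials) if s is not None]
    if len(scores) < MIN_TRIALS:
        return None
    return mean(scores)


def one_sample_t(values):
    """t statistic of the subject scores against chance (0)."""
    return mean(values) / (stdev(values) / math.sqrt(len(values)))
```

The t statistic has len(values) − 1 degrees of freedom; obtaining the p value requires the t distribution, which is left to a statistics package.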
Results
Five monkeys were excluded as they never attended to the displays. One more subject provided only one trial above the inclusion criterion (looking time > 500 ms) and was excluded from subsequent analyses. For the remaining subjects (N = 15; seven females), on average 35% (SD = 29%) of trials were discarded because of looking times less than 500 ms. The analysis of the remaining trials of these 15 subjects revealed significantly positive difference scores (M = 0.20 ± 0.22), t(14) = 3.56, p = .003, Cohen’s d = 0.92 (Fig. 1C), indicating longer looking times for facing than for nonfacing dyads. Thirteen out of 15 macaques exhibited longer looking times for facing than for nonfacing dyads (exact binomial test, p = .007).
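The binomial result can be checked with a short Python sketch. It assumes a two-sided exact test against a chance probability of .5 (the text does not state the sidedness); the function name is hypothetical.

```python
from math import comb


def binomial_two_sided(k, n, p=0.5):
    """Two-sided exact binomial p value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under the null probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small tolerance so that ties are not lost to floating-point error.
    return sum(q for q in pmf if q <= observed * (1 + 1e-9))


p_value = binomial_two_sided(13, 15)  # 13 of 15 macaques
print(round(p_value, 3))              # 0.007
```

Under this assumption the value is 242/32,768, which rounds to the reported p = .007.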
Experiment 3a: Human Infants
Method
Participants
Experiment 3a involved young infants (less than 1 year; n = 40) and older infants (more than 1 year; n = 40). Infants in the 1st year of life were 7-month-olds (n = 20; seven females; age range 6 months 15 days to 7 months 21 days, M = 7 months 3 days, SD = 11 days) and 10-month-olds (n = 20; nine females; age range 10 months 6 days to 11 months 17 days, M = 10 months 22 days, SD = 13 days). Infants in the 2nd year of life were 15-month-olds (n = 20; 11 females; age range 15 months 5 days to 15 months 27 days, M = 15 months 16 days, SD = 8 days) and 18-month-olds (n = 20; 11 females; age range 18 months 2 days to 19 months 4 days, M = 18 months 20 days, SD = 9 days). The sample size of 20 per age group was chosen following a power analysis based on results in Goupil et al. (2022; Experiment 1: d = −.71, power 1 − β = .80, α = .05; minimal sample size N = 18; G*Power 3.1). Six additional infants were tested but rejected because of fussiness (see Analyses). Written informed consent was obtained from the infants’ parents before data collection. Parents were given €5 for reimbursement of travel expenses.
Stimuli and procedure
The same stimuli, paradigm, and procedure as in the adults’ Experiment 1 were used, except for the following changes. First, only upright displays were shown; second, stimuli stayed on the screen for 5 s, rather than 2.5 s, to take into account infants’ slower processing of visual information (Hochmann & Kouider, 2022); third, participants received no explicit instruction. Throughout the experiment, infants sat on their parent’s lap at a distance of 60 cm from the eye-tracker screen. The size of body dyads was sufficient to be clearly visible at 7 months (Goupil et al., 2022; Gwiazda et al., 1997). Parents were instructed to close their eyes during the experiment to prevent biasing infants’ response to the stimuli and interference with the eye tracking. The experiment included 16 trials.
Analyses
Fussiness was evaluated using a data-driven approach, described in detail in Goupil et al. (2022), which introduces objective criteria to define fussiness across experimenters and studies. In this approach, short looking times are used to identify trials in which infants are inattentive (trials with cumulative looking times on dyads more than one standard deviation below the mean), and low cumulative looking times across all trials are used to identify infants who are globally inattentive (infants whose looking time, cumulated over all trials, was more than two standard deviations below the group mean). With these criteria, we excluded trials with cumulative looking times shorter than 2,402 ms on average (7 months, 2,327 ms; 10 months, 2,048 ms; 15 months, 2,453 ms; 18 months, 2,778 ms) and data from six infants (two 7-month-olds, one 10-month-old, one 15-month-old, and two 18-month-olds). These six infants were replaced to achieve the desired sample size. In the final sample, an average of 17.70% (SD = 15.92) of trials was discarded (7 months, M = 19.94 ± 21.99 SD; 10 months, M = 19.30 ± 13.98; 15 months, M = 15.19 ± 11.27; 18 months, M = 16.35 ± 15.21). To test groups (and differences between groups) over the same time interval, looking times were analyzed within a common ITW. This was defined by computing the ITW of each group (see Analyses of Experiment 1a) and selecting the time period that overlapped between the ITWs of all groups. Differential looking times were computed inside this common ITW. Age differences were tested at each time point by regressing differential looking times on age (7, 10, 15, 18 months). A cluster-mass permutation test (permuting difference score sign) was used to correct for multiple comparisons.
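The two fussiness criteria can be sketched as a simple filter. This is an illustration under stated assumptions, not the procedure of Goupil et al. (2022): it assumes both cutoffs are computed within one age group, that the trial cutoff uses all trials of that group, and that the infant cutoff uses looking time summed over trials. The function name and data layout are hypothetical.

```python
import statistics


def fussiness_filter(looking_ms):
    """Apply trial-level and infant-level looking-time criteria.

    looking_ms: dict mapping a (hypothetical) infant id to a list of
    per-trial cumulative looking times on the dyads, in ms.
    Returns (kept, excluded): kept maps each retained infant to the trials
    that pass the trial criterion; excluded is the set of dropped infants.
    """
    all_trials = [t for trials in looking_ms.values() for t in trials]
    # Trial criterion: more than 1 SD below the mean trial looking time.
    trial_cutoff = statistics.mean(all_trials) - statistics.stdev(all_trials)

    totals = {infant: sum(trials) for infant, trials in looking_ms.items()}
    values = list(totals.values())
    # Infant criterion: more than 2 SD below the group mean of summed looking time.
    infant_cutoff = statistics.mean(values) - 2 * statistics.stdev(values)

    excluded = {infant for infant, total in totals.items() if total < infant_cutoff}
    kept = {
        infant: [t for t in trials if t >= trial_cutoff]
        for infant, trials in looking_ms.items()
        if infant not in excluded
    }
    return kept, excluded
```

Because both cutoffs are derived from the data, they differ between groups, which is consistent with the group-specific trial thresholds reported above.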
Results
The ITW started with a similar delay for all age groups (7 months, 550 ms; 10 months, 533 ms; 15 months, 483 ms; 18 months, 500 ms; all ps < .001) and lasted until the end of the trial. Thus, the common ITW was a period between 550 and 5,000 ms. Within this period, differential looking times changed with age in three consecutive intervals (2,617–2,833 ms, p = .046; 2,983–3,267 ms, p = .023; 3,450–3,733 ms, p = .020; Fig. 1B). To further inspect this effect, for each infant, differential looking times were averaged across the three clusters, then each age group was compared to older age groups with a t test (one tailed). This analysis showed no difference between 7 and 10 months, t(38) = −0.48, p = .318, Cohen’s d = −0.15; and no difference between 15 and 18 months, t(38) = −0.07, p = .471, Cohen’s d = −0.02; but significant differences between 7 and 15 months, t(38) = −1.74, p = .045, Cohen’s d = −0.55; 7 and 18 months, t(38) = −1.86, p = .035, Cohen’s d = −0.59; and 10 and 18 months, t(38) = −1.86, p = .036, Cohen’s d = −0.59; and a trend for a difference between 10 and 15 months, t(38) = −1.66, p = .052, Cohen’s d = −0.53. These results indicated a discontinuity in the infants’ behavior between the 1st and 2nd year of life, whereby the younger group looked longer at nonfacing dyads and the older showed no bias (Fig. 1C). To confirm this developmental change, we combined all the data of infants in the 1st year and compared them with all data of infants in the 2nd year. Results showed a significant difference, t(78) = −2.52, p = .007, Cohen’s d = −0.56 (one tailed). A one-sample t test against chance (0) showed significantly negative differential looking times in the younger group, confirming that these infants looked longer at nonfacing dyads (M = −0.10 ± 0.31), t(39) = −2.11, p = .042, Cohen’s d = −0.33, and no difference in the older group (M = 0.05 ± 0.24), t(39) = 1.39, p = .171, Cohen’s d = 0.22. 
Thus, replicating the findings of Goupil et al. (2022) and extending them to new age groups, differential looking times showed that infants in the 1st year of life look longer at nonfacing dyads. This effect disappears in the 2nd year of life.
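The common ITW used in Experiment 3a is the overlap of the four group-specific windows. A minimal sketch using the onsets reported in the Results, with every window ending at 5,000 ms (the end of the trial), is given below; the function name and the dictionary layout are illustrative.

```python
def common_window(windows):
    """Intersection of (start_ms, end_ms) windows; None if they do not overlap."""
    start = max(s for s, _ in windows.values())
    end = min(e for _, e in windows.values())
    return (start, end) if start < end else None


itw = {
    "7 months": (550, 5000),
    "10 months": (533, 5000),
    "15 months": (483, 5000),
    "18 months": (500, 5000),
}
print(common_window(itw))  # (550, 5000)
```

The latest onset (550 ms, in 7-month-olds) sets the start of the common window.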
Experiment 3b: Children
Method
Participants
Experiment 3b involved young children (3 years; n = 20) and older children (5 years; n = 20). All 3-year-olds (eight females; age range 37 months 4 days to 47 months 25 days, M = 42 months 9 days, SD = 154 days) and 5-year-olds (10 females; age range 60 months 17 days to 71 months 26 days, M = 65 months 6 days, SD = 119 days) who were recruited and tested were included in the final analyses. Written informed consent was obtained from parents before data collection. Parents were given €5 for reimbursement of travel expenses.
Stimuli, procedure, and analyses
Stimuli, procedure, and analyses were identical to Experiment 3a. Using the same criteria to identify fussiness, we discarded on average 16.41% (SD = 13.25) of trials (3 years, M = 16.88% ± 13.16; 5 years, M = 15.94% ± 13.67), which had a duration shorter than 2,610 ms on average (3 years, 2,427 ms; 5 years, 2,794 ms). For each group, the ITW was computed on the remaining trials, and differential looking times were computed on the common ITW comprising the time points overlapping between the two ITWs. Age differences were tested at each time point by comparing the differential looking time courses of 3- versus 5-year-old children with a cluster-mass permutation test (two-tailed t tests; permuting difference score sign).
Results
The ITW started with similar delay in both groups (3 years, 500 ms; 5 years, 467 ms; all ps < .001) and lasted until the end of the trial. In the common ITW, from 500 to 5,000 ms, a cluster-mass permutation test revealed a significant difference between the differential looking times of 3- and 5-year-olds between 4,367 and 4,983 ms (p = .026; Fig. 1B). Differential looking times inside this cluster were averaged for each subject and tested against chance with a one-sample t test (two tailed). Differential looking times did not differ from chance in 3-year-olds (M = −0.10, SD = 0.26), t(19) = −1.71, p = .103, Cohen’s d = −0.38, whereas they were significantly above chance in 5-year-olds, revealing an adult-like preference for facing dyads (M = 0.14, SD = 0.20), t(19) = 3.13, p = .005, Cohen’s d = 0.70 (Fig. 1C). A cluster-mass permutation test comparing the time course of differential looking times of 5-year-olds against chance revealed a significant cluster between 4,400 and 4,983 ms (p = .043). There was no significant effect for the 3-year-olds. Contrary to what we observed with adults in Experiment 1, the preference for facing dyads occurred late in the trial, suggesting that the effect may be less spontaneous and automatic at 5 years than it is in adults.
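A one-sample cluster-mass permutation test against chance, of the kind used for the 5-year-olds, can be sketched as below. This is a generic sign-flipping implementation and not the authors' code: the cluster-forming threshold of |t| > 2, the number of permutations, the handling of zero variance, and all names are assumptions.

```python
import math
import random
import statistics


def t_values(data):
    """One-sample t value against 0 at each time point.

    data: list of subjects, each a list of differential looking times.
    """
    n = len(data)
    out = []
    for column in zip(*data):
        sd = statistics.stdev(column)
        # Zero variance is scored as 0, an arbitrary choice for this sketch.
        out.append(statistics.mean(column) / (sd / math.sqrt(n)) if sd > 0 else 0.0)
    return out


def clusters(tvals, threshold):
    """Runs of adjacent time points with |t| > threshold and the same sign.

    Returns a list of (start, end, mass) with end inclusive; mass is the
    sum of t values in the run.
    """
    found, start = [], None
    for i, t in enumerate(tvals + [0.0]):  # sentinel closes a trailing run
        inside = abs(t) > threshold
        if start is not None and (not inside or t * tvals[start] < 0):
            found.append((start, i - 1, sum(tvals[start:i])))
            start = None
        if inside and start is None:
            start = i
    return found


def cluster_mass_test(data, threshold=2.0, n_perm=1000, seed=0):
    """Returns a list of (start, end, mass, p) for the observed clusters."""
    rng = random.Random(seed)
    observed = clusters(t_values(data), threshold)
    null_max = []
    for _ in range(n_perm):
        # Flip the sign of each subject's whole time course at random.
        flipped = []
        for subject in data:
            sign = rng.choice((1, -1))
            flipped.append([sign * x for x in subject])
        masses = [abs(m) for _, _, m in clusters(t_values(flipped), threshold)]
        null_max.append(max(masses, default=0.0))
    results = []
    for start, end, mass in observed:
        exceed = sum(m >= abs(mass) for m in null_max)
        results.append((start, end, mass, (1 + exceed) / (n_perm + 1)))
    return results
```

Comparing each observed cluster with the distribution of the largest cluster mass across permutations is what controls for multiple comparisons over time points.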
Discussion
Several vertebrate species manifest predispositions to preferentially attend to relevant social signals, such as faces, direct eye gaze, and biological motion, which are considered preparatory to social cognition development. Here, using an identical paradigm to test infants, children, adults, and monkeys, we demonstrated a new visual preference toward an uncharted class of visual stimuli consisting of face-to-face dyads of conspecifics.
First, we found that human adults spontaneously looked longer at two facing versus nonfacing bodies. This pattern reflected a genuine effect of body positioning as this was the sole apparent difference between the two conditions, and any effect of (lower-level) visual differences was ruled out with the test of inverted stimuli (see Cheng et al., 2021, for converging evidence using pupillometry). Facing dyads were also rated higher than nonfacing dyads on social semantic dimensions such as meaningfulness, intentionality, and emotional content, confirming that facingness increases the social value of body stimuli.
We consider longer looking times as an indication of preference, discarding an interpretation based on violation of coherence with respect to one’s expectation or knowledge. On the latter account, participants would have looked longer at facing dyads because those stimuli looked like social interactions but were difficult to interpret. However, there is no indication that adults found facing dyads more awkward than nonfacing dyads. On the contrary, the rating study showed that participants represented facing dyads as meaningful, significantly more than nonfacing and single bodies (see also Paparella & Papeo, 2022).
Interpreted as preference, the effect of social facingness supports the theory that humans are equipped with specialized perceptual mechanisms for responding to social interaction (Papeo, 2020; Pitcher & Ungerleider, 2021). Our results also showed that the preference for facingness is shared with macaques, suggesting that, like humans, macaques generalize the relevance of facingness toward oneself (Muschinski et al., 2016) to facingness between others. It remains unknown whether this shared behavior reflects phylogenetic continuity (i.e., a biologically determined mechanism) or a common solution to similar environmental conditions where facingness is a frequent and relevant feature of social life. It also remains possible that frequency alone explains the preference for facingness, although there is no evidence that, in the real world, people are more often face-to-face than in other configurations; and in macaques, most common social activities (e.g., grooming) involve nonfacing configurations, such as one facing toward another’s back (Lehmann et al., 2007). Likewise, the generalizability of our findings across human cultures might be limited by our sample’s composition, predominantly participants living in the area of Lyon (France). Therefore, our findings warrant further investigation to assess their relevance across different cultural contexts.
Moving from phylogeny to ontogeny, we replicated Goupil et al. (2022), observing that young infants (less than 1 year) looked longer at nonfacing dyads. Goupil et al. linked this pattern to an effect of visual complexity. In adults, whereas nonfacing bodies are processed as two independent units, facing bodies are perceived as a structured unit (similar to facial features in a face), with benefits in terms of processing efficiency (Adibpour et al., 2021; Goupil et al., 2023). In this perspective, shorter looking times for facing dyads in infants would reflect faster processing times for the less complex of two stimuli. Supporting this interpretation, Goupil et al. (2022) showed that infants not only had shorter looking times for facing (vs. nonfacing) dyads but also devoted a comparable amount of looking time to facing dyads and single bodies.
Going beyond Goupil et al. (2022), the present study describes a developmental trajectory where the early “nonfacing > facing” effect progressively reverses toward an adult-like preference for facing dyads. This pattern suggests a tension between two effects: the effect of visual complexity, yielding longer looking times toward nonfacing dyads, and the visual preference for the more socially relevant type of stimulus, yielding longer looking times toward facing dyads. The former effect, found in young infants, may decrease with age because as perception becomes more efficient, difficult, near-threshold tasks are needed to highlight perceptual differences between stimuli. In effect, the perceptual advantage of facing dyads in adults was highlighted using tasks with visual noise, fast stimulus presentation, and/or masking (Papeo et al., 2017, 2019; Xu et al., 2023). As the “nonfacing > facing” effect becomes less visible in older infants’ looking times, an adult-like preference for facing dyads emerges gradually, outweighing the effect of visual efficiency by 5 years. This interpretation acknowledges the possibility that toddlers and younger children also had a preference for facingness, but the effect was not strong enough to overrule the competing effect of visual complexity.
In particular, it is possible that the preference for facingness can be found earlier, if additional information highlights its social function. Encouraging this thinking, Thiele et al. (2021) reported longer looking times toward facing (vs. nonfacing) people in 9-month-olds when stimuli involved ostension and head turning: Two individuals first looked toward the observer-infant and then turned toward each other. In Beier and Spelke (2012), 10-month-olds (but not 9-month-olds) discriminated between facing and nonfacing dyads but appeared to understand facingness as a social interaction signal only when the two greeted or talked to each other. Facingness was also found to aid the effect of joint attention on object representation: 9-month-olds encoded an object better when jointly attended by two people who had turned toward (vs. away from) each other; facingness alone, however, did not yield any advantage on object representation (Thiele et al., 2021; see also Thiele et al., 2023). Notwithstanding the differences in stimuli, tasks, and measures, these studies consistently highlight the early (less than 1 year) effect of spatial relations between people on the encoding of visual scenes. At the same time, they show that before 1 year of age, facingness between others is interpreted as a signal of social engagement only in the presence of other social behaviors (motion or head turning toward another, joint attention toward an object) and/or ostension.
Given the early emergence of other visual preferences, why would the preference for facingness emerge so late? Identifying the social relevance of facing people implies recognizing that two individuals are mutually accessible or attend to or engage with one another. Overcoming the early egocentric stance (Piaget, 1927), this could be achieved by generalizing to the gaze toward another the early sensitivity to the gaze toward oneself (direct gaze) as a signal that one is being addressed. This generalization in turn involves the ability to follow another’s eye direction and represent the relation between gaze and its target, that is, that the target is the content of another’s attention or mental representation. Young infants (by 4 months) automatically shift attention in the direction indicated by another’s gaze (Hood et al., 1998). However, they can represent the referential role of gaze in relation to an object only after 9 months and only if gaze shift is preceded by communicative or ostensive signals toward the infant, such as eye contact or infant-directed speech (Senju et al., 2008; Senju & Csibra, 2008). This development might occur even later when the target of another’s gaze is not an object but another person (Spelke, 2022, 2023). Preceding communicative or ostensive signals do not seem to be necessary to understand gaze-target relations after 3 years of age (Ristic et al., 2002; Senju et al., 2004); that is exactly when we found the spontaneous preference for facingness tout court.
In conclusion, young (7-month-old) infants leverage spatial relations (facing/nonfacing) to discriminate between otherwise identical visual social scenes. We propose that this sensitivity to visual relational information in infants anticipates the understanding that a scene with two facing people is more socially relevant than a scene that merely contains two people. In this spirit, the preference for facingness would mark a milestone in social cognition development, signaling that a child represents others as social agents who attend to, engage with, and act upon one another. The same preference in monkeys suggests an evolutionary and/or behaviorally relevant mechanism for holding an individual’s attention where social interaction is likely to occur. As a marker of the social brain, the preference for facingness can help track the milestones of normal social cognitive development and deviations from it.
Supplemental Material
Supplemental material, sj-docx-1-pss-10.1177_09567976241242995 for Visual Preference for Socially Relevant Spatial Relations in Humans and Monkeys by Nicolas Goupil, Holly Rayson, Émilie Serraille, Alice Massera, Pier Francesco Ferrari, Jean-Rémy Hochmann and Liuba Papeo in Psychological Science
Footnotes
Transparency
Action Editor: Angela Lukowski
Editor: Patricia J. Bauer