Abstract
Detecting where our partners direct their gaze is an important aspect of social interaction. Atypical gaze processing has been reported in autism. However, it remains controversial whether children and adults with autism spectrum disorder interpret indirect gaze direction with typical accuracy. This study investigated whether the detection of gaze direction toward an object is less accurate in autism spectrum disorder. Individuals with autism spectrum disorder (n = 33) and intelligence quotient–matched and age-matched controls (n = 38) were asked to watch a series of synthetic faces looking at objects and to decide which of two objects was being looked at. The angle formed by the two possible targets and the face varied following an adaptive procedure, in order to determine individual thresholds. We found that gaze direction detection was less accurate in autism spectrum disorder than in control participants. Our results suggest that the precision of gaze following may be one of the altered processes underlying social interaction difficulties in autism spectrum disorder.
Introduction
The direction of other people’s gaze is a useful signal: it directly provides information about interests and dangers in the environment. What others look at also provides cues about their inner states: what they know, what they desire, and what they attend to. Gaze following behavior, defined as the ability to trace a line of sight to discern the object or target of someone else’s fixation and, most likely, of their attention, is therefore an extremely adaptive behavior (Argyle and Cook, 1976; Emery, 2000). Accordingly, an altered ability to follow gaze is likely to cause social impairment.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by impairments in communication and social interaction, as well as restricted, stereotyped behavior and interests (American Psychiatric Association, 2013). The clinical phenotype is broad, encompassing a large range of behaviors and intellectual abilities, resulting in a highly heterogeneous population. In terms of cognitive functioning, poor understanding of social situations (“Theory of Mind deficit,” Baron-Cohen et al., 2000), lower drive for social interaction (“social motivation deficit,” Chevallier et al., 2012), atypical perception of social stimuli (Yang et al., 2015), as well as an atypical pattern of executive functions (see Wilson et al., 2014 for a review) and/or perception (Mottron et al., 2006) have all been associated with ASD.
Abnormal eye contact is one of the defining features of ASD (American Psychiatric Association, 2013). Individuals with ASD follow another person’s gaze less spontaneously (Leekam et al., 1997). Their fixation time toward the face and the eye regions is reduced (Klin et al., 2002). More generally, atypical patterns of visual exploration have been reported (Sasson et al., 2008; Noris et al., 2012). Importantly, abnormal gaze patterns and neural activity in response to eye gaze might be very early predictors of ASD, prior to the emergence of the behavioral symptoms (Elsabbagh et al., 2012). In adults with ASD, reduced brain activation during gaze processing was reported, both in dyadic (direct vs averted gaze) and in triadic (gaze directed toward an object) situations, in different regions including the superior temporal sulcus (Pelphrey et al., 2005), the temporo-parietal junction, and the medial prefrontal cortex (Tanabe et al., 2012; Von dem Hagen et al., 2013).
What can be inferred from another agent’s gaze ultimately relies on the accurate detection of gaze direction. The literature on the precision of direct gaze (i.e. toward the viewer) processing in ASD yields inconsistent results. One study, using synthetic stimuli, reported that individuals with ASD accurately distinguished direct from averted gaze (Dratsch et al., 2013), while others, using distinct procedures and photographic stimuli, reported lower direct gaze detection (Senju et al., 2003; Vida et al., 2013) or mixed results (Ashwin et al., 2009), with direct gaze detection being altered only in degraded conditions (averted head). These findings suggest that geometric aspects of direct gaze processing (i.e. deriving a direction from the position of eyes) might be intact, while other aspects of gaze direction detection (GDD) might be processed differently in ASD. For example, during direct gaze processing tasks, participants without ASD are more sensitive to picture manipulations, such as inversion (Dratsch et al., 2013) or changes from positive to negative luminance polarity (Ashwin et al., 2009) than participants with ASD, suggesting that contextual facial information might differently affect the geometric processing of gaze in both populations. Moreover, atypical reactions to direct gaze in individuals with ASD (Falck-Ytter et al., 2015; Kylliäinen et al., 2012) might account for their higher rate of commission errors (i.e. gaze mistakenly rated as direct) (Dratsch et al., 2013).
Contrasting with these situations involving direct gaze, indirect gaze processing (i.e. the ability to detect the direction of someone’s gaze toward objects) has often been regarded as a preserved aspect of gaze processing in ASD (Nation and Penny, 2008). Indeed, although a diminished tendency to follow gaze is often observed in natural and semi-natural situations, the majority of studies using Posner-like paradigms have found intact attentional cueing by large gaze shifts in individuals with ASD, suggesting that they process gaze as a cue to orient toward external stimuli (Kuhn et al., 2010; but see also Vlamings et al., 2005 and Gillespie-Lynch et al., 2013). However, Posner-like paradigms do not assess GDD accuracy since they usually display gaze shifts toward diametrically opposed targets. Using photographic stimuli of an actor gazing at different objects, Leekam et al. (1997) reported an intact GDD in children with ASD. However, the ASD group was much older (mean age: 13.4 years) than both control groups (typically developing: 4.3 years, learning disabled: 6.2 years). This discrepancy, which was a consequence of group matching on mental age, raises the possibility that an effect of chronological age, independent from the developmental measure, might have masked abnormal processing in the ASD group. Another concern regards the possibility of a ceiling effect, since only the easiest stimuli of the set (i.e. the stimuli with the greatest distance between the objects) were used in that study. In a very similar paradigm, based on real situations, Webster and Potter (2008) investigated GDD in adolescents with and without ASD (n = 11 per group), with no intellectual deficiency (intelligence quotient (IQ) > 75). Participants sat in front of the experimenter while they had to judge which object he or she was looking at. 
No main effect of diagnosis was found, but the authors reported an interaction between age group and diagnosis, revealing that only younger participants with ASD were less accurate than controls in discriminating gaze direction (Webster and Potter, 2008). Finally, using the same photographic stimuli as Leekam et al. (1997), Webster and Potter (2011) added difficult trials in order to control for a possible ceiling effect. Comparing adults and children with ASD with IQ in the normal range, the authors reported an effect of diagnosis, but also found that the difference tended to decrease with age. Surprisingly, the difference between diagnostic groups arose both for easy and difficult trials (wide vs narrow angle) in children, but was only significant for easy trials in adults, casting doubt on whether the difference was really due to a diminished accuracy in eye direction detection in ASD or to general task requirements. Thus, due to limitations in the procedures and populations studied, it remains unclear whether indirect gaze direction is accurately interpreted in individuals with ASD and whether a potential deficit would also exist in adults with ASD.
As a factor affecting both gaze processing (Ashwin et al., 2009; Bayliss et al., 2005; Ohlsen et al., 2013) and the neurobiology of autism (Lai et al., 2013; Mottron et al., 2015), sex might also have an impact on GDD in ASD. In particular, a recent study in a general population sample (Matsuyoshi et al., 2014) reported that direct gaze perception was associated with autistic traits in males only, suggesting that it might constitute a sex-dimorphic endophenotype in autism. Abnormalities in gaze direction precision have also been reported in Turner syndrome (Elgar et al., 2002), an X-linked condition affecting only females and associated with a higher incidence of ASD (Marco and Skuse, 2006). Although this may seem to contradict the previous result, it may reflect specificities of Turner syndrome more than general sex differences. Thus, the effect of sex on GDD remains a largely open issue, particularly in ASD.
This study aimed to evaluate indirect GDD in both children and adults with ASD, using an adaptive procedure that tracked individual detection thresholds without being affected by ceiling or floor effects, and to test the potential influences of age and sex.
Methods
Participants
Participants’ demographic characteristics are summarized in Table 1. Participants enrolled in the study were included in two distinct age groups, children (n = 25) and adults (n = 46), forming a total of 33 individuals with ASD and 38 IQ- and age-matched controls. IQs were determined using the Wechsler Intelligence Scales appropriate for each age. Participants with ASD met thresholds for ASD on the Autism Diagnostic Observation Schedule (ADOS; Lord et al., 2000) and/or the Autism Diagnostic Interview–Revised (Lord et al., 1994), and had to fulfill the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR) criteria for autism, Asperger’s syndrome or pervasive developmental disorder–not otherwise specified (PDD-NOS). Final diagnosis was based on consensus by psychiatrists and psychologists specialized in ASD research and clinics. No significant differences in age and IQ were found between individuals with ASD and controls in either the child or the adult group, or in the overall sample. Participants in the ASD group were recruited from two outpatient clinics of university hospitals: the adult psychiatry department at Albert-Chenevier Hospital, Créteil, France, and the child and adolescent psychiatry department at Robert-Debré Hospital, Paris, France. Control participants were recruited through advertisements. Prior to their recruitment, the control participants were screened to exclude anyone with a history of psychiatric or neurological disorders. All participants were native French speakers and had normal or corrected-to-normal vision. No between-group difference was found in visual acuity, using a procedure adapted from Bach (1996).
Table 1. Demographic characteristics of individuals with ASD and controls.
ASD: autism spectrum disorder; SD: standard deviation; IQ: intelligence quotient; ADOS: Autism Diagnostic Observation Schedule (Lord et al., 2000); SRS: Social Responsiveness Scale (Constantino and Gruber, 2005).
The present research has been approved by the local Ethics committee (Inserm, Institut Thématique Santé Publique; C07-33; Principal investigators: Prof. Leboyer and Prof. Bourgeron). All participants signed informed consent before volunteering for this study. The investigation was conducted according to the principles expressed in the Declaration of Helsinki.
Experimental procedure
Participants sat in front of a liquid-crystal display (LCD) laptop monitor at a distance of 60 cm and responded using a modified keypad with only two response keys, on the left and right sides. The response keys were labeled with leftward and rightward arrows. The experiment was programmed and displayed using Flash Professional 8®.
The participants took part in training and test sessions that were embedded in a larger battery of tasks assessing perception, social cognition, and executive functions (Forgeot d’Arc et al., 2014). At the beginning of the training session, participants were given pre-recorded verbal, written, and illustrated instructions, informing them that they would have to indicate whether the depicted character was looking at a green or a blue object. By varying the distance between the two objects following an adaptive procedure, we were able to compute an individual GDD threshold.
Stimuli consisted of a 130 × 95 mm (β = 12.4 × 9.05°) picture displaying a face and two objects. The face was schematic and encompassed two circles of 12 mm (β = 1.15°) diameter figuring eyes, including a dark zone of 7 mm (β = 0.67°) figuring the iris/pupil (Figure 1). The iris and pupil’s location within the eye was manipulated in such a way as to indicate a peripheral gaze shift directed toward one of the two objects. The objects were a pair of 5 mm (β = 0.48°) squares, located at randomly varying positions at the periphery of the picture.
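The visual angles (β) quoted for each stimulus dimension can be checked with the standard formula for the angle subtended by an object of a given size viewed head-on from a given distance. The following is a minimal sketch, assuming the 60 cm viewing distance stated in the experimental procedure; the function name `visual_angle_deg` and the labels are illustrative and are not taken from the original experiment software.

```python
import math

def visual_angle_deg(size_mm, distance_mm):
    """Angle (in degrees) subtended by an object of size_mm seen from distance_mm."""
    return math.degrees(2 * math.atan(size_mm / (2 * distance_mm)))

VIEWING_DISTANCE_MM = 600  # 60 cm, as stated in the experimental procedure

# Stimulus dimensions reported in the text (mm); labels are illustrative.
stimulus_sizes_mm = {
    "picture width": 130,
    "picture height": 95,
    "eye diameter": 12,
    "iris/pupil diameter": 7,
    "object side": 5,
    "starting object distance": 72,
}

for label, size in stimulus_sizes_mm.items():
    angle = visual_angle_deg(size, VIEWING_DISTANCE_MM)
    print(f"{label}: {size} mm -> {angle:.2f} deg")
```

Under this formula, the computed values agree with the β values reported in the text to the precision given there.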

Figure 1. (a) Stimuli; (b) experimental setup. Angles from the character’s and the participant’s points of view are labeled α and β, respectively. Participants sat in front of an LCD laptop monitor at a distance (d2) of 60 cm. Stimuli consisted of a 130 × 95 mm (β = 12.4 × 9.05°) picture displaying a face and two objects. The face was schematic and encompassed two circles of 12 mm (β = 1.15°) diameter figuring eyes, including a dark zone of 7 mm (β = 0.67°) figuring the iris/pupil. The iris and pupil were mobile, so as to figure a peripheral gaze shift directed toward one of the two objects. The objects were a green and a blue 5 mm (β = 0.48°) square, displayed at randomly varying positions at the periphery of the picture. The two large colored squares at the bottom figured the response options. The distance between the two objects, d1, varied between trials, following a two-up-one-down staircase procedure beginning at 72 mm (β = 6.87°, α = 92°) and decreasing by a 0.8 ratio at each step.
Each trial began with the presentation of a face with the iris and pupil centered in the eyes, facing one blue and one green object. After a 500-ms delay, the gaze shifted toward one of the two objects. The scene remained on screen until a response was made or, during the test session, until the maximum time allowed for response (5000 ms) had elapsed. Trials unanswered after 5000 ms were recorded as incorrect. The distance between the two objects varied between trials, following a two-up-one-down staircase procedure beginning at 72 mm (β = 6.87°, forming a 92° α angle with the character’s eyes) and decreasing by a 0.8 ratio at each step.
During the training session, each response was followed by visual and auditory feedback. The training session was composed of the same stimuli and repeated until the participant succeeded in four out of five consecutive trials. The participant was then informed by a pre-recorded and written message that no more feedback would be provided and that the response time would be limited, and the program moved on to the test session. During the test session, if the participant failed to provide a response for two consecutive trials, negative feedback was provided and the program paused. The experimenter then repeated the instructions and made sure that the participant was ready to resume. When the task is stopped after a fixed number of reversals, as in many staircase procedures, the final result may be affected by early errors that are unrelated to the threshold but possibly sensitive to age, diagnosis, or IQ. To avoid these possible confounds, in the present experiment, the test session stopped as soon as the distance between the two objects had not fallen below its lowest value for 10 consecutive trials, suggesting that the maximum performance of the participant had been reached. The GDD threshold was the mean of the five lowest values of the α angle formed by the character’s eyes and the two objects.
The staircase and threshold procedures were determined after pilot experiments in children and adults in the general population, in order for the threshold to reflect most closely participants’ most accurate discrimination, without being affected by possible errors occurring randomly during the task.
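To make the adaptive procedure concrete, the sketch below simulates one possible reading of it. Several points are assumptions rather than details confirmed by the text: two consecutive correct responses are taken to shrink the distance by the 0.8 ratio, a single error is taken to enlarge it by one step, the distance is never allowed to exceed its starting value, and the "five lowest values" are taken over all presented trials. The distance in mm is used as a stand-in for the α angle, because converting between the two depends on the stimulus geometry. The names `run_staircase`, `threshold`, and `ideal_observer` are hypothetical.

```python
def run_staircase(respond, start_mm=72.0, ratio=0.8, patience=10, max_trials=500):
    """Simulate the staircase; respond(distance_mm) returns True for a correct answer.

    Stops once `patience` consecutive trials have passed without the distance
    going below its previous minimum (the stopping rule described in the text).
    """
    level = 0            # number of 0.8 steps below the starting distance
    deepest = -1         # deepest level presented so far
    since_new_low = 0    # consecutive trials without a new minimum
    streak = 0           # consecutive correct responses at the current level
    history = []
    while since_new_low < patience and len(history) < max_trials:
        distance = start_mm * ratio ** level
        history.append(distance)
        if level > deepest:
            deepest, since_new_low = level, 0
        else:
            since_new_low += 1
        if respond(distance):
            streak += 1
            if streak == 2:          # assumption: two correct -> harder
                level, streak = level + 1, 0
        else:                        # assumption: one error -> easier, capped at start
            level, streak = max(level - 1, 0), 0
    return history

def threshold(history, n_lowest=5):
    """Mean of the n_lowest smallest presented values (assumed reading of the rule)."""
    lowest = sorted(history)[:n_lowest]
    return sum(lowest) / len(lowest)

# Hypothetical deterministic observer: always correct at 10 mm or more.
def ideal_observer(distance_mm):
    return distance_mm >= 10.0

trials = run_staircase(ideal_observer)
print(len(trials), round(min(trials), 2), round(threshold(trials), 2))
```

With a real participant, responses near threshold would be probabilistic rather than all-or-none, so the deterministic observer here only illustrates the mechanics of the stopping rule and the threshold computation.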
Statistical analysis
A power analysis revealed that 58 participants (29 in each group) would allow finding a difference with a power of 0.8 at a 0.05 significance level and a moderate effect size.
Assumptions for parametric tests were met after logarithmic transformation of the GDD threshold. In order to evaluate GDD in individuals with ASD, GDD thresholds were compared between groups using a t-test. Secondarily, we tested the interactions between diagnosis (ASD vs controls) and, successively, IQ, sex, and age group (adults vs children) on GDD thresholds in separate analyses of variance (ANOVAs). GDD thresholds were then analyzed in a final model using the variables that had shown an effect on GDD or an interaction with diagnosis. As a control for motivation to engage in the task, response time and number of missed trials were compared between groups. The threshold for statistical significance was set at p < 0.05. All analyses were performed with R (R Core Team, 2012). Cohen (1988) suggested that d = 0.2 be considered a “small” effect size, 0.5 a “medium” effect size, and 0.8 a “large” effect size. He also suggested that η2 = 0.01 be considered a “small” effect size, 0.06 a “medium” effect size, and 0.14 a “large” effect size.
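As an illustration of the first analysis step, the sketch below log-transforms two sets of thresholds and computes a pooled-variance (Student) t statistic. This is only a sketch in Python using the standard library; the analyses in the study were run in R, the text does not specify which variant of the t-test was used, and the threshold values shown are hypothetical, not data from the study.

```python
import math
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Student's t statistic (equal variances assumed) and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df

# Hypothetical thresholds in degrees, NOT data from the study.
asd_thresholds = [22.0, 9.5, 31.0, 14.2, 17.8, 12.1]
control_thresholds = [8.4, 15.3, 10.9, 6.7, 19.5, 11.2]

# Log-transform before testing, as described in the text.
t_stat, df = pooled_t([math.log(x) for x in asd_thresholds],
                      [math.log(x) for x in control_thresholds])
print(f"t({df}) = {t_stat:.2f}")
```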
Results
GDD was more accurate in the control group (mean α angle = 12.6° (standard deviation (SD) = 6.8°)) than in the ASD group (α = 16.5° (7.9°)) (t(70) = 2.4, p = 0.02 (Cohen’s d = 0.53)). There was an effect of age group (F(1, 67) = 4.6, p = 0.03, η2 = 0.09), with a more accurate GDD in adults (α = 13.1° (8.5°)) than in children (α = 16.8° (8.5°)), but there was no interaction between age and diagnosis (F(1, 67) = 0.8, p = 0.38, η2 = 0.01) on GDD. There was no effect of IQ (F(1, 67) = 2.4, p = 0.13, η2 = 0.035) and no interaction between IQ and diagnosis (F(1, 67) = 0.4, p = 0.52, η2 = 0.0060) on GDD. Therefore, IQ was excluded from subsequent analyses. There was no main effect of sex (F(1, 67) = 0.0, p = 0.99, η2 = 0.00) but an interaction between sex and diagnosis (F(1, 67) = 4.4, p = 0.04, η2 = 0.066) on GDD (see Figure 2). The final ANOVA, with age group and sex as between-subject factors, revealed main effects of diagnosis (F(1, 65) = 6.2, p = 0.01, η2 = 0.087) and age group (F(1, 65) = 5.3, p = 0.02, η2 = 0.075), but no effects of sex (F(1, 65) = 0, p = 0.99, η2 = 0.00) on GDD threshold. There was no interaction between diagnosis and age group (F(1, 63) = 0.05, p = 0.83, η2 = 0.0007), but an interaction between diagnosis and sex (F(1, 65) = 5.11, p = 0.03, η2 = 0.078). Given the significant interaction between sex and diagnosis and the unbalanced sex ratio of the groups, we further investigated whether the effect of diagnosis on the GDD threshold was significant in each sex. We therefore performed separate post hoc analyses comparing ASD and controls using unilateral t-tests in both sexes’ subgroups (Tukey Honest Significant Differences method with p adjusted for multiple comparisons). GDD was found altered in the males of the ASD group (α = 17.5° (7.9°)) compared to the males of the control group (α = 11.5° (6.3°), p = 0.02, d = 0.76). No other difference was found (all p > 0.29). 
In the female subgroup, ASD participants (α = 11.7° (6.1°)) did not differ from controls (α = 13.8° (7.4°), p = 0.94, d = 0.22). Thus, GDD was more accurate in adults than in children (regardless of diagnosis), and in male controls than in male participants with ASD.
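The effect size for the main group comparison can be recovered from the reported summary statistics. The sketch below uses the pooled-standard-deviation form of Cohen's d with the group means, SDs, and sample sizes given in the text (ASD: n = 33; controls: n = 38). Computing d on the untransformed angles is an assumption, since the text does not state whether the reported d was based on raw or log-transformed thresholds; the helper names `cohens_d` and `cohen_label` are illustrative.

```python
import math

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d from summary statistics, using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

def cohen_label(d):
    """Label an effect size using Cohen's (1988) conventional benchmarks."""
    magnitude = abs(d)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "below small"

# ASD group: 16.5 (7.9), n = 33; control group: 12.6 (6.8), n = 38.
d = cohens_d(16.5, 7.9, 33, 12.6, 6.8, 38)
print(round(d, 2), cohen_label(d))  # prints: 0.53 medium
```

Under these assumptions the result matches the reported d = 0.53, which falls in the "medium" range of Cohen's benchmarks.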

Figure 2. Gaze direction detection thresholds in female (F) and male (M) individuals with ASD and controls (CTL).
To ensure that the main effect of diagnosis on GDD was not artificially caused by sex imbalance in our population sample, we also conducted the main analysis on males only. Removing female participants did not affect age and IQ matching (Supplementary Table 1). The ANOVA showed a main effect of diagnosis (F(1, 43) = 11.18; p = 0.002, η2 = 0.21), a significant effect of age group (F(1, 43) = 10.7; p = 0.002, η2 = 0.20), and no interaction between age group and diagnosis (F(1, 43) = 0.4, p = 0.53, η2 = 0.009).
We found no between-group differences in response time (mean response times = 2115 and 2092 ms in control and ASD groups, t(60) = 0.22, p = 0.82, d = 0.053) and in number of missed trials (mean missed trial rate = 0.7% and 1.5%, respectively, t(50) = 1.6, p = 0.11, d = 0.39).
Discussion
This study examined GDD in adults and children with ASD using an adaptive procedure that tracked individual detection thresholds. We tested the effect of diagnosis, as well as the potential influence of age and sex. The present results reinforce previous findings (Leekam et al., 1997; Webster and Potter, 2008, 2011), suggesting that individuals with ASD discriminate gaze direction less accurately than age- and IQ-matched controls. This result was well established for males with ASD. Furthermore, the sex by diagnosis interaction and post hoc tests indicated that the group difference in GDD was at least weaker in females, consistent with previous evidence (Matsuyoshi et al., 2014). We did not find evidence for a significant deficit in GDD in females with ASD; however, given the small number of female participants in this study, our statistical power to detect such an effect was low, and a definitive conclusion on GDD in females with ASD must therefore await future studies including a larger sample of female participants.
Previous studies by Webster and Potter (2008, 2011) compared performance at different difficulty levels and found that between-group differences were present in children and adults only at the easier levels. By measuring individual thresholds, our study used a different approach that reflected more precisely the maximum level of discrimination for each participant. Thus, we believe that the present results provide more compelling evidence for a lower accuracy in GDD in ASD. However, the difference between the two groups is moderate, which may explain inconsistencies among previous studies.
Moreover, while in Webster and Potter’s (2008, 2011) studies group differences in GDD tended to decrease with age, in this study, there was no interaction between age group and diagnosis, nor did we find any trend in this direction. It is possible that the “improvement” observed by Webster and Potter (2008, 2011) might be an artifact due to a ceiling effect in the adult group, whereas in our task, thanks to the adaptive procedure, no ceiling was reached.
Our results also suggest that GDD might be impaired in males, but preserved in females with ASD, as suggested by a significant interaction between sex and diagnosis on GDD, and the absence of difference in the female subgroup, despite a similar variance as compared to the male subgroup. No difference was found in IQ and severity of autistic traits (SRS score, Autism Diagnostic Interview (ADI) score) between males and females in the ASD group, consistent with the current literature (Van Wijngaarden-Cremers et al., 2014).
To our knowledge, this is the first study to suggest that, in a clinical population, GDD might constitute a sex-dimorphic autistic endophenotype, in accordance with previous findings in the general population (Matsuyoshi et al., 2014). However, given the small sample of females with ASD in our sample, these findings can only be considered preliminary and call for further investigation.
When considering the current results, some limitations should be addressed. First, in this experiment, we used synthetic stimuli because they allow for an adaptive and accurate control of the gaze direction. Yet, the question may be raised whether synthetic stimuli effectively elicit GDD. We believe that they do, since synthetic stimuli have been widely used in previous studies on gaze, showing consistent behavioral responses and brain activation (e.g. Pelphrey et al., 2005). Might the use of synthetic stimuli have disadvantaged the participants in the ASD group, creating an artificial difference? To our knowledge, previous reports of differences between photographic and synthetic stimuli in ASD (Forgeot d’Arc et al., 2014; Rosset et al., 2008) have consistently shown the opposite pattern (i.e. synthetic stimuli being more easily processed by ASD groups than natural stimuli). Thus, it would seem unlikely that, in the present experiment, using synthetic stimuli disadvantaged the ASD group. If anything, one might expect that running a similar protocol with photographic stimuli would lead to a larger effect size. One may also suspect that individuals with ASD were simply less engaged in the task than participants in the control group. However, since we found no between-group difference in response time and missed trials, we have no reason to suspect a lower engagement or attention in the present task.
Another limitation may relate to the very large age range of the population studied: might it have introduced a bias? This is unlikely because while age group unsurprisingly affected GDD thresholds, it did not interact with diagnosis, suggesting a similar effect of diagnostic group in both children and adults. This effect is consistent with the general notion that adults do better than children in most tasks.
In conclusion, this study contributes to the understanding of GDD in ASD by showing that impairment in detection of gaze direction affects both children and adults with ASD. Further studies should address the effect of sex on GDD alteration in ASD, given that our finding that GDD might be preserved in females with ASD relies on a small sample.
What can diminished GDD in ASD account for? Where people look informs us about what they know, want, or attend to. Atypical or altered detection of gaze direction might thus lead to impoverished acquisition of social information and social interaction. Alternatively, it has been suggested that abnormal monitoring of inner states (Happé and Frith, 2014), or the lack of social motivation (Chevallier et al., 2012), would explain the reduced tendency to follow conspecific gaze in individuals with ASD. Either way, a lower tendency to look at the eyes and to follow gaze would provide fewer opportunities to practice GDD ability. Thus, impaired GDD might either play a causal role in atypical social interaction, or conversely be a consequence of it. Exploring GDD earlier in development might help disentangle this issue.
Footnotes
Acknowledgements
The authors wish to thank the participants and their families, and Claude Berthiaume, statistician, for his help.
Funding
This work is supported by the INSERM (grant number: C07-33) and the Agence Nationale de la Recherche (grant numbers: Contracts ANR-09-BLAN-0327 SOCODEV, ANR-10-IDEX-0001-02 PSL*, and ANR-10-LABX-0087 IEC).
