Abstract
Failing to engage in joint attention is a strong marker of impaired social cognition associated with autism spectrum disorder (ASD). The goal of this study was to localize the source of impaired joint attention in individuals with ASD by examining both behavioral and fMRI data collected during various tasks involving eye gaze, directional cuing, and face processing. The tasks were designed to engage three brain networks associated with social cognition [face processing, theory of mind (TOM), and action understanding]. The behavioral results indicate that even high-functioning individuals with ASD perform less accurately and more slowly than neurotypical (NT) controls when processing eyes, but not when processing a directional cue (an arrow) that did not involve eyes. Behavioral differences between the NT and ASD groups were consistent with differences in the effective connectivity of FACE, TOM, and ACTION networks. An independent multiple-sample greedy equivalence search was used to examine these social brain networks and found that whereas NTs produced stable patterns of response across tasks designed to engage a given brain network, ASD participants did not. Moreover, ASD participants recruited all three networks in a manner highly dissimilar to that of NTs. These results extend a growing literature that describes disruptions in general brain connectivity in individuals with autism by targeting specific networks hypothesized to underlie the social cognitive impairments observed in these individuals.
Introduction
Following the eye gaze of another individual occurs reflexively for most people (Friesen and Kingstone, 1998) from early childhood. Eye gaze may be especially potent in capturing attention (Driver et al., 1999; Langton et al., 2000) even when the one doing the gazing is a member of a different species (Deaner and Platt, 2003; Ricciardelli et al., 2002). The ability to follow gaze underlies fundamental social skills and impaired gaze following is viewed as an early sign of autism spectrum disorders (ASDs) as well as being predictive of impaired language skills (Mundy et al., 1990).
Impairments in gaze following and language skills often accompany deficits in other forms of social communication. Children with ASD have difficulty engaging in symbolic play (Sigman and Ungerer, 1984) and are less likely to use conventional gestures such as waving (Hobson and Lee, 1998) or pointing (Hobson and Meyer, 2005; Landry and Loveland, 1988). Impaired social cognition has been attributed to deficits in (1) Theory of Mind (TOM) (Baron-Cohen et al., 1985, 1995; Leslie et al., 2004; Siegal and Varley, 2002), (2) face processing (Dawson et al., 2005; Grelotti et al., 2002; Schultz, 2005), and (3) action understanding (Boria et al., 2009; Enticott et al., 2012; Gallese et al., 2009).
The evidence that impaired social cognition is associated with poor TOM (Baron-Cohen et al., 1985, 1995; Siegal and Varley, 2002) or mind-reading (Baron-Cohen et al., 1997; Golan et al., 2006; Pellicano et al., 2005; Ponnet et al., 2008; Roeyers et al., 1998) skills is extensive. From this perspective, failure to follow eye gaze, as observed in individuals with ASD, can be seen as failure to understand another's intention or to empathize with another's interest.
Impaired joint attention in individuals with ASD has also been attributed to difficulty processing faces (Dawson et al., 2005; Grelotti et al., 2002; Schultz, 2005). Some researchers have argued that impaired face processing stems from atypical brain responses in areas such as the fusiform gyrus (Dawson et al., 2005) and amygdala (Baron-Cohen et al., 2000), or from atypical connectivity between the two structures (Schultz, 2005).
A third explanation that has been offered for impaired social cognition in individuals with ASD focuses on action understanding (e.g., Gallese et al., 2009). The argument is that appropriate social interchange is highly dependent on successful imitation of others, and impaired imitation can be traced to atypical mirror neuron activation. Mirror neurons were first discovered in nonhuman primates, but human analogues have been identified, particularly in inferior frontal gyrus (IFG) (see Rizzolatti et al., 2004 for a review). Moreover, there is evidence that individuals with ASD exhibit a pattern of response in the brain areas associated with action understanding that differs from that for neurotypicals (NTs) (Enticott et al., 2012).
Our goal in the present study was to determine whether and how recruitment of the three brain networks associated with TOM, face processing (FACE), and action understanding (ACTION) might differ between individuals with ASD and NTs. We presented our participants with five different diagnostic visual tasks, each designed to engage one or more of the three brain networks (TOM, FACE, and ACTION). We then compared behavioral performance [response time (RT) and accuracy], brain activation (using GLM analysis), and effective connectivity [using independent multiple-sample greedy equivalence search (IMaGES)] across tasks and participants.
Methods
Participants
Participants ranged in age from 18 to 35 years and were fluent in English. None of the participants had a major medical illness or a history of seizures in the previous 2 years. None had embedded metal (such as surgical pins) or electronic devices (such as a pacemaker). All participants signed a consent form in compliance with Institutional Review Board (IRB) regulations and were paid for their participation.
ASD participants were recruited from the Autism Center of the University of Medicine and Dentistry of New Jersey/New Jersey Medical School (UMDNJ-NJMS) and included nine men. Based on assessment using the Autism Diagnostic Interview–Revised (ADI-R) and the Autism Diagnostic Observation Schedule (ADOS-G), four participants were found to have autism and four to have ASD. IQ scores [as tested on the Wechsler Abbreviated Scale of Intelligence (WASI)] ranged from 77 to 129 with a mean of 104. NT participants included nine men with IQ scores (WASI) ranging from 88 to 127 and a mean of 110.
Many individuals with ASD were taking psychotropic medication (e.g., selective serotonin reuptake inhibitors and atypical neuroleptics) for symptoms and behaviors related to these disorders. We did not ask subjects on medication to discontinue these in order to qualify for inclusion in the study. This decision was made on the basis of (1) the ethical concerns about withdrawing people from medication (especially in a study that includes no direct therapeutic benefit to subjects), (2) the pervasive social communication deficits that persist despite being on these medications, and (3) the finding by Schultz et al. (2000) of no significant differences of any fMRI activation variables between autism subjects taking psychotropic medication versus those not taking medication.
Scanning procedure and parameter information
Imaging was performed using an Allegra 3T (Siemens, head only) system for all scans. Participants were scanned in a prone position and a standard quadrature head coil was used. Foam cushioning was used to stabilize head position and minimize head movement. The stimuli were presented using E-Prime software running under the Windows XP operating system and were projected onto a back-projection screen placed at the rear of the scanner bore. Participants viewed the screen by looking in a mirror attached to the head coil. The mirror was adjusted individually to maximize viewing comfort of the participants. An MRI-compatible two-button mouse was used for responses. Scanning was synchronized with stimulus presentation through a trigger pulse sent to the E-Prime software.
T1-weighted axial anatomical scans (TR=2000 msec, TE=4.38 msec, 204×256 matrix, FOV=22 cm, slice thickness=2 mm, 0-mm gap, 80 slices) were obtained prior to the experimental trial sequence. These anatomic scans were used to register the functional imaging data. Functional imaging was done using an echo planar gradient echo imaging sequence and axial orientation and was obtained using the following parameters: TR=2000 msec, TE=30 msec, 64×64 matrix, FOV=22 cm, slice thickness=4 mm, 0-mm gap, 32 slices.
Stimuli and design
Three brain networks have been implicated in the impaired cognitive processing demonstrated by individuals with autism. These networks are engaged in face processing (FACE), TOM, and action understanding (ACTION). While some regions of interest (ROIs) may appear in more than one network [e.g., posterior superior temporal sulcus (pSTS)], most are unique to a given network (e.g., angular gyrus). Five stimuli were used to engage the processing associated with each network and identify differences in activation of ROIs between ASD and NT subjects. The five stimuli were as follows: arrow-object (AO), eyes-object (EO), eyes-left-right (ELR), eyes-open-closed (EOC), and mouth-open-closed (MOC). An example of each stimulus can be seen in Figure 1. Three stimuli, ELR, EOC, and MOC, were used to engage face processing. These stimuli focused on eyes and the mouth. Two stimuli, EO and ELR, were used to activate the TOM network. These stimuli displayed the eyes as averted, a position used during joint attention. To activate the ACTION network, three stimuli—AO, EO, and ELR—were used. Each of these stimuli implies movement through the depiction of orientation both for eyes and for a nonface object.

Example of stimuli used in each condition.
The experiment consisted of having participants make simple, two-alternative forced-choice decisions (“yes” or “no”) about the presented stimuli. For the AO and EO conditions, the task was to decide whether the arrow (or eye gaze) was indicating a predesignated target. For the ELR condition, subjects were asked either “is the person looking right?” or “is the person looking left?” The EOC and the MOC conditions required subjects to determine whether the eyes (EOC) or mouth (MOC) of the person in the picture was open or closed. A typical trial sequence is shown in Figure 2.

Timing and progression of a standard trial.
The stimuli were drawn from a pool of 8 different objects and 10 different faces (half men and half women) and are shown in Figures 3 and 4. For each trial a stimulus was generated online with the following constraints: (1) all objects and all faces were presented with the same probability across trials; (2) objects appeared on the left and right an equal number of times; (3) for the AO and EO trials, the direction of the arrow or eye gaze was to the left and right equally often; and (4) for half of all trials in a given condition, the correct response was “yes” and for the remaining half it was “no.” A unique set of experimental trials was generated for each subject.

Objects used in the EO and AO conditions. AO, arrow-object; EO, eyes-object.

Faces used in the EO, ELR, EOC, and MOC conditions. ELR, eyes-left-right; EOC, eyes-open-closed; MOC, mouth-open-closed.
Stimuli were presented in blocks of 24 trials. All stimuli in a given block tested the same condition (e.g., AO or MOC). Two blocks of each type were presented and the order of the blocks was randomized for each subject. Thirty seconds of rest, in which the subject was not required to make any judgment or response, preceded each block of trials.
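The counterbalancing constraints and block structure described above can be illustrated with a minimal sketch. It is not the E-Prime procedure used in the study: the function names, the `target_side` field, and the decision to cross cue direction with target side are assumptions made for illustration, and faces are simply sampled with equal probability because 24 trials cannot be divided evenly among 10 faces.

```python
import random
from collections import Counter

CONDITIONS = ["AO", "EO", "ELR", "EOC", "MOC"]
N_OBJECTS, N_FACES, TRIALS_PER_BLOCK = 8, 10, 24


def make_eo_block(rng):
    """Return one balanced 24-trial EO block as a list of dicts (hypothetical layout)."""
    # Cross cue direction with target side: 4 cells x 6 repeats = 24 trials,
    # so direction, target side, and yes/no answers are each split 12/12.
    cells = [(cue, side) for cue in ("left", "right") for side in ("left", "right")]
    cells = cells * (TRIALS_PER_BLOCK // len(cells))
    # Each of the 8 objects is used 3 times, assigned in random order.
    objects = list(range(N_OBJECTS)) * (TRIALS_PER_BLOCK // N_OBJECTS)
    rng.shuffle(objects)
    trials = []
    for (cue, side), obj in zip(cells, objects):
        trials.append({
            "cue_direction": cue,
            "target_side": side,
            "object_id": obj,
            "face_id": rng.randrange(N_FACES),  # equal probability per face
            "correct": "yes" if cue == side else "no",
        })
    rng.shuffle(trials)
    return trials


def block_order(rng):
    """Two blocks per condition, in an order randomized for each subject."""
    order = CONDITIONS * 2
    rng.shuffle(order)
    return order


rng = random.Random(0)  # a per-subject seed gives each subject a unique trial set
block = make_eo_block(rng)
print(Counter(t["correct"] for t in block))
print(block_order(rng))
```

Seeding the generator per subject is one simple way to obtain a unique but reproducible trial set for each participant.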
Before entering the magnet, each subject practiced each type of task using stimuli similar to those that would be presented during the experiment. Practice continued until both the subject and the experimenter felt confident that the task and experimental procedure were well understood. Following these practice trials, the subject was prepared for scanning and placed in the magnet. Once the experimental trials were completed, the subject was removed from the magnet and paid for his/her participation. Any questions or comments about the experiment that the subject may have had were addressed at this time.
Behavioral analysis
Both judgment accuracy and RTs were collected on every trial. Mean accuracy (percentage of correct trials) was calculated for each participant in each condition. Mean RT was likewise calculated for each participant individually by averaging over the trials in a given condition. Mean performance (accuracy and RT) for each condition was then calculated separately for ASD and NT subjects by averaging the individual subject means.
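The two-stage averaging described above can be sketched as follows. The data layout (a list of `(correct, rt_ms)` pairs per subject) and the function names are hypothetical, and the sketch assumes that RTs from all trials, not only correct ones, enter the mean.

```python
from statistics import mean


def subject_summary(trials):
    """trials: list of (correct: bool, rt_ms: float) for one subject in one condition."""
    accuracy = 100.0 * sum(1 for correct, _ in trials if correct) / len(trials)
    mean_rt = mean(rt for _, rt in trials)
    return accuracy, mean_rt


def group_summary(subjects):
    """Average the per-subject means (not the pooled trials) within a group."""
    summaries = [subject_summary(trials) for trials in subjects]
    return mean(acc for acc, _ in summaries), mean(rt for _, rt in summaries)


# Illustrative values only, not data from the study.
group = [
    [(True, 500.0), (False, 700.0)],
    [(True, 400.0), (True, 600.0)],
]
print(group_summary(group))
```

Averaging subject means rather than pooled trials gives every participant equal weight in the group mean.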
GLM analysis
Because we were interested in how the activation of the FACE, TOM, and ACTION networks might differ between ASD and NT subjects, we created masks using ROIs associated with each network to confine the GLM analyses. ROIs for the FACE network included amygdala, pSTS, fusiform gyrus, and inferior occipital cortex (IOC) (e.g., Haxby et al., 2002). TOM network ROIs were amygdala, pSTS, precuneus, posterior cingulate cortex (PCC), and paracingulate (e.g., Carrington and Bailey, 2009). ROIs included in the ACTION network analyses included frontal operculum, pSTS, IFG, postcentral gyrus, and inferior parietal lobule (IPL) (e.g., Gallese et al., 2009). Masks generated from these “theory-based” ROIs were defined with the Harvard-Oxford atlas.
Analysis of the fMRI data was carried out using FMRIB's Software Library (FSL,
Although each stimulus condition was presented in two blocks of 24, the subject-level GLM analysis was conducted as an event-related design. When there are few blocks and/or the number of subjects is small as in our study, power can be increased by analyzing the data as an event-related design (Mechelli et al., 2003).
Comparisons between stimulus conditions were performed by subtracting out activation patterns from a condition that was similar in all features except the contrast of interest. So, for example, in order to compare the effect of eye gaze as a directional indicator relative to an arrow, activation recorded during the AO condition was subtracted from (i.e., contrasted with) activation during the EO condition. In this way, all aspects of the task and stimulus were identical with the exception of the factor of interest, which in this example was the type of directional indicator (arrow or eye gaze) that was presented. This allowed us to focus on the factor that differentiated two conditions and eliminate activation that was common to both (e.g., response generation, number of components in the display, and type of judgment required).
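As a minimal sketch of the subtraction logic, a contrast can be written as a weighted sum of per-condition estimates at each voxel, with a weight of +1 for the condition of interest and −1 for its control. The helper below is hypothetical and operates on plain lists; it is not the FSL implementation, and the numbers are illustrative only.

```python
def contrast(betas, weights):
    """Weighted sum of per-condition estimates at each voxel.

    betas:   dict mapping condition name -> list of voxel estimates
    weights: dict mapping condition name -> contrast weight
    """
    n_voxels = len(next(iter(betas.values())))
    return [
        sum(w * betas[cond][v] for cond, w in weights.items())
        for v in range(n_voxels)
    ]


EO_MINUS_AO = {"EO": 1.0, "AO": -1.0}

# Three hypothetical voxels for one subject.
subject_betas = {"EO": [2.0, 1.5, 0.3], "AO": [1.0, 1.5, 0.1]}
eo_gt_ao = contrast(subject_betas, EO_MINUS_AO)
print(eo_gt_ao)
```

The second voxel responds equally in both conditions, so its contrast value is zero: activity common to the two tasks cancels, and only the eye-gaze-specific difference remains.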
These individual contrasts were then used to compare ASD and NT subjects using FEAT Version 5.90, part of FSL (
IMaGES and connectivity
Although core ROIs have been identified with the FACE, TOM, and ACTION networks, there is less consensus about the structure of those networks, that is, about how the ROIs in a given network interact with one another. A number of approaches have been used to define the relation among ROIs in a network. One approach is to use a directed-search algorithm based on an a priori theory of the network of interest. This approach is used by methods such as dynamic causal modeling (DCM) and structural equation modeling (SEM) and requires prior specification of a graph model that is then fit to the ROI covariance. Methods for determining functional connectivity (e.g., Granger Causality and seed Pearson R correlations) involve neither model specification nor search. Ideally, an exhaustive search of potential connectivity patterns among a set of ROIs would produce the most accurate graph, but the computing resources and time required make this approach untenable for all but the smallest networks. The number of candidate graphs grows exponentially with the number of nodes, so that just five nodes will produce more than 50,000 graphs to fit (Hanson et al., 2009), which makes specification or confirmatory methods like DCM simply untenable. We used a Bayesian search connectivity method called IMaGES (Ramsey et al., 2011). IMaGES is based on a modification of the greedy equivalence search (Meek, 1997). It was designed with the specific goal of avoiding the spurious statistical dependencies that can arise when directly combining time series across subjects. IMaGES searches over the pool of potential graph candidates by using a parallel Bayesian method to exploit the constraints imposed by multiple subjects doing the same task [see Fig. 5 and Ramsey et al. (2010) for more details]. IMaGES has been validated using large benchmark sets (∼40 cases) with 98%/90% recall/precision (Smith et al., 2011).
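The scale of the search space can be checked with a short calculation. The text does not state which class of graphs the five-node figure refers to, so the sketch below shows two common counts as an assumption: graphs in which each pair of nodes is either unconnected or joined by a single directed edge, and labeled directed acyclic graphs (via Robinson's recurrence). The first count exceeds 50,000 at five nodes.

```python
from math import comb


def count_edge_configurations(n):
    """Graphs where each node pair is unconnected, X->Y, or Y->X."""
    return 3 ** comb(n, 2)


def count_dags(n):
    """Labeled directed acyclic graphs on n nodes (Robinson's recurrence)."""
    counts = [1]  # exactly one DAG on zero nodes
    for m in range(1, n + 1):
        total = 0
        for k in range(1, m + 1):
            sign = 1 if k % 2 == 1 else -1
            total += sign * comb(m, k) * 2 ** (k * (m - k)) * counts[m - k]
        counts.append(total)
    return counts[n]


for n in range(2, 7):
    print(n, count_edge_configurations(n), count_dags(n))
# n = 5 gives 59049 edge configurations and 29281 DAGs
```

Either way, adding a sixth node multiplies the number of candidates by more than a hundred, which is why a guided search is needed in place of exhaustive enumeration.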
IMaGES has also been found to support structural connections derived from diffusion tensor imaging, recovering a large density of connections that other methods such as Granger Causality miss (Sun et al., 2012). IMaGES does several preprocessing steps in order to remove trends and increase temporal resolution by estimating auto-regressive residuals for TR interpolation that are then used as the actual input time series residuals (see Fig. 6).

Depiction of steps used to preprocess time series data by IMaGES using goodness of fit (GOF).

The search process used by IMaGES to generate connectivity graphs is shown.
As described earlier, the five stimuli used in this study were designed to activate the FACE, TOM, and ACTION networks. Separate graph analyses were performed for each network using the brain activation elicited by the network-appropriate stimuli. In this way two to three connectivity graphs were generated independently for each network from data obtained in response to different stimuli created to activate that network. For the FACE network, these stimuli were the ELR, EOC, and MOC. Connectivity analyses for the TOM network were drawn from the EO and ELR stimuli and the ACTION network analyses were based on activation observed in response to the AO, EO, and ELR stimuli.
The nodes used for the connectivity analyses were the same ROIs included in the network masks used during the GLM analyses. The average time series was extracted from the voxels in a given ROI, for a given stimulus, and a given network for each subject. These time series were extracted from the raw activation obtained in the individual subject-level analyses. The time series extracted for each ROI, for each stimulus, in each network, for every subject were used as input to the IMaGES analyses. Separate group-wise analyses were performed for ASD and NT subjects. For each of the eight group-wise analyses, the graph with the highest score based on the graph's Bayesian information criterion (BIC) value (Schwarz, 1978) was chosen.
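Extraction of a node's time series amounts to averaging, at each volume, over the voxels that fall inside the ROI mask. The sketch below shows this with plain lists; the function name and data layout are hypothetical, and the actual extraction would operate on the imaging volumes and atlas-defined masks.

```python
def roi_mean_timeseries(voxel_timeseries, roi_mask):
    """Average the time series of all voxels inside an ROI.

    voxel_timeseries: list of per-voxel lists, one value per volume
    roi_mask:         list of bools, True for voxels inside the ROI
    """
    selected = [ts for ts, inside in zip(voxel_timeseries, roi_mask) if inside]
    if not selected:
        raise ValueError("ROI mask selects no voxels")
    n_volumes = len(selected[0])
    return [sum(ts[t] for ts in selected) / len(selected) for t in range(n_volumes)]


# Three hypothetical voxels over three volumes; the third lies outside the ROI.
voxels = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [100.0, 100.0, 100.0]]
mask = [True, True, False]
print(roi_mean_timeseries(voxels, mask))
```

One such series per ROI, stimulus, and subject would then form the input to the graph search.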
Results
Separate analyses were conducted to compare ASD and NT participants in the five stimulus conditions. For the behavioral analyses, we examined RT and accuracy to determine whether and how ASD and NT participants differ in performing the various tasks. We used a GLM analysis to identify significant activation of ROIs associated with the FACE, TOM, and ACTION networks. Finally, we compared effective connectivity of the three networks using IMaGES as described earlier. The results are discussed separately in the following sections.
Behavioral data
In general, ASD subjects tended to be both less accurate and slower than NT subjects across all five conditions, but still performed well above chance. Thus, whereas ASD subjects appeared to be impaired relative to NT subjects, they clearly had little difficulty understanding and complying with the task requirements.
Comparison of EO and AO stimuli
Arrows, like eye gaze, can initiate reflexive orientation of attention in both NT subjects (Bayliss and Tipper, 2005; Ristic et al., 2002; Tipples, 2002) and ASD subjects (Kylliainen and Hietanen, 2004; Swettenham et al., 2003). Moreover, for individuals with ASD, eye gaze and arrows appear to be equally effective in orienting attention (Senju et al., 2004; Vlamings et al., 2005).
To determine whether ASD subjects had a general problem following directional indicators, we compared response accuracy and reaction time by ASD and NT subjects on the EO and AO tasks. Because these tasks differed only in the type of directional indicator that was used (arrows in the AO task and eye gaze in the EO task), a comparison of performance on these tasks would allow us to determine whether ASD subjects are generally impaired relative to NT subjects in processing directional indicators. Moreover, a comparison of responses in the AO and EO tasks would allow us to determine whether any discrepancy in performance between the two groups reflected a general difference in processing, or was specific to a particular directional indicator.
An analysis of variance was conducted on the percentage of correct responses, with subject type (ASD and NT) as the between-subject factor and stimulus type (AO and EO) as the within-subject factor. A main effect for subject type [F(1,32)=18.75, p<0.001] was found reflecting the superior accuracy of NT subjects in both tasks. Stimulus type was also found to be significant [F(1,32)=5.64, p<0.05], suggesting that both subject groups found the AO task to be easier than the EO task. Finally, the interaction between the two factors was also significant [F(1,32)=5.03, p<0.05], reflecting the greater difference in performance between the AO and EO tasks for ASD subjects relative to NT subjects. Pairwise comparisons (Tukey's honestly significant difference [HSD] test) confirmed these conclusions. Whereas ASD subjects were significantly more accurate on AO stimuli than on EO stimuli (p<0.02), the accuracy of NT subjects did not differ between the two conditions (p>0.99). Moreover, NT subjects were significantly more accurate than ASD subjects in the EO task (p<0.0001), but did not differ from ASD subjects on the AO task (p>0.46).
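For readers less familiar with the design, the structure of a 2 (group, between subjects) × 2 (stimulus, within subjects) analysis can be sketched with a textbook identity: the group main effect equals the squared pooled-variance t statistic computed on each subject's mean across the two stimuli, and the interaction equals the squared t statistic computed on each subject's difference score. The code below illustrates that identity with made-up numbers; it is not the analysis software used in the study and does not reproduce the reported statistics.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


def mixed_anova_2x2(group1, group2):
    """group1, group2: lists of (score_cond_a, score_cond_b), one pair per subject.

    Returns F ratios for the group main effect and the group x stimulus interaction.
    """
    means1 = [(a + b) / 2 for a, b in group1]
    means2 = [(a + b) / 2 for a, b in group2]
    diffs1 = [a - b for a, b in group1]
    diffs2 = [a - b for a, b in group2]
    return {
        "group": pooled_t(means1, means2) ** 2,
        "interaction": pooled_t(diffs1, diffs2) ** 2,
    }


# Hypothetical (AO, EO) accuracy pairs, for illustration only.
asd = [(90.0, 80.0), (85.0, 70.0), (95.0, 85.0)]
nt = [(96.0, 95.0), (98.0, 97.0), (94.0, 95.0)]
print(mixed_anova_2x2(asd, nt))
```

A large interaction F in this sketch corresponds to the pattern described in the text: the AO-EO difference is larger in one group than in the other.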
RTs followed the same pattern observed in the accuracy data (see Table 1), although an analysis of variance performed on the RTs did not yield any significant differences. As with the analysis of accuracy, we used subject type as the between-subject factor and stimulus type as the within-subject factor. The results of the analysis of variance on RTs revealed no significant effect for subject type [F(1,32)=3.33, p>0.08], stimulus type [F(1,32)=0.81, p>0.38], or the interaction of the two factors [F(1,32)=0.04, p>0.84]. Given that the trend of the RT data followed that of the accuracy data, we attribute the lack of significance found in the RT analysis to the greater variance we observed in RTs relative to mean accuracy.
Behavioral Performance (Accuracy and Response Time) of Autism Spectrum Disorder and Neurotypical Participants for Each Stimulus Condition
AO, arrow-object; EO, eyes-object; ELR, eyes-left-right; EOC, eyes-open-closed; MOC, mouth-open-closed; ASD, autism spectrum disorder; NT, neurotypical.
Comparison of EOC and MOC stimuli
The comparison of performance on EO and AO stimuli strongly indicates that ASD subjects are disadvantaged relative to NT subjects when processing eye gaze. However, the difference in response to the EO and AO stimuli by ASD subjects could reflect difficulty processing eye gaze in particular or difficulty processing faces in general. To distinguish between these possibilities, we compared accuracy and RTs of both subject groups while performing the EOC or MOC task. The two tasks required subjects to determine whether the target feature (eyes in the EOC task or mouth in the MOC task) was open or closed. Thus, the two tasks both used face stimuli and required the same type of judgment to be made, but allowed us to determine whether eyes were particularly difficult for ASD subjects relative to NT subjects.
We performed an analysis of variance using subject type as the between-group factor and stimulus type as the within-group factor as we had in the AO-EO comparison. The analysis yielded a significant main effect for subject type [F(1,32)=9.51, p<0.01], reflecting the generally higher mean accuracy scores achieved by NT subjects. Neither the main effect for stimulus type [F(1,32)=1.38, p>0.25] nor the interaction between subject type and stimulus type [F(1,32)=0.91, p>0.35] was significant. Pairwise comparisons (Tukey's HSD test) revealed that NT subjects were significantly more accurate than ASD subjects on the EOC task (p<0.04), but not significantly different from ASD subjects on the MOC task (p>0.44). It appears that eyes, and not faces in general, posed a problem for our ASD subjects.
We replicated the analysis of variance with the RT data and found, as we did in the analysis of mean accuracy, a subject type main effect [F(1,32)=4.61, p<0.04]. As with the accuracy data, neither the stimulus type [F(1,32)=0.66, p>0.47] nor the interaction between the two main factors [F(1,32)=0.00, p>0.97] was significant. Pairwise comparisons (Tukey's HSD test) revealed no difference between ASD and NT subjects on either the EOC task (p>0.45) or the MOC task (p>0.43).
Comparison of ELR and EOC stimuli
A comparison of mean accuracy between ELR and EOC was made to determine whether the perception of averted eyes differed between ASD and NT subjects. As in the comparisons described earlier, an analysis of variance was conducted using subject type as the between-group factor and stimulus type as the within-group factor. Subject type was found to be significant [F(1,32)=8.88, p<0.01], although neither stimulus type [F(1,32)=1.00, p>0.32] nor the interaction between the two factors [F(1,32)=0.54, p>0.47] was significant. Pairwise comparisons (Tukey's HSD test) found NT subjects to be marginally more accurate than ASD subjects on the ELR task (p=0.06), but not to differ from ASD subjects on the EOC task (p>0.40).
A similar comparison using RT data found a significant effect of subject type [F(1,32)=5.63, p<0.03] and no significant effect of either stimulus type [F(1,32)=0.61, p>0.44] or the interaction between subject type and stimulus type [F(1,32)=0.01, p>0.95].
Pairwise comparisons (Tukey's HSD test) revealed no significant difference between NT and ASD subjects on either the ELR task (p>0.37) or the EOC task (p>0.33).
Comparison of EO and ELR stimuli
From the comparison of AO and EO stimuli, we learned that the EO stimuli were more difficult to process for ASD subjects than were the AO stimuli. The comparison of EOC and MOC tasks verified that ASD subjects had difficulty processing eye gaze, not just face stimuli. In this comparison between EO and ELR, we examined whether the problem ASD subjects demonstrated in the EO task could be due to the presence of potential targets. Specifically, we wondered whether ASD subjects have difficulty maintaining attention between eye gaze and object, or whether they have difficulty simply perceiving the direction of averted eyes. To answer this question, we compared accuracy and RTs for ASD and NT subjects performing the EO and ELR tasks. The tasks were identical with the exception that EO stimuli included objects that could be potential targets.
We used analysis of variance to compare performance between ASD and NT subjects on the EO and ELR tasks using subject type as the between-subject factor and stimulus type as the within-subject factor. A main effect for subject type was found for both accuracy [F(1,32)=12.49, p<0.002] and RT [F(1,32)=5.85, p<0.05] reflecting the tendency of NT subjects to respond not only more accurately than ASD subjects, but also to respond more quickly. Stimulus type was not significant for either the accuracy measure [F(1,32)=0.36, p>0.55] or the RT measure [F(1,32)=0.69, p>0.41] and no significant interaction was found for either accuracy [F(1,32)=0.13, p>0.71] or for RT [F(1,32)=0.27, p>0.86]. These results suggest that tasks involving eyes are particularly difficult for ASD subjects whether or not eyes are used to indicate an object.
We wondered whether the inferior performance of ASD subjects relative to NT subjects was restricted to stimuli in which attention needed to follow eye gaze toward an object (EO task), or whether performance would be impaired for both tasks relative to NT subjects. Pairwise comparisons (Tukey's HSD test) found that NT subjects were more accurate than ASD subjects on the ELR task (p<0.05), but did not differ significantly from ASD subjects on the EO task (p>0.13). Similar pairwise comparisons on the RT data did not yield any significant differences for either the ELR task (p>0.40) or the EO task (p>0.28).
GLM analysis
Brain activation related to the four contrasts used in the behavioral analysis (EO>AO, EOC>MOC, ELR>EOC, and EO>ELR) was examined. These contrasts were first performed at the subject level, so that brain activity associated with each contrast was obtained for each subject individually. Activation for each contrast obtained per subject was then averaged separately for ASD and NT subjects in a second-level group analysis. Separate group analyses were performed for the three networks of interest: TOM, FACE, and ACTION. Network-specific GLM analyses were performed by using a prethreshold mask consisting of the ROIs associated with each network. TOM network ROIs included angular gyrus, pSTS, PCC, paracingulate gyrus, and precuneus. ROIs associated with the FACE network included amygdala, pSTS, IOC, and fusiform gyrus. The ACTION mask ROIs that were used included frontal operculum, IFG, IPL, paracingulate gyrus, and pSTS. To determine differences in activation patterns between subject groups for the three networks, NT>ASD and ASD>NT contrasts were performed. The results of these analyses are shown in Figure 7 and discussed in greater detail in the following sections. All group analyses were performed using a cluster threshold of p<0.001, and only activation surpassing this threshold is reported.

Activation in each stimulus condition for the FACE, TOM, and ACTION networks. Shown in blue are areas where NT subjects activated more than ASD subjects, and in yellow where ASD subjects activated more than NT subjects. ASD, autism spectrum disorder; IMaGES, independent multiple-sample greedy equivalence search; NT, neurotypical; TOM, theory of mind.
EO contrasted with AO
We first subtracted AO activation from EO activation individually for each subject. Because the EO and AO conditions differed only in the use of eye gaze (EO) or an arrow (AO), the subtraction of AO from EO provided the means of examining brain activity associated specifically with processing eye gaze as a directional indicator. These first-level analyses were then used to perform the group analysis contrasting ASD and NT subjects in order to determine how these two groups differed in processing eye gaze. No ROIs were found to be activated more for ASD than NT subjects in any of the three networks tested: FACE, TOM, or ACTION. NT subjects displayed greater activation than ASD subjects in fusiform gyrus and left amygdala for FACE ROIs, in paracingulate and precuneus for TOM ROIs, and pSTS, IPL, and IFG for ACTION ROIs.
EO contrasted with ELR
The aim of this contrast was to determine how the presence of an object could mediate processing of averted gaze. Both EO and ELR stimuli involved averted gaze, with the only difference between the two conditions being the presence of additional objects. For FACE ROIs, ASD subjects showed bilateral fusiform gyrus activation whereas activation in left fusiform gyrus and left amygdala was observed in NT subjects. Although NT subjects displayed greater activation than ASD subjects in some areas of paracingulate and precuneus using the TOM mask, ASD subjects showed greater activation than NT subjects in other areas of these same ROIs. Activation patterns were also found to differ for ROIs included in the ACTION network mask. NT subjects exhibited greater activation in bilateral IFG and bilateral pSTS, whereas only in right pSTS did ASD subjects show greater activation than NT subjects.
ELR contrasted with EOC
This contrast focused on how averted gaze, specifically, would influence brain activation in the two subject groups. Both the ELR and the EOC conditions targeted eye processing, with the difference being that the ELR condition involved averted gaze. Analysis using the FACE mask found no ROIs that were more active for ASD subjects than for NT subjects, whereas bilateral fusiform gyrus and left amygdala were found to be more active in NT subjects. Of the TOM ROIs, NT subjects activated small regions of precuneus more than did ASD subjects although ASD subjects exhibited greater activation than did NT subjects in other areas of precuneus as well as in PCC and paracingulate. NT subjects showed greater activation in right pSTS than did ASD subjects in the ACTION network analysis. ASD subjects displayed greater activation than did NT subjects in the left frontal operculum.
EOC contrasted with MOC
This contrast was performed to determine whether brain activation differences could be detected between ASD and NT participants as a function of whether eyes were targeted by the task. Both the EOC and the MOC conditions required face processing; however, only the EOC condition required specific focus on eyes. Consequently, this analysis was conducted to provide information about how processing of eyes, in particular, affects brain activation. In the analysis using the FACE mask, each group exhibited greater activation than the other in different areas of left fusiform gyrus, but only NT subjects demonstrated greater activation in bilateral areas of amygdala. Using the TOM mask, each group likewise showed greater activation than the other in different regions of PCC and paracingulate, but only NT subjects exhibited greater activation in precuneus. In the analysis of ACTION network ROIs, NT subjects showed greater activation in bilateral areas of pSTS than did ASD subjects, although in an area of right pSTS, ASD subjects had greater activation than did NT subjects.
Effective connectivity analysis
In the connectivity analysis, IMaGES is given the ROIs from one of the three theory-based networks for every subject in each candidate condition. For example, all NT subjects in the ELR, EOC, and MOC conditions are processed in parallel with IMaGES using the FACE-theory-based ROIs, producing a Markov equivalence class (MEC) of solutions that have similar goodness of fit to all the time series in the set. This group-wise set is ordered by BIC for a given set of time series. The MEC never contained more than a few solutions and typically only one. In the cases with more than one solution, the one with the largest BIC score was always selected as the “best” model relative to the others. Differences between solutions tended to involve the orientation of a single edge, but never adjacency. Each graph also contains connections with scaled regression coefficients (scaled to the normalized residual time series, between −2.0 and 2.0) that pass threshold at p<10⁻⁵, with t values typically larger than t=4.0, indicating the high statistical reliability of the solutions. Nodes are estimated as the mean voxel value in the solution that represents the diagonal of the covariance matrix, while the estimated covariance is reconstructed from the connectivity of the model and compared with the covariance calculated from the time series.
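The scoring and covariance-reconstruction steps described above can be sketched as follows. This is not the IMaGES implementation: it is a minimal illustration that assumes a linear Gaussian model, treats time points as independent, averages a larger-is-better BIC score (2 × log-likelihood minus the parameter penalty) over subjects, and reconstructs the model-implied covariance as (I − B)⁻¹ Ω (I − B)⁻ᵀ. The data are simulated and all function and variable names are hypothetical.

```python
import numpy as np

def node_loglik(y, parents):
    """Gaussian log-likelihood of one node regressed on its parents by OLS."""
    n = y.size
    if parents.shape[1] > 0:
        beta, *_ = np.linalg.lstsq(parents, y, rcond=None)
        resid = y - parents @ beta
    else:
        resid = y
    sigma2 = resid @ resid / n  # maximum-likelihood residual variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

def bic_score(data, dag):
    """Larger-is-better BIC; dag maps node index -> list of parent indices."""
    data = data - data.mean(axis=0)
    n = data.shape[0]
    loglik = sum(node_loglik(data[:, j], data[:, pa]) for j, pa in dag.items())
    k = sum(len(pa) for pa in dag.values()) + len(dag)  # edges + variances
    return 2 * loglik - k * np.log(n)

def implied_cov(B, omega):
    """Covariance implied by x = Bx + e, Cov(e) = omega; B[i, j] is edge j -> i."""
    inv = np.linalg.inv(np.eye(B.shape[0]) - B)
    return inv @ omega @ inv.T

def simulate_subject(rng, n=500):
    """Simulated 'time series' for three nodes with true structure X -> Z <- Y."""
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    z = 0.8 * x + 0.8 * y + rng.standard_normal(n)
    return np.column_stack([x, y, z])

rng = np.random.default_rng(0)
subjects = [simulate_subject(rng) for _ in range(3)]

# Two candidate graphs with the same adjacencies but different orientations.
candidates = {
    "collider": {0: [], 1: [], 2: [0, 1]},  # X -> Z <- Y
    "chain": {0: [], 2: [0], 1: [2]},       # X -> Z -> Y
}
mean_scores = {
    name: np.mean([bic_score(s, dag) for s in subjects])
    for name, dag in candidates.items()
}
best = max(mean_scores, key=mean_scores.get)  # largest mean score wins

# Covariance implied by the generating model, for comparison with the data.
B = np.zeros((3, 3))
B[2, 0] = B[2, 1] = 0.8
sigma_model = implied_cov(B, np.eye(3))
sigma_sample = np.cov(subjects[0], rowvar=False)
print(best, np.round(sigma_model, 2), np.round(sigma_sample, 2), sep="\n")
```

The two candidates here are deliberately not Markov equivalent, so their scores differ; under a linear Gaussian score, graphs within a single equivalence class fit the data equally well.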
Face processing
Of the five diagnostic tasks, three (ELR, EOC, and MOC) have primary face attributes. These were used to compare face processing in ASD and NT participants. The EO condition was not included because of the presence of objects in the stimulus, and the AO condition contains no face features. The graphical models of the face-processing network for ASD and NT participants are shown in Figure 8. NT subjects demonstrated similar patterns for the EOC and MOC conditions, but a different pattern for the ELR condition. Specifically, in the ELR condition, an additional edge was observed between pSTS and fusiform gyrus. The connectivity pattern for all three conditions in ASD subjects was similar to that of the NT subjects. This result suggests that ASD subjects engaged the FACE network in a manner similar to that of NT subjects.

Figure 8. Effective connectivity of the FACE network for ASD and NT subjects for tasks designed to engage face processing. Numbers on the edges reflect the regression values for those edges.
Theory of mind
The EO and ELR conditions were used to assess the effective connectivity of the TOM network (see Fig. 9). These conditions possess the strongest elements of joint attention insofar as they involved oriented eyes. As was found for the face-processing network, the NT participants showed highly similar connectivity patterns for both conditions. In contrast, the pattern observed for ASD subjects was markedly different between the two conditions. Moreover, neither of the patterns found for the ASD subjects was similar to that seen for the NT group. A notable difference for ASD subjects in the ELR condition is the absence of an edge between amygdala and paracingulate gyrus, which is seen in the EO condition for ASD subjects and in both the ELR and EO conditions for NT subjects.

Figure 9. Effective connectivity of the TOM network for ASD and NT subjects for tasks designed to engage TOM. Numbers on the edges reflect the regression values for those edges.
Action understanding
For this analysis we compared the AO, EO, and ELR conditions. Each of these conditions depicted directionality, an implied correlate of movement or action (again note the overlap with TOM). The graphical models obtained for the ACTION network are shown in Figure 10. As was found for the FACE and TOM networks, NT subjects showed similar patterns across the three stimulus conditions whereas ASD subjects had markedly different connectivity patterns across the same conditions. The ASD group did exhibit a connectivity pattern similar to that of the NT group, but only for the AO condition. This similarity is consistent with the behavioral data showing little difference between ASD and NT subjects in the AO condition. Once again, the greatest difference between the ASD and NT groups was found for the ELR condition.

Figure 10. Effective connectivity of the ACTION network for ASD and NT subjects for tasks designed to engage action understanding. Numbers on the edges reflect the regression values for those edges.
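The cross-task comparisons reported for the three networks are qualitative judgments of graph similarity. One way such a comparison could be quantified, offered here only as an illustrative sketch and not as a method used in this study, is the Jaccard overlap of the adjacencies in two graphs, averaged over all task pairs within a group. The ROI labels and edge lists below are placeholders and do not correspond to the edges in the figures.

```python
from itertools import combinations

def adjacency_set(edges):
    """Undirected adjacencies of a graph; edge orientation is ignored."""
    return {frozenset(edge) for edge in edges}

def jaccard(edges_a, edges_b):
    """Shared adjacencies divided by all adjacencies present in either graph."""
    a, b = adjacency_set(edges_a), adjacency_set(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def mean_pairwise_similarity(graphs):
    """Mean Jaccard overlap over all pairs of tasks (graphs: task -> edges)."""
    pairs = list(combinations(graphs.values(), 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Placeholder graphs: same adjacencies across tasks vs. none in common.
stable_group = {
    "task1": [("roi1", "roi2"), ("roi2", "roi3")],
    "task2": [("roi2", "roi1"), ("roi2", "roi3")],
}
variable_group = {
    "task1": [("roi1", "roi2"), ("roi2", "roi3")],
    "task2": [("roi1", "roi3")],
}

print(mean_pairwise_similarity(stable_group))    # identical adjacencies
print(mean_pairwise_similarity(variable_group))  # no shared adjacencies
```

Ignoring orientation makes the measure insensitive to differences that fall within one equivalence class; a structural Hamming distance would be an alternative if edge direction should count.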
General Discussion
Three brain networks have been implicated in impaired social cognition: (1) face processing, (2) TOM, and (3) action understanding. This study examined how the recruitment of these three networks differs between individuals with ASD and NTs when engaged in various tasks related to social cognition.
The behavioral results affirm that individuals with ASD, even those who are high functioning, have difficulty processing social cues relative to NTs. ASD participants were slower to respond and less accurate than were NTs, particularly when the task required attention to eyes (EO, ELR, and EOC). ASD participants did not differ significantly from NTs in tasks that did not require eye processing (AO and MOC). These results suggest that individuals with ASD do not have difficulty processing a nonsocial directional cue such as an arrow, nor are they impaired in a face-processing task that does not involve attention to eyes. However, our ASD participants were particularly impaired when processing averted gaze (EO and ELR). To the extent that joint attention requires the ability to follow averted eye gaze, it is not surprising that individuals with ASD have difficulty engaging in joint attention.
To map brain activation to the behavioral differences we observed between ASD and NT participants, we took two approaches to examining the fMRI data. The first approach was to conduct a GLM analysis in which brain activation patterns were contrasted between tasks and between our subject groups. Specifically, we compared the activation in ROIs related to the three networks associated with social cognition (FACE, TOM, and ACTION). The GLM analysis revealed differential recruitment of brain areas by ASD and NT participants. In general, NT participants appeared to engage brain areas associated with the three networks to a greater extent than did ASD participants. The exception to this pattern appeared in ROIs associated with the TOM network. ASD participants demonstrated significantly more activation of the TOM network than did NT participants for EO>ELR and ELR>EOC contrasts. This result is consistent with research that suggests connectivity abnormalities in ASD brains (Belmonte and Yurgelun-Todd, 2003; Rubenstein and Merzenich, 2003) and may explain why the tasks involving the processing of averted gaze (EO and ELR) were performed least well by ASD participants.
The GLM analysis provided evidence that the activation of ROIs differed across tasks for ASD and NT participants. However, GLM analysis alone does not reveal how the regions within a network interact to support cognitive performance. To this end, we conducted graph analyses of each network (FACE, TOM, and ACTION) for each task and subject group. We found that NT participants demonstrated similar connectivity in a given network across tasks used to recruit that network. For the ASD participants, on the other hand, connectivity patterns not only differed from those of NT participants but also differed across tasks used to recruit the same network. This pattern in ASD participants was particularly evident for the TOM and ACTION networks. ASD recruitment of the FACE network did not differ greatly from that of NT subjects. It should be noted that the greatest difference between ASD and NT subjects was found for the ELR condition, the condition that also resulted in the poorest behavioral performance for ASD subjects.
Taken together, the behavioral, GLM, and effective connectivity analyses indicate that the source of social cognitive impairment in individuals with ASD is not isolated to processing faces, engaging in TOM, or understanding action. Rather, the effective connectivity analyses indicate that individuals with ASD recruit the brain networks associated with social cognition very differently than do NTs. Additional research should focus on identifying the extent to which individuals with ASD differ from NTs in recruiting other brain networks such as those associated with attention, memory, or reward. It is possible that atypical effective connectivity in individuals with ASD is restricted to networks associated with social cognition. However, it is also possible that the atypical recruitment observed for social cognition networks extends to other networks not exclusively associated with social cognition. The latter possibility is consistent with a growing body of research that finds more general connectivity problems in the brain response of individuals with ASD (e.g., Assaf et al., 2010; Belmonte and Yurgelun-Todd, 2003; Boersma et al., 2013; Rubenstein and Merzenich, 2003).
Acknowledgments
The authors would like to thank Dr. Charles Cartwright for his recruitment and psychiatric assessment of the participants with ASD. This study received funds from the NJ Governor's Council for Medical Research and Treatment of Autism and the James S. McDonnell Foundation.
Author Disclosure Statement
None of the authors have received any financial compensation for their effort. None of the authors have any commercial affiliations that would create a conflict of interest with the research reported in this article.
