Abstract
Impairments in theory of mind (ToM), long considered common among individuals with autism spectrum disorder (ASD), are in fact highly heterogeneous across this population. Although such heterogeneity should be reflected in differential recruitment of neural mechanisms during ToM reasoning, no research has yet uncovered a mechanism that explains these individual differences. In this study, 78 adolescents (48 with ASD) viewed ToM vignettes and made mental-state inferences about characters’ behavior while participant electrophysiology was concurrently recorded. Two candidate event-related-potentials (ERPs), the late positive complex (LPC) and the late slow wave (LSW), were successfully elicited. LPC scores correlated positively with ToM accuracy and negatively with ASD symptom severity. Notably, the LPC partially mediated the relationship between ASD symptoms and ToM accuracy, which suggests that this ERP component, thought to represent cognitive metarepresentation, may help explain differences in ToM performance in some individuals with ASD.
Deficits in theory of mind (ToM), or the ability to understand and make inferences about the mental states of others, are among the most well-studied aspects of social cognition in autism spectrum disorder (ASD; Baron-Cohen, 2000; Frith, 2012; Tager-Flusberg, 2007). As with many aspects of the ASD phenotype (Pelphrey et al., 2011; Rice et al., 2012), decades of research have revealed substantial heterogeneity in ToM performance in individuals with ASD (Happé, 1995; Moran et al., 2011; White et al., 2014), particularly among more cognitively able youths (Altschuler et al., 2018). One possibility is that despite intrinsic dysfunction in the neural systems underpinning ToM reasoning, individuals with ASD may learn behavioral and cognitive strategies to compensate for deficits in ToM skills (Livingston et al., 2019). Evidence of these compensatory strategies is also thought to be reflected in the variability exhibited during discrete stages of neural processing during ToM reasoning (White et al., 2014), but such stages have yet to be captured with precision. Electrophysiological methods, such as electroencephalography (EEG), offer high temporal resolution ideal for examining specific and sequential stages of cognitive processing, which would facilitate examination of such stages and their consequent associations with behavioral activity during ToM reasoning in individuals with ASD.
Although some researchers have begun exploring EEG-derived event-related-potentials (ERPs) related to ToM in typically developing (TD) populations (Geangu et al., 2013; Liu et al., 2004; Liu, Meltzoff, & Wellman, 2009; Liu, Sabbagh, et al., 2009; Meinhardt et al., 2011; Sabbagh, 2004; Sabbagh & Taylor, 2000), no research to date has examined ERPs in relation to ToM in youths with ASD. Such work is essential for determining whether differences in neural activity can help explain variability in ToM performance between individuals with and without ASD as well as whether differences in neural activity can explain individual differences in ToM ability across ASD. Adolescence is a developmental period marked by unique social vulnerability for individuals with ASD (Picci & Scherf, 2015). Because the importance of peer relationships and social demands increase during adolescence, youths with ASD are at greater risk of experiencing higher levels of peer rejection and demonstrate difficulties in forming close, enduring friendships (Mendelson et al., 2016). These social challenges are likely due in part to difficulties in making more nuanced inferences about the thoughts, feelings, and motivations of their peers, skills necessary for successful social interactions. Indeed, adolescents with ASD demonstrate both behavioral deficits in ToM task performance (Kaland et al., 2008) and aberrant neural organization and activation of neural pathways implicated in ToM processing (Kana et al., 2015). Thus, research in this area is valuable for advancing the social neuroscience of ASD, advancing the search for biologically homogeneous subgroups of individuals with ASD, and providing more refined and individualized intervention targets. In the present study, we offer the first investigation of ERPs associated with ToM reasoning and behavioral accuracy in youths with ASD.
ToM in ASD
Note that ASD symptom severity is not a reliable indicator of individual differences in ToM reasoning (i.e., people with increased autism symptoms do not always exhibit increased deficits in ToM and vice versa; Brunsdon & Happé, 2014; Jones et al., 2018; Ziv et al., 2014; though see Mazza et al., 2017). Furthermore, a growing body of literature suggests that ToM reasoning is not a unitary construct but, rather, comprises multiple discrete cognitive processes that are differentially evoked during ToM reasoning tasks (Liu, Meltzoff, & Wellman, 2009; Saxe & Powell, 2006). For instance, both primary representation (i.e., literal representation of the current perceived situation) and metarepresentation (i.e., internal representation, involving additional manipulation or interpretation of a primary representation) are thought to be necessary to engage in ToM reasoning (Leslie, 1987). Indeed, this latter stage is thought to be specifically affected in some youths with ASD (Leslie & Frith, 1990), although no direct measure of metarepresentation in ASD has yet been identified. Thus, differences in ToM abilities among individuals with ASD may reflect within-persons variability in engagement of distinct neurocognitive processes across different types of ToM reasoning tasks (Frith et al., 1994; Senju, 2012).
Within-persons variability of ToM performance is supported by studies that have shown that individuals with ASD frequently demonstrate uneven performance on ToM reasoning tasks, given that some individuals with ASD perform similarly to TD peers (Schneider et al., 2013; Senju, 2012; Senju et al., 2009). However, some individuals with ASD show no overt behavioral deficits in ToM reasoning tasks yet still experience significant challenges engaging in ToM reasoning in their daily lives (Brunsdon & Happé, 2014; Livingston et al., 2019). Because behavioral ToM measures are sensitive to both behavioral deficits in ToM skills and alternative cognitive and behavioral strategies used to make up for impaired ToM skills, behavior alone is not an adequate measure of the mechanism by which ToM functions in ASD. Use of lab-based behavioral ToM measures alone will almost certainly fail to identify individuals with ASD who use alternative strategies to compensate for differential recruitment of these ToM-related mechanisms and still experience challenges with ToM reasoning in more complex real-world scenarios (Livingston et al., 2019; Livingston & Happé, 2017). Examination of the relationship between behavioral accuracy and neural activity provides a viable strategy for detecting more subtle differences in ToM processing both within and between individuals when overt behavior appears typical across ToM measures.
Neuroimaging Studies of ToM
Considerable functional MRI (fMRI) work with individuals with and without ASD has identified specific brain regions involved in ToM reasoning, including the temporoparietal junction (TPJ) and medial prefrontal cortex (mPFC; Nijhof et al., 2018; Schurz et al., 2014). Evidence suggests that the right TPJ (RTPJ) is uniquely implicated in ToM processes, especially those that require reasoning about other people’s thoughts (Saxe & Powell, 2006). fMRI studies have shown that compared with TD individuals, individuals with ASD exhibit atypical activation of the RTPJ during ToM tasks (Castelli et al., 2002; Kana et al., 2009, 2015; Mason et al., 2008; White et al., 2014). The mPFC appears to support other aspects of social cognition (e.g., executive-functioning skills and response selection) required for successful ToM reasoning but does not appear to be independently sufficient for thinking about other people’s mental states (Saxe, 2010). fMRI research therefore indicates a primary role for the TPJ and a supporting role for the mPFC in ToM processing. Another line of neuroimaging research has recently used magnetoencephalography (MEG) to investigate ToM processes in TD children and children with ASD (Yuk et al., 2018). Findings showed that whereas TD children and children with ASD did not differ on behavioral accuracy on a false-belief ToM task, children with ASD showed decreased TPJ activation and increased right inferior frontal gyrus activity compared with the TD group (Yuk et al., 2018). According to one interpretation of the findings by the authors, these results suggest that frontal brain regions may compensate for impaired TPJ activity during false-belief reasoning in ASD (Yuk et al., 2018).
To date, however, studies of ToM reasoning in ASD have been limited to comparisons of neural activity and behavioral responses across ToM tasks. Exploring the temporal dynamics of neural activation during ToM tasks is essential for deciphering how and when ToM reasoning unfolds in ASD and for pinpointing when in the processing pipeline individuals with ASD experience differences in ToM reasoning. Analysis of ERPs extracted from EEG recordings provides the temporal precision required for exploring differences observed in ToM processing between groups, as well as between and within individuals, and increases the feasibility of collecting sufficient data among clinical populations with behavioral challenges and sensory sensitivities compared with alternative neuroimaging methods. ERPs therefore provide an ideal method for uncovering temporally precise information about the neural activity implicated in cognitive processes such as ToM in individuals with ASD.
Although no previous research has examined the relationship between behavioral performance on ToM tasks and concurrent neurocognitive mechanisms (i.e., linking accuracy and neural activity) in ASD, previous ERP research in TD populations (Jiang et al., 2016; Kuhn-Popp et al., 2013; Liu et al., 2004; Liu, Sabbagh, et al., 2009; Meinhardt et al., 2011, 2012; Sabbagh & Taylor, 2000; Wang et al., 2010; Zhang et al., 2009) suggests that correct ToM reasoning is associated with slow wave brain activity 300 to 1,500 ms after stimulus presentation, and differences in scalp distribution and morphology are present across development (Bowman et al., 2012). In previous work, activity in this window has been segmented into two overlapping ERPs referred to most frequently as a late positive component (LPC) and a late slow wave (LSW; Geangu et al., 2013; Jiang et al., 2016; Kuhn-Popp et al., 2013; Liu et al., 2004; Liu, Meltzoff, & Wellman, 2009; Liu, Sabbagh, et al., 2009; Meinhardt et al., 2011, 2012; Sabbagh, 2004; Sabbagh & Taylor, 2000; Wang et al., 2010; Zhang et al., 2009).
The LPC has usually been identified as a deflection occurring between 300 and 600 ms after a stimulus. The LPC is thought to index aspects of attention associated with the processing of conditional salience or expectancy during the initial stages of spontaneous ToM reasoning (Meinhardt et al., 2011). Metarepresentational processes have also been theorized to occur during this initial phase of ToM reasoning, and thus, the LPC may be implicated in early stages of metarepresentation (Jiang et al., 2016; Kuhn-Popp et al., 2013; Meinhardt et al., 2012; Wang et al., 2010; Zhang et al., 2009). In typical populations, the LPC shows a broad, centrally focused scalp distribution (most pronounced at parietal sites) in adults and a more posterior distribution in children (Meinhardt et al., 2011).
The LSW occurs after the LPC in a somewhat variable poststimulus time window that appears to vary according to the age of the sample: from 600 to 900 ms in TD adults to 750 to 1,450 ms in TD children. The LSW is believed to be a marker of relatively later, more elaborative ToM reasoning processes such as making and reasoning about mental-state attributions (Geangu et al., 2013; Kuhn-Popp et al., 2013; Liu et al., 2004; Liu, Meltzoff, & Wellman, 2009; Liu, Sabbagh, et al., 2009; Meinhardt et al., 2011, 2012; Sabbagh & Taylor, 2000). In TD adults, a midfrontal and right-posterior LSW was observed during a ToM task requiring reasoning about beliefs (Liu, Meltzoff, & Wellman, 2009). However, a ToM false-belief study found the LSW to be centrally located in the midfrontal region for TD adults and more broadly distributed across anterior scalp sites in TD children (Meinhardt et al., 2011). The timing and scalp distribution of the LSW have not previously been explored in adolescent populations.
The topographical and temporal variability of the LSW raises an important question about whether this component is best understood as distinct from or a continuation of the LPC. Identifying this distinction is particularly important for delineating the time course of ToM processing in ASD. In addition, although evidence shows that the LPC and LSW are elicited during the 300-to-1,500-ms poststimulus time window during ToM tasks in TD samples, the extent to which each component specifically relates to ToM reasoning remains unclear, particularly in the ASD population, in which no study has examined ERPs related to ToM.
In the current study, we therefore aimed to explore whether the LPC and LSW are elicited in response to stimulus presentation in a novel ToM EEG task in youths with and without ASD. It was hypothesized that the ERPs would be elicited differentially in response to the correct condition (ToM) compared with the incorrect condition (which does not involve ToM reasoning). Furthermore, it was expected that the magnitude of the difference in amplitude between the ERPs elicited to correct and incorrect ToM conditions would relate to behavioral response accuracy on the task (providing evidence that these components are indices of neural mechanisms involved in ToM reasoning). Finally, ASD status both categorically (diagnosis via clinical cutoff on the Autism Diagnostic Observation Schedule, 2nd edition [ADOS-2]; Lord et al., 2012) and dimensionally (ADOS-2 Comparison Score [CS]) was expected to relate to (a) behavioral performance on the ToM EEG task and (b) the relative size of the neural response elicited to presentation of correct compared with incorrect ToM conditions during the EEG task.
Method
Participants
Participants were 78 adolescents (48 with ASD), ages 11 to 17 (M = 13.08 years, SD = 1.83), with IQ of at least 70 (M = 106.85, SD = 14.59). Full participant demographics are presented in Table 1 and group comparisons are presented in Table 2.
Participants’ Demographics and Test Scores
Note: Values are ns unless otherwise specified. ASD = autism spectrum disorder; FSIQ = full-scale IQ measured with the Kaufman Brief Intelligence Test, 2nd edition (KBIT-2; Kaufman & Kaufman, 2004); VIQ = verbal IQ measured by the KBIT-2; ADOS-2 CS = Autism Diagnostic Observation Schedule, 2nd edition (Lord et al., 2012) comparison score; SCQ = Social Communication Questionnaire, Lifetime version (Rutter et al., 2005); ToM = theory of mind. Higher ADOS-2 CS and SCQ scores indicate greater ASD symptoms and severity. Mean annual household income was calculated separately for each group according to the number of valid cases (ASD: n = 42; M = 8.57, SD = 4.00; non-ASD: n = 29; M = 8.55, SD = 2.95) and corrected using Levene’s test for equality of variances.
a. ASD: n = 46; b. non-ASD: n = 29.
Tests of the Demographic and Test-Score Differences Between the ASD and Non-ASD Groups
Note: Tests are two-tailed independent-samples t tests unless otherwise specified. ASD = autism spectrum disorder; FSIQ = full-scale IQ measured with the Kaufman Brief Intelligence Test, 2nd edition (KBIT-2; Kaufman & Kaufman, 2004); VIQ = verbal IQ measured by the KBIT-2; ADOS-2 CS = Autism Diagnostic Observation Schedule, 2nd edition (Lord et al., 2012) comparison score; SCQ = Social Communication Questionnaire, Lifetime version (Rutter et al., 2005); ToM = theory of mind.
Noninteger df is reported because equal variances between groups were not supported, and so a t test not assuming equal variance was used.
Procedures
Participants were recruited via community-based clinical organizations, a commercial mailing list from communities near Stony Brook University, and consented follow-up with participants who had participated in past studies. Before enrollment, all participants completed a phone screen to assess eligibility using the following inclusion criteria: adolescent and primary caretaker English language proficiency sufficient to complete study assessments, verbal ability consistent with administration of the ADOS-2 (Lord et al., 2012) Module 3 or 4, and no current or history of a significant medical disability or disorder, known developmental disability aside from ASD, or significant cognitive impairment (measured as IQ < 70). Eligible participants then attended an initial screening visit in the lab, during which all phone-screened eligibility criteria were confirmed and diagnostic and cognitive assessments administered. The current study was approved by the university’s institutional review board, and all participants provided informed consent and were compensated for their participation.
Measures
Participant demographics and developmental and medical history data were obtained via a study-specific Developmental History Form. Verbal and overall intellectual ability were assessed using the Kaufman Brief Intelligence Test, 2nd edition (Kaufman & Kaufman, 2004), and eligible participants scored 70 and higher on full-scale IQ. Pubertal status was assessed using the self-report version of the Pubertal Development Scale (PDS; Carskadon & Acebo, 1993). The PDS includes five items rating physical development on a 4-point Likert scale ranging from not yet started (1) to seems completed (4). PDS Items 1 through 3 assess the onset of the pubertal growth spurt, growth of body hair, and skin changes, and Items 4 and 5 are gender specific, assessing voice changes and growth of facial hair in males and breast growth and menarche in females. Calculating the average of the five physical development items yields the PDS total score.
Recent research suggests that the phenotypic structure of ASD symptoms is best conceptualized as a multidimensional continuum that extends into the general population (Constantino et al., 2004; Kim et al., 2019). Such work also indicates that the common research practice of comparing categorically defined, diagnostically distinct groups of participants increases the likelihood of finding group differences while obscuring important information that could increase the understanding of neurodevelopmental disorders (Kim et al., 2018). To address this problem, ASD research must include nontraditional control samples: individuals not affected by ASD but whose phenotypic heterogeneity is comparable with the level observed in ASD. ASD symptoms were therefore assessed in all participants using the parent-report Social Communication Questionnaire–Lifetime version (SCQ; Rutter et al., 2005), a widely used screener for ASD, and the clinician-rated ADOS-2 (Lord et al., 2012), the “gold-standard” tool for ASD diagnosis administered by research-reliable examiners. The ADOS-2 CS is used as an index of ASD symptom severity outcome. Participants with a prior ASD diagnosis were included in the ASD group after meeting a high sensitivity score of 11 or higher on the SCQ and an overall total score of 6 or higher on the ADOS-2. Participants with no previous diagnosis who met the clinical cutoffs for ASD on the SCQ and ADOS-2 were further assessed using the parent-report Autism Diagnostic Interview Revised (ADI-R; Le Couteur et al., 2003), a complementary parent interview paired with the ADOS-2 to achieve “gold-standard” diagnostic confirmation. A high sensitivity score of 6 or higher on the ADI-R was applied for ASD. Participants scoring below cutoff on the SCQ, ADOS-2, or (when administered) ADI-R were placed in the non-ASD group; participants meeting cutoff scores on the SCQ, ADOS-2, and (when administered) ADI-R were placed in the ASD group.
EEG acquisition
Data-collection procedures adhered to best practices for EEG data collection in ASD (Webb et al., 2015). Full EEG recording procedures are presented in the Supplemental Material available online.
EEG ToM paradigm
The EEG paradigm used to assess ToM reasoning was an adaptation of the SELweb ToM module (McKown et al., 2016). The original SELweb ToM paradigm has strong construct validity generally (McKown, 2019; McKown et al., 2013, 2016) and in ASD populations in particular (Russo-Ponsaran et al., 2019). Participants viewed illustrated, narrated ToM vignettes and were asked to make mental-state inferences about characters’ behavior (McKown et al., 2016). The task was adapted by augmenting the original SELweb ToM item set for additional trials, adding fixation crosses, setting the timing and presentation of stimuli to facilitate ERP acquisition (see below), and reducing the number of response options presented from four to three for each vignette. This was done to maintain task difficulty while reducing the likelihood that response choice was due to random selection and to balance task length, engagement, and fidelity to the original task.
Participants were instructed that they would hear and see stories and would be asked to answer a question about each story. First, a nonsocial practice task was completed to familiarize participants with using the button box, the task format, and the response-window timing. The practice task presented images (e.g., a ball or a car), and participants indicated which image they had been shown. Participants learned that response options would be shown individually before the response window and that they were to select their response when all response options were presented on screen simultaneously.
The ToM paradigm (Fig. 1) comprised 36 randomized trials presented in three blocks of 12 trials each, with self-paced breaks between blocks. On-screen still images were presented on a black background at a visual angle of 10.51° (width) × 10.51° (height) and accompanied by an audio description (≈50 dB) of each image. Images and audio narration were presented simultaneously. Each vignette comprised four individual story panel images presented sequentially with audio narration describing each image (e.g., Image 1: “Julia and Stephen are going to fly kites”; Image 2: “Stephen says, ‘The weather today is wonderful for flying kites, very windy’”; Image 3: “Once they are out of the house, it’s not windy anymore”) and followed by a 500-ms pause between panels. All story panel images remained on screen until audio narration of the fourth panel had concluded (e.g., “Julia says, ‘Sure is wonderful weather to fly a kite’”), at which point a question about the vignette was presented aurally (e.g., “Why does Julia say what she says?”). Story panels were then cleared, followed by presentation of a fixation cross (1,100 ms–1,200 ms jittered). Next, three individual response options were presented sequentially, allowing target ERPs to be time-locked to the initial presentation of correct and incorrect response-option images.
Each trial included one correct response (e.g., “She’s disappointed and is being sarcastic”) and two incorrect responses (e.g., “She hates flying kites” and “She thought it was good weather to fly a kite”), for a total of 36 correct ToM response options and 72 incorrect ToM response options presented across 36 vignettes. Because of a coding error, three incorrect response options were excluded from analysis, resulting in 69 incorrect response options. Correct and incorrect response options were presented in randomized order, each consisting of an image displayed for 4,500 ms paired with a simultaneous audio description of the image. A fixation cross was displayed (1,100 ms–1,200 ms jittered) on screen between response options. After presentation of the final response option, all three response options were displayed on screen simultaneously, left to right in the same order initially presented. Participants were instructed to press the button on the button box corresponding to the colored circle displayed beneath the image of their response choice. Trials did not advance until a response was made. Once a response was received, the trial ended, and a fixation cross was displayed (500 ms–1,000 ms intertrial interval). Mean task length for all participants was 24.23 min (SD = 1.92), and diagnostic groups did not significantly differ in the amount of time to complete the task (p > .62). Behavioral responses were recorded, and behavioral accuracy was calculated as the number of correct responses divided by the total number of trials.
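The trial bookkeeping and accuracy score described above can be illustrated with a minimal sketch. The function name `behavioral_accuracy` and the constant names are hypothetical and are not taken from the study's materials; only the counts (36 vignettes, three options each, three excluded incorrect options) come from the text.

```python
# Hypothetical names; only the counts are taken from the task description.
N_VIGNETTES = 36
OPTIONS_PER_VIGNETTE = 3        # one correct, two incorrect
N_EXCLUDED_INCORRECT = 3        # dropped because of the coding error

n_correct_options = N_VIGNETTES
n_incorrect_options = N_VIGNETTES * (OPTIONS_PER_VIGNETTE - 1) - N_EXCLUDED_INCORRECT
n_segments = n_correct_options + n_incorrect_options  # ERP segments available per participant


def behavioral_accuracy(responses, answer_key):
    """Return the number of correct responses divided by the number of trials."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    n_correct = sum(r == k for r, k in zip(responses, answer_key))
    return n_correct / len(answer_key)
```

Under these counts, each participant contributes at most 36 correct-option and 69 incorrect-option segments to the ERP averages, whereas accuracy is scored once per vignette.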

Example of a typical theory of mind vignette. Images and audio narration were presented simultaneously. 1. Each vignette comprised four individual story panel images, followed by presentation of response options. 2. Three individual response options were presented sequentially, allowing target event-related-potentials to be time-locked to the presentation of correct and incorrect response options. 3. All three response options were then displayed on screen simultaneously, and participants were instructed to press the button on the button box corresponding to the colored circle displayed beneath the image of their response choice.
Data analytic plan
EEG data processing
All data were offline band-pass-filtered from 0.1 Hz to 30 Hz and were rereferenced to the average of the TP9 and TP10 electrode sites (placed on the mastoids). Eye-blink and ocular corrections were conducted using a standard-regression-based algorithm (Gratton et al., 1983). Segments were time-locked to the initial presentation of correct and incorrect ToM response option images (Fig. 1) and were 1,700 ms long (−200 to 1,500 ms) with a 200-ms prestimulus baseline recorded before image presentation. Artifact rejection included a threshold of a voltage step exceeding 50.0 μV between sample points and a voltage difference of 200.0 μV within a 400-ms sliding window, and a minimum voltage difference of 0.50 μV within a 100-ms interval was applied. Bad channels were interpolated using equally balanced information from surrounding channels.
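The baseline correction and the three artifact criteria described above can be sketched for a single-channel segment as follows. This is a simplified illustration, not the processing software the authors used: the sampling rate is not stated in this section, so `sfreq` is a required argument, and the function names are hypothetical.

```python
def baseline_correct(segment, n_baseline):
    """Subtract the mean of the first n_baseline (prestimulus) samples."""
    base = sum(segment[:n_baseline]) / n_baseline
    return [v - base for v in segment]


def _window_ranges(segment, width):
    """Yield the max-minus-min voltage of every sliding window of `width` samples."""
    for start in range(len(segment) - width + 1):
        window = segment[start:start + width]
        yield max(window) - min(window)


def has_artifact(segment, sfreq, max_step=50.0, max_diff=200.0,
                 diff_win_ms=400, min_diff=0.5, flat_win_ms=100):
    """Return True if a segment (in microvolts) violates any rejection criterion.

    sfreq is the sampling rate in Hz (an assumption supplied by the caller).
    """
    # 1. Voltage step between adjacent sample points exceeds max_step.
    if any(abs(b - a) > max_step for a, b in zip(segment, segment[1:])):
        return True
    # 2. Voltage difference within a sliding window exceeds max_diff.
    diff_width = max(2, round(diff_win_ms * sfreq / 1000))
    if any(r > max_diff for r in _window_ranges(segment, diff_width)):
        return True
    # 3. Activity within an interval falls below min_diff (flat signal).
    flat_width = max(2, round(flat_win_ms * sfreq / 1000))
    if any(r < min_diff for r in _window_ranges(segment, flat_width)):
        return True
    return False
```

A segment flagged by `has_artifact` would be dropped before averaging; in practice the check would be applied per channel, with bad channels interpolated rather than whole segments discarded.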
After artifact rejection, out of 105 total possible trials, the number of artifact-free segments did not differ significantly between the non-ASD group (M = 102.23, SD = 6.40, range = 72–105) and the ASD group (M = 100.30, SD = 9.07, range = 63–105), t(74) = −1.01, p = .32. Groups did not differ significantly in the number of artifact-free trials in the correct ToM condition (ASD: M = 34.28, SD = 3.69, range = 17–36; non-ASD: M = 35.10, SD = 2.95, range = 21–36), t(74) = −1.02, p = .31, or the incorrect ToM condition (ASD: M = 66.02, SD = 5.61, range = 45–69; non-ASD, M = 67.13, SD = 3.59, range = 51–69), t(74) = −0.96, p = .34. Segments without artifacts were averaged separately for each participant and each ToM response condition (correct and incorrect).
ERP analysis
Analysis of ERP data included trials with both correct and incorrect behavioral responses. To test Hypothesis 1a, that the LPC and LSW would be elicited in response to stimulus presentation in a novel ToM EEG task, we created separate grand average waveforms for correct and incorrect ToM response conditions for ASD and non-ASD groups. Next, to identify the poststimulus epoch and scalp location at which each component is maximal in the current sample, we examined waveform morphology visually at parietal and frontal electrode sites during the 300-to-600-ms (LPC) and 600-to-1,200-ms (LSW) poststimulus time windows identified in previous work with TD populations (Liu et al., 2004; Liu, Meltzoff, & Wellman, 2009; Liu, Sabbagh, et al., 2009; Meinhardt et al., 2011).
The LPC and LSW were extracted as mean area amplitudes within two discrete poststimulus time windows (LPC: 300 ms–600 ms; LSW: 600 ms–1,200 ms) at five parietal electrode sites (P3, P4, Pz, P7, P8) and were extracted separately for correct and incorrect ToM conditions for each participant. Extracted ERP data for each ToM condition and time window were then averaged across parietal sites for each participant, creating four pooled parietal ERPs per participant (correct LPC, incorrect LPC, correct LSW, and incorrect LSW). Residualized difference scores (Meyer et al., 2017) were calculated to measure the difference in amplitude between the ERPs elicited to correct and incorrect ToM conditions (equivalent to incorrect LPC – correct LPC and incorrect LSW – correct LSW) across pooled parietal electrode sites. Because the LPC presented as a precursor to the LSW (see Results), statistical analyses were performed for the LPC only.
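The extraction and residualization steps can be sketched as below. Two points are assumptions rather than details confirmed by the text: the time window is treated as including its start and excluding its end, and the residualized score is computed as the residual from regressing incorrect-condition amplitude on correct-condition amplitude across participants, which matches the stated direction (incorrect relative to correct). All function names are hypothetical.

```python
from statistics import mean


def window_mean(erp, times_ms, start_ms, end_ms):
    """Mean amplitude of one averaged waveform within [start_ms, end_ms)."""
    return mean(v for v, t in zip(erp, times_ms) if start_ms <= t < end_ms)


def pooled_window_mean(site_erps, times_ms, start_ms, end_ms):
    """Average the window mean across electrode sites (e.g., P3, P4, Pz, P7, P8)."""
    return mean(window_mean(erp, times_ms, start_ms, end_ms) for erp in site_erps)


def residualized_scores(correct, incorrect):
    """Residuals of incorrect-condition amplitude after regressing on correct-condition amplitude.

    Each list holds one pooled amplitude per participant, in the same order.
    """
    mx, my = mean(correct), mean(incorrect)
    sxx = sum((x - mx) ** 2 for x in correct)
    sxy = sum((x - mx) * (y - my) for x, y in zip(correct, incorrect))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(correct, incorrect)]
```

A positive residual then indicates a participant whose incorrect-condition amplitude is larger than would be predicted from that participant's correct-condition amplitude, which avoids the dependence of a raw subtraction score on the two condition means.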
Statistical analysis
Bivariate correlations were run to test for relationships between participant age and verbal IQ with behavioral ToM performance, ASD symptom severity, residualized LPC, and residualized LSW.
To test Hypothesis 2, that the magnitude of the residualized LPC would be positively correlated with ToM performance on the EEG task, we ran bivariate correlations between residualized LPC and ToM performance.
To test Hypothesis 3a, that behavioral response accuracy on the EEG task (a) would be negatively correlated with ASD symptom severity and (b) would differ significantly between diagnostic groups, we first ran a bivariate correlation between ToM performance and ADOS-2 CS. Second, we ran independent groups t tests to assess for differences between ASD and non-ASD groups in behavioral response accuracy on the EEG task.
To test Hypothesis 3b, that the magnitude of the difference between the LPC elicited to correct versus incorrect ToM conditions would be associated with ASD symptom severity and, separately, ASD diagnostic status, we first ran a mixed model 2 × 2 repeated measures analysis of variance with clinical group as the between-subjects factor (two levels: ASD, non-ASD) and ToM response condition (two levels: correct, incorrect) as the within-subjects repeated measures factor. Second, we ran independent groups t tests to assess for group differences in the size of the residualized LPC. Third, we ran a bivariate correlation to test the relationship between the residualized LPC and ADOS-2 CS.
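For readers checking the reported correlations, the link between a Pearson correlation, its degrees of freedom, and the corresponding t statistic can be sketched with the standard library alone. The function names are hypothetical, and this illustration is not the statistical software used in the study.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """Return (t, df) for a correlation r computed on n paired observations.

    Uses t = r * sqrt(df / (1 - r^2)) with df = n - 2; requires |r| < 1.
    """
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r)), df
```

The resulting t value is evaluated against a t distribution with n − 2 degrees of freedom, which is why the correlations are reported in the form r(df).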
Results
Verbal IQ was positively associated with overall behavioral task performance, r(73) = .66, p < .001, but not with ASD symptom severity, r(76) = −.218, p = .055; residualized LPC, r(67) = .11, p = .38; or residualized LSW, r(67) = −.06, p = .65. Greater age was associated with smaller residualized LPC, r(67) = −.286, p = .017; greater residualized LSW, r(67) = .42, p < .001; and greater ASD symptom severity, r(78) = .24, p = .04; but not with behavioral task performance, r(75) = −.03, p = .83.
ToM performance during the EEG task did not differ significantly between the ASD group (M = .87, SD = .15) and the non-ASD group (M = .92, SD = .09), t(73) = 1.427, p = .158 (Table 2). When measured dimensionally, increased ASD symptoms (i.e., higher ADOS-2 CS) were associated with poorer ToM performance, r(76) = −.349, p = .002.
ToM-linked ERP components
Visual inspection of grand average waveforms and topographic maps (Fig. 2) revealed activity consistent with prior descriptions of both the LPC and LSW, which were maximal at parietal electrode sites (P3, P4, Pz, P7, P8; see Fig. S3 in the Supplemental Material). Data showed an expected deflection beginning 300 ms after ToM stimuli presentation, followed by LSW activity beginning around 600 ms and continuing until 1,200 ms to 1,400 ms after stimulus presentation and characterized by amplitude differentiation (statistical results presented below) between ERPs elicited by correct and incorrect ToM conditions.

Grand average event-related-potentials (ERPs; a) and topographical maps (b and c) for autism spectrum disorder (ASD), non-ASD, and combined diagnostic groups during two poststimulus time windows: (b) The late positive component (LPC; 300 ms–600 ms) and (c) the late slow wave (LSW; 600 ms–1200 ms). ERPs are recorded during the presentation of individual story panel pictures corresponding to correct and incorrect theory of mind (ToM) response options and pooled (averaged) across five parietal electrode sites (P7, P3, Pz, P4, P8; data from individual channels are represented in Fig. S3 in the Supplemental Material). Topographical maps represent the main effect of ToM response condition; mean amplitude difference is separated by diagnostic group and calculated by subtracting the ERP elicited by incorrect ToM conditions from the ERP elicited to correct ToM conditions.
Activity consistent with an LPC manifested as a positive-going deflection 300 ms to 600 ms after stimulus presentation and was most pronounced at parietal electrode sites. Both groups exhibited maximal LPC at posterior, parietal sites. Compared with the ASD group, the LPC activity in the non-ASD group showed a larger, more widely dispersed scalp distribution with positive activity apparent across parietal, central, and some frontal regions. In both groups, an LSW manifested as a continuation of the LPC, beginning approximately 600 ms after stimulus presentation and continuing at posterior electrode sites until 1,200 ms to 1,400 ms. Waveform morphology in the current sample does not support differentiating between two distinct waveforms. Statistical analyses performed for the LPC are presented below. Statistical analyses for the LSW were largely nonsignificant and are presented in the Supplemental Material.
Effect of correct condition compared with incorrect condition on LPC
Across all participants, there was a significant main effect of ToM response condition at pooled parietal electrode sites during the 300- to 600-ms time window, F(1, 67) = 12.05, p = .001, which indicates a significant difference in the amplitude of the LPC elicited to correct ToM response options (M = 6.58, SD = 6.90) compared with incorrect ToM response options (M = 7.92, SD = 7.43). The Condition × Group interaction effect was nonsignificant, F(1, 67) = 2.43, p = .12.
Relationship between LPC and ToM performance
There was a significant positive relationship between ToM performance and the residualized LPC, r(76) = .404, p = .001, such that a larger LPC response to incorrect ToM response options relative to correct ToM response options was associated with increased ToM performance (i.e., fewer errors) during the EEG task (see Fig. S1a in the Supplemental Material).
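The residualized LPC entering these correlations is described in this section only as a residualized difference score. One common construction, assumed here and not confirmed by the text, regresses each participant's incorrect-condition amplitude on the correct-condition amplitude and keeps the residual, so that positive scores mark a larger response to incorrect options than the correct-condition response would predict. The amplitudes and function name below are hypothetical.

```python
def residualized_scores(correct, incorrect):
    """Residuals from regressing incorrect-condition amplitude on
    correct-condition amplitude (simple OLS with an intercept)."""
    n = len(correct)
    mean_c = sum(correct) / n
    mean_i = sum(incorrect) / n
    sxx = sum((c - mean_c) ** 2 for c in correct)
    sxy = sum((c - mean_c) * (i - mean_i) for c, i in zip(correct, incorrect))
    slope = sxy / sxx
    intercept = mean_i - slope * mean_c
    return [i - (intercept + slope * c) for c, i in zip(correct, incorrect)]

# Hypothetical mean amplitudes for five participants (not study data).
correct_amp = [4.0, 6.0, 8.0, 10.0, 12.0]
incorrect_amp = [5.5, 6.5, 10.5, 10.5, 14.5]
lpc_resid = residualized_scores(correct_amp, incorrect_amp)
print([round(v, 2) for v in lpc_resid])
```

By construction the residuals average zero and are uncorrelated with the correct-condition amplitude, which is the usual motivation for preferring them over a raw subtraction score.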
Relationship between LPC, ASD status, and ASD symptom severity
The relative size of the LPC elicited to correct ToM conditions compared with incorrect ToM conditions differed by ADOS-2 diagnosis and was associated with ASD symptom severity measured dimensionally with the ADOS-2 CS. The residualized LPC negatively correlated with ADOS-2 CS, r(69) = −.349, p = .003, such that a larger LPC response to incorrect ToM response options relative to correct ToM response options was associated with reduced ASD symptom severity (see Fig. S1b in the Supplemental Material).
Post hoc analyses
Moderation by ASD status
To examine whether the relationships between the LPC, ToM performance, and ASD symptom severity were dependent on diagnostic group, we entered ASD status (ASD, non-ASD) into each regression model as a moderator.
In the model predicting ToM performance, moderation analyses revealed a significant ASD Status × Residualized LPC interaction, change in R2 = .127, F(1, 62) = 11.32, p = .001 (see Fig. S2 in the Supplemental Material). Post hoc probing revealed that when separated by diagnostic group, the effect of the residualized LPC on ToM performance was evident for the ASD group, b = 0.027, t(66) = 4.751, p ≤ .001, but not for the non-ASD group, b = −0.004, t(66) = −0.516, p = .608.
In the model predicting ASD symptom severity, a significant interaction effect was also found for ASD Status × Residualized LPC (change in R2 = .023), F(1, 65) = 7.29, p = .009. Post hoc probing revealed that when separated by ASD status, the effect of the residualized LPC on ASD symptom severity was significant for the ASD group, b = −0.275, t(66) = −4.374, p ≤ .001, but not for the non-ASD group, b = −0.025, t(66) = 0.274, p = .785.
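The moderation tests above amount to adding a product term to a regression and then probing the slope within each group. The following is a minimal sketch of that logic under stated assumptions: group is dummy coded (1 = ASD, 0 = non-ASD), the data are synthetic and noise free, and the helper names are hypothetical. It does not reproduce the software or the exact model specification used in the study.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * p for v, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def fit_moderation(lpc, group, outcome):
    """Fit outcome ~ lpc + group + lpc*group and return simple slopes."""
    rows = [[1.0, x, g, x * g] for x, g in zip(lpc, group)]
    b0, b_lpc, b_group, b_inter = ols(rows, outcome)
    return {
        "interaction": b_inter,
        "slope_non_asd": b_lpc,          # slope when group = 0
        "slope_asd": b_lpc + b_inter,    # slope when group = 1
    }

# Synthetic example: the LPC predicts accuracy only when group = 1.
lpc = [-2.0, -1.0, 0.0, 1.0, 2.0] * 2
group = [0] * 5 + [1] * 5
accuracy = [0.9 - 0.05 * g + 0.03 * x * g for x, g in zip(lpc, group)]
fit = fit_moderation(lpc, group, accuracy)
print({name: round(value, 3) for name, value in fit.items()})
```

With dummy coding, the coefficient on the residualized LPC is the slope for the reference group, and the interaction coefficient is the difference between the two group slopes, which is the quantity the reported F tests evaluate.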
Mediation of ASD status and ToM by LPC
Given the correlations between ASD symptoms, residualized LPC, and ToM behavioral accuracy, we first sought to explore, using bootstrapped mediation analyses, whether the relative size of the LPC to correct ToM conditions compared with incorrect ToM conditions mediated the relationship between ASD symptoms (as measured by ADOS-2 CS) and ToM skills (behavioral accuracy). Second, we explored whether the strength of the relationships between ASD symptom severity, residualized LPC, and ToM performance differed according to diagnostic group status; the interaction effect was nonsignificant (change in R2 = .024), F(1, 62) = 1.831, p = .181.
Analyses revealed a marginally significant direct effect (c′, p = .053) of ADOS-2 CS predicting behavioral ToM accuracy on the EEG task (Fig. 3). However, when controlling for variance in the LPC, ADOS-2 CS did not remain a significant predictor of ToM performance (95% confidence interval [CI] = [−0.017, 0.000]). The unstandardized indirect effect, ab = −0.005, SE = 0.004 (95% CI = [−0.015, −0.0001]), was significant, which indicates that the LPC significantly mediates the relationship between ADOS-2 CS and behavioral accuracy on the ToM EEG task. This mediation effect had a medium effect size: κ2 = .120, SE = .070 (95% CI = [0.011, 0.279]; Preacher & Kelley, 2011).

Mediation model showing the effect of ADOS-2 comparison score (CS) on behavioral accuracy on the theory-of-mind (ToM) electroencephalography task, as mediated by the residualized difference score of the late positive complex (LPC). On the path from ADOS-2 CS to ToM behavioral accuracy, the values below the arrow are from the model without the mediator, and the values above the arrow are from the model that included the mediator. Values on each path are unstandardized regression coefficients; standard errors are given in parentheses and 95% confidence intervals (CIs) in brackets. Asterisks indicate significant path coefficients (*p < .05, **p < .01). ADOS-2 = Autism Diagnostic Observation Schedule, 2nd edition (Lord et al., 2012).
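The logic of the bootstrapped indirect effect can be sketched as follows. This is an illustration under explicit assumptions, not the authors' procedure: it uses a plain percentile bootstrap (the study may have used a bias-corrected variant), simulated data with a known indirect effect of −0.15, and hypothetical function names. Path a is the slope of the mediator on the predictor, and path b is the slope of the outcome on the mediator with the predictor held constant.

```python
import random

def indirect_effect(x, m, y):
    """Return a*b: a from M ~ X, b from Y ~ X + M (partial slope for M)."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b

def bootstrap_ci(x, m, y, reps=1000, seed=0):
    """Percentile 95% CI for a*b from resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    return estimates[int(0.025 * reps)], estimates[int(0.975 * reps) - 1]

# Simulated data (not study data): symptoms lower the mediator (a = -0.5),
# and the mediator raises accuracy (b = 0.3), so a*b = -0.15.
sim = random.Random(1)
symptoms = [sim.gauss(0, 1) for _ in range(300)]
mediator = [-0.5 * s + sim.gauss(0, 0.5) for s in symptoms]
accuracy_sim = [0.1 * s + 0.3 * v + sim.gauss(0, 0.2)
                for s, v in zip(symptoms, mediator)]

ab = indirect_effect(symptoms, mediator, accuracy_sim)
ci_low, ci_high = bootstrap_ci(symptoms, mediator, accuracy_sim)
print(round(ab, 3), round(ci_low, 3), round(ci_high, 3))
```

As in the reported analysis, the indirect effect is treated as significant when the bootstrap interval excludes zero; no distributional assumption about the product a*b is required, which is the main reason bootstrapping is preferred for mediation tests.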
Age and verbal intelligence as covariates
To assess the potential effects of participant age and verbal intelligence on our findings, we reran all relevant moderation and mediation analyses with participant age and verbal IQ included as covariates. Results from these additional analyses were consistent with our previous findings and did not alter the pattern of effects in any of the models.
Effects of pubertal status
The PDS did not correlate with the residualized LPC, residualized LSW, or behavioral accuracy on the ToM task (all ps > .07), and covarying PDS did not attenuate any of the correlations represented in our primary analyses (all ps remained < .03).
Discussion
This was the first study to explore ERP correlates of ToM in adolescents with ASD. Findings show activity consistent with prior descriptions of LPC and LSW, which demonstrates that both ERP components are elicited at parietal electrode sites in response to a novel ToM EEG task in adolescents with and without ASD. Results showed that differentiation between ERPs elicited in response to correct and incorrect ToM conditions is maximal during the LPC epoch. In the current sample, LSW activity suggests a continuation of the LPC waveform. Exploratory analyses reveal that the relationship between ToM behavioral accuracy and ASD symptom severity is mediated by the LPC. Findings suggest that a neural measure that may reflect engagement of metarepresentational capacities during ToM processing is a potential contributing factor to the observed individual differences in ToM ability across ASD.
The LPC was maximal at parietal electrode sites and exhibited a main effect of ToM response condition. The number of errors made on the ToM EEG task did not differ significantly between ASD and non-ASD groups. However, ASD symptom severity was negatively associated with behavioral accuracy on the EEG ToM task. Behavioral accuracy on the EEG task and ASD symptom severity were both correlated with the relative size of the LPC elicited to correct ToM conditions compared with incorrect ToM conditions. Finally, results of exploratory post hoc analyses suggest that the relationship between ASD and ToM is in fact mediated by the LPC (i.e., operates at least partly through it).
To minimize the likelihood of spurious findings, in the current study, we adopted a conservative approach to ERP analysis using only the ERPs (LPC, LSW) most relevant to the extant empirical literature. However, as suggested by Meinhardt et al. (2011) and Dien et al. (2004), the latency and topographical distribution of the LPC show similarities with the P300, a component associated with allocation of attentional resources and context updating (Polich, 2007), processes integral to metarepresentation in ToM reasoning. Consistent with this, the LPC identified in the current study shares many of these features with the P300 component. The LSW identified in the current study occurred 600 ms to 1,200 ms after stimulus and appeared to function primarily as an extension of the LPC. This is consistent with evidence showing that the variability observed in similarly long-lasting positive components (e.g., the late positive potential [LPP], which shares morphological and temporal similarities with the LSW) may reflect ongoing, elaborative cognitive processes (Hajcak & Foti, 2020). In the context of ToM processing, the LSW may therefore serve an integral role in bridging the gap between early, more automated processes related to perception and attention to stimuli and relatively later, cognitive processing, integration, and manipulation of information. In the current study, the LPC and LSW appear to correlate differently with different aspects of behavior, and it is unclear whether this pattern is unique to parameters of the current EEG paradigm. Future studies examining these ERPs should continue to examine the LPC and LSW as distinct components but should quantitatively examine their overlap to determine when—and if—they ultimately may measure the same underlying construct. 
In addition, future studies using methodologies such as time-frequency analysis and principal component analysis are needed to further clarify areas of overlap and differentiation between the LPC and the LSW as well as temporally and morphologically similar ERPs (e.g., N2, P300, LPP) and their relation to specific task characteristics.
The relative size of the LPC elicited to correct and incorrect ToM conditions showed a significant association with ADOS-2 CS. Higher ADOS-2 CS (i.e., greater ASD symptoms) was associated with a larger LPC response to correct ToM conditions relative to incorrect ToM conditions (i.e., in the direction hypothesized). This suggests that differences in ToM processing arise at the point of transformation of external stimuli into internal mental representations (i.e., metarepresentation) such that individuals who show increased engagement during this process also show greater ASD symptoms. Conversely, people who are less cognitively engaged (i.e., people for whom the transformation process has become automatic) show reduced ASD symptoms.
When grouped by diagnostic status, these relations between the LPC, ToM performance, and ASD symptom severity were present only for the ASD group. This may have arisen because of apparent reductions in variability in these dependent variables in the non-ASD group (see Fig. S1 in the Supplemental Material). To assess this possibility, future studies should include measures of these variables that are more sensitive to variability in non-ASD populations.
Exploratory post hoc mediation analyses revealed that whereas ADOS-2 CS is a significant predictor of behavioral accuracy on the ToM EEG task, when the variance of the LPC is taken into account, ADOS-2 CS is no longer a significant predictor of behavioral ToM accuracy (Fig. 3). These exploratory findings therefore support the long-supposed relationship between ASD symptom severity and ToM ability; however, that relationship appears contingent on the magnitude of differentiation of the LPC elicited to correct and incorrect ToM response options. The LPC therefore mediates the relationship between ADOS-2 CS and behavioral accuracy on a ToM task. This may explain some of the characteristic heterogeneity of symptom presentation observed in ASD: Although ToM deficits are often observed in ASD, not everyone with ASD experiences these deficits in ToM reasoning. Our findings suggest that the LPC explains the heterogeneity of ToM performance across variable levels of ASD symptom severity. Thus, individuals who have a high ADOS-2 CS and LPCs showing little to no differentiation between ToM conditions are more likely to experience behavioral impairments in ToM reasoning abilities.
Consistent with results from prior studies with non-ASD samples (Jiang et al., 2016; Wang et al., 2010; Zhang et al., 2009), the current findings provide evidence of a neural mechanism (the LPC) that may reflect the metarepresentation stage of ToM reasoning (Leslie, 1987). Furthermore, these findings support Leslie and Frith’s (1990) metarepresentation conjecture in that individual differences in this capacity link ASD symptoms to ToM performance deficits. This is the first direct evidence in support of discrete neural indices of specific stages of ToM processing in ASD. Future research should focus on deciphering which specific processes (e.g., metarepresentation) the LPC and LSW may measure, for example by using tasks that parse out more complex aspects of ToM reasoning and consequent EEG indices (Tesar et al., 2020).
The exploratory mediation results also have important implications for conceptualizing how and when during cognitive processing ToM abilities in ASD begin diverging from typical patterns of development, thus further increasing the understanding of the nature of ToM deficits in ASD. Such information is crucial for informing and designing efficient interventions targeting ToM impairments. In addition, the mediation results provide new evidence toward explicating one facet of the significant phenotypic heterogeneity observed between individuals with ASD, which suggests that individual differences in observed behavior are partially explained by individual differences in neural activity, and this same pattern is observed across diagnostic boundaries. Thus, the current findings highlight the importance of using multimodal methods of measurement and analysis to investigate such nuanced patterns of social cognition and behavior. Broader use of such methods holds the potential to improve identification of the sources of each individual’s strengths and weaknesses, information that is vital for tailoring interventions to individuals’ needs and ultimately improving their effectiveness.
The current findings are consistent with Livingston and Happé’s (Livingston et al., 2019; Livingston & Happé, 2017) compensation framework, which suggests that some individuals with ASD may employ effortful, alternative cognitive or behavioral strategies to alter social behaviors to appear less affected by ASD symptoms despite persisting differences at cognitive and/or neurobiological levels. For example, individuals with ASD may learn, and consciously engage in, behavioral strategies such as mimicking typically developing peers’ social behaviors (e.g., orienting toward faces or making eye contact during conversations) without necessarily using such behaviors to represent their peers’ mental states to successfully navigate social situations requiring ToM (Livingston & Happé, 2017). In the present study, some but not all individuals with ASD exhibited behavioral impairments in ToM. Our evidence suggests that some individuals with greater ASD symptoms may in fact exhibit appropriate LPC modulation during ToM reasoning and may develop successful ToM skills, whereas individuals who do not use this capacity exhibit persistent deficits in ToM.
This study has several limitations that constrain interpretation of the findings. First, to ensure participants’ comprehension of task instructions as well as the verbal content of the narrated vignettes, the current ToM EEG paradigm requires participants to have intact receptive language and intellectual ability. Second, the current study includes EEG measurement at only one time point; thus, the degree to which the LPC ERP and its role in ToM reasoning may vary within individuals across development, or in response to intervention, remains unclear. Third, because the ADOS-2 was originally developed to measure ASD symptoms among individuals with ASD, the current study’s use of the ADOS-2 CS to measure ASD symptom severity may have resulted in limited variation of ASD symptoms in the non-ASD group. Fourth, in an attempt to increase ecological validity, this study uses a novel ToM EEG task involving social meaning making; however, it remains uncertain whether the LPC is elicited during other types of ToM tasks, such as nonverbal tasks or those involving tracking the movements of animated geometric shapes (see Senju, 2012). Fifth, it remains unclear whether a P300 component is being elicited by this paradigm. Future work should conduct within-persons comparisons examining the current paradigm and a paradigm shown to unambiguously elicit a P300. Sixth, it remains unclear whether the LPC is similarly or differentially implicated in other important aspects of ToM reasoning, such as affective ToM and cognitive ToM, and in the relationships between those factors and ASD.
Conclusion
An ERP component, the LPC, was elicited in response to a novel ToM EEG task in a sample of adolescents with and without ASD and differentiated correct ToM conditions from incorrect ToM conditions. Crucially, results from exploratory analyses show that the LPC mediates the long-supposed relationship between core ASD symptoms and ToM ability. Evidence suggests that deficits in ToM reasoning in ASD occur relatively early in cognitive processing and may specifically stem from difficulties engaging in cognitive metarepresentation. The current findings thus increase understanding of the neural mechanisms implicated in social-cognitive functioning in ASD and further inform clinical practice, research, and theory involving ToM reasoning in ASD.
Supplemental Material
Supplemental material, sj-pdf-1-cpx-10.1177_21677026211021975 for An Electrocortical Measure Associated With Metarepresentation Mediates the Relationship Between Autism Symptoms and Theory of Mind by Erin J. Libsack, Elizabeth Trimber, Kathryn M. Hauschild, Greg Hajcak, James C. McPartland and Matthew D. Lerner in Clinical Psychological Science
Footnotes
Acknowledgements
We thank the participants and families who volunteered their time to support this research. We thank the research coordinators, research assistants, graduate students, and postdoctoral fellows who work in the Social Competence and Treatment Lab at Stony Brook University and contributed to this research. We thank Cara M. Keifer and Tessa Clarkson for their advice and commentary on earlier versions of this manuscript and Nicole M. Russo-Ponsaran, Clark McKown, the Rush NeuroBehavioral Center, and xSEL Labs for providing the original SELweb stimuli that were adapted for use in this study.
Transparency
Action Editor: Erin B. Tone
Editor: Kenneth J. Sher
Author Contributions
M. D. Lerner developed the study concept, and J. C. McPartland and G. Hajcak contributed to conceptualization of the study design. E. Trimber developed the adapted electroencephalography task. E. J. Libsack participated in data acquisition, oversaw data processing, conceptualized hypotheses, conducted statistical analyses and interpretation, and wrote the manuscript under the supervision of M. D. Lerner. K. M. Hauschild, E. Trimber, and J. C. McPartland provided critical revisions. All of the authors approved the final manuscript for submission.
