Abstract
Previous research has shown that social cues, including eye gaze, can readily guide our focus of attention—a phenomenon referred to as social attention. Here, we demonstrated that internally maintained social cues in working memory (WM) can produce an analogous attentional effect (N = 57). Using the delayed-match-to-sample paradigm combined with the dot-probe task, we found that holding irrelevant gaze cues in WM can induce attentional orienting in college-age adults. Importantly, this WM-induced attention effect could not be explained simply by the perceptual-attentional process, because the identical gaze cues that were only passively viewed and not memorized in WM could not trigger attentional orienting beyond the typical time window of social attention. Furthermore, nonsocial cues (i.e., arrows) held in WM failed to elicit the attentional-orienting effect. These findings provide new evidence for the conceptualization of WM as internally directed attention and highlight the uniqueness of social attention compared with nonsocial attention.
Sharing attention with interactive partners via social cues (e.g., eye gaze), known as social attention, is crucial for interpersonal interactions and adaptive social behaviors (Birmingham & Kingstone, 2009). This fundamental ability enables humans to learn about the other person’s inner state and where the important events are in the environment (Nummenmaa & Calder, 2009). This ability appears to underpin the development of complex sociocognitive skills (e.g., theory of mind, language; Baron-Cohen, 1995; Brooks & Meltzoff, 2005; Shepherd, 2010) and to be closely linked with sociocognitive disorders (e.g., autism; Dawson et al., 2012). The simple and well-established gaze-cuing task has been widely used for simulating and measuring social attention (Friesen & Kingstone, 1998; Frischen, Bayliss, & Tipper, 2007). Typically, a nonpredictive gaze cue is presented centrally, and it will trigger attentional orienting as evidenced by faster response times (RTs) to gazed-at targets compared with targets appearing opposite to the gaze direction. Such an attentional effect was found to be both reflexive and long lasting because it emerged as early as 100 ms after gaze onset and was sustained for about 1,000 ms (Frischen, Smilek, et al., 2007). Note that although gaze cues appeared in the central location, as traditional endogenous cues do, the direction of gaze cues was not predictive of the location of the subsequent target. On the other hand, social attention is likewise different from exogenous attention, as it is not caused by the peripherally presented cue, and its inhibition of return is quite delayed compared with exogenous attention. Moreover, recent studies found that social attention is highly heritable and relies on specialized mechanisms (Ji et al., 2020; Wang et al., 2020). 
Given these special properties, social attention challenges the classic dichotomous categorization of covert attention (i.e., exogenous and endogenous) and opens up new avenues for visual-attention research.
Although the uniqueness of social attention has been well investigated with externally presented gaze cues, it remains unknown whether this special type of attentional orienting can also occur when the cues are no longer externally available but are internally maintained in working memory (WM). Recently, WM has been conceptualized as internally directed attention that has a behavioral impact similar to that of externally attended stimuli (Johnson et al., 2013; Kiyonaga & Egner, 2014, 2016; Saad & Silvanto, 2013). For instance, keeping a color word in WM could interfere with a subsequent color-discrimination task, which showed a novel WM Stroop effect akin to the classic Stroop effect (Kiyonaga & Egner, 2014). Similarly, other classic effects of attention, such as “inhibition of return” (Johnson et al., 2013), also exist in the WM domain. Moreover, self-related representations in WM can automatically attract attention, much as self-prioritization does in exogenous attention (Yin et al., 2019, 2021). Following the internally directed attention view, we asked whether WM representations could also induce an analogous special form of attention (i.e., social attention) beyond the realms of classic attention (e.g., exogenous attention). Given that WM is central to high-level cognition (Baddeley, 2003; Lépine et al., 2005), such internal social attention may allow us to keep track of others’ intentions and prepare for high-level sociocognitive behaviors. Hence, exploring internal social attention may help to extend our understanding of the mechanisms underlying the impact of social attention on high-level sociocognitive behaviors and bridge the gap between WM and social attention. Furthermore, this internal social attention, which involves high-level cognitive functions, could serve as a potentially more reliable behavioral marker of sociocognitive disorders relative to external social attention.
In the current study, we interspersed a WM paradigm with a dot-probe task to directly test social attention in WM. Specifically, participants were first asked to remember the identity of a face with averted eye gaze, and then they were required to discriminate the location of a target presented at either the left or right side of the screen. In addition, to rule out the possibility that the WM-induced social-attentional effect, if observed, was driven by any residual perceptual process, we implemented a control condition in which the gaze cue was only passively viewed but not kept in WM. To further investigate whether any obtained WM effect was specific to social cues, we employed nonsocial arrow cues. Previous research has found that nonpredictive arrow cues can also trigger attentional-orienting effects (Ristic et al., 2002; Tipples, 2002). Comparing gaze cuing with arrow cuing in the WM domain offers a new way to examine the uniqueness of gaze cuing and may help to resolve the long-standing dispute concerning the specificity of social attention (Friesen et al., 2004; Frischen, Bayliss, & Tipper, 2007; Ristic et al., 2007).
Statement of Relevance
Sharing attention with interactive social partners via eye gaze, known as social attention, is a common phenomenon in our daily life. In this research, we found that social-attention behavior can also occur when the gaze cues are not externally available but are internally maintained in working memory. Specifically, we first asked observers to memorize the identity of a face with averted gaze and then required them to discriminate the location of a target presented at either the left or the right side of the screen. We found that observers responded more quickly to targets appearing on the same side, as indicated by the irrelevant gaze direction stored in working memory, than to targets appearing in the opposite direction. This effect was specific to social cues inasmuch as nonsocial cues (i.e., arrows) held in working memory failed to trigger attention allocation. Such unique internal social attention allows us to keep track of others’ intentions and prepare for high-level sociocognitive behaviors.
Experiment 1
Method
Participants
Sixteen college students whose ages ranged from 22 to 32 (eight females; age: M = 25.3 years, SD = 3.2 years) were recruited via an online advertisement in Experiment 1. All participants had normal or corrected-to-normal vision, and all gave written informed consent in accordance with the procedure and protocols approved by the institutional review board of the Institute of Psychology, Chinese Academy of Sciences. A two-tailed power analysis using G*Power (Version 3.1.9.4; Faul et al., 2007) indicated that a sample size of at least 15 participants would afford 80% power to detect a large attentional-orienting effect (Cohen’s d = 0.80) in the WM task (Downing, 2000). Thus, we set a target sample size of 16; data collection stopped when this sample size was reached.
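The sample-size reasoning above can be illustrated with a small Monte Carlo sketch. It is not the G*Power computation the authors report; it assumes the power analysis refers to a two-tailed paired (one-sample on difference scores) t test at α = .05, and the function name, simulation count, and seed are illustrative choices. The critical t values are standard table values for 13 and 14 degrees of freedom.

```python
import math
import random
import statistics


def simulated_power(n, d, t_crit, n_sims=20000, seed=1):
    """Monte Carlo power of a two-tailed one-sample t test on difference scores."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Difference scores with a true standardized mean of d and unit SD.
        diffs = [rng.gauss(d, 1.0) for _ in range(n)]
        t = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
        if abs(t) > t_crit:
            hits += 1
    return hits / n_sims


# Two-tailed critical t values at alpha = .05, keyed by degrees of freedom.
T_CRIT = {13: 2.160, 14: 2.145}

power_n14 = simulated_power(n=14, d=0.80, t_crit=T_CRIT[13])
power_n15 = simulated_power(n=15, d=0.80, t_crit=T_CRIT[14])
print(f"n = 14: simulated power = {power_n14:.2f}")
print(f"n = 15: simulated power = {power_n15:.2f}")
```

Under these assumptions, the simulated power for 15 participants should land near the 80% target, with 14 participants falling somewhat lower.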
Stimuli
Stimuli were displayed using MATLAB (The MathWorks, Natick, MA) together with the Psychophysics Toolbox extensions (Brainard, 1997; Pelli, 1997) on a 19-in. CRT monitor (1,280 × 1,024 pixels at 60 Hz) with a gray background (red, green, blue [RGB] value = 128, 128, 128). Sixteen face images (half male and half female) with neutral expressions were taken from the NimStim Set of Facial Expressions (Tottenham et al., 2009). They were first manipulated in face-image modeling software (i.e., FaceGen Modeller Version 3.4) to remove features outside of the face (e.g., hair and ears). Then the faces (about 3.6° × 6.1°) were converted to gray scale, and the gaze direction was generated by shifting the irises and pupils of the eyes to the canthi using Adobe Photoshop software. A phase-scrambled image generated from one of the original face pictures was used as a mask (3.5° × 6.0°).
Procedure
Participants were seated at a viewing distance of 57 cm from the computer screen. They were instructed to complete two phases in Experiment 1, namely a passive-viewing phase followed by a WM phase. The delayed-match-to-sample paradigm combined with a dot-probe task was employed in the WM phase. At the beginning of each trial, a central fixation cross (0.7° × 0.7°) within a white frame (17.9° × 17.9°) was presented for a random time interval varied from 1,500 to 2,000 ms. Then a sample face with averted eye gaze (leftward in half of the trials and rightward in the other half) was presented for 1,000 ms. This was followed by an immediate mask presented for 100 ms to eliminate any perceptual aftereffects. Participants were required to memorize and retain the face throughout the trial for a recognition test at the end of the trial. After an interstimulus interval of 400 ms, a Gabor patch (1.2° × 1.2°) flashed as a target for 100 ms on either the left or the right visual field at a distance of 4.5° from the central cross. Participants were instructed to localize the target by pressing one of two arrow keys (left arrow key for left target and right arrow key for right target) as quickly as possible but to give priority to response accuracy. After a response was made or after more than 1,500 ms elapsed, a face looking straight ahead was displayed at the center of the screen, preceded by a blank interval of 1,000 ms. Participants were required to press the assigned letter keys to indicate whether or not this face had the same identity as the memory face (“Z” for the same identity and “C” for a different identity; see Fig. 1). Observers were explicitly told that the gaze directions of faces preceding the targets were nonpredictive of the target location and unrelated to the memory test.

Fig. 1. Schematic representation of the passive-viewing and working memory phases in Experiments 1 through 3. In Experiment 1, the working memory phase started with a sample face with leftward or rightward eye gaze, followed by a brief mask. Participants were instructed to maintain the identity of this face in memory throughout the trial. Then they had to respond to the location of a peripheral target (Gabor patch) preceded by a blank interval. Following this, a test face was presented for the final memory-recognition task. The passive-viewing phase was identical to the working memory phase except that the sample face was viewed only passively and the test period was removed. Experiments 2 and 3 followed the same structure as Experiment 1, the only difference being that participants were required to memorize the pattern of the arrows pointing leftward or rightward (Experiment 2) or to hold the exact direction of a plain black arrow in working memory (Experiment 3). ISI = interstimulus interval.
A similar procedure was adopted in the passive-viewing phase as in the WM phase, except that participants were told to passively maintain attention on the sample face but did not have to memorize it, and the recognition test was removed. Note that the task adopted in the passive-viewing phase varied from the traditional gaze-cuing paradigm to ensure the elimination of any perceptual effects. Specifically, we used a backward mask that immediately followed the cue along with a relatively longer stimulus onset asynchrony (SOA) of 1,500 ms. As a consequence, cue and target were presented asynchronously in the current study, whereas gaze cues remained on the screen throughout the trial in most studies in the literature (Bayliss et al., 2006, 2007; Friesen & Kingstone, 1998; Gregory & Jackson, 2017, 2019).
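The 1,500-ms SOA follows from the trial timeline described in the Procedure: a 1,000-ms cue, a 100-ms mask, and a 400-ms interstimulus interval. The sketch below simply adds these durations; the class and field names are illustrative and not taken from the authors' experiment code. The second instance uses the 900-ms interstimulus interval reported for Experiment 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialTiming:
    cue_ms: int   # duration of the sample face (or arrow)
    mask_ms: int  # duration of the backward mask
    isi_ms: int   # blank interval between mask offset and target onset

    @property
    def soa_ms(self) -> int:
        # SOA is measured from cue onset to target onset.
        return self.cue_ms + self.mask_ms + self.isi_ms


exp1 = TrialTiming(cue_ms=1000, mask_ms=100, isi_ms=400)
exp4 = TrialTiming(cue_ms=1000, mask_ms=100, isi_ms=900)
print(exp1.soa_ms)  # 1500
print(exp4.soa_ms)  # 2000
```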
Each participant completed 128 trials in each phase (with half congruent and half incongruent trials) and took a short rest break midway. The gender of the face stimuli was counterbalanced across the experimental conditions. In the WM phase, the recognition face matched the identity of the memory face in half of the trials and shared the same gender with the memory face across all trials. Each phase was preceded by some practice trials, and participants had to reach 87.5% accuracy in the target-localization task before starting experimental trials.
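One way to obtain 128 trials per phase with equal numbers of congruent and incongruent trials and with face gender balanced across conditions is to cross the 16 faces with gaze direction and target side and repeat each combination twice. This is a hypothetical construction: the text does not state how trials were assembled, and the function and field names are assumptions.

```python
import itertools
import random


def build_phase_trials(n_faces=16, repetitions=2, seed=0):
    """Build a shuffled, fully crossed trial list for one phase."""
    # First half of the face set is labeled female, second half male.
    faces = [
        {"face_id": i, "gender": "female" if i < n_faces // 2 else "male"}
        for i in range(n_faces)
    ]
    sides = ("left", "right")
    trials = []
    for face, gaze, target in itertools.product(faces, sides, sides):
        for _ in range(repetitions):
            trials.append(
                {**face, "gaze": gaze, "target": target, "congruent": gaze == target}
            )
    random.Random(seed).shuffle(trials)
    return trials


trials = build_phase_trials()
n_congruent = sum(t["congruent"] for t in trials)
print(len(trials), n_congruent)  # 128 64
```

Because every face appears equally often in each gaze-by-target cell, gaze direction carries no information about target side, which matches the nonpredictive cuing described in the Procedure.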
Results
The overall performance of the WM recognition test was high (M = 91.1%, SE = 1.1%), suggesting that participants remembered the face items well. For each phase, trials with incorrect probe responses and RTs less than 100 ms or greater than 1,000 ms were excluded from further analyses, and we also excluded trials with RTs exceeding 2 standard deviations from the mean. In the passive-viewing phase, in which participants were asked to passively view the central faces with averted gaze, a paired-samples t test did not show an attentional-orienting effect: There was no difference between the RTs to congruent targets (presented at the gazed-at locations) and incongruent targets (presented at the gazed-away-from locations; M = 381 ms, SD = 19 ms vs. M = 380 ms, SD = 19 ms), t(15) = 0.49, p = .629, d = 0.12, 95% confidence interval (CI) for the mean difference = [−3, 5] (see Fig. 2a). This was in accordance with our expectation because gaze-induced attention allocation was often absent at a long SOA beyond 1,000 ms (Friesen & Kingstone, 1998; Greene et al., 2009). Critically, the identical gaze cues that were memorized in WM could give rise to a gaze-cuing effect (M = 386 ms, SD = 11 ms vs. M = 401 ms, SD = 11 ms), t(15) = −4.54, p < .001, d = 1.13, 95% CI for the mean difference = [−22, −8] (see Fig. 2a), even when participants were explicitly told that the target location was not predicted by the gaze direction. Moreover, the two-way interaction between phase (passive viewing vs. WM) and congruency (congruent vs. incongruent) was highly significant, F(1, 15) = 15.61, p = .001, ηp² = .51.
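The exclusion rules and the congruency comparison described above can be sketched for a single participant as follows. The sketch assumes that the 2-standard-deviation criterion is applied within each congruency condition after the absolute cutoffs, which the text does not specify; the trial dictionaries and the RT values are made-up illustrations, not data from the study.

```python
import statistics


def trim_rts(trials, low_ms=100, high_ms=1000, sd_cutoff=2.0):
    """Return RTs from correct trials that pass absolute and SD-based cutoffs."""
    rts = [t["rt_ms"] for t in trials if t["correct"] and low_ms <= t["rt_ms"] <= high_ms]
    if len(rts) < 2:
        return rts
    mean, sd = statistics.fmean(rts), statistics.stdev(rts)
    return [rt for rt in rts if abs(rt - mean) <= sd_cutoff * sd]


def cuing_effect(trials):
    """Mean incongruent RT minus mean congruent RT (ms); positive = facilitation."""
    congruent = trim_rts([t for t in trials if t["congruent"]])
    incongruent = trim_rts([t for t in trials if not t["congruent"]])
    return statistics.fmean(incongruent) - statistics.fmean(congruent)


# Hypothetical trials: one too-slow RT, one too-fast RT, and one error are dropped.
demo = (
    [{"rt_ms": rt, "correct": True, "congruent": True} for rt in (380, 390, 385, 395, 388, 1200)]
    + [{"rt_ms": 392, "correct": False, "congruent": True}]
    + [{"rt_ms": rt, "correct": True, "congruent": False} for rt in (400, 405, 398, 410, 402, 60)]
)
effect_ms = cuing_effect(demo)
print(f"Cuing effect: {effect_ms:.1f} ms")
```

Note that the sign convention here (incongruent minus congruent) is the reverse of the congruent-minus-incongruent differences behind the negative t values reported in the text.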

Fig. 2. Results of (a) Experiment 1, (b) Experiment 2, and (c) Experiment 3. For each experiment, response time in the passive-viewing and working memory phases is shown for both congruent targets (presented at the gazed-at locations) and incongruent targets (presented at the gazed-away-from locations). Error bars represent standard errors of the mean. The asterisk indicates a significant difference between responses to congruent and incongruent targets (*p < .001).
Taken together, these findings provided strong evidence that holding a face image with task-irrelevant averted eye gaze in WM could induce and sustain a robust involuntary attentional-orienting effect even beyond the typical time window of social attention. This WM-induced attentional effect could not be explained by the perceptual-attentional process, because the actual presentation of the same face stimuli without a memory requirement failed to produce an analogous effect.
Experiment 2
In Experiment 2, nonsocial arrow cues instead of faces were held in WM so we could examine whether the WM-triggered social-attentional effect was specific to social cues or could also be generalized to nonsocial cues.
Method
A new group of 16 naive adults whose ages ranged from 19 to 29 (12 females; M = 23.9 years, SD = 2.2 years) was recruited in Experiment 2. Several border styles (dense and sparse dashed lines) and textures (e.g., dots, stripes, grid) were combined to create 16 kinds of black and white arrows. These patterns were similar but distinguishable so as to make the WM task challenging enough for the participants. The arrows were equated for contrast, were similar in luminance, and were presented equally often in each experimental condition. Experiment 2 replicated the structure and design employed in Experiment 1, except that directional arrows (6.6° × 3.2°) instead of face images served as memory items and nondirectional double-headed arrows (8.9° × 3.2°) served as test stimuli. Accordingly, the mask was a phase-scrambled image generated from one of the double-headed arrow images (9.2° × 3.3°). Participants were asked to hold the arrow pointing leftward or rightward in WM and to indicate in the test period whether the pattern of the nondirectional arrow was identical to that of the memory cue (see Fig. 1).
Results
The overall WM performance for arrows was again very high (M = 94.8%, SE = 1.0%). The identical outlier analysis from Experiment 1 was used in this experiment. Again, the actual presentation of nonsocial arrows as the central cues failed to elicit involuntary attentional orienting after a long delay (M = 385 ms, SD = 12 ms vs. M = 389 ms, SD = 13 ms), t(15) = −1.13, p = .276, d = 0.28, 95% CI for the mean difference = [−10, 3] (see Fig. 2b), which was in line with previous findings (Friesen et al., 2004; Greene et al., 2009). In contrast to Experiment 1, results showed that the WM-induced attentional effect could not be found when nonsocial arrow cues were held in WM (M = 416 ms, SD = 16 ms vs. M = 417 ms, SD = 16 ms), t(15) = −0.42, p = .681, d = 0.11, 95% CI for the mean difference = [−7, 4] (see Fig. 2b). Not surprisingly, the interaction between phase (passive viewing vs. WM) and congruency (congruent vs. incongruent) was not significant, F(1, 15) = 0.39, p = .544, ηp² = .03. Moreover, when comparing Experiment 2 with Experiment 1, we found a significant three-way interaction of Cue (gaze vs. arrow) × Phase (passive viewing vs. WM) × Congruency (congruent vs. incongruent), F(1, 30) = 10.32, p = .003, ηp² = .26. This interaction confirmed the difference between the two experiments and indicated that the WM-induced attentional effect was likely specialized to social cues and not generalized to nonsocial directional cues (i.e., arrows).
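The effect sizes reported for Experiments 1 and 2 can be related to their test statistics with two standard conversions: for a paired-samples t test, d = |t| / √n, and for an F test, partial eta squared = (F × df_effect) / (F × df_effect + df_error). The sketch below applies these to the reported values; that the authors computed d this way is an inference from the numbers, not something the text states, and the function names are illustrative.

```python
import math


def cohens_d_from_paired_t(t, n):
    """Cohen's d for a paired design, recovered from t and the number of participants."""
    return abs(t) / math.sqrt(n)


def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)


d_wm_exp1 = cohens_d_from_paired_t(-4.54, 16)      # reported d = 1.13
eta_exp1 = partial_eta_squared(15.61, 1, 15)       # reported .51
eta_exp2 = partial_eta_squared(0.39, 1, 15)        # reported .03
eta_three_way = partial_eta_squared(10.32, 1, 30)  # reported .26

print(f"d (Exp. 1, WM phase) = {d_wm_exp1:.3f}")
print(f"partial eta squared (Exp. 1) = {eta_exp1:.3f}")
print(f"partial eta squared (Exp. 2) = {eta_exp2:.3f}")
print(f"partial eta squared (three-way) = {eta_three_way:.3f}")
```

Small discrepancies in the last digit are expected because the conversions start from test statistics that were already rounded to two decimals.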
Experiment 3
In Experiment 2, nonsocial arrow cues failed to elicit an attentional-orienting effect when the directional information of the cues was task irrelevant and incidentally memorized. We conducted Experiment 3 to investigate further by requiring participants to hold the exact direction of the arrows in WM and testing whether making the cue direction task relevant would make a difference.
Method
Another group of 16 naive adults whose ages ranged from 19 to 28 (eight females; M = 22.9 years, SD = 2.5 years) participated in Experiment 3. In Experiment 3, we adopted a similar design and procedure to that of Experiment 2 except that plain black arrows (6.9° × 2.2°) pointing leftward or rightward were rotated by −7°, −3°, 0°, 3°, or 7° from the horizontal axis to be presented as memory cues and test stimuli, and a solid black bar (6.9° × 2.2°) was used as the mask. Participants were told to memorize and indicate whether the memorized arrow and the recognition arrow had exactly the same orientation in the WM phase (see Fig. 1). Each phase (passive viewing vs. WM) contained a total of 120 trials with 60 congruent trials and 60 incongruent trials.
Results
Memory for the exact direction of the central arrow cues was high overall (M = 89.6%, SE = 0.9%). The identical outlier analysis from the previous two experiments was used. To be consistent with Experiment 2, in which horizontal arrows were presented, we selected trials with memorized arrows rotated by −3°, 0°, and 3° for further analysis. Neither the actually presented arrow direction (M = 386 ms, SD = 12 ms vs. M = 388 ms, SD = 14 ms), t(15) = −1.15, p = .267, d = 0.29, 95% CI for the mean difference = [−7, 2] (see Fig. 2c) nor the intentionally memorized arrow direction (M = 433 ms, SD = 12 ms vs. M = 435 ms, SD = 12 ms), t(15) = −0.36, p = .726, d = 0.09, 95% CI for the mean difference = [−8, 5] (see Fig. 2c) could trigger an attentional-orienting effect. Again, no two-way interaction between phase (passive viewing vs. WM) and congruency (congruent vs. incongruent) was found, F(1, 15) = 0.14, p = .718, ηp² = .01. These findings indicated that when participants intentionally kept the exact direction of the arrow instead of its pattern in their WM, arrows remained unable to trigger an attentional-orienting effect after a long interval. In summary, the current experiment together with Experiment 2 suggests that arrow-mediated attentional orienting likely hinges on the cues being currently presented in the environment, and therefore, cues held in WM were unable to induce a similar effect no matter whether the directional information was incidentally or intentionally stored. The three experiments together lend strong support to the distinction between social attention and nonsocial attention in WM.
Experiment 4
Experiments 1, 2, and 3 together demonstrated a novel internal-direction-mediated attentional orienting specific to social cues. Next, we investigated whether this unique internal social attention would be sustained over a longer interval, given the long-lasting maintenance of contents kept in WM.
Method
A new group of 41 naive adults whose ages ranged from 18 to 30 (25 females; M = 23.4 years, SD = 3.0 years) participated in Experiment 4. We purposely oversampled in anticipation that the attentional effect at a longer interval would be relatively small, as indicated by earlier studies on external social attention (Frischen, Bayliss, & Tipper, 2007; Frischen, Smilek, et al., 2007). We largely replicated the procedure of Experiment 1 but lengthened the interstimulus interval from 400 ms to 900 ms. Consequently, we had a longer SOA of 2,000 ms in which to examine the effectiveness of internal gaze cues in triggering attentional shifts.
Results
Participants showed overall high memory accuracy (M = 91.0%, SE = 0.7%) that ensured good maintenance of gaze cues in WM after a longer delay. We followed the same outlier analysis to be consistent with the previous three experiments. Similar to Experiment 1, an attentional-orienting effect triggered by internal eye-gaze cues was again observed at an even longer time interval of 2,000 ms (M = 410 ms, SD = 8 ms vs. M = 415 ms, SD = 8 ms), t(40) = −2.93, p = .006, d = 0.46, 95% CI for the mean difference = [−8, −2]. However, as expected, RTs in congruent trials and incongruent trials did not differ when gaze cues were only attended but not memorized in WM (M = 402 ms, SD = 8 ms vs. M = 400 ms, SD = 7 ms), t(40) = 0.78, p = .441, d = 0.12, 95% CI for the mean difference = [−3, 6]. Again, the two-way interaction between phase (passive viewing vs. WM) and congruency (congruent vs. incongruent) was significant, F(1, 40) = 4.91, p = .032, ηp² = .11. These results demonstrated that the gaze-cued orienting effect in WM was robust and persisted for at least 2,000 ms, highlighting the unique nature of social attention together with Experiments 1, 2, and 3. Furthermore, compared with Experiment 1, this internal social-attention effect observed at a longer interval decreased significantly, as shown by an independent-samples t test on standardized cuing effects, computed as (RT_incongruent − RT_congruent)/(RT_incongruent + RT_congruent) (M = 0.019, SD = 0.004 vs. M = 0.006, SD = 0.002), t(55) = 3.18, p = .002, d = 0.94, 95% CI for the mean difference = [0.005, 0.021]. This decrease in the magnitude of the internal social-attention effect with the increase of SOA paralleled that found in external social attention by many earlier studies (Friesen & Kingstone, 1998; Frischen, Bayliss, & Tipper, 2007; Frischen, Smilek, et al., 2007; Ristic et al., 2002). 
Such a finding is meaningful, as it reveals that internal social attention exhibits a similar temporal property of external social attention, providing further evidence for the conceptualization of WM as internally directed attention.
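The standardized cuing effect used in the comparison between Experiments 1 and 4 divides the RT difference by the RT sum, which removes overall differences in response speed. The sketch below applies the formula to the group-mean RTs reported for the WM phases. The authors presumably computed the index per participant before averaging, so applying it to group means is only an approximation, and the function name is illustrative.

```python
def standardized_cuing_effect(rt_congruent, rt_incongruent):
    """(RT_incongruent - RT_congruent) / (RT_incongruent + RT_congruent)."""
    return (rt_incongruent - rt_congruent) / (rt_incongruent + rt_congruent)


# Group-mean RTs (ms) reported for the WM phase of each experiment.
exp1_index = standardized_cuing_effect(rt_congruent=386, rt_incongruent=401)
exp4_index = standardized_cuing_effect(rt_congruent=410, rt_incongruent=415)

print(f"Experiment 1: {exp1_index:.3f}")  # 0.019
print(f"Experiment 4: {exp4_index:.3f}")  # 0.006
```

Even from group means, the indices agree with the reported values of 0.019 and 0.006 to three decimals.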
General Discussion
The phenomenon of social attention has exploded in popularity in recent years, showing that social cues can direct our focus of attention (Birmingham & Kingstone, 2009; Friesen & Kingstone, 1998; Frischen, Bayliss, & Tipper, 2007; Nummenmaa & Calder, 2009; Shi et al., 2010; Wang et al., 2014). Here, we extended this line of inquiry by reporting a novel form of internal social attention that is analogous to the classic social attention. Using a modified WM central-cuing paradigm, we found that social cues (i.e., eye gaze) incidentally maintained as internal WM contents could trigger an involuntary attentional-orienting effect that could not be accounted for by the perceptual-attentional process. In contrast, this effect could not be generalized to nonsocial cues (i.e., arrows). Neither the incidentally nor intentionally memorized arrow direction was capable of driving attention allocation. Taken together, these findings demonstrate a robust internal-direction-mediated attentional orienting that is specialized to social cues.
In the classic social-attention task, a long line of research has found that the facilitation effect triggered by gaze cues is sustained for about 1,000 ms (Friesen & Kingstone, 1998; Greene et al., 2009). In Experiment 1, the result of the passive-viewing phase converged on this view in that merely attending to faces with averted gaze failed to induce an attentional shift to the gazed-at location at a longer SOA of 1,500 ms. However, when the gaze cue was encoded into WM, the internal representation could trigger an attentional-orienting effect akin to the classic gaze cuing but beyond the typical time course. Note that not only was the gaze direction nonpredictive of the target location, but also the direction information of gaze cues was maintained without participants’ explicit intention, given its irrelevance to the task of memorizing the face identity. Thus, social attention in WM to some extent can be regarded as an involuntary process. In short, our findings echo the results of previous studies showing that WM maintenance interacts with the online visual process (Kang et al., 2011; Kiyonaga & Egner, 2014; Saad & Silvanto, 2013) and together support the notion that WM might be internally directed attention. The internal-attention interpretation of WM is strongly supported by a crucial line of visual search studies revealing that stimuli matching the WM contents could automatically capture attention (Downing, 2000). In this respect, the orienting of attention was bound to the specific location of the WM-relevant cue. By comparison, the attentional effect observed in our study was tied to the location indicated by the directional information maintained in WM but not that of the WM-relevant cue. 
A second remarkable distinction from the search studies is that we found a new and special form of internal attention in the WM domain that was triggered specifically by the social-directional signal, whereas previous findings were all illustrations of classic types of attention (e.g., exogenous attention). In fact, this is not the first attempt to combine social attention with WM process; however, those pioneers mainly focused on the impact of WM load on social attention (Hayward & Ristic, 2013; Yokoyama et al., 2019). We, on the other hand, demonstrated that WM itself could trigger a similar social-attention effect.
Notably, the WM-induced attention effect observed in the current study is specific to gaze cues and does not extend to arrow cues. However, in the standard perceptual-cuing task, arrow cues, like gaze cues, can also elicit robust attentional-orienting effects (Ristic et al., 2002; Tipples, 2002). It should be noted that although observers were explicitly told that the central cues were uninformative, it is likely that observers still associated the cue direction with the location of the ensuing target in a simple single task. In the current dual-task paradigm, by contrast, we reasoned that the cue could be well disconnected from the target given the primary WM task together with the inserted mask stimulus and a long SOA. Therefore, the null results of the arrow-cuing effect in WM cast doubt on the automaticity of nonsocial attention and suggest that arrow-mediated orienting obtained in the standard perceptual-cuing task might involve some voluntary processes (Liu et al., 2021). On the other hand, the dissociation between gaze cuing and arrow cuing in WM lends support to the view that “social attention is special.” It has been demonstrated that although social and nonsocial cues both induce attentional effects, they guide behavior differently in some contexts in which higher-order cognitive functions, such as theory of mind, might be involved (Bayliss et al., 2006, 2007; Gregory & Jackson, 2017). Moreover, some recent studies further reveal that social attention is supported by unique genetic and neural mechanisms shared across different social but not nonsocial cues, which implies the existence of a “social-attention detector” in the human brain (Ji et al., 2020; Wang et al., 2020). Overall, our current study extended these findings by demonstrating a novel type of involuntary internal attention that was specific to social but not nonsocial cues. 
Given that the stimuli and procedure we adopted are common and well-established, we expect our findings to generalize to more complex situations. Future studies should adopt more naturalistic stimuli and tasks to explore real-life phenomena of internal social attention. Moreover, because we used only eye-gaze cues, whether the observed internal social-attention effect could be extended to other types of social signals, such as the highly impoverished point-light walkers (Ji et al., 2020; Wang et al., 2020), needs further investigation. Additionally, because the current samples were all healthy adults, the internal social-attention effect may be relatively diminished in people with sociocognitive deficits such as autism (Dawson et al., 2012).
What is the potential neural basis of this novel internal social attention? It is quite intuitive to target the burgeoning sensory-recruitment theory of WM, which suggests that the mnemonic processing in WM recruits the same areas for sensory perception (Gayet et al., 2018; Harrison & Tong, 2009; Scimeca et al., 2018; Yin et al., 2021). This is indeed the case in face WM studies demonstrating that the occipital and temporal cortical regions dedicated to face processing (e.g., fusiform face area) showed sustained signals across the retention period (Druzgal & D’Esposito, 2003; Postle et al., 2003; Yoon et al., 2006). Moreover, several studies have found support for overlapping neural networks underlying internal and external spatial attention (Gazzaley & Nobre, 2012; Griffin & Nobre, 2003; Kuo et al., 2009). It is plausible that internal social attention recruits brain areas similar to those involved in classic external social attention (e.g., fusiform gyrus, superior temporal sulcus). Future research could utilize neuroimaging methods to identify the exact neural network subserving internal social attention and examine whether the underlying neural mechanisms of internal and external social attention are comparable.
To conclude, the current study found that attention allocation could be involuntarily triggered by social cues stored as internal representations in WM. Such an effect might be modulated by the intrinsic value of socially relevant stimuli in that it could not be generalized to nonsocial cues. These findings together offer support for the notion that WM acts as internally directed attention and provide new evidence for the uniqueness of social attention compared with nonsocial attention. Future research applying the internal social-attention test developed here to people with autism may have important clinical implications.
Footnotes
Transparency
Action Editor: Leah Somerville
Editor: Patricia J. Bauer
Author Contributions
H. Ji and T. Yuan contributed equally to this work. L. Wang developed the study concept. H. Ji, L. Wang, and Y. Jiang designed the experiments. Testing and data collection were performed by H. Ji, T. Yuan, and Y. Yu. H. Ji and T. Yuan analyzed and interpreted the data under the supervision of L. Wang and Y. Jiang. H. Ji, T. Yuan, and Y. Yu drafted the manuscript, and L. Wang and Y. Jiang provided critical revisions. All authors approved the final manuscript for submission.
