Abstract
Background:
Impairments in the ability to recognize facial affective expressions may lead to social dysfunction and difficulties with interpersonal communication.
Objective:
The objective was to compare the attentional responses on a Stroop emotional task using words and faces by testing whether the two stimuli differ in the degree of interference they produce in patients with Alzheimer’s disease (AD).
Methods:
There were 75 participants: 25 healthy older adults, 25 with mild AD, and 25 with moderate AD. A variation of the classic emotional Stroop test was administered. This task combined emotional words (happy or sad) superimposed on facial expressions (happy or sad), where the words were either incongruent or congruent with the emotion expressed by the face stimuli.
Results:
Facilitation was shown on negative words in healthy older adults, and significant effects were obtained for condition, valence, group, and the condition x group interaction. Although less interference was observed on negative stimuli, the fastest reaction times were found for congruent positive stimuli. The effect of interference in healthy older adults is similar in both conditions. However, in the AD groups, there is less interference on the words task than on the faces task.
Conclusion:
The more complex nature of faces, as opposed to the over-learning and automaticity of words, may explain the higher interference in AD patients in the faces condition. In patients with AD, words can be a better method for recognizing emotions than affective facial expressions.
INTRODUCTION
The Stroop effect [1] is a prime example of the human failure to attend selectively to a particular aspect of a complex stimulus. When the environment is ambiguous or presents competing demands, or the mapping of stimuli onto the response is complex or contrary to habit, making performance prone to error [2], it is necessary to suppress the distractions that may arise and use a decision-making and coordination strategy [3].
Cognitive control is the ability to align mind and action with task-related goals, and it consists of a variety of distinct executive processes that include attention shifting, reaction conflict, or inhibition [4]. Cognitive conflict occurs when the processing of task-relevant information is challenged by a distractor and interference arises that can compromise the ability to complete tasks requiring cognitive control. To complete daily activities, efficient resolution of interference is crucial.
Currently, clinicians use the term Alzheimer’s disease (AD) to refer to a clinical entity that typically presents with a characteristic progressive amnestic disorder and subsequent appearance of other cognitive, behavioral, and neuropsychiatric changes that impair social functioning and activities of daily living [5]. Although studies investigating the neuropsychology of AD have generally focused on cognitive impairment related to memory function, an increasing number of studies have analyzed deficiencies in other cognitive domains. Among them, conflict resolution deficits in AD patients are of interest because they may have repercussions on their everyday life activities, as suggested by clinical observations of the patients. In AD, pathology involves frontal and parietal regions, in addition to medial temporal lobe structures [6]. Therefore, AD pathology affects frontoparietal networks that exercise domain-general task-related cognitive control. However, little is known about the differences between the different degrees of AD (mild, moderate, or severe) in conflicts involving emotional stimuli. Therefore, it is relevant to study how this pathology affects the resolution of cognitive conflict.
The Stroop task is a central experimental paradigm used to test cognitive control by measuring participants’ ability to selectively attend to task-relevant information and inhibit automatic task-irrelevant responses [7]. Although various versions of the Stroop task are used, the general aim is to compare behavioral performance (reaction times and hits) in two conditions: one where there is conflict between two types of information (one relevant and the other irrelevant to the task) and another that does not produce conflict [8]. In the no-conflict or congruent condition, participants are required to read names of colors printed in black ink and name different color patches. Conversely, in the conflict or incongruent condition, color words are printed in an inconsistent ink color (for instance, the word “red” is printed in blue ink), and participants are required to name the color of the ink instead of reading the word [9]. In other words [10], the participants are required to perform a less automated task (i.e., naming the ink color) while inhibiting the interference arising from a more automated task (i.e., reading the word). Typical Stroop data show a robust Stroop interference effect that indicates the presence of informational conflict. This paradigm collects data in terms of reaction times (RT), and interference is shown by longer RTs on trials where the ink color and the color name are incongruent. A recent systematic review [11] showed slower RT and a higher interference effect in AD groups than in healthy elderly people. The interference effect can also be considered a measure of inhibition, and patients with AD would exhibit lower inhibitory ability than healthy participants [12]. Inhibitory control is classically considered to represent an important executive function, and inhibitory deficits have frequently been reported in the first stages of AD [13].
One approach to adapting the Stroop paradigm to questions of emotional control has been to present stimuli with emotional dimensions that vary in congruency. This modified paradigm more directly assesses the effects of emotional conflict, and different emotional Stroop task types have been used. The emotional variant has been widely used to investigate the association between attentional bias and emotion, and it was created based on the hypothesis that words or pictorial stimuli with affective significance can produce interference and tend to capture the person’s attention more effectively than neutral stimuli [14]. Typically, colored words are presented, and participants have to pronounce or classify the ink color of these words as fast and as accurately as possible. The interference effect arises if the words themselves are of particular relevance to the participants, or if the word has a high emotional valence. The prevailing methodology is to use words with a negative connotation to produce an emotional response. The responses to color-words with a negative valence are usually slower and more error prone than responses to neutral words, possibly indicating the automatic allocation of attention to emotional stimuli [15]. In AD patients [16], people with mild dementia did not differ significantly from healthy controls in their color-naming reaction times for positive, negative, and neutral words, whereas people with moderate AD took significantly longer to color-name negative and positive words than neutral words, suggesting that, in AD, the effect of emotional valence increases concomitantly with the severity of the condition.
Finally, some authors [17, 18] have developed a variation of the emotional Stroop, combining emotional words and photographs of facial expressions where the words are either incongruent or congruent with the emotion expressed by the face stimuli. Participants are asked to identify the emotional expressions on the faces while ignoring the overlaid emotionally-charged words, or vice versa. Other picture-word studies have asked participants to categorize a word by valence or emotion while ignoring the valence or emotional category of the accompanying image [19]. On this task, two emotional stimuli compete to obtain cognitive control and reduce interference, and although affective information seems to be processed automatically, it remains unclear whether there is hierarchical processing of different types of affective stimuli. Words activate emotional concepts more easily than other emotional stimuli, and, therefore, they lead to stronger top-down effects in processing [20]. However, it can be argued that pictures represent stronger or ecologically more valid stimuli than words and, thus, may lead to stronger concept activation.
Interference effects for both facial expressions and words were shown in two experiments [21], but interference effects caused by emotional facial expressions were larger and more robust than those resulting from affective words. The explanation offered for this result is that affect expressed in faces may be processed more automatically than affect denoted in words because faces hold a more significant biological and social value. This finding contradicts a study [22] showing that attending to words is processed in a more automatic manner than attending to faces, based on the shorter response latencies and smaller number of errors found in the word instruction condition. The faster processing of word reading is indicative of the formation of stronger stimulus-response associations in an over-learned behavior compared to an instinctive one.
Reading and emotional face processing are widely believed to occur with a high degree of automaticity [23]. An automaticity-based account might predict behavioral interference, regardless of the dimensional relevancy, but the translational model predicts a different outcome [24]. This approach argues that asymmetrical interference emerges because word information is already represented in the required response modality, whereas other stimuli must be translated into a verbal representation. In some types of tasks used [17, 18], to indicate the emotion conveyed by the facial expression, a non-verbal representation of the expression would have to be translated into the corresponding verbal representation. By contrast, the word representation is inherently verbal and, thus, requires no additional translation once it has been recognized.
Patients with AD may experience a progressive impairment in their ability to process affective information, including a specific impaired ability to recognize facial expressions of emotion [25]. Findings suggest that deficits in facial expression recognition increase significantly with AD severity [26, 27]. Moreover, from a lifespan perspective, facial emotion recognition seems to follow an inverted U-shaped trajectory, with the best performance in younger adults and worse performance in children and healthy older adults [28]. A systematic review [29] conducted to clarify the relationship between facial expression recognition and AD reported two main findings: (a) facial expression recognition has mostly been found to be impaired in AD; and (b) negative emotion identification seems to be more impaired than positive emotion identification. Deficits in facial expression recognition in AD may be associated [30] with increased caregiver burden, decreased quality of life, and a greater probability of institutionalization due to behavioral problems of agitation and aggression linked to not recognizing meaningful people.
The main objective of the present study was to compare the attentional responses on a mixed-block Stroop emotional task using words and faces, examining the influence of the emotional valence. Although the research suggests that both faces and words are automatically processed, we believe that the more complex nature of a face, compared to a word, could diminish its ability to interfere with word recognition in AD patients, because word recognition requires fewer cognitive resources and is therefore more likely to succeed. In addition, selective attention deficits have been found in patients with AD [31], as well as a slowing in healthy older adults when they have to flexibly choose which stimuli to process and which ones to ignore. Therefore, it should be easier to recognize a stimulus that requires fewer cognitive resources. Thus, face processing would produce less interference when presented as a distractor in the recognition of words, and, in turn, it would obtain higher interference scores when it is the stimulus to recognize.
MATERIALS AND METHODS
Participants
The sample was composed of 75 subjects: 25 healthy older adults (HOA; 11 men, 14 women), 25 adults with mild AD (10 men, 15 women), and 25 with moderate AD (9 men, 16 women).
The general inclusion criteria were age over 65 and the absence of significant asymptomatic neurovascular disease, a history of previous symptomatic stroke, any medical condition significantly affecting the brain, and serious psychiatric symptoms. The diagnosis of AD was determined according to the DSM-IV criteria [32]. The inclusion criteria for the mild AD group were: a score on the Mini-Mental State Examination (MMSE) [33] of less than 23 and a Global Deterioration Scale (GDS) [34] between 3 and 4. The inclusion criteria for the moderate AD group were: a score below 19 on the MMSE and a GDS greater than or equal to 4. The healthy older participants were recruited from various senior citizen centers in the city of Valencia (Spain), and all the patients were recruited through the Neurology Department at the General Hospital in Valencia (Spain). The clinical diagnosis was the end result of an extensive evaluation, including medical history and neuropsychological examinations (see Table 1 for demographic and neuropsychological data). All the participants (or close family members) gave written informed consent to participate in the study.
Mean scores (standard deviations and range) on demographic and neuropsychological variables in the different study groups
Instruments
Neuropsychological assessment
The GDS [34] describes seven clinically distinguishable global stages ranging from normality (1) to severe dementia of the Alzheimer-type (7). The GDS analyzes patients’ ability to function, reflected in daily living and instrumental activities, as well as psychiatric morbidity based on progressive cognitive loss. The Center for Epidemiologic Studies Depression Scale (CES-D) [35], adapted to Spanish [36], is a short self-report scale consisting of 20 questions that assess symptoms of depression during the week before the test. Scores on each question range from 0 to 3, with a total score ranging from 0 to 60. The CES-D is widely used in research with adults of all ages.
The MMSE [33] is a screening test that quantitatively estimates the existence and severity of cognitive impairment, without providing a diagnosis of any specific nosological entity. The maximum score is 30 points, which is obtained by adding together the scores on all the items. The cut-off score for cognitive impairment is usually set at 23 points.
The semantic verbal fluency (name animals for one minute) and phonological fluency (words that begin with the letter “p” for three minutes) subtests of the Barcelona Test Revised (TBR) [37] were administered to assess language ability.
Verbal memory (short-term recall and delayed recall) was assessed using the Spain-Complutense Verbal Learning Test (TAVEC) [38]. A list of 16 words from four different categories is presented orally five times to the participants; after each presentation, subjects are assessed on the number of words they remember correctly (scores from 0–16 on each trial, and scores from 0–80 on all five trials). After a 20-min rest period, during which other tests are administered in order to distract subjects, delayed recall is assessed.
Finally, we applied the Memory Alteration Test (T@M) [39]. The T@M is a screening test for amnestic mild cognitive impairment and AD that assesses verbal episodic and semantic memory. The T@M has a minimum of 40 questions and a maximum of 50 (depending on the subject’s free recall success). It has five subtests: encoding, temporal orientation, semantic memory, free recall, and cued-recall (applied if the subject fails the free recall part). The maximum T@M score is 50 points, and a score of 37 provides an optimal cut-off score with a sensitivity of 0.92 and specificity of 0.87 for detecting cognitive impairment.
Emotional Stroop task
The procedure was based on a typical emotional Stroop task [40]. On this task, a set of 84 faces was presented, 42 expressing happiness and 42 expressing sadness, taken from the database by Minear and Park [41]. All the pictures were normalized for size (in a 480×620-pixel format) and luminance and presented in the center of the screen against a black background. The faces were presented twice, yielding a total of 168 trials; the color pictures were of Caucasian subjects matched on gender and age group (young people, adults, and older adults). These images were edited so that an emotion word (happy or sad) was superimposed on the face. The word appeared in the center of the face between the mouth and eyes, horizontally, in black, bold, capital letters in 18-point Times New Roman font. The image could be presented in congruent trials and incongruent trials. If the trial was congruent, the valence of the facial expression corresponded to that of the word; when the trials were incongruent, the facial expression did not coincide with the valence of the word. Half of the stimuli pairings were congruent, and half were incongruent. Stimuli were presented in a random order.
To perform the task, each participant completed two identical blocks of 84 stimuli. In the first block (or face condition), the objective was to respond as quickly as possible to the emotion expressed by the face (happy or sad), with the written word (happy or sad) acting as distractor. To respond, the participant had to press the K key on the computer keyboard if the expression on the face was happy, or the D key if the expression on the face was sad, without taking into account the word written on the image. By contrast, in the second block (or word condition), the objective was to indicate which emotion was written on the image, with the facial expression acting as distractor. The participants had to respond with the K key if the word was happy or the D key if the word was sad, without taking into account the emotional expression on the face. Blocks were counterbalanced across participants.
Each trial began with a + shown at the center of the screen for 1000 ms, followed by a stimulus (face plus word) that appeared for 3 s. During this period, the participant had to respond as quickly as possible by pressing one of the two keys on the keyboard assigned to each of the emotions. After the participant’s response, the next trial began automatically. If the participant did not respond after 3 s, the program continued on to the next trial.
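The block and trial structure described above can be sketched in code. This is only an illustration under stated assumptions, not the software used in the study: it assumes that each block presents the same 84 face-plus-word stimuli, that the congruent/incongruent split is 21/21 within each facial emotion, and that block order simply alternates with the participant index. The function names, field names, and the seeding scheme are hypothetical.

```python
import random

EMOTIONS = ("happy", "sad")
FIXATION_MS = 1000   # fixation cross duration stated in the text
STIMULUS_MS = 3000   # maximum stimulus/response window stated in the text
KEY_FOR = {"happy": "K", "sad": "D"}  # response keys stated in the text


def build_stimuli(faces_per_emotion=42):
    """Return 84 face+word stimuli, half of them congruent."""
    stimuli = []
    for face in EMOTIONS:
        other = "sad" if face == "happy" else "happy"
        for i in range(faces_per_emotion):
            # Assumption: alternate matching and mismatching words,
            # giving a 21/21 split within each facial emotion.
            word = face if i % 2 == 0 else other
            stimuli.append({"face_id": f"{face}_{i:02d}", "face": face,
                            "word": word, "congruent": face == word})
    return stimuli


def build_session(participant_index, seed=0):
    """Two blocks over the same stimuli; block order alternates (assumed)."""
    rng = random.Random(seed + participant_index)
    order = ["face", "word"] if participant_index % 2 == 0 else ["word", "face"]
    session = []
    for condition in order:
        block = build_stimuli()
        rng.shuffle(block)  # random presentation order within the block
        for stim in block:
            target = stim[condition]  # attended dimension: face emotion or word
            session.append({**stim, "condition": condition,
                            "correct_key": KEY_FOR[target]})
    return session
```

Under these assumptions, a session contains 168 trials (two blocks of 84), of which 84 are congruent, and the correct key always follows the attended dimension while the other dimension acts as the distractor.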
Analyses
We analyzed the data using a mixed ANOVA with group as the between-subjects factor (healthy older adults, mild AD, and moderate AD) and condition (faces and words) and valence (positive and negative) as within-subject factors. Post hoc comparisons with Bonferroni correction were performed. All analyses were conducted using SPSS version 21.
RESULTS
To analyze the faces and words on the Stroop emotional task, only RTs on correct responses with values between 200 ms and 3000 ms were used (RTs below 200 ms were assumed to reflect non-deliberate behavior), and all RT scores above 2.5 SD for each participant and condition were discarded. Because only the reaction times on correct trials were considered, all other responses were treated as errors. It should be noted that two types of errors could be distinguished: response errors and response omissions. Information on the two types of errors is presented in Table 2. In addition, three AD participants were eliminated from this analysis, based on the exclusion criterion of errors on 50% or more of the trials in either of the two possible conditions (words or faces).
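The RT screening rule can be illustrated with a short sketch. It is not the authors' procedure as implemented: it assumes that "above 2.5 SD" means more than 2.5 sample standard deviations above the mean of the remaining RTs in each participant-by-condition cell, and the function name and the trial field names (`participant`, `condition`, `correct`, `rt`) are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_rts(trials, low_ms=200, high_ms=3000, sd_cutoff=2.5):
    """Keep correct trials with low_ms < RT < high_ms, then drop slow outliers.

    Outliers are defined per (participant, condition) cell as RTs above
    mean + sd_cutoff * SD of that cell (an assumed reading of the rule).
    """
    cells = defaultdict(list)
    for t in trials:
        if t["correct"] and low_ms < t["rt"] < high_ms:
            cells[(t["participant"], t["condition"])].append(t)

    kept = []
    for cell_trials in cells.values():
        rts = [t["rt"] for t in cell_trials]
        if len(rts) < 2:
            # SD is undefined for a single value; keep the trial as is.
            kept.extend(cell_trials)
            continue
        limit = mean(rts) + sd_cutoff * stdev(rts)
        kept.extend(t for t in cell_trials if t["rt"] <= limit)
    return kept
```

In this sketch, anticipatory responses, omissions at the 3000 ms limit, and incorrect responses are removed before the cell mean and SD are computed, so they do not inflate the outlier threshold.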
Percentage and standard deviation of error and omission responses by condition and group
RTs mean and Stroop interference score on the face and word conditions in the groups
To examine interference effects, difference scores were computed per participant for each condition (face or word) and valence (negative or positive) by subtracting the mean RT on incongruent trials from the mean RT on congruent trials. A positive difference score indicates facilitation, and, conversely, a negative difference score indicates interference.
As Table 2 shows, facilitation occurred only for negative words in the HOA group, where an incongruent positive image did not increase reaction times, as it would have if it had produced interference.
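As a minimal sketch of this scoring, the snippet below computes one difference score per participant, condition, and valence as the congruent mean RT minus the incongruent mean RT, so that negative values indicate interference. The function name and trial field names are hypothetical, and the sketch assumes that `valence` refers to the attended (target) stimulus.

```python
from collections import defaultdict
from statistics import mean


def interference_scores(trials):
    """Return {(participant, condition, valence): congruent - incongruent mean RT}."""
    rts = defaultdict(lambda: {True: [], False: []})
    for t in trials:
        key = (t["participant"], t["condition"], t["valence"])
        rts[key][t["congruent"]].append(t["rt"])
    # Only cells with both congruent and incongruent trials yield a score.
    return {key: mean(by_cong[True]) - mean(by_cong[False])
            for key, by_cong in rts.items()
            if by_cong[True] and by_cong[False]}
```

For example, congruent RTs of 800 and 900 ms and incongruent RTs of 1000 and 1100 ms in the same cell give a score of 850 − 1050 = −200 ms, that is, interference.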
The results showed a significant main effect of condition (F (1, 72) = 15.34; p < 0.001; η2 = 0.176), with higher interference on the faces task (–178.74 versus –63.25), and of valence (F (1, 72) = 11.35; p < 0.001; η2 = 0.136), with less interference for the negative valence (–81.63 versus –160.35). In addition, the main effect of group was also significant (F (2, 72) = 13.19; p < 0.001; η2 = 0.268). Among the interactions, significant results were obtained only for the condition x group interaction (F (2, 72) = 3.44; p = 0.037; η2 = 0.087).
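The reported effect sizes can be cross-checked against the F statistics. The sketch below assumes that the η2 values are partial eta squared, which can be recovered from an F ratio as F·df_effect / (F·df_effect + df_error), and that the error term has 72 degrees of freedom for every effect; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error), with df_error = 72 assumed throughout
effects = [
    ("condition", 15.34, 1, 72),
    ("valence", 11.35, 1, 72),
    ("group", 13.19, 2, 72),
    ("condition x group", 3.44, 2, 72),
]

for label, f_value, df_effect, df_error in effects:
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 3))
```

Under these assumptions, the formula reproduces the four effect sizes given in the text (0.176, 0.136, 0.268, and 0.087).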
Given that the condition-x-group interaction was significant, post hoc tests were applied to study the differences (see Fig. 1 for interference score). First, each group was analyzed independently. Whereas in the healthy group no differences were observed in the interference scores on the two tasks, the interference on the word task was significantly lower in the mild AD (p = 0.040) and moderate AD (p < 0.001) groups. When the groups were compared on the face recognition task, significant differences were obtained between the group of healthy adults and the mild AD (p = 0.009) and moderate AD (p < 0.001) groups, with greater interference in the AD groups, and between the mild AD and moderate AD (p < 0.001) groups, with more interference in the moderate AD group. In the word recognition condition, no differences were obtained between the groups.

Interference on the tasks by group.
DISCUSSION
This study aimed to analyze the functioning of a sample of older adults with and without dementia on an emotional Stroop task. The results of two experimental conditions (faces and words) were compared to find out if there were differences in the interference that occurs in each block and depending on the valence. The findings showed that negative stimuli present less interference, and that interference is greater in the face recognition condition. In addition, when the two conditions are compared, the dementia groups present less interference on the word task. Finally, the comparison of the groups indicates significant differences in the faces condition, with greater interference when there is greater AD severity.
The results for interference in healthy older adults indicated a facilitating effect on the recognition of words with a negative valence. That is, incongruent stimuli (positive faces) did not slow responses to negative words relative to congruent stimuli. The application of different evaluation processes can explain these results because any savings in processing time on the congruent trials might be canceled out by a double-check procedure [42]. Thus, when an incongruent stimulus appears, subjects would adopt a speed strategy and apply only a set check, where a response that does not belong to the target set is rejected. By contrast, when presented with a congruent stimulus, subjects would adopt an accuracy strategy and, beyond applying a set check, employ a source check to reject a response that belongs to the target set but is not the target response on a given trial. Therefore, the healthy participants, instead of emitting the only available response immediately, performed a source check to be sure. On the congruent trials, the response would thus be verified twice, and this additional step would add time to the process that would cancel out the time saved in processing congruent stimuli, which could be hiding a possible congruence effect.
In addition, less interference was obtained for negative stimuli, although when the reaction times are observed, some results are noteworthy. The fastest RTs are found for congruent positive stimuli, but the appearance of a negative incongruent stimulus considerably increases the RT, causing the interference score to increase considerably, whereas the RTs for the negative stimuli, both congruent and incongruent, are more similar, which reduces their interference score. Two conclusions can be drawn from these results. First, it seems that the positive effect is maintained, and that healthy older adults have a predisposition towards positive information that would be consistent with the theory of social-emotional selectivity. The positivity effect proposes that older adults implement chronically active emotional regulation goals to prioritize their attentional focus and processing resources toward positive information, in contrast to younger adults, who do not have these chronically active goals [43].
By contrast, the interference in the negative stimuli is smaller; that is, the difference between the RTs to congruent stimuli (both negative) and incongruent stimuli (when a positive stimulus interferes) is reduced. However, when a negative stimulus interferes with a positive one, the RT increases, and so the interference is greater for positive stimuli. The valence of the stimulus is crucial because all negative stimuli, but not positive stimuli, attract attention [44], and the prioritization of negative emotions facilitates emotional conflict resolution [45]. Moreover, it is assumed that negative stimuli elicit a stronger reaction than positive stimuli [46], and that, in general, negative stimuli receive processing priority [47]. This differential processing is evolutionarily adaptive because negative information is more relevant for immediate survival than the potential long-term benefits of positive information. Negative affect carries an important signaling value because it indicates to the organism that there is a need to change or adjust its current state or activity. This strong negative affect gives rise to more effortful, systematic, analytical, and vigilant information processing, whereas positive affect triggers more heuristic, superficial, and rapid processing.
Regarding the conditions (faces and words), the results indicate that interference is greater for face recognition, and this result is consistent with those obtained using affective faces and words in a similar modified version of the Stroop task [22]. These authors pointed out that, although the innate ability to recognize faces could be expected to be processed more automatically, the results indicated otherwise. They found that attending to words is processed in a more automatic manner than attending to faces. Other studies have obtained contradictory results [21] and found that facial expressions produced greater interference effects than words, suggesting that affective faces were processed more automatically. These contradictory results could be due to differences in the stimuli presentation time and saliency. As previously indicated, both types of stimuli used in our experiment can be processed automatically: the reading of words is over-learned, and the recognition of faces is important for survival. Nevertheless, the brain must prioritize between them in order to be as efficient as possible. One possible explanation could be that word reading is computed faster because more neural resources have been dedicated to its processing over the years, making our brains more efficient and, hence, faster at this coding [22]. Furthermore, the complex nature of a face compared to a word could also be a contributing factor to the longer latencies observed during the processing of facial expressions.
When comparing the effect of interference in each group, the findings revealed that, whereas in the group of healthy older adults this effect is similar in both conditions, in the AD groups the interference is lower on the word task than on the faces task. Thus, when AD patients have to recognize a word, less interference is produced by the face than in the opposite condition. AD patients show executive dysfunction that includes poor selective attention, failure to inhibit interfering stimuli, and poor manipulation skills [48, 49]. These deficits, together with the more complex nature of faces, as opposed to the over-learning and automaticity of words, may explain why AD patients show lower interference in the word condition than in the faces condition.
Additionally, several studies found deficits in facial emotion recognition in people with AD, compared to healthy older adults, and a significant increase in this deficit with disease severity [26]. This recognition difficulty would justify the results obtained when comparing the interference of the groups in the faces condition, where interference is greater when the disease is more severe. Furthermore, some studies have considered the influence of the intensity of the expression on the recognition of the emotion portrayed, and findings suggest that people with AD have more difficulty in recognizing low intensity emotional facial expressions than high intensity expressions [27]. Neuroanatomical damage in AD, specifically changes in the anterior medial frontal cortex, medial temporal cortex, and amygdala, may underlie impairments in facial emotion recognition in AD that increase as the condition advances [50].
From a clinical perspective, impairments in the ability to recognize facial affective expressions may lead to social dysfunction and difficulties with interpersonal communication in people with AD [51]. It is important to find out about these difficulties from family members and caregivers because AD patients with impairments in facial recognition can experience greater social isolation due to their interaction problems. In addition, difficulties in interpreting nonverbal communication can lead to misunderstandings that deteriorate their relationships. Therefore, words can be a better method for recognizing emotions, but research should examine whether this is transferable to common objects and would facilitate their recognition, which could help people with AD to adapt to aspects and tasks of daily life. In addition, it is important to transmit consistent information without having to choose between two sources of information.
A limitation that future studies could address is that, in the current task, both stimuli (faces and words) appear together in both conditions. In addition, neutral stimuli were not used in this study. Therefore, to separate facilitation and interference effects, there should be at least one baseline condition (e.g., using words or faces alone, or using neutral faces or words).
