Abstract
In contrast to standard models of emotional valence, which assume a bipolar valence dimension ranging from negative to positive valence with a neutral midpoint, the evaluative space model (ESM) proposes two independent positivity and negativity dimensions. Previous imaging studies suggest higher predictive power of the ESM when investigating the neural correlates of verbal stimuli. The present study investigates further assumptions on the behavioral level. A rating experiment on more than 600 German words revealed 48 emotionally ambivalent stimuli (i.e., stimuli with high scores on both ESM dimensions), which were contrasted with neutral stimuli in two subsequent lexical decision experiments. Facilitative processing for emotionally ambivalent words was found in Experiment 2. In addition, controlling for emotional arousal and semantic ambiguity in the stimulus set, Experiment 3 still revealed a speed-accuracy trade-off for emotionally ambivalent words. Implications for future investigations of lexical processing and for the ESM are discussed.
Introduction
In the past few decades, emotional processing has attracted attention as a main research topic in psychology. Two major theoretical approaches dominate the discussion so far, namely, dimensional theories and theories that assume discrete emotions. According to the first, a limited number of independent affective dimensions accounts for the entire human emotional experience. The two most important and best understood dimensions are emotional valence, indicating the hedonic value of a specific emotion as either positive or negative, and emotional arousal, indicating its intensity (Bradley & Lang, 2000; Russell, 2003). Some theories additionally suggest a third dimension, for example, dominance, indicating the feeling of being in control versus being controlled (Bradley & Lang, 1999), or an approach-withdrawal dimension (Davidson, 1993, 1995). Discrete emotion theories, the second major theoretical approach, postulate that a limited number of discrete emotions with specific characteristics, physiological correlates, and behavioral action tendencies trigger emotional experiences (e.g., Panksepp, 1998). Although the exact number of discrete emotions is debated, at least five discrete emotions (happiness, anger, disgust, fear, and sadness) are widely accepted.
Dimensional theories and theories that assume discrete emotions differ in many aspects, but despite their differences, with only few exceptions, they agree in one assumption: Emotional experiences are either positive or negative. This, however, is questioned by the evaluative space model (ESM), which suggests that emotional experiences are sometimes both positive and negative at the same time (Cacioppo & Berntson, 1994; Norman et al., 2011). Therefore, according to the ESM, three affective states have to be differentiated: positive, negative, and emotionally ambivalent (i.e., positive and negative). Evidence supporting the ESM assumptions is, for example, provided by an analysis of verbal reports (J. T. Larsen, McGraw, & Cacioppo, 2001) or by evaluations of bittersweet films (J. T. Larsen & McGraw, 2011), where participants reported feeling happy and sad simultaneously. Using a rating scale designed to differentiate between simultaneous mixed feelings (i.e., ambivalent emotions) and sequential mixed feelings (i.e., different emotions in very fast succession, but not at the same time), Carrera and Oceja (2007) reported ratings indicating emotional ambivalence both while participants recalled naturally occurring situations and during emotion induction. Film clips depicting disgusting humor also seem to rely on emotional ambivalence (Hemmenover & Schimmack, 2007). Whereas most studies investigating the ESM rely on behavioral data and explicitly manipulate the stimulus material to be as ambivalent as possible, recent neuroimaging studies in support of the ESM relied on rather implicit approaches.
Investigating which emotional valence conception best predicts hemodynamic responses while participants indicate whether a presented word refers to themselves or not, Lewis, Critchley, Rotshtein, and Dolan (2007) found that the standard linear bipolar valence model (Bradley & Lang, 2000; Russell, 2003) accounts for a considerable amount of variance, but is unexpectedly outperformed by a U-shaped hedonic value model and a model assuming two independent positivity and negativity dimensions (see Lewis et al., 2007, Figure 3). This result was replicated by Viinikainen et al. (2010) using pictures taken from the International Affective Picture System (IAPS; Lang, Bradley, & Cuthbert, 2005) instead of words and a valence judgment paradigm instead of a self-referential task with essentially the same results: Negative and positive valence correlated with distinct brain regions, and a U-shaped model additionally accounted for ambivalence in the stimuli (see also Viinikainen, Kätsyri, & Sams, 2012, using auditory stimuli). Even in studies that were explicitly designed to support bipolar models of affective space, a parametric mapping of the valence dimension reveals some regions that reflect negative valence (e.g., the dorsolateral prefrontal cortex, anterior midcingulate cortex, frontal pole, inferior parietal cortex), whereas others reflect positive valence (e.g., substantia nigra, ventral striatum, right caudate nucleus; see Colibazzi et al., 2010). These results thus question a neural basis for the standard linear bipolar valence model and instead support the assumption of two independent positivity and negativity dimensions.
Given the explanatory power of the independent dimensions model (Lewis et al., 2007; Viinikainen et al., 2010), and considering that different brain regions show parametric activity depending on whether positive or negative emotions are induced (Colibazzi et al., 2010), it seems promising to continue the search for valence-ambivalence effects, which have so far been documented solely in explicit tasks, by applying the ESM to tasks where the manipulation of the affective material is implicit to the task requirements. Knowing that language and emotion are closely related to one another (Barrett, Lindquist, & Gendron, 2007; Panksepp, 2008), with emotion effects being documented in numerous studies using spoken (e.g., Buchanan et al., 2000; Mitchell, Elliott, Barry, Cruttenden, & Woodruff, 2003; Ververidis & Kotropulos, 2006) and written word stimuli (e.g., Briesemeister, Kuchinke, & Jacobs, 2011a, 2011b; Hofmann, Kuchinke, Tamm, Võ, & Jacobs, 2009; Kuchinke, Võ, Hofmann, & Jacobs, 2007), the current study was meant to directly investigate and contrast the explanatory power of the unidimensional valence model and the independent dimensions valence model using lexical stimuli. After collecting data in an explicit valence rating task (Colibazzi et al., 2010; Viinikainen et al., 2010), two lexical decision tasks (LDTs) were used, wherein the processing of the words’ valence is incidental to the task requirements (Kuchinke et al., 2007).
Computational models of visual word recognition mostly focus on the simulation of orthographic and/or phonological processes (e.g., Coltheart, Curtis, Atkins, & Haller, 1993; Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Grainger & Jacobs, 1996; Jacobs, Graf, & Kinder, 2003), while ignoring affective variables. The extended multiple read-out model (MROMe, see Kuchinke, 2007) is the only known exception, considering affective information to modulate lexical decisions incidentally before the recognition of a particular word. The MROMe predicts facilitated processing of emotionally laden words (i.e., faster response times [RTs] and/or fewer errors) but unfortunately does not differentiate between positive and negative valences. In general, regression analyses on lexical decision and naming data suggest that lexical decision times mirror independent effects of positivity and negativity fairly well (Estes & Adelman, 2008a, 2008b). Thus, based on the MROMe, the neuroimaging results of Lewis et al. (2007) and Viinikainen et al. (2010), and the numerous previous word recognition studies (Briesemeister et al., 2011a, 2011b; Estes & Adelman, 2008a, 2008b; Hofmann et al., 2009; Kuchinke et al., 2007), we expected to find facilitated processing for emotionally ambivalent words.
Experiment 1: Rating Words
Previously published stimulus lists including affective words focus on the unidimensional valence model (e.g., Bradley & Lang, 1999; Võ et al., 2009) or discrete emotions (Briesemeister et al., 2011b; Stevenson, Mikels, & James, 2007). To provide an independent dimensions valence model of emotional words that would allow for a comparison of ratings collected on the basis of the unidimensional approach with ratings on the basis of the two-dimensional approach, we collected independent positivity and negativity ratings for words taken from the Berlin Affective Word List–Reloaded (BAWL-R; Võ et al., 2009; for an overview, see also Briesemeister, Hofmann, Kuchinke, & Jacobs, 2012). Nouns that might contain ambivalent (positive and negative) emotional valences were identified by three independent judges (B.B.B. and two students), resulting in a list of 662 words for the rating. Assuming that a linear bipolar valence dimension is sufficient to describe emotional experience, we expected that the BAWL-R valence ratings highly correlate with the ratings collected on independent positivity and negativity dimensions. Moreover, a high negative correlation between positivity and negativity scales can be predicted based on unidimensional approaches, because negative words are suggested to load high on the negativity scale and low on positivity, whereas positive words are suggested to load high on the positivity scale and low on negativity, respectively. Observing a moderate to low correlation, however, would support the appropriateness of the ESM in language processing.
Method
Participants
Altogether, 71 different participants (43 female; M age = 27.24, SD = 6.53, range = 19-59) were recruited at the Freie Universität Berlin to participate in four rating studies, with some participants taking part in more than one rating. They were offered either course credit or 5 Euros per completed rating study. Some participated voluntarily without recompense.
Material and procedure
The 662 words identified as having the potential for being ambivalent on the valence dimension were presented to the participants in four separate rating studies, each study using a different stimulus list. The first three ratings contained 200 items each; the fourth rating contained the remaining 62 items. Ratings were collected using an online questionnaire.
Positivity and negativity scores were collected independently but in one session on the same participants. Participants were instructed to carefully read the presented word before judging its positivity and its negativity by clicking on a 7-point Likert-type scale (1 = low positivity/negativity, 7 = high positivity/negativity). Each word was presented individually in black uppercase letters (font type = Times New Roman, font size = 18) on white background in randomized order. Participants were able to individually decide when to attend to the next trial by clicking on a button. Half of the participants started rating all words on positivity, followed by the negativity rating of all words. This order was reversed in the second half of the participants. Participants were explicitly allowed to rate more than one of the four different stimulus lists, with 7 participants rating more than one list. Online ratings were averaged offline per item and per valence condition, using JMP software (Version 7, SAS Institute Inc., Cary, NC). Each word received ratings from at least 20 different participants.
Rating Results
Correlation analysis was computed using SPSS software (Version 13.0, SPSS Inc., USA) on the mean positivity and negativity ratings of each word using two-tailed testing and an a priori significance level of .05. The collected positivity and negativity ratings were correlated with r = −.876 (p < .01). Both ratings also correlated with the valence ratings from BAWL-R. Whereas positivity and valence were positively correlated (r = .902, p < .01), negativity was negatively correlated with unidimensional valence (r = −.874, p < .01). These results are depicted in Figure 1.

Figure 1. The relationship between unidimensional and two-dimensional valence
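The item-level correlation analysis reported above can be sketched as follows. This is a minimal illustration, not the authors' SPSS procedure: the function name `pearson_r` and the six item means are hypothetical and serve only to show how the three pairwise correlations (positivity with negativity, and each with unidimensional valence) are obtained from per-item mean ratings.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of item means."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()  # center both variables
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

# Hypothetical item means (NOT the study's data): positivity and negativity
# on the 7-point scales, valence on the bipolar unidimensional scale.
positivity = [5.8, 4.9, 2.1, 1.6, 3.4, 2.5]
negativity = [1.4, 2.0, 5.2, 6.1, 3.6, 2.2]
valence = [2.1, 1.5, -1.8, -2.4, 0.1, 0.0]

r_pos_neg = pearson_r(positivity, negativity)  # expected to be negative
r_pos_val = pearson_r(positivity, valence)     # expected to be positive
r_neg_val = pearson_r(negativity, valence)     # expected to be negative
print(r_pos_neg, r_pos_val, r_neg_val)
```

A high linear correlation of this kind summarizes the whole item set; it does not rule out a subset of items that score high on both independent scales.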
Discussion and Introduction to Experiment 2
Most theories concerning affective processing assume that emotions are either positive or negative, but not positive and negative at the same time (e.g., Bradley & Lang, 2000; Panksepp, 1998; Russell, 2003). On this point, they crucially differ from the ESM, which explicitly suggests ambivalent emotional experiences (J. T. Larsen et al., 2001). The concept of coactivation of positive emotions (e.g., happiness) and negative emotions (e.g., sadness) is usually investigated in contexts where highly complex stimuli are used, such as emotion-inducing film clips (e.g., J. T. Larsen et al., 2001) or music (e.g., Juslin & Västfjäll, 2008). It is easy to imagine that a piece of music is itself perceived as being sad while it triggers memories of happy events, or that a film clip depicts a funny scene in an overall dramatic context.
At the same time, there is also some evidence from recent neuroimaging studies using less complex pictures or words that positivity and negativity are processed independently in different brain regions (e.g., Lewis et al., 2007; Viinikainen et al., 2010). Following these data, wherein positivity and negativity evaluations are associated with activity in distinct cortical networks, it can be speculated that independent neural processing of positive and negative evaluations accounts for the observed ambivalence effects. However, it should also be noted that the current neuroimaging literature is at least heterogeneous, if not contradictory, with respect to the neural basis of emotional valence processing and far from testing the appropriateness of current models of affective space.
According to unidimensional valence approaches, asking participants to independently judge a word’s positivity and negativity should lead to basically the same results as when applying a unidimensional valence scale to the ratings. Positivity and negativity ratings should highly correlate with unidimensional valence ratings and should correlate highly negatively with each other. This is exactly what was found.
The correlations between unidimensional valence and the alternative two-dimensional conception were .902 and −.874, respectively. As expected, by far, most of the variance in a two-dimensional valence conception is accounted for by a unidimensional valence scale. In addition, the positivity and negativity ratings were correlated with −.876. Words judged as being positive (i.e., positivity rating > 3; M positivity = 4.1) were also judged as not negative (M negativity = 2.4; range = 1.15-4.6), and negative stimuli (i.e., negativity rating > 3; M negativity = 4.1) were judged as not positive (M positivity = 2.4; range = 1-4.1). This mirrors exactly the core difference between the unidimensional valence conception (e.g., Bradley & Lang, 2000; Russell, 2003) and its two-dimensional reformulation (Cacioppo & Berntson, 1994).
Thus, at first sight, it seems obvious that a unidimensional valence account is sufficient to describe two-dimensional valence ratings. When taking a closer look, however, three details may challenge this conclusion. First, some stimuli (~7%) received scores >3 on both the positivity and the negativity scale. They are rated as being positive and negative at the same time and can therefore be considered as ambivalent. It is worth noting that these mean rating scores are not biased in the sense that some participants rated them as being very negative but not positive whereas others rated them as being very positive but not negative, which also would result in scores >3 on both scales. In fact, taking a look at the raw data, these 48 words were rated as being at least slightly emotion inducing on both scales (i.e., ratings ≥ 2) by more than 50% of the participants (see also the “rated as ambivalent by” column in the online appendix). Second, taking a very close look at Figure 1, one will recognize that the majority of items used for the rating can be classified as being neutral according to the BAWL-R criteria. Out of 662 items, 386 have normative valence ratings between −0.6 and 0.6. Thus, these words are neither positive nor negative according to the BAWL-R norms. Assuming that the ESM is correct, and assuming that these words are actually processed on two independent positivity and negativity scales, would this not be exactly the result to be expected when participants are asked to rate ambivalent words on one dimension? It seems plausible that such emotionally ambivalent stimuli are rated on one dimension as lying between the ends of the scale, that is, as being neutral. A further detail visible in Figure 1 substantiates this thought, being the third detail to challenge the obvious interpretation of the results. Items on the positive pole of the unidimensional valence scale are also rated as highly positive on the positivity scale and as not negative on the negativity scale.
An equivalent relationship is observable for items on the negative pole. Items in the neutral range of the unidimensional valence scale, however, exhibit a relatively wide distribution on the two independent dimensions (1.85-4.65 for negativity; 1.95-4.65 for positivity). As a result, 44 of the above-mentioned 48 items that have high scores (>3) on both independent dimensions fall within the neutral range of the unidimensional valence scale. In support of the ESM, emotionally ambivalent words are mainly rated as being neutral according to the unidimensional approach, which also explains the relatively small effect when computing linear correlations, resulting in the above-reported high correlation coefficients.
Based on these considerations, the second experiment was planned to directly test whether neutral, nonaffective words (i.e., neutral on the unidimensional scale and low on both negativity and positivity) and ambivalent words (i.e., neutral on the unidimensional scale and high on both negativity and positivity) affect the participants’ performance in a cognitive task differentially, as would be expected from the ESM. The lexical decision paradigm was chosen for this purpose, being a standard test using verbal stimuli (Jacobs & Grainger, 1994) that is known to be reliably affected by emotional content (Briesemeister et al., 2011a, 2011b; Estes & Adelman, 2008a, 2008b; Hofmann et al., 2009; Kousta, Vinson, & Vigliocco, 2009; Kuchinke et al., 2007; R. J. Larsen, Mercer, Balota, & Strube, 2008). Using a standard manipulation with positive, negative, and neutral words and an additional emotionally ambivalent stimulus category, faster responses to positive and slowed responses to negative words (Carretié et al., 2008; R. J. Larsen et al., 2008), but no difference between neutral and emotionally ambivalent words, were expected according to the unidimensional valence model. In contrast, given the prediction of the MROMe that affective information facilitates the LDT, facilitative processing of emotionally ambivalent words when compared with nonaffective neutral words was expected according to the ESM.
Method
Participants
For the LDT, 25 native German participants (19 female; M age = 27.2, SD = 5.57, range = 20-42; 2 left-handed) reporting normal or corrected to normal vision were recruited at the Freie Universität Berlin. Participants received course credit for participation; some participated voluntarily without recompense.
Material
The stimulus set consisted of 140 nouns taken from the ratings described above and an equal number of nonwords. Four conditions were constructed, each containing 35 words. Words in the positive condition had a unidimensional valence rating above 1 according to the BAWL-R, a positivity rating above 3, and a negativity rating below 3 (e.g., “TOLERANCE”). Negative words had a unidimensional valence rating below −1, a positivity rating below 3, and a negativity rating above 3 (e.g., “ATTACK”). Neutral (e.g., “MOTOR”) and emotionally ambivalent (e.g., “SCHOOL”) words were neutral according to unidimensional theories (valence between −0.6 and 0.6; t < 1), but differed with respect to the two-dimensional approach. Whereas all words in the neutral condition had positivity and negativity ratings below 3, words in the ambivalent condition were both positive and negative (both ratings > 3).
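The selection criteria for the four conditions can be written down as a small decision rule. The sketch below is illustrative only: the function name `classify_word` is hypothetical, and treating the neutral valence range as inclusive of its bounds is an assumption, since the text only says "between −0.6 and 0.6". Words that satisfy none of the four rules (for example, a rating of exactly 3) are left unclassified.

```python
def classify_word(valence, positivity, negativity):
    """Assign a word to one of the four Experiment 2 conditions, or None.

    valence:    unidimensional valence norm (BAWL-R)
    positivity: mean positivity rating (7-point scale)
    negativity: mean negativity rating (7-point scale)
    """
    if valence > 1 and positivity > 3 and negativity < 3:
        return "positive"
    if valence < -1 and positivity < 3 and negativity > 3:
        return "negative"
    # Assumption: the neutral valence range includes its bounds.
    if -0.6 <= valence <= 0.6:
        if positivity > 3 and negativity > 3:
            return "ambivalent"
        if positivity < 3 and negativity < 3:
            return "neutral"
    return None  # word does not qualify for any condition

# Hypothetical rating triples, not taken from the norms:
print(classify_word(1.8, 4.6, 1.9))   # positive
print(classify_word(0.2, 3.5, 3.4))   # ambivalent
print(classify_word(0.1, 2.2, 2.0))   # neutral
```

The rule makes the key design point explicit: neutral and ambivalent words share the same unidimensional valence range and are separated only by the two independent ratings.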
All four conditions were controlled for the variables length (4-8 letters), syllables, imageability, phonemes, frequency, orthographical neighborhood size, and bigram frequency (token count), using ANOVAs (all F < 1). Neutral and ambivalent conditions were controlled for unidimensional valence from BAWL-R (Võ et al., 2009), using a t test (t < 1). Stimulus characteristics are summarized in Table 1.
Table 1. Stimulus Characteristics (Ms and SDs) for Experiment 2
Note: N = orthographical neighborhood size; valence = unidimensional valence. All measures were taken from the Berlin Affective Word List–Reloaded (BAWL-R).
An additional 140 words were taken from the BAWL-R to create pronounceable nonwords by changing one or two letters. Nonwords were matched to words on length (4-8 letters) and syllables using a t test (t < 1).
Procedure
Participants were seated in a quiet room in front of a 15-inch laptop screen. They were instructed to decide as fast and as accurately as possible whether they were presented a correct German word or a nonword. Decisions were made using left and right index fingers, lying on the respective SHIFT buttons. The button-to-response assignment was counterbalanced across participants. After nine practice trials—not belonging to the stimulus set and therefore excluded from any analysis—the experimenter left the room, provided that participants did not have further questions.
Stimuli were presented with Presentation 9.9 software (Neurobehavioral Systems Inc., Canada) in randomized trial order in the center of a blank white screen, using black uppercase letters (font type = Arial, size = 24, ~ 0.56° vertical visual angle). Each trial began with a fixation cross (+) presented for 500 ms, followed by the stimulus (500 ms) and another fixation cross, presented until the button press.
Data preparation
Error-free mean RTs were calculated for each condition and each participant. Trials with responses faster or slower than the individual mean RT ± 3 SD were excluded as outliers (1.5%). For error analyses, behavioral errors were summed up per participant and condition. All analyses were computed using SPSS software (Version 13.0, SPSS Inc., USA) at an a priori significance level of .05.
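The data preparation steps can be sketched for a single participant as follows. This is a minimal sketch under stated assumptions: the function name `condition_means` and the trial tuples are hypothetical, and the outlier bounds are computed from all of the participant's correct trials pooled across conditions, which is one reading of "individual mean RT ± 3 SD" (the text does not say whether the bounds were computed per condition).

```python
from collections import defaultdict
from statistics import mean, stdev

def condition_means(trials, n_sd=3.0):
    """Mean RT per condition for ONE participant.

    trials: list of (condition, rt_ms, correct) tuples.
    Error trials are dropped first; remaining trials outside
    mean +/- n_sd * SD of that participant's correct RTs are excluded.
    """
    correct_rts = [rt for _, rt, ok in trials if ok]
    m, sd = mean(correct_rts), stdev(correct_rts)
    lower, upper = m - n_sd * sd, m + n_sd * sd
    by_condition = defaultdict(list)
    for condition, rt, ok in trials:
        if ok and lower <= rt <= upper:
            by_condition[condition].append(rt)
    return {cond: mean(rts) for cond, rts in by_condition.items()}

# Hypothetical participant: 19 ordinary trials, one very slow response,
# and one error trial that is ignored for the RT analysis.
example = [("neutral", 600 + i, True) for i in range(19)]
example += [("neutral", 5000, True), ("neutral", 450, False)]
print(condition_means(example))
```

In this example the 5000 ms response lies beyond three standard deviations of the participant's mean and is excluded, so the neutral mean is computed from the 19 ordinary trials only.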
Results
A repeated-measures ANOVA over all four conditions (positive, negative, neutral, and emotionally ambivalent) revealed a significant main effect in RTs, F(3, 72) = 13.247, p < .001, partial η² = 0.356. Planned pairwise comparisons using matched-pairs t tests revealed faster responses to positive words (M = 634 ms, SD = 119 ms) than to negative (M = 671 ms, SD = 133 ms), t(24) = −5.664, p < .001, partial η² = 0.572, and to neutral (M = 662 ms, SD = 139 ms), t(24) = −4.346, p < .001, partial η² = 0.440, words. In addition, emotionally ambivalent words (M = 643 ms, SD = 122 ms) were processed significantly faster than neutral, t(24) = −2.357, p = .027, partial η² = 0.188, and negative, t(24) = −4.888, p < .001, partial η² = 0.446, words. The RT difference between emotionally ambivalent and positive stimuli did not reach significance, t(24) = −1.795, p = .085, partial η² = 0.118.
Concerning the error rates (ERR), a repeated-measures ANOVA over all four conditions (positive, negative, neutral, and emotionally ambivalent) revealed a significant main effect, F(3, 72) = 11.551, p < .001, partial η² = 0.325. Planned pairwise comparisons using matched-pairs t tests revealed a lower ERR for positive words (M ERR = 2.2, SD = 1.4) when compared with emotionally ambivalent (M ERR = 4.2, SD = 2.3), t(24) = −4.612, p < .001, partial η² = 0.470, neutral (M ERR = 4.6, SD = 2.6), t(24) = −4.576, p < .001, partial η² = 0.466, and negative (M ERR = 4.6, SD = 2.8), t(24) = −4.537, p < .001, partial η² = 0.462, words. Neutral and emotionally ambivalent words did not differ in their respective ERR, t(24) = 0.910, p = .372, partial η² = 0.033. For an overview, results are also summarized in Figure 2.

Figure 2. Mean response times and error rates for Experiment 2
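For readers who want to relate the reported test statistics to the effect sizes, partial eta squared can be recovered from a t or F value and its degrees of freedom with the standard conversion formulas. The sketch below is an illustration, not a description of how the authors computed their values; the function names are hypothetical. The two printed examples use the omnibus RT test and the positive versus negative RT comparison reported above.

```python
def partial_eta_squared_from_f(f, df_effect, df_error):
    """Partial eta squared from an F statistic: F*df1 / (F*df1 + df2)."""
    return (f * df_effect) / (f * df_effect + df_error)

def partial_eta_squared_from_t(t, df):
    """Partial eta squared from a paired t statistic: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

# Omnibus RT effect, F(3, 72) = 13.247
print(round(partial_eta_squared_from_f(13.247, 3, 72), 3))  # 0.356

# Positive vs. negative RTs, t(24) = -5.664
print(round(partial_eta_squared_from_t(-5.664, 24), 3))     # 0.572
```

Because a paired t test with df degrees of freedom is equivalent to an F test with (1, df) degrees of freedom, the two functions agree whenever F = t².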
Discussion and Introduction to Experiment 3
Experiment 2 was meant to test whether neutral, nonaffective words are processed differently from ambivalent words, using a cognitive task where the manipulation of the affective information is incidental to the task requirements. The results seem clear-cut: Stimuli rated as neutral on one- and two-dimensional valence scales were processed significantly more slowly than stimuli rated as neutral on unidimensional valence but as ambivalent on two independent scales, which is perfectly in line with the assumptions of the ESM. The unidimensional valence model alone, in contrast, cannot account for this difference. Considering this result, the variability in the two-dimensional ratings observed in Experiment 1 seems no coincidence: A manipulation within this distribution significantly affects the processing speed in an LDT, even if unidimensional valence is controlled for. Considering that emotional words are theoretically predicted (MROMe; Kuchinke, 2007) and empirically known to facilitate the LDT (Kousta et al., 2009), the variance in the neutral range of Figure 1 and the observed processing facilitation of emotionally ambivalent words seem to support the ESM.
The current study, however, not only focused on a neutral versus ambivalent contrast but also included positive and negative words. Taking a close look at these and comparing the effects of Experiment 2 with the existing literature, the ESM again seems to outperform the single unidimensional valence dimension: In the present study, words in the positive condition were processed faster than neutral and negative words, which is a consistent finding (e.g., Hofmann et al., 2009; Kuchinke et al., 2005, 2007; R. J. Larsen et al., 2008). Negative words, however, were processed more slowly than positive and ambivalent words only; they did not differ significantly from neutral words. Based on the existing literature, a slowdown in processing of negative words when compared with neutral stimuli had been expected (Carretié et al., 2008; R. J. Larsen et al., 2008), which was not replicated. Yet again, the ESM might account for this inconsistency: Given that previous lexical decision studies did not differentiate between neutral and emotionally ambivalent words, and assuming that they nonetheless involuntarily included both stimulus types, the mean RTs for a mixed neutral condition including truly neutral and ambivalent items might actually be faster than for a purely neutral condition as observed in Experiment 2. This, in turn, would result in greater RT differences between the negative and mixed neutral conditions and hence in a significant effect. Post hoc analyses on the data collected in Experiment 2, collapsing the ambivalent and the neutral words into a mixed neutral condition, support just that: A repeated-measures ANOVA with the three-level factor emotion (positive, negative, mixed neutral) revealed a significant main effect of emotion, F(2, 48) = 22.799, p < .001, partial η² = 0.487, with words in the mixed neutral condition being processed faster than negative words, t(24) = −3.507, p = .002, partial η² = 0.339, but slower than positive words, t(24) = 4.138, p < .001, partial η² = 0.416.
Thus, the ESM might help to explain why the inhibitory effect for negative valence (Carretié et al., 2008; R. J. Larsen et al., 2008) does not reach significance in all experiments (e.g., Kissler & Koessler, 2010; Siegle, Granholm, Ingram, & Matt, 2001).
Furthermore, based on the results of Experiment 2, one might ask whether the observed differences between the four conditions can be explained by the words’ positivity alone. Whereas positive and negative words as well as neutral and emotionally ambivalent words differ in positivity and negativity norms, the RT differences between positive and neutral as well as between emotionally ambivalent and negative words seem to be caused by differences in positivity norms alone. Thus, all RT differences in Experiment 2 seem to be related to differences in positivity, accompanied by a processing facilitation in the case of higher positivity. A post hoc linear multiple regression analysis predicting the RT data collected in Experiment 2 with positivity and negativity as predictors seems to confirm this: Although negativity did not explain RT differences (β = −0.126, t = −0.758, p = .449), positivity approached significance (β = 0.303, t = −1.826, p = .070).
Up to this point, the ESM accounts for all results in Experiments 1 and 2, while the unidimensional valence approach reaches its limits. This, however, might also be explained by two possible confounds. First, as mentioned in the introduction, most dimensional theories propose a two-dimensional affective space, combining emotional valence with an emotional arousal dimension (Bradley & Lang, 2000; Russell, 2003). Emotional arousal is not considered a reliable dimension in the ESM and, accordingly, the stimulus set in Experiment 2 was not controlled for arousal. Even though the work of Kousta et al. (2009) suggested that emotional arousal does not provide explanatory value in predicting lexical decision RTs, thus suggesting that controlling for emotional arousal would not affect the previous results, several studies applying the unidimensional valence conception to their data document that highly arousing stimuli speed up lexical decision times (Hofmann et al., 2009; R. J. Larsen et al., 2008). Knowing that emotional arousal causes higher dimensional interactions with valence (R. J. Larsen et al., 2008), arousal might have affected the processing of the emotionally ambivalent but not of the neutral words. This alternative explanation seems to find support in Figure 1: Most words classified as being neutral in Experiment 1, and thus selected for the neutral condition in Experiment 2, are also low-arousing words. Emotionally ambivalent words, in contrast, are mainly high in arousal, replicating the reported high correlation between arousal and negativity (R. J. Larsen et al., 2008) and probably indicating that most of the variance visible in Figure 1 might be due to emotional arousal.
Second, emotional ambivalence might also be confounded with semantic ambiguity, which is also known to facilitate processing in the LDT (e.g., Atchley, Grimshaw, Schuster, & Gibson, 2011; Hino, Kusunose, & Lupker, 2010; Hino, Lupker, & Pexman, 2002; Rodd, Gaskell, & Marslen-Wilson, 2002). Semantically ambiguous words receive their particular meaning depending on the context they are processed in (Rodd et al., 2002), which may lead to differences in perceived positivity and negativity ratings, depending on the respective context. For example, the German word “NOTE,” which was among the emotionally ambivalent words in Experiment 2, normally refers to the grades students receive in school, which might explain its negative connotation depending on the student’s experience. Depending on the context, “NOTE” also refers to the notes in a piece of music, which in turn might explain the positive connotation, or to bank notes, where “NOTE” would be a more infrequent term, however. In this example, emotional ambivalence and semantic ambiguity cannot be separated. Expecting that emotionally ambivalent words are more likely to also be semantically ambiguous than neutral words, semantic ambiguity might have affected the processing of the emotionally ambivalent but not of the neutral words in Experiment 2.
To test for these confounding variables, a post hoc linear multiple regression analysis was computed, predicting the RT data collected for neutral and ambivalent words with positivity, negativity, their interaction, arousal, and the number of Dornseiff entries as predictors. Dornseiff groups represent different functional groups for classification of words based on their functional meaning (Dornseiff, 2004), which means that a word that is listed in more than one Dornseiff group is used in more than one semantic context and can therefore be considered to be semantically ambiguous. None of the explanatory variables reached significance in this analysis, but while Dornseiff entries and arousal clearly did not explain unique variance (all ps > .1), the three ESM variables at least approached significance. This result is not unequivocal evidence but suggests that negativity (β = 3.245, t = 1.768, p = .082), positivity (β = 3.354, t = 1.816, p = .074), and their interaction (β = −6.035, t = −1.871, p = .066) were not confounded with arousal or semantic ambiguity in Experiment 2.
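The structure of such a regression can be sketched as follows. This is a minimal illustration in Python and not the analysis script used in the study: the function names `design_row`, `solve`, and `ols` are hypothetical, the coefficients it returns are unstandardized, and no data from the experiments are reproduced. The point it illustrates is that the interaction predictor is simply the product of a word's positivity and negativity ratings.

```python
from typing import List, Sequence


def design_row(pos: float, neg: float, arousal: float, dornseiff: float) -> List[float]:
    """One row of the design matrix: intercept, positivity, negativity,
    positivity x negativity interaction, arousal, Dornseiff entries."""
    return [1.0, pos, neg, pos * neg, arousal, dornseiff]


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)
```

In practice, `rows` would hold one `design_row` per word and `y` the corresponding mean RTs; standardized coefficients like those reported above would additionally require z-scoring the predictors and the RTs beforehand.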
To explicitly test whether the lexical decision ambivalence effect from Experiment 2 was confounded with arousal or semantic ambiguity, a third experiment was designed with emotional arousal and semantic ambiguity being controlled for across all conditions. Arousal norms were taken from the BAWL-R and the number of Dornseiff entries was taken from the Leipzig Corpora Collection (see Biemann, Heyer, Quasthoff, & Richter, 2007). The mean number of Dornseiff dictionary entries was kept as low as possible, although it was not possible to exclude all words with Dornseiff entries >1. Using this better controlled stimulus material, a replication of a facilitatory effect for emotionally ambivalent words when compared with truly neutral words would strongly support a reconsideration of the emotional valence dimension in terms of the ESM. If, however, previous effects were biased by stimulus arousal or semantic ambiguity, no differences between neutral and emotionally ambivalent words should be observed. This would rather support a unidimensional valence conception within the two-dimensional affective space as proposed in Bradley and Lang (2000) or Russell (2003).
Method
Participants
Altogether, 30 right-handed native German participants (24 female; M age = 27.6, SD = 5.9, range = 19-62) reporting normal or corrected-to-normal vision were recruited at the Freie Universität Berlin. Participants received course credit for participation; some participated voluntarily without compensation.
Material and procedure
To test whether the effects of Experiment 2 would persist when arousal and semantic ambiguity are controlled for, a stimulus set was constructed in which all four conditions (neutral, emotionally ambivalent, positive, and negative) were matched for arousal and mean number of Dornseiff entries (F < 1). This resulted in 28 stimuli per condition, that is, 112 words altogether. All variables controlled in Experiment 2 (i.e., length, syllables, imageability, phonemes, frequency, orthographical neighborhood size, and bigram frequency [token count]) were also controlled in Experiment 3 using ANOVAs (all Fs < 1). Neutral and emotionally ambivalent words did not differ in any of these variables (all ts < 1), and they did not differ in emotional arousal (t < 1) or mean number of Dornseiff entries (t < 1; see also Table 2). The nonwords used in Experiment 2 were reduced to 112 items and matched for length and syllables to the word material (t < 1). The procedure used for Experiment 3 was identical to that of Experiment 2.
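The matching criterion (F < 1) can be illustrated with a between-items one-way ANOVA on a single control variable. The following Python sketch is an assumption about how such a check could be run, not the authors' procedure; the function name `one_way_f` is hypothetical and the input would be one list of item values (for example, arousal norms) per condition.

```python
from statistics import mean
from typing import Sequence


def one_way_f(groups: Sequence[Sequence[float]]) -> float:
    """F ratio of a between-items one-way ANOVA (mean square between / mean square within)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(all_values)
    # Variability of the condition means around the grand mean
    ss_between = sum(len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means))
    # Variability of the items around their own condition mean
    ss_within = sum((v - gm) ** 2 for g, gm in zip(groups, group_means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

An F below 1 means that the conditions differ less from each other than the items differ within a condition, which is why it serves as a convenient matching criterion.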
Stimulus Characteristics (Ms and SDs) for Experiment 3
Note: N = orthographical neighborhood size; valence = unidimensional valence; Dornseiff entries = mean number of Dornseiff dictionary entries. Except for Dornseiff entries, all measures were taken from the Berlin Affective Word List–Reloaded (BAWL-R).
Data preparation
Error-free mean RTs were calculated for each condition and each participant. Trials with responses faster or slower than the individual mean RT ± 3 SD were excluded as outliers (1.8%). For error analyses, behavioral errors were summed up per participant and condition. All analyses were computed using SPSS software (Version 13.0, SPSS Inc., USA) at an a priori significance level of .05.
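The trimming and aggregation steps can be sketched as follows for a single participant. This Python example is illustrative only: the function name `condition_means` and the `(condition, rt, correct)` trial layout are hypothetical, and it assumes that the individual mean and SD are computed over the participant's correct trials across all conditions using the sample SD, which the description above does not specify.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Iterable, Tuple

Trial = Tuple[str, float, bool]  # (condition, RT in ms, response correct?)


def condition_means(trials: Iterable[Trial], n_sd: float = 3.0) -> Dict[str, float]:
    """Mean RT per condition for one participant after removing errors and outliers."""
    # Keep error-free trials only
    correct = [(cond, rt) for cond, rt, ok in trials if ok]
    rts = [rt for _, rt in correct]
    # Individual mean and sample SD over all remaining trials (assumption, see lead-in)
    m, sd = mean(rts), stdev(rts)
    kept = defaultdict(list)
    for cond, rt in correct:
        # Exclude responses more than n_sd SDs faster or slower than the individual mean
        if abs(rt - m) <= n_sd * sd:
            kept[cond].append(rt)
    return {cond: mean(values) for cond, values in kept.items()}
```

The resulting per-participant condition means would then enter the repeated-measures ANOVA, while error counts are tallied separately from the discarded incorrect trials.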
Results
A repeated-measures ANOVA over all four conditions (positive, negative, neutral, and emotionally ambivalent) revealed a significant main effect in RTs, F(3, 87) = 10.508, p < .001, partial η2 = 0.266. Planned pairwise comparisons using matched-pairs t tests revealed faster responses to positive words (M = 654 ms, SD = 107 ms) than to negative (M = 696 ms, SD = 117 ms), t(29) = −5.121, p < .001, partial η2 = 0.475, and to emotionally ambivalent words (M = 675 ms, SD = 117 ms), t(29) = −3.031, p = .005, partial η2 = 0.241. The RT difference between positive and neutral words (M = 665 ms, SD = 108 ms) did not reach significance, t(29) = −1.852, p = .074, partial η2 = 0.106. In addition, negative words were processed significantly slower than neutral, t(29) = 4.255, p < .001, partial η2 = 0.384, and emotionally ambivalent words, t(29) = 2.195, p = .036, partial η2 = 0.143. Neutral and emotionally ambivalent words did not differ significantly, t(29) = −1.179, p = .248, partial η2 = 0.046.
A repeated-measures ANOVA over all four conditions (positive, negative, neutral, and emotionally ambivalent) revealed a significant main effect, F(3, 87) = 11.679, p < .001, partial η2 = 0.287, in the ERR as well. Planned pairwise comparisons using matched-pairs t tests revealed a lower ERR for positive words (M ERR = 1.2, SD = 1.8) when compared with negative (M ERR = 2.9, SD = 1.8), t(29) = −5.052, p < .001, partial η2 = 0.468, neutral (M ERR = 3.0, SD = 1.9), t(29) = −5.505, p < .001, partial η2 = 0.512, and emotionally ambivalent (M ERR = 2.1, SD = 1.8), t(29) = −2.821, p = .009, partial η2 = 0.215, words. In addition, ERR was lower for emotionally ambivalent words than for neutral, t(29) = −2.619, p = .014, partial η2 = 0.191, and negative, t(29) = −2.420, p = .022, partial η2 = 0.168, words. Results are also summarized in Figure 3.
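The effect sizes reported with the F and t tests follow the standard conversions from the test statistic and its degrees of freedom: partial η2 = (F × df1) / (F × df1 + df2) for an F test and partial η2 = t² / (t² + df) for a matched-pairs t test. The Python sketch below applies these textbook formulas; the function names are hypothetical, and whether the reported values were obtained exactly this way is an assumption.

```python
def partial_eta_sq_from_f(f: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)


def partial_eta_sq_from_t(t: float, df: int) -> float:
    """Partial eta squared from a t statistic (equivalent to an F test with df_effect = 1)."""
    return t ** 2 / (t ** 2 + df)


# RT main effect, F(3, 87) = 10.508
print(round(partial_eta_sq_from_f(10.508, 3, 87), 3))  # 0.266
# Positive vs. negative words, t(29) = -5.121
print(round(partial_eta_sq_from_t(-5.121, 29), 3))  # 0.475
```

Because the sign of t drops out when squaring, the effect size says nothing about the direction of a difference, which has to be read from the condition means.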

Mean response times and error rates for Experiment 3
General Discussion
The current study aimed to examine the appropriateness of the two independent positivity and negativity dimensions suggested by the ESM (Cacioppo & Berntson, 1994) in explaining variance in lexical processing above and beyond the unidimensional valence conception (e.g., Bradley & Lang, 2000; Russell, 2003). Three experiments were conducted, one explicit rating study and two LDTs.
In Experiment 1, about 7% of the stimuli rated on two independent positivity and negativity dimensions received ratings above 3 on both scales. As this result was not caused by an averaging bias (each of the 48 words was rated as at least slightly emotion inducing on both scales by more than 50% of the participants), these stimuli can be seen as ambivalent according to the criteria of the ESM. When these ambivalent stimuli were contrasted with truly neutral stimuli in Experiment 2, a facilitative ambivalence effect in lexical decision RTs emerged. Post hoc multiple regression analyses seemed to confirm the effect, suggesting that the results were not biased by semantic ambiguity or emotional arousal. As the ESM dimensions only approached significance in the post hoc analyses, Experiment 3 aimed to replicate the findings of Experiment 2, controlling the stimulus set for emotional arousal and semantic ambiguity and thus eliminating the two confounds. The differences observed in Experiment 2 were diminished, however. RTs for neutral and emotionally ambivalent stimuli did not differ significantly, and it seems highly unlikely that a lack of statistical power is the reason for the missing ambivalence RT effect, considering that a total of 30 participants were analyzed. Moreover, the common pattern with positive words being processed faster than neutral words (p = .074) and (low arousing) negative words being processed slower than neutral stimuli (e.g., Hofmann et al., 2009; R. J. Larsen et al., 2008) was basically replicated, contradicting the results found in Experiment 2. Up to this point, the ESM dimensions cannot explain variance beyond traditional affective space models. Looking at the ERR, however, an additional speed-accuracy trade-off was observed, with emotionally ambivalent stimuli being processed more accurately than neutral stimuli. This trade-off marks a facilitative processing of ambivalent words and thus rather supports the predictions of the ESM.
Neither the two-dimensional affective space model nor semantic ambiguity can explain this difference in processing accuracy, as both conditions were carefully controlled for their valence, arousal, and their mean Dornseiff dictionary entries as a measure of semantic ambiguity. The effect is thus in line with the MROMe, a computational model of visual word recognition, which predicts generally facilitated processing of emotionally laden words.
While the speed-accuracy trade-off is no strong support for the ESM (emotionally ambivalent stimuli were not processed more accurately than neutral words in Experiment 2), the data of all three experiments are, at some point, best explained by two independent positivity and negativity dimensions. The ESM seems to be the best explanation for the speed-accuracy trade-off, although future studies will also have to test for alternative explanations. In the context of dimensional theories, for example, emotional valence and arousal are often supplemented or replaced by other affective dimensions, such as dominance (Hess, Adams, & Kleck, 2005) and approach-avoidance (Elliot, 2006). Neither was considered in the present study. Although emotionally ambivalent stimuli are likely to be ambivalent on an approach-avoidance dimension as well (positive information is generally correlated with approach, whereas negative information is linked to avoidance, even though some studies document that this relationship is more complex; e.g., Carver, 2004), they might also be more dominant than neutral words. Future studies will therefore have to further evaluate the predictions and interactions of these models in visual word recognition, including dimensional models like the ESM (Cacioppo & Berntson, 1994) or the core affect model (Russell, 2003) and three-dimensional models including dominance (Bradley & Lang, 1999) or approach and avoidance (Elliot, 2006).
Another line of research should address whether the current results are affected by gender-specific processes, given that the present data were collected on mainly female samples (78% female participants). Recent studies document that female participants not only respond more sensitively to emotional stimulation (e.g., Bradley, Codispoti, Sabatinelli, & Lang, 2001), which might include a stronger response to emotional ambivalence, but also differ from male participants with respect to their neural correlates of affective processing (Hamann & Canli, 2004; Wager, Phan, Liberzon, & Taylor, 2003). The effects presented here might therefore look different when the data are collected on a mainly male sample.
It should also be noted that the processing of emotional information is incidental to the task itself in the LDT. Although the task proved to be suitable when comparing different emotion theories in verbal processing (Briesemeister et al., 2011a, 2011b), previous effects in support of the ESM have been reported only for explicit emotion processing tasks (e.g., self-referential task in Lewis et al., 2007; emotion induction in J. T. Larsen et al., 2001) or valence ratings of affective pictures (Viinikainen et al., 2010). It is therefore possible that the higher appropriateness of the ESM in these studies depends on the stronger focus on affective processing and the emotionally more intensive material, which might lead to less equivocal ambivalence effects. Nonetheless, summarizing the results of all three experiments, ambivalence effects seem to also affect single-word processing at some point.
In addition to the investigation of emotional ambivalence, this study provides evidence that the unidimensional valence effect reported in previous studies (e.g., Hofmann et al., 2009; Kousta et al., 2009; Kuchinke et al., 2005, 2007; R. J. Larsen et al., 2008) is independent of semantic ambiguity, at least in stimulus sets with overall low stimulus arousal (see Table 1). To our knowledge, semantic ambiguity has never been controlled in affective LDTs before. Of course, future studies will have to further investigate semantic ambiguity with affective stimulus material, testing for possible interactions and/or confounds with arousal. This is especially important as high arousal and high semantic ambiguity are known to speed up the lexical decision response. Controlling for semantic ambiguity and keeping arousal at a low level, the results of Experiment 3 are still in line with the prediction that emotional processes affect visual word recognition (Kuchinke, 2007).
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research and/or authorship of this article.
