Abstract
Pitch is a fundamental musical factor; however, findings about its contribution to the elicitation of emotions are contradictory. The purpose of this work was to assess the effect of systematic pitch variations on self-reports of felt valence and arousal. In a within-subject design, 49 subjects listened to four 1-minute classical piano excerpts, each presented at three different pitch levels (one octave lower than the original version, the original version and one octave higher than the original version). Compared to excerpts both without octave modification and in the +1 octave variant, pleasantness of excerpts in the -1 octave variant was significantly lower. This main effect was stronger for women than men and, importantly, was modulated by the specific characteristics of the stimuli. There was also a significant, yet smaller, negative relationship between pitch level and arousal, moderated by gender: Compared to higher pitch, lower pitch was associated with higher arousal in men only. Given the complex outcomes of this study, future studies should investigate to what extent our findings can be generalized to other musical works. The ultimate goal might be to demonstrate how pitch level interacts with other musical features and listeners’ characteristics in eliciting diverse affective experiences.
Introduction
It seems that one of the most important reasons why people play, compose, and listen to music is its ability to induce emotions (Juslin & Sloboda, 2010). But do people recognize emotions expressed by the music or do they experience music-induced emotions? This classical distinction in music research refers to the idea that music may both express emotions that listeners perceive (cognitive position) and induce emotions that listeners actually feel (emotive position). Although perceived and felt emotions seem to be most often positively related to each other (Evans & Schubert, 2008; Kallinen & Ravaja, 2006), they may sometimes be distinct, reflecting different psychological mechanisms (Gabrielsson, 2002).
Although the view that music evokes emotion is not unanimously accepted, particularly within music theory and philosophy (Konečni, 2003), there is accumulated evidence that music induces brain-mind and bodily reactions similar to those induced by other (affective) non-musical stimuli (e.g., Blood & Zatorre, 2001; Gomez & Danuser, 2004; Koelsch, Fritz, von Cramon, Müller, & Friederici, 2006; Menon & Levitin, 2005). However, there is little consensus regarding the mechanisms that lead from musical stimuli to an experienced emotion. A parsimonious hypothesis is that musical stimuli provoke emotions through similar routes as any other emotion-eliciting event (i.e., appraisal processes, memory, empathy, contagion, proprioceptive feedback) (Juslin & Västfjäll, 2008, pp. 575–621; Scherer & Zentner, 2001). Yet this does not exclude that some mechanisms may play a more important role in the induction of emotions by music than by non-musical stimuli. In this context, a distinction between utilitarian and aesthetic emotions may be important (Scherer, 2004). Utilitarian emotions are triggered by the urge to adapt to events that are of main significance to the subject’s interests, well-being and survival. On the contrary, aesthetic emotions are elicited in situations that generally do not have an obvious material effect on the subject’s well-being and survival and only occasionally lead to specific goal-oriented behaviors. Music may more often induce aesthetic than utilitarian emotions (Zentner, Grandjean, & Scherer, 2008). We agree with Scherer and Zentner’s (2001) induction model, according to which elicitation of an emotion by music is a process depending on multiple interacting factors (i.e., musical features, listener features, performance features, contextual features). The present study investigates the role of musical features with a focus on the role of pitch level in eliciting emotions.
Ever since the pioneering work by Hevner (1935, 1936, 1937), there has been an interest in exploring the link between musical structure with its different factors such as tempo, mode, intensity and pitch, and perceived emotions (reviewed by Gabrielsson, 2009; Gabrielsson & Juslin, 2003; Gabrielsson & Lindström, 2010). However, less attention has been devoted to the study of the relationships that might exist between musical structure and felt emotions. Felt musical emotions are expected to be determined not only by the musical structure but also by personal and situational factors (Scherer & Zentner, 2001). We think that it is important to explore to what extent and how experienced emotions are related to the internal structure of the music, in order to establish the relative contribution of intra- and extramusical factors in the elicitation of musical emotions. This has implications for understanding how music can shape human behavior and has relevance for therapeutic purposes (Koelsch, 2010).
Recently, our group investigated how several musical factors relate to experienced emotions (Gomez & Danuser, 2007). One of the main conclusions of this study was that the internal structure of the music played a central role in the induction of the emotions. A recent study confirmed that a significant part of listeners’ felt emotions can be predicted by psychoacoustic features (Coutinho & Cangelosi, 2011). The goal of the present study is to extend this line of research by investigating how manipulations of one specific factor of real musical excerpts, i.e., pitch level, affect felt emotions. As a theoretical framework, we adopt the bi-dimensional model of valence (also called pleasantness) and arousal (also called activation). The subjective affective response to a wide range of affective stimuli has been repeatedly shown to be organized around these two dimensions (e.g., Bradley & Lang, 1994). The structure of musical emotions can also be well described in terms of valence and arousal (e.g., Faith & Thayer, 2001; Gabrielsson & Lindström, 2010).
Considering that pitch is a fundamental feature of sound and therefore music, it is surprising that only a few studies have investigated the link between pitch level and emotions. Moreover, most studies have concentrated on emotional expression rather than emotional experience. Because the present study adopts the model of valence and arousal, it is appropriate to organize the existing data according to this model. Nevertheless, one should keep in mind that most of the studies exploring the link between pitch level and perceived emotions have not explicitly used this bi-dimensional model as conceptualized here. Concerning the dimension of valence, some studies seem to suggest that high-pitched music is more often associated with positive emotional adjectives such as happy, graceful, dreamy, glad, and light (Collier & Hubbard, 2001; Coutinho & Cangelosi, 2009; Eitan & Timmers, 2010; Gundlach, 1935; Hevner, 1937; Kleinen, 1968; Rigg, 1940; Watson, 1942; Wedin, 1972), whereas low-pitched music is more often associated with negative emotional adjectives such as sad, dramatic, dark, agitated, bored, and somber (Gundlach, 1935; Hevner, 1937; Rigg, 1940; Scherer & Oshinsky, 1977; Watson, 1942; Wedin, 1972). However, high pitch has also been associated with negative emotions such as anger and fear and low pitch has been associated with pleasantness (Ilie & Thompson, 2006; Scherer & Oshinsky, 1977).
Data are also rather contradictory as to the possible relationship between pitch level and arousal. High-pitched music has been related to high-arousal terms such as excitement, exuberant, animated, activity, potency, alert, and high-arousal emotions such as anger and fear (Coutinho & Cangelosi, 2009; Eitan & Timmers, 2010; Gundlach, 1935; Scherer & Oshinsky, 1977; Watson, 1942; Wedin, 1972). Furthermore, higher pitch level has been found to increase tension arousal (Ilie & Thompson, 2006). However, low-arousal words such as graceful and serene have also been associated with higher pitch level (Hevner, 1937). Low-pitched music has been linked to high-arousal terms such as vigorous, exciting and agitation (Hevner, 1937; Rigg, 1940), but also to low-arousal emotions such as sadness and boredom (Hevner, 1937; Scherer & Oshinsky, 1977). Because the relation between one affective dimension and pitch level may depend on the level of the other dimension, it seems fruitful to consider combinations of the two dimensions. From this perspective, the most consistent finding seems to be that the combination of negative valence and low arousal is associated with low pitch level. There is also some evidence that the combination of positive valence and high arousal is associated with high pitch level. In contrast, for the combination of positive valence and low arousal there is a lack of data to draw any firm conclusion and for the combination of negative valence and high arousal data are contradictory. As summarized by Gabrielsson and Lindström (2010), the association between pitch level and perceived emotions is much less clear than for other factors such as tempo and sound intensity.
The studies reviewed in the preceding paragraph have used different methodological approaches to investigate the relationship between musical structure and perceived emotions. Each method has its advantages and disadvantages. Most of them have used real music and have inferred the association between musical factors and emotional responses by means of descriptive statistics or multivariate analysis techniques (Gundlach, 1935; Kleinen, 1968; Watson, 1942; Wedin, 1972). Although this approach offers good ecological validity, it has the disadvantage that the relative contribution of each single musical factor cannot be determined conclusively because of intercorrelations between factors (Gabrielsson & Lindström, 2010). Another source of uncertainty in this approach is given by the fact that the structural factors were not determined on the basis of objective criteria but were evaluated by human beings. Even though this evaluative task was performed by several musical experts, this approach surely leaves room for inaccuracy. Another approach consists in manipulating one or several structural factors, either in short synthesized sound sequences without musical context (Scherer & Oshinsky, 1977), or within a musical context (Hevner, 1937; Ilie & Thompson, 2006; Rigg, 1940). In the former approach, ecological validity is reduced. In the studies by Ilie and Thompson (2006) and Scherer and Oshinsky (1977), pitch level was studied regarding only two levels, precluding the possibility of testing nonlinear trends. In Hevner’s (1937) and Rigg’s (1940) work, the pieces were played by a pianist. As noted by Hevner herself, it was very difficult for the pianist to vary only one factor in real musical pieces without varying other factors. Finally, the fact that all studies used very different lists of affective terms or dimensions to measure the emotional response makes it extremely difficult to compare findings and to draw satisfactory conclusions on the effect of pitch.
The few studies that have investigated the association between pitch level and felt emotions (i.e., experienced valence and arousal) provide contradictory findings. In our study exploring the relationships between musical structure and self-reports of experienced valence and arousal (Gomez & Danuser, 2007), we found that felt arousal was negatively correlated with pitch level, i.e., low-pitched excerpts were associated with higher arousal levels than high-pitched excerpts. We found neither a significant linear correlation between pitch level and felt valence, nor a significant interaction between valence and arousal. This is in contrast to the outcomes of Coutinho and Cangelosi’s (2011) study, which reported significant correlations between pitch level and both valence and arousal: increasing pitch level was associated with increasing pleasantness and arousal. In these two studies the association between pitch level and felt emotions was determined by correlational analyses. Complementing these investigations with studies in which pitch level is systematically varied in real pieces of music may allow us to draw more definite conclusions about the effects of pitch level on felt emotions.
With the present study we wished to extend current knowledge on the relationship between pitch level and felt emotions by manipulating pitch level of real classical piano excerpts. More specifically, the pitch level of four original excerpts was increased and decreased by an octave. We opted for an octave transposition because, unlike other transpositions (e.g., by a fifth), it leaves the key unchanged. Thus, participants listened to three variants of each of four different excerpts. This allowed us to test both linear and curvilinear effects of pitch variations. The four original excerpts were from musical works by Gabriel Fauré (9 Preludes, op. 103 no. 9), Frédéric Chopin (24 Preludes, op. 28 no. 8), Ludwig van Beethoven (11 Bagatelles, op. 119 no. 4) and Friedrich Burgmüller (25 Progressive Studies, op. 100 no. 23 “The Return”), and each was expected to induce a different combination of valence and arousal. Fauré’s excerpt was meant to evoke a relatively negative and low-arousal affective response, whereas Chopin’s excerpt was expected to elicit relatively negative and high-arousal emotions. Beethoven’s and Burgmüller’s excerpts were both selected to induce relatively positive emotions, with Beethoven’s excerpt being less arousing than Burgmüller’s excerpt. The four excerpts can be regarded as prototypical in terms of two major determinants of valence and arousal, i.e., mode and tempo. In fact, the two rather positively valenced excerpts (Beethoven and Burgmüller) are in major mode, whereas the two rather negatively valenced excerpts (Fauré and Chopin) are in minor mode. The two more arousing excerpts (Chopin and Burgmüller) have relatively faster tempi, while the two less arousing excerpts (Fauré and Beethoven) have relatively slower tempi. In addition, by using modern computer technology for editing and recording MIDI files, this study has the advantage of varying only pitch level while keeping all other structural factors constant.
In sum, our methodological approach enables us to keep a satisfactory ecological validity while getting accurate conclusions on the effect of pitch level on emotional experience.
Due to the paucity of studies and the above-mentioned conflicting results and methodological issues, we can only advance tentative hypotheses for our study. Assuming a positive relationship between perceived and felt emotions (Evans & Schubert, 2008), we predicted that musical excerpts with high pitch level would induce more positive valence than musical excerpts with low pitch level (Coutinho & Cangelosi, 2009, 2011; Gundlach, 1935; Hevner, 1937; Kleinen, 1968; Rigg, 1940; Scherer & Oshinsky, 1977; Watson, 1942; Wedin, 1972). For arousal, we did not make any predictions given the opposite findings both among studies investigating felt emotions (Coutinho & Cangelosi, 2011; Gomez & Danuser, 2007) and those investigating perceived emotions (e.g., Coutinho & Cangelosi, 2009; Eitan & Timmers, 2010; Hevner, 1937; Rigg, 1940; Scherer & Oshinsky, 1977). Finally, gender has been found to moderate the effects of musical factors such as tempo, mode, and loudness on emotional responses (Kellaris & Rice, 1993; Webster & Weir, 2005). Therefore, gender was included as a factor in all analyses, although we made no a priori predictions regarding gender effects.
Method
Participants
Participants were 24 men and 25 women, aged 18–38 years (mean age 23.3). Most participants were undergraduate students. None reported suffering from cardiovascular, respiratory, neurological, metabolic, or psychiatric diseases. None reported taking any drugs, and all were healthy on the day of testing. They were asked to refrain from drinking alcohol for 12 hours and stimulating beverages for 3 hours, and from smoking for 2 hours prior to the experiment. All were French speakers or rated their level in French as sufficient or good. All participants had normal hearing as determined with a pure-tone audiogram performed with a diagnostic audiometer AD226 (Interacoustics A/S, Assens, Denmark) and based on the modified Hughson–Westlake method. All participants gave written informed consent and were paid 30 Swiss Francs for participation.
Most participants were regular consumers of music: Sixty-seven percent of the participants listened to music at least one to two hours per day. Only one participant listened to music only some hours per month. Pop music and rock music were often listened to by 60% of the participants, followed by hip-hop/r ’n’ b music (29%), techno music (21%) and classical music (10%). No participants reported disliking classical music. To avoid excessive familiarity with the stimuli and musical expertise, having received piano training (amateur or professional), and having attended professional classes in music schools (in any instrument) were exclusion criteria. Thirty-five percent of the participants reported having received some musical training other than the basic one in compulsory education, and 56% of the participants reported having played an instrument, for 5.6 years on average. Just before the beginning of the experiment, the participants reported low levels of anxiety (mean on the STAI-S = 28.10, SD = 7.48) and they felt good (mean on the 13-point valence scale = 10.38, SD = 2.05) and moderately aroused (mean on the 13-point arousal scale = 5.90, SD = 2.95). Men and women did not differ on any of these variables.
Musical stimuli
Stimuli were four 1-minute classical piano excerpts, each presented at three different pitch levels (i.e., one octave lower than the original version, the original version, and one octave higher than the original version). Thus, in total 12 1-minute excerpts served as stimuli. Piano pieces were preferred because the piano is one of the most studied instruments in the field of musically induced emotions (Juslin & Laukka, 2003), because it is recognized as an instrument suggesting a variety of emotions and because it has a large octave range. The four excerpts were chosen from a pool of 31 excerpts on the basis of a pre-test carried out with an independent sample of 32 individuals. Details about the pre-test can be obtained from the last author. The four musical excerpts were passages from (1) Gabriel Fauré, 9 Preludes, op. 103 no. 9; (2) Frédéric Chopin, 24 Preludes, op. 28 no. 8; (3) Ludwig van Beethoven, 11 Bagatelles, op. 119 no. 4; (4) Friedrich Burgmüller, 25 Progressive Studies, op. 100 no. 23 “The Return.” Relevant characteristics of these four excerpts are given in Table 1. These four excerpts, in their unmodified version, were each expected to induce different combinations of valence and arousal, i.e., relatively negative valence and low arousal (Fauré), relatively negative valence and high arousal (Chopin), relatively positive valence and low arousal (Beethoven), and relatively positive valence and high arousal (Burgmüller). We intentionally selected four excerpts that were not “extreme” instances in their respective quadrant of the affective space defined by valence and arousal, because excerpts with an extreme affective tone in their original version (e.g., very positive and highly arousing) would not have enabled a bi-directional emotional appraisal of their pitch-modified versions.
Importantly, the three octave variants of each musical excerpt differed only by their pitch level; all other musical features were maintained constant, particularly tempo, mode, dynamics, and sound intensity.
Table 1. The musical excerpts.
Three additional 1-minute excerpts were used as examples at the beginning of the experiment to familiarize the participants with the procedure. These were Peter I. Tchaikovsky op. 39 no. 21 “Sweet Reverie” from Album for the Young (measures 1–27.5), Peter I. Tchaikovsky op. 59 Dumka (measures 1–15.5) and Robert Schumann op. 68 no. 1 “Melody” from Album for the Young (entire piece). The musical excerpt from Tchaikovsky op. 39 no. 21 was increased by one octave and Schumann op. 68 no. 1 was decreased by one octave so that participants were prepared to listen to varied pitch levels.
The excerpts were obtained from musical pieces in MIDI file format selected from different websites specialized in collecting files of Western piano music. The preparation of the musical excerpts was done in a professional music studio (http://www.pianobello.com). The octave transposition was performed with the sound editing software Cubase 4.0 (Steinberg, Germany). Further, sound intensity was edited so that the three variants of the same excerpt differed by less than 1 dB(A) in their sound intensity as measured with a sound level meter (Brüel & Kjær, Denmark). The sound engineer edited the MIDI files so that their duration was exactly 1 minute and then faded out the last 5 seconds so as not to give the stimulus a clipped sound at the end. To get a high piano sound quality, MIDI files were converted into mp3 audio files with a modern recording process using a Yamaha DS4 Pro Mark 4 automated grand piano. This piano is equipped with a mechanism that allows it to play autonomously (i.e., without a pianist). Sound was recorded with two Schoeps MSTC 64 (ORTF) microphones placed 2 centimeters over the strings and two Neumann U-87 microphones placed outside the piano, near the keyboard, separated by 1.5 meters, at the height of the ears of a standing man, in order to get a rich and balanced stereo sound.
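To make the logic of the octave manipulation concrete, the following minimal sketch shows what a one-octave transposition does to MIDI note numbers. It is an illustration only and not the procedure carried out in Cubase; the function names and the example chord are hypothetical, and the frequency conversion assumes standard equal temperament with A4 = 440 Hz.

```python
# Illustrative sketch only: the study used Cubase 4.0, not this code.
# Function names and the example chord are hypothetical.

def transpose_octaves(notes, octaves):
    """Shift every MIDI note number by whole octaves (12 semitones each)."""
    shifted = [n + 12 * octaves for n in notes]
    if any(n < 0 or n > 127 for n in shifted):
        raise ValueError("transposed note falls outside the MIDI range 0-127")
    return shifted


def note_to_hz(note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (assumes A4 = note 69)."""
    return a4_hz * 2 ** ((note - 69) / 12)


# Hypothetical C major triad (C4, E4, G4)
original = [60, 64, 67]
lower = transpose_octaves(original, -1)   # -1 octave variant
higher = transpose_octaves(original, +1)  # +1 octave variant

# Pitch classes (note number modulo 12) are identical in all three variants,
# which is why an octave shift leaves the key unchanged.
same_key = (
    [n % 12 for n in lower]
    == [n % 12 for n in original]
    == [n % 12 for n in higher]
)
```

Each octave step doubles or halves the fundamental frequency of every note, whereas a transposition by a fifth (7 semitones) would change the pitch classes and hence the key.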
Sound intensity was regularly measured during the experimental phase to ensure that it remained constant. Mean sound intensity (Leq) of the three octave variants for the Fauré excerpt was 46.88 dB(A), for the Chopin excerpt 47.98 dB(A), for the Beethoven excerpt 46.89 dB(A), and for the Burgmüller excerpt 47.76 dB(A). The excerpts by Chopin and Burgmüller were intentionally louder to reflect their usual dynamics. Mean differences between maximal and minimal sound intensity of each musical piece were negligible (0.15 dB(A) for Fauré, 0.20 dB(A) for Chopin, 0.25 dB(A) for Beethoven and 0.28 dB(A) for Burgmüller).
Measures
Valence and arousal scales
Self-reports of felt valence and arousal were registered using two 13-point Likert-like scales from 1 = most negative valence/lowest arousal to 13 = most positive valence/highest arousal. Anchors for the valence scale were “positive” and “negative,” and for the arousal scale “high arousal” and “low arousal.”
Questionnaires
Just before the presentation of the music excerpts, participants were given two questionnaires measuring their momentary affective state: two 13-point valence and arousal scales and the state scale of Spielberger’s State Trait Anxiety Inventory (STAI-S; Spielberger, 1983). At the end of the experiment, participants filled in an online questionnaire about their musical preferences, listening habits and musical training (see “Participants” earlier).
Procedure
Participants were tested individually in a sound-insulated room during one experimental session lasting approximately 1.5 hours. First, the experimenter provided the participants with an outline of the procedure. Then, participants filled out an informed consent form. They were told that 15 musical piano excerpts of 60-second duration would be played and that each excerpt should be attended to for the entire duration of the presentation. They were then told that after each musical excerpt they had to report how they felt emotionally while listening to it by giving one valence and one arousal rating using the corresponding scales. The importance of rating each musical excerpt according to how they actually felt while listening to it was emphasized. Participants were also asked to listen to each musical excerpt naturally and freely, as when they listen to music in their everyday life. Further, they were told that some musical excerpts may appear similar but that they should rate each excerpt spontaneously without taking too much time and without thinking about ratings given to excerpts previously heard. Next, they were told how to use the valence and arousal scales and then they completed the questionnaires about their momentary affective state. Afterwards, they were told that one practice musical excerpt would be played in order to familiarize them with the rating procedure. They were also instructed to keep their eyes open during the entire musical session to avoid facilitating the creation of mental images. Then, the 15 1-minute musical excerpts were presented with E-prime 2.0 Professional (Psychology Software Tools, Pittsburgh, PA) running on a PC through headphones (Sennheiser H215). The first three stimuli were examples (not analyzed further) and were followed by the 12 stimuli chosen according to the procedure explained above.
The excerpts were separated by 1 minute of silence during which the participants gave one valence and one arousal rating about the preceding excerpt within about 15 seconds and then relaxed.
The 12 excerpts were presented in 24 different orders. Each order was attributed to one female and one male participant. Within each of the 24 orders, the 12 stimuli were ordered in three blocks of four excerpts. Each block included one variant of each of the four excerpts Fauré, Chopin, Beethoven, and Burgmüller. The order within each block was one of the 24 possible combinations and was the same in all three blocks of the same order so that the three variants of each piece were played with the same time interval in between (i.e., 7 minutes). Importantly, the three octave variations of each of the four excerpts were distributed so that the six possible combinations (i.e., +1, –1, 0; +1, 0, –1; –1, +1, 0; –1, 0, +1; 0, –1, +1; 0, +1, –1) were balanced across participants. In sum, the pieces (Fauré, Chopin, Beethoven, Burgmüller) and their pitch variations (-1 octave, no pitch modification, +1 octave) were perfectly balanced across participants. Here is an example of presentation order: Fauré (–1 octave) – Chopin (+1 octave) – Beethoven (no pitch modification) – Burgmüller (–1 octave) – Fauré (+1 octave) – Chopin (no pitch modification) – Beethoven (–1 octave) – Burgmüller (+1 octave) – Fauré (no pitch modification) – Chopin (–1 octave) – Beethoven (+1 octave) – Burgmüller (no pitch modification).
After the emotional rating of the last musical excerpt, participants completed the online questionnaire about their musical preferences, listening habits, and musical training. Finally, they took the auditory capacity test before being thanked and paid for their participation.
Statistical analysis
Valence and arousal ratings were analyzed in separate repeated-measures analyses of variance (ANOVAs) with three within-subject factors: a priori valence (positive vs. negative), a priori arousal (high vs. low) and pitch level (–1, 0, +1). A priori relatively positive excerpts were Beethoven and Burgmüller, whereas a priori relatively negative excerpts were Fauré and Chopin. A priori rather high-arousal excerpts were Burgmüller and Chopin, whereas a priori rather low-arousal excerpts were Beethoven and Fauré. Gender was included as a between-subject factor. For all omnibus tests, the multivariate test statistic Wilks’ lambda is reported to avoid potential sphericity issues. Further, linear and quadratic within-subjects contrasts are reported. Where appropriate, independent-samples t-tests and paired t-tests were additionally conducted. For paired t-tests comparing the three pitch levels to each other, Bonferroni-adjusted p-values are reported (i.e., the p-value of the simple t-test multiplied by three in accordance with the number of comparisons involved). For all analyses, the significance level was set at 5%. Effect sizes were estimated using partial eta squared (ηp²).
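As a rough illustration of two elements of this analysis, the sketch below computes per-participant linear and quadratic contrast scores across the three pitch levels and applies the Bonferroni adjustment for three comparisons. It is not the analysis code used in the study: the contrast weights are the standard orthogonal polynomial weights for three equally spaced levels, the ratings are hypothetical, and capping the adjusted p-value at 1 is a common convention that is assumed here.

```python
from statistics import mean

# Standard orthogonal polynomial weights for three equally spaced levels.
LINEAR = (-1, 0, 1)
QUADRATIC = (1, -2, 1)


def contrast_score(ratings, weights):
    """Weighted sum of one participant's ratings at -1, 0 and +1 octave."""
    return sum(w * r for w, r in zip(weights, ratings))


def bonferroni(p_value, n_comparisons=3):
    """Multiply an unadjusted p-value by the number of comparisons.

    The cap at 1.0 is an assumed convention, not stated in the text.
    """
    return min(1.0, p_value * n_comparisons)


# Hypothetical ratings on the 13-point scale: (-1 octave, original, +1 octave)
ratings = [(5, 8, 8), (6, 9, 10), (4, 7, 7)]

linear_scores = [contrast_score(r, LINEAR) for r in ratings]
quadratic_scores = [contrast_score(r, QUADRATIC) for r in ratings]

mean_linear = mean(linear_scores)
mean_quadratic = mean(quadratic_scores)
adjusted_p = bonferroni(0.02)
```

A positive mean linear score indicates ratings that rise with pitch level, and a negative mean quadratic score indicates that the middle level lies above the average of the two extremes, as in a pattern where only the -1 octave variant is rated lower.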
Results
Table 2 shows means and SEMs for valence and arousal ratings for the 12 musical excerpts and by gender. We first report within-subject effects (i.e., a priori valence, a priori arousal and pitch level) for valence and arousal ratings and, then, gender effects.
Table 2. Mean valence and arousal ratings (SEMs in parentheses) for each musical excerpt for women, men and all participants.
Notes. (a) For valence and arousal ratings, 1 = most negative valence/lowest arousal and 13 = most positive valence/highest arousal. (b) F = Fauré, C = Chopin, Be = Beethoven, Bu = Burgmüller; -1 = -1 octave modification variant, 0 = variant without octave modification, +1 = +1 octave modification variant.
A priori valence, a priori arousal, and pitch effects for valence ratings
The plots for valence ratings are shown in Figure 1(a). As expected, valence ratings of a priori relatively positive musical excerpts (Beethoven and Burgmüller) were more positive than valence ratings of a priori relatively negative musical excerpts (Fauré and Chopin) (F(1,47) = 140.231, p < .001,

Figure 1. Mean affective ratings for musical excerpts by Fauré, Chopin, Beethoven, and Burgmüller according to pitch level modification for (a) valence ratings and (b) arousal ratings. Error bars refer to SEMs. Scales for valence and arousal ratings ranged from 1 = most negative valence/lowest arousal to 13 = most positive valence/highest arousal.
Pitch level had a significant main effect on valence ratings (F(2,46) = 24.568, p < .001,
Importantly, the A priori valence × A priori arousal × Pitch level interaction for the omnibus test and for the linear trend was also significant (F(2,46) = 10.594, p < .001,
In contrast, valence ratings of the excerpt by Chopin (a priori relatively negative valence and high arousal) were influenced by a systematic pitch level variation (F(2,46) = 18.988, p < .001,
The valence ratings of the excerpt by Beethoven (a priori relatively positive valence and low arousal) were influenced by a systematic pitch level modification (F(2,46) = 16.463, p < .001,
Finally, valence ratings of the excerpt by Burgmüller (a priori relatively positive valence and high arousal) were influenced by a systematic pitch level modification (F(2,46) = 6.304, p < .01,
A priori valence, a priori arousal, and pitch effects for arousal ratings
The plots for arousal ratings are shown in Figure 1(b). As expected, arousal ratings of a priori rather high-arousal musical excerpts (Chopin and Burgmüller) were significantly greater than arousal ratings of a priori rather low-arousal musical excerpts (Fauré and Beethoven) (F(1,47) = 178.263, p < .001,
For pitch level, there was a significant linear trend (F(1,47) = 4.360, p < .05,
Gender effects for valence ratings
For valence ratings there was a significant main effect of gender (F(1,47) = 4.299, p < .05,

Figure 2. Mean affective ratings according to pitch level modification for men and women and for (a) valence ratings and (b) arousal ratings. Error bars refer to SEMs. Scales for valence and arousal ratings ranged from 1 = most negative valence/lowest arousal to 13 = most positive valence/highest arousal.
Gender effects for arousal ratings
For arousal ratings there was a significant Pitch level × Gender interaction for the linear trend (F(1,47) = 4.583, p < .05,
Discussion
This experiment investigated the effects of systematic pitch level variation on the subjective experience of emotions (i.e., self-reports of felt valence and arousal) induced by classical piano excerpts. In a within-subject design, participants listened to four excerpts played in the original version and in two octave transpositions (an octave down and an octave up) while keeping all other musical features constant.
Effects of pitch level variation on felt valence and arousal
Pitch level variation significantly modulated felt valence and, to a lesser degree, felt arousal. More precisely, felt valence for the excerpts in the -1 octave variant was less positive than felt valence for both the musical excerpts without octave modification and in the +1 octave variant (significant linear and quadratic trend). Felt arousal decreased with increasing pitch level (significant linear trend). Thus, the general finding is that compared to higher pitch level, lower pitch level was associated with more negative valence and higher arousal. However, as discussed below in greater detail, there were several significant interactions of pitch level with a priori valence, a priori arousal and gender, supporting the idea that the effects of pitch level are “ambiguous” and “complex” (Gabrielsson & Lindström, 2010). Nevertheless, the main effects for valence and arousal are interesting in light of empirical data from investigations with pictorial stimuli suggesting that the valence and arousal dimensions are not independent of each other; often a negative association has been reported indicating that more negative valence tends to be accompanied by higher arousal levels (Lang, Bradley & Cuthbert, 1998; Libkuman, Otani, Kern, Viger & Novak, 2007).
A positive relationship between pitch level and pleasantness is supported by studies investigating the effect of pitch level on perceived emotions (Collier & Hubbard, 2001; Gundlach, 1935; Hevner, 1937; Kleinen, 1968; Rigg, 1940; Watson, 1942; Wedin, 1972) and also by Coutinho and Cangelosi’s (2011) study on felt emotions. However, other studies have found that low-pitched music was associated with more pleasantness than high-pitched music (Ilie & Thompson, 2006; Scherer & Oshinsky, 1977), and in our previous study we found no effect of pitch level on ratings of felt valence (Gomez & Danuser, 2007).
The main effect of pitch level on valence ratings must be interpreted in light of the significant three-way interaction between a priori valence, a priori arousal and pitch level. This interaction suggests that the effect of pitch level variation on felt pleasure depends on the specific musical excerpt and, thus, on its musical structure. In particular, because mode is a major determinant of valence, and tempo is a major determinant of arousal (Gomez & Danuser, 2007), it is plausible that the interactive effect of pitch with a priori valence and a priori arousal reflects, at least partly, the interdependency between these three musical factors in determining the experience of pleasure. According to our results, an increasing pitch level for a musical piece with fast tempo written in minor mode (Chopin) induced a linear increase of felt valence. For a musical piece with low tempo written in major mode (Beethoven), an increasing pitch level induced a linear and quadratic increase of felt valence. For a musical piece with fast tempo and written in major mode (Burgmüller), an increasing pitch level induced a quadratic effect on felt valence. Finally, pitch level variation had no effect on felt valence for a musical piece with low tempo written in minor mode (Fauré). The hypothesis that the effects of pitch level on ratings of pleasantness strongly depend on other musical characteristics is supported by Ilie and Thompson’s (2006) study, in which pitch level interacted significantly with both sound intensity and tempo: pitch had an effect on perceived valence only when the music was relatively loud and fast.
Studies based on diverse research paradigms suggest that some musical dimensions are perceptually and cognitively related through intensity analogies (Eitan, 2007; Eitan & Granot, 2006, 2007). Different degrees of perceived congruency between musical features may, at least partly, explain some of the interaction effects observed for valence ratings and indirectly explain the contrasting findings in the literature. For instance, there is an association between pitch and loudness in the way that pitch rise and crescendo are perceived as more congruent than pitch rise and decrescendo. The excerpt by Chopin was the fastest piece and the only one to show a purely linear increase in valence ratings from the -1 octave variant to the +1 octave variant. One may speculate that this effect was due to increasing perceived congruency between pitch level and tempo from the -1 octave variant (lowest congruency) to the +1 octave variant (highest congruency). In contrast, for the other three relatively slower excerpts there was not a purely linear relationship between pitch level and valence. For the excerpts by Beethoven and Burgmüller it might be that the original version had an average pitch level that was perceived as more congruent with the tempo of the excerpts than the pitch level of the -1 and +1 octave variants, and thus the original versions received the highest valence ratings. Exploring in greater detail how different psychoacoustic features interact with each other in eliciting different affective responses seems an important avenue for future research. The fact that the effects of pitch level variation on felt valence did not follow a purely linear trend also demonstrates the importance of studying the effects of musical factors with more than two opposite levels (i.e., high or low pitch), in order to determine possible response curvilinearity (see Gabrielsson & Lindström, 2010).
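To illustrate why at least three levels are needed to detect curvilinearity, the following minimal sketch computes linear and quadratic trend contrasts for three equally spaced pitch levels. It is not the analysis code of this study: the function name `contrast` and the mean ratings are hypothetical and serve only as an illustration, whereas the contrast weights are the standard orthogonal polynomial coefficients for three levels.

```python
# Standard orthogonal polynomial contrast weights for three equally spaced
# levels (-1 octave, original, +1 octave).
LINEAR = (-1, 0, 1)
QUADRATIC = (1, -2, 1)


def contrast(means, weights):
    """Return the weighted sum of the level means for one trend contrast."""
    return sum(m * w for m, w in zip(means, weights))


# Hypothetical mean valence ratings on the 1-13 scale.
# These values are illustrative only and are NOT data from the study.
rising = (6.0, 7.0, 8.0)   # steady increase across pitch levels
plateau = (5.0, 8.0, 8.0)  # increase from -1 octave, then no further change

print(contrast(rising, LINEAR), contrast(rising, QUADRATIC))    # 2.0 0.0
print(contrast(plateau, LINEAR), contrast(plateau, QUADRATIC))  # 3.0 -3.0
```

In this sketch, the steadily rising pattern yields a nonzero linear contrast and a zero quadratic contrast, whereas the plateau pattern yields nonzero values for both. With only two pitch levels, the quadratic contrast could not be computed at all.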
For arousal, the negative relationship with pitch level confirms our previous finding (Gomez & Danuser, 2007) and is in line with the outcomes of studies assessing perceived emotions (Hevner, 1937; Rigg, 1940). However, it contrasts with findings from other investigations (Coutinho & Cangelosi, 2011; Eitan & Timmers, 2010; Gundlach, 1935; Ilie & Thompson, 2006; Scherer & Oshinsky, 1977; Watson, 1942; Wedin, 1972). Interestingly, Ilie and Thompson (2006) observed that modification of pitch level had an effect on tension arousal but not on energy arousal. Differences in how arousal is conceptualized by the experimenters, and in how it is explained to and understood by the participants, may be important in understanding the contrasting findings. Importantly, as discussed below, the negative relationship between pitch level and arousal was significant for men but not for women.
Gender differences
Significant gender effects were found for both felt valence and felt arousal. Regardless of the a priori arousal and valence of the musical pieces, women gave more positive ratings than men. This could be explained by sociopsychological processes based on the stereotype according to which women experience and express emotions more intensely and more frequently than men (Feldman Barrett, Robin, Pietromonaco, & Eyssell, 1998). These beliefs may be rooted in social roles that specify that women are more affectively responsive than men (Eagly, 1987; Wood, Rholes, & Whelan, 1989). Notions about general affective responsivity may be constructed in line with the social role ascribed to gender. Therefore, the effect of gender on affective valence could reflect that listeners gave role-guided responses leading to greater (for women) and smaller (for men) amounts of pleasure.
Perhaps more strikingly, gender moderated the effects of pitch level variation on felt valence and arousal. Although valence ratings of both women and men increased in a linear and quadratic way when pitch level increased, this relation was more obvious (i.e., larger effect size) for women. Concerning arousal ratings, increasing pitch level induced a decrease of felt arousal in men but not in women.
In sociopsychological terms, the difference in the rating of valence between men and women may come from socially ascribed gender roles. According to Landon (1974), people may respond more positively to objects that are congruent with their self-concept. Voice pitch is positively correlated with perceived femininity (Feinberg, DeBruine, Jones, & Perrett, 2008), and high pitch is associated with “feminine/female” in studies on cross-domain mappings of auditory pitch (Eitan & Timmers, 2010). This association between high/low pitch and femininity/masculinity probably stems from the fact that voice pitch is about half as high in men as it is in women (Titze, 2000). In that way, it is possible that music with higher pitch was experienced as more congruent with traditional concepts of femininity and thus felt as more positive by women, whereas the same music was felt as less congruent with traditional notions of masculinity by men. If we consider that valence is linked with the tendency to view a situation as personally relevant (Frijda, 1986; Reisenzein, 1994), it is possible that the stronger effects of pitch level variation on the valence ratings of women arose because the range of pitch levels used in this study was more personally relevant for women than for men.
Voice pitch is associated with interpersonal power and deference relations (e.g., Gregory, 1994). More specifically, the pitch level of male utterances is negatively associated with men’s ratings of physical and social dominance, and it has been suggested that interaction with relatively dominant conspecifics may increase activation (Puts, Gaulin, & Verdolini, 2006). Assuming a close relationship between vocal expression and music (Juslin & Laukka, 2003), one may speculate whether the negative relationship between pitch level and felt arousal observed in this study and in our previous study (Gomez & Danuser, 2007) reflects, at least partly, the perception of higher dominance and power in low-pitched music. Because of the importance of male dominance competition for intrasexual selection, these pitch-related effects on arousal may be stronger in men than in women, although, to the best of our knowledge, this remains to be tested.
Another way to discuss the interaction effect of pitch level and gender on felt valence and arousal stems from the possibility that during an emotional experience induced by music, women may be more focused on the valence dimension whereas men may be more focused on the arousal dimension. The concepts of valence focus and arousal focus were first introduced by Feldman (1995). Valence focus corresponds to the tendency to attend to and to report the pleasant or unpleasant aspects of affective experience, whereas arousal focus refers to the tendency to attend to and report the physiological arousal associated with affective experience. Although no study has reported a gender effect on valence and arousal focus, it could be that our female participants focused more on the valence dimension than on the arousal dimension of the emotional experience, whereas male participants focused more on the arousal dimension. Importantly, the gender effects observed here cannot be attributed to differences in music listening habits, musical preferences, or musical training, or to differences in momentary mood, since men and women did not differ significantly on these variables.
Limitations and outlook
The nature of the specific musical stimuli used in the current study imposes limits on generalizability. Given the multitude and variety of musical works, this is an inherent and unavoidable problem of all studies on musical emotions. Consequently, selecting a larger subset of stimuli for each quadrant of the valence and arousal space would not have made the results significantly more generalizable. Rather, this would have made the experiment excessively long, with the risk of causing fatigue, annoyance, and a decrease in attention in the participants. Therefore, our strategy was to choose four excerpts that are prototypical in terms of major musical determinants of valence and arousal (i.e., mode and tempo). Future studies should determine to what extent our findings can be generalized to other musical works.
This study suggests that emotional responses and more specifically valence depend on the musical pitch level and its interaction with other factors of the musical structure. Like perceived emotional expression in music (Gabrielsson & Lindström, 2010), felt emotions are the result of a combination of many factors. Future research would benefit from adopting an interactive approach in which different musical factors are simultaneously manipulated. This seems particularly important in light of intensity-based analogies between musical parameters (e.g., pitch, tempo, loudness) and cross-dimensional effects on auditory perception (Eitan, 2007; Eitan & Granot, 2006, 2007). This approach will help researchers to keep on developing an accurate taxonomy of the effects of musical factors on emotional responses. Moreover, our study suggests an important modulatory role of gender. Thus, a further step would be to investigate how pitch and other features of the musical structure interact with personal factors.
In the present study we used the bi-dimensional model of valence and arousal as a theoretical framework. This model is efficient in “summarizing” emotions; nonetheless, it also entails a loss in specificity. It would be important to investigate the effects of pitch variations on felt emotions adopting other models such as the three-dimensional model of affect (Schimmack & Reisenzein, 2002), the basic emotion model (Ekman, 1992) or domain-specific models (e.g., Zentner et al., 2008), ideally in combination. This may help in gaining a more elaborate understanding of the apparently complex relationship between pitch level and emotional experience.
Music changes in its structure across time. These changes induce modifications in emotional expression and experience. Even though we selected excerpts with a relatively homogeneous musical structure, the affective ratings that listeners made after each excerpt could not satisfactorily reflect the variations in emotional experience induced by the dynamic nature of music. To better capture these temporal variations, continuous recording methods may be used (Schubert, 2010).
In conclusion, the present study contributes to our understanding of how musical structure impacts our emotional experiences by showing that pitch level affects both the experience of pleasure and, to a lesser degree, arousal. In general, lower pitch levels tend to be associated with more negative valence and higher arousal than higher pitch levels. Importantly, the pitch effect on valence is modulated by the specific properties of the musical stimuli, and the pitch effect on both valence and arousal is influenced by the gender of the listener: The positive association between pitch level and valence is stronger for women than men, and the negative association between pitch and arousal is found only among men.
Acknowledgements
We thank Marcia Klimek for her comments on an earlier version of the manuscript.
