Abstract
Scherer and Zentner (2001) propose that affective experiences might be the product of a multiplicative function between structural, performance, listener, and contextual related features. Yet research on the effects of structure, and particularly texture, has mostly focused on perceived emotions. We therefore sought to test the effects of structural features on subjective musical experiences in a listening study by manipulating the performance, solo versus ensemble, of five segments of a piece for string quartet, while also exploring the impact of listener features such as musical training, listening habits and stable dispositions such as empathy. We found that participants (N = 144, 78% female; Mage = 22.74 years, SD = 5.13) felt like moving more (ME) and perceived their physiological rhythms change more (VE) during ensemble compared to solo conditions. Moreover, ME significantly predicted positive emotions, such as Wonder and Power, while VE significantly predicted both positive and negative emotions, such as Tension and Nostalgia. We also found direct main and interaction effects of both segment and performance factors on all four emotion models. We believe these results support Scherer and Zentner’s model and show the importance of considering the interaction between compositional and instrumental texture when studying music-induced emotions.
Part of our attraction toward music may be explained by the emotional effects that it can have on listeners (Krumhansl, 2002). Indeed, studies have found that the regulation and management of both positive and negative mood is among the main reasons for listening to music (Lonsdale & North, 2011). This has sparked a need to better characterize both the musical features and psychological processes that might contribute to affective experiences during music listening, whether preferences, moods, or emotional episodes (Scherer, 2004). As a starting point, Scherer and Zentner (2001) have suggested that music-induced emotions might be the result of a multiplicative function between structural, performance, listener, and contextual related features. In order of importance, music structural features have been found to be particularly powerful determinants of emotional experiences and especially so in the case of Western classical music (Scherer & Coutinho, 2013; Scherer, Zentner, & Schacht, 2002). However, with a few notable exceptions (Gomez & Danuser, 2007; Sloboda, 1991), work on the effects of musical features has tended to focus on the perception of musically expressed emotions (Gagnon & Peretz, 2003; Juslin & Lindström, 2010; Schubert, 2004) or has mostly been concerned with the effects of tempo and mode (Husain, Thompson, & Schellenberg, 2002; Larsen & Stastny, 2011). Indeed, the role of other important features on emotion such as instrumental texture, which is related to the way that different voices in a piece combine, has remained relatively understudied. While emotion perception studies have found links between non-harmonized or simple harmonies and positive valence (Gabrielsson & Lindström, 2001; Webster & Weir, 2005), the relationship between texture and felt emotions remains unknown. In the following study we sought to test the coarse effects of structural features on subjective musical experiences in a music listening study. 
To do so, we manipulated the performance (solo versus ensemble) of five segments of a piece for string quartet, while also testing the impact of listener features such as musicianship and stable dispositions such as empathy. As mentioned earlier, there is a wide range of subjective experiences listeners might have in response to music, from simple pleasure or liking, to the recognition of musical structures, to experiences of affect and motion for example (Bharucha, Curtis, & Paroo, 2006; Scherer, 2004). In this paper we have chosen to focus on impressions of changes in bodily states and the desire to move in response to the music by using the Musical Entrainment Questionnaire (MEQ) (Labbé & Grandjean, 2014), and specific emotions from the Geneva Emotional Music Scale (GEMS) (Zentner, Grandjean, & Scherer, 2008), a nine-factor scale specifically designed to capture music-induced emotions in expert and non-expert listeners. The MEQ is constituted by two factors, a Visceral Entrainment (VE) factor consisting of items related to subjective experiences of having one’s body rhythms change as a result of listening to music, and a Motor Entrainment (ME) factor comprising items related to the urge to move in response to the music, a similar concept to groove as proposed by Janata, Tomic, and Haberman (2012). Interestingly, Janata et al. found that among the top 20 attributes defining the concept of the groove was the idea that “[the] groove depends on the density of the texture” (Janata et al., 2012, p. 57), which could mean that listeners might feel like moving more to ensemble than to solo music for example. Using the MEQ is also of interest because both its factors have been found to significantly predict different GEMS dimensions in a music listening study using solo violin performances (Labbé & Grandjean, 2014). 
More specifically, VE was found to significantly predict all dimensions except for Peacefulness, while ME significantly predicted all positive (i.e., Joyful Activation, Transcendence, Wonder, Power, Tenderness, and Peacefulness) but no negative dimensions (i.e., Nostalgia, Tension, and Sadness). In this study, we decided to implement a similar paradigm using music for string quartet while focusing on four GEMS dimensions: Tension, Nostalgia, Wonder, and Power. These were chosen since each had been well represented in a pilot study using the full set of GEMS dimensions and because they can each be conveniently mapped onto a valence by arousal quadrant (Russell, 1980) – Tension (negative valence, high arousal), Nostalgia (negative valence, low arousal), Wonder (positive valence, low arousal) and Power (positive valence, high arousal), a similar organization having also been found by Trost, Ethofer, Zentner, and Vuilleumier (2012). Given the association between texture and groove (Janata et al., 2012) and previous findings linking ME with positive emotions (Labbé & Grandjean, 2014), we expected ME ratings to be higher during ensemble performances and to better predict positive dimensions, while VE was expected to predict both positive and negative dimensions.
Method
Participants
Participants were second-year bachelor students at the University of Geneva. Students were invited to participate on a voluntary basis at the end of a lesson. The results from 144 students (112 women, 28 men, 4 N/A) who completed all or part of the questionnaires were kept for analyses (1.41% missing data). Ages ranged from 18 to 50 years (M = 22.74, SD = 5.13). Eighty-five participants had sung or played an instrument at some point in their lives (Training), most of them having started around the age of 10; however, only 28 of them reported that they still played. Finally, while 103 of our participants reported that they did not regularly listen to classical music (ClassicalListen, 3 N/A), 118 of them stated that they liked it (ClassicalLike).
Ethics
All participants were informed about the goals of the study and the institutions involved, and were verbally reminded that they had the right to withdraw from the study at any point and for any reason. All data were analyzed anonymously and the study was conducted in accordance with the Declaration of Helsinki.
Materials
Musical stimuli
Stimuli consisted of segments from Schubert’s String Quartet No. 14 in D minor, “Death and the Maiden”, recorded at the CasaPaganini-InfoMus Lab (University of Genoa, Italy) and performed by Quartetto di Cremona in the context of the EU project SIEMPRE. String quartet music was chosen because it offers the experimenter better control over how each instrumental voice interacts and contributes to the overall sound. In order to explore the features of musical structure that might affect listeners’ experiences, the piece was subdivided into five segments ranging between 17 and 34 seconds (see online supplementary material). The segments were recorded once with all musicians (ensemble) and once with the first violin only (solo).
Questionnaires
After each stimulus participants had to answer eight questions about their subjective experience on seven-point Likert scales. These were: the two most saturating items on each MEQ factor (“to what extent did you feel like dancing?” and “to what extent did you feel like moving?” for ME, and “to what extent did you feel your own bodily rhythms change?” and “to what extent did your own body resonate with the music?” for VE), followed by the dimensions of Tension, Wonder, Nostalgia, and Power. Before the music listening section of the study began we also used a short form of the Fantasy, Perspective Taking and Empathic Concern subscales of the Interpersonal Reactivity Index (IRI) and collected information on participants’ musical experience since similar characteristics have been found to be relevant predictors that can modulate the emotion-induction experience (Labbé & Grandjean, 2014).
Procedure
Trials were presented through loudspeakers in a small auditorium in a semi-randomized order, ensuring that the same segment would not be presented twice in a row. Enough time was left for participants to answer each set of questions before playing the next stimulus. The study lasted a total of 15 minutes.
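The semi-randomized ordering described above can be illustrated with a minimal sketch. The text does not state how the order was generated, so the rejection-sampling approach, the function name semi_random_order, and the reading of the constraint (the solo and ensemble versions of a segment never follow each other directly) are assumptions made for illustration only.

```python
import random

# The 5 segments x 2 performances give the 10 trials of the listening study.
SEGMENTS = [1, 2, 3, 4, 5]
PERFORMANCES = ["solo", "ensemble"]


def semi_random_order(rng, max_tries=10_000):
    """Shuffle the trials until no segment appears twice in a row.

    Hypothetical helper: rejection sampling is one simple way to satisfy
    the constraint; the original procedure may have differed.
    """
    trials = [(segment, perf) for segment in SEGMENTS for perf in PERFORMANCES]
    for _ in range(max_tries):
        rng.shuffle(trials)
        # Accept the shuffle only if neighbouring trials differ in segment.
        if all(a[0] != b[0] for a, b in zip(trials, trials[1:])):
            return list(trials)
    raise RuntimeError("no valid order found within max_tries")


if __name__ == "__main__":
    for segment, performance in semi_random_order(random.Random(2014)):
        print(f"segment {segment}, {performance}")
```

Roughly one shuffle in three satisfies the constraint for this design, so the loop typically terminates after a handful of attempts.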
Results
Linear mixed-effects models
In order to test the effects of Performance on MEQ ratings, we fitted two separate linear mixed-effects models (“lmer” from the lme4 package in R 3.0.2) to the extracted factor scores for the ME and VE ratings. In both cases we defined Performance (solo/ensemble) as a fixed factor, and controlled for the effects of Participant, Sex, and Order of Presentation by defining these as random factors. We found a significant effect of Performance in both the ME (χ2 = 6.4, df = 1, p = .011) and VE (χ2 = 4.403, df = 1, p = .036) models, both factor scores being significantly greater in the ensemble condition.
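As a reading aid, the p values attached to the χ2 statistics can be checked with a short sketch. It assumes that each statistic is referred to a chi-square distribution with the stated degrees of freedom; the helper chi2_sf is hypothetical and uses only closed forms available in the Python standard library (df = 1 and even df), so it is not a general-purpose implementation.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Sketch covering only df = 1 and even df, where closed forms exist:
      df = 1    : erfc(sqrt(x / 2))
      df = 2m   : exp(-x / 2) * sum_{k < m} (x / 2)^k / k!
    """
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be >= 1")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k  # builds (x / 2)^k / k! incrementally
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("odd df > 1 is not covered in this sketch")


if __name__ == "__main__":
    # Statistics reported for the Performance effect on ME and VE (df = 1).
    for label, stat in [("ME", 6.4), ("VE", 4.403)]:
        print(f"{label}: chi2 = {stat}, df = 1, p = {chi2_sf(stat, 1):.3f}")
```

Under that assumption, the two values should land close to the reported p = .011 and p = .036.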
Generalized linear mixed-effects models
Next, in order to test the influence of these factors and of stable personality and music listening traits on felt emotion ratings, we used generalized linear mixed-effects models (“glmer” from the lme4 package) with a gamma distribution, fitted separately for each GEMS dimension. We chose this approach because our data were not normally distributed: ratings showed a mass at 1, which is typical of emotion studies. For each model, we defined Segment (five segments), Performance (solo/ensemble), ClassicalListen (yes/no), ClassicalLike (yes/no), and musical Training (yes/no) as fixed factors, while MEQ factor scores and the averaged Fantasy, Perspective Taking, and Empathic Concern subscales of the IRI were used as continuous predictors. Once again, Participant, Sex, and Order of Presentation were defined as random factors.
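For readers who prefer a formula, the general shape of such a model can be written as below. This is only a sketch under stated assumptions: the link function g is not reported in the text and is therefore left generic, the random factors are assumed to enter as random intercepts, and the symbols (y_i, mu_i, nu, x_i, beta, b) are introduced here for illustration.

```latex
% y_i : rating on one GEMS dimension for observation i
% x_i : fixed factors, continuous predictors and interactions of interest
% b   : assumed random intercepts for Participant (P), Sex (S) and Order (O)
\begin{aligned}
y_i \mid b &\sim \mathrm{Gamma}(\text{mean } \mu_i,\ \text{shape } \nu), \\
g(\mu_i) &= \mathbf{x}_i^{\top}\boldsymbol{\beta}
  + b^{P}_{p(i)} + b^{S}_{s(i)} + b^{O}_{o(i)}, \\
b^{P}_{p} \sim \mathcal{N}(0, \sigma_P^2), \quad
b^{S}_{s} &\sim \mathcal{N}(0, \sigma_S^2), \quad
b^{O}_{o} \sim \mathcal{N}(0, \sigma_O^2).
\end{aligned}
```

Here p(i), s(i), and o(i) denote the participant, sex, and presentation order associated with observation i.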
We found significant effects of Segment, Performance, Training, ClassicalListen, and Empathic Concern on all emotion models. The design of the models and all results are summarized in Table 1.
Table 1. Summary of generalized linear mixed models of the four Geneva Emotional Music Scale dimensions of interest.
Note. χ2 statistics of the generalized linear mixed models using a gamma distribution with the categorical factors Segment, Performance, Training, ClassicalListen, ClassicalLike, continuous predictors Visceral Entrainment, Motor Entrainment, empathy (IRI) and the interactions of interest. Participant, Sex, and Order of presentation were used as random factors.
*p < .05; **p < .01; ***p < .001.
Tension
A significant interaction between Segment and Performance (χ2 = 16.39, df = 4, p < .001, see Figure 1) showed listeners experienced more tension during solo performances only during segment 3 (χ2 = 35.58, df = 1, p < .001). Tension ratings were also significantly predicted by greater VE scores (χ2 = 25.79, df = 1, p < .001), having no training (χ2 = 35.7, df = 1, p < .001), not listening regularly to classical music (χ2 = 93.85, df = 1, p < .001), and having lower IRI ratings (χ2 = 86.9, df = 1, p < .001).

Figure 1. Interaction effects between Segment and Performance for each emotion model. Error bars denote 95% confidence intervals.
Nostalgia
A significant interaction between Segment and Performance (χ2 = 17.13, df = 4, p = .0018) showed listeners felt more nostalgic during ensemble performances during segment 3 (χ2 = 19.47, df = 1, p < .001) but less nostalgic during segment 4 (χ2 = 6.45, df = 1, p = .085). We also found greater VE scores (χ2 = 36.92, df = 1, p < .001), being trained (χ2 = 26.02, df = 1, p < .001), liking classical music (χ2 = 4.34, df = 1, p = .037), not listening regularly to classical music (χ2 = 121.8, df = 1, p < .001), and lower IRI scores (χ2 = 72.98, df = 1, p < .001) to predict greater feelings of Nostalgia.
Wonder
A significant interaction between Segment and Performance (χ2 = 14.68, df = 4, p = .0054) showed listeners felt more wonder during ensemble performances in the second (χ2 = 10.19, df = 1, p = .0057), third (χ2 = 21.4, df = 1, p < .001), and fourth segments (χ2 = 8.18, df = 1, p = .038); however, the opposite relation was observed during the fifth segment (χ2 = 5.52, df = 1, p = .038). We also found wonder ratings to be significantly predicted by higher ME (χ2 = 146.66, df = 1, p < .001) and VE scores (χ2 = 170.55, df = 1, p < .001), being trained (χ2 = 27.72, df = 1, p < .001), liking (χ2 = 9.14, df = 1, p = .0025) and regularly listening to (χ2 = 98.45, df = 1, p < .001) classical music, and higher IRI scores (χ2 = 77.24, df = 1, p < .001). A significant interaction between ME and VE (χ2 = 32.87, df = 1, p < .001) also showed that higher ME ratings were associated with a greater contribution of VE to the model.
Power
A significant interaction between Segment and Performance (χ2 = 15.39, df = 4, p = .004) showed listeners experienced more feelings of power during ensemble performances in the first (χ2 = 6.89, df = 1, p = .033), second (χ2 = 6.88, df = 1, p = .033), third (χ2 = 6.96, df = 1, p = .033), and fourth segments (χ2 = 27.63, df = 1, p < .001), but not in the fifth (χ2 = 4.4, df = 1, p = .036) where the reverse relation was true. In addition, higher ME (χ2 = 72.38, df = 1, p < .001) and VE scores (χ2 = 102.66, df = 1, p < .001), being trained (χ2 = 39.16, df = 1, p < .001), not listening regularly to classical music (χ2 = 116.96, df = 1, p < .001), and lower IRI scores (χ2 = 83.67, df = 1, p < .001) predicted greater power ratings. As with the wonder model, the interaction between ME and VE (χ2 = 5.88, df = 1, p = .015) showed that the higher the ME ratings the higher the contribution of VE to the model.
Conclusion
The purpose of this study was to test the effects of structural and listener features (Scherer & Zentner, 2001) on affective experiences induced by short excerpts performed by a violin (solo) or a string quartet (ensemble). We specifically tested whether ensemble performances would lead to a greater desire to move in response to the music in terms of higher ME ratings and whether ME ratings would predict positive emotions, while perceived physiological changes (VE) were expected to predict both positive and negative emotions.
As predicted, we found that ME ratings were significantly greater during ensemble than during solo performances, confirming the idea that greater instrument density is associated with “that aspect of the music that induces a pleasant sense of wanting to move” (Janata et al., 2012, p. 56). Interestingly, VE ratings were also greater during ensemble performances, which was an exploratory question for which we had no a priori hypotheses. This could be because feeling like moving, which can be considered to be a part of the motivational component of the emotion experience, likely feeds back into the emotion process (Sander, Grandjean, & Scherer, 2005), increasing arousal and leading to listeners’ increased awareness of physiological changes, which would be consistent with the interaction effects found in the Wonder and Power models.
Concerning the relationship between MEQ and emotion ratings, we expected that wanting to move would be predictive of positive dimensions such as Power and Wonder, while having a sense of one’s own bodily rhythms changing would be predictive of both positive and negative dimensions, which is indeed what we found. The Wonder and Power models also revealed an interaction effect between VE and ME, suggesting a complementary relationship between these two distinct but related factors. ME appears to act as a catalyst for VE effects: at low levels of ME, feeling like moving is enough to either induce or characterize positive states, but as ME increases so does the sense that one’s internal rhythms are changing, contributing to the felt intensity of positive states. Overall, these results follow the direction of our initial assumptions and are consistent with previous findings (Labbé & Grandjean, 2014). As for the effects of texture, Performance and the interaction between Performance and Segment were significant predictors in all models. Concerning the positive dimensions, the ensemble condition induced significantly higher Wonder and Power overall, whereas the solo condition induced significantly more Tension overall. Looking at the interaction between Performance and Segment, we further found that the effect of Performance could be related to the characteristics of the musical score, highlighting the importance of considering composition-related features together with instrumental texture when addressing the emotional impact of the music (Scherer & Zentner, 2001). The main difference observed in positive emotions between segment 5 and the others may lie in the fact that, in this segment, the first violin has a leading part and relies much less on the other parts.
In the other segments, the different instrumental voices are integrated with one another to varying degrees and show a network of mutual harmonic or rhythmic dependencies: consider for instance the homo-rhythmic texture of segments 1 and 4, the subdivision into two contrasting duets in segment 2, or the fugato dialogue between all the voices in segment 3. For the negative emotions, such interdependency between the different voices has a less clear-cut impact. Consider for example the ambivalent case of segment 3: it receives higher ratings in the solo condition in the Tension model but lower ratings in the Nostalgia model. These two cases may also reveal how emotion orients attention to specific aspects of the relationship between the musical structure and the instrumental texture. It is plausible that tension increases in the solo performance because the different instrumental voices are highly interrelated in this segment, so that the absence of the other voices is felt more strongly, contributing to a feeling of irresolution that is typical of tension. Conversely, Nostalgia may increase in the ensemble condition because in the fugato style a voice is repeatedly performed by various instruments, giving rise to a sense of echo that is, in some way, metaphorically related to Nostalgia. As for segment 4, the solo condition may have a stronger impact because the voice symbolically embodies the tragic theme of death through its melodic contour.
Lastly, we found that most listener features had significant effects on felt emotion ratings, but in different ways. Being musically trained, for instance, seemed to make listeners more prone to feeling positive emotions, leading to lower ratings of Tension and higher ratings of Power, Wonder, and Nostalgia. It could be that for trained musicians listening to music is rewarding in itself, thus leading to a more positive experience overall. Indeed, Nostalgia can be considered a “bittersweet” emotion embodying both positive and negative aspects (Larsen & Stastny, 2011). Most listeners reported liking classical music, and this also led to significantly higher ratings of Wonder and Nostalgia, both relatively subtle and low arousal dimensions, which might be more dependent on listeners feeling engaged by the music. Regularly listening to classical music, on the other hand, led to lower feelings of Tension, Nostalgia, and Power but higher Wonder. This curious relation might be due to a more analytic style of listening that made these listeners less prone to feeling strong emotions, the experience being nevertheless an intellectually pleasant one, which might explain the high Wonder judgments. Finally, high empathy scorers were also less likely to experience strong Tension, Nostalgia, or Power, but not less likely to experience high levels of Wonder. This could be because these participants are also better able to regulate negative or extreme dimensions, but this remains to be tested.
Acknowledgements
We would like to thank the Quartetto di Cremona for performing the pieces used in this study, as well as CasaPaganini-InfoMus staff and its scientific director Professor Antonio Camurri for making these recordings possible. We are also grateful to Jon Hargreaves for providing his help and expertise during the musicological analysis of the pieces.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work reported in this paper is part of the SIEMPRE project, which acknowledges the financial support of the Future and Emerging Technologies program within the Seventh Framework for Research of the European Commission under FET-Open Grant Number 250026-2. This study was also financed by the National Centre of Competence in Research in Affective Sciences supported by the Swiss National Science Foundation grant number 51NF40-104897 – DG.
