Abstract
Mode and tempo are known to influence affective experiences during music listening. While mode (major/minor) is associated with emotional valence (positive/negative), tempo (slow/fast) is associated with emotional arousal (calm/excited). Heart rate (HR) and respiration rate (RR) are also thought to adapt (entrain) to the tempo, leading to emotion elicitation via afferent feedback mechanisms. Here, we tested the influence of mode, tempo, and entrainment on affective experiences by recording HR, RR, and self-reported subjective entrainment and affect measures while participants (N = 20) listened to major and minor chords embedded in slow and fast isochronous, metrical, and random sequences. Though there was no effect of tempo on HR or RR, both were faster during major and metrically random chord sequences, respectively. Slower HR positively predicted visceral entrainment (VE) ratings, the extent to which one feels one’s internal rhythms changing, and fast tempo positively predicted motor entrainment (ME) ratings, the extent to which one feels like moving. Compared to minor chords, fast major chord sequences induced more feelings of vitality (positive, high arousal), while minor sequences induced more feelings of unease (negative, high, and low arousal). Both ME and VE positively predicted pleasantness ratings and positive emotions, and negatively predicted negative emotions.
Music has long been recognized as having the power to elicit strong responses in listeners (Meyer, 1956), including shivers down the spine, tears, spontaneous movement, mood changes, and even emotions (Scherer & Zentner, 2001). In this study, we were interested in affective experiences, affect being an umbrella term for states such as moods, interpersonal stances, attitudes, personality traits, preferences, and emotions (Scherer, 2004, 2005). We especially focused on preferences (liking/disliking), as measured by pleasantness ratings, and emotions, as measured by aesthetic emotions ratings, in response to musical stimuli.
Indeed, beyond rhythm or melody, when we listen to music, most of us get the sense that it somehow expresses emotions that we can recognize and label (Gabrielsson & Juslin, 2003; Juslin, 2013b). This could be due to the repeated association of meaningful events with specific pieces (or acoustic features) and/or due to the similarities between acoustic cues and emotionally relevant cues in human vocal expression (Juslin, 2013b; Juslin & Laukka, 2003; Scherer, 1995). To test these assumptions, one approach has been to systematically manipulate musical structure and performance variables in order to correlate them to listeners’ perceptual judgments (Gabrielsson, 2009). In this way, it has been possible to associate acoustic features, notably mode and tempo, with dimensions, such as valence and arousal/activation, which are thought to account for most of the variance in emotion perception studies (Russell, 2003). Thus, in Western music, the minor mode has been typically associated with negative valence and the major mode with positive valence, while fast tempo has been mostly associated with high arousal or “excitement” and slow tempo with low arousal or “calmness” (Gabrielsson, 2009). Therefore, music in a major key tends to be perceived as sounding happy when it is fast and tender when it is slow, whereas music in a minor key can be variously perceived as sounding angry, fearful, or tense when it is fast and sad when it is slow (Gabrielsson, 2009; Juslin & Lindström, 2010).
Interestingly, some of the relationships between musical factors and expressed emotions have also been found when studying felt emotions, that is, emotions induced by the music in listeners (e.g., Gomez & Danuser, 2007; Hunter, Schellenberg, & Schimmack, 2010; Larsen & Stastny, 2011; McConnell & Shore, 2011; van der Zwaag, Westerink, & van den Broek, 2011). This distinction is important since listeners do not necessarily experience the same emotions—nor even the same kinds of emotions (Juslin & Laukka, 2004)—that they perceive being expressed by music (Gabrielsson, 2002; Scherer, 2004). This is especially true of music that is perceived as sounding sad, which often induces a variety of felt emotions besides sadness, including mixed and positive emotions (Kawakami, Furukawa, Katahira, & Okanoya, 2013; Taruffi & Koelsch, 2014; Vuoskoski, Thompson, McIlwain, & Eerola, 2012). This lack of correspondence is thought to be partly due to the greater complexity of the mechanisms underlying emotion elicitation (Juslin, 2013a).
Indeed, as pointed out by Scherer and Zentner (2001) in their emotion production routes model, while structural and performance features certainly remain important factors, experienced emotions are also likely to depend on their interactions with features related to the listener (e.g., level of musical expertise) and the listening context (e.g., location, type of event). Thus, these researchers proposed a multiplicative function where the experienced emotion is the result of structural, performance, listener, and contextual features. They further proposed different theoretical emotion elicitation “routes” that primarily involve the central nervous system (central routes) or the peripheral nervous system (peripheral routes). Among the latter, the proprioceptive feedback route is based on the notion that the emotion components (cognitive appraisal, motor expression, autonomic physiology, action tendency, and subjective feeling) are coupled to such a degree that activating the response pattern of one can cause changes in the others and even trigger a full emotional response. For instance, Levenson, Ekman, and Friesen (1990) found that intentionally producing facial expressions generated emotion-specific autonomic activity and subjective experiences. This may then result in either a full-blown emotional response or the facilitation of its emergence. This was the case in Strack, Martin, and Stepper’s (1988) seminal study where naïve participants holding a pen lengthwise between their teeth—contracting the muscles involved in smiling—found cartoons funnier than when holding it between pursed lips. In other words, although we typically appraise a new situation as being good for us first and then smile, our hearts beat faster, we want to jump, and subjectively experience “happiness” via efferent effects, the reverse (i.e., smiling, faster pulse, or jumping) could also trigger an emotion via afferent feedback mechanisms (for a review, see Price & Harmon-Jones, 2015). 
In this way, it has been suggested that rhythm might cause the patterning of some of the emotion components—and autonomic physiology in particular—to change and adapt to the external pattern in the music and then spread to the other components (Scherer & Zentner, 2001). For example, a lullaby could induce a slower breathing pattern and the proprioceptive feedback from that pattern could lead to slower body movement, a desire to rest, and peacefulness. This idea was further developed in the updated version of the model in the form of an entrainment route (Scherer & Coutinho, 2013). A similar mechanism can also be found in the BRECVEMA framework (Juslin, 2013a), which is a theoretical framework that aims to explain emotional responses to music.
Entrainment can be understood as the process through which two independent systems that emit a regular signal (oscillators) tend to synchronize as they interact with each other (Clayton, Sager, & Will, 2005). Given this interpretation and the fact that many biological and psychological processes are rhythmic in nature, there is a high potential for entrainment to occur during music listening. Consequently, Trost, Labbé, and Grandjean (2017) proposed a framework describing at least four levels at which entrainment can take place and lead to the induction of affective experiences in listeners. At the perceptual level, attention and perceptual processes are proposed to entrain to periodicities in the auditory signal, allowing for the percept of a beat and metrical hierarchy to emerge. At the autonomic physiological level, autonomic nervous system activity, such as heart rate (HR) and respiration rate (RR), is proposed to adjust toward the music’s periodicities (e.g., tempo). At the motor and social levels, it is listeners’ movements that coordinate with perceived periodicities in the music and in others’ movements or sounds in the case of social entrainment. Finally, these levels contribute to a subjective feeling component of entrainment that shares similarities with the experience of “being in the groove” (Janata, Tomic, & Haberman, 2012, p. 54) that has been found to be an especially good predictor of positive aesthetic emotions (Labbé, Glowinski, & Grandjean, 2016; Labbé & Grandjean, 2014). This is consistent with Trost et al.’s (2017) proposal that entrainment induces mostly positive affective experiences.
Importantly, depending on what level is investigated, perfect phase (same onset) and period (same duration) synchronization may not always be achieved. As expounded by Clayton et al. (2005), entrainment describes a tendency toward coordination and not just common phase and/or periodicity. Furthermore, individual differences and other factors may also interact with entrainment processes, which partly explains the many and varied results concerning changes in physiological rhythms in response to music. For instance, Haas, Distenfeld, and Axen (1986) found a correlation between tempo and RR in half of their participants and instances of inspiratory phase coupling in 80% and 50% of their musically trained and untrained participants, respectively. However, a tendency toward acceleration for fast and deceleration for slow music has been found in a number of studies measuring RR (Gomez & Danuser, 2007; Kenntner-Mabiala, Gorges, Alpers, Lehmann, & Pauli, 2007; Khalfa, Roy, Rainville, Dalla Bella, & Peretz, 2008) and HR (Bernardi, Porta, & Sleight, 2006; Etzel, Johnsen, Dickerson, Tranel, & Adolphs, 2006; Gomez & Danuser, 2007; Shoda, Adachi, & Umeda, 2016). This could explain the link between tempo and arousal that is often reported in the literature concerning music and emotion where HR and/or RR have been found to correlate with arousal ratings (Gomez & Danuser, 2007; Iwanaga & Moroki, 1999; Witvliet & Vrana, 2007).
To test the effects of structural features and entrainment on felt emotions, we employed simple and manipulable stimuli in the form of fast and slow rhythmic chord sequences. Major and minor chords consist of at least three notes played simultaneously: the root or tonic, the third, and the fifth. The pitch of the fifth is seven semitones higher than the root and the pitch of the third can be either four or three semitones higher, resulting in major and minor chords, respectively. Major and minor chords have been found to retain emotional meaning by facilitating the processing of both happy and sad words (Steinbeis & Koelsch, 2009) and faces (Bakker & Martin, 2015) in affective priming tasks. Major and minor chords are also known to be processed by different brain regions and evoke different brain activity related to emotion processing (Pallesen et al., 2005) and to convey distinct emotional qualities along various dimensions, including valence (Lahdelma & Eerola, 2016). While there is much work concerning the perception of emotion conveyed by major/minor chords, few researchers have investigated how they induce emotions, presumably because chords are too brief. Fortunately, because we wanted to examine the effects of tempo, we circumvented this problem by embedding major and minor chords in fast and slow isochronous, metrical, and random sequences. This was necessary because to establish a tempo, one must instill—at the very least—a sense of beat.
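The interval structure described above can be written out as a minimal Python sketch. It assumes MIDI note numbering (A4 = 69), twelve-tone equal temperament, and a 440 Hz reference for A4; none of these are stated in the text, and the helper names are hypothetical. The A root is the one reported in the Stimuli section.

```python
# Semitone offsets above the root, as described in the text.
MAJOR_TRIAD = (0, 4, 7)  # root, third (four semitones), fifth (seven semitones)
MINOR_TRIAD = (0, 3, 7)  # root, third (three semitones), fifth (seven semitones)

NOTE_NAMES = ("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")


def midi_to_name(midi_note: int) -> str:
    """Return a note name with octave, e.g. 69 -> 'A4' (MIDI convention, C4 = 60)."""
    octave = midi_note // 12 - 1
    return f"{NOTE_NAMES[midi_note % 12]}{octave}"


def midi_to_hz(midi_note: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency; the 440 Hz reference is an assumption."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)


def triad(root_midi: int, intervals: tuple) -> list:
    """Build a triad as a list of MIDI note numbers."""
    return [root_midi + step for step in intervals]


a_major = triad(69, MAJOR_TRIAD)
a_minor = triad(69, MINOR_TRIAD)
print([midi_to_name(n) for n in a_major])  # ['A4', 'C#5', 'E5']
print([midi_to_name(n) for n in a_minor])  # ['A4', 'C5', 'E5']
```

The two chords differ only in the third, which is the single semitone that separates the major from the minor mode in these stimuli.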
The beat is the perception of a periodic pulse (Patel & Iversen, 2014), and beat induction is the cognitive ability to extract and synchronize with a regular pulse in a stream of music (Honing, 2012). Once perceived, it guides our attention, allowing us to form expectations concerning future events; facilitates the synchronization of our movements; and influences our perception of the tempo, typically measured in beats per minute or “bpm” (Geiser, Walker, & Bendor, 2014). When there is more than one hierarchical pulse, the music is said to have a metrical structure or meter, typically resulting in the perception and anticipation of regularly alternating accented (strong) and unaccented (weak) beats (London, 2004). Geiser and colleagues (2014) described at least three different and usually interacting ways in which accents can arise: dynamic, melodic, and temporal. Temporal accenting arises “when a tone is relatively isolated, the second of a two-tone cluster, or the initial or final note of a cluster of three or more notes” (Geiser, Ziegler, Jancke, & Meyer, 2009, p. 94). Dynamic and melodic accenting, on the other hand, arise from a tone’s intensity or pitch, respectively, being relatively different from that of surrounding tones. Since temporal accenting is a good predictor for meter perception (Hannon, Snyder, Eerola, & Krumhansl, 2004), which has been successfully induced using solely temporal onset cues (e.g., Grube & Griffiths, 2009; Iversen, Repp, & Patel, 2009), we interspersed acoustically identical chords with rests to induce a percept of subjectively accented (strong) and unaccented (weak) chords.
We predicted that listeners’ HR and RR would adapt to the tempo reflecting entrainment effects at the autonomic physiological level (H1). We further expected these changes to correlate with self-reported visceral entrainment (VE) ratings (the extent to which people feel their own rhythms change), but not motor entrainment (ME) ratings (the extent to which they feel the desire to move) (H2). Together with mode and tempo, these scores were expected to predict pleasantness, sublimity, vitality, and unease ratings. Specifically, we expected fast major chords to induce vitality-related emotions (positive, high arousal) (H3), slow major chords to induce sublimity-related emotions (positive, low arousal) (H4), and both fast and slow minor chords to induce unease-related emotions (negative, low + high arousal) (H5).
Method
Participants
Twenty participants (15 females, 5 males; M = 22.0 years, SD = 1.7 years) took part in this study. Participants were recruited from a pool of psychology students and were awarded course credits upon successfully completing the task. All participants were informed of the aims of the study and the institutions involved, and were reminded (verbally and through a detailed consent form) that they had the right to voluntarily withdraw from the study at any point for any reason. The university’s ethics committee approved the experimental protocol and all participants gave informed written consent. All data were analyzed anonymously, and the study was conducted in accordance with the Declaration of Helsinki.
Materials
Stimuli
Stimuli consisted of 13 bars of an A major (A4, C#5, E5) or A minor (A4, C5, E5) chord repeated at a slow (96 bpm) or a fast (128 bpm) tempo. Metrical sequences (4/4 and 5/4) consisted of chords being played on beats one, two, and three leaving beat four silent, to instill a sense of a 4/4 meter, and beats one, two, three, and four leaving beat five silent, to instill a sense of a 5/4 meter. The metronome sequences consisted of chords being played on all beats for the equivalent of 13 bars in a 4/4 time signature. The random sequences were constructed with the help of Microsoft Excel’s “rand()” function, which was used to randomly assign chords or rests to each beat during the equivalent of 13 bars in a 4/4 time signature (see Figure 1 and Supplemental Material for examples).

Figure 1. Examples of the metronome, 4/4, 5/4 meter, and random conditions in the major mode.
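The four sequence types can be sketched as lists with one slot per beat (1 = chord, 0 = rest). This is an illustration only: the original stimuli were built in GarageBand with Excel's "rand()" function, and the 0.5 chord probability, the seed, and all function names below are assumptions.

```python
import random


def metrical_pattern(beats_per_bar: int, n_bars: int = 13) -> list:
    """Chord (1) on every beat of the bar except the last, which is a rest (0)."""
    bar = [1] * (beats_per_bar - 1) + [0]
    return bar * n_bars


def metronome_pattern(n_bars: int = 13, beats_per_bar: int = 4) -> list:
    """Chord on every beat for the equivalent of n_bars bars."""
    return [1] * (n_bars * beats_per_bar)


def random_pattern(n_bars: int = 13, beats_per_bar: int = 4, seed: int = 0) -> list:
    """Randomly assign a chord or a rest to each beat.

    The 0.5 probability and the seed are assumptions, not reported values.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < 0.5 else 0 for _ in range(n_bars * beats_per_bar)]


def onset_times(pattern: list, bpm: float) -> list:
    """Onset time in seconds of every chord, given one pattern slot per beat."""
    beat_s = 60.0 / bpm
    return [i * beat_s for i, slot in enumerate(pattern) if slot]


print(metrical_pattern(4)[:8])                    # [1, 1, 1, 0, 1, 1, 1, 0]
print(onset_times(metrical_pattern(4), 96)[:4])   # [0.0, 0.625, 1.25, 2.5]
```

Because every chord is acoustically identical, the only cue to meter in such a pattern is the position of the rests.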
Though not a variable of interest, a different keyboard timbre (“Afro Cuban Upright Piano,” “Chorused Electric Piano,” and “Whirly”) was used in each of the three blocks to prevent potential effects of boredom. This resulted in 48 chord sequences (3 timbres × 2 modes × 2 tempi × 4 meters) lasting between 25 and 41 s that were presented in a pseudo-randomized order in three blocks. All chord sequences were created using GarageBand on a MacBook Pro and edited in Adobe Audition 3.0 for volume normalization.
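As a quick check of the design arithmetic, the sketch below enumerates the factor combinations and computes nominal sequence durations from the bar counts and tempi. It assumes that the 5/4 sequences contained 13 bars of five beats and that duration equals the number of beats times the beat period. The nominal range it yields is close to the reported 25 to 41 s; the small differences may reflect rounding or the decay of the final chord, which the text does not describe.

```python
from itertools import product

TIMBRES = ("Afro Cuban Upright Piano", "Chorused Electric Piano", "Whirly")
MODES = ("major", "minor")
TEMPI_BPM = (96, 128)
# Beats per sequence: 13 bars of 4 beats, except 5/4, assumed to have 5 beats per bar.
BEATS = {"metronome": 13 * 4, "4/4": 13 * 4, "5/4": 13 * 5, "random": 13 * 4}

# Iterating over the BEATS dict yields the four meter labels.
conditions = list(product(TIMBRES, MODES, TEMPI_BPM, BEATS))
print(len(conditions))  # 48

# Nominal duration in seconds for each meter and tempo.
durations = {
    (meter, bpm): BEATS[meter] * 60.0 / bpm
    for meter in BEATS
    for bpm in TEMPI_BPM
}
print(min(durations.values()), max(durations.values()))  # 24.375 40.625
```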
Ratings
Participants were asked to rate the pleasantness of each chord sequence on a scale from 1 (little) to 5 (a lot). During the first block, participants also rated to what extent they experienced sublimity, vitality, and unease, the three second-order factors of the Geneva Emotional Music Scale (Zentner, Grandjean, & Scherer, 2008) on a scale from 1 (little) to 5 (a lot). The following descriptors of the second-order factors were available to participants throughout the study: sublimity (feelings of wonder, transcendence, tenderness, nostalgia, and peacefulness), vitality (feelings of power and joyful activation), and unease (feelings of tension and sadness). During the first block, they were also asked to rate to what extent they felt entrained on a scale from 1 (little) to 5 (a lot) using four items from the Musical Entrainment Questionnaire (MEQ; Labbé & Grandjean, 2014). The MEQ is a 12-item questionnaire that assesses listeners’ self-reported subjective experiences of feeling entrained to music and has a two-factor structure. The first factor, VE, is composed of items related to listeners’ impressions of bodily rhythm changes, while the second factor, ME, is composed of items related to their desire to move. However, we only used the top two items that saturated most heavily on each factor in Labbé and Grandjean’s original study. These were Items 11 (“to what extent did you feel your own body rhythms change?”) and 12 (“to what extent did you feel your own body resonate with the music?”) for VE, and Items 2 (“to what extent did you feel like dancing?”) and 5 (“to what extent did you feel like moving?”) for ME. In the last two blocks, the number of questions was reduced and participants only rated the pleasantness of the tracks.
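The item-to-factor mapping can be expressed as a short scoring sketch. The text does not say whether the two items per factor were averaged or summed, so the averaging below is an assumption, as are the function name and the example ratings.

```python
from statistics import mean

# Item numbers follow the text; averaging the two items is an assumption.
VE_ITEMS = (11, 12)  # body rhythms change, body resonates with the music
ME_ITEMS = (2, 5)    # feel like dancing, feel like moving


def meq_scores(ratings: dict) -> dict:
    """Return VE and ME scores from a dict mapping MEQ item number to a 1-5 rating."""
    for item, value in ratings.items():
        if not 1 <= value <= 5:
            raise ValueError(f"item {item}: rating {value} is outside the 1-5 scale")
    return {
        "VE": mean(ratings[i] for i in VE_ITEMS),
        "ME": mean(ratings[i] for i in ME_ITEMS),
    }


# Hypothetical ratings for a single trial.
print(meq_scores({2: 3, 5: 4, 11: 1, 12: 2}))  # {'VE': 1.5, 'ME': 3.5}
```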
Procedure
Upon arrival at the laboratory, participants were given a consent form with a description of the objectives of the study. They were then fitted with three disposable electrocardiogram (ECG) electrodes, one over each clavicle and one above the left hip, and a nasal cannula to measure respiration. Participants then completed a custom-made music preferences questionnaire used by our group (see Supplemental Material) and the Edinburgh Handedness Inventory (Oldfield, 1971). Forty percent of participants agreed or strongly agreed with the statement that they know how to sing and 50% agreed that they know how to play one or more instruments. Ninety-five percent of participants reported that they never, seldom, or only sometimes played an instrument or sang in a professional capacity and 75% reported that they never, seldom, or only sometimes played or sang as a hobby. A trial session programmed in E-Prime 2 (Psychology Software Tools Inc., Pittsburgh, Pennsylvania, USA) followed, during which they were given the opportunity to ask questions about the task and set the volume of the speakers (ESCOM 80 watt) to a comfortable level. On-screen instructions differentiated between perceived and felt emotions while explaining that this study was only concerned with the latter. Participants were advised to find a comfortable position and sit still during stimulus playback to avoid movement artifacts. A further set of instructions described the reproduction task that they were asked to perform after listening to every chord sequence, which consisted of tapping what they had just heard as best they could using the keyboard’s “Ctrl” key. This ensured that participants would remain attentive throughout the study. Tapping data were not used in the analyses. Thereafter, participants reported their experiences using the entrainment and affect rating scales, which were collected with the numeric keypad. Preparation lasted approximately 15 min and the study itself 45 min.
Data acquisition and data analysis
ECG and respiration measures were recorded using BIOPAC (MP150, BIOPAC Systems Inc., Santa Barbara, California, USA) and analyzed using AcqKnowledge (version 4.1, BIOPAC Systems, Inc., Goleta, California, USA) to compute HR and RR. Trials were visually inspected for potential artifacts; this resulted in a loss of 1.04% of all respiration data. In accordance with BIOPAC guidelines, all respiration signals were bandpass filtered between 0.05 and 1 Hz. The RR of 15 participants was then automatically computed with the program’s “find rate” function using a positive peak detection threshold varying between −0.4 and −0.1 cm H2O depending on the participant. However, in five cases, the variability of the signal was such that the rate had to be found by manually placing respiration cycle markers. Ten ECGs were bandpass filtered between 0.8 and 50 Hz to reduce the level of noise in the signal. HR was then computed using the find rate function. Means and standard deviations of HR and RR were extracted over a 15-s period beginning 5 s after stimulus onset and ending 20 s after stimulus onset, which is similar to the approach used by Khalfa et al. (2008). Z scores of HR and RR were then calculated according to each participant’s mean HR and RR throughout the experiment, and values beyond ±2 were excluded from analyses. This resulted in 2.6% and 1.46% of the HR and RR data being excluded, respectively. The results of the electroencephalography (EEG) and the electromyography (EMG) data are not presented due to an unforeseen design confound.
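The windowing and outlier-exclusion steps can be illustrated with a minimal Python sketch. The original processing was done in AcqKnowledge, so this is not the authors' implementation. It assumes that z scores use the sample standard deviation of a participant's own trial means and that "±2" refers to absolute z scores greater than 2; the function names and data are hypothetical.

```python
from statistics import mean, stdev


def window_mean(times_s, values, start_s=5.0, end_s=20.0):
    """Mean of the samples whose time stamps fall within [start_s, end_s)."""
    inside = [v for t, v in zip(times_s, values) if start_s <= t < end_s]
    return mean(inside)


def exclude_outliers(trial_means, z_limit=2.0):
    """Z-score trial means against the participant's own mean and SD.

    Values with |z| > z_limit are replaced by None (treated as missing).
    """
    m, sd = mean(trial_means), stdev(trial_means)
    if sd == 0:
        return list(trial_means)
    return [v if abs((v - m) / sd) <= z_limit else None for v in trial_means]


# Hypothetical per-trial HR means (bpm) for one participant.
hr_trials = [70, 71, 69, 72, 70, 71, 69, 70, 95]
print(exclude_outliers(hr_trials))  # [70, 71, 69, 72, 70, 71, 69, 70, None]
```

Standardizing within participant removes between-person differences in resting HR and RR before trials are compared.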
Statistical analyses were performed using R version 3.3.2 in RStudio version 1.0.136 for Windows. Because they allow for the definition of both fixed and random effects, we fit HR and RR to linear mixed effects models (“lmer” from the lme4 package). However, since participants’ self-reported affect ratings were not normally distributed, we fit them to generalized linear mixed effects models (“glmer”) with a gamma distribution instead. The ratings were skewed because a greater number of observations were concentrated at the low end of the rating scales, which is typical of emotion ratings in psychology. To calculate the interaction and main effects, we used a likelihood-ratio test approach by performing chi-square tests between a model with and without the interaction or variable of interest. Interactions were analyzed using “testInteractions” from the phia package.
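The chi-square step of a likelihood-ratio test can be sketched in Python. The models themselves were fit in R with lme4, so this sketch does not fit any model: it only converts the log-likelihoods of two nested models into a test statistic and p value. The log-likelihood values are hypothetical, and the chi-square tail probability is computed with a standard recurrence for integer degrees of freedom rather than with a statistics library.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1) with y = x / 2,
    starting from Q(1/2, y) = erfc(sqrt(y)) or Q(1, y) = exp(-y).
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2:  # odd degrees of freedom
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:       # even degrees of freedom
        a, q = 1.0, math.exp(-y)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced: float, loglik_full: float, df_diff: int):
    """Return (chi-square statistic, p value) for two nested models."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df_diff)


# Hypothetical log-likelihoods for models without and with one extra parameter.
stat, p = likelihood_ratio_test(-512.30, -510.33, 1)
print(round(stat, 2), round(p, 3))
```

Applied to statistics reported in the Results, `chi2_sf(3.94, 1)` gives approximately .047 and `chi2_sf(5.5, 3)` approximately .139, in line with the p values given there.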
Results
Subjective ratings
Before testing the effects of our manipulations on subjective ratings, we performed a Friedman analysis of variance on the pleasantness ratings to check that there was no preference for one timbre over any of the others. Since no significant differences were found, χ2(2) = 0.57, p = .750, timbre was not included as a variable in any subsequent analysis.
Subjective entrainment
To investigate the relationship of tempo and the physiological measures with subjective entrainment, particularly VE, we fit a linear mixed effects model to each of the two MEQ factor scores, defining meter (metronome, 4/4, 5/4, random) and tempo (fast, slow) as categorical factors, and HR and RR as continuous predictors. We also included the interactions between tempo and HR, and tempo and RR, and used order of presentation and participant as random factors. Concerning ME, we found a significant effect of tempo, χ2(1) = 24.24, p < .001, such that fast tempo predicted higher ME scores. No other effects were significant. Concerning VE, we found a significant effect of meter, χ2(3) = 15.57, p = .002 (see Figure 2), such that participants reported feeling their internal rhythms change more during the metronome than during the random, χ2(1) = 13.38, p = .002, and the 5/4, χ2(1) = 8.16, p = .021, conditions. There was also a main effect of HR, χ2(1) = 3.94, p = .047, such that VE ratings were significantly higher as HR became slower.

Figure 2. Effect of meter on visceral entrainment (vertical bars denote 95% confidence intervals).
Affect ratings
To test the effects of our manipulations and subjective entrainment on ratings of pleasantness, vitality, sublimity, and unease, we defined four separate generalized linear mixed effects models using mode (major, minor), tempo, and meter as categorical factors, and ME and VE as continuous predictors. We also included the interaction between mode and tempo and used the order of presentation and participant as random factors. We corrected the p values for the number of rating scales using the Bonferroni correction.
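The correction can be illustrated with a short sketch, assuming it was applied by multiplying each p value by the number of rating scales (four) and capping the result at 1. The text does not spell out the exact procedure, and the function names are hypothetical.

```python
import math


def chi2_p_df1(chi2: float) -> float:
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni(p: float, n_tests: int) -> float:
    """Multiply p by the number of tests and cap the result at 1."""
    return min(1.0, p * n_tests)


N_SCALES = 4  # pleasantness, vitality, sublimity, unease

p_raw = chi2_p_df1(6.67)  # chi-square value reported for mode in the unease model
p_adj = bonferroni(p_raw, N_SCALES)
print(round(p_raw, 4), round(p_adj, 3))  # 0.0098 0.039
```

Under this assumption, the adjusted value agrees with the p = .039 reported for the effect of mode on unease.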
Pleasantness
There was a significant effect of meter, χ2(3) = 25.19, p < .001 (see Figure 3), such that the 4/4 condition was rated as being more pleasant than both the random, χ2(1) = 8.77, p = .012, and the metronome, χ2(1) = 21.11, p < .001, conditions, and the 5/4 condition was rated as more pleasant than the metronome condition, χ2(1) = 13.64, p = .001. We also found significant effects of ME, χ2(1) = 53.19, p < .001, and VE, χ2(1) = 14.2, p < .001, such that higher MEQ scores positively predicted pleasantness ratings.

Figure 3. Effect of meter on pleasantness (vertical bars denote 95% confidence intervals).
Vitality
We found significant effects of tempo, χ2(1) = 14.59, p = .001; mode, χ2(1) = 15.63, p < .001; ME, χ2(1) = 40.59, p < .001; and VE, χ2(1) = 18.08, p < .001, such that fast stimuli in a major mode and higher MEQ scores predicted greater feelings of vitality.
Sublimity
We found a significant effect of meter, χ2(3) = 12.47, p = .024 (see Figure 4), such that ratings were significantly greater in the 5/4 condition compared to the metronome condition, χ2(1) = 12.27, p = .003. We also found significant effects of ME, χ2(1) = 17.28, p < .001, and VE, χ2(1) = 12.03, p = .002, such that higher MEQ scores predicted greater feelings of sublimity.

Figure 4. Effect of meter on sublimity (vertical bars denote 95% confidence intervals).
Unease
We found significant effects of mode, χ2(1) = 6.67, p = .039; ME, χ2(1) = 38.51, p < .001; and VE, χ2(1) = 8.03, p = .018, such that stimuli in a minor mode and lower MEQ scores predicted greater feelings of unease.
RR and HR
To test whether participants’ autonomic physiological rhythms adapted to the tempo of the music and to explore the effect of mode on HR and RR, we ran two separate linear mixed effects models using tempo, mode, meter, and the interaction between tempo and mode as categorical factors, and order of presentation and participant as random factors. Concerning HR, there was a main effect of mode, χ2(1) = 4.15, p = .042, such that HR was significantly faster during chord sequences in a major mode. The effects of tempo, χ2(1) = 3.14, p = .077, and meter, χ2(3) = 5.5, p = .139, were nonsignificant. Concerning RR, we found a main effect of meter, χ2(3) = 19.92, p < .001 (see Figure 5), such that respiration was significantly faster during random stimuli compared to stimuli with a 5/4, χ2(1) = 10.9, p = .005, and 4/4 meter, χ2(1) = 18.11, p < .001. There was no significant effect of tempo, χ2(1) = 1.64, p = .200, or mode, χ2(1) = 0.03, p = .870.

Figure 5. Effect of meter on respiration rate (vertical bars denote 95% confidence intervals).
Discussion
In the present study, we sought to test the influence of musical structural features and entrainment on affective experiences by embedding major and minor chords in fast and slow, isochronous, metrical, and random sequences. By recording HR, RR, and self-reported measures, we tested whether autonomic physiological and subjective entrainment could be used to predict positive and negative, high and low arousal emotions.
We predicted that HR and RR would entrain to the tempo of the chord sequences by speeding up during fast and slowing down during slow stimuli (H1). Descriptively, both HR and RR appeared to be relatively faster in the fast and slower in the slow conditions, but neither effect was statistically significant. This could be due to the tempo difference between 96 and 128 bpm being too subtle to induce noticeable changes or two levels of tempo being insufficient for the effect to be observed. However, we found a main effect of mode such that HR was significantly faster during major chord sequences. The same relationship between mode and HR has been observed for instrumental music excerpts (Gomez & Danuser, 2007), as has greater HR deceleration during “negative” compared to “positive” music (Witvliet & Vrana, 2007). Taken together, these results suggest that HR might also be a physiological marker of valence, as others have suggested (Koelsch & Jäncke, 2015). Supporting evidence comes from studies contrasting listeners’ emotional responses to consonant (pleasant) and dissonant (unpleasant) music, which have found a greater decrease in HR during unpleasant excerpts (Orini et al., 2010; Sammler, Grigutsch, Fritz, & Koelsch, 2007). As for RR, we found that breathing was significantly faster during chord sequences without a regular meter, that is, the random conditions. Since random sequences were unpredictable by nature, they might have been experienced as more interesting, thereby inducing more excitement or tension and relatively faster RR. The anticipated tapping task, which was harder in the random condition, could explain this too. However, there was no effect of meter in the unease model, which included the tension dimension. Alternatively, Orini et al. (2010) found a reduction in RR during pleasant compared to unpleasant music and, since the 4/4 condition was rated as more pleasant, differences in RR could also be interpreted in this way.
Concerning self-reported feelings of entrainment, we predicted that physiological changes would influence VE but not ME ratings (H2) and this is what we observed between HR and VE. Since “visceral” entrainment is a subjective measure of the sensation that one’s internal rhythms are changing due to the music, it is expected to correlate with objective physiological changes. Why the relationship was a negative one is less clear. It could be that the nature of the setup (i.e., sitting down in a comfortable chair) made it easier for participants to relax and notice their internal rhythms slow down rather than speed up. Alternatively, if slower HR is in fact a good marker of negative valence, then the feeling that one’s internal rhythms were changing could have been related to internal discomfort or a “tense arousal” state (Thayer, 1990). This is an interesting possibility as VE ratings were significantly greater in the metronome condition, which happened to have some of the lowest pleasantness ratings. On the other hand, the repetitiveness of the metronome condition could have also been more conducive to becoming absorbed by or resonating with the music, emulating trance-like states. In fact, Hove et al. (2016) successfully used rhythmic isochronous drumming sequences to induce states of trance in shamanic practitioners. Compared to drumming sequences with more irregular timing, the isochronous drumming sequences led to more self-reported experiences of trance and less connectivity in auditory pathways. This reduction in connectivity suggests a dampening or suppression of repetitive sound processing, which may have happened with participants in the metronome condition. This would be consistent with our results as VE ratings were the highest in the metronome and lowest in the random conditions.
Conversely, while not affected by the meter manipulation, ME ratings were significantly greater during fast compared to slow sequences, meaning that participants felt like moving more to a fast tempo. This is consistent with the finding that faster songs have higher “groove” ratings, which is also related to wanting to move to the beat (Janata et al., 2012).
As predicted, both structural features and entrainment had significant effects on felt affect and pleasantness ratings. Consistent with results that have found links between mode and valence as well as tempo and arousal (Gomez & Danuser, 2007; Hunter et al., 2010; Larsen & Stastny, 2011; McConnell & Shore, 2011), participants reported feeling more vitality (power and joyful activation), which is composed of positive high arousal emotions, during fast major chord sequences (H3), and more unease (tension and sadness), which is composed of negative emotions, during minor chord sequences (H5). The fact that unease combines both tension and sadness, which are high and low arousal emotions, respectively, might be why we failed to find tempo effects in this model. Curiously, there were no effects of mode or tempo on sublimity ratings (wonder, transcendence, tenderness, nostalgia, and peacefulness), which is largely composed of positive low arousal emotions and was expected to be higher for slow major chord sequences (H4). That being said, these distinctions are not so clear cut as nostalgia can be considered to be a bitter-sweet emotion composed of both negative and positive elements (Barrett et al., 2010). Similarly, in a study that used the Geneva Emotional Music Scale as well as valence and arousal ratings, the first-order factors of joyful activation, power, and wonder were found to cluster with high arousal emotions (Trost, Ethofer, Zentner, & Vuilleumier, 2012). Sublimity might therefore be better described as an amalgam of (mostly) positive emotions in the same way that unease is composed of negative emotions. Concerning the link between subjective entrainment and affect ratings, we found that both MEQ factors significantly predicted positive and negative experiences according to their valence. In other words, both MEQ factors positively predicted vitality and sublimity and negatively predicted unease ratings. 
Both ME and VE also positively predicted pleasantness ratings, suggesting that the more pleasant the stimulus, the more entrained participants feel, which is consistent with past results (see Trost et al., 2014). Concerning the effects of structural features on pleasantness, we did not find a statistically significant effect of mode or tempo on pleasantness ratings. However, there was a significant effect of meter, such that participants found the metrical sequences (i.e., 4/4 and 5/4) to be the most pleasant and the metronome sequences the least pleasant. This is likely due to a familiarity effect (van den Bosch, Salimpoor, & Zatorre, 2013; Zajonc, 1968), as most Western music is metrically organized and both the random and isochronous stimuli are unusual. It should be noted, however, that these effects are relative, as the overall pleasantness rating was low (M = 1.8/5).
Limitations
Concerning the limitations of the present study, the sample size may have been too small to detect changes in physiological rhythms like HR. Shoda et al. (2016), for instance, showed effects of tempo on HR with a sample size of N = 37 and Gomez and Danuser (2004, 2007) with N = 31. Another limitation is that the sound level was not fixed, but rather chosen by the participants themselves. While all participants set the sound level to a preferred level, which then did not change, we cannot rule out that the presence or absence of effects resulted from differences in loudness, especially along the arousal dimension. Finally, while the tapping task was employed to ensure that participants remained attentive, it may have inadvertently induced more stress during random sequences, which were harder to reproduce. This may account for the faster RR during the random sequences. Future studies wishing to use a similar method to ensure that listeners remain focused could instead use a beat-tapping task during the trials themselves. If participants are well instructed during training trials, movement artifacts could be reduced to a minimum and the study would benefit from being shorter.
Conclusion
Using very basic stimuli in the form of fast and slow sequences of major and minor chords, our study attempted to determine whether it was possible to induce entrainment and emotion in listeners. As predicted, major sequences induced positive emotions, while minor sequences induced negative emotions, demonstrating that emotions can be induced and not just perceived with these simple stimuli. In addition, fast major chord sequences specifically induced more high arousal positive emotions, showing that it is not just valence that is manipulated. However, contrary to our predictions, we found no effects of tempo on HR or RR, which suggests that the effects of tempo are restricted to subjective arousal. Instead, HR was significantly faster during major sequences, which we attribute to an effect of valence. This suggests that HR may be a good physiological marker for valence in emotion induction studies using music. Per our predictions, VE, which is related to the sensation of internal rhythms changing, but not ME, which is related to the urge to move to the beat, was influenced by physiological changes. This negative correlation between VE and HR is important because it shows that this subjective measure of the entrainment experience is in fact related to objective physiological changes. Future studies could therefore examine correlations between VE ratings and other physiological measures using more complex stimuli than chord sequences. Concerning the ME factor, tempo had a significant effect on ME but not VE ratings, with faster sequences leading participants to want to move more. Since tempo has been shown to influence positive high arousal emotions in induction studies (Hunter et al., 2010; Kellaris & Kent, 1993; Webster & Weir, 2005) and ME positively predicted these emotions, we believe the influence of tempo is at least partly mediated by feelings of entrainment.
Overall, these results show that it is possible to induce positive and negative experiences and feelings of entrainment using simple major and minor chord sequences that vary according to tempo and meter. Accordingly, stimuli like ours could serve as a basis for manipulating further variables suspected of influencing entrainment and emotion, such as structural (e.g., pitch, rhythm) or performance features (e.g., quantization, expressivity).
Supplemental Material
Supplemental material, Supplemental_material for Affective experiences to chords are modulated by mode, meter, tempo, and subjective entrainment by Carolina Labbé, Wiebke Trost and Didier Grandjean in Psychology of Music
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is part of the SIEMPRE (Social Interaction and Entrainment using Music PeRformance Experimentation) project, which was supported by the Future and Emerging Technologies program within the Seventh Framework for Research of the European Commission (FET-Open Grant Number 250026-2).
Supplemental material
Supplemental material for this article is available online.
