Abstract
It is well recognised in psychology that music has affective connotations and that musical stimuli can modify affective states. The aim of this study was to assess the affective connotations of 120 fifteen-second musical excerpts, covering both modern musical genres such as pop, rock, jazz, rap/R&B and electronic music (5 x N = 20), and classical music (N = 20). Expert judges used predetermined criteria to select excerpts with positive or negative valence that induced high arousal or low arousal. The excerpts were assessed by 50 undergraduate students (25 women) from different academic departments, aged between 18 and 28 years (M = 21.46 years, SD = 1.85). They listened to all 120 fragments and rated them with respect to six dimensions: valence, arousal, dominance, origin, subjective significance and imageability. Analyses showed that ratings were reliable, with high split-half correlations and Cronbach’s alpha estimates. We did not identify any gender differences concerning affective reactions to the music. Some music genre specificity was found for all measures, and initial music preference appeared to shape affective ratings. The results presented here will be of interest to researchers working on musical perception and the influence of music on affective outcomes and emotional regulation.
Musical stimuli have been perceived as sources of emotional experience since at least the era of Ancient Greece (Juslin, 2009). Music consists of auditory stimuli created by humans for listening to on various occasions (Greasley & Lamont, 2011; Juslin, Liljeström, Laukka, Västfjäll, & Lundqvist, 2011; Juslin & Västfjäll, 2008). It is therefore important for psychology to understand the mechanisms underlying the induction of emotion by music (Bigand, Vieillard, Madurell, Marozeau, & Dacquet, 2005; Eerola & Vuoskoski, 2011; Schubert, 1999). Standard research tools are a prerequisite for this endeavour (Paquette, Peretz, & Belin, 2013). The aim of the current study was to introduce a bank of 120 excerpts of everyday music from various genres (classical, pop, jazz, rock, rap/R&B and electronic) with known affective connotations. We hope that, as in the case of images (Lang, Bradley, & Cuthbert, 2008) and words (Bradley & Lang, 1999a), establishing affective norms for a set of stimuli will result in more experiments on how music evokes emotions.
Emotional responses to music
The extant literature on emotions in music is based on two different approaches. One strand of research (e.g., Gabrielsson, 2002) treats the perception of emotion in music as a form of cognitive processing influenced by musical expertise (Kreutz, Ott, Teichmann, Osawa, & Vaitl, 2008). The second strand of research is of more interest to researchers in the field of emotion and relates to the ability of music to evoke emotions (Juslin, 2009), that is, how the subjective perception of psychophysiological changes is related to emotions (Gabrielsson, 2002). This study was designed to identify musical stimuli that evoke emotional reactions in listeners and to determine the nature of these emotional responses.
In the science of emotions there are two general models of emotional processing (Kagan, 2007). One posits that individual emotions are processed by specific mechanisms (e.g., Panksepp, 1998); emotions such as sadness, happiness, anger and many others can be identified and the subjective experiences associated with them can be measured. The second approach assumes that the subjective experience of emotion is the result of processing, or affective qualities that are inaccessible to the perceiver (e.g., Jarymowicz & Imbir, 2015; Kagan, 2007; Russell, 2003). This assumption is crucial to the search for dimensions such as valence and arousal (the dimensional model; Russell, 2003) or valence, origin and source (the emotion duality model; Imbir, 2016a; Jarymowicz & Imbir, 2015). This dimensional approach can provide a basis for creating affective norms for particular stimulus materials (e.g., Bradley & Lang, 1999a, 1999b; Imbir, 2015, 2016b; Lepping, Atchley, & Savage, 2015; Warriner, Kuperman, & Brysbaert, 2013). The advantage of this approach is that carefully controlled experiments can be used to investigate the processes underlying specific emotional dimensions. One can take a dimension of interest (for example, arousal) and then search existing datasets for materials that differ in their arousal-inducing properties but are matched on other emotionally relevant variables such as valence, dominance and imageability.
Music can induce emotions in individuals in different ways, including both physical and cognitive aspects of music, that allow us to predict different emotional responses to a given musical stimulus (Juslin, Harmat, & Eerola, 2014). The emotional response to music may also be modulated by lyrics (e.g., Fiveash & Luck, 2016; Mori & Iwanaga, 2014; Vuoskoski & Eerola, 2013); most contemporary music composed and used on an everyday basis contains lyrics. Lyrics provide extra-musical information that influences the interpretation of the music (e.g., Thompson, Russo, & Quinto, 2008). Even the perception of purely instrumental music seems to be mediated by narratives relating to the feelings it evokes (Tan & Kelly, 2004) that could influence the final emotional state induced by the music. Given the potential impact of lyrics on the perception of the associated music, we chose modern songs performed in foreign languages (usually English) and recruited participants with an intermediate level of English language skills.
Music as a research method for emotion elicitation
There are several methods available for manipulating affective states when studying the psychology of emotions. The most commonly used are emotional pictures (e.g., Lang et al., 2008), words (e.g., Bradley & Lang, 1999a; Imbir, 2015; Warriner et al., 2013), short texts (e.g., Imbir, 2016b) or sounds (Bradley & Lang, 1999b). It is thought that simply watching, listening to, or thinking about these stimuli is sufficient to evoke an affective state in humans. Musical stimuli have not been extensively studied in the field of affective sciences, but have been explored by researchers interested in the psychology of music (Bigand et al., 2005; Eerola & Vuoskoski, 2011; Lepping et al., 2015; Paquette et al., 2013; Schubert, 1999). Music is a type of stimulus created by humans in many cases to influence the affective state of listeners, so musical stimuli have considerable potential as a research tool for the affective sciences. A few music psychology studies have measured the affective response induced by listening to music of various genres (e.g., Gregory & Varney, 1996; Paquette et al., 2013; Robazza, Macaluso, & d’Urso, 1994), but most research has focused on classical music (e.g., De Vries, 1991; Kreutz et al., 2008). Some studies have investigated emotions generated by listening to genres such as jazz (e.g., Wallach & Greenberg, 1960), pop (North & Hargreaves, 1997), or Indian ragas (Balkwill & Thompson, 1999; Gregory & Varney, 1996; Gupta & Gupta, 2005). Kreutz et al. (2008) have suggested that classical music carries a greater emotional charge than modern music which has other aims such as entertainment, or simply filling a silence.
We decided to choose musical excerpts from several different modern and classical genres and compare them using the affective norms studies methodology (e.g., Bradley & Lang, 1999a; Imbir, 2015; Lang, 1980). This decision was based on our expectation of evoking emotions in different types of experiments that engage participants with different music preferences. This selection also enables researchers to compare the mechanisms by which different genres of music elicit affective responses. We also wanted to provide data that could be used to compare the affective responses elicited by classical and modern instrumental music; we therefore included electronic musical excerpts in our set of experimental stimuli.
Dimensions of affective responses to music
Six dimensions of affective responses to music were measured. Three of these dimensions (valence, arousal and dominance) have a long history in music psychology, having originally been introduced by Osgood, Suci and Tannenbaum (1957). These dimensions are used to evaluate the responses elicited by emotionally charged materials in most studies designed to establish affective norms. We also measured two dimensions (origin and subjective significance) connected with the recently introduced emotion duality model (Imbir, 2016a; Jarymowicz & Imbir, 2015), which distinguishes between emotional states on the basis of their automatic or reflective origin. Finally, we also used an imageability dimension to measure specific music qualities, such as involuntary musical imagery (Williamson & Jilka, 2014). The dimensional approach to assessing emotional responses (cf. Schubert, 1999) differs from the categorical approach, which considers discrete emotions. An example of the categorical approach is the study by Vieillard et al. (2008), which validated a set of 56 musical excerpts that conveyed one of four emotions: happiness, sadness, threat or peacefulness. The categorical approach is useful for investigating the elicitation of discrete emotions, but is not comparable to the dimensional approach; nevertheless there have been attempts to combine the two approaches (Stevenson & James, 2008) for an International Affective Digitised Sounds stimuli set (Bradley & Lang, 1999b).
Valence, the intuitive and easily accessible feeling that something negative or positive is happening now, is the dimension most naturally used to describe affective states (Kagan, 2007). Osgood et al. (1957) found this dimension to be the most crucial for their semantic differential scale. Valence ratings have remained the most reliable and replicable aspect of evaluations of affective responses (e.g., Imbir, 2015; Kousta, Vinson, & Vigliocco, 2009).
Arousal is the energetic aspect of emotion. Osgood et al. (1957) found that arousal was the second most important component of variance in semantic differentials; subsequently, the concept of arousal was incorporated into research on the psychology of emotion and mood. Russell (2003) argued, for example, that core affect, the simplest affective state from which more complex emotions are constructed, corresponds to a combination of valence and arousal. Arousal is not dependent on language; it is an immediate response to the appearance of an arousing stimulus (Imbir, 2015).
Dominance is the extent to which the sensations and affective reactions evoked by a stimulus are controllable. Some sensations are uncontrollable and dominate human behavioural responses, for example making an individual cry, feel fear or shout at another person. However, many other emotional experiences are controllable; that is, the individual is confident that he or she can cope with or regulate them. The dominance dimension was identified from a factor analysis of semantic differential data (Osgood et al., 1957); however, many studies of normative affective responses have shown that dominance is less reliable than other emotional dimensions, or that it is closely related to valence. For example, Polish normative studies on emotional responses to words (Imbir, 2015) and short texts (Imbir, 2016b), and research by Bradley and Lang (1999a) indicated that dominance was highly positively correlated with valence and suggested that positive feelings are perceived as controllable, whereas negative feelings are regarded as uncontrollable.
The Origin of Emotional Experience Scale was developed to measure the duality of emotion dimension (Jarymowicz & Imbir, 2015). The emotion duality model is related to the broad family of dual processing theories of cognition (cf. Gawronski & Creighton, 2013), which posit that cognition comprises two processes or systems: (a) associative, automatic, uncontrolled and effortless processes and (b) reflective, controlled and effortful processes (e.g., Epstein, 2003; Kahneman, 2003, 2011), both of which contribute to everyday behaviour and decision-making. It is said that there are two types of emotions and affective states. The first are automatic in origin, are based on the biological value of the relevant stimulus (Damasio, 2010), appear immediately after the stimulus, and are enhanced by arousal, an activation mechanism specific to this mental system (cf. Epstein, 2003). The second are reflective in origin, derived from a standards-based cognitive appraisal of the stimulus (Reykowski, 1989) or propositional thinking (Strack & Deutsch, 2004), appear after inspection of a situation, and create a mental model of the environment that is used to compare it with expectations. Affective states based on effortful processing are enhanced by subjective significance, an activation mechanism specific to the reflective emotional system.
Subjective significance was the second dimension introduced into the emotion duality model as the reflective system’s equivalent of arousal (Imbir, 2016a). Psychological research has demonstrated that the default mode of cognitive processing is rather heuristic and automatic (e.g., Epstein, 2003; Gawronski & Creighton, 2013; Kahneman, 2003, 2011). Nevertheless, individuals do sometimes engage in difficult thinking. The authors of the emotion duality model asked themselves why they do so, and what might energise this kind of systematic and effortful reflective processing. It appears that the answer is simple: individuals engage in effortful processing because they want to, and believe it is worthwhile (Imbir, 2015); in other words, experiences that provoke this kind of processing are subjectively significant and they have a very important influence on the individual’s goals, plans and expectations.
Imageability is the final dimension we assessed. We developed an imageability scale to assess the ease with which a given musical stimulus evokes visual imagery. It is known that listening to music can evoke involuntary musical imagery (e.g., Williamson & Jilka, 2014). This is a ubiquitous cognitive phenomenon concerning visual and motor projections in response to music. Images evoked by music may be directly due to the music or a result of the emotional response to the music but, in both cases, the imagery enhances the intensity of the emotional response (Juslin & Västfjäll, 2008). By adding this scale we hope to provide quantitative research materials useful in studies of involuntary musical imagery, allowing researchers to contrast stimuli that are hard to imagine with those that evoke vivid visual imagery.
Aim of the study
The aim of this study was to investigate perceptions of music and generate a bank of stimuli with known affective properties in non-native-English-language subjects, that could be used by all researchers interested in using music to manipulate affective states. The study was exploratory so we did not formulate specific hypotheses. We did, however, have two specific objectives. The first was to check whether, by using music excerpts as stimuli, we could find relationships between dimensions describing evoked feelings similar to those found when other types of stimuli were used, such as words (Imbir, 2015) or sentences (Imbir, 2016b). The second objective was to investigate whether affective reactions to music exhibit any genre specificity (Kreutz et al., 2008).
Method
Participants
The participants comprised 50 students (25 women) from various universities and colleges in Warsaw. Participants’ ages ranged from 18 to 28 years (M = 21.46, SD = 1.85). They participated voluntarily in exchange for a small financial reward (approximately 12 Euros). All participants had normal hearing, were frequent music listeners, and had no musical education or training (e.g., attending music schools, playing in bands, working for companies connected to music such as radio stations, television, production companies etc.). Participants’ musical preferences were measured in two ways. First, they were asked to indicate how much they liked 12 musical excerpts (listed in Appendix A in the online supplementary material) using a seven-point Likert scale where 1 represented I don’t like it at all and 7 represented I like it very much. The raw data are presented in Appendix A. These data were used in analyses concerning the impact of music preference on affective reactions to music (Results section).
Materials
Musical excerpts
Expert judges (MG, KI) selected the musical excerpts to be evaluated. It was intended that the experimental set of stimuli would cover six broad musical genres: rock, pop, electronic music, jazz, rap/rhythm and blues (R&B) and classical music. Music genre was determined by external sources (classifications made by radio stations, music production companies and playlist users on the internet). Musical features such as texture, tempo, style or mode were not controlled while selecting music. Twenty musical excerpts from each genre were chosen in order to ensure that the dataset included excerpts expected to elicit negative and positive affective reactions, and would be associated with low and high levels of arousal. The sole aim of the selection process was to provide a suitably variable set of experimental stimuli. Details of all musical excerpts are provided in Appendix B in the online supplementary materials: title and author, excerpt timing and a web link to the source. All the selected excerpts were copied to a personal computer and edited using digital sound processing software (Audacity: http://audacity.sourceforge.net/about/). After editing, the volume of all excerpts was standardised. These prepared stimuli were named U1 to U120, and were copied to a folder designated for materials relating to the measurement procedure. In preparation for the assessment phase of the study, the excerpts were randomised into 50 different sequences.
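The randomisation step described above can be illustrated with a minimal sketch. It assumes that each of the 50 sequences is an independent random permutation of the labels U1 to U120; the function name, the fixed seed and the use of Python are illustrative assumptions, not details reported for the study.

```python
import random


def make_sequences(n_excerpts=120, n_sequences=50, seed=0):
    """Return n_sequences random playback orders of the labels U1..U<n_excerpts>."""
    rng = random.Random(seed)  # fixed seed is an assumption, used only for reproducibility
    labels = [f"U{i}" for i in range(1, n_excerpts + 1)]
    sequences = []
    for _ in range(n_sequences):
        order = labels[:]      # copy so the master list stays in its original order
        rng.shuffle(order)     # in-place uniform random permutation
        sequences.append(order)
    return sequences


sequences = make_sequences()
print(sequences[0][:5])  # first five excerpts of the first sequence
```

Each sequence contains every excerpt exactly once, so a sequence can be assigned to one participant without any excerpt being repeated or omitted.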
Affective scales
Perceptions of the musical excerpts were assessed using six different scales based on Self Assessment Manikins (SAMs; Lang, 1980) for valence, arousal, dominance, origin, subjective significance and imageability (Imbir, 2015). The choice of scales was governed by the need to include (a) those most commonly used in traditional affective research, that is, valence, arousal and dominance (cf. Bradley & Lang, 1999a); (b) those central to the emotion duality model, that is, origin and subjective significance (cf. Imbir, 2015) and (c) those commonly used in the psychology of music, that is, imageability (cf. Williamson & Jilka, 2014). Figure 1 depicts the SAMs used to assess the various dimensions. The SAM scales were developed to allow respondents to describe feelings without using language. Figures were shown to participants before the study for familiarisation with each scale range. In this study, each SAM scale was preceded by a description giving an example of both poles (cf. Imbir, 2015). Descriptions and examples were included because we were asking participants to evaluate both intuitively understandable dimensions such as valence, and less intuitive dimensions such as the origin of feelings, and we wanted to be able to compare the ratings for excerpts associated with different dimensions. Perceptions of the valence, arousal and dominance of states evoked by music were assessed using the standard SAM scale (Lang, 1980) preceded by descriptions developed specifically for this study. These are presented in Table 1. Perceptions of the origin and subjective significance of affective states elicited by the excerpts were assessed using SAM scales and descriptions developed in other research (Imbir, 2015). A new SAM scale and scale description were developed to assess perceptions of imageability (Figure 1, final row, and Table 1).

Figure 1. Self Assessment Manikin scales used to assess emotional states elicited by listening to the music.
Table 1. Descriptions of Self Assessment Manikin scales.
Note. The first five scales are the same as used in Imbir (2015); the last was developed specifically for this study.
The participants assessed each musical excerpt using a nine-point Likert scale, where 1 represented negative/calm/being in control/from the heart/of no consequence/hard to imagine; and 9 represented positive/excited/controlling/from the mind/important/easy to imagine; and 5 represented a neutral/mixed/moderate state.
Apparatus
Stimuli were displayed using PowerPoint 2007 presentation software on a standard 15-inch laptop computer running the Windows 7 operating system and responses were collected on paper questionnaires. Musical excerpts were played through headphones that reduced external noise and provided good sound quality. All excerpts were presented via computer using a standard sound card. All analyses were carried out using the IBM SPSS 22 statistical package.
Procedure and design
Participants evaluated the excerpts individually in a quiet, stable laboratory environment. Participants were recruited via paper advertisements placed in various departments in universities in Warsaw and via internet advertisements on social sites of the same universities. Only non-professional, everyday music users were eligible to participate. Participants were given information about the aims of the research. They were told that the assessments they would be making would have important applications in music psychology and affective science because they would enable researchers to choose music with specific characteristics for future research. The importance of the work was emphasised to participants several times to ensure that they were sufficiently motivated. Before starting the assessment procedure, participants were instructed on how to indicate their evaluations and how to use the stimulus presentation apparatus, including personalising the display settings (e.g., loudness). After these preliminary instructions, participants were asked to listen to and rate their perceptions of 12 music excerpts (two per genre). This procedure also served as a training session to ensure that the participants understood the procedure. After this, the researcher explained the assessment scales and SAM pictures, and provided printed descriptions. Participants were free to ask questions about any aspect of the procedure and were assured that their data would remain confidential and would be used only for scientific purposes. After being instructed on how to use the scales, participants rated the set of excerpts on all six SAM scales. The sequence in which the excerpts were played was random and different for each participant. Participants assessed the dimensions in the following order: valence, arousal, origin, imageability, subjective significance, and dominance. Participants completed assessments of all six dimensions for one excerpt before moving on to the next excerpt. 
Participants could listen to an excerpt as many times as they wanted. The whole procedure lasted between one and two hours. After assessing all the excerpts, participants were asked to complete a questionnaire about their age, sex and number of years of education.
Results
The final scores for each musical excerpt were based on the following analytical procedure. First, data were entered into a database. Data from all participants were included in the analysis. Then the mean (M), standard deviation (SD) and range (Min and Max) for scores on all dimensions and the number of participants who had assessed the excerpt (N) were calculated. All data on valence, arousal, dominance, origin, significance, and imageability are included in the online supplementary material (Appendix B). These data were used in further analyses to determine the reliability of the assessments and the relationships between the variables investigated.
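The descriptive summary computed for each excerpt on each dimension can be sketched as follows. This is an illustration only: the function name is hypothetical, the ratings are invented, missing ratings are assumed to be coded as None, and the SD is assumed to be the sample SD (n - 1 denominator), which the text does not specify.

```python
import statistics


def describe(ratings):
    """Summarise one excerpt's ratings on one dimension; None marks a missing rating."""
    valid = [r for r in ratings if r is not None]
    return {
        "N": len(valid),                 # number of participants who rated the excerpt
        "M": statistics.mean(valid),
        "SD": statistics.stdev(valid),   # sample SD (n - 1); an assumption
        "Min": min(valid),
        "Max": max(valid),
    }


# Invented ratings on the nine-point scale, with one missing value
example = describe([7, 8, 6, None, 9, 7])
print(example)
```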
Ratings distributions
An important issue for anyone interested in using the norms presented here is the distribution of ratings, which shows how participants performed their assessments. Figure 2 presents the mean (M) for each music excerpt plotted against the standard deviation (SD) for that stimulus. As can be seen, the range for each dimension does not cover the whole nine-point scale. In each case, a quadratic function (drawn as a solid line on each plot) better described the relationship between M and SD. This is not surprising: a stimulus assessed in an extreme way collects extreme and coherent ratings, so its SD should be lower than the SD obtained for neutral/moderate stimuli. It is worth noting that, in most cases, neutral stimuli fall into two types: (a) those with low SDs, showing that the stimulus was consistently assessed as neutral/moderate, and (b) those with high SDs, suggesting that some participants rated the stimulus low on a given dimension while others rated it high, meaning that the perceived emotional reaction was incongruent.

Figure 2. Means plotted against standard deviations for all six measured dimensions.
Reliability
We used two different methods to measure the reliability of the assessments. First, we divided our dataset into two subsets based on participants’ numerical identifiers (odd or even), controlled for equal numbers of women and men in both subsets. This procedure is standard in affective norms studies (e.g., Imbir, 2015; Moors et al., 2013). This split-half procedure allowed us to calculate the correlation (Pearson’s r) between the ratings on certain scales given by odd and even numbered participants. All correlations were significant at the p < .001 level. Dominance ratings were the least reliable (r = .670) and arousal ratings the most reliable (r = .902). The second method was Cronbach’s alpha, calculated using raw data. The reliability of the scales varied from α = .802 in the case of imageability, to α = .961 related to the arousal scale. Table 2 presents data on reliability in terms of split-half correlations and Cronbach’s alpha estimation.
Table 2. Estimates of reliability based on (a) Pearson’s r correlations between split-half participants groups and (b) Cronbach’s α estimation.
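The two reliability estimates can be sketched for a single dimension as follows. Several assumptions are made here: ratings are stored as one list per participant, the split is a plain odd/even split by position (without the balancing for sex used in the study), and Cronbach’s alpha treats participants as items and excerpts as cases, which the text does not state explicitly. All names and data are illustrative.

```python
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def excerpt_means(group):
    """Mean rating per excerpt across the participants in group."""
    return [statistics.mean([p[e] for p in group]) for e in range(len(group[0]))]


def split_half_r(ratings):
    """ratings[p][e] is the rating of excerpt e by participant p."""
    odd = ratings[0::2]    # participants 1, 3, 5, ...
    even = ratings[1::2]   # participants 2, 4, 6, ...
    return pearson_r(excerpt_means(odd), excerpt_means(even))


def cronbach_alpha(ratings):
    """Alpha with participants as items and excerpts as cases (an assumption)."""
    k = len(ratings)
    item_vars = [statistics.variance(p) for p in ratings]
    totals = [sum(p[e] for p in ratings) for e in range(len(ratings[0]))]
    return k / (k - 1) * (1 - sum(item_vars) / statistics.variance(totals))


# Invented data: 4 participants x 5 excerpts on the nine-point scale
toy = [
    [2, 4, 6, 8, 5],
    [3, 4, 7, 8, 4],
    [1, 5, 6, 9, 5],
    [2, 3, 7, 7, 6],
]
r_half = split_half_r(toy)
alpha = cronbach_alpha(toy)
print(round(r_half, 3), round(alpha, 3))
```

With real data, the same functions would be applied separately to each of the six dimensions.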
Correlations between variables
Relationships between the dimensions investigated were analysed using correlations (Pearson’s r). Only high correlations (r > .35) for variables sharing more than 10% of their variance are discussed here. Valence and arousal were significantly positively correlated (r = .568, p = .001). This correlation was not the U-shaped relationship often reported where affective norms are concerned (e.g., Imbir, 2015). We confirmed this in two ways. First, we divided our dataset into two categories, (1) excerpts with negative valence (valence ratings ⩽ 5) and (2) excerpts with positive valence (valence ratings > 5), and calculated separate correlations for the two groups. Valence and arousal were positively correlated in both groups (negative valence excerpts: r = .398; positive valence excerpts: r = .467; both p < .001). Second, we conducted regression analysis including both linear and quadratic relationships. This analysis showed that the relationship between valence and arousal is better explained by a linear function y = .794x + .343: R2 = .568, F(1, 118) = 56.2, p = .001, than by a quadratic relationship: R2 = .572, F(1, 117) = 28.4, p = .001, because the R2 change due to inclusion of the quadratic term was not significant: F(1, 117) = .75, p = .4.
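The comparison of linear and quadratic fits can be illustrated with a minimal sketch of an R2-change F test. It assumes ordinary least squares fitted through the normal equations and the usual nested-model formula F = (R2 quadratic - R2 linear) / ((1 - R2 quadratic) / (n - 3)); with n = 120 excerpts this gives the 117 error degrees of freedom reported above. The function names and data are illustrative, and the analyses in the study were run in SPSS rather than with this code.

```python
def fit_poly(x, y, degree):
    """Least-squares polynomial fit; returns coefficients, lowest order first."""
    n = degree + 1
    # Normal equations A c = b for the monomial basis 1, x, x^2, ...
    A = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            f = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= f * A[col][k]
            b[row] -= f * b[col]
    # Back substitution
    coef = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - s) / A[i][i]
    return coef


def r_squared(x, y, coef):
    """Proportion of variance in y explained by the fitted polynomial."""
    my = sum(y) / len(y)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    ss_res = sum(
        (yi - sum(c * xi ** p for p, c in enumerate(coef))) ** 2
        for xi, yi in zip(x, y)
    )
    return 1 - ss_res / ss_tot


def r2_change_f(x, y):
    """Return (R2 linear, R2 quadratic, F for adding the quadratic term)."""
    r2_lin = r_squared(x, y, fit_poly(x, y, 1))
    r2_quad = r_squared(x, y, fit_poly(x, y, 2))
    df_error = len(x) - 3  # n minus three estimated parameters
    f_change = (r2_quad - r2_lin) / ((1 - r2_quad) / df_error)
    return r2_lin, r2_quad, f_change


# Invented mean valence (x) and mean arousal (y) values for seven excerpts
valence = [2, 3, 4, 5, 6, 7, 8]
arousal = [2.1, 2.8, 3.6, 4.1, 5.2, 5.7, 6.6]
r2_lin, r2_quad, f_change = r2_change_f(valence, arousal)
print(round(r2_lin, 3), round(r2_quad, 3), round(f_change, 2))
```

A small F for the change indicates that the quadratic term adds little beyond the linear fit, which is the pattern described for valence and arousal.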
We also found that dominance was highly correlated with valence (r = .764, p < .001) and arousal (r = .739, p < .001). There were also high correlations between imageability and origin (r = .68, p < .001), imageability and significance (r = .887, p < .001) and origin and significance (r = .775, p < .001). Table 3 presents pairwise correlation coefficients for all dimensions. To investigate the nature of all relationships, additional linear and quadratic regression analyses were conducted. The shaded cells in Table 3 show relationships better explained by quadratic functions. A description of the results and figures illustrating the relationships found are provided in Appendix C in the online supplementary material.
Table 3. Pairwise correlations between all six dimensions.
Note. Shaded cells indicate that the relationship between dimensions is better explained (significant R2 change) by a quadratic function.
* p < .05; ** p < .005.
The effects of sex and genre of music on affective reactions to music
In order to check if there were any gender differences and genre-specific differences, we conducted a mixed-design ANOVA with genre (6 levels, repeated measure) x specific excerpt (20 levels, repeated measure) x sex (2 levels, between-subjects measure) as factors. Since the main effect of specific excerpts is obvious, reports of this will be omitted. For the valence scale, we found a significant main effect of music genre, F(5, 240) = 4.99, p = .001, η2 = .094, no significant main effect of sex, F(1, 48) = .3, p = .6, and no significant interaction of both, F(5, 240) = 1.44, p = .21. Post-hoc analysis for a genre effect, using the Bonferroni correction for multiple comparisons, showed that valence assessments were lower for rap/R&B music (M = 4.82, SEM = .12) compared to jazz (M = 5.31, SEM = .13, p = .011), pop (M = 5.49, SEM = .15, p = .001) and rock music (M = 5.32, SEM = .15, p = .003).
For the arousal scale, we found a significant main effect of music genre, F(5, 240) = 30.83, p = .001, η2 = .391, no significant main effect of sex, F(1, 48) = .09, p = .8, and no significant interaction, F(5, 240) = 1.72, p = .13. Post-hoc analysis for a genre effect, using the Bonferroni correction, showed that arousal assessments were lower for jazz (M = 3.56, SEM = .12) than for all other genres: electronic music (M = 4.98, SEM = .13, p = .001), rap/R&B (M = 4.67, SEM = .14, p = .001), classical music (M = 4.19, SEM = .15, p = .001), pop (M = 4.64, SEM = .14, p = .001) and rock music (M = 5.04, SEM = .12, p = .001). Furthermore, classical music was less arousing than electronic (p = .001) and rock music (p = .001), while rap/R&B appeared to be less arousing than rock music (p = .045).
For the dominance scale, we found a significant main effect of music genre, F(5, 240) = 4.01, p = .002, η2 = .077, no significant main effect of sex F(1, 48) = 1.88, p = .18, and no significant interaction, F(5, 240) = 1.09, p = .37. Post-hoc analysis for a genre effect, using the Bonferroni correction, showed that dominance assessments were lower for electronic (M = 4.76, SEM = .18) compared to rap/R&B music (M = 5.22, SEM = .16, p = .021); jazz (M = 4.63, SEM = .16) compared to rap/R&B (p = .005); and jazz compared to rock music (M = 5.16, SEM = .14, p = .014).
For the origin scale, we found a significant main effect of music genre, F(5, 240) = 16.55, p = .001, η2 = .256, no significant main effect of sex, F(1, 48) = 2.962, p = .092, and no significant interaction of both, F(5, 240) = .6, p = .7. Post-hoc analysis for a genre effect, using the Bonferroni correction, showed that origin assessments were higher (more reflective) for electronic music (M = 5.02, SEM = .14) compared to most other genres: jazz (M = 4.18, SEM = .13, p = .001), classical (M = 4.09, SEM = .21, p = .002), pop (M = 3.73, SEM = .13, p = .001) and rock music (M = 4.09, SEM = .12, p = .001). Rap/R&B (M = 4.78, SEM = .13), appeared to be perceived as evoking more reflectively originated experiences than jazz (p = .009), pop (p = .001) and rock (p = .001). The difference in assessments for pop and rock music also appeared to be significant (p = .042).
For the subjective significance scale, we found a significant main effect of music genre, F(5, 240) = 26.13, p = .001, η2 = .353, no significant main effect of sex, F(1, 48) = .054, p = .8, and no significant interaction, F(5, 240) = .51, p = .8. Post-hoc analysis for a genre effect, using the Bonferroni correction, showed that subjective significance assessments were lower for electronic music (M = 3.51, SEM = .15) compared to most other genres: jazz (M = 4.57, SEM = .18, p = .001), classical (M = 5.37, SEM = .23, p = .001), pop (M = 4.81, SEM = .2, p = .001) and rock music (M = 4.71, SEM = .18, p = .001). Rap/R&B (M = 3.95, SEM = .2), appeared to be perceived as evoking less significant experiences than jazz (p = .021), classical (p = .001), pop (p = .001) and rock music (p = .001). Classical music was associated with more highly significant emotional experiences than jazz (p = .001) and rock music (p = .03).
For the imageability scale, we found a significant main effect of music genre, F(5, 240) = 11.21, p = .001, η2 = .19, no significant main effect of sex, F(1, 48) = .012, p = .9, and no significant interaction, F(5, 240) = 1.043, p = .4. Post-hoc analysis of the genre effect, using the Bonferroni correction, showed that imageability assessments were lower for electronic music (M = 4.4, SEM = .16) than for most other genres: jazz (M = 5.12, SEM = .17, p = .001), classical (M = 5.75, SEM = .23, p = .001), pop (M = 5.3, SEM = .18, p = .001) and rock music (M = 5.27, SEM = .16, p = .001). Classical music was perceived as evoking more imagery than jazz (p = .004) and rap/R&B music (M = 4.21, SEM = .19, p = .026).
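The effect sizes for the genre main effects can be cross-checked against the F statistics. The sketch below assumes that the reported η2 values are partial eta squared, which can be recovered from F and its degrees of freedom as F × df_effect / (F × df_effect + df_error). The function name is introduced here for illustration only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Genre main effects as reported above: scale -> (F, reported eta squared).
# All four tests use df = (5, 240).
reported = {
    "dominance": (4.01, 0.077),
    "origin": (16.55, 0.256),
    "subjective significance": (26.13, 0.353),
    "imageability": (11.21, 0.19),
}

for scale, (f_value, eta_reported) in reported.items():
    eta = partial_eta_squared(f_value, 5, 240)
    print(f"{scale}: computed {eta:.3f}, reported {eta_reported}")
```

Under this assumption, the computed values agree with the reported ones to within rounding.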
Participants’ music preferences and intensity of affect
To determine whether music preference influenced the affective ratings, we conducted additional regression analyses using the backward elimination method. For each of the six genres, the mean “liking” score for the two music excerpts presented at the initial stage of the procedure was used as a predictor of mean assessments on the valence, arousal, dominance, origin, significance and imageability dimensions. For the valence dimension, the classical, rock, pop and rap/R&B preferences predicted mean assessments: y = .098 (classical music “liking”) + .209 (rock “liking”) + .141 (pop “liking”) + .128 (rap/R&B “liking”) + 2.689; R2 = .403, F(4, 45) = 7.59, p = .001. For the arousal dimension, the jazz and rap/R&B preferences predicted mean assessments: y = .121 (jazz music “liking”) + .157 (rap/R&B “liking”) + 3.265; R2 = .183, F(2, 47) = 5.25, p = .009. For the dominance dimension, only the jazz preference predicted mean assessments: y = .168 (jazz music “liking”) + 4.221; R2 = .09, F(1, 48) = 4.74, p = .034. For the origin dimension, only the classical music preference predicted mean assessments: y = –.149 (classical music “liking”) + 4.982; R2 = .105, F(1, 48) = 5.65, p = .022. For the subjective significance dimension, the rock, jazz and rap/R&B preferences predicted mean assessments: y = –.18 (rock music “liking”) + .177 (jazz “liking”) + .212 (rap/R&B “liking”) + 3.578; R2 = .176, F(3, 46) = 3.266, p = .003. Finally, for the imageability dimension, the classical and pop music preferences predicted mean assessments: y = .254 (classical music “liking”) + .203 (pop “liking”) + 3.211; R2 = .255, F(2, 47) = 8.04, p = .001.
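To make the regression equations concrete, the following sketch applies the reported valence equation to one listener. The coefficients and intercept are those given above; the liking scores are hypothetical, and their range is an assumption because the response scale for “liking” is not restated in this section.

```python
# Coefficients of the valence equation reported above.
VALENCE_MODEL = {
    "intercept": 2.689,
    "classical": 0.098,
    "rock": 0.209,
    "pop": 0.141,
    "rap/R&B": 0.128,
}


def predict_mean_rating(model, liking):
    """Intercept plus the weighted sum of the liking scores in the model."""
    return model["intercept"] + sum(
        weight * liking[genre]
        for genre, weight in model.items()
        if genre != "intercept"
    )


# Hypothetical liking scores for one listener (illustration only).
listener = {"classical": 5.0, "rock": 7.0, "pop": 6.0, "rap/R&B": 4.0}
predicted_valence = predict_mean_rating(VALENCE_MODEL, listener)
print(round(predicted_valence, 3))
```

The same function could be reused for the other dimensions by swapping in the corresponding coefficients. Genres removed by backward elimination simply do not appear in the model.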
Discussion
Reliability of ratings
The reliability of the ratings was very good in comparison with other normative studies (e.g., Bradley & Lang, 1999a; Imbir, 2015, 2016b; Warriner et al., 2013), and indicates that our dataset is suitable for use in research on affective states elicited by music. Both methods (split-half correlations and Cronbach’s alpha) showed that the most reliable dimension was arousal, followed by valence, origin and significance. Reliability was lower for the remaining two scales, dominance and imageability, but still good. The pattern and level of results are similar to those obtained in previous studies that used the same scales to assess affective responses to words (Imbir, 2015) and short texts (Imbir, 2016b), in which split-half estimates were reported for arousal (r = .78 for words and .755 for sentences), valence (r = .95 and .935), origin (r = .73 and .815), significance (r = .78 and .745) and dominance (r = .78 and .85). The lower correlations for imageability could be due to the cross-modal (auditory and visual) nature of this scale.
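The two reliability estimates discussed here can be sketched as follows. This is a minimal illustration and not the authors’ analysis script: it assumes that raters are split into odd- and even-indexed halves for the split-half correlation, and that raters are treated as “items” and excerpts as “cases” for Cronbach’s alpha. The toy ratings matrix is invented.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def split_half_r(ratings):
    """ratings[i][j] is the rating of excerpt i by rater j.

    Correlates the per-excerpt means of the two halves of the raters.
    """
    first = [mean(row[0::2]) for row in ratings]
    second = [mean(row[1::2]) for row in ratings]
    return pearson_r(first, second)


def cronbach_alpha(ratings):
    """Cronbach's alpha with raters as items and excerpts as cases."""
    k = len(ratings[0])
    rater_vars = [pvariance([row[j] for row in ratings]) for j in range(k)]
    total_var = pvariance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(rater_vars) / total_var)


# Invented ratings: 5 excerpts (rows) rated by 4 raters (columns).
toy = [
    [7, 8, 7, 6],
    [3, 2, 3, 4],
    [5, 5, 6, 5],
    [8, 9, 8, 9],
    [2, 3, 2, 2],
]
print(round(split_half_r(toy), 3), round(cronbach_alpha(toy), 3))
```

Both estimates approach 1 when raters order the excerpts in the same way, which is the sense in which high values indicate that the mean ratings are stable norms.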
Correlations between dimensions
It is also worth comparing the pattern of correlations between dimensions for various modes of emotional stimuli. The main difference between the norms for affective states elicited by music and those for other modes of affective stimuli appears to lie in the relationship between arousal and valence. Many existing studies of affective responses (e.g., Bradley & Lang, 1999a; Imbir, 2015, 2016b; Moors et al., 2013; Warriner et al., 2013) describe a U-shaped relationship between arousal and valence; that is, strongly valenced stimuli (both positive and negative) are highly arousing, while neutral stimuli are least arousing. In this study, however, we found a positive correlation between arousal and valence; in other words, positively valenced musical excerpts were arousing, whereas negatively valenced excerpts elicited a state of low arousal. This could be due to (1) the music selected for this study or (2) the ways in which music is created and propagated. Put simply, it may be that listeners (or the researchers choosing music for the current study) do not want to listen to arousing negative music that would intensify active, negative feelings such as anger, but instead prefer nostalgic, calm and sad music; consequently, composers avoid writing arousing negative music, which comes to be under-represented in the music market (or in this stimulus bank). Another important relationship observed in this normative study was the high correlation between dominance and valence. This relationship is very common in studies of normative responses to emotional stimuli. For example, studies using emotional words (Bradley & Lang, 1999a) and short texts (Imbir, 2016b) reported correlations between valence and dominance of r = .84 and r = .83, respectively. It seems that positive experiences are generally treated as controllable, while negative experiences are perceived as uncontrollable.
The pattern of correlations between the dimensions also suggests that the recently introduced dimensions of origin and subjective significance (Imbir, 2015, 2016a, 2016b; Jarymowicz & Imbir, 2015) are valid. First, both emotive qualities, valence and origin, were weakly correlated in a nonlinear way. The same pattern was observed in the case of words and sentences. Second, arousal and subjective significance, which capture different kinds of activation, were significantly, negatively and nonlinearly correlated, sharing only about 9% of their variance. Surprisingly, we found a high negative (also nonlinear) correlation between subjective significance and origin, suggesting that feelings originating in the automatic system had more subjective significance for listeners than feelings originating in the reflective system. This result may be specific to musical stimuli, or to the particular set of musical stimuli used in this study, because previous studies found no relationship between these dimensions (Imbir, 2015).
It is also worth discussing the pattern of correlations between imageability and the other dimensions, particularly the negative correlation between imageability and origin (nonlinear, cf. Appendix C in the online supplementary material) and the positive correlations with both subjective significance and valence (linear). This pattern of results is consistent with our observation of the relationship between origin and subjective significance discussed above, and suggests that this relationship might be explained by differences in the imageability of music that is “of the head” and music that is “of the heart”. It appears that affective states attributed to the automatic system (from the heart), and that are perceived as subjectively significant, are more likely to elicit vivid mental images. Affective states originating in the automatic system are non-verbal (cf. the concept of biological value; Damasio, 2010) and are thus more readily translated into images. We suggest that imageability also makes the affective state elicited by a piece of music more relevant to the listener’s goals and expectations, and is thus of greater importance to him or her; this is the dimension captured by the subjective significance scale.
Genre differences and genre preferences in affective responses
An important issue in music psychology is whether modern and classical music have similar abilities to elicit affective responses from listeners. Music is often designed to move its listeners, but some musical genres may be more effective in doing so than others. We found that, in terms of valence, all the genres elicited a similar quality of feelings in listeners. This is a surprising finding that conflicts with some theoretical predictions (cf. Kreutz et al., 2008). This pattern of results might be due to: (1) our selection criteria, in that we tried to include similar numbers of positively and negatively valenced musical excerpts in our experimental dataset, which may account for the similarity in valence ratings across the different musical genres; or (2) the possibility that classical music is not the most emotional genre, or at least not the most valenced one. Other factors, such as music preference, may also have influenced this relationship; for example, the inclusion of participants interested in classical music may have influenced the assessments of their preferred genre. This claim is supported by our analysis of music preferences as predictors of ratings, which showed that classical music “liking” predicted valence, origin and imageability assessments for all presented excerpts. Data on the other dimensions of affective response did reveal some genre specificity (Results section); thus, including genre in experimental designs seems to be crucial for affect elicitation. Furthermore, music preference, expressed as liking for particular music pieces, appeared to influence the affective assessments. This suggests that both the genre of the music and the listener’s subjective preference shape the affective consequences of listening to it.
Current study limitations
Three limitations of the current study should be mentioned. First, the absolute sample size in this study was much smaller than in other affective norms studies (cf. Bradley & Lang, 1999a; Imbir, 2015; Warriner et al., 2013). In these classical approaches, each of a large number of participants assesses a small number of stimuli drawn from a large stimulus set. Recently, it has been demonstrated that the same level of reliability can be obtained with fewer participants, provided that each participant assesses the whole set of stimuli (Moors et al., 2013). This means that a single stimulus is assessed by a number of participants comparable to that found in the standard approach, but the total number of participants involved is much lower.
Second, the music preferences shown by our participants were quite heterogeneous (cf. Appendix A in the online supplementary material). This could have influenced the assessments. As we showed in the Results section, the music preferences of our participants predicted the assessments on each of the scales. Nevertheless, our sample was recruited from a standard student population, so the norms should be useful for other researchers using general student populations.
The third limitation concerns the nature of the materials selected, which were based on available music excerpts; the memories of participants or their understanding of lyrics depended on their English-language skills and could have influenced the assessments. Therefore, the affective norms should be most applicable for non-native-English speakers. It is also worth noting that assessments were carried out among young participants, students of various universities; the collected norms thus apply mostly to young subjects.
Description of the dataset and possible use of affective norms for music
The dataset used in this study was generated and evaluated to provide a bank of musical stimuli covering multiple genres that could be used by researchers to influence participants’ moods. Appendix B in the online supplementary material consists of three spreadsheets presenting: (1) a legend for the dataset, (2) data on all the musical excerpts (N, Min, Max, M and SD) with respect to the six dimensions and (3) data for men and women presented separately. Affective norms for music may be useful in research on emotion regulation. There are several other possible uses for the dataset; for example, the normative data on the dimensions investigated here may be of interest to researchers measuring physiological aspects of affective reactions to auditory stimuli. The stimuli could also be used to elicit an affective state of a known type, in terms of the dimensions investigated. The ability to elicit a known affective state is crucial to experimental research on, for example, how affect or emotions influence cognition. This dataset is available for non-commercial use in scientific research. We encourage scientists to use these norms to prepare affective manipulations involving modern and classical music.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The project was funded by the National Science Center on the basis of decision DEC-2012/07/D/HS6/02013.
