Abstract
In this article I show that although biological and neuropsychological factors enable and constrain the construction of music, culture is implicated on every level at which we can indicate an emotion-music connection. Nevertheless, music encourages an affective sense of human affiliation and security, facilitating feelings of transcultural solidarity.
One of the enduring philosophical issues regarding music and emotion is the question of how it is that the two have any connection. Intuitively, it is not obvious why deliberately organized sound should move us, or why we should regard sonic patterns as so apt for emotional expression. In recent times, the recognition that cultures structure music in ways too diverse to ensure mutual comprehension raises the question of whether any effort to provide an account will be cross-culturally valid. This article will focus on the role of cultural construction in the linkage between music and emotion.
At the outset, we should note that the terms in which this topic is formulated are imprecise. “Culture” is vague, referring to a cluster of features, including beliefs, behaviors, conventions, social institutions, and technologies, which may interact in various ways (Thompson & Balkwill, 2010). Commonly, distinguishable subcultures and cohorts are nested within larger cultural groups, and individuals are often members of several of these smaller groups. Musical cultures need not be restricted to a geographical region, nor need they correlate with cultural membership of any extramusical sort (see Turino, 2008, p. 139). In some discussions, moreover, the term “culture” is not used to refer to community practices, but to whatever in musical experience is a product of learning.
The term “music” is also variously understood, sometimes construed so broadly as to refer to any kind of organized sound (as Edgar Varèse [1967] alleged), environmental sound (whether organized or not) (Cage, 1961), or the abstract idea of moving tonal patterns (Kivy, 1983). “Biology” is also vague, given that data about the nervous system and its operations require interpretation. We might be able to measure the brain’s electrical activations precisely, but the significance of these measurements remains debatable.
Any attempt to ascertain the respective contributions made by culture and biology to the music-emotion relationship is complicated by our difficulty in locating a purely biological level of musical response. Experimental strategies that attempt to isolate those features of musical experience necessitated by the requirements of human cognition are useful (Thompson & Balkwill, 2010). However, such phylogenic factors often operate as constraints rather than absolute determinants of musical organization, leaving considerable room for cultural variation in what counts as “music” and how it is structured.
In what follows I will show that although biological and neuropsychological factors enable and constrain the construction of music, culture is implicated on every level at which we can indicate an emotion-music connection. Nevertheless, music encourages an affective sense of human affiliation and security, facilitating feelings of transcultural solidarity.
Culture and the Link between Music and Emotion
Bases for Cross-Cultural Convergence
Before trying to pinpoint ways that cultures differ in linking emotion and music, we should try to determine ways in which they are the same. In this section I will consider several bases for this connection that are common across cultures, including universals of musical perception, the illusion of movement in music, dynamic features of music that correlate with emotional behavior, and prosodic cues. Despite significant resulting overlap in the emotions associated with features of music, these bases do not assure cultural uniformity.
Universals of musical perception
If we assume that the physiology of the auditory and tactile systems (through which music is experienced) and the basic neurology involved in musical perception are roughly standard across the species, we should be able to find some common patterns in how human beings perceive music. Indeed, several aspects of musical perception appear to be universal characteristics. In processing music human beings characteristically distinguish signals from noise, perceive music in chunks, recognize a kind of equivalence between a tone and its octave, stretch octaves at higher frequencies, organize musical signals in terms of melodic contour, experience melodic patterns that ricochet back and forth between different parts of the frequency range as two melodic lines, categorically perceive pitches and rhythms (hearing slightly sharp or flat tones or slightly mistimed events as on the mark), prefer intervals with small integer frequency ratios between the tones, attend more to overall timing patterns than to the timing of specific events, utilize scales of typically five to seven tones as frameworks, experience pitches within the scale as hierarchically organized, normalize rhythm, employ tones of uneven temporal length, keep tempo in proportion from movement to movement in long works, and apply Gestalt principles to musical signals (Dowling & Harwood, 1986; Epstein, 1988; Meyer, 1956; Patel, 2008).
Such apparently universal features of musical perception may figure to a limited extent in shaping the character of affect that is associated with music. For example, the seemingly universal preference for certain consonances (perfect octaves, fifths, and fourths) may correlate with relative pleasure when such consonances are used. However, musical context has much to do with the specific reaction to consonance. A sense of relief may attend the appearance of consonant intervals after a span of dissonance; however, no such reaction is likely in a highly consonant piece. Moreover, beyond the perfect consonances mentioned before, consonance and dissonance are culturally relative (Dowling & Harwood, 1986). Even the perfect fourth was considered relatively dissonant in certain periods of Western musical history. We can conclude little about the relationship between emotions and music from the universals of musical perception.
Structural complexity
Another plausible basis for cross-cultural convergence would be affective correlations with broad features of music, evident even to those who lack familiarity with a particular style. One such feature might be structural complexity, which some psychological studies correlate with sadness, while correlating melodic simplicity with happiness (Balkwill & Thompson, 1999; Nielzén & Cesarec, 1981; Patel, 2008). Emotional associations with relative complexity may, however, be particularly susceptible to cultural variation, since perceived structural complexity is relative and is likely to be judged on the basis of the typical density of familiar music. In an experiment conducted by Balkwill, Japanese listeners took perceived complexity as a factor in judgments that anger was expressed, while Canadian listeners did not (Thompson & Balkwill, 2010). It would be worthwhile to consider whether the relative sparsity of texture in traditional Japanese music might account for this difference. Another factor may be second-order attitudes toward anger, with one culture being more reluctant than the other to express this emotion.
The illusion of movement and music’s dynamic features
A more promising basis for seeking cross-cultural links between emotion and music is to consider features related to the illusion of “movement” that attends the experience of music. This illusion appears to be universal; Eric Clarke (2005) argues that it derives from our auditory system’s being fine-tuned for motion detection. Charles Nussbaum (2007) suggests that because music exploits some of the same neurological mechanisms that we use in exploring the external environment, our mental representations of music involve motor hierarchies and action plans. Accordingly, it is not surprising that we identify the motion of music with our own movements, or that we envision correlations between music and extramusical domains.
The dynamic features of music’s moving tonal patterns strike many theorists as a basis for associating music and emotion, which is significant for our purposes, given that these would seem detectable regardless of a listener’s familiarity with a particular style. Nineteenth-century music theorist Eduard Hanslick (1854/1986) itemized these features as including “audible changes in strength, motion, and proportion,” as well as “increasing and diminishing, acceleration and deceleration, clever interweavings, simple progressions, and the like” (1854/1986, p. 10). Given the universality of the illusion of movement in music, these features would seem to be cross-culturally apparent.
According to Hanslick, music can suggest ideas related to emotion, but it does not represent full-fledged emotions; it can only reproduce “the motion of a physical process according to the prevailing momentum: fast, slow, strong, weak, rising, falling” (1854/1986, p. 11). The same music can also be compatible with a variety of emotions, so long as they share the same dynamic profile. Many contemporary philosophers are more sanguine about dynamic features grounding a connection between music and emotions, arguing that particular dynamic patterns are reminiscent of behavioral manifestations of emotion seen from a third-person point of view (Budd, 1989; Davies, 1980, 1994; Kivy, 1989, 2002; Levinson, 1990). The idea is that music presents characteristics of behavior, such as gait and demeanor, that typify the behavior of a person experiencing certain emotions.
One would expect some degree of cross-cultural agreement in what musical behaviors mimic bodily movement. Such features as speed, strengthening, and fading would seem to be evident in the music of another culture, even if much about its structure is unfamiliar. While it might take accompanying lyrics, performance context, or accompanying dance to specify the exact emotional cast of a musical passage or piece (as sad, solemn, or something else), the rhythmic possibilities suited for expressing such a particular range of emotions are not culturally specific. Slow, even beats resemble the pace of a sad or solemn person, for example. Impressions of tension and relaxation can similarly be conveyed by dynamic features, such as the weakening in vocal intensity that might occur when a song is approaching the high limit of a singer’s vocal range. The impression of tension, of course, falls short of an affective association, but it can be among cues that, taken together, suggest an emotion.
Acoustic codes in prosody
Music’s dynamic features importantly connect music and emotion through our apparently “hard-wired” acoustic codes for emotional expression. These codes are evident in prosody, those vocal inflections (“tone of voice”) that convey emotion independent of semantic content and facial expression (Peretz, 2001). Much experimental evidence suggests that similar acoustic profiles correlate with specific general emotion categories in both speech and music. This presumably accounts for the cross-cultural employment of the same acoustic characteristics for particular emotional song-types, such as lullabies, love songs, and martial music (Juslin, 2001). Descending intervals, for example, predominate in the melody lines of lullabies (Unyk, Trehub, Trainor, & Schellenberg, 1992).
In a far-reaching analysis of prosody studies, Patrik Juslin and Petri Laukka (2003) found broad convergence among the findings of 41 studies of instrumental music and 104 studies of vocal expression. The studies show listeners’ accuracy in gauging intended emotion in both kinds of expression for five general categories of emotion: “anger, fear, happiness, sadness and love-tenderness” (2003, p. 776). Juslin and Laukka found strong agreement in the patterns of acoustic cues that are associated with particular emotions by both listeners and performers, with especially strong correlations in the cases of anger and sadness.
Cue Redundancy and Cross-Cultural Correlations
Despite considerable cultural overlap, Juslin and Laukka (2003) found some evidence that culture affects accuracy in recognizing emotional expression. They analyzed 12 studies in which subjects were drawn from more than one nation, and they found a 7% lower accuracy rate for cross-cultural recognition as opposed to intracultural recognition of both linguistic and musical expression of emotion. They contend that expression in both domains is context-dependent, pointing out that recognition becomes more accurate as cues are combined (see also Juslin, 2001). In this judgment, they concur with Balkwill and Thompson’s cue-redundancy model.
According to Balkwill and Thompson’s model (1999), culture-specific cues can supplement psychophysical cues to the affective character of unfamiliar music. This can enable listeners who are familiar with culture-specific cues to decode the intended emotional content in music more accurately than those who are not. More recently, Thompson and Balkwill (2010) have elaborated a “fractionating emotional systems” model to take account of different environments and enculturation as a process that develops over time. As familiarity with a musical style and its context grows, individuals learn to decipher more domain-specific and culture-specific cues to emotion, thereby gaining more facility in recognizing those features of emotional expression that are culturally specified. By emphasizing the cooperation of cultural and perceptual factors in the identification of emotion, these cue-redundancy models indicate one cause of our inability to sharply separate biological from cultural bases for links between music and emotion.
Even if members of a culture are more accurate, several studies offer evidence that recognition of intended emotional expression can extend across cultural boundaries, apparently on the basis of such features of music as we have been considering. In an experiment conducted by Balkwill and Thompson (1999), for example, Canadian subjects were exposed to excerpts from 12 Hindustani ragas and were asked to rate each in terms of four emotions (anger, joy, sadness, and peacefulness) as well as tempo, complexity of rhythm and melody, and pitch range. The experimenters analyzed the latter four dimensions as perceptual cues, evident without prior knowledge or enculturation. They found significant correlations of the subjects’ judgments about intended emotional expression for anger, joy, and sadness with those of “expert listeners” long familiar with Hindustani music. Given that their subjects did not have prior awareness of the conventional emotional associations with the ragas, the subjects’ recognition of the intended emotions, the experimenters concluded, must depend on independent, psychophysical grounds. More recent experiments, involving subjects from Japan and Canada, have confirmed the finding that listeners from other cultures are fairly accurate in recognizing intended emotion (see Thompson & Balkwill, 2010).
Balkwill and Thompson’s distinction between perceptual and enculturated cues may be a bit too sharp to fully reflect the role of cultural factors. While perceptual mechanisms are universal, in that they operate in the same way for all human beings in whom they are functional, perception itself is mediated by cultural construction. In making sense of music, listeners make use of schemata that they have acquired through their exposure to music (Dowling & Harwood, 1986; Hopkins, 1982; Lynch & Eilers, 1991; Perlman & Krumhansl, 1996). To a point, for example, listeners conflate the pitches they hear with allowable pitch categories in the music with which they are familiar; but beyond that point, a tone can sound “out of tune.” Foreign music that is tuned differently from the music of a listener’s own culture can, accordingly, be perceived as deviant. This could have an impact on recognition of emotional content in music. Deviance suggests danger, which provokes an affective response; but what sounds deviant to someone outside a musical culture might sound predictable to someone within it (Thompson & Balkwill, 2010).
Thomas Fritz et al. (2009) also conducted a comparative study of listener recognition of “happy, sad, and scared/fearful music” (2009, p. 573). Twenty-one of their subjects were from the Mafa tribe of Cameroon; 20 were Western. Both groups were previously unfamiliar with the other culture’s music, but both were able to recognize the three emotions above chance, although subjects in the Mafa group varied in ability to perform the task, with two scoring at chance level. Given this variability, the experimenters’ claim that the three emotions studied are “universally” recognized in Western music is an overstatement. What the study did show was that the subjects from both societies used the same bases for assessing emotional expression. Both used temporal patterns and scale as cues, although neither used tempo as an indicator of sadness. Both groups also correlated indefinite mode with sadness. Thus, features of musical movement and structure served as cross-cultural bases for associating music and affect.
Emotional Expression and Emotional Arousal
Is Expressed Emotion Aroused?
An issue of relevance for any discussion of the relationship between emotion and music is the extent to which the emotions expressed by a piece of music provoke arousal of the same emotions. The belief that music arouses emotion is ubiquitous, but whether the aroused emotions are the same as those expressed by the music is another issue. Moreover, some listeners who recognize emotional expression do not experience arousal. Certain philosophers hold that this situation is desirable because it indicates intellectual appreciation of music, not passive physiological response (Kivy, 1980, 1989; Pratt, 1952). Some would go so far in the other direction as to say that the appropriate response to emotional expression in music is to feel that emotion oneself (Matravers, 1998). “Appropriateness,” however, seems relevant primarily to conscious behavior, and it is not obvious that conscious recognition of expressed emotion is necessary in order for emotion to be aroused (Trainor & Schmidt, 2003). In fact, some mechanisms for emotional arousal may occur without the listener’s awareness and may be more effective in its absence (Juslin & Västfjäll, 2008).
The predominant view among philosophers and other theorists is that the conjunction of expression and arousal is typical, and some have proposed a mirroring mechanism along the lines of the “mirror neurons” model (Davies, 1980, 1994; Juslin, 2001; Juslin & Västfjäll, 2008; Koelsch, Fritz, von Cramon, Müller, & Friederici, 2006; Levitin, 2006). Music may provoke emotions (e.g., annoyance) besides those that are expressive content, but pleasurable listening often involves emotional identification with the music’s movement and the emotions that it suggests.
Meyer’s Expectation-Oriented Model
Leonard B. Meyer (1956) describes recognition of emotional content and emotional arousal as complementary ways of responding to music, the difference being a function of viewpoint. He argues that affect is aroused by the frustration of expectations. Expectations require assimilation of a (culturally constructed) musical style, but assuming the relevant background, a listener’s expectations will be defied at predictable points in a musical piece. If one is regarding the music from a nonanalytic standpoint, emotional arousal will be prompted at these points. Alternatively, one can attend to the music more reflectively and determine structural points at which affect will occur by examining the piece itself.
Meyer’s theory is formulated in terms of information theory, but one need not embrace this position to accept his point that expectation plays a role in affective response to music or to see how culture figures in this process. Musical cultures and their characteristic styles set up expectations, deviation from which may prompt an emotional arousal. This indicates a ground for anticipating cultural differences among listeners. Those listeners who have internalized a culture’s expectation patterns are able to predict or experience affect at structural points where deviations from these patterns occur; but those who are unfamiliar with these patterns may not be susceptible to such arousal.
Meyer’s account suggests that the same person can go back and forth between more analytic and more emotionally engaged modes of attention, although he is more concerned to characterize the two stances than to consider ways in which these stances might shift. In accounts of the emotional experience of listeners, we should note that the listener’s attention can fluctuate even during a single listening experience, at moments analytically attending to musical features and at other moments attending more toward his or her state of emotional arousal. This is not to suggest that listeners lurch from being emotionally overwhelmed to being dispassionate during a brief span of listening, but that they commonly exhibit what Paul Thom (1993) refers to as “playful attention.” Thom observes that listeners’ attention plays between the musical present and the musical past or future, between the musical content and the musical form, between the current performance and others that are recalled, between features of the performance and features of the listener’s personal life, and between what is happening in the performance space and what may happen outside it. Some kinds of attentive play involve taking a relatively detached perspective on the musical structure (e.g., play between the music’s content and the means of conveying it), while others may involve a more emotionally engaged stance.
Although I am not aware of empirical demonstrations that listeners typically shift their attention in this way, some “stream of consciousness” reports corroborate this idea. One of Alf Gabrielsson’s experimental subjects indicates shifts of attention in her description of her emotional experience of a performance of a reconstruction of Mahler’s unfinished Tenth Symphony:
The orchestra is breaking out in a warmth that is fascinatingly painful. I remember tears filling my eyes. I felt as if I understood a message, from one time to another, from one human to another. I sat as if turned to stone, with my fingers gripping the armrest. After a while I became quite dizzy. Then I realized that I had forgotten to breathe. . . . The last movement opens with eleven beats of an enormous wooden club against the largest bassdrum. The effect is horrible. A flute, followed by the strings, starts hesitatingly to develop a melody, but it is suddenly crushed by the beat of the wooden club. I was totally taken by surprise. Each time the club beat, I was seized by terror. Each time an instrument began to sound, doubtfully, I anticipated the hit of the wooden club, I mourned. Here, too, I felt like I was a receiver of a message . . . I felt like I had not missed a single note . . . the feeling of being spoken to, strongly and directly, lived inside me. (Gabrielsson, 2001, p. 441)
This subject moves from a relatively detached realization that she hadn’t been breathing to extreme emotional identification with the music, describing herself as mourning and feeling terror in response. Juslin and Västfjäll (2008) construct a rather similar “empirically inspired” example of a listener’s emotional responses during a concert, which includes references to a sudden harmonic change “causing his breathing to come to a brief halt,” the thought “This piece of music is really a cleverly constructed piece!” and a sense of happiness when the goal is reached (2008, p. 563).
Embodied Engagement
Deviation from expectations is not the only way that affect is generated. Observing that Meyer’s account is designed to account for affect aroused by composed music, Charles Keil (1966) claims that it is not equally useful for improvised and participatory music. The music of many cultures is more dance-related than syntactically organized, and the affect aroused by such music is often a consequence of bodily engagement, spontaneity, and specific musical gestures (such as “landing on” the pulse at certain moments during a performance).
Juslin and Västfjäll’s Multitude of Mechanisms
Juslin and Västfjäll (2008) suggest that a variety of mechanisms can prompt emotional arousal through music, only one of which correlates with Meyer’s expectation-oriented model. Besides cognitive appraisal (in which emotion is provoked by the sense that the music is helping or hindering the attainment of one’s goals), their list of mechanisms includes “(1) brain stem reflexes, (2) evaluative conditioning, (3) emotional contagion, (4) visual imagery, (5) episodic memory, and (6) musical expectancy” (which encompasses Meyer’s model) (2008, p. 563). Brain stem reflexes are hard-wired for rapid autonomic nervous system (ANS) activation in response to certain sounds. Evaluative conditioning occurs when a particular piece of music is associated with positive or negative emotional stimuli with which it has been repeatedly paired. Emotional contagion is the listener’s mirroring of the emotion expressed by the music, a topic we have already considered. Visual imagery is a process of conjuring up images to accompany the music one is hearing, images that have affective significance to the listener. These might well move and change in accordance with the dynamic features of music considered in the previous section. Episodic memory is the musical evocation of a personal memory, with its associated affect. Acknowledging this multiplicity of mechanisms enables us to recognize a range of idiosyncratic, species-invariant, and culturally conditioned bases for emotional arousal through music.
Other Mechanisms
Other mechanisms might also generate affect. For example, in the case of listening to a piece with which one is personally quite familiar, one can take great satisfaction in mentally awaiting and then hearing each expected musical event, much as a child enjoys rehearing the same story told again and again in exactly the same way. The mechanism for arousal in this case is not evaluative conditioning, episodic memory, or musical expectancy, and yet it seems to involve aspects of all of them. It is akin to evaluative conditioning in that previous experiences with the piece have resulted in a positive association. It depends on episodic memory in the sense that one will probably relate the present music to previous occasions during which one was engaged with the piece, though importantly, this recollection can be focused on the musical structure rather than on extramusical contexts. This mechanism also involves expectations about the musical structure, in that ongoing satisfaction occurs as precisely anticipated events sequentially occur (Higgins, 1997). Further study should focus on identifying other possible arousal mechanisms.
Problems and Qualifications in Determining Cultural Influences
Terminological Inconsistencies
Hanslick denied that dynamic features can represent full-fledged emotions because listeners are not unanimous in their emotional characterizations, a phenomenon that has been experimentally verified when the range of allowable emotion terms is broad (Gregory & Varney, 1996; Hevner, 1936, 1937). Cultural differences in emotional vocabulary and the nonuniformity of emotional concepts across cultures would only exacerbate the lack of consensus (Benamou, 2003; Patel, 2008; Shweder, Haidt, Horton, & Joseph, 2008).
We need not be as concerned as was Hanslick with terminological consistency, however. Contemporary experimenters have primarily been interested in agreement on broad emotional characteristics, even when subjects are offered a large variety of terms (e.g., Shaver, Schwartz, Kirson, & O’Connor, 1987; Zentner, Grandjean, & Scherer, 2008). Juslin (2001) speculates that the system that evolved in human beings for the detection of affect is geared more to differentiating broad emotional categories, preventing mistakes about another agent’s emotional state, than to detecting more subtle distinctions (see also Juslin & Laukka, 2003; Juslin & Timmers, 2010). Significant evidence shows that listeners within a culture tend to agree about the basic emotional character of a musical piece.
Which Emotions?
Even if we restrict ourselves to broad emotion categories, however, several issues arise when we try to delineate the role of culture in determining the emotional character of music. The first is the question of which emotions can be expressed through music (particularly music whose emotional content is not rendered precise through conjunction with words) and whether the range is stable across cultures.
Basic emotions
To the extent that cross-cultural correlations can be made between music and expressive emotional behavior, they are likely to apply to more or less basic emotions, as opposed to cognitively complex ones (Juslin, 1997). We should note, however, that not every emotion taken to be “basic” is susceptible to musical expression. For example, it is not obvious how one could musically express “disgust,” one of the basic emotions on Paul Ekman’s (1972) list (as well as in the traditional list of rasas, or universal emotional “flavors,” itemized in Indian performance theory) (Bharata, 1967 [200–500 CE]; Ekman, 1992).
Basic emotions are, by definition, universally recognized. This has led some critics to challenge the very idea of basic emotions on the ground that it bypasses the issue it purports to address: whether there are any emotions that can be said to be “the same” across cultures (Shweder et al., 2008; Solomon, 2002). Even assuming that there are basic emotions, however, cultures still construct the display rules (Ekman, 2003). If culture inflects expressive behavior of even the allegedly basic emotions, we might expect that musical expression may vary accordingly. To demonstrate this, it would be necessary to show that music can reflect subtle differences in display styles among cultures. Doing so does not seem prima facie implausible. Anthropologist Alan Lomax (1962/1971, 1976) proposes that a culture’s characteristic song style, identified in terms of its generic features, reflects many of its social patterns. While Lomax’s view has been criticized for characterizing both song and social patterns with brushstrokes that are too broad, the general idea that music reflects cultural ideals for interpersonal relations has been widely accepted (Feld, 1984b). Given that social display rules are among the regulative standards for human interaction within a culture, the possibility that distinct cultural ideals for emotional expression might be reflected in music seems worthy of consideration. Cultures with more restrained bodily behavior, for example, might construct and detect quieter musical expressions for certain emotions than would cultures with less restrained emotional display.
Cognitively complex emotions
Some philosophers have argued that nonbasic emotions can also be expressed in music, even instrumental music. Jenefer Robinson, for example, contends that instrumental music can express cognitively complex emotions by presenting an unfolding psychological drama, interpreted as the experience of a musical “persona,” and she has offered, in collaboration with Gregory Karl (Karl & Robinson, 1995), a fascinating analysis of Shostakovich’s Tenth Symphony in these terms (see also Robinson, 2005). Jerrold Levinson (1990) has similarly analyzed Mendelssohn’s Hebrides Overture as expressing a narrative of hope. This approach might be extended to analysis of some non-Western music, but only if it exhibits clear narrative structure, which is not the case for all musical forms or all cultures’ music. Whether or not listeners reliably converge in identifying cognitively complex emotions in longer works has not, to my knowledge, been studied.
The idea that some music presents an unfolding emotional drama applies more obviously to longer works than to shorter ones. Juslin suggests another possibility, that of analyzing the cues for certain cognitively complex emotions as constructed from cues for more basic ones. Drawing on Robert Plutchik’s proposal (1994) that cues for pride combine cues for happiness and anger, Juslin (2001) proposes that pride might be conveyed by means of a compromise between cues expressing happiness and those expressing anger. If relationships between cues for basic emotions and those for cognitively complex emotions could be established, this might indicate a cross-cultural basis for recognizing musical expressions of cognitively complex emotions.
Narrative structure and cultural archetypes
For those cultures whose music does display narrative structure, it would be valuable to consider whether narrative schemes reflected in their musical genres are associated with emotional expression. Laurence Berman (1993) analyzes Western musical works in terms of cultural “archetypes,” fundamental ideas about extramusical content that particular cultures associate with music. He sees these archetypes as embedded in certain musical forms and as emotionally evocative. For example, he analyzes the controlled employment of polyphony in medieval Christian music, which clearly interfered with focus on the texts, as reflecting the achievement of an emotional ecstasis, in which the believer caught a glimpse of the serene joy of the other world. Sonata-allegro form, typically employed in the opening movement of a symphony, suggests the satisfying reconciliation of oppositions, in keeping with the aspirations of the era from which sonata-allegro form emerged. Lewis Rowell (1983) points to narrative development in both Japanese and Indian music. A cross-cultural study of narrative structures in music could be useful for indicating common patterns in approaches to expressive content.
Do Listeners across Cultures Hear the Same Emotion?
A second issue regarding culture’s role in forming links between emotion and music is how we can establish that listeners from different cultures recognize or experience the “same” emotion. The inherent limitations of emotional language challenge such an effort. Studies that require subjects from different cultures to assign a limited set of broad semantic labels to music leave open the question of how much convergence would occur if the labels were more specific (although individuals may not be terribly precise in their use of specific terms in any case).
Even broad linguistic labels for emotions seem oddly applied to music, however, for music does not provide the bases for individuating emotions that we find in most other cases. Emotions are typically individuated in relation to their objects (Davies, 2001; Kivy, 1988, 1989), but it is not obvious what object could be used to justify calling particular emotional content in music “sadness,” for instance, or “fear.” Music does not present objects of sadness or fear, unless we take something in the music itself to be the object; and in that case, we would confront the problem of explaining how one could be sad about or afraid of the music, as such.
Moreover, pieces of music often express multiple emotions in succession, a changing emotional trajectory that is difficult to pin down with emotional language. This problem might be solved, or at least diminished, by “continuous response” techniques of the sort that Schubert (2001, 2010) describes; but using them requires that one restrict the number of emotion categories utilized to make continuous self-reporting feasible. Some recent studies have bypassed the issue of which emotion categories to use by relying on dimensional categories such as “tension,” “musical intensity,” and “aesthetic response.” The distinction between emotional and dimensional categories is not always sharp, however, since dimensional categories are sometimes used to characterize particular emotions.
Yet another problem for identifying the emotion expressed or aroused by a musical work is that using emotional labels may influence the listener’s experience. Titles, program notes or comments, descriptions of the music by others, and the listener’s monitoring of his or her responses may all suggest emotion terms that might influence the way in which the listener understands the experience. A listener’s expectation that a piece or passage of music will be “sad” (as might stem from having heard that the work is a “tragic symphony”) might prime or reinforce arousal of sadness. The phenomenon of hypercognizing emotions may also play a role here, with mixed impact. Noticing one’s emotion and internally labeling it might make the features consistent with the label more salient; but the mental act of labeling might temporarily distract attention from the music itself, and hence reduce arousal.
The imperfect translation of emotion terms also poses a problem for establishing that members of different cultures recognize or experience the same emotion. Marc Benamou, in a study of affect terms used for describing expressiveness in music, determined that some terms used by Javanese subjects did not straightforwardly correspond to Western categories (Benamou, 2003). In such cases, we might conclude that subjects in Javanese- and English-speaking groups do not recognize or experience the same emotion, although the emotions in question may be similar. But to assume that the same emotion is intended when the languages have correlative terms is not obviously justified either, given the possibility of different connotations associated with the respective terms.
Cultural Convergences and Cultural Construction
We should keep in mind that cross-cultural commonalities in emotional judgments and responses do not establish that they are biologically determined (Thompson & Balkwill, 2010). Some “hard-wired” physiological responses to music may suggest similar associations to listeners from various cultures, but these associations would nevertheless be culturally constructed. For example, most cultures employ centering tones or some other device for establishing an overall tendency within a piece of music. This feeling of tendency, combined with the fact that music is structured through time, can be emotionally evocative, reminding us of general features of the human condition, which involves many goal-oriented undertakings over a temporal expanse. In the case of composed music, in which works are determined in great detail, Nussbaum (2007) suggests that the music offers relief from our existential horror at the contingent. Although cultures vary with respect to how much they utilize composition as opposed to improvisation, most cultures preserve and reuse certain songs and other works (Davies, 2001). The existential relief that Nussbaum mentions might apply whenever a recognized tune or melody reappears.
Another case in which cultures might converge in forming affective associations with features of music builds on the fact that more physical effort is required for singers to produce higher pitches than lower ones. As a consequence, musical events suggest greater and lesser tensions depending on their relative height in pitch-space (Patel, 2008; Thompson & Balkwill, 2010), and in this respect, they resemble many of our activities. Structuring a musical work to intensify the tensions without much fulfillment might suggest and even provoke emotions related to frustration. Structuring music to delay but ultimately achieve resolution may, by contrast, result in emotions related to breakthrough.
Despite convergences, cultural associations with what is perceived in music are diverse at the level of specifics simply because music is, in fact, interpreted. Steven Feld (1984a) proposes a multifaceted model of “interpretive moves” that cultures make in relating music to locations, categories, associations, reflections, and evaluations that they consider important. Any of these interpretive moves can have emotional overtones for a culture (or an individual). For example, we might expect that listeners relate techno music to their associations with machines, electricity, and big cities; a sense of their own ability to behave in highly regular ways (like a machine); reflection on life in a revved-up society; and, conceivably, some evaluation of whether life in this situation is exciting, de-humanizing, or both. Importantly, the “interpretive moves” model underscores listeners’ active role in interpreting music, and indicates how cultures might associate even neurophysiologically invariant responses with matters of more specific cultural interest. Given works or passages of music might cross-culturally arouse emotions with certain neurophysiological profiles, but these are fleshed out by particular, culturally specific interpretations.
Musical Vitality and Human Affiliation
Thus far we have focused on the ways in which culture interacts with biology in enabling the recognition and arousal of emotion in connection with music. We should not, however, underplay the role of biology. This section will consider the body’s role in arousing emotions related to human affiliation.
The Vitality of Music
Music’s immediate biological appeal is cross-culturally evident. The excitement felt as a result of bodily involvement with music appears to be independent of cultural conditioning, in that it can be appreciated even in the case of extremely foreign music. This is not to say that any listener will appreciate every instance of foreign music. Schemata for what pitches are acceptably “musical” are instilled at a very early age, ensuring that music pitched in too different a manner will be heard as “out of tune,” a circumstance that some find distressing (Dowling & Harwood, 1986). However, listeners in general can recognize and relate to the vitality that is manifest in music, and this affords them an opportunity for enjoyment and the exhilaration of participation (if only as a listener), both emotional responses.
The vitality of music has been characterized in a number of ways. Charles Keil (1966) refers to the “vital drive” of music, a term coined by André Hodeir (1956), who describes it in terms of the “rhythmic fluidity” involved in the phenomenon of “swing.” “Vital drive,” in jazz, according to Keil, is achieved when “the rhythms conflict with or ‘exhibit’ the pulse without destroying it altogether.” This results in “engendered feeling,” which must accumulate for a jazz solo to develop (Keil, 1966, p. 345). Another embodied aspect of music that generates affect is what Keil (1987) calls “participatory discrepancies.” By this he means the nuanced deviation from mechanical precision that gives music its “human” feel (Seashore, 1967, p. 147). These deviations are person-specific, performance-specific, and even gesture-specific. Participatory discrepancies are means of individual expression, through which every individual performer makes the music his or her own; at the same time, the idiosyncratic discrepancies within and among the performances of various individual participants create subtle rhythmic conflicts that engender affect and produce feelings of drive that invite participatory movement by others (Keil, 1987; Keil & Feld, 1994).
Vitality Affects
Daniel Stern’s (1985) account of the “vitality affects,” formulated in his analysis of infant development, yields further insight into the impression of vitality in music. He defines vitality affects as kinetic characteristics that are perceived multimodally. Infants recognize vitality affects in their caregivers even before being able to distinguish the caregivers as specific individuals; in fact, vitality affects are the means by which the infant learns to recognize distinct individuals as unified beings. The various sensory streams that indicate the dynamic profiles of the caregiver’s movements are recognized as simultaneous through “activation contours (such as ‘rushes’ of thought, feeling, or action)” that correspond to “similar envelopes of neural firings . . . in different parts of the nervous system” (Stern, 1985, pp. 57–58). Infants and caregivers “attune” with each other, adjusting their expressive actions toward each other in terms of both timing and contour. The vitality affects are involved in our sense of being “with” someone else at any stage of life.
Stern sees music as a paradigmatic illustration of the vitality affects’ expressiveness, and he contends that music returns us to the condition we experienced as infants attuning to our caregivers. While listening, we again attune to the directly perceived vitality affects of the music (Bunt & Pavlicevic, 2001). If this is so, music revives feelings of interpersonal connection on a basis that is not dependent on cultural constructions (although the rhythmic profile of an individual’s movements will reflect the cultural style of movement). Such feelings are a basis for a sense of solidarity through music.
Of particular interest in Stern’s account is its suggestion that a secure sense of rapport with another person is originally grounded in vitality affects. If music restores our early stance of attuning to other persons, surely this has an emotional impact. The fact that music in general is cross-culturally associated with happiness or joy, regardless of the emotions expressed or aroused by particular pieces (Becker, 2004), may reflect music’s capacity to arouse feelings of secure connections with others. One might expect that the rapport felt by the audience would be primarily directed toward performers, and perhaps the zealous responses to musicians by their “fans” (a term that originally abbreviated “fanatics”) support this idea. But solidarity seems to be generated among those who are members of a musical audience as well, perhaps because they are literally pulsating with the music (see Schutz, 1951/1977).
Entrainment and Feelings of Solidarity
Feelings of solidarity are also encouraged by entrainment, our tendency to synchronize some features of our biological activity to external rhythms (London, 2004). Even if one is only listening, musical rhythm encourages one to move with the music, perhaps tapping a foot or bobbing one’s head. While experiencing music together, people become rhythmically synchronized in their movements and breathing. The fact that everyone in the environment is enveloped by the same music and likely aroused to kindred emotions encourages bonding, as does the sharing of any strong experience.
Participatory Music
Mutual participation in music furthers the extent of interpersonal synchronization, and some kinds of music are particularly designed to encourage social bonding through participation. Thomas Turino (2008) points out that throughout the world one encounters styles that feature textural density (concealing flaws in individual contribution) and involve roles of varying degrees of complexity (encouraging participation by novices while maintaining the interest of those with greater expertise). The aim is to encourage participation by as large a group as possible, and participation in such musical styles has a powerful physical impact. One is aware of one’s own body resonating along with the bodies of all the other participants, and one is frequently close enough that one can feel others’ physical presence and movement. Emotionally, this is often exhilarating, and emotional arousal is an indication of musical success in participatory styles.
Music’s rhythmic reflection of vitality, its ability to entrain its listeners, and its participatory enlistment of the body are all ways in which our biology is engaged in provoking affect. Culture fleshes out many of the details of the emotional experience, yet we have every reason to think that music can stir emotions of solidarity across cultural divides. It can also prompt a second-order affective response to the recognition of our common emotional nature as human beings.
Conclusion and Future Directions
Some suggestions for future research on culture’s role in linking music and emotion have already been indicated. One project is to further analyze the interactions of the arousal mechanisms that Juslin and Västfjäll describe and to consider other possible mechanisms. Another is to elaborate further on the interplay of cultural and more “hard-wired” factors, as opposed to defending one or the other as more important (Becker, 2004, 2010; Feld, 1996).
Future research should also take account of the many types of cross-cultural music-making that occur in the contemporary world. Analyses of the emotions generated by such musical hybrids would be worthwhile. Actively transcending cultural boundaries through such music might occasion specifically joyous, achievement-related, or solidarity-provoking emotions. On the other hand, if the hybridization depends on weakening some of the characteristic features of musical style that have strong emotional associations within a culture, this may result in diminished arousal of at least certain emotions. The extent to which emotions are involved among the conditions for aesthetically successful hybridizations might also be explored. We might consider whether the induction of emotions through mechanisms that require prolonged encoding—or the avoidance of mixed emotions—is more difficult when the sources of music derive from multiple cultures. We might also want to investigate the extent to which universal acoustic cues facilitate the musical splices that successful hybrids involve.
We should also expect further research not only into the brain’s role in emotional arousal through music, but also into other ways in which bodily involvement in music figures in the music-emotion connection (and indeed in art-emotion connections more generally). Music may not be a universal language of emotion, but it is time to establish scientifically how much credence we should give the long-standing idea that its emotional impact transcends cultural boundaries.
