Abstract
When individuals coordinate their behaviour, they need to both anticipate actions and respond to each other in meaningful ways. Jazz musicians often encounter situations in jam sessions in which they interact with previously unknown musicians, allowing insights into spontaneous collaboration. The current study investigated call and response patterns in free jazz improvisations by analysing movement and musical characteristics of duos. Twelve jazz musicians were paired into six duos of an e-guitar and a saxophone. Balanced across duos, one musician was asked to play a series of improvisations expressing the emotions happy, sad or neutral. The second musician responded to each improvisation without knowing the emotional intention of the first musician. Call and response roles were then exchanged. While musicians improvised or listened to their duo partner, they were both recorded with an optical motion capture system. Results indicate correspondences between call and response musicians in movement variability and cumulative distance of head motion. There were marked differences between happy and sad emotional expressions both in movement parameters and musical features including mean intensity, mode, and, albeit to a lesser extent, tempo. Retrospective verbal decoding of the call musicians’ emotional intentions was correct in 76.5% of all cases. Independently of explicit decoding success and even for the first encounters, musicians spontaneously tuned into each other’s performances by means of their body movements and the musical characteristics of the improvisations.
Free jazz improvisations are cases of heightened human interaction. In a condensed form, they offer insights into spontaneous creative collaboration, social responsiveness, intentional coordination, and shared emotional expression as manifested in bodily and musical characteristics (Berliner, 1994; Monson, 1996; MacDonald & Wilson, 2005; Palmer & Zamm, 2017). The communication in jazz ensembles has been described as “give and take” and “collective conversation” (Berliner, 1994, p. 348), and it can be argued that some of the fascination with free jazz stems from witnessing the interplay of these processes in a live situation. In jam sessions in particular, musicians often play together for the first time. In these situations, they need to adjust their timing and negotiate musical actions spontaneously in a manner of “participatory sense-making” (De Jaegher & Di Paolo, 2008, p. 41). Drawing on research into the psychology of interaction as well as studies of jazz, this paper addresses in what ways jazz musicians respond to each other when they perform together with a previously unknown partner. Specifically, the investigation focuses on which expressive information the musicians perceive in each other’s play and how they transform the material through musical and bodily actions. This approach may further our understanding of human interaction more generally, since patterns of jazz improvisation share essential features with other expressive situations such as conversations.
Interactions in small groups and in musical ensembles
Interpersonal coordination in the widest sense comprises the processes of action and reaction. In smaller groups and particularly in dyadic interaction, these behaviours typically alternate continuously such as in sports involving the throwing of a ball or a shuttlecock, or in turn-taking in spoken interaction (Gallois & Markel, 1975). Successful coordination is reached when individuals respond to each other by anticipating the action outcome of their partner, team member or opponent in sports. A recent study investigated vocal and movement parameters in dyads when making a decision within a conversation (Stevanovic et al., 2017). In order to manage sequential transitions, both speakers lowered the pitch level of their voices and tended to synchronise their body sway to some extent. While patterns of turn-taking in conversations may not simply be transferred to the sphere of music, some of the underlying processes in dyads as well as in larger teams may be fundamentally similar.
In this way, Glowinski, Bracco, Chiorri, and Grandjean (2016) described musical ensembles as “resilient systems” that, like all such systems, are characterised by the mastering of anticipation, monitoring, responding, and learning. These capacities highlight processes related to external and internal demands. For the functioning of internal processes in musical ensembles, traditional leader–follower roles emerged that regulate coordination in temporal, dynamic and articulatory dimensions (Wöllner & Keller, 2017). Clearly defined roles may allow ensembles an efficient coordination of the high demands in joint musical actions, especially regarding timing and expressive content. Empirical research on ensemble performance showed that even experimentally assigned roles affect temporal coordination in music. In Goebl and Palmer’s (2009) study, followers in piano duos played the lower part and tracked the leaders in terms of timing, showing a larger amount of error correction as indicated by a higher coefficient of variation in inter-onset intervals (IOI). This finding suggests that followers adapted to the timing of leaders to a greater extent than vice versa. Furthermore, the head movements of the leaders preceded the followers’ head movements, providing support for a visual manifestation of these roles.
While a further study did not show clear leader–follower patterns in the timing of the body sway of primo and secondo pianists, alignment of body sway was positively correlated with IOI synchronisation, indicating that bodily interactions shaped the timing in duos (Keller & Appel, 2010). Research on error correction mechanisms in string quartets further suggests that the leader’s willingness to adapt to the timing of the fellow musicians (and not only vice versa) may reflect a more team-oriented leadership style (Wing, Endo, Bradbury, & Vorberg, 2014).
Social roles and empathy in jazz ensemble improvisation
In free jazz and particularly in duos, while there is occasionally a clear leader, it is generally expected that each musician plays soli and that there are passages of “call and response” (C&R), in which one musician transforms the musical material of the other (Monson, 1996). These passages have also been described as “musical turn-taking” (Phillips-Silver & Keller, 2012, p. 2; Hadley, Novembre, Keller, & Pickering, 2015), in that jazz musicians construct predictions about the fellow musician’s timing based on motor simulation. This skill requires a high degree of flexibility from musicians, since “improvisers must respond creatively to surprise that constantly arise during performances” (Berliner, 1994, p. 374), especially when there are changes in timing. Although “jazz turn-taking” shows some similarities to leading and following such as in the piano duos described above, the terms “call and response musicians” are kept throughout the current paper, because both roles are typically interchanged in free jazz improvisations regardless of overall social relationships.
Based on performance traditions, nevertheless, certain instruments in a standard jazz ensemble frequently assume leading roles in terms of timing. In a recent study with several jazz trios including drummers, bassists and a saxophonist, the drummers’ timing influenced overall tempo of popular jazz songs (Hofmann, Wesolowski, & Goebl, 2017). A subsequent perception experiment that included manipulated timing relationships showed that listeners preferred precise temporal coordination in the rhythm section of the trios. Larger asynchronies were found to be acceptable only when the saxophonist as a solo instrument was included. Similarly, an earlier study of recordings of famous jazz ensembles showed that the soloists tended to lag behind the beat, which may lead to a “swing feel” in jazz (Friberg & Sundström, 2002, p. 334).
In free jazz, as mentioned above, there is no general separation between rhythm group and soloists, but rather “musical interactions between all musicians within a group” (Jost, 1975, p. 18, author’s translation). Compared to other subgenres of jazz, there are only a limited number of music-theoretical norms and rules. In jazz standards, for instance, there are several pre-set patterns in terms of the melodic and harmonic material, a given emotional valence of the standard, and the soli of instrumentalists are typically eight measures long (Berliner, 1994; Norgaard, 2014). Free jazz improvisers, on the other hand, are indeed “freer” from conventions to express their musical ideas. In a sense, they are all soloists negotiating a common musical performance in the very moment.
Due to the spontaneity encompassed by the improvisatory nature of free jazz performance, musicians need to constantly refer to each other and predict their co-performers’ ideas and actions. In an observational study, Seddon (2005, p. 56) named this social skill “empathetic attunement”, which was paramount for successful interaction in jazz performances. Empathy seems indeed related to important ensemble skills, and was found to enhance accuracy when differentiating between live and dubbed jazz standards (Pesquita, Corlis, & Enns, 2014). Iyer (2016, p. 80) suggests that empathy, related to a “sense of mutual embodiment”, is crucial for the perception of improvisation, which has consequences for listeners and ensemble musicians alike.
Engel and Keller (2011) asked expert jazz musicians to listen to piano recordings, some of which consisted of improvised music while others presented imitated music. While the overall recognition rate was only 55% correct, those respondents with higher scores in perspective-taking (an empathy sub-component) were more accurate at differentiating between the two versions. Identifying the subtle differences between imitation and improvisation as well as between jazz playing styles (cf. Moran, Hadley, Bader, & Keller, 2015) may constitute one of the dimensions in successful jazz musicians’ skills. It remains a question for further research as to whether empathy should be regarded as a relatively stable trait that may be required as a prerequisite for jazz ensembles and other teams, or whether playing in such ensembles may enhance interpersonal responsiveness (cf. Wöllner, 2017).
Action anticipation and expectancy in jazz improvisation
In an ensemble, perceiving others and responding to them is crucial for successful coordination, requiring aural skills, an awareness of the meaning of visual signals, and familiarity with a performance style. Internal simulation of another’s action allows precise timing of temporal nuances and the prediction of the partner’s intentions in the interplay of acting and re-acting (Pecenka & Keller, 2011). Simulation is enhanced when musicians play comparable instruments, so that shared motor skills facilitate sensorimotor synchronisation. In a visual synchronisation experiment (Wöllner & Cañal-Bruland, 2010), string musicians and other participants tapped along with the entry movements of a first violinist as the leader of a quartet, presented visually-only on a computer screen. Based on their motor and visual skills, string musicians were more precise and less variable in timing as compared to non-musicians or musicians of other instruments, who were only visually skilled in following a first violinist. These findings suggest that musical coordination involves action anticipation that is facilitated by assigned roles, expertise, and familiarity with the auditory and visual-bodily cues in a performance situation.
Expectancy has recently been studied in an EEG study comparing jazz musicians, “non-improvising” musicians and non-musicians (Przysinda, Zeng, Maves, Arkin, & Loui, 2017). Chord patterns were played to them that were typical for the Western classical music tradition and varied in harmonic expectancy. In contrast to the other two groups, improvising jazz musicians preferred medium levels of expectancy instead of high-expectancy chord progressions, and also showed higher preference ratings for the low-expectancy chords. These behavioural measures were mirrored in event-related potentials (ERPs), as shown in larger early right anterior negativity in jazz musicians. In addition, they outperformed the other groups of participants in a domain-general task of divergent thinking. It can be argued, however, that the low-expectancy chord progressions, such as using a Neapolitan chord at the end of a cadence rather than the more highly expected tonic, are simply closer to the musical idiom of jazz as compared to, say, pop music. Although harmonic expectancy undoubtedly cannot be generalised across genres, this study nevertheless points to domain-specific (medium levels for chord expectancy) as well as domain-general (divergent thinking) characteristics of jazz musicians’ processing that may aid them in responding to a fellow performer’s intentions.
Expectancy and action anticipation are further enhanced by familiarity with the musical material and may facilitate ensemble interaction. Research has shown that experienced improvisers playing jazz standards rely more on automatised melodic patterns when they have to focus on something else at the same time. For instance, when pianists direct their attention to another task such as listening to a fellow musician, or, in an experimental study, counting (Norgaard, Emerson, Dawn, & Fidlon, 2016), such patterns as automatised schemata help them to maintain a fluent performance. Melodic patterns also play an eminent role in theoretical models of certain genres of jazz improvisation (Pressing, 1988), and can be found in empirical analyses of recorded music (Norgaard, 2014). Such automatised patterns may free some of the cognitive resources needed for following other ensemble members and predicting their actions.
Some jazz musicians believe that an inflationary use of overlearned patterns is less spontaneous and creative than improvising novel melodies (cf. Berliner, 1994). Further tentative evidence for this issue, which seems to be at the very core of what constitutes creative improvisation, is provided by a neuroimaging study. Free improvisation was linked to a dissociation in prefrontal activity patterns as compared to playing overlearned sequences (Limb & Braun, 2008), suggesting that creative improvisation could be marked by a reduction of central processes associated with conscious control. Relying on patterns, on the other hand, may free some of the cognitive resources needed for attending to an ensemble partner. Thus the demands on ensemble musicians’ interactions may be higher in free jazz, while at the same time this could be balanced by the smaller number of harmonic constraints. Since free jazz improvisation, as mentioned above, involves fewer rules and musical norms compared to other types of jazz, musicians in the current study were less likely to rely on standardised or overlearned patterns and corresponding mental representations.
Emotional expression in music performance
Anticipation in ensemble performance requires not only attending to a fellow musician’s timing, but also predicting their emotional intentions. It can be argued that musicians perceive a great deal about one another’s expressiveness by channels similar to domains outside of music, such as speech. There is evidence that the expression of basic or discrete emotions shares common features across different human behaviours. Juslin and Laukka (2003) conducted a systematic review including more than 100 original studies and a meta-analysis that addressed the acoustical cues in music performance and vocal expression in speech. Decoding accuracy even in cross-cultural studies was above chance, and comparably high for speech and music performance. Among five basic emotions examined in the meta-analysis, happiness and sadness as contrasting basic emotions were not confused with each other by participants. Decoding accuracy in music performance ranged from π = .68 to 1.00 for happiness and from π = .79 to 1.00 for sadness, and mean values were above π = .80 in both music performance and within-cultural vocal expression in speech. Happiness was found to be reflected by a higher speech rate and a higher tempo in music, as well as by higher intensity and greater variability of intensity. In most studies sadness was expressed with slower tempi, lower intensity and reduced variability.
Expressions of basic emotions can also be perceived in musicians’ bodily movement (Dahl & Friberg, 2007). When presented with visual-only video material of musical performances, observers more successfully identified sad and happy emotions compared to other emotional expressions. Happy emotions were associated with perceived higher amounts of body movement and speed as compared to sad emotions. Further evidence stems from motion capture analysis of dancers, who express happiness induced by music with higher rotation range and movement complexity and lower foot distance (the latter supposedly caused by the higher overall bodily complexity in movement), and sadness with simpler movements and reduced movement complexity (Burger, Saarikallio, Luck, Thompson, & Toiviainen, 2013).
The current study
While previous research on interpersonal synchronisation as discussed above mainly required musicians to perform pre-composed music with assigned roles (e.g., Keller & Appel, 2010), or jazz standards with a higher number of conventions and musical rules (e.g., Hofmann et al., 2017), free jazz has only recently been empirically addressed in a case study (Pras, Schober, & Spiro, 2017). The challenge for free jazz ensemble improvisation, which is often particularly fascinating for the audience, lies in its perceived spontaneity in terms of timing, melodic invention, and emotional expression. In other words, there seem to be hardly any a priori limitations, requiring ensemble musicians to constantly monitor the whole performance and to focus on one another. The relative independence from music-theoretical rules allows comparisons to be drawn with other types of dyadic interactions as mentioned above.
Furthermore, no research has analysed bodily and musical characteristics in call and response patterns. Previous approaches involved duo musicians who played at the same time, yet less is known about what musicians do in an ensemble when not actively performing but listening to each other. It is believed that listening is highly important in successful improvisations that build on previous material and transform it (cf. Pecenka & Keller, 2011; Pras et al., 2017). If jazz musicians indeed “tune into” each other’s play (cf. Seddon, 2005), then this should be manifest in their shared actions.
The present study therefore addressed the following questions:
(1) How do jazz improvisers perform in a first musical encounter?
(2) In call and response situations, do musicians “tune into” one another when listening and preparing for their own solo? Are there correspondences in musical and acoustical features as well as bodily expressions?
(3) How do they express contrasting emotions in terms of these characteristics?
Jazz musicians who had not performed together before improvised the expression of two basic emotions, happiness and sadness. Acoustic as well as kinematic data were analysed, which allowed comparisons across different instrumentalists and generalisations beyond instrument-specific performance characteristics. With this approach, musical features (i.e., mode, key), acoustical features (i.e., intensity, mean tempo), and the bodily kinematics of musicians when performing and when listening to each other could be studied. Duos were formed of a saxophonist and a guitarist, a common pairing in jazz. These instrumentalists also frequently play soli in ensembles, as both guitar and saxophone are melody instruments. Each instrumentalist acted as a “call musician”, freely improvising an emotional expression, and as a “response musician”, trying to incorporate the call musician’s implicit expressive intentions in their own play. In both performing and listening situations, the musicians’ bodily motion was quantitatively captured and their musical performance was acoustically recorded.
First, it was hypothesised that C&R musicians synchronise their bodily actions to some extent, and that response musicians would be able to decode the call musicians’ expressive intentions. The call musician as a temporary leader should influence the bodily motion and musical expression in the interaction (Goebl & Palmer, 2009). Second, it was expected that these shared performance parameters differ for conditions with contrasting emotional expressions. For both C&R musicians, happy as compared to sad improvisations should be higher in tempo, intensity and variability in acoustical and bodily dimensions (Burger et al., 2013; Juslin & Laukka, 2003).
Method
Twelve male jazz musicians, aged 20–73 years (M = 39.17; SD = 17.11 years), were invited to take part in the study as duos. Half of them played the saxophone, the other half the guitar with an electronic amplifier. All had received formal lessons in at least one instrument (i.e., either the instruments they played in the current study or other instruments) for an average of 8.32 years (SD = 5.80); five of them had taken lessons in three or more instruments. Nine musicians had learned further instruments on their own (first further instrument: M = 15.22 years of self-study, SD = 14.70). They had played in jazz duos for at least three years prior to the study (M = 11.22 years, SD = 10.39) and many of them had various further ensemble experiences such as playing in big bands or other ensembles. Musicians were thus considerably skilled in jazz and ensemble performance. There were no differences between guitarists and saxophonists in performance experience. Six duos of a saxophonist and a guitarist were formed such that none of the duo musicians had played with the other one in the same musical ensemble prior to the study. Duo partners were roughly matched according to age and expertise.
After a warm-up session, one of the musicians (guitar or saxophone first, balanced across duos) was asked to improvise for approximately 20 seconds (“call”) with either one of two emotional expressions (happy, sad) or a neutral expression. Musicians used their intuitive knowledge about how to express these emotions and did not discuss it with their duo partner. After a short pause, the second musician responded to the improvisation without knowing which emotional intention the first musician had had in mind (“response”). In other words, improvisations resembled rather individual soli, giving each musician the freedom to potentially develop his own musical ideas, the only task being that the response musician should match the emotional expression of the call musician. After the first call and response (C&R) improvisations, the call musician improvised according to two further emotion conditions in an order of his choice, each immediately followed by the response. Subsequently, C&R roles of musicians were exchanged. The neutral condition was included only in order to have a further option once the first emotional expression had been performed, and to make guessing the emotion more difficult; yet the neutral condition was not of interest for analysis. The neutral conditions for the first duo were not entered into recognition analysis because they were explicitly played first. After each round of individual C&R improvisations was completed, the response musician told the experimenter which emotional expressions the call musician presumably had in mind. The call musician then stated the intended order of emotion conditions verbally.
While musicians improvised or listened to their duo partner, they were both recorded with a 12-camera optical motion capture system (OptiTrack). A full body model of 36 markers was used for the recording with a sample rate of 120 Hz. In order not to conflate shared movements of the musicians with characteristics of their respective instruments when performing, the motion of one head marker of each musician was analysed in terms of cumulative distance travelled and variability in velocity (standard deviations) using the MoCap toolbox (Burger & Toiviainen, 2013). Sound was recorded with an AT2020 microphone and synchronised with the motion capture system. For all trials, the first 3 seconds were discarded and the following 15 seconds were considered for analysis, providing data of each improvisation that could be compared without the typical tempo fluctuations of the beginning and end. Musical features (tempo, intensity) were analysed with Sonic Visualiser (Cannam, Landone, & Sandler, 2010), and differences between emotion conditions were calculated with one-tailed t-tests according to the predictions of the second hypothesis based on previous research. Response musicians’ recognition scores of the intended emotions were calculated both as percent correct and π (Rosenthal & Rubin, 1989; see also Rosnow & Rosenthal, 2003), taking into account the proportion of correct responses in relation to the number of alternative choices.
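As an illustration of the two kinematic measures described above, the following minimal sketch computes cumulative distance travelled and the standard deviation of speed for one head marker. It is not the MoCap toolbox routine used in the study: the function name `head_marker_features`, the input layout (one `(x, y, z)` tuple in metres per frame) and the plain frame-to-frame difference used to estimate speed are assumptions made for illustration only. The sample rate and the 3 s / 15 s windows follow the values given in the text.

```python
import math
import statistics

SAMPLE_RATE_HZ = 120   # sample rate reported in the Method section
SKIP_SECONDS = 3       # initial portion of each trial that is discarded
WINDOW_SECONDS = 15    # portion of each trial that is analysed


def head_marker_features(positions, rate=SAMPLE_RATE_HZ,
                         skip_s=SKIP_SECONDS, window_s=WINDOW_SECONDS):
    """Return (cumulative_distance_m, speed_sd_m_per_s) for one trial.

    positions: list of (x, y, z) tuples in metres, one per frame
    (hypothetical input layout, assumed for this sketch).
    """
    start = skip_s * rate
    stop = start + window_s * rate
    # One extra frame so that the window contains window_s * rate steps.
    window = positions[start:stop + 1]
    if len(window) < 3:
        raise ValueError("trial too short for the analysis window")

    # Euclidean distance between consecutive frames.
    steps = [math.dist(a, b) for a, b in zip(window, window[1:])]
    cumulative_distance = sum(steps)

    # Speed estimated by a simple finite difference (assumption).
    speeds = [step * rate for step in steps]
    return cumulative_distance, statistics.stdev(speeds)
```

For a synthetic marker moving at a constant 0.1 m/s along one axis, the sketch returns a cumulative distance of 1.5 m over the 15-second window and a variability of essentially zero.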
Results
Correspondences in musicians’ movements when performing and when listening to each other are presented first. Differences in kinematics between happy and sad emotions are then reported, further addressing whether musical performance parameters are related to intended emotional expression. Finally, it was asked whether response musicians could reliably decode the call musicians’ expressive intentions.
Since both instruments afford different movement patterns, and playing an instrument naturally leads to higher motion of the hands and arms as compared to listening, only the head marker was analysed. This marker provides an index of both duo musicians’ movement patterns independently from instrument characteristics. For each musician, four conditions (performing and listening, happy and sad) were analysed, leading to a total of 47 trials (one response of a saxophonist in the sad condition was excluded due to large gaps in the motion capture data). Figure 1 presents the velocity profiles of a sample duo. Even when listening, musicians seemed to entrain with their duo partner’s movements.

Figure 1. Velocity (m/s) of the head markers for a duo in the happy condition. In the upper panel, the guitarist (grey line) listens to the saxophonist (black line); roles are reversed in the lower panel. The horizontal axis shows frame numbers (sample rate: 120 frames per second). Note the difference in scale between the panels: overall velocity was lower in the first improvisation.
Variability in velocity averaged 0.048 m/s (SD = 0.034) for guitarists and 0.044 m/s (SD = 0.034) for saxophonists. Variability was correlated (Spearman) between performers and listeners, rS = 0.35, p < .05. Cumulative distance travelled, indicating the total movement path and related to overall motion quantity (since time analysed was consistent across all trials), was at M = 0.53 m (SD = 0.37) for guitarists and M = 0.60 m (SD = 0.66) for saxophonists. There was a significant correlation in cumulative distance between performers and listeners, rS = 0.30, p < .05. Thus, while saxophonists tended to move slightly more and with less variability than guitarists, there was a close correspondence in kinematics between duo partners: the more the performer moved, the more the listener moved along as well (Figure 2).

Figure 2. Relationships between performers and listeners across all duos and all emotion conditions for variability in velocity (left panel, SD of velocity in m/s), and cumulative distance travelled (right panel, in meters).
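The performer–listener relationships reported above are Spearman rank correlations. As a hedged illustration, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks, which handles ties. The function names are hypothetical, the example values are invented for demonstration and are not data from the study, and no p-value is computed.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length lists with at least 3 values")
    return pearson(average_ranks(x), average_ranks(y))


# Invented example values (not study data): performer vs. listener distance.
performer = [0.21, 0.35, 0.48, 0.62, 0.90]
listener = [0.15, 0.12, 0.30, 0.25, 0.41]
rho = spearman_rho(performer, listener)
```

Because the coefficient depends only on rank order, it is insensitive to the different movement ranges that the two instruments afford.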
Emotional expression was reflected in performance movements (Figure 3). Analyses focus on C&R performances only, hence do not include listening. Paired-samples t-tests (one-tailed), contrasting happy with sad conditions, indicate that call musicians performed with higher variability in velocity, t[11] = 3.15, p < .01, dz = 0.93, and a larger cumulative distance, t[11] = 2.12, p < .05, dz = 0.61, in the happy emotion condition. Similarly, response musicians moved with more variability, t[10] = 2.94, p < .01, dz = 0.89, and a larger cumulative distance, t[10] = 2.36, p < .05, dz = 0.71, in the happy condition (data for one saxophonist excluded, see above). Thus, both C&R musicians moved more and with more variability in happy as compared to sad emotions.

Figure 3. Call and response musicians’ kinematics and performance characteristics according to sad and happy emotional expressions (means and SD). Variability in velocity (upper left panel) and cumulative distance travelled (upper right panel). The lower panel presents tempo (left) and intensity (right) in C&R conditions. Asterisks indicate significant differences between conditions: * p < .05; ** p < .01.
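The paired comparisons of happy and sad conditions can be sketched as follows. The example computes the paired-samples t statistic and the effect size dz (mean difference divided by the standard deviation of the differences), together with the identity dz = t / √n that links the two. Function names are hypothetical and the p-value is not computed, since that would require the t distribution, which the Python standard library does not provide.

```python
import math
import statistics


def paired_t_and_dz(happy, sad):
    """Return (t, df, dz) for paired observations happy[i] vs. sad[i]."""
    if len(happy) != len(sad) or len(happy) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [h - s for h, s in zip(happy, sad)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)      # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    dz = mean_diff / sd_diff               # Cohen's dz for paired designs
    return t, n - 1, dz


def dz_from_t(t, n_pairs):
    """Recover dz from a reported paired t statistic: dz = t / sqrt(n)."""
    return t / math.sqrt(n_pairs)


# Check against one reported result: t[11] = 2.12 with 12 call musicians.
print(round(dz_from_t(2.12, 12), 2))  # 0.61
```

Applied to the reported t[11] = 2.12 for cumulative distance in call musicians, the identity gives dz ≈ 0.61, in line with the value stated in the text.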
Are musical performance parameters related to these findings? The mean tempo was analysed with a beat tracker in Sonic Visualiser (Cannam et al., 2010). The beat tracker deduced mean tempo from the performances based on pitches and accents. It should be noted that there are limits to estimating the tempo of free jazz solo improvisations, that is, to differentiating between rhythm, melodic surface structure or “embellishments” and the underlying tempo. Mean tempo data can thus only be taken as an approximation, particularly for the relatively short excerpts. Figure 3 (lower left panel) shows mean tempo for happy and sad emotional expressions in C&R situations. Call musicians played faster in the happy condition as compared to the sad condition, t[11] = 2.14, p < .05, dz = 0.62. There was no significant tempo difference between happy and sad conditions in response musicians’ improvisations, t[11] = 1.37, p = .10, dz = 0.39. There was also no significant correlation between C&R in tempo, indicating that musicians’ performance tempi on the two instruments were not affected by their duo partner’s play, as detected with the automatic beat and tempo tracker.
Mean intensity, referring to the averaged sound pressure level or volume of performances, differed between happy and sad emotional expressions in response musicians (Figure 3, lower right panel), with higher intensity in the happy condition, t[11] = 1.85, p < .05, dz = 0.54. No difference between emotion conditions was found for call musicians, t[11] = -0.42, p = .34, dz = 0.12. There were significant correlations in intensity between C&R for both the happy condition, rS = 0.71, p < .05, and the sad condition, rS = 0.93, p < .001. Thus, results illustrate some correspondence in duo musicians’ expressive performance choices (Figure 4).

Figure 4. Correlation in intensity (mean dB across performances) for happy and sad emotional expressions between C&R.
These correspondences are also mirrored to some extent in the chosen modes (i.e., minor or major as evaluated by ear judgment). In the happy emotion condition, nine call musicians played primarily in a major mode, one in minor, and for the remaining two musicians no mode was detectable. Eight response musicians played in a major mode following a “happy call”, three in a minor, and one in no specific mode. Response musicians chose the same key as call musicians four times (out of 12 C&R combinations). In the sad emotion condition, one call musician chose a major and 11 chose a minor mode, and all 12 response musicians played in a minor mode. Here correspondence in key was observed six times. These findings indicate that especially for sad emotional expressions, nearly all musicians chose a minor mode even in free jazz improvisations, which was picked up by response musicians.
Across all duos, responders judged the call musicians’ intended emotional expression correctly in 76.47% of all cases. The corresponding effect size is high at π = .87, SEπ = .08 (Rosenthal & Rubin, 1989; Rosnow & Rosenthal, 2003). Correct verbal judgments were not related to participants’ experience (years of playing an instrument, ensemble experience). Furthermore, Mann-Whitney U tests on kinematic and musical characteristics did not yield any statistical differences between those trials in which the response musician correctly decoded the emotional expression and those where this was not the case. Thus, retrospective verbal judgments may not always reflect response musicians’ actual musical and bodily actions in relation to the call musicians.
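The relationship between percent correct and π can be illustrated with the proportion index of Rosenthal and Rubin (1989), which rescales the raw proportion correct P for k response alternatives onto a two-alternative scale on which chance is always .50: π = P(k − 1) / (1 + P(k − 2)). The sketch below assumes k = 3 alternatives (happy, sad, neutral); this is an inference from the design rather than a value stated explicitly, and the function name is hypothetical. The standard error is not reproduced here.

```python
def proportion_index(p_correct, k_alternatives):
    """Rosenthal & Rubin's proportion index pi.

    p_correct: raw proportion of correct judgments (0..1).
    k_alternatives: number of response alternatives (>= 2).
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie between 0 and 1")
    if k_alternatives < 2:
        raise ValueError("need at least two response alternatives")
    k = k_alternatives
    return p_correct * (k - 1) / (1 + p_correct * (k - 2))


# Reported proportion correct, with k = 3 assumed (happy, sad, neutral).
print(round(proportion_index(0.7647, 3), 2))  # 0.87

# Chance-level guessing among three alternatives maps onto .50.
print(round(proportion_index(1 / 3, 3), 2))   # 0.5
```

Under the three-alternative assumption, the reported 76.47% correct corresponds to π ≈ .87, matching the value given in the text.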
Discussion
Free jazz improvisation offers specific perspectives on interaction that are perhaps more closely related to turn-taking in spoken conversation and non-hierarchical cooperation than other musical genres (cf. Jost, 1975). Berliner (1994, p. 386) remarked that the “dynamic reciprocity” in improvisation facilitates a “collective interplay [that] can lead players beyond the bounds of their initial plans”. The present study investigated bodily and musical characteristics in call and response situations. Results indicate that even when listening, response musicians mirror their duo partner’s play in bodily movements, as shown by correspondences in movement variability and cumulative distances. Moreover, response musicians followed the call musicians’ expressive emotional intentions, which had not been disclosed to them, and differentiated between happy and sad emotions in terms of intensity and mode in their own response performances. Similarly, bodily movements differed clearly between happy and sad emotions for both call and response musicians. The shared musical and bodily characteristics in expressive performances do not necessarily depend on explicit verbal knowledge about another musician’s intentions, given that only about three quarters of trials were correctly identified in terms of emotional expression. This finding suggests that even for “first encounters”, musicians incorporate their duo partner’s play by mirroring movements and by responding musically to their expressive ideas.
Joint actions in improvisation
Jam sessions offer opportunities for playing with other jazz musicians for the first time. Especially in free jazz, there is only a limited set of conventions and patterns, and musicians need to carefully listen to each other and watch each other’s performance gestures in order to predict future actions and to reach a coherent performance. Components of these skills were investigated in the current study. Ensemble members that are highly familiar with each other or perform existing compositions, in contrast, do not rely on visual communication to the same extent; they rather appear to have internalised their ensemble members’ actions (cf. Canonne & Aucouturier, 2015). Professional ensembles may thus focus more on the scores (in other musical genres) rather than on one another even when rehearsing a highly complex new piece (for a documentary of the Arditti quartet, see Archbold & Still, 2012).
Previous research addressed first encounters in music and their consequences for learning (Ginsborg, Prior, & Gaunt, 2013). In a case study, Schober and Spiro (2014) asked two jazz musicians to improvise on a jazz standard while not seeing each other (they were placed on opposite sides of a barrier). After the improvisations, they were asked about their own musical intentions and their partner’s presumed intentions, and about the performance. Curiously, agreement between their statements was relatively low, and lower than their agreement with the statements of an independent expert listener. Similar findings were reported in a further case study, in which two jazz musicians described a recording of their joint performance differently (Pras et al., 2017). From this research it can be concluded that ensemble musicians do not necessarily need to agree with each other verbally in order to reach a coherent performance. The current study investigated both performance and listening situations and found that response musicians appeared to entrain with the call musician’s play. Listeners followed the musical and bodily actions, which shaped their own subsequent response.
Studies of team behaviour show that successful interaction depends on responding to each other and anticipating the other’s actions (De Jaegher & Di Paolo, 2008; Glowinski et al., 2016). In conversations, these interactions are marked by periods in which an individual listens and focuses primarily on the partner. Towards the end of these periods, the conversation partner prepares for taking over, which is typically indicated by the speaking partner’s lowering of voice and by gestures (Gallois & Markel, 1975; Stevanovic et al., 2017). Moreover, transition-relevant phrases are often marked by overlapping actions. In music, these turns may also be communicated by breathing and gestures at transition-relevant moments. Most studies on ensemble performance to date have investigated simultaneous actions, often by taking into account leader–follower roles in ensembles (Goebl & Palmer, 2009; Keller & Appel, 2010; Wing et al., 2014). While such hierarchies, even if assigned randomly in experimental studies, may allow efficient levels of coordination, they do not necessarily reflect the setting in musical improvisations that go beyond clearly assigned roles.
Since in the present study specific tasks were given regarding the expression of an emotion, no spontaneous turn-taking occurred. Following the tradition of C&R patterns in jazz (Berliner, 1994), musicians rather responded to the call musician after he had finished his play. It would be interesting to study transition periods in improvising solo musicians with no hierarchical relationships or assigned roles.
Emotional expression in ensemble performance
While emotion recognition was acceptable, and 76% correct (π = .87) may seem relatively robust for first encounters, other studies have reported even higher recognition rates for happy and sad emotions in speech and song (cf. Juslin & Laukka, 2003), and for cross-cultural recognition of basic emotions in facial affective displays (for a review, see Russell, 1994). Since auditory and visual communication channels were both available, it was expected that participants of the current study would reach even higher recognition scores. It may well be that improvising according to a given basic emotion is not a typical task for jazz musicians, which in some cases may have made their display of emotional expression more ambiguous. Musicians may primarily communicate musical ideas and not basic emotions per se. On the other hand, Berliner (1994, p. 367) supports the idea of communicating emotions in jazz by stating that “improvisers immediately catch and follow up the feelings of despair and joy”. Both the bodily entrainment of response musicians and the musical distinctiveness of happy and sad improvisations indicate that they indeed followed the expressions of their duo partner. Explicit and implicit agreement in emotional interaction is undoubtedly a field for further investigation.
Both in bodily movement features and in musical performance dimensions, there were marked differences between happy and sad conditions. The head marker was used to analyse musicians’ movements, as the head motion is informative about general bodily motion in the three dimensions. At the same time, movements of the head are relatively independent from performance movements of specific instruments. Regarding musical and acoustical characteristics, intensity was correlated between C&R, but did not differ between happy and sad conditions in the call musicians’ play. Yet response musicians performed with higher intensity for happy emotional expressions as was expected for this emotion (Juslin & Laukka, 2003). This finding indicates that response musicians followed the call musicians’ intention to express a specific emotion, even if this was not explicitly verbalised in retrospect or clearly performed by call musicians. Separate analyses of the musical features for correct versus incorrect emotion judgments did not change the pattern of results.
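To make the two head-motion measures concrete, the following minimal Python sketch computes them from a sequence of three-dimensional marker positions. Cumulative distance is taken as the summed Euclidean distance between consecutive frames. The definition of movement variability used here (the standard deviation of the frame-to-frame displacements) is an assumption for illustration and may differ from the operationalisation used in the study; the function names and the sample positions are likewise hypothetical.

```python
import math
import statistics


def step_distances(positions):
    """Euclidean distance between each pair of consecutive 3D positions."""
    return [math.dist(a, b) for a, b in zip(positions, positions[1:])]


def cumulative_distance(positions):
    """Total path length travelled by the marker."""
    return sum(step_distances(positions))


def movement_variability(positions):
    """Assumed definition: population SD of frame-to-frame displacements."""
    return statistics.pstdev(step_distances(positions))


# Hypothetical head-marker positions (x, y, z) in millimetres.
head = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]

print(cumulative_distance(head))   # 17.0
print(movement_variability(head))  # 3.5
```

Because both measures are derived from the same frame-to-frame displacements, they can be computed for call and response musicians alike and then compared across roles.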
No overall correspondence was found for mean tempo, and a potential limitation may lie in using an automatic beat tracker for two instruments with different playing techniques. In this regard, the distinction between rhythm and tempo or metre, especially for free jazz, appears to be more challenging than for other genres, because there is often no steady beat. This is particularly true for solo improvisations in tempo rubato. Hence mean tempo can only be taken as an approximation. On the other hand, performers did not simply imitate each other, which was evident in the response musicians’ play with the material. A mutual tuning-in when listening and performing, and a transformation of the musical material is undoubtedly more important than simple imitation for this genre.
Suggestions for future work
Further research may focus on individual differences in responding to each other, either by in-depth case studies or by comparing a range of different duos. Case studies may transcribe the music, as is done in conversation analysis, and transition-relevant moments could be scrutinised for spontaneous situations that resemble turn-taking patterns in conversations. Nevertheless, jazz improvisation is not necessarily a model for successful interaction in general, and there are limits to using ensemble performance as a metaphor for teamwork. Despite some parallels to other types of interaction such as decision making in conversation (cf. Stevanovic et al., 2017), the specific setting and purpose of making music together should not be neglected.
Future studies could also evaluate the collaborative and musical quality of ensembles externally and relate these to the degree to which ensembles managed transitions and agreed on each other’s emotional expressions. If expressive intentions are not explicitly shared beforehand or unambiguously communicated by a musical leader, such analyses may use retrospective statements (Schober & Spiro, 2014) or musical and bodily performance characteristics as in the current study. Further movement features could be analysed for musicians playing the same instrument. It is expected that for genres with a relatively small set of standardised patterns, temporal and melodic features may vary between musicians, while on an underlying level overall agreement in musical intentions would be reflected in bodily movements or basic acoustic and musical characteristics.
Finally, individual responsiveness and team skills could be related to ensemble success. While some solo musicians may primarily impose their own intentions, others may show a more collaborative attitude and respond more frequently to other musicians, perhaps modulated by empathy. These differences should influence ensemble coordination and be more audible and visible in first encounters.
Conclusions
The current study asked free jazz musicians to improvise in call and response roles with a previously unknown partner. Musicians followed each other even when they were not explicitly aware of their duo partner’s expressive intentions. While listening, response musicians entrained their bodily movement patterns, as analysed by variability and cumulative distance of the head motion, and incorporated some of the call musicians’ expressive intentions in terms of intensity and mode. The skills demonstrated by the musicians in listening and observing before responding supported the emergence of meaningful interaction and coherent performances, even when roles and actions were interchanged successively. Duo success and coherence were evident in musicians’ differentiation between emotional expressions in both roles. Rather than simply imitating each other, jazz musicians transformed their partner’s musical material and thus showed some form of performance diversity in ensemble unity.
Footnotes
Acknowledgements
I am grateful to Vera Komeyer and Jesper Hohagen for their assistance in initial phases of the project, and to Michelle Outram for helpful comments on a previous version of the manuscript.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
