Abstract
Musical improvisation is one of the most complex forms of creative behavior and offers a realistic task paradigm for investigating real-time creativity in which revision is not possible. Despite some previous studies on musical improvisation and brain activity, which brain areas are involved in musical improvisation, and how, is not clearly understood. In this article, we designed a new functional magnetic resonance imaging (fMRI) study in which advanced jazz improvisers, while in the MRI scanner, performed improvisatory vocalization and imagery as main tasks and a prelearned melody as a control task. We incorporated a musical imagery task to avoid possible confounds of mixed motor and perceptual variables in previous studies. We found that musical improvisation compared with prelearned melody is characterized by higher node activity in Broca's area, dorsolateral prefrontal cortex, lateral premotor cortex, supplementary motor area, and cerebellum, and by lower functional connectivity in number and strength among these regions. We discuss various explanations for the divergent activation and connectivity results. These results point to the notion that a human creative behavior performed under real-time constraints is an internally directed behavior controlled primarily by a smaller brain network in the frontal cortex.
Introduction
Musical improvisation is an excellent model to study human creativity in which the output is created in real time and revision impossible. Similar to innovative verbalizations or movement sequences, musical improvisation is only possible because choices are constrained by esthetic rules and physical limitations (Pressing and In, 1988). Expert practitioners who have internalized these rules and practiced the related motor movements can produce amazingly intricate improvisations. Despite some previous studies, the neural underpinnings of musical improvisation are not clearly understood. This spontaneous process may involve divergent brain activation and connectivity patterns. One emerging idea is that creative behavior, such as musical improvisation, involves the dynamic interaction of the default mode network (DMN) and the executive control network (ECN) (Beaty et al., 2016). Interestingly, these two networks are usually associated with different tasks and are typically not active concurrently. DMN activity is associated with spontaneous and self-generated thought, including mind-wandering, mental simulation, social cognition, autobiographical retrieval, and episodic future thinking, whereas ECN activity is associated with cognitive processes that require externally directed attention, including working memory, relational integration, and task-set switching (Beaty et al., 2016). Improvisation may involve the interaction between an automatic bottom-up process (DMN) that may supply possible choices and a top-down control process (ECN) that may guide those choices according to hierarchical rules (Beaty, 2015; Beaty et al., 2016).
Improvisers manipulate elements on different hierarchical levels
The hierarchical structure of tonal music is a central constraint that may be used by the ECN to evaluate and select choices offered up by the DMN. Musical events, henceforth referred to as notes, are organized into two independent hierarchical structures related to rhythm and pitch, respectively (Jackendoff and Lerdahl, 2006). The lowest level of the rhythm hierarchy relates to distances in time between individual notes. Higher levels relate to note groupings. Meter refers to a rhythmic reference that typically is constant throughout large sections of music. For instance, in a musical piece in waltz meter, timings of individual notes are related to a rhythmic framework in which every third instance is emphasized. Similarly, pitches are organized hierarchically with the individual frequency distance between two notes referred to as an interval, small note groupings as motives, slightly longer groupings as phrases, and longer sections as choruses.
Koechlin and Jubault (2006) suggested that Broca's area (BCA) and its right homologue are specifically involved in the hierarchical organization of actions, whereas other areas in the frontal lobe process temporal organization. Accordingly, “appropriate actions are selected as subordinate elements that compose ongoing structured action plans rather than from occurrences of temporally distant events” (p. 963). Specifically, Koechlin and Jubault (2006) predicted that phasic activations differ for action selection on three hierarchical levels in a button-pressing task. Premotor regions control the selection of individual motor movements, whereas posterior BCA regions are engaged at the second level, the boundaries between simple action chunks. The third and highest hierarchical level could be defined as groupings of simple action chunks. Koechlin and Jubault (2006) showed experimentally that anterior BCA regions are specifically involved in the selection and inhibition of these action chunk groupings. Recently, Alamia et al. (2016) showed that disrupting BCA by transitory application of transcranial magnetic stimulation inhibited participants' ability to chunk nonmotor sequences.
Skilled improvisers manipulate elements within the tonal and rhythmic hierarchies to create and violate expectations of the listener. On a lower level, improvisers may repeat motives or introduce tension by employing notes from outside the dominant tonality. On a higher level, improvisers describe organizing their entire solo around an architectural design (Berliner, 1994; Norgaard, 2011). Independent of training, listeners within a musical culture learn to decode expectations and violations much the same way they learn their native language (Hay et al., 2011). Furthermore, it appears that these fulfilled or violated predictions may elicit emotions in the listener (Brattico et al., 2013; Salimpoor et al., 2015). Listeners appear to prefer music that contains a balance of predictability and novelty as related to their individual background (Pearce and Wiggins, 2012). We would expect involvement of BCA and other regions related to the ECN during musical improvisation, as available choices have to be evaluated and selected according to these intricate hierarchical musical rules.
Improvisations consist of concatenated motor movements
One of the most cited theoretical frameworks for cognition behind improvisation is centered around concatenated motor movements (Pressing and In, 1988). Indeed reviews of large corpora of jazz improvisations have identified numerous repeated musical patterns that more than likely are generated using corresponding motor chunks (Norgaard, 2014). On a higher level, the motor chunks are likely selected according to higher level plans for action chunk groupings (Norgaard, 2011). Although Pressing's framework does not specifically include action chunk groupings, the verbal accounts of improvisers would appear to indicate that they often concentrate on this higher hierarchical level. In addition, experimental research shows that melodic patterns are more frequent in improvisations done while conscious involvement is attenuated through engagement with a secondary unrelated task (Norgaard et al., 2016). This would indicate that less cognitive engagement with the improvisation inhibits the improviser's ability to vary and design improvisations around higher hierarchical plans, instead relying on a smaller repertoire of repeated motor chunks. In other words, when a secondary task engages the ECN, the lack of control may result in the improviser using more stereotypical patterns offered up by the DMN.
Previous studies of musical improvisation used overt movement tasks
There is some support for the interaction between the DMN and the ECN during musical improvisation from previous neuroimaging research. However, much of this research used only pianists who performed supine in a magnetic resonance imaging (MRI) scanner on very short keyboards limiting ecological validity and generalizability to other instruments. Berkowitz and Ansari (2008) investigated neural correlates of musical improvisation in a study in which trained pianists played either novel or prelearned rhythmic and melodic sequences while functional magnetic resonance imaging (fMRI) data were collected. A brain network was identified based on activations in the dorsal premotor cortex (PMD), the rostral cingulate zone of the anterior cingulate cortex, and the inferior frontal gyrus (IFG) during improvisation compared with prelearned condition. However, the participants were classically trained pianists with no prior experience of jazz improvisation. Owing to the lack of improvisational training, it is possible that the ECN was heavily engaged during this study as participants were grappling with the novel improvisational task. In addition, the musicians played on a keyboard with only five notes, severely limiting note choices.
Another study by Limb and Braun (2008) used a similar contrast and found that the entire dorsolateral prefrontal region was attenuated during improvisation, partially contradicting the activations found by Berkowitz and Ansari. Limb and Braun (2008) investigated brain activity while jazz pianists played either a prelearned melody or an improvised solo over the same accompaniment. The six participants in this study were advanced improvisers who were accompanied by a jazz rhythm track and played a 35-note keyboard. Limb and Braun concluded that conscious control processes are less active during improvisation and theorized that the medial prefrontal regions could generate the improvised output without conscious involvement. In this case, the DMN may have been able to guide improvisational choices due to the high level of improvisational training of the participants. Indeed, another study that included expert improvisers and interaction between musicians found increased activation in frontal control regions (Donnay et al., 2014). Here, the extra cognitive resources related to interpreting and responding to another musician during improvisation may be responsible for the activation related to the ECN.
de Manzano and Ullén (2012) investigated improvisations by a group of professional classical pianists, by studying overlaps and differences in brain activity during both pseudorandom key presses and piano improvisation. The activity in both modes of generation was significantly higher in IFG, which included the dorsolateral prefrontal cortex (dlPFC), bilateral insula, and cerebellum (Cb) compared with a control condition. They concluded that the activation pattern reflects a generic process that is independent of the overall goal. Again, the activation of frontal control regions may have been related to the task of selecting novel keypresses, which is unfamiliar to classical pianists used to only performing prelearned music.
To reconcile previous contradictory findings related to prefrontal control regions, Pinho et al. (2016) compared activation during an emotional play condition (play a happy or fearful melody) with a pitch-set condition. The pitch-set contrast (pitch-set vs. emotional) induced comparably greater activation of the bilateral dlPFC, extending throughout the middle frontal gyrus (MFG) into the PMD in the right hemisphere. In addition, there was greater activity in the bilateral parietal lobes. The reverse contrast (emotional vs. pitch-set) revealed comparably greater activation of the left dorsomedial prefrontal cortex in the superior medial gyrus, the left medial orbital gyrus, and bilateral insula, extending into the amygdala. They interpreted the results as suggesting that the dlPFC activation during improvisation with a limited number of pitches is due to subjects holding the pitch-set in working memory. In contrast, during the emotional condition, subjects relied on implicit associations between valence and musical output. Concerning connectivity, the emotional condition was associated with increased connectivity between dlPFC and the DMN. Beaty et al. (2016) suggested that the dlPFC may exert a top-down influence over generative processes stemming from the default network during the strategic expression of emotionally based improvisation.
Vocalizing and imagining improvisations
Participants engaged in overt motor movements in all previous studies. Although attempts were made to control variables, this study bypassed potential confounds related to overt movement by including an imagery task. It is well established that auditory perceptual and secondary motor regions can be activated during covert auditory imagery. This effect has been observed during internal auditory discrimination (Zatorre et al., 1996), auditory imagery of a musical score (Yumoto et al., 2005), and even during passive listening (Kraemer et al., 2005). In a study with advanced pianists, Meister et al. (2004) found that a bilateral frontoparietal network was active during both play and imagining. The only difference was that during imagining, activation in the contralateral primary motor cortex and bilateral posterior parietal cortex was not observed (Meister et al., 2004). Interestingly, the level of motor activation is dependent on the subject's knowledge of the actual movements necessary to play the music even in listening only conditions, and this association can be trained over just a couple of days (Lahav et al., 2007). Finally, expert musicians often use mental imagery explicitly during both practicing and actual performance; for a review, please see Keller (2011).
We investigated differences in activations between vocalizing and imagining prelearned and improvised music. Specifically, the participants vocalized or imagined singing well-known melodies and continued to improvise over those melodies and the related chord structure. This task allowed for the recruitment of expert jazz improvisers who played several different primary instruments. We hypothesized that the improvisation minus prelearned contrast would activate a network similar to networks identified in previous research related to music improvisation. This would include BCA in IFG, the dlPFC, premotor areas, parietal association areas, and the cerebellum. We also hypothesized that the contrast would include the BCA for the following reason: as the four included melodies were well known, participants would more than likely have learned to combine related motor movements into larger chunks representing longer phrases of the melodies. In contrast, improvisations would involve selecting and inhibiting unwanted motor chunks. Furthermore, during improvisation, those chunks may be selected according to architectural plans on a higher hierarchical level. We did not have predictions related to changes in connectivity as earlier studies utilizing the contrast between improvisation and memory retrieval did not report related changes in connectivity. We did, however, hypothesize that the superior temporal gyrus (STG) would be part of a network based on our prior electroencephalography study (Adhikari et al., 2016) and the location of the auditory cortices.
Materials and Methods
Participants
Twenty-four male advanced jazz improvisers (4 left handed, 20 right handed; mean age ± standard deviation [SD] = 31.9 ± 13.6 years) were exclusively recruited for this study. A criterion for participation was expertise in jazz improvisation. Participants had at least 6 years of professional experience (mean ± SD = 21.3 ± 13.5 years) in jazz improvisation (Table 1). Twenty-three participants had previous education in a University System School of Music; the average number of schooling years for all participants was 16.2 years (SD = 1.8 years). Participants were also required to know how to read music. Primary instruments included piano (n = 5), saxophone (n = 11), guitar (n = 2), trumpet (n = 2), drums (n = 1), trombone (n = 1), French horn (n = 1), and bass (n = 1). All participants had normal or corrected-to-normal vision and reported a normal neurological history. Participants provided written and signed consent forms and were compensated for their participation in the experiment. The Institutional Review Board for the Joint Georgia State University and Georgia Institute of Technology Center for Advanced Brain Imaging, Atlanta, Georgia, approved this study.
Age, the Primary Musical Instrument, and Years of Experience (Jazz Experience) of the Participants in This Study
Participants shown in bold italics in the table had improper response time durations in all runs and were excluded from the functional magnetic resonance imaging data analysis.
Experimental conditions
Before fMRI recording, participants were familiarized with the four tasks: vocalize prelearned (VP), vocalize improvised (VI), imagine prelearned (IP), and imagine improvised (II). During the prelearned conditions, participants were prompted to vocalize or imagine one of the four melodies (Au Privave, Now's the Time, Blues for Alice, and Billie's Bounce; Fig. 1A), which were memorized and rehearsed before the day of the experiment. All four melodies are based on a standard 12-bar blues chordal progression, and participants were familiar with the melodies before the testing. In addition, they were asked to complete each task module before the scan to make sure they could perform the given tasks accurately within the appropriate time duration. Participants were instructed to imagine singing without any overt vocalization during the imagine conditions, and to sing (vocalize) during the vocalize conditions. These four melodies were chosen from the Bebop era of jazz, as the complexity of these melodies is comparable with expected improvisations (Berliner, 1994). These performances of prelearned melodies from memory require little to no creativity. Results from both prelearned conditions were contrasted with the two improvised conditions, VI and II, during which participants vocalized or imagined a spontaneously improvised melody over the blues chord progression. We did not require participants to vocalize melodies and improvisations at the quality of a trained jazz singer. Here, we simply asked the musicians to vocalize as they would during a practice session (nonwind instrumentalists) or during casual practice without the instrument. Such practice is common among jazz musicians, and jazz students are typically asked to vocalize improvisations as a pedagogical tool (Berliner, 1994).

Experimental task paradigms.
No metronome beat was audible during the experimental conditions, but before each trial, there was an audible beat representing a two-measure count-in (for 3.6 sec). Participants vocalized or imagined the cued melody twice and then went directly into a two-chorus improvisation over the same harmonic progression. Participants indicated that they switched from melody to improvisation by pressing a button.
Upon arrival at the testing site, participants provided informed consent and were familiarized with the task. They went through practice sessions at a mock scanner to reduce anxiety and to make sure they performed all experimental tasks correctly before going into the scanner for the actual functional runs. They were asked to remain still and not to move their heads or other parts of their body during the recording session. An fMRI-compatible microphone was used for auditory recording. To constrain head motion, foam pads were used for support in the head coil. The task sequences were displayed on a screen inside the scanner through the E-prime program “E-prime_V2.0.10.242” (
Data Acquisition and Analysis
Behavioral data analysis
Behavioral data were recorded on the computer that also ran the E-prime program displaying the experimental task sequences. The audio output (vocalized melodies and improvisations) was recorded as MP3 files using an fMRI scanner compatible microphone. Stimulus onset time and the time between the onset of a task condition and the button press (start of improvisation) in each trial were recorded. Audio files were analyzed to determine participants' performance accuracy in reproducing the cued melodies. The improvisations were evaluated to ensure they implied the dictated blues chord progression. Any performed trials or runs with inappropriate duration (taking too long or too short) were dropped and thus not included in the data analysis.
fMRI data
The whole-brain MR imaging was done on a 3-Tesla Siemens scanner available at the Georgia State University and Georgia Institute of Technology Center for Advanced Brain Imaging, Atlanta, Georgia. The functional scans were acquired with a T2*-weighted gradient echo-planar imaging sequence: echo time (TE) = 30 msec, repetition time (TR) = 1970 msec, flip-angle = 90°, field of view (FOV) = 204 mm, matrix size = 68 × 68, voxel size = 3 × 3 × 3 mm³, and 37 interleaved axial slices with a thickness of 3 mm each. High-resolution anatomical images were acquired for anatomical reference using a magnetization-prepared rapid gradient-echo sequence with TR = 2250 msec, TE = 4.18 msec, flip-angle = 9°, and voxel size = 1 × 1 × 1 mm³.
fMRI data were preprocessed by using Statistical Parametric Mapping (SPM12; Wellcome Trust Centre, London,
Connectivity analysis
The regions of interest (ROIs) were based on activation t-maps during overall improvisation (VI+II) compared with overall prelearned (VP+IP) condition except the primary auditory cortex in temporal region, which is based on our hypothesis. We defined six ROIs, a sphere of 6 mm radius in MarsBar (
Functional connectivity
For each subject, the average time series of each ROI was calculated for each trial. We then calculated pairwise correlation coefficients between two ROIs from trial to trial. To estimate the average effect, we applied Fisher's z-transformation (Bond and Richardson, 2004; Cox, 2008; Silver and Dunlap, 1987) to the correlation values: each correlation value r was converted to its equivalent Fisher's z-value, z = arctanh(r), and the z-values were averaged. The average Fisher's z-values were then used to calculate the grand average z-value, the statistical significance level p, and the corresponding correlation coefficient for each pair of ROIs. Inter-regional correlation analysis was performed for the overall musical improvisation and prelearned conditions and for the vocalize and imagine conditions.
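The averaging step can be illustrated with a minimal sketch, assuming that correlations are averaged in z-space and the mean is mapped back to a correlation coefficient with the inverse transform. The function names and the sample values are hypothetical and are not taken from the study; the computation of the significance level p is not covered here.

```python
import math

def fisher_z(r):
    """Fisher z-transform of a correlation coefficient, z = arctanh(r)."""
    return math.atanh(r)

def average_correlation(r_values):
    """Average correlations in z-space, then map the mean back to r."""
    z_values = [fisher_z(r) for r in r_values]
    z_mean = sum(z_values) / len(z_values)
    return z_mean, math.tanh(z_mean)

# Hypothetical trial-wise correlations between two ROIs (illustrative only).
trial_r = [0.30, 0.55, 0.42, 0.61]
z_mean, r_mean = average_correlation(trial_r)
print(f"mean z = {z_mean:.3f}, back-transformed r = {r_mean:.3f}")
```

Averaging in z-space rather than averaging r directly is the usual motivation for the transform, because the sampling distribution of z is closer to normal than that of r.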
Directed functional connectivity
We performed Granger causality (GC) analysis to characterize the directional information flow between ROIs. The ensemble-mean removed segmented deconvolved time series from separate voxels and subjects were treated as trials for reliable estimates of the network measures. We calculated the frequency-dependent GC spectra (Dhamala et al., 2008) for pairs of ROIs. The significant GC spectra, and hence the significant network interactions, were defined by setting a GC threshold above the random-noise baseline. We constructed a set of surrogates by randomly permuting trial data from each participant and task condition. To compute the GC threshold value, we used a random permutation technique (Blair and Karniski, 1993; Brovelli et al., 2004); the threshold value was based on the null hypothesis that there was no statistical interdependence between nodes when trials were randomized. We computed GC spectra from all possible pairs of ROIs with a minimum of 1000 random permutations and picked the maximum GC on each permutation. The threshold for GC spectra at significance p < 10⁻⁶ was obtained by fitting the distribution with a gamma-distribution function (Dhamala et al., 2008), and this threshold value was used to identify significantly active directed network activity among ROIs. Conditional GC analysis was carried out to rule out the mediated interactions among the ROIs and to retain only the direct network interactions. We also computed the time-domain GC values for significantly active network directions from each participant and performed paired t-tests on these values to find the significant network modulation during the various musical task conditions.
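The logic of a permutation-based GC threshold can be sketched on toy data. This is not the analysis pipeline of the study: it swaps in a lag-1 autoregressive, time-domain GC for the frequency-dependent spectra, uses a single pair of simulated signals, and takes the maximum over surrogates as an empirical stand-in for the gamma-distribution fit. All function names, the simulated model, and the parameter values are assumptions made for illustration.

```python
import math
import random

def _demean(series):
    m = sum(series) / len(series)
    return [v - m for v in series]

def granger_ar1(source, target):
    """Time-domain GC source -> target with lag-1 models (an assumed simplification).

    Returns ln(var_restricted / var_full): the restricted model predicts
    target[t] from target[t-1]; the full model adds source[t-1].
    """
    s, x = _demean(source), _demean(target)
    now, past_x, past_s = x[1:], x[:-1], s[:-1]
    n = len(now)
    sxx = sum(a * a for a in past_x)
    sss = sum(a * a for a in past_s)
    sxs = sum(a * b for a, b in zip(past_x, past_s))
    sxn = sum(a * b for a, b in zip(past_x, now))
    ssn = sum(a * b for a, b in zip(past_s, now))
    # Restricted model: a single least-squares coefficient.
    a_r = sxn / sxx
    var_r = sum((t - a_r * p) ** 2 for t, p in zip(now, past_x)) / n
    # Full model: solve the 2x2 normal equations.
    det = sxx * sss - sxs * sxs
    a_f = (sxn * sss - ssn * sxs) / det
    b_f = (ssn * sxx - sxn * sxs) / det
    var_f = sum((t - a_f * p - b_f * q) ** 2
                for t, p, q in zip(now, past_x, past_s)) / n
    return math.log(var_r / var_f)

def simulate_trial(rng, n=200, coupling=0.8):
    """Toy trial in which y drives x with a one-sample delay (assumed model)."""
    x, y = [0.0], [0.0]
    for _ in range(n - 1):
        x.append(0.5 * x[-1] + coupling * y[-1] + rng.gauss(0, 1))
        y.append(0.5 * y[-1] + rng.gauss(0, 1))
    return x, y

def mean_gc(targets, sources):
    """Average source -> target GC over paired trials."""
    pairs = list(zip(targets, sources))
    return sum(granger_ar1(s, t) for t, s in pairs) / len(pairs)

def derange(items, rng):
    """Random reordering in which no trial keeps its original position."""
    while True:
        order = list(range(len(items)))
        rng.shuffle(order)
        if all(i != j for i, j in enumerate(order)):
            return [items[j] for j in order]

rng = random.Random(0)
trials = [simulate_trial(rng) for _ in range(20)]
xs = [t[0] for t in trials]
ys = [t[1] for t in trials]

observed = mean_gc(xs, ys)  # y -> x, the simulated direction
reverse = mean_gc(ys, xs)   # x -> y, absent in the simulation

# Surrogates: re-pair x-trials with y-trials from other trials, which keeps
# each signal's own dynamics but breaks the trial-by-trial interdependence.
null = [mean_gc(xs, derange(ys, rng)) for _ in range(100)]
threshold = max(null)  # empirical stand-in for the gamma-fit threshold

print(f"y->x GC = {observed:.3f}, x->y GC = {reverse:.3f}, "
      f"surrogate threshold = {threshold:.3f}")
```

In this sketch a direction counts as significant when its GC exceeds the surrogate threshold. Fitting a parametric distribution to the surrogate maxima, as described above, makes it possible to set thresholds at significance levels far smaller than the reciprocal of the number of permutations.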
Results
Behavioral results
The recorded audio files of the prelearned and improvised vocalizations were analyzed to ensure that the number of notes did not differ significantly between the prelearned and improvised conditions. A paired t-test found no significant difference in note count between the two conditions. Imagined tasks were monitored during recording for appropriate performance duration and to make sure no confounding vocalization occurred during the imagery tasks.
Any trial or run with an inappropriate performance duration, either too long or too short, was not included in the data analysis. Trials were monitored during data acquisition and compared with the expected length. Based on the tempo given by the metronome played at the beginning of each trial, vocalizing or imagining the melody or the improvisation twice should take about 32 sec, so only the trials with durations between 28 and 38 sec were included in the data analysis. Four participants (participant numbers 7, 11, 20, and 23, shown in bold italics in Table 1) had inappropriate performance durations in all runs and were thus excluded from the fMRI data analysis. Excluding those four participants resulted in a mean age ± SD and mean years of experience ± SD of 30.9 ± 13.3 years and 20.2 ± 12.8 years, respectively. The retained 20 participants had 111 trials (11.6%) with inappropriate performance durations, indicating that they may not have improvised over the given harmonic framework. These trials were also excluded from further analysis.
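The duration criterion amounts to a simple window filter, sketched below. The bounds are assumed to be inclusive, which the text does not state, and the function name and example durations are hypothetical rather than study data.

```python
def within_window(duration_s, low_s=28.0, high_s=38.0):
    """True when a trial duration lies inside the accepted window (bounds assumed inclusive)."""
    return low_s <= duration_s <= high_s

# Hypothetical trial durations in seconds (illustrative values only).
durations = [31.8, 27.5, 33.0, 38.0, 40.2]
kept = [d for d in durations if within_window(d)]
print(f"kept {len(kept)} of {len(durations)} trials: {kept}")
```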
In addition, the vocalization trials were rated for accuracy independently by two expert jazz musicians not affiliated with the study using the Consensual Assessment Technique (Amabile, 1996). Accuracy was rated on a 7-point Likert Scale with 1 being “extremely inaccurate” and 7 being “highly accurate.” Accuracy for the improvisation trials was defined as “pitches imply underlying blues chord progression and rhythms imply a steady pulse.” We should note that due to technical difficulties, we only recorded the audio from 13 participants although vocalizations were monitored during the data acquisition. Mean ratings ± SD were 6.34 ± 0.35 and 6.01 ± 0.37 for the prelearned and the improvised vocalizations, respectively.
Brain activations
Brain activations were studied with all possible contrasts: VI versus VP, II versus IP, and overall improvisation (VI+II) versus overall prelearned (VP+IP). Each of these tasks was also compared with rest as baseline. During each improvisational task, there was significantly higher brain activation compared with prelearned condition, but there was no activation the other way around. The brain activations during any improvised or prelearned tasks or any combination of tasks were always significantly higher when compared with rest, but no activation was observed when comparing rest with the other conditions. The significant brain activations are listed in Table 2.
Brain Activations for Various Contrasts
The table includes the anatomical locations, cluster sizes, t-values (z-scores), and MNI coordinates of the activations. The brain activations listed for contrasts against rest are thresholded at p < 0.05, FWE-corrected for multiple comparisons, with cluster extent k > 20. The brain activations for contrasts between two task conditions are thresholded at p < 0.0005 (not FWE-corrected) and k > 20.
Corrected FWE.
BA, Brodmann area; BCA, Broca's area; Cb, cerebellum; dlPFC, dorsolateral prefrontal cortex; FWE, family-wise error; IFG, inferior frontal gyrus; II, imagine improvised; IMG, overall imagination (II+IP); IMP, overall improvisation (VI+II); IP, imagine prelearned; L, left; lPMC, lateral premotor cortex; MFG, middle frontal gyrus; MNI, Montreal Neurological Institute; PL, overall prelearned (VP+IP); PoCG, postcentral gyrus; PrCG, precentral gyrus; R, right; SMA, supplementary motor area; STG, superior temporal gyrus; VI, vocalize improvised; VOC, overall vocalization (VI+VP); VP, vocalize prelearned.
Activations during improvised tasks were associated with significant changes in frontal activity. During overall improvisation compared with the prelearned condition, we observed widespread activations in the left IFG including the BCA (referred to as IFG unless otherwise stated), the dlPFC, motor areas (the lPMC in the MFG, referred to as MFG, and the left SMA), and the right cerebellum (RCb) (Fig. 2). Maximum probability mapping using SPM Anatomy toolbox (

Brain activations. The brain activations for overall improvisation (vocalize improvised+imagine improvised) versus overall prelearned (vocalized prelearned+imagine prelearned). The color intensity represents t-statistics, and the activations are overlaid on the Montreal Neurological Institute structural template brain in neurological orientation. Cb, cerebellum; IFG, inferior frontal gyrus; lPMC, lateral premotor cortex; SMA, supplementary motor area. Color images are available online.

The overlap between the activation clusters and brain structures defined with maximum probability mapping in SPM Anatomy is shown. The overlaid color cluster represents the functional activation on BCA (IFG) during improvisation compared to pre-learned condition; the hotter the color, the higher the activation. BCA, Broca's area; IFG, inferior frontal gyrus. Color images are available online.
Network activity
We performed connectivity analysis among the six nodes: IFG, dlPFC, MFG, SMA, RCb, and STG (primary auditory cortex). Inter-regional correlation analysis, as described earlier, was used to see whether these regions were functionally connected. Figure 4 shows the functional connectivity during the prelearned (PL) and improvisation (IMP) conditions, indicating that there was less functional connectivity during IMP compared with PL. Figure 5 shows the functional connectivity during the vocalize (VOC) and imagine (IMG) conditions; we found less functional connectivity during the imagine condition compared with the vocalize condition. Only the statistically significant functional connections (significance level, p < 0.05) are shown in the figures with their corresponding correlation coefficients and p values.

Functional connectivity for prelearned (PL) condition and improvised (IMP) condition. Only functionally significant connections (p < 0.05) are shown here with corresponding correlation coefficient r and p value. dlPFC, dorsolateral prefrontal cortex; lPMC, lateral premotor cortex; RCb, right cerebellum; STG, superior temporal gyrus.

Functional connectivity for vocalize (VOC) condition and imagine (IMG) condition within musical improvisation and prelearned. Only functionally significant connections (p < 0.05) are shown here with corresponding correlation coefficient r and p value.
We computed GC spectra to assess directional network interactions among the six nodes. Pairwise GC spectra were calculated separately for the improvised and prelearned conditions, each including both the vocalize and imagine conditions. We used the permutation threshold criteria to find the significant causal interaction directions (details are in the Materials and Methods section). Figure 6 shows a schematic representation of the significant causal connections among nodes that are also functionally connected, during the PL (left panel) and IMP (right panel) conditions. The thickness of the line represents the strength of the causal interactions, as shown in each plot. The node pointed to by the arrowhead receives the causal influence from the node at which the line starts. During prelearned conditions, we found bidirectional interactions between dlPFC and all other nodes except RCb and SMA. Unidirectional causal influence was found from RCb to SMA. There were unidirectional causal influences from IFG to SMA, IFG to lPMC, and IFG to RCb, which were found to be mediated by other nodes and hence were ruled out. We found significant unidirectional causal influence from STG to the other nodes except RCb (no functional correlation between STG and RCb, Fig. 4). During improvisation (right panel in Fig. 6), the network interactions from dlPFC to STG and from SMA to RCb were ruled out as they were found to be mediated. During improvisation, we found bidirectional interactions between dlPFC and RCb, and unidirectional causal influences from dlPFC and STG to lPMC and from lPMC to SMA.

Network interactions. Schematic representation of significant causal interaction directions among six nodes: BCA (left IFG), dlPFC, lPMC, SMA, STG, and RCb. The significant causal connections for overall prelearned (PL) and overall improvisation (IMP), as determined by permutation threshold criteria (p < 10⁻⁶), are shown by solid lines with arrowheads; the width of each line represents the connection strength (maximum Granger causality value): the thicker the line, the greater the causal strength. The red stars (left panel) mark the network interaction directions that increased significantly (p < 0.05) when the causal strength during overall prelearned is compared with overall improvisation. dlPFC, dorsolateral prefrontal cortex. Color images are available online.
We next examined how the causal interactions changed across task conditions. The time-domain GC values, calculated over the entire frequency range for all participants, were compared across task conditions using paired t-tests. When the causal interaction strengths during the prelearned condition were compared with those during the improvisation condition, the directed interactions from dlPFC and RCb to SMA were significantly greater during the prelearned condition (p < 0.05), as indicated by red stars (Fig. 6). No other interaction directions changed significantly. We also compared the causal interaction strengths between PL and IMP separately within the vocalize and imagine task conditions. We found a significant increase (p < 0.05) in the bidirectional interactions between dlPFC and SMA and in the unidirectional interaction from RCb to SMA during the vocalized prelearned (VP) condition compared with the vocalized improvisation (VI) condition, as marked by red stars (left panel, Fig. 7). During the imagined prelearned (IP) condition compared with the imagined improvisation (II) condition, a significant increase (p < 0.05) in the directed causal interactions was found from BCA (IFG) to RCb and SMA, and from dlPFC to SMA, as marked by red stars (right panel, Fig. 7). We show only those directions that are functionally connected and whose causal interactions are significant, having ruled out mediated interactions using the conditional GC analysis.
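The condition comparison described above can be sketched in Python as a paired t-test on per-participant causality values for a single direction. The numbers below are invented purely for illustration and are not data from this study; the function name `paired_t_test` and the sample of eight participants are likewise hypothetical. The two-sided p value is obtained by numerically integrating the Student t density, which is one of several ways to compute it.

```python
import math
import statistics


def student_t_pdf(t, df):
    """Probability density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))


def paired_t_test(a, b, n_steps=2000):
    """Paired t-test of a versus b; returns (t statistic, two-sided p value).

    n_steps must be even because Simpson's rule is used for the integral.
    """
    diffs = [u - v for u, v in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    df = n - 1

    # Simpson's rule for P(0 < T < |t|); the density is symmetric about zero.
    upper = abs(t_stat)
    step = upper / n_steps
    total = student_t_pdf(0.0, df) + student_t_pdf(upper, df)
    for i in range(1, n_steps):
        weight = 4 if i % 2 else 2
        total += weight * student_t_pdf(i * step, df)
    central_mass = total * step / 3
    p_value = max(0.0, 1.0 - 2.0 * central_mass)
    return t_stat, p_value


# Invented causality values for one direction, one value per participant.
gc_prelearned = [0.12, 0.15, 0.10, 0.18, 0.14, 0.16, 0.11, 0.17]
gc_improvised = [0.08, 0.11, 0.09, 0.12, 0.10, 0.13, 0.07, 0.12]

t_stat, p_value = paired_t_test(gc_prelearned, gc_improvised)
print("t =", t_stat, "p =", p_value)
if p_value < 0.05:
    print("Direction differs significantly between conditions (toy data).")
```

A positive t statistic here would correspond to stronger causal influence in the first condition than in the second, which is the sense in which a direction is marked with a red star in the figures.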

Network interaction modulation. Significant changes in network interactions (p < 0.05) are marked with a red star during vocalized prelearned compared with vocalized improvisation (VOC) in the left panel, and imagined prelearned compared with imagined improvisation (IMG) in the right panel. A red star represents the increase in network interaction. Color images are available online.
Discussion
In this study, we investigated fMRI BOLD responses during vocalized or imagined musical performance of melodies retrieved from memory (prelearned condition) followed by improvisations (improvised condition) on the same chordal structure. In the current paradigm, the improvised and prelearned conditions gave rise to similar motor actions; only the mode of creation differed. The neural correlates of this difference were the focus of the current research. We found that musical improvisation is characterized by significant changes in frontal cortices, with increased widespread activity in the left IFG including BCA and in dlPFC, extending to the motor areas lPMC in the MFG, SMA, and RCb (Figs. 2 and 3). Interestingly, the functional connectivity as measured by correlations was significantly lower during improvisation (Fig. 4). The causal interaction strengths from dlPFC and RCb to SMA were significantly greater during the prelearned condition than during improvisation (Fig. 6). Furthermore, we found a significant increase in the directed causal interactions from dlPFC and RCb to SMA (left panel, Fig. 7) during VP compared with VI, and from IFG to RCb and SMA, and from dlPFC to SMA (right panel, Fig. 7) during IP compared with II. Below we discuss why improvisation leads to increased node activation but decreased connectivity from higher level prefrontal control to motor planning areas.
Cognitive processes underpinning musical improvisation include fitting responses to an overall architectural structure, combining discrete chunks into an action chain, and selecting individual auditory and motor chunks (Pinho et al., 2016; Pressing and In, 1988). The activation of BCA during improvisation in this study may indicate the central role of BCA in the generation, selection, and execution of action sequences. Specifically, BCA has been implicated in higher order chunking mechanisms that are central to hierarchically organized sequences (Alamia et al., 2016). Tonal music is hierarchically organized according to both tonal and rhythmic hierarchies (Jackendoff and Lerdahl, 2006), and tonal jazz improvisations show statistical distributions similar to other tonal music (Jarvinen, 1995). Therefore, BCA may control the selection and concatenation of auditory chunks that together form a syntactically pleasing sequence that displays these hierarchies (Beaty, 2015). However, this interpretation of the activation does not explain why connectivity during improvisation (IMP) is lower than during prelearned (PL) performance. The regional brain (node) BOLD response can be attributed to the synaptic input to the neuronal population of that region and its intrinsic processing (Lauritzen, 2005; Logothetis, 2003). The intrinsic processing contributes the dominant share of the overall activity (up to ∼79%) (Harris et al., 2010). Consistent with these findings, it is reasonable to assume that the elevated activity in IMP compared with PL is most likely related to the additional cognitive load fulfilled by intrinsic neural processing in each brain area rather than to greater coordination among areas, as in PL. We offer two explanations for this observed phenomenon: one related to Broca's involvement in evaluation processes and the other related to the translation of abstract information to motor commands.
It is possible that the higher node activity in the cognitive control areas is related to ongoing evaluation of ideas (Beaty et al., 2016) but that most of those ideas were initially appropriate, alleviating the need to communicate corrective information to the motor areas. New research investigating the role of IFG in a traditional alternative uses task found that left IFG is involved in the evaluation of creative ideas generated by neural structures associated with the DMN (Kleinmintz et al., 2018). During musical improvisation, the real-time demands of the task most likely involve continuous generation with concurrent evaluation (Norgaard, 2011). This is in contrast to traditional creative tasks such as poetry generation (Liu et al., 2015) and painting (Ellamil et al., 2012), in which the lack of time constraints allows for separate generation and evaluation stages. In this study, we postulate that, because the subjects were advanced improvisers with extensive knowledge of the dictated harmonic context, the bottom-up generative processes served up mostly appropriate ideas (Limb and Braun, 2008). Although frontal cortical regions monitored the output more closely during improvisation due to the novelty of the generated responses (higher activation), the initial ideas were mostly appropriate. Therefore, the output of the executive network evaluation did not need to be communicated to motor regions (less connectivity). In contrast, during the prelearned condition, an exact auditory picture retrieved from memory was constantly being compared with the actual output, and detailed adjustments were communicated continuously from executive control areas to motor planning regions, resulting in higher connectivity. However, since no retrieval and concatenation of novel output were required, activation of the ECN areas was lower than in the improvisation condition.
It has recently been suggested that musical improvisation as well as other creative behaviors rely on a constructive interplay between the DMN and ECN (Beaty et al., 2016). Indeed, the activation of control areas and the coupling of the ECN and the DMN appear to be directly related to the amount of goal-directed processing necessary for the task. Pinho et al. (2016) compared two types of improvisation tasks and found a network similar to the network identified in this study (frontal-motor) during a task in which participants were required to use a specific pitch set during improvisation. They identified a different network (frontal to DMN) in a task in which improvisers were simply required to communicate emotions. In this study, we compared network activity between specified nodes, as opposed to Pinho et al. (2016), who used a seed region to identify two different networks. In addition, we used the prelearned/improvised contrast, whereas Pinho et al. (2016) compared two types of improvisation. Nonetheless, it is interesting that their pitch set condition pointed to a network in which dlPFC was connected to motor regions, similar to the network we identified. However, we found less connectivity in this network during improvisation compared with prelearned. It may be that the highly constrained pitch set improvisation condition was in some ways more similar to our prelearned condition. Future research could evaluate the effect of constraints and goal directedness on top-down control and related connectivity during improvisation (Beaty et al., 2016).
Another possible explanation for the observed node activation accompanied by attenuated connectivity may be related to the role of IFG in translating abstract information to motor commands. This process has been described in another domain that is associated with production of hierarchically organized structures: language (Levelt, 2001). The traditional role of BCA as related to speech production has recently been investigated further (Flinker et al., 2015). It appears that the area is engaged in mediating interaction between temporal and frontal regions by translating abstract information into articulatory code. However, as this code is implemented by the motor cortex, BCA is surprisingly silent (Flinker et al., 2015). In this study, this same translation of the auditory image of the retrieved longer prelearned melodies into motor commands may account for the increased connectivity during the prelearned condition. Even during imagery, the motor planning areas are known to be active, presumably requiring a translation process (Baumann et al., 2007). Yet, as mentioned, no concatenation of novel output following syntactic rules was required in this condition, as the prelearned melodies are retrieved in fully intact form (less node activation). We postulate that the translation of auditory image to motor commands is needed less during improvisation because auditory chunks offered up by the default network are already linked to their related motor commands (less connectivity). Yet, BCA is still engaged in concatenation of these chunks (more node activation). Future research could investigate this idea by manipulating the links between auditory image and motor commands of chunks used during improvisation. This could be done in an instrumental improvisation task by changing the key in which improvisations are performed from a familiar key, where auditory image and motor commands are linked, to an unfamiliar key (Goldman, 2013).
The areas that exhibited increased activation during improvisation in the current study were dlPFC, lPMC, SMA, and Cb. The dlPFC is also associated with goal-directed behaviors that are consciously monitored, evaluated, and corrected, as already described, and is a central part of the ECN. Specifically, dlPFC may be involved in inhibiting habitual responses (de Manzano and Ullén, 2012). Thus, the activation of left dlPFC during improvisation may indicate top-down control, attentional monitoring, and evaluation, consistent with previous studies and with the functions of the ECN (Beaty, 2015; Berkowitz and Ansari, 2008). The activation of the motor planning areas lPMC in MFG and SMA during improvisation may be due to the process of selecting single motor acts or single sensorimotor associations associated with the hierarchical organization of human behavior (Koechlin and Jubault, 2006). These areas have also been implicated in previous research involving various music improvisation tasks (Beaty, 2015). Finally, the Cb may be associated with movement coordination and maintenance of an internal pulse (Buhusi and Meck, 2005; Spencer et al., 2005). Cerebellar activation has specifically been observed when subjects move to both heard and imagined music (Schaefer et al., 2014).
In this study, we did not find differential activations in medial prefrontal and parietal regions in the prelearned versus improvisation contrasts. We therefore did not find specific support for the activation of the DMN during improvisation. This difference compared with previous findings is most likely due to the current paradigm using vocalization and imagery (Limb and Braun, 2008; Pinho et al., 2016). Future studies using the current paradigm with a larger sample size should investigate both the role of the ECN evidenced in this study and the complementary contribution of the DMN. Furthermore, future studies should investigate the role of expertise using the current paradigm. The current sample included only experts, and the available audio recordings of improvisations were all judged highly accurate by independent raters, reflecting adherence to both the underlying harmonic progression and the rhythmic pulse. The slight difference between less and more accurate improvisations did not affect cluster level activations, as discussed in Appendix A1. We had audio from only 13 subjects to analyze, and an obvious limitation of the current paradigm is the lack of ratings for the imagined trials, where only overall timing could be used for trial validation. The strength of this study compared with previous research is that potential confounds related to overt movements were eliminated in the imagine conditions. A comparison of the vocalize condition with the imagine condition, or vice versa, was not the focus of this research but is discussed in Appendix A2.
In conclusion, we found differences in activation and connectivity between closely matched performances of memorized and improvised melodies. To our knowledge, this is the first study to investigate this contrast using vocalization and imagery tasks. The observed node activations during improvisation appear to confirm the central role of BCA in the creation of novel musical output. Yet, the accompanying attenuation of connectivity supports the idea of limited top-down control. It is possible that this apparent dissociation between node activity and functional connectivity is central to the cognitive underpinnings of real-time creativity.
Footnotes
Acknowledgment
This work was funded by a Brains and Behavior Seed Grant to M.D. and M.N.
Author Disclosure Statement
No competing financial interests exist.
