Abstract
Several studies have investigated how to improve episodic memory performance by manipulating the factors that are crucial for successful encoding. There is an ongoing debate about whether a complex stimulus such as music can improve memory, and in particular memory for words, rather than interfere with correct encoding of information. Therefore, the present study aims to investigate whether verbal episodic memory can be improved by background context of instrumental music. Twenty young adults were asked to memorize different lists of words presented against a background of music, environmental sounds or silence. Their episodic memory performance was then tested in terms of item and source memory scores. Results revealed better memory performance under the music condition than with environmental sounds or silence in the retrieval of the context (i.e. source) of the encoded material. These findings, contrasting with studies showing an interfering effect of music, are discussed in terms of both methodological and theoretical perspectives with the aim of furthering the debate about music and memory. In sum, our results indicate that music can specifically act as a facilitating encoding context for verbal episodic memory, opening important perspectives for music as a rehabilitation tool for episodic memory deficits.
Introduction
Music has a strong and influential presence throughout life. Thus, like Proust’s madeleine, listening to a particular song or melody can evoke specific moments of life, such as childhood, a special person, or a particularly enjoyable party. Regarding long-term memory, Tulving (1972) defined the memory for events occurring at a particular time and place as episodic memory. This concept was presented as a heuristic distinction between memory for general facts (i.e. semantic) and memory for personally experienced events (i.e. episodic), namely the kind of source of to-be-remembered information (Tulving, 1972, 2002). While semantic memory is a form of declarative memory that allows people to retrieve information about general facts, episodic memory concerns the recall of subjective previous experiences that happened at particular times or places. Episodic memory is thus strictly related to the content as well as the context in which an experience happened and it is essential for the performance of numerous tasks, such as recalling the name of someone previously met, remembering the current date or an appointment in the near future (see e.g. Ranganath, Flegal, & Kelly, 2011). Thus, episodic memory is a long-term memory system which allows humans to violate the law of the irreversibility of time by making possible “mental time travel through subjective time, from the present to past, thus allowing one to re-experience, through autonoetic awareness, one’s own previous experiences” (Tulving, 2002, p. 5). Episodic memory therefore contains information about both the content of an experience and the context in which that experience was encoded. Several studies have shown that memory for content (item memory) and memory for context (source memory) can be dissociated (Glisky, Polster, & Routhieaux, 1995). For example, we can recognize a person (the content) without being able to retrieve the context, in other words, where and when we met them. 
This has led several authors to investigate what can improve episodic memory and the processes that are involved. It is well known that the instructions given to subjects at encoding play a crucial role in memory performance. For example, the manipulation of deeper levels of processing (e.g. semantic processing, Craik & Lockhart, 1972) can improve the retrieval of previously encoded items. It has also been shown that enacted encoding (i.e. using external objects to represent the encoded nouns, Lövdén, Rönnlund, & Nilsson, 2002) can improve memory performance. Furthermore, the type of material used can considerably modulate episodic encoding. For example, several studies have shown how emotional stimuli such as pictures, words, sentences and narrated slide shows can enhance subsequent item and source memory performance (Davidson, McFarland, & Glisky, 2006; Doerksen & Shimamura, 2001; Kensinger & Corkin, 2003; see also Hamann, 2001 for a review). The encoding context also plays a crucial role, and it has been shown that the richness of contextual details during the encoding of an event can strongly influence subsequent retrieval (see e.g. Eich, 1985; Hamann, 2001). Eich (1985) highlighted the importance of the physical context (e.g. natural environment or exam room) in remembering what has been learned. Furthermore, it has been suggested that the enriched context of a fictional survival scenario (Nairne & Pandeirada, 2008; Nairne, Thompson, & Pandeirada, 2007) can improve memory encoding and subsequent retrieval (Kroneisen & Erdfelder, 2011). Music, with its rich structure (e.g. melodies, chords, themes, riffs, rhythms and tempos), has also been identified in the literature as a captivating stimulus (see Zatorre, 2005 for a review) and has been shown to stimulate the whole brain through a diverse set of perceptive and cognitive operations from auditory mechanisms and motor programming to higher cognitive functions such as attention and memory storage and retrieval (Altenmüller, 2001; Zatorre, 2005). Music also exerts a strong emotional power by evoking emotions and influencing moods (Koelsch, 2014; Zatorre & Salimpoor, 2013).
There is ongoing debate about the effect that music exerts on memory. While a large number of studies have shown that music can positively influence memory performance (e.g. Balch, Bowman, & Mohler, 1992; Balch & Lewis, 1996; Kang & Williamson, 2014; Ludke, Ferreira, & Overy, 2014; Simmons-Stern, Budson, & Ally, 2010; Wallace, 1994), several studies have suggested that it can draw participants’ attention away from relevant information, hindering rather than assisting encoding (El Haj, Omigie, & Clément, 2014; Jäncke, Brügger, Brummer, Scherrer, & Alahmadi 2014; Jäncke & Sandmann, 2010). Furthermore, several mechanisms have been hypothesized to be related to such different outcomes; for example, the modulation of emotions and arousal, as proposed by the arousal and mood hypothesis (Thompson, Schellenberg, & Husain, 2001), or modulation of attentional levels (i.e. music distracts attention from a specific task thus impairing performance; see Kämpfe, Sedlmeier, & Renkewitz, 2010 for a review of the effects of background music on performance in a variety of tasks). This heterogeneity may be due, at least in part, to the great variety of paradigms (e.g. background music, sung versus spoken, long-term music exposure), the memory task (e.g. free recall, recognition, episodic vs working memory tasks, etc.) and stimuli (e.g. vocal, instrumental) employed, which could affect different memory processes. It is therefore crucial to understand which memory systems are involved and how. Although many studies have investigated the relationship between music and memory, little is known about how music specifically acts on the episodic system.
One interesting hypothesis is that music, when used as an encoding context, can facilitate the episodic encoding of an event. Smith (1985) demonstrated that background music during word encoding induces context-dependent memory during retrieval, improving recall of the previously encoded words. This supports the view of the existence of a music-dependent memory (Balch et al., 1992; Balch & Lewis, 1996), suggesting that a musical context can positively influence the encoding of an event. Following this line, a number of studies have investigated the enhancing effect of music, using musical stimuli such as sung text or background music to improve verbal memory and learning performance in both healthy (Kang & Williamson, 2014; Ludke et al., 2014; Wallace, 1994) and clinical populations (Moussard, Bigand, Belleville, & Peretz, 2012; Racette & Peretz, 2007; Simmons-Stern et al., 2010). Overall, these studies support the idea that a musical source can improve recall of previously encoded items. However, none of them specifically investigated the role played by a musical context during retrieval of the source itself, and the few studies that have focused on this question found contrasting results (El Haj et al., 2014; Ferreri, Aucouturier, Muthalib, Bigand, & Bugaiska, 2013; Ferreri, Bigand, et al., 2014). A series of fNIRS (functional near-infrared spectroscopy) studies on young and older adults showed that a musical background during the encoding of verbal material can improve both item and source memory performance by modulating prefrontal cortex activity (Ferreri, Aucouturier, et al., 2013; Ferreri, Bigand, et al., 2014). One of the main limitations of these studies is that the musical context was compared with a silent encoding condition. Therefore, it was not possible to know whether a simple non-musical auditory stimulation is enough to improve memory performance. Furthermore, using images of objects, El Haj et al. (2014) recently showed that a musical background is likely to impair source memory performance in young and older adults, adding new evidence to the controversy about the positive effects of music.
To examine these issues further, our aim was to investigate: 1) whether verbal memory can be improved by music in terms of both item and source memory, and 2) whether the facilitation provided by an auditory context is music-specific or could be due to auditory stimulation in general. We therefore asked young adults to encode lists of words presented with different auditory backgrounds, and we then tested both their item and source memory performance. To ascertain whether the positive effects on memory rely on the intrinsic features of music (i.e. rhythm, melody, emotional valence, etc.) rather than on general sounds, stimuli were presented under three auditory contextual conditions: music, silence and environmental sounds. Considering music to be a rich and facilitatory encoding context, we expected to find greater improvements in episodic memory performance under the music condition than under either the environmental sounds or silence conditions.
Method
Participants
Twenty undergraduates (13 women, mean age 23.3 ± 2.9 years) at the University of Burgundy took part in the experiment in exchange for course credits. All participants were volunteers and reported themselves to be free from medication known to affect the central nervous system. They were non-musicians, native French speakers, and reported having normal or corrected-to-normal vision and hearing. The study was anonymous and complied fully with the Helsinki Declaration and the Convention of the Council of Europe on Human Rights and Biomedicine.
Stimuli and apparatus
The experiment had a single-factor (encoding context: silence, environmental sounds and music) within-subjects design. Verbal stimuli consisted of 90 taxonomically unrelated concrete nouns selected from the French “Lexique” database (New, Pallier, Brysbaert, & Ferrand, 2004). They were divided randomly into two sets of 45 words, matched for frequency and length. One set was presented at encoding and these words were used as target items in the succeeding recognition test, while the other set provided the lures. Half of the participants were presented with one list, half with the other. Moreover, the two word lists were randomly divided into three lists of 15 words presented with three different auditory contexts and equated for word length and frequency of occurrence. Auditory stimuli were chosen after a pre-test in which participants were asked to rate the pleasantness, the emotional intensity and the arousal quality of the words using a 10-point scale. The chosen stimuli were rated as enjoyable, with medium emotional intensity and low arousal quality. The musical background was an excerpt from “Down, Down, Down” (by Joe Satriani, 1995). Instrumental rather than vocal music was chosen in order to avoid possible interference with the verbal to-be-encoded material. The environmental sounds context consisted of the sound of a waterfall (from Majestic Waterfall CD album, 1998). Presentation of task instructions and stimuli, as well as the recording of behavioral responses, were controlled by E-Prime software (Psychology Software Tools, Inc.) using a laptop with a 15” monitor. Verbal stimuli were visually presented in the middle of the screen. Auditory stimuli were presented using a headset, and the overall loudness of the excerpts was adjusted subjectively to ensure constant loudness throughout the experiment.
Procedure
Participants were tested individually and were seated at a PC in a quiet room. They were told to read the words and to remember them for a subsequent test.
The encoding phase consisted of 5 seconds of word encoding, preceded and followed by 20 seconds of context only. The order of music / environmental sounds / silence blocks was counterbalanced, as were the order of word lists and the order of words in the lists. Encoding was followed by an interference phase lasting about 5 minutes, during which subjects performed the “X-O” letter-comparison task (Salthouse, Toth, Hancock, & Woodard, 1997) and the “plus-minus” task (Jersild, 1927; Spector & Biederman, 1976). These tasks were included to avoid self-repetition. The interference phase here is crucial to test for long-term episodic memory (rather than working memory) performance. After that, participants were tested for item and source memory (Glisky et al., 1995). The retrieval phase included the 45 old words and 45 new words. For each word, subjects had to indicate if they had previously seen the word in the encoding phase (yes/no button on the keyboard – item memory task). If they answered yes, they were asked to indicate the context (music / environmental sounds / silence button on the keyboard – source memory task).
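The text states that the order of the three context blocks was counterbalanced but does not say how orders were assigned. The following is a minimal sketch of one common scheme, cycling through all six possible orders; the function name `block_orders` and the cyclic assignment are assumptions made for illustration, not the procedure actually used.

```python
from itertools import permutations

# The three encoding contexts named in the text.
CONTEXTS = ("music", "environmental sounds", "silence")

def block_orders(n_participants):
    """Assign each participant one ordering of the three context blocks.

    Hypothetical scheme: cycle through all 3! = 6 orderings so that each
    ordering is used as evenly as the sample size allows.
    """
    orders = list(permutations(CONTEXTS))
    return [orders[i % len(orders)] for i in range(n_participants)]

# Twenty participants, as in the study.
orders = block_orders(20)
```

Because 20 is not a multiple of 6, a full counterbalance is not possible with this scheme: two orderings are used four times and the remaining four are used three times.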
Data analysis
Each participant’s accuracy (hit) rates (number of yes/no recognition hits for each condition) and total false alarm rates in the item-memory tasks were calculated. Source-memory was calculated as the result of correct source judgments (number of context recognition hits for each condition) minus false alarm rates for each of the three experimental conditions. One-sample t-tests were used to ascertain whether all the scores were significantly above chance level. Repeated-measures ANOVAs were run on item- and source-memory scores to identify main effects, and LSD post-hoc t-tests were applied to test for significant differences between conditions. Data were statistically analyzed using SPSS Software 17.0.
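The scoring described above can be sketched as follows. This is an illustration under stated assumptions, not the original SPSS analysis: scores are expressed as percentages of the 15 words per condition, false alarms are subtracted as counts before conversion, and the `chance` level is a hypothetical parameter because the text does not give its value. All function names are invented for the example.

```python
from math import sqrt
from statistics import mean, stdev

WORDS_PER_CONDITION = 15  # three lists of 15 studied words, per the Method

def hit_rate(n_hits, n_targets=WORDS_PER_CONDITION):
    """Percentage of studied words correctly recognised in one condition."""
    return 100.0 * n_hits / n_targets

def corrected_source_score(n_source_hits, n_source_false_alarms,
                           n_targets=WORDS_PER_CONDITION):
    """Correct source judgments minus source false alarms, as a percentage.

    Assumption: false alarms are counted per condition and subtracted
    before converting to a percentage.
    """
    return 100.0 * (n_source_hits - n_source_false_alarms) / n_targets

def one_sample_t(scores, chance):
    """Return (t, df) for the mean of `scores` tested against `chance`."""
    n = len(scores)
    standard_error = stdev(scores) / sqrt(n)
    return (mean(scores) - chance) / standard_error, n - 1
```

With twenty participants, `one_sample_t` returns 19 degrees of freedom, which matches the t(19) values reported in the Results.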
Results
One-sample t-tests indicated that item memory scores were greater than the chance level for music (t(19) = 5.98, p < .001, d = 0.808), environmental sounds (t(19) = 3.31, p = .004, d = 0.605) and silence conditions (t(19) = 4.40, p < .001, d = 0.710). Furthermore, source memory scores were greater than the chance level for music (t(19) = 6.38, p < .001, d = 0.809), environmental sounds (t(19) = 2.98, p = .008, d = 0.564) and silence (t(19) = 2.98, p = .008, d = 0.454) contexts.
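The effect sizes above are labelled d, but the three item-memory values coincide with the correlation-type effect size r = sqrt(t² / (t² + df)) computed from the reported t and df. The sketch below only checks that arithmetic; it is an observation about the reported numbers, not a statement of how the effect sizes were originally obtained, and the function name `effect_size_r` is an assumption.

```python
from math import sqrt

def effect_size_r(t, df):
    """Correlation-type effect size derived from a t statistic and its df."""
    return sqrt(t * t / (t * t + df))

# Reported item-memory t values, each with 19 degrees of freedom.
item_memory_t = {"music": 5.98, "environmental sounds": 3.31, "silence": 4.40}

item_memory_r = {
    condition: round(effect_size_r(t, 19), 3)
    for condition, t in item_memory_t.items()
}
# item_memory_r -> music 0.808, environmental sounds 0.605, silence 0.71
```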
Repeated-measures ANOVA for item memory scores did not show significant main effects (F(2, 38) = 2.297, SS = 16.90, MS = 8.45, p = .114,

Item memory (a) and source memory (b) scores (mean percentage of correct answers ± SD) for music (black), environmental sounds (grey) and silence (white) conditions. Post-hoc LSD pairwise comparisons revealed that subjects tended to retrieve words previously encoded with music (M = 73.33, SD = 17.44) better than words encoded with environmental sounds (M = 66, SD = 21.62) or silence (M = 65.67, SD = 15.93). Subjects retrieved the musical source (M = 46.68, SD = 15.21) significantly better than the environmental sounds (M = 38.55, SD = 20.38) and silence (M = 34.83, SD = 19.81) sources. *p < .05; (*) .05 < p < .09, from post-hoc LSD comparisons.
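The degrees of freedom reported for the item-memory ANOVA, F(2, 38), follow from three conditions and twenty participants: k - 1 = 2 for the effect and (k - 1)(n - 1) = 38 for the error term. The reported mean square is likewise the sum of squares divided by the effect degrees of freedom (16.90 / 2 = 8.45). The sketch below shows how a one-way repeated-measures F is assembled; the function names and the small data set used with it are hypothetical, and it is not the SPSS procedure used in the study.

```python
from statistics import mean

def rm_anova_df(n_subjects, n_conditions):
    """Effect and error degrees of freedom for a one-way repeated-measures ANOVA."""
    return n_conditions - 1, (n_conditions - 1) * (n_subjects - 1)

def rm_anova_f(data):
    """Return (F, df_effect, df_error); `data` has one row of scores per subject."""
    n, k = len(data), len(data[0])
    grand = mean(x for row in data for x in row)
    condition_means = [mean(row[j] for row in data) for j in range(k)]
    subject_means = [mean(row) for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_condition = n * sum((m - grand) ** 2 for m in condition_means)
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    # Subject variance is removed, leaving the condition-by-subject residual.
    ss_error = ss_total - ss_condition - ss_subject

    df_effect, df_error = rm_anova_df(n, k)
    f_value = (ss_condition / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error

# Hypothetical scores for three subjects under three conditions.
example = [[1, 2, 3], [2, 3, 5], [3, 5, 6]]
f_value, df_effect, df_error = rm_anova_f(example)
```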
Discussion
This study was designed to investigate the specific role of music in episodic memory. As predicted in our working hypothesis, results showed that a context of instrumental music produced better episodic performance than silence or environmental sounds. In particular, this positive effect was statistically significant for the context (source) of encoding.
These results are in line with several studies showing that music facilitates verbal memory performance (e.g. Kang & Williamson, 2014; Ludke et al., 2014; Moussard et al., 2012; Racette & Peretz, 2007; Simmons-Stern et al., 2010; Wallace, 1994) and further support the hypothesis that music can act as an enriched and facilitatory encoding context for episodic memory performance.
One of the main findings of our study is that a background of music, and not of other sounds (i.e. environmental condition), can provide a helpful episodic context during verbal encoding. This raises the question of which specific features of a musical background can improve memory performance. Palmer, Jungers, and Jusczyk (2001) suggested that the sources of musical stimulus variability, such as perceptual features, emotional state or interpretive effects, offer a primary resource for aiding memory and learning. Further evidence for the crucial role of music-related emotions comes from a study showing that emotional information modulates musical memory in a similar way to the influence of emotional states in other domains (Eschrich, Münte, & Altenmüller, 2008). It is therefore possible that the intrinsic features of music that help create musical episodic memory (Eschrich et al., 2008; Palmer et al., 2001; Platel, Baron, Desgranges, Bernard, & Eustache, 2003), namely salient perceptual traits and emotional power, work together to improve item and source memory performance for verbal memory. Considering music as a rich and salient context, it is also possible that it can help trigger item-source associations during word encoding (Ferreri, Aucouturier, et al., 2013; Ferreri, Bigand, et al., 2014), thus allowing deep encoding processing (Craik & Lockhart, 1972) that can improve subsequent retrieval of information. Such an explanation is supported by studies using free-recall memory tasks. Indeed, greater word chunking has been shown for items previously encoded with music, thus suggesting that inter-item and item-source bindings are facilitated by a musical background (Ferreri, Bigand, Bard, & Bugaiska, 2015; McElhinney & Annett, 1996). Another interesting point of discussion concerns the role of music-related reward responses in cognitive tasks. Music is one of the most rewarding stimuli (see e.g. Blood & Zatorre, 2001), and it has recently been suggested that this could be a significant factor in music-related improved cognitive performance (Adcock, Thangavel, Whitfield-Gabrieli, Knutson, & Gabrieli, 2006; Mas-Herrero, Marco-Pallares, Lorenzo-Seva, Zatorre, & Rodriguez-Fornells, 2013). The musical stimulus we used was evaluated as pleasant; it is therefore possible that this feature led participants to accurately retrieve the contextual details they liked most. In view of the complexity of a musical background compared to the silence or environmental sounds conditions, it is possible that many mechanisms (related to both emotions and deep semantic processing) may act together during music-memory tasks, leading to music-related improved performance. A caveat concerning the environmental sounds condition should be considered. The environmental sounds source was chosen with the aim of immersing subjects in a pleasant and medium arousing context that could help (rather than interfere with) verbal encoding. However, it is possible that the different levels of complexity of the two auditory contexts may have had a different effect on participants’ encoding state. This issue should be taken into account in future research, which should compare contexts of equal complexity, such as more perceptually heterogeneous environmental sounds (e.g. by adding bird-song and the rustling of leaves), or two music conditions (e.g. vocal versus instrumental background music) in order to identify more precisely which specific features of the musical context drive the positive effects on source memory.
Nevertheless, in our opinion, the present findings offer an important contribution to the debate on music-related memory improvements. Although several studies have examined how music can help memory and learning, especially in the verbal domain, little is known about how to experimentally stimulate episodic, and more specifically, source memory performance through music. Therefore, although the employed paradigm does not allow testing for specific memory and music-related mechanisms and further research is needed in this domain on both behavioral and neural levels, the present work suggests that music-driven cognitive improvement may rely on the fact that instrumental music acts as a helpful encoding context for creating and storing new episodes in memory. As outlined in the introduction, episodic memory is a particular memory system that contains information about both the content of an experience (i.e. the quantity) and the context in which this experience has been encoded (i.e. the quality). The fact that a musical context leads to better source memory performance would therefore suggest that, rather than drawing participants’ attention away from the relevant information to remember (El Haj et al., 2014; Jäncke et al., 2014; Jäncke & Sandmann, 2010), a background of pleasant, non-familiar, instrumental music could help create new connections between items and the source itself, namely new episodes that participants can then retrieve during their subjective mental “time travel” (Tulving, 2002). This would therefore lead to improvements not only in the quantity (Balch et al., 1992; Balch & Lewis, 1996; Kang & Williamson, 2014; Ludke et al., 2014; Simmons-Stern et al., 2010; Wallace, 1994), but also in the quality of memory performance (see also Ferreri, Bigand, et al., 2014).
Conflicting findings come from studies reporting no, or a negative, effect of music on memory (e.g. Jäncke et al., 2014; Jäncke & Sandmann, 2010). In particular, a recent work by El Haj and colleagues (2014) suggested that music hinders source memory, as young and older adults had greater difficulty retrieving the location of objects previously seen in a music context than those seen under silence or noise (traffic sounds) conditions. In our opinion it is important to discuss these contrasting results, in order to further the debate on music and memory. More specifically, it is possible that these divergent findings are due to differences in experimental design. First, El Haj et al. (2014) used familiar music (Vivaldi’s “Four Seasons”). Although several studies have shown that familiar music leads to better memory performance (e.g. Simmons-Stern et al., 2010), they typically used a sung versus spoken modality for the presented items rather than background music. The main studies using background music and reporting positive results on memory and learning used musical pieces that were unfamiliar to the subjects (e.g. Balch & Lewis, 1996; Kang & Williamson, 2014; Ferreri, Aucouturier, et al., 2013; Ferreri, Bigand, et al., 2014). It is not yet clear which types of musical stimulus improve memory performance and which interfere with it, and further research in the music cognition domain is needed to disentangle this important issue. However, this evidence suggests that a well-known musical piece may draw participants’ attention away from the encoding task and interfere with subsequent memory performance. More specifically, in line with our previous observation, it is possible in this case that the use of familiar music may evoke personal events (Sacks, 2006), thus hindering the creation of new bindings and hence new episodes for subsequent retrieval in terms of content and contextual information.
Furthermore, although El Haj et al. (2014) used a musical background, the source-memory test involved object location and not the musical context. In other words, the music was not the source accompanying the object encoding, but rather a further stimulus that was considered irrelevant for retrieval of the target information. This difference in task may also explain the interfering effect of the musical background. The aim of our study was to create a rich encoding source to help participants encode lists of words. This leads to another important difference, namely the fact that El Haj et al. (2014) used non-verbal material (images of objects). In line with previous studies focusing on the relationship between music and verbal memory, our results suggest that there may be a specific link between music (as opposed to a general auditory context) and words (see e.g. Tillmann, 2012), which disappears when using non-verbal material (see Lockhart, 2000). This may be related to associative bindings created between items and between the item and the source (Ferreri, Aucouturier, et al., 2013; Ferreri, Bigand, et al., 2014). In other words, it is possible that the unique link between words and music is responsible for improved episodic memory performance, and that this link makes it easier to bind item and source, thereby creating new episodes in memory.
These findings could have important implications in the clinical domain. A number of authors have already investigated the role of music as a rehabilitation tool for memory deficits. For example, because of their spared memory for music despite serious general memory deficits, Alzheimer’s patients have been the focus of several studies on the rehabilitative effects of music (e.g. Moussard et al., 2012; Simmons-Stern et al., 2010; see also Baird & Samson, 2009 for a review). Music has also been used to stimulate verbal memory and learning performance in aphasic patients (Racette, Bard, & Peretz, 2006; Racette & Peretz, 2007), stroke patients (Sarkamo et al., 2008), and patients with multiple sclerosis (Thaut, Peterson, & McIntosh, 2005). However, none of these studies specifically focused on source memory, which is known to be impaired not only in frontal lobe lesions such as in Alzheimer’s disease (see e.g. Janowsky, Shimamura, & Squire, 1989) but also in healthy aging (see e.g. Souchay, Isingrini, & Espagnet, 2000). The fact that music is able to improve source memory thus offers new possibilities for the use of music as a rehabilitation tool for specific episodic memory deficits.
In conclusion, we have shown that music can improve episodic memory, and more specifically source memory performance, in young adults. Furthermore, the fact that another auditory context, such as environmental sounds, does not improve memory performance suggests that this effect is related to intrinsic features of the musical background rather than to general auditory stimulation. Differences from other studies showing interfering effects of a musical source may be due to differences in experimental design, particularly the type of music (familiar vs non-familiar), the task (the type of source tested), and the type of material used (verbal vs non-verbal). Overall, these results open up important prospects for music as a rehabilitation tool for source memory deficits, for example in healthy aging or in Alzheimer’s disease.
Footnotes
Funding
This work was supported by the European Project EBRAMUS (European BRAin and MUSic) ITN – Grant Agreement Number 218357 and the MAAMI ANR Project, TecSan Program, ANR-12-TECS-0014.
