Abstract
The purpose of this study was to investigate effects of music notation reinforcement on aural memory for melodies. Participants were 41 undergraduate and graduate music majors in a within-subjects design. Experimental trials tested melodic memory through a sequence of target melodies, distraction melodies, and matched and unmatched answer choices. Target melodies were presented either aurally only, or aurally with matching notation. Results of a paired samples t-test revealed no significant difference in experimental test scores based on presentation format, suggesting that music notation reinforcement of melodies does not affect aural memory for those melodies. Recommendations for further research include an aural-visual melodic memory test paired with a survey of participant learning modalities; a survey of successful melodic memory strategies employed by aural skills students; and longitudinal studies of visual imagery applied to aural skills tasks. Implications for music education at all levels center on the need to cultivate strong cognitive relationships between visual and aural aspects of music.
Melodic dictation can be a challenging task for music students in high school and college. Dictation requires the fluent transfer of aural percepts to accurate musical notation (Larson, 1977; Telesco, 1991). Development of this synthesis of skills may have far-reaching effects on related musical abilities in error detection, sight-singing, and music reading (Killian & Henry, 2005; Sheldon, 1998). Therefore, systematic investigation of factors affecting the dictation process may be beneficial in understanding music perception and cognition in general, and in ultimately examining their role in school music classes and rehearsals.
Common challenges in the dictation process are lack of time to process percepts, lack of repetitions of the melody, and lack of a silent environment during the task (Karpinski, 2000; Killian & Henry, 2005). Students taking dictation must learn to work quickly and accurately, and to develop strategies to overcome the interference of aural distractions. A barrier to success for those taking dictation may lie in the moment of cognitive transfer of perceived aural stimuli into notation output (Lucas & Gromko, 2007; Pembrook, 1986). Students employ various strategies in making this transfer (Beckett, 1997; Mikumo, 1994; Thompson, 2004), but some of these approaches may actually hinder their efforts (Pembrook, 1987).
The aural-to-visual transfer is complicated by two human limitations: the inability to maintain an entire melody in aural memory (Miller, 1956), and the inability to maintain even a section of the melody in aural memory in the midst of ensuing melodic sections (Karpinski, 2000). Miller’s seminal, frequently cited work found that the human mind is capable of retaining only seven, plus or minus two, bits of information in memory, but that a small group of bits can be encoded into a larger chunk for memory storage. Musicians may be able to use a “chunking” strategy while processing melodic information during dictation (Madsen & Staum, 1983). However, they must combat aural distractions while carrying out this task. How can they best remember the information for quick and accurate retrieval?
Previous research (Cassidy, 2001; Center, Freeman, Robertson, & Outhred, 1999; Mikumo, 1994; Segal & Fusella, 1970) suggests that aural images are more likely to be confused by external aural stimuli – and visual images by visual stimuli – than by cross-modal combinations. In light of Miller’s suggestions, a “snapshot” of a specific chunk of notation held in visual imagery during dictation may be an effective strategy for accurately remembering that chunk, in the midst of aural interference of subsequent sections of the melody. During conversion to physical notation, the visual information can also be scanned very quickly, compared to temporal scanning of auditory information (Halpern, 1988; Weber & Harnish, 1974), providing faster decision-making capabilities. If the new information is assimilated into existing music theory constructs in visual imagery, efficiency and accuracy in dictation might increase even more (Center et al., 1999; Harrison, 1990; Hishitani, 1990; Long, 1977; McClung, 2001; Pinto & Tall, 2002; Saariluoma & Kalakoski, 1997, 1998; Wilson & Saling, 2008).
While it is difficult to test reliably whether subjects are visualizing the notation of a target melody, and to what extent (Burton, 2003), it is feasible and worthwhile to explore effects of overt presentation of music notation of melodies on aural memory for those melodies. Research exploring effects of visual information on aural perception and cognition – and of aural information on visual perception and cognition – has focused mainly on subjects’ extramusical descriptions and interpretations of percepts (Boltz, Ebendorf, & Field, 2009; Iwamiya, 1994). Research focused on interactions between the aural and visual domains during specific musical tasks may provide greater understanding of the cognitive tools employed by music students. Because so much Western music requires a synthesis of aural and notational musical fluency, increased understanding may have far-reaching applications in music education. The purpose of this study was to investigate effects of music notation reinforcement on aural memory for melodies. The null hypothesis was that there is no statistically significant difference in melodic memory test scores based on test presentation format (aural-only or aural-visual).
Method
Participants
The participants (N = 41) in this study were undergraduate and graduate music majors, representing an age range of 18 to 28, at a large northeastern university in the United States. Twenty-nine participants were instrumental concentration majors and 12 were vocal concentration majors, but these data were not factored into analysis because some participants self-reported equal strength in both areas. The sole prerequisite to participation was a passing grade in the first two semesters of college music theory, to ensure that all participants possessed the basic musical skills required to complete the experimental tasks. Students who reported having absolute pitch were excluded, so that no participant could rely on this relatively rare skill to complete the experimental tasks.
Testing instrument
The testing instrument consisted of two audio-video-interleave (AVI) files and a paper answer sheet. The files presented six test items each, corresponding to two test formats: (1) aural-only, and (2) aural-visual. Each test item in both formats consisted of the same sequence: recorded verbal instructions, aural establishment of key, a target melodic phrase, a distraction melodic phrase, and two answer choice melodic phrases. The paper answer sheet contained three answer choices for each item, “Melody A,” “Melody B,” or “Neither,” so that participants could indicate which, if any, of the answer choices matched the target.
The only difference between test formats was that the aural-visual test included standard notation of the target melody, shown on the screen for four seconds immediately after the final note of the target sounded. The screen remained black throughout the rest of the item. During the aural-only test, the screen remained black throughout, including the four-second silent interval after the final note of the target. The four-second length of the interval on both tests paralleled the length of each target melody (eight beats at MM = 120), allowing participants time to “replay” the target internally.
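The four-second figure follows directly from the tempo: at MM = 120 each beat lasts half a second, so eight beats take four seconds. A minimal sketch of that arithmetic follows; the function name `phrase_duration_seconds` is illustrative and not part of the study materials.

```python
def phrase_duration_seconds(beats: float, mm: float) -> float:
    """Return the duration in seconds of `beats` beats at `mm` beats per minute."""
    if mm <= 0:
        raise ValueError("tempo must be positive")
    return beats * 60.0 / mm


# Eight beats at MM = 120, as described for the target melodies.
target_seconds = phrase_duration_seconds(8, 120)
print(target_seconds)  # 4.0
```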
Target and distraction melodies were created by the researcher, in consultation with two professors of music theory at the university where the experimental work was completed. Content validity was established through these experts’ examination of all test items, their determination that the items contained typical tonal content found in melodic dictation tests in theory courses (see Figure 1), and that the melodic memory being tested was indeed an important contributing factor to taking dictation. As a result of these consultations, draft melodies were edited to exhibit appropriate difficulty level and optimum clarity in rhythmic organization and beaming.

Figure 1. Target melodies. MM = 120 for all melodies.
All melodies were generated by an Encore (2009) piano patch with Encore notation. Piano timbre was chosen because of its common use in music theory and aural skills curricula. Computer-generated audio was chosen so that target melodies and answer choices would sound identical, except for the deliberate discrepancy in pitch or rhythm in the incorrect answer choice of each example.
Target melodies were identical across both tests, with the exception of a practice melody at the beginning of each test, used to familiarize the participants with the experimental trial format. The six melodies were presented in different order on each of the two tests, to avoid pattern recognition or transference of any memorized answers from one test to the other. Based on pilot testing, it was determined that using more than six melodies on each of the two tests could introduce a confounding factor of mental fatigue. To avoid presentation order bias, half the participants received the aural-only items first; the other half received the aural-visual items first. Pilot test data had revealed that use of identical melodies yielded no practice effect based on recognition of identical melodic phrases. Pilot test participants had actually performed slightly better overall on the first test they took than on the second (M = 3.75, M = 3.33, respectively). It was therefore determined that identical melodies, in different presentation orders, would be used in formal data collection as well.
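The counterbalancing described above can be illustrated with a short sketch. The alternating assignment rule, the participant numbering, and the names below are assumptions made for illustration; the study reports only that half the participants took each format first.

```python
from collections import Counter

FORMATS = ("aural-only", "aural-visual")


def assign_format_order(participant_index: int) -> tuple[str, str]:
    """Alternate which format comes first.

    Hypothetical rule for illustration, not the study's documented procedure.
    """
    if participant_index % 2 == 0:
        return FORMATS
    return FORMATS[::-1]


# Assign an order to each of 41 participants, numbered 0 to 40.
orders = [assign_format_order(i) for i in range(41)]
first_format_counts = Counter(order[0] for order in orders)
print(first_format_counts)
```

With an odd number of participants such as N = 41, the two order groups necessarily differ in size by one.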
Pilot test
Pilot test participants were 12 recently graduated music majors, which ensured that they had passed the first two semesters of music theory, and which placed them at approximately the midpoint of the expected age range of participants in the formal study. Analysis of pilot test data and informal interviews with the pilot participants indicated that reliability of the six test questions, clarity of directions and procedures, and overall test difficulty were consistent and appropriate. Pilot data analysis yielded adequate test-retest reliability, r(12) = .76. Table 1 displays the calculated difficulty index for each test item. Because there were three answer choices (“A,” “B,” or “Neither”), chance difficulty index for each question was 0.33. All difficulty indices were above chance, and well below a potential ceiling effect. As Table 1 shows, items represented a reasonable range of difficulty, and the average index was 0.59, very close to the ideal difficulty level of 0.65 (the midpoint between chance and perfect scores). Therefore, no items were discarded, and no major adjustments to the testing materials were made before formal data collection began, although several stages of revisions had already been made prior to the pilot.
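As a sketch of how an item difficulty index of this kind is typically computed, the proportion of respondents answering an item correctly can be compared with the chance level for three answer choices. The response data and answer key below are invented for illustration and are not the pilot data.

```python
from fractions import Fraction

CHOICES = ("A", "B", "Neither")
CHANCE_LEVEL = Fraction(1, len(CHOICES))  # 1/3, about 0.33


def difficulty_index(responses: list[str], correct: str) -> float:
    """Proportion of responses that match the correct answer for one item."""
    if not responses:
        raise ValueError("no responses")
    return sum(r == correct for r in responses) / len(responses)


# Hypothetical responses from 12 respondents to a single item keyed "A".
item_responses = ["A", "A", "B", "A", "Neither", "A",
                  "A", "B", "A", "Neither", "A", "B"]
index = difficulty_index(item_responses, correct="A")
print(round(index, 2), index > CHANCE_LEVEL)  # 0.58 True
```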
Table 1. Test item difficulty indices.
Procedures
This study employed a within-subjects design, in which each participant completed both test formats individually, in a small, quiet office. Slight extraneous noise in the office was eliminated with Sony MDR-NC7 Noise Canceling Headphones. The testing instrument for this study was self-contained, except for the paper sheet on which participants marked their answers. Participants used a Dell Inspiron E1405 laptop computer to proceed through the two AVI files.
During administration of the testing instrument, the order in which each participant completed the two test formats was recorded, and participants read the directions for themselves. For the aural-only presentation, they were instructed to listen to the I-V-I chord progression, first melodic phrase, and distraction melodic phrase, and then to circle which of two answer choices matched the original, or “neither,” if appropriate. In the aural-visual presentation, they were instructed to listen to the I-V-I chord progression, to listen to and then view the first melodic phrase, to listen to the distraction melodic phrase, and then to circle which of two answer choices matched the original, or “neither,” if appropriate. After reading the directions, participants were asked to summarize them verbally to the researcher, to ensure understanding of the procedures.
On both test formats, before each test item, the key of the stimulus was established aurally, using the chords I-V7-I, by the same Encore piano patch as described above for the target melodies. The target melodic phrase followed immediately, and four seconds after each target, a musically related distraction melodic phrase sounded. The distraction phrase was essentially the melodic consequent of the stimulus. In other words, the target stimulus was the musical “question” and the distraction was the musical “answer.” This sequence was intended to represent the common dictation approach of focusing on remembering only the first part of a melody first, while the rest of the melody continues to sound. In the aural-visual presentation, the notation appeared on the screen during the silent four-second interval. In the aural-only presentation, the screen remained black during the silent four-second interval. Two seconds after the distraction phrase, participants heard each answer choice, decided if either choice matched the original target melodic phrase, and marked their answer sheets. Five seconds after the second answer choice was heard, the recorded announcement of the next example began.
Analysis
Collected data consisted of participants’ (N = 41) scores across two formats of test presentation. Each participant completed the experimental tasks under both formats, so data represent 82 total scores. Threats to internal validity were reduced due to the experimental design: participants were compared to themselves under the given conditions, test items were identical for both tests, and tests were taken one immediately following the other. The within-subjects design and counterbalanced test administration also mitigated threats to external validity of selection bias and testing effects, respectively.
Results
The independent variable in this within-subjects design was presentation format and the dependent variable was experimental test scores. A paired samples t-test was employed to evaluate any difference in experimental test scores based on presentation format. Results of this test showed no significant difference in experimental test scores between the music notation reinforcement (M = 3.10; SD = 1.04) and aural-only (M = 2.95; SD = 1.45) presentation formats, t(40) = .614, p = .66. Therefore, the null hypothesis could not be rejected.
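For readers less familiar with the paired samples t-test, the statistic is the mean of the within-participant score differences divided by the standard error of that mean, with n - 1 degrees of freedom. The sketch below uses only the Python standard library and invented scores; it is not the study's data or analysis script, and it stops short of computing a p value, which requires the t distribution.

```python
import math
import statistics


def paired_t(scores_a: list[float], scores_b: list[float]) -> tuple[float, int]:
    """Return (t statistic, degrees of freedom) for paired observations."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("at least two pairs are required")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical scores (out of 6) for six participants under each format.
aural_visual = [3, 4, 2, 5, 3, 4]
aural_only = [2, 4, 3, 3, 3, 2]
t_stat, df = paired_t(aural_visual, aural_only)
print(round(t_stat, 3), df)
```

Because each participant serves as their own control, the test operates on the difference scores rather than on the two sets of scores separately.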
Discussion
The central purpose of this study was to determine what, if any, effects music notation reinforcement has on aural melodic memory. The null hypothesis was that there is no statistically significant difference in melodic memory test scores based on test presentation format (aural-visual or aural-only). Based on the results of this study, notation reinforcement of melodic stimuli does not significantly affect aural memory for those stimuli in the midst of aural distractions.
The foundation for this study derived from a practical problem in high school and university aural skills courses: poor student performance in melodic dictation. Dictation skills represent an important bridge between the aural and visual aspects of music so commonly found in Western classrooms and rehearsals. The development of skills in melodic dictation is also a key factor in a number of tasks a music educator might perform including, but not limited to, sight-singing, score reading, error detection, and materials assessment. Therefore, investigation of the way these skills are cultivated has potential implications for both students and teachers of music.
Dictation students likely are not able to remember a whole melody at once, or even a portion of a melody, given the aurally distracting nature of the rest of the melody (Karpinski, 2000; Madsen & Staum, 1983; Miller, 1956; Segal & Fusella, 1970). Previous research suggested that a visual snapshot of the perceived melodic information might help mitigate the ensuing aural distraction (e.g., Cassidy, 2001; Center et al., 1999; Mikumo, 1994; Segal & Fusella, 1970). Curiosity as to whether or not visual reinforcement of a largely aural phenomenon would aid listeners’ melodic memory fueled this investigation. Interestingly, visual reinforcement neither helped nor hindered participants in this case. Providing participants with a visual “chunk,” based on Miller’s (1956) report, did not help them to maintain the melodic material correctly in memory. Future studies might test other approaches to aural memory for melodies, and other aspects of the general skill set required for melodic dictation tasks.
Informal discussion with participants, instructors, and students following the experimental trials yielded the general opinion that music notation reinforcement of the melodies would certainly produce higher scores. This was not the case, according to the results of the study. Some participants may have found the visual input distracting. Some, in fact, reported being discouraged by the fact that they could not “see their way through” the entire target before it disappeared from the screen. Some participants may be “aural learners,” better able to focus during the memorization stage without any visual input at all (Beheshti, 2009). Future research might pair a similar aural-visual melodic memory test with a survey of participant learning modalities.
If overt, correct notation reinforcement of target melodies did not improve aural memory for those melodies, it seems unlikely that self-generated visual imagery would either. If dictation students create their own visual images of aural percepts, they run the risk of embedding errors in the image. A correct, clear, sizable image on the computer screen did not improve melodic memory for participants in this study. It seems unlikely that students’ self-generated visual images of the same information would be any more helpful.
However, the visual component of the computer program used in these experimental trials was predetermined. Images were consistent and notation was standard, but participants had to work with exactly what was shown them. They had to perceive the notation as displayed, and make quick decisions about how to use it in memory, in relation to the matching aural percept. When students create their own visual images, they are able to customize them for optimum utility and lasting memory. They can choose to use, for example, vivid colors, intricate designs, and visual backgrounds and highlights, to effectively store information, prioritize it, and make it available for the task at hand. Overt visual presentation of melodies may have eliminated these options for participants in this study. Future research might attempt to capture the nature of aural and visual imagery generation during musical tasks, through a survey of successful melodic memory strategies employed by dictation students, and through longitudinal studies of visual imagery applied to aural skills tasks.
Detailed measurement of imagery activity is still an undeveloped and inconclusive science. While fMRI studies identify which parts of the brain are active during particular musical activities, they do not yet provide conclusive evidence about why that brain activity occurs the way it does, or how humans can use their capacity for imagery in musically related tasks (Morrison, Demorest, Aylward, Cramer, & Maravilla, 2003). Self-reporting is a common alternative in studies of imagery, but is prone to inaccuracy, due to confounding variables such as faulty memory, exaggeration, and desire to report what researchers are seeking (Burton, 2003).
Because of these limitations, it was determined that the most effective approach to testing the hypothesis of the current study was overt visual presentation of notation of the target melodies serving as a proxy for visual imagination of those melodies. Participants actually saw the notation representing the percepts, rather than being asked to imagine it. This aspect of the design was intended to avoid the potential confounding variables presented by self-reporting, and to provide all participants with consistent, correct visual information for testing purposes.
Although overt presentation of target melodies likely increased the effectiveness of experimental procedures, it may not have served successfully as a proxy for participants’ own imaginative faculties. As experimental technologies are improved, future research may involve more specific and precise analyses of these imaginative faculties in relation to music cognition and memory tasks.
Melodic dictation also requires engagement of the kinesthetic modality, as students physically write pitches and rhythms on paper using correct spacing. Some students may use kinesthetic strategies for decoding pitch information during dictation tasks (Mikumo, 1994). Future research might explore combinations of kinesthetic, aural, and visual modalities in melodic memory and dictation tasks.
Implications
The results of this study suggest that, under these circumstances, music notation reinforcement of melodic stimuli does not significantly affect aural memory for those stimuli. Cultivating strong cognitive relationships between aural and visual aspects of music is key to success in many standard Western school music activities (Killian & Henry, 2005; Sheldon, 1998). Skills in error detection, sight-singing, and music reading depend, at least in part, on development of those cognitive relationships.
Inservice PreK–12 general music teachers, ensemble directors, and music theory teachers might explore ways to consistently reinforce sound through sight, and sight through sound, to help students further master these relationships through daily activities. Preservice teachers and teacher educators can seek improved methods for drawing explicit connections between specific cognitive development in aural skills coursework and the music teaching applications to which it relates. Finally, detailed attention to ways of testing aural skills, such as those employed in this study, is crucial. Researchers and practitioners must continue to refine how they test students’ aural ability, and how those results may apply to daily classroom and rehearsal activities.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
