Abstract
The present study employed a protocol specially developed for testing toddlers’ singing competence. The protocol was designed to increase responsivity of toddlers during testing. The singing ranges and singing accuracy of three-year-old children were measured using the protocol (N = 39). A large proportion of the three-year-olds participated in at least one item (89.7%), which is a high rate of participation for this age group, validating the appropriateness of the protocol applied.
The most successful test item was the Self-selected song as it elicited the highest response rate of all items (87%). Furthermore, the Self-selected song resulted in more accurate renditions in terms of preservation of melodic contour (85%) and intervals (24%) than another item consisting of a familiar standard song phrase. In ascending pitch glides, 73% of the toddlers lifted their voice above C5 and the highest produced pitch was C6 (two octaves above middle C). Pitch matching accuracy was highly pitch dependent, with middle C as the most accurately matched pitch (53%) and C above middle C the least accurately matched pitch (11%). The findings support previous research that describes three-year-olds as capable singers while contradicting more widely accepted views of three-year-olds having poorly developed singing skills.
Singing in early childhood develops naturally and effortlessly across cultures, just as language does (Trehub & Gudmundsdottir, 2015; Welch, 2006). Observational studies have revealed considerable singing by toddlers in the course of play (Björkvold, 1992; Custodero, 2006). In a typical day, toddlers sing frequently and in a variety of ways (Barrett, 2006; Björkvold, 1992; Mang, 2005; Sundin, 1998; Whiteman, 2001; Young, 2002).
Despite young children’s natural inclination to engage in various singing behaviours, there is little agreement about their singing competence. The general consensus is that the achievement of culturally correct imitations of standard songs is a very protracted process. Young children reportedly produce a standard song either in monotone rhythmic chants or with approximate melodic contours and compressed intervals, demonstrating little adherence to a tonal centre (Flowers & Dunne-Sousa, 1990; Welch, Sergeant, & White, 1996). Some scholars describe early childhood singing as devoid of fixed pitches and scale structures (Davidson, 1985), with children relying primarily on outlines or contour schemes that approximate the target contours (Davidson, McKernon, & Gardner, 1981; Dowling, 1984). “[The] contours of children’s early tunes are restricted in range and contain no clear melodic patterns” (McKernon, 1979, p. 47).
What do we know about the singing ability of three-year-old toddlers?
According to the contour scheme theory (Davidson, 1985; McKernon, 1979), toddlers’ small vocal range impedes their ability to sing. With a pitch range spanning only a few semitones, toddlers necessarily compress the intervals of the target song to fit their available vocal scheme or range (Davidson, 1985; Dowling, 1984; McKernon, 1979). Children’s vocal pitch range is thought to expand gradually with development, which allows for increased singing accuracy (Davidson, 1985; Dowling, 1984; Flowers & Dunne-Sousa, 1990; McKernon, 1979; Rutkowski, 2013). It is notable, however, that insecure adult singers also compress intervals when singing (Pfordresher & Brown, 2007; Pfordresher, Brown, Meier, Belyk, & Liotti, 2010), raising the possibility that compression of intervals in song singing can reflect something other than immature vocal structures.
More compelling evidence against developmental limitations in vocal range as the major factors in young children’s inaccurate singing is the fact that three-year-olds’ vocal range in singing can be increased considerably by guided practice within a relatively short time (Jersild & Bienstock, 1931). Expansion of singing range is also possible with two-year-olds (De Vries, 2005). Moreover, infants and toddlers use a large vocal range during free play (Fox, 1990; Mang, 2001; Reigado, Rocha, & Rodrigues, 2011) and a larger range in pitch-matching exercises than they do when they sing songs (Flowers & Dunne-Sousa, 1990; Welch, Rush, & Howard, 1991). If young children can produce a large range of pitches during play, it is not obvious what prevents them from using their full vocal range when singing.
Just as early language development depends critically on the amount and nature of language input, especially in one-on-one contexts (Huttenlocher, Haight, Bryk, Seltzer, & Lyons, 1991; Ramírez-Esparza, García-Sierra, & Kuhl, 2017), the amount and nature of musical input must contribute to the large variation in toddlers’ singing proficiency (Kelley & Sutton-Smith, 1987; Moog, 1976). In fact, there are substantial variations in singing proficiency in school-age children (Flowers & Dunne-Sousa, 1990; Leighton & Lamont, 2006) and in adults (Pfordresher & Brown, 2007).
Viewing toddlers as capable singers
Reports of toddlers as capable singers have not received much attention from scholars and textbook writers in the field of music education, perhaps because of their focus on the type of singing acquisition that begins at around five years of age with the onset of formal schooling (Rutkowski, 2013; Welch, 2006). Nevertheless, there are noteworthy reports of singing skills in early childhood.
Moog (1976) described recordings and observations of the singing of 500 German children and the comments of their parents. He reported that “by the age of two, every child of normal development can sing” (p. 42). He was not referring to complete performances of songs but rather to the predominant singing behaviour of this period, which he called “babbling songs.” He noticed, however, that some toddlers were capable of more advanced singing behaviours, stating that “one-third of children aged one to two begin singing songs which bear some resemblance to what has been sung to them” (p. 42). He found that most three-year-olds were capable of “imitative singing” or singing standard songs and song phrases heard in their environment. He concluded that “about half of the three-year-olds could sing the words, rhythm and pitch of a whole song more or less correctly” (p. 43). Moog’s view of three-year-olds as capable singers is markedly different from the conventional view that the singing of three-year-olds is largely inaccurate and limited in pitch range (Flowers & Dunne-Sousa, 1990). In cultures where singing is a common everyday practice for all (e.g. Zimbabwe), most three-year-olds are capable singers (Kreutzer, 2001). Even Western toddlers’ (20 to 40 months of age) average pitch range was found to be roughly nine semitones (range: 7–12 semitones) when they spontaneously sang a song that was highly familiar to them (“Twinkle, Twinkle, Little Star”) in the comfort of their home environment (Gudmundsdottir & Trehub, 2018).
Longitudinal studies of singing in preschool children are consistent with the view that children have different developmental trajectories (Kelley & Sutton-Smith, 1987; Moog, 1976; Whiteman, 2001). For example, some toddlers begin imitating songs at around two years of age, using simple syllables like loo or la instead of words (De Vries, 2005; Honig, 1995), or otherwise exhibit greater interest in melodies than in words (Stadler Elmer, 2012).
Scholars and educators of young children must consider the multifaceted nature of song singing. Songs have complex structures that integrate melodic pitches and rhythms. Most songs also have words, often but not always with meaningful content. The words of a song can be considered as providing percussive texture. Words are sounds, first and foremost, and are most likely treated as such by very young singers (Forrester & Borthwick-Hunter, 2015; Gudmundsdottir & Trehub, 2018). Some songs also feature bodily gestures that may be as important to young children as the syllabic sounds, rhythm, pitches, or melodic contour. At early stages in the song-learning process, such gestures may be the most important feature of a song for young toddlers (Forrester, 2010).
Learning to perform a song also involves learning the phrasing and expressive aspects (Stadler Elmer, 2012). For toddlers, singing a particular song may be linked to specific feelings or moods. A child who is learning the art of singing a song may not understand how crucial a stable tonal centre is to “culturally acceptable” performances or that intervals should be performed exactly as in the model song in repeated performances. Remarkably, some children seem to understand the importance of interval accuracy and consistency at an early age (Honig, 1995; Kreutzer, 2001; McGraw, 2017; Moog, 1976; Stadler Elmer, 2012). It is clear that frequent one-on-one practice with an adult facilitates the development of singing accuracy (De Vries, 2005). Even in group lessons, considerable progress in singing is possible with very young children (Jersild & Bienstock, 1931).
Testing three-year-olds
One factor contributing to the issue of limited data on singing in three-year-olds is their shyness about singing, especially in the presence of unfamiliar adults. Researchers, even parents, have reported that toddlers tend to stop singing when their singing becomes the focus of attention (Barrett, 2011; Dean, 2015). A number of toddlers and preschoolers regard their singing as a private venture, reacting negatively to others, even family members, who join in their singing (Dean, 2015; Forrester, 2010; De Vries, 2005). As a result, three-year-olds may not reveal their singing skills in a standardized test situation or in any evaluation conducted by an adult stranger (Crais, 1995, 2011). It is of considerable interest and importance to develop age-appropriate procedures for eliciting singing responses from this age group.
The present study of three-year-old children was guided by three research questions: 1) What is the most advanced singing ability observed in typically developing three-year-olds? 2) How common is their ability to produce contour-preserving renditions of standard songs? 3) How commonly do they use a singing range that exceeds a few semitones?
Method
Participants
Participants in the present study consisted of 39 three-year-olds (M = 40.05 months, range = 36–44 months, 17 girls) who were recruited through public preschools for two- to five-year-olds in a typical middle-class neighbourhood in Reykjavik, Iceland. Parents received a letter about the study and provided informed consent in writing.
Test protocol
Two research assistants who were trained specifically for this study had formal vocal training in addition to a BEd degree with specialization in music education. Each research assistant spent the full week prior to data collection at the preschool where they would conduct testing. They assumed the role of caregiver in the groups of three-year-olds to establish a trusting relationship with the children. They also conducted a couple of sessions of singing circle during the week-long familiarization phase before individual testing. In the singing circles, children learned games that were later repeated in the one-on-one testing situation. The main goal of the familiarization phase was to enhance three-year-olds’ comfort in the testing situation and therefore the likelihood of revealing their abilities.
The test protocol in the present study was developed in a previous project aimed at designing effective methods for eliciting singing responses from children who are three years old or younger. The refined protocol, which was found to be effective with two- and three-year-olds (Thorsdottir, 2013), consisted of a one-week familiarization phase and an individual testing phase involving a researcher-guided play session of 15–20 minutes. The play session had four parts, as indicated below, which were administered in a fixed order except when skipped items were revisited towards the end of a session.
Voice lift (The fire truck game)
During the familiarization phase (in the preschool setting), the children sang a song about fire trucks that made high-pitched sounds, and they practised imitating the siren of a fire truck. In the testing situation, they chose a fire truck to play with and engaged in a playful exercise with the research assistant, who encouraged them to imitate a siren. The imitation was recorded to examine their pitch production.
Pitch matching (singing puppets)
The pitch-matching task was prepared by videotaping five different puppets singing the pitches A3, C4 (middle C), F4, A4, and C5 on the syllable na repeated three times in succession. During the testing, children chose a hand puppet to hold while their hand-held puppet tried to imitate the pitches in the video. The goal here was to tap into three-year-olds’ capacity for role play by having them lend their voice to the puppet they were holding. Holding the puppet directed children’s attention away from their own voice when they were asked if the hand puppet could imitate the puppets in the video clips.
Standard-phrase (The ‘Ba-bou’ song)
The ‘Ba-bou’ song is a well-known song that is popular among three-year-olds in the local area (see Figure 1). The song has a refrain with the repeating non-words “ba-bou” and “tra-la-la.” This catchy phrase is often sung repeatedly by three-year-olds during free play. This refrain was selected as the Standard-phrase because of the likelihood that three-year-olds would be interested in singing it by themselves. At the same time, the phrase is challenging because of its six changes in melodic contour. In the testing session, the research assistant sang the song with appropriate hand gestures together with each child and then encouraged them to sing the refrain by themselves at least once.

Figure 1. The Standard-phrase in the protocol: a refrain from a song highly familiar to the children in the study.
Self-selected song (The magic hat)
An important game introduced during the familiarization phase was the game of the magic hat. The person wearing the magic hat got to sing their favourite song or any song of their choosing. The magic hat was large enough to conceal their eyes but allowed them to peek out through holes in the hat. It was intended to provide a disguise to encourage their comfort in solo singing. The magic hat was introduced as a rewarding opportunity towards the end of the test session, providing children with freedom and a sense of empowerment in choosing a song to sing.
Procedure
Children were tested individually in a quiet room at their preschool. The sessions with the research assistant were introduced as a playtime when they could receive the full attention of the now familiar adult for approximately 15–20 minutes. Children determined the extent of their participation, and they received continuous, positive verbal feedback regardless of their effort or willingness to cooperate. If a child indicated reluctance to participate in one item, the researcher proceeded to the next item. Skipped items were revisited later in the session for one further attempt. All children received praise and colourful stickers after their session with the research assistant. Each session was videotaped and audio-recorded.
Apparatus
The recording equipment consisted of a Sony Handycam for video and a Zoom digital recorder for audio. Pitch analysis was carried out with the professional sound editing software Melodyne (by Celemony). Melodyne is a powerful tool for detecting pitch in musical sounds (McLeod, 2009; Presto, 2011) as well as speech sounds (Heaton, Davis, & Happé, 2008; Järvinen-Pasley & Heaton, 2007). However, because its music editing capacities go far beyond pitch detection, the software is expensive and is not as commonly used as other pitch detection programs that are more affordable or free of charge.
Pitch analysis
Unless otherwise stated, the pitch analyses were carried out by opening the audio recordings as sound files (wav) in Melodyne. Musically trained expert members of the research team checked each pitch analysis output for octave errors introduced by the automatic analysis. The software was set to detect pitches with quantization to the nearest semitone. A pitch was judged as correct if it was within a semitone above or below the target pitch. This approach can be particularly appropriate when measuring single pitches or detecting the highest and lowest sung pitch in a performance (pitch range). The purpose of the automatic pitch analyses was to provide objective measures of the pitches sung by the children. However, expert judgements by ear were applied when categorising song performances, for example in terms of melodic contour preservation.
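The quantization and matching rule described above can be illustrated with a minimal Python sketch. It is not the procedure used in Melodyne; it assumes equal temperament with A4 = 440 Hz, the common MIDI numbering in which C4 = 60, and one reading of the criterion (a quantized pitch counts as correct when it lies at most one semitone from the target). All function names are hypothetical.

```python
import math

A4_HZ = 440.0    # assumed tuning reference (not stated in the study)
A4_MIDI = 69     # MIDI number of A4, so that C4 (middle C) = 60
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def quantize_hz(freq_hz):
    """Round a frequency in Hz to the nearest equal-tempered semitone (MIDI number)."""
    return round(A4_MIDI + 12 * math.log2(freq_hz / A4_HZ))


def name_to_midi(name):
    """Convert a note name such as 'C4' or 'C#5' to a MIDI number."""
    pitch_class, octave = name[:-1], int(name[-1])
    return NOTE_NAMES.index(pitch_class) + 12 * (octave + 1)


def midi_to_name(midi):
    """Convert a MIDI number back to a note name (sharps only)."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def is_match(sung_midi, target_midi, tolerance=1):
    """Assumed criterion: correct if within `tolerance` semitones of the target."""
    return abs(sung_midi - target_midi) <= tolerance


# The five target pitches of the pitch-matching task
targets = [name_to_midi(n) for n in ["A3", "C4", "F4", "A4", "C5"]]
```

For example, a sung frequency of about 277 Hz quantizes to C#4 and would count as a match for the target C4 under this reading, whereas a sung D4 would not.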
Results
Participation varied among tasks because children could choose to play one game and skip another. The Self-selected song had the highest participation at 87%; singing single pitches elicited 81% participation on average. Voice lift and the standard song phrase elicited responses from 82% of participants.
Voice lift
The children produced a large range of high siren pitches that were ascending glissandos. The highest pitches ranged from D4 to C6, with the pitch C5 as the most frequent high pitch (Figure 2). The majority (72%) of the children who participated in this item lifted their voice to C5 or above (Figure 2).

Figure 2. The highest pitches produced by the children when imitating a siren. Some 72% of the children landed on C5 or higher in vocal ascending glissandos.
Matching single pitches
Pitch-matching accuracy differed across pitches. The most frequently matched pitch was C4 (middle C) with a 57% success rate, and the least frequently matched pitch was C5 with a 12% rate. A3, F4, and A4 resulted in pitch-matching success of 38%, 25%, and 32%, respectively (Figure 3).

Figure 3. The pitch-matching task included five pitches. Middle C (C4) was the pitch most successfully matched by the three-year-olds.
Standard song phrase
Children’s melodic phrase productions were classified as interval preserving (IP), contour preserving (CP), or contour violating (CV). A majority of the children (61%) reproduced the phrase in an interval- or contour-preserving manner; the remaining children made some errors in melodic contour (see Figure 4). Of the children who preserved the melodic contour, a third (21% of the whole sample) also preserved the intervals.

Figure 4. Preservation of melodic contour (CP) was found in the majority of the children’s renditions of the Standard-phrase and their Self-selected songs. Interval preservation (IP) occurred most often in Self-selected song. Contour violating (CV) renditions were more common in the Standard-phrase than in Self-selected song.
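The three categories can be made concrete with a short Python sketch. In the study the classification was made by expert listeners, so the code below is only a hypothetical formalization: it assumes that a rendition has already been aligned note by note with the target and that pitches are given as semitone (MIDI) numbers. The example melody is invented for illustration and is not the ‘Ba-bou’ refrain.

```python
def intervals(notes):
    """Successive intervals in semitones between neighbouring notes."""
    return [b - a for a, b in zip(notes, notes[1:])]


def direction(step):
    """Return +1 for an ascending step, -1 for descending, 0 for a repeated pitch."""
    return (step > 0) - (step < 0)


def classify_rendition(target, sung):
    """Classify a note-aligned rendition as 'IP', 'CP', or 'CV'.

    IP: every interval equals the target interval (transposition is allowed).
    CP: every step moves in the same direction as in the target.
    CV: at least one step moves in a different direction.
    """
    if len(target) != len(sung):
        raise ValueError("this sketch assumes note-aligned sequences of equal length")
    target_steps, sung_steps = intervals(target), intervals(sung)
    if sung_steps == target_steps:
        return "IP"
    if [direction(s) for s in sung_steps] == [direction(t) for t in target_steps]:
        return "CP"
    return "CV"


# Hypothetical target phrase (MIDI numbers): C4, E4, D4, G4
example_target = [60, 64, 62, 67]
```

Under these assumptions, the same phrase sung a whole tone higher is classified as IP, a rendition with compressed intervals such as `[60, 62, 61, 63]` as CP, and a rendition that starts downwards, such as `[60, 59, 61, 63]`, as CV.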
Self-selected song
The Self-selected songs were in all cases standard children’s songs from the repertoire of the preschools. If a child could not think of a song to sing, the researcher would suggest a couple of songs from the preschool repertoire. None of the children invented their own song during this item. Children’s renditions of Self-selected songs were analysed according to the following scheme:
1 Correct intervals with key stability.
2 Correct intervals but changing tonal centre.
3 Some intervals preserved but not consistently.
4 Contour preserved but not intervals.
5 Mostly chanting with little or no hint of melody.
Two expert judges (early childhood music education specialists with formal vocal training) applied this scheme to the children’s performances (see Figure 5), with high inter-rater reliability (α = .96). Only 5 children (15%) failed to produce contour-preserving renditions of their Self-selected songs. The remaining children produced contour-preserving performances (85%), some of them even singing the correct intervals with a stable tonal centre (24%).

Figure 5. Singing accuracy in Self-selected song.
Pitch range of standard phrase
The standard phrase had a nine-semitone range when sung as notated (Figure 1). Although 82% of the children participated in the task of singing a standard phrase, only 56% (n = 18) of the phrases produced were good candidates for pitch analysis using the software Melodyne. Many children found it challenging to sing the refrain on their own when the adult stopped singing, while others were content to continue singing alone. The 18 performances analysed for pitch had an average vocal range of 5.44 semitones (range = 4–8).
Pitch range of Self-selected song
The average pitch range for the Self-selected song was 8.29 semitones (range = 4–12 semitones), with nine semitones as the most frequent singing range (n = 14, see Figure 6). The lowest pitch was F3 and the highest was C#5. Figure 7 displays the song range for each child’s Self-selected song (n = 34).

Figure 6. Singing ranges used in Self-selected song. Fourteen children used the range of nine semitones when singing a Self-selected song.

Figure 7. Singing ranges per individual child singing Self-selected song. The Y-axis displays the pitch names. The range sizes are in an ascending order from left to right. The horizontal line marks the middle C (C4).
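The range sizes reported above are distances in semitones between the lowest and the highest pitch of a performance. The following Python sketch shows that arithmetic under the assumption of standard note names with sharps and single-digit octave numbers; the function names are hypothetical and the code is not part of the original analysis.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def name_to_semitone(name):
    """Convert a note name such as 'F3' or 'C#5' to a semitone (MIDI) number."""
    pitch_class, octave = name[:-1], int(name[-1])
    return NOTE_NAMES.index(pitch_class) + 12 * (octave + 1)


def range_in_semitones(lowest, highest):
    """Size of a singing range, in semitones, from its lowest and highest pitch names."""
    return name_to_semitone(highest) - name_to_semitone(lowest)


# Span between the lowest (F3) and highest (C#5) pitches reported for the whole group
group_span = range_in_semitones("F3", "C#5")
```

Here `group_span` evaluates to 20 semitones. This is the span covered by the group as a whole, not the range of any single child, whose ranges lay between 4 and 12 semitones. A nine-semitone range corresponds, for example, to singing from C4 up to A4.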
Discussion
The aim of the present study was to provide evidence of the singing abilities of typically developing three-year-olds by means of engaging, age-appropriate tasks. An overall response rate exceeding 80% on most tasks attests to the generalizability of the findings for this population. Test participation is not always reported in previous singing research. However, participation in singing tasks in widely cited studies has been as low as 53% with five-year-olds (Welch, Sergeant, & White, 1996). The high level of participation by the toddlers in the present study confirms that the given tasks were realistic for this age group and attests to the appropriateness of the protocol applied.
Most children in this sample demonstrated the ability to lift their voices into upper registers and to sing standard songs in a manner that preserved the melodic contour. In fact, most of the children singing a Self-selected song preserved the melodic contour and nearly a third of them further demonstrated the ability to preserve the intervals of the target song. These findings resonate with reports by Moog (1976) and Kreutzer (2001) suggesting that most three-year-olds are capable of singing songs in a culturally correct manner. The present findings do not corroborate results indicating that three-year-olds rarely demonstrate the ability to correctly replicate songs from the surrounding culture (Davidson, 1985; Davidson, McKernon, & Gardner, 1981; Flowers & Dunne-Sousa, 1990; McKernon, 1979).
Even though children at this age frequently invent their own songs, especially during free play (e.g. Björkvold, 1992; Sundin, 1998; Whiteman, 2001), they have gained considerable knowledge of the standard song repertoire in their culture and demonstrate that they have mastered a standard song when they get the opportunity. Indeed, when children are observed by adults as “inventors” of songs, the children themselves may not be aware that they are inventing songs in the sense that adults understand it. The act of inventing songs is merely a playful mode of being for three-year-olds. Therefore, it does not come naturally to them to start inventing a song when an adult asks them to sing outside of the context of spontaneous play. The children in this study seemed to have a clear concept of a standard song because none of them invented their own song when asked to sing a song of their choice.
Interestingly, the children performed a Self-selected song with greater accuracy than the phrase from a standard song they were all asked to sing. This was evident even though the Self-selected song was the last item in the protocol, which could have resulted in poorer performances due to fatigue. The superior performance of the Self-selected song might be explained by familiarity and practice, in that the children had probably sung the Self-selected song more often than the Standard-phrase. The children may also have been more confident and comfortable singing a song that they chose themselves rather than singing a phrase on demand, even in this friendly and playful setting. Nevertheless, this standard phrase is quite commonly sung with and by children of this age in Iceland. The difficulty of the Standard-phrase (nine-semitone range with six changes in melodic contour) does not surpass that of the most common Self-selected songs such as the “Colour song” (the same melody as “Twinkle, Twinkle, Little Star” with a nine-semitone range and seven changes in melodic contour). Therefore, it seems unlikely that the difficulty of the standard phrase greatly surpassed the difficulty of the Self-selected songs. The measured difference between the Self-selected song and the standard phrase demonstrates how much the context and content of test items can affect performance accuracy. However, further studies on the effect of practice and song complexity on toddlers’ singing are recommended.
The children matched the pitch C4 (middle C) more often than the other pitches, which were either lower or higher. A follow-up study should investigate whether the same success rate of pitch matching can be reached with pitches closer to C4. A closer look at the children’s responses suggests that non-matches were sometimes wrong by an octave, either above or below the target. This was most notable when the children were asked to sing A4, as six of them sang the pitch A3, an octave below. Interestingly, when asked to sing the pitch A3, only 12 children (38%) sang that pitch. Other errors in the single-pitch matching test included occasional misses of more than a semitone but less than a whole step (200 cents) from the target pitch. The task of matching single pitches with their voices may be confusing to some three-year-olds, who may not fully understand that the intent of imitation is to match a pitch rather than, for example, a timbre. Thus, it is not safe to conclude that the absence of pitch matching in a given task means an inability to match pitch. In fact, the different results for individual pitches suggest that the children in this study were somewhat selective about which pitches they matched, perhaps based on the comfortable range of their voice. Some three-year-olds may interpret the game of pitch matching according to their own liking and may be unaware of the aim of the task and the importance of matching the modelled pitches.
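The distinction drawn here between octave errors and other misses can be sketched in Python. This is a hypothetical illustration rather than the authors' scoring procedure: it assumes semitone-quantized pitches expressed as MIDI numbers (C4 = 60), treats a response within one semitone of the target as a match, and treats only exact multiples of twelve semitones as octave errors.

```python
def classify_response(sung, target, tolerance=1):
    """Label a single pitch-matching response.

    'match'        : within `tolerance` semitones of the target (assumed criterion)
    'octave error' : same pitch class as the target, one or more octaves away
    'other miss'   : any other deviation
    """
    deviation = sung - target
    if abs(deviation) <= tolerance:
        return "match"
    if deviation % 12 == 0:
        return "octave error"
    return "other miss"


A3, C4, A4 = 57, 60, 69  # MIDI numbers for three of the pitches discussed above
```

With these definitions, a child who answers the target A4 with A3 produces an octave error, while answering A4 with C4 is classified as another kind of miss.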
The present study demonstrates that proficient singing skills are not as rare in a population of normally developing three-year-olds as might be expected according to theories on singing acquisition in toddlerhood (Davidson, 1985; Davidson, McKernon, & Gardner, 1981; Dowling, 1984). With only a few exceptions, the three-year-olds in this study were able to sing contour-preserving renditions of regular children’s songs using vocal ranges that approximated the target songs, which is in line with reports on toddlers as capable singers (Kreutzer, 2001; McGraw, 2017; Moog, 1976; Stadler Elmer, 2012). The average vocal range used to sing a Self-selected song was over eight semitones, with individual ranges reaching up to 12 semitones, which surpasses the expected singing range of three-year-olds by a great margin. In fact, studies have suggested that the singing range of three-year-olds is usually no larger than seven semitones (Flowers & Dunne-Sousa, 1990) and that five-year-olds might use only three to four semitones when singing songs with words (Rutkowski & Miller, 2003; Welch, Rush, & Howard, 1991). The findings of the present study suggest that previous assumptions about the singing ability and singing skills of toddlers may have been based on too little evidence. Also, the limited data collected from three-year-olds in previous research may not demonstrate the optimal abilities of children because of the limitations of the methods employed. Further singing data need to be collected using testing methods appropriate for the delicate population of toddlers.
Acknowledgements
The author would like to thank Sandra Trehub for her guidance and support in the writing of this paper. The author also acknowledges the brilliant contributions of the research team for this study, especially Hildur Halldórsdóttir and Björg Thórsdóttir.
Declaration of conflicting interests
The author(s) declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: University of Iceland Research Fund, the Social Sciences and Humanities Research Council of Canada and AIRS (Advancing Interdisciplinary Research in Singing).
