Abstract
While it is well established that non-native speakers differ from native speakers in their perception and/or production of Mandarin lexical tones, empirical studies focusing on non-native learners are still limited. The objective of this study is to add to the current understanding of lexical tone perception by comparing native speakers of standard Korean from the Seoul/Kyunggi area differing in Mandarin experience (NK1, NK2) with native speakers of Mandarin. NK1 (n = 10) had no experience with Mandarin whereas NK2 (n = 10) consisted of highly advanced learners of Mandarin. A group of 10 native Mandarin (NM) speakers was included as controls. Accuracy of perception of six tone pairs (T1–T2, T1–T3, T1–T4, T2–T3, T2–T4, T3–T4) was assessed in a four-alternative forced-choice discrimination test. As expected, the NK2 group with extensive Mandarin learning experience resembled the NM group to a greater extent than did the NK1 group. T2–T3 was the hardest pair for both NK groups, but NK2 had the largest advantage over NK1 for this pair. Apart from T2–T3 which is generally considered difficult, tone pairs involving T1 caused some misperception by the NK groups. This may be related to the difficulty with perceiving a level tone which shows the least fundamental frequency (F0) movement and possibly has limited perceptual salience.
I Introduction
Mandarin is a tone language with four tone categories (Tone 1 (T1): high level (ā), Tone 2 (T2): high rising (á), Tone 3 (T3): dipping (ǎ), Tone 4 (T4): high falling (à)). In tone languages, incorrect use of lexical tones leads to confusion or misunderstanding (e.g. 妈 mā ‘mother’ vs. 马 mǎ ‘horse’ or 买 mǎi ‘buy’ vs. 卖 mài ‘sell’). While it is well established that naïve, non-native speakers differ from native speakers in their perception and/or production of Mandarin lexical tones (e.g. Hallé et al., 2004; Huang and Johnson, 2010; Lee et al., 1996; So and Best, 2010, 2014), empirical studies focusing on the processing and acquisition of tones by non-native learners who are actively engaged in second language (L2) Mandarin learning remain scarce, as some researchers have pointed out (Hao, 2012; Wang, 2013).
The accepted wisdom in L2 speech learning is that the age at which learning commenced, i.e. age of L2 onset, plays a crucial role such that learners who started L2 learning in early childhood generally outperform learners who started L2 learning later in life (e.g. Piske et al., 2001; Stölten et al., 2013). Nevertheless, a positive effect of L2 learning experience in adulthood has been reported for the processing of Mandarin tones (Hao, 2012, 2018; Shen and Froud, 2016; Tsukada et al., 2015, 2016). For example, advanced learners of Mandarin who speak American English as their first language (L1) have been shown to perceive lexical tones in a native-like categorical way (Shen and Froud, 2016), indicating that it is possible to achieve native-like performance in adulthood.
As for individuals who may be characterized as learners at the very beginning stages of L2 learning, recent tone perception training studies reported that pitch-specific perceptual measures predicted tone learning more successfully than non-tonal measures (e.g. musicality, L2 aptitude, general cognitive ability) (Bowles et al., 2016; Perrachione et al., 2011). Although perceptual training of Mandarin tone has been shown to be effective and generalize to new stimuli and talkers, trainees’ performance typically does not reach the accuracy level expected of native listeners and leaves room for further improvement (Wang et al., 1999). Further, the effect of phonetic variability depended on learner characteristics such that the learners with weaker perceptual abilities may not benefit from highly variable input (Chang and Bowles, 2015; Perrachione et al., 2011).
The objective of this study is to add to the current understanding of Mandarin tone perception by comparing two groups of native Korean (NK) speakers from the Seoul/Kyunggi area differing in Mandarin experience with native speakers of Mandarin (NM). The first NK group (NK1) had no experience with Mandarin, whereas the second group (NK2) consisted of highly advanced learners of Mandarin. Given the paucity of published phonetic research focusing on the L2 acquisition of Mandarin lexical tones by NK learners (for notable exceptions, see Tu et al., 2016; Zhang, 2016), we are interested in finding out if and to what extent advanced learners from a different L1 background approximate to or diverge from the NM group in their tone perception. As discussed below, looking at listeners from the L1 Korean background would be insightful, because standard Korean differs prosodically from Mandarin in not using fundamental frequency (F0) variations in a lexically contrastive way. The findings would have implications for extensive foreign language learning experience and ultimate attainment in cross-language tone perception as well as pedagogy of L2 Mandarin pronunciation.
In our previous research (Tsukada et al., 2015, 2016; Tsukada and Kondo, under review), we observed relatively high discrimination accuracy for T3–T4 by naïve listeners from diverse L1 backgrounds, both tonal (Burmese, Thai, Vietnamese) and non-tonal (Australian English). The accurate T3–T4 discrimination was attributed to multiple acoustic cues supporting this contrast, including durational differences (short for T4 and long for T3) and the creaky voice associated with T3 (e.g. Ding et al., 2011; Hiki et al., 2004; Yu and Lam, 2011, 2014; for an alternative view, see Gårding et al., 1986). The use of creakiness coupled with the extensive fall-rise pitch movement may enhance the perceptual salience of T3 when produced in isolation. Further, NM speakers have been shown to use phonation cues in tone identification, especially for T3 (Yang, 2015). If the high discriminability of T3–T4 is due to acoustic salience and is language-general, both groups of NK listeners, regardless of their Mandarin experience, would be expected to find this dissimilar pair easy to discriminate.
On the other hand, in addition to the T2–T3 confusion due to acoustic similarity which is frequently reported in the literature (e.g. Chuang et al., 1975; Hao, 2012, 2018; Kiriloff, 1969; Shen and Lin, 1991; So and Best, 2010, 2014; Wang et al., 1999; Wong, 2013; Wong et al., 2005), tone pairs including T1 were somewhat problematic. In this connection, Gao (2016) reported bi-directional T1–T2 confusion by 16 Swedish-speaking learners of Mandarin at two different high schools. These young Swedish learners identified T3 most accurately (83.8%) and T2 least accurately (44.4%). Further, in a training study with native speakers of American English, Wang et al. (1999) reported that T1 and T4 were most resistant to change after 8 training sessions each lasting 40 minutes. Of relevance to the lower discrimination accuracy for tone pairs including T1 which is the only level tone in Mandarin, some researchers reported that level tones were harder for native (Khouw and Ciocca, 2007) and non-native (Hallé et al., 2004; Wayland, 1997) speakers to process than contour tones. Possibly, this is because level tones may be less informative as a reference or less clearly defined than contour tones in the tone system. Thus, it is possible that T1 with the least F0 movement has limited acoustic salience and reduced discriminability especially when it is produced by multiple speakers. It is expected that the NK listeners would also be affected by these acoustic phonetic characteristics to a greater extent than the NM listeners and find T1 not sufficiently distinct from other tones.
Unlike Mandarin, standard Korean spoken in the Seoul/Kyunggi area does not use F0 variations for lexical distinctions at the level of words. Rather, NK speakers use pitch and the associated features of duration and intensity mostly to distinguish discourse meaning at the level of the phrase (Jun, 1996, 2005). The lexical pitch accent contrasts from Middle Korean (spoken from the 10th to 16th centuries) are preserved in the Kyungsang dialect spoken in the southeastern area of the Korean peninsula, but most Korean dialects including standard Korean have lost their lexical pitch accent system (e.g. Lee and Jongman, 2015; Lee and Zhang, 2014; Ramsey, 1975). A linguistic process corresponding to ‘tonogenesis’ is currently in progress in Seoul Korean whereby the primary cue to the lenis/aspirated stop distinction in phrase-initial position is shifting from voice onset time (VOT) to F0 over time (Kang and Han, 2013; Silva, 2006). However, in phrase-medial position, VOT is still the primary cue to the sound distinction. For standard Korean to develop into a true tone language such as Mandarin, the use of contrastive F0 would need to spread from phrase-initial to other prosodic levels (Bang et al., 2018). From these phonetic perspectives, we think it is possible that speakers of standard Korean are becoming more sensitive to pitch variations than before, but this level of tone sensitivity would not equate to the use of F0 in typical tone dialects or languages (that is, for lexical tone contrasts). Hence, it is not clear if and to what extent the NK speakers from the Seoul/Kyunggi area possess the same level of tone sensitivity as the tone language speakers.
Taking the above-mentioned cross-linguistic phonetic differences into account, this study examined the perception of Mandarin lexical tones by two groups of NK listeners (NK1, NK2) who differed according to their experience of Mandarin with a view to verifying positive L2 effects on advanced learners from an L1 background other than English. It sought to establish the extent to which the NK listeners approximate to the NM listeners as their Mandarin learning experience increased. While we expect the NK2 listeners with their extensive Mandarin experience to outperform the NK1 listeners, it is unclear how exactly these two groups of NK listeners may differ from each other on the one hand and from the NM listeners on the other hand in their non-native tone perception.
II Methods
The purpose of this experiment was to compare the perception of Mandarin tones by NK listeners to that of NM listeners. It examined the discrimination accuracy of six Mandarin tone pairs (T1–T2, T1–T3, T1–T4, T2–T3, T2–T4, T3–T4) via a discrimination test with a four-alternative forced-choice oddity task. The stimuli were produced by multiple native speakers of Mandarin as described below.
1 Speakers, stimuli and procedure
Eight (4 males, 4 females, mean age = 27.8 years, sd = 9.2) NM speakers were recruited from the undergraduate student population at a university in Sydney. Their mean length of residence in Sydney was 1.6 years. While some of them spoke regional dialects in addition to Mandarin, they all identified themselves as native speakers of Mandarin. The speakers were recorded in a sound-treated studio on the university campus under the supervision of a Mandarin–English bilingual experimenter and received a monetary reward for their participation.
The 28 test words consisted of seven CV syllables (where C = /p, t, m/ and V = /i, a, u/) across all four Mandarin tones (Table 1). A total of 76 monosyllabic words including the 28 test words were presented on the computer screen one word at a time in random order and produced twice in isolation and once in a short carrier sentence (我读______这个字 wǒ dú ___ zhè ge zì ‘I read the word ___’). All materials were transcribed in Chinese characters with pinyin (the Romanized spelling system of Chinese characters with tones indicated by diacritics) on top to minimize any ambiguity of pronunciation. The pace of presentation was controlled by the experimenter. The recorded speech materials were digitized at 44.1 kHz and the target words were segmented and stored in separate files. Tokens produced in isolation were used as stimuli in this study.
Table 1. Test words used in this study.
The listeners’ tone discrimination accuracy was assessed in a discrimination test with a four-alternative forced-choice oddity task used in previous research (e.g. Flege, 2003; Flege et al., 1999; Tsukada et al., 2005, 2015, 2016; Wayland and Guion, 2003, 2004). Because this task does not require lexical access, it is suitable for examining phonetic processes used in cross-language speech perception by participants who have no prior experience with the target language. As described in Wayland and Guion (2003: 118), this is ‘a version of ABX discrimination task’ and ‘is designed to minimize response bias (guessing)’. A high level of performance in this task would require not only the use of purely auditory information but also the establishment of phonetic categories for one or both sounds in a given pair.
The presentation of the stimuli and the collection of perception data were controlled by the UAB (University of Alabama at Birmingham) software (Smith, 1997) or the PRAAT program (Boersma and Weenink, 2016). The stimuli were presented in triads and the listeners were given four (‘1’, ‘2’, ‘3’, ‘NO’) response categories. The following tone combinations were tested: T1–T2, T1–T3, T1–T4, T2–T3, T2–T4, T3–T4. Each of these six pairs was tested by change and no-change (catch) trials. Each change trial consisted of three tokens representing two distinct tone categories and contained an odd item. In constructing the stimuli, care was taken so that tokens produced by each of the eight NM speakers would be distributed as evenly as possible (including the odd item). The listeners were asked to choose an odd ‘word’ that was different from the other two, if there was any. For example, a change trial testing the T1–T2 pair might consist of ‘mā₂’–‘mā₁’–‘má₃’ (where the subscripts indicate different talkers). The correct response for change trials was the button (‘1’, ‘2’, or ‘3’) indicating the position of the odd item, which occurred with equal frequency in all three possible serial positions. The serial position of the odd item was not fixed, which increased task uncertainty. The change trials tested the participants’ ability to respond appropriately to relevant phonetic differences between tokens and distinguish tones drawn from two different categories.
Each no-change trial contained three physically different instances of a single tone category (e.g. /tǐ/₃ /tǐ/₁ /tǐ/₂ or /pà/₁ /pà/₃ /pà/₂). The correct response to no-change trials was a fourth button marked ‘NO’. The no-change trials tested the participants’ ability to ignore audible but phonetically irrelevant within-category variation (in e.g. voice quality). The three tokens in all trials were spoken by three different talkers, and so were always physically different even in no-change trials, as this was considered a better measure of listeners’ perceptual capabilities in real world situations (Strange and Shafer, 2008). The participants were required to respond to each trial, and were told to guess if uncertain. A trial could be replayed as many times as the listener wished, but responses could not be changed once given. The inter-stimulus interval in all trials was 0.5 s.
A total of 360 trials were presented in three blocks of 120 trials. A different randomization was used for each block. The first eight trials in each block were for practice and were not analysed. Of the 24 (eight × 3 blocks) practice trials, 19 were change and 5 were no-change trials. The resulting 336 (3 blocks × 112) test trials consisted of 252 change trials testing six tone pairs (42 trials each for T1–T2, T1–T3, T1–T4, T2–T3, T2–T4, T3–T4) and 84 no-change trials (21 trials each for T1, T2, T3, T4).
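The trial counts above can be checked with a short bookkeeping sketch. This is only an illustration of the design as described, not the software used in the study: the dictionary layout, the rule for alternating which tone serves as the odd item, and the way trials are split into blocks are all assumptions, and talker assignment and the practice trials are not modelled.

```python
import random

TONES = ["T1", "T2", "T3", "T4"]
TONE_PAIRS = [("T1", "T2"), ("T1", "T3"), ("T1", "T4"),
              ("T2", "T3"), ("T2", "T4"), ("T3", "T4")]
CHANGE_PER_PAIR = 42      # change trials per tone pair (from the text)
NO_CHANGE_PER_TONE = 21   # no-change trials per tone (from the text)
N_BLOCKS = 3


def build_test_trials(seed=0):
    """Return a shuffled list of test trials (hypothetical data layout)."""
    rng = random.Random(seed)
    trials = []
    for a, b in TONE_PAIRS:
        for i in range(CHANGE_PER_PAIR):
            # Rotate the odd item's position so 1, 2 and 3 are equally frequent.
            odd_pos = i % 3 + 1
            # Assumption: alternate which member of the pair is the odd item.
            odd, other = (a, b) if i % 2 == 0 else (b, a)
            tones = [other] * 3
            tones[odd_pos - 1] = odd
            trials.append({"pair": f"{a}-{b}", "tones": tones,
                           "correct": str(odd_pos)})
    for tone in TONES:
        for _ in range(NO_CHANGE_PER_TONE):
            # No-change (catch) trial: three tokens of one tone, answer 'NO'.
            trials.append({"pair": None, "tones": [tone] * 3, "correct": "NO"})
    rng.shuffle(trials)
    return trials


trials = build_test_trials()
# Assumption: deal the shuffled trials evenly into three blocks.
blocks = [trials[i::N_BLOCKS] for i in range(N_BLOCKS)]
print(len(trials), [len(block) for block in blocks])  # 336 [112, 112, 112]
```

The totals reproduce the figures in the text: 6 × 42 = 252 change trials plus 4 × 21 = 84 no-change trials give 336 test trials, or 112 per block.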
Responses to the change and no-change trials were used to calculate A-prime (A′) scores (Snodgrass et al., 1985), an index of discrimination accuracy. These scores were based on the proportion of ‘hits (Hs)’ and ‘false alarms (FAs)’ obtained for each tone pair. If the proportion of Hs equaled the proportion of FAs, then A′ was set to 0.5. If H exceeded FA, then A′ = 0.5+((H–FA)*(1+H–FA))/((4*H)*(1–FA)). However, if FA exceeded H, then A′ = 0.5–((FA–H)*(1+FA–H))/((4*FA)*(1–H)). An A′ score of 1 indicated perfect sensitivity, whereas an A′ score of 0.5 or lower indicated a lack of sensitivity.
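The A′ computation described above can be written as a small function. The sketch below follows the three cases given in the text; the function name and the example counts are hypothetical, and pooling false alarms over the no-change trials of both tones in a pair is an assumption made only for the illustration.

```python
def a_prime(hit_rate, fa_rate):
    """A-prime from the proportions of hits (H) and false alarms (FA)."""
    if hit_rate == fa_rate:
        return 0.5
    if hit_rate > fa_rate:
        d = hit_rate - fa_rate
        return 0.5 + (d * (1 + d)) / (4 * hit_rate * (1 - fa_rate))
    # FA exceeds H: mirror image of the case above.
    d = fa_rate - hit_rate
    return 0.5 - (d * (1 + d)) / (4 * fa_rate * (1 - hit_rate))


# Hypothetical listener: 38 hits on 42 change trials for one tone pair and
# 4 false alarms on 42 no-change trials (21 per tone, pooled by assumption).
score = a_prime(38 / 42, 4 / 42)
print(round(score, 3))

print(a_prime(1.0, 0.0))  # perfect sensitivity -> 1.0
```

Neither denominator can be zero in the branch where it is used: when H exceeds FA, H is above 0 and FA is below 1, and the reverse holds in the other branch.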
2 Participants
Two groups of NK listeners from the Seoul/Kyunggi area differing in experience with Mandarin were compared in this study. The first (NK1) group consisted of 10 (5 males, 5 females, mean age = 23.5 years, sd = 2.7) participants with no prior experience with Mandarin or any other tone language. Their contact with Mandarin in daily life was estimated to be negligible for the purpose of this study. The second (NK2) group consisted of 10 (3 males, 7 females, mean age = 21.2 years, sd = 2.3) advanced learners of Mandarin, nine of whom had passed level 6, the highest level, of the HSK (Hanyu Shuiping Kaoshi). The one advanced learner who had never sat the HSK had spent the longest time in China (12 years, in Tianjin). The NK2 listeners’ mean length of residence in China was 5.2 years (range = 0.5–12 years, sd = 3.9) and they had been away from a Mandarin-speaking environment for up to 7 years (mean = 3.6 years, sd = 2.1) at the time of testing. They started learning Mandarin at the mean age of 12.1 years (range = 8–18 years, sd = 3.7), and their mean length of learning Mandarin was 7.1 years (range = 3–12 years, sd = 2.9). Except for two NK2 listeners, who participated in the study in Australia and Japan, respectively, all NK listeners were undergraduate students at a university in Seoul.
A group of 10 (2 males, 8 females, mean age = 25.4 years, sd = 4.3) college-educated NM speakers participated as controls. None of them participated in the recording sessions. The NM listeners participated in the study in Australia (n = 5), Japan (n = 1) or Singapore (n = 4) according to their place of residence and availability. The NM listeners lived in these countries temporarily (for 0.5 months to 5 years) and identified themselves as native speakers of standard Chinese (Mandarin or Putonghua).
All listeners were tested individually in a session lasting approximately 45 to 60 minutes. The experimental session was self-paced and the listeners could take a break after each block if they wished. They heard the stimuli at a self-selected, comfortable amplitude level over high-quality headphones connected to a notebook computer.
III Results
1 Overall results
Figure 1 shows the distribution of discrimination scores (A′) as a function of group. Averaged across six tone pairs, the mean discrimination scores were 0.78, 0.96 and 0.98 for the NK1, NK2 and NM groups, respectively. Figure 2 shows the distribution of A′ scores by three groups of listeners as a function of tone pairs. The mean discrimination scores for each tone pair for each group are given in Table 2.

Figure 1. The distribution of discrimination (A′) scores averaged across six tone pairs (T1–T2, T1–T3, T1–T4, T2–T3, T2–T4, T3–T4) by three groups of listeners (NK1, NK2, NM).

Figure 2. The distribution of discrimination (A′) scores for six tone pairs by three groups of listeners (NK1, NK2, NM).
Table 2. Mean discrimination scores (A′) by three groups of listeners.
Notes. NK = native Korean. NM = native Mandarin. Standard deviations are in parentheses.
The NK groups were less accurate than the NM group for all six pairs, but the extent of between-group differences varied depending on the tone pairs (Figure 2). The between-group difference was largest for T2–T3 (0.38) and smallest for T3–T4 (0.07). The number of NK2 listeners whose discrimination scores fell within the range set by the NM group was 4 for T1–T2, 4 for T1–T3, 4 for T1–T4, 7 for T2–T3, 8 for T2–T4 and 8 for T3–T4, respectively. On the other hand, the number of NK1 listeners whose discrimination scores fell within the range set by the NM group was 0 for T1–T2, 1 for T1–T3, 0 for T1–T4, 0 for T2–T3, 5 for T2–T4 and 6 for T3–T4, respectively.
While the mean discrimination accuracy by the NK1 listeners was greatly affected by different tone pairs (0.59 for T2–T3 and 0.91 for T3–T4), NK2 and NM listeners’ discrimination accuracy was consistently high for all pairs (from 0.91 for T2–T3 to 0.98 for T2–T4 for NK2 and from 0.97 for T2–T3 to 0.99 for T1–T2 and T2–T4 for NM, respectively). T2–T3 was the hardest pair for both NK1 and NK2 groups, but NK2 had the largest advantage over NK1 for this pair. T2–T3 difficulty has been frequently reported in the literature even for advanced learners (Hao, 2018). The between-group difference was smallest for T3–T4 (0.91 for NK1 and 0.97 for NK2). Tone pairs including T1 (e.g. T1–T4, T1–T2) caused some misperceptions by the NK1 listeners, in particular.
2 Effects of group and tone pair
The listeners’ A′ scores were analysed in a two-way repeated-measures ANOVA (analysis of variance) with Group (G: NK1, NK2, NM) as a between-subjects factor and Tone Pair (T: T1–T2, T1–T3, T1–T4, T2–T3, T2–T4, T3–T4) as a within-subjects factor to examine if there were between-group differences in the listeners’ tone discrimination and if so, on which tone pairs. The main effects of Group and Tone Pair reached significance and so did the two-way interaction [G: F(2, 27) = 12.6, p < .001, η²_G = .41; T: F(5, 135) = 18.8, p < .001, η²_G = .15; G × T: F(10, 135) = 9.9, p < .001, η²_G = .12].
Table 3 shows the results of post-hoc tests which assessed the effect of Group for each tone pair. The effect of Group reached significance for all pairs except T2–T4 and T3–T4. For the four tone pairs for which between-group differences reached statistical significance, the NK1 group was significantly less accurate than the NM group. The NK1–NK2 difference reached significance for T1–T2, T1–T4 and T2–T3. The NK2 group was significantly less accurate than the NM group only for T1–T2.
Table 3. Results of the Welch’s F-test assessing the effect of Group for each tone pair and Dunnett’s Modified Tukey–Kramer pairwise multiple comparison tests (significance level at .05).
Notes. NK = native Korean. NM = native Mandarin.
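For readers unfamiliar with the Welch’s F-test named in the table caption, the following sketch shows how the statistic and its degrees of freedom are computed for independent groups with unequal variances. It is not the analysis code used in this study (the software is not stated here), the A′ values are made up for illustration, and the Dunnett’s Modified Tukey–Kramer comparisons are not implemented.

```python
from statistics import mean, variance


def welch_f(groups):
    """Welch's F-test for k independent groups; returns (F, df1, df2)."""
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    # Weight each group by n_i / s_i^2 (requires non-zero sample variance).
    w = [ni / variance(g) for ni, g in zip(n, groups)]
    w_sum = sum(w)
    grand = sum(wi * mi for wi, mi in zip(w, m)) / w_sum  # weighted grand mean
    between = sum(wi * (mi - grand) ** 2 for wi, mi in zip(w, m)) / (k - 1)
    lam = sum((1 - wi / w_sum) ** 2 / (ni - 1) for wi, ni in zip(w, n))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


# Hypothetical A' scores for one tone pair (NOT the study's data).
nk1 = [0.55, 0.62, 0.48, 0.70, 0.58, 0.61, 0.53, 0.66, 0.59, 0.57]
nk2 = [0.93, 0.88, 0.95, 0.90, 0.86, 0.97, 0.92, 0.89, 0.94, 0.91]
nm = [0.98, 0.96, 0.99, 0.97, 0.95, 0.98, 0.99, 0.96, 0.97, 0.98]

f_stat, df1, df2 = welch_f([nk1, nk2, nm])
print(f"F({df1}, {df2:.1f}) = {f_stat:.1f}")
```

Because the denominator degrees of freedom depend on the group variances, they are generally not whole numbers, which is why values such as F(5, 24.6) appear in the results.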
Table 4 shows the results of post-hoc tests which assessed the effect of Tone Pair for each group. The effect of Tone Pair was significant for the NK1 group [F(5, 24.6) = 4.3, p < .01], but not for the NK2 and NM groups. The NK1 listeners made the most misperceptions for T2–T3 (0.59), which was significantly less discriminable than T3–T4 (0.91).
Table 4. Results of the Welch’s F-test assessing the effect of Tone Pair for each group and Dunnett’s Modified Tukey–Kramer pairwise multiple comparison tests (significance level at .05).
Notes. NK = native Korean. NM = native Mandarin.
IV Discussion and conclusions
We examined the discrimination accuracy of Mandarin lexical tones by two groups of native Korean speakers (NK1, NK2) from the Seoul/Kyunggi area who differed in their experience with Mandarin. There were three main findings. First, the NK2 listeners who were advanced learners of Mandarin outperformed the NK1 listeners with no prior Mandarin learning experience for all six tone pairs. This is not surprising, because the NK2 listeners would be expected to benefit from their extensive Mandarin learning experience. The between-group difference was significant for T1–T2, T1–T4 and T2–T3 with the largest NK2 advantage over NK1 for T2–T3 (0.91 vs. 0.59). These pairs were identified as difficult in previous research (So and Best, 2010, 2014). Second, the NK2 listeners were largely native-like and significantly less accurate than the NM listeners for T1–T2 only. This suggests that it is possible for highly advanced non-native learners to acquire native-like discrimination of most lexical tone pairs. Third, while the naïve NK1 listeners were clearly less accurate than the NK2 and NM listeners, the between-group differences varied according to the tone pairs.
It appears that the NK1 listeners’ tone sensitivity was strongly affected by acoustic phonetic characteristics of the two tones being paired. For example, the NK1 listeners were just as accurate as the other groups for T2–T4 and T3–T4. The members of these two pairs differ markedly in their pitch contours (and duration) and would thus be perceived as dissimilar. On the other hand, the NK1 listeners’ discrimination scores were lowest for T2–T3 and relatively low for T1–T2 and T1–T4.
T2 and T3 have been frequently reported to be phonetically similar and difficult for listeners from diverse linguistic backgrounds to differentiate (e.g. Chuang et al., 1975; Hao, 2012, 2018; Kiriloff, 1969; Shen and Lin, 1991; So and Best, 2010, 2014; Wang et al., 1999; Wong, 2013; Wong et al., 2005). As for the source of T2–T3 discrimination difficulty, both tones are characterized by an initial F0 fall followed by a rise, which is likely to add to their acoustic and perceptual similarity. Also, T3 may be particularly foreign-sounding and ‘uncategorizable’ (So and Best, 2010, 2014) to non-native listeners when accompanied by creakiness.
However, at the same time, this may qualify T3 as striking and ‘new’ (Flege, 1995, 2007) and promote its learning. Wang (2013) found that introductory learners from diverse L1 backgrounds (i.e. Hmong, Japanese, American English) identified T3 most accurately. As can be confirmed in Table 2 and Figure 2, T3 was highly discriminable to all groups of listeners when it was paired with T1 or T4. In other words, T3 was confusable only with T2. In fact, previous studies (e.g. Gao, 2016; Hao, 2012; Kiriloff, 1969; Miracle, 1989; Wang, 2013; Wang et al., 1999) suggest T2 may be more problematic than T3. In summary, while T2–T3 confusion was frequently reported, it may be due to how T2 rather than T3 was perceived.
In the introduction, we raised the possibility that T1 with the least F0 movement may not be sufficiently identifiable to the NK (but not NM) listeners. Some T1 tokens may be perceptually indistinguishable from T2 (rising) or T4 (falling) if the pitch level is not sustained high throughout the syllable. In other words, T1–T2 and T1–T4 may be perceived as belonging to a single tone category with phonetic variations especially when they are spoken by multiple speakers. Also, the NK listeners may associate the falling pitch contour of T4 with (declarative) statement intonation rather than perceiving it as an independent tone category. T1 may not necessarily be the most basic or easiest tone for non-native listeners.
The NK2 listeners were successful in discriminating Mandarin tone pairs and showed a positive effect of high-quality L2 Mandarin learning experience. Their results can be taken as evidence that highly advanced learners are capable of achieving native-like tone perception. Perhaps a combination of 1) a sufficient mean length of residence of more than 5 years in China, 2) high proficiency as reflected in a formal language qualification (e.g. HSK6) and 3) a relatively young mean age of L2 onset of 12 years contributed to their high scores. Although the NK2 listeners differed from the NM listeners for T1–T2, four (40%) of the NK2 listeners’ scores fell within the range set by the NM group. Thus, it is unlikely that there is an absolute limit on their ability to achieve truly native-like tone perception.
In future work, it would be useful to employ a longitudinal design and explore the minimum time required for attaining native-like tone perception. It is conceivable that less advanced NK learners may be able to process the Mandarin tones just as well as the NK2 and NM listeners who participated in this study. It would be theoretically important and practically useful to pinpoint how long it might take adult non-native learners to reach the native level of tone processing. At the same time, it would be pedagogically valuable to gain a better understanding of the relationship between the time and effort invested and the outcome achieved.
One of the limitations of the current study is the small number of participants. Because of this, generalizability of our findings to a larger population remains unknown. Although we trust the between-group differences are real, we need to increase the number of participants to further increase the reliability of the results obtained. Given that more than 70% of the Mandarin vocabulary consists of disyllabic words (e.g. Chang and Bowles, 2015; Hao, 2012; Tu et al., 2016), our future work should also include the examination of those multi-syllabic lexical items that non-native learners frequently come across in language classrooms.
In conclusion, our results for the advanced NK learners generally agree with those of previous studies (Hao, 2018; Shen and Froud, 2016; Wayland and Guion, 2003) in confirming a positive effect of L2 experience on the perception of non-native tones by experienced learners. Furthermore, they extend what was observed for L1 English learners to L1 Korean learners. The NK2 listeners not only outperformed the NK1 listeners, but they closely resembled the NM listeners, providing empirical evidence for adult learners’ potential to attain native-like tone perception in a new language.
Acknowledgements
We thank the editor and three anonymous reviewers for their time and input, Sujin Oh for her research assistance and participants for making the study possible.
Declaration of Conflicting Interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the 11th Hakuho Foundation Japanese Research Fellowship (2016–17) and the 2018 Endeavour Research Fellowship.
