Abstract
Infants have remarkable abilities to learn several languages. However, phonological acquisition in bilingual infants appears to vary depending on the phonetic similarities or differences of their two native languages. Many studies suggest that learning contrasts with different realizations in the two languages (e.g., the /p/, /t/, /k/ stops have similar VOT values in French, Spanish, Italian and European Portuguese, but can be confounded with the /b/, /d/, /g/ in German and English) poses a particular challenge. The current study explores how similarity or difference in the realization of phonetic contrasts affects word-learning outcomes. Bilingual infants aged 16 months were tested on their capacity to learn pairs of new words, differing by a phonological feature (voicing versus place) on their initial consonant. Two groups of infants were considered: bilinguals exposed to languages (French and either Spanish, Italian or European Portuguese) in which the contrasts tested are realized relatively similarly (“similar contrast” group) and bilinguals exposed to languages (French and either English or German) in which the contrasts are realized very differently (“different contrast” group). In the present word-learning situation, the “similar contrast” bilinguals successfully processed the relevant phonetic detail of the word forms, while the “different contrast” bilinguals failed. The present pattern reveals the impact on word learning of phonological differences between the two languages, which is consistent with studies reporting slight time course differences among bilinguals in phonological acquisition. In line with a larger literature on bilingual acquisition, these results provide further evidence that linguistic similarity or difference in the two languages influences the pattern of bilingual acquisition.
It is always surprising to observe the ease and speed with which infants capture the properties of the language(s) spoken in their environment. Even more striking is their capacity to simultaneously learn two or more languages. Many children grow up in a bilingual environment, a language exposure of particular interest as it offers insight into the flexibility and constraints of language-learning mechanisms. The nature of the bilingual input is complex, given the co-existence of two linguistic systems that share some of their properties but differ in many others. Bilingual infants have to sort out this mixed input in order to extract regularities for each of their languages separately. Interestingly, the complexity of the bilingual input is not the same in every situation but varies depending on the pair of languages being learned and the linguistic level considered (Barac & Bialystok, 2012). This variation is of importance as it induces different patterns of language acquisition (Fabiano-Smith & Barlow, 2010; Sundara & Scutellaro, 2011). In the present study, we explore word-learning skills in young bilingual infants and the extent to which this process is sensitive to similarities or differences in the acoustic realization of phonemes in the two native languages.
In monolingual infants, language-specific phonological processing emerges early in development, with a perceptual attunement to the processing of native categories at around 6 months for vowels and at around 9–10 months for consonants (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992; Polka & Werker, 1994; Rivera-Gaxiola, Silva-Pereyra, & Kuhl, 2005; Werker & Tees, 1984). Phonological acquisition happens in tandem with the acquisition of first words, as attested by evidence that monolingual 6-month-olds possess some lexical knowledge (Bergelson & Swingley, 2013; Tincoff & Jusczyk, 1999, 2012). However, learning minimal pairs of words (that is, words that differ by one phoneme only) requires infants to specifically focus on the sound (phonetic) differences that are relevant to word meaning (phonemic) while neglecting idiosyncratic variations within and across talkers or contexts. This process is computationally demanding (Werker & Curtin, 2005) and, as a result, monolingual infants sometimes fail to learn pairs of words that only differ by one phonological feature, even though they discriminate that same contrast in an auditory task (e.g., /bih/-/dih/; Stager & Werker, 1997). Infants’ difficulties are mainly observed at the beginning of the second year of life (14 months), possibly due to their limited cognitive resources being taken up by the word-object linkage process. By 16–17 months, when fast mapping abilities eventually become more efficient, relevant phonetic detail becomes easier to process during word learning (Havy & Nazzi, 2009; Nazzi, 2005; Pater, Stager, & Werker, 2004; Stager & Werker, 1997; Werker, Fennell, Corcoran, & Stager, 2002).
Bilingual infants have shown difficulties in the acquisition of and/or access to the language-specific phonological systems (Bosch & Sebastián-Gallès, 2003a, 2003b; Sebastián-Gallès & Bosch, 2009) and in the use of relevant phonetic detail during word learning (Fennell, Byers-Heinlein, & Werker, 2007). Like monolinguals, bilinguals have to deal with variability across different talkers to learn new word forms. In contrast to monolinguals, they also have to deal with variability resulting from the coexistence of two linguistic systems and the partial overlap of information in the acoustic space. Importantly, studies suggested that early in development, the congruence or incongruence of phonologies between the languages in acquisition had an impact on the pattern of phonological acquisition. In the following section, we review how the phonological similarities or differences between the two languages of bilingual infants affect phonological development, and how this could ultimately pose a particular challenge for word learning.
In the literature, different patterns of phonological acquisition have been reported in bilingual infants, depending on the similarities or differences in the acoustic realization of phonemes in the two native languages. Overall, phonological acquisition (as attested by the ability to discriminate phonological contrasts) was found to follow the same time course as in monolinguals for cases where the phonological categories were present in both languages and clearly separated acoustically (/e/-/u/; Sebastián-Gallès & Bosch, 2009). However, when the contrast was expressed in only one language or when the distinction was marked with a clear misalignment of category boundaries, a different pattern emerged.
This was evidenced in studies showing that between 8 and 12 months of age, Catalan-Spanish bilinguals have temporary difficulties discriminating the vowels /e/ and /E/, used contrastively in Catalan but not in Spanish, while Catalan monolinguals are able to do so (Bosch & Sebastián-Gallès, 2003b). This pattern was also found in the same population for another vowel height contrast (/o/-/u/) present in both languages but with a marked misalignment of category boundaries (Sebastián-Gallès & Bosch, 2009). However, the above difficulties were not consistently found in the literature (Table 1) and varied as a function of the cognitive demands of the task. While more research will be needed in this domain, it appears that some contrasts are easier to discriminate than others, perhaps depending on whether they are present in both languages, as well as on their distribution within the acoustic spaces of the two languages considered.
Table 1. Summary of speech perception abilities in monolingual and bilingual infants.
Note. Description of speech perception abilities in monolingual (Albareda-Castellot et al., 2011, n = 45; Bosch & Sebastián Gallès, 2003a, 2003b, n = 96; Burns et al., 2007, n = 10; Sebastián-Gallès & Bosch, 2009, n = 48; Sundara & Scutellaro, 2011, n = 40) and bilingual (Albareda-Castellot et al., 2011, n = 27; Bosch & Sebastián Gallès, 2003a, 2003b, n = 84; Burns et al., 2007, n = 9; Sebastián-Gallès & Bosch, 2009, n = 24; Sundara & Scutellaro, 2011, n = 40) infants as a function of the phonological contrast tested, the methodology used, the languages and age of the infants. Yes / No indexes the capacity to discriminate or not the considered contrast.
During the second year of life and beyond, phonetic perception improves, and bilingual infants perceive subtle sound (phonetic) differences that distinguish phonemes regardless of how these contrasts are realized in their two languages (see Table 1). However, despite greater discrimination capacities, the use of relevant phonetic detail during word learning remains challenging (Curtin, Byers-Heinlein, & Werker, 2011) and processing contrasts presumably hard to perceive has a cost that might ultimately affect word learning. Only a few studies have examined bilinguals’ word-learning capacities (Table 2). Of these, one study explored how a similarity or difference in the realization of a phonological contrast across languages affects word form processing (Fennell, Byers-Heinlein, & Werker, 2007). This study revealed that by 16–17 months, bilingual infants, unlike monolinguals, still have difficulties using relevant phonetic detail during word learning. After an exposure to two objects paired with two minimally different words (e.g., object A = /bih/ – object B = /dih/), English-learning monolingual infants successfully identified a change in word–object association (e.g., object A = /dih/) as early as 17 months (Werker et al., 2002), while English-French or English-Chinese bilinguals failed to react to the mismatch before 20 months (Fennell, Byers-Heinlein, & Werker, 2007). Importantly, there was no significant difference in performance between the two groups of bilingual infants, although the realization of the phonological contrast was relatively aligned in English and Chinese but clearly misaligned in English and French.
Similarly, difficulties in processing the relevant phonetic detail of known words were found in Catalan-Spanish bilinguals, at an age (18 months: Ramon-Casas, Swingley, Sebastián-Gallès, & Bosch, 2009; adults: Sebastián-Gallès, Echeverria, & Bosch, 2005) when they are not found in English-learning monolinguals anymore (e.g., Swingley & Aslin, 2000). If the above findings by Fennell et al. (2007) were generalized, it would imply that similarities or differences in the phonological properties of the two languages of bilingual infants do not play a role in their ability to process the detail of word forms. However, more recent findings call into question this possibility by showing the acquisition of new words in English-French bilinguals at 17 months using the same task with slight methodological changes (/kem/-/gem/, Fennell & Byers-Heinlein, 2014; /bos/-/gos/, Mattock, Polka, Rvachew, & Krehm, 2010).
Table 2. Summary of word-learning abilities in monolingual and bilingual infants.
Note. Description of word-learning skills in monolingual (Fennell et al., 2007, n = 32; Fennell & Byers-Heinlein, 2014, n = 31; Mattock et al., 2010, n = 32) and bilingual infants (Fennell et al., 2007, n = 53; Fennell & Byers-Heinlein, 2014, n = 30; Mattock et al., 2010, n = 16), as a function of the stimuli tested and the characteristics of the participants. Yes / No indexes the capacity to learn minimally different words.
At present though, it is unclear which of the methodological differences between these studies allowed infants to succeed in one case but fail in the other (Table 2). Possible explanations for bilingual success/failure lie in the specific phonetic properties of the contrasts tested and their relative perceptibility: the feature manipulated and the magnitude of the contrast differed across studies. Indeed, the contrast used in Fennell et al. (2007) involved a close place of articulation contrast (/b/-/d/, 1 step), while the contrast used in Mattock et al. (2010) involved a more distant place of articulation contrast (/b/-/g/, 2 steps). Similar to monolinguals, bilingual infants may process the larger differences with greater ease (White & Morgan, 2008). Bilinguals’ success in using phonetic detail might also vary as a function of the type of feature considered, as success is observed with a close voicing feature (/k/-/g/) in Fennell and Byers-Heinlein (2014). In addition, success in Mattock et al. (2010) and Fennell and Byers-Heinlein (2014) might be due to the fact that the contrasts considered may have been easier to perceive because they were realized with a more familiar pronunciation. In Mattock et al. (2010), infants were presented with bilingual word instantiations (words pronounced by a bilingual speaker) matching their language-learning environment while in Fennell et al. (2007), they were presented with monolingual instantiations (words pronounced by a monolingual speaker). Consistent with this, Fennell and Byers-Heinlein (2014) found that bilinguals attend more to phonetic information from bilingual input and monolinguals from monolingual input. At this age, infants might still be vulnerable to subtle idiosyncratic phonetic variations.
Taken together, these findings suggest that, beyond a possible general delay in word-learning skills, bilingual infants’ performance may vary as a function of the perceptibility of the contrast tested. Processing contrasts presumably hard to perceive could ultimately affect word learning. As the perceptibility of a given contrast varies as a function of its realization across the two languages (Bosch & Sebastián-Gallès, 2003a, 2003b; Sebastián-Gallès & Bosch, 2009), one might predict different word learning outcomes in infants learning languages with similar acoustic/phonetic realization of the contrast relative to infants learning languages with different realizations. Given the phonetic literature and the early discrimination difficulties, one might expect more difficulties in bilinguals learning languages with different realizations of the contrast.
The purpose of the present study was to provide further insights into 16-month-old bilinguals’ word learning capacities and determine whether their sensitivity to the relevant phonetic detail of word forms varies according to how the contrasts tested are realized in their two languages. Word-learning skills were evaluated using the interactive word-learning task developed by Havy and Nazzi (2009). Monolinguals at the same age have already demonstrated successful performance in this task, showing their capacity to learn minimal pairs of words (that is, words that differ by one phoneme only) involving a voicing or a place consonant change. Bilingual infants in the current study were tested with the exact same stimuli and procedure. The task consisted of the presentation of two word–object pairings (e.g., object A: /pyf/ – object B: /tyf/) followed by the presentation of a third object (object C: /pyf/) with the request to find the other object sharing the same name (object A). Words were presented in sentential context during live interaction with a native monolingual French speaker. Note that live interaction (Kuhl, Tsao, & Liu, 2003) and the use of sentential context (Fennell & Waxman, 2010) place infants in ideal learning conditions as they should facilitate the establishment of reference and the identification of the language in use by the bilingual infants. Participants were tested in French on four consonantal place contrasts (/pyf/-/tyf/, /dul/-/gul/, /beji/-/deji/, /tize/-/kize/), and four consonantal voicing contrasts (/paS/-/baS/, /koet/-/goet/, /tola/-/dola/, /piva/-/biva/), present in their two languages (see details in the stimuli section).
Previous studies have found lower performance in processing phonetic detail in initial (less accented) than in final (more accented) syllables of bisyllabic words by French-learning 11-month-olds (Hallé & De Boysson-Bardies, 1996; see also reanalyses in Vihman, Nakai, DePaolis, & Hallé, 2004) and 14-month-olds (Zesiger & Jöhr, 2011). To control for this possibility, word structure was manipulated and all eight consonantal contrasts appeared in the onset position of monosyllabic or bisyllabic pseudo-words. Testing infants with eight different contrasts in different syllabic positions was expected to provide more generalizable results than previous studies, which tested infants on only one contrast in the onset position of a monosyllable (Fennell & Byers-Heinlein, 2014; Fennell et al., 2007; Mattock et al., 2010).
To evaluate the potential role of similarities or differences in the acoustic/phonetic realization of the phonemes on the word-learning outcome, bilingual infants from two different groups were tested. The first group of bilinguals included infants exposed to French and another Romance language (Italian, Spanish, or Portuguese), sharing a relatively similar realization of the voicing and place contrasts tested (“similar contrast” group). The second group included infants exposed to French and a Germanic language (English or German), with a clearly misaligned realization of the contrasts tested (“different contrast” group). Place and voicing characteristics of stops across the languages used are presented in Table 3. For the place feature, the misalignment concerned the stops /t/ and /d/ realized as lamino-dental in French (Mortreux, 2008), Spanish (Whitley, 2002), Italian (Agard & Di Pietro, 1965) and European Portuguese (Cruz-Ferreira, 1995; Martins, Carbone, Pinto, Silva, & Teixeira, 2008) and apico-alveolar in English and German, with a higher intercept with the second formant of the vowel (Boase-Beier & Lodge, 2003). For the voicing feature, the misalignment concerned VOT values (voice onset time: duration of the period of time between the release of a plosive and the beginning of vocal fold vibration). In French, Spanish, Italian and European Portuguese, voiced stop consonants are realized with the vibration of the vocal cords starting on average 100 ms before the consonant release (−100 ms VOT) while unvoiced consonants are realized with a +30 ms VOT (French: Serniclaes, 1987; Spanish: Williams, 1977; Italian: MacKay, Flege, Piske, & Schirru, 2001, Öğüt, Kiliç, Engin, & Midilli, 2006, Stevens & Hajek, 2010; European Portuguese: Cruz-Ferreira, 1995; Pinho, Jesus, & Barney, 2010).
In contrast, voiced consonants in English and German are realized with a 0 ms VOT while unvoiced consonants are realized with a +70 ms VOT (English: Lisker & Abramson, 1964; German: Braunschweiler, 1997; Jessen & Ringen, 2002; Pouplier, Marin, & Waltl, 2014). As a result of these differences, unvoiced consonants in French overlap with voiced consonants in English and German, leading to confusions between the French /p, t, k/ and the English/German /b, d, g/.
Table 3. Phonetic characteristics of plosives across languages
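The overlap described above can be illustrated with a short sketch. It uses only the mean VOT values quoted in the text and assigns a token to whichever voicing category has the nearest mean. The nearest-mean rule and the names `MEAN_VOT_MS` and `nearest_category` are illustrative assumptions, not a model of infant perception or a procedure used in the study.

```python
# Mean VOT values (ms) as quoted in the text; real tokens vary around these means.
MEAN_VOT_MS = {
    "romance": {"voiced": -100, "unvoiced": 30},   # French, Spanish, Italian, European Portuguese
    "germanic": {"voiced": 0, "unvoiced": 70},     # English, German
}


def nearest_category(vot_ms, language_group):
    """Return the voicing category whose mean VOT is closest to vot_ms.

    Hypothetical nearest-mean rule, used here only to make the overlap concrete.
    """
    means = MEAN_VOT_MS[language_group]
    return min(means, key=lambda category: abs(vot_ms - means[category]))


# A typical French unvoiced stop (+30 ms VOT) judged against each system.
french_unvoiced = MEAN_VOT_MS["romance"]["unvoiced"]
print(nearest_category(french_unvoiced, "romance"))   # unvoiced
print(nearest_category(french_unvoiced, "germanic"))  # voiced
```

Under this simplification, a +30 ms token lies 30 ms from the Germanic voiced mean and 40 ms from the Germanic unvoiced mean, which is the sense in which French /p, t, k/ can fall on the /b, d, g/ side of an English or German boundary.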
Note. Place of articulation and mean VOT (voice onset time: duration of the period of time between the release of a plosive and the beginning of vocal fold vibration) values of stop consonants for the 4 Romance and the 2 Germanic languages considered in the study. Note that these values are susceptible to dialectal variation. Of importance, note that mean VOT for voiced segments (/b/, /d/ and /g/) is negative in the Romance languages used, but positive in the Germanic languages used. Data that is not available in the phonetic literature (/d/ and /g/ for Italian and Portuguese) is left blank.
Due to the ambiguity around the category boundary introduced by misalignment, the “different contrast” bilinguals are expected to have more difficulties processing these contrasts compared to infants from the “similar contrast” group, thus predicting lower word learning performance for the “different contrast” group.
Method
Participants
Thirty-six 16-month-old bilingual infants (20 boys, 16 girls) from the Paris area participated in the study (mean chronological age: 16 months 20 days, range: 15 months 27 days–17 months 4 days). The infants were exposed to French and an additional language from birth and received input in each language for a minimum of 30% and a maximum of 70%, respectively (inclusion criterion recommended by Pearson, Fernandez, Lewedeg, & Oller, 1997). French exposure came from at least one of the two parents, who was a native speaker. Two groups of bilinguals were tested. The first “similar contrast” group included 18 infants exposed to French and a language sharing a similar phonetic realization of the contrasts tested (Italian: n = 8; Spanish: n = 6; European Portuguese: n = 4). They were hearing French 61.38% of the time (range: 50%–70%), with 16 infants showing a French dominance (as assessed by the Language Exposure Questionnaire, Bosch & Sebastián-Gallès, 1997). The second “different contrast” group included 18 infants exposed to French and a language with a different phonetic realization of the contrasts tested (German: n = 8; English: n = 10) (see the Materials section for more details about the differences in acoustic realization). They were hearing French 61.18% of the time (range: 45%–70%), with 15 infants showing a French dominance. French exposure in both bilingual groups was not significantly different, t(35) = .89, p = .38, d = .30.
Both groups of bilinguals were equivalent in terms of general word-learning capacities. Indeed, the mean productive vocabulary (as assessed by the French version of the CDI, MacArthur Communicative Development Inventory; Kern, 2003, on which parents were asked to check, for each word, whether their infant produced it in French and/or in his or her other language) for the “similar contrast” and “different contrast” bilinguals was, respectively, 23 (SD = 23) and 24 (SD = 21) words, when calculated on the combined lexicons of both languages, which was not significantly different (t < 1).
To ensure that differences in performance between the two groups of bilinguals could not be due to differences in socio-economic status (SES), we calculated SES scores by aggregating information on parents’ education (on an ascending scale from 1 to 6, based on I.N.S.E.E., 2009) and jobs (on an ascending scale from 1 to 8, based on I.N.S.E.E., 2009). SES scores were 12.89 for “similar contrast” bilinguals and 12.29 for “different contrast” bilinguals, which was not significantly different, t(25) = 1.66, p = .11, d = .66.
Materials
The same eight triads of small objects used by Havy and Nazzi (2009) were used in the present study. All objects within each trial were unknown to infants and differed in shape, colour and texture.
The same eight pairs of pseudo-words as in Havy and Nazzi (2009) were used, each in association with a given triad of objects. Four pairs tested consonantal place contrasts (/pyf/-/tyf/, /dul/-/gul/, /beji/-/deji/, /tize/-/kize/), while four other pairs tested consonantal voicing contrasts (/paS/-/baS/, /koet/-/goet/, /tola/-/dola/, /piva/-/biva/). All eight consonantal contrasts appeared in onset positions of monosyllabic or bisyllabic pseudo-words. These contrasts were selected based on the phonetic literature on the acoustic realization of voicing and place features across languages. All contrasts tested were chosen because they involve consonants (unvoiced /p/, /t/, /k/, and voiced /b/, /d/, /g/) present in all the languages of the bilinguals tested, in roughly similar phonetic spaces. The realization of place and voicing in French is relatively similar to that of the three languages considered in the “similar contrast” group (Spanish, Italian, European Portuguese), while it is clearly different from that of the two languages considered in the “different contrast” group (English, German; see Table 3 for place and voicing characteristics of stops across all these languages).
Apparatus and procedure
The experiment was designed to evaluate whether after a short exposure to two word–object pairs, infants would be able to identify the object later requested by the experimenter. Testing was done in French by a native French monolingual female. Each of the trials consisted of a learning phase followed by a test phase (Figure 1). In the learning phase, two objects were presented one at a time and placed on the table in a left-to-right sequence (child’s perspective). Each object was named exactly 6 times in the following carrier phrases: “Regardes! Un /koet/. C’est un /koet/. Est-ce que tu veux jouer avec le /koet/? Oui, joue avec le /koet/. Regardes ce /koet/. Très bien, posons le /koet/ sur la table. Ici.” (“Look! A /koet/. This is a /koet/. Do you want to play with the /koet/? Yes, play with the /koet/. See this /koet/. All right, let’s put the /koet/ on the table. Here.”). Within each trial, one of the objects was labelled with a pseudo-word (e.g., /koet/), and the other object was labelled with the phonologically contrasted pseudo-word (e.g., /goet/). After the learning phase, the experimenter took out a third object while saying, “Regardes, c’est un /koet/, je mets ce /koet/ dans la tasse, peux tu mettre l’autre /koet/ dans la tasse?” (“Look, this is a /koet/, I put this /koet/ in the cup, can you put the other /koet/ in the cup?”). The infant was thus requested to match one of the previously labelled objects with the third (same-labelled) object. Since all objects presented during the trial were very different, infants had to base their decision on the name of the objects regardless of their visual properties. While waiting for the response, the experimenter looked at either the infant’s face or the cup in order to avoid influencing the infant’s response. After the infant’s response, positive feedback was provided regardless of the choice made. Successful performance corresponded to the selection of the object that had the name requested by the experimenter.

Figure 1. Procedure used.
The session lasted approximately 15 minutes, during which the infant was tested on the eight trials in a pseudo-randomized order. This order protocol was constructed by varying, between participants, the order of presentation of the eight trials, the order of presentation of the two objects used during the learning phase, and which of these two objects was requested in the test phase. Moreover, the position of the target object on the table (left side vs. right side) was controlled within participants; thus, half of the correct responses were on the infant’s left (first introduced object), and the other half were on his or her right (last introduced object; for more details on the construction of the protocols, see Havy & Nazzi, 2009).
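The counterbalancing constraints described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function `make_protocol`, its seed argument and the use of simple random shuffling are hypothetical, and the actual protocols of Havy and Nazzi (2009) are not reproduced here. The sketch only encodes what the text states: eight trials in a varied order, a varied order of object presentation, and a target that is on the left (first introduced object) in half of the trials and on the right (last introduced object) in the other half.

```python
import random

# The eight pseudo-word pairs listed in the Materials section.
PAIRS = [
    ("pyf", "tyf"), ("dul", "gul"), ("beji", "deji"), ("tize", "kize"),
    ("paS", "baS"), ("koet", "goet"), ("tola", "dola"), ("piva", "biva"),
]


def make_protocol(seed):
    """Build one hypothetical 8-trial protocol for a participant."""
    rng = random.Random(seed)

    trials = PAIRS[:]
    rng.shuffle(trials)                    # order of the eight trials

    sides = ["left"] * 4 + ["right"] * 4   # target side balanced within participant
    rng.shuffle(sides)

    protocol = []
    for pair, side in zip(trials, sides):
        first, second = rng.sample(pair, 2)  # order of presentation in the learning phase
        # The first introduced object sits on the infant's left, the last on the right.
        target = first if side == "left" else second
        protocol.append(
            {"first": first, "second": second, "target_side": side, "target": target}
        )
    return protocol


for trial in make_protocol(seed=1):
    print(trial)
```

Drawing a different seed for each infant varies trial order, naming order and requested object between participants, while the four-left, four-right split holds for every infant.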
Results
Infants received a score of 1 when they selected the correct object and a score of 0 for an incorrect selection. The total scores were transformed into percentages. The values for the two groups of bilingual infants (“similar contrast” versus “different contrast”) were then entered into various analyses. Multivariate analyses of variance (ANOVA) were conducted on mean performance after normalization of the data. Data were normalized using an arcsine square root transformation, appropriate for percentage data (DeCoster, 2001). T tests were performed to evaluate mean performance against the chance level, set at 50% since each response involved a choice between two equally probable possibilities. Chi-squared analyses were also conducted to compare the distribution of children having performance below/at 50% versus above 50%, with the chance-level theoretical distribution, or to compare distributions across the two groups.
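The scoring and normalization steps can be sketched as follows. The infant scores below are invented for illustration and are not the study's data. The sketch also assumes that the 50% chance level is passed through the same arcsine square root transformation before the comparison, which the text does not specify. The helper names `arcsine_sqrt` and `one_sample_t` are hypothetical.

```python
import math
import statistics


def arcsine_sqrt(proportion):
    """Arcsine square root transform of a proportion in [0, 1]."""
    return math.asin(math.sqrt(proportion))


def one_sample_t(values, mu):
    """Return (t, df) for a one-sample t test of values against mu."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    return (mean - mu) / (sd / math.sqrt(n)), n - 1


# Hypothetical numbers of correct trials (out of 8) for six infants.
correct_counts = [6, 5, 7, 4, 6, 5]
proportions = [count / 8 for count in correct_counts]
transformed = [arcsine_sqrt(p) for p in proportions]

# Assumption: chance (50%) is transformed the same way, giving pi / 4.
chance = arcsine_sqrt(0.5)
t_value, df = one_sample_t(transformed, chance)
print(f"t({df}) = {t_value:.2f}")
```

With eight trials per infant, raw scores can only take nine values (0%, 12.5%, ..., 100%), and the transformation spreads out those near the ends of the scale.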
Overall, the “similar contrast” bilinguals demonstrated above chance level performance on the task, with a mean score of 65.28%, SD = 18.96%, t(17) = 4.20, p = .001, d = 2.04, while “different contrast” bilinguals performed at chance, with a mean score of 52.78%, SD = 18.47%, t(17) = 1.09, p = .29, d = .53; Figure 2. Moreover, most of the “similar contrast” bilinguals performed above 50%, n = 14/18; χ2 (1, N = 18) = 6.31, p = .01, contrary to the “different contrast” bilinguals, n = 7/18, χ2 (1, N = 18) < 1.

Figure 2. Mean correct object choices for each bilingual infant.
To evaluate which factors might modulate performance, scores were submitted to an analysis of variance with language exposure (“similar contrast” bilinguals, “different contrast” bilinguals) as a between subjects factor and type of word (monosyllable versus bisyllable) and type of contrast (place versus voicing) as within-subjects factors. The results revealed a significant main effect of language exposure, F(1, 34) = 4.02, p = .05, ηp2 = .11, suggesting significant differences between the two bilingual groups. This pattern was further confirmed by chi-squared analyses, χ2 (1, N = 18) = 5.6, p = .02. The analysis did not reveal any effect related to the type of word (F < 1), type of contrast (F < 1), nor any interaction involving one of these factors (F < 1). The present results thus demonstrate differential word-learning performance across the two bilingual groups, with an advantage for bilinguals whose languages realize the tested phonological contrasts similarly at the acoustic level.
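The between-group chi-squared comparison can be recomputed from the counts reported above (14 of 18 “similar contrast” infants and 7 of 18 “different contrast” infants scoring above 50%). The sketch assumes a Pearson chi-squared test on the resulting 2 × 2 table without continuity correction. The text does not state which variant was used, and the function name `chi2_2x2` is hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper-tail probability for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: "similar contrast", "different contrast"; columns: above 50%, at or below 50%.
chi2, p = chi2_2x2(14, 4, 7, 11)
print(f"chi2(1) = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the statistic comes out at 5.6 with p close to .02, in line with the values reported in the text.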
Discussion
The aim of the current study was to determine whether and under which conditions bilingual infants use relevant phonetic information when learning new words. Given divergent results in the literature (Fennell & Byers-Heinlein, 2014; Fennell et al., 2007; Mattock et al., 2010), the aim was to re-evaluate this ability, and explore how the performance of bilingual infants is influenced by the combinations of languages in acquisition and the way phonological contrasts are acoustically realized in their two native languages. To explore this issue, we administered an 8-trial word-learning task to 16-month-old bilingual infants. In each trial, they were presented with two new objects paired with two minimally different names and then asked to select one of them (as done by Havy & Nazzi, 2009, for monolingual infants). Words were presented in sentences in an interactive task, with the intention to facilitate the determination of the language used (French), and the establishment of reference necessary for word learning (Fennell & Waxman, 2010; Kuhl et al., 2003). The contrasts (involving consonants present in all of the bilinguals’ languages) were selected in order to distinguish two groups of bilinguals: those exposed to two languages in which the contrasts are realized acoustically in relatively similar ways (“similar contrast” bilinguals), and those exposed to two systems in which they are clearly realized differently (“different contrast” bilinguals).
Results show that in the present word-learning situation, similar to monolinguals of the same age, the “similar contrast” bilingual infants successfully learned the words, processing the 1-feature phonological contrasts that distinguished these words. However, the “different contrast” bilinguals performed at chance, despite the same level of exposure to French and a similar socio-economic background. This pattern was consistent across the two different features that we tested: place and voicing. It was also consistent across monosyllabic and bisyllabic words, despite previous evidence for lower performance in other tasks in processing phonetic detail in less accented word positions (initial syllable of bisyllables) found by Hallé and de Boysson-Bardies (1996) at 11 months, and Zesiger and Jöhr (2011) at 14 months.
Given that the two groups of bilinguals were tested in identical conditions and using the same stimuli, the present results suggest that performance varies as a function of the distance between the two languages on some linguistic dimensions relevant to lexical acquisition. In the following, we discuss how these results, compared to recent findings on similar issues, allow us to specify important factors determining bilinguals’ performance. First, the present study used a task involving an interactive component and the introduction of words into phrasal contexts, while in Fennell et al. (2007) and Mattock et al. (2010), words were presented in isolation without communicative cues supporting word learning. These changes were motivated by findings showing that the use of sentences provides a referential context that facilitates word learning in monolinguals (Fennell & Waxman, 2006) as well as bilinguals (Fennell & Byers-Heinlein, 2014), and might help bilinguals identify the language in use (as suggested by Fennell et al., 2007; Mattock et al., 2010). Therefore, the linguistic and referential richness of the present procedure should have given the bilingual infants every chance to learn the words. The two groups did not benefit equally from it, however: the “similar contrast” bilingual group performed above chance level, while the “different contrast” bilinguals were at chance level. Referential/linguistic richness in the context of word presentation therefore cannot, on its own, explain the differences in results across our groups of bilinguals.
What other factors might explain the difference in performance between the two groups of bilinguals, particularly the lower performance of “different contrast” bilinguals? First, mapping word-forms onto objects might have been more challenging for 16-month-old bilinguals learning more distant languages than for those learning more similar languages. However, this explanation is ruled out by recent evidence that, regardless of the similarity of the languages being acquired, bilingual infants at this age are able to link word forms and objects after only a short exposure (“lif”-“neem”, Byers-Heinlein, Fennell, & Werker, 2013). In addition, CDI scores in the present study revealed that the vocabulary size was equivalent for both groups of bilinguals (when calculated on both lexicons) and similar to that of monolinguals of the same age (in Havy & Nazzi, 2009). The CDI, however, does not measure sensitivity to the relevant phonetic detail of word forms. It is thus more likely that performance difficulties in the present study for the “different contrast” bilinguals were related to the demands of processing precise phonetic information at the word-form level than to the formation of new referential word–object links per se. Accordingly, and as predicted by Byers-Heinlein, Fennell, and Werker (2013), the two groups of bilinguals should perform at the same level when learning pairs of phonetically very dissimilar words, which will have to be tested in the future.
With respect to the issue of word form processing, different factors are likely to have had an impact on the observed pattern. The first one, which was directly manipulated in the present study, relates to the way the contrasting phonemes are realized in the acoustic space of the two languages of the bilinguals. Language-specific phonological processing emerges over the first year of life in both monolinguals and bilinguals. Phonological acquisition poses a particular challenge for bilingual infants as they need to cope with additional variability resulting from the coexistence of two linguistic systems and the partial overlap of information in the acoustic space of their two languages. Resulting difficulties have mainly been observed over the first year of life when contrasts are absent from one of the bilinguals’ native languages (Bosch & Sebastián-Gallès, 2003a, 2003b) or have clearly different acoustic realization in the two languages (Sebastián-Gallès & Bosch, 2009). Discrimination difficulties were found to diminish over the second year of life, with an increase in sensitivity to contrasts presumably hard to perceive (Burns, Yoshida, Hill, & Werker, 2007; Sebastián-Gallès, 2003a).
Despite finer discrimination capacities, the use of relevant phonetic detail during word learning remains computationally demanding (Curtin, Byers-Heinlein, & Werker, 2011; Werker & Curtin, 2005). Yet the pattern of difficulty could be modulated by processing demands specific to each contrast, with greater difficulty in processing the phonetic detail of word forms for bilinguals for whom these contrasts are realized differently in their languages (as was the case for the bilinguals in the “different contrast” group) than for bilinguals for whom they are realized relatively similarly in their languages (as was the case for the bilinguals in the “similar contrast” group). Therefore, our pattern of findings is consistent with the literature showing that not all phonological contrasts are equal for bilingual infants, and suggesting effects related to the similarities or differences in the acoustic realization of the contrasts tested in the two native languages. Combined with previous word-learning studies, our results indicate that by 16 months, although bilingual infants can use relevant phonetic information in word learning, this processing is still susceptible to perceptual difficulty. Difficulties should disappear later on as infants use abstract phoneme representations to overcome perceptual difficulty.
Besides phonological differences, our participants’ languages inevitably differed on other characteristics, and the potential impact of those factors on the current results should be discussed. The first of these factors is the rhythmic distance between languages. The “similar contrast” bilinguals were learning two syllable-timed languages (French, Italian, Spanish, Portuguese), whereas the “different contrast” bilinguals were learning a syllable-timed language (French) and a stress-timed language (German, English). Infants discriminate languages with different rhythms from birth (monolinguals: Mehler et al., 1988; Nazzi, Bertoncini, & Mehler, 1998; bilinguals: Byers-Heinlein et al., 2010) but languages with similar rhythms only by 4–5 months of age (monolinguals: Nazzi, Jusczyk, & Johnson, 2000; bilinguals: Bosch & Sebastián-Gallès, 1997). Bilinguals exposed to different rhythms might thus separate input from their two languages a few months earlier than bilinguals exposed to similar rhythms. Therefore, if language discrimination feeds into later word learning, one could have expected poorer word learning abilities in the “similar contrast” group of infants. Yet our results revealed better performance in this group.
However, rhythm is also one of the cues that infants use at a young age to start segmenting word forms from fluent speech. These rhythm-based segmentation strategies appear early in development, between 6 and 8 months, and differ between stress-timed languages (stress-based procedure: English: Jusczyk, Houston, & Newsome, 1999; Dutch: Kooijman, Hagoort, & Cutler, 2005, 2009) and syllable-timed languages (syllable-based procedure: Spanish and Catalan: Bosch, Figueras, Teixidó, & Ramon-Casas, 2013; French: Goyet, Nishibayashi, & Nazzi, 2013; Nishibayashi, Goyet, & Nazzi, 2014; Polka & Sundara, 2012). Importantly, early segmentation skills were found to be predictive of later lexical achievement (Junge, Kooijman, Hagoort, & Cutler, 2012; Kooijman, Junge, Johnson, Hagoort, & Cutler, 2013; Newman, Bernstein Ratner, Jusczyk, Jusczyk, & Dow, 2006). Therefore, differences between the two groups of bilinguals could have resulted from the fact that infants in the “different contrast” group had to learn two rhythm-based segmentation procedures instead of only one shared by both languages, as was the case for the infants in the “similar contrast” group. This possibility is consistent with studies showing that monolinguals can segment non-native stimuli from languages of similar rhythmic organization (English-Dutch: Houston, Jusczyk, Kuijpers, Coolen, & Cutler, 2000) but not from languages of different rhythms (English-French: Polka & Sundara, 2012). However, note that if this were a determining factor in lexical learning at 16 months, the age of the bilinguals tested here, we would have predicted lower vocabularies in the “different contrast” group, which was not the case according to the CDI data. Therefore, while this rhythmic factor appears to be crucial in younger infants, it may not have played a determining role in the variation in word learning performance observed among the infants tested in the present study.
A further factor distinguishing our two groups of bilinguals is the fact that the “similar contrast” bilinguals were learning two Romance languages, which share, beyond their historical link, similar typological features at the phonological and lexical levels, while the “different contrast” bilinguals were learning typologically more distant languages (a Romance and a Germanic language). For example, one source of difference is the existence of more lexical overlap between Romance languages than across French and Germanic languages. Importantly, typologically related languages usually share a larger number of similar word forms (cognates) than typologically distant languages. Recent findings showed that 17–24-month-old infants were less sensitive to phonological changes in familiar word forms which were cognates compared to non-cognates (e.g., for Catalan-Spanish: Ramon-Casas & Bosch, 2010; Ramon-Casas et al., 2009). Subtle cross-linguistic differences between word forms sharing the same meaning might widen the range of acceptable variation in these particular word forms and accordingly decrease the overall sensitivity to the phonetic detail of these words.
Although the present study used pseudo-word stimuli, the crosslinguistic neighbourhood size of the word forms was not controlled for. If bilingual infants exhibit decreased sensitivity for similar word forms across languages, this factor might also affect word-learning performance. However, effects of lexical overlap/cognates have only been reported for vowels (e.g., for Catalan-Spanish: Ramon-Casas & Bosch, 2010; Ramon-Casas et al., 2009) and the current study used consonant differences. Neighbourhood effects might be lessened given the particular importance of consonants in lexical processes (Havy & Nazzi, 2009; Nazzi, 2005). In addition, the cognate difficulty would predict better performance for infants in the “different contrast” group (in which there should be less lexical overlap between the two languages in acquisition), contrary to the pattern of results found.
Another factor distinguishing the two groups is the phonotactic characteristics of the languages in acquisition. While the stimuli used were phonotactically legal in all languages tested in terms of syllabic structure (CVC, CVCV), infants’ vocabulary reports indicate that, even in infancy, CVCV words are more common in Romance languages (French: Kern, 2003; Italian: Caselli, Rinaldi, Stefanini, & Volterra, 2009; Portuguese: Lima, 2008; Spanish: Jackson-Maldonado, Thal, Marchman, Bates, & Gutiérrez-Clellen, 1993) than in Germanic languages (English: Fenson et al., 2007; German: Szagun, Steinbrink, Franik, & Stumper, 2006). In this context, one might have predicted lower performance on CVCV words in French-Germanic language bilinguals, which is not supported by the current results.
An additional factor distinguishing the two groups is that Romance languages share many grammatical features, whereas there are clear differences in grammar across Romance and Germanic languages. It is possible that beyond word form processing, infants in the similar group had an advantage understanding the sentences and the overall instructions of the task. However, recent evidence showed that French monolinguals were able to learn phonetically different words with the exact same task run in English. This suggests that the task supports lexical acquisition even in the absence of sentence-level comprehension (Bijeljac-Babic, Nassurally, Havy, & Nazzi, 2009).
Finally, all of the above differences might have differently affected the overall quality of the input received by the two groups of bilingual infants and ultimately influenced the observed pattern of performance. In line with this idea, it is important to note that bilinguals often hear their languages spoken not only by native speakers, but also by foreign speakers speaking with accented speech. Adults speaking in a foreign language have different levels of foreign accents individually, but it is also possible that some characteristics of foreign accents would be differently marked across typologically/rhythmically/phonetically closer versus more distant languages, and more marked across more distant languages. If so, the “different contrast” bilinguals could have received a less clear input that might have negatively affected the acquisition of the phonological and lexical properties of their native language. Moreover, this more accented language in the environment of the “different contrast” bilinguals would be more different from the language of the experimenter (a native French speaker) than the accented language in the environment of the “similar contrast” bilinguals, which might further reduce infants’ performance, as proposed by Fennell and Byers-Heinlein (2014). To explore these issues, it would be important to collect information about the quantity and quality of the input received by each bilingual infant in parallel to the evaluation of their word learning performance.
In conclusion, we established that bilingual infants can learn pairs of words differing only by a one-feature change on their initial consonant as early as 16 months. Importantly, this ability was found only in one of our two groups of bilinguals, demonstrating that this ability is influenced by similarities or differences between the two languages in acquisition. The purpose of the study was to explore the impact of similarities or differences in the realization of the contrasted phonemes in the infants’ two languages, and results support the proposal that differences in acoustic realization decrease word learning performance. Our findings thus add to the few studies previously conducted in this domain (Fennell & Byers-Heinlein, 2014; Fennell et al., 2007; Mattock et al., 2010), and open new questions for future exploration. First, other potentially important factors that might have influenced performance were discussed (linguistic rhythm, lexical overlap, quality of input), and will need to be directly tested in the future. Further studies will also be needed to explore how long these word learning difficulties persist over development, whether the use of different experimental tasks leads to similar findings or not, and the factors that facilitate successful learning as found here for the bilinguals learning French and another Romance language.
Funding
This research was funded by ANR-13-BSH2-0004 to Thierry Nazzi and LABEX EFL (ANR-10-LABX-0083) to Camillia Bouchon and Thierry Nazzi.
