Abstract
Aims and objectives:
In this study, we investigated crosslinguistic influence in the phonetic systems of simultaneous bilinguals (2L1s) during adulthood.
Methodology:
Specifically, we analyzed the voice onset time (VOT) of the voiceless stop /k/ in the spontaneous speech of 14 German–French bilinguals who grew up in France or Germany. We looked at both languages, first comparing the groups, second comparing their VOT to their global accent.
Data and analysis:
The material consisted of interviews, each lasting about half an hour.
Findings/conclusions:
Most 2L1s showed distinct VOT-ranges in their two languages, even if they were perceived to have a foreign accent in the minority language of their childhood environment. We conclude that the phonetic systems of 2L1s remain separate and stable throughout the lifespan. However, the 2L1s from France had significantly shorter VOTs in German than the 2L1s from Germany, and their speech was overall more accented. These findings are discussed with respect to the role of intra- and extra-linguistic factors.
Originality:
Our study adds a new perspective to existing VOT studies of bilinguals by using naturalistic speech data and by comparing two groups of 2L1s who have the same language combination but grew up in different countries, which allows us to evaluate the impact of their childhood environment on VOT development.
Significance/implications:
Language exposure during childhood seems to be beneficial for pronunciation during adulthood.
Introduction
While many recent studies have investigated grammatical variation in adult simultaneous bilinguals (henceforth 2L1s), including heritage speakers, 1 comparatively little is known about ultimate attainment in their pronunciation (Benmamoun, Montrul, & Polinsky, 2013). Moreover, existing research has mostly compared minority or heritage languages to monolinguals, while comparisons between the minority and majority language of 2L1s are scarce. Finally, most existing studies focus on minority languages with a comparatively low social prestige, and only a few on cases where the minority language receives support outside of the speaker’s home.
The present study has two main goals. First, we compare the realization of voice onset time (VOT) by 14 adult German–French 2L1s who grew up in either France or Germany, in order to establish whether there is crosslinguistic influence (CLI) and to establish the importance of childhood language environment on their pronunciation during adulthood. These speakers were studied previously (Kupisch et al., 2014a) with respect to their perceived global (foreign) accent, i.e. the accent perceived by a monolingual speaker, resulting from phonetic and/or phonological properties that deviate from the accent associated with monolingual speakers. 2 The results of the study showed that the 2L1s had little or no foreign accent when speaking the majority language of their childhood environment, but more often had a foreign accent when speaking their minority/heritage language. In the present study, we are interested in whether this accent is reflected in objective analyses of segmental features, specifically VOT, and therefore the second main goal is to relate the findings of the two studies to one another. In fact, the relation between global accent and VOT has already been established for other types of learners, including second language (L2) learners (Flege, 1984; Flege & Eefting, 1987b; Major, 1987), L3-learners (Magdalena Wrembel, personal communication, September 15, 2014) and L1-attriters (Sancier & Fowler, 1997), but not for simultaneous bilinguals. In addition to the main goals listed above, we also attempt to relate the findings to speaker-specific situations, such as the language spoken at their schools and the time spent in the heritage country.
The language pair German–French is especially suitable for studying VOT, as these languages have different VOT systems, so that a potential CLI should be visible in the speech of 2L1s (Ternes, 1976). CLI is an established phenomenon in the early stages of bilingual acquisition (e.g. Kehoe, Lleó, & Rakow, 2004), but it is still an open question whether such an influence will result in compromise values during adulthood, as was found for late L2-learners (e.g. Flege, 1991; Laeufer, 1996) and L1-attriters (Flege, 1987; Major, 1992).
There is consensus that the earlier in life one begins to acquire a language, the more monolingual-like one becomes (for pronunciation see e.g. Abrahamsson & Hyltenstam, 2009; Flege, Munro, & MacKay, 1995; Flege, Yeni-Komshian, & Liu, 1999). Further, there is substantial evidence that in phonology heritage speakers have advantages over late L2-learners (Au, Knightly, Jun, & Oh, 2002; Chang, Yao, Haynes, & Rhodes, 2011; Kupisch et al., 2014a; Oh, Jun, Knightly, & Au, 2003). At the same time, a considerable number of studies have shown that exposure from birth does not guarantee monolingual-like performance, especially in a minority or heritage language. For example, Oh et al. (2003) found that native Korean students who ceased to use Korean around school entry when beginning to use English intensively spoke Korean with an accent, although their VOT was indistinguishable from that of monolingual native Koreans. Au et al. (2002) reported similar findings for Spanish heritage speakers who stopped speaking Spanish around the age of six and restarted learning it at age 14. Both studies suggest that interrupted exposure in childhood in combination with intensive input and use of an L2 can affect global accent – though not necessarily VOT – in the L1, even if L2 exposure starts after age six.
The picture is still unclear when it comes to 2L1s with exposure to both languages from birth. Differences between 2L1s and monolinguals are not expected according to the speech learning model (SLM) (Flege, 1995), which postulates separate phonetic categories/systems if L2-acquisition happens early and merged categories/systems in late L2-acquisition. Nonetheless, once acquired, phonetic systems do not necessarily remain stable and “[…] reorganize in response to sounds encountered in an L2 through the addition of new phonetic categories, or through the modification of old ones” (Flege, 1995, p. 233). In other words, even if bilinguals develop separate categories early in life, these categories may change later in life.
In what follows, we summarize existing research on VOT in French and German. The third section presents our own study, then we conclude with a discussion and a summary.
VOT and previous research
VOT is the most salient cue differentiating the language-specific realizations of voiced (/bdg/) and voiceless (/ptk/) plosives. It refers to the interval between “the release of the stop” and “the onset of glottal vibration, that is, voicing” (Lisker & Abramson, 1964, p. 389). According to Lisker and Abramson (1964), there exist three different types of VOT: (1) “voicing lead” (voicing starts before the release, resulting in voiced, unaspirated stops); (2) “short voicing lag” (voicing begins shortly after the release, resulting in voiceless, unaspirated stops); (3) “long voicing lag” (voicing starts late after the release, resulting in voiceless, aspirated stops). In the following, we focus on voiceless stops.
VOT in French and German voiceless stops
Table 1 summarizes previous studies on VOT with German and French monolinguals, including Canadian French (CF). There are several factors that influence VOT, resulting in considerable variation across the studies. One of the most important factors is place of articulation. VOT tends to be longer in /k/ than in /p/ or /t/ (Fowler et al., 2008; Laeufer, 1996; Lisker & Abramson, 1964). It is further influenced by syllable stress (Lisker & Abramson, 1967; Stock, 1971), speech rate (Lisker & Abramson, 1964) and the quality of the following vowel (Fischer-Jørgensen, 1979; Flege, 1991; Nearey & Rochet, 1994). In addition, German displays regional variation (Braun, 1996). Finally, stops in isolated words have longer VOTs than those in spoken sentences and spontaneous speech (Baran, Laufer, & Daniloff, 1977; Lisker & Abramson, 1964). Even word length may affect VOT (Flege, Frieda, Walley, & Randazza, 1998; Lisker & Abramson, 1967). In German, syllable stress might be especially relevant for VOT. In contrast to the other studies in Table 1, Stock (1971) did not differentiate between stressed and unstressed syllables, or between word-initial and word-medial position, which may account for the shorter VOT productions in that study. Voiceless stops in stressed syllables are produced with a short voicing lag in French, with VOT values between 14 and 46 ms, and with a long voicing lag and aspiration in German, with VOT values between 46 and 67 ms. Aspiration also exists in French, especially with /k/, but is comparatively rare (Kohler, 1981). Note that none of these studies analyzed spontaneous speech.
Table 1. Overview of L1 French and German VOT measurements.
VOT: voice onset time; CF: Canadian French.
Despite variation, VOT in German is usually longer than in French, which leads to the expectation that a German influence on French would result in longer VOT compared to those of monolingual French speakers. Conversely, a French influence on German would result in shorter VOT compared to monolingual German speakers.
The acquisition of VOT
Monolingual children acquire the distinction between short and long lag VOT by age 2;0–2;6 (Davis, 1995; Kehoe et al., 2004; Kewley-Port & Preston, 1974; Macken & Barton, 1979), shortly after producing their first words (Stoel-Gammon, 1985). The contrast between lead and short lag VOT, present in French, Spanish and Italian, is acquired only after age three, due to the difficulty in producing lead voicing (see Allen (1985) for French; Bortolini, Zmarich, Fior, and Bonifacio (1995) for Italian; Macken and Barton (1980) for Spanish).
A number of studies have looked at child 2L1s acquiring languages with different contrasts, i.e. lead vs. short lag VOT in one language and short vs. long lag VOT in the other. Most of these studies indicate no difference between 2L1s and age-matched monolinguals in terms of quality (i.e. realizing the relevant contrasts in both languages), but some showed CLI and delays. For example, Kehoe et al. (2004) studied four German–Spanish 2L1s whose VOTs were similar to those of monolinguals in Spanish, but in German two of the children acquired the contrast only around age 3;9, i.e. with a delay compared to monolinguals. One English–Spanish child (aged 1;7–2;3) studied by Deuchar and Clark (1996) acquired the contrast in English at age 2;3 and started distinguishing the Spanish stops just like monolingual children. Three French–Swedish 2L1 children studied by Splendido (2014), recorded several times between 3;7 and 6;3, showed language separation from the first recording. While their French VOTs did not differ from those of monolinguals, Swedish long lag VOT was delayed. Specifically, the children’s Swedish VOTs were longer than those typical of adult monolinguals – a phenomenon also observed in monolinguals but at a younger age. Fabiano-Smith and Bunta (2012) showed that the /p/ and /k/ productions of eight Spanish–English 2L1s (aged 3;0–4;0) did not differ from those of Spanish monolinguals but did differ from those of English monolinguals, suggesting influence from Spanish. Watson (1990) found that the VOT development of 15 English–French 2L1s (ages six, eight and 10) was similar to that of monolinguals in both languages. 5 Finally, Lee and Iverson (2012) showed that Korean–English 2L1s, aged five or 10 years old, acquired distinct VOTs in both languages.
Adult 2L1s speaking two languages with different VOT types usually maintain two separate phonetic systems. For 2L1s in Canada (French–English), Sundara, Polka, and Baum (2006) found VOT values for /t/ similar to those of monolinguals in both languages. 6 MacLeod and Stoel-Gammon (2009, 2010) studied the VOTs of labial and coronal stops in eight adult French–English bilinguals (age of acquisition (AoA) before age 4;0). Again, in both languages their voiceless stops did not differ from those of monolinguals. Even sequential bilinguals (AoA 8–12) in Canada can achieve monolingual-like VOTs (MacLeod & Stoel-Gammon, 2010). However, monolingual-like performance does not seem to be guaranteed. Mack (1989) investigated the VOT of /t/ and /d/ in adult English–French bilinguals (AoA 4;6 years, only one bilingual AoA <3;0) in the U.S., concluding that the phonetic systems of these speakers approximated those of monolinguals but did not perfectly resemble them. In contrast to Sundara et al. (2006), Fowler et al. (2008) showed that adult French–English 2L1s in Canada differed significantly from monolinguals in their French productions of /ptk/, while deviances in English were not significant. Note that most of these studies involved bilinguals in Canada, where both languages are relatively accessible, or even regularly used at home and work.
As for studies in the European context, Kupisch et al. (2014b) analyzed VOT for /p/, /t/ and /k/ in five French–German 2L1s from Germany and five from France. Only the 2L1s from Germany showed different VOTs for /t/ and /k/ with respect to monolinguals. However, the 2L1s’ VOTs were compared to monolingual values available from previous literature (Table 1), and factors potentially influencing VOT, e.g. vowel quality, were not controlled for. Moreover, French VOT was not compared to German VOT. The present study repeats this comparison but includes (a) four additional bilingual speakers, (b) German data from all speakers, which allows comparisons between the two languages, and (c) data from monolingual speakers. We focus on the consonant /k/ because the associated VOT values are acquired later than those for /p/ and /t/; therefore, /k/ is potentially the most difficult of these stops (e.g. Bortolini et al., 1995; Macken & Barton, 1979). Consequently, CLI might be more likely to occur with /k/ than with /p/ or /t/, and if speakers have separate VOT categories for /k/, it is likely that this holds for /p/ and /t/, too. All 2L1s were part of the global accent study by Kupisch et al. (2014a), to which we will compare the VOT data. Our research questions are:
(1) Do simultaneous bilinguals have separate phonetic categories in their two languages (in line with Flege, 1995)?
(2) Is there a correlation between VOT and global foreign accent?
In our discussion, we will also consider the potential role of time spent in the country where the heritage language is spoken.
Present study
Data and methods
Speakers
The bilingual data stem from the HABLA-corpus (Hamburg adult bilingual language acquisition), which contains semi-guided interviews (20–30 minutes) with simultaneous and successive bilinguals. 7 The interviews were conducted by monolinguals or 2L1s of the respective languages. For the present analysis, we selected seven German–French 2L1s who had grown up in France (henceforth 2L1s from France), and another seven who had grown up in Germany (henceforth 2L1s from Germany). All of the 2L1s acquired both languages from birth. In terms of general proficiency, the 2L1s from Germany can be considered German-dominant; the 2L1s from France can be considered French-dominant, even though they were resident in Germany at the time of testing (see Kupisch & van de Weijer, 2015, for various proficiency measures).
Table 2 shows the participant characteristics, i.e. their age, country of birth, residence in the heritage country before and after age 19, language preference and language spoken primarily at school. The last two columns in Table 2 show the proportion of ratings in which each 2L1 was judged to have a foreign accent. The numbers are based on the previous study by Kupisch et al. (2014a). In that study, 21 native speakers of German and 23 native speakers of French judged a naturalistic speech sample of 15 seconds, deciding whether the speaker was a native speaker or not. 8 The result was that the 2L1s were more often judged as foreign when they spoke their heritage language than when they spoke their dominant language.
Table 2. 2L1 participant descriptives.
Numbers in parentheses are years spent in the heritage country during childhood and adulthood respectively.
We additionally collected data from five German monolinguals and five French monolinguals in order to have reference values for spoken language. These L1 speakers had not acquired a second language before age 10. The L1-German participants spoke northern German varieties, and the L1-French participants were from Paris, Chambéry, Bordeaux and Nice. 9 The L1 data were narratives based on images of objects whose names often began with /k/ (e.g. German Kind [kʰɪnt] “child,” Katze [kʰatsə] “cat,” French quiche [kiʃ] “quiche,” cage [kaʒ] “cage”). Participants were asked to spontaneously invent a story, including as many of the objects in the pictures as possible. The reason for not using simple naming tasks or list-readings was that we wanted the speech to be as similar as possible to the interview data.
Method
The acoustic measurements were made with Praat (Boersma & Weenink, 2013). Following Lisker and Abramson (1964), we identified the period between the release of the consonant closure and the onset of glottal vibration, specifically the peak of the first regular wave (Figure 1). The VOT of the velar stop /k/ was analyzed only in words with initial stress and with /k/ in word-onset prevocalic position. Since the predominant stress pattern is trochaic in German and iambic in French, the analysis of the French data was restricted to monosyllabic words and disyllabic words in which the second syllable contained a schwa. 10 For instance, cadre [ˈka.dʀə] “frame” was included but not couteau [ku.ˈto] “knife.” We also included French function words which can be produced in isolation, e.g. contre [ˈkɔ̃.tʀə] “against,” while excluding function words like prenominal determiners as in quels amis [kɛl.z ̮a.ˈmi:] where liaison leads to a trisyllabic sequence. Each word was coded for number of syllables, second syllable schwa [ə], function or content word, and place of articulation of the following vowel (high–low, front–back).
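The measurement and the selection criteria described above can be illustrated with a short sketch. It is not the Praat procedure used in the study: the `Token` record, its field names and the example time stamps are hypothetical, and it assumes that the release and the voicing onset have already been annotated by hand.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One annotated /k/ token (all field names are hypothetical)."""
    word: str
    language: str                 # "French" or "German"
    release_s: float              # time of the stop release, in seconds
    voicing_onset_s: float        # peak of the first regular wave, in seconds
    n_syllables: int
    initial_stress: bool          # True if the /k/-initial syllable is stressed
    second_syllable_schwa: bool = False


def vot_ms(token: Token) -> float:
    """VOT: interval from the release to the onset of glottal vibration, in ms."""
    return (token.voicing_onset_s - token.release_s) * 1000.0


def is_included(token: Token) -> bool:
    """Apply the selection criteria described in the text."""
    if not token.initial_stress:
        return False
    if token.language == "French":
        # French: monosyllables, or disyllables whose second syllable has a schwa.
        return token.n_syllables == 1 or (
            token.n_syllables == 2 and token.second_syllable_schwa
        )
    # The text states no further syllable restriction for German.
    return True


# Hypothetical time stamps chosen to give a VOT of about 60 ms.
katze = Token("Katze", "German", 0.100, 0.160, 2, True)
cadre = Token("cadre", "French", 0.200, 0.230, 2, True, True)
couteau = Token("couteau", "French", 0.300, 0.330, 2, False)
kept = [t.word for t in (katze, cadre, couteau) if is_included(t)]
```

Here `kept` contains Katze and cadre but not couteau, mirroring the examples given in the text.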

Figure 1. Example of voice onset time (VOT) measurement in German Katze [ˈkʰaʦə] (“cat”). The left arrow in the waveform marks the release and the right arrow marks the beginning of vocal fold vibration (peak of the first regular wave). These points are visible in the spectrogram as a sudden burst of energy and a voice bar respectively, marked with boundaries on the third tier (k1 = VOT interval, here: 60 ms).
Material
Table 3 shows four properties of the material, relevant for the investigation. The two samples were comparable in size. The French sample was more repetitive than the German sample, containing relatively more function words and relatively fewer disyllabic words. The place of articulation of the vowel following /k/ was not completely balanced either, as shown in the table. Since these factors are likely to have an influence on VOT, we included them in the statistical analysis.
Table 3. Material description (numbers in parentheses are proportions).
Results
Separation of phonetic categories
Table 4 shows the average VOT in the three groups. Across all groups, the average VOT for French was shorter than for German. Figure 2 is a boxplot of the measurements in the groups speaking French (left panel) and speaking German (right panel).
Table 4. Average voice onset time (ms) with standard deviations in parentheses.

Figure 2. Voice onset time (VOT) boxplots across speaker groups in French and German.
A preliminary examination of the boxplots and the average values suggests that the speakers made a clear difference in VOT when speaking each of the two languages, i.e. the 2L1s produced longer VOTs in German than in French. Secondly, the difference in VOT between the two languages seems to be smaller for the speakers from France (approximately 13 ms) than for those from Germany (approximately 22 ms). Finally, the 2L1s’ average VOTs appear to be considerably shorter than the corresponding L1s’ VOTs in both languages.
The VOTs produced by the 2L1s were analyzed with a mixed-effects regression model. The main effect of interest here was the interaction of childhood country (France or Germany) with language (French, German), since this effect represents how well the 2L1s differentiated the VOT categories in the two languages. The other predictors were primarily included in order to control for the imbalance in the dataset shown in Table 3. These were the interaction of vowel height and vowel frontness (resulting in four categories of vowel position), word length (either one or two syllables) and word type (function word or content word). Finally, the model included random intercepts for participant and for word. The analysis was carried out in R (version 3.0.3, R Core Team, 2014). The L1 groups are not included in the analysis because their data were not collected in the exact same manner as the 2L1 data and because we are primarily interested in the comparison between the 2L1 groups.
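Written out, a model with this structure might look as follows. This is only a sketch based on the verbal description: the symbols, the indices (token i, participant j, word k) and the dummy coding are introduced here for illustration, and the normality assumptions are the usual ones for linear mixed models rather than something stated in the text.

```latex
\begin{aligned}
\mathrm{VOT}_{ijk} ={}& \beta_0
  + \beta_1\,\mathrm{Country}_j
  + \beta_2\,\mathrm{Language}_i
  + \beta_3\,(\mathrm{Country}_j \times \mathrm{Language}_i) \\
&+ \beta_4\,\mathrm{Height}_i
  + \beta_5\,\mathrm{Front}_i
  + \beta_6\,(\mathrm{Height}_i \times \mathrm{Front}_i)
  + \beta_7\,\mathrm{Length}_i
  + \beta_8\,\mathrm{Type}_i \\
&+ u_j + w_k + \varepsilon_{ijk},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),\;
       w_k \sim \mathcal{N}(0, \sigma_w^2),\;
       \varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

In this notation, the coefficient of interest is the country-by-language interaction, and the terms u and w are the random intercepts for participant and word.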
Table 5 shows the outcomes of the analysis, which are commented upon below. The exact estimates of the effects and the associated p-values are given in the table and will not be repeated in the text.
Table 5. Regression table.
There are four sizeable coefficient estimates, which are all significant. The first one is the coefficient for the intercept, which corresponds to the predicted VOT (approximately 34 ms) for a bilingual speaker from Germany when speaking French. This predicted VOT was only minimally different from that produced by a bilingual from France speaking French (a non-significant difference of less than 1 ms). There was a clear language effect for the bilinguals from Germany. Their predicted VOT when speaking German was approximately 23 ms longer than when they spoke French. In other words, the 2L1s from Germany made a clear distinction in VOT in the two languages. The 2L1s from France also made a difference between the two languages, but this difference was approximately 8 ms smaller as indicated by the interaction estimate in the table. This interaction of childhood country and language spoken was significant, and is shown in Figure 3.
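The way these coefficients combine into predicted values can be made explicit with a small worked example. The sketch below assumes treatment (dummy) coding with "Germany" and "French" as reference levels, uses only the approximate values quoted in the text, and holds all other predictors at their reference levels. The country coefficient is set to 0 as a placeholder, since the text only reports it as smaller than 1 ms; the exact estimates are those in Table 5.

```python
def predicted_vot(country: str, language: str, coef: dict) -> float:
    """Predicted VOT (ms) under treatment coding.

    Reference levels (assumed): country = "Germany", language = "French".
    """
    is_france = 1.0 if country == "France" else 0.0
    is_german = 1.0 if language == "German" else 0.0
    return (
        coef["intercept"]
        + coef["country_france"] * is_france
        + coef["language_german"] * is_german
        + coef["france_x_german"] * is_france * is_german
    )


# Approximate values from the text, not the exact estimates in Table 5.
coef = {
    "intercept": 34.0,        # 2L1 from Germany speaking French
    "country_france": 0.0,    # placeholder: reported only as "< 1 ms"
    "language_german": 23.0,  # language effect for 2L1s from Germany
    "france_x_german": -8.0,  # interaction: smaller contrast for France
}

print(predicted_vot("Germany", "French", coef))  # 34.0
print(predicted_vot("Germany", "German", coef))  # 57.0
print(predicted_vot("France", "German", coef))   # 49.0
```

With these rounded values, the predicted German–French contrast is about 23 ms for the 2L1s from Germany and about 15 ms for the 2L1s from France.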

Figure 3. Interaction of childhood country and language spoken.
As Table 5 shows, vowel height also had a significant effect on VOT: the VOT of /k/ followed by a high vowel was on average nearly 17 ms longer than that of /k/ followed by a low vowel. None of the other effects were significant.
In summary, when speaking French, the bilinguals from Germany produced VOTs that were indistinguishable from those produced by the bilinguals who had grown up in France. When speaking German, by contrast, the bilinguals from France produced VOTs that were significantly shorter than those produced by the bilinguals from Germany. In the next section we relate these findings to the speakers’ perceived foreign accent (see Table 2).
Relation to foreign accent
As a first step in relating VOT to foreign accent, we plotted the VOT measurements against the individual proportions indicating how often each speaker was deemed foreign (Figure 4). The horizontal axis is the VOT scale, while the vertical axis represents the proportion of times a speaker was classified as having a foreign accent. The gray shaded rectangles indicate the range of the L1 VOT measurements. The left and right panels show the 2L1s speaking French and German respectively. Within each panel, triangles represent the measurements by the 2L1s speaking the language of their childhood environment, while circles represent them speaking the heritage language.

Figure 4. VOT and global accent.
The 2L1s were seldom rated as having a foreign accent when speaking the language of their childhood environment and are situated at the bottom of the panels. In contrast, the 2L1s were more often perceived as foreign sounding when speaking the heritage language, and are therefore located higher up. Furthermore, the VOT measurements for the 2L1s all fall within the range of the L1 VOTs, which appear to have been extended towards the right, both in French and in German. In order to establish a potential relationship between VOT and perceived foreign accent, we calculated four correlation coefficients, one for each combination of speaker group and language (Table 6). None of these four correlation coefficients was significant, so we cannot demonstrate a relationship between the realization of VOT and perceived foreign accent.
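As an illustration of how one such coefficient could be computed, the sketch below implements a Pearson correlation for a single group-by-language cell. Several points are assumptions, since the text does not specify them: that the coefficient is Pearson's r, and that each speaker contributes one mean VOT value. The data are invented for illustration and are not the study's measurements, and no significance test is included.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-speaker values for one cell (seven speakers).
mean_vot_ms = [41.0, 38.5, 52.0, 47.5, 35.0, 44.0, 49.5]
prop_foreign = [0.30, 0.65, 0.10, 0.45, 0.80, 0.20, 0.55]

r = pearson_r(mean_vot_ms, prop_foreign)
```

With only seven speakers per cell, even a sizeable r can fail to reach significance, which is worth keeping in mind when interpreting null results of this kind.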
Table 6. Correlations between voice onset time and perceived accent.
As a second way of examining the relationship between VOT and foreign accent, we created boxplots for each speaker individually while speaking German and French (Figure 5).

Figure 5. Individual voice onset times (VOT) in the 2L1s’ French and German.
In Figure 5, the 2L1s from France and Germany are shown in the upper and lower seven panels respectively. For all but four speakers (D16, F2, F6 and F7), there was no overlap in the interquartile ranges (the middle 50% of the observations) of the measurements representing German and French, suggesting that the speakers produced distinct VOTs in both languages. Speaker F6 had exceptionally short VOTs when speaking German (similar to the French VOT), which was expected because he also had a high number of “foreign accent ratings” in German while sounding native-like in French (cf. Table 2). D16 had comparatively short VOTs when speaking German, which strongly overlapped with her French VOT range, pointing to compromise values. This speaker was not, however, rated as having a foreign accent either in German or in French. F2 and F7 also had overlapping VOT-ranges in their two languages, and, similar to F6 and D16, they had very different accent profiles from each other: F2 sounded native-like in French and had a foreign accent in German, while F7 sounded native-like in French and German.
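The overlap criterion used here can be stated precisely with a short sketch. It assumes quartiles computed with Python's inclusive method, which may differ slightly from the hinges drawn by a particular boxplot routine, and the VOT values are invented for illustration rather than taken from the study.

```python
from statistics import quantiles


def iqr_bounds(values):
    """Return (Q1, Q3), the bounds of the middle 50% of the observations."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def iqr_overlap(values_a, values_b):
    """True if the interquartile ranges of the two samples overlap."""
    a_low, a_high = iqr_bounds(values_a)
    b_low, b_high = iqr_bounds(values_b)
    return a_low <= b_high and b_low <= a_high


# Hypothetical VOT values (ms) for two imaginary speakers.
french_vot = [20, 25, 30, 35, 40]
german_distinct = [50, 55, 60, 65, 70]   # clearly separated from French
german_close = [28, 32, 36, 40, 44]      # overlapping with French

separated = not iqr_overlap(french_vot, german_distinct)
merged = iqr_overlap(french_vot, german_close)
```

A speaker like the first would count as having distinct VOT ranges in the two languages, whereas the second would be flagged as showing overlap.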
In summary, the results fail to show a systematic relationship between foreign accent and VOT. All 2L1s realized VOTs within more or less the same range, and nearly all speakers made clear differences between the two languages.
Discussion
The goals of this study were to find out whether adult simultaneous bilinguals show evidence of separate VOT categories in their two languages, and to relate these results of VOT production to the perceived foreign accent of these speakers. Our analysis of VOT for /k/ showed that 2L1s realize VOT differently in German and French, and that the formation of distinct categories was clearer in the group of 2L1s from Germany. In our second analysis, we found that these results do not mirror the patterns previously found for global accent. In the following subsections we will discuss these findings.
VOT and global accent
Our results suggest no relation between the perceived foreign accent of our bilingual participants and their VOT. There were speakers with a native-like accent and a deviant VOT, and speakers with a non-native accent but distinct VOT categories. This result seems unexpected given previous studies that showed a correlation between VOT and global accent (e.g. Flege, 1984; Flege & Eefting, 1987b). However, target-like VOT does not preclude an accent in other phonological domains, and, in fact, there are studies showing that heritage speakers produce monolingual-like VOTs, despite having a perceivable foreign accent (e.g. Au et al., 2002; Oh et al., 2003).
There are three possible explanations for the absence of a relation: First, raters who judge an accent attend to features that are more prominent than VOT, such as prosody. Second, the data collection itself may play a role. Previous studies that found correlations between VOT and global accent were based on comparatively more controlled data, such as reading (e.g. Flege, 1984; Flege & Eefting, 1987b; Major, 1987). An accent may be easier to suppress in reading than in spontaneous speech, which may lead to different findings. The third possible explanation is that VOT is less susceptible to CLI than other phenomena, such as suprasegmentals, because it is acquired at an early age and therefore does not depend on other linguistic properties acquired later in life (e.g. discourse knowledge). This last explanation is compatible with the fact that children acquiring two languages with different VOT patterns can differentiate the VOT categories in both languages well before the age of four years (Deuchar & Clark, 1996; Kehoe et al., 2004; Splendido, 2014; Watson, 1990).
Phonetic systems of adult bilinguals
With regard to our first research question, we found that all bilinguals produced longer VOTs in German than in French. On the individual level, all but four speakers had distinct VOT categories. This generally confirms Hypothesis 4 of Flege’s (1995) SLM, according to which early bilinguals will develop separate phonetic categories in both languages. The results tie in well with developmental data from early bilinguals, showing that bilingual children form separate VOT categories early on (e.g. Deuchar & Clark, 1996). What is more, they show that these separate categories will not be fused into a unified category later on in life despite developmental delays (as reported by Kehoe et al., 2004), and massive input from the majority language. It has indeed been suggested that features which are similar in the two languages in contact are especially vulnerable to attrition (e.g. Andersen, 1982; Schmid, 2009). Given that the phonetic categories for /k/ in French and German have the same place and manner of articulation but different VOTs, they could be potentially affected. Our results, however, do not support this assumption. One reason could be that VOT categories are comparatively stable because they are acquired relatively early in life. Another reason could be that our 2L1 speakers’ minority language was supported through schooling and residence in the heritage country, as will be discussed below.
Comparison with monolinguals
The difference between bilinguals and monolinguals displayed in Figures 2 and 4 deserves some comment. In fact, the VOTs from the L1 groups in the present study are not only longer than those of the 2L1s in both languages, but also longer than those reported in the literature. The most likely explanation is that the L1 data consisted of semi-spontaneous narratives, while the bilingual data consisted of more naturalistic interviews. Possibly, the narratives were produced at a slower speaking rate than the interviews, or the target words received more stress, leading to longer VOTs in the L1 speech. Or it could be that speech register or speaking style is an additional factor that influences VOT. We leave this possibility open for future research.
Extra- and intra-linguistic factors
The results above suggested that the 2L1s from France were not as successful in producing German-like VOT as the 2L1s from Germany were in producing French-like VOT. In this section we explore whether any of the background information available to us (through questionnaires and the content of the interviews) could be related to this difference between the groups. This part of our discussion is tentative, and is primarily meant to raise questions that may be addressed in future studies. The factors that we consider are years spent in the heritage country during childhood (i.e. before age 19), years spent in the heritage country during adulthood (i.e. after age 19), language preference and language spoken at school (cf. Table 2).
One possible explanation of why the 2L1s from Germany were better at producing French-like VOT than the 2L1s from France were at producing German-like VOT is that the speakers from Germany, on average, spent more time in France during their childhood than the speakers from France spent in Germany. The speakers from France, by contrast, spent much more time in Germany during adulthood (since they moved there), than the speakers from Germany spent in France. This suggests that time spent in the heritage country during adulthood cannot compensate for the smaller amount of time spent in the heritage country during childhood. These findings do not speak directly to the “earlier-is-better” view (e.g. Abrahamsson & Hyltenstam, 2009), because all speakers had been exposed to both languages from birth, but indirectly in the sense that relatively more input in the minority language is more beneficial early in life than later in life. In fact, although we did not find a significant relation between VOT and global accent, the accent data in Table 2 show that the bilinguals from Germany were also more often deemed native-like when speaking French as compared to the bilinguals from France when speaking German.
Secondly, many of the 2L1s from Germany attended schools where the language spoken was primarily French. This suggests that school language may be an important factor as well: beyond the time spent in the heritage country, it adds to the amount of input that the speakers received, as well as to its quality and variety. Nonetheless, as shown in Table 2, most 2L1s from France also attended schools at which German was spoken as either a primary or a secondary language.
Language preference was comparatively diverse among the 2L1s from France, while the 2L1s from Germany preferred to speak German. If language preference were crucial in attaining monolingual-like VOT, one would have expected the 2L1s from France, who seemed to be relatively more comfortable speaking their weaker language, to have more monolingual-like VOT in German, contrary to what we observed.
Finally, there is an alternative, intra-linguistic explanation for the observed difference in VOT contrasts between the 2L1s from France and Germany. One could assume that it is easier for German-dominant speakers to produce French-like VOT in voiceless stops because German has short lag VOT in voiced stops. From an articulatory point of view, producing short lag VOT should therefore not pose a problem for German-dominant speakers. By contrast, French-dominant speakers do not have long lag VOT in the phonetic system of their French, so it may be harder for them to produce long lag VOT. If correct, these ideas make two predictions that could be tested in future research. First, when comparing equally proficient L2 learners of French and German (having the respective other language as their L1), L2 learners of German should have more difficulty producing German-like VOT in voiceless stops than L2 learners of French have producing French-like VOT. Second, the mirror pattern should emerge with voiced stops, where German-dominant speakers should encounter relatively more difficulties, given the absence of lead voicing in their dominant language system.
Conclusion
Our study showed that adult German–French 2L1s produce shorter VOTs when speaking French than when speaking German. These results are consistent with previous studies on child language, which show language separation in VOT, as well as with previous research on adult heritage speakers, which suggests that early bilinguals successfully acquire VOT and maintain target-like values throughout adulthood. The VOT contrast between German and French was more pronounced in the 2L1s from Germany than in the 2L1s from France, which may be due to intra- or extra-linguistic factors.
Our results showed no systematic relation between VOT and global accent. Our interpretation is that VOT, because it is acquired early, stabilizes before contact with the majority language becomes more intense, while other aspects of pronunciation, such as rhythm and intonation, continue to evolve after that point and are therefore more susceptible to CLI. The role of extra-linguistic factors remains inconclusive, suggesting that several factors may interact. At the very least, age of onset is not the only factor that determines acquisition outcomes, since all bilinguals investigated here were exposed to both languages from birth. Finally, observable differences in accent between minority/heritage and majority speakers suggest that the childhood language environment has an impact on aspects of the adult phonological system other than VOT.
Footnotes
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
