Abstract
The present article provides an exploration of ultimate attainment in second language (L2) and its limitations. It is argued that the question of maturational constraints can best be investigated when the reference population is bilingual and exposed on a regular basis to varieties of their first language (L1) that show cross-linguistic influence. To this end, 20 advanced Dutch–English bilinguals are compared to 9 English native speakers immersed in a Dutch environment. All participants are teachers or students of English at a Dutch institution of higher education. The populations are shown to be at similar global proficiency levels. Two phonetic variables (voice onset time or VOT and vowel discrimination) and one grammatical variable (verb phrase ellipsis), which are assumed to present particular challenges to Dutch learners of English, are explored, and speakers are furthermore rated for their global nativeness. The findings show no differences between populations on VOT but some variance on the production of a vowel that has no correlate in Dutch (the English trap vowel). However, all but one of the L2ers are rated outside the range of the natives on perceived foreign accent. There are also differences between groups where acceptance of different sentence types with verb phrase ellipsis is concerned. We interpret these findings to indicate that there are areas of L2 knowledge and production that are persistently difficult to acquire even under circumstances that are highly favourable for L2 acquisition.
I Introduction
Investigations of second language acquisition (SLA) have long recognized the fact that there is a statistical correspondence between the age at which acquisition of the second language (L2) starts (age of acquisition, henceforth AoA) and the level of ultimate achievement. The older the learner is at the onset, the less likely he or she is to attain extremely advanced (near-native) proficiency (Abrahamsson and Hyltenstam, 2008; Bialystok, 1997). A number of external factors play an undisputed role in this: the amount of time and effort that the learner is free to devote to the acquisition process; the level of motivation and aptitude; and the fact that most processes of learning and memory become less efficient with increasing age (Bialystok, 2001: Chapter 3).
An unresolved and highly controversial question, on the other hand, is whether there are independent maturational constraints that limit ultimate attainment in L2 for learners with AoAs beyond the so-called ‘critical period’. Proponents of critical (or sensitive; for a discussion of the terminology, see Bialystok, 1997) periods for language learning assume that there are certain (possibly language-specific) cognitive functions that are used for language acquisition and in particular grammar building in childhood. On this view, once the linguistic system has stabilized these functions become unavailable for L2 learning; and success in acquiring particular features is contingent on whether or not these are pre-instantiated by the first language (L1) (Birdsong, 2006).
Critical/sensitive period approaches do not assume that L2 learning will become impossible after a certain age, but that a qualitative change may ensue for the learning of particular features. This change has sometimes been put in terms of explicit vs. implicit learning, resulting in, respectively, declarative or procedural representations of grammatical knowledge (e.g. Paradis, 2004, 2008). Other researchers have argued in favour of an underlying representational deficit that learners can overcome by means of non-grammatical compensatory strategies (e.g. R. Hawkins, 2001; R. Hawkins and Hattori, 2006; R. Hawkins and Tsimpli, 2009). The implication of all of these proposals is that it may be possible for L2 learners to behave indistinguishably from natives, in particular on offline tasks that allow the application of explicit knowledge, but that there is a difference in the underlying nature of the linguistic knowledge. In tasks that involve additional pressure (Abrahamsson and Hyltenstam, 2009) or require online, unmonitored L2 use – that is, where the learner cannot fall back on what Long (2005: 289) terms ‘language-like behaviour’ – L2 speakers may be unable to sustain native-like behaviour.
This has led Long (2011) to stipulate that the question concerning maturational constraints hinges on identifying late learners who have become native-like across the full range of linguistic abilities:
No-one denies that learners can achieve native-like abilities on some tasks, especially off-line, language-like tasks, or with their control of some sounds, lexical items and collocations, or grammatical structures. Many early stage learners can do that. What opponents of the idea of maturational constraints need to produce is a single case of a late starter who can perform like a NS [native speaker] across the board.
Quite irrespective of whether one accepts this as both a necessary and a sufficient condition for proving the (non)existence of maturational constraints, it is important to point out a methodological fallacy concerning the baseline for comparison. Most investigations of highly advanced learners use native controls with no or very limited levels of bilingualism (e.g. Abrahamsson and Hyltenstam, 2009; Bongaerts et al., 1997). Monolinguals may, however, constitute an inappropriate point of reference for ultimate attainment in SLA due to the fact that even low-proficiency bilinguals experience measurable degrees of bi-directional cross-linguistic interference on all linguistic levels (Hopp and Schmid, 2013; Schmid, 2013). The impact of L2 on L1 increases with higher proficiency and longer exposure. If bilinguals inevitably experience transfer from L2 to L1, it is unreasonable to expect them to be able to completely overcome L1 to L2 transfer and attain L2 levels that correspond to those of monolingual natives. As Hopp and Schmid put it, ‘the choice of a monolingual control group […] moves the yardstick of native-likeness to a point that may, by definition, be out of reach for most bilinguals’ (2013: 364).
For bilinguals who are immersed in an L2 environment, there may be the additional factor of interacting in their L1 with non-natives (that is, with speakers who have varying degrees of foreign accent and other manifestations of cross-linguistic interference) on a regular basis. This is particularly true for individuals who use either their native language or the L2 professionally, e.g. as foreign language teachers (Porte, 2003). Since speakers and listeners who are exposed to particular dialects adapt not only their own speech production (Chambers, 1992) but also their underlying mental representations (Dahan et al., 2008), it is to be expected that both natives and L2 speakers who receive a significant amount of clearly non-native input may develop or retain a contact variety of that language that is not necessarily restricted by constraints to ultimate attainment but assimilates to the variety that they are exposed to every day. The baseline for comparison in investigations of ultimate L2 attainment should therefore consist of bilingual native speakers who are themselves not only proficient in the native language of the target population but also experience input similar to that of the target population. If such a comparison could still demonstrate that late learners systematically deviate from early learners and/or native speakers on some measures, that would constitute a more convincing argument in favour of maturational constraints than differences found in comparison with a monolingual baseline.
II Ultimate attainment in L2 pronunciation
Acquiring a native accent is perhaps the most difficult aspect of SLA, and ultimate nativeness in pronunciation appears to become unlikely at a younger age than is the case for other linguistic phenomena: many advanced L2 speakers seem to have mastered even complex aspects of L2 grammar, yet speak with an obviously foreign accent (Long, 1990; Singleton, 2005). Here, as in other areas of linguistic investigation, the question of whether maturational constraints or more gradual, cumulative factors such as progressive entrenchment are at the root of this phenomenon is highly contested (e.g. Bylund et al., 2013; Piske et al., 2001). In particular, two phonetic/phonological variables have often been invoked in the context of foreign accent and maturational constraints on L2 pronunciation: Voice Onset Time (VOT) and vowel categorization (e.g. Flege, 1987).
1 Voice onset time
VOT refers to the time period between the release of a prevocalic plosive and the starting point of vocal cord vibration (Crystal, 1991; D. Gilbers et al., 2013). Languages differ from each other regarding the duration of the lag following the release burst of voiceless plosives: some languages, such as English, are long lag (VOT in voiceless plosives above 36 ms, i.e. aspirated) whereas others, such as Dutch, are considered short lag (0–36 ms) (Jansen, 2004). Furthermore, some languages (Dutch among them) are characterized by the fact that vocal cord vibration of initial voiced plosives takes place even before the release burst, a phenomenon called voicing lead (Crystal, 1991), negative VOT, or prevoicing (D. Gilbers et al., 2013). Table 1 displays the VOT-related differences between English and Dutch, indicating that even though both languages contrast the prevocalic plosives /b/:/p/, /d/:/t/, and /ɡ/:/k/, English does so differently than Dutch (D. Gilbers et al., 2013; Mayr, Price and Mennen, 2012).
Table 1. Comparison of Dutch and English VOT (adapted from D. Gilbers et al., 2013).
VOT values have been shown to be a good predictor of globally perceived foreign accent (Major, 1987; Riney and Takagi, 1999; note, however, that Flege and Eefting, 1987, only found such a correspondence for low-proficiency Dutch–English bilinguals). Beginning Dutch L2 learners of English prevoice their voiced plosives and fail to sufficiently aspirate their voiceless plosives when speaking English, directly mapping their L1 VOT system onto that of the L2 (Flege and Eefting, 1987). More advanced L2 learners achieve a closer approximation of the native VOT system, at least in elicited data that focuses particularly on the voiced–voiceless contrast and therefore draws the learner’s attention to this feature (Flege and Eefting, 1987). However, even advanced bilinguals often do not fully attain native-like values. This was attested, for instance, by Flege (1987), who found that native speakers of French – a non-aspirated language (Tranel, 1987; Van Boxtel et al., 2005) – living in the US and native speakers of English living in France did not fully conform to native norms in either of their languages, which Flege interpreted as evidence for cross-linguistic assimilation.
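The three-way distinction described in this section (prevoicing, short lag, long lag) can be made concrete with a small sketch. The 36 ms boundary is the one cited above from Jansen (2004); the function name, the treatment of exactly 0 ms and 36 ms as short lag, and the token values are assumptions introduced purely for illustration and are not measurements from the present study.

```python
# Boundary between short-lag and long-lag plosives, as cited in the text
# (Jansen, 2004): 0-36 ms counts as short lag, above 36 ms as long lag.
SHORT_LAG_MAX_MS = 36.0


def classify_vot(vot_ms: float) -> str:
    """Assign a single VOT measurement (in ms) to a lag category.

    Hypothetical helper: negative values mean that voicing starts before
    the release burst (voicing lead / prevoicing).
    """
    if vot_ms < 0:
        return "prevoiced"
    if vot_ms <= SHORT_LAG_MAX_MS:
        return "short lag"
    return "long lag"


# Invented example values, chosen only to show one token per category.
example_tokens = {
    "prevoiced /b/": -80.0,
    "unaspirated /p/": 15.0,
    "aspirated /p/": 70.0,
}

for label, vot in example_tokens.items():
    print(f"{label}: {vot:+.0f} ms -> {classify_vot(vot)}")
```

A categorical split of this kind only illustrates the typology; comparisons between speaker groups would normally be made on the continuous VOT values themselves.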
2 Phonetic mapping of the vowel space
Vowel categories are distributed along two parameters: length and spectral location. Length, which is not phonemic in English, will not be treated here. Spectral location is two-dimensional in nature, as it is itself based on the first (F1) and second (F2) formants. The acoustic F1 parameter is articulatorily related to the openness of the lower jaw and the vertical position of the tongue when producing a vowel; the higher the F1 value is, the more open the vowel. The F2 parameter is related to the horizontal position of the tongue, with a higher F2 value indicating a more fronted vowel.
Even though every realization of a particular vowel is slightly different, native speakers perceive them as belonging to the same particular vowel category, due to the fact that L1 learners discover which sounds are contrastive, as well as how these sounds are produced (Flege, 1992). On the basis of the lexicon of the target language in relation to contextual situations, children acquiring English as their L1 will learn that a word realized as [pʰæt] (pat) contrasts meaningfully with one realized as [pʰεt] (pet). Since Dutch does not make a phonemic contrast between /ε/ and /æ/, L1 learners of Dutch confronted with [pεt] (pet, ‘hat’ in Standard Dutch) and [pæt] do not have a meaningful contrast to rely on, and hence perceive both vowels as belonging to the same category (Flege, 1992; Flege et al., 1997).
The two dominant models of L2 vowel categorization, the Speech Learning Model (SLM; Flege, 1995) and the Perceptual Assimilation Model (PAM; Best, 1995), invoke the influence of L1 vowel categories in SLA. Essentially, both models assume that categories that are identical cross-linguistically will transfer without any problems, that L2 sounds sufficiently phonetically different from any alternative in the L1 – Flege (1995) terms these ‘new sounds’ – will also be fully acquired (after sufficient exposure), but that L2 sounds that are similar to but not exactly the same as any sound in the L1 will prove difficult to attain (Best, 1995; Flege, 1992, 1995; Marković, 2009). Considering the fact that for Dutch L2 learners of English, /æ/ belongs to the third category, the /ε/–/æ/ contrast should be challenging for them. This expectation is supported by Bohn and Flege (1992), who found a similar persistent difficulty with the /ε/–/æ/ contrast among German L2 learners of English.
Both production and perception of L2 vowel categories have been shown to improve (i.e. become closer to the native target) with increasing experience (Bohn and Flege, 1992; Flege et al., 1997). Irrespective of the level of experience, however, participants in a study by Flege and colleagues made errors that were clearly induced by their respective native language (German, Korean, Mandarin and Spanish), which led the authors to conclude that ‘[n]on-native speakers’ rate of learning, and perhaps their ultimate degree of success, may be influenced by the perceived similarity of English [sounds] to [sounds] in the L1’ (Flege et al., 1997: 438). Flege et al. point to two possible explanations: first, they invoke the possibility ‘that hearing English spoken with a foreign accent by other non-natives may have influenced our subjects’ performance’ (1997: 467). Second, they concede the possibility that ‘adults’ ability to learn aspects of the L2 sound system is limited in an absolute sense’ (1997: 467).
III Ultimate attainment in L2 grammar: A perspective from UG
Where grammatical categories are concerned, the debate on the limits to ultimate attainment in L2 has been approached from a variety of different theoretical perspectives. The discussion here will be confined to the Universal Grammar (UG) framework. In this context, the main question has been to what degree second language learners can draw on this resource and whether SLA is constrained by age and maturational processes through which UG (or parts thereof) become(s) unavailable around a certain age.
Early theories of a ‘critical period’ constraining SLA proposed very strong assumptions, such as the notion that UG becomes entirely unavailable after a certain age (e.g. Bley-Vroman, 1990; Clahsen and Muysken, 1986, 1989). On this view, adult SLA makes use of non-language-specific strategies based on ‘principles of information processing and general problem solving’ (Clahsen and Muysken, 1989: 26), and UG can only be used insofar as it is already instantiated by the pre-existing L1. This point of view has come to be known as the ‘Fundamental Difference Hypothesis’ (Bley-Vroman, 1990).
The view that SLA learners have no access to UG has been refuted by a number of studies demonstrating that L2 learners can, in some cases, overcome the poverty of the stimulus (POS) problem, for example in syntactic phenomena such as long distance wh-movement in English (for overviews of studies on POS phenomena in late SLA, see Schwartz and Sprouse, 2000; White, 2003). Whether access to UG can be differentially affected by age in SLA remains a source of debate.
R. Hawkins and Chan (1997) propose that the grammar of late L2 learners may have a representational deficit, based on the fact that access to certain subparts of UG is subject to maturational constraints, and that UG is operative in SLA ‘in some attenuated form’ (p. 188). Specifically, features associated with functional categories (such as agreement and determiner) are taken in this approach to become inaccessible after puberty. Since L2 learners whose L1 does not encode a functional feature that needs to be acquired in SLA cannot build a target-like representation, they may apply non-grammatical compensatory strategies to model the target system. This assumption can account for common observations in SLA, for example that even otherwise highly proficient L2 English speakers from languages that do not have articles (such as Russian and other Slavic languages, but also many Asian languages) have persistent problems in using English determiners in a target-like fashion.
The account originally proposed by R. Hawkins and Chan has gone through various reformulations. Among the most recent ones is the Interpretability Hypothesis proposed by Tsimpli and Dimitrakopoulou, which assumes that ‘[u]ninterpretable features [not instantiated in the learner’s L1, MSS/SG/AN] are subject to critical period constraints and, as such, they are inaccessible to L2 learners’ (2007: 224).
Against this view, Full Transfer/Full Access models argue that the observation that there is persistent optionality in the overt realization of a grammatical feature cannot be taken as evidence that the feature is not or only partially/defectively represented in the speaker’s mental grammar (the ‘Missing Surface Inflection Hypothesis’). On this view, late L2 learners retain access to UG (i.e. they can acquire representations of functional categories) and variability in performance is due to external factors constraining control of the output: the L2 speaker has the knowledge, but does not always map the underlying form onto the surface feature due to procedural constraints or competition from the L1 (this view has been proposed by e.g. Lardiere, 1998; Prévost and White, 2000).
In summary, theories that assume an innate basis for (first) language learning are to some extent divided on the question of whether access to this inborn knowledge in the learning of further languages is subject to some maturational constraint or not. Again, it can be argued that these assumptions can best be tested by comparing L2 learners to speakers who demonstrably have no representational deficit, but who experience similar processing limitations/cross-linguistic competition as L2 learners (instead of the traditional monolingual controls), that is bilinguals for whom the language under investigation is the L1.
1 Verb phrase ellipsis
A recent study by R. Hawkins (2012) investigates the phenomenon of English Verb Phrase Ellipsis (VPE) and its acquisition by L2 learners from the point of view of the representational deficit account. VPE concerns the option in English of eliding parts of the verb phrase where it is anaphorically recoverable from the preceding discourse context, as in the examples listed in (1):
(1) a. Jack wrote Jill a letter. Mary did ____ too.
b. Jack wrote Jill a letter. Mary will ____ too.
c. Jack has written Jill a letter. Mary did ____ too.
d. Jack was writing Jill a letter. Mary did ____ too.
(all examples from R. Hawkins, 2012)
R. Hawkins investigates the acceptability of these types of sentences among native speakers and L2 learners (native speakers of Arabic and Mandarin Chinese) of English, and concludes that all populations accept the sentences in (1a–d) as grammatical despite the fact that VPE is not licensed in either Arabic or Mandarin Chinese. All populations furthermore reject VPE in cases such as (2), where the elided phrase contains not an auxiliary but a stranded verb.
(2) Jack sent Jill a letter. *Mary sent ____ too.
R. Hawkins investigates a number of other cases of VPE that will not be discussed here. Crucial for the purpose of our investigation, however, is the distinction between (3a) and (3b):
(3) a. Jack wrote Jill a letter. #Mary was ____ too.
b. Jack wrote Jill a letter. Mary has ____ too.
In (3a), the elided verb is in the past progressive (was writing), while in (3b), it is in the present perfect (has written). R. Hawkins’ native and L2 populations both reject (3a), but only the L2ers reject (3b), which is an acceptable English sentence for his native speakers. R. Hawkins ascribes the difference in results between his native speakers and L2ers to the fact that the elided constituents in (3a) and (3b) differ with respect to interpretability:
[T]he interpretable meaning of the English perfect (have V-en) is carried by the auxiliary have. The -en inflection is solely an agreement marker with an uninterpretable [perfective] feature, valued by an interpretable feature of have. Since -en has no semantic meaning, it does not violate ‘recoverability’ when deleted under VPE, and so can be freely deleted whether there is a counterpart in the antecedent or not. By contrast, the inflection -ing has an interpretable [progressive] feature. Its meaning is not recoverable from the antecedent […], and the deletion is infelicitous. (p. 410)
He thus suggests that the failure of his L2 population to accept sentences like (3b) as grammatical may be an indication of constraints on native-like attainment in late SLA for speakers of languages that do not have an uninterpretable perfective feature, in line with the Interpretability Hypothesis.
IV Aims of the present study
The present investigation addresses the question of whether highly advanced L2 learners remain persistently non-native in particular areas of their L2 knowledge and production. As was noted above, investigations that aim to probe the issue of maturational constraints should conform to the following criteria:
- they should investigate highly advanced, highly motivated speakers who have reason and resources to engage with their L2 at the highest level (e.g. language professionals);
- they should establish a point of reference that is not, per se, unattainable for bilinguals; in other words, the baseline for comparison should consist of native speakers who have some level of bilingualism in the L2 learners’ L1 and are regularly exposed to similarly cross-linguistically impacted speech as the target population.
We have chosen learners for whom the impact of the external factors that have been invoked in constraining ultimate attainment (Bialystok, 2001), namely time available to devote to L2 development, motivation and aptitude, can be assumed to be as favourable as possible: the L2 population under investigation here consists of Dutch natives who either study or teach English at a Dutch university, and who are identified by their teachers or colleagues as native-like. All speakers began learning the L2 in school around age 11. The control population consists of English native speakers who also either study or teach English at a Dutch university. Specifically, we are investigating grammatical and phonetic features that are deemed particularly difficult to acquire for Dutch natives in L2 English, as well as holistic proficiency and globally assessed foreign accent.
One drawback of this approach is that the late L2 learners (experimental population) chosen for the present study are not currently immersed in an English language environment. This was an inevitable consequence of our choice to use university-level English language professionals: it is highly unusual for non-native speakers (in particular late learners) to work at this level at an English department in the English-speaking world. Opting for immersed bilinguals (Dutch migrants in an English environment), on the other hand, would have yielded a pool of L2 learners with more diverse learning paths, levels of motivation, available time for learning/maintaining the L2 and so forth. Levels of exposure to native and L2 influenced speech and writing as well as awareness of the phenomena under investigation were therefore assessed in the two populations.
Ultimate attainment in L2 phonetics is assessed on the basis of VOTs in voiced and voiceless plosives in both free and elicited contexts, as well as on the differentiation of two English vowels which are not distinguished in the L1 Dutch of the target population. As noted above, differences in how the vowel space is carved up in both of a bilingual’s languages make it possible to predict which vowel categories will be most problematic in SLA. In Standard Southern British English (SSBE), /ε/ has an average F1 of 494 Hz and an average F2 of 1650 Hz, and /æ/ has an average F1 of 690 Hz and an average F2 of 1550 Hz in connected speech (for male speakers). In other words, in SSBE, /ε/ is higher and more fronted than /æ/ (Deterding, 1997; S. Hawkins and Midgley, 2005). For Dutch male speakers, /ε/ is quite close to that of SSBE with an average F1 of 583 Hz and an average F2 of 1725 Hz (Rietveld and Van Heuven, 2001). It is thus unlikely that Dutch L2 learners of English will have trouble pronouncing words belonging to the dress lexical set in a native-like manner. The English trap vowel, on the other hand, is not labelled as a distinct category in the Dutch vowel system and may therefore be perceived as an allophone of /ε/ by Dutch L2 learners, who often pronounce words such as ‘bat’ as [bεt] instead of [bæt].
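The relative positions of these three vowels in the F1/F2 plane can be illustrated with a short calculation. The sketch below uses only the mean male formant values quoted above; measuring proximity as a Euclidean distance in raw Hz is a simplifying assumption made for illustration (perceptual work commonly uses auditory scales such as Bark instead), and the function and dictionary names are hypothetical. It is not the analysis carried out in the present study.

```python
import math

# Mean (F1, F2) values in Hz for male speakers, as quoted in the text.
FORMANTS = {
    "SSBE /ε/": (494.0, 1650.0),
    "SSBE /æ/": (690.0, 1550.0),
    "Dutch /ε/": (583.0, 1725.0),
}


def formant_distance(a, b):
    """Euclidean distance between two (F1, F2) points.

    Assumption: raw Hz is used on both axes, which weights F1 and F2
    differences equally and ignores auditory scaling.
    """
    return math.hypot(a[0] - b[0], a[1] - b[1])


dutch_e = FORMANTS["Dutch /ε/"]
for target in ("SSBE /ε/", "SSBE /æ/"):
    d = formant_distance(dutch_e, FORMANTS[target])
    print(f"Dutch /ε/ to {target}: {d:.0f} Hz")
```

On this crude measure the Dutch /ε/ lies closer to SSBE /ε/ (roughly 116 Hz) than to SSBE /æ/ (roughly 205 Hz), which is in line with the expectation that the dress vowel is less problematic than the trap vowel.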
Where the choice of a grammatical feature that might elude even highly advanced Dutch–English bilinguals was concerned, the present investigation was faced with the problem that the population under investigation can be considered to have a very high level of explicit knowledge of English grammar. English language proficiency is a substantial component of the curriculum at all the institutions of higher education with which our participants are affiliated, and all participants have either taught or attended classes that specifically aim at training Dutch L2ers in native-like pronunciation, speaking and writing, with special focus on avoiding any of the common ‘pitfalls’ that occur in Dutch–English interlanguage. It was therefore deemed unprofitable to use grammatical tests such as the ones often applied in studies of ultimate attainment (e.g. the grammaticality judgment task devised by Johnson and Newport, 1989; and adapted by DeKeyser, 2000 and DeKeyser et al., 2010), since the constructions tested there are ones that all participants would have explicit (and hence declarative) knowledge about due to their teaching and/or studying experience.
Replicating R. Hawkins’ (2012) investigation of Verb Phrase Ellipsis among L2 speakers of English appeared a suitable way of identifying a possible persistent gap in the L2 English grammar of our target population for three reasons. First, VPE and the contexts in which it is or is not licensed do not form part of the canon of English L2 grammar teaching. Second, it differs contrastively between our bilingual populations’ languages, being legitimate in English but not in Dutch. Third, where VPE structures of the type Simple Past – Present Perfect are concerned (‘Jack wrote Jill a letter. Mary has ____ too.’), Dutch and English form a particularly interesting contrast: Dutch does have a present perfect tense that is morphologically nearly identical to its English equivalent, being formed by means of a tense auxiliary and past participle (see 4):
(4) Tom heeft een kaart aan Jill gestuurd.
Tom has a card to Jill sent.
‘Tom has sent a card to Jill’
Historically, the Dutch present perfect tense did mark perfective/completive aspect (similar to present-day English), but this function was lost before the codification of Standard Dutch (Zwart, 2011: 12). In present-day Dutch, the present perfect indicates anteriority, while viewpoint aspect is not grammaticalized (Zwart, 2007, 2008). Dutch grammar therefore does not incorporate an uninterpretable perfective feature, and it is to the absence of such a feature that R. Hawkins ascribes the failure of his L2 populations to recognize the grammaticality of such sentences. In replicating R. Hawkins’ experiment we were therefore particularly interested in differences between our L2 learners and native English speakers in their acceptance rates of VPE sentences with Simple Past – Present Perfect (his Type 4), predicting that the natives would have a higher acceptance rate of this type.
1 Research questions
1. Can highly advanced and highly motivated learners of L2 English with Dutch as their native language (target population) achieve similar scores on a global proficiency task as native English speakers of a similar educational background who live in a Dutch environment (reference population)?
2. Is the target population able to produce initial plosives (voiced and voiceless) with VOT values that fall within the range of those produced by the reference population? Are there differences with respect to this comparison in free vs. elicited speech?
3. Is the target population able to differentiate the dress and trap vowels in speech production in the same manner as the reference population?
4. Is the target population able to judge the acceptability of English sentences containing Verb Phrase Ellipsis (VPE) in a similar manner to the reference population? In particular, is the target population sensitive to the fact that VPE sentences of the type Simple Past – Present Perfect are acceptable, while sentences of the type Simple Past – Past Progressive are not?
5. For those features listed under 1–4 above where target and reference population differ at the group level, are there any individuals who perform within the bilingual native range across the board?
2 Participants
a Dutch L2 learners of English
The target population investigated in this study consisted of 20 very advanced L2 learners of English with Dutch as their L1 (11 males), all of whom were either teaching (n = 10) or studying (n = 10) English at a Dutch institution of higher education. The ages of the participants ranged from 20 to 64 (M = 36.3, SD = 16.0). Students were selected on the basis of their perceived near-nativeness as (informally) assessed by their teachers (both L1 and L2 speakers of English), and teachers were selected on the basis of the same criteria as assessed by their colleagues and the researchers themselves. All participants had begun to learn English in school around the age of 11 (M = 10.6, SD = 1.73) and none of them had been exposed to English before (e.g. at home). The population was thus homogeneous with respect to factors such as AoA, educational level and a high motivation for L2 learning.
b Control group
To establish a baseline, 9 native speakers of English (5 males) were selected to form a control group which, in similar fashion to the Dutch L2 learners of English group, was composed of students (n = 4, 1 male) and staff members (n = 5, 4 males) affiliated with an English department at the same institutions as the target population. The age range of the control speakers was 19–54 (M = 34.7, SD = 14.8). All control group participants spoke a British variety of English relatively close to Standard British English as their L1. All participants belonging to the control group were living in the Netherlands for their studies or occupation.
3 Materials
a Personal questionnaires
Two different sets of personal questionnaires were employed for this study. The one intended for Dutch L2 learners of English asked participants to supply their date of birth, gender and (additional) childhood language(s), to state at what age and in what setting they started learning English, and to indicate whether they had spent any time abroad (holidays excluded). They were also provided with the option of noting any further remarks or explanations. The control group questionnaire elicited the same information, except for the age and location of their acquisition of English (and hence also provided information on their length of residence in the Netherlands). Both populations furthermore completed a set of questions on language use and language awareness.
b C-Test
Global proficiency levels were assessed by means of a C-Test (a variant of the cloze test; Grotjahn, 2010). This task is commonly accepted to be a good indicator of proficiency at higher levels of attainment, since it requires the participant to make use of the general redundancy of texts and to use a variety of levels of linguistic knowledge (stylistic, lexical, grammatical, general cohesion, etc.). It involves completing short texts from which parts of words have been deleted according to a pre-determined scheme. To establish the context, the first sentence is left intact, and from the second sentence onward every second word’s second half is removed and replaced by a gap, as illustrated in the following example: The decision to remove soft drinks from elementary and junior high school vending machines is a step in the right direction to helping children make better choices when it comes to what they eat and drink. Childhood obe___ (target:
The C-Test used in this study consists of 5 short texts of around 70 words in length, each of which contains 20 gaps which the participants were required to complete. All correct responses were awarded one point, amounting to a maximum possible score of 100 points.
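The deletion scheme described above can be illustrated with a minimal sketch. It is not the procedure used to build the test materials for this study: the function name `make_c_test`, the naive sentence splitter, the choice to keep the first half of a word rounded down, and the decision to keep counting words across sentence boundaries are all assumptions made for illustration.

```python
import re


def make_c_test(text: str, gap: str = "___") -> str:
    """Leave the first sentence intact; from the second sentence onward,
    cut every second word down to its first half and append a gap marker."""
    # Assumption: sentences end in '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    if len(sentences) < 2:
        return text.strip()

    out = [sentences[0]]  # first sentence establishes the context
    word_index = 0        # counts words from the second sentence onward
    for sentence in sentences[1:]:
        tokens = []
        for token in sentence.split():
            word_index += 1
            # Separate the leading letters from trailing punctuation.
            match = re.match(r"^([A-Za-z]+)(.*)$", token)
            if word_index % 2 == 0 and match and len(match.group(1)) > 1:
                word, trailing = match.groups()
                keep = len(word) // 2  # assumption: keep the first half, rounded down
                token = word[:keep] + gap + trailing
            tokens.append(token)
        out.append(" ".join(tokens))
    return " ".join(out)


example = make_c_test("The first sentence stays. Childhood obesity is rising.")
```

With this rounding rule, `obesity` is reduced to `obe___`, which matches the truncation shown in the example above. One-letter words are skipped because they have no second half to remove.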
c Word list
To collect elicited tokens for VOT analysis, participants were recorded as they read out a list of stimuli devised by Mayr et al. (2012), which consisted of 24 monosyllabic English words beginning with a plosive immediately followed by a vowel. The target words, listed in Table 2 below, were preceded by the carrier phrase ‘I say …’. The 8 tokens in this list containing /ε/ (n = 2) or /æ/ (n = 6) were also used for elicited vowel quality analysis, allowing a comparison between free and elicited vowels.
List of 24 target words (adapted from Mayr et al., 2012).
For the reading task, the order of the words was randomized.
d Film retelling task
In order to obtain free speech data, participants were shown an excerpt lasting about 10 minutes of the Charlie Chaplin movie Modern Times (1936) (Perdue, 1993; Schmid, 2011), and were subsequently asked to retell what they had seen while being recorded.
Three types of measures were collected from these data. First, all words containing stressed instances of /ε/ and /æ/ were extracted from the recordings for vowel quality analysis. On average, there were 17.24 (range: 7–20, total: 500) tokens of /ε/ and 19.73 (range: 11–33; total: 572) tokens of /æ/ per speaker. Second, 12 stressed words beginning with a voiceless plosive followed by a vowel – 4 for /p/, 4 for /t/, and 4 for /k/ – were extracted for VOT analysis of free speech. Acoustic measurements were conducted using the program PRAAT (version 5.3.16) (Boersma and Weenink, 2012). For the vowel formants, F1 and F2 were measured at the point of the vowel’s highest intensity, as this is commonly assumed to lead to the most stable measurements (D. Gilbers, personal communication). VOT was measured from the plosive’s burst until the onset of the immediately following vowel for aspirated plosives, and from the start of vocal fold vibration up until the plosive’s burst for prevoiced plosives. Lastly, a short segment was extracted from each file for a Foreign Accent Rating experiment.
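The two VOT measurement cases described above can be expressed as a single signed difference. The sketch below assumes that burst and voicing-onset times have already been annotated by hand (in seconds); the function name `vot_ms` and the example time stamps are hypothetical, and the sign convention (negative values for prevoicing) is the customary one rather than something stated in the text.

```python
def vot_ms(burst_s: float, voicing_onset_s: float) -> float:
    """Voice onset time in milliseconds.

    Positive: voicing starts after the burst (aspirated plosive).
    Negative: voicing starts before the burst (prevoiced plosive).
    """
    return (voicing_onset_s - burst_s) * 1000.0


# Hypothetical annotations (in seconds) for two tokens.
aspirated = vot_ms(burst_s=0.100, voicing_onset_s=0.165)  # voicing lags the burst
prevoiced = vot_ms(burst_s=0.200, voicing_onset_s=0.120)  # voicing leads the burst
```

For a prevoiced token, the absolute value corresponds to the interval from the start of vocal fold vibration to the burst.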
e Foreign accent rating
The Foreign Accent Rating (FAR) experiment was based on a 15–20 second segment of uninterrupted speech extracted from each of the film retellings. Each segment was played twice, and each instance was followed by 7 seconds of silence for the listeners to make and confirm their rating. Ratings were given in answer to the question ‘Does this speaker have a foreign accent’ on a 6-point Likert scale, where the left end of the scale (coded as 1) was labelled ‘no foreign accent’ and the right end was labelled ‘strong foreign accent’. The 29 samples from the speakers investigated in this study were preceded by a short training session of four different samples (two from native Dutch L2 English speakers and two from English native speakers).
Sixteen native speakers of British English (3rd year students of linguistics at Queen Margaret University London), aged 20–24 years, participated in the experiment, which took about 30 minutes in total.
f Verb phrase ellipsis acceptability judgment task (VPE AJT)
We replicated the procedure described by R. Hawkins (2012), using a 60-item acceptability judgment task and a 3-point Likert response scale where 1 = impossible, 2 = possible (not entirely natural, but might be used by native speakers) and 3 = perfect. The task contained 9 control sentences with no VPE (Type 1), 16 sentences that constituted a replication of an earlier study into different aspects of VPE (Arregui et al., 2006), 11 filler items and 24 items with VPE. These fell into the following categories (each with three items):
Type 2: Simple Past – Simple Past (felicitous); e.g. ‘It was Jill’s birthday. John sent her a card by email. Tom thought that Mary did too.’
Type 3: Simple Past – stranded main verb (ungrammatical); e.g. ‘Jill and Mary were applying for the same job. Jill sent an application by email. John thought that Mary sent too.’
Type 4: Simple Past – Present Perfect (felicitous, expected to be rejected by L2ers); e.g. ‘Tom posted a Christmas card to Jill. Sue believes that Jack has too.’
Type 5: Present Perfect – Simple Past (felicitous); e.g. ‘Jill has posted a parcel to Mary. John said that Sue did as well.’
Type 6: Simple Past – Past Progressive (infelicitous); e.g. ‘In the laboratory, Tom mixed sodium and iron. John said that Jill was too.’
Type 7: Past Progressive – Simple Past (felicitous); e.g. ‘Following the incident in the town centre, a policeman was questioning witnesses. Jill said that a journalist did too.’
Type 8: Simple Past – Future (felicitous); e.g. ‘After graduating, Sue returned her books to the library. John hopes that Tom will too.’
Type 9: Copula – Future/Modal (infelicitous); e.g. ‘Now he has retired, John is happy. Tom thinks that his wife, Sue, will too.’
The test was administered on paper. Each set of 5 sentences was followed by a page break and a distractor task, where participants were asked to identify among a list of eight words those that had not appeared in the preceding sentences. This task was not scored.
V Group results
1 Language use and language awareness
In order to assess to what extent the participants in the present study were indeed exposed to similar levels of English produced by natives vs. L2 speakers, we asked the participants to estimate the proportion of time they spent reading and speaking both Dutch and English, and the proportion of Dutch vs. English interlocutors they had. Both populations report a similar (very high) proportion of reading in English: the English natives read this language 87% of the time for recreational purposes and 78% of the time in a work-related setting, whereas for the Dutch natives, the respective proportions are 74% and 88%. The English natives report a higher proportion of interactions that take place in English both at work (82% vs. 50%, t(28) = 2.851, p < .01) and in private (81% vs. 19%, t(28) = 6.886, p < .001); there is no significant difference between the two populations in the proportion of native Dutch speakers with whom they speak English in either setting (work-related: t(28) = −.503, p = .617; private: t(28) = −1.542, p = .137). While the English natives thus apparently receive a higher amount of English input, in particular in their private spheres, the proportion of English spoken is still quite substantial for the Dutch natives and, more importantly, the proportion of L2 input is similar for both populations.
Awareness of Dutch influences on English in the areas of grammar and phonetics was, furthermore, higher among the Dutch than among the English population. On a scale from 1 (not very aware) to 4 (highly aware), the Dutch rated their awareness of a Dutch accent in others at 3.80 and in themselves at 3.50, while the English scored at 3.33 and 3.11. The awareness of grammatical influences of Dutch on English was at exactly the same levels among the Dutch natives and at 3.44/3.11 for the English natives. Interestingly, while the awareness of both phonetic and grammatical impact of Dutch on English in others did not differ significantly between the groups, there was a significant difference (p < .05) with respect to self-awareness in both areas.
When asked specifically about the phonetic contrasts explored here, the picture was similar. For VOT, the Dutch rated their awareness in others at 3.60 and in themselves at 3.15, while the English scored lower at 3.11 and 3.0, respectively. For the vowel contrast, the Dutch rating was 3.65 (others) and 3.25 (self), while the English reported 3.22 (others) and 3.00 (self). All these contrasts were significant at the p < .05 level.
2 C-Test
The group results on the C-Test are shown in Table 3. The difference between the populations was not significant (t(27) = −0.552, p = 0.59). No L2er fell outside the lower range delimited by the natives (and the highest overall score was achieved by an L2er).
Descriptive results: C-Test.
Notes. * Maximum possible score for range is 100.
3 VOT
The group results on VOT are shown in Table 4. For all plosives there were instances of production among the L2ers that fell outside the lower ranges delimited by the natives, while the upper ranges are similar. A multivariate analysis of variance (MANOVA) found no significant differences between populations (Roy’s Largest Root F(3, 26) = 0.433, p = 0.9), indicating that the L2ers did not produce shorter VOTs in either the elicited or the free conditions.
Descriptive results: VOT (in ms).
For both groups, the difference between free and elicited voiceless VOT was highly significant for all phonemes, with the elicited values being longer (see Table 9 in Appendix 1). A MANOVA was conducted for the difference measures across the three phonemes, in order to determine whether the differences between free and elicited speech might have been more pronounced for one of the populations. This proved not to be the case (Roy’s Largest Root F(3, 26) = 0.246, p = 0.863).
4 Vowel formants
Recall that, in Standard Southern British English (SSBE), average formant values for dress are almost identical to those of its counterpart in Standard Dutch, but that Dutch lacks an equivalent of the English trap vowel, which has a higher F1 (lower tongue position) and a lower F2 (tongue position further back) than English dress. It was predicted that the native Dutch L2 learners of English would exhibit more variability in how they pronounced the trap vowel, either by adapting its pronunciation to that of dress or by hypercorrecting and producing even higher F1 and lower F2 values than in the Standard British target.
For this analysis, participants were split into male and female populations, and MANOVAs were conducted per gender to determine whether differences in vowel production existed between the natives and the L2ers. To this end, for each vowel, the three productions with the highest and the three with the lowest values for F1 and F2 were selected per speaker, resulting in 12 different tokens for each of the two vowels. The descriptive statistics for these measures can be found in Appendix 1 (Table 10).
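The selection of extreme productions described above can be sketched as follows. This is an illustration only: the helper names `extreme_tokens` and `formant_extremes` and all formant values are invented, and the sketch picks the F1 and F2 extremes independently, so a single production may contribute to more than one set.

```python
def extreme_tokens(values, k=3):
    """Return the k lowest and the k highest values as two sorted lists."""
    if len(values) < 2 * k:
        raise ValueError("need at least 2*k tokens to pick disjoint extremes")
    ordered = sorted(values)
    return ordered[:k], ordered[-k:]


def formant_extremes(tokens, k=3):
    """tokens: list of (F1, F2) pairs in Hz for one speaker and one vowel."""
    f1_low, f1_high = extreme_tokens([f1 for f1, _ in tokens], k)
    f2_low, f2_high = extreme_tokens([f2 for _, f2 in tokens], k)
    return {"F1_low": f1_low, "F1_high": f1_high,
            "F2_low": f2_low, "F2_high": f2_high}


# Hypothetical trap productions of one speaker (values invented).
tokens = [(700, 1500), (760, 1620), (640, 1450), (810, 1700),
          (690, 1580), (730, 1490), (780, 1660)]
measures = formant_extremes(tokens)
n_measures = sum(len(v) for v in measures.values())  # 4 sets of 3 values
```

Four sets of three values give the 12 measures per vowel and speaker mentioned above.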
For both males and females the group difference was significant, with similarly strong effect sizes (Men: Roy’s Largest Root F(8, 47) = 6.821, p < .001, η2 = .583; Women: Roy’s Largest Root F(8, 41) = 7.662, p < .001, η2 = .650). While the measures related to the dress vowel did not differ significantly among populations for either gender, the trap measures were revealed to be consistently different between natives and L2ers (with the exception of the high F2 values for the males and the low F2 values for the females). The results are summarized in Table 5.
Formant comparison between natives and L2ers.
In other words, the L2 speakers’ production of the trap, but not of the dress, vowel shows more variability than that of the native speakers. The sets of vowels on which these calculations were based (i.e. 12 instances of trap and 12 instances of dress for each speaker) were plotted with the help of JPlot Formants v. 1.2.2 (Van Der Lee, 2003). Figures 1 and 2 illustrate that, while dress occupies a similar area for both populations (although in particular the male L2ers have a tendency to either front or back it more than the natives), the instances of trap are pronounced less consistently among the L2ers and thus occupy a circular, as opposed to elliptical (natives), area.

Production of /ε/ by native (◆) and L2 (◊) males.

Production of /ε/ by native (◆) and L2 (◊) females.

Production of /æ/ by native (◆) and L2 (◊) males.

Production of /æ/ by native (◆) and L2 (◊) females.
5 Foreign accent rating
The results per population of the FAR are presented in Table 6. Inter-rater reliability was excellent (Cronbach α = .930). The differences between the two populations were significant (t(19.762) = 5.528, p < .01). There was only one L2 speaker who fell within the range delimited by the natives, but this speaker did achieve a ‘perfect’ rating of 1.00 (i.e. no foreign accent perceived by any of the judges).
Foreign accent rating, descriptive statistics (1 = no foreign accent, 6 = strong foreign accent).
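As a reminder of what the reliability coefficient reported above expresses, the following sketch computes Cronbach’s α for a ratings matrix, treating each rater as an ‘item’ and each speech sample as a case. That layout is an assumption about how the coefficient was obtained, and the ratings below are invented.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """ratings[r][s] is the rating given by rater r to speech sample s.

    alpha = k / (k - 1) * (1 - sum of per-rater variances / variance of totals)
    """
    k = len(ratings)
    sum_rater_variances = sum(variance(rater) for rater in ratings)
    totals = [sum(sample) for sample in zip(*ratings)]  # summed rating per sample
    return k / (k - 1) * (1 - sum_rater_variances / variance(totals))


# Invented ratings: three raters, four samples.
identical = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
divergent = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3], [1, 3, 2, 4]])
```

Raters who agree perfectly yield α = 1. Disagreement between raters lowers the coefficient.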
6 VPE acceptability judgment task
Since the scores on the AJT are based on Likert-scale responses, and since there are only three tokens for most types, we followed the approach adopted by R. Hawkins (2012) in choosing non-parametric statistics (Mann–Whitney U) to compare the responses on the different types of VPE. The full results are presented in Table 7. As the findings in Table 7 show, the L2ers have a general tendency to rate the VPE sentences lower on the acceptability scale than the natives, with the exception of the ungrammatical Type 3 and the infelicitous Type 6 and 9 sentences, to which both groups assign a median rating in the ‘impossible’ range.
Descriptive statistics and group comparisons (Mann–Whitney U) for the results on the acceptability judgment task.
Notes. *f = felicitous, i = infelicitous, u = ungrammatical.
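For readers unfamiliar with the test statistic used in these comparisons, the sketch below computes Mann–Whitney U directly from its pairwise definition, counting ties as half a point. The ratings are invented and the function name `mann_whitney_u` is hypothetical. No p-value is computed here; a statistics package (for instance `scipy.stats.mannwhitneyu`) would normally be used for that.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs (xi, yj) in which xi exceeds yj, with ties
    counting 0.5; U_x + U_y always equals len(x) * len(y).
    """
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Invented ratings on the 3-point scale (1 = impossible, 3 = perfect).
natives = [3, 3, 2, 3]
l2ers = [1, 2, 2]
u_natives, u_l2ers = mann_whitney_u(natives, l2ers)
u = min(u_natives, u_l2ers)  # the smaller value is conventionally reported
```

The pairwise count is quadratic in sample size, which is unproblematic for groups as small as the ones compared here.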
The group differences with respect to median and range reach significance for the core VPE sentences of the type Simple Past – Simple Past (‘John wrote Jill a letter. Mary did, too.’). For the L2ers, these are in the ‘possible’ range, but not all L2ers accept all of them as ‘perfect’, while the natives seem to find them natural. The same goes for Type 8 sentences (Simple Past – Future). Most importantly, a significant difference was found between natives and L2ers for Type 4 sentences (Simple Past – Present Perfect), the structure which was also dispreferred by R. Hawkins’ L2 learners of English. The Dutch L2 learners tended to reject these sentences, while they were in the possible-perfect range for the natives. Of the 20 L2 speakers, 7 gave the sentences of this type a rating that fell outside the range delimited by the natives.
VI Individual results
The previous sections have demonstrated that the population of highly advanced L2 speakers of English tested here diverge from a control population of native speakers who experience a similar level of foreign-accented English input on globally perceived foreign accent, their production of the vowel /æ/ and on their judgment of Type 4 VPE sentences. On the other hand, there were no group differences on a global proficiency task (C-Test), VOT or the production of the vowel /ε/.
Such group results do not, however, preclude the possibility that the L2 population might contain one or more individuals who conform to Long’s (2011) challenge of performing like a native across the board. The present section will therefore investigate, for those features on which group differences were found, how many individuals fell within the native range. It will also be ascertained whether there are any individual speakers who perform within this range for all features tested. Unfortunately, because the acceptability judgment task as designed by R. Hawkins and replicated here contained only three instances of the crucial structure (Type 4 VPE), including this feature in the range analysis conducted in the present section would not be particularly illuminating, and we recommend this structure for future, more detailed and elaborate investigations. The analysis here is thus confined to the pronunciation-related features of FAR, VOT and the trap vowel /æ/.
Table 8 summarizes the number of instances (out of a total of 12 for each variable) where a speaker produced a voiced or voiceless (the latter in both elicited and free speech) word-initial plosive with a VOT that was lower than the lowest one produced by any of the natives, as well as the number of instances (also out of 12) in which the trap vowel was produced with either an F1 or an F2 value that fell outside the native range. In order to make the table more easily readable, cells have been shaded in proportion to the number of instances they contain (the darker the shade, the more instances of productions outside the native range).
Number of instances of voiced (elicited), voiceless (elicited) and voiceless (free) plosives and of F1 or F2 values on the trap vowel produced outside the native range per L2 speaker.
Notes. The shading of the cells visualizes the number of instances (darker = more deviant productions).
This table illustrates that there does not appear to be any kind of implicational hierarchy across these features: some speakers deviate from the native norm only on VOT, others only on the vowel production. Two speakers (NL1–05 and NL2–04) remain within native parameters across the board. However, in the foreign accent rating experiment, these are not perceived to be among the most native-like: NL2–04 scores only slightly outside the native range (1.0–1.2) here with a FAR of 1.33 (the fourth best score attained by any L2er), while NL1–05 is perceived to have a quite noticeable foreign accent with a FAR of 3.27 (rank 15 out of 20). The only L2er to be perceived within the native range (NL1–07), with a perfect FAR of 1.0, on the other hand, has 2 instances of VOT and 3 of the trap vowel that fall outside the range delimited by the natives here.
Based on the pronunciation variables investigated here, we can therefore conclude that not a single one of our very advanced L2 learners (recall that they were selected on the basis of being perceived as native-like by their colleagues or teachers) fulfils Long’s criterion of ‘performing like a native across the board’.
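The per-speaker counts on which this range analysis rests could be derived along the following lines. The sketch covers the two criteria described above: for VOT, values below the native minimum; for the trap formants, values outside the native range in either direction. All function names and numbers are hypothetical.

```python
def count_below_native_min(speaker_values, native_values):
    """Number of productions lower than the lowest native production."""
    floor = min(native_values)
    return sum(1 for v in speaker_values if v < floor)


def count_outside_native_range(speaker_values, native_values):
    """Number of productions below the native minimum or above the native maximum."""
    lo, hi = min(native_values), max(native_values)
    return sum(1 for v in speaker_values if v < lo or v > hi)


# Invented VOT values (ms) pooled over natives, and for one L2 speaker.
native_vot = [52, 60, 71, 85]
speaker_vot = [48, 55, 90, 50]

n_short = count_below_native_min(speaker_vot, native_vot)        # 48 and 50
n_outside = count_outside_native_range(speaker_vot, native_vot)  # 48, 90 and 50
```

A speaker counts as remaining within native parameters on a variable when the relevant count is zero.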
VII Discussion
The present investigation has attempted to probe phonetic and grammatical ultimate attainment among advanced L2 learners of English with Dutch as their L1. Participants were speakers who engage with their L2 on a daily basis as either teachers or students of this language at a Dutch institution of higher education. We hypothesized that the outcome of this extensive exposure might be a kind of double-edged sword: on the one hand, for these speakers a near-native level of proficiency is not only a professional requirement but also a matter of pride. On the other, individuals in this type of situation receive a very large amount of input that is strongly L2-influenced from (fellow) students at lower proficiency levels. It was assumed that it might be more difficult to achieve or sustain native-like pronunciation and command of grammar under such circumstances. We therefore chose a reference population of native speakers of English in the same situation.
First, it was determined whether the two populations would differ on measures of general proficiency as measured by means of a C-Test. No group differences were found on this task, and all L2ers performed within the native range despite the fact that the L2 population was larger, showing that overall proficiency was matched well across groups. It should be noted here that the populations under investigation in this study are matched for level of education (an important predictor for the C-Test; see Schmid, 2011). In addition, our L2ers not only match native speakers, but native speakers who are language professionals. This indicates that it is indeed possible for late L2 learners to reach the higher end of native performance on general measures of overall proficiency. The question then is whether, despite this generally high performance, there might still be areas of pronunciation or grammar which our L2ers fail to fully attain. In order to test this, we investigated both articulatory phonetics and a grammatical feature that is not represented in the L1 grammar.
The first phonetic feature, VOT, was investigated by means of elicited data (word list reading) as well as in free speech (for voiceless plosives). Neither for the free nor for the elicited data did we find any differences between the L2 and the native groups, showing that irrespective of task demands, our L2 learners have VOTs that are similar to those of native speakers of English immersed in a Dutch environment at group level. On the individual level, five of our 20 L2ers (25%) did not produce any plosives with VOTs shorter than the shortest one produced by the natives in the same category.
The second phonetic feature we investigated was vowel categorization. We focused on two vowels, one of which has a very close equivalent in Dutch (dress) while the other does not correspond to any Dutch vowel (trap). Based on the work by Flege (1992, 1995), we hypothesized that the L2 speakers might show more overlap between dress and trap. This assumption was confirmed, as we found that our L2 speakers occasionally produce trap with a tongue position that is lower and more to the front than that of native speakers. The L2ers thus differentiated [ε] and [æ] more clearly on the F1 value (the vertical position of the tongue) than the natives, but merged the two vowels to some extent where F2 (the horizontal position of the tongue) was concerned. These findings indicate that late bilinguals, even at very advanced levels of proficiency, may have difficulties with vowel categories that are not instantiated by their L1. Again, five of the 20 total L2ers (25%) remained within native parameters on all of the instances of trap that we investigated.
FARs, on the other hand, quite clearly discriminated between natives and L2ers, with only one L2er falling within the range of the natives. This suggests that, although the L2ers have come to be very close to the target norm even on phonetic features that are difficult to produce for Dutch native speakers, they still retain a foreign accent that is detectable to English natives.
The last feature we tested was the acceptability of sentences containing verb phrase ellipsis, a feature that is allowed in English but disallowed in Dutch. Our L2 speakers rejected three types of these structures that were considered acceptable by the natives. For the core VPE sentences (Type 2, Simple Past – Simple Past) this difference is probably due to stylistic preferences or lexical choices in the sentence, not to objections to the grammatical well-formedness of these items. There were three instances of L2ers giving a rating of ‘impossible’ to Type 2 sentences. We contacted these participants after completion of the data collection and asked what their disapproval had been based on. In all three cases, the participants declared themselves baffled as to their judgments and declared that they thought the sentence was fine. We considered modifying the scores but decided on preserving the original responses.
A similar situation probably obtains with respect to Type 8 (Simple Past – Future) sentences, where there is also a significant difference between L2ers and natives. For example, seven out of 20 L2ers (but only one native) rated the following sentence as ‘impossible’: (4) The exam paper was really hard. Jill handed in a poor answer. Tom thinks that Sue will too.
When asked about the basis for their judgment, several participants responded that the first sentence indicated that the exam was already over, and so it was not possible that Sue would hand in her answer at some future time. In other words, the sentence was rejected because its content was perceived as implausible, not because of its grammar.
Where Type 4 sentences were concerned, however, those participants whom we were later able to ask about the motivation for their choice unanimously responded that the crucial sentence was ungrammatical due to the fact that it contained a different tense. This confirms the assumption that these L2 learners are unable to recover the Present Perfect based on a Simple Past antecedent. A closer look at the data from the natives for this sentence type furthermore revealed a very interesting picture: speakers who had a longer period of residence in the Netherlands tended to rate these sentences as less well formed than those who had been living there for a shorter period of time (see Figure 3; the correlation is significant at r = −.697, p < .05). This suggests that this particular grammatical feature may not only be very resistant to SLA but also vulnerable to L1 attrition.

Natives’ acceptability judgments of Type 4 sentences by length of residence in the Netherlands.
Since the present study is based on only a very limited number of judgments per participant and sentence type, these findings can only be taken as tentative indications. They are, however, worthy of further study. Further investigations should try to replicate these findings in larger populations, possibly trying to investigate L2 learners who are language professionals but immersed in an L2 environment, in order to reduce the impact of foreign-accented input.
In summary, the findings from the present study indicate that late L2 learners do retain some perceptible indications of non-nativeness, even when compared not to a monolingual reference group but to L1 speakers who experience comparable L2-impacted input. Where phonetic features are concerned, the impact of L1 on L2 is not consistent across speakers; some speakers were within native parameters on all phonemes measured here, but still perceived to have a foreign accent, while one other passed as a native despite a number of deviant pronunciations. This suggests that the speakers may be aware of the target pronunciation, but are unable to realize it correctly in all cases under the demands of speech production; this is an assumption that is consistent with the notion of some kind of maturational constraint leading to a different representation of the L2 system.
VIII Conclusions
Taken together, the results from the present study seem to indicate that even the most dedicated and most successful late L2 learners may encounter some ‘pockets’ of L2 grammar or phonetics that prove difficult to fully master. While they may have explicit knowledge of these features, they may be unable to consistently apply them in online speech production. We have argued that studies that compare L2 learners with (largely) monolingual natives may make an unfair comparison: we are, in fact, not only asking mere mortals to run as fast as Usain Bolt, but we are asking them to do so with lead weights attached to their feet. Since we cannot remove the metaphorical shackles of bilingualism – the inevitability of cross-linguistic transfer – from the feet of our L2 learners, we decided to compare them to a group of Usain Bolts who are similarly fettered. What our findings seem to show is that there are some residual pockets of L2 proficiency where it may be impossible for late learners to ever catch up.
Appendix 1
Descriptive statistics per population and gender of the three highest and the three lowest tokens of F1 and F2 values for
| Vowel | Formant | Upper/lower extremes | Sex | Group | Mean (Hz) | SD (Hz) | Range (Hz) |
|---|---|---|---|---|---|---|---|
| /ε/ (dress) | F1 | High | Males | Native speakers | 628.6 | 70.8 | 513–766 |
| /ε/ (dress) | F1 | High | Males | L2ers | 615.2 | 81.2 | 505–881 |
| /ε/ (dress) | F1 | High | Females | Native speakers | 752.9 | 79.6 | 593–838 |
| /ε/ (dress) | F1 | High | Females | L2ers | 751.0 | 82.2 | 626–972 |
| /ε/ (dress) | F1 | Low | Males | Native speakers | 444.4 | 107.2 | 259–578 |
| /ε/ (dress) | F1 | Low | Males | L2ers | 434.6 | 74.6 | 243–525 |
| /ε/ (dress) | F1 | Low | Females | Native speakers | 570.9 | 88.7 | 358–683 |
| /ε/ (dress) | F1 | Low | Females | L2ers | 520.3 | 89.2 | 315–680 |
| /ε/ (dress) | F2 | High | Males | Native speakers | 1796.7 | 130.2 | 1557–1959 |
| /ε/ (dress) | F2 | High | Males | L2ers | 1873.7 | 209.9 | 1482–2247 |
| /ε/ (dress) | F2 | High | Females | Native speakers | 2057.1 | 164.0 | 1799–2348 |
| /ε/ (dress) | F2 | High | Females | L2ers | 2132.6 | 108.4 | 1916–2380 |
| /ε/ (dress) | F2 | Low | Males | Native speakers | 1365.3 | 81.9 | 1284–1527 |
| /ε/ (dress) | F2 | Low | Males | L2ers | 1421.9 | 138.9 | 1085–1755 |
| /ε/ (dress) | F2 | Low | Females | Native speakers | 1603.2 | 132.2 | 1365–1856 |
| /ε/ (dress) | F2 | Low | Females | L2ers | 1637.8 | 226.3 | 1193–1937 |
| /æ/ (trap) | F1 | High | Males | Native speakers | 713.0 | 69.0 | 623–818 |
| /æ/ (trap) | F1 | High | Males | L2ers | 788.0 | 79.4 | 656–1001 |
| /æ/ (trap) | F1 | High | Females | Native speakers | 859.4 | 54.5 | 744–929 |
| /æ/ (trap) | F1 | High | Females | L2ers | 912.5 | 92.9 | 785–1191 |
| /æ/ (trap) | F1 | Low | Males | Native speakers | 493.2 | 96.2 | 344–637 |
| /æ/ (trap) | F1 | Low | Males | L2ers | 582.1 | 110.7 | 274–743 |
| /æ/ (trap) | F1 | Low | Females | Native speakers | 660.7 | 130.0 | 404–875 |
| /æ/ (trap) | F1 | Low | Females | L2ers | 567.4 | 127.2 | 329–765 |
| /æ/ (trap) | F2 | High | Males | Native speakers | 1613.7 | 167.2 | 1320–1979 |
| /æ/ (trap) | F2 | High | Males | L2ers | 1655.4 | 244.9 | 1290–2681 |
| /æ/ (trap) | F2 | High | Females | Native speakers | 1775.9 | 131.5 | 1593–1973 |
| /æ/ (trap) | F2 | High | Females | L2ers | 1958.9 | 153.7 | 1671–2313 |
| /æ/ (trap) | F2 | Low | Males | Native speakers | 1177.9 | 157.5 | 793–1412 |
| /æ/ (trap) | F2 | Low | Males | L2ers | 1343.9 | 123.7 | 1012–1537 |
| /æ/ (trap) | F2 | Low | Females | Native speakers | 1490.3 | 132.4 | 1325–1744 |
| /æ/ (trap) | F2 | Low | Females | L2ers | 1560.6 | 188.1 | 1207–1830 |
Acknowledgements
We are grateful to Dicky Gilbers and Wander Lowie for their insights into the phonetic analyses and their interpretation, to Jan-Wouter Zwart for sharing with us his insights into the Dutch Present Perfect tense, and to Roger Hawkins for allowing us to replicate his Verb Phrase Ellipsis Acceptability Judgment task. We are also particularly grateful to our colleagues and students for participating in the experiments on which this article is based.
Declaration of conflicting interest
The authors declare that there are no conflicting interests.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
