Abstract
Aims and Objectives/Purpose/Research Questions:
We investigate the extent to which L1 versus adult L2 phonological systems resist influence from an L3. We test the Phonological Permeability Hypothesis (Cabrelli Amaro & Rothman, 2010), which states that adult L2 phonological systems differ from L1 systems with regard to stability.
Design/Methodology/Approach:
To isolate the variable of age of acquisition, we examined the acquisition of L3 Brazilian Portuguese (BP) by two types of sequential bilinguals: L1 English/L2 Spanish, L1 Spanish/L2 English. We tested perception via a forced-choice goodness task and production via a delayed repetition task. First, we assessed acquisition of the phonological property in BP (in this case, word-final vowel reduction), and excluded learners’ data that were not target-like in BP. We then tested the learners’ Spanish to determine the level of BP influence.
Data and analysis:
Perception data were analyzed for accuracy and reaction time. Production data were analyzed acoustically for formant structure, duration, and intensity. We compared L1 English/L2 Spanish data (n=15) with L1 Spanish/L2 English data (n=8), and with Spanish (n=11) and BP controls (n=14).
Findings/Conclusions:
While data from the preference task do not signal instability of perception for early or late acquirers of Spanish, the L2 Spanish production data for vowel height differ from the L1 Spanish and Spanish control data. We take this as preliminary support for our hypothesis.
Originality:
By comparing L1 and L2 vulnerability to L3 influence, this study takes a novel approach to the debate over the constitution of phonological systems acquired in childhood versus in adulthood.
Significance/Implications:
The novel methodology implemented, together with these empirical findings, will afford further development of a research program dedicated to L3 bidirectional influence and the study of what L3 acquisition can tell us about language acquisition more generally.
Keywords
Introduction
Third language (L3) learners are distinct from typical adult L2 acquirers, since the former possess a larger repertoire of linguistic knowledge. With a minimum of three languages in the mix, there is an increase in the number of potential sources for crosslinguistic influence (CLI). The last decade has seen a surge in this area of research, during which a majority of CLI research has focused on the variables that drive progressive transfer from the L1 and/or L2 to the L3 (see e.g. Rothman, Cabrelli Amaro, & de Bot, 2013, for a review). However, very little is known about L3 regressive transfer, that is, CLI in which the L3 affects the L2 and/or L1. Cabrelli Amaro and Rothman (2010) posit that the value of this virtually unexplored area lies in its potential to inform long-standing debates regarding the mental constitution of early-acquired (L1) vs late-acquired (L2) language systems. The aim of this study is to investigate L3 effects on speech perception and production in L1 and L2 systems, and to use these empirical findings to further the development of a research program dedicated to L3 bidirectional influence and the study of what L3 acquisition can tell us about language acquisition more generally.
The debate over differences between an L1 and L2 has been at the forefront of the field of second language acquisition for the better part of a century. Arguably, the center of this debate has been over whether there is a window of time after which a system cannot be acquired such that it is indistinguishable from a system acquired during said window (the Critical Period Hypothesis; Lenneberg, 1967). This debate over age as an explanatory factor in second language acquisition (SLA) remains unresolved particularly for phonology, and it is not definitively clear that differences in production and perception when comparing L1 and L2 learners are maturationally conditioned (see e.g. Granena & Long, 2013). With this in mind, we contend that the investigation of L3 phonological acquisition and its effects on L1 versus L2 systems can shed new light on the (lack of) differences in the constitution of phonological systems acquired before and after puberty (operationalized here as after age 12).
The Phonological Permeability Hypothesis (PPH; Cabrelli Amaro & Rothman, 2010) assumes that L1 and L2 systems are fundamentally different, and that this difference is maturationally conditioned. However, the fundamental difference we propose is not in terms of maturationally constrained access to language universals. In fact, in line with Schwartz and Sprouse (1996), we assume that the L2 initial state is a copy of the learner’s L1, and that the learner has full access to universals. Similarly, the L3 initial state in the case of an L1 English/L2 Spanish or L1 Spanish/L2 English speaker is a copy of the learner’s Spanish system with access to universals (see ‘L3 acquisition at the initial stages’ for an explanation of why we assume that Spanish, and not English, transfers). Rather, the nature of the difference between systems regards stability, whereby even an ostensibly native-like L2 is more vulnerable to L3 influence than an L1 (Cabrelli Amaro & Rothman, 2010, p. 280). If these two types of systems are constructed in the same way, CLI effects are predicted to be equally evident in L1 and L2 systems, and the PPH will not be supported. If they are constructed differently, however, we predict that L3 influence will be more pervasive in an L2 than in an L1, a finding which would support the hypothesis. A logical question that arises from this is whether the nature of any potential influence is representational or superficial. While this question has posed a challenge in the study of L2 phonology (see Cabrelli Amaro & Rothman, 2010), we propose that there is one outcome in which it could be possible to tease apart domain-specific representational changes and epiphenomenal effects (e.g. cognitive control mechanisms, motor-sensory control): If the L1 and L2 Spanish speakers’ Spanish perception remains native-like, but the L2 group’s decision-making speed and/or production does not, it is possible to formulate the hypothesis that L3 influence is attributable to domain-general mechanisms that differentiate early- and late-acquired systems. In such a case, we will consider a reformulation of the PPH.
In this empirical investigation of the PPH, we examine the acquisition of L3 Brazilian Portuguese (BP) by two types of sequential bilinguals: L1 English/L2 Spanish speakers, and L1 Spanish/L2 English speakers. Examining these groups allows us to isolate age of acquisition (AoA) as a distinguishing factor. After testing the learners’ BP production and perception to assess acquisition of the relevant phonological property (in this case, word-final vowel reduction, which is part of the BP phonological system but not the Spanish system), we test learners’ Spanish to determine the level of BP influence on Spanish perception and production.
The remainder of this article is organized as follows. In ‘Background’, we begin with a review of the relevant literature in L3 acquisition and phonological attrition. We then present an overview of word-final unstressed vowels in BP and Spanish, followed by an outline of our methodology. After a presentation of the results, we discuss the implications of our findings for native and non-native phonological acquisition and attrition, and offer directions for future research.
Background
There are two aspects of acquisition central to the tenets of the PPH: The acquisition of an L3 phonological system, and the modification of existing native and non-native systems.
L3 acquisition at the initial stages
As noted in the introduction, much of current L3 research is concerned with sources of transfer to the L3 at the initial stages, and this includes L3 phonology research. The goal in this line of inquiry is to understand what drives the transfer of one system over another when there are two (or more) systems available for transfer. This question has direct implications for the present study: Since our participants are no longer at the initial stages, this research allows us to predict which system (English or Spanish) has transferred to L3 BP. While evidence of a privileged status for the L1 (e.g. Ringbom, 1987) and the L2 (e.g. Hammarberg & Hammarberg, 1993) has been found, there is also evidence that either system (e.g. Wrembel, 2010, 2011) or both systems (e.g. Wrembel, 2010) can transfer under certain conditions. Studies of English/Spanish bilingual groups acquiring L3 BP reveal strong support for transfer of either system. Specifically, English/Spanish bilinguals tend to transfer Spanish at the L3 BP initial stages regardless of whether a) Spanish is the L1 or L2 or b) transfer is facilitative (e.g. Montrul, Dias, & Santos, 2011, for syntax; Cabrelli Amaro & Rothman, 2010 for phonology). These studies substantiate the claim that initial stages transfer is driven by structural similarity and that a single linguistic system transfers to the L3, formalized in the Typological Primacy Model (TPM; e.g. Rothman, 2015). 1 Given the evidence that Spanish transfers to L3 BP, we focus on Spanish and BP perception and production in this study, and do not test English.
Modification of existing linguistic systems
To our knowledge, there are no studies that examine differential effects of an L3 on an L1 versus L2. There is, however, an established body of research on the effects of a late-acquired L2 on an L1, and to a lesser extent, the loss of an L2 in the L1 environment (see Bardovi-Harlig & Stringer, 2010, for a review). Such effects are commonly referred to as attrition, and we adopt Köpke and Schmid’s (2004) definition of attrition as a “non-pathological decrease in a language that had previously been acquired by an individual” (p. 5). This working definition is intentionally vague: Because of the exploratory nature of this research, we do not predict at this point whether potential L3 effects on existing systems are representational or epiphenomenal. L2 influence on the L1 is a well-documented phenomenon and the interaction of existing and developing linguistic systems is dynamic. Herein, we review relevant research that informs the present study.
Attrition in L1 production and perception
For production, there is evidence for L1 modification at the segmental level (e.g. de Leeuw, Mennen, & Scobbie, 2013; Mayr, Price, & Mennen, 2012) and suprasegmental level (e.g. de Leeuw, Mennen, & Scobbie, 2011), as well as in terms of global foreign accent (de Leeuw, 2008; Hopp & Schmid, 2013; Major, 2010; Sancier & Fowler, 1997). Comparably fewer studies have investigated perceptual modification, with most reporting on segmental modification. Although receptive skills may be more resistant to influence than productive skills (e.g. Cohen, 1989), perception has been found to be vulnerable to attrition of phonemic distinctions of consonants (Cancila and Celata, 2010; Ventureyra, Pallier, & Yoo, 2004) and vowels (Flege, Mackay, & Meador, 1999). However, it is possible that perception of suprasegmental phenomena is more resistant to L2 influence (Parlato, Christophe, Hirose, & Dupoux, 2010).
Attrition of perception and production of vowels
While Bullock, Dalola, and Gerfen (2006) show L1 French vowel contrasts to be resistant to L2 English influence on both phonetic and phonological planes, a case study of L1 Dutch attrition in an L2 English setting (Mayr et al., 2012) shows that the speaker’s L1 vowel space had shifted in the direction of the L2 after 26 years. Changes in vowel height (F1) were also found in Chang’s (2013) study of L1 English/L2 Korean learners after six weeks in an L2 immersion setting. As Chang notes, changes related to F1 are more likely to occur due to the sensitivity that the human auditory system has to differences at lower frequencies (Goldstein, 2010).
Timing of attrition
In the current study, learners’ L3 exposure ranges between six months and three years. Considering most studies cited above investigate the L1 of speakers living in the L2 environment for years and even decades, one might question their relevance to our hypothesis. However, as we note above, L2 influence on the L1 has been documented as early as six weeks into L2 acquisition (Chang, 2012). Chang (2013) and Levy, McVeigh, Marful, and Anderson (2007) find that L2 influence on the L1 is stronger at the early stages of L2 acquisition, which contradicts earlier findings that L1 influence increases with L2 proficiency (e.g. Major, 1992).
Predictions
Based on the research reviewed here, we predict that production will be more vulnerable to L3 influence than perception, and that any influence will minimally take the form of a change in vowel height. However, attrition is not a uniform occurrence; in several of the studies cited, there were individual speakers who did not undergo influence (e.g. de Leeuw, Schmid, & Mennen, 2010; Major, 1992). Therefore, we predict inter-speaker and intra-speaker variation. We return to these predictions in our results and discussion. We now turn to a review of vowel reduction. First, we describe the surface output of word-final unstressed vowels in BP and Spanish. We then present tentative predictions as they relate to the PPH.
Vowel reduction
Vowel reduction is a change in the acoustic quality of a vowel that is conditioned by stress. This phenomenon is present in BP and English, but does not occur in any of our participants’ dialects of Spanish.
Vowel reduction in BP
In stressed position, the BP vowel inventory is /i e ε a ɔ o u/, and in unstressed position, the number of contrastive vowels reduces in accordance with the grade of weakening of the syllable. Post-tonic syllables are the weakest and are limited to an output inventory of [ɐ ɪ ʊ] (Barbosa & Albano, 2004) (1). 2 We limit our investigation to the final syllable, because the vowel inventory in this position is most stable across dialects (Oliveira Silva, 2012). Similarly, we exclude [ɐ] from the study due to lack of stability across dialects (Oliveira Silva, 2012).
(1) a. [ɪ] padre [ˈpa.dɾɪ] ‘priest’
b. [ʊ] libro [ˈli.bɾʊ] ‘book’
Stress-governed vowel reduction in BP has been associated with changes in formant frequency, whereby the F1 of [e o] is higher than the F1 of [ɪ ʊ], that is, [e o] occupy a lower portion of the vocalic space. The F2 value of the front vowel [e] is typically higher (more front) than that of reduced [ɪ], whereas the F2 of [o] will be lower (farther back) than reduced [ʊ] (Callou, Moraes, & Leite, 2002). Duration has also been reported to correlate with reduction, whereby reduced vowels are shorter than fully realized vowels (e.g. Massini-Cagliari, 1992). In post-tonic position, [ʊ] is shorter than [ɪ] (Oliveira Silva, 2012), so relative duration values are predicted to be larger for /o/ than /e/. Finally, Massini-Cagliari (1992) finds that higher sonority vowels ([e o]) are higher in intensity than their lower-sonority reduced counterparts [ɪ ʊ].
The Spanish vowel inventory and stress
Unlike BP and English, the Spanish dialects of our participants maintain an inventory of five full vowels [a e i o u] independent of stress (Quilis & Esgueva, 1983). Compare the word-final unstressed outputs in (1) with the Spanish outputs in (2).
(2) a. [e] padre [ˈpa.ð̞ɾe] ‘father/priest’
b. [o] libro [ˈli.β̞ɾo] ‘book’
Spanish speakers are predicted to produce vowels of similar quality independent of stress. Ortega Llebaria and Prieto (2011) compared measurements of correlates of stress in Spanish and in Catalan, a language that exhibits vowel reduction similarly to BP. They confirm that in Spanish, formant frequencies remain stable in stressed and unstressed position, and that intensity is not a correlate of stress in Spanish, although it is a correlate in Catalan driven by formant frequency changes. Duration is a correlate of stress in both languages, although durational differences in Catalan are larger as a result of centralization.
Comparing the BP literature and Ortega Llebaria and Prieto’s findings, we expect to find the following differences between Spanish and BP word-final unstressed /e o/: a) higher and/or centralized formant frequencies in BP compared with Spanish, b) a larger difference in intensity in BP when comparing stressed and unstressed vowels, c) differences in relative duration in both Spanish and BP, with BP exhibiting larger differences. We will confirm these predictions based on the acoustic comparison of the BP and Spanish control data we present in “Results.”
Methodology
Participants
Experimental participants
Experimental data come from two L3 groups. Group 1 (n=15) are English native speakers that acquired L2 Spanish after age 12 and have advanced Spanish proficiency (see below for proficiency assessment details). Group 2 (n=8) are native Spanish speakers raised in South or Central America who acquired L2 English after age 12. All participants were students at US universities at the time of testing. Participants minimally had one semester of BP instruction. The only variable that separates the groups is timing of Spanish acquisition.
A two-part screening process was used in the selection of participants. First, we administered a background questionnaire, and participants with proficiency in additional languages were excluded from the study. While this left us with 48 participants across the two groups, 25 of these learners did not have sufficient proficiency in Spanish and/or BP and were excluded during the second stage of screening. In this stage, participants completed Spanish and BP proficiency assessments. To confirm advanced Spanish proficiency, two assessments were used. The first was a 50-item Spanish proficiency cloze test used extensively in generative L2 research (e.g. Montrul & Slabakova, 2003). We also used foreign accent ratings based on 15-second speech samples (Hopp & Schmid, 2013). Sixteen native Spanish speakers rated each foreign accent on a scale of 1 (very strong accent) to 7 (native speaker). To qualify, participants needed 40/50 on the Spanish proficiency test and 4/7 on the accent rating. If there was a mismatch between the written score and accent rating, the accent rating was used since our research questions concern speech production/perception. This applied to one L2 Spanish speaker (1047), whose written score was 39/50 and accent rating was 4.56. Once minimum Spanish proficiency was confirmed, participants’ BP proficiency was assessed. To do so, we used a 100-point written test (see Rothman & Iverson, 2009) and foreign accent ratings by 11 BP native speakers. Intermediate BP speakers had a written score of 60–79, and an accent rating of less than 4/7. Advanced BP speakers scored at least 80 on the written test and 4/7 on the accent scale. Based on these criteria, 5/8 of the L1 Spanish group and 7/15 of the L1 English group had intermediate proficiency (see Tables 11a and 11b for individual proficiency information, and Cabrelli Amaro, 2013, for further details).
Control groups
Comparing monolingual control data with bi/multilingual data is not the most appropriate measurement of L2 ultimate attainment, as such a comparison “serves to move the yardstick of nativelikeness to a point which may, by definition, be out of reach for most bilinguals” (Hopp & Schmid, 2013, p. 364). To this end, our Spanish control data come from speakers (perception task, n=11; production task, n=9) that were raised in a monolingual Spanish environment and came to the US after the age of 12. 3 Age of onset of English acquisition was ⩾12. All speakers have lived in the US a minimum of seven years, and 10 of 12 have demonstrated sufficient English proficiency to study in US universities at the graduate level (2 of 12 are first-generation immigrants that have resided in the US for 25 and 28 years). While the control participants were native Spanish speakers of a wide range of varieties, including Venezuela (Caracas), Spain (Asturias, Madrid), Colombia (Bogotá), El Salvador (San Miguel), the Dominican Republic (Santiago), and Cuba (Havana), and /e/ and /o/ F2 have been found to vary across dialects (e.g. Chládková, Escudero, & Boersma, 2011), raw F1 and F2 values are within the ranges reported in Chládková et al. Speakers did not have fluency in any additional languages. As with the Spanish control group, BP control participants (n=14) had been raised in a monolingual environment. The onset of L2 acquisition (either Spanish or English) was minimally 12 years of age. All BP controls were self-reported advanced speakers of English and 12 of 14 were advanced speakers of Spanish as measured by the written proficiency assessment used with the experimental participants. Nine BP control participants lived in the US and five lived in Brazil at the time of testing. 
We chose BP speakers with exposure to English and Spanish because, as will be explained further in ‘Acquisition of BP reduced vowels’, inclusion of L3 learners’ data for each variable in the study depended on how their BP compared to that of the BP controls. If we had used monolinguals as a point of comparison, the threshold for inclusion could have been set too high and we might have had to (unjustly) exclude data. Therefore, comparing learners to controls whose BP reflects that of a native BP speaker with English and/or Spanish influence (i.e. the same grouping of languages as the learners) is more realistic than comparing them to monolinguals.
Stimuli
A master set of stimuli was used, allowing for direct comparisons across tasks and languages. To minimize lexical interference and tap phonological/phonetic knowledge, nonce words were used exclusively. All critical tokens were disyllabic and phonotactically legal in both languages, and had a /C(C)V.CV/ input structure (e.g. /na.fe/, /pla.ko/). Tokens were presented in a carrier phrase (Es en referencia a/É em referência a ___ ‘It is in reference to ____ ’), which does not contain /e/ or /o/ in unstressed word-final position.
Natural speech stimuli from a bilingual speaker of Spanish and BP were used in the perception and production tasks in both languages to maintain ecological validity, control for speaker variation across languages, limit the cognitive load associated with exposure to multiple speakers (see e.g. Wong, Nusbaum, & Small, 2004), and produce tokens for the perception task with characteristics of both languages (see ‘Perception and processing’). The speaker was a 22-year-old female born in Salvador, Brazil; Spanish AoA was five years. With the exception of two years in Brazil between the ages of 13 and 15, she lived and studied in Cuba and Venezuela between the ages of 5 and 18. She reported using Spanish and BP daily. Tokens were recorded within the relevant carrier phrase Es en referencia a ___/É em referência a ___ and phrases were produced with penultimate stress.
Experimental testing paradigms
In this study, we report on one perception task and one production task. During the perception task, we also recorded reaction time (RT) in an effort to tap unconscious processing that behavioral methodology arguably cannot access.
Testing procedure
Testing in Spanish and BP took place on different days to control for language mode, and the order of languages tested was counterbalanced across participants. After completing a background questionnaire, each session began with a 10-minute recorded conversation in the language being tested. The speech sample provided an excerpt for the foreign accent rating and helped move the participants into the relevant language mode. The experimental tasks were then administered and task order was counterbalanced across participants. E-Prime 2.0 was used for stimuli presentation in both tasks, and for recording participants’ answers and RT from the perception task.
Perception and processing
A forced-choice preference task (e.g. Guion, 2005) was chosen to test learners’ preference for the BP versus Spanish allophone. Trials consisted of auditory presentation of pairs of experimental (n=50, 20 tested vowel reduction) or filler (n=25) disyllabic words presented in the carrier phrase used throughout the study. Each pair of words was identical, with the exception that one contained a BP-like allophone and one contained a Spanish-like allophone. For example, in the Spanish test, a participant heard Es en referencia a [ˈna.fe]. After an interval of 500 ms, selected to allow for access to the phonological loop, Es en referencia a [ˈna.fɪ] was presented. Since Spanish word-final vowels do not reduce, the first phrase sounds more natural in Spanish. Order of presentation of target and non-target stimuli was counterbalanced across trials. Participants were instructed to select the word that sounded most natural in the language being tested and had 3000 ms to make their selection by pressing “1” or “2” on a keyboard.
Answers were logged by E-Prime as correct (1) if the participant chose the token with the target allophone within 3000 ms, or incorrect (0) if they chose the token with the non-target allophone, or did not make a selection within 3000 ms. Accuracy and RT (measured in milliseconds) were recorded for analysis. RT was recorded to observe any potential differences in decision-making speed among the groups. As is standard, only reaction times from accurate responses were submitted to analysis. To reduce means contamination due to inattention (prolonged RTs), RTs more than two standard deviations (SDs) away from the group mean were excluded from analysis.
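The scoring and trimming rules described above can be sketched as follows. This is a minimal illustration, not the E-Prime procedure itself: the function names are hypothetical, a missing response is represented here as `None`, and the use of the sample SD (rather than the population SD) is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev

RESPONSE_WINDOW_MS = 3000  # response deadline stated in the text


def score_trial(chose_target, rt_ms):
    """Return 1 for a target choice made within the window, otherwise 0.

    rt_ms is None when no selection was made (hypothetical encoding).
    """
    if rt_ms is None or rt_ms > RESPONSE_WINDOW_MS:
        return 0
    return 1 if chose_target else 0


def trim_rts(correct_rts, n_sd=2.0):
    """Drop RTs more than n_sd SDs from the group mean.

    Only RTs from accurate responses should be passed in.
    Assumption: sample SD (statistics.stdev).
    """
    if len(correct_rts) < 2:
        return list(correct_rts)
    group_mean = mean(correct_rts)
    group_sd = stdev(correct_rts)
    return [rt for rt in correct_rts if abs(rt - group_mean) <= n_sd * group_sd]
```

With invented RTs of 500, 520, 480, 510, 490, and 2900 ms, the group mean is 900 ms and the sample SD is roughly 980 ms, so only the 2900 ms response falls outside the two-SD band and is removed.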
Production
A delayed repetition task was used to investigate production. Trials were presented auditorily to avoid orthographic interference and consisted of a token at the end of the carrier phrase Es en referencia a __/É em referência a __. To minimize imitation effects, the participant then heard the distracter question Es en referencia a qué?/É em referência a que? ‘It is in reference to what?’, prompting her to produce the target word in the carrier phrase. The main BP block consisted of 105 randomized trials and the SP block consisted of 120. Of these trials, 20 tested word-final unstressed vowels in each language (10 /e/, 10 /o/). Fifteen of the critical tokens (/e/ (n=7), /o/ (n=8)) were submitted to acoustic analysis. 4
Analysis of production data
Data were analyzed in Praat (Boersma & Weenink, 2013). For each item, the stressed and unstressed vowels were segmented, using the onset and offset of clear F2 structure as the primary cue and presence of periodicity in the waveform as a secondary cue. Files were excluded from analysis in cases of complete lack of formant structure, complete devoicing, or mispronunciation. After segmentation, a script was run on audio files and corresponding text grids to measure formant frequency, relative vowel duration, and relative intensity.
Formant frequencies
Formant measurements were taken from the temporal midpoint between vowel onset and offset. To normalize variation in formant resonance, formant values were converted from Hz to Mel with the Praat hertzToMel function. Since a subset of the vocalic inventory was analyzed, we then followed the vowel normalization methodology in Baker and Trofimovich (2005). We measured height as the difference between the first formant (F1) frequency and fundamental frequency (F0, calculated via cross-correlation), and frontness as the difference between the second formant (F2) frequency and F1 frequency. Presence of creak or devoicing yielded several undefined pitch values; these items were excluded from analysis. We also excluded all F1–F0 data from participants with more than 50% of F1–F0 data points missing. In total, we excluded 39% of the control, 46% of the L1 English, and 52% of the L1 Spanish /e/ data, and 38% of the control, 42% of the L1 English, and 60% of the L1 Spanish /o/ data from the analysis.
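The height and frontness measures can be illustrated with a short sketch. Several points are assumptions rather than details from the analysis script: the conversion uses the formula given in the Praat manual for hertzToMel, 550 ln(1 + f/550); F0 is converted to Mel alongside the formants; an undefined pitch value is represented as `None`; and the frontness value of an item with undefined pitch is retained. All function names are hypothetical.

```python
import math


def hertz_to_mel(hz):
    # Assumed to mirror Praat's documented hertzToMel: 550 * ln(1 + hz / 550)
    return 550.0 * math.log(1.0 + hz / 550.0)


def height_and_frontness(f0_hz, f1_hz, f2_hz):
    """Return (F1 - F0, F2 - F1) in Mel.

    Height is None when F0 is undefined (e.g. creak or devoicing).
    Assumption: F0 is converted to Mel like the formants.
    """
    f1 = hertz_to_mel(f1_hz)
    f2 = hertz_to_mel(f2_hz)
    frontness = f2 - f1
    if f0_hz is None:
        return None, frontness
    return f1 - hertz_to_mel(f0_hz), frontness


def keep_height_data(heights, max_missing=0.5):
    """False when more than half of a participant's F1 - F0 points are missing."""
    missing = sum(h is None for h in heights)
    return missing / len(heights) <= max_missing
```

A participant with two undefined heights out of three tokens would have all F1–F0 data dropped under the 50% rule, whereas one out of two would be retained.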
Duration
To control for speech rate, relative duration was calculated by subtracting the duration (in milliseconds) of the word-final vowel from the duration of the stressed vowel. A larger (positive) difference correlates with a more reduced vowel; a smaller difference (or any negative difference) correlates with a less reduced vowel.
Intensity
Relative intensity was measured as the maximum intensity (in decibels) of the word-final vowel subtracted from the maximum intensity of the penultimate vowel. The lower the relative intensity, the more reduced the vowel.
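Both relative measures are simple differences between the stressed (penultimate) vowel and the word-final vowel. The sketch below restates the two definitions given in the text; the function names and the example values are hypothetical.

```python
def relative_duration(stressed_ms, final_ms):
    """Stressed-vowel duration minus word-final vowel duration (ms)."""
    return stressed_ms - final_ms


def relative_intensity(penult_max_db, final_max_db):
    """Maximum intensity of the penultimate vowel minus that of the final vowel (dB)."""
    return penult_max_db - final_max_db


# Invented token: 120 ms stressed vowel, 80 ms final vowel
print(relative_duration(120, 80))
# Invented token: 72.5 dB penultimate peak, 68.0 dB final peak
print(relative_intensity(72.5, 68.0))
```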
Statistical analysis
Rather than use parametric statistics with our small sample, we follow Plonsky’s (2015) call for the implementation of descriptive statistics when power is low as a result of small sample size (p. 30). We use a combination of effect size and confidence intervals (CI) to determine the magnitude of the difference in perception and production across groups. Effect sizes of mean differences are expressed via Hedges’ g, which corrects for bias yielded by small sample size. In line with Plonsky and Oswald (2014), small, medium, and large between-group effects correspond to absolute values of .40, .70, and 1.00, respectively. We also provide 95% confidence intervals for means. When a group mean does not fall within a comparison group’s CI, the difference between the two mean scores is considered significant (Plonsky, 2015, p. 40).
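As an illustration of these two criteria, the sketch below computes Hedges’ g with the common small-sample correction 1 − 3/(4(n1 + n2) − 9) and checks whether one group’s mean falls inside another group’s 95% CI. The exact correction formula and the t-based CI are assumptions, as the text does not specify either; the critical t value is passed in by the caller to keep the example dependency-free, and all names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def hedges_g(a, b):
    """Standardized mean difference (a minus b) with small-sample correction."""
    n1, n2 = len(a), len(b)
    pooled_sd = sqrt(((n1 - 1) * stdev(a) ** 2 + (n2 - 1) * stdev(b) ** 2)
                     / (n1 + n2 - 2))
    d = (mean(a) - mean(b)) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)  # assumed approximation
    return d * correction


def effect_label(g):
    """Benchmarks from the text: .40 small, .70 medium, 1.00 large."""
    size = abs(g)
    if size >= 1.00:
        return "large"
    if size >= 0.70:
        return "medium"
    if size >= 0.40:
        return "small"
    return "below small"


def mean_ci(xs, t_crit):
    """CI for the mean; t_crit is the two-tailed critical t for df = n - 1."""
    m = mean(xs)
    half_width = t_crit * stdev(xs) / sqrt(len(xs))
    return m - half_width, m + half_width


def mean_outside_ci(a, b, t_crit_b):
    """True when the mean of a falls outside the CI of b."""
    lo, hi = mean_ci(b, t_crit_b)
    return not (lo <= mean(a) <= hi)
```

For two invented groups of five values each, the critical t for df = 4 at the 95% level is 2.776; a call such as `mean_outside_ci(group_a, group_b, 2.776)` then applies the CI criterion, and `effect_label(hedges_g(group_a, group_b))` applies the magnitude benchmarks.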
Results
Spanish and BP acoustic comparison
Before turning to the experimental data, an acoustic comparison of BP and Spanish control data serves to confirm predicted cross-linguistic acoustic differences and establish baselines for comparison with experimental group data. Means, SD, and CIs for each variable in Spanish and BP are presented in Tables 1a–b, as well as the effect size for the SP-BP difference in means. BP-SP comparisons that yield an absolute effect size of .40 or larger and whose CIs do not overlap are included in our analysis of the experimental data.
Table 1a. Spanish and BP acoustic comparison (/e/).
Table 1b. Spanish and BP acoustic comparison (/o/).
Note. g: Hedges’ g. Hedges’ g > .40 or < −.40. Mean does not fall within comparison group’s CI.
A comparison of the Spanish and BP vowel space is illustrated in Figure 1; while BP vowels were predicted to be more centralized, BP /e/ is more peripheral than Spanish /e/. The BP-Spanish comparison of relative intensity did not yield effect sizes >.40, and each group’s mean falls within the other group’s CI. This indicates that the intensity difference between stressed and unstressed vowels in BP does not differ from that in Spanish. The duration comparison returned a small–medium effect size only for /o/, suggesting that the difference in duration between the tonic /a/ and atonic /o/ syllables in BP is larger than the difference in Spanish. The effect sizes of the group differences for F1–F0 and F2–F1 range from medium to very large. Therefore, we include each of these variables in our Spanish analysis, and predict that BP influence on Spanish will take the form of higher and/or more fronted vowels, and a larger difference in duration between stressed /a/ and unstressed /o/.

Figure 1. BP and Spanish control /e/ and /o/ F2–F1 and F1–F0.
Acquisition of BP reduced vowels
Before examining the learners’ Spanish, we constructed separate Spanish data sets for each variable that differed between the two control groups. A learner’s data were included in the Spanish analysis of a variable if the individual mean fell a) within the BP control CI, or b) outside of the CI in the opposite direction of the Spanish control’s CI. We did not separate data sets by proficiency to avoid a further decrease in group size, and because global L3 BP proficiency is not as important in this case as acquisition of the particular L3 variables. However, we will discuss proficiency trends in light of our findings in “Discussion.” The number of participants in each analysis from each group is listed in Table 2, and the BP-like variables for each participant are included in Tables 11a and 11b.
Number of participants included in each analysis (number of intermediate learners in parentheses).
Spanish results
Perception and processing (forced-choice preference task)
Accuracy and RT means, SDs, and CIs are presented in Table 3. Means are collapsed across /e/ and /o/ due to effect sizes <.60 (a small within-group effect according to Plonsky & Oswald, 2014) when comparing vowels within groups.
Mean, SD, CI (Accuracy/RT).
Accuracy
Only the L1 English–Spanish control comparison produces an appreciable effect size (.46) (Table 4) and has CIs that do not overlap, which suggests a stronger preference for fully realized vowels by the L1 English group. Group results are bolstered by individual data; all experimental participants’ accuracy rates were within (or above) the Spanish control CI. Thus, while 9/15 L1 English and 6/8 L1 Spanish participants prefer reduced vowels in the BP task at a rate similar to or higher than that of the BP control, they maintain a native-like preference for mid vowels in Spanish regardless of the status of the Spanish system as early- or late-acquired and regardless of BP proficiency (three L1 Spanish and five L1 English participants are intermediate BP learners). That said, we recognize that the Spanish control group does not set the bar very high. It is possible that the controls and L1 Spanish speakers accept the word-final reduced vowels [ɪ ʊ] as natural in Spanish upwards of 25% of the time because of the reduced segments’ proximity in the vowel space to fully realized [i u]. As an anonymous reviewer points out, this could also be due to their linguistic reality (i.e. living in the L2 environment), which may lead to a higher acceptance of acoustic variation for close sounds. The fact that the controls and learners do not prefer [ɪ ʊ] more often than they do could be because a) [ɪ ʊ] are phonetically reduced and b) [i u] in word-final position in Spanish are infrequent in comparison to word-final [a e o], limited to words of indigenous origin (e.g. miski [ˈmis.ki] “sweet,” from Quechua) or colloquial diminutives (e.g. guapi [ˈgu̯a.pi] “good-looking”) (A. de Prada Pérez, personal communication, January 27, 2013).
Effect size (Accuracy and RT).
Considering the three possible outcomes outlined in “Background,” the accuracy data call into question the outcome in which the L1 and L2 groups differ in both perception and production. They are initially compatible with the other two outcomes: that the groups differ in neither perception nor production, and that the groups do not differ in Spanish perception but do differ in production. The comparable stability of the early and late acquirers in terms of word-final unstressed vowel preference potentially indicates a stable mental representation. If RT data yield a difference between early and late acquirers, differences in accuracy and RT results could be due to epiphenomenal effects.
Reaction time
While there are no effect sizes >.40, the L1 English and Spanish control CIs do not overlap, suggesting that the L1 English group’s processing speed for selection of fully realized unstressed word-final vowels in Spanish is faster than that of the control group. Looking at individual data, only one participant from the L1 English group (1018) registered an RT slower than the upper limit of the Spanish CI.
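The two descriptive checks used throughout these results, CI overlap between groups and individual means relative to a control CI, can be sketched as below. This is an illustrative sketch only: CIs are taken as given (lower, upper) pairs because their computation is not restated here, and the function names and participant labels are hypothetical.

```python
def cis_overlap(ci_a, ci_b):
    """True when two closed intervals (lower, upper) share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


def outside_ci(individual_means, ci):
    """Sort participants whose mean falls below or above a control CI.

    individual_means: dict mapping a participant label to that
    participant's mean (e.g. RT or accuracy) for one variable.
    """
    low, high = ci
    return {
        "below": [pid for pid, m in individual_means.items() if m < low],
        "above": [pid for pid, m in individual_means.items() if m > high],
    }
```

For RT, for instance, the "above" list would hold any participant slower than the upper limit of the Spanish control CI.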
Interim summary
Taking the accuracy and RT data together, there is no evidence of vulnerability to L3 BP influence in terms of preference for fully realized word-final vowels or decision-making speed for either experimental group. This finding aligns with possible outcome 1, and therefore does not support the PPH. Production data from the delayed repetition task will shed more light on whether the lack of instability evidenced during perception extends to production. Should there be significant differences between the L1 Spanish and L2 Spanish speakers, an asymmetry in perception and production will support a weak version of the PPH. However, in the case of evidence of comparable (in)stability between the L1 Spanish and L2 Spanish learners when compared to the control group, the PPH will not be supported by the overall data set.
Production
Herein, we present the results of each of the five production variables that differed between the BP and Spanish control groups: F1–F0 /e/, F1–F0 /o/, F2–F1 /e/, F2–F1 /o/, and relative duration of /o/. Means, SDs, and CIs are presented throughout as well as effect sizes for means comparisons between groups. We do not provide a visual plot of formant values since the F1–F0 and F2–F1 data sets do not include all of the same participants.
F1–F0 /e/ and /o/
The effect sizes for /e/ (Table 6) are negligible and each group’s mean falls within the other groups’ CIs (Table 5). However, when we look at the individual data, three of five L1 SP learners and three of nine L1 EN learners have F1 values above the upper limit of the Spanish control CI. It is likely that this outcome is not reflected in the group mean comparisons because most values are close to the CI’s upper limit. For /o/, there is a small effect size when comparing the L1 English group with both the control group and the L1 Spanish group. The L1 English mean does not fall within the Spanish control’s CI, and sits on the edge of the L1 Spanish group’s CI. The direction of these effect sizes indicates that the L1 English group’s word-final /o/ is higher in the vowel space, or more BP-like. Individually, production of /o/ from three of six L1 English and one of five L1 Spanish learners results in a higher vowel than that of the Spanish control.
Mean, SD, CI (F1–F0).
Effect size (F1–F0).
F2–F1 /e/ and /o/
Comparing the frontness of /e/ between the L1 English group and L1 Spanish group and between the L1 English group and Spanish control group yields small and medium effect sizes, respectively, and no CI overlap (Tables 7 and 8). Interestingly, the direction of this difference is not toward a BP-like value. This difference cannot be explained by English influence, because English influence would yield vowels that are more fronted than the Spanish control’s (i.e. more BP-like), not more centralized (Giacomino, 2012). Unlike the F2–F1 values for /e/, the values for /o/ are uniform across the three groups, as evidenced by effect sizes <.15 and means that fall within the other groups’ CIs. If we look at the individual variation, however, we see that 2/4 L1 SP and 2/10 L1 EN participants have mean F2–F1 values that are above the high end of the control CI.
Mean, SD, CI (F2–F1).
Effect size (F2–F1).
Relative duration /o/
Overlapping CIs and effect sizes <.15 for all three comparisons do not suggest BP influence on either experimental group (Tables 9 and 10). As with the F2–F1 /o/ individual data, however, relative duration is larger than the upper limit of the Spanish control CI for two of four L1 SP learners and two of ten L1 EN learners.
Mean, SD, CI (relative duration /o/).
Effect size (relative duration /o/).
Discussion
Summary
To determine the relative stability of early-acquired and late-acquired Spanish systems, the present study compares the perception and production of word-final unstressed Spanish vowels by sequential L1 Spanish/L2 English and L1 English/L2 Spanish bilinguals who are in the process of acquiring BP as a third language. Responses from an auditory forced-choice preference task were measured for accuracy and RT. Productions from a delayed repetition task were measured for the acoustic cues found to differentiate Spanish and BP word-final unstressed vowels: vowel height/frontness and relative duration of /o/. Results from the perception task do not reveal any BP influence in terms of non-native-like preference for reduced vowels or decision-making speed. However, production data signal a difference between the L1 English group and the L1 Spanish and control groups for vowel height (F1–F0). That is, the L1 English group tends to produce back vowels in Spanish that are BP-like in height, while the L1 Spanish group’s back vowel height remains within the control range. In fact, the L1 Spanish group does not differ from the control for any of the variables analyzed. In what follows, we address the specific predictions of the PPH, consider patterns of intra- and inter-speaker variation, and conclude with limitations and future directions.
The PPH
The PPH argues that phonological systems acquired in adulthood are more vulnerable to influence from a third language than systems acquired during early childhood. Three possible outcomes of the testing of the PPH were proposed. The first was that there would be no differences across groups for either perception or production, which would not support the PPH. Since there was a difference found between groups for production, we can rule out this outcome. The second possibility was that the perception and production data would yield differences between the early and late acquirers, which would support the hypothesis. This outcome was not realized, since no differences were found between groups in the perception task. A third possibility was that there would be a difference between groups for production and/or processing, but not for perception as measured by allophone preference. Our group data best align with this outcome, which does not support the PPH in its original form since there is no evidence of changes to mental representation. Instead, the outcome leads us to hypothesize that the addition of a novel phonological system can affect aspects of speech production to a larger degree in late-acquired systems, at least when the languages under observation are typologically related. While we are not in a position to comment on the nature of these effects, empirical examination of relevant motor-sensory aspects (see Simmonds, Wise, & Leech, 2011) and cognitive control mechanisms (e.g. failure to inhibit or block non-target forms in a speech-output buffer, see Finkbeiner, Gollan, & Caramazza, 2006) will shed light on this question.
Individual variation
Tables 11a and 11b illustrate the inter- and intra-speaker variation present in the data. Participants appear in order of BP foreign accent rating (FAR). For each dependent variable, if the participant’s BP mean is target-like in BP (and thus their data were included in the Spanish analysis), the relevant cell is marked with an “X.” If the mean of that same variable in Spanish is BP-like, the cell is shaded.
Individual data, L1 Spanish group.
X = BP mean is within BP control CI. Shading = Spanish mean is within BP control CI.
Individual data, L1 English group.
X = BP mean is within BP control CI. Shading = Spanish mean is within BP control CI.
While group data show L3 influence only on the L2 Spanish group’s production, individual data reveal L3 BP effects on individuals in both groups (Table 12). In fact, with the exception of F2–F1 /e/, at least one learner in each group produces BP-like values in Spanish for each of the five dependent variables. This finding, in tandem with the group data, aligns with our prediction that L3 BP influence would be evident in both groups, but that the degree of influence would distinguish the groups.
Number of participants whose means fall outside of the Spanish CI for each dependent variable (number of intermediate learners in parentheses).
If we consider that there is a continuum of influence, there are learners in both groups that fall at both ends of the continuum and some that fall somewhere in the middle. While BP-like variables appear in individuals’ Spanish in both groups, there are individuals in both groups that do not demonstrate any evidence of BP-like productions (8/15 L1 English; 1/8 L1 Spanish). In these cases, however, data from three or fewer variables were included in the analysis since BP production of certain variables was not target-like. Most common are learners in both groups (6/8 L1 Spanish; 7/15 L1 English) with one or two dependent variables outside the Spanish CI. Again, with the exception of F1–F0, there is not a clear pattern that shows one variable as more vulnerable than another. At the other end of the continuum, only one learner from each group (1001 and 1018) produced four BP-like variables in Spanish. Both are intermediate BP speakers, and both produced target-like BP formant values and BP-like Spanish formant values. The L1 English speaker had a FAR of 3.6/7, which is close to the advanced proficiency cutoff, and her written proficiency was advanced. However, the L1 Spanish speaker had a BP FAR of 2/7, which indicates spoken BP strongly influenced by Spanish. In spite of her overall BP accent, however, her word-final unstressed vowels are clearly BP-like. These learners’ data led us to look for proficiency effects in the individual data. While effects to the same extent are not found elsewhere, at least one attriter for each variable had intermediate proficiency. In the case of F1–F0 /e/ and F2–F1 /o/, all attriters were intermediate. Thus, although there are not enough data to make any strong claims, these data fall in line with Chang’s (2013) and Major’s (1992) findings of influence at intermediate and advanced levels of proficiency, respectively. 
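The continuum of influence described above amounts to counting, for each participant, how many of the analyzed variables are BP-like in Spanish. The sketch below shows one hypothetical way to tabulate this; the data layout, function names, and participant labels are assumptions for illustration and do not reproduce Tables 11a and 11b.

```python
from collections import Counter


def bp_like_counts(flags):
    """Count BP-like Spanish variables per participant.

    flags: {participant: {variable: True | False | None}}
      True  = Spanish mean for that variable is BP-like
      False = variable analyzed, Spanish mean not BP-like
      None  = variable excluded (BP production was not target-like)
    """
    return {
        pid: sum(1 for value in per_variable.values() if value is True)
        for pid, per_variable in flags.items()
    }


def continuum(flags):
    """Map 'number of BP-like variables' to 'number of participants'."""
    return Counter(bp_like_counts(flags).values())
```

Keeping excluded variables as `None` rather than `False` preserves the distinction drawn in the text between learners who show no influence and learners for whom few variables could be analyzed.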
Following the same learners longitudinally will allow us to observe how L3 influence changes over time and across proficiency and contexts. This brings us to our discussion of future directions, and specifically to the need for longitudinal L3 data.
Future directions
A primary goal of the investigation of cross-linguistic influence is to understand its dynamic nature across the lifespan, and long-term longitudinal testing of the PPH will contribute to the understanding of bidirectional influence over time. There are several advantages to L3 longitudinal methodology. First, without measures taken at the onset of L3 acquisition, the variation that is characteristic of bilinguals (early and late alike) makes it impossible to know what the Spanish phonological system of each BP speaker looked like at the onset of L3 acquisition. To illustrate, consider the L1 English speakers’ duration. Let us say that a speaker produces relative duration values that are target-like in BP and BP-like in Spanish. Although the speaker’s proficiency was assessed, it is still possible that the target-like relative duration in BP is a reflection of English influence, and that the lower relative duration characteristic of Spanish was never acquired in the speaker’s L2. By measuring each learner’s baseline and following their acquisition over time, however, learners act as their own control. This obviates the need to compare static learner performance with a control sample that also exhibits variation. Second, examination that begins at the initial stages will make it possible to confirm our assumption that the participants’ Spanish system was transferred during the L3 initial stages. Third, examination of BP and Spanish production over time and in different contexts will inform our understanding of the permanency of BP influence. For example, while an L1 English/L2 Spanish participant in Cabrelli Amaro (2013) demonstrated BP influence in his Spanish production after 12 weeks of a BP immersion experience, this influence diminished once he had returned from Brazil to the US.
Nine months after resuming daily use of Spanish in professional and personal environments while reducing his use of BP, the quality of his Spanish vowel production had returned to baseline values. In their study of L1 lexical attrition, Ecke and Hall (2013) indicate a rapid recovery of stability despite infrequent L1 use, pointing to processing issues as a source of L1 modification. Future research will help determine what, if any, differences there are between native and non-native systems in terms of recovery, and whether Ecke and Hall’s findings might extend to the phonological domain of grammar. Fourth, we will be able to determine whether influence is more robust earlier versus later in acquisition, and have a better chance of understanding the nature of the influence. While large samples are a challenge for longitudinal investigations because of the small window of time for baseline testing, data from an ideal sample size will substantiate (or not) the tentative conclusions made based on the results of this study and provide further insight into the plausibility of the PPH.
Conclusion
The primary aim of this study has been to examine potential differences between early- and late-acquired phonological systems through the lens of L3 acquisition. Rather than measuring “same” or “different” in terms of whether an element of a phonological system can be acquired in adulthood, we compare the L1 and L2 Spanish phonological systems of sequential English/Spanish bilinguals in their resistance to influence from an L3. Specifically, we test the Phonological Permeability Hypothesis, which states that early-acquired phonological systems are more impervious to influence from novel systems than systems acquired after a so-called critical period. While data from a forced-choice preference task do not signal instability of perception for early or late acquirers of Spanish, L2 Spanish production data for one of the variables measured differ from the L1 Spanish and Spanish control data. We take this as preliminary support for our hypothesis, as well as for the examination of L3 acquisition as a novel source to test the claims and predictions of SLA theories and models.
Acknowledgements
I am grateful to the editors of this issue, Marit Westergaard and Jason Rothman, and to the three anonymous reviewers for their valuable comments on an earlier version of this paper. Thank you to Clara Ramos and ACBEU (Salvador, Bahia), CIEE Salvador and Lara Schmertmann, IBEU and the University of Florida program in Rio de Janeiro, and Felipe Amaro for their assistance with data collection, to Daniel Oliveira Peres for his assistance with data processing, and to Magdalena Wrembel and Mike Iverson for their input on methodological issues.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This material is based upon work supported by the National Science Foundation under Doctoral Dissertation Research Improvement Grant (Division of Behavioral and Cognitive Sciences) No. 1132289. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
