Abstract
Recent findings (see, for example, Muñoz and Singleton, 2011) indicate that age of onset is not a strong determinant of instructed foreign language (FL) learners’ achievement and that age is intricately connected with social and psychological factors shaping the learner’s overall FL experience. The present study, accordingly, takes a participant-active approach by examining and comparing second language (L2) data, motivation questionnaire data, and language experience essays collected from a cohort of 200 Swiss learners of English as a foreign language (EFL) at the beginning and end of secondary school. These were used to analyse (1) whether in the long run early instructed FL learners in Switzerland outperform late instructed FL learners, and if so the extent to which motivation can explain this phenomenon, (2) the development of FL motivation and attitudes as students ascend the educational ladder, (3) the degree to which school-level variables affect age-related differences, and (4) learners’ beliefs about the age factor. We set out to combine large-scale quantitative methods (multilevel analyses) with individual-level qualitative data. While the results reveal clear differences with respect to rate of acquisition in favor of the late starters, whose motivation is more strongly goal- and future-focused at the first measurement, there is no main effect for starting age at the end of mandatory school time. Qualitative analyses of language experience essays offer insights into early and late starters’ L2 learning experience over the course of secondary school, capturing the multi-faceted complexity of the role played by starting age.
I Introduction
In recent decades, it has become increasingly apparent that factors of a social, psychological and contextual nature are prominent in both early and late second language (L2) learners in naturalistic settings as well as classrooms (see Moyer, 2014a). However, so far only minimal attention has been paid to the interaction between person and context in quantitative and qualitative age research. One reason for this is the limited availability of convenient and successful methods for addressing context effects statistically. Another challenge is mentioned by Moyer (2014b), who suggests that since age of onset (AO) has a significant relationship with experience, the nature of that relationship needs to be clarified in future research via the ‘messiness’ of introspective methods (p. 458). In conformity with this view, Pica (2010) points out that the heavy emphasis on age in making decisions about school policy and practice has overlooked the abundant research on psychosocial factors such as learner beliefs and motivation that have been shown to impact on language learning in a school context and which may explain why ‘early L2 schooling is not necessarily better’ (p. 260). Thus, whilst learners’ ultimate levels of achievement and proficiency will remain a focus of age-related research, an important additional perspective needs to focus on the processes and timescales in which learners can be seen to be happy and experience flourishing in language learning, as well as situations in which they struggle with boredom or with challenges that demand more of them than they are capable of delivering.
This article focuses on methodological advancements in the area of AO and motivation in the classroom by combining large-scale quantitative methods that give an account of both participant and item variability with individual-level qualitative data. Multilevel analyses are performed to investigate to what extent late starters’ long-term achievement in instructional settings matches the supposedly advantaged performance of early starters. Also analysed is how motivation factors into this process. In order to capture psychological elements of learning English as a foreign language (EFL) from different ages and on different levels internal to the learner, language experience essays produced by the participants are drawn on. These help identify aspects of early and late EFL instruction that seem salient to particular individuals at the beginning and at the end of secondary school, and thus help constrain the influential factors other than age that play a role in L2 development. Such a holistic approach takes into account the combined and interactive operation of different elements/conditions relevant to specific situations, rather than following the traditional practice of examining the relationship between well-defined variables in relative isolation. This approach can also, we believe, provide a richer picture of the interaction of AO and other (often hidden) variables (see Muñoz, 2014a) than an approach solely focusing on learners’ long-term outcomes as a function of AO.
II An ecological approach to age
Age-in-context 1: Cohort effects on motivation
It has been well documented that the localized practices, experiences and histories of learners in particular classrooms are pivotal in shaping the process of L2 learning motivation and performance. Since individuals are known to accommodate to the normative environment within their class setting, changes in an individual’s motivational state are thought to be the result of sustained exposure to and observation of peers, a process commonly referred to as ‘modeling’ (e.g. Berndt and Keefe, 1995). Learners observe and assess the motivational states of their peers and may gravitate to the group norm. They may well influence each other in their perceptions and in their orientation to the classroom environment. In her theory of group affective tone, George (1995) posits that groups, over time, develop a tendency to display collective mood states. Positive group affect, for instance, can lead to increases in motivation and ‘spreading goodwill’ during interpersonal encounters (George and Brief, 1992: 310). According to Forsyth (2010) group orientations can also change the way group members think about themselves (see also Mercer, 2014; Sedikides and Brewer, 2001). Since most people’s selves are a combination of both personal and collective elements, their answers to the question, ‘Who am I?’ in time will thus change to include more collectivistic elements.
Peers also influence each other with respect to the value they place on the development of L2 proficiency. As they do so, their own motivational state may become dynamic and may eventually lead to greater or lesser gains in L2 proficiency over time. For instance, Kozaki and Ross (2011) examined the clustering effects of streamed classes in a foreign language (FL) program and found class compositional effects to exert both ameliorating and constraining effects on proficiency growth. The class compositional effects – perceived peers’ normative aspiration to professional pursuit and orientation to the social mainstream – mediated the trajectories of individual differences in growth of proficiency. The results of the study suggest that peers can exert a normalizing influence in FL classrooms that can augment or undermine individual learners’ own motivations to learn the FL.
Individuals in a school context are influenced not only by their peers but also by the circumstances of the learning environment. Chaudron (2001) suggests that classroom processes are heavily influenced by the structure of classroom organization. Different patterns of teacher–student interaction, group work, degrees of learners’ control over their learning, and variations in tasks and their sequencing are seen to play a significant role in the quantity and quality of learners’ production of and interaction with the target language. Similarly, intensity of teaching and small groups are found to be conducive to positive attitudes in young learners (e.g. Vilke and Vrhovac, 1995). Given all this, it is not surprising that students within a classroom have been found to be more similar to each other than to students in other classrooms due to whatever school level characteristics are measured (Seltman, 2009: 375).
From a theoretical and research perspective, these arguments place a premium on classroom-focused empirical studies which investigate how learning contexts shape processes of motivation and L2 proficiency in individual classrooms (see also Ushioda, 2013). In the field of individual differences in second language acquisition (SLA), for instance, Dörnyei (2005) has suggested that research should seek to focus on particular constellations where cognition, affect and motivation function together as wholes. Because most L2 learning in EFL settings happens in institutional environments, the age factor also needs to be considered in the light of macrocultural and microcultural phenomena having a bearing on interpersonal relations; these may influence and shape the motivational states of individuals and groups. The importance of contextual factors in age research is recognized in research which has highlighted the significant effect of the ‘macro context’. For instance, it is observed that instructional conditions lead to different age-related results from natural exposure conditions (see, for example, the reviews in Lambelet and Berthele, 2015; Muñoz and Singleton, 2011; Singleton and Ryan, 2004). Numerous classroom studies in Europe and across the world have yielded consistent results showing not only a rate advantage for late starters over early starters but also very few linguistic advantages to beginning the study of an FL earlier in a minimal input situation (see, for example, Al-Thubaiti, 2010 for Saudi Arabia; Muñoz, 2006, 2011 for Catalonia; Larson-Hall, 2008 for Japan; Myles and Mitchell, 2012 for Great Britain; Pfenninger, 2013, 2014a, 2014b for Switzerland; Unsworth et al., 2012 for the Netherlands).
Age-in-context 2: Interaction of age and individual difference variables
The age factor also interacts with social-psychological, personal and affective variables (see, for example, Moyer, 2014a, 2014b). In the realm of L2 learning motivation, Dörnyei currently considers vision to be one of the highest order motivational forces, allowing the consideration of motivation as a long-term, ongoing endeavor (see, for example, Dörnyei, 2014; Dörnyei and Kubanyiova, 2014). A strong future vision of L2 success is a reliable predictor of students’ long-term intended effort and overall perseverance, which are necessary to bring them to high ultimate attainment. Along similar lines Ryan and Irie (2014) emphasize that ‘possible selves … contain an element of experiencing ourselves in that future state’ (p. 113). More importantly, it has often been reported that younger school learners are more motivated and have a more positive attitude towards a foreign language than older school learners, and that this is a definite advantage of an early start (e.g. Blondin et al., 1998; Edelenbos et al., 2007; Hawkins, 1996), although the opposite has also been found (e.g. Dewaele and MacIntyre, 2014). However, Muñoz (2008) cautions against confounding biological age and age of onset when it comes to young learners’ attitude: the finding that younger starters have a more positive attitude towards learning a second language than older starters may be a result of their younger chronological age rather than or in addition to their earlier start. Also important is the fact that social-psychological, personal and affective variables may be under the influence of the more local learning situation. For a statistical model in age research this means that, for example, students who are nested within classes within schools cannot – must not – be treated as independent observations, as the errors of measurement are not independent.
Furthermore, recent thinking on age (see, for example, Singleton and Pfenninger, 2015) suggests that external factors such as classroom effects also need to be addressed, as environmental influences interact with age effects and possibly mediate them. What is more, if inadequate attention is paid to the unit of analysis (students, class groups, teachers, or schools), differences found in the dependent measures may be due to uncontrolled differences among the participating groups rather than the main independent variable (in the present case, age of onset of learning). Filtering out or failing to address external influences would thus be a gross error of omission. However, the use of general linear models such as ANOVA, t-tests, single-level regression, χ2-tests, etc., which require prior aggregation and are run on the averaged data, is still widespread in age-related research in SLA. These models cannot take account of the various unmeasured aspects of the upper level units (e.g. schools or classrooms) that affect all of the lower level measurements (e.g. measurements within participants or students within classrooms) similarly for a given unit. Accordingly, a t-test (or, equivalently, an ANOVA) may well yield a statistically significant result when there is, in fact, no effect (e.g. for starting age). It is thus high time for age researchers to begin to employ models that permit the study of inter- and intra-individual variability across situations and across time with a more careful parsing of variance between persons and between items and thus have built-in ecological validity. Such an approach is multilevel modeling (MLM), which is proposed in this article.
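To illustrate why single-level tests can mislead when learners are clustered in classes, the following is a minimal sketch of the standard design-effect approximation, 1 + (m - 1) × ICC, where m is the average cluster size. The figures of 200 learners and 12 classes are taken from the present study; the intraclass correlation (ICC) values and the function names are hypothetical and serve only as an illustration, not as results.

```python
def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Kish design effect for roughly equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


def effective_sample_size(n: int, n_clusters: int, icc: float) -> float:
    """Number of independent observations that n clustered observations are 'worth'."""
    avg_cluster_size = n / n_clusters
    return n / design_effect(avg_cluster_size, icc)


if __name__ == "__main__":
    # 200 learners in 12 classes (from the study); the ICC values are assumed.
    for icc in (0.0, 0.05, 0.10, 0.20):
        n_eff = effective_sample_size(200, 12, icc)
        print(f"ICC = {icc:.2f} -> effective n = {n_eff:.1f}")
```

Even a modest ICC shrinks the effective sample size well below the nominal 200, which is why a test that treats all learners as independent tends to understate its standard errors.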
III Research questions
The following research questions are addressed in this study:
1. Do early instructed FL learners in Switzerland outperform late instructed FL learners in the long run, and if so to what extent can motivation explain this phenomenon?
2. How do FL motivation and attitudes towards learning of EFL develop as students ascend the educational ladder in secondary school?
3. To what extent do school-level variables affect age-related differences?
4. How do beliefs about the age factor vary among EFL learners with different AOs?
Note that in our study, ‘long term’ refers to attainment at the end of mandatory schooling in Switzerland (but see Muñoz, 2008, 2014b). By school-level variables we mean context variables, which include, for example, school location and resources, and climate variables, applied to characteristics of the learning environment (e.g. class size, learner expectations, motivation, attitudes, beliefs, influence of teachers and parents, etc.). With all of this in mind, we opted for an equal-status sequential mixed methods design, where the rationale was that of complementarity, development and triangulation (described in detail in Singleton and Pfenninger, 2015). We thus focus not only on FL motivation and learners’ beliefs as individual difference variables but on particular students who are engaged in language learning. On the practical side, we seek to outline a way of researching the age factor in SLA that accommodates the multifaceted nature of this variable, with a special emphasis on the crucial mediating role that clustering effects and individual characteristics and beliefs play. Note, however, that we do not here attempt to provide a detailed account of research designs, methodologies and instruments for investigating the age factor. The focus is on the conceptual basis of the models in question.
Several data collection instruments were deployed in the study:
six L2 proficiency tasks;
one language experience essay (composed in the learners’ language of literacy, that is, Standard German); and
one questionnaire that mapped students’ general motivational dispositions at both data collection times.
IV Method
1 Participants
Our participants were 200 secondary school students from the German-speaking part of Switzerland, who were tested at the beginning and at the end of academically oriented high school when they were 13 and 18 years old respectively. One hundred of the students were early classroom learners (henceforth ECLs) of English who had started being instructed in the language in early childhood (AO 8), and the other 100 were late classroom learners of English (henceforth LCLs) who had started being instructed in the language around puberty (AO 13); see Table 1.
Table 1. Participants in the study.
Note. ECL1 = early classroom learners at Time 1; ECL2 = early classroom learners at Time 2; LCL1 = late classroom learners at Time 1; LCL2 = late classroom learners at Time 2.
The two groups were controlled for L1 (Swiss German), additional FLs learned (Standard German, French), socio-economic background (SES), teaching method and weekly hours of EFL instruction received. Early starters were not mixed in the same classes as late starters. The 200 learners were nested within 12 classes that were nested within five schools in the canton of Zurich. One of the schools was in a suburban area, while the others were in urban school districts.
Note that, despite its status as a language of literacy, Standard German is considered an L2 in Switzerland: while Swiss German is a High Alemannic variety of German, it is hardly understandable to someone who knows only Standard German, as the two languages differ considerably in lexicon, phonology and syntax (see, for example, Berthele, 2010). According to Lüdi (2007: 161), most Swiss citizens are monolingual in childhood, becoming bilingual in the early primary grades when they receive formal literacy training in L2 German from 1st grade on (age 7). This means that German-speaking Swiss children have to learn a relatively unknown language. The situation is similar regarding French: although one of the four national languages of Switzerland, it is considered a foreign language in this study because children in Zurich grow up monolingual, speaking Swiss German, and learn French exclusively in school.
For the qualitative analysis, we selected a focal group of 20 early learners and 20 late learners from those 200 who had participated in the quantitative phase. Early and late learners were selected according to scores on a range of L2 proficiency tests administered at Times 1 and 2. Following Muñoz (2014a), the criterion for inclusion in the high achievement groups was a score in the 75th percentile on all tasks, and for inclusion in the low achievement groups a score in the 25th percentile on all tests. Furthermore, the high-achievers all had grades at or above 5 (6 being the highest grade). Following these grouping criteria, we ended up with four groups of 10 participants: 10 early learners, high achievement (ELH); 10 early learners, low achievement (ELL); 10 late learners, high achievement (LLH); and 10 late learners, low achievement (LLL). This enabled us to study the most successful learners vs. the least successful learners in the sample.
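The selection of focal learners can be sketched as follows. This is a minimal illustration only: it assumes that 'in the 75th percentile' means scoring at or above the upper quartile cut-off on every task (and at or below the lower quartile cut-off for the low-achievement groups), and it uses made-up learner identifiers and scores. The function names are hypothetical and the additional grade criterion is left out.

```python
from statistics import quantiles


def quartile_cutoffs(scores):
    """Return the lower (25th) and upper (75th) percentile cut-offs."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return q1, q3


def select_extreme_groups(scores_by_task):
    """scores_by_task maps task name -> {learner_id: score}.

    Returns (high, low): learners at/above the 75th percentile on ALL tasks,
    and learners at/below the 25th percentile on ALL tasks.
    """
    learners = set.intersection(*(set(s) for s in scores_by_task.values()))
    high, low = set(learners), set(learners)
    for task_scores in scores_by_task.values():
        q1, q3 = quartile_cutoffs([task_scores[lid] for lid in learners])
        high &= {lid for lid in learners if task_scores[lid] >= q3}
        low &= {lid for lid in learners if task_scores[lid] <= q1}
    return sorted(high), sorted(low)


if __name__ == "__main__":
    # Hypothetical scores for eight learners on two tasks.
    demo = {
        "listening": {f"s{i}": i for i in range(1, 9)},
        "vocabulary": {f"s{i}": 10 * i for i in range(1, 9)},
    }
    print(select_extreme_groups(demo))
```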
2 L2 proficiency tasks
Language data were collected by means of a test battery that included a standardized listening comprehension task (see Pfenninger, 2014a, 2014b), two written compositions (an argumentative and a narrative essay), a grammaticality judgment task, a vocabulary size test (Academic sections in Schmitt et al.’s (2001) Versions A and B of Nation’s Vocabulary Levels Test), Laufer and Nation’s (1999) Productive Vocabulary Size Test, and two oral tasks (the re-telling of a silent movie and a spot-the-difference task). The grammaticality judgment task included morphosyntactic structures that have been found to be particularly age-sensitive, such as articles and inflections, as well as structures that are not particularly age-sensitive, for instance word order and do-support (see, for example, McDonald, 2006). We applied two different analyses to the data from the spoken and written production tasks in order to answer research question 1: (a) a communicative holistic analysis, and (b) a quantitative analysis. For the holistic evaluation of the English and German essays, we partly followed Jacobs et al.’s scale (1981), which, according to Lasagabaster and Doiz (2003: 140), requires two evaluators and considers the communicative effect of the speaker’s linguistic production on the receptor and, therefore, comes close to the main objective of the process of language acquisition, namely interpersonal communication. Our evaluation system consisted of two criteria which measure different aspects of written production (Lasagabaster and Doiz, 2003: 142–43):
Content (30 points): this category considers the development and comprehension of the topic as well as the adequacy of the content.
Organization (20 points): several factors are considered here, namely the organization of ideas, the structure and cohesion of paragraphs and the clarity of exposition of the main and secondary ideas.
The results for each of the criteria were summed, the maximum score being 50. The final score was the average of the total points assigned by each of two independent evaluators. The inter-rater correlation (Pearson correlation coefficient) for the written content subscore was 0.82; the organization subscore 0.89; and the total score 0.90 (for the oral data: 0.79, 0.81, 0.86). It was decided to include only two holistic measures, since some authors have questioned the reliability and informativeness of the holistic rating of compositions (for discussion, see Torras et al., 2006: 157ff.).
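The scoring procedure described above can be sketched as follows: each rater's content (maximum 30) and organization (maximum 20) subscores are summed, the two raters' totals are averaged, and inter-rater agreement is expressed as a Pearson correlation. The function names and all example scores are hypothetical; this is an illustration of the arithmetic, not the study's actual scoring script.

```python
from math import sqrt


def total_score(content: float, organization: float) -> float:
    """Sum of the two holistic criteria (content max 30, organization max 20)."""
    if not (0 <= content <= 30 and 0 <= organization <= 20):
        raise ValueError("subscore outside the range of the rating scale")
    return content + organization


def final_score(rater_a, rater_b) -> float:
    """Average of the two raters' totals; each rater is (content, organization)."""
    return (total_score(*rater_a) + total_score(*rater_b)) / 2


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical ratings for one essay by two raters.
    print(final_score((24, 16), (26, 14)))
    # Hypothetical total scores given by each rater to five essays.
    print(round(pearson_r([40, 35, 28, 45, 31], [38, 36, 30, 44, 29]), 2))
```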
In the context of the quantitative approach, competence was measured in terms of oral and written fluency, lexical and syntactic complexity, and morphosyntactic errors. Following Wolfe-Quintero et al. (1998), fluency was examined in terms of words per T-unit, which is defined as one main clause and all of the dependent modifying clauses (Ellis and Barkhuizen, 2005). We should mention that words/T-unit is often also used as a complexity measure. Syntactic complexity was examined in both languages using the clauses per T-unit complexity ratio. Lexical complexity was examined using Guiraud’s Index of Lexical Richness: word types divided by the square root of the word tokens. Accuracy was examined by counting the number of misspellings (excluding ‘mechanical errors’ such as punctuation errors) and the number of morphosyntactic errors per T-unit. Finally, oral fluency was examined by means of pruned syllables per minute (see, for example, Gavin, 2014).
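The ratio measures named above are simple to compute once a text has been tokenized and segmented. The sketch below assumes whitespace tokenization with lower-casing and assumes that T-units and clauses have already been counted by hand; both are simplifications for illustration, and the function names are hypothetical.

```python
from math import sqrt


def guiraud_index(tokens) -> float:
    """Guiraud's index: word types divided by the square root of word tokens."""
    return len(set(tokens)) / sqrt(len(tokens))


def per_t_unit(count: int, t_units: int) -> float:
    """Generic ratio per T-unit (words, clauses or errors per T-unit)."""
    return count / t_units


if __name__ == "__main__":
    # Toy text with two hand-segmented T-units:
    # "the cat saw the dog" / "and the dog ran"
    tokens = "the cat saw the dog and the dog ran".lower().split()
    print(guiraud_index(tokens))        # lexical complexity
    print(per_t_unit(len(tokens), 2))   # words per T-unit
    print(per_t_unit(2, 2))             # clauses per T-unit
```

Because the index divides by the square root of the token count rather than the token count itself, it is less sensitive to text length than a plain type/token ratio.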
3 Language experience essays
Student perspectives occupy a central position in social constructivist approaches to education (e.g. Brooks and Brooks, 2000; Larochelle et al., 2009) as well as in the advocacy of autonomy in the classroom (e.g. Cotterall and Crabbe, 2008; Little, 2007; Ushioda, 2009, 2011), but individualized approaches to age research are still scarce. Thus, in order to give a better account of the interaction of AO and other (often hidden) variables such as motivation, attitudes and beliefs, we used language experience essays, which we hoped would elicit:
the participants’ reflections on their experience of multiple FL learning at the beginning and at the end of secondary school;
the participants’ affect in respect of foreign languages, and English in particular; and
participants’ beliefs about the age factor (rationale of complementarity; see research question 4).
The use of these essays was based on the idea that, on the one hand, learners’ beliefs are – consciously or unconsciously – gleaned from past experiences; and that, on the other, learners’ beliefs have an influential role in respect of learning outcomes (see, for example, Gregersen and MacIntyre, 2014). We provided loose guidelines for the writing. No specific length was set; students wrote between 203 and 475 words.
4 Motivation questionnaire
On the basis of the data from the first qualitative phase, including the essays, we constructed a more structured motivation questionnaire with 28 closed-ended and one open-ended item, formulated in Standard German, which was administered to the same 200 students twice, at the beginning and at the end of secondary school (rationale of development; see research questions 1 and 2). The qualitative analysis of the language experience essays was conducted in three stages. The first stage involved separately reading through the essays for each student several times, getting a general understanding of issues covered and taking note of interesting features. Starting from the second reading, the essays were analysed independently for emerging categories. Fifteen categories emerged as significant relative to target language development and age-related differences. Finally, after the saturation of categories, some were merged with others, resulting in eight final categories:
future L2 self-states;
present L2 self-states;
FL learning anxiety;
linguistic self-confidence;
attitudes towards FLs in general;
attitudes towards the learning situation;
cultural interest and media usage;
parental encouragement.
Future L2 self-states encompass learners’ ‘experiencing’ themselves in future states, their strongly valued future possible selves that included the FL, such as the wish to become similar to native speakers of English, and also the usefulness of the FL skills to be learned in the future and the incentive value of success, i.e. the value of potential outcomes and rewards, external or internal. This included a desired (imagined) L2 community that offers possibilities for an enhanced range of identity options in the future (see, for example, Norton, 2014).
According to Dörnyei (2009: 11) a person’s present L2 self has traditionally been seen as ‘the summary of the individual’s self-knowledge related to how the person views themselves at present’ and is assumed to also concern information derived from the individual’s past experiences (Markus and Nurius, 1986). Present L2 self-states thus refer to the current attitudes learners displayed toward EFL and the FL community and their reactions to a world in which English plays a predominant role, as well as the extent to which the learners wanted to experience cross-cultural contact involving English and travel to English-speaking countries. This dimension also includes factors of external regulation leading to action in order to avoid bad grades or to assuage a guilty conscience.
Making a distinction between present and future self-states is important for two main reasons: on the one hand, the motivation literature emphasizes that motivated behaviour occurs as the learners seek to reduce the gap between their ideal L2 self in the future and their present self (e.g. Dörnyei, 2005, 2014). Measuring this gap can thus shed light on early vs. late starters’ motivated behaviour. On the other hand, reports from our language experience essays highlighted the fuzziness of the ideal L2 self/ought-to L2 self binary in the L2 Motivational Self System proposed by Dörnyei (2005) as well as the integrativeness/instrumentality binary in Gardner’s Socio-Educational Model of Language Learning (e.g. Gardner, 1985; Gardner and Lambert, 1959). For instance, it turned out that internalized instrumental motives, such as perceived benefits and usefulness of English in a globalized world, can be part of the students’ ideal L2 self.
FL anxiety refers to ‘the worry and negative emotional reaction aroused when learning or using a second language’ (MacIntyre, 1999: 27). Since it has been recommended (e.g. Dewaele and MacIntyre, 2014) that researchers examine both positive and negative emotions in the same study, owing to the absence of anxiety being ambiguous and thus difficult to interpret, we added linguistic self-confidence to assess positive emotions. This dimension refers to the belief of learners that they are capable of engaging in social interactions in the L2 and is often said to develop, on the one hand, as a consequence of frequency of (prior) contact and quality (or pleasantness) of contact with the L2 and members of the L2 group (e.g. Sampasivam and Clément, 2014), or, on the other hand, as a precursor to contact when feelings of confidence motivate learners to seek such contact (see, for example, Kormos and Csizér, 2008).
Attitudes towards FLs is concerned with FL learning in general and includes FLs other than English (e.g. French). In order to give a full account of the role of FL learning experiences, it was decided to include and adapt a category on attitudes toward the learning situation, which covers the immediate learning situation important to any study of L2 motivation in a classroom context (syllabus, teacher, class atmosphere, etc.), as well as the learners’ sense that their behaviors are self-determined even though they might be influenced by external sources.
Students’ particular interest in English-speaking cultures was also gauged by questions on cultural interest and media usage, relating to the appreciation of cultural products as, for instance, delivered by the media.
Finally, we added one more dimension, parental encouragement, which refers to the extent to which parents encourage their children to study English (Kormos and Csizér, 2008). On the one hand, this dimension relates to previous findings in the literature that parents can influence their children’s attitudes and motivation in subtle and sometimes unconscious ways through their own attitudes to FLs or FL learning, even without actively involving themselves in their children’s learning (see Mihaljević Djigunović, 2012; Nikolov, 1999). In addition, we wanted to make reference to the fact that parents frequently demand the inclusion of a FL in primary school curricula (Kubanek-German, 1998).
Each category was allotted between two and eight items, giving a total of 28 items. Table 2 shows the Cronbach’s alpha reliability coefficients for the eight multi-item scales of the present study. All of the reliability coefficients are above the recommended .70 threshold.
Table 2. Information about the multi-item scales.
A five-point Likert scale was used for all categories, to provide enough possibilities (whilst avoiding confusion with the Swiss grading system, which scores 1–6). Some of these questions were adapted for the Swiss school context, a third of them were made negative, and the resultant list was translated into German and randomized. Attention was paid to ensure that the questions were not beyond the grasp of the 13–14 age groups at Time 1. The questionnaire was pilot-tested with 50 participants in 2008 (see Singleton and Pfenninger, 2015). This led to the deletion of some items and the reformulation of others. Finally, in the open-ended question, the ECLs were asked about the main differences between EFL in primary school and EFL in secondary school and how they experienced the transition from primary to secondary school with respect to English.
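As an illustration of how negatively worded items and scale reliability are typically handled, the sketch below reverse-codes responses on a five-point scale and computes Cronbach's alpha for one multi-item scale. The data layout, function names and example responses are hypothetical; the formula is the standard one, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def reverse_code(score: int, scale_min: int = 1, scale_max: int = 5) -> int:
    """Flip a response to a negatively worded item (1 <-> 5, 2 <-> 4, ...)."""
    return scale_min + scale_max - score


def cronbach_alpha(item_scores) -> float:
    """item_scores: list of items, each a list of respondents' scores."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    totals = [sum(responses) for responses in zip(*item_scores)]
    sum_item_variances = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


if __name__ == "__main__":
    # Hypothetical three-item scale answered by five learners;
    # the third item is negatively worded and is reverse-coded first.
    item_1 = [4, 5, 2, 3, 4]
    item_2 = [4, 4, 1, 3, 5]
    item_3 = [reverse_code(s) for s in [2, 1, 4, 3, 2]]
    print(round(cronbach_alpha([item_1, item_2, item_3]), 2))
```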
5 Modeling ‘learner-in-context’
One of the most suitable ways to deal with a person-in-context relational view of age and motivation is to employ multilevel models (Baayen et al., 2008; Jaeger, 2008), which are well suited to a potentially generalizable study of age effects and motivation, given the availability of both individual-level and aggregated contextual-level data with a sufficiently large number of groups. These models take into account, for instance, that students within a classroom (and school) might be more similar to each other than to students in other classrooms (and schools) by including random intercepts. Multilevel models can also be used for assessing the impact of context-varying factors on individual difference variables, and they take account of the fact that different participants and/or classes and/or different items may vary with regard to how sensitive they are to the manipulation at hand by including school-specific, subject-specific and item-specific slopes for the fixed effect AO. They provide us with a way to empirically measure and analyse contextual motivational factors, which are often only implicitly reflected in the individual’s self-reported attitudes and cognitions.
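The general form of such a model can be written out as follows. This is an illustrative sketch only, with assumed notation (learner i in class j in school k), random intercepts for class and school and a school-specific slope for AO; it omits item-level effects and further covariates and is not the exact specification fitted in this study.

```latex
% Illustrative sketch; the notation is assumed, not the authors' specification.
\begin{align}
y_{ijk} &= \beta_{0} + \beta_{1}\,\mathrm{AO}_{ijk}
          + u_{0k} + u_{1k}\,\mathrm{AO}_{ijk}
          + v_{0jk} + \varepsilon_{ijk}, \\
(u_{0k}, u_{1k})^{\top} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_{u}), \qquad
v_{0jk} \sim \mathcal{N}(0, \sigma^{2}_{v}), \qquad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^{2}_{\varepsilon}).
\end{align}
```

Here the fixed effects β0 and β1 describe the average intercept and the average effect of AO, u_{0k} and v_{0jk} capture the similarity of learners within the same school and class, and u_{1k} allows the effect of AO to differ from school to school.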
Table 3 displays the learner-level and class-level variables that we selected in our model, where we controlled for the clustering of learners within particular classes within particular schools. As Table 3 shows, we had five learner-level variables and one class-level variable. We ran three multilevel analyses:
1. MLM 2-level analysis (class, school) examining impact of AO on L2 proficiency and motivation at Time 1 and Time 2; we allowed the effects of AO, gender and class size on L2 achievement to vary across classes and/or schools.
2. MLM 2-level analysis (class, school) examining impact of motivation on L2 proficiency at Time 1 and Time 2.
3. MLM 4-level analysis (time, learner, class, school) examining individual growth curves for L2 proficiency and motivation over two waves.
Table 3. Student-, class- and school-level variables.
In analyses 1 and 2 we wanted to see how much difference there was within a class and within a school, i.e. whether all classes and all schools had the same relationships or whether there was variability in the effect of the fixed variables (AO, gender, class size, time, motivation) on learners’ L2 achievement. As the MLM in analysis 3 shows, longitudinal data can also be conceptualized as a hierarchy, where we have different observations nested within people. The outcome and the occasions (time) were Level 1 variables, the learner characteristics were Level 2 variables, class characteristics were at Level 3 and school characteristics at Level 4. Thus, in order to measure growth and development over the years, we fitted 2-level linear growth models to each set of longitudinal L2 proficiency scores. Note that when including continuous predictors such as motivation in a mixed-effect model, it is often useful to center each predictor around its mean value (see, for example, Cunnings, 2012: 376). This involves subtracting from each individual value of a predictor the predictor’s overall mean, and is done to help reduce collinearity within the model (e.g. between main effects and interactions).
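The centering step can be illustrated with a small sketch. The motivation scores and the 0/1 group coding below are invented for illustration and are not data from the study. The sketch shows that the correlation between a group indicator and its interaction with motivation is weaker once motivation has been mean-centered.

```python
# Illustrative sketch only: the values below are hypothetical.
from math import sqrt
from statistics import mean


def center(values):
    """Subtract the overall mean from each value of a predictor."""
    m = mean(values)
    return [v - m for v in values]


def pearson(xs, ys):
    """Plain Pearson correlation, written out to stay version-independent."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


motivation = [2.0, 3.5, 4.0, 4.5, 3.0, 4.0]  # hypothetical 1-5 scores
group = [0, 0, 0, 1, 1, 1]                   # hypothetical 0/1 AO coding

centered = center(motivation)
interaction_raw = [m * g for m, g in zip(motivation, group)]
interaction_centered = [m * g for m, g in zip(centered, group)]

# Correlation between the main effect (group) and the interaction term,
# before and after centering the continuous predictor.
print(pearson(group, interaction_raw))
print(pearson(group, interaction_centered))
```

Centering leaves the slope of the continuous predictor unchanged; it only shifts the point at which the other effects are evaluated, from a score of zero to the sample mean.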
Visual inspection of residual plots did not reveal any obvious deviations from homoscedasticity or normality. P-values were obtained by likelihood ratio tests of the full model with the effect in question against the model without the effect in question. All models reported were fitted using Laplace estimation with the R software (R Development Core Team, 2014) and lme4 (Bates et al., 2014). Also, all models were first evaluated with likelihood ratio tests (test model vs. null model with only the control variables). If the full model vs. null model comparison reached significance, we present p-values based on likelihood ratio tests. Given the lack of degrees of freedom with mixed models, we refrain from reporting df.
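The likelihood ratio comparison described above can be sketched as follows. This is a hedged illustration in Python rather than the R and lme4 workflow used in the study: the log-likelihood values are hypothetical, and the p-value formula shown holds only when the full model has exactly one more parameter than the reduced model.

```python
# Illustrative sketch only: log-likelihoods are invented, and the
# comparison assumes one extra parameter (chi-square with 1 df).
import math


def lrt_p_value_df1(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic 2 * (ll_full - ll_reduced) and its p-value
    under a chi-square distribution with 1 degree of freedom, using
    P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods of a model without and with the effect.
stat, p = lrt_p_value_df1(loglik_reduced=-512.4, loglik_full=-509.1)
print(stat, p)
```

A larger improvement in log-likelihood yields a larger statistic and a smaller p-value; an effect that adds several parameters at once would need the chi-square distribution with the matching number of degrees of freedom instead.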
V Results of the quantitative analysis
Research question 1
In accordance with our research questions 1 and 2, we will examine the impact of AO and motivation on L2 achievement at both data collection times, and follow this with a discussion of the influence of AO and time on motivation. The participants’ performance across all skills tested is shown in Table 4. As can be seen in Tables 4 and 5, the effect of AO is strong and significant for the following dimensions at the beginning of secondary school: receptive vocabulary and written lexical richness, for which an earlier AO was more advantageous, and oral and written accuracy, where the late starters outperformed the early starters.
Table 4. Descriptive statistics (means and standard deviations).
Table 5. Multilevel regression analyses for scores as dependent variable at Time 1 (fixed effect estimates for AO).
Notes. * p ⩽ .05; ** p < .01.
At Time 2, there were no longer any links between the learners’ AO and their FL achievement except for the significantly better grammaticality judgment results of the late starters (see Table 6). This means that the late starters were able to make more progress within a shorter period of time, i.e. there is a clear difference in rate of EFL learning in favor of the late starters.
Table 6. Multilevel regression analyses for scores as dependent variable at Time 2 (fixed effect estimates for AO).
Notes. * p ⩽ .05.
With respect to motivation, the two groups differed from each other at Time 1 in terms of the strength of their future vision of themselves as competent L2 users, with the late starters having significantly higher values, as well as in terms of their present L2 self-states, which were stronger for the early starters – although the latter result was only marginally significant. Table 7 presents the values for each motivational dimension (descriptive statistics), while Table 8 shows the results of the multilevel model with AO as the main predictor of motivation at Time 1.
Table 7. Descriptive statistics for motivation (means and standard deviations).
Table 8. Multilevel regression analyses for motivation as dependent variable at Time 1 (fixed effect estimates for AO).
Notes. * p ⩽ .05; ** p < .01.
Two observations are interesting with respect to the learners’ present and future self-states. First, while the relationship between AO and future and present self perceptions was the same across the five schools, there was significant between-class and between-school variation concerning both these dimensions, as Figures 1 and 2 demonstrate for future self perceptions (although the results in Figure 2 are only marginally significant).

Figure 1. Variation across classes for future L2 self-states at Time 1.

Figure 2. Variation across schools for future L2 self-states at Time 1.
Classes and schools had a significant impact on students’ perceptions and orientations. Also, whereas present self-states did not have the same impact on the scores at either Time 1 or Time 2, future self-states had a large and significant effect at both data collection times (see Tables 10 and 11 in Appendix 1).
The LCLs were also less anxious than the ECLs at Time 1, and they had more positive attitudes towards FLs and the learning situation (see Table 7 above). In fact, the ECLs had extremely unfavorable attitudes towards FLs in general when they began secondary school (mean value of 1.89 on a 5-point scale at Time 1). It has to be pointed out, however, that anxiety, confidence, and attitudes to FLs did not impact on the scores greatly at either measurement time (for a discussion of these findings, see Pfenninger and Singleton, in prep.). By contrast, attitudes to the learning situation had a particularly marked impact after six years, i.e. after the classes’ structures had had time to develop (see Tables 12 and 13 in Appendix 1). Finally, parental encouragement had a significant impact on proficiency at Time 1 but not at Time 2, irrespective of AO (see Tables 14 and 15 in Appendix 1). This is interesting insofar as neither ECLs nor LCLs thought that their parents had had a particularly active, encouraging role in respect of their L2 learning (see the values around 3.5 on a 5-point scale for dimension 10 in Table 7 above).
Research question 2
With respect to motivation as a dependent variable (see research question 2), the results did not show a decline in positive attitudes as students moved up the school. Quite the contrary, almost all orientations received higher values at Time 2, e.g. future L2 self-states (see Figure 3; for all dimensions, see Table 16 in Appendix 1). In addition, the two age groups showed similar growth from Time 1 to Time 2 (see Table 9), except in regard to anxiety and attitudes to FLs, where the ECLs advanced significantly more. At Time 2, there were no longer any differences between the two age groups (Table 17 in Appendix 1).

Figure 3. Single-level regression of future L2 selves over time across the 12 classes.
Table 9. Multilevel regression analyses for growth of the two age groups: Fixed effect estimates: AO.
Notes. * p ⩽ .05.
Not surprisingly, motivation varied across the 12 classes and the five schools. Figure 4, for instance, shows that the relationship between future L2 self-states and receptive vocabulary differed across classes. The intercepts are very different (some classes have a higher intercept than others) and the slopes are not the same, i.e. the lines are not exactly parallel. A different pattern also emerges for the two AO groups: for the dotted lines (i.e. the early starters) intercepts tend to be higher than for the long-dash lines (i.e. the late starters), reflecting the early starters’ better scores on this test at Time 1 (see Tables 4 and 5 above), but the effect of this motivational dimension on receptive vocabulary seems to be the same for both groups (the slopes were equally strong and weak respectively). The differences in motivation slopes are thus not attributable to AO but rather to characteristics of the clusters at the level of sampling. While there was hardly any effect of gender on motivation or learner outcomes, we found a strong negative effect for class size: as the number of students within a class increased, L2 performance (see, for example, receptive vocabulary in Figure 5) and particularly motivation (see, for example, future L2 selves in Figure 6) tended to decrease at both data collection times.

Figure 4. Single-level regression of future L2 selves on receptive vocabulary at Time 1 (T1) across the 12 classes.

Figure 5. Effects of class size on receptive vocabulary at Time 2 (T2).

Figure 6. Effects of class size on future L2 selves at Time 2 (T2).
About 50% of the L2 measures were affected by class size, notably listening comprehension, receptive vocabulary, spoken and written content, spoken and written organization, written accuracy, and grammaticality judgments. Class size also had a significant effect on all motivational dimensions except attitudes to FLs (at Time 1 and 2), culture and media (at Time 1) and parental encouragement (at Time 1 and 2).
Research question 3
The results revealed not only variability in motivation effects across subjects, classes and schools, but also significant variability in age effects across the five schools (see research question 3). For instance, as Figure 7 shows, one of the five schools – the only school in a suburban context – tended to have weaker slopes than the others across all measures (here for receptive vocabulary at Time 1), which means there were hardly any age-related differences found in this school context. This indicates that age-related differences are mediated by wider contextual factors. Variation across classes could not be measured, since early and late starters were not integrated in the same classes.

Figure 7. Single-level regression of AO on receptive vocabulary at Time 1 (T1) across the five schools.
VI Results of the qualitative analysis
In order to be able to answer our last research question (research question 4), regarding learners’ beliefs about the age factor, we used the qualitative data drawn from a selection of the essays written by the totality of 200 participants. Before we deal with the selection in question, however, it may be interesting to delve into the responses to one of the open-ended questions submitted by the totality of early English students, dealing with the primary school experience vis-à-vis secondary:
78% of the responses in question talked about the perception that English instruction in primary school had not focused on explicit rule-learning, whereas in secondary school it very much had;
72% of the responses concerned the perceived inefficiency of the way that early English was taught;
56% of comments expressed criticism of the teacher’s choice of language of instruction;
41% of students’ answers to the question complained about the experience of starting everything again from scratch in secondary school; and, finally:
19% of students’ reactions offered some thoughts about the place of English versus French in primary school.
We come now to the focal group, i.e. the 10 early high-achieving starters, the 10 early low-achieving starters, the 10 late high-achieving starters, and the 10 late low-achieving starters. We concentrate here on relating the learners’ perceptions with regard to the age at which their instruction in EFL had begun.
The trend at Time 1 was for learners to be positive about the age that they themselves had started learning English. The early high achievers came out fairly uniformly at Time 1 with sentiments like the following:
1. ‘The earlier the better: We should learn foreign languages early because our brain learns a foreign language faster when we’re children.’ (07_ELH3_M_GER)
At Time 1 the late high achievers tended, on the other hand, to support the pattern of starting English later:
2. ‘I personally don’t think it’s good to begin learning too early … so beginning English at 12 or 13 I think is exactly right.’ (07_LLH10_F_GER)
The late low achievers also tended at Time 1 to support the pattern of starting English later which they themselves had experienced:
3. ‘An 8-year-old child very probably still doesn’t understand grammar. He/she at that time has other things in his/her head.’ (07_LLL4_M_GER)
The exception at Time 1 to the expression of satisfaction was the tenor of the comments offered by the early low achievers, who were clearly less than charmed by their encounter with English in primary school. At Time 2 the early high achievers showed less unanimity than previously in regard to their assessment of the value of early English. At Time 1 (see above), the views expressed by this group were overwhelmingly favorable; when the learners in question were older the picture was more mixed. Opinions supportive of early English were still in evidence; some more nuanced, more skeptical views also appeared, however:
4. ‘I remember how in early years the learning was unconcentrated and slow. At secondary level it progressed really fast.’ (12_ELH9_M_GER)
The early low achievers were, if anything, even more skeptical about early English at Time 2 than they had been at Time 1.
5. ‘In my opinion the early “learning” of foreign languages … isn’t meaningful. First really because they [the students] don’t learn anything, but are only killing time and get demotivated for foreign languages. Besides this, day by day they lose motivation for school.’ (12_ELL1_F_GER)
Amongst the late-starting high-achievers at Time 2, as at Time 1, the trend was for the late start in English that they had experienced to be approved:
6. ‘As a child I always envied my brother, who had English as early as the second class of primary school. … But looking back I don’t see this advantage as so big any more. Within half a year I had in the 2nd year of secondary school the same level of English as my brother.’ (12_LLH6_M_GER)
The late low achievers at Time 2 remained as satisfied as they had been at Time 1 with late English, and as skeptical as they had been with regard to the wisdom of the introduction of English at primary level. In sum, we learned from the language experience essays that for the most part the late starters were content with and positive about their late start, and that those who had been able to compare themselves with early starters (e.g. younger siblings) did not find themselves at a disadvantage from beginning English later. Amongst the early starters we found differences between the high achievers and the low achievers. At Time 1 the mood amongst the high-achieving early starters was very buoyant, many of the positive opinions expressed, though, seeming to be based on ‘received wisdom’ about the desirability of beginning English instruction early. At Time 2, views were mixed, a number of high-achieving early starters referring to their disappointment with their actual experience of early English. The pattern of perceptions voiced by the early low achievers was mostly negative at both Time 1 and Time 2.
VII Discussion
In our study we first addressed the question about the main differences concerning L2 achievement in two different AO groups (research question 1). With respect to rate of acquisition, it became obvious that the late starters were able to catch up very quickly (i.e. within six months in secondary) with the performance of the early starters – who had had five years more EFL instruction – with respect to a range of oral and written measures, and that they were able to remain on a par with the early starters until the end of obligatory schooling in Switzerland. The overall lack of effect of starting the FL at an earlier age on FL achievement could be accounted for by reference to a number of theoretical, affective and contextual factors. On a theoretical level the long-term advantage conferred on most learners by an early start in a naturalistic language learning context is not found in an FL learning context (see, for example, Muñoz, 2014b). With reference to possible reasons for the ‘kick start’ of the LCLs at Time 1 and the general lack of age-related differences, the results indicated that for the LCLs, motivation was more strongly goal- and future-focused at the first measurement, while the motivation of the ECLs was predominantly influenced by (present and past) cumulative experiential factors. Since future selves – but not present selves – had a strong impact on the L2 achievement, the LCLs were possibly able to profit from their orientations at Time 1. As outlined in the literature review above, the strong link between a future time perspective and academic achievement is not new: students who ascribe higher valence to goals in the distant future have been found to be more persistent and obtain better academic results in the present (see, for example, Dörnyei et al., 2014). Since future selves contain an element of experiencing ourselves in that future state, they involve a sense of agency, i.e. the belief that one is capable of affecting outcomes in the future, based on past experiences and attributions for success and failure. As mentioned above, agency is a vital characteristic of successful learners and is central to appreciating their engagement, motivation, autonomy, and self-regulatory behaviors (see, for example, Mercer, 2012). It is thus possible that due to their past experiences in primary school the ECLs did not experience the requisite sense of agency. Interestingly, the LCLs were able to keep their visions alive over time, as they had similarly high values in this motivational dimension at Time 2. What is more, the gap between present and future selves was the same for both AO groups at Time 1. It is important to bear in mind, however, that there were no longer any differences in terms of future self-states between the two AO groups at Time 2, and for both groups, future self-representations were at that point stronger than present self-states.
The expression of negative attitudes towards FLs and the learning environment at Time 1 is a striking result for the early starters. From the qualitative analysis it became clear that various factors seemed to contribute to the disengagement of the early starters and might be responsible for the observed lack of enthusiasm for engaging with English in school. These might include a lack of belief in the efficacy of in-school learning environments among learners (see also Henry, 2014) and a relationship between not liking the teacher and not liking the subject (see also Taylor, 2013). Resistance also appears to have arisen from a discrepancy between the learners’ expectations of ‘good teaching’ and the pedagogical practices of the teacher. It also seemed that the ECLs had to deal with a range of challenging aspects of L2 learning at the beginning of secondary school, such as difficulty adjusting to the new teaching style. This is also what Cenoz (2004) observed. She found significant differences in favor of late starters when it came to the L2 learning motivation of learners who were in the same school year (4th secondary) but who had received different amounts of instruction. Cenoz hypothesized that this might have been related to the differences in input and methodology between primary school and secondary school. The ECLs’ responses also raise the question as to whether the skills that are acquired in primary school are adequately measured and accredited in secondary school.
The ECLs’ dissatisfaction with early English and the transition from EFL in primary to EFL in secondary is problematic in several respects. Norton (2014), who takes a poststructuralist view of motivation and resistance in a classroom, points out that a student can be highly motivated and eager to learn English in general, but that if the language practices of the classroom make a learner unhappy or dissatisfied, the learner may resist participation in classroom activities, or become increasingly disruptive. This position finds support from Ushioda (2014), who points out that social-environmental conditions that undermine learners’ sense of competence will generate forms of motivation that are less internalized, less integrated into the self or aligned with its values, and more externally regulated by environmental influences, pressures and controls. The reports in this study also confirm the influence of the teacher that has been documented abundantly in the SLA literature (e.g. Noels et al., 1999; Taylor, 2013; Ushioda, 2011). Lamb and Budiyanto (2013) explain that if the teachers do not have any personal experience of Anglophone culture, English will be taught and learned as a ‘values-free body of knowledge conveyed via official textbooks’ (p. 26) and the students might become more oriented towards practice for local and national exams. In a similar vein, anxiety can result from the classroom situation (see, for example, Horwitz et al., 1986). For many students, the learning of English is not an enjoyable activity in itself, but one which they have been required to persist at for many years in primary school with negligible levels of success.
The fact that the LCLs were as confident L2 speakers as the ECLs at Time 1 despite their lesser contact with the L2 in a school context might be explained in terms of the idea that linguistic confidence can also result from contact via foreign media use, travel, and perceived importance of contact (see, for example, Clément et al., 1994; Kormos and Csizér, 2008). At both data collection times we found very high values for both AO groups in the area of cultural interest and media. In data collected from 623 Hungarian students, Kormos and Csizér (2008) observed that English language cultural products had a significant effect on motivated behavior in secondary school pupils, whereas for adults ‘international posture’ was a more important predictive variable (see also Tragant, 2006). Among the limited research on the facilitating effect of computer-mediated communication (CMC) on L2 acquisition, findings suggest that for beginning learners, the use of synchronous CMC methods such as text chat can allow learners to develop a sense of L2 confidence and alleviate anxiety (see Satar and Ozdener, 2008).
As regards the development of motivation over the course of secondary school (research question 2), our results show that learners do not necessarily become more disenchanted with EFL over time. On the contrary, our participants became increasingly motivated on a range of motivational dimensions. In this respect our findings confirm those reported by, for example, Dewaele and MacIntyre (2014), who found a steady increase in FL enjoyment from pre-teens to those in their thirties. However, we cannot say that more hours of instruction are associated with more positive attitudes, as suggested, for example, by Tragant and Muñoz (2000), as there were no differences between the two AO groups with respect to growth.
At the contextual level (see research question 3), our findings illustrate the importance of school diversities – for example in school curricula, materials and resources, teacher background and training – and their association with age-related differences. The participants in this study came from different primary and secondary school districts and neighborhoods and hence slightly different educational backgrounds that emphasized different skills and values. It is thus not surprising that we found variation across schools when it came to differences between early and late starters. Previous studies have already demonstrated a strong link between socio-economic status and achievement and motivation respectively (see, for example, Kormos and Kiddle, 2013; Lamb, 2012). For instance, Muñoz (2008) argued that students from different social backgrounds have access to different types of schools (state vs. private) and to varying degrees of extracurricular exposure to the target language (e.g. private tuition, learning resources, study abroad, etc.). While there were no students from disadvantaged backgrounds in this study, the results nevertheless showed how the school district can impact on students’ motivated behavior and, by extension, mediate age-related differences: resources available and used in FL education are dependent on schools, which might then influence learners’ intrinsic interest indirectly (see, for example, Kormos and Kiddle, 2013), with the mediation of classroom factors (Muñoz, 2008). Students who are highly motivated might thus be able to make up for a later start. By the same logic, early starters who were in primary schools with less than optimal learning conditions might not be able to profit from the extended learning period, as they might have, for instance, significantly less favorable future L2 self-states.
It should be noted that motivated behavior and L2 performance were also strongly influenced by class size in secondary school, an effect that has also been observed in numerous studies on willingness to communicate (see, for example, Cao and Philp, 2006).
One limitation of the current research is the relatively short period of English instruction (five years) experienced by our later beginners. Ideally, we should have liked to follow all our learners, or at least some of them, through into higher education or whatever their next stage in life held in store for them. Unfortunately, this was not possible for practical and logistical reasons.
VIII Conclusions
We can draw the following general conclusions from the findings of our quantitative multilevel analyses and the individual-learner qualitative data:
While there were clear differences with respect to rate of acquisition in favor of the late starters, we found no main effect for age at the end of mandatory school time. This was also reflected in the qualitative data, e.g. in the comparisons that late learners reported making with their younger siblings who had experienced early English and who failed, according to the reports, to perform better than the late starters.
A strong future vision of L2 use and usefulness was a significant predictor of success for both early and late starters – but only the latter displayed high values in this motivational area at Time 1, which might have contributed to their kick-start at the beginning of secondary school.
The broader social and educational school context – i.e. the schools – played an important role in attitude formation and in influencing students’ future L2 self-states, which had a mediating effect on starting age.
The quality of learners’ day-to-day experiences, shared histories and relations in particular EFL classrooms represents an important microlevel that shaped students’ affective engagement with English and thus assumes particular importance for discussions of motivation.
Our results thus run counter to the commonly held views that earlier starters show a significant advantage over later starters due to their greater exposure, and that the main gains of early FL learning lie in the development of positive attitudes and motivation (e.g. Blondin et al., 1998; Edelenbos et al., 2007). Furthermore, positive attitudes were not associated with biological age either, as younger learners were not more motivated than older learners, i.e. motivation increased with time.
It seems that Ushioda’s (2013) observation holds true that ‘it is what happens (or does not happen) in each individual classroom, as orchestrated by the teacher, that will have a critical bearing on how students are motivated (or not) to invest effort in learning English’ (p. 235). Since it is at a very localized level of students’ learning experience that the real potential for engaging (or disengaging) their motivation may lie, there is an increasing need for methods that provide ecologically valid tests of age effects in a classroom. The method described in the quantitative part of this article, multilevel modeling (MLM), turned out to be a convenient one: it reduces arbitrariness because it more closely reflects the influence of situations as they are encountered in the students’ daily lives, and it thus yields adequate estimates of variances and, in turn, correct standard errors, inferences and (likelihood-based) p-values. MLM thus highlights the growing methodological sophistication of group researchers as they identify new ways to deal with the challenge of studying individuals nested in groups (Forsyth, 2010). That is not to say that contextually grounded research approaches no longer need to be qualitative. While MLM allows us to integrate contextual factors, it defines context as an independent background variable that influences motivation, AO and proficiency, but over which learners have no control (see Ushioda, 2009). The qualitative dimension allows the analysis to arrive at a ‘flavor’ of learners’ perceptions and reactions, which is very often indispensable when it comes to constructing a true-to-life interpretation of the quantitative data. In the ever-growing system of educational accountability it is imperative that studies of age effects examine the way that schools and classes can use climatic characteristics to influence students’ academic performance.
Footnotes
Appendix 1
Acknowledgements
The writing of this article was supported by a research grant of the University of Zurich, Grant FK-15-078, to Simone E Pfenninger. This grant is hereby gratefully acknowledged. We are greatly indebted to the students and teachers for their enthusiastic participation and support. We are also grateful to Johanna Gündel for her research assistance, as well as to the editor and the three anonymous reviewers for their helpful comments and suggestions. Any remaining errors are our own.
Declaration of conflicting interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The writing of this article was supported by a research grant of the University of Zurich, Grant FK-15-078, to Simone E Pfenninger.
1. The reliability coefficient (KR-20) obtained was .90 for grammatical items and .95 for ungrammatical items.
