Abstract
Finding the match between individuals and educational treatments is the aim of both educators and the aptitude-treatment interaction research paradigm. Using latent growth curve analysis, the present study investigates the interaction between the type of explicit instructional approach (deductive vs. explicit-inductive) and the level of foreign language aptitude (high vs. low) in the learning of explicit grammar rules. The results indicate that, on the whole, the two equally explicit instructional approaches did not differentially affect learning performance. However, when the level of language aptitude, measured by grammatical sensitivity, associative memory, and memory for text (with the last variable being the best measure), was taken into account, low-aptitude learners performed significantly better with deductive instruction in the sentence-correction tests. The interaction effects of equally explicit instructional approaches suggest the need to consider aptitude-treatment interaction to maximize learners’ potential for success in second language learning.
I Introduction
The question of how second language (L2) grammar is best learned has generated a lot of debate among researchers in the fields of second language acquisition (SLA) research and applied linguistics. There is now a considerable amount of evidence that lends support to explicit types of instruction as opposed to implicit types of instruction (e.g. de Graaff, 1997; DeKeyser, 1995; Doughty, 1991; Reber, Kassin, Lewis, & Cantor, 1980; Robinson, 1997; Scott, 1989; for meta-analyses of the effects of instructed SLA, see Norris & Ortega, 2000, 2001). Additionally, evidence supporting a positive relationship between the extent of explicitness of an instructional condition and learner performance has also been gathered (Leow, 1998; Robinson, 1997; Rosa & Leow, 2004; Rosa & O’Neill, 1999).
Although a considerable body of work has accumulated in support of the positive effect of explicit types of instruction on L2 acquisition, little is yet known concerning how individual differences (IDs) at the level of language aptitude interact with various equally explicit learning conditions to affect the success of the learning of grammar rules. Almost any L2 teacher or researcher will agree that individual learners differ in their readiness to benefit from a particular instructional approach and the appropriate combination of instructional approaches matched with the appropriate learners is what really produces success (Skehan, 1989). For that reason, whether independent groups of learners, who share the same aptitude score profile, will differentially benefit from different but equally explicit instructional approaches is a question of great significance for both L2 teachers and researchers. Answers to this question can shed light on two important questions in SLA. First, how can form-focused instruction be used to match learner characteristics with instructional characteristics (Robinson, 2001; Sawyer & Ranta, 2001)? Second, how do we account for variation in language learning success under particular instructional conditions (Robinson, 2001; see DeKeyser, 2009)?
The present study investigates the effects of two equally explicit instructional conditions, deduction and an explicit type of induction, hereafter explicit-induction, on the success of learning the construction with the Spanish psych verb gustar (‘to like; to be pleasing’), known for being difficult to acquire by native English-speaking learners (Gascón, 1998; López Jiménez, 2003; Marras & Cadierno, 2008; Montrul, 1997; VanPatten, 1986; Zyzik, 2006). The study hypothesized that the effectiveness of deduction and explicit-induction would vary across the level of language aptitude. Specifically, learners who share the same aptitude score profile would differentially benefit from equally explicit instructional approaches.
II Background and motivation
In the following sections we will review:
deductive and inductive learning;
aptitude-treatment interaction (ATI);
foreign language aptitude relevant for the learning of grammar rules; and
sources of IDs in learning and long-term memory ability.
Drawing on the literature reviewed, we will provide the motivation for the current study.
1 Deductive and inductive learning
Deduction and induction are instructional techniques commonly used by language teachers. Deduction means that rules are presented before examples are encountered, whereas induction means that examples are encountered before rules are inferred (DeKeyser, 1995). Furthermore, deduction is the process that goes from consciously formulated rules to their application in language use, while induction is the process that involves real language use, from which patterns and generalizations emerge (Decoo, 1996). While deductive learning is inevitably explicit, with concurrent awareness of what is being learned, induction can be either implicit or explicit (DeKeyser, 1995). The debate over the superiority of one technique as opposed to the other has a long history in L2 teaching (Erlam, 2003; Fischer, 1979; Haight, Herron, & Cole, 2007; Hammerly, 1975; Herron & Tomasello, 1992; Seliger, 1975; Shaffer, 1989). Further complicating the debate is that the techniques of deduction and induction used in L2 teaching are not uniform (Decoo, 1996). While the didactic strategies of deduction may vary in the degree of explicitness or elaborateness (DeKeyser, 1995), induction has taken on many forms. Decoo (1996) identified at least four types of induction:
conscious induction as guided discovery;
induction leading to an explicit ‘summary of behavior’;
subconscious induction on structured material; and
subconscious induction on unstructured material.
He further maintained that each of these four modalities of induction, and even deduction, could be further refined or combined into subtypes.
Given the diverse ways that induction and deduction are used in L2 teaching (Decoo, 1996), the diversity of research design with regard to treatment conditions comparing deduction and induction (Haight et al., 2007), and the different ways that explicit instruction, including deduction and induction, is operationalized in research studies (Norris & Ortega, 2000), it is not surprising that the results of studies comparing the effectiveness of deduction and induction to date have been mixed. For example, three studies, which utilized only immediate measurements, reported no significant differences between these approaches (Abraham, 1985; Rosa & O’Neill, 1999; Shaffer, 1989). Seliger (1975) also reported no significant difference between these approaches; however, the deductive group showed superiority in the three-week delayed posttest. Abuseileek (2009) reported that for simple structures there was no difference between these two methods; however, for complicated structures, the deductive method was better than the inductive method. Although Fotos and Ellis (1991) did not intend to compare deduction with induction, they reported no significant difference between the group that received an induction-based approach (consciousness raising tasks) and the group that received a deduction-based approach (traditional grammar lessons) 1; however, the deduction-based group performed significantly better in the two-week delayed posttests. In a later study, Fotos (1994) reported that the group that received the induction-based approach, which was not identical to that of Fotos and Ellis (1991), and the deduction-based group both made similarly significant gains, which were maintained after a two-week period. Furthermore, Sjöberg and Tropé (1969) reported that deductive learning was more effective but that the advantage disappeared after five weeks. Erlam (2003) and Robinson (1997) also reported that the deductive condition was more effective.
In contrast, Herron and Tomasello (1992) concluded that guided inductive learning was superior in the learning of certain grammatical structures. Further, Vogel, Herron, Cole, and York (2011) found that guided inductive learning was superior for short-term but not for long-term learning. Finally, Haight et al. (2007) and Leow (1998) reported the superiority of induction over deduction in terms of both short-term and long-term retention. 2
The use of deductive and inductive methods is also common outside of L2 teaching. Educational psychologists such as Bruner (1961, 1973) suggested that a discovery method (as used in induction) can ‘lead to more orderly, integrative, and viable organization, transformation, and use of knowledge’ (Ausubel, 1963, p. 160). This is because ‘the very attitudes and activities that characterize “figuring out” or “discovering” things for oneself also seems to have the effect of making material more readily accessible in memory’ (Bruner, 1973, p. 412). Bruner derived this view largely from psychological research, which suggested that the key to retrieval of stored information is organization, or knowing where to find information and how to get there. On the other hand, Ausubel (1963) argued that the act of discovery per se does not lead to the organizing and integrative effects of learning by discovery. It accomplishes such effects only to the extent that ‘the learning situation is highly structured, simplified, and skillfully programmed to include a large number of diversified exemplars of the same principle, carefully graded in order of difficulty’ (Ausubel, 1963, p. 160). The above views suggest that a highly structured and simplified discovery learning environment may make learned materials more readily accessible in memory. In other words, induction may be superior to deduction. Nevertheless, a differing view was expressed by the educational psychologist Anderson (1967). When summarizing evidence on the general effect of various orders of rule and example, Anderson indicated that the literature on discovery learning (which follows the example–rule order) generally shows that rule–example procedures (as used in deduction) result in speedier acquisition and better retention than discovery methods.
As a significant amount of evidence from L2 studies has lent strong support to more explicit over less explicit types of instruction, a logical prediction derived from such evidence is that learning conditions sharing a comparable level of explicitness will have a similar effect on learning performance. In other words, despite the conflicting evidence as to the effectiveness of deduction and induction, deductive and inductive approaches are expected to produce similar outcomes when they hold comparable degrees of explicitness. Taking this prediction into account, to ensure that the deductive and the inductive approaches used in the current study are comparable in explicitness, they were both operationalized to draw learners’ attention to grammatical forms and provide correct grammar rules, even though at different points during the learning process.
2 Aptitude-treatment interaction (ATI)
ATI refers to the concept that some instructional techniques are more or less effective for particular individuals depending upon their specific abilities or characteristics (Corno, Cronbach, Kupermintz, et al., 2002; Cronbach & Snow, 1977; Snow, 1991). Several educational studies have attempted to resolve the conflicting views regarding deduction and induction by considering the effects of ATI (Eggins, 1979; McLachlan & Hunt, 1973; Tomlinson & Hunt, 1971). For example, Tomlinson and Hunt (1971) found that whether a rule is presented before an example (deduction) or afterward (induction) produced differential effects. Students at a low conceptual level (incapable of generating their own concepts) learned better with the rule–example order (deduction). Students at a high conceptual level (capable of generating new concepts) did not differ significantly between the two conditions, although their performance was worse in the rule–example condition (deduction).
A few L2 studies have explored ATI by comparing the performance of learners who share the same level or type of cognitive ability in different instructional conditions (Abraham, 1985; DeKeyser, 1993; Gallegos, 1968; Hauptman, 1971; Nation & McLaughlin, 1986; Wesche, 1981; Zampogna, Gentile, Papalia, & Gordon, 1976). For example, Abraham (1985) compared the effectiveness of deductive and inductive methods. Her results showed that field-independent participants, defined as those scoring 11 or above on the Group Embedded Figures Test (GEFT) (Oltman, Raskin, & Witkin, 1971), performed better with the deductive lesson, while field-dependent participants, defined as those scoring below 11 on the GEFT, performed better with the inductive lesson. However, participants as a whole did not differ significantly in their performance under the different instructional conditions. Moreover, Hauptman (1971) reported that students of high language aptitude, measured by the Modern Language Aptitude Test – Elementary Form (Carroll & Sapon, 1959), and intelligence performed significantly better under a ‘situational’ rather than a ‘structural’ approach; however, there was no significant difference between approaches among students of lower aptitude and intelligence.
As shown above, studies that have explored ATI have provided some evidence that supports the relevance of the level of aptitude in instructional design and the use of it to accommodate IDs. Nevertheless, because very few L2 studies have examined possible interactions between language aptitude and instructional conditions, the potential causal role of the language aptitude variable in different instructional conditions remains largely unexplored. Given that previous ATI research studies have shown that learners who shared the same level of cognitive ability performed differently under different instructional conditions, the present study hypothesized that the effectiveness of deduction and explicit-induction would differ across the level of language aptitude.
3 Foreign language aptitude relevant for the learning of grammar rules
Studies investigating L2 learning success in relation to language aptitude have consistently shown that language aptitude is the single best predictor of subsequent language learning achievement (for a review, see Sawyer & Ranta, 2001). Regarding the components of language aptitude responsible for facilitating the learning of grammar rules, Robinson (1997) showed that grammatical sensitivity (the ability to recognize the grammatical functions of words in the context of sentences) and rote/associative memory (the ability to form bonds between stimuli, i.e. native language words, and responses, i.e. target language words) correlated positively with learning performance in the learning condition in which learners received explicit instruction of grammar rules. In a later study, Robinson (2001) proposed that memory for contingent text, consisting of text memory and speed of working memory for text, and metalinguistic rule rehearsal, consisting of grammatical sensitivity and rote/associative memory, are responsible for explicit rule learning.
Skehan (1980, 1982, 1989) noted that two aspects of memory can predict language learning success: response integration, the ability to recall a set of words in an unknown language without being given word associates in the first language, and memory for text, ‘the ability to analyse text, to extract its propositional content, and remember such content’ (1989, p. 31). To measure memory for text, Skehan (1980, 1982) used Indonesian grammar rules, each of which contained a different number of propositions (see Appendix 1). As he explained, what the memory-for-text predictor seems to do is ‘measure how adept people are at bringing their general knowledge of meaning to bear on newly presented materials; to see relationships between the elements involved; and to relate them to existing knowledge’ (1989, p. 31). This view is consistent with many psychological research findings, which suggest that prior knowledge influences future learning. Those who know more learn better. This view will be discussed more in the next section.
According to skill acquisition theory, knowledge about L2 grammar can start out in declarative form (DeKeyser, 1998, 2001; see Chi & Rees, 1983). It is plausible that memory for meaningful material and connected text, as investigated by Skehan (1980, 1982), has a facilitative effect on the learning of explicit rules. This view seems to coincide with Robinson’s (2001) hypothesis mentioned above, namely that text memory (subsumed under memory for contingent text) is related to explicit rule learning.
Taking the results of previous studies into account, the present study, which used latent growth curve analysis (Duncan, Duncan, & Strycker, 2006), hypothesized a priori that three indicator variables (memory for text, grammatical sensitivity, and associative memory) were able to measure a latent variable, language aptitude. The strength of the relationship between the latent variable and each of the three indicators was measured by its factor loading.
4 Sources of IDs in learning and long-term memory ability
According to Resnick and Neches (1984), the sources of IDs in learning ability are commonly divided into two broad classes: capacity differences and knowledge differences. Capacity is a fixed characteristic of an individual at any given point in development. In contrast, knowledge is acquired through learning about specific domains. In reviewing the psychological research to identify sources of IDs in learning and long-term memory ability, Bors and MacLeod (1996) indicated that ‘knowledge’ is their single-word summary. As ‘knowledge is the essential material used in generating elaborations and forming links’ (Kyllonen, Tirre, & Christal, 1991, p. 75), a person’s knowledge base in a given domain influences his/her ability to acquire new knowledge in the same domain, whether the learning is declarative or procedural in nature (Bors & MacLeod, 1996). For example, people high in baseball knowledge needed less information than those low in baseball knowledge in order to recognize a game description as old or new (Chiesi, Spilich, & Voss, 1979).
In addition to IDs in knowledge base, Bors and MacLeod (1996) pointed out that people vary in how they arrange information in memory. Such organization governs how quickly and reliably they can access the information. Further, people differ in ‘retrieval speed for well-known information in long-term memory and these differences are related to measures of intellectual ability, most notably verbal ability’ (p. 432). Given the views presented above, it appears reasonable to assume that an instructional approach can accommodate individual learners’ differences in domain knowledge base and in the organization and retrieval of information in memory, consequently enhancing their ability to acquire new knowledge.
Skehan (1980, 1982) indicated that the factor analysis of the test he used to measure memory for text (used in the present study as well) suggested that it measured simultaneously verbal intelligence, trained language ability (the amount of education and previous grammar/language analysis instruction), and presumably memory for natural language. Thus, it is reasonable to assume that learners who score low on such a test have a low knowledge base about grammar and verbal intelligence. As the deductive approach presents grammar explanations and rules to the learners at the outset and provides them with the opportunity to practice the rule immediately, it is plausible that, compared to the explicit-inductive approach, it can better remediate these learners’ weaknesses. On the contrary, it is reasonable to assume that learners who score high on this test have a high knowledge base about grammar and verbal intelligence. It is likely that the explicit-inductive approach, which allows information to be organized in such a way that it can be retrieved more easily (Bruner, 1973), can better capitalize on their strength of not needing much information at the outset to support learning.
III Research questions
Three research questions guided the present study:
Do learners receiving one explicit instructional approach perform as well as those receiving the other explicit instructional approach?
Does language aptitude (a latent variable measured by three indicator variables: memory for text, grammatical sensitivity, and associative memory) bear a significant relationship with learner performances?
Does the explicit-inductive learning condition produce better results among learners high in language aptitude and, conversely, does the deductive learning condition produce better results among learners low in language aptitude?
IV Method
1 Participants
The initial pool of participants consisted of approximately 400 students enrolled in 21 classes of a first-quarter Spanish course at a public university in the USA. Learners were randomly assigned to two explicit learning conditions: deductive (DE) and explicit-inductive (EI). The final sample contained 93 native English speakers from 21 classes who finished the learning activities (46 males, 47 females; nDE = 42, nEI = 51). 3
2 Grammatical structure
The construction with the Spanish psych verb gustar (‘to like; to be pleasing’) was chosen to be the target structure for the following five reasons.
First, Spanish psych verb constructions are considered very difficult to acquire by native English-speaking learners (Gascón, 1998; López Jiménez, 2003; Marras & Cadierno, 2008; Montrul, 1997; VanPatten, 1986; Zyzik, 2006). One of the likely factors underlying the difficulty of their acquisition is the way that this construction differs from English. As shown in examples (1) and (2), in English the entity that experiences the emotion, the experiencer, is coded as the subject of the sentence (I, students), and the stimulus that causes the emotion as the direct object of the sentence (strawberries, summer). In contrast, in Spanish the experiencer is coded as the indirect object (me, les) while the stimulus is coded as the subject (las fresas, el verano). Nevertheless, in both languages the experiencer is placed at the beginning of a sentence and the stimulus at the end of a sentence. 4
(1) Me gustan las fresas.
    me are pleasing the strawberries
    ‘I like strawberries.’
(2) A los estudiantes les gusta el verano.
    to the students them is pleasing the summer
    ‘Students like summer.’
Since native English-speaking learners tend to process the first noun or pronoun they encounter in a sentence as the subject or agent (VanPatten, 2003), correctly interpreting sentences containing this verb is not an issue for them. Conversely, correctly producing sentences containing this verb is a challenge. Because learners can successfully derive the correct meaning of a sentence in spite of overlooking the case of the noun or pronoun that occupies the beginning position in a sentence and/or the verb inflection, the grammatical features related to this construction lack ‘communicative value’ (see VanPatten, 1985). The manner in which these two languages differ may have contributed to some common errors made by native English-speaking learners, well known to Spanish instructors. They include subject–object confusion (Gascón, 1998; VanPatten, 1986), the use of a singular verb with a plural subject (Gascón, 1998), the omission of object pronouns (Gascón, 1998), object pronoun errors (Gascón, 1998), and the omission of personal a (‘to’) when inclusion is required (Gascón, 1998). The following two examples demonstrate the errors that learners may make with regard to sentence (1) above.
(3) * Gusto[V.-1st person sing.] las fresas.
(4) * Yo[Subj. Pron.-1st person sing.] me gusta[V.-3rd person sing.] las fresas.
The following examples demonstrate the errors that learners may make with regard to sentence (2) above.
(5) * Los estudiantes gustan[V.-3rd person plural] el verano.
(6) * Los estudiantes le[Obj. Pron.-3rd person sing.] gusta el verano. 5
Second, using the psych verb gustar to construct a sentence is a complex task because more than one grammatical concept has to be taken into account in order to arrive at a correct form in language production. The difficulty of its acquisition, the lack of communicative value of grammatical features, and the task complexity involved suggest that the likelihood of spontaneous noticing and processing will be low and explicit instruction, rather than an instruction based on comprehension, will be more beneficial (see de Graaff, 1997; Hulstijn & de Graaff, 1994).
Third, the rules for this syntactic structure are 100% regular. Therefore, explicit rule-based instruction is likely to be beneficial.
Fourth, to date, very few experimental research studies have used this syntactic structure as the target structure (two exceptions are Bowles, 2008, and López Jiménez, 2003). 6
Fifth, this grammatical structure was not formally introduced to learners outside the exposure period during the academic term in which the study was conducted.
3 Instructional materials
All instruction was presented in an electronic format via the internet. Since the participants who received treatment came from 21 different classes, the online environment allowed all students to receive the same instruction presented in the same way, eliminating teacher differences. The online materials consisted of five lessons containing a total of approximately 40–50 minutes of learning materials.
The first three lessons taught three areas of concepts related to the target structure: the subject of the sentence (subject–verb agreement), the indirect object pronoun, and the personal a (‘to’) before a common/proper noun. The types of pages contained in each of the first three lessons of the DE condition were as follows:
Introduction;
Presentation of the grammar concept/s;
Exemplars accompanied by explanations;
Multiple-choice (with only two selections for each question) or fill-in-the-blank exercises that required learners to apply the concept learned, with corresponding try-again and good-job feedback.
The types of pages contained in each of the first three lessons of the EI condition were as follows:
Introduction;
Highly structured and simplified activities (Ausubel, 1963) that required learners to observe exemplars and indicate what was observed through multiple-choice questions (with only two selections for each question), with corresponding try-again and good-job feedback;
Multiple-choice questions (with only two selections for each question) that required learners to indicate the pattern/s discovered, with corresponding try-again and good-job feedback.
To ensure that the participants of both conditions would receive a comparable amount of exposure to the targeted structure, the researcher provided (1) the same set of exemplars in both conditions and (2) a comparable amount of grammar explanations in both conditions, through either instructional (DE) or feedback (EI) pages.
In the fourth lesson, the concepts presented in the first three lessons were organized into a sequence of mental and observable steps that learners could follow, from left to right, when they created sentences containing the target verb (see Appendix 2). These steps, created by one of the authors by following an ‘information-processing analysis’ (Gagné, Wager, Golas, & Keller, 2005, p. 153), were expected to facilitate the process of creating sentences in real time (for a discussion of the need for grammar rules that learners can use to produce sentences, see Garrett, 1986; Randall, 2007). Learners in the DE condition received these sequenced rules and were asked to memorize them before writing them down from memory. Learners in the EI condition were asked to derive their own sequence of rules by considering the exemplars provided. Afterwards, their sequence of rules and the researcher’s were displayed side by side on the screen so that they could compare the two. 7
The fifth lesson contained five post-instructional exercises, in which learners of both conditions practiced identifying and correcting errors in sentences using the grammar rules they learned earlier, and producing the target structure. These activities aimed to help anchor the grammar rules solidly in the learners’ consciousness, in declarative form (see DeKeyser, 1998). 8
Because the learning tasks took place online, several program control features were integrated into the computer program. To ensure that learners paid attention to the learning of the target structure, learners were required to have obtained perfect scores on the multiple-choice or fill-in-the-blank learning activities during the first three lessons before they were allowed to access subsequent activities. 9 Additionally, to ensure that learners would receive the instructional treatment and in the manner intended, they were required to perform learning activities sequentially without skipping any pages.
4 Aptitude measures
Two measures of memory and one measure of analytical language ability were obtained via three group-administered tests given in two class periods, one week and two weeks before the learner accessed the online learning program. They included:
memory for text, assessed through the task involving recall of fifteen Indonesian grammar rules based on Skehan (1982), each of which contained a different number of propositions (M = 14.98, SD = 6.70, Maximum Score = 40) (see Appendix 1);
grammatical sensitivity, assessed by Modern Language Aptitude Test (MLAT), IV – Words in Sentences (Carroll & Sapon, 1959); and
associative memory, measured by MLAT, V – Paired Associates (Carroll & Sapon, 1959). The reliabilities for Words in Sentences (M = 17.96, SD = 5.59, Maximum Score = 45) and Paired Associates (M = 15.85, SD = 5.44, Maximum Score = 24), using Kuder–Richardson, were .81 and .89 respectively.
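For readers unfamiliar with Kuder–Richardson reliability, the following is a minimal sketch of one common variant, KR-20, for dichotomously scored items. The source says only "Kuder–Richardson", so the choice of KR-20, the use of the population variance of total scores, and the function name `kr20` are all assumptions made for illustration; the toy response matrix is invented and is not the study's data.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 reliability for a list of examinees' 0/1 item scores.

    Assumption: this is the KR-20 variant computed with the population
    variance of total scores; the source does not say which
    Kuder-Richardson formula was used.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")

    # Proportion of examinees answering each item correctly.
    p = [sum(row[j] for row in responses) / n_people for j in range(n_items)]
    sum_pq = sum(pj * (1 - pj) for pj in p)

    # Variance of examinees' total scores (raises ZeroDivisionError below
    # if every examinee has the same total).
    total_var = pvariance([sum(row) for row in responses])

    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)


# Invented toy data: 4 examinees x 3 items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy))
```

Higher values indicate that the items rank examinees more consistently; the coefficients reported above (.81 and .89) come from the authors' own computation, not from this sketch.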
5 Assessment
A written sentence-production test and a written sentence-correction test were chosen to measure the participant’s knowledge about the targeted structure (see Appendix 3). The production task, consisting of four questions in English, enabled learners to engage in using the language to communicate likes or dislikes. The participant had to answer two questions that solicited his/her responses about his/her own likes or dislikes and two questions that solicited his/her responses about the likes or dislikes of someone else. For the correction task, the test presented the participant with 17 sentences: five grammatical and 12 ungrammatical. Grammatical sentences were included in this task so that no clue to the correctness of any sentence was revealed through this test. 10 The participant had to decide if a sentence was ‘grammatical’, ‘not grammatical’, or ‘I don’t know’. If the participant indicated that the sentence was ‘not grammatical’, he or she was asked to correct the sentence so that it would be grammatical. The production task was administered before the correction task to avoid learners gaining exposure to the targeted structure through the correction task and using that knowledge to create sentences during the production task.
Four versions of the production task, differing in the Spanish vocabulary that the participant was required to use, and four versions of the correction task, differing in the Spanish vocabulary used in the sentences and the order of question types, were prepared. None of the instances used in the learning program appeared in the tests, so that participants were tested on their ability to transfer what they had learned to new but parallel situations. Participants were given a maximum of 10 minutes to complete both tasks.
6 Scoring
In scoring the production and correction tasks, the number of grammar concepts that the participant had applied correctly was counted. Two concepts (each with two sub-concepts) were identified for sentences containing the first-person experiencer (e.g. I like) and three concepts (two with sub-concepts) were identified for sentences containing the common/proper noun experiencer (e.g. students like, John likes). These concepts and the points assigned to each of them are given in Appendix 4. A maximum of 10 points was possible for each task, 4 of which were related to sentences containing the first-person experiencer and 6 of which were related to sentences containing a common/proper noun experiencer.
For the production task, the participant’s sentences were checked for the correctness of the application of rules required by the questions. If the application of a rule was required twice and the participant applied it correctly twice, full credit for the rule was assigned. If the participant applied it correctly only once, one-half credit for the rule was assigned. Since one particular rule was required once in the test, if the participant applied it correctly, full credit for such rule was assigned. For the correction task, for each ‘grammatical’ or ‘I do not know’ circled, no points were assigned. For those ‘not grammatical’ circled correctly, the participant’s corrections were checked for accuracy. Since each type of error tested the participant’s knowledge about a rule and occurred at least three times during this task, if the participant was able to correct one type of error correctly twice, he or she was given full credit for the rule. If the participant only corrected the error correctly once, he or she was assigned half-credit for the rule.
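The credit rules described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the assumption that zero correct applications earn no credit is implied rather than stated in the text, and the actual point values per concept (Appendix 4) are not reproduced here, so the functions return the fraction of a rule's points that would be earned.

```python
def production_credit(times_required: int, times_correct: int) -> float:
    """Fraction of a rule's points earned on the production task.

    Assumes (as the text implies) that a rule is required either once
    or twice, and that zero correct applications earn no credit.
    """
    if times_required not in (1, 2):
        raise ValueError("a rule is assumed to be required once or twice")
    if not 0 <= times_correct <= times_required:
        raise ValueError("times_correct must lie between 0 and times_required")
    return times_correct / times_required


def correction_credit(times_corrected: int) -> float:
    """Fraction of a rule's points earned on the correction task.

    Each error type occurs at least three times; two accurate
    corrections earn full credit and one earns half credit.
    """
    if times_corrected < 0:
        raise ValueError("times_corrected cannot be negative")
    if times_corrected >= 2:
        return 1.0
    if times_corrected == 1:
        return 0.5
    return 0.0


print(production_credit(2, 1))  # half credit: rule required twice, applied once
print(correction_credit(3))     # full credit: error corrected at least twice
```

Multiplying each returned fraction by the points assigned to the corresponding concept and summing over concepts would give a task score out of 10.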
7 Procedure
Participants in the instructional groups took the language aptitude tests in two class periods. All participants took the pretest in class and logged into the research website to answer a background survey. One week after the pretest, participants in the instructional groups accessed the learning materials and completed the immediate posttest online. Approximately three weeks after the immediate posttest, all participants took the first delayed posttest in class. Two weeks later they took the second delayed posttest in class.
8 Data analysis method
Less than 5% of the data points of the instructional groups were missing, and Little’s (1988) test of missing completely at random (MCAR) indicated that the missing pattern was MCAR (χ2 = 51.143, df = 50, p = .429). Consequently, a simple mean imputation was performed to replace the missing points.
Latent growth curve analysis (LGCA) (Duncan, Duncan, & Strycker, 2006) was chosen to answer the three research questions for two reasons. First, LGCA is a statistical approach developed to study change or growth patterns; it can provide in-depth analyses of the learning trajectory, such as the analysis of individual variation, which other methods, such as repeated-measures analysis, cannot provide. Second, learners’ performance over time (before and after treatment) is expected to be non-linear (i.e. to follow a curve), which the flexible LGCA can capture. For this study, a dichotomous variable was chosen to represent the two instructional conditions, deductive and explicit-inductive, and served as the predictor variable for the intercept and the slope of the learning trajectory. To model the plausibly non-linear growth curve in the learning process, an extension of LGCA known as the unspecified LGCA (Duncan, Duncan, & Strycker, 2006; Tisak & Meredith, 1990) was conducted to infer the shape of growth from the data. Accordingly, the first two slope parameters were fixed at 0 and 1, and the third and fourth slope parameters were left free. With the change between the first two measurement occasions serving as a benchmark (Hancock & Lawrence, 2006), growth at the latter two measurement occasions was estimated relative to it. The model fit was assessed by four commonly used goodness-of-fit indices: chi-square (χ2), normed fit index (NFI), comparative fit index (CFI), and root mean square error of approximation (RMSEA). The effect size was reported as the standardized regression coefficient (b), which assesses the strength of the relationship between two variables in LGCA after controlling for other variables in the model.
To test the ATI effect, a multi-group LGCA through structural equation modeling, an innovative method for testing the interaction effect according to Baron and Kenny (1986), was used. In this analysis, the sample was split into two groups, high and low in aptitude, at the median factor score of language aptitude, and language aptitude was treated as the moderator of the relationship between learning conditions and learning outcomes. Then, an ordinary LGCA was conducted on the two aptitude groups simultaneously. In the unconstrained model, the effects of learning conditions on the intercept and the slope were allowed to vary across the two groups, whereas in the constrained model, the effects of learning conditions were constrained to be equal across groups. The nested chi-square test statistic was used to compare the fit of these two models. If a better model fit was obtained from the unconstrained model, it would suggest that the effects of learning conditions were moderated by the learners’ language aptitude; in other words, there was an ATI effect.
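The median split that defines the two aptitude groups can be illustrated with a short Python sketch. The learner identifiers and factor scores below are made up purely for illustration, the function name is hypothetical, and the choice to place scores equal to the median in the low group is an assumption, since the text does not say how ties were handled.

```python
from statistics import median


def median_split(factor_scores: dict[str, float]) -> tuple[float, dict[str, list[str]]]:
    """Split learners into 'high' and 'low' groups at the median factor score.

    Assumption: scores equal to the median are placed in the low group.
    """
    cut = median(factor_scores.values())
    groups: dict[str, list[str]] = {"high": [], "low": []}
    for learner, score in factor_scores.items():
        groups["high" if score > cut else "low"].append(learner)
    return cut, groups


# Hypothetical aptitude factor scores (not data from the study).
scores = {"P01": -1.20, "P02": -0.40, "P03": 0.10, "P04": 0.55, "P05": 1.10}
cut, groups = median_split(scores)
print(cut)     # 0.1
print(groups)  # {'high': ['P04', 'P05'], 'low': ['P01', 'P02', 'P03']}
```

Each resulting group would then be fitted with its own growth model in the multi-group analysis.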
V Results
Descriptive analysis
Descriptive statistics are summarized in Table 1. The table shows that the learners’ scores improved considerably from the pretests to the immediate posttests, dropped from the immediate posttests to the first delayed posttests, and increased slightly from the first to the second delayed posttests. This pattern empirically confirms the assumption that the development of learners’ performance over time is non-linear.
Mean scores and standard deviations.
Research question 1
To answer research questions 1 and 2, an LGCA was conducted for the production and the correction tests, respectively (see Table 2 and Figure 1). The deductive condition was coded as 0 and the explicit-inductive condition as 1. In other words, a positive path coefficient would indicate that the explicit-inductive condition produced better results, whereas a negative path coefficient would indicate that the deductive condition was superior. Excellent model fit was achieved for both models; that is, both models provided a good description of the data.
Correlations between language aptitude components and test scores.
Notes. n = 42 students in deductive instruction [above diagonal]; n = 51 students in explicit-inductive instruction [below diagonal]. * p < .05; ** p < .01.

Latent growth curve models for production and correction with learning condition and language aptitude as covariates († p < .10, * p < .05, ** p < .01, *** p < .001).
Note that in an LGCA model, an intercept refers to the value at the beginning of the growth process and a slope refers to the average rate of growth. For the production test, the latent mean of the intercept was 3.31 (p < .001) and the latent mean of the slope was 4.91 (p < .001); this suggests that the learners began with an average production score of 3.31 and gained an average score of 4.91, multiplied by the corresponding coefficient (.00, 1.00, .44, and .53), at each time point. For the correction test, the latent mean of the intercept was 2.13 (p < .001) and the latent mean of the slope was 4.94 (p < .001); this suggests that the learners began with an average correction score of 2.13 and gained an average score of 4.94, multiplied by the corresponding coefficient (.00, 1.00, .59, and .68), at each time point.
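To make the arithmetic concrete, the following Python sketch computes the score implied at each of the four measurement occasions as intercept + slope × coefficient, using the estimates reported above. The function name is hypothetical, and because the models also include covariates, these values are best read as a rough illustration of the trajectory shape rather than as the observed means in Table 1.

```python
def implied_means(intercept: float, slope: float, coefficients: list[float]) -> list[float]:
    """Model-implied mean at each occasion: intercept + slope * coefficient."""
    return [round(intercept + slope * c, 2) for c in coefficients]


# Estimates reported in the text; occasions are pretest, immediate posttest,
# first delayed posttest, and second delayed posttest.
production = implied_means(3.31, 4.91, [0.00, 1.00, 0.44, 0.53])
correction = implied_means(2.13, 4.94, [0.00, 1.00, 0.59, 0.68])

print(production)  # [3.31, 8.22, 5.47, 5.91]
print(correction)  # [2.13, 7.07, 5.04, 5.49]
```

Both trajectories rise sharply at the immediate posttest, fall at the first delayed posttest, and recover slightly at the second, which matches the non-linear pattern described in the descriptive analysis.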
Note that in an LGCA model the variance of the intercept and that of the slope show the extent to which learners had different initial levels and rates of growth over time. For the production test, the variance of the intercept and that of the slope were 3.29 (p < .001) and 4.14 (p < .01), respectively, suggesting a large variation in growth curves among learners. For the correction test, a large variation in growth curves among learners was also evident: 5.13 (p < .001) for the intercept and 7.10 (p < .01) for the slope.
As shown in Figure 1, the learners assigned to the two learning conditions did not differ significantly in the intercepts; this is evidenced by the non-significant path coefficients from the treatment to the intercept, i.e. –.21 for the production test and .20 for the correction test. Further, no statistically significant differences in treatment effects were found between the two learning conditions. This is evidenced by the non-significant path coefficients from the treatment to the slope, namely, .18 for the production test and –.30 for the correction test. To sum up, research question 1 is answered affirmatively.
Research question 2
To find out which indicator variable had the strongest relationship with the latent variable, language aptitude, the standardized factor loadings from language aptitude to three indicator variables were obtained. For the production test, the standardized factor loadings from language aptitude to memory for text, grammatical sensitivity, and associative memory were .76, .61, and .63, respectively. For the correction test, the standardized factor loadings from language aptitude to memory for text, grammatical sensitivity, and associative memory were .81, .60, and .59, respectively. Note that Comrey and Lee (1992) suggest that factor loadings in excess of .71 (50% variance) are considered excellent, .63 (40% variance) very good, .55 (30% variance) good, .45 (20% variance) fair, and .32 (10% variance) poor. Thus, the factor loadings reported above are between excellent (.81) and good (.59). In summary, for both types of tests, memory for text had the strongest relationship with language aptitude.
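The Comrey and Lee (1992) guideline quoted above can be applied mechanically, as in the Python sketch below. The function name is hypothetical, and treating each cutoff as inclusive (a loading of exactly .63 counts as 'very good') is an assumption, since 'in excess of' could also be read as strictly greater.

```python
# Cutoffs and labels as quoted from Comrey and Lee (1992) in the text.
CUTOFFS = [
    (0.71, "excellent"),
    (0.63, "very good"),
    (0.55, "good"),
    (0.45, "fair"),
    (0.32, "poor"),
]


def rate_loading(loading: float) -> str:
    """Label a standardized factor loading (cutoffs assumed inclusive)."""
    magnitude = abs(loading)
    for cutoff, label in CUTOFFS:
        if magnitude >= cutoff:
            return label
    return "below the poor cutoff"


# Standardized loadings reported in the text, in the order:
# memory for text, grammatical sensitivity, associative memory.
production_ratings = [rate_loading(x) for x in (0.76, 0.61, 0.63)]
correction_ratings = [rate_loading(x) for x in (0.81, 0.60, 0.59)]

print(production_ratings)  # ['excellent', 'good', 'very good']
print(correction_ratings)  # ['excellent', 'good', 'good']
```

Under this reading, memory for text is the only indicator rated 'excellent' in both models.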
To interpret the relationship between the latent variable and the three indicator variables, the unstandardized loadings had to be used (see Figure 1). For the production test, the factor loadings of memory for text, grammatical sensitivity, and associative memory were 1.00, .67, and .68, respectively. In other words, for each 1-unit increase in language aptitude, an individual’s score on memory for text increases by 1 unit, whereas his/her scores on grammatical sensitivity and associative memory increase by .67 and .68 of a unit, respectively. For the correction test, the factor loadings of memory for text, grammatical sensitivity, and associative memory were 1.00, .63, and .60, respectively.
As shown in Figure 1, significant language aptitude effects were detected in both models. For the production test, the coefficients from language aptitude to the intercept and from language aptitude to the slope were .11 (p < .05, b = .30) and .10 (p < .10, b = .24), respectively, suggesting that higher aptitude positively affects both the baseline test score and the rate of growth over time. The same conclusions can be drawn for the correction test. For this test, the coefficients from language aptitude to the intercept and from language aptitude to the slope were .13 (p < .01, b = .30) and .20 (p < .10, b = .37), respectively. In summary, language aptitude had a positive effect on both the initial level of performance and the rate of growth. Research question 2 is thus answered affirmatively.
Research question 3
To answer this research question, a multi-group LGCA with a median split on language aptitude (median factor score = –.036) was conducted for both the production and the correction tests. As explained earlier, a better model fit for the unconstrained model than for the constrained model would suggest that an ATI effect was present. The results show that the unconstrained model had a significantly better model fit, as revealed by the nested chi-square test (for the production test, Δχ2 = 11.413, Δdf = 2, p = .003; for the correction test, Δχ2 = 15.386, Δdf = 2, p < .001). In other words, an ATI was detected. Figure 2 shows that a good model fit was achieved for the sub-models of the multi-group LGCAs.
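The p-values of the nested chi-square tests can be checked from the reported Δχ2 and Δdf alone. The sketch below uses only the Python standard library and relies on the closed-form upper-tail probability of the chi-square distribution for even degrees of freedom; the function name is hypothetical, and a statistics library would be needed for odd degrees of freedom.

```python
import math


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """P(chi-square with `df` degrees of freedom > x), for even df only.

    Uses the identity exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    if x < 0:
        raise ValueError("the test statistic cannot be negative")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Chi-square differences reported in the text (constrained vs. unconstrained).
p_production = chi2_upper_tail_even_df(11.413, 2)
p_correction = chi2_upper_tail_even_df(15.386, 2)

print(f"production: p = {p_production:.4f}")
print(f"correction: p = {p_correction:.4f}")
```

With two degrees of freedom the formula reduces to exp(-Δχ2 / 2), giving a p-value of roughly .003 for the production test and below .001 for the correction test.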

Latent growth curve models for production and correction with learning condition as a covariate moderated by language aptitude (†p < .10, *p < .05, **p < .01, ***p < .001).
Note that in these LGCA models, the deductive condition was coded as 0 and the explicit-inductive condition as 1. In other words, a positive path coefficient means that the learners in the explicit-inductive condition performed better, whereas a negative path coefficient means that the learners in the deductive condition did better. The results show that for the learners of low language aptitude in the correction test, the unstandardized path coefficient from the learning condition to the slope was −1.62 (p < .05), indicating that those receiving the deductive instruction had a significantly higher slope, or average rate of growth, than those receiving the explicit-inductive instruction. The standardized regression coefficient b was −.29, revealing a near-medium effect. In summary, learners of low language aptitude learned significantly better under the deductive condition when they were measured by the correction test. When these learners were measured by the sentence production test, those receiving the deductive instruction also had a higher slope (unstandardized path coefficient −.53). However, the difference was not significant. On the other hand, among learners of high language aptitude, those receiving the explicit-inductive instruction had higher slopes than those receiving the deductive instruction (unstandardized path coefficient .13 for the production test and .002 for the correction test). Nevertheless, the differences were not significant. The mean plots in Figure 3, estimated from the LGCAs, provide alternative representations of both the significant and non-significant interaction effects in the form of growth curves. Further, they show that in the case of the significant interaction, the effect appeared at the immediate posttest and endured throughout the two delayed posttests. In conclusion, research question 3 is answered partially in the affirmative.

Latent growth curves for production and correction moderated by language aptitude.
VI Discussion
The main purpose of this study was to discover how individual differences at the level of language aptitude interact with equally explicit learning conditions to affect the success of the learning of grammar rules. Deduction and explicit-induction were chosen for study because they are widely used by L2 instructors to teach grammar rules. Consequently, an understanding of their effects on different types of learners is significant for the instruction of explicit rules.
The results indicate that participants as a whole did not differ significantly in their performances under deductive and explicit-inductive learning conditions (see also, e.g., Abraham, 1985; Rosa & O’Neill, 1999; Shaffer, 1989). These results are consistent with the prediction derived from current evidence from L2 studies, which suggests that grammar learning conditions sharing a comparable level of explicitness will have a similar effect on learning (e.g. de Graaff, 1997; DeKeyser, 1995; Doughty, 1991; Leow, 1998; Reber et al., 1980; Robinson, 1997; Rosa & Leow, 2004; Rosa & O’Neill, 1999; Scott, 1989).
However, to understand the effects of these two explicit learning conditions on different types of learners, our study took into account psychological research on the influence of domain knowledge base and verbal intelligence in learning (Bors & MacLeod, 1996; Chiesi et al., 1979; Kyllonen et al., 1991). The overall results confirm one of our hypotheses. That is, the deductive method, in comparison with the explicit-inductive method, helped learners with low aptitude (i.e. low domain knowledge about grammar and verbal intelligence) better acquire grammar knowledge and retrieve it over time (see Tomlinson & Hunt, 1971). In other words, the deductive method is a better match for this type of learner.
The reasons that the psych verb investigated in the current study is regarded as difficult to acquire by native English-speaking learners (Gascón, 1998; López Jiménez, 2003; Marras & Cadierno, 2008; Montrul, 1997; VanPatten, 1986; Zyzik, 2006) are the ways that it differs from English, the lack of communicative value of its grammatical features (see VanPatten, 1985), and its linguistic complexity (learners have to keep track of several elements when constructing a sentence). In spite of the limitations of the current study, the above findings do suggest that grammatical structures that share similar linguistic complexities with the psych verb investigated here and that require explicit instruction may show a similar sensitivity to deductive and explicit-inductive approaches.
Our results also show that the explicit-inductive method, which presumably can lead to better accessibility of information in memory (Bruner, 1961, 1973), produced slightly better, but not statistically significant, performances for learners of higher language aptitude (see Tomlinson & Hunt, 1971). Although one of our hypotheses is not confirmed, the trend toward a differential effect of instructional methods is intriguing and calls for further research. It is plausible that finding statistically significant differential effects of instructional conditions requires the right combination of learning conditions and learner characteristics. While the learning conditions used in the present study allow us to detect significant interaction effects for the low-aptitude learners, they may need to be adjusted for the higher-aptitude learners. For such learners, a more ‘challenging’ learning environment is probably needed to uncover their sensitivity to instructional approaches. This may be accomplished by decreasing the degree of explicitness of the learning conditions under investigation, by increasing the complexity of the target grammar point (which may involve teaching additional grammar points typically taught at a more advanced level), or by increasing the degree of challenge of the measurements.
VII Conclusions
Although most researchers and teachers agree that learners are not homogeneous, many L2 research studies to date have focused on determining whether one instructional approach is superior to another; very few studies have investigated whether different learners benefit from different instructional approaches (exceptions include Abraham, 1985; DeKeyser, 1993; Gallegos, 1968; Hauptman, 1971; Nation & McLaughlin, 1986; Wesche, 1981; Zampogna et al., 1976). Our findings indicate that different but equally explicit instructional approaches can have differential effects on the learning performance of learners at different aptitude levels. In view of these findings and the existing evidence concerning the superiority of explicit types of instruction over implicit types, it is necessary for L2 educators to investigate the effects of equally valid grammar instructional approaches on different types of learners, both to maximize learners’ potential for success and to account for variations in language learning success under different instructional conditions (see DeKeyser, 2009; Robinson, 2001). Learner characteristics that may deserve their attention are domain knowledge base and verbal intelligence. However, to uncover the point at which learners become sensitive to different instructional approaches, researchers may need to examine different combinations of instructional approaches and learner characteristics. The present study has underscored the importance of understanding the interaction between learner characteristics and instructional conditions. Such knowledge is surely indispensable to the design of learning activities in textbooks or electronic materials.
Footnotes
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Funding
This study was supported in part by the Charles Phelps Taft Research Center.
