The long-term effect of explicit instruction on learners’ knowledge on English articles

Abstract

This study examines the role of explicit instruction on article semantics for L2 learners of English. Two types of generic sentences, expressed by different articles, were tested over time. An instruction group (n = 21), a control group (n = 16) and a native English speaker control group (n = 9) participated in the study. The instruction group received nine 60-minute lessons across 9 weeks. Both learner groups took a pre-test before instruction began, followed by four post-tests. The results from the delayed post-tests show that the instruction group improved, but after one year little knowledge was retained. The findings suggest that explicit knowledge of articles is unlikely to be retained unless ongoing instruction is provided.

Keywords

articles, explicit knowledge, generics, instruction, L2 learner

I Introduction

Many second language (L2) learners of English find functional morphology such as articles (the and a) and plural -s difficult to acquire, especially when the equivalent morphology is missing in the first language (L1). Snape (2013) has shown that Japanese speakers, even at advanced levels of English proficiency, invariably find that concepts like genericity make the article system difficult to acquire and ultimately master. Part of the difficulty with articles is the asymmetric pattern of generics, where the definite singular (the) is used when an NP denotes a ‘kind’ (subject NPs that refer to an individual or group of individuals) and the indefinite singular (a) is favoured to provide a general description of something or someone. In the generative approach to SLA (GenSLA), some properties of language are held to be deducible through exposure to primary linguistic data (PLD), as many studies have shown (e.g. Rothman, 2008). Nonetheless, some properties of language may be too complex, or the PLD may simply contain too few exemplars, for L2 learners to extract and internalize them. Even if PLD is limited, Ionin, Montrul, Kim, and Philippov (2011) argue that if Universal Grammar is available to adult L2 learners, post-critical period, in terms of the syntax–semantics interface, PLD should help facilitate participants’ mapping of a [+kind] or [+species] feature onto the definite singular generic. It is likely to take learners a great deal of time to incorporate the different functions of articles and, as VanPatten and Rothman (2015) point out, hundreds if not thousands of exemplars need to be present in the PLD for a form to become robustly represented. The definite singular generic is a good test case because it is not a highly frequent form in the input. One way to enhance PLD is through instruction in a classroom setting, which can provide positive and negative feedback on definite singulars as generics. Perhaps explicit instruction could jump-start the learning process for those learners who do not have the benefit of positive L1 transfer effects, as discussed in Section II below.

The effectiveness of explicit instruction has been of interest to second language acquisition researchers since Krashen (1981) proposed that there is a distinction between implicit and explicit learning, and that there is ‘no interface’ between the knowledge resulting from the two types of learning. Schwartz (1993) adopts Krashen’s ‘no interface’ view, arguing that unconscious L2 linguistic knowledge is not affected by explicit instruction or negative evidence; what can affect underlying unconscious L2 linguistic knowledge is PLD, just as it does for child L1 learners. In other words, the ‘no interface’ view posits that learned linguistic (explicit) knowledge does not become acquired (implicit) knowledge. Recently, the role of explicit instruction in L2 acquisition has been gaining some attention within the GenSLA research paradigm, due mainly to work by Whong (2011) and Whong, Gil, and Marsden (2013). Whong, Gil, and Marsden (2013) suggest that L2 learners could benefit from more linguistically informed generalizations.

The aim of the current study is to see whether Japanese-speaking L2 learners of English can learn to distinguish the definite singular from the indefinite singular in generic contexts when they are taught descriptions of English articles grounded in linguistic research. Bare plurals can be used for both types of generics, and the general observation from studies by Ionin et al. (2011) and Snape (2013) is that these are relatively unproblematic for L2 learners, possibly due to their higher frequency in the input. The descriptions we use to teach Japanese speakers are taken from Krifka et al.’s (1995) generics framework. According to Krifka et al., the subject NP can be definite or bare plural when one refers to a ‘well-established kind’ or a ‘natural kind’. A natural kind combined with a special type of verbal construction, a ‘kind predicate’ (in italics in examples (1a) and (1b)), can refer to an entire species. These are termed NP-level generics.

(1) a. The pelican is protected as a species.

b. Pelicans are protected as a species.

Conversely, the subject NP in (2a) and (2b) can be indefinite or bare plural for general statements or descriptions. These are referred to as sentence-level generics.

(2) a. A coat is necessary in winter.

b. Coats are necessary in winter.

The two types of generics (definite and indefinite) in examples (1) and (2) are mutually exclusive. The sentence * A pelican is protected as a species is ungrammatical (hence the symbol *), as one cannot refer to the entire species of pelicans with an indefinite article. The sentence # The coat is necessary in winter implies that one is referring to a specific coat; the symbol # marks it as pragmatically odd, as it is not a generic description of a coat.

In Japanese there is no contrast between NP-level genericity and sentence-level genericity as Japanese is an article-less language with no number distinction between singular generics and plural generics (Kuroda, 1992).

This study investigates whether L2 learners can benefit from instruction on English articles that is informed by linguistic analyses rather than the rules of thumb that are typically taught to L2 learners in classrooms. In addition, we aim to answer the question of whether the effect of this instruction is still noticeable 15 months after instruction began.

The article is organized as follows: Section II provides an overview of studies that have investigated the acquisition of genericity by L2 learners of English. Section III focuses on previous intervention studies where groups of L2 learners have participated in instruction sessions to help them learn the English article system. Section IV gives details about our intervention study. Section V presents the discussion and concludes the article.

II L2 studies of genericity

Ionin et al. (2011) tested Russian speakers’ and Korean speakers’ understanding of NP-level and sentence-level generics in L2 English. Since Russian and Korean lack definite and indefinite articles, Russian- and Korean-speaking L2 learners of English may be unaware of how NP-level and sentence-level genericity operates, as illustrated in examples (1) and (2) above. To find out if learners could make distinctions between the two types of generics, an acceptability judgement task (AJT) was administered. Participants were instructed to read a series of short contexts and then rate five possible continuation sentences for each context. A rating of 4 means the sentence is completely acceptable; a rating of 1 means the sentence is completely unacceptable. Ionin et al. (2011) found that both the Russian and Korean L2 learners gave low ratings to the definite singular for NP-level generics (2.5 or less), but were more target-like in their ratings of the indefinite singular for sentence-level generics (3 or above). The low acceptance of definite singulars for NP-level generics indicates that these learners had difficulty accepting the definite singular as a marker of genericity.

In a replication study, Snape (2013) tested adult Japanese speakers and Spanish speakers of L2 English. The findings clearly showed that for the Spanish speakers, the mean rating for the definite singular was higher (2.7 out of 4) than for the indefinite singular (1.5 out of 4) for NP-level generics, a pattern attributed to L1 transfer effects. However, for the Japanese speakers, the mean rating for the definite singular was lower (2.0 out of 4) and close to the ratings for the indefinite singular (2.2 out of 4) and the ungrammatical bare noun (2.0 out of 4). These results are consistent with the findings in the Ionin et al. (2011) study as the Russian and Korean speakers performed much like the Japanese speakers in their rating of the definite singular. The findings from Ionin et al. and Snape collectively demonstrate that L2 learners are target-like in their interpretation of sentence-level generics expressed with the indefinite article and bare plural, but are less accurate in interpreting the definite singular as an NP-level generic.

III Previous intervention studies

The effectiveness of explicit instruction has generated a lot of interest over the last three decades of L2 acquisition research. Master (1994) tested the effectiveness of explicit instruction with high-intermediate and low-advanced learners of English with varying L1 backgrounds.¹ The participants were divided into an instruction group and a control group. The test was a forced-choice task in which the learners were asked to choose a, the, or no article; it was administered to both groups as a pre-test before the instruction period began. After the pre-test, the instruction group received nine weeks of instruction on English articles, focusing on aspects such as the singular–plural, definite–indefinite, and specific–generic distinctions. A week after the end of the instruction period, the post-test was administered to both groups. There were significant differences between the pre-test and the post-test for the instruction group, but not for the control group. Master concluded that instruction on English articles, if it is done in a systematic way, is effective. However, because the post-test was administered only a week after the instruction period ended, it is not clear whether the positive effect found in the post-test reflected learners’ underlying linguistic knowledge or their explicit knowledge. Master argued that instruction is useful and that the article system can be learned, but did not address whether explicit knowledge can be retained over a long period of time (see also Bitchener & Knoch, 2008).

Therefore, what needs to be addressed in the case of articles is whether explicit knowledge can be retained over the long term, beyond weeks or a few months. The goal of most intervention studies is to assess the improvement that learners demonstrate at a post-test after instruction has been given, compared to the results from a pre-test administered before instruction. One of the problems with many intervention studies of this type is that post-tests are typically administered shortly after the instruction period has ended. However, as Schwartz and Gubala-Ryzak (1992) point out, it is not clear whether results from post-tests administered shortly after the instruction period demonstrate that learners’ underlying linguistic knowledge has changed as a result of explicit instruction, or whether learners are simply using explicit knowledge learned from instruction when they take a post-test.

White’s (1991) intervention study on adverb placement tested whether instruction could have a long-term effect. White’s participants showed improvement in their performance and judgements on adverb placement in the post-test administered five weeks after instruction, but when they were tested again a year later, their performance mirrored the results from the pre-test. Based on White’s post-test results one year after instruction, Schwartz and Gubala-Ryzak (1992) argued that the improvement shown was a manifestation of participants using explicit knowledge gained as a result of the explicit instruction they had received. One year after they received instruction, White’s learners no longer retained their explicit knowledge.

Snape and Yusa’s (2013) study addresses whether explicit instruction, based on Krifka et al.’s (1995) generics framework and Ionin et al.’s (2004) definitions of definiteness and specificity, can help L2 learners develop their explicit metalinguistic knowledge regarding genericity, definiteness and specificity in English. The participants were Japanese speakers at a high-intermediate level of English proficiency. They were divided into an instruction group and a control group. The instruction period was three weeks, consisting of explicit instruction on definiteness, specificity and genericity. The instruction group received three lessons in total, once a week for 70 minutes. In the first lesson, the participants worked in pairs to construct their own dialogues where they had to choose the appropriate article for each dialogue. The second lesson focused on the perception of articles in sentences and the third lesson allowed time for the participants to create their own generic sentences. The instruction group and the control group took a pre-test, a post-test and a delayed post-test, which was administered two weeks after the instruction period ended. The results of the pre-test and both post-tests showed that both the instruction group and the control group performed well on sentence-level generics in their interpretations of the indefinite singular and bare plural as generics. However, both groups were non-target-like with the definite singular used as an NP-level generic, and the instruction group continued to be non-target-like after the instruction period. Thus, the authors concluded that the explicit instruction given in their study was ineffective with regard to the definite singular generic contexts.

The present research follows up on Snape and Yusa’s study by changing a few aspects of its design. Snape and Yusa speculated that their instruction was ineffective because the instruction period was too short. Only one 70-minute session was dedicated to English generics and one 70-minute session to definiteness and specificity. They also note that the content of the instruction might have been too complex to grasp in English. To examine the effectiveness of instruction based on linguistic descriptions, we conducted a new study with a longer instruction period and with instruction given in the learners’ native language, Japanese, thereby testing Snape and Yusa’s explanations for the ineffectiveness of explicit instruction in their study.

IV Study

1 Participants

Participants in our study were recruited at a university in Japan. In total, 37 learners agreed to participate in our study. Twenty-one participants were placed in an instruction group and 16 were placed in a control group.² The participants were all female university students, and they were native speakers of Japanese majoring in English at the time of testing. All participants continued to attend EFL and content classes (6–8 hours per week) during our study. Their English proficiency was between high-intermediate and advanced levels. Their scores on the Test of English for International Communication (TOEIC) at pre-test and post-test 4 are summarized in Table 1. An independent-samples t-test showed that there was no significant difference between the instruction group and the control group (t = 0.608, p = .546) in their level of English proficiency when the pre-test was administered.

Table 1.

Means and ranges of Test of English for International Communication (TOEIC) scores of the L2 learners.

	Mean	Range
Instruction group:
Pre-test (n = 21)	720.7	590–945
Post-test 4 (n = 14)	757.0	610–905
Control (no instruction) group:
Pre-test (n = 16)	740.9	590–935
Post-test 4 (n = 8)	802.0	700–980

In addition to the learner groups, nine native speakers of British English participated as native controls. All participants received remuneration for the pre-test and the four post-tests, and the instruction group participants received payment for every instruction session they attended.

2 Acceptability judgement task

Table 2 lists the 10 types of items featured in the AJT. The pre-test and post-tests are based on Ionin et al.’s (2004, 2011) items. Example (3) is a distractor. The only acceptable continuation for the definite second mention singular context is (3a).

Table 2.

Types and number of tokens in the acceptability judgement task (AJT).

Second mention (distractors)	Definite singular	n = 4
Second mention (distractors)	Definite plural	n = 4
Kinds (NP-level generics)	Definite singular	n = 12
Kinds (NP-level generics)	Bare plural	n = 12
General (sentence-level generics)	Indefinite singular	n = 12
General (sentence-level generics)	Bare plural	n = 12
Definiteness and specificity³	Definite, specific	n = 4
	Definite, nonspecific	n = 4
	Indefinite, specific	n = 4
	Indefinite, nonspecific	n = 4

(3)	Distractor condition: Definite second mention (non-generic)
	Yuka always eats healthy lunches. She loves fruit. Today she plans to eat one red apple and two bananas. But, …
a.	the red apple looks bad.	1	2	3	4
b.	red apple looks bad.	1	2	3	4
c.	red apples look bad.	1	2	3	4
d.	the red apples look bad.	1	2	3	4
e.	a red apple looks bad.	1	2	3	4

The example in (4) is an NP-level generic. There are two acceptable continuations, the definite singular (4a) and the bare plural (4b). (4c) and (4d) should receive low ratings as they do not have a kind interpretation. (4e) is completely unacceptable and ungrammatical because the noun is bare, with no article, and thus should receive the lowest rating.

(4)	Test condition: NP-level generic
	I know that you like birds. Well, if you ever visit California, you’ll see different kinds of birds there. For example, I found out ….
a.	the pelican lives on the California coast.	1	2	3	4
b.	pelicans live on the California coast.	1	2	3	4
c.	a pelican lives on the California coast.	1	2	3	4
d.	the pelicans live on the California coast.	1	2	3	4
e.	pelican lives on the California coast.	1	2	3	4

Similar short contexts were created for the sentence-level generics. Two possible acceptable continuations for the context in (5) are the indefinite singular (5a) and bare plural (5b). (5d) and (5e) should receive low ratings as they do not have a sentence-level interpretation. (5c) is completely unacceptable and ungrammatical as there is no article and like (4e) should receive the lowest rating.

(5)	Test condition: Sentence-level generic
	I want to go skiing in December. I heard Northern Japan is popular. Of course, it is very cold. Everyone knows, for instance, ……
a.	a coat is necessary in winter.	1	2	3	4
b.	coats are necessary in winter.	1	2	3	4
c.	coat is necessary in winter.	1	2	3	4
d.	the coat is necessary in winter.	1	2	3	4
e.	the coats are necessary in winter.	1	2	3	4

Two versions of the AJT were created. One version was administered for the pre-test and a different version for post-test 1. For the following three post-tests we employed the same pattern, alternating between the two versions.⁴

3 Research questions

Our research questions are as follows:

Research question 1: Can the instruction group gain knowledge of NP-level generics through instruction and practice of the definite singular and bare plural?

Research question 2: Does the instruction group exhibit target-like knowledge of the indefinite singular and bare plural for sentence-level generics before instruction begins? If not, can instruction and practice of the indefinite singular and bare plural foster this knowledge?

Research question 3: Is the effect (if any) of the instruction durable?

The first research question directly addresses whether L2 learners can improve through instruction on the definite singular and bare plural for NP-level generics. The second research question was formulated because the instruction group may well have a good understanding of the indefinite singular and bare plural since their function as a sentence-level generic is somewhat like a non-specific use. Nevertheless, it is possible that when we compare the results of ‘before and after’ instruction we may see instruction effects in terms of higher ratings for indefinite singulars (and bare plurals). The third research question, like the title of our article, directly addresses whether the explicit instruction has a long-term effect.

4 Procedure

Instruction sessions and pre- and post-tests were administered following the schedule summarized in Table 3. All participants in the instruction group took the pre-test a day before the instruction session started.⁵

Table 3.

Schedule for tests and instruction.

Week 1	Pre-test: Instruction (n = 21) and Control (n = 16) groups
Weeks 1–3	Instruction (generics)
Week 3	Post-test 1: Instruction (n = 21) and Control (n = 16) groups
Weeks 4–7	Instruction (definiteness and specificity)
Weeks 8–9	Review (generics, definiteness and specificity)
Week 10	Post-test 2: Instruction (n = 21) and Control (n = 16) groups
12 weeks after post-test 2	Post-test 3: Instruction (n = 19) and Control (n = 15) groups
One year after post-test 3	Post-test 4: Instruction (n = 14) and Control (n = 8) groups

During the nine instruction sessions, four participants were absent once and two participants were absent twice. In the first three weeks of the instruction period, generics were taught. In the following weeks, the instructor focused on definiteness and specificity. In the final two weeks, the instructor provided a review. Post-test 4 was administered one year after the pre-test. Fourteen participants from the instruction group and eight participants from the control group completed post-test 4. Some participants were unable to take part as they had either already graduated or were unavailable.

5 Instruction

Instruction followed previous studies (e.g. White, 1991) that set out to focus on a particular area of grammar through explicit instruction. A native speaker of Japanese gave weekly instruction sessions in Japanese and English. The instruction was offered in Japanese as Snape and Yusa (2013) speculated that instruction in English may not have been completely successful due to a difficulty in understanding challenging concepts like definiteness, specificity and genericity.⁶ Sessions were held once a week over the course of 9 weeks, and each session lasted 60 minutes. Instruction consisted mainly of metalinguistic explanations. The metalinguistic explanation regarding genericity in English was divided into two types: NP-level generics expressed with the definite singular and bare plural, and sentence-level generics expressed with the indefinite singular and bare plural. For the NP-level generics, the participants were told that subjects that refer to an individual or a group of individuals can be classified as natural kinds, well-established kinds and kind predicates, as shown in (6–8). A singular noun with the definite article, the, can express this type of genericity. In addition, they were told that the bare plural is also acceptable with this type of generic.

(6) Natural kind:

The lion lives in Africa.

Lions live in Africa.

(7) Well-established kind:

The Coca-Cola bottle has a narrow neck.

Coca-Cola bottles have a narrow neck.

(8) Kind predicate:

The cell phone was invented in 1973 by Martin Cooper.

Cell phones were invented in 1973 by Martin Cooper.

As for sentence-level generics, the participants were told that a singular noun in subject position with the indefinite article, a(n), can be interpreted as a general description if the sentence describes a general property of an entity which the noun is referring to, as in (9–11). Furthermore, just like NP-level generics, the participants were told that the bare plural is also acceptable for this type of generic.

(9) Natural kind as a general statement:

A potato contains vitamin C, amino acids, protein and thiamine.

Potatoes contain vitamin C, amino acids, protein and thiamine.

(10) Non-well-established kind:

A chocolate bar is enjoyed by many people.

Chocolate bars are enjoyed by many people.

(11) Non-kind-predicates:

A cell phone is expensive in Japan.

Cell phones are expensive in Japan.

In addition, the participants received negative feedback from the instructor about non-well-established kinds, non-kind-predicates and bare nouns; the use of a definite singular in (12) is inappropriate as there is no generic interpretation.

(12) # The chocolate bar is enjoyed by many people.

Bare singular count nouns, exemplified in (13), are ungrammatical.

(13) * Lion lives in Africa.

The instruction sessions included time for participants to work by themselves, in pairs or small groups. For practice on the generic use of articles, the participants were given pictures and asked to create generic sentences of the type they had received instruction on. For example, participants were provided with a list of kind predicates and pictures of objects and animals that are used for NP-level generics. They were instructed to produce their own sentences by using a kind predicate with a picture, e.g. The Oreo cookie is sold throughout many countries. Other exercises included fill-in-the-blank texts where the articles were removed and participants had to work out which article (the or a) was appropriate or they had to decide which noun phrase best suited the context.⁷

6 Analysis of results

The analysis of the results is based on the mean rating for each item type from the AJT for the pre- and post-tests. For example, if a participant rated four definite singular NP-level generic items as 2, 1, 3 and 3, the mean rating for those items would be 2.25 (2.3 when rounded to one decimal place). Once the mean ratings were calculated for the different types of items (see Table 2), we ran several non-parametric statistical tests as the assumptions for parametric tests were not met. Comparisons between the native control (NS) group and the two L2 learner groups were performed using Kruskal–Wallis tests, Mann–Whitney U tests and two Friedman tests. Wilcoxon signed-rank tests were performed for within-instruction group comparisons. The NS group performed as expected across the target items in the AJT. Their ratings match the NS ratings reported in Ionin et al. (2011) and Snape (2013). The NS responses thus support the validity of the AJT.
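The mean-rating step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the condition labels and the ratings are hypothetical, not data from the study, and the software actually used for the analysis is not stated here.

```python
from statistics import mean

# Hypothetical ratings (1-4 scale) from one participant.
# Labels and values are illustrative only, NOT data from the study.
ratings = {
    "NP-level / definite singular": [2, 1, 3, 3],
    "NP-level / bare plural": [4, 3, 4, 4],
}


def mean_ratings(ratings_by_condition):
    """Return the mean rating for each item type."""
    return {cond: mean(vals) for cond, vals in ratings_by_condition.items()}


means = mean_ratings(ratings)
for cond, value in means.items():
    # Two decimals are printed so that rounding to one decimal
    # (as in the tables) remains a separate, visible step.
    print(f"{cond}: {value:.2f}")
```

Per-participant means of this kind would then serve as the input to the non-parametric tests.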

7 Results

The results in Figure 1 from all the tests show the average of correct judgements for acceptable sentences by each participant group for the definite second mention conditions. The results from Kruskal–Wallis tests show that the NS group and the L2 learner groups are not statistically different from each other on definite singulars (χ² = 2.066, p = .356) and there is no difference in ratings of indefinite singulars (χ² = 2.162, p = .339), the inappropriate article for the context. These results are important to note as they demonstrate that the L2 learners have a basic understanding of the English article system as ratings are high for the definite singular in second mention contexts (a rating of 3.9 or above) and low for the indefinite singular (2.2 or lower).

Figure 1.

Mean ratings (scale = 1–4) for definite second mention singular.

Figure 2 and Table 4 show the pre-test results from the NP-level generic contexts from the three groups. To simplify the results reported in this paper, only the ratings for definite singulars, indefinite singulars and bare plurals are presented in the figures reporting the results of the generic conditions.

Figure 2.

Mean ratings (scale = 1–4) for pre-test results of NP-level generics.

Table 4.

Mean ratings (scale = 1–4) and standard deviations (in brackets) for pre-test results of NP-level generics.

	def sing	*indef sing	bare pl
Native Controls	3.3 (0.25)	2.0 (0.11)	3.9 (0.21)
Instruction group	2.1 (0.24)	2.1 (0.08)	3.4 (0.23)
Control (no instruction) group	2.5 (0.31)	2.2 (0.16)	3.4 (0.43)

Recall that for the NP-level generics, high ratings are predicted for definite singulars and bare plurals. A Kruskal–Wallis test showed the three groups differed significantly on definite singulars⁸ (χ² = 17.212, df = 2, p = .0001) and bare plurals (χ² = 10.426, p = .005), but there was no difference for indefinite singulars (χ² = 0.955, p = .620). To find out where the differences lie, Mann–Whitney U tests were performed. The results show that there are no significant differences between the instruction and no instruction groups for definite singulars and bare plurals (def sing: Z = −1.370, p = .171; bare plural: Z = −.401, p = .689). Figure 2 shows that the L2 groups are already highly accurate with bare plurals (above 3.0) but are less accurate with definite singulars (2.5 or lower).⁹ The L2 groups rated the definite singular and indefinite singular almost equally as NP-level generics. For bare plurals, there were significant differences between the NS group and both L2 groups, but despite these differences, both the instruction and no instruction groups gave bare plurals a mean rating of 3.4 on the AJT.
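For readers unfamiliar with the Kruskal–Wallis test, the following Python sketch shows how the statistic can be computed from per-participant mean ratings. It is a minimal illustration using the standard tie-corrected formula, not the authors' analysis script; the function names and the three small samples are hypothetical. With three groups the test has df = 2, in which case the chi-square upper-tail probability reduces to exp(−H/2), so no statistics library is needed.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values 1..N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H for two or more independent groups."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


def p_value_df2(h):
    """Upper-tail chi-square probability; valid only for df = 2 (three groups)."""
    return math.exp(-h / 2)


# Hypothetical per-participant mean ratings (1-4 scale), NOT the study's data.
native = [3.5, 3.2, 3.4, 3.0]
instruction = [2.0, 2.3, 1.9, 2.2]
control = [2.6, 2.4, 2.7, 2.1]

h = kruskal_wallis_h(native, instruction, control)
print(f"H = {h:.3f}, p = {p_value_df2(h):.4f}")

# The df = 2 shortcut applied to a statistic reported in the text.
print(f"p for chi-square 10.426: {p_value_df2(10.426):.3f}")
```

A significant result only indicates that at least one group differs, which is why pairwise Mann–Whitney U tests are used as a follow-up.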

Figure 3 and Table 5 show the pre-test results from sentence-level generic contexts. For the sentence-level generics, high ratings were expected for indefinite singulars and bare plurals. The NS group rated indefinite singulars and bare plurals highly (3.6 and 3.9 respectively). The L2 groups rated indefinite singulars higher than definite singulars, though the distinction between them was not as clear as for the NS group. A Kruskal–Wallis test showed that there was a significant difference among the three groups for all three choices (indef sing: χ² = 9.513, p = .009, bare plural: χ² = 16.327, p = .0001, def sing: χ² = 6.110, p = .047). Mann–Whitney U tests showed that there were no significant differences between the instruction and no instruction groups for indefinite singulars, bare plurals and inappropriate definite singulars (indef sing: Z = −.215, p = .830; bare plural: Z = −.646, p = .518; def sing: Z = −1.612, p = .107). For bare plurals, significant differences were found between the NS group and the L2 groups, but as with the bare plurals for NP-level generics, the L2 groups rated them highly (3.3 or above).

Figure 3.

Mean ratings (scale = 1–4) for pre-test results of sentence-level generics.

Table 5.

Mean ratings (scale = 1–4) and standard deviations (in brackets) for pre-test results of sentence-level generics.

	indef sing	bare pl	#def sing
Native Controls	3.6 (0.31)	3.9 (0.03)	2.7 (0.25)
Instruction group	2.7 (0.17)	3.4 (0.15)	2.1 (0.24)
Control group	2.7 (0.10)	3.3 (0.09)	2.5 (0.32)

Figure 4 and Table 6 present the pre-test and four post-test results for NP-level generics. After three weeks of instruction on generic expressions in English, in post-test 1, the instruction group showed improvement in their ratings of definite singulars for NP-level generics, indefinite singulars for sentence-level generics and bare plurals for both types of generic conditions, while, as expected, there were no clear changes in the ratings for the control (no instruction) group. However, target-like ratings by the instruction group gradually decreased across the post-tests, and by post-test 4, which was administered a year later, their ratings had regressed towards the pre-test ratings and away from the post-test 1 ratings. A Friedman test showed that the ratings given by the instruction group for definite singulars, indefinite singulars and bare plurals in the NP-level contexts differed significantly across the pre-test and four post-tests (p = .0001 in each case). Wilcoxon signed-rank tests were performed to identify where the differences occur. Table 7 provides the results. Figure 5 provides the effect sizes for comparisons between pre- and post-tests.¹⁰

Figure 4.

Mean ratings (scale = 1–4) for pre-test and post-test results of NP-level generics.

Table 6.

Mean ratings (scale = 1–4) and standard deviations (in brackets) for pre-test and post-test results of NP-level generics.

	Instruction			Control
	def sing	*indef sing	bare pl	def sing	*indef sing	bare pl
Pre-test	2.1 (0.24)	2.1 (0.08)	3.4 (0.23)	2.5 (0.31)	2.2 (0.16)	3.4 (0.43)
Post 1	3.2 (0.11)	2.8 (0.13)	3.8 (0.13)	2.1 (0.13)	2.2 (0.10)	3.3 (0.27)
Post 2	2.9 (0.35)	2.5 (0.13)	3.8 (0.14)	2.0 (0.13)	1.9 (0.28)	3.3 (0.35)
Post 3	2.8 (0.20)	2.3 (0.15)	3.7 (0.19)	2.2 (0.24)	2.1 (0.17)	3.0 (0.36)
Post 4	2.5 (0.32)	2.1 (0.28)	3.6 (0.38)	2.1 (0.25)	2.1 (0.22)	3.0 (0.34)

Note. Target choices: definite singular and bare plural.

Table 7.

Instruction group: Wilcoxon signed-rank test results for NP-level generic contexts.

Tests	def sing	*indef sing	bare pl
Pre-test – Post 1	p = .0001, Z = −3.848	p = .007, Z = −2.696	p = .005, Z = −2.791
Pre-test – Post 2	p = .003, Z = −3.019	p = .037, Z = −2.008	p = .006, Z = −2.749
Pre-test – Post 3	p = .002, Z = −3.045	p = .626, Z = −0.487	p = .122, Z = −1.548
Pre-test – Post 4	p = .074, Z = −1.785	p = .706, Z = −0.377	p = .314, Z = −1.006
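
As a rough consistency check on Table 7, a two-tailed p value can be recovered from each reported Z under the normal approximation. The sketch below is illustrative only: the function name is hypothetical, and it assumes the reported p values are asymptotic two-tailed values, which the text does not state explicitly.

```python
from statistics import NormalDist


def two_tailed_p(z: float) -> float:
    """Two-tailed p value for a standard-normal test statistic z."""
    return 2.0 * NormalDist().cdf(-abs(z))


# Z values reported in Table 7 for the definite singular column.
reported = [("Pre-test vs Post 1", -3.848), ("Pre-test vs Post 4", -1.785)]
for label, z in reported:
    print(f"{label}: Z = {z}, two-tailed p = {two_tailed_p(z):.4f}")
```

Values computed this way can be compared with the p column of the table; small discrepancies would be expected if an exact rather than asymptotic test had been used.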

Figure 5.

Instruction group: Effect sizes (r) for NP-level generics across all tests.

The findings indicate that instruction was effective up until post-test 3 for the definite singular and post-test 2 for the bare plural generics. The ratings for indefinite singulars, the ungrammatical choice, initially increased at post-test 1 (mean of 2.8), but remained lower than those for the definite singulars (mean of 3.2). No significant differences were found for indefinite singulars between pre-test and post-test 3 or between pre-test and post-test 4, and the effect sizes were small. A Wilcoxon signed-rank test comparing definite singulars and indefinite singulars at post-test 4 revealed no significant difference (Z = −1.262, p = .207). For the definite singular choice, the instruction group continued to show the effects of instruction up to post-test 3; however, in post-test 4, the effect size is between medium and low.
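
Effect sizes of the kind plotted in Figure 5 are commonly derived from a Wilcoxon Z as r = |Z| / √N. The following is a minimal sketch of that conversion, not necessarily the authors' procedure: the function names are hypothetical, and the choice of N = 42 (21 participants × 2 tests per comparison) is an assumption, since conventions differ on whether N counts pairs or total observations.

```python
import math


def effect_size_r(z: float, n: int) -> float:
    """Return r = |Z| / sqrt(N) for a test statistic Z based on N observations."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return abs(z) / math.sqrt(n)


def describe_r(r: float) -> str:
    """Label r with Cohen's conventional benchmarks (.10, .30, .50)."""
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"


# Assumption: N = 42, i.e. 21 participants x 2 tests per comparison.
N_OBSERVATIONS = 42
for label, z in [("Pre-test vs Post 1", -3.848), ("Pre-test vs Post 4", -1.785)]:
    r = effect_size_r(z, N_OBSERVATIONS)
    print(f"{label}: r = {r:.2f} ({describe_r(r)})")
```

Using the number of pairs (N = 21) instead would yield larger r values for the same Z, so the convention should be stated whenever such effect sizes are reported.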

Figure 6 and Table 8 provide the pre-test and four post-test results for sentence-level generic contexts. Friedman tests showed that ratings for indefinite singulars and bare plurals in the sentence-level generic contexts differed significantly across the pre-test and four post-tests (p = .0001), whereas no significant difference was found for definite singulars (p = .080). Wilcoxon signed-rank test results along with effect sizes are given in Table 9 and Figure 7.

Figure 6.

Mean ratings (scale = 1–4) for pre-test and post-test results of sentence-level generics.

Table 8.

Mean ratings (scale = 1–4) and standard deviations (in brackets) for pre-test and post-test results of sentence-level generics.

	Instruction			Control
	indef sing	#def sing	bare pl	indef sing	#def sing	bare pl
Pre-test	2.7 (0.17)	2.1 (0.12)	3.4 (0.15)	2.7 (0.10)	2.5 (0.08)	3.3 (0.09)
Post 1	3.5 (0.03)	2.3 (0.27)	3.9 (0.07)	2.8 (0.02)	2.1 (0.19)	3.3 (0.04)
Post 2	3.3 (0.09)	2.0 (0.16)	3.8 (0.12)	2.7 (0.17)	1.9 (0.11)	3.4 (0.05)
Post 3	3.0 (0.18)	2.1 (0.15)	3.9 (0.07)	2.8 (0.13)	2.1 (0.10)	3.0 (0.09)
Post 4	3.0 (0.08)	2.1 (0.02)	3.7 (0.09)	2.6 (0.09)	1.8 (0.13)	3.2 (0.19)

Note. Target choices: indefinite singular and bare plural.

Table 9.

Instruction group: Wilcoxon signed-rank test results for sentence-level generic contexts.

Tests	indef sing	#def sing	bare pl
Pre-test – Post 1	p = .002, Z = −3.102	p = .135, Z = −1.496	p = .001, Z = −3.323
Pre-test – Post 2	p = .001, Z = −3.282	p = .654, Z = −0.448	p = .001, Z = −3.476
Pre-test – Post 3	p = .281, Z = −1.078	p = .702, Z = −0.383	p = .005, Z = −2.824
Pre-test – Post 4	p = .209, Z = −1.256	p = .310, Z = −1.015	p = .014, Z = −2.453

Figure 7.

Instruction group: Effect sizes (r) for sentence-level generics across all tests.

The findings reveal that the ratings given by the instruction group for definite singulars did not differ significantly across the five tests, as ratings were consistently low (2.3 or lower). Ratings for indefinite singulars differed significantly between the pre-test and post-test 1 and between the pre-test and post-test 2, but not between the pre-test and post-test 3 or post-test 4, though ratings for indefinite singulars (3.0 or above) were consistently higher than those for the definite singulars across the post-tests. The instruction group consistently rated bare plurals highly across the four post-tests. The effect sizes for bare plurals are all medium to large, suggesting that instruction was successful as participants continued to show long-lasting effects of identifying bare plurals as acceptable generics for sentence-level descriptions.

A post hoc power and effect size analysis was conducted using the software package G*Power (Faul, Erdfelder, Lang & Buchner, 2007).¹¹ The sample size of 21 (instruction group) was used for the statistical power analysis since this is roughly the number of participants we had from pre-test through to post-test 3. The effect size used for this assessment was large (Cohen’s d = .80). The alpha level used for this analysis was .05. The post hoc analysis revealed the statistical power was .71. Thus, overall, there was adequate power to achieve a medium effect size level through our intervention.
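
To illustrate how a post hoc power figure of this kind depends on effect size, sample size and alpha, the sketch below approximates two-tailed power with a normal approximation. It is not a reproduction of the G*Power analysis: the text does not say which test family was selected, so the two-independent-groups design with equal group sizes is an assumption, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate two-tailed power for comparing two independent groups.

    Uses the normal approximation to the t test, so values differ slightly
    from the noncentral-t computation used by dedicated power software.
    """
    if n_per_group <= 0:
        raise ValueError("n_per_group must be positive")
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)  # critical value for two tails
    delta = d * sqrt(n_per_group / 2.0)       # noncentrality, equal group sizes
    return norm.cdf(delta - z_crit) + norm.cdf(-delta - z_crit)


# Inputs taken from the text: d = .80, n = 21, alpha = .05.
print(f"approximate power: {approx_power_two_sample(0.80, 21, 0.05):.2f}")
```

Because the approximation ignores the extra uncertainty from estimating the variance, it will tend to give a slightly higher value than an exact calculation for samples of this size.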

V Discussion and conclusions

The aim of our study was to see whether L2 learners from an article-less L1 (Japanese) could benefit from explicit instruction in genericity. Our study is unusual in that, unlike many other article intervention studies, our delayed post-tests were administered not weeks but months after instruction (post-test 3), and post-test 4 was administered a full 15 months after instruction.¹² For NP-level generic contexts, all L2 learners, both in the instruction group and the control group, demonstrated little to no understanding of the definite singular in the pre-test results, rating definite singulars lower than the native speaker group did. Participants were a little better at recognizing that indefinite singulars could be used in sentence-level generics but, as with NP-level generics, their ratings for indefinite singulars were lower than those of the native speakers in the pre-test. All participants performed well in the pre-test on rating bare plurals as acceptable (3.3 or above) for NP-level and sentence-level generics, and the instruction group improved in their ratings across the four post-tests. Research question (1) was confirmed because after 3 weeks of instruction, the instruction group improved in their ratings of the definite singular for NP-level generics between pre-test and post-test 1. In addition, differences were found between (1) post-test 2 and the pre-test and (2) post-test 3 and the pre-test. Research question (2) was partially confirmed since there was some improvement with indefinite singulars (up to post-test 2) and bare plurals for sentence-level generics. In comparison to the participants in the Snape and Yusa (2013) study, the participants in the current study were much better at accepting the definite singular for NP-level generics after instruction. We suggest three reasons for the instruction group’s improvement between pre- and post-tests 1, 2 and 3:

the participants received instruction in the L1, Japanese, rather than in the L2, English.

nine weeks of instruction were provided, with 60 minutes per lesson, rather than 3 weeks with one 70-minute class per week.

clear explanations were offered during instruction with examples on how the definite singular can be used as a generic and how the indefinite singular refers to general properties.

Despite improvements, however, research question 3 was unconfirmed as post-test 4 revealed that the instruction group ratings decreased for the appropriate article for NP-level generics and sentence-level generics after an extended period. The findings show only short-term positive effects can be derived.¹³

The results of the current study clearly revealed that any improvement demonstrated after a few weeks of instruction cannot reliably address the issue of implicit knowledge. Positive effects of instruction were noticeable three months after the instruction period and if post-test 4 had not been administered, the conclusion of this study would have been more in line with what is reported in Snape et al. (2016). We believe that our findings demonstrate that explicit knowledge is unlikely to become implicit knowledge over an extended period, at least for this property (for discussion, see VanPatten, 2016; VanPatten & Rothman, 2014, 2015).

Although the present study found no real long-term effects of instruction, linguistic descriptions, such as the one proposed by Krifka et al. (1995), may be useful for highly proficient L2 learners. For example, the lion in sentences (14a) and (14b) has different meanings, depending on the context: (14a) can be generic, i.e. referring to all lions, but (14a) can also have the interpretation of (14b), where both sentences presuppose that there is exactly one salient lion in the discourse situation and the speaker asserts that she saw ‘that dangerous lion’ or ‘that lion sleeping’; if there are no salient lions, or two or more salient lions, then (14a, b) are infelicitous (Ionin et al., 2011).

(14) a. The lion is dangerous.

b. The lion is sleeping.

Learners receive a mix of PLD inside or outside of the classroom and this is likely to be a source of puzzlement because there is a lot of overlap between meanings and forms. In addition, the input does not provide much evidence, since in everyday life there are few occasions to discuss generic situations and to use generic NPs. Still, GenSLA can inform instructors of the types of difficulties certain L2 learners may face with articles. Since the acquisition of grammar is only possible when the language learner is able to map the feature to the form (Lardiere, 2005), in this case the additional feature [+kind] to the definite singular, linguistically-informed language teaching could at least help instructors to guide L2 learners in the right direction in teaching them about nominal morphosyntax.¹⁴

Footnotes

Acknowledgements

We wish to thank the Prefecture of Gunma for a grant for research at Gunma Prefectural Women’s University and the Japanese government as this research was in part supported by a number of Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology (No. 26284090, No. 24520684, No. 25580133, No. 16K13266 and No. 17K02364). We are grateful to the Generative Approaches to Second Language Acquisition 2015 and European Second Language Association 2015 audience members for questions and comments, to Heather Marsden, Roumyana Slabakova and two anonymous reviewers for comments and suggestions on an earlier version of our article, and to all the participants in our study in Japan and the UK. Any errors are solely our own responsibility.

Declaration of conflicting interests

The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Part of the same dataset was already published in Snape, N., Umeda, M., Wiltshier, J., and Yusa, N. (2016). Teaching the complexities of English article use and choice for generics to L2 learners. In Proceedings of the 13th Generative Approaches to Second Language Acquisition Conference (GASLA 2015), D. Stringer, J. Garrett, B. Halloran, and S. Mossman (Eds.), 208–222. Somerville, MA: Cascadilla Proceedings Project. The dataset was organized differently to the current paper as it examines each generic category in more detail. The current paper provides more overall detail of the study and includes post-test 4.

References

Bitchener & Knoch (2008). The value of written corrective feedback for migrant and international students. Language Teaching Research, 12, 409–431.

Boers (2015). Weighing the merits of form-focused intervention. Language Teaching Research, 19, 251–253.

Faul, Erdfelder, Lang A.G., & Buchner (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175–191.

Ionin & Wexler (2004). Article semantics in L2 acquisition: The role of specificity. Language Acquisition, 12, 3–69.

Ionin, Montrul, Kim J.H., & Philippov (2011). Genericity distinctions and the interpretation of determiners in L2 acquisition. Language Acquisition, 18, 242–280.

Krashen (1981). Second Language Acquisition and Second Language Learning. Oxford: Pergamon Press.

Krifka, Pelletier F.J., Carlson G.N., ter Meulen, Link, & Chierchia (1995). Genericity: An introduction. In: Carlson & Pelletier (Eds.), The Generic Book (pp. 1–125). Chicago: University of Chicago Press.

Kuroda S.-Y. (1992). Japanese Syntax and Semantics: Collected papers. Dordrecht: Kluwer.

Lardiere (2005). On Morphological Competence. In: Dekydtspotter, Sprouse R.A., & Liljestrand (Eds.), Proceedings of the 7th Generative Approaches to Second Language Acquisition Conference GASLA 2004 (pp. 178–192). Somerville, MA: Cascadilla Proceedings Project.

Lindstromberg (2016). Inferential statistics in Language Teaching Research: A review and ways forward. Language Teaching Research, 20, 741–768.

Master (1994). The effect of systematic instruction on learning the English article system. In: Odlin (Ed.), Perspectives on Pedagogical Grammar (pp. 229–252). Cambridge: Cambridge University Press.

Rothman (2008). Aspect selection in adult L2 Spanish and the Competing Systems Hypothesis: When pedagogical and linguistic rules conflict. Languages in Contrast, 8, 74–106.

Schwartz (1993). On explicit and negative data effecting and affecting competence and linguistic behavior. Studies in Second Language Acquisition, 15, 147–163.

Schwartz & Gubala-Ryzak (1992). Learnability and grammar reorganization in L2A: Against negative evidence causing the unlearning of verb movement. Second Language Research, 8, 1–38.

Snape (2013). Japanese and Spanish adult learners of English: L2 acquisition of generic reference. Studies in Language Sciences: Journal of the Japanese Society for Language Sciences, 12, 70–94.

Snape & Kupisch (2016). Second Language Acquisition: Second Language Systems. London: Palgrave Macmillan.

Snape & Yusa (2013). Explicit article instruction in definiteness, specificity, genericity and perception. In: Whong, Gil K.H., & Marsden (Eds.), Universal Grammar and the Second Language Classroom (pp. 161–183). Netherlands: Springer.

Snape, Umeda, Wiltshier, & Yusa (2016). Teaching the complexities of English article use and choice for generics to L2 learners. In: Stringer, Garrett, Halloran, & Mossman (Eds.), Proceedings of the 13th Generative Approaches to Second Language Acquisition Conference GASLA 2015 (pp. 208–222). Somerville, MA: Cascadilla Proceedings Project.

VanPatten (2016). Why explicit knowledge cannot become implicit knowledge. Foreign Language Annals, 49, 650–657.

VanPatten & Rothman (2014). Against rules. In: Benati, Laval, & Arche M.J. (Eds.), The Grammar Dimension in Instructed Second Language Acquisition: Theory, Research, and Practice (pp. 15–35). London: Bloomsbury.

VanPatten & Rothman (2015). What does current generative theory suggest about the explicit–implicit debate? In: Rebuschat (Ed.), Explicit and Implicit Learning of Languages (pp. 91–116). Amsterdam: John Benjamins.

White (1991). Adverb placement in second language acquisition: Some effects of positive and negative evidence in the classroom. Second Language Research, 7, 133–161.

Whong (2011). Language Teaching: Linguistic Theory in Practice. Edinburgh: Edinburgh University Press.

Whong, Gil K.H., & Marsden (2013). Universal Grammar and the Second Language Classroom. Dordrecht: Springer.

Whong, Marsden, & Gil K.H. (2013). How we can learn from acquisition: The acquisition-learning debate revisited. In: Cabrelli Amaro, Judy, & Pascual y Cabo (Eds.), Proceedings of the 12th Generative Approaches to Second Language Acquisition Conference GASLA (pp. 203–210). Somerville, MA: Cascadilla Proceedings Project.