Abstract
Objectives:
The study was intended to test the hypothesis that L2 speakers have difficulty in automatically activating a grammaticalized L2 meaning that is not morphologically marked in L1.
Methodology:
The study consisted of three experiments. A sentence–picture matching task was designed to assess the activation of grammaticalized meaning. The participants were asked to judge if a sentence correctly described the physical relationships of three objects in a picture. Hidden in the stimuli that required a positive response was a number agreement manipulation whereby a noun phrase in the sentence may agree or disagree with the number of objects in the picture. A number disagreement effect, as shown in a delay in producing a positive response on items of number disagreement, was used to assess automatic activation of number meanings.
Data and Analysis:
The data consisted of reaction times and accuracy rates from 32 English native speakers, 36 Chinese native speakers, 54 Chinese–English bilinguals, and 26 Russian–English bilinguals. Analyses of variance were performed on these data.
Findings:
The results showed a number disagreement effect in L1 and L2 among Russian English as a second language (ESL) speakers only. Chinese ESL speakers showed no difference between the two critical conditions in either language. A follow-up experiment showed that Chinese ESL speakers had no difficulty in automatically activating number meanings which were expressed lexically in English sentence processing. These findings provided support for the idea that the well-documented difficulty L2 learners have in learning incongruent L2 inflectional morphemes may have to do with their difficulty in automatically activating a grammaticalized meaning that is not grammaticalized in their L1.
Originality:
The sentence–picture matching task represented a unique and effective approach to the study of the activation of grammaticalized meanings.
Significance:
The findings from the study represented some of the first psycholinguistic evidence regarding the activation of grammaticalized meanings among non-native speakers.
Introduction
The acquisition of inflectional (or grammatical) morphemes such as English tense and plural markers by second language (L2) learners has been the focus of attention from the very early days of second language acquisition (SLA) research. In more recent years, it has been a test ground for a number of theoretical and empirical issues such as the critical period hypothesis, the role of Universal Grammar (UG) in adult L2 learning, and the aspect hypothesis. One finding that emerged from this research is what Jiang, Novokshanova, Masuda, and Wang (2011) referred to as the morphological congruency effect. In their conception, an L2 morpheme is referred to as a congruent morpheme from a learner’s perspective if it is also instantiated in a learner’s first language (L1). An L2 morpheme whose meaning is not morphologically marked in a learner’s L1 is considered an incongruent morpheme. The morphological congruency effect refers to the finding that an incongruent morpheme is much more difficult to acquire than a congruent one.
The evidence for this morphological congruency effect can be found in many studies. Some of these studies examined the development of different morphemes by the same group of L2 learners. Some of the morphemes were congruent in that they had a counterpart in the learners’ L1, whereas others were incongruent as they were not instantiated in the learners’ L1. For example, White (2003) examined the development of English third person singular, past tense, plural, and articles by a Turkish English as a second language (ESL) speaker for whom articles were incongruent but plural and past were congruent. Other studies considered the development of the same L2 morpheme by learners whose L1s did or did not mark the related meaning morphologically, e.g. English past tense by German and Chinese ESL learners in Hawkins and Liszka (2003). In both types of studies, L2 learners’ development of grammatical morphemes was assessed by examining their accuracy rates in performing a linguistic task which varied from filling in the blank to spontaneous production. Participants in a large number of such studies reliably produced higher accuracy rates on congruent morphemes than on incongruent morphemes. In Hawkins and Liszka’s (2003) study, for example, the accuracy rates for the congruent German ESL group were above 95%, but the proficiency-matched incongruent Chinese ESL group showed much lower accuracy rates on both regular and irregular verbs (62.5% and 84.2%). Similar findings were obtained in studies reported by Bialystok (1997), Bialystok and Miller (1999), Dewaele and Véronique (2001), Franceschina (2001, 2005), McDonald (2000), Parodi, Schwartz, and Clahsen (2004), and Sabourin, Stowe, and de Haan (2006).
The difficulty involved in learning an incongruent morpheme is also demonstrated in studies that examined the processing of ungrammatical morphemes by adult L2 learners in online tasks. These studies showed a lack of native-like sensitivity to morphosyntactic errors involving incongruent morphemes on the part of adult L2 learners. For example, Guillelmon and Grosjean (2001) examined the development of French gender marking by English-speaking learners in a shadowing task. The participants were asked to repeat the last word in a noun phrase which was either consistent or inconsistent in gender with its modifier or determiner. The native speakers and young L2 learners of French (French onset age: 4:0) showed a delay in repeating (or shadowing) the target word when it disagreed with the preceding word in gender, but adult L2 learners showed no such sensitivity. Similarly, Scherag, Demuth, Rosler, Neville, and Roder (2004) showed that English native speakers who had resided in Germany for more than 15 years failed to show any sensitivity to gender marking in German in a priming study. In a series of studies that examined English plural marking among Chinese, Japanese, and Russian ESL speakers in a self-paced reading task, Jiang and his collaborators (Jiang, 2004, 2007; Jiang et al., 2011) demonstrated that Russian ESL speakers, for whom the plural marker was a congruent morpheme, were able to show a native-like sensitivity to plural marking errors, but Chinese and Japanese ESL speakers, for whom the plural marker was an incongruent morpheme, were not.
Explaining the morphological congruency effect
What makes an incongruent morpheme more difficult? Several scholars have attributed such difficulty to maturational constraints, sometimes within the UG framework (e.g. Franceschina, 2005; Hawkins & Liszka, 2003; Scherag et al., 2004). However, such proposals are often not detailed enough to be tested empirically. We hope to outline a more specific explanation for this effect and put this explanation to the test in a series of experiments.
Our approach begins with the assumption that native-like morphological performance is dependent on the automatic activation of the meaning marked by a grammatical morpheme. In expressing the meaning of “I like the books” in English, the plural meaning has to be activated in the speaker’s mind in order to trigger the activation and articulation of the related plural marker in output. Similarly, in hearing this sentence, an individual has to automatically encode the plural marker –s and link it to the plural meaning in his or her mental representation in order to understand the speaker’s intended meaning of multiple books. The importance of a grammaticalized meaning at the message level for speech comprehension and production is well recognized in psycholinguistics. Almost all language production models (e.g. Dell, 1986; Eberhard, Cutting, & Bock, 2005; Levelt, 1989, 1999) postulate that the selection or activation of a linguistic element (in this case, an inflectional morpheme such as a plural marker) is driven by an element in the higher level of mental representations (e.g. semantic features in the preverbal message). The meanings represented at the message level determine what inflectional morphemes will be activated or selected; thus, the right meaning should be present in the message, “Otherwise the morphology cannot come out right” (Levelt, 1989, p. 104).
However, languages differ in what meanings are grammaticalized and marked morphologically. While learning the native language, an individual learns, among other things, what meanings are grammaticalized in the language and thus should be automatically activated in language processing. In learning a language which grammaticalizes singular and plural meanings such as English, a child learns to automatically pay attention to and encode the number meaning as singular or plural. A child learning Arabic has to automatically make a distinction between singular, dual, and plural meanings as required by the language. In contrast, automatic encoding of the number meaning is not obligatory for children learning Chinese as their native language.¹ This cross-language difference in the automatic activation of grammaticalized meanings is also well recognized in psycholinguistics. Levelt (1989), for example, explicitly states that “languages differ in the kinds of semantic features that are grammatically acknowledged. As a consequence, the encoding of messages is not the same for speakers of different languages” (Levelt, 1989, p. 106). He illustrates this point by suggesting that English speakers have to have temporal meaning such as “past” represented in the preverbal message, but Chinese speakers do not (Levelt, 1999, p. 94). A similar view is put forward in Slobin’s (1996) theorizing of thinking for speaking, which suggests that “one fits one’s thoughts into available linguistic forms” (Slobin, 1987, p. 435, cited in Slobin, 2003).
We believe that this difference in what meanings are grammaticalized and thus have to be automatically activated in language processing across languages has a huge consequence in the subsequent learning of a new language. Specifically, a learner faces two very different tasks while learning a congruent versus an incongruent L2 morpheme. In learning a congruent morpheme, the related meaning is already part of routinely activated meanings as a result of learning the L1, so the task is to link the meaning to a new morphological marker. However, in learning an incongruent morpheme, a learner has to first learn to automatically activate a meaning that is not automatically activated in his or her L1. We suggest that this is where the difficulty lies.
We suspect that the development of new grammaticalized meanings may be subject to maturational constraints. Note that the number of grammaticalized meanings in a particular language is usually limited and stable. A language is very unlikely to add a new grammaticalized meaning in one’s lifetime. Thus, there is no need for an individual to develop new grammaticalized meanings after the establishment of one’s first language. In this regard, one may compare the development of new grammaticalized meanings to the development of new phonological categories for speech perception. Grammaticalized meanings and phonological categories share two common properties: they are limited in number in a language; and they remain stable in one’s lifetime – the latter of which means that once they are developed, there is no need to learn new ones. If Kuhl (2000, 2004) is right in suggesting that the perceptual map of phonological categories is laid out very early in one’s childhood and becomes very difficult to alter later in life, the same can be true for the learning of grammaticalized meanings. The pattern of semantic activation involving grammaticalized meanings, such as number and tense, may be finalized in the process of learning one’s L1 and become very difficult to change.
To sum up, we argue that the cause of an L2 learner’s difficulty in learning an incongruent morpheme lies in the difficulty in automatically activating the meaning expressed by the L2 morpheme because this meaning is not routinely activated in his or her L1.
Testing the explanation: the sentence–picture matching task
The present study was intended to test the assertion that adult L2 learners have difficulty in automatically activating a grammaticalized meaning that is related to an incongruent morpheme. It was done through the use of a sentence–picture matching task. In this task, a picture of three objects was first presented on a computer monitor. It was then followed by a sentence that was presented above the picture. The participant was asked to decide whether the sentence correctly described the physical relationship among the three objects in the picture, e.g. the spatial positions of the objects.² Half of the sentences provided a correct description and thus required a Yes response. The other half did not and should generate a No response. In addition to this physical relationship manipulation that was directly related to the task, a hidden manipulation that was not directly related to the task was also built into the stimuli. Each Yes-response item had two versions: a number-match (or n-match) version; and a number-mismatch (or n-mismatch) version. The two versions shared the identical sentence but differed in the number of one of the objects. As shown in Figure 1, in the n-match version, there was only one basket in the picture, which matched the sentence-final word basket in number. In the n-mismatch version, there was more than one basket, so the picture and the sentence did not match in number. Note that the participants were never informed of this manipulation or asked to pay attention to the number of objects.

Figure 1. A test item in the sentence–picture matching task. Each item had a set of three objects accompanied by a sentence that described the physical relationship of the objects, and each item had two versions: an n-match version with a single tall basket (a) and an n-mismatch version with two baskets (b).
If the participants followed the instructions correctly, they were expected to produce a positive response for all the Yes items, regardless of number agreement, as these sentences described the physical relationship among the three objects correctly. However, if the number meaning is automatically activated, a singular meaning would be activated and represented in association with a sentence-final noun, e.g. basket. A singular or plural meaning would be generated in association with the object of basket depending on whether there was one or more than one basket. Furthermore, in the n-mismatch condition where a sentence was paired with a picture with multiple tokens of the object referred to by the noun, an individual should (most likely subconsciously) notice a mismatch in number between the word basket in the sentence and multiple baskets in the picture while performing the matching task, which should result in a delay in producing a positive response. This delay is referred to as a number mismatch effect in this context. If no singular meaning is activated in processing the sentence, no number mismatch would be present, and as a result, no delay in performing the task would be expected. Thus, we can compare the reaction time for the n-match and n-mismatch conditions to determine whether the number meaning was activated in the matching process. A number mismatch effect, in the form of a delay in responding to the n-mismatch items, was taken as evidence for the automatic activation of the number meaning in sentence processing.
Three experiments were conducted. The first one was intended to confirm the usefulness of the method by testing native speakers of English and Chinese in their respective L1s. If the method was sensitive enough to capture the activation of the singularity/plurality meanings, only native speakers of English would show a number mismatch effect. In the Chinese version of the experiment, because of a lack of number marking in Chinese, there was no number mismatch between the sentence and the picture in the n-mismatch condition, so Chinese participants should show no difference between the n-match and n-mismatch conditions. The second experiment was the main experiment in which a group of L2-proficiency-matched advanced Chinese and Russian ESL speakers were tested in the same task in both their L1s and L2. The Russian and Chinese ESL participants represented L2 learners for whom the singular/plural distinction was a congruent and an incongruent one, respectively. If we were correct in suggesting that only meanings grammaticalized in L1 get automatically activated in L2 processing, only Russian ESL speakers would produce a number mismatch effect. Chinese ESL speakers should show no difference. Experiment 3 was done to reinforce our interpretation of the results of Experiment 2 by demonstrating that the method was powerful enough to detect the activation of lexicalized meanings among Chinese ESL speakers.
Experiment 1: assessing the effectiveness of the method
To assess the effectiveness of the sentence–picture matching task for examining the activation of the singular/plural meanings in real-time language processing, English and Chinese speakers were tested in their respective L1s. The test materials were identical for the two groups except for the language of the sentences. If the method was effective, only English-speaking participants would produce a number mismatch effect.
Method
Participants
A total of 73 college students served as participants in the study. Five of them were excluded due to high error rates. The remaining 68 participants were 32 native English speakers who were taking an undergraduate psychology course at an American university and 36 native Chinese speakers studying in a Chinese university. They participated in the English and Chinese part of the experiment, respectively.
Stimuli
The test materials, first constructed in English, included 80 test items. Each item consisted of a picture and a sentence that described the physical relationship of the objects in the picture, as illustrated in Figure 1. The picture contained three easily recognizable objects whose names were familiar to the participants. The objects were placed side by side horizontally or positioned diagonally with one object placed right above or below one of the other two objects. The size of the pictures was approximately 4” by 4”. Most sentences described the spatial relationship between the objects (e.g. the banana is above the red apple). Other sentences referred to the color or shape of the objects to reduce monotony of the task (e.g. The shirt is of the same color as the tall tree; the arrow is pointing to the blue towel). All of the sentences ended with a noun phrase which consisted of the definite article, a modifier, and a singular noun, as shown in the above examples. Among them, half described the pictures correctly and required a positive response and the other half did not and required a negative response.
Of the 40 Yes-response items, two versions were created for each. The two versions shared the identical sentence, but differed in the number of objects. The n-match version showed a single object referred to by the last singular noun of the corresponding sentence, and the n-mismatch version showed multiple objects labeled with the same singular noun (see Figure 1).
Two presentation lists were constructed. Each list consisted of 40 Yes-response items and 40 No-response items. Half of the Yes items were in the n-match condition and the other half in the n-mismatch condition. These Yes items were counterbalanced across the two presentation lists such that the two versions of the same item never appeared on the same list. If the n-match version appeared in List A, its n-mismatch version appeared in List B, and vice versa. Both lists shared the same 40 No-response filler items.
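The counterbalancing scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item identifiers, the function name `build_lists`, and the rule of alternating by item index are hypothetical, as the text does not say how items were allocated to List A and List B, only that the two versions of an item never shared a list.

```python
def build_lists(n_yes_items=40, n_fillers=40):
    """Return two presentation lists of (item_id, condition) tuples.

    Assumption: Yes items are split by alternating index; any balanced
    split that keeps the two versions of an item apart would also work.
    """
    list_a, list_b = [], []
    for i in range(n_yes_items):
        item_id = f"yes{i:02d}"  # hypothetical identifier
        if i % 2 == 0:
            list_a.append((item_id, "n-match"))
            list_b.append((item_id, "n-mismatch"))
        else:
            list_a.append((item_id, "n-mismatch"))
            list_b.append((item_id, "n-match"))
    # Both lists share the same No-response filler items.
    fillers = [(f"no{i:02d}", "filler") for i in range(n_fillers)]
    return list_a + fillers, list_b + fillers


list_a, list_b = build_lists()
```

Each participant would then see one of the two lists, so that every Yes item contributes to both conditions across participants but never twice for the same participant.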
To construct the test materials for the Chinese part of the study, one of the present authors translated the English sentences into Chinese, and the translation was checked by two additional Chinese–English bilingual speakers for its accuracy and idiomaticity. For example, the English sentence The bike is above the tall basket was translated as 自行车在高的篮子上方 (zixingche zai gao de lanzi shangfang). Even though the same set of pictures was used as in the n-match and n-mismatch conditions of the English part, there was no difference between the two conditions in number matching in the Chinese stimuli because of a lack of number marking in these Chinese sentences.
Procedures
The participants were tested individually and were assigned to one of the two presentation lists randomly in their respective language. Once they were seated in the laboratory, they were asked to read and sign a consent form and then read the instructions for the test, which explained that they would match sentences to pictures on the basis of the spatial or other physical relationship of the objects in the pictures. The instructions emphasized the need for both accuracy and speed. The participants had twelve practice items before the test items. All test items began with the appearance of a picture at the center of a computer monitor, which was followed 1330 milliseconds (ms) later by a sentence two lines above the picture. The stimulus onset asynchrony (SOA) of 1330 ms was determined on the basis of a pretest to ensure that the participants had enough time to view the pictures. Both the picture and the sentence remained on the monitor until a response was given. Test items were presented in a different random order for each participant. The participants responded by pressing two buttons, one for “Yes”, and the other for “No.” The data included the participants’ reaction time (RT), which was the interval between the onset of the sentence and the execution of their response, and their error rate (ER). The task was self-paced in that the participant could choose to take a break between items or immediately go to the next item. The administration of the stimuli and the collection of data were done with DMDX (Forster & Forster, 2003).
Results and discussion
Participant and item means were first computed for both RT and ER for further statistical analysis. RTs of incorrect responses and outliers were excluded from the computation of these means. Outliers were defined as any RT that fell more than two standard deviations from the same participant’s mean or outside the low and high cutoffs, set at 1200 ms and 6000 ms, respectively. This data treatment affected 7.3% of the data. The mean RT and ER for the n-match, n-mismatch, and No-response conditions are shown in Table 1. Data from No-response items were not analyzed.
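The data treatment described above can be illustrated with a short Python sketch. Several details are assumptions rather than facts from the study: the function name `trim_rts` is hypothetical, the absolute cutoffs are applied before the standard-deviation criterion, and the participant mean is computed over correct responses only. The text does not specify the order of these steps.

```python
from statistics import mean, stdev

LOW_CUTOFF_MS = 1200   # low cutoff reported for Experiment 1
HIGH_CUTOFF_MS = 6000  # high cutoff reported for Experiment 1


def trim_rts(trials, low=LOW_CUTOFF_MS, high=HIGH_CUTOFF_MS, n_sd=2.0):
    """Return the RTs retained for one participant.

    trials: list of (rt_ms, correct) tuples for a single participant.
    Assumed order: drop errors, apply absolute cutoffs, then drop RTs
    more than n_sd standard deviations from the participant's mean.
    """
    correct_rts = [rt for rt, correct in trials if correct]
    within = [rt for rt in correct_rts if low <= rt <= high]
    if len(within) < 2:
        return within  # a standard deviation needs at least two values
    m, sd = mean(within), stdev(within)
    return [rt for rt in within if abs(rt - m) <= n_sd * sd]


# Hypothetical trials: one error, one RT below and one above the cutoffs,
# and one slow RT that survives the cutoffs but not the 2-SD criterion.
example_trials = [
    (2000, True), (2100, True), (1900, True), (2050, True),
    (1950, True), (2000, True), (2100, True), (1900, True),
    (5900, True), (800, True), (7000, True), (2000, False),
]
retained = trim_rts(example_trials)
```

In this made-up example, the error trial and the RTs of 800 ms and 7000 ms are removed first, and the 5900 ms response is then removed by the standard-deviation criterion.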
Table 1. English and Chinese speakers’ mean reaction times, mean error rates, and their standard deviations (in parentheses) for the number-match (n-match), number-mismatch (n-mismatch), and No-response conditions.
The data from Yes-response items were first analyzed in a two-way analysis of variance (ANOVA) using the SPSS GLM-repeated measures procedure, with participant group as a between-participant variable (English and Chinese) and number matching as a within-participant variable (matching and nonmatching). Effect sizes were reported in the form of partial eta squared (η2partial ) where appropriate. The RT analysis produced no reliable main effect of number matching, F1(1,66)=1.5, p>0.05; F2 (1,78)=1.7, p>0.05. The main effect of participant group was reliable in item analysis, but not in participant analysis, F1(1,66)<1; F2 (1,78)=3.7, p=0.05, η2partial=0.04. Importantly, there was a reliable interaction between the two variables, F1(1,66)=6.0, p<0.05, η2partial=0.08; F2 (1,78)=4.5, p<0.05, η2partial=0.05. This interaction suggested that the two groups showed two different patterns, which was confirmed in the subsequent paired samples t-tests to be described below.
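For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity below. The worked line uses the participant analysis of the interaction reported above, F1(1,66)=6.0; the symbols for the effect and error degrees of freedom are notation introduced here for illustration.

```latex
\eta^2_{\text{partial}}
  = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}
  = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}

% Worked example for F1(1,66) = 6.0:
\eta^2_{\text{partial}} = \frac{6.0 \times 1}{6.0 \times 1 + 66} = \frac{6.0}{72.0} \approx 0.08
```

The same calculation for the item analysis, F2(1,78)=4.5, gives 4.5/82.5, or approximately 0.05, which agrees with the value reported.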
The analysis of the ER data showed a reliable main effect of number matching in participant analysis only, with more errors produced on nonmatching items than matching items, F1(1,66)=7.9, p<0.05, η2partial=0.10; F2 (1,78)=1.7, p>0.05. There was also a reliable main effect of participant group in both participant and item analyses, with more errors produced by Chinese speakers than English speakers, F1(1,66)=37.0, p<0.05, η2partial=0.36; F2 (1,78)=5.4, p<0.05, η2partial=0.06. The interaction was not reliable, both Fs<1.
The two groups’ data were also analyzed separately in a set of paired-samples t-tests.
The English-speaking group was 113 ms faster in responding to the n-match items than they were in responding to the mismatch items. This number mismatch effect was reliable in both participant (t1) and item (t2) analyses, t1(31)=2.33, p<0.05; t2(39)=2.45, p<0.05. There was also a 3.9% difference in ER, and the difference was significant in participant analysis only, t1(31)=2.61, p<0.05; t2(39)=1.71, p=0.096.
The Chinese participants showed no significant difference in RT between the two conditions. They responded to the mismatch items 37 ms faster than the match items, but the difference was not significant, t1(35)=0.97, p>0.05; t2(39)=0.56, p>0.05. Neither was the difference in ER, t1(35)=1.57, p>0.05; t2(39)=0.60, p>0.05.
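The paired-samples comparisons reported above follow the usual formula for a t statistic on difference scores. The sketch below shows the participant analysis (t1) using only the Python standard library. The function name `paired_t` and the RT values are hypothetical and are not data from the study. The p value is left out because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(match_means, mismatch_means):
    """Return (t, df) for a paired-samples t-test.

    Each list holds one mean RT per participant (or per item for t2),
    in the same order, so that values at the same index form a pair.
    """
    diffs = [mm - m for m, mm in zip(match_means, mismatch_means)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical participant means in ms (not data from the study).
match = [2000, 2100, 1900, 2200]
mismatch = [2100, 2250, 1950, 2300]
t1, df = paired_t(match, mismatch)
```

A positive t here indicates slower responses in the n-mismatch condition, which is the direction of the number mismatch effect.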
The results demonstrated a clear difference in the performance of English and Chinese speakers. English speakers took longer to respond to items that did not match in number between the sentences and pictures, but Chinese speakers did not. Could the n-mismatch effect found in English speakers be attributed to the fact that the n-mismatch pictures had more objects in them than the n-match pictures and, thus, should take longer to recognize? The answer is negative for two reasons. First, the pictures were shown before the sentences, and the 1330 ms SOA should allow the participants enough time to recognize all objects based on our estimate from the pretest. Second, if the perception of a single vs. multiple tokens of objects had led to the difference in RT among English speakers, the same should have happened among Chinese speakers, but the number of tokens of objects had no effect on the latter group.
The observation of the number mismatch effect in RT among English-speaking participants indicated that the method was sensitive enough to capture the activation of the number meanings. We also consider the activation of such number meanings observed in this task an automatic process as it was not required by or relevant to the task. The lack of this effect among Chinese speakers helped strengthen our interpretation of the English results. If it was something other than the activation of the number meanings that contributed to the mismatch effect found among English speakers, we would have observed the same effect among Chinese speakers. The results from both groups of participants taken together suggest that singular and plural meanings are automatically activated in language processing only among those whose languages mark these meanings morphologically and that this activation can be successfully detected in the sentence–picture matching task.
Experiment 2: the activation of grammaticalized number meanings in L2 processing
Experiment 2 was the main experiment, in which we examined the main assertion of the psycholinguistic explanation outlined in the introduction. Specifically, it was intended to examine the proposal that the meaning corresponding to a congruent L2 morpheme is automatically activated, but the meaning related to an incongruent L2 morpheme is not. Advanced Russian and Chinese ESL speakers were tested in the same sentence–picture matching task in both their L1 and L2. Based on the results of Experiment 1, Russian ESL speakers were expected to show an n-mismatch effect in RT while performing the matching task in Russian, as the singular/plural meanings are morphologically marked in Russian. Based on our proposal, we also expected the Russian ESL speakers to show a number mismatch effect in English. In contrast, we expected Chinese ESL speakers to show no number mismatch effect in either their L1 or L2. In addition to replicating the L1 Chinese results found in Experiment 1, the prediction of a lack of number mismatch effect in L2 followed directly from our view that adult L2 learners have difficulty in automatically activating the number meanings in L2 processing that are not grammaticalized in their L1.
Method
Participants
Twenty-six Russian ESL speakers and 28 Chinese ESL speakers participated in the experiment. Twenty Russian ESL speakers and all 28 Chinese ESL speakers were graduate students studying in an American university at the time of testing, and the other six Russian ESL speakers were mostly English teachers and professional translators who were living and tested in Russia. A cloze test taken from Bachman (1982) was used to assess their English proficiency. Three participants in each group did not finish the cloze test. The average score for the remaining 50 Russian and Chinese participants was 20.8 (SD=3.3) and 21.5 (SD=5.4), respectively, out of 30 possible points. A one-way ANOVA showed that the difference was not significant, F(1,48)=0.15, p=0.70. In comparison to the average score of 16.8 obtained from 418 university students who participated in Bachman’s (1982) study, these participants could be best considered as advanced ESL speakers.
Materials
The test materials were the same as those in Experiment 1 except that a Russian version of the experiment was created for Russian ESL participants. The creation of the Russian version followed the same procedure as the Chinese version in Experiment 1.
Procedure
All participants were tested in English first and in their native language second. This was to prevent any potential influence of the native language on their performance in the second language. Otherwise, the procedure was the same as that in Experiment 1.
Results
The same data treatment procedure used in Experiment 1 was adopted in this experiment, except that, for both groups, a higher upper cutoff of 7000 ms was adopted for the L2 data, compared with 6000 ms for the L1 data. Data treatment affected 8.3% of the data for Russian ESL speakers and 6.9% of the data for Chinese ESL speakers. Data from two Russian ESL speakers and on two items in the Chinese part of the experiment had to be discarded due to high ERs (20% or higher). The mean RTs and ERs after data treatment for the two groups of participants are shown in Table 2.
Table 2. Russian and Chinese English as a second language speakers’ mean reaction times (RTs) (in ms) and error rates (ERs) (in percentage) for the n-match and n-mismatch conditions while performing the task in L1 and L2.
Significant in participant or item analysis;
Significant in both participant and item analysis.
All data from Yes-response items were first analyzed in a three-way ANOVA using the GLM-repeated measures procedure. In this analysis, participant group was treated as a between-participant variable with two levels (Russian and Chinese ESL groups). Language (L1 and L2) and number matching (matching and nonmatching) were treated as within-participant variables. It was followed by analyses of the data from the two groups separately, in which the main effects of language (L1 and L2) and number matching (match and mismatch) and their interaction were examined in a two-way ANOVA for each group using the same GLM-repeated measures procedure, followed by a set of paired-samples t-tests to specifically examine the number mismatch effect. In all analyses, both participant means (F1, t1) and item means (F2, t2) were used. Where appropriate, effect sizes were reported in the form of partial eta squared (η2partial). We report the results of these analyses below.
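The distinction between participant analyses (F1, t1) and item analyses (F2, t2) comes down to how trial-level RTs are averaged before testing. The sketch below illustrates that aggregation step only. The function name `cell_means`, the dictionary keys, and the trial values are hypothetical, and the ANOVA itself is not reproduced.

```python
from collections import defaultdict
from statistics import mean


def cell_means(trials, unit):
    """Average RTs within each (unit, condition) cell.

    trials: list of dicts with keys 'participant', 'item',
            'condition', and 'rt' (after data treatment).
    unit:   'participant' for F1/t1 means, 'item' for F2/t2 means.
    """
    cells = defaultdict(list)
    for trial in trials:
        cells[(trial[unit], trial["condition"])].append(trial["rt"])
    return {key: mean(rts) for key, rts in cells.items()}


# Hypothetical trimmed trials (not data from the study).
example = [
    {"participant": "p1", "item": "i1", "condition": "n-match", "rt": 2000},
    {"participant": "p1", "item": "i2", "condition": "n-mismatch", "rt": 2200},
    {"participant": "p2", "item": "i1", "condition": "n-mismatch", "rt": 2400},
    {"participant": "p2", "item": "i2", "condition": "n-match", "rt": 2100},
    {"participant": "p3", "item": "i1", "condition": "n-match", "rt": 2200},
]
by_participant = cell_means(example, "participant")
by_item = cell_means(example, "item")
```

The participant means treat participants as the random factor and the item means treat items as the random factor, which is why the two sets of statistics can diverge.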
All data
The three-way ANOVA of RT data produced a main effect of language in both participant and item analyses, F1(1,50)=89.3, p<0.05, η2partial =0.64; F2 (1,76)=589.1, p<0.05, η2partial =0.89. The participants responded to L1 items 1013 ms faster than to L2 items. There was also a main effect of number matching in both analyses with n-match items responded to 70 ms faster than n-mismatch items, F1(1,50)=5.7, p<0.05, η2partial =0.10; F2 (1,76)=5.0, p<0.05, η2partial =0.06. The main effect of participant group (Russian vs. Chinese ESL group) was not significant in participant analysis, but significant in item analysis, F1(1,50)=1.7, p=0.20, η2partial=0.03; F2 (1,76)=5.63, p<0.05, η2partial=0.07. Critically, there was a reliable interaction between number matching and participant group, F1(1,50)=3.2, p=0.05, η2partial=0.06; F2 (1,76)=4.3, p=0.04, η2partial=0.05, suggesting that the two groups showed different patterns in responding to items in the two critical conditions. There was also a significant interaction between number matching and language in item analysis only, F1(1,50)=1.1, p>0.05, η2partial=0.02; F2 (1,76)=6.2, p<0.05, η2partial=0.08. Other interactions were not significant. The three-way ANOVA of the error data showed a main effect of language, with more errors on L2 items than L1 items, F1(1,50)=4.6, p<0.05, η2partial =0.08; F2 (1,76)=6.7, p<0.05, η2partial=0.08. There was also a significant interaction of language and participant group, F1(1,50)=5.9, p<0.05, η2partial=0.10; F2 (1,76)=4.4, p<0.05, η2partial=0.06. There was no other significant main effect or interaction.
Russian ESL speakers
These participants showed a main effect of both language and number matching. Their performance in Russian was 895 ms faster than that in English, and the difference was significant, F1(1,23)=65.7, p<0.05, η2partial=0.74; F2(1,39)=208.9, p<0.05, η2partial=0.84. They also responded to the n-match items 130 ms faster than to the n-mismatch items, and the difference was also significant, F1(1,23)=10.1, p<0.05, η2partial=0.31; F2(1,39)=8.0, p<0.05, η2partial=0.17. There was no significant interaction between the two in RT, both Fs<1. The ER data showed no main effect of language, both Fs<1, or number matching, both Fs<1. Nor was there a significant interaction of the two, F1(1,23)=1.5, p=0.24; F2(1,39)<1. The paired-samples t-tests of the RT data showed that the n-match effect in L1 approached significance in the participant analysis, t1(23)=1.90, p=0.07, and was significant in the item analysis, t2(39)=2.40, p<0.05. The n-match effect in L2 English was significant in both participant and item analyses, t1(23)=2.79, p<0.05; t2(39)=1.97, p=0.05. No significant n-match effect was found in ER in either language, L1, t1(23)=1.65, p=0.113, t2(39)=1.16, p=0.25; L2, both ts<1.
Chinese ESL speakers
There was a significant main effect of language in both RT, F1(1,27)=41.8, p<0.05, η2partial=0.61; F2(1,37)=422, p<0.05, η2partial=0.92, and ER, F1(1,27)=16.4, p<0.05, η2partial=0.38; F2(1,37)=8.6, p<0.05, η2partial=0.19. No significant main effect of number matching was found in RT (both Fs<1) or ER (both Fs<1), nor was there any significant interaction between the two variables in RT (both Fs<1) or ER (both Fs<1). Paired-samples t-tests produced no reliable n-match effect in either L1 (both ts<1) or L2 (t1(27)<1; t2(39)=1.10, p=0.28) in RT. No reliable n-match effect was found in either L1 or L2 in ER, either (L1: t1(27)=1.29, p=0.21, t2(37)=0.98, p=0.33; L2: both ts<1).
The results from the two groups of participants first replicated those of Experiment 1. While performing the task in their respective L1s, Chinese and Russian speakers showed a different pattern of results. Russian speakers showed a reliable n-mismatch effect, but Chinese speakers did not. These results reinforced our contention that a number mismatch effect in sentence–picture matching would only occur when the singular and plural meanings are morphologically marked in a language and automatically activated in language processing.
More importantly, the two groups also produced different results while performing the task in L2. Russian ESL speakers showed a mismatch effect in performing the task in English, but Chinese ESL speakers showed no such effect. These L2 results suggested that the singular/plural meanings were automatically activated and played a role in the matching process for Russian ESL speakers, but not for Chinese ESL speakers. These results are consistent with the view that L2 learners can routinely activate grammaticalized meanings in L2 processing only when such meanings are also grammaticalized in L1.
To provide a more detailed account of the L2 results, let us consider the example in Figure 1. When Russian ESL speakers processed the picture which contained three single objects in the n-match condition (Version a in Figure 1), a singular meaning was automatically activated in association with the three objects, including the critical object basket. A singular meaning was also activated in processing the word basket in the sentence. No number mismatch was present in the messages constructed for comparison. A positive Yes response was not adversely affected under this circumstance. In processing an n-mismatch item, however, a plural meaning was generated in association with the basket in the picture because of the presence of multiple baskets, whereas a singular meaning was activated by the word basket in the sentence. This mismatch caused a delay in reaching or executing a Yes response to the n-mismatch items.
It was a very different scenario for Chinese ESL speakers, however. Note that due to a lack of plural marking in Chinese, there was no mismatch in number between the picture and sentence in the Chinese part of the experiment in either the n-match or n-mismatch condition. Thus, no mismatch effect was expected. In the English part, assuming that the number meanings were not part of the routinely activated meaning among these Chinese ESL speakers, the mismatch in number between the English sentences and the pictures did not matter to them linguistically and psycholinguistically. This explains the lack of the number mismatch effect in their results.
One may argue that, for some reason, the activation of the number meaning is harder to detect among Chinese ESL speakers in the task adopted in the study. We explored this possibility in Experiment 3 by demonstrating a lexical number mismatch effect among Chinese ESL speakers using the same task. We suggested earlier that the number meanings are lexicalized, not grammaticalized, for Chinese speakers, which means that such meanings are linked to words such as yige (one), jige (several), xuduo (many) in Chinese and to words such as one, several, and many in English. A lexical number mismatch effect in this context refers to a delay in responding to an n-mismatch item where a single object was matched with a sentence that contained a plural noun preceded by a quantifier such as several baskets. Empirical demonstration of a lexical number mismatch effect among Chinese ESL speakers could serve two purposes: showing that the task is sensitive enough to capture the automatic activation of the number meaning among Chinese ESL speakers; and confirming that such meanings are readily activated in the presence of a lexical item. Such a finding would help strengthen our proposal that what is lacking for Chinese ESL speakers is the presence of a grammaticalized (not lexicalized) number meaning in English processing.
Experiment 3: lexical number mismatch effect among Chinese ESL speakers
To demonstrate that lexicalized singular/plural meanings are automatically activated in language processing for Chinese ESL speakers and that the task is powerful enough to capture such activation, we modified the test materials used in the first two experiments to create critical trials such as the one illustrated in Figure 2. These test items differed from those in the first two experiments in two main ways. First, a word such as several was used to indicate the plural meaning lexically in these sentences. Second, the pictures with multiple tokens of the critical object were in the n-match condition and the pictures with a single object were in the n-mismatch condition because of the plural nature of the sentence, as illustrated in Figure 2. If Chinese ESL speakers do activate and represent singular and plural meanings lexically in English and picture processing, we would expect them to produce a lexical number mismatch effect in that they would show a delay in responding to the n-mismatch items.

Figure 2. N-match (a) and mismatch (b) items in Experiment 3.
Method
Participants
Twenty-six Chinese ESL speakers participated in the experiment. They were all graduate students studying at an American university at the time of testing.
Materials and procedure
A total of 80 items were constructed for this experiment, all in English. Forty of them had sentences and pictures matched in physical relationship and served as critical stimuli and the other 40 were nonmatching filler items. Each of the 40 matching items had two versions. The sentences for both versions were identical. All sentences ended with a noun phrase that had a plural quantifier such as several, two, many, a few, and a plural head noun. The picture had multiple tokens of the referent of the head noun for the n-match condition, and only a single object of the referent for the n-mismatch condition. The construction of the trial lists and the procedure were the same as those in Experiment 2.
Results
The data were treated in the same way as in the first two experiments, except that a higher high cutoff of 1800 ms was adopted in light of the longer RTs in this data set. Data treatment affected 6.1% of the data. The participants’ mean RTs and ERs for the n-match and mismatch conditions are shown in Table 3. Paired-samples t-tests showed that the 311-ms lexical n-mismatch effect was significant in both participant and item analyses, t1(25)=3.90, p<0.05; t2(39)=2.61, p<0.05. So was the difference in ER, t1(25)=6.17, p<0.05; t2(39)=3.38, p<0.05. Many participants reported noticing the mismatch between the plural noun phrase in the sentence, such as several paper bags, and the single bag in the picture, and therefore incorrectly gave a negative response even though the sentence described the physical relationship among the three objects correctly and a positive response was expected. This contributed to the high ERs for the mismatch condition.
Table 3. Chinese English as a second language participants’ reaction time (RT) and error rate (ER) in responding to congruent and incongruent items in Experiment 3.
Significant in both participant and item analysis.
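The high-cutoff step mentioned in the Results can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and toy RTs are invented, the cutoff is treated as inclusive, and the low cutoff is left as a placeholder parameter because its value is not given in this section. The full data treatment used in the study may have involved additional criteria.

```python
def trim_by_cutoffs(rts, high_cutoff, low_cutoff=0):
    """Keep RTs (in ms) within [low_cutoff, high_cutoff] and report the share removed.

    low_cutoff=0 is a placeholder, not a value taken from the study.
    """
    kept = [rt for rt in rts if low_cutoff <= rt <= high_cutoff]
    removed_share = 1 - len(kept) / len(rts)
    return kept, removed_share


# Invented RTs for illustration only.
toy_rts = [650, 900, 1400, 1750, 1900, 2600, 1100, 980, 1300, 1500]
kept, removed_share = trim_by_cutoffs(toy_rts, high_cutoff=1800)
print(f"kept {len(kept)} of {len(toy_rts)} trials; {removed_share:.1%} removed")
```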
Two things are clear from the results. First, this sentence–picture matching task is adequate for assessing whether a meaning is activated in language processing. Given the right manipulation of the test materials and the task, the activation of a meaning can be detected by measuring the participants’ RT under different conditions. In the context of this particular experiment, the activation of a meaning revealed itself through a longer RT for the n-mismatch condition. Second, lexical meanings related to number, such as those of several, two, or many, are automatically activated in language processing among Chinese ESL speakers, which leads them to notice the number mismatch between the sentence and the picture and to respond more slowly to the mismatch items or make more errors on them. The lexical number mismatch effect found in this experiment stands in sharp contrast to the results of Experiment 2, in which Chinese ESL speakers showed no number mismatch effect when the number meaning was expressed morphologically. Taken together, the results from the Chinese ESL speakers in Experiments 2 and 3 showed that lexicalized L2 meanings were automatically activated, whereas grammaticalized L2 meanings that were not grammaticalized in L1 were not.
General discussion
We suggested in the introduction that L2 learners have difficulty in automatically activating a grammaticalized meaning in language processing that is not grammaticalized or morphologically marked in a learner’s L1, and that this may be the cause of the difficulty L2 learners have in acquiring incongruent morphemes in L2. Three sentence–picture matching experiments were conducted to test the assertion. Three findings emerged from these experiments:
1. Speakers of a language that marks the number meanings morphologically (English and Russian speakers in Experiments 1 and 2) showed a number mismatch effect in L1 processing in that they took longer to respond to test items whose critical noun in a sentence and its referent in the picture did not match in number than to items that did not involve such a mismatch, but speakers of a language that does not mark number meanings morphologically (Chinese speakers in Experiments 1 and 2) did not show this effect;
2. When ESL speakers performed the task in their L2 English, Russian ESL speakers showed a number mismatch effect, but Chinese ESL speakers did not (Experiment 2); and
3. Chinese ESL speakers were able to show a lexical number mismatch effect when the mismatch occurred between a number word and an object (Experiment 3).
The number mismatch effect observed in this task provides a tangible and reliable means of determining whether the number meanings are automatically activated during language processing. More importantly, the findings provided empirical confirmation for the proposition that a meaning that is morphologically marked in L2 but not in L1 is not automatically activated (without lexical means) in L2 sentence processing; these findings strengthen the validity of our meaning-activation approach to the explanation of the morphological congruency effect.
Two issues related to the explanation of the morphological congruency effect deserve more discussion. One is how maturational constraints affect morpheme acquisition in particular and the other is how such a meaning-based model can be applied to the acquisition of L2 morphemes that are less semantic in nature such as gender marking.
The role of maturational constraints, or a critical period, in adult L2 learning is well recognized. A long-standing assertion is that the human brain becomes neurobiologically less plastic for learning a new language as an individual grows older (e.g. Lenneberg, 1967; Pulvermüller & Schumann, 1994; Scovel, 1969). More recently, a number of scholars have suggested that the brain becomes hardwired to fit a specific language (i.e. an individual’s native language) as a result of learning L1 and that altering this neurobiological configuration is very difficult. For example, in explaining why adult L2 learners are less capable of relying on implicit learning in developing native-like proficiency in an L2, Ellis (2011) commented that “the L2 learner’s neocortex has already been tuned to the L1, incremental learning has slowly committed it to a particular configuration, and it has reached a point of entrenchment where the L2 is perceived through mechanisms optimized for the L1” (Ellis, 2011, p. 40). This neural commitment view is also adopted by Kuhl in the discussion of phonological development. She suggests that “learning results in a commitment of the brain’s neural networks to the patterns of variation that describe a particular language. This learning promotes further learning of patterns that conform to these initially learned, while interfering with the learning of patterns that do not conform to those initially learned” (Kuhl, 2004, p. 832).
A maturational constraint explanation has also been suggested to account for L2 learners’ difficulty with incongruent morphemes by several scholars (e.g. Franceschina, 2005; Hawkins & Liszka, 2003; Scherag et al., 2004). However, these proposals are often less than specific and do not lead to a testable hypothesis to be empirically examined. The meaning-activation approach taken in the present study represents a proposal that can be independently and empirically verified. Within this approach, maturational constraints can be said to affect what grammaticalized meanings are routinely activated in language processing. For example, a number meaning (singular or plural) has to be activated in association with a countable noun in English. This patterned activation of grammaticalized meanings is language-specific and determined by the grammaticalization patterns of a language. Learning a language in childhood comprises learning this particular pattern of semantic activations. Learners have to develop the knowledge about what grammaticalized meanings should go with what lexicalized meanings. Because the number of grammaticalized meanings is limited and stable, it is reasonable to think that the patterned activation of grammaticalized meanings is finalized at a young age in the process of learning one’s L1, just like the finalization of perceptual categories for speech sounds in early years of life. It is this particular pattern of activation of grammaticalized meanings that is constrained by maturation. Thus, maturational constraints affect the acquisition of L2 inflectional morphemes by putting a particular pattern of semantic activation in place in childhood L1 learning that is difficult to alter. Whether the resistance to alteration is the result of neural commitment, as Ellis (2011) and Kuhl (2004) suggested, or merely “because of stylistic differences in learning at different times in life,” as Bialystok (1997, p. 132) argued, is debatable.
The second issue has to do with how a meaning-activation approach applies to grammatical morphemes that are less semantic than plural and tense markers. Good examples of such morphemes are the third person singular marker in English and gender markers in Spanish and French. These markers are clearly not as semantic as plural and tense markers in English as they do not usually express a meaning a speaker wants to convey. Instead, they are more syntax-motivated and have no semantic meaning associated with them. How such morphemes differ from more semantic morphemes in acquisition is yet to be determined. Before more evidence becomes available, there are two good reasons to examine the development of these morphemes in the current model as well. First, this model can serve as an initial point of departure if one defines meaning broadly to include all that is morphologically marked, e.g. person, gender, and case. Similar to English plural and tense markers, the third person singular meaning, for example, though more syntactic than semantic, has to be activated in the process of speech production or comprehension. Thus, the idea of a finalized pattern of meaning activation in childhood L1 acquisition discussed earlier should apply to these morphemes as well. Second, published studies point to the existence of a morphological congruency effect in the acquisition of gender marking by adult L2 learners (e.g. Franceschina, 2005; Guillelmon & Grosjean, 2001; Sabourin et al., 2006; Scherag et al., 2004). Given the presence of some conflicting findings (Foote, 2011; Keating, 2009; Sagarra & Herschensohn, 2010, 2011), further research is certainly necessary to examine the morphological congruency effect in the acquisition of gender markers in particular, and in the acquisition of many other inflectional morphemes in general.
Conclusion
The difficulty associated with inflectional morphemes in L2 learning has been widely recognized. However, we know much less about the exact nature of this difficulty. The present study represents our effort toward a better understanding of what underlies this difficulty from a cognitive perspective. The meaning-activation model of L2 morpheme acquisition proposed in this study provides a conceptual framework for exploring L2 morpheme acquisition in general and for understanding the difficulty in learning incongruent morphemes in particular.
We also hope that this study will be considered part of a more challenging but worthwhile endeavor of identifying what is not acquirable in adult L2 acquisition. Both common sense and research evidence have shown that adult L2 learners are very unlikely to develop native-like competence in all aspects of language. It is surprising in this regard that there has been little orchestrated, deliberate, and conscious effort in SLA research to identify what is not acquirable. We believe that the identification of unacquirable morphosyntactic structures should be a central topic in SLA research. Progress on this topic is a prerequisite for understanding what is unique about adult L2 acquisition.
Acknowledgements
We would like to express our appreciation to Zhang Weimin and Fang Yanhua for their assistance in the collection of data for Experiment 1.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
