Abstract
The question as to whether there is a threshold value for input below which bilinguals do not achieve a monolingual-like development often arises. Although input does not seem to be a determining factor for learning syntax, according to Juan-Garau and Pérez-Vidal, the amount of vocabulary acquired is proportional to the time of exposure. This article contributes to the current discussion with data from very young children exposed either to monolingual input of Basque or to different degrees of bilingual input of Basque and Spanish/French. The corpus for the present investigation has been extracted from the adaptation to Basque of the MacArthur–Bates communicative development inventories 1 and 2 questionnaires based on parental reports. These questionnaires, adapted to more than 40 languages throughout the world (www.sci.sdsu.edu/cdi), involve monthly data collection from different children for each age interval between ages 8 and 30 months and have proved to be a powerful instrument to establish normal linguistic development and deviance at an early stage. This study finds that the development of productive vocabulary in the four input groups established follows the tendencies described by Bates et al. but at different paces corresponding to different vocabulary sizes. In other words, the ‘nominal bias’ lasts to a higher age in the lower input groups, and the lexical diversity appears earlier in the higher input groups. Furthermore, lexical verbs, not predicates, are analysed as a separate category based on the study by Barreña and Serrat, which showed a higher proportion of verbs in Basque than in the surrounding Romance languages. This tendency towards the use of verbs is confirmed in the data collected from input groups with higher exposure to Basque.
Introduction
This article explores the lexical development in Basque of Basque monolingual and Basque–Spanish bilingual children. Our data have been extracted from the most extensive corpus on Basque collected so far, gathered with the Basque adaptation of the MacArthur–Bates communicative development inventories (CDIs) (Fenson et al., 1993; Fenson et al., 2007).
Monolingual and bilingual language acquisition
The question of whether monolingual and bilingual children follow the same pattern in their language acquisition process has been much discussed over recent decades. In this context, the definition of simultaneous and successive acquisition of two languages has been given different interpretations: McLaughlin (1984) considers age 3 a threshold for early bilingual language acquisition, whereas according to De Houwer (1991, 1995, 2009) only simultaneous and balanced contact with two languages from birth can be defined as bilingual first-language acquisition (BFLA). Recently, the distinction between BFLA and child second-language (child L2) acquisition has also been discussed. Meisel (2008) proposes a period for child L2 acquisition that could be set between 3–4 and 7–8 years of age, after which the process of acquisition becomes more and more like adult L2. Similarly, the effects of a simultaneous acquisition of two radically different languages, or the possibility that an unbalanced input may lead to a ‘strong’ and a ‘weak’ language, are other issues that have given rise to further debate (Bernardini & Schlyter, 2004; Lanza, 1998, 2001; Meisel, 2001; Müller & Kupisch, 2003).
The study of early, simultaneous acquisition of two languages has led researchers to defend different positions. The separation hypothesis, which states that children acquiring two languages simultaneously develop both languages separately from a very early age, seems to have become increasingly accepted (Deuchar & Quay, 2000; Genesee & Nicoladis, 2006; Meisel, 2007). Research carried out on bilingual children acquiring Basque and Spanish simultaneously has shown that their development in each language follows that of their monolingual counterparts very closely, going through similar stages and producing similar errors (Almgren & Barreña, 2001; Barreña, 1995, 1997, 2001; Ezeizabarrena, 1996, to quote the most recent ones).
Studies on perception have even shown that in the cases of BFLA, children are able to discriminate prosodic properties of their respective languages already at 4 months of age (Bosch & Sebastián-Galles, 2001), although they may need some more time than their monolingual peers to set the categorical distinction corresponding to either language (Sebastián-Galles, 2006).
In recent years, interest has shifted from the sole issue of separation or fusion of codes to the possibility of some cross-linguistic influence in cases of bilingual acquisition. The supposition that a bilingual is the exact sum of two monolinguals now seems out of place. Paradis (2000) stresses that, although not systematically apparent, interactions between the two developing systems may occur in situations of close language contact. Müller and Hulk (2000) even predict syntactic conditions where influences are plausible, and Barnes (2006) observed a certain influence at a syntactic level in the interrogatives of a trilingual child.
Meisel (2004, 2007) suggests that although BFLA subjects are able to differentiate between their two languages, they may develop either language at a different pace to monolinguals.
Furthermore, whereas some studies on the acquisition of the lexicon show a smaller vocabulary size for bilinguals in each language (Pearson, Fernández, & Oller, 1993), others come to the conclusion that the additive vocabulary bilingual children master is developed at the same pace as in monolingual acquisition (Hamers & Blanc, 2000). Administering the MacArthur–Bates CDI questionnaires to a Spanish–English bilingual population, Pearson, Fernández, and Kimbrough (1995) and Pearson, Fernández, Lewedeg, and Kimbrough (1997) also found that the conceptual¹ vocabulary from both languages, when added together, was fully comparable to that mastered by monolinguals.
The CDI questionnaires have been applied to bilingual populations for multiple purposes. Thus, for example, Marchman and Martínez-Sussmann (2002) and Marchman, Martínez-Sussmann, and Dale (2004) showed high correlations between vocabulary and grammatical complexity in each language when testing English–Spanish bilingual children. However, language-specific effects were also attested.
Nevertheless, it should be noted that most of the above-mentioned studies have dealt with the cases of balanced bilingualism, where children are exposed equally to both languages from birth, growing up in families where each parent follows the Grammont principle ‘one person, one language’ when addressing the child (Ronjat, 1913).
The role of input in language acquisition
It would not seem to be a matter of much discussion that in order to acquire a language, a child needs to be exposed to it, and in order to acquire two languages, to both of them. But what happens in the cases of unbalanced input of either language? What amount of input is necessary for a child in order to develop his/her language as L1 (Barnes, 2011)? Or, as Pearson et al. (1997, p. 41) put it: ‘It is common sense to expect that the more a child interacts with a speaker of a language, the more of that language the child will learn. But it is not obvious just how close the association between exposure and learning will be’.
The question of whether there is a threshold value for input below which bilinguals do not achieve a monolingual-like development often arises. At which point does language exposure determine the subsequent language development? Although input does not seem to be a determining factor for learning syntax, according to Juan-Garau and Pérez-Vidal (2001), the amount of vocabulary acquired is proportional to the amount of exposure. So, how closely are input and vocabulary growth related? This article aims to contribute to the current discussion with data from very young children exposed either to monolingual input of Basque, or to different degrees of bilingual input of Basque and Spanish/French. (Data collection took place in the three territories where Basque is spoken.²) But the main focus of our study, in addition to vocabulary growth, is the issue of possible differences in the composition of the lexicon in relation to the amount of input. In this sense, the proportion of verbs produced is a language-specific feature, which may indicate a more or less ‘native-like’ development.
Studies on the composition of the lexicon
When dealing with the appearance of the different semantic categories in early productive vocabulary, the predominance of nouns or ‘nominal bias’, which was pointed out by Bates et al. (1994), has been corroborated in other studies. In a longitudinal analysis, not based on CDI data, Nicoladis (2001) showed a preponderance of nouns in a bilingual child’s productions in English as well as in Brazilian Portuguese, independent of inter-linguistic differences in distribution and sentence position of nouns and verbs in the input.
In a study based on CDI data, Caselli, Casadio, and Bates (1999) found similarities between Italian and English as to the predominance of nouns and the later appearance of verbs in early vocabulary. The composition of the lexicon was observed to change at different vocabulary sizes. Below 10 words, sounds and noises represented a large part of the vocabulary. Verbs and adjectives hardly reached 5% before the 100-word limit (Caselli, Stefanini, & Pasqualetti, 2007).
The same developmental trends were found for Swedish (Berglund & Eriksson, 1994, 2000) and for German infants and toddlers (Kauschke & Hofmeister, 2002), although both intra- and inter-linguistic differences may appear. Thus, Bassano (1998) reported a higher proportion of verbs in a French child than in American infants before 18 months of age. Differences were also found when comparing Basque and the surrounding Romance languages (Barreña & Serrat, 2007). The percentages of nouns produced in relation to vocabulary size were found to be similar in the four languages compared. However, in Basque, the proportion of verbs was higher (Figure 1) up to a vocabulary size of 301–400 words.

Figure 1. Verbs (%) as related to total vocabulary size.
When relating the composition³ of lexical production to vocabulary size, Kern (2003) found that at an interval of 0–50 words, nouns constituted 45% of the French lexicon, predicates (verbs and adjectives) 5%, and the category ‘others’ (games and routines, noises and sounds of animals) another 40%. The ‘others’ category descended very quickly to around 10% when the vocabulary size reached 201–300 words, whereas the noun category grew to over 60%. In a second study, Kern and Gayraud (2007) showed that in normally developing French-speaking children aged 24–26 months with a vocabulary size ranging between 197 and 263 words, the proportion of nouns reached around 61%, whereas predicates (verbs and adjectives) represented between 15% and 18%. Berglund and Eriksson (1994) found the general trends of vocabulary composition among Swedish and American infants to be almost identical. At 8–16 months of age, children fell into what Bates et al. (1994) defined as the ‘first wave’ of vocabulary growth with an increase in the proportion of common nouns up to a size of about 100 words. The presence of predicates and closed-class items was almost negligible, although children with the largest vocabularies produced more predicates than children with smaller vocabularies.
Berglund and Eriksson (1994) also showed that Swedish toddlers aged 16–30 months followed the developmental tendencies defined by Bates et al. (1994) as a ‘second wave’ of vocabulary growth, with a slow increase of predicates, especially at vocabulary sizes between 100 and 400 words. Finally, in a ‘third wave’, closed-class words were proportionally consistent in vocabularies up to 400 words, followed by a sharp increase thereafter.
Sociolinguistic situation of Basque
The effort made to promote Basque during the last decades has contributed to an increasing social presence of Basque. However, it should be remembered that it shares its space with two majority languages, Spanish and French, in the area where it is spoken on either side of the Pyrenees. For this reason, most Basque children do have contact with two languages from a very early age, be it active or passive. As will be shown in the following, even though the aim of the data collectors was to focus on monolingual subjects, it was inevitable that different degrees of bilinguals would be included in the sample. This is not a unique situation within Europe (see O’Toole & Fletcher, 2010, for Irish). It should also be borne in mind that although all adult speakers of Basque are Basque–Spanish or Basque–French bilinguals and fluent in both languages, inversely, it cannot be assumed that all Spanish/French speaking adults are also proficient in Basque. This fact gives a sort of asymmetric social bilingualism, where Spanish/French is the dominant language in many areas. But it is also true that many parents, both L1 and L2 speakers of Basque, make an effort to address their children in Basque, at least at an early age. The quality of input from these L1 and L2 speakers may vary but it is not necessarily related to the amount of input given to the children. Measuring the quality of parental input would require a separate study with a different instrument.
Characteristics of Basque
Basque is a non-Indo-European language with subject–object–verb (SOV) word order, contrary to the surrounding Romance languages, which have predominantly subject–verb–object (SVO) word order. It has no morphological gender system, marks case and number postpositionally and uses suffixes with readings that may be temporal, instrumental, comitative or benefactive. Aspect marking is suffixed to the lexical verb and the auxiliary verb marks tense. The intransitive and transitive verb system with triple agreement for subjects and direct and indirect objects is complex and allows for dropping both subjects and objects. The saliency of verbs within the Basque sentence is a fact that may contribute to a higher proportion of this category in Basque as compared to the Romance languages of contact, as has also been pointed out for Irish (O’Toole & Fletcher, 2010).
Research questions
The first point of analysis in our study is the question of whether overall vocabulary growth is directly related to amount of input. The main premise is that where input is lower, slower vocabulary growth will be expected.
Secondly, development in the four input groups defined is expected to follow the tendencies described by Bates et al. (1994) along a similar path but at a different rate corresponding to the successive ‘waves’ of vocabulary diversification. Consequently, we expect the ‘nominal bias’ to last to a higher age in the lower input groups and the lexical diversity to appear earlier in the higher input groups.
Thirdly, the question posed is whether the higher proportion of verbs in Basque, as compared to the surrounding Romance languages in accordance with Barreña and Serrat (2007), is a trend that holds for all four input groups, or whether the higher input groups maintain a higher proportion of verbs than the lower input groups at equal vocabulary sizes. In other words, is there a point below which the composition of the lexicon becomes less like L1 Basque? Lexical verbs, not predicates, were analysed as a separate category in the above-mentioned study.
Method
Corpus and data collection
The corpus for the present investigation is based on data collected using the Basque version of the CDI questionnaires (Barreña, García, et al., 2008). The CDI 1 ‘words and gestures’ questionnaire, covering 8–15 months of age, focuses on the understanding and production of words and communicative gestures. Parents are asked to mark in one column the words their child understands and in a separate column the words the child understands and produces. In the Basque version, the Vocabulary Checklist section contains 397 items, corresponding to 19 semantic categories. These categories follow the original American version closely, but items that had too strong a cultural bias were changed to items that were thought to be closer to Basque children’s reality. The present study refers to sounds, games and routines as one category, nouns related to persons, objects, animals and so on, as a second category, action words or verbs as a third category, and descriptive words or adjectives as a fourth category. These are the most frequent categories of young children’s vocabulary and constitute the main body of the questionnaire in most languages available so far. In fact, nouns constitute 54% of the total vocabulary in the Basque CDI 1 questionnaire and verbs 14%. In the CDI 2 questionnaire, nouns represent 53% and verbs represent 16% of the vocabulary checklist. Other word classes are labelled as ‘others’ in a fifth category, which refers to time words, quantifiers, pronouns, adverbs, question words, postpositions and conjunctions. When counting the number of verbs produced, only the lexical verb category has been considered, in accordance with the structure of the questionnaires used.
The Basque version of the CDI 2 ‘words and sentences’, covering 16–30 months of age, contains 654 items in 21 categories in its vocabulary checklist section. Focusing on production (parents are asked to mark the items their child produces), this questionnaire includes a wider range of linguistic categories corresponding to emerging morphology and syntax, with a richer selection of items, as reflects the child’s growing proficiency. On this occasion, our analysis is limited to the vocabulary production of the same categories as in CDI 1 also within the CDI 2 age span. For typological reasons, these categories are the most easily comparable to data from other languages, since the complex Basque morphology demanded substantial adaptations in other sections.
Data were collected once for each child at a given age, and questionnaires were handed out through day care centres or individual contacts. On their completion, parents returned them to the researchers. The data reflect a fairly high proportion of parents with higher education levels (40% secondary and 47% university). So far, however, only a minimal effect (η² < 8) of maternal educational level on language development has been attested (Barreña, García, et al., 2008; García et al., 2011).
The questionnaires measure the development in children exposed to Basque from both parents from birth, although data from children exposed to varying degrees of bilingual input from parents, grandparents or caregivers were inevitably gathered in the process, as can be expected in bilingual communities. Based on the information provided by the parents on the amount of exposure to either language,⁴ the children have been grouped as Basque monolingual subjects with >90% of input in Basque (868 subjects), high-input subjects with 60%–90% exposure to Basque (294 subjects), balanced-input subjects with around 40%–60% exposure to Basque alongside Spanish/French⁵ (142 subjects) and low-input subjects with around or below 40% of Basque input (74 subjects). The smaller sample sizes of the lower input groups should be taken into account when interpreting results. So far, no statistically significant differences in development have been shown among children exposed to >60% Basque, but below that level differences appear, particularly in the later age span, approximately from 26 months of age (Almgren, Ezeizabarrena, & Garcia, 2007; Barreña, Ezeizabarrena, & García, 2008).
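The grouping rule described above can be sketched as a small function. This is only an illustration: the name `input_group` is hypothetical, and the treatment of children reported at exactly 40%, 60% or 90% is an assumption, since the bands are given only approximately in the text.

```python
def input_group(basque_pct: float) -> str:
    """Assign a child to one of the four input groups by reported % of Basque input."""
    if not 0 <= basque_pct <= 100:
        raise ValueError("percentage of Basque input must lie between 0 and 100")
    if basque_pct > 90:
        return ">90%"       # Basque monolingual input
    if basque_pct >= 60:    # assumption: exactly 60 and 90 count as high input
        return "60%-90%"
    if basque_pct >= 40:    # assumption: exactly 40 counts as balanced input
        return "40%-60%"
    return "<40%"           # low input
```

Under these assumed boundaries, a child reported at 75% Basque input would fall in the 60%-90% group and a child at 35% in the <40% group.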
Since, in some cases, parents did not provide any information about the children’s linguistic background on the questionnaires, 14 subjects from CDI 1 ages and 25 subjects from CDI 2 ages were not included in the present study, leaving a total corpus of 1378 subjects (428 from CDI 1 ages and 950 from CDI 2 ages).
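The sample sizes reported in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative, and the number of questionnaires before exclusion is an inference that assumes the 39 excluded subjects were the only ones removed.

```python
# Subjects per input group, as reported in the text
group_sizes = {">90%": 868, "60%-90%": 294, "40%-60%": 142, "<40%": 74}
# Subjects per questionnaire after exclusions, as reported in the text
by_questionnaire = {"CDI 1": 428, "CDI 2": 950}
# Subjects excluded for missing linguistic background
excluded = {"CDI 1": 14, "CDI 2": 25}

total_by_group = sum(group_sizes.values())
total_by_questionnaire = sum(by_questionnaire.values())

# Inferred, not reported: questionnaires before exclusion
before_exclusion = total_by_questionnaire + sum(excluded.values())

# Share of the corpus contributed by each input group, in percent
group_shares = {g: round(100 * n / total_by_group, 1) for g, n in group_sizes.items()}
```

Both totals come to 1378, matching the corpus size given above, and the shares make the imbalance between groups explicit: the >90% group accounts for roughly 63% of the corpus and the <40% group for roughly 5%.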
Results
Comprehension and production in Basque in the CDI 1 age span (8–15 months)
As a starting point, a general view of vocabulary understood and produced in Basque at the earliest ages will be given (note that no data on comprehension were collected on the CDI 2 questionnaire). As has been widely described (see De Houwer, 2009, for an overview), a substantial gap between comprehension and production, and also a steady growth in vocabulary comprehension can be observed at these very early ages (Figure 2).

Figure 2. Comprehension and production at 8–15 months of age (CDI 1).
Before 11 months of age, most of our subjects still do not understand any of the words included in the CDI 1 questionnaire. At 8, 9 and 10 months of age, 28.2%, 16% and 11% of the children are reported, respectively, not to understand any words. It is only after 11 months of age that more than 50% of them understand between 1 and 50 words, and at 15 months of age, the average number of words understood is 161.30, with a wide range from 23 to 396 words.
Production is still very limited at this age, and before 10 months, it is almost negligible. At 11 and 13 months of age, 50% and 29% of the subjects, respectively, still produce no words. By the end of the CDI 1 period, the average number of words produced by our subjects is 9, although individual differences span from 10.8% of the subjects who still do not produce any words, to 13% who produce between 21 and 50 words (Barreña, García, et al., 2008).
Slight inter-group differences are attested in relation to the age from which a monthly increase in average production can be accounted for. In the <40% of Basque input group, this limit can be set at 14 months. In the 40%–60% input group, increase is appreciable from 12 months of age, in the 60%–90% input group from 11 months of age and finally in the Basque monolingual input group (>90% input) at 10 months of age, with an average of 2.03 words at 11 months of age. In the Irish sample, which also contained a number of bilingual subjects, global single word production was reported from 11.5 months (O’Toole & Fletcher, 2010).
By the end of the period, at 15 months of age, average overall vocabulary production does show differences between the input groups, as reflected in Figure 3.

Figure 3. Vocabulary production as related to input at 15 months of age.
Although visually appreciable, the differences at this limited range of production are not statistically significant (F(3,57) = 0.29, p = 0.82). The lowest input group only produces an average of 4.5 words, whereas the 40%–60% input group reaches an average of 8.2 and the monolingual group 9.27, roughly double the average of the <40% input group. Unexpectedly, the 60%–90% input group shows a higher average than the others at 15 months of age.
Composition of productive vocabulary at 8–15 months of age
The question that now arises is whether any differences in lexical diversity can be discerned. Figure 4 shows the proportions of the five semantic categories taken into account for the four input groups.

Figure 4. Vocabulary production by categories as related to input at 15 months of age (CDI 1).
This figure represents percentages of the vocabulary categories for each input group counted from the age of appreciable production. As pointed out in reference to Figure 3, no statistically significant differences appear, but some tendencies can be seen. Although nouns certainly predominate, it seems clear that in the monolingual (>90%) input group, there is greater lexical variety. While nouns constitute 70% of vocabulary in the <40% input group, this percentage drops to 60% in the 40%–60% input group, 55% in the 60%–90% input group and 52% in the >90% input group.
The lowest input group produces no adjectives or ‘other categories’, whereas in the >90% group, adjectives represent 2.45% and other categories 2.56%. At this point, these ‘other categories’ include quantifiers, pronouns and adverbs. Even though their presence is still not statistically significant, these categories also appear sporadically in the 60%–90% input group.
As for verbs, a difference can be distinguished between the lowest input groups, where they represent around 3%, the 60%–90% input group with 4.7% and especially the >90% group, which shows values of 8.25%. So, although global data confirm that these subjects fall into the so-called first wave of vocabulary growth in the CDI 1 age span, one can appreciate the incipient presence of the ‘second wave’ of vocabulary reorganisation in the >90% group, in spite of its having an average vocabulary still below 10 words.
Production in the CDI 2 age span
During the CDI 2 age span (16–30 months), vocabulary growth is steady, reaching an average of 400 words by 30 months of age (Figure 5), with 5.3% of the subjects still situated below the 100-word threshold, and 28.7% reaching the 500- to 654-word interval (Barreña, García, et al., 2008).

Figure 5. Average vocabulary growth in CDI 2 ages.
The most salient differences in vocabulary growth between the four input groups during the period of 16–30 months are reflected in Figure 6. In general, steady and regular growth can be seen, in which over 100 words are produced in the age range of 19–21 months, 200 words in the age range of 25–27 months and 300 words in the age range of 28–30 months. However, the subjects with the lowest input (<40%) show a more irregular pattern of development in which rises in production are followed by falls. The variance analysis identified significant differences only in the final age period between 28 and 30 months (F(3,236) = 14.00, p < 0.01). The post hoc analysis (Student–Newman–Keuls) reveals three levels of vocabulary production related to input. The highest level of vocabulary is that of the groups with an input of over 60% in Basque (60%–90% and >90%), who produce more than 380 words. Next, at an intermediate level, are those subjects with an input in Basque of between 40% and 60%. The lowest word production, slightly over 200 words, is found in the subjects with an input of less than 40%. So, in view of these results, the question that now arises is whether the same differences are reflected in the main lexical categories.

Figure 6. Average vocabulary growth as related to input (CDI 2).
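For readers less familiar with the F values reported in this section, the following minimal sketch shows how a one-way analysis of variance statistic and its degrees of freedom are obtained. It is not the authors' analysis script: the function name `one_way_anova` and the toy word counts are invented for illustration. Assuming a standard one-way design, degrees of freedom of 3 and 236 would correspond to four input groups and 240 children in the 28–30-month band.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical word counts, invented for illustration only
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(toy)
```

A large F indicates that the group means differ more than the spread within groups would lead one to expect; the post hoc test then identifies which groups differ.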
Nouns as compared to total lexical size
As can be seen in Figure 7, nouns make up from 50% up to 68% of the language produced by all four input groups. In general, the proportion of nouns increases with age and increases at all ages in relation to the amount of input, comprising 57% of language overall and at later ages even surpassing this. According to the variance analysis, significant differences in relation to the amount of input are found at the age range of 22–24 months (F(3,195) = 8.33, p < 0.01), 25–27 months (F(3,220) = 4.14, p < 0.01) and 28–30 months (F(3,236) = 12.68, p < 0.01). The post hoc tests (Student–Newman–Keuls) show that in the age range of 22–24 months, there are significant differences between the groups with higher input (>90% and 60%–90%) and those with lower input (<40% and 40%–60%): the former have fewer nouns (around 56%) in total lexical mass, while the latter show percentages over 63%. However, within the age range of 25–27 months, where the results of the variance analysis were significant as has been mentioned, the post hoc analysis did not detect significant differences between groups.

Figure 7. Percentages of nouns out of total lexical size (CDI 2).
In the age range between 28 and 30 months, the post hoc analysis detected three levels according to the proportion of nouns in the total vocabulary. The group with the highest proportion of nouns is the one with the lowest input (<40%), with a percentage of nouns at around 70%. At an intermediate level with around 62% of nouns, we find the subjects with balanced input (40%–60%), while the groups with the lowest proportions of nouns (below 60%) in their lexical mass are the groups with the highest input in Basque (>90% and 60%–90%).
Verbs as compared to total lexical size
It can be seen in Figure 8 that the percentage of verbs in the total vocabulary is far lower than the percentage of nouns and never surpasses 17% of lexical mass. This percentage increases with age in all input groups with the exception of the low-input group (<40%), which shows irregular increases and decreases in this percentage.

Figure 8. Percentages of verbs out of total vocabulary size (CDI 2).
The results of the variance analysis showed significant differences from the age of 22 months: between 22 and 24 months (F(3,195) = 3.04, p < 0.05), at the age range of 25–27 months (F(3,220) = 3.26, p < 0.05) and between 28 and 30 months (F(3,236) = 18.53, p < 0.01).
The post hoc analysis revealed significant differences at the age range of 22–24 months between the subjects with the highest (>90%) and lowest (<40%) input, the former showing higher percentages of verbs (at around 15%). In the age range of 25–27 months, the post hoc analysis did not show significant differences between groups by input. Finally, at the age range of 28–30 months, the post hoc analysis revealed three levels for percentages of verbs. The input groups with the highest percentage of verbs at over 15% were those with the highest input (>90% and 60%–90%), the middle group for percentage of verbs at around 13% was the balanced input group (40%–60%) and the lowest group for percentage of verbs, less than 10%, was the group with the lowest input (<40%) in Basque.
General characteristics of vocabulary composition during CDI 2 age span
The average production throughout the CDI 2 age period (16–30 months) is characterised by a large distance between nouns and other categories in proportions that vary between input groups. The verb category also shows variation in relation to input. However, no statistically significant inter-group differences related to amount of input are found for the categories adjectives or sounds, games and routines. Nonetheless, in terms of average production, two subgroups can be established for the latter category: above and below 60% input.
The category labelled as ‘others’ shows statistically significant inter-group differences (F(3,95) = 4.16, p < 0.01), which are not necessarily identical for all the word classes included. Thus, for example, there is again a scale of differences for time words, more definable between the lowest and the other three groups. For the category of question words, the difference is significant between the highest and the lowest input groups. For the rest, differences generally become marked above and below 60% input.
Inter-group differences in average production of verbs at equal vocabulary sizes
The final issue to examine is whether high-input subjects produce more verbs in comparison to low-input subjects at equal lexical sizes. The first interval established by Bates et al. (1994) (0–100 words) mostly reflects the increase of the noun category according to the so-called first wave of vocabulary growth. Prior to a lexical size of 100 words, average production of verbs (4–5) is too small to make any comparison meaningful. The next interval established by Bates et al. (1994) (100–400), on the other hand, is too broad to reflect the developmental trends we find.
Using intervals of 100 words, Table 1 reflects, in absolute numbers, that at a vocabulary size of 101–200 words, the <40% input group produces only about half as many verbs as the two higher input groups. The 40%–60% group is closer to the higher input groups than to the lower input group in the average number of verbs. At the next interval (201–300 words), the three higher input groups seem to divide stepwise, with the lowest input group at a 6-point distance from the 40%–60% group. By the time vocabulary reaches 301–400 words, the 40%–60% group has become situated at an almost identical level to the two higher input groups, while the <40% group is still producing an average of 7–8 verbs fewer. Finally, at the 401–500-word stage, this pattern has become even more pronounced.
Table 1. Average number of verbs produced as related to input and total vocabulary size.
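The comparison at equal vocabulary sizes can be sketched as a grouping step: each child is assigned to an input group and to a 100-word vocabulary interval, and verb counts are averaged within each cell. The sketch below is illustrative only. The function names, the record layout, and the treatment of the group boundaries (for example, whether exactly 60% falls in the 40%–60% or the 60%–90% group) are assumptions, and the records are invented.

```python
from collections import defaultdict


def input_group(basque_input_pct):
    """Map a percentage of Basque input to one of the four input groups.

    Boundary handling (40, 60 and 90 exactly) is an assumption.
    """
    if basque_input_pct < 40:
        return "<40%"
    if basque_input_pct < 60:
        return "40%–60%"
    if basque_input_pct <= 90:
        return "60%–90%"
    return ">90%"


def vocab_interval(vocab_size, width=100):
    """Return the (lower, upper) interval: (0, 100), (101, 200), (201, 300), ..."""
    if vocab_size <= width:
        return (0, width)
    lower = ((vocab_size - 1) // width) * width + 1
    return (lower, lower + width - 1)


def mean_verbs(records):
    """Average verb counts per (input group, vocabulary interval) cell.

    Each record is (basque_input_pct, vocab_size, verb_count).
    """
    cells = defaultdict(lambda: [0, 0])  # key -> [sum of verbs, number of children]
    for pct, vocab, verbs in records:
        key = (input_group(pct), vocab_interval(vocab))
        cells[key][0] += verbs
        cells[key][1] += 1
    return {key: total / n for key, (total, n) in cells.items()}


# Hypothetical records, invented for illustration only.
records = [(30, 150, 10), (35, 180, 14), (95, 150, 24)]
for (group, interval), avg in sorted(mean_verbs(records).items()):
    print(group, interval, avg)
```

Grouping by interval rather than by age is what allows children with the same lexical size but different amounts of input to be compared directly.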
Discussion and conclusions
Our first research question related the rate of Basque input to overall vocabulary growth in Basque, expecting slower vocabulary growth where input is lower. In fact, our data seem to indicate that the evolution of children’s early vocabulary follows the general trends of development in accordance with the amount of input.
The CDI 1 data seem to show that, even at early ages, the amount of input makes a difference to the outcomes of productive vocabulary. However, it is not the >90% Basque input group that produces the most vocabulary at 15 months but the group that is exposed to Basque 60%–90% of the time. This was unexpected, and it can hardly be attributed to parents’ difficulty in distinguishing which of the languages the child responds to, as in the Galician case (Pérez-Pereira, 2008), since Basque and Spanish are so different. We assume that it is due to individual factors or occasional variations in the amount of input, in the sense that some of these children may occasionally have had more Basque input than their parents estimated at that precise time. Variations in input, as Pearson et al. (1997) pointed out, might alter the results.
However, if we look at the age of first productive use, there is a trend that shows that the lower input group is later in non-sporadic production of its first words (at 14 months) than the higher input group (at 10 months).
During the CDI 2 period, trends for global vocabulary growth become more distinguishable in relation to the amount of input. The monolingual (>90% input) group shows the most regular evolution with a steady increase from 16 months of age. The 60%–90% input group seems to stabilise during the latter part of the period, following the >90% group fairly closely. The two lower input groups reach both higher and lower scores than the two higher groups, and their development is more unstable. By the end of the period, however, vocabulary growth in the <40% input group declines considerably in comparison with the other groups.
These data seem to indicate that although, in general terms, the amount of input influences the rate of vocabulary growth, as postulated by Juan-Garau and Pérez-Vidal (2001), the relation between input and production at early ages might not be as straightforward as assumed. According to the trends observed by Pearson et al. (1997), the relation between quantity of input and amount of vocabulary is noteworthy for high-input groups. However, it is weaker for smaller amounts of input and for environments where input is not constant. As shown in previous studies (Almgren et al., 2007; Barreña, Ezeizabarrena, et al., 2008), there seems to be a limit at around 60% of input that is significant for developmental trends, although it is difficult to establish an exact threshold level.
The second research question postulated a development in the four input groups, following the tendencies described by Bates et al. (1994) along a similar path but at different paces corresponding to the successive ‘waves’ of vocabulary diversification. Thus, the ‘nominal bias’ was expected to last to a higher age in the lower input groups and lexical diversity to appear earlier in the higher input groups.
Though differences during the stage of limited production of CDI 1 are not yet statistically significant, some trends in the composition of the lexicon have been shown to emerge. The two lowest input groups show very little lexical diversity. Nouns represent the main part of their productive vocabulary and verbs are hardly present. The higher input groups produce proportionally fewer nouns and more words from other categories. Especially noteworthy is the relatively high proportion of verbs used by the >90% input group, even with an average vocabulary of around 10 words.
Some individual peculiarities are also attested during the CDI 1 age span. In Bates et al. (1994), no examples from the category ‘words about time’ were attested during the CDI 1 age span. In the Swedish sample (Berglund & Eriksson, 1994), no child with a vocabulary size below 50 words produced any items from this category. In the Basque sample, one exception was found within the CDI 1 age span: in the >90% input group, a subject with an extremely low production at 15 months of age (12 words) does use the time expression ‘gero’ (then, later). The incipient differences observed between the four input groups in our corpus are expected to show more clearly in the CDI 2 age range.
It is throughout the CDI 2 age span that statistically significant inter-group differences are attested. Up to 20–21 months of age, production only sporadically surpasses the 100-word limit, identified by Bates et al. (1994) as the ‘first wave’ of vocabulary growth with an increase in the proportion of common nouns and a proportion of verbs and adjectives that hardly reaches 5% (Caselli et al., 2007). Thereafter, the slower pace of vocabulary growth in the two lower input groups is also reflected in the degree of lexical diversity.
A smaller vocabulary size brings about a higher proportion of nouns, a lower proportion of verbs and less lexical diversity. Figure 7 reflects the gradual inter-group differences, again setting a separation line of above or below 60% of nouns between the two lower and the two higher input groups. Statistically significant differences were clearest between the <40% and the >90% input groups from 24 months of age, but when the whole CDI 2 period is considered, the proportion of nouns decreases gradually from the highest proportion in the <40% group to the lowest proportion in the >90% group. An interesting feature that reflected the general tendency of interdependence between vocabulary size and nominal bias was observed at 20 and 21 months of age for the <40% input group. Exactly at these ages, this group showed an increase in general vocabulary size (Figure 6), which corresponds to a decrease in the proportion of nouns (Figure 7) and an increase in the proportion of verbs (Figure 8). It is true that the small sample sizes (two and three subjects, respectively) lead us to interpret these data with caution, but nevertheless they seem to confirm the general tendencies for vocabulary diversification. On the whole, vocabulary growth and vocabulary diversification seem to confirm that bilinguals, at least those exposed to <60% input of the language in question, may follow different rates of development from monolinguals, as postulated by Meisel (2004, 2007) and others.
The third question posed was whether the higher proportion of verbs in Basque as compared to the surrounding Romance languages, in accordance with Barreña and Serrat (2007), is a trend that holds for all four input groups, or whether the higher input groups maintain a higher proportion of verbs than the lower input groups at equal vocabulary sizes. In other words, is there a point below which the composition of the lexicon becomes less like L1 Basque?
At a lexical size of between 197 and 263 words, Kern and Gayraud (2007) found an average of 60%–62% nouns and 18%–15% predicates, including verbs and adjectives. At 23 months of age and an average vocabulary size of 215, our monolingual subjects (>90% input group) show a distribution where nouns represent 59% of their production and verbs (not predicates) represent 16%. This might indicate that the ‘nominal bias’ is somewhat less pronounced in Basque than in the French corpus. Verbs, on the other hand, represent a slightly higher proportion in the Basque sample, taking into account that adjectives constitute 7% in this group at this age. In our opinion, these data confirm the higher proportion of verbs as a characteristic feature of Basque.
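The comparison with the French data rests on a small piece of arithmetic, since Kern and Gayraud (2007) report predicates (verbs and adjectives together) while verbs are reported separately here. A minimal sketch follows, assuming that predicates consist only of verbs and adjectives; the function and variable names are hypothetical, and the percentages are those quoted in the paragraph above.

```python
def predicate_share(verb_pct, adjective_pct):
    """Combine verb and adjective percentages into a predicate percentage.

    Assumes predicates are made up of verbs and adjectives only.
    """
    return verb_pct + adjective_pct


# Percentages quoted in the text for the >90% input group at 23 months.
basque_verbs = 16
basque_adjectives = 7

# Range of predicate percentages quoted from Kern and Gayraud (2007).
french_predicates_range = (15, 18)

basque_predicates = predicate_share(basque_verbs, basque_adjectives)
print(basque_predicates)  # 23
print(basque_predicates > max(french_predicates_range))  # True
```

Under this assumption, Basque verbs alone (16%) already fall within the range reported for French verbs and adjectives combined.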
The 60%–90% input group produces between 199 and 268 words at 23–25 months of age, with an average of 59% nouns and 16% verbs, values that are close to those of the monolingual group. When it comes to the 40%–60% input group, proportions change somewhat. At 24 and 25 months of age and vocabulary sizes between 207 and 228, nouns represent 64% and verbs represent 13%. Finally, the lowest input group shows rather irregular vocabulary growth until the average score is maintained above 200 words at 26–27 months of age (Figure 6). At this point, nouns represent an average of 64% and verbs represent around 13%, although percentages of verbs decline during the last months of the CDI 2 age span for these subjects, as was reflected in Figure 8.
At equal lexical sizes, there are noteworthy differences. The average number of verbs produced is significantly higher in the >90% input group and in the 60%–90% input group up to a lexical size of 300 words. At the next interval, the 40%–60% input group also becomes more L1-like, whereas the lowest input group, whose scores at the end of the CDI 2 period are the lowest, maintains its distance from the other groups. It is also noteworthy that already at CDI 1 ages and at very small vocabulary sizes, verbs are present in a high proportion in the higher input groups. In fact, these are the groups that show a clearly different pattern from the surrounding Romance languages.
Again, these findings should be interpreted with some caution, bearing in mind that the sample for the lower input groups is smaller than that for the monolingual group. A possible interpretation of these data is that an unbalanced input may lead to a ‘strong’ and a ‘weak’ language, as pointed out by Bernardini and Schlyter (2004), Meisel (2001) and Müller and Kupisch (2003), to cite only the most recent ones. However, in order to examine whether the additive vocabulary that bilingual children master is developed at the same pace as in monolingual acquisition (Hamers & Blanc, 2000), it would be of interest to examine data on both languages, a project that is currently underway.
Acknowledgements
We are greatly indebted to Margareta Almgren for her time and expertise throughout the preparation of this article. We are also extremely grateful to Pilar Larrañaga and Pedro Guijarro-Fuentes for their very helpful and incisive comments.
Funding
This work was supported by the Basque Government (PI2009-22) and the Spanish Ministry of Science and Innovation (FFI2009-13956-CO2-2 sub-programme FILO).
