Abstract
The vocabulary size and composition of one group of full-term and three groups of low-risk preterm children with different gestational ages (GA) were longitudinally compared at 10, 22 and 30 months of age. Expressive vocabulary development was assessed through the CDI. Cognitive development was also assessed at 22 months (Battelle Developmental Inventory), and data concerning biological and environmental characteristics of the children were also obtained. Growth curve analyses indicated that there were no significant differences in vocabulary size or percentage of word categories among GA groups. Regression analyses showed that word production and cognitive scores measured at 22 months were the main predictors of total vocabulary and word categories at 30 months. Gender, maternal education and GA did not contribute significantly to the variance in vocabulary size or in the use of the vocabulary categories. Therefore, GA does not seem to affect vocabulary development and composition when biomedical complications associated with prematurity are excluded.
Introduction
In this article we compare the development of expressive vocabulary size and composition in low-risk preterm (PT) children with that of full-term (FT) children in a longitudinal study, and analyse the factors which predict vocabulary size and composition at 30 months of age.
Development of vocabulary composition
The study of vocabulary development greatly increased after the publication of the Communicative Development Inventories (CDI) (Fenson et al., 1994). This instrument allowed investigators to study large samples of children, which increased the possibility of generalizing results. At the same time the adaptation of the CDI to a great number of languages made comparisons possible among children speaking different languages. Quantitative comparisons of vocabulary production led to more refined analyses of the type of vocabulary used and stylistic variations among children. The study of vocabulary composition was stimulated by the publication of the seminal paper by E. Bates et al. (1994). The authors investigated the percentage of types of words used by 1803 children between 8 and 30 months of age. The study found that the proportion of common nouns (animal names, vehicles, toys, clothing, body parts, small household items, food and furniture), predicates (verbs and adjectives) and closed-class words (pronouns, prepositions, question words, quantifiers, articles, auxiliary verbs and connectives) varied depending on the children’s vocabulary size. E. Bates et al. (1994) suggested that changes in vocabulary composition between 16 and 30 months of age reflect a shift in emphasis from reference to predication to grammar.
Later, Caselli, Casadio, and Bates (1999) studied the vocabulary composition of 1001 American English-speaking and 386 Italian-speaking children between ages 1;6 and 2;6 through the CDI. This time the authors included in the analysis the category of social words (names for people, sounds of animals and things, games and routines), in addition to the other three categories studied in E. Bates et al. (1994). There was a high correlation between age and the type of words used, and an even greater one between vocabulary size and the type of words used. The general developmental trend found by Caselli et al. (1999) indicates that common nouns show a marked increase between 1 and 200 words, and a slight, steady drop after that point, although common nouns continue to be the most frequent category, comprising around 45% of the overall vocabulary. Predicates show steady growth throughout this period of development, reaching their highest values when vocabularies reach 400 words or more (around 25% of the total). Closed-class words show hardly any growth in English and Italian children up to 400 words, and then start to increase up to 12–15%. Finally, social words start out as the largest category in children with vocabularies of under 50 words (45% for American and 65% for Italian children), but show a sharp nonlinear drop after that point.
Similar developmental trends to those found by Caselli et al. (1999) for English- or Italian-speaking children were found in other studies carried out with (Mexican) Spanish-speaking children (Jackson-Maldonado, Thal, Marchman, & Gutierrez-Clellen, 1993), Finnish (Stolt, Haataja, Lapinleimu, & Lehtonen, 2008), Hebrew (Maital, Dromi, Sagi, & Bornstein, 2000) and Estonian-speaking children (Schults & Tulviste, 2016; Schults, Tulviste, & Konstabel, 2012).
Preterm children’s vocabulary development
When investigating PT children it is important to bear in mind that they are not a homogeneous group, and therefore not all have an equal risk of suffering developmental delays (Singer et al., 2001). PT children may be classified according to gestational age (GA) into late preterm (LPT) (GA 34–36 weeks), moderately preterm (MPT) (GA 32–33 weeks), very preterm (VPT) (GA between 28 and 31 weeks) and extremely preterm children (EPT) (GA below 28 weeks) (Blencowe et al., 2013; Goldenberg, Culhane, Iams, & Romero, 2008; March of Dimes, PMNCH, Save the Children, & World Health Organization, 2012). Different studies have shown that not only are GA and birth weight (BW) (which are usually correlated with each other) factors that predict later linguistic outcomes, but that medical complications, such as bronchopulmonary dysplasia and intraventricular haemorrhage, are also important determining factors. The risk of medical complications increases as GA and BW decrease (Johansson & Cnattingius, 2010). Therefore it would be expected that PT children with different GAs or BWs, and thus a different incidence of medical complications, should have different linguistic outcomes. In this setting, it is very important for investigations to control for the characteristics of the PT children studied. A pertinent research question, therefore, is whether PT children with different degrees of risk have different linguistic outcomes.
Most studies on the vocabulary size (production) of PT children apparently confirmed that PT children show delays in relation to FT children (Foster-Cohen, Edgin, Champion, & Woodward, 2007; Kern & Gayraud, 2007; Sansavini, Guarini, & Savini, 2011; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Schults, Tulviste, & Haan, 2013; Stolt et al., 2014; Stolt, Matomaki, Haataja, Lapinleimu, & Lehtonen, 2012). However, these studies were carried out with EPT or VPT children (Foster-Cohen et al., 2007; Sansavini, Guarini, & Savini, 2011; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Stolt et al., 2014; Stolt et al., 2012), or the existence of additional medical complications was not fully controlled (Kern & Gayraud, 2007; Schults et al., 2013). A few follow-up studies carried out with children < 32 weeks of GA found differences in expressive vocabulary size only after a given point in time (around age 18 or 24 months), although not earlier (Bosch, Ramon-Casas, Solé, Nícar, & Iriondo-Sanz, 2011; Stolt, Haataja, Lapinleimu, & Lehtonen, 2009), which highlights the need for longitudinal studies. In contrast, other studies mostly carried out with PT children with a wider range of GA, and not only EPT or VPT, did not find significant differences between PT and FT children’s expressive vocabulary size development (Cattani et al., 2010; Menyuk, Liebergott, Schultz, Chesnick, & Ferrier, 1991; Pérez-Pereira et al., 2011; Pérez-Pereira, Fernández, Gómez-Taibo, & Resches, 2014; Pérez-Pereira, Fernández, Resches, & Gómez-Taibo, 2013; Sansavini et al., 2006; Stolt et al., 2007).
These differing results lead to an interesting research question: do low-risk PT children differ in expressive vocabulary size from high-risk PT children?
Preterm children’s vocabulary composition
The study of vocabulary composition seems to have even greater relevance because it offers more detailed information on vocabulary development than vocabulary size alone. The study of vocabulary composition with PT children was carried out along the lines of the study by Caselli et al. (1999) with typically developing children. One of the first studies which specifically addressed this topic was that carried out by Kern and Gayraud (2007) with a large sample of 323 PT children and a control group of 166 FT children between 24 and 26 months of age, using the French version of the CDI. In relation to the relative proportion of the different categories of words used, Kern and Gayraud (2007) found that FT children produced significantly more nouns than EPT and VPT children, and more predicates than VPT children. EPT children produced significantly fewer predicates and closed-class words than the other GA groups (VPT, MPT and FT children). In contrast, EPT children produced a significantly higher number of social words than VPT, MPT and FT children. Similarly, VPT children produced more social words than the FT group. In any case, MPT children did not show any significant differences from FT children, and when EPT children were excluded from the entire PT group, no differences were found between PT and FT children.
Sansavini et al. (2006) studied 73 Italian PT children with BW < 1600 grams and GA < 33 weeks, and one control group of 22 FT children at 26 months of age. The results obtained in the Italian CDI showed that the expressive vocabulary composition of the PT group was similar to that of the FT sample.
Stolt et al. (2007) compared the expressive vocabulary size of 66 PT children with very low birth weight (VLBW) (mean GA 28 weeks) and 87 FT children at 24 months of age, as well as the composition of their lexicon by using the Finnish CDI. A few children in both groups had a mental developmental index < 85. The expressive vocabulary composition of the PT and FT children was similar in relation to total vocabulary size, and confirmed the developmental sequence found by Caselli et al. (1999): from routines to reference to predication and finally to grammatical function words. The only significant differences were found for the group of children with vocabulary size > 425 words, with FT children having higher percentages of grammatical function words and lower percentages of common nouns than VLBW children. There were no significant differences between the FT and PT groups regarding vocabulary size.
Schults et al. (2013) compared vocabulary size and vocabulary composition in one group of 40 PT children of ages between 16 and 25 months, with a mean GA of 30.6 (which is in the range of VPT children), to those of another group of 120 FT children matched to the PT group by age and gender. Using the Estonian CDI to assess the children, the authors found that the FT children had larger vocabularies than the PT children. Significant differences were also noted between the PT group and the FT group regarding vocabulary composition, with the PT group producing significantly more social terms and fewer function words than the FT group. There were no differences in the production of predicates and common nouns. Regression analysis indicated that preterm birth seems to be associated with a higher proportion of social terms and a lower proportion of function words and predicates, as well as with a lower vocabulary in comparison to full-term birth.
Recently, and using a different assessment instrument which does not offer a complete panorama of vocabulary composition, Sansavini et al. (2015) found that EPT children show delays in early noun and predicate comprehension and production in relation to FT children at 24 months of age.
Particularly relevant for the present study are two longitudinal studies on vocabulary composition, one carried out by Stolt et al. (2009) and the other by Sansavini, Guarini, Savini, Broccoli, et al. (2011). Stolt et al. (2009) compared lexical development of 32 VLBW children (mean BW 1032 g and mean GA 28 weeks) and 35 FT children. Five VLBW children had major neurological disabilities (cerebral palsy, mental retardation, bilateral hearing loss, or severe visual impairment). Receptive lexicon was gathered at 9, 12 and 15 months, and expressive lexicon at 9, 12, 15 and 24 months of age using the Finnish version of the CDI. Receptive vocabulary sizes of VLBW children were significantly smaller than those of FT children at each age point. Expressive vocabulary size, however, did not differ between VLBW children with no major neurological disabilities and FT children at each age point. The only significant difference found in total expressive vocabulary was between the entire group of VLBW children (including the five children with major neurological disabilities) and the FT children at 24 months of age. Comparisons were carried out using the raw values of social terms, common nouns, verbs, adjectives and grammatical function words, and also using the percentages of these categories counted from the total number of words in the lexicon of children with vocabularies of similar sizes (Stolt et al., 2009). The acquisition order of the different lexical categories of the receptive lexicon was the same in VLBW and FT children: social terms and common nouns were acquired first, then verbs, followed by adjectives and grammatical function words. There were significant differences, however, in the number of words in each category in the receptive vocabularies of VLBW and FT children at 9, 12 and 15 months of age. These results are in contrast with those found for expressive vocabulary. 
The raw values of all lexical categories in the expressive lexicon were only significantly lower for VLBW children at 24 months of age. When comparisons were performed with the percentages of lexical categories in the receptive lexicons of similar sizes, most proportions were comparable for the two groups, the only exception being grammatical function words, which were lower in the VLBW than in the FT group in the two lower lexicon size subgroups (1–9 and 10–49 words). Practically no difference was detected between VLBW and FT children in the composition of the expressive lexicon when the lexicon size was taken into consideration at 9, 12, 15 and 24 months of age.
Sansavini, Guarini, Savini, Broccoli, et al. (2011) longitudinally compared a group of 104 very preterm (VPT) children (mean GA 29.5 weeks, mean BW 1268 g) without major cerebral damage to 20 full-term infants (mean GA 40 weeks, mean BW 3470 g) in their vocabulary size and composition, both in word comprehension and production. The children’s vocabularies were studied through the short forms of the Italian CDI, which were administered at 12, 18 and 24 months of age. Significant differences between PT and FT children were found in word production and a trend found in word comprehension, with PT children showing lower vocabulary sizes. In addition, an interaction between age and group was found in the MANOVA analyses, which indicates that the differences between PT and FT children increased from 12 to 24 months of age for total word production. In relation to vocabulary composition, PT children understood a significantly lower number of social words, nouns and predicates at 18 months than FT children. In contrast, no significant difference was found in the production of the four word categories at 12, 18 or 24 months of age, the only exception being that PT children produced a significantly lower number of social words at 18 months of age than FT children (the difference did not reach significance after Bonferroni correction for multiple comparisons). The lower number of social words in PT children is against expectations.
Therefore, the findings of previous studies comparing vocabulary composition in FT and PT children show conflicting results. Some studies have found greater differences for receptive vocabulary than for expressive vocabulary (Stolt et al., 2009; Sansavini, Guarini, Savini, Broccoli, et al., 2011), but in some cases no significant difference was found for expressive vocabulary (Sansavini et al., 2006; Sansavini, Guarini, Savini, Broccoli, et al., 2011). Other studies (Kern & Gayraud, 2007; Schults et al., 2013) found differences in expressive vocabulary composition which paralleled those found by Caselli et al. (1999) with typically developing children: children with lower GA tended to use a higher proportion of social words and lower proportions of closed-class words and predicates than children with GA over 37 weeks (FT). The different results found in the former studies do not seem to be related to the characteristics (GA) of the samples of the investigations, and new studies seem necessary to disentangle the point.
One aim of this study is to ascertain the particular role of GA in vocabulary growth and composition, excluding the effect of severe medical complications, by studying a sample of PT children with a wide GA range and no severe associated disabilities. Furthermore, as this is a follow-up study, by using growth curve modelling it is possible to check the longitudinal trajectory of vocabulary size and vocabulary composition, which may be different for PT and FT children. At the same time, longitudinal data may help determine whether differences exist at a particular age but not at other ages, and this point is relevant since the age of assessment of children in previous studies has varied. As far as we know, only one of the former studies on vocabulary composition adopted a longitudinal strategy of data gathering and performed an analysis of the developmental trajectories of children which goes further than comparisons at particular points of development (Sansavini, Guarini, Savini, Broccoli, et al., 2011). In contrast to the study of Sansavini, Guarini, Savini, Broccoli, et al. (2011), the PT sample of our study contains PT children with a wider range of GA and does not include only VPT children.
Predictors of vocabulary growth and composition
Another aim of the study is to investigate which factors may affect vocabulary size and composition. Previous research pointed to the definite determinant role of previous cognitive ability (Pérez-Pereira et al., 2014; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Stolt et al., 2007), and previous linguistic development (Bosch et al., 2011; Pérez-Pereira et al., 2014; Sansavini, Guarini, & Savini, 2011; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Sansavini et al., 2010; Schults et al., 2013). Gestational age (Foster-Cohen et al., 2007; Sansavini et al., 2010; Schults et al., 2013; Stolt et al., 2007) was also found to have an impact on vocabulary growth and composition, although other studies did not find this effect (Menyuk et al., 1991; Pérez-Pereira et al., 2014; Pérez-Pereira et al., 2013; Sansavini et al., 2006). The impact of other variables, such as gender or maternal education, is more controversial. Gender, a typical biological variable, was found to have an effect on the vocabulary development of PT children (Bosch et al., 2011; Sansavini et al., 2006; Sansavini, Guarini, & Savini, 2011; Schults et al., 2013), with boys having lower results than girls. Other studies, however, did not find this result and gender differences were observed only in FT children (a generally well-established result with typically developing children: Eriksson et al., 2012; Fenson et al., 2007; Stolarova et al., 2016) but not in VLBW children (Stolt et al., 2007). There is substantial evidence that maternal education as an index of socioeconomic status affects vocabulary size in the case of typically developing children (Hoff, 2006). Certain studies with PT children also observed that children with mothers with a higher education had a greater vocabulary size than children with mothers with a lower educational level (Menyuk et al., 1991; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Stolt et al., 2007). 
Other studies, however, did not observe this effect (Pérez-Pereira et al., 2014; Sansavini et al., 2006). The effects of these factors will also be explored in our study.
Aims and hypotheses
To summarize, the aim of the present study is to compare longitudinal trajectories of expressive vocabulary size and vocabulary composition in low risk PT and FT children who were followed from 10 to 30 months of age. We will also analyse the effect of certain factors on vocabulary size and composition.
The hypotheses of the present study are:
H1: There will be no significant difference in vocabulary size or composition between children with different GAs (different groups of PT children and FT children) at any time, given the low risk composition (absence of serious medical complications) of the PT sample.
H2: There will be no differences in longitudinal trajectories between children with different GAs.
H3: Previous cognitive and linguistic development will have a clear determinant effect on later vocabulary size and composition, while gestational age, maternal education and gender will have a smaller effect.
Method
Participants
The children of the study form part of an original sample of 150 PT and 49 FT children recruited at birth in four different hospitals in Galicia (Spain) for a longitudinal investigation. Approval by the Galician Ethics Committee of Clinical Research and parents’ consent were obtained before beginning the investigation. For the purposes of the present study the children were assessed on their language development at 10, 22 and 30 months of age. Age was corrected for PT children. The sample at 10 months comprised 142 PT and 49 FT children; at 22 months of age there were 137 PT and 43 FT children; finally at 30 months, the PT sample consisted of 115 children and the FT sample of 37 children.
PT children with additional serious complications were excluded from the study. Among the exclusion criteria were cerebral palsy (as diagnosed up until 9 months of age), periventricular leukomalacia, intraventricular haemorrhage greater than grade II, hydrocephalus, encephalopathy, genetic malformations, chromosomal syndromes, metabolic syndromes associated with mental retardation, or important motor or sensory impairments. Newborn children with Apgar scores below 6 at 5 minutes were also excluded.
Although the number of children decreased over time, the characteristics of the PT sample did not vary throughout the period studied (see Pérez-Pereira et al., 2014 for a complete description). At 30 months of age, the characteristics of the 115 PT and the 37 FT children were very similar to those of the sample at the beginning of the study. The characteristics of the PT and FT samples in terms of GA, BW and Apgar score (1 minute) as well as maternal education level and gender at the beginning of the study and at 30 months of age are shown in Table 1. Maternal education has been categorized into three levels: (1) basic maternal education (primary and secondary education), (2) high school and technical school and (3) university degree. The PT and FT children did not differ in terms of mother’s education (χ²(1) = 8.66, p > .05), gender (χ²(1) = 0.000, p > .05) or Apgar score (t(197) = −0.909, p > .05).
Table 1. Mean (and standard deviation) GA, BW and Apgar scores, and distribution by maternal education level and gender of the sample at the beginning of the study and at 30 months of age.
GA = gestational age in weeks; BW = birth weight in grams; PT = preterm; FT = full-term.
For the growth curve analyses, the participants were grouped into four GA groups: (1) very and extremely preterm children (participants with GA between 26 and 31 weeks), (2) moderately preterm (with GA between 32 and 33 weeks), (3) late preterm children (with GA between 34 and 36 weeks) and (4) FT children with GA over 36 weeks. VPT and EPT children were put together because the number of children in the EPT group was very small (n = 3 at 30 months of age). With this classification the number of children in each GA group was comparable (see Table 2).
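The grouping rule above can be written as a small function. This is only an illustrative sketch: the function name `ga_group` is hypothetical, and it assumes GA is recorded in completed weeks, which the text does not state explicitly.

```python
def ga_group(ga_weeks):
    """Map gestational age (assumed: completed weeks) to the group code 1-4."""
    if ga_weeks <= 31:
        return 1  # very and extremely preterm (26-31 weeks in this sample)
    if ga_weeks <= 33:
        return 2  # moderately preterm (32-33 weeks)
    if ga_weeks <= 36:
        return 3  # late preterm (34-36 weeks)
    return 4      # full-term (over 36 weeks)


# Hypothetical gestational ages, used only to show the mapping.
example_groups = [ga_group(weeks) for weeks in (28, 32, 35, 39)]
```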
Table 2. Descriptive data and comparisons between the three GA groups of PT children (one-way ANOVA in the upper part, chi-squared in the lower part).
Values are mean (standard deviation).
GA = gestational age in weeks; BDI = Battelle Developmental Inventory; NICU = neonatal intensive care unit.
Table 2 displays descriptive data for the three preterm GA groups, together with the results of one-way ANOVA or χ² analyses (as appropriate). The results indicated that the three groups were comparable in terms of mother’s age, maternal educational level and gender. As expected, there were significant differences in Apgar score, birth weight and stay in the neonatal intensive care unit (NICU). Bonferroni post hoc analyses indicated (p < .05) that significant differences existed among the three groups in BW, and between the group with the lowest GA (≤ 31 weeks) and the other two groups in the Apgar score. In the entire PT sample there were only four children with bronchopulmonary dysplasia, randomly distributed among the three GA groups. In any case, no child met any of the exclusion criteria.
Instruments
The Galician version of the MacArthur–Bates Inventories (Inventario do Desenvolvemento de Habilidades Comunicativas; IDHC) (Pérez-Pereira & García-Soto, 2003; Pérez-Pereira & Resches, 2011) was administered when the children were 10, 22 and 30 months of age. We used corrected age for PT children. For the purposes of the present study we took into consideration the results obtained in the section on word production. When the children were 10 months of age, the form for children between 8 and 15 months was employed (Palabras e Xestos ‘Words and Gestures’). The section list of words consisted of 384 words divided into 19 categories. When the children were 22 and 30 months of age the form for children between 16 and 30 months was employed (Palabras e Oracións ‘Words and Sentences’). The section list of words consisted of 700 words divided into 22 categories.
When the children were 22 months of age, the Spanish version of the Battelle Developmental Inventory (Newborg, Stock, & Wnek, 1996) was administered to assess cognitive development. The skills assessed by the Battelle scale are adaptive, personal-social, communication, motor and cognitive. The overall raw score was used for the analysis.
Analyses performed
The type of word categories produced by the participants was analysed based on Caselli et al. (1999). The word categories were the following:
Common nouns: including nouns for animals, vehicles, toys, food and drink, clothes, body parts, household items and furniture and rooms. There were 156 words of this category in the IDHC-W&G (Words and Gestures) and 277 words in the IDHC-W&S (Words and Sentences), which means 40.62% of the IDHC-W&G and 39.57% of the IDHC-W&S, respectively.
Predicates: including verbs and adjectives (descriptive words). There were 89 words of this category in the IDHC-W&G and 172 words in the IDHC-W&S, which means 23.17% of the IDHC-W&G and 24.57% of the IDHC-W&S.
Social words: including interjections and animal sounds, people and games and routines. There were 68 words in this category in the IDHC-W&G and 77 words in the IDHC-W&S, which correspond to 17.70% of the IDHC-W&G and 11% of the IDHC-W&S.
Grammatical function words or closed-class words: including pronouns and possessive and demonstrative adjectives, question words, prepositions and locative and manner adverbs, quantifiers and articles and connectives. There were 36 words in this category in the IDHC-W&G and 103 words in IDHC-W&S, which correspond to 9.37% of the IDHC-W&G and 14.71% of the IDHC-W&S.
The percentage of use of these word categories relative to the total number of words produced was calculated for each participant. Descriptive results of the total vocabulary and the type of vocabulary (components) produced at 10, 22 and 30 months of age by the four previously described GA groups are presented in Table 3.
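As a minimal sketch of this calculation, the snippet below computes each category's share of a child's total production. The category sizes are those reported above for the two IDHC forms. The function name, the dictionary layout, the example child and the choice to return `None` for a child who produces no words are assumptions made for illustration, not details given in the text.

```python
# Number of words per category in each IDHC form, as reported in the text.
FORM_CATEGORY_SIZES = {
    "W&G": {"common_nouns": 156, "predicates": 89,
            "social_words": 68, "closed_class": 36},
    "W&S": {"common_nouns": 277, "predicates": 172,
            "social_words": 77, "closed_class": 103},
}
FORM_TOTALS = {"W&G": 384, "W&S": 700}


def category_percentages(category_counts, total_produced):
    """Percentage of each category relative to the total words produced.

    Returns None when the child produces no words (assumption of this
    sketch: the percentage is treated as undefined in that case).
    """
    if total_produced == 0:
        return None
    return {category: 100.0 * count / total_produced
            for category, count in category_counts.items()}


# Hypothetical child producing 100 words on the Words and Sentences form.
child_counts = {"common_nouns": 40, "predicates": 20,
                "social_words": 25, "closed_class": 5}
child_shares = category_percentages(child_counts, total_produced=100)

# The same function gives each category's share of the checklist itself.
form_shares = category_percentages(FORM_CATEGORY_SIZES["W&G"], FORM_TOTALS["W&G"])
```

Comparing a child's shares with the checklist shares shows whether a category is over- or under-represented relative to its weight in the instrument.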
Table 3. Mean (and standard deviation) of total vocabulary and relative percentage of word categories produced by GA groups at different ages.
GA = gestational age.
For the analysis of vocabulary size and these word categories, two statistical approaches were used.
First, to model the change over time of raw scores of vocabulary size and vocabulary composition, a growth curve analysis (GCA) was performed. GCA is a multilevel regression technique for the analysis of longitudinal data; this approach can be used to analyse both group-level differences (GA in this study) and individual-level differences. In longitudinal analysis with the typical method of estimation (maximum likelihood, ML), the repeated measures are assumed to be continuous and normally distributed. An alternative way to fit a longitudinal model to non-normal response data is to fit a generalized linear mixed model, in which the linear predictor incorporates random effects in addition to the fixed-effects parameters, while allowing for more general forms of the distribution of the response (of the error distributions and link functions).
In this study, mixed effect Poisson regression models with a log link were used. Models were fitted in R software using the glmer function from the lme4 package (D. Bates & Maechler, 2009).
The growth curve analysis was conducted in two phases, which is the usual procedure: first, in the unconditional or within-individuals phase, the analysis directly relates outcome to time, to determine the best fitting models of growth. Each individual is described by a unique intercept (baseline score) and slope (change in outcome over time). The average intercept and rate of change are called fixed effects and the variability in individuals’ baseline scores and rates of change are called random effects.
In this study, to examine the effect of time on change in vocabulary (total production and components) two models were compared (random intercept/fixed slope and random intercept/random slope) and the one showing the best fit was selected.
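The text does not state which fit criterion was used for this comparison. As a hedged illustration, two common options for nested mixed models are the Akaike information criterion and a likelihood-ratio test. The sketch below assumes a Poisson mixed model, so the random intercept/fixed slope model has 3 parameters (two fixed effects and the intercept variance) and the random slope model adds 2 more (slope variance and intercept-slope covariance). The log-likelihood values are hypothetical placeholders, not results from the study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood


def likelihood_ratio_test_2df(loglik_reduced, loglik_full):
    """Likelihood-ratio test for two extra parameters.

    The chi-square survival function with 2 degrees of freedom has the
    closed form exp(-x / 2), so no statistics library is required.
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    p_value = math.exp(-statistic / 2.0) if statistic > 0 else 1.0
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only.
loglik_fixed_slope = -1500.0   # random intercept / fixed slope
loglik_random_slope = -1400.0  # random intercept / random slope

aic_fixed_slope = aic(loglik_fixed_slope, n_params=3)
aic_random_slope = aic(loglik_random_slope, n_params=5)
lrt_statistic, lrt_p_value = likelihood_ratio_test_2df(
    loglik_fixed_slope, loglik_random_slope)
```

Because a variance parameter is tested at the boundary of its parameter space, the plain chi-square reference distribution tends to be conservative, so a small p-value from this test is, if anything, an understatement of the evidence for the random slope.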
Then, in a second (conditional or between-individuals) phase, independent variables were added to the previously selected unconditional growth model, and the existence of differences between conditions in the intercept and in change over time was evaluated. In this case, to test whether gestational age (GA) resulted in a different pattern of vocabulary growth, models including GA and the interaction term GA*time were compared. GA was coded as 1 (for very and extremely preterm, ≤ 31 weeks), 2 (32–33 weeks), 3 (34–36 weeks) and 4 (for full-term children, ≥ 37 weeks) to evaluate the existence of a linear effect on the intercept (level of vocabulary at a given time) and slope (rate of change over time).
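One way to write the conditional model just described is given below, as a sketch under stated assumptions rather than the exact specification fitted. Here $y_{ti}$ denotes the number of words produced by child $i$ at wave $t$, $\mathrm{time}_{ti}$ is the time variable (its coding and centring are not given in the text), and $\mathrm{GA}_i$ is the group code from 1 to 4. All symbols are introduced here for illustration only.

```latex
\[
\begin{aligned}
y_{ti} \mid b_{0i}, b_{1i} &\sim \mathrm{Poisson}(\mu_{ti}) \\
\log \mu_{ti} &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\,\mathrm{time}_{ti}
  + \beta_2\,\mathrm{GA}_i + \beta_3\,\mathrm{GA}_i \times \mathrm{time}_{ti} \\
(b_{0i}, b_{1i})^{\top} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})
\end{aligned}
\]
```

Setting $\beta_2 = \beta_3 = 0$ gives the unconditional random intercept/random slope model. In this notation $\beta_2$ captures a linear GA effect on the intercept and $\beta_3$ a linear GA effect on the slope, both on the log scale.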
Second, stepwise multiple regression analyses were performed to test the predictive effect of certain variables on the total number of words produced, and on the number of social words, common nouns, predicates and closed-class words used by the participants at 30 months of age, which were, in turn, the dependent variables. The overall score on the Battelle Developmental Inventory obtained by the children at 22 months of age and vocabulary production at 22 months were used in Model 1 as predictive variables. These measurements were used because both cognitive development and earlier language development were found to have a strong effect on later vocabulary development in previous studies (Pérez-Pereira et al., 2014; Sansavini, Guarini, & Savini, 2011; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Stolt et al., 2007, 2009). Model 2 used the Battelle overall score obtained by the children at 22 months of age and vocabulary production at 22 months, plus the mothers’ level of education (classified into three groups), gestational age (as a continuous variable) and gender, as predictive variables; these last three variables are considered to be typical environmental and biomedical measures, and some of them (GA and gender) were found to have a significant predictive effect on the production of word categories (Schults et al., 2013). Although the use of GA may appear to be redundant, since it was already used in the growth curve analysis, it is pertinent to use GA as a continuous variable (and not the four GA groups) in this analysis because here we are analysing the effect of GA on vocabulary composition and size at 30 months of age in all the participants together, who have different GAs.
The program IBM SPSS Statistics 20.0 was used to perform the multiple regression analyses.
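Comparing Model 1 with Model 2 amounts to asking whether the added predictors raise the explained variance by more than chance would. A common way to evaluate this is the F test for the change in R² between nested models. The sketch below is a minimal illustration of that test, not the authors' SPSS procedure: the function name, the sample size and the R² values are hypothetical, and maternal education is assumed to enter as a single predictor, which gives five predictors in Model 2.

```python
def f_change(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic for the increase in R^2 when predictors are added to a nested model.

    r2_reduced, r2_full: R^2 of the smaller and larger model
    n: number of participants
    k_reduced, k_full: number of predictors in each model
    Returns (F, df1, df2).
    """
    if not (0.0 <= r2_reduced <= r2_full < 1.0):
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    df1 = k_full - k_reduced      # number of predictors added
    df2 = n - k_full - 1          # residual degrees of freedom of the full model
    if df1 <= 0 or df2 <= 0:
        raise ValueError("degrees of freedom must be positive")
    f = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2


# Hypothetical values, chosen only for illustration (not data from the study):
# Model 1 with 2 predictors, Model 2 with 5 predictors, 150 children.
f, df1, df2 = f_change(0.40, 0.42, n=150, k_reduced=2, k_full=5)
print(f"F change({df1}, {df2}) = {f:.2f}")
```

The resulting F is then compared with the F distribution with (df1, df2) degrees of freedom; a small increment in R² spread over several added predictors typically yields a non-significant change.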
Results
Comparisons of growth of vocabulary size and composition
Differences between groups of children defined by gestational age were explored by growth curve analysis with respect to size and vocabulary composition in three waves (10, 22 and 30 months of age).
Results of the unconditional phase, which examines the effect of time (age) on change in vocabulary size and in word categories, are shown in Table 4. In this work, the model with random intercept and random slope (shown in Table 4) fitted substantially better than the model with random intercept and fixed slope (not shown) in all the vocabulary analyses (total production and components), indicating the existence of significant variability between individuals in both parameters (differences between individuals at the starting point and in rate of growth). The high negative correlation between intercept (baseline score) and slope (change in outcome over time) indicates a lower rate of growth in those individuals showing a higher starting point (larger vocabulary).
Growth curve analysis: unconditional phase. Fixed effects are the average of the individual intercept and rate of change (slope) and random effects show the variability in individuals’ intercept and slope. R is the correlation coefficient between intercept and slope.
SE = standard error; t = Student’s t; Prob. = probability; Var. = variance; SD = standard deviation.
The conditional models (Table 5) showed no significant effect of GA group either on the intercept or the slope (GA*time interaction) indicating that the growth curves are not different between GA groups in total vocabulary or in any word category (common nouns, predicates, social words and grammatical function words). Analyses were performed using the GA group as a numeric variable (groups coded from 1 to 4 as previously explained) but results were also confirmed comparing each preterm GA class with the full-term, used as a reference class, to check the possible differentiation for a particular group (results not shown). The lack of differentiation between GA groups is illustrated for vocabulary size and the four vocabulary categories in Figure 1, which shows the mean predicted value at each moment and in each GA group based on the conditional models described in Table 5.
Growth curve analysis: conditional phase. Effect of gestational age on intercept (GA) and slope (interaction time*GA).
SE = standard error; t = Student’s t; Prob. = probability; GA = gestational age.

Expected mean value of vocabulary size (and vocabulary components) as a function of GA membership, following the conditional models described in Table 5.
Effect of predictive factors on vocabulary production
Finally, multiple regression analyses were performed with the entire sample to determine the effect of several predictive variables on the relative use of the word categories at 30 months of age. The results, which appear in Table 6, clearly indicate that the scores obtained on previous cognitive and linguistic measures explained a relatively important part of the variance found in the use of the different word categories at 30 months of age: 34.7%, 36.6%, 27.2% and 41.0% for common nouns, predicates, social terms and closed-class words, respectively, and 39.2% for total word production. In all cases the significance level of the effect of previous production of words was higher than that of cognitive development. At the same time, when GA, gender and maternal education were added to the former variables in Model 2, there was hardly any increment in the variance explained, and the change in F did not reach significance in any case. Of the three variables introduced in Model 2, only GA reached a significant single effect, and only for predicates (standardized β = −.139, p = .041).
Multiple regression analyses: predictors of the use of the word categories and total vocabulary at 30 months of age.
Discussion
The aims of the study were (1) to compare developmental trajectories of expressive vocabulary size and composition of three groups of PT children with different GAs, and another group of FT children, which were comparable in terms of gender distribution and mothers’ educational level; and (2) to analyse the effect of previous linguistic and cognitive achievements, as well as the effect of GA, gender and maternal education, on vocabulary growth and composition at 30 months of age.
The results of the growth curve analysis indicated that there were no significant differences in vocabulary development among the four groups with different GAs, either in vocabulary size or in the growth of the four word categories studied. The main factors which predicted total vocabulary size or vocabulary composition at 30 months were word production and general cognitive development at 22 months of age, with GA, gender and maternal education having a much lower effect, which was almost always non-significant. These results seem to confirm the hypotheses of the study.
The healthy status of the PT sample is a relevant feature of the present research, which makes it different from previous studies on vocabulary composition, which were mainly carried out with EPT or VPT children, and sometimes included children with additional severe medical problems. The PT children of our investigation can be considered as low risk or healthy because the sample was composed of children with a wide range of GAs and BWs (and not only VLBW or VPT children), children without severe medical complications and with a mean Apgar score that was very similar to that of the FT group.
Another distinctive feature of the article is the use of growth curve modelling to compare the longitudinal trajectories of the four GA groups, which can only be carried out with a longitudinal design. In the first unconditional growth model, significant variability between individuals in both parameters (starting point and rate of growth) was found. In general terms the results of the present longitudinal study confirm the interpretation proposed by E. Bates, Dale, and Thal (1995) concerning the existence of individual differences in onset time and rate of growth in vocabulary production. In addition, it was found that individuals with a higher starting point showed a lower rate of growth (negative correlation between intercept and slope). This result points to a convergence of vocabulary repertoire among children with the passing of time, and a trend towards reduction of individual differences.
The results of the conditional growth curve modelling indicate that there was no significant effect of GA on expressive vocabulary growth between 10 and 30 months of age, either on total vocabulary repertoire or on word categories. The results of the present longitudinal study contrast with those found in other cross-sectional studies mainly carried out with VPT and EPT children (Foster-Cohen et al., 2007; Sansavini, Guarini, & Savini, 2011; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Schults et al., 2013; Stolt et al., 2014; Stolt et al., 2012), and in part with the two reported longitudinal studies (Sansavini, Guarini, Savini, Broccoli, et al., 2011; Stolt et al., 2009). The similar pattern of growth of children with different GAs is probably related to the characteristics of the PT children studied: healthy PT children without biomedical complications. Additionally, our results did not support the idea that differences in expressive vocabulary development between FT and PT children increase as children grow older, in contrast to other studies on expressive vocabulary size (Bosch et al., 2011; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Stolt et al., 2009).
In relation to vocabulary composition, a more fine-grained analysis of the results found in the four GA groups shows that the greatest differences in the proportion of use of the four categories appear at 10 months of age. At 22 months of age the percentages of use of the four categories in the four GA groups are very similar, and they are even closer at 30 months. This developmental trend would indicate that language development follows a similar route in the children, independently of their GA, and that the trend is not towards an increase of differences with age, but quite the contrary. These results do not reinforce the idea that GA is a determining factor for producing a less developed vocabulary, characterized by a relatively higher proportion of social words and a lower proportion of closed-class words and predicates (Kern & Gayraud, 2007; Schults et al., 2013), since the VPT and EPT children (GA < 31 weeks) in our study did not usually show the highest percentage of use of social words nor the lowest of closed-class words or predicates. Thus, the results found do not support the claim that there are differences in vocabulary composition related to GA. The limited influence of GA on the relative percentage of use of the four word categories is also confirmed in the regression analyses (see below).
If the results of the four GA groups are put together and compared with each other at 10, 22 and 30 months, one can see (Figure 1) that the developmental paths followed by the four groups in relation to the use of the four word categories and total vocabulary at these ages are very similar (see also Table 5). The developmental trends are very similar to those found in most studies with full-term (Caselli et al., 1999; Jackson-Maldonado et al., 1993; Maital et al., 2000; Schults et al., 2012; Stolt et al., 2008; Stolt et al., 2007) and preterm children (Kern & Gayraud, 2007; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Schults et al., 2013; Stolt et al., 2007). Full-term children as well as preterm children use social words very frequently (around 60% of the total lexical repertoire) at the beginning (10 months). Later on there is a sharp decrease in the relative use of this type of word, which declines to 20% at 22 months and 12–13% at 30 months of age. The percentage of common nouns reaches around 30% at 10 months, and then increases to 45% at 22 months of age, with a tiny decrease to 43% at 30 months of age. Predicates are used only rarely at 10 months: 7% by the FT and less than 2% by the PT children. Later on there is a smooth and constant increase until reaching around 22% at 30 months of age for all the GA groups. A similar trend is observed for all the groups in relation to the closed-class words, although the slope is less pronounced, and all the children produce around 12% of closed-class words at 30 months. Developmental trends are, therefore, practically identical for the FT and the three PT groups, and correspond to the developmental pattern previously reported.
In relation to the factors that determine the use of vocabulary at 30 months of age, the multiple regression analyses performed indicate that previous cognitive and vocabulary development at 22 months of age are strong predictors of the total number of words and of the relative use of common nouns, predicates, social terms and closed-class words at 30 months of age. The variance explained reached 39.2%, 34.7%, 36.6%, 27.2% and 41%, respectively. These results coincide with those found by other studies on vocabulary size (Pérez-Pereira et al., 2014; Pérez-Pereira et al., 2013; Sansavini, Guarini, & Savini, 2011; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Stolt et al., 2009; Stolt et al., 2007), and indicate that previous linguistic development, together with previous cognitive development, is a major predictor of later linguistic abilities. The effect (significance level) of previous word production scores was higher than the effect of previous cognitive scores for all word categories, and particularly for the use of closed-class words.
Contrary to the results found by Schults et al. (2013), neither gestational age nor gender was a predictor of the vocabulary size or the relative use of the word categories studied in our sample, nor was maternal education a predictor of the relative use of those categories. When these biological or environmental variables were added to previous cognitive and linguistic development (Model 2), the change in the variance explained did not reach significance in any word category (change in F), and their joint contribution to the variance explained for common nouns, predicates, social words and closed-class words varied between 0.7% and 2.7%. The effect (standardized β) of GA, gender or maternal education did not reach significance for total vocabulary or any word category, with the exception of GA for predicates. In any case, the relationship between GA and production of predicates is a negative one, and if the descriptive results in Table 3 are checked, it is possible to see that the lowest percentage of predicates at 30 months of age was produced by the FT group, a result contrary to expectations according to other studies (Kern & Gayraud, 2007; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Schults et al., 2013; Stolt et al., 2009; Stolt et al., 2007). Similar considerations could be made for the trend observed in total word production. The size of the effect on the variance of total vocabulary size or any of the word categories at 30 months was, however, practically non-existent (change in R² for Model 2). The results relative to gender agree with other studies which did not find an effect of gender on vocabulary development of FT (Andonova, 2015) or PT children (Stolt et al., 2007).
The results relative to maternal education do not support the findings of former investigations which found that PT and FT children whose mothers had a higher educational level had greater lexical size and higher proportions of those categories associated with a more advanced lexicon, such as predicates and grammatical function words (Menyuk et al., 1991; Sansavini, Guarini, Savini, Broccoli, et al., 2011; Stolt et al., 2007).
In general terms, the results of our study seem to indicate that GA hardly affects productive vocabulary development or composition when biomedical complications are excluded, and the sample includes children with a wide range of GAs and BWs, and not only VPT or EPT children.
Conclusions
Expressive vocabulary development of PT and FT children does not differ when medical complications are controlled and the PT group is composed of healthy or low risk children. Consistently, the vocabulary composition of the groups with different GAs was not different, and children with different GAs showed very similar patterns of change of word categories. Developmental trends in the composition of expressive vocabulary for children with different GAs were practically identical, and they corresponded closely to the results found by other studies.
Finally, the factors which mainly predicted the vocabulary size and the relative production of the word categories were previous linguistic and cognitive development. Gender, GA and level of maternal education had practically no effect on vocabulary use at 30 months of age. Therefore, GA seems to affect vocabulary development and composition of low risk PT children only minimally when biomedical complications associated with prematurity have been excluded. Other factors which were not taken into consideration in this research, such as former use of gestures and vocabulary comprehension, could be affecting vocabulary growth and composition. The use of corrected age for PT children at 30 months of age might also have affected the results.
More ambitious research designs, which include children with different GAs and with and without medical complications, would be needed to obtain more conclusive results.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Ministerio de Ciencia e Innovación of the Spanish Government (grants PSI2008-03905, and PSI2011-23210 to the first author) and the Xunta de Galicia (INCITE) (grant PGIDIT07PXIB211044PR to the first author).
