Abstract
This article evaluates the extent to which pre-schoolers’ picture books can be viewed as a form of enriched linguistic input. Twenty best-selling picture books were analysed in terms of syntactic constructions and compared with a sample of Child Directed Speech. The findings of the study demonstrate the prevalence of canonical utterances (i.e. those displaying Subject-Verb (Object) ordering) and Complex constructions within the book sample, both types of which occur with very low frequency in everyday Child Directed Speech. It is concluded that the linguistic content of young children’s books has the potential to play an important role in children’s grammatical development.
Keywords
Over the years, a number of claims have been made regarding the nature and influence of the input on language development. Chomsky (1965) famously argued that the input available to young children is impoverished, that is, lacking in the complexity and richness of structural information necessary for the acquisition of a child’s target language. Researchers working within usage-based, constructivist frameworks have argued to the contrary. For example, Cameron-Faulkner, Lieven, and Tomasello (2003) suggested that Child Directed Speech (CDS) is well suited to the early stages of language development due to the prevalence of item-based frames (e.g. It’s a X; Where’s the Y) found in everyday speech to children. According to usage-based theories of language development, the item-based frames are extracted, stored and then used as pre-fabricated units and/or schematised over time.
However, to some extent the findings reported by Cameron-Faulkner et al. (2003) do indicate a degree of structural impoverishment in the input addressed to young children. In their construction-based analysis of speech taken from 12 English-speaking mothers, canonical ‘full’ constructions (i.e. those containing a Subject, Verb and Object [SVO]), and Complex utterances (i.e. those containing more than one verb clause) were relatively rare and accounted for only 15% of the sample. Instead, many of the item-based frames identified in the sample were instances of ‘non-canonical’ constructions such as Fragments, Copulas and Interrogatives. Consequently, the findings from Cameron-Faulkner et al. indicate that young children learning English (a strong SVO language) frequently hear constructions in which the verb occurs at the front of a subject-less construction (e.g. Put it down), subjects after auxiliaries (e.g. Can you reach?) and a large number of utterances in which one major component of the SVO assembly is missing (e.g. Oh, look, a cat!). Therefore, while the item-based frames attested in the Cameron-Faulkner et al. (2003) study may aid a young child’s acquisition of the copula construction, for example, their contribution to the discovery of more abstract relations such as SVO word order is limited, as is their influence on the development of complex, multi-clause constructions. In summary, while CDS may be well suited to the early stages of multi-word development, it appears to be lacking in the more structurally rich constructions that form an important component of adult linguistic knowledge, and typically emerge at later stages of children’s language development.
To date, there has been little research relating to the development of multi-clause constructions within the usage-based approach; studies which have investigated this aspect of development have tended to focus on children’s use of structure-building in the production of multi-clause constructions. For example, Diessel and Tomasello (2001) suggest that English-speaking children’s earliest Complex constructions involve the combination of a formulaic matrix clause (e.g. I think) with a full sentence (e.g. It’s gone ➔ I think it’s gone). The authors therefore suggest that there is little evidence of abstract structure within children’s early multi-clause constructions.
However, in the same way that researchers have identified strong links between CDS and the emergence of early multi-word constructions (e.g. Kirjavainen, Theakston, Lieven, & Tomasello, 2009; Lieven, Pine, & Baldwin, 1997; Tomasello, 2003), there is evidence to suggest that the acquisition of later emerging constructions (i.e. Complex constructions) is subject to input frequency effects. For example, Huttenlocher, Vasilyeva, Cymerman, and Levine (2002) highlight the significant relationship between the proportion of multi-clause utterances attested in the input and the production/comprehension of multi-clause utterances by 4-year-old English-speaking children. That is, in their sample, children who heard more Complex constructions in a range of settings produced more Complex constructions in their everyday speech. The authors conclude that individual differences in the use of Complex constructions can be explained in terms of input characteristics. However, an alternative interpretation of the results presents itself: it could be the case that caregivers increase their production of Complex constructions in response to their children’s production of the constructions.
In the current study we look beyond CDS in order to identify other potential sources of linguistic input available to young children; specifically, we focus on the grammatical characteristics of books. A number of studies have highlighted the positive impact of shared book reading on language development and narrative structure. The most robust findings in the literature relate to the positive effect of shared book reading on vocabulary development (e.g. Elley, 1989; Farrant & Zubrick, 2011; Ninio, 1983; Sénéchal & LeFevre, 2002), conversational ability (Morrow, 1988), reading development (e.g. Bus, van Ijzendoorn, & Pellegrini, 1995) and narrative development (e.g. Reese, 1995).
One of the key factors driving the relationship between shared book reading and vocabulary development appears to be the high levels of joint attention afforded by the activity (Farrant & Zubrick, 2011; Ninio & Bruner, 1978). Joint attention involves an individual following the focus of a co-participant, and in terms of caregiver–child interaction can be contrasted with a caregiver’s attempts to redirect the child’s focus. A number of studies have highlighted the positive correlation between joint attention and the early stages of language development (e.g. Bruner, 1983; Farrant & Zubrick, 2011; Tomasello & Farrar, 1986) and indeed the ability to engage in joint attention is considered to be a prerequisite for language development within the usage-based framework (Tomasello, 2003).
While the positive effects of shared book reading appear to be well attested in the areas of vocabulary and narrative development, its effect on grammatical development appears to be less clear. Some studies identify a correlation between shared book reading and sentence length/complexity (e.g. Feitelson, Kita, & Goldstein, 1986; Whitehurst et al., 1988), while others fail to find a significant relationship (e.g. Debaryshe, 1993). However, we suggest that methodological and theoretical issues may be playing a crucial role in this apparent lack of clarity. The standard procedure employed in this area of research, which is predominantly situated in the field of education or literacy research, is to analyse aspects of shared book reading, for example, frequency or style of shared book reading, and then measure the children’s linguistic skills using one of many language scales and language measurement tools, e.g. the Reynell Developmental Language Scales (Debaryshe, 1993). Since the studies do not subscribe to any particular theory of language development, this approach would appear sensible; grammatical development is just another global measure of a child’s linguistic ability. However, it could be argued that the tests and scales used in the previous studies are not sufficiently detailed to pick up on the more fine-grained aspects of development which are attested in constructivist studies of child language development (Kidd, Lieven, & Tomasello, 2006; Kirjavainen, Theakston, Lieven, & Tomasello, 2009; Lieven et al., 1997; Noble, Rowland, & Pine, 2009; Tomasello, 2003). That is, broad language development measures cannot identify the presence or absence of specific constructions. Therefore, the effect of shared book reading on grammatical development is still very much an open question.
In the current study we analyse the types of linguistic constructions typically found in young children’s books. The study is based on a sample of 20 books for pre-schoolers. First we analysed the constructions found within the books and then compared the construction profiles with the sample of CDS reported in Cameron-Faulkner et al. (2003). In doing so we aim to compare the constructions found within the two samples and ascertain whether the language within the books could be viewed as a form of enriched CDS. Two key research questions guide our analyses:
RQ1: Does the book text differ significantly from CDS in terms of construction frequency?
RQ2: Are differences in construction type attested within the book sample itself?
Method
Materials
Twenty picture books were selected from the Amazon UK website. The books were taken from the best seller list of titles aimed at 2-year-old children on two dates, 14 April and 18 May 2011. The list of books can be found in Appendix 1. Books were excluded if:
The same style of book or a book with the same author had been selected already (e.g. only one That’s not my X style book was included in the sample).
The book was clearly inappropriate for the target age group. Customer reviews were considered if the book appeared inappropriate. If the intended age was not clear from the reviews, we examined the book ourselves in order to ascertain its suitability for the target age group.
Child Directed Speech sample
The CDS analysis is taken from Cameron-Faulkner et al. (2003), which analysed corpus data taken from the Manchester corpus (Theakston, Lieven, Pine, & Rowland, 2001), hosted on the CHILDES website (MacWhinney & Snow, 1990). The corpus contains the linguistic interaction of 12 mother–child dyads. The children (six girls and six boys) were all first-born monolingual English speakers with mothers as the primary caregivers. Socioeconomic status was not taken into account with respect to participant recruitment, though the children were from predominantly middle-class families. The dyads were recorded in their homes in the presence of a research assistant. The mothers were instructed to play with their children as normal (i.e. free play sessions) and for the purposes of the current study it is important to point out that the mothers were given explicit instructions to avoid shared book reading within the recordings. Thirty-minute recordings were conducted on two separate occasions every three weeks for one year. The analysis reported in Cameron-Faulkner et al. (2003) was based on two hours of recording for each dyad in which the age of the children ranged between 1;9.28 and 2;6.23, and the Mean Length of Utterance (MLU) of each child was calculated to be between 2.00 and 2.49. In total, 16,903 CDS utterances were included in the data sample.
Coding
The book text was coded in accordance with the original CDS sample. The text from each book was broken down into utterance-level construction types and coded by the two authors according to the taxonomy used in Cameron-Faulkner et al. (2003). The taxonomy is based on standard linguistic criteria and displayed below with examples.
Fragments: big cat / on the table / yellow
Questions: Is he in the box? / Where’s the truck?
Imperatives: Put it over there / Eat your greens
Copulas: It’s very heavy / That’s nice
Subject-Predicate: He ate the cake / She’s running / They put it there
Complex: I know that you love doing jigsaws / I thought that you had been here before
A number of the books contained reported speech (e.g. He said ‘X’) resulting in the addition of a book-specific construction category, the reported speech clause.
‘Amazing’, said the mouse.
Reliability tests were conducted on 10% of each of the two coders’ data (20% in total). The combined result of the reliability analysis indicated a high level of agreement and consistency within the coding (kappa = .90).
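As an illustration of how an agreement statistic of this kind can be computed, the following is a minimal sketch that assumes the reported statistic is Cohen's kappa for two coders (the text says only "kappa"). The function name `cohens_kappa` and the label lists are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length lists of category labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(coder_a)
    # Observed agreement: share of utterances given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over labels.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of eight utterances (illustration only, not study data).
coder_1 = ["Fragment", "Question", "Copula", "Subject-Predicate",
           "Complex", "Imperative", "Fragment", "Subject-Predicate"]
coder_2 = ["Fragment", "Question", "Copula", "Subject-Predicate",
           "Subject-Predicate", "Imperative", "Fragment", "Subject-Predicate"]

kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.2f}")
```

Because kappa corrects raw agreement for the agreement expected by chance, it is lower than the simple proportion of matching labels whenever some categories are much more frequent than others.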
To address the two research questions outlined previously, two analyses were conducted on the data. First, we conducted a one-way MANOVA to compare the mean frequency of construction types in the two samples (i.e. the book sample and CDS sample). Second, we compared the frequency of construction types on an individual basis for each book against the CDS sample in order to ascertain the similarities and differences within the book sample itself.
Results
Analysis one: Mean frequencies of construction types
A comparison of proportional construction frequency was conducted between the CDS and book samples. A one-way MANOVA was run, with condition (book, CDS) as the independent variable and the frequencies of the six global construction categories (Fragments, Questions, Imperatives, Copulas, Subject-Predicate and Complex) as the dependent variables (see Figure 1).

Figure 1. Proportional frequency (SE) of global constructions in the book and CDS samples.
There was a significant main effect of condition, Wilks’ λ = .28, F(6, 25) = 10.55, p = .001, ηp² = .72. Given the significance of the overall test, the univariate main effects were examined. Significant univariate main effects for condition were obtained for the following structures: Questions, F(1, 30) = 68.51, p = .001, ηp² = .70; Subject-Predicates, F(1, 30) = 8.04, p = .008, ηp² = .21; and Complex structures, F(1, 30) = 7.35, p = .011, ηp² = .20.
The results of the global construction analysis indicate higher levels of Subject-Predicate and Complex constructions in the book sample. Conversely, the book sample contained fewer Question constructions than the CDS sample.
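The effect sizes reported for Analysis one can be cross-checked against the test statistics. The sketch below uses two standard identities: for a univariate F test, partial eta squared equals (F × df_effect) / (F × df_effect + df_error), and for a two-group MANOVA it equals 1 − Wilks' λ. The function name `partial_eta_squared` is hypothetical; the numbers are the values reported in the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported partial eta squared), as given in the text.
reported = {
    "Questions": (68.51, 1, 30, 0.70),
    "Subject-Predicate": (8.04, 1, 30, 0.21),
    "Complex": (7.35, 1, 30, 0.20),
}

recomputed = {
    name: round(partial_eta_squared(f, df1, df2), 2)
    for name, (f, df1, df2, _) in reported.items()
}

# With only two groups (book, CDS) there is a single discriminant dimension,
# so the multivariate partial eta squared reduces to 1 - Wilks' lambda.
wilks_lambda = 0.28
multivariate_eta = round(1 - wilks_lambda, 2)

for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.2f}, reported {reported[name][3]:.2f}")
print(f"Multivariate: recomputed {multivariate_eta:.2f}, reported 0.72")
```

The recomputed values agree with the reported ones to two decimal places.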
Analysis two: Book-specific analysis of construction types
In Analysis two we compared each book against the CDS sample. For the purpose of this analysis we divided the constructions found in the CDS sample into three broad categories: non-SV(X) constructions (Single-word Fragments, Multi-word Fragments, Imperatives and Copulas), SV(X) constructions (Subject-Predicate and Complex) and Questions (Wh-questions and Yes/No questions).
The proportional frequency of constructions within each book was then compared to the CDS sample leading to the categorisation of books in the following manner.
SV-heavy: Any book containing twice the proportional frequency of SV(X) constructions in comparison to the CDS sample.
SV-light: Any book containing twice the proportional frequency of non-SV(X) constructions in comparison to the CDS sample.
The majority of the books were categorised as SV-heavy (75%), while only four books were classed as SV-light. Only one book did not fit within either category and was identified as having a similar profile of construction frequency to the CDS sample. Therefore when taken on an individual basis the books cluster into two categories, which present a significantly different construction profile to the CDS sample. The SV-heavy books contain a higher proportion of Subject-Predicate and Complex constructions as compared to the CDS sample, while the SV-light books contain more non-SV(X) utterances than the CDS sample. It is interesting to note that in the case of both book types, Questions occur with less frequency than within the CDS sample.
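The categorisation rule used in Analysis two can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: "twice the proportional frequency" is read as "at least twice", the function names are hypothetical, and the CDS baseline proportions and book counts are invented placeholders rather than values from either sample.

```python
# Broad categories as defined in the text.
SV_X = {"Subject-Predicate", "Complex"}
NON_SV_X = {"Single-word Fragment", "Multi-word Fragment", "Imperative", "Copula"}


def broad_proportions(counts):
    """Return (SV(X) proportion, non-SV(X) proportion) for one book's construction counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("book contains no coded utterances")
    sv = sum(n for label, n in counts.items() if label in SV_X) / total
    non_sv = sum(n for label, n in counts.items() if label in NON_SV_X) / total
    return sv, non_sv


def classify_book(counts, cds_sv, cds_non_sv, factor=2.0):
    """Label a book relative to the CDS baseline proportions."""
    sv, non_sv = broad_proportions(counts)
    if sv >= factor * cds_sv:
        return "SV-heavy"
    if non_sv >= factor * cds_non_sv:
        return "SV-light"
    return "CDS-like"


# Placeholder baseline (assumption, NOT the proportions from the CDS sample).
CDS_SV, CDS_NON_SV = 0.20, 0.45

# Invented construction counts for three imaginary books.
book_a = {"Subject-Predicate": 12, "Complex": 4, "Copula": 2, "Wh-question": 2}
book_b = {"Single-word Fragment": 10, "Multi-word Fragment": 6, "Copula": 3,
          "Subject-Predicate": 1}
book_c = {"Subject-Predicate": 4, "Copula": 6, "Imperative": 4, "Yes/No question": 6}

labels = [classify_book(b, CDS_SV, CDS_NON_SV) for b in (book_a, book_b, book_c)]
print(labels)
```

Questions are counted in each book's total but belong to neither broad category, so a question-heavy book can fall outside both labels, like the single book in the sample with a CDS-like profile.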
Discussion
In the present study we conducted a construction-based analysis of 20 popular pre-school books. The results of the analysis were then compared with a sample of CDS in order to identify the extent to which the text found within pre-school books provides enriched linguistic input to young language learners. The study is situated within a constructivist, usage-based approach to language development, and is therefore a departure from the main body of shared reading literature. Typically, shared book reading is studied within the disciplines of education and literacy research, and is consequently less concerned with the processes underlying language development per se. Instead, our analysis is motivated by claims regarding the nature of the linguistic input available to young children. Specifically, we aimed to discover whether the frequency of construction types found within a sample of young children’s books bridged the construction profile attested in everyday CDS and the wider range of linguistic structures found in more mature registers of language.
Our findings indicate a significant difference in the frequency of constructions within the book sample and the CDS sample reported in Cameron-Faulkner et al. (2003). In addition, two categories of book type emerged from the data, SV-heavy and SV-light, with the former category being most frequent within our sample. In this section we discuss the two findings in detail and comment on their implications for usage-based approaches to language development.
Overall, the book sample contained significantly more Subject-Predicate and Complex constructions than the CDS sample. A key issue relating to the role of the input is the lack of canonical constructions (i.e. Subject-Predicate constructions) and indeed this was attested in the findings of Cameron-Faulkner et al. (2003). Hence, the books have the potential to provide children with increased exposure to the canonical constructions which occur with low frequency in typical adult–child interaction.
According to a usage-based approach to language development, the increased frequency of canonical and Complex construction types affords two benefits. First, the increased exposure facilitates the extraction, storage and subsequent use of the constructions in question. For example, the higher frequency of Complex constructions provides young children with information relating to multi-clause ordering within their target language. As mentioned earlier, significant differences are attested in the children’s use of early Complex constructions (Huttenlocher et al., 2002) with input frequency playing a central role in children’s knowledge of these constructions. Therefore, our findings suggest that shared book reading may be a contributory factor in the development of Complex constructions for children who are read to frequently and are at a suitable level of linguistic ability to assimilate structural knowledge pertaining to multi-clause constructions.
Second, higher levels of Subject-Predicate exposure may contribute to the child’s underlying knowledge of her or his target language. In English, a language with strict SVO word order, increased levels of canonical utterances have the potential to provide the language learning child with important information relating to the linguistic realisation of ‘who did what to whom’ within their ambient language. This in turn may contribute to the development of abstract constructions (e.g. the transitive construction) within the young child’s system of linguistic representation.
Up until this point our discussion has focused on the potential of books to provide increased exposure to SV(X) constructions (i.e. Subject-Predicate and Complex constructions). However, not all the books were SV-heavy. Four out of the 20 books within the sample contained a higher level of non-SV constructions than CDS (i.e. the SV-light books). These SV-light books too have their value, given a usage-based approach to language development; while they may not add to a child’s linguistic inventory, they do have the potential to reinforce frequent and accessible CDS constructions.
The findings of the current study therefore suggest that the linguistic content of the book sample is a source of enriched linguistic input, both in terms of presenting the young child with increased exposure to low frequency Subject-Predicate and Complex constructions, and also by reinforcing familiar CDS constructions. However, it is not only the frequency of constructions within the book sample that may be of benefit, but the actual context within which the constructions are presented (i.e. in books during shared reading activities). First, young children’s books consist of a (usually) predictable story with stable visual cues (two-dimensional illustrations). Thus, the child has a number of environmental cues to aid deeper levels of comprehension of the language presented within the text. Second, the fact that the constructions are embedded in a story may result in higher levels of arousal due to the novelty, humour or surprise often found in young children’s books (Elley, 1989). Together these features may have a positive effect on incidental learning of constructions. Third, books can be read more than once and thereby present the child with multiple exposures to constructions. Studies indicate the benefits of repeated book reading (Fletcher & Reese, 2005; Morrow, 1988; Snow & Goldfield, 1983) and this could be of particular value when considering the acquisition and development of more complicated constructions. Finally, shared book reading is an activity involving a high degree of joint attention and, as mentioned earlier, a wealth of studies indicate the positive impact of joint attention on language development. Therefore, when considering the potential influence of book reading on linguistic development, we need to factor in the affordances provided by the activity itself; it is not only the frequency of constructions which may be of value, but also the context within which the constructions are used.
Limitations and future research
The frequency and quality of shared book reading differs dramatically from family to family. In this regard, the input available to young children is varied, echoing claims made regarding the lack of uniformity in CDS (e.g. Hoff-Ginsberg, 1992; Lieven, 1994; Oshima-Takane & Robbins, 2003). Some children will be read to very rarely but still eventually acquire a mature representation of their target language. However, just as construction-based studies of CDS have indicated frequency effects of input on the acquisition of specific constructions, we suggest that similar effects will be found within the context of shared book reading. That is, we predict that children who are read to more frequently and exposed to a broader range of constructions will master the complexities of their target language more rapidly. However, this has yet to be tested and is an important avenue for future research.
It is not just the frequency of shared book reading (and the associated frequency of construction presentation) that may provide positive benefits, but also the style of caregiver interaction (Huebner & Meltzoff, 2005; Lever & Sénéchal, 2011; Whitehurst et al., 1988). Studies indicate that a dialogic form of presentation provides additional benefits, over and above day-to-day shared book reading (i.e. sticking to the text). However, the extent to which caregivers deviate from the text appears to be open to debate and displays large individual differences (see e.g. Huebner & Meltzoff, 2005; Kang, Kim, & Pan, 2009). It may be the case, as some studies appear to indicate, that book type (both in terms of text complexity and content) plays a major role in the degree to which caregivers embellish the text (e.g. Nyhout & O’Neill, 2013; Pellegrini, Perlmutter, Galda, & Brody, 1990); indeed, this is a current focus within our research group. There is also evidence of cross-cultural differences in caregiver interaction during shared book reading (e.g. Luo, Snow, & Chang, 2011), which again reflects cross-cultural differences in CDS. Future research should factor in the variable of reading style when investigating the effects of shared book reading on grammatical development.
Conclusions
The books analysed within our sample contained a wide range of constructions which occur with low frequency in the input. Book text has the potential to provide young language learners with vital clues about the underlying structure of their target language. We stress the importance of shared book reading in the early stages of language development and also the need to further investigate the role of shared book reading in grammatical development from a constructivist/usage-based approach.
Appendix 1
Acknowledgements
We express our thanks to Elena Lieven for comments on the article. We are also extremely grateful to the editor and the reviewers of the article for their valuable comments and feedback.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
