Abstract
The present meta-analysis was conducted to examine how shared book reading affects the English language and literacy skills of young children learning English as a second language. The final analysis included 54 studies of shared reading conducted in the United States. Features of the intervention and child characteristics were tested as potential moderators, and the impact of methodological criteria was examined using sensitivity analyses. Results revealed an overall significant, positive effect of shared reading on English learners’ outcomes. Children’s developmental status moderated this effect, with larger effect sizes found in studies including only typically developing participants than in studies including only participants with developmental disorders. No other significant moderators were identified. The main positive effect was robust to the application of more stringent methodological inclusion criteria. These results support shared book reading as an early educational activity for young English learners.
There are a large and growing number of children learning English as a second language in the United States. Often referred to as English learners (ELs), these children come from a variety of language backgrounds and differ widely in their exposure to English and their home language proficiency when they begin to attend school. EL students are overrepresented among students who read at below-basic levels in the United States (Hemphill & Vanneman, 2011). In the early elementary grades, ELs often exhibit significantly lower mean English vocabulary scores compared with their monolingual English-speaking peers—as much as 2 standard deviations lower (Wood Jackson, Schatschneider, & Leacox, 2014). They also may have gaps in knowledge or proficiency on other language and literacy measures, leading to later deficits in academic achievement (Hernandez, 2012; Lesaux, Kieffer, Kelley, & Harris, 2014). This is evidenced by the fact that a disproportionate number (56%) of those students failing to attain basic reading skills are Hispanic, with only 44% of Latino/a fourth graders scoring at or above “basic” level compared with 75% of Anglos (National Center for Education Statistics, 2004). The achievement gap begins early and widens, as seen in Kieffer’s (2008) longitudinal study of the reading growth trajectories of ELs, in which children who entered kindergarten with low proficiency in English diverged from the national average and fell farther behind over time on the reading measures. Given the strong relationship between oral language skills and reading achievement (Foorman, Petscher, & Herrera, 2018), identifying effective methods for improving oral language skills in English among this population is a key concern in education research (Baker et al., 2014; Moore & Klingner, 2014).
In this article, we focus on one language-based intervention approach commonly used with young children: shared book reading (Justice, Logan, & Damschroder, 2015; Lonigan, Shanahan, & Cunningham, 2008). Shared reading has received considerable attention as an educational activity, evidenced by the publication of several prior meta-analyses that have examined its impact on monolingual children’s academic outcomes (Bus, van IJzendoorn, & Pellegrini, 1995; Lonigan et al., 2008; Mol & Bus, 2011; Mol, Bus, de Jong, & Smeets, 2008; Therrien, 2004; U.S. Department of Education, Institute of Education Sciences, What Works Clearinghouse, 2015). Although these studies have added to our understanding of the effectiveness of shared book reading interventions for monolingual populations, they did not examine EL populations in any depth. Given the differences in language development observed between EL students and their monolingual peers and the persistent gap in performance between these populations, the effect of shared reading interventions on young EL students in the United States remains an important area of investigation. The aim of the present article is to determine the impact of shared book reading interventions on young ELs’ language and literacy outcomes and to identify intervention components, child characteristics, and features of study design that influence the effect of shared reading. Increasing knowledge of the effect of shared reading for ELs and factors that moderate the relationship will inform educational practice and direct future research involving shared book reading interventions with this population.
Shared Reading as an Early Intervention Approach
Shared book reading is considered an effective practice for enhancing language and literacy development among both monolingual and EL children (Adesope, Lavin, Thompson, & Ungerleider, 2011; Lonigan & Shanahan, 2010). During shared book reading, an adult reads with one or more children and may use interactive practices such as dialogic reading techniques to engage the children or reinforce specific words or ideas from the text (U.S. Department of Education, Institute of Education Sciences, What Works Clearinghouse, 2015). Shared reading can be considered both a social and an educational practice (Barton, Hamilton, & Ivanic, 2000) and is commonly employed as a vehicle for delivering specific educational interventions, such as vocabulary or early literacy instructional programs (Justice, Kaderavek, Fan, Sofka, & Hunt, 2009; Lugo-Neris, Wood Jackson, & Goldstein, 2010).
For young ELs, shared reading interventions are among the most common early language-focused interventions reported in the early intervention literature (Larson et al., 2018). According to a systematic review of language-focused interventions for culturally and linguistically diverse children (Larson et al., 2018), specific intervention components are often highlighted as critical for an intervention’s success. These active ingredients include the frequency of shared reading, adults’ use of language during reading (Gesell, Wallace, Tempesti, Hux, & Barkin, 2012), and rich explanation of words (Collins, 2010). Other studies have highlighted specific strategies employed during shared reading, such as dialogic reading strategies (Reese, Leyva, Sparks, & Grolnick, 2010), preview/review (Leacox & Wood Jackson, 2014), children’s story creations or retells (Bernhard, Winsler, Bleiker, Ginieniewicz, & Madigan, 2008), and the incorporation of home language (Lugo-Neris et al., 2010; Wood et al., in press). Although effective shared reading interventions vary in the specific features employed, the language-rich interactions that occur during shared reading and the resulting creation of a literacy-rich environment appear to contribute to their success and continued use and popularity.
The popularity of shared book reading interventions may be further explained by several appealing aspects. Shared reading does not require many physical resources beyond a selection of books, and books are easily accessible through public libraries in the United States. The same set of books may be read repeatedly with positive impacts on children (Koskinen et al., 2000). Because a disproportionate number of young ELs live in poverty (Ryan, 2013), the material demands of an intervention program can influence that program’s feasibility of implementation for ELs and their families. Poverty not only compounds ELs’ risk for reduced language and literacy achievement (Kieffer, 2010) but also creates practical barriers to service delivery (Hernandez, 2004). Consequently, educational activities that require few and readily available materials are particularly desirable.
Another attractive feature of shared book reading is the adaptability of the approach to match the language needs, communication style, and language preferences of the adults and children involved. Shared reading interventions have been implemented using children’s home language only (Gesell et al., 2012), English only (Collins, 2010), or both the children’s home language and English (Hammer & Sawyer, 2016). This flexibility of implementation in the children’s first language (L1) or second language (L2) makes shared reading interventions more feasible with a highly heterogeneous population of ELs. Additionally, some approaches have included training and/or support for adults to adapt the intervention or adjust reading materials and interaction to be appropriate for a particular child’s developmental level (Lim & Cole, 2002) or for specific interests, routines, or memorable experiences self-identified by families (Boyce et al., 2004). Context can also vary, with interventions implemented in homes (Hammer & Sawyer, 2016) as well as child care or early learning programs (Pollard-Durodola et al., 2016). The adaptability of shared reading contributes to the desirability of this instructional approach for young ELs who often come from diverse socioeconomic and linguistic backgrounds, with wide variation in their home literacy environments and caregiver support.
Despite the practical appeal of shared book reading as an educational activity and its particular value for ELs, the current literature does not provide a clear indication of which intervention features facilitate gains in children’s language and literacy skills. Much of the existing literature has focused on using shared reading as a vehicle for delivery of a specific language intervention program (Farver, Lonigan, & Eppe, 2009) or to facilitate parent–child interaction (Barbre, 2003). This has restricted the extent to which studies of shared reading have incorporated multiple intervention approaches or compared effectiveness between different EL groups. In addition, individual research studies are inherently limited by constraints on resources, time, and availability of participants. Meta-analysis provides a method for identifying influential components of intervention, by summarizing information across many individual studies that differ in intervention, participant, and methodological characteristics. By including many studies with a wide variety of participants and approaches, meta-analytic techniques can assist in identifying factors that differentially affect study outcomes, which is critical information for practitioners and researchers. These findings can help educators develop more effective educational practices and may assist researchers in discerning some of the mechanisms of learning and of improved outcomes for ELs (Ahn, Ames, & Myers, 2012).
Prior Meta-Analyses of Shared Reading
Several prior meta-analyses have examined shared reading interventions with young children, albeit with a focus on monolingual English speakers (Lonigan et al., 2008; U.S. Department of Education, Institute of Education Sciences, What Works Clearinghouse, 2015). These studies revealed some positive effects of shared reading on certain outcomes, but results were mixed and inconsistent across meta-analyses. In a 2015 intervention report from What Works Clearinghouse, the authors identified eight studies that met What Works Clearinghouse group design standards and aggregated findings from seven of these studies (the eighth study was not included in the final effectiveness estimate because it did not include sufficient information to compute an effect size). Results were varied, with mixed effects on children’s language comprehension and composite language and no evidence of effects on alphabetics or general reading scores (U.S. Department of Education, Institute of Education Sciences, What Works Clearinghouse, 2015). Similarly, the National Early Literacy Panel’s report identified evidence of shared reading having a positive impact on oral language outcomes, print knowledge, and writing, but no significant effects on alphabet knowledge, cognitive ability, phonological awareness, or reading readiness (Lonigan et al., 2008). The variability in the findings from these studies suggests the existence of additional factors, beyond type of outcome measure, that may moderate the effect of shared book reading intervention on children’s language and literacy skills. Moderators, which can be either child characteristics or components of a specific program, predict differential response to an intervention. Recommendations for educators regarding effective shared book reading practices cannot be made with any confidence unless these factors are identified.
Aside from the lack of investigation of potential moderators, an essential limitation of the previous meta-analyses of shared reading is their primary focus on monolingual populations. Several of these meta-analyses did not include any studies reporting results from EL samples (Bus et al., 1995; Mol & Bus, 2011; U.S. Department of Education, Institute of Education Sciences, What Works Clearinghouse, 2015). Exceptions include a meta-analysis by Takacs, Swart, and Bus (2015) and the report from the National Early Literacy Panel (Lonigan et al., 2008). In Takacs et al. (2015), the authors conducted a meta-analysis of shared storybook interventions that were technology enhanced. This meta-analysis included several studies with samples comprising children from low socioeconomic, immigrant, and bilingual backgrounds (e.g., Silverman, 2013; Verhallen & Bus, 2009, 2010; Verhallen, Bus, & de Jong, 2006). The authors reported that children from disadvantaged backgrounds benefited most from the interventions, with overall small to moderate effects of technology-enhanced shared stories on comprehension (.39 average effect size, p < .01) and expressive vocabulary (.24 average effect size, p = .046). The report by Lonigan et al. (2008) did not exclude studies that used samples comprising bilingual children. However, the final sample of studies included only one study that utilized a bilingual sample, namely a primarily Hispanic participant sample. This single study was also conducted outside of the United States (Valdez-Menchaca & Whitehurst, 1992), further limiting the generalizability of the authors’ findings to ELs educated in a U.S. context.
There is a clear need for practical and generalizable information regarding the impact of shared book reading on ELs in the United States. Although it is commonly presumed that language learning strategies employed during shared reading would benefit ELs, it is also possible that a certain proficiency in English is needed to fully benefit from shared book reading in English. This notion is supported by the threshold hypothesis (Ardasheva, Tretter, & Kinny, 2012), which proposes that a minimum level of L2 competency is needed for ELs to benefit from oral language learning strategies. Applying the threshold theory, children with higher L2 proficiency would be greater beneficiaries of language learning strategies such as modeling and guided cooperative practice. Aside from the threshold hypothesis, general proficiency in L2 has been reported as a moderator in several studies, indicating that more proficient L2 learners tend to benefit more from language learning strategy interventions (Ardasheva, Wang, Adesope, & Valentine, 2017; Plonsky, 2011; Taylor, 2014). Furthermore, results of a recent meta-analysis by Prevoo, Malda, Mesman, and van IJzendoorn (2016) indicated moderate within-language correlations for both L1 and L2 between language proficiency and early literacy performance (.33 < r < .37).
Critically, because few studies with a focus on ELs have been included in previous meta-analyses, the opportunities to identify important moderators of the effect of shared book reading on ELs have been severely limited. In the present meta-analysis, we address several factors related to features of the intervention, child characteristics, and methodological components that may influence effect size estimates.
Potential Moderators of Shared Reading
Features of the Intervention
The characteristics of shared reading programs may differentially influence their effect on ELs’ outcomes. In the present study, three specific intervention features were examined as potential moderators of the effectiveness of shared reading. These were the language of intervention, the relation of the reader to the child, and the type of outcome measure used to assess growth.
Language of intervention
In the context of shared book reading, bilingual adults have the option to choose to read and interact in whichever language they prefer. Reading in either language may yield positive social, educational, and cultural benefits for the child (Cheung & Slavin, 2012; Gutiérrez-Clellen, 1999; Kohnert, Yim, Nett, Kan, & Duran, 2005; Roberts, 2008). However, there is no consensus on whether educators should use children’s L1, L2, or both during instruction (Cheung & Slavin, 2012; Farver et al., 2009; Francis, Lesaux, & August, 2006; Gutiérrez-Clellen, 1999). Several studies have found no significant differences in ELs’ English outcomes when students are randomly assigned to English-only or bilingual instructional groups (Farver et al., 2009; Slavin, Madden, Calderón, Chamberlain, & Hennessy, 2011). These results seem to indicate that ELs benefit similarly from support in either English or their home language; therefore, the language of instruction may be flexible based on circumstantial needs and preferences. Other studies report advantageous outcomes for incorporating L1 instruction (Barnett, Yarosz, Thomas, Jung, & Blanco, 2007; Durán, Roseth, Hoffman, & Robertshaw, 2013; Restrepo, Morgan, & Thompson, 2013). Conclusions about the impact of the language of intervention are limited, however, because few studies to date have compared English-only, L1, and bilingual intervention programs with sufficient methodological rigor to allow for valid comparisons. Further investigation is needed across samples of children from more diverse backgrounds.
Relation of child and reader
Many empirical studies rely on researchers (e.g., primary investigators, graduate students, or research assistants) to conduct shared reading with children (e.g., Restrepo et al., 2013; Vadasy, Sanders, & Nelson, 2015). In general practice, individuals such as service providers or caregivers are more likely to engage in shared reading practices with children on a regular basis. Examining differences in outcomes based on these forms of shared reading delivery can provide insight into the question of effectiveness versus efficacy (Gottfredson et al., 2015).
Outcome measures
The kinds of outcome measures used to assess growth can have an influence not only on the effect size estimates obtained but also on the inferences that are made about how ELs’ language and literacy develop. Shared book reading is generally a literacy-focused activity. During reading, children see text on a page, which adults may emphasize to teach concepts such as print awareness, letter knowledge, and phonological awareness. However, shared reading also provides opportunities for oral language exposure and use. Children may hear new words or linguistic structures and/or engage expressively with the adult during reading. Because shared book reading has the potential to improve both children’s language and literacy abilities, discerning which skills benefit the most from this activity can be informative for practitioners attempting to support ELs’ educational development. Understanding how shared reading affects ELs’ early skills, whether specifically literacy-focused or more general language skills, can assist practitioners and researchers in more accurately conceptualizing how ELs develop language and literacy in English.
Child Characteristics
Children differ in how much they benefit from a given intervention program. To explore how children’s backgrounds and characteristics influence their responses to shared reading interventions, we tested four child factors as moderators. These included the child’s L1 or home language, developmental status, age, and socioeconomic status (SES).
Home language
In 2011, more than 39 coded language groups were identified as the native language of children being educated in public schools in the United States (Ryan, 2013). Languages differ from one another on features that can affect how quickly and efficiently the speakers of a given language can acquire English (Branum-Martin, Tao, & Garnaat, 2015; Duncan & Paradis, 2016). Different L1/L2 pairings (e.g., Spanish and English compared with Vietnamese and English) have been suggested to result in distinct patterns of dual-language development with varying degrees of cross-linguistic interaction among bilingual children (Branum-Martin et al., 2015; Duncan & Paradis, 2016). ELs also experience varying levels of exposure to their L1 and to English (Hammer et al., 2014), which in turn affects the rate and course of development in each language (Marchman & Martinez-Sussman, 2002; Parra, Hoff, & Core, 2011). In the context of this meta-analysis, we will investigate the extent to which home language, when it is reported, serves as a moderator of children’s response to shared reading intervention.
Developmental status
Children’s developmental status (i.e., typically developing vs. disordered) may also influence the observed effects of shared book reading. The presence of a developmental disorder, such as speech or language impairment, intrinsically affects how ELs acquire both their L1 and English (Kohnert, 2010). Understanding how shared reading may differentially benefit children with developmental disorders compared with those who are typically developing can also inform clinical practice for service providers working with ELs.
Age and socioeconomic status
Identifying how shared reading may be more or less effective for children of different ages and SES backgrounds can assist educators in identifying children for whom shared book reading would be beneficial (Schickedanz & McGee, 2010). Shared reading is typically recommended as a practice that is beneficial for younger children (Lonigan et al., 2008; U.S. Department of Education, Institute of Education Sciences, What Works Clearinghouse, 2015), but the impact of shared reading on children of different ages has not been explored extensively. It is possible that younger children benefit more from shared reading than older children because books offer rich language exposure that may not otherwise be provided in the home environment (Collins, 2010; Larson et al., 2018). Older children may receive similar language input in their school settings—therefore potentially making shared reading less impactful.
The influence of SES on academic achievement and children’s responsiveness to educational interventions is widely recognized (e.g., Dietrichson, Bog, Filges, & Klint Jorgensen, 2017). Children from low SES backgrounds have been shown to respond to academic interventions differently than peers from high SES backgrounds, and their performance may be influenced by health disparities (Currie, 2009), low quantity of language input at home (Hart & Risley, 2003), access to resources, and different parenting practices (Esping-Andersen et al., 2012). Therefore, evaluation of how the effects of shared reading vary by child-level characteristics is warranted.
Methodological Considerations
The obtained effect sizes for shared book reading interventions can also be affected by the methodology used within the individual studies. All results from research are estimates of a true outcome, with some amount of error in each observed result. For a meta-analysis to yield the most accurate estimate of that true outcome or effect, it is essential to address how the methodology used in the included studies may influence the effect sizes obtained (Ahn et al., 2012). Design features such as within- versus between-group comparisons, random assignment, and choice of outcome measures can affect the validity of a study’s findings (Shadish, Cook, & Campbell, 2002). Experimenter-created outcome measures, for example, are in general more proximal to the intervention, more likely to have stronger associated effects, and more susceptible to researcher bias in that they may be constructed to be more sensitive to the particular intervention than tests constructed independent of the research project. Generally, standardized measures not constructed by the authors themselves are more distal and show smaller effects (Marulis & Neuman, 2010). Other design features have less predictable influences on empirical results and therefore may be even more important to consider. Although practical and widely used, within-group comparisons are open to several major internal validity threats, including maturation and history, whose influence cannot be determined or quantified when participants serve as their own controls (Shadish et al., 2002). Between-group comparisons can be similarly susceptible to confounds, such as preexisting differences that may affect growth, when random assignment is not used to place children into groups. These methodological features are therefore important to consider when aggregating results for a meta-analysis.
Simply discarding studies that do not meet the most rigorous methodological criteria may restrict the number of studies eligible for inclusion, however. This discarding approach can reduce the threats to internal validity for the meta-analysis, but it also limits the exploration of potential moderators. If fewer studies are included in the meta-analysis, there will be less variability in the number of moderators reported and/or targeted within the sample of included studies. Consequently, for the present meta-analysis, we elected not to exclude studies based on the rigor of the methodology employed but rather to code for these characteristics. This strategy allows for exploration of the effects of a larger number of moderators. Furthermore, we can then quantify how differences in methodology influence the effect sizes obtained, thereby guiding future research and practice.
Additionally, there is evidence that publication bias is still prevalent in social science research. Polanin, Tanner-Smith, and Hennessy (2016) found that published studies tend to produce larger effect sizes than unpublished studies. They concluded that this difference was suggestive of publication bias and recommended that unpublished research, such as dissertations and theses, be included in future meta-analyses to mitigate the overestimation of effect sizes (Polanin et al., 2016). Consequently, the publication status of papers was not included in the exclusion criteria of the present work.
The Current Study
The purpose of the present meta-analysis is to examine the impact of shared book reading on language and literacy outcomes among ELs, and to evaluate potential moderators that influence the impact of shared book reading on ELs’ outcomes. Specific moderators of interest are (a) the language of interaction during book reading, (b) the relationship of the individual(s) providing treatment to the ELs, (c) the children’s L1, (d) the children’s developmental status as typically developing or disordered, (e) the children’s ages, (f) the children’s SES, and (g) methodological design characteristics of the included studies. The following research questions were addressed:
Method
Search Strategy
To locate articles appropriate for inclusion in the meta-analysis, papers were identified from several sources. First, the databases ComDisDome, ERIC, Medline, PsycARTICLES, ProQuest Dissertations, and PsycINFO were searched for relevant studies. Studies of interest were identified using the following search parameters: the article title, subject, abstract, or keywords contained—(a) bilingual*, L2 learners, second-language learners, English language learners, ELL, English second language, ESL, English additional language, EAL, language minority, limited English proficient, LEP, limited English speaking, or multilingual* (Melby-Lervåg & Lervåg, 2014); and (b) story reading, storybook reading, joint reading, parent-child reading, shared reading, interactive book reading, dialogic book reading, or joint book reading. The search terms were deliberately broad to increase the likelihood of identifying all relevant studies. This initial database search yielded 5,120 articles. A manual search was also completed by examining the reference lists of relevant reviews or meta-analyses (Adesope et al., 2011; Cheung & Slavin, 2012; Hammer et al., 2014; Lonigan et al., 2008; Lonigan & Shanahan, 2010), the reference lists of articles included in the meta-analysis, and of articles citing articles included in the meta-analysis. Finally, several researchers with expertise in the topic area were contacted to request any unpublished or in-press research. The manual search and personal communication with researchers yielded 23 additional studies, resulting in 5,143 total studies to be considered for inclusion.
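As a rough illustration of the search logic described above, the two term groups can be combined into a single Boolean string, with terms joined by OR within a group and the groups joined by AND. This is a minimal sketch only: the term lists are copied from the text, but the helper names (`or_group`, `build_query`) and the exact OR/AND and quoting syntax are assumptions, because each database has its own query language and the original search strings are not reported here.

```python
# Term groups copied from the search description. The Boolean syntax below is
# an assumption; real databases differ in operators, quoting, and field tags.
POPULATION_TERMS = [
    "bilingual*", "L2 learners", "second-language learners",
    "English language learners", "ELL", "English second language", "ESL",
    "English additional language", "EAL", "language minority",
    "limited English proficient", "LEP", "limited English speaking",
    "multilingual*",
]
READING_TERMS = [
    "story reading", "storybook reading", "joint reading",
    "parent-child reading", "shared reading", "interactive book reading",
    "dialogic book reading", "joint book reading",
]


def or_group(terms):
    """Join terms with OR, putting multiword phrases in double quotes."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(group_a, group_b):
    """Require at least one term from each group (hypothetical helper)."""
    return or_group(group_a) + " AND " + or_group(group_b)


query = build_query(POPULATION_TERMS, READING_TERMS)

# Record counts as reported in the text.
database_hits = 5120
manual_and_contact_hits = 23
total_to_screen = database_hits + manual_and_contact_hits
print(total_to_screen)  # 5143
```

The final lines simply reproduce the reported arithmetic: 5,120 database records plus 23 records from the manual search and personal communication give 5,143 records to screen.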
Study Inclusion Criteria
Figure 1 shows the search, screening, and final identification procedures. From the articles identified from the preliminary search, papers were included in the final sample using the following criteria: the study (a) was published between January 1, 1981, and April 30, 2017; (b) was written in English; (c) utilized a participant sample comprising at least 80% ELs who were 12 years old or younger (or data were disaggregated for this population); (d) was conducted in the United States; (e) included an empirical design with an intervention that incorporated shared book reading; (f) reported at least one measure of English language or literacy as an outcome; and (g) included sufficient information to calculate effect sizes for the ELs included in the sample (or the authors provided this data on request).

Figure 1. Study selection process.
These criteria were selected to restrict the included studies to those that were most relevant to the research questions. Specifically, January 1, 1981, was selected as the earliest date for including studies to reflect educational policy changes in response to the influential Castañeda v. Pickard court case in 1981. The outcome of the case established a precedent that schools were required to provide equal, evidence-based, and effective educational opportunities for bilingual children (Gándara, 2000). The participant sample and study location criteria were applied to ensure that included studies’ results were relevant to young ELs being educated in the United States. The remaining inclusion criteria were established to enable the computation of effect sizes for shared reading interventions on ELs’ language and literacy skills in English. Based on these criteria, studies such as those completed by Crevecoeur, Coyne, and McCoach (2014), Garcia (2006), and Restrepo and colleagues (2010) were excluded. Studies were commonly excluded because means and standard deviations were not fully reported, outcomes were not measured in English, or data were not disaggregated for EL participants. Although efforts were made to obtain any needed information for the published studies, these efforts were not always successful. The final number of studies that met criteria for inclusion was 54.
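The inclusion criteria (a) through (g) amount to a conjunction of conditions, which can be sketched as a screening predicate. The sketch below is illustrative only and does not reproduce the screening procedure actually used: the `StudyRecord` class, its field names, and the example values are hypothetical, and only the date window and the 80% threshold come from the criteria stated above.

```python
from dataclasses import dataclass, replace
from datetime import date

# Values taken from the stated criteria.
WINDOW_START = date(1981, 1, 1)
WINDOW_END = date(2017, 4, 30)
MIN_EL_PROPORTION = 0.80


@dataclass(frozen=True)
class StudyRecord:
    """Hypothetical screening record; all field names are illustrative."""
    publication_date: date
    written_in_english: bool
    el_proportion_age_12_or_younger: float
    el_data_disaggregated: bool
    conducted_in_us: bool
    shared_reading_intervention: bool
    english_outcome_reported: bool
    effect_size_computable: bool


def meets_inclusion_criteria(study: StudyRecord) -> bool:
    """Return True only if every criterion (a) through (g) is satisfied."""
    sample_ok = (
        study.el_proportion_age_12_or_younger >= MIN_EL_PROPORTION
        or study.el_data_disaggregated
    )
    return all((
        WINDOW_START <= study.publication_date <= WINDOW_END,  # (a)
        study.written_in_english,                              # (b)
        sample_ok,                                             # (c)
        study.conducted_in_us,                                 # (d)
        study.shared_reading_intervention,                     # (e)
        study.english_outcome_reported,                        # (f)
        study.effect_size_computable,                          # (g)
    ))


# Made-up example record, not a study from the meta-analysis.
example = StudyRecord(
    publication_date=date(2010, 6, 1),
    written_in_english=True,
    el_proportion_age_12_or_younger=0.85,
    el_data_disaggregated=False,
    conducted_in_us=True,
    shared_reading_intervention=True,
    english_outcome_reported=True,
    effect_size_computable=True,
)
outside_us = replace(example, conducted_in_us=False)

print(meets_inclusion_criteria(example))     # True
print(meets_inclusion_criteria(outside_us))  # False
```

Because the criteria are combined with a logical AND, failing any single one (for example, criterion (g), when means and standard deviations were not fully reported) is enough to exclude a study.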
Coding Procedures
The authors developed the coding scheme for the included studies through an iterative process. Initially, a preliminary coding sheet was created. This sheet included primarily open-response items to obtain identifying information for each study (e.g., study citation, doi or UMI number), study sample characteristics (e.g., sample size, sample L1), description of the shared book reading intervention (e.g., language of reading, relation of person reading to children), information about outcome measures (e.g., construct targeted by outcome measure, standardization approach), study design characteristics (e.g., within- vs. between-groups examination, use of random assignment), and data to compute effect sizes. The first and second authors applied the preliminary coding scheme to five studies selected at random from the 54 studies that met all inclusion criteria. On the basis of this initial round of coding, the coding scheme was subsequently modified to account for all discovered responses to the open-ended items. Closed, multiple response option items were created, and this finalized coding scheme was then used to code all 54 studies.
Reliability
Initial reliability of 90% was established between the first and second authors, both of whom have expertise in education science and literacy research, during the development of the coding scheme. The studies to be included were then divided and coded independently by one of the two coders. The authors then independently recoded approximately 75% (n = 40) of the studies and calculated interrater reliability as percent agreement for all study features to be included in the analyses. Reliability was found to be at 95%, with agreement on each feature ranging from 89% to 100%. All discrepancies were resolved through discussion between the first and second authors.
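The percent-agreement index described above can be sketched as follows. This is a minimal illustration only: the function name and the ten example codes are hypothetical and are not taken from the study's coding sheets.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items on which two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)

# Hypothetical codes for one study feature (design type) across 10 double-coded studies.
coder_1 = ["within", "between", "between", "within", "within",
           "between", "within", "within", "between", "within"]
coder_2 = ["within", "between", "within", "within", "within",
           "between", "within", "within", "between", "within"]

# The coders disagree on the third study only, so 9 of 10 codes match.
print(percent_agreement(coder_1, coder_2))
```

Percent agreement does not correct for chance agreement, which is one reason it is often reported feature by feature, as was done here.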
Missing Data
During the coding process, instances of missing data were identified. When the missing data were essential for computing effect sizes for shared book reading, we contacted the authors to request the necessary information. This approach had mixed results, but did allow for the inclusion of three additional studies that may not have otherwise been included in the primary effect size analysis.
Planned Analyses
Computing Effect Sizes
All effects were coded as either within-group or between-group comparisons. Studies with a clear comparison between a control condition and an experimental condition were coded as between-group. If there was only one group in the study (i.e., single-group design), the comparison was coded as within-group. In some cases, although the study was designed as a group comparison, by necessity the effects were coded as within-group. This occurred in four different situations. First, if there were two groups but they were clearly nonequivalent (e.g., monolingual comparison group, or disordered treatment group with typically developing control group), within-group comparisons were made for the groups that met the larger study inclusion criteria (i.e., EL sample). Second, if there were two groups but the “control” group was also receiving a form of shared reading intervention, within-group comparisons were made for the groups that met the larger study inclusion criteria. Third, if no information was provided regarding the nature of the control condition, the within-group effect for the control group was not included (this does not refer to control conditions described as business-as-usual controls, but only to conditions about which no information was given). Finally, if the study included multiple groups, but insufficient information was provided to calculate between-group effects (e.g., no posttest means reported for control group), the available information was used to compute within-group effect sizes instead.
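The coding rules in this paragraph can be summarized as a small decision function. The sketch below is one reading of those rules, not the authors' coding instrument; the function name, argument names, and returned labels are all hypothetical.

```python
def comparison_type(n_groups,
                    groups_equivalent=True,
                    control_gets_shared_reading=False,
                    control_described=True,
                    between_group_data_available=True):
    """Return (coding, reason) for one study, following the rules described in the text."""
    if n_groups == 1:
        return "within", "single-group design"
    if not groups_equivalent:
        return "within", "nonequivalent groups; only groups meeting inclusion criteria are used"
    if control_gets_shared_reading:
        return "within", "control group also received shared reading"
    if not control_described:
        return "within", "control condition undescribed; control-group effect is dropped"
    if not between_group_data_available:
        return "within", "insufficient data to compute a between-group effect"
    return "between", "clear treatment versus control comparison"

# A two-group study whose comparison group is monolingual is coded as within-group.
print(comparison_type(2, groups_equivalent=False))
# A two-group study with a described business-as-usual control is coded as between-group.
print(comparison_type(2))
```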
To compute effect sizes from studies using between-group comparisons, Cohen’s d was obtained as recommended by Borenstein, Hedges, Higgins, and Rothstein (2009a) for studies that use independent groups. This value was then converted to Hedges’ g using a correction factor (Hedges, 1981). Many of the studies included in the present meta-analysis employed small samples, which can result in an overestimation of the effect size when Cohen’s d is used. Hedges’ g was used to provide a less biased estimate of the intervention effect.
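As an illustration of the conversion described above, the following sketch computes Cohen's d from two independent groups and applies the usual small-sample correction factor, J = 1 − 3/(4df − 1), to obtain Hedges' g. The formulas follow the standard independent-groups presentation; the function name and the example means, standard deviations, and sample sizes are hypothetical and do not come from any included study.

```python
import math

def hedges_g_between(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """Return (d, g, var_g) for a treatment versus control comparison."""
    df = n_t + n_c - 2
    # Pooled within-group standard deviation.
    s_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (m_t - m_c) / s_pooled
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    # Small-sample correction factor; always slightly below 1.
    j = 1 - 3 / (4 * df - 1)
    return d, j * d, j ** 2 * var_d

# Hypothetical posttest scores: 12 children per group, SD = 15 in both groups.
d, g, var_g = hedges_g_between(105, 15, 12, 100, 15, 12)
print(f"d = {d:.3f}, g = {g:.3f}, SE(g) = {math.sqrt(var_g):.3f}")
```

Because J shrinks toward 1 as the degrees of freedom grow, the correction matters most for the small samples that are common in this literature.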
Similarly, Hedges’ g was computed from Cohen’s d for the studies using within-group comparisons. To account for the nesting of pre–post scores within participants that is inherent to within-group comparisons, the formula for Cohen’s d was adjusted (see Borenstein et al., 2009a). The modified formula for computing d from pre–post scores requires an estimate of the correlation between those pre–post scores. Because very few, if any, authors of the within-group studies reported correlations between their participants’ scores, we elected to use a range of plausible correlations: r = .70, r = .80, and r = .90. All the effect sizes from within-group comparisons were computed using all three values of r.
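The role of the assumed pre–post correlation can be seen in a short sketch. It assumes the common pre–post formulation in which the standard deviation of the difference scores is rescaled by the square root of 2(1 − r); whether the authors started from pretest and posttest standard deviations or from difference-score standard deviations is not stated, so the inputs here are an assumption. The function name and example values are hypothetical.

```python
import math

def hedges_g_prepost(m_pre, sd_pre, m_post, sd_post, n, r):
    """Return (g, var_g) for a single-group pre-post comparison with assumed correlation r."""
    sd_diff = math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)
    sd_within = sd_diff / math.sqrt(2 * (1 - r))
    d = (m_post - m_pre) / sd_within
    var_d = (1 / n + d ** 2 / (2 * n)) * 2 * (1 - r)
    # Small-sample correction with df = n - 1 for a single group.
    j = 1 - 3 / (4 * (n - 1) - 1)
    return j * d, j ** 2 * var_d

# Hypothetical study: 20 children, pretest M = 40, posttest M = 44, SD = 10 at both times.
for r in (0.70, 0.80, 0.90):
    g, var_g = hedges_g_prepost(40, 10, 44, 10, 20, r)
    print(f"r = {r:.2f}: g = {g:.3f}, SE = {math.sqrt(var_g):.3f}")
```

With equal pretest and posttest standard deviations, the point estimate does not depend on r, but the sampling variance shrinks as r increases. Under these assumptions, r = .70 gives the widest intervals of the three values, which is consistent with treating it as the most conservative choice.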
Combining Effect Sizes
Most of the included papers yielded more than one effect size. Because these effect sizes were not stochastically independent, robust variance estimation was used to combine effect sizes across all studies and account for the dependency of effect sizes within studies (Hedges, Tipton, & Johnson, 2010; Tanner-Smith & Tipton, 2014). This approach was selected because it allows for all the effect sizes within each study to be included in the analyses but does not require information about the covariance structure of the individual effect sizes (Tanner-Smith & Tipton, 2014). Few authors reported the covariance structure of their obtained effect sizes, making other approaches such as hierarchical linear modeling impractical for the current study.
Random-effects weighting was used during robust variance estimation to compute the overall estimated mean effect size g and its standard error. To obtain estimates of variation in effect sizes, a range of tau-squared (τ²) values were computed using the method of moments as the estimator (Hedges et al., 2010). These were computed using a sensitivity analysis approach for the assumed constant correlation ρ. The τ² values were computed for values of ρ ranging from 0 to 1, as recommended by Tanner-Smith and Tipton (2014). To obtain measures of effect size heterogeneity, the weighted residual sum of squares QE and I² statistics were computed. QE is a test of variation in effect sizes across studies, and I² provides an index of the magnitude of the heterogeneity (Borenstein, Hedges, Higgins, & Rothstein, 2009b).
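The relation between the two heterogeneity statistics can be illustrated with the conventional definition of I² as the percentage of the weighted sum of squares that exceeds its degrees of freedom. The sketch below assumes that definition (the text does not state the formula) and applies it to two QE values reported in the Results; the function name is hypothetical.

```python
def i_squared(q, df):
    """I-squared (in percent) from a Q statistic and its degrees of freedom."""
    if q <= 0:
        return 0.0
    # Truncated at zero when Q is smaller than its degrees of freedom.
    return max(0.0, 100.0 * (q - df) / q)

# Full sample: QE(53) = 253.78.
print(round(i_squared(253.78, 53), 2))
# Randomized studies only: QE(8) = 115.93.
print(round(i_squared(115.93, 8), 2))
```

Under this definition, the two calls return 79.12 and 93.1, matching the I² values reported for those analyses.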
Because the within-group comparison studies’ effect sizes had been computed using three separate values of r (.70, .80, .90) as estimates of the pre–post score correlations, three separate overall, combined effect sizes were also reported. Each combined effect size was computed using one of the three effect sizes obtained for the within-group studies and the constant effect sizes obtained for the between-group studies.
Sensitivity Analyses
Sensitivity analyses were employed to evaluate the impact of specific methodological exclusion criteria on the overall effect size estimate. The most conservative estimate of r (.70) for computing within-group effect sizes was assumed for these and all subsequent analyses. Constraints were systematically applied to the inclusion criteria for studies to be included in the effect size computations. This approach was applied so that effect sizes obtained from studies using experimenter-created outcome measures, studies relying on within-groups designs, studies using control groups other than business-as-usual, and studies without random assignment were each excluded, in turn, from the calculation of the overall effect size estimate. Combinations of these constraints were also applied to determine the size of the effect obtained from the studies using only the most rigorous, generalizable methodology.
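The logic of applying each constraint in turn can be sketched as a filter followed by re-estimation. In this sketch a plain inverse-variance weighted mean stands in for robust variance estimation, so it illustrates the procedure rather than reproducing the reported estimates. All records, field names, and labels are hypothetical.

```python
def pooled_mean(effects):
    """Inverse-variance weighted mean (a simplified stand-in, not robust variance estimation)."""
    weights = [1.0 / e["var"] for e in effects]
    return sum(w * e["g"] for w, e in zip(weights, effects)) / sum(weights)

# Hypothetical effect size records, one dictionary per effect size.
effects = [
    {"g": 0.60, "var": 0.05, "design": "within", "measure": "experimenter", "control": None, "randomized": False},
    {"g": 0.20, "var": 0.05, "design": "within", "measure": "standardized", "control": None, "randomized": False},
    {"g": 0.50, "var": 0.10, "design": "between", "measure": "experimenter", "control": "alternate", "randomized": True},
    {"g": 0.30, "var": 0.10, "design": "between", "measure": "standardized", "control": "BAU", "randomized": True},
    {"g": 0.10, "var": 0.05, "design": "between", "measure": "standardized", "control": "BAU", "randomized": False},
]

# Each constraint keeps only the effect sizes that satisfy one methodological criterion.
constraints = {
    "all effect sizes": lambda e: True,
    "standardized measures only": lambda e: e["measure"] == "standardized",
    "between-group designs only": lambda e: e["design"] == "between",
    "business-as-usual controls only": lambda e: e["control"] == "BAU",
    "random assignment only": lambda e: e["randomized"],
}

for label, keep in constraints.items():
    subset = [e for e in effects if keep(e)]
    print(f"{label}: n = {len(subset)}, pooled g = {pooled_mean(subset):.3f}")
```

Combinations of constraints can be expressed the same way by requiring that several predicates hold at once.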
Moderator Analyses
To investigate the effects of the potential moderators of interest, a series of metaregressions were conducted with robust variance estimation. Separate metaregressions were carried out for each moderator, including design characteristics (monolingual vs. bilingual intervention, English-only vs. not English-only intervention, relation of reader to child, literacy outcome, language outcome) and child characteristics (L1 of child, developmental status, age, SES). As recommended by Tanner-Smith and Tipton (2014) for metaregressions involving a limited number of studies, a slightly more conservative alpha level was used and effects for moderators were determined to be significant only if p < .01. Retrospective power analyses were conducted to examine the statistical power for detecting differences between studies by potential moderator characteristics (Hedges & Pigott, 2004; Valentine, Pigott, & Rothstein, 2010).
Publication Bias
Publication bias is a consistent threat to the validity of findings obtained from meta-analyses (Polanin et al., 2016). To evaluate the likelihood of publication bias as a threat within the present study, we first visually examined the symmetry of a funnel plot. The plot was constructed in R (R Core Team, 2017) using the package ggplot2 (Wickham, 2009) by plotting the effect sizes on the x-axis and the inverse of those effect sizes’ standard errors on the y-axis. Asymmetry in this plot can indicate publication bias. To supplement this analysis, a moderator analysis was completed to test the significance of any differences between effect sizes obtained from unpublished studies and those obtained from peer-reviewed studies.
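The coordinates used in such a funnel plot are simple to derive: each effect size is placed at its value on the x-axis and at the inverse of its standard error on the y-axis. The sketch below computes these coordinates in Python rather than R, using the five effect sizes and standard errors that are reported individually in the Discussion; the function name and labels are hypothetical, and no plotting library is assumed.

```python
def funnel_point(g, se):
    """Return (x, y) funnel plot coordinates: effect size and inverse standard error."""
    return g, 1.0 / se

# (label, g, SE) for the estimates reported individually in the Discussion.
reported = [
    ("Reese et al. (2010), story comprehension", -1.05, 0.65),
    ("Witte (2014), proximal measure", 5.55, 1.39),
    ("Witte (2014), distal measure A", 0.19, 0.28),
    ("Witte (2014), distal measure B", 0.07, 0.28),
    ("Arriaz de Allen (2010), teacher rating", 3.17, 0.59),
]

points = [(label,) + funnel_point(g, se) for label, g, se in reported]

# List the points from most to least precise (top to bottom of the funnel).
for label, x, y in sorted(points, key=lambda p: p[2], reverse=True):
    print(f"{label}: x = {x:+.2f}, y = {y:.2f}")
```

In this small set, the three extreme estimates have the lowest precision and would therefore sit near the bottom of the plot, where wide scatter is expected.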
Results
The 54 studies included in the meta-analysis yielded a total of 226 effect sizes, with an average of 4.19 effect sizes obtained per study. Over half of the studies were dissertations (k = 31, 57%), which contained 112 effect sizes (50% of the total effect sizes). The total number of participants across all included studies was 3,989. The average sample size was 88.25 (SD = 100.20) and the approximate average age of participants across the studies was 6.33 years (SD = 1.97). Authors used oral language measures more often than literacy-focused measures to assess child outcomes (127 vs. 93 effect sizes, respectively). Of the 93 effect size estimates obtained from literacy-focused tools, 90 were reading based and only 3 targeted writing skills specifically. The reading tools were designed to assess a variety of skills, from print knowledge and phonological awareness to oral reading fluency and reading comprehension. Additional descriptive information is provided in Table 1, and the full list of studies is provided in Supplemental Table S1 in the online version of the journal.
Table 1. Descriptives for included effect sizes (n) and studies (k)
Note. n = number of effect sizes; k = number of studies; SES = socioeconomic status; L1 = first language, L2 = second language.
Effect Size Estimates and Sensitivity Analyses
The overall combined effect size, obtained using robust variance estimation with r = .70 for all within-group comparisons, for the impact of shared book reading on ELs’ outcomes was positive and moderate: Hedges’ g = 0.28 (p < .001, 95% CI = 0.18 to 0.38). The τ² values obtained as measures of variation in effect sizes ranged from .099 to .100 for ρ = .00 to ρ = 1.00. Effect sizes ranged from g = −1.05 to g = 5.55, with significant, large heterogeneity: QE(53) = 253.78, p < .001, I² = 79.12. The individual effect size estimates are shown in a forest plot in Figure 2. Slightly larger overall combined effect sizes were obtained when the pre–post correlation for within-group comparisons was set at r = .80 and r = .90. Full results for each value of r are provided in Table 2.

Figure 2. Forest plot of effect sizes and confidence intervals grouped by study.
Table 2. Overall effect size estimation and sensitivity analyses
Note. SE = standard error; df = degrees of freedom; CI = 95% confidence interval; n = effect sizes; k = studies; τ² = estimated variance in effect sizes across studies; ρ = assumed correlation between scores in within-group designs; QE = weighted sum of squares on a standardized scale; I² = index of the magnitude of heterogeneity between studies; RA = random assignment; BAU = business-as-usual.
Exclusion of within-group comparison effect sizes resulted in a larger overall combined effect size for shared reading (g = 0.42, n = 65, k = 18). However, when the control groups were examined within the between-groups studies, it was noted that several of the effect size estimates (n = 20, k = 5) were obtained from studies that provided alternate treatments to the control group. The alternate treatments are described in Supplemental Table S1 in the online version of the journal. Only 13 of the 18 between-groups studies used business-as-usual control groups. When effect size estimates were recomputed for the subgroup of studies using business-as-usual control groups, the combined weighted effect size was similar to that observed for the full sample (g = 0.27, n = 45, k = 13).
Effect sizes were also computed for only those studies implementing random assignment of participants to conditions. Within the subset of studies that used between-group comparisons, only nine used random assignment for the purpose of examining the effectiveness of shared reading. Exclusion of studies that did not employ random assignment yielded a larger overall combined effect size (g = 0.47, n = 30, k = 9). However, the large heterogeneity in the effect sizes obtained from those studies (QE[8] = 115.93, p < .001, I² = 93.10) resulted in the effect size not being statistically significant (95% CI for g = −0.22 to 1.16).
Sensitivity analyses revealed a decrease in the overall combined effect size when experimenter-created measures were excluded from effect size calculations (g = 0.17, n = 143, k = 39). As shown in Table 2, these effect sizes were still positive and significant, but exhibited slightly less heterogeneity than was found with the full sample: QE(38) = 100.23, p < .001, I² = 62.23. All results obtained from the experimenter-created outcome measures exhibited more heterogeneity than those obtained from standardized outcome measures. See Table 2 for full sensitivity analysis results.
Moderators
The effects of moderator variables were investigated using a series of metaregression analyses. First, we evaluated characteristics of the shared reading program. Full results are provided in Table 3. Tests for moderation of language of reading on children’s outcomes revealed no significant differences between the impact of monolingual and that of bilingual shared reading. Similarly, there were no significant differences in outcomes from English-only shared reading interventions when compared with those from bilingual or L1-only interventions. High levels of heterogeneity were observed in both analyses: for monolingual versus bilingual reading, QE(49) = 240.98, p < .001, I² = 79.67; and for English-only versus bilingual or L1-only reading, QE(49) = 241.74, p < .001, I² = 79.73. These analyses were underpowered (0.71 and 0.79, respectively) for detecting a difference of 0.20 or smaller.
Table 3. Results from moderator analyses of intervention characteristics
Note. Coeff = coefficient; SE = standard error; df = degrees of freedom; CI = 95% confidence interval; n = number of effect sizes; k = number of studies; τ² = estimated variance in effect sizes across studies; ρ = assumed correlation between outcomes in within-group experimental designs; QE = weighted sum of squares reported on a standardized scale; I² = index of the magnitude of heterogeneity between studies; int = intercept; P = statistical power to detect an effect of 0.20 with a type I error rate of 0.05.
The relation of the reader to the child did not moderate the effect of shared reading: no significant differences were present between effect sizes obtained from shared reading delivered by researchers and those obtained from shared reading delivered by caregivers or service providers. Again, high levels of heterogeneity were observed, QE(51) = 249.87, p < .001, I² = 79.59, although the analysis was adequately powered (power = 0.92) to detect an effect of 0.20 or larger. Finally, we compared children’s outcomes following participation in shared reading by the type of outcome (literacy vs. nonliteracy outcome). No evidence of moderation was observed, and heterogeneity was again high: QE(52) = 250.88, p < .001, I² = 79.27.
Next, child-level characteristics were tested as potential moderators. Full results are presented in Table 4, including retrospective power analyses. The L1/L2 pairing experienced by the child was tested first. Because Spanish was by far the most common L1 reported for participants (n = 176, k = 38), Spanish was compared against all other L1s reported. No significant differences in children’s shared reading outcomes were observed. Large heterogeneity was noted: QE(52) = 240.98, p < .001, I² = 79.67.
Table 4. Results from moderator analyses of child characteristics
Note. Coeff = coefficient; SE = standard error; df = degrees of freedom; CI = 95% confidence interval; n = effect sizes; k = studies; τ² = variance in effect sizes across studies; ρ = assumed correlation between outcomes in within-group designs; QE = weighted sum of squares; I² = magnitude of heterogeneity; P = statistical power to detect an effect of 0.20 with a type I error rate of 0.05.
Children’s developmental status as typically developing versus disordered did yield evidence of moderation of the effectiveness of shared reading. Although it was only possible to compare outcomes among studies that explicitly reported children’s developmental statuses (n = 99, k = 28), children with disordered development exhibited lower levels of growth from shared reading interventions than children with typical development. The effect size estimate for typically developing children was 0.48, whereas that for children with disordered development was 0.17 (Δg = −0.31, p = .004). Large heterogeneity was found: QE(26) = 139.39, p < .001, I² = 81.35.
Neither child age (measured in years) nor SES (low vs. average or mixed) revealed evidence of moderation. High levels of heterogeneity were observed in both analyses. For age, QE(52) = 250.23, p < .001, I² = 79.22. For SES, QE(52) = 252.60, p < .001, I² = 79.41.
Publication Bias
The funnel plot including all effect sizes plotted against their inverse standard errors is provided in Figure 3. Visual inspection of the plot revealed asymmetry: among both published and unpublished studies, effect sizes appeared to be missing from the left side of the plot. This pattern is consistent with publication bias in the included studies. A moderator analysis testing for differences in effect sizes obtained from the included published and unpublished papers revealed no significant differences (Δg = 0.12, p = .229), indicating that both the published and unpublished work tended to report positive effects.

Figure 3. Funnel plot of effect sizes from published and unpublished papers (n = 226). The effect sizes obtained from published papers are represented by filled circles. The effect sizes from unpublished papers are represented by open circles. The 95% confidence interval around the overall combined effect size is represented by dashed vertical lines.
Discussion
The present meta-analysis was conducted to examine the impact of shared book reading on the language and literacy outcomes of young ELs being educated in the United States and to test for moderators that may influence the impact of shared book reading on ELs’ outcomes. The primary analyses indicate that shared book reading interventions appear to affect ELs’ English language and literacy skills positively. Moderate, positive effect sizes were obtained for shared reading. This result was relatively robust to adjustments in the methodological inclusion criteria. Although the size of the effect differed based on the types of research designs included, all effect sizes were positive and most were statistically significant. The exception arose when the strictest inclusion criterion (i.e., random assignment of participants to conditions) was applied. Only nine papers qualified as randomized trials for the purpose of examining the effectiveness of shared reading. As such, the overall effect size estimate obtained from these nine studies had a large standard error and was not statistically significant.
Notably, the standard error in the effect size estimates was likely amplified by several extreme effect size estimates. Close examination of the most extreme of these estimates indicated that those values were most likely outliers and not representative of the true effect of shared reading on ELs’ language and literacy outcomes. The largest negative effect size estimate (g = −1.05, SE = 0.65), as observed in Figure 2, was obtained from a study conducted by Reese and colleagues (2010). The authors included a subsample of EL children in a primarily monolingual sample and compared the impact of two different parent-implemented intervention conditions to a business-as-usual control condition. The intervention conditions included a dialogic reading condition, which included shared reading, and an elaborative reminiscing condition, which did not include shared reading. Although the primary intervention of interest within the paper was the elaborative reminiscing condition (Reese et al., 2010), only data from the dialogic reading and control groups met inclusion criteria for this meta-analysis. Within those two groups, only 9 of the 11 ELs completed a posttest measure. Consequently, only one effect size was calculable and that was from the experimenter-created story comprehension measure, which focused on children’s ability to recall key plot points, inferences, and the main idea of a story in English. Critically, the EL children’s performance on this measure revealed evidence of floor effects, particularly in the dialogic reading group. At both pretest and posttest, at least one EL child in the dialogic reading group scored a zero on the story comprehension measure. None of the EL children in the control group scored a zero at posttest. These observations suggest that the validity of these specific findings was influenced by selection bias, with the dialogic reading group’s scores being affected by floor effects on the story comprehension measure.
On the opposite end of the distribution of effect sizes, two studies yielded markedly large effect size estimates. The first, conducted by Witte (2014), was a dissertation project in which the author was also the participants’ teacher. The author iteratively adapted the shared reading intervention so that the instruction was responsive to children’s growth. Additionally, the single outcome measure that yielded a relatively extreme effect size estimate (g = 5.55, SE = 1.39) was an experimenter-created proximal measure. The more distal measures used in the paper produced more moderate effect sizes (g = 0.19, SE = 0.28; g = 0.07, SE = 0.28; Witte, 2014). The second study that resulted in an extreme positive effect size was also a dissertation project conducted by Arriaz de Allen (2010). The small-sample study (n = 10) yielded one extreme value out of the 11 that were computed from the author’s findings. That effect size estimate (g = 3.17, SE = 0.59) was obtained from a teacher rating measure, which indirectly quantifies children’s abilities and is therefore open to additional sources of error compared with direct measures.
Consideration of the context in which the most extreme effect size estimates were obtained indicates that the heterogeneity of the estimates may be attributable to methodological limitations. The studies that produced the largest and the smallest effect sizes were vulnerable to validity threats that may have artificially inflated or decreased the results obtained. It is likely that these extreme values were truly outliers and not reflective of the true impact of shared book reading interventions on ELs’ outcomes.
The overall finding that shared book reading positively affects ELs’ English language and literacy skills does not support a threshold hypothesis or a minimum language proficiency required to benefit from shared reading interventions. Of the studies included in this meta-analysis, the clear majority yielded positive effects for shared reading. These overall positive effects support the widespread use of this educational technique with young ELs. Considering the need for educational practices that are effective for enhancing ELs’ early English skills to contribute to reducing the achievement gap between ELs and their monolingual peers, the robustness of these results is encouraging. Although the shared reading approaches were highly diverse, most appeared to affect ELs positively. This suggests that many different forms of shared reading can facilitate language growth for this population. The continued development of educational programs around shared book reading is warranted.
The overall combined effect size in the current study for the impact of shared book reading on ELs’ outcomes (g = 0.28) was similar to the average effects reported in other related meta-analyses. For example, Takacs et al. (2015) reported that technology-enhanced storybook interventions resulted in an average effect of 0.39 on comprehension and 0.24 on expressive vocabulary. Furthermore, these effect sizes are comparable with those reported in meta-analyses of other interventions for low SES students. Dietrichson et al. (2017) reported similar average effect sizes for tutoring (0.36), feedback and progress monitoring (0.32), and cooperative learning (0.22).
Intervention Characteristics as Moderators
No statistically significant moderators were identified from among the intervention characteristics tested. There were no differences in the effectiveness of shared reading by the language of reading, indicating that bilingual or L1-only reading yielded the same effects as English-only reading. This finding is consistent with prior work (Farver et al., 2009; Slavin et al., 2011) and suggests that ELs’ English growth can be supported similarly by either English-only or bilingual instruction. However, because the L1 for most of the children who participated in the included studies was Spanish, this finding may only be reflective of that L1/L2 pairing (i.e., Spanish–English) or simply a lack of statistical power to detect an effect. Neither the relation of the reader to the child nor the type of outcome measure (i.e., literacy-focused vs. nonliteracy outcome) moderated the effect size estimates obtained from the included sample of studies.
Theoretically, several of the null findings were consistent with the literature. In addition to the language of intervention, another theoretically reasonable null result was the lack of identified differences between literacy-focused and language-focused outcome measures. Language and literacy are intertwined developmentally, with both print-based skills and oral language skills contributing to the acquisition of literacy (Verhoeven & van Leeuwe, 2012). Furthermore, language abilities are particularly difficult to disentangle from literacy abilities among young children learning English as a second language. ELs’ performance on literacy-focused measures in English is generally reliant on their English language proficiency and can be complicated by their educational and cultural backgrounds (Pitoniak et al., 2009). The finding that there were no differences between ELs’ language and literacy outcomes from shared book reading may therefore either be an artifact of the difficulty of discriminating between language and literacy in EL assessment practice, or it may reflect a true lack of differences in how language and literacy are affected by shared reading activities. Regardless, both ELs’ language and literacy skills were found to increase through shared reading. This finding indicates that this practice is beneficial, although the exact mechanisms by which shared reading practices affect ELs’ specific language and literacy skills are still unclear.
The lack of evidence for the relation of the reader to the child as a moderator was somewhat inconsistent with prior work. Although it is possible that researcher-implemented interventions and caregiver/service provider–implemented interventions are equally effective for improving ELs’ outcomes, the different levels of control that are possible based on who the reader is make the equivalent findings surprising (Gottfredson et al., 2015). We anticipated that researchers would be able to conduct their shared reading interventions with high levels of fidelity and consistency, leading to stronger effects. The results indicate that stronger effects were not obtained from researcher-implemented shared reading compared with those implemented by other people.
However, there may have been too much variability in the relation of the reader to the child to detect effects. Within the studies that included intervention programs not delivered by a researcher, there were many different individuals who provided the shared reading. Although overall there were no differences between the effects of researcher-implemented and nonresearcher-implemented shared reading programs, it is possible that specific individuals (e.g., teachers) may be more effective implementers of shared reading. Given the limitations of the current body of literature, we recommend that future work continue exploring these factors as potentially important characteristics of shared reading interventions.
Child Characteristics as Moderators
One child characteristic exhibited evidence of moderation. The status of children’s development as either typical or disordered moderated the effectiveness of shared reading such that children with typical development experienced larger gains from shared book reading than children with disordered development. Because children with developmental disorders, by definition, have lower language and/or literacy skills than children who are typically developing, the disparity in growth may simply reflect that difference. Both typically developing ELs and those with disordered development benefitted from shared book reading. Although the ELs without identified disorders exhibited larger levels of growth, it is possible that the effects of shared reading may be more functionally meaningful for ELs with developmental disorders. Small increases in language and/or literacy ability at the lower end of the spectrum may have an important influence on children’s quality of life. Consequently, the finding that shared reading positively affected both children with and without disorders suggests that this educational activity is beneficial for ELs regardless of developmental status.
The other child characteristics examined did not exhibit evidence of moderation. Neither the L1/L2 pairing of the child, the child’s age, nor the child’s SES emerged as a significant factor influencing the effectiveness of the shared reading interventions. This result suggests that shared reading practices may be equally beneficial for ELs from relatively diverse backgrounds. However, the null findings from several of these moderator analyses may be inconclusive rather than indicative of truly irrelevant moderators. Many of the moderator analyses were underpowered, even to detect a relatively large moderator effect (i.e., 0.20). As such, it is possible that some of the factors examined may truly moderate the effect of shared reading. This is particularly true for the language of intervention, children’s developmental status, and the child’s native language.
It was reasonable that child age was not a significant moderator. The present study focused on a relatively narrow age range, including only children below age 12. Although it was hypothesized that younger children would benefit more from shared reading practices than older children, there has been relatively limited exploration of the influence of age on the effectiveness of shared reading. It is possible that ELs benefit from shared reading regardless of age, considering that ELs tend to exhibit diminished literacy development compared with their monolingual peers (Baker et al., 2014).
Null findings that were more surprising from a theoretical perspective included the lack of significant moderation of shared reading by the L1 of the child and the child’s socioeconomic background. It is plausible that neither of these factors truly moderates the effect of shared reading, but several pieces of evidence preclude the unconditional acceptance of these null findings. First, there are some theoretical reasons to expect differential impacts by these factors. There is a large body of work, including meta-analyses (Dietrichson et al., 2017; Sirin, 2005), that has demonstrated that children from low SES backgrounds exhibit disproportionately low academic achievement compared with children from higher SES backgrounds. As such, children from low SES backgrounds may be expected to demonstrate differential gains from shared reading compared with children from more affluent backgrounds. Second, as noted previously, statistical power was a limiting factor. The examination of the child’s L1 as a moderator was substantially underpowered, suggesting that the null finding requires revisiting in future analyses.
It is also worth noting that the measurement of the children’s L1 and SES was relatively imprecise. The studies included in the present meta-analysis primarily reported SES dichotomously based on children’s eligibility for free or reduced-price lunch. In contrast, a meta-analytic review of research on SES and academic achievement indicated that the literature base generally includes three components: parental income, parental education, and parental occupation (Sirin, 2005). Other studies in the literature have also considered specific factors such as quality of child care, birth weight of child, education of mother, and employment of mother (e.g., Esping-Andersen et al., 2012), as well as teacher quality and teachers’ lower expectations of students from low SES backgrounds (Chetty, Friedman, & Rockoff, 2014). More precise measurement related to SES might have yielded different findings.
Finally, the ability to detect moderation effects of child L1 and SES may have been restricted by the limited variability of these factors in the studies that focus on ELs. Spanish was the most frequently occurring L1 reported within the included studies. More than 75% of the effect size estimates were obtained from samples of ELs who spoke Spanish at home. Furthermore, and consistent with other reports in the literature, most of the participant samples included in this meta-analysis comprised children from low SES backgrounds. The relative homogeneity of these potential moderators limits the ability to detect differences in outcomes by these factors.
Limiting Factors Within the Included Papers
We suggest that caution is needed in interpreting the limited identification of moderators and the values obtained for the effect sizes within the present work. There was much variability in the amount and specificity of information that authors reported about their shared reading interventions, participant characteristics, and methodology used. Some authors were thorough and detailed in their reporting, but others failed to report essential information such as the language used during reading or who conducted the reading. Although some of these factors were tested as moderators, studies that did not include clear descriptions could not be included in those analyses, leaving them underpowered.
Authors also frequently did not report details about the participants included in the study. For example, children’s proficiencies in their L1 and L2 were rarely reported despite their theoretical importance in predicting outcomes (Baker et al., 2014; Kieffer, 2010). Studies identify proficiency level in different ways, making it difficult to standardize or compare across studies, which is a commonly reported limitation in meta-analyses involving ELs (e.g., Ardasheva et al., 2017). This limited the amount of detail that could be included in coding for intervention and child characteristics, which in turn limited the exploration of moderators of shared reading.
The methodological rigor of the included papers may also have contributed to the lack of identification of shared reading moderators. As noted previously, of the 226 effect sizes included in the analyses, only 63 were obtained using between-group comparisons. This is a problem because within-group comparisons are open to several major internal validity threats (Shadish et al., 2002). Additionally, of those 63 effect sizes that were obtained through between-group designs, only 30 were obtained from experiments using random assignment (from nine studies).
Another consideration in interpreting the findings from the moderator analyses is that a large amount of heterogeneity was found in the effect size estimates. Despite the overall positive effect of shared reading, there was substantial variance in the size and direction of the effects found across the 54 included studies. Although most of the potential moderators could not explain this variability, it is possible that confounding factors masked the existence of genuine moderators. For example, the lack of identification of significant moderators may be attributable to the complexity of the relations between the included predictors. Although limitations encountered in reporting and methodology precluded the exploration of interactions between hypothesized moderators in this article, it is feasible that each moderator interacted with other child and intervention characteristics. ELs’ cultural, linguistic, and educational experiences can influence their development and response to various educational programs in multifaceted ways that are not fully predictable (Genesee & Riches, 2006). Another possible explanation for the lack of moderation is that the true moderators of shared reading were not examined by the present work. It is plausible that we did not consider or were not able to include important factors (e.g., intervention frequency and duration, story genre) that truly moderate the effectiveness of shared reading on ELs’ language and literacy outcomes.
Limitations of the Present Meta-Analysis
A key limitation of the present study is that we were unable to examine all of the features of shared book reading interventions that may have differentially affected ELs’ outcomes. Because of the methodological limitations of the included studies and the wide variability of effect size estimates, a larger body of empirical research is needed to facilitate adequately powered examination of specific types of shared reading programs. High levels of transparency in describing intervention approaches are essential for future meta-analyses to examine the impacts of different instructional techniques within the context of shared book reading with ELs.
Another notable limitation is that, when computing the effect sizes for the within-group designs, the correlations between pre- and post-test measures were estimated rather than obtained from the studies’ authors. Few authors reported pre–post correlations. Therefore, to compute effect sizes for these papers, three separate reliability values were compared against one another. As we selected the most conservative estimates obtained from these comparisons, it is possible that this resulted in an underestimation of the overall effect of shared book reading on ELs’ language and literacy outcomes. It is likely that the effect size computed using only results from the papers with the most rigorous methodology (i.e., between-groups comparisons with random assignment and using standardized, norm-referenced outcome measures) most accurately reflected the true impact of shared book reading.
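The role of the assumed pre–post correlation can be made concrete with a minimal sketch. It uses one common textbook approximation for the sampling variance of a within-group standardized mean change, Var(d) = (1/n + d^2/(2n)) * 2(1 - r). This formula, the function name, and the input values are assumptions for illustration and are not necessarily those used in the present analysis.

```python
def within_group_variance(d, n, r):
    """Approximate sampling variance of a within-group standardized mean change.

    d: standardized mean change (post minus pre, in SD units)
    n: number of participants measured at both time points
    r: assumed correlation between pre- and post-test scores
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    # Common approximation: the factor 2 * (1 - r) shrinks the variance
    # as the pre-post correlation increases.
    return (1.0 / n + d ** 2 / (2.0 * n)) * 2.0 * (1.0 - r)


if __name__ == "__main__":
    # Hypothetical study: d = 0.50 with 30 children, under three assumed
    # values of the pre-post correlation.
    for r in (0.5, 0.7, 0.9):
        variance = within_group_variance(0.50, 30, r)
        print(f"assumed r = {r:.1f}: variance = {variance:.4f}")
```

In this approximation, a lower assumed correlation yields a larger sampling variance, so the corresponding effect size receives less weight in an inverse-variance weighted analysis.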
The limited reporting of correlations also restricted the use of more precise statistical models to compute effect size estimates (Cheung, 2014; Tanner-Smith & Tipton, 2014). The statistical approach implemented in the present study accounted for the dependence of effect sizes within studies through adjusting the standard errors (Hedges et al., 2010). However, a multilevel approach explicitly models the dependency in effect sizes, allowing for increased analytic precision compared with the current approach (Cheung, 2014). Future research would benefit from increased reporting of correlations in the body of research focused on ELs to permit the use of more precise models to compute effect size estimates.
Finally, tests for publication bias yielded some evidence that publication bias is present in the current literature focused on shared book reading as an intervention approach. This bias appeared to be present even among the dissertations included, which produced some of the most extreme effect size estimates. Although the overall impact of shared reading appears relatively robust, positively affecting ELs’ language and literacy outcomes, these findings should be interpreted with caution. It is likely that some of the effect sizes obtained in the present meta-analysis are overestimations of the true effect of shared reading. To determine this intervention’s precise effect, more rigorously designed studies are needed.
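As an illustration of how small-study asymmetry can be checked, the sketch below implements Egger's regression, which regresses each standardized effect (d / SE) on its precision (1 / SE); an intercept far from zero suggests funnel plot asymmetry. This section does not state which publication bias tests were used, so the choice of test is an assumption, the data are invented for illustration, and the sketch ignores the dependence among effect sizes drawn from the same study.

```python
def egger_regression(effects, standard_errors):
    """Return (intercept, slope) from Egger's regression.

    effects: list of effect size estimates
    standard_errors: list of their standard errors (same order, all positive)
    """
    if len(effects) != len(standard_errors):
        raise ValueError("effects and standard_errors must have equal length")
    if len(effects) < 3:
        raise ValueError("at least three effect sizes are needed")

    # Predictor: precision. Outcome: standardized effect.
    precision = [1.0 / se for se in standard_errors]
    standardized = [d / se for d, se in zip(effects, standard_errors)]

    n = len(effects)
    mean_x = sum(precision) / n
    mean_y = sum(standardized) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    if sxx == 0.0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(precision, standardized))

    # Ordinary least squares estimates for a single predictor.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Invented data in which less precise studies report larger effects.
    ses = [0.10, 0.20, 0.25, 0.40]
    effects = [0.20, 0.30, 0.35, 0.50]
    intercept, slope = egger_regression(effects, ses)
    print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

A full analysis would also test whether the intercept differs significantly from zero and would account for multiple effect sizes per study, both of which are omitted here for brevity.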
Conclusion
The results from this meta-analysis suggest that shared book reading interventions may facilitate modest amounts of growth in ELs’ language and literacy skills. The effect sizes were robust to several design characteristics and were moderated only by children’s developmental status as typical or disordered. Children with developmental disorders exhibited smaller amounts of growth from shared reading interventions than children who were typically developing. The lack of identification of other moderators may be attributable to low statistical power to detect effects and the methodological rigor of the included papers.
Overall, the widespread use of shared reading as an educational activity and as a vehicle for delivering intervention programs appears to be warranted. ELs benefit from shared book reading, evidenced by growth in their English language and literacy skills. However, to identify the essential characteristics of shared reading programs and of ELs themselves that influence the approach’s effectiveness, further research is needed with attention to detailed reporting and rigorous research methodology.
Supplemental Material
DS_10.3102_0034654318790909 – Supplemental material for Shared Book Reading Interventions With English Learners: A Meta-Analysis
Supplemental material, DS_10.3102_0034654318790909 for Shared Book Reading Interventions With English Learners: A Meta-Analysis by Lisa Fitton, Autumn L. McIlraith and Carla L. Wood in Review of Educational Research
Authors
LISA FITTON is a postdoctoral researcher at the Florida Center for Reading Research at Florida State University, 2010 Levy Avenue, Suite 100, Tallahassee, FL 32310, USA.
AUTUMN L. McILRAITH received her PhD in communication science and disorders from Florida State University. She is currently a postdoctoral fellow at the Texas Institute for Measurement, Evaluation and Statistics, at the University of Houston, 4849 Calhoun Road, Houston, TX 77204, USA.
CARLA L. WOOD is a professor in the School of Communication Science and Disorders at Florida State University, 201 W Bloxham St, Tallahassee, FL 32301, USA.
