Abstract
Dialogic reading (DR) is an evidence-based practice for young children who are typically developing and at risk for developmental delays, with encouraging evidence for children with disabilities. The purpose of this review was to comprehensively evaluate the evidence base of DR across early childhood settings, with specific attention to fidelity features. We coded identified studies (n = 30) published in peer-reviewed journals on a number of variables, including participant characteristics, setting, adherence to intervention components, fidelity of training procedures, implementation fidelity, dependent variables, overall outcomes, and study rigor. Our findings indicate wide variance is present in adherence to the DR protocol despite all studies reporting use of DR. In addition, although most researchers describe training procedures, none reported fidelity of those practices. Variability was also noted in how the implementation of DR with children is monitored in research.
Reading aloud has been referred to as “the single most important activity for building knowledge required for success in reading” (Anderson, Hiebert, Scott, & Wilkinson, 1985, p. 23). By reading aloud to children at early ages, adults facilitate the development of valuable skills that support later reading development, particularly print knowledge and oral language skills (Aram, Ekelham, & Nation, 1984; Catts, Fey, Tomblin, & Zhang, 2002; Justice, Kaderavek, Bowles, & Grimm, 2005). Adults can support children’s participation by asking them questions about the book and encouraging them to ask their own questions or comment about the events and characters in the story. In addition to hearing language, children have opportunities to use language during shared interactive reading. Shared interactive reading practices, such as dialogic reading (DR), yield statistically significant and moderate-sized effects on children’s oral language skills and print knowledge, as well as account for unique variance in their expressive vocabulary and morphological skills (Bus, Van Ijzendoorn, & Pellegrini, 1995; Mol, Bus, & de Jong, 2009). The National Early Literacy Panel (NELP) corroborated these findings, reporting the largest impact of shared book reading experience on oral language outcomes (NELP, 2008).
Simply reading to children does not ensure that they will make adequate gains in oral language and print knowledge; rather, the quality of book reading is more important than the quantity of words read (Scarborough & Dobrich, 1994). Shared interactive reading is a broadly used term that encompasses a number of interventions to engage children in book reading using strategies such as child-centeredness, elaborations of children’s utterances, active responding, pause time, and evaluation of children’s responses (Hemmeter & Kaiser, 1994). DR is a particular method of shared interactive reading in which the adult uses specific question prompts to encourage children to talk during book readings, thereby optimizing oral language development. Whereas shared interactive reading generally incorporates many of the same strategies as DR (e.g., repetition of vocabulary words, oral language prompts, evaluation of children’s responses), DR provides a specific, systematic framework for how adults can engage children in interactive reading.
Instead of simply reading the text of the story, adults using a DR approach encourage children to take on an interactive role with the story through an intentional scaffolding instructional sequence that begins with the adult posing a question to the child (Flynn, 2011). DR was initially developed by Whitehurst and colleagues (Lonigan & Whitehurst, 1998; Whitehurst et al., 1994; Whitehurst et al., 1988); parents and educators were taught to use the mnemonics “PEER” and “CROWD” to remember the DR steps and the specific question prompts, respectively. In DR, the adult periodically prompts a child to verbally participate in the reading. Once the child responds to the adult prompt, the adult evaluates the accuracy of that response and expands on the child’s utterance. Finally, the adult repeats the prompt to allow the child another opportunity to recite the response. The specific types of prompts used in DR include completion, recall, open-ended, wh-, and distancing questions. Use of these DR strategies allows the child to move from passive listener to storyteller during book reading (Whitehurst et al., 1988).
Whitehurst and his colleagues trained a group of mothers to read to their children using DR strategies. After 4 weeks, the children in the DR condition had greater expressive vocabulary scores than children in the control group who were read to without any change of behaviors. Children also demonstrated improved print knowledge and early reading/writing skills (Whitehurst et al., 1988). The research on DR has been implemented in a variety of settings, such as classrooms (Lonigan & Whitehurst, 1998; Valdez-Menchaca & Whitehurst, 1992; Whitehurst et al., 1994), homes (Arnold, Lonigan, Whitehurst, & Epstein, 1994; Whitehurst et al., 1988), and across different populations including children from low-income families (Valdez-Menchaca & Whitehurst, 1992) and those with language delays (Dale, Crain-Thoreson, Notari-Syverson, & Cole, 1996).
Positive outcomes for children, especially in terms of oral language development, have been consistently noted in the DR literature (Lonigan, Anthony, Bloomfield, Dyer, & Samwel, 1999; Lonigan & Whitehurst, 1998; Wasik & Bond, 2001). DR has been associated with strengthened skills in the domains of expressive language and emergent literacy (Zevenbergen, Whitehurst, & Zevenbergen, 2003) and is a strong predictor of literacy development later in life (Dickinson & Smith, 1994). Based on a report from the NELP (2008), the effect size (ES) of DR (ES = .59) is larger than that of non-DR practices (ES = .42) although both reached significance.
The U.S. Department of Education’s What Works Clearinghouse (WWC) has published two reviews of DR, one for “Early Childhood Education” (WWC, 2007) and one for “Early Childhood Education for Children with Disabilities” (WWC, 2010). In addition, a meta-analysis of 16 studies investigating the “added-value” of DR in parent–child home book reading contexts was completed by Mol, Bus, de Jong, and Smeets (2008). Based on these reviews, DR has been established as an evidence-based practice for children who are typically developing and those at risk (WWC, 2007), showing moderate ESs for young children’s (ages 2–3 years) expressive vocabulary skills (Mol et al., 2008). Although DR is well supported in the literature, the instructional components of DR have not been closely examined. Specifically, this literature has not examined data on the individual instructional components suggested within the DR framework (i.e., “PEER” and “CROWD”), making it difficult to decipher the “active ingredients” in this intervention.
When the use of an evidence-based approach is examined, factors that affect implementation are crucial to the effectiveness of the intervention (Fixsen, Naoom, Blasé, Friedman, & Wallace, 2005). In each of the earlier reviews (Mol et al., 2008; WWC, 2007, 2010), no information was included related to the different aspects of fidelity across contexts, research designs, or variety of participants with which DR studies have been conducted. Mol and colleagues (2008) limited their analysis to quasi-experimental studies in the home setting with children who were typically developing, with no specific coding of adherence to the DR protocol, training fidelity, or implementation fidelity. Similarly, the WWC (2007) review in early childhood education included only five group design studies, all conducted in child care centers, and the WWC (2010) review included only two group design studies for DR conducted with children with disabilities in the home setting. No review to date has captured studies using single-case research design (SCRD), limiting our understanding of potential implications for children with disabilities, as SCRD is a common methodology used to examine intervention effects for this population (Wong et al., 2013).
Focus of the Present Study
The implementation science movement calls for systematic examination of variables and conditions that affect the effectiveness and sustainability of evidence-based practices (Fixsen, Blase, Duda, Naoom, & Van Dyke, 2010). As researchers are using the implementation science framework to minimize the research to practice gap by examining factors that affect implementation of evidence-based strategies, processes in implementing commonly used strategies such as DR must be carefully examined. Specifically, three types of fidelity related to the independent variable should be addressed: (a) the components of DR that are included in each study (i.e., adherence fidelity; Mowbray, Holter, Teague, & Bybee, 2003), (b) whether the training provided to a nonresearcher “interventionist” (e.g., parents, teachers) is documented in a replicable manner (i.e., training fidelity; Ledford & Wolery, 2013), and (c) whether the interventionists employed the components of DR during the book reading with children (i.e., implementation fidelity; Horner et al., 2005; Ledford & Wolery, 2013). Each of these is a fundamental feature of the intervention that determines effectiveness of such practices and needs to be investigated to determine a functional relation between DR and child outcomes (Ledford & Wolery, 2013). Systematic examination of implementation features of DR can also lead to the identification of active ingredients that affect the changes in observed outcomes as well as help researchers design additional trainings or support for interventionists (Ledford & Wolery, 2013). Furthermore, it has been suggested that caregivers and early childhood professionals should be able to easily learn and implement DR because of its formulaic structure (Teale, 2003). The veracity of this assertion, however, is unclear given that fidelity of implementation was not measured, or fully explored, in many of the foundational studies demonstrating its effectiveness. 
The purpose of this article is to systematically examine fidelity features (i.e., adherence, training, and implementation fidelity) of the DR approach across contexts, participants, and research designs. Specifically, the following research questions are addressed:
What components of DR (i.e., CROWD, PEER) have been included in studies examining the outcomes of DR (i.e., adherence fidelity)? To what extent have procedures for training parents and providers to implement DR been documented (i.e., training fidelity)? To what extent has fidelity to implement DR strategies been documented (i.e., implementation fidelity)?
Parents and providers who receive training in DR? and Children with whom adults use the DR strategies?
Method
Search Procedure
Literature searches were conducted through the PsycINFO and ERIC databases using the following search terms separately: dialogic reading, shared book reading, shared interactive reading, and shared reading. To be included in the review, studies must have been published in peer-reviewed journals, written in English, and included at least one child participant between the ages of 2 and 5 years. The initial search yielded 194 articles that were screened by the first author against the following criteria: (a) the author(s) indicated specific use of “dialogic reading” as the focus of intervention, (b) the study employed an experimental or quasi-experimental group design or a SCRD, regardless of whether it met study rigor criteria, and (c) child outcome data were reported. Studies in which interventions focused on the broader category of “shared interactive reading” were excluded even if strategies implemented within those studies mirrored some components of DR. One hundred sixty of the first-round articles were excluded due to failure to meet one or more of the inclusion criteria, yielding a total of 34 articles. Four additional articles were excluded from the final analysis because the coding process revealed that no child outcome data were reported or that DR was not reported as the primary intervention, resulting in 30 articles for the final analysis (see Table 1 for included studies).
Table 1. Study Identifier and Demographic Information Across Studies.
Note. SCRD = single-case research design; H = home; L = lab; C = classroom; T = therapy room; G = small group (2–5 children); EG = experimental group; CG = control group; NR = not specifically reported; M = males; F = females; TP = total participants; TD = typically developing; AR = at risk; DD = diagnosed disability.
May be expressed in age range or average age.
Coding Procedures
For all articles, descriptive statistics were coded by the first and fourth authors. Information was independently extracted for each study based on 52 variables related to the following areas: study identification (i.e., author, year published), participant and interventionist characteristics (i.e., number, gender, age, disability status, race/ethnicity, socioeconomic status [SES], education), intervention characteristics and components (i.e., components of DR), training procedures and fidelity (i.e., training characteristics, frequency, follow-up, teaching arrangement and setting), dependent variables, and study rigor. Interrater reliability (IRR) was completed by a trained research assistant for six (20%) randomly selected articles. Using the formula (Agreements / [Agreements + Disagreements] × 100), overall agreement was high, at 92%. Consensus for all disagreements was completed with the first and fourth authors.
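The percent-agreement formula above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring procedure: the function name `percent_agreement` is hypothetical, and the counts in the usage line are invented for illustration rather than taken from the review.

```python
def percent_agreement(agreements: int, disagreements: int) -> float:
    """Return interrater agreement as a percentage.

    Implements Agreements / (Agreements + Disagreements) x 100.
    """
    total = agreements + disagreements
    if total == 0:
        # Agreement is undefined when nothing was double-coded.
        raise ValueError("at least one coded item is required")
    return agreements / total * 100


# Hypothetical counts, for illustration only (not data from the review).
print(round(percent_agreement(46, 4), 1))
```

Note that this simple index does not correct for chance agreement; it only mirrors the calculation described in the text.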
DR instructional sequence
As one of our study aims was to explore the adherence fidelity of DR, the intended use of CROWD strategies was coded according to the presence of each of the five prompt types (i.e., the “P” in PEER). Similarly, the evaluation, expansion, and repetition portion of PEER was coded separately (see Table 2). For all aspects of CROWD and PEER, articles were coded as to the author(s)’ description of the intervention (i.e., their intended use of the strategies). Although some studies did not specifically refer to all aspects of CROWD, if the author(s) used terminology consistent with CROWD (e.g., “open-ended” questions), articles were coded in that regard. Any additional strategies falling outside of the CROWD parameters (e.g., “shadow the child’s interest,” “help the child as needed”) were not noted in this analysis.
Table 2. Fidelity of DR Studies.
Note. C = completion; R = recall; O = open-ended; W = wh-question; D = distancing; NR = not specifically reported; Y = yes; N = no; P = parent; TE = teacher; SLP = speech–language pathologist; R = researcher; a = video viewed during training; b = role-play during training; c = feedback given from the trainer; d = interactive discussion; e = handouts with a summary of dialogic strategies provided; ns = not specified; Level 0 = no training was stated; Level 1 = authors stated training was provided, but no description was given; Level 2 = authors stated training was provided and gave a limited description of the training; Level 3 = authors stated training was provided and gave a detailed description of training.
a Indicates adequate interobserver agreement/reliability was measured and reported for implementation fidelity. b Indicates data for implementation fidelity reported, with no report of interobserver agreement/reliability.
Training procedures
To determine training fidelity, we recorded if researchers collected fidelity data on procedures used to train interventionists, data collection methods on training procedures, and if estimates of interobserver agreement (IOA) were reported. Specifically, each study was coded for the types of training practices implemented (e.g., video training, role-play, handouts), if follow-up to the initial training was provided, and the number of training sessions. For the purposes of this review, the training fidelity was coded as follows: (a) no training was stated; (b) authors stated training was provided, however, no description was given; (c) authors stated training was provided and a limited description of training was given; (d) authors stated training was provided and gave a detailed description of training (i.e., replicable); or (e) authors described a detailed training and that training was observed/recorded and fidelity reported.
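The five training categories (a) through (e) can be expressed as an ordered coding scheme. The sketch below is one possible encoding, not the authors' instrument: the names `TrainingLevel` and `code_training` are hypothetical, the values 0 to 3 follow the Level 0 to Level 3 labels used in the Table 2 note, and the value 4 for category (e) is an assumption made here for illustration.

```python
from enum import IntEnum


class TrainingLevel(IntEnum):
    """Ordered categories for how training was documented in a study."""
    NOT_STATED = 0              # (a) no training was stated
    STATED_ONLY = 1             # (b) training stated, no description given
    LIMITED_DESCRIPTION = 2     # (c) training stated, limited description
    DETAILED_DESCRIPTION = 3    # (d) training stated, detailed (replicable) description
    DETAILED_WITH_FIDELITY = 4  # (e) detailed, observed/recorded, fidelity reported (assumed value)


def code_training(stated: bool, description: str = "none",
                  fidelity_reported: bool = False) -> TrainingLevel:
    """Map a study's reported training information to a category.

    `description` is one of "none", "limited", or "detailed"
    (hypothetical labels chosen for this sketch).
    """
    if description not in ("none", "limited", "detailed"):
        raise ValueError(f"unknown description level: {description!r}")
    if not stated:
        return TrainingLevel.NOT_STATED
    if description == "none":
        return TrainingLevel.STATED_ONLY
    if description == "limited":
        return TrainingLevel.LIMITED_DESCRIPTION
    if fidelity_reported:
        return TrainingLevel.DETAILED_WITH_FIDELITY
    return TrainingLevel.DETAILED_DESCRIPTION
```

Using an ordered enumeration keeps the categories comparable, so a study coded as `DETAILED_DESCRIPTION` can be identified as falling short of the highest category without any extra bookkeeping.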
Implementation of DR strategies
We also coded the extent to which researchers gathered and reported data on implementation fidelity during intervention procedures with children. This information was based on their intended use of initial CROWD question prompts and the remaining steps of the DR instructional sequence (i.e., evaluation, expansion, and repetition). Studies were dichotomously coded for the presence or absence of this information and for whether IOA was reported for the fidelity estimates (see Table 2).
Methodological rigor
Studies in this review were evaluated for methodological rigor based on quality indicators established by panels of experts. Each group design study was coded according to nine quality indicators set forth by Gersten and colleagues (2005). Specifically, each study was evaluated for provision of the following: (a) evidence of random assignment, (b) sample attrition of less than 30%, (c) description of sample comparability across groups, (d) adequate description of the independent variable and the comparison condition, (e) use of more than one outcome measure, (f) data on reliability of outcome measures, (g) data on validity of outcome measures, (h) data of fidelity of independent variable (i.e., implementation fidelity), and (i) ES of outcomes for child participants (see Table 3). All data coded related to outcome measures, including ESs, were related specifically to child participants as the studies evaluated were examining the effects of DR on children.
Table 3. Study Rigor: Group Design.
Note. IV = independent variable; Y = Yes; N = No; — = not reported.
a Study employed a within-subject repeated measures design. b No comparison group; Gersten et al. (2005).
For SCRD studies, the authors coded for the presence of a functional relation using visual analysis guidelines set forth by Horner and colleagues (2005). Studies were coded for the following data characteristics: (a) a stable baseline, (b) overlapping data, (c) immediacy of change between phases, (d) the consistency of treatment effect, and (e) at least three attempts at demonstrating treatment effects. These data features were coded individually and collectively at the study level to determine whether experimental control was established and a functional relation was demonstrated in each study. In addition, SCRD studies were coded for adherence to the following six design standards recommended by WWC (Kratochwill et al., 2010, 2013): (a) systematic manipulation of the independent variable, (b) repeated measurement of the dependent variable, (c) IOA reported for over 20% of data, (d) IOA at over 80% agreement, (e) at least three attempts at demonstrating a treatment effect, and (f) at least 3 data points per phase. The first four criteria were scored as present or not present at the study level, and the final two design standards were coded at the case level, using a dichotomous scale for the fifth standard and two categories for the sixth (3–4 data points per phase or 5 or more data points per phase). The individual studies were ultimately classified as (a) Meets Standards if they provided 5 or more data points per condition and met all other design standard criteria, (b) Meets Standards with Reservations if there were 3 or 4 data points per condition and all other criteria were met, and (c) Does Not Meet Standards if there were fewer than 3 data points per condition or if the case failed to meet any criteria.
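The three-way classification described above can be written as a small decision rule. The following is a minimal sketch under stated assumptions, not the WWC's or the authors' scoring tool: the class `ScrdCase`, its field names, and the function `classify_scrd` are hypothetical, and the rule encodes only the criteria as summarized in this section.

```python
from dataclasses import dataclass


@dataclass
class ScrdCase:
    """Design-standard codes for one case (field names are hypothetical)."""
    systematic_manipulation: bool   # (a) independent variable systematically manipulated
    repeated_measurement: bool      # (b) dependent variable measured repeatedly
    ioa_on_over_20_percent: bool    # (c) IOA reported for over 20% of data
    ioa_over_80_percent: bool       # (d) IOA at over 80% agreement
    three_demonstrations: bool      # (e) at least three attempts at showing an effect
    min_points_per_phase: int       # (f) smallest number of data points in any phase


def classify_scrd(case: ScrdCase) -> str:
    """Return the design-standard classification for a single case."""
    other_criteria_met = all([
        case.systematic_manipulation,
        case.repeated_measurement,
        case.ioa_on_over_20_percent,
        case.ioa_over_80_percent,
        case.three_demonstrations,
    ])
    # Failing any criterion, or having fewer than 3 points per phase, rules the case out.
    if not other_criteria_met or case.min_points_per_phase < 3:
        return "Does Not Meet Standards"
    if case.min_points_per_phase >= 5:
        return "Meets Standards"
    # Exactly 3 or 4 data points per phase with all other criteria met.
    return "Meets Standards with Reservations"
```

For example, a case meeting criteria (a) through (e) with four data points in its shortest phase would be returned as "Meets Standards with Reservations" under this rule.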
Results
To answer the first research question and contextualize the results of this review, data are first presented for the participants and settings that were included in the 30 analyzed studies.
Participants and Settings
Child participants
Across all 30 studies, there was a total of 1,608 child participants with ages ranging from 24 to 75 months. There was a wide range in the number of child participants: from two (Huennekens & Xu, 2010) to 324 participants (Lonigan, Purpura, Wilson, Walker, & Clancy-Menchetti, 2013). The largest share of articles (47%) included more male than female participants, whereas 30% of articles included a majority of female participants. Researchers did not report the gender of their participants in five studies. In eight articles, researchers reported having at least some child participants with an identified disability, with the most common reported as language delay. In the remaining 22 studies, researchers described their participants as either “at-risk” (23%) or “typically developing” (77%). Researchers in two studies (Fleury, Miramontez, Hudson, & Schwartz, 2014; Fleury & Schwartz, 2017) involved participants diagnosed with Autism Spectrum Disorder. In one study (Rahn, Coogle, & Storie, 2016), researchers included children with Attention-Deficit/Hyperactivity Disorder and cleft palate. Limited information was provided on the ethnicity and socioeconomic status of participants; researchers who provided this information (n = 11) reported a majority of African American participants (55%) and low-income families (73%). In one study, researchers focused on only “wealthy” participants, with educational levels of up to PhD (Chow, McBride-Chang, & Cheung, 2010).
Adult participants
Researchers in 57% of studies used parents as the main interventionists, 20% used researchers, 3% teachers, and 17% a mix of parents and teachers. One study (Desmarais, Nadeau, Trudeau, Filiatrault-Veilleux, & Maxès-Fournier, 2013) utilized the child’s regular speech–language pathologist (SLP) as the interventionist. The majority of researchers did not provide specific information about the adult participants, with the most information provided by those who utilized parents as interventionists. In 81% of these studies, researchers reported that the majority of parents were considered low income and had received little to no college credit. The ethnicities and races represented in five studies included Chinese, White, African American, and Latino.
Setting
The most common setting across studies was the home (53%), which is consistent with the frequency with which parents served as the interventionists. The second most common setting was preschool classrooms (33%), with adults reading to one child (83%) or to small groups (17%). Researchers in three studies reported the intervention as occurring in two locations (e.g., home and school). See Table 1 for participant characteristics and study settings.
Systematic Evaluation of Procedural Fidelity
To answer the second research question, results are presented related to the three types of fidelity: (a) researcher’s adherence to the DR protocol (i.e., CROWD, PEER), (b) author(s)’ report of training methods and fidelity of training procedures, and (c) reported measurement of actual use of DR strategies or implementation fidelity.
Adherence fidelity
Researchers in each of the 30 articles implemented the use of DR as some aspect of the intervention. When coded for the specific elements of the DR strategies, researchers in 29 of the articles (97%) reported some use of at least one of the CROWD or PEER elements, whereas researchers in one study (Fielding-Barnsley & Purdie, 2003) did not specifically report use of any of the strategies (see Table 2). Researchers in nine studies explicitly reported full implementation of both the CROWD and PEER strategies. Researchers in the remaining 70% of studies reported use of some, but not all, elements with much variability across studies, including asking wh-questions, asking open-ended questions, providing expansions, linking the story to the child’s life, and providing praise.
Authors of 10 studies (33%) noted using all five of the CROWD prompts. The most commonly used CROWD features were open-ended and “wh” questions, which were present in 23 of the studies. Researchers in 40% of the studies (e.g., Arnold et al., 1994; Crain-Thoreson & Dale, 1999; Lonigan et al., 2013; Lonigan & Whitehurst, 1998) reported using two phases of training, with both incorporating questioning techniques and giving feedback. Specifically, in Phase 1, adults were to ask “wh” questions, follow a correct response with another question, repeat what the child says, assist and praise the child, and follow the child’s interest. Phase 2 added the strategies of asking open-ended questions and expanding the child’s comments (e.g., Hargrave & Sénéchal, 2000). Within the 19 studies where CROWD was not fully implemented, authors varied in their selection of included prompts. Completion prompts were reported in two studies (Brannon & Dauksas, 2012; Hargrave & Sénéchal, 2000), recall in three studies (Brannon & Dauksas, 2012; Desmarais et al., 2013; Lever & Sénéchal, 2011), and distancing prompts in one study (Brannon & Dauksas, 2012). Researchers in three studies (Chow & McBride-Chang, 2003; Fielding-Barnsley & Purdie, 2003; Huennekens & Xu, 2010) did not specifically report use of any CROWD strategy.
The full use of PEER was more commonly reported than CROWD prompts across the 30 articles, with researchers in 21 studies (70%) reporting inclusion of the evaluation, expansion, and repetition components. Within the remaining nine studies, authors in four did not report a focus on evaluation, expansion, or repetition (Desmarais et al., 2013; Fielding-Barnsley & Purdie, 2003; Huebner, 2000; Niklas, Cohrssen, & Tayler, 2016). Authors of the other five studies mentioned at least one feature, with one study reporting use of evaluation (Hargrave & Sénéchal, 2000), four reporting expansion (Brannon & Dauksas, 2012; Huennekens & Xu, 2010; Lever & Sénéchal, 2011; Reese, Leyva, Sparks, & Grolnick, 2010), and two reporting repetition (Hargrave & Sénéchal, 2000; Lever & Sénéchal, 2011).
Training method
The method by which researchers trained primary interventionists to implement DR techniques is essential for future replication and application. Researchers in 26 (87%) of the articles reported the use of training procedures, although these varied widely (see Table 2). Individuals conducting training included university students (13%); teachers, paraprofessionals, and SLPs (10%); and lead researchers (33%). Researchers in seven of the studies did not report the use of trainers, either because the training was completed solely online or because the lead interventionists were the researchers themselves, so no training was necessary. The format of training sessions ranged from in-person sessions with video components to sessions composed entirely of video training. The majority of these sessions were conducted in groups, whereas 33% were conducted individually, either due to interventionists’ schedules or preference of the researcher. A total of 73% of the training sessions included didactic training and modeling of DR strategies during the training. The frequency of training sessions ranged from 1 to 15 sessions, with the length of individual sessions ranging from 15 min to 1.5 hr.
During training sessions, 53% of researchers described practicing strategies through role-playing with the researcher or other interventionists. Feedback was provided by the trainers to enhance practices. A variety of materials were provided to the interventionists, including handouts with DR strategies and logbooks to record readings completed. Researchers in one study asked the interventionists to assess the acceptability of the DR strategies using the Intervention Rating Profile (IRP; Blom-Hoffman, O’Neil-Pirozzi, Volpe, Cutting, & Bissinger, 2007). Researchers in three studies required interventionists to critique vignettes in the training video according to DR strategies and indicate what the reader should have done differently (Lonigan et al., 1999; Lonigan & Whitehurst, 1998; Whitehurst et al., 1994). See Table 2 for specific training practices.
All studies were coded as to whether support was offered to the interventionists following the training sessions (see Table 2). Researchers in nearly half (43%) of the studies did not report any type of follow-up training or support. In four studies, this was due to researcher implementation; however, the remainder of the studies used parents, teachers, or SLPs as interventionists. Included within the types of support were regular phone call checkups throughout the intervention to answer questions (e.g., Chow & McBride-Chang, 2003; Crain-Thoreson & Dale, 1999; Reese et al., 2010), scripts, suggested questions, or hints provided in books (e.g., Brannon & Dauksas, 2012; Desmarais et al., 2013; Zevenbergen et al., 2003), observations with feedback (e.g., Fleury & Schwartz, 2017; Lonigan et al., 2013; Tsybina & Eriks-Brophy, 2010), observations without feedback (Hargrave & Sénéchal, 2000; Lever & Sénéchal, 2011), and handouts (e.g., Arnold et al., 1994; Dale et al., 1996; Strouse, O’Doherty, & Troseth, 2013). Beschorner and Hutchison (2016) offered a discussion board for their online intervention group to post questions. Researchers also observed some intervention sessions to ensure fidelity of implementation (e.g., Hargrave & Sénéchal, 2000; Lever & Sénéchal, 2011).
Fidelity of training procedures
Across the 30 studies analyzed, none measured the effectiveness of the initial training procedures by providing specific fidelity data during training sessions, and therefore no researchers reported on the IOA of those procedures. In three studies, this was due to researcher implementation. However, within the remaining 27 studies, researchers in one did not report any training (Desmarais et al., 2013), researchers in eight studies reported a limited description of the training that was provided (e.g., Strouse et al., 2013; Zevenbergen et al., 2003), and researchers in 18 studies provided a detailed description of the training that may be replicable (e.g., Arnold et al., 1994; Crain-Thoreson & Dale, 1999). All authors reporting training listed at least some components in the narrative, in a table or in the appendix (see Table 2).
Implementation fidelity
Researchers in fewer than one-third of the studies (n = 9) included a report of implementation fidelity. Each of these studies was among those that aimed to use some, if not all, elements of both the CROWD and PEER strategies. Authors in five of these studies reported both specific data for implementation fidelity and IOA for these data (Blom-Hoffman et al., 2007; Fleury et al., 2014; Fleury & Schwartz, 2017; Lonigan et al., 1999; Rahn et al., 2016). Researchers in the remaining four articles reported data for implementation fidelity; however, no IOA was given. Therefore, if fidelity data were not collected or if the author(s) provided no data on the fidelity of intervention, no conclusions can be made as to whether the DR strategies that were described were actually implemented.
Assessment of fidelity of implementation
The fidelity assessment methods varied considerably, including unobtrusive observations by the researchers to ensure strategy use (Hargrave & Sénéchal, 2000; Lever & Sénéchal, 2011; Tsybina & Eriks-Brophy, 2010), interactive observations during which researchers offered feedback (Fleury & Schwartz, 2017), and use of reading or video viewing logs to specifically track whether the DR strategies were implemented (Hargrave & Sénéchal, 2000; Lever & Sénéchal, 2011). Although other researchers reported use of some of the same strategies, it was not for the purpose of gathering fidelity information (e.g., Huebner, 2000; Sim, Berthelsen, Walker, Nicholson, & Fielding-Barnsley, 2014; Towson & Gallagher, 2014). Researchers in six studies were explicit in describing how strategies were measured (Blom-Hoffman et al., 2007; Fleury et al., 2014; Fleury & Schwartz, 2017; Lonigan et al., 1999; Rahn et al., 2016; Tsybina & Eriks-Brophy, 2010), often using video or audio recording of reading sessions to capture data for coding and IOA. Examples of measurement included requiring that at least two prompts of each type (i.e., CROWD) be used during each book reading (Fleury et al., 2014), implementing seven different prompt types per book reading (Blom-Hoffman et al., 2007), and ensuring that at least three prompts per target word were implemented each session (Tsybina & Eriks-Brophy, 2010).
Study Design and Rigor
Research Question 3 was addressed by analyzing all 30 studies by standards set forth by panels of experts for both group design (Gersten et al., 2005) and SCRD (Horner et al., 2005; Kratochwill et al., 2010, 2013). Researchers in the majority (87%) of the studies included in this review used group design, with four of the 30 studies utilizing SCRD.
Group study rigor
Nine quality standards derived from Gersten and colleagues (2005) and replicated from Barton and Fettig (2013) were used to evaluate group design study quality (see Table 3). Each standard was coded using a dichotomous scale and included the nine variables listed in Table 3. No study met all nine design standards. However, three studies met eight of the nine standards evaluated (Lever & Sénéchal, 2011; Lonigan et al., 1999; Lonigan & Whitehurst, 1998). Fielding-Barnsley and Purdie (2003) met the fewest standards (three). A total of 17 (65%) of the research studies met six or more of the design standards. While 15 authors (58%) reported reliability data on outcome measures, only two (Lonigan & Whitehurst, 1998; Sim et al., 2014) reported validity data. Researchers in all 26 studies reported attrition rates of less than 30%, with the highest at 29.5% (Strouse et al., 2013). Authors of 23 of the studies reported adequate descriptions of research conditions and used multiple outcome measures for child participants. Effect sizes for child outcomes were reported in 15 (58%) of the group design studies examined. See Table 3 for additional information by study.
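As an illustration of the arithmetic behind these figures, the sketch below tallies dichotomous (met/not met) codes for a single study and recomputes the reported percentages from the counts given in the text. The helper names and the example code vector are hypothetical; the actual codes for each study appear in Table 3.

```python
def standards_met(codes):
    """Count how many dichotomously coded standards (True/False) were met."""
    return sum(bool(c) for c in codes)


def percent_of(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# Hypothetical code vector for one study across nine standards.
example_study = [True, True, False, True, True, True, False, True, True]
print(standards_met(example_study))  # 7

# Counts taken from the text, with 26 group design studies as the denominator.
print(percent_of(17, 26))  # 65 (studies meeting six or more standards)
print(percent_of(15, 26))  # 58 (studies reporting reliability data)
print(percent_of(26, 30))  # 87 (share of all 30 studies using group design)
```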
SCRD study rigor
Of the four studies, three utilized multiple baseline design while one used an alternating treatment design (Rahn et al., 2016). One of the studies had stable baseline (Fleury & Schwartz, 2017). Three studies had more than 10% of overlapping data across conditions for all participants and variables, and one study (Huennekens & Xu, 2010) had overlapping data on one of the two variables gathered. Two studies documented immediate effect for one of the two variables gathered, whereas two studies did not demonstrate immediate effect on any of the variables. Only one study (Fleury & Schwartz, 2017) reported one of its variables demonstrating consistent change within and across conditions. When examining functional relation between the independent variable and the dependent variables in each of the four studies, only one study (Fleury & Schwartz, 2017) demonstrated functional relation with one of the dependent variables reported.
The authors also coded each SCRD study using the design standards set forth by WWC (Kratochwill et al., 2010, 2013) as listed in Table 4. Authors in all four articles reported manipulation of the independent variable, repeated measurement of the dependent variable, and reporting of IOA for greater than 20% of data at greater than 80% agreement. Only two of the four studies (Fleury et al., 2014; Rahn et al., 2016) attempted at least three demonstrations of a treatment effect. Researchers in one study (Fleury et al., 2014) collected at least 3 data points per phase, with the remaining three studies reporting a minimum of 5 data points per phase. Only one study was classified as meeting WWC standards (Rahn et al., 2016), with one study (Fleury et al., 2014) classified as meeting with reservations and two studies classified as not meeting standards (see Table 4).
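The IOA criterion can be illustrated with a short sketch. It assumes point-by-point percent agreement (agreements divided by total coded intervals, multiplied by 100), which is one common way to compute IOA; the reviewed studies may have used other formulas. The function names, the thresholds expressed as parameters, the treatment of the boundary values as inclusive, and the sample codes are all assumptions made for illustration.

```python
def percent_agreement(coder_a, coder_b):
    """Point-by-point agreement between two coders, as a percentage."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same number of intervals.")
    if not coder_a:
        raise ValueError("At least one coded interval is required.")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * agreements / len(coder_a)


def ioa_adequate(double_coded, total_sessions, agreement_pct,
                 min_share=0.20, min_agreement=80.0):
    """Check both parts of the criterion: share of data double coded and
    level of agreement. Boundaries are treated as inclusive here (an assumption)."""
    share = double_coded / total_sessions
    return share >= min_share and agreement_pct >= min_agreement


# Hypothetical interval codes from two observers (1 = prompt observed).
observer_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
observer_2 = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]

agreement = percent_agreement(observer_1, observer_2)
print(agreement)                       # 90.0
print(ioa_adequate(3, 12, agreement))  # True  (25% of sessions double coded)
print(ioa_adequate(2, 12, agreement))  # False (about 17% double coded)
```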
Table 4. Study Rigor: Single-Case Research Design.
Note. SCRD = single-case research design; DV = dependent variable; IOA = interobserver agreement; MB = multiple baseline; N = evidence not present; Y = evidence present; M w/R = meets with reservations; AT = alternating treatment.
Magnitude difference between two comparison variables.
Study Outcomes of DR
To answer the final research question and describe what types of outcomes have been examined in DR research, results are presented for both the interventionists (e.g., parents, teachers) and for the children with whom adults have used the DR strategies.
Interventionist outcomes
Across the seven articles that considered interventionists’ use of DR as the dependent variable (Beschorner & Hutchison, 2016; Blom-Hoffman et al., 2007; Crain-Thoreson & Dale, 1999; Dale et al., 1996; Fleury & Schwartz, 2017; Hargrave & Sénéchal, 2000; Strouse et al., 2013), authors of only one did not report a significant increase in wh-questions (Beschorner & Hutchison, 2016). Other increases were seen in open-ended questions (Blom-Hoffman et al., 2007; Crain-Thoreson & Dale, 1999; Dale et al., 1996; Strouse et al., 2013), recall questions (Strouse et al., 2013), distancing questions (Strouse et al., 2013), completion prompts (Strouse et al., 2013), expansions (Crain-Thoreson & Dale, 1999; Dale et al., 1996), repetitions (Hargrave & Sénéchal, 2000), and evaluations of children’s responses (Blom-Hoffman et al., 2007; Crain-Thoreson & Dale, 1999; Hargrave & Sénéchal, 2000). Researchers in one article (Beschorner & Hutchison, 2016) reported a general increase in the frequency of the behaviors, with no statistically significant difference between online and face-to-face training for interventionists.
Child outcomes
Most researchers examined language and literacy skills for their child outcomes. Researchers in 53% of studies tested language skills, 10% examined literacy skills, and 27% assessed both. Researchers in one study tested fluid reasoning in their child participants (Niklas et al., 2016). There was a significant range in the measures used to assess child outcomes across studies. These included use of standardized assessments in language (e.g., Peabody Picture Vocabulary Test; Expressive One-Word Picture Vocabulary Test) and informal assessments of language (e.g., mean length of utterance [MLU], total number of different words, verbal participation). Standardized assessments of emergent literacy (e.g., Concepts About Print Test; Get Ready to Read–Revised) and informal emergent literacy assessments (e.g., rhyme awareness) were also reported. Frequently, child outcomes were measured through tools created by the researchers that directly assessed the targeted outcome of intervention. Researchers in 27 studies reported increases in child language and/or emergent literacy skills based on the specific measure used.
Discussion
Our purpose in conducting this review was to provide a systematic evaluation of the evidence base for DR, with a particular emphasis on the fidelity of the intervention practices. Specifically, this study evaluated the adherence to the intended protocol of DR (i.e., use of CROWD and PEER), the reported fidelity of training of the DR strategies, and the actual implementation of these strategies by interventionists across settings and research designs. Our findings suggest that DR is typically implemented in home and school settings by caregivers and educators. Researchers in more than 70% of studies focused on children described as typically developing or at risk for later deficits. Only eight studies included children with identified disabilities, leading us to conclude that relatively little is known about the extent to which children with disabilities, specifically those with more significant impairments, can benefit from DR. In addition, no prior review explored the characteristics of the interventionists or child participants. Beyond gender, age, and limited information on SES as discovered in this study, it is difficult to determine “for whom” DR is effective and “who” is able to effectively implement the strategies. Researchers in the majority of the DR studies used group design to examine the language and literacy outcomes in young children. Although not all studies fully implemented all aspects of CROWD and PEER, adult and child outcomes were favorable.
In regard to researchers adhering to the DR protocol, it was evident that the types of prompts used (CROWD) and the completion of the PEER sequence varied by interventionists, with only nine studies reporting focus on all aspects of DR. Whereas researchers in 29 of the 30 articles reported using at least one aspect of CROWD or PEER, one did not report use of any of the features, yet described its intervention as DR. Adherence to the PEER portion of DR was more common, suggesting researchers may prioritize scaffolding the responses of children over the variety of prompts they provide. It should be noted that determining the presence or absence of CROWD was often complicated by the fact that recall and distancing questions might have been included in what was described as wh-questions, as all three prompt types might employ a what, where, when, or why question. This could also be true for extraction of PEER, unless each strategy was explicitly labeled by the author(s). In addition, the most commonly used question prompts were open-ended and wh-questions. This could be attributed to the fact that in early DR literature (Lonigan & Whitehurst, 1998) training was described as occurring in two phases. In Phase 1, adults were to ask “wh” questions, follow a correct response with another question, repeat what the child says, assist and praise the child, and follow the child’s interest. Phase 2 added the strategies of asking open-ended questions and expanding the child’s comments with no mention of the other types of prompts (e.g., completion, recall, distancing) or the “R” in PEER. Regardless of how specific question strategies are categorized or named, it appears that the “active ingredients” in DR may be related to prompting the child and completing the entire PEER sequence to ensure that children’s opportunity for learning is maximized.
To determine what supports are needed to successfully implement the strategies of DR, the extent to which researchers report on training practices, the number of trainings, provision of ongoing feedback, and fidelity of these practices must be examined. Across the 30 studies analyzed, results were widely varied, making it difficult to determine what types of training are most practical and effective for interventionists to implement DR with fidelity. Although authors in the majority of the studies stated that initial training was provided to the interventionists, none reported data on fidelity of training sessions, thus limiting our ability to conclude whether sufficient trainings were provided to adequately prepare interventionists to deliver DR. A common theme among training practices was the use of didactic strategies implemented through role-playing and discussion opportunities between the trainers and trainees. It may be that practice of the CROWD and PEER strategies is an important factor when determining whether the adult reader can adequately implement these strategies later with a child or group of children. Similarly, because follow-up was not utilized in 43% of studies, it is difficult to determine whether the interventionists would improve use of DR strategies with additional training throughout the intervention period. The type, duration, and frequency of training procedures, as well as the use of follow-up, should be explicitly explored in future studies.
While the WWC (2007, 2010) has reported DR as showing positive outcomes in oral language skills for children with and without disabilities, attributing these benefits directly to DR strategies is challenging without adequate evidence that the intervention was implemented as designed. It is important to note that the majority of studies were conducted in the home with parents as interventionists; thus, the report of implementation fidelity is limited due to the difficulty of tracking these intervention elements in home environments. Moreover, while researchers in nine studies described keeping record of the implementation fidelity, only five reported specific data with IOA, and these data varied in what strategies of DR were tracked and how data were collected. For example, Blom-Hoffman and colleagues (2007) reported coding three videotaped observations of reading sessions, collecting data on eight prompt categories, whereas Tsybina and Eriks-Brophy (2010) completed weekly observation of reading sessions to ensure that a minimum of three prompts per targeted word were used in each session. Other attempts at tracking the DR approach were limited, such as the use of reading logs or phone call follow-ups, which capture how often the intervention was attempted rather than the actual use of strategies. Therefore, this review cannot adequately capture the actual use of DR strategies, but only the intended use as reported by the authors. It should be noted that of the nine studies reporting implementation fidelity, one third were SCRD. Even though these studies represented only 13% of all studies reviewed, future research in this area may be best addressed through SCRD studies, as this design lends itself well to tracking specific components of interventions.
Many aspects that were in question related to procedural fidelity in this study could be linked to study rigor. Of the 30 studies reviewed, only one met design standards according to WWC (Kratochwill et al., 2010, 2013) for SCRD and none met group design standards (Gersten et al., 2005). This could be attributed to 14 of the studies being published prior to 2010. Notably, the only study meeting standards used SCRD, which by design often has more control and tracking of implementation fidelity. Nevertheless, future researchers should consider the importance of procedural fidelity measurement when designing their studies.
Similarly, outcome measures for adult behavior related to DR strategies were tracked in only eight studies and there was variability in those outcomes. Across these studies, it appears that interventionists were most able to implement the ‘wh’ question prompt. However, with the limited information on training and implementation fidelity, it is difficult to draw conclusions as to why this may be so. Child outcome data most often related to some aspects of language and literacy, which DR is most known for impacting (Mol et al., 2009); however, the measurement of these outcomes was also variable. Thus, it is challenging to draw strong relations between specific components of DR and child outcomes given the variability of measures and dependent variable constructs and the lack of adherence and implementation fidelity data available to date.
Implications
These findings present several implications for future research and practice. Foremost is to expand the current study of DR with strict adherence to both the CROWD and PEER sequence. This can be accomplished through more rigorous studies that examine and report the training and implementation fidelity of using DR in different contexts. Most studies that were included in the WWC reports and the meta-analysis by Mol et al. (2009) were conducted in home settings and all were group design. Additional studies in the classroom context will allow better understanding of how DR can be used in classrooms to support young children’s skill development. Second, although DR has been deemed to be easy to implement due to its formulaic procedure (Teale, 2003), because of limited implementation fidelity information it is unclear whether or not the intervention is implemented fully, regardless of context and interventionist. Within future studies, additional description of the interventionists and participants is necessary to determine “who” can successfully implement DR, what training is needed for that implementation, and “for whom” DR works.
The methodological rigor of studies included in this review should be considered when evaluating the implications of their results. Only one of the 30 studies met design rigor standards and just nine reported that implementation fidelity related to the intervention was tracked. Holding researchers accountable only to tracking fidelity of implementation may be insufficient. Among the authors reporting implementation fidelity information, there was little consistency across studies in how this information was collected, evaluated, and reported, leaving practitioners with limited information on which features are essential for the successful implementation of DR. It is also unclear from the current literature which features of CROWD and PEER are successfully and consistently implemented when not delivered by researchers. Future studies should provide replicable information on training and implementation procedures and their fidelity to determine the quantity and quality of training and support needed as well as key aspects of DR that may contribute to child outcomes. This information will also provide the field an understanding of features of DR that can be successfully completed by different stakeholders (e.g., parents, teachers, SLPs).
It is critical to consider the elements that studies claim to implement in their intervention when stating focus on an evidence-based practice such as DR. If a study fails to practice the strategies necessary to qualify as DR yet calls its intervention DR, the literature, and ultimately professional practice, is compromised. Finally, future studies can attempt to correlate specific DR strategies with specific child language, literacy, and other outcomes. While DR shows promise in promoting young children’s language and literacy skills, this shared book reading procedure also presents opportunities to embed social emotional learning and other academic skills such as math and science.
Limitations
This review has several limitations. First, shared reading is a broad category that encompasses a vast variety of strategies that would not lend itself well to systematic review. We limited inclusion to studies that used the specific term “dialogic reading.” Studies that used intervention procedures that align with key components of DR but named the intervention with other terms (e.g., shared reading, read aloud) would not be captured in this review. For example, the seminal article in DR (Whitehurst et al., 1988) was not included because, at that time in the development of the approach, the term DR had not been conceptualized, nor were CROWD and PEER fully developed. Second, the procedural fidelity of DR findings in this review is only coded based on what was reported in the published studies. The types of prompts and the DR sequence may or may not have been explicitly stated as CROWD and PEER in the narrative of each study, making it difficult to decipher specific use of each strategy. Finally, it is possible that relevant studies may have been omitted from this review as the original identification of studies was completed by the first author, with no reliability completed on the search and initial screening.
Conclusion
DR is an evidence-based practice for children who are typically developing and for those in at-risk populations (WWC, 2007) and has promising evidence for children with disabilities (WWC, 2010). However, translating an evidence-based practice such as DR from research to general practice requires evaluating the actual adherence to the DR strategies; the training type, dosage, and frequency required for adults to implement the strategies with fidelity to ensure children receive the positive outcomes outlined within the research base; and the tracking and reporting of implementation fidelity to inform future studies and actual practice. The procedural fidelity of existing DR studies must be examined systematically to determine which training practices are necessary for parents, teachers, paraprofessionals, SLPs, and others. Unfortunately, this focus on fidelity of training procedures was not found in this review of literature. Fewer than one third of the studies reported some type of implementation fidelity, and these data were inconsistently tracked and reported. Future studies of DR require careful examination of adherence to the protocol, training procedures, and fidelity of the intervention strategies employed to translate this evidence-based strategy from research to the hands of stakeholders.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
