Abstract
There is empirical evidence to suggest that oral language and vocabulary on entering kindergarten are the best predictors of later reading success. Identifying skills that are predictive of later achievement using psychometrically sound measurement methods is a necessary component of early intervention efforts. Currently, there are limited methods for measuring early vocabulary acquisition. The Dynamic Indicators of Vocabulary Skills (DIVS) were designed to measure the vocabulary of preschool and kindergarten students. The purpose of this article is to contribute to the psychometric evidence supporting the use of the DIVS as effective measures of early vocabulary acquisition. This article presents an array of validity evidence for the DIVS, including predictive validity and the construct validity evaluated by both convergent validity and discriminant validity estimates.
Evidence suggests that oral language and vocabulary on entering kindergarten are the best predictors of later reading success (Braze, Tabor, Shankweiler, & Menel, 2007; National Institute of Child Health and Human Development [NICHD] Early Child Care Research Network, 2005; Rupley, 2005). While effective early reading instruction should focus on word identification strategies necessary for proficient reading, vocabulary knowledge is equally necessary for comprehension. Research indicates that early literacy instruction must incorporate targeted vocabulary instruction so as to ensure later reading achievement (Biemiller, 2006; Brabham & Lynch-Brown, 2002; Neuman & Dwyer, 2009; NICHD, 2000).
The relationship between vocabulary knowledge, reading acquisition, and comprehension is incredibly complex (RAND Reading Study Group, 2002). Anderson and Freebody (1985) proposed three explanations for the high correlations found between tests of comprehension and vocabulary. They suggested an instrumental theory, where vocabulary knowledge causes understanding. They also suggested that vocabulary and comprehension are interrelated constructs within one general verbal ability construct. Lastly, they suggested that the development of general background knowledge begets better vocabulary and understanding.
The instrumental hypothesis may also be applied to the role of vocabulary in learning to read. Phonemic awareness is foundational for learning the phonetic code of written English. Its development is supported through oral language activities in which children are asked to recognize and manipulate sounds in spoken words. When the words used for these activities are unknown, the task places an added burden on working memory and the phonological awareness instruction is diluted.
Early vocabulary knowledge can also affect word decoding. When a known word is first decoded, knowledge of the word may reinforce the decoding strategies used to correctly decipher it. Word familiarity may also facilitate faster acquisition of word-level automaticity and subsequent fluency. Finally, when the meaning of the decoded word is known, the reader is able to comprehend the text. These examples imply that robust vocabulary prior to early reading instruction may facilitate the process of learning to read.
Unfortunately, children who enter the process of reading acquisition with language deficits often lag behind their language competent peers. In their seminal study, Hart and Risley (1995) found that children with larger vocabularies at age 3 acquired new words at nearly two times the rate of their peers with smaller vocabularies. Longitudinal studies also demonstrate that students who are poor readers at the end of their 1st year of reading instruction continue to have difficulty throughout elementary school (Francis, Shaywitz, Stuebing, Shaywitz, & Fletcher, 1996; Juel, 1988). Students who do not attain critical early reading skills have less opportunity to practice reading, leaving them to fall further behind their peers (Torgesen, Rashotte, & Alexander, 2001). However, targeted early literacy instruction is effective in preventing later reading failure (Foorman, Francis, Fletcher, Schatschneider, & Mehta, 1998; Torgesen, 2004; Vellutino et al., 1996).
It is well accepted that phonological awareness and alphabetics should be targets of early literacy instruction, but vocabulary knowledge is equally important and often neglected. Neuman and Dwyer (2009) reviewed 10 published preschool literacy curricula to examine their adherence to evidence-based vocabulary instructional practices. They found that all the programs specified words to be learned but only three provided strategies for teaching, practicing, and reviewing the words. They also found that only three curricula provided strategies for assessing vocabulary. These curricula relied on behavioral observations for assessment but did not provide a means for measuring the developing vocabulary knowledge. Despite specific recommendations regarding the importance of systematic vocabulary instruction and assessment for instructional decision making (NICHD, 2000), effective early literacy instructional components are often lacking in practice.
Implicit in these recommendations is the availability of methods for measuring essential early literacy skills. Currently, there are valid and reliable methods to screen and monitor the progress of young children in the areas of alphabetics and phonemic awareness. Information gathered through curriculum-based measurement (CBM) tools can be used to identify students at risk of later reading failure based on these constructs of early reading mechanics (Deno, Marston, Shinn, & Tindal, 1983; Kaminski & Good, 1996). Vocabulary, however, is not measured using these methods. The Dynamic Indicators of Vocabulary Skills (DIVS; Parker, 2000) were designed to fill this void in the body of CBM measurement tools and function as screening and progress-monitoring measures. They are presently available for measuring the vocabulary of preschool and kindergarten-aged students so as to identify those at risk of later reading failure as soon as they enter formal schooling.
Developed based on the characteristics of CBM (Deno, 1985), the DIVS differ from existing measures of vocabulary. CBM tests adhere to standardization, reliability, and validity standards of published norm-referenced tests, but their formats vary for their unique purposes. CBM tools are designed to measure response to classroom curriculum and must reflect curricula. The DIVS were developed by choosing common nouns found in kindergarten and first-grade basal readers. Word selection is an important topic in the development of vocabulary measurement (Pearson, Hiebert, & Kamil, 2007). The most popular recommendation is to select words that are used to name common concepts in sophisticated terms (Beck, McKeown, & Kucan, 2002). However, because the DIVS were designed to assess early vocabulary knowledge, the words chosen for them were the common labels for foundational concepts.
CBM evolved from a behavioral measurement model. Rather than more standard question-answer testing formats, CBM relies on a fluency metric where a score is derived as correct answers produced within a limited time. The DIVS consist of two subtests—picture naming fluency (PNF) and reverse definition fluency (RDF). PNF is a fluency task where students are presented with 44 picture tiles and asked to name as many pictures as they can in 1 min. RDF is administered by providing students with a simple definition for which the students are directed to say what word is described. The testing tasks differ from the existing measures of vocabulary, such as the Peabody Picture Vocabulary Test (3rd ed.; PPVT-III; Lloyd M. Dunn & Dunn, 1997), which measure receptive vocabulary by asking students to point to a picture that represents the word they are told. The brief 1-min DIVS tasks primarily reflect expressive language of young children. This testing method reflects children’s agility with vocabulary production—a more difficult cognitive task than receptive word knowledge (Pearson et al., 2007).
Another important feature of the behavioral measurement model is the use of the data for time-series analysis where idiographic growth can be assessed. The format of the DIVS facilitates both the efficient screening of all children and frequent measurement for assessing within-child progress. As students’ vocabulary knowledge increases, their correct response rates also increase, which reflects developing agility with oral vocabulary.
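The idea of idiographic growth described above can be illustrated with a minimal sketch. It assumes that within-child progress is summarized as an ordinary least-squares slope of fluency scores over time, which is a common convention for CBM data but is not a procedure reported in this article. The function name `growth_slope` and the sample scores are hypothetical.

```python
def growth_slope(weeks, scores):
    """Ordinary least-squares slope: change in score per week for one child."""
    n = len(weeks)
    if n != len(scores) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_w = sum(weeks) / n
    mean_s = sum(scores) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((w - mean_w) * (s - mean_s) for w, s in zip(weeks, scores))
    sxx = sum((w - mean_w) ** 2 for w in weeks)
    if sxx == 0:
        raise ValueError("all observations share the same time point")
    return sxy / sxx


# Hypothetical progress-monitoring data for one child (not from the study):
# pictures named correctly in 1 min at weeks 0, 4, 8, and 12.
weeks = [0, 4, 8, 12]
pictures_named_correctly = [10, 12, 15, 17]
slope = growth_slope(weeks, pictures_named_correctly)
print(f"Estimated growth: {slope:.2f} pictures per week")
```

A positive slope indicates that the child's correct response rate is rising across measurement occasions; a flat or negative slope would flag a child whose vocabulary fluency is not growing.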
As with any test, the scores elicited from the measure must be both reliable and valid. This article will contribute to the psychometric evidence supporting the use of the DIVS as effective measures of early vocabulary acquisition. This study was designed to test two hypotheses regarding construct validity and two hypotheses related to predictive validity. The construct validity was evaluated by constructing a multitrait–multimethod matrix to examine the ways that tests measuring the similar vocabulary construct converged and less similar phonemic awareness and phonics constructs diverged (Campbell & Fiske, 1959). It was hypothesized that concurrent student vocabulary performance as measured by the DIVS and the PPVT-III would converge while concurrent performance on these vocabulary measures would diverge from early phonological awareness and phonics skills as measured by the Phonological Awareness Literacy Screening (PALS; Invernizzi, Meier, Swank, & Juel, 1997). It was also hypothesized that initial vocabulary performance on entering preschool as measured by the DIVS would predict later performance on the DIVS and later performance on the PPVT-III similar to the predictive power of the PPVT. Lastly, because vocabulary is an important indicator of later literacy performance, and phonological awareness and early phonics knowledge are essential skills for learning to read, it was also hypothesized that performance on the DIVS would predict later performance in phonological awareness and phonics subtests of the PALS.
Method
Participants and Settings
The city from which the data for this study were drawn has a population of approximately 95,000 people, of whom 74.5% are Caucasian, 16.7% are of Hispanic or Latin origin, and 6.4% are Black. The median household income for the years 2006 to 2010 was US$36,172, and 22.7% of the population fell below the poverty line (U.S. Census Bureau, 2012). Through Early Reading First grant funding, full-day, full-year preschool education was provided to this community. The goals of the teachers and staff in this school were to improve children’s oral language, phonological awareness, print awareness, and alphabet knowledge within the context of rich language and literacy activities. To support these early literacy goals, the preschool used the Open Court Reading Pre-K as their primary curriculum. In addition, storybook read-alouds and other language enrichment activities were used to supplement the early literacy curriculum, and parent–school communication initiatives were put in place to foster language development at home. The preschool staff also collected early literacy screening data to monitor student progress and evaluate program outcomes. These data were used for purposes of this present study.
Data were collected from 328 4-year-old students enrolled in this urban Massachusetts public preschool. The sample consisted of approximately 45% female (n = 147) and 55% male (n = 181) students of whom nearly 25% (n = 83) received Special Education services. Of the participants, 72% (n = 236) participated in the free or reduced lunch program, and nearly 34% (n = 113) were English language learners (ELL).
Measures
DIVS
The DIVS are screening measures designed to assess levels of receptive and productive vocabulary acquisition in preschoolers and kindergarteners. When used to identify children with weak vocabularies on entering school, the DIVS can provide educators information to help them differentiate their instruction, provide intensive oral language instruction, and prevent later reading failure. The DIVS comprise two vocabulary measures, picture naming fluency and reverse definition fluency. Each measure is administered individually for a 1-min time period during which students orally respond to the prompts and their response rate is measured.
Picture naming fluency
During PNF test administration, an examinee is presented with a probe consisting of 44 color pictures. Pictures are presented on the page in an array of 11 rows with four pictures per row. The pictures represent common nouns found in popular children’s literature. Each picture included on the test was found in kindergarten and first-grade basal readers at least five times (Parker, 2006). To administer PNF, the examiner adheres to standardized directions that instruct the student to begin at the top and name the pictures as they are presented across the page. The score elicited is a raw score of pictures named correctly (PNC) in 1 min.
PNF can be administered for the purpose of benchmark screening in the fall, winter, and spring to students in both preschool and kindergarten. Preliminary psychometric evidence supporting the use of DIVS is provided in the technical report for the test (Parker, 2006). The alternate-form reliability for PNF was found to be .84 for preschool students and .73 for kindergarten students. The concurrent validity of the PNF with the PPVT-III (Lloyd M. Dunn & Dunn, 1997) was .75, with the Preschool Language Scale—Auditory Comprehension was .64, and with the Preschool Language Scale—Expressive Communication was .67. The concurrent validity of the PNF with the DIVS RDF was reportedly .77.
Reverse definition fluency
To administer RDF, an examiner provides a brief description of a word, and the student is asked to respond with the word that was described. The test contains 30 formal definitions of words commonly found in children’s literature, as were those selected for PNF. Examinees are guided through two practice examples. Then the examiner provides the first word definition and begins a stopwatch. Once the student completes the response, the stopwatch is paused, so that only the latency of response is timed. The timer is started again after the next definition is provided. This method assesses the students’ fluency of response while eliminating administrator variation in reading the item descriptions. The examiner stops presenting items once 1 min has elapsed on the stopwatch. The score elicited is a raw score of words named correctly (WNC).
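The RDF timing rule can be sketched in a few lines of code. This is an illustration under stated assumptions, not the published scoring procedure: the function name `score_rdf` and the input format are hypothetical, and the sketch assumes that a response completed after the 1-min limit is not counted, a detail the description above does not specify.

```python
def score_rdf(responses, time_limit=60.0):
    """Return words named correctly (WNC) for one RDF probe.

    responses: list of (latency_seconds, is_correct) tuples in presentation
    order, where latency is the time from the end of the definition to the
    end of the child's response. Time spent reading items is not included.
    """
    elapsed = 0.0
    correct = 0
    for latency, is_correct in responses:
        if elapsed >= time_limit:
            break  # the stopwatch has reached the limit; no more items are presented
        elapsed += latency
        if elapsed > time_limit:
            break  # assumption: a response finished after the limit is not scored
        if is_correct:
            correct += 1
    return correct


# Hypothetical response record (not from the study).
record = [(20.0, True), (25.0, False), (10.0, True), (10.0, True)]
print(score_rdf(record))
```

Because only response latency accumulates on the stopwatch, two examiners who read the definitions at different speeds would still arrive at the same score for the same child.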
RDF is typically administered for benchmarking purposes in the winter and spring of preschool and in the fall, winter, and spring of kindergarten. The alternate-form reliability for RDF was reported as .86 for preschoolers and .76 for kindergarteners in the technical report provided by the test author (Parker, 2006). The concurrent validity for RDF with DIVS PNF was reportedly .77. The concurrent validity for RDF with the PPVT-III was .83, with the Preschool Language Scale—Auditory Comprehension was .70, and with the Preschool Language Scale—Expressive Communication was .73 (Parker, 2000, 2006).
PPVT-III (3rd ed.)
The PPVT-III (Lloyd M. Dunn & Dunn, 1997) is a norm-referenced test designed to measure receptive vocabulary for persons aged 2.5 to 90+ years. The PPVT-III can be used as a screening test of verbal ability. The test features two parallel forms, which allows for measuring vocabulary development between two time points. Each form of the test consists of 204 items. The test items comprise four illustrations from which the examinee must select the one picture which best represents the word the examiner orally presents. Internal reliability was calculated using the alpha coefficient. The reliability for 4-year-olds was .95 (Lloyd M. Dunn & Dunn, 1997). The criterion validity of the PPVT-III demonstrated moderate correlations (.63-.83) with the Oral and Written Language Scales (OWLS) for children aged 3 to 5.
PALS
PALS (Invernizzi et al., 1997) is a criterion-referenced test designed to measure skills indicative of later reading success. The PALS-PreK form is specifically designed to measure early literacy skills acquired during the preschool years and consists of eight subtests including letter sounds, beginning sound awareness, rhyme awareness, uppercase alphabet knowledge, lowercase alphabet knowledge, name writing, nursery rhyme awareness, and print and word awareness. Of these subtests, the first six were used for the purposes of this study.
The alphabet knowledge subtest was used to measure early phonics achievement. This subtest includes uppercase and lowercase letter identification. If a child has the ability to accurately identify 16 uppercase letters, he or she is given the opportunity to move to the lowercase alphabet recognition task. If the child is able to name nine or more lowercase letters correctly, he or she moves on to the letter sounds subtest.
The beginning sound awareness and rhyme awareness subtests were used to measure the phonological awareness of children in this study. Beginning sound awareness is an oral assessment that involves having a child match pictures based on their initial sound. The Rhyme awareness subtest requires children to match pictures based on rhyme.
The PALS-PreK has been examined for two forms of reliability: internal consistency and interrater reliability. Interrater reliability was determined by having two individuals score tasks independently, and the correlation coefficient for all subtests used in this study was .99. The internal consistency of the beginning sound and rhyme subtests was determined using split-half reliability and Cronbach’s alpha. Split-half estimates for the beginning sound and rhyme subtests were .94 and .87, respectively. Cronbach’s alpha values for the beginning sound and rhyme subtests were .93 and .84, respectively. Criterion-related validity was determined by assessing concurrent validity with two similar measures: the Test of Awareness of Language Segments (TALS) and the Test of Early Reading Ability (TERA-3). The TALS, which measures a child’s ability to segment and identify individual phonemes, was found to have a medium–low but significant correlation (r = .41) with the PALS-PreK. The TERA-3, a measurement of mastery of early reading skills, was found to have a medium–high and significant correlation (r = .67) with the PALS-PreK.
Procedures
The data used in this study were aggregated over two consecutive academic years, using the same tests, standardized testing procedures, and testing time points each year. The data were gathered as part of the schoolwide benchmarking procedures that were part of the standard practices in the preschool. Researchers from a private company were contracted by the school district to administer and score the tests, enter and store all the data, and ensure the reliability of both the test results and data entered into the database. All test administrators received extensive training in the administration of each test. All tests used in this study were individually administered to students, and each student participated in a separate testing session for each test administered at each time point.
Data were gathered each year at the same three testing time points. Fall testing occurred mid-September through mid-October. Winter testing occurred mid-January through mid-February, and spring testing occurred in May. During the fall test sessions, students were administered the DIVS PNF, the PPVT-III, and the PALS across three separate test occasions. Students took three forms of the PNF; each probe lasted approximately 1 min, for a total of 3 min of testing time. The PPVT-III and the PALS are not timed, and each typically took approximately 20-25 min for the 4-year-olds to complete.
The winter test sessions included the two DIVS measures: the PNF and the RDF. During these test sessions, students were given three probes for each test, with an approximate testing time of 1 min per probe. Total testing time for each child during the winter session was approximately 6 min.
The spring test sessions were also all conducted within a 4-week period, and included the three vocabulary measures—PNF, RDF and PPVT-III—and the PALS. The two DIVS measures were administered on the same day, and the PPVT-III and PALS were administered during two separate testing sessions.
Analyses
This study was designed to test hypotheses regarding the construct validity and predictive validity of the DIVS. To examine these relationships, multiple correlation matrices were computed and analyzed for the magnitude of the correlations between measures. First, concurrent validity was examined for both the fall and spring data sets by computing separate correlation matrices for the data collected at those two time points. Second, the correlations for data gathered across the three testing points—fall, winter, and spring—were computed to analyze the predictive validity of students’ performance on the fall vocabulary measures to all of the spring literacy testing results.
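The correlation matrices described above can be sketched as follows. The code computes Pearson's r for every pair of measures; the helper names `pearson_r` and `correlation_matrix` and the sample scores are hypothetical, and the sketch assumes complete data for every child, whereas the handling of missing scores in the study is not described.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a measure with no variance has no defined correlation")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(scores):
    """scores: dict mapping measure name -> list of scores, one per child."""
    names = list(scores)
    return {a: {b: pearson_r(scores[a], scores[b]) for b in names} for a in names}


# Hypothetical scores for five children (not data from the study).
scores = {
    "fall_PNF": [8, 12, 15, 20, 11],
    "fall_PPVT": [35, 48, 52, 66, 40],
    "spring_PNF": [14, 17, 22, 27, 15],
}
matrix = correlation_matrix(scores)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Correlations between measures given at the same time point correspond to the concurrent validity analyses, and correlations between a fall measure and a later measure correspond to the predictive validity analyses.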
Results
Descriptive Statistics
Table 1 reports the descriptive statistics for the measures used in this study. Based on the skewness and kurtosis statistics, the distributions for most of the measures were not severely skewed or heavy or light tailed. However, a few of the measures appeared to be non-normally distributed, which may suppress the observed correlations in this study. The fall uppercase letter recognition measure (PALS UR) and beginning sound recognition (PALS BS) were slightly light tailed. For spring, three of the measures were negatively skewed (PALS UR, PALS LR, and PALS BS), primarily due to a ceiling effect.
Table 1. Descriptive Statistics.
Note. DIVS = Dynamic Indicators of Vocabulary Skills; PPVT = Peabody Picture Vocabulary Test; PALS = Phonological Awareness Literacy Screening; UR = uppercase letter recognition; LR = lowercase letter recognition; LS = letter sound; BS = beginning sounds; RA = rhyme awareness; PNF = picture naming fluency; RDF= reverse definition fluency.
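The skewness and kurtosis statistics referred to above can be computed as in the following sketch. It uses the simple moment-based definitions; statistical packages often report bias-adjusted versions, and the article does not say which variant was used. The function names and the sample scores are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment of a list of scores (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** k for v in values) / n


def skewness(values):
    """Moment-based skewness; negative values indicate a longer left tail."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("scores have no variance")
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3; negative values indicate light tails."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("scores have no variance")
    return central_moment(values, 4) / m2 ** 2 - 3.0


# Hypothetical letter recognition scores with a ceiling at 26 (not study data):
# most children score at or near the maximum, so the left tail is long.
ceiling_scores = [26, 26, 26, 25, 24, 20, 12]
print(round(skewness(ceiling_scores), 2), round(excess_kurtosis(ceiling_scores), 2))
```

A ceiling effect of this kind compresses the upper end of the distribution, which is one way the range restriction mentioned above can suppress observed correlations.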
Reliability
Using Pearson’s r, the stability of students’ responses was examined across the multiple test occasions for each of the tests of vocabulary acquisition used in this study. The correlations found for PNF across the three test sessions ranged from .80 to .87. The stability observed between the RDF winter scores and spring scores was .86. The fall to spring stability estimate for the PPVT-III was .81. These data indicate that the observed reliability between testing occasions is similar for the DIVS and the PPVT-III.
Construct Validity
Similar patterns of convergence and divergence were found in both the fall and spring data sets. It was hypothesized that performance on the DIVS would be highly correlated with performance on the PPVT-III. For data gathered in both fall and spring, a strong relationship was found between performance on the PNF and PPVT-III, where r = .76 and .70, respectively (see Tables 2 and 3). A strong relationship was also found for PNF and RDF in the spring data set (r = .75).
Table 2. Correlations for Concurrent Validity Across Fall Early Literacy Measures.
Note. PNF = picture naming fluency; PPVT = Peabody Picture Vocabulary Test; UR = uppercase letter recognition; LR = lowercase letter recognition; LS = letter sound; BS = beginning sounds; RA = rhyme awareness.
Table 3. Correlations for Concurrent Validity Across Spring Early Literacy Measures.
Note. PNF = picture naming fluency; RDF = reverse definition fluency; PPVT = Peabody Picture Vocabulary Test; UR = uppercase letter recognition; LR = lowercase letter recognition; LS = letter sound; BS = beginning sounds; RA = rhyme awareness.
It was also hypothesized that concurrent correlations between the DIVS and the subtests of the PALS would diverge, indicating that the DIVS is a measure of vocabulary and not of other early literacy skills, such as phonics and phonological awareness. A negligible to moderate relationship was found between fall PNF performance and fall performance on the phonics subtests of the PALS, with correlations of .18 for lowercase letter recognition, .30 for letter sound awareness, and .54 for uppercase letter recognition. More moderate relationships were found between performance on the spring vocabulary measures and the spring phonics measures. A weaker relationship was found between spring vocabulary and spring uppercase letter recognition than was observed in the fall (r ranged from .38 to .41 in the spring, compared with .51 and .54 in the fall). In contrast, stronger relationships were found between the spring vocabulary measures and spring lowercase recognition than were observed in the fall (correlations ranged from .27 to .31 in the spring, compared with the negligible relationships of .13-.18 observed in the fall). The small correlations between the vocabulary measures and the letter sound awareness tasks of the PALS did not change from fall to spring.
Moderate relationships were observed between initial PNF performance and initial phonological awareness skills with an observed correlation of .60 for PNF and beginning sounds and .41 for PNF and rhyme awareness. Similar, albeit weaker, concurrent correlations were found in the spring data for performance on the vocabulary measures and beginning sound awareness (with correlations ranging from .56 to .60 in the fall and .46 to .51 in the spring). Interestingly, a stronger relationship was found between vocabulary and rhyme awareness in the spring (correlation, .48-.55) than was observed in the fall (correlation, .41-.43).
It is important to note that the correlations found for PNF and RDF varied across vocabulary, phonics, and phonemic awareness in a pattern similar to that observed with the PPVT-III, providing additional evidence of construct validity for these new vocabulary measures.
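The convergent and discriminant pattern can be summarized with a short sketch that groups the fall PNF correlations reported in this section by the construct each comparison measure targets. The grouping labels and the function `mean_by_construct` are illustrative assumptions, and averaging raw correlations is a simplification used only for display; it is not an analysis reported in this article.

```python
# Fall concurrent correlations with PNF, as reported in the text above.
fall_pnf_correlations = {
    "PPVT": 0.76,
    "PALS UR": 0.54,
    "PALS LR": 0.18,
    "PALS LS": 0.30,
    "PALS BS": 0.60,
    "PALS RA": 0.41,
}

# Assumed mapping from each measure to the construct it targets.
construct_of = {
    "PPVT": "vocabulary",
    "PALS UR": "phonics",
    "PALS LR": "phonics",
    "PALS LS": "phonics",
    "PALS BS": "phonological awareness",
    "PALS RA": "phonological awareness",
}


def mean_by_construct(correlations, construct_of):
    """Average the correlations within each construct group."""
    groups = {}
    for measure, r in correlations.items():
        groups.setdefault(construct_of[measure], []).append(r)
    return {construct: sum(rs) / len(rs) for construct, rs in groups.items()}


for construct, mean_r in mean_by_construct(fall_pnf_correlations, construct_of).items():
    print(f"{construct}: mean r = {mean_r:.2f}")
```

Under the multitrait–multimethod logic, the same-construct (vocabulary) correlation should be the largest, which is the ordering these reported values show.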
Predictive Validity
It was also hypothesized that fall vocabulary performance would strongly predict later performance on the DIVS and later performance on the PPVT-III. The relationship between the fall measures of vocabulary and the winter measures ranged from .74 to .80. The relationship between fall and spring was .69 and .81. Not surprisingly, the strongest relationships were observed between the same tests given over time, which also indicates strong stability of each test across testing occasions. These correlations reflect a strong predictive relationship between initial vocabulary and later vocabulary performance.
It was also hypothesized that early vocabulary skills as measured by the DIVS would predict later performance in early phonics skills. Small correlations were found between the fall vocabulary performance and spring phonics performance with correlations ranging from .20 to .36 across all vocabulary and phonics tests. These results suggest that initial vocabulary performance in preschool may not effectively predict later preschool phonics performance as measured by the PALS.
Moderate relationships were observed between fall vocabulary performance and spring phonological awareness performance with correlations ranging between .45 and .55 across all vocabulary and phonological awareness subtests. As predicted, initial vocabulary may in fact be a moderate predictor of student performance on later phonological awareness tasks.
As was seen for the patterns in concurrent validity, similar patterns were observed in the predictive relationships of the DIVS PNF scores and the PPVT-III scores on both phonics and phonological awareness performance, providing more evidence for the construct validity of the PNF (see Table 4).
Table 4. Predictive Validity of DIVS for Later Early Literacy Achievement.
Note. DIVS = Dynamic Indicators of Vocabulary Skills; PNF = picture naming fluency; PPVT = Peabody Picture Vocabulary Test; RDF = reverse definition fluency; UR = uppercase letter recognition; LR = lowercase letter recognition; LS = letter sound; BS = beginning sounds; RA = rhyme awareness.
Discussion
The data from this study provided evidence supporting our hypotheses, suggesting that the DIVS converges with other measures of vocabulary and diverges from measures that do not directly assess vocabulary. We also found that fall performance on the DIVS predicted later vocabulary performance as measured by both the DIVS and PPVT-III.
We also hypothesized that early vocabulary skills would predict early phonics skills across the school year. Small correlations were found for the relationship between initial vocabulary as measured by both PNF and the PPVT-III and spring performance on uppercase letter recognition. These findings may suggest a weak relationship between measures of early vocabulary and the acquisition of phonics skills across the preschool academic year. These results might be a function of the developmental nature of vocabulary and phonics skills. Environmental factors that affect exposure to language development and letter symbols may play a role in this relationship, where uppercase letters are more commonly presented to young children than lowercase letters. However, the testing procedures of the PALS may also play a role in that the phonics subtests are presented in a multiple-gated fashion where discontinuation rules may confound the results found in this study. The weaker observed correlations may also be due to the ceiling effects that were observed in the spring PALS data.
Finally, because vocabulary is an important indicator of later literacy performance and phonological awareness is an essential skill for learning to read, it was hypothesized that early vocabulary skills would predict later performance in phonological awareness. Moderate relationships were observed between fall vocabulary performance on both the DIVS and PPVT-III and the spring phonological awareness tasks of the PALS with correlations ranging between .48 and .55. As predicted, initial vocabulary may in fact be a moderate predictor of student performance on later phonological awareness tasks.
Additional Evidence Supporting the Use of the DIVS
This study provided additional psychometric evidence supporting the use of novel testing tools for measuring early vocabulary acquisition. A technical report provided by the test author reports the alternate-form reliability for PNF as .84 for preschool-age and .73 for kindergarten-age students. This study identified the stability in test performance across the school year with test–retest estimates for PNF ranging from .80 to .87, and .86 for RDF winter scores compared with spring scores. Similar stability estimates were found for the PPVT-III (r = .81). This study demonstrated that the reliability of the DIVS is comparable with the reliability observed for the PPVT-III.
Previous reports of the concurrent validity of the DIVS measures with the PPVT-III ranged from .75 to .77 (Parker, 2006), which is comparable with the correlations found in the present study (r = .70-.76). In addition to the construct validity evidence found in these concurrent validity analyses, the results of this study provide further validity evidence based on the similar trends in the concurrent and predictive validity analyses for the DIVS and PPVT-III. The concurrent correlations of PNF and RDF varied in a pattern similar to that of the PPVT-III across vocabulary, phonics, and phonemic awareness. Similar patterns were also apparent in the predictive relationships of the DIVS PNF scores and the PPVT-III scores for both phonics and phonological awareness. In addition, the test–retest estimates for RDF and PPVT-III were nearly identical. The psychometric comparability between these measures is important for validating the use of the DIVS for screening and progress-monitoring purposes, for which the PPVT was not designed.
Comparing the DIVS With Existing Screening Measures
According to the National Center on Response to Intervention (2011), similar screening tests used to measure early literacy skills, such as AIMSweb, Dynamic Indicators of Basic Early Literacy Skills (DIBELS), easyCBM, and the Standardized Testing and Reporting (STAR) Reading Enterprise assessments, have reliability and validity coefficients consistent with those of the DIVS. Test–retest coefficients found in this study for the DIVS ranged from .80 to .87. The median coefficient for the test–retest reliability reported for AIMSweb’s CBM reading is .95. The median coefficient for the test–retest reliability for AIMSweb’s Test of Early Literacy letter naming fluency (LNF) for kindergarten is .81. The median coefficient reported for the test–retest reliability of the STAR Reading Enterprise assessments for students in Grades 1 through 5 is .83.
The predictive validity coefficients found between the fall DIVS scores and the spring DIVS and PPVT-III scores ranged from .72 to .74. By comparison, the predictive validity coefficient reported by the National Center on Response to Intervention for easyCBM vocabulary from fall to spring for third grade was .46.
The predictive validity of the DIVS for other early literacy skills was also evaluated in this study. Coefficients between the fall DIVS and the spring PALS phonological awareness measures ranged from .45 to .55. This is similar to predictive validity estimates reported by the National Center on Response to Intervention, which reports the median coefficient of predictive validity for DIBELS LNF and phonemic segmentation fluency to be .46 for kindergarten. The median predictive validity coefficient between STAR Reading and the Stanford Achievement Test (9th ed.; SAT-9) for Grades 2 to 6 is .68. The predictive validity for AIMSweb’s LNF from the fall of first grade to the spring of first grade has a median coefficient of .76.
Limitations and Future Research
The primary limitation of this study is the sample from which these data were drawn. The children in this study came from a poor urban center, and the distribution of student characteristics in the sample is not representative of the student population in the United States. Seventy-two percent of the children in this study received free or reduced-price lunch benefits, which is nearly double the rate seen across the state of Massachusetts. The biased sample raises questions about the generalizability of the results of this study. However, it is important to analyze data such as these, because screening measures that adequately measure student performance on essential early academic skills and predict later academic achievement are invaluable to educators who work with students from at-risk communities. This study provides evidence for the use of the DIVS in a high-risk student population. It is important to replicate this study with a more representative sample of preschool children and examine whether similar results are found.
The present study is an initial investigation of the validity of the DIVS for assessing early vocabulary knowledge. Future research is necessary to evaluate how well DIVS performance predicts later measures of reading achievement. Subsequent studies are needed to evaluate how the DIVS may reflect the developmental nature of vocabulary, phonological awareness, and phonics skills by examining the relationship of the DIVS with phonics knowledge in kindergarten and first grade and the relationship of preschool DIVS performance with later reading achievement in first and second grades.
Implications
By screening young students for their acquisition of skills that predict later reading achievement, educators can implement preventative instructional practices differentially based on students’ needs. The DIVS show promise as effective screening measures for early vocabulary acquisition. The analyses presented in this study address psychometric characteristics that are important to any test used in measuring student achievement, and the DIVS were found to be reliable and valid measures of the vocabulary construct. The predictive nature of early oral language skills for later reading achievement needs further investigation, and measurement technology such as the DIVS may allow for such longitudinal analyses.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
