Abstract
This report documents the reliability and validity of scores on the Preschool-Wide Evaluation Tool (PreSET), an assessment used to measure program-wide implementation of the universal level of positive behavior interventions and support (PBIS) in early childhood settings. Initial analyses of descriptive statistics, item, subscale, and total correlations and tests of internal consistency reveal that the PreSET meets or exceeds basic psychometric criteria for measurement tools used in research. The PreSET had strong subscale intercorrelations, high interobserver agreement, positive correlations to conceptually related subscales on the Teaching Pyramid Observation Tool, and sensitivity to implementation change. Overall, the PreSET demonstrates strong promise as a tool that generates reliable and valid scores to use in the assessment of program-wide implementation of universal PBIS in early learning environments.
This report presents initial documentation of the psychometric properties of the Preschool-Wide Evaluation Tool (PreSET; Steed, Pomerleau, & Horner, 2012). The PreSET is an assessment of program-wide implementation of universal positive behavior interventions and support (PW-PBIS) in early childhood settings. PW-PBIS refers to the adaptation of school-wide positive behavior interventions and supports (SW-PBIS) for early childhood programs (Frey, Park, Browne-Ferrigno, & Korfhage, 2010). Like SW-PBIS, PW-PBIS involves tiered prevention and intervention strategies to improve children’s social-emotional development and decrease challenging behavior (Fox & Hemmeter, 2009). The PW-PBIS framework includes specific models such as the Teaching Pyramid Model (Hemmeter, Ostrosky, & Fox, 2006) and utilizes language and recommendations that take into account the unique organizational structure of early childhood settings, the emphasis on developmentally appropriate practice and family-centered approaches, and the developmental needs of very young children (Frey, Faith, Elliot, & Royer, 2006).
The first level of the PW-PBIS framework involves universal support for all children, focusing on building positive relationships among children, teachers, and families (Fox & Hemmeter, 2009). A positive social culture is encouraged in the program by establishing and teaching program-wide rules and expectations, using specific verbal praise, and providing children with predictable routines and expected transitions throughout their day (Stormont, Lewis, & Beckner, 2005). At the secondary level of PW-PBIS, targeted social skills strategies are implemented with a small group of children who are at risk of social-emotional difficulties (Joseph & Strain, 2003). Finally, at the tertiary level of PW-PBIS, function-based interventions are provided for children who demonstrate severe or chronic challenging behavior (Fox, Carta, Strain, Dunlap, & Hemmeter, 2010). Initial research indicates that PW-PBIS may be successfully applied to early childhood settings through professional development to improve preschool teachers’ use of universal strategies (e.g., Benedict, Horner, & Squires, 2007; Carter & Van Norman, 2010; Stormont, Covington-Smith, & Lewis, 2007). Researchers have also demonstrated that early childhood educators can implement individualized and function-based PW-PBIS interventions to effectively reduce young children’s challenging behavior (e.g., Blair, Fox, & Lentini, 2010; Wood, Ferro, Umbreit, & Liaupsin, 2011).
In addition to a tiered hierarchy of support, a PBIS framework involves other systemic features that are cited as important for adoption and sustainability of PBIS over time. These features include administrator support in the form of time, training, and resources for teachers to implement recommended strategies and data-based decision making (Sugai & Horner, 2009). Data-based decision making involves the use of reliable and valid information to make programmatic and individual decisions (Newton, Horner, Algozzine, Todd, & Algozzine, 2009). The use of data derived from assessment and evaluation efforts allows professionals to evaluate how a program or school is currently implementing features of PBIS, plan priorities for improvement, and evaluate progress toward goals on a regular basis (Newton et al., 2009). Within the SW-PBIS framework for older students, there are several instruments that track school and student variables that would be expected to change following SW-PBIS implementation (Sugai & Horner, 2009). Some of these include the School-Wide Evaluation Tool (SET; Sugai, Lewis-Palmer, Todd, & Horner, 2001) that is completed by an outside evaluator to measure fidelity of implementation of SW-PBIS; the Team Implementation Checklist (TIC; Sugai, Horner, & Lewis-Palmer, 2001), a self-assessment completed by the SW-PBIS leadership team; and the Implementation Phases Inventory (IPI; Bradshaw, Barrett, & Bloom, 2004), a measure of SW-PBIS fidelity at the universal level. Of these instruments, the SET appears to have the most documentation regarding the reliability and validity of its scores (Horner et al., 2004; Vincent, Spaulding, & Tobin, 2010). 
Within the SW-PBIS framework, there are also data-collection tools for tracking students’ problem behavior (e.g., Office Discipline Referrals) and data entry and graphing systems such as the School-Wide Information System (SWIS; May et al., 2010) for assisting SW-PBIS leadership teams in their data-based decision making at the student level.
There are a few data-based decision making tools that have been developed and piloted to assess the effectiveness of PW-PBIS implementation efforts in early childhood settings. These tools include the Inventory of Practices for Promoting Social Competence (PPSEC; Center on the Social and Emotional Foundations for Early Learning [CSEFEL], 2006), the Early Childhood Program-Wide Positive Behavior Support (PBS) Benchmarks of Quality (Fox, Hemmeter, & Jack, 2006), the Teaching Pyramid Observation Tool (TPOT; Fox, Hemmeter, & Snyder, 2008), and the PreSET. These tools were developed to address the need for measures of PW-PBIS that fit the unique context of early childhood settings and the developmental period of early childhood. The PPSEC is a self-assessment that might be used by teachers to indicate features of PW-PBIS currently in place and areas for future growth (CSEFEL, 2006). The Early Childhood Program-Wide PBS Benchmarks of Quality is also a self-assessment measure that is used by leadership teams to assess their fidelity of PW-PBIS. This instrument provides information about PW-PBIS. However, there are no data regarding the reliability and validity of the tool at this time. The TPOT and PreSET are tools completed by an outside evaluator using a combination of staff interviews and classroom observations. The TPOT measures all levels of PW-PBIS from universal to tertiary interventions at the classroom level. There is an initial report that the TPOT has strong psychometric properties (Hemmeter, Snyder, & Fox, 2010).
Although the PPSEC, Early Childhood Program-Wide PBS Benchmarks of Quality, and TPOT meet some needs for measurement of key practices related to PW-PBIS in early childhood settings, there are three needs they do not collectively address. First, the PPSEC and TPOT do not collect or report data on program-wide features associated with PBIS adoption and sustainability. These features include such things as teacher planning time, the number of opportunities for teachers’ professional development, and data collection for programmatic and student decision making, features cited in the literature as important for long-term PBIS implementation (Sugai & Horner, 2009). Second, the PPSEC and Early Childhood Program-Wide PBS Benchmarks of Quality are self-assessments that do not include an objective outsider’s evaluation of PW-PBIS implementation. The third issue is that the PPSEC, Early Childhood Program-Wide PBS Benchmarks of Quality, and TPOT utilize a divergent scoring system from the SET. This is problematic for the majority of educational programs that already use the SET to evaluate PBIS implementation across K-12 schools. As the PreSET and SET utilize the same scoring system, administrators, district, and state-level personnel may use the same rubric to monitor the effectiveness of PBIS implementation across all educational environments.
A graduate student and professors in the Educational Community Supports (ECS) research unit at the University of Oregon developed the PreSET in 2006 to meet the need for a data-based tool designed specifically for early childhood settings that measures classroom and program-wide features of PW-PBIS. The PreSET is an adaptation of the SET using the same 3-point Likert-type scoring system (0 = not implemented, 1 = partially implemented, 2 = fully implemented) and methods for calculating subscale (percentage of possible points) and total scores (mean of the subscale scores). The PreSET maintains some of the subscale names used in the SET (e.g., Behavioral Expectations Taught) while adding new subscales (e.g., Family Involvement) that are conceptually relevant for early childhood settings. Modifications to the names of PreSET subscales and the wording of items were informed by research on adapting PBIS for early childhood environments (e.g., Stormont et al., 2005), the language and philosophy of developmentally appropriate practice and family-centered approaches, and user feedback from 5 years of pilot testing in six states and two countries.
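The scoring rule described above (each subscale scored as a percentage of possible points, and the total score taken as the mean of the subscale scores) can be sketched in a few lines of Python. This is a minimal illustration, not the published scoring procedure: the function names are hypothetical, the item scores are invented, and only three of the eight subscales are shown, with made-up item counts.

```python
# Sketch of SET-style scoring as described in the text.
# Function names, item counts, and item scores are hypothetical.

def subscale_percent(item_scores):
    """Percentage of possible points for one subscale (items scored 0, 1, or 2)."""
    if not item_scores:
        raise ValueError("a subscale needs at least one item")
    if any(score not in (0, 1, 2) for score in item_scores):
        raise ValueError("item scores must be 0, 1, or 2")
    return 100.0 * sum(item_scores) / (2 * len(item_scores))


def total_score(subscales):
    """Mean of the subscale percentages (not a pooled percentage of all items)."""
    percents = [subscale_percent(items) for items in subscales.values()]
    return sum(percents) / len(percents)


# Invented scores for one classroom on three subscales.
classroom = {
    "Expectations Defined": [2, 2, 1, 1],  # 6 of 8 points
    "Family Involvement": [1, 1, 0, 0],    # 2 of 8 points
    "Management": [1, 1],                  # 2 of 4 points
}

for name, items in classroom.items():
    print(name, subscale_percent(items))
print("Total", total_score(classroom))
```

Because the total is the mean of the subscale percentages, each subscale carries equal weight regardless of how many items it contains.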
The purpose of the present report is to provide results of analyses of the psychometric properties of the PreSET. We will present the results of item analyses, including descriptive statistics and internal consistency of item, subscale, and total scores. In addition, interobserver reliability, the PreSET’s correlation with another instrument, and its sensitivity to change will be provided. These data will provide important initial documentation of the reliability and validity of PreSET scores.
Method
Instrumentation
The PreSET includes 30 items that are organized into eight face-valid subscales. These subscales correspond to eight features of PW-PBIS at the universal tier of intervention. The items and subscales were largely adopted from the items and subscales used in the SET. However, the wording of items and subscales was modified in some instances and two subscales and their related items were added to reflect differences in how PW-PBIS is implemented in early childhood settings. The literature on how PW-PBIS should be applied in early educational contexts (e.g., CSEFEL, 2003; Hemmeter et al., 2006; Stormont et al., 2005) was utilized during the adaptation of SET items and subscales to create the PreSET. The PreSET measures universal PW-PBIS strategies that include developing program-wide expectations, teaching expectations, acknowledging appropriate behavior, responding consistently to problem behavior, and involving families (Fox & Hemmeter, 2009). The PreSET also measures program-wide supports that are associated with long-term use of PW-PBIS, including collecting and monitoring data on child and program variables, establishing a leadership team, and providing administrator support for teachers (Frey, Lingo, & Nelson, 2008). A summary of the features of PW-PBIS covered in the eight PreSET subscales is provided in Table 1.
Features of Program-Wide Positive Behavior Interventions and Support, and the Eight Subscales of the PreSET.
Note. PreSET = Preschool-Wide Evaluation Tool; PBIS = positive behavior interventions and support.
The PreSET uses interviews with the program administrator, teachers, and children as well as observations of materials and teacher–child interactions in each classroom in a program to measure universal-tier and program-wide features of PW-PBIS. The PreSET takes approximately 1 hr to administer in a program with one participating classroom and approximately 20 min for each additional classroom. Users of the PreSET should have training in areas such as early childhood education, PBIS, and program evaluation and should complete reliability training with a trained PreSET user, reaching at least 80% interobserver reliability prior to using the PreSET independently. An outside evaluator completes the PreSET ideally twice a year, once in the fall and once in the spring of a school year.
The PreSET should be administered across an entire program to provide optimal information regarding implementation of PW-PBIS and universal tier supports. However, the PreSET may be used without adaptations in a single classroom within an early childhood program or in a single early childhood classroom that is part of a school that includes older students (e.g., in a preschool classroom that is located in an elementary school).
Data Collection
Trained observers collected PreSET data from 138 early childhood classrooms. PreSET training included review of the PreSET manual, review of PreSET forms, and practice reliability trainings in early childhood classrooms. PreSET administration involved approximately 30 min to 1 hr of data collection in each classroom and included classroom observations of teacher behavior (e.g., use of praise) and transitions; reviewing permanent products such as visual schedules and program handbooks; and interviewing the lead teacher, assistant teachers when applicable, a sample of children, and the early childhood program administrator about program policies regarding behavior management, training, and family involvement.
Of the 138 early childhood classrooms participating, 66 were Head Start classrooms, 26 were private child care classrooms, 16 were state-funded preschool classrooms, 15 were special education classrooms, 11 were public/nonprofit early childhood programs, and four were preschool classrooms in an elementary school. The 138 classrooms came from 119 programs. In the majority of cases (108), a program had 1 classroom participate in the PreSET. Eleven participating programs had more than 1 classroom (2–7) participate in the PreSET. PreSET data were collected in Oregon (n = 101), Georgia (n = 31), and Nevada (n = 6). All data reflect preimplementation PreSET data, with the exception of post-PreSET data that are reported for 29 classrooms that participated in pre- and postdata collection to assess the instrument’s sensitivity to change. These postimplementation data were not utilized in any other psychometric analyses.
Psychometric Analyses
The following statistical analyses were conducted to determine the psychometric adequacy of the PreSET: (a) calculations of means, ranges, and standard deviations of item, subscale, and total scores; (b) item analyses, including item-subscale, item-total, and subscale-total correlations, and assessment of internal consistency; (c) intercorrelations of PreSET subscales; (d) interobserver reliability analyses with calculations of percent agreement, kappa, and intraclass correlations (ICC); (e) correlations of the PreSET with another widely used tool; and (f) sensitivity-to-change analyses. The results of these analyses are presented beginning with descriptive statistics for item, subscale, and total PreSET scores. Then, item-subscale, item-total, and subscale-total correlations are described, followed by a discussion of PreSET subscale intercorrelations and sampling adequacy. Finally, initial findings related to the interobserver reliability, construct validity, and sensitivity to change of the PreSET are presented.
Results
PreSET Descriptive Statistics
Table 2 presents the descriptive statistics for all PreSET item, subscale, and total scores for 138 classrooms. Item scores are reported as the average score across classrooms on the 3-point Likert-type scale. Subscale scores are reported as the average percent score, and the total score is the average percent score across the eight subscales. Overall, the central tendencies and variability of PreSET item, subscale, and total scores were adequate for the instrument to be sensitive to differences in implementation. Across classrooms, the highest PreSET scores were observed on items in the Organized and Predictable Environment subscale, the subscale with the lowest variability. On average, classrooms scored lowest on items within the Management, Family Involvement, and Monitoring and Decision-Making subscales. They also scored low on Item B3, which asked children to state the classroom expectations.
PreSET Item, Subscale, and Total Score Means, Standard Deviations, and Minimum and Maximum Scores (N = 138).
Note. PreSET = Preschool-Wide Evaluation Tool.
Item Analysis and Internal Consistency
Items and subscales on the PreSET were evaluated using statistics available from reliability analyses in SPSS 18.0. Statistics from reliability analyses primarily assess interrelatedness or homogeneity of items on a scale, and the item-subscale correlation assesses the strength of each item’s association with its subscale. Item-subscale, item-total, and subscale-total correlations (Pearson product–moment correlation coefficients), Cronbach’s alpha coefficients, and standard error of measurement (SEM) for item, subscale, and total scores are summarized in Table 3. Results demonstrated reasonably strong item-subscale correlations with a mean of .56 and a median of .58. Item-subscale correlations were strongest in the Management subscale. The lowest item-subscale correlations occurred for Items C1 regarding teachers’ acknowledgment of children’s appropriate behavior, C5 pertaining to the use of precorrection, and F3 regarding the inclusion of families in the development of classroom rules. These PBIS features were often not in place in an early childhood classroom (score of 0) even though other elements in the subscale were implemented. However, the low item-subscale correlation values might be acceptable given the criterion-referenced nature of the instrument. Item-total correlations were also reasonably strong with a mean of .49 and a median of .52. Subscale-total correlations were strong with a mean of .59, median of .58, and a range from .39 to .71, indicating that the subscales are highly correlated with one another, and thus with the total scores of the PreSET.
PreSET Item-Subscale, Item-Total, Subscale-Total Correlations, Cronbach’s Alpha Coefficients, and Standard Error of Measurement (SEM) for Item, Subscale, and Total Scores (N = 138).
Note. PreSET = Preschool-Wide Evaluation Tool.
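The item-subscale correlations described in this section are Pearson correlations between each item's scores and the subscale score across classrooms. The sketch below uses invented data for a hypothetical three-item subscale. The report does not say whether each item was removed from the subscale total before correlating (a "corrected" correlation), so both variants are shown as an assumption. A zero-variance item returns NaN, since the correlation is undefined in that case.

```python
import math

# Sketch only: data and function names are hypothetical.

def pearson_r(x, y):
    """Pearson product-moment correlation; NaN if either variable has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def item_subscale_correlations(rows, corrected=False):
    """Correlate each item with the subscale total across classrooms.

    rows: one list of item scores (0, 1, or 2) per classroom.
    corrected=True correlates each item with the sum of the *other* items.
    """
    n_items = len(rows[0])
    results = []
    for j in range(n_items):
        item = [r[j] for r in rows]
        if corrected:
            rest = [sum(r) - r[j] for r in rows]
        else:
            rest = [sum(r) for r in rows]
        results.append(pearson_r(item, rest))
    return results


# Five invented classrooms scored on a three-item subscale.
scores = [
    [2, 2, 1],
    [1, 1, 0],
    [0, 1, 0],
    [2, 1, 2],
    [1, 0, 1],
]

print([round(r, 2) for r in item_subscale_correlations(scores)])
print([round(r, 2) for r in item_subscale_correlations(scores, corrected=True)])
```

Uncorrected correlations are inflated for short subscales because each item contributes to its own total, which matters here given that many PreSET subscales have only three to five items.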
The overall alpha was .91, demonstrating reasonably strong internal consistency. In general, coefficient alpha of .70 is considered to be acceptable for basic research on individual differences (i.e., differences in classrooms; Nunnally, 1975). Three of the subscales have alpha less than .70 (subscales C, D, and F). The SEM was highest for these subscales as well (1.78, 1.47, and 1.21, respectively). This means that these subscales may be less consistent with the total score and may need to be modified in future iterations of the tool.
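For readers unfamiliar with these statistics, the following sketch computes Cronbach's alpha and a standard error of measurement from a classroom-by-item score matrix. It assumes the conventional formulas (alpha from the item variances and the variance of the summed score; SEM as the standard deviation of the summed score times the square root of one minus alpha). The report does not state the exact computation used, and the data and function names here are hypothetical.

```python
import math
import statistics

# Sketch only: conventional textbook formulas, invented data.

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per classroom)."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [statistics.variance([r[j] for r in rows]) for j in range(k)]
    total_var = statistics.variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("alpha is undefined when summed scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def standard_error_of_measurement(rows):
    """SEM = SD of the summed score * sqrt(1 - alpha), in raw score points."""
    alpha = cronbach_alpha(rows)
    sd_total = statistics.stdev([sum(r) for r in rows])
    # Guard against tiny negative values from floating-point rounding.
    return sd_total * math.sqrt(max(0.0, 1 - alpha))


# Five invented classrooms scored on a three-item subscale.
scores = [
    [2, 2, 1],
    [1, 1, 0],
    [0, 1, 0],
    [2, 1, 2],
    [1, 0, 1],
]

alpha = cronbach_alpha(scores)
sem = standard_error_of_measurement(scores)
print(f"alpha = {alpha:.2f}, SEM = {sem:.2f}")
```

For these invented scores alpha is about .69, just under the .70 benchmark cited above, which illustrates how a short subscale can fall below the threshold even when its items are positively related.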
Intercorrelations Among the Subscales
Intercorrelations across PreSET subscales were analyzed to further assess the cohesiveness of the PreSET. This is an initial way to examine the structural validity of the instrument prior to more elaborate factor analyses. Intercorrelations among the subscales are summarized in Table 4. There were low to moderate positive correlations among the subscale scores (mean r = .42, median r = .45). The Kaiser-Meyer-Olkin Measure of Sampling Adequacy on the intercorrelations was .82, indicating meritorious cohesiveness among the subscales.
PreSET Subscale Intercorrelations (N = 138).
Note. PreSET = Preschool-Wide Evaluation Tool.
*p < .05. **p < .01.
Interobserver Agreement
Interobserver reliability was assessed on a subset of PreSET data (n = 22). The 22 classrooms included representative classrooms from each geographical region of the larger data set, including the Southeast (n = 15), Pacific Northwest (n = 5), and West (n = 2). Various types of classrooms were represented, including state-funded preschool (n = 11), private child care (n = 7), Head Start (n = 3), and special education (n = 1) classrooms. Interobserver reliability was obtained when two independent and trained observers collected PreSET data in the same classroom. Classrooms that participated in interobserver reliability were chosen based on the availability of two data collectors for some data-collection sessions. Percent agreement, kappa, and intraclass correlations (ICC) are reported in Table 5.
PreSET Interobserver Agreements and Reliabilities (N = 22).
Note. PreSET = Preschool-Wide Evaluation Tool; ICC = intraclass correlation; NA = kappa or ICC could not be computed because both within- and between-observer variances were zero.
Percent agreement was calculated for each item by dividing the number of agreements by the total number of agreements and disagreements and multiplying by 100. The average percent agreement on individual PreSET items was 95% (range = 68%–100%). Percent agreement was above 80% for all items except an item that concerned teachers’ use of a transition signal. Lower interobserver agreement on this item may be explained by divergent definitions of signals among data collectors and/or the fleeting nature of teachers’ use of transition signals, which may have been missed by one of the data collectors. Cohen’s kappa and ICC were computed via SPSS 18.0. Kappa is calculated based on the difference between observed agreement (agreement that is actually present) and expected agreement (agreement expected by chance alone). ICC is an estimate of the ratio of between-participant variance to total variance, which parallels the classical test theory definition of reliability. ICC (2, 1), a measure of absolute agreement, was used. ICC approaches 1.0 when there is no variance due to the observers. Because each measure of interobserver agreement is calculated differently, there are discrepancies among the magnitudes of agreement coefficients. For instance, Item B1 regarding teachers’ plan to teach classroom rules had 91% agreement between the two observers. However, the kappa coefficient for the same item was zero due to high unbalanced agreement (both raters rated zero for the item for 20 out of 22 classrooms). Likewise, ICC for the item was zero because both between-participant and total variance were nearly zero. The overall kappa was .80 and the ICC was .83.
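The discrepancy among agreement indices described above can be reproduced with a small sketch. The ratings below are hypothetical, chosen only to be consistent with the description of Item B1 (two observers, 22 classrooms, agreement on a score of zero in 20 of them). They are not the study data. The ICC function follows the standard two-way random-effects, absolute-agreement, single-rater formula, written here from the usual ANOVA mean squares. All function names are assumptions.

```python
from collections import Counter

# Sketch only: hypothetical ratings, textbook formulas.

def percent_agreement(a, b):
    """Agreements / (agreements + disagreements) * 100."""
    agreements = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * agreements / len(a)


def cohens_kappa(a, b):
    """(observed - expected) / (1 - expected); NaN when expected agreement is 1."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    counts_a, counts_b = Counter(a), Counter(b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n)
                   for c in set(a) | set(b))
    if expected == 1.0:
        return float("nan")  # no variance at all, so kappa is undefined
    return (observed - expected) / (1.0 - expected)


def icc_2_1(ratings):
    """ICC(2,1), absolute agreement; ratings is one [rater1, rater2, ...] row per classroom."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        return float("nan")  # no variance within or between observers
    return (ms_rows - ms_err) / denom


# Hypothetical pattern: observer 1 scores 0 everywhere; observer 2 scores 1 twice.
observer_1 = [0] * 22
observer_2 = [0] * 20 + [1] * 2

print(round(percent_agreement(observer_1, observer_2)))  # high raw agreement
print(cohens_kappa(observer_1, observer_2))              # chance-corrected
print(icc_2_1([list(p) for p in zip(observer_1, observer_2)]))
```

With this pattern, raw agreement is about 91% while kappa and ICC are both essentially zero, because nearly all of the agreement is what chance alone would produce when almost every score is zero.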
Construct Validity
The validity of the PreSET was first evaluated by correlating PreSET scores with scores from the TPOT (Fox et al., 2008), another tool that measures fidelity of implementation of PBIS in early childhood classrooms. We utilized a subset of the data (n = 31) in which an observer collected both PreSET and TPOT data in the same classroom. All data were collected in the Southeast and included 16 state-funded preschool classrooms and 15 private child care classrooms. Data collectors were trained to administer both the PreSET and TPOT through independent review of each instrument’s manuals and forms and the TPOT online training podcast. A practice scoring session followed in which each data collector was required to obtain at least 85% reliability with the primary data collector on each measure.
The TPOT includes a 2-hr observation of classroom practices across various classroom routines and an interview with the lead teacher. The TPOT includes 38 items (7 environmental indicators, 15 instructional practices indicators, and 16 “red flags”).
The PreSET and TPOT both measure the extent to which an early childhood classroom has implemented aspects of PBIS. For example, both instruments include items related to the presence of a rules poster with 3 to 5 positively stated rules or expectations. However, the instruments differ in two key ways. First, the PreSET focuses on PW-PBIS by including the program administrator in the interview process and addressing program-wide factors that support PBIS in the early childhood context. Second, the PreSET focuses on universal prevention strategies and program-wide supports. The TPOT measures universal, secondary, and tertiary interventions at the classroom level. Thus, it was expected that some PreSET and TPOT subscales would be highly correlated, whereas others would not be related. Table 6 presents the correlations of PreSET and select TPOT subscales that were expected to be convergent.
PreSET Subscale Correlations With Select TPOT Subscales (N = 31).
Note. PreSET = Preschool-Wide Evaluation Tool. PreSET subscale Management not included in correlations due to zero variance of this variable.
*p < .05. **p < .01.
Correlations were high on some PreSET and TPOT subscales that were predicted to measure similar constructs, including scores on the PreSET subscale Expectations Defined and the TPOT subscale Classroom Environment, which includes items related to the posting of classroom rules (.66, p < .01). Other strong correlations included scores on the PreSET subscale Responses to Appropriate and Challenging Behavior and the TPOT subscales Transitions (.57, p < .01), Supportive Conversations (.52, p < .01), and Responding to Problem Behavior (.58, p < .01). Additional strong correlations were between the PreSET subscale Organized and Predictable Environment and the TPOT subscales Classroom Environment (.45, p < .05), Schedules and Routines (.52, p < .01), and Transitions (.55, p < .01). Two PreSET subscales (Expectations Taught and Family Involvement) were positively correlated with the TPOT subscales that measure similar constructs (Teaching Children Behavior Expectations and Supporting Families Social Emotional Development, respectively), but the relationships (.25 and .19, respectively) were not statistically significant.
As expected, scores on PreSET and TPOT subscales that were not conceptually linked did not demonstrate significant positive correlations. An example is the low correlation of scores on the PreSET subscale of Monitoring and Decision Making with all TPOT subscales (range = −.31 to .15). There are no items or subscales on the TPOT that measure data-collection procedures such as those measured on this subscale of the PreSET.
PreSET and TPOT total scores from the 31 classrooms were moderately and positively correlated, r = .33. The PreSET and TPOT correlation scores demonstrate that the PreSET has initial convergent construct validity with another widely used tool that measures PBIS in early childhood settings.
Sensitivity to Change
To evaluate the PreSET’s sensitivity to change, preimplementation and postimplementation data were compared for 29 early childhood classrooms that were involved in training and coaching in PW-PBIS during the 2006–2007 academic year in the Pacific Northwest. Participating classrooms represented a range of early childhood settings, including half- and full-day Head Start, private and public preschool, and special education preschool classrooms. Following a fall PreSET evaluation, all participating classroom teachers attended a 2-day training in PBIS and received individualized follow-up coaching with an early interventionist or behavior consultant. Consultants used PowerPoint materials and other resources from the CSEFEL training modules for their training and consultation sessions with teachers (CSEFEL, 2003). During each classroom visit, consultants observed teachers and then met with teachers to discuss PBIS goals, current practices, and targeted skills to implement. Consultants visited with participating teachers approximately nine times for a total of 9.5 hr of consultation over the course of the academic year. A spring PreSET evaluation was conducted in each classroom. Figure 1 shows mean preimplementation and postimplementation total PreSET scores across the 29 participating classrooms.

Mean preimplementation and postimplementation scores on the PreSET for 29 early childhood classrooms.
Preimplementation scores averaged 41% with a range from 5% to 84%. Postimplementation scores averaged 76% with a range from 52% to 92%. All of the classrooms increased their PreSET scores from fall to spring. A paired t test comparing preimplementation and postimplementation mean PreSET scores indicated a statistically significant increase, t(28) = 10.49, p < .001. These results provide initial evidence that the PreSET is sensitive to implementation change.
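The paired comparison reported above can be illustrated with a short sketch of the paired t statistic, computed from the fall-to-spring difference for each classroom. The scores below are invented for four classrooms and are not the study data. The function name is hypothetical, and the sketch stops at the t statistic and degrees of freedom because converting t to a p value requires the t distribution, which the Python standard library does not provide.

```python
import math
import statistics

# Sketch only: invented percent scores, standard paired t formula.

def paired_t(pre, post):
    """Return (t, df) for a paired t test on post - pre differences."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pre/post pairs")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        raise ValueError("t is undefined when all differences are identical")
    standard_error = sd_diff / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Invented fall (pre) and spring (post) total PreSET percent scores.
fall = [40.0, 50.0, 30.0, 60.0]
spring = [70.0, 75.0, 65.0, 80.0]

t_value, df = paired_t(fall, spring)
print(f"t({df}) = {t_value:.2f}")
```

Where SciPy is available, `scipy.stats.ttest_rel(spring, fall)` returns the same statistic along with a two-sided p value.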
Discussion
Instruments are needed to measure the extent to which PBIS is implemented in early childhood classrooms and programs. These measures should be contextually appropriate for early childhood environments that are grounded in developmentally appropriate practice and family-centered approaches (Frey et al., 2008). Instruments should also be easy to implement so that various users in the early childhood field (e.g., behavior support teachers, inclusion specialists) may be trained to use them. Finally, measurement tools should yield reliable and valid scores that may be used in decision making for teachers and program administrators. Data-based decision-making tools, such as the SET, TIC, and IPI have been developed for elementary, middle, and high schools. However, tools that are adapted for early childhood classrooms such as the PreSET and TPOT are only just now emerging and have limited evidence of the reliability and validity of their generated scores (Hemmeter et al., 2010).
This brief report provides initial evidence of the psychometric properties of the PreSET, an instrument that may be used to assess implementation of PBIS in early childhood settings. Specifically, we integrated various aspects of evidence to form an overall validity judgment that incorporates domain content relevance and representativeness as well as criterion relatedness (Messick, 1989). First, the correlational structure of the PreSET met or exceeded standards for the internal consistency of research tools at .30 for item-subscale correlations and .60 for total scores (Nunnally & Bernstein, 1994). The strong magnitude of relationships for items-to-subscales and subscales-to-total scores provides initial evidence of the content validity of the PreSET. The magnitude of PreSET subscale intercorrelations was strong, especially given the low number of items in many subscales (3–5 items). Initial analyses of the cohesiveness of PreSET subscales were positive, suggesting the use of a single total score. Furthermore, the PreSET had high interobserver reliability with an average percent agreement of 95% and an overall kappa of .80. In addition, the PreSET appears to measure what it was intended to measure: implementation of universal and program-level PBIS in early childhood settings.
The PreSET was modestly and positively correlated with the TPOT, with significant positive correlations for subscales that are conceptually linked between the two instruments. This provides initial evidence of the PreSET’s convergent construct validity with another widely used instrument. The PreSET was sensitive to change following implementation of professional development in the form of training and classroom-based consultation in 29 classrooms. The various sources of validity evidence indicate that the scores on the PreSET provide evidence for differences in classroom practice (i.e., item-total correlation), content regularity (i.e., interrater reliability), subscale structure of the construct domain (i.e., intercorrelations among subscales), and utility as a program evaluation tool (i.e., sensitivity to change). Altogether, the PreSET shows promise as a tool that will yield reliable and valid scores for early childhood administrators, policy makers, and other change agents implementing PBIS in their programs, regions, and states.
A limitation of this initial analysis of the psychometric properties of the PreSET is the limited sample size relative to the number of items on the PreSET. Future research should be conducted with a larger sample of early childhood programs so that factor analyses may be conducted to further demonstrate the structural validity of the PreSET, the use of the eight face-valid subscales, and the use of the total score in score reporting. Future analyses should also include more substantial sample sizes in the assessments of interobserver reliability, construct validity, and sensitivity to change. Convenience samples were reported here. More elaborate and varied sampling that is representative of early childhood settings across the country would be ideal to demonstrate the reliability and validity of the instrument. Other future analyses should include test–retest reliability and feasibility analyses. Future research also may assess the amount and type of training required for users in the field to implement the PreSET with reliability.
Another limitation of this initial technical report is the lack of program-level data and analyses, given the current use of the PreSET as a program-wide measure of PW-PBIS. The focus on classroom-level analyses reflects the characteristics of the data set and the use of the PreSET at the time as a tool to provide feedback about PBIS implementation at the classroom level. The overwhelming majority of the data involved single classrooms within a single program (108 of 138 classrooms). Only 11 of the 119 participating programs contributed PreSET data from multiple classrooms within a program. The data from these 30 classrooms in 11 programs were insufficient for describing program-wide trends or how program-level features affected the internal consistency, correlations, and sensitivity to change of PreSET scores. Future analyses should address these potential issues and speak to whether the PreSET may be used as a program-wide measure of PBIS in early childhood settings.
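The sample counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the original analysis; the variable names are illustrative, and the figure of 108 single-classroom programs is inferred from the reported totals (119 programs minus the 11 with multiple classrooms), not stated directly in the report.

```python
# Counts taken from the text; names are hypothetical and chosen for readability.
total_classrooms = 138
total_programs = 119
multi_classroom_programs = 11      # programs with data from more than one classroom
single_classroom_classrooms = 108  # classrooms that were the only one in their program

# Inferred (assumption): each single classroom corresponds to exactly one program.
single_classroom_programs = total_programs - multi_classroom_programs

# Classrooms nested within the multi-classroom programs.
multi_classrooms = total_classrooms - single_classroom_classrooms

# Share of classrooms that stand alone, and average classrooms per multi-classroom program.
share_single = single_classroom_classrooms / total_classrooms
mean_per_multi_program = multi_classrooms / multi_classroom_programs

print(f"Classrooms in multi-classroom programs: {multi_classrooms}")
print(f"Share of stand-alone classrooms: {share_single:.1%}")
print(f"Mean classrooms per multi-classroom program: {mean_per_multi_program:.2f}")
```

With fewer than three classrooms on average in each of only 11 programs, the nested portion of the sample is small, which is consistent with the decision to restrict the analyses to the classroom level.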
A limitation of the PreSET itself is that it focuses only on the primary, universal level of PBIS. Although a more comprehensive tool may yield more information about implementation at the secondary and tertiary levels of PBIS, there is a large research base to support an initial focus on primary prevention when beginning PBIS efforts (Walker & Shinn, 2002). Initial analyses of PBIS implementation in early childhood environments, such as child care centers, indicate a low level of implementation. For example, a recent report indicates that early childhood teachers currently implement fewer than 40% of the key features associated with PW-PBIS (Fox, Hemmeter, Snyder, Binder, & Clarke, 2011). Given that early childhood classrooms will have much to work on during their initial PBIS effort, the use of the PreSET, designed specifically for primary universal prevention, seems a reasonable and calculated place to start.
The implications of the reliability and validity of scores on the PreSET are immediate and connected to ongoing efforts to implement PW-PBIS in preschool settings. Early childhood programs may use the results of the PreSET to learn about features of PW-PBIS that are currently in place and plan goals for implementing other features in their programs. PreSET results may also be used to plan professional development efforts to train administrators. Individuals involved in statewide efforts to execute PBIS in their state’s schools and programs may also aggregate PreSET data to assess and evaluate implementation across counties, regions, districts, programs, or classrooms. The use of a tool that generates valid and reliable scores allows for this comparison across and between programs. For all of these reasons, the PreSET appears to be a valuable instrument to inform PW-PBIS efforts in preschool settings.
Footnotes
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and publication of this article: Elizabeth A. Steed wishes to disclose a potential conflict of interest in that she receives a portion of royalties on net sales of PreSET.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
