Abstract
Universal screening for behavioral and emotional risk is an important part of implementing multi-tiered behavioral supports in schools. The current study adds to our understanding of universal screening by examining teacher and student reports of behavioral and emotional risk. Participants included 73 fourth-grade students and 4 teachers in an urban school in the Midwestern United States. Correlations between the two informants ranged from moderate to large for the overall T-score, internalizing problems, and externalizing/self-regulation problems, but were not significant for personal adjustment/adaptive skills. Furthermore, the Behavioral and Emotional Screening System (BESS) Teacher Form (TF) showed concurrent and predictive validity with academic scores, whereas the BESS Student Form (SF) showed concurrent and predictive validity with measures of school climate. Results of this study indicate that teachers and students may provide unique information regarding student functioning.
One reason behavioral/emotional screening has fallen behind academic screening is the lack of information available regarding screening instruments (Bruhn, Woods-Groves, & Huddle, 2014). Implementation questions remain a challenge, such as who should provide information on student functioning during behavioral and emotional screening (Kamphaus, Reynolds, & Dever, 2014). Effective implementation of a universal screening program requires time, money, and expertise. Investing in these resources, especially in already resource-strapped schools, is difficult to justify when there is a dearth of information available to school personnel on screening tools.
The primary aim of this study was to investigate the reliability and validity of a commonly used universal screening form (Bruhn et al., 2014), the Behavioral and Emotional Screening System, third edition (BESS-3; Kamphaus & Reynolds, 2015). This revision includes new sub-indices that have not been examined outside of the manual. The present study compared the BESS-3 Teacher Form (TF) with the Student Form (SF) by examining the rate of agreement between teachers and students and how well each form predicted academic and school climate outcomes. A strong body of evidence indicates that children’s reports of their own behavioral and emotional functioning can be quite different from what is observed by their teachers (Dowdy & Kim, 2012; Youngstrom, Loeber, & Stouthamer-Loeber, 2000). Comparing teacher and student reports of student behavior, and using each to predict important school outcomes, provides preliminary evidence about the kind of information each informant contributes.
Although it is customary practice to assess the validity of a behavioral screener by comparing it with other well-established behavioral measures, a less common but equally important practice is examining the validity of a behavioral screener compared with commonly collected school outcomes (Eklund, Kilgus, von der Embse, Beardmore, & Tanner, 2017). Measures of academic functioning and school climate were utilized as outcomes in the current study. Both are widely used by school personnel to make decisions regarding student supports, and both may be influenced by students’ behavioral and emotional functioning (Masten et al., 2005; Perfect & Morris, 2008). For example, students who suffer from internalizing symptoms may have difficulty concentrating or focusing on class assignments, which could result in poorer academic performance (Miller & Jome, 2008). These same risk factors also influence how students perceive their school climates, and whether they feel supported or challenged in the school setting (Gage, Larson, Sugai, & Chafouleas, 2016; Wang & Degol, 2016).
Researchers hypothesized that there would be stronger interrater reliability between teachers and students on externalizing domains of functioning as compared with internalizing domains. Regarding concurrent and predictive validity, it was hypothesized that both teacher and student reports would demonstrate significant relationships with academic outcomes; however, it was predicted that only student reports would be related to school climate factors.
Method
Participants and Procedures
This study utilized data collected in a Midwestern urban school in the United States. The school is classified as a high-poverty school, meaning that the percentage of enrolled students who qualify as economically disadvantaged falls in the district’s top quartile (Ohio School Report Cards, 2018). The BESS-3 SF was completed by all fourth-grade students, and the BESS-3 TF was completed by all fourth-grade teachers for students in their homeroom classes. The sample consisted of 73 students and 4 teachers. All of the teachers identified as White (75% female). Students were 41% female, with 50.7% of students identifying as Black/African American, 39.7% White, 5.5% Hispanic/Latino, 2.7% multiracial, and 1.4% Asian. The BESS-3 was administered in November 2016 and in May 2017. All instructions and items were read aloud to students. Teachers were given the BESS-3 measures to complete independently in the same week the measure was administered to students.
Measures
BESS-3
The BESS-3 SF is a 28-item instrument, and the BESS-3 TF is a 20-item instrument. Each BESS-3 form produces a raw score that is transformed to a T-score, in which higher scores reflect a higher risk for behavioral/emotional problems (Kamphaus & Reynolds, 2015). The BESS-3 includes the global behavioral and emotional risk index as well as new sub-indices (internalizing problems, externalizing/self-regulation problems, and adaptive skills/personal adjustment problems). The psychometric properties of the BESS-3 SF are acceptable, with good internal consistency (.93–.94) and test–retest reliability (.88). The psychometric properties of the BESS-3 TF are also acceptable, with good internal consistency (.95–.96) and test–retest reliability (.86; Kamphaus & Reynolds, 2015).
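The raw-score-to-T-score step can be illustrated with the generic linear T-score definition, which rescales scores to a mean of 50 and a standard deviation of 10 in a reference group. This is a minimal sketch only: the BESS-3 conversion itself relies on the publisher's normative tables, and the function name and the norm mean and standard deviation below are hypothetical values chosen for illustration.

```python
def to_t_score(raw, norm_mean, norm_sd):
    """Generic linear T-score: mean 50 and SD 10 in the reference group.

    norm_mean and norm_sd are placeholders for normative values; they are
    NOT taken from the BESS-3 manual.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd  # standardize against the norm group
    return 50 + 10 * z               # rescale to the T metric


# Hypothetical example: a raw score one SD above the norm mean.
example_t = to_t_score(raw=30, norm_mean=20, norm_sd=10)
```

Under this definition, a score at the norm mean maps to 50 and each standard deviation above the mean adds 10 points, which is why higher T-scores indicate greater risk relative to the reference group.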
Measure of Academic Progress (MAP)
The MAP is an interim testing tool that measures student academic progress. The MAP test is a norm-referenced, computer adaptive test. The MAP was published by the Northwest Evaluation Association in 2000 and is used in more than 10% of schools nationally and more than a third of schools in the Midwestern United States (Cordray, Pion, Brandt, Molefe, & Toby, 2012). The MAP was administered three times a year to all students in the current sample. The MAP tests for math and reading exhibit acceptable internal consistency, test–retest reliability with the equivalent forms, and predictive validity (Brown & Coughlin, 2007).
Conditions for Learning Survey (CLS)
The CLS is administered as part of district-wide efforts to monitor school climate (Osher, Kendziora, & Chinen, 2008). The survey includes four school climate factors: school safety, academic rigor, student support, and peer social-emotional climate. This survey is used nationally to help schools assess students’ social, emotional, and learning needs. It was created by the American Institutes for Research (AIR), which adapted questions from other social-emotional learning surveys with strong reliability and validity. AIR conducted a pilot study to analyze the psychometric properties of the measure and found acceptable reliability and validity (Osher et al., 2008).
Results and Discussion
Pearson correlation coefficients (r) were calculated between concurrent student and teacher scores to estimate interrater reliability using the global risk score and each of the domain scores. For all correlations, Cohen’s (1992) interpretations of effect size were used, where .10 to .29 is a small effect, .30 to .49 is moderate, and .50 and higher is large. Correlations between the two informants were significant (p < .01) for the overall scores and two of the three domains; effect sizes ranged from moderate to large for the overall T-score (Fall: r = .47; Spring: r = .55), internalizing problems (Fall: r = .36; Spring: r = .45), and externalizing/self-regulation problems (Fall: r = .52; Spring: r = .46).
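The correlation and effect-size labeling described above can be sketched as follows. The thresholds are the ones stated in the text (Cohen, 1992); the function names and the "negligible" label for values below .10 are assumptions added for illustration and are not part of the study's analysis code.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def cohen_label(r):
    """Label |r| using the cutoffs given in the text (.10, .30, .50)."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "small"
    return "negligible"  # assumed label; the text does not name this range


# Labels for the reported overall T-score correlations (Fall, Spring).
fall_label = cohen_label(0.47)
spring_label = cohen_label(0.55)
```

Applied to the reported overall T-score correlations, the Fall value (.47) falls in the moderate band and the Spring value (.55) in the large band, which matches the "moderate to large" description.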
Pearson correlation coefficients were calculated among BESS-3, MAP, and CLS scores collected at the same time point as estimates of concurrent validity, separately for the Fall and the Spring. In general, these coefficients provided evidence of concurrent validity of the BESS-3 TF with MAP math and reading scores, with the Fall Teacher BESS-3 T-score demonstrating significant (p < .001) and moderate correlations with Fall math (r = −.49) and reading (r = −.44). The Spring BESS-3 Teacher T-score was moderately and significantly (p < .01) related to Spring math (r = −.39), but not reading. Student BESS-3 T-scores were not significantly related to math or reading scores at either time point.
When considering the domains of risk, teacher-rated externalizing (r = −.47, p < .001), internalizing (r = −.26, p < .05), and adaptive skills (r = .38, p < .01) in the Fall were related to Fall MAP math scores. Teacher-rated externalizing (r = −.45, p < .001) and adaptive skills (r = .33, p < .01) in the Fall were also moderately related to Fall MAP reading scores, whereas teacher-rated internalizing problems in the Fall were not significantly related to Fall reading. In the Spring, teacher-rated externalizing (r = −.47, p < .001) and adaptive skills (r = .34, p < .05) were moderately related to Spring MAP math scores only. None of the dimensions on the student BESS-3 were significantly related to MAP math or reading scores, at either time point.
The patterns for concurrent validity with the school climate outcomes were quite different. The Fall BESS-3 TF T-score demonstrated a weak correlation (r = −.26, p < .05) with Fall safety only. The Spring BESS-3 TF T-score demonstrated moderate correlations with academic rigor (r = −.31, p < .05) and student support (r = −.34, p < .05). The Fall BESS-3 SF T-score was moderately related to Fall academic rigor (r = −.38, p < .01), student support (r = −.39, p < .01), and social-emotional learning climate (r = −.36, p < .01), and strongly related to Fall safety (r = −.53, p < .001). In the Spring, safety (r = −.31, p < .05), academic rigor (r = −.31, p < .05), student support (r = −.34, p < .05), and social-emotional learning climate (r = −.31, p < .05) were moderately correlated with the BESS-3 SF T-score.
When considering the individual domains, Fall teacher-rated externalizing problems were moderately associated with Fall safety (r = −.31, p < .05). Fall BESS-3 TF adaptive skills were weakly associated with safety (r = .25, p < .05) and academic rigor (r = .28, p < .05). In the Spring, BESS-3 TF adaptive skills were moderately correlated with academic rigor (r = .34, p < .05), whereas BESS-3 TF externalizing problems were moderately correlated with student support (r = −.34, p < .05). The student-report BESS-3 demonstrated more associations with climate overall. In the Fall, the BESS-3 SF internalizing problems domain was moderately related to safety (r = −.37, p < .01), and weakly related to social-emotional learning climate (r = −.29, p < .05). Fall self-regulation was moderately related to safety (r = −.33, p < .01) and academic rigor (r = −.36, p < .01). Fall BESS-3 SF personal adjustment was moderately correlated with safety (r = .41, p < .001), and weakly correlated with academic rigor (r = .27, p < .05), student support (r = .26, p < .05), and social-emotional learning climate (r = .25, p < .05). In the Spring, only BESS-3 SF internalizing problems were moderately associated with safety (r = −.32, p < .05) and weakly associated with student support (r = −.29, p < .05).
Predictive validity of the BESS-3 forms was assessed through a series of simultaneous multiple regression analyses, with Fall BESS-3 teacher and student ratings as predictors of Spring MAP and CLS outcomes. A total of 12 multiple regression analyses were run, as there were 6 outcomes (2 academic, 4 school climate), and models were run separately with (a) T-scores and (b) individual domains of risk as predictors. The alpha level for statistical significance was set to .01 to account for the multiple analyses. Fall BESS-3 teacher T-scores predicted both math (β = −.47, p < .001) and reading MAP scores (β = −.54, p < .001) in the Spring; student-reported T-scores in the Fall did not predict either achievement outcome. When considering the domains of risk, teacher-reported externalizing problems in the Fall predicted Spring math achievement (β = −.51, p < .01) and reading achievement (β = −.52, p < .01; Table 2).
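The analysis design can be sketched in two parts: enumerating the 12 models (6 outcomes by 2 predictor sets) and computing standardized coefficients for a simultaneous regression with two predictors, as in the T-score models where teacher and student T-scores enter together. The closed-form expression below holds only for the two-predictor case; the domain models would need a general least-squares solver. All names and the example correlations are hypothetical and do not reproduce the study's data or software.

```python
from itertools import product

# Outcome and predictor-set labels follow the text; the list names are ours.
OUTCOMES = [
    "MAP math", "MAP reading",
    "safety", "academic rigor", "student support",
    "social-emotional learning climate",
]
PREDICTOR_SETS = ["T-scores", "domain scores"]
MODELS = list(product(OUTCOMES, PREDICTOR_SETS))  # 6 x 2 = 12 models
ALPHA = 0.01  # significance threshold stated in the text


def standardized_betas(r_y1, r_y2, r_12):
    """Standardized betas for y regressed on two predictors at once.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    denom = 1 - r_12 ** 2
    if denom <= 0:
        raise ValueError("predictors must not be perfectly correlated")
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Hypothetical correlations, for illustration only (not study results).
beta_teacher, beta_student = standardized_betas(r_y1=-0.5, r_y2=-0.2, r_12=0.5)
```

Because both predictors enter at the same time, each coefficient reflects a predictor's contribution after accounting for its overlap with the other informant, which is why a coefficient can differ noticeably from the corresponding zero-order correlation.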
When assessing Spring school climate outcomes, Fall student-reported BESS-3 T-scores predicted safety (β = −.39, p < .01), student support (β = −.43, p < .01), and social-emotional learning climate (β = −.36, p < .01); teacher-reported BESS-3 T-scores in the Fall did not significantly predict any of the Spring climate outcomes (Table 1). When considering specific domains, none of the individual domains on either of the BESS-3 forms predicted the climate variables at the threshold of p < .01 (Table 2).
Table 1. Summary of Multiple Regression Analyses Using BESS T-Scores as Predictors.
Note. BESS = Behavioral and Emotional Screening System; MAP = Measure of Academic Progress.
*p < .05. **p < .01. ***p < .001.
Table 2. Summary of Multiple Regression Analyses Using BESS Dimensions as Predictors.
Note. BESS = Behavioral and Emotional Screening System; MAP = Measure of Academic Progress.
*p < .05. **p < .01. ***p < .001.
Results of this study suggest that both teachers and students provide unique and equally important information regarding student behavioral and emotional functioning. Specifically, teacher reports in elementary school may help to identify students struggling with externalizing risk factors that affect academic outcomes, whereas student reports provide more insight into internalizing risk and the identification of students who may be struggling socially and emotionally.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
