Preliminary Psychometrics of Responses to the Youth Externalizing Problems Screener

Abstract

This brief report presents preliminary psychometrics of responses to the Youth Externalizing Problems Screener (YEPS), which is a 10-item self-report rating scale intended for use as a screening instrument. The YEPS was designed to function as a companion measure to the Youth Internalizing Problems Screener (YIPS), facilitating the screening of broad mental health problems among students in secondary school settings. Analyses presented herein were conducted with the same small, preliminary samples of urban high-school students as those reported on for the initial development and validation of the YIPS (Sample 1: n = 177, Sample 2: n = 219). Results suggest that responses to the YEPS showed a sound, unidimensional factor structure that is internally consistent, providing initial evidence for the purported internal structure of the measure. Findings also showed that YEPS scores had meaningful associations with other self-reported, theoretically relevant mental health variables, providing initial convergent evidence in favor of construct interpretation. Taken together, preliminary psychometrics support the validation argument for the interpretation and use of YEPS scores as a brief measure of adolescents’ general externalizing problems. Implications for future research and practice are discussed.

Keywords

screening, externalizing problems, school mental health, measurement, assessment

One approach to accomplishing mental health screening in secondary schools is to administer brief self-report behavior rating scales to the entire student population. Results from such universal screening might be used to classify students according to lesser-or-greater levels of mental health risk, inform intervention planning, and prioritize allocation of school resources (Dowdy, Ritchey, & Kamphaus, 2010). The most comprehensive problem screeners used in schools target both internalizing and externalizing concerns. Externalizing problems are characterized by excessive and disruptive public behaviors (i.e., physical and verbal actions) directed toward the social environment, whereas internalizing problems are characterized by excessive and aversive private behaviors (i.e., thoughts and feelings) directed toward oneself (Achenbach, 1985). Research indicates that both types of problem behavior are correlated with school performance difficulties, suggesting that students experiencing these problems might benefit from school-based identification and intervention (e.g., Bradley, Doolittle, & Bartolotta, 2008). One recent self-report rating scale developed for use as a universal screener of student mental health problems is the 10-item Youth Internalizing Problems Screener (YIPS; Renshaw & Cook, 2018). The intention of the present study was to probe the preliminary psychometrics of a complementary measure—the Youth Externalizing Problems Screener (YEPS)—that is intended for use as a companion to the YIPS when screening for broad behavior problems in secondary schools.

The rationale for developing the YIPS and YEPS consists of three key points, one grounded in appropriateness, one in technical adequacy, and one in usability (see Glover & Albers, 2007). The first point, as described in Renshaw and Cook (2018), is that the item content comprising most common screeners fails to provide adequate coverage of broad internalizing and/or externalizing problems, highlighting concerns with construct underrepresentation and/or contamination and suggesting a need for more appropriate measures. The second point is that evidence regarding the diagnostic accuracy of common screeners fails to provide strong support for their classification utility, suggesting a need for more technically adequate measures (see Renshaw & Cook, 2018). The third point is that there is a lack of free, publicly available self-report screeners targeting adolescents, as the majority of mental health measures are proprietary and thus cost-prohibitive for conducting regular screening in secondary schools, suggesting a need for more usable measures (cf. Bruhn, Woods-Groves, & Huddle, 2014; findings regarding reasons reported by schools for not conducting behavioral and mental health screening).

Although concern has been expressed in the assessment literature regarding youth’s ability to accurately self-discriminate and estimate their own behavioral and mental health functioning (e.g., Loeber, Green, Lahey, & Stouthamer-Loeber, 1991), we suggest youth are likely to be the optimal informants for mental health screening in secondary schools, even for externalizing problems. Our rationale for this position is threefold. First, self-reports yield a single data point per student that allows for truly universal decision making, while avoiding the pitfalls associated with aggregating and analyzing data from multiple informants (e.g., several teachers or teachers plus parents) that are likely to have low levels of agreement (cf. Achenbach, McConaughy, & Howell, 1987). Second, self-reports conserve teachers’ and caregivers’ time, making for a more efficient screening process that can be carried out within the scope of regular school hours. Finally, research regarding the reliability (or lack thereof) of youths’ self-reports of externalizing problems appears to be mixed, with some studies demonstrating that externalizing behavior may be more accurately reported by youth compared with their caregivers (Eckert, Dunn, Guiney, & Codding, 2000). Given that the YEPS is intended to function as the initial screener within a multiple-gating identification process (see Stiffler & Dever, 2015), and not as a diagnostic, criterion, or high-stakes measure, we suggest these points further support the argument for the appropriateness and potential usefulness of the YEPS as a behavioral screener in secondary schools. That said, the primary purpose of this study was to probe the technical adequacy of responses to the YEPS, which should be a coequal consideration when evaluating the usefulness of screeners (Glover & Albers, 2007).

This brief report presents the preliminary psychometrics of responses to the YEPS by probing evidence based on internal structure and evidence based on relations to other variables, both of which are key to the validation argument for the interpretation and use of YEPS scores. Analyses presented herein were conducted with the same small, preliminary samples of urban high-school students as those reported on for the initial development and validation study for the YIPS (see Renshaw & Cook, 2018). We hypothesized that responses to the YEPS would indicate a sound, unidimensional (one-factor) measurement model. This structural hypothesis was based on the intended interpretation and use of the YEPS as a single-score screener of general externalizing problems, which is supported by broader psychopathology theory (Achenbach, 1985) and has proved to be a viable structure for modeling responses to other brief behavior rating scales (e.g., Goodman, Lamping, & Ploubidis, 2010). Furthermore, we hypothesized that scores derived from the YEPS would be most strongly correlated with another general measure of externalizing problems, while also being moderately-to-strongly correlated with measures of general internalizing problems, transdiagnostic mental health dysfunction, and academic performance problems. This relational hypothesis was based on previous research demonstrating associations among youth’s self-reported mental health variables and was intended to provide convergent evidence in favor of interpreting and using YEPS scores as a brief measure of adolescents’ externalizing problems.

Method

Participants

Sample 1 consisted of 177 students (52.4% female, 97.3% Black or African American) in Grades 9 to 12, who were attending a small, public high school located in an urban school district in the southeastern United States. Sample 2 was comprised of 219 students (54.8% female, 96.3% Black or African American) in Grades 9 to 12, also attending a public high school located in an urban school district in the southeastern United States. Further information regarding sample demographics and sampling method can be found in Renshaw and Cook (2018).

Measures

The YEPS pilot measure was developed using a multistep process that paralleled the development of the YIPS pilot measure (see Renshaw & Cook, 2018). First, Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013) criteria were reviewed for “Attention-Deficit/Hyperactivity Disorder” as well as for all disorders listed within the “Disruptive, Impulse-Control, and Conduct Disorders” category to identify core themes for externalizing problems. Next, several common self-report rating scales for measuring adolescents’ externalizing problems were reviewed to identify item themes: the Conners-3 (Conners, 2008), the Youth Self-Report (Achenbach, 2001), the Burks Behavior Rating Scales-2 (Burks & Gruber, 2006), the Strengths and Difficulties Questionnaire (SDQ; R. Goodman, 2001), and the BASC-2 Behavioral and Emotional Screening System (BESS; Kamphaus & Reynolds, 2007). Based on these reviews, seven pilot items were generated to represent key behavioral domains characteristic of conduct problems/oppositional defiance (C/O), and seven additional pilot items were generated to represent key behavioral domains characteristic of hyperactivity–impulsivity/inattention (H/I). All items were directly phrased (see Table 1), requiring no reverse scoring, and were arranged along a 4-point, relative frequency-based response scale (1 = almost never, 2 = sometimes, 3 = often, 4 = almost always). Readability analysis of the 14-item YEPS pilot measure indicated grade-level estimates ranging from a lower bound of 2.4 (Flesch–Kincaid Grade Level) to an upper bound of 5.7 (Coleman–Liau Index), with an average grade-level readability estimate of 3.2.

Table 1.

YEPS Pilot Items and Factor Loadings.

Item	Domain	S1_F λ	S1_P λ	S2 λ
1. I forget things and make mistakes.	H/I	.44	—	—
2. I lose my temper and get angry with other people.	C/O	.60	.48	.51
3. I have a hard time sitting still when other people want me to.	H/I	.64	.64	.50
4. I fight and argue with other people.	C/O	.70	.61	.56
5. I have trouble staying organized and finishing assignments.	H/I	.53	—	—
6. I break rules whenever I feel like it.	C/O	.78	.82	.66
7. I talk a lot and interrupt others when they are talking.	H/I	.78	.81	.70
8. I say or do mean things to hurt other people.	C/O	.70	.71	.60
9. I have a hard time focusing on things that are important.	H/I	.61	.51	.59
10. I like to annoy people or make them upset.	C/O	.74	.77	.58
11. I feel like I have a lot of energy and need to get it out.	H/I	.44	—	—
12. I get revenge when other people hurt me.	C/O	.54	—	—
13. I get distracted by the little things happening around me.	H/I	.59	.55	.47
14. I choose not to follow directions and do not listen to adults.	C/O	.75	.78	.66

Note. S1_F = Sample 1, full set of 14 pilot items; S1_P = Sample 1, preferred model with reduced set of 10 items; S2 = Sample 2, preferred model with 10 items. H/I = hyperactivity–impulsivity/inattention; C/O = conduct problems/oppositional defiance.

Responses to the YEPS were tested in relation to responses to several other measures representing theoretically relevant mental health variables. In both samples, the YIPS (Renshaw & Cook, 2018) was used as the intended companion screener for the YEPS, representing general internalizing problems. In Sample 1, internalizing problems were also gauged by the relevant index from the SDQ (hereafter referred to as the SDQ-INT), which was obtained by summing scores from the Emotional Symptoms and Peer Problems subscales (Goodman et al., 2010). Sample 1 also made use of the externalizing index from the SDQ (hereafter referred to as the SDQ-EXT), which was obtained by summing scores from the Hyperactivity and Conduct Problems subscales (Goodman et al., 2010). In Sample 2, responses to the YEPS were analyzed with scores derived from the eight-item version of the Avoidance and Fusion Questionnaire for Youth (AFQ-Y), which is a transdiagnostic measure of mental health dysfunction (Greco, Lambert, & Baer, 2008), as well as the Subjective Academic Problems Scale (SAPS), which is a measure of self-perceived performance difficulties related to schoolwork (Renshaw, 2018). Descriptive statistics for responses to all study measures are presented in Table 2.

Table 2.

Descriptive Statistics.

Sample/Measure	M	SD	Skewness	Kurtosis	α	ω
Sample 1
YEPS	18.02	5.86	0.59	−0.52	.84	.84
YIPS	17.31	5.41	1.02	1.24	.84	.84
SDQ-EXT	15.43	3.70	0.53	−0.32	.76	.77
SDQ-INT	15.39	3.55	0.75	0.19	.71	.74
Sample 2
YEPS	16.70	5.11	0.96	0.70	.84	.84
YIPS	16.83	4.93	1.32	2.94	.84	.84
AFQ-Y	7.68	6.86	1.12	0.84	.79	.79
SAPS	12.04	3.97	1.03	1.29	.78	.80

Note. YEPS = Youth Externalizing Problems Screener; YIPS = Youth Internalizing Problems Screener; SDQ-EXT = Externalizing Index of the Strengths and Difficulties Questionnaire; SDQ-INT = Internalizing Index of the Strengths and Difficulties Questionnaire; AFQ-Y = Avoidance and Fusion Questionnaire for Youth; SAPS = Subjective Academic Problems Scale.
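For readers unfamiliar with the α coefficient reported in Table 2, the following is a minimal sketch of how Cronbach's α can be computed from item-level responses. It is not the procedure used in the study (the reported estimates came from JASP); the function name and the toy response matrix are hypothetical and serve only to illustrate the formula.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Illustrative helper (hypothetical name); uses sample variances (n - 1).
    """
    k = len(responses[0])  # number of items
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: three respondents answering three items on the
# 1-4 response scale. Every item moves in lockstep, so alpha reaches 1.
toy_responses = [
    [1, 1, 1],
    [2, 2, 2],
    [4, 4, 4],
]
print(f"alpha = {cronbach_alpha(toy_responses):.2f}")
```

Real response data would yield values below 1; the study treated coefficients of at least .70 as adequate.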

Data Analyses

All data analyses were conducted using the free, open-source JASP statistical software (JASP Team, 2018). The internal structure of responses to the YEPS was probed using confirmatory factor analyses (CFA) with both Sample 1 and Sample 2. Given the ordinal nature of self-report rating scale data, the diagonal weighted least squares (DWLS) estimator was employed. Standardized factor loadings (λ) ⩾ .30 were considered adequate. Adequate data-model fit was evaluated using the following indices and decision rules: comparative fit index (CFI) ⩾ .90, root mean square error of approximation (RMSEA) ⩽ .08, and standardized root mean square residual (SRMR) ⩽ .08 (Hu & Bentler, 1999). Internal consistency reliability of the resulting YEPS scale was considered adequate for α and ω coefficients ≥ .70.
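The fit decision rules above can be expressed as a simple check. The sketch below is illustrative only: the class and function names are hypothetical, and the example values are those reported in the Results for the 14-item baseline model in Sample 1.

```python
from dataclasses import dataclass


@dataclass
class FitIndices:
    """Container for the three fit indices used in this report (hypothetical name)."""
    cfi: float
    rmsea: float
    srmr: float


def check_fit(fit):
    """Apply the stated cutoffs: CFI >= .90, RMSEA <= .08, SRMR <= .08."""
    return {
        "CFI": fit.cfi >= 0.90,
        "RMSEA": fit.rmsea <= 0.08,
        "SRMR": fit.srmr <= 0.08,
    }


# Values reported for the 14-item baseline model in Sample 1.
baseline = FitIndices(cfi=0.959, rmsea=0.099, srmr=0.107)
for index, adequate in check_fit(baseline).items():
    print(f"{index}: {'adequate' if adequate else 'not adequate'}")
```

Under these rules the baseline model meets the CFI criterion but not the RMSEA or SRMR criteria, which is consistent with its description as suboptimal.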

The relations of scores derived from the YEPS with other theoretically relevant mental health variables were evaluated using a series of bivariate correlations. For Sample 1, correlations were run between the YEPS and the YIPS, SDQ-INT, and SDQ-EXT. For Sample 2, correlations were run between the YEPS and the YIPS, AFQ-Y, and SAPS. The magnitude of resulting correlation coefficients was evaluated using traditional decision rules for Pearson r: ⩾ .10 = small, ⩾ .30 = moderate, ⩾ .50 = large.
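As a minimal sketch, the decision rules for interpreting Pearson r can be written as a small function. The function name is hypothetical, and the "negligible" label for coefficients below .10 is an assumption added for completeness; the report itself does not name that range.

```python
def classify_correlation(r):
    """Label the magnitude of a Pearson r using the cutoffs stated above."""
    size = abs(r)  # magnitude is judged regardless of sign
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "small"
    return "negligible"  # assumed label; not defined in the report


for r in (0.73, 0.36, 0.12):
    print(f"r = {r:.2f}: {classify_correlation(r)}")
```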

Results and Discussion

Findings from the first CFA conducted with Sample 1, which structured the 14 YEPS pilot items as indicators of a single latent factor (representing externalizing problems), yielded suboptimal data-model fit: χ² = 195.14 (df = 77, p < .001), CFI = 0.959, RMSEA (90% confidence interval [CI]) = 0.099 ([0.082, 0.117]) (p < .001), SRMR = 0.107. This baseline model was altered by removing the four lowest-loading items (see Table 1), to make the length of the measure congruent with the 10-item YIPS (see Renshaw & Cook, 2018). CFA was then rerun, resulting in a somewhat improved, yet still suboptimal, data-model fit: χ² = 90.11 (df = 35, p < .001), CFI = 0.975, RMSEA [90% CI] = 0.100 [0.075, 0.125] (p < .001), SRMR = 0.097. Given all 10 items had strong factor loadings (λ > .40), modification indices were explored to determine potential alterations that might improve fit. Results suggested that adding a covariance between the residuals for Items 2 and 4 would substantially improve fit. Considering the content of these items was theoretically similar (see Table 1), this additional parameter was deemed appropriate and added to the model. CFA was again rerun, resulting in an improved and generally adequate data-model fit: χ² = 62.46 (df = 34, p < .01), CFI = 0.987, RMSEA [90% CI] = 0.073 [0.043, 0.101] (p = .094), SRMR = 0.082. See Table 1 for a full presentation of factor loadings for the baseline model and the preferred model, and Table 2 for internal consistency reliability estimates for the resulting scale.
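The degrees of freedom reported for these three models can be reproduced by counting unique observed moments against freely estimated parameters. The sketch below assumes a one-factor model identified by fixing the factor variance, so each item contributes one loading and one residual variance; this is an illustrative parameterization and not necessarily the one JASP uses internally with the DWLS estimator, although the resulting counts agree with the reported values.

```python
def one_factor_cfa_df(n_items, n_residual_covariances=0):
    """Degrees of freedom for a one-factor CFA (hypothetical helper).

    Assumes the factor variance is fixed, so the free parameters are one
    loading and one residual variance per item, plus any residual covariances.
    """
    unique_moments = n_items * (n_items + 1) // 2
    free_parameters = 2 * n_items + n_residual_covariances
    return unique_moments - free_parameters


print(one_factor_cfa_df(14))      # 14-item baseline model
print(one_factor_cfa_df(10))      # reduced 10-item model
print(one_factor_cfa_df(10, 1))   # 10 items plus the Item 2-Item 4 residual covariance
```

These calls return 77, 35, and 34, matching the df values reported for the three Sample 1 models.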

Results from the initial CFA conducted with Sample 2, which structured the 10 YEPS items from the final model in Sample 1 as indicators of a single latent factor (sans additional parameters), yielded good data-model fit: χ² = 47.26 (df = 35, p = .81), CFI = 0.984, RMSEA [90% CI] = 0.040 [0.000, 0.067] (p = .70), SRMR = 0.075. See Table 1 for a full presentation of factor loadings for this model, and Table 2 for internal consistency reliability estimates for the resulting scale. Taken together, findings from these factor analyses suggest that responses to the 10-item YEPS yield a sound, unidimensional (one-factor) measurement model that is characterized by internally consistent responding. Considering the item generation process and resulting content (see Table 1), it seems reasonable to interpret this structural and reliability evidence as supporting the interpretation and use of the YEPS as a single-score measure of general externalizing problems.
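To illustrate how the RMSEA point estimate relates to the χ² statistic, the sketch below applies one common formula, sqrt(max(χ² − df, 0) / (df × (N − 1))). It assumes that all 219 Sample 2 participants contributed to the analysis and that this variant of the formula applies; software packages differ in such details, so this is a rough check and not a reproduction of the JASP computation.

```python
import math


def rmsea(chi_square, df, n):
    """RMSEA point estimate from chi-square, df, and sample size (N - 1 variant)."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Sample 2 values as reported above; n = 219 is assumed to be the analyzed N.
estimate = rmsea(chi_square=47.26, df=35, n=219)
print(f"RMSEA = {estimate:.3f}")
```

Under these assumptions the estimate is approximately 0.040, in line with the reported value.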

Regarding evidence based on relations with other variables, correlations conducted with Sample 1 indicated large positive associations for the YEPS–YIPS and YEPS–SDQ-EXT, and a moderate positive association for the YEPS–SDQ-INT (see Table 3). Correlations conducted with Sample 2 yielded large positive associations for the YEPS–YIPS and YEPS–SAPS, and a moderate positive association for the YEPS–AFQ-Y (see Table 3). It is noteworthy that, as hypothesized, the strongest association was observed with another general measure of externalizing problems (SDQ-EXT). Moreover, it is noteworthy that, although correlations with the intended companion screener (YIPS) were strong, responses to both measures shared only approximately one third of the score variance (Sample 1: r² = .40, Sample 2: r² = .29), suggesting the measures tap related-yet-distinct constructs. Similarly, the correlation with the transdiagnostic measure of mental health dysfunction (AFQ-Y) shared only 16% of the score variance, which is much lower than the variance shared by the measures of externalizing and internalizing problems, which are more conceptually similar. Taken together, these results provide initial convergent evidence suggesting that scores derived from the YEPS relate in meaningful and expected ways to scores derived from other, self-reported mental health variables. Such evidence provides further support for interpreting and using YEPS scores as an indicator of youths’ general externalizing problems.
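The shared-variance figures cited above follow directly from squaring the correlation coefficients in Table 3. A minimal sketch of that arithmetic (the function name and dictionary labels are illustrative):

```python
def shared_variance(r):
    """Proportion of score variance shared by two measures (r squared)."""
    return r ** 2


# Correlation coefficients taken from Table 3.
correlations = {
    "YEPS-YIPS (Sample 1)": 0.63,
    "YEPS-YIPS (Sample 2)": 0.54,
    "YEPS-AFQ-Y (Sample 2)": 0.40,
}
for label, r in correlations.items():
    print(f"{label}: r^2 = {shared_variance(r):.2f}")
```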

Table 3.

Bivariate Correlations.

Sample/Measures	r [95% CI]
Sample 1
YEPS–YIPS	.63 [0.55, 1.00]***
YEPS–SDQ-INT	.36 [0.24, 1.00]***
YEPS–SDQ-EXT	.73 [0.66, 1.00]***
Sample 2
YEPS–YIPS	.54 [0.46, 1.00]***
YEPS–AFQ-Y	.40 [0.30, 1.00]***
YEPS–SAPS	.53 [0.45, 1.00]***

Note. CI = confidence interval; YEPS = Youth Externalizing Problems Screener; YIPS = Youth Internalizing Problems Screener; SDQ-EXT = Externalizing Index of the Strengths and Difficulties Questionnaire; SDQ-INT = Internalizing Index of the Strengths and Difficulties Questionnaire; AFQ-Y = Avoidance and Fusion Questionnaire for Youth; SAPS = Subjective Academic Problems Scale.

***p < .001.

Given the nature and scope of the present studies, the psychometrics observed for responses to the YEPS should be considered preliminary. Both samples were relatively small, suggesting that replication studies are warranted to probe the consistency of results with larger samples. Furthermore, given that both samples comprised demographically similar high school students, generalizability studies are needed to explore the stability of these psychometrics across diverse youth in an array of secondary school settings. Findings from these studies also have limited reach regarding recommendations for practice, as no analyses were conducted to investigate diagnostic accuracy or treatment utility. So far, then, the only reasonable (albeit tentative) implication for practice is that the YEPS might be used within school mental health screening frameworks to procure scores that represent youths’ general externalizing problems. Further research is thus needed to probe the diagnostic accuracy and other classification utility (e.g., contribution toward intervention planning or matching) of YEPS scores when used within mental health screening frameworks in secondary schools. In addition, potential use of YEPS scores for basic research purposes should be tempered by the fact that the measure is intended as a screener and may therefore under-represent the broader construct of externalizing behaviors. Until stronger and more comprehensive evidence is generated to support the validation argument for the YEPS, any interpretation and use of YEPS scores should be undertaken with proper caution.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Tyler L. Renshaw

References

Achenbach, T. M. (1985). Assessment and taxonomy of child and adolescent psychopathology. Thousand Oaks, CA: SAGE.

Achenbach, T. M. (2001). Youth self-report for ages 11-18. Burlington, VT: Achenbach System of Empirically Based Assessment.

Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin, 101, 213-232. doi:10.1037/0033-2909.101.2.213

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing.

Bradley, Doolittle, & Bartolotta (2008). Building on the data and adding to the discussion: The experiences and outcomes of students with emotional disturbance. Journal of Behavioral Education, 17, 4-23. doi:10.1007/s10864-007-9058-6

Bruhn, A. L., Woods-Groves, & Huddle (2014). A preliminary investigation of emotional and behavioral screening practices in K–12 schools. Education and Treatment of Children, 37, 611-634. doi:10.1353/etc.2014.0039

Burks, H. F., & Gruber, C. P. (2006). Burks Behavior Rating Scales (2nd ed.). Los Angeles, CA: Western Psychological Services.

Conners, C. K. (2008). Conners (3rd ed.). San Antonio, TX: Pearson.

Dowdy, Ritchey, & Kamphaus, R. W. (2010). School-based screening: A population-based approach to inform and monitor children’s mental health needs. School Mental Health, 2, 166-176. doi:10.1007/s12310-010-9036-3

Eckert, T. L., Dunn, E. K., Guiney, K. M., & Codding, R. S. (2000). Self-reports: Theory and research in using rating scale measures. In E. S. Shapiro & T. R. Kratochwill (Eds.), Behavioral assessment in schools: Theory, research, and clinical foundations (2nd ed., pp. 288-322). New York, NY: Guilford.

Glover, T. A., & Albers, C. A. (2007). Considerations for evaluating universal screening instruments. Journal of School Psychology, 45, 117-135. doi:10.1016/j.jsp.2006.05.005

Goodman, Lamping, D. L., & Ploubidis, G. B. (2010). When to use broader internalising and externalising subscales instead of the hypothesised five subscales on the Strengths and Difficulties Questionnaire (SDQ): Data from British parents, teachers and children. Journal of Abnormal Child Psychology, 38, 1179-1191. doi:10.1007/s10802-010-9434-x

Goodman, R. (2001). Psychometric properties of the Strengths and Difficulties Questionnaire. Journal of the American Academy of Child and Adolescent Psychiatry, 40, 1337-1345. doi:10.1097/00004583-200111000-00015

Greco, L. A., Lambert, & Baer, R. A. (2008). Psychological inflexibility in childhood and adolescence: Development and evaluation of the Avoidance and Fusion Questionnaire for Youth. Psychological Assessment, 20, 93-102. doi:10.1037/1040-3590.20.2.93

Hu, & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6, 1-55. doi:10.1080/10705519909540118

JASP Team. (2018). JASP (Version 0.9) [Computer software]. Available from https://jasp-stats.org

Kamphaus, R. W., & Reynolds, C. R. (2007). BASC-2 Behavioral and Emotional Screening System manual. Circle Pines, MN: Pearson.

Loeber, Green, S. M., Lahey, B. B., & Stouthamer-Loeber (1991). Differences and similarities between children, mothers, and teachers as informants on disruptive child behavior. Journal of Abnormal Child Psychology, 19, 75-95. doi:10.1007/BF00910566

Renshaw, T. L. (2018). Preliminary validation of the Subjective Academic Problems Scale: A new tool to aid in triaging school mental health screening results. Canadian Journal of School Psychology, 33, 242-256. doi:10.177/082957351702020

Renshaw, T. L., & Cook, C. R. (2018). Initial development and validation of the Youth Internalizing Problems Screener. Journal of Psychoeducational Assessment, 36, 366-378. doi:10.1177/0734282916679757

Stiffler, & Dever (2015). Multiple-gating and mental health screening. In Stiffler & Dever (Eds.), Mental health screening at school: Instrumentation, implementation, and critical issues (pp. 91-105). New York, NY: Springer.