Abstract
The purpose of this study was to evaluate the psychometric properties and measurement invariance of a web-based, self-administered battery of assessments of social-emotional comprehension called “SELweb.” Assessment modules measured children’s ability to read facial expressions, infer others’ perspectives, solve social problems, delay gratification, and tolerate frustration. In an ethnically and socioeconomically diverse sample of 4,419 children in kindergarten through third grade who completed SELweb: (a) scores from assessment modules exhibited moderate to high internal consistency and moderate 6-month temporal stability; (b) composite assessment scores exhibited high reliability; and (c) assessment module scores fit a theoretically coherent four-factor model that includes factors reflecting emotion recognition, social perspective-taking, social problem-solving, and self-control. In addition, the present study supports configural and metric invariance across time, sex, and ethnicity. Analyses suggest partial scalar invariance across time, sex, and, to a lesser degree, ethnicity.
Social-emotional comprehension includes the ability to encode, interpret, and reason about social-emotional information. The conceptualization of social-emotional comprehension in this article draws on several theoretical traditions and hypothesizes four interrelated dimensions (Lipton & Nowicki, 2009). First, drawing on research on nonverbal communication, emotion recognition is defined as the ability to understand others’ emotions from nonverbal cues (Nowicki & Duke, 1994; Pons, Harris, & de Rosnay, 2004). Second, drawing on research on children’s theory of mind understanding, social perspective-taking is defined as the ability to interpret others’ mental states (Wellman & Liu, 2004). Third, drawing on research on social information processing, social problem-solving is defined as the ability to reason about social problems (Crick & Dodge, 1994; Denham, 2006). Finally, self-control includes the effortful control of attention, emotions, and behavior to achieve a goal (Duckworth, 2011).
Social-emotional comprehension is consequential. A large body of research suggests that emotion recognition, social perspective-taking, social problem-solving, and self-control are each associated with outcomes as wide ranging as self-esteem, locus of control, peer acceptance, physical health, substance use, income, socioeconomic status, single parenthood, and criminality (Blair & Raver, 2015; Blair & Razza, 2007; Crick & Dodge, 1994; Denham, 2006; Denham et al., 2012; Dubow, Tisak, Causey, Hryshko, & Reid, 1991; Duckworth & Seligman, 2005; Iyer, Kochenderfer-Ladd, Eisenberg, & Thompson, 2010; Izard et al., 2001; Lecce, Caputi, & Hughes, 2011; Moffit et al., 2010; Nowicki & Duke, 1994). In prior work, for example, McKown, Russo-Ponsaran, Allen, Johnson, and Russo (2016) found that the better early elementary-aged children scored on a measure of social-emotional comprehension, the more teachers reported that children engaged in socially skilled behavior and the less they engaged in problem behavior. Furthermore, social-emotional comprehension was positively associated with peer acceptance and both teacher reported and directly assessed reading and math skills, controlling for IQ.
The Need for Direct Assessments of Social-Emotional Comprehension
Social-emotional comprehension includes cognitive and affective skills that may not have straightforward behavioral correlates. As a result, third-party raters must make a high level of inference, potentially limiting the validity of this form of assessment for measuring social-emotional comprehension. Self-report may not be well suited to assessing social-emotional skills because self-reported skill is only modestly correlated with skill level measured more objectively (Shrauger & Osberg, 1981), and children may respond in ways that reflect social expectations more than actual skill level (Crowne & Marlowe, 1960).
Social-emotional comprehension lends itself to direct assessment, in which children demonstrate their skill in a particular domain by solving challenging domain-relevant tasks. Ideally, direct assessments will have adequate construct coverage, be easy to use, permit group administration, and be appropriate for a wide range of children. Existing direct assessments vary in domain coverage, age coverage, and the populations for which they are appropriate. Most require expertise to administer, score, and interpret.
SELweb to Assess Social-Emotional Comprehension
To address the need for direct assessments of children’s social-emotional comprehension that can be administered at large scale to the general population, the author and his colleagues created SELweb, a web-based system for measuring social-emotional comprehension in kindergarten through third grade. SELweb includes five assessment modules, one each to measure emotion recognition, social perspective-taking, and social problem-solving. Two additional modules assess dimensions of self-control—delay of gratification and frustration tolerance. SELweb’s assessment modules use direct assessment.
Prior research suggests that direct assessment is a promising approach to assessing social-emotional comprehension. In four separate studies examining SELweb and other assessments, McKown, Allen, Russo-Ponsaran, and Johnson (2013) and McKown and colleagues (2016) found that (a) composite social-emotional assessment scores exhibited latent factor internal consistency reliabilities (Nunnally & Bernstein, 1994) averaging greater than .80; (b) social-emotional assessment observed scores fit a four-factor structure reflecting emotion recognition, social perspective-taking, social problem-solving, and self-control; (c) social-emotional comprehension factor scores demonstrated convergent and discriminant validity; and (d) controlling for IQ and demographic characteristics, performance on SELweb was positively associated with peer acceptance, teacher report of social skills, and multiple indicators of academic achievement, and negatively associated with teacher report of problem behaviors. Evidence of the reliability and validity of a Spanish-language version of SELweb is consistent with these findings (Russo, McKown, Russo-Ponsaran, & Allen, 2018).
Measurement Equivalence of SELweb
Prior studies included diverse samples of typically developing and clinic-referred children. Similarity in findings across measures, samples, and analyses suggests that these assessments yield reliable scores that are valid for understanding how well-developed children’s social-emotional comprehension is. Nevertheless, no prior research I am aware of has directly evaluated the measurement equivalence of direct assessments of social-emotional learning (SEL). Evaluating SELweb’s measurement equivalence will provide important information on the extent to which SELweb scores can be interpreted in the same way for children from different groups.
Measurement equivalence can be tested within a confirmatory factor analysis (CFA) model (Dimitrov, 2010; Millsap, 2011), by comparing nested CFA models with varying degrees of equality constraints. The most basic question about measurement equivalence is whether the factor structure is the same across groups (configural invariance). Assuming configural invariance assumptions are met, a second important question is whether factor loadings are equivalent for different groups (metric invariance). Metric invariance means that a one-unit change in the latent construct is reflected by the same change in the observed variables for all groups. Assuming metric invariance requirements are met, a third important question is whether latent intercepts are equivalent for different groups (scalar invariance). Scalar invariance means that, at a given level of the latent variable, people from different groups achieve the same score on the observed variables. Assuming scalar invariance assumptions are met, another question is whether residual item and factor variances and covariances are equal across groups (strict invariance).
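The nested hierarchy described above can be written compactly. The notation below is a conventional multigroup CFA parameterization and is an assumption of this sketch, not notation taken from the SELweb analyses: for group g, x_g is the vector of observed scores, τ_g the intercepts (called latent intercepts in the text), Λ_g the factor loadings, ξ_g the latent variables, and Θ_g the residual covariance matrix. Strict invariance as defined in the text also constrains factor variances and covariances; only the residual constraint is shown here.

```latex
\begin{align*}
\mathbf{x}_g &= \boldsymbol{\tau}_g + \boldsymbol{\Lambda}_g \boldsymbol{\xi}_g + \boldsymbol{\delta}_g,
  \qquad \operatorname{Cov}(\boldsymbol{\delta}_g) = \boldsymbol{\Theta}_g \\
\text{Configural:} &\quad \text{same pattern of fixed and free loadings in every group } g \\
\text{Metric:} &\quad \boldsymbol{\Lambda}_g = \boldsymbol{\Lambda} \ \text{for all } g \\
\text{Scalar:} &\quad \boldsymbol{\Lambda}_g = \boldsymbol{\Lambda},\ \boldsymbol{\tau}_g = \boldsymbol{\tau} \ \text{for all } g \\
\text{Strict:} &\quad \boldsymbol{\Lambda}_g = \boldsymbol{\Lambda},\ \boldsymbol{\tau}_g = \boldsymbol{\tau},\ \boldsymbol{\Theta}_g = \boldsymbol{\Theta} \ \text{for all } g
\end{align*}
```

Each level adds equality constraints to the one before it, which is why the models are nested and can be compared by the change in fit.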
This study examines the measurement equivalence of SELweb for boys and girls, for children from different ethnic groups, and across two administrations. Testing measurement equivalence for sex and ethnicity is important to ensure that the scores achieved by children from different groups have the same meaning. Testing measurement equivalence across time is important for two reasons. First, if SELweb is administered to the same child more than once, it will be important to ensure that testing familiarity or fatigue does not change the meaning of the obtained scores. Second, establishing measurement equivalence across time helps ensure that observed changes in SELweb scores over time reflect changes in what is being measured.
The first study goal was to evaluate the reliability and validity of SELweb modules and factor scores. Based on prior work (McKown et al., 2016), I hypothesized that module score reliabilities would be between .70 and .80, that factor score reliabilities would exceed .80, and that module scores would reflect a four-factor structure that includes factors reflecting emotion recognition, social perspective-taking, social problem-solving, and self-control. The second study goal was to evaluate measurement equivalence of SELweb’s factor structure across time, sex, and ethnicity. To do so, I tested a series of nested CFA models with increasingly stringent equality constraints to assess invariance across time, sex, and ethnicity (Millsap, 2011). Specifically, I tested configural, metric, and scalar invariance across time, sex, and ethnicity.
Method
Participants
The sample included 4,419 children from 20 schools in three urban and six suburban school districts in five states who were tested during 2014-2015. Mean age of participants was 7.5 years (SD = 1.1). Sample characteristics are summarized in Table 1. Ethnicity labels that were common across all districts included synonyms for White, Black, Latino, and Asian. Members of other ethnic groups are categorized in this study as “Other.”
Sample Characteristics.
Note. Low-income estimates were taken from public records about the proportion of children eligible for free and reduced-price lunches, or whose families received public aid.
Procedures
In all participating districts, school staff administered SELweb to all students in kindergarten through third grade to learn about their students’ SEL skills. Districts received data on student SELweb performance. The investigators received de-identified data to evaluate SELweb’s technical properties. The university’s institutional review board (IRB) granted a waiver of informed consent to use de-identified SELweb data for research purposes.
School personnel administered SELweb in one or two sessions in a room with several Internet-connected computers. All sessions were group administrations. To complete SELweb, children are logged in by an administrator. SELweb assessment modules are illustrated and narrated with pictorial forced-choice responses that respondents select with a mouse. Children wear headphones and complete SELweb autonomously. Scoring is described in McKown and colleagues (2016).
Total testing time was approximately 45 min. Kindergarten and first-grade students generally completed SELweb in two separate sessions of approximately equal length, typically within 1 week. Students in second and third grade nearly always completed the assessment in a single session. In one district, SELweb was administered in fall (October and November) and spring (April and May), and data from that district were used to estimate temporal stability. Mean time between administrations was 176 days (SD = 47.7 days).
Measures
SELweb modules, described below, were designed to assess emotion recognition, social perspective-taking, social problem-solving, and self-control. SELweb’s modules and item scoring rules are summarized in Table 2. Item scores were summed across items within each module.
Description of SELweb Modules, Questions, and Item Scoring.
Emotion recognition
Six photographs of child faces with neutral facial expressions, including three girls, three boys, and two ethnic minorities, one Black girl and one mixed-race boy, were used to create the emotion module. The photographs were digitized with FaceGen software (Singular Inversions, 2005). FaceGen was then used to digitally manipulate the face images to produce emotion displays of happy, sad, angry, and frightened. For each face and emotion, 10 faces were created ranging from low- to high-intensity affect displays. From this item pool, five different test forms were created, each with 40 items. Faces were assigned to test forms to ensure a balance of emotions, intensities, and child faces within a given form. Sixteen to 20 items on each test form were included on more than one form. During SELweb administration, after each face was presented, children clicked to indicate whether the face reflected happy, sad, angry, scared, or just okay.
Social perspective-taking
My colleagues and I created 12 illustrated and narrated vignettes, six of which assessed false belief understanding and six of which assessed children’s ability to distinguish between what a speaker appears to say and what they really mean, as when they are sarcastic, lying, or hiding their feelings. After each story, children were asked a question whose correct answer required an accurate inference about the story character’s mental state.
Social problem-solving
We created 10 illustrated and narrated vignettes, five involving ambiguous provocation and five involving peer entry. After each vignette, children selected (a) the extent to which they felt a story character was hostile (not at all, a little, or a lot); (b) how they wanted things to turn out, indexing social goals (prosocial or retribution); and (c) what they would do (with a choice of actions that reflected competence, asking for help, ignoring, or walking away). We created five test forms with six vignettes each. Each form included three ambiguous provocation vignettes and three peer entry vignettes. Each vignette was included on three forms.
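The form constraints above (five forms, six vignettes per form, three of each type per form, each vignette on three forms) are mutually consistent, since 5 × 6 = 10 × 3 = 30 vignette placements. The source does not describe how vignettes were actually assigned to forms; the sketch below shows one hypothetical cyclic assignment that satisfies all of the stated constraints. The vignette labels and the `cyclic_forms` helper are illustrative assumptions.

```python
from collections import Counter

N_FORMS = 5
PER_TYPE_PER_FORM = 3  # three provocation and three peer-entry vignettes per form

# Hypothetical vignette labels; the source does not name the vignettes.
provocation = [f"provocation_{i}" for i in range(1, 6)]
peer_entry = [f"peer_entry_{i}" for i in range(1, 6)]


def cyclic_forms(vignettes, n_forms, per_form):
    """Give each form a sliding window of vignettes that wraps around the list."""
    n = len(vignettes)
    return [
        [vignettes[(form + offset) % n] for offset in range(per_form)]
        for form in range(n_forms)
    ]


# Pair the provocation and peer-entry windows to build each six-vignette form.
forms = [
    prov + peer
    for prov, peer in zip(
        cyclic_forms(provocation, N_FORMS, PER_TYPE_PER_FORM),
        cyclic_forms(peer_entry, N_FORMS, PER_TYPE_PER_FORM),
    )
]

# Count how many forms each vignette appears on.
usage = Counter(vignette for form in forms for vignette in form)
```

Any assignment with the same counts would serve equally well; the cyclic layout is used here only because it makes the balance easy to verify.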
Self-control
We developed a choice-delay task (Kuntsi, Stevenson, Oosterlaan, & Sonuga-Barke, 2001) and a frustration-tolerance task (Bitsakou, Antrop, Wiersema, & Sonuga-Barke, 2006) described in Table 2. For the choice-delay task, children earned points for clicking on one of three animated rockets that would then travel to a planet. The slower and therefore more tedious the rocket ship, the more points were awarded. Children first completed a training phase in which the narrator explained the point value of each rocket, and children selected each rocket to understand each rocket’s speed and point value. Children completed 10 trials.
For the frustration-tolerance task, children were told to identify whether or not pairs of shapes presented one after the other were identical and to get as many correct as possible in 90 s. For several items, the computer was programmed to get “stuck,” thereby inducing mild frustration: during these periods, clicking on the response buttons produced no change on the screen. In early work, the number of items correct was most strongly associated with other indicators of self-control. As a result, I use this score as the indicator of frustration tolerance.
Results
Descriptive statistics, correlational analyses, and reliabilities were calculated using SPSS version 19.0 (IBM, 2010). CFAs, including multigroup analyses used to evaluate measurement noninvariance, were conducted with Amos version 17.0 (Arbuckle, 2008).
Normality Assumption
To test multivariate normality, skewness and kurtosis were computed for all 11 observed scores, as summarized in Table 3. For nine out of 11 scores, the absolute value of the skewness was less than 2. Kurtosis of nine out of 11 variables had an absolute value less than 3. To evaluate the impact of these deviations from normality, a Monte Carlo simulation with 200 bootstrap samples was run. For this analysis, Amos drew random, and therefore normally distributed, samples with the same means, variances, and covariances as the observed data. The distribution of parameter estimates from the simulated data was compared with parameter estimates from the observed data. There were no statistically significant differences between parameters estimated from bootstrap samples and those estimated from observed data. This suggests that deviations from normality in this sample did not have a meaningful impact on parameter estimates.
Correlations Between Variables in the Model and Descriptive Statistics.
Note. All correlations significant at p < .001. R-A = real–apparent emotions; F-B = false beliefs; PA = social problem-solving positive attributions; Goal = social problem-solving goal preference; Solution = social problem-solving solution preference; Delay = delay of gratification; Frust = frustration tolerance.
Missing Data
Complete data were available for all SELweb data except for social perspective-taking. Twenty-nine of 4,419 children did not have social perspective-taking scores (0.66%). Simulation studies with small amounts of missing data (<2% of cases) have found that mean substitution is equivalent to more complex procedures (McCartney, Burchinal, & Bub, 2006). Therefore, mean substitution was used to impute perspective-taking values for those 29 cases.
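As a minimal sketch of the mean substitution described above, the snippet below recomputes the reported share of missing cases from the counts in the text and fills missing values with the mean of the observed values. The `mean_substitute` helper and the toy scores are illustrative assumptions, not the study's data or code.

```python
from statistics import mean


def mean_substitute(scores):
    """Replace missing scores (None) with the mean of the observed scores."""
    observed = [s for s in scores if s is not None]
    fill = mean(observed)
    return [fill if s is None else s for s in scores]


# Share of cases with a missing perspective-taking score, from the counts in the text.
missing_pct = 100 * 29 / 4419

# Hypothetical toy scores, for illustration only.
toy = [4.0, None, 6.0, 8.0]
imputed = mean_substitute(toy)
```

Mean substitution leaves the variable's mean unchanged but slightly shrinks its variance, which is why it is usually reserved for very small amounts of missing data such as the 0.66% here.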
Descriptive Statistics and Correlations
Descriptive statistics and Pearson correlations between variables in the CFA models are summarized in Table 3.
Reliability
Internal consistency
The internal consistencies of observed scores and factor scores are summarized in Table 4. Internal consistencies of factor scores were estimated using procedures described by Nunnally and Bernstein (1994).
Score Reliabilities.
Note. ryy = internal consistency reliability; r12 = temporal stability reliability.
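The source does not state which internal consistency coefficient was computed for the module scores. As an illustration only, the sketch below computes coefficient alpha, one common index of internal consistency, from item-level scores. The `cronbach_alpha` function and the toy data are assumptions and are not taken from the SELweb analyses.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a list of items, each a list of scores (one per child)."""
    k = len(item_scores)
    # Total score for each child is the sum of that child's item scores.
    totals = [sum(child) for child in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))


# Hypothetical toy data: three items scored 0/1 for five children.
toy_items = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 0, 1, 1],
]
alpha = cronbach_alpha(toy_items)
```

Alpha rises as items covary more strongly relative to their individual variances, so a module whose items all track the same skill yields a value closer to 1.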
Six-month stability
Six-month measurement stability estimates are presented in Table 4. Because children were randomly assigned to emotion recognition and social problem-solving test forms, for those assessment modules, temporal stability estimates reflected a mix of alternate forms and test–retest reliability.
Factor Structure
Prior research has found a four-factor model fits the SELweb observed scores. Consistent with that research, in the present study, the fit of a four-factor model to the observed scores, as depicted in Figure 1, was excellent, χ2(38) = 364.3, p < .05, comparative fit index (CFI) = .97, root mean square error of approximation (RMSEA) = .044, 90% confidence interval (CI) = [.040, .048]. Accordingly, measurement invariance analyses reported below were based on this model.

Four-factor model of social-emotional comprehension.
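The reported degrees of freedom and RMSEA can be checked against the other reported quantities. The sketch below assumes a simple-structure model (each of the 11 observed scores loads on exactly one of four correlated factors, one loading per factor fixed for identification, uncorrelated residuals, no mean structure) and the common (N − 1) form of the RMSEA point estimate. Both function names are illustrative, and the source does not state which conventions Amos applied.

```python
from math import sqrt


def cfa_degrees_of_freedom(n_observed, n_factors):
    """df for a simple-structure CFA with correlated factors.

    Assumes each observed score loads on exactly one factor, one loading per
    factor is fixed to 1 for identification, residuals are uncorrelated, and
    means are not modeled.
    """
    moments = n_observed * (n_observed + 1) // 2  # unique variances and covariances
    free_loadings = n_observed - n_factors
    factor_vars_covs = n_factors * (n_factors + 1) // 2
    residual_vars = n_observed
    return moments - (free_loadings + factor_vars_covs + residual_vars)


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA using the (N - 1) convention."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


df = cfa_degrees_of_freedom(n_observed=11, n_factors=4)
fit = rmsea(chi_square=364.3, df=38, n=4419)
```

Under these assumptions the count gives 38 degrees of freedom and the point estimate rounds to .044, matching the values reported for the four-factor model.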
In this and all models tested below, the chi-square tests of model fit were statistically significant, indicating a lack of fit between the data and the model. However, the chi-square test of model fit is sensitive to sample size, even when the fit of the data to the model is excellent (Brannick, 1995; Ullman, 2006). As a result, model fit was evaluated with CFI and RMSEA, indicators of model fit that are less sensitive to sample size. The configural model was judged to be a good fit to the data when CFI ≥ .95 and RMSEA ≤ .06 (Dimitrov, 2010). Metric invariance models for each grouping were compared with the configural model, and scalar invariance models for each grouping were compared with their respective metric invariance model. Criteria for rejecting the null hypothesis of metric and scalar invariance included a decrease in model fit from the less restrictive model of ≥.01 in CFI or an increase of ≥.015 in RMSEA (Chen, 2007).
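The comparison rule described above can be expressed as a small function. This is a sketch of the stated cutoffs (a CFI decrease of at least .01 or an RMSEA increase of at least .015), not code from the study; the function name, argument names, and example fit values are hypothetical.

```python
def invariance_rejected(cfi_less, cfi_more, rmsea_less, rmsea_more,
                        cfi_cut=0.01, rmsea_cut=0.015):
    """Return True if the more restrictive model fits meaningfully worse.

    *_less: fit of the less restrictive model (e.g., configural).
    *_more: fit of the more restrictive model (e.g., metric).
    Differences are rounded to avoid floating-point noise at the cutoffs.
    """
    cfi_drop = round(cfi_less - cfi_more, 6)
    rmsea_rise = round(rmsea_more - rmsea_less, 6)
    return cfi_drop >= cfi_cut or rmsea_rise >= rmsea_cut


# Hypothetical fit values, for illustration only.
small_change = invariance_rejected(0.970, 0.965, 0.040, 0.042)
large_change = invariance_rejected(0.970, 0.950, 0.040, 0.045)
```

Because the two criteria are joined by "or", a model can be flagged on CFI alone even when the change in RMSEA stays under its cutoff.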
Measurement Invariance
Measurement invariance for each grouping variable (time, sex, and ethnicity) was tested by conducting multigroup CFA analyses and testing a series of nested models with increasingly stringent equality constraints (Vandenberg & Lance, 2000) on the four-factor model. Ethnicity invariance analyses compared model fit between children identified as “White,” “Black,” “Latino,” “Asian,” or “Other.”
Configural invariance
To test configural invariance, no equality constraints were imposed between groups on the four-factor model. A good fit with the unconstrained model supports configural invariance, or the equivalence of the factor structure across groups. As summarized in Table 5, across all configural invariance analyses, the models demonstrated an excellent fit to the data, with all CFI > .95 and all RMSEA < .05. In other words, for children who took SELweb twice, for boys and girls, and for children from different ethnic groups, the four-factor model fits all groups equally well.
Measurement Invariance Fit Statistics for Four-Factor Model.
Note. RMSEA = root mean square error of approximation; CFI = comparative fit index.
Freed delay of gratification, false belief, and reality appearance.
Freed delay of gratification and goal preference.
Freed all false belief, White and Black reality appearance, Black, Hispanic, and other positive attribution; Black and Hispanic goal preference; Black and Hispanic solution preference; Black and Hispanic frustration tolerance; White, Black, and Asian delay of gratification.
p < .05.
Metric invariance
To test metric invariance, or the equivalence of the factor loadings across groups, equality constraints between groups were imposed on factor loadings. Imposing equality constraints on factor loadings across time, sex, and ethnicity led to a reduction in CFI < .01 and an increase in RMSEA < .002. This provided evidence of metric invariance across time, sex, and ethnicity. This means that across time, sex, and ethnicity, a one-unit change in latent variable is reflected by the same change in the observed scores.
Scalar invariance
To test scalar invariance, or the equivalence of factor intercepts across groups, additional equality constraints across groups were imposed on latent intercepts. With time and sex, the increase in RMSEA was <.015, consistent with a conclusion of scalar invariance, and the overall fit of the model remained excellent. However, across time and sex, the reduction in CFI for the scalar model was >.01. The reductions in model CFI were .018 for time and .020 for sex.
Next, I freed the fewest equality constraints in the latent intercept to achieve a reduction in CFI < .01. To that end, I inspected modification indices to identify sources of scalar noninvariance. The key modification index was the expected drop in chi-square when freely estimating a given parameter. The parameter with the greatest modification index was freed first, and the model was rerun. This process was repeated until the discrepancy in CFI between the modified scalar model and the metric invariance model was reduced to <.01.
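The sequential procedure above can be sketched as a loop. In the code below, `fit` is a hypothetical stand-in for refitting the multigroup model in SEM software; it takes the set of intercepts still constrained and returns the model CFI plus a modification index for each constrained intercept. The toy fit function, its penalties, and the intercept names are invented for illustration and do not reproduce the study's results.

```python
def free_until_equivalent(constrained, fit, metric_cfi, max_drop=0.01):
    """Free intercept constraints one at a time, largest modification index first,
    until the CFI drop from the metric model falls below max_drop."""
    remaining = set(constrained)
    freed = []
    cfi, mod_indices = fit(frozenset(remaining))
    while round(metric_cfi - cfi, 6) >= max_drop and remaining:
        worst = max(remaining, key=lambda name: mod_indices[name])
        remaining.remove(worst)
        freed.append(worst)
        cfi, mod_indices = fit(frozenset(remaining))  # rerun the model
    return freed, cfi


# Hypothetical toy model: each constrained intercept costs a fixed amount of CFI.
PENALTY = {"delay": 0.008, "false_belief": 0.006,
           "reality_appearance": 0.004, "goal": 0.001}
METRIC_CFI = 0.970


def toy_fit(constrained):
    cfi = METRIC_CFI - sum(PENALTY[name] for name in constrained)
    mod_indices = {name: 1000 * PENALTY[name] for name in constrained}
    return cfi, mod_indices


freed, final_cfi = free_until_equivalent(PENALTY, toy_fit, METRIC_CFI)
```

In this toy run the loop frees two intercepts and then stops, because the remaining CFI drop is below the cutoff. With real data each refit changes the remaining modification indices, which is why the model is rerun after every freed parameter.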
For time, when the delay of gratification, false belief, and reality-appearance intercepts were freed, the fit of the model was restored to equivalence with the metric invariance model, with a decline in model fit from the metric invariance model to the modified scalar model of .005 and .010 for RMSEA and CFI, respectively. For sex, when the delay of gratification task and goal preference intercept equality constraints were freely estimated, the fit of the model was restored to equivalence with the metric invariance model, with a decline in model fit from the metric invariance model to the modified scalar model of .003 and .010 for RMSEA and CFI, respectively. In the case of time and sex, SELweb therefore demonstrates partial scalar invariance.
The scalar invariance model for ethnicity resulted in a reduction in CFI of .069 to .90 and an increase in RMSEA of .012 to .032, meaning that for CFI but not RMSEA, imposing equality constraints across ethnicity on the latent intercepts resulted in a substantially poorer fit of the model to data than the metric invariance model. Sequentially freeing 14 of the 44 intercept equality constraints resulted in an improvement in model fit to CFI = .95 and RMSEA = .024, reducing the increase in RMSEA from the metric model to the modified scalar model to .004, and restoring the CFI to a value that reflects an excellent overall fit (Dimitrov, 2010). Thus, for ethnicity, a scalar invariance model in which 14 equality constraints were freely estimated partially met the criteria for scalar invariance (ΔRMSEA < .01). However, the CFI of this modified scalar invariance model was still .026 lower than the CFI from the metric model. Freeing an additional five equality constraints improved model fit to CFI = .955 and RMSEA = .022, reflecting a decline in model fit of .001 and .010 for RMSEA and CFI, respectively.
Because analyses found only partial support for the scalar invariance, more restrictive models, including strict invariance models, were not tested.
Discussion
Summary and Interpretation
The goals of this study were to evaluate key measurement properties of SELweb. A first goal of this study was to evaluate evidence of score reliability, based on evidence of internal consistency and temporal stability, and validity, based on evidence from the factor structure of the scores. Specific hypotheses related to this goal were based on prior field trials of SELweb and similar assessments (McKown et al., 2013, 2016).
In terms of reliability, consistent with prior findings, I hypothesized that the internal consistency reliability and temporal stability reliability of module scores would be moderate, and that factor score reliabilities derived from multiple correlated indicators would be substantially higher. Findings from the present study support these hypotheses. In general, score reliabilities—both internal consistency and temporal stability—at the level of the factor score were sufficiently high for the purposes of understanding student strengths and weaknesses in the areas assessed. In contrast, score reliability for the scores that were used as the indicator variables in the construction of those factor scores was low enough that they should be interpreted with caution when using SELweb to understand individual student strengths and needs.
Like the internal consistency reliabilities of the module scores, temporal stability coefficients were modest, raising important questions about SELweb’s utility for detecting change over time. It is reassuring that across the 6-month test–retest interval, SELweb observed scores improved by an average of .25 standard deviations, and the change in all observed scores was statistically significant. This suggests that SELweb scores are sensitive to the normative changes in social-emotional skill that unfold over the course of a school year. Because temporal stability estimates were taken over a 6-month interval, rather than the traditional 2-week interval, test–retest reliability statistics computed for this study should be interpreted as a lower-limit estimate. In the future, it will be important to obtain test–retest reliability estimates over a shorter interval.
In terms of factor structure, consistent with prior findings, I hypothesized that the key indicator scores derived from SELweb would fit a four-factor model that includes latent variables reflecting emotion recognition, social perspective-taking, social problem-solving, and self-control. A confirmatory model supported this hypothesis. The overall fit of the model to the data was excellent, and factor loadings were consistently robust. In addition, covariances were generally moderate, suggesting that the latent variables in the model are related but partially distinct.
One exception was the high covariance between the self-control and perspective-taking latent variables. This suggests that although social perspective-taking and self-control are conceptually distinct, they share a common underlying feature. Perhaps, for example, both reflect metacognitive skill. Nevertheless, because they are conceptually distinct constructs, I have modeled them as separate latent variables. Future research should investigate the nature of the common variance between these two seemingly different constructs that nevertheless covary highly.
A second goal of this study was to evaluate the measurement equivalence of SELweb across time, sex, and ethnicity. Specifically, this study evaluated the extent to which SELweb scores reflect the same underlying constructs on the same scale, with the same relationship between underlying skill level and observed score.
Analyses supported SELweb’s configural invariance across time, sex, and ethnicity, meaning that observed scores reflect the same underlying factor structure across these ways of grouping respondents. Analyses also supported SELweb’s metric invariance across time, sex, and ethnicity. This means that a one-unit change in the latent variable is associated with the same change on the observed scores across time, sex, and ethnicity. Furthermore, across time and sex, the data supported partial scalar invariance. Analyses provided less support for scalar invariance by ethnicity. This means that to a large degree for time and sex, and to a lesser degree for ethnicity, at a given level of skill on the latent variable, children from different groups achieve the same score on the observed variables.
These findings suggest that (a) for all groups, SELweb scores reflect the same underlying constructs (configural invariance); (b) for all groups, a change in skill level is reflected by the same change in observed scores (metric invariance); (c) for sex and repeated measurement, a given latent variable score corresponds to similar observed scores; and (d) for ethnicity, a given latent variable score may correspond to different observed scores. As a result, interpretation of mean differences between children from different ethnic groups on SELweb observed scores or composites should be made with caution.
An important area for future work on SELweb will be to identify and address the sources of ethnic noninvariance. Possibilities include ethnic differences in levels of engagement with the assessments, in effort applied to answering questions, and in interpretations of the meaning of assessment content. Until sources of noninvariance are identified and addressed, it will be important for users of SELweb to interpret mean differences between members of different ethnic groups with caution. Applying separate norms for each ethnic group may be an appropriate remedy for some applications.
Significance and Future Directions
This study extends prior theory and research on children’s social-emotional comprehension. Much of the existing theory and empirical research focuses on a single social-emotional skill area, including emotion recognition (Nowicki & Duke, 1994), theory of mind (Wellman & Liu, 2004), social problem-solving (Crick & Dodge, 1994), and self-control (Duckworth, 2011). The present study’s findings are consistent with prior research in each of these areas suggesting that these skill areas are measurable dimensions of social-emotional comprehension that are correlated but partially distinct.
The present work builds on these largely separate lines of work by integrating important social-emotional skills into one conceptual framework, which was in turn used to design the SELweb modules. Findings from this study are consistent with multicomponent models of SEL that describe the processes by which children encode, interpret, and reason about social and emotional information (Collaborative for Academic, Social, and Emotional Learning, 2017; Crick & Dodge, 1994; Lipton & Nowicki, 2009; Salovey & Mayer, 1990). There is considerable common ground between the findings of this article and each of those models, and between the models themselves. Nevertheless, each emphasizes some social-emotional processes more than others. An important future direction for the field is, therefore, to clarify the commonalities and distinctions between models of SEL.
This work suggests important next steps in the practical application of SELweb and assessments like it. SELweb assesses social-emotional skills that are commonly taught in evidence-based SEL programs (Durlak, Weissberg, Dymnicki, Taylor, & Schellinger, 2011; Jones et al., 2017). Ongoing field trials of SEL interventions are using SELweb as an outcome measure, and if SELweb is sensitive to intervention effects, it may be a useful program evaluation tool. Future work should also examine SELweb’s usefulness as a formative tool by which educators can understand their students’ strengths and needs and use that information to guide instruction and investment in programs.
The ultimate goal of SELweb, and assessments like it, is to inform instruction and intervention planning. In fact, SELweb assesses dimensions of SEL that are commonly addressed in evidence-based SEL curricula and clinical interventions. Ideally, then, teachers and other professionals will be able to use SELweb to guide instruction or intervention planning. The findings of this research and other studies of SELweb’s psychometric properties suggest that it has many of the technical properties of just such an assessment. Further work to increase score reliability, eliminate sources of scalar noninvariance, and evaluate sensitivity to treatment effects will provide important additional information about SELweb’s practical usefulness.
Footnotes
Author’s Note
The opinions expressed are those of the author and do not represent views of the Institute of Education Sciences or the U.S. Department of Education.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Clark McKown has financial interests in xSEL Labs, Inc. which could potentially benefit from the outcomes of this research.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research reported here was supported by the Institute of Education Sciences through Grant R305A110143 to Rush University Medical Center.
