Abstract
This study examined the internal structure of the Scales for Assessing Emotional Disturbance-3 Rating Scale (RS), a teacher-completed RS developed to measure emotional disturbance (ED). As defined in U.S. law and regulations, ED involves five characteristics or patterns of behavioral and emotional maladaptation. RS data obtained on a sample of students with ED were used to examine validity evidence based on the internal structure of the assessment. Of particular interest was the extent to which multivariate factors derived from the RS data conform to the five characteristics of ED stated in the definition. Results indicated that the RS data fit a 5-factor model reasonably well. A subsequent bifactor analysis identified a considerable proportion of common variance across factors, suggesting the presence of a strong general ED factor, two distinct group factors (Inability to Learn and Inappropriate Behavior), and three weak group factors. The findings provide evidence of the validity of the SAED-3 RS based on internal structure and support the use of the RS in contributing to the process of determining whether a student qualifies for the ED educational disability category. Implications for improved research on the nature of ED and how students with ED can be better served are discussed.
Research reports and government policy papers indicate that at least 20% of school-age children experience a mental health problem at some point before adulthood (e.g., Jaffee et al., 2005; Merikangas et al., 2010; National Research Council and Institute of Medicine [NRC and IoM], 2009; U.S. Department of Health and Human Services, 1999). Many of these children do not receive mental health services (Ghandour et al., 2019), and only a small number of children qualify for school services based on the guidelines stated in the Individuals with Disabilities Education Improvement Act (IDEA, 2004). Children who qualify for special education and related services under IDEA are those with a documented emotional or behavioral challenge that adversely affects their educational performance. Approximately one-half of 1% of students are school-identified with emotional disturbance (ED; Kauffman & Landrum, 2017; U.S. Department of Education, 2018). Unfortunately, students with ED are likely to exhibit poor educational, academic, social, and life outcomes (Bradley et al., 2008; Kauffman & Landrum, 2017; Wagner, Kutash, et al., 2005). The small number of students school-identified with ED, together with their poor educational and life outcomes, indicates that efforts need to be focused on identifying and serving this student population.
A key component of identifying and serving students with ED is for schools to implement evidence-based identification practices, using reliable and valid assessment procedures. Students whose emotional and behavioral challenges are not identified and served at an early point are at greater risk for academic and behavioral issues throughout school and poorer life outcomes (Costello et al., 2003; Essex et al., 2009). However, evidence-based assessment and intervention services for students with ED can prevent or reduce the short- and long-term effects of emotional and behavioral challenges (Conroy et al., 2004; Mrazek & Mrazek, 2005; NRC & IoM, 2009). Accordingly, experts have identified several significant considerations in selecting assessment instruments: primarily acceptable reliability and validity, but also the purpose of their intended use and their usefulness and acceptability to consumers (Glover & Albers, 2007; National Association of School Psychologists, 2016).
In regard to the ED category of special education, school professionals need to select reliable and valid measurement instruments that adhere to the federal definition of ED outlined in IDEA. The IDEA definition identifies ED as a condition that has been present for a period of time, is not present in typical student peers, and negatively affects the student’s educational performance. Five characteristics are listed as follows: (a) an inability to learn that cannot be explained by intellectual, sensory, or health factors; (b) an inability to build or maintain satisfactory interpersonal relationships with peers and teachers; (c) inappropriate types of behavior or feelings under normal circumstances; (d) a general pervasive mood of unhappiness or depression; and (e) a tendency to develop physical symptoms or fears associated with personal or school problems.
Over the past 50 years, the IDEA definition of ED has received significant criticism (Florell, 2018; Forness & Knitzer, 1992; Hanchon & Allen, 2013, 2018; Merrell & Walker, 2004; Skiba & Grizzle, 1992). The major objections are that the definition is dated, lacks empirical support, contains arbitrary exclusionary statements, and is vague and ambiguous. However, the fact that this definition originated five decades ago does not necessarily make it obsolete. Moreover, the federal definition stated in IDEA is to be followed by state and local education agencies in the identification of children to receive special education and related services for ED. Currently, a number of psychometrically sound tests exist to assess the emotional and behavioral challenges of school-age students (see Achenbach, 2009; Goodman, 2001; Reynolds & Kamphaus, 2015). However, few explicitly address the major components of the definition of ED as presented in IDEA. Thus, the inability to directly link the characteristics of ED as specified in IDEA to assessment tools makes identifying students with ED particularly challenging (e.g., Algozzine, 2017; Becker et al., 2011; Mitchell et al., 2019).
The Scales for Assessing Emotional Disturbance-3 (SAED-3; Epstein et al., 2020) is a suite of four assessment instruments and procedures specifically developed to aid school personnel in the identification of students with ED. The SAED-3 Rating Scale (RS) is a 45-item standardized, norm-referenced instrument developed to operationally define the five primary characteristics of ED (39 items): Inability to Learn, Relationship Problems, Inappropriate Behavior, Unhappiness or Depression, and Physical Symptoms or Fears. The SAED-3 RS also operationalizes the socially maladjusted concept found in the definition (6 items), as well as other parts of the IDEA definition of ED. The RS was designed to aid in the process of identifying students with ED.
The SAED RS was developed in several stages (see Epstein & Cullinan, 1998), all intended to identify teacher-rated items that, in combination, operationally define the five characteristics of ED as stated and implied by the IDEA definition of ED. First, the research literature on emotional and behavioral disorders, existing tests and RSs that measure children’s behavior, and criticisms of the federal definition were reviewed, which led to the development of 85 behavior problem items. Second, 30 teachers and school psychologists evaluated the 85 items in terms of relationship to ED, item readability, redundancy and overlap of items, and user acceptability, which resulted in the removal of 19 items. Third, the 66-item provisional scale was completed by 74 teachers on students with (n = 369) and without (n = 386) ED, and the analyses indicated that every one of the 66 items discriminated between the two student groups. Finally, the RS authors used exploratory factor analyses with the provisional scale data to help reduce the number of items while still producing a separate scale for each of the IDEA definition’s five characteristics of ED, plus socially maladjusted.
This process resulted in a final RS of 45 items, each contributing to one of the six RS subscales: Inability to Learn (8 items), Relationship Problems (6 items), Inappropriate Behavior (10 items), Unhappiness or Depression (7 items), Physical Symptoms or Fears (8 items), plus the supplemental scale, Socially Maladjusted (6 items). Studies have demonstrated that scores from previous versions of the SAED RS meet acceptable standards of reliability and validity (Cullinan et al., 2002; Epstein, Cullinan, et al., 2002; Epstein et al., 1999, 2020; Epstein, Nordness, et al., 2002). Several recent studies have presented evidence that scores from the SAED-3 RS demonstrated acceptable classification accuracy for students with and without ED (area under the curve [AUC] = .81), adequate discrimination between students with ED and students with learning disabilities (Lambert, Cullinan, et al., 2021), acceptable convergent and divergent validity with scores from the Behavior Assessment System for Children-3 and the Strengths and Difficulties Questionnaire (Epstein et al., 2020), and acceptable measurement invariance across groups of students from diverse backgrounds (Lambert, Martin, Epstein, & Cullinan, 2021; Lambert, Martin, Epstein, Cullinan, & Katsiyannis, 2021).
However, researchers have not yet examined the evidence of validity of the scores based on the internal structure of the SAED-3 RS, or the degree to which the hypothesized 5-factor structure (based on the federal definition of ED) is empirically supported. Specifically, researchers need to explore the dimensionality of the SAED-3 RS scores with a new, large sample of students with ED, and explore the extent to which the five characteristics of ED demonstrate discriminant validity. Thus, the purposes of the study were to conduct a confirmatory factor analysis (CFA) with students who were school-identified as ED to (a) determine the dimensionality of the SAED-3 RS scores, (b) measure the extent to which the constructs are interrelated, and (c) estimate score reliability based on CFA model parameters.
Method
Participants
Participants for this study were 491 students school-identified with ED who had an active Individualized Education Program. The sample was drawn from a larger normative study of the SAED-3. The sample was fairly geographically representative of the United States, with 34.2% of the students from the Northeast, 24.2% from the Midwest, 30.8% from the South, and 10.8% from the West. Participants were predominantly male (67.6%; n = 332). In terms of race and ethnicity, the sample was 55.2% white/non-Hispanic (n = 271), 28.3% black/non-Hispanic (n = 139), 5.5% multiracial/non-Hispanic (n = 27), 3.3% other race (e.g., Asian, American Indian; n = 16), and 7.7% Hispanic (n = 38). Students ranged in age from 5 to 18 years, with a mean age of 13.6 years (SD = 3.52). Data on school settings were unavailable for 12.4% of the sample (n = 61); of the remaining students, about 85% (n = 367) were enrolled in public schools. Fewer than 5% of the students were identified as English language learners (n = 16). None of the students were identified with any disabilities other than ED.
Measure
The SAED-3 RS is a standardized, norm-referenced 45-item measure. The RS has five basic subscales (Inability to Learn, Relationship Problems, Inappropriate Behavior, Unhappiness or Depression, and Physical Symptoms or Fears) and one supplemental subscale (Socially Maladjusted). Each RS item is rated on a 4-point Likert-type scale (0 = “not a problem,” 1 = “mild problem,” 2 = “considerable problem,” and 3 = “severe problem”) by a teacher who has been familiar with the student’s behavior for a minimum of 2 months. Items composing each of the subscales are summed to obtain a raw score, which is transformed to a scaled score. For this study, the items from the supplemental scale (Socially Maladjusted) were not included in the analysis because only the five “core” subscales align with the five characteristics of the federal definition of ED.
Data Collection
Data were collected as part of the SAED-3 re-norming process (Epstein et al., 2020). Prior to data collection, three university Institutional Review Boards (University of Nebraska-Lincoln, University of Northern Colorado, and Elon University) approved sample recruitment and data collection procedures. Data were collected from fall 2015 through spring 2018.
The test authors recruited educators from across the United States to represent the four major U.S. regions: Northeast, Midwest, South, and West. Teachers were contacted by one of the SAED-3 authors by mail, email, or telephone and asked to participate in the norming process. Teachers who agreed to participate were asked to complete the RS on all students on their class roster(s), or to select an unbiased sample of their students using the following simple procedure: First, decide how many students you wish to rate. Then, start at the top or bottom of your class roster and rate every child. Do not skip any student unless you have known this student for less than two months. Stop selecting and rating students when you have reached the number of students you wished to rate.
Each rater also provided demographic information for each student (e.g., grade level, age, race, ethnicity).
Data Analysis
Mplus v7.11 (Muthén & Muthén, 1998-2014) was used to fit a series of CFA models to evaluate the internal structure and reliability of scores from the SAED-3 RS. The primary focus of the analysis was to examine the fit of the 5-factor hypothetical model proposed by the developers (Epstein & Cullinan, 1998; Epstein et al., 2020) based on the five characteristics of ED stated in the IDEA definition. Coefficient omega (McDonald, 1999), which is an estimate of score reliability, was computed for the five factors based on the CFA model parameters. For comparison purposes (to examine the falsifiability of the hypothetical structure), two alternative factor models were also fit to the data: (a) a single-factor model and (b) a bifactor version of the 5-factor model including a general ED factor as well as the five group factors. The bifactor version of the 5-factor model was examined to obtain an additional perspective on scale structure by partitioning the item response variance into common sources (see Reise, 2012, for an in-depth description of bifactor models).
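As an illustration of how coefficient omega can be obtained from CFA parameters, the following is a minimal sketch, not the computation performed in Mplus for this study. It assumes a single factor with variance fixed at 1, standardized loadings, and uncorrelated residuals; the function name and the loading values are hypothetical and are not SAED-3 estimates.

```python
def coefficient_omega(loadings, residual_variances):
    """Model-based reliability of a unit-weighted sum score for one factor.

    Assumes the factor variance is fixed at 1 and item residuals are
    uncorrelated, so omega = (sum of loadings)^2 /
    ((sum of loadings)^2 + sum of residual variances).
    """
    true_part = sum(loadings) ** 2
    error_part = sum(residual_variances)
    return true_part / (true_part + error_part)


# Hypothetical standardized loadings for an 8-item subscale
# (illustrative values only, not estimates from the SAED-3 data).
example_loadings = [0.80] * 8
example_residuals = [1.0 - loading ** 2 for loading in example_loadings]

omega = coefficient_omega(example_loadings, example_residuals)
print(f"omega = {omega:.3f}")
```

Because the numerator squares the summed loadings, longer subscales with uniformly strong loadings yield higher omega values, which is one reason subscale length should be kept in mind when comparing reliability estimates across subscales.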
Bifactor models represent a structure with one general factor and several group factors, all orthogonal to each other (i.e., uncorrelated). Thus, the bifactor model allows each factor to directly influence the item responses independent of the other factors (DeMars, 2013) and can complement a correlated factors model “by evaluating whether item response variance is due to a general construct versus group factors” (Brouwer et al., 2013, p. 138). To assess the degree to which ratings are attributable to the general factor, we computed the explained common variance (ECV) index and the omega hierarchical estimate for the general factor. The ECV index is the proportion of common variance attributable to the general factor relative to the general and group factors combined; higher values indicate a greater proportion of variance attributable to the general factor. The interpretation of ECV depends on the percentage of uncontaminated correlations (PUC; Reise et al., 2013), that is, the percentage of correlations between items that are due only to the general factor. Rodriguez et al. (2016) suggest that scores are “essentially unidimensional” when ECV > .70 and PUC > .70. PUC for the SAED-3 bifactor model is .82; that is, about 82% of the correlations between RS items are due only to the general factor.
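The PUC value reported above can be reproduced from the subscale item counts alone, because PUC is the share of item pairs whose two items belong to different group factors. The sketch below is illustrative: the function names are hypothetical, and the ECV helper takes standardized bifactor loadings as input, which are not reported item by item in this article.

```python
from math import comb

# Item counts for the five core subscales, as reported in the article.
SUBSCALE_SIZES = {
    "Inability to Learn": 8,
    "Relationship Problems": 6,
    "Inappropriate Behavior": 10,
    "Unhappiness or Depression": 7,
    "Physical Symptoms or Fears": 8,
}


def percent_uncontaminated_correlations(group_sizes):
    """Share of item pairs whose items load on different group factors."""
    total_pairs = comb(sum(group_sizes), 2)
    within_group_pairs = sum(comb(size, 2) for size in group_sizes)
    return (total_pairs - within_group_pairs) / total_pairs


def explained_common_variance(general_loadings, group_loadings):
    """General-factor share of the common variance in a bifactor model.

    general_loadings: one standardized loading per item.
    group_loadings: one list of standardized loadings per group factor.
    """
    general = sum(loading ** 2 for loading in general_loadings)
    group = sum(loading ** 2 for factor in group_loadings for loading in factor)
    return general / (general + group)


puc = percent_uncontaminated_correlations(list(SUBSCALE_SIZES.values()))
print(round(puc, 2))  # 604 of 741 item pairs cross subscales -> 0.82
```

With 39 core items there are 741 item pairs, of which 137 fall within a single subscale; the remaining 604 pairs give the PUC of about .82 stated in the text.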
The omega hierarchical (ωH; McDonald, 1999) estimate provides information about how much raw score variance is attributable to the general factor or a specific group factor—that is to say, omega hierarchical represents the proportion of the observed score that reflects the latent factor variance (i.e., omega hierarchical is a model-based estimate of score reliability). When applied to the general factor, omega hierarchical provides information about the proportion of an overall scale score that is attributable to a single, common source of variance. When applied to a group factor, omega hierarchical provides information about the proportion of a subscale score that is attributable to a specific source of variance (accounting for the common variance).
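The two uses of omega hierarchical described above can be sketched as follows. This is a minimal illustration under stated assumptions (standardized loadings, orthogonal factors, each item loading on the general factor and exactly one group factor, uncorrelated residuals); the function names and all loading values are hypothetical and do not come from the SAED-3 analysis.

```python
def _residual_variance(general_loading, group_loading):
    # With standardized items and orthogonal factors, the residual variance
    # is what remains after the general and group contributions.
    return 1.0 - general_loading ** 2 - group_loading ** 2


def omega_h_general(items_by_group):
    """Proportion of total-score variance due to the general factor.

    items_by_group maps a group-factor name to a list of
    (general_loading, group_loading) pairs, one pair per item.
    """
    all_items = [pair for pairs in items_by_group.values() for pair in pairs]
    general_part = sum(g for g, _ in all_items) ** 2
    group_part = sum(
        sum(s for _, s in pairs) ** 2 for pairs in items_by_group.values()
    )
    residual_part = sum(_residual_variance(g, s) for g, s in all_items)
    return general_part / (general_part + group_part + residual_part)


def omega_h_subscale(pairs):
    """Proportion of one subscale's score variance due to its group factor."""
    general_part = sum(g for g, _ in pairs) ** 2
    group_part = sum(s for _, s in pairs) ** 2
    residual_part = sum(_residual_variance(g, s) for g, s in pairs)
    return group_part / (general_part + group_part + residual_part)


# Hypothetical loadings: subscale A keeps a sizable group loading,
# subscale B is dominated by the general factor.
hypothetical_items = {
    "A": [(0.6, 0.5)] * 3,
    "B": [(0.7, 0.2)] * 3,
}

print(round(omega_h_general(hypothetical_items), 2))
for name, pairs in hypothetical_items.items():
    print(name, round(omega_h_subscale(pairs), 2))
```

In this toy example the subscale whose items load weakly on their group factor retains little unique reliable variance once the general factor is accounted for, which is the pattern the subscale-level estimate is designed to reveal.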
As items were rated on a 4-point Likert-type scale, we treated the ratings as ordinal rather than continuous indicators. To estimate the models, we used weighted least squares with mean and variance adjustments (WLSMV), and the factors were scaled using a fixed mean and variance approach. All CFA models were specified without correlated residual variances between items because no item residuals were hypothesized to correlate. There were minimal missing data at the item level (<.005%), and cases with missing data were included using a pairwise-present approach, as is the default in Mplus when using the WLSMV estimator.
The indicators used to assess goodness-of-fit were the comparative fit index (CFI; Bentler, 1990), the Tucker–Lewis index (TLI), and the root mean square error of approximation (RMSEA; Steiger & Lind, 1980) with its 90% confidence interval. CFI and TLI are comparative fit indices representing the degree of improvement over the worst fitting model (Boomsma, 2000). Both indices typically range from 0 to 1, with values closer to 1 indicating better fit. A close fitting model has CFI and TLI values greater than or equal to 0.95 (Hu & Bentler, 1999), whereas earlier research suggested that an acceptable fitting model has values greater than or equal to 0.90 (Browne & Cudeck, 1993). RMSEA represents the degree of model misfit and is reported on a scale of 0 to 1; values closer to zero indicate better fit, with values close to .06 indicating close fit while balancing Type I and Type II error rates (Hu & Bentler, 1999). Values less than .08 indicate acceptable fit, while values over .10 indicate very poor fit (MacCallum et al., 1996). In addition to examining the point estimate, the 90% confidence interval was also used to evaluate misfit, with upper limits lower than .08 representing acceptable fit. The chi-square difference test (Δχ2) and CFI differences (ΔCFI) were computed to evaluate the fit of nested models (e.g., the 1-factor versus the 5-factor model). Nonsignificant chi-square difference tests and/or differences in CFI less than .01 indicate that the fit of the two models being compared is statistically equivalent.
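The cutoffs described above can be collected into a simple decision rule. The sketch below is only a convenience for reading fit tables: treating the conventions as hard thresholds is a simplification, the function names are hypothetical, and the category labels are assumptions chosen for illustration.

```python
def classify_fit(cfi, tli, rmsea, rmsea_upper):
    """Label model fit using the conventional cutoffs cited in the text.

    rmsea_upper is the upper limit of the 90% confidence interval.
    """
    if cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.06:
        return "close"
    if cfi >= 0.90 and tli >= 0.90 and rmsea < 0.08 and rmsea_upper < 0.08:
        return "acceptable"
    return "unacceptable"


def cfi_practically_equivalent(delta_cfi, threshold=0.01):
    """True when two nested models differ in CFI by less than the threshold."""
    return abs(delta_cfi) < threshold


# Single-factor model values as reported in the Results section.
print(classify_fit(cfi=0.753, tli=0.739, rmsea=0.135, rmsea_upper=0.138))

# CFI differences as reported in the Results section.
print(cfi_practically_equivalent(0.17))   # single-factor vs. 5-factor
print(cfi_practically_equivalent(0.004))  # 5-factor vs. bifactor
```

A rule of this kind does not replace the chi-square difference test, which for the WLSMV estimator requires a scaled procedure rather than a simple subtraction of chi-square values.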
Results
Table 1 reports the goodness-of-fit indicators for the three CFA models. The single-factor model did not exhibit acceptable fit (CFI = .753, TLI = .739, RMSEA = .135 [.132, .138]). Therefore, the SAED-3 RS data are not strictly unidimensional. The 5-factor model demonstrated acceptable fit (CFI ≥ .90, TLI ≥ .90, RMSEA < .08), with the upper limit of the RMSEA confidence interval also within the acceptable range (<.08). The 5-factor model demonstrated a statistically significant improvement in fit over the single-factor model, Δχ2(10) = 929.99, p < .001; ΔCFI = .17. The five latent factors were moderately to highly correlated (see Table 2); the correlations ranged from .41 (Inappropriate Behavior with Unhappiness or Depression) to .85 (Unhappiness or Depression with Physical Symptoms or Fears). The high correlation between Unhappiness or Depression and Physical Symptoms or Fears suggests poor discriminant validity between the two scores, which limits the interpretability and implications of the scores (Brown, 2006). Omega reliability estimates for the five subscale scores ranged from .918 (Physical Symptoms or Fears) to .965 (Inappropriate Behavior), indicating high levels of reliability for each subscale score.
Table 1. CFA Goodness-of-Fit Indices.
Note. CFA = confirmatory factor analysis; CFI = comparative fit index; TLI = Tucker–Lewis index; RMSEA = root mean square error of approximation; CI = confidence interval.
p < .001.
Table 2. Correlations Between Latent Factors.
Note. All correlations were statistically significant at the .001 level.
The bifactor version of the correlated factors model fit the data acceptably (CFI ≥ .90, TLI ≥ .90, RMSEA < .08), but did not fit the data appreciably more closely than the correlated factors model, as indicated by the minor difference in CFI of .004 (see Table 1). Note that the chi-square difference test could not be computed due to a singular matrix, so the test is not reported in Table 1. The bifactor model suggests that there is a relatively strong general ED factor, distinct Inability to Learn and Inappropriate Behavior factors, and weak Relationship Problems, Unhappiness or Depression, and Physical Symptoms or Fears factors. Using bifactor model estimates, 57% of the common variance was explained by the general ED factor, 17% was attributable to the Inappropriate Behavior factor, 12% to the Inability to Learn factor, and less than 5% to each of the other three factors.
Omega hierarchical, which represents the proportion of observed score variance attributable to the true score variance of each factor, was .84 for the general ED factor. For the group factors (e.g., Inability to Learn, Inappropriate Behavior), omega hierarchical represents the reliability of subscale scores after accounting for true score variance attributable to the general factor. Omega hierarchical was .61 for the Inappropriate Behavior factor, .55 for the Inability to Learn factor, .25 for the Physical Symptoms or Fears factor, .24 for the Relationship Problems factor, and .21 for the Unhappiness or Depression factor.
Discussion
During the initial SAED development process, item content was generated to align with the federal definition, and exploratory factor analysis was used to assist in creating scales that correspond to the IDEA definition’s five characteristics of ED. The developers used data from a nationally representative sample of students with and without ED (Epstein & Cullinan, 1998). To our knowledge, this is the first study to examine the internal structure of the SAED-3 RS in a confirmatory framework with a sample of students with ED. Evidence of the internal structure is the basis for score interpretation and score reliability estimation. The findings of this study provide moderate empirical support for the validity of the 5-factor internal structure of the SAED-3 RS. The 5-factor model fit the data acceptably well. Specifically, the 5-factor model fit significantly better than a single-factor model and met acceptable thresholds for the indicators (i.e., CFI, TLI, and RMSEA) that assess goodness of fit. However, the high correlations among the Relationship Problems, Unhappiness or Depression, and Physical Symptoms or Fears factors raise questions about the uniqueness of these factors and the utility of interpreting these dimensions separately.
Although the major goal of this study was to examine the structural validity of the 5-factor model aligned with the federal definition of ED, for purposes of comparison, a bifactor solution was also examined. While the bifactor model fit the data acceptably, this model did not fit the data appreciably better than the 5-factor model. The bifactor model does, however, provide additional insight into the meaningfulness and interpretation of the factors. In the bifactor model, ECV and omega hierarchical indicated that Inability to Learn and Inappropriate Behavior were distinct, unique factors; however, limited reliable variance beyond that attributable to the general ED factor existed for the other three factors (Relationship Problems, Unhappiness or Depression, and Physical Symptoms or Fears). As such, the interpretation of these subscale scores as unique constructs may be limited.
As noted, the overriding objective in the design of the RS was to create measurable scales corresponding to the five ED characteristics (Epstein & Cullinan, 1998). The creation of those scales was strongly guided by exploratory factor analysis but was not based entirely on those results. In contrast, the present findings are based entirely on CFA of data collected from a different sample of students with ED and their teachers. In consideration of these differences, it is not surprising that some RS subscales received limited support as unique factors.
While several studies have demonstrated that scores from the RS exhibit acceptable psychometric characteristics including internal consistency, convergent validity, and interrater and test–retest reliability, for the most part, this evidence was demonstrated during the development and norming of the original SAED (see Epstein & Cullinan, 1998). The findings from this study afford additional psychometric support for the SAED-3 RS scores by demonstrating essential evidence of the internal structure of the assessment that aligns with the federal definition of ED. This study also provides more rigorous evidence of score reliability for the subscales, which indicated that the variances for the Inability to Learn and Inappropriate Behavior subscale scores are reliable and largely unique, while the variances of the other three subscale scores are somewhat less reliable due to a common source of variance. In studies of other forms of reliability (e.g., Epstein et al., 2020), all five characteristic scales have shown statistically significant, medium to large degrees of reliability.
Limitations
A number of limitations of this research should be noted. First, the national sample of students with ED was not randomly selected. School personnel were contacted and asked to volunteer to provide data. Thus, the sample included teachers who volunteered to participate and completed rating forms on the students with whom they work. Consequently, this sample does not include data about students with ED whose teachers chose not to provide data. Responses of nonparticipating teachers might be different from those of teachers who participated, so this consideration may have introduced bias to the findings. Furthermore, limited data were collected on teacher characteristics, and those data were collected at the student level, which made describing teacher characteristics untenable for this study.
Second, goodness-of-fit indexes for the correlated factors model and the bifactor model represented acceptable fit to the data, but did not indicate “close” fit to the data. Perhaps the heterogeneity of the population of students with ED underlies the relative misfit of the CFA models. In addition, only three CFA models were evaluated in this study; however, there are other potential alternative models that were not tested which may fit the data well. Furthermore, latent variable modeling approaches such as CFA are sample dependent, and therefore, the extent to which the findings generalize to the broader population of students with ED is not entirely clear.
Third, Hispanic students school-identified with ED were underrepresented. In the sample, Hispanic students represented 7.7% of students with ED, but in IDEA data, Hispanic students account for approximately 18% of students with ED (U.S. Department of Education, 2018). The generalizability of the present findings may be limited because of the underrepresentation of Hispanic students.
In addition, not all teachers rated the same number of students with ED: some rated one student, while some rated all students with ED on their class rolls. This situation calls for a multilevel analysis where students are nested within teachers, but because of Institutional Review Board (IRB) data collection requirements, it was not feasible to identify which teachers rated each student. Therefore, single-level analysis models were used as if each student were rated by a different teacher. We recommend that in future replications, researchers take steps to allow for nested data analyses.
Future Research
The stated limitations suggest needed lines of research. First, it may be possible to address the participating versus nonparticipating teacher issue by making a substantial effort to persuade the teachers that the value of all teachers participating is so great that the result will be worth their investment of time and effort. The likelihood of such an effort’s success would probably be greater with smaller sample sizes. Second, future studies should attempt to replicate the factor models evaluated by this study as well as evaluate other theoretically justified factor structures. Such replications and extensions would provide useful information about the generalizability of the findings to the larger population of students with ED. Third, future studies should investigate the characteristics of ED among Hispanic students, especially involving groups of students with ED in which the proportion of Hispanic students matches that in national data.
A fourth direction is that the SAED-3 RS could be used in longitudinal studies to examine the stability and sensitivity to change of teacher ratings. The long-term stability and the sensitivity to change of SAED-3 RS ratings are important issues for school personnel who are considering using the assessment in planning and measuring outcomes of intervention services. It would make little sense to plan services or supports for students with ED around a personal variable that would change markedly over time in the absence of services or to evaluate the effect of an intervention using a variable that is not sensitive to change. Furthermore, other researchers should evaluate the predictive validity of the RS scores with respect to student social and emotional functioning, school behavioral outcomes (e.g., office referrals, suspensions, expulsions), and academic outcomes (e.g., grades, graduation rates, standardized test performance). In addition, researchers need to examine the utility of the SAED-3 RS as a specialized teacher RS measure within a comprehensive school approach to accurately identifying students with ED. Specifically, investigators should examine the feasibility and acceptability of the SAED-3 RS by teachers and school administrators in using the instrument as part of a school assessment model.
Moreover, additional studies are needed to further investigate other aspects of the validity of the SAED-3 RS. For example, its predictive validity could be examined by measuring the RS subscale scores of many students who are under consideration for identification as ED, while making those RS data unavailable to the decision-making teams. The RS would show predictive validity to the extent that RS subscale scores are correlated with the decision to identify as ED. The utility of the SAED-3 RS should be studied by examining how much its results increase the accuracy and/or efficiency of a decision to identify a student as ED, and facilitate the process of communicating the identification decision to teachers, parents, and other consumers.
The SAED-3 RS was not designed to measure the extent of improvement shown by a student with ED, but it may have value as such an indicator. Its value in that role could be studied by comparing data collected two or more times from both the RS and some other indicator of student emotional and behavioral functioning (e.g., recording of several target behaviors).
Implications
The findings of this study contribute to the existing literature base on the SAED-3 RS and provide evidence that the RS yields scores that are psychometrically sound and assess the emotional and behavioral problems of students who are school-identified as ED. Previous research demonstrated that the SAED-3 RS scores reached acceptable levels of reliability (i.e., internal consistency, inter-rater reliability, and test–retest reliability) and validity (i.e., content description, criterion prediction, construct identification, test bias, and diagnostic accuracy) (Epstein et al., 2020). In this study, the SAED-3 RS appeared to measure five key characteristics of students school-identified as ED. However, three of the five subscale scores (Relationship Problems, Unhappiness or Depression, and Physical Symptoms or Fears) lack strong discriminant validity and may need to be interpreted with that in mind.
For teachers and other school personnel, the primary use of the SAED-3 RS is in identifying students with ED who may be eligible to receive special education and related services under IDEA. We emphasize that no one instrument can be used to identify students as ED, but the compelling evidence of reliability and validity of the SAED-3 RS scores indicates that the RS is well positioned to play a significant role in the identification of students with ED. For state and local school administrators, evidence of internal structure of the SAED-3 RS scores indicates that this assessment procedure can help operationalize the federal definition of ED and help meet the IDEA mandates. For researchers interested in studying school-identified students with ED, the SAED-3 RS provides a psychometrically sound measure for studying important issues related to the education of students with ED, especially clarifying the characteristics of this student population, potentially determining individual and group responses to selected school-based interventions, and measuring the long-term outcomes of educational interventions.
Footnotes
Declaration of Conflicting Interests
It is important to note that the second and third authors are developers of the SAED-3, and receive royalties from sales of the assessment. The fourth author is currently employed by the SAED-3 publisher, and therefore has an indirect financial interest in the assessment. It is also important to note that the data were analyzed and interpreted by the lead author independently from the other authors. The lead author has neither a financial interest related to the SAED-3 nor any other conflict of interest related to this study.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
