Abstract
Detecting psychological distress among international students can be challenging given diverse languages, cultural backgrounds, and lack of refined measurement properties of measures tailored to international students. Despite the challenges, ensuring that a psychological distress measure works effectively has considerable potential value for assessment purposes. The current study evaluates the measurement properties of a short 10-item version of Radloff’s Center for Epidemiologic Studies Depression Scale (CES-D). Grounded in long-standing evidence on gender differences in depressive symptoms, specific attention was given to examining measurement invariance of the CES-D Short-form across women and men. Based on a large, two-cohort sample of international students (N = 468), and through multiple analyses evaluating factor structure and measurement invariance, we derived an even briefer, seven-item single-factor form of the CES-D (CES-D Short-form International) that can be used with international students.
Broadening access to mental health services is an important priority among mental health professionals in the United States. One area receiving recent attention involves international students’ underrepresentation in, and underutilization of, mental health services (e.g., Hyun, Quinn, Madon, & Lustig, 2007). Recent surges in the number of international students indicate that this group represents a sizeable minority in the U.S. higher education system, with likely corresponding effects on local communities and agencies that provide services to students and their families. Universities are interested in recruiting and retaining international students because they bring diverse scholarly and cultural perspectives to campuses and, by fostering culturally inclusive college environments, offer the opportunity for domestic students to broaden their awareness of the global community. Of course, universities also seek international students for the economic stimulus they can bring to campuses and university communities (Chen, 2008; Owens, Srivastava, & Feerasta, 2011). The social and economic resources international students bring to campus suggest it is prudent for universities to increase international students’ likelihood of success and to prevent or reduce risks to their educational and career goals. Challenges for international students can include acculturative stress-related issues they may experience as part of their transition to studying abroad, as well as other factors that might reduce well-being and interfere with academic progress.
Indeed, apart from the normative and expected acculturation process that international students typically undergo (Berry, 1997; Y.-W. Wang, Lin, Pang, & Shen, 2007), the magnitude and significance of the growing mental health concerns and needs of international students are worth noting. International students frequently report feelings of depression (Hamamura & Laird, 2014; Wei et al., 2007) and anxiety (e.g., Sümer, Poyrazli, & Grahame, 2008). Identified risk factors for such difficulties include acculturative stress, language barriers, feelings of loneliness and isolation (Russell, Rosenthal, & Thomson, 2010; Sawir, Marginson, Deumert, Nyland, & Ramia, 2008), and lack of belongingness (Glass & Westmont, 2014). In a large sample of international students followed from prearrival to their second year of study, K. T. Wang et al. (2012) identified 11% who could be classified as “culture-shocked” (experiencing a significant uptick in psychological distress) by their second and third semesters, and another 10% who were consistently and substantially distressed over the time frame of their study. Thus, perhaps as many as one in five international students could be experiencing significant psychological distress during the early portion of their time on a university campus.
Several studies have found that international students frequently endorse experiencing psychological distress. Their results suggest that the findings from K. T. Wang et al. (2012), though potentially alarming, might have underestimated the scope of distress, at least on some campuses. For example, Han, Han, Luo, Jacobs, and Jean-Baptiste (2013) reported that 45% of the 130 international students in their sample had experienced symptoms of depression, and nearly a third (29%) had experienced anxiety symptoms. Similarly, in a study examining longitudinal distress trajectories among international students, 39% of students were identified as moderately and stably distressed and approximately 5% of students were classified as highly and stably distressed (Hirai, Frazier, & Syed, 2015). These potentially alarming rates of distress among international students are roughly mirrored in domestic student samples as well; approximately 34.5% of college students report feeling depressed to the point that they experience impaired functioning (American College Health Association, 2015).
Despite the similarity between international and domestic students’ experiences of psychological distress, it is important to note the specificity of the circumstances and experiences that international students face that may more likely precipitate and then maintain distress, in addition to the potential fallibility of making direct comparisons between international and domestic students’ distress based on scores of measures. That is, although some experiences potentially linked to distress are common for both international and domestic students (e.g., lack of social connectedness, adjusting to new academic environment), international students face additional challenges such as acculturative stress (e.g., Wei et al., 2007), language barriers (Yeh & Inose, 2003), and perceived discrimination (Hanassab, 2006; Poyrazli & Lopez, 2007) that put them at higher risk for experiencing depression and other forms of distress. In fact, Poyrazli and Lopez (2007) found that international students experience higher levels of homesickness and discrimination compared with domestic students. Furthermore, current assessment practices fail to examine whether a distress measure developed and widely tested among a range of domestic samples can be directly applied to cross-cultural samples that include international students. Van de Vijver and Poortinga (1997) cautioned that ensuring minimal bias should be a required course of action before making cross-cultural comparisons or interpretations based on a given measure. In sum, there is reason to suspect international students may be at unique risk for psychological distress compared with domestic students, but care must be taken to accurately assess international students’ distress.
Many factors have been found to predict depression among international students (Zhang & Goodson, 2011). One variable that influences depression is length of stay. Also implicated in theoretical frameworks of acculturation as an important moderator of the acculturation process (e.g., Berry, 1997), length of stay has been consistently associated with depression level. In their review article, Zhang and Goodson (2011) concluded that length of stay and depression levels are inversely associated, although the specific trajectory of depression levels over time may vary (e.g., linear, curvilinear). For instance, Hechanova-Alampay, Beehr, Christiansen, and van Horn (2002) found that the levels of depression generally decreased over time, with depression peaking at the 3-month point relative to levels at entry and at the 6-month point.
Language as well as cultural barriers may also complicate establishing support networks and increase risk for depression (Mori, 2000). For many international students, English may be their second language, and lower levels of English fluency have been associated with higher levels of depression (Sümer et al., 2008; C. C. D. Wang & Mallinckrodt, 2006). This finding is not surprising given the integral role the English language plays in adjusting to and interacting with U.S. culture. Furthermore, adjusting to the unique cultural atmosphere and learning the cultural norms of the United States can be harder for some individuals than others, perhaps when the cultural norms of the United States are incompatible with or drastically different from those of their country and culture.
For instance, many studies focus on East Asian international students theorizing that they carry Asian values (i.e., collectivistic) that are very distinct from American culture’s values (Sodowsky & Plake, 1992; Wei, Tsai, Chao, Du, & Lin, 2012), which may incur additional difficulties when adjusting to the U.S. culture. Similarly, Steptoe, Tsuda, Tanaka, and Wardle (2007) found students from collectivistic cultures and countries reporting higher depression levels than their peers from individualistic cultures. To this end, it is reasonable to suspect that individuals from some countries or cultures are at higher risk to experience depression than others. In sum, international students appear to be vulnerable to experiencing substantial psychological distress that may be further amplified by social, language, and cultural challenges, and the present study takes into consideration the effects of these factors.
Although contextual and cultural factors may play a role in depression among international students, the role of other individual differences should also be considered. In studies of domestic U.S. samples, there has been considerable research on the link between depression and gender. There is some controversy as to whether gender differences in depression exist, with some arguing against differences (Ceyhan, Aykut-Ceyhan, & Kurtyilmaz, 2005), but most arguing in favor of differences (e.g., Dekker, Koelen, Peen, Schoevers, & Gijsbers-van Wijk, 2007; Kelly, Kelly, Brown, & Kelly, 1999; Nolen-Hoeksema, 2001; Schuch, Roest, Nolen, Penninx, & de Jonge, 2014). The overall gender differences in prevalence are relatively well-established in domestic U.S. samples, and exhibited across various subpopulations, ranging from community adolescents (Bennett, Ambrosini, Kudes, Metz, & Rabinovich, 2005), adult outpatients (Dekker et al., 2007; Gagné, Vasiliadis, & Préville, 2014), and college students (Kelly et al., 1999; Nolen-Hoeksema, 2001). However, research investigating gender differences among international student populations is relatively limited. A few studies have examined gender differences in the prevalence of depression among international students. For instance, Dao, Lee, and Chang (2007) found gender to be a moderator of the relationship between acculturation and depression, with more of the variance of depression explained by acculturation for women than men. Other studies (Sümer et al., 2008; Wei, Ku, Russell, Mallinckrodt, & Liao, 2008; Ying & Han, 2006) report no association between gender and depression among international students.
Regardless of the direction of effects, an important consideration should be taken into account before interpreting the association between gender and depression among international students. That is, there is a possibility that the differences found between women and men in international samples may not be substantive, but instead could reflect measurement noninvariance caused by systematic differential item functioning. Indeed, some have also argued that measurement items might be gender-biased, leading to artificially higher scores in depression among women, or lower scores among men (e.g., Salokangas, Vaahtera, Pacriev, Sohlman, & Lehtinen, 2002). For instance, Leach, Christensen, and Mackinnon (2008) specifically focused on measurement bias as one possible explanation to account for gender disparity in depression, indicating that women are more likely than men to endorse certain items due to cultural norms regarding emotional expression. As one example, proportional odds of responding to an item, “I had crying spells” were 2.14 times greater in women (S. R. Cole, Kawachi, Maller, & Berkman, 2000), with such differential item functioning pattern witnessed across studies (e.g., Yang & Jones, 2007). Similarly, Carleton et al. (2013) examined item characteristic curves and confirmed that women are more likely to endorse a higher response option for this particular item compared with men. Interestingly, this particular item was dropped by J. C. Cole, Rabin, Smith, and Kaufman (2004) when refining the original CES-D measure to a short version with 10 items due to a marginal fit to a Rasch model.
Testing measurement invariance is a suggested way to address potential systematic variation in item functioning to rule out the possibility that certain items may function differently across groups (Dimitrov, 2010). Support of measurement invariance reflects that an item functions the same across groups, in this case for women and men, further allowing for a more substantial interpretation of the derived scores. For instance, to analyze average group differences and interpret those results, factor loadings and item thresholds should first be comparable between women and men. If not, group similarities or differences could be overestimated or underestimated, likely stemming from measurement artifacts and not real similarities or differences (Chen, 2008; Millsap, 2010).
Several studies examined potential systematic variation in item measures between women and men, especially with distress measures that are considered to show gender differences. For instance, Gomez, Summers, Summers, Wolf, and Summers (2014) found support for scalar invariance between women and men in 20 of the 21 items of the Depression Anxiety Stress Scales-21. Wu and Huang (2014) also examined measurement invariance of the Chinese version of the Beck Depression Inventory II (Beck, Steer, & Brown, 1996) among Taiwanese adolescents. They found support for a partial scalar invariance model, indicating potential bias in item responses that would inhibit direct interpretation of observed overall differences between women and men using this measure.
Although international students’ psychological distress has been widely examined, it is not until recently that a handful of studies have begun to explore invariance of these distress measures (K. T. Wang, Wei, Zhao, Chuang, & Li, 2014; Wei, Wang, & Ku, 2012). For instance, K. T. Wang et al. (2014) developed the Cross-Cultural Loss Scale and validated its measurement properties by conducting cross-sectional measurement invariance testing between two samples of international students. Similarly, Wei et al. (2012) developed the Perceived Language Discrimination Scale and validated its measurement between genders and different levels of English proficiency. Understandably, measurement invariance typically is tested in the process of scale development. However, it is also important to consider invariance during subsequent uses of a measure to examine structural validity when the scale is used with specific populations that were not part of the original scale development efforts (e.g., international students). In this vein, examining the invariance of a widely used distress measure among international students is a way to critically investigate measurement properties of existing or new scales, minimizing the possibility of inaccurate score interpretations.
Goal of the Current Study
Depression is one of the most common symptoms reported by international students (Wei et al., 2007), although rigorous psychometric evaluation of a measure used to tap depression for this specific population has rarely been undertaken. The Center for Epidemiologic Studies Depression Scale (CES-D; Radloff, 1977) is a frequently used measure of psychological distress, and several short CES-D forms have been developed. Although scores from the multiple forms of CES-D have been shown to possess good psychometric qualities in a variety of populations (e.g., Mainland Chinese adolescents [M. Wang et al., 2013]; youth of various ethnic groups [Skriner & Chu, 2014]; adult women [Ferro & Speechley, 2013]), to our knowledge, the factor structure and other psychometric features have not been examined with international students. As such, the current study first investigates gender measurement invariance and factor structure of a 10-item Rasch-derived CES-D Short-form (J. C. Cole et al., 2004) using multiple group confirmatory factor analysis (CFA) approaches. Additionally, we considered whether gender moderated the associations between region of origin, length of stay in the United States, and English proficiency with depression, three predictor variables that have been theoretically or empirically related to international student adjustment (Zhang & Goodson, 2011).
Method
Participants and Procedure
All incoming new international graduate students (N = 1,443) at a large southeastern public university were invited to participate in the current study. After e-mail invitation to all incoming students, a total of 468 international graduate students from 44 countries of origin initially responded to the survey. Student responses were obtained during the first 2 months of their attendance at the current institution. Two successive cohort responses were combined for the analyses. There were 244 students (162 men, 82 women) in the first cohort and 224 in the second cohort (150 men, 74 women); the gender distribution did not vary by cohort, χ2(1, N = 468) = 0.02, p = .896. There were no significant differences between cohorts in terms of the 10-item depression total score, t(440) = 1.70, p = .091. Most students (72%) in both cohorts were from Asian countries, with India (39%) and China (30%) making up the largest proportions, followed by students from the South and North American countries (9%), Middle-Eastern countries (5%), European countries (3%), and African countries (2%). Regional distribution did not vary by cohort, χ2(4, N = 424) = 4.00, p = .406. Age ranged from 19 to 49 (M = 24.86, SD = 4.09) years. In terms of academic classification, 64.3% of students were in master’s degree programs and 35.7% were in doctoral programs. There were no cohort differences in the distribution of degree programs, χ2(1, N = 434) = 0.38, p = .538. A total of 58.1% of the students majored in engineering followed by liberal arts and sciences (12.5%). Many students had been in the United States for a month or less (32.7%), or 2 months or less (80.1%), at the point of survey administration and there were no cohort differences in the length of stay, t(426) = 0.61, p = .540. 
Because students had demonstrated adequate English proficiency (through standardized testing) at entry to the program, and because translating the survey into all possible first languages was impractical, the survey was administered in English. There were no cohort differences in overall English proficiency, t(433) = 0.57, p = .570.
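The cohort comparison of gender distribution reported above can be illustrated with a short sketch. It assumes a Pearson chi-square test without continuity correction applied to the 2 × 2 table of cohort by gender; the function name chi_square_2x2 is introduced here for illustration only, and the source does not state which software routine produced the reported statistic.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With df = 1, the upper-tail p-value equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: cohort 1, cohort 2; columns: men, women (counts taken from the text).
stat, p_value = chi_square_2x2([[162, 82], [150, 74]])
print(f"chi2(1, N = 468) = {stat:.2f}, p = {p_value:.3f}")
```

Under these assumptions the sketch should closely reproduce the reported values, χ2(1, N = 468) = 0.02, p = .896.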
Instruments
Depression
The Center for Epidemiologic Studies Depression Scale short form (CES-D short form; J. C. Cole et al., 2004) was used to assess depression severity. The CES-D was originally a 20-item measure; J. C. Cole et al. (2004) used item response theory analyses to identify a 10-item short form with good psychometric features. We elected to use the 10-item short form given its psychometric properties as well as its practical efficiency in administration; fewer items could accommodate the many international students for whom English is a second language. Students respond on a 4-point Likert-type scale ranging from 0 (rarely or none) to 3 (most of the time). Scores can range from 0 to 30, with higher scores indicating greater distress. Sample items include “I felt hopeful about the future” (reverse-scored), “I thought my life had been a failure,” “I felt lonely,” and “People were unfriendly.” Several studies have used the full CES-D or shorter item sets from the CES-D to gauge depressive symptoms among international students. For example, K. T. Wang, Wong, and Fu (2013) reported internal consistency of .86 among Asian international students using 11 CES-D items, while Wei et al. (2008) reported internal consistency of .86 among Asian international students using the full CES-D. In both studies, measures were administered in English. The construct validity of the full CES-D has been extensively supported with international students by its positive correlations with acculturative stress (e.g., Wei et al., 2007) and perceived general stress (r = .64), as well as its negative correlation with self-esteem (r = −.48), in the Wei et al. (2008) study. J. C. Cole et al. (2004) reported internal consistency of .82 and .75 in two samples of undergraduate students and community volunteers, respectively. In the same study, construct validity was supported through a strong positive correlation (r = .74) with the Beck Depression Inventory (Beck, Ward, Mendelson, Mock, & Erbaugh, 1961).
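The scoring rule described above (responses from 0 to 3, reverse-scoring of positively worded items, totals from 0 to 30) can be sketched as follows. The function name, the example responses, and the positions of the reverse-scored items are hypothetical; the source does not give item order.

```python
def score_cesd_short(responses, reverse_items, max_option=3):
    """Sum item responses after reverse-scoring the positively worded items.

    responses: list of integer answers, each from 0 to max_option.
    reverse_items: set of zero-based positions to reverse-score
        (hypothetical positions; the actual item order is not given in the text).
    """
    total = 0
    for position, value in enumerate(responses):
        if not 0 <= value <= max_option:
            raise ValueError(f"response {value} at position {position} is out of range")
        # A reverse-scored answer of 0 counts as 3, 1 as 2, and so on.
        total += (max_option - value) if position in reverse_items else value
    return total


# Hypothetical answers to the 10 items; positions 4 and 7 are assumed reverse-scored.
example = [1, 0, 2, 3, 1, 0, 0, 2, 1, 3]
total = score_cesd_short(example, reverse_items={4, 7})
print(total)
```

Passing the reverse-scored positions as an argument keeps the sketch usable for shorter item sets, such as the eight- and seven-item versions examined later.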
Predictor Variables
Three predictor variables were utilized to examine the moderating effect of gender: Region of Origin, Length of Stay in the United States, and English Proficiency. First, Region of Origin was assessed by asking participants their home country. Given that students were from 44 different countries, it was deemed appropriate to cluster countries into a more manageable number of categories. However, we could not locate any previous study that compared international students according to their region (or nation) of origin, other than studies that theorized Asian international students to have qualitatively different characteristics from other international students. As such, to examine the effects of region of origin, we focused on the two largest subgroups of international students in the sample, both from Asian countries: Asian Indian students and Chinese students. In a broader context, students from India and China comprise the largest subpopulations of international students in the United States as well. We created a dummy variable to code Asian Indian and Chinese students for group comparisons. Length of Stay in the United States was measured by responses to an open-ended question: “How long have you been in the United States?” Based on participants’ responses, we created a variable representing total days in the United States. Last, English Proficiency was measured by the Perceived English Proficiency Scale (Wei, Liao, Heppner, Chao, & Ku, 2012), which includes five items assessing individuals’ perception of their listening, speaking, reading, writing, and overall English proficiency. Participants responded on a 5-point scale ranging from 1 (very poor) to 5 (very good). Scores can range from 5 to 25, with a higher score indicating greater English proficiency. A sample item is “How good are you at writing a paper in English?” The measure exhibited sound internal consistency of .87 for East Asian international students (Wei, Tsai, et al., 2012).
The construct validity of the Perceived English Proficiency Scale has also been supported through its negative association with acculturative stress in Chinese international students (Wei, Liao, et al., 2012).
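The coding of two of the predictors described above can be sketched briefly. This is a minimal illustration, not the authors’ procedure: the function names are hypothetical, and the direction of the dummy code (India = 1, China = 0) is an assumption because the source does not report it.

```python
def code_region(country):
    """Dummy-code the two largest subgroups; all other countries stay uncoded (None)."""
    mapping = {"India": 1, "China": 0}  # coding direction is an assumption
    return mapping.get(country)


def proficiency_total(item_ratings):
    """Sum the five perceived-proficiency ratings (each from 1 to 5; total 5 to 25)."""
    if len(item_ratings) != 5:
        raise ValueError("expected five ratings")
    if any(not 1 <= rating <= 5 for rating in item_ratings):
        raise ValueError("ratings must be between 1 and 5")
    return sum(item_ratings)


# Hypothetical participant: from India, with ratings for listening, speaking,
# reading, writing, and overall proficiency.
print(code_region("India"), proficiency_total([4, 3, 5, 4, 4]))
```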
Results
Descriptive Statistics and Preliminary Analysis
Analyses were conducted with IBM SPSS Version 21 (2012) and Mplus Version 7.11 (Muthén & Muthén, 1998-2013). Item responses were substantially nonnormally distributed in that eight items exhibited a significant skew (p < .001), as could be expected from a scale measuring psychological distress. The tests of model fit were based on the robust weighted least squares mean and variance adjusted (WLSMV) estimator in Mplus. The WLSMV estimator produces unbiased parameter standard errors (Flora & Curran, 2004) for ordered-categorical data. For measurement invariance testing, each model was fit using the Theta parameterization option in Mplus. Covariance coverage ranged from 0.982 to 1.00, indicating that item-level missingness was inconsequential.
Confirmatory Factor Analysis
Formal invariance testing began with the one-factor 10-item solution, with the model tested separately for women and men. For women, this 10-item single-factor structure seemed relatively good in several respects: χ2(35, N = 155) = 75.39, p = .0001, comparative fit index (CFI) = 0.960, and root mean square error of approximation (RMSEA) = 0.086 (90% confidence interval [CI] = [0.059, 0.113]). For men, less than desirable fit emerged: χ2(35, N = 302) = 187.41, p < .0001, CFI = 0.886, and RMSEA = 0.120 (90% CI [0.104, 0.137]). However, for both groups, several standardized factor loadings were relatively low, with loadings ranging from .15 to .86 for women and .10 to .82 for men. Modification indices in both analyses suggested that correlating errors could improve fit, and both analyses implicated the same low-loading item for those correlations (“I felt that I was just as good as other people”). Results also pointed to another low-loading item shared by both groups (“I felt that everything I did was an effort”), λ = .15 and .25 for women and men, respectively. Excluding those two items produced a relatively good fit for women: χ2(20, N = 155) = 39.08, p = .0065, CFI = 0.980, and RMSEA = 0.078 (90% CI [0.041, 0.115]), and a better fitting model for men: χ2(20, N = 302) = 58.53, p < .0001, CFI = 0.962, and RMSEA = 0.090 (90% CI [0.067, 0.113]). This eight-item version of the CES-D was advanced for invariance tests.
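For readers who want to see how an RMSEA point estimate relates to the model chi-square, the following sketch applies a common single-group formula, the square root of max(χ2 − df, 0) / (df × N). This formula is an assumption here: the source does not state the computation, some programs use N − 1 in the denominator, and values based on WLSMV-adjusted statistics may not be reproduced exactly for every model.

```python
import math


def rmsea(chi_square, df, n):
    """Single-group RMSEA point estimate from a model chi-square.

    Assumes the formula sqrt(max(chi_square - df, 0) / (df * n));
    some software divides by df * (n - 1) instead.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * n))


# Fit statistics for the initial 10-item models, taken from the text.
print(f"women: {rmsea(75.39, 35, 155):.3f}")
print(f"men:   {rmsea(187.41, 35, 302):.3f}")
```

The max(…, 0) term keeps the estimate at zero whenever the chi-square falls below its degrees of freedom.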
Multiple Group CFA for Measurement and Structural Invariance
We followed programming recommendations (e.g., Bovaird & Koziol, 2012; Millsap & Yun-Tein, 2004) to test measurement invariance between women and men on an eight-item single-factor CES-D, and when warranted, also followed procedures described by Sass (2011) to deal with noninvariance. Nested models were compared using the DIFFTEST χ2 procedure in Mplus. Additionally, ΔCFI was also examined to determine the improvement or decrement in model fit (Cheung & Rensvold, 2002; Dimitrov, 2010). We began by testing a configural invariance (unconstrained) model where parameters were freely estimated between women and men. Thereafter, parameters were constrained at successive steps in the model testing: metric invariance (factor loadings) and scalar invariance (item thresholds). Significant decrement in fit for these model comparisons would indicate measurement noninvariance between women and men. Women were treated as the reference group in all model comparisons.
Table 1 displays fit statistics for the models and comparisons. The initial test of the multiple-group configural model showed a reasonable fit, comparable with that observed for the separate analyses by gender (e.g., CFI = 0.969, RMSEA = 0.087). Constraining factor loadings to be the same between women and men produced a decrement in fit that approached but did not reach conventional significance, DIFFTEST Δχ2(7, 457) = 13.18, p = .068, ΔCFI = 0.001. Seven of the factor loadings were quite similar between women and men, differing by .003 to .11. However, one of the items (“I felt hopeful about the future”) showed a .41 difference in standardized loadings between women (λ = .64) and men (λ = .23). Allowing that item to be freely estimated between the two groups, while constraining the remaining loadings to invariance, provided support for partial metric invariance, DIFFTEST Δχ2(6, 457) = 4.12, p = .661, ΔCFI = 0.011.
Table 1. Goodness-of-Fit Statistics for Gender Measurement Invariance Tests.
Note. df = degrees of freedom; CFI = comparative fit index; RMSEA = root mean square error of approximation.
Sass (2011) described several options when encountering partial measurement invariance:
(a) delete the noninvariant items and only use invariant items for statistical analyses, (b) apply a partial measurement invariance (PMI) model, (c) use all the items and assume any differences are small and do not influence the results, (d) interpret the scores independently and preclude group comparisons, or (e) simply avoid using the scale. (p. 351; see also Cheung & Rensvold, 1999)
Because the gap in loadings was practically and statistically substantial for the item in question, and because of the practical difficulties in advancing a partial measurement invariance approach for future uses of the measure, we elected to drop that item and rerun the configural and metric tests based on a seven-item scale. Metric invariance for the seven-item scale was clearly supported, DIFFTEST Δχ2(6, 457) = 9.08, p = .17, ΔCFI = 0.001. Scalar invariance was also clearly supported, DIFFTEST Δχ2(13, 457) = 17.09, p = .20, ΔCFI = 0.004.
Several substantive structural invariance questions were also explored. We first tested whether factor means for the latent depression factor significantly differed between women and men. There was no substantial difference in the overall level of depression between women (factor mean of 0.0) and men (factor mean of −0.19), DIFFTEST Δχ2(2, 457) = 3.09, p = .21. Furthermore, no substantial difference between women (factor variance of 1.00) and men (factor variance of 0.99) was found in the variance of the latent depression factor, DIFFTEST Δχ2(1, 457) = 0.003, p = .954. The measure showed good internal consistency, ρ = .83, as indicated by Raykov’s (2009) model-based estimate. In sum, the seven-item single-factor CES-D evidenced impressive psychometric strengths.
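The p-values attached to the difference tests above can be checked by referring each reported statistic to a chi-square distribution with the stated degrees of freedom. The sketch below uses the standard closed-form series for integer degrees of freedom; the function name chi2_sf is introduced here for illustration, and the sketch only converts a reported statistic to a p-value. It does not reproduce the DIFFTEST computation itself, which requires the fitted models.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series with odd-number products
    # in the denominators: 1, x/3, x^2/(3*5), ...
    term, total = 1.0, 0.0
    for i in range((df - 1) // 2):
        if i > 0:
            term *= x / (2 * i + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


# Statistics and degrees of freedom as reported for the seven-item scale.
for label, stat, df in [("metric", 9.08, 6), ("scalar", 17.09, 13), ("factor means", 3.09, 2)]:
    print(f"{label}: p = {chi2_sf(stat, df):.2f}")
```

Under these assumptions the computed values should round to the reported p = .17, .20, and .21.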
Effects of Predictor Variables
Model tests examined whether the three predictor variables (Region of Origin, Length of Stay, and English Proficiency) showed similar patterns of association with depression for women and men. Using the Model Test option in Mplus, we first tested the association between region of origin and depression. The Wald test was not significant, χ2(1) = 1.99, p = .16, indicating that women (.304) and men (−.127) did not differ in the association between region of origin and depression. Similar results emerged for the association between length of stay and depression, χ2(1) = 0.006, p = .94, indicating that women (.000) and men (.000) did not show different association patterns. Additionally, when the association between English proficiency and depression was constrained to be equal between women and men, the Wald test was not significant, χ2(1) = 0.27, p = .60, indicating that women (−.125) and men (−.05) did not differ in this association.
Discussion
Rigorous efforts have been made to accurately reflect international students’ psychological experiences and adjustment (Bardi & Guerra, 2011; K. T. Wang et al., 2014; Wei, Wang, et al., 2012). One way to ensure accurate assessment of psychological adjustment is to use measures with sound psychometric properties to gauge students’ distress (Millsap, 2006). Following this method, the current study examined measurement and structural invariance of the 10-item CES-D between genders among international students. Initial analyses did not support all 10 items as adequate indicators of a depression or distress construct. After trimming two poorly performing items and additionally dropping one item considering practical implications, a seven-item version of the CES-D for international students (CES-D Short-form International) was supported. Beyond initial structure, the seven-item measure performed well even under tests of scalar invariance between women and men. In other words, both women and men showed the same pattern of factor structure, factor loadings, and item thresholds. No significant gender differences were found in variation around the latent construct, or in testing latent level mean difference between women and men. Additionally, gender did not moderate the associations of region of origin, length of stay, and English proficiency with depression.
One of the notable strengths of the current study resides in its test of measurement invariance across gender. Specifically, the factor loadings and item thresholds were found to be gender invariant, ensuring that any observed score difference between women and men reflects real differences in symptom endorsement. This finding is an especially welcome addition to the literature because it provides an empirically supported method for measuring and interpreting gender similarities or differences in the prevalence of depression symptoms among international students. In the current study, seven items were retained for international students because they did not exhibit item difficulty bias (i.e., one group responding higher on a certain item after being matched on the total score with another group) or item discrimination bias (i.e., item difficulty bias that changes with the level of the latent variable). Support for scalar invariance in the current study provides rigorous evidence of the minimal level of item bias in these seven items across genders. Additional analyses that explored the associations of region of origin, length of stay, and English proficiency with depression between women and men also indicated that the CES-D Short-form International is a structurally valid measure to gauge gender differences in depression or distress.
Second, the current study attempted to support the utilization of the CES-D among international students. The items retained through invariance testing reflect that negative affect, somatic symptoms, and interpersonal difficulties are all features of distress among international students. These results suggest that the distress symptoms evident in international students are shared with other populations. Some items of the CES-D have been argued to be culturally biased among minority populations (Li & Hicks, 2010; Van de Velde, Bracke, & Levecque, 2010). Based on results in the current study, the two positively worded items that reflect positive affect were dropped, whereas retained items reflected negative affect, somatic symptoms, and interpersonal aspects of depression. Two possible explanations exist. First, it is plausible that international students’ depression does not necessarily reflect the absence or presence of positive affect. For instance, Goldberg, Oldehinkel, and Ormel (1998) found cultural differences in item functioning across 15 cities around the world. Specifically, an item in the General Health Questionnaire (Goldberg, 1978) that reflected positivity (i.e., enjoy activities) differentiated depressed and nondepressed individuals in some cultures but not in others. In short, positive affect items could have been deleted due to cultural differences in construing distress.
Another explanation could be that because these positive affect items were positively worded (e.g., “I felt hopeful about the future,” “I felt that I was just as good as other people”), the cultural values (e.g., modesty, collectivism) relatively prevalent among international students could have reduced endorsement of these items. For instance, Asian cultures, grounded in collectivistic values, emphasize social harmony and pronounced consideration of others’ mood (Iwata & Buka, 2002). Ultimately, because positively worded items are reverse scored, lower endorsement of these items results in artificially elevated depression scores.
In sum, international students from Asian cultures (78% of the current sample) could have culturally linked reasons to be less affirmative toward positively worded self-referenced mood items, suggesting those items may not perform as well as they do for individuals from other cultural contexts as measures of psychological distress. In this vein, future measurement invariance studies may benefit from a more direct incorporation of cultural variables into the modeling. For instance, Wei, Liao, et al. (2012) incorporated the forbearance coping subscale to measure the degree to which one refrains from sharing problems with others for fear of being burdensome. Similarly, K. T. Wang et al. (2014) incorporated the perceived discrimination subscale of an acculturative stress measure to assess the degree to which one is acculturated to the host culture. Incorporating such measures in conjunction with participants’ region of origin would allow for a more thorough investigation of potentially culture-laden item response biases. Alternatively, it would be worth conducting invariance tests on the CES-D or other distress measures between international samples and domestic samples to examine the effects of cultural contexts.
Last, in terms of practical implications, this short seven-item single-factor CES-D (CES-D Short-form International) can be utilized to assess distress among international students and compare gender differences. Because of its brevity, it is cost effective in that it is easy to administer and score. Furthermore, it imposes little response burden on international students, who oftentimes use English as their second language, and is in that sense culturally sensitive. International students are likely to be less acculturated to the U.S. system, and their construal of accessibility and stigma regarding seeking mental health services can be quite different from that of domestic students. As such, utilizing the short seven-item single-factor CES-D as a screening measure to detect “at-risk” students early on, and following up with these students longitudinally, may help prevent the exacerbation of potential mental health problems. Relatedly, future studies should explore the sensitivity, specificity, and positive and negative predictive power of this measure in distinguishing distressed versus nondistressed individuals if the short seven-item single-factor CES-D is to be utilized as a screening tool. Although J. C. Cole et al. (2004) did not examine a cutoff score with the Rasch-derived 10-item form, a cutoff score suggested in the original 20-item CES-D measure could be used as a preliminary guide to gauge a cutoff score for this seven-item measure.
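The screening indices named above can be computed from a simple cross-classification of screen results against a reference criterion. The following Python sketch is illustrative only: the function name, the example scores, the reference classifications, and the cutoff value are all hypothetical and are not drawn from the current data or proposed as a recommended cutoff.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ScreeningMetrics:
    sensitivity: float  # proportion of distressed respondents flagged by the screen
    specificity: float  # proportion of nondistressed respondents not flagged
    ppv: float          # positive predictive power: flagged respondents who are distressed
    npv: float          # negative predictive power: unflagged respondents who are not distressed


def screening_metrics(scores: Sequence[float],
                      distressed: Sequence[bool],
                      cutoff: float) -> ScreeningMetrics:
    """Cross-classify summed scale scores against a reference criterion.

    A respondent is flagged when score >= cutoff. The reference criterion
    (e.g., a clinical interview) is assumed to be supplied by the user.
    """
    tp = fp = tn = fn = 0
    for score, is_distressed in zip(scores, distressed):
        flagged = score >= cutoff
        if flagged and is_distressed:
            tp += 1
        elif flagged and not is_distressed:
            fp += 1
        elif not flagged and is_distressed:
            fn += 1
        else:
            tn += 1

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a cell total is zero.
        return numerator / denominator if denominator else float("nan")

    return ScreeningMetrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
    )


# Hypothetical example: six made-up summed scores and reference classifications.
example_scores = [2, 5, 9, 12, 3, 10]
example_reference = [False, False, True, True, True, False]
example = screening_metrics(example_scores, example_reference, cutoff=8)
```

Repeating the calculation over a range of candidate cutoffs would show how sensitivity and specificity trade off. Predictive power additionally depends on the base rate of distress in the sample being screened.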
Despite the strength of the current study in its test of gender invariance among the international student population and its pursuit of a culturally sensitive and cost-effective short measure, several limitations exist, which may guide future studies. First, the current study did not explore whether the same results hold in other samples, including other samples of international students. To reliably use this shortened version as a prescreening measure of psychological distress, future studies should explore whether the same results are found across diverse populations. Construct validity of the short CES-D among international students could have been further addressed by testing its association with related criterion variables, such as acculturative stress. Also, the CES-D was administered to students who were relatively early into their first semester at a U.S. university. Although there is an ongoing debate about different phases or stages of adjustment and stress for recently transitioning international students (e.g., Anderson, 1994), future work with this short CES-D could address additional questions relevant to acculturative processes: Is there a so-called “honeymoon phase” during which international students are thought to experience less psychological distress and more optimism? Does acculturation moderate invariance of a distress measure? In this vein, testing longitudinal invariance of a psychological distress measure and examining whether acculturative stress biases responses would be informative.
There was an imbalance in the gender ratio of participants in the current study, such that there were twice as many men as women, so future research might examine more balanced gender distributions or intersectionalities of gender with, for example, country or region of origin. Also, although the international student body is diverse and relatively well represented in number in the current sample, data collection was limited to international graduate students attending one institution. Future work might examine similar measurement and substantive questions on multiple campuses with different proportions of international students to address generalizability of results. Although adequate levels of English proficiency were observed for all students, future studies could examine a potential language effect. For instance, comparing measurement invariance between a group of students who responded to a measure in their first language and another group who responded in English would be informative. Last, it would be interesting to see whether future studies that examine measurement invariance in the CFA framework with the original CES-D (Radloff, 1977) would arrive at the same item set as found in the current study.
To summarize, the current study provided substantial psychometric support for the use of a seven-item version of the CES-D with international students (CES-D Short-form International). Support for scalar invariance allows future studies to directly compare CES-D Short-form International scores between women and men. To our knowledge, this is the first study to examine potential differential item functioning between women and men among international students, especially on a measure of a construct with known gender differences in prevalence. It is also noteworthy that the current study undertook rigorous measurement invariance testing on a measure of depression, a state that international students are highly likely to experience given acculturative stress. In sum, the current study sets the stage for future research using the CES-D Short-form International to assess and compare distress levels among international students. Future research testing measurement invariance across genders and other subgroups would similarly enhance confidence in the inferences and interpretations derived from tests used with international students. Such efforts help to accurately gauge and understand international students’ distress. Further applied research with the CES-D Short-form International could help determine whether mental health and other college student affairs professionals can utilize this measure to effectively screen distress levels and then positively intervene to assist in the adjustment of international students.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
