A Measurement Invariance Analysis of the General Self-Efficacy Scale on Two Different Cultures

Abstract

The 10-item General Self-Efficacy Scale (GSES) was developed to assess an individual’s belief in his or her ability to cope with a variety of situations in life. Although the GSES has been used in numerous studies by researchers in different countries and has been presented in different languages, little is known about its validity in an Asian culture. The aim of the present study was to assess the measurement invariance of the GSES across two countries with distinctly different cultures. Results revealed that although full metric invariance held across the two samples, only partial scalar invariance was achieved, as the intercepts of 4 of the 10 items were non-invariant.

Keywords

self-efficacy, measurement invariance, cultures

Introduction

Self-efficacy is commonly defined as “people’s beliefs about their capabilities to produce designated levels of performance that exercise influence over events that affect their lives” (Bandura, 1994, p. 71). Arguably the foundation of human agency, self-efficacy plays a central role in the self-regulation of motivation through goal challenges and outcome expectations (Bandura, 2001). Although self-efficacy was originally proposed as a domain-specific construct, a construct of general self-efficacy, defined as one’s confidence in coping across various situations, has also been proposed (Jerusalem & Schwarzer, 1981). Empirical studies have found that general self-efficacy predicts one’s perceived ability in a variety of tasks and moderates the influence of previous performance on the subsequent formation of specific self-efficacy. Outcomes associated with it include optimism, perception of challenge, proactive coping, and self-regulation. Arising from their discussion of general self-efficacy, Schwarzer and Jerusalem (1995) developed the General Self-Efficacy Scale (GSES) to measure the construct.

The GSES was originally developed in German as a 20-item scale by Jerusalem and Schwarzer (1981) before it was reduced to a 10-item version in 1995. The 10 items assess an individual’s belief in his or her ability to cope with a variety of situations in life. Since its development, the GSES has been translated into 26 other languages by various researchers. Compared with other scales designed to assess self-efficacy, the GSES explicitly measures the degree of personal agency, that is, the belief that an individual’s actions are responsible for successful outcomes. In particular, the scale was created to predict coping with daily hassles as well as adaptation after all kinds of stressful life events. Although the GSES has been used in numerous studies by researchers in different countries and has been presented in different languages (e.g., Luszczynska, Gutiérrez-Doña, & Schwarzer, 2005), a review of the literature revealed that little is known about its validity in an Asian culture. Chen (2008) conducted a comprehensive review of the cross-cultural measurement invariance (MI) literature and found that close to three quarters of these studies failed to establish metric invariance in the scales they examined. Possible reasons for measurement non-invariance included cross-cultural differences in response styles and differences in the meaning of measurement anchors. Therefore, the aim of the present study was to assess the MI of the GSES between the country where the scale was originally developed, Germany, and one Asian country, Singapore. It is expected that the findings of this study will inform and further our understanding of the cross-cultural validity of the GSES. Figure 1 shows the measurement model of the GSES.

Figure 1.

Measurement model of the GSES.

Method

Participants and Procedure

Two samples totaling 498 participants (291 females) were used in this study. The participants were from Germany (GE; n = 240; 129 females) and Singapore (SG; n = 258; 162 females). Data from SG were obtained by the author from a secondary school, and those for GE were taken from a large-scale 24-nation study on the GSES involving 19,896 participants (Scholz, Gutiérrez-Doña, Sud, & Schwarzer, 2002). The German sample was selected because its profile in age and educational level was similar to that of the SG sample and because its culture is distinctly different from that of SG.

The SG participants completed an online version of the 10-item GSES during school hours, taking about 15 min on average. The mean age was 13.00 (standard deviation [SD] = 0.00) for the German sample and 13.57 (SD = 0.78) for the Singaporean sample. Because the German data were taken from a previous study (Scholz et al., 2002), it is not known why the SD was zero, although the age variable may have been simplified for data entry purposes.

Instrument

The English version of the 10-item GSES (Schwarzer & Jerusalem, 1995) was administered to the SG sample. Participants from GE had completed the German version of the GSES. All items were worded in the same direction and were hypothesized to load on one factor. Each item was measured using the original 4-point scale of the GSES, with 1 = not at all true and 4 = exactly true. Cronbach’s alphas were .73 in GE and .95 in SG. The scale has been shown to have meaningful nomological networks with psychological well-being constructs, such as moderate positive correlations with optimism, expected social support, and proactive coping, and moderate negative correlations with depression, anxiety, and work burnout in both Western and Eastern samples (Scholz et al., 2002).
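As a reminder of what the reported alphas summarize, Cronbach’s alpha can be computed from the item variances and the variance of the total score. The sketch below uses only the Python standard library; the function name and the toy responses are illustrative assumptions and are not the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses on a 4-point scale (four respondents, three items).
toy = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 4]]
alpha = cronbach_alpha(toy)
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency.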

Results

Descriptive Statistics and Modeling Information

The data (N = 498) were examined for out-of-range responses (i.e., responses greater than 4), and none were detected. Because the data were either taken from a previous study (GE sample) or collected with an online survey form that prevented submission of incomplete responses (SG sample), there were no missing data. The item means ranged from 2.83 to 3.08 and the item SDs from 0.63 to 0.71. All confirmatory factor analyses (CFAs) were based on maximum likelihood estimation with robust standard errors because this approach does not assume strict multivariate normality. Model comparisons were based on a scaled chi-square difference test designed specifically for comparing models estimated with robust standard errors (Yuan & Bentler, 1998).
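Robust (scaled) chi-square values cannot simply be subtracted, so the difference has to be rescaled. The sketch below shows the widely used Satorra-Bentler-type scaled difference. The study cites Yuan and Bentler (1998) and does not report scaling correction factors, so the formula choice and the argument names are assumptions, shown only to illustrate why a scaled Δχ² need not equal the raw difference between two scaled χ² values.

```python
def scaled_chi2_difference(t_nested, c_nested, df_nested, t_base, c_base, df_base):
    """Scaled chi-square difference between a nested (more constrained) model
    and its baseline, given each model's scaled chi-square (t), scaling
    correction factor (c), and degrees of freedom (df)."""
    delta_df = df_nested - df_base
    # Scaling correction that applies to the difference test itself.
    c_diff = (df_nested * c_nested - df_base * c_base) / delta_df
    # t * c recovers each model's uncorrected chi-square before rescaling.
    delta_chi2 = (t_nested * c_nested - t_base * c_base) / c_diff
    return delta_chi2, delta_df


# With correction factors of 1 (no rescaling) the result is the plain difference.
plain, ddf = scaled_chi2_difference(195.988, 1.0, 79, 182.518, 1.0, 70)
```

The plain difference between the χ² values reported for Models 1 and 2 in Table 1 is 13.470, whereas the reported scaled Δχ² is 11.673; a correction of this kind is what accounts for such a gap.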

Tests of MI

The first step in establishing MI involves testing whether the same factor structure exists across the two countries (configural invariance; e.g., Vandenberg & Lance, 2000). A well-fitting unconstrained model would indicate that a common factor structure is shared across the GE and SG samples. Next, using a multigroup CFA based on mean and covariance structure (MACS) analysis, MI was tested in the following sequence of nested models (Meredith, 1993): metric (weak) invariance, in which factor loadings were constrained to be equal across countries, and scalar (strong) invariance, in which both factor loadings and item intercepts were constrained to be equal across countries. If metric and scalar invariance are supported, it is reasonable to infer that the GSES has been measured on the same scale across the two cultures; these conditions are generally considered sufficient to establish cross-group equality (Millsap, 1998).
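In equation form, the nested sequence can be written as follows. The notation (τ for intercepts, λ for loadings, ξ for the general self-efficacy factor, δ for residuals) is conventional MACS notation assumed here; it does not appear in the original text.

```latex
\begin{align*}
x_{ig} &= \tau_{ig} + \lambda_{ig}\,\xi_{g} + \delta_{ig},
  \qquad i = 1,\dots,10,\quad g \in \{\mathrm{GE},\mathrm{SG}\} \\
\text{Configural:}\quad & \text{same one-factor pattern in both groups; } \lambda_{ig},\ \tau_{ig} \text{ free} \\
\text{Metric (weak):}\quad & \lambda_{i,\mathrm{GE}} = \lambda_{i,\mathrm{SG}} \ \text{ for all } i \\
\text{Scalar (strong):}\quad & \lambda_{i,\mathrm{GE}} = \lambda_{i,\mathrm{SG}} \ \text{ and } \ \tau_{i,\mathrm{GE}} = \tau_{i,\mathrm{SG}} \ \text{ for all } i
\end{align*}
```

Each step adds equality constraints to the previous one, which is what makes the models nested and allows difference tests between them.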

Model fit was evaluated using the χ², Tucker–Lewis index (TLI), comparative fit index (CFI), root mean square error of approximation (RMSEA), and standardized root mean square residual (SRMR). Following Hu and Bentler (1999), TLI and CFI values greater than .90 were taken to indicate “adequate” model fit and values greater than .95 “good” model fit. RMSEA values less than .06 and SRMR values less than .08 were considered “good,” and values between .08 and .10 were considered “mediocre.”
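To make one of these indices concrete, the sketch below computes a point estimate of RMSEA from χ², df, and total sample size. It assumes a common multigroup form in which the single-group value is multiplied by the square root of the number of groups. The study does not state which formula its software used, so this is an illustration and not a description of the authors’ computation.

```python
import math


def multigroup_rmsea(chi2, df, n_total, n_groups):
    """Point estimate of RMSEA with a sqrt(number of groups) adjustment."""
    # Excess of chi-square over its expected value, floored at zero.
    excess = max(chi2 - df, 0.0)
    return math.sqrt(n_groups) * math.sqrt(excess / (df * n_total))


# Configural model values as reported in Table 1 (N = 240 + 258 = 498).
rmsea_configural = multigroup_rmsea(182.518, 70, 498, 2)
```

Under these assumptions the function gives about .080 for the configural model, in line with the value reported in Table 1.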

A test of the configural model revealed an acceptable level of fit, supporting configural invariance, although the TLI value was mediocre, χ² = 182.518, df = 70; TLI = .913; CFI = .932; RMSEA (95% confidence interval [CI]) = .080 [.069, .092]; SRMR = .041 (see Figure 1). Traditionally, the chi-square test is used to compare nested models, but it is known to be oversensitive with large samples (Chen, 2007). Following Chen’s recommendations, ΔCFI, ΔRMSEA, and ΔSRMR were used in addition to the Δχ² to assess evidence of invariance. The criteria for invariance were ΔCFI ≥ −.01 (i.e., a drop in CFI of no more than .01), ΔRMSEA ≤ .015, and ΔSRMR ≤ .03 for tests of factor loading invariance and ΔCFI ≥ −.01, ΔRMSEA ≤ .015, and ΔSRMR ≤ .01 for tests of scalar invariance.
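The change-in-fit criteria can be expressed as a small check. The sketch below reads the ΔCFI criterion as a drop in CFI of no more than .01 and requires all three criteria to hold, as the text lists them; the function name and this conjunctive reading are assumptions made for illustration.

```python
def meets_chen_criteria(delta_cfi, delta_rmsea, delta_srmr, test):
    """Return True when all three change-in-fit criteria are within bounds.

    test is "metric" (loading invariance) or "scalar" (intercept invariance).
    """
    # The SRMR cutoff is the only one that differs between the two tests.
    srmr_cutoff = {"metric": 0.030, "scalar": 0.010}[test]
    return delta_cfi >= -0.010 and delta_rmsea <= 0.015 and delta_srmr <= srmr_cutoff


# Changes in fit as reported in Table 1.
metric_ok = meets_chen_criteria(-0.003, -0.003, 0.010, "metric")
scalar_ok = meets_chen_criteria(-0.055, 0.021, 0.023, "scalar")
partial_scalar_ok = meets_chen_criteria(-0.004, 0.000, 0.002, "scalar")
```

With the Table 1 values, the metric and partial scalar comparisons pass, and the full scalar comparison fails on ΔCFI, ΔRMSEA, and ΔSRMR.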

After the baseline (configural) model was found to fit acceptably, the factor loadings were constrained to be equal across the two countries to test for metric invariance. Because the metric invariance model (Model 2) is nested within the baseline model (Model 1), a scaled chi-square difference (Δχ²) test (Yuan & Bentler, 1998) was performed. The constraints yielded a scaled Δχ² of 11.673 with a gain of 9 degrees of freedom (Δdf). Given that the Δχ² of 11.673 with Δdf = 9 was not statistically significant (p = .23) and that the changes in the CFI, RMSEA, and SRMR were within recommended guidelines, metric invariance was supported.
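The p values attached to the Δχ² tests follow from the upper tail of the chi-square distribution. The sketch below evaluates that tail with the Python standard library only, using a standard recurrence for integer degrees of freedom; the function name is an assumption introduced for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting values: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        q, k = math.exp(-half), 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1
    # Recurrence Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


p_metric = chi2_sf(11.673, 9)         # Model 1 vs. Model 2
p_partial_scalar = chi2_sf(9.601, 5)  # Model 2 vs. Model 3a
```

These round to p = .23 and p = .09, matching the values reported in the text.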

Scalar invariance was tested by constraining the intercepts of the 10 items to be equal across the two countries. A Δχ² test comparing the scalar invariance model (Model 3) against the metric invariance model (Model 2) was performed, and neither the Δχ² test nor the ΔCFI supported the scalar invariance model. Following the recommendations of Byrne, Shavelson, and Muthén (1989), a partial scalar invariance test was conducted. This involved releasing the equality constraint on the intercept of each item in turn to identify the non-invariant item(s). The process revealed that the intercepts of Items 2, 3, 5, and 6 were non-invariant; hence, the constraints on these four intercepts were released to allow a test of partial scalar invariance (Model 3a). Compared with Model 2, the partial scalar invariance model (Model 3a), with the intercepts of the remaining six items constrained, yielded a scaled Δχ² of 9.601 with Δdf = 5. As the Δχ² was not statistically significant (p = .09) and the changes in the CFI, RMSEA, and SRMR were within recommended guidelines, partial scalar invariance was supported (Table 1).
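The one-at-a-time search for non-invariant intercepts can be sketched as a loop over items. The text does not state the decision rule that was applied, so the rule below (flag an item when releasing its intercept lowers χ² by more than the 5% critical value for 1 df) is an assumption, as are the function names and the toy fit function, which stands in for a real multigroup CFA and returns made-up χ² values. With robust estimation the differences would also need to be scaled.

```python
CRITICAL_CHI2_1DF = 3.841  # 5% critical value of chi-square with 1 df


def flag_noninvariant_intercepts(items, fit_chi2, critical=CRITICAL_CHI2_1DF):
    """Release each item's intercept constraint in turn and flag large improvements.

    fit_chi2 takes a frozenset of items whose intercepts are freed and
    returns the model chi-square (hypothetical interface).
    """
    fully_constrained = fit_chi2(frozenset())
    flagged = []
    for item in items:
        released = fit_chi2(frozenset([item]))
        if fully_constrained - released > critical:
            flagged.append(item)
    return flagged


def toy_fit(freed):
    # Hypothetical chi-square: each still-constrained item adds its own misfit.
    misfit = {1: 0.5, 2: 20.0, 3: 15.0, 4: 1.0}
    return 100.0 + sum(v for item, v in misfit.items() if item not in freed)


flagged = flag_noninvariant_intercepts([1, 2, 3, 4], toy_fit)
```

In the toy example only the two items with large made-up misfit values are flagged; in practice each call to the fit function would re-estimate the multigroup model.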

Table 1.

Summary of Invariance Tests Across the Germany (n = 240) and Singapore (n = 258) Samples.

Model^a	χ²	df	Δχ²^b	Δdf	TLI	CFI	RMSEA [95% CI]	SRMR	Model comparison	ΔCFI	ΔRMSEA	ΔSRMR
1	182.518	70	—	—	.913	.932	.080 [.069, .092]	.041	—	—	—	—
2	195.988	79	11.673	9	.920	.929	.077 [.066, .089]	.051	1 vs. 2	−.003	−.003	.010
3	297.339	88	118.540*	9	.871	.874	.098 [.087, .108]	.074	2 vs. 3	−.055	.021	.023
3a	207.583	84	9.601	5	.920	.925	.077 [.066, .088]	.053	2 vs. 3a	−.004	.000	.002

Note. df = degrees of freedom; TLI = Tucker–Lewis index; CFI = comparative fit index; RMSEA = root mean square error of approximation; CI = confidence interval; SRMR = standardized root mean square residual.

^a 1 = configural model; 2 = metric model; 3 = scalar model; 3a = partial scalar model.

^b The chi-square difference values have been scaled based on Yuan and Bentler’s (1998) method, because each model is estimated with robust standard errors.

*p < .001.
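The degrees of freedom in Table 1 can be checked by counting observed moments and free parameters. The sketch below assumes a one-factor model identified by fixing the factor variance to 1 and the factor mean to 0 in the reference group; the identification choice and the function name are assumptions, although the resulting df do not depend on which standard identification is used.

```python
def macs_df(n_items, n_groups, equal_loadings=False, n_equal_intercepts=0):
    """Degrees of freedom of a one-factor multigroup MACS model."""
    # Observed means, variances, and covariances summed over groups.
    moments = n_groups * n_items * (n_items + 3) // 2
    # Configural: loadings, intercepts, and residual variances per group
    # (factor variance fixed to 1 and factor mean fixed to 0 in every group).
    params = n_groups * 3 * n_items
    extra_groups = n_groups - 1
    if equal_loadings:
        # Shared loadings remove parameters; factor variances become free.
        params -= extra_groups * (n_items - 1)
    if n_equal_intercepts > 0:
        # Shared intercepts remove parameters; factor means become free.
        params -= extra_groups * (n_equal_intercepts - 1)
    return moments - params


df_configural = macs_df(10, 2)
df_metric = macs_df(10, 2, equal_loadings=True)
df_scalar = macs_df(10, 2, equal_loadings=True, n_equal_intercepts=10)
df_partial = macs_df(10, 2, equal_loadings=True, n_equal_intercepts=6)
```

The four values are 70, 79, 88, and 84, which agree with the df column of Table 1 and with Δdf = 5 for the comparison of Model 3a against Model 2.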

Discussion

This study examined the MI of the 10-item GSES proposed by Schwarzer and Jerusalem (1995) across two countries with distinctly different cultures. Data from samples of students aged 13 to 14 years in GE and SG supported the hypothesized one-factor model that underlies the GSES. The results also supported full configural and metric invariance. Scalar invariance was only partially supported, as the intercepts of 4 of the 10 items were non-invariant.

Given that an individual’s environment, personal factors, and behavior have an impact on his or her perceptions about the self and what he or she can accomplish, self-efficacy can influence subsequent behavior (Bandura, 1997). As the samples in this study were taken from different countries, it is possible that the environmental and personal factors that shape students’ beliefs about personal agency differed to an extent that affected scalar invariance. In addition, because the GSES was administered in different languages to the German and Singaporean students, the results may have been affected by the translation of the scale items. However, the English version of the GSES used here is the most common and widely accepted translation; thus, any translation error should be minimal. Furthermore, the validity of the scale is supported by the meaningful nomological networks of the GSES with psychological well-being measures in both Western and Eastern countries (Scholz et al., 2002), although subtle cross-cultural differences in the meaning of general self-efficacy may exist, and these differences may affect the attainment of MI among items as well as scale validity. Finally, East Asians have shown distinct response styles, such as frequent use of moderate response anchors and higher acquiescence (Chen, 2008). These response styles may not have affected metric invariance in the current study, but their combination may create unpredictable effects that hinder scalar invariance. Attainment of metric invariance supports the use of the GSES to examine cross-cultural differences in correlations, but the absence of full scalar invariance implies bias in cross-cultural mean comparisons with this scale.
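The link between intercepts and mean comparisons can be made explicit with a short derivation. The notation is assumed here and is not from the original text: τ denotes an item intercept, λ a loading that is equal across groups under metric invariance, and κ the latent mean of general self-efficacy in each group.

```latex
\begin{align*}
E(x_{ig}) &= \tau_{ig} + \lambda_{i}\,\kappa_{g} \\
E(x_{i,\mathrm{SG}}) - E(x_{i,\mathrm{GE}})
  &= (\tau_{i,\mathrm{SG}} - \tau_{i,\mathrm{GE}})
   + \lambda_{i}\,(\kappa_{\mathrm{SG}} - \kappa_{\mathrm{GE}})
\end{align*}
```

When an item’s intercepts differ across groups, the first term is nonzero, so an observed mean difference on that item mixes the intercept difference with the true latent mean difference.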

Future research on the GSES may include additional validation involving participants across the age spectrum, including the very young and the old. This would contribute to our understanding of the stability and progression of self-efficacy through the life span. Given that the internal consistency of the GSES was found to be higher in SG than in GE, future studies should examine the role of response styles, such as the frequent use of moderate response anchors among East Asians, as well as other factors that affect the examination of cross-cultural mean differences. Invariance testing should also be continued to establish validity across cultures, languages, and educational systems. Finally, the concurrent and nomological validity of the GSES could be examined to support its currency and relevance in psychological and educational research.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Bandura, A. (1994). Self-efficacy. In V. S. Ramachandran (Ed.), Encyclopedia of human behavior (Vol. 4, pp. 71-81). New York, NY: Academic Press.

Bandura, A. (1997). Self-efficacy: The exercise of control. New York, NY: Freeman.

Bandura, A. (2001). Social cognitive theory: An agentic perspective. Annual Review of Psychology, 52, 1-26.

Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456-466.

Chen, F. F. (2007). Sensitivity of goodness of fit indices to lack of measurement invariance. Structural Equation Modeling, 14, 464-504.

Chen, F. F. (2008). What happens if we compare chopsticks with forks? The impact of making inappropriate comparisons in cross-cultural research. Journal of Personality and Social Psychology, 95, 1005-1018.

Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.

Jerusalem, M., & Schwarzer, R. (1981). Fragebogen zur Erfassung von “Selbstwirksamkeit.” In R. Schwarzer (Ed.), Skalen zur Befindlichkeit und Persoenlichkeit (Forschungsbericht No. 5). Berlin, Germany: Institut fuer Psychologie, Freie Universitaet.

Luszczynska, A., Gutiérrez-Doña, B., & Schwarzer, R. (2005). General self-efficacy in various domains of human functioning: Evidence from five countries. International Journal of Psychology, 40, 80-89.

Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58, 525-543.

Millsap, R. E. (1998). Group differences in regression intercepts: Implications for factorial invariance. Multivariate Behavioral Research, 33, 403-424.

Scholz, U., Gutiérrez-Doña, B., Sud, S., & Schwarzer, R. (2002). Is general self-efficacy a universal construct? Psychometric findings from 25 countries. European Journal of Psychological Assessment, 18, 242-251.

Schwarzer, R., & Jerusalem, M. (1995). Generalized Self-Efficacy Scale. In J. Weinman, S. Wright, & M. Johnston (Eds.), Measures in health psychology: A user’s portfolio. Causal and control beliefs (pp. 35-37). Windsor, UK: NFER-NELSON.

Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3, 4-69.

Yuan, K. H., & Bentler, P. M. (1998). Robust mean and covariance structure analysis. British Journal of Mathematical and Statistical Psychology, 51, 63-88.