Factorial Validity of the Anxiety Questionnaire for Students (AFS): Bifactor Modeling and Measurement Invariance

Abstract

The present study with 2,273 students examined the factorial validity of the Anxiety Questionnaire for Students (AFS) within the bifactor modeling framework, that is, by contrasting a confirmatory factor analysis (CFA) model with an exploratory structural equation model (ESEM) and two bifactor models (B-CFA and B-ESEM). In addition, measurement invariance and latent mean differences in the three facets of the AFS (test anxiety, manifest anxiety, dislike of school) across gender, age groups, and school types were explored. Results provided strong support for the multidimensionality of the AFS. The B-ESEM fit the data better than the other models. Measurement invariance across gender, age groups, and school types was fully supported. Girls and younger students generally reported higher levels of anxiety than boys and older students, whereas boys and older students reported more dislike of school than girls and younger students. Furthermore, elementary school students generally showed higher levels of anxiety than students of the other school types.

Keywords

factorial validity, anxiety, bifactor modeling, measurement invariance

The Anxiety Questionnaire for Students (AFS; in German: Angstfragebogen für Schüler; Wiesczerkowski et al., 2016) is arguably one of the most commonly used instruments for measuring different facets of students’ anxiety. However, despite the high practical significance of the AFS, research testing its factor structure with different factor-analytic approaches, in particular confirmatory factor analysis (CFA), structural equation modeling (SEM), or the emerging bifactor modeling, is missing.

The present study aims to examine the factorial validity of the AFS by using the bifactor modeling approach proposed by Morin, Arens, and Marsh (2016a), that is, contrasting a CFA model with an exploratory SEM (ESEM), a bifactor CFA model (B-CFA model), and a bifactor ESEM (B-ESEM). This research is of considerable importance given that the investigation of the three facets of the AFS enables researchers and practitioners to better explain and predict students’ behavior and other (educational) outcomes. For instance, numerous studies have revealed negative effects of anxiety in children and adolescents such as lower self-esteem (Dan & Raz, 2012), lower motivation (Cheng et al., 2014), and lower academic achievement (Hill et al., 2016; Pekrun, Lichtenfeld, Marsh, Murayama, & Götz, 2017). In particular, drawing on a large sample of 3,256 participants (7-19 years) of the BELLA cohort study, Klasen et al. (2016) showed that, based on parental reports, a total of 10.6% of all children and adolescents in Germany showed anxiety, while the prevalence rate based on self-reports was even higher at 15.1% and appeared to be higher with increasing age and among girls.

The AFS

The AFS consists of 50 items and four scales, namely, test anxiety (TA), manifest anxiety (MA), dislike of school (DS), and social desirability (SD). The scale TA comprises 15 items measuring students’ emotions of insufficiency and failure including vegetative reactions (e.g., “I am afraid of facing an unexpected test in class.”). TA is defined as a state emotion that particularly arises from failure on an exam or in a similar evaluative situation (Zeidner, 1998). It includes two central components (Spielberger & Vagg, 1995): (a) an emotionality component, which relates to all subjective feelings of tension (e.g., nervousness) and physical or autonomic states of excitement (e.g., palpitations), and (b) a worry component, which refers to all cognitive aspects such as worry about a person’s own failure. The scale MA comprises 15 items measuring more general anxiety symptoms such as heart palpitations (e.g., “I often have strong heart palpitations.”). This scale is based on the assumption that anxious students generally tend to feel anxious, irrespective of the stimulus. Manifest anxiety can, thus, be seen as a dispositional (trait) emotion. The scale DS consists of 10 items measuring the feeling of dislike of school as an internal defense against the school and an absence of motivation caused by unpleasant experiences (e.g., “It would be nice if I did not need to go to school anymore.”). The scale SD, which serves as a control variable, includes 10 items measuring the extent of students’ social desirability. All items of the AFS are answered on a dichotomous rating scale ranging from 0 (true) to 1 (not true).

A series of studies has supported the criterion-related validity of the four scales by correlations with external criteria such as motivation, self-concept, and school grades (Nitkowski, Lohbeck, Petermann, & Petermann, 2017; for an overview, see Wiesczerkowski et al., 2016). In contrast, there is little evidence for the factor structure of the AFS in the normative data set (Wiesczerkowski et al., 2016): The CFI was only .69, although the expected 4-factor structure could be sufficiently reproduced and factor loadings were sufficient, ranging from .51 to .75 for TA, from .40 to .77 for MA, and from .26 to .71 for DS. However, no alternative models and no measurement invariance for gender, age groups, or school types have yet been tested, although there is some evidence for gender and age-related effects on students’ anxiety (e.g., Götz, Bieg, Lüdtke, Pekrun, & Hall, 2013; Klasen et al., 2016). In addition, the question of whether school types play a role in the three facets of the AFS has not been addressed in previous research. This is unfortunate because the AFS is specifically designed for students in different school types.

Bifactor Modeling

Bifactor modeling, in particular the B-ESEM framework, seems to be a promising approach for testing the multidimensionality of an instrument. In a CFA model, each item loads only on its expected factor and no cross-loadings on other factors are permitted. However, as a CFA model constrains the cross-loadings between the items and nontarget factors to be zero (Asparouhov & Muthén, 2009), factor correlations may be substantially biased because the only way to accommodate the omitted cross-loadings is through the inflation of the estimated factor correlations (Marsh, Liem, Martin, Morin, & Nagengast, 2011). Moreover, in a CFA model, the items measuring the constructs underlying the instrument tend to be fallible indicators, as they are often systematically related to constructs other than those targeted by the instrument in addition to containing random measurement error, in particular, when the instrument targets multiple factors of conceptually related constructs (Morin et al., 2016a). As a result, a CFA model might be unrealistically restrictive. In contrast, in EFA all cross-loadings are freely estimated, leading to more realistic and more exact estimates of the latent factor correlations than in CFA (Asparouhov & Muthén, 2009). However, EFA does not permit a priori hypotheses and is unsuitable for confirmatory analysis. ESEM based on target rotation, as proposed by Asparouhov and Muthén (2009), seems to be a more favorable approach, as it integrates EFA into the SEM framework and allows a priori hypotheses about the expected factor structure. In ESEM based on oblique target rotation, all cross-loadings are “targeted” to be close to zero, while main loadings are freely estimated. Bifactor models, conversely, posit the coexistence of a global factor (G-factor) underlying all items and meaningful specific factors (S-factors), whereby all factors are set to be orthogonal (uncorrelated).
For instance, with respect to the AFS, a B-CFA model posits that all items load on a global anxiety factor (G-factor) and on three specific anxiety factors (S-factors) reflecting the three facets of the AFS (TA, MA, DS) with no cross-loadings between the S-factors. The G-factor and S-factors are specified as orthogonal to ensure that the S-factors reflect the part of the items’ variance that is not explained by the G-factor, while the G-factor reflects the part of the items’ variance that is shared across the items (Reise, 2012). As a result, in a B-CFA model, the total covariance among the items is divided into a general component underlying all items, and f–1 S-components reflecting the residual covariance not explained by the G-factor. Finally, in a B-ESEM based on orthogonal bifactor target rotation (Reise, 2012), all items of the three facets of the AFS represent a G-factor, while the three S-factors are defined from the same pattern of target and nontarget factor loadings that is used in the ESEM. Figure 1 in Appendix A of the supplemental materials depicts all models under investigation.
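The orthogonality constraint described above can be written out explicitly. The following is only a sketch in generic factor-analytic notation; the symbols (x_j for the response variable underlying item j, k(j) for the facet to which item j belongs) are assumptions introduced for illustration and do not appear in the AFS manual. With standardized, mutually orthogonal factors, a B-CFA implies:

```latex
\begin{align}
x_j &= \lambda_{g,j}\, G + \lambda_{s,j}\, S_{k(j)} + \varepsilon_j, \\
\operatorname{Cov}(G, S_k) &= 0, \qquad \operatorname{Cov}(S_k, S_{k'}) = 0 \quad (k \neq k'), \\
\operatorname{Cov}(x_j, x_l) &= \lambda_{g,j}\lambda_{g,l}
  + \mathbb{1}\!\left[k(j) = k(l)\right] \lambda_{s,j}\lambda_{s,l}
  \qquad (j \neq l).
\end{align}
```

Items from different facets therefore covary only through the G-factor, whereas items from the same facet share an additional component through their S-factor. In a B-ESEM, the indicator term is relaxed because every item is also allowed small loadings on the nontarget S-factors.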

Bifactor models have increasingly been used in previous research testing the multidimensionality of instruments measuring, for instance, academic self-concept (Morin et al., 2016a), workplace affective commitment (Perreira et al., 2018), intelligence (Gignac & Watkins, 2013), and anxiety disorders (Simms, Grös, Watson, & O’Hara, 2008). However, for the AFS, the question of whether there exists a global anxiety factor has not been examined so far, which limits the use of bifactor modeling for the AFS. In accordance with the assumptions of bifactor modeling, we argue that there exists a global anxiety factor alongside which higher or lower specific anxiety factors exist. Two students, for instance, may have a similar level of global anxiety, although they show different levels of anxiety in specific domains: One student may report a higher test anxiety but a lower manifest anxiety, while the other student may report a lower test anxiety but a higher manifest anxiety. Each person has, thus, an individual “anxiety profile.” However, this individual anxiety profile may also represent a biased picture of the more specific anxiety. For instance, a student’s anxiety in a specific domain may appear higher than it actually is because of his or her higher global anxiety. In contrast, the more specific anxiety factors should not change the overall picture, as these more specific anxiety levels are just one facet of the global anxiety factor. As a consequence, taking global anxiety into account while simultaneously measuring anxiety in specific domains leads to a more accurate measurement of specific anxiety beyond the global level of anxiety. Thus, bifactor modeling may also be a suitable approach for testing the multidimensionality of the AFS.

Aims and Hypotheses

The present study is unique in examining the factorial validity of the AFS by using the emerging approach of bifactor modeling proposed by Morin et al. (2016a). More specifically, the central objective of this study is threefold: First, by focusing on the three facets of the AFS (TA, MA, DS), the factor structure of the AFS is tested within the traditional CFA framework. Second, according to the bifactor modeling approach, the presumed 3-factor CFA model is contrasted with (a) an ESEM, (b) a B-CFA model, and (c) a B-ESEM. Third, measurement invariance across gender, age groups, and school types is investigated. Additionally, beyond the central objectives of this study, potential effects of gender, school types, and age groups on all three facets of the AFS are explored. In line with previous research, we assume that a 3-factor CFA model will show a better fit to the data than a more global 1-factor or a 2-factor CFA model. However, due to the absence and/or inconsistency of research, we do not offer specific hypotheses for the other models, for measurement invariance, or for potential effects of gender, school types, and age groups.

Method

Sample

For the present study, a subsample of the second normative data set of the AFS was used. This sample consists of N = 2,273 German students (n = 1,127 boys, n = 1,146 girls) aged 9 to 19 years (M = 13.81, SD = 2.41). Students attending special schools and vocational schools were excluded as the AFS was primarily used in regular schools and the number of participants in these two school types was relatively low (special schools: n = 85, vocational schools: n = 135). Participants attended 124 classes from Grades 4 to 12 (Grade 4: n = 307, Grade 5: n = 155, Grade 6: n = 313, Grade 7: n = 377, Grade 8: n = 336, Grade 9: n = 360, Grade 10: n = 258, Grade 11: n = 154, Grade 12: n = 13). The number of students per class varied from 6 to 28. Students were randomly selected from 35 elementary and secondary schools in six federal states of Germany (Lower Saxony, Saxony, Bremen, Berlin, North Rhine-Westphalia, Hesse). In Germany, there are three secondary school tracks: (a) Gymnasium (highest school track), (b) Haupt-/Realschule (intermediate school track), and (c) Gesamtschule/Oberschule (mixed school track). Given the different names of the middle secondary school tracks in some federal states of Germany, the intermediate and mixed secondary school tracks (Hauptschule, Realschule, Oberschule) were considered as one secondary school type, resulting in three different school types in this study (elementary school: n = 307, Haupt-/Real-/Oberschulen: n = 838, Gymnasium: n = 1,128).

Procedure

Data collection took place between January 2015 and July 2015. All students were informed that their data would be treated anonymously. Participation was voluntary. Parental consent was obtained prior to the study. All participants filled out the questionnaires in class and were able to complete them in a single lesson. Due to the lower reading competencies of elementary school children, all items were read aloud to the children by trained testing personnel and student assistants.

Measure

As social desirability is mainly used as a control variable in the AFS, only the three facets of TA, MA, and DS of the AFS (Wiesczerkowski et al., 2016) were measured in this study. All three scales showed satisfactory Cronbach’s alpha coefficients (TA: α = .87; MA: α = .84; DS: α = .71).

Data Analysis

All models were estimated using the robust weighted least squares estimator (WLSMV) in Mplus 7.3 (Muthén & Muthén, 2018). First, the following three CFA models were tested: (a) a 1-factor CFA model with all items loading on a single factor, (b) a 2-factor CFA model with one factor for TA and MA and a second factor for DS, and (c) a 3-factor CFA model differentiating between the three facets of the AFS. Second, the bifactor modeling approach proposed by Morin et al. (2016a) was used, which started with the comparison of the 3-factor CFA model and the ESEM. Support for the ESEM solution was assumed if there were substantially reduced factor correlations and better fit indices, with multiple cross-loadings ⩾0.10 or even ⩾0.20 indicating that a global construct might be present in the data (Marsh, Morin, Parker, & Kaur, 2014). Next, the 3-factor CFA model was contrasted with the B-CFA model. The key elements for justifying the B-CFA solution were better fit indices, a well-defined G-factor, and some well-defined S-factors. Finally, the B-ESEM solution was examined. The adequacy of the B-ESEM solution was supported by improved goodness-of-fit indices, a well-defined G-factor, and relatively small cross-loadings, ideally smaller than those in the ESEM (Marsh et al., 2014). The retained model was then used for measurement invariance testing across gender, age groups, and school types, following the approach for ordered categorical responses (for more details on model specification under WLSMV, see Morin, Arens, Tran, & Caci, 2016b), that is, testing configural invariance, scalar invariance (loadings and thresholds), and strict invariance (loadings, thresholds, and uniquenesses) in sequence. For measurement invariance across age groups, the following four age groups were constructed: (a) 9 to 11 years (n = 476), (b) 12 to 13 years (n = 658), (c) 14 to 15 years (n = 654), and (d) 16 to 19 years (n = 485).
This procedure ensured four comparably large age groups. However, due to the relatively low number of 18- and 19-year-old students in the data set, the oldest age group spans a wider age range (16 to 19 years) than the other groups. For model fit evaluation, the comparative fit index (CFI), the Tucker-Lewis index (TLI), and the root mean square error of approximation (RMSEA) with its confidence interval were used. Values greater than .90 and .95 for the CFI and TLI were evaluated as adequate and excellent fit, respectively, while values lower than .08 and .06 for the RMSEA were interpreted as acceptable and excellent fit, respectively (Hu & Bentler, 1998). Changes in model fit were analyzed with the Mplus DIFFTEST function (MDΔχ²; Asparouhov, Muthén, & Muthén, 2006). Given the known oversensitivity of χ² and MDΔχ² to sample size and minor misspecifications, the following additional criteria were used in measurement invariance testing (Cheung & Rensvold, 2002): a CFI decrease of 0.010 or less and an RMSEA increase of 0.015 or less.
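The cut-off values described above can be expressed as a small decision rule. The following Python sketch is illustrative only: the function name `classify_fit` and the label "poor" for values that miss the lower cut-off are assumptions made here, not terms used in the study.

```python
def classify_fit(cfi: float, tli: float, rmsea: float) -> dict:
    """Label each fit index using the cut-offs stated in the text.

    CFI/TLI: > .95 excellent, > .90 adequate.
    RMSEA:   < .06 excellent, < .08 acceptable.
    The label "poor" is an assumption for values outside these ranges.
    """
    def incremental(value: float) -> str:
        if value > 0.95:
            return "excellent"
        if value > 0.90:
            return "adequate"
        return "poor"

    def absolute(value: float) -> str:
        if value < 0.06:
            return "excellent"
        if value < 0.08:
            return "acceptable"
        return "poor"

    return {
        "CFI": incremental(cfi),
        "TLI": incremental(tli),
        "RMSEA": absolute(rmsea),
    }


# Values taken from Table 1 (3-factor CFA and B-ESEM).
print(classify_fit(0.913, 0.908, 0.036))
print(classify_fit(0.980, 0.975, 0.019))
```

Applied to Table 1, such a rule only summarizes the descriptive fit; it does not replace the inspection of parameter estimates.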

To assess the composite reliability of the factors, omega coefficients were calculated using McDonald’s ω = [(Σ|λ_g|)² + (Σ|λ_i|)²] / [(Σ|λ_g|)² + (Σ|λ_i|)² + Σδ_ii], where λ_g represents the factor loadings of the G-factor, λ_i reflects the factor loadings of the S-factors, and δ_ii defines the error variances (McDonald, 1970). Additionally, the (hierarchical) omega coefficients for the B-CFA model and the B-ESEM were computed (Reise, Moore, & Haviland, 2010) in which only the loadings of the G-factor were taken into account: ω_h = (Σ|λ_g|)² / [(Σ|λ_g|)² + (Σ|λ_i|)² + Σδ_ii]. Latent mean differences for gender, school types, and age groups were estimated within the strict measurement invariance models. Due to the hierarchical data, the intraclass correlations (ICC) for all three scales of the AFS were tested and the Mplus type = complex option was used. The ICC were very low ranging from .00 to .03. Missing values were negligible (max. 2.7 % on the item level).
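The two omega formulas above can be sketched in a few lines of Python. This is a minimal illustration of the formulas exactly as written in the text; the function names and the toy loadings are hypothetical and are not AFS estimates.

```python
def omega(g_loadings, s_loadings, uniquenesses):
    """McDonald's omega as given in the text:
    [(sum|lg|)^2 + (sum|ls|)^2] / [(sum|lg|)^2 + (sum|ls|)^2 + sum(delta)]."""
    g_part = sum(abs(l) for l in g_loadings) ** 2
    s_part = sum(abs(l) for l in s_loadings) ** 2
    error = sum(uniquenesses)
    return (g_part + s_part) / (g_part + s_part + error)


def omega_h(g_loadings, s_loadings, uniquenesses):
    """Hierarchical omega: only the G-factor loadings enter the numerator."""
    g_part = sum(abs(l) for l in g_loadings) ** 2
    s_part = sum(abs(l) for l in s_loadings) ** 2
    error = sum(uniquenesses)
    return g_part / (g_part + s_part + error)


# Hypothetical toy example: three items with G-loadings of .60 and
# S-loadings of .40; uniqueness = 1 - .60^2 - .40^2 = .48 per item.
g = [0.60, 0.60, 0.60]
s = [0.40, 0.40, 0.40]
delta = [0.48, 0.48, 0.48]
print(round(omega(g, s, delta), 3), round(omega_h(g, s, delta), 3))
```

Because the numerator of ω_h drops the S-factor term, ω_h can never exceed ω for the same set of items; the gap between the two indicates how much reliable variance is specific rather than global.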

Results

Factor Structure

Table 1 provides the results of factor analyses under investigation.

Table 1.

Results of Factor Analyses and Measurement Invariance Testing.

Models	χ²	df	CFI	TLI	RMSEA	[90 % CI]
1-factor CFA	3,879.618	740	.876	.870	.043	[.042, .045]
2-factor CFA	3,345.883	739	.897	.892	.039	[.038, .041]
3-factor CFA	2,942.252	737	.913	.908	.036	[.035, .038]
ESEM	1,467.424	663	.968	.963	.023	[.022, .025]
B-CFA	1,767.736	702	.958	.953	.026	[.024, .027]
B-ESEM	1,136.682	626	.980	.975	.019	[.017, .021]
Gender
B-ESEM
configural	1,820.849	1,252	.978	.972	.019	[.017, .021]
scalar	1,942.926	1,395	.978	.976	.018	[.016, .020]
strict	1,953.375	1,433	.980	.978	.017	[.015, .019]
Age groups: 9-11, 12-13, 14-15, 16-19
B-ESEM
configural	2,891.883	2,504	.982	.977	.016	[.013, .018]
scalar	3,334.451	2,927	.981	.980	.015	[.012, .017]
strict	3,468.617	3,041	.980	.980	.015	[.012, .017]
School types: Grundschule, Oberschule, Gymnasium
B-ESEM
configural	2,205.854	1,878	.981	.976	.015	[.012, .018]
scalar	2,502.474	2,161	.980	.978	.014	[.012, .017]
strict	2,602.745	2,237	.978	.977	.015	[.012, .017]

Note. ESEM was estimated with target oblique rotation. B-ESEM was estimated with bifactor orthogonal target rotation. χ² = WLSMV chi square; CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error of approximation; CI = confidence interval; CFA = confirmatory factor analysis; ESEM = exploratory structural equation model; B = bifactor model; WLSMV = robust weighted least squares estimator.

The fit indices of the CFA models were as expected: The 3-factor CFA model showed the best fit to the data (χ² = 2,942.252, df = 737, CFI = .913, TLI = .908, RMSEA = .036, factor loadings: min. λ = .23, max. λ = .84), followed by the 2- and 1-factor CFA models. Thus, the 3-factor model was used for the further bifactor modeling analysis.
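As a rough plausibility check, the RMSEA values of the single-group models in Table 1 can be approximated from χ², df, and the sample size. The sketch below assumes the common single-group formula RMSEA = sqrt(max(χ² − df, 0) / (df · N)); the exact computation that Mplus applies to the adjusted WLSMV χ² may differ in detail, so the result should be read as an approximation only.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Approximate single-group RMSEA from chi-square, df, and sample size.

    Assumption: sqrt(max(chi2 - df, 0) / (df * n)); some programs use n - 1.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))


N = 2273  # sample size reported in the Sample section

# (chi-square, df) as reported in Table 1
models = {
    "1-factor CFA": (3879.618, 740),
    "2-factor CFA": (3345.883, 739),
    "3-factor CFA": (2942.252, 737),
    "ESEM": (1467.424, 663),
    "B-CFA": (1767.736, 702),
    "B-ESEM": (1136.682, 626),
}

for name, (chi2, df) in models.items():
    print(f"{name}: RMSEA ~ {rmsea(chi2, df, N):.3f}")
```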

When testing the differences between the fit of the models using the Mplus DIFFTEST function (MDΔχ²; Asparouhov et al., 2006) and the cut-off criteria proposed by Cheung and Rensvold (2002), results showed that the 3-factor CFA model fitted the data significantly worse than all the other models within the bifactor modeling framework (see Table 1). Specifically, the fit of the 3-factor CFA model was significantly worse than that of the ESEM (ΔCFI = −0.055, ΔTLI = −0.055, ΔRMSEA = +0.013). The B-CFA model, in contrast, did not fit substantially worse than the ESEM (ΔCFI = −0.010, ΔTLI = −0.010, ΔRMSEA = +0.003), whereas the ESEM fitted the data significantly worse than the B-ESEM (ΔCFI = −0.012, ΔTLI = −0.012, ΔRMSEA = +0.004). The B-CFA model, in turn, exhibited a substantially worse fit than the B-ESEM (ΔCFI = −0.022, ΔTLI = −0.022, ΔRMSEA = +0.007).
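The descriptive differences in fit can be recomputed directly from the CFI, TLI, and RMSEA values in Table 1. The following sketch is only an illustration; the dictionary `FIT` and the helper `fit_difference` are hypothetical names, and the difference is taken as model minus reference model.

```python
# (CFI, TLI, RMSEA) as reported in Table 1
FIT = {
    "3-factor CFA": (0.913, 0.908, 0.036),
    "ESEM": (0.968, 0.963, 0.023),
    "B-CFA": (0.958, 0.953, 0.026),
    "B-ESEM": (0.980, 0.975, 0.019),
}


def fit_difference(model: str, reference: str) -> tuple:
    """Return (dCFI, dTLI, dRMSEA) = model minus reference, rounded to 3 decimals.

    Negative dCFI/dTLI and positive dRMSEA mean that `model` fits worse
    than `reference`.
    """
    return tuple(round(a - b, 3) for a, b in zip(FIT[model], FIT[reference]))


for model, reference in [
    ("3-factor CFA", "ESEM"),
    ("B-CFA", "ESEM"),
    ("ESEM", "B-ESEM"),
    ("B-CFA", "B-ESEM"),
]:
    print(model, "vs.", reference, fit_difference(model, reference))
```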

According to these descriptive statistics, the B-ESEM appeared to be the best representation of the data. However, given that each of the alternative models is able to absorb unmodeled sources of misfit, so that misspecified models may fit similarly well (Asparouhov, Muthén, & Morin, 2015), the comparison of fit indices is not a sufficient basis for model selection (Morin et al., 2016a). For this reason, the standardized parameter estimates, statistical conformity, and theoretical adequacy should be taken into account when selecting the best representation of the data (Morin et al., 2016a). If the G-factor of the B-ESEM turned out to be weakly defined through low factor loadings, then the ESEM should be selected as a more viable alternative model. Table 2 presents the parameter estimates from all models under investigation.

Table 2.

Standardized Factor Loadings of the Models Under Investigation.

Item	CFA	B-CFA		ESEM			B-ESEM
Item	λ	G-factor λ	S-factor λ	Factor 1 (MA) λ	Factor 2 (TA) λ	Factor 3 (DS) λ	G-factor λ	S-factor (TA) λ	S-factor (MA) λ	S-factor (DS) λ
Item 2	.70	.59	.26	.36	.38	−.02	.71	−.13	.02	.02
Item 3	.59	.59	.26	.81	−.24	.11	.38	.05	.61	.12
Item 5	.56	.49	.33	.59	−.01	.05	.45	.00	.37	.06
Item 8	.63	.55	.34	.57	.08	.06	.53	.02	.34	.07
Item 11	.72	.72	.02	.44	.34	−.07	.74	−.14	.07	−.02
Item 14	.58	.48	.47	.69	−.07	−.01	.49	−.08	.41	.03
Item 19	.69	.64	.25	.65	.11	−.11	.61	.01	.36	−.09
Item 20	.61	.49	.51	.74	−.15	.14	.44	.05	.54	.15
Item 26	.60	.56	.21	.46	.15	.10	.52	.05	.26	.11
Item 27	.57	.49	.34	.58	.03	−.04	.47	.04	.37	−.03
Item 38	.84	.81	.15	.59	.32	−.09	.76	.09	.30	−.08
Item 39	.66	.53	.56	.81	−.13	.03	.50	.01	.56	.04
Item 42	.59	.58	.04	.33	.30	−.05	.60	−.11	.04	−.01
Item 45	.62	.54	.36	.68	−.05	.05	.47	.07	.48	.05
Item 49	.77	.73	.20	.51	.32	−.04	.73	−.02	.19	−.02
ω	.92		.69	.91					.76
Item 1	.58	.50	.30	.11	.53	−.09	.55	.20	−.01	−.11
Item 7	.69	.70	.01	.31	.42	.01	.67	.04	.06	.03
Item 12	.68	.54	.51	−.07	.75	.08	.56	.52	.01	−.01
Item 15	.58	.43	.50	−.14	.67	.19	.42	.60	.04	.10
Item 22	.66	.62	.18	.26	.41	.13	.53	.36	.24	.08
Item 23	.60	.52	.32	.08	.50	.17	.46	.46	.15	.09
Item 25	.74	.76	−.01	.38	.39	.10	.70	.07	.14	.11
Item 30	.66	.55	.38	−.03	.62	.32	.47	.58	.12	.23
Item 32	.79	.63	.55	−.03	.85	−.05	.73	.35	−.12	−.10
Item 34	.71	.65	.26	.16	.61	−.11	.74	.00	.13	−.10
Item 36	.80	.67	.45	.07	.78	−.11	.77	.26	−.09	−.14
Item 40	.83	.68	.55	−.03	.87	.06	.73	.47	−.04	−.02
Item 41	.75	.72	.17	.27	.53	−.01	.76	.01	−.02	.02
Item 44	.67	.54	.45	−.09	.80	−.12	.70	.11	−.29	−.13
Item 47	.78	.66	.45	.02	.80	−.07	.75	.27	−.12	−.10
ω	.94		.78		.93			.74
Item 10	.70	.34	.59	.22	.02	.59	.30	−.01	.15	.60
Item 16	.42	.12	.68	−.06	.03	.70	.08	.07	.03	.68
Item 17	.23	.00	.51	−.06	−.04	.49	.02	−.09	−.05	.51
Item 29	.65	.34	.46	.34	−.07	.48	.24	.10	.32	.47
Item 31	.61	.26	.66	.01	.11	.69	.20	.13	.08	.67
Item 35	.84	.53	.23	.14	.36	.27	.50	.08	.01	.27
Item 50	.53	.35	.16	.27	.06	.18	.27	.10	.21	.16
Item 6	.28	.05	.59	−.06	−.03	.59	.04	−.02	.00	.59
Item 21	.42	.08	.85	−.14	.05	.86	.08	.01	−.05	.84
Item 46	.25	−.03	.66	−.15	−.01	.64	.01	−.06	−.10	.65
ω	.77	.96	.83			.83	.96			.83
ω_h	−	.81					.81
Latent factor correlations
CFA	Factor 2 (TA)	Factor 3 (DS)			ESEM		Factor 2 (TA)	Factor 3 (DS)
Factor 1 (MA)	.80***	.45***			Factor 1 (MA)		.64***	.21***
Factor 2 (TA)		.41***			Factor 2 (TA)			.11***

Note. CFA = confirmatory factor analysis; B = bifactor model; ESEM = exploratory structural equation model; ω = McDonald’s omega coefficient; ω_h = McDonald’s hierarchical omega coefficient; TA = test anxiety; DS = dislike of school; MA = manifest anxiety.

***p < .001.

The first comparison involved the 3-factor CFA model and the ESEM. In the CFA model, all factors were well defined by relatively high factor loadings (λ = .23–.84) and satisfactory composite reliability (ω = .77–.94). However, the latent factor correlations were also relatively high (.41–.80), indicating no strong discriminant validity of these factors. The ESEM, in contrast, revealed much lower factor correlations (.11–.64), while all factors remained well defined (λ = .18–.87) and reliable (ω = .83–.93). Although most cross-loadings were small (|.01–.38|), 10 items showed cross-loadings ⩾ 0.30, indicating that another source of unmodeled multidimensionality may be involved.
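The summary of target loadings and cross-loadings in the ESEM solution can be reproduced from the ESEM columns of Table 2. The sketch below is illustrative; the data structure and the helper `summarize` are hypothetical, and the assignment of items to target factors follows the three item blocks of Table 2 (first block: Factor 1, second block: Factor 2, third block: Factor 3).

```python
# ESEM loadings (Factor 1, Factor 2, Factor 3) from Table 2,
# grouped by item block: (index of target factor, rows in table order).
ESEM = {
    "block 1": (0, [
        (.36, .38, -.02), (.81, -.24, .11), (.59, -.01, .05), (.57, .08, .06),
        (.44, .34, -.07), (.69, -.07, -.01), (.65, .11, -.11), (.74, -.15, .14),
        (.46, .15, .10), (.58, .03, -.04), (.59, .32, -.09), (.81, -.13, .03),
        (.33, .30, -.05), (.68, -.05, .05), (.51, .32, -.04),
    ]),
    "block 2": (1, [
        (.11, .53, -.09), (.31, .42, .01), (-.07, .75, .08), (-.14, .67, .19),
        (.26, .41, .13), (.08, .50, .17), (.38, .39, .10), (-.03, .62, .32),
        (-.03, .85, -.05), (.16, .61, -.11), (.07, .78, -.11), (-.03, .87, .06),
        (.27, .53, -.01), (-.09, .80, -.12), (.02, .80, -.07),
    ]),
    "block 3": (2, [
        (.22, .02, .59), (-.06, .03, .70), (-.06, -.04, .49), (.34, -.07, .48),
        (.01, .11, .69), (.14, .36, .27), (.27, .06, .18), (-.06, -.03, .59),
        (-.14, .05, .86), (-.15, -.01, .64),
    ]),
}


def summarize(blocks, threshold=0.30):
    """Return (min target loading, max target loading,
    largest absolute cross-loading, number of items with a cross-loading
    at or above the threshold)."""
    targets, crosses, flagged = [], [], 0
    for target_idx, rows in blocks.values():
        for row in rows:
            cross = [abs(v) for i, v in enumerate(row) if i != target_idx]
            targets.append(row[target_idx])
            crosses.extend(cross)
            if max(cross) >= threshold:
                flagged += 1
    return min(targets), max(targets), max(crosses), flagged


print(summarize(ESEM))
```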

Scrutiny of the parameter estimates of the B-CFA model showed that most items had adequate loadings > .20 on the G-factor (λ = .26–.81), which was sufficiently reliable (ω = .96, ω_h = .81). The only exceptions were the items 16, 17, 6, 21, and 46 with lower loadings < .20 (λ = |.00–.12|) on the G-factor but relatively high loadings on their corresponding S-factor DS (λ = .51–.85). Thus, these five items retained a relatively high level of specificity when the G-factor was taken into account, and the omega coefficient of the S-factor DS remained sufficient (ω = .83). The S-factor MA appeared to represent the lowest meaningful level of specificity (λ = .02–.56, ω = .69), followed by the S-factor TA (λ = |.01–.55|, ω = .78).

Results of the B-ESEM solution showed that the pattern of target loadings was quite similar to that of the B-CFA model: The G-factor of the B-ESEM was clearly defined by most items (λ = .20–.77, ω = .96, ω_h = .81). Again, only the items 16, 17, 6, 21, and 46 displayed lower loadings < .20 (λ = .01–.08) on the G-factor but high loadings on their corresponding S-factor DS (λ = .51–.84, ω = .83). The target loadings and reliability of the S-factors MA (λ = .02–.61, ω = .76) and TA (λ = .00–.60, ω = .74) were weaker than those of the S-factor DS (λ = .16–.84; ω = .83). Overall, the level of specificity associated with these three S-factors of the B-ESEM was similar to that of the B-CFA model. Furthermore, the cross-loadings of the B-ESEM remained mostly smaller (|.00–.32|) than those of the ESEM (|.01–.38|), and the S-factor loadings of the B-ESEM were also mostly smaller (λ = .00–.84) than the factor loadings in the ESEM (λ = .18–.87), indicating that the multidimensionality absorbed by the cross-loadings in the ESEM was captured by the G-factor. These results provided support for the superiority of the B-ESEM solution, which was retained for the further analyses.

Measurement Invariance

Results of measurement invariance testing across gender, age groups, and school types based on the B-ESEM are included in Table 1.

For gender, age groups, and school types alike, there was strong support for full measurement invariance. According to the guidelines suggested by Cheung and Rensvold (2002), there was no substantial decrease of model fit between the less and more restricted invariance models. The fit indices of all models were sufficient (see Table 1).
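The decision rule for measurement invariance can be applied to the CFI and RMSEA values in Table 1. The sketch below is an illustration under the criteria stated in the Data Analysis section (a CFI decrease of at most 0.010 and an RMSEA increase of at most 0.015 between successive models); the function name and data layout are hypothetical.

```python
# (step, CFI, RMSEA) for the B-ESEM invariance models in Table 1
INVARIANCE = {
    "gender": [("configural", 0.978, 0.019), ("scalar", 0.978, 0.018), ("strict", 0.980, 0.017)],
    "age groups": [("configural", 0.982, 0.016), ("scalar", 0.981, 0.015), ("strict", 0.980, 0.015)],
    "school types": [("configural", 0.981, 0.015), ("scalar", 0.980, 0.014), ("strict", 0.978, 0.015)],
}


def invariance_supported(steps, max_cfi_drop=0.010, max_rmsea_rise=0.015):
    """Check each more restrictive model against the preceding one."""
    for (_, cfi_prev, rmsea_prev), (_, cfi_next, rmsea_next) in zip(steps, steps[1:]):
        cfi_drop = round(cfi_prev - cfi_next, 3)
        rmsea_rise = round(rmsea_next - rmsea_prev, 3)
        if cfi_drop > max_cfi_drop or rmsea_rise > max_rmsea_rise:
            return False
    return True


for grouping, steps in INVARIANCE.items():
    print(grouping, invariance_supported(steps))
```

The largest CFI decrease between successive models in Table 1 is 0.002 (school types, scalar to strict), which is well below the 0.010 criterion.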

Latent Mean Differences for Gender, School Types, and Age Groups

When exploring potential effects of gender, school types, and age groups, there were some significant latent mean differences. Appendix B in the supplemental materials presents the latent mean differences for gender, school types, and age groups.

Girls reported higher levels of TA and MA than boys, while boys stated a higher DS than girls.

Elementary school students showed significantly higher levels of TA and MA than students attending Oberschule or Gymnasium, and they scored significantly lower on DS than students attending Gymnasium. Students attending Oberschule, in turn, scored significantly lower on DS than students attending Gymnasium.

Students aged 9 to 11 years experienced significantly higher levels of TA and MA but a significantly lower DS than all the older age groups. Students aged 12 to 13 years reported a significantly higher TA than students aged 14 to 15 years or students aged 16 to 19 years, and they stated a higher level of MA than students aged 16 to 19 years. Furthermore, students aged 12 to 13 years reported a significantly lower DS than students aged 16 to 19 years, and students aged 14 to 15 years experienced a significantly higher TA but a significantly lower DS than students aged 16 to 19 years.

Discussion

The present study was the first to examine the factorial validity of the AFS by using the emerging bifactor modeling approach suggested by Morin et al. (2016a).

As expected, the 3-factor CFA model showed the best fit to the data compared to a more global 1- or 2-factor CFA model. However, when contrasting the 3-factor CFA model with the alternative models according to the bifactor modeling approach, the B-ESEM solution appeared to be the best representation of the data. The factor loadings of the G-factor were generally numerically higher than those of the S-factors, as also noted by Perreira et al. (2018). This is reasonable because all items loaded on both the G-factor and the three S-factors. Thus, the total covariance among the items was divided between the G-factor and the S-factors. However, although the S-factors remained reasonably reliable and were defined by strong and positive target loadings from most items, there were 10 items with target S-factor loadings < .20 in the B-ESEM solution. In addition, the items 16, 17, 6, 21, and 46, which all defined the S-factor DS, showed very poor loadings < .20 on the G-factor, indicating the distinct nature of dislike of school. Apart from the superior fit, the target S-factor loadings and estimates of composite reliability were satisfactory in the B-ESEM solution, indicating that the S-factors represented meaningful specificity not reflected in the G-factor and supporting the superiority of the B-ESEM solution. For this reason, the B-ESEM was retained for the further analyses.

Empirical support for the factorial validity of the AFS was also found by the measurement invariance testing across gender, age groups, and school type using the B-ESEM. For all measurement invariance models, there were no substantial deteriorations of model fit between the less and more restrictive models.

Beyond the central objectives of this study, latent mean differences between gender, school types, and age groups in the three specific facets of the AFS were additionally explored. In line with some previous research (e.g., Hill et al., 2016; Pekrun et al., 2017), girls reported significantly higher levels of TA and MA than boys. However, somewhat surprisingly, students attending Gymnasium did not show significantly higher levels of TA and MA than students attending the other school types, although it can reasonably be assumed that students attending Gymnasium are more likely to be exposed to TA due to their final exams and general qualification for university entrance. A possible reason for this finding may be the overrepresentation of students attending Grades 7 and 9 in the present study, who were not close to graduation and may, thus, be less exposed to TA than students of higher grades before graduation or leaving school.

Further interesting findings emerged for the latent mean differences across age groups: Students of the youngest age group (9 to 11 years) reported significantly higher levels of TA and MA than the older age groups. Among the older age groups, there was a consistent pattern only for TA, which was significantly lower in each successive age group, whereas for MA only students aged 12 to 13 years reported a significantly higher level than students aged 16 to 19 years. For DS, conversely, there was a reverse pattern: Students of the youngest age group stated a significantly lower DS than students of the older age groups, while students aged 12 to 13 years and students aged 14 to 15 years experienced a significantly lower DS only in comparison with students of the oldest age group. However, this does not seem to be unusual given the increasing decline of motivation during the school years (Gnambs & Hanfstingl, 2016; Leroy & Bressoux, 2016).

Limitations and Conclusion

The present study has some limitations: A first limitation concerns the oldest age group, which included all students between 16 and 19 years. However, it is possible that students aged 18 to 19 years experience a different level of anxiety than younger students aged 16 to 17 years. Further studies should, thus, use more heterogeneous samples.

Another limitation arises from the measurement of students’ anxiety as it is solely based on students’ self-reports. However, it can be assumed that some students may not admit their anxiety, especially boys. This may also be the reason for the significantly higher levels of anxiety among girls, as girls generally show more emotions than boys (Linley, Dovey, Beaumont, Wilkinson, & Hurling, 2016). Behavioral observations of students’ learning by teachers during exams or by parents at home would, thus, be a good extension to the AFS.

Finally, another major limitation of this study is the absence of a specific hypothesis regarding the factor structure of the ESEM and B-ESEM. It is to be expected that an ESEM solution will result in a better fitting model than the other models, and that it will suffer from the same problems as an EFA model when used without an a priori hypothesis.

Apart from these limitations, the results provided strong support for a newly proposed representation of the AFS, as the B-ESEM solution supported the use of both the total scale anxiety score and the facet (subscale) scores. More practically, results showed that the bifactor ESEM framework tends to naturally disaggregate the effects attributable to global anxiety from those attributable to more specific facets of anxiety. An important implication of this research is that future investigations should also test the B-ESEM for the measurement of students’ anxiety and, in particular, use latent variable modeling, which accurately depicts the central constructs and corrects for measurement errors (Marsh & Hau, 2007). When investigating several complex models simultaneously, final predictive models could be based on factor scores saved from preliminary measurement models. These factor scores may help in understanding the underlying nature of students’ anxiety while correcting for measurement errors (Morin et al., 2016a).

To summarize, the results of this study provided strong support for the factorial validity of the AFS and the adequate use of bifactor modeling with the AFS. The B-ESEM showed the best factor representation for the AFS. However, as this was the first study testing the factor structure of the AFS within the bifactor modeling framework, further research is recommended to substantiate the reported findings and the multidimensionality of the AFS.

Supplemental Material

Supplemental material, APPENDIX_A_REV2 for Factorial Validity of the Anxiety Questionnaire for Students (AFS): Bifactor Modeling and Measurement Invariance by Annette Lohbeck and Franz Petermann in Journal of Psychoeducational Assessment

Supplemental material, APPENDIX_B_REV2 for Factorial Validity of the Anxiety Questionnaire for Students (AFS): Bifactor Modeling and Measurement Invariance by Annette Lohbeck and Franz Petermann in Journal of Psychoeducational Assessment

Footnotes

Acknowledgements

The authors would like to thank Alexandre J. S. Morin for his great support and very constructive comments on the data analysis.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Supplemental Material

Supplemental material for this article is available online.

References

Asparouhov, & Muthén, B. O. (2009). Exploratory structural equation modeling. Structural Equation Modeling, 16, 397-438. doi:10.1080/10705510903008204

Asparouhov, Muthén, B. O., & Morin, A. J. S. (2015). Bayesian structural equation modeling with cross-loadings and residual covariances: Comments on Stromeyer et al. Journal of Management, 41, 1561-1577. doi:10.1177/0149206315591075

Asparouhov, Muthén, B. O., & Muthén (2006). Robust chi square difference testing with mean and variance adjusted test statistics. Matrix, 1(5), 1-6.

Cheng, Klinger, Fox, Doe, & Jin (2014). Motivation and test anxiety in test performance across three testing contexts: The CAEL, CET, and GEPT. TESOL Quarterly, 48, 300-330. doi:10.1002/tesq.105

Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233-255. doi:10.1207/S15328007SEM0902_5

Dan, & Raz (2012). The relationships among ADHD, self-esteem, and test anxiety in young adults. Journal of Attention Disorders, 19, 231-239. doi:10.1177/1087054712454571

Gignac, G. E., & Watkins, M. W. (2013). Bifactor modeling and the estimation of model-based reliability in the WAIS-IV. Multivariate Behavioral Research, 48, 639-662. doi:10.1080/00273171.2013.804398

Gnambs, & Hanfstingl (2016). The decline of academic motivation during adolescence: An accelerated longitudinal cohort analysis on the effect of psychological need satisfaction. Educational Psychology, 36, 1691-1705. doi:10.1080/01443410.2015.1113236

Götz, Bieg, Lüdtke, Pekrun, & Hall, N. C. (2013). Do girls really experience more anxiety in mathematics? Psychological Science, 24, 2079-2087. doi:10.1177/0956797613486989

Hill, Mammarella, I. C., Devine, Caviola, Passolunghi, M. C., & Szűcs (2016). Maths anxiety in primary and secondary school students: Gender differences, developmental changes and anxiety specificity. Learning and Individual Differences, 48, 45-53. doi:10.1016/j.lindif.2016.02.006

Hu, L.-T., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3, 424-453.

Klasen, Petermann, Meyrose, A.-K., Barkmann, Otto, Haller, A.-C., . . . Ravens-Sieberer (2016). Verlauf psychischer Auffälligkeiten von Kindern und Jugendlichen [Trajectories of mental health problems in children and adolescents: Results of the BELLA cohort study]. Kindheit und Entwicklung, 25, 10-20. doi:10.1026/0942-5403/a000184

Leroy, & Bressoux (2016). Does amotivation matter more than motivation in predicting mathematics learning gains? A longitudinal study of sixth-grade students in France. Contemporary Educational Psychology, 44-45, 41-53. doi:10.1016/j.cedpsych.2016.02.001

Linley, P. A., Dovey, Beaumont, Wilkinson, & Hurling (2016). Examining the intensity and frequency of experience of discrete positive emotions. Journal of Happiness Studies, 17, 875-892. doi:10.1007/s10902-015-9619-7

Marsh, H. W., & Hau, K. T. (2007). Applications of latent-variable models in educational psychology: The need for methodological-substantive synergies. Contemporary Educational Psychology, 32, 151-170. doi:10.1016/j.cedpsych.2006.10.008

Marsh, H. W., Liem, G. A. D., Martin, A. J., Morin, A. J. S., & Nagengast (2011). Methodological measurement fruitfulness of exploratory structural equation model: New approaches to issues in motivation and engagement. Journal of Psychoeducational Assessment, 29, 322-346. doi:10.1177/0734282911406657

Marsh, H. W., Morin, A. J. S., Parker, & Kaur (2014). Exploratory structural equation modeling: An integration of the best features of exploratory and confirmatory factor analysis. Annual Review of Clinical Psychology, 10, 85-110. doi:10.1146/annurev-clinpsy-032813-153700

McDonald, R. P. (1970). The theoretical foundations of common factor analysis, principal factor analysis, and alpha factor analysis. British Journal of Mathematical and Statistical Psychology, 23, 1-21. doi:10.1111/j.2044-8317.1970.tb00432.x

Morin, A. J. S., Arens, A. K., & Marsh, H. W. (2016a). A bifactor exploratory structural equation modeling framework for the identification of distinct sources of construct-relevant psychometric multidimensionality. Structural Equation Modeling: A Multidisciplinary Journal, 23, 116-139. doi:10.1080/10705511.2014.961800

Morin, A. J. S., Arens, A. K., Tran, & Caci (2016b). Exploring sources of construct-relevant multidimensionality in psychiatric measurement: A tutorial and illustration using the Composite Scale of Morningness. International Journal of Methods in Psychiatric Research, 25, 277-288. doi:10.1002/mpr.1485

Muthén, L. K., & Muthén, B. O. (2018). Mplus user’s guide. Los Angeles, CA: Author.

Nitkowski, Lohbeck, & Petermann (2017). Hat die Angstausprägung bei Kindern und Jugendlichen in Deutschland von 1974 bis 2016 zugenommen? [Has anxiety in German children and adolescents increased from 1974 to 2016? Cross-temporal analysis of anxiety over a period of 42 years]. Kindheit und Entwicklung, 26, 110-117. doi:10.1026/0942-5403/a000222

Pekrun, Lichtenfeld, Marsh, H. W., Murayama, & Götz (2017). Achievement emotions and academic performance: Longitudinal models of reciprocal effects. Child Development, 88, 1653-1670. doi:10.1111/cdev.12704

Perreira, T. A., Morin, A. J. S., Hebert, Gillet, Houle, S. A., & Berta (2018). The short form of the Workplace Affective Commitment Multidimensional Questionnaire (WACMQ-S): A bifactor-ESEM approach among healthcare professionals. Journal of Vocational Behavior, 106, 62-83. doi:10.1016/j.jvb.2017.12.004

Reise, S. P. (2012). The rediscovery of bifactor measurement models. Multivariate Behavioral Research, 47, 667-696. doi:10.1080/00273171.2012.715555

Reise, S. P., Moore, T. M., & Haviland, M. G. (2010). Bifactor models and rotations: Exploring the extent to which multidimensional data yield univocal scale scores. Journal of Personality Assessment, 92, 544-559. doi:10.1080/00223891.2010.496477

Simms, L. J., Grös, D. F., Watson, & O’Hara (2008). Parsing general and specific components of depression and anxiety with bifactor modeling. Depression and Anxiety, 25, 34-46. doi:10.1002/da.20432

Spielberger, C. D., & Vagg, P. R. (1995). Test anxiety: Theory, assessment and treatment. Washington, DC: Taylor & Francis.

Wiesczerkowski, Nickel, Janowski, Fittkau, Rauer, & Petermann (2016). Angstfragebogen für Schüler [Anxiety questionnaire for students] (7th ed.). Göttingen, Germany: Hogrefe.

Zeidner (1998). Test anxiety: The state of the art. New York, NY: Plenum Press.