Abstract
A procedure for evaluating the average R-squared index for a given set of observed variables in an exploratory factor analysis model is discussed. The method can be used as an effective aid in the process of model choice with respect to the number of factors underlying the interrelationships among studied measures. The approach is developed within the framework of exploratory structural equation modeling and is readily applicable with popular statistical software. The outlined procedure is illustrated using a numerical example.
For over a century, factor analysis (FA) has received an impressive amount of attention across the educational, behavioral, social, marketing, organizational, and biomedical sciences (e.g., Cudeck & MacCallum, 2007). For most of this time, a particular mode of FA, called exploratory factor analysis (EFA), has been a highly popular approach for examining latent structures of sets of studied observed variables (for instance, those comprising scales, tests, test batteries, multiitem measuring instruments, composites, subscales, testlets, self-reports, inventories, questionnaires, or surveys; e.g., Mulaik, 2009). A main activity in EFA applications has been model selection, specifically ascertaining the number of presumed underlying latent variables, constructs, dimensions, or factors (Raykov & Marcoulides, 2008). Throughout the past half century or so, the so-called Kaiser eigenvalue rule and its associated scree-plot have been widely used for determining this number, together with substantive interpretation of rotated solutions and factor structures and, where applicable, information criteria such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC; Fabrigar & Wegener, 2013).
Owing to the exploratory nature of EFA, its conventional applications do not include error covariances or restrictions on model parameters (Mulaik, 2009). Recent methodological advances, however, have allowed some model parameter constraints to be relaxed. In particular, the exploratory structural equation modeling (ESEM) approach permits imposing constraints on error variances and mean structure parameters (Asparouhov & Muthén, 2009). As a result of these advances, it also becomes possible to introduce new external (additional) parameters that may be helpful in the frequently difficult and at times not uncontroversial process of model choice in EFA, especially when evaluating the number of factors.
The present article aims to contribute to the discussion of model selection in EFA and particularly with respect to the process of ascertaining the number of common factors underlying a studied set of manifest measures. The discussion focuses on a readily applicable statistic that can be used as a complement to the aforementioned popular selection indices, which are widely used during this process in EFA applications. In the next section, we draw attention to an average proportion explained variance (average R-squared) index, which can with its point and interval estimates significantly aid researchers interested in evaluating the number of factors in EFA. The subsequent illustration section uses a numerical example, where this explained variance statistic is shown to help correctly identify the true number of factors and is found to outperform in this respect the popular AIC index.
The Average R-Squared Index and Its Utility in Exploratory Factor Analysis
Background, Notation, and Assumptions
To accomplish the aims of this article, we assume that a set of p (approximately) continuous observed variables is given, which are denoted X1, X2, . . ., Xp (p > 2). They may, but need not, represent the components of a psychometric scale, test, inventory, composite, questionnaire, survey, test-battery, or self-report, and their number p is considered fixed (i.e., these measures are not drawn or sampled from a larger pool of items, and thus represent the only set of manifest variables of interest). The variables X1, X2, . . ., Xp are presumed to have been administered to a sample from a studied population of units of analysis (e.g., patients, students, subjects, or respondents), which is not characterized by clustering effects or a substantial degree of unobserved heterogeneity (e.g., Raykov et al., 2016). In addition, we posit that the common factor model is valid for them (e.g., Mulaik, 2009), that is,
$$X = \Lambda \eta + \varepsilon \qquad (1)$$
holds, where X = (X1, X2, . . ., Xp)′ is the p × 1 vector of observed variables, Λ is the p × m matrix of factor loadings, η = (η1, . . ., ηm)′ is the m × 1 vector of common factors, and ε = (ε1, . . ., εp)′ is the p × 1 vector of residuals (unique factors).
Ascertaining the Number of Underlying Factors and a Widely Followed Practice
In EFA, a major question is concerned with evaluating the number m of underlying common factors. While oftentimes m = 1 is desirable, especially when employed scales, tests, or composites are anticipated to be unidimensional, in empirical research m > 1 can and will frequently hold, for instance in early stages of instrument development (e.g., Raykov, 2012). A number of indices have been developed over the past half century or so to aid researchers in this process of ascertaining m, which may at times be complicated and not uncontroversial. Some of the most popular criteria include Kaiser’s eigenvalue rule and its associated scree-plot. These criteria, along with crucial substantive interpretations of the factors based on factor pattern and factor structure matrices (after an appropriate rotation when m > 1; e.g., Fabrigar & Wegener, 2013), are routinely used in EFA applications. In addition, information criteria such as the AIC and BIC can provide relevant information for the process of evaluating the number of factors. In the remainder of this article, for convenience all indices mentioned in this paragraph are referred to as traditional or conventional indices.
Average R-Squared Index as a Helpful Aid in Factor Number Evaluation
A main index of model fit in standard regression analysis is the popular R-squared index, defined as the proportion of explained variance in a response variable of interest in a given regression model (e.g., Agresti & Finlay, 2009). With this index in mind, a revisit of the common FA model in Equation (1) can be quite helpful. Specifically, Equation (1) could be viewed, at least conceptually, as a regression model with unobserved predictors that are the factors in the above vector η. For each observed variable Xk, the proportion of its variance explained by the m factors can then be written as
$$R^2_{k,m} = \frac{\lambda_k' \Phi \lambda_k}{\lambda_k' \Phi \lambda_k + \theta_k} \qquad (2)$$
where λk′ is the kth row of the factor loading matrix, Φ is the m × m covariance matrix of the factors, and θk is the residual variance of Xk. This proportion can be viewed as an index of “local” fit (quality) of a considered EFA model with respect to Xk, that is, as an index of how well this m-factor model describes and explains the data on Xk (k = 1, . . ., p; see also below; it is mentioned in passing that $R^2_{k,m}$ would equal the well-known communality statistic with orthogonal factors). Since the p proportions in Equation (2) are in general unequal, whether in a population of concern or a sample from it, when one is interested in how well this EFA model fits the data on the analyzed set of observed variables X1, X2, . . ., Xp (with respect to observed variance), it would be useful to consider also their average, denoted $\bar{R}^2_m$ and defined as
$$\bar{R}^2_m = \frac{1}{p} \sum_{k=1}^{p} R^2_{k,m} = \frac{1}{p} \sum_{k=1}^{p} \frac{\lambda_k' \Phi \lambda_k}{\lambda_k' \Phi \lambda_k + \theta_k}. \qquad (3)$$
The quantity on the right-hand side of Equation (3) can be interpreted as an index of how well the EFA model describes and explains on average the variability in the analyzed manifest variables. For simplicity and convenience, we refer to this quantity, $\bar{R}^2_m$, as the average R-squared (ARS) for a considered EFA model with m factors, and discuss next how one can point and interval estimate this index in an empirical study.
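The averaging just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the explained variance of each variable is taken as its loading row times the factor covariance matrix times the same loading row, and the total variance adds the residual variance. The function names and all numerical values are hypothetical and are not taken from the article or its appendices.

```python
from typing import List, Sequence


def r_squared_per_variable(
    loadings: Sequence[Sequence[float]],
    factor_cov: Sequence[Sequence[float]],
    residual_var: Sequence[float],
) -> List[float]:
    """Proportion of explained variance for each observed variable.

    Assumption: explained variance of variable k is l_k' * Phi * l_k,
    and its total variance is that quantity plus the residual variance.
    """
    result = []
    for row, theta in zip(loadings, residual_var):
        m = len(row)
        explained = sum(
            row[i] * factor_cov[i][j] * row[j]
            for i in range(m)
            for j in range(m)
        )
        result.append(explained / (explained + theta))
    return result


def average_r_squared(
    loadings: Sequence[Sequence[float]],
    factor_cov: Sequence[Sequence[float]],
    residual_var: Sequence[float],
) -> float:
    """Average of the per-variable R-squared values (the ARS)."""
    values = r_squared_per_variable(loadings, factor_cov, residual_var)
    return sum(values) / len(values)


# Hypothetical two-factor model with four variables (made-up values).
example_loadings = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.3, 0.5]]
example_factor_cov = [[1.0, 0.4], [0.4, 1.0]]
example_residual_var = [0.36, 0.51, 0.64, 0.54]
ars = average_r_squared(example_loadings, example_factor_cov, example_residual_var)
```

With correlated factors, the cross-loading in the last row contributes through the factor covariance as well, which is why the factor covariance matrix enters the computation and not only the squared loadings.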
Point and Interval Estimation of the Average R-Squared Index
On fitting an EFA model of interest to a given data set and estimating its parameters, a point estimate of the ARS results as usual by substituting the relevant model parameter estimates into the right-hand side of Equation (2), and then the resulting R-squared indices into that side of Equation (3). This process is readily conducted within the framework of the comprehensive latent variable modeling methodology (Muthén, 2002), and is straightforwardly carried out in empirical research using the popular latent variable modeling program Mplus (Muthén & Muthén, 2020). In particular, due to the invariance property of the widely used maximum likelihood method when the latter is applicable (e.g., with normality of the manifest measures; Casella & Berger, 2002), this activity renders the maximum likelihood estimate of the ARS as the resulting quantity in the right-hand side of Equation (3). This procedure is exemplified in a following section (see Appendix B for the needed source code for point estimation of the ARS; see also details in the next section).
To obtain an interval estimate of the ARS index, we can proceed as follows. We commence with the initial monotone transformation approach discussed in Raykov and Marcoulides (2011; see also references therein). Its application is appropriate for this purpose because the corresponding population ARS (cf. Equation 3) is bounded from below and above by 0 and 1, respectively. Hence, we can select the well-known logit transformation as such an initial monotone transformation (e.g., Browne, 1982). We first obtain a 95% confidence interval (CI) for the logit of the ARS in the traditional way, namely by subtracting and adding 1.96 times its standard error, which is obtainable with the popular delta method (e.g., Raykov & Marcoulides, 2004). We then apply the inverse of the logit, that is, the logistic function, to the limits of this interval to furnish the endpoints of the sought 95% CI of the ARS. (For further details, see Raykov & Marcoulides, 2011, ch. 4 and 7.) The delta method is implemented in Mplus and automatically invoked on request (Muthén & Muthén, 2020, ch. 13). The final step of this interval estimation procedure is readily carried out with the R function “ci.ar2” that is provided in Appendix C. (See the note to that appendix for a CI at a different confidence level when desired.) The outlined ARS point and interval estimation procedure is illustrated in a following section.
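The logit-based interval just outlined can be sketched as follows. This is not the “ci.ar2” function from Appendix C, whose code is not reproduced here. It is a minimal Python illustration that assumes the inputs are an ARS estimate and its standard error on the original (0, 1) scale, and that the standard error of the logit is approximated with the delta method as the original standard error divided by estimate × (1 − estimate). The function name `logit_ci` and the example values are hypothetical.

```python
import math
from typing import Tuple


def logit_ci(estimate: float, se: float, z: float = 1.96) -> Tuple[float, float]:
    """Confidence interval for a quantity bounded by 0 and 1.

    Steps: logit-transform the estimate, build a symmetric interval on the
    logit scale, then back-transform its limits with the logistic function.
    """
    if not 0.0 < estimate < 1.0:
        raise ValueError("estimate must lie strictly between 0 and 1")
    if se <= 0.0:
        raise ValueError("standard error must be positive")
    logit = math.log(estimate / (1.0 - estimate))
    # Delta method: the derivative of logit(x) is 1 / (x * (1 - x)).
    se_logit = se / (estimate * (1.0 - estimate))
    lower_logit = logit - z * se_logit
    upper_logit = logit + z * se_logit
    lower = 1.0 / (1.0 + math.exp(-lower_logit))
    upper = 1.0 / (1.0 + math.exp(-upper_logit))
    return lower, upper


# Made-up estimate and standard error, for illustration only.
example_lower, example_upper = logit_ci(0.36, 0.01)
```

Because the interval is built on the logit scale, its back-transformed limits always stay inside (0, 1) and are in general not symmetric around the point estimate. A different confidence level can be obtained by passing another value for `z`.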
We discuss next how one can use the resulting point and interval estimates of the ARS for ascertaining the number of factors in EFA, as a helpful complement to the application of the above mentioned traditional or conventional indices routinely employed for this purpose (see introductory section).
Application of the Average R-Squared for Model Choice in EFA
When carrying out EFA, as indicated earlier a main step is that of ascertaining the number m of underlying factors (1≤m≤p). The typical approach then (e.g., Muthén & Muthén, 2020) is to compare the factor solution obtained with m = 1 factor to that with m = 2 factors, then to that with m = 3 factors, and so on. This exploratory process is followed until (a) (consistently) inadmissible solutions or lack of convergence occur, (b) a prespecified maximal number of factors of interest is reached, or (c) m = p. Based on the admissible and converged solutions thereby and their comparison as well as interpretation, a (tentative) decision can be made about the number of factors that is (i) optimal with regard to the traditional statistical indices mentioned previously, and at least as importantly (ii) associated with the best theoretically acceptable and justifiable substantive interpretation of the factors and their subject-matter meaning in the particular domain of application. 1
An equivalent approach to EFA with a prespecified set of consecutive integers considered as potential numbers of factors (usually beginning with m = 1), for a given set of observed variables, is provided by the recently developed ESEM method (Asparouhov & Muthén, 2009). Accordingly, one fits to the analyzed data set an ESEM model with m = 1 factor, then an ESEM model with m = 2 factors, and so on as above, thus allowing one to ascertain the optimal number of factors with respect to the criteria (i) and (ii) in the preceding paragraph based on the earlier mentioned traditional/conventional indices (e.g., Muthén & Muthén, 2020).
Unlike the conventional EFA approach, this ESEM method possesses an additional feature that can be particularly useful by further informing the process of evaluating the number of factors in EFA. Specifically, the ESEM method offers the opportunity to introduce new/additional (external) parameters for a given EFA model as functions of its own parameters. These new parameters are not parameters of the model per se, but only represent functions of them, and can be point and interval estimated simultaneously as the pertinent ESEM model is fitted to data. This opportunity is especially beneficial for the purposes of the present article. The reason is that, as seen from the right-hand side of Equation (3), the ARS index is a function of relevant model parameters. Hence, this ESEM approach allows an empirical scientist interested in evaluating the quality of an EFA model, and in particular concerned with EFA model choice with respect to the number of underlying factors, to point and interval estimate the ARS measure in Equation (3).
Once the ARS index is evaluated for the different EFA models with increasing number of factors, typically starting with m = 1, one may argue that as this number m approaches an optimal number m′ of factors the corresponding ARS indices, $\bar{R}^2_1, \bar{R}^2_2, \ldots$, may be expected to markedly increase:
$$\bar{R}^2_1 < \bar{R}^2_2 < \cdots < \bar{R}^2_{m'}. \qquad (4)$$
Thereby, depending on sample size, the 95% CIs, say, of these indices may be expected to be nonoverlapping (see illustration section). However, once passing this number m′, one may argue that the ARS indices may be expected to “stabilize,” plateau, and only marginally increase thereafter (“≈” symbolizes next this relationship):
$$\bar{R}^2_{m'} \approx \bar{R}^2_{m'+1} \approx \bar{R}^2_{m'+2} \approx \cdots, \qquad (5)$$
with this incremental process possibly leading in part to inadmissible or nonconvergent solutions, or alternatively reaching the maximal possible number, p, of factors in the last considered solution. Thereby, depending on sample size, the 95% CIs of these indices may be expected to be overlapping (see illustration section). 2
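The expected pattern of nonoverlapping CIs followed by overlapping ones can be inspected with a small helper such as the one below. It is a purely descriptive aid and not a formal decision rule. The function names are hypothetical, only consecutive models are compared, and the intervals in the example are made up rather than taken from the illustration section.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlap_pattern(cis: Dict[int, Interval]) -> List[Tuple[int, int, bool]]:
    """For consecutive numbers of factors, report whether the CIs overlap."""
    ms = sorted(cis)
    return [
        (m, nxt, intervals_overlap(cis[m], cis[nxt]))
        for m, nxt in zip(ms, ms[1:])
    ]


def first_plateau(cis: Dict[int, Interval]) -> Optional[int]:
    """Smallest m whose CI overlaps with that of the next model, if any."""
    for m, _, overlap in overlap_pattern(cis):
        if overlap:
            return m
    return None


# Made-up 95% CIs of the ARS for models with 1 through 5 factors.
example_cis = {
    1: (0.20, 0.24),
    2: (0.30, 0.34),
    3: (0.41, 0.45),
    4: (0.43, 0.47),
    5: (0.44, 0.48),
}
candidate = first_plateau(example_cis)
```

Any number of factors flagged this way is only a candidate. Solutions that are inadmissible or did not converge should be left out of the input, and substantive interpretability of the factors remains at least as important as the overlap pattern.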
In a later section, we demonstrate the utility of this ARS point and interval estimation with a numerical example.
Relationship to Previous Research
The outlined procedure for evaluation of the ARS index in EFA models may be seen to have some conceptual relationship to a method proposed recently for the framework of principal components analysis (PCA; e.g., Raykov & Marcoulides, 2014; see also references therein). The present approach is distinct, however, from that prior method in the following relevant aspects. First, this procedure is specifically developed for models within the FA framework, which differs decisively from that of PCA, as discussed extensively in the literature (e.g., Raykov & Marcoulides, 2008). In particular, the procedure makes essential use of residual terms and specifically their variances (e.g., Equation 3), which do not have counterparts, at least not direct ones, in PCA. 3 Second, the present procedure is critically dependent on models for the relationships among a set of analyzed observed variables, whereas in PCA there are no (straightforward) counterparts to such models (see also Note 3). Third, the factors within the EFA models of importance for the present approach need not be orthogonal (as the components in PCA must be), and in empirical educational, behavioral, or social research one may argue that they will usually be correlated (e.g., Mulaik, 2009). By way of contrast, the orthogonality requirement is essential for extracted principal components in PCA. Fourth, the factors in EFA, and more generally in any FA model, are inherently latent (that is, unobserved) variables, and thus cannot equal linear combinations of observed variables. In contrast, principal components in PCA are by construction intrinsically observed variables (i.e., they share the same observed variable status as the analyzed manifest measures). This is because the components are defined as linear combinations of the observed variables to begin with.
Hence, based on the above crucial differences between EFA and PCA, the question of appropriate number of extracted principal components cannot be logically equivalent (in the general case) to the question of appropriate number of “extracted” latent factors, from a given set of manifest measures (see also Note 3 and below).
As a consequence of these critical differences between the present procedure and that earlier method for the framework of PCA, point and interval estimates of the proportion explained variance following the latter method are in general distinct from those of the ARS index furnished by the procedure outlined in this article. In this connection, it is also worth observing the following fact. As can be seen from the right-hand side of Equation (3), when all factors are orthogonal, so that the factor covariances vanish, the resulting ARS will conceptually be the same as the proportion explained variance in PCA, which is the only proportion of concern in Raykov and Marcoulides (2014; see also Note 3).
In the next section, we demonstrate the ARS point and interval estimation outlined in this article using a numerical example.
Illustration on Data
To accomplish the aims of this method illustration section, we use simulated data on p = 15 observed variables with m = 5 factors for n = 2,000 cases, which are generated using the following model (cf. Equation 1):
where η1 through η5 are standard normal variates with correlations described in Equations (7) below, and the 15 residuals, ε1 through ε15, are independent normal variates with variance .9 each (for details pertaining to the data simulation process, see Appendix A, which contains the Mplus source code used, including the seed utilized). The factor correlations used were as follows:
As a next step, we successively fitted 7 ESEM models of relevance in this setting (see preceding section), namely, those with 1, 2, 3, 4, 5, 6, and 7 factors, respectively. (This was accomplished using the Mplus source code in Appendix B. We decided to stop increasing the number of factors at 7 to avoid inadmissible solutions and lack of convergence; we note, however, that using the source code in Appendix A one can generate the same analyzed data set and then fit ESEMs with more than 7 factors if desirable, by only trivially extending the source code in Appendix B as explicated in its Note 2.) For the sake of completeness of this discussion, Table 1 contains the eigenvalues of the relevant correlation matrix of the observed variables. The resulting indices of concern in this article are summarized in Table 2, which is of focal interest next.
Table 1. Eigenvalues for the Simulated Data Set Used in the Illustration Section.
Note. In the used software format, the consecutive ranks of the eigenvalues, in descending order, are listed above the separating line, with their numerical magnitude stated immediately below that line (Muthén & Muthén, 2020).
Table 2. Average R-Squared Indices, Standard Errors, 95% Confidence Intervals, Root Mean Square Error of Approximation, and Information Criteria Indices for the Seven ESEM Models Fitted in the Illustration Section.
Note. ARS = average R-squared (see Equation 3); SE = standard error for it; 95% CI = 95% confidence interval for ARS; RMSEA = root mean square error of approximation; m = number of factors. The model with m = 5 factors is the true model (see Appendix A).
As seen from Table 1, there are 5 eigenvalues notably larger than 1, with all subsequent eigenvalues being considerably smaller than 1. Hence, if using the popular Kaiser eigenvalue rule, one would suggest extracting m = 5 factors for the analyzed data set. This suggestion equals the true number of factors used in the data simulation process (see Equations 6 and 7 above as well as their surrounding discussion). We note also the fact that the difference between the fifth and sixth eigenvalue is marked relative to the differences among the sixth through 15th eigenvalues, which additionally corroborates this suggestion (cf. Raykov & Marcoulides, 2008).
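The two eigenvalue-based checks used in this paragraph, counting eigenvalues above 1 and looking for a marked drop between consecutive eigenvalues, can be sketched as below. The function names are hypothetical and the eigenvalues are made up for a six-variable correlation matrix; they are not the values from Table 1.

```python
from typing import List, Sequence


def kaiser_count(eigenvalues: Sequence[float], threshold: float = 1.0) -> int:
    """Number of eigenvalues exceeding the threshold (Kaiser's rule uses 1)."""
    return sum(1 for value in eigenvalues if value > threshold)


def successive_drops(eigenvalues: Sequence[float]) -> List[float]:
    """Differences between consecutive eigenvalues in descending order."""
    ordered = sorted(eigenvalues, reverse=True)
    return [a - b for a, b in zip(ordered, ordered[1:])]


# Made-up eigenvalues of a 6 x 6 correlation matrix (they sum to 6).
hypothetical_eigenvalues = [2.2, 1.6, 1.2, 0.4, 0.35, 0.25]
n_factors = kaiser_count(hypothetical_eigenvalues)
drops = successive_drops(hypothetical_eigenvalues)
# Position (1-based) of the eigenvalue after which the largest drop occurs.
largest_drop_after = drops.index(max(drops)) + 1
```

When both checks point to the same number, as in this made-up example, they corroborate each other in the sense described above. When they disagree, neither should be treated as decisive on its own.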
Next, it is readily seen from Table 2 that as the number of factors m approaches the “true” number of factors, 5, the point estimates of the ARS index increase steadily. In addition, the 95% CIs of the ARS, for m increasing from 1 through 5, also shift notably to the right (relative to the CIs for smaller numbers of factors, starting with m = 1). At least as importantly, we stress the fact that these 5 CIs are not overlapping. However, once the true number of factors, 5, is reached and then passed, the ARS point estimates show only a marginal increase (relative to the case with fewer than 5 factors) and their 95% CIs are now overlapping (upward from m = 5). (We note in passing that the model with 5 factors is the first in this incremental process for which the root mean square error of approximation is less than .05. We stress, however, that we use this index merely in a descriptive rather than inferential role here, as also implied by Notes 1 and 2 and the pertinent discussion earlier in the article.)
Since we know all parameters of the model used to generate the analyzed data set, we can determine the “true” (population) ARS by substituting these loading and residual/unique factor variance values correspondingly in the right-hand side of Equation (3). This yields the true ARS, denoted $\text{ARS}_{\text{population}}$, which equals .363 here.
Revisiting now Table 2, we observe that this true ARS value of .363 (i) is covered only by the 95% CI of the ARS for the true number of factors, 5; while being (ii) above all 95% CIs for the ARS with fewer factors (i.e., for m = 1, . . ., 4), and (iii) below the 95% CIs for the ARS with more than 5 factors (i.e., for m = 6 and m = 7). This observation suggests that (a) with fewer than the true number of factors, the ARS consistently underestimates its true value here; while (b) with more than the true number of factors, the ARS consistently overestimates its true value, as could be anticipated based on general statistical estimation principles (e.g., Casella & Berger, 2002).
Given the discussion in the preceding sections, this behavior of the point and interval estimates of the ARS for the analyzed data set is readily observed to be entirely congruent with the earlier indicated expectations for it in this article. In particular, this series of ARS point and interval estimates permits one here to correctly identify the true number of factors, m = 5, which underlies the data generation process (see above in this section). At the same time, it is worth noting that the behavior of the popular information criterion index AIC, unlike that of the BIC, differs notably from this pattern, since the AIC continues to decrease after passing m = 5 (relative to its value at m = 5 and as compared with its value at lower numbers of factors).
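For reference, the two information criteria are commonly defined as shown next. The symbols are introduced here for illustration only and do not come from the article: ln L is the maximized log-likelihood of the fitted model, q its number of free parameters, and n the sample size.

```latex
\mathrm{AIC} = -2 \ln L + 2q,
\qquad
\mathrm{BIC} = -2 \ln L + q \ln n .
```

With n = 2,000 as in the illustration, ln n is approximately 7.6, so each additional free parameter is penalized nearly four times as heavily by the BIC as by the AIC. This lighter penalty may help explain why the AIC can keep decreasing when factors beyond the true number are added, although the sketch above does not by itself establish that this is what happened in the present example.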
The findings observed in Table 2 (see also Table 1) demonstrate in this illustration example the potential utility of the ARS index of the present article as a complement to widely used traditional indices for ascertaining the number of factors in EFA (perhaps at times with the exception of the AIC).
Conclusion
This note was concerned with a readily applicable procedure for interval estimation of the ARS index across a set of observed variables in EFA. Its aim was to revisit a widely used index of model fit in applications of standard regression analysis and show its potential utility in the process of ascertaining the number of factors in EFA applications. Our discussion highlighted this potential of the ARS as a complement to widely used traditional and conventional indices in EFA for this purpose, such as Kaiser’s eigenvalue rule and related scree-plot, factor pattern, and factor structure matrices as well as information criteria like the AIC and BIC. We have thereby argued (see also Note 3) that the ARS can be a helpful aid to the empirical researcher particularly in EFA models with oblique factors (cf. Raykov & Marcoulides, 2014), over and above these conventional indices.
As can be seen from the preceding discussion, the article did not involve use of or reference to statistical tests within the EFA framework. We would argue that the utilization then of such tests need not be generally dependable for the following reasons. One, the essence of EFA is exploration per se, and as such EFA is logically decoupled from statistical testing and therefore best carried out in the absence of such tests. Two, statistical test results are inherently affected by sample size, and it is in general not possible to tease apart the contribution of hypothesis violation from that of sample size that are both confounded in the pertinent test statistic. Thus, while it would be possible in applications of the outlined ARS procedure to evaluate (test) the fit of individual ESEM models, and of their differences as nested models for increasing numbers of factors, such tests need not be used in our view in a conclusive or notably informative role in the process of ascertaining the number of factors in EFA applications. The main reason for this opinion and recommendation is the fact that their p values have at best limited value for the above and related reasons (see also Schmidt, 1996).
It is in this sense that we argue in favor of using the earlier outlined procedure in this note for point and interval estimation of the ARS, only as a potentially helpful complement to the above mentioned traditional indices for evaluation of the number of factors in EFA. We do not consider in particular the ARS as a replacement for any of these (or other) indices, nor do we ascribe greater importance to the ARS relative to any of them. As indicated repeatedly before, our aim is solely to recommend consideration of the ARS as an additional means of likely utility in the complicated and at times controversial process of ascertaining the number of factors in EFA utilizations in empirical research. Particularly under the latter circumstances (and possibly not only then), we suggest that the ARS may be significantly informative in the generally difficult process of evaluation of the number of factors, over and above the conventional indices used then. Relatedly, we also suggest that being defined as average proportion explained variance, the ARS can be generally a useful means of evaluating fit (quality) of a given EFA model, particularly in terms of the extent to which the latter explains analyzed measure variance (e.g., Raykov & Zajacova, 2012).
The discussed ARS procedure has several limitations that need to be articulated at this point. First, it is currently unknown in more rigorous terms what the effect of sample size is on the width of the CIs of the ARS across models with increasing number of factors, and specifically what the impact of this effect may be on the (tentative) conclusions about this number that may be based on or justified using the ARS (see also Notes 1 and 2). It is for this reason that we look at the ARS merely as one more means of possible help in the process of evaluation of the number of factors in EFA; and as indicated above, we would generally recommend its use especially in empirical cases where application of the traditional indices for factor number determination may not be conclusive. We therefore encourage further research into the properties of the ARS, possibly based on comprehensive simulation studies, which goes beyond the confines of this note. Second, and relatedly, it is at present not clear how to further formalize the process of determining whether the ARS (a) increases “steadily,” as expected with increasing number of factors before reaching their “true” or optimal number with respect to a given data set (see earlier discussion in this note); or (b) “stabilizes,” as anticipated when passing that “true” or optimal number. We recognize that numerically the ARS will further increase (with admissible, converged solutions of course) after passing that number of factors, and this increase may even be observable in its CIs with sufficiently large sample sizes. However, as indicated earlier, one would tend to expect that it will be markedly less pronounced than the increase in ARS and corresponding shift upward in its CIs before attaining the “true”/optimal number of factors (see also Notes 1 and 2). 
Third, it is to be kept in mind that with increasing number of factors it is not unlikely that inadmissible and/or nonconverging model solutions result, possibly increasingly frequently. We therefore would like to emphasize that for any such solution (number of factors), whether before or after reaching a “true” or optimal number of factors, the ARS should be considered undefined. This may contribute to difficulties in applying the ARS in some studies; but we would like to point out that with such solutions none of the traditional indices used in ascertaining the number of factors in EFA should be trusted or considered dependable. Fourth, as mentioned at the outset, we assumed no clustering effects and a single-class (as opposed to mixture) population of concern. While minor violations of the former assumption may not significantly limit the trustworthy applicability of the ARS, we would generally caution against using this method in mixture settings before appropriate modifications are carried out on it in order to account for the multiplicity of latent classes. These modifications need to be the subject of future research that is also encouraged here. Fifth, this article does not intend to suggest that the outlined ARS evaluation procedure will always or most of the time provide relevant and unambiguous information about the underlying number of factors, which is consistent with a possible decision about it based on the aforementioned traditional indices. Rather, our aim was merely to draw attention to a readily applicable procedure that provides potentially important additional and complementary information that may be useful in the generally difficult process of ascertaining the number of factors in EFA. Sixth, and relatedly, the note does not mean to suggest that the behavior demonstrated by the AIC in the illustration section, where it failed to identify the correct number of factors, will necessarily be found in many other circumstances.
Last but not least, we would like to stress that the better applications of the ARS procedure of this note will be expected to occur with larger samples, since the process of model parameter estimation and related indices (in particular, their standard errors and subsequent CIs) is based on an asymptotic statistical theory (e.g., Bollen, 1989). Future research is also needed into this limitation of the ARS, which will hopefully yield relevant results allowing one to determine whether a given sample size may be treated as sufficient to consider the underlying large sample theory attaining practical relevance in an empirical study. 4
In conclusion, the present note outlines an interval estimation procedure for average proportion explained variance in EFA. The method allows behavioral, educational, and social researchers as well as biomedical, clinical, and marketing scientists to use one more means with potential utility, as a complement to existing traditional indices, in the complicated and at times not uncontroversial process of ascertaining the number of factors in FA applications in the empirical disciplines.
Footnotes
Appendix A
Appendix B
Appendix C
Acknowledgements
We are indebted to T. Asparouhov and G. A. Marcoulides for valuable and helpful discussions and advice on exploratory structural equation modeling and its applications.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
