Abstract
This paper evaluated the evidence supporting the factor structure of extant coping instruments based on modern psychometric standards. Our literature search identified nine coping instruments that are routinely used to measure coping strategies in adult populations. While nearly 10 thousand papers have been published using these instruments, only 39 studies have investigated their psychometric validity. Our findings revealed that the majority of these studies did not follow current psychometric recommendations for establishing internal validity in part because they did not account for the ordinal nature of the data. Further, studies employing exploratory factor analysis used methods for identifying the number of factors to retain that have been found to have a low accuracy in a simulation study while those employing confirmatory factor analysis reported model fit statistics that did not meet widely accepted benchmarks. Hence, conflicting results were found within and across the nine coping instruments. Recommendations are made for improving future validation studies.
Keywords
The overwhelming majority of questionnaires in social science, medical, and educational research utilize Likert-type items, which yield ordinal data; however, methods used in validation studies rarely account for this fact. Because ordinal data attenuate variability, it is well known that treating ordinal scales as continuous entails attenuated estimates of variances, effect sizes, correlations, reliabilities, and factor and structural loadings, thereby affecting the overall accuracy of reported findings (Flora & Curran, 2004; Gadermann et al., 2012; Gugiu et al., 2009; Jöreskog et al., 2016; Jöreskog, 1994; 2005; Zumbo et al., 2007). These issues can directly impact the stability of factors, which, in turn, calls into question the validity of the measured latent construct (Scherer et al., 1988) and the theories that arise from its use (Cook & Heppner, 1997). While methodological and statistical caveats of treating ordinal data as continuous have been broadly described in the psychometric literature (e.g., see Gugiu et al., 2009), recommendations for the use of survey instruments rarely consider the impact of deviations from this guidance. This study summarizes the psychometric standards pertaining to survey validation, particularly with respect to the identification of the underlying factor structure and assessment of reliability. It then applies these standards to the body of literature surrounding the instruments designed to measure coping. Disparate findings across this literature base and their implications for constructing theory are discussed in the context of deviations from these standards.
Measuring Coping Behavior
Coping is the process by which people respond to a stressor (Carver et al., 1989; Lazarus & Folkman, 1984). In the past seven decades, tens of thousands of studies have examined the extent to which coping moderates the impact of environmental stressors on one’s physical and emotional well-being. Coping influences mental and physical health outcomes (Addison et al., 2007), including longer survival and improved physical and mental health in hemodialysis patients (Niihata, Fukyma, Akizawa, & Fukuhara, 2017), subjective well-being in university students (Sanjuan & Avila, 2018), increased parental efficacy for mothers of adolescents (Woodman & Hauser-Cram, 2013), decreased anxiety and depression in African American victims of intimate partner abuse (Mills et al., 2018), and decreased suicidal behavior in adolescent and young adult males (Horwitz et al., 2018). Understanding coping behavior is vital in physical and behavioral health research and practice.
Early coping instruments were developed inductively based on observed patterns of coping behaviors rather than on an existing theoretical framework (Folkman & Lazarus, 1980; Pearlin & Schooler, 1978). Lazarus and Folkman’s (1984) transactional model provides theoretical grounding for extant coping instruments and posits that coping is the constantly changing collection of cognitive and behavioral efforts of an individual to meet specific internal or external demands. Stress is a psychological construct produced by interactions between the individual and stressors, including activities, events, conditions, or stimuli (Greer, 2007).
Coping is comprised of higher order categories known as coping styles and lower order categories known as coping strategies (Pang et al., 2013). Styles are broad categories summarizing how an individual typically responds to stress while strategies are specific coping behaviors—conscious or unconscious—taken by an individual in response to stress based on their abilities and situational demands (Scherer et al., 1988). When confronted with a stressful situation, individuals appraise their ability to manage it and use specific coping strategies in response (Chao, 2011; Hagan et al., 2017). The hierarchical structure of coping with styles as higher order categories and strategies as lower order categories has implications for the internal validity evidence and factor structures of the instruments used to measure it.
Researchers use self-report questionnaires to measure coping because the construct integrates psychological, behavioral, and emotional aspects that are too difficult to observe in natural settings. Popular instruments for measuring coping in adults include the Ways of Coping (Lazarus & Folkman, 1984), Coping Strategies Inventory (Amirkhan, 1990; Tobin et al., 1989), Coping Inventory for Stressful Situations (Endler & Parker, 1990), Coping Strategy Indicator (Amirkhan, 1990), and Coping Orientation for Problem Experiences (Carver et al., 1989). Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) have been used to identify latent factors corresponding to the assessed coping strategies (Skinner et al., 2003). These studies found varying factor structures, and, overall, low internal consistency of these measures (Cook & Heppner, 1997; Endler & Parker, 1990; Endler et al., 1993; Rexrode et al., 2008; Stone et al., 1991; Zuckerman & Gagne, 2003). These results are troubling since the use of valid measures is paramount to empirical testing and improvement of theoretical models (Skinner et al., 2003). Specifically, the direction of research on coping could be influenced by the analytical methods used to develop the “rulers” (i.e., the instruments, factor structures, and scoring algorithms) designed to measure the latent constructs rather than by the relationships between the constructs.
The primary purpose of the present study was to review and re-assess the evidence for the factor structure of several frequently utilized coping instruments using modern psychometric standards. Our secondary goal was to provide readers with a roadmap of steps that should be undertaken when examining the internal consistency and factor structure of survey instruments that utilize ordinal response scales. These steps, and the literature supporting them, are summarized in the next section.
Methods
Literature Review
The objective of this study was to evaluate the quality of psychometric evidence produced in support of the validity of the factor structure of extant coping instruments. To this end, the first two authors performed a systematic review of the literature consistent with PRISMA standards (Moher et al., 2015). Our search targeted frequently used coping scales, including the Coping Orientation to Problems Experienced (COPE), Ways of Coping Questionnaire, Coping Strategies Questionnaire, Coping Inventory for Stressful Situations, Coping Across Situations Questionnaire, Coping Response Inventory, Coping Strategy Indicator, Daily Coping Inventory, Jalowiec Coping Scale, and Utrecht Coping List (Kato, 2015). The following search terms and Boolean operators were used: “name of inventory” & (“factor analysis” | Rasch | “item response theory”). Seven databases were searched: Academic OneFile, Academic Search Complete, Directory of Open Access Journals, MasterFile Complete, Newspaper Source Plus, OAIster, and WorldCat.org. Validity studies on adult English-speaking populations were compiled and evaluated. Figure 1 presents the flowchart for the literature search process.
Figure 1. Literature search flowchart.
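As an illustration of the search string described above, the following sketch assembles one query per instrument. The function name `build_query` is hypothetical, and the exact operator syntax accepted by each database may differ from the generic `&` and `|` used here.

```python
def build_query(inventory_name: str) -> str:
    """Combine an instrument name with the three psychometric method terms."""
    methods = ['"factor analysis"', "Rasch", '"item response theory"']
    return f'"{inventory_name}" & ({" | ".join(methods)})'


# Three of the instruments named in the search; the remaining names follow the same pattern.
instruments = [
    "Ways of Coping Questionnaire",
    "Coping Strategy Indicator",
    "Utrecht Coping List",
]
queries = [build_query(name) for name in instruments]
for query in queries:
    print(query)
```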
Analysis
Several psychometric techniques exist for examining items, their response scale performance, and their factor structure, for example, exploratory factor analysis (EFA), principal component analysis (PCA), confirmatory factor analysis (CFA), Rasch modeling, and item response theory (IRT). We constrained this study to evaluate the evidence based on EFA, PCA, and CFA methods since we could not locate in our broad literature review a single validation study that utilized IRT or Rasch modeling. Our evaluation of validation studies included in this review was based on the set of criteria summarized below, which are based on the foundations of modern psychometrics.
Exploratory factor analysis. The primary purpose of EFA is to unearth the factor structure that replicates the shared information (covariance) contained in the correlation matrix. Although EFA was developed for continuous variables, it has since been adapted for ordinal data. However, a series of steps must be performed to properly conduct an ordinal EFA.
First, factor analysis of ordinal data (e.g., Likert scale) should not be performed on raw data or on a Pearson correlation matrix because such data yield biased results (Drasgow, 1986; Gadermann et al., 2012; Gugiu et al., 2010; Gugiu et al., 2009; Jöreskog, 1994; Olsson, 1979; Zumbo et al., 2007). Instead, EFA should be performed on a polychoric correlation matrix (known as ordinal EFA), which is an unbiased estimate of the latent correlations under the assumption of multivariate normality (Jöreskog, 1994; Jöreskog et al., 2016). Monte Carlo simulations have shown that ordinal factor analysis yields factor structures that more accurately reproduce the theoretical model than an EFA on raw data or Pearson correlations (Jöreskog & Moustaki, 2001; Holgado–Tello et al., 2010). However, since the distributional assumption may not always be met (e.g., frequency scales are unlikely to have a latent normal distribution because they measure discrete data), one could input a Spearman correlation matrix to accommodate the use of ordinal scales. The robustness of EFA findings can be further enhanced using estimation methods such as the weighted least squares mean and variance adjusted estimator (WLSMV; also known as diagonally weighted least squares [DWLS]) applied to a polychoric correlation matrix (Barendse et al., 2015).
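To make the contrast between Pearson and polychoric correlations concrete, the sketch below estimates a polychoric correlation for a single pair of ordinal items. It is a minimal two-step estimator (thresholds from the marginal proportions, then maximum likelihood for the correlation) written with NumPy and SciPy under the latent bivariate normality assumption. The function names, the simulated data, and the cut points are illustrative assumptions; this is not the SAS, SPSS, or Mplus syntax from the supplemental appendixes.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal, norm


def latent_thresholds(x):
    """Cut points on the latent normal scale implied by the category proportions."""
    _, counts = np.unique(x, return_counts=True)
    cum = np.cumsum(counts)[:-1] / counts.sum()
    # +/-8 stands in for +/-infinity so the bivariate CDF stays finite.
    return np.concatenate(([-8.0], norm.ppf(cum), [8.0]))


def polychoric(x, y):
    """Two-step maximum likelihood estimate of the polychoric correlation."""
    x, y = np.asarray(x), np.asarray(y)
    tx, ty = latent_thresholds(x), latent_thresholds(y)
    # Observed contingency table (rows: categories of x, columns: categories of y).
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    table = np.zeros((len(tx) - 1, len(ty) - 1))
    np.add.at(table, (xi, yi), 1)
    corners = np.array([[a, b] for a in tx for b in ty])

    def neg_loglik(rho):
        cdf = multivariate_normal.cdf(
            corners, mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]]
        ).reshape(len(tx), len(ty))
        # Probability of each cell from the bivariate normal CDF at its four corners.
        cell = cdf[1:, 1:] - cdf[:-1, 1:] - cdf[1:, :-1] + cdf[:-1, :-1]
        return -np.sum(table * np.log(np.clip(cell, 1e-12, None)))

    return minimize_scalar(neg_loglik, bounds=(-0.99, 0.99), method="bounded").x


# Simulated example: latent correlation of .60 observed through a 4-point scale.
rng = np.random.default_rng(0)
latent = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=2000)
cuts = [-0.5, 0.5, 1.5]  # hypothetical, asymmetric category boundaries
item_a = np.digitize(latent[:, 0], cuts)
item_b = np.digitize(latent[:, 1], cuts)

pearson_r = np.corrcoef(item_a, item_b)[0, 1]
polychoric_r = polychoric(item_a, item_b)
print(f"Pearson: {pearson_r:.3f}  polychoric: {polychoric_r:.3f}")
```

In this simulation the Pearson coefficient computed on the category codes falls below the generating value, whereas the polychoric estimate should land near it. A full ordinal EFA would repeat the estimate for every item pair to build the input matrix.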
Second, researchers should not determine the number of factors to retain using Kaiser’s criterion (eigenvalue >1) or the scree plot because these methods yield inaccurate results (∼57% accuracy) for continuous data (Hayton et al., 2004; Zwick & Velicer, 1982) and, by extension, would likely yield even less accurate results for ordinal data. Instead, researchers should employ Horn’s (1965) parallel analysis since it can achieve an accuracy rate of over 90% (Hayton et al., 2004; Zwick & Velicer, 1982). Gugiu et al. (2009) adapted this method for ordinal data using a polychoric correlation matrix. It is important to note that while parallel analysis can identify how many factors to retain, one must verify the interpretability and stability of the factor structure and ignore the recommendation when it compromises these two properties. In such instances, the true dimensionality is likely to be close to the number of factors that are most interpretable.
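The logic of parallel analysis can be sketched as follows: eigenvalues of the observed correlation matrix are compared with eigenvalues obtained from random data of the same dimensions, and factors are retained only while the observed eigenvalue exceeds the random reference. This minimal sketch uses Pearson correlations on simulated continuous data for brevity; for the ordinal variant one would substitute a polychoric matrix for the observed correlations. The function name, the 95th-percentile reference, and the simulated two-factor data are assumptions made for illustration.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Return the number of factors whose eigenvalues exceed the random reference."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    rng = np.random.default_rng(seed)

    def sorted_eigs(matrix):
        return np.sort(np.linalg.eigvalsh(np.corrcoef(matrix, rowvar=False)))[::-1]

    observed = sorted_eigs(data)
    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        random_eigs[i] = sorted_eigs(rng.standard_normal((n, p)))
    reference = np.percentile(random_eigs, percentile, axis=0)

    exceeds = observed > reference
    # Count leading eigenvalues only: stop at the first one below the reference.
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, reference


# Simulated example: two uncorrelated factors, six items each, loadings of .70.
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))
loadings = np.zeros((12, 2))
loadings[:6, 0] = 0.7
loadings[6:, 1] = 0.7
items = factors @ loadings.T + rng.standard_normal((500, 12)) * np.sqrt(1 - 0.49)

n_retained, observed, reference = parallel_analysis(items)
print("Factors suggested by parallel analysis:", n_retained)
```

As the text cautions, the suggested count is a starting point; interpretability and stability of the resulting structure still need to be checked.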
Third, researchers frequently utilize PCA when they should use EFA instead. EFA is appropriate when a theoretical model exists to delineate the relationships between a latent psychological trait(s) and their associated observable indicators. It assumes the latent trait affects the level of the observed variable; that is, a hidden attribute, imprinted within the individual, influences their responses to items either due to their genetic inheritance, biological processes, psychological characteristics, and/or the accumulation of external factors or life experiences. In contrast, PCA is a tautological data reduction system (Steiger, 1990) which can be used without the causal assumption linking the observed and latent concept. That is, the concept is not a manifestation of an internal process that compels individuals to respond to items in a certain way. Further, PCA decomposes the covariance matrix into eigenvalues and eigenvectors, based on classical statistics, and does not include the estimation of measurement error (as in EFA) or the estimation of factor loadings since components are just linear combinations of the items (Rencher & Christensen, 2012). Hence, PCA is appropriate, for example, for extracting a component denoting the “overall impact of stressful events” (e.g., rape, accidents, death of a family member, combat) since such events are not the product of a biological or psychological predisposition or internal process within individuals, or observable indicators of a latent construct. In contrast, we regard coping as a latent construct that influences observable behaviors as well as psychological and emotional processes.
Unfortunately, many researchers regard these methods as interchangeable and perform a PCA and then describe the extracted components as “factors.” In general, if it is theorized that an underlying dimension is latent and that the scores on the observed items are caused by the person’s level on that latent factor, or that there is measurement error, then EFA is the appropriate method to use. If EFA is deemed appropriate and the raw data are approximately continuous (>10 ordered response categories) and normal, one should employ maximum likelihood (ML) EFA. If an ordinal response scale is used to measure responses, one should calculate a polychoric correlation matrix and then perform ML EFA on the matrix since both procedures assume latent normality.
Fourth, orthogonal rotation such as Varimax should only be used when factors are uncorrelated. Since it is not entirely clear whether coping factors are independent or not, it is prudent to use oblique rotation (e.g., Promax, Direct Oblimin, or Geomin) because if the factors are truly uncorrelated, this will be evident from very low correlations (<0.3) between all the factors; otherwise, the oblique rotation will find the optimal correlation between them.
Fifth, to enhance the stability and interpretability of multidimensional factor structures, items with low factor loadings (<.4) on all the factors, salient loadings (>.5) on two or more factors, or the items for factors with fewer than five primary items should be dropped from further analysis unless doing so compromises the content validity of the instrument (Pett et al., 2003; Tabachnick & Fidell, 2013). With respect to the first recommendation, items with low factor loadings should not be dropped if they are substantively related to the construct and more than 80% of respondents picked one of the extreme response options. The latter exception is necessary to avoid dropping very difficult or easy to endorse items that can be used to measure respondents with very high or low ability on the construct. Low factor loadings for such items are attributable to a lack of variability, given the sample to which the items were administered, and may not necessarily be reflective of a poor item. While there is no consensus on whether to retain or drop items with high salient loadings on two or more factors (Pett et al., 2003), we recommend dropping them because they make it difficult to differentiate between constructs. More importantly, such items tend to contribute to the instability of the factorial solution (Guadagnoli & Velicer, 1988; Velicer & Fava, 1998), which leads to the problem of low replicability (which is even more problematic than the interpretation of the factors with salient cross-loadings). Lastly, although the benefit of short measures is often emphasized by researchers, it is important to note that factors with very few items do not differentiate between people as well as those with more items, especially if their item difficulties are relatively similar. Since item difficulties are often unknown in the survey development phase, we recommend initially including at least 10 items per factor. At minimum, five items are needed to attain stable factor structures that are not unduly influenced by sample-specific characteristics. However, if a factor has a very limited scope, it may be possible to measure it with fewer items.
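The screening rules in this step can be expressed as a short routine that flags candidates for removal from a rotated loading matrix. The sketch below applies the three numerical thresholds stated above (loadings below .4 on all factors, loadings above .5 on two or more factors, and fewer than five primary items per factor). The function name and the example loadings are hypothetical, and the flags are only a first pass: the content-validity and extreme-response exceptions described in the text require the analyst's judgment.

```python
import numpy as np


def screen_items(loadings, low=0.40, salient=0.50, min_items=5):
    """Flag items and factors that violate the loading-based retention rules."""
    L = np.abs(np.asarray(loadings, dtype=float))
    # Items that load below the `low` threshold on every factor.
    low_items = np.where((L < low).all(axis=1))[0]
    # Items with salient loadings on two or more factors.
    cross_items = np.where((L > salient).sum(axis=1) >= 2)[0]
    flagged = np.union1d(low_items, cross_items)
    kept = np.setdiff1d(np.arange(L.shape[0]), flagged)
    # Assign each remaining item to the factor on which it loads most strongly.
    primary = L[kept].argmax(axis=1)
    counts = np.bincount(primary, minlength=L.shape[1])
    return {
        "low_loading_items": low_items.tolist(),
        "cross_loading_items": cross_items.tolist(),
        "factors_under_min_items": np.where(counts < min_items)[0].tolist(),
    }


# Hypothetical pattern matrix: 9 items (rows) by 2 factors (columns).
example = [
    [0.70, 0.10], [0.65, 0.05], [0.60, 0.20], [0.55, 0.15], [0.72, 0.08],
    [0.30, 0.25],  # low on both factors
    [0.60, 0.58],  # salient on both factors
    [0.10, 0.66], [0.05, 0.70],
]
result = screen_items(example)
print(result)
```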
Sixth, an adequate sample size is needed to produce stable EFA findings. The exact number needed, however, has long been debated by psychometricians. For example, Comrey and Lee’s (1992) recommendation of 300 cases continues to be cited by popular statistics textbooks (Tabachnick & Fidell, 2013) and by books (Pett et al., 2003) and journal articles (Worthington & Whittaker, 2006) on EFA. Alternatively, some psychometricians prefer recommendations based on the number of cases per variable. Such recommendations typically span the ratios 5:1 (Gorsuch, 1983), 10:1 (Nunnally, 1978), and 20:1 (Costello & Osborne, 2005). Our preference is to follow guidance from Monte Carlo simulations examining the impact of ordinal data on EFA. Such a study was conducted by Jin (2012), who reported that for extracting four factors with item communalities between 0.6 and 0.8, a sample size of 200 was adequate when there were at least 14 items per factor, while a sample size of 600 was adequate for seven items per factor. However, more than 800 cases are needed for extracting six or more factors.
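For readers comparing a planned sample against the rules of thumb cited above, the arithmetic can be sketched as below. The function name and dictionary keys are hypothetical labels, and the example figures are invented. The Monte Carlo guidance from Jin (2012) depends on the number of factors, items per factor, and communalities, so it is deliberately not reduced to a single cutoff here.

```python
def sample_size_rules(n_cases: int, n_items: int) -> dict:
    """Check a sample against the absolute and cases-per-variable rules of thumb."""
    ratio = n_cases / n_items
    return {
        "cases_per_item": ratio,
        "comrey_lee_300_cases": n_cases >= 300,
        "gorsuch_5_to_1": ratio >= 5,
        "nunnally_10_to_1": ratio >= 10,
        "costello_osborne_20_to_1": ratio >= 20,
    }


# Hypothetical study: 320 respondents answering a 40-item instrument.
check = sample_size_rules(n_cases=320, n_items=40)
print(check)
```

In this invented case the sample satisfies the 300-case and 5:1 recommendations but not the 10:1 or 20:1 ratios, which shows how the same study can be judged adequate or inadequate depending on the rule applied.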
Finally, one should not estimate composite scores using EFA, even when polychoric correlations are used as an input for the analysis, because an infinite number of ways exist for rotating factors with no mathematical way of designating which rotation is correct (known as factorial indeterminacy). Furthermore, since factor scores are estimated from the raw score matrix (Tabachnick & Fidell, 2013; Mulaik, 2010), the scores cannot fully account for ordinal data. Factor scores should only be estimated with methods specifically designed for use with ordinal data, such as Rasch modeling (Boone et al., 2014; Bond & Fox, 2015; Linacre, 2013).
Confirmatory factor analysis. The primary purpose of a CFA is to determine whether the observed covariance matrix is the same as the covariance matrix produced by the hypothesized model. Like EFA, CFA was developed for continuous data but has since been adapted for ordinal data. However, a series of steps must be performed to properly model and interpret the findings from ordinal data. First, a polychoric correlation matrix should be used in place of raw data or a Pearson correlation matrix (Brown, 2006; Jöreskog et al., 2016; Jöreskog, 2005). However, as noted earlier, this substitution carries the burden of defending the multivariate normality assumption. Generally, this assumption can be defended if the latent variable is continuous and can be characterized as a composite variable produced from a large number of random processes so that its sampling distribution converges upon normality via the Central Limit Theorem (Gugiu, 2011).
Second, in the case of ordinal data, researchers should utilize diagonally weighted least squares (DWLS; known as WLSMV in the Mplus software), instead of ML, to extract parameter estimates (Brown, 2006; Flora & Curran, 2004; Curran et al., 1996; Gugiu et al., 2010; Gugiu et al., 2009). Third, model fit should be assessed using the Satorra–Bentler chi-square (Satorra & Bentler, 1994; Bryant & Satorra, 2012) rather than the normal chi-square. However, because this statistic is sensitive to large sample sizes, additional model fit indexes should be examined. The most popular indexes include the standardized root mean square residual (SRMR), root mean square error of approximation (RMSEA), comparative fit index (CFI), and Tucker–Lewis index (TLI). There are multiple guidelines in the literature about cutoffs considered for good, acceptable, and poor model fit; a CFA model is generally considered to fit well when the SRMR and RMSEA are ≤.05 and the CFI and TLI are ≥.95 (Brown, 2006; Schumacker & Lomax, 2010).
Fourth, although SEM software often gives modification indices that suggest additional paths, one should not attempt to improve model fit by freeing parameters such as correlated errors and cross-loadings since they are unlikely to replicate in new samples (Sanders et al., 2015) unless a plausible explanation exists for why these specific parameters were freed and others were not. Further, programs such as Mplus often list these modification indices in order of impact on model fit. One method for improving model fit is to free constrained parameters (i.e., adding correlated errors or cross-loadings) starting with the parameter with the highest modification index and progressing towards lower modification indices. However, as noted earlier, new pathways between parameters should not be added without a strong theoretical argument that justifies freeing these specific parameters but not others.
A variety of sample size recommendations have been published for CFA studies, including recommendations for 10 to 20 cases per statistical parameter (rather than variable) (Kline, 2011). Our preference is to follow guidance from Monte Carlo studies which found that a sample size of 200 is adequate (Flora & Curran, 2004; Curran et al., 1996) to model polychoric correlations between ordinal variables when DWLS and the Satorra–Bentler chi-square are employed to model ordinal data. However, further research is needed to determine whether larger sample sizes are needed to test models with many factors and few items per factor.
Internal consistency reliability. Without a doubt, the most popular method for estimating reliability is Cronbach’s (1951) alpha (of note here is the historical fact that technically Cronbach’s alpha is just a different name for Guttman’s lambda-3 from 1945). However, this estimator should only be used for continuous data and even then more accurate methods exist such as McDonald’s (1999) omega. For ordinal data, we recommend using ordinal alpha or ordinal omega (Zumbo et al., 2007), Gugiu’s (2011) nonparametric split-half parallel reliability, or Raykov’s rho (Brown, 2006). Further details about these estimators can be found in Dembe et al. (2014). Finally, while it is customary to compare reliability estimates to Nunnally’s (1978) .70 standard, Gugiu and Gugiu (2018) showed that higher standards may be necessary when decisions are made at the individual-level rather than the group-level.
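The difference between the conventional and the correlation-based forms of alpha can be sketched in a few lines. The first function below is Cronbach's alpha computed from raw item scores; the second computes alpha from a correlation matrix using the mean inter-item correlation. We assume, following the description of ordinal alpha in Zumbo et al. (2007), that supplying a polychoric correlation matrix to the second function gives the ordinal version. The function names and example values are illustrative, and the polychoric matrix itself must be estimated separately.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha from a respondents-by-items matrix of raw scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)


def alpha_from_correlations(corr):
    """Alpha from a correlation matrix via the mean inter-item correlation.

    Passing a polychoric matrix is assumed here to give ordinal alpha.
    """
    corr = np.asarray(corr, dtype=float)
    k = corr.shape[0]
    mean_r = (corr.sum() - np.trace(corr)) / (k * (k - 1))
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical 4-item scale whose inter-item correlations all equal .50.
corr = np.full((4, 4), 0.5)
np.fill_diagonal(corr, 1.0)
alpha_corr = alpha_from_correlations(corr)

# Two perfectly parallel items as a sanity check on the raw-score formula.
alpha_raw = cronbach_alpha([[1, 2], [2, 3], [3, 4]])
print(round(alpha_corr, 3), round(alpha_raw, 3))
```

Because polychoric correlations are typically larger than Pearson correlations computed on the same ordinal items, the correlation-based form will usually return a higher value when a polychoric matrix is supplied.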
Results
General Findings
Characteristics of Exploratory Factor Analysis Validity Studies.
Characteristics of Confirmatory Factor Analysis Validity Studies.
No evidence was found that the studies accounted for the use of ordinal data by inputting either a polychoric or a Spearman correlation matrix. Nor is it the case that the reported software (SPSS, SAS, LISREL, AMOS, Mplus) automatically detects the presence of ordinal data and adjusts the statistical model accordingly. Thus, we conclude the reported factor loadings, interfactor correlation coefficients, and reliability estimates are attenuated in magnitude. Nearly three-quarters of the studies created mean or summated composite scores as an index of performance on factors. This is problematic because neither factor analysis nor structural equation modeling can adjust for the absence of equal intervals between response options. Nearly 90% of the studies reported Cronbach’s alpha, which has been found to yield biased estimates for ordinal data (Zumbo et al., 2007; Gadermann et al., 2012).
Exploratory Factor Analysis Findings
All nine coping instruments identified in our literature search have been subjected to a factor analysis. However, the analyses performed in these studies deviated from recommended psychometric standards in several areas. One EFA study used ML to estimate parameters even though this method is only appropriate for normally-distributed data (Costello & Osborne, 2005). Nearly two-thirds of the studies performed a PCA although EFA is more appropriate for modeling coping. No study examined whether coping is comprised of higher order (styles) and lower order (strategies) factors. One study utilized parallel analysis to determine the number of factors to retain while a third used Kaiser’s criterion which has been shown to have only a 22% accuracy rate (Hayton et al., 2004; Zwick & Velicer, 1982). Our results supported previous Monte Carlo findings in that coping studies that used Kaiser’s criterion retained more factors than those that used the scree plot (across all instruments: 7.9 vs. 5.2, on average; within the same instrument: 8.8 vs. 5.0, on average). Beyond the obvious implication for interpretability, retaining too many factors also reduces the stability of factors when the ratio of items-to-factors is low.
Interestingly, 80% of the studies utilized Varimax rotation (presumably because it is the default rotation in SPSS and other programs). While this is not necessarily inappropriate, it can be if the factors are correlated. Unfortunately, the authors did not discuss the reasons for this decision. However, some studies reported strong correlations between coping factors (Edwards & O'Neill, 1988). Hence, the extent to which the reported factor structures are replicable is unclear. In fact, findings in the next section suggest they may not be stable.
Although the average sample size for the studies met some of the guidance found in the literature (i.e., >300 cases, >10:1 cases to variables), it fell short of more recent guidance emerging from Monte Carlo studies. On average, factors were measured with 8.3 items, with more than half the studies including at least one factor measured with fewer than five items (the recommended minimum) (Tabachnick & Fidell, 2013). According to Jin (2012), one needs a sample size of about 600 cases to properly extract factors when they are measured with seven items each. Hence, it is likely the coping factors with 10 or more items were measured well but the smaller ones may be unstable.
Confirmatory Factor Analysis Findings
Only six of the nine coping instruments have been subjected to a CFA. The reported sample sizes for the CFA studies met psychometric guidance. However, the analyses performed deviated from recommended psychometric standards. All the studies used ML to estimate parameters—most likely because it is the default option in structural equation modeling software—even though it has been shown to yield biased estimates for ordinal data (Flora & Curran, 2004; Jöreskog, 2005; Jöreskog et al., 2016). All the reported chi-square statistics were significant (signifying model misfit). However, since Satorra–Bentler chi-squares (Satorra & Bentler, 1994; Bryant & Satorra, 2012) were not reported, it is impossible to know whether this misfit was a function of poor statistical modeling, model misspecification, or due to the influence of large sample sizes (Brown, 2006).
Unfortunately, one cannot accurately assess model fit from the other reported indexes since they are all computed from the chi-square statistic (Schumacker & Lomax, 2010). However, it is worth noting that only 22% of the standardized root-mean square residuals (SRMR), 55% of the root-mean-square error of approximation (RMSEA) indexes, 21% of the comparative fit indexes (CFI), and 20% of the Tucker–Lewis indexes (TLI) met recommended standards for good fit (SRMR and RMSEA ≤.05, CFI and TLI ≥.95) (Brown, 2006; Schumacker & Lomax, 2010). Evaluation of other reported model fit indexes against standard benchmarks reaffirmed the conclusion that few of the hypothesized coping factor structures exhibited adequate internal validity. Hence, even if one ignored that ordinal data were not properly modeled, these findings suggest the theoretical factor structures underlying existing coping instruments are not supported by empirical data.
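To show how the chi-square-based indexes relate to the benchmarks quoted above, the sketch below computes the RMSEA, CFI, and TLI from the model and baseline (null model) chi-square statistics and compares each with its cutoff. The formulas are the commonly used definitions; some programs divide by N rather than N − 1 in the RMSEA, so that choice is an assumption. The function names and the input values are hypothetical and do not come from any of the reviewed studies. The SRMR is omitted because it requires the residual correlation matrix.

```python
import math


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """RMSEA, CFI, and TLI from model and baseline chi-square statistics."""
    excess_model = max(chi2_model - df_model, 0.0)
    excess_base = max(chi2_base - df_base, excess_model)
    rmsea = math.sqrt(excess_model / (df_model * (n - 1)))
    cfi = 1.0 if excess_base == 0 else 1.0 - excess_model / excess_base
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)
    return {"rmsea": rmsea, "cfi": cfi, "tli": tli}


def meets_benchmarks(indices):
    """Compare each index with the good-fit cutoffs cited in the text."""
    return {
        "rmsea": indices["rmsea"] <= 0.05,
        "cfi": indices["cfi"] >= 0.95,
        "tli": indices["tli"] >= 0.95,
    }


# Hypothetical CFA output: model chi-square 180 on 90 df, baseline 1890 on 105 df, N = 301.
indices = fit_indices(chi2_model=180, df_model=90, chi2_base=1890, df_base=105, n=301)
verdict = meets_benchmarks(indices)
print(indices, verdict)
```

With these invented inputs all three indexes fall just short of the cutoffs, which illustrates how a model can look close to acceptable while still failing every benchmark.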
Another point worth noting is that the degrees of freedom reported for the chi-square statistic varied across studies of the same instrument, which suggests researchers tested different models because they dropped items due to misfit, allowed items to cross-load across factors, or permitted correlated errors to improve model fit. This too is an indication of model misfit that should be reported in more detail within published studies. At present, the literature is full of “phantom degrees of freedom” stemming from the intermediate models researchers fit but did not report, which may be related to elevated Type I errors. Regardless of the reasons for the discrepancies, they suggest the factor structures underlying all extant coping instruments are unstable. Hence, further research is needed to develop and validate coping instruments.
Discussion
Although over 30 thousand peer-reviewed studies have been published on coping, researchers have only developed nine survey inventories to delineate its primary dimensions in adult populations. Our literature search revealed that 39 studies investigated the validity of these nine instruments. Yet, scrutiny of the psychometric analyses performed, summarized below, raises concerns regarding the validity of the coping instruments given the final number of factors across the instruments ranged from two to 12. Moreover, even within the same coping instrument, studies rarely converged upon the same factor structure. This implies that the “rulers” used by researchers to measure coping are not invariant. If we cannot trust our “rulers” have the same interpretations, then how robust can the theories that emerge from these studies be?
A variety of factors likely contributed to the lack of stable results. First, all the studies ignored the ordinal nature of the response scales. This is a critical issue because psychometric theory dictates that researchers should never treat ordinal data as continuous (Stevens, 1946; McDonald, 1999). Simply put, arithmetic operations yield unbiased results only when data have equal intervals between adjacent response categories and the thresholds demarking the boundaries of these categories align across all the items. Monte Carlo studies have found that treating ordinal data as continuous results in biased reliability (Zumbo et al., 2007), correlations (Olsson, 1979), and factor loadings (Jöreskog & Moustaki, 2001; Holgado–Tello et al., 2010). These biases are consistent with our own research and with prior work in this area. For example, the LISREL software treats all variables with fewer than 15 response categories as ordinal (Jöreskog et al., 2001). Hence, even the maximum number of response scale options (seven) used by an extant coping instrument is not high enough to allow one to assume the scales were approximately continuous.
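The dependence of this bias on the number of response categories can be illustrated with a small simulation. The sketch below draws two latent variables with a correlation of .60, coarsens each into k ordered categories, and reports the Pearson correlation of the category codes. Equal-probability categories, the sample size, and the function name are simplifying assumptions; real Likert items usually have uneven category frequencies, which can make the attenuation more severe.

```python
import numpy as np


def coarsen(z, k):
    """Cut a continuous variable into k equal-probability ordered categories."""
    cuts = np.quantile(z, np.arange(1, k) / k)
    return np.digitize(z, cuts)


rng = np.random.default_rng(42)
latent = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=20000)

pearson_by_k = {}
for k in (2, 3, 5, 7, 15):
    a = coarsen(latent[:, 0], k)
    b = coarsen(latent[:, 1], k)
    # Pearson correlation computed on the category codes, as if they were continuous.
    pearson_by_k[k] = float(np.corrcoef(a, b)[0, 1])

for k, r in pearson_by_k.items():
    print(f"{k:>2} categories: Pearson r = {r:.3f} (latent r = 0.600)")
```

Under these assumptions the Pearson coefficient rises toward the latent value as categories are added, and scales with only a handful of options show a clearly visible gap.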
Second, two-thirds of the factor analytic studies used Kaiser’s criterion or the scree plot to identify the number of factors to retain. As expected, studies that employed the scree plot extracted fewer factors than those that used Kaiser’s criterion. Direct comparison between parallel analysis and these methods was not possible because only a single study employed the former method. However, Monte Carlo studies indicate parallel analysis is more accurate at recovering the true number of factors. Kaiser’s criterion, while objective, retains too many factors. The scree plot is somewhat more accurate, but it is too subjective to yield consistent findings. In our experience, the factor structure produced by parallel analysis is more interpretable than those produced by Kaiser’s criterion or the scree plot.
Third, over three-quarters of the factor analytic studies reviewed assumed, without justification, that the factors were orthogonal. While it is conceivable that one coping strategy may be unrelated to use of another strategy, this is not necessarily the case. Edwards and O'Neill (1988, p. 968), for example, reported several of the factors they tested in a CFA “exhibited correlations with other factors that did not differ significantly from unity.” As best as we can tell, none of the models tested in the CFA studies fit particularly well. However, this may be because they did not model ordinal data or use the Satorra–Bentler chi-square statistic to test the models.
Fourth, all the instruments employed frequency scales. The impact of this choice on findings is not readily understood. It is not clear whether the Pearson correlation matrices employed met the latent normality assumption since frequency scales index count data, which should be modeled with either a Poisson, overdispersed Poisson, or negative binomial distribution. Conventional software packages have yet to provide a procedure for computing correlation matrices for count data. However, the assumption of latent normality is not unreasonable if many events can occur in the interval of time measured by the response scale since the limiting distribution of the Poisson is the normal (Hogg et al., 2005).
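The limiting argument can be stated compactly. The following is a standard result written in notation introduced here for illustration (the symbols $X_\lambda$ and $\lambda$ do not appear elsewhere in this paper), where $\lambda$ is the expected number of events in the measured interval:

```latex
X_\lambda \sim \mathrm{Poisson}(\lambda)
\quad\Longrightarrow\quad
\mathbb{E}[X_\lambda] = \operatorname{Var}(X_\lambda) = \lambda,
\qquad
\operatorname{Skew}(X_\lambda) = \lambda^{-1/2},
\qquad
\frac{X_\lambda - \lambda}{\sqrt{\lambda}} \;\xrightarrow{\;d\;}\; N(0, 1)
\quad \text{as } \lambda \to \infty .
```

Because the skewness shrinks only at the rate $\lambda^{-1/2}$, the normal approximation is plausible for behaviors that occur many times within the reference period and is weaker for rarely used coping strategies.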
Fifth, nearly three-quarters (72%) of the studies reported mean or summated composite scores. This percentage is higher if one includes studies that reported Cronbach’s alpha, which assumes a linear composite score is generated from the raw scores. However, composite scores generated from totals or averages are only interpretable if the item-level data are continuous. Though widely employed, Cronbach’s alpha should only be employed for continuous data since adjacent ordinal response categories are unlikely to be equally distant (e.g., the distance from “Strongly agree” to “Agree” is unlikely to be equal to the distance between “Agree” and “Neutral,” and so forth; this is readily observed by comparing the distance between Andrich thresholds in a Rasch analysis). Further, Cronbach’s alpha is downwardly biased when ordinal data are used. However, if Cronbach’s alpha is greater than 0.70 then Zumbo’s ordinal alpha and omega will also be greater than this benchmark.
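The relationship between the two coefficients in the last sentence can be sketched numerically. Ordinal alpha is commonly described as the alpha formula applied to a polychoric rather than a Pearson correlation matrix; taking that description as an assumption, the sketch below computes standardized alpha from a correlation matrix. The two equicorrelation matrices (0.30 and 0.40 among five items) are hypothetical values chosen only to mimic the typical case in which polychoric correlations exceed their Pearson counterparts.

```python
def alpha_from_correlations(corr):
    """Standardized alpha from a k x k correlation matrix.

    Implements alpha = k * r_bar / (1 + (k - 1) * r_bar), where r_bar is the
    mean of the off-diagonal correlations.
    """
    k = len(corr)
    off_diagonal = [corr[i][j] for i in range(k) for j in range(k) if i != j]
    r_bar = sum(off_diagonal) / len(off_diagonal)
    return k * r_bar / (1 + (k - 1) * r_bar)


def equicorrelation(k, r):
    """Hypothetical k x k matrix with 1 on the diagonal and r elsewhere."""
    return [[1.0 if i == j else r for j in range(k)] for i in range(k)]


if __name__ == "__main__":
    pearson_alpha = alpha_from_correlations(equicorrelation(5, 0.30))
    ordinal_alpha = alpha_from_correlations(equicorrelation(5, 0.40))
    print(f"Pearson-based alpha  = {pearson_alpha:.3f}")   # about 0.682
    print(f"polychoric-based alpha = {ordinal_alpha:.3f}")  # about 0.769
```

Because the formula increases with the average inter-item correlation, a coefficient computed from larger polychoric correlations cannot fall below the one computed from the smaller Pearson correlations of the same items.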
SAS and SPSS programming syntax
Conducting an ordinal EFA is relatively straightforward. Derived from the lecture notes of one of the authors, Supplemental Appendixes 1 and 2 present SAS and SPSS syntax for generating a polychoric correlation matrix, substituting it for the raw data, and performing an ordinal EFA that extracts factors based on the recommendations derived from a parallel analysis. Mplus syntax for performing an ordinal EFA and CFA is included in Supplemental Appendix 3.
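For readers who want to see the idea behind a polychoric correlation outside of SAS, SPSS, or Mplus, the following is a minimal Python sketch of its two-category special case, the tetrachoric correlation. It is not the supplemental syntax and is not a full polychoric estimator. It assumes a latent standard bivariate normal distribution, takes the thresholds from the marginal proportions, and solves for the latent correlation by bisection; the function names, the Simpson integration with 200 steps, and the example table are illustrative assumptions.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def bvn_density(a, b, r):
    """Standard bivariate normal density at (a, b) with correlation r."""
    q = (a * a - 2 * r * a * b + b * b) / (2 * (1 - r * r))
    return math.exp(-q) / (2 * math.pi * math.sqrt(1 - r * r))


def bvn_cdf(a, b, rho, steps=200):
    """P(X <= a, Y <= b), using the identity that the derivative of this
    probability with respect to rho equals the bivariate density, so the
    probability is Phi(a) * Phi(b) plus the integral of the density from 0 to rho
    (evaluated with Simpson's rule; steps must be even)."""
    h = rho / steps
    total = bvn_density(a, b, 0.0) + bvn_density(a, b, rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * bvn_density(a, b, i * h)
    return _STD_NORMAL.cdf(a) * _STD_NORMAL.cdf(b) + total * h / 3


def tetrachoric(n00, n01, n10, n11):
    """Latent correlation implied by a 2 x 2 table of counts.

    n00 counts respondents in the lower category on both items. Every marginal
    proportion must lie strictly between 0 and 1.
    """
    n = n00 + n01 + n10 + n11
    a = _STD_NORMAL.inv_cdf((n00 + n01) / n)  # threshold for the first item
    b = _STD_NORMAL.inv_cdf((n00 + n10) / n)  # threshold for the second item
    target = n00 / n
    lo, hi = -0.999, 0.999
    for _ in range(60):  # bisection: bvn_cdf increases with rho
        mid = (lo + hi) / 2
        if bvn_cdf(a, b, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def phi_coefficient(n00, n01, n10, n11):
    """Pearson correlation of the two dichotomous items."""
    num = n00 * n11 - n01 * n10
    den = math.sqrt((n00 + n01) * (n10 + n11) * (n00 + n10) * (n01 + n11))
    return num / den


if __name__ == "__main__":
    table = (2000, 1000, 1000, 2000)  # hypothetical counts
    print("phi:", round(phi_coefficient(*table), 3))
    print("tetrachoric:", round(tetrachoric(*table), 3))
```

For this hypothetical table the product-moment (phi) coefficient is 1/3, whereas the latent correlation implied by the same table is approximately 0.5, which illustrates why the matrix supplied to the factor analysis matters.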
Future Research on Coping
Further research is needed to refine the tools used in coping research based on the guidance provided in this paper. We recommend the adoption of Likert-type scales rather than the use of frequency scales. This requires minor rewording of existing items but would allow researchers to defend the use of polychoric correlations more easily in analyses. Once the number of factors is known (via an ordinal parallel analysis), researchers can then use the polychoric matrix to perform an ordinal EFA to examine the interpretability of the factor structure. The same matrix can also be used to estimate reliability (e.g., ordinal alpha or omega; see Gugiu et al., 2009). A CFA can also be performed on the polychoric correlation matrix, though it should be performed on a different data set than the one utilized for the EFA. Finally, DWLS and the asymptotic covariance matrix (known as robust DWLS) should be employed to extract parameter estimates and the Satorra–Bentler chi-square to assess model fit along with other indexes.
A surprising omission from the current literature was the complete absence of Rasch modeling or item response theory. These methods can provide insight into instruments beyond what can be learned from factor analytic approaches. For example, Rasch modeling can be used to examine the extent to which instruments exhibit ceiling and floor effects, identify the optimal region in which respondents are measured with more information than measurement error, determine the performance of response categories and the number of scale options respondents can distinguish between, visually compare the distribution of person ability on the latent trait to item difficulty, test for item bias and measurement invariance, compute contextualized differences between groups, and calculate nonlinear composite scores. Finally, we recommend that researchers consider altering the wording of some of the items to increase the functional range of the survey instruments. Though we could not directly test the extent to which item redundancy occurred, many of the survey items we read appeared to be of a similar item difficulty. Optimal surveys need to spread people out along the latent continuum in order to allow one to differentiate between them.
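The point about item difficulty can be made explicit with the simplest member of the Rasch family. The dichotomous form is shown as a sketch; the coping instruments use polytomous response scales, for which the rating scale extension adds category thresholds. The notation ($\theta_n$ for person location, $b_i$ for item difficulty) is introduced here for illustration and is not taken from the reviewed studies.

```latex
P(X_{ni} = 1 \mid \theta_n, b_i)
  = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)},
\qquad
I_i(\theta) = P_i(\theta)\,\bigl[1 - P_i(\theta)\bigr],
\qquad
\max_{\theta} I_i(\theta) = \tfrac{1}{4} \ \text{at} \ \theta = b_i .
```

Because each item is most informative where the person location equals its difficulty, items of similar difficulty concentrate measurement precision in one region of the latent continuum and leave respondents elsewhere poorly differentiated.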
Supplemental Material
Supplemental Material, sj-pdf-1-ehp-10.1177_01632787221084773 for A Critical Appraisal of the Evidence Supporting the Factor Structure of Extant Coping Instruments by Paul C. Gugiu, Damon Drew and Ela Polek in Evaluation & the Health Professions
Supplemental Material, sj-ppt-2-ehp-10.1177_01632787221084773 for A Critical Appraisal of the Evidence Supporting the Factor Structure of Extant Coping Instruments by Paul C. Gugiu, Damon Drew and Ela Polek in Evaluation & the Health Professions
Supplemental Material, sj-pdf-3-ehp-10.1177_01632787221084773 for A Critical Appraisal of the Evidence Supporting the Factor Structure of Extant Coping Instruments by Paul C. Gugiu, Damon Drew and Ela Polek in Evaluation & the Health Professions
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
References
