Abstract
This article summarizes the general uses and major characteristics of factor analysis, particularly as they may apply to counseling research and practice. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) are overviewed, including their principal aims, procedures, and interpretations. The basic steps of each type of factor analysis are elucidated. For EFA, the methods of factor extraction (principal component analysis and principal axis factoring), retention, rotation, and naming are summarized. CFA’s basic operations (model specification, testing, and interpretation) are discussed. In conclusion, EFA and CFA are directly applied to the development of a counseling-related instrument.
An interested client asked her licensed professional counselor about how the test she had just completed was made and whether the counselor could assure her that the questions were valid for her ethnicity. The counselor had only a vague idea about the process.
With limited resources and increased public scrutiny of mental health services, counselors and therapists need to provide evidence-based practice, understand various research methodologies, and evaluate research and other evidence to inform their practice (Cooper, 2010). Almost on a daily basis, counselors find themselves either reading a client’s psychological test results or administering instruments to assess aspects of the client’s mental health, personality, or psychosocial functioning. Although well-educated counselors are largely competent at giving, scoring, and interpreting tests (Tymofievich & Leroux, 2000), indirect evidence suggests they are probably unfamiliar with the research and statistical processes fundamental to test construction (Bauman, 2004; Kamen, Veilleux, Bangen, VanderVeen, & Klonoff, 2010). In fact, designing and validating multidimensional norm-referenced tests (e.g., psychoeducational diagnostic battery) or even well-regarded self-report questionnaires (e.g., a measure of self-efficacy) often requires several years. In most quality test manuals or booklets, at least one chapter summarizes the qualitative and quantitative steps the test author followed to ensure the instrument was reliable and valid. Test manuals are also replete with data tables summarizing the findings of statistical analyses on the items and scales, including the results from various factor analyses computed on the test questions. Unless one is planning on creating a new test or validating an existing measure, the intricacies of factor analysis do not need to be mastered; however, practitioners should understand the basics of this statistical procedure, for most tests or questionnaires are composed of dimensions or scales and these are generated in part by this multivariate statistical method.
In other words, the outcome patterns of factor analysis explain why certain items are grouped together on one dimension (e.g., extroversion) while another set of items from the same test comprises an “opposite” scale (e.g., introversion).
A graduate student in counseling and her professor were working on a research project and wanted to use an existing measure of social anxiety with a sample of late-elementary to middle school-age children. However, because the questionnaire was originally designed and validated for use with adolescents and young adults, they knew the overall measure and items needed to be rechecked for reliability and validity with the new respondent group. From their training, they also understood the basic steps of the test revision process, including revisiting the questionnaire items for developmental appropriateness, perhaps even rewording them and rescaling the response choices from a 7-point to a 4-point Likert-type scale. Next, a large data set from their target sample would need to be collected, followed by conducting a series of factor analyses on the revised items. Regrettably, neither researcher understood the procedure well enough to conduct these analyses and interpret the findings with any confidence.
This latter scenario reflects past studies indicating that counselor educators teaching in master’s and doctoral counseling programs across the United States require further training in research methods (e.g., Astramovich, Okech, & Hoskins, 2004). Given the number of pertinent research designs and statistical methods for counseling graduate programs to cover, it is no surprise that factor analysis can only be addressed in a general way. Fortunately, countless statistical textbooks and journal articles are available on this topic. However, in our view, these publications tend to be overly sophisticated and assume a fairly strong background in multivariate research and statistics. As such, our aim is to provide a largely conceptual overview of factor analysis, one that is directly applicable to the work of counseling professionals and researchers. To achieve this end, the essentials of factor analysis are first reviewed. Next, exploratory factor analysis (EFA) is considered in some depth, including the primary steps involved in conducting this procedure. Third, confirmatory factor analysis (CFA) is summarized. In each section, we have attempted to situate the discussion within the context of counseling research and practice. Moreover, to avoid inundating readers with minutiae, the mathematics of factor analysis is kept to a minimum. We do, however, assume that readers have some familiarity with basic and advanced statistical notions and terms.
Essentials of Factor Analysis
To frame the ensuing discussion, it is essential to first understand the fundamental aims and uses of factor analysis as well as appreciate how they can be applied to counseling. Around the mid-1900s, when the psychological testing movement in American schools was in full force (Kaplan & Saccuzzo, 2010), English psychologist and statistician Charles Spearman (1939) coined the term “factor analysis” to describe a mathematical approach to determining the underlying patterns of relationships among different test scores. With this procedure, Spearman demonstrated that schoolchildren’s performances on a wide range of mental ability tasks were at least moderately intercorrelated on a general (g-factor) mental ability dimension (Bartholomew, 1995). His seminal work on factor analysis has in part contributed to the dramatic rise in the use of tests and measurements in fields like psychology, education, and counseling.
One way to characterize factor analysis is to view it as a sophisticated correlational method to locate regularity and trends in a large data set. In more technical language, this statistical procedure reveals meaningful latent clusters of variables (e.g., a group of interrelated test items) from a larger set of variables and calculates how much shared variance is accounted for in each of these variable groupings. The more shared variance that is associated with each cluster of variables, the better the variables will load on a particular dimension.
This description can be practically illustrated. Suppose a neuropsychologist has administered for years three different 15-item memory tests (45 questions in all) to clients suffering from traumatic brain injuries (TBI). In the hopes of combining three tests that basically measure the same construct into one relatively shorter measure, a factor analysis is computed on a large data set gathered from former clients. The results show that 20 of the 45 items were so highly correlated (i.e., the questions substantially covaried with each other) that the clients were essentially being asked the same memory-related question multiple times. As such, most of the redundant questions were dropped and one 25-item memory test was a far better option. In short, factor analysis efficiently reduces (condenses) correlational data into one or more conceptually related dimensions (DeCoster, 1998).
Uses of Factor Analysis
As exemplified above, factor analysis is a flexible statistical procedure and researchers apply it in a variety of ways. Here are three of the most common uses of factor analysis in counseling and related professions.
Test or Survey Construction
One vital application of factor analysis concerns the development of psychometrically sound measures (tests, surveys, and questionnaires). The purpose of factor analysis in this instance is both data reduction and simplification. In other words, the test developer wants to derive a small number of comprehensible factors from a much larger number of variables (e.g., a data set comprised of test items). Superfluous, indistinct, or irrelevant items can be identified for elimination. The following scenario should elucidate how factor analysis is employed during the test construction process.
A group of counseling researchers were interested in measuring the construct of “life satisfaction” in young adolescents, aged 11–15. After perusing existing measures of life satisfaction, only two possible options were located. Because neither instrument was validated with young adolescents, our investigators chose to design their own. Returning to the theoretical and research literature, they found enough solid work addressing life satisfaction issues in their target population to create enough survey items (variables). They generated 40 sample statements (e.g., “My family makes me feel happy”) that respondents would react to using a 5-point Likert-type scale, ranging from 1 (I completely disagree with the statement) to 5 (I completely agree). Since their intent was to measure several dimensions associated with early adolescent life satisfaction, the researchers anticipated that subsets of the 40 variables would cluster or group on several conceptually related dimensions or factors. In other words, once the factor analysis was run, conceptually related items would moderately to strongly intercorrelate (e.g., from .40 to .85), clustering on one or more interpretable dimensions.
Next, they gathered survey data from nearly 500 students attending four different schools. After entering the data into a statistical software program (e.g., IBM SPSS) and computing the factor analysis procedure, the fictional output (not shown) revealed three relatively clear patterns in the item groupings, forming three factors composed of 10 items each. One set of 10 conceptually related items was highly intercorrelated, creating Factor 1. Because these variables appeared to measure the respondents’ perceptions of their family life, the researchers named this factor “family satisfaction.” Ten other items were moderately to strongly intercorrelated, clustering on Factor 2, which was labeled the “satisfaction with peer relations” dimension. A third factor emerged from the results, with 10 items moderately correlating with each other. It was subsequently titled the “optimism–hopefulness” dimension.
The 10 remaining survey items weakly intercorrelated (e.g., rs between .00 and .30) with each other and with the other items comprising the three derived factors. The researchers reviewed the content of these “leftover” items finding that they addressed adolescent qualities unrelated to family satisfaction, peer satisfaction, or optimism–hopefulness. These variables were considered irrelevant and dropped from the 30-item Young Adolescent Life Satisfaction Inventory (YALSI). Finally, using a factor analysis procedure option, factor scores were computed for each of the three dimensions. The YALSI was now ready for its trial run with a new group of youngsters. In brief, factor analysis used in test construction reduced (or condensed) the original set of 40 variables to 30 usable items, which in turn clustered on three relatively distinct dimensions or factors.
Revising Established Tests
For various reasons, existing instruments need to be reworked. Perhaps, the test completion time is far too long for certain client groups (e.g., a group of TBI clients) or certain items are outdated or even biased toward respondents from a particular ethnicity. As a practical example, suppose a rehabilitation counselor desires to assess his clients’ level of satisfaction with their new work environment. The counselor locates a high-quality 150-item questionnaire—“Workplace Fulfillment Questionnaire” (WFQ), one with a wide range of items and scales pertaining to satisfaction with various aspects of the work setting (e.g., supervisor–worker relations, safety issues, and difficulty level of job responsibilities). The counselor wonders if all 150 items have to be administered or could the measure be shortened without compromising reliability and validity. The counselor pays a visit to the agency’s data and records manager looking for some statistical assistance.
The WFQ has the following items, among the many others, which are conceptually linked with the “Supervisor-Work Relations” scale: “Item 1: Workers are able to comfortably voice their opinions to their supervisors;” “Item 29: The boss is very open to employees’ viewpoints;” and, “Item 45: People who work at this company are free to express their concerns.” The data manager conducts a factor analysis on the data collected from a large sample of previously administered inventories, finding that these three variables are highly intercorrelated (ranging from .87 to .89) and strongly “load” on the 15-item “Supervisor-Work Relations” scale. In this situation, one might want to eliminate two of the three highly correlated items, because they seem to measure the same content. If the manager combed through all the results, she might locate other highly correlated items loading strongly on different factors. This process could hypothetically reduce the number of items on the survey by say one third. Of course, permission to administer the revised WFQ would have to be obtained from the hypothetical publisher.
Theory Testing
Another common application of factor analysis involves the empirical testing of theoretical data structures. Here, the central goal is to confirm an existing theory and to identify key variables that represent the theory. A researcher, for example, may be interested in revising a theory of adult personality formation. She can use factor analysis to investigate the underlying correlational patterns shared by personality-related variables in order to test or confirm her theoretical model of personality development.
To conclude, factor analysis has a number of worthwhile applications. Perhaps most importantly, it is a powerful statistical method to determine how best to represent a set of p observed variables with a set of m derived variables so that m < p (Velicer, Eaton, & Fava, 2000). From a large inter-item correlation matrix, the factor analysis process, as explained later, results in a smaller number of interpretable factors or conceptually related dimensions. Technically, a factor (also called a latent dimension or construct) can be described as a “condensed statement of the relationships between a set of variables” (Kline, 1994, p. 5) and it is defined by the magnitude of its factor loadings (i.e., the strength of the correlations of specific variables with a particular factor).
Two Major Categories of Factor Analysis
To optimize the value of factor analysis, investigators should base their research decisions on a solid theoretical foundation. By doing so, the type of factor analysis to deploy becomes obvious and the results are more coherently interpreted (i.e., the factor loadings are understood in light of their conceptual underpinnings). Although there are several types of factor analysis, most statisticians working in education, social science, and allied disciplines place them into one of the two overarching categories: EFA and CFA. Various scholars include principal component analysis (PCA) as a third type of factor analysis (e.g., Guadagnoli & Velicer, 1988; Schonemann, 1990; Velicer & Jackson, 1990). Others classify PCA as a method of extracting factors from an intercorrelation matrix rather than an actual type of factor analysis (e.g., Costello & Osborne, 2005; Fabrigar, Wegener, MacCallum, & Strahan, 1999). For our purposes here, we have adopted this latter position. Conceptual and technical comparisons between various EFA methods and PCA are challenging to elucidate fully in an introductory article; however, these are summarized later as well as in Table 1 and Table 2. For detailed discussions, counselors and researchers have numerous generally accessible texts to consult (e.g., Child, 2006; Goldberg & Velicer, 2006; Kline, 1994; Nunnally & Bernstein, 1994; Pett, Lackey, & Sullivan, 2003; Tabachnick & Fidell, 2007; Thompson, 2004).
Table 1. Types of Factor Analysis, Extraction Methods, Rotation Methods, and Sample Counseling Studies.
Note. a = DeCoster (1998). b = Rotation method can be applied to PCA and PAF.
Table 2. Basic Steps in Conducting EFA and CFA.
Note. EFA = exploratory factor analysis; CFA = confirmatory factor analysis.
EFA
Although researchers may have some vague ideas about how test items may cluster, the primary goal of EFA is to discover the underlying structure of observed variables. As such, the researcher should not impose any preconceived structures. EFA identifies latent factors that explain the covariation (correlation) among a set of variables. Ideally, the derived factors should consist of relatively homogenous variables, where each item loads strongly onto one factor and minimally on the other factor(s). To illustrate this approach, a national group of school counselors are interested in measuring “effectiveness” and they identify several characteristics of competent school counselors from the research literature. Specifically, they are interested in detecting the underlying dimensions of an effective school counselor practice rather than several separate characteristics. Assume the factor analysis resulted in three underlying dimensions as depicted in Figure 1. Certain items substantially loaded on Factor 1 (“Counseling Effectiveness”), others loaded on Factor 2 (“Administrative Effectiveness”), and so on. Thus, from a reasonably large assortment of variables measuring aspects of school counselor effectiveness, factor analysis summarized the data into three reliable dimensions.

Figure 1. Three clusters/factors derived from a number of items on a “school counselor effectiveness” measure. Note. Retrieved from http://upload.wikimedia.org/wikipedia/commons/thumb/8/88/FactorAnalysis_ConceptualModel_DotsRings.png/450px1.0FactorAnalysis_ConceptualModel_DotsRings.png
Steps in Conducting an EFA
As alluded to earlier, EFA is largely data-driven in that the latent patterns in the data comprise the factors. There are no restrictions placed on the patterns of relationships among variables. The common derived factors are assumed to influence every observed variable (Albright & Park, 2009) and are either correlated or uncorrelated. To conduct an EFA, general statistical software packages such as IBM SPSS or SAS are good choices. The principal EFA steps are summarized in the following subsections.
Item Creation
Working from relevant theoretical and research literature, researchers must decide on the construct(s) they want to actually measure. To operationalize this construct and its potential latent dimensions, they generate, after much planning (e.g., determining the number and nature of the variables), a series of interrelated variables, generally more than what is actually required for a good factor analytic solution. Obviously, if the variables do not conceptually “go together,” finding common factors with substantial shared variance is highly unlikely (Kahn, 2006). In devising the variables, the type of data to be analyzed needs to be considered as well; namely, the data collected should be measured on a scale that is suitable for correlational analysis. Although there are ways to factor analyze categorical (e.g., dichotomous “yes or no” items) information, best practice is to start with continuous (ratio or interval) data. Data measured on, for example, a 5- or 7-point Likert-type scale are generally acceptable for factor analysis.
Sample Size Estimation
Prior to collecting data, researchers determine the nature and size of the sample to be studied. In most psychometric investigations, a large sample size is recommended. There are several good reasons for this practice. As the sample size increases, the standard errors of the resulting factor loadings generally decrease. Large samples also tend to provide more stable results and the derived sample loadings more closely estimate population factor loadings. Scholarly consensus is lacking, however, on what sample size is sufficiently large to achieve reliable and stable estimates.
Most researchers determine the minimum sample size required by calculating the ratio of sample size N (total cases) to the number of variables being analyzed, p. Depending upon the expert one reads, the recommended N:p ratio ranges from 3 to 20 (e.g., Cattell, 1978; Comrey & Lee, 1992; Everitt, 1975). Perhaps a middle ground is a ratio of 10 cases per variable. Thus, when 20 variables are under study, the number of cases should be at least 100 (a 5:1 ratio) and preferably 200 (the 10:1 middle ground). In general, sample sizes greater than 200 are preferred. Comrey and Lee (1992) provided the following sample size guidelines: N = 50, very poor; 100, poor; 200, fair; 300, good; 500, very good; and 1,000 or more, excellent.
It should be noted that while there are recommended minimum sample sizes, studies have shown adequate stability and reproduction of population factor loadings with smaller samples (e.g., MacCallum, Widaman, Zhang, & Hong, 1999). Hogarty, Hines, Kromrey, Ferron, and Mumford (2005) encouraged careful planning in conducting factor analysis, where sample size considerations should also reflect the context of the study and various aspects of the factor solution that are most important to the research (e.g., the communality and overdetermination of factors). With “strong data” a smaller sample size can be adequate. Such a robust data set yields high communalities (≥.80), low cross-factor loadings (<.32), and moderate to high factor loadings (≥.50). If the researcher is unsure about the quality of the data set, it is better practice to maximize, as much as feasible, the sample size.
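The N:p ratio and the Comrey and Lee (1992) labels described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and treating each quoted N as the lower bound of its band is our assumption rather than something the guidelines state.

```python
def cases_per_variable(n_cases, n_variables):
    """Return the N:p ratio (cases per analyzed variable)."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_cases / n_variables


def comrey_lee_rating(n_cases):
    """Map a sample size onto the Comrey and Lee (1992) labels.

    Assumption: each quoted N is treated as the lower bound of its band.
    """
    bands = [
        (1000, "excellent"),
        (500, "very good"),
        (300, "good"),
        (200, "fair"),
        (100, "poor"),
        (50, "very poor"),
    ]
    for lower_bound, label in bands:
        if n_cases >= lower_bound:
            return label
    return "below the lowest quoted guideline"


# Example: a hypothetical survey with 500 respondents and 40 items.
print(cases_per_variable(500, 40), comrey_lee_rating(500))
```

A ratio or label of this kind is a planning aid only; as noted above, data quality (communalities and overdetermination) also bears on how large a sample is needed.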
Data Collection, Screening, and Checking for Parametric Assumptions
After the data have been secured, they must be closely vetted for missing information, data entry errors, and irregular response patterns. Next, the researcher determines whether the parametric assumptions of factor analysis can be met. Although, similar to other multivariate statistical procedures, factor analysis tends to be robust to modest violations of normality, the solution is enhanced if the variables are normally distributed. Field (2009) recommended a number of procedures to examine the data for normality. For instance, skew and kurtosis indices can be computed for each variable. Preferably, these fall within ±1.0. Variable distributions should be visually inspected using histograms, as well as box, probability–probability (P-P), and quantile–quantile (Q-Q) plots. Basically, a P-P plot (see, e.g., Figure 2) compares the cumulative distribution of sample data to a normal distribution. Similarly, a Q-Q plot (see, e.g., Figure 3) compares the quantiles of the sample data to those of a normal distribution. When there is a relatively close match, the plot will appear nearly linear.
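As a rough illustration of the skew and kurtosis screening described above, the following Python sketch computes both indices for one variable. It uses the simple moment-based formulas (with kurtosis expressed as excess kurtosis, so a normal distribution scores near 0); statistical packages usually report bias-corrected versions, so their values will differ slightly. The function names and the ±1.0 default are illustrative.

```python
def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("variable has no variance")
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


def within_screening_limit(values, limit=1.0):
    """True when both indices fall inside +/- limit."""
    skew, kurt = skew_and_excess_kurtosis(values)
    return abs(skew) < limit and abs(kurt) < limit


# Example: hypothetical responses to one item, piled up at the low end.
item_scores = [1, 1, 1, 1, 10]
print(skew_and_excess_kurtosis(item_scores))
print(within_screening_limit(item_scores))
```

Numerical indices of this kind complement, rather than replace, the visual inspection of histograms and P-P or Q-Q plots.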

Figure 2. Sample P-P plot. P-P = probability–probability.

Figure 3. Sample Q-Q plot. Q-Q = quantile–quantile.
Moreover, extreme outliers must be identified and, in most cases, removed. Whereas simple scatterplots and box plots are used to check for univariate and bivariate outliers, multivariate outliers can be detected using a diagnostic tool like the Mahalanobis distance (D; Field, 2009). The scale-invariant D index is the distance between an observation (e.g., a respondent’s set of item scores) and the mean of a distribution, taking into account correlations within the data set. Finally, since factor analysis assumes that the variables are linearly related, bivariate scatterplots between variables should be examined.
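To make the Mahalanobis distance concrete, here is a minimal NumPy sketch that computes D for every case in a small data set. The data and the function name are hypothetical; the sample covariance matrix is used, and no cutoff for flagging outliers is implied.

```python
import numpy as np


def mahalanobis_distances(data):
    """Return the Mahalanobis distance of each row from the variable means."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    cov = np.cov(data, rowvar=False)  # sample covariance (n - 1 denominator)
    inv_cov = np.linalg.inv(cov)
    # Squared distance for row i: centered[i] @ inv_cov @ centered[i]
    d_squared = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    return np.sqrt(d_squared)


# Hypothetical scores on two items for seven respondents;
# the last respondent has an unusual combination of scores.
scores = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [10, 1]]
distances = mahalanobis_distances(scores)
print(distances.round(2))
print("Most extreme case:", int(np.argmax(distances)))
```

Because the covariance matrix enters the calculation, a case can be flagged for an unusual combination of scores even when no single score is extreme on its own.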
Creating a Correlation Matrix and Inspecting It for Factorability
Once the data set has been screened and checked for normality, the researcher computes correlations among all the variables. The intercorrelation matrix is then examined for its suitability for factor analysis. Generally, the intercorrelation matrix is factorable if the majority of the correlation coefficients are at least low-moderate to strong (r = .20+). If one finds a number of correlations between variables exceeding .85, multicollinearity becomes a concern. Table 3 is a hypothetical 8 by 8 intercorrelation matrix that appears, by visual inspection, to be factorable.
Table 3. Sample 8 by 8 Intercorrelation Matrix.
Note. Of the 28 correlations, 15 are >.20, suggesting the matrix is perhaps factorable.
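The visual inspection described above amounts to counting correlations, which the following Python sketch automates. The function name and the small example matrix are hypothetical, and using absolute values (so that strong negative correlations also count) is our assumption.

```python
def screen_correlations(matrix, low=0.20, high=0.85):
    """Count off-diagonal correlations at or above `low` and above `high`.

    Assumption: absolute values are used, so negative correlations count too.
    """
    p = len(matrix)
    pairs = [abs(matrix[i][j]) for i in range(p) for j in range(i + 1, p)]
    return {
        "n_pairs": len(pairs),  # p(p - 1)/2 unique correlations
        "n_at_least_low": sum(r >= low for r in pairs),
        "n_above_high": sum(r > high for r in pairs),
    }


# Hypothetical 4 by 4 intercorrelation matrix.
example = [
    [1.0, 0.5, 0.1, 0.3],
    [0.5, 1.0, 0.9, 0.0],
    [0.1, 0.9, 1.0, -0.4],
    [0.3, 0.0, -0.4, 1.0],
]
print(screen_correlations(example))
```

For eight variables the same function would examine 28 unique correlations, matching the count in the table note.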
When factor analyzing an intercorrelation matrix comprised of, say, 50 variables, examination of each correlation is tedious and error-prone. Mercifully, two quantitative methods to assess the correlation matrix are available through well-known statistical packages (e.g., IBM SPSS and SAS). One option is Bartlett’s test of sphericity, which examines whether the variables are largely uncorrelated. In the worst-case scenario, one finds an identity matrix, where the correlations on the diagonal of the intercorrelation matrix are 1.00 and the rest are 0.00. A significant (p < .05) sphericity $\chi^2$ suggests that the data set, and thus the correlation matrix, is factorable. Another alternative is to compute the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy. The KMO index specifies how small the partial correlations are relative to the original correlations. When the intercorrelation matrix is an identity matrix, the KMO should be .05, the default value in IBM SPSS. Small KMO values indicate that correlations between pairs of variables cannot be explained by other variables. For data to be at least marginally factorable, the KMO should be greater than .60 (Kaiser, 1974). KMO estimates closer to .80 or .90 suggest the intercorrelation matrix is almost ideal for factor analysis (Pett et al., 2003).
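For readers who want to see what these two indices compute, the NumPy sketch below implements the standard textbook formulas: Bartlett’s statistic, $\chi^2 = -[(N - 1) - (2p + 5)/6]\ln|R|$ with $p(p - 1)/2$ degrees of freedom, and the overall KMO as the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations. The function names and the example matrix are hypothetical, and the p value is left to a chi-square table or statistical package.

```python
import numpy as np


def bartlett_sphericity(corr, n_cases):
    """Return (chi-square, degrees of freedom) for Bartlett's test."""
    corr = np.asarray(corr, dtype=float)
    p = corr.shape[0]
    sign, log_det = np.linalg.slogdet(corr)
    if sign <= 0:
        raise ValueError("correlation matrix must be positive definite")
    chi_square = -((n_cases - 1) - (2 * p + 5) / 6.0) * log_det
    df = p * (p - 1) // 2
    return float(chi_square), df


def kmo_overall(corr):
    """Return the overall Kaiser-Meyer-Olkin index."""
    corr = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(corr)
    # Partial correlation of each pair, controlling for all other variables.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    sum_r2 = np.sum(corr[off_diagonal] ** 2)
    sum_partial2 = np.sum(partial[off_diagonal] ** 2)
    return float(sum_r2 / (sum_r2 + sum_partial2))


# Hypothetical matrix: three items that all intercorrelate at .50, N = 100.
R = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
print(bartlett_sphericity(R, n_cases=100))
print(kmo_overall(R))
```

The sketch is intended to show the logic of the two indices, not to replace the output of an established statistical package.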
Factor Extraction
Once the intercorrelation matrix’s suitability for factor analysis has been established, initial factors (EFA) or components (PCA) are extracted from the matrix. As mentioned previously, one can use an EFA or PCA extraction method. For various technical reasons, statisticians generally prefer EFA over PCA, but with large sample sizes the differences in factor solutions are rarely dramatic (Costello & Osborne, 2005). Without going into too much detail, the extraction process entails the partitioning of the common or shared variance associated with each variable from its unique variance (i.e., variance in each variable that is not shared with any of the other variables) and error variance (i.e., variance not otherwise accounted for in a variable such as random variance) to reveal the underlying factor structure (Brown, 2010). A good factor analytic solution is one where the shared variance represented by the variable’s communality ($h^2$) is maximized and the unexplained and error variance minimized. Variable communalities are represented as values ranging from 0.0 to 1.0. The closer the $h^2$ is to 1.0, the greater the percentage of variance of a variable (e.g., a test item) that is accounted for by a set of common factors or components. In short, a variable’s communality is the proportion of its variance that is explained jointly by the extracted factors.
Communalities are derived by squaring each variable’s factor loading across each of the derived factors and summing them. To obtain a percentage, the sum is multiplied by 100. To illustrate the simple math involved, assume survey Item A’s loadings (correlations) on Factors 1, 2, 3, and 4 were .96, −.02, −.08, and −.04, respectively. Clearly, Item A loads highly on Factor 1 and negligibly on the other three factors. The resulting $h^2$ for Item A is calculated as follows: $h^2 = (.96)^2 + (-.02)^2 + (-.08)^2 + (-.04)^2 = .9216 + .0004 + .0064 + .0016 = .93$. Thus, an exceptionally high 93% (.93 × 100) of the variance of Item A is explained by Factors 1 through 4.
It makes sense that the quality of the factor analysis solution improves with increasing communalities. A good factor solution explains most of the variance (50–75%) in the intercorrelation matrix with the fewest possible factors. If most of the communalities are relatively low (<0.5), the variables still possess a substantial amount of unexplained variance. The researcher should therefore consider either extracting more factors to explain the variance or removing the variables with low communalities. High communalities (e.g., >0.8) indicate that the extracted factors explain most of the variance in the variables being analyzed.
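The Item A calculation can be reproduced with a few lines of Python. The function name is illustrative; the loadings are those given in the example above.

```python
def communality(loadings):
    """Sum of squared loadings of one variable across all extracted factors."""
    return sum(loading ** 2 for loading in loadings)


# Item A's loadings on Factors 1 through 4, as given in the text.
item_a_loadings = [0.96, -0.02, -0.08, -0.04]
h2 = communality(item_a_loadings)
print(f"h2 = {h2:.2f} ({h2 * 100:.0f}% of Item A's variance)")
```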
Furthermore, a good factor solution means all the derived factors are overdetermined. Such factors possess two major characteristics: (a) each reveals moderate-to-strong loading (e.g., .40 to .80) on at least three variables and (b) the overall factor matrix achieves simple structure. This latter notion implies that the factor solution is most parsimonious, where each variable loads moderately to strongly on one factor and very weakly (near 0.0) on the other factors. A situation where a factor is underdetermined suggests that the underlying factor is difficult to identify and thus, interpret (Comrey & Lee, 1992).
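The two characteristics of an overdetermined solution can be checked mechanically once a loading matrix is available. The Python sketch below is illustrative only: the function names, the .40 salience threshold (taken from the range quoted above), and the loading values are assumptions chosen for the example.

```python
def salient_counts(loadings, salient=0.40):
    """Number of variables loading at or above `salient` on each factor."""
    n_factors = len(loadings[0])
    counts = [0] * n_factors
    for row in loadings:
        for k, value in enumerate(row):
            if abs(value) >= salient:
                counts[k] += 1
    return counts


def cross_loading_items(loadings, salient=0.40):
    """Indices of variables with salient loadings on more than one factor."""
    return [
        i for i, row in enumerate(loadings)
        if sum(abs(value) >= salient for value in row) > 1
    ]


def is_overdetermined(loadings, salient=0.40, min_items=3):
    """True when every factor has at least `min_items` salient loadings."""
    return all(c >= min_items for c in salient_counts(loadings, salient))


# Hypothetical loadings: rows are items, columns are Factors 1 and 2.
loadings = [
    [0.72, 0.10],
    [0.65, 0.05],
    [0.58, -0.12],
    [0.08, 0.70],
    [0.15, 0.61],
    [0.45, 0.52],  # loads on both factors, so simple structure is violated
]
print(salient_counts(loadings), cross_loading_items(loadings))
```

In this hypothetical matrix both factors meet the three-variable minimum, but the last item would be a candidate for revision or removal because it cross-loads.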
As each item has a communality, each factor has an eigenvalue (EV or λ). An EV represents the amount of variance that is accounted for by a factor relative to the total variance of the factor matrix and is calculated as the sum of the squared loadings of the variables on that factor. If the four loadings comprising Factor 1 are −.50, .36, .89, and .92, then the EV equals 2.02: $\lambda_1 = (-.50)^2 + (.36)^2 + (.89)^2 + (.92)^2 \approx 2.02$. Moreover, EVs are allocated to factors according to the amount of variance explained out of the total variance. The proportion of variance explained by each factor is determined by dividing the factor’s EV by the total number of variables in the analysis (i.e., EV/number of variables). If the analysis in our above example includes only these four variables and the EV of Factor 1 is 2.02, the factor accounts for about 50% of the total variance in the intercorrelation matrix (2.02 ÷ 4 ≈ .50).
The factor analysis process extracts factors one by one until as much variance as possible in the intercorrelation matrix is accounted for by the fewest salient factors. The initial factor always accounts for the most variance. As long as there is additional variance to explain after the first factor is extracted, as is almost always the case, the extraction process continues. When little or no variance is left to be explained in the intercorrelation matrix by the derived factors, the process stops. Each successively extracted factor will have a lower EV, because the amount of variance remaining is gradually diminished. A meaningful extracted factor should always explain more variance than a single variable. Some researchers (e.g., Reise, Waller, & Comrey, 2000) argue that it is preferable to overextract than to underextract factors. Underextraction is more likely to result in factors that contain substantial error. The standard error in factor loadings is influenced by elements other than sample size, including the method of rotation (see below), the number of factors, and the degree to which the factors are correlated (MacCallum & Tucker, 1991). Finally, factors with EVs of at least 1.0 are more stable and thus more likely to be retained as part of the solution (Pett et al., 2003).
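The eigenvalue arithmetic in the Factor 1 example can be checked with a short Python sketch. The function names are illustrative, and the example assumes, as the text does, that only four variables are in the analysis.

```python
def eigenvalue_from_loadings(factor_loadings):
    """Sum of squared loadings of all variables on one factor."""
    return sum(loading ** 2 for loading in factor_loadings)


def proportion_of_variance(eigenvalue, n_variables):
    """Share of total variance explained: eigenvalue / number of variables."""
    return eigenvalue / n_variables


# Loadings of the four variables on Factor 1, as given in the text.
factor_1 = [-0.50, 0.36, 0.89, 0.92]
ev = eigenvalue_from_loadings(factor_1)
print(f"EV = {ev:.2f}, proportion = {proportion_of_variance(ev, 4):.2f}")
```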
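The pattern of successively smaller eigenvalues, and the retention guideline of EV ≥ 1.0, can be illustrated with NumPy. This sketch takes the eigenvalues of the full (unreduced) correlation matrix, which corresponds to a principal components extraction; the function names and example matrix are hypothetical.

```python
import numpy as np


def eigenvalues_descending(corr):
    """Eigenvalues of a correlation matrix, largest first."""
    values = np.linalg.eigvalsh(np.asarray(corr, dtype=float))
    return values[::-1]  # eigvalsh returns ascending order


def count_retained(corr, cutoff=1.0):
    """Number of eigenvalues at or above the cutoff."""
    return int(np.sum(eigenvalues_descending(corr) >= cutoff))


# Hypothetical matrix: three items that all intercorrelate at .50.
R = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
evs = eigenvalues_descending(R)
print(evs.round(2), "retained:", count_retained(R))
print("share of variance:", (evs / len(R)).round(2))
```

The eigenvalues always sum to the number of variables, which is why dividing each EV by that number gives its share of the total variance.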
Although there is an array of extraction methods that factor analysts can choose from, the discussion here focuses on the two conventional strategies, PCA and principal axis factoring (PAF; also called principal factor analysis). Before the advent of high-speed computing, PCA was the most expedient way of extracting components (or factors). Its central goal is to simplify a large number of items (e.g., 40) to a far smaller number of parsimonious components (e.g., 4). By analyzing both variance unique to each variable and the variance the variables have in common (shared variance), PCA is a process that reproduces the data structure using the smallest number of components. Every pre-extraction item communality must equal 1.0, for all the variance associated with each item is initially represented. When a component is extracted, all the variance associated with it is partialled out, and thus, post-extraction item communalities are far less than 1.0. What remains, the residual, is independent of all extracted components, meaning that each component is treated as if it is independent of all others with intercorrelations of zero. It should be noted that because PCA analyzes all the variance in each variable, some experts contend that PCA is not a true factor analysis procedure (Fabrigar et al., 1999). They contend that the goal of factor analysis is identifying common factors, whereas PCA finds linear combinations that retain as much information as possible.
PAF is another commonplace extraction method. It is best suited for exploring the underlying factors theorized by the researcher (DeCoster, 1998). In other words, the method reveals the latent structure of a set of original variables. The emerging factors from PAF explain the variance common to more than one variable. More specifically, unlike PCA, PAF analyzes only shared variance and leaves out the variables’ unique variance. That is, PAF analyzes only the communality. Figure 4 distinguishes between PCA and PAF in terms of the variance elements (pre-extraction and post-extraction) each method analyzes. Figure 5 presents the initial and extracted communalities for 15 hypothetical self-efficacy survey items using PCA and PAF, respectively. Notice that for PCA, the initial (pre-extraction) communalities are 1.0, and for PAF, they are lower. The reason for this difference is that PCA places 1.0 on the diagonal of the intercorrelation matrix as the initial communality. Instead of 1.0, PAF uses the R² (squared multiple correlation) of each item with the remaining items as the initial estimate of communality. Computationally, the R² will always be less than 1.0, so the extracted communalities for PAF will also be somewhat smaller than those for PCA (see Figure 5). Since the extracted communalities for both PCA and PAF represent the proportion of variance in an item explained by other items, the h² for each variable with both extraction methods is less than 1.0.

Variance analyzed by PCA and PAF. PCA = principal component analysis; PAF = principal axis factoring.

PCA (left) and PAF (right) extraction output for 15 items (IBM SPSS).
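The difference in initial communalities can be sketched in a few lines. The example assumes NumPy and a hypothetical correlation matrix; the function name `initial_communalities` is ours, and the squared multiple correlation is computed with the standard identity 1 − 1/diag(R⁻¹).

```python
import numpy as np

# Hypothetical 4-item intercorrelation matrix (illustrative values only).
R = np.array([
    [1.00, 0.60, 0.50, 0.10],
    [0.60, 1.00, 0.55, 0.15],
    [0.50, 0.55, 1.00, 0.20],
    [0.10, 0.15, 0.20, 1.00],
])

def initial_communalities(R, method):
    """Pre-extraction communalities for 'pca' or 'paf' (hypothetical helper)."""
    if method == "pca":
        # PCA analyzes all the variance, so every item starts at 1.0.
        return np.ones(R.shape[0])
    if method == "paf":
        # PAF starts from each item's squared multiple correlation (R^2)
        # with the remaining items: 1 - 1 / diag(R^-1).
        return 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    raise ValueError("method must be 'pca' or 'paf'")

pca_start = initial_communalities(R, "pca")
paf_start = initial_communalities(R, "paf")
print("PCA initial communalities:", np.round(pca_start, 3))
print("PAF initial communalities:", np.round(paf_start, 3))
```

In this sketch, the fourth item correlates weakly with the others, so its PAF starting value is much lower than those of the first three items.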
Quite often, different extraction methods computed on data sets with large sample sizes produce comparable factor solutions. Simulation studies, however, show PAF to generate a more reliable solution when communalities are low (Kahn, 2006). Furthermore, PAF is robust to violations of normality in the data. In contrast, PCA can lead to misleading results as it tends to yield larger loadings than do other methods. Again, if one’s purpose is to determine latent factors in the data, PAF and related methods (e.g., maximum likelihood [ML]) are preferred. As we mentioned earlier, some researchers use multiple approaches to factor extraction, retention, and rotation, and then compare the results. The most advantageous factor structure is then reported. Doing this is an unsound practice, compromising the primary aims of each method.
Factor Retention
To reiterate, the goal of factor analysis is to explain the maximum amount of variance in the intercorrelation matrix with the fewest factors or components. Choosing the right factor extraction approach is vital, but just as importantly, one must accurately decide on the number of factors to be retained for subsequent rotational analysis. Most factors that could be extracted are not meaningful (i.e., they do not account for much variance), and therefore, should not be retained. The major statistical programs use a default setting called the Kaiser criterion, which keeps only those factors with an EV over 1.0. Researchers need to be prepared to override this setting, for accepting the default may lead to erroneous decisions. Pett, Lackey, and Sullivan (2003), among many other experts, recommended starting with the Kaiser 1.0 guideline and then examining the scree plot (see, e.g., Figure 6). In this line graph, factor EVs (Y-axis) are plotted against the number of possible factors (X-axis). The point at which the line begins to show a clear bend indicates the number of factors that should be retained. In other words, the cutoff for the number of factors to retain is the point on the graph where additional factors fail to add appreciably to the cumulative variance. By inspecting the sample scree plot depicted in Figure 6, an experienced factor analyst would probably retain three factors.

Sample scree plot (IBM SPSS).
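The Kaiser guideline and the numerical information behind a scree plot can be sketched as follows. The eigenvalues are hypothetical (they are not the values plotted in Figure 6), and the helper names are ours.

```python
# Hypothetical eigenvalues for a 10-item measure; they sum to 10 because
# each standardized item contributes 1.0 unit of variance.
eigenvalues = [3.9, 2.1, 1.4, 0.7, 0.5, 0.4, 0.4, 0.3, 0.2, 0.1]

def kaiser_count(evs, cutoff=1.0):
    """Number of factors whose EV exceeds the Kaiser cutoff."""
    return sum(1 for ev in evs if ev > cutoff)

def cumulative_variance(evs):
    """Running proportion of total variance explained."""
    total = sum(evs)
    running, shares = 0.0, []
    for ev in evs:
        running += ev
        shares.append(running / total)
    return shares

def successive_drops(evs):
    """Drop in EV from one factor to the next (the 'bend' in a scree plot)."""
    return [a - b for a, b in zip(evs, evs[1:])]

print("Kaiser count:", kaiser_count(eigenvalues))
print("Cumulative variance:", [round(s, 2) for s in cumulative_variance(eigenvalues)])
print("Successive drops:", [round(d, 2) for d in successive_drops(eigenvalues)])
```

With these made-up values, the drops become small and roughly constant after the third factor, which is the pattern an analyst looks for when reading the bend of a scree plot.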
Conducting a parallel analysis (PA) can augment one’s decision-making process (Horn, 1965). Unlike the Kaiser criterion and scree plot approaches, PA takes into account sampling error. Essentially with PA, random data sets are generated that match the observed data in the number of observations and the number of variables. The mean EVs for the random data are calculated. The factors retained are those for which the EV from the sample data exceeds the corresponding EV from the randomly generated data. As illustrated in Figure 7, three factors would be retained. Accuracy of prediction is improved by replicating the process with multiple sets of random data.

Illustration of parallel analysis.
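The logic of PA can be sketched in a few lines. This is a minimal illustration assuming NumPy, normally distributed random data, and mean random EVs as the comparison values; published implementations often use a percentile instead. The simulated data set, with two underlying factors, is hypothetical and is not the data behind Figure 7.

```python
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    """Return (n_retained, observed EVs, mean random EVs). Illustrative only."""
    rng = np.random.default_rng(seed)
    n_obs, n_vars = data.shape
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    random_evs = np.empty((n_sims, n_vars))
    for s in range(n_sims):
        # Random data matching the observed data in size and number of variables.
        noise = rng.standard_normal((n_obs, n_vars))
        random_evs[s] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    mean_random = random_evs.mean(axis=0)

    # Retain factors up to the first one whose observed EV fails to exceed
    # the mean EV from the random data.
    exceeds = observed > mean_random
    n_retained = n_vars if exceeds.all() else int(np.argmin(exceeds))
    return n_retained, observed, mean_random

# Hypothetical data: 300 respondents, 6 items, two uncorrelated factors.
rng = np.random.default_rng(42)
factors = rng.standard_normal((300, 2))
loadings = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
                     [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
data = factors @ loadings.T + 0.6 * rng.standard_normal((300, 6))

n_retained, observed, mean_random = parallel_analysis(data)
print("Observed EVs:   ", np.round(observed, 2))
print("Mean random EVs:", np.round(mean_random, 2))
print("Factors retained:", n_retained)
```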
Yet another option to assist with determining how many factors to retain is the minimum average partial (MAP) correlation procedure (Velicer, 1976). In the MAP procedure, a new correlation matrix is calculated after each factor is partialled out of the original matrix. If the removal of the factor results in the removal of common variance, the MAP function will decrease. If the removal of the factor results in the removal of unique variance, the MAP function will increase (see Figure 8). The point where the MAP function is at a minimum indicates the number of factors to be retained. Here, the matrix most closely resembles an identity matrix. Examining Figure 8, the MAP procedure suggests that two factors would be retained. In short, adding PA and MAP to the decision-making process improves the likelihood of retaining the correct number of factors (Zwick & Velicer, 1986).

Illustration of MAP procedure. MAP = minimum average partial.
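A compact sketch of the MAP computation follows. It assumes NumPy, uses principal components of the correlation matrix as the factors that are partialled out, and runs on hypothetical simulated data with two underlying factors (not the data behind Figure 8). The function name `velicer_map` is ours.

```python
import numpy as np

def velicer_map(R):
    """Return (n_factors, average squared partial correlations for m = 0..p-1)."""
    p = R.shape[0]
    evals, evecs = np.linalg.eigh(R)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    loadings = evecs * np.sqrt(np.clip(evals, 0.0, None))

    off_diag = ~np.eye(p, dtype=bool)
    # m = 0: average squared correlation before anything is partialled out.
    avg_sq = [np.mean(R[off_diag] ** 2)]
    for m in range(1, p):
        A = loadings[:, :m]
        partial_cov = R - A @ A.T              # remove the first m components
        scale = 1.0 / np.sqrt(np.diag(partial_cov))
        partial_corr = partial_cov * np.outer(scale, scale)
        avg_sq.append(np.mean(partial_corr[off_diag] ** 2))
    # The number of factors is where the MAP function reaches its minimum.
    return int(np.argmin(avg_sq)), np.array(avg_sq)

# Hypothetical data: 500 respondents, 6 items, two uncorrelated factors.
rng = np.random.default_rng(7)
factors = rng.standard_normal((500, 2))
pattern = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
                    [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
data = factors @ pattern.T + 0.6 * rng.standard_normal((500, 6))
R = np.corrcoef(data, rowvar=False)

n_factors, map_values = velicer_map(R)
print("MAP values:", np.round(map_values, 3))
print("Factors suggested:", n_factors)
```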
Factor Rotation
Once the decision on the number of factors to be retained is made, the researcher deploys the statistical software to rotate the initial factors in such a way as to maximize simple structure. Before discussing the nuances of the process and the different rotation strategies, further explanation of key concepts is needed. First, factor loadings indicate the relative importance of each variable to each factor. Squaring a factor loading gives the proportion of variance that the variable and the factor share. Consequently, a variable with a higher factor loading (e.g., .77) is relatively more important to the factor than a variable with a minimal loading (e.g., .20), because after squaring the loading, the amount of variance explained is more substantial (e.g., .77² = .59 or 59% vs. .20² = .04 or 4% of the variance). The variables with the strongest loadings on a particular factor are used to “mark” or “define” a factor. The threshold magnitude to mark a factor varies from study to study, but in most cases, it should be no lower than .30 or approximately 10% of the explained variance. Conservatively speaking, the minimum factor loading should be .35 or even .40 (Comrey & Lee, 1992).
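The arithmetic behind these thresholds is simple enough to check directly. The short sketch below squares a few loadings mentioned above; the helper name is ours.

```python
def variance_explained(loading):
    """Proportion of variance shared by a variable and a factor."""
    return loading ** 2

for loading in (0.77, 0.40, 0.35, 0.30, 0.20):
    share = variance_explained(loading)
    print(f"loading {loading:.2f} -> {share:.1%} of the variance")
```

A loading of .30 corresponds to 9% of the variance, which is why the text describes that threshold as approximately 10%.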
To reiterate, the initial solution indicates that all variables have a tendency to load strongly on the first factor, for it accounts for the maximum amount of the variance. In other words, since the first factor is generally more highly correlated with the variables than the second factor, it is to be expected that Factor 1 will account for the most overall variance. The second factor is orthogonal to the first, as is the third to the second, and so on. With each successive factor extraction (called an iteration), less and less variance overall will be explained.
The initial (prerotation) solution is referred to as the unrotated factor structure or a matrix of derived factor loadings. Variables in the matrix are represented as rows and factors as columns. For the most part, the unrotated factor matrix is indecipherable. Loadings on particular factors will vary widely and many of the variables will load on two or more factors (crossload). Table 4 illustrates this situation. To create a more interpretable factor structure, the factor loading matrix most often is rotated. If each row of the loadings matrix is a coordinate of a point in M-dimensional space, then each factor corresponds to a coordinate axis. Factor rotation is equivalent to rotating those axes and computing new loadings in the rotated coordinate system. In an attempt to maximize simple structure, rotation essentially changes the “viewing angle” of the factor space. After rotation, the vectors are rearranged to optimally go through clusters of shared variance. The resulting factor loadings and the factors so produced can be more readily interpreted. After rotating the initial factor matrix (see Table 4), the PAF-rotated factor matrix (see Table 5) shows 12 items clearly loading on Factor 1 and 8 items marking Factor 2. Notice in the rotated factor matrix that Items ESA18 and ESA19 on Factor 1 and Items ESA10, ESA5, and ESA7 on Factor 2 have weak factor loadings (<|.35|). These may be eliminated.
Prerotation Factor Matrix.
Note. Extraction method = principal axis factoring; five factors extracted with eigenvalues greater than 1.0.
Rotated PAF Factor Matrix (Varimax Rotation).
Note. PAF = principal axis factoring. Loadings ≤.20 were suppressed.
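Rotation as a change of coordinate axes can be illustrated with a short varimax sketch. It assumes NumPy and uses a common SVD-based iteration for the varimax criterion; it is not the routine used by IBM SPSS, and it omits Kaiser normalization. The loadings are hypothetical: a clean two-factor structure deliberately turned 30° so that every item crossloads before rotation.

```python
import numpy as np

def varimax(loadings, max_iter=500, tol=1e-10):
    """Orthogonal varimax rotation; returns (rotated loadings, rotation matrix)."""
    p, k = loadings.shape
    T = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        B = loadings @ T
        # Direction that increases the variance of squared loadings per factor.
        target = B ** 3 - B * ((B ** 2).sum(axis=0) / p)
        U, s, Vt = np.linalg.svd(loadings.T @ target)
        T = U @ Vt                      # nearest orthogonal matrix
        if s.sum() < previous * (1 + tol):
            break
        previous = s.sum()
    return loadings @ T, T

# Hypothetical simple structure, then turned 30 degrees to mimic an
# unrotated solution in which every item loads on both factors.
simple = np.array([[0.9, 0.0], [0.7, 0.0], [0.5, 0.0],
                   [0.0, 0.9], [0.0, 0.7], [0.0, 0.5]])
angle = np.deg2rad(30)
turn = np.array([[np.cos(angle), np.sin(angle)],
                 [-np.sin(angle), np.cos(angle)]])
unrotated = simple @ turn

rotated, T = varimax(unrotated)
print("Unrotated loadings:\n", np.round(unrotated, 2))
print("Rotated loadings:\n", np.round(rotated, 2))
print("Communalities before:", np.round((unrotated ** 2).sum(axis=1), 2))
print("Communalities after: ", np.round((rotated ** 2).sum(axis=1), 2))
```

Because the rotation matrix is orthogonal, each item's communality (the row sum of squared loadings) is the same before and after rotation; only the distribution of loadings across factors changes.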
Researchers can rotate the factor matrix orthogonally or obliquely. By definition, the orthogonal approach rotates the vectors at 90° angles to minimize the amount of factor covariation. Once rotated, the factor loadings from the rotated factor matrix should be interpreted. The magnitude of the factor loadings will change, but the variable communalities will remain the same after rotation. IBM SPSS has multiple options that generate relatively similar factor solutions, including varimax, quartimax, and equamax rotations. The orthogonal procedure may be appropriate when theory and research support uncorrelated factors, namely, when one is measuring distinct concepts. For example, in research attempting to validate the Cross-cultural Counseling Inventory, a three-factor orthogonal solution of cross-cultural counseling skill, sociopolitical awareness, and cultural sensitivity was reported (LaFromboise, Coleman, & Hernandez, 1991). Orthogonal rotation, however, should be rarely used in counseling investigations, because most constructs are intercorrelated (e.g., depression and optimism). Ignoring this reality yields a factor solution that is largely synthetic in nature.
Oblique rotation addresses this artificiality but increases the interpretation complexity of the factor solution. With this approach, the rotated vectors are intentionally put at angles less than 90° (oblique angles), allowing for some factor covariation (e.g., r ≤ .30). With IBM SPSS, oblique rotation options are direct oblimin and promax. The parameter delta (Δ) determines the extent of obliqueness of the factors. The default setting is 0.0; however, researchers can override by changing the Δ to a negative value. Doing so lessens the degree of factor intercorrelations, because the factors or components become less oblique. Positive Δ values increase the correlation between factors. Harman (1976) advised researchers to avoid using Δ values greater than .80, for they result in highly correlated factors that are virtually indistinguishable. Obtaining the best solution (i.e., simple structure while limiting the size of the factor intercorrelations) using the oblique approach requires extensive factor analytic skills.
Since the final factor solution (post-oblique rotation) includes the amount of overlap (correlation) between factors, the amount of variance explained by a rotated factor differs from its initial solution. Like orthogonal rotation, oblique rotation does not alter the communalities.
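One way to see how factor overlap enters an oblique solution is through the two loading matrices that software typically reports. The structure matrix (correlations between variables and factors) equals the pattern matrix multiplied by the factor correlation matrix. The sketch below assumes NumPy; all values are hypothetical.

```python
import numpy as np

# Hypothetical pattern matrix (4 items x 2 factors) from an oblique rotation.
pattern = np.array([
    [0.70, 0.05],
    [0.65, 0.10],
    [0.05, 0.75],
    [0.10, 0.60],
])

# Hypothetical factor correlation matrix: the two factors correlate .40.
phi = np.array([
    [1.00, 0.40],
    [0.40, 1.00],
])

# Structure matrix: correlations between each item and each factor.
structure = pattern @ phi

# Communalities in an oblique solution: the diagonal of P * Phi * P'.
communalities = np.sum(pattern * structure, axis=1)

print("Structure matrix:\n", np.round(structure, 3))
print("Communalities:", np.round(communalities, 3))
```

When the factors are uncorrelated, the factor correlation matrix is an identity matrix and the pattern and structure matrices coincide, as in an orthogonal solution.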
The choice of the rotation method is guided by the researcher’s goal (Rennie, 1997). If the primary objective is a factor structure that best fits the data, oblique rotation is a better choice. Sample measurements substantially impact oblique rotation factor solutions, making the results less likely to be replicated in other studies. Should the research goal center on factor generalizability, orthogonal rotation is preferred. In many research scenarios, the results of oblique and orthogonal rotation tend to be similar. This is particularly the case when the factors are weakly intercorrelated. Rennie suggested that orthogonal rotation be the default method unless the results of oblique rotation are noticeably different. Pedhazur and Schmelkin (1991) claimed that orthogonal rotation produces results that are parsimonious and more replicable. Hence, in their view, this advantage outweighs the criticism that the orthogonal results fail to represent real-world conditions.
Whichever rotation strategy the counseling researcher adopts, the key is to have defensible theoretical and research-based reasons for one’s decision making (Comrey & Lee, 1992). To reiterate, the choices should ensure that the derived rotated factor structure is coherent, achieving simple structure. Each variable must have at least a moderate loading (i.e., at least .30 to .40, depending on the investigation) on a single rotated factor. However, the judgment of whether a variable “belongs” with a factor must be based on the squared value of each particular loading, not simply the loading. The squared loading reflects the amount of variance that the variable contributes to the factor. Additionally, each factor must have at least three moderate-to-strong loadings, as more loadings tend to yield greater reliability (i.e., stronger factor internal consistency). Four to 10 items per factor are typically considered reasonable.
Naming the Factors
The final step is to label each factor or component in such a way as to represent as a whole the conceptual meaning of each variable defining a particular latent dimension. Admittedly, this is a somewhat subjective process, requiring a strong grasp of the theoretical and research literature from which the measure was constructed. To illustrate this process, suppose a new survey based on Lanes, Kuk, and Tamim’s (2011) research related to depression in postpartum mothers was created. Following an EFA, two reliable and valid factors were generated. Fifteen items strongly loaded on Factor 1. A counseling researcher understood these items to be principally addressing a mother’s “ability to cope with life events” after birth. The content of the 12 items comprising Factor 2 involves facets generally associated with “negative clinical implications for maternal-infant attachment.” Perhaps, the researcher would title the first dimension “coping ability” and the second, “negative attachment.”
A Word of Caution
Counselors and researchers must understand which conclusions can and cannot be drawn from factor analytic studies, particularly those using EFA. This procedure provides good information about relationships among variables and factors, but one cannot draw causal inferences from these relationships. CFA, however, allows the researcher to make such inferences. In the next section, CFA is overviewed.
CFA
Fundamentally, CFA helps researchers determine whether an instrument’s factor structure derived from an EFA holds up with another respondent sample. Whereas EFA is largely designed for locating clear patterns in the data set without a priori stipulations, CFA requires an upfront conceptual framework on which to base one’s hypotheses about how the variables will load on particular latent dimensions. More plainly, EFA is considered a theory-generating approach and CFA a theory-testing strategy. The latter approach uses a preexisting factor model, and statistically it is allied with structural equation modeling (SEM). CFA tests the correlational structure of a data set against a hypothesized structure and evaluates the “goodness of fit” (GOF). Unlike EFA, which tries to uncover the nature of factors that influence a set of variables, CFA tests whether a factor influences a set of variables in a theorized way. It produces inferential statistics (GOF indices) which are used to draw conclusions about the population from the sample.
Because CFA is theory-driven, the researcher places constraints on the model to be tested (e.g., limits the number of factors) and often sets the effect of the latent variables on the observed variables to a particular value. Because of its complexity, conducting CFA requires specialized software such as LISREL and AMOS (Albright & Park, 2009) and substantial statistical expertise in this area. The CFA steps, however, are relatively straightforward.
Defining the Model
Based on existing theory and research, the first phase is to articulate the conceptual model and the hypotheses to be confirmed. Since they cannot be observed or directly measured, the factors or dimensions are referred to as latent variables. A counseling researcher, for instance, may be interested in evaluating a recognized model of test anxiety with a new sample. Data are collected on observed and measurable variables which the researcher determines to be proxies for the latent variables. Pulse rate, for example, serves as a proxy for the participants’ level of physiological anxiety. Once the latent and observed variables are identified, a structural model is developed that predicts which variables will load on the hypothesized factors (see, e.g., Figure 9). Extending the scenario a bit further, the two ovals in Figure 9 could represent two latent and unobserved dimensions called physiological anxiety and psychological anxiety. The arrows point toward measurable and observed indicators of these dimensions (rectangles). The goal is to determine whether these indicators actually represent the hypothesized constructs. Stated differently, because the latent phenomena are theorized to cause the observed variables, the model shows arrows pointing from the unobserved to the observed variables. The double-headed arrow at the top of the model indicates that the two unobserved variables are correlated to some degree. This suggests that physiological and psychological anxiety dimensions are correlated. Once the CFA is computed, the magnitude of this relationship becomes known.

Sample hypothesized structure model used in confirmatory factor analysis.
Data Collection
Because the goal is inference, samples for CFA are generally larger than what would be considered adequate for EFA. The size of the sample speaks to the reliability of the estimates. Whatever sample size is used, the rule of thumb guiding the selection of N should be specified. Data are then collected on the observed variables and the data set must be complete to ensure factorability. Specifically, unless data are missing at random, researchers must address this issue and describe what method they have used to rectify the situation. Two ways to deal with missing data are to insert a hypothetical score or to exclude the individual from the sample (Huck, 2012). IBM SPSS provides two options. The first is to drop the cases from CFA. Doing this is problematic, for it may lead to a bias in estimates in cases where data are not missing at random. The second option is to eliminate the variable in instances where data are absent. This strategy obviously changes the covariance matrix to be factored. AMOS and LISREL have a built-in procedure (full information maximum likelihood [FIML]) to estimate the values of the missing data, allowing the researcher to use all participant data. FIML has its limitations as well.
Correlation Matrix
CFA uses a correlation or a covariance matrix developed from the sample data as a beginning point of the analysis. Covariances and correlations between variables allow for the inclusion of a relationship between two variables that is not necessarily causal. Typically, structural equation models contain both causal and noncausal relationships. The correlations are restricted to hypothesized relationships. This is a distinction from EFA, which assumes that all variables are correlated with all factors. The CFA model parameters are estimated based on these sample values. Oftentimes the residuals matrix is examined, for it represents the difference between the sample correlation matrix and the implied correlation matrix. Residuals analysis is particularly useful when the observed or measured variables are scaled differently.
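The residuals matrix described above can be sketched as follows. The example assumes NumPy, a standardized solution with uncorrelated unique variances (so the implied diagonal is 1.0), and entirely hypothetical loadings, factor correlation, and sample correlations; the function names are ours.

```python
import numpy as np

def implied_correlations(loadings, phi):
    """Model-implied correlation matrix for a standardized factor model."""
    implied = loadings @ phi @ loadings.T
    # Assumption: unique variances absorb the remainder of each item's
    # variance, so the implied diagonal equals 1.0.
    np.fill_diagonal(implied, 1.0)
    return implied

def residual_matrix(sample_R, loadings, phi):
    """Sample correlations minus model-implied correlations."""
    return sample_R - implied_correlations(loadings, phi)

# Hypothetical two-factor model: items 1-2 on Factor 1, items 3-4 on Factor 2.
loadings = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]])
phi = np.array([[1.0, 0.5], [0.5, 1.0]])   # factor correlation of .50

# Hypothetical sample correlation matrix.
sample_R = np.array([
    [1.00, 0.58, 0.20, 0.40],
    [0.58, 1.00, 0.25, 0.30],
    [0.20, 0.25, 1.00, 0.50],
    [0.40, 0.30, 0.50, 1.00],
])

residuals = residual_matrix(sample_R, loadings, phi)
print(np.round(residuals, 3))
```

Small residuals indicate that the hypothesized structure reproduces the observed correlations well; a large residual flags a pair of variables whose relationship the model does not capture.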
Fitting the Hypothesized Model to the Data
The SEM framework is typically used in testing models for data fit. This structure is valid under the assumption of multivariate normality. That is, each observed variable should largely follow a normal distribution; all combinations of observed variables must have normal joint distributions; and all bivariate scatterplots are approximately linear with distributions of equal variance. When this assumption of multivariate normality is even moderately violated, there is a potential distortion in the fit statistics as well as increased probability of committing a Type I error. Once the assumptions of CFA are met, the overall analysis can be computed.
As previously mentioned, CFA emphasizes model fit, which is not synonymous with model validity. Specific analyses are designed to evaluate one or the other. In EFA, the GOF test is conducted on individual factors using a holistic approach. GOF looks at a set of relationships between the observed and the latent variables. The model that is being tested is referred to as the default model. When all the variables are linked together and the model fits the data as well as possible, the model is said to be saturated. When no such connections exist and the model represents a poor fit to the data, it is said to be independent. Clearly then, the objective of CFA is to have the hypothesized model reflect a saturated model as closely as possible.
Evaluating the Model
Experts have developed numerous indices to assess fit by looking at both the degree of difference between the model and data and the degree to which the model and data match (see Table 6 for widely accepted sample indices). They can be classified as absolute, incremental, and parsimony measures. Whereas absolute fit indices evaluate how well the hypothesized model fits the data, incremental fit indices compare the model to a null model in which all the observed variables are uncorrelated. Moreover, incremental indices measure the improvement so that greater fit corresponds with greater improvement to the null model. Parsimony fit measures indicate if the specified model is parsimonious. They help determine whether the model can be improved by specifying fewer estimated parameter paths and thus creating a simpler model. When the indices are used in combination, the researcher’s conclusion about best fit tends to be more reliable. Most statistical software report similar indices, but none of them output every alternative.
Sample Well-Researched CFA Indices for Assessing Model Fit.
Note. CFA = confirmatory factor analysis.
Generally, CFA researchers first examine two χ² values (χ² and χ²/df), both conventional measures of overall model fit. These statistics describe the similarity of the observed sample covariance matrix and the expected covariance matrix. The null hypothesis is as follows: There is no difference between the two covariance matrices. Since a good CFA result occurs when the matrices are quite similar, one anticipates a nonsignificant χ² (p > .05), allowing the researcher to retain the null hypothesis. For a variety of reasons (e.g., large sample sizes generate significant χ² values), this GOF index is most often inaccurate; thus, other indices must be considered.
Next, the Root Mean Squared Error of Approximation (RMSEA) index should be examined. The RMSEA indicates the amount of unexplained variance or the residual. It gives a suggestion of how well the parameter estimates fit the population covariance matrix. The value of RMSEA is influenced by sample size, the degrees of freedom, and the χ2 value. RMSEA values nearer to 0.0 indicate a quality fit. A general guideline for an adequate and good fit is <.10 and <.05, respectively. In short, RMSEA denotes the level to which lack of fit is due to misspecification of the tested model as opposed to being a result of sampling error.
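The dependence of RMSEA on χ², the degrees of freedom, and sample size can be made concrete with a small sketch. It uses one common formulation, sqrt(max(χ² − df, 0) / (df × (N − 1))); some programs divide by N rather than N − 1, so treat the denominator as an assumption. The input values are hypothetical.

```python
import math

def normed_chi_square(chi2, df):
    """The chi-square/df ratio reported alongside the model chi-square."""
    return chi2 / df

def rmsea(chi2, df, n):
    """RMSEA under the N - 1 convention (an assumption; software varies)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Hypothetical CFA result: chi-square = 180 with 120 df from 301 respondents.
chi2_model, df_model, n_respondents = 180.0, 120, 301

print(f"chi-square/df = {normed_chi_square(chi2_model, df_model):.2f}")
print(f"RMSEA = {rmsea(chi2_model, df_model, n_respondents):.3f}")
```

When χ² is no larger than its degrees of freedom, the numerator is set to zero and RMSEA equals 0.0, the value associated with the best possible fit.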
Researchers then check the Standardized Root Mean Square Residual (SRMR) index, as well as the Nonnormed Fit (NNFI) and Comparative Fit (CFI) indices. The SRMR is an absolute measure of fit that represents the standardized difference between the observed covariance (correlation) and that implied (predicted covariance) by the model. As the sample size increases and the number of parameters increases, the SRMR becomes smaller, and thus, the hypothesized model better fits the data. Even though researchers strive for a “perfect” fit (SRMR = 0.0), a value less than .08 is generally considered a good fit. The NNFI is comparable to R², the coefficient of determination. A zero value indicates the worst possible model fit and a value close to 1.0 indicates a good fit. The NNFI is a fit index that is less affected by sample size. Finally, the CFI measures the extent to which the hypothesized model differs from the null model. In the latter model, the covariances among the observed variables are zero. CFI values nearing 1.0 (≥.95) are considered sufficient evidence for a good model fit.
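The two incremental indices can be computed from the χ² values of the hypothesized model and the null (baseline) model. The sketch below uses the commonly cited formulas for the CFI and the NNFI (also known as the Tucker-Lewis index); the input values are hypothetical, and as a nonnormed index the NNFI can occasionally fall slightly outside the 0 to 1 range.

```python
def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative Fit Index from model and null-model chi-squares."""
    model_misfit = max(chi2_model - df_model, 0.0)
    worst_misfit = max(chi2_model - df_model, chi2_null - df_null, 0.0)
    if worst_misfit == 0.0:
        return 1.0
    return 1.0 - model_misfit / worst_misfit

def nnfi(chi2_model, df_model, chi2_null, df_null):
    """Nonnormed Fit Index (Tucker-Lewis index)."""
    null_ratio = chi2_null / df_null
    model_ratio = chi2_model / df_model
    return (null_ratio - model_ratio) / (null_ratio - 1.0)

# Hypothetical values: 17 observed variables give a null model with
# 17 * 16 / 2 = 136 degrees of freedom.
chi2_m, df_m = 180.0, 120
chi2_0, df_0 = 1500.0, 136

print(f"CFI  = {cfi(chi2_m, df_m, chi2_0, df_0):.3f}")
print(f"NNFI = {nnfi(chi2_m, df_m, chi2_0, df_0):.3f}")
```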
The model fit may be inadequate because an observed variable loads heavily on multiple factors. A solution to this may be to eliminate the problematic variables. One consideration when eliminating variables is the size of the main factor loading. The suggested minimum loading is .35 to .40. Furthermore, the variables should have cross-loading of no more than .30. Attention must also be given to the contribution the variable makes to the factor’s interpretability as well as its face validity. Further guidance as to whether a particular variable should be retained or excluded is provided by the number of variables already comprising the factor. Should the researcher decide that the best way to resolve an inadequately fitting model is to eliminate variables, this process must be done one variable at a time with the CFA repeated before the next exclusion. If two variables have comparable factor loadings and reliabilities, the variable with the least skew and kurtosis should be preserved.
Comparing Models
It could also be that the original hypothesized model itself is inadequate. In such a circumstance, a solution would be to return to the data and conduct an EFA to generate a model that can then be later tested. It is advisable to compare models with different latent variables. The researcher can examine the fit statistics for each model as well as conduct a χ2 test to compare the models. Separate model fit χ2 statistics are computed. The difference between the two is then evaluated for statistical significance. The researcher should also provide evidence showing that a model that has been modified and reanalyzed is statistically superior to the original model with a χ2 test. A model that has been modified, a trimmed model, is referred to as a nested or hierarchical model. There must be a theoretical basis for the modification to the model. Changes must not be made simply because an analysis called for an addition or reduction in parameters. By paying attention to the theoretical basis of modification decisions, the likelihood of making a Type I error is reduced (Schreiber, Stage, King, Nora, & Barlow, 2006).
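The χ² comparison of two nested models can be sketched with the standard library alone. The difference between the two model χ² values is itself treated as a χ² statistic, with degrees of freedom equal to the difference in model degrees of freedom. The helper `chi2_sf` implements the closed-form series for the upper-tail probability with integer degrees of freedom; the model values are hypothetical, and the test is appropriate only when one model is nested within the other.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square value for positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail term plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

def chi_square_difference(chi2_restricted, df_restricted, chi2_full, df_full):
    """Return (delta chi-square, delta df, p value) for two nested models."""
    delta_chi2 = chi2_restricted - chi2_full
    delta_df = df_restricted - df_full
    if delta_df <= 0:
        raise ValueError("the restricted model must have more degrees of freedom")
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)

# Hypothetical models: a trimmed (restricted) model vs. the original model.
delta_chi2, delta_df, p_value = chi_square_difference(195.0, 122, 180.0, 120)
print(f"delta chi-square = {delta_chi2:.1f}, delta df = {delta_df}, p = {p_value:.4f}")
```

In this made-up case the restricted model fits significantly worse (p < .05), so the two extra parameters in the original model would be judged worth keeping, provided theory supports them.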
Factor Analysis in Practice
To practically summarize the key concepts related to factor analysis, we revisit the “measuring life satisfaction” scenario mentioned in the EFA section above. The overall research goal was to develop a reliable and valid life satisfaction measure that could be used with young adolescents. Based on previous research, the counseling researchers initially assumed that any derived factors or components would be distinct and uncorrelated. Because of its ease of use and the need for data reduction, the PCA extraction method was selected. After the first factor was extracted, a second one emerged maximizing the leftover variance from PCA’s first iteration. This extraction process continued until all the variance in the intercorrelation matrix was accounted for by the derived components. After reviewing the factor extraction table (see Table 7), the research team noticed that nine components had EVs over 1.0, explaining 60.3% of the total variance in the intercorrelation matrix. However, the total amount of variance explained declined to 5.5% for Component 4. Subsequent components explained even less variance, suggesting that components from 4 onward are perhaps trivial in terms of variance explained. Although the scree plot indicated that three to four components could be rotated, PA suggested only three components be rotated. Because the components of the data set were considered orthogonal and uncorrelated with each other, the varimax (orthogonal) rotation procedure was selected. After reviewing the three-component rotated factor matrix, the researchers decided that simple structure was achieved with 30 items that were defined by three reasonably strong factors. Each dimension comprised 10 items loading at least .40 on their particular factor. The factors were then named following the process discussed previously.
Hypothetical PCA Extraction Table.
Note. PCA = principal component analysis. Extraction method: principal component analysis.
Suppose the life satisfaction survey responses were thought to reflect one or more underlying common factors, where each item measures some unique aspect of a latent factor or factors. The derived factors were not expected to extract all the variance since there was a unique element associated with each item. Only the proportion of variance shared by several items is extracted. In this situation, PAF was the preferred extraction method, particularly when the goal was to classify items and detect a latent factor structure. Furthermore, the researchers assumed the factors were correlated, because dimensions of life satisfaction were considered less than discrete entities. In other words, young people’s scores on various survey items influenced how they answered other survey questions. After extracting factors with PAF, the criteria for how many factors to retain and rotate were evaluated. Direct oblimin, an oblique method of rotation, was computed. Again, the .40 threshold for marking a particular factor by an item was utilized. The pattern and structure factor matrices were examined for the best factor solution. Not surprisingly, with a very substantial N, the rotated PAF solution represented in the structure matrix (correlations between the variables and the factors) largely mirrored the simple structure found with PCA, where three latent dimensions were each defined by 10 items with moderate to strong loadings. Based on item content, factor names were assigned as well.
At this point, the factorial validity of the 30-item YALSI has been established with one group of youth. The next logical research phase was to cross-validate the measure with a different respondent sample of equal or greater size. The counseling researchers sampled a diverse set of 500+ young adolescents from a neighboring state representing five different schools. Since the CFA assumptions were adequately met, this procedure, using the ML method, was computed on the prefigured factor model (i.e., the three-factor, 30-item structure). In comparing several GOF indices, each suggested that YALSI’s underlying dimensionality previously found with EFA could be confirmed. The reliabilities of the derived factors were double checked, producing adequate Cronbach α coefficients for each dimension, ranging from .74 to .85. The final result of the entire process was a usable measure of life satisfaction with three dimensions. Counselors were now able to cautiously administer the YALSI to their young adolescent clients.
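The reliability check mentioned above can be sketched with the standard formula for Cronbach’s α, which compares the sum of the item variances with the variance of the total scores. The responses below are hypothetical and far too few for a real analysis; they are not YALSI data.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores is a list of respondents' item lists."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))            # one tuple per item
    sum_item_variances = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)

# Hypothetical responses: 5 respondents x 3 items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

In practice, α is computed separately for the items marking each factor, which is how a range of coefficients across dimensions is obtained.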
Conclusion
Factor analysis is a routine statistical tool with a variety of purposes outlined in this article. It is obvious that researchers in counseling and allied professions should possess a thorough grasp of EFA and CFA, particularly when they are developing a new measure for use with clients or evaluating the psychometric properties of an established instrument. Similarly, practicing counselors are well advised to learn how quantitative instruments are constructed, knowing that factor analysis is foundational to establishing their reliability and validity. Regrettably, counselors may assume that because the measure has face validity and is publicly available with a test manual, it must be valid and reliable with multiple respondent groups. In many cases, counseling-related tests and surveys have not gone through extensive factor analytic work, particularly CFA. With increased technical familiarity comes enhanced test interpretation. Counselors become more cognizant of the underlying strengths of the measures they administer as well as their inherent weaknesses. By examining the factor analytic work conducted on instruments, one understands which potential client sample(s) the measure was tested on and how the factor structures look with each respondent group.
Finally, we recommend first that counselors review the test manual looking for clear signs of validity and reliability. Read about the test construction process, including the respondent sample(s) and what statistical procedures were conducted. If the psychometric information is limited in scope and depth, perhaps a better measure is available. Second, counselors should peruse reviews on the measure available in scholarly publications (e.g., Mental Measurements Yearbook, Buros Institute, University of Nebraska). Sometimes the measure is so new that it has not been properly inspected by the research community. In this case, it is probably wise to hold off on using it for screening or diagnostic purposes.
Helping professionals who closely follow the ethical guidelines of their profession (e.g., American Counseling Association) understand that assessment literacy is required. One’s assessment knowledge will come under close examination when there is a potential lawsuit by a client who believes the process was mismanaged and even faulty. In this situation, knowing about statistical aspects of test development will assist in defending one’s testing choices. Perhaps it is worth remembering that the adage “caveat emptor” applies to test usage.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
