Abstract
The Woodcock–Johnson-III Cognitive (WJ-III Cognitive) in the adult time period (ages 20 to 90 plus) was analyzed using exploratory bifactor analysis via the Schmid–Leiman orthogonalization procedure. The results of this study suggested possible overfactoring, a factor structure different from that posited in the Technical Manual, and a lack of invariance across the two age ranges under study. Even when forcing the seven-factor fit, the structure was problematic. The results from the 20 to 39 age group displayed patterns of convergence with and divergence from the Technical Manual’s structure. The results from the 40 and above age group were generally consistent with the Technical Manual’s structure except for retrieval fluency. This study is consistent with the body of exploratory factor analysis structural validity evidence suggesting that contemporary tests of cognitive ability, particularly those based on Cattell–Horn–Carroll theory, are overfactored and lack alignment with the structure presented in their respective Technical Manuals.
The Woodcock–Johnson-III (WJ-III) is a widely used instrument that is often cited for its utility in clinical diagnosis and educational classification. Containing an element of circularity, the WJ-III served as the initial evidentiary basis for Cattell–Horn–Carroll (CHC) theory, while CHC theory influenced the development of the WJ-III. The test authors have promoted CHC taxonomy as “ . . . the most comprehensive and empirically supported framework available for understanding the structure of human cognitive abilities” (p. 9; McGrew & Woodcock, 2001). With such a bold claim and because the WJ-III is potentially used in decisions about thousands of individuals’ lives, the stakes are high for the instrument’s structure.
The WJ-III structure has primarily been investigated using confirmatory factor analysis (CFA) methodology. These investigations have largely supported its structure (e.g., Keith & Reynolds, 2010; Taub & McGrew, 2004). There are two notable recent exceptions. Dombrowski (2013) and Dombrowski and Watkins (2013) subjected the two school-age correlation matrices (ages 9 to 13 and 14 to 19) in the WJ-III full battery and the WJ-III Cognitive to exploratory bifactor analysis via a Schmid–Leiman (SL) orthogonalization procedure. In both studies, the researchers demonstrated not only possible overfactoring but also a lack of structural invariance across the two age ranges. The lack of factor invariance is not restricted to the WJ-III (e.g., Stanford Binet–Fifth edition [SB5] via Canivez, 2008; Cognitive Assessment System [CAS] via Canivez, 2011; SB5 via DiStefano & Dombrowski, 2006; Reynolds Intellectual Assessment Scales [RIAS] via Dombrowski, Watkins, & Brogan, 2009), but it poses a serious problem. Tests of cognitive ability require factorial invariance across developmental periods so that results from one age period may be compared with results from another (e.g., age 9 to age 16; Horn, McArdle, & Mason, 1983; Labouvie & Ruetsch, 1995; Reise, Widaman, & Pugh, 1993). If the structure is found to be inconsistent across groups, then the capacity to compare scores across developmental periods is considerably degraded (Reise et al., 1993), which limits the instrument’s clinical use.
The purpose of this study is to conduct an exploratory and hierarchical factor analysis of the WJ-III Cognitive on the two normative sample correlation matrices (ages 20 to 39 and 40 plus) that span the adult time period. The WJ-III Cognitive in the adult time period has never been subjected to exploratory and higher order factor analysis. Instead, an understanding of the structure of the WJ-III Cognitive in adulthood rests primarily on CFA.
However, several structural validity studies of other instruments are available in the adult literature. The Wechsler Adult Intelligence Scale (WAIS) has been analyzed using the SL procedure as well as bifactor CFA. These studies found that the instruments are dominated by the g factor (e.g., the French WAIS-III [Golay & Lecerf, 2011] and the WAIS-IV [Canivez & Watkins, 2010a, 2010b; Niileksela, Reynolds, & Kaufman, 2012]) and that primary interpretive emphasis should reside at that level.
There are several choices for exploratory factor analysis, but one with a well-established research base is the SL orthogonalization procedure (Schmid & Leiman, 1957). The SL procedure is a type of bifactor model that has been widely used by psychometric researchers in cognitive ability (Canivez, 2013). In fact, John Carroll used this procedure, and literally insisted on its use, when he created his Three Stratum Theory of Cognitive Abilities (Carroll, 1993). It is perplexing that the SL was overlooked by the WJ-III test authors when Carroll’s theory was considerably influential in the development of the WJ-III. Because the WJ-III Cognitive in adulthood has never been subjected to exploratory factor analysis, both within its Technical Manual and in the outside literature, this study seeks to fill this critical empirical gap.
Method
Participants
The WJ-III authors collected and reported information relative to seven age groups: 2 to 3 years, 4 to 5 years, 6 to 8 years, 9 to 13 years, 14 to 19 years, 20 to 39 years, and 40 years and older. The data for the WJ-III Cognitive norms were collected from a nationally representative sample of 8,818 participants from age 2 through 90 plus. The Technical Manual reports that the normative data were matched to the 2000 U.S. Census for geographic region, community size, sex, race, educational level, and occupation. Demographic characteristics are provided in the WJ-III Technical Manual. For this study, the two adult (20 to 39 years and 40 plus years) subtest correlation matrices (20 by 20) were obtained from the Technical Manual. The 20 to 39 age range contained an average of 1,100 participants, while the 40 plus age range contained an average of 807 participants.
Instrument
The WJ-III Tests of Cognitive Abilities (WJ-III Cognitive; Woodcock, McGrew, & Mather, 2001) contains 20 cognitive tests that are purported to measure g and seven CHC factors: visual-spatial thinking (Gv), fluid reasoning (Gf), processing speed (Gs), long-term retrieval (Glr), auditory processing (Ga), short-term memory (Gsm), and comprehension-knowledge (Gc). The WJ-III also yields a general intellectual-ability score reflective of g.
Procedure
The correlation matrices within this study were analyzed using Bartlett’s Test of Sphericity (Bartlett, 1954) and the Kaiser–Meyer–Olkin (KMO; Kaiser, 1974) statistic. The intercorrelation matrices were subjected to principal axis factoring (Cudeck, 2000; Fabrigar, Wegener, MacCallum, & Strahan, 1999; Tabachnick & Fidell, 2007) with promax rotation (k = 4). The minimum average partial (MAP; Velicer, 1976) test and parallel analysis (Horn, 1965) were used to determine the number of factors to extract. Scree plots (Cattell, 1966) were also inspected (see Figures 1 and 2). In addition, the seven-factor solution presented in the Technical Manual was forced. Finally, a bifactor analysis using the SL (Schmid & Leiman, 1957) procedure was applied to the oblique first-order factors to elucidate the structure of the WJ-III across the two adult correlation matrices.

Figure 1. Scree plots for Horn’s parallel analysis (HPA) for the WJ-III Cognitive, ages 20 to 39.

Figure 2. Scree plots for Horn’s parallel analysis (HPA) for the WJ-III Cognitive, ages 40 plus.
Results
Exploratory (First-Order) Analyses
Results from Bartlett’s Test of Sphericity (Bartlett, 1950) for both age ranges indicated that the correlation matrices were not random (20 to 39 age range χ2 = 12,863.01, df = 190, p < .001; 40 plus age range χ2 = 11,361.11, df = 190, p < .001). For the 20 to 39 and 40 plus age ranges, the KMO (Kaiser, 1974) statistic was .905 and .934, respectively, well above the minimum standard for conducting a factor analysis suggested by Kline (1994). Measures of sampling adequacy for each variable were also within reasonable limits. Thus, the correlation matrices were appropriate for factor analysis.
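The two adequacy checks reported above can be computed directly from a correlation matrix. The following is a minimal sketch, not the software used in this study: the function names are illustrative, and the equicorrelated 20 by 20 matrix is a hypothetical stand-in because the WJ-III matrices are not reproduced here. Only the matrix order (20 subtests) and the average sample size (1,100) are taken from the text.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's chi-square test that R is an identity matrix.

    R is a p x p correlation matrix and n is the sample size.
    """
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)        # partial (anti-image) correlations
    off = ~np.eye(R.shape[0], dtype=bool)  # mask selecting off-diagonal cells
    r2 = np.sum(R[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + q2)

# Hypothetical matrix: 20 subtests that all correlate .40 with one another.
p, r = 20, 0.40
R_toy = np.full((p, p), r)
np.fill_diagonal(R_toy, 1.0)

chi2, df = bartlett_sphericity(R_toy, n=1100)
kmo_value = kmo(R_toy)
print(f"chi2 = {chi2:.2f}, df = {df}, KMO = {kmo_value:.3f}")
```

With 20 subtests the degrees of freedom are 20(19)/2 = 190, which matches the df reported for both age ranges.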
Factor-extraction criteria
Parallel analysis (Horn, 1965) suggested that three factors be retained for the 20 to 39 age range, while two factors were indicated for the 40 plus age range. The MAP (Velicer, 1976) criterion recommended retention of two factors for the 20 to 39 age range and one factor for the 40 plus age range (see Figures 1 and 2). A visual scree test indicated evidence for one strong factor at both age ranges with the possibility of two additional factors at the 20 to 39 age range and one additional factor at the 40 plus age range. Because MAP recommended the retention of two factors and parallel analysis recommended the retention of three factors, both two- and three-factor solutions were extracted and analyzed at the 20 to 39 age range. Across both age ranges, seven factors were also extracted in accord with the theoretical structure indicated in the Technical Manual. At age 20 to 39, Factors 5 through 7 were essentially trivial factors with nominal root sizes. At age 40 plus, Factors 4 through 7 were trivial factors. Psychometrically, the extraction of two factors at age 20 to 39 and 40 plus was plausible, but interpretability and linkage to theory were problematic when attempting to interpret beyond the general factor. Extraction of three factors at age 20 to 39 was also plausible but offered only a two-subtest long-term-retrieval (Glr) factor and a four-subtest processing-speed (Gs) factor apart from a first factor that represented an agglomeration of the remaining subtests.
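The logic of parallel analysis can be sketched as follows. This is an illustration under stated assumptions rather than the procedure as run in this study: it compares eigenvalues of the unreduced correlation matrix with the 95th percentile of eigenvalues from random normal data, whereas the original analysis may have used a different variant (e.g., mean eigenvalues or reduced matrices). The function name, the simulation settings, and the six-variable example matrix are hypothetical.

```python
import numpy as np

def parallel_analysis(R, n, n_sims=200, percentile=95, seed=0):
    """Number of factors whose eigenvalues exceed those of random data."""
    rng = np.random.default_rng(seed)
    p = R.shape[0]
    observed = np.sort(np.linalg.eigvalsh(R))[::-1]
    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        X = rng.standard_normal((n, p))       # uncorrelated data, same n and p
        C = np.corrcoef(X, rowvar=False)
        random_eigs[i] = np.sort(np.linalg.eigvalsh(C))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)
    exceeds = observed > threshold
    # Retain factors up to (not including) the first eigenvalue that fails.
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, threshold

# Hypothetical two-factor matrix: two blocks of three variables, loadings .70.
L = np.zeros((6, 2))
L[:3, 0] = 0.7
L[3:, 1] = 0.7
R_toy = L @ L.T
np.fill_diagonal(R_toy, 1.0)

n_retain, observed, threshold = parallel_analysis(R_toy, n=500)
print(n_retain)
```

In practice the retained count from such a procedure would be weighed against MAP and the scree plot, as was done here.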
Principal axis factoring with promax rotation
The adult-aged correlation matrices were separately subjected to principal axis factoring (PAF) with an oblique (promax) rotation. The age 20 to 39 and 40 plus analyses suggested that the first factor accounted for 41.35% and 50.01% of the variance, respectively. This dwarfed the variance accounted for by the second factor at the 20 to 39 and 40 plus age range (7.79% and 6.49%, respectively). Correlations between the extracted factors were as follows: The 20 to 39 age range yielded correlations between the seven factors ranging from .25 to .69 (Mdn = .41). Similarly, factor analysis of the 40 plus age range yielded correlations between the seven factors ranging from .24 to .69 (Mdn = .60). High correlation among factors suggests the possible presence of a higher order factor which needs to be extracted and examined (Gorsuch, 1983; Thompson, 2004).
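For readers unfamiliar with PAF, the extraction step can be sketched as an iteration on the communalities. This is a minimal illustration, not the software used in this study: it omits the promax rotation, the function name and convergence settings are assumptions, and the six-variable one-factor matrix is hypothetical.

```python
import numpy as np

def principal_axis(R, n_factors, max_iter=1000, tol=1e-8):
    """Unrotated principal axis factoring of a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    # Starting communalities: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)          # communalities on the diagonal
        vals, vecs = np.linalg.eigh(reduced)
        top = np.argsort(vals)[::-1][:n_factors]
        loadings = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
        new_h2 = np.sum(loadings ** 2, axis=1)
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return loadings, h2

# Hypothetical one-factor matrix built from known loadings.
true_loadings = np.array([0.80, 0.75, 0.70, 0.65, 0.60, 0.55])
R_toy = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R_toy, 1.0)

loadings, h2 = principal_axis(R_toy, n_factors=1)
print(np.round(np.abs(loadings[:, 0]), 3))
```

The sign of an extracted factor is arbitrary, so the example compares absolute loadings. An oblique rotation such as promax would then be applied to a multifactor solution to obtain the factor correlations discussed above.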
Hierarchical Factor Analysis (SL Orthogonalization)
Forced seven-factor solution
Results from the SL (Schmid & Leiman, 1957) procedure on the seven-factor solution across both age ranges are presented in Tables 1 and 2. In the age 20 to 39 SL analysis, the higher order factor accounted for 35.06% of the total variance and 57.95% of the common variance. In the age 40 plus SL analysis, the higher order factor accounted for 43.92% of the total variance and 67.64% of the common variance. The general factor also accounted for between 11% and 65% (Mdn = 34.5%) of individual subtest variance in the 20 to 39 analysis. The g factor accounted for between 21% and 66% (Mdn = 42%) of individual subtest variability in the 40 plus analysis. For the 20 to 39 analysis, the seven first-order factors each accounted for a small proportion of the total variance (2.37% to 6.36%) and common variance (3.92% to 10.51%). The first- and second-order factors combined to measure 60.5% of the variance in the WJ-III Cognitive, reflecting 39.5% unique variance. For the 40 plus analysis, the seven first-order factors accounted for 2.06% to 7.49% of the total variance and 1.34% to 7.49% of the common variance. The first- and second-order factors of the 40 plus analysis combined to measure 64.9% of the variance in the WJ-III Cognitive, reflecting 35.1% unique variance. The results of both analyses demonstrate a robust manifestation of general intelligence where the combined influence of general intelligence and uniqueness exceeded the contributions made by the first-order factors. The reliability of the WJ-III Cognitive was also estimated with ωh, which ranged from .158 to .462 across all analyses. Low ωh coefficients suggest that interpretation of the factor indices beyond the general factor is inappropriate as little variance exists beyond the general factor (Reise, 2012).
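The SL transformation itself is a short computation once the first-order pattern loadings and the second-order loadings are available. The sketch below is illustrative only: the function names are assumptions, and the six-subtest, two-factor loadings are hypothetical values, not WJ-III estimates. Subtest loadings on g are the first-order loadings multiplied by the second-order loadings, and each first-order factor is residualized by the square root of its variance not explained by g.

```python
import numpy as np

def schmid_leiman(first_order, second_order):
    """Orthogonalize a higher order solution.

    first_order  : (p, k) pattern loadings of p subtests on k oblique factors
    second_order : (k,) loadings of the k first-order factors on g
    """
    F = np.asarray(first_order, dtype=float)
    b = np.asarray(second_order, dtype=float)
    g_loadings = F @ b                           # subtest loadings on g
    group_loadings = F * np.sqrt(1.0 - b ** 2)   # residualized group loadings
    return g_loadings, group_loadings

def variance_shares(g_loadings, group_loadings):
    """Proportions of total and common variance for g and each group factor."""
    p = len(g_loadings)                          # total variance of p standardized subtests
    g_var = np.sum(g_loadings ** 2)
    group_var = np.sum(group_loadings ** 2, axis=0)
    common = g_var + group_var.sum()
    return {
        "g_total": g_var / p,
        "g_common": g_var / common,
        "group_total": group_var / p,
        "group_common": group_var / common,
    }

# Hypothetical first-order loadings (simple structure) and second-order loadings.
F_toy = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                  [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
b_toy = np.array([0.8, 0.6])

g_load, grp_load = schmid_leiman(F_toy, b_toy)
shares = variance_shares(g_load, grp_load)
print(np.round(g_load, 2), round(shares["g_common"], 3))
```

The transformation redistributes rather than creates variance, so each subtest's communality is the same before and after orthogonalization. That is why the proportions of total and common variance reported above can be read as a partition of the same explained variance.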
Table 1. WJ-III Sources of Variance According to a Schmid–Leiman Orthogonalization (Seven Factor) Ages 20 to 39.
Note: Loadings ≥.30 are
Table 2. WJ-III Sources of Variance According to a Schmid–Leiman Orthogonalization (Seven Factor) Ages 40 Plus.
Note: Loadings ≥.30 are
Hierarchical Factor Analysis Using Psychometrically Sound Factor-Extraction Rules
Age 20 to 39
In the age 20 to 39 SL analysis (see Table 3), the higher order factor accounted for 32.4% of the total variance and 65.0% of the common variance. The g factor accounted for between 11% and 54% (Mdn = 33%) of individual subtest variability. The three lower order factors accounted for 4.8% to 6.8% of the total variance and 9.7% to 13.7% of the common variance. The first- and second-order factors combined to measure 49.8% of the variance in the WJ-III Cognitive, reflecting 50.2% unique variance.
Table 3. WJ-III Sources of Variance According to a Schmid–Leiman Orthogonalization (Three Factor) Ages 20 to 39.
Note. Loadings ≥.30 are
Age 40 plus
In the age 40 plus SL analysis (see Table 4), the higher order factor accounted for 31.2% of the total variance and 61.9% of the common variance. The g factor accounted for between 15% and 45% (Mdn = 31.5%) of individual subtest variability. For the age 40 plus analysis, the first-order factors accounted for 9.4% to 9.8% of the total variance and 18.6% to 19.5% of the common variance. The first- and second-order factors combined to measure 50.3% of the variance in the WJ-III Cognitive, reflecting 49.7% unique variance.
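The reported percentages are linked by simple arithmetic: if g explains a given share of total variance and a given share of common variance, then common variance as a share of total variance is the first divided by the second. The brief check below uses only figures reported in this article. The function name and the 0.2-point rounding tolerance are assumptions made for illustration.

```python
def implied_common_pct(g_total_pct, g_common_pct):
    """Common variance as a percentage of total variance, implied by g's two shares."""
    return 100.0 * g_total_pct / g_common_pct

# (g % of total, g % of common, reported % explained by all factors combined)
reported = {
    "seven-factor, ages 20 to 39": (35.06, 57.95, 60.5),
    "seven-factor, ages 40 plus": (43.92, 67.64, 64.9),
    "three-factor, ages 20 to 39": (32.4, 65.0, 49.8),
    "two-factor, ages 40 plus": (31.2, 61.9, 50.3),
}

implied = {}
for label, (g_total, g_common, combined) in reported.items():
    implied[label] = implied_common_pct(g_total, g_common)
    print(f"{label}: implied {implied[label]:.1f}%, reported {combined}%")
```

Each implied value agrees with the reported combined percentage to within rounding, which indicates that the g-factor figures are internally consistent across the four analyses.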
Table 4. WJ-III Sources of Variance According to a Schmid–Leiman Orthogonalization (Two Factor) Ages 40 Plus.
Note. Loadings ≥ .30 are
The reliability of WJ-III Cognitive for these last two analyses was also estimated with ωh, which ranged from .22 to .44 across both analyses. Again, low ωh coefficients suggest that interpretation of the factor indices beyond the general factor is inappropriate as little variance exists beyond the general factor (Reise, 2012).
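Omega hierarchical can be computed from the orthogonalized loadings. The sketch below is a minimal illustration of the standard formulas and is not the software used in this study: the function name is an assumption, and the loadings are hypothetical values for six subtests and two group factors. For a set of subtests, ωh is the squared sum of g loadings divided by the modeled variance of the unit-weighted composite, and the subscale coefficient for a group factor uses the squared sum of that factor's residualized loadings in the numerator.

```python
import numpy as np

def omega_coefficients(g, groups, items=None):
    """Omega, omega hierarchical, and group-factor omega for a unit-weighted composite.

    g      : (p,) orthogonalized loadings on the general factor
    groups : (p, k) orthogonalized loadings on the k group factors
    items  : indices of the subtests in the composite (default: all subtests)
    """
    g = np.asarray(g, dtype=float)
    groups = np.asarray(groups, dtype=float)
    if items is None:
        items = np.arange(len(g))
    g_s, grp_s = g[items], groups[items]
    uniqueness = 1.0 - (g_s ** 2 + np.sum(grp_s ** 2, axis=1))
    g_part = g_s.sum() ** 2                   # variance due to g
    group_parts = grp_s.sum(axis=0) ** 2      # variance due to each group factor
    total = g_part + group_parts.sum() + uniqueness.sum()
    return {
        "omega": (g_part + group_parts.sum()) / total,
        "omega_h": g_part / total,
        "omega_hs": group_parts / total,
    }

# Hypothetical orthogonalized loadings.
g_toy = np.array([0.64, 0.56, 0.48, 0.48, 0.42, 0.36])
groups_toy = np.array([[0.48, 0.0], [0.42, 0.0], [0.36, 0.0],
                       [0.0, 0.64], [0.0, 0.56], [0.0, 0.48]])

battery = omega_coefficients(g_toy, groups_toy)
first_scale = omega_coefficients(g_toy, groups_toy, items=[0, 1, 2])
print(round(battery["omega_h"], 3), round(first_scale["omega_hs"][0], 3))
```

In this toy case the subscale coefficient is much lower than the battery-level ωh, the same pattern that motivates the caution about interpreting factor indices beyond the general factor.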
Discussion
The WJ-III Cognitive test authors overlooked exploratory factor analysis (EFA) structural analyses in favor of singular reliance on CFA. This approach to cognitive ability scale development over the past decade has led to criticism of internal structure (e.g., Canivez, 2008; DiStefano & Dombrowski, 2006; Dombrowski, 2013; Dombrowski & Watkins, 2013), concern about confirmation bias (Greenwald, Pratkanis, Leippe, & Baumgardner, 1986), possible overfactoring (Canivez, 2013; Frazier & Youngstrom, 2007), and a lack of measurement invariance (e.g., DiStefano & Dombrowski, 2006; Dombrowski, 2013). Independent research on the WJ-III has also been based primarily on CFA methodology. This body of research has largely been supportive of the factor structure of the instrument and its relationship with CHC theory (e.g., Floyd, McGrew, Barry, Rafael, & Rogers, 2009; Keith & Reynolds, 2010; Taub & McGrew, 2004). Two recent exceptions are the studies using exploratory bifactor analysis (Dombrowski, 2013; Dombrowski & Watkins, 2013).
Although the WJ-III Cognitive test authors invoked Carroll’s theory as a guide to the development of the instrument, they overlooked the EFA procedures (e.g., PAF followed by an SL orthogonalization) explicitly recommended by Carroll and other experts in factor analysis (e.g., Canivez, 2013; Carroll, 1993; Gorsuch, 1983; Gustafsson & Snow, 1997; McClain, 1996; Thompson, 2004). Because of these important omissions, the two adult aged (20 to 39 years; 40 plus years) correlation matrices were subjected to EFA and higher order factor analysis.
The results of this study suggest concerns about the theoretical structure of the WJ-III Cognitive across the adult age range. Similar concerns about the structure of the WJ-III full battery and the WJ-III Cognitive were raised when the school-age period (9 to 19) was analyzed (Dombrowski, 2013; Dombrowski & Watkins, 2013). The results of the present study add to these concerns, extending the time period under question (9 to 90 plus).
Use of EFA factor-extraction procedures (e.g., parallel analysis and MAP as supplemented by a visual scree) that are considered to be the most psychometrically robust suggests that the WJ-III Cognitive is a three-factor test at ages 20 to 39 and a two-factor instrument at ages 40 plus. However, extracting this number of factors generally renders the instrument less available for interpretation because the subtest alignment lacks full linkage to theory and the structure posited in the WJ-III Technical Manual. At the 20 to 39 age range (Table 3), the subtests that load Gc (Verbal Comprehension and General Information) combine with two of the three Gsm subtests, two of the Ga subtests, and retrieval fluency (Glr) to form the first factor. The second factor loads two Glr subtests (Visual-Auditory Learning and Visual-Auditory Learning Delayed), while the third factor loads all four Gs subtests. At age 40 plus (Table 4), two factors emerged. The first factor represents a combination of the Gs, Gv, and Glr (two of three) subtests, while the second factor combines the Gc, Gsm, and Ga (two of three subtests). Structural analyses using psychometrically sound EFA procedures do not support the seven-factor model presented in the Technical Manual. They do support the WJ-III Cognitive as a solid measure of general intelligence across the 20 to 90 plus time period.
When casting aside the aforementioned factor-extraction decision rules and extracting seven factors in accord with the structure posited in the Technical Manual, there were areas of convergence with and divergence from the Technical Manual. At age 40 plus (Table 2), the forced seven-factor solution was generally consistent with the Technical Manual’s posited structure except for Glr, which instead loaded Gs. At ages 20 to 39 (Table 1), the picture was more mixed. The Gs, Gc, and Gsm subtest alignment was consistent with the Technical Manual’s structure. Ga and Glr were partially aligned, but Glr paired with the Gc subtests. Auditory attention (Ga) formed a distinct, single-subtest seventh factor. The subtests that comprise Gf and Gv paired together but had poor loadings on a third factor. Thus, forcing the seven-factor fit generally holds at 40 plus (with the exception of Glr) but does not hold at ages 20 to 39. Providing further support for exclusive interpretation of the second-order factor, the reliability estimates for the broad factor were strong (ω = .93 to .96, ωh = .75 to .90). Estimates for the primary factors were low (ωh = .158 to .462), too low to support measurement of unique constructs (Reise, 2012) or individual interpretation.
Conclusion and Implications for Practitioners
The field has moved away from its psychometric roots and cast aside EFA when developing recent versions of IQ scales, whereas prior cognitive-ability instruments (e.g., WJ-R; SB-IV) used both EFA and CFA to elucidate internal structure. This omission could pose serious problems. First, research suggests that recent intelligence tests may be overfactored (see Frazier & Youngstrom, 2007). Reasons for overfactoring have included lenient factor-extraction decision-making rules, increasingly complex models of intelligence, and commercial pressure on test publishers for an ever-increasing array of constructs to interpret. Second, the derived factor structure of an instrument helps determine how it should be interpreted, with interpretation predicated on convergence among the sources of structural validity.
The results of the present study add to the body of EFA structural validity literature that cautions against overlooking interpretation of the higher order factor (g) in favor of interpretation of lower order factors (e.g., Canivez, 2013; Canivez & Watkins, 2010b; DiStefano & Dombrowski, 2006, 2013; Dombrowski & Watkins, 2013; Dombrowski et al., 2009; Glutting, Watkins, Konold, & McDermott, 2006; Nelson & Canivez, 2012; Oh, Glutting, Watkins, Youngstrom, & McDermott, 2004; Parkin & Beaujean, 2012; Watkins, 2010). It is concerning that subsequent EFA studies on instruments linked to CHC theory, including this one, have generally failed to produce evidence supportive of interpretation of lower order factors (e.g., Canivez, 2008; DiStefano & Dombrowski, 2006; Dombrowski, 2013; Dombrowski & Watkins, 2013).
Given the considerable divergence in structural results between EFA and CFA, the small amount of variance accounted for by lower order factors, and low omega hierarchical estimates, the most conservative practical implication would be to interpret the instrument where there is greatest convergent evidence (i.e., at the level of g). A more lenient approach—but one that might be asking the practitioner to skate on thin psychometric ice—would be to view lower order factors as potential screeners of a particular ability (e.g., Ga or Gsm) so long as there is convergent evidence regarding subtest alignment between independent studies and the Technical Manual. When a more thorough understanding or assessment of a particular cognitive ability is necessary, then the clinician or researcher could follow up with a full-scale assessment of that particular ability (e.g., a full-scale memory test). Interpretation of lower order factors in a test such as the WJ-III Cognitive could lead to erroneous diagnostic decision making by relying on factors that lack adequate psychometric support. The evidence is accumulating to suggest that caution should be heeded when interpreting the seven lower order, CHC factors on the WJ-III Cognitive.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
