Abstract
The two featured articles and eight commentaries on the WISC-IV (Wechsler, 2003) and WAIS-IV (Wechsler, 2008) in this special issue of Journal of Psychoeducational Assessment are of exceptional quality. As a collective, this special issue greatly advances the field of cognitive assessment by intelligently synthesizing the best of methodology (confirmatory factor analysis) with the best of theory (Cattell–Horn–Carroll). The Weiss et al. articles represent sophisticated approaches to test validation and interpretation and the commentaries deal with applications, extensions, and limitations of the findings. This “response to the respondents” focuses on Wechsler’s approach to assessment, historical antecedents of factor analysis (especially the contributions of Jacob Cohen in the 1950s), and the “intelligent testing” model of test interpretation.
David Wechsler
When the first form of the Wechsler–Bellevue was published, Wechsler (1939) cautioned clinicians: “The kind of life one lives is itself a pretty good test of a person’s intelligence. When a life history (assuming it to be accurate) is in disagreement with the ‘psychometric,’ it is well to pause before attempting a classification on the basis of tests alone. Generally it will be found that the former is a more reliable criterion of the individual’s intelligence” (p. 48).
That bit of elegant wisdom reminds us that Dr. Wechsler was a clinician and humanist who treated individual differences with great respect. But Wechsler was also skilled in psychometrics. He studied with Pearson and Spearman in London in 1919 and was mentored as a graduate student at Columbia University by Robert Woodworth, Edward Thorndike, and James McKeen Cattell, the three founders of The Psychological Corporation. And, importantly, he introduced standard scores as the metric of choice for individual IQ tests just 2 years after Terman and Merrill (1937) decided to retain the formula-based ratio IQ because, “the majority of teachers, school administrators, social workers, physicians, and others who utilize mental test results have not learned to think in statistical terms. To such a person a rating expressed as ‘+2 sigma’ is just so much Greek” (pp. 27-28).
Though I worked with Dr. Wechsler for nearly 5 years in the early 1970s during the process of revising the 1949 WISC and developing and standardizing the WISC-R—which was called the WISC (Rev.) in the manual’s page proofs until a last-minute decision by an executive rewrote history—I never knew of his psychometric background. To me, Dr. Wechsler was the consummate clinician who deferred to my statistical expertise because I trained at Columbia with Robert Thorndike, Edward’s son. I found out a few years later about his work with psychometric pioneers just after World War I, but during the time Dr. Wechsler mentored me, he never let on about his statistical savvy. I wanted to include, directly in the 1974 WISC-R test manual (Wechsler, 1974), the exploratory factor analyses of the WISC-R that I later published for normal children (Kaufman, 1975) and individuals with mental retardation (Van Hagen & Kaufman, 1975), and to have examiners compute three factor scores in addition to the three IQs, but he calmly said, “No, not yet; it isn’t time.”
Overview
I was asked to contribute to this special issue of Journal of Psychoeducational Assessment as a respondent to (a) the two centerpiece articles on the WISC-IV and WAIS-IV by Weiss, Keith, Zhu, and Chen (2013a, 2013b), and (b) the eight commentaries written about these articles by true luminaries in the field of cognitive assessment. The first section of this piece is intended to serve as a reminder that no discussion of Wechsler’s scales is complete without consideration of Wechsler the man and clinician; that no discussion of test profiles can be done in a psychometric vacuum, without keeping in the foreground Wechsler’s credo that precision and technical sophistication cannot substitute for what real-world accomplishments tell us about a person’s intelligence; and that all discussion of present and future applications of the high-tech studies by Weiss and colleagues needs to be grounded, at least to some extent, in what has gone before.
The articles in this special issue, as a collective, address all three of these topics in fine fashion. Weiss et al. (2013a, 2013b) never lose sight of the notion that their high-level Confirmatory Factor Analysis (CFA) methodology is meaningful to the extent that it will enhance applications and implications of test profiles in the real world of clinical evaluations. Goldstein (2013) and Grégoire (2013) offer keen historical perspectives that feature Wechsler’s direct and indirect roles in how CHC theory is applied to the latest editions of the WISC and WAIS. Claeys (2013) and Schwartz (2013) make the case that, ultimately, interpretation resides at the individual level, specific to the person’s own strengths and weaknesses within the domains of both cognition and personality (what Wechsler, 1950, called conative factors). Schwartz and Claeys take positions that support the intelligent testing model that I have championed for decades (Kaufman, 1979; Lichtenberger & Kaufman, 2013). Also pertinent to “intelligent testing,” Flanagan, Alfonso, and Reynolds (2013) and Schneider (2013) use the Weiss et al. (2013a, 2013b) results as a springboard to interpret Wechsler’s fourth editions within the broader context of comprehensive cross-battery assessment and key statistical considerations such as confidence intervals and regressed scores. Their user-friendly tables use psychometric and cognitive theory to extend the application of Wechsler’s scales to real-life clinical situations; in terms of Guilford’s (1967) ability of divergent production (creativity), Schneider and Flanagan et al. excel at “elaboration.” Bowden (2013) and Canivez and Kush (2013) provide state-of-the-art psychometric critiques of Weiss et al.’s methodology that are well reasoned and serve the function of reminding us that all clinical applications of test profiles must be rooted in a solid psychometric foundation—even if some of us disagree with the conclusions that they draw from the data.
I have written “response articles” to four previous special issues devoted to hot topics (Cicchetti, Kaufman, & Sparrow, 2004; Kaufman, 1984, 2001, 2010). In each prior instance, the articles in the special issues were polarized, the rhetoric was often vitriolic and contentious, and the arguments were sometimes personal rather than professional. Not so for this special issue. The two centerpiece articles by Weiss and colleagues were thorough, methodologically strong, and eclectic; neither the traditional four-factor model nor the CHC-based five-factor model was hailed as the “best”; and the authors consistently oriented their discussions toward pragmatic, clinical applications for scientist-practitioners.
Contributions of the Weiss et al. Studies
The authors of the eight response articles, almost universally, praised Weiss and colleagues for the high quality of their analyses and the thoughtfulness of their interpretation. Flanagan et al. considered the major contribution of the two well designed and executed studies to be the finding of invariance across samples for both factor models (suggesting a lack of test bias) and noted that there wasn’t much to critique. Claeys praised the studies for improving the power of our measurement technology to unveil the nuances of the subtests to better inform understanding of individual needs; in particular, they provide “users of legacy tests with empirical evidence of the factor structure vis-à-vis CHC theory” (Claeys, 2013, p. 171). Goldstein believed that the Weiss et al. studies offered compelling data to support Wechsler’s “vibrant and valuable” vision and validated cross-cultural research.
Though questioning the clinical meaningfulness of the data, Schwartz stated that the WISC-IV and WAIS-IV analyses clearly demonstrated measurement invariance for the normative and clinical samples and showed that the tests measured the same constructs across samples. Bowden (2013) believed that the studies offer a good model for improved theoretical understanding of two highly popular tests of cognitive ability and that “the significance of the finding of something close to so-called strict measurement invariance . . . should not be underestimated” (p. 150). To Schneider (2013), “the models presented in the target articles . . . are reasonably close to the limits of how far single-battery CFAs of this sort can take us in understanding which abilities are measured by the WISC-IV and the WAIS-IV” (p. 186). Grégoire was impressed by the validity evidence for the four-factor model, less so for the CHC-based solution; Canivez and Kush were not especially impressed by either model or the specific CFA methodology, emphasizing the role of g for Wechsler interpretation.
Intelligent Testing
Canivez and Kush, and Schneider, agreed on certain methodological problems with the Weiss et al. research: the absence of a nested factor analysis to compare with the final models; the virtual identity of Gf and g in the CHC models; and the questionable way that cross loadings were handled in the analyses. However, these researchers differed in how the Weiss et al. results should be applied for clinical purposes. Canivez and Kush (2013), who also believed that a direct hierarchical (bifactor) model should have been tested and who identified other methodological flaws, concluded: “We strongly believe that the substantial theoretical, methodological, and practical limitations greatly limit any interpretations of the results, particularly those suggesting utility of the findings for practitioners” (p. 166). By contrast, Schneider (2013) concluded: “I am prepared to take a leap of faith and declare the models to be excellent and ready to be used with individuals in applied settings” (p. 186).
I greatly prefer Schneider’s approach, which is a call to action, rather than the Canivez–Kush cautionary tale that strives to limit Wechsler interpretation to the Full Scale IQ. Most of the other respondents to the Weiss et al. articles in this special issue also agree that there is much to be gained by applying the results of the four-factor and five-factor CFAs to clinical practice. We are not practitioners of an exact science and we need to develop and explore hypotheses for individuals by merging data from as many sources as we can muster. Profile interpretation has been referred to as a “malpractice” (Hirshoren & Kavale, 1976) that provides “illusions of meaning” (McDermott, Fantuzzo, Glutting, Watkins, & Baggaley, 1992). I believe that failure to build on the superb theory-based research presented by Weiss et al. in this issue, and by related CFA studies that have appeared for more than a decade, is a kind of malpractice. Research on cognitive tests doesn’t get much better than the featured studies in this special issue, and clinical applications that build on these findings don’t get much better than the models provided by Flanagan et al. (2013) and Schneider (2013). Emphasizing Full Scale IQ and taking a wait-and-see attitude for profile interpretation is fine for a laboratory, but not for the real world of diagnosis and intervention. I believe that the g factor is a fascinating construct, worthy of empirical research (Kaufman, Reynolds, Liu, Kaufman, & McGrew, 2012); g may lie at the apex of the intelligence hierarchy, as John Carroll insisted, or may be an artifact, as John Horn argued. From a clinical standpoint, however, I still believe what I wrote so many years ago (Kaufman, 1979): “Beginning test interpretation with the Full Scale IQ does not elevate this global score into a position of primacy. Rather, the Full Scale IQ serves as a target at which the examiner will take careful aim. In fact, as examiners explore peaks and valleys . . . they are, in effect, trying to declare the Full Scale IQ ineffectual as an explanation of the child’s mental functioning” (p. 21).
Historical Considerations
I was interested in the historical antecedents of theory presented by Claeys: “From its very inception intelligence, as we conceive of it today, was vested with a multifactorial hierarchical model that Alfred Binet called ‘scheme of thought’ . . . Binet’s proposed edifice included a general factor, four ‘broad’ factors, and as many as 10 ‘narrow’ factors” (p. 170). Wasserman (2012) states that Alfred Binet believed “complex, multidimensional tasks were more sensitive to developmental changes than narrow, unidimensional tasks” (p. 14) and he identified the following 10 discrete mental faculties: “memory, imagery, imagination, attention, comprehension, suggestibility, aesthetic sentiment, moral sentiment, muscular strength/willpower, and motor ability/hand-eye coordination” (p. 14). Binet and g became intertwined, in large part because the Binet–Simon and Stanford–Binet yielded only a single, global IQ. But his original conception of intelligence was a forerunner of contemporary approaches to multiple abilities and neuropsychological processes. Henri Simon stated—long after Binet’s death in 1911—“the use of a summary IQ score was a betrayal (‘trahison’) of the scale’s objective” (Wasserman, 2012, p. 17).
Also of interest to me was the general lack of historical acclaim given to Cohen (1952a, 1952b, 1957a, 1957b, 1959) for his innovative exploratory factor analyses of Wechsler’s scales with both normal and clinical samples. Grégoire noted Cohen’s important contributions, and Cohen was also cited by Weiss et al. (2013) and Flanagan et al. (2013), but otherwise six of the eight response articles overlooked Cohen’s landmark studies. He discovered the omnipresent third factor that he variously named Memory or, famously, Freedom from Distractibility. And his five-factor solutions for the WISC and WAIS were eerily prescient of the five-factor CHC solution reported by Weiss et al. in this issue and by others (Benson, Hulac, & Kranzler, 2010; Keith, Fine, Taub, Reynolds, & Kranzler, 2006; Niileksela, Reynolds, & Kaufman, 2012; Ward, Bergman, & Hebert, 2012): Verbal (Gc), Nonverbal (Object Assembly, Block Design, Mazes; Gf), Memory (Arithmetic, Digit Span; Gsm), Picture Completion (Gv), and Coding/Digit Symbol (Gs). Wechsler (1958) was deeply interested in factor analysis, devoting an entire chapter of his adult book to the topic; he enlisted Cohen’s help with studies of the Wechsler–Bellevue and WAIS: “Dr. Cohen’s obliquely rotated solutions are presented in Table 33, and have been supplemented with a bifactor analysis done at [my] request” (Wechsler, 1958, p. 120).
However, Wechsler’s interpretation of factors was sometimes more clinical than psychometric. For example, he called the “Picture Completion” factor Relevance (“By relevance we mean appropriateness of response. . . . Many schizophrenics and other subjects, instead of noting the called for and essential missing part of a picture, respond with an irrelevant detail;” Wechsler, 1958, p. 126). Still, he was intrigued by the percentage of variance that was accounted for by cognitive factors, as Goldstein (2013) mentions, and he was aware of some of the same topics that appear in this special issue: “One of the surprises of the Arithmetic Test is its high loading on the Memory factor, although as one might expect, it also shows good g saturation. . . . Even more than in the case of Comprehension, one suspects that with the addition of other reference tests it might show substantial loadings on other factors. . . . The high Memory factor loading which Arithmetic has at all age levels, particularly in the age group of 60-75, makes one question some of the ‘abilities’ often posited by teachers as necessary for proficiency in mathematics” (Wechsler, 1958, p. 130).
Exploratory factor analysis has been justifiably criticized in this special issue for producing misleading results (Flanagan et al., 2013) and for being susceptible to sample-specific solutions (Bowden, 2013). These criticisms are especially noteworthy in the age of sophisticated CFAs. But in the early days of factor analysis, Cohen used fairly primitive techniques to make clinicians aware of the fact that Wechsler’s scales measured more than the Verbal, Performance, and Full Scale IQs, and that the scales did not reduce to 10 or 12 subtest-specific abilities. It is true that Cohen overfactored and rotated too many factors. Nonetheless, the well reputed, highly publicized “multiple abilities” research by Thurstone (1938; Thurstone & Thurstone, 1941) and Guilford (1956) failed to make a dent in how clinicians interpreted the WISC and WAIS. Yet Cohen’s factor analyses in the 1950s, endorsed by Wechsler and integrated into his own clinical approach to interpretation, changed the way clinicians viewed subtest profiles and provided the foundation for present-day analyses.
Developmental Considerations
There are a few insightful comments about the Weiss et al. studies that merit attention. Schwartz (2013) notes that developmental differences might account for “the difficulty in identifying quantitative marker variables in the WISC-IV like they were able to do with the WAIS-IV” (p. 181). Further, he states, “The cognitive developmental trend is more likely to be present in children than in adults, unless the authors are speaking of cognitive decline with aging” (p. 181). In fact, differences in development (including age-related experiences in being taught arithmetic, as Schwartz discusses) need to be given more weight in explaining differences between the results for WISC-IV versus WAIS-IV, for understanding possible developmental trends within the WISC-IV age range of 6 to 16, and for explaining any differences observed in future studies that contrast the structure of WISC-IV and WPPSI-IV (ages 2 years 6 months-7 years 6 months; Wechsler, 2012). Notably, the WPPSI-IV is the first Wechsler scale to provide separate indexes for Gf (Fluid Reasoning) and Gv (Visual–Spatial), rendering analyses from the perspective of CHC theory true tests of its construct validity for ages 4 to 7 years 6 months. However, Schwartz’s caution about the need for developmental interpretation of the data, and my agreement with it, must be tempered by one undeniable fact—the invariance in factor patterns across instruments, across populations (normal–clinical), and across age groups between 6 and 69 is striking and far more compelling than the fairly small differences from sample to sample. Furthermore, CFAs of the WPPSI-IV indicate that five-factor models fit best for ages 4 to 7 years 6 months, and for separate age groups within this range (4, 5, 6-7 years 6 months; Wechsler, 2012, pp. 75-85). The five factors measure the same CHC constructs reported by Weiss et al. for ages 6 to 69.
Alternate Five-Factor Model of WAIS-IV for Entire Age Range (16-90)
Absent from Weiss et al.’s (2013a) WAIS-IV study are CFAs for ages 70 to 90, a substantial portion of the adult age range. This omission prevents a true developmental analysis of the Weiss et al. data because much of the decline in cognitive functioning (especially in fluid reasoning and processing speed) starts to accelerate at ages 65 and above (Lichtenberger & Kaufman, 2013; Salthouse & Saklofske, 2010). However, recent evidence shows that a five-factor CHC structure has been identified for the WAIS-IV at ages 70 to 90 (Niileksela et al., 2012) that is very similar to the model identified for the WISC-IV at ages 6 to 16 and WAIS-IV at ages 16 to 69. Niileksela et al. entered the three parts of Digit Span (Forward, Backward, Sequencing) as separate variables to specify Gsm and defined Gf with Matrix Reasoning and Arithmetic. Canivez and Kush criticized Weiss et al. for failing to use hierarchical or nested factor solutions or to conduct Schmid–Leiman transformations. Niileksela et al. (2012) conducted all of these analyses, including CFAs similar to the ones conducted by Weiss et al., and concluded: “In addition to a higher order model, a nested factor model was tested. The findings from the nested factor models were very similar to those from the higher order model . . . suggesting that the methods provide fairly similar information with regard to the saturation of g and broad abilities in composites derived from the subtests” (p. 12).
Whereas the five-factor solution identified for the WAIS-IV by Weiss et al. and others (Benson et al., 2010; Ward et al., 2012) depends on administering the supplemental Figure Weights and Letter-Number Sequencing subtests, the solution for ages 70 to 90 does not depend on the administration of any supplemental subtests. In addition, it was found to be invariant across the entire 16 to 90 age range (Niileksela et al., 2012), making it an alternate CHC model for all ages. The measurement of the Gf and Gsm constructs is superior in terms of breadth and depth when the key Gf and Gsm supplemental subtests are administered; nevertheless, the five-factor model proposed by Niileksela et al. can be used when those subtests have not been given at ages 16 to 69 or are age inappropriate for ages 70 to 90.
Schwartz (2013) made a pertinent observation: “The five-factor model is based on the 15 subtests of the WISC-IV and WAIS-IV, while the four-factor model is based on only 10 subtests. At what point would the clinician know that all 15 subtests would be necessary in order to use the five-factor model?” (p. 182). The availability of the new Niileksela et al. (2012) five-factor model can simplify that choice by permitting the interpretation of a five-factor CHC-based solution even when examiners have administered none of the supplemental subtests. Tables for computing standard scores on the five traditional CHC factors (Keith et al., 2006; Weiss et al., 2013b) for the WISC-IV are available in Flanagan and Kaufman (2009); tables for computing both traditional (Benson et al., 2010; Weiss et al., 2013a) and “alternate” (Niileksela et al., 2012) five-factor CHC solutions for the WAIS-IV are available in Lichtenberger and Kaufman (2013) for ages 16 to 69 (traditional) and ages 16 to 90 (alternate).
For simplification, all standard scores on each of the five CHC factors in all WISC-IV and WAIS-IV models are based on two subtests apiece. That decision makes the practical task easier for clinicians, but, as Grégoire (2013) noted, the fact that one can measure children and adults on five CHC-based factors does not mean that these factor-based scores are good measures of their designated broad abilities. Further, he made the astute observation that the particular five CHC abilities that have evolved in CFAs of Wechsler’s scales are limited in scope (why is long-term retrieval or Glr excluded?). Grégoire provides a historical approach to Wechsler test development to explain how the particular five CHC factors survive to this day: (a) first, Wechsler found support for his initial two-factor approach in Alexander’s (1935) hierarchical factor analysis that identified verbal and practical factors; (b) then, the publisher developed Symbol Search to strengthen the distractibility factor identified by Cohen (1959) and Kaufman (1975); (c) however, the unexpected outcome was to split the factor into memory and processing speed factors rather than producing a more comprehensive distractibility factor; and (d) finally, the mission of the publisher was to enhance the fluid reasoning component of the Performance Scale by developing tasks like Matrix Reasoning and Picture Concepts, which led to the ultimate splitting off of Gv from Gf in the CFAs of the WISC-IV and WAIS-IV, and in the actual scale structure of WPPSI-IV for ages 4 to 7 years 6 months.
Practical Considerations
Weiss et al. offer simple, sensible guidelines for when to interpret the four- versus five-factor model for a given individual. They suggest that test users look for consistency within each of the four Indexes, and stick to the given four Indexes when such consistency characterizes the test profile; however, when discrepancies exist among scores on subtests within an Index, then consider the five-factor model. Canivez and Kush (2013) seem appalled by these guidelines: “Decide which model to interpret based on subtest scatter?! Where are the data in these articles, or any other for that matter, to support this recommendation? The practice of interpreting subtest scatter has an abundant literature . . . illustrating poor reliability, validity, and diagnostic utility that would argue against this” (p. 165). That is an irrational, knee-jerk response. Weiss et al. are asking examiners to check the consistency among a person’s scaled scores to determine if each Index measures a unitary trait, a perfectly normal psychometric procedure that has nothing to do with the invalid use of scatter for clinical diagnosis. Bowden (2013) cites an analogous application of “scatter” to the interpretation of clinically significant differences between Index scores: “interpretation of WAIS-IV subtest scatter now reflects a more cautious, base-rate oriented method of interpretation, seeking to minimize the false-positive rate of ‘abnormality.’ Lichtenberger & Kaufman (2009) provide an excellent example of a cautious, empirically oriented approach to detection of clinically meaningful differences between WAIS-IV Index scores and within-Index subtest scatter” (p. 154). Weiss et al. apply that same degree of caution in their use of discrepancies among scores to help guide the choice of model. Although the scatter concept has been abused, as Canivez and Kush point out, its application sometimes reflects psychometrically sound and cautious procedures for making sound decisions.
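The logic of that guideline can be made concrete with a minimal sketch. It assumes, purely for illustration, that an Index counts as "consistent" when the range of its subtest scaled scores stays below a fixed cutoff; the cutoff value, the function names, and the sample profile are all hypothetical and are not taken from Weiss et al., who should be consulted for their actual criteria.

```python
from typing import Dict, List

# Hypothetical cutoff (an assumption, not a published criterion): an Index is
# treated as cohesive when its highest and lowest subtest scaled scores
# differ by fewer than this many points.
MAX_WITHIN_INDEX_RANGE = 5


def index_is_cohesive(scaled_scores: List[int],
                      max_range: int = MAX_WITHIN_INDEX_RANGE) -> bool:
    """Return True when the subtest scores within one Index are consistent."""
    return max(scaled_scores) - min(scaled_scores) < max_range


def suggest_model(profile: Dict[str, List[int]],
                  max_range: int = MAX_WITHIN_INDEX_RANGE) -> str:
    """Suggest which factor model to consider first for one examinee.

    Stay with the four Indexes when every Index is cohesive; otherwise
    flag the five-factor model as worth considering.
    """
    scattered = [name for name, scores in profile.items()
                 if not index_is_cohesive(scores, max_range)]
    return "five-factor" if scattered else "four-factor"


# Illustrative (made-up) profile of subtest scaled scores grouped by Index.
example_profile = {
    "VCI": [10, 11, 9],
    "PRI": [12, 6, 11],   # a 6-point spread exceeds the hypothetical cutoff
    "WMI": [9, 10],
    "PSI": [8, 9],
}
print(suggest_model(example_profile))
```

The sketch only selects a starting point for interpretation. It draws no diagnostic inference from the scatter itself, which is the distinction at issue in the paragraph above.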
The Curious Case of Arithmetic
Of all Wechsler subtests, Arithmetic was the most maligned by respondents. It migrates to various factors and ought to be supplemental on the WAIS-IV (Flanagan et al., 2013); it appears to have an uncomfortable fit on any one factor (Claeys, 2013); it’s a good example of the difficulty of creating tasks that measure only one ability (Grégoire, 2013); and because of its many cross loadings it ought to be removed or supplemented with new quantitative reasoning subtests (Canivez & Kush, 2013). I agree with the criticisms of the maverick Arithmetic subtest; it behaved not at all like a Gsm task in the recent study by Niileksela et al. (2012). But I would not like to see it eliminated from the WISC or WAIS even though it was removed from the WPPSI. Arithmetic is cognitively complex and Dr. Wechsler liked his tasks to be complex. He would have been sick to find out that Picture Arrangement was removed from his comprehensive batteries; he never would have agreed to have an auditory processing (Ga) scale on a Wechsler test because it is too cognitively simple and does not meet his personal definition of intelligence. Dr. Wechsler was cognitively complex, like Arithmetic, and a maverick, like this multifactored subtest. Despite all future improvements in Wechsler-tests-to-be, the master’s imprint should remain front and center. Leave Arithmetic alone. And keep firmly in mind Wechsler’s attitude that how a person functions in the real world trumps statistics, regardless of the sophistication of the methodologies or the accuracy of the measurements.
Conclusions
Intelligent testing demands that individually administered tests be personalized to each child or adult being assessed. The articles by Weiss et al. (2013a, 2013b), and most responses to those studies, endorse personalization and provide a scientifically valid, theory-based, psychometrically sophisticated foundation for generating meaningful hypotheses about a person’s cognitive functioning. This type of scientific rigor was not available 3 decades ago when I wrote Intelligent Testing with the WISC-R (Kaufman, 1979). But my goals were similar to the goals of all authors in this special issue: to apply psychometrics and theory to test profiles (at a time when neither played an important role in interpretation). The psychometrics were no match for CFA and the theories fell short of CHC. But case reports in the 1970s often included statements such as: “John’s scaled score of 11 in Picture Arrangement indicates his strength in social comprehension and his scaled score of 9 in Picture Completion reveals his inability to distinguish essential from non-essential details.” In that era, V-P IQ discrepancies of 15 points or greater meant clear-cut brain damage, and some authoritative sources told Wechsler examiners that high Digit Symbol/Coding with low Digit Span suggests a person “who seems to be controlling strong and pressing anxiety by excessive activity. When we find the reverse pattern . . . we are usually confronted with an essentially depressed person who is attempting to ward off recognition of depressive affect perhaps in a hypomanic way, usually via denial, but not necessarily through activity and acting out behavior” (Allison, Blatt, & Zimet, 1968, p. 32).
In the 1970s, I was among a group of psychologists and special educators who were trying to convert a straw house to one built of wood (e.g., Bannatyne, 1971; Bush & Waugh, 1976; Kaufman, 1979; Lutey, 1977; Matarazzo, 1972; Myers & Hammill, 1976; Sattler, 1974). Ever since, the aim has been to turn the wood house into brick or stone. Whereas the 1970s was characterized by texts, the 1980s witnessed the first group of individually administered tests built from theory (Kaufman & Kaufman, 1983; Thorndike, Hagen, & Sattler, 1986; Woodcock & Johnson, 1989). The 1990s was highlighted by the birth and expansion of CHC theory and cross-battery assessment (McGrew & Flanagan, 1998) and the direct applications of neuropsychological theories of processing to cognitive assessment (Kaplan, 1990; Korkman, Kirk, & Kemp, 1998; Naglieri & Das, 1997). During the 2000s, the most psychometrically sophisticated and theoretically relevant tests of cognitive ability were published and widely used for clinical evaluations in schools and clinics (Elliott, 2007; Kaufman & Kaufman, 2004; Reynolds & Kamphaus, 2003; Roid, 2003; Wechsler, 2003, 2008).
During the first half of the decade of the 2010s, psychometric and theoretical sophistication have merged in a dynamic way, as reflected by the applications of CHC theory (Flanagan, Alfonso, & Ortiz, 2012; Schneider & McGrew, 2012), neuropsychological processing (Naglieri, Das, & Goldstein, 2012), and CFA (Keith & Reynolds, 2012) to facilitate diagnoses and inform interventions. The two Weiss et al. articles and all eight commentaries featured in this issue reflect the highest form of the art of cognitive assessment. The foundation for intelligent testing is now built out of brick.
We have definitely come a long way.
Footnotes
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
