Abstract
Many children with autistic disorder, or autism, are described as having low intelligence quotients. These descriptions are partially based on use of various editions of the Wechsler Intelligence Scale for Children (WISC), the most widely used intelligence test for children with autism. An important question is whether task demands of the Wechsler scales are sensitive to unique characteristics of children with autism that might affect test performance. Another question relates to how changes in the newest edition of the WISC have affected its sensitivity for the population of individuals with autism. The administration guidelines and task demands of the third (WISC-III) and fourth (WISC-IV) editions were examined to determine appropriateness when measuring intelligence of children with autism. Implications related to use of these instruments with children with autism are discussed.
Autistic disorder, or autism, is a pervasive developmental disorder characterized by severe impairments in the social, communicative, and behavioral domains (American Psychiatric Association [APA], 2000). Another frequently reported characteristic is a low intelligence quotient (IQ), leading many to assume a high comorbidity between autism and intellectual disability (e.g., Graziano, 2002; Hartley & Sikora, 2010). Indeed, the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR) indicates in the diagnostic criteria for autism that “in most cases there is an associated diagnosis of mental retardation, that can range from mild to profound” (APA, 2000, p. 71).
A deficit in intellectual functioning is only one of three criteria necessary for the identification of intellectual disability. Professionals agree that the condition must occur during the developmental period (before 18 years of age) and must include deficits in adaptive behavior concurrently with deficits in intellectual functioning (APA, 2000; Schalock et al., 2010). Therefore, low intellectual functioning alone does not constitute an intellectual disability; however, it must be present before the diagnosis is made. Generally, an individual is considered to have a deficit in intellectual functioning if his or her IQ is approximately 2 SDs or more below average on a standardized intelligence test, considering the standard error of measurement of the specific instrument (Schalock et al., 2010). Schalock et al. also emphasize that the strengths and limitations of intelligence tests be considered when using IQ as a criterion for the diagnosis of intellectual disability. Legally, the Individuals With Disabilities Education Improvement Act (IDEA, 2004) requires that tests used to assess a child be provided and administered in the language and form most likely to yield accurate information. For example, it would be inappropriate to measure the intelligence of a student with cerebral palsy using a test requiring motor responses or using a test requiring visual input for a student with low vision (Taylor, 2009). Schalock et al. (2010) referred to this issue as one of test fairness and provided the example that using an intelligence test requiring a verbal response for individuals with limited verbal ability will underestimate their IQs. Therefore, it is important to examine the content and format of intelligence tests to determine their appropriateness for individuals with autism, to ascertain whether reported IQs are accurate or whether they could be an artifact of assessment instrumentation as a function of diagnostic characteristics. 
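The intellectual-functioning criterion described above can be sketched numerically. The following is a minimal illustration, not a diagnostic procedure: it assumes the Wechsler IQ metric (mean 100, SD 15), and the ±1 SEM band, the SEM value of 3.0, and the function names are hypothetical choices made only for the example.

```python
# Sketch under stated assumptions: Wechsler IQs are scaled to mean 100, SD 15.
MEAN_IQ = 100
SD_IQ = 15


def deficit_cutoff(n_sd=2.0):
    """Score lying n_sd standard deviations below the mean."""
    return MEAN_IQ - n_sd * SD_IQ


def band_reaches_cutoff(observed_iq, sem, n_sd=2.0):
    """True if the lower edge of a +/- 1 SEM band (a hypothetical
    convention for this example) falls at or below the cutoff."""
    lower_edge = observed_iq - sem
    return lower_edge <= deficit_cutoff(n_sd)


print(deficit_cutoff())                   # 70.0
print(band_reaches_cutoff(73, sem=3.0))   # True: 73 - 3 = 70
print(band_reaches_cutoff(80, sem=3.0))   # False: 80 - 3 = 77
```

The point of the sketch is only that the cutoff is "approximate": an observed score a few points above 70 can still meet the criterion once measurement error is considered, which is separate from the question of whether the test's task demands are fair to the examinee.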
Diagnostic features of autistic disorder include difficulties in the areas of spoken language (SL), language comprehension, attention (Attn), and social interaction. The implication is that if intelligence tests reflect these characteristics, then they might not be “fair” to children with autism and thus might underestimate their IQs.
The various editions of the Wechsler scales, including the Wechsler Intelligence Scale for Children (WISC), have been reported to be the most widely used instruments with individuals with autism (Bolte, Dziobek, & Poustka, 2009). In a meta-analysis of 130 cognitive and behavioral studies of autism, Mottron (2004) reported that almost half (46.9%) used the Wechsler scales to determine IQ. This implies that the third and fourth editions of the WISC (WISC-III, Wechsler, 1991; WISC-IV, Wechsler, 2003) have been used frequently to determine the intelligence of children with autism. The WISC-III was published shortly after autism was first designated as a disability category by the IDEA in 1990. Thus, many reports of low IQs of individuals with autism were based on this test.
The purpose of this study was to examine the administration guidelines and task demands of the WISC-III and the WISC-IV. Two specific questions were addressed:
“Do the Wechsler scales measure the general intelligence of children with autism fairly, given their characteristics?” A negative answer would partially explain claims that “intelligence has been underestimated in autistics” (Dawson, Soulieres, Gernsbacher, & Mottron, 2007, p. 657).
“How do the task demands of the WISC-IV compare with the WISC-III?” Such a comparison might illuminate differences in IQs reported in professional literature and have implications for test selection.
Method
Instruments
The WISC-III consists of 13 subtests, including 3 optional (O) subtests. Administration yields subtest scaled scores, index scores, and Verbal, Performance, and Full-Scale IQs. The 13 subtests are Picture Completion, Information, Coding, Similarities, Picture Arrangement, Arithmetic, Block Design, Vocabulary, Object Assembly, Comprehension, Symbol Search (O), Digit Span (O), and Mazes (O). The four index scores are for the areas of Verbal Comprehension (VC), Perceptual Organization (PO), Freedom From Distractibility (FD), and Processing Speed (PS).
The WISC-IV has 15 subtests, including 5 optional subtests. A total of 10 of the WISC-III subtests were retained, with altered optional status for several of them: Similarities, Vocabulary, Comprehension, Information (O), Block Design, Picture Completion (O), Digit Span, Arithmetic (O), Symbol Search, and Coding. The subtests of Picture Arrangement, Object Assembly, and Mazes (O) were deleted. The new subtests added were Word Reasoning (O), Matrix Reasoning, Picture Concepts, Letter–Number Sequencing, and Cancellation (O). The WISC-IV retained the Full-Scale IQ and four index scores but deleted the Verbal and Performance IQs. The index scores are for the composites of VC, Perceptual Reasoning, Working Memory, and PS.
Procedure
We began our analysis of each test by reviewing the examiner manuals. We focused on any special administration instructions that were provided for the use of the tests with children with disabilities. Of special interest were discussions regarding examiner/examinee familiarity and the use of alternative methods of communication during testing.
After a review of the administration guidelines, we independently evaluated the task demands to determine those on which performance could be affected by the diagnostic features of autistic disorder noted in the DSM-IV (4th ed.; APA, 1994). Codes for those features and definitions are provided in Table 1.
Table 1. Coding Definitions.
Note: LC = language comprehension; SL = spoken language; Attn = attention; SC = social comprehension.
Each subtest could be categorized as being affected by 0 to 4 of the autistic features, resulting in a possible score of 52 (13 subtests × 4 features) for the WISC-III and 60 (15 subtests × 4 features) for the WISC-IV. The percentage of possible diagnostic features reflected in the subtests was then calculated. For example, if three subtests involved both language comprehension (LC) and SL, two subtests involved Attn, and two subtests involved social comprehension (SC), 10 of the 60 (WISC-IV) possible features (16.7%) would be reflected in the subtests. This information was used to evaluate a variety of scores (e.g., IQs, subtest profiles, index scores). The higher the percentages, the more the features of autism were reflected in the scores.
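The scoring scheme can be sketched as a small calculation over a subtest-by-feature coding. The subtest names and ratings below are made up for illustration and are not the ratings from this study; the function names are likewise hypothetical.

```python
# The four feature codes used in the analysis.
FEATURES = ("LC", "SL", "Attn", "SC")


def max_possible(n_subtests):
    """Maximum score: every subtest affected by every feature."""
    return n_subtests * len(FEATURES)


def percent_reflected(coding):
    """coding maps a subtest name to the set of feature codes judged to
    affect it; returns the percentage of possible features flagged."""
    flagged = sum(len(set(codes) & set(FEATURES)) for codes in coding.values())
    return 100.0 * flagged / max_possible(len(coding))


# Hypothetical ratings for three made-up subtests (not the study's data).
example = {
    "Subtest A": {"LC", "SL"},
    "Subtest B": {"Attn"},
    "Subtest C": set(),
}

print(max_possible(13), max_possible(15))  # 52 60
print(percent_reflected(example))          # 25.0 (3 of 12 possible)
```

Restricting the dictionary to a subset of subtests (for example, only required subtests, or only those contributing to one index) gives the corresponding percentage for that score.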
Results
WISC-III
Administration guidelines
The examiner’s manual of the WISC-III contains a statement indicating that low scores on the test may be due to a variety of conditions other than low intellectual functioning, including autism and deafness. Indeed, “The examiner must take into account or carefully eliminate such factors before diagnosing intellectual impairment” (Wechsler, 1991, p. 9). According to the manual, modifications such as the use of sign language or visual aids may have an impact on test scores and will invalidate use of the norms. A page in the manual contains details on the importance of establishing rapport with the examinee.
Format of subtests
Subtests were analyzed to determine those that could be affected by the 4 diagnostic features evaluated in this study (see Table 2). For the WISC-III, 15 of the 52 possible features were reflected in the 13 subtests (28.8%). A total of 5 (38.5%) were categorized as being affected by LC, 4 (30.8%) each by SL and Attn, and 2 (15.4%) by SC. When the optional subtests were deleted from the analysis, the resulting percentages changed only slightly. In all, 13 of the possible 40 features (32.5%) for the 10 required subtests were included. There also were some slight differences for the specific categories: 50% (LC), 40% (SL), 20% (Attn), and 20% (SC).
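As an arithmetic check, the WISC-III percentages above can be reproduced from the reported counts. This is a sketch: the counts for all 13 subtests are taken from the text, whereas the counts for the 10 required subtests are an assumption, back-calculated from the reported percentages (e.g., 50% of 10 subtests = 5).

```python
# Number of subtests affected by each feature, as reported in the text
# for all 13 WISC-III subtests.
REPORTED_ALL_13 = {"LC": 5, "SL": 4, "Attn": 4, "SC": 2}

# Assumption: back-calculated from the reported percentages for the
# 10 required subtests; not listed explicitly in the text.
DERIVED_REQUIRED_10 = {"LC": 5, "SL": 4, "Attn": 2, "SC": 2}


def summarize(counts, n_subtests):
    """Return (features flagged, possible features, overall %, per-feature %)."""
    flagged = sum(counts.values())
    possible = n_subtests * len(counts)
    overall = round(100.0 * flagged / possible, 1)
    per_feature = {f: round(100.0 * c / n_subtests, 1) for f, c in counts.items()}
    return flagged, possible, overall, per_feature


all_13 = summarize(REPORTED_ALL_13, 13)
required_10 = summarize(DERIVED_REQUIRED_10, 10)

print(all_13)       # (15, 52, 28.8, {'LC': 38.5, 'SL': 30.8, 'Attn': 30.8, 'SC': 15.4})
print(required_10)  # (13, 40, 32.5, {'LC': 50.0, 'SL': 40.0, 'Attn': 20.0, 'SC': 20.0})
```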
Table 2. Demands of the WISC-III and WISC-IV Subtests.
Note: WISC-III = Wechsler Intelligence Scale for Children–Third Edition (Wechsler, 1991); WISC-IV = Wechsler Intelligence Scale for Children–Fourth Edition (Wechsler, 2003); X = diagnostic feature included; (O) = optional subtest; NI = not included in this edition.
In addition to the Full-Scale IQ determined from the 10 required subtests, 5 subtests are used to calculate the Verbal IQ and 5 subtests are used for the Performance IQ. As expected, the verbal subtests were considerably more affected by the diagnostic features of autism (55%) than were the performance subtests (10%). Index scores for the WISC-III are determined by combining various subtests, including 2 of the 3 optional subtests. Four subtests each contribute to the VC and PO index scores, and two subtests each contribute to the FD and PS index scores. The number and percentage of the diagnostic features reflected in each index were calculated (see Table 3). VC was affected by the greatest number of features of autism (56.3%) and PO by the fewest (6.3%).
Table 3. Number and Percentage of Features of Autism Involved in the WISC-III and WISC-IV Indices.
Note: WISC-III = Wechsler Intelligence Scale for Children–Third Edition (Wechsler, 1991); WISC-IV = Wechsler Intelligence Scale for Children–Fourth Edition (Wechsler, 2003).
WISC-IV
Administration guidelines
The examiner’s manual of the WISC-IV contains the following statement: “It is important not to attribute low performance on a cognitive test to low intellectual ability when, in fact, it may be attributable to physical, language, or sensory difficulties” (Wechsler, 2003, p. 11). In addition, it notes that children with speech, language, or hearing impairments may be at a disadvantage on VC subtests. Also emphasized is the importance of balancing the needs of a particular child with the need to maintain standard administration procedures and the importance of establishing rapport with the examinee.
Format of subtests
For the WISC-IV, 26.7% of the 60 possible features were reflected in the 15 subtests (see Table 2). Regarding the categories, 6 subtests (40%) were affected by LC, 4 (26.7%) by SL, 6 (40%) by Attn, and 1 (6.7%) by SC. When the optional subtests were deleted, the percentage of features for the 10 subtests was 27.5%. The following percentages were found for the specific categories: 40% (LC), 30% (SL), 40% (Attn), and 10% (SC).
Although separate Verbal and Performance IQs are not calculated for the WISC-IV, four index scores can be determined using combinations of the 15 subtests. These are VC (5 subtests), Perceptual Reasoning (4 subtests), Working Memory (3 subtests), and PS (3 subtests). The number and percentage of the diagnostic features reflected in these indices also can be found in Table 3. Similar to the WISC-III, VC was affected by the greatest number of features (50%) and Perceptual Reasoning by the fewest (0%).
Discussion
There are practical and legal considerations that must be addressed when evaluating the intelligence of children with autism. Factors that can affect the results of any evaluation include the examiner–examinee interaction and the characteristics of the test itself (Taylor, 2009). These factors are even more important when testing a child with autism because of the varying and atypical characteristics associated with the diagnosis.
Examiner–Examinee Interaction
Problems with social reciprocity are well documented in children with autism (APA, 2000), making the examiner–examinee relationship particularly important. Unless the examiner becomes familiar with the child with autism prior to testing, the student’s obtained score likely will not accurately reflect his or her performance relative to the normative group. In general, test constructors should include information in their manuals regarding the importance of pretest contact so that unfamiliarity does not create a bias against children with certain disabilities and threaten the validity of their test results (Fuchs, Fuchs, Power, & Dailey, 1985). No mention is made in either manual about the importance of the examiner being familiar with examinees. This omission of information regarding familiarity with the examiner is certainly not unique to the Wechsler scales; nonetheless, it emphasizes the importance of recognizing this factor when considering the interpretation of the test results and magnifies the importance of establishing rapport prior to testing.
The examiner manuals for both tests do include discussions of methods of establishing rapport. Considerable care should be taken to establish rapport prior to the testing session, particularly because children with autism are unlikely to perform optimally when evaluated by unfamiliar individuals. Additional time spent establishing rapport is not only recommended but should be required.
Characteristics of the Tests
IDEA (2004) requires that tests be administered in the language and form most likely to yield accurate information. This includes using those tests whose results are not affected by an individual’s known characteristics. Given this consideration, the use of the WISC-III and the WISC-IV is questionable for children with autism, particularly when the tests are used to measure general intelligence. When the 10 subtests used to determine the Full-Scale IQ for each instrument were analyzed, 32.5% of the features of autism identified for this study were involved in the WISC-III and 27.5% in the WISC-IV. Notable exceptions (those subtests including none of the identified features) were Block Design and Object Assembly on the WISC-III, and Block Design, Matrix Reasoning, and Picture Concepts on the WISC-IV. It is interesting to note that more than half of the diagnostic features of autism are reflected on the 5 subtests comprising the Verbal Intelligence Quotient (VIQ) on the WISC-III. These results suggest that the Full Scale Intelligence Quotient (FSIQ) from both instruments and the VIQ from the WISC-III should be interpreted cautiously for children with autism.
If an FSIQ is determined, should the WISC-III or WISC-IV be used? Our results indicate that there is only a slight difference in the number of features of autism included on the subtests of the two tests. The characteristics of autism used in the current study were included in the task demands of approximately one third of the subtests of the WISC-III and WISC-IV. This finding would suggest that the more recent edition should be used because it includes more current norms. Mayes and Calhoun (2008) recommended that the WISC-IV be used, noting that it may be more sensitive to the strengths in visual and verbal reasoning and the weaknesses in Attn, PS, and comprehension/social reasoning that they observed in children with higher functioning autism. However, this sensitivity to the weaknesses might not make the WISC-IV FSIQ the most appropriate to use because it includes four subtests measuring memory and PS compared with just two on the WISC-III FSIQ. Similarly, the WISC-III FSIQ includes four subtests that measure visual processing, a strength of autism (Sattler, 2008), whereas the WISC-IV FSIQ includes only two.
Perhaps the logical recommendation is to use intelligence tests other than the Wechsler scales to determine IQs for children with autism. Use of the Wechsler scales might not result in valid, appropriate, and meaningful scores; it may underestimate these children’s IQs and lead to faulty conclusions regarding comorbidity with intellectual disability. This suggests that the estimates that 75% of children with autism also have intellectual disability (Graziano, 2002) may or may not be accurate and could be an artifact of the instrumentation used to determine IQ. The WISC-III and WISC-IV manuals specifically state that low scores may be due to factors other than low intellectual functioning in certain conditions, with autism specifically mentioned in the WISC-III manual.
As an illustration of the impact of test selection, Dawson et al. (2007) reported that children with autism scored significantly lower on the WISC-III FSIQ (approximately 90) than on the Raven Progressive Matrices (RPM; Raven, Raven, & Court, 2003), on which they scored approximately 103. Dawson et al. pointed out that the RPM, a measure of visual analogical reasoning, is the preeminent test of fluid intelligence. Bolte et al. (2009) found that the RPM produced slightly higher IQs than the Wechsler scales but only for those whose Wechsler IQs were below 85. One possible explanation for this finding might be that the effect of language, included on the Wechsler scales but not the RPM, decreases as IQ increases (Siegel, Minshew, & Goldstein, 1996). Put another way, children with lower IQs are more negatively affected by language measures than those with higher IQs.
The profile of cognitive skills also appears to be related to general level of intelligence. For example, Mayes and Calhoun (2003) noted that children with autism with IQs >80 (M = 103) had significant strengths on Information and Similarities, and significant weaknesses on Coding, Arithmetic, and Digit Span. However, those with IQs <80 (M = 67) had significant strengths in Object Assembly, Block Design, Information, and Vocabulary. No subtests were identified as significant weaknesses for this group, although the lowest performance was on Comprehension and Arithmetic. When these results are interpreted based on the findings of the current study, an interesting pattern emerges. For individuals with higher IQs, the percentage of characteristics reflected in their highest performing subtests (50%) is actually greater than that reflected in their lowest performing subtests (33%). This suggests that the characteristics of autism reflected in the task demands do not play an important role in the cognitive profile of this group. Conversely, for the low IQ group, the characteristics of autism reflected in the subtests reported as their strengths were considerably fewer (25%) than those in the subtests reported as their weaknesses (62.5%). This suggests that the task demands do play an important role in the profile for this group.
Mayes and Calhoun (2003, 2004) also analyzed the index scores on the WISC-III and reported that children with autism with higher IQs had significantly higher scores on the VC and PO indices than on the FD and PS indices. For children with lower IQs, no significant differences among index scores were found although the pattern of strengths and weaknesses was the same. Similar results with the WISC-IV (strengths in VC and Perceptual Reasoning and weaknesses in Working Memory and PS) also were found (Mayes & Calhoun, 2008). Our findings reflecting autism-related characteristics are consistent with the reported strength in PO (6.3%)/Perceptual Reasoning (0%) and the weakness in FD (37.5%)/Working Memory (33.3%) and PS (25% on the WISC-III, 33.3% on the WISC-IV). The reported strength in VC would not be expected based on our analysis; that index had the most characteristics of autism reflected in the task demands (56.3% for the WISC-III and 50% for the WISC-IV). Again, this can be explained by the decreased effect of language as IQ increases. Put another way, the verbal subtests on the Wechsler scales were not designed to be sensitive to the deficits in more complex language abilities found in individuals with higher functioning autism (Siegel et al., 1996), nor do they measure the types of communication deficits that are characteristic of children with autism.
In conclusion, the question of whether to use the Wechsler scales with children with autism does not have a simple answer. Neither is the decision to use the WISC-III or WISC-IV a straightforward one. If the purpose of assessment is to determine the type and the degree of an individual’s cognitive strengths and weaknesses, then the Wechsler scales can provide useful information. However, if the purpose is to determine an IQ that represents general intellectual functioning, particularly if diagnostic decisions are involved, several concerns arise. The results of this study as well as others indicate that there are two issues that must be considered when using either the WISC-III or WISC-IV for this latter purpose. The first has to do with the level of intelligence of the child with autism. It appears that children with autism who are lower functioning might be more negatively affected by the task demands of the Wechsler scales than are children with autism who are higher functioning. This could account for the relatively high comorbidity between autism and intellectual disability. That is, many of these reports were based on studies before the increased interest in, and information about, higher functioning autism and other disorders on the autism spectrum.
Given these considerations, it seems advisable to administer an alternative, visually oriented, nonverbal test such as the RPM or the Universal Nonverbal Intelligence Test (Bracken & McCallum, 1998) to initially determine the general intelligence of children with autism. Subsequently, if Wechsler scale information is wanted, the WISC-IV should be administered only to those with relatively higher IQs to obtain a cognitive profile that might help identify strengths and weaknesses useful for educational programming. If this practice is followed, all 15 subtests should be administered.
A question remains as to the relative benefit of determining an IQ in the first place. Such a determination for an individual with autism neither establishes eligibility for additional services nor provides the most useful information for instructional planning. In other words, attaching the label of intellectual disability to an individual with moderate to severe autism may be inaccurate and serves no functional educational purpose, but it does add another stigma. Effective assessment for this population should use a functional, naturalistic, authentic approach that enables educators to define the skills that need to be developed and the methods of instruction most likely to be successful. Portfolio assessment seems a natural choice. Carothers and Taylor (2003) outlined the procedures to follow to develop a meaningful portfolio for children with autism. Entries could be included in areas such as socialization, communication, behavior, academics, functional skills, and fine and gross motor skills.
Assessment of the cognitive functioning of children with autism is less likely to provide life-enhancing benefits than is assessment that identifies strengths, weaknesses, and ways to remediate deficits. If intellectual assessment is a part of a diagnostic battery, then the limitations of using these instruments, the Wechsler scales in particular, have to be acknowledged. If our goal is to provide meaningful services rather than to merely attach stigmatizing labels, our assessment choices for children with autism must change and improve.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
