Abstract
Language abilities in adolescents with autism spectrum disorders (ASDs) are variable and can be challenging to ascertain with confidence. This study aimed to compare and evaluate different forms of language assessment: standardized language testing, narrative analysis and parent/teacher reports. Fourteen adolescents with ASD and 14 typically developing adolescents matched on age, gender and nonverbal ability were assessed using a number of standardized assessments for receptive and expressive language skills, a standardized narrative test, two experimental narrative assessments and a parent/teacher report measure of pragmatics. Although adolescents with ASD scored within the normal range on expressive and receptive language, their performance on narrative tasks revealed difficulties with both structural and evaluative language. Both teachers and parents rated the pragmatic language skills of the young people with ASD as significantly lower than those of the typically developing group, but parents were more likely than teachers to additionally identify difficulties in speech and syntax. The implications of these results for professionals in terms of assessing the language skills of adolescents with ASD and for the planning of appropriate intervention are discussed.
I Introduction
1 Language skills in autism spectrum disorder
This study compares different methods of assessing language skills in adolescents with autism spectrum disorder (ASD), a neurodevelopmental disorder characterized by impairments in social interaction, verbal and nonverbal communication and restricted and repetitive behaviours (American Psychiatric Association, 2013). Although the difficulties of individuals with ASD are well documented, there is still much we do not know about the language abilities of this population, in part because of the challenges involved in the assessment of these skills (Hudry et al., 2010). This is an area which merits further research to enable the planning of appropriate programmes of intervention and support.
Language abilities of children and young people with ASD vary greatly, with some individuals effectively nonverbal (Lord and Paul, 1997) but others displaying no significant impairment on traditional measures of structural language (Williams et al., 2008). However, it is well established that difficulties with pragmatic language are almost universal in individuals of all ages (Tager-Flusberg et al., 2005). Pragmatic language refers to the appropriate and relevant use of language in context and requires both speaker and listener to have a range of social linguistic skills (Prutting and Kirchner, 1987). Studies report that children and young people with ASD struggle with initiating and turn taking in conversation, topic maintenance, pedantic speech and the use of figurative language (Diehl et al., 2006; Tager-Flusberg, 2000).
2 Assessment of language skills in individuals with autism spectrum disorder
The assessment of language ability in individuals with ASD is problematic. Previous research has shown that on standardized psychometric language measures such as the Clinical Evaluation of Language Fundamentals (CELF; Semel et al., 2006), a test measuring structural aspects of oral language, and the British Picture Vocabulary Scale (BPVS; Dunn and Dunn, 2009), a test assessing receptive vocabulary, high-functioning individuals with ASD may score within the normal range, even though it is clear to both professionals and parents that they have significant communication difficulties (Volden and Phillips, 2010; Young et al., 2005). Such tests focus on linguistic structures, syntax and vocabulary in a one-to-one situation and may not accurately reflect the difficulties individuals experience in using language in an everyday social setting. This has implications not only for planning targeted programmes of intervention but also for the funding of resources and support (Young et al., 2005). A child or young person who has language scores falling within the average range may be judged not to require additional resources and services (Klin et al., 2000). In this study, therefore, we also consider alternative but complementary forms of assessment: parental and teacher reports of adolescents’ pragmatic language skills and the use of narrative analysis.
3 Assessing pragmatic language skills: The Children’s Communication Checklist (CCC-2)
A number of tests have been developed to assess pragmatic language. Of these, the CCC-2 is reported to be the most proficient at identifying pragmatic language impairment (Volden and Phillips, 2010). Carers or professionals are asked to rate the child on a checklist comprising 10 scales, four relating to pragmatic language, four assessing structural language and two assessing non-language domains such as autistic features not directly related to language. The test has not been designed to diagnose ASD, but low scores on pragmatic scales combined with impairment in the non-language domains indicate that the child should be referred for a more detailed clinical assessment. Used on its own, it is thought to have limitations in the identification of autism; Charman et al. (2007) questioned its efficacy as a screening instrument for autism, arguing it produces false positives and may overestimate the prevalence of autism.
The use of parent report measures in the CCC-2 overcomes many of the difficulties inherent in formal language testing, but may introduce other biases. It is difficult to ascertain with confidence the validity and reliability of these reports. Furthermore, correlations between parents’ ratings of pragmatic language at home and those of professionals indicate some inconsistency across environments. Bishop and Baird (2001) report correlations ranging from 0.30 to 0.58 for individual pragmatic scales and 0.46 for the pragmatic composite. Such discrepancies in children with developmental difficulties are not unusual and have also been reported in assessments of emotional, behavioural and social difficulties (Redmond and Rice, 1998).
4 Narrative as a tool for assessing language ability in ASD
There is growing support for the view that the assessment of narrative skills in children and young people with ASD provides useful information about language abilities, beyond that obtained through formal language testing. The construction of a narrative is a complex task, drawing on a number of linguistic, social and cognitive abilities. Individuals must understand and produce language, plot events, use grammatical structures to mark causal and temporal relations and both attribute and understand thoughts, emotions, and intentions (Losh and Capps, 2003). Research suggests that narrative assessment provides an alternative and ecologically valid method of obtaining information on the language skills of individuals with ASD (Botting, 2002; Manolitsi and Botting, 2011). In a study examining the structural and pragmatic language of children with ASD and specific language impairment (SLI), Manolitsi and Botting found that whilst standardized tests yielded relatively little information which distinguished one group of children from the other, measures on a structured narrative task revealed a number of qualitative differences between the groups. Similar findings are reported by Banney et al. (2014), who found that, compared to a language-matched group of typically developing children, narratives elicited from children with ASD during the Autism Diagnostic Observation Schedule (ADOS) demonstrated impairments in syntax, use of pronouns and elements of story grammar. However, evidence in this area is not yet conclusive. A study comparing the use of two language tests to identify pragmatic language difficulties in children with ASD found that the Strong Narrative Assessment Procedure (SNAP; Strong, 1998) failed to differentiate clearly between children with and without ASD (Young et al., 2005).
5 Aims and predictions
The aim of this study was to compare the performance of adolescents with ASD and their typically developing peers on three different forms of language assessment in order to evaluate the effectiveness of each to identify language difficulties experienced by young people with ASD. Participants were assessed on the CELF IV, the BPVS, the CCC-2 and tasks of narrative analyses across three narrative genres: retelling a story, constructing a fictional story and talking about events. Further, given that previous research has reported low correlations between ratings of parents and professionals of children with developmental difficulties, a secondary aim was to examine differences in parent and teacher ratings on the CCC-2.
II Methods
1 Participants
The total sample of this study comprised 28 adolescents: 14 with a clinical diagnosis of ASD and 14 typically developing adolescents. The ASD group consisted of 14 male adolescents aged 11–14 years, all with a clinical diagnosis of ASD given prior to the commencement of the study and confirmed by parents and schools. With the exception of one, all had statements of special educational needs identifying an autistic spectrum disorder as their primary need. A statement is a legal document issued by a local authority in the UK describing the needs of the pupil and how these should be addressed. Three young people attended a mainstream school and the remainder attended a specialist ASD school or unit. Fourteen typically developing adolescents, drawn from two secondary schools, were matched with the ASD group on chronological age, gender and nonverbal ability. Mean nonverbal and verbal ability scores for both groups were within the average range for their age. No significant differences were found between the two groups on measures of age, nonverbal or verbal ability (Table 1).
Table 1. Characteristics of autism spectrum disorder (ASD) and chronological age match (CM) groups.
2 Materials
a Standardized tests
All young people were assessed on the following standardized tests:
Matrices test of the BAS II (British Ability Scales. 2nd edition; Elliot et al., 1996): The matrices test of the BAS II is a measure of nonverbal ability. Participants are presented with a set of patterns in which one pattern is incomplete and are required to point to the missing piece. The BAS II demonstrates good reliability (.95 for school age children) and validity (.69).
BPVS II (British Picture Vocabulary Scale. 2nd edition; Dunn et al., 1997): The BPVS II is a measure of receptive vocabulary. Participants are shown four line drawings and asked to choose the one that best illustrates a word spoken by the assessor. It has good psychometric properties (reliability .93, validity .76).
CELF IV UK (Clinical Evaluation of Language Fundamentals. 4th edition; Semel et al., 2006): The CELF IV consists of core subtests measuring semantics, syntax and working memory with supplementary subtests for receptive and expressive language and automatic naming ability. Composite scores are calculated from the scores of different combinations of the subtests. The composite scores all have excellent or good reliability, ranging from .88 to .92.
Children’s Communication Checklist (CCC-2; Bishop, 1998): The CCC-2 comprises 70 items divided into nine scales. Two scales assess aspects of language structure (speech and syntax), five assess pragmatic aspects of communication (inappropriate initiation, coherence, stereotyped conversation, use of context, and rapport), and two assess non-linguistic aspects of autistic behaviour (social relationships and interests). Each item contains a statement describing a specific behaviour (e.g. ‘talks repetitively about things that no-one is interested in’) which is rated as ‘definitely applies’, ‘applies somewhat’, ‘does not apply’ or ‘unable to judge’.
The Expression, Reception and Recall of Narrative Instrument (ERRNI; Bishop, 2004): The ERRNI is an assessment of spoken narrative skills. Participants are required to look at a series of colour pictures, retell the story with the pictures in sight, and answer a series of comprehension questions. Three of these questions are based on literal information presented in the text, and six require the participants to make inferences. Reliability scores range from .75 to .90.
b Experimental measures
Narrative Tasks: Two narrative tasks developed by King et al. (2013, 2014) were used to assess oral narrative language skills.
Event narratives: Participants completed two different event narrative tasks, one designed to elicit a general response and one a specific response. In the general condition, participants were asked six questions encouraging them to recall a general event (e.g. ‘What do people usually do at Halloween?’). In the specific condition, questions were designed to encourage a narrative of a specific personal event (e.g. ‘Can you tell me about what you did at Halloween one time?’). Each question was accompanied by a relevant picture. Internal reliability is reported as high (Cronbach’s Alpha coefficient of > 0.7).
Story stems: In this task, participants were presented with the story stems shown below, with accompanying pictures, and asked to continue the story.
1. The boy ran into the forest. He looked ahead of him and saw a little green man in a spaceship.
2. When the girl climbed up the mountain, she saw, hidden among the trees, a little wooden house covered in snow.
Good internal consistency is reported for this task (Cronbach’s Alpha coefficient of > 0.7).
c Procedure
Each pupil was assessed individually on three separate occasions in a quiet room at school. The order in which the event narratives were presented, the order of the questions, and the order of presentation of the story stems were counterbalanced to control for order effects and possible differences in difficulty. Narratives were recorded and later transcribed and coded.
The event narratives were coded for structural language, evaluative language and enrichment devices using the Systematic Analysis of Language Transcripts (SALT) programme (Miller and Iglesias, 2008). Structural measures included length of narrative (as measured by the number of main body words), mean length of utterance, and the number of different word roots. Evaluative measures included references to mental states, causal statements and narrative enrichment devices (character speech, negative comments, emphatic markers, and hedges, that is, narrative devices used to distance the speaker, for example, ‘sometimes’, ‘perhaps’, ‘probably’ and ‘might’).
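To illustrate what the three structural measures capture, the following is a minimal sketch in Python, not the SALT procedure itself. It rests on several simplifying assumptions that SALT does not share: words are split on whitespace, surface word forms stand in for word roots, and mean length of utterance is counted in words rather than morphemes. The function name `structural_measures` and the example utterances are hypothetical.

```python
from typing import Dict, List


def structural_measures(utterances: List[str]) -> Dict[str, float]:
    """Compute simple structural measures for one transcribed narrative.

    Assumption: each list item is one utterance, already segmented by the
    transcriber. Surface word forms are used in place of word roots.
    """
    # Lowercase and strip surrounding punctuation so "Pumpkin," and
    # "pumpkin" count as the same word.
    tokenised = [
        [w.strip(".,!?;:'\"").lower() for w in utterance.split()]
        for utterance in utterances
    ]
    # Drop empty tokens and empty utterances.
    tokenised = [[w for w in words if w] for words in tokenised]
    tokenised = [words for words in tokenised if words]

    all_words = [w for words in tokenised for w in words]
    total_words = len(all_words)
    n_utterances = len(tokenised)
    return {
        "total_words": total_words,
        "mlu_words": total_words / n_utterances if n_utterances else 0.0,
        "different_words": len(set(all_words)),
    }


# Hypothetical three-utterance narrative.
example = ["We carve a pumpkin.", "Then we go out.", "We get sweets!"]
measures = structural_measures(example)
```

In this toy example the narrative has 11 words across 3 utterances and 9 different word forms. A faithful replication would need morphological analysis to recover word roots and SALT's conventions for excluding mazes and unintelligible segments.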
The story stems were coded in the same manner as the event narratives using the SALT programme of analysis but, in addition, a hand-coded analysis was undertaken using the narrative scoring scheme (NSS) in the SALT handbook. Each story was given a score between 0 and 5 on seven categories, 1 being ‘minimal use’ and 5 being ‘proficient use’.
To ensure the reliability of the coding, 10% of the narratives were coded separately by two trained researchers. Inter-rater reliability was found to be high for both the event narrative (.90) and the story stem tasks (.87). The narrative coders were blind to whether the participants were in the ASD group or typically developing group.
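The reliability coefficients above are reported without naming the statistic used. As an illustration only, the sketch below shows two common ways of quantifying agreement between two coders on categorical codes: simple proportion agreement and Cohen's kappa, which corrects for chance agreement. Whether either corresponds to the figures reported here is an assumption; the coder labels and category names are hypothetical.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Proportion of items to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: product of the two coders' marginal proportions,
    # summed over every category either coder used.
    categories = set(coder_a) | set(coder_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for ten utterances from the double-coded subsample.
coder_1 = ["mental", "causal", "hedge", "causal", "mental",
           "none", "none", "causal", "hedge", "none"]
coder_2 = ["mental", "causal", "hedge", "causal", "none",
           "none", "none", "causal", "hedge", "none"]

agreement = percent_agreement(coder_1, coder_2)
kappa = cohens_kappa(coder_1, coder_2)
```

With these invented codes the coders agree on 9 of 10 items, while kappa is somewhat lower because part of that agreement would be expected by chance. For continuous measures such as word counts, a correlation or intraclass coefficient would be a more natural choice.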
Parents and teachers were asked to complete a copy of the CCC-2. The study was approved by the ethics committee of the University College London Institute of Education and followed British Psychological Society guidelines. Consent for participation was obtained from both parents and schools.
III Results
1 Overview of the data and statistical analysis
Means and standard deviations of the scores of both groups on each of the measures of the administered tests were calculated. Independent-samples t tests were conducted to test for significant differences between the groups on the core, expressive and receptive language measures of the CELF IV, the BPVS II, and the narrative assessments. A series of one-way between-groups ANOVAs were employed to investigate differences between the ASD and comparison group on the scaled scores of each scale of the CCC-2 for both parent and teacher ratings. Differences between teacher and parent ratings of the young people on the CCC-2 scales were examined using paired-samples t tests. Because the analysis of the scores on the experimental narrative tasks and the CCC-2 necessitated the use of multiple tests, we adopted a more stringent alpha level of p < 0.01 to control for Type I error, consistent with previous studies examining narrative performance over a number of different measures (Rumpf et al., 2012). Effect sizes were evaluated with Cohen’s d using the means and standard deviations of the two groups.
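The group comparisons described above can be sketched from summary statistics alone. The Python example below assumes the pooled-standard-deviation form of Cohen's d and the equal-variance form of the independent-samples t test; the text does not state which variants were used, so both are assumptions. The function names and the input values are hypothetical and are not data from this study.

```python
import math
from typing import Tuple


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Cohen's d: mean difference in pooled-SD units (assumed variant)."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


def independent_t(mean1: float, sd1: float, n1: int,
                  mean2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Equal-variance independent-samples t statistic and its degrees of freedom."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / standard_error, n1 + n2 - 2


# Hypothetical standard scores for two groups of 14 (not study data).
d = cohens_d(90.0, 10.0, 14, 100.0, 10.0, 14)
t, df = independent_t(90.0, 10.0, 14, 100.0, 10.0, 14)
```

Under these assumptions, with two groups of 14 the two statistics are linked by |d| = |t| × √(2/14), so a t value and its effect size can be checked against each other.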
A comparison of the standard scores of both groups on the CELF IV and the BPVS II, shown in Table 2, revealed differences between the language profile of the adolescents with ASD and that of the typically developing comparison group. Scores for both groups on the expressive, receptive and core language measures of the CELF IV and receptive vocabulary scores on the BPVS II fell within the average range, but those of the young people with ASD were lower than those of the comparison group. These differences were significant for the core language composite score (t(26) = −2.36, p = .03, Cohen’s d = 0.89), but not for the receptive or expressive language indices of the CELF IV, or the BPVS II.
Table 2. Means and standard deviations for CELF IV standard scores on language indexes and BPVS II standard scores for autism spectrum disorder (ASD) and chronological age match (CM) groups.
2 Comparisons on narrative assessments
Differences between the two groups on the narrative assessments are shown in Table 3. In the ERRNI, standard scores for each group on both measures of the test, ‘ideas’ and ‘mean length of utterance’, fell within the average range, but those of the ASD group were significantly lower than those of the comparison group on both indices (‘ideas’: t(26) = −2.44, p = .02, Cohen’s d = 0.92; ‘mean length of utterance’: t(26) = −3.89, p = .001, Cohen’s d = 1.47).
Table 3. Means and standard deviations for ERRNI standard scores for autism spectrum disorder (ASD) and chronological age match (CM) groups.
The event narratives of the adolescents with ASD differed from those of the comparison group on both structural and evaluative measures of language (Table 4 and Table 5). Results are reported separately for the general event narratives (GEN) and the specific event narratives (SEN). The narratives of the ASD group were significantly shorter (GEN: t(26) = −2.99, p < .01, Cohen’s d = 1.15; SEN: t(26) = −2.67, p < .01, Cohen’s d = 1.01), contained proportionally fewer different word roots (GEN: t(26) = −2.81, p < .01, Cohen’s d = 2.26; SEN: t(26) = −4.18, p < .01, Cohen’s d = 1.58), had a shorter mean length of utterance (GEN: t(26) = −5.15, p < .01, Cohen’s d = 2.00; SEN: t(26) = −5.19, p < .01, Cohen’s d = 1.97), contained fewer causal statements (GEN: t(26) = −2.91, p < .01, Cohen’s d = 1.15; SEN: t(26) = −4.60, p < .01, Cohen’s d = 1.8), had fewer references to mental states (GEN: t(26) = −2.91, p < .01, Cohen’s d = 1.3) and made less use of enrichment devices (GEN: t(26) = −3.66, p < .01, Cohen’s d = 1.4).
Table 4. Means and standard deviations for structural measures in general event narratives (GENs) and specific event narratives (SENs) for autism spectrum disorder (ASD) and age match groups.
Table 5. Mean scores and standard deviations of proportional scores for evaluative measures in general event narratives (GENs) and specific event narratives (SENs) for autism spectrum disorder (ASD) and age match groups.
Tables 6 and 7 show that the mean scores of the ASD group in the story stem task were lower than those of the comparison group on all structural and evaluative measures, although differences were significant for just two of the measures: mean length of utterance and number of causal statements (MLU: t(26) = −3.12, p < .01, Cohen’s d = 1.18; causal statements: t(26) = −2.78, p < .01, Cohen’s d = 1.05). However, the results of the analysis of the NSS codes of the stories showed that stories produced by the ASD group differed significantly from those of the comparison group on a number of aspects of storytelling (character development: t(26) = −4.24, p < .01, Cohen’s d = 1.60; references to mental states: t(26) = −3.51, p < .01, Cohen’s d = 1.33; referencing: t(26) = −4.18, p < .01, Cohen’s d = 1.58; conflict resolution: t(26) = −2.98, p < .01, Cohen’s d = 1.13; cohesion: t(26) = −3.33, p < .01, Cohen’s d = 1.27; and total narrative score: t(26) = −3.40, p < .01, Cohen’s d = 1.29).
Table 6. Results for analysis of story stem structural and evaluative measures for autism spectrum disorder (ASD) and chronological age match (CM) groups.
Table 7. Results of analysis of NSS codes for story stems for autism spectrum disorder (ASD) and chronological age match (CM) groups.
3 Comparisons on the CCC-2 observational checklist
Table 8 shows that mean scaled scores of parent and teacher CCC-2 checklists for the adolescents with ASD are considerably lower than those of the standardization sample. Using interpretation guidance from the CCC-2 manual, the scores for this group fall within normal limits on the speech and syntax scales but, on most other scales, are below the 5th percentile, suggesting communication problems of clinical significance. Scores below the 6th percentile on the two scales of the SIDC of the CCC-2 indicate a communicative profile characteristic of adolescents with ASD.
Table 8. Mean (SD) scaled scores on the CCC-2: Parental and teacher ratings of adolescents with autism spectrum disorder (ASD), age matched control group and standardization sample.
Notes. NCC = nonverbal communication composite; SIDC = social interaction deviance composite.
4 Teacher ratings
Teacher ratings of the ASD group were significantly lower than those of the comparison group on the semantics [F(1, 26) = 14.97, p = 0.005, Cohen’s d = 1.46], coherence [F(1, 26) = 16.76, p < 0.001, Cohen’s d = 1.55], inappropriate initiation [F(1, 26) = 11.20, p = 0.002, Cohen’s d = 1.26], stereotyped language [F(1, 26) = 24.48, p < 0.001, Cohen’s d = 1.87], use of context [F(1, 26) = 44.16, p < 0.001, Cohen’s d = 2.51], nonverbal communication [F(1, 26) = 59.82, p < 0.001, Cohen’s d = 2.92], social relations [F(1, 26) = 49.89, p = 0.005, Cohen’s d = 2.67] and interests [F(1, 26) = 34.64, p < 0.001, Cohen’s d = 2.23] scales of the CCC-2. No significant differences were found between their ratings of the groups on the speech and syntax scales.
5 Parent ratings
Parental ratings of adolescents with ASD were significantly lower than those of the comparison group on all scales of the CCC-2 (speech [F(1, 26) = 14.04, p = 0.002, Cohen’s d = 1.42], syntax [F(1, 26) = 113.73, p = 0.002, Cohen’s d = 1.40], semantics [F(1, 26) = 25.92, p < 0.001, Cohen’s d = 1.93], coherence [F(1, 26) = 30.61, p < 0.001, Cohen’s d = 2.09], inappropriate initiation [F(1, 26) = 47.63, p < 0.001, Cohen’s d = 2.61], stereotyped language [F(1, 26) = 33.28, p < 0.001, Cohen’s d = 2.18], use of context [F(1, 26) = 74.11, p < 0.001, Cohen’s d = 3.25], nonverbal communication [F(1, 26) = 62.20, p < 0.001, Cohen’s d = 2.97], social relations [F(1, 26) = 56.82, p < 0.001, Cohen’s d = 2.84], and interests [F(1, 26) = 63.15, p < 0.001, Cohen’s d = 2.99]).
A secondary aim of the study was to investigate teacher and parent differences in ratings of adolescents on the CCC-2 scales. Results showed that the parents of the adolescents with ASD gave significantly lower ratings than teachers on the inappropriate initiation scale (parents: M = 3.35, SD = 2.24; teachers: M = 6.00, SD = 3.16; t(13) = 4.17, p = .001, Cohen’s d = 0.96) and the interests scale (parents: M = 2.50, SD = 1.69; teachers: M = 5.14, SD = 2.44; t(13) = 5.3, p < .001, Cohen’s d = 1.25). No significant differences were found between the parent and teacher ratings on any of the other scales. There were no significant differences between parent and teacher ratings of adolescents in the control group on any scale of the CCC-2.
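Because each adolescent was rated by both a parent and a teacher, the two sets of ratings are not independent, which is why a paired-samples test applies. The sketch below shows the usual form of that test, computed on the within-pair differences. The function name and the ratings are hypothetical and are not data from this study.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t(first: Sequence[float], second: Sequence[float]) -> Tuple[float, int]:
    """Paired-samples t statistic (second minus first) and degrees of freedom."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("Need at least two matched pairs of ratings.")
    # Each difference compares the two informants' ratings of the same adolescent.
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    # Sample standard deviation of the differences (n - 1 in the denominator).
    sd_difference = statistics.stdev(differences)
    return mean_difference / (sd_difference / math.sqrt(n)), n - 1


# Hypothetical scaled scores for four adolescents on one CCC-2 scale.
parent_ratings = [3, 2, 5, 4]
teacher_ratings = [5, 5, 6, 8]
t_paired, df_paired = paired_t(parent_ratings, teacher_ratings)
```

With 14 matched pairs, as in the ASD group here, the same calculation yields 13 degrees of freedom.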
IV Discussion
This study compared the performance of adolescents with ASD with that of a matched typically developing group on three forms of language assessment. In line with findings from previous studies (Kjelgaard and Tager-Flusberg, 2001; Manolitsi and Botting, 2011), we found that the scores of the adolescents with ASD fell within the average range on standardized tests of language ability measuring oral structural language and receptive vocabulary. Scores on the BPVS, the core language composite and the expressive and receptive indexes of the CELF IV were lower than those of the comparison group, but these differences were only significant for the core language composite score. From these findings alone, it may be difficult to argue that a young person with ASD requires extra support and funding.
On the other hand, the parent/teacher reports and the analysis of the narrative tasks clearly indicate that adolescents in the ASD group have communication problems of clinical significance. However, there were some interesting differences between the ratings of teachers and parents. Teachers rated adolescents with ASD significantly lower than the comparison group on all scales apart from speech and syntax, but parents’ ratings were significantly lower on all scales, including speech and syntax. This may reflect the fact that teachers’ ratings were mostly drawn from observations made in structured classroom situations whereas parents’ ratings were more likely to be based on interactions in less structured situations. There is clearly much to be gained from listening to the views of both parents and teachers when assessing young people with ASD (Volden and Phillips, 2010), but it could equally be argued that such information may be biased and unreliable.
The narratives of the adolescents with ASD also differed from those of the comparison group on a number of measures. Three types of narrative tasks were analysed, each yielding different information about the language abilities of the ASD group. Mean scores of both groups on the two measures of the ERRNI fell within the average range, but those of the ASD group were significantly lower than those of the comparison group, indicating that their narratives were shorter and less grammatically complex than those of their typically developing peers. Their event narratives were also significantly shorter and less grammatically complex but additionally contained more limited vocabulary, included fewer reasons and explanations, made fewer references to emotions and thoughts and made less use of linguistic enrichment devices. An analysis of the story stem narratives showed that adolescents with ASD scored lower than the typically developing group on all measures but that differences were significant for just two of these: mean length of utterance and the number of causal statements. However, the NSS analysis, which rated the story stem narratives from a more ‘global’ perspective, found that stories of the ASD group differed significantly from those of the comparison group on aspects related to character development, references to mental states, referencing, conflict resolution and cohesion, concurring with findings reported in previous research (Diehl et al., 2006; Losh and Capps, 2003; Tager-Flusberg and Sullivan, 1995).
V Educational, research and clinical implications
The findings of the present study have implications for those working with young people with ASD in education, research and clinical practice. However, they must be interpreted with caution as this is a small study with participants of a limited age range and cognitive ability. Furthermore, whilst tests such as the BPVS II and the CELF IV are judged high on reliability measures and, for the most part, scoring is objective, parent/teacher ratings on the CCC-2 and the coding and analysis of the narratives may be more subjective and open to bias. Moreover, whereas the CCC-2 is relatively easy to administer and score, narrative analysis can be time consuming and complex, although advances in technology are making this form of assessment more manageable. Further, although this study has examined a broad range of language abilities in young people with ASD, it is by no means comprehensive and there are other areas which would merit further exploration. Many studies report impairment or delay in individuals with an ASD in figurative language, that is, the ability to go beyond that which has been explicitly stated (Happé, 1995; Rundblad and Annaz, 2010). Although there is evidence that this may be related to core language skills, syntax and vocabulary (Kalandadze et al., 2018; Whyte and Nelson, 2015), an assessment of figurative language skills may yield useful information for clinicians and educators.
Notwithstanding these limitations, we believe that our study raises a number of issues about the assessment of language abilities in young people with an ASD. Our results indicate that scores attained by adolescents with an ASD on commonly used standardized tests of language ability may provide only limited information about the strengths and difficulties they may have with language and communication. The implications of this should not be underestimated, as service providers both in the UK and internationally often regard the scores achieved on these tests as providing them with reliable scientific information for their decisions about the provision of support and resources for individual children (King and King, 2006; Webb and Whitaker, 2012). The assessment of language and communication skills in young people with ASD is complex, and the use of additional methods such as an observational checklist and narrative analysis to complement traditional standardized tests may help to provide more detailed information about specific areas of difficulty and aid in the planning of appropriate intervention and support.
Footnotes
Acknowledgements
The authors would like to thank those who participated in this study, and also Professor Julie Dockrell and Professor Morag Stuart for their advice in the development of the experimental narrative tasks.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
