Stability and change in autism spectrum disorder diagnosis from age 3 to middle childhood in a high-risk sibling cohort

Abstract

Considerable evidence on autism spectrum disorder emergence comes from longitudinal high-risk samples (i.e. younger siblings of children with autism spectrum disorder). Diagnostic stability to age 3 is very good when diagnosed as early as 18–24 months, but sensitivity is weaker, and relatively little is known beyond toddlerhood. We examined stability and change in blinded, clinical best-estimate diagnosis from age 3 to middle childhood (mean age = 9.5 years) in 67 high-risk siblings enrolled in infancy. Good agreement emerged for clinical best-estimate diagnoses (89.6% overall; kappa = 0.76, p < 0.001, 95% confidence interval = 0.59–0.93). At age 3, 18 cases (26.9%) were classified with “autism spectrum disorder”: 17 retained their autism spectrum disorder diagnosis (94.4%; 13 boys, 4 girls) and 1 no longer met autism spectrum disorder criteria at follow-up. Among “non–autism spectrum disorder” cases at age 3, 43/49 remained non–autism spectrum disorder at follow-up (87.8%; 22 boys, 21 girls) and 6/49 met autism spectrum disorder criteria (“Later-Diagnosed”; 3 boys, 3 girls). Later-diagnosed cases had significantly lower autism spectrum disorder symptomatology and higher receptive language at age 3 and trends toward lower autism symptoms and higher cognitive abilities at follow-up. Emerging developmental concerns were noted in all later-diagnosed cases, by age 3 or 5. High-risk children need to be followed up into middle childhood, particularly when showing differences in autism-related domains.

Keywords

autism spectrum disorder, diagnostic stability, middle childhood, sibling, toddler

Introduction

The past decade has yielded considerable research on the early manifestations of autism spectrum disorder (ASD), largely from prospective longitudinal studies of high-risk (HR) infants (i.e. those with an older sibling with ASD). This has propelled our understanding of the emergence of ASD, informing intervention priorities and public policy. A recurrence rate of ASD in siblings of 18.7% has been reported based on pooled data from the “Baby Siblings Research Consortium” (BSRC; Ozonoff et al., 2011), and very good stability has been demonstrated for diagnoses made in younger siblings as early as 18–24 months of age (Ozonoff et al., 2015; Zwaigenbaum et al., submitted). However, a substantial subset of children with ASD (38%–46% in these recent studies) did not receive their first diagnosis until age 3, with higher functioning children over-represented in this later-diagnosed group. Although outcome assessments at age 3 have become conventional in HR cohort studies, these findings raise the question as to whether additional children might cross the diagnostic threshold for ASD beyond age 3.

Good diagnostic stability has also been documented in clinical ASD samples in the toddler years (e.g. Chawarska et al., 2007; Kleinman et al., 2008), but studies examining stability into childhood are limited (Charman et al., 2005; Lord et al., 2006; Turner et al., 2006). Moreover, studies of clinically ascertained samples are often unable to address the question of sensitivity, since few non-diagnosed cases are followed beyond the initial assessment, precluding identification of later diagnoses. Davidovitch et al. (2015) recently conducted a chart review of over 200 clinic cases diagnosed with ASD after age 6, following non-diagnosis during multiple previous assessments. The vast majority had previously identified language, motor, attention, and/or cognitive difficulties, but ASD-related symptoms were noted in the charts of fewer than half. This is consistent with the average age of ASD diagnosis remaining around age 4 (Daniels and Mandell, 2013), suggesting that new diagnoses past age 3 are quite common. A key issue is whether these later-diagnosed cases represent initial misclassification or whether social-communication challenges and/or repetitive behaviors became more pronounced with age, thus “pushing” those cases over the diagnostic threshold as they got older. Finally, despite generally good stability, a subset of children may meet ASD diagnostic criteria as toddlers but not later (e.g. “optimal outcome” cases; Fein et al., 2013). The estimated recurrence of 18.7%, based on pooled rates across several sites of the BSRC (Ozonoff et al., 2011), is higher than rates reported in recent population-based studies (e.g. Sandin et al., 2014), raising the possibility that some HR siblings may display transient ASD-related behaviors as toddlers. Early ASD-related symptoms in HR toddlers may increase the risk of false-positive diagnoses. Following these children beyond the toddler years provides important information about the long-term stability of diagnoses made at age 3 within a HR context.

Our objective was to examine stability and change in diagnostic classification, from age 3 to middle childhood, in a longitudinal HR cohort of younger siblings of children with ASD.

Method

Participants

A total of 67 younger siblings of children with ASD (HR siblings) were drawn from our larger longitudinal study (Zwaigenbaum et al., 2005). Participants were enrolled based only on familial risk and followed from 6–12 months of age until middle childhood (7.5–12.5 years; hereafter, the “9-year/follow-up” assessment). All participants seen at age 3 were invited to return for both a 5-year (non-blinded) and a 9-year (blinded) assessment. From our longitudinal cohort, 238 cases were eligible for the 9-year assessment, having been seen at age 3 and having reached age 8 (7.5 in one case) by the time of data collection. Of these, 129 were seen at age 5 (54%), 67 of whom returned for the 9-year assessment (52%). To examine possible retention bias, we explored the association between return status (returned or not at follow-up) and diagnosis at age 3 or 5. No significant association emerged as a function of diagnosis at 3 (χ² = 0.93, p = 0.39) or 5 years (χ² = 0.113, p = 0.85), suggesting no systematic retention bias based on previous diagnosis.

Procedures

Blinded clinical best-estimate (CBE) diagnoses were made by different clinicians at the 3- and 9-year assessments, respectively, informed by the Autism Diagnostic Observation Schedule (ADOS; original ADOS scores were converted to revised algorithm scores; Autism Diagnostic Observation Schedule-2 (ADOS-2); Lord et al., 2012), the Autism Diagnostic Interview-Revised (ADI-R), and Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR) criteria. We included a blinded CBE diagnosis at age 3 as this was presumed to be the most reliable time-point for “early” but stable diagnosis. Our second blinded CBE was conducted at the 9-year time-point (our final time seeing the children), to capture any changes in diagnosis as late as possible; the interim assessment at age 5 did not include a blinded assessment largely due to resource constraints. The focus of this article is on the 3- and 9-year time-points. Cognitive functioning was assessed using the Mullen Scales of Early Learning (MSEL; Mullen, 1995) at age 3, and the Wechsler Intelligence Scale for Children, 4th ed. (WISC-4; Wechsler, 2003) whenever possible at the 9-year assessment. At both time-points, clinical experts made an ASD diagnostic determination (ASD or not) based on all available current information, but blind to previous assessment information; other diagnostic impressions were also recorded. At age 3, clinicians were blind to group membership (HR vs controls with no familial risk); this was not maintained at follow-up because only HR siblings were followed beyond age 3. Institutional ethics approval was received from all participating sites.

Analyses

Cohen’s kappa was used to evaluate diagnostic stability across time-points. Four separate multivariate analyses of variance (MANOVAs) were used to compare two key subgroups (Stable ASD vs Later-Diagnosed ASD) on indices of social-communication and cognitive functioning at each age (critical p for between-subjects tests subjected to Bonferroni correction). Files were reviewed for additional details about cases whose diagnoses changed.

Results

Mean age of participants was 37.8 months (standard deviation (SD) = 1.9, range = 35.5–45.0) and 114.1 months (9.5 years; SD = 15.1, range = 92.8–152.6 months), at the 3- and 9-year assessments, respectively. Kappa revealed good agreement between CBE diagnoses, χ² = 39.5, p < 0.001, kappa = 0.76, 95% confidence interval (CI) = 0.59–0.93, with overall crude agreement of 89.6% (60 of 67). At age 3, 18 of 67 (26.9%) cases were classified as having ASD. At follow-up, 17 of 18 retained their ASD diagnosis (94.4% “Stable ASD”; 13 boys, 4 girls) and 1 child, a girl, did not. Of the cases in the non-ASD group at age 3, 43 of 49 remained in this category at follow-up (87.8%; 22 boys, 21 girls) and 6 of 49 cases met ASD criteria (“Later-Diagnosed”; 3 boys, 3 girls). Sex was not associated with stability of diagnosis, based on distribution of boys and girls among the three groups (Stable ASD, Stable Non-ASD, and Later-Diagnosed ASD), Fisher’s exact = 3.4, p = 0.19. We excluded the one “lost diagnosis” case from this analysis. We informally explored the 5-year diagnostic assessment data (n = 56), although it is important to note that this was not based on a blinded clinical assessment and may thus be biased by the clinician’s knowledge of prior diagnostic classification. Of the cases seen at age 5, 16 had received ASD diagnoses at 3; 15/16 of those cases retained the diagnosis at 5, while 3 new cases received an ASD diagnosis. Between ages 5 and 9, 18/19 retained their ASD status, and 3 new cases emerged.
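The agreement statistics above can be recomputed from the 2 × 2 cross-classification of age-3 and follow-up diagnoses given in this paragraph (17, 1, 6, and 43 cases). The following is a minimal sketch rather than the authors' analysis script: the function and variable names are illustrative, and the use of an uncorrected Pearson chi-square is an assumption about how the reported value was obtained. The confidence interval is not reproduced here.

```python
# 2x2 cross-classification of blinded CBE diagnoses reported in the Results:
# rows = age 3 (ASD, non-ASD), columns = follow-up (ASD, non-ASD).
table = [[17, 1],
         [6, 43]]


def agreement_stats(t):
    """Return (crude agreement, Cohen's kappa, Pearson chi-square) for a 2x2 table."""
    (a, b), (c, d) = t
    n = a + b + c + d
    row1, row2 = a + b, c + d          # age-3 marginals
    col1, col2 = a + c, b + d          # follow-up marginals
    p_observed = (a + d) / n           # proportion of cases on the diagonal
    p_expected = (row1 * col1 + row2 * col2) / n ** 2  # chance agreement
    kappa = (p_observed - p_expected) / (1 - p_expected)
    # Uncorrected Pearson chi-square for a 2x2 table (assumption: no Yates correction).
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return p_observed, kappa, chi2


agreement, kappa, chi2 = agreement_stats(table)
print(f"agreement = {agreement:.1%}, kappa = {kappa:.2f}, chi-square = {chi2:.1f}")
# prints: agreement = 89.6%, kappa = 0.76, chi-square = 39.5
```

Under these assumptions the sketch returns the crude agreement, kappa, and chi-square values reported above.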

To explore differences between cases identified at age 3 and those identified later, we directly compared these two groups (Stable ASD vs Later-Diagnosed ASD) on measures of cognitive functioning and social-communication at both blinded assessment time-points. At age 3, omnibus MANOVA was significant for ADOS, F = 3.38; p = 0.04, partial η² (effect size (ES)) = 0.35, and approached significance for MSEL, F = 2.62; p = 0.07, ES = 0.40. Between-subjects tests revealed that later-diagnosed cases had significantly lower ADOS scores than those diagnosed by age 3 (see Table 1) and significantly higher Receptive Language scores (Table 2). At follow-up, omnibus MANOVA was non-significant both for ADOS, F = 1.45; p = 0.26, ES = 0.19 and WISC-4, F = 1.84; p = 0.19, ES = 0.28. Although non-significant, trends consistently pointed to lower ASD symptoms (Table 1) and higher cognitive abilities (Table 2) in the later-diagnosed group. Although the Stable Non-ASD group was not included in these analyses, it bears noting that the cognitive performance of the Later-Diagnosed group at both 3 and 9 years is comparable to that of the Stable Non-ASD group; indeed, the Later-Diagnosed group appears to have moderately higher mean IQ scores, in the high average range, at follow-up.

Table 1.

ADOS scores at 3-, 5-, and 9-year assessments for Stable Non-ASD, Stable ASD, and Later-Diagnosed groups; between-subject effects comparing Stable ASD versus Later-Diagnosed ASD at the 3- and 9-year assessments.

	Stable Non-ASD (n = 43)	Stable ASD (n = 17)	Later-Diagnosed ASD (n = 6)	F (Stable ASD vs Later-Diagnosed)	p* (effect size)
3-year assessment, mean score (SD)
ADOS-2: SA	3.12 (2.55)	9.35 (5.18)	3.50 (1.38)	7.26	0.014 (0.26)
ADOS-2: RRB	1.60 (1.48)	4.59 (1.91)	2.00 (1.67)	8.65	0.008 (0.29)
ADOS-2: comparison score	2.37 (1.66)	6.12 (2.57)	2.50 (1.64)	10.22	0.004 (0.33)
5-year assessment,^a mean score (SD)
ADOS-2: SA	2.42 (2.34)	8.73 (4.83)	6.00 (4.47)	–	–
ADOS-2: RRB	0.73 (0.84)	4.20 (1.90)	3.20 (2.28)	–	–
ADOS-2: comparison score	1.82 (1.40)	6.60 (2.29)	5.20 (3.27)	–	–
9-year assessment, mean score (SD)
ADOS-2: SA	3.44 (3.38)	10.29 (5.06)	5.83 (2.56)	4.19	0.053 (0.17)
ADOS-2: RRB	0.77 (1.21)	3.88 (2.32)	2.00 (0.89)	3.68	0.069 (0.15)
ADOS-2: comparison score	2.65 (2.27)	7.00 (2.48)	4.83 (1.60)	3.94	0.060 (0.16)

ASD: autism spectrum disorder; SD: standard deviation; ADOS-2: Autism Diagnostic Observation Schedule-2; SA: social affect; RRB: restricted and repetitive behavior.

Original ADOS scores were converted to revised algorithm scores (ADOS-2; Lord et al., 2012).

^a Five-year assessment data are not included in the analyses because they were not independent of the 3-year assessment; mean (SD) presented for informal comparison.

* Critical p set to 0.05/3 = 0.0167.
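The effect sizes in Table 1 appear to be partial η² values, which can be recovered from a univariate F statistic as F·df_effect / (F·df_effect + df_error). The sketch below illustrates that conversion. The degrees of freedom are not stated in the article; the values used here (1 for the two-group effect and 17 + 6 − 2 = 21 for error) are assumptions based on the group sizes in the table header.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by a univariate F statistic."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Assumed degrees of freedom (not reported in the article): two groups give
# df_effect = 1; Stable ASD (n = 17) plus Later-Diagnosed (n = 6) give df_error = 21.
DF_EFFECT = 1
DF_ERROR = 17 + 6 - 2

# F values copied from Table 1 (Stable ASD vs Later-Diagnosed).
table1_f = {
    "3-year SA": 7.26,
    "3-year RRB": 8.65,
    "3-year comparison score": 10.22,
    "9-year SA": 4.19,
    "9-year RRB": 3.68,
    "9-year comparison score": 3.94,
}

effect_sizes = {
    label: round(partial_eta_squared(f, DF_EFFECT, DF_ERROR), 2)
    for label, f in table1_f.items()
}
for label, es in effect_sizes.items():
    print(f"{label}: partial eta squared = {es:.2f}")
```

With these assumed degrees of freedom, the computed values agree with the effect sizes listed in Table 1 (0.26, 0.29, 0.33, 0.17, 0.15, and 0.16).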

Table 2.

Developmental and intellectual functioning at 3- and 9-year assessments for Stable Non-ASD, Stable ASD, and Later-Diagnosed groups; between-subject effects comparing Stable ASD versus Later-Diagnosed ASD.

	Stable Non-ASD (n = 43)	Stable ASD (n = 17)	Later-Diagnosed ASD (n = 6)	F (Stable ASD vs Later-Diagnosed)	p* (effect size)
3-year assessment, mean score (SD)
MSEL-VR-SS	120.07 (18.63)	95.50 (23.03)	118.25 (20.11)	3.50	0.077 (0.16)
MSEL-RL-SS	109.04 (13.37)	84.34 (17.58)	107.25 (11.04)	7.90	0.011 (0.29)
MSEL-EL-SS	109.86 (12.63)	89.76 (19.62)	108.00 (15.32)	3.36	0.083 (0.15)
MSEL-ELC	114.90 (16.32)	87.87 (22.76)	111.67 (19.53)	5.04	0.037 (0.21)
9-year assessment, mean score (SD)
WISC-4 VCI	104.17 (13.68)	93.85 (16.61)	112.00 (28.56)	2.99	0.106 (0.16)
WISC-4 PRI	104.07 (18.19)	95.23 (15.59)	115.40 (18.10)	5.49	0.032 (0.26)
WISC-4 FSIQ	103.76 (15.84)	90.08 (15.86)	110.40 (20.99)	5.05	0.039 (0.24)

ASD: autism spectrum disorder; SD: standard deviation; MSEL: Mullen Scales of Early Learning; SS: standard score; VR: visual reception domain; RL: receptive language; EL: expressive language; ELC: Early Learning Composite; WISC-4: Wechsler Intelligence Scale for Children, 4th ed.; VCI: Verbal Comprehension Index; PRI: Perceptual (non-verbal) Reasoning Index; FSIQ: Full-Scale Intelligence Quotient.

* Critical p set to 0.05/4 = 0.0125.
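The Bonferroni-corrected thresholds in the table notes can be applied to the tabled p values as follows. This is a minimal sketch using the age-3 comparisons only; the helper name and dictionary labels are illustrative and are not taken from the authors' analysis.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the Bonferroni critical p and a per-test significance flag."""
    critical_p = alpha / len(p_values)
    flags = {name: p < critical_p for name, p in p_values.items()}
    return critical_p, flags


# Age-3 p values copied from Table 1 (three ADOS-2 scores) and
# Table 2 (four MSEL scores), Stable ASD vs Later-Diagnosed.
ados_age3 = {"SA": 0.014, "RRB": 0.008, "comparison score": 0.004}
msel_age3 = {"MSEL-VR": 0.077, "MSEL-RL": 0.011, "MSEL-EL": 0.083, "MSEL-ELC": 0.037}

critical_ados, sig_ados = bonferroni_significant(ados_age3)
critical_msel, sig_msel = bonferroni_significant(msel_age3)

print(f"ADOS-2 critical p = {critical_ados:.4f}: {sig_ados}")
print(f"MSEL critical p = {critical_msel:.4f}: {sig_msel}")
```

At age 3 all three ADOS-2 comparisons fall below 0.0167, whereas only MSEL Receptive Language falls below 0.0125, which matches the pattern described in the text.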

We conducted detailed file reviews for the non-stable cases. Among these, only one girl did not retain her ASD classification based on the blinded assessment at age 9. She was classified at both 3 and 5 years as having ASD (3-year ADOS severity score = 9; MSEL Early Learning Composite (MSEL-ELC) within average limits). At follow-up, she retained a high ADOS severity score (7), but her ADI-R Social score was well below cut-off. She was identified by the blinded clinician as having language and intellectual delays (Full-Scale Intelligence Quotient (FSIQ) well below average) and social-communication difficulties described as the “broader autism phenotype” (BAP; Piven et al., 1997; Szatmari et al., 1998). For the purpose of our analyses, we considered this a non-ASD best-estimate diagnosis at age 9, but it bears emphasizing that this child’s clinical ASD diagnosis was not removed; once unblinded, the expert clinician concluded that the ASD diagnosis should stand.

Six cases (three males, three females) classified as non-ASD at age 3 received CBE diagnoses of ASD at follow-up. These later-diagnosed cases manifested a range of symptom presentations and severity at follow-up (ADOS severity score range = 3–7; four cases with severity scores ≥ 5), and CBE diagnoses included all DSM-IV subtypes: autism (n = 2), Asperger (n = 2), and pervasive developmental disorder not otherwise specified (PDD-NOS; n = 2). Five had been seen at age 5 (non-blinded interim assessment), three of whom were diagnosed with ASD at age 5. These included two girls who had been identified at age 3 with delayed language, and a boy for whom the clinician had identified “other behavioral challenges” at age 3; all had higher ADOS comparison scores (previously calibrated severity scores) at age 5 than at 3. The three remaining cases were diagnosed with ASD for the first time at age 9. Two had histories of other developmental or behavioral challenges at age 3, including social anxiety and behavioral inflexibility. One case had no identified challenges at age 3, but the clinician identified rigid behavior (i.e. “own agenda”) at age 5.

Conclusion

This is the first report of stability and change in blinded, CBE diagnoses of ASD from age 3 to middle childhood in a prospectively ascertained HR infant sibling cohort. Only one child with ASD at age 3, a girl, did not meet the CBE criteria for an ASD diagnosis at follow-up (although language, intellectual, and social challenges continued to be evident). Moreover, once unblinded, the clinician did not remove this child’s clinical diagnosis; as such, our data represent a conservative estimate of stability over time. Six cases were not identified until after age 3 (later-diagnosed ASD); in all of these, other social, language, and/or behavioral challenges had been noted by age 3 or 5. These children were characterized, at age 3, by lower ASD symptomatology and stronger receptive language than the cases diagnosed at that time-point. Although non-significant at follow-up, the later-diagnosed group remained somewhat higher functioning compared to cases diagnosed by age 3 and did not appear distinguishable from the Non-ASD group. This aligns with evidence of earlier diagnoses (i.e. before age 3) in more severely affected individuals in our HR cohort (Zwaigenbaum et al., submitted). Indeed, four of the six later-diagnosed cases were relatively mildly affected, with intact language and cognitive abilities, characteristics often associated with diagnoses in the school-age years (Howlin and Asgharian, 1999).

Davidovitch et al. (2015) proposed four possible explanations for the later diagnoses in the sample they reviewed: three forms of measurement error (diagnostic overshadowing by other conditions, missed symptoms, or over-diagnosis at the later time-point) and a true increase in evidence of symptoms over time. It could also be that the DSM-IV-TR thresholds used in our 3-year assessment were too stringent and that higher sensitivity might be achieved by a less stringent criterion. This would inevitably result in lower specificity, which might have more disadvantageous consequences. Given that our infant sibling study was designed to explore the emergence of ASD-related characteristics in a HR sample, it is not likely that ASD-related differences were overshadowed by other conditions or simply overlooked in our later-diagnosed group. Moreover, our observation of earlier emerging ASD-related symptoms in all later-diagnosed cases challenges the notion of later over-diagnosis. Instead, our findings support the evolution of symptoms, wherein sub-threshold ASD-related challenges in some cases do not result in impairment until later childhood when the demands of the environment begin to exceed the child’s ability (as highlighted in the Diagnostic and Statistical Manual of Mental Disorders, 5th ed.; DSM-5; American Psychiatric Association (APA), 2013), and restricted interests unfold. It remains possible that clinicians’ confidence also increases with the child’s age, resulting in greater comfort making a diagnosis in higher functioning children with milder symptoms as they reach middle childhood. These issues increase the risk of false-negative diagnoses in toddlerhood and serve to underscore the importance of following these HR children clinically beyond toddlerhood to ensure that they do not fall through the cracks.

Strengths of this study include enrollment based only on risk status (vs clinical concern), our long-term follow-up, rigorous CBE assessments, and the use of standardized tools. We do acknowledge the following limitations: our findings in this HR sibling sample may not be representative of community-referred samples (see Zwaigenbaum et al., submitted) and thus may not be generalizable to the broader population with ASD. Also, given our high rates of ASD at follow-up, some retention bias is possible (i.e. families who returned for follow-up may have been motivated by ongoing concerns about their children’s development), and we have not systematically examined participation in intervention. Finally, some analyses may have been underpowered due to our sample size.

Overall, however, findings highlight the relative stability of early diagnoses and also shed light on factors that contribute to variability in the emergence and expression of ASD from toddlerhood to middle childhood. Our results also highlight the need for continued surveillance of HR siblings who do not meet criteria for ASD at age 3 but who show signs of the BAP and might later be diagnosed with ASD. Further studies with larger and more complete samples will be needed to elucidate stability and change associated with the patterns of emergence of ASD throughout childhood.

Footnotes

Funding

This work was supported by the Canadian Institutes of Health Research, Autism Speaks, Autism Speaks Canada, NeuroDevNet, and the Simons Foundation. LZ is supported by the Stollery Children’s Hospital Foundation Chair in Autism. At the time of data collection, SB was supported by the Joan and Jack Craig Chair in Autism Research. IMS is currently supported by the Joan and Jack Craig Chair in Autism Research. We also wish to thank the families for their dedication to our longitudinal work and our committed research assistants/coordinators, and blinded clinicians (Drs S Bauld, I Drmic, H Flanagan, C Goldfarb, S Maruchuk, M Penner, V Rombough, A Solish, and J Shuster) for making this work possible.

References

American Psychiatric Association (APA) (2013) Diagnostic and Statistical Manual of Mental Disorders. 5th ed. Washington, DC: APA.

Charman, Taylor, Drew, et al. (2005) Outcome at 7 years of children diagnosed with autism at age 2: predictive validity of assessments conducted at 2 and 3 years of age and pattern of symptom change over time. Journal of Child Psychology and Psychiatry 46(5): 500–513.

Chawarska, Klin, Paul, et al. (2007) Autism spectrum disorder in the second year: stability and change in syndrome expression. Journal of Child Psychology and Psychiatry 48(2): 128–138.

Daniels and Mandell (2013) Explaining differences in age at autism spectrum disorder diagnosis: a critical review. Autism 18(5): 583–597.

Davidovitch, Levit-Binnun, Golan, et al. (2015) Late diagnosis of autism spectrum disorder after initial negative assessment by a multidisciplinary team. Journal of Developmental and Behavioral Pediatrics. Epub ahead of print May. DOI: 10.1097/DBP.0000000000000133.

Fein, Barton, Eigsti, et al. (2013) Optimal outcome in individuals with a history of autism. Journal of Child Psychology and Psychiatry 54: 195–205.

Howlin and Asgharian (1999) The diagnosis of autism and Asperger syndrome: findings from a survey of 770 families. Developmental Medicine and Child Neurology 41(12): 834–839.

Kleinman, Ventola, Pandey, et al. (2008) Diagnostic stability in very young children with ASD. Journal of Autism and Developmental Disorders 38: 606–615.

Lord, Risi, DiLavore, et al. (2006) Autism from 2 to 9 years of age. Archives of General Psychiatry 63(6): 694–701.

Lord, Rutter, DiLavore, et al. (2012) Autism Diagnostic Observation Schedule, Second Edition (ADOS-2) Manual (Part I): Modules 1–4. Torrance, CA: Western Psychological Services.

Mullen (1995) Mullen Scales of Early Learning. Circle Pines, MN: American Guidance.

Ozonoff, Young, Carter, et al. (2011) Recurrence risk for autism spectrum disorders: A Baby Siblings Research Consortium study. Pediatrics 128(3): e488–e495.

Ozonoff, Young, Landa, et al. (2015) Diagnostic stability in young children at risk for autism spectrum disorder: A Baby Siblings Research Consortium study. Journal of Child Psychology and Psychiatry 56(9): 988–998.

Piven, Palmer, Jacobi, et al. (1997) Broader autism phenotype: Evidence from a family history study of multiple-incidence autism families. American Journal of Psychiatry 154(2): 185–190.

Sandin, Lichtenstein, Kuja-Halkola, et al. (2014) The familial risk of autism. JAMA: Journal of the American Medical Association 311(17): 1770–1777.

Szatmari, Jones, Zwaigenbaum, et al. (1998) Genetics of autism: overview and new directions. Journal of Autism and Developmental Disorders 28(5): 351–368.

Turner, Stone, Pozdol, et al. (2006) Follow-up of children with autism spectrum disorders from age 2 to age 9. Autism 10(3): 243–265.

Wechsler (2003) Wechsler Intelligence Scale for Children, Fourth Edition (WISC-IV). San Antonio, TX: The Psychological Corporation.

Zwaigenbaum, Bryson, Rogers, et al. (2005) Behavioral manifestations of autism in the first year of life. International Journal of Developmental Neuroscience 23(2): 143–152.

Zwaigenbaum, Bryson, Brian, et al. (submitted) Stability of diagnostic assessment for autism spectrum disorder in a high-risk cohort. Autism Research.