Abstract

We appreciate Drs Durkin, Bilder, Pettygrove, and Zahorodny’s thoughtful response to our editorial regarding the most recent autism prevalence estimates from the Centers for Disease Control and Prevention’s Autism and Developmental Disabilities Monitoring (ADDM) Network. It appears that we agree on several points, disagree on others, and have been misinterpreted on at least one point.
We agree with the authors that record review is much more cost-effective than direct assessment of an entire population. This type of surveillance is particularly effective when a biomarker and a specific associated laboratory test exist, as in infectious diseases and many types of cancer. This is not the case for autism spectrum disorder (ASD).
A fundamental assumption underlying these methods is that the properties of the laboratory test (its ability to discriminate between those with and without the health condition) do not change over time. Because there is no biomarker for ASD, ADDM Network surveillance efforts depend not on a specific laboratory test but on a disparate body of information found in a child’s record. As research using ADDM data has found, gold standard diagnostic assessments are rarely used in community practice. Therefore, ADDM clinicians must search for specific words and phrases found in children’s records. There are a number of potential pitfalls with this method. First, it is completely dependent on the clarity of the information found in the records. In addition, if these words and phrases are used with different frequency over time and across sites, it could dramatically affect the observed prevalence. While ADDM Network criteria may be consistent over time, community clinician behavior may not be: more clinicians may now include the language of ASD diagnosis in records, even when children would not meet the criteria if they were directly assessed by an experienced clinician.
We disagree with the authors’ implication that the only alternative to this strategy is to conduct direct assessment of hundreds of thousands of people. Many successful prevalence studies of psychiatric and developmental disorders have been conducted using two-stage, population-based sampling. Typically, this is done by first screening a large portion of the population using an inexpensive screening tool, then more carefully assessing a large fraction of those with very high screening scores and a smaller fraction of those with low screening scores. In fact, this is exactly the method currently in use by one of the ADDM sites to validate the current surveillance strategy. We eagerly await the results of this validation study and would be delighted if our concerns about the ADDM surveillance efforts prove to be unfounded.
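To make the logic of a two-stage design concrete, the sketch below shows one common way such an estimate can be computed: the case rate found among assessed children in each screening stratum is scaled up to the full stratum, and the two strata are then combined. This is a minimal illustration under simplifying assumptions (two strata, simple random sampling within each stratum, an error-free second-stage assessment). The function name and all numbers are hypothetical and are not taken from any ADDM site or published study.

```python
def two_stage_prevalence(n_high, n_low,
                         assessed_high, cases_high,
                         assessed_low, cases_low):
    """Estimate prevalence from a two-stage (screen, then assess) design.

    n_high, n_low      -- children screened who scored high / low at stage 1
    assessed_*         -- children in each stratum directly assessed at stage 2
    cases_*            -- assessed children in each stratum found to be cases

    Assumes simple random sampling within each stratum (an assumption of
    this sketch, not a description of any particular study).
    """
    if assessed_high <= 0 or assessed_low <= 0:
        raise ValueError("each stratum needs at least one assessed child")

    # Scale the observed case rate in each stratum up to the whole stratum.
    est_cases_high = n_high * cases_high / assessed_high
    est_cases_low = n_low * cases_low / assessed_low

    # Combine the strata and divide by everyone who was screened.
    return (est_cases_high + est_cases_low) / (n_high + n_low)


# Hypothetical numbers: 10,000 children screened, 500 with high scores.
# 80% of high scorers and 10% of low scorers are directly assessed.
estimate = two_stage_prevalence(
    n_high=500, n_low=9500,
    assessed_high=400, cases_high=120,
    assessed_low=950, cases_low=5,
)
print(f"Estimated prevalence: {estimate:.3f}")
```

In this hypothetical example only 1,350 of 10,000 children are directly assessed, yet cases among low scorers still contribute to the estimate, which is the reason for assessing a fraction of that group.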
We agree with the authors that some studies have been conducted to validate the current surveillance strategy. We disagree, however, on the interpretation of those studies’ results. In the most recent validation study, Bakian et al. (2014) tested the extent to which external clinical reviewers agreed with the study reviewers when they reviewed children’s records. The authors found good agreement (intraclass correlation coefficient = 0.84). Inter-rater reliability is important, but it does not speak to the validity of the method. That is, experts can agree on what they see in the record, but may decide something else entirely if they observe the child directly.
The only validation study to directly assess children in the ADDM Network study was conducted by the same group of researchers (Avchen et al., 2011). To our point, they reported that the ADDM Network strategy yielded a sensitivity of 0.60 and a positive predictive value of 0.79. This means that 21% of identified “cases” did not in fact have ASD. They also reported a negative predictive value of 0.91. That is, 9% of children who were not identified as having ASD did in fact have ASD. Taken together, these statistics suggest that many children were misclassified.
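The percentages above follow directly from the reported statistics, since each error rate is the complement of the corresponding accuracy measure. The short sketch below spells out that arithmetic. The function and key names are ours, chosen for illustration only. The inputs are the three values quoted above. The third output, the share of true cases missed, is simply the complement of the reported sensitivity.

```python
def misclassification_rates(sensitivity, ppv, npv):
    """Return the error rates implied by three accuracy measures.

    Each rate is one minus the corresponding measure:
      1 - PPV         share of identified "cases" without the condition
      1 - NPV         share of non-identified children with the condition
      1 - sensitivity share of children with the condition who were missed
    """
    return {
        "false_cases_among_identified": 1.0 - ppv,
        "true_cases_among_not_identified": 1.0 - npv,
        "true_cases_missed": 1.0 - sensitivity,
    }


# Values as quoted in the text above.
rates = misclassification_rates(sensitivity=0.60, ppv=0.79, npv=0.91)
for name, value in rates.items():
    print(f"{name}: {value:.0%}")
```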
In the third and oldest validation study the authors cite, Van Naarden Braun and colleagues (2007) pull together theory and empirical findings to validate findings from the surveillance efforts that relied on data from 2002. They rightly point to the many positive attributes of this study design, including relative simplicity, flexibility, and data quality. There is little discussion, however, of comparing ADDM Network methods to direct clinical assessments. They point only to one study from the United Kingdom that found many false-positive cases in educational records (Tebruegge et al., 2004). Van Naarden Braun et al. also described changing practices in different sites that may have affected observed prevalence (e.g. improved data quality in West Virginia), and the very real challenges of over- and under-ascertainment using the ADDM strategy.
Finally, we think the authors misinterpret at least one of our points. We did not argue that the increase in prevalence and site variation necessarily invalidate the ADDM Network results. We do think, however, that the rapidity of site-specific changes and the extraordinary differences among sites raise concerns about how to interpret these results. Durkin and colleagues stated that this site variation is potentially due to changing awareness and changing diagnostic practices, or perhaps racial or socio-economic disparities in access to care. We agree; this is exactly our point. If these factors are driving site differences, then what is being measured is not community prevalence; rather, the estimates are influenced by factors unrelated to how many children have ASD per se.
In conclusion, we agree with the authors that it is of critical importance to obtain accurate prevalence estimates. Much has been learned through ADDM surveillance efforts. We reiterate our concerns, however, about whether these figures should be construed as representing a true and accurate prevalence of ASD in the United States.
