Abstract
Over the past 20+ years, researchers have worked toward identifying early behavioral predictors of autism spectrum disorder (ASD) and developing observation-based screeners to supplement existing parent-report methods. This study is a follow-up, 3 to 8 years later, with parents/caregivers of 57 children previously enrolled in a U.S. university-based study evaluating early ASD-risk. The original study evaluated infants’ (ages 15–35 months) ASD-risk through both observation-based and parent-report screeners. At follow-up, caregivers completed a phone interview inquiring about their child’s developmental progress and diagnostic outcomes. Results indicated screener at-risk status agreement in infancy predicted only one of the four parent-reported ASD diagnoses at follow-up. Single instrument at-risk status aligned with two additional ASD diagnoses (one per screener), and both screeners missed one ASD diagnosis at follow-up. Results did not indicate significant added utility for the observation-based screener over the commonly used parent-report screener, suggesting that ASD behavioral markers may be hard to observe at early ages.
Autism spectrum disorder (ASD) is now considered one of the most common developmental disabilities (Zeidan et al., 2022). Specifically, reported prevalence has risen from the early estimates of four to six in 10,000 first suggested in the 1960s (Lotter, 1966) to recent reports of one in 44 children diagnosed with ASD (Maenner et al., 2021). Although ASD is diagnosed with increasing frequency, we still do not fully understand its etiology. Clinicians and researchers alike are in pursuit of the earliest possible diagnostic age because early, intensive intervention demonstrates promising results (Estes et al., 2015; Zwaigenbaum et al., 2015). However, access to and support for intervention (e.g., insurance coverage) are often tied to receiving a formal diagnosis based on clinical criteria, especially in Western healthcare systems (Trump & Ayres, 2020).
To that end, there have been consistent research and policy efforts over the past 20+ years toward identifying ASD in children as young as 18 months (Kleinman et al., 2008; Stone et al., 1999; Zwaigenbaum et al., 2016). These fruitful efforts have included several prospective infant sibling studies (Landa et al., 2012, 2013; Zwaigenbaum et al., 2016) that have helped to identify behavioral and physiological manifestations of ASD as early as 18 to 24 months of age, compared with the previous diagnostic age of around 3 years in the 80s to 90s (Howlin & Moore, 1997). The long-term stability of ASD diagnosis in children ≥24 months of age is now considered well established (Kleinman et al., 2008). Still, recent studies have reported the global mean diagnostic age hovers around 5 years (van’t Hof et al., 2021). Continued research is needed to determine how parents and child development experts (e.g., pediatricians and daycare providers) can reliably identify ASD and distinguish it from other developmental disorders.
Early diagnosis is facilitated by early screening, and the tools used for each of these processes are different. Developmental screening is the process of casting a wide net to identify signs of ASD in the general population whereas more comprehensive diagnostic assessment is the process of confirming whether a child meets Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013) criteria for ASD. Diagnosis involves referral to a specialist (e.g., child psychologist or developmental pediatrician) for administration of gold-standard diagnostic tools (e.g., Autism Diagnostic Observation Schedule, Second Edition, ADOS-2, Lord et al., 2012; Autism Diagnostic Interview-Revised, ADI-R, Rutter, Le Couteur, & Lord, 2003). Current screening practices often start with parent concerns. Concerns may be voiced in a pediatrician’s office (i.e., well-baby checkup) but will likely first be investigated through use of parent-report screeners because social development is hard to evaluate in a clinical pediatric setting (Pinto-Martin et al., 2008). Retrospective studies have found that parents may have concerns as early as 15 to 18 months (e.g., Herlihy et al., 2015) but fail to mention these to their pediatrician until later or are met with reassuring “wait and see” responses from providers (Zuckerman et al., 2015). Given the stigmatization of atypical behavior that is often grounded in diagnostic labeling, parents may choose not to disclose early concern out of fear of judgment or misdiagnosis. Furthermore, healthcare providers may be rightfully cautious about pursuing early diagnostic labeling that could result in identity foreclosure for the child and/or caregivers and that may undermine the opportunity for natural developmental improvement over time.
Navigating ASD diagnostic assessment can be mentally, financially, and emotionally taxing on families and may result not only in missed opportunities for early intervention but also general dissatisfaction with the process regardless of diagnostic result (Bendik & Spicer-White, 2021).
However, some pediatricians have begun routinely asking parents to complete screening questionnaires at 18- and 24-month well-baby checkups to identify possible early signs of ASD, per recommendations from the American Academy of Pediatrics (AAP; C. P. Johnson & Myers, 2007). Administering these only to parents who express clear concerns would likely bias diagnosis in the same ways that self-report measures bias any psychological endeavor. However, adherence to AAP recommendations has varied considerably in pediatric settings, with reported estimates ranging from 17% to 73% (Carbone et al., 2020). Recent research has found variations in previous ASD-specific training and provider type (i.e., pediatricians vs. other primary care providers) are key contributors to variance in recommended screening practices (Mazurek et al., 2021). The well-established and widely used parent-report screeners such as the Modified Checklist for Autism in Toddlers (M-CHAT; Robins et al., 2001) require parents to have familiarity with the child’s history and current behavior. Such assessments are subject to biases such as overestimation of risk (i.e., high false-positive rates) due to initial parent-mentioned concerns and failure to participate in follow-up interviews that are considered crucial for clarification of the nature of initial concerns (Taylor et al., 2014). Furthermore, pediatricians often cite limited time and familiarity with ASD-specific assessments as barriers to routine screening and thorough follow-up (Snijder et al., 2021). Administering the full protocol of current initial screeners and follow-up assessments to every child therefore is a large undertaking that seems unlikely to gain clinical support. As a result, researchers have begun to explore new avenues for earlier diagnosis of ASD, including the creation of shorter, objective behavior screeners.
The Rapid-ABC or R-ABC screener created by Ousley and colleagues (2013) is one attempt at a quick, interactive, observation-based, ASD-specific screener for young infants and toddlers that could offer utility in screening for ASD. In the R-ABC, Rapid refers to the brief nature of the assessment (~4 min), and ABC refers to the protocol’s focus on eliciting social
Purpose of Study
The purpose of this study was to compare standard sources of information (i.e., M-CHAT, an American Academy of Pediatrics recommended parent-report screener) with a promising new behavioral screener (R-ABC) to assess degree of consensus in detecting likely cases of ASD. This study builds upon an archival data set, the Multimodal Dyadic Behavior Dataset (MMDB), that collected screening data as part of a larger study protocol not designed initially to investigate diagnostic outcomes (i.e., there was no follow-up time point designated to assess true clinical diagnosis or later developmental progress beyond infancy). Furthermore, this study adds to the literature by evaluating the use of observation-based screeners like the R-ABC in community-based samples where the percentage of the sample considered at risk for ASD is not pre-selected to be larger than would be expected in the general population, as is commonly seen in validation studies. We collected follow-up data (Time 2; T2) from parents of the infants in the archival dataset (Time 1; T1), turning the dataset into a longitudinal study, to directly compare these early sources of information (M-CHAT vs. R-ABC) and their relationship with later parent-reported developmental outcomes. Through a brief (10–20 min) phone interview, we gained information from parents who had previously participated in the original study about their children’s social and communicative development. The infants who participated in the original study were now older (i.e., 4–10 years old) and more likely to have received an ASD diagnosis, as research has shown that the median age of diagnosis lies around 4 to 5 years (Maenner et al., 2021). Based on the Ousley et al. (2013) initial validation study of the R-ABC, we hypothesized that there would be no significant differences between the R-ABC and M-CHAT in the number of children identified as at-risk for ASD in infancy (T1).
Furthermore, we hypothesized that both screeners would predict parent-mentioned ASD outcome during the follow-up interview (T2).
Method
Participants
The method for this study featured a follow-up interview conducted in 2018–2019 with parents of children who participated in a previous Institutional Review Board (IRB) approved study conducted at a large public university in the United States from 2011 to 2015. Participants for the original study were recruited through advertisements placed on public bulletin boards in places that serve young children (e.g., childcare and medical/health facilities) and online (e.g., parent listservs and lab website). The MMDB dataset (Rehg et al., 2013) features 160 recorded sessions of audio, video, and physiological recordings of 121 children, ages 15 to 35 months old, who interact in a toy play assessment known as Rapid-ABC or R-ABC (Ousley et al., 2013). Parents also completed a parent-report screener (M-CHAT) scored by two independent researchers. Children who had not exceeded the upper age-limit (35 months) were invited back for a follow-up session 2 to 3 months later. A subset of 39 children participated in the follow-up, where they again participated in the R-ABC assessment and parents completed the M-CHAT. The R-ABC was administered by three researchers trained by a consulting clinician. Research reliability (i.e., fidelity of both administration and scoring) was calculated to ensure all examiners reached excellent reliability (≥90% agreement) with one another and the consulting clinician.
We contacted 111 parents/caregivers (91.7%) whose children had previously participated in the original study via mailed and emailed invitations; ten caregivers were unreachable due to missing or outdated contact information. Fifty-eight interviews were completed (56% of the contactable sample), and eight additional caregivers responded to the initial recruitment letter but failed to schedule an interview. One interview was excluded as we realized post-interview that the parent had been answering questions about a child not in the original study, resulting in a final sample size of 57 interviews from parents (55 mothers, one father) of children in the original study (30 boys, 27 girls). At the time of the follow-up interview, the children (53% male) of the parents interviewed ranged in age from 4 to 10 years (M = 7.3, SD = 1.2). Child ethnicity included Caucasian (75.4%), African American (10.5%), Asian (3.5%), Hispanic/Latino (3.5%), and mixed ethnicity (7%). Additional T2 family demographics (e.g., parental education and employment, household income) are provided in Supplementary Material 1. The first author conducted all interviews via a private Skype caller account and recorded them using Ecamm software. Participants were compensated with a US$20 gift card.
Instruments
Archival Data From MMDB
Modified Checklist for Autism in Toddlers (M-CHAT)
The M-CHAT is a two-stage parent-report screener used to assess risk status for ASD in children ranging in age from 16 to 30 months (Robins et al., 2001). Parents respond Yes/No to a series of 23 questions about their child’s development. Scoring involves calculating the number of responses that indicate ASD risk. Low risk (score 0–2), Medium risk (3–6), or High Risk (7–23) categories are then used to make recommendations about follow-ups or behavior surveillance. For medium risk, a second-stage follow-up interview (FUI) designed to reduce the false-positive rate is conducted. The M-CHAT FUI (Robins et al., 1999) selects questions based on the initial scores that indicated risk for diagnosis and asks elaborative questions via a decision-tree. The follow-up determines final risk categorization where the failure of certain “critical” items or three or more items warrants a specialist referral for a formal diagnostic evaluation. For high risk on the initial M-CHAT, it is clinically acceptable to bypass the FUI and refer the child for further diagnostic assessment and intervention. For the purposes of our study, any initial medium or high M-CHAT scores prompted the FUI, and ASD risk status was updated after scoring the interview.
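The first-stage risk bands described above can be expressed as a small lookup. The following is a minimal sketch in Python, assuming the total has already been computed as the number of at-risk responses; the function names are hypothetical and are not part of the M-CHAT materials, and the second-stage FUI decision tree is not modeled.

```python
def mchat_risk_category(score: int) -> str:
    """Map an initial M-CHAT total (0-23 at-risk responses) to a risk band."""
    if not 0 <= score <= 23:
        raise ValueError("M-CHAT total must be between 0 and 23")
    if score <= 2:
        return "low"      # 0-2
    if score <= 6:
        return "medium"   # 3-6
    return "high"         # 7-23


def needs_follow_up_interview(score: int) -> bool:
    """In this study, any medium or high initial score prompted the FUI."""
    return mchat_risk_category(score) != "low"
```

For example, `mchat_risk_category(3)` returns `"medium"`, which in this study would have prompted the follow-up interview before risk status was finalized.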
Rapid-ABC Assessment (R-ABC)
The Rapid-ABC or R-ABC task is an ASD-specific behavioral assessment for young infants and toddlers in which an examiner engages the child in a semi-structured play interaction designed to elicit a broad range of social behaviors which, if absent, could be potential “red-flags” for ASD (Ousley et al., 2013). In the original MMDB study, children were allowed to engage in free play (~15 min) in efforts to establish rapport with the examiner prior to beginning the R-ABC. Children could choose to play with a variety of toys (e.g., plush animals, blocks, and pretend food/dishes) placed in toy boxes throughout the exam room floor. If a child was slow-to-warm to the examiner or shy, an additional 5 to 10 min of free-play was given to build rapport. Parents remained in the room during both the warm-up and R-ABC assessment periods, in efforts to increase the comfort of the child. The R-ABC protocol consists of five distinct stages, in the following order:
Greeting—The examiner greets the child by smiling and saying hello using the child’s name, with a slight pause to see how the child responds (i.e., scoring smile and eye contact). The examiner then asks the child if he/she is ready to play and retrieves a ball from below the table.
Ball play—Once the ball is visible over the table edge, the examiner pauses for a moment to see if the child requests or shares attention about the ball before initiating a turn-taking game. The examiner rolls the ball to the child and requests the child to roll it back (if they do not do this on their own) by extending her arms, palms up and prompting, “Can you roll it back?” This back-and-forth interaction is repeated a maximum of 3 times. The examiner then puts the ball away and retrieves a picture book.
Book reading—Once the book is visible over the table edge, the examiner again pauses to see if the child requests or shares attention about the book before initiating a social reading activity. The examiner reads the book and asks the child, “what do you see?” to elicit naming and pointing. The child can receive credit for pointing with the index finger to a picture in the book, either spontaneously or in response to the examiner’s question. The examiner also offers moments for the child to engage with the book or help turn pages by asking, “what is next?” The interaction is repeated for a maximum of three pages.
Hat—Once finished reading, the examiner closes the book before placing it overhead (pretending the book is a hat) and gasping. The examiner engages the child by asking, “Where is the book?” For this stage, there is only one chance for observation of target behaviors (smile, eye contact). Once pointing out that the book is on their head like a hat (if the child does not do so spontaneously), the examiner closes the book and places it below the table.
Tickle play—Finally, the examiner engages the child in a gentle tickling game. This social game is similar to the back-and-forth ball activity in that the examiner wiggles her fingers saying, “I am going to tickle you,” briefly pauses before moving in to gently tickle the child (saying “tickle, tickle, tickle”), then retreats. This sequence is repeated a maximum of three times. If a child expressed strong disinterest in tickling, the examiner would continue the verbal script but tap fingers on the table in front of the child instead of making bodily contact.
While moving through the activities, examiners note the presence or absence (based on a single occurrence) of 17 target socio-communicative and participatory behaviors. For example, the examiner scores whether the child initiated joint attention (e.g., looked at the ball then to the experimenter), smiled, turned book pages, or pointed. For many target behaviors, there is only one opportunity to observe the behavior in each stage of the R-ABC. For repeated interactions (e.g., ball play) a child may receive credit for that behavior (e.g., rolling the ball back to show reciprocity) on any of the turns. The examiner also rates overall engagement (i.e., how much effort was required to engage the child) during each stage of the protocol on a 3-point Likert-type scale (0 = easily engaged to 2 = significant effort required to engage the child). A rating of 0 is given if the interaction requires little to no effort for the examiner (i.e., the child was ready and eager to engage) and could be obtained even if some of the target behaviors for that stage were absent. A rating of 1 is given if the interaction requires examiner effort due to shyness or distractibility. The highest rating of 2 is given if the interaction requires extensive experimenter effort or if the child is fussy or refuses to interact. Final R-ABC scoring combines the number of absent behaviors with the engagement ratings for a composite score ranging from 0 to 27, with a suggested cutoff of 13 indicative of risk for ASD diagnosis.
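The composite can be illustrated with a short sketch. It assumes the composite is a simple unweighted sum of absent behaviors (up to 17) and the five stage-level engagement ratings (up to 2 each), which is consistent with the stated 0 to 27 range but is not spelled out in the text; a score at or above the cutoff of 13 is treated as at-risk. All names below are hypothetical.

```python
from typing import Sequence

N_BEHAVIORS = 17   # target behaviors scored present/absent
N_STAGES = 5       # greeting, ball play, book reading, hat, tickle play
RABC_CUTOFF = 13   # suggested at-risk cutoff


def rabc_composite(behavior_present: Sequence[bool], engagement: Sequence[int]) -> int:
    """Sum of absent target behaviors and per-stage engagement ratings (0-27).

    Assumes an unweighted sum; the published scoring rules may differ.
    """
    if len(behavior_present) != N_BEHAVIORS:
        raise ValueError("expected 17 behavior codes")
    if len(engagement) != N_STAGES or any(r not in (0, 1, 2) for r in engagement):
        raise ValueError("expected five engagement ratings, each 0, 1, or 2")
    absent = sum(1 for present in behavior_present if not present)
    return absent + sum(engagement)


def rabc_at_risk(score: int) -> bool:
    """True when the composite meets or exceeds the suggested cutoff."""
    return score >= RABC_CUTOFF
```

Under this assumption, a child with 10 absent behaviors and engagement ratings of 1, 1, 0, 1, and 0 would score 13 and meet the cutoff.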
Follow-Up Data Collected as Part of Present Study
Parents were asked to provide the child’s current age. If the child had been diagnosed with ASD, parents were asked to provide age of diagnosis, the diagnosing professional, and any therapies or treatments in which their child was participating. For this study, only diagnoses from those qualified to diagnose ASD (e.g., clinical psychologists and developmental pediatricians) were considered. Because the T2 research team did not have access to medical records, parents were also asked to complete a contemporaneous parent-report ASD screener. Given the amount of time that had passed since the original study (3–8 years), the M-CHAT was no longer considered age-appropriate at T2. Instead, we adopted the Lifetime version of the Social Communication Questionnaire (SCQ) by Rutter, Bailey, and Lord (2003).
Social Communication Questionnaire
The SCQ is a brief instrument designed to evaluate social functioning and communication skills concerning ASD (Rutter, Bailey, & Lord, 2003). The screener has demonstrated effectiveness in predicting ASD versus non-ASD outcomes (see Chesnut et al., 2017, for meta-analysis). Parents respond Yes/No to a series of 40 questions about their child’s development. Scores of 15+ indicate possible ASD, and parents are recommended to seek a more comprehensive evaluation of their child’s behavior. The SCQ is derived from one of the “gold-standard” assessments for ASD, the ADI-R (Berument et al., 1999), and has also been validated against the ADOS (Corsello et al., 2007) and demonstrated cross-cultural validity (Bölte et al., 2008). The Lifetime version (used as part of this study) has been validated for children 4 years of age or older and is widely recommended (Marvin et al., 2017).
Procedure
At T2, each parent was assigned a new unique identifier. T1 and T2 identifiers were connected after all data collection was completed so that the T2 interviewer did not know the child’s T1 risk status during the follow-up interview. All T2 interviews were conducted by the first author, who had not conducted any T1 assessment. A second coder transcribed the audio-recorded interviews and SCQ responses to ensure fidelity and completeness. Notes from the live administration and transcriptions were combined and discussed with the second author, who has extensive experience analyzing qualitative data, to ensure thorough and accurate interpretation. The semi-structured interview included various open-ended questions spanning the following topics: general (child’s likes/dislikes, parent concerns about development), education (placement, progress, concerns), and social relationships (Supplementary Material 2). Additional questions regarded specific clinical diagnoses, formal treatments/therapies, and demographics. Although the interview spanned various topics, child developmental progress was considered through an ASD diagnosis lens when formulating questions. The interview was designed to feel conversational (i.e., letting the parent share whatever they felt relevant or important) to illustrate a clearer, holistic picture of the child, which could have included mentioning developmental challenges not associated with ASD. Upon completion of the interview, the researcher verbally administered the SCQ. Participants were provided with a copy of their SCQ responses, accompanied by a letter describing the results, so that they could share with a pediatrician if desired.
After combining T1 and T2 data sets, T1 risk status (from the archival M-CHAT and R-ABC results) was used as the predictor variable for T2 study outcomes (parent-reported ASD diagnosis and SCQ score). To capture maximum “risk,” we decided to collapse all T1 data (intake and follow-up) and utilize an average screener score for the M-CHAT and R-ABC separately, when available, to classify “at-risk.” Parent-reported ASD diagnosis was considered the primary outcome variable and SCQ as a supporting variable to provide independent validation of ASD.
Statistical Analysis
Data were analyzed using IBM SPSS 28 software. Correlations were used to evaluate relationships between instruments. Chi-square was conducted to compare frequencies at which each screener affirmed cases meeting respective at-risk criteria for ASD diagnosis. Binomial logistic regression was performed to ascertain if T1 screeners predicted the likelihood that children had parent-reported ASD diagnosis at T2. The ability of the T1 screeners to correctly predict the dichotomous ASD/non-ASD T2 outcome was evaluated using area under the curve (AUC) scores from nonparametric receiver operating characteristics (ROC) curve analyses. Results were interpreted based on AUC benchmarks suggested by Hosmer et al. (2013): low (0.5–0.7), moderate (0.7–0.9), and high (>0.9) accuracy. Sensitivity was calculated by computing the proportion of children diagnosed with ASD at T2 who were correctly identified as at-risk on a T1 screener. Specificity was calculated by computing the proportion of children not diagnosed with ASD at T2 who were correctly identified as not at-risk for ASD on the T1 screener.
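To make the nonparametric AUC concrete, the sketch below computes it as the proportion of (diagnosed, non-diagnosed) pairs in which the diagnosed child has the higher T1 screener score, counting ties as one half, and then applies the benchmark labels cited above. This is an illustration of the general technique, not the SPSS procedure used in the study; the function names are hypothetical, and assigning the boundary values 0.7 and 0.9 to the moderate band is an assumption because the cited ranges share endpoints.

```python
from typing import Sequence


def auc_mann_whitney(scores: Sequence[float], diagnosed: Sequence[bool]) -> float:
    """Nonparametric AUC: share of (diagnosed, non-diagnosed) pairs in which
    the diagnosed child has the higher screener score; ties count as half."""
    pos = [s for s, d in zip(scores, diagnosed) if d]
    neg = [s for s, d in zip(scores, diagnosed) if not d]
    if not pos or not neg:
        raise ValueError("need at least one child in each outcome group")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_benchmark(auc: float) -> str:
    """Label an AUC using the low / moderate / high bands cited in the text.

    Boundary handling (0.7 and 0.9 counted as moderate) is an assumption.
    """
    if auc > 0.9:
        return "high"
    if auc >= 0.7:
        return "moderate"
    return "low"
```

With made-up scores of 1, 2, 3, and 4 where the second and fourth children are diagnosed, three of the four cross-group pairs favor the diagnosed child, so the AUC is 0.75 and falls in the moderate band.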
Results
Agreement Amongst Early Sources (T1) in At-Risk Classification
Participant demographics from T2 are found in Supplementary Material 1 and were found to be representative of the T1 population. The cutoff scores for at-risk (AR) status at T1 were 3+ (i.e., medium or high risk) out of 23 for the M-CHAT and 13+ out of 27 for the R-ABC. In the original MMDB study (T1), the average M-CHAT score was 1.1 (SD = 2.2; range = 0–10) and the average R-ABC score was 7.5 (SD = 4.8; range = 0–19). The two instruments were not significantly correlated with one another (r = .15, p = .27). At T2, the average SCQ score was 4.4 (SD = 5.3, range = 0–29). T1 M-CHAT was significantly correlated with T2 SCQ (r = .76, p < .001), whereas T1 R-ABC was not correlated with T2 SCQ (r = .06, p = .66). Of the 57 children in the T1 sample, 5 (8.8%) met criteria for at-risk for ASD on the M-CHAT, whereas 10 (17.5%) met criteria for at-risk for ASD on the R-ABC. Table 1 illustrates where the two instruments did and did not overlap in classifications. McNemar’s test revealed no significant difference in the proportion of children considered at-risk for ASD from the M-CHAT and R-ABC, p = .23.
Case Frequencies Meeting At-Risk Criteria for ASD on the M-CHAT and R-ABC.
Note. McNemar’s Test p = .227. ASD = autism spectrum disorder; M-CHAT = Modified Checklist for Autism in Toddlers; R-ABC = rapid-ABC.
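For readers unfamiliar with McNemar’s test, the following sketch shows the exact (binomial) form, which depends only on the two discordant cells of a paired 2 × 2 table. Whether the exact or the chi-square form was run in SPSS is an assumption here. The discordant counts used in the usage note (3 and 8) are illustrative values chosen to be compatible with the marginal totals reported in the text (5 and 10 of 57 children); they are not copied from Table 1.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases flagged by the first screener only (hypothetical input).
    c: cases flagged by the second screener only (hypothetical input).
    Under the null hypothesis, min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Illustrative discordant counts only; not the study's Table 1 cells.
p_value = mcnemar_exact_p(3, 8)
```

Because the test ignores concordant pairs, a large number of children classified the same way by both screeners does not affect the p-value.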
Early Sources (T1) Predicting Parent-Mentioned ASD Outcome (T2)
During the follow-up interview, parents were asked to report any diagnoses their child had received. Four parents (7%) reported that their child had received an ASD diagnosis since T1, and 53 (93%) reported that their child had not. Twelve parents (21%) reported that their child had received other, non-ASD diagnoses, including attention deficit hyperactivity disorder (ADHD), dyslexia, epilepsy, 22q deletion syndrome, speech delay, mixed expressive receptive language disorder, and other rare, non-cognitive motor disorders. Parents could indicate multiple diagnoses when applicable (e.g., ASD and speech delay). Of the four children whose parents reported a confirmed ASD diagnosis by T2, only two had SCQ scores above the cutoff. The logistic regression model was statistically significant, χ2(2) = 6.99, p = .03. The model explained 29% (Nagelkerke R2) of the variance in ASD diagnosis and correctly classified 89.5% of the cases. M-CHAT score was the only statistically significant variable (see Table 2). Higher T1 M-CHAT scores were associated with an increased likelihood of T2 parent-reported ASD diagnosis.
Logistic Regression Predicting the Likelihood of T2 ASD Diagnosis From T1 Screener Scores.
Note. ASD = autism spectrum disorder; CI = confidence interval; M-CHAT = Modified Checklist for Autism in Toddlers; R-ABC = rapid-ABC.
Significant predictor (p < .05).
For this sample, the area under the curve was .89 (SE = .06, p < .001) for the M-CHAT and .62 (SE = .16, p = .44) for the R-ABC. The M-CHAT thus showed moderate accuracy in separating children with and without a parent-reported ASD diagnosis at T2, whereas the R-ABC had low discriminative validity. Fifty-two children (91%) had been considered low-risk, three (5%) medium-risk and two (4%) high-risk based on the M-CHAT at T1. Of these five medium/high-risk children, two went on to receive ASD diagnoses (one medium-risk and the other high-risk) and the other three went on to receive other, non-ASD diagnoses (i.e., genetic disorders, speech delay, sensory processing disorder, and dyslexia). Two additional children who went on to have ASD diagnosis mentioned at T2 were considered low-risk based on the M-CHAT at T1. For this sample, an M-CHAT cutoff as low as ~1 appears to yield a sensitivity of .88 and a specificity of .73.
In the original validation study conducted by Ousley et al. (2013), a score of 13 on the R-ABC yielded high specificity (.96) and sensitivity (.83) for correctly classifying at-risk for ASD and not-at-risk for ASD using an “all-information available” clinical evaluation. In this study, this cutoff score classified 10 children as at-risk at T1; of these, two had received ASD diagnosis by T2. Of the remaining eight who had been considered at-risk at T1, two received other non-ASD diagnoses (i.e., genetic disorders, ADHD, and dyslexia), and six had no T2 clinical diagnoses. For this sample, no R-ABC cutoff score yielded both high sensitivity and high specificity. A cutoff of ~8 yielded a sensitivity of .75 and a specificity of .60.
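The counts reported above also allow sensitivity and specificity at the standard cutoffs to be worked out by hand. The sketch below does this arithmetic from the figures given in the text (57 children, four parent-reported ASD diagnoses, five flagged by the M-CHAT and ten by the R-ABC, two true positives each). The resulting values are derived here for illustration and are not reported by the authors; the function name is hypothetical.

```python
from typing import Tuple


def sens_spec_from_counts(flagged: int, flagged_with_asd: int,
                          asd_total: int, n: int) -> Tuple[float, float]:
    """Sensitivity and specificity from summary counts.

    flagged: children at-risk on the T1 screener.
    flagged_with_asd: flagged children with a T2 ASD diagnosis (true positives).
    asd_total: all children with a T2 ASD diagnosis.
    n: total sample size.
    """
    tp = flagged_with_asd
    fn = asd_total - tp          # diagnosed but not flagged
    fp = flagged - tp            # flagged but not diagnosed
    tn = (n - asd_total) - fp    # neither flagged nor diagnosed
    return tp / (tp + fn), tn / (tn + fp)


# Counts taken from the text; the resulting ratios are derived, not reported.
mchat_sens, mchat_spec = sens_spec_from_counts(5, 2, 4, 57)   # cutoff of 3+
rabc_sens, rabc_spec = sens_spec_from_counts(10, 2, 4, 57)    # cutoff of 13+
```

On these counts, both screeners identify two of the four diagnosed children (sensitivity of .50), while specificity is 50/53 for the M-CHAT and 45/53 for the R-ABC. With only four diagnosed children, each additional hit or miss shifts sensitivity by .25, so these estimates are very imprecise.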
Discussion
The purpose of this study was to compare the frequencies of cases meeting “at-risk” criteria for an ASD diagnosis for a well-known parent-report screener (M-CHAT) and a novel direct-observation instrument (R-ABC) to assess convergence in a general population sample. Furthermore, we evaluated how these early sources of information on a child’s social-communication development related to subsequent ASD diagnoses, as reported by caregivers. Findings suggested 81% agreement between the M-CHAT and R-ABC (using respective cutoff scores) on early risk status with the R-ABC identifying more cases (10 vs. five) as at-risk for ASD diagnosis. However, consensus (i.e., where both T1 assessments indicated risk) predicted only one of the four parent-reported ASD diagnoses at T2. Of the three other children diagnosed at T2, two were correctly classified as at-risk by only one screener, and both screeners missed the other. Although unexpected, this finding likely reflects the current state of the science with respect to early signs of ASD-risk. ASD is a condition with a spectrum of symptoms that emerge on different developmental trajectories with considerable individual variability (e.g., Landa et al., 2012, 2013). Studies have suggested that there is not one clear trajectory solely predicting ASD outcome, and a considerable number of infants will look as if they are typically developing (40% in Landa et al., 2012) whether or not they are later diagnosed. Further investigation into each of these four cases and incorporating additional information such as specific behaviors flagged from audio/video recordings available in the MMDB database could shed light on more subtle behavioral differences, especially in the two cases flagged as at-risk at T1 by one screening instrument but not the other. A case-study approach could also consider age differences at T1.
It is possible that the “missed” child in the present sample had either not developed ASD symptoms at T1 (i.e., late-onset) or came in at an age when early symptoms had already diminished or improved to a point where they were not flagged (see Landa et al., 2013). Further investigation as a single case would be needed to make strong claims in either direction.
We also considered whether one T1 screener was more predictive of T2 diagnostic outcome than the other. That is, of those children who failed both screeners compared with those children who failed only one, would the one screener have been sufficient. Ideally, a brief behavioral observation instrument such as the R-ABC could further tease apart these “false” rates (i.e., catch misses and lower false-positive rate) resulting from the existing parent-report methods (e.g., M-CHAT) currently recommended as initial steps in widespread ASD screening. However, this study found that the R-ABC did not add significant discriminative information. ROC analysis revealed that for this sample, the M-CHAT appeared to be adequate as a single source of information when correctly classifying later parent-reported ASD diagnoses with good specificity/sensitivity. In this study, the observational method utilized in the R-ABC may not have captured “true risk” in a way that differed from the reports of parents alone.
One possible explanation is that the R-ABC assessment lacks external validity. Although the R-ABC assessment in this study was administered in a lab designed to look like a playroom and with substantial warm-up playtime (~15 min) in hopes of familiarizing the children with the experimenter and setting, it is still possible that the behaviors captured were not typical of the infants in their familiar environment. Disinterest or a “bad day” for an infant could have yielded inaccurate data within the present sample and might warrant a further investigation into the annotated audio/video recordings from the MMDB archival dataset. Furthermore, it is important to note that the T1 data was collected by research reliable examiners. The R-ABC is intended to be simple enough that potential users would not need extensive autism-specific knowledge or in-depth clinical experience to administer the assessment (Ousley et al., 2013). However, in this study, the T1 research team developed experimenter scripts and manual to clarify fine-grained scoring scenarios (e.g., scoring if eye-contact was made prior but not in direct response to the greeting). Furthermore, because the T1 sessions were audio/video recorded, it was also possible to go back and review moments of the assessment when scoring. In the field, examiners may not have such access. Future studies including examiners with varying ASD experience can better assess the skill level required prior to widespread R-ABC implementation.
For ASD screeners, it is possible that false positives are mostly children who go on to receive other non-ASD diagnoses. In this study, two out of eight false positives on the R-ABC at T1 went on to have other non-ASD diagnoses by T2. For the M-CHAT, all three T1 false positives went on to have other non-ASD diagnoses mentioned, and two of these children had been flagged by both screeners. As part of the initial Ousley et al. (2013) study, the authors pointed out that future research needed to be conducted to determine whether the R-ABC could discriminate ASD from other developmental disorders. ASD has various behavioral, cognitive, genetic, and medical comorbidities (e.g., anxiety, intellectual disability, ADHD, and seizures) that make the development of biomarkers especially challenging (M. H. Johnson et al., 2015). Many of these clinically relevant behaviors were mentioned by parents at T2 when ASD diagnosis was negated. This illuminates the need for continued investigation of not only the R-ABC but any existing or emergent early screening instruments for the ability to discriminate ASD from other developmental disorders. While sensitivity alone (i.e., flagging any sort of developmental delay, ASD-specific or not) can guide opportunities for early behavioral intervention during key developmental periods, clinical specificity often facilitates the understanding and shaping of treatment plans. This is especially true in Western health care systems, where access to and funding for treatment and intervention are often tied to DSM-5 categorizations. However, recent research has advocated for function-based diagnosis as a replacement for formal diagnoses in determining access to key services (e.g., applied behavior analysis; Trump & Ayres, 2020).
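As a rough illustration of how the flagged and false-positive counts reported in this Discussion translate into classification metrics, the following Python sketch derives at-risk-status sensitivity, specificity, and predictive values for each screener. The inputs are the counts given in this article (57 children, four parent-reported ASD diagnoses at T2, five M-CHAT flags with three false positives, 10 R-ABC flags with eight false positives). The sketch assumes that all 57 children had a valid at-risk status on both screeners, which the text does not state explicitly, and the function and variable names are hypothetical. These threshold-based values are not a substitute for the ROC analysis reported in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenerCounts:
    """Counts for one screener at T1 (names are illustrative, not from the study)."""
    flagged: int          # children flagged as at-risk at T1
    false_positives: int  # flagged children without a parent-reported ASD diagnosis at T2


def screening_metrics(counts: ScreenerCounts, n_total: int, n_cases: int) -> dict:
    """Derive a 2x2 table from the counts and return standard screening metrics."""
    tp = counts.flagged - counts.false_positives  # flagged and later diagnosed
    fp = counts.false_positives                   # flagged, not later diagnosed
    fn = n_cases - tp                             # later diagnosed, not flagged
    tn = (n_total - n_cases) - fp                 # neither flagged nor diagnosed
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumption: every one of the 57 children has an at-risk status on both screeners.
N_TOTAL = 57
N_ASD_AT_T2 = 4
SCREENERS = {
    "M-CHAT": ScreenerCounts(flagged=5, false_positives=3),
    "R-ABC": ScreenerCounts(flagged=10, false_positives=8),
}
RESULTS = {
    name: screening_metrics(counts, N_TOTAL, N_ASD_AT_T2)
    for name, counts in SCREENERS.items()
}

if __name__ == "__main__":
    for name, metrics in RESULTS.items():
        summary = ", ".join(f"{key}={value:.3f}" for key, value in metrics.items())
        print(f"{name}: {summary}")
```

Under these assumptions, both screeners flag the same number of later-diagnosed children, so any difference between them in this sketch comes from the number of false positives.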
It is also possible that children who were displaying at-risk behaviors flagged at T1 were able to gain access to therapy or early intervention that ameliorated concerns and made an eventual diagnosis less likely to be mentioned at T2. This possibility is supported by an exploratory evaluation of the qualitative T2 parent responses. For all five children flagged by the M-CHAT, parents reported participation in therapy or intervention (e.g., speech, occupational, and play/social), and all had received some clinical diagnosis (two ASD). This is perhaps not surprising, as parents whose T1 M-CHAT responses yielded at-risk status had been given a letter from the research team explaining that while the results were not a formal diagnosis and were collected only for research purposes, parents may want to discuss them with a pediatrician or developmental professional. For the R-ABC, no such letter was given since the assessment was not standardized. However, parents of 7 of the 10 children flagged as at-risk by the R-ABC at T1 (a status that was not shared with parents) also reported participation in therapy or intervention, including four children who had not also been flagged by the M-CHAT. None of these four children had received any formal diagnosis of ASD or other developmental disorders at T2. It is possible that the behaviors flagged in these cases by the R-ABC were addressed with intervention. Finally, when discussing ASD diagnosis and differing developmental trajectories, it is also important to note that emergent approaches to developmental assessment and intervention celebrate neurodiversity and utilize strength-based approaches which, unlike the traditional medical models, do not view all ASD-associated risk behaviors as pathological (see Pellicano & den Houting, 2022 for review).
It is therefore also possible that misalignment of early at-risk status and later diagnosis is in part due to an underlying transition away from the traditional medical-based approaches that defined ASD and diagnosis as “Yes/No” categorizations, as in this study. This emergent shift in the ASD literature has been well received both in the scientific community and among the general public and should be considered when operationalizing ASD developmental outcomes in future studies.
Limitations
We acknowledge the limitations of this study. First, we recognize that as a convenience sample limited in size, the potential number of children at T2 with an ASD diagnosis would inevitably be small. It is particularly challenging to gather evidence of validity on general population samples without longitudinal follow-ups on large numbers of children. Much of this work is naturally embedded within clinical settings, with surveillance studies frequently acquiring data from pediatrician referrals (e.g., Robins, 2008). Community samples may provide less chance of “finding ASD” in the general population. However, we note that the prevalence rate of parent-reported ASD diagnosis at T2 in this study (4/57, approximately 7.0%) was higher than the current ASD prevalence rate (one in 44, approximately 2.3%; Maenner et al., 2021). Together, clinical and community studies will help create consistent, highly sensitive, and specific screening protocols that could be mandated on a larger scale.
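To make the prevalence comparison concrete, the following Python sketch converts the two rates cited above (4/57 in this sample and one in 44 from Maenner et al., 2021) to proportions and attaches a 95% Wilson score interval to the sample rate. The Wilson interval is an illustrative addition and not an analysis reported in this study, and the function and variable names are hypothetical. With only 57 children, any interval around the sample rate will be wide, which is consistent with the sample-size limitation noted above.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Rates as stated in the text.
SAMPLE_RATE = 4 / 57     # parent-reported ASD diagnoses at T2 in this sample
REFERENCE_RATE = 1 / 44  # prevalence cited from Maenner et al. (2021)

# Illustrative uncertainty estimate for the sample rate (not from the study).
LOW, HIGH = wilson_interval(4, 57)

if __name__ == "__main__":
    print(f"Sample rate:    {SAMPLE_RATE:.3f}")
    print(f"Reference rate: {REFERENCE_RATE:.3f}")
    print(f"95% Wilson interval for sample rate: ({LOW:.3f}, {HIGH:.3f})")
```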
Another limitation of this study was that ASD and other parent-reported clinical diagnoses were not confirmed through medical record review or a direct clinical assessment. Furthermore, we only considered parent-reported diagnoses from those qualified to diagnose ASD (e.g., clinical psychologists and developmental pediatricians). It is therefore possible that children who had received other (e.g., school-based) evaluations related to ASD but had not sought out or had access to additional clinical evaluation were excluded. Future studies interested in diagnostic outcomes from the MMDB sample could add direct assessments of the children or gain parent permission for copies of clinical assessments as well as educational assessments. This approach could help clarify which clinical behaviors and which behaviors observed in educational settings are more robustly present at T2 and how they relate to observational details present or absent at T1. It is further possible that some children in this sample had not yet received ASD diagnoses by T2. The median age of diagnosis still lies around 4 to 5 years (Maenner et al., 2021; van’t Hof et al., 2021), and T2 follow-ups occurred as early as 4 years of age, depending on when the child had entered the T1 study. Furthermore, diagnostic age is not uniform across all racial/ethnic and socioeconomic status (SES) groups (e.g., Caucasian and high-SES children diagnosed earlier) or levels of symptom severity (i.e., more severe diagnosed earlier; see Daniels & Mandell, 2014 for review). The present sample (see Supplementary Material 1) was largely educated (i.e., the majority of parents held a bachelor’s degree or higher [n = 51]), primarily married (n = 52), and reported household incomes of US$75,000+ (n = 49). Future studies including samples with greater sociodemographic diversity will be important for validating behavioral screeners prior to widespread implementation.
Another limitation of utilizing all parent-report data is that parents of older children may have had more difficulty remembering characteristics that their child had at ages 4 to 5 years than did parents of younger children. Finally, parents were asked to complete the SCQ screener via phone interview. While a single researcher completed all interviews, it is possible that having questions read aloud influenced how parents interpreted and responded to certain items. This could have led to different SCQ scores than would have been obtained had the screener been administered in person or via another modality (e.g., asked to fill out and return via mail). In the future, a longitudinal study that includes an in-person observation confirming diagnosis would improve upon this study’s methods by providing an additional objective follow-up measure and adding direct contact with the child and caregiver at follow-up.
Conclusion and Future Directions
In sum, this work features a follow-up study on an archival sample for ASD risk that was collected from the general population, an approach that has strong potential for broader ecological validity of risk screeners for the population at large. Results suggest that a novel, brief behavioral assessment did not predict ASD diagnosis above and beyond currently used parent-report screening tools. While it is possible that the R-ABC is a less robust tool for evaluating younger infants, our preliminary look at this question did not support this notion, as the four ASD-diagnosed children at T2 represented the sample’s full age range at T1. Another possibility, as mentioned above, is that the ASD characteristics that the R-ABC focuses on may overlap more with other developmental disorders, which may reduce any advantage over the M-CHAT for predicting ASD. New screening instruments featuring objective behavioral assessments should be tested in samples that are also screened using well-known methods and followed longitudinally to evaluate predictive validity for not only ASD but also developmental delays that show similar symptomatology in infancy. Validated behavioral assessments used in conjunction with the current parent-report methods could constitute a comprehensive surveillance program for identifying children in need of further assessment. If high levels of ASD-specific predictive ability were achieved, brief behavioral assessments could yield a practical screening protocol administered to all children in routine pediatric checkups. Widespread screening will likely further increase global prevalence rates but will also decrease the likelihood that children who present early risk behaviors are overlooked and miss the critical early intervention window for both ASD and other developmental disorders.
Supplemental Material
Supplemental material, sj-docx-1-foa-10.1177_10883576241232904 and sj-docx-2-foa-10.1177_10883576241232904, for The Utility of the R-ABC in Assessing Risk for Autism Compared With the M-CHAT: An Exploratory Study by Sidni A. Justus, Jenny L. Singleton and Agata Rozga in Focus on Autism and Other Developmental Disabilities
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
