Abstract
Background:
Delayed diagnosis of autism spectrum disorder (ASD) remains a persistent pediatric health problem, due to limited access to competent diagnosticians and tertiary health care. A telemedicine method using a store-and-forward approach presents an opportunity to facilitate early identification and referral for intervention. This study aimed to evaluate the validity of protocol-guided video recording compared with direct assessment (DA) for diagnosing ASD.
Materials and Methods:
Children aged 18–30 months with chief complaints of delayed speech or social indifference, and Modified Checklist for Autism in Toddlers, Revised (M-CHAT-R) score of more than two were included. Parents were instructed to video record certain scenarios, which were assessed by an experienced professional based on the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) checklist for ASD. DAs using DSM-5 criteria were considered to be the gold standard of diagnosis. Diagnostic agreement, sensitivity, specificity, predictive values, and likelihood ratios were calculated to measure diagnostic validity.
Results:
The diagnostic agreement between the two methods was 82.5%. The sensitivity of video recording for diagnosing ASD was 91.3% (95% confidence interval [CI] [79.7%–100%]), while the specificity was 70.6% (95% CI [48.9%–92.2%]). The positive predictive value was 80.7% (95% CI [65.6%–95.9%]), while the negative predictive value was 85.7% (95% CI [67.4%–100%]). The positive likelihood ratio was 3.1 (95% CI [1.47–6.5]), while the negative likelihood ratio was 0.16 (95% CI [0.03–0.47]).
Conclusions:
A telemedicine approach using protocol-guided video recording evaluation has substantial validity compared with DA for diagnosing ASD.
Introduction
Early diagnosis of autism spectrum disorder (ASD) is essential to ensure timely intervention and good outcomes. 1,2 However, delayed diagnosis persists as one of the greatest challenges in ASD management, even in developed countries. 3 The Autism and Developmental Disabilities Monitoring (ADDM) Network reported that only 42% of 3-year-old children with developmental concerns received comprehensive evaluations. 4 In Indonesia, studies on ASD diagnosis and management have been limited. Known obstacles to comprehensive evaluation include limited access to tertiary health care professionals due to geographical and socioeconomic problems, as well as lengthy waiting lists. 5
Another diagnostic pitfall is dealing with atypical behaviors in children with ASD. Basic diagnostic processes for ASD include developmental history assessment and direct observation. However, direct assessment (DA) may not provide enough information as children sometimes display atypical behaviors and refuse to be examined. 6
The use of telemedicine to diagnose ASD is under exploration. Telemedicine relies on telecommunication technologies to exchange health information and provide health care services. 7 The value of diagnosing ASD through telemedicine has been investigated, using either live videoconferencing in prepared facilities or a store-and-forward approach that facilitates sharing behavioral examples via video recordings. 8–11 Live videoconferencing permits health care professionals to directly observe a patient's natural behavior, while caregivers follow a structured protocol. 8,10 Nevertheless, this method requires equipped facilities, which are not always available. 8 The store-and-forward approach enables caregivers to record videos in the course of day-to-day activities, which include natural expressions of the child's behavior. Aside from their accuracy, home recordings can be made over the course of several days, reducing the shortcomings associated with a single assessment. 9
Previous store-and-forward studies instructed parents to record videos using mobile applications with brief guidance, such as pointing, calling by name, or offering a toy, as well as recording parental concerns. These actions were done to identify ASD symptoms in children's responses. 9,11 An unpublished study noted a remarkable diagnostic value of using novel instructions for diagnosing ASD, including blocking and teasing, which had not been addressed in other telemedicine studies. 12 We aimed to compare a store-and-forward approach using protocol-guided video recordings to an independently conducted DA for ASD diagnosis.
Materials and Methods
Participants
Our diagnostic study measured the accuracy of protocol-guided video recordings compared with DA for ASD diagnosis. Participants included 40 children and their parents/caregivers who were registered as waiting list patients in a neurodevelopmental clinic in Jakarta, Indonesia. Children were between the ages of 18 and 30 months, had complaints of speech delay or social indifference, and had never been diagnosed with or treated for ASD. They completed ASD screening using the Modified Checklist for Autism in Toddlers, Revised (M-CHAT-R) and scored more than two points. Parents/caregivers gave written informed consent and agreed to make videos according to the protocol. Evaluations were conducted after participants provided informed consent, and there were no exclusions on the basis of the DA results. The study was approved by the Medical Research Ethics Committee of the University of Indonesia in Jakarta.
Procedure
Video recordings were evaluated by a psychologist with 15 years of experience diagnosing ASD. DA, used as the gold standard of ASD diagnosis, was done by a pediatric neurologist who had worked in the field for 30 years. The two diagnosticians had worked together in the same clinic for 15 years. Nevertheless, to ensure similar competence levels, before the study we performed a kappa analysis of their diagnoses in 39 children aged 18–30 months who came for ASD evaluation and had an M-CHAT-R score of more than two points. Overall agreement of diagnoses was 87.2%, with a Cohen's kappa of 0.74, indicating substantial agreement.
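The kappa statistic mentioned above corrects raw agreement for the agreement expected by chance. The following is a minimal sketch of that calculation for two raters and a binary diagnosis. The function name and the counts in the example are hypothetical and chosen only for illustration; they are not the pre-study data, whose 2×2 breakdown is not reported.

```python
def cohen_kappa(both_pos, only_r1_pos, only_r2_pos, both_neg):
    """Cohen's kappa for two raters making a binary (ASD / non-ASD) call."""
    n = both_pos + only_r1_pos + only_r2_pos + both_neg
    # Observed agreement: proportion of children given the same label.
    observed = (both_pos + both_neg) / n
    # Marginal proportion of positive calls for each rater.
    r1_pos = (both_pos + only_r1_pos) / n
    r2_pos = (both_pos + only_r2_pos) / n
    # Agreement expected if the two raters labeled independently.
    expected = r1_pos * r2_pos + (1 - r1_pos) * (1 - r2_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts (NOT study data): 50 children, 35 concordant calls.
example_kappa = cohen_kappa(both_pos=20, only_r1_pos=5, only_r2_pos=10, both_neg=15)
print(round(example_kappa, 2))
```

In this illustrative table, raw agreement is 70% but kappa is lower because half of the agreement would be expected by chance alone.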
Protocol-guided video recording evaluation
After completing M-CHAT-R screening and enrolling as participants, parents/caregivers were instructed on the recording protocol via telephone and e-mail. They were instructed to record their children in at least three scenarios of 2–5 min each: (1) playtime with others, (2) playtime alone, and (3) alarming behavior. “Playtime with others” was meant to give the child an opportunity to demonstrate social-communication skills, while “playtime alone” and “alarming behavior” were meant to capture any repetitive and stereotypic behaviors. For each scenario, we gave specific instructions on room setup and how to provoke responses from the child. During “playtime with others,” caregivers were told to provide specific toys (cars, puzzles, and blocks) and a play partner who interacted with the child by calling the child's name to get attention, asking the child to share the toys, pointing at something to direct the child's attention to it, teasing the child by offering an object but not giving it, and covering up the child's toy so that he or she was unable to play with it. During “playtime alone” and “alarming behavior,” parents were guided to record particular behaviors, such as meaningless repetitive speech, hand flapping, lining up toys, and spinning wheels. Additional instructions for each scenario suggested that parents use a tripod setup and ensure that relevant elements were in frame and clearly in view. Parents could record as many videos as they wished, as long as each contained at least one of the instructed scenarios. They e-mailed the videos within 1 week of enrollment. If videos were deemed to be poorly recorded (bad lighting/sound), parents were asked to rerecord the scenario.
The rater reviewed the videos and completed the Indonesian-translated Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) checklist for ASD by Carpenter. 13 The checklist is based on the 2 DSM-5 diagnostic criteria for ASD: (1) criterion A, persistent deficits in social communication and social interaction, and (2) criterion B, restricted, repetitive patterns of behavior, interests, or activities. Criterion A is further divided into 3 subcriteria: A1 (problems with social initiation and response), A2 (problems with nonverbal communication), and A3 (problems with social awareness and insight, as well as with the broader concept of social relationship). Criterion B is divided into 4 subcriteria: B1 (atypical speech, movements, and play), B2 (rituals and resistance to change), B3 (preoccupations with objects or topics), and B4 (atypical sensory behaviors). The checklist contains specific examples and symptoms for each subcriterion. For every example, the rater picked one answer: “present,” “not present,” or “cannot be assessed.” If the “present” column was checked, the rater noted the particular video that showed the specific behavior and commented on the judgment. The final diagnosis (ASD or non-ASD) was based on the total checklist score, with diagnostic confidence rated on a scale from 1 (doubtful) to 3 (certain). An ASD diagnosis was made if all A subcriteria and at least 2 of the 4 B subcriteria were fulfilled. The families were informed of the video recording evaluation (VRE) results after they completed the DA.
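The decision rule at the end of this paragraph can be written as a short function. This is a minimal sketch, assuming that each subcriterion has already been judged fulfilled or not from the checklist and that "2 of 4" means at least two, as in DSM-5. The names `A_SUBCRITERIA`, `B_SUBCRITERIA`, and `meets_asd_rule` are hypothetical, and treating "cannot be assessed" as not fulfilled is an assumption rather than something stated in the protocol.

```python
A_SUBCRITERIA = ("A1", "A2", "A3")
B_SUBCRITERIA = ("B1", "B2", "B3", "B4")


def meets_asd_rule(fulfilled):
    """Return True if the set of fulfilled subcriterion codes satisfies the rule.

    `fulfilled` holds codes such as {"A1", "A2", "A3", "B1", "B4"}.
    Subcriteria that could not be assessed are simply left out of the set
    (an assumption of this sketch).
    """
    all_a = all(code in fulfilled for code in A_SUBCRITERIA)
    b_count = sum(code in fulfilled for code in B_SUBCRITERIA)
    return all_a and b_count >= 2


print(meets_asd_rule({"A1", "A2", "A3", "B1", "B3"}))  # all A, two B
print(meets_asd_rule({"A1", "A2", "B1", "B3", "B4"}))  # A3 missing
```

A set of codes keeps the rule independent of how individual checklist examples are aggregated into subcriteria, which the source does not specify.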
Direct assessment
Participants were scheduled for DA within 2 weeks after completing the video recordings. The investigator was blinded to the VRE results. The DA for ASD was done through history-taking and direct observation of the child's behavior, which were guided by the DSM-5 checklist for ASD. 13 If an ASD diagnosis was not established, the investigator made another diagnosis appropriate to the child's condition.
Analysis
Subjects' VREs were compared to DAs by calculating diagnostic agreement, sensitivity, specificity, positive predictive values (PPVs), negative predictive values (NPVs), and likelihood ratios.
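These measures all derive from the 2×2 table of VRE against DA. The sketch below shows the standard formulas, applied to the counts reported in the Results (21 true positives, 5 false positives, 12 true negatives, and 2 false negatives). The function name and dictionary keys are hypothetical, and the code is an illustration rather than the authors' analysis script.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from 2x2 counts (index test vs. reference)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "agreement": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # LR+ is undefined when specificity is 1; report infinity in that case.
        "lr_positive": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        "lr_negative": (1 - sensitivity) / specificity,
    }


study = diagnostic_metrics(tp=21, fp=5, tn=12, fn=2)
for name, value in study.items():
    print(f"{name}: {value:.3f}")
```

With these counts the sketch gives a positive likelihood ratio of about 3.1 and a negative likelihood ratio of about 0.12. The latter is lower than the reported 0.16; the source does not describe how the likelihood ratios were derived, so this block should not be read as the authors' calculation.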
Results
We recruited 40 subjects who fulfilled the inclusion criteria (Table 1). Twenty-six subjects (65%) were diagnosed with ASD based on VRE, while DA identified ASD in only 57.5%. Using DA, non-ASD subjects were diagnosed with social communication disorder (20%), expressive language disorder (7.5%), functional language delay (5%), or obsessive compulsive disorder (2.5%), or were left unlabeled (5%).
Table 1. Subject Characteristics Based on Direct Assessment
ASD, autism spectrum disorder; M-CHAT-R, Modified Checklist for Autism in Toddlers, Revised; SD, standard deviation.
The overall agreement of VRE and DA diagnoses was 82.5%. The psychologist provided true-positive ASD diagnoses in 52.5% of the children (n = 21), false-positive results in 12.5% (n = 5), true-negative results in 30% (n = 12), and false-negative results in 5% (n = 2). This generated VRE sensitivity of 91.3% (95% confidence interval [CI] [79.7%–100%]), specificity of 70.6% (95% CI [48.9%–92.2%]), PPV of 80.7% (95% CI [65.6%–95.9%]), and NPV of 85.7% (95% CI [67.4%–100%]). The positive likelihood ratio was 3.1 (95% CI [1.47–6.5]), while the negative likelihood ratio was 0.16 (95% CI [0.03–0.47]).
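The source does not state how the confidence intervals for the proportions were computed. The reported bounds are close to what a normal-approximation (Wald) interval truncated to the range 0–100% would give, so the sketch below assumes that method. The function name `wald_ci` and the choice of z = 1.96 are assumptions of this illustration.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion, clipped to [0, 1]."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Sensitivity: 21 of the 23 children with ASD on DA were positive on VRE.
sens_low, sens_high = wald_ci(21, 23)
# Specificity: 12 of the 17 children without ASD on DA were negative on VRE.
spec_low, spec_high = wald_ci(12, 17)

print(f"sensitivity CI: {sens_low:.3f} to {sens_high:.3f}")
print(f"specificity CI: {spec_low:.3f} to {spec_high:.3f}")
```

Wald intervals are known to behave poorly for small samples and proportions near 0 or 1, which is why the upper bound has to be clipped here; interval methods designed for small samples, such as the Wilson score interval, would give somewhat different bounds.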
Twenty-one subjects had concordant ASD diagnoses by the 2 methods. The number of videos sent by each family in the concordant group was 6.05 ± 2.9. In this group, we found perfect concordance in criterion A fulfillment, but varying discordance in criterion B fulfillment. Subcriteria B1 (100%) and B3 (80.9%) had high agreement, but B2 (47.6%) and B4 (57.1%) had low agreement. Among subjects with ASD diagnoses, B2 was observed in only 33.3% of VRE cases (n = 7), but in 47.6% (n = 10) of DA cases. In addition, B4 was observed in only 66.7% (n = 14) of subjects using VRE, but in 80.9% (n = 17) of subjects using DA.
The five subjects who were falsely diagnosed with ASD by VRE had 40–60% disagreement on criterion A, 60% on B1, 40% on B2, 100% on B3, and 40% on B4. Moreover, two subjects had false diagnoses of non-ASD, which may have been due to the absence of some behaviors from their video recordings. Behaviors related to B1, B2, and B3 were not observed in one child's video recordings, while behaviors related to A2, A3, and B were not observed in the other's.
The rate of clinical confidence in VRE was measured. The psychologist stated that ASD diagnosis was “almost certain” in 23% of cases and “certain” in 61.5% of cases. In VRE- and DA-concordant ASD cases, the confidence level reached 23.8% (n = 5) for “almost certain” and 66.6% (n = 14) for “certain.” In the VRE non-ASD group, an “almost certain” diagnosis was confirmed in 7.1% of cases and a “certain” diagnosis was confirmed in 50% of cases. In concordant VRE and DA non-ASD cases, the psychologist was almost certain in 8.3% of subjects and certain in 58.4%. Low clinical confidence for VRE occurred due to lack of video variety and inconsistent appearance of repetitive, stereotypic behavior.
Discussion
Given the increasing prevalence of autism and discrepancies in early access to services, it is urgent to provide alternative methods for immediate diagnosis. We compared the accuracy of a telemedicine practice using protocol-guided VRE with DA for diagnosing children with ASD. To the best of our knowledge, this is the first study comparing telemedicine with a conventional diagnostic approach for ASD diagnosis in Indonesia. Earlier studies in this field were done in developed countries with sophisticated facilities. 8–11
In this study, we developed and evaluated a protocol to guide the videotaping process. The biggest challenge in developing this protocol was ensuring that videos could produce sufficient evidence to support clinical judgment. Adapting the study results of Subroto, we combined several behavior tests with high sensitivity and specificity into the protocol, including calling, blocking, and teasing tests. 12 We also decided to utilize the scenario-based protocol (Naturalistic Observation Diagnostic Assessment [NODA]) by Smith et al. with some adjustments in duration and quantity of video recordings. The NODA included collection of both developmental history and four 10-min videos, while ours had at least three 3–5-min video recordings. We aimed to shorten the recording and reviewing processes, while maintaining adequate information for diagnosis. 9 Some repetitive and stereotypic behavior examples that are commonly seen in daily practice, such as hand flapping and wheel spinning, were also included.
The heterogeneous presentation of ASD and differences in clinical judgment tend to produce variability in diagnostic outcomes. Our findings showed high diagnostic agreement, good sensitivity, and adequate specificity of video recording evaluations. Sensitivity was higher than reported by Smith et al. (84.9%) and Juárez et al. (78.95%). 8,9 Nevertheless, VRE had lower specificity; Smith et al. reported 94.4% specificity of NODA. 9 This may have resulted from a high false-positive number due to fewer video recordings and higher discrepancies in criteria A and B. Criterion A disagreement occurred when VRE failed to show good interaction skills that were found in DA. However, subcriteria B1 and B3 were easier to observe through VRE.
Utilization of PPVs and NPVs is more relevant to clinical practice. While sensitivity and specificity indicate the effectiveness of a test with respect to a trusted “outside” referent, PPV and NPV indicate the effectiveness of a test for categorizing people as having or not having a target condition. 14 We found both high PPVs and NPVs. These results differed from Smith et al. who reported high PPV (96.5%), but low NPV (54.4%). 9
The high true-positive rate yielded a high PPV. Subjects sent an adequate number of videos with good behavioral variety, which corresponded to high rater confidence in diagnosing ASD. Both methods showed high agreement for criterion A and subcriteria B1 and B3. Assessment of subcriterion B2 through video recording was challenging because expressions of adherence to routines, ritualized patterns of verbal and nonverbal behavior, and rigid thinking were sometimes impossible to document, although direct evaluation also sometimes fails to reveal such conditions. Greater disagreement arose from evaluation of B4. Although DA was superior in supporting overall clinical judgment, some B4 behaviors were more apparent during VRE. Examples of behaviors better captured by VRE include close visual inspection of objects for no clear purpose, such as holding things at unusual angles, and extreme interest in or fascination with watching the movement of objects, such as spinning wheels of toys, opening and closing of doors, and electric fans. The same issue of high disagreement in B2 and B4 was also reported by Smith et al. The developmental history questionnaire may help to compensate for this difficulty. 9 However, Stronach and Wetherby reported different results regarding repetitive and stereotypic behavior; their videos captured less varied and less informative material than direct measurement. They noted that this might have been due to lack of toys and other environmental setups for exploring these behaviors. 15
We found a low positive likelihood ratio, which may have resulted from low specificity. In contrast, the negative likelihood ratio, which correlates with sensitivity, was relatively low, implying that the probability of not having ASD after a negative VRE was rather high. 16
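The link between a likelihood ratio and the probability of ASD after a test result runs through odds: post-test odds equal pre-test odds multiplied by the likelihood ratio. The sketch below applies this to the reported likelihood ratios (3.1 and 0.16). The function name is hypothetical, and taking the pre-test probability to be the proportion of ASD by DA in this sample (57.5%) is an assumption made only for illustration.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


PRE_TEST = 0.575  # assumed: proportion with ASD on DA in this sample

after_positive = post_test_probability(PRE_TEST, 3.1)   # reported LR+
after_negative = post_test_probability(PRE_TEST, 0.16)  # reported LR-

print(f"P(ASD | positive VRE) = {after_positive:.3f}")
print(f"P(ASD | negative VRE) = {after_negative:.3f}")
```

Under this assumption, a positive VRE raises the probability of ASD from 57.5% to roughly 81%, while a negative VRE lowers it to roughly 18%. In a setting with a different pre-test probability, the same likelihood ratios would give different post-test probabilities.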
To increase specificity, additional instructions related to subcriteria B2 and B4 should be added. Supplementary questionnaires on preoccupation with certain habits and rituals may be beneficial, as may instructions to record the child's response to trivial changes (moving items on the dinner table or changing the position of toys) to demonstrate the resistance to change typical of ASD. Regarding the hyper- or hyporeactivity to sensory input or unusual interest in sensory input mentioned in B4, questions about pain tolerance can be added. Parents/caregivers should also be instructed to record videos of children hurting themselves (poking their own eyes), preoccupation with texture (sand, hairy toys, grass), licking or sniffing objects, distressed responses, persistent atypical focus on certain sounds, and significant aversion to having their hair or toenails cut. 9,13
Inter-rater reliability of VRE also needs further investigation. Differing rater experience in diagnosing ASD may lead to different outcomes in telemedicine studies. Thus, multiple rater agreement studies involving professionals with expertise in ASD practices should be done to ensure repeatability and reproducibility. 17
For future implementation of telemedicine in ASD diagnosis, we plan to develop a mobile application, accommodating all protocols and ideas drawn from this study. Technical issues, including lighting condition, audio quality, and stable view of the child's face, should be detected by the app to inform the parents/caregivers without needing the diagnostician to review the videos. Examples of videos on certain instructions (teasing, blocking) or specific behavior (hand-flapping) will be inserted to inform parents of important clinical information that should be recorded. Nevertheless, when in doubt, early referral should be done immediately, as DA is needed to confirm the diagnosis.
In conclusion, a novel approach to diagnose ASD is necessary to advance early detection and intervention, particularly in remote areas and low-resource communities. We tested the accuracy of a telemedicine diagnostic procedure using protocol-guided video recording for early identification. This work demonstrated high sensitivity, substantial specificity, and marked predictive values for remote assessment with standardized methods. Further studies regarding reliability and applicability of this telemedicine-based diagnostic approach are needed, as well as a mobile application that includes realistic operation of the protocol-guided video recording process.
Footnotes
Acknowledgments
The authors thank the families who gave their time to participate in this project, Check My Child® staff for their assistance in gathering participant information, and Anita Chandra for her assistance with video recording evaluation.
Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received for this article.
