Abstract
This study presents a convolutional neural network (CNN) designed to analyze resting-state functional magnetic resonance imaging (fMRI) for early detection of autism spectrum disorder (ASD) in pediatric cohorts. The architecture combines convolutional, pooling, batch-normalization, dropout, and fully connected layers, and is optimized for high-dimensional data. Preprocessing yielded 22,176 two-dimensional echo-planar samples from 126 subjects (56 ASD, 70 controls) drawn from the Autism Brain Imaging Data Exchange (ABIDE I) repository. The model, trained on 17,740 samples for 50 epochs, achieved an accuracy of 99.39%, recall of 98.80%, precision of 99.85%, and an F1 score of 99.32%, exceeding the results reported for existing computational methods. Feature-map analyses supported the model’s hierarchical feature extraction. The study also addresses the need for early detection and intervention in ASD, a neurodevelopmental disorder that affects social interaction and communication and calls for targeted therapies. To find patterns associated with ASD, a variety of early childhood screening tests serve as training sets for machine learning algorithms, and the proposed methodology uses these algorithms to characterize the autism spectrum across its many presentations. By combining multidisciplinary methods with established screening instruments, we aim to build an accurate system for early ASD detection in which algorithmic transparency, data protection, and ethical considerations are treated as essential.
This study seeks to build precise instruments for early ASD detection by promoting collaboration among specialists in neurodevelopment, psychology, and machine learning. Machine learning is a robust tool that augments the knowledge of medical practitioners. The results show how such innovation may transform early intervention and help people on the autism spectrum achieve better outcomes.
Introduction
Autism Spectrum Disorder (ASD) is a fitting label given the variety of ways in which it manifests, emphasizing how unique each affected person is. The World Health Organization (WHO) reports that one in every 100 people has autism, which emphasizes how critical it is to comprehend and treat this complex neurological disorder. ASD has a substantial impact on many aspects of life, including social interactions and abilities. Notwithstanding the unique challenges posed by ASD, it is critical to acknowledge and protect the rights of people on the spectrum regarding necessities, medical care, and higher learning.
Background
The rising prevalence of autism spectrum disorder (ASD) has increased the need for innovative and efficient diagnostic techniques. Even advanced diagnostic techniques often struggle to identify the disorder early, which prompts research into novel technologies. This study explores the intersection of machine learning, deep learning, and fMRI to present a novel approach for the early detection of ASD in children and adolescents.
Significance of early detection in autism spectrum disorder
It is impossible to exaggerate the importance of early detection in ASD because it makes timely therapies more feasible. There is substantial evidence that early therapies improve social interaction, communication, and general quality of life for those on the autism spectrum. Acknowledging the significant influence early detection measures can have on the lives of individuals with ASD, this study aims to support the larger objective of improving early detection.
Research objectives
This study has three main goals.
Create a customized convolutional neural network (CNN) and refine it specifically for the analysis of resting-state fMRI from pediatric datasets.
Train machine learning algorithms on a variety of primary screening tests so that they can identify distinct patterns that may indicate autism spectrum disorder.
Establish a robust, collaborative framework for accurate early identification of ASD in children that addresses ethical issues, data confidentiality, and algorithmic transparency.
With these goals in mind, a thorough investigation into the possibilities of deep learning and machine learning applications for improving early detection techniques in ASD can be conducted.
Literature review
This section provides a critical analysis of important works in the field of diagnosing autism spectrum disorder (ASD), highlighting various approaches as well as insights into this intricate neurodevelopmental issue.
Current state of ASD diagnosis
Diagnosing autism spectrum disorder (ASD) remains a complex challenge. Conventional diagnostic methods frequently lead to erroneous or delayed diagnoses because they rely mainly on subjective assessments and behavioral observations. This section evaluates the limitations of existing diagnostic processes and emphasizes the need for more objective, efficient, and early detection techniques.
Studies on the incidence of autism began to appear in the 1960s and 1970s, before autism spectrum disorder (ASD) was included in international diagnostic classifications with a standardized set of diagnostic criteria. Preliminary research placed the overall incidence of ASD at 0.5 to 0.7 cases per 10,000 people. Prevalence analyses conducted in at least 37 countries since the 1970s have since revealed a significant increase in reported ASD prevalence during the late 1900s and early 2000s. According to surveillance organizations such as the Autism and Developmental Disabilities Monitoring (ADDM) Network in the United States, prevalence rose from 67 cases per 10,000 people in 2000 to 145 cases per 10,000 people in 2012. The most recent statistics, from 2018, put the figure at about 230 per 10,000 children, a 243% increase since 2000. Variations in estimates over time are probably due to changes in diagnostic categories, improved techniques, greater access to diagnostic services, increased awareness, and recognition that ASD can co-occur with other developmental disorders. Apart from time trends, sources of variation include place of residence, national income, study methodology, diagnostic standards, age range, and sociodemographic factors. The complex interactions among these variables make it difficult to derive a reliable, globally unified prevalence estimate. Despite these differences across recent research, the present study aims to perform a meta-analysis that updates previous analyses and carefully examines prevalence estimates. The study also attempts to identify and assess moderating variables that may help explain the observed variation in prevalence estimates, such as methodological differences, geographic location, age demographics, and socioeconomic indicators.
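The relative increases quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper name `percent_change` is hypothetical, and the three rates are the per-10,000 figures quoted in the text.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from `old` to `new`, expressed in percent."""
    return (new - old) / old * 100.0


# Prevalence figures quoted in the text, in cases per 10,000.
rate_2000, rate_2012, rate_2018 = 67, 145, 230

change_2012 = percent_change(rate_2000, rate_2012)
change_2018 = percent_change(rate_2000, rate_2018)

print(f"2000 to 2012: {change_2012:.1f}% increase")
print(f"2000 to 2018: {change_2018:.1f}% increase")
```

The 2018 figure reproduces the roughly 243% increase stated in the text.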
Existing machine learning approaches in ASD detection
This subsection outlines and critically evaluates the machine learning techniques currently used in ASD identification. Several studies have investigated the use of machine learning algorithms to analyze behavioral and neuroimaging data for patterns linked to ASD. To identify commonalities, obstacles, and opportunities for improvement in the application of machine learning to ASD diagnosis, this review synthesizes results from several methodologies.
Drawing on the Autism Diagnostic Interview-Revised, Abbas proposes a machine learning approach that combines home video screening with a questionnaire to detect autism spectrum disorder (ASD) earlier and with greater accuracy [1]. In their work on AI-based analysis and identification of ASD, Raj and Masood report that Convolutional Neural Networks (CNN) are the most effective machine learning model among the approaches they compare, with an accuracy of 99%. Visual examination of images can also be used to identify children with ASD early: with an emphasis on visual behavior, Mazundar et al. use image analysis with YOLO and COCO object detection to provide insights into the objects that children with ASD prefer.
Diagnosis of autism spectrum disorder using brain images and machine learning [5]: After reviewing several publications, Nogay et al. conclude that Support Vector Machines (SVM) perform well at diagnosing ASD from brain scans. They recommend further study of deep learning and machine learning in this area.
An extensive assessment of early autism screening [6]: Thabtah and Peebles provide an overview that stresses the limitations of techniques such as AQ and CAST while highlighting the effectiveness of the revised M-CHAT and Q-CHAT. An eye tracking study using deep learning and data visualization [7]: Cilia et al. combine machine learning, data visualization, and eye tracking to achieve 89–91% accuracy. The study also includes a survey-based analysis and acknowledges data constraints. A study published on February 26, 2020 [8]: Using data from a single health maintenance organization, Rahman et al. report that regression models are accurate in predicting ASD outcomes. A comparison of the QCHAT-10 and MCHAT screeners [9]: Sturner et al. conclude that the MCHAT screener has superior specificity whereas the QCHAT-10 screener excels in sensitivity.
A preliminary psychometric investigation in Chile [10]: Bahemonde evaluates the effectiveness of screening tools in various cultural contexts, pointing out notable differences in QCHAT scores and highlighting the importance of cultural factors in ASD detection.
Vaishali R, Sasikala R, et al. [3] presented a method to diagnose autism spectrum disorder (ASD) by optimizing behavior sets. Their work used an ASD diagnosis dataset with 21 features from the UCI Machine Learning Repository and experimented with a binary firefly feature selection wrapper based on swarm intelligence. The purpose of the experiment was to test the hypothesis that a machine learning model could increase classification accuracy using a minimal feature subset. The findings demonstrated that ten of the twenty-one features in the ASD dataset were adequate to differentiate between patients with ASD and those without.
The hypothesis was supported: accuracy was comparable to that obtained with the complete ASD diagnosis dataset, with average precision ranging from 92.12% to 97.95%. Fadi Thabtah et al. presented an ASD screening model that adapts machine learning to DSM-5. They addressed issues with existing ASD screening methods that employ DSM-IV rather than DSM-5, as well as the benefits and drawbacks of machine learning classification of ASD. M. S. Mythili, A. R. Mohamed Shanavas, et al. analyzed student behavior and social interactions, focusing on ASD identification and categorization using Neural Network, SVM, and Fuzzy approaches with WEKA tools. J. A. Kosmicki, V. Sochat, M. Duda, and D. P. Wall et al. used machine learning to review clinical assessments of ASD and proposed a strategy for a condensed set of features for autism detection. B. Li, A. Sharma, J. Meng, S. Purushwalkam, E. Gowen, et al. (2017) [11] applied machine learning classifiers to imitation data to identify autistic adults, reporting sensitivity rates for different feature sets. Because most previous research depends on traditional machine learning techniques, which limits its efficacy, it is necessary to investigate deep learning models for ASD detection. To detect ASD across several population sets, this study evaluates the effectiveness of deep learning models against conventional machine learning methods.
Advances in deep learning for fMRI analysis
Technological developments in deep learning have demonstrated promise in transforming neuroimaging-based diagnostics, particularly when applied to functional magnetic resonance imaging (fMRI) analysis. This section examines current advancements in the use of deep learning algorithms for fMRI data, emphasizing how they might be used to find brain patterns linked to ASD. This review, which highlights gaps in the existing literature and areas for improvement, sets the stage for the new approach proposed in this study by delving into the complexities of deep learning models in fMRI analysis.
Evolution of Deep Learning Methods for fMRI Data Interpretation in Cognitive Impairment: Cognitive disorders, which include conditions such as Alzheimer’s disease, mild cognitive impairment, and subjective cognitive decline, impair memory, language, visuospatial ability, executive function, calculation, and comprehension. Accurate diagnosis is hampered by the complexity of these disorders, which span several brain regions and involve related abnormalities. The diagnosis of cognitive disorders is therefore one of the most important concerns in neuroscience. This review explores the literature on deep learning techniques for classifying fMRI data in research on cognitive impairment and offers suggestions and insights for their further development.
Increasing the Effectiveness of Deep Learning Techniques in Cognitive Impairment: Given the scarcity of fMRI data samples for cognitive disorders, it is advisable to refine regularization techniques and to tune other hyperparameters to improve the model itself. Moreover, given the time required to train deep learning models, future work might automatically adjust the learning rate in response to learning progress to reduce training time. Lastly, since existing techniques frequently concentrate on particular stages of the disease, future research might develop a deep learning technique that can examine every stage of cognitive impairment. More broadly, methods that have been successful for studying other conditions could be applied to cognitive impairment. For instance, effective convolutional neural network (CNN) methods employed in the analysis of fMRI data during video-watching tasks can be adapted, reducing training time by eliminating the need to restart training.
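One common way to adjust the learning rate in response to learning progress is to reduce it when the validation loss stops improving. The sketch below illustrates that idea in plain Python. The class name `PlateauScheduler`, the halving factor, the patience of two epochs, and the example loss values are all assumptions chosen for illustration and are not taken from the reviewed literature.

```python
class PlateauScheduler:
    """Reduce the learning rate when validation loss stops improving."""

    def __init__(self, lr: float, factor: float = 0.5,
                 patience: int = 2, min_lr: float = 1e-6):
        self.lr = lr                # current learning rate
        self.factor = factor        # multiplicative decay applied on a plateau
        self.patience = patience    # epochs without improvement tolerated
        self.min_lr = min_lr        # lower bound on the learning rate
        self.best = float("inf")    # best validation loss seen so far
        self.stale_epochs = 0       # consecutive epochs without improvement

    def step(self, val_loss: float) -> float:
        """Record one epoch's validation loss and return the updated rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.stale_epochs = 0
        return self.lr


# Illustrative validation losses: improvement stalls for two epochs.
scheduler = PlateauScheduler(lr=0.01)
losses = [0.9, 0.7, 0.71, 0.72, 0.6]
history = [scheduler.step(loss) for loss in losses]
print(history)
```

In this toy run the rate is halved once, after the second consecutive epoch without improvement.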
Examining Deep Learning Methodologies Adapted for fMRI Data Interpretation in Cognitive impairment: It is crucial to consider the dynamic nature of cognitive problems since brain function is characterized by dynamic interactions and linkages among different functional areas, a phenomenon known as brain function network dynamics. These dynamic characteristics are typically overlooked by current deep learning methodologies. This is why it is advised that training take place following the computation of coupling, or synchronization intensity between brain regions, when developing deep learning systems for the classification of fMRI data in cognitive impairment. By considering the dynamic interactions between pairs of different brain regions as well as the network dynamics of several brain regions, this method provides a more comprehensive understanding of various forms of cognitive impairment.
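The text does not name a specific coupling measure. A minimal sketch, assuming Pearson correlation between regional time series as the measure of synchronization, is given below; the sliding-window variant is one simple way to expose how coupling changes over time. The region names and signal values are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # correlation is undefined for a constant series; 0.0 by convention here
    return cov / (norm_x * norm_y)


def connectivity_matrix(series):
    """Pairwise coupling between all regions, keyed by (region, region)."""
    names = list(series)
    return {(a, b): pearson(series[a], series[b]) for a in names for b in names}


def sliding_coupling(x, y, window):
    """Coupling computed in overlapping windows to track change over time."""
    return [pearson(x[i:i + window], y[i:i + window])
            for i in range(len(x) - window + 1)]


# Toy signals for three hypothetical regions (values are illustrative only).
base = [0.1, 0.4, 0.3, 0.8, 0.6, 0.9]
signals = {
    "region_a": base,
    "region_b": [2 * v + 1 for v in base],  # moves with region_a
    "region_c": [-v for v in base],         # moves against region_a
}
matrix = connectivity_matrix(signals)
dynamic = sliding_coupling(signals["region_a"], signals["region_b"], window=3)
```

The resulting matrix, or the sequence of windowed matrices, could then serve as input features for a classifier instead of the raw signal.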
Methodology
Working model
The main goal of this project is to create and refine a customized Convolutional Neural Network (CNN) algorithm for the analysis of resting-state functional magnetic resonance imaging (fMRI) in pediatric populations. To further identify distinct patterns suggestive of autism spectrum disorder (ASD), the project also uses a variety of early childhood screening tests as training sets for machine learning algorithms. The methodology for achieving these goals is detailed below.
CNN algorithm development
The first phase entails creating a customized CNN algorithm that is best suited for analyzing pediatric cohorts’ resting-state fMRI.
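To make the role of each layer type concrete, the following sketch traces how the shape of a single 2D sample changes as it passes through a small stack of convolutional, batch-normalization, pooling, dropout, and fully connected layers. Only the layer types come from this study. The 64 × 64 input size, kernel sizes, channel counts, dropout rate, and the two-unit output are hypothetical values chosen for illustration.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical layer stack: layer types follow the text, but every
# kernel size, channel count, and rate below is an illustrative assumption.
LAYERS = [
    ("conv", {"kernel": 3, "padding": 1, "channels": 16}),
    ("batchnorm", {}),
    ("pool", {"kernel": 2, "stride": 2}),
    ("conv", {"kernel": 3, "padding": 1, "channels": 32}),
    ("batchnorm", {}),
    ("pool", {"kernel": 2, "stride": 2}),
    ("dropout", {"rate": 0.5}),
    ("dense", {"units": 2}),  # two outputs: ASD vs. control
]


def trace_shapes(height, width, channels=1, layers=LAYERS):
    """Return (layer kind, output shape) for each layer in order."""
    shapes = []
    for kind, cfg in layers:
        if kind == "conv":
            height = conv_output_size(height, cfg["kernel"], 1, cfg["padding"])
            width = conv_output_size(width, cfg["kernel"], 1, cfg["padding"])
            channels = cfg["channels"]
            shapes.append((kind, (channels, height, width)))
        elif kind == "pool":
            height = conv_output_size(height, cfg["kernel"], cfg["stride"])
            width = conv_output_size(width, cfg["kernel"], cfg["stride"])
            shapes.append((kind, (channels, height, width)))
        elif kind == "dense":
            # The feature maps are flattened before the dense projection.
            shapes.append((kind, (cfg["units"],)))
        else:
            # Batch normalization and dropout leave the shape unchanged.
            shapes.append((kind, (channels, height, width)))
    return shapes


shapes = trace_shapes(64, 64)
for kind, shape in shapes:
    print(f"{kind:<10} {shape}")
```

Tracing shapes in this way is a quick check that pooling has not reduced the feature maps too far before the fully connected layer.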
Steps in the convolutional neural network (CNN) algorithm development process for early autism spectrum disorder (ASD) detection using pediatric functional magnetic resonance imaging (fMRI)
With a focus on data processing, model design, training, validation, transparency, and ongoing improvement, this table offers an accessible overview of the essential phases in the CNN algorithm development process for early ASD identification.
Data Acquisition and Preprocessing:
Collect fMRI data from children’s groups by using the Autism Brain Imaging Data Exchange (ABIDE I) repository. To guarantee data quality, apply strict preprocessing methods such as motion correction, slice-timing adjustment, spatial normalization, and flattening.
CNN Architecture Design:
Tailor the CNN architecture to accommodate the unique characteristics of pediatric fMRI data. Incorporate convolutional, pooling, batch-normalization, dropout, and fully connected layers, optimizing for high-dimensional data interpretation.
Hyperparameter Tuning:
Fine-tune hyperparameters such as learning rates, dropout rates, and batch sizes through systematic experimentation.
Training and Validation:
Randomly split the dataset into training and validation sets. Train the model over multiple epochs, ensuring convergence and generalization.
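The random split can be sketched as follows. The 80% training fraction is an assumption inferred from the sample counts reported in the abstract (17,740 of 22,176 samples), and the function name `train_val_split` and the seed are illustrative.

```python
import random


def train_val_split(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# 22,176 is the number of 2D samples reported in the abstract.
train_idx, val_idx = train_val_split(22176)
print(len(train_idx), len(val_idx))
```

With these settings the training set contains 17,740 indices, which matches the figure reported in the abstract.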
The second goal is to train machine learning algorithms with a variety of early childhood screening tests.
Data Compilation:
Gather a variety of screening tests that are used in early childhood to detect ASD. Ensure that a range of cognitive, behavioral, and developmental assessments is included.
Feature Extraction and Integration:
Select pertinent characteristics from each screening test, taking into account how each one contributes differently to ASD patterns. Consolidate the features into a comprehensive dataset that is appropriate for machine learning algorithm training.
Algorithm Training:
Train machine learning algorithms, such as supervised classifiers, on the assembled dataset. Assess the generalization and performance of the algorithms using cross-validation techniques.
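The cross-validation step can be illustrated with a small index generator. This is a minimal sketch of k-fold partitioning, assuming five folds. The study does not state the number of folds, and the dataset size of 200 is the QCHAT_10 instance count reported later in the paper.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous folds.

    Indices are assumed to be shuffled beforehand if the data are ordered.
    """
    # Distribute any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(200, k=5)
for fold_number, (train, test) in enumerate(folds, start=1):
    print(f"fold {fold_number}: {len(train)} train, {len(test)} test")
```

Each instance appears in exactly one test fold, so averaging the per-fold scores uses every instance once for evaluation.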
Ethical considerations, data privacy, and algorithmic transparency
The third objective is to address ethical issues, protect data privacy, and establish algorithmic transparency.
Ethics Approval:
Seek approval from relevant ethics committees and adhere to ethical guidelines for data usage. Emphasize the importance of informed consent and voluntary participation.
Data Privacy Measures:
Implement robust data encryption, anonymization, and access controls to protect participant privacy. Ensure compliance with data protection regulations and standards.
Algorithmic Transparency:
Provide detailed documentation of the developed CNN algorithm and machine learning models. Implement interpretable techniques to enhance transparency, allowing stakeholders to understand the decision-making process.
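As one illustration of the data privacy measures, subject identifiers can be replaced with keyed hashes before analysis. The sketch below uses HMAC-SHA256 from the Python standard library. The function name, the identifier format, and the key are hypothetical, and this is pseudonymization rather than full anonymization, since anyone holding the key can recompute the mapping.

```python
import hashlib
import hmac


def pseudonymize(subject_id: str, secret_key: bytes) -> str:
    """Map a subject identifier to a stable, keyed pseudonym."""
    digest = hmac.new(secret_key, subject_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability


# Hypothetical key and identifiers; a real key must be stored securely
# and kept separate from the data.
key = b"replace-with-a-securely-stored-key"
for subject in ["sub-0051", "sub-0052"]:
    print(subject, "->", pseudonymize(subject, key))
```

Because the mapping is deterministic for a given key, records from the same participant can still be linked across the fMRI and screening datasets without exposing the original identifier.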
Collaborative system for early ASD identification
The final objective is to establish a robust and collaborative system for reliable early identification of ASD in children.
Interdisciplinary Collaboration:
Foster collaboration between neurodevelopmental experts, psychologists, and machine learning specialists. Encourage open communication and knowledge exchange to leverage diverse expertise.
System Integration:
Integrate the developed CNN algorithm for fMRI analysis with the machine learning models trained on diverse screening tests. Establish a unified system that combines the strengths of both approaches for enhanced accuracy.
Validation and Continuous Improvement:
Validate the collaborative system using additional datasets and real-world scenarios. Implement a feedback loop for continuous improvement, considering insights from healthcare professionals and other stakeholders.
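The text does not specify how the outputs of the fMRI model and the screening-test model are combined. One simple possibility, shown here purely as an assumption, is late fusion: a weighted average of the two predicted probabilities. The function name, the default weight, and the example probabilities are hypothetical.

```python
def fuse_probabilities(p_fmri: float, p_screening: float,
                       weight_fmri: float = 0.5) -> float:
    """Weighted average of two model probabilities (late fusion)."""
    if not 0.0 <= weight_fmri <= 1.0:
        raise ValueError("weight_fmri must lie between 0 and 1")
    return weight_fmri * p_fmri + (1.0 - weight_fmri) * p_screening


# Illustrative probabilities from the two hypothetical models.
combined = fuse_probabilities(p_fmri=0.9, p_screening=0.5)
print(f"combined probability: {combined:.2f}")
```

The weight could be tuned on a validation set, and other fusion strategies, such as training a classifier on concatenated features, are equally plausible readings of the integration step.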
Framework for autism spectrum disorder via deep-learning application of fMRI and machine learning.
In conclusion, this comprehensive methodology outlines the step-by-step process for developing a specialized CNN algorithm, leveraging diverse screening tests, addressing ethical considerations, and establishing a collaborative system for early ASD identification. This approach integrates cutting-edge technology with a multidisciplinary perspective, aiming to significantly advance the field of early ASD detection in pediatric cohorts.
Data collection
The research utilized two datasets, QCHAT_10 and MCHAT_R, for early autism detection. QCHAT_10 includes 200 instances, and MCHAT_R comprises 160 instances. These datasets capture responses to autism-related screening questions [1].
Pre-processing
Data pre-processing involved several steps to ensure data quality. The QCHAT_10 and MCHAT_R datasets [2] were examined for missing values, and none were found. Categorical data, such as responses to questions, were appropriately encoded. Demographic features like gender, age, and ethnicity were analyzed for correlation.
fMRI data acquisition
In addition to the behavioral data, functional Magnetic Resonance Imaging (fMRI) scans were conducted using a standardized protocol to capture intrinsic brain activity. Resting-state fMRI data were acquired, and preprocessing steps, including motion correction and spatial normalization, were applied to ensure data quality.
Machine learning model
A Decision Tree Classifier was chosen as the machine learning model due to its interpretability and efficacy in classifying autism risk. Features from both QCHAT_10 and MCHAT_R, including responses to specific questions, age, and gender, were used to train the model. The dataset was split into training and testing sets to evaluate model performance.
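Part of a decision tree's interpretability comes from the way it chooses each split. The sketch below shows one standard criterion, Gini impurity, applied to binary question responses. It illustrates the general technique and is not the exact configuration used in this study. The four toy records and their labels are made up.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(rows, labels):
    """Pick the 0/1 feature whose split gives the lowest weighted impurity."""
    best_feature, best_score = None, float("inf")
    for feature in rows[0]:
        left = [y for row, y in zip(rows, labels) if row[feature] == 0]
        right = [y for row, y in zip(rows, labels) if row[feature] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_feature, best_score = feature, score
    return best_feature, best_score


# Toy records with two encoded question responses (illustrative values).
rows = [
    {"A1": 1, "A2": 0},
    {"A1": 1, "A2": 1},
    {"A1": 0, "A2": 0},
    {"A1": 0, "A2": 1},
]
labels = [1, 1, 0, 0]  # 1 = ASD traits, 0 = no ASD traits
feature, impurity = best_binary_split(rows, labels)
print(feature, impurity)
```

In this toy example the response to A1 separates the classes perfectly, so it is chosen as the root split. A full tree repeats the same selection recursively on each branch.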
Deep-learning model performance
The hybrid model combining behavioral and fMRI features showed enhanced performance compared to the Decision Tree Classifier alone. The deep-learning component contributed to improved accuracy, sensitivity, and specificity.
Results and inferences
Gender distribution
The analysis revealed a higher prevalence of autism spectrum disorder among males in both QCHAT_10 and MCHAT_R datasets.
Age group focus
Toddlers aged two to three emerged as a significant focus for screening in both datasets.
Screening tool comparison
QCHAT_10 demonstrated better recall or specificity, while MCHAT_R performed well in terms of specificity.
Demographic overview
QCHAT_10 dataset
Table 3 depicts the QCHAT-10, also known as the Quantitative Checklist for Autism in Toddlers, which was developed by Prof. Tony Charman and a team of researchers at the University of London in 2007. Specifically designed for toddlers aged eighteen to thirty months, this screening test features ten questions, offering a concise yet effective assessment tool for early ASD detection. This methodology integrates data analysis with established mobile application development tools, providing a holistic approach to ASD prediction and intervention. The reliance on widely accepted frameworks and datasets strengthens the credibility and applicability of the proposed solution.
MCHAT_R dataset
Table 5 depicts the MCHAT-R, the Modified Checklist for Autism in Toddlers, Revised, which emerged in 1999 and was revised in 2009. Developed by Dr. Diana Robins, Deborah Fein, and Mariann Barton at the University of Connecticut, it serves toddlers aged sixteen to thirty months with twenty questions designed for early ASD detection.
The research relies on two datasets for analysis and prediction. The first dataset comprises twenty-four attributes and one hundred fifty-eight instances, incorporating pertinent questions from the MCHAT-R test. The dataset includes a target variable for ASD prediction. The second dataset, sourced from a common repository, pertains to the QCHAT-10 screening test. This dataset comprises two hundred instances with twenty attributes, one of which serves as the target variable.
The QCHAT-10 dataset is a collection of responses to the Quantitative Checklist for Autism in Toddlers (QCHAT) questionnaire, capturing information about individuals, their demographics, and their responses to specific questions related to autism risk. Autism Risk Prediction: The “Score” column indicates the overall risk of autism, where a higher score suggests a higher likelihood of autism. However, the specific threshold for categorizing participants into different risk levels is not provided. Individual Question Responses: The columns A1 to A10 represent individual responses to specific questions on the QCHAT. The sum of these responses (Score) is used to assess autism risk. Demographic Information: Columns such as “Age,” “Sex,” “Ethnicity,” “Jaundice,” and “family member with ASD” provide demographic and background information.
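The relationship between the A1 to A10 columns and the Score column can be sketched as a simple sum. The example assumes the responses are already encoded as 0 or 1, and the record shown is made up. No risk threshold is applied, because none is provided for this dataset.

```python
def qchat_score(record: dict) -> int:
    """Sum the ten encoded question responses A1..A10."""
    return sum(int(record[f"A{i}"]) for i in range(1, 11))


# Illustrative record; column names follow the dataset description.
record = {
    "A1": 1, "A2": 0, "A3": 1, "A4": 1, "A5": 0,
    "A6": 1, "A7": 0, "A8": 1, "A9": 0, "A10": 1,
    "Age": 24, "Sex": "m",
}
score = qchat_score(record)
print("Score:", score)
```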
AQ-10 and real dataset accuracy comparison
These tables present a clear representation of the experimental results from the study, including demographic information, screening tool comparisons, and the accuracy of different models.
The “Who_completed_the_test” column indicates who completed the QCHAT test for each participant, which may be relevant for assessing the reliability of the responses.
Data pre-processing is pivotal in enhancing data quality for mining and analysis. The dataset containing responses to MCHAT-R questions underwent pre-processing steps that included handling missing values, data cleaning, compression, and feature selection. No missing values were present, and numerical attributes facilitated correlation analysis. Categorical data (yes/no answers) were encoded as 0 and 1 for entry into an Excel file. The multi-class target variable (low, medium, high risk) was transformed to numerical values (0, 1, 2). The second dataset, featuring QCHAT-10, also underwent encoding for yes (1) and no (0) responses.
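The encoding described above can be sketched as two lookup tables. The mappings follow the text (no/yes to 0/1 and low/medium/high to 0/1/2), while the function name and the example record are illustrative.

```python
ANSWER_CODES = {"no": 0, "yes": 1}
RISK_CODES = {"low": 0, "medium": 1, "high": 2}


def encode_record(answers, risk):
    """Encode yes/no answers and a risk label as integers."""
    encoded_answers = [ANSWER_CODES[a.strip().lower()] for a in answers]
    encoded_risk = RISK_CODES[risk.strip().lower()]
    return encoded_answers, encoded_risk


# Illustrative MCHAT-R style record.
answers, risk = encode_record(["Yes", "no", "YES", "No"], "Medium")
print(answers, risk)
```

An unexpected answer or label raises a `KeyError`, which surfaces data-entry problems early instead of silently encoding them.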
Result and discussion
In this section, we delve into the results obtained from the comparison of screening tools, particularly MCHAT-R and QCHAT_10, based on the evaluation metrics presented by Kazi Shahrukh Omar and Prodipta Mondal. The assessment includes sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), and is complemented by visual representations for a comprehensive understanding.
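The four evaluation metrics follow directly from the counts of a confusion matrix. The sketch below shows the standard definitions. The counts passed in are made up for illustration and are not results from this study.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that are flagged
        "specificity": tn / (tn + fp),  # share of non-cases that are cleared
        "ppv": tp / (tp + fp),          # share of flagged cases that are true
        "npv": tn / (tn + fn),          # share of cleared cases that are true negatives
    }


# Illustrative counts only.
metrics = screening_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the screening tool itself, whereas PPV and NPV also depend on how common ASD is in the screened population.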
Comparison of screening tools
Depicts the AQ-10 dataset and real dataset accuracy with each model
Count plot: Majority of individuals affected by autism are males.
The comparison emphasizes that MCHAT-R exhibits higher sensitivity, making it adept at identifying true positive cases. Conversely, QCHAT_10 excels in specificity, effectively discerning true negative instances. Figure 2 shows the gender distribution as a count plot by gender category.
The count plot in Fig. 2 shows that most individuals affected by autism are male, corroborating the findings of the study.
Age group analysis: Count plot for age group of children below three years.
Target class distribution: Count plot for target class.
Roc curve analysis.
The age-group analysis reveals a concentration of test-taking at the age of two, with a slight increase at age three in the second dataset, as shown in Fig. 3.
Fig. 4 shows the count plot of the target-class distributions for MCHAT_R (low, medium, high) and QCHAT_10 (ASD_traits: Yes/No).
The ROC curve analysis in Fig. 5 indicates that Gradient Boosting outperforms the other models, achieving perfect classification of ASD traits. KNN, SVM, and Random Forest also exhibit strong performance, with 97% accuracy.
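For readers unfamiliar with ROC analysis, the sketch below shows how an ROC curve and its area under the curve (AUC) are obtained from classifier scores. It assumes binary labels and no tied scores. The labels and scores are made up and do not reproduce the models evaluated here.

```python
def roc_points(labels, scores):
    """ROC curve as (false positive rate, true positive rate) points.

    Assumes labels are 0/1 and that no two samples share a score.
    """
    ranked = sorted(zip(scores, labels), reverse=True)
    positives = sum(labels)
    negatives = len(labels) - positives
    true_pos = false_pos = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:  # lower the threshold one sample at a time
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Illustrative labels and scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(labels, scores))
print(f"AUC = {area:.3f}")
```

A classifier that ranks every positive above every negative yields an AUC of 1.0, which is what perfect classification corresponds to on an ROC plot.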
Mobile application interface: User interface for age groups.
The user interface allows categorization based on age groups, enhancing user-friendliness. Three categories, spanning different age ranges, facilitate a clearer understanding of the questions. The toddler-focused user interface offers advantages because both screening tools provide valuable insights. QCHAT_10 identifies ASD traits, while MCHAT-R assesses the risk level as low, medium, or high, as shown in Fig. 6. The results underscore the strengths of each screening tool, with MCHAT-R excelling in sensitivity and QCHAT_10 in specificity. The gender and age group analyses provide valuable demographic insights. ROC curve analysis highlights the efficacy of Gradient Boosting, KNN, SVM, and Random Forest. The user-friendly mobile application interfaces enhance accessibility and information gathering. Future research should address database connectivity, expand dataset sizes, and extend the study to a broader age group. Overall, the study lays a foundation for advancing autism spectrum disorder detection through machine learning and user-centric applications.
The results affirm the significance of early autism detection, particularly in specific age groups, and highlight the importance of choosing the appropriate screening tool based on the desired balance between sensitivity and specificity. The integration of deep-learning techniques with fMRI data holds promise for improving the accuracy of ASD prediction models. This study also illustrates the potential of early autism detection through machine learning, particularly in the critical developmental stages of individuals. Several key observations follow from the results. Firstly, a gender discrepancy emerges, with a higher prevalence of autism spectrum disorder among males in both datasets. This finding underscores the need for gender-specific considerations in autism research and intervention strategies. Secondly, screening is concentrated on toddlers aged between two and three, which highlights the pivotal role of early childhood assessments. The identification of this age group as a primary target for screening emphasizes the importance of early intervention and support. Furthermore, the comparative evaluation of screening tools reveals nuanced strengths. While the QCHAT_10 dataset excels in recall and specificity, the MCHAT_R demonstrates superior specificity. These distinctions emphasize the need for a tailored approach in selecting screening tools based on the specific diagnostic requirements. Nevertheless, the study is not without its limitations. The constrained dataset sizes, with 160 instances for MCHAT_R and 200 instances for QCHAT_10, point to the need for more extensive data collection. Additionally, the absence of database connectivity for the mobile application limits its potential for information gathering.
Addressing these limitations in future research will enhance the robustness of the findings. In conclusion, although this study focuses on a limited age group in line with center requirements, extending the research to a broader demographic remains an open direction.
Future research could explore larger datasets, incorporate additional neuroimaging modalities, and refine deep-learning architectures for optimal performance. The findings contribute to the ongoing efforts to enhance early autism detection methods and intervention strategies, ultimately improving outcomes for individuals with ASD.
