Deep learning-based feature selection and prediction system for autism spectrum disorder using a hybrid meta-heuristics approach

Abstract

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder whose prevalence is increasing worldwide. Machine learning (ML) and deep learning (DL) approaches have gained interest because they can increase diagnostic accuracy and reduce the physician’s workload. These artificial intelligence-based applications learn and detect patterns automatically from collected data. ML approaches are used in various applications where traditional algorithms have failed to obtain good results. Their major advantage is the ability to produce consistent and accurate predictions by exploiting non-linear and complex relationships among the features. In this paper, deep learning with a meta-heuristic (MH) approach is proposed to perform feature extraction and feature selection. The proposed feature selection phase has two sub-phases: DL-based feature extraction and MH-based feature selection. A convolutional neural network (CNN) model is implemented to extract the core features by learning a relevant data representation in a lower-dimensional space. A hybrid meta-heuristic algorithm called the Seagull-Elephant Herding Optimization Algorithm (SEHOA) is used to select the most relevant and important features from the CNN-extracted features. Patients with ASD are identified using a long short-term memory (LSTM) network as the classifier. The model detects ASD using the fMRI image dataset ABIDE (Autism Brain Imaging Data Exchange) and obtains promising results. Five evaluation metrics are used: accuracy, precision, recall, F1-score, and area under the curve (AUC). The validated results show that the proposed model performed better than the compared classifiers, with an accuracy of 98.6%.

Keywords

Autism spectrum disorder, meta-heuristic, deep learning, convolutional neural network, seagull and elephant herding optimization, LSTM, fMRI

1 Introduction

Autism spectrum disorder (ASD) is a developmental disability in which a person’s development and behaviour differ from typical human behaviour. The disorder is mainly attributed to neurological problems in the brain, and some studies describe it as a genetic condition. Autistic people differ from neurotypical people in attention to certain activities, in social communication, and in perceptual abilities. Some people develop symptoms in childhood, when they are difficult to identify. The severity of autism is unique to each person: intellectual ability varies from low to high, and behaviour ranges from manageable to extreme. Due to this heterogeneity, it is difficult to predict ASD [1, 2]. According to a recent WHO estimate, one in every 100 children has ASD.

ASD prediction research uses various advanced neuroimaging data for analysis. The tools include structural and functional magnetic resonance imaging (fMRI), positron emission tomography (PET), electroencephalography (EEG), magnetoencephalography (MEG), and novel protocols. Among these, task-based and resting-state fMRI provide a large amount of data. Machine learning offers frameworks that learn from data to perform prediction and classification. During the last decade, different technologies have been used to classify ASD and typical controls (TC), and the majority of researchers use machine learning techniques. Other brain disorders such as depression, schizophrenia, and Alzheimer’s disease are also predicted using machine learning techniques. Growth in the amount of available data has led to exploration using deep learning algorithms, which handle massive data processing well. Recently, optimization has been improved using metaheuristic approaches.

As of now, ASD is not easy to predict, and technology that enables early prediction would be welcome. This motivates many researchers to develop automated learning techniques to predict ASD [3]. Feature selection is the first preprocessing step for very high-dimensional data. Machine learning problems use feature selection to improve efficiency [4, 5]. Feature selection also helps to remove noise and makes the data easier to understand. It is typically performed using filter and wrapper methods. Recently, optimization algorithms have been used with wrapper techniques to improve selection efficiency. Nature-inspired algorithms search over candidate feature subsets to find the optimal solution. Problems with a large search space require nature-inspired techniques to search for globally relevant features when few samples are available.

Nature-inspired algorithms work by imitating the behaviour of natural systems such as ant colonies, bee colonies, seagulls, and elephant herds. These emerging models support feature classification using optimization algorithms. In major applications, such as the diagnosis of challenging diseases, bio-inspired techniques can provide the optimal solution. Further, to improve optimization quality, deep learning is combined with meta-heuristic methods. A convolutional neural network combined with metaheuristic optimization techniques is capable of outperforming other networks.

The main contributions of this paper are as follows:

This paper proposes a novel approach to selecting the most important features from the complex input data of fMRI images for ASD detection.

The most relevant and important features are selected using the proposed novel method called the CNN-SEHOA approach.

CNN is used to extract features from the input data. Using the extracted features, the relevant and most important features are selected using a hybrid MH approach called the seagull-elephant herding optimization algorithm (SEHOA).

A deep learning algorithm called LSTM has been used for classification. The proposed feature selection-based classification system is evaluated using the fMRI (functional magnetic resonance imaging) images from the ABIDE dataset.

The proposed ASD prediction model is evaluated and compared with the existing approaches, and the results show that the proposed model gives promising results in ASD prediction.

Table 1
Survey on deep learning classifier with feature selection

Authors	Dataset	Feature selection	Classification	Accuracy
Aghdam et al (2018) [9]	ABIDE (116 ASD and 69 TC)	–	Deep belief network	65.56%
Dekhil et al (2018) [10]	rs-fMRI data (123 ASD and 160 TC)	High correlation PSD	RBF-SVM	91%
Fredo et al (2019) [11]	ABIDE (306 ASD and 350 TC)	Conditional random forest	Random forest	73.8%
Eslami et al (2019) [12]	ABIDE (505 ASD and 530 TC)	AE	Single layer perceptron	80%
Niu et al (2020) [13]	ABIDE (408 ASD and 401 TC)	–	Multichannel DANN	73.2%
Thomas et al (2020) [14]	ABIDE (620 ASD and 542 TC)	–	3D CNN	64%
Sherkatghanad et al (2020) [15]	ABIDE (505 ASD and 530 TC)	–	CNN	70.2%

2 Related work

This section reviews previous autism prediction research and the performance it reports. Auditory hypersensitivity data has been used to assess the level of autism using computer-aided techniques [1]. ASD prediction using an fMRI dataset [2] uses brain atlases for accurate disorder identification: a deep neural network is trained with the atlases, and each input is classified as ASD-affected or not. The prediction accuracy is 88%, which is moderate, although the AUC was 96%. This indicates that deep learning can make reliable predictions when the model is well trained. An fMRI dataset with a minimum spanning tree is used to detect autism in patients [3]; here, feature selection is performed on the fMRI data. Ensemble techniques combine several competitive algorithms by majority voting; in [4], a population graph is first used to retrieve the features. Researchers have also introduced technologies to support and guide autistic children [5]. A speech-based multi-modal system is used to set up environments such as virtual reality, and an augmented model [6] is used for interaction with autistic children.

A convolutional neural network (CNN) with separated channel (SC) attention is used to detect and classify ASD in the early stages, although the CNN takes more time to train on the data. The separated channels help the CNN model discriminate between ADHD and healthy controls. The CNN first extracts temporal features from the brain regions; second, the temporal dependencies among these features are captured by the attention network [7]. CNN identifies the autism-related features effectively. Combined filter and wrapper techniques are considered a flat feature selection model for handling huge datasets, and evolutionary techniques are used to optimise the final feature subset [8]. The grey wolf optimizer, which imitates how wolves select prey, performs this optimization with high accuracy.

A study of machine learning performance in predicting ASD [16] was conducted among 433 children aged 3–6 years. Naive Bayes, decision trees, logistic regression, and generalised linear models were evaluated, and the decision tree performed better than the other ML techniques. The main limitation is that the approach works only with small datasets, where overfitting is common. This can be mitigated by using an optimizer that is robust to overfitting. In the prediction of ASD, brain imaging biomarkers [17] play an essential role. Functional connectivity data are processed using machine learning pipelines, and the functional connectivity matrix is used to build a classification model that predicts ASD. Various AI techniques used in the prediction of ASD are reviewed in [18]; techniques such as CNN, logistic regression, and SVM perform well across a wide range of datasets. Eye tracking in ASD patients has been used as a biomarker for autism prediction: the eye-tracking scan paths are learned by ML models, which then classify the test data [19]. Like other biomarkers, eye tracking aids in disease prediction. ML performance can be improved by using optimizers such as ant colony, particle swarm, and gorilla optimizers. In this research, we focus on improving ML performance by using a seagull–elephant herding optimizer.

2 Proposed materials and methods

The overview of the proposed ASD model is shown in Fig. 1. In the data preparation stage, the fMRI image data are prepared for processing with the brain network as discussed in Section 2.2, which creates the functional connectivity vector that is used for feature extraction and classification. This model consists of three phases. First, the features from the prepared input data are extracted using CNN. Then, the dataset is divided into training and testing datasets. Second, the relevant and most important features are selected using a hybrid MH model called SEHOA. In the third phase, using the selected feature subset, an LSTM classifier is used to predict ASD in the patients, and the accuracy of the results is analysed using the evaluation metrics.

Fig. 1

Proposed ASD prediction model.

2.1 Dataset used

In this study, the ABIDE dataset was used for our analysis [20]. It includes fMRI brain images, structural MRI, and phenotypic information. The data were collected from 17 institutions and made available to researchers. The Autism Brain Imaging Data Exchange (ABIDE) initiative has aggregated structural and functional brain scan data from multiple laboratories in order to help scientists better understand the neurological foundations of autism. The ABIDE 1 dataset contains 1120 people, of whom 545 have ASD and 580 are healthy controls (ages 7–64 years, median 14.7 years across groups). ABIDE II involves 19 organisations, including 10 charter organisations and 7 new members, which have contributed 1120 records on 525 individuals with ASD and 598 without (in the age range of 5–64 years).

2.2 Data preparation –brain network

A network consists of nodes and edges, and constructing the brain network from functional MR data is difficult because of the intricate nodes and edges [21]. If the nodes and edges are not properly identified, the network becomes hard to analyze. The two most common methods to describe the network are the voxel-based method [22] and the ROI-based method [23]. In the voxel-based method, each voxel in the MRI data is declared a node, and the connections between the nodes are the edges. The ROI parcellation method splits the human brain into various regions of interest (ROIs), and these ROIs are declared as nodes; the connections between ROIs are the edges. ROIs correspond to anatomical brain parts such as the hippocampus, fissure, perirhinal cortex, and pons. Cortical parcels derived from an anatomical atlas are considered anatomical features and are widely used in neuroimaging studies. In this paper, multi-scale functional brain parcels are used, which are created using the bootstrap analysis of stable clusters (BASC) method [24]. The scales are selected based on the STEPS method [25], and the scale is set at 123 for this study. The edge between each ROI pair is weighted using the Pearson correlation coefficient denoted in Equation (1). $R_{xy} = \frac{\sum_{i = 1}^{l} (x_{i} - x_{m}) (y_{i} - y_{m})}{\sqrt{\sum_{i = 1}^{l} {(x_{i} - x_{m})}^{2}} \sqrt{\sum_{i = 1}^{l} {(y_{i} - y_{m})}^{2}}}$ (1)

Where x and y are the two time series, l is their length, x_i and y_i are the ith components of the time series, and x_m and y_m are the mean values of x and y. In this paper, the preprocessed ABIDE 1 dataset [26] is used for analysis.
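The edge weighting of Equation (1) can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names, the toy data, and the choice to flatten the upper triangle of the correlation matrix into a connectivity vector are assumptions made for the example.

```python
import numpy as np

def pearson_edge_weight(x, y):
    """Pearson correlation between two ROI time series, as in Equation (1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # x_i - x_m
    dy = y - y.mean()  # y_i - y_m
    return float(np.sum(dx * dy) / (np.sqrt(np.sum(dx ** 2)) * np.sqrt(np.sum(dy ** 2))))

def connectivity_vector(roi_series):
    """Build the ROI-by-ROI weight matrix and flatten its upper triangle.

    roi_series: array of shape (n_rois, l), one time series per ROI.
    The flattening step is an assumption; the paper does not spell it out.
    """
    roi_series = np.asarray(roi_series, dtype=float)
    n_rois = roi_series.shape[0]
    weights = np.eye(n_rois)
    for a in range(n_rois):
        for b in range(a + 1, n_rois):
            weights[a, b] = weights[b, a] = pearson_edge_weight(roi_series[a], roi_series[b])
    rows, cols = np.triu_indices(n_rois, k=1)  # each ROI pair counted once
    return weights, weights[rows, cols]

# Toy data: 4 hypothetical ROIs with 50 time points each (not ABIDE data).
rng = np.random.default_rng(0)
toy_series = rng.standard_normal((4, 50))
toy_weights, toy_vector = connectivity_vector(toy_series)
```

Under this flattening assumption, n ROIs yield n(n - 1)/2 edge weights, so the 123-region scale would give 7503 values per subject.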

2.3 Feature extraction using CNN

In the feature extraction phase, the feature vector from the data preparation stage is used. The convolutional neural network is the most widely used feature extractor in a variety of applications, which include image and text classification, speech recognition, and so on [27, 28]. In this study, the architecture shown in Fig. 2 is used for feature extraction.

Fig. 2

Presented CNN structure for feature extraction.

The block conv1 –1*3 @ 64 denotes a convolution layer with 64 filters, each of size 1*3, applied with a stride of 1. The input data is one-dimensional. The major building blocks are the convolution layer, pooling layer, fully connected layer, and activation function. Based on the constructed data, the CNN learns complex feature representations. The convolution operation learns the activation map from the input data. This simple CNN structure-based feature extraction improves the classification accuracy. The best-trained model, based on its performance on test data, is used to extract the features for the feature selection process. In Fig. 2, each convolution is followed by the rectified linear unit (ReLU) [29], denoted in Equation (2), to avoid the propagation of negative values. The pooling layer is used to reduce the dimensionality of the input data X. $ReLU (X) = \max (0, X)$ (2)

Dropout layers are used to reduce complexity and avoid overfitting [30]. The dropout rate is 0.5, so neurons are randomly dropped during training. The convolution operation on the input data X^{l-1} from the previous layer is given in Equation (3). $X^{l} = w^{l} . X^{l - 1} + b^{l}$ (3)

Where w^l and b^l are the weight and bias of the lth layer, respectively, and X^l is the output. The extracted features from the last pooling layer are given as input to the fully connected (FC) layers. The layers FC1, FC2, and FC3 are used for feature extraction [31], and FC4 produces the output using the softmax activation function. Batch normalization (BN), which also acts as a regularization method, is used to normalize the features given to FC4. The feature vector extracted from layer FC3 is of size 1*64 and is given as input to the feature selection process. This improves the classification accuracy.
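The convolution, ReLU, and pooling steps of Equations (2) and (3) can be sketched in plain NumPy as follows. This is an illustrative sketch, not the trained architecture of Fig. 2: the random weights, the toy input, and the pooling size of 2 are assumptions, and only the 64 filters of size 1*3 with stride 1 come from the description of conv1.

```python
import numpy as np

def conv1d(x, w, b, stride=1):
    """Valid 1-D convolution (cross-correlation, as in most DL libraries).

    x: (length,) input; w: (n_filters, kernel) weights; b: (n_filters,) biases.
    Returns an activation map of shape (n_filters, out_length).
    """
    n_filters, kernel = w.shape
    out_len = (x.shape[0] - kernel) // stride + 1
    out = np.empty((n_filters, out_len))
    for i in range(out_len):
        window = x[i * stride:i * stride + kernel]
        out[:, i] = w @ window + b  # Equation (3) applied to one window
    return out

def relu(x):
    """Equation (2): negative values are set to zero."""
    return np.maximum(0.0, x)

def max_pool1d(x, size=2):
    """Non-overlapping max pooling along the last axis (size is an assumption)."""
    usable = (x.shape[-1] // size) * size
    return x[..., :usable].reshape(*x.shape[:-1], -1, size).max(axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal(20)              # toy 1-D input, not real connectivity data
w = rng.standard_normal((64, 3)) * 0.1   # 64 filters of size 1*3, as in conv1
b = np.zeros(64)
features = max_pool1d(relu(conv1d(x, w, b, stride=1)), size=2)
```

With a toy input of length 20, the convolution yields 18 positions per filter and pooling halves that to 9, which shows how the pooling layer reduces dimensionality.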

2.4 Feature selection using SEHOA

In this section, the most relevant and important features are selected using the hybrid SEHOA model, which combines Elephant Herding Optimization (EHO) [32] and the Seagull Optimization Algorithm (SOA) [33]. The traditional EHO does not retain the information needed for its future searches, and the traditional SOA has limitations in large-scale industrial applications. The computational complexity of these algorithms also remains an issue when solving optimization problems. This motivates us to combine both algorithms in order to obtain better results with improved convergence speed.

Elephants live in social groups called clans, and each clan is led by a matriarch, the female leader elephant. Mature male elephants live apart from the other elephants. The elephant population is randomly generated and split into various clans based on fitness value. The standard EHO has three significant rules, as follows:

The elephant population comprises various clans, with a fixed number of male and female elephants in each clan.

Some male elephants live individually, away from the clan.

The elephants in each clan live under the leadership of the matriarch.

2.4.1 Updating the clan

This operator is used to update each clan individually. The matriarch (m) influences each elephant’s position in clan h during the updating process. The seagull optimization function is used for the clan updating process, as denoted in Equation (4) $S_{b} (t) = b . (S_{best} (t) - S_{l} (t)) + Levy (δ)$ (4)

Where S_b is the position of the seagull search agent S_l moved towards the best-fit agent S_best, S_l is the current position of the search agent, t is the current iteration, and b is the behaviour, a randomized number that balances exploration and exploitation. The update of the best-fit elephant is obtained from Equation (5) $S_{n, h, k} = η \times S_{c, h}$ (5)

Where η is a factor in the range [0, 1] that scales the clan centre, S_n,h,k is the new individual, expressed from the information gathered by all the elephants in clan h, and S_c,h is the centre of clan h, denoted in Equation (6) $S_{c, h} = \frac{1}{g_{h}} \times \sum_{k = 1}^{g_{h}} S_{h, k, d}$ (6)

Where 1 ⩽ d ⩽ D indexes the dimension, D is the total number of dimensions, g_h is the number of elephants in clan h, and S_h,k,d is the dth dimension of the individual S_h,k.
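One pass of the clan-updating operator in Equations (4) to (6) might look like the sketch below. Several points are assumptions rather than details from the paper: the Lévy step is drawn with Mantegna's algorithm, lower fitness is treated as better, b is drawn uniformly from [0, 1), and Equation (4) is applied literally as the new position since the text does not say whether S_b is a position or a displacement.

```python
import math
import numpy as np

def levy_step(delta, size, rng):
    """Levy-distributed step via Mantegna's algorithm (an assumed choice)."""
    sigma = (math.gamma(1 + delta) * math.sin(math.pi * delta / 2)
             / (math.gamma((1 + delta) / 2) * delta * 2 ** ((delta - 1) / 2))) ** (1 / delta)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / delta)

def clan_centre(clan):
    """Equation (6): per-dimension mean of the g_h elephants in clan h."""
    return clan.sum(axis=0) / clan.shape[0]

def update_clan(clan, fitness, eta=0.5, delta=1.5, rng=None):
    """One clan-updating pass; lower fitness is treated as better (assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    best = int(np.argmin(fitness))
    s_best = clan[best].copy()
    centre = clan_centre(clan)
    new_clan = np.empty_like(clan)
    for k in range(clan.shape[0]):
        if k == best:
            new_clan[k] = eta * centre  # Equation (5): best-fit elephant
        else:
            b = rng.random()            # behaviour factor b
            step = levy_step(delta, clan.shape[1], rng)
            new_clan[k] = b * (s_best - clan[k]) + step  # Equation (4), literal reading
    return new_clan

# Toy clan: 5 hypothetical elephants in an 8-dimensional search space.
demo_rng = np.random.default_rng(1)
demo_clan = demo_rng.random((5, 8))
demo_fitness = demo_clan.sum(axis=1)  # placeholder fitness, not the paper's objective
demo_updated = update_clan(demo_clan, demo_fitness, eta=0.5, rng=demo_rng)
```

The fitness function here is only a placeholder; in a wrapper-style feature selector it would normally be derived from classifier performance on the candidate feature subset.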

2.4.2 Separating operator

Once grown, the male elephants leave the clan and live separately, which is modelled by the separating operator. Based on the hybrid SEHOA, the search ability is enhanced, and the elephant with the worst fitness in each clan is handled by the separating operator in each generation, as defined in Equation (7) $S_{worst, h} = S {(S_{best} - S_{worst})}_{\min}$ (7)

Where S_min is the minimum bound of a single elephant’s position and S_worst is the worst individual elephant of clan h, which is replaced using Equation (8). The workflow of SEHOA is shown in Fig. 3.

Fig. 3

Workflow of hybrid SEHOA algorithm.

$R_{b + 1} = \cos (b \cos^{- 1} (R_{b}))$ (8)

2.5 Classification using LSTM

LSTM is a special kind of recurrent neural network that learns long-term dependencies. It was developed in the mid-1990s and is now widely used for classification. The standard LSTM consists of four components: the input gate I, the forget gate F, the control gate C, and the memory cell output gate O [34]. Figure 4 depicts the architecture of a standard LSTM. The model inputs are the previous hidden state h_t-1, the current input vector X_t, and the bias b. The model outputs are C_t, which represents the memory content (cell state), and the current hidden state h_t. The data flowing through the network are controlled by the four gates. The forget gate F takes values from 0 to 1, which indicate how much of the previous memory cell is retained: a value close to 0 discards the previous memory content, and a value close to 1 keeps it.

Fig. 4

LSTM structure.

Based on Fig. 4, the mathematical model of LSTM classification is given in Equations (9)–(13). The forget gate is computed using Equation (9) $F_{t} = σ (w_{F} [h_{t - 1}, X_{t}] + b_{F})$ (9)

Next, a sigmoid function decides which information needs to be updated, and a list of candidate values ${\tilde{C}}_{t}$ is created. The two operations are denoted in Equations (10) and (11) $I_{t} = σ (w_{I} [h_{t - 1}, X_{t}] + b_{I})$ (10) ${\tilde{C}}_{t} = \tanh (w_{C} [h_{t - 1}, X_{t}] + b_{C})$ (11)

The new memory cell status is computed using Equation (12) $C_{t} = F_{t} C_{t - 1} + I_{t} {\tilde{C}}_{t}$ (12)

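Finally, the output of the system h_t is calculated using Equation (13), where O_t is the output gate activation, which in the standard LSTM is computed in the same way as F_t and I_t with its own weights w_O and bias b_O $h_{t} = O_{t} \tanh (C_{t})$ (13)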
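A single LSTM time step following Equations (9) to (13) can be sketched in NumPy as below. This is a minimal illustration with random toy weights, not the trained classifier; the dictionary layout of the parameters and the sizes are assumptions, and the output gate uses the standard sigmoid form because the text does not give its equation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, w, b):
    """One LSTM time step.

    w and b are dicts keyed by gate name ("F", "I", "C", "O"); each w[g] has
    shape (hidden, hidden + inputs) and acts on the concatenation [h_{t-1}, X_t].
    """
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(w["F"] @ z + b["F"])      # Equation (9), forget gate
    i_t = sigmoid(w["I"] @ z + b["I"])      # Equation (10), input gate
    c_tilde = np.tanh(w["C"] @ z + b["C"])  # Equation (11), candidate values
    c_t = f_t * c_prev + i_t * c_tilde      # Equation (12), new cell state
    o_t = sigmoid(w["O"] @ z + b["O"])      # output gate (standard form, assumed)
    h_t = o_t * np.tanh(c_t)                # Equation (13), hidden state
    return h_t, c_t

hidden, inputs = 4, 3                       # toy sizes, not the paper's
rng = np.random.default_rng(0)
w = {g: rng.standard_normal((hidden, hidden + inputs)) * 0.1 for g in "FICO"}
b = {g: np.zeros(hidden) for g in "FICO"}
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.standard_normal((5, inputs)):  # toy sequence of 5 steps
    h, c = lstm_step(x_t, h, c, w, b)
```

With all weights and biases set to zero, every gate equals 0.5 and the candidate is 0, so the cell state is simply halved at each step. This makes the role of the forget gate in Equation (12) easy to check by hand.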

Table 2

Confusion matrix

	Actual positive	Actual negative
Predicted positive	True positive (T^P)	False positive (F^P)
Predicted negative	False negative (F^N)	True negative (T^N)

3 Experimental results and discussions

The effectiveness of the proposed CNN-SEHOA-based feature selection with LSTM is evaluated using the ABIDE 1 dataset. The data are partitioned in a 10-fold cross-validation process, with eight folds used for training and two folds for testing. The proposed model is implemented in Python using the scikit-learn (sklearn) library. Accuracy, sensitivity (recall), precision, specificity, F1-score, and area under the curve (AUC) are computed from the confusion matrix for statistical analysis, and the results are shown in Table 4.
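One way to realise a 10-fold partition with eight training folds and two testing folds is sketched below. The rotation scheme (consecutive pairs of folds serve as the test set, giving five rounds), the function name, and the shuffling seed are all assumptions, since the paper does not describe how the folds are rotated.

```python
import numpy as np

def eight_two_splits(n_samples, n_folds=10, n_test_folds=2, seed=0):
    """Yield (train_idx, test_idx) pairs using consecutive fold groups as test sets."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for start in range(0, n_folds, n_test_folds):
        test_ids = range(start, start + n_test_folds)
        test_idx = np.concatenate([folds[i] for i in test_ids])
        train_idx = np.concatenate([folds[i] for i in range(n_folds) if i not in test_ids])
        yield train_idx, test_idx

# 100 hypothetical samples, split into five 80/20 rounds.
splits = list(eight_two_splits(n_samples=100))
```

Each sample appears in exactly one test set across the five rounds, so every subject is evaluated once on unseen data.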

Table 4
Proposed model metrics results

Metrics	Results
Accuracy	0.986
Recall	0.996
Precision	0.975
Specificity	0.978
F1-score	0.985

Specificity is the ratio of actual negatives that are correctly predicted as negative. It is also known as the true negative rate, as shown in Equation (14) $Sp = \frac{T^{N}}{T^{N} + F^{P}}$ (14)

Sensitivity, or recall, measures how well the model detects the true positives, which is a major factor in identifying the actual patients with ASD. It is computed using Equation (15) $recall = \frac{T^{P}}{T^{P} + F^{N}}$ (15)

Precision is the ratio of true positives to all predicted positives, as denoted in Equation (16) $Precision = \frac{T^{P}}{T^{P} + F^{P}}$ (16)

Accuracy is the ratio of correct predictions to the total number of predictions, as denoted in Equation (17) $Accuracy = \frac{T^{P} + T^{N}}{T^{P} + T^{N} + F^{P} + F^{N}}$ (17)

F1-score is the harmonic mean of precision and recall, as in Equation (18) $F 1 - Score = \frac{2 * T^{P}}{2 * T^{P} + F^{P} + F^{N}}$ (18)
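Equations (14) to (18) translate directly into a few lines of code. The sketch below uses a hypothetical helper name and illustrative counts that are not results from this paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute Equations (14)-(18) from confusion-matrix counts."""
    return {
        "specificity": tn / (tn + fp),                # Equation (14)
        "recall": tp / (tp + fn),                     # Equation (15)
        "precision": tp / (tp + fp),                  # Equation (16)
        "accuracy": (tp + tn) / (tp + tn + fp + fn),  # Equation (17)
        "f1": 2 * tp / (2 * tp + fp + fn),            # Equation (18)
    }

# Illustrative counts only; these are not results from the paper.
example = classification_metrics(tp=90, fp=5, fn=10, tn=95)
```

For these counts, the F1-score from Equation (18) is 180/195, which equals the harmonic mean of the precision (90/95) and the recall (90/100).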

3.1 ROC curve

The receiver operating characteristic (ROC) curve is a graph that shows the trade-off between sensitivity and specificity of a classifier. It has two parameters: the true positive rate and the false positive rate.

3.2 True positive rate (TPR)

It is also known as “sensitivity”. TPR is the probability that an actual positive case is correctly identified, and is calculated from the false negative rate (FNR) using: $TPR = 1 - FNR$ (19)

3.3 False positive rate (FPR)

It is the proportion of actual negative cases that are wrongly identified as positive, and is calculated using: $FPR = 1 - Specificity$ (20)
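The relationship between TPR, FPR, and the ROC curve can be sketched as follows: sweeping a decision threshold over the classifier scores gives one (FPR, TPR) point per threshold, and the area under these points is the AUC. The function names and the toy labels and scores are hypothetical and are not outputs of the proposed model.

```python
import numpy as np

def roc_points(y_true, scores):
    """Return (FPR, TPR) arrays obtained by sweeping the decision threshold."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.concatenate(([np.inf], np.sort(np.unique(scores))[::-1]))
    positives = np.sum(y_true == 1)
    negatives = np.sum(y_true == 0)
    fpr, tpr = [], []
    for thr in thresholds:
        predicted = scores >= thr
        tp = np.sum(predicted & (y_true == 1))
        fp = np.sum(predicted & (y_true == 0))
        tpr.append(tp / positives)  # TPR = 1 - FNR, Equation (19)
        fpr.append(fp / negatives)  # FPR = 1 - specificity, Equation (20)
    return np.array(fpr), np.array(tpr)

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy labels and scores (hypothetical, not model outputs from the paper).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(labels, scores)
area = auc(fpr, tpr)
```

The curve always starts at (0, 0), where nothing is predicted positive, and ends at (1, 1), where everything is.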

Based on the results of Table 3, the evaluation metrics results of the proposed model are shown in Table 4.

Table 3

Confusion matrix results of proposed model

	Actual positive	Actual negative
Predicted positive	524	3
Predicted negative	1	595

The proposed model is evaluated and compared with various classifiers such as convolutional neural network (CNN), random forest (RF), support vector machine (SVM), and deep neural network (DNN). The evaluated results are shown in Fig. 5.

Fig. 5

Comparative analysis results of proposed vs existing models.

Compared to the existing classifiers such as CNN, RF, SVM, and DNN, the proposed model performs better, with an accuracy of 98.6%, precision of 97.5%, recall of 99.6%, specificity of 97.8%, and F1-score of 98.5%. CNN achieved accuracy, precision, recall, specificity, and F1-score of 94.5%, 93.2%, 95.3%, 93.4%, and 94.3%, respectively. In the same order, RF achieved 92.8%, 91.7%, 93.5%, 91%, and 92.2%, SVM achieved 93.7%, 92.8%, 94.8%, 92.9%, and 93.4%, and DNN achieved 95.4%, 94.6%, 96.3%, 94.5%, and 95.6%. The ROC curves of the proposed and existing approaches are shown in Fig. 6. The proposed feature extraction, feature selection, and classification system secured a ROC value of 0.989.

Fig. 6

ROC comparison.

The proposed model is also evaluated in terms of error rate and execution time, and the results are shown in Table 5. In comparison to other approaches, the proposed model achieved the lowest error rate in predicting ASD patients. In terms of execution time, the proposed model took longer than other approaches because it executes the feature extraction, feature selection, and classification phases. Hence, the proposed feature selection with classification system is robust and effective for the prediction of ASD, at the cost of a longer execution time.

Table 5

Error rate and time comparison of proposed vs existing approaches

Methods	Error rate	Time (seconds)
CNN [36]	3.2	1300
RF [34]	7.4	3000
SVM [35]	15.3	1200
DNN [2]	2.3	1500
Proposed CNN-SEHOA-LSTM	0.003	5000

Some of the previous ASD prediction systems using ML and meta-heuristic-based feature selection approaches are compared with the proposed meta-heuristic-based feature selection and deep learning-based classification models. The evaluated results are listed in Table 6.

Table 6

Comparison of previous Meta heuristic feature selection-based ASD prediction systems

Data and Ref	Feature selection	Classifier	Accuracy
Jin et al., [33]	t-test filter and LASSO regression	SVM	76%
Katuwal [34]	Recursive Feature Elimination	Random Forest	60%
Chen et al., [35]	Particle Swarm Optimization	SVM and RF	81% and 91%
Thomas et al., [36]	–	CNN	63%
Dvornek et al., [37]	–	RNN	70.1%
Proposed model	CNN-SEHOA	LSTM	98.6%

Hence, the evaluation and comparison results prove the efficiency of the proposed model in the prediction of ASD with increased accuracy, ROC, and reduced error.

4 Conclusion

This paper developed an ASD prediction model using an enhanced feature selection process that includes CNN-based feature extraction and hybrid SEHOA-based feature selection. The selected features are classified using LSTM, and the efficiency of the model rests on the proposed feature selection. Using the ABIDE 1 dataset, the proposed feature selection-based ASD model is evaluated in terms of accuracy, recall, precision, specificity, and F1-score. Compared to existing approaches such as CNN, RF, SVM, and DNN, the proposed model performs better, with an accuracy of 98.6% and a ROC value of 0.989. The proposed model performs better when screening for ASD and predicts ASD patients with a lower error rate. It takes longer to execute than other approaches due to the use of a meta-heuristic-based deep learning model. In the future, the proposed model will be implemented with larger datasets, and a user-friendly web-based application will be developed so that individual users can use the application to predict the onset of autism spectrum disorder as early as possible.

References

Johnston

, Egermann

and Kearney

, Sound Fields: A virtual reality game designed to address auditory hypersensitivity in individuals with autism spectrum disorder, Applied Sciences, p. 2996, 2020.

Subah

F.Z.

, Deb

, Dhar

P.K.

and Koshiba

, A deep learning approach to predict autism spectrum disorder using multisite resting-state fMRI, Applied Sciences, p. 3636, 2021.

Shi

, Zhang

and Wu

, An fMRI feature selection method based on a minimum spanning tree for identifying patients with autism, Symmetry, p. 1995, 2020.

Rakhimberdina

, Liu

and Murata

, Population graph-based multi-model ensemble method for diagnosing autism spectrum disorder, Sensors, p. 6001, 2020.

Johnston

, Egermann

and Kearney

, Measuring the behavioral response to spatial audio within a multi-modal virtual reality environment in children with autism spectrum disorder, Applied Sciences, p. 3152, 2019.

Magrini

, Curzio

, Carboni

, Moroni

, Salvetti

and Melani

, Augmented interaction systems for supporting autistic children. Evolution of a multichannel expressive tool: The SEMI project feasibility study, Applied Sciences, p. 3081, 2019.

Zhang

, Li

, Peng

, Kang

, Jiang

, Li

, Zhu

, Yao

and Biswal

, Separated channel attention convolutional neural network (SC-CNN-attention) to identify adhd in multi-site rs-fmri dataset, Entropy, p. 893, 2020.

Al-Tashi

, Rais

H.M.

, Abdulkadir

S.J.

, Mirjalili

and Alhussian

S. H

, A Review of GreyWolfOptimiser-Based Feature Selection Methods for Classification. In Evolutionary Machine Learning Techniques, Springer: Singapore, p. 273–286, 2020.

Aghdam

M.A.

, Sharifi

and Pedram

, Combination of rs-fMRI and sMRI data to discriminate autism spectrum disorders in young children using deep belief network, J. Digit. Imaging, p. 895–903, 2018.

10.

Dekhil

, Hajjdiab

, Shalaby

, Ali

M.T.

, Ayinde

and Switala

, Using resting state functional MRI to build a personalized autism diagnosis system, PLoS One, p. 6351, 2018.

11.

Fredo

A.R.J.

, Jahedi

, Reiter

M.A.

and Muller

R.A.

, Classification of severe autism in fMRI using functional connectivity and conditional random forests, Neural Comput. App, p. 8415, 2019.

12.

Eslami

, Mirjalili

, Fong

, Laird

A.R.

and Saeed

, ASD-DiagNet: a hybrid learning approach for detection of autism spectrum disorder using fMRI data, Front Neuroinform, p. 70, 2019.

13.

Niu

, Guo

, Pan

, Gao

, Peng

and Li

, Multichannel deep attention neural networks for the classification of autism spectrum disorder using neuro imaging and personal characteristic data, Complexity, p. 1–9, 2020.

14.

Thomas

R.M.

, Gallo

, Cerliani

, Zhutovsky

, El-Gazzar

and Wingen

G.V.

, Classifying autism spectrum disorder using the temporal statistics of resting-state functional MRI data with 3D convolutional neural networks, Front Psychiatry, p. 440, 2020.

15.

Sherkatghanad

, Zomorodi Moghadam

, Khosrowabadi

, Akhondzadeh

, Abdarand

and Salari

, Automated detection of autism spectrum disorder using a convolutional neural network. Front Neurosci, p. 1325, 2020.

16.

Usta

M.B.

, Karabekiroglu

, Sahin

, Aydin

, Bozkurt

, Karaosman

, Aral

, Cobanoglu

, Kurt

A.D.

, Kesim

and Sahin

, Use of machine learning methods in prediction of short-term outcome in autism spectrum disorders, Psychiatry and Clinical Psychopharmacology, p. 320–325, 2019.

17.

Graña

and Silva

, Impact of machine learning pipeline choices in autism prediction from functional connectivity data, International Journal of Neural Systems, p. 2150009, 2021.

18.

Chaddad

, Li

, Lu

, Li

, Okuwobi

I.P.

, Tanougast

, Desrosiers

and Niazi

, Can autism be diagnosed with artificial intelligence? A narrative review, Diagnostics, p. 2032, 2021.

19. Kanhirakadavath M.R. and Chandran M.S.M., Investigation of eye-tracking scan path as a biomarker for autism screening using machine learning algorithms, Diagnostics, p. 518, 2022.

20. Wong, Anderson J.S., Zielinski B.A. and Fletcher P.T., Riemannian regression and classification models of brain networks applied to autism, International Workshop on Connectomics in Neuroimaging, Springer, Cham, 2018.

21. Zalesky, Fornito, Harding I.H., Cocchi, Yücel, Pantelis and Bullmore E.T., Whole-brain anatomical networks: does the choice of nodes matter?, Neuroimage 50(3) (2010), 970–983.

22. Jenkinson, Bannister, Brady and Smith, Improved optimization for the robust and accurate linear registration and motion correction of brain images, Neuroimage 17(2) (2002), 825–841.

23. Abraham, Milham, Martino A.D., Craddock R.C., Samaras, Thirion and Varoquaux, Deriving reproducible biomarkers from multi-site resting-state data: An Autism-based example, NeuroImage 147 (2017), 736–745.

24. Bellec, Rosa-Neto, Lyttelton O.C., Benali and Evans A.C., Multi-level bootstrap analysis of stable clusters in resting-state fMRI, Neuroimage 51(3) (2010), 1126–1139.

25. Bellec, Mining the hierarchy of resting-state brain networks: selection of representative clusters in a multiscale structure, International Workshop on Pattern Recognition in Neuroimaging, IEEE, 2013.

26. Craddock, et al., The neuro bureau preprocessing initiative: open sharing of preprocessed neuroimaging data and derivatives, Frontiers in Neuroinformatics, 2013.

27. Fan, Dahou, Ewees A.A., Yousri, Abualigah and Al-qaness M.A.A., Social media toxicity classification using deep learning: real-world application UK Brexit, Electronics 10(11) (2021), 1332.

28. Al-qaness M.A.A., Abbasi A.A., Fan, Ibrahim R.A. and Alsamhi S.H., An improved YOLO-based road traffic monitoring system, Computing 103(2) (2021), 211–230.

29. Nair and Hinton G.E., Rectified linear units improve restricted Boltzmann machines, in Proceedings of the 27th International Conference on Machine Learning (ICML-10), p. 807–814, DBLP, Israel, June 2010.

30. Elhosseini M.A., elSehiemy R.A., Rashwan Y.I. and Gao X.Z., On the performance improvement of elephant herding optimization algorithm, Knowledge-Based Systems 166 (2019), 58–70.

31. Dhiman and Kumar, Seagull optimization algorithm: Theory and its applications for large-scale industrial engineering problems, Knowledge-Based Systems 165 (2019), 169–196.

32. Zhu, et al., Redundancy and attention in convolutional LSTM for gesture recognition, IEEE Trans Neural Networks Learn. Syst, 2019.

33. Jin, Wee C.Y., Shi, Thung K.H., Ni, Yap P.T. and Shen, Identification of infants at high-risk for autism spectrum disorder using multiparameter multiscale white matter connectivity networks, Hum Brain Mapp, p. 4880–4896, 2015.

34. Katuwal G.J., Machine learning based autism detection using brain imaging, Rochester Institute of Technology, Rochester, NY, USA, 2017.

35. Chen C.P., Keown C.L., Jahedi, Nair, Pflieger M.E., Bailey B.A. and Müller R.A., Diagnostic classification of intrinsic functional connectivity highlights somatosensory, default mode, and visual regions in autism, NeuroImage Clin, p. 238–245, 2015.

36. Thomas R.M., Gallo, Cerliani, Zhutovsky, El-Gazzar and van Wingen, Classifying autism spectrum disorder using the temporal statistics of resting-state functional MRI data with 3D convolutional neural networks, Front Psychiatry, p. 440, 2020.

37. Dvornek N.C., Ventola and Duncan J.S., Combining phenotypic and resting-state fMRI data for autism classification with recurrent neural networks, in Proceedings of the 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, DC, USA, p. 725–728, 2018.


Table 4
Proposed model metrics results

Metrics        Results
Accuracy       0.986
Recall         0.996
Precision      0.975
Specificity    0.978
F1-score       0.985