Abstract
Autism is a developmental disorder that affects social communication skills. It is currently diagnosed only by behavioral assessment, which is susceptible to the experience of the examiner as well as to the descriptive scaling standard. This paper presents a computer-aided approach to discriminating between neuro-typical and autistic children. A new method, based on computing the elliptic area of the Continuous Wavelet Transform (CWT) complex plot of resting-state EEG, is presented. First, the complex values of the CWT, as a function of both time and frequency, are calculated for every EEG channel. Second, the CWT complex plot is obtained by plotting the real parts of the resulting CWT values versus the related imaginary components. Third, the 95% confidence value of the elliptic area of the complex plot is computed for every channel for both autistic and healthy subjects, and the obtained values are considered as the first set of features. Fourth, three additional features are computed for every channel: the average CWT, the maximum EEG amplitude, and the maximum real part of the CWT. The features are classified by an artificial neural network (ANN). The obtained accuracy, sensitivity and specificity values are 95.9%, 96.7%, and 95.1%, respectively.
Introduction
Autism spectrum disorder (ASD) is a kind of neurological developmental disorder mainly reflected in the language barriers and difficulties in social communication [1]. The detection of ASD is a complex procedure that involves behavioral/cognitive features [2]. It is currently achieved only by behavioral assessment via qualitative scaling criteria, e.g. Childhood Autism Rating Scale (CARS). The assessment is consequently susceptible to the experience of the examiner as well as to the descriptive standard [3].
Recently, researchers have been trying to develop ASD diagnostic methods with the help of electroencephalography (EEG) [2]. Different approaches have been implemented in recent literature, based on: (1) the power and characteristics of the Delta, Theta, Alpha, Beta and Mu frequency bands [4], (2) the coherence and connectivity between brain sites [5], (3) the entropy of different EEG time scales [6], and (4) the hemispheric activity asymmetry [7]. Machine learning, pattern recognition, clustering and classification techniques are the most powerful among the techniques applied in those approaches [8]. Their main target (output) is generally the discrimination between ASD and neuro-typical groups. Their inputs are the features produced by feature selection/extraction procedures. Different feature extraction methods have been used in current research: (1) the Fourier transform and the short-time Fourier transform [9], (2) the spectrogram, wavelet transform and scalogram [10], (3) fractal analysis methods [11], and (4) time-scale based feature extractors [12]. The features are then fed into classifiers to help distinguish between the ASD group and the normal control group. A variety of classifiers and machine learning algorithms have been applied to ASD in recent research: artificial neural networks (back-propagation, probabilistic, competitive and Elman) [13], neuro-fuzzy techniques [8], learning vector quantization [14], radial basis function [15], support vector machines [16], statistics-based classifiers [17], nearest neighbor [18], and discriminant analysis [18].
Most of the existing literature on autism diagnosis focuses on comparing the qualitative and quantitative features useful for the clinical identification of ASD. On the other hand, few works have approached automatic computer-based quantitative discrimination. In [19], Naive Bayes classification and support vector machines were implemented with logistic regression for ASD diagnosis; the obtained accuracy was less than 80%. In [20], blind classifiers based on an evolutionary system with an invariant feature vector were developed; the obtained accuracy was 84%. The authors of [21] applied Fisher Linear Discriminant Analysis to Fast Fourier Transform features of ASD EEG with a correct classification rate of 88.14%. The authors of [13] solved for the autoregressive (AR) coefficients by least-squares-based methods to reduce the forward/backward errors; AR Burg features and an Elman neural network then led to a classification accuracy of 95.63%. However, the selected sample included only six autistic volunteers. In [22], machine learning was applied to brain connectivity values. The classification gave 94.7% accuracy, but the results showed low sensitivity. Multiple support vector machines based on metrics of brain functional connectivity were implemented in [1] with a high accuracy (96.15%). Nevertheless, that work had to involve features extracted from functional magnetic resonance imaging, not EEG alone. A feed-forward neural network using autoregressive features for EEG-based ASD diagnosis was explored in [23] with 92.69% accuracy, but the sample comprised only ten children. In [24], pattern recognition and layered recurrent neural networks yielded an accuracy of 94.62%, but the sample was very small.
The same issue of a small sample (only 9 autistic subjects) and low variability of severity is found in [10], where EEG-based computer-aided diagnosis of autism spectrum disorder was conducted using wavelets, entropy (Renyi and Shannon), and an artificial neural network, with accuracies in the range 83%–98%. In [25], wavelet, neural network, radial basis, variation statistics and fractal techniques were applied to the electroencephalogram and resulted in an accuracy of 90%. The sample included only eight autistic subjects.
This paper presents a simple, user-friendly, computer-aided quantitative approach for automatic discrimination between neuro-typical and autistic children. It is applied to a comparatively large sample (122 subjects with different severity levels) and is based only on EEG. The suggested method is based on computing the elliptic area of the complex plot related to the Continuous Wavelet Transform (CWT) of resting-state EEG. First, the complex values of the CWT, as a function of both time and frequency, are calculated for every EEG channel (the total number of channels is 64). The output values related to frequencies below 0.8 Hz are eliminated to avoid the effect of movement artifacts. Second, the CWT complex plot (CWT_CP) is obtained by plotting the real parts of the resulting CWT values versus the related imaginary components. Third, the 95% confidence value of the elliptic area is computed for every channel for both autistic and healthy subjects, and the obtained values are considered as the first set of features. Fourth, three additional features are computed for every channel: the average CWT, the maximum EEG amplitude, and the maximum real part of the CWT. Fifth, the four sets of features for all channels produce the final feature matrix. The features are classified by an artificial neural network (ANN). Finally, the classification results are statistically assessed.
Materials and methods
Volunteers
Sixty-one autistic and 61 neuro-typical children, aged 4–13 years, are involved in the present study. They were selected from Jordanian centers of special education as well as from regular schools. The subjects had been pre-diagnosed by specialists in behavioral assessment. All medicated children and those suffering from other neurological issues were excluded. The severity levels of the ASD volunteers were as follows: 19 mild, 24 moderate and 18 severe cases. The severity was assessed based on international standards [16, 46] and established scales [47, 48].
The present study was approved by the Institutional Review Board (IRB) ethical committee of Jordan University of Science and Technology. Consent forms were signed by a parent of every child.
Protocol, material and pre-processing
The general acquisition protocol is resting-state EEG for twenty minutes. Every volunteer is assisted by the recording expert in wearing the 64-electrode Waveguard cap (ANT Neuro Company) that matches the size of the child's head, following the instructions of the international 10-20 system. The cap is then linked through a cable to the 64-EEGO amplifier (ANT Neuro Company) and a desktop computer with the acquisition software EEGO, the analysis software LA-106 ASA ERP, and MATLAB installed. The signal of every channel is sampled at 500 Hz. The EEG frequency range taken into account is 0.8–40 Hz.
Continuous wavelet transform
The Continuous Wavelet Transform (CWT) is a powerful time–frequency processing tool that is broadly used for signal analysis in several applications [26]. It gives detailed information about the spectral distribution of the signal at every time instant. There are two well-known forms of the wavelet transform: the CWT and the Discrete Wavelet Transform (DWT). In the latter, the fact that the wavelets are discretely sampled sometimes leads to loss of valuable information [26]. In the present study, the CWT is used in order to benefit from information that is localized in both time and frequency.
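The CWT of a signal x(t) can be computed as follows [27], where the asterisk denotes complex conjugation:
$$\mathrm{CWT}(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt$$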
ψ(t) is the mother wavelet. ‘b’ is the temporal translation parameter that generates the shifted versions of the mother wavelet; it therefore helps track the signal information through time. ‘a’ is the scaling parameter that generates the dilated or compressed versions of the mother wavelet; it therefore helps track the signal spectral content through frequency [27]. In the present work, the values of ‘a’ are selected so that the dilated versions encompass the set of frequency values f existing in the EEG signal, according to the following formula:
$$a = \frac{f_c}{f\, T_s}$$
fc is the center frequency of the mother wavelet, and Ts is the sampling period.
Different wavelets can be found in the literature [28], such as Daubechies, Biorthogonal, Symlets, Morlet, Mexican hat, and Meyer. In this work, the analytic Morse wavelet is used [29]. The advantage of the Morse wavelet is that, by adjusting its main parameters, analytic wavelets with different properties can be obtained [29]. Here, the gamma (symmetry) parameter and the time-bandwidth product are set to 3 and 60, respectively.
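The transform described in this subsection can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation (the text mentions MATLAB, whose code is not given). It assumes the usual frequency-domain form of the analytic Morse wavelet with a peak value of 2, the relation time-bandwidth product = beta × gamma, amplitude-preserving (L1) scaling rather than the 1/√a factor, and a slow direct DFT so that only the Python standard library is needed. All function names are hypothetical.

```python
import cmath
import math

GAMMA = 3.0                     # symmetry parameter reported in the text
TIME_BANDWIDTH = 60.0           # time-bandwidth product reported in the text
BETA = TIME_BANDWIDTH / GAMMA   # assumption: time-bandwidth product = beta * gamma


def morse_peak_omega():
    """Peak angular frequency (rad/sample at scale 1) of the Morse wavelet."""
    return (BETA / GAMMA) ** (1.0 / GAMMA)


def morse_hat(omega):
    """Frequency-domain Morse wavelet, normalized so that its peak equals 2."""
    if omega <= 0.0:
        return 0.0  # analytic wavelet: no response at zero or negative frequency
    r = BETA / GAMMA
    return 2.0 * math.exp(BETA * math.log(omega) - omega ** GAMMA
                          + r * (1.0 - math.log(r)))


def dft(values, inverse=False):
    """Direct O(n^2) discrete Fourier transform (adequate for short examples)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def cwt_morse(signal, fs, freqs_hz):
    """Return one row of complex CWT coefficients per requested frequency."""
    n = len(signal)
    spectrum = dft(signal)
    rows = []
    for f in freqs_hz:
        # Scale a = fc / (f * Ts), with fc = omega_peak / (2*pi) and Ts = 1 / fs.
        scale = morse_peak_omega() * fs / (2.0 * math.pi * f)
        weighted = [
            spectrum[k] * morse_hat(scale * 2.0 * math.pi * k / n)
            if k <= n // 2 else 0j   # negative-frequency bins are discarded
            for k in range(n)
        ]
        rows.append(dft(weighted, inverse=True))
    return rows


if __name__ == "__main__":
    fs = 500.0  # sampling rate used in the study
    x = [math.cos(2.0 * math.pi * 10.0 * t / fs) for t in range(250)]
    row = cwt_morse(x, fs, [10.0])[0]
    # Real and imaginary parts of each coefficient form one point of the complex plot.
    print(row[0].real, row[0].imag)
```

Under these assumptions, a unit-amplitude 10 Hz cosine yields coefficients of magnitude close to 1 in the 10 Hz row, so its complex plot is approximately a unit circle. For real recordings an FFT-based routine would replace the direct DFT.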
In the present paper, the complex plot of the CWT of the EEG signal is obtained by plotting the real parts versus the imaginary parts. All coefficients related to frequencies below 0.8 Hz are eliminated. The 95% confidence ellipse area [30–35] of the complex plot is then calculated for every channel and is considered as the first set of features. Note that the robustness of features related to EEG-based areas and loops has been demonstrated by the authors of the present work in previous publications [3, 48]. The 95% confidence ellipse area is computed by the following method [34–36]: First, the parameters mR, mIM, mR,IM, D, A and B are computed from the real parts R(n) and the imaginary parts IM(n), as in Table 1. Then, the 95% confidence ellipse area (Ellipse) is found as:
Table 1. Formulas useful for calculating the 95% confidence ellipse area of the EEG signal
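As an illustration of the area computation, the sketch below uses a standard bivariate-normal formulation: the area of the 95% confidence ellipse is π · χ² · √(det Σ), where Σ is the sample covariance matrix of the real and imaginary parts and χ² = −2 ln(0.05) ≈ 5.99 is the 95% quantile of the chi-square distribution with two degrees of freedom. This is an assumption made for illustration; the paper's own parameters (mR, mIM, mR,IM, D, A, B) are defined in Table 1, which is not reproduced here, so the exact constants used by the authors may differ. The function name is hypothetical.

```python
import math


def confidence_ellipse_area(real_parts, imag_parts, level=0.95):
    """Area of the confidence ellipse of 2-D points (bivariate-normal assumption)."""
    n = len(real_parts)
    if n != len(imag_parts) or n < 3:
        raise ValueError("need two equally long sequences with at least 3 points")
    mean_r = sum(real_parts) / n
    mean_im = sum(imag_parts) / n
    # Unbiased sample variances and covariance of the real and imaginary parts.
    var_r = sum((r - mean_r) ** 2 for r in real_parts) / (n - 1)
    var_im = sum((i - mean_im) ** 2 for i in imag_parts) / (n - 1)
    cov = sum((r - mean_r) * (i - mean_im)
              for r, i in zip(real_parts, imag_parts)) / (n - 1)
    det = max(var_r * var_im - cov ** 2, 0.0)  # guard against rounding below zero
    # Chi-square quantile with 2 degrees of freedom has the closed form -2*ln(1 - level).
    chi2 = -2.0 * math.log(1.0 - level)
    return math.pi * chi2 * math.sqrt(det)


if __name__ == "__main__":
    # Points on a unit circle, e.g. the complex plot of a pure sinusoid.
    angles = [2.0 * math.pi * k / 360 for k in range(360)]
    area = confidence_ellipse_area([math.cos(a) for a in angles],
                                   [math.sin(a) for a in angles])
    print(area)
```

Because the area depends on the covariance of the points, doubling both the real and imaginary ranges multiplies it by four, which is why a complex plot with wider ranges yields a larger feature value.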
In addition to the feature Ellipse mentioned above, three features are computed for every channel: the average CWT, the maximum EEG amplitude, and the maximum real part of the CWT. Four features are therefore calculated for every channel of every volunteer. The size of the obtained feature array is 122 (number of volunteers) × 4 (number of features) × 64 (number of channels).
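The assembly of the feature array can be illustrated as follows. This is a sketch under stated assumptions: "average CWT" is taken to mean the mean coefficient magnitude and "maximum EEG amplitude" the maximum absolute sample value, neither of which is specified in the text; the ordering of the flattened row (grouped by feature) is also an arbitrary choice. The function names are hypothetical.

```python
def extra_features(eeg, coeffs):
    """Three complementary features for one channel.

    eeg    -- real-valued samples of the channel
    coeffs -- complex CWT coefficients of the channel (all scales, flattened)
    """
    mean_cwt = sum(abs(c) for c in coeffs) / len(coeffs)  # assumption: mean magnitude
    max_amplitude = max(abs(v) for v in eeg)               # assumption: absolute value
    max_real = max(c.real for c in coeffs)
    return [mean_cwt, max_amplitude, max_real]


def subject_row(per_channel):
    """Flatten per-channel feature lists into one row, grouped by feature.

    per_channel -- one list per channel: [ellipse, mean_cwt, max_amplitude, max_real]
    """
    n_features = len(per_channel[0])
    return [channel[k] for k in range(n_features) for channel in per_channel]


if __name__ == "__main__":
    # 64 channels with 4 placeholder features each give a row of 256 values.
    channels = [[float(ch), 0.0, 0.0, 0.0] for ch in range(64)]
    print(len(subject_row(channels)))
```

Stacking one such row per volunteer turns the 122 × 4 × 64 array into a two-dimensional 122 × 256 matrix that can be passed to a dimension-reduction or classification step.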
Principal Components Analysis (PCA)
The size of the obtained feature array is 122 × 4 × 64. As a result, the computational cost is high and the method is time-consuming. Cost and time can be reduced by dimension reduction with as little loss of information as possible. In the present work, Principal Components Analysis (PCA) is applied to reduce the dimension of the obtained feature array. PCA provides output features that are combinations of the input features. It is mainly based on the eigenvectors of the covariance matrix [37].
The obtained feature matrix was dimensionally reduced using PCA. Then, the first 30 and the first 20 principal components were tested.
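The principle of PCA, projection onto the leading eigenvectors of the covariance matrix, can be sketched as below. To stay within the Python standard library the eigenvectors are found by power iteration with deflation; this technique is swapped in for illustration only and is not claimed to be the routine used in the study. The function name and the iteration count are hypothetical.

```python
import math
import random


def pca(rows, n_components, iters=500):
    """Return (scores, explained_variance_ratios) for the leading components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    total_variance = sum(cov[i][i] for i in range(d))
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    components, variances = [], []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient: the variance captured along v.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        components.append(v)
        variances.append(lam)
        # Deflation: remove the component just found from the covariance matrix.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= lam * v[i] * v[j]
    scores = [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
              for c in centered]
    explained = [lam / total_variance for lam in variances]
    return scores, explained


if __name__ == "__main__":
    data = [[3.0, 1.0], [3.0, -1.0], [-3.0, 1.0], [-3.0, -1.0]]
    scores, explained = pca(data, n_components=2)
    print(explained)
```

The cumulative sum of the explained-variance ratios is the quantity that indicates how many components are needed to retain a given share of the variance.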
Classification
An artificial neural network (ANN) model with the Levenberg-Marquardt training function has been used for feature classification to discriminate autistic from neuro-typical subjects. Several numbers of hidden-layer neurons have been tested. 70% and 30% of the data have been used for training and testing, respectively.
Statistical assessment of classification results
The output of the classifier has been assessed with MedCalc through the following statistical descriptors: accuracy, sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, disease prevalence, positive predictive value, and negative predictive value, as in Table 2.
Table 2. Statistical descriptors used for the assessment of classification results. a: true positive, b: false negative, c: false positive and d: true negative
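The descriptors listed above follow directly from the four counts a, b, c and d. The sketch below uses the standard textbook definitions with the paper's labeling (a: true positive, b: false negative, c: false positive, d: true negative); it is not MedCalc's code, and the function name is hypothetical. The example counts (59, 2, 3, 58) are not stated in the text; they are inferred here because, with 61 subjects per group, they reproduce the reported accuracy, sensitivity and specificity.

```python
def diagnostic_stats(a, b, c, d):
    """Descriptors from a: true positive, b: false negative,
    c: false positive, d: true negative.

    Raises ZeroDivisionError in degenerate cases (e.g. specificity of exactly 1).
    """
    total = a + b + c + d
    sensitivity = a / (a + b)
    specificity = d / (c + d)
    return {
        "accuracy": (a + d) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_likelihood_ratio": sensitivity / (1.0 - specificity),
        "negative_likelihood_ratio": (1.0 - sensitivity) / specificity,
        "prevalence": (a + b) / total,
        # Predictive values computed at the sample prevalence.
        "positive_predictive_value": a / (a + c),
        "negative_predictive_value": d / (b + d),
    }


if __name__ == "__main__":
    # Counts inferred from the reported percentages (an assumption, see above).
    for name, value in diagnostic_stats(59, 2, 3, 58).items():
        print(f"{name}: {value:.4f}")
```

Note that the predictive values depend on the prevalence in the sample (50% here), so they would differ in a population where ASD is less frequent.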
Figure 1(a) illustrates the magnitude scalogram of a resting EEG channel (30 seconds) acquired from an autistic child, while Fig. 1(b) illustrates the magnitude scalogram of a resting EEG channel acquired from a neuro-typical child. The comparison between the two figures indicates that high power amplitudes are more dominant in the autistic case than in the normal case. Neither scalogram includes frequencies below 0.8 Hz, because the lower boundary of the slowest EEG frequency band (Delta) is defined as lying between 0.5 Hz and 1 Hz. The present work has therefore experimentally set the cut-off to 0.8 Hz using the software LA-106 ASA ERP in order to eliminate essentially all non-EEG activity. Also, all frequencies higher than 40 Hz have been excluded with the intention of focusing only on the main traditional bands: delta, theta, alpha, beta and gamma.

Fig. 1. (a) Magnitude scalogram of an EEG channel acquired from an autistic child. (b) Magnitude scalogram of an EEG channel acquired from a neuro-typical child.
Since the magnitude of a complex number is calculated from its real and imaginary parts, the result in Fig. 1 is further clarified in Fig. 2, where the imaginary parts of the CWT of two counterpart EEG channels acquired from autistic (Fig. 2(a)) and neuro-typical (Fig. 2(b)) children are plotted versus their corresponding real parts.

Fig. 2. Complex plots for two counterpart EEG channels in healthy and autistic volunteers.
The area values of the two complex plots in Fig. 2 are clearly different. The complex plot in the ASD case is larger than the plot of the normal case because the ranges of the imaginary and real parts differ. This finding is consistent with results in the literature showing that the pace of EEG activity differs between ASD and healthy cases, with differences in the power and dominance of the Alpha, Beta, Delta, Theta and Mu frequency bands [3, 38–41]. In addition, the obtained outcome is consistent with results in the literature indicating dissimilarity in the long-range and short-range connectivity between brain sites [42–44]. Furthermore, the outcomes match the conclusions made by the authors in [3, 48], where spectral tendencies or chaotic behaviors of EEG-related 2D plots were highly affected by ASD; they were also very useful indicators in cases of wide variations in severity [47, 48]. Nevertheless, those previous techniques (regression, second-order difference plots and entropy) are different from the one suggested here.
Note that, in the present work, the 95% confidence area value, rather than the whole area of the complex plot, is the output taken into account, in order to avoid the outliers in the plot of the normal condition. Furthermore, the 95% confidence area approach focuses on the main significant “body” of the plot.
The obtained result highlights the difference in neural discharge frequencies between ASD and neuro-typical cases, which means that the related brain activities during resting state are different. The rhythm of neural discharge in ASD has been previously studied by the authors in [3, 48], where all conclusions pointed to alterations appearing in ASD.
The current work covers a wide range of brain zones with the help of the 64-electrode EEG cap; this supports a realistic and accurate study of brain activity and ensures sufficient data for the next step of classification.
After finding the area value for every channel (the total number of channels is 64) for every child (the total number of children is 122), three additional features are computed for every channel of every child: the average CWT, the maximum EEG amplitude, and the maximum real part of the CWT. The four sets of features for all channels produce the final feature matrix, with the dimension 122 × 256. PCA has been useful in reducing this dimension without influencing the accuracy. Table 3 presents the testing and the overall accuracies achieved by neural network (NN) classification applied to different reduced forms of the feature matrix. Furthermore, several numbers of hidden-layer neurons have been explored. The NN classification of the features with the reduced dimension 122 × 20, which represents 99.99% of the variance, yielded the best accuracies (testing: 91.7% and overall: 95.9%) when the hidden layer consisted of 10 neurons.
Table 3. Classification results of 122 subjects (61 normal and 61 autistic)
The classification outcome indicates that combining spectral and temporal features results in a powerful and promising technique that takes into account both the temporal energy and the spectral distribution/dominance of EEG channels in ASD. This conclusion is consistent with recent research that studied the main irregularities in ASD EEG compared to normal EEG [45]. Note that, although the proposed method is mainly based on the elliptic area, additional features had to be included in order to account for the effect of normalization. In wavelet analysis, the effect of EEG amplitude modulation on frequency information cannot be completely prevented; the complementary features are hence embedded in the classification system to limit the consequences of this phenomenon. In addition, the system has to overcome the inter-individual variability of EEG values.
Table 4 presents the statistical assessment of the best classification scenario found in Table 3. The key values are: accuracy 95.90%, specificity 95.08%, and sensitivity 96.72%. Figure 3 shows the numbers of children used in the different phases of NN classification as well as the accuracies in the training and validation phases.
Table 4. Statistical assessment of the overall classification results (the 122 × 20 array)

Fig. 3. Results of classification applied to the data after PCA (the 122 × 20 array).
The main outcome of the present work is the high accuracy achieved after the classification of a number of features extracted from a comparatively large sample. As indicated in the introduction, a small sample is one of the main limitations of most current research on ASD [24]. It leaves the reader uncertain about the robustness of the suggested approaches and the repeatability of the results. Furthermore, it is noteworthy that the sample selected in the present work encompasses a variety of ASD severity levels, which is not the case in many current articles, where volunteers with close severity levels are recruited [46]. Finally, the present work does not involve any additional cerebral imaging [1].
The proposed method is a promising, simple, EEG-based computerized technique that can discriminate between ASD and neuro-typical children with high accuracy.
In future work, the number of volunteers will be increased to test the robustness of the proposed method. Furthermore, the usefulness of the approach for discriminating between different severity levels of ASD will be tested.
Acknowledgment
This study has been conducted under the umbrella of the scientific research support fund (SRSF)-funded project # MPH/1/11/2014-Ministry of Higher Education and Scientific Research, Hashemite Kingdom of Jordan.
