Abstract
Keywords
Introduction
Epilepsy is a chronic disease in which neurons in the brain discharge suddenly, causing brief brain dysfunction [1]. Epilepsy may result in cognitive decline, convulsions, injury or even death [2]. According to the extent of brain involvement during a seizure, epilepsy can be divided into two classes: 1) generalized seizures and 2) partial seizures [3]. Generalized seizures involve both cerebral hemispheres, while partial seizures originate from a subset of neurons in one hemisphere. Epilepsy has many causes, such as heredity, brain diseases, and general or systemic diseases. The main way to detect epilepsy is to examine the electroencephalogram (EEG), which also contributes to the classification of epilepsy.
EEG is a recording of the electrical activity of the brain, and it is a complex, aperiodic time series [4]. It is a harmless examination method that plays a major role in the diagnosis of neurological diseases. EEG falls into two categories according to the location of the electrodes. The first is scalp EEG, in which the electrodes are placed on the scalp. The second is intracranial EEG, which is obtained from special electrodes implanted in the brain during surgery [3]. EEG recordings contain a great deal of valuable information for understanding epilepsy. Therefore, epilepsy can be diagnosed from the result of EEG classification.
Detrended fluctuation analysis (DFA) is a time series analysis method. It is appropriate for analyzing random and non-stationary time series with long memory. Epilepsy detection and classification based on DFA have achieved good results [5, 6]. Kantelhardt et al. [7, 8] proposed the multi-fractal detrended fluctuation analysis (MF-DFA) method based on DFA. Traditional feature extraction typically adopts a discrete wavelet transform or entropy. This paper instead extracts features from EEG using MF-DFA. Although the method is simple and uses fewer parameters, the obtained features can effectively represent the samples even though the number of features is much smaller. In addition, we use a genetic algorithm (GA) to set the C and γ parameters of the support vector machine (SVM). We classify the EEG by combining MF-DFA with the GA-based SVM (GA-SVM). In the experiments, we first use sample data to train the classifier. The trained classifier is then used to predict the test data, and the results are compared with other proposed methods. The results show that our method achieves high accuracy in EEG classification and is superior to some other algorithms, which justifies its use for detecting epileptic seizures effectively.
The paper is organized as follows. Section 2 briefly describes related work. Section 3 describes the methods used for EEG classification: multi-fractal detrended fluctuation analysis (MF-DFA), the support vector machine (SVM), the genetic algorithm based support vector machine (GA-SVM), and the sensitivity and specificity used to measure the classification results. Sections 4 and 5 present how the proposed method is applied to classify EEG signals and the results of the comparison. Finally, Section 6 concludes the paper.
Related works
EEG signals have been used for the detection of many neurological conditions. Roach et al. [9] analyzed the EEG in schizophrenia. Brunner et al. [10] identified muscle artifacts in the sleep EEG; the all-night EEG power spectrum showed significant reductions in power density when the EEG signal was cleared of muscle artifacts. Chervin et al. [11] proposed a method for detecting respiratory cycle-related EEG changes in sleep-disordered breathing. Murugappan et al. [12] used EEG signals for human emotion detection. Brown et al. [13] and Jatupaiboon et al. [14] also detected emotion from EEG, and Jatupaiboon et al. [14] used real-time EEG signals to classify happy and unhappy emotions elicited by pictures and classical music. Bellotti et al. [15] analyzed the EEG in order to detect migraine. In addition, Temko et al. [16] detected neonatal seizures from EEG signals.
Other researchers have worked on the automatic detection of seizures [1–4, 17–24]. For instance, Guo et al. [3] used line length features combined with an artificial neural network (ANN) to classify EEG signals according to whether a seizure is present. Yadav et al. [1] proposed a model-based patient-specific method for automatic detection of seizures in intracranial EEG recordings. Their model included template seizure pattern segmentation, rejection of redundant and noisy segments, feature extraction, best model selection and classifier training. Güler et al. [20] used Lyapunov exponents and the Levenberg-Marquardt algorithm to evaluate the classification accuracy of recurrent neural networks on EEG signals. A mixture of experts (ME) model with wavelet feature extraction was used to detect epileptic seizures [21]. In the work of Güler et al. [22], decision making was performed in two stages: feature extraction by eigenvector methods, followed by classification. Chandaka et al. [23] proposed a cross-correlation aided SVM classifier. Das et al. [2] employed normal inverse Gaussian (NIG) parameters in the dual-tree complex wavelet transform domain for classifying EEG data. Lastly, Güler et al. [4] and Murugavel et al. [24] used multiclass SVM for EEG signal classification.
Methodology
A large amount of epileptic and healthy EEG data is already available. Here, we adopt machine learning methods to learn a classifier from the existing data and use the classifier to predict new data. However, the existing data are random, non-stationary and nonlinear. Therefore, we need to extract features from each record so that a small number of features describes the data. This reduces the computational cost of learning and prevents uninformative data from influencing the classifier. The whole process is summarized in Fig. 1. Here, we use MF-DFA to extract features and GA-SVM as the classifier.
Multifractal detrended fluctuation analysis (MF-DFA)
Here, we assume that an EEG is a time series x(t), t = 1, 2, ..., n.
First step: construct the profile (cumulative trend) y(i):
Second step: split the series y(i) into m non-overlapping segments of equal length s. For the s points in each segment v (v = 1, 2, ..., m), the least squares method is used to fit a polynomial of order k (k = 1, 2, ...), denoted as y_{v,k}(j) (j = 1, 2, ..., s). For each segment, we calculate the variance between the fitted polynomial trend y_{v,k}(j) and y(i):
Third step: calculate the q-th order fluctuation function:
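In the standard MF-DFA formulation of Kantelhardt et al. [7, 8], the three steps above are usually written as follows. The notation F^2(s, v) for the segment variance and \bar{x} for the series mean is assumed here and may differ from the symbols used in this paper's numbered equations.

```latex
% Step 1: profile (cumulative sum of deviations from the mean)
y(i) = \sum_{t=1}^{i} \left[ x(t) - \bar{x} \right], \qquad
\bar{x} = \frac{1}{n} \sum_{t=1}^{n} x(t), \qquad i = 1, 2, \ldots, n

% Step 2: variance of segment v around its order-k polynomial fit
F^2(s, v) = \frac{1}{s} \sum_{j=1}^{s}
            \left[ y\big((v-1)s + j\big) - y_{v,k}(j) \right]^2

% Step 3: q-th order fluctuation function (q \neq 0)
F_q(s) = \left\{ \frac{1}{m} \sum_{v=1}^{m}
         \left[ F^2(s, v) \right]^{q/2} \right\}^{1/q}
```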
If the time series y(i) is long-range correlated, there is a power-law relation between the q-th order fluctuation function F_q(s) and the time scale s.
In Equation (5), h(q) is the generalized Hurst index and h(2) is the classic Hurst index. When q > 0, F_q(s) is dominated by the large fluctuations of the time series, whereas when q < 0 it is dominated by the small fluctuations. When q = 0, Equation (6) is used instead.
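For reference, the scaling law and its q = 0 limit are commonly written as below in the MF-DFA literature [7, 8]. This is a sketch of the standard forms, assumed to correspond to Equations (5) and (6), with F^2(s, v) denoting the variance of segment v.

```latex
% Power-law scaling of the fluctuation function
F_q(s) \propto s^{h(q)}

% Logarithmic average used when q = 0, where the 1/q exponent diverges
F_0(s) = \exp\left\{ \frac{1}{2m} \sum_{v=1}^{m} \ln F^2(s, v) \right\}
         \propto s^{h(0)}
```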
Figure 3 shows the power-law relationship between F_q(s) and s for q = −6, −4, −2, 0, 2, 4, 6. The generalized Hurst index h(q) is calculated by Equation (5), and Equation (7) gives the scaling exponent τ(q).
The multifractal parameters, including the moment order q of the fluctuation function and the generalized dimension D_q, can be obtained by the Legendre transform.
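A common way to write the scaling exponent and the Legendre transform is sketched below. It assumes that h_q and D_q in this paper denote the singularity strength and the multifractal spectrum, which many texts write as α and f(α); that mapping is an assumption made for illustration.

```latex
% Scaling exponent from the generalized Hurst index
\tau(q) = q\,h(q) - 1

% Legendre transform: singularity strength and multifractal spectrum
h_q = \frac{d\tau(q)}{dq}, \qquad D_q = q\,h_q - \tau(q)
```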
Through the above calculations, the multifractal spectrum of a signal can be obtained, as shown in Fig. 4.
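The procedure can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in the paper: it assumes linear detrending (k = 1), uses only the m segments taken from the start of the series, and the function names `detrended_variance`, `fluctuation` and `generalized_hurst` are hypothetical.

```python
import math
import random


def detrended_variance(segment):
    """Mean squared residual of a segment after removing a least-squares line (k = 1)."""
    s = len(segment)
    t_mean = (s - 1) / 2.0
    y_mean = sum(segment) / s
    sxx = sum((t - t_mean) ** 2 for t in range(s))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sxy / sxx
    return sum((y - y_mean - slope * (t - t_mean)) ** 2
               for t, y in enumerate(segment)) / s


def fluctuation(x, s, q):
    """q-th order fluctuation function F_q(s) of the series x at scale s."""
    mean_x = sum(x) / len(x)
    profile, running = [], 0.0
    for value in x:                      # step 1: cumulative sum of deviations
        running += value - mean_x
        profile.append(running)
    m = len(profile) // s                # step 2: m non-overlapping segments
    variances = [detrended_variance(profile[v * s:(v + 1) * s]) for v in range(m)]
    if q == 0:                           # step 3: logarithmic average for q = 0
        return math.exp(sum(math.log(f2) for f2 in variances) / (2.0 * m))
    return (sum(f2 ** (q / 2.0) for f2 in variances) / m) ** (1.0 / q)


def generalized_hurst(x, scales, q):
    """Estimate h(q) as the slope of log F_q(s) against log s."""
    log_s = [math.log(s) for s in scales]
    log_f = [math.log(fluctuation(x, s, q)) for s in scales]
    s_mean = sum(log_s) / len(log_s)
    f_mean = sum(log_f) / len(log_f)
    num = sum((a - s_mean) * (b - f_mean) for a, b in zip(log_s, log_f))
    den = sum((a - s_mean) ** 2 for a in log_s)
    return num / den


# Example on synthetic white noise (not EEG data); h(2) should be near 0.5.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
h2 = generalized_hurst(noise, [16, 32, 64, 128, 256], q=2)
```

Evaluating `generalized_hurst` over a range of q values gives the h(q) curve from which τ(q) and the spectrum follow. A segment whose detrended variance is exactly zero would break the logarithm and the negative powers, so real use needs a guard for that case.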
SVM was first proposed by Vapnik [25] in the 1990s. It is a supervised learning model that is well suited to small-sample problems. The mechanism of SVM is to find the optimal separating hyperplane that meets the classification requirements. This hyperplane maximizes the margin on both of its sides while preserving classification accuracy. In theory, SVM achieves the optimal classification of linearly separable data.
Given the training dataset (x_i, y_i), i = 1, 2, …, l, with x_i ∈ R^n and y_i ∈ {±1}, the hyperplane is denoted as (w · x) + b = 0. To ensure that the hyperplane classifies all samples correctly and keeps a classification margin, it must satisfy the constraint:
The classification margin is 2/||w||. Therefore, the problem of constructing the optimal hyperplane can be transformed into solving the objective (10) subject to the constraint given by Equation (9):
To handle the constraint, the Lagrange function is introduced as Equation (11):
In Equation (11), a_i ≥ 0 are the Lagrange multipliers. The solution of the constrained optimization problem is determined by the saddle point of the Lagrange function, where the partial derivatives with respect to w and b are 0. As such, the problem can be transformed into the dual problem (12):
The optimal solution is a* = (a_1*, …, a_l*)^T, and the optimal weight vector w* and the optimal offset b* are:
Therefore, the optimal classifying hyperplane is (w* · x) + b* = 0 and the optimal classifying function is:
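The derivation above follows the textbook hard-margin SVM, whose equations are usually written as below. This is a sketch of the standard forms under the notation of the text; the expression for b*, which uses an arbitrary support vector x_j, is one of several equivalent choices and is assumed here.

```latex
% Constraint: every training sample lies on the correct side with margin >= 1
y_i \left[ (w \cdot x_i) + b \right] \ge 1, \qquad i = 1, 2, \ldots, l

% Objective: maximizing 2/\|w\| is equivalent to minimizing
\min_{w, b} \; \frac{1}{2} \|w\|^2

% Lagrange function
L(w, b, a) = \frac{1}{2} \|w\|^2
  - \sum_{i=1}^{l} a_i \left\{ y_i \left[ (w \cdot x_i) + b \right] - 1 \right\}

% Dual problem
\max_{a} \; \sum_{i=1}^{l} a_i
  - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} a_i a_j y_i y_j (x_i \cdot x_j)
\quad \text{s.t.} \quad \sum_{i=1}^{l} a_i y_i = 0, \;\; a_i \ge 0

% Optimal weight vector and offset (x_j is any support vector, a_j^* > 0)
w^* = \sum_{i=1}^{l} a_i^* y_i x_i, \qquad b^* = y_j - (w^* \cdot x_j)

% Classifying function
f(x) = \operatorname{sgn}\left\{ (w^* \cdot x) + b^* \right\}
```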
For a linearly inseparable dataset, SVM maps the input vector to a high-dimensional feature space and constructs the optimal classification hyperplane in that space. The transform φ that maps x from the input space R^n to the feature space H is shown in Equation (16):
Using the feature vector φ(x) in place of the input vector x, the optimal classification function is defined as follows:
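In practice the mapping φ is never computed explicitly. The inner product φ(x_i) · φ(x) is replaced by a kernel K(x_i, x), and the classifier takes the sign of the sum of a_i* y_i K(x_i, x) over the support vectors plus b*. The sketch below illustrates this with the RBF kernel K(u, v) = exp(−γ||u − v||²); the support vectors, multipliers and offset are made-up values for illustration, not results from the paper.

```python
import math


def rbf_kernel(u, v, gamma):
    """RBF kernel exp(-gamma * ||u - v||^2) for two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def decision_function(x, support_vectors, labels, multipliers, offset, gamma):
    """Return +1 or -1 from the kernelized SVM classification function."""
    score = sum(a * y * rbf_kernel(sv, x, gamma)
                for sv, y, a in zip(support_vectors, labels, multipliers))
    return 1 if score + offset >= 0 else -1


# Hypothetical two-support-vector model, one per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
multipliers = [1.0, 1.0]
prediction = decision_function((1.9, 2.1), support_vectors, labels,
                               multipliers, offset=0.0, gamma=0.2)
```

A larger γ makes each support vector's influence more local, which is why γ is tuned together with the penalty parameter C.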
The numbers of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) measure the quality of the classification results. In particular, sensitivity and specificity measure the prediction accuracy for positive samples and for negative samples, respectively.
The overall prediction accuracy can be calculated by Equation (20).
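These three measures have standard definitions: sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and accuracy is (TP + TN) / (TP + TN + FP + FN). The short sketch below computes them from the four counts; the function name and the example counts are hypothetical and are not results from the experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                   # fraction of positives found
    specificity = tn / (tn + fp)                   # fraction of negatives found
    accuracy = (tp + tn) / (tp + tn + fp + fn)     # fraction of all samples correct
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only.
sensitivity, specificity, accuracy = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```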
GA is an optimization method based on the mechanisms of biological genetics and evolution, proposed by Holland et al. [26] in 1975. It is appropriate for adaptive probabilistic optimization of complex systems. Unlike traditional search algorithms, GA is not based on the evaluation function or Gaussian statistics. It searches for globally optimal solutions by simulating the natural evolutionary process. Our paper uses the Simple Genetic Algorithm (SGA) to optimize the parameters of the SVM model, which is defined as follows:
SGA = (C, E, P0, M, φ, Γ, Ψ, T)
In this definition, C is the chromosome encoding method in GA, and E is the individual fitness function. P0 is the initial population and M is its size. φ is the selection operator, Γ is the crossover operator, Ψ is the mutation operator, and T is the termination condition. Fernandez et al. [27] proposed the general strategy of using GA to optimize the parameters of SVM. We use this strategy to optimize the SVM parameters in combination with the features of the time series data. The process is illustrated in Fig. 5.
In our experiment, chromosome encoding is applied first, using symbols to abstract an object into an ordered string. Since GA has implicit parallelism and global search ability, it can find globally optimal solutions in a short time. The SVM kernel function in this paper is the radial basis function (RBF), so only the parameters C and γ need to be optimized. The encoding structure is shown in Fig. 6.
Then, a fitness function is applied to evaluate performance; it serves as the natural selection standard for distinguishing good individuals from bad ones in the population. This choice directly affects the performance of the algorithm. Here, the fitness value is calculated by Equation (20). The genetic operators, which include selection, crossover and mutation, are the algorithm's operations on the population. By adopting the roulette method, the selection operator ensures that chromosomes with a high fitness value have a high probability of being selected, while chromosomes with a low fitness value still have some probability of being selected, which helps to avoid locally optimal solutions. Based on the crossover probability, the crossover operator adopts a multi-point crossover strategy: it randomly selects multiple points on one chromosome and exchanges the segments between them with those at the same points on another chromosome. The mutation operator mutates one point of a chromosome based on the mutation probability, flipping the bit at the mutation point.
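The operators described above can be sketched as follows. This is an illustrative pure-Python version, not the Pyevolve-based implementation used in the paper. The bit width `BITS`, the parameter bounds, the default probabilities and the `toy_fitness` surface are all assumptions; in the actual method the fitness is the classification accuracy of an SVM trained with the decoded C and γ.

```python
import random

BITS = 10  # bits per encoded parameter (assumed)


def decode(bits, lo, hi):
    """Map a list of 0/1 bits to a real value in [lo, hi]."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]


def multipoint_crossover(a, b, rng, points=2):
    """Exchange alternating segments between the same cut points of a and b."""
    cuts = sorted(rng.sample(range(1, len(a)), points))
    child_a, child_b = a[:], b[:]
    swap, prev = False, 0
    for cut in cuts + [len(a)]:
        if swap:
            child_a[prev:cut], child_b[prev:cut] = b[prev:cut], a[prev:cut]
        swap, prev = not swap, cut
    return child_a, child_b


def mutate(bits, rng, p_mut):
    """With probability p_mut, flip the bit at one randomly chosen point."""
    child = bits[:]
    if rng.random() < p_mut:
        point = rng.randrange(len(child))
        child[point] = 1 - child[point]
    return child


def run_ga(fitness, bounds, pop_size=20, generations=30,
           p_cross=0.8, p_mut=0.1, seed=0):
    """Return the best decoded parameter list found for a positive fitness function."""
    rng = random.Random(seed)

    def params(ch):
        return [decode(ch[i * BITS:(i + 1) * BITS], lo, hi)
                for i, (lo, hi) in enumerate(bounds)]

    def score(ch):
        return fitness(*params(ch))

    population = [[rng.randint(0, 1) for _ in range(BITS * len(bounds))]
                  for _ in range(pop_size)]
    best = max(population, key=score)
    for _ in range(generations):
        fits = [score(ch) for ch in population]
        next_gen = []
        while len(next_gen) < pop_size:
            a = roulette_select(population, fits, rng)
            b = roulette_select(population, fits, rng)
            if rng.random() < p_cross:
                a, b = multipoint_crossover(a, b, rng)
            next_gen += [mutate(a, rng, p_mut), mutate(b, rng, p_mut)]
        population = next_gen[:pop_size]
        best = max(population + [best], key=score)  # keep the best seen so far
    return params(best)


def toy_fitness(c, gamma):
    """Placeholder fitness surface; stands in for SVM accuracy."""
    return 1.0 / (1.0 + (c - 3.0) ** 2 + (gamma - 0.5) ** 2)


best_c, best_gamma = run_ga(toy_fitness, [(0.1, 10.0), (0.01, 1.0)])
```

Roulette selection requires non-negative fitness values, which holds for accuracy. Keeping the best individual seen so far is a small addition to the plain SGA so that the returned parameters never get worse between generations.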
The SVM classification model comes from MLlib (the machine learning library of Spark) [28], and the GA is based on Pyevolve (a Python library) [29]. We combined them to run GA-SVM on a cloud platform.
First, the data source is described in [23]. The whole dataset is composed of five sets (Z, O, N, F, S), and each set corresponds to a class with 100 records. Each record is a single-channel EEG segment with a length of 23.6 s and a sampling frequency of 173.6 Hz. The Z set represents normal EEG, while the S set contains only epileptic EEG. Examples of the EEG are shown in Fig. 7.
Next, to extract features, the value of s was set to 16 and k was set to 1. In Fig. 8, we chose two samples each from set S and set Z for MF-DFA. From these four samples, we find that the two classes (S and Z) differ clearly in the relationship between the parameters D_q and h_q. The differences between the maximum and minimum values of h_q for samples in set S are greater than those in set Z. At the same time, the differences between the values of D_q corresponding to the maximum and minimum values of h_q for samples in set S are also larger than those in set Z.
Here, we chose the points with maximum h_q, minimum h_q and maximum D_q as the feature of a sample. For example, the feature of sample 1 is ((1.248285, −0.291364), (0.274880, 0.689524), (1.000000, 0.524602)). With traditional Fourier transform based methods, it is difficult to fix and choose the frequency and phase at different decomposition levels. Therefore, feature extraction is hard and computationally heavy. In contrast, MF-DFA is simple and efficient for feature extraction. Moreover, the features can represent the signals better.
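Selecting these three points from a computed spectrum is straightforward. The sketch below assumes the spectrum is available as two equal-length lists, `h_q` and `d_q`, and returns each point as an (h_q, D_q) pair; the function name, the pair ordering and the toy spectrum values are assumptions for illustration.

```python
def spectrum_features(h_q, d_q):
    """Return the (h_q, D_q) points at max h_q, min h_q and max D_q."""
    indices = range(len(h_q))
    i_max_h = max(indices, key=lambda i: h_q[i])
    i_min_h = min(indices, key=lambda i: h_q[i])
    i_max_d = max(indices, key=lambda i: d_q[i])
    return [(h_q[i], d_q[i]) for i in (i_max_h, i_min_h, i_max_d)]


# Toy spectrum, not taken from the dataset.
features = spectrum_features([1.2, 0.9, 0.5, 0.3], [-0.2, 0.6, 1.0, 0.7])
```

Flattening the three pairs gives a six-dimensional feature vector per EEG record, which is what keeps the classifier input small.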
Then, according to the structure of the classification problem, the parameters of the GA-SVM are shown in Table 1. These parameters were set based on the general guidelines given in the literature and the authors' computational experiments with the proposed algorithm.
Results
First, we used the GA to determine the values of C and γ. Here, 10 data samples were chosen from each of sets S and Z as training data, and the remaining data from the two sets were used as test data. The accuracy was used as the fitness value. As a result, we obtained C = 2 and γ = 0.2.
Based on the above C and γ values, we trained and tested our classifier with different splits of the data. The classification results are shown in Table 2. In the first experiment, only one data sample was selected from each of the two sets as training data, and the remaining data samples were used as test data. The accuracy was 89.90%, and when the size of the training data increased, the algorithm kept achieving the same accuracy. In general, our classifier achieves high and stable classification accuracy across different ratios of training data to test data.
Many researchers have worked on epileptic seizure detection. Table 3 compares our method with some previous methods on the same dataset. For the Z-S classification problem, the classification accuracies of all the methods are close to 100%. Since MF-DFA is much simpler and has a lower computational cost, our method can perform epileptic seizure detection more quickly, almost in real time.
Conclusion
In this study, we proposed and tested a new MF-DFA based SVM classification scheme. This scheme can effectively diagnose epilepsy by analyzing EEG. MF-DFA is usually used in the analysis of economic time series data, but we have used it for feature extraction from EEG. Even though MF-DFA is much simpler and uses fewer parameters, the features it obtains can represent samples as well as those from the traditional wavelet transform and Lyapunov exponents. In addition, we used a genetic algorithm (GA) to set the C and γ parameters of the support vector machine (SVM). We classified the EEG by combining MF-DFA with the GA-based SVM (GA-SVM). The experiments confirm that our method achieves comparable accuracy, which means that it is effective for epileptic seizure detection. Epilepsy has many symptoms, but so far we can only distinguish between brain signals in the healthy state and in epilepsy. In future work, we will focus on the classification of multiple types of epilepsy by means of deep learning methods.
Footnotes
Acknowledgments
This work was supported by NSFC (No. 61402387); NSFC (No. 61402390); Science and Technology Key Project of Fujian Province, China (2014H0044); Science and Technology Guiding Project of Fujian Province, China (2015H0037, 2016H0035); Enterprise Technology Innovation Project of Fujian Province; Education and Research Project of Young and Middle-aged Teacher of Fujian Province, China (JAS151230, JA15018); Overseas Study Scholarship of Fujian Province; and Science and Technology Project of Xiamen, China (3502Z20153026).
