Epileptic seizure detection in EEG signal with GModPCA and support vector machine

Abstract

Background and objective:

Epilepsy is one of the most common neurological disorders and is characterized by recurrent seizures. Electroencephalograms (EEGs) record neural activity and can be used to detect epilepsy. Visual inspection of an EEG signal for epileptic seizure detection is time-consuming and prone to human error; therefore, a number of automated seizure detection frameworks have recently been proposed to replace this traditional approach. Feature extraction and classification are two important steps in these frameworks. Feature extraction focuses on finding informative features that can be used for classification and correct decision-making. Therefore, proposing effective feature extraction techniques for seizure detection is of great significance.

Methods:

Principal Component Analysis (PCA) is a dimensionality reduction technique used in different fields of pattern recognition, including EEG signal classification. Global modular PCA (GModPCA) is a variation of PCA. In this paper, an effective framework with GModPCA and Support Vector Machine (SVM) is presented for epileptic seizure detection in EEG signals. Feature extraction is performed with GModPCA, whereas an SVM trained with a radial basis function kernel performs the classification between seizure and nonseizure EEG signals. Seven different experimental cases were conducted on the benchmark epilepsy EEG dataset. The system performance was evaluated using 10-fold cross-validation. In addition, we prove analytically that GModPCA has lower time and space complexity than PCA.

Results:

The experimental results show that EEG signals have strong inter-sub-pattern correlations. GModPCA and SVM achieved 100% accuracy for the classification between normal and epileptic signals. In total, seven different experimental cases were tested. The classification results of the proposed approach were better than those of some of the existing methods proposed in the literature. It is also found that the time and space complexities of GModPCA are lower than those of PCA.

Conclusions:

This study suggests that GModPCA and SVM could be used for automated epileptic seizure detection in EEG signal.

Keywords

Electroencephalogram signal (EEG), Modular Principal Component Analysis (MPCA), Global Modular Principal Component Analysis (GModPCA), Support Vector Machine (SVM)

1. Introduction

Epilepsy is a neurological disorder that affects approximately 50 million people worldwide, as reported by the World Health Organization [1]. The electroencephalogram (EEG) is a common means of recording the brain's electrical activity [2]. EEG is widely used in medical applications for the diagnosis of epileptic seizures [3,4]. EEG signals are recorded by placing electrodes over the scalp. Manual analysis of these recorded signals by visual inspection is time-consuming and error-prone. Hence, automated schemes with a high seizure detection rate are required.

In the last few years, a number of methods have been proposed for seizure detection in EEG signals. The two basic steps involved in these methods are feature extraction and classification. Feature extraction reduces the dimension of the input patterns by keeping the most important attributes and forms the feature vectors, which are then given as input to a classifier. Methods suggested in the literature include the Fourier transform [5,6], the wavelet transform [7–13], the multiwavelet transform [14] and time-frequency analysis [15]. Empirical mode decomposition (EMD) has also been used for the classification between normal and epileptic EEG signals [16]. Principal component analysis (PCA) is a dimensionality reduction technique and has been used for seizure detection and classification [17,18].

Recently, different techniques based on linear prediction error energy [19], fractional linear prediction [20], the Hilbert-Huang transform [21], and wavelet-based nonlinear features with an extreme learning machine [22] have been reported for epileptic seizure detection in EEG signals.

Local binary pattern (LBP) is a well-known feature extraction technique used for face recognition [23]. Kaya et al. [24] applied 1-D LBP to the raw EEG signal for feature extraction. The extracted features were then used to train different classifiers.

In this study, global modular PCA (GModPCA) and SVM have been employed for seizure detection. PCA focuses only on finding global variation. Modular PCA (MPCA) was introduced by Asari et al. [25] for face recognition. In MPCA, the input images are divided into subimages and PCA is performed on each subimage, so it focuses on finding local variation within a small segment. Recently, Kadappa and Negi [26] introduced GModPCA, which focuses on both local and global variations of the input patterns. This technique consists of two steps. The first step is identical to MPCA. In the second step, PCA is performed on the features extracted in the first step to further reduce the dimensionality and take advantage of inter-sub-pattern correlation [26]. The resulting feature vectors are given as input to an SVM for classification.

To evaluate the performance, the scheme has been tested on the benchmark EEG data set using 10-fold cross validation. In addition, we analyze the time and space complexities of PCA, MPCA and GModPCA.

The remaining content of this paper is organized as follows: Methodology and materials used are discussed in Section 2. Experimental results are shown in Section 3. Finally, Section 4 concludes the article with future direction.

2. Methodology and materials

PCA is a dimensionality reduction technique, and MPCA and GModPCA are two variations of it. MPCA, which focuses on local variations, was introduced by Asari et al. [25] for face recognition. GModPCA was proposed by Kadappa and Negi [26] to take advantage of both local and global variations. In this study, we have applied both techniques to epileptic seizure detection in EEG signals, even though MPCA was originally introduced for face recognition. An SVM has been used for the classification between seizure and nonseizure EEG signals. In both techniques the input patterns are divided into S subpatterns. Figure 1 depicts an example of this partitioning.

Fig. 1.

An EEG signal is divided into S subpatterns, where $S = 4$ .

2.1. GModPCA

The steps involved in GModPCA are as follows:

1. Let $X_{N * d}$ denote the N input signals, each having dimension d. Divide each input signal into S subpatterns, each of dimension $u = d / S$ . A set $I_{NS * u}$ is formed by collecting all the subpatterns into a single group.

1.1. The mean pattern of $I_{NS * u}$ is computed as, $\begin{matrix} (1) & M_{1 * u} = \frac{1}{NS} \sum_{i = 1}^{NS} I_{i} \end{matrix}$

1.2. The covariance matrix is computed as, $\begin{matrix} (2) & {(C)}_{u * u} = \frac{1}{NS} \sum_{i = 1}^{NS} {(I_{i} - M)}^{T} (I_{i} - M) \end{matrix}$

1.3. Compute the eigenvalues ( $λ_{j}^{i}$ ) and corresponding eigenvectors ( $e_{j}^{i}$ ), for $j = 1, \dots, u$ .

1.4. Select the r ( $r ⩽ u$ ) largest eigenvalues and find the corresponding eigenvectors. Let E denote the set of these r selected eigenvectors.

1.5. The local PCs for the subpattern set I are obtained by projecting it onto E. The local PCs set (Y) is obtained as, $\begin{matrix} (3) & {(Y)}_{NS * r} = {(I)}_{NS * u} {(E)}_{u * r} \end{matrix}$

1.6. Concatenate Y in accordance with the partition sequence followed in step 1. Let Y be the set obtained after concatenation. $\begin{matrix} (4) & Y_{N * Sr} = \mathrm{concatenate} ({(Y)}_{NS * r}) \end{matrix}$

2. Once the feature set ${(Y)}_{N * Sr}$ is obtained, the following operations are performed.

2.1. Compute the covariance matrix ${(C^{G})}_{Sr * Sr}$ for $Y_{N * Sr}$ .

2.2. Find the eigenvalues ( $λ_{j}^{G}$ ) and corresponding eigenvectors ( $e_{j}^{G}$ ), for $j = 1, \dots, Sr$ .

2.3. Select the w ( $w < Sr$ ) largest eigenvalues and the corresponding eigenvectors. Let $E^{G}$ denote the set of the chosen w eigenvectors.

2.4. The final feature set Z is obtained by projecting Y onto $E^{G}$ , i.e. $\begin{matrix} (5) & {(Z)}_{N * w} = {(Y)}_{N * Sr} {(E^{G})}_{Sr * w} \end{matrix}$
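
The two-stage procedure in steps 1 and 2 can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' MATLAB implementation: the function names `leading_eigenvectors` and `gmodpca_features` are hypothetical, the sketch assumes d is an exact multiple of S, and it projects the uncentered data exactly as Eqs. (3) and (5) are written.

```python
import numpy as np


def leading_eigenvectors(data, n_keep):
    """Eigenvectors of the covariance of `data` for the n_keep largest eigenvalues."""
    mean = data.mean(axis=0)                     # Eq. (1): mean pattern
    centered = data - mean
    cov = centered.T @ centered / data.shape[0]  # Eq. (2): covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_keep]   # indices of the largest eigenvalues
    return eigvecs[:, order]


def gmodpca_features(X, S, r, w):
    """Return (Y, Z): the MPCA features (N x S*r) and the GModPCA features (N x w)."""
    N, d = X.shape
    if d % S != 0:
        raise ValueError("this sketch assumes d is a multiple of S")
    u = d // S
    # Step 1: contiguous subpatterns; row n*S + s holds X[n, s*u:(s+1)*u].
    I = X.reshape(N * S, u)
    E = leading_eigenvectors(I, r)
    Y_local = I @ E                    # Eq. (3): local principal components
    Y = Y_local.reshape(N, S * r)      # Eq. (4): concatenate in partition order
    # Step 2: ordinary PCA on the concatenated local features.
    E_G = leading_eigenvectors(Y, w)
    Z = Y @ E_G                        # Eq. (5): final feature set
    return Y, Z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(50, 64))  # synthetic stand-in for EEG signals
    Y_demo, Z_demo = gmodpca_features(X_demo, S=8, r=3, w=5)
    print(Y_demo.shape, Z_demo.shape)
```

In this sketch Y plays the role of the MPCA feature set and Z the GModPCA feature set, so the same function serves both methods compared in the experiments.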

GModPCA consists of two steps. The first step is MPCA (Fig. 2). The second step is the application of PCA to the features extracted in the first step (Fig. 3).

Fig. 2.
Step 1 of GModPCA.

Fig. 3.
Step 2 of GModPCA.

2.1.1. Subpattern formation

Patterns must be partitioned into equal-size subpatterns such that the loss of pattern information is avoided or minimized. The subpatterns can be formed contiguously or randomly. In this research, a contiguous partitioning approach has been followed (Fig. 1).
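
Contiguous partitioning of a single signal can be illustrated as follows. The helper name `partition_contiguous` is hypothetical, and dropping the trailing samples when the signal length is not a multiple of S is an assumption of this sketch; the text only requires that the loss be avoided or minimized.

```python
import numpy as np


def partition_contiguous(signal, S):
    """Split a 1-D signal into S contiguous, equal-size subpatterns (rows)."""
    signal = np.asarray(signal)
    u = signal.shape[0] // S           # subpattern length u = floor(d / S)
    # Assumption: trailing samples that do not fill a subpattern are dropped.
    return signal[: u * S].reshape(S, u)


if __name__ == "__main__":
    print(partition_contiguous(np.arange(10), 4))
```

With a 10-sample signal and S = 4, each subpattern holds 2 samples and the last 2 samples are discarded, which shows why S is best chosen so that little or nothing is lost.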

2.1.2. Selection of projection vectors (r, w)

In both approaches, i.e., MPCA and GModPCA, the projection vectors (PVs) are computed from the covariance matrix. The two basic approaches for selecting the number of PVs are as follows: (1) selecting a fixed number of eigenvectors for projection, and (2) setting a threshold (δ) on the total variation retained.
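
The threshold-based choice can be sketched as follows, under the assumption that δ is interpreted as the fraction of total variance to retain. The function name `n_vectors_for_threshold` is hypothetical.

```python
import numpy as np


def n_vectors_for_threshold(eigvals, delta):
    """Smallest number of leading eigenvectors whose eigenvalues reach a fraction delta of the total."""
    vals = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # descending eigenvalues
    ratio = np.cumsum(vals) / vals.sum()                    # cumulative fraction of variation
    count = int(np.searchsorted(ratio, delta)) + 1          # first index with ratio >= delta
    return min(count, vals.size)                            # guard against rounding at delta = 1


if __name__ == "__main__":
    print(n_vectors_for_threshold([4.0, 3.0, 2.0, 1.0], 0.8))
```

The fixed-number approach needs no code: it simply keeps the first r (or w) eigenvectors after sorting by eigenvalue.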

2.2. MPCA

The set of operations performed in step 1 of GModPCA constitutes MPCA. In case of MPCA, the features set Y (Fig. 2) obtained is used for classification. However, in case of GModPCA the features set Z (Fig. 3) is used for classification.

2.3. Support vector machine (SVM)

Support Vector Machine (SVM) is a binary classifier [27]. It draws a maximum margin decision boundary to separate the classes. Figure 4 depicts the diagram of an SVM.

Fig. 4.

Support Vector Machine.

Consider a binary classification problem with a training data set T of n samples. Let d be the dimension of each sample. $\begin{matrix} T = {(x_{i}, l_{i})}_{i = 1}^{n} \end{matrix}$ Here $l_{i}$ represents the class label of the sample $x_{i} \in R^{d}$ with $l_{i} \in \{1, - 1\}$ . The decision boundary that separates the two classes is: $\begin{matrix} (6) & w * x + b = 0 \end{matrix}$ Here w is the weight vector and b is the bias. The decision function of the linear classifier is, $\begin{matrix} (7) & y (x) = sign (w * x + b) \end{matrix}$

In order to find the best separating hyperplane, the following optimization problem needs to be solved: $\begin{matrix} (8) & \begin{aligned} & \text{Minimize } \frac{1}{2} ‖ w ‖^{2} \\ & \text{Subject to } l_{i} (w * x_{i} + b) ⩾ 1, \; i = 1, \dots, n \end{aligned} \end{matrix}$ Introducing Lagrange multipliers $α_{i}$ and a kernel function $Z (x, x_{i})$ , the decision function can be rewritten as follows [27,28]: $\begin{matrix} (9) & y (x) = sign (\sum_{i = 1}^{n} l_{i} α_{i} Z (x, x_{i}) + b) \end{matrix}$ In our study the radial basis function (RBF) kernel has been used. For the RBF kernel, $\begin{matrix} (10) & Z (x, x_{i}) = e^{- \frac{‖ x - x_{i} ‖^{2}}{2 σ^{2}}} \end{matrix}$ Here σ is a free parameter that controls the width of the kernel.
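
Equations (9) and (10) can be evaluated directly once the multipliers are known. The sketch below only evaluates the decision function and does not train an SVM; the support vectors, multipliers and bias in the example are made-up values, the function names are hypothetical, and mapping a sum of exactly zero to +1 is an assumption.

```python
import numpy as np


def rbf_kernel(x, xi, sigma=1.0):
    """Eq. (10): exp(-||x - xi||^2 / (2 * sigma^2))."""
    diff = np.asarray(x, dtype=float) - np.asarray(xi, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def svm_decision(x, support_vectors, labels, alphas, b, sigma=1.0):
    """Eq. (9): sign of the kernel expansion plus bias, returned as +1 or -1."""
    total = sum(
        l * a * rbf_kernel(x, sv, sigma)
        for sv, l, a in zip(support_vectors, labels, alphas)
    )
    return 1 if total + b >= 0 else -1   # assumption: a value of exactly 0 maps to +1


if __name__ == "__main__":
    svs = [[0.0, 0.0], [3.0, 3.0]]       # illustrative support vectors, not trained values
    print(svm_decision([0.2, 0.1], svs, labels=[1, -1], alphas=[1.0, 1.0], b=0.0))
```

Libraries that parameterize the RBF kernel as exp(-γ‖x − x_i‖²) correspond to Eq. (10) with γ = 1 / (2σ²), which matters when reproducing a setting such as σ = 1 in a different toolkit.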

2.4. k-fold cross validation

k-fold cross validation is a well-known technique for evaluating system performance. It is performed by partitioning the entire dataset into k equal subparts. One of the k subparts is taken as the testing set and the remaining $k - 1$ subparts as the training set. In the next iteration, another subpart is taken as the testing set and the remaining subparts as the training set. In this way the training and testing are repeated k times, so that each subpart is used for testing exactly once [29].
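
The partitioning scheme can be sketched as an index generator. This is an illustrative NumPy version, not the MATLAB routine used in the experiments; the name `kfold_indices` and the fixed random seed are assumptions of the sketch.

```python
import numpy as np


def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs so that each sample is tested exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)  # k nearly equal subparts
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx


if __name__ == "__main__":
    for train_idx, test_idx in kfold_indices(200, 10):
        print(len(train_idx), len(test_idx))
```

For 200 signals and k = 10, every iteration trains on 180 signals and tests on the remaining 20.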

2.5. Time complexity of PCA, MPCA and GModPCA

Here we discuss the time complexity (Tc) of the three techniques. Let $X_{N * d}$ be the set of N input patterns, where d is the dimension of each pattern.

Let $C_{d * d}$ be the covariance matrix computed by PCA. Forming it costs $O [N * d^{2}]$ , and the next step of PCA, finding the eigenvalues and eigenvectors, costs $O [d^{3}]$ . So the Tc of PCA considering all these computations is given as: $\begin{matrix} (11) & Tc (PCA) = O [N * d^{2} + d^{3}] \end{matrix}$ In the case of MPCA, the input patterns are divided into subpatterns, and these subpatterns are grouped into a single set. Let each input pattern be divided into S subpatterns, each of dimension u, i.e., $u = d / S$ . If $I_{NS * u}$ represents the group, then the Tc of MPCA is $\begin{matrix} (12) & Tc (MPCA) = O [N * S * u^{2} + u^{3}] \end{matrix}$ GModPCA computes an additional covariance matrix ${(C^{G})}_{Sr * Sr}$ in its second step. Hence, $\begin{matrix} (13) & Tc (GModPCA) = Tc (MPCA) + O [N * S^{2} * r^{2} + S^{3} * r^{3}] \end{matrix}$

Lemma 1.
$Tc (PCA) > S * Tc (MPCA)$ where $2 ⩽ S ⩽ \frac{d}{2}$ , i.e., the time complexity (Tc) of PCA is greater than S times the time complexity of MPCA.
Proof.
The Tc of PCA as described in equation (11) is $\begin{array}{rcl} Tc (PCA) & = & O [N * d^{2} + d^{3}] \\ & = & O [N * {(S * u)}^{2} + {(S * u)}^{3}] \quad (d = S * u) \\ & = & O [S * (N * S * u^{2} + S^{2} * u^{3})] \\ & > & O [S * (N * S * u^{2} + u^{3})] \quad (\text{since } S^{2} * u^{3} > u^{3}) \\ & = & S * O [N * S * u^{2} + u^{3}] \\ & = & S * Tc (MPCA) \quad (\text{from equation (12)}) \end{array}$ Hence the lemma follows. □
Lemma 2.
$Tc (GModPCA) ⩽ (\frac{1}{S} + \frac{r^{2}}{u^{2}}) * Tc (PCA)$ where $2 ⩽ S ⩽ \frac{d}{2}$ , i.e., the time complexity (Tc) of GModPCA is at most ( $\frac{1}{S} + \frac{r^{2}}{u^{2}}$ ) times the time complexity of PCA.
Proof.
The Tc of GModPCA as described in equation (13) is $\begin{array}{rcl} Tc (GModPCA) & = & Tc (MPCA) + O [N * S^{2} * r^{2} + S^{3} * r^{3}] \\ & = & O [N * S * u^{2} + u^{3}] + O [N * S^{2} * r^{2} + S^{3} * r^{3}] \\ & = & O [N * S * \frac{d^{2}}{S^{2}} + \frac{d^{3}}{S^{3}}] + O [N * \frac{d^{2}}{u^{2}} * r^{2} + \frac{d^{3}}{u^{3}} * r^{3}] \quad (\text{because } d = S * u) \\ & = & O [N * d^{2} * (\frac{1}{S} + \frac{r^{2}}{u^{2}}) + d^{3} * (\frac{1}{S^{3}} + \frac{r^{3}}{u^{3}})] \\ & ⩽ & O [N * d^{2} * (\frac{1}{S} + \frac{r^{2}}{u^{2}}) + d^{3} * (\frac{1}{S} + \frac{r^{2}}{u^{2}})] \quad (\text{because } r / u ⩽ 1 \text{ and } S ⩾ 1) \\ & = & (\frac{1}{S} + \frac{r^{2}}{u^{2}}) * O [N * d^{2} + d^{3}] \\ & = & (\frac{1}{S} + \frac{r^{2}}{u^{2}}) * Tc (PCA) \quad (\text{from equation (11)}) \end{array}$ Hence the lemma follows. □

It should be noted that $Tc (GModPCA) < Tc (PCA)$ when $r < u * \sqrt{\frac{S - 1}{S}}$ .
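
The bounds in Lemmas 1 and 2 can be checked numerically by evaluating the dominant terms of Eqs. (11)–(13) with constants dropped. The function names are hypothetical, and the sizes used in the example (N = 200, d = 4096, S = 8, r = 5) are illustrative assumptions rather than values taken from the experiments.

```python
import math


def tc_pca(N, d):
    """Dominant operation count of Eq. (11)."""
    return N * d ** 2 + d ** 3


def tc_mpca(N, S, u):
    """Dominant operation count of Eq. (12)."""
    return N * S * u ** 2 + u ** 3


def tc_gmodpca(N, S, u, r):
    """Dominant operation count of Eq. (13): MPCA plus a second PCA of size S*r."""
    return tc_mpca(N, S, u) + N * (S * r) ** 2 + (S * r) ** 3


if __name__ == "__main__":
    N, d, S, r = 200, 4096, 8, 5         # illustrative sizes only
    u = d // S
    print("PCA / MPCA ratio   :", tc_pca(N, d) / tc_mpca(N, S, u))
    print("GModPCA / PCA ratio:", tc_gmodpca(N, S, u, r) / tc_pca(N, d))
    print("Lemma 2 factor     :", 1 / S + r ** 2 / u ** 2)
    print("r must stay below  :", u * math.sqrt((S - 1) / S))
```

The last printed value is the threshold $u * \sqrt{\frac{S - 1}{S}}$ , obtained by requiring the Lemma 2 factor $\frac{1}{S} + \frac{r^{2}}{u^{2}}$ to be smaller than 1.
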
2.6. Space complexity of PCA, MPCA and GModPCA

Here we have discussed the space complexity (Spc) of the PCA, MPCA and GModPCA for the same input patterns set $X_{N * d}$ .

For PCA the space complexity, including the input pattern set $X_{N * d} (O (N * d))$ , the covariance matrix $C_{d * d} (O (d^{2}))$ , the eigenvalues and eigenvectors $O (d^{2})$ , and the final principal components $Y_{N * w} (O (N * w))$ , is $\begin{matrix} (14) & Spc (PCA) = O (N * d + d^{2}) \end{matrix}$ The space complexity of MPCA is $\begin{matrix} (15) & Spc (MPCA) = O (N * S * u + u^{2}) \end{matrix}$ Similarly, for GModPCA the space complexity is $\begin{matrix} (16) & Spc (GModPCA) = max \{ Spc (MPCA), O [N * (S * r) + {(S * r)}^{2}] \} \end{matrix}$

Lemma 3.
$Spc (PCA) > Spc (MPCA)$ where $2 ⩽ S ⩽ \frac{d}{2}$ , i.e., the space complexity (Spc) of PCA is greater than the space complexity of MPCA.
Proof.
The Spc of PCA as described in equation (14) is $\begin{array}{rcl} Spc (PCA) & = & O [N * d + d^{2}] \\ & = & O [N * S * u + {(S * u)}^{2}] \quad (\text{because } d = S * u) \\ & > & O [N * S * u + u^{2}] \quad (\text{because } S^{2} * u^{2} > u^{2}) \\ & = & Spc (MPCA) \quad (\text{from equation (15)}) \end{array}$ Hence the lemma follows. □
Lemma 4.
$Spc (GModPCA) < Spc (PCA)$ where $2 ⩽ S ⩽ \frac{d}{2}$ , i.e., the space complexity (Spc) of GModPCA is less than the space complexity of PCA.

Proof.
The Spc of GModPCA as described in equation (16) is $\begin{matrix} Spc (GModPCA) = max \{ Spc (MPCA), O [N * (S * r) + {(S * r)}^{2}] \} \end{matrix}$ Depending on which term attains the maximum, there are two cases.

Case 1: $\begin{array}{rcl} Spc (GModPCA) & = & Spc (MPCA) \\ & < & Spc (PCA) \quad (\text{from Lemma 3}) \end{array}$

Case 2: $\begin{array}{rcl} Spc (GModPCA) & = & O [N * (S * r) + {(S * r)}^{2}] \\ & = & O [N * (\frac{d}{u} * r) + \frac{d^{2}}{u^{2}} * r^{2}] \\ & = & O [N * d * (\frac{r}{u}) + d^{2} * \frac{r^{2}}{u^{2}}] \\ & < & O [N * d + d^{2}] \quad (\text{since } r < u) \\ & = & Spc (PCA) \quad (\text{from equation (14)}) \end{array}$ Hence the lemma follows. □
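
The storage terms of Eqs. (14)–(16) can be compared in the same spirit. As before, the function names are hypothetical, constants are dropped, and the sizes are illustrative assumptions.

```python
def spc_pca(N, d):
    """Dominant storage term of Eq. (14)."""
    return N * d + d ** 2


def spc_mpca(N, S, u):
    """Dominant storage term of Eq. (15)."""
    return N * S * u + u ** 2


def spc_gmodpca(N, S, u, r):
    """Eq. (16): the larger of the MPCA stage and the second PCA stage."""
    return max(spc_mpca(N, S, u), N * (S * r) + (S * r) ** 2)


if __name__ == "__main__":
    N, d, S, r = 200, 4096, 8, 5   # illustrative sizes only
    u = d // S
    print(spc_pca(N, d), spc_mpca(N, S, u), spc_gmodpca(N, S, u, r))
```

For these sizes the first stage dominates the maximum in Eq. (16), which corresponds to Case 1 of the proof of Lemma 4.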

2.7. Dataset

We have used the publicly available EEG time series dataset of the Department of Epileptology¹ at Bonn University, Germany [30]. This dataset has 5 different subsets, A to E. Each subset contains 100 single-channel EEG signals. Each signal was recorded for a duration of 23.6 s with a 128-channel amplifier system using a common average reference. These signals were digitized with a 12-bit A/D converter at a sampling frequency of 173.6 Hz. Subsets A and B contain the EEG recordings of five healthy volunteers with their eyes open and closed, respectively. The signals in subsets C and D were recorded from patients before an epileptic attack, from the hippocampal formation of the opposite hemisphere and from the epileptogenic zone, respectively. The EEG signals in subset E were recorded from patients during seizure activity. All 5 subsets have been used for the classification of seizure and nonseizure EEG signals. An EEG signal from each subset is shown in Fig. 5.

¹ EEG time series dataset http://epileptologie-bonn.de/cms/front_content.php?idcat=193lang=3changelang=3.

Fig. 5.

Epilepsy Data Set.

3. Experimental results and discussion

This section presents the experimental results and their analysis.

3.1. Results

GModPCA begins by dividing the input patterns into S non-overlapping subpatterns of equal size. For this study, the number of partitions is $S = 8$ . Once the subpatterns are formed, they are grouped into a single set. Feature extraction has been carried out with MPCA and GModPCA. Once the feature extraction step is over, the extracted feature vectors are fed to an SVM to carry out the classification between seizure and nonseizure EEG signals. Each subpattern is projected onto r eigenvectors in both MPCA and the first step of GModPCA. In the second step of GModPCA, a threshold δ on the total variation is set for eigenvector selection.

The publicly available epilepsy EEG dataset has been used. The dataset consists of 5 subsets (A to E). We have used all five subsets for classification between seizure and nonseizure EEG signals. A set of seven different experimental cases has been tested, i.e., A vs E (case 1), B vs E (case 2), C vs E (case 3), D vs E (case 4), AB vs E (case 5), CD vs E (case 6), and ABCD vs E (case 7). For each experimental case, we have performed k-fold cross validation with $k = 10$ . Each experiment has been repeated 10 times. For 10-fold cross validation, the built-in MATLAB function crossvalind has been used. The built-in MATLAB functions svmtrain and svmclassify have been used for training and classifying the feature vectors of EEG signals, respectively. The SVM is trained with a Radial Basis Function (RBF) kernel. The values of the parameters C and σ are set to 1. The mean classification accuracies for MPCA and GModPCA with different numbers of eigenvectors are presented in Tables 1–7.
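
The seven experimental cases can be assembled from the five subsets as sketched below. This is an illustration rather than the authors' MATLAB code: the names `CASES` and `build_case`, the dictionary layout of `subsets`, and the label convention (+1 for seizure, -1 for nonseizure) are all assumptions.

```python
import numpy as np

# Each case maps to (nonseizure subsets, seizure subsets).
CASES = {
    "A vs E": ("A", "E"),
    "B vs E": ("B", "E"),
    "C vs E": ("C", "E"),
    "D vs E": ("D", "E"),
    "AB vs E": ("AB", "E"),
    "CD vs E": ("CD", "E"),
    "ABCD vs E": ("ABCD", "E"),
}


def build_case(subsets, case_name):
    """Stack the signals of one case and return (X, y) with y in {-1, +1}."""
    neg_keys, pos_keys = CASES[case_name]
    neg = np.vstack([subsets[key] for key in neg_keys])  # nonseizure signals
    pos = np.vstack([subsets[key] for key in pos_keys])  # seizure signals
    X = np.vstack([neg, pos])
    y = np.concatenate([-np.ones(len(neg), dtype=int), np.ones(len(pos), dtype=int)])
    return X, y


if __name__ == "__main__":
    # Placeholder arrays standing in for the 100 signals of each subset.
    fake_subsets = {key: np.zeros((100, 16)) for key in "ABCDE"}
    X_case, y_case = build_case(fake_subsets, "ABCD vs E")
    print(X_case.shape, int((y_case == 1).sum()), int((y_case == -1).sum()))
```

Note that the combined cases are imbalanced: ABCD vs E, for example, has 400 nonseizure signals against 100 seizure signals.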

Table 1

Classification accuracy of MPCA and GModPCA with SVM for A vs E

Case	PCs per subpattern	No. of PVs		Accuracy (%)

	r	MPCA $Sr$	GModPCA $w (δ)$	MPCA	GModPCA
A vs E	1	8	7 (98.00%)	94.25	95.50
	2	16	14 (99.00%)	97.60	97.00
	3	24	19 (98.00%)	97.80	98.50
	4	32	24 (98.00%)	98.00	99.50
	5	40	28 (98.00%)	98.80	100
	6	48	31 (98.00%)	99.20	100
	7	56	34 (98.00%)	99.10	100
	8	64	36 (98.00%)	99.50	100
	9	72	38 (98.00%)	99.50	100
	10	80	40 (98.00%)	99.50	100
	11	88	41 (98.00%)	99.50	100

Table 2

Classification accuracy of MPCA and GModPCA with SVM for B vs E

Case	PCs per subpattern	No. of PVs		Accuracy (%)

	r	MPCA $Sr$	GModPCA $w (δ)$	MPCA	GModPCA
B vs E	1	8	7 (98.00%)	92.80	93.80
	2	16	14 (98.00%)	95.80	96.30
	3	24	20 (98.00%)	97.40	98.10
	4	32	24 (98.00%)	97.70	98.20
	5	40	28 (98.00%)	98.20	98.30
	6	48	30 (98.00%)	98.90	99.10
	7	56	34 (98.00%)	98.80	99.20
	8	64	35 (98.00%)	98.80	99.10
	9	72	39 (98.00%)	98.80	99.20
	10	80	40 (98.00%)	98.70	99.10
	11	88	42 (98.00%)	98.20	99.10

Table 3

Classification accuracy of MPCA and GModPCA with SVM for C vs E

Case	PCs per subpattern	No. of PVs		Accuracy (%)

	r	MPCA $Sr$	GModPCA $w (δ)$	MPCA	GModPCA
C vs E	1	8	7 (98.00%)	89.60	92.70
	2	16	14 (98.00%)	95.40	95.50
	3	24	19 (98.00%)	96.90	97.50
	4	32	24 (98.50%)	97.10	97.50
	5	40	28 (98.00%)	97.80	98.10
	6	48	30 (98.50%)	98.10	98.10
	7	56	34 (98.00%)	98.10	98.20
	8	64	35 (97.50%)	98.10	98.50
	9	72	39 (98.00%)	97.60	98.00
	10	80	41 (98.00%)	97.60	98.20
	11	88	42 (98.00%)	97.90	98.30

Table 4

Classification accuracy of MPCA and GModPCA with SVM for D vs E

Case	PCs per subpattern	No. of PVs		Accuracy (%)

	r	MPCA $Sr$	GModPCA $w (δ)$	MPCA	GModPCA
D vs E	1	8	7 (98.00%)	89.20	90.90
	2	16	14 (98.00%)	92.20	92.80
	3	24	19 (98.00%)	93.50	93.70
	4	32	24 (98.00%)	93.80	94.00
	5	40	27 (97.50%)	93.70	94.00
	6	48	27 (96.00%)	93.30	94.20
	7	56	32 (97.00%)	93.40	93.40
	8	64	35 (97.50%)	92.40	93.50
	9	72	36 (97.00%)	92.40	93.50
	10	80	38 (97.00%)	92.40	93.30
	11	88	40 (97.00%)	90.20	93.20

Table 5

Classification accuracy of MPCA and GModPCA with SVM for AB vs E

Case	PCs per subpattern	No. of PVs		Accuracy (%)

	r	MPCA $Sr$	GModPCA $w (δ)$	MPCA	GModPCA
AB vs E	1	8	7 (97.00%)	96.00	96.11
	2	16	13 (97.00%)	97.00	97.66
	3	24	19 (97.00%)	98.60	98.70
	4	32	24 (98.00%)	99.00	99.20
	5	40	26 (97.00%)	99.33	99.40
	6	48	29 (97.00%)	99.13	99.66
	7	56	31 (97.00%)	98.73	99.20
	8	64	33 (97.00%)	98.44	99.00
	9	72	35 (97.00%)	97.53	99.10
	10	80	37 (97.00%)	96.73	99.20
	11	88	38 (97.00%)	95.66	99.20

Table 6

Classification accuracy of MPCA and GModPCA with SVM for CD vs E

Case	PCs per subpattern	No. of PVs		Accuracy (%)

	r	MPCA $Sr$	GModPCA $w (δ)$	MPCA	GModPCA
CD vs E	1	8	8 (98.00%)	94.20	94.20
	2	16	14 (98.00%)	94.60	94.80
	3	24	18 (97.00%)	95.66	95.80
	4	32	23 (98.00%)	95.11	95.46
	5	40	25 (96.00%)	95.33	95.46
	6	48	27 (96.00%)	94.80	95.26
	7	56	30 (96.00%)	94.11	94.80
	8	64	33 (96.00%)	92.99	95.20
	9	72	35 (96.00%)	92.33	94.86
	10	80	37 (97.00%)	91.66	94.80
	11	88	38 (97.00%)	89.86	94.13

Table 7

Classification accuracy of MPCA and GModPCA with SVM for ABCD vs E

Case	PCs per subpattern	No. of PVs		Accuracy (%)

	r	MPCA $Sr$	GModPCA $w (δ)$	MPCA	GModPCA
ABCD vs E	1	8	7 (97.00%)	94.60	94.80
	2	16	14 (97.00%)	96.36	96.80
	3	24	17 (97.00%)	96.80	96.84
	4	32	22 (97.00%)	96.92	97.17
	5	40	26 (97.00%)	96.10	96.17
	6	48	27 (96.00%)	95.84	96.40
	7	56	29 (96.00%)	95.92	96.20
	8	64	33 (96.00%)	96.32	95.20
	9	72	35 (96.00%)	90.28	96.00
	10	80	37 (96.00%)	80.50	95.60
	11	88	39 (96.00%)	80.00	95.60

The classification accuracy of PCA, MPCA and GModPCA for each experimental case is shown in Fig. 6.

Fig. 6.

Classification Accuracy of GModPCA, MPCA, and PCA with SVM for different experimental cases, i.e., (a) A–E, (b) B–E, (c) C–E, (d) D–E, (e) AB–E, (f) CD–E, (g) ABCD–E.

An abnormality or disorder recorded in an EEG signal possesses certain unique patterns, and capturing these hidden patterns is crucial for correct diagnosis. PCA focuses on the extraction of global features, so its capability for detecting these patterns is limited. MPCA and GModPCA, on the other hand, begin by dividing the signal into subparts and extract features from each subpart individually. As a result, both techniques capture the hidden patterns, and the chance of a correct diagnosis increases. Fig. 6 shows that the epileptic seizure detection rate of MPCA and GModPCA is higher than that of PCA. A feature extraction technique should not only extract informative features but also be computationally simple. In this research, we not only show that MPCA and GModPCA have a high capability for seizure detection, but also prove analytically that the time and space complexities of both methods are lower than those of PCA.

3.2. Discussion

The following observations are made from the experimental results. With the same number of projection vectors, the classification accuracy obtained by MPCA for the different experimental cases is higher than the accuracy achieved with PCA. This suggests that the features extracted from the modules are more informative than the features extracted directly from the whole EEG signal. MPCA focuses only on finding local features, whereas GModPCA focuses on both local and global features. The experimental results presented in Tables 1–7 show that GModPCA achieves better classification accuracy than MPCA with fewer projection vectors. In most of the cases, GModPCA achieved its best classification accuracy with 24–35 features. As can be seen in Fig. 6, for the different experimental cases GModPCA achieved higher classification accuracy than PCA and MPCA with fewer extracted features or principal components.

As mentioned earlier MPCA was introduced for face recognition. We have also tested its effectiveness for epileptic seizure detection along with GModPCA. From Lemmas 1–4, it is proved that MPCA and GModPCA have less time and space complexity than PCA.

Different methods have been proposed in the literature for seizure detection on the same dataset used in this study. A comparison of these methods with our proposed approach is presented in Table 8.

Table 8

Reference, year, methods and classification accuracy obtained for some cases in the literature

Authors	Year	Methods	Cases	Accuracy (%)
[9]	2007	Wavelet feature extraction and a mixture of expert model	A–E	94.50
[6]	2007	Fast Fourier transform and decision tree classifier	A–E	98.70
[31]	2007	Approximate entropy and ANN	A–E	100
[8]	2009	Discrete wavelet transform and approximate entropy	A–E	96.00
[14]	2010	Approximate entropy and ANN	A–E	99.85
			ABCD–E	98.27
[32]	2011	Time and Frequency features	A–E	100
[33]	2011	Statistical features and SVM	A–E	99.69
			B–E	96.78
			C–E	97.69
			D–E	93.91
			A–D	82.53
[34]	2012	Permutation entropy and SVM	A–E	93.55
			B–E	82.88
			C–E	88.00
			D–E	79.94
[35]	2014	Wavelet transform, phase-space reconstruction with Euclidean distance	A–E	98.17
[21]	2014	Time-frequency image using HHT and SVM	A–E	99.125
[36]	2015	Weighted permutation entropy and SVM	A–E	99.50
			B–E	85.00
			C–E	93.50
			D–E	96.50
[37]	2015	IMFs and LS-SVM classifier	CD–E	98.67
[12]	2016	DWT+PSR+SVM	A–E	100
[13]	2016	DWT+ABC+ANN	A–E	72.6
			D–E	98.0
Proposed approach		GModPCA and SVM	A–E	100
			B–E	99.20
			C–E	98.50
			D–E	94.20
			AB–E	99.66
			CD–E	95.80
			ABCD–E	97.17

For case 1, the maximum classification accuracy reported in the literature is 100%, which was achieved by Srinivasan et al. [31] using approximate entropy and a neural network. Similarly, a classification accuracy of 100% was achieved by Iscan et al. [32] through the combination of time and frequency domain features. In this study, MPCA and GModPCA achieved classification accuracies of 99.5% and 100%, respectively, for case 1, which is better than the recent classification accuracy achieved by Lee et al. [35] and Chai et al. [21].

For cases 2–4, the best classification accuracies (%) achieved by MPCA are 98.90, 98.10 and 93.80, and those achieved by GModPCA are 99.20, 98.50 and 94.20, respectively. Nicolaou et al. [34] reported classification accuracies of 82.88, 88.00 and 79.94, respectively, for these experimental cases (Table 8).

For cases 5–7, MPCA achieved best accuracies (%) of 99.33, 95.33, and 96.92, respectively. Similarly, with GModPCA the best accuracies (%) were found to be 99.66, 95.80, and 97.17, respectively.

As shown in Table 8, these results indicate that GModPCA tends to achieve a high seizure detection rate. Even though a number of methods have been proposed in the literature, none of them addresses the inter-sub-pattern correlation within EEG signals. This research aims to strengthen the exploration of inter-sub-pattern correlation and to show its potential effectiveness in the field of biomedical signal processing. Both techniques work directly on the raw EEG signal.

4. Conclusion and future work

In this paper, an effective approach based on GModPCA and SVM has been proposed for automated seizure detection in EEG signals. Features are extracted using GModPCA. We have also tested the effectiveness of MPCA, which focuses on local variation, whereas GModPCA focuses on both local and global variations. After feature extraction, the extracted feature vectors are fed to the SVM to carry out the classification. Seven different experimental cases have been conducted. The classification accuracies indicate that GModPCA with SVM achieves better classification accuracy than some of the existing techniques proposed in the literature. This suggests that there exists a strong inter-sub-pattern correlation in EEG signals. It is also proved analytically that MPCA and GModPCA have lower time and space complexities than PCA. GModPCA is an efficient dimensionality reduction technique that can be applied to other medical applications in the future.

Conflict of interest

The authors have no conflict of interest to report.

References

1. World Health Organization, Epilepsy, Factsheet, http://www.who.int/mediacentre/factsheets/fs999/en/#. Accessed: 30-05-2016.

2. Berger, Über das elektrenkephalogramm des menschen, European Archives of Psychiatry and Clinical Neuroscience 87(1) (1929), 527–570.

3. Ray G.C., An algorithm to separate nonstationary part of a signal using mid-prediction filter, IEEE Transactions on Signal Processing 42(9) (1994), 2276–2279. doi:10.1109/78.317850.

4. Iasemidis L.D., Shiau D.S., Chaovalitwongse, Sackellares J.C., Pardalos P.M., Principe J.C., Carney P.R., Prasad, Veeramani and Tsakalis, Adaptive epileptic seizure prediction system, IEEE Transactions on Biomedical Engineering 50(5) (2003), 616–627. doi:10.1109/TBME.2003.810689.

5. Srinivasan, Eswaran and Sriraam A.N., Artificial neural network based epileptic detection using time-domain and frequency-domain features, Journal of Medical Systems 29(6) (2003), 647–660. doi:10.1007/s10916-005-6133-1.

6. Polat and Güneş, Classification of epileptiform EEG using a hybrid system based on decision tree classifier and fast Fourier transform, Applied Mathematics and Computation 187(2) (2007), 1017–1026. doi:10.1016/j.amc.2006.09.022.

7. Ghosh-Dastidar, Adeli and Dadmehr, Mixed-band wavelet-chaos-neural network methodology for epilepsy and epileptic seizure detection, IEEE Transactions on Biomedical Engineering 54(9) (2007), 1545–1551. doi:10.1109/TBME.2007.891945.

8. Ocak, Automatic detection of epileptic seizures in EEG using discrete wavelet transform and approximate entropy, Expert Systems With Applications 36(2) (2009), 2027–2036. doi:10.1016/j.eswa.2007.12.065.

9. Subasi, EEG signal classification using wavelet feature extraction and a mixture of expert model, Expert Systems With Applications 32(4) (2007), 1084–1093. doi:10.1016/j.eswa.2006.02.005.

10. Swami, Gandhi T.K., Panigrahi B.K., Bhatia, Santhosh and Anand, A comparative account of modelling seizure detection system using wavelet techniques, International Journal of Systems Science: Operations & Logistics 1 (2016), 1–2.

11. Swami, Gandhi T.K., Panigrahi B.K., Tripathi and Anand, A novel robust diagnostic model to detect seizures in electroencephalography, Expert Systems With Applications 56 (2016), 116–130. doi:10.1016/j.eswa.2016.02.040.

12. Xie, Jin and Hirasawa, A sequential method using multiplicative extreme learning machine for epileptic seizure detection, Neurocomputing 214 (2016), 692–707. doi:10.1016/j.neucom.2016.06.056.

13. Satapathy S.K., Dehuri and Jagadev A.K., ABC optimized RBF network for classification of EEG signal for epileptic seizure identification, Egyptian Informatics Journal 18(1) (2016), 55–66.

14. Guo, Rivero and Pazos, Epileptic seizure detection using multiwavelet transform based approximate entropy and artificial neural networks, Journal of Neuroscience Methods 193(1) (2010), 156–163. doi:10.1016/j.jneumeth.2010.08.030.

15. Tzallas A.T., Tsipouras M.G. and Fotiadis D.I., Epileptic seizure detection in EEGs using time–frequency analysis, IEEE Transactions on Information Technology in Biomedicine 13(5) (2009), 703–710. doi:10.1109/TITB.2009.2017939.

16.

Pachori

R.B.

and Bajaj

, Analysis of normal and epileptic seizure EEG signals using empirical mode decomposition, Computer Methods and Programs in Biomedicine 104(3) (2011), 373–381. doi:10.1016/j.cmpb.2011.03.009.

17.

Ghosh-Dastidar

Adeli

and Dadmehr

, Principal component analysis-enhanced cosine radial basis function neural network for robust epilepsy and seizure detection, IEEE Transactions on Biomedical Engineering 55(2) (2008), 512–518. doi:10.1109/TBME.2007.905490.

18.

Subasi

and Gursoy

M.I.

, EEG signal classification using PCA, ICA, LDA and support vector machines, Expert Systems With Applications 37(12) (2010), 8659–8666. doi:10.1016/j.eswa.2010.06.065.

19.

Altunay

Telatar

and Erogul

, Epileptic EEG detection using the linear prediction error energy, Expert Systems With Applications 37(8) (2010), 5661–5665. doi:10.1016/j.eswa.2010.02.045.

20.

Joshi

Pachori

R.B.

and Vijesh

, Classification of ictal and seizure-free EEG signals using fractional linear prediction, Biomedical Signal Processing and Control 9 (2014), 1–5. doi:10.1016/j.bspc.2013.08.006.

21.

Chai

and Dong

, Classification of seizure based on the time-frequency image of EEG signals using HHT and SVM, Biomedical Signal Processing and Control 13 (2014), 15–22. doi:10.1016/j.bspc.2014.03.007.

22.

Chen

L.L.

Zhang

Zou

J.Z.

Zhao

C.J.

and Wang

G.S.

, A framework on wavelet-based nonlinear features and extreme learning machine for epileptic seizure detection, Biomedical Signal Processing and Control 10(1) (2014), 1–10. doi:10.1016/j.bspc.2013.11.010.

23.

Ahonen

Hadid

and Pietikainen

, Face description with local binary patterns: Application to face recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 28(12) (2006), 2037–2041. doi:10.1109/TPAMI.2006.244.

24.

Kaya

Uyar

Tekin

and Yıldırım

, 1D-local binary pattern based feature extraction for classification of epileptic EEG signals, Applied Mathematics and Computation 243 (2014), 209–219. doi:10.1016/j.amc.2014.05.128.

25.

Gottumukkal

and Asari

V.K.

, An improved face recognition technique based on modular PCA approach, Pattern Recognition Letters 25(4) (2004), 429–436. doi:10.1016/j.patrec.2003.11.005.

26.

Kadappa

and Negi

, Global modular principal component analysis, Signal Processing 105 (2014), 381–388. doi:10.1016/j.sigpro.2014.06.014.

27.

Burges

C.J.

, A tutorial on support vector machines for pattern recognition, Data mining and knowledge discovery 2(2) (1998), 121–167. doi:10.1023/A:1009715923555.

28.

Cheng

and Yang

, A fault diagnosis approach for gears based on IMF AR model and SVM, EURASIP Journal on Advances in Signal Processing 2008(1) (2008), 647135. doi:10.1155/2008/647135.

29.

Kohavi

, A study of cross-validation and bootstrap for accuracy estimation and model selection, Ijcai 14(2) (1995), 1137–1145.

30.

Andrzejak

R.G.

Lehnertz

Mormann

Rieke

David

and Elger

C.E.

, Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state, Physical Review E 64(6) (2001), 061907. doi:10.1103/PhysRevE.64.061907.

31.

Srinivasan

Eswaran

and Sriraam

, Approximate entropy-based epileptic EEG detection using artificial neural networks, IEEE Transactions on Information Technology in Biomedicine 11(3) (2007), 288–295. doi:10.1109/TITB.2006.884369.

32.

Iscan

Dokur

and Demiralp

, Classification of electroencephalogram signals with combined time and frequency features, Expert Systems With Applications 38(8) (2011), 10499–10505. doi:10.1016/j.eswa.2011.02.110.

33.

and Wen

P.P.

, Clustering technique-based least square support vector machine for EEG signal classification, Computer Methods and Programs in Biomedicine 104(3) (2011), 358–372. doi:10.1016/j.cmpb.2010.11.014.

34.

Nicolaou

and Georgiou

, Detection of epileptic electroencephalogram based on permutation entropy and support vector machines, Expert Systems With Applications 39(1) (2012), 202–209. doi:10.1016/j.eswa.2011.07.008.

35.

Lee

S.H.

Lim

J.S.

Kim

J.K.

Yang

and Lee

, Classification of normal and epileptic seizure EEG signals using wavelet transform, phase-space reconstruction, and Euclidean distance, Computer methods and programs in biomedicine. 116(1) (2014), 10–25. doi:10.1016/j.cmpb.2014.04.012.

36.

Tawfik

N.S.

Youssef

S.M.

and Kholief

, A hybrid automated detection of epileptic seizures in EEG records, Computers & Electrical Engineering 53 (2016), 177–190. doi:10.1016/j.compeleceng.2015.09.001.

37.

Sharma

and Pachori

R.B.

, Classification of epileptic seizures in EEG signals based on phase space representation of intrinsic mode functions, Expert Systems With Applications 42(3) (2015), 1106–1117. doi:10.1016/j.eswa.2014.08.030.

1 EEG time series dataset http://epileptologie-bonn.de/cms/front_content.php?idcat=193lang=3changelang=3.
