Entropy parameter optimization for epileptic seizure detection: A parallel approach

Abstract

Brain-Computer Interface (BCI) is one of the recent advancements in the field of Bioinformatics and offers real-time support for people affected by chronic neurological disorders. Owing to the rapid progress of Electroencephalogram (EEG)-based BCI systems, the detection of epileptic seizures has become much simpler. However, accurate detection through visual inspection is tedious, time-consuming and prone to error. Automation has therefore become inevitable, and entropies are appropriate for automating epileptic seizure detection because EEG signals are complex, arrhythmic, ephemeral and non-stationary. Several renowned entropies are widely applied; nevertheless, existing models do not identify the optimal parameters of the entropies, which greatly influence the performance of the machine learning models built on them. To address this issue, this paper presents a parallel machine learning based farmland fertility algorithm which optimizes the parameters of various entropies and thereby detects epileptic seizures in a systematic way. A novel weighted fitness function has been designed based on Kullback-Leibler Divergence (KLD). The extracted features are further classified using state-of-the-art classifiers. The overall performance of the proposed algorithm was evaluated using EEG datasets obtained from the University of Bonn, Germany, the University of Bern and Indian EEG, New Delhi, and the results show the superiority of the proposed model in terms of sensitivity, specificity, precision, F1-score, G-mean and classification accuracy.

Keywords

Epileptic seizure detection, Kullback-Leibler divergence, farmland fertility optimization algorithm

1. Introduction

Epilepsy is a condition of the brain that causes seizures; hence it is also known as seizure disorder, and it affects people of different ages, races and backgrounds [1]. In seizure disorders, the electrical activity of the Central Nervous System is intermittently disturbed, which results in a certain degree of momentary dysfunction [2]. Recently, the Bangalore Urban-Rural Neuro-epidemiological Survey conducted a population-based survey to estimate the prevalence of epilepsy and observed that on average 8.8 per 1000 people are affected, with the prevalence rate in rural communities (11.9) being twice that of urban areas (5.7) [3]. This survey highlights the need to respond to the high prevalence of epilepsy. Moreover, the occurrence and frequency of seizures are non-deterministic, and people with epileptic seizures face challenges in their everyday life. Therefore, it is essential to develop a robust epilepsy detection model which simplifies the diagnosis process and enables better treatment for patients [4]. The treatment of epileptic seizures depends on accurate diagnosis. In general, many brain imaging techniques are available to observe brain activity and diagnose neurological disorders. Among them, EEG is more cost-effective than other brain imaging techniques, which is a notable advantage for the diagnosis of epilepsy in rural communities. Further, EEG is a painless, non-invasive and completely passive recording technique that captures the neuronal activity of the brain through electrodes placed on the scalp [5]. EEG has proved to be an important means of diagnosing people with epileptic seizures because it directly measures the electrical activity of the brain. Unfortunately, the frequency and occurrence of seizures cannot be known in advance, so long recordings of EEG signals are required. Moreover, epilepsy diagnosis from long EEG recordings is usually done by visual inspection, which is an inefficient, time-consuming and error-prone process. Hence, in recent years, researchers have proposed several methods for automating epilepsy detection [6, 7, 8, 9].

To automate the diagnosis of epilepsy, the entire process can be split into three main phases, namely signal decomposition, feature extraction and classification. Signal decomposition is required for analysing the EEG signals at multiple resolutions. Similarly, feature extraction is an essential step that helps to identify the hidden information regarding the signal’s time and frequency components effectively. However, the identification and extraction of significant features still remains a research challenge [10]. Literature reveals that entropies have the capability to capture the complexity of non-stationary EEG signals, as entropy is a measure that quantifies the randomness present in the signal [11]. While extracting entropy based features, the parameters of the various entropies play a vital role; they can be viewed as hyperparameters, since altering them affects the discrimination power of the features. Thus, this research work focuses on setting the optimal parameters for various entropies. Although the entropy parameters have an impact, manual selection of optimal parameters is time-consuming. Therefore, this article proposes a parallel Farmland Fertility nature-inspired metaheuristic optimization algorithm together with a novel weighted fitness function that serves as its objective. Finally, the process ends with classification, i.e., the extracted features are classified using state-of-the-art classifiers.

The major contributions of the proposed epileptic seizure detection model are

A parallel farmland fertility algorithm is proposed to optimize the parameters of entropies (Kraskov, Permutation, Renyi and Tsallis) with minimal computational time

A novel fitness function is designed based on Kullback-Leibler divergence which helps to identify the optimal parameters of aforementioned entropies

Based on the optimal parameters of each entropy, the features are extracted and classified using state-of-the-art (SOTA) classifiers (SVM, LDA, KNN, RF, ANN)

The performance of the optimal parameters obtained using the proposed parallel farmland fertility optimization (PFFO) algorithm is analysed. Three EEG benchmark datasets were used to evaluate the predominance of the proposed model.

The rest of the article is structured as follows: Section 2 discusses the research works which rely on entropy for designing an automated epileptic seizure detection system. Section 3 elucidates the need of the proposed work along with the intuition behind the selection of the entropy features. Section 4 discusses the fundamentals to provide the insight of the proposed work. Section 5 explains the proposed work. Section 6 articulates the experimental setup, brief description about the datasets considered followed by the discussions and Section 7 concludes the article.

2. Literature review

Literature reveals that various entropies contribute significantly to epileptic seizure detection and prediction, as they measure the randomness of physiological signals. In 2012, Nicoletta and Julius [12] proposed a model based on permutation entropy to characterize epileptic EEG signals and distinguish them from normal signals. In the same year, Song and Zhang [13] designed an automatic recognition system to diagnose epileptic EEG patterns using non-linear features (sample entropy, Hurst exponent and permutation entropy) extracted after decomposing the signals using the Discrete Wavelet Transform. In 2013, Guohun Zhu et al. [14] employed delay permutation entropy for the accurate detection of epileptogenic regions of the human brain. In 2014, Yatindra et al. [15] proposed fuzzy approximate entropy for epileptic seizure detection and obtained significant improvement in the results. In 2015, Rajendra Acharya et al. [11] compared the various entropies used for epileptic seizure detection and prediction along with their merits and limitations. In the same year, Rajeev Sharma et al. [16] proposed an entropy based model for the identification of focal EEG signals. In 2017, Lina Wang et al. [17] extracted spectrum and approximate entropies for the precise discrimination of brain abnormalities. In the same year, Tao Zhang et al. [18] introduced a novel entropy termed fuzzy distribution entropy, combined with wavelet packet decomposition, for the classification of epileptic EEG signals. Also, Patidar and Panigrahi [19] developed a seizure detection model using the tunable-Q wavelet transform, from which Kraskov entropy features are extracted and classified using LS-SVM. In 2019, Vipin Gupta et al. [20] employed FBSE-EWT and extracted entropies (Log-Entropy and Norm) for effective epileptic seizure detection. In 2020, Palani et al. [21] extracted entropy measures from Multivariate Empirical Mode Decomposition for Schizophrenia detection using multichannel EEG signals. Priscila et al. [22] employed discrete wavelet transforms for decomposing the EEG signal and extracted entropy measures for classifying interictal and ictal states in EEG signals. In 2021, Reem et al. [23] analysed the relationship between spontaneous neural activity and psychiatric traits using EEG signals by extracting Multiscale entropy. Asghar and Babak [24] implemented two signal decomposition methods, discrete wavelet transform and orthogonal matching pursuit, and extracted entropy based features for epileptic seizure detection. Sukriti et al. [25] proposed an epileptic seizure detection system by applying a novel denoising technique based on multiscale principal component analysis and empirical mode decomposition. Further, refined composite multiscale entropies such as sample, fuzzy and permutation entropy were extracted and their impact on epileptic seizure diagnosis was analysed.

From the literature, it is evident that entropy measures contribute greatly to the effective discrimination of normal and abnormal EEG signals, as they capture the randomness of the non-linear and non-stationary EEG signals and delineate the distinct nature of the various brain states. However, most of the entropies have parameters that play a significant role in extracting useful information from the pre-processed signals, and none of the researchers attempted to identify the patient-specific optimal parameters of the entropy measures; hence the full potential of these measures remains unutilized.

Besides, Tomasz and Duch compared the performance of decision trees implemented based on three entropies: Renyi, Tsallis and Shannon entropy [26]. The performance of each entropy with respect to the probabilities was analysed by varying the entropy parameter. Further, the performance of the decision tree improved when the parameter was tuned, as tuning nullifies the trade-off between the information gain and the probability among the different classes. Yin et al. [27] proposed a novel Multiscale Permutation Renyi entropy (MPEr) for effectively quantifying the complexity of EEG signals. The complexity of MPEr was analysed by varying its different parameters. Abhik and Basu [28] proposed a new logarithmic norm-entropy (LNE) which has two parameters, and viewed the proposed entropy measure and its corresponding cross-entropy measure as an optimization problem. The values of LNE were plotted by varying both parameters over the range 0 to 100. It was inferred that LNE decreases if either parameter increases while the other is held constant. Therefore, it is implicit from the literature that the parameters present in the entropy measures assuredly influence the performance. However, none of the research works viewed the parameters present in the entropy measures as hyperparameters that eventually extract the significant patient-specific information from the raw EEG signals. It is also worth mentioning that none of the researchers attempted to automate the identification of optimal parameters of entropy measures, which improves the discrimination power of the features in classification.

3. Motivation for the proposed work

For identifying the seizure occurrence using EEG signals, a diagnostic set of features has to be extracted. The extracted features have to recognize intra-class relationships and reveal inter-class variances. For this purpose, entropy based features have recently been highly preferred, as they capture the degree of randomness of any physiological signal. However, each entropy based feature has its own benefits and limitations, and the entropies that best capture the characteristics of seizure activity have to be selected. The entropies employed in this work to extract prominent information from EEG signals, and the significance of each of them, are given below.

Kraskov Entropy [29]: From the literature, it is observed that the value of Kraskov entropy is higher during the occurrence of seizure activity compared to normal EEG signals. Also, the features extracted from third level decomposition hold significant information compared to other decomposition levels. This inference necessitates the identification of optimal parameters for the coefficients at each level of the decomposed EEG signals

Permutation Entropy [30]: Recent studies reveal that permutation entropy effectively handles the dynamical noise present in chaotic, non-stationary EEG signals. Further, no model assumption is required and it is appropriate for analysing nonlinear processes. Real-time EEG signals are noisy in nature, and handling the noise using filters at times removes the required information. Thus, an entropy that extracts appropriate information from the noisy signal is suitable

Renyi Entropy [31]: It is a smooth entropy and it remains unaffected over diverse density functions. Thus, the information extracted using Renyi entropy is independent of the learning model and is held without loss

Tsallis Entropy [32]: It measures the complex dynamics of bursts and estimates the gravity of brain impairment. Furthermore, it evaluates the diversified rhythms and bursts present in the EEG signals and extracts the required information out of it.

3.1 Why entropy parameter optimization

Entropy describes the characteristics of a signal in terms of randomness, and a change in its parameter influences the performance. Further, a clear understanding of and a proper conclusion about any entity can be reached by varying its parameters. From a machine learning perspective, the performance of a learning model is highly dependent on the parameters involved in it. The study in [26] confirms the role and influence of entropy parameters by varying and plotting the parameter values of different entropies. Also, from a diagnostic perspective, EEG signals are patient-specific, and default parameters fail to extract the patient-specific information. Further, when features are extracted from decomposed signals, the features from different levels of decomposition give significantly different inferences. These hypotheses and observations paved the way for this work, and the results confirm that the proposed work yields better performance compared to the existing approaches.

4. Preliminaries

4.1 Farmland fertility optimization algorithm

Human et al. [33] proposed a nature-inspired metaheuristic optimization algorithm based on the selection of fertile farmland to yield better cultivation. The basic idea is that the algorithm splits the search space into several sections and computes the global and local optimal solutions for each section, which mimics the behaviour of farmers choosing suitable farmland for cultivation. To improve the fertility of the soil, farmers use green manures and organic and chemical fertilizers. As a result the quality of the soil increases, which yields better quality products. Based on the nature of the soil, the farmers add appropriate fertilizers to improve it. By considering previous observations, the farmers decide whether to add fertilizers or to leave the farmland in its current form. After determining the quality of the soil in each section, the section with the worst quality is improved by adding the appropriate fertilizers to it. Thus, the section with the worst fertility undergoes more changes than the other sections. The best part of the soil from all the sections is stored in warehouses (global memory) and the best part of the soil from each individual section is stored in local memory. To be clear, the global memory holds the best solutions found so far in the entire search space and the local memory holds the best solutions found in each section. The algorithm consists of six mathematical stages, which are described below.

Stage 1:
Generation of Initial Population

The total number of initial solutions is determined by the number of sections and the number of solutions in each section (Eq. (1)).
$N = k * n$
(1)

Where $k$ denotes the number of sections, $n$ represents the number of solutions in each section and $N$ is the total size of the population to be generated. The initial population is generated randomly using Eq. (2).
$x_{i j} = L_{j} + rand (0, 1) \times (U_{j} - L_{j})$
(2)

Where $U_{j}$ and $L_{j}$ denote the upper and lower bounds of $x$ respectively, $i \in [1, \dots, N]$ indexes the solutions in the population and $j$ indexes the input dimensions. In real-time scenarios, determining the value of $k$ is an NP-hard problem. In this context the value of $k$ is determined based on the number of entropy features extracted from the EEG signals.
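As a rough illustration of Eqs. (1) and (2), the following Python sketch draws an initial population uniformly within per-dimension bounds. The function name `init_population`, the `seed` argument and the example bounds are assumptions made for illustration; they are not taken from [33].

```python
import random


def init_population(k, n, lower, upper, seed=None):
    """Return N = k * n candidate solutions drawn uniformly within the bounds."""
    rng = random.Random(seed)
    total = k * n  # Eq. (1)
    return [
        # Eq. (2): x_ij = L_j + rand(0, 1) * (U_j - L_j)
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(total)
    ]


# Hypothetical example: 4 sections of 5 solutions in a 2-dimensional search space.
population = init_population(k=4, n=5, lower=[1.0, 0.1], upper=[10.0, 2.0], seed=0)
```

Each row of `population` is one candidate solution, and the rows can later be grouped into consecutive blocks of `n` to form the sections.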
Stage 2:
Determining the fertility of the soil in each section of farmland

Once the population is generated, the fitness of the existing solutions is computed, from which the fertility of the soil can be obtained. The solutions belonging to each section are selected using Eq. (3).
$\mathrm{Section}_{K} = x (a j)$
(3)

where $a = n * (K - 1) : n * K$ and $K = 1, 2, \dots, k$. To determine the fertility of each section, the mean fitness of its solutions is computed using Eq. (4).
$\mathrm{Fitness}_{K} = \mathrm{Mean} (\text{all } (x_{j i}) \text{ in } \mathrm{Section}_{K})$
(4)
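The grouping in Eq. (3) and the section fertility in Eq. (4) can be sketched as follows. This is a minimal sketch that assumes the population is stored as a list whose consecutive blocks of $n$ rows form the sections; `toy_fitness` is a hypothetical placeholder, not the fitness function of this work.

```python
def split_sections(population, k):
    """Split N = k * n solutions into k consecutive sections of n solutions (Eq. (3))."""
    n = len(population) // k
    return [population[s * n:(s + 1) * n] for s in range(k)]


def section_fitness(section, fitness_fn):
    """Mean fitness of the solutions in one section (Eq. (4))."""
    return sum(fitness_fn(x) for x in section) / len(section)


def toy_fitness(x):
    """Hypothetical fitness used only for this example: the sum of the coordinates."""
    return sum(x)


population = [[1.0], [2.0], [3.0], [4.0]]
sections = split_sections(population, k=2)
fertility = [section_fitness(s, toy_fitness) for s in sections]
# Index of the least fertile section, assuming the fitness is maximised.
worst = min(range(len(sections)), key=fertility.__getitem__)
```

Zero-based slicing replaces the one-based index range of Eq. (3), so section `s` covers rows `s * n` to `(s + 1) * n - 1`.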
Stage 3:
Updating the memories

Once the solutions and the mean value of each section of the farmland are computed, the local and global memory can be updated. The number of solutions to be stored in the local and global memory is determined using Eqs (5) and (6). Further, the best and worst sections of farmland are identified.
$N_{local} = round (f * n)$
(5)

$N_{global} = round (f * n)$
(6)

$N_{local}$ and $N_{global}$ are the number of solutions in the local and global memory respectively.
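A possible reading of the memory update is sketched below: each section keeps its best $N_{local}$ solutions and the whole population keeps its best $N_{global}$ solutions. The function name, the use of `sorted`, the assumption that fitness is maximised and the floor of one stored solution are illustrative choices, not details given in the text.

```python
def update_memories(sections, fitness_fn, f):
    """Keep the best solutions per section (local) and overall (global)."""
    n = len(sections[0])
    n_local = max(1, round(f * n))   # Eq. (5); the floor of 1 is an added assumption
    n_global = max(1, round(f * n))  # Eq. (6), as printed
    local_memory = [
        sorted(section, key=fitness_fn, reverse=True)[:n_local] for section in sections
    ]
    everyone = [x for section in sections for x in section]
    global_memory = sorted(everyone, key=fitness_fn, reverse=True)[:n_global]
    return local_memory, global_memory


# Hypothetical example with two sections of two one-dimensional solutions each.
sections = [[[1.0], [2.0]], [[3.0], [4.0]]]
local_memory, global_memory = update_memories(sections, fitness_fn=sum, f=0.5)
```

Note that Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the rounding rule of other environments.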
Stage 4:
Altering the quality of the soil in each section of farmland

In this stage, the section with the worst soil quality has to be changed significantly. The solutions in the worst section of farmland are combined with a solution available in the global memory using Eqs (7) and (8) [33].
$b = φ * rand (- 1, 1)$
(7)

$x_{new} = b * (x_{i j} - x_{N_{global}}) + x_{i j}$
(8)
where $φ$ is a number between 0 and 1 which has to be initialized at the beginning, $x_{N_{global}}$ is a solution picked randomly from the solutions stored in the global memory and $x_{i j}$ denotes a solution of the worst section of farmland. The other sections of farmland are updated using Eqs (9) and (10).
$b = ϑ * rand (0, 1)$
(9)

$x_{new} = b * (x_{i j} - x_{u j}) + x_{i j}$
(10)

In Eq. (9), $ϑ$ is a number between 0 and 1 whose value is set at the beginning, $x_{u j}$ is a solution picked randomly from the search space and $x_{i j}$ is the solution to be updated.
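The two update rules of this stage can be sketched in Python as below. The sketch treats $b$ as a single scalar per update, which is one plausible reading of Eqs. (7) and (9); the helper `clip` is an added safeguard that does not appear in the equations, and all function names are hypothetical.

```python
import random


def update_worst(x, global_memory, phi, rng):
    """Move a worst-section solution relative to a random global-memory solution."""
    b = phi * rng.uniform(-1.0, 1.0)                      # Eq. (7)
    g = rng.choice(global_memory)
    return [xi + b * (xi - gi) for xi, gi in zip(x, g)]   # Eq. (8)


def update_other(x, population, theta, rng):
    """Move a solution from any other section relative to a random solution."""
    b = theta * rng.random()                              # Eq. (9)
    u = rng.choice(population)
    return [xi + b * (xi - ui) for xi, ui in zip(x, u)]   # Eq. (10)


def clip(x, lower, upper):
    """Assumed safeguard (not in Eqs. (7)-(10)): keep a solution inside its bounds."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


rng = random.Random(0)
moved = clip(update_worst([2.0, 0.5], [[3.0, 1.0]], phi=0.8, rng=rng),
             lower=[1.0, 0.1], upper=[10.0, 2.0])
```

Because the worst section draws $b$ from $(-1, 1)$ while the other sections draw it from $(0, 1)$, solutions in the worst section can move both towards and away from the reference solution.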
Stage 5:
Combining the soil

To determine how a solution is combined with the global best solution $(B_{G})$, a parameter $β$ is introduced and its value is fixed at the beginning. If $β >$ rand, the solution is updated using Eq. (11); otherwise, it is updated using Eq. (12).
$x_{new} = γ * (x_{i j} - B_{G}) + x_{i j}$
(11)

$x_{new} = rand (0, 1) * (x_{i j} - B_{L}) + x_{i j}$
(12)

In Eq. (11), $γ$ is set at the beginning and decreases gradually as the iterations continue; it is updated using Eq. (13).
$γ = γ * R, \quad R \in (0, 1)$
(13)

$B_{L}$ is the best solution available in the local memory and $B_{G}$ is the best solution found so far in the entire search space; both are used to improve the quality of the other solutions.
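The combination step can be sketched as follows, assuming that Eq. (12) is the branch taken whenever the condition $β >$ rand fails. The function names and the example values are hypothetical.

```python
import random


def combine(x, best_global, best_local, beta, gamma, rng):
    """Combine a solution with the global best (Eq. (11)) or the local best (Eq. (12))."""
    if beta > rng.random():
        return [xi + gamma * (xi - gi) for xi, gi in zip(x, best_global)]  # Eq. (11)
    r = rng.random()
    return [xi + r * (xi - li) for xi, li in zip(x, best_local)]           # Eq. (12)


def decay_gamma(gamma, r):
    """Shrink gamma by a factor r in (0, 1) after each iteration (Eq. (13))."""
    return gamma * r


rng = random.Random(0)
# beta = 1.0 always exceeds rand in [0, 1), so Eq. (11) is applied here.
x_new = combine([2.0], best_global=[1.0], best_local=[0.0], beta=1.0, gamma=0.5, rng=rng)
gamma = decay_gamma(0.8, 0.5)
```

A larger $β$ makes the global-best branch more likely, while the decay of $γ$ progressively shortens the steps taken relative to $B_{G}$.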
Stage 6:
Termination Condition

The algorithm terminates when either the fitness-based stopping condition or the maximum number of iterations is reached; otherwise, it continues with the next iteration.

4.2 Entropy based features

Entropy can simply be stated as the degree of randomness. Four entropy based features are extracted in this study, namely Kraskov, Permutation, Renyi and Tsallis entropy.

4.2.1 Kraskov Entropy (KE) [19]

It measures the non-linear characteristics of finite length physical or physiological time series. KE has numerous applications and is found to be a robust estimator where the data are variance-nonstationary. Another significant advantage of KE is that it does not require the data to follow a specific distribution, thus avoiding the need for a Gaussian transformation. KE is calculated using Eq. (14).

$KE (k, η, δ) = Ψ (η) - Ψ (k) + \log ς_{δ} + \frac{δ}{η} \sum_{i = 1}^{η} \log Δ_{i}$

(14)

In Eq. (14), $k$ is treated as a hyperparameter which is to be optimised using the proposed model.

4.2.2 Permutation entropy (PE) [34]

Permutation entropy, introduced by Bandt and Pompe, has been widely used in the analysis of data in various domains. Permutation entropy is the Shannon entropy of the decomposed dynamic elements of the given time series and it estimates the complexity of physiological signals for a given time sequence. It takes three inputs: the delay $(τ)$ , the coarse time series and the order $m$ (Eq. (15)). The significant feature of permutation entropy is that it handles real and noisy signals effectively.

$PE (m, τ, {(x_{t})}_{t = 0}^{N - 1}) = - \frac{1}{m} \sum_{π \in π_{m}} p_{π}^{τ} \ln p_{π}^{τ}$

(15)

In Eq. (15), $m$ is usually chosen arbitrarily and can therefore be viewed as a hyperparameter whose selection can be automated.
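To make the role of $m$ and $τ$ concrete, the sketch below computes permutation entropy from ordinal patterns following the form of Eq. (15), including its $1/m$ factor. It is a minimal illustration rather than the implementation used in this work; ties between equal samples are broken by their order of appearance, which is an assumption.

```python
import math
from collections import Counter


def permutation_entropy(x, m, tau=1):
    """Permutation entropy of order m and delay tau, scaled by 1/m as in Eq. (15)."""
    n_patterns = len(x) - (m - 1) * tau
    if n_patterns <= 0:
        raise ValueError("time series too short for the chosen m and tau")
    # Each window is mapped to the permutation that sorts its samples.
    counts = Counter(
        tuple(sorted(range(m), key=lambda j: x[i + j * tau]))
        for i in range(n_patterns)
    )
    probs = [c / n_patterns for c in counts.values()]
    return -sum(p * math.log(p) for p in probs) / m


# A strictly alternating series has two equally likely patterns for m = 2.
pe = permutation_entropy([1, 2, 1, 2, 1], m=2, tau=1)
```

A monotone series produces a single ordinal pattern and therefore an entropy of zero, whereas a larger $m$ distinguishes more patterns but needs a longer series for reliable probability estimates.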

4.2.3 Renyi entropy (RE) [28]

It quantifies the spectral complexity of the physiological signals. It relies mainly on the single parameter $α$ , which plays a vital role in extracting the information from the signal. In Eq. (16), if $α$ tends to 1, it is analogous to Shannon entropy. It is worth mentioning that RE remains unaffected for various density functions.

$RE (α) = - \frac{α}{1 - α} \sum \log p_{i}^{α}$

(16)

In Eq. (16), $α$ is treated as a hyperparameter which is to be optimised using the proposed model.

4.2.4 Tsallis entropy (TE) [32]

It characterizes the physical behaviour of the signal. It clarifies long range interactions, measures unforeseen changes and helps to discriminate EEG spikes effectively. The parameters involved in estimating TE are $q$ , which measures nonextensivity, and $p_{i}$ , which denotes the probability of discrete sets with $w$ configurations (Eq. (17)). It helps to measure the uncertainty and provides more information than the other traditional entropies, especially while analysing the spikes of the EEG signals.

$TE (q) = \frac{1 - \sum_{i = 1}^{w} p_{i}^{q}}{q - 1}$

(17)

In Eq. (17), $q$ is usually chosen arbitrarily and can therefore be viewed as a hyperparameter whose selection can be automated.
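Eq. (17) is straightforward to evaluate once the probabilities $p_{i}$ are available. The sketch below assumes the probabilities are passed in as a list that sums to one; the special case for $q = 1$ uses the Shannon entropy, which is the well-known limit of Tsallis entropy as $q$ tends to 1, and is added only to avoid a division by zero.

```python
import math


def tsallis_entropy(p, q):
    """Tsallis entropy of a discrete distribution p for a given q (Eq. (17))."""
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: Shannon entropy with the natural logarithm.
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)


# Uniform distribution over w = 4 configurations with q = 2: 1 - 4 * (1/4)^2 = 0.75.
te = tsallis_entropy([0.25, 0.25, 0.25, 0.25], q=2)
```

Evaluating the function over a range of $q$ values for the same distribution shows how strongly the resulting feature depends on this parameter.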

5. Proposed methodology

This section provides insight into the proposed work of optimising the parameters of entropy measures using the parallel farmland fertility algorithm with a novel fitness function. Among the existing metaheuristic algorithms, the farmland fertility algorithm is effective because it divides the search space into many segments, retains the optimal solution in each segment and aggregates them to attain the global optimal solution. It also balances exploration and exploitation effectively, as it handles local minima traps while identifying the global optimal solution, whereas many existing metaheuristic algorithms often fail to balance both. Besides, the parameters of the different entropy measures are bounded by different boundary values, so a single search space for identifying the optimal parameters of all entropy measures is not feasible. Hence, this article proposes a novel farmland fertility algorithm that parallelizes the search spaces, each with the boundaries of its own entropy, to locate their respective global optima. A further challenge lies in designing the fitness function, as it quantitatively measures how good an identified solution is at solving the given problem. Designing a single fitness function for multiple search spaces with different boundary values is challenging. In this work a new Kullback-Leibler (KL) divergence based fitness function is proposed to improve the optimization process and identify the globally optimal parameter set that improves the performance of entropy-based classification. As KL divergence is an asymmetric measure of the difference between probability distributions, the entropy features computed using the optimal solution provide class-specific information and group points of the same class closely for effective classification.

5.1 Signal decomposition

Before extracting the significant features, a suitable signal decomposition tool is required, as EEG signals are non-stationary, aperiodic and complex in nature. Among the existing signal decomposition tools, the Discrete Wavelet Transform (DWT) is a promising choice as it performs multi-resolution analysis, examining the EEG signals at different resolutions. Another notable advantage of DWT is that it captures both the time and frequency information of a signal simultaneously. Literature reveals that the Daubechies (DB) wavelet is more appropriate for physiological signal processing as it has overlapping windows, and therefore the frequency spectrum coefficients capture all the changes in the frequency domain. There is a trade-off between frequency and time resolution: as the order increases, the frequency resolution of the signal improves but the time resolution deteriorates. For an effective compromise between time and frequency information, DB10 [35] is employed in this work, after evaluating many members of the DB family using the minimum entropy criterion. DB10 with four levels of decomposition yields four detailed coefficients (D1-D4) and one approximate coefficient (A4), which are mapped to the conventional frequency bands of the human brain (D1 ( $δ$ ), D2 ( $θ$ ), D3 ( $α$ ), D4 ( $β$ ), A4 ( $γ$ )).

5.2 Parallel Farmland Fertility Optimization Algorithm (PFFO)

In the Farmland Fertility Optimization algorithm (Fig. 2), the search space is divided into segments ( $ε$ ), mimicking agricultural land. In the proposed work, however, each entropy is considered as an individual segment which is further subdivided into multiple sub-segments. The optimization of each entropy parameter is independent of the others, which makes parallelization feasible.

Let the number of sub-segments be denoted as $ω$ and the number of solutions generated in each sub-segment be represented as $τ$ . Thus, the population size ( $N$ ) generated for each entropy can be expressed as $N = ω * τ$ . The aggregate of all populations ( $N_{agg}$ ) generated across the segments is given by $N_{agg} = ε * N$ . Here, the number of segments ( $ε$ ) of the agricultural land is the number of entropy parameters to be optimised. Determining the number of sub-segments ( $ω$ ) is an NP-hard problem; however, according to the “Principle of randomness” followed by metaheuristic algorithms, the search space is divided arbitrarily. In [33] the authors considered 5 assumptions and concluded that the number of sections may vary between 2 and 8.

5.2.1 Generation of initial population

In general, metaheuristic algorithms begin with population initialization, and the population size is decided based on the expected solution. Let us consider a $d$-dimensional search space with a population of size $m$, in which the $i$-th candidate can be represented as,

$Y_{i} = [y_{i}^{1}, y_{i}^{2}, \dots, y_{i}^{p}, \dots, y_{i}^{d}]$

where $y_{i}^{p}$ is the position of the $i$-th candidate in the $p$-th dimension.

Here, each generated candidate is the five-tuple $\langle de_{1}, de_{2}, de_{3}, de_{4}, ae_{4} \rangle$ where $de_{1} - de_{4}$ denote the entropy parameters employed to extract entropy features from the detailed coefficients and $ae_{4}$ represents the entropy parameter employed to extract entropy features from the approximate coefficients. As the parameters of different entropies are to be optimised, boundary conditions have to be fixed for each entropy to maintain the balance between exploration and exploitation in the search space.

$x_{i j} = L_{j} + rand (0, 1) \times (U_{j} - L_{j})$

where $U_{j}$ and $L_{j}$ denote the upper and lower bounds of $x$ respectively. In this context the values of $U_{j}$ and $L_{j}$ are determined based on the boundary conditions identified for each entropy parameter.

Figure 1. Glimpse: Parallel farmland fertility optimization algorithm.

Figure 2. Workflow: Parallel farmland fertility optimization algorithm.

5.2.2 Fitness function

Designing a suitable objective function is essential, as it quantitatively measures how pertinent a particular solution is to the given problem. An inappropriate objective function becomes a bottleneck. Since the aim of the study is to maximize the performance of the learning model by identifying the optimal parameters of each entropy, the proposed work can be viewed as a maximization problem. Using the performance of the learning model itself as the objective function is cumbersome. Instead, the difference between the data distributions can be measured, which captures the information needed for maximizing the performance. To achieve this, Kullback-Leibler (KL) divergence, which is extensively employed in variational inference, is used to design the objective function, as it quantifies the difference between data distributions. Since the problem considered here is a maximization problem, the fitness function based on KL divergence can be formulated as follows,

$\mathrm{Fitness}_{PFFO} = \max (KL (P \| Q))$

where $P$ and $Q$ are the probability distributions. $KL (P \| Q)$ can be computed as follows,

$KL (P \| Q) = \sum_{i = 1}^{N} p (x_{i}) \cdot (\log p (x_{i}) - \log q (x_{i}))$

In general, KL divergence is used to measure the information lost when an original distribution is approximated by another distribution. In this context, in contrast, KL divergence is employed in a new way to compute the difference between the distributions of two entities.
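One way such a fitness could be evaluated is sketched below. It assumes that $P$ and $Q$ are histograms of an entropy feature computed for two classes (for example seizure and non-seizure segments) on a shared set of bins; the binning, the smoothing constant `eps` and the function names are assumptions, and the weighting of the fitness function is not reproduced here.

```python
import math


def histogram(values, lo, hi, bins, eps=1e-9):
    """Normalised histogram on [lo, hi]; eps avoids log(0) in empty bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    total = len(values) + eps * bins
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """KL(P || Q) for two discrete distributions defined on the same bins."""
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))


def kl_fitness(class_a, class_b, bins=10):
    """Assumed fitness: KL divergence between the feature histograms of two classes."""
    lo = min(min(class_a), min(class_b))
    hi = max(max(class_a), max(class_b))
    if hi == lo:
        return 0.0
    return kl_divergence(histogram(class_a, lo, hi, bins),
                         histogram(class_b, lo, hi, bins))


# Hypothetical feature values: well separated classes versus identical classes.
separated = kl_fitness([0.1, 0.2, 0.15], [0.8, 0.9, 0.85])
overlapping = kl_fitness([0.1, 0.5, 0.9], [0.1, 0.5, 0.9])
```

A parameter setting that separates the two class distributions yields a larger divergence, which is why the optimizer maximises this quantity.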

5.2.3 Segment-wise Parallelization

Identifying the optimal parameters of each entropy using the farmland fertility algorithm on a single processor leads to computational overhead. It is also worth mentioning that the entropy features to be extracted are independent of each other. Thus, the farmland fertility algorithm is parallelised: each task (entropy parameter optimization) is executed simultaneously on multiple processors, thereby reducing the overall processing time. Further, the parallel processing is executed asynchronously; because the parameters of each entropy do not influence each other, locking is not required. Thus, the parameters of each entropy are optimized and the entropy features are extracted asynchronously through parallel processing.
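The independence of the tasks can be illustrated with Python's `concurrent.futures`. In the sketch below a simple random search stands in for the farmland fertility loop, `toy_fitness` stands in for the KL-based fitness, and the bounds listed for each entropy are purely illustrative; none of these are taken from the proposed method.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_fitness(value):
    """Hypothetical stand-in for the KL-based fitness; peaks at value = 2."""
    return -(value - 2.0) ** 2


def optimise_entropy(task):
    """Random search within one entropy's own bounds (placeholder for the FFO loop)."""
    name, lower, upper, seed = task
    rng = random.Random(seed)
    candidates = [rng.uniform(lower, upper) for _ in range(200)]
    return name, max(candidates, key=toy_fitness)


def optimise_all(tasks, executor):
    """Run every entropy's search independently; no shared state, so no locks."""
    return dict(executor.map(optimise_entropy, tasks))


# Illustrative (name, lower bound, upper bound, seed) tuples, one per entropy.
tasks = [("kraskov", 1.0, 10.0, 0), ("permutation", 2.0, 7.0, 1),
         ("renyi", 0.1, 5.0, 2), ("tsallis", 0.1, 5.0, 3)]
with ThreadPoolExecutor(max_workers=4) as pool:
    best = optimise_all(tasks, pool)
```

For CPU-bound searches on multiple processors, `optimise_all` can be given a `concurrent.futures.ProcessPoolExecutor` instead, created under an `if __name__ == "__main__":` guard so that worker processes can import the task functions safely.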

5.2.4 Termination condition

Once the optimal fitness value is reached or the algorithm attains the maximum number of iterations the proposed algorithm is terminated.

5.2.5 Population update

If the termination condition is not attained, each population is updated using Eqs (3)–(13) based on the obtained fitness value. Once the termination condition is reached, the optimal solution from each segment of the farmland is gathered from its global memory, and the level-specific entropy features are extracted based on the optimal entropy parameters.

5.3 Epileptic seizure detection

The entropy-based features extracted using the level-specific optimal parameters are fed into state-of-the-art classifiers for detecting epileptic seizures. The classifiers are briefly described as follows. Support Vector Machines (SVM) form candidate hyperplanes and identify the best hyperplane using the constraints; the kernel is configured as a Radial Basis Function because the input is non-linear, and learning is supervised. Linear Discriminant Analysis (LDA) classifies by probability estimation computed using Bayes' theorem; it minimizes the within-class variance and maximizes the distance between the classes. K-Nearest Neighbors (KNN) is an instance-based learner that assigns a sample to the majority class among its K nearest training samples in the feature space; K is a hyperparameter, fixed here as 3. Random Forest (RF) is a bagging method built from an ensemble of decision trees; the error of an individual tree does not propagate, and the additional randomness of searching for the best feature among random feature subsets results in a better learning model. Artificial Neural Network (ANN) is inspired by biological neuronal interactions and contains artificial neurons as nodes. It consists of an input layer followed by hidden and output layers. ReLU and Softmax are configured as the activations in the hidden and output layers respectively, and the Adam optimizer is used for training the neural network.
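A possible scikit-learn configuration of the five classifiers described above is sketched below. Only the RBF kernel, K = 3, ReLU and Adam come from the text; the function name `build_classifiers`, the hidden-layer size, `max_iter`, the random seed and all remaining defaults are assumptions. Note also that scikit-learn's `MLPClassifier` selects its output activation itself (logistic for two classes, softmax for more), so this is only an approximation of the ANN described.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC


def build_classifiers(random_state=0):
    """Return the five classifiers keyed by the abbreviations used in the text."""
    return {
        "SVM": SVC(kernel="rbf"),                    # RBF kernel, as stated
        "LDA": LinearDiscriminantAnalysis(),
        "KNN": KNeighborsClassifier(n_neighbors=3),  # K fixed as 3
        "RF": RandomForestClassifier(random_state=random_state),
        "ANN": MLPClassifier(
            hidden_layer_sizes=(32,),                # assumed size, not from the text
            activation="relu",                       # hidden-layer activation
            solver="adam",                           # Adam optimizer
            max_iter=500,                            # assumed iteration budget
            random_state=random_state,
        ),
    }
```

Each estimator exposes the usual `fit` and `predict` methods, so the same evaluation code can be applied to all five.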

6. Results and discussions

This section provides the details of the experimental setup, the EEG datasets used and the validation procedure, and discusses the inferences drawn from the recorded results.

6.1 Experimental setup

The proposed work was implemented in Python 3 using Scikit-learn's classifier functions to perform the classification tasks. NumPy and Matplotlib were used for mathematical operations and plotting respectively. In addition, the PyWavelets package was used to extract Discrete Wavelet Transform (DWT) coefficients from the raw EEG signals. All experiments were carried out in Google Colaboratory.

6.2 Dataset description

In order to evaluate the performance of the proposed model without any bias towards the specific data, three different EEG benchmark datasets were used: University of Bern-Barcelona focal Epilepsy dataset, University of Bonn Epilepsy dataset and Indian EEG Dataset.

6.2.1 University of Bern-Barcelona [36]

It consists of two classes: Focal and Non-Focal EEG signals. Each class contains 3750 pairs of randomly picked EEG signals recorded simultaneously with a sampling rate of either 512 Hz or 1024 Hz. Each sample contains 10240 data points, which corresponds to a 20-second window [37].

6.2.2 University of Bonn [38]

It comprises five different classes, each containing 100 samples recorded at a sampling rate of 173.61 Hz. Healthy volunteers: Eyes closed (A) and Eyes open (B); epileptic patients: Inter-Ictal Focal (C), Inter-Ictal Non-Focal (D) and Ictal (E). Each sample contains 4097 data points, which corresponds to a 23.6-second window.

6.2.3 EEG Dataset New Delhi [39]

EEG signals collected from ten epilepsy patients of the Neurology & Sleep Centre, Hauz Khas, New Delhi are used in this work. Each sample consists of 1024 data points recorded over a duration of 5.12 seconds at a sampling rate of 200 Hz. Gold-plated electrodes were positioned following the 10–20 electrode placement system. The recorded EEG signals were filtered between 0.5 Hz and 70 Hz and labelled into three different classes: ictal, inter-ictal and pre-ictal.
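The window lengths quoted for the three datasets follow from dividing the number of data points by the sampling rate. The short check below reproduces this arithmetic; the dictionary keys are informal labels, and the Bern-Barcelona entry assumes the 512 Hz rate, which is the one consistent with a 20-second window.

```python
# (number of data points per sample, sampling rate in Hz)
datasets = {
    "Bern-Barcelona": (10240, 512.0),  # assumes the 512 Hz rate
    "Bonn": (4097, 173.61),
    "New Delhi": (1024, 200.0),
}


def duration_seconds(n_points, sampling_rate_hz):
    """Length of a recording in seconds."""
    return n_points / sampling_rate_hz


durations = {name: duration_seconds(*spec) for name, spec in datasets.items()}
```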

6.3 Data fine-tuning and validation

To improve the robustness of the learning model in view of real-time applications, where the influence of noise is unavoidable, no separate filters were applied to the EEG signals. The entropy features extracted using the optimal parameters obtained from the proposed work are normalised using zero-mean unit-variance normalisation. Stratified ten-fold cross-validation was employed, and the entire process was repeated twenty times to avoid biased results. The state-of-the-art classifiers, namely Support Vector Machines (SVM) with a Radial Basis Function kernel, Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Random Forest (RF) and Artificial Neural Network (ANN), were used to show the role of the optimal parameters in improving the overall performance of the learning model. The performance of the learning models was evaluated in terms of the performance metrics given in Section 6.4.
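The validation protocol (zero-mean unit-variance normalisation followed by stratified ten-fold cross-validation repeated twenty times) can be sketched with scikit-learn as follows. The function name `repeated_cv_accuracy`, the synthetic data and the random seeds are assumptions; fitting the scaler inside each training fold is a design choice of this sketch, as the text does not state where the normalisation statistics were computed.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def repeated_cv_accuracy(X, y, classifier, n_splits=10, n_repeats=20, seed=0):
    """Accuracy of `classifier` for every fold of a repeated stratified k-fold."""
    # StandardScaler performs zero-mean unit-variance normalisation per feature.
    pipeline = make_pipeline(StandardScaler(), classifier)
    cv = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=seed)
    return cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")


# Synthetic two-class feature matrix standing in for the entropy features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(3.0, 1.0, (50, 5))])
y = np.array([0] * 50 + [1] * 50)
scores = repeated_cv_accuracy(X, y, KNeighborsClassifier(n_neighbors=3))
```

The returned array holds one accuracy per fold (10 folds × 20 repeats), from which a mean and spread can be reported.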

6.4 Performance metrics

In order to calculate the performance, True Positive ($T_{p}$), True Negative ($T_{n}$), False Positive ($F_{p}$) and False Negative ($F_{n}$) values are obtained from the confusion matrix. The formulae are given as follows,

S. no.	Performance metrics	Formula
1	Sensitivity	$\frac{T_{p}}{T_{p} + F_{n}}$
2	Specificity	$\frac{T_{n}}{T_{n} + F_{p}}$
3	Precision	$\frac{T_{p}}{T_{p} + F_{p}}$
4	F1 score	$2 \times \left( \frac{\text{Sensitivity} \times \text{Precision}}{\text{Sensitivity} + \text{Precision}} \right)$
5	Geometric mean	$\sqrt{\text{Sensitivity} \times \text{Specificity}}$
6	Classification accuracy	$\frac{T_{p} + T_{n}}{T_{p} + T_{n} + F_{p} + F_{n}}$
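The six formulae in the table above translate directly into code. The sketch below computes them from the four confusion-matrix counts; the function name `classification_metrics` and the example counts are hypothetical, and non-zero denominators are assumed.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute the six performance metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * sensitivity * precision / (sensitivity + precision),
        "g_mean": math.sqrt(sensitivity * specificity),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical fold with 15 positive and 15 negative samples, one positive missed.
metrics = classification_metrics(tp=14, tn=15, fp=0, fn=1)
```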

6.5 Discussions

Using the proposed PFFO, the level-specific (A4, D4, D3, D2, and D1) entropy parameters of Renyi, Tsallis, Permutation, and Kraskov entropies are tabulated (Tables 1 and 2). For Renyi and Tsallis entropies, the optimal parameter values obtained from PFFO lie close to the default parameter value ($\alpha, q = 2$), within about $2 \pm 0.30$. However, for the University of Bern dataset the optimal parameter values vary significantly (Table 1). For Permutation and Kraskov entropies, the optimal parameter values differ significantly from the default parameter values ($m = 3$, $k = 4$) (Table 2). It is also worth mentioning that the parameter values of the entropies differ across the levels of decomposition.

Table 1.
Optimal parameter values obtained from PFFO (Renyi & Tsallis Entropies)

Default parameter value		Renyi entropy ( $α$ ): 2					Tsallis entropy ( $q$ ): 2
Dataset	Cases	A4	D4	D3	D2	D1	A4	D4	D3	D2	D1
EEG dataset, New Delhi	Ictal-InterIctal	2.00	2.21	1.80	2.00	1.80	2.30	2.30	2.30	2.30	2.30
	Inter-PreIctal	1.80	2.00	2.00	2.00	2.00	2.30	2.30	2.30	2.30	2.30
	Ictal-PreIctal	1.80	2.30	2.30	2.00	1.93	2.30	2.30	2.30	2.30	2.30
Bern EEG	F vs NF	9.20	2.08	9.45	7.54	0.47	10.0	9.84	10.0	10.0	9.86
University of Bonn	A-E	2.12	1.94	2.05	2.17	2.15	2.19	2.18	2.15	2.18	2.15
	C-D	2.00	1.93	2.11	2.04	1.94	2.19	2.14	2.19	2.18	2.15
	C/D-E	2.09	2.02	2.19	2.01	2.06	2.17	2.19	2.19	2.19	2.20
	C-E	2.15	2.08	2.04	2.10	1.98	2.16	2.17	2.16	2.13	2.13
	D-E	2.00	1.87	2.08	2.02	2.18	2.20	2.19	2.17	2.17	2.19
	A-B	1.91	2.09	2.10	1.90	2.02	2.19	2.13	2.19	2.19	2.19
	A-C	1.98	2.10	2.00	1.86	2.06	2.19	2.18	2.19	2.19	2.17
	B-C	2.06	1.96	2.03	1.94	1.92	2.14	2.17	2.19	2.19	2.17
	A-D	2.01	1.99	1.88	1.93	2.10	2.19	2.20	2.19	2.15	2.19
	B-D	2.00	2.07	2.12	2.10	2.10	2.19	2.19	2.14	2.14	2.18
	B-E	2.09	2.05	1.92	2.00	2.20	2.18	2.05	2.12	2.17	2.20

Table 2.

Optimal parameter values obtained from PFFO (Permutation & Kraskov Entropies)

Default parameter value		Permutation entropy ( $m$ ): 3					Kraskov entropy ( $k$ ): 4
Dataset	Cases	A4	D4	D3	D2	D1	A4	D4	D3	D2	D1
EEG dataset, New Delhi	Ictal-InterIctal	5	4	5	5	5	3	2	7	9	10
	Inter-PreIctal	5	5	5	5	6	2	2	6	10	10
	Ictal-PreIctal	5	5	5	5	6	2	3	8	8	8
Bern EEG	F vs NF	6	5	6	6	2	5	5	9	9	6
University of Bonn	A-E	5	5	5	6	6	2	5	7	4	15
	C-D	5	5	5	6	6	2	4	5	15	15
	C/D-E	5	5	5	6	6	3	2	9	4	15
	C-E	5	5	5	6	6	4	6	9	8	15
	D-E	5	5	5	6	6	4	4	6	8	15
	A-B	5	5	5	6	6	3	4	3	8	9
	A-C	5	5	5	6	6	2	3	9	9	11
	B-C	6	4	4	6	5	2	7	7	9	9
	A-D	5	5	4	6	6	2	5	8	15	9
	B-D	6	5	5	6	6	4	5	9	15	9
	B-E	5	5	5	6	6	3	3	8	8	9

From the confusion matrix, sensitivity, specificity, precision, classification accuracy, F1 score and geometric mean are computed. Sensitivity is calculated with respect to the actual positive labels. Table 3 shows that the optimal parameters obtained using the proposed model outperform the default parameters in most cases. For the complex case C/D-E only a minuscule difference is observed, and for A-C with SVM the default parameters outperform. Nevertheless, for most of the cases the values obtained using PFFO are superior.

Table 3.

Performance comparison in terms of sensitivity

Classifiers		SVM		LDA		KNN		RF		ANN
Dataset	Cases	Best	Default	Best	Default	Best	Default	Best	Default	Best	Default
EEG dataset, New Delhi	Ictal-InterIctal	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
	Inter-PreIctal	100.00	100.00	100.00	88.89	87.50	80.00	100.00	100.00	100.00	80.00
	Ictal-PreIctal	100.00	100.00	100.00	100.00	100.00	88.90	100.00	83.33	80.00	100.00
Bern EEG	F vs NF	75.00	73.68	80.70	72.73	84.30	80.30	89.40	83.82	80.30	75.71
University of Bonn	A-E	100.00	100.00	100.00	93.33	100.00	100.00	100.00	94.12	100.00	100.00
	C-D	85.71	76.92	78.57	60.00	88.24	84.62	68.75	62.50	93.75	84.62
	C/D-E	93.75	93.75	100.00	94.44	93.75	92.86	90.00	100.00	100.00	100.00
	C-E	100.00	92.31	95.00	93.75	100.00	92.31	86.67	92.86	100.00	100.00
	D-E	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	93.75	94.44
	A-B	100.00	94.44	100.00	94.44	100.00	93.33	100.00	93.33	100.00	93.75
	A-C	92.86	100.00	100.00	92.86	100.00	93.75	100.00	93.33	93.33	89.47
	B-C	100.00	92.86	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
	A-D	100.00	100.00	100.00	89.47	100.00	92.31	100.00	91.67	100.00	94.44
	B-D	100.00	91.67	92.31	100.00	100.00	100.00	100.00	76.92	100.00	87.50
	B-E	100.00	93.33	100.00	100.00	100.00	100.00	100.00	100.00	100.00	87.50

Specificity is computed with respect to the actual negative labels. For most of the cases, the optimal parameters yield higher specificity (Table 4). However, for the complex case C-D both the default and the best parameters yield low results, and for D-E the default parameters outperform. Owing to the complex EEG patterns, the SOTA classifiers fail to discriminate true negatives effectively in these cases.

Table 4.

Performance comparison in terms of specificity

Classifiers		SVM		LDA		KNN		RF		ANN
Dataset	Cases	Best	Default	Best	Default	Best	Default	Best	Default	Best	Default
EEG dataset, New Delhi	Ictal-InterIctal	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	81.82
	Inter-PreIctal	75.00	50.00	100.00	50.00	71.40	60.00	81.80	70.00	75.00	70.00
	Ictal-PreIctal	80.00	87.50	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
Bern EEG	F vs NF	79.50	78.38	77.60	76.71	87.50	77.20	67.90	78.05	81.10	77.50
University of Bonn	A-E	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	94.44
	C-D	50.00	52.94	68.75	66.67	61.54	58.82	85.71	71.43	64.29	58.82
	C/D-E	100.00	100.00	100.00	91.67	100.00	93.75	100.00	100.00	93.33	93.75
	C-E	100.00	100.00	100.00	92.86	100.00	94.12	100.00	100.00	100.00	94.44
	D-E	94.12	100.00	100.00	94.44	93.33	87.50	95.00	94.74	92.86	100.00
	A-B	100.00	91.67	100.00	91.67	100.00	86.67	100.00	73.33	100.00	92.86
	A-C	100.00	100.00	100.00	93.75	100.00	92.86	100.00	93.33	100.00	100.00
	B-C	100.00	93.33	100.00	94.12	100.00	100.00	93.75	100.00	100.00	92.86
	A-D	100.00	100.00	92.86	100.00	100.00	94.12	94.44	66.67	100.00	100.00
	B-D	100.00	100.00	100.00	100.00	93.75	100.00	100.00	100.00	100.00	92.86
	B-E	100.00	100.00	100.00	100.00	100.00	100.00	100.00	93.75	100.00	92.86

Precision is computed with respect to the positive predicted labels. It is evident from Table 5 that the features extracted using the optimal parameters dominate the features extracted using the default parameters.

Table 5.

Performance comparison in terms of precision

Classifiers		SVM		LDA		KNN		RF		ANN
Dataset	Cases	Best	Default	Best	Default	Best	Default	Best	Default	Best	Default
EEG dataset, New Delhi	Ictal-InterIctal	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	66.67
	Inter-PreIctal	77.78	63.64	100.00	72.73	77.78	80.00	66.67	62.50	77.78	57.14
	Ictal-PreIctal	90.91	87.50	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00
Bern EEG	F-NF	77.14	77.78	81.71	76.71	85.51	76.00	68.60	76.00	81.33	74.65
University of Bonn	A-E	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	92.31
	C-D	60.00	55.56	68.75	64.29	75.00	61.11	84.62	71.43	75.00	61.11
	C/D-E	100.00	100.00	100.00	94.44	100.00	92.86	100.00	100.00	93.75	93.33
	C-E	100.00	100.00	100.00	93.75	100.00	92.31	100.00	100.00	100.00	92.31
	D-E	92.86	100.00	100.00	92.31	93.75	87.50	90.91	91.67	93.75	100.00
	A-B	100.00	94.44	100.00	94.44	100.00	87.50	100.00	77.78	100.00	93.75
	A-C	100.00	100.00	100.00	92.86	100.00	93.75	100.00	93.33	100.00	100.00
	B-C	100.00	92.86	100.00	92.86	100.00	100.00	93.33	100.00	100.00	94.12
	A-D	100.00	100.00	94.12	100.00	100.00	92.31	92.31	64.71	100.00	100.00
	B-D	100.00	100.00	100.00	100.00	93.33	100.00	100.00	100.00	100.00	93.33
	B-E	100.00	100.00	100.00	100.00	100.00	100.00	100.00	93.33	100.00	93.33

Classification accuracy is the number of correct predictions over the total number of predictions. For most of the considered cases, a significant improvement in classification accuracy (Table 6) is observed when the SOTA classifiers are trained on the features extracted using the optimal parameters. However, for the cases C/D-E and C-E the default parameters yield better results when classified using RF.

Table 6.

Performance comparison in terms of classification accuracy

Classifiers		SVM		LDA		KNN		RF		ANN
Dataset	Cases	Best	Default	Best	Default	Best	Default	Best	Default	Best	Default
EEG dataset, New Delhi	Ictal-InterIctal	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	100.00	86.67
	Inter-PreIctal	86.67	73.33	100.00	73.33	80.00	73.33	86.67	80.00	86.67	73.33
	Ictal-PreIctal	93.33	93.33	100.00	100.00	100.00	93.33	100.00	93.33	86.67	100.00
Bern EEG	F-NF	77.33	76.00	79.33	74.67	86.00	78.67	77.33	80.67	80.67	76.67
University of Bonn	A-E	100.00	100.00	100.00	96.67	100.00	100.00	100.00	96.67	100.00	96.67
	C-D	66.67	63.33	73.33	63.33	76.67	70.00	76.67	66.67	80.00	70.00
	C/D-E	96.67	96.67	100.00	93.33	96.67	93.33	96.67	100.00	96.67	96.67
	C-E	100.00	96.67	96.67	93.33	100.00	93.33	93.33	96.67	100.00	96.67
	D-E	96.67	100.00	100.00	96.67	96.67	93.33	96.67	96.67	93.33	96.67
	A-B	100.00	93.33	100.00	93.33	100.00	90.00	100.00	83.33	100.00	93.33
	A-C	96.67	100.00	100.00	93.33	100.00	93.33	100.00	93.33	96.67	93.33
	B-C	100.00	93.10	100.00	96.67	100.00	100.00	96.67	100.00	100.00	96.67
	A-D	100.00	100.00	96.67	93.33	100.00	93.33	96.67	76.67	100.00	96.67
	B-D	100.00	96.67	96.67	100.00	96.67	100.00	100.00	90.00	100.00	90.00
	B-E	100.00	96.67	100.00	100.00	100.00	100.00	100.00	96.67	100.00	90.00

Figure 3.

Performance comparison in terms of F1-score.

Similarly, bar graphs are plotted in terms of F1-score (Fig. 3) and G-mean (Fig. 4). It is evident from the graphs (Figs 3 and 4) that the parameters optimized using the proposed model outperform in most of the cases. In a few cases, both sets of parameters attain similar results.

6.6 Statistical analysis

Table 7.
Statistical analysis: WTEST

Classifiers	SVM	LDA	KNN	RF	ANN
$p$-value	0.04401	0.00397	0.00376	0.03828	0.02967

To show the statistical dissimilarity between the performances of the learning models trained, validated and tested using the default and the optimal parameters of the entropy features, the Wilcoxon rank-sum test (WTEST) is performed. WTEST is a non-parametric test in which the null hypothesis is rejected when the computed $p$-value is less than 0.05, i.e., at the 5% significance level. Table 7 confirms that the $p$-values obtained from WTEST are less than 0.05 for all classifiers, indicating that the dissimilarity between the entropy features extracted using the default and the optimal parameters is significant.
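For reference, the rank-sum test can be sketched in a few lines of standard-library Python. This is a minimal sketch of the two-sided test using the normal approximation without tie correction; the function name `rank_sum_test` is hypothetical, and it is an assumption that the two samples are the scores obtained with the default and the optimal parameters. `scipy.stats.ranksums` provides the same test.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns the z statistic of sample `x` and the two-sided p-value.
    """
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values; they all receive the average rank.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum_x - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value
```

The null hypothesis of identical distributions is rejected when the returned `p_value` falls below 0.05.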

Figure 4.

Performance comparison in terms of G-mean Score.

6.7 Ablation study

The performance of the proposed model stems from the combined influence of the novel components employed in this work. Thus, the impact of each component is discussed in this section.

6.7.1 Influence of novel fitness function

In general, when identifying optimal parameters for a learning model, the performance of that learning model is widely used as the objective. However, the identified optimal parameters then become model dependent, and they may not yield better results for other learning models. Also, if the performance of the learning model is used as the objective function, superfluous computational overhead occurs. Thus, a novel objective function based on KL divergence is designed to extract the difference among the data distributions, which ultimately holds the information needed for maximizing the performance irrespective of the learning model. Therefore, the optimal parameters obtained from PFFO are model independent and reliable.

6.7.2 Influence of segment-wise parallelization

As the parameters of each entropy are independent of those of the other entropies considered, the parallelization component employed in this work is well suited to the problem. If the algorithm is executed sequentially, the execution time will be higher than with the segment-wise parallelization component designed in this work.

6.7.3 Influence of identification of optimal parameters

The optimal parameters obtained using the proposed model clearly influence the performance of the SOTA classifiers considered, as is evident from the results obtained (Tables 3–6). Although the identified optimal parameters of Renyi and Tsallis entropies lie around the default parameters for the Indian EEG dataset and the University of Bonn dataset, a significant improvement is observed in the performance of the learning model. Also, for all three datasets, a substantial difference is noticed between the optimal and default parameters of Permutation and Kraskov entropies. The combined effect of all the optimized entropies greatly impacts the performance of the SOTA classifiers taken into account.

On a concluding note, performance improvement is clearly inferred while optimising the parameters of the entropies. Due to the complex nature of EEG patterns, in some difficult cases, SOTA classifiers struggle to discriminate the samples accurately. Also, it is noticeable that level-specific parameter optimization directly influences the performance of the classifiers considered.

7. Conclusions

Epilepsy is characterized by recurrent seizures that occur in different parts of the brain and is usually assessed using EEG signals, which are non-stationary, non-linear and transient in nature. Non-stationarity implies that a signal's statistical and spectral characteristics change with time. Entropy measures have proven to be a promising tool for characterizing the non-stationary property of EEG signals. However, while extracting features, the parameters of the various entropies play a vital role and influence the performance of the learning model. Hence, this paper puts forth a novel parallel farmland fertility optimization algorithm that optimizes entropy parameters based on Kullback-Leibler divergence. Once the optimal parameters are identified, features are extracted and classified using state-of-the-art classifiers. The proposed model was validated using the EEG benchmark datasets obtained from University of Bonn, University of Bern and Indian EEG, New Delhi in terms of classification accuracy, sensitivity, specificity, precision, F1-score, and G-mean. The proposed algorithm can be used by neurologists and technicians for the precise detection and advance prediction of epileptic seizures, and may be useful for other biomedical applications involving physiological signals.

Acknowledgments

This work was supported by The IBM Shared University Research Grant 2017. New York, USA.

References

1. Mursalin, Zhang, Chen, Chawla. Automated epileptic seizure detection using improved correlation-based feature selection with random forest classifier. Neurocomputing. 2017 Jun 7; 241: 204-14.
2. National Institute of Neurological Disorders and Stroke (US). Office of Communications and Public Liaison. The epilepsies and seizures: Hope through research. Department of Health & Human Services, NIH, National Institute of Neurological Disorders and Stroke; 2015.
3. Gourie-Devi, Gururaj, Satishchandra, Subbakrishna. Prevalence of neurological disorders in Bangalore, India: a community-based study with a comparison between urban and rural areas. Neuroepidemiology. 2004 Sep 16; 23(6): 261-8.
4. Glory, Vigneswaran, Jagtap, Shruthi, Hariharan, Sriram. AHW-BGOA-DNN: A novel deep learning model for epileptic seizure detection. Neural Computing and Applications. 2021 Jun; 33: 6065-93.
5. EEG Pocket Guide, IMOTIONS Biometric Research Platform, 2016.
6. Ramanna, Tirunagari, Windridge. Epileptic seizure detection using constrained singular spectrum analysis and 1D-local binary patterns. Health and Technology. 2020 May; 10: 699-709.
7. Srinivasan, Eswaran, Sriraam. Artificial neural network based epileptic detection using time-domain and frequency-domain features. Journal of Medical Systems. 2005 Dec; 29: 647-60.
8. Tian, Deng, Ying, Choi, Qin, Wang, Shen, Wang. Deep multi-view feature learning for EEG-based epileptic seizure detection. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 2019 Sep 11; 27(10): 1962-72.
9. Acharya, Hagiwara, Deshpande, Suren, Koh, Arunkumar, Ciaccio, Lim. Characterization of focal EEG signals: A review. Future Generation Computer Systems. 2019 Feb 1; 91: 290-9.
10. Boonyakitanont, Lek-Uthai, Chomtho, Songsiri. A review of feature extraction and performance evaluation in epileptic seizure detection using EEG. Biomedical Signal Processing and Control. 2020 Mar 1; 57: 101702.
11. Acharya, Fujita, Sudarshan, Bhat, Koh. Application of entropies for automated diagnosis of epilepsy using EEG signals: A review. Knowledge-Based Systems. 2015 Nov 1; 88: 85-96.
12. Nicolaou, Georgiou. Detection of epileptic electroencephalogram based on permutation entropy and support vector machines. Expert Systems with Applications. 2012 Jan 1; 39(1): 202-9.
13. Song, Zhang. Automatic recognition of epileptic EEG patterns via extreme learning machine and multiresolution feature extraction. Expert Systems with Applications. 2013 Oct 15; 40(14): 5477-89.
14. Zhu, Wen, Wang. Epileptogenic focus detection in intracranial EEG based on delay permutation entropy. In AIP Conference Proceedings. 2013 Oct 9 (Vol. 1559(1), pp. 31-36). American Institute of Physics.
15. Kumar, Dewal, Anand. Epileptic seizure detection using DWT based fuzzy approximate entropy and support vector machine. Neurocomputing. 2014 Jun 10; 133: 271-9.
16. Sharma, Pachori, Acharya. Application of entropy measures on intrinsic mode functions for the automated identification of focal electroencephalogram signals. Entropy. 2014 Dec 8; 17(2): 669-91.
17. Wang, Xue, Luo, Huang, Cui, Huang. Automatic epileptic seizure detection in EEG signals using multi-domain feature extraction and nonlinear analysis. Entropy. 2017 May 27; 19(6): 222.
18. Zhang, Chen. Fuzzy distribution entropy and its application in automated seizure detection technique. Biomedical Signal Processing and Control. 2018 Jan 1; 39: 360-77.
19. Patidar, Panigrahi. Detection of epileptic seizure using Kraskov entropy applied on tunable-Q wavelet transform of EEG signals. Biomedical Signal Processing and Control. 2017 Apr 1; 34: 74-80.
20. Gupta, Pachori. Epileptic seizure identification using entropy of FBSE based EEG rhythms. Biomedical Signal Processing and Control. 2019 Aug 1; 53: 101569.
21. Krishnan, Raj, Balasubramanian, Chen. Schizophrenia detection using Multivariate Empirical Mode Decomposition and entropy measures from multichannel EEG signal. Biocybernetics and Biomedical Engineering. 2020 Jul 1; 40(3): 1124-39.
22. Rocha, Barros, Silva, Sousa, da Silva. Classification of the interictal state with hypsarrhythmia from Zika Virus Congenital Syndrome and of the ictal state from epilepsy in childhood without hypsarrhythmia in EEGs using entropy measures. Computers in Biology and Medicine. 2020 Nov 1; 126: 104014.
23. Al-Jawahiri, Jones, Milne. Spontaneous neural activity relates to psychiatric traits in 16p11.2 CNV carriers: An analysis of EEG spectral power and multiscale entropy. Journal of Psychiatric Research. 2021 Apr 1; 136: 610-8.
24. Zarei, Asl. Automatic seizure detection using orthogonal matching pursuit, discrete wavelet transform, and entropy based features of EEG signals. Computers in Biology and Medicine. 2021 Apr 1; 131: 104250.
25. Chakraborty, Mitra. A novel automated seizure detection system from EMD-MSPCA denoised EEG: Refined composite multiscale sample, fuzzy and permutation entropies based scheme. Biomedical Signal Processing and Control. 2021 May 1; 67: 102514.
26. Maszczyk, Duch. Comparison of Shannon, Renyi and Tsallis entropy used in decision trees. In Artificial Intelligence and Soft Computing – ICAISC 2008: 9th International Conference Zakopane, Poland, June 22–26, 2008 Proceedings 9 2008 (pp. 643-651). Springer Berlin Heidelberg.
27. Yin, Sun. Multiscale permutation Rényi entropy and its application for EEG signals. PLoS One. 2018 Sep 4; 13(9): e0202558.
28. Ghosh, Basu. A scale-invariant generalization of the Rényi entropy, associated divergences and their optimizations under Tsallis' nonextensive framework. IEEE Transactions on Information Theory. 2021 Jan 27; 67(4): 2141-61.
29. Kraskov, Stögbauer, Grassberger. Estimating mutual information. Physical Review E – Statistical, Nonlinear, and Soft Matter Physics. 2004 Jun; 69(6): 066138.
30. Zanin, Zunino, Rosso, Papo. Permutation entropy and its main biomedical and econophysics applications: a review. Entropy. 2012 Aug 23; 14(8): 1553-77.
31. Righero. A cooperation index based on the Rényi entropy of correlation matrix spectrum. Communications in Nonlinear Science and Numerical Simulation. 2012 Jul 1; 17(7): 2960-8.
32. Zhang, Jia, Ding, Thakor. Application of Tsallis entropy to EEG: quantifying the presence of burst suppression after asphyxial cardiac arrest in rats. IEEE Transactions on Biomedical Engineering. 2009 Aug 18; 57(4): 867-74.
33. Shayanfar, Gharehchopogh. Farmland fertility: A new metaheuristic algorithm for solving continuous optimization problems. Applied Soft Computing. 2018 Oct 1; 71: 728-46.
34. Bandt, Pompe. Permutation entropy: a natural complexity measure for time series. Physical Review Letters. 2002 Apr 11; 88(17): 174102.
35. Anila Glory, Vigneswaran, Shankar Sriram. Identification of suitable basis wavelet function for epileptic seizure detection using EEG signals. First International Conference on Sustainable Technologies for Computational Intelligence: Proceedings of ICTSCI 2019; 2020 (pp. 607-621). Springer Singapore.
36. Andrzejak, Schindler, Rummel. Nonrandomness, nonlinear dependence, and nonstationarity of electroencephalographic recordings from epilepsy patients. Physical Review E – Statistical, Nonlinear, and Soft Matter Physics. 2012 Oct; 86(4): 046206.
37. Anila Glory, Vigneswaran, Shankar Sriram. Unsupervised bin-wise pre-training: A fusion of information theory and hypergraph. Knowledge-Based Systems. 2020 May 11; 195: 105650.
38. Andrzejak, Lehnertz, Mormann, Rieke, David, Elger. Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Physical Review E. 2001 Nov 20; 64(6): 061907.
39. Swami, Panigrahi, Nara, Bhatia, Gandhi. EEG epilepsy datasets. Neurology & Sleep Centre, Hauz Khas, New Delhi. 2016; 10.
