Abstract
Numerous advanced data processing and machine learning techniques for identifying epileptic seizures have been developed in the last two decades. Nonetheless, many of these solutions need massive data sets and intricate computations. Our approach transforms electroencephalogram (EEG) data into the time-frequency domain using a short-time Fourier transform (STFT) and uses the resulting spectrogram (t-f) images as the input stage of the deep learning model. Using EEG data, we have constructed a hybrid model comprising a deep convolutional network (ResNet50) and a Long Short-Term Memory (LSTM) network for predicting epileptic seizures. Spectrogram images are used to train the proposed hybrid model for feature extraction and classification. We analyzed the CHB-MIT scalp EEG dataset. Experiments were conducted for preictal periods of 5, 15, and 30 minutes to evaluate the performance of the proposed model. The experimental results indicate that the proposed model performed best with a 5-minute preictal duration. We achieved an average accuracy of 94.5%, an average sensitivity of 93.7%, an F1-score of 0.9376, and an average false positive rate (FPR) of 0.055. Our proposed technique surpassed the random predictor and other current seizure prediction algorithms for all patients’ data in the dataset. The proposed model could help in the early diagnosis of epilepsy and support early treatment.
Introduction
Among the numerous diseases of the central nervous system (CNS), epilepsy is one of the most prevalent chronic neurological disorders. According to a WHO report [20], epilepsy is caused by unexpected seizures, neuronal injury, or sudden neuronal failure [20,45], and it affects 1% of the world’s population. The variation in the electrical activity of the brain that causes epilepsy is observed with EEG testing, and EEG signals are used to diagnose any activity related to an epileptic seizure. The occurrence of the first variations in brain activity is called the preictal state. Determining the preictal state is critical because it provides essential information for seizure prediction. A robust seizure prediction system could help control epilepsy [45]: it could take the required precautions before the seizure starts and help avert it. However, determining the preictal stage is difficult and time-consuming. Hence, various research works have been carried out to build an adaptive system for detecting epileptic seizures. Epilepsy affects around 50 million people worldwide. It is a chronic disease of the brain characterized by a recurrent pattern of seizures [38]. Seizures are caused by aberrant electrical signals generated by the cerebral cortex, which interfere with motor, sensory, cognitive, and behavioral functioning. Approximately 30% of patients have incurable epilepsy, meaning their seizures are uncontrollable despite the use of anti-epileptic drugs (AEDs) [37]. An electroencephalogram (EEG), which records the electrical signals of the brain, is used to diagnose and analyze seizures. Depending on where the electrodes are placed, EEG signals can be divided into intracranial EEG and scalp EEG. Intracranial EEG records the electrical activity of the brain from within the skull.
For intracranial EEG, electrodes are placed directly on the exposed cerebral cortex during surgery, whereas scalp EEG electrodes detect time-series signals from the scalp. Although intracranial EEG records signals with a high signal-to-noise ratio, its electrodes must penetrate the skull; scalp EEG is therefore the practical measuring method for frequent patient monitoring and seizure alerts. It has a broader range of applications and is more convenient to use. Based on a patient’s EEG record, epileptic brain activity can be separated into four stages. The first stage is the ictal period, in which a seizure occurs. The second stage is the preictal state, which occurs before the beginning of a seizure. The third stage is the postictal state, which occurs after the seizure has ended. The final stage is the interictal period, which lies between seizures, after the postictal stage and before the next preictal stage [48]. These four stages are depicted in Fig. 1.

Interictal, preictal, ictal, and postictal components of epileptic brain states [48].

Raising an alarm before epileptic seizure onset.
Seizure prediction methods follow one of two distinct strategies. In the first strategy, a preictal duration is chosen at the outset so that EEG signals can be divided into preictal and interictal stages, and a binary classifier differentiates between the two sets. Postictal and ictal segments are treated as independent states that cannot help in seizure prediction, and they are therefore eliminated from the study. A seizure prediction system must predict an impending epileptic seizure by generating an alarm prior to the start of the seizure. Since it is impossible to predict precisely when a seizure will begin, this uncertainty must be accounted for. The seizure occurrence period (SOP) is defined as the period during which the seizure is expected to occur. In addition, treatment systems necessitate a minimum interval between the activation of an alarm and the beginning of the SOP, known as the seizure prediction horizon (SPH). As demonstrated in Fig. 2, the SPH is determined by the preictal time interval. Recently, the use of machine learning has enhanced the prediction of epileptic seizures and helped address the complexity of EEG signals [46]. Machine learning tools make it easier to distinguish and evaluate seizure features. Classical machine learning techniques were the methods most often utilized for automatically recognizing epileptic seizures [5]. The effectiveness of these methods was highly dependent on strategies for extracting hand-engineered features, and choosing appropriate features from EEG signals poses a significant challenge [29]. The convolutional neural network (CNN) is a supervised learning technique used to distinguish between preictal and interictal stages. The trained classifier detects the preictal period in new EEG data and predicts the probability of seizures. The second strategy uses threshold-based techniques. Here, the focus is on identifying increasing or decreasing trends in the values of particular characteristics during the preictal stage.
The value of the examined segment is checked against a threshold value, and a seizure alarm is raised if the threshold is exceeded [1]. Studies report that a continuous rise or reduction in EEG channel phase-locking and synchronization during the preictal stage is predictive of seizures [6,24,32]. Over the past few years, classical machine-learning methodologies have shown great promise and had a significant impact on the field [26]. There are two stages in the proposed technique. In the first stage, the raw EEG signal is transformed into image files. In the second stage, seizures are predicted by using the EEG images to learn the difference between interictal and preictal phases [19]. In this research, we propose a new method for predicting epileptic seizures. The remaining sections of this work are organized as follows: Section 2 discusses earlier research on seizure prediction. Section 3 explains the scalp EEG dataset, the preprocessing procedure, and the proposed model. Section 4 evaluates performance for different preictal and interictal lengths and compares it with earlier research. Finally, Section 5 concludes the work with a discussion of its results.
Researchers have been actively working on seizure prediction for the past few years. Seizure prediction must cope with imbalanced data between the interictal and preictal periods. Deep learning algorithms such as CNN [3,7] have recently been studied in detail as a way to examine EEG data with the kind of careful observation that doctors apply. The authors of [31] proposed a method that takes raw intracranial EEG signals, transforms them into univariate spectral power, classifies them using SVM, and eliminates occasional and inaccurate data. Recent interest in deep learning has led to new EEG signal evaluation applications [2,25]. Various deep learning applications assess these EEG signals as 1-D signals or 2-D images. EEG signals are categorized as normal or abnormal [49]. After feature extraction, 2-D EEG spectrogram images are categorized to evaluate speech patterns [42,47].
The authors of [33] proposed a method for determining phase-locking values of scalp EEG data. The scalp EEG data is divided into short interictal or preictal segments by combining multivariate wavelet transform decomposition with an SVM classifier. The SVM classifier was evaluated on sixty-five seizures [10] from the CHB-MIT dataset. The authors of [45] differentiated preictal and interictal states using information collected from graph theory and from the time and frequency domains. The authors of [39] used the wavelet transform to separate EEG signals into sub-bands, after which features were extracted using PCA, LDA, and IDA [23]. Various signal processing techniques can be used to extract features from EEG data [21,40].
CNN is the most prominent deep learning algorithm used in seizure prediction research; it typically takes image data as input. The authors of [43] divided the raw EEG data into windows of 30 seconds and then used STFT to extract spectrogram images, which were fed into a CNN. On the CHB-MIT dataset, this technique was applied to 64 seizures in 13 patients and yielded a sensitivity of 81.2% and an FPR of 0.16. In another approach, the Continuous Wavelet Transform (CWT) is used to transform signals from various EEG bands into the time-frequency domain. The transformed data is then applied as input to a CNN, which learns the distinction between interictal and preictal states to predict seizures, achieving an average FPR of 0.142 with three unpredicted seizures. The authors of [30] predicted seizures using a multi-frame 3D CNN model. The model utilized preprocessed features comprising power spectrum bands, Hjorth parameters, and statistical moments, and attained a sensitivity of 85.71% and an FPR of 0.096.
Traditional classification techniques have been widely used to distinguish preictal EEG segments. Spectral power analysis was applied to these EEG segments to extract initial characteristics for different classical machine learning models. The characteristics were then loaded into an SVM classifier to differentiate preictal from interictal states. The authors of [27] employed a CNN model with six convolutional layers to extract features, which were used to divide EEG segments into three groups: ictal, preictal, and interictal states. The CNN is fed the wavelet transform of every EEG channel, calculated at various scales, as input. The CHB-MIT database is used for the evaluation. The researchers examined 15 participants with eight seizures and 50 interictal recordings, resulting in an average FPR of 0.142 and three unpredicted seizures. CNN has been the most influential deep learning technique in seizure prediction [44]. The wavelet transform collected the spectral information from 30-second signal segments. The average seizure sensitivity and FPR were 81.2% and 0.16 false predictions per hour, respectively. The dense convolutional network (DenseNet) [19] addresses problems such as vanishing gradients and parameter growth that appear as CNNs get deeper. DenseNet surpasses a plain CNN when learning from a limited amount of EEG data. Furthermore, the LSTM [34] model addresses the long-term dependence problem of the Recurrent Neural Network (RNN). It is designed to model time-series data, which makes it well suited to detecting temporal features of the EEG.
In addition, a deep multi-layer CNN was used to analyze the application of spatial filters for extracting motor imagery tasks from linearly combined EEG channels, which improves the power spectrum band features of the EEG signal [41]. LSTM networks have been used in various research areas, including the automated detection of epileptic seizures in scalp EEG data [16]. In a recurrent convolutional neural network, a 4-layer CNN receives the spectral information of each EEG channel as input, and an LSTM model then performs the final classification of EEG segments.
Materials and methods
Scalp EEG dataset
We used the open-source Children’s Hospital Boston and Massachusetts Institute of Technology (CHB-MIT) scalp EEG dataset [9,15], which is publicly available [12]. At Children’s Hospital Boston, EEG signals were gathered from 23 children with intractable seizures. The recordings were made at a sampling rate of 256 Hz with 16-bit resolution using the international 10-20 electrode placement system. The electrodes are fixed directly on the scalp to locate the source of the seizure event. The total length of the EEG recordings provided is nearly 983 hours. After visual evaluation of the EEG epochs containing ictal activity, clinical personnel manually recorded the seizure onset and offset times, resulting in a total of 198 validated seizures. There are two interictal cases because the time interval between ictal stages varies from patient to patient. In addition to the preictal length, there is a 5-min delay preceding the ictal period. The duration before the 5-min interval is removed from the preictal size because our model is trained after a 5-min interval. A specific length of time (e.g., 5, 15, or 30 minutes) is needed to ensure that the patient can control the seizure; these lengths are illustrated in Fig. 3. This time frame was chosen because it is essential to allow the doctor to intervene before the seizure begins in the patient [35].

Preictal length [35].
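The relationship between seizure onset, the 5-minute delay, and the preictal length can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `preictal_window`, the parameter names, and the example onset time are hypothetical, and the sketch assumes the preictal window ends where the 5-minute delay begins.

```python
def preictal_window(onset_s, preictal_min, gap_min=5):
    """Return (start_s, end_s) of the preictal window preceding a seizure onset.

    Assumption: the window ends gap_min minutes before onset and spans
    preictal_min minutes; times are seconds from the start of the recording.
    """
    end_s = onset_s - gap_min * 60
    start_s = end_s - preictal_min * 60
    # Clip at the start of the recording so the bounds never go negative.
    return max(start_s, 0), max(end_s, 0)


# Hypothetical seizure onset one hour into a recording.
onset_s = 3600
windows = {m: preictal_window(onset_s, m) for m in (5, 15, 30)}
for minutes, (start_s, end_s) in windows.items():
    print(f"{minutes:>2}-min preictal window: {start_s} s to {end_s} s")
```

Under these assumptions, all three windows end at the same point before onset and differ only in how far back they reach.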
For seizure prediction to work, the scalp EEG data of epileptic patients must be free of artifacts and noise, which requires explicit filtering. Therefore, the available multichannel EEG signals are filtered through a Butterworth bandpass filter [8] with a lower cut-off frequency of 0.1 Hz and an upper cut-off frequency of 127 Hz. Bandpass filtering is generally used to obtain a useful frequency range from 0.1 to 60 Hz for studying biomedical signals. The multichannel raw EEG signal is preprocessed to eliminate distortions before being divided into time-interval-based segments. The STFT is used to transform the EEG signals into a two-dimensional matrix because the pre-trained CNN model requires input in an image format. The EEG segments of each patient’s data are transformed into spectrogram images, as shown in Fig. 4.

STFT to transform raw EEG signals into time-frequency images.

EEG signal spectrograms: (a) preictal, (b) interictal.
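The STFT step described above can be illustrated with a short NumPy sketch. It is not the authors' code: the window length of 256 samples, the hop of 128 samples, the Hann window, and the synthetic 10 Hz test signal are all assumptions chosen for illustration, and the bandpass filtering is assumed to have been applied beforehand. Only the 256 Hz sampling rate comes from the dataset description.

```python
import numpy as np


def stft_spectrogram(x, fs=256, win_len=256, hop=128):
    """Log-magnitude STFT of a 1-D signal.

    Returns (freqs, times, spec_db), where spec_db has shape
    (frequency bins, time frames).
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [x[i * hop:i * hop + win_len] * window for i in range(n_frames)]
    )
    spec = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + win_len / 2) / fs
    # Small offset avoids log(0) for empty bins.
    return freqs, times, 20 * np.log10(spec + 1e-12)


# Synthetic stand-in for one EEG channel: 30 s of a 10 Hz sine at 256 Hz.
fs = 256
t = np.arange(30 * fs) / fs
signal = np.sin(2 * np.pi * 10 * t)

freqs, times, spec_db = stft_spectrogram(signal, fs=fs)
peak_hz = freqs[spec_db.mean(axis=1).argmax()]
print(spec_db.shape, peak_hz)
```

The resulting two-dimensional array has frequency along one axis and time along the other, which is the form that can be rendered as a spectrogram image for a CNN.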
In numerous signal processing applications, such as the automatic analysis of EEG signals, the signal is subdivided into smaller sections, with model parameters such as amplitude and frequency for each region. This is performed by moving window analysis [18]. Identifying preictal and interictal EEG segments improves the discriminative power of classification algorithms. The number of preictal and interictal EEG segments provided as input ranged from 5 to 50 components during training. Firstly, we generate a sliding window with the size
Spectrogram
Due to the low signal-to-noise ratio (SNR), we transform the continuous EEG data into the frequency domain. Wavelet and Fourier transforms are applied to transform EEG segments into spectral images. Here, a short-time Fourier transform (STFT) converts EEG time-domain data into two-dimensional matrices and is used to obtain pertinent data for seizure prediction. Feature extraction makes dimensionality reduction and the development of higher-order feature spaces possible. The STFT module maps the time-varying EEG signal to a two-dimensional matrix with frequency and time axes. As a result, the two-dimensional map can provide information about the frequency content of each time window. Assume there are P domains (patients) in total. The input data of P domains are represented by
Here, we have
First, the STFT is applied in a sliding window to obtain the information flow properties of the scalp EEG signals. Second, these characteristics are reconstructed as channel-frequency feature maps that are then given as inputs to a ResNet-50 model. Lastly, the LSTM model is employed for classification to accurately detect epileptic seizures. Fig. 6 presents a block diagram of the model we constructed.

The proposed seizure prediction methodology.
In medical applications, CNN models have been shown to perform well in classification and detection [7,26]. A CNN model requires a robust training procedure to recognize a wide range of patterns. The fundamental convolution equation is stated in equation (2).
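As an illustration of the convolution operation that a CNN layer builds on, the following sketch implements the textbook discrete 2-D convolution with NumPy. It is not a reproduction of the paper's equation (2), which is not available in this text; the function name `conv2d_valid` and the toy image and kernel are hypothetical. Note that most deep learning frameworks actually compute cross-correlation, that is, the same sum without flipping the kernel.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Discrete 2-D convolution (kernel flipped) with 'valid' output size."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]  # flipping distinguishes convolution from correlation
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the patch under the kernel.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out


# Toy 4x4 "image" and a 2x2 diagonal-difference kernel.
image = np.arange(16).reshape(4, 4)
kernel = np.array([[1, 0], [0, -1]])
result = conv2d_valid(image, kernel)
print(result)
```

For this toy input every output element equals 5, because each diagonal neighbour in the image differs by exactly 5.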
Many medical applications use ResNet-50 [36] as the backbone of the deep neural network. ResNet-50 takes an input image of

Bottleneck residual-identity block.
The LSTM unit is a unique method to predict sequential data, like text or time-series data output. LSTM is a modification of recurrent neural networks (RNN). However, to address potential long-term dependencies [11], several LSTM models were tested and evaluated. After doing so, it was revealed that the traditional LSTM model with the forget gate performed the best across a wide range of tasks [17]. As shown in Fig. 7, LSTM follows a step-by-step process. The input gate (

Structure of LSTM cell.
The ResNet-LSTM model is formed by merging the ResNet50 and LSTM networks. In the hybrid model, ResNet50 extracts complicated characteristics from images, and LSTM performs the role of a classifier. The proposed network for epilepsy detection is depicted in Fig. 8. The hidden layers of the ResNet-50 model are forwarded to the LSTM unit, as shown in Fig. 9. The size of the LSTM unit needs to be the same as the size of the sequences being fed into it. With the help of global average pooling, GAP can reduce the number of hidden layers to

Hybrid ResNet50-LSTM model for epileptic seizure prediction.
In the final section of the architecture, the feature map is passed to the LSTM layer, which extracts temporal information. The output shape after the convolutional block is (None, 7, 7, 2048). The input to the LSTM layer is reduced to (1, 128) using a reshape operation. After reshaping, dense layers with 64, 32, 16, and 2 units are applied. After analyzing the temporal features, the architecture passes them through a fully connected layer to assign one of the groups (preictal or interictal). Finally, the sigmoid function is used to classify the features generated by the LSTM into preictal and interictal stages for preictal lengths of 5, 15, and 30 minutes. Table 1 summarizes the detailed structure of the hybrid model.
Architecture of ResNet50-LSTM model
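The shape flow from the convolutional feature map to the LSTM can be traced with a small NumPy sketch. This is an illustration under assumptions rather than the authors' network: the random feature map stands in for the ResNet50 output, the linear projection from 2048 to 128 values is hypothetical (the text does not state how the (1, 128) shape is obtained), the hidden size of 64 is an assumed value, and all weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell with input, forget, and output gates."""
    hidden = h_prev.shape[0]
    z = W @ x + U @ h_prev + b               # stacked pre-activations, shape (4*hidden,)
    i = sigmoid(z[0:hidden])                 # input gate
    f = sigmoid(z[hidden:2 * hidden])        # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])    # output gate
    g = np.tanh(z[3 * hidden:])              # candidate cell state
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c


# Stand-in for the (7, 7, 2048) output of the convolutional block.
feature_map = rng.standard_normal((7, 7, 2048))
pooled = feature_map.mean(axis=(0, 1))            # global average pooling -> (2048,)

# Hypothetical projection so that the LSTM input has shape (1, 128).
proj = rng.standard_normal((128, 2048)) * 0.01
sequence = (proj @ pooled).reshape(1, 128)        # (timesteps, features)

hidden = 64                                       # assumed hidden size
W = rng.standard_normal((4 * hidden, 128)) * 0.1
U = rng.standard_normal((4 * hidden, hidden)) * 0.1
b = np.zeros(4 * hidden)

h = np.zeros(hidden)
c = np.zeros(hidden)
for x_t in sequence:                              # a single timestep in this sketch
    h, c = lstm_step(x_t, h, c, W, U, b)

print(pooled.shape, sequence.shape, h.shape)
```

In a trained model the final hidden state would then feed the dense layers and the sigmoid output; those layers are omitted here to keep the sketch short.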
Performance evaluation
The following statistics are used to evaluate categorization performance: true positive, true negative, false positive, and false negative are defined as
Experimental setup
The dataset for each patient is separated into three subsets: training, validation, and testing, which comprise 60%, 20%, and 20% of the total, respectively. Other trainable parameters are likewise randomly initialized. The ReLU function is used wherever an activation is required. The Adam optimizer [5] is used for training, with an initial learning rate of 0.001 and
We select preictal window sizes of 5, 15, and 30 minutes of EEG data, present the experimental observations, and compare them with existing techniques. The model trained with a 5-minute preictal size gives a higher sensitivity than the models trained with preictal sizes of 15 and 30 minutes. The model assuming a 5-minute interval learned the preictal interval better than the other models, which indicates that the characteristic preictal frequency content lies between 0 and 5 minutes. This suggests that the models trained with 15- and 30-minute durations classified part of the interval as interictal. Accordingly, the average sensitivity of the model is highest when the preictal size is 5 minutes. For patient 4, however, the sensitivity with a 5-minute preictal size is lower than with 15 and 30 minutes, although the specificity is reasonable. In the case of patient 23, the sensitivity for preictal lengths of 15 and 30 minutes is lower than for 5 minutes. Overall, the characteristics of the 5-minute preictal size seem the most prominent.
Similarly, the total average performance is better when the preictal period is estimated to be 5 minutes than the performance measured during the preictal periods of 15 and 30 minutes. When the preictal period is 5 minutes, the averages for
Preictal intervals of 5, 15, and 30 minutes predicted seizures in 24 CHB-MIT scalp EEG patients

(a) Accuracy, (b) sensitivity, (c) specificity, (d) FPR, and (e) F1-score for 5-, 15-, and 30-minute preictal lengths.
According to the results, the proposed seizure prediction algorithm correctly predicts all 185 seizures (events) in the CHB-MIT database. Table 2 shows that the proposed ResNet-LSTM model obtained an average FPR of 0.055-0.058, with false alarms increasing as the preictal window grows from 5 to 30 minutes before a seizure. Furthermore, the ResNet-LSTM classifier gives zero false alarms for cases 7, 8, 9, 11, 12, 13, 17, 19, and 20 of the 24 cases in the database. Fig. 10 shows the accuracy, sensitivity, specificity, FPR, and F1-score of our ResNet-LSTM model as bar charts. The 5-minute preictal length gives the best accuracy, sensitivity, and F1-score, whereas the FPR is among the highest for the 15-minute preictal length. The 30-minute preictal length gives the lowest performance of the three.
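The metrics discussed above can be computed from confusion-matrix counts as in the sketch below. The formulas are the standard textbook definitions and are assumed here, since the paper's own definitions are not reproduced in this text; in particular, FPR is taken as FP / (FP + TN), whereas some seizure prediction studies report false alarms per hour instead. The function name and the example counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)          # recall on the preictal class
    specificity = tn / (tn + fp)
    fpr = fp / (fp + tn)                  # equals 1 - specificity
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fpr": fpr,
        "f1": f1,
    }


# Made-up counts for illustration only.
m = classification_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```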
We compared our proposed algorithm with existing algorithms to validate its performance. The results of the proposed technique and of other classification methods presented in the literature are given in Table 3. This comparison focuses on research that was also evaluated using the CHB-MIT EEG database, which is now the only publicly available database. The proposed ResNet-LSTM model predicts seizures better than the previous techniques tested on the same EEG dataset. Most of the other studies used only a fraction of the available recordings, whereas here the entire database of 24 patients is used [36]. Fig. 11 compares the sensitivity and FPR of our proposed technique with those of previous studies using bar charts.
A 5-minute preictal time is optimal for epileptic seizure prediction on the CHB-MIT scalp EEG dataset

Comparison of the proposed method with previous studies on (a) sensitivity and (b) false positive rate (fpr).
In this study, a hybrid deep learning method called the ResNet50-LSTM model is proposed. This model achieved better sensitivity and a lower FPR. The primary purpose of epileptic seizure prediction is to classify EEG data as interictal or preictal. ResNet-50 improves on the performance of the existing CNN model. The technique is suitable for large datasets, on which the model can be trained well, so its performance in terms of the evaluation metrics continues to improve. The LSTM was used as the classifier in this study, and it proved effective at distinguishing preictal from interictal EEG signals. The proposed technique produced better seizure prediction results and improved the false alarm rate. It gives a prediction
Acknowledgements
This research is financially supported by the Department of Science and Technology (DST) and the Council of Scientific & Industrial Research (CSIR) (File No: 09/263(1097)/2016-EMR-I).
Declarations
Both the authors have given their consent on the following points:
