Abstract
Brain-computer interface (BCI) is an emerging paradigm for communication between the human brain and external devices. Because raw electroencephalograph (EEG) signals have a low signal-to-noise ratio, feature extraction and feature selection are difficult, and high classification accuracy is hard to obtain. To address these problems, this paper proposes a pattern recognition method that combines sample entropy with a batch-normalized convolutional neural network. Sample entropy is used to extract features from EEG data that have been pre-processed by wavelet transform and independent component analysis, and the extracted features are then fed into the convolutional neural network to recognize the EEG signal. Comparison of the experimental results shows that the proposed method achieves a high recognition rate.
Introduction
With the development of artificial intelligence (AI), how to endow machines with human-like perception, thinking ability, and behavioral functions has become a hot research topic. Affective computing plays a crucial role in realizing intelligent human-computer interaction, and emotion recognition, as a branch of affective computing, is the basis and core of emotional human-computer interaction [1]. Compared with non-physiological signals such as speech and facial expressions, electroencephalograph (EEG) signals reflect a person's inner emotional state without subjective control, and they have the advantages of being hard to fake, varying in real time, and being easy to acquire.
EEG signal-based emotion recognition has potential applications in entertainment, education, criminal investigation, and medical rehabilitation [2]. For example, in the aerospace field, monitoring the emotional state of astronauts makes it possible to track their physiological state in real time and help them complete space missions [3, 4]. In clinical care, observing the emotional state of patients allows different care measures to be taken to improve the quality of care.
Currently, many researchers have investigated EEG signals. Pandey et al. proposed a subject-independent emotion recognition technique based on EEG signals [5]. It extracted features using variational mode decomposition and used a deep neural network as the classifier, and the experimental results showed that the method improved classification accuracy by about 6.4%. Kwon et al. used wavelet transforms to obtain a two-dimensional time-frequency map of the EEG signals and applied an adaptive convolutional neural network model, achieving average recognition rates of 76.56%, 80.46%, and 73.43% on the valence classification, the arousal classification, and the four-class valence-arousal classification of the DEAP dataset, respectively [6]. Li et al. transformed the original physiological signals of each channel into spectrograms to obtain temporal and frequency features, used a multi-modal attention-based BiLSTM to automatically learn the best temporal features from them, and fed these into a deep neural network to predict the emotional output probability of each channel with high accuracy [7]. Rabby et al. divided the EEG signal into multiple non-overlapping segments and extracted multiple time-domain, frequency-domain, and nonlinear dynamics features from each segment [8]. These features were concatenated into feature sequences and used to train a long short-term memory (LSTM) classification model, which obtained classification accuracies of 73.87%, 73.50%, and 72.80%. Manish et al. proposed a two-layer bidirectional gated recurrent unit model with an attention mechanism, which extracted salient features of EEG signals by assigning different weights to local and global EEG features. It achieved recognition rates of 67.9% and 66.5% in binary valence and arousal classification, respectively, which were 4.2% and 4.6% higher than the traditional LSTM model [9]. Kevinvric et al. decomposed and reconstructed the EEG signal in the time-frequency domain using the wavelet packet transform, fed the instantaneous power signal into a deep belief network (DBN) for unsupervised pre-training, and then fine-tuned it by supervised training to achieve automatic feature extraction and pattern classification with a softmax classifier [3].
Although there are numerous methods for EEG emotion recognition, two important problems still need to be studied in depth. First, because the EEG signal is a non-stationary and non-linear random signal, time-frequency domain feature extraction can improve the recognition rate and shorten the recognition time to a certain extent, but it still cannot meet the stability requirements of a control system. Second, how to construct a more effective model for deep feature learning and emotion classification remains an open question.
In this paper, we study EEG signals of motor imagery. First, we use the sample entropy algorithm to extract feature values from the signal after noise reduction and reconstruction, and we select the sample entropy of each channel during the motor imagery period as the feature values for building the EEG emotion classification model. The feature vector is then used for pattern recognition with a deep learning classification algorithm, which improves the recognition rate of EEG signal classification and lays the foundation for the future control of external devices.
EEG signal acquisition and pre-processing
The EEG signals were acquired with a 32-channel EEG acquisition device at a sampling frequency of 51 Hz, with electrodes placed according to the international 10–20 system. The 10–20 system is an internationally accepted method for placing electrodes on the scalp, designed around the relationship between the electrode positions and the underlying areas of the cerebral cortex. Before acquisition, the parameters of the acquisition equipment are set and the subject puts on an EEG cap. The tester then clicks the start-acquisition button on the signal acquisition interface to enter the acquisition state, and clicks the start button on the stimulation signal generation interface to present the evoked signal and record the subject’s EEG signal. The original signal is shown in Fig. 1.
The original signal of EEG.
After the EEG signal is acquired, it must be pre-processed to improve its signal-to-noise ratio, because the signal has a very weak amplitude, large background noise, and strong randomness, and is susceptible to external interference. In this paper, wavelet transform and independent component analysis are combined to pre-process the acquired EEG signals. First, the noisy EEG is decomposed using the multi-resolution property of the wavelet transform [8, 9]. A three-level wavelet decomposition is used to obtain sub-band signals in different frequency bands, as shown in Fig. 2.
3-layer wavelet decomposition tree diagram.
In Fig. 2, A denotes the low-pass approximation (low-frequency) component, D denotes the detail (high-frequency) component at each scale, and the trailing number is the decomposition level. Taking imagined left-hand movement as an example, db4 is selected as the wavelet basis function for a three-level wavelet decomposition, as shown in Fig. 3.
The wavelet 3-layer decomposition diagram of imagining the left-hand motion C4 channel.
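The tree structure of a three-level decomposition can be illustrated with a minimal sketch. For self-containment it uses the Haar wavelet rather than the db4 basis used in this paper, so the coefficient values would differ from those in Fig. 3; only the A3/D3/D2/D1 layout is the same. The function names and the synthetic input are assumptions for illustration.

```python
import numpy as np

def haar_step(x):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:
        raise ValueError("signal length must be even at every level")
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_decompose(x, levels=3):
    """Return [A_levels, D_levels, ..., D1], following the A/D naming of the tree."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)          # collected as D1, D2, D3
    return [approx] + details[::-1]     # reordered to A3, D3, D2, D1

def haar_reconstruct(coeffs):
    """Invert haar_decompose by merging the coarsest level first."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        merged = np.empty(2 * len(approx))
        merged[0::2] = (approx + detail) / np.sqrt(2.0)
        merged[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = merged
    return approx

rng = np.random.default_rng(0)
signal = rng.standard_normal(256)       # synthetic stand-in for one EEG channel
a3, d3, d2, d1 = haar_decompose(signal, levels=3)
restored = haar_reconstruct([a3, d3, d2, d1])
```

Each level halves the length of the approximation, so a 256-sample input yields A3 and D3 with 32 samples, D2 with 64, and D1 with 128, and the original signal is recovered from these four sub-bands.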
Then, the sub-band signals of each EEG channel at the wavelet scales are combined as needed to form the input of independent component analysis, and the signals of each EEG frequency band are separated using the blind source separation property of independent component analysis [10]. The advantage of this method is that the spike and slow-wave components of the EEG signal can be enhanced while artifacts are removed, and then separated and extracted by independent component analysis without involving other parameters or options, which is concise and effective. Moreover, no signal loss occurs if the signal is decomposed with a continuous wavelet transform. In this paper, the continuous wavelet transform soft-threshold method is applied to the original EEG signal of the C4 channel during imagined left-hand motion, which is shown in Fig. 4.
Original EEG of the C4 channel during imagined left-hand motion.
The principle of the algorithm is as follows: assuming that there are
where
C4 channel reconstructed EEG after soft threshold noise reduction of imagining left-hand motion.
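Soft thresholding is a simple shrinkage rule applied to wavelet coefficients. The sketch below shows it on a handful of made-up coefficients; the function name and the threshold value 2.0 are assumptions for illustration, not the settings used in this paper.

```python
import numpy as np

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`; smaller ones become zero."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)

detail = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])   # made-up detail coefficients
shrunk = soft_threshold(detail, threshold=2.0)
```

Coefficients whose magnitude is below the threshold are set to zero, which removes small noise-like fluctuations, while larger coefficients are kept but reduced in magnitude by the threshold.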
The EEG signal is a non-stationary and non-linear random signal, and sample entropy is a non-linear measure that is sensitive to small fluctuations of the signal, so it can be used to measure the complexity of the signal time series. Therefore, this paper selects the sample entropy method to extract the feature values of EEG signals [11, 12].
Sample entropy is a statistic that excludes self-matches and was developed from the approximate entropy proposed by Pincus. It retains the advantages of approximate entropy while avoiding its problem of inconsistent statistics [13]. Sample entropy is therefore used to quantify the probability that new patterns appear in the time series. The larger the sample entropy, the greater the probability of new patterns in the time series and the more complex the series is. The detailed procedure for calculating the sample entropy is as follows.
(1) The raw EEG signal sequences were sequentially composed into a
(2) Define the distance between the vector
In Eq. (3),
(3) Given that the similarity capacity
(4) The average of the calculation
(5) Repeat steps (1) to (4) to transform the
(6) The sample entropy of the sequence is defined as follows.
(7) In practice, due to the limited length of the sequence, the estimation value of the sample entropy of the sequence with several points
where SampEn is determined by the values of the parameters
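As a rough illustration, the sample entropy computation can be sketched in Python as follows. The sketch uses the commonly used formulation (Chebyshev distance between embedded vectors, self-matches excluded, negative logarithm of the ratio of match counts at dimensions m + 1 and m) and may differ in normalization details from the procedure in this paper. The defaults m = 2 and r = 0.2 times the standard deviation are typical choices assumed here, not parameters reported in this paper, and the function name is hypothetical.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a 1-D series (assumed defaults: m=2, r=0.2*std)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = r_factor * np.std(x)            # similarity tolerance

    def count_matches(dim):
        # Use the same N - m start indices for both dimensions so counts are comparable.
        templates = np.array([x[i:i + dim] for i in range(n - m)])
        # Chebyshev (maximum absolute difference) distance between all template pairs.
        diff = np.abs(templates[:, None, :] - templates[None, :, :])
        dist = diff.max(axis=2)
        # The diagonal always matches itself; subtract it to exclude self-matches.
        return np.sum(dist <= r) - len(templates)

    b = count_matches(m)                # matches at dimension m
    a = count_matches(m + 1)            # matches at dimension m + 1
    if a == 0 or b == 0:
        return np.inf                   # undefined for this series length and tolerance
    return -np.log(a / b)

rng = np.random.default_rng(0)
t = np.arange(300) / 100.0
se_sine = sample_entropy(np.sin(2 * np.pi * 5 * t))     # regular signal
se_noise = sample_entropy(rng.standard_normal(300))     # irregular signal
```

A regular signal such as a sine wave yields a lower value than white noise, which matches the interpretation that a larger sample entropy indicates a more complex series.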
The ERD/ERS phenomenon of imagined hand movements is mainly concentrated around 10 Hz and around 20 to 24 Hz. When left-hand movement is imagined, Fig. 6a shows that the amplitude of the C4 (ERD) region near 10 Hz is smaller than that of the C3 (ERS) region. In contrast, Fig. 6b shows that when right-hand movement is imagined, the amplitude of the C3 (ERD) region around 10 Hz is smaller than that of the C4 (ERS) region.
Spectrogram of imagining left-hand motion (a) and right-hand motion (b).
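The comparison of amplitudes around 10 Hz can be expressed as a band-power comparison. The sketch below is a minimal illustration on synthetic sine waves, not on the recorded data; the sampling rate of 128 Hz, the 8 to 12 Hz band edges, the amplitudes, and the function name are all assumptions.

```python
import numpy as np

def band_power(x, fs, f_lo, f_hi):
    """Mean periodogram power of x between f_lo and f_hi (in Hz)."""
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return power[band].mean()

fs = 128.0                                   # assumed sampling rate
t = np.arange(0, 4.0, 1.0 / fs)
c3 = 2.0 * np.sin(2 * np.pi * 10 * t)        # synthetic: stronger 10 Hz rhythm (ERS-like)
c4 = 0.5 * np.sin(2 * np.pi * 10 * t)        # synthetic: weaker 10 Hz rhythm (ERD-like)
ratio = band_power(c4, fs, 8, 12) / band_power(c3, fs, 8, 12)
```

Because power scales with the square of amplitude, the weaker channel here has one sixteenth of the band power of the stronger one; a ratio below 1 for C4 relative to C3 corresponds to the pattern described for imagined left-hand movement.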
Pattern recognition is a crucial step in brain-computer interface technology, and the key to controlling external devices lies in the accuracy of the classifier [15, 16, 17]. Human-computer interaction can be realized reliably only if EEG signals are classified accurately. For pattern recognition of EEG signals, a convolutional neural network is used, with batch normalization to ease the parameter search and suppress overfitting.
Although the samples in the input layer are artificially normalized before training, the parameters of each subsequent layer change during training, so the distribution of the intermediate data shifts; batch normalization addresses this shift [18]. During training, batch normalization uses the mean and variance of each mini-batch to map the data to a distribution with a mean of 0 and a variance of 1. The intermediate outputs of the convolutional neural network are thus continuously adjusted so that the output of each layer is more stable [19, 20, 21]. The batch normalization algorithm proceeds as follows.
(1) Calculate the mean
(2) Batch normalization operation for the current batch of input data.
(3) Reconstructive transformation of the batch normalized data by the learnable parameters
where
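The three steps can be sketched as a training-mode forward pass in NumPy. This is a minimal sketch of standard batch normalization; the names `gamma` and `beta` for the learnable scale and shift, the small constant `eps`, and the batch shape are conventional assumptions rather than values taken from this paper.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization over axis 0 (the mini-batch axis)."""
    mean = x.mean(axis=0)                       # step 1: mini-batch mean
    var = x.var(axis=0)                         # step 1: mini-batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)     # step 2: normalize to mean 0, variance 1
    return gamma * x_hat + beta                 # step 3: learnable scale and shift

rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(32, 4))   # 32 samples, 4 features
out = batch_norm_forward(batch, gamma=np.ones(4), beta=np.zeros(4))
```

With the scale set to 1 and the shift set to 0, each feature of the output has a mean of approximately 0 and a standard deviation of approximately 1 regardless of the input distribution; during training the network is free to learn other values for the two parameters.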
Convolutional neural networks are a class of feedforward neural networks that include convolutional computation and have a deep structure [22]. They automatically extract various features of the input signal through multi-layer convolution and pooling and then obtain the expected classification results through the fully connected layers and classifiers deployed at the end. The pattern recognition architecture proposed in this paper is shown in Fig. 7.
Architecture of the proposed pattern recognition.
The convolutional neural network for feature classification consists of four convolutional layers, four max-pooling layers, two fully connected layers, and one softmax layer. The convolutional layers use ReLU as the activation function, a kernel size of 3, and “same” padding, which pads the input so that the output has the same length as the input.
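The effect of “same” padding and pooling on the feature length can be traced with a small NumPy sketch. It assumes one-dimensional convolutions, a single channel, a pooling window of 2, random kernels, and an input length of 64; none of these are stated in this paper, and the fully connected and softmax layers are omitted.

```python
import numpy as np

def conv1d_same(x, kernel):
    """1-D cross-correlation with zero padding so the output length equals len(x)."""
    k = len(kernel)
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    padded = np.pad(x, (pad_left, pad_right))
    return np.array([np.dot(padded[i:i + k], kernel) for i in range(len(x))])

def relu(x):
    """Rectified linear activation."""
    return np.maximum(x, 0.0)

def max_pool1d(x, size=2):
    """Non-overlapping max pooling; samples that do not fill a window are dropped."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

rng = np.random.default_rng(0)
features = rng.standard_normal(64)       # assumed input length
lengths = [len(features)]
for _ in range(4):                       # four conv + max-pool stages
    kernel = rng.standard_normal(3)      # kernel size 3; random weights for illustration
    conv_out = relu(conv1d_same(features, kernel))
    same_length = len(conv_out) == len(features)
    features = max_pool1d(conv_out, size=2)
    lengths.append(len(features))
```

Under these assumptions the convolution never changes the length, and only the pooling layers shorten the feature vector, from 64 to 32, 16, 8, and finally 4 samples.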
In this paper, the sample entropy value of all EEG data within 3
Sample entropy diagram of imagining left-hand motion (a) and right-hand motion (b).
In addition, when foot movement is imagined (Fig. 9), the ERS phenomenon occurs in the C3 and C4 channels, where the EEG complexity is low and the sample entropy is correspondingly low. The ERD phenomenon occurs in the Cz channel, where the EEG complexity increases and the sample entropy is high.
Sample entropy diagram of imagining foot motion.
The classification accuracies of each subject in each period were recorded and averaged to obtain that subject's classification accuracy. The classification accuracies of the 15 subjects were then averaged to obtain the overall average classification accuracy. The classification accuracy of each subject during the experiment is recorded in Table 1.
As can be seen from Table 1, classification accuracy differs across subjects, which reflects individual variability, and for the same subject it also fluctuates across periods. For the learning vector quantization (LVQ) neural network, the average classification accuracy over the imagined movements is 87% (lowest), and the recognition accuracy for left-hand, right-hand, and foot movement is 90%, 83%, and 87%, respectively. For the BP neural network, the average classification accuracy is 92% (middle). For the convolutional neural network, the average classification accuracy is 93% (highest).
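As a quick arithmetic check, assuming the reported LVQ average is the unweighted mean of the three per-class accuracies (equal numbers of trials per class is an assumption):

```latex
\[
\frac{90\% + 83\% + 87\%}{3} = \frac{260\%}{3} \approx 86.7\% \approx 87\%
\]
```

This agrees with the 87% average reported for the LVQ network.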
Classification results of three motor imageries for different neural networks
For comparison, the performance of the proposed method is compared with that of other models. Table 1 shows that the proposed method, which combines sample entropy with a batch-normalized convolutional neural network, performs better. The convolutional neural network outperforms the other two classifiers both for each individual imagined limb movement and overall, the classification of the feature vectors of the three kinds of motor imagery EEG signals improves significantly, and the overall recognition rate reaches 93%. Considering its portability and high recognition rate, the convolutional neural network is used as the pattern recognition classifier for the sample entropy feature vectors to provide control signals for prosthesis control.
Emotion plays a crucial role in human life, and in recent years the field of artificial intelligence has increasingly focused on emotion recognition. Benefiting from the successful application of deep neural networks in feature extraction and classification, various new methods for emotion recognition from EEG signals have emerged. However, EEG signals are diverse and complex, and different brain regions are not involved in emotion to the same degree. These factors greatly increase the difficulty of EEG emotion recognition, and improving the recognition rate of EEG-based emotion classification remains an open challenge.
In this paper, we propose an EEG signal emotion recognition method that combines sample entropy with a batch-normalized convolutional neural network. The EEG signals collected by the acquisition device are denoised using wavelet transform and independent component analysis, and effective EEG features are extracted using the sample entropy method. The extracted features are then recognized by the network, achieving an average classification accuracy of over 93%. The effect of increasing the number of convolutional layers to capture deeper channel features on the accuracy of emotion classification can be studied further. In addition, combining human physiological signals with external carriers of emotion such as facial expressions, speech, and body posture for multimodal emotion recognition is also of great practical importance for further improving the accuracy of emotion recognition.
Footnotes
Funding
This work was supported by the General Program of Chongqing Natural Science Foundation (No. cstc2021jcyj-msxm2774).
