Abstract
The electrocardiogram (ECG) signal is a time-varying signal that is variable, unstable, and noisy, which makes it difficult to analyze. To address this, this paper puts forward a novel 13-layer deep dynamic neural network (DDNN) model for ECG signal learning and classification. The proposed DDNN is a dynamic hybrid deep learning model consisting of a wavelet block, a convolutional block, a recurrent block, and a classification block. It combines the learning and classification mechanism of convolutional neural networks (CNNs) for large-scale data sets, the learning and memory ability of Long Short-Term Memory (LSTM) for time series, and the noise reduction and signal processing ability of wavelet bases, to meet the requirements of learning and classifying ECG signal characteristics. Experimental results show that the proposed model is feasible and effective for ECG signal pattern classification.
Keywords
Introduction
In recent years, the incidence of cardiovascular disease has been rising, seriously threatening human life and health. Ahead of other diseases such as cancer, cardiovascular disease has become the main cause of death for urban and rural residents in China. People with cardiovascular disease usually show arrhythmia at the onset of the disease. Therefore, diagnosing and recognizing arrhythmia in a timely and accurate manner, and giving patients early warning, is a key step in preventing cardiovascular disease. The electrocardiogram (ECG) records, from the surface of the human body, the changes in electrical activity of cardiac muscle cells during the systolic and diastolic phases of each cardiac cycle. It effectively reflects the health of the heart and provides an important basis for the prevention and diagnosis of clinical cardiovascular disease [5, 24].

Fig. 1. An example ECG signal.
To address the pattern classification of ECG signals, which are variable, unstable, and noisy, this paper proposes a Deep Dynamic Neural Network (DDNN) model and a corresponding intelligent analysis method.
Human ECG signals are typical time-varying signals. They are low-frequency, weak, variable, and unstable, they contain a lot of noise, and their distribution is nonuniform [13]. Different patients, different parts of the same patient, and even the same patient at different times may show different ECG signal patterns [11]. Additionally, the ECG has weak resistance to interference and is susceptible to external noise. These factors make the research data complex and diverse. The ECG signal is therefore both a difficult subject and an important representative case in the study of time-varying signals, with considerable theoretical significance and practical value. ECG can also be used to design authentication schemes [5, 28].
Diagnosing arrhythmia from ECG signals is a classic research topic in pattern recognition, and it has been studied extensively. In 2004, Osowski et al. proposed an SVM-EL model based on support vector machines to identify arrhythmia and achieved an accuracy of 95.91% [20]. In 2009, Talbi proposed a recognition algorithm for PVC based on the QRS power spectrum and a self-organizing map, which achieved an accuracy of 86.96% [25]. In 2010, Dong et al. presented an arrhythmia detection method based on a Bayesian model, which attained an accuracy of 90.1% [7]. Then in 2011, Kutlu et al. put forward a KNN-EL model for detecting arrhythmia, with an accuracy of 95.46% [14].
Because ECG signal processing and recognition pose many difficulties, traditional pattern recognition methods cannot meet the requirements of arrhythmia diagnosis. In recent years, the rise of deep learning has provided new methods and ideas for ECG signal processing and recognition. In 2013, Martinez et al. applied deep learning to physiological signal recognition for the first time and achieved an accuracy of around 70% to 75% [17]. In 2017, Acharya et al. put forward a CNN-based ECG classification model with an accuracy of 94.03% on five types of ECG [1]. In 2018, Yildirim put forward an ECG classification model based on a deep LSTM network that reached an accuracy of 99.39% on five types of ECG [32]. In 2019, Saadatnejad et al. proposed an LSTM-based ECG classification method for continuous monitoring on personal wearable devices, which achieved an accuracy of 99.3% [22]. In the same year, Wang et al. put forward a globally updatable ECG beat classification system based on active learning and RNNs, which reached an accuracy of 99.7% on four types of ECG [12]. Niu et al. then proposed an inter-patient ECG classification method based on symbolic representation and a multi-view CNN, which reached an accuracy of 96.4% [19]. In 2020, Xu and Liu proposed an ECG heartbeat classification method based on a CNN model [30], and Shaker et al. proposed a method for generalizing CNN models using generative adversarial networks [23]; these achieved accuracies of 99.43% and 98.3%, respectively.
These methods prove the feasibility and effectiveness of deep learning methods in the electrocardiogram recognition field.
Although many researchers have made great efforts in this field, an intelligent ECG recognition model with higher diagnostic accuracy and wider applicability is still needed. Building on existing research and on the characteristics and difficulties of ECG signals, this paper introduces a novel deep dynamic neural network (DDNN) model. The DDNN is a hybrid model whose main idea is to combine the wavelet transform, the convolutional neural network (CNN) [4, 35], and the Long Short-Term Memory (LSTM) model [9]. The CNN has several advantages in processing time-varying signals [29]. For example, it performs feature extraction and classification simultaneously, so that feature extraction supports classification, and weight sharing reduces the number of trainable parameters, which keeps the network structurally simple and adaptable. The CNN is powerful, but the recurrent neural network (RNN), which has memory, is better suited to learning time series and to problems that are sensitive to temporal order. This paper therefore combines the CNN with the LSTM network: the CNN part captures spatial feature maps, and the RNN part then extracts the temporal dynamics present in these feature maps. LSTM is a well-established variant of the RNN and is itself a dynamic network. The wavelet transform has clear advantages in time-varying signal analysis. Because it analyzes a time-varying signal in the time domain and the frequency domain at the same time, it loses little of the signal's characteristics and filters noise well.
Therefore, to address the characteristics and difficulties of ECG signals, this paper combines a deep convolutional network with LSTM and wavelet-based noise reduction to construct a deep dynamic neural network model for ECG pattern recognition and classification.
The structure of this paper is as follows: In the first section, the research background and significance of this paper are described, and then the relevant work is analyzed. Next, the deep dynamic neural network model for the learning and classification of the ECG signals will be put forward. In the next section, the feasibility of the proposed model will be verified by sufficient experiments. The last section is the conclusion.
To address the characteristics and difficulties of ECG signals, we put forward a novel 13-layer Deep Dynamic Neural Network (DDNN) model for learning and classifying ECG signals.
The proposed DDNN model, shown in Fig. 2, is a deep hybrid dynamic neural network with a wavelet block, a convolutional block, a recurrent block, and a classification block. In a nutshell, it is a multi-layer architecture composed of alternating convolutional and nonlinear layers, ending in a fully connected network that feeds a Softmax classifier.
The DDNN combines the learning and classification mechanism of the convolutional neural network for large-scale data sets, the learning and memory ability of LSTM for time series, and the noise reduction and signal processing ability of wavelet bases, to meet the requirements of learning and classifying ECG signal characteristics. The specific structure of the DDNN model is as follows:
1) Wavelet Block. The first layer feeds the time-varying signals $X(t) = (x_1(t), x_2(t), \cdots, x_n(t))$ into the network. The second layer is the wavelet layer, which is mainly used for data processing: the signals received by the first layer are passed on to it. This layer applies the Mallat algorithm, a fast algorithm for constructing orthogonal wavelets and computing the wavelet transform based on multi-resolution analysis (MRA), proposed by S. Mallat in 1988 [16]. Because the scaling function of the SymN wavelet basis resembles the QRS complex of the ECG signal, this paper chooses the Sym8 wavelet as the decomposition basis, with a decomposition scale of 10.
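As an illustration of the decomposition step, the following minimal sketch implements the Mallat filter-bank recursion in plain Python. It is not the configuration used in this paper: to stay self-contained it uses the two-tap Haar wavelet instead of Sym8, and the function names (haar_dwt_step, haar_idwt_step, mallat_decompose) and the toy signal are hypothetical. The structure is the same for any orthogonal wavelet: each level splits the current approximation into a low-pass and a high-pass band and downsamples both by 2.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt_step(signal):
    """One analysis step: low-pass and high-pass filtering, then downsampling by 2."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt_step(approx, detail):
    """Inverse of haar_dwt_step for even-length input: rebuilds the finer signal."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def mallat_decompose(signal, levels):
    """Multi-level decomposition; returns [cA_n, cD_n, ..., cD_1]."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break  # nothing left to split
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


# Toy 8-sample signal (hypothetical values, not ECG data).
toy_signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
bands = mallat_decompose(toy_signal, levels=3)
print([len(band) for band in bands])
```

With a wavelet library, the Sym8 filters and a deeper decomposition would replace the Haar filters; the recursion itself would not change.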

Fig. 2. DDNN Model.
Detecting and acquiring signal characteristics is a very important step in analyzing a segment of time-varying information. Collected ECG signals usually contain noise, caused mainly by power-line interference, electromyographic interference, baseline drift, and similar sources. The choice of threshold has a great influence on the original signal during de-noising. This layer uses the soft threshold method to realize feature acquisition and noise reduction for the ECG signal [3, 10]. The soft threshold method helps the network capture the features of short-term variable signals and exploit the advantages of multi-scale analysis. The soft threshold function applied here can be expressed as follows [3]:
Next, the wavelet layer normalizes the input ECG signal:
Experiments show that the wavelet layer can detect and capture the features of the input signals, which can improve the recognition accuracy of the time-varying signals.
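The soft-threshold step of the wavelet layer can be sketched as follows. This is a minimal illustration assuming the standard soft-threshold rule, which shrinks each wavelet coefficient w toward zero by a threshold lam and sets it to zero when |w| <= lam. The names soft_threshold and shrink_detail_bands and the value of lam are hypothetical; how the threshold is chosen is not shown here.

```python
def soft_threshold(w, lam):
    """Standard soft threshold: sign(w) * max(|w| - lam, 0)."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0


def shrink_detail_bands(bands, lam):
    """Apply the soft threshold to every detail band.

    `bands` is assumed to be [approximation, detail_n, ..., detail_1];
    the approximation band is left untouched.
    """
    approx, details = bands[0], bands[1:]
    shrunk = [[soft_threshold(w, lam) for w in band] for band in details]
    return [list(approx)] + shrunk


# Hypothetical coefficients: small values are treated as noise and removed.
bands = [[10.0, 12.0], [0.2, -3.0], [0.1, -0.4, 2.5, 0.3]]
print(shrink_detail_bands(bands, lam=0.5))
```

Unlike a hard threshold, this rule is continuous in w, so coefficients just above the threshold are not kept at full size.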
2) Convolutional Block. The convolutional block consists of the third through tenth layers. The third, fourth, and seventh layers are one-dimensional convolution layers with a convolution kernel size of 1 × 5. One-dimensional convolution performs well in time-varying signal processing. In these layers, different weights are convolved with the feature maps of the ECG segments. In the third and fourth layers, 64 × 3 weight vectors are used for one-dimensional convolution on the input ECG signal, with a stride of 3. The activation output of the fourth layer is then normalized per batch by a batch normalization (BN) layer. The sixth layer is a pooling layer, which generates a new feature map by averaging over a specified range of the feature map from the previous layer. It uses average pooling with a stride of 2 and a pooling size of 3, and so reduces the size of the incoming feature map according to the pooling range. Reducing feature map size is an important step in lowering the cost of a deep learning structure. In the seventh layer, the input feature map goes through one-dimensional convolution again with 32 × 5 weights and a stride of 3. The output of the seventh layer is then average-pooled in one dimension.
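To make the effect of stride (step length) and pooling on feature map size concrete, the sketch below traces the sequence length through the convolution and pooling layers. It rests on several assumptions that the text does not state: no padding is used, "64 × 3" and "32 × 5" are read as 64 filters of width 3 and 32 filters of width 5, the second pooling layer uses the same size and stride as the first, and the input length of 1000 samples is purely hypothetical. Batch normalization is omitted because it does not change the length.

```python
def output_length(n_in, window, stride=1, padding=0):
    """Output length of a 1-D convolution or pooling layer (no dilation)."""
    padded = n_in + 2 * padding
    if padded < window:
        raise ValueError("input is shorter than the window")
    return (padded - window) // stride + 1


# (layer name, window width, stride); all values are assumptions, see above.
LAYERS = [
    ("conv3", 3, 3),
    ("conv4", 3, 3),
    ("avgpool6", 3, 2),
    ("conv7", 5, 3),
    ("avgpool8", 3, 2),  # assumed to match the first pooling layer
]


def trace_lengths(n_in, layers=LAYERS):
    """Return the sequence length before the first layer and after each layer."""
    lengths = [n_in]
    for _name, window, stride in layers:
        lengths.append(output_length(lengths[-1], window, stride))
    return lengths


print(trace_lengths(1000))  # hypothetical input length
```

Under these assumptions, each stride-3 convolution cuts the length to roughly a third, which is why the flattened vector passed to the fully connected layer stays small.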
The ninth layer of the DDNN is a flatten layer, which transforms multidimensional data into a single dimension. The flattened data is fed to a fully connected layer, with the dropout rate set to 0.5. This prevents overfitting during learning, reduces the number of parameters, increases the diversity of the model, and makes the model more robust. A Rectified Linear Unit (ReLU) activation function follows. Its shape is shown in Fig. 3. The ReLU function is $f(x) = \max(0, x)$.

Fig. 3. ReLU activation function.
The derivative of the ReLU activation function is $f'(x) = 1$ for $x > 0$ and $f'(x) = 0$ for $x < 0$.
Using the ReLU activation function speeds up computation, because it only needs to check whether the input is greater than 0, and training converges much faster than with the sigmoid or tanh function. ReLU sets the output of some neurons to 0, which makes the network sparse, reduces the interdependence between parameters, and effectively alleviates overfitting. Moreover, it strengthens the nonlinearity of the network and of the decision function without changing the convolution layer itself. Additionally, the ReLU function effectively mitigates the vanishing gradient problem.
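The gradient argument above can be illustrated with a small sketch. It compares how the activation derivative alone scales a gradient passed back through a stack of layers. Weights are deliberately ignored, and the depth of 10 and the pre-activation values are hypothetical, so this shows only the contribution of the activation function.

```python
import math


def relu(x):
    """ReLU: passes positive inputs through, sets the rest to zero."""
    return x if x > 0 else 0.0


def relu_grad(x):
    """Derivative of ReLU (taken as 0 at x = 0 by convention)."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # never larger than 0.25


def chained_gradient(grad_fn, pre_activations):
    """Product of activation derivatives along a chain of layers."""
    g = 1.0
    for x in pre_activations:
        g *= grad_fn(x)
    return g


depth = 10  # hypothetical number of stacked layers
print(chained_gradient(relu_grad, [1.0] * depth))
print(chained_gradient(sigmoid_grad, [0.0] * depth))

# Sparsity: negative pre-activations produce exact zeros.
outputs = [relu(x) for x in [-1.5, 0.3, -0.2, 2.0]]
print(outputs)
```

For active units the ReLU factor stays at 1, while the sigmoid factor shrinks by at least a factor of four per layer, even in its most favorable case at x = 0.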
3) Recurrent Block. This block contains an LSTM hidden layer and a fully connected layer. The LSTM hidden layer consists of 20 recurrently connected memory cells. Each memory cell contains recurrently connected input, output, and forget gates, which provide write, read, and reset operations for the cell. Each memory cell also has input and output weight parameters, which allow information to pass selectively. The output of the LSTM hidden layer serves as the aggregated feature vector and is passed through dropout at a rate of 0.25. This prevents overfitting to the training data, which would otherwise harm the model's predictive performance. A fully connected layer follows.
The recurrent block aggregates features along the time dimension. After the convolutional block extracts spatial feature maps, the recurrent block helps the model capture the temporal dynamics in these feature maps.
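The gate mechanism can be sketched with a single memory cell. The code below is a simplified illustration using the standard LSTM update with scalar input and state, not the 20-unit layer of the DDNN. The parameter dictionary, its key names, and all numeric values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single LSTM memory cell (scalar input and state)."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate: write
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate: reset
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate: read
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate content
    c = f * c_prev + i * g   # keep part of the old state, write part of the new
    h = o * math.tanh(c)     # expose part of the state as the cell output
    return h, c


def run_lstm(sequence, p):
    """Run the cell over a sequence and return the final hidden value."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, p)
    return h


# Hypothetical parameters: every weight 0.5, every bias 0.
params = {k: 0.5 for k in ("wi", "ui", "wf", "uf", "wo", "uo", "wg", "ug")}
params.update({"bi": 0.0, "bf": 0.0, "bo": 0.0, "bg": 0.0})
print(run_lstm([0.1, 0.4, -0.2, 0.9], params))
```

When the forget gate saturates near 1 and the input gate near 0, the cell state is carried forward almost unchanged, which is what lets the layer retain information across many time steps.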
4) Classification Block. The thirteenth layer of the DDNN model is the classification layer, which uses the Softmax function to produce the classification output for the ECG signal. The Softmax layer expresses the result for each category as a probability, from which the category of the input is predicted. The Softmax function maps an arbitrary K-dimensional real vector to another K-dimensional real vector whose elements each lie in (0, 1). The Softmax function takes the form:
$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}$$
where j = 1, 2, ..., K, and K is the dimension of the real vector. Each component of σ(z) lies in (0, 1); for a positive sample the result is close to 1, and for a negative sample it is close to 0.
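A minimal sketch of the Softmax computation follows. Subtracting the maximum score before exponentiation is a common numerical safeguard that does not change the result; the five example scores are hypothetical and simply stand in for one score per heartbeat class.

```python
import math


def softmax(z):
    """Map K real scores to K probabilities that sum to 1."""
    m = max(z)  # shifting by the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1, -1.0, 0.5]  # hypothetical scores for five classes
probs = softmax(scores)
predicted = probs.index(max(probs))  # index of the most probable class
print(probs, predicted)
```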
The DDNN model can capture the characteristics of the ECG signal. It extracts and memorizes robust features of the signal, weakens the influence of short-term signal drift and expansion, and reduces the influence of noise. It is also well suited to analyzing, extracting, and memorizing the features of multi-modal short-term variable signals.
The ECG signal is a time-varying signal. Its waveform is composed of time-varying points, so it contains certain spatial information. The features of ECG signals consist mainly of wave attributes, which also carry spatial information. The convolutional neural network focuses on spatial mapping; together with the LSTM's ability to learn and remember time series and the wavelet's ability to de-noise and process signals, this makes the DDNN model proposed in this paper suitable for learning and classifying ECG signals.
The experimental data come from the MIT-BIH Arrhythmia Database [18]. Its ECG data are comprehensive and reliable, and since its release it has become the most widely used database in ECG research worldwide. Many ECG studies use this benchmark database as both experimental data and the test standard for models. This paper chooses the modified lead II signals in the MIT-BIH database as the experimental subject. To keep the leads uniform, records No. 102 and No. 104, which do not contain lead II, were excluded from this experiment.

Fig. 4. Five types of heartbeat signal waveforms.
In this paper, five types of ECG signals in MIT-BIH Arrhythmia Database are selected for experiments: Normal beat (N), Left bundle branch block beat (L), Right bundle branch block beat (R), Atrial premature beat (A), Premature ventricular contraction (V).
To avoid the overfitting that small sample sizes cause in DNN training, this paper applies “noise adding” to expand the amount of experimental data on the basis of the original data. This improves the generalization ability and robustness of the trained model. The experimental data are then segmented heartbeat by heartbeat into time-varying signal segments. The segmentation algorithm used here locates the R-wave peaks and segments the ECG signals around them [21].
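The two preprocessing steps can be sketched as follows. The code assumes additive Gaussian noise and a fixed window around each R-peak; the noise level, the window sizes, and the R-peak positions are all hypothetical, since the text does not specify them, and the R-peak detector itself is not shown.

```python
import random


def add_gaussian_noise(signal, sigma, rng):
    """Return a noisy copy of the signal (assumed zero-mean Gaussian noise)."""
    return [x + rng.gauss(0.0, sigma) for x in signal]


def segment_beats(signal, r_peaks, before, after):
    """Cut one fixed-length window around each R-peak index.

    Peaks whose window would extend past either end of the record are skipped.
    """
    beats = []
    for r in r_peaks:
        start, stop = r - before, r + after
        if start >= 0 and stop <= len(signal):
            beats.append(signal[start:stop])
    return beats


rng = random.Random(0)          # fixed seed so the example is repeatable
record = [0.0] * 100            # placeholder record, not real ECG data
r_peaks = [20, 50, 98]          # hypothetical R-peak sample indices
for r in r_peaks:
    record[r] = 1.0

noisy = add_gaussian_noise(record, sigma=0.01, rng=rng)
beats = segment_beats(record, r_peaks, before=10, after=15)
extra = segment_beats(noisy, r_peaks, before=10, after=15)
print(len(beats), len(extra), len(beats[0]))
```

Each noisy copy of a record yields another set of beat segments with the same labels, which is how the data set grows without new recordings.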
The electrocardiogram is an effective method to diagnose arrhythmia. To verify the validity and feasibility of the DDNN model, two kinds of experiments are carried out in this paper. The first experiment was to identify the arrhythmia ECG signals, and the second experiment was to identify five types of ECG signals.
Experiment 1: Arrhythmia Recognition
The recognition of arrhythmia is a binary classification problem. The experimental data consist of ECG signals from the MIT-BIH Arrhythmia Database together with signals generated from them through “noise adding”, 100,000 samples in total. They are divided into 50,000 normal and 50,000 arrhythmia samples, so the ratio of normal to abnormal ECG signals is 1:1. In this experiment, 75% of the samples form the training set and 25% form the testing set. The data set partition of the diagnostic experiment for arrhythmia is shown in Table 1.
Table 1. Data set partition of the diagnostic experiment for arrhythmia
Experiment 2: Arrhythmia Types Classification
The experimental data consist of ECG signals from the MIT-BIH Arrhythmia Database together with signals generated from them through “noise adding”, 65,000 samples in total. In this experiment, 75% of the samples form the training set and 25% form the testing set. The experimental data set partition for classification of the arrhythmia types is shown in Table 2.
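The 75%/25% partition used in both experiments can be sketched as a shuffled split. Whether the original split was random or stratified by class is not stated, so the random shuffle, the fixed seed, and the function name train_test_split are assumptions. Only the resulting set sizes follow directly from the stated totals.

```python
import random


def train_test_split(samples, train_fraction, seed=0):
    """Shuffle the samples and cut them into a training and a testing set."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Sample indices stand in for the beat segments of each experiment.
for total in (100000, 65000):
    train, test = train_test_split(range(total), 0.75)
    print(total, len(train), len(test))
```

With these totals, the split gives 75,000 training and 25,000 testing samples in Experiment 1, and 48,750 and 16,250 in Experiment 2.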
Table 2. Experimental data set partition for classification of arrhythmia types
The DDNN model was trained for 600 epochs in each experiment. It converged quickly during training, and no overfitting occurred. The accuracy on the testing set is higher than that on the training set, the overall accuracy is high, and the error is small.
The result of the arrhythmia recognition experiment is shown in Table 3.
Table 3. Result of experiment 1
The arrhythmia recognition experiment achieved a training accuracy of 99.95% and a test accuracy of 99.98%.
The result of the arrhythmia type classification experiment is shown in Table 4.
Table 4. Result of experiment 2
The arrhythmia type classification experiment achieved a training accuracy of 99.95% and a test accuracy of 99.97%.
The two experiments above both verified the validity and feasibility of the DDNN model.
At present, ECG signal recognition is a hotspot in medical information technology, artificial intelligence, and related fields. Many researchers have published related studies, some with excellent results. Table 5 summarizes some of this recent work on ECG recognition and compares it with the experiments in this paper. The experimental data of these works also come from the MIT-BIH Arrhythmia Database.
Table 5. Experimental results of DDNN and comparisons with state-of-the-art algorithms
The comparison in Table 5 shows that, on the same data set, the DDNN model proposed in this paper achieves good results in both two-class and five-class classification of ECG signals.
In this paper, a novel 13-layer deep dynamic neural network (DDNN) model is proposed for ECG signals, which are low-frequency, variable, unstable, and noisy. This deep network model is suitable for large-scale data sets and classifies input signals automatically through its end-to-end structure. It can overcome the non-stationary characteristics of ECG signals and realize feature discovery, acquisition, and in-depth analysis. Moreover, the DDNN model has good applicability to the pattern classification of ECG signals. It can extract and memorize robust features of the signal, weaken the influence of short-term signal drift and expansion, and reduce both the influence of noise on time signal analysis and the loss of important information during analysis. The experimental results show that the proposed model is feasible and effective for ECG signal pattern learning and can provide a new method for ECG signal pattern classification.
