Abstract
A multi-layer convolutional neural network (MCNN) with hyperparameter optimization (HyperMCNN) is proposed for classifying human electrocardiograms (ECGs). For performance tests of the HyperMCNN, ECG recordings for patients with cardiac arrhythmia (ARR), congestive heart failure (CHF), and normal sinus rhythm (NSR) were obtained from three PhysioNet databases: MIT-BIH Arrhythmia Database, BIDMC Congestive Heart Failure Database, and MIT-BIH Normal Sinus Rhythm Database, respectively. The MCNN hyperparameters in convolutional layers were number of filters, filter size, padding, and filter stride. The hyperparameters in max-pooling layers were pooling size and pooling stride. The gradient method used to train the MCNN model was also treated as a hyperparameter. A uniform experimental design approach was used to optimize the hyperparameter combination for the MCNN. In performance tests, the resulting 16-layer CNN with an appropriate hyperparameter combination (16-layer HyperMCNN) was used to distinguish among ARR, CHF, and NSR. The experimental results showed that the average correct rate and standard deviation obtained by the 16-layer HyperMCNN were superior to those obtained by a 16-layer CNN with a hyperparameter combination given by Matlab examples. Furthermore, in distinguishing among ARR, CHF, and NSR, the 16-layer HyperMCNN was superior to the 25-layer AlexNet, the neural network that had the best image identification performance in the ImageNet Large Scale Visual Recognition Challenge in 2012.
Introduction
An electrocardiogram (ECG) is an important tool for diagnosing heart disease. The four stages of ECG signal classification are preprocessing, segmentation, feature extraction, and classification. First, the preprocessing stage detects and attenuates unwanted frequency components of the ECG signal, typically by performing signal normalization and enhancement. Next, the segmentation stage splits the signal into smaller segments that represent the electrical activity of the heart [1]. Researchers can use previously developed tools and techniques to obtain good results for these two stages [2]. Therefore, most studies of ECG signal classification have focused on the last two stages, feature extraction and classification [3].
Convolutional neural networks (CNNs) have proven highly effective in various real-world applications, including classification of images [4], speech [5], and sounds [6–10]. The main advantage of a CNN over its predecessors is that it automatically detects important features without human supervision. A CNN model combines two functions, feature extraction and classification. The convolutional layers and the pooling layers in the CNN perform feature extraction. The fully connected layers then use the extracted features to perform image classification. Figure 1 shows a classic multi-layer convolutional neural network (MCNN) with an input layer, multiple hidden layers, and an output layer. The hidden layers of an MCNN include convolutional layers, pooling layers, and fully connected layers. Since an MCNN can perform both feature extraction and classification, this study used an MCNN to solve ECG classification problems.

A classic multi-layer convolutional neural network.
A major challenge is to select MCNN hyperparameters that provide the best image recognition performance. The MCNN hyperparameters are conventionally optimized by trial and error or by using previously published reference data. This study used a uniform experimental design (UED) approach to find the best combination of hyperparameters for the MCNN. After the MCNN structure was defined, the UED was used to design the hyperparameter combinations for the 16-layer CNN as described in Chou et al. [11]. The experiments designed by the UED were performed to find the combination with the highest correct rate in classifying ECG recordings. The 16-layer CNN with the resulting hyperparameter combination is referred to as the 16-layer HyperMCNN. To validate its robustness, the proposed 16-layer HyperMCNN was used to classify ECG recordings in independent experimental runs. The experiments showed that the proposed 16-layer HyperMCNN obtained a better average correct rate and standard deviation than both the 16-layer CNN with a hyperparameter combination given by the Matlab example (16-layer MatlabCNN) and the 25-layer AlexNet.
This study is organized as follows. Section 2 defines the research problem. Section 3 presents the proposed approaches and steps. Section 4 introduces and discusses the findings. Finally, Section 5 summarizes the study.
All ECG recordings used in the experiments in this study were obtained from three PhysioNet databases: MIT-BIH Arrhythmia Database [12, 13], BIDMC Congestive Heart Failure Database [12, 14], and MIT-BIH Normal Sinus Rhythm Database [12]. The ECG recordings were for patients with cardiac arrhythmia (ARR), congestive heart failure (CHF), and normal sinus rhythm (NSR). Continuous ECG signals in each recording were labeled. Figures 2–4 show the signals randomly selected from the MIT-BIH Arrhythmia Database, from the BIDMC Congestive Heart Failure Database, and from the MIT-BIH Normal Sinus Rhythm Database, respectively.

Randomly selected signals from the MIT-BIH Arrhythmia Database.

Randomly selected signals from the BIDMC Congestive Heart Failure Database.

Randomly selected signals from the MIT-BIH Normal Sinus Rhythm Database.
The considered problem was how to classify large numbers of different ECG recording types efficiently and accurately. The ECG signal can differ even for the same illness. Therefore, a specialist or machine learning is needed to assist the physician in classifying ECG signals. Figure 5 shows an example of the ECG signal classification problem.

Example of ECG signal classification problem.
The research methods and steps were ECG data collection, feature extraction, data grouping, MCNN architecture design, UED planning, using MCNN to classify ECG recordings, and validating the robustness of the appropriate hyperparameter combination in classifying ECG recordings. The detailed steps were as follows.
Collecting ECG data
The ECG recordings were collected from three PhysioNet databases: MIT-BIH Arrhythmia Database, BIDMC Congestive Heart Failure Database, and MIT-BIH Normal Sinus Rhythm Database. The recordings included 96 ARR recordings, 30 CHF recordings, and 36 NSR recordings. The ECG data had two fields: Data and Labels. The Data field was a 162-by-65536 matrix in which each row was an ECG recording sampled at 128 Hz. The Labels field was a 162-by-1 cell array of diagnostic labels, one for each row of Data. The three diagnostic categories were ARR, CHF, and NSR.
Creating time-frequency RGB scalograms
A scalogram is the absolute value of the continuous wavelet transform (CWT) coefficients of a signal [2, 15]. To maintain compatibility with the MCNN architecture, each RGB scalogram was a 227×227×3 array. Figures 6–8 show examples of ARR, CHF, and NSR scalograms, respectively, generated by CWT.

Example of a cardiac arrhythmia scalogram generated by continuous wavelet transform.

Example of a congestive heart failure scalogram generated by continuous wavelet transform.

Example of a normal sinus rhythm scalogram generated by continuous wavelet transform.
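The definition of a scalogram given above can be illustrated with a minimal pure-Python sketch. It is not the Matlab pipeline used in this study: the complex Morlet wavelet, the parameter `w0`, the helper names, and the synthetic 8 Hz test signal are all assumptions made for illustration, and the colormap and resizing steps that produce the 227×227×3 RGB array are omitted.

```python
import cmath
import math


def morlet(u, w0=6.0):
    """Complex Morlet mother wavelet (an assumed choice; the source names no wavelet)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * u) * math.exp(-0.5 * u * u)


def scalogram(signal, fs, freqs, w0=6.0):
    """Return |CWT| with one row per analysis frequency and one column per sample."""
    n = len(signal)
    rows = []
    for f in freqs:
        scale = w0 / (2.0 * math.pi * f)         # scale (seconds) centred on frequency f
        half = int(math.ceil(4.0 * scale * fs))  # truncate the wavelet at +/- 4 scales
        kernel = [morlet(k / (fs * scale), w0).conjugate() / math.sqrt(scale)
                  for k in range(-half, half + 1)]
        row = []
        for t in range(n):
            acc = 0j
            for k in range(-half, half + 1):
                idx = t + k
                if 0 <= idx < n:                 # samples outside the signal count as zero
                    acc += signal[idx] * kernel[k + half]
            row.append(abs(acc) / fs)            # absolute value of the CWT coefficient
        rows.append(row)
    return rows


# Synthetic stand-in for an ECG row: 2 s of an 8 Hz sinusoid sampled at 128 Hz.
fs = 128.0
test_signal = [math.sin(2.0 * math.pi * 8.0 * i / fs) for i in range(256)]
freqs = [2.0, 4.0, 8.0, 16.0, 32.0]
scal = scalogram(test_signal, fs, freqs)

# The strongest row at the middle sample should be the one centred on 8 Hz.
mid = len(test_signal) // 2
peak_freq = freqs[max(range(len(freqs)), key=lambda r: scal[r][mid])]
```

In this sketch each row of `scal` corresponds to one analysis frequency. A real scalogram would use many more frequencies and an optimized CWT implementation.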
The 162 ECG scalograms generated by CWT were then randomly divided into a training dataset and a validation dataset. The training dataset included 130 randomly selected ECG scalograms (80% of all recordings). Of these, 29 were NSR, 24 were CHF, and 77 were ARR. The remaining 32 ECG scalograms (20% of all recordings) were used in the validation dataset and included 7 NSR scalograms, 6 CHF scalograms, and 19 ARR scalograms.
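The class counts reported above are consistent with a per-class (stratified) 80/20 split, although the source only states that the division was random. The sketch below assumes stratification and rounding of the per-class training size. The function name and the seed are hypothetical.

```python
import random
from collections import Counter


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices class by class (stratification is an assumption, not stated in the source)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, val = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(train_fraction * len(indices))  # 80% of each class, rounded
        train.extend(indices[:n_train])
        val.extend(indices[n_train:])
    return sorted(train), sorted(val)


# Class sizes taken from the text: 96 ARR, 30 CHF, and 36 NSR recordings.
labels = ["ARR"] * 96 + ["CHF"] * 30 + ["NSR"] * 36
train_idx, val_idx = stratified_split(labels)
train_counts = Counter(labels[i] for i in train_idx)
val_counts = Counter(labels[i] for i in val_idx)
```

Under these assumptions the per-class sizes are 77/24/29 for training and 19/6/7 for validation, which matches the counts given in the text.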
Designing the MCNN architecture
The architecture of the 16-layer HyperMCNN inspired by Chou et al. [11] was used to identify the ECG scalograms. The five design hyperparameters were number of filters (Nf), filter size (Fc), pooling size (Fp), pooling stride (Sp), and the gradient method; padding (P) was set to 1, and filter stride (Sc) was set to 1. Each ECG scalogram was a 227×227×3 RGB scalogram. The 16 layers of the HyperMCNN are listed below:
The three gradient methods used to train the MCNN were stochastic gradient descent with momentum (Sgdm), adaptive moment estimation (Adam), and root mean square propagation (RMSProp).
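For readers unfamiliar with the three gradient methods, the following sketch shows textbook single-parameter update rules. These are the commonly published forms, not necessarily the exact variants implemented in the Matlab toolbox. All function names and default values are assumptions chosen for illustration.

```python
import math


def sgdm_step(theta, grad, velocity, lr=0.001, momentum=0.9):
    """Stochastic gradient descent with momentum (textbook form)."""
    velocity = momentum * velocity - lr * grad
    return theta + velocity, velocity


def rmsprop_step(theta, grad, sq_avg, lr=0.001, decay=0.9, eps=1e-8):
    """RMSProp: scale the step by a running average of squared gradients."""
    sq_avg = decay * sq_avg + (1.0 - decay) * grad * grad
    return theta - lr * grad / (math.sqrt(sq_avg) + eps), sq_avg


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam with bias correction; t is the 1-based iteration counter."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return theta - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


# Toy usage: minimize f(x) = x^2 (gradient 2x) with SGDM from x = 1.
x, vel = 1.0, 0.0
for _ in range(200):
    x, vel = sgdm_step(x, 2.0 * x, vel, lr=0.05, momentum=0.9)
```

The three rules differ only in how the raw gradient is turned into a step, which is why the gradient method can be treated as one categorical hyperparameter alongside the numeric ones.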
Using UED approach to select MCNN hyperparameter combinations
The UED approach developed by Wang and Fang [16] uses a space-filling design to create a set of experimental points that are evenly dispersed in the continuous design parametric space. The UED considers only the uniform dispersion of the experimental points and disregards their orderly comparability. Therefore, UED minimizes the number of experiments needed to acquire all available information [17].
The uniform layout (UL) is denoted by $U_a(a^b)$, where $U$ is the symbol of the UL, $a$ is the number of experiments and levels, and $b$ is the number of parameters. The steps for using the good lattice point (good grid) method to construct the UL were as follows [18, 19].
Let $a$ be the number of experiments. Find the positive integers $h_j$ that are less than $a$ and whose greatest common divisor with $a$ is 1. Use the following equation to calculate the UL element $u_{i,j}$:
$$u_{i,j} = i\,h_j \pmod{a}$$
where $i = 1, 2, \ldots, a$ indexes the experiments, $j$ indexes the columns, and a residue of 0 is recorded as level $a$ so that every element lies between 1 and $a$.
In the UED approach, when $a$ is a prime number, the UL has $(a-1)$ columns. When $a$ is not a prime number, the UL has fewer than $(a-1)$ columns.
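The construction steps above can be sketched as follows, assuming the standard good lattice point rule in which element (i, j) is the product of i and the j-th generator modulo a, with a residue of 0 recorded as level a. The function name is hypothetical.

```python
from math import gcd


def good_lattice_ul(a):
    """Build a uniform layout with a runs from all generators h < a coprime to a."""
    generators = [h for h in range(1, a) if gcd(h, a) == 1]
    table = []
    for i in range(1, a + 1):
        row = []
        for h in generators:
            u = (i * h) % a
            row.append(u if u != 0 else a)  # residue 0 is recorded as level a
        table.append(row)
    return table


ul_11 = good_lattice_ul(11)  # prime a: a - 1 = 10 columns
ul_9 = good_lattice_ul(9)    # non-prime a: fewer than a - 1 = 8 columns
```

Each column of the resulting table is a permutation of the levels 1 to a, so every level of every factor is tested exactly once.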
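When the number of design variables is less than the number of columns in the UL, a subset of the columns must be selected. Therefore, each UL has an accompanying table indicating which columns should be assigned to the design variables so that the experimental points are evenly distributed. The centered L2-discrepancy (CL2) is an attractive measure of uniformity because it is invariant under reordering of the runs, relabeling of the factors, and reflection of the points about any plane that passes through the center of the unit cube and is parallel to one of its faces. The last property is equivalent to invariance when the $i$th coordinate $x_i$ is replaced by $1 - x_i$ for some $i = 1, \ldots, s$. To analyze CL2, Hickernell [20] proposed the following mathematical expression:
$$\left[CL_2(P)\right]^2 = \left(\frac{13}{12}\right)^s - \frac{2}{n}\sum_{k=1}^{n}\prod_{i=1}^{s}\left(1 + \frac{1}{2}\left|x_{ki} - \frac{1}{2}\right| - \frac{1}{2}\left|x_{ki} - \frac{1}{2}\right|^2\right) + \frac{1}{n^2}\sum_{k=1}^{n}\sum_{l=1}^{n}\prod_{i=1}^{s}\left(1 + \frac{1}{2}\left|x_{ki} - \frac{1}{2}\right| + \frac{1}{2}\left|x_{li} - \frac{1}{2}\right| - \frac{1}{2}\left|x_{ki} - x_{li}\right|\right)$$
where $P = \{x_1, \ldots, x_n\}$ is the set of $n$ design points in the unit cube $[0, 1]^s$ and $x_{ki}$ is the $i$th coordinate of the $k$th point.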
The CL2 confirms that the design points obtained by the UL are evenly dispersed in the experimental domain, i.e., that the UED is uniform. For example, Table 1 shows the eleven-level UL $U_{11}(11^{10})$, and Table 2 is the table used for $U_{11}(11^{10})$. The table shows that the UED is well suited for solving problems that involve multiple factors with multiple levels.
Eleven-level UL of $U_{11}(11^{10})$
Table used for $U_{11}(11^{10})$
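As an illustration of how the CL2 can be evaluated, the sketch below implements the squared centered L2-discrepancy in its commonly published form. The mapping of level u to the coordinate (u - 0.5)/a, the function names, and the choice of generators 1 and 7 for the two-column example are assumptions made for illustration. They are not taken from Table 2.

```python
def levels_to_unit_cube(design, a):
    """Map integer levels 1..a to cell centres in [0, 1] (an assumed convention)."""
    return [[(u - 0.5) / a for u in row] for row in design]


def centered_l2_sq(points):
    """Squared centered L2-discrepancy of a list of points in the unit cube."""
    n, s = len(points), len(points[0])
    single = 0.0
    for x in points:
        prod = 1.0
        for xi in x:
            d = abs(xi - 0.5)
            prod *= 1.0 + 0.5 * d - 0.5 * d * d
        single += prod
    pair = 0.0
    for x in points:
        for y in points:
            prod = 1.0
            for xi, yi in zip(x, y):
                prod *= (1.0 + 0.5 * abs(xi - 0.5) + 0.5 * abs(yi - 0.5)
                         - 0.5 * abs(xi - yi))
            pair += prod
    return (13.0 / 12.0) ** s - 2.0 * single / n + pair / (n * n)


# Two columns of an 11-run good lattice point design (generators 1 and 7, illustrative).
design = [[(i * h) % 11 or 11 for h in (1, 7)] for i in range(1, 12)]
points = levels_to_unit_cube(design, 11)
cl2_sq = centered_l2_sq(points)

# Reflecting the first coordinate (x -> 1 - x) should leave the value unchanged.
reflected = [[1.0 - p[0], p[1]] for p in points]
cl2_sq_reflected = centered_l2_sq(reflected)
```

A smaller value indicates a more uniform design, so candidate column subsets can be compared by evaluating this function on each.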
For convolutional layers of the MCNN, the hyperparameters were number of filters, filter size, padding size, and filter stride. For max-pooling layers of the MCNN, the hyperparameters were pooling size and pooling stride.
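For the convolutional layer, the output image size was calculated as
$$H_2 = \frac{H_1 - F_c + 2P}{S_c} + 1, \qquad W_2 = \frac{W_1 - F_c + 2P}{S_c} + 1$$
where $H_1 \times W_1$ is the input image size, $H_2 \times W_2$ is the output image size, $F_c$ is the filter size, $P$ is the padding, and $S_c$ is the filter stride.
For the max pooling layer, the output image size was computed as follows.
$$H_4 = \frac{H_3 - F_p}{S_p} + 1, \qquad W_4 = \frac{W_3 - F_p}{S_p} + 1$$
where $H_3 \times W_3$ is the input image size, $H_4 \times W_4$ is the output image size, $F_p$ is the pooling size, and $S_p$ is the pooling stride.
When designing hyperparameter values, the constraints are $(H_1 - F_c + 2P) > 0$, $(W_1 - F_c + 2P) > 0$, $(H_3 - F_p) > 0$, and $(W_3 - F_p) > 0$ [11].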
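The output-size relations and constraints can be checked with a short sketch. It assumes the usual relations, (input - filter + 2 × padding) / stride + 1 for convolution and (input - pool) / stride + 1 for max pooling, with the division rounded down. The rounding rule and the function names are assumptions.

```python
def conv_output_size(h1, w1, fc, p=1, sc=1):
    """Output height and width of a convolutional layer (floor division assumed)."""
    if h1 - fc + 2 * p <= 0 or w1 - fc + 2 * p <= 0:
        raise ValueError("constraint (H1 - Fc + 2P) > 0 or (W1 - Fc + 2P) > 0 violated")
    return (h1 - fc + 2 * p) // sc + 1, (w1 - fc + 2 * p) // sc + 1


def pool_output_size(h3, w3, fp, sp):
    """Output height and width of a max-pooling layer (floor division assumed)."""
    if h3 - fp <= 0 or w3 - fp <= 0:
        raise ValueError("constraint (H3 - Fp) > 0 or (W3 - Fp) > 0 violated")
    return (h3 - fp) // sp + 1, (w3 - fp) // sp + 1


# A 227x227 scalogram with Fc = 4, P = 1, Sc = 1, followed by pooling with Fp = 4, Sp = 2.
conv_hw = conv_output_size(227, 227, fc=4, p=1, sc=1)
pool_hw = pool_output_size(conv_hw[0], conv_hw[1], fp=4, sp=2)
```

With these values the convolution yields a 226×226 feature map and the pooling layer reduces it to 112×112. A candidate combination that raises `ValueError` at any layer would be excluded from the design.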
Table 3 shows the 11-level UL $U_{11}(11^5)$, which consists of five columns selected from $U_{11}(11^{10})$ according to Table 2. This UL was used to design the hyperparameter combinations for the proposed 16-layer HyperMCNN because there are five design hyperparameters: Fc, Nf, Fp, Sp, and the gradient method. Since the initial weights for the MCNN were randomly set, the correct rate differed in each run. Therefore, each row of hyperparameter combinations in $U_{11}(11^5)$ was tested in five independent runs. The output value of each row was the average correct rate over the five independent runs.
The 11-level UL of $U_{11}(11^5)$ used to allocate five design parameters for eleven levels
The experimental environment was Matlab R2019 and its toolboxes developed by The MathWorks. For the proposed 16-layer HyperMCNN, the Matlab toolbox was used to set the network training options, e.g., ‘MaxEpochs’ (maximum number of epochs) was set to 20, ‘ExecutionEnvironment’ was set to ‘gpu’, and ‘InitialLearnRate’ was set to 0.001.
The optimized hyperparameter combination for the proposed 16-layer HyperMCNN was considered valid if it was sufficiently robust for classifying ECG recordings.
Since Matlab R2019 (MathWorks) is a widely used commercial software program, the hyperparameters provided by Matlab R2019 were assumed to be valid and efficient. Therefore, the 16-layer MatlabCNN was used as a baseline for evaluating the performance of the proposed 16-layer HyperMCNN.
Additionally, the 25-layer AlexNet, which was the champion of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012, was used as a second baseline for evaluating the performance of the proposed 16-layer HyperMCNN.
Implementation and discussion of results
The proposed 16-layer HyperMCNN was the 16-layer CNN with an appropriate hyperparameter combination for ECG recording classification. The training set comprised 130 randomly selected ECG scalograms (80% of the total recordings), including 29 NSR, 24 CHF, and 77 ARR scalograms. The validation set comprised the remaining 32 ECG scalograms (20% of the total recordings), which included 7 NSR, 6 CHF, and 19 ARR scalograms.
The five design hyperparameters for the proposed 16-layer HyperMCNN were Fc, Nf, Fp, Sp, and gradient method. Each ECG recording was converted to a 227×227×3 RGB scalogram. The 11-level UL of $U_{11}(11^5)$ was used to design hyperparameter combinations for the 16-layer HyperMCNN. The value of factor Fc was an integer ranging from 3 to 5. The value of factor Nf was an integer ranging from 16 to 20. The value of factor Fp was an integer ranging from 2 to 4. The value of factor Sp was an integer ranging from 1 to 2. The gradient methods for training MCNN were Sgdm, Adam, and RMSProp. Table 4 shows the level values of the five design hyperparameters for the proposed 16-layer HyperMCNN. Table 5 shows the factor values of the five design hyperparameters that combined the values in Tables 3 and 4.
Level values of five design hyperparameters for the proposed 16-layer HyperMCNN
Factor values of five design hyperparameters for the proposed 16-layer HyperMCNN
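Because each factor has fewer than eleven distinct values, several UL levels must share the same factor value. The sketch below shows one hypothetical way to do this by spreading the eleven levels as evenly as possible over the allowed values. The actual assignment in Table 4 may differ, and the function names and the example row of levels are invented for illustration.

```python
def level_to_value(level, values, n_levels=11):
    """Map a UL level (1..n_levels) to a factor value; hypothetical even binning."""
    if not 1 <= level <= n_levels:
        raise ValueError("level out of range")
    return values[(level - 1) * len(values) // n_levels]


# Factor ranges from the text; the level-to-value assignment itself is an assumption.
FACTOR_VALUES = {
    "Fc": [3, 4, 5],
    "Nf": [16, 17, 18, 19, 20],
    "Fp": [2, 3, 4],
    "Sp": [1, 2],
    "gradient": ["Sgdm", "Adam", "RMSProp"],
}


def decode_row(levels):
    """Turn one UL row (five levels) into a hyperparameter combination."""
    return {name: level_to_value(level, FACTOR_VALUES[name])
            for name, level in zip(FACTOR_VALUES, levels)}


example = decode_row([1, 11, 5, 7, 9])  # illustrative levels, not a row of Table 3
```

Here `example` evaluates to Fc 3, Nf 20, Fp 3, Sp 2, and gradient method RMSProp, which shows how a single UL row becomes one experiment.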
Table 6 shows the average correct rates and standard deviations (SDs) obtained in five independent experimental runs when each combination of the five design hyperparameters in Table 5 was used in the 16-layer HyperMCNN to identify ECG scalograms in the training and validation sets. In Table 6, the best average correct rate for the validation set, 93.75%, was obtained by the fifth combination. The best hyperparameter combination was Fc 4, Nf 20, Fp 4, Sp 2, and gradient method ‘Sgdm’. To validate its robustness, the best hyperparameter combination was used to classify ECG recordings in ten independent experimental runs. Table 7 shows the resulting average correct rates and SDs. For the validation set, the average correct rate obtained in ten independent runs was 93.75%, and the SD was 0. That is, the best hyperparameter combination (Fc 4, Nf 20, Fp 4, Sp 2, and gradient method ‘Sgdm’) was sufficiently robust for classifying ECG recordings.
Average correct rates and SDs in identifying ECG scalograms in five independent experimental runs when the five design hyperparameter combinations in Table 5 were used in the proposed 16-layer HyperMCNN
Performance comparison of the proposed 16-layer HyperMCNN, the 16-layer MatlabCNN, and the 25-layer AlexNet in terms of average correct rates and SDs in identifying ECG scalograms in 10 independent experimental runs
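The reported validation figures can be reproduced arithmetically. An SD of 0 with a mean of 93.75% on 32 scalograms implies that every run classified 30 of 32 scalograms correctly. The sketch below encodes that inference. The use of the sample SD and the function name are assumptions.

```python
from statistics import mean, stdev


def summarize(correct_counts, n_samples):
    """Average correct rate (%) and sample SD over independent runs."""
    rates = [100.0 * c / n_samples for c in correct_counts]
    return mean(rates), stdev(rates)


# Inferred per-run counts: 30 of 32 validation scalograms correct in each of ten runs.
validation_counts = [30] * 10
avg_rate, sd_rate = summarize(validation_counts, 32)
```

With these counts `avg_rate` is 93.75 and `sd_rate` is 0.0, matching the values reported for the validation set.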
The proposed 16-layer HyperMCNN was then compared with the 16-layer MatlabCNN and the 25-layer AlexNet. The hyperparameter combination for the 16-layer MatlabCNN, which was given by the Matlab example, was Fc 3, Nf 16, Fp 2, Sp 2, and gradient method ‘Sgdm’. Table 7 compares the proposed 16-layer HyperMCNN, the 16-layer MatlabCNN, and the 25-layer AlexNet in terms of average correct rates and SDs in classifying ECG recordings in 10 independent experimental runs. The table shows that the proposed 16-layer HyperMCNN achieved the highest average correct rate for the validation set. In particular, for the validation set, the proposed 16-layer HyperMCNN had a higher average correct rate than the 16-layer MatlabCNN. Moreover, for both the training and validation sets, the proposed 16-layer HyperMCNN had higher average correct rates and lower SDs than the 25-layer AlexNet. That is, for classifying ECG recordings, the proposed 16-layer HyperMCNN was more robust than the 16-layer MatlabCNN and the 25-layer AlexNet.
By integrating UED and MCNN, the proposed 16-layer HyperMCNN effectively optimizes its hyperparameters for ECG recording classification. The main contribution of this study is the use of the UED for rapidly obtaining the best hyperparameter combination for a 16-layer HyperMCNN used to classify ECG recordings. When the best hyperparameter combination was used in the 16-layer HyperMCNN, the average correct rates for classifying ECG recordings were 99.23% and 93.75% for a training set of 130 scalograms and a validation set of 32 scalograms, respectively. Additionally, the average correct rates and SDs obtained by the proposed 16-layer HyperMCNN to classify ECG recordings were superior to those obtained by the 16-layer MatlabCNN. That is, the best hyperparameter combination obtained by UED for the proposed 16-layer HyperMCNN was more robust compared to that of the 16-layer MatlabCNN. Therefore, we conclude that integrating UED in the proposed 16-layer HyperMCNN enables effective and systematic optimization of hyperparameters for classifying ECG recordings. Tests of performance in classifying ECG recordings further revealed that, although it has fewer layers, the proposed 16-layer HyperMCNN has a superior average correct rate and SD compared to the 25-layer AlexNet.
Acknowledgments
This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grant Numbers MOST 107-2221-E-153-005-MY2, MOST 109-2221-E-153-005-MY3, MOST 108-2221-E-037-007, and MOST 109-2221-E-037-005. This work was also supported in part by the Headquarters of University Advancement and Intelligent Manufacturing Research Center (iMRC), National Cheng Kung University, which is sponsored by the Ministry of Education, Taiwan.
