Abstract
Biometric-based authentication methods have been widely used, for example on portable devices (e.g., Android and iOS devices). However, there are several known limitations in existing authentication methods based on biometrics (e.g., those using face, iris, and fingerprint recognition). For example, in a healthcare context, a user may be physically incapable of completing the authentication due to his/her medical conditions. Hence, as a complementary authentication mechanism, there have been attempts to also utilize the electrocardiogram (ECG). In this work, we propose an ECG authentication system that leverages deep learning. Specifically, to achieve generalization ability, complementary ensemble empirical mode decomposition (CEEMD) is introduced in our design. Moreover, a 1-D Multi-scale Convolutional Neural Network (1-D MCNN) is implemented to achieve accurate authentication. To evaluate the usability of our proposed approach, we have performed extensive experiments on eight databases, and the findings show that our proposed approach achieves good performance even on abnormal databases and can be adapted for different application environments. In addition, the data we adopted from eight public databases would require theoretical statistical treatment before practical application in real authentication scenarios.
Introduction
Biometric-based authentication approaches are increasingly becoming the norm for both work and leisure (e.g., on popular consumer devices such as Android and iOS devices). However, there are some known limitations and weaknesses in such authentication approaches [4,15,31,40], such as the vulnerability to forgery attacks [29]. In addition to vulnerabilities in the underlying authentication approaches, we have to consider situations where conventional biometrics may not be convenient. For instance, healthcare systems involve series of operations that require multiple authentication requests. We should also consider situations where users may not be able to complete the authentication due to changes in their medical or physical conditions (e.g., when the user is wearing a face mask during a pandemic). Hence, in such situations, particularly in a healthcare setting (e.g., involving medical staff wearing surgical masks), one can deploy an authentication mechanism based on the electrocardiogram (ECG).
ECG is generally utilized to diagnose heart disease (see also Fig. 1). Specifically, there are three typical waveforms in ECG signals, namely the P-wave, the QRS complex, and the T-wave. The P-wave reflects the depolarization of the atria, while the QRS complex and the T-wave respectively represent the depolarization and repolarization of the ventricles [35].

Typical ECG signal (the “101” recording of MITDB).
The ECG signal can reveal unique features among different individuals, such as physiological status, exercise status, mood, and other aspects of a user [26]. In other words, ECG signals have properties that could facilitate user authentication: 1) distinctiveness: different individuals can be distinguished; 2) universality: it could be obtained from any individual with a heartbeat, even when the individual is in a coma; 3) security: counterfeiting ECG signals is computationally challenging, partly due to its liveness trait [34].
As discussed earlier, there are known limitations in existing approaches (see also the next section). In recent years, many attempts have been made to leverage deep learning to improve the accuracy of ECG authentication. One key limitation in the early deep learning-based approaches is the reliance on fiducial detectors, which results in significant computational overheads and poor robustness to noise. Spectral analysis has also been utilized to avoid the use of fiducial detectors. However, the use of spectral methods in ECG authentication brings a new set of challenges. First, the accuracy of ECG authentication may differ significantly across ECG databases, depending on the choice of basis functions and window functions. Second, it is difficult for a non-adaptive spectral method to achieve both local and global optimality. Third, achieving accurate authentication with generalization ability and high efficiency is challenging.
Motivated by these challenges, we utilize a self-adaptive spectral method (CEEMD) to process ECG signals and a 1-D MCNN for accurate ECG authentication. To evaluate the performance of our proposed methods, we further performed extensive experiments on eight public databases. Since our work focused on authentication methods, we directly adopted ECG data from eight public databases, and the data used in our experiments require theoretical statistical treatment for practical applications. The main contributions of our paper are summarized as follows:
This paper presents the first study that demonstrates the utility of CEEMD in extracting biometrics from ECG signals, in order to achieve high generalization ability. Unlike other methods that use CEEMD for ECG signal denoising, we use CEEMD for ECG biometric extraction. In our design, ECG signals are decomposed according to their intrinsic features in the frequency domain, requiring only a simple predetermined setting in data processing, which ensures high adaptiveness for different databases.
To achieve accurate ECG authentication, a 1D Multi-scale Convolutional Neural Network (1D MCNN) is designed for the required authentication information, where a series of 1-D convolutional neural networks are utilized to extract the temporal and spectral features from multiple 1-D components after ECG decomposition.
To balance the accuracy, generalization ability, and efficiency in ECG authentication, we split the ECG signals into segments without using additional fiducial detectors and processed them without using noise removal tools. This allows us to significantly reduce computational costs while achieving high authentication accuracy and generalization ability.
The rest of this paper is organized as follows. Section 2 briefly reviews some state-of-the-art literature, and Section 3 introduces the required preliminaries. In Section 4, we introduce our authentication system. In Section 5, we describe the experimental design and training strategy of our model, and present a comparative summary of the performance evaluation of our implemented system and six other competing approaches [1,2,10,14,49,51]. Finally, we conclude this paper in Section 6.
This section will now briefly review existing ECG authentication approaches, with and without fiducial detectors.
Fiducial methods
Fiducial features refer to the morphological characteristics that ECG signals exhibit, including the amplitude difference, the time duration, and the location of signal peaks.
Fiducial features could be directly used as biometrics for human identification. For example, Ivanciu et al. proposed an ECG authentication system among different sensors of body area networks [27]. They used time durations of fiducial points between two ECG segments to produce an
Arteaga Falconi et al. proposed an ECG-based authentication scheme for mobile environments [5]. In their work, fiducial points including LP, P, Q, R, S, T, and TP were detected for feature extraction, and a hierarchical algorithm was designed for ECG authentication. Remarkably, the acquisition time for ECG signals was reduced to four seconds, though the accuracy of their ECG authentication scheme needs to be improved. Tang et al. [43] proposed an efficient authenticated key agreement scheme based on ECG signals using the same fiducial features mentioned above, in which the distance between feature vectors was calculated via distance functions to achieve ECG authentication.
A parallel method for ECG authentication was proposed by Zhang et al. [50]. In their work, time-domain and amplitude features were involved, including PQ, QR, RS, and ST duration scans, and the amplitudes of PQ, PT, and SQ. Similarly, Huang et al. proposed an ECG-based authentication scheme for healthcare systems using multiple fiducial points [22]. Specifically, they computed the average activation time of ECG signals (the average time duration of P waves), the amplitudes of QR and RS, and the QR and RS durations for feature extraction. Moreover, the Kullback–Leibler divergence was utilized in their design for ECG authentication, and they performed their experiments in a noisy environment to simulate real application scenarios.
Meanwhile, fiducial features could also work as the anchor for signal segmentation. For example, the location of R peaks was first detected to align all the ECG segments before further processing [1,8–10,13,14,24,33,47].
In the methods mentioned above, an algorithm is required to detect fiducial points accurately. However, since ECG signals exhibit different morphological characteristics in different ECG databases, current detection algorithms adapt poorly to different databases. Hence, methods without extra fiducial detectors were proposed to enhance the accuracy and generalization ability of ECG authentication.
Non-fiducial methods
In non-fiducial methods, the biometrics are extracted based on temporal analysis or spectral analysis of ECG signals. In recent years, many attempts have been made on non-fiducial methods for ECG authentication. Kim et al. proposed a Multi-variable Regression model for ECG authentication based on Decision Tree (DT) [33]. In their work, the ECG signals were first segmented according to the RR interval. Then, the Machine Learning (ML) technique was adopted to identify individuals. A similar idea was presented in the schemes of Labati et al. [14] and Chu et al. [10]. In their works [10,14], ECG signals were split into pieces according to the location of R peaks, and these segments of a predetermined length were subsequently processed through neural networks to get the final classification results.
In some schemes, image feature extraction methods were used to extract biometrics from ECG signals. Louis et al. proposed a continuous ECG authentication scheme using One-Dimensional Multi-Resolution Local Binary Patterns (1DMRLBP), which was adapted from image-based Local Binary Patterns (LBP) [36]. In their design, the bagging method with a dynamic decision threshold was adopted to realize continuous ECG authentication.
In addition to the methods mentioned above, deep learning techniques have also been employed to enhance the accuracy of ECG authentication. Ibtehaz et al. proposed a deep learning-based method for extracting ECG biometrics from individual heartbeats [23]. In their approach [23], a deep-supervised 1D MultiResUNet model is utilized to locate R peaks, which are then used for ECG heartbeat segmentation. Following this, a 1D MultiRes Block-based model with limited pooling operations is introduced to facilitate closed-environment human identification based on ECG heartbeats. The authentication process is ultimately achieved through a Siamese architecture. This method demonstrates the potential of deep learning in improving the precision of ECG-based authentication systems.
Furthermore, 2D CNNs were adopted for ECG biometrics extraction [13,41]. In the design of Da Silva Luz et al. [13], the ECG signals were divided into two parts, one of which was used to generate spectral images for further classification. Two different CNN models were respectively trained to get the classification scores for both raw ECG segments and ECG spectrogram, and the classification scores were used to get the final authentication results. Srivastva et al. also proposed a novel ECG authentication scheme using 2D CNN [41]. But different from the scheme of Da Silva Luz et al. [13], they directly extracted biometrics from the ECG signal images instead of the ECG spectrograms. Moreover, a fine-tuned 2D CNN model named PlexNet was adopted in their design incorporating both transfer learning and ensemble learning, which enhanced the robustness of their proposed scheme.
Recurrent Neural Networks (RNNs) are a class of deep learning models that can extract biometrics from ECG signals based on temporal analysis. Kim et al. [32] designed an ECG authentication framework using bidirectional Long Short-Term Memory (LSTM) based Deep Recurrent Neural Networks (DRNN). Since a bidirectional architecture was adopted in their design, their method improved the overall performance compared to traditional LSTM-based RNNs.
Another possible choice to extract ECG biometrics without fiducial point detectors is spectral analysis. As a widespread tool for spectral analysis, Fast Fourier Transform (FFT) has been applied in ECG authentication. In the scheme of Zhang et al. [50], they extracted eigenvalues from the FFT spectrum of ECG signals by calculating its Linear Prediction Coefficients (LPC). FFT was also adopted in the design of Belgacem et al. [6] to realize ECG authentication. However, these methods based on FFT could not expose the spectral changes in ECG signals over time.
The Short-Time Fourier Transform (STFT) and Wavelet Transforms (WT) are also efficient spectral methods in signal processing. These methods enable signal analysis from a combined perspective involving both time and frequency domains. Therefore, in ECG authentication, these methods are utilized to extract more valuable information from ECG signals.
In the scheme of Abdeldayem et al. [1], STFT and Continuous Wavelet Transform (CWT) were used to generate the spectral-temporal images for ECG authentication. In their approach, a 2D Convolutional Neural Network (2D-CNN) was established to obtain more dynamic biometrics from ECG signals. Abdeldayem et al. [2] also proposed an approach to extract biometrics from spectral correlation images. In their work, the cyclo-stationary characteristics of ECG signals were utilized for blind segmentation, and the 2D-CNN was also used to classify individuals.
WT was also used in 1-D Multiresolution Convolutional Neural Network (1D MCNN) [49] to achieve ECG authentication. In the design of Zhang et al. [49], blind segmentation and Discrete Wavelet Transform (DWT) were adopted to get the optimal combination of wavelet components. Moreover, they performed some experiments on eight databases to evaluate the efficiency of their design.
Moreover, researchers have made some efforts on other spectral methods to find an optimal resolution. For instance, Zhao et al. proposed an approach on ECG authentication using Generalized S-Transformation (GST) [51]. In their work, the ECG trajectory images were captured using the GetFrame technology to expose the frequency characteristics at different times.
However, the spectral methods mentioned above are non-adaptive, which may limit the generalization ability of ECG authentication. Since CEEMD is inherently adaptive, it could be considered as a solution to this problem. Currently, CEEMD has been applied in noise removal, where it has been shown to address the above issue [30]. Inspired by this, we adopt CEEMD to construct our ECG authentication mechanism with the hope of achieving high generalization ability.
Preliminaries
This section briefly introduces some preliminaries on the signal processing method that we use.
Empirical Mode Decomposition (EMD)
EMD is a self-adaptive method for spectral analysis requiring no predetermined settings. It divides the signal into multiple components of different frequency bands and a residual signal just according to its intrinsic features [21]. These components are named Intrinsic Mode Functions (IMF). There are two constraints of IMFs:
The number of extrema and the number of zero crossings must be equal or differ by at most one.
At any moment, the upper and lower envelopes must be locally symmetric with respect to the time axis.
The result of EMD is given as (1):
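For reference, the decomposition is conventionally written in the EMD literature in the form below. This is only a sketch of the standard form: the symbol names ($x(t)$ for the input signal, $\mathrm{IMF}_i(t)$ for the $i$-th component, $r_n(t)$ for the residual signal, and $n$ for the number of IMFs) are assumptions and may differ from the notation used in Eq. (1).

```latex
x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + r_n(t)
```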
However, EMD suffers from a problem, namely mode mixing, which greatly degrades the decomposition performance.
Mode mixing occurs when an IMF consists of oscillations of dramatically disparate scales.
Usually, there are two scenarios where mode mixing appears:
Signals of different scales appear in one IMF component.
Signals of the same scale are dispersed into different IMF components.
Ensemble Empirical Mode Decomposition (EEMD)
Consequently, EEMD was introduced to avoid mode mixing by exploiting the property that the mean value of white noise is zero [45].
The steps of EEMD are as follows:
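Create a series of noise-added signals by adding white noise to the original signal.
For each noise-added signal, apply EMD to obtain its IMFs and the corresponding residual signal.
For the obtained IMFs and the corresponding residual signals, compute the average values.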
Complementary Ensemble Empirical Mode Decomposition (CEEMD)
During EEMD, the greater N is, the better the mode mixing problem is resolved. However, EEMD is satisfactory only when the residual white noise
Remarkably, it has been shown that the residual white noise produced by CEEMD is much lower than that produced by EEMD [44]. Accordingly, the computational cost is much lower as well.
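The noise-cancelling idea behind CEEMD can be illustrated with a minimal sketch, under several assumptions. The white noise is added in complementary pairs (+w and -w), each noisy copy is decomposed, and the results are averaged. The function `toy_decompose` below is a hypothetical stand-in for EMD (it peels off detail layers with moving averages rather than performing sifting), and the names `ceemd`, `n_pairs`, and `noise_std` are illustrative only. The sketch also assumes that every trial yields the same number of components, which real EMD does not guarantee.

```python
import math
import random

def moving_average(x, width):
    """Centered moving average with clamped edges (a linear smoother)."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def toy_decompose(x, widths=(3, 9)):
    """HYPOTHETICAL stand-in for EMD (not real sifting): returns
    [component_1, ..., component_k, residual], which sum back to x."""
    comps, current = [], list(x)
    for w in widths:
        smooth = moving_average(current, w)
        comps.append([c - s for c, s in zip(current, smooth)])
        current = smooth
    comps.append(current)
    return comps

def ceemd(x, decompose, n_pairs=10, noise_std=0.2, seed=0):
    """Average the decompositions of x + w and x - w over n_pairs noise draws."""
    rng = random.Random(seed)
    total = None
    for _ in range(n_pairs):
        noise = [rng.gauss(0.0, noise_std) for _ in x]
        for sign in (1.0, -1.0):
            trial = decompose([xi + sign * ni for xi, ni in zip(x, noise)])
            if total is None:
                total = [[0.0] * len(x) for _ in trial]
            # ASSUMPTION: every trial yields the same number of components.
            for acc, comp in zip(total, trial):
                for j, v in enumerate(comp):
                    acc[j] += v
    n_trials = 2 * n_pairs
    return [[v / n_trials for v in comp] for comp in total]

# Usage: a two-tone test signal (placeholder, not ECG data).
signal = [math.sin(2 * math.pi * 5 * t / 100) + 0.5 * math.sin(2 * math.pi * 20 * t / 100)
          for t in range(100)]
components = ceemd(signal, toy_decompose)
rebuilt = [sum(c[j] for c in components) for j in range(len(signal))]
residual_noise = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

Because each decomposition sums back to its noisy input, the paired noise cancels in the reconstruction, so `residual_noise` is at the level of floating-point round-off regardless of `noise_std`.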
Convolutional Neural Network (CNN)
CNN is a powerful deep learning model in image processing. Generally, the input of a CNN model is supposed to be a color image consisting of 3 channels, i.e., the RGB channels. In each channel, the color degree values at different picture positions are represented by values between 0 and 255, and all the values form a matrix of size
Our proposed ECG authentication scheme
In this paper, we present an accurate ECG authentication scheme using deep learning. In our design, CEEMD is introduced in the ECG signal processing procedure to achieve high generalization ability. In addition, to enhance the accuracy of ECG authentication, a 1D MCNN is constructed in our proposed scheme to extract more authentication information from multiple 1D IMFs.
The proposed authentication scheme consists of three main phases: the data preprocessing phase, the ECG authentication phase, and the model validation phase. First, the data preprocessing phase is performed to prepare ECG signals for the subsequent biometric extraction procedure. During data preprocessing, the initial ECG signals are first divided into small segments according to duration, and every segment is decomposed into multiple IMFs by CEEMD. Then, the IMFs of each ECG segment are fed to the 1D MCNN model constructed in the ECG authentication phase to obtain the classification results. After that, the model validation process is executed on eight ECG databases to evaluate the accuracy and generalization ability of our ECG authentication scheme. In the subsequent sections, the details of the proposed ECG authentication scheme are illustrated, and the global view of our proposed scheme is displayed in Fig. 2.
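The duration-based segmentation step can be sketched as follows. This is a minimal illustration, not the actual preprocessing code: the function name `blind_segment`, the choice to drop a trailing partial segment, and the 360 Hz sampling rate in the usage example are all assumptions.

```python
def blind_segment(signal, fs, duration_s):
    """Split a 1-D signal into consecutive fixed-duration segments,
    without locating R peaks or any other fiducial point."""
    seg_len = int(round(fs * duration_s))
    if seg_len <= 0:
        raise ValueError("segment length must be positive")
    # ASSUMPTION: a trailing partial segment is discarded.
    n_segments = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]

# Usage: 10 s of a placeholder signal at an assumed 360 Hz, cut into 3 s segments.
fs = 360
recording = [0.0] * (10 * fs)
segments = blind_segment(recording, fs, duration_s=3)
```

Each resulting segment would then be passed to the decomposition step.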

Workflow of the proposed scheme.
In our proposed ECG authentication scheme, the initial ECG signals are divided into multiple segments, each assigned a specific duration to ensure it contains at least one heartbeat. Each of these ECG segments is then decomposed into a set number of intrinsic mode functions (IMFs) using the Complementary Ensemble Empirical Mode Decomposition (CEEMD) method. To illustrate this process, we use a three-second recording from the MIT-BIH Arrhythmia Database (MITDB) as an example. The ECG signal is decomposed into several components that belong to different frequency bands, with their frequencies decreasing from top to bottom. Figure 3 displays all the IMFs obtained through CEEMD and their corresponding frequency spectra obtained through Fast Fourier Transform (FFT). As described in Algorithm 1, a standard number of IMFs is determined for MITDB to normalize the input. For instance, if a segment contains ten IMFs and the standard number is nine, the first two IMFs are combined, as shown in Fig. 4-(a). Conversely, if the standard number is eleven, an extra zero vector equal in length to the first ten IMFs is added as the eleventh IMF, as demonstrated in Fig. 4-(b).

The obtained CEEMD IMFs of an ECG signal and their corresponding frequency spectrum through FFT.

Standardize the number of IMFs

IMF normalization.
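The two cases described above (merging when there are too many IMFs, zero-padding when there are too few) can be sketched as follows. This is an illustration under assumptions rather than a transcription of Algorithm 1: "combined" is taken to mean an element-wise sum, a surplus of more than one IMF is handled by merging all the leading surplus IMFs, and the function name `standardize_imfs` is hypothetical.

```python
def standardize_imfs(imfs, target):
    """Return exactly `target` IMFs: merge the leading IMFs if there are
    too many, or append zero vectors if there are too few."""
    if not imfs or target < 1:
        raise ValueError("need at least one IMF and a positive target")
    length = len(imfs[0])
    extra = len(imfs) - target
    if extra > 0:
        # ASSUMPTION: "combined" means the element-wise sum of the
        # first extra + 1 IMFs (the highest-frequency components).
        head = imfs[:extra + 1]
        merged = [sum(values) for values in zip(*head)]
        return [merged] + [list(imf) for imf in imfs[extra + 1:]]
    padding = [[0.0] * length for _ in range(-extra)]
    return [list(imf) for imf in imfs] + padding

# Usage: ten toy IMFs of length 4, where IMF k holds the constant value k.
ten = [[float(k)] * 4 for k in range(1, 11)]
nine = standardize_imfs(ten, 9)      # first two IMFs merged
eleven = standardize_imfs(ten, 11)   # one zero vector appended
```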
Once all the required ECG signals from the eight databases have been processed as described above, they are ready for the ECG authentication procedure.
In our work, a 1D MCNN is established to achieve accurate ECG authentication. The complete architecture of the proposed model is illustrated in Fig. 5.

The architecture of proposed CNN model.
In the proposed model, the input IMFs decomposed from ECG segments are supposed to be scaled to a value between 0 and 1. Subsequently, three blocks are involved in extracting the biometric information, as shown in Table 1. Five different layers are included inside these three blocks, i.e., the convolution layer, the batch normalization layer, the activation layer, the pooling layer, and the classifier layer. The detailed setup for each layer is introduced as follows.
Convolution layer. In the convolution layer, the filter kernel produces a feature map through the convolution process. In our study, the kernel size is set to 1×3 for each convolution layer.
Batch Normalization layer. Before the activation layer, the output of the convolution layer is sent to the Batch Normalization layer to accelerate the training process and to avoid vanishing gradients [20] and exploding gradients [39]. Furthermore, the processed data is normalized and then scaled and shifted so that it can retain its original probability distribution.
Activation layer. A non-linear activation function is used in this layer to decide whether a neuron is activated. Among the available activation functions, the ReLU function is chosen to reduce overfitting and increase computational speed.
The scale of each layer in the proposed network
1 the 1-D Convolution layer.
2 the 1-D Batch Normalization layer.
3 the Fully Connected layer.
The ReLU function is given by:
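The standard definition of ReLU is restated here for convenience; the variable name $x$ (the input to the activation layer) is an assumption about the notation.

```latex
\mathrm{ReLU}(x) = \max(0, x)
```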
Pooling layer. A pooling layer is designed to expand the receptive field and guarantee the translation equivariance property of CNN [25]. There are several strategies for pooling, and the max-pooling strategy is utilized in our design instead of mean-pooling or min-pooling since it could help remove redundant information.
Classifier layer. The classifier layer is utilized to judge the category of the output vector. In this layer, two fully connected layers, a dropout layer, and a softmax layer are involved. 1) Two fully connected layers are used to get a 1D vector before classification. The length of this 1D vector is set as the number of categories in the corresponding database. 2) A dropout layer is added to the classifier layer to avoid overfitting on a relatively small database. In our design, the dropout rate of this layer is set to 0.3. 3) The softmax layer utilizes the softmax function to obtain the classification results.
The softmax function is given by:
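The standard definition of softmax is restated here for convenience. The symbols are assumptions about the notation: $z$ denotes the output vector of the last fully connected layer and $K$ the number of categories in the corresponding database.

```latex
\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad i = 1, \dots, K
```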
As the ECG signal is divided into several CEEMD IMFs, how to process all these IMFs is a challenge. In our study, IMF values are first normalized to between 0 and 1. Then, each IMF is fed to Block1 and Block2 in order, as shown in Fig. 6, and the output is accumulated as a vector. Subsequently, these vectors are processed by two fully connected layers and a dropout layer. Finally, the classification results are obtained by using the softmax function.

The procedure to process multiple IMFs in a network.
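The flow described above (normalize each IMF, pass it through two blocks in order, and accumulate the outputs into one vector) can be sketched in plain Python. This is a simplified illustration, not the proposed network: batch normalization and the fully connected, dropout, and softmax layers are omitted, each block uses a single channel, the same kernels are shared across all IMFs, and the kernel values and toy IMFs are hypothetical.

```python
def min_max(x):
    """Scale values to [0, 1]; a constant vector maps to all zeros."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0.0] * len(x)
    return [(v - lo) / (hi - lo) for v in x]

def conv1d(x, kernel):
    """Valid 1-D cross-correlation (what deep learning libraries call convolution)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]

def relu(x):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]

def max_pool(x, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

def block(x, kernel):
    """Simplified block: convolution -> ReLU -> max pooling (batch norm omitted)."""
    return max_pool(relu(conv1d(x, kernel)))

def extract_features(imfs, kernel1, kernel2):
    """Run every IMF through two blocks in order and concatenate the outputs."""
    features = []
    for imf in imfs:
        features.extend(block(block(min_max(imf), kernel1), kernel2))
    return features

# HYPOTHETICAL kernels (size 3) and toy IMFs, chosen only to make the sketch runnable.
imfs = [[float((i * (k + 1)) % 7) for i in range(20)] for k in range(3)]
features = extract_features(imfs, kernel1=[0.25, 0.5, 0.25], kernel2=[-1.0, 0.0, 1.0])
```

With three IMFs of length 20, each IMF shrinks to 18, 9, 7, and finally 3 values, so `features` holds 9 entries that would then be passed to the classifier layer.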
As previously stated, our model validation utilizes eight databases. In our approach, the ECG authentication process is simplified to a closed-environment classification problem. This refers to a multi-class classification task where all potential classes (or individuals, in the context of ECG authentication) are known a priori and included in the training set. The identification of the login subject is achieved by incorporating a softmax layer into our proposed 1D MCNN model. The number of neurons in the output layer corresponds to the total number of subjects in the closed environment.
To ensure a comprehensive evaluation of our proposed ECG authentication scheme, we employ a 60–20–20 split for our data. Specifically, 60% of the data is allocated for training the model, 20% is utilized for validation during the training process, and the remaining 20% is reserved for testing the performance of the trained model.
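A 60–20–20 split of this kind can be sketched as follows. The shuffling, the fixed seed, and the function name `split_60_20_20` are assumptions; in particular, whether the split is performed per subject or over the pooled segments is not specified here, and this sketch simply splits one list of items.

```python
import random

def split_60_20_20(items, seed=0):
    """Shuffle and split items into train/validation/test partitions."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = len(shuffled) * 60 // 100
    n_val = len(shuffled) * 20 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Usage: 100 placeholder segment indices.
train, val, test = split_60_20_20(range(100))
```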
Implementations and discussion
Databases
In our design, eight different databases are used, involving excerpts collected in different conditions, to achieve high adaptiveness and generalization ability for ECG authentication. These databases are all obtained from PhysioNet [18], including MIT-BIH Atrial Fibrillation Database (AFDB) [37], Combined measurement of ECG, Breathing and Seismocardiograms (CEBSDB) [16], European ST-T Database (EDB) [42], Fantasia Database (Fantasia) [28], MIT-BIH Arrhythmia Database (MITDB) [38], Normal Sinus Rhythm RR Interval Database (NSRDB) [18], MIT-BIH ST Change Database (STDB) [3], and MIT-BIH Malignant Ventricular Ectopy Database (VFDB) [19].
The brief information about utilized databases is given in Table 2, and the details are listed as follows:
ECG databases
N = Normal, AbN = Abnormal.
∗ Since one recording of Fantasia is sampled at 333.3 Hz, which differs from the others, only 39 of its recordings are used.
The above databases were mainly acquired by Holter Monitor and the BIOPAC MP36 acquisition system. A Holter Monitor is a wearable device that is widely used for ambulatory electrocardiography recording. It requires only a few ECG leads and electrodes to monitor the heart activity for a long period of time. Among our utilized databases, AFDB, EDB, MITDB, NSRDB, STDB, and VFDB were collected via Holter Monitor, and most of them were sampled from long-term Holter test data at Beth Israel Hospital Arrhythmia Laboratory.
A BIOPAC MP36 acquisition system is a four-lead physiological signal monitor, two leads of which are used to measure ECG signals; CEBSDB was collected using this device. In particular, during the ECG measurement of CEBSDB, the subjects were asked to keep still and supine, and the monitoring electrodes were fixed by foam tapes and sticky gels (3M Red Dot 2560).
In our design, we adopted the raw data from the eight public databases, which are directly exported from the acquisition devices. Notably, each ECG recording in all ECG databases mentioned above contains two ECG signals from different leads. Therefore, to achieve higher convenience and usability of our proposed approach, we only utilized the first-lead ECG signals from each ECG recording.
Now, we introduce the hardware setup and the corresponding software environment in our experiments, as well as the training strategy.
In our work, the entire procedure of our experiment was carried out on a Lenovo laptop with a 2.30 GHz Intel(R) Core(TM) i7-10875H Central Processing Unit (CPU) and 16.0 GB of RAM. The Graphics Processing Unit (GPU) in this laptop is an NVIDIA RTX 2060 with 6 GB of memory, and the operating system is Windows 10 64-bit. Both data acquisition and preprocessing were performed in MATLAB R2017a, and the subsequent CNN model was built using torch 1.7.1 under Python 3.8.5. PyTorch was chosen for our CNN model because of its high flexibility and good computational performance [7].
In our design, all parameters, input data, and local outputs were processed on the GPU to speed up the training process. With the help of CUDA, developed by NVIDIA [11,17], a 5x to 10x speedup was achieved compared to the CPU.
During the training process, the cross-entropy function was utilized to calculate the loss after each training iteration. In our experiments, the backpropagation algorithm [48] was used to train this model, and the Adam optimizer [12] was chosen to achieve faster convergence. We also used a dynamic learning rate that decays to one-tenth of its previous value every 100 epochs to control the loss, with the initial value set to 0.0001. In our experiments, we adopted a mini-batch strategy to train the model. First, the input data was divided into multiple batches, with the batch size set to 50. Since only the data of the current batch occupies GPU memory, our strategy required less GPU memory. Second, after each batch, the model's parameters were updated through the optimizer, which improved the convergence of the model.
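The learning-rate schedule and mini-batch strategy described above can be sketched in plain Python. The function names `learning_rate` and `mini_batches` are illustrative; the values 0.0001, one-tenth, 100 epochs, and 50 are those stated above. In PyTorch, a schedule of this shape could be expressed with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.1)`, although it is an assumption that the experiments were configured that way.

```python
def learning_rate(epoch, initial=1e-4, decay=0.1, step=100):
    """Step decay: multiply the rate by `decay` once every `step` epochs."""
    return initial * decay ** (epoch // step)

def mini_batches(samples, batch_size=50):
    """Yield consecutive batches; the last batch may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

# Usage: rates at a few epochs, and batch sizes for 120 placeholder samples.
rates = [learning_rate(e) for e in (0, 99, 100, 250)]
batch_sizes = [len(b) for b in mini_batches(list(range(120)))]
```

The rate stays at 0.0001 for epochs 0 to 99, drops to roughly 0.00001 at epoch 100, and so on; the 120 samples are split into batches of 50, 50, and 20.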
Results evaluation
In binary classification, since there are only a negative category and a positive category, four outcomes can be used to describe the classification results, namely: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN), as shown in Fig. 7.

Confusion matrix for binary authentication.
However, in our scheme, ECG signals are classified into multiple classes. Therefore, the confusion matrix is computed in a one-versus-rest manner, which reduces the multi-class classification to a binary classification for each class. For example, given three classes, namely Tom, John, and Jack, if Tom is set as the positive class, John and Jack are regarded as the negative class. After this, TP, TN, FP, and FN for multi-class classification can be obtained based on the positive and negative classes.
In our design, we adopt three performance measures to analyze our scheme’s generalization ability and authentication accuracy according to TP, TN, FP, and FN.
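The one-versus-rest counting and the resulting measures can be sketched as follows. The formulas for ACC, FAR (false acceptance rate), and FRR (false rejection rate) are the commonly used definitions and are an assumption about the exact expressions adopted here; the labels and predictions in the usage example are invented for illustration.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Count TP, FP, TN, FN with `positive` as the positive class and
    all other classes pooled as the negative class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth != positive and pred != positive:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def acc_far_frr(tp, fp, tn, fn):
    """ASSUMED standard definitions; a zero denominator yields 0.0."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total if total else 0.0
    far = fp / (fp + tn) if fp + tn else 0.0
    frr = fn / (fn + tp) if fn + tp else 0.0
    return acc, far, frr

# Usage: invented labels for the Tom/John/Jack example, with Tom as positive.
y_true = ["Tom", "Tom", "John", "John", "Jack", "Jack"]
y_pred = ["Tom", "John", "John", "Tom", "Jack", "Jack"]
counts = one_vs_rest_counts(y_true, y_pred, positive="Tom")
metrics = acc_far_frr(*counts)
```

Repeating the computation with each subject as the positive class and averaging the results would give per-database values.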
In addition to evaluating the performance of our proposed scheme, we also evaluated the performance of six other competing approaches, namely those presented in [1,2,10,14,49,51], using eight ECG databases. These databases are the three ECG databases (i.e., CEBSDB, Fantasia, and NSRDB) collected from healthy individuals, and five ECG databases from patients with heart diseases (i.e., AFDB, EDB, MITDB, STDB, and VFDB).
In our study, we conducted a series of experiments to investigate the impact of ECG segment duration on the accuracy of our proposed authentication scheme. We tested three different segment durations (one second, two seconds, and three seconds) across eight different ECG databases. As detailed in Table 3, when the ECG segment duration was one second, the accuracy (ACC) for the AFDB, CEBSDB, Fantasia, MITDB, STDB, and VFDB databases reached 100%. The NSRDB and EDB databases achieved slightly lower ACC values of 98.75% and 99.22%, respectively. When the ECG segment duration was increased to two seconds, all databases maintained high ACC values. Specifically, AFDB, CEBSDB, MITDB, STDB, and VFDB remained at 100%, while Fantasia achieved an ACC of 99.41%. Notably, both NSRDB and EDB showed improvement, with ACC values of 99.75% and 99.64%, respectively. Upon further increasing the ECG segment duration to three seconds, all databases achieved an ACC of 100%, with the exception of EDB, which reported an ACC of 99.94%. These results demonstrate that our proposed ECG authentication scheme maintains high accuracy across different ECG segment durations and various databases. Importantly, a three-second ECG segment duration yielded the best overall performance for our proposed scheme. Therefore, in our subsequent experiments and evaluations, we use a three-second duration for ECG segments.
Accuracy (ACC) on test databases with different ECG segment durations
Table 4 summarizes the ACC, FAR, and FRR of our scheme for each ECG database. And for normal and abnormal ECG databases, the average values of ACC, FAR, and FRR are also given in Table 4. As shown in Table 4, the ACC for three normal databases are
The ACC, FAR, and FRR on test databases

The radar figure in terms of ACC, FAR, and FRR on test databases.
Comparison with state-of-the-art schemes
1 Databases: (count) Type; N = Normal, AbN = Abnormal.
1 Preprocessing: AS = Accurate Segmentation, PC = Phase Correction, NR = Noise Removal, BS = Blind Segmentation.
We further compare our scheme with seven related works [1,2,10,14,23,49,51] based on several key metrics: the ECG databases used, the model or method employed, the number of heartbeats or duration for input ECG segment, the signal preprocessing strategy, and the average authentication accuracy (ACC).
As delineated in Table 5, each scheme employs a unique combination of models and methods. Zhang et al. [49] utilized auto-correlation and Discrete Wavelet Transform (DWT) to generate multiple wavelet components, which were subsequently processed by a 1D CNN model. Abdeldayem et al. [1] employed Short-Time Fourier Transform (STFT) and Continuous Wavelet Transform (CWT) to generate figures from ECG segments for further biometric extraction via a 2D CNN model. In their subsequent approach [2], they utilized Spectro-Correlation Figures (SCF) derived from ECG segments for feature representation, coupled with a 2D CNN for ECG authentication. Labati et al. [14] and Chu et al. [10] streamlined their approach by directly inputting ECG segments into 1D CNN models. Zhao et al. [51] utilized Generalized S-transformation (GST) to generate ECG spectral figures, which were then processed by a 2D CNN to extract biometrics. Ibtehaz et al. [23] employed different 1D CNN models for both R peak detection and ECG identification, along with a Siamese architecture for the authentication process. Our proposed scheme differs by using CEEMD for signal decomposition and a 1D CNN model for feature extraction and classification, leveraging the strengths of CEEMD in handling non-linear and non-stationary ECG signals and the power of 1D CNNs in learning biometric features.
Signal preprocessing is a vital step in ECG authentication. As indicated in Table 5, the approaches vary in their preprocessing methods. Some, like Abdeldayem et al. [1], Labati et al. [14], Chu et al. [10], and Ibtehaz et al. [23], utilize Accurate Segmentation (AS) to locate ECG heartbeats. Others, including our proposed scheme and the work of Abdeldayem et al. [2], adopt Blind Segmentation (BS), which simply splits ECG signals according to the ECG segment duration, enhancing efficiency and robustness. Regarding Noise Removal (NR), only Abdeldayem et al.’s work [2] and our proposed scheme do not require this process. Furthermore, Zhang et al. and Labati et al. incorporated an additional Phase Correction (PC) operation in their designs [14,49]. Our proposed ECG authentication scheme requires fewer signal preprocessing operations than these approaches [1,10,14,23,49,51], which benefits efficiency and generalization ability.
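Blind segmentation, as described above, can be illustrated with a minimal sketch that cuts a signal into fixed-duration windows without locating heartbeats. The function name `blind_segment`, the non-overlapping windows, and the dropping of a trailing partial window are assumptions made for illustration; the sampling rate in the usage line is a placeholder rather than a value from any of the databases.

```python
def blind_segment(signal, fs_hz, duration_s=3.0):
    """Split a 1-D signal into consecutive fixed-duration segments.

    No R-peak detection or alignment is performed: the signal is cut
    every `duration_s` seconds. A trailing partial segment is dropped
    (an assumption of this sketch).
    """
    seg_len = int(round(fs_hz * duration_s))
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")
    n_segments = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


# Placeholder signal: 1000 samples at an assumed 128 Hz sampling rate.
samples = list(range(1000))
segments = blind_segment(samples, fs_hz=128, duration_s=3.0)
print(len(segments), len(segments[0]))
```

Because the cut points depend only on the sampling rate and the chosen duration, the same routine applies unchanged to normal and abnormal recordings.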
The efficiency of a scheme can also be indicated by the number of heartbeats or duration required for the input ECG segment. According to Table 5, the requirements vary across different schemes. For instance, Abdeldayem et al. [1] required 10 heartbeats for input segments, which is relatively high compared to other methods. On the other hand, Zhang et al. [49], Abdeldayem et al. [2], and Zhao et al. [51] processed ECG segments of 2 or 3 seconds, demonstrating a more efficient use of data. Ibtehaz et al. [23] adopted a flexible approach, requiring 3–6 heartbeats for the authentication process. Our proposed scheme stands out by requiring only 3 seconds of ECG data for input segments, striking a balance between data efficiency and authentication performance.
The robustness of an ECG authentication scheme is often demonstrated by the ECG databases it can work with. For instance, Labati et al. [14] and Zhao et al. [51] used only two ECG databases in their designs, while Ibtehaz et al. [23] and Chu et al. [10] each used four. Zhang et al. [49] and Abdeldayem et al. [1] used eight ECG databases in their schemes, and Abdeldayem et al.’s subsequent design [2] added a ninth ECG database to the existing eight. In our design, we used eight ECG databases, demonstrating the robustness of our scheme across various datasets, including both normal and abnormal ECG datasets.
In terms of accuracy, among these seven schemes, the lowest ACC (93.50%) was reported by Zhang et al. [49], as it was a relatively early attempt at ECG authentication based on spectral methods. The schemes of Abdeldayem et al. obtained an average ACC of 95.60% [2] and 97.86% [1], respectively. The schemes of Chu et al. [10] and Zhao et al. [51] reported an average ACC of 98.08% and 96.63%, respectively. The highest accuracy was achieved by Labati et al.’s scheme [14] and Ibtehaz et al.’s scheme [23], both reaching an ACC of 100.00%. Our proposed scheme follows closely with an ACC of 99.99%.
While these schemes have demonstrated high accuracy, it is important to note that our proposed scheme provides higher generalization ability and efficiency. This is because our scheme requires fewer steps for signal preprocessing and fewer heartbeats for input segments compared to Labati et al.’s scheme [14] and Ibtehaz et al.’s scheme [23].
In conclusion, while there are several effective methods for ECG authentication, our proposed scheme demonstrates a balance of high accuracy, efficiency, and practical applicability, making it a promising approach for real-world applications.
Potential application scenarios
In recent years, e-healthcare systems have become part of people’s daily lives, enabling users to access medical services remotely from home. With e-healthcare systems, the physical status of the users (e.g., blood pressure, temperature, and physiological signals such as ECG and PPG) is monitored and transmitted to the medical server, so that users can obtain professional feedback from experts according to their real-time condition. To protect the privacy of health data, the identity of users should be verified before the health data is transmitted in e-healthcare systems. In these scenarios, our proposed ECG-based authentication method can provide convenient authentication without any additional actions by the patients.
Figure 9 presents a potential application scenario of our proposed ECG-based authentication scheme. In this scenario, the patient’s physiological signals are collected through multiple medical sensors and forwarded to the control unit. Before transmitting these signals to the corresponding medical server, the control unit performs our proposed ECG-based authentication scheme to verify the identity of the patient without introducing additional operations. If the verification is passed, the patient’s health data is encrypted and then transferred to the medical server. Otherwise, the subsequent transmission is terminated. With our proposed ECG-based authentication scheme, the patients are not required to input extra biometrics to complete the authentication process, which provides more convenient authentication, especially for critically ill patients.
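The control-unit logic in this scenario (verify first, then encrypt and transmit, otherwise terminate) can be summarized in a short sketch. All names here (`forward_if_verified`, `verify`, `encrypt`, `send`) are hypothetical placeholders supplied by the caller; in particular, the byte-reversal lambda in the usage example stands in for a real cipher and provides no security.

```python
from typing import Callable, Sequence


def forward_if_verified(
    ecg_segment: Sequence[float],
    health_data: bytes,
    verify: Callable[[Sequence[float]], bool],
    encrypt: Callable[[bytes], bytes],
    send: Callable[[bytes], None],
) -> bool:
    """Gate transmission of health data on ECG-based verification.

    Returns True if the data was encrypted and sent, False if the
    verification failed and the transmission was terminated.
    """
    if not verify(ecg_segment):
        return False              # verification failed: nothing is sent
    send(encrypt(health_data))    # verified: encrypt, then transmit
    return True


# Usage with stand-in callables (illustration only).
outbox = []
ok = forward_if_verified(
    ecg_segment=[0.1, 0.2, 0.3],
    health_data=b"hr=72",
    verify=lambda seg: True,      # placeholder for the ECG authenticator
    encrypt=lambda d: d[::-1],    # placeholder, NOT real encryption
    send=outbox.append,
)
print(ok, outbox)
```

Since the ECG segment is already being collected for monitoring, the verification step reuses it and the patient performs no extra action.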
In addition, our proposed ECG authentication method can also be applied to smartwatches and bracelets to achieve convenient authentication, when the ECG data collection technology satisfies certain accuracy requirements.

A potential application scenario of our proposed ECG-based authentication scheme.
We presented our proposed deep learning-based ECG authentication approach, in which a blind segmentation strategy is used to segment ECG signals and CEEMD extracts the relevant biometric information from these ECG segments. In addition, a novel method was proposed to normalize the IMFs after CEEMD. Since CEEMD requires only a simple predetermined setting for data processing, our proposed scheme is highly generalizable. Furthermore, we constructed a 1-D MCNN to extract biometrics from the IMFs, and near-perfect accuracy (an average ACC of 99.99%) was achieved on eight ECG databases, including the abnormal ECG databases. Last but not least, by avoiding the use of AS, PC, and NR, the computational cost is reduced.
Future research includes implementing a prototype of our proposed scheme in collaboration with a hospital so that we can evaluate its real-world utility and identify any potential constraints. In addition, we plan to deploy our proposed scheme on commercially available smartphones or bracelets to further investigate ECG authentication.
Acknowledgment
The research was financially supported by the National Natural Science Foundation of China (No. 62172303), the State Key Laboratory of Geo-Information Engineering and Key Laboratory of Surveying and Mapping Science and Geospatial Information Technology of MNR, CASM (No. 2023-04-04), the Key Laboratory of Data Protection and Intelligent Management, Ministry of Education, Sichuan University and also the Fundamental Research Funds for the Central Universities (No. SCU2023D008), Yunnan Key Laboratory of Blockchain Application Technology (No. 202105AG070005, No. YNB202303). The work of K.-K. R. Choo was supported only by the Cloud Technology Endowed Professorship.
