Abstract
We analyzed several QRS detection algorithms in order to build a quality industrial beat detector, intended for a small, wearable, one channel electrocardiogram sensor with a sampling rate of 125 Hz, and analog-to-digital conversion of 10 bits. The research was a lengthy process that included building several hundred rules to cope with the QRS detection problems and finding an optimal threshold value for several parameters. We obtained 99.90% QRS sensitivity and 99.90% QRS positive predictive rate measured on the first channel of rescaled and resampled MIT-BIH Arrhythmia ECG database. Even more so, our solution works better than the algorithms for the original signals with a sampling rate of 360 Hz and analog-to-digital conversion of 11 bits.
Introduction
Advances in the Internet of Things (IoT) field have encouraged researchers to intensify their focus on Electrocardiogram (ECG) processing, especially for wearable devices. This field has become a popular research topic in biomedical engineering [1]. Our primary focus is, on building a quality industrial QRS detector for a wearable ECG sensor that is significantly better than existing robust algorithms for QRS detection, which are inconvenient for mobile devices with limited resources [2].
The performance of the QRS detector is evaluated by calculating how many real QRS peaks are found (QRS sensitivity, denoted as
One of the most cited papers for QRS detection built for small devices with limited resources is the Pan and Tomkins algorithm [4]. Its robustness lies in the fact that it runs fast enough to be used in real time and can cope with noisy signals. However, the performance of this algorithm depends on bit resolution in the AD conversion. In our case, when using one channel ECG sensor and smaller sampling frequencies, its performance was not satisfactory, especially for signals with smaller amplitudes. It was not a good solution for us because of these factors.
Another alternative is the Hamilton algorithm [5]. Compared to the Pan and Tompkins algorithm, it is quite similar but uses different filters and decision rules. It is a stable solution, but, still is unable to cope with small amplitudes, or variations in consecutive amplitude levels, especially, when using a smaller bit resolution in AD conversion.
Physionet.org [6] is a very comprehensive resource where one can find several QRS detection algorithms, including Wavedet, gqrs, wqrs, and sqrs. They represent simple and fast algorithms demanding a small number of resources and obtain high sensitivity and positive predictive rate values. However, they lack the beat classification, and the obtained sensitivity and the positive predictive rate are also considered to be lower than the demands of a quality industrial QRS detector for a wearable one channel ECG sensor.
After these initial efforts, the attention of researchers gradually focused on developing more sophisticated QRS detection algorithms, including Machine Learning and other methods, as described in Section 2 (related work). Although some of the new approaches achieved better performance, they generally require computationally intensive algorithms, not suitable for smartphones that collect continuous ECG data from wearable sensors.
In our research, we improve Hamilton’s algorithm [5] in order to make it efficient for industrial application. The improvement was a rather long process due to the exponential nature of the effort to improve the algorithm. The closer you are to the margin of 100%, the more effort is needed for a very small improvement of the performance. We introduced several hundred rules to cope with the identified problems in QRS detection, and several thousand tests to tune parameters, and threshold values for identified solutions. Some threshold values obtained good performance on some test file while performing badly on others. When we fine-tuned some parameters, it so happened that some of the rules did not work on other test datasets, which was even more challenging.
Finally, the results proved that we have realized our goals. We reached higher values than all other published results, though we were working with almost three times less sampling frequency and half of the AD conversion bit resolution.
The structure of this paper is organized as follows. Section 2 discusses the related work in the area of QRS detection algorithms and Section 3 presents the background. Analysis of problems are presented in Section 4, and our approach to improving the algorithms in Section 5. The results from the experiments are evaluated in Section 6 and compared with other solutions. Section 7 is devoted to conclusions and future work.
Related work
QRS detection algorithms generally follow the same routine [7], starting with Digital Signal Processing (DSP) filtering that eliminates noise and baseline wander. Then the output is matched to a set of thresholds. The algorithms mainly differ in the way they calculate the threshold values and implement decision rules, which is another layer implemented after thresholding [8].
Long ECG recordings are generally occupied with noise in the form of a subtle deviation of the heart rhythm. In addition, signal quality changes to alternating changes in the wave amplitude. Unless noise is eliminated, detection of such QRS complexes becomes difficult, and decreases the accuracy [9]. It is important to state that any algorithm designed for quality industrial QRS detection must be adaptable to any type of noise.
There are several survey papers [9, 10] which give a comprehensive overview of popular QRS detection methods. In summary, published QRS detection algorithms are based on the following techniques:
Differentiation (derivation), where the difference between the current and previous samples is calculated, as a way of identifying the slope, and then it is compared to a given threshold value including the Pan & Tompkins algorithm [4], Hamilton’s algorithm [5], or other relevant approaches [11, 12, 13];
Pure DSP algorithms, where fundamental DSP filters with different characteristics are combined to produce a bandpass filter, with the aim to eliminate noise, and filter the signal so that a threshold will determine the beats [14, 15, 16, 17, 18];
Pattern Recognition algorithms, where the signal data is matched with predefined patterns and a waveform is detected in case of similarity within given constraints of the amplitude and slopes [19, 20, 21, 22, 23];
Neural Network, Multilayer Perceptron (MLP), Radial Basis Function (RBF), and Learning Vector Quantization (LVO) networks are used to adaptively predict the location of the next peak [24, 25, 26, 27, 21];
Digital Wavelet Transformation (DWT), where the signal is decomposed to a certain level of scales, and then recomposed, which effectively reduces noise. Then a threshold is applied to select proper peaks [28, 29, 30, 31, 32];
Genetic Algorithms are used to optimize the preprocessing polynomial filter. The ECG signal is compared to an adaptive threshold and the parameters are optimized with a genetic optimization approach [33];
Hidden Markov Models (HMM), used to train the probability function varying according to the hidden Markov chain, and then the model predicts the current state, which could be QRS complex. P and T waves can also be computed [34, 35, 36];
Hilbert Transform, where Hilbert transform of ECG signal is calculated by Fast Fourier Transform (FFT), and that is used for calculating the signal envelope [37, 38, 39]; and
Phasor Transformation, where each ECG sample is converted into a phasor to correctly manage P, and T waves, by definition having lower amplitudes than an R-peak with low computational cost, and then compares it against a threshold [40].
High performance QRS detector directly affects the amount and the quality of valuable information on ECG. QRS detection is the initial step for further ECG analysis.
In this section, we will explain the evaluation metrics and give an overview of the original Hamilton’s QRS detector algorithm.
Performance measures
The benchmarks used in our testing methodology are the same used in the IEC 60601-2-47 standard for particular requirements for the safety, including an essential performance of ambulatory electrocardiographic systems, and ANSI/AAMI EC57:2012 for Testing and Reporting Performance Results of Cardiac Rhythm and ST Segment Measurement Algorithms. These standards use the MIT-BIH ECG arrhythmia database [3], and the American Heart Association’s (AHA database) [41].
MITDB contains half-hour ECG recordings for 48 anonymized persons, and only 44 records exclude those that contain paced beats. These recordings are publicly available on the physionet.org web site [6]. The recording frequency is 360 samples per second, per channel, with a 11-bit resolution. Even though each recording contains two-channels, we used the first channel, identified as ML II, in most of the records.
In addition, we follow the requirements according to the standard IEC 60601-2-47:2012 for medical electrical equipment, particularly, requirements for essential performance of ambulatory electrocardiographic systems. According to these requirements, any calculated peak is considered as detected if it is at most 150 ms away from the real beat.
A detected QRS is denoted to be True Positive (TP), if the QRS detector has found a QRS closer than 150 ms from the one which is annotated. A False Negative (FN) is a missed QRS, or if the QRS detector has found a QRS outside the 150 ms perimeter, while a False Positive (FP) is an erroneously detected QRS (extra found).
The commonly used performance measures are sensitivity and positive predictive rate, calculated by Eq. (1).
In addition, to find an optimal value of a parameter, we provided a lot of test experiments, and calculate the number of Total Errors by Eq. (2), as a sum of errors denoted by FP and FN. The smaller the errors are, the better performance is achieved.
EP Limited’s open source software for arrhythmia detection serves as a basis for this research [42, 5]. It has a complete C-code implementation of Hamilton’s algorithm, with three different detectors, and a simple beat classifier. Two of the detectors are for general-purpose, whereas the third one is for environments with a small amount of memory.
Algorithmic details are theoretically provided in their original work [5]. Figure 1 represents the conceptual level for the two phases, and the high-level steps conducted for each of them.
Hamilton’s approach for QRS detection.
The algorithm starts with a low pass filter (LPF) with a cutoff frequency of 16 Hz. Then, the signal is passed through a high pass filter (HPF) of 8 Hz. Both filters have the effect of a band pass filter (BPF). Then, the slope of the signal is calculated with a differentiation method (
After eliminating the noise in the DSP filtering phase, the algorithm continues with the peak detection phase. It already has two thresholds for the AVG signal, classified as a
static threshold with a fixed value, and
dynamic adaptive threshold (DAT) which is affected by the amplitudes of the latest peaks.
The original algorithm sets the static threshold (STHR) at value
When a new local peak is found, the dynamic adaptive threshold is calculated by taking the mean values for real peaks and noise peaks into account. The mean value is computed by Eq. (3).
Let the mean value for real beats and noise peaks be denoted as qmean and nmean respectively, and also TH be the constant multiplier (with a default value of 0.3125), then the DAT is calculated by Eq. (4).
When a new local maximum is detected by calculating an AVG value, both of the thresholds are compared to this value. If the value is higher than the static threshold, then it is considered to be a potential peak, otherwise, it is an artifact. It is classified as a noise peak if the calculated value is lower than the dynamic peak, and as a real beat if it is higher than the dynamic adaptive threshold.
Figure 2 presents both dynamic and static thresholds. Note, that detected peaks A1, A2,
Detecting artifacts, noise and real peaks based on values of static and dynamic thresholds in the original Hamilton’s algorithm presented on signal extract over MITDB record 124 (1046 sec).
A discrepancy in peak amplitudes may introduce bad detection. We have identified two cases when this happens:
a sequence of low-amplitude peaks after an isolated high-amplitude peak; an isolated low-amplitude peak after a sequence of high-amplitude peaks.
Furthermore, apart from fine-tuning the threshold value, and expecting lower performance on cropped signals, we analyzed the segments where the algorithm showed lower sensitivity, and specificity even though the signal was not contaminated with noise. The conclusion was that lower performance was obtained for specific segments, and the problems can be classified as:
a mixture of low and high-amplitude peaks; artifact elimination; and wrong R-peak location.
Figure 3 presents the case when an isolated high-amplitude peak is preceded and followed by a sequence of low-amplitude peaks. An 8-second segment extract of MITDB record 114 is displayed including the original signal and outputs after executing each of the processing steps BPF, ABS, and AVG.
Signal extracts of executing the Hamilton algorithm over the MITDB record 114 (240 sec) a) Original ECG signal; b) Output after bandpass filtering; c) Output after differentiation and absolute calculation; d) Output after average over an 80 ms window.
Figure 3d identifies the static and dynamic thresholds and shows the case where the beats between the two high amplitude are considered as artifacts, although they should be real QRS beats.
The reason for bad detection of low-amplitude peaks after high-amplitude peaks is primarily due to the high level of static threshold. Even if one makes a correction by decreasing the static threshold value to include these peaks, there will still be a problem regardless of the fact that the peak will be treated as a candidate, and the dynamic threshold check will be applied. This is because the high-amplitude peak will increase the dynamic threshold value caused by the calculation of the mean, and so the peaks will be classified as noise peaks.
High-amplitude peaks directly affect the calculation of dynamic adaptive threshold. It is recalculated each time a new local maximum is found with an amplitude higher than the static threshold. In this case, it is considered as a potential peak, and the values lower than the dynamic threshold are considered as a noise peak, while the others as real QRS peaks. The original algorithm buffers the latest eight peak amplitudes and calculates a DAT value by Eq. (4).
Signal extracts of executing the Hamilton algorithm over the MITDB record 201 (424 sec) a) Original ECG signal; b) Output after bandpass filtering; c) Output after differentiation and absolute calculation; d) Output after average over an 80 ms window. 
An extensive analysis of the MITDB record 201 shows too many misses, especially in cases of aberrated atrial premature beats (classified as a beat) as illustrated in Fig. 4. Two of the beats highlighted by default cannot be captured due to the dynamic adaptive threshold and mean calculation, since most of the latest beats have a high amplitude. In this case, neither the static nor the dynamic adaptive threshold will work. The example is highlighted in Fig. 5 by peaks
Static and dynamic threshold values on the output after average over an 80 ms window on MITDB record 201 (424 sec).
Dynamic adaptive thresholding identifies noise and real peaks. Although in most cases, the dynamic adaptive threshold reacts properly, there are still cases where a noise peak is incorrectly calculated as real. Correct classification of artifacts is of primary importance for a quality industrial QRS detector.
Increasing the static threshold directly decreases the number of artifacts, however, this drastically increases the missed beats. This also applies to the dynamic adaptive threshold. To find an optimum of the static and dynamic peaks means to search for a comprise that would reach high values of both sensitivity, and positive predictive rate.
An example is illustrated in Fig. 5. Peaks labeled as
Calculation of R-peak location
Signal extracts and R-peak detection over the MITDB record 201 (426.4 sec): a) Original signal and local peaks 
One of the issues in executing the original Hamilton algorithm is the proper detection of a QRS peak. Fig. 6 illustrates such a case, where local maxima are labeled with
Note that the filter makes a constant delay, which is deducted from the location of peak values (as displayed on the signal average output). It correctly determines properly the previous QRS peak
The original Hamilton’s algorithm detects the location of R-peaks to be at points
Hamilton’s algorithm detects a peak based on a calculated amount of delay [42]. Although filters produce a fixed delay, the author has pointed out that the detection delay can easily vary from 395 ms to 1 sec, depending on the heart rate and detection rule. It is important to note that the search back detection method [42]can produce a fixed delay if the search back algorithm does not report any local peaks. Thus, we can conclude that the original Hamilton’s algorithm provides the best likely position of the R-peak, however, it is not always exact. This problem can particularly increase the total number of FP’s. This particular case is observed in almost all MITDB records. Even though the difference is not so big, there are cases where the difference between the real and detected location is higher.
Increasing the algorithm performance is directly related to fine-tuning thresholds and algorithm improvements.
Improving the detection of low-amplitude peaks
A careful analysis shows that the step (ABS) executed prior to time average, will not be able to cope with the bad performance in detection of low-amplitude peaks. This is especially crucial in cases when the signal is a mixture of one high amplitude beat, and then followed by several beats with low amplitudes. We used the idea introduced in the Pan Tompkins algorithm [4] to square the signal, instead of calculating the absolute value.
Figure 7 presents a case where a combination of a square mode and the optimized static threshold will improve the detection of low energy peaks. The peaks labelled as
Nevertheless, there are side effects, especially in the calculation of the dynamic adaptive threshold. This threshold increases due to increased amplitudes, and it becomes slightly difficult to adapt to sudden changes in the amplitudes.
Signal extracts of executing the Hamilton’s algorithm over the MITDB record 201 (424 sec): a) Original signal; b) Output after average over a 80 ms window using the original Hamilton’s algorithm; c) Output after square average over a 80 ms window with the optimized static threshold.
This operation behaves as an important amplifier especially if it followed by a calculation of an average over an 80 ms moving window. It shows that this copes better with the identified problems. However, this is not enough, since this algorithm cannot perform with static threshold values, and needs dynamic calculation by other rules.
We have conducted several tests to experiment with threshold values STHR from 2 to 50, to find the optimal threshold value for which the number of errors is minimal. The left part of Fig. 13 presents the false detections for the conducted experiment. The performance of the algorithm gradually decreases as the threshold increases, mainly due to the high values of FP. The best performance is obtained for
Since the Hamilton algorithm only gives an approximation where the peak is, one way to reduce the number of FP’s that occur due to the determination of a proper R peak location, is to search the real peak in near proximity. The idea is to search the most convenient local maximum by analyzing the noise-eliminated output after bandpass filtering.
Proper calculation of the R-peak location on the MITDB record 201 (426.4 sec): a) Calculated R-peak location 
False detections improving the calculation of the R-peak location.
The original Hamilton’s algorithm will determine the best approximate for the location of the R-peak. This is the starting point for the search of the local maximum, within a range, found SearchL ms to the left, and SearchR ms to the right of the approximated R-peak location. Once the local maximum is found on the output of the bandpass filter, we continue to find the local maximum on the actual signal, though the range for searching will be limited to 48 ms.
Figure 8 illustrates the basic steps of this improvement algorithm in the example presented in Fig. 6. The corresponding search segments are marked on the original signal. The original Hamilton’s algorithm detects the local peak
We realized another experiment to locate the optimal values for SearchL, and SearchR. False detections for thresholding values are plotted on a surface graph presented in Fig. 9, for different values for SearchL, and SearchR, using the static threshold value
Two functions, denoted as mean Eq. (3) and thresh Eq. (4), play a crucial role in the original QRS detection algorithm.
We proposed a change in the mean function in order to alleviate the effect of high amplitude complex proceeded by a significantly lower one. Instead of calculating the mean of the last 8 peaks, our algorithm considers only half of this value when the peak amplitude is higher than the dynamic threshold (DTHR) and the exact value in all other times, as defined by Eq. (5). This prevents a linear increase in the threshold especially for high amplitude signals and solves the identified problem.
In addition, we changed the thresh method. Previously, the calculated threshold was multiplied by the constant
A good performance is achieved on both cases with:
a sequence of low-amplitude peaks after an isolated high-amplitude peak; an isolated low-amplitude peak after a sequence of high-amplitude peaks.
Figure 10 illustrates the improvement idea for the example presented in Fig. 3, where a sequence of low-amplitude peaks is followed by a high-amplitude peak, which will increase the dynamic threshold to a value where all consequent low-amplitude peaks are marked as noise peaks.
The figure demonstrates the output of an average of an 80 ms window realized on squared (not absolute) values along with the static, and dynamic thresholds. Note that the detected potential peaks and their absolute values presented in Fig. 3 are smaller than the original dynamic threshold value.
Effect of a dynamic threshold to detect peaks after average of squared values over an 80 ms window over the MITDB record 114 (240 sec).
Parameter list for artifact detection rules
Effect of a dynamic threshold to detect peaks after average of squared values over an 80 ms window over the MITDB record 201 (424 sec).
The effect of applying the new way to calculate the dynamic threshold can be also observed in Fig. 11, presenting isolated, low-amplitude peaks after a sequence of high-amplitude peaks. The new minimum threshold detects 17 candidate peaks, whereas the original algorithm with default dynamic threshold is not able to capture the peaks labeled as
We have conducted a lot of experiments to determine the optimized value for the
In the final step, we introduce a new Classification phase. The aim of this is to classify whether the calculated beat is real or an artifact. Three important decision rules decide whether the peak is an artifact. If none of these rules are satisfied, the beat is calculated as a real peak.
From the preliminary analysis, we observed that artifacts generally follow a real beat and are closer than 320 ms away. The second important issue is that an artifact obviously has lower energy when compared to the previously detected beat. The original Hamilton’s algorithm eliminates artifacts closer than 195 ms. Our findings show that this value can also be optimized. Table 1 describes some parameters used in our optimization approaches. We introduce the following optimization rules for artifact elimination:
False detections of optimization approaches for low-amplitude peaks (left); sequences of high-amplitude peaks (middle), beat rate impact (right). False detections of optimization approaches A0 (left), A1 (middle) and A2 (right).
Signal extracts and outputs after squared average over 80 ms window of executing our algorithm over MITDB: a) signal and d) output for A0 type artifact in record 103 (1304.4 sec); b) signal and e) output for A1 type artifact in record 124 (413.3 sec); c) signal and f) output for A2 type artifact in record 101 (132.2 sec).
Examples of different types of detected artifacts are presented in Fig. 14. The identified segments demonstrate that the detected peaks are artifacts, since their energy is higher than static, and dynamic thresholds, but satisfies one of the A0, A1, and A2 rules. Otherwise, if none of these rules are satisfied, then the detected beat
The results of the experiment using the A0 optimization approach, are presented in the left part of Fig. 13. A threshold value of
The impact of threshold values on the A1 approach is demonstrated in the middle part of Fig. 13. The
The right part of Fig. 13 shows how the threshold parameter impacts the performance of the A2 optimization method. The
Our experiments have shown that the beat rate affects the intervals in the A0-A2 optimization approaches. Let
To classify whether a peak is an artifact, we have set three time constants
However, our analysis has shown that premature beats might appear closer than these values. This is why we introduced a scaling factor to the previous improvement, and use the time thresholds calculated by Eq. (8), which are multiplied by the scaling factor BPM_BASE and heart rate BPM.
Comparison of algorithm performance over MIT-BIH Arrhythmia database
The right part of Fig. 13 shows how the scaling factor BPM_BASE impacts the performance. The
We observe that a value of
The following parameter combination achieves the best overall performance:
Table 2 gives an overview of the obtained results and shows that our algorithm has reached better-combined values of sensitivity and positive predictive rate.
One can face several problems comparing any QRS detection method to other published papers as summarized below:
no source code provided to check other approaches; no info about positive predictive rate; or no info about the number of errors.
When analyzing the performance, only a small number of papers give information on the achieved positive predictive rate and they usually target achieving higher sensitivity values. However, it is very easy to achieve a higher sensitivity value and capture most of the results you would like to include in your algorithm by relaxing the constraints on the optimization parameters, but, at the same time, this will produce many extra generated peaks that do not represent a QRS peak. This is why it is very important to address both the sensitivity and positive predictive to evaluate performance.
This means that there is no direct comparison method. To cope with this problem, we have analyzed the number of errors as a performance measure (as defined by Errors in Eq. (2)). Although this performance measure can be achieved by calculation the sum of FP and FN, it can also be calculated through the harmonic mean (HM) of the QRS sensitivity and positive predictive rate by Eq. (9).
We used Eq. (10) to evaluate this relation. In addition, we assumed that the number of extra detected (false negative) peaks is a lot smaller than the number of correctly detected QRS peaks, i.e.
A number of discrepancies were found while analyzing related work. Table 1 of [4] shows that the number of errors is 782 although it is 784. The calculation of total beats is the most unambiguous. For example, TP
We found that various authors have used a different number of detected peaks. They should use the total number of detected beats, since the total number of peaks includes also non-beat annotations, for example, locations where there is a rhythm change. That is why we use the total number of annotated beats in the MIT-BIH Arrhythmia database 109494.
Table 2 also shows that researchers focusing on QRS detection algorithms mostly tend to use the original sampling frequencies, and resolution of reference ECG databases. Running the algorithm on 125 Hz means that nearly 3 times less data is feed to the QRS detector, thus the execution times are deduced accordingly. Moreover, the adjustments that we proposed increased the performance metrics, which in total yields a better QRS detector intended for a small wearable one channel ECG sensor.
We have introduced several methods to improve the QRS detection in differentiation-based algorithms. Even though our approach is demonstrated on Hamilton’s QRS detection algorithm, it can also be implemented in other algorithms. The results show superior performance over other published results. The improvements were efficiently built in on an industrial QRS detector, for a wearable ECG monitor, where the sampling frequency is 125 Hz, with 10-bit resolution of the AD converter.
Tuning the threshold values might increase the performance, but one needs to develop new relations to generate the thresholds, such as finding
how many previous beats might be analyzed to estimate the mean value, which peaks will be classified as QRS beats since they are very similar in shape, and the impact of the beat rate on classifying the artifacts.
For this purpose, we have introduced several new rules for 1) threshold calculation to better classify QRS peaks; 2) elimination of QRS-like artifacts, and 3) artifact elimination based on beat rate.
Due to rescaling, loose ECG contacts, and noise caused by muscles, 62 beats cannot be detected by analyzing the first ECG channel, without the analysis of a second channel. This gives a higher performance of the QRS sensitivity to 99.96%, and our algorithm has reached a 99.90% QRS sensitivity, with 99.90% positive predictive rate when all records are analyzed in the MITDB, and 99.91% QRS sensitivity for 44 records without paced beats.
Generally, the published papers on QRS detection algorithms do not offer their source codes, and only some of them are validated with referenced ECG databases, such as the MIT-BIH Arrhythmia database. Most of the algorithms give a brief description of design issues without implementation details, addressing only theoretical related results. This is why one cannot directly compare results. Different approaches generally do not achieve results as the ones achieved in a real implementation.
This study can facilitate possible QRS detection algorithms to consider rescaling, and resampling on reference ECG databases in return for better performances. Our findings show that the adjusted version of Hamilton’s QRS detection algorithm yields better results with 125 Hz data on nearly three times shorter execution times.
Our future work will be in the direction of classifying QRS detection errors, and eventually eliminating them, as well as for the purpose of modeling the dependence on sampling rate, and bit resolution. Additionally, we are considering building a quality industrial heart beats classifier, with higher performance, on top of the detection algorithm.
Footnotes
Conflict of interest
None to report.
