Heart sound classification using wavelet scattering transform and support vector machine

Abstract

OBJECTIVE:

The phonocardiogram (PCG) signal is a recording of the sounds associated with the movement of the cardiac structures. PCG signals are helpful in diagnosing diverse heart diseases. However, recording PCG signals is a complex task because the recordings are prone to ambient noise and other interfering signals. The PCG signal therefore needs to be denoised before it is used for further processing. This work proposes an improved heart sound classification method for cardiovascular disease (CVD) that uses two-stage denoising with low-pass filtering and the Wavelet Threshold (WT) technique, followed by Feature Extraction (FE) using the Wavelet Scattering Transform and classification with a cubic polynomial Support Vector Machine (SVM).

METHOD:

This work presents a computer-aided diagnosis system for CVD detection based on PCG signal analysis. First, the raw PCG signals obtained from the database were pre-processed by heavy low-pass filtering. The signals were then denoised via the WT technique to remove redundant information and noise. Wavelet time scattering features were extracted from the denoised PCG. Finally, these features were classified for pathology using SVMs.

RESULTS:

The PCG signals obtained from the Physionet dataset were used for the analysis. The pre-processing step applies heavy low-pass filtering with a Low-Pass Butterworth Filter (LPBF), which removed 98% of the noise inherently present in the signal. The signal was then further denoised with the WT technique. The method shows promising results, with maximum noise removal of up to 99%. Wavelet Scattering (WS) features were extracted from the PCG and then used to classify the PCG with SVMs, achieving 99.72% accuracy for different sounds.

DISCUSSION:

The classification accuracies are compared with those of other classification techniques in the literature. The technique showed promising outcomes, with a 3% improvement in the F1 score over state-of-the-art techniques. The improvement in the metrics is attributed to the pre-processing stage comprising the low-pass filter and WT method, the WS Transform (WST), and the SVMs.

CONCLUSION:

The comparative investigation with existing methodologies supports the superiority of the proposed technique. The results indicate that detection of Coronary Artery Disease (CAD) can be implemented with high accuracy using the proposed methods.

Keywords

Phonocardiogram (PCG), Support Vector Machine (SVM), Coronary Artery Disease (CAD), Low-Pass Butterworth Filter (LPBF), WS Transform (WST), Feature Extraction (FE)

1. Introduction

Globally, Heart Diseases (HDs) are the leading cause of morbidity and mortality. One out of every three deaths worldwide is due to HD [1, 2]. HD caused 16.7 million deaths globally in 2002, and the number is expected to reach 23.3 million by 2030 [3]. CAD is the most common cause of HD, and both its incidence and the associated deaths have increased in the last few years [4]. The main cause of CAD is the deposition of plaque on the inner wall of the coronary arteries, which can lead to sudden cardiac arrest [5]. The risk of CAD depends on an individual’s (a) age, (b) smoking, (c) lifestyle, (d) exercise patterns, (e) diabetes, (f) hypertension, (g) obesity, and (h) raised blood cholesterol [6]. The death rates associated with CAD differ worldwide. Compared with other countries and regions, the highest number of deaths is recorded in Asia, the Middle East, and Russia [7]. CAD is divided into three categories based on the number of major coronary arteries affected by plaque buildup: Single Vessel CAD (SVCAD), Double Vessel CAD (DVCAD), and Triple Vessel CAD (TVCAD). The three types are described as follows.

SVCAD

In SVCAD, only one of the major coronary arteries is affected by plaque buildup. This can result in partial or complete blockage of blood flow to the heart, leading to symptoms such as chest pain or discomfort, shortness of breath, or fatigue.

DVCAD

In DVCAD, two of the major coronary arteries are affected by plaque buildup. This can increase the risk of heart attack or other complications, as there is a greater likelihood of reduced blood flow to the heart.

TVCAD

In TVCAD, all three of the major coronary arteries are affected by plaque buildup. This can significantly lower blood flow to the heart, increasing the risk of heart attack or other serious complications [8, 9]. The diagnosis of CAD depends on the severity of left ventricular dysfunction and the number of affected arteries. Owing to compensatory mechanisms of the cardiac and pulmonary vasculature, the signs of CAD may not become apparent until the stenosis worsens [10]. This complicates the diagnosis, and the person may remain undiagnosed until the disease is too advanced for non-invasive therapy. Timely and accurate diagnosis of CAD is therefore indispensable to lower the death rate and avoid further complications. Currently, Coronary Angiography (CA) is considered the gold standard for the diagnosis of CAD [7]. It is a costly and invasive clinical procedure that requires specialist expertise [11]. CA can also induce severe complications such as artery dissection, arrhythmia, or even sudden death. PCG auscultation is an economical and useful procedure that is widely used for screening cardiac disorders [12]. Earlier works report that coronary stenosis generates weak heart murmurs in PCG signals [13, 14]. These murmurs are extremely difficult to identify in manual auscultation, and manual auscultation cannot distinguish the types of CAD. It is therefore very important to develop an intelligent PCG signal analysis technique for the detection and categorization of CAD [15]. PCG signals provide valuable information for the diagnosis of different heart diseases. They are recordings of the sounds produced by the heart during its normal functioning, and they can be analyzed to detect abnormalities in the heart’s functioning. Some of the most common types of heart disease include the following.

Coronary artery disease

This is caused by plaque buildup in the arteries that supply blood to the heart. The plaque narrows the arteries, which reduces blood flow and can cause chest pain or a heart attack.

Heart failure

This occurs when the heart is not able to pump enough blood to meet the body’s requirements. It can be caused by a variety of factors such as diabetes, high blood pressure, and coronary artery disease.

Arrhythmias

These are abnormal heart rhythms that could cause the heart to beat too fast, too slow, or irregularly. They could be caused by a variety of factors like electrolyte imbalances, medications, and heart disease.

Heart valve disease

This occurs when the valves in the heart don’t function properly, either by not opening fully or by not closing completely. It could be caused by a variety of factors, namely congenital heart defects, infection, and age.

Other less common sorts of heart disease include congenital heart defects, cardiomyopathy, and pericardial disease.

1.1 PCG signals

PCG is a non-invasive diagnostic tool that records and analyzes the sounds produced by the heart during its normal functioning. The PCG signal is obtained by placing a microphone or a sensor on the chest wall and then recording the sound waves produced by the heart. The PCG signal gives the characteristics of different heart sounds by capturing the acoustic vibrations generated by the heart during its normal functioning. This provides valuable information about the heart’s functioning and can help in the diagnosis of various heart conditions.

The different heart sounds are characterized as S1, S2, S3, and S4. The first heart sound (S1) is produced by the closing of the mitral and tricuspid valves at the beginning of systole. It is usually described as a “lub” and appears as a low-frequency sound in the PCG signal. The second heart sound (S2) is produced by the closing of the aortic and pulmonic valves at the end of systole. It is usually described as a “dub” and appears as a high-frequency sound in the PCG signal. Additional heart sounds, such as S3 and S4, can also be heard in the PCG signal. S3 is produced by the rapid filling of the ventricles during diastole, while S4 is produced by the contraction of the atria just before S1.

The PCG signals can be classified based on different factors, such as the following.

• Timing: PCG signals can be classified based on the timing of the heart sounds, such as the duration of the systolic and diastolic periods of the heart cycle.
• Frequency: PCG signals can also be classified based on the frequency content of the heart sounds, which provides information about the different components of the heart sounds.
• Amplitude: PCG signals can be classified based on the amplitude or intensity of the heart sounds, which can be used to detect changes in the heart’s functioning.

2. Literature review

The majority of the previous studies concentrated on CAD recognition via Electrocardiogram (ECG) signals [16, 17, 18, 19, 20, 21, 22, 23]. Phonocardiograms (PCGs) and electrocardiograms (ECGs) are both non-invasive diagnostic tests used for evaluating the heart’s function.

Phonocardiograms are recordings of the sounds produced by the heart and are typically used to detect abnormalities in the heart valves and muscles. PCGs can detect murmurs, clicks, and other abnormal sounds that indicate issues, such as valve stenosis or regurgitation, ventricular septal defects, or hypertrophic cardiomyopathy.

Electrocardiograms, on the other hand, measure the heart’s electrical activity and could provide information about the rhythm and function of the heart. ECGs can detect arrhythmias, such as atrial fibrillation, and can also provide information about the presence of ischemia or damage to the heart muscle, such as in the case of a heart attack.

In [16], an ECG-based classification of CAD was presented. Cross Information Potential features were extracted from the fourth-level detail coefficients of ECG beats decomposed via the flexible analytic wavelet transform. A detection accuracy of 99.6% was achieved using a Least Squares SVM (LS-SVM). The authors in [18] used Convolutional Neural Networks (CNNs) with long short-term memory to recognize CAD from ECG signals. ECG signals were analyzed with a deep CNN in a similar work [17]. Arabasadi et al. [19] proposed a hybrid methodology that combines a Genetic Algorithm with Neural Networks (NNs) to identify CAD using ECG signatures; the study attained a diagnostic accuracy of 93.8%. Dolatabadi et al. [21] presented a methodology for CAD identification based on heart rate variability analysis of the ECG signal. The technique achieved 99.2% accuracy with an optimized SVM classifier. Khan et al. [22] proposed a signal processing framework for CAD detection using ECG signals. The ECG signals were first denoised by means of Empirical Mode Decomposition (EMD); Multi-Domain Features (MDFs) were then extracted and classified with an SVM. Deep Learning (DL) is one of the fastest-growing fields in Machine Learning (ML). Automated detection of CAD was carried out via a 1-dimensional version of CapsNET using 2 s and 5 s ECG segments [23]. These signals were collected from 7 CAD and 40 normal subjects. In other research [20], higher-order statistics (bispectrum and cumulant features) were extracted from ECG beats to differentiate CAD and healthy subjects. The feature dimensions were reduced through Principal Component Analysis (PCA). The reduced features were used to train Decision Tree (DT) and K-Nearest Neighbours (KNN) classifiers, which achieved 98.9% and 98.1% accuracy, respectively.
PCG signals carry important signatures related to cardiac disorders. Most current works on PCG classification for cardiac disorders were conducted on the openly available PhysioNet/CinC Challenge [24, 25, 26, 27, 28, 29] and PASCAL [30, 31] datasets. Many works collected their own PCG datasets for valvular heart disorder detection [32, 33, 34]. In contrast, a comparatively limited amount of research exists on CAD detection via PCG signal analysis [5, 35, 36, 37, 38]. These works followed the traditional steps of the ML pipeline, that is, FE and classification. Semmlow et al. [15] proposed an approach to identify the weak murmurs produced by CAD using acoustic features. The authors in [35] tested a digital electronic stethoscope for CAD diagnosis, collecting data from 161 patients with various cardiac diseases. Schmidt et al. [36] proposed a low-cost CAD detection model using PCG signals. Nine features were extracted from five overlapping frequency bands and classified via quadratic discriminant analysis. The self-collected dataset comprised 204 recordings from CAD subjects and 231 recordings from non-CAD subjects. In [39], CAD was diagnosed by extracting Cross-Power Spectral Density (CPSD) characteristics from the heart sounds, and a classification accuracy of 84% was obtained with an SVM classifier. Another similar study [40] used a PCG-based multi-channel data acquisition setup to differentiate normal and CAD subjects. Five features were calculated from the frequency-domain transformation of the PCG signal: Spectral Entropy (SE), moments of the PSD function, Spectral Moments (SM), Autoregressive (AR) parameters, and Instantaneous Frequency (IF). An Artificial NN (ANN) was used for classification.
Accuracies of 74.2% and 69.6% were attained for multi-channel and single-channel data, respectively. Samanta et al. [41] developed a 4-channel PCG signal acquisition system for CAD detection. Time- and frequency-domain features were examined to distinguish the CAD and normal classes. The best result, an accuracy of 82.5%, was obtained by the sub-band features with the ANN classifier. The impact of noise on CAD detection via PCG signals was studied in [42]. The heart sound spectrum was extracted using the imaginary part of the CPSD, and Spectral Features (SFs) calculated from sub-bands were classified with an SVM to differentiate the normal and CAD classes. A feature fusion methodology for CAD detection via PCG signals was proposed in [5]. A total of 110 features were obtained from diverse domains and reduced via PCA. To extract Deep Features (DFs), spectrogram images calculated from Mel-Frequency Cepstral Coefficients (MFCCs) were fed to a CNN. These DFs were fused with the reduced MDFs and classified via a Multilayer Perceptron (MLP) network. The experiment was carried out on a dataset collected from 175 subjects. The fused framework achieved an accuracy of 90.4%, a sensitivity of 93.6%, and a specificity of 83.3%. A PCG-based two-stage classification strategy for recognizing different types of CAD was proposed recently [43]. The PCG signals were pre-processed via EMD. In the first stage, normal and CAD signals were classified by extracting MFCC and statistical features with a KNN classifier. In the second stage, the type of CAD (SVCAD, DVCAD, or TVCAD) was identified from fused MFCC, statistical, and spectral features with a cubic-kernel SVM. The experiments were carried out on a self-recorded dataset of 1190 PCG signals. The technique achieved an overall accuracy of 88.4%.
The time-varying frequency characteristics of the diastolic and systolic periods of the PCG were evaluated using the Synchrosqueezing Transform (SST) of the cardiac cycle [38]. The experiments were performed on four-channel PCG signals collected from the chest, with 40 CAD and 40 non-CAD subjects participating. The system produced a maximum accuracy of 81.2% for multichannel data, and the fusion of SFs raised the accuracy to 83.4%. Most earlier studies concentrated on CAD detection via ECG or PCG signals. There is a strong need for a precise system that detects CAD and classifies it into its essential types (SVCAD, DVCAD, and TVCAD) to help the cardiologist make the right decisions concerning treatment procedures. A methodology for the detection and classification of CAD and its types via a combination of 1-Dimensional Adaptive Local Ternary Patterns (1D-ALTP) and MFCC features has been presented. The PCG signals are pre-processed via EMD, which keeps the discriminative information intact and rejects redundant content. The features are extracted by means of traditional MFCCs and 1D-ALTP. Fusing the 1D-ALTP features with MFCCs considerably improves the Feature Representation (FR) for the different types of CAD and the normal class. The fused feature vector is reduced by means of Multidimensional Scaling (MDS) and examined with a range of classifiers to differentiate the (i) normal, (ii) SVCAD, (iii) DVCAD, and (iv) TVCAD classes. Manual auscultation of PCG signals can be subjective and prone to human error, which can lead to misdiagnosis and delayed treatment. Therefore, an intelligent PCG signal analysis technique is necessary for accurate and reliable detection and categorization of CAD.
Intelligent PCG signal analysis methods, such as machine learning and deep learning techniques, can automatically extract features from the PCG signals and classify them into different categories based on the severity and type of CAD. These methods can also detect subtle changes in the PCG signals that may not be apparent to the human ear, allowing for early detection and timely treatment of CAD.

The following are the major contributions of this study.

1. A novel methodology is proposed that combines two-stage pre-processing based on a Low-Pass Filter (LPF) and WT, FE based on the Wavelet Scattering Transform, and classification using SVMs. The proposed two-stage LPF-WT denoising and wavelet time scattering significantly improve the classification performance.
2. In terms of accuracy, sensitivity, and specificity, the proposed technique improves performance for CAD detection via PCG signals by about 3% compared with the existing literature.
3. The proposed methodology can classify the type of CAD based on the severity level. This classification is carried out with a Binary Class (BC) support vector framework, which achieved 96.02% specificity, 94.2% sensitivity, and a 93.26% F1 score for the abnormal class, and 97.41% specificity, 96.16% sensitivity, and a 96.78% F1 score for the normal class.

3. Materials and method

3.1 Overview

Figure 1.

Block Diagram of the CVD diagnostic system.

Figure 1 depicts the architecture of the proposed computer-aided diagnosis scheme for CVD classification via PCG signals. PCG signals obtained from the Physionet database are first pre-processed to eliminate redundant information and noise. In the first stage, a sixth-order LPBF removes most of the noise present in the signal, up to a limit of 98%. In the second stage, the WT technique denoises the signal further, up to a limit of 99%, removing almost all of the remaining noise. In the third stage, wavelet time scattering features are extracted from the denoised PCG signal. Finally, the extracted features are classified by a widely used machine learning algorithm, the SVM, to categorize the signal as normal or pathological/abnormal.

3.2 Pre-processing

3.2.1 Low pass filtering and Wavelet Threshold method

Here, a two-stage filtering strategy is adopted to remove unwanted noise and other artifacts hidden in the signal. In the first stage, an LPBF with a cut-off frequency of 160 Hz removes higher-frequency noise from the signal. In the second stage, WT denoising is applied. It identifies and removes hidden noise by decomposing the signal into wavelet coefficients at different scales. The coefficients dominated by noise are suppressed, and the remaining coefficients are recombined to reconstruct a clean signal. The advantage of WT denoising is that it can preserve important features of the signal while removing noise.
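The first filtering stage can be sketched in Python with SciPy as follows. This is a minimal illustration, not the authors’ implementation: the sampling rate of 2000 Hz, the helper name `lowpass_pcg`, and the use of zero-phase (forward-backward) filtering are assumptions, since the text only specifies a sixth-order Butterworth low-pass filter with a 160 Hz cut-off.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def lowpass_pcg(signal, fs=2000.0, cutoff_hz=160.0, order=6):
    """Butterworth low-pass filter for a 1-D PCG recording.

    fs (sampling rate) is an assumed value; adjust it to the recording.
    """
    # Second-order sections are numerically more robust than (b, a) form.
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    # Forward-backward filtering gives zero phase distortion; note that it
    # applies the filter twice, so the attenuation in dB is doubled.
    return sosfiltfilt(sos, signal)


# Synthetic check: a 50 Hz component (pass band) plus a 600 Hz component
# (stop band) standing in for high-frequency noise.
fs = 2000.0
t = np.arange(0, 2.0, 1.0 / fs)
low = np.sin(2 * np.pi * 50 * t)
high = 0.5 * np.sin(2 * np.pi * 600 * t)
filtered = lowpass_pcg(low + high, fs=fs)
```

In this synthetic example the 600 Hz component is strongly attenuated while the 50 Hz component passes almost unchanged.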

Two methods are popular in the WT denoising procedure: the hard threshold and the soft threshold [44].

Hard thresholding

Hard thresholding sets a wavelet coefficient to zero if its magnitude does not exceed a certain threshold value and leaves the remaining coefficients unchanged. This method is more aggressive than soft thresholding and can lead to a loss of signal features. The hard threshold is expressed mathematically as,

$\displaystyle Y=X\cdot\mathbf{1}(|X|>T)$

Soft thresholding

Soft thresholding sets the coefficients below the threshold to zero and shrinks the remaining coefficients toward zero by the threshold value. The resulting continuous transition toward zero can help to preserve the signal features and reduce the noise. The soft threshold is expressed mathematically as,

$\displaystyle Y=\textit{sign}(X)\cdot(|X|-T)_{+},\qquad(x)_{+}=\begin{cases}0&\text{if }x<0\\ x&\text{if }x\geqslant 0\end{cases}$

The soft threshold method is finer and is also called the wavelet-shrinkage method, whereas the hard threshold is much cruder. The basic principle behind the wavelet shrinkage method is to apply a thresholding function to the wavelet coefficients; hard and soft thresholding are the most common choices.

The wavelet shrinkage method has several properties that make it useful for signal denoising, which are as follows.

• It is a non-parametric technique that does not require any assumptions about the noise distribution or the signal characteristics.
• It is a computationally efficient method that can be applied to large signals.
• It can be applied to different types of signals, including stationary and non-stationary signals.
• It can be used for both denoising and compression purposes.
• The thresholding function can be adapted to the specific characteristics of the signal, such as its sparsity or smoothness.

The denoised signal is represented by $Y$ , while the original noisy signal is $X$ . The Threshold Value (TV) that is set to denoise the noisy signal’s coefficients is represented by $T$ . The threshold technique’s pictorial representation is displayed in Fig. 2.
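The two thresholding rules can be illustrated with a short NumPy sketch. It assumes the standard definitions: hard thresholding keeps a coefficient unchanged when its magnitude exceeds $T$ and zeroes it otherwise, while soft thresholding additionally shrinks the surviving coefficients by $T$. The function names and the sample coefficient values are hypothetical; $T=0.01$ is one of the threshold values tested later in this work.

```python
import numpy as np


def hard_threshold(x, t):
    """Keep coefficients with |x| > t unchanged, zero the rest."""
    x = np.asarray(x, dtype=float)
    return x * (np.abs(x) > t)


def soft_threshold(x, t):
    """Zero coefficients with |x| <= t and shrink the others toward zero by t."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


# Hypothetical wavelet coefficients, for illustration only.
coeffs = np.array([-0.02, -0.005, 0.0, 0.004, 0.03])
T = 0.01
hard = hard_threshold(coeffs, T)   # large coefficients survive unchanged
soft = soft_threshold(coeffs, T)   # large coefficients are reduced by T
```

The hard rule leaves a jump at $|X|=T$, whereas the soft rule is continuous there, which is why the soft-thresholded signal tends to look smoother.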

Figure 2.
Original signal (pane 1), hard threshold signal (pane 2), soft threshold signal (pane 3).

3.2.2 Statistical analysis

The performance of the WT model is examined on PCG signals corrupted by noise at high and low input SNR levels. The performance of the algorithm is assessed quantitatively with the following statistical metrics, where $s(n)$ is the reference signal, $y(n)$ is the denoised signal, and $N$ is the number of samples.

Mean Square Error (MSE) is indicated by

$\displaystyle\textit{MSE}=\frac{1}{N}\sum\limits_{n=1}^{N}(|s(n)|-|y(n)|)^{2}$ (1)

Signal to Noise Ratio (SNR) (in dB) is indicated by

$\displaystyle\textit{SNR}=10\log_{10}\frac{\sum\nolimits_{n=1}^{N}(s(n))^{2}}{\sum\nolimits_{n=1}^{N}(s(n)-y(n))^{2}}$ (2)

Peak Signal to Noise Ratio (PSNR) (in dB) is indicated by

$\displaystyle\textit{PSNR}=10\log_{10}\frac{\left(\max_{n}s(n)\right)^{2}}{\textit{MSE}}$ (3)

The Correlation Coefficient (CC) is symbolized by,

$\displaystyle CC=\frac{N\sum\nolimits_{n=1}^{N}s(n)y(n)-\left(\sum\nolimits_{n=1}^{N}s(n)\right)\left(\sum\nolimits_{n=1}^{N}y(n)\right)}{\sqrt{\left[N\sum\nolimits_{n=1}^{N}(s(n))^{2}-\left(\sum\nolimits_{n=1}^{N}s(n)\right)^{2}\right]\left[N\sum\nolimits_{n=1}^{N}(y(n))^{2}-\left(\sum\nolimits_{n=1}^{N}y(n)\right)^{2}\right]}}$ (4)
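The four metrics can be computed as in the following NumPy sketch. The function names are hypothetical. The code treats $s(n)$ as the reference signal and $y(n)$ as the denoised signal, follows Eq. (1) with the absolute values as printed, and assumes that PSNR is expressed in dB through $10\log_{10}$, consistent with the stated unit.

```python
import numpy as np


def mse(s, y):
    """Eq. (1): mean squared difference of the absolute values."""
    s, y = np.asarray(s, float), np.asarray(y, float)
    return np.mean((np.abs(s) - np.abs(y)) ** 2)


def snr_db(s, y):
    """Eq. (2): signal energy over residual energy, in dB."""
    s, y = np.asarray(s, float), np.asarray(y, float)
    return 10.0 * np.log10(np.sum(s ** 2) / np.sum((s - y) ** 2))


def psnr_db(s, y):
    """Eq. (3), assuming a dB scale: squared peak of s over the MSE."""
    s = np.asarray(s, float)
    return 10.0 * np.log10(np.max(s) ** 2 / mse(s, y))


def correlation_coefficient(s, y):
    """Eq. (4): Pearson correlation between s and y."""
    s, y = np.asarray(s, float), np.asarray(y, float)
    n = s.size
    num = n * np.sum(s * y) - np.sum(s) * np.sum(y)
    den = np.sqrt((n * np.sum(s ** 2) - np.sum(s) ** 2)
                  * (n * np.sum(y ** 2) - np.sum(y) ** 2))
    return num / den


# Toy example: the "denoised" signal is the reference scaled by 0.9.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
s = np.sin(2 * np.pi * 5 * t)
y = 0.9 * s
results = (mse(s, y), snr_db(s, y), psnr_db(s, y), correlation_coefficient(s, y))
```

In this toy case the residual is one tenth of the reference, so the SNR is 20 dB and the correlation coefficient is 1.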

3.3 Feature extraction using wavelet scattering transform

A WST constructs Translation Invariant (TI), stable, and informative signal representations, and it is used here for feature extraction. The scattering transform aims to extract features that are invariant to translations and stable to small deformations of the signal. The modulus non-linearities, followed by averaging, capture features that are invariant to translations, and the multiple layers of the transform capture progressively more complex structure. Moreover, the transform is stable to deformations and preserves class discriminability, which makes it particularly effective for classification. Its strong practical performance in classification is reported in [44, 45, 46, 47, 48]. The notation in [46] is followed. Let $f(t)$ be the signal under investigation. An LPF $\phi$ and a wavelet function $\psi$ are designed to build filters that cover all the frequencies contained in the signal. Let $\phi_{J}$ be the LPF that provides locally TI descriptions of $f$ at a predefined scale $T$ . Let ${\Lambda}_{k}$ denote the family of wavelet indices with an octave frequency resolution ${Q}_{k}$ . The multi-scale high-pass Filter Banks (FBs) $\{\psi_{j_{k}}\}_{j_{k}\in{\Lambda}_{k}}$ are constructed by dilating the wavelet $\psi$ . A WST is implemented as a deep convolution network that iterates over conventional wavelet transforms, nonlinear modulus, and averaging operators. The convolution $S_{0}f(t)=f\ast\phi_{J}(t)$ produces a locally TI feature of $f$ , but it also loses high-frequency information. These lost high frequencies are recovered by a wavelet modulus transform $|W_{1}|f=\{S_{0}f(t),|f\ast\psi_{j_{1}}(t)|\}_{j_{1}\in{\Lambda}_{1}}$ .

The first-order Scattering Coefficients (SCs) are obtained by averaging the wavelet modulus coefficients with $\phi_{J}$ : $S_{1}f(t)=\{|f\ast\psi_{j_{1}}(t)|\ast\phi_{J}(t)\}_{j_{1}\in{\Lambda}_{1}}$ . To recover the information lost by averaging, $S_{1}f(t)$ is viewed as the low-frequency component of $|f\ast\psi_{j_{1}}(t)|$ , and the complementary high-frequency coefficients are extracted by $|W_{2}||f\ast\psi_{j_{1}}(t)|=\{S_{1}f(t),||f\ast\psi_{j_{1}}(t)|\ast\psi_{j_{2}}(t)|\}_{j_{2}\in{\Lambda}_{2}}$ .

The second-order SCs are defined as $S_{2}f(t)=\{||f\ast\psi_{j_{1}}(t)|\ast\psi_{j_{2}}(t)|\ast\phi_{J}(t)\}_{j_{i}\in{\Lambda}_{i}}$ , $i=1,2$ . Iterating this process defines the wavelet modulus convolutions

$\displaystyle U_{m}f(t)=\{|\,||f\ast\psi_{j_{1}}(t)|\ast\cdots|\ast\psi_{j_{m}}(t)|\}_{j_{i}\in{\Lambda}_{i}},\quad i=1,2,\ldots,m.$

Averaging $U_{m}f(t)$ with $\phi_{J}$ gives the $m^{\text{th}}$ -order SCs $S_{m}f(t)=\{|\,||f\ast\psi_{j_{1}}(t)|\ast\cdots|\ast\psi_{j_{m}}(t)|\ast\phi_{J}(t)\}_{j_{i}\in{\Lambda}_{i}}$ , $i=1,2,\ldots,m$ .

Figure 3 portrays the scattering process. The final scattering matrix $Sf(t)=\{S_{m}f(t)\}_{0\leqslant m\leqslant l}$ aggregates the SCs of all orders to describe the features of the input signal, where $l$ is the maximal decomposition order. Owing to the averaging operation performed by the LPF, the network is invariant to translations up to the invariance scale, which can be large. $Sf(t)$ is stable to local deformations, a property inherited from the wavelet transform. The scattering decomposition captures subtle changes in the amplitude and duration of PCG signals, which are difficult to measure yet reflect the heart’s condition. The wavelet scattering network is therefore used to generate robust representations of the PCG heartbeat that reduce differences within a single category while maintaining adequate discriminability between different categories. Although the structure of the WS network is similar to that of a CNN, there are two major differences: the filters are not learned but set in advance, and the features are not just the output of the last convolution layer but the combination of the outputs of all layers. The energy of the SCs decreases rapidly as the layer level increases, with about 99% of the energy contained in the first two layers [46, 47]. Thus, a two-order scattering network is used to extract the features of the PCG signal. The input to the network is the PCG signal, which is convolved with the first wavelet FB and passed through the modulus operator; averaging the result with the LPF gives the first-order scattering coefficients. These coefficients capture the amplitude of the signal at different scales and frequencies.
The first-order wavelet modulus outputs are then convolved with the second FB and passed through the modulus operator again; averaging gives the second-order scattering coefficients. These coefficients capture information about the interactions between the different frequencies and scales in the signal. The scattering coefficients of all orders together form the feature vector of the PCG signal. Two-order scattering is chosen because it is a powerful tool for analyzing non-stationary signals like PCGs, which have complex time-frequency characteristics. The multiple layers of wavelet transforms and modulus operations capture the high-level features of the PCG signal, while restricting the network to two orders considerably reduces the computational complexity.
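The cascade of wavelet convolution, modulus, and low-pass averaging can be sketched in NumPy as follows. This is a simplified illustration rather than the filter bank used in this work: it assumes Gaussian band-pass filters defined directly in the Fourier domain instead of dilated wavelets, circular convolution, no subsampling, and hypothetical centre frequencies and bandwidths (in cycles per sample).

```python
import numpy as np


def gaussian_filter_hat(n, center, bandwidth):
    """Gaussian filter on the FFT frequency grid; center=0 gives a low-pass."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * ((freqs - center) / bandwidth) ** 2)


def convolve(x, filter_hat):
    """Circular convolution computed through the FFT."""
    return np.fft.ifft(np.fft.fft(x) * filter_hat)


def scattering(f, centers1, centers2, lowpass_bw=0.002):
    """Zeroth-, first-, and second-order scattering coefficients of f."""
    n = len(f)
    phi_hat = gaussian_filter_hat(n, 0.0, lowpass_bw)
    s0 = convolve(f, phi_hat).real                      # S0 = f * phi
    s1, s2 = [], []
    for c1 in centers1:
        # U1 = |f * psi_j1|
        u1 = np.abs(convolve(f, gaussian_filter_hat(n, c1, c1 / 4)))
        s1.append(convolve(u1, phi_hat).real)           # S1 = U1 * phi
        for c2 in centers2:
            if c2 < c1:  # the envelope U1 only varies slower than c1
                # U2 = ||f * psi_j1| * psi_j2|
                u2 = np.abs(convolve(u1, gaussian_filter_hat(n, c2, c2 / 4)))
                s2.append(convolve(u2, phi_hat).real)   # S2 = U2 * phi
    return s0, np.array(s1), np.array(s2)


# Toy input: a pure tone at 0.05 cycles/sample.
n = 1024
f = np.sin(2 * np.pi * 0.05 * np.arange(n))
centers1 = [0.2, 0.1, 0.05, 0.025]   # hypothetical first filter bank
centers2 = [0.02, 0.01]              # hypothetical second filter bank
s0, s1, s2 = scattering(f, centers1, centers2)
```

For the toy tone, the first-order coefficients are largest in the band centred at 0.05, which shows how the first layer localizes energy in frequency before the second layer analyzes the envelope of each band.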

Figure 3.

Tree view of the scattering network.

Figure 4.

Wavelet filters. (a) The LPF with an invariance scale of $N=$ 10000. (b) The first FB with 8 wavelets per octave and the second FB with 1 wavelet per octave.

3.4 Classification

The classification of healthy and abnormal PCG signals is carried out with SVM classifiers. The classifiers were trained and validated with 10-fold Cross-Validation (CV), and the performance of each classifier on the feature set was examined. The SVM-C gives the best performance in distinguishing the two classes of PCG signals. To further consolidate the classification outcomes, the best-performing classifier is also tested with 5-, 15-, and 20-fold CV and with 80–20% and 70–30% train-test splits.
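A cubic-kernel SVM evaluated with 10-fold CV can be sketched with scikit-learn as follows. This is an illustrative setup, not the authors’ configuration: the synthetic Gaussian features stand in for the scattering features, and the feature standardization, `coef0=1.0`, and the stratified shuffled folds are assumptions.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for scattering features: 200 recordings x 20 features,
# class 0 = normal, class 1 = abnormal (not real PCG data).
X = np.vstack([rng.normal(0.0, 1.0, (100, 20)),
               rng.normal(1.5, 1.0, (100, 20))])
y = np.array([0] * 100 + [1] * 100)

# Cubic polynomial kernel; scaling the features first keeps the kernel
# values in a reasonable numeric range.
clf = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3, coef0=1.0))

# 10-fold stratified cross-validation, one accuracy value per fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
```

Changing `n_splits` to 5, 15, or 20 reproduces the other CV settings mentioned above, and `sklearn.model_selection.train_test_split` with `test_size=0.2` or `0.3` gives the 80–20% and 70–30% schemes.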

The process of classifying PCG signals as normal or pathological using machine learning typically involves the following steps.

Feature extraction

The PCG signal is analyzed to extract a set of features that can be used to differentiate between normal and pathological signals. These features may include time-domain and frequency-domain features like the duration of different heart sounds, the heart sounds’ intensity, or the spectral content of the signal.

Model training

A classification model is trained on a labeled dataset of PCG signals. The dataset consists of PCG signals that have been classified as either normal or pathological based on clinical criteria.

Model testing

The trained classification model is then tested on a separate dataset of PCG signals to evaluate its accuracy and performance. This dataset consists of PCG signals that were not used during the model training phase.

Model evaluation

The classification model’s performance is evaluated with metrics such as sensitivity, specificity, and accuracy. Sensitivity measures the ability of the model to correctly classify pathological signals (true positives), specificity measures its ability to correctly classify normal signals (true negatives), and accuracy is the proportion of all signals that are classified correctly.

4. Results

This section presents the outcomes of the proposed computer-aided diagnosis scheme for the detection and classification of cardiac disorders. The proposed technique is trained and tested on the Physionet PCG signal database.

PCG signals originate from the depolarization wave that travels through the heart during the cardiac cycle. The depolarization wave causes the heart muscles to contract, which produces mechanical vibrations that can be heard through a stethoscope. To record PCG signals, a microphone or an electronic sensor is placed on the chest to capture the mechanical vibrations produced by the heart. PhysioNet is a database that provides open access to a large collection of PCG signals recorded from various sources, including healthy individuals and patients with different cardiac conditions.

4.1 Preprocessing

The two-stage low-pass filtering and WT method was tested on the Physionet database. Both soft and hard threshold methods are used to denoise the sounds. The threshold values (TVs) were limited to $T=$ 0.002, 0.004, 0.006, 0.008, and 0.01.
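The hard and soft threshold rules can be sketched as follows, using the standard definitions: hard thresholding zeroes every coefficient whose magnitude does not exceed $T$, while soft thresholding additionally shrinks the surviving coefficients towards zero by $T$. The function names and the example coefficients are hypothetical; only the threshold values come from the text.

```python
import math


def hard_threshold(coeffs, t):
    """Keep coefficients with |c| > t unchanged and set the rest to zero."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero coefficients with |c| <= t and shrink the others by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


# Threshold values listed in the text.
THRESHOLDS = [0.002, 0.004, 0.006, 0.008, 0.01]

# Hypothetical wavelet detail coefficients, for illustration only.
example_coeffs = [0.012, -0.003, 0.0005, -0.02]

for t in THRESHOLDS:
    print(t, hard_threshold(example_coeffs, t), soft_threshold(example_coeffs, t))
```

In a full denoising pipeline, one of these functions would be applied to the detail coefficients of the wavelet decomposition before the signal is reconstructed.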

Table 1
Accuracy (%) of each Time Window (TW) classified by the SVM

Time window	1	2	3	4	5
Accuracy (%)	99.9	99.8	99.7	99.8	99.9

Table 2

Classification metrics (%) for various SVM classifiers

Classifier	Specificity	Sensitivity	F1-score
SVM-L	92.53	94.13	94.8
SVM-P	95.9	96.45	96.3
SVM-G	95.17	95.8	95.5

Table 3

The confusion matrix for 5-TWs merged with the SVM across 5 folds

True class	Predicted Normal	Predicted Abnormal	Sensitivity (%)	Specificity (%)	F1-score (%)
Abnormal	30	346	94.536	92.021	93.261
Normal	753	20	96.169	97.413	96.787
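The percentages in Table 3 can be checked against the four counts of the confusion matrix. The sketch below is an inference from the arithmetic, not a statement from the source: the tabulated values are reproduced when the column labelled Sensitivity is computed as TP/(TP+FP) and the column labelled Specificity as TP/(TP+FN), taking each class in turn as the positive class. The function name is hypothetical.

```python
def per_class_scores(tp, fp, fn):
    """Return (TP/(TP+FP), TP/(TP+FN), F1) as percentages."""
    ratio_predicted = 100.0 * tp / (tp + fp)   # share of predictions that are correct
    ratio_actual = 100.0 * tp / (tp + fn)      # share of true cases that are found
    f1 = 100.0 * 2 * tp / (2 * tp + fp + fn)
    return ratio_predicted, ratio_actual, f1


# Counts read from Table 3 (rows: true class, columns: predicted class).
abnormal_as_normal, abnormal_as_abnormal = 30, 346
normal_as_normal, normal_as_abnormal = 753, 20

abnormal = per_class_scores(tp=abnormal_as_abnormal,
                            fp=normal_as_abnormal,
                            fn=abnormal_as_normal)
normal = per_class_scores(tp=normal_as_normal,
                          fp=abnormal_as_normal,
                          fn=normal_as_abnormal)

print([round(v, 3) for v in abnormal])  # [94.536, 92.021, 93.261]
print([round(v, 3) for v in normal])    # [96.169, 97.413, 96.787]
```

Both rows of the table are reproduced to three decimal places under this reading of the columns.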

4.2 Feature extraction

Gabor Wavelets (GWs) are used to perform the wavelet decomposition. The corresponding LPF $\phi$ is a Gaussian function. The invariance scale is fixed at 0.5 seconds. The constructed WS network comprises 2 layers, with $Q_{1}=8$ and $Q_{2}=1$ wavelets per octave at the 1st and 2nd layers, respectively. Other settings for the invariance scale and the wavelet octave resolution were also tried; however, this architecture preserved the signal information best for classification. The employed GWs together with their LPF $\phi_{J}(t)$ are shown in Fig. 4. Note that the coarsest-scale wavelet does not exceed the invariance scale determined by the time support of the LPF $\phi_{J}(t)$. The output of the WS network is a tensor of size 279 $\times$ 5 $\times$ 2680. Every slice of the tensor holds the Scattering Coefficients (SCs) of one PCG signal. The SCs are critically downsampled in time based on the bandwidth of the LPF, which results in 5 Time Windows (TWs) for each of the 279 scattering paths. To obtain a data structure compatible with the classifiers used, the tensor is reshaped into a 13400 $\times$ 75 matrix in which every column corresponds to a scattering path and every row to a TW. The 13400 rows arise because there are 5 TWs for each of the 2680 signals in the database. The SCs of the 5 TWs for one PCG signal are shown in Fig. 4.

In the Binary Classification (BC) experiments, the PCG signal dataset was labelled with only 2 classes, namely Healthy and Abnormal.
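The reshaping step, from a paths $\times$ time windows $\times$ signals tensor to a matrix with one row per (signal, time window) pair and one column per scattering path, can be sketched with nested lists. The function name, the toy dimensions, and the replication of each signal's label across its time windows are assumptions made for illustration.

```python
def flatten_scattering_tensor(tensor, labels):
    """Reshape tensor[path][window][signal] into rows of per-path features.

    Returns (rows, row_labels): one row per (signal, window) pair, one
    column per scattering path, with the signal's label repeated for
    each of its windows.
    """
    n_paths = len(tensor)
    n_windows = len(tensor[0])
    n_signals = len(tensor[0][0])
    rows, row_labels = [], []
    for s in range(n_signals):
        for w in range(n_windows):
            rows.append([tensor[p][w][s] for p in range(n_paths)])
            row_labels.append(labels[s])
    return rows, row_labels


# Toy example: 3 paths, 5 time windows, 4 signals (values encode the indices).
toy = [[[100 * p + 10 * w + s for s in range(4)] for w in range(5)]
       for p in range(3)]
toy_labels = ["Normal", "Abnormal", "Normal", "Abnormal"]

rows, row_labels = flatten_scattering_tensor(toy, toy_labels)
print(len(rows), len(rows[0]))  # 20 3
```

With 5 time windows and 2680 signals, the same procedure yields 5 $\times$ 2680 $=$ 13400 rows, matching the row count stated above.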

Table 4
Classification accuracy (%) of the SVM classifiers for different numbers of CV folds and hold-out splits

Classifier	5-fold CV	10-fold CV	15-fold CV	20-fold CV	Hold-out 70%–30%	Hold-out 80%–20%
SVM-L	94.1	94.3	94.3	94.31	94.31	94.37
SVM-P	99.4	99.6	99.72	99.7	99.72	99.8
SVM-G	98.69	99.0	99.05	99.0	99.05	99.14

Table 5

Previous classification works mentioned in literature

Reference	Year	Dataset	Method	Results
[39]	2017	Four-channel PCG private dataset CAD, Healthy: 33, 33	CPSD features, ReliefF, SVMs	CAD versus Healthy Accuracy: 84%
[40]	2018	Four-channel private PCG dataset CAD, Healthy: 33, 33	SM, PSD function’s moments, AR parameters, SE, along with IF, ANN Classifier	CAD versus Regular Accuracy: 74.24%
[41]	2018	Four-channel private PCG dataset CAD, Healthy: 33, 33	SM, PSD function’s moments, AR parameters, SE, together with IF, ANN Classifier	CAD versus Regular Accuracy: 74.24%
[5]	2019	Dataset size: 66 subjects (Healthy: 37, CAD: 29)	Three time and four frequency features, ANN	Accuracy: 82.5%, Sensitivity: 85.6%, Specificity: 79.5%
[38]	2020	Four-channel PCG signals: 960 (CAD subjects: 40, Non-CAD subjects: 40)	SST of the cardiac cycle, SVM, KNN	Accuracy: 83.4%
[43]	2021	Private PCG dataset (Normal: 492, DVCAD: 182, SVCAD: 197, TVCAD: 319)	EMD-centered pre-processing, MFCC, statistical as well as SFs with SVM-C for multiclass classification ((i) Normal, (ii) DVCAD, (iii) SVCAD, (iv) TVCAD)	Total accuracy: 88.4%; class-wise accuracies: Normal: 88.0%, DVCAD: 89.2%, SVCAD: 91.1%, TVCAD: 85.3%
[49]	2021	Private PCG dataset (Healthy: 459, SVCAD: 251, DVCAD: 203, TVCAD: 277)	EMD-centered pre-processing, MFCC along with 1D-ALTP features, SVMs classifier	Detection accuracy: 98.3% (Abnormal: 98.4%, Normal: 98.3%); multiclass accuracy: 97.2% (Normal: 98.9%, SVCAD: 97.7%, DVCAD: 96.6%, TVCAD: 95.3%)
This work	2022	Physionet	WT Pre-processing, WS features, SVMs	99.8% accuracy, 96.45% sensitivity, and 95.9% specificity

Table 6

Previous classification works using the Physionet database

Reference	Features	Classification method	Results
Whitaker [24]	Sparse coding	SVMs	Balanced accuracy: 89.2%
Homsi [25]	131 features in total from the (a) time, (b) frequency, wavelet, and (c) statistical domains	Nested ensemble classifiers	Balanced accuracy: 80.1%
Langley [26]	Blend of spectral amplitude and wavelet	Tree classifier	Balanced accuracy: 79%
Jinghui [27]	WT	Twin SVM	Balanced accuracy: 90.4%
Iqtidar [49]	MFCC together with 1D-ALTP features	SVM with cubic kernel	Balanced accuracy: 91.3%
This work	WT pre-processing, WS features	SVMs	Balanced accuracy: 99.63%

Figure 5.

Performance comparison with previous studies.

Figure 5 compares the accuracy of the proposed method with that of the existing works. The comparison shows that the proposed method is more accurate than the existing methods. In the proposed work, the PCG signals are denoised efficiently and a more appropriate feature extraction phase is carried out, which extracts the hidden features from the signal. As a result, the proposed method attains a higher accuracy (99.63%).

5. Discussion

Tables 5 and 6 outline the earlier work on CAD detection using PCG signals. These studies used self-collected private datasets of various sizes to train and validate their algorithms. Thus, a one-to-one comparison is not reasonable. The majority of the earlier work only classified normal versus abnormal signals. In Ref. [39], the PCG signals were acquired from 4 different sites for better CAD diagnosis. CPSD features were extracted from the PCG signals and reduced with the ReliefF algorithm, and an SVM was used as the classifier for this binary classification problem. In other research [40], 5 different features were extracted and an ANN classifier was used to classify CAD and normal PCG signals; it achieved 74.2% accuracy with AR features. Li et al. [5] proposed a feature fusion methodology for CAD detection from PCG signals. The feature fusion framework is a machine learning approach that combines multiple features extracted from PCG signals to enhance the accuracy of CAD detection. The merits of the feature fusion framework are listed below.

Improved accuracy

By combining multiple features extracted from PCG signals, the feature fusion framework can improve the accuracy of CAD detection. This is because different features capture different aspects of the signal, and combining them can provide a more comprehensive and accurate representation of the signal.

Reduced dimensionality

The feature fusion framework reduces the dimensionality of the feature space by combining multiple features into a single feature vector. This could aid in enhancing the efficacy of the machine-learning approach utilized for CAD detection.

A variety of Multi-Domain Features (MDFs) are extracted from the PCG signals and reduced. MFCC images are the input to the CNN. The MFCCs are two-dimensional images, with time on one axis and frequency on the other. Each MFCC coefficient corresponds to a pixel in the image, and the overall MFCC image reflects the temporal and spectral characteristics of the PCG signal. These features were merged with the multi-domain PCG signal features and classified by an MLP. The multi-domain PCG signal features are listed in Table 7.

Table 7
Multi-domain PCG signal features

Features	Description
Time-domain features	Describe the temporal characteristics of the PCG signal, namely the duration of each heart sound, the time interval between heart sounds, and the amplitude of each heart sound.
Frequency-domain features	Describe the frequency content of the PCG signal, such as the power spectrum, the dominant frequency components, and the spectral entropy.
Wavelet-based features	Describe the signal in both the time and frequency domains using WTs; they include the coefficients of the wavelet decomposition, the energy of each wavelet sub-band, and the wavelet entropy.
Statistical features	Describe the statistical properties of the PCG signal, namely its mean, variance, skewness, and kurtosis.
Fractal-based features	Describe the fractal properties of the PCG signal, such as the Hurst exponent, the correlation dimension, and the fractal dimension.
Waveform morphology features	Describe the shape and morphology of the PCG signal, such as the slope of the upstroke and downstroke of each heart sound, the area under the waveform, and the peak-to-peak amplitude.
Time-frequency features	Describe the signal jointly in the time and frequency domains.

The PCG dataset was collected from 175 subjects, and an accuracy of 90.4% was obtained. A 2-stage classification system was proposed in [43] for the recognition of normal, DVCAD, SVCAD, and TVCAD signals. The PCG signals were denoised with EMD, followed by FE with a blend of statistical, spectral, and MFCC features. EMD is a signal processing technique that decomposes a complex signal into simpler components called Intrinsic Mode Functions (IMFs); its main applications are signal denoising and feature extraction. EMD decomposes the PCG signal into its different frequency bands and then filters out unwanted noise from each band individually. After denoising, the IMFs can be analyzed to extract features for further analysis of heart conditions. KNN was used in the 1st stage and an SVM classifier in the 2nd stage, giving an accuracy of 90.2%.

Most of the works on automatic PCG signal classification used the PhysioNet/CinC dataset to identify normal and abnormal signals. The proposed technique is validated on this dataset in this study. Tables 5 and 6 compare the performance of the proposed technique with these studies. The earlier works are chiefly based on traditional ML approaches [24, 25, 26, 27]. The highest accuracy, 90.4%, is reported by Jinghui [27]. In contrast to these works, new WS features are extracted here to obtain a robust Feature Representation (FR) of the different classes of PCG signals. The features are classified by an SVM with the cubic kernel. The performance assessment of the proposed technique, with an accuracy of 99.63%, confirms that the model is competitive.

5.1 Support vector machine classification

Using the 4th TW as a feature gives the highest average accuracy of 99.8% amongst all methods that use the SVM classifier. Table 3 shows the confusion matrix for the 5 TWs. According to the confusion matrix, the abnormal sounds show a sensitivity of 94.536%, a specificity of 92.021%, and an F1-score of 93.261%, whilst the normal sounds show a sensitivity of 96.169%, a specificity of 97.413%, and an F1-score of 96.787%. Three variants of SVM were used: SVM-L with a linear kernel, SVM-C with a cubic polynomial kernel, and SVM-G with a Gaussian kernel. The SVM variants are described below.

SVM-L

The purpose of SVM-L is to find the hyperplane that linearly separates the data. The linear kernel is a simple dot product between the input features and is computationally efficient. SVM-L is commonly used for binary classification problems.

SVM-C

The purpose of SVM-C is to find a non-linear decision boundary using a cubic polynomial kernel. The cubic polynomial kernel maps the input features to a high-dimensional space, which allows a non-linear decision boundary to be found. SVM-C is useful for problems where the data is not linearly separable.

SVM-G

The purpose of SVM-G is to find a non-linear decision boundary using a Gaussian kernel. The Gaussian kernel is a popular non-linear kernel that maps the input features to an infinite-dimensional space. SVM-G is commonly used for problems where the data is not linearly separable.
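The three kernels behind SVM-L, SVM-C, and SVM-G can be written out as plain functions. This sketch uses the standard textbook forms; the `gamma` and `coef0` parameters and their default values are assumptions, since the kernel hyperparameters are not stated in the text.

```python
import math


def linear_kernel(x, y):
    """Dot product of two feature vectors (SVM-L)."""
    return sum(a * b for a, b in zip(x, y))


def cubic_kernel(x, y, gamma=1.0, coef0=1.0):
    """Polynomial kernel of degree 3 (SVM-C)."""
    return (gamma * linear_kernel(x, y) + coef0) ** 3


def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel (SVM-G)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical feature vectors.
u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v))   # 11.0
print(cubic_kernel(u, v))    # 1728.0
print(gaussian_kernel(u, u)) # 1.0
```

An SVM replaces every dot product between feature vectors with one of these functions, which is how the cubic and Gaussian variants obtain non-linear decision boundaries without computing the high-dimensional mapping explicitly.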

In heart sound analysis, SVM is used with different types of features. Some of the feature types that an SVM can handle include:

Time-domain features: Features that are extracted directly from the raw signal, such as the amplitude and duration of the heart sounds.

Frequency-domain features: Features that are extracted using the Fourier transform or other spectral analysis techniques, such as the power spectrum or spectral entropy.

Wavelet transform features: Features that are extracted using the wavelet transform, which can capture the time-frequency characteristics of the signal.

MFCCs: Features that are extracted using the Mel frequency cepstral coefficients (MFCCs) technique, which is commonly used in speech recognition and can be adapted for heart sound analysis.

The classification metrics for the 3 SVM variants are given in Table 2. SVM-C (listed as SVM-P in the tables) surpasses the other 2 variants with 96.45% sensitivity, 95.9% specificity, and a 96.3% F1-score. K-fold validation and hold-out tests were also carried out on the dataset with the 3 SVM variants. For 5-, 10-, 15-, and 20-fold validation, SVM-C surpasses the other 2 variants with accuracies of 99.4%, 99.6%, 99.72%, and 99.7%, respectively (Table 4). In the hold-out evaluation, the 80%–20% train-test split, with 99.8% accuracy, outperformed the 70%–30% split according to the simulation results in Table 4.

6. Conclusion

The study shows that PCG signals contain prominent characteristic information about the state of the human heart, and this information is extracted to identify heart disorders. A computer-aided diagnosis system based on PCG signal analysis is proposed to differentiate healthy and abnormal classes efficiently. The proposed methodology uses WT to pre-process the raw PCG signals, and the WS features are then classified in a binary experiment. For this binary experiment (Healthy versus CAD), the proposed technique achieves 99.63% accuracy, 95.9% sensitivity, and 96.2% specificity.

References

1. Mc Namara, Alzubaidi and Jackson J.K., Cardiovascular disease as a leading cause of death: How are pharmacists getting involved? Integrated Pharm. Res. Pract. 8 (2019).

2. Roth G.A., Johnson, Abajobir, Abd-Allah, Abera S.F., Abyu, Ahmed, Aksut, Alam, et al., Global, regional, and national burden of cardiovascular diseases for 10 causes, 1990 to 2015, J. Am. Coll. Cardiol. 70(1) (2017), 1–25.

3. Mathers C.D. and Loncar, Projections of global mortality and burden of disease from 2002 to 2030, PLoS Med. 3(11) (2006), e442.

4. Kones, Primary prevention of coronary heart disease: Integration of new data, evolving views, revised goals, and role of rosuvastatin in management. a comprehensive survey, Drug Des. Dev. Ther. 5 (2011), 325.

5. Wang, Liu, Zeng, Zheng, Chu, Yao, Wang, Jiao and Karmakar, A fusion framework based on multi-domain features and deep learning features of phonocardiogram for coronary artery disease detection, Comput. Biol. Med., 2020, 103733.

6. Huang Y.-Y., Kung P.-T., Chiu L.-T. and Tsai W.-C., Related factors and incidence risk of acute myocardial infarction among the people with disability: A national population-based study, Res. Dev. Disabil. 36 (2015), 366–375.

7. Zipes D.P., Libby, Bonow R.O., Mann D.L. and Tomaselli G.F., Braunwald’s Heart Disease E-Book: A Textbook of Cardiovascular Medicine, Elsevier Health Sciences, 2018.

8. Shah, Jan M.U., Altaf and Salahudin, Correlation of hyper-homocysteinemia with coronary artery disease in absence of conventional risk factors among young adults, J. Saudi Heart Assoc. 30(4) (2018), 305–310.

9. Nishiyama, Iwase, Nishi, Ishiwata, Komiyama, Yanagishita, Nakanishi and Seki, Long-term outcome in double-vessel coronary, Jpn. Heart J. 38(2) (1997), 181–189.

10. Griffel, Zia M.K., Fridman, Saponieri and Semmlow J.L., Detection of coronary artery disease using automutual information, Cardiovasc. Eng. Technol. 3(3) (2012), 333–344.

11. Nissen S.E., Limitations of Computed Tomography Coronary Angiography, 2008.

12. Hanna I.R. and Silverman M.E., A history of cardiac auscultation and some of its contributors, Am. J. Cardiol. 90(3) (2002), 259–267.

13. Padmanabhan and Semmlow J.L., Dynamical analysis of diastolic heart sounds associated with coronary artery disease, Ann. Biomed. Eng. 22(3) (1994), 264–271.

14. Akay, Harmonic decomposition of diastolic heart sounds associated with coronary artery disease, Signal Process. 41(1) (1995), 79–90.

15. Semmlow and Rahalkar, Acoustic detection of coronary artery disease, Annu. Rev. Biomed. Eng. 9 (2007), 449–469.

16. Kumar, Pachori R.B. and Acharya U.R., Characterization of coronary artery disease using flexible analytic wavelet transform applied on ecg signals, Biomed. Signal Process Control 31 (2017), 301–308.

17. Acharya U.R., Fujita, Lih O.S., Adam, Tan J.H. and Chua C.K., Automated detection of coronary artery disease using different durations of ecg segments with convolutional neural network, Knowl. Base Syst. 132 (2017), 62–71.

18. Tan J.H., Hagiwara, Pang, Lim, S.L., Adam, San Tan, Chen and Acharya U.R., Application of stacked convolutional and long short-term memory network for accurate identification of cad ecg signals, Comput. Biol. Med. 94 (2018), 19–26.

19. Arabasadi, Alizadehsani, Roshanzamir, Moosaei and Yarifard A.A., Computer aided decision making for heart disease detection using hybrid neural network-genetic algorithm, Comput. Methods Progr. Biomed. 141 (2017), 19–26.

20. Acharya U.R., Sudarshan V.K., Koh J.E., Martis R.J., Tan J.H., S.L., Muhammad, Hagiwara, Mookiah M.R.K., Chua K.P., et al., Application of higher-order spectra for the characterization of coronary artery disease using electrocardiogram signals, Biomed. Signal Process Control 31 (2017), 31–43.

21. Dolatabadi A.D., Khadem S.E.Z. and Asl B.M., Automated diagnosis of coronary artery disease (cad) patients using optimized svm, Comput. Methods Progr. Biomed. 138 (2017), 117–126.

22. Khan M.U., Aziz, Naqvi S.Z.H. and Rehman, Classification of coronary artery diseases using electrocardiogram signals, in: 2020 International Conference on Emerging Trends in Smart Technologies (ICETST), IEEE, 2020, pp. 1–5.

23. Butun, Yildirim, Talo, Tan R.-S. and Acharya U.R., 1d-cadcapsnet: One dimensional deep capsule networks for coronary artery disease detection using ecg signals, Phys. Med. 70 (2020), 39–48.

24. Whitaker B.M., Suresha P.B., Liu, Clifford G.D. and Anderson D.V., Combining sparse coding and time-domain features for heart sound classification, Physiol. Meas. 38(8) (2017), 1701.

25. Homsi M.N. and Warrick, Ensemble methods with outliers for phonocardiogram classification, Physiol. Meas. 38(8) (2017), 1631.

26. Langley and Murray, Heart sound classification from unsegmented phonocardiograms, Physiol. Meas. 38(8) (2017), 1658.

27. and Du, Classification of heart sounds based on the wavelet fractal and twin support vector machine, Entropy 21(5) (2019), 472.

28. Singh S.A., Majumder and Mishra, Classification of short unsegmented heart sound based on deep learning, in: 2019 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), IEEE, 2019, pp. 1–6.

29. J.M.-T., Tsai M.-H., Huang Y.Z., Islam S.H., Hassan M.M., Alelaiwi and Fortino, Applying an ensemble convolutional neural network with savitzky-golay filter to construct a phonocardiogram prediction model, Appl. Soft Comput. 78 (2019), 29–40.

30. Eslamizadeh and Barati, Heart murmur detection based on wavelet transformation and a synergy between artificial neural network and modified neighbor annealing methods, Artificial Intelligent Med. 78 (2017), 23–40.

31. Zhang, Han and Deng, Heart sound classification based on scaled spectrogram and tensor decomposition, Expert Systems Applications 84 (2017), 220–231.

32. Safara, Doraisamy, Azman, Jantan and Ramaiah A.R.A., Multi-level basis selection of wavelet packet decomposition tree for heart sound classification, Comput. Biol. Med. 43(10) (2013), 1407–1414.

33. Karar M.E., El-Khafif S.H. and El-Brawany M.A., Automated diagnosis of heart sounds using rule-based classification tree, J. Med. Syst. 41(4) (2017), 60.

34. Yadav, Singh, Dutta M.K. and Travieso C.M., Machine learning-based classification of cardiac diseases from pcg recorded heart sounds, Neural Comput. Appl., 2019, 1–14.

35. Makaryus A.N., Makaryus J.N., Figgatt, Mulholland, Kushner, Semmlow J.L., Mieres and Taylor A.J., Utility of an advanced digital electronic stethoscope in the diagnosis of coronary artery disease compared with coronary computed tomographic angiography, Am. J. Cardiol. 111(6) (2013), 786–792.

36. Schmidt S.E., Holst-Hansen, Hansen, Toft and Struijk J.J., Acoustic features for the identification of coronary artery disease, IEEE (Inst. Electr. Electron. Eng.) Trans. Biomed. Eng. 62(11) (2015), 2611–2619.

37. Banerjee, Choudhury A.D., Datta, Pal and Mandana K.M., Noninvasive detection of coronary artery disease using pcg and ppg, in: eHealth 360, Springer, 2017, pp. 241–252.

38. Pathak, Samanta, Mandana and Saha, Detection of coronary artery atherosclerotic disease using novel features from synchrosqueezing transform of phonocardiogram, Biomed. Signal Process Control 62 (2020), 102055.

39. Samanta, Mandana, Saha, et al., Identification of coronary artery disease using cross power spectral density, in: 2017 14th IEEE India Council International Conference (INDICON), IEEE, 2017, pp. 1–6.

40. Samanta, Pathak, Mandana and Saha, Identification of coronary artery diseased subjects using spectral features, in: 2018 Twenty Fourth National Conference on Communications (NCC), IEEE, 2018, pp. 1–6.

41. Samanta, Pathak, Mandana and Saha, Classification of coronary artery diseased and normal subjects using multi-channel phonocardiogram signal, Biocybernet. Biomed. Eng. 39(2) (2019), 426–443.

42. Pathak, Samanta, Mandana and Saha, An improved method to detect coronary artery disease using phonocardiogram signals in noisy environment, Appl. Acoust. 164 (2020), 107242.

43. Khan M.U., Aziz, Iqtidar, Zaher G.F., Alghamdi and Gull, A two-stage classification model integrating feature fusion for coronary artery disease detection and classification, Multimedia Tools Applications, 2021, 1–30.

44. Gyanaprava, Kumar and Asit K.M., Denoising of heart sound signal using wavelet transform, International Journal of Research in Engineering and Technology, Apr-2013, 2(4) (2013), 719–723.

45. Leonarduzzi, Liu and Wang, Scattering transform and sparse linear classifiers for art authentication, Signal Processing 150 (2018), 11–19.

46. Bruna and Mallat, Classification with scattering operators, Computer Vision and Pattern Recognition, pp. 1561–1566, Providence, RI, USA, June 2011.

47. Andén and Mallat, Multiscale scattering for audio classification, in: International Society for Music Information Retrieval Conference, Miami, Florida, USA, 2011, pp. 657–662.

48. Andén and Mallat, Deep scattering spectrum, IEEE Transactions on Signal Processing 62(16) (2014), 4114–4128.

49. Khushbakht, Usman, Sumair and Muhammad U.K., Phonocardiogram signal analysis for classification of Coronary Artery Diseases using MFCC and 1D adaptive local ternary patterns, Computers in Biology & Medicine 138 (2021), 1–15. doi: 10.1016/j.compbiomed.2021.104926.

50. Fan, Hong, Shang, Klaus and Fengyu, Classification of heart sound using convolutional neural network, Applied Sciences 10(11) (2020), 1–17.

51. Palani T.K., Parvathavarthini and Snekhalatha, Automated heart sound classification system from unsegmented phonocardiogram (PCG) using deep neural network, Physical and Engineering Sciences in Medicine 43 (2020), 505–515.

52. Wuyou, Wangqi, Sheng, Xitian and Hongying, Research on segmentation and classification of heart sound signals based on deep learning, Applied Sciences 11(10) (2021), 1–15.

53. Faiq A.K., Anam and Muhammad S.K., Automatic heart sound classification from segmented/unsegmented phonocardiogram signals using time and frequency features, Physiological Measurement 41(5) (2020), 1–13.

54. Bassam A.N., Hossam, Nasr and Abdel-Razzak M.A.H., A framework classification of heart sound signals in physionet challenge 2016 using high order statistics and adaptive neuro-fuzzy inference system, IEEE Access 8 (2020), 224852–224859.
