A novel mechanism to recognize heart disease by optimised deep belief network with SVM classification

Abstract

A heart attack is a common cause of death globally, and it can be treated successfully when it is diagnosed simply and accurately. Getting the right diagnosis at the right time is very important for the treatment of heart failure. Currently, the conventional method of diagnosing heart disease is not reliable. Machine learning, a branch of artificial intelligence, can be used to analyze the data collected by sensors, and data mining can likewise be applied in the healthcare industry. These techniques help predict heart disease from various factors. We developed a prediction and recommendation model that predicts heart disease using an Optimized Deep Belief Network, taking into account the features of the UCI heart disease and Statlog databases. The proposed method classifies healthy people and people with heart illness with an accuracy of 97.91%.

Keywords

Heart disease, diagnosis, machine learning, deep learning

1 Introduction

Most of the deaths in the last few years have been attributed to heart disease [1]. It is a common cause of serious illness and fatality globally, arising when the heart pumps an insufficient amount of blood [2]. It can be caused by various factors such as high cholesterol and inactivity [3]. The American Heart Association states that symptoms can include an increase in heart rate and urination. The condition can also cause rapid weight gain, usually of about 1-2 kg per day [4]. These symptoms resemble those of conditions triggered by aging, and in older people they can present differently, which makes a proper diagnosis challenging. Prediction is a vital step in improving one’s health [5] and can help people keep track of their health condition [6].

Certain risk factors such as blood pressure, cholesterol, and age can increase a person’s chances of suffering from heart disease [7]. Some of these factors can be controlled, but they can be complex to manage, which is why it is important to keep track of all of one’s health conditions [8]. Various techniques are used to diagnose different diseases, including physical examinations, medical history, and symptom analysis [9]. Unfortunately, these techniques tend to provide imprecise diagnoses due to human error. In the past few years, health monitoring systems have become more sophisticated. Instead of simply counting sleep hours, they now analyze data in a more complex manner, which allows them to provide more useful information to end users. Recent studies have shown that machine learning techniques can represent complex data more effectively. The technology can be used for various applications, such as prediction, anomaly detection, and decision making, all of which involve analyzing the input data. Despite the advantages of machine learning, it is not yet feasible to analyze large amounts of data efficiently. This makes it necessary to develop effective analytic methods that can provide accurate and timely diagnoses. E-health systems are designed to help minimize the risk of disease by identifying and monitoring various conditions at an early stage. Although it is widely accepted that a disease should be identified and treated at its earliest stages, there is still a need to provide personalized medical advice to patients.

Currently, machine-learning technology is being used to improve the diagnosis of heart disease. Such systems typically combine a support vector machine, a decision tree, and fuzzy logic. Various studies have shown that these components can help diagnose heart disease [11], and researchers report that such systems can help decrease the number of deaths caused by heart disease.

Further, deep learning is a type of machine learning that is used in various fields such as education, medicine, and advertising, and it has numerous advantages. In this paper, we introduce novel deep learning models that are designed to predict heart disease [12]. Through the learned features, the models can provide high accuracy and control over the data. The paper presents a framework that combines deep learning with a multilayer network architecture to develop a prediction model for cardiovascular disease identification. Figures 1 and 2 show the parts of the human heart and the measures considered for heart disease identification.

Fig. 1

Anterior view of the human heart with blood vessels identified [Gaze, D.C. ed., 2013, Ischemic Heart Disease].

Fig. 2

MRI of Human heart.

The main goal of this work is to develop a reliable cardiovascular disease identification model. Existing models struggle to identify the disease reliably because of large differences in the characteristics of patients with the same disease. We introduce a method to initialize the network using the best network parameters, which helps avoid generating unstable models. In addition, we introduce a method to improve the prediction model by using the reconstruction error, which allows the model to determine the depth of the network independently.

2 Related works

Shah, D. et al. (2020) [13] - In this study, the authors used various data mining techniques to classify heart disease. The results revealed that one of the classification algorithms performed better than the others. However, its usefulness is still limited by the complexity of the data collected, and the authors noted that further studies are needed to improve it.

Yazdani et al. (2021) [14] - Weight scores for various features of heart disease were shown to improve the accuracy of heart disease prediction. A set of significant features was then used to determine the strength of the prediction algorithm. The findings contributed to the development of the top five rules for predicting heart attack severity. The study examines the various techniques used in predicting heart disease and how they can be utilized in other prediction models.

Ali, S.A. et al. (2020) [15] - The proposed framework is based on Ruzzo-Tompa and aims to find the optimum configuration of a DBN for predicting heart disease. It avoids generating erroneous predictions and focuses on the most accurate results. The method was compared with the ANN and Ruzzo-Tompa frameworks, and an analysis of the various network configurations was performed. Although it is more accurate than other methods, it still cannot identify heart disease exactly.

Memon et al. (2018) [16] - The goal of this research was to develop a diagnostic system that can distinguish healthy subjects from those with HD. The system was tested on a large database of heart disease patients. Some features of the system were not ideal: its complexity and high cost slowed down its processing time. The study proposes algorithms that can improve the system’s accuracy and reduce its overall processing time.

Embarak, O. et al. (2019) [17] - Several data mining techniques for predicting heart disease were compared, including Bayesian, SVM, and Naive Bayes methods. The researchers noted that these techniques could provide better predictions than the traditional methods. They then focused on finding the most accurate algorithm for predicting heart disease and noted that this algorithm could be used in hospitals to diagnose patients and save them from heart disease.

3 Proposed methodology for detecting heart disease prediction using DBN-SVM

Due to the increasing number of heart failure cases, it has become increasingly important to develop a technique that can detect and diagnose heart disease accurately [24–30]. Machine learning models are currently being studied for this purpose. The main challenge facing healthcare institutions is the availability of reliable and cost-effective facilities, and most of the machines currently used for heart disease diagnosis cannot provide accurate and precise results. The goal of this research is to find the most accurate and precise machine learning technique to diagnose heart disease. Figure 3(a) shows the block diagram of the proposed methodology, and Fig. 3(b) shows the architecture of the method flow.

Fig.3

(a) Block diagram of the proposed method. (b): Proposed DBN-SVM for heart Disease prediction Architecture.

3.1 Dataset description

The Cleveland database is a well-known dataset from the UCI machine learning repository and contains 76 attributes. The “target” field indicates the presence of heart disease in a patient, with integer values from 0 to 4.

$E_{s} = e_{1} + e_{2} + e_{3} + \dots + e_{n}$ (1)

In Eq. (1), E_s is the collection of heart disease data from the UCI database, and e_1, e_2, e_3, ..., e_n are the individual attributes. Table 1 shows the commonly used features from the Cleveland Heart Disease Dataset.

Table 1

Commonly Used Features from the Cleveland Heart Disease Dataset

Sl. no.	Attribute	Column name	Details
1	Age	Age	Patient's age, in years
2	Sex	Sex	0 = female; 1 = male
3	Chest pain	Cp	4 types of chest pain (1 = typical angina; 2 = atypical angina; 3 = non-anginal pain; 4 = asymptomatic)
4	Rest blood pressure	Trestbps	Resting systolic blood pressure (in mm Hg on admission to the hospital)
5	Serum cholesterol	Chol	Serum cholesterol in mg/dl
6	Fasting blood sugar	Fbs	Fasting blood sugar > 120 mg/dl (0 = false; 1 = true)
7	Rest electrocardiograph	Restecg	0 = normal; 1 = having ST-T wave abnormality; 2 = left ventricular hypertrophy
8	Max heart rate	Thalach	Maximum heart rate achieved
9	Exercise induced	Exang	Exercise-induced angina (0 = no; 1 = yes)
10	ST depression	Oldpeak	ST depression induced by exercise relative to rest
11	Slope	Slope	Slope of the peak exercise ST segment (1 = upsloping; 2 = flat; 3 = downsloping)
12	No. of vessels	Ca	No. of major vessels (0 to 3) colored by fluoroscopy
13	Thalassemia	Thal	Defect types (3 = normal; 6 = fixed defect; 7 = reversible defect)
14	Target (class attribute)	Class	Whether the patient has the disease (1 = yes; 0 = no); this is the predicted attribute

Table 2

All the ranges formed for each feature (ranged from 0–4)

	age	sex	cp	trestbps	chol	fbs	restecg	thalach	exang	oldpeak	slope	ca	thal	target
0	63	1	3	145	233	1	0	150	0	2.3	0	0	1	1
1	37	1	2	130	250	0	1	187	0	3.5	0	0	2	1
2	41	0	1	130	204	0	0	172	0	1.4	2	0	2	1
3	56	1	1	120	236	0	1	178	0	0.8	2	0	2	1
4	57	0	0	120	354	0	1	163	1	0.6	2	0	2	1

3.2 Preprocessing of the dataset

Because the data contain a number of outliers, the approach had to be changed to reduce the risk of overfitting. The initial results were not promising; after the issue was addressed through a pre-training strategy, the results improved considerably. During pre-processing, the 6 instances with missing values were deleted. The 14 attributes listed in Table 1 were then used to classify the data. The target value gives the criticality level of a patient with heart disease: 0 indicates that there is no heart disease, while the other values indicate heart disease at various criticality levels.
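The record-deletion step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes missing values are marked with `?` (the convention of the processed Cleveland file in the UCI repository), the three sample rows are made up for the example, and the helper names `drop_incomplete` and `binarize_target` are hypothetical. Collapsing levels 1 to 4 into a single "disease" class is likewise an assumption, based on the healthy-versus-ill classification reported in the abstract.

```python
import csv
import io

# Illustrative sample only: three records in the 14-attribute layout,
# one of which has a missing `ca` value marked by "?" (assumed convention).
RAW = """age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
63,1,1,145,233,1,2,150,0,2.3,3,0,6,0
67,1,4,160,286,0,2,108,1,1.5,2,?,3,2
41,0,2,130,204,0,2,172,0,1.4,1,0,3,3
"""


def drop_incomplete(rows, missing="?"):
    """Keep only the records in which no attribute is missing."""
    return [r for r in rows
            if missing not in r.values() and "" not in r.values()]


def binarize_target(rows):
    """Map criticality levels 1-4 to 1 (disease present); 0 stays 0."""
    return [dict(r, target="1" if int(r["target"]) > 0 else "0")
            for r in rows]


records = list(csv.DictReader(io.StringIO(RAW)))
clean = binarize_target(drop_incomplete(records))
print(len(records), "records ->", len(clean), "complete records")
```

Deleting incomplete records keeps the example simple; imputing the missing values would be an alternative, but the text describes deletion.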

3.3 Description of the proposed approach

In this section, we discuss a novel approach to improve the accuracy of heart disease prediction using a deep neural network optimized with SVM. Several problems arise when using DBN with SVM: too many errors, incorrect data collection, and improper calculation. Deep learning is a branch of machine learning that learns to extract high-level features from samples by combining low-level inputs into higher-level representations. The learned features avoid the manual feature selection and design process, which can be very costly and time-consuming. This paper shows that a deep neural network-based prediction model can predict a person’s disease state better than its predecessor.

We first build the model with the deep belief network and then improve it by adopting the best network parameters. With this technique, the model can independently determine the depth of the deep belief network. The convergence rate of conventional neural networks is slow and is easily affected by the random initialization of the network. Selecting the structure of a BP neural network is not a simple or quick process and generally requires experience. The DBN model is an improved version of the general neural network that can self-adjust. It avoids the defects of the BP network and can also model one-dimensional data.

A neural network is an artificial intelligence system that learns the probability of heart disease from an independent data set. The data presented to the system are the learned samples, while the hidden ones are those passed on to the next layer. Figure 4 shows the network configuration of the deep neural network used to predict heart disease.

Fig. 4

Network Configuration of the deep neural network to predict heart disease.

The concept of network energy is key to our proposed learning process. It relates to the probability distribution of the joint state of the visible units and the hidden units.

$\begin{matrix} u = J (c, l) = y (n) - \sum_{j = 1}^{uC} h_{j} c_{j} \\ - \sum_{i = 1}^{uL} \sum_{j = 1}^{uC} l_{i} D_{ij} c_{j} - \sum_{i = 1}^{sM} g_{i} l_{i} \end{matrix}$ (2)

Fig. 5

An Application of a real-time monitoring system used in predicting the heart disease.

The binary states of heart feature i and label feature j are l_i and c_j, respectively, and U(c) and U(l) denote the marginal distributions over the hidden and visible units. The joint distribution is defined through the energy function, with B as the normalizing constant: $U (c, l) = \frac{1}{B} e^{- J (c, l)}$ (3)

The two related conditional distributions of the labeled and unlabeled heart features are as follows:

$U (c | l) = \prod_{j = 1}^{uC} \frac{\exp (h_{j} c_{j} + \sum_{i = 1}^{uL} l_{i} D_{ij} c_{j})}{\sum_{\tilde{c}_{j}} \exp (h_{j} \tilde{c}_{j} + \sum_{i = 1}^{uL} l_{i} D_{ij} \tilde{c}_{j})}$ (4)

$U (l | c) = \prod_{i = 1}^{uL} \frac{\exp (g_{i} l_{i} + \sum_{j = 1}^{uC} l_{i} D_{ij} c_{j})}{\sum_{\tilde{l}_{i}} \exp (g_{i} \tilde{l}_{i} + \sum_{j = 1}^{uC} \tilde{l}_{i} D_{ij} c_{j})}$ (5)

Because there are no connections between the units within one layer, the units of a layer are conditionally independent given the other layer, and the activation probability of each unit can be computed directly: $U (c_{j} = 1 | l) = \frac{1}{1 + \exp (- h_{j} - \sum_{i = 1}^{uL} l_{i} D_{ij})}$ (6)
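As a small numerical illustration of Eq. (6), the sketch below evaluates the activation probability of each hidden unit for one visible vector, assuming the usual logistic form (bias plus weighted visible input passed through a sigmoid). The names `l`, `D`, and `h` follow the symbols of Eq. (2); the numbers and the function name `hidden_activation_prob` are made up for the example.

```python
import math


def hidden_activation_prob(l, D, h, j):
    """U(c_j = 1 | l): sigmoid of the bias h[j] plus the weighted visible input."""
    total = h[j] + sum(l[i] * D[i][j] for i in range(len(l)))
    return 1.0 / (1.0 + math.exp(-total))


# Toy example: 3 visible units, 2 hidden units (values are illustrative).
l = [1, 0, 1]                                # binary visible states
D = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]]   # D[i][j]: weight between l_i and c_j
h = [0.0, 0.1]                               # hidden biases

probs = [hidden_activation_prob(l, D, h, j) for j in range(len(h))]
print([round(p, 4) for p in probs])  # [0.5498, 0.525]
```

Because each hidden unit depends only on the visible layer, the probabilities can be computed independently of one another, which is what makes layer-wise sampling cheap.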

Maximum likelihood estimation is a reliable technique for effectively learning our proposed network model. $Q (θ) = \log U (l) = \log \sum_{c} e^{- J (c, l)} - \log \sum_{c, l} e^{- J (c, l)}$ (7)

Here θ denotes the parameters to be estimated for successful cardiac feature learning. The gradient may be expressed as follows:

$\frac{\partial Q}{\partial θ} = \frac{\partial}{\partial θ} (\log \sum_{c} e^{- J (c, l)} - \log \sum_{c, l} e^{- J (c, l)})$

$= \sum_{c} \frac{e^{- J (c, l)}}{\sum_{c} e^{- J (c, l)}} (- \frac{\partial J (c, l)}{\partial θ}) - \sum_{c, l} \frac{e^{- J (c, l)}}{\sum_{c, l} e^{- J (c, l)}} (- \frac{\partial J (c, l)}{\partial θ})$

$= \sum_{c} U (c | l) (- \frac{\partial J (c, l)}{\partial θ}) - \sum_{c, l} U (c, l) (- \frac{\partial J (c, l)}{\partial θ})$ (8)

To simplify the notation, we introduce two shorthand terms for the data-dependent and model-dependent expectations:

${〈 θ 〉}_{data} = \sum_{c} U (c | l) (- \frac{\partial J (c, l)}{\partial θ})$

${〈 θ 〉}_{model} = \sum_{c, l} U (c, l) (- \frac{\partial J (c, l)}{\partial θ})$ (9)

The partial derivatives of the energy function with respect to the parameters of our proposed framework model are: $- \frac{\partial J (c, l)}{\partial D_{ij}} = l_{i} c_{j}, \quad - \frac{\partial J (c, l)}{\partial h_{j}} = c_{j}, \quad - \frac{\partial J (c, l)}{\partial g_{i}} = l_{i}$ (10)

The first term can be computed easily. The second, however, requires summing over all the possible configurations of the unlabeled heart features. We propose a framework to handle large data sets with unlabeled heart features, dividing the training data into mini-batches to improve computing efficiency. $θ = θ + ɛ Δ θ = θ + ɛ ({〈 θ 〉}_{data} - {〈 θ 〉}_{model})$ (11)

The learning rate of our proposed framework for detecting heart disease is ɛ. The update can be written for a mini-batch of L samples as:

$Δ D_{ij} = \frac{\sum_{x = 1}^{L} (l_{(x), i}^{(0)} c_{(x), j}^{(0)} - l_{(x), i}^{(k)} c_{(x), j}^{(k)})}{L}$ (12)

$Δ h_{j} = \frac{\sum_{x = 1}^{L} (c_{(x), j}^{(0)} - c_{(x), j}^{(k)})}{L}$ (13)

$Δ g_{i} = \frac{\sum_{x = 1}^{L} (l_{(x), i}^{(0)} - l_{(x), i}^{(k)})}{L}$ (14)

The superscript (k) denotes the state of the heart features after k sampling steps, and (0) denotes their state on the data. The structure is learned layer by layer, and the parameters are obtained once all the data have been processed.
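The mini-batch update of Eqs. (12) to (14) can be sketched for a single RBM layer with k = 1. This is a generic contrastive-divergence sketch under stated assumptions, not the authors' implementation: `D`, `g`, and `h` stand for the weights, visible biases, and hidden biases of Eq. (2), the toy batch and all hyperparameters are illustrative, and hidden probabilities are used in place of sampled states when accumulating the statistics (a common variance-reduction choice).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample(p, rng):
    """Draw a binary state that is 1 with probability p."""
    return 1 if rng.random() < p else 0


def cd1_update(batch, D, g, h, eps, rng):
    """One CD-1 step on a mini-batch of binary visible vectors.

    D[i][j]: weights, g[i]: visible biases, h[j]: hidden biases.
    Parameters are updated in place; the mean reconstruction error is returned.
    """
    n_vis, n_hid, L = len(g), len(h), len(batch)
    dD = [[0.0] * n_hid for _ in range(n_vis)]
    dg = [0.0] * n_vis
    dh = [0.0] * n_hid
    err = 0.0
    for l0 in batch:
        # Positive phase: hidden probabilities and a sample given the data.
        pc0 = [sigmoid(h[j] + sum(l0[i] * D[i][j] for i in range(n_vis)))
               for j in range(n_hid)]
        c0 = [sample(p, rng) for p in pc0]
        # Negative phase: reconstruct the visible units, then the hidden ones.
        pl1 = [sigmoid(g[i] + sum(D[i][j] * c0[j] for j in range(n_hid)))
               for i in range(n_vis)]
        l1 = [sample(p, rng) for p in pl1]
        pc1 = [sigmoid(h[j] + sum(l1[i] * D[i][j] for i in range(n_vis)))
               for j in range(n_hid)]
        # Accumulate data statistics minus reconstruction statistics.
        for i in range(n_vis):
            dg[i] += l0[i] - l1[i]
            for j in range(n_hid):
                dD[i][j] += l0[i] * pc0[j] - l1[i] * pc1[j]
        for j in range(n_hid):
            dh[j] += pc0[j] - pc1[j]
        err += sum((a - b) ** 2 for a, b in zip(l0, pl1))
    # Apply the averaged updates scaled by the learning rate eps.
    for i in range(n_vis):
        g[i] += eps * dg[i] / L
        for j in range(n_hid):
            D[i][j] += eps * dD[i][j] / L
    for j in range(n_hid):
        h[j] += eps * dh[j] / L
    return err / L


rng = random.Random(0)
batch = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
n_vis, n_hid = 4, 2
D = [[rng.gauss(0, 0.01) for _ in range(n_hid)] for _ in range(n_vis)]
g, h = [0.0] * n_vis, [0.0] * n_hid
errors = [cd1_update(batch, D, g, h, eps=0.1, rng=rng) for _ in range(200)]
print("first error:", round(errors[0], 3), "last error:", round(errors[-1], 3))
```

In a DBN, the hidden activations of one trained layer would serve as the visible data of the next layer, and the returned reconstruction error is the kind of quantity that could be monitored when deciding whether to add another layer.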

3.4 Particle swarm optimization

Each particle in the swarm flies with a velocity and tries to reach the best possible position. It does so by comparing its own past best position with the best flight experience of its companions. In an n-dimensional search space, the particles representing the characteristics of heart disease are initialized and their fitness is calculated based on the input values. The particles are then shifted to new positions using the equations below.

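$W_{i} (t + 1) = ω W_{i} (t) + c_{1} φ_{1} (Pbest_{i} - Y_{i} (t)) + c_{2} φ_{2} (Gbest - Y_{i} (t))$ (15)

$Y_{i} (t + 1) = Y_{i} (t) + W_{i} (t + 1)$ (16)

Here W_i(t) and Y_i(t) are the velocity and position of particle i at iteration t, ω is the inertia weight, c_1 and c_2 are acceleration coefficients, φ_1 and φ_2 are random numbers, Pbest_i is the best position found so far by particle i, and Gbest is the best position found so far by its companions.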

PSO is a search method that can find the settings or parameters required to optimize a given problem. The parameter space of the problem is mapped to a one-dimensional fitness space, which assigns a single fitness value to each candidate set of parameters.
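A minimal sketch of the velocity and position updates of Eqs. (15) and (16) is given below. It assumes the standard global-best form of PSO; the coefficient values, the search bounds, and the `sphere` test function are illustrative choices, not settings taken from this paper. In the proposed framework the fitness function would instead score a candidate set of initial DBN weights.

```python
import random


def pso(fitness, dim, n_particles=20, iters=100, omega=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0), seed=0):
    """Minimize `fitness` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Positions Y and velocities W of all particles.
    Y = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    W = [[0.0] * dim for _ in range(n_particles)]
    pbest = [y[:] for y in Y]
    pbest_fit = [fitness(y) for y in Y]
    best_idx = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[best_idx][:], pbest_fit[best_idx]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                phi1, phi2 = rng.random(), rng.random()
                # Velocity update: inertia + personal-best pull + global-best pull.
                W[i][d] = (omega * W[i][d]
                           + c1 * phi1 * (pbest[i][d] - Y[i][d])
                           + c2 * phi2 * (gbest[d] - Y[i][d]))
                # Position update.
                Y[i][d] += W[i][d]
            f = fitness(Y[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = Y[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = Y[i][:], f
    return gbest, gbest_fit


def sphere(y):
    """Toy fitness: squared distance from the origin (minimum at 0)."""
    return sum(v * v for v in y)


best, best_fit = pso(sphere, dim=3)
print("best fitness found:", best_fit)
```

Each candidate is reduced to a single fitness value, which is the mapping onto a one-dimensional fitness space described above.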

Fig. 6

IoT healthcare applications.

3.5 Statistical analysis based heart disease recommender model

The goal of the hybrid system is to provide a prompt and accurate recommendation regarding the severity of a disease. The system uses an intelligent algorithm to analyze the data collected by a patient about their disease. This system then uses its predictive capabilities to provide recommendations regarding the appropriate treatment. The system uses a knowledge base, which is composed of various medical factors, to provide recommendations. These factors play a huge role in the accuracy of heart disease diagnosis. This system can help reduce the workload and time spent by both the patient and the healthcare practitioner. It can also provide timely and accurate recommendations to remote patients.

3.6 Recommendations

After analyzing a clinical dataset, a statistical analysis is performed to generate recommendations for patients. A rule set is then created to generate recommendations. The rule set is formulated using a criterion derived from the data and risk factors. The recommendations are categorized into five classes. These are categorized according to the rule set generated for them. The rule set defines the process by which clinical tests are performed when a patient arrives at the hospital. The risk factor and probability estimations are then performed to conclude the exposure level of a patient. The information collected during the study is then matched with the rule set to arrive at the severity level and weight of the patient. The final score is then computed using the importance and severity weight of the data, $FinalScore = \sum_{i = 1}^{m} S_{i} (W_{i})$

where m is the number of exposures considered, S_i is the severity score of exposure i, and W_i is the importance weight of exposure i.
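The final score is a weighted sum, which the following sketch computes for a hypothetical patient. The severity scores and importance weights are made-up numbers for illustration; the actual values would come from the rule set and the medical experts.

```python
def final_score(severity, weights):
    """FinalScore = sum over exposures of S_i * W_i."""
    if len(severity) != len(weights):
        raise ValueError("severity and weights must have the same length")
    return sum(s * w for s, w in zip(severity, weights))


# Hypothetical patient with m = 3 exposures.
severity = [3, 1, 2]        # S_i: severity score of each exposure
weights = [0.5, 0.2, 0.3]   # W_i: importance weight of each exposure

score = final_score(severity, weights)
print(round(score, 2))  # 2.3
```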

4 Result and discussion

This research aims to predict which patients will develop heart disease, using classification techniques based on DBN-SVM. The experiments were conducted in Python on an 8th-generation Intel Core i7 processor. The data were split into 3 batches. Two sets were used: the first for training and the second for testing. The data sets and scores are compared in the following table.

The Cleveland dataset is used for this paper. The data are arranged in a matrix of rows and columns and are used to predict heart disease. The UCI repository has a variety of datasets related to heart disease. The Cleveland dataset has 76 attributes and over 300 records, but all of the published experiments use a subset of these attributes. The target column of the Cleveland dataset indicates the heart disease status of each patient. Table 1 shows the corresponding attributes and values. These values are used as input for the proposed framework, which implements machine learning methods. Due to the nature of these methods, they are constantly updated, and due to the nature of the data, it is hard to maintain and optimize it. The proposed framework uses a combination of backpropagation and cluster selection methods to achieve optimal cluster sizes.

4.1 Dataset description

The data are selected from the Cleveland Heart Disease Database, which contains 76 attributes. It is the only one of the UCI heart disease databases that has been used by machine learning researchers to this extent.

4.2 Simulation environment for heart Disease Prediction using DBN with SVM

The goal of a real-time monitoring system is to predict the development of a disease by monitoring various physiological parameters such as blood glucose levels, respiratory rate, and temperature. The data collected by the system is sent and received using wireless networks. The proposed framework explains how this system works. The patient’s body is equipped with sensors that are designed to detect the presence of disease. These include a heart rate sensor, a respiratory rate sensor, and a hemoglobin range sensor. The system uses an adaptive alarm system to notify the healthcare providers and the patient’s doctor when the detected condition exceeds a certain threshold. The threshold values are calculated based on the findings of the researchers and the government. Figure 6 shows the IoT healthcare applications.

This paper aims to build a DBN model that takes into account the training data. The model is built on the inputs that are given. The training sample risk factor data x, the training sample tag value y, and the testing samples x’, y’ are the outputs of this step.

(1) The learning rate is set to 1. The error setting is set to 0, the training period for each RBM has been increased to 10 times, and the weight, visible layer bias, hidden layer offset, and total weight are set to 100.

(2) The first step in the unsupervised training phase is the input: the training data x is taken without its labels. The number of neurons in the input layer is determined automatically by the number of risk factors in the data set.

The following non-negative feature findings were used to choose 13 of the best features from the Dataset.

(3) PSO is a technique used to find the optimal initial weight. This procedure works by searching for the various parameters that are necessary to perform the desired function.

In Fig. 7, the feature importance is a score that shows how significant a particular feature is to a given function or performance variable. It indicates how strongly each feature is related to the result and how it affects the performance variable. This work extracts the top features of the dataset using the feature importance scores of tree-based classifiers.

Fig. 7

Non-negative features to select 13 of the best features.

Figure 8 shows that there are 165 individuals with heart disease and 138 without it. The figure shows that people with heart disease are more likely to have chest pain than those with a non-heart disease condition. People with a value of 1 are more prone to having a heart attack or experiencing other health problems. For people with a value of 0, it’s believed that their heart disease is worse than that of people with a value of 1. The slope value of 2 indicates that people with heart disease are more prone to having it than those with a value of 1. The number of major vessels in the body is affected by blood movement. Having a value of 0 makes people more likely to have heart disease. Figure 9 shows the Heart Disease classification by Sex, chest pain, fasting blood sugar, and resting electrocardiographic.

Fig. 8

Heart disease count.

Fig. 9

Heart Disease by (a) Sex (b) chest pain (c) fasting blood sugar and (d) resting electrocardiographic.

Figure 10 shows the relationship between heart disease and several features. A resting blood pressure above 130–140 mm Hg is usually a cause for concern. For chol (serum cholesterol in mg/dl), a value over 200 is also a cause for concern, as it can increase a person’s risk of heart disease. High blood pressure can also cause depression. It can also contribute to the development of heart disease.

Fig. 10

Prediction of heart disease graph with different features.

Fig. 11

Heart Disease in the function of Age and Max Heart Rate.

The graph shows that the target class does not have an equal distribution. This is also true when a heat map is plotted, as can be seen in Fig. 12.

Fig. 12

Correlation between Heart Disease and Numeric Features.

As shown in Fig. 13, the first step in the process is to assign an importance weight to the various aspects of an exposure risk analysis. Then, a severity analysis is performed and a score is assigned to generate recommendations.

Fig. 13

A detailed representation of the steps followed for SAbHD_RM in consultation with the medical experts.

4.3 Statistical analysis

The objective of the study was to analyze the effects of various factors on the concentration of BDNF. The DBN was performed to check the possible influence of different factors on the concentration of BDNF. A two-tailed Pearson correlation was used to examine the relationship between various measures of obesity and CHD, and also between various measures of obesity and the concentration of BDNF. The Chi-square goodness-of-fit test was performed to determine whether the frequency of the Val66Met genotypes was in Hardy-Weinberg equilibrium. The Chi-square test was also used to examine the frequency of smoking and high blood pressure among individuals and to determine the distribution of the Val66Met genotypes. The standardized residuals and the R value were then calculated to analyze the main factors that contribute to the differences in blood pressure and smoking. The various measures of health were then analyzed using the Mann-Whitney U test. Among the other factors that were analyzed were the body mass index, blood pressure, and cholesterol levels. The various tests were performed using two-tailed methods. The statistical power and sample size of the tests were calculated using the G*Power 3 software. The total sample sizes for the various tests were calculated at 244, 197, and 352, respectively. For the Chi-square test, the sample size was 197, while for the t-test, the sample size was 352.

Fig. 14

Performance comparison of the different algorithms with different parameter.

4.4 Performance analysis

The performance metrics measure the accuracy and speed of the system. The proposed system is evaluated on criteria such as accuracy, precision, recall, and F-measure, which are calculated using the formulas below. We collected data from various sources and split it into two sets: a training set and a test set. The machine learning model learns and detects patterns in the training data.

4.4.1 Precision

Precision is defined as the ratio of the relevant data detected to the total data detected. It is given by $\text{Precision}\ (p) = \frac{\text{Sum of relevant data detected}}{\text{Total sum of data detected}}$

4.4.2 Recall

Recall is defined as the ratio of the relevant data correctly detected to the total relevant data in the database. It is given by $\text{Recall}\ (r) = \frac{\text{Sum of accurate data detected}}{\text{Total sum of relevant data in the database}}$

4.4.3 Accuracy

To assess the classifier’s overall efficiency, the most commonly used measure, classification accuracy, is computed as $\text{Accuracy} = \frac{\text{Precision} + \text{Recall}}{2} \times 100$
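The three measures can be computed from predicted and true labels as sketched below. The labels are made up for illustration, and "relevant data" is interpreted here as the positive (heart disease) class, which is an assumption. The function `paper_accuracy` follows the formula above, the mean of precision and recall; the conventional definition of accuracy, the share of correctly classified samples, is included for comparison because the two can differ.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = heart disease."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision(tp, fp):
    """Relevant data detected / total data detected."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Relevant data detected / total relevant data."""
    return tp / (tp + fn) if tp + fn else 0.0


def paper_accuracy(p, r):
    """Accuracy as defined in this section: mean of precision and recall, in %."""
    return (p + r) / 2 * 100


def standard_accuracy(tp, fp, fn, tn):
    """Conventional accuracy: correctly classified samples / all samples, in %."""
    return (tp + tn) / (tp + fp + fn + tn) * 100


# Illustrative labels only (1 = heart disease, 0 = healthy).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
p, r = precision(tp, fp), recall(tp, fn)
print("precision:", p, "recall:", r)
print("accuracy (this section):", round(paper_accuracy(p, r), 2))
print("accuracy (conventional):", round(standard_accuracy(tp, fp, fn, tn), 2))
```

On this toy example the two accuracy figures differ (80 versus 75), so it is worth stating which definition is used when reporting results.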

The results of our experiment are shown in Fig. 15. The proposed method of data mining and statistical modeling (ODBN-SVM) performed very well against the other methods. The SVM-trained Optimized DBN, together with random forest and logistic regression, achieves a training accuracy of 81.13% and a testing accuracy of 97.91%. However, Fig. 16 shows that the models do not improve much after hyperparameter tuning. We have proposed a method that can diagnose the faults of single-phase induction motors using acoustic signals and thermal imaging. This method can be used to identify the electric impact drills’ bearing and stator faults [18–23].

Fig. 15

Comparison graph of different algorithm.

Fig. 16

Different models tuned by the hyper parameter in terms of accuracy.

Figure 17 presents the results of the evaluation of the proposed machine learning model on the Statlog (Heart) data set. The model was compared with other methods for identifying relevant risk factors and performing statistical analysis. The proposed model is more focused on the data set’s characteristics than its usual functions. However, its accuracy rate and its generalization ability are not good. The proposed model does not take into account the various characteristics of the data set. The traditional method for classifying data involves extracting various features from the data. Deep learning can learn deep-level networks that provide the necessary deep representation of the data. This method is more effective than methods that rely on probability models and shallow neural networks. The paper presents a deep belief network that is capable of independently determining the structure of the network. This method can perform deep-level analysis of the ECG data and identify cardiovascular diseases.

Fig. 17

Comparison of classification results of different methods.

5 Conclusion and future scope

Modern society has made it very clear that heart disease is a main cause of shortened life. People must have access to the right healthcare system. Data collected by healthcare organizations are very important for gaining a deeper understanding of the disease. Through the use of data innovation, organizations can extract valuable information from their data. Data mining techniques such as DBN-SVM can be used in healthcare organizations to perform various tasks related to collecting and analyzing data, and this can help reduce the cost of providing healthcare services. We divided the Cleveland Heart Disease dataset into two sections, one for training and one for testing, and implemented various algorithms to check the accuracy on the data. We found that the proposed method provided the best accuracy level, with an overall accuracy of 97.91 percent. For other datasets, a different algorithm may provide a better accuracy level; for our situation, however, we decided to implement the proposed method. Although a larger amount of training data may increase the accuracy of the algorithm, it will also take more time to process the data. With the help of big data, we can use this technology to improve the accuracy of our algorithm for predicting heart disease. In the next stage, we will also use it to predict heart disease in children. The goal of our study is to improve the accuracy of the algorithm by developing a set of features that can be used to manage the various aspects of the prediction. In the future, hybrid techniques will be used to extract optimal features from various data sets, which could help improve the performance of models for predicting heart disease. Real-time medical datasets will also be used to develop models.

References

1. Gaziano T.A., Bitton, Anand, Abrahams-Gessel and Murphy, Growing epidemic of coronary heart disease in low- and middle-income countries, Current Problems in Cardiology 35(2) (2010), 72–115.

2. Ponikowski, Anker S.D., AlHabib K.F., Cowie M.R., Force T.L., Hu, Jaarsma, Krum, Rastogi, Rohde L.E. and Samal U.C., Heart failure: preventing disease and death worldwide, ESC Heart Failure 1(1) (2014), 4–25.

3. Hamilton M.T., Hamilton D.G. and Zderic T.W., Role of low energy expenditure and sitting in obesity, metabolic syndrome, type 2 diabetes, and cardiovascular disease, Diabetes 56(11) (2007), 2655–2667.

4. American Heart Association, Staying Hydrated-Staying Healthy (2014). Available at http://www.heart.org/HEARTORG/HealthyLiving/PhysicalActivity/FitnessBasics/StayingHydrated—Staying-Healthy_UCM_441180_Article.jsp#.WOWKD5H3ahA.

5. Bharti, Khamparia, Shabaz, Dhiman, Pande and Singh, Prediction of Heart Disease Using a Combination of Machine Learning and Deep Learning, Computational Intelligence and Neuroscience 2021 (2021), Article ID 8387680, 11 pages.

6. Haux, Health information systems–past, present, future, International Journal of Medical Informatics 75(3-4) (2006), 268–281.

7. Potgieter C.F., The motor proficiency of obese 8-11 year old children, Doctoral dissertation, University of the Free State (2005).

8. Patil S.B. and Kumaraswamy Y.S., Extraction of significant patterns from heart disease warehouses for heart attack prediction, IJCSNS 9(2) (2009), 228–235.

9. Muhammad, Tahir, Hayat and Chong K.T., Early and accurate detection and diagnosis of heart disease using intelligent computational model, Scientific Reports 10(1) (2020), 1–17.

10. Ul Haq, Li J.P., Memon M.H., Nazir and Sun, A Hybrid Intelligent System Framework for the Prediction of Heart Disease Using Machine Learning Algorithms, Mobile Information Systems 2018 (2018), Article ID 3860146, 21 pages.

11. Munawar, Geetha and Srinivas, An exploration to identify the most relevant parameters for prediction of heart disease.

12. Shen, Wu and Suk H.I., Deep learning in medical image analysis, Annual Review of Biomedical Engineering 19 (2017), 221–248.

13. Shah, Patel and Bharti S.K., Heart disease prediction using machine learning techniques, SN Computer Science 1(6) (2020), 1–6.

14. Yazdani, Varathan K.D., Chiam Y.K., Malik A.W. and Ahmad W.A.W., A novel approach for heart disease prediction using strength scores with significant predictors, BMC Medical Informatics and Decision Making 21(1) (2021), 1–16.

15. Ali S.A., Raza, Malik A.K., Shahid A.R., Faheem, Alquhayz and Kumar Y.J., An Optimally Configured and Improved Deep Belief Network (OCI-DBN) Approach for Heart Disease Prediction Based on Ruzzo–Tompa and Stacked Genetic Algorithm, IEEE Access 8 (2020), 65947–65958.

16. Haq A.U., Li J.P., Memon M.H., Nazir and Sun, A hybrid intelligent system framework for the prediction of heart disease using machine learning algorithms, Mobile Information Systems (2018).

17. Tarawneh and Embarak, Hybrid approach for heart disease prediction using data mining techniques, in: International Conference on Emerging Internetworking, Data & Web Technologies, Springer, Cham (February 2019), pp. 447–454.

18. Arozi, Caesarendra, Ariyanto, Munadi, Setiawan J.D. and Glowacz, Pattern recognition of single-channel sEMG signal using PCA and ANN method to classify nine hand movements, Symmetry 12(4) (2020), 541.

19. Glowacz, Tadeusiewicz, Legutko, Caesarendra, Irfan, Liu, Brumercik, Gutten, Sulowicz, Daviu J.A.A. and Sarkodie-Gyan, Fault diagnosis of angle grinders and electric impact drills using acoustic signals, Applied Acoustics 179 (2021), 108070.

20. Glowacz, Glowacz and Kozik, Early fault diagnosis of bearing and stator faults of the single-phase induction motor using acoustic signals, Measurement 113 (2018), 1–9.

21. Glowacz, Fault diagnosis of single-phase induction motor based on acoustic signals, Mechanical Systems and Signal Processing 117 (2019), 65–80.

22. Glowacz and Glowacz, Diagnostics of stator faults of the single-phase induction motor using thermal images, MoASoS and selected classifiers, Measurement 93 (2016), 86–93.

23. Glowacz, Fault detection of electric impact drills and coffee grinders using acoustic signals, Sensors 19(2) (2019), 269.

24. Sharma, Pal and Jaiswal, Chapter 12 - Heart disease prediction using convolutional neural network, in: Ayman S. El-Baz and Jasjit S. Suri (eds.), Cardiovascular and Coronary Artery Imaging, Academic Press.

25. Kavitha, Gnaneswar, Dinesh, Sai Y.R. and Suraj R.S., Heart disease prediction using hybrid machine learning model, in: 2021 6th International Conference on Inventive Computation Technologies (ICICT), IEEE (January 2021), pp. 1329–1333.

26. Rani, Kumar, Ahmed N.M. and Jain, A decision support system for heart disease prediction based upon machine learning, Journal of Reliable Intelligent Environments 7(3) (2021), 263–275.

27. Mehmood, Iqbal, Mehmood, Irtaza, Nawaz, Nazir and Masood, Prediction of heart disease using deep convolutional neural networks, Arabian Journal for Science and Engineering 46(4) (2021), 3409–3422.

28. Ansarullah S.I., Saif S.M., Kumar and Kirmani M.M., Significance of visible non-invasive risk attributes for the initial prediction of heart disease using different machine learning techniques, Computational Intelligence and Neuroscience (2022).

29. Mukherjee, Sadhu and Kundu, Heart Disease Detection Using Feature Selection Based KNN Classifier, in: Proceedings of Data Analytics and Management, Springer, Singapore (2022), pp. 577–585.

30. Dubey A.K., Sinhal A.K. and Sharma, An Improved Auto Categorical PSO with ML for Heart Disease Prediction, Engineering, Technology & Applied Science Research 12(3) (2022), 8567–8573.