Abstract
In recent years, the impact of COVID-19 has caused unprecedented difficulties worldwide and has affected lifestyle choices. The post-pandemic era has made this even more critical. COVID-19 triggers widespread inflammation throughout the body, potentially causing damage to the heart and other vital organs. Mortality data from COVID-19 clearly show that the highest death rates occur in individuals with chronic conditions, such as diabetes, pneumonia, cardiovascular disease (CVD), and acute renal failure. CVD is a particular concern in the medical field. The early detection of CVD remains a significant challenge, even though early identification can prompt lifestyle changes and ensure appropriate medical interventions when needed. Individuals with CVD are at an increased risk for heart attack and other serious complications. Only a limited amount of data is available to study the effects of COVID-19 on CVD in COVID-19 patients. However, it is essential to monitor these patients to ensure full recovery without complications. The proposed system is specifically designed for individuals experiencing prolonged symptoms following a COVID-19 infection, commonly referred to as long COVID patients. This research introduces a novel decision-making system for CVD prediction that uses an improved dual-attention residual bidirectional gated recurrent unit (DA-ResBiGRU) algorithm with Al-Biruni Earth Radius (ABER) optimization. The proposed system employs state-of-the-art predictive algorithms and real-time monitoring to assess individual patient risk profiles accurately. This research addresses the critical need for personalized risk assessment in patients with long COVID, aiming to assist healthcare providers in timely and targeted interventions. By analyzing intricate patterns in patient data, the decision-making system enhances the precision of CVD prediction.
Additionally, the system's adaptive nature allows it to continuously learn from new patient data, ensuring that its predictions remain up-to-date and reflective of the evolving understanding of long COVID-related cardiovascular risks. The simulation findings of this research highlight the potential of the proposed algorithm to be integrated into clinical decision-making, helping healthcare professionals identify high-risk patients more effectively. The proposed method outperformed existing algorithms, such as Deep Neural Network (DNN), Long short-term memory (LSTM), Inception-v3, Xception, and MobileNetV2, achieving the highest accuracy (97.88%), sensitivity (95.50%), specificity (94.29%), precision (96.68%), and F-measure (95.85%).
Introduction
COVID-19 is the world's first global coronavirus pandemic, and public awareness is still growing. Furthermore, big data is critical for improving medical technologies and for preventing and controlling COVID-19. 1,2 Chronic cardiovascular and other health conditions frequently cause death in individuals with COVID-19. Over the past decade, CVD has remained the leading cause of death. 3 In India, severe heart disease claims approximately 1.7 million lives annually. The World Health Organization (WHO) has recognized heart disease as a significant contributor to global illness and death rates. In 2019, CVDs caused approximately 17.9 million deaths globally, representing 32% of all fatalities. Among these CVD-related deaths, heart attacks and strokes accounted for 85% of the cases. Every day, the healthcare industry generates massive amounts of data that are characterized by redundancy, multitasking, insufficient information, and close links with time. Such information is challenging to handle, and data-mining approaches aid in the intelligent use of this information to extract essential insights and generate conclusions. Consequently, these approaches can assist healthcare practitioners in making more effective patient treatment decisions in a shorter time. 4
When evaluating the efficiency of hybrid models that incorporate diverse methodologies, several researchers have investigated numerous methods to improve the accuracy of predictions, such as neural networks and machine learning (ML). 5 ML and deep learning (DL) technologies have shown significant promise in supporting clinical decision making, assisting in developing medical guidelines and management algorithms, and promoting evidence-based medical practice for treating CVD. 6 Although ML 7 and DL 8 systems offer potential results in CVD prediction, they are not without barriers. The primary challenge is the complexity and diversity of the CVD information, which is often complicated by disjointed datasets. Another crucial issue is interpreting the model results. Traditional ML and DL models may only occasionally provide clear information on how the model arrived at its predictions, which medical professionals frequently want. Additionally, early diagnosis of cardiovascular disease using ML algorithms may eliminate the requirement for expensive and extensive clinical and laboratory testing, thereby lowering the cost imposed on healthcare organizations and patients. 9
As a result, transfer learning (TL) has become a highly effective approach for enhancing the accuracy of CVD prediction. This technique takes knowledge from related domains and tasks and applies it to a particular target domain, even when the data distributions differ significantly. This strategy has attracted considerable research attention because it enables information to be exchanged across areas, for example by unifying knowledge from clinical images and genetic data, even if these data sources do not concern the same topic. This article explores the application of ML algorithms combined with various analytical methods to detect and diagnose cardiovascular diseases in post-COVID-19 patients. 10
This study introduces an innovative approach using an enhanced DA-ResBiGRU model integrated with ABER for improved CVD classification. The proposed method focuses on predicting and classifying CVD in long-term COVID-19 patients by utilizing two distinct datasets. The first dataset comprises cardiac magnetic resonance (CMR) images, which undergo preprocessing steps such as median filtering, intensity normalization, and histogram matching. These images are then segmented using a deep autoencoder network (DAE) combined with Bayesian fuzzy clustering, followed by feature extraction using VGG19. The second dataset consists of tabular data, which is preprocessed by cleaning, detecting, and removing outliers using the interquartile range (IQR), and performing feature selection through sequential forward selection (SFS). Finally, the classification is performed using the proposed DA-ResBiGRU model optimized with ABER. This comprehensive framework aims to enhance the accuracy and reliability of CVD prediction in long-term COVID-19 patients.
The major contributions of this study are as follows.
To introduce a novel attention-based system for cross-modal transfer learning, enabling accurate forecasting of CVD. This method effectively addresses the complexities associated with analyzing CVD data by integrating multimodal datasets.
To enhance the generalization capability of the proposed model, two distinct datasets were utilized for training, ensuring robust performance across diverse scenarios.
To optimize the machine learning model's performance, hyperparameter tuning was employed, refining the selection of parameters during the training process.
To rigorously evaluate the proposed framework, its performance was compared against existing studies using key metrics such as accuracy, sensitivity, precision, F1 score, ROC, and MCC, demonstrating its effectiveness and reliability.
The structure of this paper is as follows: Section 2 provides a review of the literature, Section 3 describes the proposed method, Section 4 discusses the analysis and results, and Section 5 concludes with future research directions.
Literature survey
Pedro Ribeiro et al. (2024) examined three medical situations: low, moderate, and severe CVD. Two ECG signals were recorded for each situation from two different body postures. Then, ten nonlinear features were selected based on every recorded signal during each 1-s ECG time series and fed into 19 ML classifiers using a leave-one-out cross-validation approach. 11 Muhammad Ali Muzamil et al. (2024) suggest incorporating AI into ECG technology could transform cardiology by improving the detection and treatment of cardiac disorders. This progress can be achieved through scientific inquiry, cross-disciplinary collaboration, and the consideration of ethical ramifications. However, deploying AI-enhanced electrocardiograms requires a comprehensive evaluation of ethical issues. 12
Pawan Singh et al. (2024) developed various ML techniques to correlate human gait metrics and better understand cardiac health. Current technology allows for a noninvasive and effective assessment of each individual's risk of developing heart disease. Multiple gait metrics, including step length, cadence, and speed, were first measured experimentally using gait systems and retroreflective markers. The experimental datasets were used to train the ML model, which was then tested using different gait metrics. 13 Mumita Moitra et al. (2023) created a CNN-based DL system that can predict CVD risk in COVID-19 patients with an accuracy of up to 97.97%. The efficacy of the proposed model was evaluated using measurements from cardiac CT scans, including the cardiothoracic ratio (CTR), pulmonary artery-to-aorta ratio (PA/A), and calcified plaque. 14
Vaishali Baviskar et al. (2023) created a new Genetic Sine Algorithm (GSA) to identify optimal features while avoiding local optima. Recurrent Neural Network (RNN) classification technology influences the extracted features combined with the LSTM method. Deep Progressive Attention-RNN + LSTM (DPA-RNN + LSTM) was designed to increase the classification rate by filtering all incorrect information and emphasizing critical information. 15 Morten Heath et al. (2023) conducted a survey of PubMed and Embase, supplemented with references to pertinent preceding research. This study included publications that compared one type of interpretation to another in a healthcare setting, including those that did not provide an interpretation or measure patient outcomes. 16
Mahbub Jaafari et al. (2023) employed DL technologies to diagnose CVD using CMR data, a field that is now being extensively researched. Their review presents an overview of the research that uses CMR imaging and DL approaches to detect CVD. The introductory section explains the types of CVD, diagnostic approaches, and the most essential medical imaging modalities. The work discusses research on CVD identification using CMR images, as well as the most relevant DL algorithms. 17 Saran Kumar et al. (2023) introduced a stacked ensemble framework designed for feature extraction, selection, and classification. Their approach utilized a novel sample-based neural network inference technique to develop classification and stacking models. To enhance the model's performance and achieve a global optimal solution, the study incorporated the Hawks Optimizer (HO). The dataset used to predict CVD was obtained from an openly available Kaggle CVD Prediction Dataset. The class imbalance of the dataset is also a significant factor that must be addressed to improve prediction quality. 18
Hatice et al. (2022) introduced COVID-DSNet, a DCNN to predict common types of pneumonia (bacterial and viral) as well as COVID-19, using CT scans, chest X-rays (CXR), and a combination of both imaging types. The proposed model is a cost-effective and practical DL network that can be easily utilized and further developed by data scientists. 19 Karthick et al. (2022) aimed to create a hybrid dataset to support the development of more accurate cardiovascular disease (CVD) risk prediction models. To achieve this, they combined multiple datasets to form the “Sathvi” dataset, which consists of 531 instances, 12 attributes, and no missing values. 20 In a related study, Nandakumar P et al. (2022) introduced the Hamming distance feature selection method for preprocessing and cleaning various heart disease datasets. DL models, such as deep belief networks with the bionic cuckoo search algorithm, were used to make precise predictions about heart diseases. 21
Misha Urooj Khan et al. (2022) proposed detecting CVD via PCG signal processing with artificial neural networks (ANN) and spectral feature fusion. The input was processed, and the five spectral characteristics with the highest pairwise variability were selected. The proposed system offers advantages over current techniques, being noninvasive and dependable. 22 Jyoti Metan et al. (2021) suggested that the full-depth convolutional neural network optimized with the Sand Piper optimization (FDCN-SPO) method demonstrated the highest accuracy and computational performance for myocardial mass, wall thickness, left and right ventricular volumes, and ejection ratio. 23
Essam H. Houssein et al. (2021) developed various ECG signal descriptors for feature extraction using one-dimensional local binary patterns (LBP), wavelets, higher-order statistics (HOS), and morphological information. A hybrid ECG arrhythmia classification method, MRFO-SVM, has been proposed for feature selection and classification. 24 Table 1 presents a summary of the detailed survey.
Table 1. Summary of the existing literature survey.
Problem statement
Among the various complications associated with long COVID, there is a growing recognition of cardiovascular involvement in affected individuals. CVD prediction and classification in patients with long COVID present a multifaceted challenge. First, the pathophysiological mechanisms underlying cardiovascular complications in long COVID are not fully understood. While acute COVID-19 has been linked to inflammation, endothelial dysfunction, and hypercoagulability, the persistence of these factors in long-term patients poses unique challenges for the accurate prediction and classification of CVD. Second, the manifestation of cardiovascular symptoms may vary widely among individuals with long COVID, making it difficult to establish a standardized set of parameters for prediction models. Some patients may experience persistent chest pain, arrhythmias, or myocardial inflammation, whereas others may display more subtle signs, such as exercise intolerance or fatigue. This heterogeneity complicates the development of predictive algorithms that can reliably identify individuals at risk of cardiovascular complications.
In addition, the longitudinal nature of long COVID necessitates continuous monitoring and reassessment, as cardiovascular symptoms may evolve over time. Effective prediction and classification models must account for the dynamic changes in the clinical presentation of CVD in patients with long COVID. Addressing these challenges requires a comprehensive understanding of the interplay between viral persistence, immune response, and cardiovascular pathology, coupled with advanced ML techniques to develop robust predictive models tailored to the unique characteristics of long COVID. The successful development of such models could significantly enhance early intervention strategies and improve outcomes for individuals at risk for cardiovascular complications in the extended aftermath of COVID-19.
Existing cardiovascular disease prediction methodologies that employ deep learning and machine learning algorithms for long COVID patients have numerous limitations. First, many models lack generalizability because they are trained on restricted datasets that may not represent a diverse patient population. Second, they frequently neglect to integrate real-time data, which constrains their efficacy in dynamic clinical settings. Moreover, the interpretability of the models is frequently inadequate, making it difficult for healthcare professionals to understand the rationale behind the predictions. Finally, there is a risk of overfitting, in which models perform strongly on training data yet falter on novel instances.
Proposed methodology
Predicting and classifying CVD in patients with long COVID is crucial for identifying the potential health risks associated with the aftermath of the virus. This study employs a multifaceted approach that integrates information from both the CMR and tabular datasets. For the CMR dataset, a comprehensive preprocessing pipeline is implemented, starting with the application of a median filter to reduce noise, followed by intensity normalization and histogram matching for standardization. The data are then segmented using a DAE with Bayesian fuzzy clustering to enhance the accuracy of delineating relevant cardiac structures. Feature extraction is accomplished using the VGG19 model, which captures high-level representations that contribute to the subsequent prediction model. Figure 1 shows the proposed block diagram. In parallel, the tabular dataset undergoes preprocessing steps to ensure data quality. Cleaning processes address missing or inconsistent values, whereas numerical reduction techniques identify and handle outliers using the IQR. Furthermore, feature selection through SFS refines the dataset and retains the essential variables for the predictive model. The final classification step involves an improved DA-ResBiGRU with ABER. This advanced model integrates attention mechanisms, residual connections, and bidirectional recurrent units to capture temporal dependencies and intricate patterns, ultimately providing a robust prediction and classification framework for cardiovascular risks in patients with long COVID.

Figure 1. Proposed block diagram.
CVD prediction using CMR dataset
Advanced medical imaging techniques such as CMR imaging have emerged as promising avenues for CVD prediction. The process begins with meticulous pre-processing of the CMR dataset. Techniques such as median filtering are employed to mitigate noise, whereas intensity normalization ensures consistency across images. Histogram matching further aligns the intensity distributions, creating a standardized dataset for subsequent analysis. Segmentation is a crucial step in identifying relevant structures within CMR images. The DAE is deployed for segmentation owing to its ability to capture complex hierarchical features. Bayesian fuzzy clustering enhances the accuracy of segmentation by incorporating uncertainty measures, thereby allowing for a more robust delineation of anatomical structures. The segmented regions of interest provide a foundation for subsequent analyses. Feature extraction, a pivotal aspect of CVD prediction, is accomplished using the VGG19 architecture. This CNN excels in capturing intricate spatial hierarchies within the segmented regions. The extracted features serve as discriminative indicators, capturing subtle patterns and variations that are indicative of potential cardiovascular abnormalities. Integrating these preprocessing, segmentation, and feature extraction techniques creates a comprehensive pipeline for CVD prediction, fostering accurate risk assessment and facilitating timely clinical interventions.
Pre-processing
In the pre-processing of CMR images for the detection and analysis of CVDs, several essential steps are employed to enhance image quality and facilitate accurate diagnosis. A common approach involves the application of a median filter to reduce noise and improve image clarity. This helps smooth out irregularities and artifacts present in the CMR images. Additionally, intensity normalization is crucial for standardizing pixel values across different images, ensuring consistency in their representation. This step aids comparability and interpretation of the images during subsequent analyses. Furthermore, histogram matching is applied to align the intensity distributions of the images, with the aim of harmonizing variations in contrast and brightness. This technique enhances the overall consistency and comparability of CMR images, contributing to more reliable and precise assessments in the context of cardiovascular disease diagnosis and research.
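The first two pre-processing steps can be illustrated with a minimal sketch. It is written in plain Python for self-containment and is not the implementation used in this study; the 3 × 3 window, the edge-replication border handling, and the function names `median_filter` and `min_max_normalize` are assumptions made for illustration. In practice an array library would normally be used instead of nested lists.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2-D list of numbers.

    Border pixels are handled by replicating the nearest edge pixel
    (an assumption; other border policies are equally valid).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp row index
                    cc = min(max(c + dc, 0), cols - 1)  # clamp column index
                    window.append(image[rr][cc])
            out[r][c] = median(window)
    return out


def min_max_normalize(image, eps=1e-12):
    """Rescale intensities to [0, 1]: subtract the minimum, divide by the range.

    A constant image has zero range; it is mapped to all zeros so that
    no division by zero occurs.
    """
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span < eps:
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / span for v in row] for row in image]


# A single bright "salt" pixel is removed by the median filter.
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
denoised = median_filter(noisy)
normalized = min_max_normalize([[0, 5], [10, 5]])
```

The median is preferred over the mean here because a single extreme pixel cannot shift the median of the window, whereas it would bias an average.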
(a) Median filtering
Median filtering is a highly efficient noise reduction technique that is frequently utilized in image processing. It replaces the pixel value at the center of the sliding window with the median of the pixel values in the window. The median filter is described mathematically as
$$\hat{I}(x, y) = \operatorname{median}\{\, I(s, t) : (s, t) \in W_{xy} \,\}$$
where $W_{xy}$ denotes the sliding window centered at pixel $(x, y)$.
(b) Intensity normalization
Intensity normalization techniques can reduce fluctuations in the image signal intensity caused by technical factors (scanner specifications and sequence acquisition parameters). These approaches improve the repeatability of intensity-based learning by “smoothing” the differences in intensity levels. Intensity normalization eliminates signal discrepancies between participants, MRI scanners, and sequence configurations by using signal values from nearby muscles. The normalized STIR spleen ratio was calculated using the same procedure as that used for T2-weighted STIR imaging. The intraclass correlation coefficient (ICC) was used to assess the interobserver agreement. We normalize the image intensity to the range 0 to 1 by subtracting the minimum value and then dividing by the range of values, taking care to prevent division by zero.
(c) Histogram matching
Histogram matching is commonly used to produce processed images with a specified histogram. A formal explanation of histogram matching follows. Let S and T represent the continuous intensities of the source and target images, respectively (treated as random variables).
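$$M(S) = (L - 1)\int_{0}^{S} p_S(x)\,dx \quad (1)$$
where $p_S$ is the probability density function (PDF) of the source intensities, L represents the number of intensity levels, and x is a dummy variable of integration. We assume that a random variable w satisfies the corresponding condition for the target image,
$$G(T) = (L - 1)\int_{0}^{T} p_T(x)\,dx = w \quad (2)$$
These two equations show that M(S) = G(T), which implies that T has to fulfill the requirement
$$T = G^{-1}\bigl(M(S)\bigr) \quad (3)$$
Equations (1–3) demonstrate how a given image can be transformed into an image with a specific PDF.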
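A discrete version of this mapping can be sketched as follows. This is an illustrative plain-Python sketch under stated assumptions, not the code used in this study: intensities are taken to be integers in 0..levels-1, the cumulative distribution functions are empirical, and the helper names `empirical_cdf` and `match_histogram` are hypothetical. Each source level is mapped to the smallest target level whose cumulative probability is at least that of the source level, which is a discrete stand-in for applying the inverse of G to M(S).

```python
from bisect import bisect_left


def empirical_cdf(values, levels):
    """Empirical CDF over integer intensity levels 0..levels-1."""
    counts = [0] * levels
    for v in values:
        counts[v] += 1
    total = len(values)
    running, cdf = 0, []
    for n in counts:
        running += n
        cdf.append(running / total)
    return cdf


def match_histogram(source, target, levels=256):
    """Map source intensities so their distribution follows the target's."""
    src_flat = [v for row in source for v in row]
    tgt_flat = [v for row in target for v in row]
    m = empirical_cdf(src_flat, levels)  # plays the role of M(S)
    g = empirical_cdf(tgt_flat, levels)  # plays the role of G(T)
    # Look-up table: smallest target level t with G(t) >= M(s).
    lut = [min(bisect_left(g, m[s]), levels - 1) for s in range(levels)]
    return [[lut[v] for v in row] for row in source]


# A dark source image is shifted onto the brighter target levels.
source = [[0, 0], [1, 1]]
target = [[2, 2], [3, 3]]
matched = match_histogram(source, target, levels=4)
```

Because the mapping is a look-up table over intensity levels, it is computed once per image pair and then applied to every pixel.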
Segmentation
Deep autoencoders learn the hierarchical representations of complex data. Integrating Bayesian fuzzy clustering with deep autoencoders enhances the segmentation process by incorporating uncertainty and flexibility in the handling of image variations. The deep autoencoder extracts features and reconstructs the input images, thereby capturing intricate patterns in CMR data. The Bayesian fuzzy clustering algorithm accommodates the uncertainty in assigning pixels to different clusters, providing a probabilistic framework for more accurate and robust segmentation.
(a)
A DAE is a basic deep network built from stacked AEs with several hidden layers that generate significant energy. A softmax classifier is sometimes used as the output layer when dealing with classification challenges. An AE is an unsupervised network that encodes the input data into specific representations. Consequently, the technique of reconstructing input samples with minimal error is widely used. 25
The training set was as follows:
Among these,
For each training data sample
The sigmoid function and linear transformation were used to understand how the encoding and decoding processes work.
The linear transformation, written as
The DAE design is based on the standard automated counter, which encodes
Training the parameters of each layer individually ensures that each layer provides the best possible outcome. Fine-tuning is a frequently used global optimization strategy in neural networks that improves DAE performance. The squared error cost for ideal samples is expressed as
The fine-tuning process is defined by the energy function J(w,b), which forces the result to be close to the correct label throughout the preparation.
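The encoding, decoding, and squared-error cost described in this subsection can be sketched for a single autoencoder layer. This is a minimal plain-Python illustration under stated assumptions, not the network used in this study: both the encoder and decoder apply a sigmoid to a linear transformation, the cost is averaged over the samples with a factor of one half, and all function and parameter names (`encode`, `decode`, `reconstruction_cost`, `w_enc`, `b_enc`, `w_dec`, `b_dec`) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    """Linear transformation W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def encode(x, w_enc, b_enc):
    """Hidden representation h = sigmoid(W_enc x + b_enc)."""
    return [sigmoid(z) for z in affine(w_enc, b_enc, x)]


def decode(h, w_dec, b_dec):
    """Reconstruction x_hat = sigmoid(W_dec h + b_dec)."""
    return [sigmoid(z) for z in affine(w_dec, b_dec, h)]


def reconstruction_cost(samples, w_enc, b_enc, w_dec, b_dec):
    """Mean over samples of 0.5 * ||x_hat - x||^2."""
    total = 0.0
    for x in samples:
        x_hat = decode(encode(x, w_enc, b_enc), w_dec, b_dec)
        total += 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(samples)


# With all-zero parameters every unit outputs sigmoid(0) = 0.5.
zeros_2x2 = [[0.0, 0.0], [0.0, 0.0]]
zeros_2 = [0.0, 0.0]
cost = reconstruction_cost([[1.0, 0.0]], zeros_2x2, zeros_2, zeros_2x2, zeros_2)
```

Training would adjust the weights and biases to reduce this cost, first layer by layer and then jointly during fine-tuning.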
The integer layer encoding constraints are defined as (b)
The BFC algorithm 26 merges probabilistic and fuzzy approaches. The breadth of fuzzy indicators used in traditional fuzzy approaches has increased in light of prior information and Bayesian theory. The BFC approach solves the optimization problem using the Markov chain Monte Carlo (MCMC) strategy 27 and the particle filter method. 28 Fuzzy clustering is handled using the maximum a posteriori (MAP) probability, with the number of clusters predicted using the normal distribution. This drawback makes the BFC approach inappropriate for large-scale data, severely constrains its application area, and fails to satisfy current practical needs. The BFC technique is intended to address fuzzy clustering problems using a probabilistic approach. The BFC probability model comprises three parts: the fuzzy membership prior (FCP), the fuzzy data likelihood (FDL), and the cluster center prior, as shown in the image below. The probability of the fuzzy data is expressed as follows:
m,
The prior of fuzzy membership is expressed as,
It should be noted that
The user sets the parameter
Based on MAP theory, the negative logarithm of the joint probability in equation (22) is taken, and the result can be simplified by multiplying by two. The joint probability formula is as follows:
Feature extraction
VGG-19 can automatically extract hierarchical features at different levels when applied to CMR images in the context of cardiovascular disease. This enables the model to discern intricate details of cardiac structures and abnormalities, facilitating the accurate identification and characterization of pathological conditions such as myocardial infarction, hypertrophy, or other cardiac anomalies. The use of VGG-19 29 in feature extraction for CMR images enhances the potential for automated and precise diagnosis, thereby contributing to the advancement of cardiovascular healthcare.
(a)
The VGG-19 network, as shown in Figure 2, has 19 layers, including 16 convolutional layers and three fully connected layers. As neural networks deepen, their precision tends to improve, thereby enhancing feature extraction. The convolutional layers perform convolutions and are connected to the max-pooling and dropout layers. These layers use 3 × 3 filters, with max pooling used to reduce the dimensions of the convolutional layer output. To limit false positives, the network was trained on a single lesion before testing all lesions.

Figure 2. Deep learning features using VGG19 model.
Where
Activation functions introduce nonlinearity by transforming negative values to zeros and passing only positive values. Max-pooling (downsampling) layers are combined with convolutional layers to reduce the dimensionality of the convolutional output. Max pooling is used, with each block's maximum value serving as the output pixel.
Where
Among them,
The hyper planes are reformed if the margins are delineated and supplied as,
According to the preceding functions, the factor that must be decreased to formulate the ideal hyper plane is presented as,
The results were generated using features from the previous layers. To prevent overfitting, the dropout function randomly deactivates neurons during training. Table 2 provides the details of the layer architecture.
Table 2. Proposed architecture layer details.
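The activation and max-pooling operations described in this subsection can be illustrated on a small feature map. This is a plain-Python sketch, not the VGG-19 implementation used in this study; the activation is assumed to be of the rectified linear type (negative values set to zero), the pooling blocks are assumed to be 2 × 2 and non-overlapping, and the function names `relu` and `max_pool` are chosen for illustration.

```python
def relu(feature_map):
    """Set negative activations to zero and keep positive values unchanged."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: each size x size block yields its maximum."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + dr][c + dc]
                for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


# A 4 x 4 feature map is rectified and then reduced to 2 x 2.
fmap = [
    [-1, 2, 3, -4],
    [5, -6, 7, 8],
    [0, -1, -2, -3],
    [1, 1, -5, 4],
]
pooled = max_pool(relu(fmap))
```

Each 2 × 2 pooling step halves both spatial dimensions, which is how the convolutional output shrinks from block to block.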
CVD prediction using UCI dataset
CVD prediction and classification have become increasingly crucial in healthcare, leveraging advanced machine learning techniques to enhance early diagnosis and risk assessment. The process often begins with tabular datasets containing diverse patient information, including demographic details, medical history, and various risk factors. Preprocessing plays a pivotal role in preparing datasets for accurate modeling. Cleaning involves addressing missing values and standardizing data formats to ensure uniformity, while reduction techniques focus on enhancing model efficiency by refining the dataset. Identifying numerical values within a dataset is crucial, as they often represent key indicators of cardiovascular health. Subsequently, the dataset undergoes outlier detection, which is essential for maintaining data integrity. The IQR is an effective method for identifying and removing outliers. SFS is used for feature selection, systematically adding features to the model one at a time and evaluating the impact of each on performance. By selecting the most relevant features, SFS enhances model accuracy, reduces overfitting, and improves interpretability. This meticulous approach to preprocessing and feature selection not only refines the dataset but also ensures that the subsequent ML model is robust and capable of making accurate predictions regarding an individual's susceptibility to CVD.
UCI dataset
As previously stated, the UCI repository 30 was used as a representative clinical source for selecting the features under investigation. Subsequently, class labels were used to clean the list of attributes before data were collected through Google Forms. Table 3 lists the attributes of this dataset. We discuss 14 attributes and provide the corresponding measurements.
Table 3. Cardiovascular dataset attributes.
COVID 19 CT dataset
The dataset comprised 746 CT scan images, categorized into two groups: COVID-19 positive and COVID-19 negative. Specifically, 349 images in the dataset represent patients diagnosed with COVID-19, whereas the remaining 397 images were from individuals not affected by the virus.
Data pre-processing
Data pre-processing is an approach in data mining that prepares, cleans, and organizes raw data before ML algorithms are created and trained. Often, the data received from multiple sources are raw and unsuitable for immediate analysis. Missing attributes, outliers, and noisy data are identified and eliminated from the dataset. This study employs cleaning techniques such as locating missing values and finding and deleting outliers. The dataset was checked for missing values using the Python programming language in the Anaconda Jupyter environment, and none were found. A common rule of thumb treats values more than three standard deviations from the mean as outliers. Identifying and deleting outliers before fitting a predictive model may significantly decrease errors while enhancing accuracy. This study examines the 14 available attributes for outliers and their specific characteristics. Because the dataset contains no missing values, no missing-data handling was required. However, using the interquartile range method, we identified and deleted numerous outliers that deviated markedly from the rest of the data.
Detecting and eliminating outliers through the IQR method
The IQR 31 method helps identify outliers in a dataset, with a larger IQR indicating that data points are more dispersed and a smaller IQR suggesting that data points are clustered more closely around the median. To compute the IQR, the first quartile (Q1) is subtracted from the third quartile (Q3). The formula for the IQR is Q3 – Q1, which captures the middle 50% of the data. The IQR can be visually represented using a boxplot, in which the quartiles divide the data points into four equal parts and the box spans Q1 to Q3. Values that fall well below Q1 or well above Q3, conventionally by more than 1.5 times the IQR, are flagged as outliers.
Outliers can be recognized using the algorithm described above by determining whether a value lies above or below the range. The proposed algorithm identified 256 outliers among the 1025 records used in this study. After removing the outliers, 769 data points remained.
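The IQR rule can be sketched with the Python standard library. This is an illustrative sketch, not the code used in this study: the fence multiplier of 1.5 is the conventional choice and is an assumption here, as are the "inclusive" quartile method, the helper names `iqr_bounds` and `remove_outliers`, and the column name `chol` with its toy values.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the lower and upper fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, column, k=1.5):
    """Keep only the rows whose value in `column` lies within the fences."""
    lower, upper = iqr_bounds([row[column] for row in rows], k)
    return [row for row in rows if lower <= row[column] <= upper]


# Toy records: the value 100 lies far above the upper fence and is dropped.
records = [{"chol": v} for v in [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]]
cleaned = remove_outliers(records, "chol")
```

For a table with several numeric attributes, the same filter would be applied attribute by attribute, which is how the record count can fall substantially after cleaning.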
Feature selection
In forecasting cardiovascular disease (CVD) using the UCI repository dataset, choosing suitable features is crucial for enhancing model efficacy and interpretability. Extraneous features can adversely affect the performance. Sequential Forward Selection (SFS) is a widely used technique. This method begins by identifying the most effective individual feature and then determining the optimal combination of two features, followed by the best set of three features, continuing until the desired number of relevant features is achieved. The SFS, a wrapper-based approach, gradually incorporates features into the model. At each stage, the performance of the model was evaluated, and the feature subset that yielded the highest predictive accuracy was selected. Figure 3 illustrates the workflow of the proposed model.

Flow chart for the proposed work.
To apply SFS in the context of cardiovascular disease prediction, the process typically begins with an empty set of features, and the algorithm iterates through the available features, adding the most informative feature in each step. The feature subset that yielded the best performance is retained. This iterative procedure continues until a predefined number of features are selected or until further additions do not significantly improve the model's predictive capability. The selected features can provide insights into the key factors contributing to cardiovascular diseases, aiding healthcare professionals in better understanding and managing the risks associated with this critical health issue.
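The greedy loop described above can be sketched as follows. This is an illustrative sketch only: the scoring function (leave-one-out accuracy of a 1-nearest-neighbour classifier), the function names, the `min_gain` stopping threshold, and the toy data are all assumptions, since the text does not state which estimator or stopping rule was used.

```python
def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the given feature indices (hypothetical scorer)."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


def sequential_forward_selection(X, y, n_features,
                                 score_fn=loo_1nn_accuracy, min_gain=0.0):
    """Start from an empty set and add the most helpful feature each round."""
    selected, best_score = [], float("-inf")
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < n_features:
        scored = [(score_fn(X, y, selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda t: t[0])
        if score - best_score <= min_gain:
            break  # no further addition improves the model enough
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Toy data: feature 0 separates the classes, features 1 and 2 are noise.
X = [[0.10, 5, 1], [0.20, 1, 0], [0.15, 3, 1],
     [0.90, 4, 0], [1.00, 2, 1], [0.95, 6, 0]]
y = [0, 0, 0, 1, 1, 1]
print(sequential_forward_selection(X, y, n_features=3))
```

On this toy data the loop stops after the first feature, because adding either noise feature cannot raise the score any further.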
Classification
Precise categorization of CVD is essential for proper diagnosis and treatment. Recent progress in AI has resulted in advanced models, such as the Enhanced DA-ResBiGRU with ABER, which enhances CVD classification. This architecture utilizes two attention mechanisms that focus on crucial aspects of the input data, helping it detect complex patterns in heart signals and improving model explainability. By utilizing a bidirectional GRU, the model examines both past and future contexts, offering a thorough analysis of temporal relationships in cardiovascular information. Additionally, ABER optimization tunes the hyperparameters of the network, contributing to more precise predictions. Together, the Improved DA-ResBiGRU and ABER optimization offer a sophisticated approach for CVD classification.
DA-ResBiGRU
After feature extraction, the DA-ResBiGRU model, a novel hybrid approach shown in Figure 4, is used for CVD classification. This method combines ResNet-50, which uses deeper layers, with a GRU [32]. It utilizes skip connections that bypass layers, connect residual mappings, and simplify the learning process. Any layer that degrades the performance of the model can be skipped through regularization. The residual function is expressed by the following equation:
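For orientation, the standard residual-learning formulation is sketched below. The symbols are assumptions rather than the article's own notation: x denotes the input to a block, H(x) the mapping the block should learn, and F(x) the residual fitted by the stacked layers.

```latex
F(x) = H(x) - x
\quad\Longleftrightarrow\quad
H(x) = F(x) + x
```

The skip connection supplies the identity term x, so the stacked layers only need to learn the residual F(x).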

Improved dual attention residual bi-directional gated recurrent neural network unit.
To ensure high classification accuracy, CVDs were classified using a dual-attention technique and closed layers. The attention function focuses on the classification layer to obtain the best results.
Considering an input data:
Where
The update gate
A GRU is a recurrent structured neural network that considers historical information. The classification is based on hidden states and is expressed as,
Two hidden layers are linked in the BiGRU network to the same output layer, which gives the following equation, (a)
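As a reference point, the widely used textbook form of the GRU and BiGRU updates is sketched below. All symbols are assumptions, not the article's notation: x_t is the input at step t, h_t the hidden state, z_t and r_t the update and reset gates, σ the logistic sigmoid, ⊙ the element-wise product, and W, U, b learnable parameters. Some references swap the roles of z_t and 1 − z_t.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t \\
h_t^{\mathrm{bi}} &= \left[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\right],
\qquad y_t = W_y\, h_t^{\mathrm{bi}} + b_y
\end{aligned}
```

The last line shows the forward and backward hidden states being concatenated and fed to a single output layer, which is the linkage described in the text.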
The encoder module
A, B, and C are real-valued learnable weight matrices, and the classification results are normalized using the softmax activation. The normalizing function is defined by the following equation,
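The softmax normalization mentioned above maps a vector of raw scores to probabilities that sum to one. The sketch below is a generic, numerically stable version; the function name and the example scores are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical two-class scores (no CVD, CVD).
probabilities = softmax([0.3, 1.9])
print(probabilities)
```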
The feature correlation,
(b)
The decoder uses a self-attention mechanism that produces efficient outcomes. Unlike an RNN, self-attention does not process the sequence recursively, so the order of the sequence is not inherently captured. Absolute position encoding is therefore employed to represent position;
The self-attention layer encodes data using position embedding
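One common form of absolute position encoding is the sinusoidal scheme, sketched below. Whether the study used this variant is not stated, so the scheme, the function names, and the constant 10000 are assumptions made for illustration.

```python
import math


def sinusoidal_position_encoding(seq_len, d_model):
    """Build a seq_len x d_model table: sine on even indices, cosine on odd."""
    table = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


def add_position_encoding(embeddings):
    """Add the position table element-wise to a list of embedding vectors."""
    table = sinusoidal_position_encoding(len(embeddings), len(embeddings[0]))
    return [[e + p for e, p in zip(vec, row)]
            for vec, row in zip(embeddings, table)]


encoding = sinusoidal_position_encoding(seq_len=3, d_model=4)
print(encoding[0])
```

Because the encoding depends only on the position index, identical feature vectors at different time steps become distinguishable to the attention layer.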
Al-Biruni earth radius optimization algorithm (ABER)
This technique balances development (exploitation) and navigation (exploration) by dividing the population into groups and dynamically adjusting their sizes throughout the process. Initially, the population is divided into two main groups: developers and explorers. To optimize the fitness of the individuals within each group, the proportion of the population engaged in development begins at 30% and gradually increases to 70% as the optimization progresses. Conversely, the proportion assigned to the navigation group begins at 70% and decreases to 30% over time. This approach significantly enhances the overall fitness of the individuals. A selective (elitist) strategy is also employed to retain the current leading solution when no better alternative is found, thereby ensuring that the population optimization process ultimately converges. If the best solution does not improve substantially over three consecutive iterations, it may have reached a local optimum, and a mutation operation is applied.
The ABER algorithm [33] selects the most effective solution in each iteration, yielding excellent results. Although this approach may converge prematurely owing to the multimodal nature of the problem, it remains highly efficient owing to the elite method. The mutation process in ABER, along with the subsequent search within the navigation group, results in superior navigation capabilities, which enable ABER to avoid premature convergence. Algorithm 1 presents the ABER pseudocode. Before executing ABER, the number of iterations, population size, and mutation rate must be defined. The algorithm then creates the development and navigation groups, dynamically adjusting their sizes as the process moves toward the optimal solution. Each group performs its own role, which enhances diversity and navigation. ABER's selective strategy reduces the likelihood of process repetition and maintains leadership stability.
ABER optimization Pseudo code
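The control flow described in this section can be sketched in Python as below. This is a simplified illustration and not the published ABER update equations: the move rules for both groups, the function names, and the default settings are placeholders chosen for readability. Only the 70% to 30% exploration schedule, the elitist retention of the leader, and the mutation after three stalled iterations follow the description in the text.

```python
import random


def exploration_fraction(t, n_iter, start=0.7, end=0.3):
    """Linearly shrink the exploration share from `start` to `end`."""
    if n_iter <= 1:
        return end
    return start + (end - start) * t / (n_iter - 1)


def aber_like_minimize(f, dim, bounds, pop_size=20, n_iter=60,
                       mutation_rate=0.1, seed=0):
    """Population search with a shrinking exploration group (placeholder rules)."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(v):
        return min(hi, max(lo, v))

    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    best = list(min(pop, key=f))
    best_val = f(best)
    stall = 0
    for t in range(n_iter):
        n_explore = round(exploration_fraction(t, n_iter) * pop_size)
        step = (hi - lo) * (1 - t / n_iter)
        new_pop = []
        for idx, x in enumerate(pop):
            if idx < n_explore:
                # Navigation group: random move around the current position.
                cand = [clip(v + 0.5 * rng.uniform(-step, step)) for v in x]
            else:
                # Development group: move toward the current leader.
                cand = [clip(v + rng.random() * (b - v))
                        for v, b in zip(x, best)]
            new_pop.append(cand)
        if stall >= 3:
            # No improvement for three iterations: mutate to escape a local optimum.
            new_pop = [[rng.uniform(lo, hi) if rng.random() < mutation_rate else v
                        for v in x] for x in new_pop]
            stall = 0
        pop = new_pop
        challenger = min(pop, key=f)
        if f(challenger) < best_val:
            best, best_val = list(challenger), f(challenger)
            stall = 0
        else:
            stall += 1  # elitism: the previous leader is kept
    return best, best_val


def sphere(x):
    return sum(v * v for v in x)


solution, value = aber_like_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(solution, value)
```

In a hyperparameter-tuning setting, `f` would be replaced by a function that trains the network with the candidate settings and returns the validation error.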
The ABER efficiently adjusts the hyperparameters of the improved dual-attention residual bidirectional gated recurrent neural network unit for CVD prediction. This method reduces prediction errors and enhances the capacity of the model to identify intricate temporal relationships in patient data. Ultimately, ABER enhances the precision and reliability of CVD prediction by utilizing the advantages of sophisticated deep-learning architectures.
Results and discussion
This section evaluates the model's performance using metrics such as accuracy, precision, recall, and F1-score. The model's discriminative capability was analyzed using ROC curves and the AUC. To ensure a thorough assessment, the dataset was divided into training (70%), validation (15%), and testing (15%) subsets. This division allowed for model training, hyperparameter optimization, and performance evaluation on unseen data.
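The split and the headline metrics can be sketched as follows. The function names, the random seed, and the confusion-matrix counts are hypothetical; only the 70/15/15 proportions and the metric definitions come from the text and from standard usage.

```python
import math
import random


def split_indices(n_samples, train=0.70, val=0.15, seed=42):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train)
    n_val = int(n_samples * val)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "mcc": (tp * tn - fp * fn) / mcc_den,
    }


# 769 is the number of records left after outlier removal.
train_idx, val_idx, test_idx = split_indices(769)
print(len(train_idx), len(val_idx), len(test_idx))

# Hypothetical counts, used only to show the arithmetic.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
print(metrics)
```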
Performance comparison with ML models
An evaluation of the performance metrics for the various algorithms is presented in Table 4. Table 4 and Figures 5 and 6 show the effectiveness of the suggested model compared with existing models, namely the Decision Tree (DT), support vector machine (SVM), Random Forest (RF), multilayer perceptron (MLP), and Logistic Regression (LR). The proposed model achieved the highest accuracy of 97.88%, significantly outperforming the other models, which ranged from.45% to 88.36%. The proposed model's sensitivity (95.50%), precision (96.68%), and F-measure (95.85%) further demonstrate its capability. The high MCC value (92.53%) and AU-ROC score (95.39%) confirm the ability of the model to handle imbalanced datasets and its effectiveness in distinguishing between classes.

Performance metrics comparison for machine learning algorithm.

MCC and AU-ROC comparison for ML algorithm.
Performance comparison of different ML algorithms.
Note: Bold letters indicate the proposed work.
In contrast, although models such as SVM and DT achieve acceptable results, they lag behind the proposed model, highlighting the advancements made by the proposed approach across multiple metrics. Specifically, the proposed model achieved the lowest FPR (4.78%) and FNR (2.45%) while maintaining the highest TNR (98%). For RMSE and MAE, which quantify prediction error, the proposed method recorded the lowest values, 0.28 and 0.21, respectively, outperforming all other models. This further highlights the precision and dependability of the proposed method in predictive tasks.
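The error measures named above follow directly from confusion-matrix counts and from the differences between true and predicted labels. The sketch below uses standard definitions; the function names, counts, and labels are hypothetical and serve only to show the arithmetic.

```python
import math


def error_rates(tp, fp, tn, fn):
    """False positive, false negative and true negative rates."""
    return {
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
        "TNR": tn / (tn + fp),
    }


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


rates = error_rates(tp=90, fp=5, tn=95, fn=10)
print(rates)
print(rmse([1, 0, 1, 1], [1, 1, 1, 0]), mae([1, 0, 1, 1], [1, 1, 1, 0]))
```

By construction, TNR equals 1 − FPR, and FNR equals 1 − sensitivity, so these rates carry the same information as specificity and sensitivity.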
Among the traditional algorithms, MLP performed relatively well, with the second-lowest FPR (14%) and FNR (12.2%) and the second-highest TNR (94.6%) after the proposed method. MLP also showed a lower RMSE (0.32) and MAE (0.28) than other conventional algorithms such as SVM and RF. LR also performed well, with an FPR of 15%, an FNR of 17.3%, and a TNR of 94.3%, although its RMSE (0.38) and MAE (0.30) were slightly higher than those of the MLP. In contrast, RF and SVM showed higher error rates and lower TNRs, making them less favorable for the given task. RF, despite being a commonly used ensemble method, exhibited a relatively high RMSE (0.46) and MAE (0.34). Overall, MLP and LR show promise but fall short of the proposed method, while SVM and RF are less suitable for this scenario, as presented in Table 5 and Figures 7 and 8.

FPR, FNR and TNR comparison for ML algorithm.

RMSE and MAE comparison for ML algorithm.
Performance comparison of different ML algorithms.
Note: Bold letters indicate the proposed work.
Performance comparison with DL models
Table 6 and Figures 9 and 10 show the performance of several DL models, including DNN, LSTM, Inception-v3, Xception, MobileNetV2, and the suggested method. The suggested method performed best across all metrics, achieving the highest accuracy (97.88%), sensitivity (95.50%), specificity (94.29%), precision (96.68%), and F-measure (95.85%). These findings demonstrate that the proposed method is highly efficient in correctly identifying positive cases (high sensitivity) while maintaining a low false-positive rate (high specificity and precision). In addition, the MCC value of 92.53% and the AU-ROC value of 95.39% highlight the robustness and reliability of the proposed method in classification tasks.

Performance metrics comparison for DL models.

MCC and AU-ROC comparison for DL models.
Performance comparison of different DL algorithms.
Note: Bold letters indicate the proposed work.
Xception and Inception-v3 demonstrated strong performance, achieving accuracies of 95% and 94.8%, respectively. However, they lag behind the proposed method and MobileNetV2 in specificity, sensitivity, and MCC. Xception achieved a slightly higher AU-ROC (88.91%) than Inception-v3 (87.14%), making it a slightly better option for classification tasks in which distinguishing between classes is crucial. DNN and LSTM, although still effective, showed lower performance metrics than the other models. DNN, in particular, had the lowest MCC (81.94%) and AU-ROC (87.75%), indicating that it may not be as reliable as other models for this particular task. Thus, the proposed method was the most effective and achieved the highest performance across all metrics. MobileNetV2 offers a competitive alternative, whereas Xception and Inception-v3 provide solid performance. However, DNN and LSTM may not be the best choices for this classification task based on the given metrics.
The effectiveness of different algorithms, such as DNN, LSTM, Inception-v3, Xception, MobileNetV2, and the proposed approach, is presented in Table 7 and Figures 11 and 12. The proposed method emerges as the top performer across all evaluation criteria, recording the lowest false positive rate (FPR) at 4.78% and false negative rate (FNR) at 2.45%, along with the highest true negative rate (TNR) of 98%. Furthermore, the proposed method achieved the smallest root mean square error (RMSE) of 0.28 and mean absolute error (MAE) of 0.21, underscoring its superior efficiency and dependability in predictive tasks.

FPR, FNR and TNR comparison for DL algorithm.

RMSE and MAE comparison for DL algorithm.
Performance comparison of different deep learning algorithms.
Note: Bold letters indicate the proposed work.
In comparison, MobileNetV2 performed well, showing competitive results, with a TNR of 95.8%, FPR of 11%, and FNR of 13.1%. It also maintained a lower RMSE (0.31) and MAE (0.25) than the other existing models. Xception and Inception-v3 also showed strong performance, with Xception slightly outperforming Inception-v3 in all the metrics. In contrast, DNN and LSTM lagged behind, with DNN showing the highest FPR (16%) and FNR (19.14%), along with the highest RMSE (0.42) and MAE (0.35). This shows that, while existing models can still be effective, the proposed method offers significant improvements in minimizing errors and enhancing classification accuracy.
Performance comparison with optimization models
Table 8 and Figures 13 and 14 illustrate the performance comparison of various algorithms: the Grasshopper Optimization Algorithm (GOA), Particle Swarm Optimization (PSO), Harris Hawk Optimization (HHO), Grey Wolf Optimizer (GWO), Ant Lion Optimizer (ALO), and the proposed algorithm. The proposed algorithm consistently outperformed the other algorithms, achieving an accuracy of 97.88%, significantly higher than that of the competing methods, which ranged from 80.23% to 86.14%. Additionally, the proposed method showed superior sensitivity (95.50%) and specificity (94.29%), indicating its robustness in identifying both positive and negative cases. The precision and F-measure values also highlight the reliability of the proposed method in balancing true positives with precision, further emphasizing its effectiveness in classification tasks.

Performance metrics comparison for different optimization algorithms.

MCC and AU-ROC comparison for different optimization algorithms.
Performance comparison of different optimization algorithms.
Note: Bold letters indicate the proposed work.
The proposed algorithm leads in AU-ROC with a score of 95.39%, demonstrating its strong capability to differentiate accurately between classes. The MCC value of the proposed method (92.53%) was the highest among the compared algorithms, demonstrating a robust correlation between the original and predicted classifications. These results collectively highlight the efficacy of the proposed algorithm across all evaluated metrics, making it a more reliable and precise option than other optimization algorithms, such as PSO, GOA, HHO, GWO, and ALO. Such improvements in the performance metrics suggest that the proposed approach is well suited for applications requiring high precision, recall, and overall classification accuracy.
The confusion matrix in Figure 15 illustrates the efficiency of the suggested COVID-19 prediction model. In the confusion matrix, the model accurately identified 98.35% of the actual COVID-negative cases as negative and correctly predicted 97.25% of the actual COVID-positive cases as positive. The false positive rate was 1.65%, meaning 1.65% of COVID-negative cases were mistakenly classified as positive. Similarly, the false negative rate was 2.75%, indicating that 2.75% of the actual COVID-positive cases were incorrectly labeled as negative. Overall, the confusion matrix demonstrates a highly effective model with exceptional accuracy in distinguishing between positive and negative COVID-19 cases.

Confusion matrix.
Analysis of learning rate, loss and batch size
Figure 16 demonstrates the impact of varying learning rates and batch sizes on the loss across training epochs. The first graph reveals that a smaller learning rate (0.001) results in a slower yet steady decline in loss over time. In contrast, higher learning rates (0.01 and 0.1) initially reduce the loss more quickly but eventually stabilize or show a slight increase as training continues. This indicates that, while higher learning rates may speed up the initial learning, they can also lead to suboptimal convergence or even overfitting. In the second graph, a comparison of batch sizes reveals that a larger batch size (16) results in a faster reduction in loss than a smaller batch size (8). However, both batch sizes showed a similar trend in loss reduction, suggesting that, while larger batch sizes may speed up training, the overall effect on convergence may not be drastically different.

Loss over epoch for different learning rates.
Figure 17 shows that the accuracy steadily improved throughout the epochs, eventually converging around a high value of 97.88%. The loss values decreased significantly in the initial stages and stabilized in later epochs.

Accuracy and loss graph for training and validation.
ROC analysis
Figure 18 illustrates the Receiver Operating Characteristic (ROC) curves for different models, showing the relationship between sensitivity and 1 − specificity across varying thresholds. The Area Under the Curve (AUC) serves as a key performance indicator, where larger values reflect superior discriminatory ability. The proposed framework exhibited superior performance, attaining an AUC of 0.972 and demonstrating the highest classification accuracy. This outcome indicates that the proposed framework identifies true positives more accurately while reducing false positives compared with the other methods. The other models, including MMT (AUC = 0.962), CMGAT (AUC = 0.966), MGNN (AUC = 0.960), and DCTA (AUC = 0.957), also demonstrated strong performance but fell slightly short of the proposed framework. The close clustering of the curves for these models indicates that all methods perform well; however, the proposed framework has a slight edge in terms of overall accuracy and reliability.
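An ROC curve and its AUC can be computed from predicted scores by sweeping the decision threshold and integrating with the trapezoidal rule, as in the sketch below. The function names and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold."""
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples sharing the same score cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy labels and scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(labels, scores)
print(curve, auc(curve))
```

An AUC of 1.0 corresponds to perfect ranking of positives above negatives, and 0.5 to random ranking.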

ROC curve.
Classification comparison pre-processing techniques
A comparative analysis of classification performance using different preprocessing techniques is presented in Table 9. The study examines unprocessed images, fuzzy color preprocessing, stacked images, and a novel “proposed” method. Performance metrics, including precision, accuracy, F1-score, and recall, were used for evaluation. The simulation results indicate that the proposed preprocessing method outperforms the other techniques across all performance categories. It achieved an overall accuracy of 97.88%, with perfect recall and precision for the normal class and near-perfect scores for other categories. This indicates that the proposed method significantly improves the system's ability to accurately analyze CMR images across all categories, establishing it as a highly effective approach for medical image analysis. Other techniques showed comparatively lower performance, with the raw method achieving an average accuracy of 93%, while the fuzzy color technique reached 94.52%. Although these methods still perform reasonably well, their results are inferior to those of the stacked image approach, particularly in terms of precision and recall.
Classification comparison for pre-processing techniques.
State-of-the-art comparison
Table 10 compares the performance metrics across various models. The proposed model achieved an accuracy of 97.88%, outperforming the other models. Additionally, it maintained strong performance in sensitivity, precision, and F-measure, with values of 95.50%, 96.68%, and 95.85%, respectively. This indicates the robustness of the model in effectively predicting true positives and ensuring consistent accuracy in its predictions. Although Srinivas et al.'s model achieved a close accuracy of 97.00%, the proposed model still holds an edge, with a slightly better overall balance in the key metrics.
Comparison of performance metrics across different models.
In comparison, the methods developed by Solayman et al. and Afif et al. achieved strong results, with accuracies of 96.34% and 96.23%, respectively. However, their sensitivity and precision were slightly lower than those of the proposed model. Similarly, the frameworks introduced by Chadaga et al. and Joshi et al., both achieving 95.00% accuracy, delivered consistent outcomes but fell short of outperforming the top-tier models. Collectively, the proposed framework exhibits a notable advancement in accuracy and a balanced performance across critical evaluation metrics, establishing it as a more dependable and efficient solution compared to current methodologies.
Discussion
The DA-ResBiGRU framework improves the model's capacity to concentrate on the most pertinent data features by utilizing its dual attention mechanism. This capability enables the system to dynamically emphasize critical inputs and accurately identify complex temporal relationships and detailed patterns within cardiac magnetic resonance imaging and structured tabular data. Additionally, the integration of residual connections mitigates the vanishing gradient issue, allowing the model to develop deeper and more meaningful representations while maintaining optimal performance. The ABER algorithm is instrumental in refining the hyperparameters of the model. By maintaining a balance between the exploitation and exploration phases, ABER prevents the model from settling into local optima and guides it toward achieving optimal solutions. This method fine-tunes key parameters, such as the batch size, learning rate, and network weight, contributing to the model's high accuracy, sensitivity, and specificity.
The model's integration of multimodal data, combining CMR imaging with tabular clinical data, allows for more comprehensive analysis. Preprocessing steps such as median filtering, intensity normalization, and histogram matching for imaging data, along with outlier detection and sequential forward feature selection (SFS) for tabular data, ensure that only clean and relevant information is used. This thorough data preparation enhanced the robustness and adaptability of the model across various patient populations.
Unlike many state-of-the-art models that rely on single data modalities or lack sophisticated optimization techniques, this hybrid framework is particularly adept at identifying nuanced patterns related to long-term COVID-associated cardiovascular risks. By incorporating bidirectional gated recurrent units (BiGRUs), the model can analyze data in both forward and backward directions, enhancing its capability to identify temporal patterns essential for precise cardiovascular risk assessment. Extensive testing against a range of machine learning models (SVM, RF, LR), deep learning architectures (LSTM, Inception-v3, MobileNetV2), and optimization algorithms (GOA, PSO, GWO, HHO, and ALO) consistently showed that the proposed model delivered superior performance across all evaluated metrics. Its strong generalization to new data, minimized error rates, and high precision establish its reliability and effectiveness for clinical decision-making.
Conclusion
This study introduces an innovative decision-making framework aimed at forecasting CVD in long COVID patients by integrating state-of-the-art deep learning techniques with prescriptive analytics. The system utilizes a DA-ResBiGRU model combined with ABER to enhance predictive precision. By incorporating CMR imaging and structured tabular data, the framework addresses complexities arising from diverse and unconnected datasets. The system achieved a peak accuracy of 97.88%, alongside sensitivity, precision, and F-measure values of 95.50%, 96.68%, and 95.85%, respectively. Additionally, it exhibited robust performance in Matthews correlation coefficient (MCC) at 92.53% and AU-ROC at 95.39%. These outcomes highlight the system's dependability in identifying cardiovascular risks in long COVID patients, offering a valuable tool for early detection and improved clinical management. Future research will focus on broadening the dataset to include a wider variety of patient demographics, enhancing the model's ability to generalize across diverse populations.
Footnotes
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
All data analysed during this study are included in this article.
