Abstract
BACKGROUND:
Gastrointestinal tract (GIT) diseases affect the entire digestive system, from the mouth to the anus. Wireless capsule endoscopy (WCE) is an effective diagnostic instrument for GIT diseases. Nevertheless, accurately identifying lesions remains challenging because they vary in size, shape, color, and texture.
OBJECTIVE:
Several computer vision algorithms have been introduced to tackle these challenges, but many relied on handcrafted features, resulting in inaccuracies in various instances.
METHODS:
In this work, a novel Deep SS-Hexa model is proposed that combines two different deep learning structures to extract two different kinds of features from WCE images and detect various GIT ailments. The gathered images are denoised with a weighted median filter to remove noisy distortions and are augmented to enlarge the training data. The structural and statistical (SS) feature extraction process is divided into two phases for the analysis of distinct regions of the gastrointestinal tract. In the first phase, statistical features of the image are retrieved using MobileNet with the SiLU activation function to highlight the relevant features. In the second phase, the segmented intestine images are transformed into structural features to learn local information. The SS features are fused in parallel, and the most relevant features are selected with the walrus optimization algorithm. Finally, a deep belief network (DBN) is used to classify the GIT diseases into hexa classes, namely normal, ulcer, pylorus, cecum, esophagitis, and polyps, on the basis of the selected features.
RESULTS:
The proposed Deep SS-Hexa model attains an overall average accuracy of 99.16% in GIT disease detection on the KVASIR and KID datasets, achieving a high level of accuracy at minimal computational cost.
CONCLUSIONS:
The proposed Deep SS-Hexa model improves the overall accuracy by 0.04% and 0.80% over GastroVision and the genetic algorithm-based method on the KVASIR dataset, and by 0.60% and 1.21% over Modified U-Net and WCENet on the KID dataset, respectively.
Keywords
Introduction
Gastrointestinal tract (GIT) diseases affect the digestive tract from the mouth to the anus. The GIT is a complex system responsible for digestion and nutrient absorption, and it is susceptible to a variety of diseases that significantly affect health and quality of life [1]. These disorders, such as acid reflux and irritable bowel syndrome, manifest in diverse ways and require tailored approaches to diagnosis and treatment. It is crucial for healthcare professionals and patients to understand the etiology, symptoms, and management of GIT diseases. Heartburn, indigestion (dyspepsia), bloating, and constipation are frequent gastrointestinal problems in the community [2, 3]. According to a global online survey-based study [4], less than 40% of people worldwide experience digestive issues such as constipation, diarrhea, or irritable bowel syndrome, with respective incidence rates of 4.7%, 11.7%, and 4.1%. Numerous factors, such as a low-fibre diet, lack of regular exercise, travel or other changes in routine, high consumption of dairy products, and increased stress, can affect the motility (ability to move) of the GI tract [5]. Wireless capsule endoscopy (WCE) [6] offers a non-invasive method for screening gastrointestinal (GI) diseases, covering the esophagus, stomach, large intestine, and small bowel [7]. It is particularly recommended for diagnosing intestinal tumours, Crohn’s disease, celiac disease, and unexplained GIT bleeding. The swallowable capsule endoscope integrates LED lighting, a micro camera sensor, button cell batteries, a microprocessor, and radio frequency technology [8]. Endoscopic assessments rely heavily on gastroenterologists’ expertise, leading to variability in results.
Manual examination of endoscopic data is tedious and subjective, but autonomous systems streamline the process, improving disease recognition, efficiency, and cost-effectiveness, potentially facilitating early diagnosis and disease management [9, 10].
Conventional endoscopic evaluation relies on gastroenterologists, introducing variability in results [11]. Manual analysis is time-consuming, requires intense concentration, and depends on clinician expertise [12]. Autonomous systems alleviate clinician workload [13] and improve the consistency and efficiency of disease recognition while shortening patient care time, making them more cost-effective [14]. In addition, early recognition of gastrointestinal abnormalities is paramount for effective treatment and improved patient outcomes [15]. Conventional diagnostic methods often pose challenges such as invasiveness and human error, highlighting the need for more efficient and accurate detection techniques. WCE emerges as a non-invasive solution, particularly beneficial for diagnosing intestinal tumors, Crohn’s disease, and gastrointestinal bleeding. Integrating deep learning (DL) [16] and machine learning (ML) [17] into WCE improves diagnostic accuracy by analyzing the vast amounts of visual data produced by swallowable capsule endoscopes. This advancement promises timely and precise detection, ultimately enhancing patient care and reducing healthcare costs [18]. In this paper, an automatic feature analysis model and feature selection approach is established to categorize the five distinct types of GIT diseases from compressed WCE images. The main contributions are summarized as follows:
This work introduces a novel deep learning-based classification model with optimized SS features to identify the hexa classes of GIT diseases. The structural features are retrieved from the segmented images, and the statistical features are extracted with MobileNet using the SiLU activation function to highlight the relevant features. The SS (structural and statistical) features are combined, and an optimization-based feature selection strategy is used to reduce the complexity of the prediction model during classification. The DBN is used to classify the GIT diseases into normal, ulcer, pylorus, polyps, cecum, and esophagitis based on the selected features. The competence of the proposed Deep SS-Hexa model is assessed with specificity, precision, accuracy, recall, and F1 score.
The rest of this paper is organized as follows. Prior research on intestinal disease prediction is summarized in Section 2. Section 3 gives a detailed description of the proposed Deep SS-Hexa model for identifying GIT diseases. Section 4 contains the experimental findings and discussion. Lastly, the conclusion and future extensions are given in Section 5.
Literature review
Several recent works have aimed to improve the accuracy of GIT disease classification. This section briefly summarizes some of the most recent research.
In 2023, Obayya et al. [19] introduced the Modified Salp Swarm Algorithm with DL-based GIT Disease Classification (MSSADL-GITDC) for endoscopic images. A median filter is employed for image smoothing within the MSSADL-GITDC methodology. Furthermore, by integrating a class attention layer, an augmented capsule network was developed for feature extraction. Experimental validation on the Kvasir-V2 database demonstrates the efficacy of the MSSADL-GITDC method, with gastrointestinal classification accuracy reaching a maximum of 98.03%.
In 2023, Aliyi et al. [20] developed a real-time automated recognition algorithm for the classification and segmentation of anomalies linked to lower GIT cancer. This method enables the identification, classification, and segmentation of common pathologies, structural regions, and the intestine research scale from colonoscopy images in real time. The system was built using pre-trained object detection architectures, namely YOLOv4, SSD, and YOLOv5, whose performances were rigorously compared with minimal adjustments to the hyperparameters. On the testing dataset, the YOLOv5 model achieved a mean average precision (mAP) of 98.064%.
In 2023, Gunasekara et al. [21] introduced GIT-NET, a weighted average ensemble model for classifying GI tract disorders. Individual models may struggle to capture all class characteristics, leading to misdiagnosis, because they may prioritize learning the traits of specific classes over others. To address this, the authors proposed an ensemble approach leveraging predictions from pre-trained models: DenseNet201 (94.54% accuracy), InceptionV3 (88.38% accuracy), and ResNet50 (90.58% accuracy).
In 2022, Ramamurthy et al. [22] developed an automated DL classification method to categorize various gastrointestinal disorders. To increase the number of samples for better generalization, the input images were progressively augmented. The augmented samples were fed to two separate networks, EfficientNet B0 and Effimix. Dropout regularization and feature fusion were used to merge the features from these two models, and the proposed model yields an accuracy of 97.99%.
In 2022, Khan et al. [23] designed a deep learning approach with Moth-Crow optimization for categorizing GIT diseases. The original images were first contrast-enhanced, and three data augmentation processes were then applied. A transfer learning method was used to train two pre-trained DL models on images of GIT ailments. A hybrid Crow-Moth optimization technique was used to select features, which were fused using the distance-canonical correlation (D-CCA) method. ML algorithms were then used to identify GIT illnesses from the fused feature vector.
In 2021, Hmoud et al. [24] evaluated three DL-based CNN networks, ResNet-50, GoogleNet, and AlexNet, for their ability to diagnose lower gastrointestinal disorders. Through fine-tuning, the pretrained CNNs adapt their learned behaviours to the new classification task. Using the softmax activation function, input images are classified into five classes based on the deep feature vector. Notably, AlexNet achieved an accuracy of 97.0% and an AUC of 99.98%.
In 2021, Sharif et al. [25] devised a geometric features-based fusion of deep CNN features for classifying GIT diseases. First, the suggested contrast-enhanced color features method was employed to extract features from diseased regions in WCE images. The geometric features and special features were merged, and a conditional-entropy procedure was subsequently used to select the best features. Finally, a K-Nearest Neighbor classifier was applied to the chosen features. The suggested technique has a best classification accuracy of 99.42%.
In 2021, Ramzan et al. [26] introduced a CADx system for diagnosing and categorizing gastrointestinal tract disorders, aiming to enhance predictive accuracy. Their framework begins with preprocessing in the LAB colour space, followed by the fusion of DL features from ResNet50, InceptionNet, and VGG-16 with local binary pattern (LBP) texture features. The authors report that a subspace-based discriminant classifier surpassed existing methods, achieving 95.02% accuracy on the KVASIR dataset.
In 2019, Gamage et al. [27] introduced GI-Net for the categorization of GIT diseases from endoscopy images. To predict eight classes of gastric tract abnormalities, a combination of pre-trained CNN structures, namely ResNet-18, DenseNet-201, and VGG-16, was used as feature extractors, followed by a pooling layer, resulting in an ensemble of deep features represented as a single feature vector. The findings show better performance than prior approaches, with an accuracy exceeding 97.0%.
In 2019, Cogan et al. [28] designed a DL system for modular and automatic preprocessing of GIT images. The main functional components of the Modular Adaptive Preprocessing for GIT Imaging (MAPGITI) system are edge reduction, contrast adjustment, filtering, colour mapping, and scaling. The accuracy results were 0.97, 0.98, and 0.98 for NASNet, Inception-v4, and Inception-ResNet-v2, respectively.
In 2023, Jha et al. [29] developed GastroVision, a multi-centre open-access dataset of GI images. It encompasses anatomical landmarks, clinical anomalies including polyp removal cases, and normal observations from the GIT. Comprising 8,000 images from Karolinska University Hospital (Sweden) and Baerum Hospital (Norway), the dataset was meticulously labelled by experienced GI endoscopists. Additionally, the dataset’s significance was validated through benchmarking against widely used deep learning baseline models.
In 2023, Nouman et al. [30] introduced a GIT disease classification system that uses an optimized brightness-controlled contrast-enhancement technique to improve the quality of WCE images. A genetic algorithm (GA) regulates contrast and illumination levels via a tailored fitness function. Following this enhancement, multiple transformations are applied to the WCE images to augment the dataset. Finally, ML algorithms were trained on the extracted features to classify GI tract diseases.
In 2023, Raut et al. [31] developed a modified U-Net for both segmentation and categorization of GIT illnesses from WCE images. Initially, preprocessing steps involving filtering and contrast enhancement were applied to the WCE images sourced from the KID Atlas dataset. Lesion segmentation using the Modified U-Net was enhanced with parameter adjustment via the Deep Hunting with Distance-based Solution Update (DH-DSU) method. Extracted features were fed into a deep neural network (DNN) with DH-DSU for hidden neuron optimization, enabling classification of ulcer, polypoid, and inflammatory GI tract diseases.
In 2021, Jain et al. [32] introduced WCENet, a combination of Grad-CAM++ and a custom SegNet designed to pinpoint abnormalities in WCE images. Operating in two stages, WCENet first employs an attention-based CNN to categorize images into vascular, polyp, inflammatory, and normal groups. Subsequently, if an image falls into an abnormal category, the combination of Grad-CAM++ and the customized SegNet is used for precise anomaly localization. The suggested WCENet achieves a classification accuracy of 98%.
Tabular comparison of the literature works with its advantages and disadvantages
In the reviewed literature, numerous methods use WCE images as input for accurate detection of GIT diseases. Feature extraction and the selection of important features remain challenging stages in the classification of GIT diseases, including bleeding, ulcers, and polyps. Issues that need further investigation include identifying suitable activation functions and the possibility of feeding spatial-domain and wavelet-domain inputs to a CNN in separate branches that share the decision, which comes at the cost of additional convolution filters. This research focuses on identifying GIT diseases from WCE images via DL while lowering the cost of diagnosis.
In this section, a novel Deep SS-Hexa model is proposed to classify GIT diseases from endoscopy images into hexa classes, namely normal, ulcer, pylorus, polyps, cecum, and esophagitis. The schematic representation of the proposed model is displayed in Fig. 1.
The overall representation of the proposed model.
The KVASIR dataset [33] contains endoscopic images of the gastrointestinal tract divided into classes. It includes anatomical landmarks such as the pylorus, z-line, and cecum, as well as pathological findings including polyps, ulcerative colitis, and esophagitis. The collection also includes dyed and lifted polyps and dyed resection margins, which are linked to the removal of polyps. It consists of 8000 images, with 1000 images in each class. The dataset is split into 75% training data and 25% test data for the experimental setup. Moreover, for testing the proposed algorithm, we collected 200 sample images of the entire GI tract from the KID dataset [34]. During cross-validation, the test set offers an unbiased estimate of the final model, while the training set is used to tune hyperparameters and select the optimal model.
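The 75%/25% split described above can be illustrated with a small per-class (stratified) split. This is only a sketch under stated assumptions: the text does not say whether the split was stratified, and the function name `stratified_split`, the integer class labels, and the fixed seed are hypothetical choices made for the example.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.75, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = int(round(len(indices) * train_fraction))
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return train_idx, test_idx

# Hypothetical label list mirroring the stated layout: 8 classes x 1000 images.
labels = [c for c in range(8) for _ in range(1000)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 6000 2000
```

Splitting each class separately keeps the 1000-per-class balance in both subsets, so every class contributes 750 training and 250 test images.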
Preprocessing
The weighted median filter is used to pre-process the endoscopic images and enhance image quality. The algorithm allows the denoising parameters to be adjusted to suit varying illumination conditions and directions, yielding a more accurate denoised endoscopic image with fewer iterations. The bilateral filter preserves edges while smoothing out noise, resulting in cleaner images. It accomplishes this by blending neighboring pixel values based on both their spatial proximity and their intensity similarity. The crossover and mutation probabilities of the genetic procedure are dynamically adjusted based on the steady-state regional population density, aiming for enhanced precision. The weighted median filtering process is defined as follows:
where
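To make the idea of weighted median filtering concrete, here is a minimal pure-Python sketch. It is not the authors' formulation: the 3x3 integer kernel `KERNEL`, the edge-replication border handling, and the function names are assumptions chosen for illustration. Each output pixel is the value at which the cumulative weight of the sorted window first reaches half of the total weight.

```python
def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]

def weighted_median_filter(image, kernel):
    """Filter a 2-D list of intensities; borders use edge replication."""
    height, width = len(image), len(image[0])
    k = len(kernel) // 2
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            values, weights = [], []
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr = min(max(r + dr, 0), height - 1)  # clamp row index
                    cc = min(max(c + dc, 0), width - 1)   # clamp column index
                    values.append(image[rr][cc])
                    weights.append(kernel[dr + k][dc + k])
            out[r][c] = weighted_median(values, weights)
    return out

# Assumed centre-weighted kernel; the weights used in the paper are not given here.
KERNEL = [[1, 2, 1],
          [2, 3, 2],
          [1, 2, 1]]

noisy = [[10, 10, 10, 10],
         [10, 255, 10, 10],   # a single impulse-noise pixel
         [10, 10, 10, 10]]
denoised = weighted_median_filter(noisy, KERNEL)
```

In this toy image the isolated bright pixel is replaced by the surrounding intensity, because its weight (3) is far below half of the total window weight (15).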
Semantic segmentation of clinical images has been successfully achieved with CNN encoder-decoders such as the U-net [35] architecture. The preprocessed WCE images are passed directly to the U-net to segment the affected region. The proposed U-net architecture is built from two pathways: the encoder constitutes the first pathway, and the symmetric decoder the second. The encoder captures the contextual information of the image, whereas the decoder achieves precise localization using transposed convolutions. Input and output have identical dimensions because the design performs pixel-by-pixel classification for localization and boundary identification. The encoder uses convolutional and max-pooling layers, and the decoder uses both simple convolution and transposed convolution layers.
Architecture of proposed segmentation model.
In Fig. 2, after a multilevel decomposition of the input image in the encoder pathway, a max-pooling layer aids in reducing the dimensions of the feature map. Red arrows symbolize the max-pooling layer, blue arrows denote 3
Every process in the encoder path involves two convolution layers. In the initial phase, the channel count shifts from 1 to 64. Decoder pointing blue arrow signifies the max-pooling layer, reducing the image size from 192
Subsequently, the enlarged image is combined with the one obtained from the encoder path. This mechanism serves to amalgamate data from preceding levels, enhancing the precision of predictions. The U-Net architecture features a rectangular feature map measuring 192
Commencing at 24
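The way feature-map size and channel count evolve through a symmetric encoder-decoder can be traced with a few lines of Python. This is a shape-bookkeeping sketch only, not the authors' network: the depth of three pooling stages, the doubling of channels at each level, 'same' padding, and the single-channel output are assumptions, chosen because they take a 192-pixel input with 64 first-stage channels down to a 24-pixel bottleneck, consistent with the sizes mentioned in the text.

```python
def unet_shapes(size=192, in_channels=1, base_channels=64, depth=3):
    """Trace (stage, channels, height/width) through a symmetric encoder-decoder."""
    trace = [("input", in_channels, size)]
    channels, skips = base_channels, []
    # Encoder: two 'same'-padded convolutions per level, then 2x2 max-pooling.
    for level in range(depth):
        trace.append((f"enc{level}", channels, size))
        skips.append((channels, size))
        size //= 2       # max-pooling halves height and width
        channels *= 2    # assumed: channel count doubles at the next level
    trace.append(("bottleneck", channels, size))
    # Decoder: a transposed convolution doubles the size, then the skip is merged.
    for level in reversed(range(depth)):
        skip_channels, skip_size = skips[level]
        size *= 2
        channels //= 2
        # The upsampled map must match the encoder map it is combined with.
        assert (channels, size) == (skip_channels, skip_size)
        trace.append((f"dec{level}", channels, size))
    trace.append(("output", 1, size))  # assumed single-channel segmentation mask
    return trace

for stage, channels, size in unet_shapes():
    print(f"{stage:10s} channels={channels:4d} size={size}x{size}")
```

The internal assertion encodes the design constraint behind the skip connections: each decoder level has to recover exactly the spatial size of the encoder level it is merged with, which is why input and output dimensions end up identical.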
MobileNet is used for feature extraction, discarding redundant and unimportant information from the input images. First, statistical features of the WCE image are retrieved and highlighted using MobileNet. MobileNet replaces standard convolutions with depthwise separable convolutions (DSConv), which consist of two operations: depthwise convolution (DWConv) and pointwise convolution (PWConv). DSConv-based MobileNet performs feature extraction with fewer parameters than conventional networks, which lowers the hardware resources the network requires. Figure 3 shows the statistical feature extraction: (A) the preprocessed input image and (B) the resulting statistical features.
Examples of Statistical feature extraction.
In DWConv, there are no multidimensional convolution kernels, and each kernel handles a single channel. DWConv cannot expand the number of channels. Furthermore, since each convolution is performed independently on each channel, it is not feasible to use features from several channels at the same spatial position. To create the final feature maps, PWConv is used to combine the feature maps produced by DWConv. The convolutional filter of PWConv is 1
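The parameter saving of a depthwise separable convolution, and the SiLU activation mentioned for this network, can be checked with a short calculation. The layer sizes below (3x3 kernel, 64 input channels, 128 output channels) are illustrative assumptions rather than values from the paper, and bias terms are ignored.

```python
import math

def silu(x):
    """SiLU (swish) activation, x * sigmoid(x), written to avoid overflow."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)

def standard_conv_params(kernel, in_ch, out_ch):
    """Weights in a standard convolution (no bias)."""
    return kernel * kernel * in_ch * out_ch

def dsconv_params(kernel, in_ch, out_ch):
    """Weights in a depthwise separable convolution (no bias)."""
    depthwise = kernel * kernel * in_ch  # one k x k filter per input channel
    pointwise = in_ch * out_ch           # 1 x 1 filters that mix the channels
    return depthwise + pointwise

std = standard_conv_params(3, 64, 128)  # 73728
dsc = dsconv_params(3, 64, 128)         # 576 + 8192 = 8768
print(std, dsc, round(dsc / std, 4))
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer, which is the kind of reduction that makes MobileNet attractive on constrained hardware.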
The value of the multiplier is clear for context, and for the result halt in GIT disease recognition, the value of multiplier
In MobileNet, a reduction variable identified by
Under different conditions, the width multiplier and resolution multiplier help regulate the window size for precise prediction. During feature extraction, an input image is passed through the MobileNet architecture, resulting in a set of high-level feature maps.
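As a rough illustration of how the width and resolution multipliers trade capacity for cost, the sketch below uses the multiply-add count commonly given for depthwise separable layers in the MobileNet literature. The parameter names `alpha` and `rho`, the layer sizes, and the function name are assumptions for this example and are not taken from the text.

```python
def dsconv_mult_adds(kernel, in_ch, out_ch, feature_size, alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable layer with both multipliers applied."""
    m = int(alpha * in_ch)         # width multiplier thins the input channels
    n = int(alpha * out_ch)        # ... and the output channels
    f = int(rho * feature_size)    # resolution multiplier shrinks the feature map
    depthwise = kernel * kernel * m * f * f
    pointwise = m * n * f * f
    return depthwise + pointwise

base = dsconv_mult_adds(3, 64, 128, 56)
thin = dsconv_mult_adds(3, 64, 128, 56, alpha=0.5)
small = dsconv_mult_adds(3, 64, 128, 56, rho=0.5)
print(base, thin, small)
```

Halving the resolution divides the cost by four, because both spatial dimensions shrink, while halving the width reduces the dominant pointwise term by roughly a factor of four as well.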
The walrus optimization (WO) algorithm [36] uses the extracted features to choose the optimal feature subset. WO is a population-based metaheuristic in which walruses are the population members, each representing a candidate solution to the optimization problem. The process of updating each walrus’s position in WO is divided into three phases, reflecting the natural activities of walruses.
Phase 1: Feeding phase
Walruses have a diverse diet, consuming approximately sixty species of marine organisms. The walrus with the longest tusks, typically the strongest in the group, takes the lead in foraging for food among its peers. In the algorithm, tusk length corresponds to the quality of a candidate solution’s objective function value. Consequently, the strongest walrus is the best candidate solution, with the most favourable objective function value. Following it moves the walruses across different areas of the search space, enhancing the exploration capability during global search.
Phase 2: Migration phase
The migration of walruses to rocky beaches is a natural behaviour triggered by late summer’s warming air. The Walrus Optimization (WO) algorithm harnesses this migration pattern to guide walruses in exploring search spaces suitable for their needs. In this model, each walrus relocates to a distinct, randomly selected location within the search space.
Phase 3: Escaping and fighting phase
Walruses face constant threats from killer whales and polar bears, leading to adaptive shifts in position for defense and evasion. Emulating these natural behaviours enhances the exploitation power of the algorithm in the neighbourhood of existing solutions. The most visually appealing walrus embodies the best feature, and the least appealing one represents the worst feature. Evading attacks from killer whales and polar bears represents the favourable outcome, while their potential lethality constitutes the downside. Through feature selection, the undesirable features are eliminated. To simulate this behaviour in the WO algorithm, a neighbourhood is assumed around each walrus, and a new position is first generated randomly in this region using Eqs (5) and (6). Then, in accordance with Eq. (7), this new position replaces the prior position if the value of the objective function is improved.
Where
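The three phases described above can be sketched as a small feature-selection loop. This is a simplified illustration and not the authors' Eqs (5) to (7): the binarization threshold of 0.5, the symmetric shrinking neighbourhood in phase 3, the toy fitness function, and the `RELEVANCE` scores are all assumptions made for the example. Lower fitness is treated as better, and a move is kept only if it improves the walrus's fitness.

```python
import random

RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02]  # hypothetical usefulness per feature

def toy_fitness(mask):
    """Lower is better: reward useful features, penalize subset size."""
    gain = sum(r for r, keep in zip(RELEVANCE, mask) if keep)
    return -gain + 0.3 * sum(mask)

def to_mask(position):
    """Assumed binarization: a feature is selected when its coordinate exceeds 0.5."""
    return [p > 0.5 for p in position]

def walrus_select(fitness, n_features, n_walruses=10, n_iter=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_features)] for _ in range(n_walruses)]
    fit = [fitness(to_mask(p)) for p in pop]

    def try_move(i, candidate):
        candidate = [min(max(x, 0.0), 1.0) for x in candidate]  # stay in [0, 1]
        value = fitness(to_mask(candidate))
        if value < fit[i]:  # greedy acceptance: keep only improvements
            pop[i], fit[i] = candidate, value

    for t in range(1, n_iter + 1):
        for i in range(n_walruses):
            # Phase 1 (feeding): move toward the strongest walrus.
            best = pop[min(range(n_walruses), key=fit.__getitem__)]
            try_move(i, [x + rng.random() * (b - rng.randint(1, 2) * x)
                         for x, b in zip(pop[i], best)])
            # Phase 2 (migration): move relative to a randomly chosen other walrus.
            k = rng.choice([j for j in range(n_walruses) if j != i])
            if fit[k] < fit[i]:
                cand = [x + rng.random() * (xk - rng.randint(1, 2) * x)
                        for x, xk in zip(pop[i], pop[k])]
            else:
                cand = [x + rng.random() * (x - xk)
                        for x, xk in zip(pop[i], pop[k])]
            try_move(i, cand)
            # Phase 3 (escaping and fighting): local step in a shrinking neighbourhood.
            radius = 1.0 / t
            try_move(i, [x + rng.uniform(-radius, radius) for x in pop[i]])

    best_i = min(range(n_walruses), key=fit.__getitem__)
    return to_mask(pop[best_i]), fit[best_i]

mask, value = walrus_select(toy_fitness, len(RELEVANCE))
print(mask, value)
```

In a real pipeline the toy fitness would be replaced by a measure tied to classification quality, for example a validation error computed on the selected feature subset plus a penalty on its size.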
A Deep Belief Network (DBN) [37] is a kind of artificial neural network (ANN) that consists of multiple layers of hidden units. It is composed of a stack of Restricted Boltzmann Machines (RBMs), where the output of each RBM layer serves as the input to the subsequent layer, ending in disease classification. Figure 4 displays the architecture of the Deep SS-Hexa model for GIT diseases.
Architecture of proposed Deep SS-Hexa model.
The first step in building a DBN involves pre-training each RBM layer in an unsupervised manner. RBMs learn to retrieve the traits from the input images without supervision. The energy function
Where
Where
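For readers unfamiliar with RBMs, the sketch below evaluates the textbook energy of a binary RBM, E(v, h) = -a·v - b·h - vᵀWh, and the resulting hidden-unit activation probabilities. The symbol names (`v`, `h`, `W`, `a`, `b`) and the tiny example values are assumptions for illustration; the notation in the original equations may differ.

```python
import math

def rbm_energy(v, h, W, a, b):
    """Textbook binary-RBM energy: -a.v - b.h - v^T W h."""
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = sum(bj * hj for bj, hj in zip(b, h))
    interaction = sum(v[i] * W[i][j] * h[j]
                      for i in range(len(v)) for j in range(len(h)))
    return -visible_term - hidden_term - interaction

def hidden_probabilities(v, W, b):
    """p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i W_ij)."""
    return [1.0 / (1.0 + math.exp(-(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))))
            for j in range(len(b))]

# Hypothetical toy RBM with 3 visible and 2 hidden units.
v = [1, 0, 1]
h = [1, 0]
W = [[0.5, -0.2],
     [0.1, 0.3],
     [-0.4, 0.2]]
a = [0.1, 0.0, -0.1]   # visible biases
b = [0.2, -0.3]        # hidden biases

energy = rbm_energy(v, h, W, a, b)
probs = hidden_probabilities(v, W, b)
print(round(energy, 4), [round(p, 4) for p in probs])
```

During unsupervised pre-training, lower energy is assigned to configurations that resemble the training data, and the hidden probabilities of one RBM become the input of the next layer in the stack.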
This section analyzes the performance of the proposed GIT classification approach on the gathered images using specificity, recall, precision, accuracy, and F1 score. Accuracy is the primary metric used to quantify the competence of the proposed approach. A comparison of the proposed approach with conventional deep learning models is also presented in this section.
Figure 5 shows the results of the proposed Deep SS-Hexa model on sample endoscopy images from the KVASIR dataset. Moreover, for testing the proposed algorithm, we collected 200 sample images of the entire GI tract from the KID dataset. The medical images from the gathered datasets are denoised with a weighted median filter to remove undesirable distortions. Feature extraction is then performed in two stages: the images are segmented to extract structural features (edges, points), and statistical features (skewness and kurtosis) are extracted from the pre-processed images using MobileNet. Based on the selected best features, the DBN classifies the images into cecum, esophagitis, pylorus, ulcer, polyps, or normal cases.
Experimental results of the proposed Deep SS-Hexa model.
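The statistical descriptors named above, skewness and kurtosis, and the fusion of structural and statistical features into one vector can be illustrated as follows. This is a minimal sketch using population (biased) moment definitions and plain concatenation; how the paper's pipeline actually computes and fuses these quantities is not specified here, and the example vectors are hypothetical.

```python
def skewness_and_kurtosis(values):
    """Population skewness and (non-excess) kurtosis of a list of intensities."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    # Note: a constant image region (m2 == 0) would need special handling.
    return m3 / m2 ** 1.5, m4 / m2 ** 2

def fuse(structural, statistical):
    """Assumed fusion rule: concatenate the two feature vectors."""
    return list(structural) + list(statistical)

intensities = [1, 2, 3, 4, 5]              # hypothetical pixel intensities
skew, kurt = skewness_and_kurtosis(intensities)
structural = [12, 7]                       # hypothetical edge and key-point counts
fused = fuse(structural, [skew, kurt])
print(fused)
```

The fused vector is what a selection step would then prune before classification.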
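The performance of the proposed Deep SS-Hexa model was evaluated using the F1 score, recall, precision, accuracy, and specificity, which have the standard definitions

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \]

\[ \text{Specificity} = \frac{TN}{TN + FP}, \qquad \text{F1 score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}, \]

where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.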
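For a multi-class problem such as the hexa-class task, these metrics are usually computed per class in a one-vs-rest fashion from the confusion matrix. The sketch below shows that computation on a small hypothetical 3-class matrix; the numbers are illustrative only and are not results from the paper.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class c missed
        fp = sum(confusion[r][c] for r in range(n)) - tp  # others labelled as c
        tn = total - tp - fn - fp
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        metrics.append({
            "accuracy": (tp + tn) / total,
            "precision": precision,
            "recall": recall,
            "specificity": tn / (tn + fp),
            "f1": 2 * precision * recall / (precision + recall),
        })
    return metrics

# Hypothetical confusion matrix for three classes with 50 samples each.
confusion = [[48, 1, 1],
             [2, 47, 1],
             [0, 3, 47]]
metrics = per_class_metrics(confusion)
overall_accuracy = sum(confusion[i][i] for i in range(3)) / 150
print(metrics[0], overall_accuracy)
```

A class that is never predicted would make the precision denominator zero, so a production version should guard those divisions.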
Efficacy evaluation of the proposed Deep SS-Hexa model
Accuracy graph of the proposed Deep SS-Hexa model.
Loss graph of the proposed Deep SS-Hexa model.
The accuracy graph in Fig. 6 displays the number of epochs on the horizontal axis and the accuracy value on the vertical axis. The loss graph in Fig. 7 shows that the loss of the proposed model decreases with increasing epochs. The proposed Deep SS-Hexa model achieves high accuracy in categorizing the different classes of GIT diseases from endoscopy images. In the experiments, the model reached a testing accuracy of 99.16% with a low error rate at 50 training epochs.
Confusion matrix of the proposed Deep SS-Hexa model.
The confusion matrix for the hexa-class classification of the proposed network is shown in Fig. 8. The proposed network detects anomalies such as ulcers, pylorus, cecum, and esophagitis with 99.16% accuracy. The confusion matrix shows that the proposed model has a low misclassification rate and achieves excellent accuracy in categorizing GIT diseases.
Each DL network was evaluated to confirm that the proposed model achieves better accuracy. The proposed Deep SS-Hexa model was compared with conventional models, viz. AlexNet, LeNet, ResNet-50, and VGG-16, in terms of precision, specificity, F1 score, recall, and accuracy. The accuracy achieved by the proposed Deep SS-Hexa model is 99.16%, which is higher than that of the traditional DL networks [41, 42, 43, 44].
Comparison among classical DL networks
Comparison of different segmentation networks
The classification accuracies of the different DL networks are compared in Table 3. The classic networks do not attain better results than the proposed MobileNet, which improves the overall accuracy by 9.98%, 10.6%, 7.97%, and 4.84% over AlexNet, LeNet, VGG-16, and ResNet-50, respectively.
Table 4 compares the segmentation performance of various techniques using the Dice score and the Jaccard index. In comparison to MDSU-Net [38], FCN [39], and SAN-Net [40], the U-net improves the overall Jaccard index by 11.4%, 10.0%, and 27.6%, respectively. Furthermore, the Attention U-net surpasses MDSU-Net [38], FCN [39], and SAN-Net [40] in the overall Dice index by 16.2%, 10.5%, and 32.1%, respectively. The traditional segmentation networks perform poorly compared to the U-net.
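For reference, the Dice score and Jaccard index of a predicted mask against a ground-truth mask can be computed as in the sketch below. The two 3x3 binary masks are hypothetical, and the function name is chosen for this example.

```python
def dice_and_jaccard(pred, truth):
    """Dice score and Jaccard index for equally sized binary masks (nested lists of 0/1)."""
    p = [bool(x) for row in pred for x in row]
    t = [bool(x) for row in truth for x in row]
    intersection = sum(a and b for a, b in zip(p, t))
    p_sum, t_sum = sum(p), sum(t)
    union = p_sum + t_sum - intersection
    # Two empty masks would make both denominators zero and need a special case.
    return 2 * intersection / (p_sum + t_sum), intersection / union

truth = [[0, 1, 1],
         [0, 1, 1],
         [0, 0, 0]]
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 1, 0]]
dice, jaccard = dice_and_jaccard(pred, truth)
print(dice, jaccard)  # 0.75 0.6
```

For a single pair of masks the two measures are linked by Dice = 2J / (1 + J), so they always rank an individual prediction the same way; averages over a test set need not satisfy that identity exactly.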
Segmentation fallouts of various segmentation models.
Figure 9 depicts the segmentation results of the proposed U-Net alongside other networks. With a Jaccard index of 0.807 and a Dice index of 0.842, the U-Net closely approximates the ground truth and surpasses the other models, whereas the traditional segmentation methods exhibit inferior performance. The proposed U-Net not only reduces the false positive rate but also enhances overall system performance. Leveraging the U-Net allows for precise segmentation of fine structural details from WCE images, as demonstrated by the Dice coefficient, so its predicted results partition the structural elements of the intestine with a high degree of accuracy. The U-Net is also more efficient in conserving time and resources. Therefore, the U-Net is the most suitable method for this segmentation task.
Accuracy comparison – Existing models vs proposed model
Table 5 compares state-of-the-art (SOTA) techniques with the proposed Deep SS-Hexa model on the KVASIR and KID datasets. The proposed Deep SS-Hexa model improves the overall accuracy by 0.04% and 0.80% over [29, 30] on the KVASIR dataset, and by 0.60% and 1.21% over [31, 32] on the KID dataset, respectively. According to Table 5, the proposed Deep SS-Hexa model yields an average accuracy of 99.16%, which is higher than that of prior works. Thus, the proposed Deep SS-Hexa model is more competent than the existing techniques for the categorization of the hexa classes of GIT diseases.
In this work, the goal is to detect GIT diseases by classifying the hexa classes using DL algorithms with low complexity and high accuracy. For the experimental analysis, we used both the KVASIR and KID datasets. The results were assessed using multiple evaluation measures: Dice index, Jaccard index, accuracy, specificity, precision, recall, and F1 score. Table 2 presents the performance metrics of the classification model for the hexa classes of GIT disorders. It reports the percentage of correctly categorized cases for each class (accuracy) and the harmonic mean of precision and recall (F1 score), which balances the two metrics for each class. Overall, Table 2 provides insight into the performance of the model across the different disorder classes and shows a high accuracy of 99.16%. The training and testing graphs are displayed in Figs 6 and 7: the training graph reflects the optimization of the proposed model, in which parameters are adjusted to minimize the loss function, while the testing graph assesses the performance of the proposed Deep SS-Hexa model on the gathered data to gauge its generalization ability. Table 3 compares the performance of various classic DL networks across the evaluation metrics. Each row corresponds to a specific DL network, and the columns display specificity, recall, precision, F1 score, and accuracy. MobileNet attains particularly high scores across all metrics, indicating its superior performance compared to the other networks. Table 4 compares different segmentation methods using the Jaccard and Dice similarity indices, where higher values indicate better segmentation accuracy. U-net achieves the highest scores among the listed methods, indicating its effectiveness in accurately segmenting objects in images.
Moreover, Table 5 compares the accuracy of existing models with the proposed Deep SS-Hexa model across different datasets, demonstrating the superior performance of the Deep SS-Hexa model on both the KVASIR and KID datasets in the classification of the hexa classes of GIT diseases.
Conclusion
This paper introduces a novel Deep SS-Hexa model for detecting various GIT diseases in compressed WCE images. The feature extraction module is divided into two phases for the analysis of distinct regions of the gastrointestinal tract. In the first phase, the statistical features of the image are retrieved using MobileNet to highlight the relevant features. In the second phase, the segmented intestine images are transformed into structural features to learn local information, and the SS features are selected by the WO algorithm. The DBN is used to classify the GIT diseases into normal, ulcer, pylorus, polyps, cecum, and esophagitis based on the selected features. The proposed Deep SS-Hexa model attains a high accuracy of 99.16% in the detection of GIT diseases. The proposed MobileNet improves the overall accuracy by 9.98%, 10.6%, 7.97%, and 4.84% over AlexNet, LeNet, VGG-16, and ResNet-50, respectively. Furthermore, the proposed selection approach decreases classification time while also improving classification efficiency. In the future, a deep CNN model may be developed to classify more GIT disorders. In addition, the image set can be expanded, and the algorithms can be optimized with respect to their computational expense.
Ethical approval
Our research guide reviewed and ethically approved this manuscript for publishing in this Journal.
Author contributions
The authors confirm contribution to the paper as follows: Study conception and design: Ajitha Gladis K. P, Roja Ramani D; Data collection: Linu Babu P and Roja Ramani D; Analysis and interpretation of results: Mohana Suganthi N, Linu Babu P; Draft manuscript preparation: Mohana Suganthi N and Ajitha Gladis K. P. All authors reviewed the results and approved the final version of the manuscript.
Funding
No Financial support
Data availability
Data sharing is not applicable to this article as no new data were created or analysed in this Research.
Human and animal rights
This article does not contain any studies with human or animal subjects performed by any of the authors.
Informed consent
I certify that I have explained the nature and purpose of this study to the above-named individual, and I have discussed the potential benefits of this study participation. The questions the individual had about this study have been answered, and we will always be available to address future questions.
Footnotes
Acknowledgments
The author would like to express his heartfelt gratitude to the supervisor for his guidance and unwavering support during this research.
Conflict of interest
This paper has no conflict of interest for publishing.
