Abstract
This manuscript presents a comprehensive approach to enhance the accuracy of skin lesion image classification based on the HAM10000 and BCN20000 datasets. Building on prior feature fusion models, this research introduces an optimized cluster-based fusion approach to address limitations observed in our previous methods. The study proposes two novel feature fusion strategies, KFS-MPA (using K-means) and DFS-MPA (using DBSCAN), for skin lesion classification. These approaches leverage optimized clustering-based deep feature fusion and the marine predator algorithm (MPA). Ten fused feature sets are evaluated using three classifiers on both datasets, and their performance is compared in terms of dimensionality reduction and accuracy improvement. The results consistently demonstrate that the DFS-MPA approach outperforms KFS-MPA and other compared fusion methods, achieving notable dimensionality reduction and the highest accuracy levels. ROC-AUC curves further support the superiority of DFS-MPA, highlighting its exceptional discriminative capabilities. Five-fold cross-validation tests and a comparison with the previously proposed feature fusion method (FOWFS-AJS) are performed, confirming the effectiveness of DFS-MPA in enhancing classification performance. The statistical validation based on the Friedman test and Bonferroni-Dunn test also supports DFS-MPA as a promising approach for skin lesion classification among the evaluated feature fusion methods. These findings emphasize the significance of optimized cluster-based deep feature fusion in skin lesion classification and establish DFS-MPA as the preferred choice for feature fusion in this study.
Keywords
Introduction
Skin lesion image classification is vital in dermatology and skin cancer diagnosis due to the increasing prevalence of skin cancer. Early and accurate detection of skin lesions is crucial for timely intervention. Visual inspection alone is subjective and prone to errors, leading to misdiagnosis or delayed treatment [1, 2]. Computer vision and deep learning techniques have advanced automated analysis, providing consistent and objective results. This enables early detection of malignant lesions, reducing workload for dermatologists and improving healthcare efficiency. Accurately classifying skin lesions presents challenges, especially in feature extraction, which is crucial for effective classification models. Skin lesions exhibit high variability in shape, size, color, and texture, making it difficult to define universal features representing all lesion types. Image artifacts, lighting variations, and low quality distort important features and hinder extraction. Limited annotated training data restricts capturing the full range of lesion characteristics, potentially leading to suboptimal feature representations. Furthermore, the complexity and subtle differences between benign and malignant lesions require informative features for effective discrimination. Overcoming these challenges necessitates advanced techniques capable of handling variability, noise, and data limitations to extract robust and discriminative features for accurate skin lesion classification [3, 4].
Deep learning and feature fusion techniques have revolutionized skin lesion classification, providing powerful tools for accurate and automated analysis. Deep learning techniques, such as convolutional neural networks (CNNs), have demonstrated exceptional performance in image classification tasks, including skin lesion analysis. CNNs excel at automatically learning hierarchical representations of image features by leveraging multiple layers of convolutional and pooling operations [5, 6, 7, 8, 9]. These networks can effectively capture both low-level features (e.g., edges, textures) and high-level semantic information (e.g., lesion boundaries, patterns) from skin lesion images. Pre-trained deep learning models, such as VGGNet, ResNet, and Inception, have proven to be beneficial, as they enable transfer learning and leverage large-scale datasets to boost classification accuracy [10, 11, 12, 13, 14, 15]. Feature fusion techniques complement these models by exploiting the complementary nature of different feature sets, models, or modalities to create a more informative representation. By integrating multiple sources of information, feature fusion captures diverse aspects of skin lesions and improves classification accuracy, robustness, and generalization capability. Together, the representation learning capabilities of deep neural networks and the integration of multiple sources of information yield more accurate, reliable, and interpretable classification results in the field of dermatology [16, 17, 18, 19, 20, 21, 22, 23].
The primary objective of this research is to enhance the accuracy of skin lesion image classification through the development of a comprehensive clustering-based design approach. Building upon our prior work on feature fusion models utilizing the concept of feature-based optimized weighted feature set (FOWFS) [24], this research aims to address certain limitations observed in the previous approach. Specifically, the fixed number of optimized weights and the weight threshold of 0.5 may restrict the adaptability of the model and lead to the exclusion of informative features or the inclusion of less discriminative ones, potentially overlapping with redundant features. To overcome these limitations, the current research introduces an optimized cluster-based feature fusion approach, allowing for flexibility in weight selection and justifying weight thresholds. By incorporating these improvements, the research seeks to enhance the reliability and practical utility of the model in skin lesion classification tasks.
The study’s key contributions revolve around the clustering-based design approach and optimized deep feature fusion. Motivated by the goal of improving understanding and organization of skin lesion patterns, the clustering-based design approach enhances representation and discrimination of different lesion types, ultimately leading to enhanced classification accuracy. By incorporating clustering algorithms, it captures the inherent variability and complexity of skin lesions. Additionally, the integration of deep feature fusion harnesses the power of multiple pre-trained CNN models, extracting and fusing diverse and complementary information from different layers. This significantly boosts the discriminative capability of the classification system, effectively capturing both low-level and high-level features for a comprehensive representation of skin lesions. The study’s contributions can be summarized as follows: (a) introducing a comprehensive clustering-based design approach that enhances organization and understanding of skin lesion patterns, improving representation and discrimination; (b) integrating deep feature fusion techniques, leveraging multiple pre-trained models to extract and fuse diverse information; (c) introducing an optimized mechanism for feature fusion, enhancing adaptability and robustness across datasets and varying feature importance; (d) achieving enhanced classification accuracy by addressing the challenges posed by complex and variable skin lesions, aiding in early detection and improved patient outcomes.
The manuscript is structured as follows: Section 2 presents a comprehensive literature survey, discussing relevant prior research in the field. In Section 3, the methodology adopted in this study is outlined, accompanied by a detailed architectural representation of the proposed work. The experimental setup is described in Section 4, while Section 5 presents a thorough analysis of the obtained results. Section 6 provides an interpretation and in-depth analysis of the findings. The manuscript concludes with Section 7, which summarizes the key outcomes of the research and discusses potential future directions for further investigation.
Related work
This section aims to provide a comprehensive understanding of the existing knowledge and advancements in skin lesion image classification. It explores previous studies, methodologies, and techniques employed in this field, including approaches based on deep learning, transfer learning, and feature fusion algorithms. The literature review critically evaluates the strengths, limitations, and gaps in the current literature, highlighting the need for the proposed comprehensive clustering-based design approach based on deep feature fusion. By examining and synthesizing the existing body of knowledge, the literature review serves as a foundation for the research, guiding the development and refinement of the proposed methodology.
The studies reviewed in [1, 2, 3, 4, 5] present significant contributions to the field of skin lesion analysis and melanoma classification, offering valuable insights and solutions to the challenges in this domain. Eduardo Valle et al. [1] demonstrated the effectiveness of deep learning models for melanoma classification, achieving top performance in the ISIC 2017 Challenge. Their research emphasizes the importance of ensembles and adherence to methodological constraints, leading to state-of-the-art AUCs. This study adds to the understanding of hyperparameter optimization, transfer learning, and ensembles. Łukasz Piatek et al. [2] address the critical medical challenge of early melanoma diagnosis with a specialized diagnostic system, achieving high accuracy using various learning models. Their focus on automatic processing and analysis of skin lesion images is commendable. The survey conducted by Md. Kamrul Hasan et al. [4] comprehensively reviews 594 publications on computer-aided diagnosis (CAD) for skin lesion analysis, providing a systematic overview of input data utilization, preprocessing, method configuration, and evaluation criteria. It highlights challenges in evaluating skin lesion segmentation and classification systems due to limited datasets and offers potential solutions. The reviewed studies contribute valuable findings and recommendations for future research in developing automated and robust CAD systems for skin lesion analysis.
Machine learning and deep learning play a crucial role in skin cancer classification and detection, revolutionizing the field of dermatology. The growing importance of machine learning algorithms in this domain is driven by advancements in digital data processing, faster computing, and cost-effective data storage. These technologies enable the development of powerful skin cancer detection and classification systems, providing dermatologists with valuable tools for accurate diagnosis and customized care. The literature comparisons made by Tehseen Mazhar et al. [5], Priya Choudhary et al. [6], and Huiyan Jiang et al. [7] reveal that while some approaches achieve high accuracy rates, others face challenges in multiple-lesion recognition due to little variation between lesions or limited datasets. The introduction of novel approaches like the dual optimization-based deep learning network (DODL net) by E. Gomathi et al. [8] demonstrates the potential of combining deep learning and optimization techniques to achieve outstanding results in skin cancer classification, with an accuracy of 98.76%. Imran Iqbal et al. [9] conducted a comprehensive study using deep learning algorithms for multi-class skin lesion classification. Their proposed deep CNN (DCNN) model demonstrated high precision, sensitivity, and specificity, outperforming state-of-the-art algorithms with an AUROC of 0.964 on ISIC-17. The reviewed studies [5, 6, 7, 8, 9] provide essential insights for researchers, guiding them in overcoming the complexities of multiple-lesion recognition and improving disease diagnosis in medical image analysis through deep learning-based methods. The exploration of various applications, including disease classification, dermopathology visual classification, and skin disease measurement, further highlights the versatility and effectiveness of deep learning in dermatology. As the field continues to evolve, future research can focus on enhancing unsupervised and semi-supervised learning, developing lightweight segmentation models, and utilizing large-scale image datasets to further advance skin cancer classification and detection.
The literature reviewed showcases the importance of moving from deep learning to transfer learning in image classification and highlights the remarkable benefits this transition offers. Several studies [10, 11, 12, 13, 14, 15] demonstrate that transfer learning effectively addresses the challenges of training deep models from scratch, such as the need for vast labeled datasets and substantial computational resources. Transfer learning utilizes models pre-trained on large-scale datasets, adapting their learned features to new tasks with limited data, resulting in accelerated training, enhanced generalization, and improved performance for image classifiers. The research of Amirreza Mahbod et al. [10] explores the impact of image size on pre-trained CNNs for skin lesion diagnosis and reveals that image cropping is a superior strategy to image resizing, improving classification performance. The proposed multi-scale multi-CNN (MSM-CNN) fusion approach further enhances classification accuracy by combining results from different scales and CNNs. In the study of Md Shahin Ali et al. [11], a deep CNN (DCNN) model for skin cancer classification exhibits high accuracy and efficiency, outperforming other deep learning models, making it a valuable tool for early-stage skin cancer diagnosis. The survey by Fahad Shamshad et al. [12] delves into the transformative impact of Transformers in medical imaging, offering a comprehensive overview of their applications in image segmentation, detection, classification, and more, with potential solutions to key challenges. Muhammad Naved Qureshi et al. [13] address the challenge of limited training data by proposing a CNN-based transfer learning model, achieving impressive accuracy in skin lesion classification. Muhammad Asad Arshed et al. [14] explore the efficacy of CNNs and vision transformers (ViT) in skin cancer diagnosis, demonstrating the superiority of ViT in multi-class classification with effective data augmentation techniques. Dimililer and Sekeroglu [15] emphasize the increasing prevalence of CAD systems and transfer learning in skin lesion analysis, presenting a transfer learning model with significant improvements in classification rates. Collectively, these studies validate the importance of transfer learning for image classification tasks, particularly in medical imaging applications, and offer promising directions for future research in this evolving field.
The studies in [16, 17, 18, 19, 20, 21, 22, 23, 24] offer valuable insights into the significance of feature fusion in image classification, particularly in the context of skin lesion analysis for melanoma detection. Each study emphasizes the importance of combining information from various sources, including handcrafted features, pre-trained CNN models, and deep learning features, to improve the accuracy and efficiency of skin cancer classification systems. The research by Amirreza Mahbod et al. [16] proposes an ensemble scheme for CNNs, combining intra- and inter-architecture network fusion, achieving remarkable results on the ISIC 2017 dataset. Lina Liu et al. [17] introduce a novel mid-level feature learning approach, exploiting relationships among image samples to improve classification performance, outperforming other CNN-based methods. Mario Manzo and Simone Pellino [18] employ transfer learning and ensemble classification to enhance melanoma prediction accuracy. Almaraz-Damian et al. [19] present a CAD system that fuses handcrafted and deep learning features, achieving improved performance compared to other methods on the ISIC 2018 dataset. Samia Benyahia et al. [20] explore feature extraction methods on the ISIC 2019 and PH2 datasets, highlighting the benefits of combining DenseNet201 with Fine K nearest neighbour (KNN) or Cubic support vector machine (SVM). Muhammad Ajmal et al. [21] demonstrate the efficiency of a fusion approach using a fuzzy entropy slime mould algorithm and deep learning models, achieving impressive accuracy rates on the HAM10000 and ISIC 2018 datasets. Gang Wang et al. [22] proposed a multiscale feature fusion model, combining DenseNet-121 and an improved VGG-16, achieving high accuracy on the ISIC2018 dataset. Sarmad Maqsood et al. [23] introduced a unified CAD model, combining various deep learning techniques, achieving impressive results on multiple datasets.
In our previous work on skin lesion classification [24], we proposed three feature fusion strategies with adaptive weights and an artificial jellyfish algorithm (AJS) [25, 26, 27], achieving high accuracy on HAM10000 and BCN20000 datasets. These strategies utilized three pre-trained CNN models: VGG16, EfficientNet B0, and ResNet50, named as adaptive weighted feature set (AWFS), model-based optimized weighted feature set (MOWFS), and FOWFS. The performances of these strategies were evaluated using different classifiers (decision tree (DT), naïve bayesian (NB), multi-layer perceptron (MLP), and SVM) based on accuracy, precision, sensitivity, and F1-score. The results showed that FOWFS-AJS outperformed the other strategies, achieving the highest accuracy of 94.05% and 94.90% for the HAM10000 and BCN20000 datasets, respectively, using SVM classification. However, the fixed number of optimized weights and the weight threshold of 0.5 in the FOWFS approach might limit its adaptability and lead to the exclusion of informative features or inclusion of less discriminative ones. This current research aims to address these limitations by introducing an optimized cluster-based feature fusion approach, allowing for flexibility in weight selection and justifying weight thresholds. By incorporating these improvements, the research seeks to enhance the reliability and practical utility of the model in skin lesion classification tasks, ultimately contributing to improved accuracy in diagnosing skin cancer at an early stage.
Methodology
This section provides an overview of the previously proposed feature fusion strategies, highlighting their advantages and areas for improvement, which are addressed in this research work. The discussion includes a brief overview of pre-trained networks, feature extraction and fusion techniques, clustering techniques, and the rationale behind their use. Additionally, the marine predator algorithm (MPA) [28, 29, 30, 31, 32, 33, 34] meta-heuristic optimization algorithm and the architecture of the proposed fusion approaches are outlined, offering insights into the methodology employed in this study.
Description of the proposed cluster-based optimized deep feature fusion approach
The previous study [24] explored the effectiveness of inductive transfer learning at the feature level, utilizing three pre-trained CNNs. The results indicated that this approach achieved superior performance compared to traditional feature selection models. By employing feature fusion models, such as CFS, the study successfully merged the outputs of the pre-trained networks, resulting in enhanced classification performance. Furthermore, the investigation conducted a comparative analysis between basic fusion strategies and a weighted approach for feature selection. The experimental findings demonstrated the superiority of the weighted approach, particularly the AWFS method, in terms of performance. The research also emphasized the significance of decision-making within feature fusion methodologies. By leveraging the AJS optimizer, the study identified the optimal point for feature fusion, taking into account both active and passive motions of the algorithm. This strategic approach facilitated the identification of the best cost, ultimately enhancing the overall performance of the system. In addition to the aforementioned contributions, the research introduced two decision-based feature fusion models: the MOWFS and FOWFS approaches. In the model-based approach, the cost function was based on one of the classifiers (DT, NB, MLP, and SVM), and optimized weights were derived from all three pre-trained models. On the other hand, the feature-based strategy optimized weights individually for each feature, resulting in the creation of a combined feature set [24]. These innovative strategies significantly improved the classification performance of the system. 
In summary, the previous research investigated the advantages of inductive transfer learning, designed robust classifier models, employed feature fusion techniques, compared basic fusion strategies with a weighted approach, highlighted the importance of decision-making in feature fusion, and introduced novel decision-based feature fusion models. However, we observed that in FOWFS, the fixed number of optimized weights and the weight threshold of 0.5 may restrict the adaptability of the model and potentially lead to the exclusion of informative features or the inclusion of less discriminative ones. This could also result in overlaps with redundant features. To overcome these limitations, the present research introduces optimized cluster-based feature fusion strategies built on the K-means [35, 36] and density-based DBSCAN [37, 38] clustering algorithms, named the K-means based feature set (KFS) and the DBSCAN based feature set (DFS), respectively.
Explanation of the pre-trained networks employed
CNNs offer pre-trained models that have undergone extensive training on large-scale image classification datasets. These models can be used as-is or customized to fulfill specific requirements. This technique, known as transfer learning, allows the application of knowledge acquired from one task to a similar but distinct task. The field of image processing benefits immensely from a wide range of pre-trained CNN models, including LeNet, AlexNet, ResNet, GoogLeNet (or InceptionNet), VGG, DenseNet, EfficientNet, PolyNet, and others. CNNs are built upon neural networks and incorporate fundamental components like convolution layers, pooling layers, and activation layers. In alignment with previous work, this study focuses on feature selection and devising a feature fusion strategy, utilizing three pre-trained CNNs: VGG16, EfficientNet B0, and ResNet50 [39, 40, 41, 42, 43, 44].
Pre-trained CNNs such as VGG16, EfficientNet B0, and ResNet50 have established themselves as powerful tools in the realm of computer vision and image processing, finding application in the experimentation conducted within this study. These models have undergone training on vast datasets comprising millions of labeled images, enabling them to learn intricate patterns and extract meaningful features through an iterative optimization process. VGG16, developed by the visual geometry group, boasts simplicity and uniformity, with multiple convolutional layers followed by max-pooling layers. By cascading these layers, VGG16 captures hierarchical features of increasing complexity, culminating in accurate image classification. The fully connected layers act as classifiers, generating predictions based on the learned features. EfficientNet B0, in contrast, employs a compound scaling technique, striking a balance between model size and performance. It combines depth-wise separable convolutions, which reduce computational complexity, with efficient scaling methods to achieve cutting-edge accuracy. EfficientNet B0’s architecture enables efficient processing and discriminative feature learning from images. ResNet50 introduced skip connections, or identity shortcuts, which revolutionized deep learning by addressing the issue of vanishing gradients. These skip connections facilitate more effective gradient flow during training, enabling the training of significantly deeper networks. With convolutional layers, batch normalization, and fully connected layers, ResNet50 captures intricate details and robustly recognizes objects and patterns in images. 
Utilizing pre-trained networks offers various advantages, such as feature extraction by extracting learned representations from intermediate layers, fine-tuning to adapt models to new datasets or domains while preserving pre-trained weights, and transfer learning to leverage knowledge from pre-training for related tasks, resulting in faster convergence and improved performance. By leveraging the knowledge captured by these pre-trained CNNs, we can effectively utilize their learned representations to achieve high-performance outcomes in our specific tasks.
Clustering algorithms used and their rationale
The K-means clustering algorithm is a popular unsupervised machine learning technique used for partitioning data into distinct clusters. It aims to minimize the within-cluster sum of squares, or distortion, by iteratively assigning data points to the nearest cluster centroid and updating the centroids based on the mean of the assigned points. The algorithm begins with the initialization of
Density-based clustering is an effective approach used to identify clusters or groups within a dataset based on the density of data points. Unlike traditional clustering methods that rely on predetermined cluster shapes, density-based clustering focuses on the density of data points in their local neighbourhoods. The most widely recognized algorithm for density-based clustering is density-based spatial clustering of applications with noise (DBSCAN). DBSCAN defines clusters as regions of high density separated by regions of lower density. It classifies data points into three categories: core points, border points, and noise points. Core points have a sufficient number of neighbouring points within a specified radius, while border points have fewer neighbours but are within the radius of a core point. Noise points are isolated points that do not belong to any cluster. One of the key advantages of density-based clustering is its ability to discover clusters of various shapes and effectively handle noise. It is robust against outliers and does not require prior knowledge of the number of clusters. Unlike K-means, it does not assume roughly spherical clusters, although DBSCAN with a single global radius can struggle when cluster densities differ widely. The crucial parameters in density-based clustering algorithms are the radius or epsilon (eps), which defines the distance within which neighbouring points are considered, and the minimum number of points (MinPts) required to form a core point. These parameters significantly influence the granularity and quality of the resulting clusters. The flexibility and robustness of density-based clustering make it a valuable tool for exploratory data analysis and gaining insights into the underlying structure of complex datasets [37, 38].
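The three point categories can be illustrated with a minimal sketch. It assumes Euclidean distance and counts a point as its own neighbour; the function name classify_points and the toy coordinates are hypothetical and are not taken from the study.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' using DBSCAN's definitions."""
    n = len(points)
    # eps-neighbourhood of every point (the point itself is included).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core point has at least min_pts points inside its eps-neighbourhood.
    core = {i for i in range(n) if len(neighbours[i]) >= min_pts}
    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in neighbours[i]):
            # Not dense enough itself, but reachable from a core point.
            labels.append("border")
        else:
            labels.append("noise")
    return labels

# Toy data: a dense 2x2 block, one point on its edge, one isolated point.
points = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.0), (0.5, 0.5), (1.0, 0.5), (5.0, 5.0)]
labels = classify_points(points, eps=0.6, min_pts=3)
```

Raising min_pts or lowering eps in this toy example turns core points into border or noise points, which is the parameter sensitivity noted above.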
The primary motivation behind introducing a clustering-based approach lies in mitigating the limitations associated with weight-based models, which often assign high weights to specific features, resulting in redundancy across various feature extraction methodologies. Through the adoption of a clustering-based paradigm, the objective is to tackle this issue by aggregating similar features into clusters within the extracted feature set. This approach ensures comprehensive feature coverage while concurrently reducing the overall feature dimensionality. Here, features are grouped into clusters, and the features within each cluster are averaged to represent that particular group. This streamlined approach prevents the disregard of potentially valuable information. The conceptualization of feature fusion models utilizing this clustering-based methodology draws inspiration from the strengths of both K-means and DBSCAN algorithms. K-means is favoured for its simplicity, efficiency, and interpretability, serving as a fundamental tool for clustering-based feature fusion strategies. It facilitates the amalgamation and extraction of features from multiple sources based on centroid proximity. However, K-means encounters challenges when confronted with datasets containing numerous outliers, thus limiting its efficacy in certain scenarios. To counteract this limitation, the integration of DBSCAN is instrumental. Renowned for its ability to identify clusters of arbitrary shapes and robustly handle outliers, DBSCAN enriches the feature fusion models by incorporating the concept of density-based clustering. This integration empowers the feature fusion process to better discern and encapsulate intricate patterns and variations within the data, thereby enhancing the overall efficacy of the fusion methodology.
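The grouping-and-averaging idea can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: it assumes each feature is described by its column of values across samples, that columns are grouped with a plain K-means, and that every cluster is replaced by the mean of its member features. The names kmeans and fuse_by_feature_clusters, the seed, and the toy data are hypothetical.

```python
import math
import random

def kmeans(vectors, k, iters=50, seed=0):
    """Plain K-means; returns the cluster index assigned to each vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        assign = [min(range(k), key=lambda c: math.dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def fuse_by_feature_clusters(samples, k):
    """Cluster feature columns, then average the features inside each cluster."""
    columns = [list(col) for col in zip(*samples)]  # one vector per feature
    assign = kmeans(columns, k)
    groups = [[j for j, a in enumerate(assign) if a == c] for c in range(k)]
    groups = [g for g in groups if g]  # drop clusters that ended up empty
    fused = [[sum(row[j] for j in g) / len(g) for g in groups] for row in samples]
    return fused, assign

# Toy data: features 0-2 are near-duplicates of each other, as are features 3-5.
samples = [
    [1.0, 1.1, 0.9, 9.0, 9.1, 8.9],
    [2.0, 2.1, 1.9, 8.0, 8.1, 7.9],
    [3.0, 2.9, 3.1, 7.0, 7.1, 6.9],
    [4.0, 4.0, 4.1, 6.0, 5.9, 6.1],
]
fused, assign = fuse_by_feature_clusters(samples, k=2)
```

Here six redundant features collapse into two averaged ones, so no feature is discarded outright while the dimensionality drops from six to two.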
Explanation of the optimization strategy employed: Marine predators algorithm (MPA)
The MPA is a nature-inspired meta-heuristic optimization algorithm that draws inspiration from the hunting behavior of marine predators in the ocean ecosystem. Developed based on the concept of predator-prey interactions, MPA mimics the hunting strategies employed by marine predators to search for and capture their prey efficiently [28, 29]. The algorithm starts with an initial population of potential solutions, representing the predators in the search space. These predators interact with their prey, which corresponds to the problem’s objective function. The movement of predators is guided by various parameters, including their position, speed, and perception of prey. During the search process, predators employ different tactics, such as cruising, searching, and attacking, to locate and capture the prey. These tactics involve a balance between exploration and exploitation, allowing the algorithm to efficiently explore the solution space while converging towards promising regions. The movement of predators is influenced by several factors, including the position and fitness of the prey, the predator’s position, and the social behavior of neighboring predators. Through these interactions and information exchanges, the algorithm adapts and refines its search strategy, gradually improving the quality of solutions.
One of the key advantages of this MPA is its ability to handle complex and multimodal optimization problems. By mimicking the natural hunting behavior of marine predators, MPA exhibits an inherent ability to balance exploration and exploitation, promoting effective search and convergence. The MPA algorithm has been applied to various optimization problems, including function optimization, feature selection, image segmentation, and parameter tuning for machine learning algorithms. Its performance has shown competitiveness against other popular meta-heuristic algorithms in terms of solution quality and convergence speed. However, like other meta-heuristic algorithms, the performance of MPA is influenced by several parameters, such as population size, movement parameters, and the search space representation. Proper parameter tuning and adaptation are essential for achieving optimal results with MPA. This algorithm provides a promising approach for solving complex optimization problems, leveraging the natural hunting behaviors of marine predators to guide the search process and discover high-quality solutions in various domains [30, 31].
The MPA leverages the advantageous characteristics of the Lévy strategy in combination with the features of Brownian motion, which have been proven to enhance the efficiency of exploration and exploitation. In standard Brownian motion, the step length follows a probability function defined by a Normal (Gaussian) distribution with zero mean ($\mu = 0$) and unit variance ($\sigma^2 = 1$).
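The two step generators can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Lévy steps are drawn with Mantegna's method, and the exponent `alpha = 1.5` and the scaling factor `scale = 0.05` are assumed values that do not come from the text.

```python
import math
import random


def brownian_step(dim, rng):
    """Standard Brownian step: independent draws from N(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def levy_step(dim, rng, alpha=1.5, scale=0.05):
    """Levy-distributed step via Mantegna's method.

    alpha (Levy exponent) and scale are assumed values for illustration.
    """
    sigma_x = (
        math.gamma(1 + alpha) * math.sin(math.pi * alpha / 2)
        / (math.gamma((1 + alpha) / 2) * alpha * 2 ** ((alpha - 1) / 2))
    ) ** (1 / alpha)
    steps = []
    for _ in range(dim):
        x = rng.gauss(0.0, sigma_x)
        y = rng.gauss(0.0, 1.0) or 1e-12  # guard against division by zero
        steps.append(scale * x / abs(y) ** (1 / alpha))
    return steps


rng = random.Random(0)
rb = brownian_step(3, rng)  # small, Gaussian-distributed moves
rl = levy_step(3, rng)      # mostly small moves with occasional long jumps
```

Brownian steps cover a neighbourhood evenly, while the heavy tail of the Lévy steps produces the occasional long jump that helps the search leave a local region.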
MPA population formulation:
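Like many other meta-heuristics, the MPA follows a population-based approach, where the initial solutions are uniformly distributed across the search space as the initial trial:
$$X_0 = X_{min} + rand \, (X_{max} - X_{min})$$
where $X_{min}$ and $X_{max}$ are the lower and upper bounds of the variables and $rand$ is a uniform random vector in $[0, 1]$. In the MPA, the fittest solution is designated as the top predator and used to construct a matrix known as the Elite matrix.
The top predator vector, represented by $\vec{X}^{I}$, is replicated $n$ times to form the $n \times d$ Elite matrix, where $n$ is the number of search agents and $d$ is the number of dimensions. The Prey matrix has the same dimensions as the Elite matrix, and the predators update their positions based on it.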
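The population set-up can be sketched in a few lines. The function names `init_prey` and `build_elite` are hypothetical, and the sketch assumes a minimization problem (lower cost is fitter); the cost function used in this study is not recoverable from the text.

```python
import random


def init_prey(n, dim, lower, upper, rng):
    """Uniformly distribute n candidate solutions inside the bounds."""
    return [
        [lower[j] + rng.random() * (upper[j] - lower[j]) for j in range(dim)]
        for _ in range(n)
    ]


def build_elite(prey, cost):
    """Replicate the fittest row (lowest cost, assumed) n times."""
    costs = [cost(p) for p in prey]
    top_predator = prey[costs.index(min(costs))]
    return [list(top_predator) for _ in prey]


def sphere(x):
    # Toy cost function used only for this illustration.
    return sum(v * v for v in x)


rng = random.Random(0)
prey = init_prey(n=5, dim=3, lower=[-1.0] * 3, upper=[1.0] * 3, rng=rng)
elite = build_elite(prey, sphere)
```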
MPA optimization scenarios (exploring the algorithm’s performance in different settings)
The MPA consists of three primary optimization phases, each corresponding to a different velocity ratio, which together simulate the complete life cycle of both prey and predator.
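Exploration phase: During this phase, the prey moves rapidly and explores the search space using Brownian motion (BM) [28, 29] to search for its food. In contrast, the predator remains stationary, carefully observing the prey’s movements. This exploratory phase takes place during the initial third of the iterations as stated in Eq. ((a)). Let $Iter$ be the current iteration and $Max\_Iter$ the maximum number of iterations.
While $Iter < \frac{1}{3} Max\_Iter$:
$$\vec{stepsize}_i = \vec{R}_B \otimes (\vec{Elite}_i - \vec{R}_B \otimes \vec{Prey}_i), \qquad \vec{Prey}_i = \vec{Prey}_i + P \cdot \vec{R} \otimes \vec{stepsize}_i, \qquad i = 1, \dots, n$$
where $\vec{R}_B$ is a vector of normally distributed random numbers representing Brownian motion, $\otimes$ denotes entry-wise multiplication, $P$ is a constant, and $\vec{R}$ is a vector of uniform random numbers in $[0, 1]$.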
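Transitional phase (bridging the exploration and exploitation phases): In this intermediate phase, both the prey and predator take nearly equal steps, representing a transition from the exploration phase to the exploitation phase. The prey adopts an exploitative strategy using Lévy flight (LF) [28, 29], while the predator continues its exploratory behaviour through BM. The entire population divides into two equal groups, with the first group focused on exploitation and the second group dedicated to exploration using Eqs ((a)) and ((b)).
While $\frac{1}{3} Max\_Iter < Iter < \frac{2}{3} Max\_Iter$, the first half of the population ($i = 1, \dots, n/2$) is updated as
$$\vec{stepsize}_i = \vec{R}_L \otimes (\vec{Elite}_i - \vec{R}_L \otimes \vec{Prey}_i), \qquad \vec{Prey}_i = \vec{Prey}_i + P \cdot \vec{R} \otimes \vec{stepsize}_i$$
and the second half ($i = n/2, \dots, n$) as
$$\vec{stepsize}_i = \vec{R}_B \otimes (\vec{R}_B \otimes \vec{Elite}_i - \vec{Prey}_i), \qquad \vec{Prey}_i = \vec{Elite}_i + P \cdot CF \otimes \vec{stepsize}_i$$
The vector $\vec{R}_L$ contains random numbers drawn from the Lévy distribution and represents Lévy movement, $\vec{R}_B$ is the corresponding Brownian vector, and $CF = (1 - Iter / Max\_Iter)^{2 \, Iter / Max\_Iter}$ is an adaptive parameter that controls the step size of the predator movement.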
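Exploitation phase: During this phase, the predator moves at a higher speed compared to the prey. Employing Lévy flight, the predator executes an exploitative behavior to capture the prey. This third phase occurs during the final third of the iterations and can be mathematically expressed as mentioned in Eq. ((c)).
While $Iter > \frac{2}{3} Max\_Iter$:
$$\vec{stepsize}_i = \vec{R}_L \otimes (\vec{R}_L \otimes \vec{Elite}_i - \vec{Prey}_i), \qquad \vec{Prey}_i = \vec{Elite}_i + P \cdot CF \otimes \vec{stepsize}_i, \qquad i = 1, \dots, n$$
where $\vec{R}_L$ is a vector of Lévy-distributed random numbers and $CF$ is the adaptive step-size control parameter.
The predator motion following the Lévy strategy can be represented as the product $\vec{R}_L \otimes \vec{Elite}_i$, while adding the step size to the Elite position simulates the movement of the predator and updates the prey position.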
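The three phases can be combined into a single update routine. The sketch below follows the commonly published MPA update rules and is only an illustration under stated assumptions: `p = 0.5`, the Lévy exponent `alpha = 1.5`, the scale `0.05`, and all function names are hypothetical choices that are not confirmed by the text.

```python
import math
import random


def brownian(dim, rng):
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def levy(dim, rng, alpha=1.5, scale=0.05):
    # Mantegna's method; alpha and scale are assumed values.
    sigma = (
        math.gamma(1 + alpha) * math.sin(math.pi * alpha / 2)
        / (math.gamma((1 + alpha) / 2) * alpha * 2 ** ((alpha - 1) / 2))
    ) ** (1 / alpha)
    out = []
    for _ in range(dim):
        y = rng.gauss(0.0, 1.0) or 1e-12
        out.append(scale * rng.gauss(0.0, sigma) / abs(y) ** (1 / alpha))
    return out


def mpa_step(prey, elite, it, max_it, rng, p=0.5):
    """One MPA position update; the phase is chosen from it / max_it."""
    n, dim = len(prey), len(prey[0])
    cf = (1 - it / max_it) ** (2 * it / max_it)  # adaptive step-size control
    updated = []
    for i in range(n):
        rb, rl = brownian(dim, rng), levy(dim, rng)
        r = [rng.random() for _ in range(dim)]
        x, e = prey[i], elite[i]
        if it < max_it / 3:
            # Exploration: Brownian steps applied to the prey.
            row = [x[j] + p * r[j] * rb[j] * (e[j] - rb[j] * x[j]) for j in range(dim)]
        elif it < 2 * max_it / 3:
            if i < n // 2:
                # Transition, first half: Levy steps applied to the prey.
                row = [x[j] + p * r[j] * rl[j] * (e[j] - rl[j] * x[j]) for j in range(dim)]
            else:
                # Transition, second half: Brownian steps around the elite.
                row = [e[j] + p * cf * rb[j] * (rb[j] * e[j] - x[j]) for j in range(dim)]
        else:
            # Exploitation: Levy steps around the elite.
            row = [e[j] + p * cf * rl[j] * (rl[j] * e[j] - x[j]) for j in range(dim)]
        updated.append(row)
    return updated
```

Because `cf` shrinks to zero as `it` approaches `max_it`, the late updates collapse onto the elite position, which is what produces convergence in the exploitation phase.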
Eddy Formation Enhanced by the Influence of Fish Aggregating Devices (FAD)
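In the marine environment, the formation of eddies and the presence of FADs significantly influence the behaviour of marine predators. According to Filmalter, Dagorn, Cowley, and Taquet (2011), sharks tend to spend most of their time in close proximity to FADs. However, during the remaining time, they venture into longer skips in various directions to explore areas with diverse prey distribution. This combination of FADs and long skips helps the MPA avoid stagnation in local optima, thereby enhancing its overall performance. The scenario involving FADs can be modelled using Eq. (8):
$$\vec{Prey}_i = \begin{cases} \vec{Prey}_i + CF \, [\vec{X}_{min} + \vec{R} \otimes (\vec{X}_{max} - \vec{X}_{min})] \otimes \vec{U}, & r \le FADs \\ \vec{Prey}_i + [FADs \, (1 - r) + r] \, (\vec{Prey}_{r1} - \vec{Prey}_{r2}), & r > FADs \end{cases}$$
Here, $FADs$ denotes the probability of the FADs effect on the optimization process, $\vec{U}$ is a binary vector of zeros and ones, $r$ is a uniform random number in $[0, 1]$, $\vec{R}$ is a vector of uniform random numbers in $[0, 1]$, $CF$ is the adaptive step-size control parameter, $\vec{X}_{min}$ and $\vec{X}_{max}$ are the vectors of lower and upper bounds, and $r1$ and $r2$ are random indices of the Prey matrix.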
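The FADs perturbation can be sketched as below. Several details are assumptions made for illustration only: the default `fads = 0.2`, the per-row random draw, and the way the binary mask is built (an entry is 1 when a uniform draw falls below `fads`); published descriptions and reference code differ on the mask convention.

```python
import random


def fads_effect(prey, lower, upper, it, max_it, rng, fads=0.2):
    """Apply the FADs / eddy perturbation to every row of the prey matrix."""
    n, dim = len(prey), len(prey[0])
    cf = (1 - it / max_it) ** (2 * it / max_it)
    out = []
    for i in range(n):
        if rng.random() <= fads:
            # Long jump inside the bounds, applied only where the mask is 1.
            row = []
            for j in range(dim):
                u = 1.0 if rng.random() < fads else 0.0  # assumed mask rule
                jump = lower[j] + rng.random() * (upper[j] - lower[j])
                row.append(prey[i][j] + cf * jump * u)
        else:
            # Move along the difference of two randomly chosen prey.
            r = rng.random()
            r1, r2 = rng.randrange(n), rng.randrange(n)
            row = [
                prey[i][j] + (fads * (1 - r) + r) * (prey[r1][j] - prey[r2][j])
                for j in range(dim)
            ]
        out.append(row)
    return out
```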
Memory of the marine predators
Marine predators possess a remarkable memory that enables them to recall effective foraging locations. Similarly, in the MPA algorithm, a memory mechanism is employed to store the optimal solution obtained from earlier iterations. The current solution is then compared to the stored solution, and if the present solution performs better, it replaces the earlier one in the memory.
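The memory mechanism amounts to a per-solution comparison between the stored and the current candidates. A minimal sketch, assuming a minimization problem and using the hypothetical name `memory_saving`:

```python
def memory_saving(old_prey, old_cost, new_prey, new_cost):
    """Keep, for each agent, whichever of the stored and current solutions is better."""
    prey, cost = [], []
    for stored, stored_c, current, current_c in zip(old_prey, old_cost, new_prey, new_cost):
        if current_c < stored_c:  # lower cost is better (assumed)
            prey.append(current)
            cost.append(current_c)
        else:
            prey.append(stored)
            cost.append(stored_c)
    return prey, cost


kept, kept_cost = memory_saving(
    [[1.0], [2.0]], [0.5, 0.2],
    [[1.5], [2.5]], [0.3, 0.4],
)
# kept == [[1.5], [2.0]] and kept_cost == [0.3, 0.2]
```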
In the initial phase of experimentation, following the approach of the previous work [24], the original feature sets output by the pre-trained models, namely VGG16 (512 features), EfficientNet B0 (1024 features), and ResNet50 (1024 features), are passed to the classification algorithms. The choice of fusion techniques is motivated by their ability to leverage the advantages of ensemble approaches, where combining the results of multiple base models has been shown to enhance the performance of the final predictions. Additionally, these fusion techniques provide improved robustness by considering the dispersion or spread of predictions and model performance. In line with these considerations, this study focuses primarily on the design of feature fusion strategies that exploit the capabilities of pre-trained learning architectures.
In the previous work, four ensemble feature fusion strategies were proposed: CFS, AWFS, MOWFS, and FOWFS. CFS involves a simple ensemble technique where the outputs of the three pre-trained models are concatenated to form a batch of feature sets. AWFS utilizes adaptive weight selection and concatenation of features from each pre-trained model. MOWFS applies the AJS optimization algorithm to determine optimal weights through model-based cost evaluation. FOWFS focuses on feature-based optimization, resulting in optimized weights and a combined feature set. Features with weights above 0.5 are considered best performing. Additionally, this work explores two more fusion strategies: KFS and DFS, based on two widely used clustering approaches.
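The simplest two ideas above, concatenation (as in CFS) and keeping only features whose weight exceeds 0.5, can be sketched as follows. The function names and the toy values are hypothetical, and the weights here are supplied by hand rather than produced by an optimizer.

```python
def concatenate_features(*feature_sets):
    """CFS-style fusion: join the per-sample feature vectors end to end."""
    n_samples = len(feature_sets[0])
    return [
        [value for fs in feature_sets for value in fs[i]]
        for i in range(n_samples)
    ]


def select_by_weight(features, weights, threshold=0.5):
    """Keep only the columns whose weight is strictly above the threshold."""
    keep = [j for j, w in enumerate(weights) if w > threshold]
    return [[row[j] for j in keep] for row in features], keep


# Toy example: three tiny "models" with 2, 1 and 2 features per sample.
fused = concatenate_features([[1, 2]], [[3]], [[4, 5]])
selected, kept_columns = select_by_weight(fused, [0.9, 0.1, 0.6, 0.5, 0.7])
# fused == [[1, 2, 3, 4, 5]], selected == [[1, 3, 5]], kept_columns == [0, 2, 4]
```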
In KFS, the initial step involves combining all the features from the three pre-trained networks to form the feature sets, denoted as,
Cluster-based optimized weighted feature set generation processes: (a) KFS and (b) DFS.
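For readers unfamiliar with the density-based clustering that DFS builds on, the following is a generic, minimal DBSCAN on toy points. It is not the authors' feature-set generation procedure, whose details are not recoverable from the text; `eps` and `min_pts` are illustrative values, and a library implementation would normally be used in practice.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        near = neighbours(i)
        if len(near) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in near if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            near_j = neighbours(j)
            if len(near_j) >= min_pts:
                queue.extend(near_j)  # core point: keep growing the cluster
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
# labels == [0, 0, 0, 1, 1, 1, -1]
```

Unlike K-means, the number of clusters is not fixed in advance, and isolated points are flagged as noise rather than forced into a cluster.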
This section describes the experimental stages conducted to support the findings of the study. It covers the datasets and parameter settings employed, as well as the evaluation metrics used to assess the performance of the proposed feature fusion approach. The experiments were run on a system with an Intel(R) Core(TM) i5-7200U CPU @ 2.50 GHz (2.71 GHz), 4.00 GB (3.88 GB usable) RAM, and a 64-bit operating system on an x64-based processor, and were executed on the Google Colab platform.
Description of the dataset and its acquisition
This study builds upon previous work [24] and utilizes two skin lesion datasets: HAM10000 [45] and BCN20000 [46]. The HAM10000 dataset, widely employed in dermatology and skin lesion classification, is publicly available. Its acronym, HAM, stands for Human Against Machine, reflecting its purpose of evaluating automated classification algorithms against human experts. This dataset comprises 10,015 dermoscopic images categorized into seven classes: actinic keratoses and intraepithelial carcinoma/Bowen’s disease, basal cell carcinoma, benign keratosis-like lesions, dermatofibroma, melanoma, melanocytic nevi, and vascular lesions. Additionally, the BCN20000 dataset consists of 19,424 dermoscopic images collected from a hospital clinic in Barcelona between 2010 and 2016. It encompasses eight classes: nevus, melanoma, basal cell carcinoma, seborrheic keratosis, actinic keratosis, squamous cell carcinoma, dermatofibroma, and vascular lesions. These datasets offer a diverse range of lesion types, reflecting the complexity and variability encountered in clinical practice. They have played a crucial role in benchmarking and comparing various algorithms and approaches for skin lesion classification, facilitating the development of automated diagnosis systems.
Exploration of utilized parameters
The successful execution of our experiments relies heavily on the meticulous selection of network models, optimization techniques, and their associated parameter values. It is paramount to choose the optimal combination of these elements to attain precise and efficient results. In this section, we will present the parameters we have selected (see Table 1) and delve into their impact on the overall performance and convergence of our experimentation. Furthermore, a comprehensive analysis of the results will be provided in the subsequent section.
Parameters utilized
During the initial phase of experimentation, a comparison was conducted to validate the accuracy of ten fused feature sets (refer to Table 2). This table provides insights into the performance of these feature sets, obtained from both the previously proposed approach and the current research. The comparison includes information on the decrease in dimensionality (in %), accuracy improvement (measured using artificial neural network (ANN)) after using the fused feature sets (in %), and the time taken for feature selection/acquisition (in minutes) for both the HAM10000 and BCN20000 datasets.
Observed performance of proposed feature sets for HAM10000 and BCN20000 datasets
Table 2 presents the observed performance of the fused feature sets for the HAM10000 and BCN20000 datasets. The results demonstrate the effectiveness of the proposed clustering-based feature fusion approach. In terms of dimensionality reduction, all fused feature sets show a significant decrease compared to the original feature sets, ranging from approximately 38.5% to 54.7% for HAM10000 and from around 39.1% to 50.5% for BCN20000. The fused feature sets also lead to considerable accuracy improvements: the highest-ranked feature set achieves an accuracy of 0.9698 for HAM10000 and 0.9726 for BCN20000, showcasing the potential of the clustering-based fusion method. While the AWFS approach serves as the baseline, the use of the genetic algorithm (GA) [47], particle swarm optimization (PSO) [48], and the MPA for feature selection generally yields better results. Although certain feature sets obtained with GA or PSO have slightly lower accuracy than AWFS, the MPA consistently improves accuracy. The feature selection time for the fused feature sets ranges from 8.23 minutes to 15.38 minutes for HAM10000 and from 9.65 minutes to 15.34 minutes for BCN20000. Overall, the findings confirm the efficacy of the clustering-based feature fusion approach, particularly when combined with the MPA, in enhancing accuracy and reducing dimensionality in skin lesion image classification.
Comparing the results of DFS-MPA on both datasets, we find that it consistently outperforms the other fusion methods in terms of dimensionality reduction and accuracy improvement. For HAM10000, DFS-MPA reduces the dimensionality to 1160 (54.69% reduction), while for BCN20000, it achieves a dimensionality of 1268 (50.47% reduction). This significant decrease in dimensionality demonstrates the efficiency of the DFS-MPA approach in selecting relevant features for both datasets. Moreover, DFS-MPA achieves the highest accuracy among all the fused feature sets, obtaining 0.9698 for HAM10000 and 0.9726 for BCN20000. This represents an accuracy increase of 2.61% for HAM10000 and 1.69% for BCN20000. Furthermore, the feature selection time for DFS-MPA is relatively efficient, taking 12.01 minutes for HAM10000 and 13.86 minutes for BCN20000, indicating its practicality for real-world applications. Overall, the table shows that the feature fusion approaches KFS-MPA and DFS-MPA perform better than the other methods, and that, of the two, DFS-MPA achieves the larger dimensionality reduction and the higher accuracy on both HAM10000 and BCN20000. This positions DFS-MPA as a promising and effective method for skin lesion image classification tasks.
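The reported reduction percentages can be checked with simple arithmetic, assuming the original fused dimensionality is the sum of the three feature-set sizes stated earlier (512 + 1024 + 1024 = 2560). The helper name `reduction_percent` is hypothetical.

```python
def reduction_percent(original_dim, reduced_dim):
    """Percentage of features removed relative to the original dimensionality."""
    return 100.0 * (original_dim - reduced_dim) / original_dim


original_dim = 512 + 1024 + 1024  # VGG16 + EfficientNet B0 + ResNet50 = 2560

ham = reduction_percent(original_dim, 1160)   # DFS-MPA on HAM10000
bcn = reduction_percent(original_dim, 1268)   # DFS-MPA on BCN20000
print(f"{ham:.2f}% {bcn:.2f}%")  # 54.69% 50.47%
```

Both values agree with the reductions reported for DFS-MPA, which supports reading 2560 as the baseline dimensionality.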
Observed performance of proposed feature sets for HAM10000 and BCN20000 datasets with respect to accuracy, sensitivity, specificity and F-score.
Furthermore, Fig. 2 shows the performance evaluation metrics, namely accuracy, sensitivity, specificity, and F-score (based on ANN), for the feature fusion approaches discussed earlier on both the HAM10000 and BCN20000 datasets. DFS-MPA outperforms all the other compared feature fusion strategies across all four evaluation metrics. Fig. 3 shows the average training time (measured in minutes) for each of the feature fusion strategies considered, again for both the HAM10000 and BCN20000 datasets.
The average training time (in minutes) for HAM10000 and BCN20000 datasets.
Fig. 3 provides information on the computational efficiency of the approaches. Taken together, Figs 2 and 3 give an overview of the comparative performance of the feature fusion methods: DFS-MPA not only excels in terms of accuracy, sensitivity, specificity, and F-score but also demonstrates competitive efficiency in terms of training time.
In order to gain valuable insights into the performance and behaviour of the optimized versions of KFS and DFS during training with the ANN classifier, learning curves were plotted. The results exhibit promising performance for both KFS-MPA and DFS-MPA. Notably, DFS-MPA distinguishes itself by outperforming all other feature fusion approaches, as clearly demonstrated in Fig. 4.
Convergence curves of the optimized versions of KFS and DFS for HAM10000 and BCN20000 datasets.
Following the strong performance of DFS-MPA with ANN, a further evaluation was conducted to assess the classification accuracy, along with sensitivity, specificity, and F-score, of the proposed DFS-MPA using deep learning-based classifiers, specifically long short-term memory (LSTM) and CNN. The results of this evaluation are presented in Table 3 for both the HAM10000 and BCN20000 datasets. Among these models, the DFS-MPA-CNN approach is the highest-performing method for both datasets. For the HAM10000 dataset, DFS-MPA-CNN achieves an accuracy of 0.9723, the highest among all the neural network models. It also demonstrates high sensitivity (0.9782) and specificity (0.9766), showing its ability to accurately identify both positive and negative cases. The F-score, which balances precision and recall, is also high at 0.9773, indicating robust overall performance. Similarly, for the BCN20000 dataset, DFS-MPA-CNN outperforms the other models with an accuracy of 0.9802, the highest achieved among all the models. It also exhibits high sensitivity (0.9789) and specificity (0.9794), demonstrating its effectiveness in correctly classifying skin lesion cases, and its F-score is 0.9791. The results in Table 3 indicate that DFS-MPA-CNN is the most effective and reliable of the evaluated methods for skin lesion image classification on both the HAM10000 and BCN20000 datasets.
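For reference, the four reported metrics can be computed from confusion-matrix counts as sketched below. The sketch treats one class against the rest; how the per-class values were aggregated over the seven or eight lesion classes is not stated in the text, so that step is left out. The counts used are made up for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and F-score from binary counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)          # recall / true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


# Illustrative counts only, not taken from the study.
metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)
# accuracy 0.85, sensitivity 0.90, specificity 0.80, f_score about 0.857
```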
Observed performance of DFS-MPA using ANN, LSTM and CNN for HAM10000 and BCN20000 datasets
To gain deeper insights into the proposed DFS-MPA feature fusion approach, its performance in terms of accuracy, sensitivity, specificity, and F-score was specifically evaluated using the CNN classifier. The recorded results for both datasets are depicted in Fig. 5.
Observed performance of DFS-MPA for HAM10000 and BCN20000 datasets with respect to accuracy, sensitivity, specificity and F-score.
In order to facilitate a clear understanding of the evaluation metrics and gain insights into the strengths and weaknesses of the DFS-MPA method for skin lesion image classification tasks, we have included Tables 4–7. These tables present the vital five-fold test results of the DFS-MPA method when utilizing three different classifiers: ANN, LSTM, and CNN. Through these tables, a comprehensive and comparative analysis of the method’s performance on two distinct datasets, HAM10000 and BCN20000, is provided, focusing on accuracy, sensitivity, specificity, and F-score. The tabular format allows for concise and transparent presentation of the metrics, enabling readers to readily assess the effectiveness of DFS-MPA with each classifier on both datasets.
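The five-fold protocol can be illustrated with a small index splitter. This is a plain shuffled split under assumed settings (fixed seed, no stratification by lesion class); the text does not say whether the folds in the study were stratified.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs covering every sample once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    splits = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f in range(k) if f != i for j in folds[f]]
        splits.append((train_idx, test_idx))
    return splits


# HAM10000 has 10,015 images, so each of the five test folds holds 2,003.
splits = k_fold_indices(10015, k=5)
fold_sizes = [len(test) for _, test in splits]
```

Each classifier is then trained on the four training folds and scored on the held-out fold, and the five scores are averaged.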
Five-fold test results (in terms of accuracy) of DFS-MPA based on ANN, LSTM and CNN for HAM10000 and BCN20000 datasets
Table 4 presents the five-fold test accuracies observed for both datasets using the three classifiers. For the HAM10000 dataset, DFS-MPA-ANN consistently achieves high accuracy, with values ranging from 0.9654 to 0.9699. DFS-MPA-LSTM exhibits similar performance, with accuracy scores between 0.9665 and 0.9698. DFS-MPA-CNN performs best, with accuracy values ranging from 0.9720 to 0.9761. The average accuracies for DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN are 0.9677, 0.9687, and 0.9730, respectively. Turning to the BCN20000 dataset, all three classifiers display strong performance. DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN achieve accuracy scores of 0.9669 to 0.9726, 0.9700 to 0.9724, and 0.9768 to 0.9802, respectively. The average accuracies for DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN on BCN20000 are 0.9711, 0.9717, and 0.9789, respectively. Notably, DFS-MPA-CNN exhibits the highest accuracy across the trials as well as the highest average accuracy.
Five-fold test results (in terms of sensitivity) of DFS-MPA based on ANN, LSTM and CNN for HAM10000 and BCN20000 datasets
Table 5 presents the five-fold test results based on sensitivity of the DFS-MPA method when utilizing three different classifiers on both the datasets. The table provides a comprehensive and comparative analysis of the method’s sensitivity performance across multiple trials for each classifier on both datasets. For the HAM10000 dataset, the DFS-MPA-ANN achieves consistent sensitivity scores, ranging from 0.9726 to 0.9739. The DFS-MPA-LSTM displays sensitivity values between 0.9700 and 0.9761, while DFS-MPA-CNN shows sensitivity scores from 0.9738 to 0.9802. The average sensitivities for DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN are 0.9735, 0.9735, and 0.9773, respectively. Similarly, for the BCN20000 dataset, all three classifiers demonstrate strong sensitivity performance. The DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN achieve consistent sensitivity scores of 0.9687 to 0.9768, 0.9698 to 0.9776, and 0.9753 to 0.9791, respectively. The average sensitivities for DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN on BCN20000 are 0.9739, 0.9753, and 0.9777, respectively. The table’s results indicate that the DFS-MPA-CNN consistently exhibits the highest sensitivity scores for both datasets, suggesting its effectiveness in correctly identifying positive cases.
Table 6 offers a comparative analysis of the method’s specificity across multiple trials for each classifier on both datasets. For the HAM10000 dataset, DFS-MPA-ANN demonstrates specificity scores ranging from 0.9667 to 0.9739. DFS-MPA-LSTM exhibits specificity values between 0.9709 and 0.9734, while DFS-MPA-CNN shows specificity scores from 0.9773 to 0.9802. The average specificities for DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN are 0.9713, 0.9723, and 0.9782, respectively. Likewise, for the BCN20000 dataset, all three classifiers display strong specificity performance. DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN achieve specificity scores of 0.9720 to 0.9734, 0.9697 to 0.9780, and 0.9779 to 0.9802, respectively. The average specificities for DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN on BCN20000 are 0.9728, 0.9748, and 0.9790, respectively. The results in Table 6 indicate that DFS-MPA-CNN consistently shows the highest specificity scores for both datasets, suggesting its effectiveness in correctly identifying negative cases.
Table 7 presents the five-fold test results based on the F-score of the DFS-MPA method. For the HAM10000 dataset, DFS-MPA-ANN achieves F-scores ranging from 0.9503 to 0.9736, while DFS-MPA-LSTM exhibits F-scores between 0.9706 and 0.9770. DFS-MPA-CNN demonstrates F-scores from 0.9758 to 0.9773. The average F-scores for DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN are 0.9659, 0.9730, and 0.9766, respectively. Turning to the BCN20000 dataset, all three classifiers display strong F-score performance. DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN achieve F-scores of 0.9654 to 0.9702, 0.9689 to 0.9800, and 0.9764 to 0.9791, respectively. The average F-scores for DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN on BCN20000 are 0.9682, 0.9760, and 0.9782, respectively. The results in Table 7 indicate that DFS-MPA-CNN shows the highest average F-score for both datasets, suggesting its effectiveness in achieving a balance between precision and recall.
Five-fold test results (in terms of specificity) of DFS-MPA based on ANN, LSTM and CNN for HAM10000 and BCN20000 datasets
Five-fold test results (in terms of F-score) of DFS-MPA based on ANN, LSTM and CNN for HAM10000 and BCN20000 datasets
The comprehensive and comparative analysis of the DFS-MPA method’s performance across the four tables consistently indicates that the DFS-MPA-CNN classifier stands out as the most effective choice for skin lesion image classification tasks. Its consistent superior performance in accuracy, sensitivity, specificity, and F-score highlights its potential suitability for accurate and reliable skin lesion classification, making it a promising approach for medical image analysis and diagnosis.
ROC curves were generated to visualize the discriminative behaviour of the proposed DFS-MPA across all three classifiers (ANN, LSTM, and CNN). These curves show the trade-off between the true positive rate (sensitivity) and the false positive rate (1 - specificity) at different decision thresholds, which allows the classifiers’ ability to distinguish between positive and negative cases to be assessed independently of the threshold chosen. Moreover, the area under the ROC curve (AUC) serves as a single metric to quantify the overall classifier performance, where higher AUC values indicate better discriminative capability. The ROC-AUC curves in Fig. 6 for both the HAM10000 and BCN20000 datasets provide insight into the effectiveness of the DFS-MPA approach. Notably, the AUC values for the CNN classifier are 0.9759 for HAM10000 and 0.9788 for BCN20000, indicating the strong performance of DFS-MPA in both cases.
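The AUC has a direct interpretation that is easy to compute: it is the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The sketch below uses this pairwise definition for a binary problem; the multi-class averaging used for the lesion classes is not specified in the text, and the labels and scores are toy values.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

This quadratic-time version is meant for clarity; for large test sets a rank-based or library implementation is preferable.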
ROC-AUC curves for HAM10000 and BCN20000 datasets.
This section presents a comparative analysis of previously proposed feature fusion methods and the current feature fusion approaches proposed in this work. The optimized versions of these approaches include FOWFS-AJS, FOWFS-MPA, and DFS-MPA, evaluated on two datasets: HAM10000 and BCN20000 (as shown in Table 8).
Comparison with previous and current feature fusion approaches for HAM10000 and BCN20000 datasets
Statistical validation based on the Friedman and Bonferroni-Dunn tests
Note: 1-FOWFS-GA-ANN; 2-FOWFS-PSO-ANN; 3-FOWFS-MPA-ANN; 4-KFS-GA-ANN; 5 KFS-PSO-ANN; 6-KFS-MPA-ANN; 7-DFS-GA-ANN; 8-DFS-PSO-ANN; 9-DFS-MPA-ANN; 10-DFS-MPA-LSTM; 11-DFS-MPA-CNN.
The evaluation metrics used for comparison are accuracy, sensitivity, specificity, and F-score. For the HAM10000 dataset, the FOWFS-AJS approach achieved an overall accuracy of 94.48%, with sensitivity of 96.58% and specificity of 95.03%, resulting in an F-score of 95.22%. The FOWFS-MPA approach performed better, achieving an accuracy of 96.62% with slightly improved sensitivity (97.01%) and specificity (96.47%), leading to a higher F-score of 96.69%. The DFS-MPA approach outperformed both, achieving the highest accuracy of 97.23%, with sensitivity of 97.82% and specificity of 97.66%, resulting in an F-score of 97.73%. Moving to the BCN20000 dataset, the FOWFS-AJS approach exhibited an accuracy of 96.05%, accompanied by sensitivity of 96.88% and specificity of 96.06%, resulting in an F-score of 96.61%. The FOWFS-MPA approach displayed similar results, with an accuracy of 96.77%, sensitivity of 96.84%, specificity of 96.99%, and an F-score of 96.88%. Once again, the DFS-MPA approach outperformed the other methods, achieving the highest accuracy of 98.02%, sensitivity of 97.89%, specificity of 97.94%, and an F-score of 97.91%. The results show that the DFS-MPA approach consistently outperforms the other feature fusion strategies on both the HAM10000 and BCN20000 datasets. These findings underscore the effectiveness of DFS-MPA in enhancing classification performance for skin lesion datasets, making it the preferred choice for feature fusion in this study.
This section focuses on statistical validation based on the Friedman test and the Bonferroni-Dunn test [49, 50], which is important in skin lesion classification for ensuring robust and reliable conclusions in medical image analysis. These tests help to identify meaningful differences among classifiers’ performances, revealing which classifiers consistently outperform others, and allow pair-wise comparisons to pinpoint specific classifier pairs with statistically significant differences in performance. This validation process not only aids in selecting the most suitable classifier for skin lesion classification but also adds credibility to the research findings. Table 9 presents the results of statistical validation based on the Friedman test for multiple feature fusion methods combined with various classifiers, namely FOWFS-GA-ANN, FOWFS-PSO-ANN, FOWFS-MPA-ANN, KFS-GA-ANN, KFS-PSO-ANN, KFS-MPA-ANN, DFS-GA-ANN, DFS-PSO-ANN, DFS-MPA-ANN, DFS-MPA-LSTM, and DFS-MPA-CNN. The average ranks for each method are provided, indicating their relative performance across the evaluated criteria. DFS-MPA-CNN has the lowest average rank of 3.72, indicating its superior performance compared to the other methods in this study. Following closely, DFS-MPA-LSTM and DFS-MPA-ANN obtained average ranks of 4.93 and 4.94, respectively, demonstrating their competitive performance in the evaluation. DFS-PSO-ANN and DFS-GA-ANN achieved average ranks of 5.97 and 6.98, respectively. In contrast, FOWFS-GA-ANN, FOWFS-PSO-ANN, and FOWFS-MPA-ANN obtained higher average ranks of 8.69, 7.68, and 7.54, respectively, indicating comparatively weaker performance than the other methods.
These results from the Friedman test’s average ranks indicate that DFS-MPA-CNN is the most promising approach for skin lesion classification among the evaluated feature fusion methods.
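The quantities behind this validation can be sketched as follows: average ranks per method (rank 1 is best), the Friedman chi-square statistic, and the Bonferroni-Dunn critical difference $CD = q_\alpha \sqrt{k(k+1)/(6N)}$ for $k$ methods and $N$ evaluation cases. The scores below are toy values rather than the study's data, and `q_alpha` must be looked up in a critical-value table for the chosen $k$ and significance level; the value passed here is only a placeholder.

```python
import math


def rank_row(row):
    """Rank scores in one row (1 = highest score); ties share the average rank."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def average_ranks(scores):
    """scores[i][j] is the score of method j on evaluation case i."""
    ranked = [rank_row(row) for row in scores]
    k = len(scores[0])
    return [sum(r[j] for r in ranked) / len(ranked) for j in range(k)]


def friedman_statistic(avg_ranks, n_cases):
    k = len(avg_ranks)
    return 12 * n_cases / (k * (k + 1)) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4
    )


def bonferroni_dunn_cd(k, n_cases, q_alpha):
    """Two methods differ significantly if their average ranks differ by more than CD."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n_cases))


toy_scores = [[0.90, 0.80, 0.70], [0.95, 0.85, 0.75]]  # 2 cases, 3 methods
ranks = average_ranks(toy_scores)                      # [1.0, 2.0, 3.0]
chi2 = friedman_statistic(ranks, n_cases=2)            # 4.0
cd = bonferroni_dunn_cd(k=3, n_cases=2, q_alpha=2.241) # placeholder q_alpha
```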
Expected contributions and impact
The presented manuscript makes several significant contributions to the field of skin lesion image classification. First, the proposed clustering-based feature fusion approach demonstrates its effectiveness in enhancing the accuracy of classification. Through the fusion of different feature sets, a notable reduction in dimensionality is achieved, improving the efficiency of the classification process. Moreover, the study evaluates ten fused feature sets using three classifiers on two diverse datasets, providing a comprehensive analysis of the proposed method’s performance.
The main impact of the study lies in the specific focus on the DFS-MPA approach, which outperforms other fusion methods consistently in terms of dimensionality reduction and accuracy improvement. DFS-MPA attains impressive results, including a 54.69% reduction in dimensionality for HAM10000 and 50.47% for BCN20000, as well as achieving the highest accuracy levels among all the fused feature sets for both datasets. The superiority of DFS-MPA is further supported by the ROC-AUC curves, illustrating its exceptional discriminative capabilities.
Additionally, the in-depth evaluations with different classifiers demonstrate the versatility of the DFS-MPA approach, particularly highlighting the exceptional performance of DFS-MPA-CNN, which achieves remarkable accuracy, sensitivity, specificity, and F-score in skin lesion classification scenarios.
The proposed fusion technique represents a significant advancement over traditional fusion approaches in skin lesion classification. Unlike traditional methods, which may lack adaptability and inclusiveness in selecting informative features, the proposed technique, namely KFS-MPA and DFS-MPA, offers enhanced flexibility in weight selection and justifies weight thresholds. This adaptability addresses limitations observed in prior studies, ensuring that no informative features are excluded and less discriminative ones are not inadvertently included, thus mitigating the risk of overlapping with redundant features. Moreover, the proposed fusion technique leverages a clustering-based design approach, which not only enhances the representation and discrimination of skin lesion patterns but also captures the inherent variability and complexity of lesions more effectively. By integrating deep feature fusion techniques, the proposed approach further distinguishes itself by extracting and fusing diverse information from multiple pre-trained CNN models, thereby significantly boosting the discriminative capability of the classification system. This proposed fusion technique offers a more comprehensive and adaptable solution compared to traditional fusion approaches, thus contributing to improved accuracy and reliability in skin lesion classification tasks.
Conclusion and future work
In conclusion, this research has successfully achieved its primary objective of enhancing skin lesion image classification accuracy through a comprehensive clustering-based deep feature fusion approach. The proposed method addresses limitations observed in prior feature fusion models and presents a novel optimization mechanism to improve adaptability and robustness. The clustering-based design approach, combined with deep feature fusion techniques, significantly enhances the representation and discrimination of different skin lesion patterns, leading to improved classification accuracy. The contributions of this study are multi-fold. Firstly, the clustering-based design approach provides a more effective organization and understanding of skin lesion patterns, enhancing representation and discrimination capabilities. Secondly, the integration of deep feature fusion leverages the power of multiple pre-trained CNN models, extracting diverse and complementary information for comprehensive representation. Thirdly, the optimized mechanism for feature fusion ensures adaptability across datasets and varying feature importance, enhancing classification accuracy. Lastly, the research addresses the challenges posed by complex and variable skin lesions, which can aid in early detection and improved patient outcomes. The impact of this research lies in the significant improvements achieved in skin lesion image classification accuracy, supported by the superior performance of the DFS-MPA approach over other fusion methods. By addressing the limitations of previous approaches and introducing innovative clustering-based strategies, this research opens avenues for more reliable and practical skin lesion diagnosis. Looking ahead, there are several promising directions for future research. Further exploration of alternative fusion methods and classifiers can provide a more comprehensive understanding of skin lesion classification performance. 
Additionally, fine-tuning the parameters of the MPA algorithm can optimize its performance for specific applications. Moreover, investigating the applicability of the proposed approach to other medical image analysis tasks could extend its utility in diverse healthcare domains. Continuous efforts to enhance the clustering-based design approach and deep feature fusion techniques hold the potential for more accurate and efficient skin lesion image classification systems, ultimately contributing to improved healthcare outcomes.
