Disease classification using neighbourhood centroid opposition based multi-objective flamingo search algorithm based feature selection approach

Abstract

Feature selection is a crucial task in machine learning because the computational cost grows exponentially with problem complexity. Multi-objective optimization approaches are widely used by researchers to reduce the dimensionality of medical datasets and the associated computational cost. In this paper, neighbourhood centroid opposition-based learning (NCOBL) is integrated into the multi-objective Flamingo Search Algorithm (MOFSA) to improve classification accuracy, enhance exploration of the search space, and limit the computational cost as the size of the dataset increases. The NCOBL mutation is employed to improve the population diversity of the Flamingo Search Algorithm. The data reduced to the selected features are classified using the weighted K-Nearest Neighbour classifier. The efficacy of the suggested strategy is assessed on fifteen benchmark medical datasets in terms of recall, precision, accuracy, running time, F-measure, hamming loss, ranking loss, standard deviation, mean value error, and size of the selected features, and its performance is compared with that of existing approaches. The suggested method produced the minimum mean value, standard deviation, and mean hamming loss, and a maximum accuracy of about 99%. The experimental findings demonstrate that the suggested method can enhance classification accuracy and eliminate redundancy in large datasets.

Keywords

Flamingo search algorithm, K-Nearest Neighbour, feature selection, multi-objective optimization, disease classification

1 Introduction

In machine learning, feature selection (FS) is an important pre-processing step because of the growth in the dimensionality and volume of data [1]. The original features of a dataset often include redundant features that are unnecessary for regression and classification tasks, and retaining them degrades classification performance [2]. An FS approach can eliminate such features without losing useful information in the data [3]. The major benefits of FS techniques are improved classification effectiveness and reduced dataset complexity and computational cost. In medical diagnosis, FS receives particular attention because of the large volumes of data generated by medical establishments. FS approaches fall into two main types: filter-based and wrapper-based [4]. Filter-based algorithms analyse the inherent characteristics of the data without using any learning method, rank the features by relevance, and select subsets accordingly. Wrapper-based algorithms consider the relationships between the dataset features [5]; they employ an optimization algorithm to generate candidate feature subsets and assess the suitability of each subset. Wrapper-based approaches have shown better performance than filter-based approaches [6].

The wrapper-based FS technique seeks to minimise the number of features used while retaining accuracy [7]. Achieving this goal requires a suitable optimization algorithm. Traditional optimization approaches cannot select features efficiently and effectively because of the high-dimensional search space. Recently, meta-heuristic optimization techniques have been applied to FS problems because their global search strategies can find optimal solutions, which helps a wrapper strategy identify optimal feature subsets. Differential evolution (DE), particle swarm optimization (PSO), the memetic algorithm (MA), the Flamingo Search Algorithm (FSA), ant colony optimization (ACO), and the genetic algorithm (GA) are some of the meta-heuristic techniques used for FS [8–12]. Based on the objective function, these approaches are classified as single-objective or multi-objective. Multi-objective FS approaches are employed to satisfy the two goals of wrapper-based feature selection. The FSA is a popular optimization algorithm inspired by the foraging and migratory behaviour of flamingos [13] and has been applied to various optimization problems. Compared with existing algorithms, the FSA has better convergence speed, search accuracy and stability. To improve the performance of the FSA on high-dimensional datasets, mutation operators such as opposition-based learning (OBL), Levy flight mutation and Cauchy mutation can be applied. In this research, the neighbourhood centroid opposition-based learning (NCOBL) mutation operator is used to present a multi-objective FSA [14]; the mutation operator increases the convergence speed of the suggested approach. The Multi-Label Feature Selection Algorithm (MMFS) [15], the Multi-Objective Artificial Bee Colony (B-MOABC) [16], and the Multi-Objective Harris Hawks Optimisation and Fruitfly Optimisation Algorithm (MOHHOFOA) [17] are used as baselines to validate the performance of the suggested strategy.

The following list includes the article’s key contributions:

A novel FS strategy called NCOBL-MOFSA is proposed as a remedy for premature convergence to inferior solutions, and it attains a well-balanced trade-off between exploration and exploitation. The neighbourhood centroid opposition-based learning mutation is employed to improve the population diversity of the FSA.

To classify the diseases, the weighted K-Nearest Neighbour (KNN) classifier makes use of the optimum features that are chosen using the suggested feature selection methodology.

The effectiveness of the suggested FS strategy NCOBL-MOFSA is compared with three existing approaches using fifteen benchmark medical datasets. The evaluation parameters utilized are precision, accuracy, recall, running time, F-measure, hamming loss, ranking loss, standard deviation, mean value error and size of selected features.

The remainder of the article is structured as follows. The related works are explained in Section 2. The suggested strategy is detailed in Section 3. The experimental findings are discussed in Section 4. Section 5 concludes the article and outlines future studies.

2 Related works

Wrapper-based FS methods use either a sequential or a random search strategy. Sequentially adding or removing features can lead to trapping in local optima. Random search approaches based on metaheuristic algorithms, together with search strategies such as sequential backward search, sequential forward search, and floating search, may help overcome local optima [1]. Because they employ gradient-free methods, metaheuristics are effective in obtaining global optimum values [2]. Wrapper-based FS is nevertheless still not fully understood. A co-evolution binary particle swarm optimisation with a multiple inertia weight strategy is developed in [3] to enhance global search capacity and diversity by dividing the particle population into several species with different inertia weight schemes. A binary version of the ant lion optimizer (BALO) is proposed in [4] for feature selection; it adaptively searches the feature space while maintaining the required population diversity and a well-balanced trade-off between exploration and exploitation.

In [18], a multi-objective feature selection algorithm called MOFS-BDE is proposed based on self-learning and Binary Differential Evolution (BDE). Three new operators improve its performance. A new binary mutation operator based on probability difference generates fresh solutions. A one-bit purifying search (OPS) refines the elite individuals in the population. Non-dominated sorting is paired with the crowding distance to reduce time consumption and choose the best parent individuals. The findings show that, by combining binary mutation and OPS, the strategy can balance local exploitation and global exploration. However, the binary representation limits the algorithm’s ability to handle continuous or categorical information directly. A multi-objective approach is recommended in [19] for predicting warfarin dosage. It employs multi-objective PSO (MOPSO) and the Non-dominated Sorting Genetic Algorithm-II (NSGA-II) together with an artificial neural network. The INR test was performed on 553 patients during 2013–2015. Predicting the dosage involves identifying genetic and clinical features. In the experimental results, MOPSO obtained higher accuracy than NSGA-II.

In [20], features in hyperspectral imagery are selected using a multi-objective discrete sine cosine algorithm (SCA). To reduce redundancy and increase the relevance of the chosen feature subset, the method uses a novel hyperspectral FS framework based on the ratio between mutual information (MI) and the Jeffries-Matusita (JM) distance. To increase the quantity of information and to address the problem of choosing discrete hyperspectral features, a further measure, variance, is used. The discrete SCA broadens the options for choosing optimal feature subsets. Experiments on five hyperspectral image datasets and ten conventional UCI datasets demonstrate the usefulness and efficiency of the technique. A two-archive Multi-Objective Artificial Bee Colony (MOABC) algorithm for the FS problem is described in [21]. Two new operators, a convergence-guiding search for employed bees and a diversity-guiding search for observer bees, are introduced to find a non-dominated set with good convergence and distribution. Two archives, the external archive and the leader archive, enhance the search capabilities of the different types of bees. The outcomes show that this approach outperforms the single-objective ABC algorithm.

Using the criteria of simultaneously minimising the makespan and overall cost, the multi-objective fruit fly optimization algorithm (MOFOA) is suggested in [22]. For order preference, a search method based on vision is also applied. The effect of the parameters is also investigated using a design-of-experiment approach. This model performed well when measured against a number of common benchmarks and when the results were compared to those of other optimization techniques. A cloud model is employed in [23] to simulate food circumstances, and the FOA is supplied to resolve problems that have multi-objective functions. There is also the use of a self-adaptive parameter strategy. To improve diversity, the nearest neighbour distance is normalised.

A multi-objective FOA technique is detailed in [24] to address the test point selection problem, with fruit fly locations encoded as binary strings. Compared with other algorithms, this model performs acceptably. A new combined model for predicting air pollution, based on the multi-objective Harris hawks optimization method, is described in [25]. An extreme learning machine (ELM) is used to forecast air pollution time series, and the system is evaluated on air pollutant concentrations in three Chinese cities using a variety of criteria; in experiments, the model is found to work successfully. A multi-objective optimization problem for DE-based feature selection is presented in [26] to locate clusters using fewer features, without knowing in advance how many clusters will be found. A number of real-world experiments with numerous criteria and a variety of clustering algorithms have been carried out to analyse the methodology, and the outcomes suggest that the strategy can considerably enhance clustering performance while lowering dimensionality.

Ruiz and Stutzle, 2007 [5] proposed an Iterated Greedy (IG) based wrapper feature selection algorithm that efficiently finds the best subset of the existing features using well-balanced destruction and construction operations at each iteration, guided by pre-calculated filter scores. This method avoids exploring the entire search space. Gokalp et al. 2020 [6] proposed a simple and effective IG-based wrapper feature selection algorithm, strategically guided by pre-calculated filter scores, to achieve strong dimensionality reduction in a sentiment analysis framework. In [7], a greedy crossover technique is integrated into the coronavirus herd immunity optimizer (CHIO) to improve its exploration capability in the search space, as a remedy for premature convergence to inferior solutions when the search is locked into a local optimum.

Two multi-objective FS methods based on the artificial butterfly optimization technique are proposed in [27]. The first objective is to maximise classification accuracy, and the second is to reduce the number of features. In experiments on eight datasets, the suggested methods outperform the compared methods, and both outperform single-objective methods. Because the multi-objective Grey Wolf Optimizer (GWO) was developed to address continuous optimization problems, it is adapted to FS in [28]: a binary form is obtained with a sigmoid transfer function, and a deep learning network is used to assess the classification performance of the selected feature subsets. Fifteen benchmark datasets were used to compare the technique with the MOPSO and NSGA-II algorithms. In most cases, the technique reduced both the number of features and the classification error while using less processing power than the compared multi-objective methods.

3 Proposed feature selection strategy

The suggested framework consists of pre-processing, feature selection and classification. At first, the collected medical dataset is pre-processed to speed up the process. Then the pre-processed features are selected using the suggested feature selection strategy. At last, the optimal selected features are used by the weighted KNN classifier to classify the disease. The suggested strategy’s design is depicted in Fig. 1.

Fig. 1

Framework of the suggested strategy.

3.1 Data pre-processing

Cleaning is the first step in data pre-processing and increases the quality of the dataset. In this stage, duplicates are eliminated, missing values are handled, and encoding is done. Machine learning algorithms operate only on numeric data, but the dataset includes both nominal and numeric data, so encoding is used to transform the nominal values into numeric values. Data scaling is the final pre-processing step and speeds up the subsequent procedure. Because the features of the dataset vary significantly in range, magnitude, and unit, scaling normalises them to the range [0, 1] so that all the data are on a common scale.
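The pre-processing steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, missing values are assumed to be marked as `None` and are filled with the column mean, and nominal values are encoded as integers in order of first appearance.

```python
def drop_duplicates(rows):
    """Remove repeated rows while keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing(column):
    """Replace None entries with the mean of the observed values (assumed policy)."""
    present = [v for v in column if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in column]


def encode_nominal(column):
    """Map each distinct nominal value to an integer code (order of first appearance)."""
    codes = {}
    return [codes.setdefault(value, len(codes)) for value in column]


def min_max_scale(column):
    """Rescale a numeric column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]
```

Each helper works on one column (or on the list of rows for de-duplication), so the steps can be applied in the order given in the text: cleaning, encoding, then scaling.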

3.2 Feature Selection using NCOBL-MOFSA

The suggested strategy is used to choose the best features from the pre-processed dataset. The MO-FSA is combined with the neighbourhood centroid opposition-based learning mutation operator, and the resulting multi-objective feature selection approach is paired with a weighted KNN classifier. The flowchart of the suggested feature selection strategy is illustrated in Fig. 2. The main processes of the proposed approach are discussed in this section.

Fig. 2

Flowchart of proposed feature selection approach.

Initially, the population P(t) is initialized and the parameters are set. The neighbourhood centroid opposition-based learning mutation operator is applied to the population P(t), producing the offspring population Q(t). The joint population is obtained by combining the offspring population Q(t) with the population P(t). The fitness function is then calculated, and the positions are updated according to the FSA based on the fitness values. This process is repeated until the maximum number of iterations is reached. The final output of this process is the optimal feature subset, which is used to classify the diseases using a weighted KNN classifier. The performance is then evaluated using various performance measures, and the values are compared with those of existing approaches.

3.2.1 A. Multi-objective Flamingo Search Algorithm (MO-FSA)

The population P is initialised with SN individuals generated from random numbers between 0 and 1, and the parameters are set. The maximum number of iterations is Iter_Max, and the proportion of flamingos migrating in the first part is MP_b. In the i-th iteration of population renewal, the number of foraging flamingos is MP_r = rand[0, 1] × P × (1 − MP_b). The number of flamingos migrating in the first part of the iteration is MP₀ = MP_b × P, and the number migrating in the second part is MP_t = P − MP₀ − MP_r. The flamingo population is sorted by fitness value. The first MP_b flamingos with high fitness and the first MP_t flamingos with low fitness are treated as migrating flamingos, and the others are treated as foraging flamingos. Equation (1) is used to update the positions of the foraging flamingos.

$x_{ij}^{t + 1} = (x_{ij}^{t} + r_{1} \times {xa}_{j}^{t} + G_{2} \times | G_{1} \times {xa}_{j}^{t} + r_{2} \times x_{ij}^{t} |) / Q$ (1)

where $x_{ij}^{t + 1}$ is the position of the i-th flamingo in the j-th dimension at iteration (t + 1); r₁ and r₂ are random numbers in the range [–1, 1]; G₁ and G₂ are random numbers drawn from the standard normal distribution; and the diffusion factor Q = Q(n) is a random number following a chi-square distribution. The positions of the migrating flamingos are updated using Equation (2). $x_{ij}^{t + 1} = x_{ij}^{t} + g \times ({xa}_{j}^{t} - x_{ij}^{t})$ (2)

where $x_{ij}^{t}$ is the position of the i-th flamingo in the j-th dimension at iteration t, $x_{ij}^{t + 1}$ is its position at iteration (t + 1), ${xa}_{j}^{t}$ is the j-th dimension of the position with the best fitness at iteration t, and g = K (o, k) is a Gaussian random number. After the position update, any position that falls outside the allowed range is corrected and the fitness value is evaluated. This procedure is repeated until the stopping condition is met. The outcome is the best feature set found.
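The population split and the two position updates in Equations (1) and (2) can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, r₁ and r₂ are assumed to be drawn uniformly from [−1, 1], the chi-square diffusion factor is sampled through the equivalent gamma distribution, and out-of-range positions are assumed to be corrected by clipping to [0, 1].

```python
import random


def split_counts(pop_size, mp_b, rng=random):
    """Return (MP_r, MP_0, MP_t): foraging, first-part and second-part migrating counts."""
    mp_0 = round(mp_b * pop_size)
    mp_r = int(rng.random() * pop_size * (1 - mp_b))
    mp_t = pop_size - mp_0 - mp_r
    return mp_r, mp_0, mp_t


def forage_step(x, best, r1, r2, g1, g2, q):
    """Equation (1) for one flamingo; x and best are equal-length lists."""
    return [(xj + r1 * bj + g2 * abs(g1 * bj + r2 * xj)) / q
            for xj, bj in zip(x, best)]


def migrate_step(x, best, g):
    """Equation (2): move towards the best position by the Gaussian factor g."""
    return [xj + g * (bj - xj) for xj, bj in zip(x, best)]


def sample_forage_factors(n_dof, rng=random):
    """Draw r1, r2, G1, G2 and Q for one foraging update (distributions as assumed above)."""
    r1, r2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    g1, g2 = rng.gauss(0, 1), rng.gauss(0, 1)
    q = rng.gammavariate(n_dof / 2, 2)  # chi-square with n_dof degrees of freedom
    return r1, r2, g1, g2, q


def clip(position, low=0.0, high=1.0):
    """Assumed boundary correction: clamp every coordinate to [low, high]."""
    return [min(max(v, low), high) for v in position]
```

Keeping the random draws separate from the update formulas makes the two equations easy to check by hand, for example `forage_step([1.0], [2.0], 1, 1, 1, 1, 2.0)` evaluates (1 + 2 + |2 + 1|) / 2.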

3.2.2 B. Neighbourhood centroid opposition-based learning (NCOBL)

The objective of OBL is to evaluate the current solution and its reverse solution simultaneously and choose the better one, which increases search efficiency. The reverse points in OBL are computed using the maximum and minimum boundaries. The drawback of OBL is that it does not fully utilise population-wide search information. The NCOBL operator is used by the FS strategy to change the optimal flamingo location in order to escape local optima. It can be defined as follows.

Let X = [x₁, x₂, …, x_n] be n points with unit mass in a D-dimensional search space. The centre of gravity M = (M_1, M_2, …, M_D) of these points is given by Equation (3) $M_{j} = \frac{\sum_{i = 1}^{n} x_{i, j}}{n}, \quad j = 1, 2, \dots, D$ (3)

The reverse point of X_i can be defined as given in Equation (4) $\vec{X_{i}} = 2 \times M - X_{i}, i = 1, 2, \dots, n$ (4)

where each component x_i,j ∈ [a_j, b_j], and a_j and b_j are the minimum and maximum of x_i,j, respectively.
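A small sketch of Equations (3) and (4) is given below. It is illustrative only: the function names are hypothetical, the centroid is taken over all points passed in (the text does not specify how the neighbourhood is chosen, so the caller is assumed to pass the relevant neighbourhood), and reverse points that leave [a_j, b_j] are assumed to be clipped back to the bounds.

```python
def centroid(points):
    """Equation (3): per-dimension mean of n unit-mass points."""
    n = len(points)
    dims = len(points[0])
    return [sum(p[j] for p in points) / n for j in range(dims)]


def reverse_point(x, m, lower, upper):
    """Equation (4), 2*M - X, followed by clipping to [a_j, b_j] (assumed correction)."""
    return [min(max(2 * mj - xj, aj), bj)
            for xj, mj, aj, bj in zip(x, m, lower, upper)]


def opposite_population(points):
    """Reverse every point about the centroid, using per-dimension min/max as bounds."""
    dims = len(points[0])
    m = centroid(points)
    lower = [min(p[j] for p in points) for j in range(dims)]
    upper = [max(p[j] for p in points) for j in range(dims)]
    return [reverse_point(p, m, lower, upper) for p in points]
```

For the three points (0, 0), (1, 2) and (2, 4), the centroid is (1, 2), so the first and last points swap places under the reversal and the middle point is unchanged.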

3.2.3 C. Fitness function

In the proposed feature selection approach, two objectives are considered. The multi-objective function aims to increase classification accuracy while decreasing dataset size. Increasing classification accuracy is the first fitness function that is taken into account. For that, Equation (5) is used to calculate the classification error. $minimum ({function}_{1}) = \left(\frac{1}{n} \sum_{l = 1}^{n} \frac{I_{e}}{I_{total}}\right) \times 100\%$ (5)

where I_total denotes the total number of instances and I_e denotes the number of incorrectly predicted instances, so this function measures the classification error to be minimised. The second fitness function is the minimisation of the feature size, which is calculated using Equation (6). $minimum ({function}_{2}) = \sum_{i = 1}^{F} x_{i}$ (6)

Where the ith value of the feature set is denoted by x_i. The dataset’s overall number of features is represented by F.
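The two objectives in Equations (5) and (6) can be sketched as below. This is a hedged illustration: the function names are hypothetical, the outer average in Equation (5) is assumed to run over n evaluation folds, and x_i is assumed to be a binary indicator (1 if feature i is selected, 0 otherwise).

```python
def classification_error(y_true, y_pred):
    """Percentage of incorrectly predicted instances, I_e / I_total * 100."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true) * 100.0


def mean_classification_error(folds):
    """Equation (5): average error over folds, each given as a (y_true, y_pred) pair."""
    return sum(classification_error(t, p) for t, p in folds) / len(folds)


def feature_count(mask):
    """Equation (6): number of selected features in a 0/1 mask."""
    return sum(mask)


def fitness(mask, folds):
    """Both objectives as a tuple (error %, selected features); both are minimised."""
    return mean_classification_error(folds), feature_count(mask)
```

For example, a mask `[1, 0, 1, 1, 0]` selects three of five features, and a fold with two wrong predictions out of four contributes an error of 50%.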

3.3 Classification using weighted K-Nearest neighbour

The K-nearest neighbours (KNN) approach has been applied in feature selection techniques in recent years. The conventional KNN technique can be thought of as giving each nearest neighbour a weight of 1/K. In other words, the K nearest neighbour samples contribute equally to the predicted category of the test sample, ignoring the possibility that the significance of the neighbours may vary. The closer two samples are, the more likely they are to fall into the same category, so a closer neighbour should carry more significance. Therefore, each of the K neighbours should be given a weight based on how close it is to the test sample, whether the task is regression or classification: the closer the neighbour, the larger its weight. Accordingly, the weighted KNN approach is utilised in this work for sample prediction.

The weighted KNN approach, which is based on KNN, allows different nearest neighbours to have varying degrees of importance, on the assumption that the closer a neighbour is to the test sample, the more influence it has on the predicted category. Each neighbour is therefore given a distance weighting. Using conventional KNN, the class label $\hat{y_{i}}$ of the test sample x_i can be calculated using Equation (7) $\hat{y_{i}} = \frac{1}{K} \sum y_{j}$ (7)

where K denotes the number of nearest neighbours and y_j denotes the class labels of the nearest neighbours. After considering the distance weighting, the target variable $\hat{y_{i}}$ can be predicted using Equation (8). $\hat{y_{i}} = \frac{1}{\sum_{i \neq j} w (d_{ij})} \sum_{i \neq j} w (d_{ij}) \cdot y_{j}$ (8)

Here d_ij is the Euclidean distance between x_j and x_i, and w(d_ij) is a weighting function of the distance d_ij whose value lies in [0, 1].
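The difference between the equal-weight vote of Equation (7) and the distance-weighted form of Equation (8) can be illustrated with a short sketch. The weighting function w(d) = 1 / (1 + d) used here is an assumption chosen only because it lies in (0, 1]; the paper does not state which weighting function is used. For categorical labels, the weighted average of Equation (8) is realised as a weighted vote, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance d_ij between two feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def inverse_distance_weight(d):
    """Assumed weighting function: 1 / (1 + d), which lies in (0, 1]."""
    return 1.0 / (1.0 + d)


def weighted_knn_predict(train_x, train_y, query, k, weight=inverse_distance_weight):
    """Predict the label of `query` by a distance-weighted vote of its k nearest neighbours."""
    ranked = sorted(
        ((euclidean(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    votes = defaultdict(float)
    for distance, label in ranked[:k]:
        votes[label] += weight(distance)
    return max(votes, key=votes.get)
```

Passing `weight=lambda d: 1.0` recovers the conventional equal-weight KNN vote, which makes it easy to compare the two behaviours on the same data.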

4 Experimental results and discussion

4.1 Dataset details

The efficacy of the suggested feature selection strategy is evaluated using 15 benchmark datasets from OpenML [29]: Spambase, Sonar, Arrhythmia, Madelon, Isolet, Colon-cancer, Hepatitis, Iris, Lymphography, Primary-tumour, Breast-cancer, Lung-cancer, Liver-disorders, Dermatology, and Leukemia. Table 1 lists, for each dataset, its name, the number of data samples, the number of features, and the number of classes.

Table 1
Dataset details

No.	Dataset	Total data samples	Total features	Total classes
1	Spambase	4601	58	2
2	Sonar	208	61	2
3	Arrhythmia	452	280	13
4	Madelon	2600	501	2
5	Isolet	7797	618	26
6	Colon-cancer	62	2001	2
7	Hepatitis	155	20	2
8	Iris	150	5	3
9	Lymphography	32	57	3
10	Primary-tumour	148	19	4
11	Breast-cancer	286	10	2
12	Lung-cancer	339	18	21
13	Liver-disorders	345	6	2
14	Dermatology	366	35	6
15	Leukemia	72	7130	2

4.2 Implementation details

The suggested model is implemented in Python using Google Colab [30], a hosted Jupyter-based environment that runs in a web browser and allows Python code to be developed and executed. The datasets are stored in Google Drive and mounted in Google Colab.

4.3 Parameter settings

The parameter settings of the proposed approach and the existing approaches are listed in Table 2.

Table 2
Parameter Setting

Method	Parameter
Proposed approach	population = 50, MP_b = 0.1, diffusion factor = G(0, 1.2), K(8)
MMFS [15]	θ = 0.5 and 0.6
B-MOABCFS [16]	limitation trial parameter = 5; number of food sources = N/2
MOHHOFOA [17]	rate of rise = 0.1; number of grids = 7; factor of delete = 2; factor of select = 2

4.4 Performance metrics

The performance metrics used to assess the effectiveness of the suggested approach are recall, precision, accuracy, ranking loss, hamming loss and F1 score. These metrics are computed as follows.

Precision (Pr): The precision is computed as below, $\Pr = \frac{1}{| s |} \sum_{i = 1}^{| s |} \frac{| \gamma_{i} \cap P_{i} |}{| P_{i} |}$ (9)

F1 score (F1): The F1-score is evaluated by the following equation, $F1 = \frac{2 \times \sum_{i = 1}^{| s |} | P_{i} \cap \gamma_{i} |}{\sum_{i = 1}^{| s |} | \gamma_{i} | + \sum_{i = 1}^{| s |} | P_{i} |}$ (10)

Accuracy (Acc): The accuracy of the proposed approach is computed by Equation (11). $Acc = \frac{1}{| s |} \sum_{i = 1}^{| s |} \frac{| \gamma_{i} \cap P_{i} |}{| \gamma_{i} \cup P_{i} |}$ (11)

Recall (Re): The recall value is computed as, $Re = \frac{1}{| s |} \sum_{i = 1}^{| s |} \frac{| \gamma_{i} \cap P_{i} |}{| \gamma_{i} |}$ (12)

Ranking loss: Equation (13) is utilized for computing the ranking loss.

$R_{loss} (s) = \frac{1}{| s |} \sum_{i = 1}^{| s |} \frac{| \{ (a, b) : a \in \gamma_{i}, b \in \bar{\gamma_{i}}, \varphi_{i, a} \leq \varphi_{i, b} \} |}{| \gamma_{i} | | \bar{\gamma_{i}} |}$ (13)

Hamming loss: Equation (14) is utilized for computing the hamming loss. $H_{loss} (s) = \frac{1}{| s |} \sum_{i = 1}^{| s |} \frac{1}{| L |} | \gamma_{i} \oplus P_{i} |$ (14)

where ∩ denotes the intersection between γ_i and P_i, and φ_i denotes the predicted label ranking for instance i. The label count is |L|. The multi-label dataset is represented as s = { (s_i, γ_i) | 1 ⩽ i ⩽ |s| }, where s_i is a specific instance and γ_i ⊆ L is its true label subset, whose complement in L is $\bar{\gamma_{i}}$. The symmetric difference between the predicted and true label subsets is denoted by ⊕. P_i denotes the set of labels predicted by the suggested classifier.
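The set-based metrics in Equations (9) to (14) can be sketched with Python sets. This is an illustrative sketch under assumptions: the function names are hypothetical, label sets are given as Python `set` objects, ranking scores are given as a dictionary from label to score per instance, and a term whose denominator is zero is assumed to contribute 0.

```python
def _ratio(num, den):
    """Safe division; an empty denominator contributes 0 (assumed convention)."""
    return num / den if den else 0.0


def precision(true_sets, pred_sets):
    return sum(_ratio(len(t & p), len(p)) for t, p in zip(true_sets, pred_sets)) / len(true_sets)


def recall(true_sets, pred_sets):
    return sum(_ratio(len(t & p), len(t)) for t, p in zip(true_sets, pred_sets)) / len(true_sets)


def accuracy(true_sets, pred_sets):
    return sum(_ratio(len(t & p), len(t | p)) for t, p in zip(true_sets, pred_sets)) / len(true_sets)


def f1_score(true_sets, pred_sets):
    overlap = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    total = sum(len(t) for t in true_sets) + sum(len(p) for p in pred_sets)
    return _ratio(2 * overlap, total)


def hamming_loss(true_sets, pred_sets, n_labels):
    """Equation (14): mean size of the symmetric difference, normalised by |L|."""
    return sum(len(t ^ p) for t, p in zip(true_sets, pred_sets)) / (n_labels * len(true_sets))


def ranking_loss(true_sets, scores, labels):
    """Equation (13): fraction of (relevant, irrelevant) pairs ranked in the wrong order."""
    total = 0.0
    for t, score in zip(true_sets, scores):
        rest = labels - t
        bad = sum(1 for a in t for b in rest if score[a] <= score[b])
        total += _ratio(bad, len(t) * len(rest))
    return total / len(true_sets)
```

With true label sets `[{0, 1}, {2}]` and predictions `[{0}, {1, 2}]` over three labels, each instance has one wrongly assigned label, so the hamming loss is 2 / (3 × 2) = 1/3.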

4.5 Experimental results

The performance of the suggested approach is compared with that of three well-known techniques: MMFS, B-MOABC, and MOHHOFOA. Fig. 3 compares the techniques by Pareto front analysis. Pareto front analysis, also referred to as Pareto front optimisation or Pareto front discovery, is used in multi-objective optimisation problems to determine the set of solutions that offer the best trade-offs among competing goals. The comparison is made on the Spambase dataset. The y-axis of the graph represents classification error, and the x-axis represents the size of the selected feature subset. The feature size decreased along with the classification error, so the two objective functions remain in balance. According to the Pareto front results, the suggested NCOBL-MOFSA approach can reduce classification errors while also guaranteeing a smaller solution size for all 15 datasets.

Fig. 3

Pareto front analysis.
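The non-dominated filtering that underlies a Pareto front analysis can be sketched as follows. This is a generic illustration, not the authors' procedure: the function names are hypothetical, and each solution is assumed to be a tuple (number of selected features, classification error) with both objectives minimised.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical (features, error %) pairs used only to illustrate the filter.
candidates = [(10, 5.0), (8, 6.0), (12, 4.0), (10, 7.0), (9, 6.0)]
front = pareto_front(candidates)
```

In this made-up example, (10, 7.0) is dominated by (10, 5.0) and (9, 6.0) is dominated by (8, 6.0), so the front keeps the remaining three trade-off points.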

Figure 4 uses box plot analysis to compare the suggested technique with existing methods in terms of accuracy, precision, recall, F-measure, and specificity. Figure 4(a) compares the accuracy of the proposed approach with that of existing methodologies for all datasets. On the Spambase dataset, the suggested technique achieved an accuracy of 99.3%. The upper quartile of the box plot for the suggested strategy is 99%, whereas the upper quartiles for the existing MMFS, B-MOABC and MOHHOFOA algorithms are 94%, 93%, and 92%, respectively. The suggested strategy therefore has the highest upper quartile of all the compared approaches, and its median line is also higher. The precision, recall, F-measure, and specificity of the proposed technique are compared with those of existing approaches in Figures 4(b), 4(c), 4(d), and 4(e), respectively. The suggested methodology outperformed the other existing strategies. Because the performance of every strategy is consistently high, the box plots contain no potential outliers.

Fig. 4

Comparison of performance (a) accuracy (b) Precision (c) Recall (d) F-measure (e) Specificity.

The size of the selected feature subsets is compared with that of existing approaches such as MMFS, B-MOABC and MOHHOFOA in Fig. 5. A strategy that uses a smaller set of features can shorten training time while increasing classifier accuracy. The suggested technique selects fewer features than the existing approaches for all 15 datasets, as shown in Fig. 5, and therefore has the lowest feature size.

Fig. 5

Selected features comparison.

Table 3 compares the average fitness values of the approaches across the datasets. For all 15 datasets, the suggested method achieves lower average fitness values than MMFS, B-MOABC and MOHHOFOA. Because both objectives are minimised, lower fitness values indicate higher-quality solutions. The results show that the suggested NCOBL-MOFSA approach attains better fitness values than the existing approaches.

Table 3

Comparison of average fitness

Datasets	Method
	Proposed	MMFS	B-MOABC	MOHHOFOA
Spambase	0.0003	0.0019	0.0212	1.6399
Sonar	0.0046	0.0813	0.1031	0.1633
Arrhythmia	0.0021	0.2193	0.2391	0.2864
Madelon	0.002	0.0420	0.0407	0.0438
Isolet	0.0028	0.0427	0.0504	0.0543
Colon-cancer	0.0135	0.1748	0.1786	0.1732
Hepatitis	0.0036	0.0926	0.0884	0.0896
Iris	0.0017	0.0452	0.0701	0.0495
Lymphography	0.0073	0.1705	0.1662	0.1259
Primary-tumor	0.0015	0.0794	0.0338	0.0722
Breast-cancer	0.0091	0.0626	0.0896	0.1021
Lung-cancer	0.0124	0.0155	0.0165	0.1536
Liver-disorders	0.0097	0.1681	0.1681	0.0207
Dermatology	0.0024	0.1136	0.2201	0.0201
Leukemia	0.0064	0.0172	0.0172	0.0232

Figure 6 compares the hamming loss and ranking loss of the suggested feature selection approach with those of existing approaches. Hamming loss measures the fraction of incorrectly predicted labels in the multi-label classification problem, and a lower value indicates better performance. Ranking loss quantifies the inconsistency between the predicted and true rankings of labels. A radar chart is used to compare the outcomes. For most of the datasets, the hamming loss and ranking loss of the suggested approach are lower than those of the existing MMFS, B-MOABC and MOHHOFOA strategies, which indicates that the suggested approach performs well on both measures.

Fig. 6

Comparison for (a) Ranking loss and (b) Hamming loss.

Fig. 7

Error comparison (a) Standard deviation (b) Mean value.

In Fig. 7, the standard deviation and mean value of the error are presented as box plots. The standard deviation measures the variability or spread of the results, while the mean represents their average value. A lower standard deviation indicates less variability, and, because the figure reports error, a lower mean value is also preferable. Based on the figure, the suggested method shows lower variability than the other methods on several datasets, which suggests that it may provide more stable and consistent results.
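As a small illustration of why both statistics are reported, the Python sketch below summarises two sets of repeated runs that have the same mean error but different spread. The error values are hypothetical and chosen only for the example, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarise_runs(errors):
    """Return (mean, sample standard deviation) of the errors from repeated runs."""
    return mean(errors), stdev(errors)


# Hypothetical error values from five independent runs of two methods.
stable_runs = [0.010, 0.012, 0.011, 0.009, 0.013]
unstable_runs = [0.002, 0.020, 0.005, 0.018, 0.010]

stable_mean, stable_std = summarise_runs(stable_runs)
unstable_mean, unstable_std = summarise_runs(unstable_runs)

print(f"stable:   mean={stable_mean:.4f} std={stable_std:.4f}")
print(f"unstable: mean={unstable_mean:.4f} std={unstable_std:.4f}")
```

Both sets average the same error, so the mean alone cannot separate them; the smaller standard deviation of the first set is what marks it as the more consistent method.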

Figure 8 compares the running time of the suggested approach with that of the existing techniques MMFS, B-MOABC and MOHHOFOA. The figure shows that the suggested strategy has a lower running time than all three existing approaches on all 15 datasets. Taken together with the preceding results, the findings indicate that the suggested method can maintain high accuracy even when the number of features is reduced.

Fig. 8

Running time comparison.

5 Conclusion

In this study, a new multi-objective feature selection methodology is proposed for classifying medical datasets. Neighbourhood centroid opposition-based learning (NCOBL) is integrated into the multi-objective optimization-based Flamingo Search Algorithm (MOFSA) to improve classification accuracy, enhance exploration capability in the search space, and reduce the computational cost as the size of the dataset increases. The datasets with the optimally selected features are classified using the weighted K-Nearest Neighbour classifier. The efficacy of the suggested strategy is assessed on fifteen benchmark medical datasets, and its performance is compared with that of existing methods in terms of precision, accuracy, recall, running time, F-measure, hamming loss, ranking loss, standard deviation, mean value error, and size of the selected features. According to the experimental findings, the suggested technique selects the optimal features with an accuracy of about 99%. Despite its good performance, the suggested solution does not employ a dataset balancing strategy; incorporating dataset balancing techniques is expected to enhance its performance further.

References

1. Mandal, Singh P.K., Ijaz M.F., Shafi and Sarkar, A tri-stage wrapper-filter feature selection framework for disease classification, Sensors 21(16) (2021), 5571.

2. Zorarpacı and Özel S.A., A hybrid approach of differential evolution and artificial bee colony for feature selection, Expert Systems with Applications 62 (2016), 91–103.

3. Kabir M.M., Islam M.M. and Murase, A new wrapper feature selection approach using neural network, Neurocomputing 73(16-18) (2010), 3273–3283.

4. Al-Tashi, Md Rais, Abdulkadir S.J., Mirjalili and Alhussian, A review of grey wolf optimizer-based feature selection methods for classification, Evolutionary Machine Learning Techniques (2020), 273–286.

5. Ghosh, Guha, Sarkar and Abraham, A wrapper-filter feature selection technique based on ant colony optimization, Neural Computing and Applications 32(12) (2020), 7839–7857.

6. Nayak S.K., Rout P.K., Jagadev A.K. and Swarnkar, Elitism-based multi-objective differential evolution with extreme learning machine for feature selection: A novel searching technique, Connection Science 30(4) (2018), 362–387.

7. Paul, Jain, Saha and Mathew, Multi-objective PSO based online feature selection for multi-label classification, Knowledge-Based Systems 222 (2021), 106966.

8. Vijayanand, Devaraj and Kannapiran, Intrusion detection system for wireless mesh network using multiple support vector machine classifiers with genetic-algorithm-based feature selection, Computers & Security 77 (2018), 304–314.

9. Almomani, A feature selection model for network intrusion detection system based on PSO, GWO, FFA and GA algorithms, Symmetry 12(6) (2020), 1046.

10. Rostami, Forouzandeh, Berahmand and Soltani, Integration of multi-objective PSO based feature selection and node centrality for medical datasets, Genomics 112(6) (2020), 4370–4384.

11. …, Guo, Wu and Li, A RF-PSO based hybrid feature selection model in intrusion detection system. In 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC) (2018, June), (pp. 795-802). IEEE.

12. Almasoudy F.H., Al-Yaseen W.L. and Idrees A.K., Differential evolution wrapper feature selection for intrusion detection system, Procedia Computer Science 167 (2020), 1230–1239.

13. Zhiheng and Jianhua, Flamingo search algorithm: A new swarm intelligence optimization algorithm, IEEE Access 9 (2021), 88564–88582.

14. Zhou L.Y., Ding L.X., Peng and Qiang X.L., Neighborhood centroid opposition-based particle swarm optimization, Acta Electronica Sinica 45(11) (2017), 2815.

15. Dong, Sun and Ding, A many-objective feature selection for multi-label classification, Knowledge-Based Systems 208 (2020), 106456.

16. Hancer, et al., Pareto front feature selection based on artificial bee colony optimization, Inf Sci 422 (2018), 462–479.

17. Abdollahzadeh and Gharehchopogh F.S., A multiobjective optimization algorithm for feature selection problems, Engineering with Computers (2021), 1–19.

18. Zhang, Gong D.W., Gao X.Z., Tian and Sun X.Y., Binary differential evolution with self-learning for multi-objective feature selection, Information Sciences 507 (2020), 67–85.

19. Sohrabi M.K. and Tajik, Multi-objective feature selection for warfarin dose prediction, Computational Biology and Chemistry 69 (2017), 126–133.

20. Wan, Ma, Zhong, Hu and Zhang, Multiobjective hyperspectral feature selection based on discrete sine cosine algorithm, IEEE Transactions on Geoscience and Remote Sensing 58(5) (2020), 3601–3618.

21. Zhang, Cheng, Shi, Gong D.W. and Zhao, Cost-sensitive feature selection using two-archive multi-objective artificial bee colony algorithm, Expert Systems with Applications 137 (2019), 46–58.

22. Wang and Zheng X-L, A knowledge-guided multi-objective fruit fly optimization algorithm for the multi-skill resource constrained project scheduling problem, Swarm Evolut Comput 38 (2018), 54–63.

23. …, et al., Multi-objective Fruit Fly Optimization Based on Cloud Model. In: 2018 13th World Congress on Intelligent Control and Automation (WCICA) (2018), IEEE.

24. …, He and Zhou, Multi-objective fruit fly optimization algorithm for test point selection. In: 2016 IEEE advanced information management, communicates, electronic and automation control conference (IMCEC). (2016), IEEE.

25. …, et al., A novel hybrid model based on multi-objective Harris hawks optimization algorithm for daily PM2.5 and PM10 forecasting, Appl Soft Comput 96 (2020), 106620.

26. Amoozegar and Minaei-Bidgoli, Optimizing multi-objective PSO based feature selection method using a feature elitism mechanism, Expert Syst Appl 113 (2018), 499–514.

27. Rodrigues, de Albuquerque V.H.C. and Papa J.P., A multiobjective artificial butterfly optimization approach for feature selection, Appl Soft Comput (2020), 106442.

28. Al-Tashi, et al., Binary multi-objective grey wolf optimizer for feature selection in classification, IEEE Access 8 (2020), 106247–106263.

29. Vanschoren, Van Rijn J.N., Bischl and Torgo, OpenML: networked science in machine learning, ACM SIGKDD Explorations Newsletter 15(2) (2014), 49–60.

30. Kanani and Padole, Deep learning to detect skin cancer using google colab, International Journal of Engineering and Advanced Technology Regular Issue 8(6) (2019), 2176–2183.