Abstract
Probabilistic rough sets offer methods that trisect data into positive, negative and boundary regions at optimum (α, β) threshold pairs. The basic methods generate the three regions by optimizing a single quality, such as cost, entropy, impurity, correlation or variance, and thereby determine the best (α, β) pair. The optimization of multiple qualities is significant in real-life applications; however, the joint optimization of different criteria has rarely been studied in probabilistic rough sets. This study conducts multi-objective optimization of uncertainty, impurity and correlation to determine a trisection at optimal (α, β) pairs. To this end, it proposes a hybrid method that combines the Weighted Sum Method and the Artificial Bee Colony algorithm to optimize the thresholds. The results are compared with those of Information-theoretic rough sets and Game-theoretic rough sets. The proposed method performs better in terms of optimal qualities, multiple optimum thresholds, minimal size of boundary regions, and evaluation results. Experiments on the data sets yield optimal (α, β) pairs at which the uncertainty and impurity are minimal, while the correlation at these thresholds remains reasonable. From the application viewpoint, the method reduces the cost of further analysis by minimizing delayed decisions, and maximizes the benefit of the decisions by considering multiple optimized qualities simultaneously.
Introduction
Three-way classification addresses the limitations of two-way classification. In binary classification, decisions are made according to the recorded information about the domain. In real-life scenarios, uncertainty in decision making arises mainly from measurement errors, sampling errors, data entry errors, and so forth. However, a two-way decision model leaves no room for a non-commitment decision [1, 2]. It is also challenging to minimize prediction or misclassification errors in binary classification [3]. Three-way decisions are valuable because they offer the option of a delayed decision whenever there is uncertainty. Besides, they help to reduce misclassification errors by relaxing or adjusting the threshold pair that determines the three-way classification [4].
As per [5], three-way classification has been combined with several generalized set theories, namely rough sets [6, 7], fuzzy sets [8], shadowed sets [9, 10], interval sets [5, 7], rough fuzzy sets [11] and soft sets [12]. Various extended set theories have also been used to interpret three-way decisions in recent years [13–16]. This paper follows rough set models for handling uncertainty in the available information. In three-way decisions, classical rough set theory is extended to quantitative methods, mainly Non-Probabilistic Rough Sets (NPRS) [7] and Probabilistic Rough Sets (PRS) [7, 18]. In PRS [7], a threshold pair (α, β) is used to split the entire data set into three regions, called acceptance, rejection and non-commitment. PRS methods include Decision-Theoretic Rough Sets (DTRS) [19], Game-Theoretic Rough Sets (GTRS) [20], Naïve Bayesian Rough Sets (NBRS) [21], Confirmation-Theoretic Rough Sets (CTRS) [22], Bayesian Rough Sets (BRS) [23], and objective function-based methods built on entropy (Information-Theoretic Rough Sets, ITRS [7]), impurity [24], correlation [25] and variance [26]. In objective function-based methods, the best (α, β) pair is the one at which the objective function value with respect to the three regions is optimum. All of the above objective function-based studies were conducted independently. However, in real-life applications, the optimum point of one quality may not be the optimum point of the others. For example, in [25, 27], the optimum (α, β) pairs for entropy and for the χ² statistic differ for the same probabilistic information system. This motivates the proposed method for applications that demand optimality of more than one quality.
This study considers the simultaneous optimization of more than one quality and, hence, of the resulting regions. Specifically, it jointly optimizes the objective functions of PRS methods based on Shannon entropy [7], the Gini index [24] and the χ² statistic [25]. ITRS introduces objective functions that use Shannon entropy to represent the uncertainty of a trisection and the Gini index to represent its impurity [7, 24]. Similarly, the χ² statistic measures the correlation between the trisection and the classification [25]. This paper optimizes the three objective functions simultaneously on different data sets from the UCI repository [28]. To obtain an optimal pair of thresholds, the Weighted Sum Method (WSM) and an Artificial Bee Colony (ABC)-based framework are employed [29, 30]. The WSM synthesizes information from the three qualities, and this information is optimized within the ABC-based framework. As a result, an optimal threshold pair is derived at which the entropy and impurity are minimized and the χ² statistic is maximized. As an added benefit of the ABC-based framework, the experiment returns more than one pair of optimal thresholds. Moreover, a proper choice of feature selection algorithm improves the results [31]. The Discrimination Frequency Relevance Measure (DFRM) algorithm is used for feature selection [32]. This algorithm selects features on the basis of their discrimination power by calculating the discernibility information of each object pair whose members belong to different classes. This results in a proper grouping of objects while creating equivalence classes and making decisions. The DFRM gives favourable results for the proposed method in terms of the size of the equivalence class structure and the granularity level.
As a whole, this work gives more insight into the three regions in terms of three qualities than existing single objective function-based works. The proposed work is compared with the Basic Methods (BM) and the GTRS model, which are used to optimize entropy, impurity and correlation. Promising results are obtained for the proposed method in terms of optimal qualities, more than one optimum (α, β) pair, reduced size of boundary regions and better evaluation results. However, the proposed method faces the problem of local optima, which sometimes leads to sub-optimal results.
The remainder of this paper is organized as follows. Section 2 starts with the theory behind the DFRM algorithm. Since the proposed method is compared with the BM and the GTRS, the PRS and the BM based on objective functions such as Shannon entropy, the Gini index and the χ² statistic are then explained, followed by the theory of GTRS. The section ends with the basic theory of the ABC algorithm and state-of-the-art multi-objective optimization techniques in PRS. Section 3 presents the evaluation metrics, and Section 4 details the experimental framework. Section 5 presents the experimental results, and Section 6 discusses them. Section 7 concludes the paper.
Background
Feature selection algorithms play a significant role in the generation of an optimal pair of thresholds. This study uses a feature selection algorithm called the DFRM algorithm [32]. As the name indicates, it is a discernibility-based algorithm. This section starts with the theory related to this algorithm.
Discrimination frequency relevance measure (DFRM) algorithm
The algorithm in [32] focuses on the discernibility information between each pair of objects that have different decisions. Each object in a particular class is compared with the objects of other classes by considering the information that differentiates them. To gather this information, a discernibility matrix is constructed with the object pairs as rows and the conditional attributes as columns. The value in a cell of the discernibility matrix indicates whether the corresponding attribute can differentiate the objects in that pair. In [32], the authors generate a discernibility matrix D(i,j) of size P × Q, where each entry of D(i,j) is based on Equation(41).
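The construction of the discernibility matrix can be illustrated with a short Python sketch. It is not the implementation from [32]: the binary encoding of the entries (1 if the attribute separates the pair, 0 otherwise), the column-sum "discrimination frequency", and the names `discernibility_matrix` and `discrimination_frequency` are assumptions made here for illustration, and the toy data are hypothetical.

```python
from itertools import combinations

def discernibility_matrix(objects, decisions):
    """Rows: object pairs with different decisions; columns: attributes.

    An entry is 1 if the attribute takes different values on the pair,
    else 0. The binary encoding is an assumption of this sketch.
    """
    pairs, rows = [], []
    for i, j in combinations(range(len(objects)), 2):
        if decisions[i] != decisions[j]:
            pairs.append((i, j))
            rows.append([int(a != b) for a, b in zip(objects[i], objects[j])])
    return pairs, rows

def discrimination_frequency(rows):
    """Column sums: number of differing-class pairs each attribute separates."""
    if not rows:
        return []
    return [sum(col) for col in zip(*rows)]

# Hypothetical toy data: three objects, two conditional attributes.
objects = [("a", "x"), ("a", "y"), ("b", "x")]
decisions = [0, 1, 1]
pairs, matrix = discernibility_matrix(objects, decisions)
freq = discrimination_frequency(matrix)
```

In this toy example only the pairs (0, 1) and (0, 2) have different decisions, so the matrix has two rows, and each attribute separates exactly one of them.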
In 1990, Yao et al. introduced a probabilistic version of rough set theory, called DTRS [7]. Motivated by a three-way interpretation of the probabilistic positive ($P_\alpha$), negative ($N_\beta$) and boundary ($B_{(\alpha,\beta)}$) regions, Yao proposed in 2012 a theory of three-way decisions based on the actions of acceptance, rejection and non-commitment [5]. In this theory, the three-way classification is defined quantitatively. A pair of thresholds (α, β) splits the entire data set into three regions ($P_\alpha$, $N_\beta$, $B_{(\alpha,\beta)}$) by comparing α and β with the conditional probability, which is based on the equivalence class $[x]_I$ of an object x with respect to an indiscernibility relation I [24, 33]. The conditional probability of x being in class C, where C ⊆ U, given that x is in $[x]_I$, is defined by Equation(2).
In PRS, given the class C and the pair of thresholds (α, β), the entire data set is partitioned into three mutually disjoint regions: the positive region $P_\alpha(C)$, the boundary region $B_{(\alpha,\beta)}(C)$ and the negative region $N_\beta(C)$ [7]. This three-way classification, $\pi_{(\alpha,\beta)}(C)$, is expressed by Equation(4).
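The trisection can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation. It assumes the commonly used PRS convention in which an equivalence class is placed in the positive region when its conditional probability is at least α, in the negative region when it is at most β, and in the boundary region otherwise. The function name `trisect` and the toy data are hypothetical.

```python
from collections import defaultdict

def trisect(objects, in_class, alpha, beta):
    """Split object indices into (positive, boundary, negative) regions.

    objects  : hashable descriptions; equal descriptions are indiscernible
    in_class : 1 if the object belongs to C, else 0
    """
    blocks = defaultdict(list)
    for idx, description in enumerate(objects):
        blocks[description].append(idx)          # equivalence class [x]_I
    pos, bnd, neg = [], [], []
    for members in blocks.values():
        # Conditional probability Pr(C | [x]_I) estimated by relative frequency.
        p = sum(in_class[i] for i in members) / len(members)
        if p >= alpha:
            pos.extend(members)
        elif p <= beta:
            neg.extend(members)
        else:
            bnd.extend(members)
    return pos, bnd, neg

# Hypothetical data: three equivalence classes e1, e2, e3.
objects = ["e1", "e1", "e1", "e1", "e2", "e2", "e3", "e3", "e3"]
in_class = [1, 1, 1, 0, 1, 0, 0, 0, 0]
pos, bnd, neg = trisect(objects, in_class, alpha=0.7, beta=0.2)
```

Here the class e1 has conditional probability 0.75 and is accepted, e3 has probability 0 and is rejected, and e2 has probability 0.5 and falls into the boundary region.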
Shannon entropy is a frequently used approach to measure the uncertainty in a data set. Let $\pi_C = \{C, C^c\}$ be the partition of the data set based on the class attribute, and let $\Pr_C = \{\Pr(C), \Pr(C^c)\}$ give the corresponding probabilities of the partition. As per this method, the initial entropy $H(\pi_C)$ of the data set based on the partition $\pi_C$ is computed by Equation(52).
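$$
\begin{aligned}
H(\pi_C \mid B_{(\alpha,\beta)}(C)) = {} & -\Pr(C \mid B_{(\alpha,\beta)}(C)) \log \Pr(C \mid B_{(\alpha,\beta)}(C)) \\
& -\Pr(C^c \mid B_{(\alpha,\beta)}(C)) \log \Pr(C^c \mid B_{(\alpha,\beta)}(C))
\end{aligned}
$$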
The conditional probabilities involved in Equations(11), (12) and (13) are computed by Equations(14), (15) and (16), respectively.
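$$
\begin{aligned}
H(\pi_C \mid \pi_{(\alpha,\beta)}(C)) = {} & \Pr(P_\alpha(C))\, H(\pi_C \mid P_\alpha(C)) + \Pr(N_\beta(C))\, H(\pi_C \mid N_\beta(C)) \\
& + \Pr(B_{(\alpha,\beta)}(C))\, H(\pi_C \mid B_{(\alpha,\beta)}(C))
\end{aligned}
$$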
A low value of the total entropy means that the quality of the trisection is high, and hence the study tries to minimize the conditional entropy [1, 7]. As per Equation(8), the optimum entropy can be derived by using Equation(18) which minimizes the objective function for total uncertainty.
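The total conditional entropy of a trisection can be computed with a few lines of Python. The sketch below assumes base-2 logarithms (the text does not state the base) and estimates the probabilities by relative frequencies. The function names and the region counts are hypothetical.

```python
import math

def binary_entropy(p):
    """Shannon entropy (base 2) of the partition {C, C^c} with Pr(C) = p."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)

def trisection_entropy(regions):
    """Region-probability-weighted entropy of a trisection.

    regions: (size, count_in_C) for the positive, negative and boundary
    regions. Empty regions contribute nothing to the sum.
    """
    total = sum(size for size, _ in regions)
    h = 0.0
    for size, in_c in regions:
        if size:
            h += (size / total) * binary_entropy(in_c / size)
    return h

# Hypothetical trisection of 10 objects: pure positive and negative regions,
# and a boundary region of two objects that is split evenly between C and C^c.
h = trisection_entropy([(4, 4), (4, 0), (2, 1)])
```

In this example only the boundary region is uncertain, so the total is its entropy (1 bit) weighted by its probability (0.2).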
The Gini index is acknowledged as the measure of impurity that varies between 0 and 1. It is interpreted as the degree to which a particular attribute misclassifies a new instance based on the distribution of objects in different classes when it is randomly chosen [34]. There is no impurity when all the objects are classified into a single class, i.e., the Gini index is zero. Similarly, if the objects are randomly classified into different classes, then the impurity is high. Specifically, the Gini index value will be 0.5 if the objects are equally distributed across the two classes C and $C^c$. The impurity (G) is measured by Equation(19).
In machine learning, the Gini index is used in decision trees to find the best splits by measuring the impurity of an attribute. In PRS, the Gini index can be used to measure the impurities in the three regions. As per Equation(20), $\Pr(C \mid P_\alpha(C))$ and $\Pr(C^c \mid P_\alpha(C))$ are the conditional probabilities that provide probabilistic information about C and $C^c$, given the positive region as the prior information. As per [24], the Gini index or impurity $G(P_\alpha(C))$ of the positive region $P_\alpha(C)$ for a certain pair of thresholds (α, β) is defined by Equation(20).
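A small Python sketch illustrates the impurity of a single region and of a whole trisection. It assumes the standard two-class Gini form, 1 minus the sum of squared class probabilities, which matches the values 0 and 0.5 quoted above. Weighting the regional impurities by the region probabilities follows the description of the overall quality in this paper. The function names and counts are hypothetical.

```python
def gini(p):
    """Gini impurity of a two-class distribution with Pr(C) = p."""
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def trisection_gini(regions):
    """Region-probability-weighted impurity of a trisection.

    regions: (size, count_in_C) for the positive, negative and boundary
    regions. Empty regions are skipped.
    """
    total = sum(size for size, _ in regions)
    return sum((size / total) * gini(in_c / size)
               for size, in_c in regions if size)

# Hypothetical trisection: pure positive and negative regions, mixed boundary.
g = trisection_gini([(4, 4), (4, 0), (2, 1)])
```

The pure regions have zero impurity, so the total equals the boundary impurity (0.5) weighted by the boundary probability (0.2).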
The χ² statistic tests the independence between two variables based on the actual ($d_{obs}$) and expected ($d_{exp}$) data. If the observed values and the expected values of a variable follow the same distribution, then the value of the χ² statistic will be zero [35]. Equation(25) computes the χ² statistic.
To derive the objective function for this method, consider a finite non-empty set of objects U that can be partitioned into a positive class C and a negative class $C^c$ such that $|C \cup C^c| = |U| = n$. The χ² statistics of the three regions are computed from the values in the contingency table given as Table 1 by Cong Gao and Yiyu Yao in [25], which describes the trisection of C and $C^c$ for an arbitrary (α, β) pair. Equations(26), (27) and (28) give the χ² statistics of the three regions.
With the help of Equation(6), the overall quality of the three regions is calculated by Equation(29) as the linear combination of the χ² statistics of the three regions, weighted by the corresponding region probabilities [25].
When the sample sizes or the dimensions of the table differ, the χ² statistic and its significance may not give an accurate idea of the strength of the relationship between the two variables. To address this issue, this work replaces the χ² statistic with the phi coefficient when determining the pair of thresholds [36]. The phi coefficient (φ) is a measure of association that adjusts the χ² statistic by the sample size. It is usually less than one. φ is defined by Equation(31).
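The relation between the χ² statistic and the phi coefficient can be sketched in Python. The sketch uses the standard Pearson form, the sum of (observed − expected)² / expected over the cells, and the standard definition φ = sqrt(χ² / n). The 2×2 layout (inside or outside a region versus C or $C^c$) and the handling of degenerate tables are assumptions of this sketch, not taken from [25]. The function names and counts are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    if 0 in rows or 0 in cols:
        return 0.0          # degenerate table; returning 0 is an assumption
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

def phi_coefficient(a, b, c, d):
    """Phi: the chi-square statistic adjusted by the sample size n."""
    return math.sqrt(chi_square_2x2(a, b, c, d) / (a + b + c + d))

# Hypothetical table: rows = inside / outside a region, columns = C / C^c.
chi2 = chi_square_2x2(4, 0, 1, 5)
phi = phi_coefficient(4, 0, 1, 5)
```

For this table the expected counts are 2, 2, 3 and 3, so χ² = 20/3 and φ ≈ 0.816. A perfectly associated table such as [[5, 0], [0, 5]] gives φ = 1.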
GTRS is a probabilistic method that uses game theory to establish a trade-off between two regions, the immediate region and the deferred region [4, 38]. The positive region and negative region are collectively termed the immediate region, and the boundary region is called the deferred region. The game includes players, strategies and payoff values. The specified regions are the players involved in the game and follow a strategy in terms of the varying threshold levels. The values produced by the corresponding regions are called the payoff values [38]. The players compete with each other under a strategy and stop the game at the Nash equilibrium state between the payoff values [37].
The proposed method is compared with the GTRS method. In the GTRS method, the two regions compete with each other to optimize the qualities like uncertainty, impurity and correlation. The specific threshold is determined for each quality when it is balanced between these two regions. In this experiment, the two regions follow predefined strategies in terms of varying (α, β) values which contain { α, α decremented by 0.05, α decremented by 0.1 } and { β, β incremented by 0.05, β incremented by 0.1 } with an initial threshold (1, 0). Also, the algorithm stops at one of the following conditions, i.e., Pr(C) < Pr(C|P(C)) or |B(α,β)(C) | = 0 or qualities are balanced between two regions [37].
ABC-based optimization
Swarm intelligence algorithms are commonly used for optimization problems since they can return more than one optimized result. These algorithms include Genetic Algorithms (GA), Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), Differential Evolution (DE), Artificial Bee Colony (ABC), Glowworm Swarm Optimization (GSO), and Cuckoo Search Algorithm (CSA) [30, 39–44]. In the proposed method, ABC-based multi-objective optimization is adapted to produce optimal pairs of thresholds. This algorithm imitates the foraging strategy of honey bees to identify the best food sources [30]. To this end, labour is divided among three categories of bees, which correspond to the employed bee phase, the onlooker bee phase and the scout bee phase. The simplified ABC algorithm is specified in Algorithm 1 [30].
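Algorithm 1: Simplified ABC algorithm
Input: Initial population
1. Initialization
2. Find the fitness values and the global best
3. Repeat until the specified condition is satisfied:
   a. Employed bee phase
   b. Onlooker bee phase
   c. Scout bee phase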
From the primary collection of food sources, new food sources are generated through the initialization process. The fitness value for each food source is calculated to identify the quality of the food source. Then, the best food source is preserved. In the employed bee phase, new food sources are explored on the basis of the neighbourhood, and they are updated for a better fitness value. In the onlooker bee phase, each onlooker is assigned based on the fitness probability. The food source with the highest fitness probability is the first food source that is selected for the onlooker bee phase. For exploiting the new food sources, the onlooker bee phase follows the same procedure as in the employed bee phase. In the next phase, the exhausted food sources are replaced with the new food sources by scout bees. These three phases work iteratively until a specified condition is satisfied [30].
Multi-objective optimization in PRS
In multi-criteria decision making, there are different methods such as the WSM, the Weighted Product Model, the Analytic Hierarchy Process and so forth [45–48]. Among these, the WSM is the classical and simplest method. It is applied in the proposed method to synthesize information from the three qualities, even though finding Pareto-optimal solutions with it is challenging for non-convex problems.
To obtain an optimal threshold pair in PRS, multiple qualities associated with the data are optimized, which subsequently influences the consistency of the model. In PRS, there are some significant works on learning thresholds by optimizing multiple objectives. One contribution in DTRS is the multi-objective optimization of decision costs and the size of the boundary region from a trade-off perspective using a genetic algorithm [49]. In another work, the attribute reduct is obtained by optimizing multiple criteria such as the positive region, decision costs and mutual information in DTRS [50]. A neighbourhood-based DTRS was then proposed with multi-objective optimization with respect to three criteria, namely decreasing the size of the boundary region, decreasing the total decision cost and increasing the size of the neighbourhood, from a trade-off perspective using an evolutionary algorithm and Pareto-optimal solutions [51]. Later, Fan Jia et al. proposed a novel loss function derived from the evaluation values of criteria with the support of relative loss functions and inverse loss functions [52]. Moreover, an outranking relation-based rough-fuzzy model for multi-criteria decision making was developed in 2020 [53]. A unified model for user-oriented attribute importance analysis, based on both quantitative and qualitative approaches in three-way decisions, was presented in [54]. In 2021, Chengjun Shi et al. proposed a tri-level framework for multi-criteria decision making [48]. Recently, Yiyu Yao et al. proposed a method called 3RD for multi-criteria decision making by using the theory of three-way decisions, various multi-criteria decision methods, and prospect theory [55]. In contrast to the above methods, this work proposes the multi-objective optimization of entropy, impurity and correlation using the WSM and the ABC algorithm.
Evaluation metrics
In [1, 24], the authors provided different metrics to measure the quality of different regions in PRS. The below-shown metrics are used to measure the quality of three regions in C.
Correct-Acceptance Rate:
This section includes the explanation of data trisection in PRS and the corresponding quality computation. In order to synthesize information from these qualities, the objective function is needed. Therefore, the formation of the objective function is explained in the next part. Then, the objective function value is optimized within the framework of ABC, which is described with the help of Algorithm 1.
Determining qualities of trisection in PRS
The factors considered to determine the quality of the trisection are conditional entropy, impurity and correlation with the selected classes. Shannon entropy is employed in the experiment to determine the partition with minimum conditional entropy, and the Gini index is used to determine the partition with minimum impurity. Similarly, to identify a partition with maximum correlation, the χ² statistic is used as the optimization function. To form the trisection and to study the quality of the resulting data set, a set of five (α, β) pairs is considered initially. To find various optimal (α, β) pairs in this experiment, the ABC-based framework is employed. For each trisection, the corresponding qualities are determined as specified in Sections 2.3.1, 2.3.2 and 2.3.3. The information from all three qualities is aggregated using an objective function through the WSM.
Objective function
One of the simplest methods of multi-objective optimization is the weighted sum [29]. Suppose $F_1(x), F_2(x), \ldots, F_n(x)$ are the n objective functions to be optimized together, with weighting components $W_1, W_2, \ldots, W_n$, respectively. Then the weighted sum is expressed by Equation(44).
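The weighted sum can be sketched in Python as below. The aggregation itself is the generic form, the sum of $W_i F_i(x)$. The quality values, the equal weights, and the use of 1 − φ to turn the maximized correlation into a minimized term are assumptions made for illustration. The source does not state how the three qualities are oriented or weighted.

```python
def weighted_sum(values, weights):
    """Weighted sum of objective values: sum_i W_i * F_i(x)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for w, v in zip(weights, values))

# Hypothetical quality values of one trisection, each within [0, 1].
entropy, impurity, phi = 0.30, 0.20, 0.60

# Entropy and impurity are minimized while correlation is maximized, so the
# correlation term is flipped to (1 - phi) here. This conversion and the
# equal weights are assumptions of this sketch.
objective = weighted_sum([entropy, impurity, 1.0 - phi], [1 / 3, 1 / 3, 1 / 3])
```

With these numbers, the single objective value is the mean of 0.30, 0.20 and 0.40, which is 0.30. A lower value indicates a better trisection under this convention.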
The ABC algorithm is a commonly used method for multi-objective optimization in real-world problems due to the optimized results with a common framework [56–58]. This framework includes sequential execution of different phases, namely initialization, employed bee phase, onlooker bee phase, and scout bee phase [30]. The execution follows an iterative strategy until a specified condition is satisfied.
In this work, to optimize the threshold, a predefined collection of five threshold pairs $(\alpha_i, \beta_i)$, i = 1, 2, ..., 5, is identified as the initial population. In the initialization phase, these predefined threshold pairs are modified by incorporating random values rand1 (0.0 ≤ rand1 ≤ 1.0) and rand2 (-1.0 ≤ rand2 ≤ 0.0), respectively, to generate new threshold values $\alpha_n$ and $\beta_n$, n = 1, 2, ..., 5, using Equation(11). Subsequently, the $\alpha_n$ and $\beta_n$ values are generated within the limits $0.5 \le \alpha_n \le 1.0$ and $0 \le \beta_n < 0.5$, respectively.
In the employed bee phase, the neighbouring exploration of each threshold pair is performed based on the initialized threshold pairs $(\alpha_n, \beta_n)$. After each exploration, the best threshold pair is preserved as the effective threshold pair in terms of its fitness value. The results of the exploration are the current best five pairs $(\alpha_e, \beta_e)$, e = 1, 2, ..., 5, and the corresponding fitness values. Mathematically, the neighbourhood exploration is modelled with the assistance of Equation(50) [30].
By considering the near-optimal solutions generated by the employed bees, onlooker bees start the exploitation of threshold pairs by ranking the threshold pairs on the basis of the fitness probability. The fitness probability (FP) is expressed using Equation(51) [30].
During the exploration and exploitation phases, if a particular threshold pair has not been updated for more trials than the limit allows, the scout bees replace it with a new threshold pair by using Equation(48). The limit value (l) is determined by Equation(52) [30]. Again, the fitness value and the corresponding threshold pair are updated for the global best.
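Algorithm 2: Workflow of the proposed ABC-based optimization
Input: Predefined set of (α, β) pairs
Initialization phase:
   Initialize the (α, β) pairs
   Find the fitness value for each (α, β) pair
   Determine the global best
Employed bee phase, for each (α, β) pair:
   Generate a new (α, β) pair and its fitness value
   If the new pair is better, replace the old pair and set the counter to zero
   Otherwise, increment the counter
   Update the global best
Onlooker bee phase:
   Find the FP values
   Select (α, β) pairs in decreasing order of FP
   Generate new (α, β) pairs
   Find the fitness value
   If the new pair is better, replace the old pair and set the counter to zero
   Otherwise, increment the counter
   Update the global best
Scout bee phase, for each (α, β) pair whose counter exceeds the limit:
   Initialize the (α, β) pair
   Determine the fitness value
   Update the global best
   Go to the next (α, β) pair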
At the outset, each data set passes through the preprocessing step and enters into the ABC-optimization procedure. This procedure includes different sequential phases, which work iteratively and produce the optimal (α, β) pairs. The phases start with a predefined set of (α, β) pairs. All these threshold pairs are initialized using Equation(48). For each (α, β) pair, the data is trisected as explained in Section 4.2 and the objective function is formulated based on the trisection as explained in Section 4.3. From the objective function, fitness value is determined using Equation(49). The best fitness value and the corresponding pairs of (α, β) values are stored as the global best. In the next phase, for each (α, β) pair, the employed bee phase explores the new (α, β) pair using Equation(50). Then, the fitness value is computed, and the corresponding old threshold pair is replaced if the new threshold pair offers a better fitness value. At the end of this phase, the best fitness value is preserved along with the (α, β) pair. In the next phase, the onlooker bee phase selects each (α, β) pair based on the FP, which is determined via Equation(51). This phase exploits new (α, β) pairs and follows the same procedure of the employed bee phase to compare and update (α, β) pairs and the corresponding fitness values. Consequently, this phase determines the global best. The last phase, the scout bee phase, replaces the non-updated (α, β) pairs based on the limit, which is determined by Equation(52). For more readability, Algorithm 2 displays the entire workflow.
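The workflow can be sketched as a compact Python loop. This is an illustrative sketch under stated assumptions, not the authors' implementation. The fitness form 1 / (1 + f), the neighbourhood update $v = x_i + \varphi (x_i - x_k)$ and the fitness probability as a normalized fitness are the standard forms from the ABC literature and are assumed to correspond to Equations(49) to (51). The toy objective stands in for the weighted sum of the three qualities, the initial pairs are hypothetical, and β is capped at 0.499 to respect β < 0.5.

```python
import random

def fitness(f):
    """Standard ABC fitness for a minimized objective value f (assumed form)."""
    return 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)

def clip(alpha, beta):
    """Keep thresholds inside 0.5 <= alpha <= 1.0 and 0 <= beta < 0.5."""
    return min(max(alpha, 0.5), 1.0), min(max(beta, 0.0), 0.499)

def neighbour(pop, i, rng):
    """v = x_i + phi * (x_i - x_k) on one randomly chosen coordinate."""
    k = rng.choice([j for j in range(len(pop)) if j != i])
    d = rng.randrange(2)
    cand = list(pop[i])
    cand[d] += rng.uniform(-1.0, 1.0) * (pop[i][d] - pop[k][d])
    return clip(*cand)

def abc_thresholds(objective, pop, iterations=25, limit=5, seed=0):
    rng = random.Random(seed)
    pop = [clip(a, b) for a, b in pop]
    fit = [fitness(objective(a, b)) for a, b in pop]
    trials = [0] * len(pop)
    best_i = max(range(len(pop)), key=fit.__getitem__)
    best, best_fit = pop[best_i], fit[best_i]

    def try_update(i):
        nonlocal best, best_fit
        cand = neighbour(pop, i, rng)
        cand_fit = fitness(objective(*cand))
        if cand_fit > fit[i]:
            pop[i], fit[i], trials[i] = cand, cand_fit, 0  # counter gets zero
        else:
            trials[i] += 1                                 # increment counter
        if fit[i] > best_fit:
            best, best_fit = pop[i], fit[i]                # update global best

    for _ in range(iterations):
        for i in range(len(pop)):                          # employed bee phase
            try_update(i)
        total = sum(fit)
        fp = [f / total for f in fit]                      # fitness probability
        for i in sorted(range(len(pop)), key=fp.__getitem__, reverse=True):
            try_update(i)                                  # onlooker bee phase
        for i in range(len(pop)):                          # scout bee phase
            if trials[i] > limit:
                pop[i] = (rng.uniform(0.5, 1.0), rng.uniform(0.0, 0.499))
                fit[i], trials[i] = fitness(objective(*pop[i])), 0
                if fit[i] > best_fit:
                    best, best_fit = pop[i], fit[i]
    return best, best_fit

# Toy objective with its minimum at (0.75, 0.25); a stand-in for the OF.
def toy_objective(alpha, beta):
    return (alpha - 0.75) ** 2 + (beta - 0.25) ** 2

initial = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3), (1.0, 0.0), (0.5, 0.4)]
best, best_fit = abc_thresholds(toy_objective, initial)
```

The onlooker phase visits pairs in decreasing order of FP, as in Algorithm 2, rather than using roulette-wheel selection. Because the global best is stored separately, replacing an exhausted pair in the scout phase never loses the best threshold pair found so far.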
Experimental results
This work proposes a method for optimal trisection with minimum uncertainty and impurity and maximum correlation between the classification and the trisection. To substantiate the proposed method, the experiment is conducted using different data sets from the UCI machine learning repository [28]. They are Monks1 (M1), Monks2 (M2), Monks3 (M3), Credit Approval (CA), Congressional Voting Records (CVR), Blood Transfusion Service Center (BTSC) [59], Tic-Tac-Toe Endgame Data Set (TTTE) and Immunotherapy (IMT) [60, 61]. The details about the data sets are shown in Table 1.
Table 1: Description of data sets
Table 2: Initial α and β values
Before the data are processed using Algorithm 2, the necessary preprocessing techniques are applied. In this work, discretization and numeric-to-nominal conversion are applied to the data sets with the help of WEKA [62]. Features are then selected using DFRM with the half selection strategy [32].
Some parameters affect the behaviour of the Multi-Objective Optimization Method (MOOM), namely the number of iterations, the size of the population and the limit l. A maximum of 25 iterations is set for the algorithm since further iterations do not improve the results for the specified data sets. The population size and l are both 5. The initial population of threshold pairs for the ABC-based framework is displayed in Table 2. To maintain the population size, the initial population is created without the α value 0.6. Also, l is calculated as per Equation(52).
The results of MOOM are compared with the results obtained through the BM, as specified in Section 2.3, and the GTRS method. For each data set, Table 3 shows the experimental results in terms of the three quality levels using BM, GTRS and MOOM. Tables 4 and 5 display the corresponding (α, β) levels for BM and GTRS, respectively, and Table 6 shows the optimal (α, β) levels for MOOM. Since ABC-based optimization returns different optimal threshold values in different runs, for the sake of brevity, Table 6 displays ten different optimal (α, β) values for each data set. The region sizes are shown in Table 7. Finally, the evaluation results for BM, GTRS and MOOM are displayed in Tables 8, 9 and 10, respectively. If the region size is zero, the corresponding result is denoted by a dash (-).
Table 3: Experimental results of three qualities
Table 4: Different optimal threshold levels of BM
Table 5: Different optimal threshold levels of GTRS
Table 6: Optimal (α, β) values obtained from MOOM
Table 7: The size of the regions
Table 8: Experimental evaluation results of BM
Table 9: Experimental evaluation results of GTRS
Table 10: Experimental evaluation results of MOOM
This section observes the results of each data set and confirms the advantage of using multi-objective optimization of different qualities.
M1
In 1991, a group of researchers compared a set of learning algorithms to find the optimal one. Three problems were derived, termed the Monk’s problems. The first problem, Monks1, is based on a logical formula that produces binary outcomes for the target concept of the data set [28]. The proposed work focuses on the optimal trisection of M1 in terms of the three qualities. In BM, each quality is optimal at a different (α, β) pair. However, when the three qualities are optimized simultaneously, MOOM is a better option than the BM and GTRS methods. MOOM gives results that are very close to those produced by the BM. Also, MOOM gives a different set of (α, β) pairs at each run, as displayed in Table 6. In terms of region sizes, the boundary region is moderate in MOOM. Also, the evaluation metrics give good results when compared with the existing methods.
M2
In the second Monk’s problem, the target concept is that exactly two of the six attributes take their first value [28]. The optimum quality levels of the trisected M2 are shown in Table 3. The three qualities are optimum in BM; however, MOOM produces results that are very close to those of BM. The threshold levels are the same for both entropy and impurity in BM as well as in GTRS. Moreover, MOOM produces threshold pairs with more variation in the β values. The size of the boundary region is the same as that obtained by BM for entropy and impurity, and the evaluation metrics also give the same results.
M3
In Monk’s problem 3, the decision value is based on the binary outcome of a logical formula, and 5% noise is added to the training set. In this experiment, all three qualities are optimum in BM, and MOOM produces results that are very close to those of BM. The size of the boundary region in MOOM is relatively smaller than in BM and GTRS. MOOM also gives better evaluation results: although its accuracy rate is not the highest, it provides the best coverage rate.
CA
This data set includes details about credit card applications and the corresponding decisions [28]. The qualities of the trisection of this data set are optimum in BM, whereas the other methods also produce very close results. In BM, each quality is optimum at a different threshold pair. Meanwhile, MOOM returns several threshold pairs at which the three qualities are jointly optimized. MOOM also yields smaller boundary regions than the other methods. Moreover, the evaluation results of MOOM are the same as those of BM for entropy, since MOOM gives the same entropy level as BM.
CVR
The data set includes the United States Congressional Voting Records, where each record has 16 attributes with a decision of democrat or republican [28]. Here, the quality levels as well as the threshold levels are almost the same for BM and MOOM. Therefore, the region sizes and evaluation metrics are also the same for both.
BTSC
The BTSC data set describes randomly selected blood donors from a donor database, with a decision indicating whether a person gave blood in March 2007 [59]. The results for this data set are the same for MOOM and for BM with entropy and impurity.
TTTE
The data set contains the complete set of possible board configurations at the end of a Tic-Tac-Toe game. The decision label indicates whether or not a player wins. In this experiment, the optimum qualities are produced by BM. However, MOOM shows comparatively better evaluation results than BM.
IMT
IMT is a small data set that includes different features collected for immunotherapy. The target concept is the patient’s response to the treatment [60, 61]. The peculiarity of this data set is that the three qualities are optimum at the same threshold levels. Therefore, the multi-objective optimization of these three qualities is easy to obtain. As expected, the evaluation metrics are the same for BM, GTRS and MOOM.
In PRS, there are various optimization methods to enhance the quality of the generated trisection. This enables us to interpret the trisection from different perspectives. However, the optimum threshold level for one quality need not be the optimum level for another quality. This is where multi-objective optimization methods become important. That is, if a problem demands the simultaneous optimization of multiple qualities of a trisection, the proposed method offers a solution. In this method, qualities such as uncertainty, impurity and correlation are optimized in two main stages, the WSM and the ABC-based optimization. The former derives a single objective function (OF) value from these qualities, and the optimized value is determined through the iteration mechanism established by the latter. These optimized results are verified by the various evaluation metrics. The experiment is best explained on the basis of these two stages.
The weighted sum method is the simplest method of multi-objective optimization. It therefore makes it easy to combine different qualities into a single value for analysis. This value acts as the OF value in the fitness value calculation. Since the ABC algorithm provides a framework for optimizing the fitness value, it is used to find the optimal pair of thresholds efficiently. Moreover, the proposed method can produce more than one optimized result, which is a capability of the ABC algorithm. In addition, the exploration and exploitation mechanisms that generate new threshold pairs yield results from an extended search space beyond the initial one.
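As a rough illustration of this first stage, the following Python sketch aggregates three quality values into one OF value and converts it to a fitness value. It rests on several assumptions that are not taken from the paper: every quality is expressed as a cost in [0, 1] where lower is better (for example, 1 minus the correlation), the weights are arbitrary placeholders, and the fitness transform is the one commonly used in the standard ABC algorithm for minimization problems. The names `weighted_sum` and `abc_fitness` are hypothetical.

```python
def weighted_sum(qualities, weights):
    """Aggregate quality values (assumed in [0, 1], lower is better) into one OF value."""
    if len(qualities) != len(weights):
        raise ValueError("qualities and weights must have the same length")
    return sum(w * q for w, q in zip(weights, qualities))


def abc_fitness(of_value):
    """Fitness transform commonly used in standard ABC for minimization."""
    if of_value >= 0:
        return 1.0 / (1.0 + of_value)
    return 1.0 + abs(of_value)


# Hypothetical values for one (alpha, beta) candidate:
# entropy, impurity, and 1 - correlation (all assumed, not from the paper).
qualities = [0.25, 0.5, 0.75]
weights = [0.5, 0.25, 0.25]  # placeholder weights, not the paper's choice

of_value = weighted_sum(qualities, weights)
fitness = abc_fitness(of_value)
print(of_value, fitness)
```

With this transform, a smaller OF value gives a larger fitness, so maximizing the fitness corresponds to minimizing the aggregated cost.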
For each candidate in the population, the three qualities are determined. Each quality produces a result (r) within the limit 0 ≤ r ≤ 1. These quantities are multiplied by the corresponding region probabilities and summed to form a single value, the OF. That is, the (α, β) values determine the OF. Based on the OF, the fitness value is calculated and updated in the different phases of the ABC algorithm. The three phases of the ABC algorithm explore different (α, β) values to optimize the fitness value and fix the thresholds at an optimum fitness value.
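The three phases can be sketched in Python as follows. This is a minimal, generic ABC loop under stated assumptions rather than the implementation used in the experiments: `toy_of` is a hypothetical stand-in for the real OF (a smooth function with its minimum at (0.7, 0.3)), candidates are restricted to 0 ≤ β < α ≤ 1, and the parameter names and defaults (`n_sources`, `limit`, `cycles`, `seed`) are illustrative choices.

```python
import random


def toy_of(alpha, beta):
    """Hypothetical stand-in for the OF; its minimum lies at (0.7, 0.3)."""
    return (alpha - 0.7) ** 2 + (beta - 0.3) ** 2


def fitness(of_value):
    """Standard ABC fitness transform for minimization."""
    return 1.0 / (1.0 + of_value) if of_value >= 0 else 1.0 + abs(of_value)


def random_pair(rng):
    """Draw a threshold pair with 0 <= beta < alpha <= 1."""
    a, b = rng.random(), rng.random()
    while a == b:
        b = rng.random()
    return (max(a, b), min(a, b))


def neighbour(pair, partner, rng):
    """Perturb one coordinate relative to a partner; reject if beta >= alpha."""
    cand = list(pair)
    j = rng.randrange(2)
    phi = rng.uniform(-1.0, 1.0)
    cand[j] = min(1.0, max(0.0, pair[j] + phi * (pair[j] - partner[j])))
    return tuple(cand) if cand[1] < cand[0] else pair


def abc_optimize(objective, n_sources=10, limit=20, cycles=200, seed=0):
    rng = random.Random(seed)
    sources = [random_pair(rng) for _ in range(n_sources)]
    fits = [fitness(objective(*s)) for s in sources]
    trials = [0] * n_sources
    best = max(zip(fits, sources))

    def try_improve(i):
        # Greedy selection between the current source and a neighbour.
        k = rng.choice([x for x in range(n_sources) if x != i])
        cand = neighbour(sources[i], sources[k], rng)
        f = fitness(objective(*cand))
        if f > fits[i]:
            sources[i], fits[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):      # employed bee phase
            try_improve(i)
        for _ in range(n_sources):      # onlooker phase: fitness-proportional choice
            i = rng.choices(range(n_sources), weights=fits)[0]
            try_improve(i)
        best = max(best, max(zip(fits, sources)))  # memorize before abandoning
        for i in range(n_sources):      # scout phase: replace exhausted sources
            if trials[i] > limit:
                sources[i] = random_pair(rng)
                fits[i] = fitness(objective(*sources[i]))
                trials[i] = 0
    return max(best, max(zip(fits, sources)))  # (fitness, (alpha, beta))


best_fit, (alpha, beta) = abc_optimize(toy_of)
print(best_fit, alpha, beta)
```

The scout phase is what lets the search leave the initial candidates behind: a source that fails to improve for more than `limit` trials is replaced by a fresh random pair, while the best pair found so far is kept in memory.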
The simplicity of the WSM and the efficiency of the ABC-based framework provide several benefits in the proposed work. Firstly, the presence of a common framework makes it easy to adopt different aggregation methods. It also helps to generate results from the extended search space. In addition, the ABC-based framework is capable of producing more than one optimized result, as in Table 7. Moreover, the results of MOOM are very close to the optimal results of BM. The GTRS model does not perform well in this experiment, since the predefined strategy employed and the stopping criterion are not favourable to this problem. Overall, the proposed method gives better results in terms of multi-objective optimization of PRS.
Since there is no trade-off among the qualities considered in this study, the results are not balanced as they are in a GTRS method. The results show that entropy and impurity are mostly optimal at one point in the search space, whereas the optimal correlation lies far from this point. Therefore, at the time of decision making, entropy and impurity give better results than correlation. Another observation concerns the ABC-based framework. Classical ABC algorithms usually face the problem of local optimization. Consequently, the algorithm occasionally results in sub-optimal qualities. Several global optimization methods exist for the ABC algorithm [63–67]. For the sake of brevity, the proposed work concentrates on the benefits of the ABC-based framework.
Conclusion
There are different models under PRS for optimizing qualities such as cost, entropy, impurity, correlation and variance individually. However, experiments have rarely addressed the optimization of more than one of these qualities from an application point of view. This paper addresses this problem with a hybrid model that involves the WSM and an ABC-based framework. The proposed method optimizes entropy, impurity and correlation simultaneously. The WSM derives the information about the three qualities, and the ABC-based framework optimizes the derived information. The results are compared with BM and GTRS. The proposed method is more efficient in terms of optimal qualities, multiple optimum (α, β) pairs, reduced size of boundary regions and better evaluation results. However, the actual optimal point of correlation lies away from that of entropy and impurity. Also, the proposed method faces the local optimization problem, which sometimes results in suboptimal qualities. Future work will concentrate on better aggregation methods than the WSM for synthesizing information from the three qualities. As a further direction, the results can be improved by adopting a global optimization framework to avoid becoming trapped in a local optimum.
Declaration
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
