An artificial bee colony-based framework for multi-objective optimization of three-way decisions with probabilistic rough sets

Abstract

Probabilistic rough sets offer methods that trisect data into positive, negative and boundary regions using an optimum (α, β) threshold pair. The basic methods generate the three regions from a single quality, such as cost, entropy, impurity, correlation or variance, and select the best (α, β) pair for that quality. Optimizing multiple qualities together is significant in real-life applications; however, the optimization of different criteria together has rarely been studied in probabilistic rough sets. This study conducts multi-objective optimization of uncertainty, impurity and correlation to determine a trisection at optimal (α, β) pairs. To that end, it proposes a hybrid method that combines the Weighted Sum Method and the Artificial Bee Colony algorithm to optimize the thresholds. The results are compared with Information-theoretic rough sets and Game-theoretic rough sets. The proposed method performs better in terms of optimal qualities, multiple optimum thresholds, minimal size of boundary regions, and evaluation results. On experimental data sets, optimal (α, β) pairs are obtained at which the uncertainty and impurity are at their minima, while the correlation at these thresholds is reasonable. From the application viewpoint, the method reduces the cost of further analysis by minimizing delayed decisions and maximizes the benefit of optimal decisions by considering multiple optimized qualities simultaneously.

Keywords

Multi-objective optimization; probabilistic rough sets; artificial bee colony algorithm; entropy; Gini index

1 Introduction

Three-way classification addresses the limitations of two-way classification. In binary classification, decisions are made according to the recorded information about the domain. In real-life scenarios, uncertainty in decision making arises mainly from measurement errors, sampling errors, data entry errors, and so forth. However, a two-way decision model leaves no room for a non-commitment decision [1, 2]. It is also challenging to minimize prediction or misclassification errors in binary classification [3]. Three-way decisions are important because they offer the option of a delayed decision when there is uncertainty in decision making. Besides, they help to reduce misclassification errors by relaxing or adjusting the threshold pair that determines the three-way classification [4].

As per [5], three-way classification blends with several generalized sets, namely rough sets [6, 7], fuzzy sets [8], shadowed sets [9, 10], interval sets [5, 7], rough fuzzy sets [11] and soft sets [12]. Various extended set theories have also been used to interpret three-way decisions in recent years [13–16]. This paper follows rough set models for handling uncertainty in the available information. In three-way decisions, classical rough set theory is extended to quantitative methods, mainly Non-Probabilistic Rough Sets (NPRS) [7] and Probabilistic Rough Sets (PRS) [7, 18]. In PRS [7], a threshold pair (α, β) is used to split the entire data set into three regions, called acceptance, rejection and non-commitment. PRS methods include Decision-Theoretic Rough Sets (DTRS) [19], Game-Theoretic Rough Sets (GTRS) [20], Naïve Bayesian Rough Sets (NBRS) [21], Confirmation-Theoretic Rough Sets (CTRS) [22] and Bayesian Rough Sets (BRS) [23], as well as objective function-based methods built on entropy (Information-Theoretic Rough Sets, ITRS [7]), impurity [24], correlation [25] and variance [26]. In objective function-based methods, the best (α, β) pair is the one at which the objective function value with respect to the three regions is optimum. All of the above objective function-based studies were conducted independently. However, in real-life applications, the optimum point of one quality may not be the optimum point of the others. In [25, 27], the optimum (α, β) pairs for entropy and for the χ² statistic are different for the same probabilistic information system. This is where the proposed method becomes important: in applications that demand optimality of more than one quality.

This study considers the optimization of more than one quality at the same time and hence the optimal regions. Specifically, the study simultaneously optimizes objective function-based PRS methods, namely Shannon entropy [7], Gini index [24] and χ² statistic [25]. ITRS introduces objective functions that use Shannon entropy to represent the uncertainty of a trisection and the Gini index to represent the impurity of a trisection [7, 24]. Similarly, the χ² statistic measures the correlation between the trisection and the classification [25]. This paper handles the optimization of the three objective functions simultaneously by using different data sets from the UCI repository [28]. In order to obtain an optimal pair of thresholds, the Weighted Sum Method (WSM) and an Artificial Bee Colony (ABC)-based framework are employed [29, 30]. The WSM synthesizes information from the three qualities, and this information is optimized within the ABC-based framework. As an outcome of this study, an optimal threshold pair is derived at which the entropy and impurity are minimized and the χ² statistic is maximized. As an added benefit of the ABC-based framework, the experiment returns more than one pair of optimal thresholds. Moreover, a proper choice of feature selection algorithm improves the results [31]. The Discrimination Frequency Relevance Measure (DFRM) algorithm is used for feature selection [32]. This algorithm selects features on the basis of their discrimination power by calculating the discernibility information of each object pair belonging to different classes. This results in a proper grouping of objects when creating equivalence classes and making decisions. The DFRM helps the proposed method to achieve favourable results in terms of the size of the equivalence class structure and the granularity level.
As a whole, this work gives more insights into the three regions in terms of three qualities when compared to the existing works, such as single objective function-based works. The proposed work is compared with the Basic Methods (BM) and GTRS model, which are used to optimize entropy, impurity and correlation. Promising results are obtained for the proposed method in terms of optimal qualities, more than one optimum (α, β) pair, reduced size of boundary regions and better evaluation results. However, the proposed method faces the problem of local optima which results in sub-optimal results sometimes.

The remainder of this paper is organized as follows. Section 2 starts with the theory behind the DFRM algorithm. Since the proposed method is compared with the BM and the GTRS, the PRS and the BM based on objective functions such as Shannon entropy, Gini index and χ² statistic are explained. The theory of GTRS is explained as well. At the end of that section, the basic theory of the ABC algorithm and state-of-the-art multi-objective optimization techniques in PRS are explained. Section 3 presents the evaluation metrics, and Section 4 details the experimental framework. Section 5 presents the experimental results, and Section 6 deals with the corresponding discussion. The paper concludes with Section 7.

2 Background

Feature selection algorithms have a significant role in the generation of an optimal pair of thresholds. This study considers a feature selection algorithm called the DFRM algorithm [32]. As the name indicates, it is a discernibility-based algorithm. This section starts with the theory related to this algorithm.

2.1 Discrimination frequency relevance measure (DFRM) algorithm

The algorithm in [32] focuses on the discernibility information between each object pair having different decisions. Each object in a particular class is compared to the other objects by considering the differentiating information. To gather this information, a discernibility matrix is constructed with the object pairs as the rows and the conditional attributes as the columns. The value in a cell of the discernibility matrix indicates whether the corresponding attribute can differentiate the objects in that pair. In [32], the authors generate a discernibility matrix D_(i,j) of size P × Q, where each entry of D_(i,j) is based on Equation(1).

$D_{(i, j)} = \begin{cases} 1, & a_{j} (x) \neq a_{j} (y), \\ 0, & \text{otherwise,} \end{cases} \quad i = 1, 2, \ldots, P, \; j = 1, 2, \ldots, Q$ (1)

In Equation(1), (x, y) represents an object pair satisfying the condition d(x) ≠ d(y), where d is the decision attribute, and a_j(x) and a_j(y) are the values of the attribute a_j in x and y. After constructing D_(i,j), the sum of the values of a column of D_(i,j) represents the discrimination power of that attribute, which quantifies the significance of that attribute in the decision making process. In this research, after ranking the attributes by significance, the Half-Selection Strategy is followed to select features from the available features [32].
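The construction in Equation(1) can be sketched in a few lines of Python. This is a minimal illustration and not the implementation from [32]: the function names, the toy data and, in particular, the reading of the Half-Selection Strategy as "keep the top half of the ranked attributes, rounding up" are assumptions made for this example.

```python
from itertools import combinations

def discernibility_matrix(rows, decisions):
    """Build D: one row per object pair (x, y) with d(x) != d(y), one column per attribute."""
    pairs = [(x, y) for x, y in combinations(range(len(rows)), 2)
             if decisions[x] != decisions[y]]
    matrix = [[1 if rows[x][j] != rows[y][j] else 0 for j in range(len(rows[0]))]
              for x, y in pairs]
    return pairs, matrix

def discrimination_power(matrix):
    """Column sums of D: the number of object pairs each attribute can tell apart."""
    return [sum(column) for column in zip(*matrix)]

def half_selection(power):
    """Assumed reading of the Half-Selection Strategy: keep the top half (rounded up)."""
    ranked = sorted(range(len(power)), key=lambda j: power[j], reverse=True)
    return sorted(ranked[: (len(power) + 1) // 2])

# Hypothetical toy data: four objects, three conditional attributes, binary decision.
rows = [(0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 0)]
decisions = [0, 0, 1, 1]

pairs, D = discernibility_matrix(rows, decisions)
power = discrimination_power(D)
selected = half_selection(power)
```

In this toy example the first attribute separates every pair of objects with different decisions, so it receives the highest discrimination power and is ranked first.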

2.2 Probabilistic rough sets

In 1990, Yao et al. introduced a probabilistic version of rough set theory, called DTRS [7]. Motivated by a three-way interpretation of the probabilistic positive (P_α), negative (N_β) and boundary (B_(α,β)) regions, Yao proposed in 2012 a theory of three-way decisions based on the actions of acceptance, rejection and non-commitment [5]. In this theory, the three-way classification is defined quantitatively. A pair of thresholds (α, β) splits the entire data into three regions (P_α, N_β, B_(α,β)) by comparing (α, β) with the conditional probability based on the equivalence class [X]_I of an object x with respect to an indiscernibility relation I [24, 33]. The conditional probability of x being in class C, where C ⊆ U, given that x is in [X]_I, is defined by Equation(2).

$\Pr (C | [X]_{I}) = \frac{| C \cap [X]_{I} |}{| [X]_{I} |}$ (2)

Accordingly, an object x is assigned to one of the three regions of C based on the conditions in Equation(3):

$\begin{matrix} P_{α} (C) = \{ x \in U \mid \Pr (C | [X]_{I}) \geq α \}, \\ N_{β} (C) = \{ x \in U \mid \Pr (C | [X]_{I}) \leq β \}, \\ B_{(α, β)} (C) = \{ x \in U \mid β < \Pr (C | [X]_{I}) < α \} . \end{matrix}$ (3)

In the evolution of PRS theory, this probabilistic approach helps to obtain a better trisection with thresholds satisfying 0 ≤ β < 0.5 ≤ α ≤ 1. In the PRS view, Pawlak's three regions (the positive region, negative region and boundary region) are obtained at the threshold pair (α, β) = (1, 0) [7, 21].
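As an illustration of Equations(2) and (3), the following sketch groups objects into equivalence classes and assigns each class to a region. It is a minimal example under stated assumptions: the indiscernibility relation is taken to be equality on all listed attribute values, and the names `equivalence_classes` and `trisect` as well as the toy data are hypothetical.

```python
from collections import defaultdict

def equivalence_classes(rows):
    """Group object indices whose attribute values are identical (assumed relation I)."""
    blocks = defaultdict(list)
    for index, row in enumerate(rows):
        blocks[tuple(row)].append(index)
    return list(blocks.values())

def trisect(rows, in_class, alpha, beta):
    """Return (positive, negative, boundary) index sets following Equation (3)."""
    positive, negative, boundary = set(), set(), set()
    for block in equivalence_classes(rows):
        # Equation (2): Pr(C | [x]) = |C ∩ [x]| / |[x]|
        probability = sum(in_class[i] for i in block) / len(block)
        if probability >= alpha:
            positive.update(block)
        elif probability <= beta:
            negative.update(block)
        else:
            boundary.update(block)
    return positive, negative, boundary

# Hypothetical data: three equivalence classes with Pr(C | [x]) = 0.75, 0.5 and 0.25.
rows = [("a",)] * 4 + [("b",)] * 2 + [("c",)] * 4
in_class = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]  # 1 if the object belongs to C

positive, negative, boundary = trisect(rows, in_class, alpha=0.7, beta=0.3)
```

With (α, β) = (0.7, 0.3), the first equivalence class is accepted, the last is rejected, and the middle one is deferred to the boundary region.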

2.3 Methods to optimize thresholds

In PRS, given the class C and the pair of thresholds (α, β), the entire data set is partitioned into three mutually disjoint regions: the positive region P_α(C), the boundary region B_(α,β)(C) and the negative region N_β(C) [7]. These three-way classification regions (π_(α,β)(C)) are expressed by Equation(4).

$π_{(α, β)} (C) = \{ P_{α} (C), N_{β} (C), B_{(α, β)} (C) \}$ (4)

In order to generate optimal three-way decision regions, various qualities captured by the considered objective functions are utilized. After measuring the qualities with the help of these objective functions, an optimal pair of threshold values is determined by maximizing or minimizing these qualities individually, as the case may be [1, 25]. Let Q_P, Q_N and Q_B represent the qualities of the positive region, negative region and boundary region, respectively, and let Q(π_(α,β)(C)) be the total quality. As a linear combination of Q_P, Q_N and Q_B, the total quality is represented by Equation(5).

$Q (π_{(α, β)} (C)) = W_{P} * Q_{P} + W_{N} * Q_{N} + W_{B} * Q_{B}$ (5) where W_P, W_N and W_B are the weights of Q_P = Q(P_α(C)), Q_N = Q(N_β(C)) and Q_B = Q(B_(α,β)(C)) respectively [7]. By assigning the probability of each region of the trisection as the weights, Equation(5) becomes:

$\begin{matrix} Q (π_{(α, β)} (C)) = \Pr (P_{α} (C)) * Q_{P} + \Pr (N_{β} (C)) \\ * Q_{N} + \Pr (B_{(α, β)} (C)) * Q_{B} \end{matrix}$ (6) The following formulae in Equation(7) compute the probability of each region [1].

$\begin{matrix} \Pr (P_{α} (C)) = \frac{| P_{α} (C) |}{| U |}, \\ \Pr (N_{β} (C)) = \frac{| N_{β} (C) |}{| U |}, \\ \Pr (B_{(α, β)} (C)) = \frac{| B_{(α, β)} (C) |}{| U |} . \end{matrix}$ (7) Then the objective function for optimization is defined by Equation(8) or Equation(9).

$(α^{'}, β^{'}) = \underset{(α, β)}{argmin} (Q (π_{(α, β)} (C)))$ (8) or $(α^{'}, β^{'}) = \underset{(α, β)}{argmax} (Q (π_{(α, β)} (C)))$ (9) The selection of the objective function depends on the quality of the trisection, and (α′, β′) represents the optimized pair of thresholds.

2.3.1 Uncertainty by Shannon Entropy

Shannon entropy is a frequently used measure of the uncertainty in a data set. Let π_C = {C, C^c} be the partition of the data set based on the class attribute, and let Pr_C = {Pr(C), Pr(C^c)} give the corresponding probabilities of the partition. The initial entropy H(π_C) of the data set based on the partition π_C is computed by Equation(10).

$H (π_{C}) = - \Pr (C) * \log \Pr (C) - \Pr (C^{c}) * \log \Pr (C^{c})$ (10)

In this method, the focus is on computing the uncertainty involved in the data set when a trisection is generated from various pairs of thresholds (α, β), and then identifying a threshold pair that minimizes this uncertainty [1, 27]. Consider a threshold pair (α, β) and the corresponding trisection π_(α,β)(C) generated for class C based on this (α, β). In order to compute the uncertainty, the data set is trisected on (α, β). Then, the entropy of π_C is computed for each component of the trisection separately and the contributions are added together [7, 27]. When the positive region P_α(C) is considered, the uncertainty contribution H(π_C|P_α(C)) is computed using Equation(11).

$\begin{matrix} H (π_{C} | P_{α} (C)) = - \Pr (C | P_{α} (C)) * \log \Pr (C | P_{α} (C)) \\ - \Pr (C^{c} | P_{α} (C)) * \log \Pr (C^{c} | P_{α} (C)) \end{matrix}$ (11)

Equation(11) gives the entropy or uncertainty based on the prior information regarding the positive region of the trisection generated using (α, β). This entropy value gives information concerning the quality of the generated positive region. If the computed value is small, the quality of the region is high, and vice versa [1, 7]. The uncertainties involved in the other two regions are computed by Equations(12) and (13):

$\begin{matrix} H (π_{C} | N_{β} (C)) = - \Pr (C | N_{β} (C)) * \log \Pr (C | N_{β} (C)) \\ - \Pr (C^{c} | N_{β} (C)) * \log \Pr (C^{c} | N_{β} (C)) \end{matrix}$ (12)

$\begin{matrix} H (π_{C} | B_{(α, β)} (C)) = - \Pr (C | B_{(α, β)} (C)) * \log \Pr (C | B_{(α, β)} (C)) \\ - \Pr (C^{c} | B_{(α, β)} (C)) * \log \Pr (C^{c} | B_{(α, β)} (C)) \end{matrix}$ (13)

The conditional probabilities involved in Equations(11), (12) and (13) are computed by Equations(14), (15) and (16), respectively.

$\Pr (C | P_{α} (C)) = \frac{| C \cap P_{α} (C) |}{| P_{α} (C) |},$ (14)

$\Pr (C | N_{β} (C)) = \frac{| C \cap N_{β} (C) |}{| N_{β} (C) |},$ (15)

$\Pr (C | B_{(α, β)} (C)) = \frac{| C \cap B_{(α, β)} (C) |}{| B_{(α, β)} (C) |} .$ (16)

As per Equation(6), the total uncertainty or the conditional entropy of the trisection is determined by Equation(17) [7].

$\begin{matrix} H (π_{C} | π_{(α, β)} (C)) = \Pr (P_{α} (C)) * H (π_{C} | P_{α} (C)) + \Pr (N_{β} (C)) * H (π_{C} | N_{β} (C)) \\ + \Pr (B_{(α, β)} (C)) * H (π_{C} | B_{(α, β)} (C)) \end{matrix}$ (17)

A low value of the total entropy means that the quality of the trisection is high, and hence the study tries to minimize the conditional entropy [1, 7]. As per Equation(8), the optimum entropy can be derived by using Equation(18), which minimizes the objective function for total uncertainty.

$(α^{'}, β^{'}) = \underset{(α, β)}{argmin} (H (π_{C} | π_{(α, β)} (C)))$ (18)

where (α′, β′) is the optimal threshold pair.
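The region-weighted conditional entropy described above can be sketched as follows. This is an illustrative example only: the text does not state the base of the logarithm, so base 2 is assumed here, an empty region is assumed to contribute nothing, and the function names and the toy counts are hypothetical.

```python
import math

def binary_entropy(p):
    """Shannon entropy of a two-class distribution (p, 1 - p); base 2 is an assumption."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a pure region carries no uncertainty
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def conditional_entropy(regions):
    """regions: [(|C ∩ R|, |C^c ∩ R|)] for R = positive, negative, boundary."""
    n = sum(c + cc for c, cc in regions)
    total = 0.0
    for c, cc in regions:
        size = c + cc
        if size == 0:
            continue  # assumed convention: an empty region contributes nothing
        # weight Pr(R) = |R| / |U| times the entropy of the class split inside R
        total += (size / n) * binary_entropy(c / size)
    return total

# Hypothetical counts: positive (3 in C, 1 not), negative (1, 3), boundary (1, 1).
mixed = conditional_entropy([(3, 1), (1, 3), (1, 1)])
pure = conditional_entropy([(5, 0), (0, 5), (0, 0)])
```

A trisection whose regions are pure yields zero conditional entropy, whereas mixed regions yield a larger value, which is why the threshold pair is chosen to minimize this quantity.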

2.3.2 Impurity by Gini Index

The Gini index is acknowledged as the measure of impurity that varies between 0 and 1. It is interpreted as the degree to which a particular attribute misclassifies a new instance based on the distribution of objects in different classes when it is randomly chosen [34]. There is no impurity when all the objects are classified into a single class, i.e., the Gini index is zero. Similarly, if the objects are randomly classified into different classes, then the impurity is high. Specifically, the Gini index value will be 0.5 if the objects are equally distributed across two classes C and C^c. The impurity (G) is measured by Equation(19).

$G = 1 - \sum_{i = 1}^{z} (\Pr (C_{i}))^{2}$ (19)

In Equation(19), Pr(C_i) represents the probability of an object being classified to class C_i and z is the number of classes [34].

In machine learning, the Gini index is used in decision trees to find the best splits by measuring the impurity of an attribute. In PRS, the Gini index can be used to measure the impurities in the three regions. As per Equation(14), Pr(C|P_α(C)) and Pr(C^c|P_α(C)) are the conditional probabilities that provide probabilistic information about C and C^c, given the positive region as the prior information. As per [24], the Gini index or the impurity (G(P_α(C))) of the positive region (P_α(C)) for a certain pair of thresholds (α, β) is defined by Equation(20).

$G (P_{α} (C)) = 1 - \Pr (C | P_{α} (C))^{2} - \Pr (C^{c} | P_{α} (C))^{2}$ (20)

Similarly, the impurities of the negative region and the boundary region of class C at a particular threshold pair (α, β) are defined by Equations(21) and (22):

$G (N_{β} (C)) = 1 - \Pr (C | N_{β} (C))^{2} - \Pr (C^{c} | N_{β} (C))^{2}$ (21)

$\begin{matrix} G (B_{(α, β)} (C)) = 1 - \Pr (C | B_{(α, β)} (C))^{2} \\ - \Pr (C^{c} | B_{(α, β)} (C))^{2} \end{matrix}$ (22)

Then, this paper takes the minimization of the total impurity as the objective function, which is calculated by Equation(23) [24]:

$\begin{matrix} G (π_{C} | π_{(α, β)} (C)) = \Pr (P_{α} (C)) * G (P_{α} (C)) \\ + \Pr (N_{β} (C)) * G (N_{β} (C)) + \Pr (B_{(α, β)} (C)) \\ * G (B_{(α, β)} (C)) \end{matrix}$ (23)

The probability of each region is used as the weight and is multiplied by the Gini index value of the corresponding region, as mentioned in Section 2.3. Finally, the study derives the optimal (α′, β′) based on Equation(8), at which the impurity is minimum. It is defined by Equation(24).

$(α^{'}, β^{'}) = \underset{(α, β)}{argmin} (G (π_{C} | π_{(α, β)} (C)))$ (24)
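The total impurity of a trisection can be sketched in the same way as the conditional entropy. The example below is a minimal illustration with hypothetical function names and toy counts; treating an empty region as contributing zero is an assumption of this sketch.

```python
def gini(p):
    """Gini index of a two-class distribution (p, 1 - p)."""
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def total_gini(regions):
    """regions: [(|C ∩ R|, |C^c ∩ R|)] for R = positive, negative, boundary."""
    n = sum(c + cc for c, cc in regions)
    total = 0.0
    for c, cc in regions:
        size = c + cc
        if size == 0:
            continue  # assumed convention: an empty region contributes nothing
        # weight Pr(R) = |R| / |U| times the impurity of the class split inside R
        total += (size / n) * gini(c / size)
    return total

# Hypothetical counts: positive (3 in C, 1 not), negative (1, 3), boundary (1, 1).
mixed = total_gini([(3, 1), (1, 3), (1, 1)])
pure = total_gini([(5, 0), (0, 5), (0, 0)])
```

For two classes the per-region impurity peaks at 0.5 when the region is evenly split, which matches the behaviour described for the Gini index above.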

2.3.3 Correlation by χ² Statistic

The χ² statistic tests the independence between two variables based on the actual (d_obs) and expected (d_exp) data. If the observed values and the expected values of a variable follow the same distribution, then the value of the χ² statistic will be zero [35]. The following formula, Equation(25), computes the χ² statistic.

$χ^{2} = \sum (d_{obs} - d_{\exp})^{2} / d_{\exp}$ (25)

In three-way decisions, there is always dependence between the trisection and the classification. So the experiment tries to maximize the value of the χ² statistic to reinforce this correlation [25]. If the value of the χ² statistic is high, then it implies a strong correlation between the trisection and the classification. Consequently, an optimal pair of thresholds (α, β) is obtained at which the χ² value is maximum.

To set out the objective function for this method, consider a finite non-empty set of objects U that can be partitioned into a positive class C and a negative class C^c such that |C ∪ C^c| = |U| = n. The following equations are used to find the χ² statistic of the three regions based on the values in the contingency table given by Cong Gao and Yiyu Yao (Table 1 in [25]), which records the trisection of C and C^c for an arbitrary (α, β) pair. As per Equations(26), (27) and (28), the χ² statistics of the three regions, $χ_{α}^{2} (P)$, $χ_{β}^{2} (N)$ and $χ_{(α, β)}^{2} (B)$, are calculated respectively.

$\begin{matrix} χ_{α}^{2} (P) = \frac{(| C_{P} | - C_{tot} * | P_{α} (C) | / n)^{2}}{C_{tot} * | P_{α} (C) | / n} \\ + \frac{(| C_{P}^{c} | - C_{tot}^{c} * | P_{α} (C) | / n)^{2}}{C_{tot}^{c} * | P_{α} (C) | / n}, \end{matrix}$ (26)

$\begin{matrix} χ_{β}^{2} (N) = \frac{(| C_{N} | - C_{tot} * | N_{β} (C) | / n)^{2}}{C_{tot} * | N_{β} (C) | / n} \\ + \frac{(| C_{N}^{c} | - C_{tot}^{c} * | N_{β} (C) | / n)^{2}}{C_{tot}^{c} * | N_{β} (C) | / n}, \end{matrix}$ (27)

$\begin{matrix} χ_{(α, β)}^{2} (B) = \frac{(| C_{B} | - C_{tot} * | B_{(α, β)} (C) | / n)^{2}}{C_{tot} * | B_{(α, β)} (C) | / n} \\ + \frac{(| C_{B}^{c} | - C_{tot}^{c} * | B_{(α, β)} (C) | / n)^{2}}{C_{tot}^{c} * | B_{(α, β)} (C) | / n} . \end{matrix}$ (28)

According to [25], |C_P| is the number of objects of C that are correctly classified by the positive region P_α(C) of C. Hence |C_P| = |C ∩ P_α(C)|. Similarly, |C_N| = |C ∩ N_β(C)| and |C_B| = |C ∩ B_(α,β)(C)| are the numbers of objects of C that fall in the negative region and the boundary region generated by the (α, β) pair. The values |C^c_P|, |C^c_B| and |C^c_N| are the numbers of objects of C^c that fall in the positive region, boundary region and negative region, respectively.
Also, |P_α(C)|, |N_β(C)| and |B_(α,β)(C)| specify the total number of objects in the positive region, negative region and boundary region of C at the threshold (α, β), respectively. The values C_tot and $C_{tot}^{c}$ are the total numbers of objects in C and C^c, respectively. As per [25], for the actual data |C_P|, the expected data is calculated by $\frac{C_{tot} * | P_{α} (C) |}{n}$.

With the help of Equation(6), the overall quality of the three regions is calculated by using Equation(29), which is the linear combination of the χ² statistics of the three regions and their corresponding region probabilities [25].

$\begin{matrix} χ^{2} (π_{(α, β)} (C)) = \Pr (P_{α} (C)) * χ_{α}^{2} (P) + \Pr (N_{β} (C)) \\ * χ_{β}^{2} (N) + \Pr (B_{(α, β)} (C)) * χ_{(α, β)}^{2} (B) \end{matrix}$ (29)

Here, Pr(P_α(C)), Pr(N_β(C)) and Pr(B_(α,β)(C)) are the probabilities associated with the positive, negative and boundary regions, respectively. Then, the overall quality of the three regions, i.e., the χ² statistic, is maximized to strengthen the dependence between {C, C^c} and π_(α,β)(C). The appropriate α and β are those at which the χ² statistic is maximum [25]. The optimal pair of thresholds follows from Equation(9) for maximization and is defined by Equation(30).

$(α^{'}, β^{'}) = \underset{(α, β)}{argmax} (χ^{2} (π_{(α, β)} (C)))$ (30)

where (α′, β′) is the optimal threshold pair.

When the sample sizes or the dimensions of the table differ, the χ² statistic and its significance may not provide an accurate idea of the extent of the relationship between the two variables. To solve this issue, in this work, the χ² statistic is replaced by the phi coefficient to determine the pair of thresholds [36]. The measure of association, phi (φ), is a measure that adjusts the χ² statistic by the sample size. It is usually less than one. φ is interpreted by Equation(31).

$φ (χ^{2} (π_{(α, β)} (C))) = \sqrt{\frac{χ^{2} (π_{(α, β)} (C))}{n}}$ (31)

The φ-function associated with each region is expressed by φ_P, φ_N and φ_B and they are defined by Equations(32), (33) and (34):

$φ_{P} = φ (χ_{α}^{2} (P)) = \sqrt{\frac{χ_{α}^{2} (P)}{| P_{α} (C) |}}$ (32)

$φ_{N} = φ (χ_{β}^{2} (N)) = \sqrt{\frac{χ_{β}^{2} (N)}{| N_{β} (C) |}}$ (33)

$φ_{B} = φ (χ_{(α, β)}^{2} (B)) = \sqrt{\frac{χ_{(α, β)}^{2} (B)}{| B_{(α, β)} (C) |}}$ (34)
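The per-region χ² statistic and its φ adjustment can be sketched as below. This is an illustrative example under stated assumptions: both classes are assumed non-empty (so the expected counts are positive), an empty region is assumed to yield zero, and the function names and toy counts are hypothetical.

```python
import math

def chi2_region(c_r, cc_r, c_tot, cc_tot):
    """χ² of one region: observed class counts (c_r, cc_r) against expected counts."""
    n = c_tot + cc_tot
    size = c_r + cc_r
    if size == 0:
        return 0.0  # assumed convention for an empty region
    expected_c = c_tot * size / n     # expected number of C objects in the region
    expected_cc = cc_tot * size / n   # expected number of C^c objects in the region
    return ((c_r - expected_c) ** 2 / expected_c
            + (cc_r - expected_cc) ** 2 / expected_cc)

def phi_region(c_r, cc_r, c_tot, cc_tot):
    """φ of one region: the region's χ² adjusted by the region size."""
    size = c_r + cc_r
    if size == 0:
        return 0.0
    return math.sqrt(chi2_region(c_r, cc_r, c_tot, cc_tot) / size)

# Hypothetical counts: positive (3 in C, 1 not), negative (1, 3), boundary (1, 1).
regions = [(3, 1), (1, 3), (1, 1)]
c_tot = sum(c for c, _ in regions)
cc_tot = sum(cc for _, cc in regions)
n = c_tot + cc_tot

chi2_values = [chi2_region(c, cc, c_tot, cc_tot) for c, cc in regions]
phi_values = [phi_region(c, cc, c_tot, cc_tot) for c, cc in regions]
# Region-probability-weighted total χ² of the trisection
total_chi2 = sum(((c + cc) / n) * value for (c, cc), value in zip(regions, chi2_values))
```

In this toy example the evenly split boundary region matches its expected counts exactly, so it contributes no correlation, while the positive and negative regions each deviate from independence.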

2.4 GTRS

GTRS is a probabilistic method that follows game theory to establish a trade-off between two regions, namely the immediate region and the deferred region [4, 38]. The positive region and negative region are collectively termed the immediate region, and the boundary region is called the deferred region. The game includes players, strategies and payoff values. The specified regions are the players involved in the game and follow a strategy in terms of varying threshold levels. The values produced by the corresponding regions are called the payoff values [38]. The players compete with each other under a strategy and stop the game at the Nash equilibrium state between the payoff values [37].

The proposed method is compared with the GTRS method. In the GTRS method, the two regions compete with each other to optimize the qualities like uncertainty, impurity and correlation. The specific threshold is determined for each quality when it is balanced between these two regions. In this experiment, the two regions follow predefined strategies in terms of varying (α, β) values which contain { α, α decremented by 0.05, α decremented by 0.1 } and { β, β incremented by 0.05, β incremented by 0.1 } with an initial threshold (1, 0). Also, the algorithm stops at one of the following conditions, i.e., Pr(C) < Pr(C|P(C)) or |B_(α,β)(C) | = 0 or qualities are balanced between two regions [37].

2.5 ABC-based optimization

Swarm intelligence algorithms are commonly used for optimization problems since they can return more than one optimized result. These algorithms include Genetic Algorithms (GA), Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), Differential Evolution (DE), Artificial Bee Colony (ABC), Glowworm Swarm Optimization (GSO), and Cuckoo Search Algorithm (CSA) [30, 39–44]. In the proposed method, ABC-based multi-objective optimization is adapted to produce an optimal pair of thresholds. This algorithm imitates the foraging strategy of honey bees to identify the best food sources [30]. For this, a division of labour among the bees is employed by defining three phases, one for each category of bee: the employed bee phase, the onlooker bee phase and the scout bee phase. The simplified ABC algorithm is specified in Algorithm 1 [30].

Algorithm 1 Simplified Artificial Bee Colony algorithm

Initial population

Initialization

Find the fitness values and the global best

while Termination condition is not satisfied do

employed bee phase

onlooker bee phase

scout bee phase

end while

From the primary collection of food sources, new food sources are generated through the initialization process. The fitness value for each food source is calculated to identify the quality of the food source. Then, the best food source is preserved. In the employed bee phase, new food sources are explored on the basis of the neighbourhood, and they are updated for a better fitness value. In the onlooker bee phase, each onlooker is assigned based on the fitness probability. The food source with the highest fitness probability is the first food source that is selected for the onlooker bee phase. For exploiting the new food sources, the onlooker bee phase follows the same procedure as in the employed bee phase. In the next phase, the exhausted food sources are replaced with the new food sources by scout bees. These three phases work iteratively until a specified condition is satisfied [30].
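The three phases described above can be sketched as a small minimizer over a box of threshold values. This is a minimal illustration, not the framework used in the experiments: the neighbour update, the fitness transform 1/(1 + cost) and the roulette-wheel choice of sources for onlookers follow the common textbook form of ABC and are assumptions here, as are all parameter values. The quadratic `toy_cost` is a hypothetical stand-in for the real objective.

```python
import random

def abc_minimize(cost_fn, bounds, n_sources=10, limit=20, max_iter=200, seed=0):
    """Minimal Artificial Bee Colony sketch that minimizes cost_fn inside a box."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def fitness(cost):
        # Assumed fitness transform: smaller cost gives larger fitness.
        return 1.0 / (1.0 + cost) if cost >= 0 else 1.0 + abs(cost)

    sources = [random_source() for _ in range(n_sources)]
    costs = [cost_fn(s) for s in sources]
    trials = [0] * n_sources
    best_cost, best = min(zip(costs, sources))
    best = list(best)

    def try_neighbour(i):
        nonlocal best, best_cost
        k = rng.choice([m for m in range(n_sources) if m != i])
        j = rng.randrange(dim)
        candidate = list(sources[i])
        step = rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        lo, hi = bounds[j]
        candidate[j] = min(hi, max(lo, candidate[j] + step))  # stay inside the box
        cost = cost_fn(candidate)
        if cost < costs[i]:  # greedy selection keeps the better food source
            sources[i], costs[i], trials[i] = candidate, cost, 0
            if cost < best_cost:
                best, best_cost = list(candidate), cost
        else:
            trials[i] += 1

    for _ in range(max_iter):
        for i in range(n_sources):  # employed bee phase
            try_neighbour(i)
        weights = [fitness(c) for c in costs]  # onlooker bee phase
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbour(i)
        for i in range(n_sources):  # scout bee phase: replace exhausted sources
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = cost_fn(sources[i])
                trials[i] = 0
                if costs[i] < best_cost:
                    best, best_cost = list(sources[i]), costs[i]
    return best, best_cost

def toy_cost(x):
    """Hypothetical stand-in objective with its minimum at (alpha, beta) = (0.7, 0.2)."""
    return (x[0] - 0.7) ** 2 + (x[1] - 0.2) ** 2

bounds = [(0.5, 1.0), (0.0, 0.5)]  # alpha in [0.5, 1], beta in [0, 0.5]
best, best_cost = abc_minimize(toy_cost, bounds)
```

Because the final population holds several food sources, such a loop can also report more than one near-optimal threshold pair, which is the property the text attributes to the ABC-based framework.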

2.6 Multi-objective optimization in PRS

In multi-criteria decision making, there are different methods such as WSM, the Weighted Product Model, the Analytic Hierarchy Process and so forth [45–48]. Among these methods, WSM is the classical and simplest one. It is applied in the proposed method to synthesize information from the three qualities, even though it is challenging to find Pareto-optimal solutions for non-convex problems with it.

To obtain an optimal threshold pair in PRS, multiple qualities associated with the data are optimized, which subsequently influences the consistency of the model. In PRS, there are some significant works on learning thresholds by optimizing multiple objectives. One contribution in DTRS is the multi-objective optimization of decision costs and the size of the boundary region from a trade-off perspective using the Genetic algorithm [49]. In another work, the attribute reduct is obtained by optimizing multiple criteria such as positive region, decision costs and mutual information in DTRS [50]. Then, a neighbourhood-based DTRS was proposed with multi-objective optimization with respect to three criteria, namely decreasing the size of the boundary region, decreasing the total decision cost and increasing the size of the neighbourhood, from a trade-off perspective using the Evolutionary algorithm and Pareto optimal solutions [51]. Later, Fan Jia et al. proposed a novel loss function from evaluation values of criteria with the support of relative loss functions and inverse loss functions [52]. Moreover, an outranking relation-based rough-fuzzy model for multi-criteria decision making was developed in 2020 [53]. Another unified model for user-oriented attribute importance analysis, based on both quantitative and qualitative approaches in three-way decisions, was presented in [54]. In 2021, Chengjun Shi et al. proposed a tri-level framework for multi-criteria decision-making [48]. Recently, Yiyu Yao et al. proposed a method called 3RD for multi-criteria decision making by using the theory of three-way decisions, various multi-criteria decision methods, and prospect theory [55]. Unlike the above methods, this work proposes the multi-objective optimization of entropy, impurity and correlation using WSM and the ABC algorithm.

3 Evaluation metrics

In [1, 24], the authors provided different metrics to measure the quality of different regions in PRS. The following metrics are used to measure the quality of the three regions of C.

Correct-Acceptance Rate:

$CAR (C, P_{α} (C)) = \frac{| C \cap P_{α} (C) |}{| P_{α} (C) |}$ (35)

Incorrect-Acceptance Error:

$IAE (C, P_{α} (C)) = \frac{| C^{c} \cap P_{α} (C) |}{| P_{α} (C) |}$ (36)

Correct-Rejection Rate:

$CRR (C, N_{β} (C)) = \frac{| C^{c} \cap N_{β} (C) |}{| N_{β} (C) |}$ (37)

Incorrect-Rejection Error:

$IRE (C, N_{β} (C)) = \frac{| C \cap N_{β} (C) |}{| N_{β} (C) |}$ (38)

Non-commitment of Positive Error:

$NPE (C, B_{(α, β)} (C)) = \frac{| C \cap B_{(α, β)} (C) |}{| B_{(α, β)} (C) |}$ (39)

Non-commitment of Negative Error:

$NNE (C, B_{(α, β)} (C)) = \frac{| C^{c} \cap B_{(α, β)} (C) |}{| B_{(α, β)} (C) |}$ (40)

Non-commitment Rate:

$NCR (C, π_{(α, β)} (C)) = \frac{| B_{(α, β)} (C) |}{| U |}$ (41)

Commitment Rate:

$CR (C, π_{(α, β)} (C)) = \frac{| P_{α} (C) \cup N_{β} (C) |}{| U |}$ (42)

Accuracy of Three-way Regions:

$ATR (π_{(α, β)} (C)) = \frac{| C \cap P_{α} (C) | + | C^{c} \cap N_{β} (C) |}{| P_{α} (C) | + | N_{β} (C) |}$ (43)

The CAR and the IAE are used to measure the quality of the positive region of {C, C^c}. Here, CAR denotes the ratio of the number of objects correctly classified by the positive region of C to the number of objects covered by the positive region of C. CAR is also called the Accuracy of Positive Region (APR) in [24]. The IAE denotes the ratio of the number of objects incorrectly classified by the positive region of C to the number of objects covered by the positive region of C. Likewise, CRR and IRE are used to measure the quality of the negative region of {C, C^c}, and NPE and NNE are used to measure the quality of the boundary region of {C, C^c}. In [24], CRR is also called the Accuracy of Negative Region (ANR). The NCR measures the non-commitment rate, and CR measures the commitment rate in the whole universe. As per [24], ATR measures the accuracy and CR measures the coverage of the three-way regions.
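All nine metrics follow directly from the six counts of the contingency table, as the sketch below illustrates. The function names and toy counts are hypothetical, and returning 0.0 for a metric whose region is empty is an assumption of this example.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 for an empty region (assumed convention)."""
    return numerator / denominator if denominator else 0.0

def evaluation_metrics(c_p, cc_p, c_n, cc_n, c_b, cc_b):
    """Counts of C and C^c objects in the positive (p), negative (n) and boundary (b) regions."""
    p, n, b = c_p + cc_p, c_n + cc_n, c_b + cc_b
    u = p + n + b
    return {
        "CAR": safe_ratio(c_p, p),    # correct-acceptance rate
        "IAE": safe_ratio(cc_p, p),   # incorrect-acceptance error
        "CRR": safe_ratio(cc_n, n),   # correct-rejection rate
        "IRE": safe_ratio(c_n, n),    # incorrect-rejection error
        "NPE": safe_ratio(c_b, b),    # non-commitment of positive error
        "NNE": safe_ratio(cc_b, b),   # non-commitment of negative error
        "NCR": safe_ratio(b, u),      # non-commitment rate
        "CR": safe_ratio(p + n, u),   # commitment rate
        "ATR": safe_ratio(c_p + cc_n, p + n),  # accuracy of three-way regions
    }

# Hypothetical counts: positive (3 in C, 1 not), negative (1, 3), boundary (1, 1).
metrics = evaluation_metrics(3, 1, 1, 3, 1, 1)
```

Within each region the paired metrics sum to one (CAR + IAE, CRR + IRE, NPE + NNE), and NCR + CR = 1 over the whole universe, which gives a quick sanity check on any implementation.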

4 Experimental framework

This section includes the explanation of data trisection in PRS and the corresponding quality computation. In order to synthesize information from these qualities, the objective function is needed. Therefore, the formation of the objective function is explained in the next part. Then, the objective function value is optimized within the framework of ABC, which is described with the help of Algorithm 1.

4.1 Determining qualities of trisection in PRS

The factors considered to determine the quality of the trisection are conditional entropy, impurity and correlation with the selected classes. Shannon entropy is employed in the experiment to determine the partition with minimum conditional entropy, and the Gini index is used to determine the partition with minimum impurity. Similarly, to identify a partition with maximum correlation, the χ² statistic is used as the optimization function. To form the trisection and to study the quality of the resulting data set, a set of five (α, β) values is considered initially. The ABC-based framework is then employed to find various optimal (α, β) pairs. For each trisection, the corresponding qualities are determined as specified in Sections 2.3.1, 2.3.2 and 2.3.3. The information from all three qualities is aggregated into an objective function through WSM.

4.2 Objective function

One of the simplest methods of multi-objective optimization is the weighted sum method [29]. Suppose F₁(x), F₂(x), ..., F_n(x) are n objective functions to be optimized together, with the weighting components W₁, W₂, ..., W_n, respectively. Then the weighted sum is expressed by Equation (42).

$F (x) = \sum_{i = 1}^{n} W_{i} * F_{i} (x)$ (42)

where W_i satisfies the condition:

$\sum_{i = 1}^{n} W_{i} = 1, W_{i} \in (0, 1)$ (43)

In the proposed method, Q₁, Q₂ and Q₃ are the objective functions. They are defined by Equation (44).

$\begin{matrix} Q_{1} = W_{P} * H_{P} + W_{N} * H_{N} + W_{B} * H_{B}, \\ Q_{2} = W_{P} * G_{P} + W_{N} * G_{N} + W_{B} * G_{B}, \\ Q_{3} = W_{P} * φ_{P} + W_{N} * φ_{N} + W_{B} * φ_{B} . \end{matrix}$ (44)

To generate the optimal threshold pair by optimizing the three objective functions simultaneously, a single objective function (OF) is defined on the basis of the weighted sum method of Equation (42). It is given by Equation (45).

$\begin{matrix} OF = W_{P} * (H_{P} + G_{P} - φ_{P}) + W_{N} * (H_{N} + G_{N} - φ_{N}) \\ + W_{B} * (H_{B} + G_{B} - φ_{B}) \end{matrix}$ (45)

In Equation (45), W_P, W_N and W_B are the weights associated with the positive region, negative region and boundary region, respectively. They are calculated as described in Section 2.3 and play the role of the weights W_i in Equation (42). H_P, G_P and φ_P are the quantified entropy, impurity and correlation associated with the positive region. Similarly, H_N, G_N and φ_N are the same qualities associated with the negative region, and H_B, G_B and φ_B are the qualities related to the boundary region. Here, H_P, H_N and H_B are calculated using Equations (31), (32) and (33), respectively. G_P, G_N and G_B are determined using Equations (11), (12) and (13), respectively, and φ_P, φ_N and φ_B are determined by Equations (20), (21) and (22), respectively. Because the objective function is minimized while the correlation is to be maximized, the φ terms enter the objective function with a negative sign. The optimization of the objective function is handled by the ABC-based framework.
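The aggregation of the three qualities into one value can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the regions are keyed by the letters P, N and B, and the numeric inputs are made-up values rather than results from any data set.

```python
REGIONS = ("P", "N", "B")  # positive, negative and boundary region


def weighted_quality(weights, values):
    """Weighted sum of one quality over the three regions (Q1, Q2 or Q3)."""
    return sum(weights[r] * values[r] for r in REGIONS)


def objective_function(weights, entropy, gini, correlation):
    """OF = Q1 + Q2 - Q3; the correlation is subtracted because it is maximized."""
    q1 = weighted_quality(weights, entropy)
    q2 = weighted_quality(weights, gini)
    q3 = weighted_quality(weights, correlation)
    return q1 + q2 - q3


# Hypothetical inputs: region weights that sum to one and per-region qualities.
weights = {"P": 0.5, "N": 0.25, "B": 0.25}
entropy = {"P": 0.5, "N": 0.5, "B": 1.0}
gini = {"P": 0.25, "N": 0.25, "B": 0.5}
correlation = {"P": 0.5, "N": 0.5, "B": 0.0}

of_value = objective_function(weights, entropy, gini, correlation)
print(of_value)
```

For these inputs Q₁ = 0.625, Q₂ = 0.3125 and Q₃ = 0.375, so the sketch evaluates OF = 0.625 + 0.3125 - 0.375 = 0.5625. Grouping the terms by region, as in the definition of OF, gives the same value.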

4.3 Multi-objective optimization using ABC algorithm

The ABC algorithm is commonly used for multi-objective optimization in real-world problems because it delivers optimized results within a common framework [56–58]. This framework executes a sequence of phases, namely initialization, the employed bee phase, the onlooker bee phase and the scout bee phase [30]. The execution is iterated until a specified condition is satisfied.

In this work, a predefined collection of five threshold pairs (α_i, β_i), i = 1, 2, ..., 5, is taken as the initial population for optimizing the thresholds. In the initialization phase, these predefined threshold pairs are modified with the random values rand₁ (0.0 ≤ rand₁ ≤ 1.0) and rand₂ (-1.0 ≤ rand₂ ≤ 0.0), respectively, to generate the new threshold values α_n and β_n, n = 1, 2, ..., 5, using Equation (46). The α_n and β_n values are generated within the limits 0.5 ≤ α_n ≤ 1.0 and 0 ≤ β_n < 0.5, respectively.

$\begin{matrix} α_{n} = min (α_{i}) + {rand}_{1} (max (α_{i}) - min (α_{i})), \\ β_{n} = min (β_{i}) + {rand}_{2} (max (β_{i}) - min (β_{i})) . \end{matrix}$ (46)

For each threshold pair (α_n, β_n), the fitness value (f) is calculated as per Equation (47) [30].

$f = \begin{cases} \frac{1}{1 + OF}, & \text{if } OF \geq 0 \\ 1 + | OF |, & \text{if } OF < 0 \end{cases}$ (47)

where OF is calculated using Equation (45). At the end of the initialization phase, the best fitness value among the (α_n, β_n) pairs is preserved, together with its threshold pair, as the global best.
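The piecewise fitness definition translates directly into code. The following Python sketch uses the hypothetical function name `fitness`; it shows that a smaller OF always yields a larger fitness value, so maximizing the fitness corresponds to minimizing OF.

```python
def fitness(of_value):
    """Map an objective function value to an ABC fitness value.

    Non-negative OF values are mapped into (0, 1]; negative OF values are
    mapped above 1, so a lower OF always gives a higher fitness.
    """
    if of_value >= 0:
        return 1.0 / (1.0 + of_value)
    return 1.0 + abs(of_value)


# Illustrative OF values, ordered from worst to best.
for of_value in (3.0, 1.0, 0.0, -0.5):
    print(of_value, fitness(of_value))
```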

In the employed bee phase, the neighbourhood of each initialized threshold pair (α_n, β_n) is explored. After each exploration, the threshold pair with the better fitness value is preserved as the effective threshold pair. The results of the exploration are the current best five pairs (α_e, β_e), e = 1, 2, ..., 5, and the corresponding fitness values. Mathematically, the neighbourhood exploration is modelled by Equation (48) [30].

$\begin{matrix} α_{e} = | α_{n} + {rand}_{1} (α_{n} - α_{k}) |, \\ β_{e} = | β_{n} + {rand}_{2} (β_{n} - β_{k}) | . \end{matrix}$ (48)

In Equation (48), the random values rand₁ and rand₂ are selected from the limits -1.0 ≤ rand₁ ≤ 0.0 and -1.0 ≤ rand₂ ≤ 0.3, respectively. This produces α_e and β_e within the limits 0.5 ≤ α_e ≤ 1.0 and 0 ≤ β_e < 0.5, respectively. In this phase, a new set of threshold pairs (α_e, β_e) is explored from the (α_n, β_n) on the basis of the k^th threshold pair (α_k, β_k), where k = 1, 2, ..., 5. Later, the best fitness value and the corresponding threshold pair are updated as the global best.
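A single neighbourhood move can be sketched in Python as below. The function name `explore` is hypothetical. The text does not state how the limits on α_e and β_e are enforced, so the clamping step is an assumption of this sketch, as is the upper bound of 0.49 used to keep β_e strictly below 0.5.

```python
import random

ALPHA_LIMITS = (0.5, 1.0)
BETA_LIMITS = (0.0, 0.49)  # assumed bound that keeps beta strictly below 0.5


def clamp(value, limits):
    """Force a value into the closed interval given by limits."""
    return min(max(value, limits[0]), limits[1])


def explore(pair, partner, rng):
    """Generate a neighbouring threshold pair from pair and a partner pair."""
    alpha_n, beta_n = pair
    alpha_k, beta_k = partner
    rand1 = rng.uniform(-1.0, 0.0)
    rand2 = rng.uniform(-1.0, 0.3)
    alpha_e = abs(alpha_n + rand1 * (alpha_n - alpha_k))
    beta_e = abs(beta_n + rand2 * (beta_n - beta_k))
    # Assumption: out-of-range values are clamped to the stated limits.
    return clamp(alpha_e, ALPHA_LIMITS), clamp(beta_e, BETA_LIMITS)


rng = random.Random(0)  # fixed seed so that the sketch is repeatable
print(explore((0.8, 0.2), (0.9, 0.3), rng))
```

When the partner pair equals the current pair, both difference terms vanish and the move returns the current pair unchanged, which is why a distinct partner is normally chosen.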

Starting from the near-optimal solutions generated by the employed bees, the onlooker bees exploit the threshold pairs after ranking them by their fitness probability. The fitness probability (FP) is expressed by Equation (49) [30].

${FP}_{j} = \frac{f_{j}}{\sum_{k = 1}^{m} f_{k}}$ (49)

where m is the number of threshold pairs, and FP_j gives the FP of the j^th threshold pair. The threshold pair with the highest FP is the first candidate for exploitation. For exploiting the threshold pairs, Equation (48) is used with the random values rand₁ and rand₂ within the limits -1.0 ≤ rand₁ ≤ 0.0 and -1.0 ≤ rand₂ ≤ 0.3, respectively. The resultant threshold pairs are denoted by (α_o, β_o), o = 1, 2, ..., 5, in which each threshold pair possesses the current best fitness value after its exploitation. Finally, the threshold pair with the highest fitness value is selected from the (α_o, β_o) pairs as the global best.
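The ranking step of the onlooker bees can be sketched in Python as follows. The function names and the fitness values are hypothetical and serve only to show the normalization and the resulting order of exploitation.

```python
def fitness_probabilities(fitness_values):
    """Normalize the fitness values so that they sum to one."""
    total = sum(fitness_values)
    return [f / total for f in fitness_values]


def exploitation_order(fitness_values):
    """Indices of the threshold pairs, highest fitness probability first."""
    probabilities = fitness_probabilities(fitness_values)
    return sorted(range(len(probabilities)),
                  key=lambda j: probabilities[j], reverse=True)


# Hypothetical fitness values of five threshold pairs.
fitness_values = [0.5, 1.5, 1.0, 0.25, 0.75]
print(fitness_probabilities(fitness_values))
print(exploitation_order(fitness_values))
```

Because every fitness value is divided by the same total, ranking by FP gives the same order as ranking by the fitness values themselves.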

During the exploration and exploitation phases, if a particular threshold pair remains non-updated for more trials than the limit allows, the scout bees replace it with a new threshold pair generated using Equation (46). The limit value (l) is determined by Equation (50) [30].

$l = \frac{m * d}{2}$ (50)

where m is the number of threshold pairs, and d is the dimension. Again, the fitness value and the corresponding threshold pair are updated for the global best.
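As a worked instance, consider the population of m = 5 threshold pairs used in this work, and assume d = 2 because each candidate consists of the two thresholds α and β. The limit then evaluates to

$l = \frac{m * d}{2} = \frac{5 * 2}{2} = 5$

which agrees with the value of l reported for the experiments in Section 5.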

4.3.1 ABC-Framework

Algorithm 2 Workflow of multi-objective optimization

Input: Predefined set of (α, β) pairs
Initialization phase
Find the fitness value for each (α, β) pair
Determine the global best
while the stopping criterion is not satisfied do
    for each (α, β) pair in the employed bee phase do
        Generate a new (α, β) pair and its fitness value
        if the new fitness value is better than the current one then
            Update the (α, β) pair and its fitness value
            Reset the counter to zero
        else
            Increment the counter
        end if
        Update the global best
    end for
    for each (α, β) pair in the onlooker bee phase do
        Find the FP values
        Select the (α, β) pairs in decreasing order of FP
        Generate a new (α, β) pair
        Find the fitness value
        if the new fitness value is better than the current one then
            Update the (α, β) pair and its fitness value
            Reset the counter to zero
        else
            Increment the counter
        end if
        Update the global best
    end for
    for each (α, β) pair in the scout bee phase do
        if the counter exceeds the limit then
            Re-initialize the (α, β) pair
            Determine the fitness value
            Update the global best
        else
            Go to the next (α, β) pair
        end if
    end for
end while

At the outset, each data set passes through the preprocessing step and enters the ABC-optimization procedure. This procedure consists of sequential phases, which work iteratively and produce the optimal (α, β) pairs. The phases start with a predefined set of (α, β) pairs. All these threshold pairs are initialized using Equation (46). For each (α, β) pair, the data is trisected as explained in Section 4.1, and the objective function is formulated from the trisection as explained in Section 4.2. From the objective function, the fitness value is determined using Equation (47). The best fitness value and the corresponding (α, β) pair are stored as the global best. In the next phase, the employed bee phase, a new (α, β) pair is explored for each (α, β) pair using Equation (48). Then the fitness value is computed, and the old threshold pair is replaced if the new threshold pair offers a better fitness value. At the end of this phase, the best fitness value is preserved along with its (α, β) pair. Next, the onlooker bee phase selects each (α, β) pair based on the FP, which is determined via Equation (49). This phase exploits new (α, β) pairs and follows the same procedure as the employed bee phase to compare and update the (α, β) pairs and the corresponding fitness values. Consequently, this phase determines the global best. The last phase, the scout bee phase, replaces the non-updated (α, β) pairs based on the limit, which is determined by Equation (50). For readability, Algorithm 2 displays the entire workflow.
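The workflow can be condensed into a short Python sketch. It is a simplified illustration and not the implementation used in the experiments. Several parts are assumptions: `toy_objective` is a hypothetical stand-in for OF, which in the actual method requires trisecting a data set; the predefined pairs are used directly as the starting population; the partner pair k is drawn at random with k ≠ n; out-of-range values are clamped; and the scout bees re-initialize a pair uniformly inside the limits.

```python
import random

ALPHA_LIMITS = (0.5, 1.0)
BETA_LIMITS = (0.0, 0.49)  # assumed bound that keeps beta strictly below 0.5


def toy_objective(alpha, beta):
    """Hypothetical stand-in for OF with its minimum at (0.75, 0.15)."""
    return (alpha - 0.75) ** 2 + (beta - 0.15) ** 2


def fitness(of_value):
    return 1.0 / (1.0 + of_value) if of_value >= 0 else 1.0 + abs(of_value)


def clamp(value, limits):
    return min(max(value, limits[0]), limits[1])


def neighbour(pairs, n, rng):
    """Neighbourhood move of pair n relative to a random partner k != n."""
    k = rng.choice([i for i in range(len(pairs)) if i != n])
    (a_n, b_n), (a_k, b_k) = pairs[n], pairs[k]
    a_e = abs(a_n + rng.uniform(-1.0, 0.0) * (a_n - a_k))
    b_e = abs(b_n + rng.uniform(-1.0, 0.3) * (b_n - b_k))
    return clamp(a_e, ALPHA_LIMITS), clamp(b_e, BETA_LIMITS)


def abc_optimize(initial_pairs, objective, iterations=25, seed=0):
    """Return (best fitness, best (alpha, beta) pair) found by the sketch."""
    rng = random.Random(seed)
    pairs = list(initial_pairs)
    fits = [fitness(objective(a, b)) for a, b in pairs]
    counters = [0] * len(pairs)
    limit = len(pairs) * 2 // 2  # l = m * d / 2 with d = 2
    best = max(zip(fits, pairs))

    def try_update(n):
        # Greedy selection shared by the employed and onlooker bee phases.
        nonlocal best
        candidate = neighbour(pairs, n, rng)
        candidate_fit = fitness(objective(*candidate))
        if candidate_fit > fits[n]:
            pairs[n], fits[n], counters[n] = candidate, candidate_fit, 0
        else:
            counters[n] += 1
        best = max(best, (fits[n], pairs[n]))

    for _ in range(iterations):
        for n in range(len(pairs)):  # employed bee phase
            try_update(n)
        # Onlooker bee phase: ranking by fitness equals ranking by FP.
        for n in sorted(range(len(pairs)), key=lambda j: fits[j], reverse=True):
            try_update(n)
        for n in range(len(pairs)):  # scout bee phase
            if counters[n] > limit:
                pairs[n] = (rng.uniform(*ALPHA_LIMITS), rng.uniform(*BETA_LIMITS))
                fits[n] = fitness(objective(*pairs[n]))
                counters[n] = 0
                best = max(best, (fits[n], pairs[n]))
    return best


# Illustrative starting population of five threshold pairs.
INITIAL_PAIRS = [(1.0, 0.4), (0.9, 0.3), (0.8, 0.2), (0.7, 0.1), (0.5, 0.0)]
best_fit, best_pair = abc_optimize(INITIAL_PAIRS, toy_objective)
print(best_fit, best_pair)
```

Running the sketch with different seeds generally returns different threshold pairs of similar fitness, which mirrors the behaviour of stochastic search: each run may end at a different near-optimal (α, β) pair.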

5 Experimental results

This work proposes a method for optimal trisection with minimum uncertainty and impurity and with maximum correlation between the classification and the trisection. To substantiate the proposed method, the experiment is conducted on different data sets from the UCI machine learning repository [28]. They are Monks1 (M1), Monks2 (M2), Monks3 (M3), Credit Approval (CA), Congressional Voting Records (CVR), Blood Transfusion Service Center (BTSC) [59], Tic-Tac-Toe Endgame Data Set (TTTE) and Immunotherapy (IMT) [60, 61]. The details of the data sets are shown in Table 1.

Table 1
Description of data sets

Dataset	Number of conditional attributes	Number of instances	Size of equivalence class structure
M1	6	124	36
M2	6	169	18
M3	6	122	12
CA	15	690	546
CVR	16	435	64
BTSC	4	748	40
TTTE	9	958	198
IMT	7	90	84

Table 2

Initial α and β values

Candidate	α	β
1	1.0	0.4
2	0.9	0.3
3	0.8	0.2
4	0.7	0.1
5	0.6	0.0
6	0.5

Before the data sets are processed using Algorithm 2, the necessary preprocessing techniques are applied to them. In this work, discretization and numeric-to-nominal conversion are applied to the data sets with the help of WEKA [62]. In addition, features are selected using DFRM with the half selection strategy [32].

Some parameters affect the working of the Multi-Objective Optimization Method (MOOM), namely the number of iterations, the size of the population and the limit l. A maximum of 25 iterations is set for the algorithm, since further iterations did not improve the results for the specified data sets. The population size and l are both 5. The initial population of threshold pairs for the ABC-based framework is displayed in Table 2. To keep the population size at five, the initial population is created without the α value 0.6. The limit l is calculated as per Equation (50).

The results of the MOOM method are compared with the results obtained through the BM, as specified in Section 2.3, and the GTRS method. For each data set, Table 3 shows the experimental results in terms of the three quality levels using BM, GTRS and MOOM. Tables 4 and 5 display the corresponding (α, β) levels for BM and GTRS, respectively, and Table 6 shows the optimal (α, β) levels for MOOM. Since ABC-based optimization returns different optimal threshold values in different runs, Table 6 displays, for brevity, ten different optimal (α, β) values for each data set. The region sizes are shown in Table 7. Finally, the evaluation results for BM, GTRS and MOOM are displayed in Tables 8, 9 and 10, respectively. If the region size is zero, the corresponding result is denoted by "-".

Table 3

Experimental results of three qualities

M1	BM	GTRS	MOOM
Entropy	0.6172	0.6454	0.6220
Impurity	0.2902	0.2919	0.2902
Correlation	20.9028	10.9005	17.9979
M2	BM	GTRS	MOOM
Entropy	0.8398	0.8587	0.8398
Impurity	0.3994	0.4084	0.3994
Correlation	10.3647	0.3531	8.1206
M3	BM	GTRS	MOOM
Entropy	0.9556	0.9583	0.9579
Impurity	0.4714	0.472	0.4727
Correlation	2.3893	0.3914	1.8137
CA	BM	GTRS	MOOM
Entropy	0.0992	0.1419	0.1170
Impurity	0.0461	0.0506	0.0468
Correlation	293.4945	278.9336	277.4198
CVR	BM	GTRS	MOOM
Entropy	0.1365	0.1477	0.1365
Impurity	0.0545	0.0556	0.055
Correlation	179.489	140.1851	170.0120
BTSC	BM	GTRS	MOOM
Entropy	0.7089	0.7319	0.7089
Impurity	0.321	0.33	0.321
Correlation	38.8099	1.8317	30.4447
TTTE	BM	GTRS	MOOM
Entropy	0.4692	0.5311	0.4864
Impurity	0.2115	0.2215	0.2188
Correlation	185.9438	181.2411	151.6593
IMT	BM	GTRS	MOOM
Entropy	0.0222	0.0222	0.0222
Impurity	0.0111	0.0111	0.0111
Correlation	28.0166	28.0166	28.0166

Table 4

Different optimal threshold levels of BM

BM	Entropy		Impurity		Correlation
Datasets	α	β	α	β	α	β
M1	0.8, 0.9, 1.0	0.3	0.8, 0.9, 1.0	0.4	0.6	0.4
M2	0.5	0.2	0.5	0.2	0.7, 0.8, 0.9, 1.0	0.4
M3	0.6	0.2, 0.3	0.6	0.2, 0.3	0.5	0.0, 0.1
CA	0.9, 1.0	0.0, 0.1	0.7, 0.8	0.2, 0.3	0.6	0.4
CVR	0.7, 0.8	0.0, 0.1	0.7, 0.8	0.2	0.5	0.4
BTSC	0.5	0.2	0.5	0.2	0.6, 0.7, 0.8, 0.9, 1.0	0.2
TTTE	1.0, 0.9	0.4	0.8	0.4	0.7	0.4
IMT	0.6, 0.7, 0.8, 0.9, 1.0	0.0, 0.1, 0.2, 0.3, 0.4	0.6, 0.7, 0.8, 0.9, 1.0	0.0, 0.1, 0.2, 0.3, 0.4	0.6, 0.7, 0.8, 0.9, 1.0	0.0, 0.1, 0.2, 0.3, 0.4

6 Discussion

This section observes the results of each data set and confirms the advantage of using multi-objective optimization of different qualities.

Table 5
Different optimal threshold levels of GTRS

GTRS	Entropy	Impurity	Correlation
Datasets	α	β	α	β	α	β
M1	0.7	0.4	0.7	0.4	1.0, 0.95, 0.9, 0.85, 0.8	0.0, 0.05, 0.1
M2	0.6	0.3	0.6	0.3	1.0	0.0
M3	0.6	0.4, 0.45	0.6	0.4, 0.45	0.9	0.15
CA	0.7	0.4	0.65	0.4	0.85, 0.9	0.2
CVR	0.8	0.2, 0.25	0.7, 0.75, 0.8	0.3	1.0	0.0
BTSC	0.6, 0.65, 0.7	0.35, 0.4	0.6, 0.65, 0.7	0.35, 0.4	1.0	0.0
TTTE	0.65	0.3	0.65	0.3	0.55	0.4
IMT	{1.0, 0.95, 0.9},	{0.0, 0.05, 0.1},	{1.0, 0.95, 0.9},	{0.0, 0.05, 0.1},	{1.0, 0.95, 0.9},	{0.0, 0.05, 0.1},
	{0.9, 0.85, 0.8},	{0.1, 0.15, 0.2},	{0.9, 0.85, 0.8},	{0.1, 0.15, 0.2},	{0.9, 0.85, 0.8},	{0.1, 0.15, 0.2},
	{0.8, 0.75, 0.7},	{0.2, 0.25, 0.3},	{0.8, 0.75, 0.7},	{0.2, 0.25, 0.3},	{0.8, 0.75, 0.7},	{0.2, 0.25, 0.3},
	{0.7, 0.65, 0.6},	{0.3, 0.35, 0.4},	{0.7, 0.65, 0.6},	{0.3, 0.35, 0.4},	{0.7, 0.65, 0.6},	{0.3, 0.35, 0.4},
	{6}	{0.4, 0.45}	{6}	{0.4, 0.45}	{6}	{0.4, 0.45}

Table 6

Optimal (α, β) values obtained from MOOM

MOOM
M1	M2	M3	CA	CVR
(0.8, 0.41)	(0.5, 0.17)	(0.5, 0.18)	(0.67, 0.06)	(0.79, 0.08)
(0.79, 0.44)	(0.5, 0.19)	(0.5, 0.3)	(0.72, 0.14)	(0.74, 0.07)
(0.83, 0.41)	(0.5, 0.27)	(0.5, 0.16)	(0.7, 0.17)	(0.73, 0.07)
(0.94, 0.4)	(0.5, 0.28)	(0.5, 0.15)	(0.78, 0.03)	(0.68, 0.14)
(0.85, 0.42)	(0.5, 0.2)	(0.5, 0.21)	(0.78, 0.07)	(0.71, 0.05)
(0.87, 0.42)	(0.5, 0.22)	(0.5, 0.33)	(0.69, 0.03)	(0.84, 0.02)
(0.79, 0.4)	(0.5, 0.29)	(0.5, 0.3)	(0.8, 0.4)	(0.67, 0.07)
(0.77, 0.41)	(0.5, 0.21)	(0.5, 0.15)	(0.74, 0.08)	(0.73, 0.18)
(0.87, 0.49)	(0.5, 0.25)	(0.5, 0.32)	(0.78, 0.17)	(0.76, 0.15)
(0.86, 0.4)	(0.5, 0.19)	(0.5, 0.24)	(0.75, 0.14)	(0.76, 0.14)
BTSC	TTTE	IMT
(0.5, 0.24)	(0.67, 0.01)	(0.72, 0.32)
(0.5, 0.28)	(0.74, 0.08)	(0.77, 0.07)
(0.5, 0.23)	(0.67, 0.14)	(0.84, 0.28)
(0.53, 0.24)	(0.69, 0.04)	(0.69, 0.05)
(0.54, 0.2)	(0.67, 0.00)	(0.54, 0.32)
(0.5, 0.27)	(0.67, 0.19)	(0.8, 0.31)
(0.5, 0.29)	(0.68, 0.17)	(0.8, 0.31)
(0.5, 0.21)	(0.75, 0.21)	(0.71, 0.17)
(0.54, 0.23)	(0.68, 0.05)	(0.89, 0.28)
(0.5, 0.28)	(0.74, 0.22)	(0.85, 0.22)

Table 7

The size of the regions

BM	Entropy			Impurity			Correlation
Datasets	POS	NEG	BND	POS	NEG	BND	POS	NEG	BND
M1	24.19	31.45	44.35	24.19	50.00	25.81	37.10	50.00	12.90
M2	37.87	25.44	36.69	37.87	25.44	36.69	0.0	56.80	43.20
M3	16.39	5.74	77.87	16.39	5.74	77.87	53.28	0.0	46.72
CA	49.57	40.29	10.14	52.61	41.01	6.38	54.64	42.17	3.19
CVR	36.32	56.09	7.59	36.32	57.24	6.44	39.77	60.23	0.0
BTSC	7.22	54.68	38.10	7.22	54.68	38.10	0.0	54.68	45.32
TTTE	37.79	27.56	34.66	48.23	27.56	24.22	51.57	27.56	20.88
IMT	77.78	20.00	2.22	77.78	20.00	2.22	77.78	20.00	2.22
GTRS	Entropy			Impurity			Correlation
Datasets	POS	NEG	BND	POS	NEG	BND	POS	NEG	BND
M1	27.42	50.00	22.58	27.42	50.0	22.58	24.19	11.29	64.52
M2	22.49	43.20	34.32	22.49	43.20	34.32	0.0	4.14	95.86
M3	16.39	27.05	56.56	16.39	27.05	56.56	0.0	5.74	94.26
CA	52.61	42.17	5.21	53.91	42.17	3.91	49.57	41.01	9.42
CVR	36.32	57.24	6.44	36.32	58.85	4.83	8.74	35.17	56.09
BTSC	0.0	74.06	25.94	0.0	74.06	25.94	0.0	6.28	93.72
TTTE	60.33	21.29	18.37	60.33	21.29	18.37	67.43	27.58	5.01
IMT	77.78	20.00	2.22	77.78	20.00	2.22	77.78	20.00	2.22
MOOM
Datasets	POS	NEG	BND
M1	24.19	50.00	25.81
M2	37.87	25.44	36.69
M3	53.28	5.74	40.98
CA	52.61	42.17	5.22
CVR	36.32	56.09	7.59
BTSC	7.22	54.68	38.10
TTTE	51.57	14.20	34.24
IMT	77.78	20.00	2.22

Table 8

Experimental evaluation results of BM

Entropy
Dataset	CAR	IAE	CRR	IRE	NPE	NNE	NCR	CR	ATR
M1	1.0	0.0	0.8718	0.1282	0.4909	0.5091	0.4435	0.5565	0.9275
M2	0.5938	0.4062	0.8837	0.1163	0.3387	0.6613	0.3669	0.6331	0.7103
M3	0.7	0.3	0.8571	0.1429	0.4737	0.5263	0.7787	0.2213	0.7407
CA	1.0	0.0	1.0	0.0	0.5857	0.4143	0.1014	0.8986	1.0
CVR	0.9747	0.0253	1.0	0.0	0.4242	0.5758	0.0759	0.9241	0.99
BTSC	0.537	0.463	0.8875	0.1125	0.3614	0.6386	0.381	0.619	0.8467
TTTE	1.0	0.0	0.8636	0.1364	0.6867	0.3133	0.3466	0.6534	0.9425
IMT	1.0	0.0	1.0	0.0	0.5	0.5	0.0222	0.9778	1.0
Impurity
Dataset	CAR	IAE	CRR	IRE	NPE	NNE	NCR	CR	ATR
M1	1.0	0.0	0.7903	0.2097	0.5938	0.4062	0.2581	0.7419	0.8587
M2	0.5938	0.4062	0.8837	0.1163	0.3387	0.6613	0.3669	0.6331	0.7103
M3	0.7	0.3	0.8571	0.1429	0.4737	0.5263	0.7787	0.2213	0.7407
CA	0.989	0.011	0.9965	0.0035	0.5227	0.4773	0.0638	0.9362	0.9923
CVR	0.9747	0.0253	0.996	0.004	0.4643	0.5357	0.0644	0.9356	0.9877
BTSC	0.537	0.463	0.8875	0.1125	0.3614	0.6386	0.381	0.619	0.8467
TTTE	0.9654	0.0346	0.8636	0.1364	0.6207	0.3793	0.2422	0.7578	0.9284
IMT	1.0	0.0	1.0	0.0	0.5	0.5	0.0222	0.9778	1.0
Correlation
Dataset	CAR	IAE	CRR	IRE	NPE	NNE	NCR	CR	ATR
M1	1.0	0.0	1.0	0.0	0.4	0.6	0.6452	0.3548	1.0
M2	-	-	0.7708	0.2292	0.5753	0.4247	0.432	0.568	0.7708
M3	0.5846	0.4154	-	-	0.386	0.614	0.4672	0.5328	0.5846
CA	0.9761	0.0239	0.9863	0.0137	0.5	0.5	0.0319	0.9681	0.9805
CVR	0.9422	0.0578	0.9809	0.0191	-	-	0.0	1.0	0.9655
BTSC	-	-	0.8875	0.1125	0.3894	0.6106	0.4532	0.5468	0.8875
TTTE	0.9514	0.0486	0.8636	0.1364	0.6	0.4	0.2088	0.7912	0.9208
IMT	1.0	0.0	1.0	0.0	0.5	0.5	0.0222	0.9778	1.0

Table 9

Experimental evaluation results of GTRS

Entropy
Dataset	CAR	IAE	CRR	IRE	NPE	NNE	NCR	CR	ATR
M1	0.9706	0.0294	0.7903	0.2097	0.5714	0.4286	0.2258	0.7742	0.8542
M2	0.6316	0.3684	0.8082	0.1918	0.4483	0.5517	0.3432	0.6568	0.7477
M3	0.7	0.3	0.6667	0.3333	0.5072	0.4928	0.5656	0.4344	0.6792
CA	0.989	0.011	0.9863	0.0137	0.5556	0.4444	0.0522	0.9478	0.9878
CVR	0.9747	0.0253	0.996	0.004	0.4643	0.5357	0.0644	0.9356	0.9877
BTSC	-	-	0.8375	0.1625	0.4536	0.5464	0.2594	0.7406	0.8375
TTTE	0.91	0.09	0.9216	0.0784	0.4773	0.5227	0.1837	0.8163	0.913
IMT	1.0	0.0	1.0	0.0	0.5	0.5	0.0222	0.9778	1.0
Impurity
Dataset	CAR	IAE	CRR	IRE	NPE	NNE	NCR	CR	ATR
M1	0.9706	0.0294	0.7903	0.2097	0.5714	0.4286	0.2258	0.7742	0.8542
M2	0.6316	0.3684	0.8082	0.1918	0.4483	0.5517	0.3432	0.6568	0.7477
M3	0.7	0.3	0.6667	0.3333	0.5072	0.4928	0.5656	0.4344	0.6792
CA	0.9812	0.0188	0.9863	0.0137	0.5185	0.4815	0.0391	0.9609	0.9834
CVR	0.9747	0.0253	0.9883	0.0117	0.5238	0.4762	0.0483	0.9517	0.9831
BTSC	-	-	0.8375	0.1625	0.4536	0.5464	0.2594	0.7406	0.8375
TTTE	0.91	0.09	0.9216	0.0784	0.4773	0.5227	0.1837	0.8163	0.913
IMT	1.0	0.0	1.0	0.0	0.5	0.5	0.0222	0.9778	1.0
Correlation
Dataset	CAR	IAE	CRR	IRE	NPE	NNE	NCR	CR	ATR
M1	0.8913	0.1087	0.7903	0.2097	0.5	0.5	0.129	0.871	0.8333
M2	-	-	1.0	0.0	0.3951	0.6049	0.9586	0.0414	1.0
M3	-	-	0.8571	0.1429	0.513	0.487	0.9426	0.0574	0.8571
CA	1.0	0.0	0.9965	0.0035	0.6154	0.3846	0.0942	0.9058	0.9984
CVR	1.0	0.0	1.0	0.0	0.8497	0.1503	0.3517	0.6483	1.0
BTSC	-	-	1.0	0.0	0.2539	0.7461	0.9372	0.0628	1.0
TTTE	0.8762	0.1238	0.8636	0.1364	0.5	0.5	0.0501	0.9499	0.8725
IMT	1.0	0.0	1.0	0.0	0.5	0.5	0.0222	0.9778	1.0

Table 10

Experimental evaluation results of MOOM

Dataset	CAR	IAE	CRR	IRE	NPE	NNE	NCR	CR	ATR
M1	1.0	0.0	0.7903	0.2097	0.5938	0.4062	0.2581	0.7419	0.8587
M2	0.5938	0.4062	0.8837	0.1163	0.3387	0.6613	0.3669	0.6331	0.7103
M3	0.5846	0.4154	0.8571	0.1429	0.42	0.58	0.4098	0.5902	0.6111
CA	0.989	0.011	0.9863	0.0137	0.5556	0.4444	0.0522	0.9478	0.9878
CVR	0.9747	0.0253	1.0	0.0	0.4242	0.5758	0.0759	0.9241	0.99
BTSC	0.537	0.463	0.8875	0.1125	0.3614	0.6386	0.381	0.619	0.8467
TTTE	0.9514	0.0486	1.0	0.0	0.4756	0.5244	0.3424	0.6576	0.9619
IMT	1.0	0.0	1.0	0.0	0.5	0.5	0.0222	0.9778	1.0

6.1 M1

In 1991, a group of researchers compared a set of learning algorithms to find the optimal one. Three problems were derived for this comparison, and they are termed Monk’s problems. The first problem, Monks1, is based on a logical formula that produces binary outcomes for the targeted concept of the data set [28]. The proposed work focuses on the optimal trisection of M1 in terms of three qualities. In BM, each quality is optimal at different (α, β) pairs. However, when the three qualities are optimized simultaneously, MOOM is a better option than the BM and GTRS methods. The proposed method MOOM gives results that are very close to the results produced by the BM. Also, the MOOM gives a different set of (α, β) pairs at each run, as displayed in Table 6. In terms of region size, the boundary region is moderate in the MOOM. The evaluation metrics also give good results when compared to the existing methods.

6.2 M2

In this Monk’s problem, out of the six attributes, exactly two of them have the first value [28]. The optimum quality levels of the trisected M2 are shown in Table 3. The three qualities are optimum in BM; however, the MOOM produces results that are very close to BM. The threshold levels are the same for both entropy and impurity in BM as well as in GTRS. Moreover, MOOM produces threshold pairs with more variation in the β values. The size of the boundary region is the same as that obtained by BM for entropy and impurity. The evaluation metrics also give the same results.

6.3 M3

In Monk’s problem 3, the decision value is based on the binary outcome of the logical formula. Moreover, noise (5%) is added to the training set. In this experiment, all three qualities are optimum in BM, and MOOM produces results that are very close to BM. The boundary region in MOOM is smaller than in BM and GTRS. Also, MOOM gives better evaluation results. Its accuracy rate is not the highest; however, it provides the best coverage rate.

6.4 CA

This data set contains details of credit card applications and the corresponding decisions [28]. The qualities of the trisection of this data are optimum in BM, whereas the other methods also produce very close results. In BM, each quality is optimum at a different threshold pair. Meanwhile, MOOM optimizes the three qualities together at several different threshold pairs. It shows a reduced boundary region when compared to the other methods. Moreover, the evaluation results are the same for both MOOM and BM for entropy, since the MOOM gives the same entropy level as BM.

6.5 CVR

The data set includes the United States Congressional Voting Records, where each record has 16 attributes with a decision of democrat or republican [28]. Here, the quality levels as well as the threshold levels are almost the same for BM and MOOM. Therefore, the region sizes and evaluation metrics show the same results for both.

6.6 BTSC

The BTSC data set describes randomly selected blood donors from a donor database, with a decision of whether a person gave blood in March 2007 [59]. The results for this data set are the same for both MOOM and BM for entropy and impurity.

6.7 TTTE

The data set gives all possible board configurations at the end of a Tic-Tac-Toe game. The decision label states whether or not a player wins. In this experiment, the optimum qualities are produced by BM. However, MOOM shows comparatively better evaluation results than BM.

6.8 IMT

IMT is a small data set that includes different features collected for immunotherapy. The target concept is the patient’s response to the treatment [60, 61]. The peculiarity of this data set is that the three qualities are optimum at the same threshold levels. Therefore, it is easy to obtain the multi-objective optimization of these three qualities. As expected, the evaluation metrics are the same for BM, GTRS and MOOM.

In PRS, there are various optimization methods to enhance the quality of the generated trisection. This enables us to interpret the trisection from different perspectives. However, the optimum threshold level for one quality need not be the optimum level for another quality. This is where multi-objective optimization methods become important. If a problem demands the simultaneous optimization of multiple qualities of a trisection, the proposed method offers a solution. In this method, qualities such as uncertainty, impurity and correlation are optimized in two main stages, namely the WSM and the ABC-based optimization. The former derives a single OF value from these qualities, and the optimized value is determined through the iteration mechanism established by the latter. These optimized results are verified by the various evaluation metrics. The experiment is therefore best explained on the basis of these two stages.

The weighted sum method is one of the simplest methods of multi-objective optimization. Therefore, it is easy to combine different qualities and produce a single value to analyze. This value acts as the OF value for the fitness value calculation. Since the ABC algorithm provides a framework for optimizing the fitness value, it is efficiently used to find the optimal pair of thresholds. Moreover, the proposed method produces more than one optimized result, which is a capability of the ABC algorithm. Also, the exploration and exploitation mechanisms for generating new threshold pairs offer new results from a search space that extends beyond the initial one.

For each candidate in the population, the three qualities are determined. The three qualities produce results (r) within the limit 0 ≤ r ≤ 1. These quantities are multiplied by the corresponding region probabilities and summed to form a single value, the OF. That means the (α, β) values determine the OF. Based on the OF, the fitness value is calculated and updated in the different phases of the ABC algorithm. The three phases of the ABC algorithm explore different (α, β) values to optimize the fitness value and fix the thresholds at an optimum fitness value.

The simple WSM and the efficient ABC-based framework provide various benefits in the proposed work. Firstly, the presence of a common framework makes it easy to adopt different aggregation methods. It also helps to generate results from the extended search space. In addition, the ABC-based framework yields more than one optimized result, as in Table 6. Moreover, the results of MOOM are very close to the optimal results of BM. The GTRS model does not perform well in this experiment, since the predefined strategy employed and the stopping criterion are not favourable to this problem. Overall, the proposed method gives better results in terms of multi-objective optimization of PRS.

Since there is no trade-off among the qualities taken for this study, the results are not balanced as in a GTRS method. The results show that entropy and impurity are mostly optimal at the same point in the search space, whereas the optimal correlation is far from this point. Therefore, at the time of decision making, entropy and impurity give a better result than the correlation. Another observation is related to the ABC-based framework. Classical ABC algorithms usually face the problem of local optimization. Consequently, the algorithm occasionally results in sub-optimal qualities. There are some global optimization methods for the ABC algorithm [63–67]. For brevity, the proposed work concentrates on the benefits of the ABC-based framework.

7 Conclusion

There are different models under PRS for optimizing qualities individually, such as cost, entropy, impurity, correlation and variance. However, experiments have rarely handled the optimization of more than one of these qualities from an application viewpoint. This paper addresses this problem with a hybrid model that involves WSM and an ABC-based framework. The proposed method optimizes entropy, impurity and correlation simultaneously. The WSM derives the information about the three qualities, and the ABC-based framework optimizes the derived information. The result is compared with the BM and GTRS. The output of the proposed method is more efficient in terms of optimal qualities, multiple optimum (α, β) pairs, reduced size of boundary regions and better evaluation results. However, the actual optimal point of correlation is far from that of entropy and impurity. Also, the proposed method faces the local optimization problem, which sometimes results in suboptimal qualities. Future work will concentrate on a better aggregation method than WSM to synthesize information from the three qualities. As a further direction, the results can be improved by adopting a global optimization framework to avoid being trapped in a local optimum.

Declaration

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


Table 1
Description of data sets

| Dataset | Number of conditional attributes | Number of instances | Size of equivalence class structure |
|---|---|---|---|
| M1 | 6 | 124 | 36 |
| M2 | 6 | 169 | 18 |
| M3 | 6 | 122 | 12 |
| CA | 15 | 690 | 546 |
| CVR | 16 | 435 | 64 |
| BTSC | 4 | 748 | 40 |
| TTTE | 9 | 958 | 198 |
| IMT | 7 | 90 | 84 |