New approaches to intuitionistic fuzzy-rough attribute reduction

Abstract

Advances in computing have led to the production of huge amounts of structured and unstructured data. Such high-dimensional data is complex to process, and feature selection is one of the most widely used techniques for preprocessing it in predictive analytics. Rough set based feature selection handles vagueness in data and works well on discrete data, but it struggles in the continuous case because it requires discretization, which leads to information loss. Several authors have addressed this problem with fuzzy rough set and intuitionistic fuzzy rough set based approaches to feature selection. An intuitionistic fuzzy set has certain advantages over a traditional fuzzy set: it expresses the underlying information better and can describe subtle ambiguities in the uncertainty of the objective world. These advantages arise because positive, negative and hesitancy degrees are considered simultaneously for an object's membership of a set. In this paper, three novel approaches to feature reduction based on intuitionistic fuzzy rough sets are presented. First, a new intuitionistic fuzzy rough set model is established by defining a pair of lower and upper approximations. Then, three new approaches to feature selection are introduced, based on the degree of dependency computed using the score function, the membership grade and the cardinality of intuitionistic fuzzy numbers. Moreover, the basic results on lower and upper approximations from rough set theory are extended to intuitionistic fuzzy rough sets, and a suitable algorithm based on the proposed approaches is given. Finally, the proposed algorithm is applied to an arbitrary example data set and compared with the previous fuzzy rough set based technique. On this example, the proposed algorithm selects fewer features.

Keywords

Rough set, fuzzy-rough set, intuitionistic fuzzy-rough set, score function, degree of dependency, T-equivalence relation

1 Introduction

Feature selection, or feature reduction, is one of the key issues encountered in data mining, signal processing and bioinformatics [11, 30]. Its main objective is to retain the most informative features required for pattern recognition and to reduce the dimensionality of the feature space so that low-complexity algorithms can be formulated for efficient classification [10]. Feature selection is the problem of identifying those input features that are most predictive of a given outcome. Unlike other feature reduction methods [13], feature selectors preserve the original meaning of the features after reduction. Feature selection has applications in pattern recognition, which often involves data sets containing a very large number of features.

Although it might be expected that including a large number of features would increase the likelihood of encoding enough information to differentiate between classes, high-dimensional data sets in practice increase the chance that a data mining algorithm will find spurious patterns that are not valid in general [21, 22]. Almost all techniques require some degree of reduction in order to cope with huge amounts of data, so an efficient and effective reduction method is necessary.

Pattern processing with insufficient, incomplete or uncertain knowledge is a major research focus in computational intelligence and the cognitive sciences. Rough set theory, proposed by Pawlak in 1982 [27], provides one of the most natural approaches for modeling knowledge [28]. It is an extension of conventional crisp set theory. The core of rough set theory is the “indiscernibility relation”, which is an equivalence relation. With its help, the universe of discourse can be partitioned into indiscernibility classes, which are the building blocks of the lower and upper approximations. Most information systems consist of real-valued data, which rough set theory cannot handle directly. Several discretization methods have therefore been applied to transform the data into discrete form, which may result in information loss. Moreover, rough set theory offers no way of handling noisy data directly.

Fuzzy rough feature selection can reduce discrete or real-valued noisy data (or a combination of both) without requiring user-supplied information. Moreover, the method can be applied to data with continuous or nominal decision features, and so it can be used for regression as well as classification problems.

As crisp equivalence classes are core to rough set, fuzzy rough sets are based on fuzzy equivalence classes. Fuzzy rough set was proposed by generalization of crisp rough set [25]. Dubois and Prade [12] combined fuzzy set (proposed by Zadeh [35]) and rough set (proposed by Pawlak). Fuzzy rough set was generalized by Richard Jensen et al. [17, 18] for feature selection.

In an intuitionistic fuzzy set [1, 2], an extension of Zadeh’s fuzzy set, positive, negative and hesitancy degrees are considered simultaneously for an object's relation to a set. It therefore has a stronger ability to deal with information systems and gives a better picture of the subtle ambiguities of the objective world. The intuitionistic fuzzy set concept has already been used in pattern recognition and decision making [3, 32–34, 36].

Jena et al. [16], Chakrabarty et al. [4] and Nanda et al. [26] combined intuitionistic fuzzy set and rough set and proposed “Intuitionistic Fuzzy Rough Set” by using different approaches. Coker [8] added that a fuzzy rough set is in fact an intuitionistic L-fuzzy set. However, very few researchers are working in the area of intuitionistic fuzzy rough set based feature selection. Lu, Lei, & Hua [24] proposed the genetic algorithm for attribute reduction of intuitionistic fuzzy information system (IFIS). Chen & Yang [5] introduced a new attribute reduction method by combining intuitionistic fuzzy and rough sets with information entropy. Huang, Li, & Wei [15] presented an intuitionistic fuzzy rough set based attribute reduction model by using the concept of distance measure. Esmail, Maryam, & Habibolla [14] designed their method of attribute reduction by considering structure of intuitionistic fuzzy rough set model and its properties along with rule extraction. Z. Zhang [37] proposed an attribute reduction method by using the concept of discernibility matrix. However, none of the proposed approaches for feature selection is based on dependency function by considering similarity between two objects.

In this paper, we propose a novel feature selection method by combining intuitionistic fuzzy set and rough set. First, a new intuitionistic fuzzy rough set model is established by defining a pair of lower and upper approximations based on T-equivalence relation. Second, we define three new approaches to calculate degree of dependency based on score function, membership grade and cardinality of intuitionistic fuzzy numbers. Third, validity of the model analogous to rough set theory is established. Fourth, a suitable algorithm for feature selection based on our approaches is developed. Finally, these approaches are applied on an arbitrary example dataset. We show that these techniques perform better than fuzzy rough set based approach by calculating the reduct.

The rest of the paper is organized as follows. Section 2 briefly introduces preliminary concepts. In Section 3, we define a novel intuitionistic fuzzy rough set model, validate it, and introduce three novel approaches to feature selection based on the dependency function. In Section 4, an algorithm based on our approaches is presented. In Section 5, an arbitrary example of a fuzzy information system is taken and the fuzzy rough set based feature selection technique is applied to it; this information system is then converted into an intuitionistic fuzzy information system using the concept of Jurio et al. [20], and our proposed methods are applied to calculate the reduct set. Section 6 concludes the work.

2 Preliminaries

In this section, we discuss some basic definitions regarding intuitionistic fuzzy number (IFN) and IFIS.

Definition 2.1. An ordered pair 〈m, n〉 is said to be an intuitionistic fuzzy value if 0 ≤ m, n ≤ 1 and 0 ≤ m + n ≤ 1. Let U be a universe of discourse, that is, a finite collection of objects, and let L be an intuitionistic fuzzy set in U such that L = { 〈x, m_L (x), n_L (x)〉 | x ∈ U }, where m_L : U → [0, 1] and n_L : U → [0, 1] satisfy 0 ≤ m_L (x) + n_L (x) ≤ 1, ∀ x ∈ U. Then m_L (x) and n_L (x) are respectively known as the degree of membership and the degree of non-membership of the element x ∈ U to L.

The cardinality of L is defined by $| L | = \sum_{x \in U} \frac{1 + m_{L} (x) - n_{L} (x)}{2}$ .

Definition 2.2. Let 〈m_i, n_i〉 be two intuitionistic fuzzy values for i = 1, 2, then

〈m₁, n₁ 〉 = 〈 m₂, n₂ 〉 ⇔ m₁ = m₂ ∧ n₁ = n₂

〈m₁, n₁〉 ∩ 〈 m₂, n₂ 〉 = 〈 min { m₁, m₂ }, max { n₁, n₂ } 〉

〈m₁, n₁〉 ∪ 〈 m₂, n₂ 〉 = 〈 max { m₁, m₂ }, min { n₁, n₂ } 〉
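As a minimal illustration of Definitions 2.1 and 2.2, the following Python sketch represents an intuitionistic fuzzy value as an `(m, n)` tuple. The names `is_ifv`, `ifv_meet` and `ifv_join` are hypothetical helpers introduced here for illustration only; they do not come from the original formulation.

```python
from typing import Tuple

# An intuitionistic fuzzy value <m, n>: (membership, non-membership).
IFV = Tuple[float, float]


def is_ifv(v: IFV) -> bool:
    """Check the constraints of Definition 2.1: 0 <= m, n <= 1 and m + n <= 1."""
    m, n = v
    return 0.0 <= m <= 1.0 and 0.0 <= n <= 1.0 and m + n <= 1.0


def ifv_meet(a: IFV, b: IFV) -> IFV:
    """Intersection of Definition 2.2: <min of memberships, max of non-memberships>."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def ifv_join(a: IFV, b: IFV) -> IFV:
    """Union of Definition 2.2: <max of memberships, min of non-memberships>."""
    return (max(a[0], b[0]), min(a[1], b[1]))


a, b = (0.48, 0.32), (0.32, 0.48)
print(ifv_meet(a, b))  # (0.32, 0.48)
print(ifv_join(a, b))  # (0.48, 0.32)
```

Because the meet and join only select existing components, no floating-point arithmetic is involved and the results are exact copies of the inputs.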

Definition 2.3. An intuitionistic fuzzy information system can be defined as a quadruple IFIS = (U, J ∪ K, S, T), where U (≠ φ) is a finite collection of objects, called the universe of discourse; J (≠ φ) and K are finite sets of conditional and decision features such that J ∩ K = φ; S is the collection of all intuitionistic fuzzy values such that S = S₁ ∪ S₂, where S₁ and S₂ are the domains of the conditional and decision features; and T is the information function, defined as T : U × (J ∪ K) → S such that T (x, j) ∈ S_j, ∀ j ∈ J, S_j ⊆ S₁ and T (x, d) ∈ S₂ for K = { d }, where T (x, j) and T (x, d) are intuitionistic fuzzy values.

Definition 2.4. Let τ =〈 m, n 〉 be an intuitionistic fuzzy value, then z (τ) = m - n is called the score of τ.
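The score of Definition 2.4 and the cardinality given after Definition 2.1 can be sketched in Python as follows. The function names `score` and `cardinality` and the sample values are assumptions made for this illustration.

```python
from typing import Iterable, Tuple

IFV = Tuple[float, float]  # (membership m, non-membership n)


def score(v: IFV) -> float:
    """Score of an intuitionistic fuzzy value: z(<m, n>) = m - n."""
    return v[0] - v[1]


def cardinality(values: Iterable[IFV]) -> float:
    """Cardinality of an intuitionistic fuzzy set: sum of (1 + m - n) / 2."""
    return sum((1.0 + m - n) / 2.0 for m, n in values)


# Illustrative set with two elements (values chosen for the example only).
sample = [(0.32, 0.48), (0.80, 0.00)]
print([round(score(v), 2) for v in sample])
print(round(cardinality(sample), 2))
```

Note that each term of the cardinality equals (1 + z(τ)) / 2, so the cardinality is an affine rescaling of the summed scores into the range [0, |U|].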

3 Intuitionistic fuzzy rough set (IFRS) based approach for feature selection

In 1998, Chakrabarty et al. [4] proposed a method to construct an IFRS (U, V) of a rough set (A, B), where U and V are both intuitionistic fuzzy sets in X (non-empty set of objects) such that U ⊆ V, i. e. μ_U ≤ μ_V and υ_U ≥ υ_V. In this case, lower approximation U and upper approximation V are both intuitionistic fuzzy sets.

In 2001, Samanta and Mondal [31] proposed their method to define intuitionistic fuzzy rough set. They defined a couple (U, V) as intuitionistic fuzzy rough set such that U and V are both fuzzy rough sets (as proposed by Nanda and Majumdar [26]) and U ⊆ Complement (V). From [26], it is obvious that IFRS is a generalization of an intuitionistic fuzzy set, in which membership and non-membership functions are fuzzy rough sets.

In 2002, Rizvi et al. [29] reported their proposal as rough intuitionistic fuzzy set, which also contains hesitation margin on lower and upper approximations.

In 2003, Cornelis et al. [9] defined the lower and upper approximations of X ⊆ U (universe of discourse) as follows: $\begin{aligned} R ↓_{I} X (y) &= \inf_{x \in U} I (R (x, y), X (x)) \\ R ↑_{T} X (y) &= \sup_{x \in U} T (R (x, y), X (x)), \quad \forall y \in U, \end{aligned}$ where T is an intuitionistic fuzzy triangular norm, I is an intuitionistic fuzzy implicator and R is an intuitionistic fuzzy T-equivalence relation in U.

None of the definitions above considers the memberships and non-memberships of individual objects to the approximations. Following the literature [12, 19], we can define intuitionistic fuzzy lower and upper approximations that take individual objects into account as follows.

Let IFIS be an intuitionistic fuzzy information system as defined in Definition 2.3, and let P ⊆ J and X ⊆ U. We can define the lower and upper approximations of X over a set of features P by: $\begin{aligned} \underline{{approx}_{P}} (X) &= 〈 μ_{\underline{P} X} (x), υ_{\underline{P} X} (x) 〉 \\ &= \min (〈 μ (x), υ (x) 〉, \inf_{y \in U} I (R (x, y), X (y))), \quad \forall x \in U \\ \overline{{approx}_{P}} (X) &= 〈 μ_{\overline{P} X} (x), υ_{\overline{P} X} (x) 〉 \\ &= \min (〈 μ (x), υ (x) 〉, \sup_{y \in U} T (R (x, y), X (y))), \quad \forall x \in U . \end{aligned}$

We take the intuitionistic fuzzy triangular norm T_w and the intuitionistic fuzzy implicator I_w as follows [9], for x = 〈x₁, x₂〉 and y = 〈y₁, y₂〉: $\begin{aligned} T_{w} (x, y) &= 〈 \max (0, x_{1} + y_{1} - 1), \min (1, x_{2} + y_{2}) 〉 \\ I_{w} (x, y) &= 〈 \min (1, 1 + y_{1} - x_{1}, 1 + x_{2} - y_{2}), \max (0, y_{2} - x_{2}) 〉 \end{aligned}$
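The two operators can be written directly in Python. This is only a sketch of the formulas for T_w and I_w given above; the function names `t_w` and `i_w` and the tuple representation `(membership, non-membership)` are assumptions made for illustration.

```python
from typing import Tuple

IFV = Tuple[float, float]  # (membership, non-membership)


def t_w(x: IFV, y: IFV) -> IFV:
    """Triangular norm T_w: <max(0, x1 + y1 - 1), min(1, x2 + y2)>."""
    return (max(0.0, x[0] + y[0] - 1.0), min(1.0, x[1] + y[1]))


def i_w(x: IFV, y: IFV) -> IFV:
    """Implicator I_w: <min(1, 1 + y1 - x1, 1 + x2 - y2), max(0, y2 - x2)>."""
    return (min(1.0, 1.0 + y[0] - x[0], 1.0 + x[1] - y[1]),
            max(0.0, y[1] - x[1]))


full = (1.0, 0.0)    # crisp "belongs to the set"
empty = (0.0, 1.0)   # crisp "does not belong to the set"
value = (0.32, 0.48)

# With a crisp first argument the operators reduce to simple cases.
print([round(c, 2) for c in i_w(full, value)])   # implication from <1, 0>
print([round(c, 2) for c in i_w(empty, value)])  # implication from <0, 1>
print([round(c, 2) for c in t_w(full, value)])   # t-norm with <1, 0>
```

Results are rounded before printing because expressions such as `1.0 + 0.32 - 1.0` are subject to ordinary floating-point rounding.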

Now, we can redefine the lower and upper approximations as follows: $\begin{aligned} \underline{{approx}_{P}} (X) &= 〈 μ_{\underline{P} X} (x), υ_{\underline{P} X} (x) 〉 \\ &= \min (〈 μ (x), υ (x) 〉, \inf_{y \in U} 〈 \min (1, 1 + μ (y) - μ_{X} (y), 1 + υ_{X} (y) - υ (y)), \max (0, υ (y) - υ_{X} (y)) 〉) \\ \overline{{approx}_{P}} (X) &= 〈 μ_{\overline{P} X} (x), υ_{\overline{P} X} (x) 〉 \\ &= \min (〈 μ (x), υ (x) 〉, \sup_{y \in U} 〈 \max (0, μ_{X} (y) + μ (y) - 1), \min (1, υ_{X} (y) + υ (y)) 〉) \end{aligned}$

Now, the intuitionistic fuzzy positive region can be defined by: $〈 μ_{{POS}''_{P} (Q)} (x), υ_{{POS}''_{P} (Q)} (x) 〉 = \sup_{X \in U / Q} 〈 μ_{\underline{P} X} (x), υ_{\underline{P} X} (x) 〉, \quad \forall x \in U,$ where Q is the set of decision attributes. An object fails to belong to the positive region only if the equivalence class to which it belongs is not a constituent of the positive region.

Therefore, we can define the degree of dependency in three ways, based on membership grade, score function and cardinality:

$γ_{P}^{m} (Q) = \frac{\sum_{x \in U} μ_{{POS}''_{P} (Q)} (x)}{| U |}$

$γ_{P}^{s} (Q) = \frac{\sum_{x \in U} z (〈 μ_{{POS}''_{P} (Q)} (x), υ_{{POS}''_{P} (Q)} (x) 〉)}{| U |}$

$γ_{P}^{c} (Q) = \frac{\sum_{x \in U} \frac{1 + μ_{{POS}''_{P} (Q)} (x) - υ_{{POS}''_{P} (Q)} (x)}{2}}{| U |}$

Here z is the score function of Definition 2.4.
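The three dependency degrees can be computed from a positive region as in the following Python sketch. It assumes the positive region is supplied as a list of `(membership, non-membership)` tuples, one per object of U; the function names and the illustrative input are assumptions of this sketch, not part of the original definitions.

```python
from typing import List, Tuple

IFV = Tuple[float, float]  # (membership, non-membership)


def gamma_membership(pos: List[IFV]) -> float:
    """Dependency from membership grades: sum of memberships / |U|."""
    return sum(m for m, _ in pos) / len(pos)


def gamma_score(pos: List[IFV]) -> float:
    """Dependency from scores: sum of (m - n) / |U|."""
    return sum(m - n for m, n in pos) / len(pos)


def gamma_cardinality(pos: List[IFV]) -> float:
    """Dependency from cardinality: sum of (1 + m - n) / 2, divided by |U|."""
    return sum((1.0 + m - n) / 2.0 for m, n in pos) / len(pos)


# Illustrative positive region for six objects:
# five objects at <0.32, 0.48> and one at <0.16, 0.64>.
pos = [(0.32, 0.48)] * 5 + [(0.16, 0.64)]

# Print the numerators so they can be read as "value / 6".
print(round(gamma_membership(pos) * 6, 2))   # 1.76
print(round(gamma_score(pos) * 6, 2))        # -1.28
print(round(gamma_cardinality(pos) * 6, 2))  # 2.36
```

The score-based degree can be negative, whereas the membership-based and cardinality-based degrees always lie in [0, 1].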

At each step, one feature is added to the feature subset and the degree of dependency is recalculated. When the degree of dependency no longer increases, the process stops and the required reduct is obtained.

Let (U, C ∪ D, V, f) be an intuitionistic fuzzy information system (IFIS), where U (≠ φ) is the universe of discourse, C and D are finite sets of conditional and decision features respectively, V is the collection of all intuitionistic fuzzy values and f is the information function, defined as f : U × (C ∪ D) → V such that f (x, b) ∈ V, ∀ b ∈ C ∪ D, ∀ x ∈ U.

Theorem 3.1. Let $\underline{{approx}_{P}} (X)$ and $\overline{{approx}_{P}} (X)$ be the lower and upper approximations of X ⊆ U respectively. Then $\underline{{approx}_{P}} (X) \subseteq \overline{{approx}_{P}} (X)$.

Proof. We have to show that $\inf_{y \in U} 〈 m_{1}, n_{1} 〉 \leq \sup_{y \in U} 〈 m_{2}, n_{2} 〉$, where $\begin{aligned} m_{1} &= \min (1, 1 + μ (y) - μ_{X} (y), 1 + υ_{X} (y) - υ (y)) \\ m_{2} &= \max (0, μ_{X} (y) + μ (y) - 1) \\ n_{1} &= \max (0, υ (y) - υ_{X} (y)) \\ n_{2} &= \min (1, υ_{X} (y) + υ (y)) \end{aligned}$

Now, for y ∈ X, μ_X (y) = 1 and υ_X (y) = 0. Since μ (y) + υ (y) ≤ 1, we also have μ (y) ≤ 1 - υ (y). So,

$m_{1} = \min (1, 1 + μ (y) - 1, 1 + 0 - υ (y)) = \min (1, μ (y), 1 - υ (y)) = μ (y)$ (1)

$n_{1} = \max (0, υ (y) - 0) = υ (y)$ (2)

$m_{2} = \max (0, 1 + μ (y) - 1) = μ (y)$ (3)

$n_{2} = \min (1, 0 + υ (y)) = υ (y)$ (4)

Using Equations (1–4), we get $\inf_{y \in U} 〈 m_{1}, n_{1} 〉 = \inf_{y \in U} 〈 μ (y), υ (y) 〉 \leq \sup_{y \in U} 〈 μ (y), υ (y) 〉 = \sup_{y \in U} 〈 m_{2}, n_{2} 〉$

Hence, $\underline{{approx}_{P}} (X) \subseteq \overline{{approx}_{P}} (X)$.

Theorem 3.2. If P₁ ⊆ P₂ ⊆ C, then

(a) $\underline{{approx}_{P_{1}}} (X) \subseteq \underline{{approx}_{P_{2}}} (X)$

(b) $\overline{{approx}_{P_{1}}} (X) \subseteq \overline{{approx}_{P_{2}}} (X)$

Proof. (a) For $\underline{{approx}_{P_{1}}} (X) \subseteq \underline{{approx}_{P_{2}}} (X)$, we have to show that $\begin{aligned} & \min (1, 1 + μ^{P_{1}} (y) - μ_{X} (y), 1 + υ_{X} (y) - υ^{P_{1}} (y)) \\ & \quad \leq \min (1, 1 + μ^{P_{2}} (y) - μ_{X} (y), 1 + υ_{X} (y) - υ^{P_{2}} (y)) \end{aligned}$ and $\max (0, υ^{P_{1}} (y) - υ_{X} (y)) \geq \max (0, υ^{P_{2}} (y) - υ_{X} (y))$.

Now, for y ∈ X, μ_X (y) = 1, υ_X (y) = 0,

$\min (1, 1 + μ^{P_{1}} (y) - 1, 1 + 0 - υ^{P_{1}} (y)) = \min (1, μ^{P_{1}} (y), 1 - υ^{P_{1}} (y)) = μ^{P_{1}} (y)$ (5)

$\min (1, 1 + μ^{P_{2}} (y) - 1, 1 + 0 - υ^{P_{2}} (y)) = \min (1, μ^{P_{2}} (y), 1 - υ^{P_{2}} (y)) = μ^{P_{2}} (y)$ (6)

$\max (0, υ^{P_{1}} (y) - 0) = υ^{P_{1}} (y)$ (7)

$\max (0, υ^{P_{2}} (y) - 0) = υ^{P_{2}} (y)$ (8)

Since P₁ ⊆ P₂, we have $μ^{P_{1}} (y) \leq μ^{P_{2}} (y)$ and $υ^{P_{1}} (y) \geq υ^{P_{2}} (y)$.

From Equations (5–8) together with the above two conditions, we get the required result.

Hence, $\underline{{approx}_{P_{1}}} (X) \subseteq \underline{{approx}_{P_{2}}} (X)$.

(b) For $\overline{{approx}_{P_{1}}} (X) \subseteq \overline{{approx}_{P_{2}}} (X)$, we have to show that $\max (0, μ_{X} (y) + μ^{P_{1}} (y) - 1) \leq \max (0, μ_{X} (y) + μ^{P_{2}} (y) - 1)$ and $\min (1, υ_{X} (y) + υ^{P_{1}} (y)) \geq \min (1, υ_{X} (y) + υ^{P_{2}} (y))$.

Now, for y ∈ X, μ_X (y) = 1, υ_X (y) = 0,

$\max (0, 1 + μ^{P_{1}} (y) - 1) = μ^{P_{1}} (y)$ (9)

$\max (0, 1 + μ^{P_{2}} (y) - 1) = μ^{P_{2}} (y)$ (10)

$\min (1, 0 + υ^{P_{1}} (y)) = υ^{P_{1}} (y)$ (11)

$\min (1, 0 + υ^{P_{2}} (y)) = υ^{P_{2}} (y)$ (12)

Since P₁ ⊆ P₂, we have $μ^{P_{1}} (y) \leq μ^{P_{2}} (y)$ and $υ^{P_{1}} (y) \geq υ^{P_{2}} (y)$. From Equations (9–12) together with the above two conditions, we get the required result.

Hence, $\overline{{approx}_{P_{1}}} (X) \subseteq \overline{{approx}_{P_{2}}} (X)$.

4 Algorithm: Intuitionistic fuzzy rough quick reduction

In this section, we present a quick reduct algorithm for feature selection. The algorithm starts with an empty set and adds, one at a time, the attribute that gives the greatest increase in the degree of dependency of the decision attribute on the subset of conditional attributes, until the dependency reaches its highest possible value for the data set (1 in the case of a consistent system). The algorithm generates a close-to-minimal reduct of a decision system without exhaustively checking all possible subsets of conditional attributes, which is its main advantage. In the listing, γ stands for whichever of γ^m, γ^s or γ^c is in use. The algorithm can be given as follows:

IntuitionisticFuzzyRoughQuickReduct (C, D)
Input: C, the collection of all conditional features; D, the collection of all decision features
Output: L, the feature subset
L ← {}, γ_best ← 0, γ_prev ← 0
do
    K ← L
    γ_prev ← γ_best
    ∀ x ∈ (C − L)
        if γ_{L ∪ {x}} (D) > γ_K (D)
            K ← L ∪ {x}
            γ_best ← γ_K (D)
    L ← K
until γ_best == γ_prev
return L

The algorithm covers all three proposed approaches and can be applied to any intuitionistic fuzzy information system to compute a reduct. For a data set with n conditional features, the worst case requires (n² + n)/2 evaluations of the dependency function for each of the three approaches.
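The greedy loop can be sketched in Python as follows. This is an illustrative reading of the quick reduct procedure, not a definitive implementation: the dependency function is passed in as a callable `gamma`, and the name `quick_reduct`, the evaluation counter and the lookup table are assumptions of this sketch. The table holds the fuzzy rough dependency numerators reported in the worked example of Section 5 (each divided by 6); subsets that the greedy search never visits are simply absent.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple


def quick_reduct(features: Sequence[str],
                 gamma: Callable[[FrozenSet[str]], float]
                 ) -> Tuple[List[str], int]:
    """Greedy forward selection; returns the reduct and the number of gamma calls."""
    reduct: List[str] = []
    best = 0.0           # the pseudocode initialises the best dependency to 0
    evaluations = 0
    while True:
        chosen = None
        for x in features:
            if x in reduct:
                continue
            value = gamma(frozenset(reduct + [x]))
            evaluations += 1
            if value > best:          # strictly better than anything seen so far
                best, chosen = value, x
        if chosen is None:            # no increase in this pass: stop
            break
        reduct.append(chosen)
    return reduct, evaluations


# Dependency numerators (over |U| = 6) taken from the worked example.
NUMERATORS: Dict[FrozenSet[str], float] = {
    frozenset("a"): 1.2, frozenset("b"): 2.4, frozenset("c"): 1.2,
    frozenset("d"): 1.2, frozenset("e"): 2.2, frozenset("f"): 1.2,
    frozenset("ab"): 2.2, frozenset("bc"): 2.2, frozenset("bd"): 2.6,
    frozenset("be"): 2.2, frozenset("bf"): 2.0,
    frozenset("abd"): 2.4, frozenset("bcd"): 2.2,
    frozenset("bde"): 2.2, frozenset("bdf"): 2.2,
}


def table_gamma(subset: FrozenSet[str]) -> float:
    return NUMERATORS[subset] / 6.0


reduct, evaluations = quick_reduct("abcdef", table_gamma)
print(reduct, evaluations)  # ['b', 'd'] 15
```

With a dependency function that increases every time a feature is added, the same loop performs n + (n − 1) + … + 1 = (n² + n)/2 evaluations, which matches the worst case stated above. Because the initial best value is 0, a purely negative score-based dependency would never be selected in this sketch.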

5 Worked example

In order to illustrate our approach of intuitionistic fuzzy rough set based feature selection, a data set inspired from [19] is given in Table 1.

Table 1
Fuzzy information system

Attributes	a	b	c	d	e	f	Q
Objects
x ₁	0.4	0.4	1.0	0.8	0.4	0.2	1
x ₂	0.6	1.0	0.6	0.8	0.2	1.0	0
x ₃	0.8	0.4	0.4	0.6	1.0	0.2	1
x ₄	1.0	0.6	0.2	1.0	0.6	0.4	0
x ₅	0.2	1.0	0.8	0.4	0.4	0.6	0
x ₆	0.6	0.6	0.8	0.2	0.8	0.8	1

From Table 1, the decision classes are: $U / Q = \{\{x_{1}, x_{3}, x_{6}\}, \{x_{2}, x_{4}, x_{5}\}\}$

The degrees of dependency of Q on A = {a}, B = {b}, C = {c}, D = {d}, E = {e} and F = {f} can be calculated using [31] as follows: $γ_{A} (Q) = \frac{1.2}{6}, \; γ_{B} (Q) = \frac{2.4}{6}, \; γ_{C} (Q) = \frac{1.2}{6}, \; γ_{D} (Q) = \frac{1.2}{6}, \; γ_{E} (Q) = \frac{2.2}{6}, \; γ_{F} (Q) = \frac{1.2}{6}$.

Since feature b gives the greatest dependency degree, it is chosen and added to the potential reduct set.

Now, adding each of the other features to the potential reduct set, we calculate the degrees of dependency for {a, b}, {b, c}, {b, d}, {b, e} and {b, f}: $γ_{\{a, b\}} (Q) = \frac{2.2}{6}, \; γ_{\{b, c\}} (Q) = \frac{2.2}{6}, \; γ_{\{b, d\}} (Q) = \frac{2.6}{6}, \; γ_{\{b, e\}} (Q) = \frac{2.2}{6}, \; γ_{\{b, f\}} (Q) = \frac{2.0}{6}$

Adding feature d to the reduct candidate gives the largest increase in the degree of dependency, so the new reduct candidate becomes {b, d}. The process iterates, and the next degrees of dependency are: $γ_{\{a, b, d\}} (Q) = \frac{2.4}{6}, \; γ_{\{b, c, d\}} (Q) = \frac{2.2}{6}, \; γ_{\{b, d, e\}} (Q) = \frac{2.2}{6}, \; γ_{\{b, d, f\}} (Q) = \frac{2.2}{6}$

Since adding any of the remaining features to {b, d} causes no increase in the degree of dependency, the algorithm stops and outputs the reduct {b, d}.

Now we convert the above fuzzy information system into an intuitionistic fuzzy information system using the concept of Jurio et al. [20]. The transformed information system is given in Table 2.

Table 2

Intuitionistic fuzzy information system

Attributes	a	b	c	d	e	f	Q
Objects
x ₁	〈0.32, 0.48〉	〈0.32, 0.48〉	〈0.80, 0.00〉	〈0.64, 0.16〉	〈0.32, 0.48〉	〈0.16, 0.64〉	1
x ₂	〈0.48, 0.32〉	〈0.80, 0.00〉	〈0.48, 0.32〉	〈0.64, 0.16〉	〈0.16, 0.64〉	〈0.80, 0.00〉	0
x ₃	〈0.64, 0.16〉	〈0.32, 0.48〉	〈0.32, 0.48〉	〈0.48, 0.32〉	〈0.80, 0.00〉	〈0.16, 0.64〉	1
x ₄	〈0.80, 0.00〉	〈0.48, 0.32〉	〈0.16, 0.64〉	〈0.80, 0.00〉	〈0.48, 0.32〉	〈0.32, 0.48〉	0
x ₅	〈0.16, 0.64〉	〈0.80, 0.00〉	〈0.64, 0.16〉	〈0.32, 0.48〉	〈0.32, 0.48〉	〈0.48, 0.32〉	0
x ₆	〈0.48, 0.32〉	〈0.48, 0.32〉	〈0.64, 0.16〉	〈0.16, 0.64〉	〈0.64, 0.16〉	〈0.64, 0.16〉	1
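Comparing Table 1 with Table 2, every entry appears consistent with mapping a fuzzy membership μ to the pair 〈0.8 μ, 0.8 (1 − μ)〉, which leaves a constant hesitancy of 0.2. The Python sketch below encodes this observed pattern only; the fixed hesitancy of 0.2 and the function name `to_ifv` are assumptions inferred from the tables, and the sketch is not claimed to be the general construction of Jurio et al. [20].

```python
from typing import Tuple


def to_ifv(mu: float, hesitancy: float = 0.2) -> Tuple[float, float]:
    """Map a fuzzy membership to <mu * (1 - h), (1 - mu) * (1 - h)>.

    The hesitancy h = 0.2 is an assumption read off Tables 1 and 2.
    Results are rounded to two decimals, as in Table 2.
    """
    membership = mu * (1.0 - hesitancy)
    non_membership = (1.0 - mu) * (1.0 - hesitancy)
    return (round(membership, 2), round(non_membership, 2))


# Row x1 of Table 1 (attributes a to f).
row_x1 = [0.4, 0.4, 1.0, 0.8, 0.4, 0.2]
print([to_ifv(mu) for mu in row_x1])
```

Under this mapping the membership and non-membership of every converted value sum to 0.8, so the constraint m + n ≤ 1 of Definition 2.1 holds for all entries.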

From Table 2, the decision classes for the intuitionistic fuzzy information system are: $U / Q = \{\{x_{1}, x_{3}, x_{6}\}, \{x_{2}, x_{4}, x_{5}\}\}$

Setting A = {a}, for the first decision equivalence class X = { x₁, x₃, x₆ } the lower approximations of the different objects can be calculated using the formulas of Section 3 as follows. The membership components are: $\begin{aligned} \min \{1, 1 + μ (x_{1}) - μ_{X} (x_{1}), 1 + υ_{X} (x_{1}) - υ (x_{1})\} &= \min \{1, 1 + 0.32 - 1, 1 + 0 - 0.48\} = 0.32 \\ \min \{1, 1 + μ (x_{2}) - μ_{X} (x_{2}), 1 + υ_{X} (x_{2}) - υ (x_{2})\} &= \min \{1, 1 + 0.48 - 0, 1 + 1 - 0.32\} = 1 \\ \min \{1, 1 + μ (x_{3}) - μ_{X} (x_{3}), 1 + υ_{X} (x_{3}) - υ (x_{3})\} &= \min \{1, 1 + 0.64 - 1, 1 + 0 - 0.16\} = 0.64 \\ \min \{1, 1 + μ (x_{4}) - μ_{X} (x_{4}), 1 + υ_{X} (x_{4}) - υ (x_{4})\} &= \min \{1, 1 + 0.80 - 0, 1 + 1 - 0.00\} = 1 \\ \min \{1, 1 + μ (x_{5}) - μ_{X} (x_{5}), 1 + υ_{X} (x_{5}) - υ (x_{5})\} &= \min \{1, 1 + 0.16 - 0, 1 + 1 - 0.64\} = 1 \\ \min \{1, 1 + μ (x_{6}) - μ_{X} (x_{6}), 1 + υ_{X} (x_{6}) - υ (x_{6})\} &= \min \{1, 1 + 0.48 - 1, 1 + 0 - 0.32\} = 0.48 \end{aligned}$ The non-membership components are: $\begin{aligned} \max \{0, υ (x_{1}) - υ_{X} (x_{1})\} &= \max \{0, 0.48 - 0\} = 0.48 \\ \max \{0, υ (x_{2}) - υ_{X} (x_{2})\} &= \max \{0, 0.32 - 1\} = 0 \\ \max \{0, υ (x_{3}) - υ_{X} (x_{3})\} &= \max \{0, 0.16 - 0\} = 0.16 \\ \max \{0, υ (x_{4}) - υ_{X} (x_{4})\} &= \max \{0, 0.00 - 1\} = 0 \\ \max \{0, υ (x_{5}) - υ_{X} (x_{5})\} &= \max \{0, 0.64 - 1\} = 0 \\ \max \{0, υ (x_{6}) - υ_{X} (x_{6})\} &= \max \{0, 0.32 - 0\} = 0.32 \end{aligned}$

So, inf {〈 0.32, 0.48 〉, 〈 1, 0 〉, 〈 0.64, 0.16 〉, 〈 1, 0 〉, 〈 1, 0 〉, 〈 0.48, 0.32 〉} = 〈 0.32, 0.48 〉.

Therefore, the lower approximation of each object can be calculated using the formulas of Section 3 as follows: $\begin{aligned} 〈 μ_{\underline{P} X} (x_{1}), υ_{\underline{P} X} (x_{1}) 〉 &= \min (〈 0.32, 0.48 〉, 〈 0.32, 0.48 〉) = 〈 0.32, 0.48 〉 \\ 〈 μ_{\underline{P} X} (x_{2}), υ_{\underline{P} X} (x_{2}) 〉 &= \min (〈 0.48, 0.32 〉, 〈 0.32, 0.48 〉) = 〈 0.32, 0.48 〉 \\ 〈 μ_{\underline{P} X} (x_{3}), υ_{\underline{P} X} (x_{3}) 〉 &= \min (〈 0.64, 0.16 〉, 〈 0.32, 0.48 〉) = 〈 0.32, 0.48 〉 \\ 〈 μ_{\underline{P} X} (x_{4}), υ_{\underline{P} X} (x_{4}) 〉 &= \min (〈 0.80, 0.00 〉, 〈 0.32, 0.48 〉) = 〈 0.32, 0.48 〉 \\ 〈 μ_{\underline{P} X} (x_{5}), υ_{\underline{P} X} (x_{5}) 〉 &= \min (〈 0.16, 0.64 〉, 〈 0.32, 0.48 〉) = 〈 0.16, 0.64 〉 \\ 〈 μ_{\underline{P} X} (x_{6}), υ_{\underline{P} X} (x_{6}) 〉 &= \min (〈 0.48, 0.32 〉, 〈 0.32, 0.48 〉) = 〈 0.32, 0.48 〉 \end{aligned}$

Similarly, for the second decision class X = { x₂, x₄, x₅ }, the lower approximations are: $\begin{aligned} 〈 μ_{\underline{P} X} (x_{1}), υ_{\underline{P} X} (x_{1}) 〉 &= 〈 0.16, 0.64 〉 \\ 〈 μ_{\underline{P} X} (x_{2}), υ_{\underline{P} X} (x_{2}) 〉 &= 〈 0.16, 0.64 〉 \\ 〈 μ_{\underline{P} X} (x_{3}), υ_{\underline{P} X} (x_{3}) 〉 &= 〈 0.16, 0.64 〉 \\ 〈 μ_{\underline{P} X} (x_{4}), υ_{\underline{P} X} (x_{4}) 〉 &= 〈 0.16, 0.64 〉 \\ 〈 μ_{\underline{P} X} (x_{5}), υ_{\underline{P} X} (x_{5}) 〉 &= 〈 0.16, 0.64 〉 \\ 〈 μ_{\underline{P} X} (x_{6}), υ_{\underline{P} X} (x_{6}) 〉 &= 〈 0.16, 0.64 〉 \end{aligned}$

Now, the positive region values of the objects are: $\begin{aligned} 〈 μ_{{POS}''_{A} (Q)} (x_{1}), υ_{{POS}''_{A} (Q)} (x_{1}) 〉 &= 〈 0.32, 0.48 〉 \\ 〈 μ_{{POS}''_{A} (Q)} (x_{2}), υ_{{POS}''_{A} (Q)} (x_{2}) 〉 &= 〈 0.32, 0.48 〉 \\ 〈 μ_{{POS}''_{A} (Q)} (x_{3}), υ_{{POS}''_{A} (Q)} (x_{3}) 〉 &= 〈 0.32, 0.48 〉 \\ 〈 μ_{{POS}''_{A} (Q)} (x_{4}), υ_{{POS}''_{A} (Q)} (x_{4}) 〉 &= 〈 0.32, 0.48 〉 \\ 〈 μ_{{POS}''_{A} (Q)} (x_{5}), υ_{{POS}''_{A} (Q)} (x_{5}) 〉 &= 〈 0.16, 0.64 〉 \\ 〈 μ_{{POS}''_{A} (Q)} (x_{6}), υ_{{POS}''_{A} (Q)} (x_{6}) 〉 &= 〈 0.32, 0.48 〉 \end{aligned}$
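The computation above for A = {a} can be reproduced with a short Python sketch. It follows the argument order used in the expanded formulas of Section 3 (the crisp class value first, the object's attribute value second) and handles a single attribute only; how values are combined for multi-attribute subsets is not covered here. All function names are assumptions made for this illustration, and intermediate results are rounded to suppress floating-point noise.

```python
from typing import Dict, List, Set, Tuple

IFV = Tuple[float, float]  # (membership, non-membership)

# Column a of Table 2 and the decision classes U / Q.
A: Dict[str, IFV] = {
    "x1": (0.32, 0.48), "x2": (0.48, 0.32), "x3": (0.64, 0.16),
    "x4": (0.80, 0.00), "x5": (0.16, 0.64), "x6": (0.48, 0.32),
}
CLASSES: List[Set[str]] = [{"x1", "x3", "x6"}, {"x2", "x4", "x5"}]


def ifv_min(p: IFV, q: IFV) -> IFV:
    return (min(p[0], q[0]), max(p[1], q[1]))


def ifv_max(p: IFV, q: IFV) -> IFV:
    return (max(p[0], q[0]), min(p[1], q[1]))


def implicator(x: IFV, y: IFV) -> IFV:
    """I_w(x, y), rounded to 10 decimals to hide floating-point noise."""
    m = min(1.0, 1.0 + y[0] - x[0], 1.0 + x[1] - y[1])
    n = max(0.0, y[1] - x[1])
    return (round(m, 10), round(n, 10))


def lower_approximation(values: Dict[str, IFV], cls: Set[str]) -> Dict[str, IFV]:
    bound: IFV = (1.0, 0.0)  # neutral element for the infimum
    for obj, val in values.items():
        crisp = (1.0, 0.0) if obj in cls else (0.0, 1.0)
        bound = ifv_min(bound, implicator(crisp, val))
    return {obj: ifv_min(val, bound) for obj, val in values.items()}


def positive_region(values: Dict[str, IFV], classes: List[Set[str]]) -> Dict[str, IFV]:
    pos: Dict[str, IFV] = {obj: (0.0, 1.0) for obj in values}  # neutral for supremum
    for cls in classes:
        low = lower_approximation(values, cls)
        pos = {obj: ifv_max(pos[obj], low[obj]) for obj in values}
    return pos


pos = positive_region(A, CLASSES)
for obj in sorted(pos):
    print(obj, pos[obj])
print(round(sum(m for m, _ in pos.values()), 2))  # numerator of the membership-based degree
```

For attribute a this yields 〈0.32, 0.48〉 for every object except x₅, which gets 〈0.16, 0.64〉, in agreement with the values listed above.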

Now, we determine degree of dependencies using following three approaches:

On the basis of membership grade:

Degree of dependency of Q upon A using membership function can be calculated as: $γ_{A}^{m} (Q) = \frac{5 \times 0.32 + 1 \times 0.16}{6} = \frac{1.76}{6}$

For B = {b}, C = {c}, D = {d}, E = {e} and F = {f}, the other degrees of dependency are: $γ_{B}^{m} (Q) = \frac{2.54}{6}, \; γ_{C}^{m} (Q) = \frac{1.76}{6}, \; γ_{D}^{m} (Q) = \frac{1.76}{6}, \; γ_{E}^{m} (Q) = \frac{1.76}{6}, \; γ_{F}^{m} (Q) = \frac{1.76}{6}$

Since the degree of dependency of Q on B = {b} is the largest, b is added to the reduct set. Adding the other features to the set {b} one by one, the degrees of dependency of Q on {a, b}, {b, c}, {b, d}, {b, e} and {b, f} are: $γ_{\{a, b\}}^{m} (Q) = \frac{1.76}{6}, \; γ_{\{b, c\}}^{m} (Q) = \frac{1.76}{6}, \; γ_{\{b, d\}}^{m} (Q) = \frac{1.76}{6}, \; γ_{\{b, e\}}^{m} (Q) = \frac{1.76}{6}, \; γ_{\{b, f\}}^{m} (Q) = \frac{1.60}{6}$

Since adding other features to the set {b} causes no increase in the degree of dependency, the algorithm terminates and we get {b} as the reduct set.

On the basis of score function:

Degree of dependency of Q upon A using score function can be calculated as: $γ_{A}^{s} (Q) = \frac{5 \times (- 0.16) + 1 \times (- 0.48)}{6} = \frac{- 1.28}{6}$

Calculating the other degrees of dependency on the basis of the score function, we get: $γ_{B}^{s} (Q) = \frac{0.32}{6}, \; γ_{C}^{s} (Q) = \frac{- 1.28}{6}, \; γ_{D}^{s} (Q) = \frac{- 1.28}{6}, \; γ_{E}^{s} (Q) = \frac{- 1.28}{6}, \; γ_{F}^{s} (Q) = \frac{- 1.60}{6}$

Since B = { b } gives the greatest degree of dependency, b becomes a reduct candidate. Now, we add the other features to the set {b} and get the following degrees of dependency: $γ_{\{a, b\}}^{s} (Q) = \frac{- 1.28}{6}, \; γ_{\{b, c\}}^{s} (Q) = \frac{- 1.28}{6}, \; γ_{\{b, d\}}^{s} (Q) = \frac{- 1.28}{6}, \; γ_{\{b, e\}}^{s} (Q) = \frac{- 1.28}{6}, \; γ_{\{b, f\}}^{s} (Q) = \frac{- 1.60}{6}$

Since there is no increase in the degree of dependency, the process stops and we get the reduct {b}.

On the basis of cardinality of an intuitionistic fuzzy set:

The degrees of dependency of Q on A = {a}, B = {b}, C = {c}, D = {d}, E = {e} and F = {f}, computed using the cardinality of an intuitionistic fuzzy set, are as follows: $γ_{A}^{c} (Q) = \frac{2.36}{6}, \; γ_{B}^{c} (Q) = \frac{3.32}{6}, \; γ_{C}^{c} (Q) = \frac{2.36}{6}, \; γ_{D}^{c} (Q) = \frac{2.36}{6}, \; γ_{E}^{c} (Q) = \frac{2.36}{6}, \; γ_{F}^{c} (Q) = \frac{2.20}{6}$

Since B = {b} gives the greatest degree of dependency, b is added to the reduct set. On adding the other attributes to the set {b}, the degrees of dependency of the decision attribute on {a, b}, {b, c}, {b, d}, {b, e} and {b, f} are: $γ_{\{a, b\}}^{c} (Q) = \frac{2.36}{6}, \; γ_{\{b, c\}}^{c} (Q) = \frac{2.36}{6}, \; γ_{\{b, d\}}^{c} (Q) = \frac{2.36}{6}, \; γ_{\{b, e\}}^{c} (Q) = \frac{2.36}{6}, \; γ_{\{b, f\}}^{c} (Q) = \frac{2.20}{6}$

So, this approach also gives the same reduct {b}.

We observe that the fuzzy rough set based approach yields the reduct {b, d}, whereas each of our three proposed methods yields the reduct {b}. In this example, therefore, our model finds a smaller reduct of the decision system. Moreover, our approach can be adapted to handle the uncertainty and noise present in a decision system by choosing different intuitionistic fuzzy t-norms and implicators.

6 Conclusion

In this paper, we have presented three novel approaches to intuitionistic fuzzy rough set based feature selection, in which the degree of dependency is derived from the membership grade, the score function and the cardinality of an intuitionistic fuzzy set. The analogous results on lower and upper approximations have been established for intuitionistic fuzzy rough sets. The proposed algorithm has been applied to an example data set and compared with the previous fuzzy rough set based algorithm. We observed that, on this example, our proposed method outperforms the fuzzy rough set based approach, as it considers the membership, non-membership and hesitancy of an object in a set and produces a smaller reduct of the decision system. The proposed work is able to maintain the consistency of the system (a decision system is said to be consistent if the degree of dependency of the decision attribute on the subset of conditional attributes is 1), and it can handle vagueness, uncertainty and noise in data sets. The dimension of digital data is continuously growing with advances in computer and database technology. Experts, as well as machine learning processes running on large volumes of high-dimensional data, are the main sources of knowledge. Knowledge extraction is an essential step in building expert and intelligent systems, but it is very slow because of the noise and large size of high-dimensional data. Feature selection, or attribute reduction, plays a vital role in improving the productivity of learning by selecting predictive and non-redundant features, which improves the performance of machine learning algorithms and the interpretability of data. Feature selection is applied in many real-life areas that are highly relevant to expert and intelligent systems, such as machine learning, image processing, data mining, natural language processing and bioinformatics. Our proposed approach handles uncertainty better and produces a small reduct because it is based on the dependency function and combines two important tools for modelling uncertainty, namely intuitionistic fuzzy sets and rough sets, a combination that has not previously been used in this way.

In future work, we intend to establish a type-2 intuitionistic fuzzy rough set model and generalize it for feature selection. More accurate models, such as a probabilistic variable precision intuitionistic fuzzy rough set model, could also be developed for feature reduction. Moreover, more general methods for converting a fuzzy information system into an intuitionistic fuzzy information system could be developed so that these approaches can be applied to real-valued data sets.



Table 1
Fuzzy information system

Objects   a     b     c     d     e     f     Q
x₁        0.4   0.4   1.0   0.8   0.4   0.2   1
x₂        0.6   1.0   0.6   0.8   0.2   1.0   0
x₃        0.8   0.4   0.4   0.6   1.0   0.2   1
x₄        1.0   0.6   0.2   1.0   0.6   0.4   0
x₅        0.2   1.0   0.8   0.4   0.4   0.6   0
x₆        0.6   0.6   0.8   0.2   0.8   0.8   1

Attributes a to f are conditional attributes; Q is the decision attribute.
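
To show how a dependency degree can be computed from the data in Table 1, here is a minimal Python sketch of a fuzzy-rough dependency degree. It is an illustration under stated assumptions and is neither the intuitionistic fuzzy rough model proposed in this paper nor necessarily the exact fuzzy rough baseline it is compared against. It assumes the similarity relation R_a(x, y) = 1 − |a(x) − a(y)|, combines attributes with the minimum, and uses the Łukasiewicz implicator for the lower approximation. The names `TABLE1`, `similarity`, `positive_region` and `dependency` are hypothetical.

```python
from typing import Dict, List, Sequence

ATTRS = ["a", "b", "c", "d", "e", "f"]

# Conditional attribute values from Table 1, in the order of ATTRS.
TABLE1: Dict[str, List[float]] = {
    "x1": [0.4, 0.4, 1.0, 0.8, 0.4, 0.2],
    "x2": [0.6, 1.0, 0.6, 0.8, 0.2, 1.0],
    "x3": [0.8, 0.4, 0.4, 0.6, 1.0, 0.2],
    "x4": [1.0, 0.6, 0.2, 1.0, 0.6, 0.4],
    "x5": [0.2, 1.0, 0.8, 0.4, 0.4, 0.6],
    "x6": [0.6, 0.6, 0.8, 0.2, 0.8, 0.8],
}

# Decision attribute Q from Table 1.
DECISION: Dict[str, int] = {"x1": 1, "x2": 0, "x3": 1, "x4": 0, "x5": 0, "x6": 1}


def similarity(x: str, y: str, subset: Sequence[str]) -> float:
    """Assumed fuzzy similarity of x and y over a non-empty attribute subset."""
    idx = [ATTRS.index(a) for a in subset]
    return min(1.0 - abs(TABLE1[x][i] - TABLE1[y][i]) for i in idx)


def positive_region(x: str, subset: Sequence[str]) -> float:
    """Membership of x in the fuzzy positive region.

    With a crisp decision and the Lukasiewicz implicator, the lower
    approximation of x's own class reduces to the smallest dissimilarity
    between x and any object of a different class.
    """
    others = [y for y in TABLE1 if DECISION[y] != DECISION[x]]
    return min(1.0 - similarity(x, y, subset) for y in others)


def dependency(subset: Sequence[str]) -> float:
    """Degree of dependency of Q on the given attribute subset."""
    return sum(positive_region(x, subset) for x in TABLE1) / len(TABLE1)
```

Calling `dependency(["a"])` or `dependency(ATTRS)` returns a value in [0, 1]. Under this similarity-based definition the degree never decreases when attributes are added, which is the property that greedy reduct search relies on.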