Abstract
We extend the fuzzy rough set as probabilistic fuzzy rough set (PFRS) on a fuzzy probabilistic approximation space. The proposed PFRS models both vague and uncertain information at the same time. The viability of PFRS in decision making is illustrated through a case study. PFRS is shown to be useful for both probabilistic as well as deterministic problems.
Introduction
The information in the real world is often imprecise or vague, uncertain and incomplete [26]. Notwithstanding this fact, the decisions are made on the basis of the available information. Probability, fuzzy set [35] and rough set [11] theories adequately address the shortcomings of the information, viz., uncertainty, vagueness and imprecision. However, the three theories have a different role in the information handling. The probability deals with the occurrence of an event; a fuzzy set captures the vagueness by associating a membership degree to the belongingness to a property; and a rough set gives lower and upper approximations of knowledge in the form of equivalence classes. The rough set theory is comparatively recent with numerous applications, to mention a few [3, 24].
State-of-the-art
The rough set theory is extended into the fuzzy domain as fuzzy rough set (FRS) in [4, 16]. FRS encapsulates fuzziness and indiscernibility by constructing a pair of upper and lower approximation operators by using a fuzzy similarity relation. FRS is widely applied in applications such as neural networks [43], medical time series [44], case generation [45],mining stock price [46], feature selection [36–40], and classification problems [10, 17]. A generalized approximation theory of fuzzy sets is proposed in [41, 42, 41, 42]. Both rough set and fuzzy rough set theories rely on finite universes and an assumption of complete knowledge. The approximations obtained are therefore not generalized for the continuously updating information source values.
To this end, Ziarko [27] has proposed a variable precision rough set model by the generalization of the standard set-inclusion relation. In a similar context, the authors in [51, 52] have proposed the generalized rough sets based on inclusion degree [53, 54]. The concept of vague inclusion based on conditional probability is introduced in [54]. The notion of rough mereology with rough inclusion as a measure of subset-hood is proposed in [55, 56]. The quantitative rough set related to subset-hood measures is investigated in [57–59]. The inclusion degree based FRS is proposed in [2, 61].
Another generalization of rough set to deal with uncertainty in information source values has appeared as probabilistic rough set in [18, 25]. The probabilistic knowledge reflects the relative frequencies of occurrence of sets or events. The classical rough set approach may be adequate to handle deterministic problems but is handicapped when probabilistic events are involved. Probabilistic approaches to rough sets have appeared in many forms such as decision-theoretic rough set model [19, 21], Bayesian rough set model [14], information theoretic analysis [1], and decision analysis [6, 19, 6, 19]. The probabilistic rough sets have been increasingly gaining a lot of attention in the recent times [47–50].
Motivation
The inclusion degree based rough sets [27, 51–62], and probabilistic approaches to rough sets [18–20] helps to deal with the changing values in the universe. In this context, we enrich the conventional FRS with the features of both probabilistic and inclusion degree based rough sets, at the same time. While a FRS accommodates the uncertainty and vagueness, represented by membership degree of an event; it ignores theprobability of occurrence of an event. Similarly, a probabilistic rough set accommodates the probabilistic information but is not equipped to handle the vagueness in the information.
The proposed PFRS gives robust approximations with imprecise, vague, and random information source values. To the best of our knowledge such an extension of FRS is non-existent in the literature. FRS simplistically assumes the probability of information source values as 1 and is unable to deal with the random occurrences. Secondly, all inferences deduced from FRS hold good only for the given state of knowledge. In practice, general conclusions about the universe of discourse need to be drawn from a (manageable) subset of vast data. A generalized approach addressing these issues is the motivation behind this study.
The proposed work
We combine the features of inclusion degree based rough set, FRS and probabilistic rough set to introduce probabilistic fuzzy rough set (PFRS). The probabilistic information reflects the long-run expected frequency of occurrence of fuzzy sets or fuzzy events. We use the same in the proposed PFRS to arrive at the confidence in the approximations. A partial degree of inclusion at the core of PFRS acts as a lever to control the trade-off between quality and generalization of approximations. The generalized approximations obtained with PFRS hold true in the face of updating universe values. The conventional models of rough set and FRS are the special cases of PFRS.
The key contributions of the paper can be summarized as follows. After a discussion on the background of the rough set and fuzzy rough set theories inSection 2, we present the probabilistic fuzzy rough sets framework in Section 3. The usefulness of the proposed PFRS is shown in Section 4 through a real case-study. We have applied PFRS to approximate crisp sets in Section 5. A comparison has also been made between PFRS and the rough set approach.
Background
Rough sets
The concept of rough set [11] is originally introduced to deal with the incomplete information by the way of approximation. The primary step in the rough set theory is to identify elementary sets using equivalence relation. The elementary sets are used in approximating the knowledge represented by an information system (IS) [16]. that can be defined as a system (U, A), where U ={ x 1, x 2, …, x m } and A is the set of attributes (features, variables).
Each attribute a ∈ A defines an information function f a : U → V a where V a is the set of values that attribute a may take, and is called the domain a. The attribute set can be considered as the knowledge that helps partition the universe into basic concepts generated through the equivalence relations. For every set of attributes B ⊂ A, an indiscernibility relation IND (B) [16] is an equivalence relation such that two objects, x i and x j , are indiscernible by the set of attributes B in A, if b (x i ) = b (x j ) for every b in B.
IND (B) generates a partition of U denoted as U/IND (B) or simply U/B.
Indiscernibility is central to the rough set theory and the methodology to achieve this is quite close to the thought process of a human brain that conceptualizes the real-world objects through granulation on the basis of their similarity. Once the indiscernibility in the condition concepts is established, the equivalence classes are used to approximate the given objects. A rough set approximation [11] of an arbitrary subset Y of U is characterized by two unions of elementary sets referred to as B-lower and B-upper approximations of Y over the IS (U, B) respectively
The fuzzy rough set theory [4] deals with the vague and incomplete information based upon the principle of information granulation. This theory makes use of the fuzzy concepts or fuzzy information granules, whereas the rough sets theory is concerned with the crisp information granulation. It is generally accepted that these two theories are complementary but distinct [30]. In FRS, an equivalence relation is replaced by a fuzzy similarity relation as a key criterion to granulate the knowledge. The fuzzy equivalence classes are central to FRS in the same way as crisp equivalence classes are central to the rough set [9]. FRS categorizes the objects into classes with soft boundaries based on a fuzzy similarity relation.
A fuzzy binary relation on U is called a fuzzy similarity relation if the properties of reflexivity , symmetry , andT-transitivity hold good. The fuzzy similarity relation leads to the generation of the family of normal fuzzy sets produced by the fuzzy partitioning of universe of discourse. These fuzzy sets can be viewed as fuzzy equivalence classes and form the building blocks for approximation of concepts. Fuzzy equivalence relation [29], or sometimes also referred to as a similarity relation [33], represents similarity of two vague objects. A fuzzy relation R is called a fuzzy equivalence relation iff for all x, y and z in X, the following relations hold good.
R (x, x) = 1 (reflexivity);
R (x, y) = R (y, x) (symmetry);
(-transitivity)
Let be a fuzzy binary relation on U, C i be a fuzzy equivalence class, and Y be the concept to be approximated. Fuzzy approximation space [12] is defined as the pair . A FRS [4, 8] is constituted by a pair of fuzzy sets on U such that for every x ∈ U
where, μ C i (x) denotes the membership of x to C i , μ Y (x) the membership of x to Y, the degreeof certain membership of C i in Y, the degree of possible membership of C i in Y, and is a FRS. Unlike the conventional rough sets, FRS helps to deal with the both symbolic and real-valued attributes.
FRS provides an adequate tool for data analysis [31] by offering a high degree of flexibility. The equivalence and similarity relations generate crisp and fuzzy granules respectively of the available knowledge about the universe. The granules form the elementary sets for the approximation. The basic methodology in the two theories divides the objects of discourse into equivalence or fuzzy equivalence classes respectively containing indistinguishable or similar objects with respect to some attributes.
We present the probabilistic fuzzy rough set (PFRS) framework to deal with randomness in the information source values. The concepts are elucidated in detail as following.
The inclusion degree
The inclusion degree [23] for approximate reasoning is based on a partially ordered relation. It gives us a measure of containment of a fuzzy concept in another fuzzy concept. We extend the same on the fuzzy probabilistic approximation space. This relation is not necessarily symmetric as the degree to which a fuzzy concept C 1 overlaps another fuzzy concept C 2 may not be the same as the degree to which C 2 covers C 1. The degree to which C 1 overlaps C 2 also depends on their non-overlapping part.
The proposed PFRS is characterized with a flexible degree of inclusion of the probabilistic fuzzy concepts to impart generalization capabilities at the cost of some misclassification error. A partial degree of inclusion indicates a graded inclusion of two sets [32]. The graded inclusion of two sets gives the degree to which a concept or an equivalence class [x] belongs to another set S, given by
Inclusion degree has been extended for a partially ordered set (L, ≼) in [34] where ≼ is a binary relation satisfying the properties of reflexivity, anti-symmetry and transitivity. For any (a, b) ∈ L, the inclusion degree, Ion Lis defined as a real number I (b/a) with the following properties: 0 ≤ I (b/a) ≤ 1,
a ≼ b implies I (b/a) = 1,
a ≼ b ≼ c implies I (a/c) ≤ I (a/b) , and
a ≼ b implies I (a/c) ≤ I (b/c) for ∀ c ∈ L
The inclusion function [34] is defined as satisfying the following conditions:
We now extend the inclusion degree to the fuzzy probabilistic approximation space. Let P be a probability distribution over the non-empty, finite set of objects U, and I be the degree of inclusion of fuzzy concepts C
i
. Let be the family of all possible fuzzy sets defined on U, and C ={ C
1, C
2, … … , C
m
} denote the fuzzy sets from which forms a weak fuzzy partition of U. The fuzzy probabilistic approximation space is defined as
The fuzzy inclusion function I (W, V) defined on 〈U, P, C, I〉 is called the inclusion measure if:
It can be seen that the inclusion measure has a strong connection to the similarity measure between two fuzzy sets. Since, we are dealing with the probabilistic settings, there is a degree of confidence associated with each pair of fuzzy sets, W, V, with regard to their memberships in the fuzzy power set F (U). Given a fuzzy probabilistic approximation space 〈U, P, C, I〉, and D (W, V) ={ p
V
(x) , p
W
(x) > 0 : p
V
(x) ≤ pw (x) }, the degree of confidence C (W, V) for W, V ∈ F (U) is defined as:
The quality of approximation indicates the quality in terms of representation of the underlying data, and is directly related to the degree of inclusion that can be adjusted with parameter λ ∈ (0, 1]. A flexible degree of inclusion helps in the approximate characterization of a concept, Y in terms of probabilistic fuzzy concepts from . It controls the trade-off between the quality of approximation and the level of generalization. The relaxed condition paving the way for some non-overlapping of fuzzy concepts yields the generalized conclusions. A high quality of approximation leads to a low quality of approximation, and vice-versa. We now present the notion of quality of approximation and the related notions in a greater detail.
Let U ={ x
1, …, x
m
} be a finite set of objects, C ={ C
1, C
2, … … , C
n
} be a family of probabilistic fuzzy sets from such that U ⊂ supp ∪ C
i
∈C
C
i
, and let . The approximation of Yis said to be λ-approximable in approximation space if:
The degree of certainty of the approximation is
The tolerance of approximation is defined as:
The tolerance of approximation indicates the error in the approximation. A slight error in approximation brings generalization which is all the more necessary in the probabilistic settings with uncertain values. A related notion is that of the quality coefficient of approximation. Let be the concept to be approximated in the approximation space 〈U, P, C, I〉 with the degree of inclusion λ. Then the quality coefficient of approximation is given by:
Let Y ∈ F (U) be a λ-approx set in 〈U, P, C, I〉; A (Y) denote the set {D⊂ C ; I (Y, ∩ C
i
∈D
C
i
) ≥ λ } ; and B (Y) denote the set {E⊂ C ; I (∪ C
i
∈E
C
i
, Y) ≥ λ }. Let D
* ∈ A (Y) and E
* ∈ B (Y) be such that the tolerance of approximation
Then the lower and upper approximations of Yin the approximation space 〈U, P, C, I〉 are given by
The pair is a PFRS with the parameter λ controlling the level generalization in the approximations. The confidence value associated with the approximations obtained through a PFRS is computed by considering the probability distribution of the information source values. In the real world a probability distribution is often associated with a fuzzy event or a fuzzy object [22]. The probability of a current fuzzy event can be determined by applying the Bayesian law using the likelihood along with “prior” beliefs. The confidence level in the approximations helps to identify a positive decision region above a certain confidence threshold.
PFRS is especially useful to arrive at generalized approximations taking into consideration updating state of knowledge. For example, in order to assess the health of a company as an investment target, a generalized conclusion needs to be drawn by examining the company’s performance against the key parameters (like earning per share, dividend payout, price/earnings ratio etc.) over the past years. The probabilistic information about the current state of knowledge is determined from the relative occurrences of the information source values in the past. An approach is devised here to derive decision rules along with the confidence level in decision making problems.
Example of probabilistic fuzzy rough sets
A combination of probabilistic and inclusion degree based framework in the proposed PFRS fortifies it’s approximation capabilities. A case study adapted from [2] is discussed here to show the usefulness of PFRS in the selection of applicants for credit cards. The results of this case study are compared with those obtained by the original rough set theory.
Case Study 1
A set of eight applicants, U ={ x 1, …, x 8 }, wants to procure a credit card. Each applicant is described by his membership function values associated with the condition concepts {C 1, … , C 6 } and the decision concepts Y 1, Y 2 and Y 3 given in Table 1, where C 1= low bank balance, C 2= medium bank balance, C 3= high bank balance, C 4= low monthly expenses, C 5= medium monthly expenses, C 6= high monthly expenses; Y 1= favorable applicant, Y 2= passable applicant, Y 3= unsatisfactory applicant. It is obvious that X ⊂ supp (∪ C i ). Our goal is to approximate the decision concepts Y 1, Y 2 and Y 3 in terms of the condition concepts C ={ C 1, …, C 6 } using PFRS.
Table 1 shows the current state of the condition concepts for the applicants 1 to 8. But it is very natural to expect that the current bank balance or monthly expenses are very different from the past values. In order to get clue from the past, we take into account the probability distribution of the current state of the condition concepts. Table 2 gives the probabilities of the fuzzy membership values of Table 1 for each applicant and these are used to calculate the confidence in the membership values. From Table 1, it can be seen that μ ∩C i (x) = 0, ∀ x ∈ X, μ ∪C i (x) = 1 ∀ x ∈ X - x 1.
Now, the tolerance in the fuzzy rough approximation of the decision concepts is given in Table 2 and the degree of certainty in Table 3, where, P ∩C i (x) =μ ∩C i (x)* p ∩C i (x) ; P ∩C i (x) = μ ∪C i (x) * p ∪C i (x) ; D (x) = P ∪C i (x) - P ∩C i (x) .
Tolerance of the fuzzy rough approximation is expressed as:
Note that T C of each individual Y j should be less than T C . Table 4 gives the degrees of inclusion of fuzzy concepts in terms of their fuzzy membership values. Similarly, Table 5 gives the degrees of inclusion of concepts in terms of probability values.
If we go in for the approximation based on full covering of the fuzzy sets, then the degree of I F (Y j , C i ) must be 1. For the condition concept C 3 the value of I F (Y 1, C 3) is 1 and I P (Y 1, C 3) is 0.5. Therefore the lower approximation of Y 1 can be concluded as C 3 with a confidence value of 0.5.
The degree of covering of Y 1 is highest for C 3, i.e. 0.75. In order to increase the degree of covering we look for C i such that I F (C 3 ∪ C i , Y 1) for i = 1, 2, 4, 5, 6. We obtain both I F (C 3 ∪ C 4, Y 1) and I F (C 3 ∪ C 2, Y 1) are equal to 1. Therefore, . The probability values for approximation of Y 1 required in T C are given in Table 6, in which P C 3 (x) = μ C 3 (x) * p C 3 (x) ; P C3∪C4 (x) = μ C3∪C4 (x) * p C3∪C4 (x) ; P C2∪C3 (x) = μ C2∪C3 (x) * p C2∪C3 (x)
T
C
, when and
T
C
, when and
Thus C 3 ∪ C 4 is a better choice as the upper approximation of Y 1, since it has a lower value of the tolerance coefficient. The confidence in this approximation is I P (C 3 ∪ C 4, Y 1) = 0.875.
It is found that the fuzzy membership of an applicant in the class favorable applicant lie between the values of his fuzzy membership functions
From Table 1, it is observed that the fuzzy concept C 2 has the highest degree of inclusion as 0.80 in the fuzzy decision concept Y 2. In order to ensure full covering of Y 2, we look for another concept(s) joining which Y 2 can be covered wholly. It is noted that both I F (Y 2, C 2 ∩ C 5) and I P (Y 2, C 2 ∩ C 5) = 0.875.
Since I F (C 2 ∪ C i , Y 2) < 1 for i = 1, 3, 4, 5, 6, a union of three concepts is needed to ensure the full covering of Y 2. The unions that fully cover Y 2 are C 2 ∪ C 1 ∪ C 3, C 2 ∪ C 5 ∪ C 3, C 2 ∪ C 5 ∪ C 4 and C 2 ∪ C 1 ∪ C 4. In order to choose the best combination out of these we follow the same steps as above in the approximation of Y 2. The probability values for approximation of Y 2 are given in Table 7. These values are calculated so as to compute the tolerance coefficient.
From Table 7, it is clear that T C is least for the case of
T
C
when and
Similarly, T C when and , is 0.353.
T C when and , is 0.262
T C when and , is 0.41
Therefore the combination, C 2 ∪ C 5 ∪ C 4 is chosen being with lowest T C . The confidence in this approximation is given by I P (C 2 ∪ C 5 ∪ C 4, Y 2) = 0.5.
Hence, the decision concept
Following the above steps it is seen that the lower and upper approximations of the decision concept
Comparison with the original fuzzy rough sets
Considering Case-study 1, the most obvious choice for partitioning the universe would be C Balance ={ C 1, C 2, C 3 } , C Expenses = { C 4, C 5, C 6 }, and Y ={ Y 1, Y 2, Y 3 }. Therefore the approximation of Y 3 would lie in either C Balance or C Expenses . Also the probability of the fuzzy samples cannot be accounted in the case of FRS and thus we are never sure about the degree of confidence in the approximation of concepts by FRS. In the case of PFRS, the whole set C i , i = 1, …, 6 is also a partition of the universe. The approximations of the decision concepts Y j ∈ Y as per the original FRS theory in accordance with (4) and (5) are given in Table 8.
It is seen from Table 8 that the lower and upper approximations of the decision concepts Y 1 are and respectively, being the highest of the fuzzy membership degrees of the condition concepts in decision concept Y 1.
The results of the PFRS are close to those obtained from the conventional FRS. C 3is found to be the unanimous choice for both the lower and upper approximations of Y 1. The proposed approach also provides the confidence values, a feature not associated with FRS. The confidence is found to be 0.5 in the lower approximation of Y 1. The PFRS probes further to find the best upper approximation by taking the union of C 3 and C 4 as the top two highest degrees of covering for Y 1are seen to be 1 and 0.8. The confidence in this result is 0.875. Both PFRS and FRS yield similar results for approximation of Y 2and Y 3.
However, the results by our approach are comprehensive in the sense that it permits the confidence level in the approximation to be determined and this can play a vital role in the decision analysis. Secondly, PFRS ensures better results on account of the full covering of the concepts. For example, in the case of Y 2, C 2 along with C 5 forms the best choice for the lower approximation of Y 2, and C 2, C 5 and C 4 form the upper approximation of Y 2. It can be observed from Table 6 that the values of the membership functions are highest for these combinations. Similarly, C 1 is found to be the approximation for Y 3 in both the PFRS and FRS approaches.
The probabilistic fuzzy rough sets for crisp sets and rough sets: A comparison
PFRS can be used to approximate the decision concepts in the case of crisp sets, along with the confidence in the approximations. PFRS is characterized with a flexible degree of covering, λ ∈ (0, 1] of crisp decision concept Y (to be approximated) by the condition concepts and vice-versa. As far as the rough sets are concerned, the lower approximation of Y is the union of elementary sets that are fully covered by Y. Both the approaches can be made to converge by fixing λ as 1 and probability of all information source values as 1 in PFRS. The upper approximation is based on the partial covering of Y by the condition concepts in PFRS. The confidence in the approximations is arrived at by considering the probability values. This approach is now illustrated through an example [2].
Case Study 2
Consider the applicants of Case-study 1. Let each object x i ∈ U be described by the two attributes: the first condition attribute C BB with condition concepts C 1, C 2, C 3; and the second condition attribute C ME with condition concepts C 4, C 5, C 6 and the decision attribute Y with decision concepts Y 1, Y 2, Y 3. All concepts are crisp subsets of U. The membership values of applicants for each sample are given in Table 9. The probability of each applicant belonging to the different condition concepts is given in Table 10.
From Table 4, the condition concepts, C
BB
={ C
1, C
2, C
3 } and C
ME
={ C
4, C
5, C
6 } form the weak fuzzy partitions of U, along with the decision concept Y ={ Y
1, Y
2, Y
3 }. However, the approximation of Y in the rough sets is done on the approximation space 〈U, C
Cl
= C
BB
∩ C
ME
〉 but the approximation space of PFRS is 〈U, P, C
PFRS
= C
BB
∪ C
ME
, I 〉. The concepts as per Table 9 can be listed as follows:
Hence the partitions are:
Classical rough sets PFRS (with the degree of inclusion as 1)
C PFRS Y 1 = C 2 ={ 1, 5 } with the confidence of 0.75 (from Table 4)
with the confidence of 1 (from Table 2 as per (14))
Classical rough sets PFRS (with DOI as 1)
It can be seen that in the rough sets, a set of five concepts, {C 1, C 2, C 3, C 4, C 5, C 6} is required for the approximation of Y 2. On the other hand, PFRS allows us to approximate Y 2 only through a set of condition concepts, {C 1, C 4, C 6} or {C 1, C 3, C 6}.
Classical Rough Sets
the null set; PFRS (with DOI of 1), , with the confidence of 0.875 (as seen from Table 4) we have with the confidence of 0.5 (as seen from Table 5). Note that we require two concepts, C 1 and C 5 for the approximation of Y 3 by the rough set theory but only C 1 by PFRS.
Conclusions and future work
The concept of probabilistic fuzzy rough set is developed by the inclusion of probability values in the inclusion degree based FRS. The non-perfect inclusion adds to the generalization of the approximations. PFRS also provides the confidence level in the approximations. The PFRS framework addresses the issues of the incompleteness and inconsistencies in the vague events better than the FRS theory. The rough set and the FRS are the special cases of the proposed PFRS.
The concept of PFRS is illustrated through two case studies. Case-study 1 is concerned with the fuzzy concepts. The results obtained through PFRS are also compared with those obtained through the FRS. In the second case-study, PFRS are shown to be equally useful in the approximation of crisp concepts. The results obtained by the PFRS and the original rough set are also compared. It is noticed that it is possible with PFRS to arrive at a reasonable approximation by adjusting the inclusion degree even in situations where the conventional rough set and FRS theories lead to an empty set.
The present study paves the way for many such probabilistic extensions of FRS theory. For instance, it would be interesting to replace the standard inclusion degree in the proposed PFRS with the subset-hood measures. We are also inspired to investigate the approximation of fuzzy random variables using the proposed PFRS framework, in the near future. The extensions of the proposed PFRS in the intuitionistic fuzzy and linguistic domains would be worth exploring.
