Abstract
The main aim of rough multiset theory is to reduce the boundary region and increase the accuracy measure by enlarging the lower approximation and shrinking the upper approximation. In this paper, a new approach to rough multisets via multiset ideals is proposed for this purpose. The concepts of lower and upper multiset approximations via multiset ideals are introduced, and some of their properties and results are studied. The relationships between the current multiset approximations are presented. Moreover, the present method is compared with the previous one and shown to be more general. Furthermore, the multiset topology induced by the current method is finer than the multiset topology induced by the previous methods. The contribution of this paper is twofold: it reduces the boundary region and increases the accuracy of sets, which is the main aim of rough multisets, and it presents an applied medical example that illustrates the concepts in an accessible way.
Introduction
Multiset theory was introduced in 1986 by Yager [25]. A multiset generalizes a classical set. In classical set theory, a set is a well-defined collection of distinct objects, so a given element can appear only once. Consequently, the only possible relation between two mathematical objects is that they are either equal or different. The situation in science and in ordinary life is not like this. If repetitions of an object are allowed in a set, then the resulting mathematical structure is known as a multiset (mset [1] or bag [25], for short) [3, 24]. For convenience, an mset is written as $\{k_1/x_1, k_2/x_2, \ldots, k_n/x_n\}$, in which the element $x_i$ occurs $k_i$ times. The number of occurrences of an object $x$ in an mset $A$, which is finite in most studies that involve msets, is called its multiplicity or characteristic value, usually denoted by $m_A(x)$ or $C_A(x)$, or simply by $A(x)$. Note that each multiplicity $k_i$ is a positive integer.
In mathematics, the equation $x^2 - 4x + 4 = 0$ has the repeated root $x = 2, 2$, which gives the multiset $S = \{2/2\}$. Another simple example is the multiset of prime factors of a positive integer $n$. The number 504 has the factorization $504 = 2^3 \cdot 3^2 \cdot 7^1$, which gives the mset $X = \{3/2, 2/3, 1/7\}$, where $C_X(2) = 3$, $C_X(3) = 2$ and $C_X(7) = 1$. In chemistry, a water molecule H2O is represented by the mset $M = \{2/H, 1/O\}$; without one of the two hydrogen atoms, the water molecule is not formed.
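The prime factorization example can be checked with a short sketch. It is only an illustration: the function names `prime_factor_mset` and `format_mset` are hypothetical helpers, and Python's `collections.Counter` is used as one convenient way to store an element together with its multiplicity.

```python
from collections import Counter


def prime_factor_mset(n: int) -> Counter:
    """Return the mset of prime factors of n as a map prime -> multiplicity."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = Counter()
    p = 2
    while p * p <= n:
        # Divide out p as often as possible; each division adds one occurrence.
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains is a single prime factor.
        factors[n] += 1
    return factors


def format_mset(mset: Counter) -> str:
    """Write an mset in the k/x notation used in the text."""
    return "{" + ", ".join(f"{k}/{x}" for x, k in sorted(mset.items())) + "}"


X = prime_factor_mset(504)
print(format_mset(X))  # {3/2, 2/3, 1/7}
```

Here `X[2]`, `X[3]` and `X[7]` play the role of $C_X(2)$, $C_X(3)$ and $C_X(7)$.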
In the physical world it is observed that there is enormous repetition. For instance, there are many hydrogen atoms, many water molecules, many strands of DNA, etc. This leads to three possible relations between any two physical objects; they are different, they are the same but separate or they coincide and are identical. Many conclusive results were established by these authors and further study was carried on by Jena et al. [16] and many others [4–6]. The notion of an mset is well established both in mathematics and computer science [1, 10]. Additionally, the basic definitions of relations in mset context are introduced by Girish et al. [10, 13].
Rough set theory was proposed by Pawlak [21] in 1982. The theory is a mathematical tool for dealing with vagueness and imperfect knowledge. It differs from other theories that address ambiguity in information, since imprecision in rough set theory is expressed by a boundary region of a set, not by crisp membership as in classical set theory, nor by partial membership as in the fuzzy set theory proposed by Zadeh [26], nor by the possibility of essential elements as in the Dempster-Shafer theory of evidence [7], nor by preliminary or additional prior information on the elements as in probability theory. It handles the vagueness of a set through the concepts of lower and upper approximations [21]. If the lower and upper approximations of a set are equal, the set is called crisp (exact); otherwise it is called rough (inexact). The boundary region is defined as the difference between the upper and lower approximations, and the set is exact or vague according to whether the boundary region is empty or not, respectively. A nonempty boundary region means that our knowledge about the set is not sufficient to define it precisely. The main aim of rough set theory is to reduce the boundary region by enlarging the lower approximation and shrinking the upper approximation. In 2011 and 2012, Girish et al. [12, 13] introduced rough msets in terms of lower and upper approximations. Moreover, they used an mset topological concept to investigate Pawlak’s rough set theory by replacing its universe by an mset. Since then, many authors have studied extensions of the results and properties of rough sets to rough multisets [11, 27]. Recently, Zhan et al. [28–31] have published several related papers on rough set theory with respect to rings.
In this paper, we introduce some basic issues of generalized rough msets induced by mset ideals in order to decrease the boundary region and increase the accuracy measure. The rest of the paper is organized as follows. Section 2 recalls the basic definitions and notions needed in the manuscript. In Section 3, we present the R**-upper and R**-lower mset approximations via mset ideals, together with their properties. These definitions generalize the definitions of Zakaria et al. [27] and of Girish et al. [11–13]. Section 4 constructs a new approach to rough msets via mset ideals and studies its properties. In addition, we compare these approximations with the previous approximations [12, 27], which shows that our approximations generalize them. In Section 5, an applied medical example is worked out with the current method.
Preliminaries
The aim of this section is to present the basic concepts and properties of msets. At the end of the section, we recall rough msets and the definitions and notions of relations in msets that will be needed in the sequel.
Here $C_M(x)$ is the number of occurrences of the element $x$ in the mset $M$. The mset $M$ is drawn from the set $X = \{x_1, x_2, \ldots, x_n\}$ and is written as $M = \{m_1/x_1, m_2/x_2, \ldots, m_n/x_n\}$, where $m_i$ is the number of occurrences of the element $x_i$, $i = 1, 2, 3, \ldots, n$, in the mset $M$.
The mset space $[X]^{\infty}$ is the set of all msets over a domain $X$ such that there is no limit on the number of occurrences of an element in an mset. If $X = \{x_1, x_2, \ldots, x_k\}$, then $[X]^{w} = \{\{m_1/x_1, m_2/x_2, \ldots, m_k/x_k\} : m_i \in \{0, 1, 2, \ldots, w\},\ i = 1, 2, \ldots, k\}$.
(1) $M = N$ if $C_M(x) = C_N(x)$ for all $x \in X$,
(2) $M \subseteq N$ if $C_M(x) \le C_N(x)$ for all $x \in X$,
(3) $P = M \cup N$ if $C_P(x) = \max\{C_M(x), C_N(x)\}$ for all $x \in X$,
(4) $P = M \cap N$ if $C_P(x) = \min\{C_M(x), C_N(x)\}$ for all $x \in X$,
(5) $P = M \oplus N$ if $C_P(x) = \min\{C_M(x) + C_N(x), w\}$ for all $x \in X$,
(6) $P = M \ominus N$ if $C_P(x) = \max\{C_M(x) - C_N(x), 0\}$ for all $x \in X$,
where $\oplus$ and $\ominus$ represent mset addition and mset subtraction, respectively.
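The operations above translate directly into count-wise rules. The following is a minimal sketch, assuming msets are stored as `collections.Counter` objects (element mapped to multiplicity) and that `w` is the bound of the mset space $[X]^w$; the function names are hypothetical.

```python
from collections import Counter


def is_submset(m: Counter, n: Counter) -> bool:
    """M is a submset of N when C_M(x) <= C_N(x) for every x."""
    return all(k <= n[x] for x, k in m.items())


def mset_union(m: Counter, n: Counter) -> Counter:
    """C_P(x) = max{C_M(x), C_N(x)}; Counter's | operator does exactly this."""
    return m | n


def mset_intersection(m: Counter, n: Counter) -> Counter:
    """C_P(x) = min{C_M(x), C_N(x)}; Counter's & operator does exactly this."""
    return m & n


def mset_add(m: Counter, n: Counter, w: int) -> Counter:
    """C_P(x) = min{C_M(x) + C_N(x), w}: counts are added, then capped at w."""
    return Counter({x: min(k, w) for x, k in (m + n).items()})


def mset_subtract(m: Counter, n: Counter) -> Counter:
    """C_P(x) = max{C_M(x) - C_N(x), 0}; Counter's - operator drops non-positive counts."""
    return m - n


# Illustrative msets (not taken from the paper), in the space [X]^4.
M = Counter({"x": 3, "y": 2})
N = Counter({"x": 1, "z": 4})

print(mset_union(M, N))         # counts: x=3, y=2, z=4
print(mset_intersection(M, N))  # counts: x=1
print(mset_add(M, N, w=4))      # counts: x=4, y=2, z=4
print(mset_subtract(M, N))      # counts: x=2, y=2
```

Storing only positive counts keeps equality of msets the same as equality of the underlying count maps.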
Let $M$ be an mset drawn from a set $X$. The support set of $M$, denoted by $M^{*}$, is a subset of $X$ given by $M^{*} = \{x \in X : C_M(x) > 0\}$; that is, $M^{*}$ is an ordinary set.
The power set of an mset is the support set of the power mset and is denoted by P* (X). The following theorem shows the cardinality of the power set of an mset.
(1) $M, \phi \in \tau$,
(2) the union of the elements of any subcollection of $\tau$ is in $\tau$,
(3) the intersection of the elements of any finite subcollection of $\tau$ is in $\tau$.
Hence, (M, τ) is called an M-topological space. Each element in τ is called an open mset.
(1) The interior of $A$ is defined as the union of all open msets contained in $A$ and is denoted by $int(A)$, i.e., $int(A) = \cup\{G \subseteq M : G \text{ is an open mset and } G \subseteq A\}$ and $C_{int(A)}(x) = \max\{C_G(x) : G \in \tau,\ G \subseteq A\}$.
(2) The closure of $A$ is defined as the intersection of all closed msets containing $A$ and is denoted by $cl(A)$, i.e., $cl(A) = \cap\{K \subseteq M : K \text{ is a closed mset and } A \subseteq K\}$ and $C_{cl(A)}(x) = \min\{C_K(x) : K \in \tau^{c},\ A \subseteq K\}$.
The cartesian product of three or more non-empty msets can be defined by generalizing the definition of the cartesian product of two msets. Thus, the cartesian product $M_1 \times M_2 \times \ldots \times M_n$ of the non-empty msets $M_1, M_2, \ldots, M_n$ is the mset of all ordered $n$-tuples $(m_1, m_2, \ldots, m_n)$, where $m_i \in^{r_i} M_i$, $i = 1, 2, \ldots, n$, and $(m_1, m_2, \ldots, m_n) \in^{p} M_1 \times M_2 \times \ldots \times M_n$ with $p = \prod r_i$, where $r_i = C_{M_i}(m_i)$ and $i = 1, 2, \ldots, n$.
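The rule $p = \prod r_i$ says that a tuple occurs as many times as the product of the multiplicities of its components. A minimal sketch, assuming msets are stored as `collections.Counter` objects and using the hypothetical helper name `mset_cartesian`:

```python
from collections import Counter
from itertools import product
from math import prod


def mset_cartesian(*msets: Counter) -> dict:
    """Cartesian product of msets.

    Keys are tuples of (multiplicity, element) pairs, mirroring the k/x
    notation; the value is the multiplicity p of the tuple, i.e. the
    product of the component multiplicities r_i.
    """
    result = {}
    for combo in product(*(m.items() for m in msets)):
        # combo holds one (element, multiplicity) pair per factor mset.
        key = tuple((r, x) for x, r in combo)
        result[key] = prod(r for _, r in combo)
    return result


# Illustrative msets (not taken from the paper).
M1 = Counter({"x": 3, "y": 2})
M2 = Counter({"z": 4})

for key, p in mset_cartesian(M1, M2).items():
    print(key, p)
```

In this sketch the pair built from 3/x and 4/z receives multiplicity 12 and the pair built from 2/y and 4/z receives multiplicity 8, in the same way that entries such as (3/x, 4/z)/12 are written for mset relations.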
Also, R< n/y > is the intersection of all pre-msets containing y with nonzero multiplicity; that is,
The pair
It should be noted that 0 ≤ μ (N) ≤1.
The pair
It should be noted that $0 \le \mu_{R}^{I}(N) \le 1$.
The purpose of this section is to present a new method to define the basic concepts of rough multisets via multiset ideals by using the notion R < m/x > R. The main properties of the current method are studied and compared to Zakaria et al.’s method [27].
The following theorem studies the main properties of the current upper mset approximation.
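(1) $\overline{R}^{**}(A) = [\underline{R}^{**}(A^{c})]^{c}$.
(2) $\overline{R}^{**}(\phi) = \phi$.
(3) $A \subseteq B \Rightarrow \overline{R}^{**}(A) \subseteq \overline{R}^{**}(B)$.
(4) $\overline{R}^{**}(A \cap B) \subseteq \overline{R}^{**}(A) \cap \overline{R}^{**}(B)$.
(5) $\overline{R}^{**}(A \cup B) = \overline{R}^{**}(A) \cup \overline{R}^{**}(B)$.
(6) $\overline{R}^{**}(\overline{R}^{**}(A)) \subseteq \overline{R}^{**}(A)$.
(7) $A \in I \Rightarrow \overline{R}^{**}(A) = \phi$.
(8) $\overline{R}^{**}(A) \oplus \overline{R}^{**}(B) \subseteq \overline{R}^{**}(A \oplus B)$.
(9) $\overline{R}^{**}(A \ominus B) \subseteq \overline{R}^{**}(A)$.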
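(3) Let $x \in^{m} \overline{R}^{**}(A)$. Then $R\langle m/x\rangle R \cap A \notin I$. Since $A \subseteq B$ and $I$ is an M-ideal, $R\langle m/x\rangle R \cap B \notin I$. Therefore, $x \in^{m} \overline{R}^{**}(B)$. Hence, $\overline{R}^{**}(A) \subseteq \overline{R}^{**}(B)$.
(4) Immediate by part (3).
(5) $\overline{R}^{**}(A) \cup \overline{R}^{**}(B) \subseteq \overline{R}^{**}(A \cup B)$ by part (3). Conversely, let $x \in^{m} \overline{R}^{**}(A \cup B)$. Then $R\langle m/x\rangle R \cap (A \cup B) \notin I$. It follows that $(R\langle m/x\rangle R \cap A) \cup (R\langle m/x\rangle R \cap B) \notin I$. Since an M-ideal is closed under finite unions, $R\langle m/x\rangle R \cap A \notin I$ or $R\langle m/x\rangle R \cap B \notin I$, which means that $x \in^{m} \overline{R}^{**}(A)$ or $x \in^{m} \overline{R}^{**}(B)$. Then $x \in^{m} \overline{R}^{**}(A) \cup \overline{R}^{**}(B)$. Thus, $\overline{R}^{**}(A \cup B) \subseteq \overline{R}^{**}(A) \cup \overline{R}^{**}(B)$. Hence, $\overline{R}^{**}(A \cup B) = \overline{R}^{**}(A) \cup \overline{R}^{**}(B)$.
(6) Let $x \in^{m} \overline{R}^{**}(\overline{R}^{**}(A))$. Then $R\langle m/x\rangle R \cap \overline{R}^{**}(A) \notin I$. Therefore, $R\langle m/x\rangle R \cap \overline{R}^{**}(A) \neq \phi$. Thus, there exists $y \in^{n} R\langle m/x\rangle R \cap \overline{R}^{**}(A)$. This means that $R\langle n/y\rangle R \subseteq R\langle m/x\rangle R$ (by Lemma 3.1) and $R\langle n/y\rangle R \cap A \notin I$. Then $R\langle m/x\rangle R \cap A \notin I$. Hence, $x \in^{m} \overline{R}^{**}(A)$. This completes the proof.
Straightforward by Definition 3.1.
(9) Since $A \ominus B \subseteq A$, part (3) gives $\overline{R}^{**}(A \ominus B) \subseteq \overline{R}^{**}(A)$.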
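(1) $A \nsubseteq \overline{R}^{**}(A)$.
(2) $\overline{R}^{**}(M) \neq M$.
(3) $\overline{R}^{**}(A \cap B) \neq \overline{R}^{**}(A) \cap \overline{R}^{**}(B)$.
(4) $A \nsubseteq \overline{R}^{**}(\overline{R}^{**}(A))$ and $\overline{R}^{**}(A) \nsubseteq \overline{R}^{**}(\overline{R}^{**}(A))$.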
Let $M = \{3/x, 2/y, 4/z, 8/r\}$, $I = \{\phi, \{3/x\}, \{2/x\}, \{1/x\}, \{2/z\}, \{1/z\}, \{3/x, 2/z\}, \{3/x, 1/z\}, \{2/x, 2/z\}, \{2/x, 1/z\}, \{1/x, 2/z\}, \{1/x, 1/z\}\}$ and $R = \Delta \cup \{(3/x, 2/y)/6, (3/x, 4/z)/12, (4/z, 8/r)/32, (2/y, 4/z)/8, (2/y, 8/r)/16\}$. Then $R\langle 3/x\rangle R = \{3/x\}$, $R\langle 2/y\rangle R = \{2/y\}$, $R\langle 4/z\rangle R = \{4/z\}$ and $R\langle 8/r\rangle R = \{8/r\}$. If $A = \{3/x, 2/y\}$, then $\overline{R}^{**}(A) = \{2/y\}$. Hence, $A \nsubseteq \overline{R}^{**}(A)$. Also, $\overline{R}^{**}(M) = \{2/y, 4/z, 8/r\} \neq M$.
Let $M = \{3/x, 2/y, 4/z, 5/r\}$, $I = \{\phi, \{2/x\}, \{1/x\}, \{2/r\}, \{1/r\}, \{2/x, 2/r\}, \{2/x, 1/r\}, \{1/x, 2/r\}, \{1/x, 1/r\}\}$ and $R = \Delta \cup \{(3/x, 2/y)/6, (2/y, 3/x)/6, (3/x, 4/z)/12, (4/z, 3/x)/12, (3/x, 5/r)/15, (5/r, 3/x)/15, (2/y, 4/z)/8, (5/r, 4/z)/20\}$. Then $R\langle 3/x\rangle R = \{3/x\}$, $R\langle 2/y\rangle R = \{3/x, 2/y\}$, $R\langle 4/z\rangle R = \{3/x, 4/z\}$ and $R\langle 5/r\rangle R = \{3/x, 5/r\}$. If $A = \{3/x, 5/r\}$ and $B = \{2/y, 4/z\}$, then $\overline{R}^{**}(A) = M$, $\overline{R}^{**}(B) = \{2/y, 4/z\}$ and $\overline{R}^{**}(A \cap B) = \phi$. Hence, $\overline{R}^{**}(A \cap B) \neq \overline{R}^{**}(A) \cap \overline{R}^{**}(B)$.
In Example 3.1 part (2), if $A = \{2/y\}$, then $\overline{R}^{**}(A) = \{2/y\}$ and $\overline{R}^{**}(\overline{R}^{**}(A)) = \phi$. Hence, $A \nsubseteq \overline{R}^{**}(\overline{R}^{**}(A))$ and $\overline{R}^{**}(A) \nsubseteq \overline{R}^{**}(\overline{R}^{**}(A))$.
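The first setting of the example above can be reproduced with a short sketch. It rests on an assumption: Definition 3.1 is not reproduced in this text, so the membership rule is read off from the proof of the upper approximation properties, namely that m/x belongs to the upper approximation of A exactly when R<m/x>R ∩ A is not in the ideal I. The neighbourhoods R<m/x>R are copied from the example rather than computed from R, and the names `freeze` and `upper_approx` are hypothetical.

```python
from collections import Counter


def freeze(mset: Counter) -> frozenset:
    """Hashable form of an mset, so that msets can be stored in a set (the ideal)."""
    return frozenset((x, k) for x, k in mset.items() if k > 0)


def upper_approx(M: Counter, nbhd: dict, ideal: set, A: Counter) -> Counter:
    """Keep m/x from M whenever (R<m/x>R intersected with A) is not in the ideal."""
    result = Counter()
    for x, m in M.items():
        meet = nbhd[x] & A  # mset intersection: minimum of the counts
        if freeze(meet) not in ideal:
            result[x] = m
    return result


# Data of the first setting: M, the neighbourhoods R<m/x>R, and the ideal I.
M = Counter({"x": 3, "y": 2, "z": 4, "r": 8})
nbhd = {
    "x": Counter({"x": 3}),
    "y": Counter({"y": 2}),
    "z": Counter({"z": 4}),
    "r": Counter({"r": 8}),
}
# I lists every mset with at most 3 copies of x and at most 2 copies of z
# (12 msets in total, including the empty mset).
ideal = {freeze(Counter({"x": i, "z": j})) for i in range(4) for j in range(3)}

A = Counter({"x": 3, "y": 2})
print(upper_approx(M, nbhd, ideal, A))  # counts: y=2
print(upper_approx(M, nbhd, ideal, M))  # counts: y=2, z=4, r=8
```

Under the stated assumption, the two results agree with the values {2/y} and {2/y, 4/z, 8/r} reported in the example.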
The main properties of the lower mset approximations are presented in the following theorem.
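(1) $\underline{R}^{**}(A) = [\overline{R}^{**}(A^{c})]^{c}$.
(2) $\underline{R}^{**}(M) = M$.
(3) $A \subseteq B \Rightarrow \underline{R}^{**}(A) \subseteq \underline{R}^{**}(B)$.
(4) $\underline{R}^{**}(A \cap B) = \underline{R}^{**}(A) \cap \underline{R}^{**}(B)$.
(5) $\underline{R}^{**}(A \cup B) \supseteq \underline{R}^{**}(A) \cup \underline{R}^{**}(B)$.
(6) $\underline{R}^{**}(A) \subseteq \underline{R}^{**}(\underline{R}^{**}(A))$.
(7) $A^{c} \in I \Rightarrow \underline{R}^{**}(A) = M$.
(8) $\underline{R}^{**}(A) \oplus \underline{R}^{**}(B) \subseteq \underline{R}^{**}(A \oplus B)$.
(9) $\underline{R}^{**}(A \ominus B) \subseteq \underline{R}^{**}(A)$.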
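(1) $\underline{R}^{**}(A) \nsubseteq A$.
(2) $\underline{R}^{**}(\phi) \neq \phi$.
(3) $\underline{R}^{**}(A \cup B) \neq \underline{R}^{**}(A) \cup \underline{R}^{**}(B)$.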
Similarly. Let
The following theorem presents the relationship between the current approximation in Definition 3.1 and the previous Definition 4.1 in [27].
(1) $\overline{R}^{**}(A) \subseteq \overline{R}^{*}(A)$.
(2) $\underline{R}^{*}(A) \subseteq \underline{R}^{**}(A)$.
It should be noted from Theorem 3.4 that the new Definition 3.1 decreases the upper mset approximation and increases the lower mset approximation. This new approach differs from the previous approaches [12, 27] and is more general. As a special case, if R is a symmetric mset relation, then the present mset approximations coincide with the previous mset approximations [27]. Also, if I = {φ} and R is a symmetric mset relation, then the present mset approximations coincide with the previous mset approximations [12, 13]. So, the previous mset approximations are special cases of the present mset approximations.
Example 3.1 shows that the inclusions in Theorem 3.4 parts (1) and (2) cannot be replaced by equality in general (for part (1), if $A = \{4/z\}$, then $\overline{R}^{**}(A) \nsupseteq \overline{R}^{*}(A)$). A similar example can be given for part (2).
In this section, a new kind of generalized multiset approximations via multiset ideals is introduced. The properties of the current approximations are studied. In addition, the relationship between these approximations and the approximations which are defined in the previous section is presented. Comparisons between the present approximations and the previous approximations [12, 27] are presented and shown to be more general. Moreover, the multiset topology induced by the current approximations is finer than the multiset topology induced by the previous approximations [12, 27]. Finally, some examples are used to illustrate the present concepts.
The following proposition presents the properties of upper mset approximation in Definition 4.1.
The following proposition presents the properties of lower mset approximation in Definition 4.1.
The following theorem presents the relationship between the current approximations in Definitions 4.1 and 4.2.
The following theorem presents the relationship between the current approximation in Definitions 4.1, 4.2 and the previous Definitions 2.20 and 2.21 in [27].
$BN^{I}(N) \subseteq BND_{R}^{I}(N)$. $\mu_{R}^{I}(N) \le \mu^{I}(N)$.
Let Let Immediate. Straightforward from part (1) and part (2).
It is noted from Theorem 4.2 that Definitions 4.1 and 4.2 reduce the boundary region and increase the accuracy measure of an mset by increasing the lower mset approximation and decreasing the upper mset approximation, in comparison with the method in Definitions 2.20 and 2.21 [27].
It should be noted that the interior of an mset N,
The relationship between the topology which was generated by the previous method [27] and the topology which is generated by the present method is introduced in the following proposition.
From Example 4.1, the lower and upper mset approximations, the boundary region and the accuracy for submsets of M are computed by using Zakaria et al.’s method in Definitions 2.20 and 2.21 [27] and the present method in Definitions 4.1 and 4.2, as shown in Table 1.
Comparison between the boundary and accuracy by using Zakaria et al.’s method in Definitions 2.20, 2.21 [27] and the present method in Definitions 4.1 and 4.2
For example, take {3/x, 8/r} , then the boundary and accuracy by the present Definition 4.2 are φ and 1 respectively. Whereas, the boundary and accuracy by using Zakaria et al.’s method in Definition 2.21 [27] are {3/x} and 8/11 respectively.
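The comparison above can be illustrated with a small sketch. It makes several assumptions that are not confirmed by this text: the boundary is taken as the mset difference between the upper and lower approximations, the accuracy as the ratio of their cardinalities (the usual rough-set convention), and the approximation pairs below are hypothetical values chosen only to be consistent with the reported boundaries and accuracies. The function names are hypothetical as well.

```python
from collections import Counter
from fractions import Fraction


def cardinality(mset: Counter) -> int:
    """Total number of occurrences in an mset."""
    return sum(mset.values())


def boundary(lower: Counter, upper: Counter) -> Counter:
    """Boundary region, assumed here to be upper minus lower (mset subtraction)."""
    return upper - lower


def accuracy(lower: Counter, upper: Counter) -> Fraction:
    """Accuracy measure, assumed here to be |lower| / |upper|."""
    if cardinality(upper) == 0:
        raise ValueError("accuracy is left undefined here for an empty upper approximation")
    return Fraction(cardinality(lower), cardinality(upper))


# Hypothetical approximation pairs for N = {3/x, 8/r}.
previous_lower, previous_upper = Counter({"r": 8}), Counter({"x": 3, "r": 8})
present_lower, present_upper = Counter({"r": 8}), Counter({"r": 8})

print(boundary(previous_lower, previous_upper), accuracy(previous_lower, previous_upper))
print(boundary(present_lower, present_upper), accuracy(present_lower, present_upper))
```

With these assumed inputs the previous method yields boundary {3/x} and accuracy 8/11, while any pair with equal lower and upper approximations yields an empty boundary and accuracy 1.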
Single-valued medical information system
Rheumatic fever data
In this section, we briefly describe the rheumatic fever data sets used in this study as a topological application of data reduction [22]. Rheumatic fever is a very common disease with many symptoms that differ from one patient to another even though the diagnosis is the same. We obtained the following data on 26 rheumatic fever patients. All patients are between 9 and 12 years old, with a history of arthritis which began at age 3 to 5 years. The disease usually starts at a young age and stays with the patient throughout life. Table 2 [18] represents the attributes as described in the rheumatic fever data sets and shows the coding of the data, which is as follows: Sex (S) = {M, F} = {0, 1}, Pharyngitis (F) = {yes, no} = {1, 0}, Arthritis (A) = {a0, a1, a2} = {0, 1, 2}, Carditis (R) = {affected, not affected} = {1, 0}, Chorea (K) = {yes, no} = {1, 0}, ESR (E) = {normal, high} = {0, 1}, Abdominal Pain (P) = {absent, present} = {0, 1} and Headache (H) = {yes, no} = {1, 0}. The decision attribute is Diagnosis (D) = {rheumatic arthritis, rheumatic carditis, rheumatic arthritis and carditis} = {d1, d2, d3}. Table 3 introduces the 26 patients characterized by 8 symptoms (attributes), which are used to decide the diagnosis for each patient (decision attribute). Table 4 contains information on the 26 patients characterized by the eight symptoms (attributes) used to decide the diagnosis for each patient (decision attribute), where the attributes are shown in Table 3. Then, we construct the multi-valued information system (MIS) as shown in Table 5.
Converted data description
Converted data description
Rheumatic fever data in multi-valued information system
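Patients that share the same attribute values are grouped together, and the number of patients in each group gives the integral part (multiplicity) of the corresponding mset element, as follows: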
Therefore, M = {2/x, 4/y, 4/z, 2/q, 3/r, 6/s, 5/t}. In addition, we can summarize this information in Table 6.
Multiset of information table
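The mset relation is defined as follows: $(m/x)\, R_a\, (n/y) \Leftrightarrow f_a(m/x) \subseteq f_a(n/y)$ for all $a \in At$, where $At = \{\alpha, \beta, \delta\}$ is the set of condition attributes. For $B \subseteq At$, $(m/x)\, R_B\, (n/y) \Leftrightarrow f_B(m/x) \subseteq f_B(n/y)$.
For $a \in At$, the class $A_{R_a\langle m/x\rangle R_a} = \{R_a\langle m/x\rangle R_a : m/x \in M\}$, where $R_a\langle m/x\rangle R_a = R_a\langle m/x\rangle \cap \langle m/x\rangle R_a$, $R_a\langle m/x\rangle = \cap\{R_a(n/y) : m/x \in R_a(n/y)\}$ and $R_a(m/x) = \{n/y : \exists \text{ some } k \text{ with } (n/y)\, R_a\, (k/x)\}$.
For $B \subseteq At$, the class $A_{R_B\langle m/x\rangle R_B} = \{R_B\langle m/x\rangle R_B : m/x \in M\}$, where $R_B\langle m/x\rangle R_B = R_B\langle m/x\rangle \cap \langle m/x\rangle R_B$, $R_B\langle m/x\rangle = \cap\{R_B(n/y) : m/x \in R_B(n/y)\}$ and $R_B(m/x) = \{n/y : \exists \text{ some } k \text{ with } (n/y)\, R_B\, (k/x)\}$.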
Discovering dependencies between attributes is an important issue. A set of attributes B depends totally on a set of attributes A, denoted by $A \Rightarrow B$, if all value sets of attributes from B are uniquely determined by value sets of attributes from A. Let A and B be subsets of At. We say that B depends on A in a degree K ($0 \le K \le 1$), denoted by $A \Rightarrow_{K} B$, if
If K = 1, B depends totally on A. If K < 1, B depends partially (in a degree K) on A. If we take A = At and B = D in the above two issues, where At is the set of condition attributes and D is the decision attribute, then we say that D depends totally on At, denoted by $At \Rightarrow D$, if all values of attributes from D are uniquely determined by value sets of attributes from At. Otherwise, we say that D depends on At in a degree K, denoted by $At \Rightarrow_{K} D$. If
Then, we can get the degree of dependency for each attribute as follows:
Thus, the set of attributes of equal highest degree of dependency is the set of principal attribute subset (PA ⊆ At), of the current system. So, we conclude that δ is the PA of current system. Hence, the reduction is
It should be noted that the degree of dependency for each attribute with respect to the previous method [27] is:
By comparing the values of the degree of dependency for each attribute in both methods, it’s clear that our method is better than the previous one [27].
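For orientation, the notion of degree of dependency can be sketched in the classical, single-valued rough set setting, where K is the fraction of objects whose condition class lies entirely inside one decision class. This is an assumption made for illustration only: it is the classical Pawlak formula, not the mset-ideal version used in this paper, and the small table and function names below are hypothetical and unrelated to the rheumatic fever data.

```python
from collections import defaultdict
from fractions import Fraction


def partition(rows: dict, attrs: list) -> list:
    """Group objects that agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for name, row in rows.items():
        blocks[tuple(row[a] for a in attrs)].add(name)
    return list(blocks.values())


def dependency_degree(rows: dict, cond: list, dec: list) -> Fraction:
    """Classical degree K: size of the positive region divided by the size of the universe."""
    decision_blocks = partition(rows, dec)
    positive = set()
    for block in partition(rows, cond):
        # A condition class counts only if it fits inside a single decision class.
        if any(block <= d for d in decision_blocks):
            positive |= block
    return Fraction(len(positive), len(rows))


# Hypothetical information table with condition attributes a, b and decision d.
rows = {
    "p1": {"a": 0, "b": 1, "d": "d1"},
    "p2": {"a": 0, "b": 1, "d": "d2"},
    "p3": {"a": 1, "b": 0, "d": "d2"},
    "p4": {"a": 1, "b": 1, "d": "d3"},
}

print(dependency_degree(rows, ["a", "b"], ["d"]))  # 1/2
print(dependency_degree(rows, ["a"], ["d"]))       # 0
print(dependency_degree(rows, ["b"], ["d"]))       # 1/4
```

In this toy table the decision depends on {a, b} only partially (K = 1/2), because p1 and p2 share all condition values but have different decisions.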
The main aim of rough msets is to reduce the boundary region by increasing the lower mset approximation and decreasing the upper mset approximation. In this paper, new lower and upper multiset approximations via multiset ideals are proposed to achieve this aim. Results on these new multiset approximations are studied, compared with the previous ones [12, 27] and shown to be more general. Finally, an applied example from the medical field is introduced.
It is well known that rough set theory can be regarded as a generalization of classical set theory. Furthermore, it is an important mathematical tool for dealing with vagueness (uncertainty). The boundary region approach is usually associated with vagueness (i.e., the existence of objects which cannot be uniquely classified to the set or its complement), which was first formulated in 1893 by the father of modern logic, Frege [8]. According to Frege, “The concept must have a sharp boundary. To the concept without a sharp boundary, there would correspond an area that had not a sharp boundary-line all around”, i.e., mathematics must use crisp, not vague, concepts; otherwise it would be impossible to reason precisely. Pawlak presented the concept of rough sets, which have a wide range of applications in fields such as cognitive sciences, artificial intelligence, knowledge discovery from databases, machine learning and expert systems; further applications can be found in [17, 32].
The original rough set theory does not consider attributes with repetition, that is, criteria. In many real-world situations, however, we face problems in which the repetition of properties of the considered attributes plays an important role. Consequently, many situations occur where the objects in the universe of discourse are counted more than once. We believe that applying multisets in rough set theory plays a prominent role in decision making and optimization problems, where repetition of objects is essential. It is therefore natural and profitable to extend classical rough sets to rough msets. In most applications related to computer systems, a huge amount of data must be stored and processed, and this data is updated constantly. Often the data is incomplete, vague and uncertain, so extracting useful information from it is not a trivial task. One of the convenient and effective tools for this process is the theory of rough sets and their generalizations. There may also be many situations where copies of the same objects occur in the database. When the copies or the count of each object is significant, we cannot eliminate them. Multisets play a prominent role in processing such information. Information systems dealing with multisets are often regarded as information multisystems.
An information system is a pair S = (M, A), where M is a nonempty finite set of objects called the universe of discourse and A is a nonempty finite set of attributes. Information multisystems are represented using multisets instead of crisp sets. Formally, an information multisystem [15] can be defined as a triple S = (M, A, R), where M is an mset of objects, A is the set of attributes, and R is an mset relation defined on M. For example [15], a chemical system can be defined as S = (M, A, R), where M is the mset of all possible molecules, A is an algorithm describing the reaction vessel or domain and how the rules are applied to the molecules inside the vessel, and R is the set of collision rules representing the interaction among the molecules. In short, rough multisets, generalized rough multisets and their related properties, expressed through lower and upper mset approximations, are important frameworks for certain types of information multisystems.
It is worthwhile to note that Chakrabarthy [5, 6] introduced two types of bags, called Ic bags and $n^k$ bags, which are suitable for situations where the counts of the objects in an information system are not fixed and are represented in the form of intervals of positive integers and power sets of positive integers. These kinds of problems appear, for instance, during nuclear fission, when a nucleus (consisting of protons and neutrons) is split into multiple nuclei, each with its own number of protons and neutrons. It is possible to associate the concepts of rough multisets and generalized rough multisets with Ic bags or $n^k$ bags with the help of the lower and upper mset approximations. This can also be used for certain types of decision analysis problems and could prove useful as a mathematical tool for building decision support systems.
