Abstract
In this paper, ideals in the context of multisets on the lattice of all submultisets with the order relation as the multiset inclusion have been introduced. Moreover, generalization of rough multiset model by defining new multiset approximation operators in more general setting via multiset ideal has been presented. The concepts of lower and upper multiset approximations via multiset ideals have been mentioned. These new definitions decrease the upper multiset approximation and increase the lower multiset approximation and hence decreasing the boundary region and increasing the accuracy measure. Properties of these multiset approximations are studied and various examples are mentioned. Also, the comparison between the rough multiset approximations defined by Girish and John [11, 12] and the current multiset approximations has been presented. It’s therefore shown that the current definitions are more generally. Finally, the multiset topology induced by the present methods is finer than the multiset topology induced by the old method [11, 12].
Keywords
Introduction
In classical set theory, a set is a well-defined collection of distinct objects. If repeated occurrences of any object is allowed in a set, then a mathematical structure, that is known as multiset (mset [1] or bag [27], for short). Thus, an mset differs from a set in the sense that each element has a multiplicity a natural number not necessarily one that indicates how many times it is a member of the mset. One of the most natural and simplest examples is the mset of prime factors of a positive integer n. The number 400 has the factorization 400 = 2452 which gives the mset {2, 2, 2, 2, 5, 5}. Also, the cubic equation x3 - 5x2 + 3x + 9 =0 has roots 3, 3 and -1 which give the mset {3, 3, - 1}.
Classical set theory is a basic concept to represent various situations in Mathematical notation where repeated occurrences of elements are not allowed. But in various circumstances repetition of elements become mandatory to the system. For example, a graph with loops, there are many hydrogen atoms, many water molecules, many strands of identical DNA etc. This leads to effectively three possible relations between any two physical objects; they are different, they are the same but separate, or they are coinciding and identical. For example, ammonia NH3, with three hydrogen atoms, say H1, H2 and H3, and one nitrogen atom, say N. Clearly H1 and N are different. However H1, H2 and H3 are the same but separate, while H1 and H1 are coinciding and identical. There are many other examples, for instance, carbon dioxide CO2, sulfuric acid H2SO4, and water H2O etc.
The notion of ideal topological spaces was first studied by Kuratowski [17] and Vaidyanathaswamy [25]. Compatibility of the topology with an ideal was first defined by Njastad [18]. In 1990, Jankovic and Hamlett [13] investigated further properties of ideal topological spaces. Zhan [30] introduced the uncertainties of ideal theory on hemirings. Concept approximation problem is one of the most important issues in machine learning and data mining. Classification, clustering, association analysis, and regression are examples of well-known problems in data mining that can be formulated as concept approximation problems. A great effort of many researchers has been done to design newer, faster, and more efficient methods for solving the concept approximation problem. Rough set theory has been introduced by Pawlak [19] as a tool for concept approximation under uncertainty. The idea is to approximate the concept by two descriptive sets called lower and upper approximations. The lower and upper approximations must be extracted from available training data.
The main philosophy of rough set approach to concept approximation problem is based on minimizing the difference between upper and lower approximations (also called the boundary region). This simple, but brilliant idea, indicates to many efficient applications of rough sets in machine learning and data mining like feature selection, rule induction, discretization, or classifier construction. A. Kandil et al. [15] introduced the definition of lower and upper approximations via ideal. These definitions helped to decrease the boundary region and increase the accuracy measure. Girish et al. [11, 12] introduced rough mset in terms of lower and upper approximations. Moreover, they used an mset topological concept to investigate Pawlak’s rough set theory by replacing its universe by mset. For further reading in rough set and rough multiset [4, 31].
Rough set theory [20] has been conceived as a tool to conceptualize, organize and analyze various types of data, in particular, to deal with inexact, uncertain or vague knowledge in applications related to Artificial Intelligence. M. Kryszkiewicz [16] present Rough set approach to incomplete information systems; that is, to systems in which attribute values for objects may be unknown (missing, null). His concern is devoted to finding rules from such systems. Different ways were described in which null values may be handled [3, 23]. For instance, the methodology from [3] consists in transforming an incomplete system to a complete system, where each object with incomplete descriptor from the source system is represented by a set of quasi-objects in the target system. Another approach presented in [3] consists in removing objects with unknown values from the original system.
This paper is an attempt to explore a new approach of rough mset to decrease the boundary region and increase the accuracy measure. This paper is arranged as follows, Section 2 has a collection of all basic definitions and notions for further study. The purpose of Section 3 is to construct a new approach of ideals in the context of msets on the lattice of all submsets with the order relation as the mset inclusion. Properties of mset ideals are studied. In Section 4, the R*- upper and R*- lower mset approximations via mset ideals have been introduced. These definitions are different from Girish et al.’s definitions and more general. If , then our mset approximations coincide with Girish et al.’s mset approximations. So, Girish et al.’s mset approximations are special case of our mset approximations. Moreover, Properties of these mset approximations are studied. In Section 5, an mset topology via this new approach has been introduced. This mset topology is finer than Girish et al.’s one. Moreover, this new approach leads to decrease the boundary region and increase the accuracy measure. In addition, the boundary of a submset decreases as the mset ideal on a nonempty mset M increases. Furthermore, varied examples are introduced to show the significance of this new approach.
Preliminaries and basic definitions
In this section, a brief survey of the notion of msets as introduced by Yager [27], Blizard [1, 2] and Jena et al. [14] have been collected. Furthermore, the different types of collections of msets, Rough msets, and the basic definitions and notions of relations in mset context introduced by Girish and John [8–12].
Let M be an mset from the set X = {x1, x2, . . . , x n } with x appearing n times in M. It is denoted by x ∈ n M. The mset M drawn from the set X is denoted by M = {k1/x1, k2/x2, . . . , k n /x n } where M is an mset with x1 appearing k1 times, x2 appearing k2 times and so on. In Definition 2.1, C M (x) is the number of occurrences of the element x in the mset M. However those elements which are not included in the mset M have zero count. An mset M is a set if C M (x) =0 or 1 ∀ x ∈ X.
Let M, N ∈ [X]
m
. Then, the following are defined: M is a submset of N denoted by (M ⊆ N) if C
M
(x) ≤ C
N
(x) ∀ x ∈ X. M = N if M ⊆ N and N ⊆ M. M is a proper submset of N denoted by (M ⊂ N) if C
M
(x) ≤ C
N
(x) ∀ x ∈ X and there exists at least one element x ∈ X such that C
M
(x) < C
N
(x). P = M ∪ N if C
P
(x) = max {C
M
(x) , C
N
(x)} for all x ∈ X. P = M ∩ N if C
P
(x) = min {C
M
(x) , C
N
(x)} for all x ∈ X. The intersection of x into an mset M results in a new mset N, denoted by N = M ⊕ x such that
Addition of M and N results in a new mset P = M ⊕ N such that C
P
(x) = min {C
M
(x) + C
N
(x) , m} for all x ∈ X. The removal of x from the mset M results in a new mset N, denoted by N = M ⊖ x such that
Subtraction of M and N results in a new mset P = M ⊖ N such that C
P
(x) = max {C
M
(x) - C
N
(x) , 0} for all x ∈ X, where ⊕ and ⊖ represent mset addition and mset subtraction, respectively. An mset M is empty if C
M
(x) =0 ∀ x ∈ X. The support set of M denoted by M* is a subset of X and M* = {x ∈ X : C
M
(x) >0}; that is, M* is an ordinary set and it is also called root set. The cardinality of an mset M drawn from a set X is Card(M)= ∑x∈XC
M
(x). M and N are said to be equivalent if and only if Card(M)=Card(N).
The power set of an mset is the support set of the power mset and is denoted by P* (M). The following theorem shows the cardinality of the power set of an mset.
φ and M are in τ. The union of the elements of any sub collection of τ is in τ. The intersection of the elements of any finite sub collection of τ is in τ.
An mset topological space is an ordered pair (M, τ) consisting of an mset M and an mset topology τ ⊆ P* (M). Note that τ is an ordinary set whose elements are msets and the mset topology is abbreviated as a M-topology. Also, a submset U of M is an open mset of M if U belongs to the collection τ. Moreover, a submset N of M is closed mset if M ⊖ N is an open mset.
Here the entry (m/x, n/y)/mn in M1 × M2 denotes x is repeated m times in M1, y is repeated n times in M2 and the pair (x, y) is repeated mn times in M1 × M2.
The Cartesian product of three or more nonempty msets can be defined by generalizing the definition of the Cartesian product of two msets.
An mset relation R on an mset M is reflexive if (m/x) R (m/x) for all m/x in M, irreflexive if (m/x) R (m/x) never holds. An mset relation R on an mset M is symmetric if (m/x) R (n/y) implies (n/y) R (m/x), antisymmetric if (m/x) R (n/y) and (n/y) R (m/x) implies m/x and n/y are equal. An mset relation R on an mset M is transitive if (m/x) R (n/y) and (n/y) R (k/z), then (m/x) R (k/z). An mset relation R on an mset M is called an equivalence mset relation if it is reflexive, symmetric and transitive.
Also, R〈n/y〉is the intersection of all pre-msets containing y with nonzero multiplicity; that is,
and C
N
2
(x) ≤ C
N
1
(x) for all x ∈ X
,
and .
It should be noted that is an ordinary set whose elements are msets and the mset ideal is abbreviated as an M-ideal.
, Finite union of members of is in .
,
,
is finite}, called M-ideal of a finite submsets of M,
, such that {m
o
/x
o
} ⊆ M,
for all x ∈ X}.
,
and.
Proposition 3.2 implies that . Therefore, be a nonempty collection of submsets of M. Let and C
N
2
(x) ≤ C
N
1
(x) for all x ∈ X. It follows that and . Hence Definition 3.1 part (1) implies and . Thus . Now, let and . Hence and . Since and are M-ideals, then and . So . Hence is an M-ideal on M. Proposition 3.2 implies that . Thus, be a nonempty collection of submsets of M. Let and C
N
2
(x) ≤ C
N
1
(x) for all x ∈ X. Hence N1 = H ∪ G such that and . Since C
N
2
(x) ≤ C
N
1
(x) for all x ∈ X, it follows that there exist K1, K2 ⊆ M such that C
N
2
(x) = C(K1∪K2) (x) for all x ∈ X, C
K
1
(x) ≤ C
H
(x) for all x ∈ X, and C
K
2
(x) ≤ C
G
(x) for all x ∈ X. Since and are M-ideals, then and . Thus . Now, let and . Then N1 = H1 ∪ G1 and N2 = H2 ∪ G2 such that and . Hence and . Thus . Then the result. □
The following example shows that the union of two an M-ideals is not necessary to be an M-ideal.
Rough multiset via multiset ideal
If , then R* (N) = φ, If , then .
Let . Hence R* (N) = {m/x : 〈m/x〉
R
∩ N ∉ P* (M)} = φ. Let , then . □
R* (N) = [R* (N
c
)]
c
, R* (φ) = φ, N1 ⊆ N2 ⇒ R* (N1) ⊆ R* (N2), R* (N1 ∪ N2) = R* (N1) ∪ R* (N2), R* (R* (N)) ⊆ R* (N), R* (N1 ∩ N2) ⊆ R* (N1) ∩ R* (N2),
, R* (N1) ⊕ R* (N2) ⊆ R* (N1 ⊕ N2), R* (N1 ⊖ N2) ⊆ R* (N1).
The result is a direct consequence of Equation (8). Let x ∈
m
R* (N1). Then . Since 〈m/x〉
R
∩ N1 ⊆ 〈m/x〉
R
∩ N2. Definition 3.1 part (1) implies that , and hence x ∈
m
R* (N2). Then the result.
Let x ∈
m
R* (R* (N)), then . Hence 〈m/x〉
R
∩ R* (N) ≠ φ. Thus, there exists y ∈
n
〈m/x〉
R
∩ R* (N). It follows that 〈n/y〉
R
⊆ 〈m/x〉
R
and . Since 〈n/y〉
R
∩ N ⊆ 〈m/x〉
R
∩ N. Hence Definition 3.1 part (1) implies ; that is, x ∈
m
R* (N). Then the result.
The result follows immediately from Definition 3.1 and .
The following example shows that N ⊈ R* (N), in general. Moreover, it shows that equality does not holds in (6), (8), and (9) of Theorem 4.3 in general.
Let N = {3/x, 8/r} be a submset of M. ThenR* (N) = {8/r}. Thus N ⊈ R* (N). Also, let M1 = {4/z, 8/r} and M2 = {3/x, 2/y, 8/r} be two sub-msets of M. Then R* (M1) = M and R* (M2) = {3/x, 2/y, 8/r}, and hence R* (M1) ∩ R* (M2) = {3/x, 2/y, 8/r}. But M1 ∩ M2 = {8/r} and R* (M1 ∩ M2) = {8/r}. Therefore, R* (M1 ∩ M2) ≠ R* (M1) ∩ R* (M2). Moreover, M1 ⊖ M2 = {4/z}, and R* (M1 ⊖ M2) = {3/x, 2/y, 4/z}. Thus R* (M1 ⊖ M2) ≠ R* (M1). Furthermore, if N1 = {3/x, 1/z} and N2 = {2/z, 8/r} be two submsets of M. Then R* (N1) = φ and R* (N2) = {8/r}, and hence R* (N1) ⊕ R* (N2) = {8/r}. But N1 ⊕ N2 = {3/x, 3/z, 8/r} and R* (N1 ⊕ N2) = M. Consequently, R* (N1 ⊕ N2) ≠ R* (N1) ⊕ R* (N2).
and ,
,
.
Let . Then , since . It follows that ; that is, . Hence . On the other hand, let . Hence . Since . It follows that . Therefore, . Thus .
R* (N) = [R* (N
c
)]
c
, R* (M) = M, N1 ⊆ N2 ⇒ R* (N1) ⊆ R* (N2), R* (N1 ∩ N2) = R* (N1) ∩ R* (N2), R* (N) ⊆ R* (R* (N)), R* (N1) ∪ R* (N2) ⊆ R* (N1 ∪ N2),
, R* (N1) ⊕ R* (N2) ⊆ R* (N1 ⊕ N2), R* (N1 ⊖ N2) ⊆ R* (N1).
It should be noted that .
If , then , If , then .
, If N1 ⊆ N2, then ,
,
,
,
,
, If , then ,
,
,
, If N1 ⊆ N2, then ,
,
,
,
,
, If , then ,
,
.
In such case mset interior of N, , is identical with defined in Equation (11) and mset closure of N, , is identical with defined in Equation (10).
, (for , see Equation (6)), R* (N) is an mset closed; that is, .
Let . Hence x ∈
m
N or . It follows that x ∈
m
N or 〈m/x〉
R
∩ N ≠ φ, and hence . To prove that R* (N) be an mset closed, it suffices to show that . Let . It follows that x ∈
m
R* (N) or x ∈
m
R* (R* (N)). Hence Theorem 4.3 part (5) implies x ∈
m
R* (N). Hence the result. □
The following corollary compare between τ R and , where τ R is the M-topology defined in Theorem 2.23 and is that one defined in Corollary 5.5.
The integral part of mset that the researchers can collect the days that have the same attributes together as follows:
The mset relation define as follows:
Let be an M-ideal on M. Let N = {4/x, 2/z, 5/r} be a submset of M, then and . Consequently, and the accuracy of . On the other hand, according to Girish et al.’s [11, 12] and . Therefore, and the accuracy μ R of . It’s clear that .
The following theorem shows that the boundary of a submset decreases as the M-ideal on M increases.
From Theorems 5.3 and 5.6 we can conclude the final theorem as follows.
The following example compares between the boundary of the mset and the accuracy measure of the mset depending on Definition 2.22 and Definition 5.1. Moreover, it shows that the current methods reduce the boundary and increase the accuracy measure of submset N of a nonempty mset M in comparison of Girish et al.’s method [11, 12], (See Table 3).
The essential goal of rough set approach to concept approximation problem is based on minimizing the boundary region and increasing the accuracy measure. This brilliant idea leads to many efficient applications of rough sets in machine learning and data mining like feature selection, rule induction, discretization, or classifier construction. In this paper, a new definitions of lower and upper mset approximations via mset ideals have been introduced. These new definitions decrease the upper multiset approximation and increase the lower multiset approximation and hence decreasing the boundary region and increasing the accuracy measure. This new approach is different from Girish et al.’s approach and more general. If , then the present mset approximations coincide with Girish et al.’s mset approximations. So, Girish et al.’s mset approximations are special case of the present mset approximations. We believe that this new approach we mention here will be more useful for actual applications of the rough set theory and give us a hand to acquire much more insights into the mathematical structures of rough mset approximation operators.
In most of the applications related to computer systems, huge amount of data is required to be stored and processed. Also data is being upgraded constantly every day. Often this data is incomplete, vague and uncertain. Extracting useful information from such a data is not an easy task. One of the convenient and effective tools for this process is the theory of rough sets and their generalizations. There may also be situations where there will be copies of the same objects occurring in the database. When the copies or counts of each object are also significant, we cannot eliminate them. Multisets play a prominent role in processing such information. Information systems dealing with multisets are often regarded as information multisystems.
An information systems a pair S = (M, A) where M is a nonempty finite set of objects called the universe of discourse and A is a nonempty finite set of attributes. Information multisystems are represented using multisets instead of crisp sets. Formally, an information multisystem [10] can be formally defined as a triple, S = (M, A, R) where M is an mset of objects, A is the set of attributes, and R is an mset relation defined on M. For example [10], the chemical system can be defined as S = (M, A, R) where M is the mset of all possible molecules, A is an algorithm describing the reaction vessel or domain and how the rules are applied to the molecules inside the vessel, and R is the set of “collision” rules, representing the interaction among the molecules.
In short, the concepts of rough multisets and generalized rough multisets and their related properties with the help of lower and upper mset approximations are important frameworks for certain types of information multisystems. It is worthwhile to note that Chakrabarthy [5, 6] introduced two types of bags call I c bags and n k bags, which is suitable for situations where the counts of the objects in information system are not fixed and are represented in the form of intervals of positive integers and power set of positive integers. These kinds of problems appear, for instance, during a nuclear fission, when a nucleus (consisting of protons and neutrons) is split into multiple nuclei, each of them with its own number of protons and neutrons. It is possible to associate the concept of rough multisets and generalized rough multisets to I c bags or n k bags with the help of lower mset approximation and upper mset approximation. This can also be used for certain types of decision analysis problems and could prove useful as mathematical tools for building decision support systems.
Acknowledgments
The authors are grateful to the anonymous referee for a careful checking of the details and for helpful comments that improved this paper.
