Abstract
An information system, as a database that shows relationships between objects and attributes, is a crucial mathematical model in the field of artificial intelligence. A real-valued information system is an information system in which the information function values of each attribute are real numbers. This paper explores information structures in an incomplete real-valued information system. The distance between two objects in a given subsystem of an incomplete real-valued information system is first constructed. Then, the fuzzy $T_{\cos}$-equivalence relation induced by this subsystem is obtained by the Gaussian kernel method, where the Gaussian kernel is based on this distance. Next, the information structure of this subsystem is proposed. Moreover, relationships between two information structures are studied from the two aspects of dependence and separation. Finally, the dependence between two information structures is studied by using the inclusion degree. These results will be helpful for establishing a framework of granular computing.
Introduction
Research background and related works
Granular computing, presented by Zadeh [45–48], is a basic issue in knowledge representation and data mining. Its purpose is to seek an approximation scheme that allows us to view a phenomenon at different levels of granularity and thereby solve a complex problem effectively. Information granulation, organization and causation are basic notions of granular computing. An information granule is a family of objects that are drawn together by some constraint, such as indistinguishability, similarity or functionality. The process of constructing information granules is called information granulation; it granulates a universe into a family of disjoint or overlapping information granules. A granular structure is the family of information granules in which the internal structure of each information granule is visible as a sub-structure. Naturally, a granular structure can be depicted as a vector consisting of information granules. Lin [11–13] and Yao [40–42] explained the importance of granular computing, which aroused wide interest in it. So far, the study of granular computing has mainly followed four approaches, i.e., rough set theory [24], fuzzy set theory [44], concept lattices [23, 35] and quotient space theory [51].
Rough set theory is an effective tool to deal with uncertainty. An information system based on rough set theory was presented by Pawlak [24, 27]. Most applications of rough sets, such as uncertainty modeling [2, 32], reasoning with uncertainty [39, 40], rule extraction [1, 36], uncertainty measure [3, 49], and classification and feature selection [5, 33], are related to information systems.
In granular computing in information systems, the study of information structures is an important research topic. An equivalence relation is a special kind of similarity between two objects from a data set. Given an information system, each attribute subset determines an equivalence relation on the object set of this system. This equivalence relation partitions the object set into disjoint classes, which are called equivalence classes. If two objects belong to the same equivalence class, then they cannot be distinguished under this equivalence relation. Thus, each equivalence class is seen as an information granule consisting of indistinguishable objects [1]. The family of all these information granules constitutes a vector, which is called the information structure in the given information system induced by this attribute subset. In fact, information structures in an information system are granular structures in the sense of granular computing. For example, Zhang et al. [49] considered information structures in a fully fuzzy information system; Li et al. [18, 19] studied knowledge structures in a knowledge base and relationships between knowledge bases; Li et al. [17] gave information structures in distributed fc-decision information systems.
Motivation and inspiration
With the arrival of the information age, information acquisition, analysis and processing have become hot research topics in information technology. As a result, information uncertainty analysis, information fusion, attribute reduction and classification are becoming increasingly significant. Rough set theory is developed around the concept of an information system, which is a database that represents relationships between objects and attributes. If the information function values of an information system are real numbers but some values are missing, then it is called an incomplete real-valued information system. So far, three-way decision in an incomplete real-valued information system has not been studied. The purpose of this paper is to use the decision-theoretic rough set model to investigate three-way decision based on the Gaussian kernel in an incomplete real-valued information system. Let (U, A) be an information system and let P ⊆ A. Then an equivalence relation (or indiscernibility relation) ind (P) can be defined by
$$\mathrm{ind}(P) = \{(x, y) \in U \times U : a(x) = a(y) \ \text{for all } a \in P\}.$$
In this paper, we use a distance to describe the relationship between two information function values and organize the paper along the following lines. First, the distance between two information function values of a given attribute in an incomplete real-valued information system is constructed. Then, the distance between two objects in a given subsystem of an incomplete real-valued information system is constructed. Next, the fuzzy $T_{\cos}$-equivalence relation induced by this subsystem is obtained by using the Gaussian kernel method. In this way, the fuzzy neighborhood of every point can be viewed as an information granule. The information structure of this subsystem is consequently proposed.
The remainder of this paper is organized as follows. In Section 2, we review some basic concepts of fuzzy sets, fuzzy relations and incomplete real-valued information systems. In Section 3, we construct the distance between two objects based on a given subsystem of an incomplete real-valued information system and obtain the fuzzy $T_{\cos}$-equivalence relation induced by this subsystem by means of the Gaussian kernel method. In Section 4, we propose some concepts of information structures in an incomplete real-valued information system and give their properties. In Section 5, we conclude this paper and outline directions for future work.
Preliminaries
We first review some basic concepts about fuzzy sets, fuzzy relations, Pawlak rough sets and incomplete real-valued information systems.
Throughout this paper, U denotes a non-empty finite set, $2^U$ denotes the family of all subsets of U, |X| denotes the cardinality of $X \in 2^U$ and I denotes the unit interval [0, 1].
In this paper, put $U = \{x_1, x_2, \ldots, x_n\}$.
Fuzzy sets and fuzzy relations
Fuzzy sets are extensions of ordinary sets [44]. A fuzzy set P in U is defined as a function that assigns a value P (x) ∈ I to each element x in U, and P (x) is called the membership degree of x to the fuzzy set P.
In this paper, $I^U$ denotes the set of all fuzzy sets in U. The cardinality of $P \in I^U$ can be calculated with
$$|P| = \sum_{x \in U} P(x).$$
If R is a fuzzy set in U × U, then R is called a fuzzy relation on U. In this paper, $I^{U \times U}$ denotes the set of all fuzzy relations on U.
Let $R \in I^{U \times U}$. Then R may be represented by the matrix
$$M(R) = (r_{ij})_{n \times n}, \quad r_{ij} = R(x_i, x_j).$$
If M (R) = E (the identity matrix), then R is a fuzzy identity relation, written R = △; if $r_{ij} = 1$ for all i, j ≤ n, then R is a fuzzy universal relation, written R = ω.
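A function T : I × I → I is called a triangular norm (t-norm for short) if, for all a, b, c, d ∈ I, it satisfies:
(1) commutativity: T (a, b) = T (b, a) ,
(2) associativity: T (T (a, b) , c) = T (a, T (b, c)) ,
(3) monotonicity: a ⩽ c, b ⩽ d ⇒ T (a, b) ⩽ T (c, d) ,
(4) boundary condition: T (a, 1) = a .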
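Let T be a t-norm. A fuzzy relation R on U is called a fuzzy T-equivalence relation if, for all x, y, z ∈ U, it satisfies:
(1) Reflexivity: R (x, x) = 1,
(2) Symmetry: R (x, y) = R (y, x) ,
(3) T-transitivity: T (R (x, y) , R (y, z)) ⩽ R (x, z) .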
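The three conditions above can be checked mechanically on a small relation matrix. The sketch below is only an illustration: it assumes the form of $T_{\cos}$ that is usual in the kernel literature, $T_{\cos}(a, b) = \max\{ab - \sqrt{1 - a^2}\sqrt{1 - b^2}, 0\}$, which is not stated explicitly in the text as extracted here, and the function names and example matrices are hypothetical.

```python
import itertools
import math


def t_cos(a: float, b: float) -> float:
    """Assumed form of T_cos: max(ab - sqrt(1 - a^2) * sqrt(1 - b^2), 0)."""
    root_a = math.sqrt(max(0.0, 1.0 - a * a))
    root_b = math.sqrt(max(0.0, 1.0 - b * b))
    return max(a * b - root_a * root_b, 0.0)


def is_fuzzy_t_equivalence(rel, t=t_cos, tol=1e-9) -> bool:
    """Check reflexivity, symmetry and T-transitivity of a square matrix."""
    n = len(rel)
    reflexive = all(abs(rel[i][i] - 1.0) <= tol for i in range(n))
    symmetric = all(
        abs(rel[i][j] - rel[j][i]) <= tol for i in range(n) for j in range(n)
    )
    transitive = all(
        t(rel[i][j], rel[j][k]) <= rel[i][k] + tol
        for i, j, k in itertools.product(range(n), repeat=3)
    )
    return reflexive and symmetric and transitive


# Hypothetical relation matrices on a three-element universe.
R_GOOD = [[1.0, 0.9, 0.8], [0.9, 1.0, 0.9], [0.8, 0.9, 1.0]]
R_BAD = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.9], [0.1, 0.9, 1.0]]

print(is_fuzzy_t_equivalence(R_GOOD))  # T_cos(0.9, 0.9) = 0.62 <= 0.8
print(is_fuzzy_t_equivalence(R_BAD))   # T_cos(0.9, 0.9) = 0.62 > 0.1
```

In `R_BAD` the pair (x1, x3) is rated far less similar than the chain through x2 allows, so T-transitivity fails while reflexivity and symmetry still hold.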
An incomplete real-valued information system
In this part, we recall the concept of an incomplete real-valued information system.
If (U, A) is an information system, and A = C ∪ D where C is a conditional attribute set and D is a decision attribute set, then (U, A) is called a decision information system.
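Let (U, A) be an information system and let P ⊆ A. Then a binary relation on U can be defined as
$$\mathrm{ind}(P) = \{(x, y) \in U \times U : a(x) = a(y) \ \text{for all } a \in P\}.$$
Clearly, ind (P) is an equivalence relation on U and $\mathrm{ind}(P) = \bigcap_{a \in P} \mathrm{ind}(\{a\})$.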
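As a small illustration of how an attribute subset partitions the object set, the following sketch groups objects that agree on every attribute in P and checks the intersection identity for ind (P). The table, its values, and the helper names are hypothetical and contain no missing entries.

```python
from collections import defaultdict

# Hypothetical complete information table: object -> {attribute: value}.
TABLE = {
    "x1": {"a1": 4, "a2": 1},
    "x2": {"a1": 4, "a2": 2},
    "x3": {"a1": 8, "a2": 2},
    "x4": {"a1": 4, "a2": 1},
}


def ind_relation(table, attrs):
    """ind(P) as a set of ordered pairs that agree on every attribute in attrs."""
    return {
        (x, y)
        for x in table
        for y in table
        if all(table[x][a] == table[y][a] for a in attrs)
    }


def ind_classes(table, attrs):
    """Equivalence classes of ind(P), each class listed in table order."""
    groups = defaultdict(list)
    for obj, row in table.items():
        groups[tuple(row[a] for a in attrs)].append(obj)
    return sorted(groups.values())


print(ind_classes(TABLE, ["a1"]))
print(ind_classes(TABLE, ["a1", "a2"]))
```

Adding an attribute can only split classes, which is the crisp counterpart of the monotonicity that information structures are expected to satisfy.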
Table 1: An incomplete real-valued information system
Table 1 shows an incomplete real-valued information system (U, A), where U = {x1, x2, ⋯ , x13} is a set of automobiles and C = {Cylinders (a1) , Displacement (a2) , Horsepower (a3) , Weight (a4) , Acceleration (a5) , Model year (a6) , Origin (a7)}.
In this section, the fuzzy $T_{\cos}$-equivalence relation induced by an incomplete real-valued information system is given by means of the Gaussian kernel method.
Distances between two objects in an incomplete real-valued information system
To construct the distance between two objects in an incomplete real-valued information system, a novel distance function is first introduced.
d (a (x) , a (y)) =
Then
Thus
Since a3 is a real-valued attribute, by Definition 3.1, we have
Then
The distance between two information function values on the attribute a1 in an incomplete real-valued information system
The distance between two information function values on the attribute a2 in an incomplete real-valued information system
The distance between two information function values on the attribute a3 in an incomplete real-valued information system
The distance between two information function values on the attribute a4 in an incomplete real-valued information system
The distance between two information function values on the attribute a5 in an incomplete real-valued information system
The distance between two information function values on the attribute a6 in an incomplete real-valued information system
The distance between two information function values on the attribute a7 in an incomplete real-valued information system
The distance between two objects in the incomplete real-valued information system (U, A)
The fuzzy $T_{\cos}$-equivalence relation
The Gaussian kernel method is an important tool in machine learning and pattern recognition. To make data linearly separable and simplify classification tasks, it maps data into a higher-dimensional feature space [30]. Hu et al. [6, 7] found that there are relationships between rough sets and the Gaussian kernel method; consequently, the Gaussian kernel can be used to obtain fuzzy relations. In this part, the Gaussian kernel is used to extract a fuzzy $T_{\cos}$-equivalence relation on the object set of a given incomplete real-valued information system.
Gaussian kernel
Evidently G (x, y) satisfies:
(1) G (x, y) ∈ [0, 1];
(2) G (x, y) = G (y, x);
(3) G (x, x) =1.
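The three properties above, together with $T_{\cos}$-transitivity, can be checked numerically on a toy example. The sketch below rests on assumptions that are not taken from the text: it uses the common kernel form $G(x, y) = \exp(-d(x, y)^2 / (2\delta^2))$, the ordinary Euclidean distance on complete (no missing value) vectors in place of the paper's distance, and the usual formula for $T_{\cos}$. All names and data are hypothetical.

```python
import itertools
import math


def gaussian_relation(points, delta):
    """Relation matrix with entries exp(-d^2 / (2 * delta^2)), d Euclidean."""
    n = len(points)
    rel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(points[i], points[j])
            rel[i][j] = math.exp(-(d * d) / (2.0 * delta * delta))
    return rel


def t_cos(a, b):
    """Assumed form of T_cos: max(ab - sqrt(1 - a^2) * sqrt(1 - b^2), 0)."""
    root_a = math.sqrt(max(0.0, 1.0 - a * a))
    root_b = math.sqrt(max(0.0, 1.0 - b * b))
    return max(a * b - root_a * root_b, 0.0)


# Hypothetical objects described by two complete real-valued attributes.
POINTS = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (0.5, 0.5)]
G = gaussian_relation(POINTS, delta=1.0)
N = len(G)

in_unit_interval = all(0.0 <= G[i][j] <= 1.0 for i in range(N) for j in range(N))
symmetric = all(abs(G[i][j] - G[j][i]) <= 1e-12 for i in range(N) for j in range(N))
reflexive = all(G[i][i] == 1.0 for i in range(N))
tcos_transitive = all(
    t_cos(G[i][j], G[j][k]) <= G[i][k] + 1e-9
    for i, j, k in itertools.product(range(N), repeat=3)
)
print(in_unit_interval, symmetric, reflexive, tcos_transitive)
```

The transitivity check passes here because the Euclidean Gaussian kernel is positive semidefinite with unit diagonal; for a different distance this property has to be established separately.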
Thus, the fuzzy $T_{\cos}$-equivalence relation
An algorithm for generating the fuzzy $T_{\cos}$-equivalence relation
Information structures in an incomplete real-valued information system
In this section, we consider information structures in an incomplete real-valued information system.
Some concepts of information structures in an incomplete real-valued information system
Let $R \in I^{U \times U}$. For each x ∈ U, we define a fuzzy set $S_R(x)$ by
$$S_R(x)(y) = R(x, y) \quad (y \in U).$$
Based on Qian’s idea, we give the concept of information structures in an incomplete real-valued information system in the following definition.
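To make the vector view concrete, the sketch below builds an information structure from a relation matrix. It assumes that the fuzzy neighborhood of $x_i$ is the i-th row of the matrix, that is, $S_R(x_i)(x_j) = R(x_i, x_j)$, and that the information structure is the vector of these neighborhoods. The relation matrix and helper names are hypothetical.

```python
def neighborhood(rel, i):
    """Fuzzy neighborhood of x_i as a membership list: row i of the relation."""
    return list(rel[i])


def information_structure(rel):
    """Vector (S_R(x_1), ..., S_R(x_n)) of fuzzy neighborhoods."""
    return [neighborhood(rel, i) for i in range(len(rel))]


def cardinality(fuzzy_set):
    """Sigma-count of a fuzzy set: the sum of its membership degrees."""
    return sum(fuzzy_set)


# Hypothetical fuzzy relation on U = {x1, x2, x3}.
REL = [[1.0, 0.9, 0.8], [0.9, 1.0, 0.9], [0.8, 0.9, 1.0]]
STRUCTURE = information_structure(REL)

for index, granule in enumerate(STRUCTURE, start=1):
    print(f"S_R(x{index}) = {granule}, cardinality = {cardinality(granule):.2f}")
```

Each entry of the vector is one information granule, so comparing two information structures reduces to comparing granules component by component.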
Below, we propose dependence between information structures.
(1) $S_{\delta_2}(Q)$ is said to depend on $S_{\delta_1}(P)$, if for each i,
(2) $S_{\delta_2}(Q)$ is said to depend partially on $S_{\delta_1}(P)$, if there exists i,
(3) $S_{\delta_2}(Q)$ is said to be independent of $S_{\delta_1}(P)$, if for each i,
Properties of information structures in an incomplete real-valued information system
In this subsection, we give properties of information structures in an incomplete real-valued information system.
(1) If $0 < \delta_1 \le \delta_2 \le 1$, then for any P ⊆ A, $S_{\delta_1}(P) \preceq S_{\delta_2}(P)$;
(2) If P ⊆ Q ⊆ A, then for any δ ∈ (0, 1], $S_{\delta}(Q) \preceq S_{\delta}(P)$.
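Both monotonicity statements can be observed on a toy example. The sketch below is a hedged illustration: it reads "$S(Q)$ depends on $S(P)$" (that is, $S(P) \preceq S(Q)$) as component-wise inclusion of the fuzzy neighborhoods, which is how the properties above read but is an assumption about the missing formulas. It also assumes a Gaussian kernel of the form $\exp(-d^2 / (2\delta^2))$ with a Euclidean distance on complete data, and all names and values are hypothetical.

```python
import math

# Hypothetical complete data: object -> {attribute: real value}.
DATA = {
    "x1": {"a1": 0.0, "a2": 0.0},
    "x2": {"a1": 1.0, "a2": 0.5},
    "x3": {"a1": 0.2, "a2": 2.0},
}
OBJECTS = sorted(DATA)


def structure(attrs, delta):
    """Vector of fuzzy neighborhoods induced by an assumed Gaussian kernel."""
    rows = []
    for x in OBJECTS:
        row = []
        for y in OBJECTS:
            d2 = sum((DATA[x][a] - DATA[y][a]) ** 2 for a in attrs)
            row.append(math.exp(-d2 / (2.0 * delta ** 2)))
        rows.append(row)
    return rows


def included(a, b, tol=1e-12):
    """Fuzzy inclusion: a(y) <= b(y) for every y."""
    return all(u <= v + tol for u, v in zip(a, b))


def depends_on(s_q, s_p):
    """Assumed reading of S(P) <= S(Q): every P-granule lies in its Q-granule."""
    return all(included(p, q) for p, q in zip(s_p, s_q))


def partially_depends_on(s_q, s_p):
    """At least one P-granule lies in the matching Q-granule."""
    return any(included(p, q) for p, q in zip(s_p, s_q))


def independent_of(s_q, s_p):
    """No P-granule lies in the matching Q-granule."""
    return not partially_depends_on(s_q, s_p)


# (1) delta_1 <= delta_2: S_{delta_2}(P) depends on S_{delta_1}(P).
print(depends_on(structure(["a1"], 1.0), structure(["a1"], 0.5)))
# (2) P subset of Q: S_delta(P) depends on S_delta(Q).
print(depends_on(structure(["a1"], 0.5), structure(["a1", "a2"], 0.5)))
```

A larger δ or a smaller attribute set both raise every kernel value, so the granules can only grow, which is why both checks succeed under these assumptions.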
Then
So
By Theorem 4.2,
(2) By Definition 4.3,
Then
So
Thus, by Theorem 4.2,
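$S_{\delta_1}(Q) \preceq S_{\delta_2}(Q) \preceq S_{\delta_2}(P)$, $S_{\delta_1}(Q) \preceq S_{\delta_1}(P) \preceq S_{\delta_2}(P)$.
(1) $0 \le D(S_{\delta}(Q)/S_{\delta}(P)) \le 1$;
(2) $S_{\delta}(P) \preceq S_{\delta}(Q)$ implies $D(S_{\delta}(Q)/S_{\delta}(P)) = 1$;
(3) $S_{\delta}(P) \sqsubseteq S_{\delta}(Q) \sqsubseteq S_{\delta}(L)$ implies $D(S_{\delta}(P)/S_{\delta}(L)) \le D(S_{\delta}(P)/S_{\delta}(Q))$.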
The following theorem shows the fact that relationships between information structures in an incomplete real-valued information system can be quantitatively described by the inclusion degree.
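(1) $S_{\delta}(P) \preceq S_{\delta}(Q) \Leftrightarrow D(S_{\delta}(Q)/S_{\delta}(P)) = 1$.
(2) $S_{\delta}(P) \bowtie S_{\delta}(Q) \Leftrightarrow D(S_{\delta}(Q)/S_{\delta}(P)) = 0$.
(3) $S_{\delta}(P) \sqsubseteq S_{\delta}(Q) \Leftrightarrow 0 < D(S_{\delta}(Q)/S_{\delta}(P)) \le 1$.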
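The three equivalences can be illustrated with one candidate inclusion degree. The definition of D is not recoverable from the text as extracted, so the sketch below assumes, purely for illustration, that $D(S(Q)/S(P))$ is the fraction of objects whose P-neighborhood is included in the corresponding Q-neighborhood. This choice satisfies statements (1) to (3) above by construction, but it may differ from the authors' formula. The structures and names are hypothetical.

```python
def included(a, b, tol=1e-12):
    """Fuzzy inclusion: a(y) <= b(y) for every y."""
    return all(u <= v + tol for u, v in zip(a, b))


def inclusion_degree(s_q, s_p):
    """Assumed D(S(Q)/S(P)): share of P-granules inside the matching Q-granule."""
    hits = sum(1 for p, q in zip(s_p, s_q) if included(p, q))
    return hits / len(s_p)


# Hypothetical information structures on a three-element universe.
S_P = [[1.0, 0.2, 0.2], [0.2, 1.0, 0.9], [0.2, 0.9, 1.0]]
S_Q = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.3], [0.5, 0.3, 1.0]]
S_WIDE = [[1.0, 0.6, 0.6], [0.6, 1.0, 0.95], [0.6, 0.95, 1.0]]

print(inclusion_degree(S_WIDE, S_P))  # full dependence
print(inclusion_degree(S_Q, S_P))     # partial dependence
print(inclusion_degree(S_P, S_WIDE))  # independence
```

Under this reading the degree is 1 exactly when every granule is included, 0 exactly when none is, and strictly positive whenever at least one is.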
It follows that ∀ l,
Hence $S_{\delta}(P) \preceq S_{\delta}(Q)$.
(2) "⇒". Since $S_{\delta}(P) \bowtie S_{\delta}(Q)$, we have
Thus $D(S_{\delta}(Q)/S_{\delta}(P)) = 0$.
"⇐". Since $D(S_{\delta}(Q)/S_{\delta}(P)) = 0$, we obtain that ∀ l,
Then ∀ l,
(3) This follows from (1) and (2).
Information distance between two information structures
Considering separation between information structures, in this subsection, we propose the concept of information distance to differentiate two information structures in the same incomplete real-valued information system and give some of its properties.
For $A, B \in I^U$, denote
If A ⊆ B, then |A ⊕ B| = |B - A| = |B| - |A| .

The sizes of A (x), B (x) and C (x).
We only prove case (1).
Given x ∈ U, since C (x) ≤ A (x) ≤ B (x), we have
|A ⊕ B| + |B ⊕ C| - |A ⊕ C|
= (|B| - |A|) + (|B| - |C|) - (|A| - |C|)
= 2 (|B| - |A|) ≥0.
Thus
If A ⊆ B ⊆ C, then |A ⊕ B| + |B ⊕ C| = (|B| - |A|) + (|C| - |B|) = |C| - |A| = |A ⊕ C|.
If C ⊆ B ⊆ A, then |A ⊕ B| + |B ⊕ C| = (|A| - |B|) + (|B| - |C|) = |A| - |C| = |A ⊕ C|.
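The identities above can be checked numerically. The definition of ⊕ is missing from the text as extracted, so the sketch below assumes that $|A \oplus B| = \sum_{x \in U} |A(x) - B(x)|$, which agrees with $|A \oplus B| = |B| - |A|$ whenever A ⊆ B. It then tests the triangle-type inequality $|A \oplus B| + |B \oplus C| \ge |A \oplus C|$ on randomly generated fuzzy sets. All names and data are hypothetical.

```python
import itertools
import random


def card(a):
    """Sigma-count of a fuzzy set given as a list of membership degrees."""
    return sum(a)


def sym_diff_card(a, b):
    """Assumed |A (+) B|: sum of pointwise absolute differences."""
    return sum(abs(u - v) for u, v in zip(a, b))


# Case A subset of B: |A (+) B| should equal |B| - |A|.
SMALL = [0.1, 0.4, 0.0]
LARGE = [0.3, 0.9, 0.5]
print(sym_diff_card(SMALL, LARGE), card(LARGE) - card(SMALL))

# Triangle-type inequality on random fuzzy sets over a five-element universe.
random.seed(0)
FUZZY_SETS = [[random.random() for _ in range(5)] for _ in range(6)]
triangle_ok = all(
    sym_diff_card(a, b) + sym_diff_card(b, c) >= sym_diff_card(a, c) - 1e-12
    for a, b, c in itertools.product(FUZZY_SETS, repeat=3)
)
print(triangle_ok)
```

With this reading the inequality is the pointwise triangle inequality for absolute values summed over U, and equality holds in the nested cases A ⊆ B ⊆ C and C ⊆ B ⊆ A.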
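$\rho(S_{\delta}(P), S_{\delta}(Q)) \ge 0$,
$\rho(S_{\delta}(P), S_{\delta}(Q)) = \rho(S_{\delta}(Q), S_{\delta}(P))$.
By Lemma 4.3,
$\rho(S_{\delta}(P), S_{\delta}(Q)) = 0 \Leftrightarrow \forall i$,
By Lemma 4.3, we have
$\rho(S_{\delta}(P), S_{\delta}(Q)) + \rho(S_{\delta}(Q), S_{\delta}(L))$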
Thus (
(1)
(2) If $S_{\delta}(P) \preceq S_{\delta}(Q)$ and
(3) If $S_{\delta}(P) \preceq S_{\delta}(Q)$, then
Then
Thus
Hence
(2) Since $S_{\delta}(P) \preceq S_{\delta}(Q)$, for any i, we have
(3) Note that $S_{\delta}(P) \preceq S_{\delta}(Q)$. Then, for any i,
In this paper, the Gaussian kernel has been applied to obtain the fuzzy $T_{\cos}$-equivalence relation induced by a given subsystem of an incomplete real-valued information system. In this way, the fuzzy neighborhood of every point can be viewed as an information granule. The information structure of this subsystem has consequently been proposed. Dependence between information structures has been studied by using the inclusion degree. Moreover, the information distance between two information structures has been considered. These results will be significant for establishing a framework of granular computing in an information system. This vector-based framework can be used to represent different dimensions of information and may have potential applications to knowledge discovery in an incomplete real-valued information system. In the future, we will consider applications to knowledge discovery in an incomplete real-valued information system.
