Abstract
In many empirical situations, a decision must be made based on data captured in an information system. A problem arises when the information system holds complex data or too many data attributes, which creates the need to reduce the number of attributes required to reach a decision. In this paper, a novel attribute reduction method is presented. The proposed method is based on constructing a weighted pretopology that represents the information system under consideration. In addition, some essential operations on the weighted pretopological space are presented, together with a brief study of their properties.
Introduction
Information systems play a key role in reaching the right decision. In machine learning, constructing a mathematical model of the information system is one way to deal with this situation.
In most cases, the information system holds too many data attributes, which is costly to process and may be time consuming. Therefore, the data attributes need to be reduced to the core attributes. The degree of accuracy of the reduction in an information system has been studied in different ways, including rough sets [2, 16], the similarity matrix [11], and the dissimilarity matrix [12].
In the past decade, several techniques for attribute reduction were proposed, including positive region preservation reduction, generalized decision preservation reduction, distribution preservation reduction, maximum distribution preservation reduction, and relative discernibility relation preservation reduction.
In 1981, Badard [15] introduced the concept of a fuzzy pretopology to generalize the concept of pretopology initiated by M. Brissaud [12]. In his paper, Badard investigated some related concepts of fuzzy pretopology such as continuity, compactness, and connectedness. In 1987, A.A. Ramadan [1] investigated some concepts in fuzzy pretopological spaces such as the Hausdorff property, regularity, normality, and several kinds of compactness and connectedness.
The aim of this paper is to introduce a new methodology for reducing the attributes in an information system by using a statistical method.
The remainder of this paper is structured as follows. In Section 2, the fundamental definitions of pretopological spaces are introduced. In Section 3, a new type of fuzzy pretopological space for information systems is developed using a similarity method. In Section 4, a statistical method is presented for the purpose of reducing the attributes. An illustrative example of the similarity method is given in Section 5. Finally, Section 6 gives a brief conclusion.
Preliminaries
Definition [15]
A fuzzy pretopology on X is a mapping $a : I^X \to I^X$ that satisfies: (FP1) $a(0) = 0$. (FP2) $A \subseteq a(A)$, for every $A \in I^X$.
The pair (X, a) is said to be a fuzzy pretopological space (for short, fpts).
Definition [15]
Let (X, a) be a fpts. The fuzzy interior mapping $\mathrm{int}_a : I^X \to I^X$ is defined by $\mathrm{int}_a(A) = \mathrm{co}(a(\mathrm{co}\,A))$, where $\mathrm{co}\,A$ is the complement of $A$ and $A \in I^X$.
Definition [6]
For any fpts (X, a), the following properties are considered.
(FP3) For every $A, B \in I^X$ such that $A \subseteq B$, we have $a(A) \subseteq a(B)$. In this case, (X, a) is said to be a fpts of type I.
(FP4) For every $A, B \in I^X$, we have $a(A \cup B) = a(A) \cup a(B)$. In this case, (X, a) is said to be a fpts of type D.
(FP5) For every $A \in I^X$, we have $a^2(A) = a(a(A)) = a(A)$. In this case, (X, a) is said to be a fpts of type S.
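The properties (FP1) to (FP5) can be checked numerically on a small example. The following is a minimal sketch, not taken from the paper: it assumes a three-point set X, a hypothetical reflexive fuzzy relation `R`, and the illustrative max-min closure $a(A)(x) = \max_y \min(R(x, y), A(y))$. Fuzzy sets are sampled on a coarse grid of membership grades, so the checks are illustrative rather than a proof.

```python
from itertools import product

X = ["x1", "x2", "x3"]

# Hypothetical reflexive fuzzy relation on X (R[x][x] == 1.0); values are illustrative.
R = {
    "x1": {"x1": 1.0, "x2": 0.6, "x3": 0.0},
    "x2": {"x1": 0.2, "x2": 1.0, "x3": 0.5},
    "x3": {"x1": 0.0, "x2": 0.0, "x3": 1.0},
}

def closure(A):
    """Assumed closure: a(A)(x) = max over y of min(R(x, y), A(y))."""
    return {x: max(min(R[x][y], A[y]) for y in X) for x in X}

def subset(A, B):
    """Fuzzy inclusion: A(x) <= B(x) for every x."""
    return all(A[x] <= B[x] for x in X)

def union(A, B):
    """Fuzzy union: pointwise maximum."""
    return {x: max(A[x], B[x]) for x in X}

# Sample fuzzy sets whose membership grades lie on a coarse grid.
grades = [0.0, 0.5, 1.0]
fuzzy_sets = [dict(zip(X, g)) for g in product(grades, repeat=len(X))]
zero = {x: 0.0 for x in X}

fp1 = closure(zero) == zero
fp2 = all(subset(A, closure(A)) for A in fuzzy_sets)
fp3 = all(subset(closure(A), closure(B))
          for A in fuzzy_sets for B in fuzzy_sets if subset(A, B))
fp4 = all(closure(union(A, B)) == union(closure(A), closure(B))
          for A in fuzzy_sets for B in fuzzy_sets)
fp5 = all(closure(closure(A)) == closure(A) for A in fuzzy_sets)

print(fp1, fp2, fp3, fp4, fp5)
```

With this particular relation, (FP1) to (FP4) hold on the sampled sets, while (FP5) fails because `R` is not max-min transitive, so the example space is of types I and D but not of type S.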
Information system
An information system (IS) [13] is an ordered triple IS = (U, K, Q), where U is a nonempty finite set of objects (students, toy blocks, ...) called the universe, K is a nonempty set of attributes (colors, characteristics, ...), and Q is an ordinal attribute scale.
Weighted pretopological space generated from an information system
In this work, we aim to develop a new type of fuzzy pretopological space that uses a correlation method to represent the information system at hand.
Definition
For any information system (IS), we define a |U| × |U| similarity matrix of the following form:
where |K| is the cardinality of the set of attributes K.
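As a minimal sketch, assuming the entries of the similarity matrix are Pearson correlation coefficients between pairs of objects computed over their |K| attribute values (which is consistent with the mean, covariance, and standard deviation steps of the worked example), the matrix can be built as follows. The function names and the data in `table` are hypothetical and chosen only for illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation of two equally long score vectors (population form)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    du = [a - mean_u for a in u]          # divergence from the mean
    dv = [b - mean_v for b in v]
    cov = sum(a * b for a, b in zip(du, dv)) / n
    sd_u = math.sqrt(sum(a * a for a in du) / n)
    sd_v = math.sqrt(sum(b * b for b in dv) / n)
    if sd_u == 0 or sd_v == 0:
        return 0.0                        # assumption: undefined correlation is treated as 0
    return cov / (sd_u * sd_v)

def similarity_matrix(table):
    """|U| x |U| matrix of correlations between objects (rows of `table`)."""
    return [[pearson(r, s) for s in table] for r in table]

# Hypothetical data: rows are objects, columns are attributes.
table = [
    [1, 2, 3, 4],
    [2, 4, 6, 8],
    [4, 3, 2, 1],
]
S = similarity_matrix(table)
for row in S:
    print([round(value, 3) for value in row])
```

The resulting matrix is symmetric with ones on the diagonal whenever no object has constant scores across all attributes.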
Definition
Let $U = \{x_1, x_2, x_3, \ldots, x_n\}$ be a set of objects, and let $\Gamma_a(x_i)$ be a relation between $x_i$ and $x_j$ whose value is determined by $\mathrm{corr}(x_i, x_j)$.
Then:
where $(U, \Gamma_a(x_i))$ is a weighted pretopological space (for short, WP).
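One possible reading of this construction can be sketched in Python. The sketch rests on two assumptions that are not stated in the definition itself: that the neighborhood of an object collects the objects whose correlation with it reaches a threshold `sigma`, and that the pseudo-closure of a crisp subset A collects the objects whose neighborhood meets A. The interior is then taken as the complement of the pseudo-closure of the complement, following the interior mapping given in the preliminaries. The matrix `S` and all function names are hypothetical.

```python
def neighborhoods(S, sigma):
    """Assumed neighborhoods: gamma[i] = {j : S[i][j] >= sigma}."""
    n = len(S)
    return {i: {j for j in range(n) if S[i][j] >= sigma} for i in range(n)}

def pseudo_closure(A, gamma):
    """Objects whose neighborhood intersects the crisp set A."""
    return {i for i, nb in gamma.items() if nb & A}

def interior(A, gamma):
    """Complement of the pseudo-closure of the complement of A."""
    universe = set(gamma)
    return universe - pseudo_closure(universe - A, gamma)

# Hypothetical correlation matrix for three objects (indices 0, 1, 2).
S = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]
gamma = neighborhoods(S, sigma=0.5)

print(gamma)
print(pseudo_closure({0}, gamma))
print(interior({0, 1}, gamma))
```

Because every object is perfectly correlated with itself, each object belongs to its own neighborhood for any threshold not exceeding 1, so A is always contained in its pseudo-closure under this reading.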
Definition
For any weighted pretopological space $(X, \Gamma_a)$, the interior function $\mathrm{int} : F(X) \to F(X)$ is defined as:
Equivalently,
Definition
For any weighted pretopological space $(X, \Gamma_a)$, the following properties are considered: (WP1) $\Gamma_a(0) = 0$. (WP2) $A \subseteq \Gamma_a(A)$.
Definition
For any fuzzy pretopological space $(X, \Gamma_a)$ used in an information system, the following properties are considered.
(WP3) For every $A, B \in I^X$ such that $A \subseteq B$, we have $\Gamma_a(A) \subseteq \Gamma_a(B)$. In this case, $(X, \Gamma_a)$ is said to be a WP of type I.
(WP4) For every $A, B \in I^X$, we have $\Gamma_a(A \cup B) = \Gamma_a(A) \cup \Gamma_a(B)$. In this case, $(X, \Gamma_a)$ is said to be a WP of type D.
(WP5) For every $A \in I^X$, we have $\Gamma_a^2(A) = \Gamma_a(\Gamma_a(A)) = \Gamma_a(A)$. In this case, $(X, \Gamma_a)$ is said to be a WP of type S.
Definition
For any set $x \subset U$, the attributes' reduction degree of accuracy ($\gamma_x$) [5] can be approximated using the following formula:
Definition
The attributes’ reduction ratio (
The proposed reduction procedure
For any information system IS = (U, K, Q), where U is a set of n objects, K is a nonempty set of attributes, and Q is an ordinal attribute scale, we propose the following procedure to define a set of core attributes of the system under consideration:
(1) Construct a |U| × |U| similarity matrix whose entries are the correlation coefficients between the objects in U, computed over all attributes in K using Definition 3.1.
(2) Reconstruct the matrix in step (1) after removing one attribute from the set K.
(3) For each attribute in K, repeat step (2), removing another attribute.
(4) Construct a weighted pretopology for each of the matrices obtained in steps (1) to (3); its values are calculated using Definition 3.2.
(5) Find the degree of accuracy for each subset of U using Definition 3.6.
(6) Deduce the attributes' reduction ratio using Definition 3.7.
(7) Define the core attributes for the system at hand; in this step we have one of the following two cases:
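Steps (1) to (3) of the procedure can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the paper's implementation: the correlation is taken to be the Pearson coefficient between objects, the helper names `drop_attribute` and `leave_one_out` are hypothetical, and the scores are invented for the example. The later steps (weighted pretopology, degree of accuracy, and reduction ratio) are not covered.

```python
import math

def pearson(u, v):
    """Pearson correlation of two equally long score vectors (population form)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    cov = sum(a * b for a, b in zip(du, dv)) / n
    sd_u = math.sqrt(sum(a * a for a in du) / n)
    sd_v = math.sqrt(sum(b * b for b in dv) / n)
    if sd_u == 0 or sd_v == 0:
        return 0.0  # assumption: undefined correlation is treated as 0
    return cov / (sd_u * sd_v)

def similarity_matrix(table):
    """Step (1): correlations between all pairs of objects (rows)."""
    return [[pearson(r, s) for s in table] for r in table]

def drop_attribute(table, k):
    """Step (2): remove column k from every object."""
    return [row[:k] + row[k + 1:] for row in table]

def leave_one_out(table, names):
    """Step (3): one similarity matrix per removed attribute."""
    return {name: similarity_matrix(drop_attribute(table, k))
            for k, name in enumerate(names)}

# Hypothetical scores: rows are objects, columns are the four attributes.
names = ["Math.", "Phys.", "Chem.", "Comp."]
scores = [
    [7, 5, 6, 9],
    [6, 8, 5, 7],
    [9, 6, 8, 5],
]

full = similarity_matrix(scores)
reduced = leave_one_out(scores, names)
for name, matrix in reduced.items():
    print(name, [[round(value, 2) for value in row] for row in matrix])
```

Comparing each matrix in `reduced` with `full` shows how strongly the removal of a single attribute changes the similarity structure between the objects.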
Statistical application on weighted pretopological space
In this section, we give an example to demonstrate how to construct the pretopology from an information system.
Table 1 presents some employee applications (objects) and tests in science branches (attributes): Mathematics (Math.), Physics (Phys.), Chemistry (Chem.), and Computer skills (Comp.).
Table 1. The information system.
To proceed, we have |U| = 6 and |K| = 4.
Hence, we apply the following steps:
Calculate the sum of each attribute.
Compute the mean value.
Compute the divergence from the mean value for each attribute.
Evaluate $\delta_K$ for each attribute.
Compute the covariance for the attributes $u_i$ and $u_j$.
Compute the standard deviation for each attribute.
Compute the correlation.
Use the resulting correlations to find a weighted pretopology for the attributes.
Consider two partitions of U.
Compute the degree of accuracy for each partition (E1 and E2).
Calculate the ratio.
Constructing weighted pretopology using the full attribute set
Table 2 shows the sum and the mean value of each attribute, obtained from Equations (2).
Table 2. The sum and the mean value of each attribute.
Table 3 shows the divergence from the mean value, obtained from Equation (3).
Table 3. The divergence from the mean value.
Table 4 shows $\delta_{\mathrm{Math.}}$, obtained from Equation (4).
Table 4. $\delta_{\mathrm{Math.}}$
Table 5 shows $\delta_{\mathrm{Phys.}}$, obtained from Equation (4).
Table 5. $\delta_{\mathrm{Phys.}}$
Table 6 shows $\delta_{\mathrm{Chem.}}$, obtained from Equation (4).
Table 6. $\delta_{\mathrm{Chem.}}$
Table 7 shows $\delta_{\mathrm{Comp.}}$, obtained from Equation (4).
Table 7. $\delta_{\mathrm{Comp.}}$
Tables 8 and 9 show the covariance with k = 4, obtained from Equation (5).
Table 8. The sum of all attributes.
Table 9. The covariance of all attributes.
Table 10 shows the standard deviation, obtained from Equation (6).
Table 10. The standard deviation.
Table 11 shows the correlation, obtained from Equation (7).
Table 11. The correlation coefficients.
Table 12 gives the values of the correlation coefficients between objects after shortening the decimal values given in Table 11.
Table 12. The values of correlation between the objects.
The aim of this section is to examine the effect on the constructed weighted pretopology of removing some attributes from the original information system. For that purpose, we remove one attribute at a time.
5.1.2.1 Constructing weighted pretopology when removing the attribute Mathematics (Math.)
For the example given in Section 5.1, with the attribute Math. removed, we obtain a weighted pretopology by using the same steps described in (4.1).
Table 13 shows the sum and the mean value of each attribute, obtained from Equations (2).
Table 13. The sum and the mean value of each attribute.
Table 14 shows the divergence from the mean value, obtained from Equation (3).
Table 14. The divergence from the mean value.
Table 15 shows $\delta_{\mathrm{Phys.}}$, obtained from Equation (4).
Table 15. $\delta_{\mathrm{Phys.}}$
Table 16 shows $\delta_{\mathrm{Chem.}}$, obtained from Equation (4).
Table 16. $\delta_{\mathrm{Chem.}}$
Table 17 shows $\delta_{\mathrm{Comp.}}$, obtained from Equation (4).
Table 17. $\delta_{\mathrm{Comp.}}$
Tables 18 and 19 show the covariance with k = 3, obtained from Equation (5).
Table 18. The sum of three attributes.
Table 19. The covariance of three attributes.
Table 20 shows the standard deviation, obtained from Equation (6).
Table 20. The standard deviation.
Table 21 shows the correlation, obtained from Equation (7).
Table 21. The correlation coefficients.
Table 22 gives the values of the correlation coefficients between objects after shortening the decimal values given in Table 21.
Table 22. The values of correlation between the objects.
Remark 5.1:
Repeating the same procedure as for the previous tables, we reconstruct the weighted pretopological spaces by removing one column at a time, that is, removing the second column, the third column, the fourth column, and so on. Similarly, we are able to obtain the weighted pretopological spaces, the interior, and the degree of accuracy, and to calculate the attributes' reduction ratio.
5.1.2.2 Constructing weighted pretopology when removing the attribute Physics (Phys.)
Table 23 shows the correlation, obtained from Equation (7).
Table 23. The correlation coefficients.
Table 24 gives the values of the correlation coefficients between objects after shortening the decimal values given in Table 23.
Table 24. The values of correlation between the objects.
5.1.2.3 Constructing weighted pretopology when removing the attribute Chemistry (Chem.)
Table 25 shows the correlation, obtained from Equation (7).
Table 25. The correlation coefficients.
Table 26 gives the values of the correlation coefficients between objects after shortening the decimal values given in Table 25.
Table 26. The values of correlation between the objects.
5.1.2.4 Constructing weighted pretopology when removing the attribute Computer skills (Comp.)
Table 27 shows the correlation, obtained from Equation (7).
Table 27. The correlation coefficients.
Table 28 gives the values of the correlation coefficients between objects after shortening the decimal values given in Table 27.
Table 28. The values of correlation between the objects.
To recap, Table 29 shows the attributes' reduction ratio for the two partitions (E1 and E2).
From the values shown in Table 29, we may conclude that the core attributes of the information system under consideration are: core = {Mathematics, Computer skills}.
Conclusion
In this paper, a new attribute reduction method was presented. The proposed method is based on using a weighted pretopological structure to reduce the attributes in the information system.
The new technique provides the ability to exclude the objects with a correlation coefficient less than a certain value (σ) in order to define the core attributes. Compared to existing methods, the new technique does not need a preconceived decision to determine the core attributes.
For further study, we plan to develop this technique to support decision-making in information systems, and to investigate it in more real-life applications, such as decision support systems and geographical information systems.
