Abstract
With the revolution of computing and biology technology, data sets containing information could be huge and complex that sometimes are difficult to handle. Dynamic computing is an efficient approach to solve some of the problems. Since neighborhood multigranulation rough sets(NMGRS) were proposed, few papers focused on how to calculate approximations in NMGRS and how to update them dynamically. Here we propose approaches for computing approximations in NMGRS and updating them dynamically. First, static approaches for computing approximations in NMGRS are proposed. Second, search region in data set for updating approximations in NMGRS is shrunk. Third, matrix-based approaches for updating approximations in NMGRS while decreasing or increasing neighborhood classes are proposed. Fourth, incremental algorithms for updating approximations in NMGRS while decreasing or increasing neighborhood classes are designed. Finally, the efficiency and validity of the designed algorithms are verified by experiments.
Introduction
Since rough sets was proposed by Pawlak in 1982, it has been widely used in various fields such as pattern recognition [14], machine learning [32], image precessing [23, 24], decision making [11, 44], data mining and other relevant areas. Many models were proposed to extend its application, for example, covering based rough sets [39], variable precision rough sets [12], probabilistic rough sets [38], fuzzy rough sets [8, 36], fuzzy variable precision rough sets [45].
Qian et al. proposed multigranulation rough sets [25] (MGRS) based on multiple equivalence relation in 2010, which include optimistic and pessimistic multigranulation rough sets. In recent years, many models have been proposed based on the two decision strategies that “Seeking common ground while reserving differences” and “Seeking common ground with eliminating differences”. For example, by popularizing the binary relation from equivalence relation to neighborhood relation, Lin et al. proposed neighborhood multigranulation rough sets (NMGRS). Lots of studies focus on deriving models by the same decision strategy. Huang et al. proposed intuitionistic fuzzy multigranulation rough sets [9]. Feng et al. proposed variable precision multigranulation decision-theoretic fuzzy rough sets [5]. Li et al. proposed three-way cognitive concept learning via multi-granularity [15]. More over, there are researches about multigranulation rough sets and their relative models, such as multigranulation rough set theory over two universes [31], a comparative study of multigranulation rough sets and concept lattices via rule acquisition [16].
In an information explosion era, approximation computing becomes more and more difficult: the size of data sometimes is too huge to handle, the structure of data is changing to be more complex, and the granular structures often increase or decrease. The issue of computing and updating approximations in MGRS and their derived models attracts much research interest. The studies are often categorized into four classes, namely, how to update approximations while varying attributes [37, 42], how to update approximations while varying attribute values [3, 40], how to update approximations while varying decision attribute values [21, 37], how to update approximations while varying object set [7, 19].
No matter what variation is, there always exist two means to determine the relation between two sets: set operation or matrix product. Therefore, we can classify those studies into two categories. One is based on set operation. Scholars use set operation to determine whether a set is contained in another set or not, or whether their intersection is empty or not (see Chuan Luo [20, 22], Wenhao Shu [29], Guangming Lang [13], Mingjie Cai [2], Wei Wei [34] and etc). When granule size is generally big, set operation is a time consuming way because of its searching strategy: when we compute the intersection of two sets, we must confirm whether every object in a set is in another set or not. In the extreme case, when the two sets are both U(all samples are in a data set), then the time complexity of computing the intersection of them is |U|2. The other category is based on matrix. These studies are mostly based on matrix product or other operations. Scholars often change a set into binary matrix, and then design algorithms based on properties of binary matrix (see Jingqian Wang [33], Chengxiang Hu [6], Yanyong Huang [10] and etc). Although the time complexity of determining the relation between two sets via matrix approach is a constant, they often consider all the objects in the universe without filtering.
We attempt to combine the two approaches and derive new methods to overcome their defects. In other words, we concentrate on which part of the universe does not need to be considered while computing and updating approximations in NMGRS. At the same time, we determine the relation between two sets by matrix product.
Lin et al. proposed NMGRS [17] in 2012. NMGRS is a powerful mathematical tool in representing some real life situations and it extends the application of MGRS into a broader area. When we apply NMGRS in real life situations, approximation computing is an essential part. Approximations must be computed before making decision. Positive region must be calculated so that attributes can be selected in some attribute reduction process. Calculating approximations is necessary when applying rough set theory, thus how to compute approximations is an important issue in NMGRS. Granulation and approximation are two important issues in all rough sets models that can not be avoided. In NMGRS theory, different granule sizes have a big influence in its applications. Decreasing or increasing granules changes the approximations directly. Thus choosing an appropriate size of granules makes a great difference in decision making process [27]. Similar phenomenon exists in attribute reduction process because decreasing or increasing granules changes the positive region directly.
Neighborhood classes often decrease or increase. So it is important to update approximations in NMGRS based on approximations we have computed. We design algorithms for updating approximations while decreasing or increasing neighborhood classes. First, searching region while updating approximations in NMGRS need to be shrunk. A shrunk searching region can reduce the executing time of the algorithms. Second, matrix-based approaches need to be proposed to make the algorithms more efficient.
The rest of this paper is organized as follows. Section 2 introduces several basic concepts of NMGRS and matrix-based static algorithms to calculate approximations in NMGRS. In Section 3, dynamic approaches for updating approximations in NMGRS while decreasing or increasing neighborhood classes are proposed. Several algorithms designed using approaches that we proposed are given in Section 4. Experimental evaluations are conducted in Section 5 to verify the efficiency and validity of these algorithms. Finally, some conclusion and future work are given in Section 6.
Preliminaries
In this section, we review several basic concepts of NMGRS and propose matrix-based static approaches for computing approximations in NMGRS.
Neighborhood Multigranulation rough sets
In this subsection, several basic concepts of NMGRS are reviewed.
where d is a distance [28] between x and x
i
.
Accordingly, we say (U, δ) is a neighborhood approximation space, if there is an attribute subset in the system generating a neighborhood relation on the universe. We consider this system as a neighborhood information system, and denote it as NIS = (U, AT, δ), where U is a nonempty finite set and AT is a group of attribute set.
In this paper we consider distance between two objects as Euclid distance, which is defined as follows.
Matrix-based algorithm for computing approximations in NMGRS
A neighborhood information system
A neighborhood information system
Algorithm 1 is a matrix-based algorithm for computing approximations in NMGRS. The total time complexity of the algorithm is O (|AT||X||U|). Steps 2-7 are to calculate
Matrix-based algorithm for computing approximations in NMGRS
Matrix-based approaches for updating approximations while neighborhood classes are decreasing
In this subsection, we present matrix-based dynamic approaches for updating approximations in NMGRS while neighborhood classes are decreasing. We denote NIS
t
= (U, AT
t
, Δ) an neighborhood information system at time t, NISt+1 = (U, ATt+1, δ) an neighborhood information system at time t + 1 where δ and Δ are two nonnegative numbers and δ ≤ Δ. For all
If If If δ ≤ Δ, we have
□
Theorems 8 and 9 indicate the relations among lower and upper approximations in NMGRS between time t and time t + 1. However, they are not clear enough for updating approximations in NMGRS. The following theorem provides accurate approaches for updating approximations in NMGRS from time t to t + 1.
if if if if
“⇐”: Since we have “⇒”: It is similar to i. It is similar to ii.□
By Definition 1 we have
By Definitions 3 and 4 we have
By Theorem 10 we have
By Theorem 10 and Definition 8 we have By Theorem 10 and Definition 9 we have It is similar to i. It is similar to ii.□
By Definition 9 we have
Matrix-based approaches for updating approximations while neighborhood classes are increasing
In this subsection, we present matrix-based dynamic approaches for updating approximations in NMGRS while neighborhood classes are increasing. We denote NIS
t
= (U, AT
t
, δ) a neighborhood information system at time t, NISt+1 = (U, ATt+1, Δ) be a neighborhood information system at time t + 1 where δ and Δ are two nonnegative numbers and δ ≤ Δ. For all
If If If δ ≤ Δ, we have that
Theorems 14 and 15 indicate the relations among lower and upper approximations in NMGRS between time t and time t + 1. However, they are not clear enough for updating approximations in NMGRS. The following theorem provides accurate approaches for updating approximations in NMGRS from time t to t + 1 while neighborhood classes increasing.
if if if if
“⇒”: Since we have “⇒”: It is similar to i. It is similar to ii.□
By Definition 1 we have
By Definitions 3 and 4 we have
By Theorem 10 we have
By Theorem 16 and Definition 10 we have that By Theorem 16 and Definition 11 we have that It is similar to i. It is similar to ii.□
By Definition 11 we have
Matrix-based dynamic algorithms for updating approximations while neighborhood classes increasing or decreasing
Based on Corollary 12, we proposed a matrix-based Algorithm 2 for updating approximations in NMGRS while neighborhood classes are decreasing. The total time complexity of Algorithm 2 is
Matrix-based algorithm for updating approximations in NMGRS while neighborhood classes decreasing
Matrix-based algorithm for updating approximations in NMGRS while neighborhood classes decreasing
Since the time complexity of Algorithm 1 is O (|AT
t
||X||U|), and in general,
Ensure total time complexity of updating approximations in NMGRS while decreasing neighborhood classes no more than O (|AT t ||X||U|)
Based on Corollary 18, we proposed a matrix-based Algorithm 4 for updating approximations in NMGRS while neighborhood classes are increasing. The total time complexity of Algorithm 2 is
Matrix-based algorithm for updating approximations in NMGRS while neighborhood classes increasing
Since the time complexity of Algorithm 1is O (|AT
t
||X||U|), and in general, O (|AT
t
||
Ensure total time complexity of updating approximations in NMGRS while increasing neighborhood classes no more than O (|AT t ||X||U|)
In this section, several experiments were conducted to evaluate the effectiveness and the efficiency of the algorithms we proposed, namely Algorithm 1(MB), Algorithm 3(DMB) and Algorithm 5(DMB). We chose four algorithms to compare, namely, set operation based static algorithm(SOB) [37], set operation based dynamic algorithm(DSOB) [37], relation matrix based static algorithm(RMB) [6] and relation matrix based dynamic algorithm(DRMB) [6]. Six data sets were chosen from UCI machine learning repository. The details of these data sets are listed in Table 2. All the experiments were carried out on a personal computer with 64-bit windows 10, Inter(R) Core(TM) i7 6700HQ CPU @2.60GHz, and 16GB memory. The program language was Matlab r2015b.
Details of data sets
Details of data sets
The computational time are compared among MB,DMB,SOB,DSOB,RMB and DRMB when the size of data sets increases. With neighborhood classes decreasing or increasing. First of all, we randomly chose the categorical and numerical attributes
When the size of data sets increases gradually, results of the six algorithms while neighborhood classes increasing or decreasing in NMGRS are shown in Figs. 1 and 2. We can see that DMB is the most efficient algorithm when the size of data set increases gradually. From Fig. 1 we can see that DMB is effective and it reduces the computational time.

Comparison of Algorithm 3 when the size of U increase gradually.

Comparison of Algorithm 5 when the size of U increase gradually.
Similarly, instead of construct temporary data sets in Section 5.1, we construct temporary target concepts. The process of constructing and varying the three granular structures is similar to Section 5.1. We randomly divided each data set into ten subsets {X1, X2, ⋯ , X10 }. And then X1 was chosen as the first temporary target concept. Finally, we calculated the four approximations in MGRS by the the four algorithms ten times and compared the averages. Then we made X1 ∪ X2 the second temporary target concept and the whole process was repeated.
When the size of target concept increases gradually, results of MB, DMB, SOB, DSOB, RMB and DRMB are shown in Figs. 3 and 4. When the size of target concept increases gradually, RMB is always the most time consuming algorithm among the six algorithms. DMB and MB are more efficient than SOB,DSOB,RMB and DRMB. The computational time of DMB is always less than or equal to MB, so DMB is more efficient than any other algorithms among all the 6 algorithms considered. Sometimes the computational time of DMB is a little more than the MB while neighborhood classed decreasing, which is due to additional computation in DMB and it is within the expected range.

Comparison of Algorithm 3 when the size of X increase gradually.

Comparison of Algorithm 5 when the size of X increase gradually.
Data sets in real life applications sometimes are complex and huge. The granular structures often increase and decrease in some data sets. Sometimes it is difficult to handle. Therefore it is important to design algorithms to update approximations in NMGRS while neighborhood classes decreasing or increasing. In this paper, four algorithms have been proposed to ensure that the time complexity of the incremental algorithm is less than or equal to the static algorithm. Experimental results showed that the computational time of the incremental algorithms is no more than the static algorithm in most of the situations considered here.
We have examined the advantage and disadvantage of our proposed methods by experimental evaluation. The theoretical time complexity of DMB is more efficient than the other algorithms. However, the disadvantage of DMB and MB is also obvious: when the average size of granules is too small, the time complexity of matrix production is |U| and the time complexity of set operation is far less than |U|, which makes the algorithms based on the two approaches hard to compare. In the worst situation, DMB is more efficient than SOB and DSOB.
Approximation computation is a basic process of attribute reduction. In the future, we will further investigate attribute reduction algorithms using the approaches we proposed.
Footnotes
Acknowledgement
This work is supported by National Natural Science Foundation of China (No.11871259), National Natural Science Foundation of China (No.61379021), National Youth Science Foundation of China (No.61603173), the Natural Science Foundation of Fujian Province of China(No.2019J01748).
