Abstract
Attribute reduction is a widely used technique in data preprocessing that aims to remove redundant and irrelevant attributes. However, most attribute reduction models use only the importance of individual attributes as the basis for reduction, without considering the relationships between attributes or their impact on classification results. To overcome this shortcoming, this article first defines the distance between samples based on the number of combinations formed by comparing the samples in the same sub-division. Second, from the point of view of clustering, following the principle that the distance between points within a cluster should be as small as possible and the distance between samples in different clusters should be as large as possible, the combined distance is used to define the importance of attributes. Finally, a new attribute reduction mechanism is proposed based on this importance measure. Extensive experiments are conducted to verify the performance of the proposed reduction algorithm. The results show that the data sets reduced by our algorithm have a prominent advantage in classification accuracy, that the algorithm effectively reduces the dimensionality of high-dimensional data, and that it provides a new approach to the study of attribute reduction models.
Introduction
Classification helps to express and understand different things more clearly: according to different classification standards, elements are divided into several categories. With the development of the Internet, the volume of information has grown exponentially. To quickly extract useful information from this massive amount of data, data sets must be preprocessed. Attribute reduction eliminates redundant and irrelevant attributes, which can speed up data analysis and improve its accuracy [30]. Usually, the best clustering result is taken as a reference standard for classification, with similar samples assigned to the same cluster. In cluster analysis, if the relationship among clusters is measured by distance, then the larger the sample distance between clusters, the stronger the distinguishing ability, and vice versa. Redundant data not only wastes time and space during analysis but also harms the accuracy of data classification. Attribute reduction is therefore expected to reduce the dimension of the data and improve the classification accuracy of the data set.
As a powerful mathematical tool for dealing with inaccurate and incomplete data, rough sets [1–3] have drawn much attention from researchers and have been successfully applied in machine learning [4, 5], data mining [6–8], expert systems [9], fault diagnosis [10], and other fields. Attribute reduction [38–43] is one of the research hotspots of rough sets and a widely used technique in the data preprocessing stage. Its goal is to delete redundant conditional attributes and achieve dimensionality reduction while maintaining the ability of the original decision system to distinguish among samples. It also reduces the consumption of space resources, speeds up the algorithm and improves the classification accuracy of data sets. To find effective attribute reduction methods, researchers have extended the classic rough sets model and designed a variety of reduction models. For example, the positive region method uses equivalence class division [9–11, 23]. The discernibility matrix method uses the discernibility matrix to derive the discernibility function, and then uses the disjunction of the discernibility function to find the reduction sets [12–14, 37]. The information entropy reduction model applies information entropy to measure the importance of attributes and eliminates redundant attributes in turn [15–17, 27]. The granular computing attribute reduction method searches the conditional attributes for the subset with the coarsest granularity as the reduction core and, on this basis, calculates the importance of the remaining attributes to obtain the reduction set [18, 36]. The effectiveness of these reduction models has been verified through numerous experiments.
Although traditional reduction models differ in method, most of them use the indistinguishable relationship or equivalence relationship between samples to calculate the attribute reduction set, and these reduction sets all depend on the decision classes of the equivalence relationship. According to the findings of [23, 37], a reduction result obtained from the equivalence relationship between the conditional attributes and the decision attribute cannot fully reflect an improvement in sample classification ability after the reduction. Given a decision system DS = (U, C ∪ D), U is a nonempty finite set of data objects with x ∈ U and y ∈ U, C = {a, b, c, e} is the conditional attribute set, and D is the decision attribute. Suppose the set {x, y} is classified according to the decision label so that $[x] \in D_i \wedge [y] \in D_j \wedge i \neq j$, but after reduction we get [x] = [y]; in other words, data objects x and y fall into the same equivalence class. For example, let x = (1, 2, 1, 0, 1) and y = (1, 2, 3, 1, 2), so x ≠ y. If the reduction set is {a, b}, then x = (1, 2) and y = (1, 2), and x and y become equivalent. This inconsistency in the division of samples before and after reduction has a negative impact on the classification of the reduced data set. To avoid this problem, this paper proposes an attribute reduction method from the perspective of clustering.
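The inconsistency in the example above can be reproduced with a short sketch. It assumes that the five values of x and y are the conditional attributes a, b, c, e followed by the decision value; that column layout and the helper name `project` are illustrative assumptions, not taken from the paper.

```python
# Toy decision system from the example: four conditional attributes
# (a, b, c, e) followed by the decision value d. This column layout is
# an assumption made for illustration.
ATTRS = ("a", "b", "c", "e", "d")
x = dict(zip(ATTRS, (1, 2, 1, 0, 1)))
y = dict(zip(ATTRS, (1, 2, 3, 1, 2)))


def project(sample, attrs):
    """Return the values of `sample` restricted to `attrs`."""
    return tuple(sample[a] for a in attrs)


full = ("a", "b", "c", "e")
reduct = ("a", "b")

same_before = project(x, full) == project(y, full)      # distinguishable on C
same_after = project(x, reduct) == project(y, reduct)   # merged on {a, b}
different_decision = x["d"] != y["d"]                   # yet labelled differently

print(same_before, same_after, different_decision)
```

Under these assumptions the two objects are distinguishable on the full attribute set, become identical on {a, b}, and still carry different decision values, which is exactly the situation the paragraph describes.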
To address the above problem, we propose an attribute reduction method based on combined distance in the context of clustering. First, all data objects are divided according to the decision attribute, and objects with the same decision value are called intra-class samples. The intra-class samples are then divided according to the conditional attributes, and the number of combinations formed by comparing the samples of the same sub-division pair by pair is used to define the intra-class similarity distance. Second, samples with different decision values are called inter-class samples; samples from different clusters are compared pairwise, and the number of combinations obtained is used to define the inter-class sample distance. Third, to account for both the inter-class distance and the intra-class distance, we compute the combined distance as intra_distance + λ * inter_distance and adjust the parameter λ to obtain the best clustering effect. Finally, unnecessary attributes are deleted to obtain the reduction set.
The rest of this paper is organized as follows. Some basic concepts of rough sets are introduced in Section 2. In Section 3, the intra-class and inter-class combination distances are defined, standard measures of clustering quality are applied to define the importance of attributes, and reduction rules and an optimized reduction algorithm are designed. In Section 4, the classification accuracy of the reduction set and the correlation of the data set before and after reduction are analyzed through experiments. Section 5 summarizes the contribution of this article and the problems that remain to be solved.
Preliminary
In this section, a brief introduction is made on the basic concepts of equivalence, distinguishable relationship, indistinguishable relationship and approximate combination number in rough sets.
Basic concepts
Then, an equivalence class containing the object x is represented as:
An equivalence class is a subset of samples with the same conditional attribute values.
$POS_R(D) = POS_C(D)$, and for any $a \in R$, $POS_{R-\{a\}}(D) \neq POS_C(D)$.
In this definition, the main purpose of positive region reduction is to find a minimal attribute subset that keeps the positive region unchanged.
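The two conditions of a positive region reduct can be checked mechanically. The sketch below assumes the standard rough-set reading of the positive region (the union of the equivalence classes that contain a single decision value); the function names and the toy table are hypothetical and only serve as illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices by their values on `attrs` (equivalence classes)."""
    blocks = defaultdict(list)
    for idx, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(idx)
    return list(blocks.values())


def positive_region(rows, attrs, decision):
    """Indices whose equivalence class under `attrs` has one decision value."""
    pos = set()
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            pos.update(block)
    return pos


def is_positive_region_reduct(rows, cond_attrs, subset, decision):
    """Check POS_R(D) = POS_C(D) and that no single attribute can be dropped."""
    target = positive_region(rows, cond_attrs, decision)
    if positive_region(rows, subset, decision) != target:
        return False
    return all(
        positive_region(rows, [b for b in subset if b != a], decision) != target
        for a in subset
    )


# Hypothetical toy table: conditional attributes a, b, c and decision d.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
print(is_positive_region_reduct(rows, ["a", "b", "c"], ["a", "b"], "d"))
```

In this toy table attribute c duplicates a, so {a, b} preserves the positive region of the full set while neither a nor b can be removed, whereas the full set {a, b, c} fails the minimality condition.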
Usually, attribute reduction is applied to delete redundant attributes. In classification, deleting redundant and irrelevant attributes not only improves the classification accuracy but also saves time and space resources. Classification accuracy is therefore one of the important criteria for measuring the validity of a data set after reduction preprocessing, and hence for evaluating attribute reduction algorithms. Because the traditional attribute reduction process does not rely on subsequent classifiers, the classification effect of the reduced data set is not ideal [21]. We therefore need a reduction model that combines the distribution of the data itself with its classification characteristics. In general, the most ideal clustering result can be taken as the target for classification: a class in classification analysis corresponds to a cluster in cluster analysis, and objects of the same class are assigned to the same cluster. Using the principle of clustering to reduce a data set can thus improve the validity of the data and the classification effect of the data set. Distance calculation is key to cluster analysis, and the following section discusses how the distance is calculated.
The Intra-class combination distance
The data set is divided according to the decision attribute, and samples with the same decision value are grouped into one class. The number of combinations is formed by pairwise comparison of all objects within a class; the larger the value, the higher the sample similarity. In cluster analysis, samples with greater similarity are more easily assigned to the same class. The intra-class combination distance is defined as follows.
The value of $Sr_R(x, y)$ denotes how many attributes have the same value on the objects x and y with respect to the attribute set R, and $Dr_R(x, y) = 1 - Sr_R(x, y)$.
Where
Assume $a \in R$ and $|a| = 1$, which means $Sr_a(x, y) =$
Normally, assume x and y are two different random objects in D; if $f_a(x) = f_a(y)$ then $|Sr_a(x, y)| = 1$. To look for all pairs of samples in $D_q$ that satisfy the condition $f_a(x) = f_a(y)$, the number of pairwise comparisons of all elements in the indistinguishable relation $[x]_a$ is calculated through
So, it is easy to conclude that formula (10) is equivalent to formula (8), from which we conclude that Definition 7 is equivalent to Definition 5.
Although Definition 5 is equivalent to Definition 7, the time complexity of computing Definition 5 is $O(|D_q|^2 |R|)$, while that of Definition 7 is $O(|D_q||R|)$.
Since $O(|D_q||R|) \prec O(|D_q|^2 |R|)$, Definition 7 is an improvement over Definition 5.
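The source of this speed-up can be illustrated with a minimal sketch that counts, for a single attribute, the pairs of samples sharing the same value. It shows only the raw pair counts; the normalization used in Definitions 5 and 7 is not reproduced here, and the function names and sample column are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb


def same_value_pairs_bruteforce(values):
    """Count unordered pairs with equal values by comparing every pair."""
    return sum(1 for u, v in combinations(values, 2) if u == v)


def same_value_pairs_grouped(values):
    """Count the same pairs from group sizes: one pass, then C(k, 2) per group."""
    return sum(comb(k, 2) for k in Counter(values).values())


# Hypothetical values of one attribute a over the samples of one class D_q.
column = [1, 1, 2, 1, 2, 3]
print(same_value_pairs_bruteforce(column), same_value_pairs_grouped(column))
```

Both functions return the same count, but the first compares all pairs of samples while the second only needs one pass to group the samples, which mirrors the quadratic versus linear dependence on $|D_q|$.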
Definition 7 measures the degree of similarity within the class $D_q$, and Definition 8 measures the distinguishability within it. Since $IntraS(R, D_q)$ represents the degree of similarity of $D_q$, the larger the value, the higher the degree of similarity and the closer the samples within the class $D_q$. Conversely, $1 - IntraS(R, D_q)$ represents the degree of distinguishability of the class $D_q$: the larger the value, the higher the degree of dissimilarity and the looser the samples within the class $D_q$.
A decision table
If calculate resemblance rate If calculate distinctive rate
The inter-class distance measures the distance between samples of different classes. We put the samples of two different classes into one data set and then divide them according to the conditional attributes, so that samples with the same values on the conditional attributes are collected in the same sub-division. The number of combinations is formed by comparing the samples in the same sub-division pair by pair. The smaller the value, the lower the degree of similarity.
and
Since $|X_i| = |X_{i1}| + |X_{i2}|$ and $|Y_i| = |Y_{i1}| + |Y_{i2}|$, we have
Formula (13) indicates the degree of inter-class similarity, and formula (14) represents the degree of distinguishability. The bigger is the value of
If calculate resemblance rate
If calculate distinctive rate
Among the various clustering methods, some only consider the closeness between samples in the same cluster [45], and some only pay attention to the looseness between samples of different clusters [44]. This article considers both the inter-class distance and the intra-class distance to make the clustering more effective. The distance between samples can be measured by similarity or by dissimilarity, and the samples compared can be intra-class or inter-class samples. Combining the two measures with the two kinds of comparison gives the following four distance calculation methods:
Case 1: Intra-class resemblance rate and inter-class similarity rate. One desirable clustering effect requires a high intra-class resemblance rate and a low inter-class similarity rate. That is, the resemblance distance of samples within a class should be large, and the similarity distance of samples between classes should be small. Generally speaking, the similarity of intra-class samples is larger than that of inter-class samples. For the same $\lambda_1$, the larger IntraRr (R, TD) and the smaller InterRr (R, TD), the better the clustering effect. So, we use the formula $SIM_R^1$ = IntraDis (R, TD) - $\lambda_1$ * InterDis (R, TD) to measure the importance of the conditional attribute set R. The larger the value of $SIM_R^1$, the more important R is. This formula is marked as $SIM_R^1$ = IntraR _ InterR.
Case 2: Intra-class dissimilarity rate and inter-class dissimilarity rate. Another desirable clustering effect requires weak intra-class dissimilarity and strong inter-class dissimilarity. That is, the dissimilarity distance of intra-class samples should be small and that of inter-class samples should be large. Generally speaking, the dissimilarity of inter-class samples is larger than that of intra-class samples. For the same $\lambda_2$, the larger InterDr (R, TD) and the smaller IntraDr (R, TD), the better the clustering effect. The formula $SIM_R^2$ = - IntraDis (R, TD) + $\lambda_2$ * InterDis (R, TD) is used to measure the importance of R, and the larger $SIM_R^2$ is, the more important R is. As in the other cases, this formula is marked as $SIM_R^2$ = IntraD _ InterD.
Case 3: Intra-class similarity rate and inter-class dissimilarity rate. The third ideal clustering effect requires strong intra-class similarity and strong inter-class dissimilarity. The larger IntraRr (R, TD) and InterDr (R, TD) become, the better the clustering effect. We use the formula $SIM_R^3$ = IntraDis (R, TD) + $\lambda_3$ * InterDis (R, TD) to measure the importance of R; the larger $SIM_R^3$ is, the more important R is. This formula is marked as $SIM_R^3$ = IntraR _ InterD.
Case 4: Intra-class dissimilarity rate and inter-class similarity rate. The last desirable clustering effect requires weak intra-class dissimilarity and weak inter-class similarity. The smaller IntraDr (R, TD) and InterRr (R, TD) become, the better the clustering effect. So the formula $SIM_R^4$ = IntraDis (R, TD) + $\lambda_4$ * InterDis (R, TD) is used to measure the importance of R. The smaller $SIM_R^4$ is, the more important R is. This formula is marked as $SIM_R^4$ = IntraD _ InterR.
For the above four cases, when calculating IntraDis, we select IntraDr or IntraRr accordingly to obtain the intra-class sample distance. Similarly, when calculating InterDis, InterDr or InterRr is selected to obtain the inter-class sample distance. Let $\lambda_i \in [0, +\infty)$; $\lambda_i$ can be adjusted according to the relative importance placed on the inter-class or the intra-class distance.
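The four combinations can be written as one small scoring routine. This is a sketch under stated assumptions: the intra-class and inter-class values are taken as already computed rates, Case 2 uses the form with $\lambda_2$ on the inter-class term, and the names `sim_score` and `better` are hypothetical.

```python
def sim_score(case, intra, inter, lam):
    """Combine intra- and inter-class distances for one of the four cases.

    `intra` and `inter` must already be the rate that the case calls for
    (resemblance or dissimilarity); `lam` is the weight lambda_i >= 0.
    """
    if lam < 0:
        raise ValueError("lambda must be non-negative")
    if case == 1:  # IntraR_InterR: larger is better
        return intra - lam * inter
    if case == 2:  # IntraD_InterD: larger is better
        return -intra + lam * inter
    if case == 3:  # IntraR_InterD: larger is better
        return intra + lam * inter
    if case == 4:  # IntraD_InterR: smaller is better
        return intra + lam * inter
    raise ValueError("case must be 1, 2, 3 or 4")


def better(case, candidate, current):
    """True if `candidate` is a strictly better score than `current`."""
    return candidate < current if case == 4 else candidate > current


# Hypothetical rates for one attribute subset.
print(sim_score(1, 0.8, 0.2, 0.5), sim_score(4, 0.2, 0.2, 0.5))
```

Cases 3 and 4 share the same expression and differ only in which rates are plugged in and in the direction of optimization, which is why the comparison is kept in a separate helper.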
and
In Definition 13, the first condition guarantees that the reduction set R attains the best importance value and therefore the most ideal clustering effect. The second condition guarantees that R is a minimal reduction set.
By using the combination distance to measure the degree of resemblance or dissimilarity of intra-class samples and inter-class samples, Definition 13 provides an ideal method for finding the optimal attribute reduction set. Although the inter-class sample distance is monotonic, the intra-class sample distance is not. Therefore, to obtain the optimal reduction, every conditional attribute must be examined.
When classifying and analyzing data, effective removal of redundant attributes helps improve the accuracy of data classification. In this paper, we therefore adopt a deletion strategy to remove meaningless attributes. First, calculate all divisions of U/D. Second, choose the method for calculating the importance of attributes, and compute the amount of information of each attribute according to SIM. If we choose case 1, case 2 or case 3 in section 3.3, the larger the value of SIM, the better the clustering effect. Assume r ∈ R and delete r from R: the larger the value of $SIM_{R-r} - SIM_R$, the higher the classification accuracy of the data set after deleting r, and hence the greater the interference caused by r. Let $\max(SIM_{R-r}) = SIM_{R-a}$; we delete attribute a, that is, R = R - a. Third, if $SIM_{R-a} \succ SIM_R$, repeat the previous steps; otherwise, the algorithm terminates. If we choose case 4, the smaller the value of SIM, the better the clustering effect, so each time we choose the smallest value $\min(SIM_{R-r})$. Assume $SIM_{R-a}$ is the smallest among the $SIM_{R-r}$; attribute a is deleted, that is, R = R - a, and the loop continues as long as $SIM_{R-a} \prec SIM_R$.
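The deletion strategy described above can be sketched as a greedy backward elimination loop. This is a minimal illustration rather than the paper's Algorithm 1: the scoring function is passed in as a hypothetical callable standing in for SIM, and stopping before the attribute set becomes empty is an added assumption.

```python
def backward_reduction(attributes, score, smaller_is_better=False):
    """Greedy deletion: repeatedly drop the attribute whose removal gives the
    best score, and stop when no removal strictly improves the current score.

    `score` is a caller-supplied stand-in for SIM; set `smaller_is_better`
    for case 4, where lower values are preferred.
    """
    def improves(candidate, current):
        return candidate < current if smaller_is_better else candidate > current

    pick = min if smaller_is_better else max
    reduct = list(attributes)
    current = score(reduct)
    # Assumption: keep at least one attribute in the reduction set.
    while len(reduct) > 1:
        # Score every subset obtained by deleting a single attribute.
        trials = [(score([b for b in reduct if b != a]), a) for a in reduct]
        best_score, best_attr = pick(trials, key=lambda t: t[0])
        if not improves(best_score, current):
            break
        reduct.remove(best_attr)
        current = best_score
    return reduct


# Hypothetical additive score: positive weights help, negative weights hurt.
WEIGHTS = {"a": 4, "b": 2, "c": -3, "e": -1}


def toy_score(subset):
    return sum(WEIGHTS[name] for name in subset)


print(backward_reduction(["a", "b", "c", "e"], toy_score))
```

With the toy score, the attributes with negative weight are removed one at a time and the loop stops once any further deletion would lower the score; passing `smaller_is_better=True` with a negated score follows the same path, as case 4 requires.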
For any attribute a ∈ C, its inner amount of information is defined as follows:
In cases 1, 2 and 3, the larger the value of Information (a, C) is, the less significant the attribute a is. In case 4, the smaller the value of Information (a, C) is, the more significant the attribute a is. The algorithm is described as Algorithm 1 (RRCB for short).
In RRCB, according to Definition 9, the time complexity of computing IntraDis (R, TD) is $O(|U||C|)$.
According to Definition 12, the time complexity of computing InterDis (R, TD) is $O(2|U||C|)$. So, the time complexity of Step 2 is $O(|U||C|) + O(2|U||C|)$.
In Step 4, we calculate the importance of each conditional attribute according to formula 18; the complexity of computing $SIM_{R-a}$ is $O(|U||R|(|R| - 1))$. If the reduction set is R, Step 4 loops $|C| - |R|$ times. Therefore, the time complexity of Step 4 is $O(|U|(|C|(|C| + 1)(2|C| + 1) - |R|(|R| + 1)(2|R| + 1))/6) \approx O(|U|(|C|^3 - |R|^3)/3)$. In summary, the time complexity of RRCB is $O(|U|(|C|^3 - |R|^3)/3) + O(|U||C|) + O(2|U||C|) \approx O(|U|(|C|^3 - |R|^3)/3)$.
To compare the effectiveness of the reduction algorithm RRCB with that of four other algorithms, we selected 21 data sets from the UCI [44] website. The basic information of the data sets is shown in Table 2, where |U| is the number of samples, |C| is the number of sample categories.
UCI Machine Learning data sets
In the experiment, each data set is evaluated with 10-fold cross-validation, and the accuracy is recorded in the form of “mean ± standard deviation”. The four other reduction algorithms are named PRA, DMA, EA and KGA [48].
PRA represents the positive region algorithm;
DMA represents the discernibility matrix algorithm;
EA represents the general entropy-based feature selection algorithm and KGA represents the knowledge granularity algorithm. RRCB is the reduction algorithm proposed in this paper, based on the resemblance rate of combination. The setting of the parameter λ varies with the method selected in the RRCB algorithm; the parameter setting is analyzed in more detail in a later part. Although there are four different methods in RRCB, their reduction results are highly similar, so we use inter-class resemblance and intra-class dissimilarity as the representative method to describe the reduction process. To evaluate the quality of data classification after reduction, four common classifiers in WEKA are used: NaiveBayes (NB), SMO, J48 and RandomForest (RF). All continuous data are discretized and filtered with the WEKA tool using unsupervised methods, and the default parameter settings of the tool are used otherwise.
Each algorithm is used to reduce the data sets in Table 2, and the reduction results are shown in Table 3. The accuracy comparison is shown in Tables 4–7. Data in bold indicate that the corresponding method achieved the best result, and data in italics with a shaded background indicate the worst. The Best and Worst rows in each table count how many times a method obtained the best and worst classification accuracy, running time and reduction set. Judged by the figures in Table 3, the reduction results of RRCB are not outstanding, with 7 Worst results and 10 Best ones. The numbers of best results achieved by DMA, PRA, EA and KGA are 11, 5, 10 and 11 respectively, and the corresponding numbers of worst results are 6, 11, 5 and 6. DMA and KGA thus obtain the best reduction results with the smallest reduction sets. RRCB does not perform as well in terms of reduction size, but achieves competitive results in computing time and classification accuracy.
Number of selected attributes with different attribute reduction algorithms
Comparison of the classification accuracies on reduced data sets with NB(%)
Comparison of the classification accuracies on reduced data sets with SMO(%)
Comparison of the classification accuracies on reduced data sets with J48(%)
Comparison of the classification accuracies on reduced data sets with RF(%)
Among the 21 data sets, the RRCB algorithm achieved good time performance on 19. On the data sets Handwritten, Mushroom and Satimage, the calculation time required by the RRCB algorithm is (89.869, 4.377, 23.389) seconds, whereas the DMA algorithm, which has poor time performance, needs (10956.61, 1202.723, 2582.408) seconds respectively. RRCB has an especially clear time advantage over the other four algorithms when reducing data sets with many samples and few attributes. For example, on the data sets Letters, Krkopt and Shuttle, the RRCB algorithm takes (3.812, 6.122, 8.385) seconds, while the DMA algorithm takes (4831.528, 2522.502, 3862.466) seconds respectively. The main reason is that the RRCB algorithm reuses the previous division results when calculating the similarity distance, while the DMA algorithm must build a discernibility matrix, which consumes a large amount of memory and slows down the calculation.
The RRCB algorithm also performs very well in the classification accuracy of the reduced set. We use the four classifiers NB, SMO, J48 and RF in the WEKA tool to classify and analyze the reduction sets obtained by the five reduction algorithms. On the NB and J48 classifiers, 19 of the reduced sets of the RRCB algorithm obtained the highest accuracy, and on the SMO and RF classifiers, 20 best results were obtained; in some cases the accuracy was more than 30 percentage points higher than that of another algorithm. For example, on the NB classifier, the Promoter data set reduced by the RRCB algorithm obtained an accuracy of 93.2380, while the classification accuracies of the sets reduced by the other four algorithms were (65.3962, 57.5472, 61.5094, 67.3962) respectively. Similarly, for the classifier SMO, RRCB achieves the best accuracy of 90.3396, which is far better than the results (65.8491, 56.6038, 69.6260, 65.8491) obtained by DMA, PRA, EA and KGA. After reduction on the data set Heart-statlog, RRCB obtains an accuracy of 81.4858 with the classifier J48, while the corresponding results of the other four algorithms are (57.5185, 58.1585, 75.1852, 59.0124). Tables 4–7 show that our method yields better classification results and higher data quality than the other methods.
In this section, we analyze the correlation between the original data set and the reduced data set. The standard Euclidean distance is used to measure the similarity between samples, and the commonly used VDM [31] method is used to calculate the distance for non-numeric attributes:
Where |U/D| is the number of classes of the sample set on the decision attribute, $m_a(x)$ represents the number of samples with value x on attribute a, and $m_{a,i}(x)$ represents the number of samples with value x on attribute a that belong to the i-th class on the decision attribute. Then the Euclidean distance is used to calculate the distance between the objects x and y, with the following formula:
|C| represents the number of conditional attributes, and $C_i$ represents the i-th conditional attribute. The formula Sig is used to standardize the calculation results; Sig is as follows:
Because the reduction only removes redundant conditional attributes without reducing the number of samples, the number of samples in the data set remains unchanged before and after the reduction. To analyze the correlation between the original data and the reduced data, we use the Pearson coefficient; the two are considered strongly correlated if the result exceeds 0.8. We use the Jarque-Bera (JB) test to assess the significance of the correlation results. The closer the result is to 0, the closer the data are to a normal distribution. Generally, the threshold is set to 0.05: if the significance result is less than 0.05, the correlation result is regarded as significant. For a data set with n samples, pairwise combinations form a total of
Where $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y respectively, and cov (X, Y) is the covariance of X and Y. The JB test relies on the skewness Skew(X) and the kurtosis Kurt(X) to test for normality.
Where E (X) is the mean. To verify the consistency of the data sets before and after reduction, we conducted correlation and significance analysis on the original data and the reduced data. The results are shown in Table 8. For convenience, the data set reduced by the RRCB algorithm is referred to as RRCB-data, and the original data set is referred to as Raw-data. For all 21 data sets in Table 2, the correlation between RRCB-data and Raw-data is greater than 0.8, and for 13 of them it is greater than 0.9, indicating that RRCB-data and Raw-data maintain a high degree of similarity. To evaluate the reliability of this correlation, we carried out a significance test on the correlation between the 21 RRCB-data and Raw-data pairs, and the results are shown in the Revealing column of Table 8. All the significance results are less than the common threshold of 0.05, and those of 18 data sets are less than 0.005, indicating that the correlation analysis is reliable. Therefore, it can be concluded that the reduced set obtained by the RRCB algorithm is strongly correlated with the original data set and retains most of its original information.
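The two statistics used in this analysis can be sketched in plain Python. The sketch uses the textbook definitions of the Pearson coefficient and of the Jarque-Bera statistic with population moments, which may differ in detail from the formulas used in the paper; the input vectors are hypothetical pairwise distances, not values from Table 8.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in ys) / n)
    return cov / (sx * sy)


def jarque_bera(xs):
    """JB statistic n/6 * (S^2 + (K - 3)^2 / 4) from skewness S and kurtosis K."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((a - mean) ** 2 for a in xs) / n
    m3 = sum((a - mean) ** 3 for a in xs) / n
    m4 = sum((a - mean) ** 4 for a in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


# Hypothetical pairwise sample distances before and after reduction.
raw = [0.10, 0.40, 0.35, 0.80, 0.55]
reduced = [0.12, 0.38, 0.30, 0.85, 0.50]
print(pearson(raw, reduced), jarque_bera(raw))
```

A correlation above 0.8 between the two distance vectors would count as strong under the threshold stated earlier; the JB statistic is returned as a raw value, and converting it to a significance level would additionally require the chi-square distribution with two degrees of freedom.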
Correlation and Significance analysis between the original and the reduced data
In cluster analysis, the best clustering effect usually requires that the distance between samples within a class be small and the distance between samples of different clusters be large. In section 3.1, we propose four combined distance calculation methods. Different methods use different parameters $\lambda_i$ for calculating the distance between samples. The proportion of intra-class distance to inter-class distance is adjusted by changing the parameter $\lambda_i$ to achieve the best clustering effect. The reduction results obtained by RRCB are processed by the four classifiers to obtain the classification accuracy, and the mean of these accuracies is used as the criterion for evaluating data validity. Since $\lambda_i \in [0, +\infty)$, the weight of the intra-class distance increases as $\lambda_i$ approaches 0, and the weight of the inter-class distance increases as $\lambda_i$ becomes larger.
It can be seen from Fig. 1 that as $\lambda_i$ increases, the accuracy of the reduced data set rises in all four cases, but after $\lambda_i$ reaches a certain level the accuracy decreases. This shows that the classification accuracy of the reduced data set is higher when both the intra-class and the inter-class distance are taken into consideration by the reduction method in the context of clustering. When $\lambda_i < 1$, the classification accuracy of the reduced data set increases with $\lambda_i$. When $\lambda_i > 1$, the weight of the inter-class distance becomes larger and larger, and the classification accuracy levels off with little change, until $\lambda_i$ reaches a certain value at which the accuracy drops abruptly. At that point the intra-class distance has little influence compared with the inter-class distance, resulting in lower and lower accuracy.

Influence of $\lambda_i$ on the average classification accuracy of all classifiers on all data sets.
Attribute reduction is a popular technique in data preprocessing. Although the classic rough sets attribute reduction models use various methods, most of them define the importance of attributes from the perspective of equivalence class division or the indistinguishable relationship, which leads to inconsistent division of data sets after reduction and affects their classification accuracy. To improve the classification accuracy of the reduced data set, we propose four methods for calculating the distance between samples, based on similarity and dissimilarity and on intra-class and inter-class comparison, then define the importance of attributes according to each distance calculation method and design the corresponding reduction methods. Since this approach fully considers the relationship between the equivalence class division of the conditional attributes and the classification of the decision attribute, the classification accuracy of the reduced data set is greatly improved. To verify the effectiveness of the RRCB algorithm, the correlation of the data sets before and after reduction is also compared, supplemented by a significance analysis. The disadvantage of the algorithm is that it does not consider dynamic changes of the data set. When the samples of the data set change, the previous reduction results may no longer be applicable, and recalculating the reduction set costs a lot of time. In addition, the measurement of similarity and dissimilarity between samples needs to be further improved. Future work will therefore consider introducing incremental learning methods into rough sets attribute reduction while maintaining the classification accuracy, so that the algorithm can adapt to dynamic data sets.
Acknowledgments
This work was partially supported by the Natural Science Foundation of China (61836016), the key disciplines of Computer Science technology of Chaohu University (kj22zdxk01), the Provincial Natural Science Research Program of Higher Education Institutions of Anhui Province (KJ2021A1030), and the Key Subject Sub-projects of Chaohu University (ZDXK-201815).
Conflicts of interest
The authors declare that they have no conflicts of interest to report regarding the present study.
