Improved FCM algorithm based on initial center optimization method

Abstract

Fuzzy c-means (FCM) is one of the most popular partitional clustering algorithms. However, it is sensitive to the initial centers and to noise. Density-based clustering overcomes this shortcoming, but it cannot obtain good clustering results when the density of the data space is unevenly distributed. Grid-based methods save computational time, but their clustering performance is unsatisfactory. Based on this analysis, an improved FCM algorithm based on an initial center optimization method is proposed. First, an initial center optimization method based on density and grid is presented to avoid the sensitivity of FCM to initial centers. Then, the improved FCM algorithm based on this method is described. Finally, the performance and effectiveness of the proposed clustering algorithm are evaluated on four San Francisco taxi GPS cab mobility trace data sets, and the experimental results show that the proposed algorithm yields better clustering results than standard FCM.

Keywords

Fuzzy clustering, fuzzy c-means, density-based clustering, grid-based method

1 Introduction

In a traffic system, cab mobility traces can be analyzed to support the siting of gas stations and to optimize traffic operations. The traces are analyzed with data mining techniques, and the resulting information is used to choose gas station locations that minimize the time and cost of cab travel. Because a cab mobility trace is a continuous trajectory, density-based clustering methods [1–8] assign the whole trace to a single cluster and therefore cannot find reasonable cluster centers (the candidate gas station positions). Fuzzy c-means (FCM) [9–16] is based on an objective function and has the advantages of low time complexity and easy implementation. FCM is therefore widely applied, especially in traffic systems. However, the FCM clustering method has some shortcomings:

The user must specify the number of clusters c in advance.

The clustering algorithm is sensitive to the initial cluster centers. Fig. 1 shows the different clustering results that FCM produces with different initial centers.

The clustering algorithm is sensitive to noise. Fig. 2 compares the clustering results of FCM with the real results.

Fig.1

The clustering results of different initial centers by using FCM.

Fig.2

The clustering results with a noise.

To overcome these shortcomings, an improved FCM algorithm based on an initial center optimization method is proposed. First, the data space is divided into cells according to the location of the data points by using a grid-based method. Then, the data points are assigned weights based on their distances to the cell center. Third, the cells are merged based on their density and location until the number of cells reaches the specified number of clusters. The initial centers of FCM are obtained by calculating the centers of the final cells. Using these centers as the initial centers overcomes the sensitivity of FCM to the initial values. The performance and effectiveness of the proposed clustering algorithm are evaluated on four San Francisco taxi GPS cab mobility trace data sets, and the experimental results show that the proposed algorithm yields better clustering results than FCM.

The rest of this paper is organized as follows. Section 2 introduces fuzzy c-means clustering. Section 3 proposes the initial center optimization method based on density and grid. Section 4 presents the improved FCM algorithm based on the initial center optimization method, and Section 5 reports the experimental results. Finally, Section 6 concludes this work.

2 Fuzzy c-means

Fuzzy c-means (FCM) classifies the dataset X = {x₁, …, x_N} into c data clusters as fuzzy sets (F_i). The objective of FCM is to accurately represent pattern matrix U = [μ_ij] and cluster centers V = [v_i] via a minimizer (U, V) of the evaluation function J_m.

$J_{m} (U, V) = \sum_{j = 1}^{N} \sum_{i = 1}^{c} μ_{ij}^{m} ∥ x_{j} - v_{i} ∥^{2}$ (1)

The algorithm iterates to minimize the evaluation function J_m, updating the pattern matrix U and the cluster centers V at each step. The update formulas for U and V are derived with Lagrange multipliers:

$v_{i} = \frac{\sum_{j = 1}^{N} μ_{ij}^{m} x_{j}}{\sum_{j = 1}^{N} μ_{ij}^{m}}, 1 \leq i \leq c$ (2)

$μ_{ij} = {[\sum_{k = 1}^{c} {(\frac{∥ x_{j} - v_{i} ∥^{2}}{∥ x_{j} - v_{k} ∥^{2}})}^{\frac{1}{(m - 1)}}]}^{- 1}$ (3)
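The Lagrange-multiplier step can be sketched as follows. This is a standard derivation rather than text from the paper; it assumes m > 1 and the constraint that the memberships of each data point sum to 1, and the symbols L, λ_j and d_ij are notation introduced here for illustration.

$L = \sum_{j = 1}^{N} \sum_{i = 1}^{c} μ_{ij}^{m} d_{ij}^{2} - \sum_{j = 1}^{N} λ_{j} (\sum_{i = 1}^{c} μ_{ij} - 1), \quad d_{ij} = ∥ x_{j} - v_{i} ∥$

$\frac{\partial L}{\partial μ_{ij}} = m μ_{ij}^{m - 1} d_{ij}^{2} - λ_{j} = 0 \Rightarrow μ_{ij} = (\frac{λ_{j}}{m d_{ij}^{2}})^{\frac{1}{m - 1}}$

$\sum_{k = 1}^{c} μ_{kj} = 1 \Rightarrow (\frac{λ_{j}}{m})^{\frac{1}{m - 1}} = [\sum_{k = 1}^{c} d_{kj}^{- \frac{2}{m - 1}}]^{- 1}$

$μ_{ij} = \frac{d_{ij}^{- \frac{2}{m - 1}}}{\sum_{k = 1}^{c} d_{kj}^{- \frac{2}{m - 1}}} = [\sum_{k = 1}^{c} (\frac{d_{ij}^{2}}{d_{kj}^{2}})^{\frac{1}{m - 1}}]^{- 1}$

$\frac{\partial L}{\partial v_{i}} = - 2 \sum_{j = 1}^{N} μ_{ij}^{m} (x_{j} - v_{i}) = 0 \Rightarrow v_{i} = \frac{\sum_{j = 1}^{N} μ_{ij}^{m} x_{j}}{\sum_{j = 1}^{N} μ_{ij}^{m}}$

The last two lines reproduce the membership and center update formulas of this section.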

An abstract procedure of this process is listed in Algorithm 1.

Algorithm 1 Fuzzy c-means

Input:

Dataset X = {x₁, x₂, …, x_N}, the number of clusters c, a certain threshold ɛ, fuzzy factor m

Output:

A pattern matrix U and cluster centers V

1: Initialize fuzzy partition matrix U_i = [μ_ij] for i = 1, 2, …, c such that Σμ_ij = 1;

2: objective function objfcn = realmax;

3: error = realmax;

4: while (error > ɛ) do

5: Calculate the cluster centers v_i by

6: $v_{i} = \frac{\sum_{j = 1}^{N} μ_{ij}^{m} x_{j}}{\sum_{j = 1}^{N} μ_{ij}^{m}}$

7: Update the fuzzy membership μ_ij by

8: $μ_{ij} = [\sum_{k = 1}^{c} (\frac{∥ x_{j} - v_{i} ∥^{2}}{∥ x_{j} - v_{k} ∥^{2}})^{\frac{1}{(m - 1)}}]^{- 1}$

9: Calculate new value of objective function newobjfcn by

10: $newobjfcn = \sum_{j = 1}^{N} \sum_{i = 1}^{c} μ_{ij}^{m} ∥ x_{j} - v_{i} ∥^{2}$

11: error = abs (newobjfcn - objfcn);

12: objfcn = newobjfcn;

13: end
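Algorithm 1 can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fcm`, the defaults for `m` and `eps`, the `max_iter` safety cap, the random seed, and the small floor on squared distances are all choices made here. Memberships are stored as `u[j][i]`, i.e. one row per data point.

```python
import math
import random


def fcm(points, c, m=2.0, eps=1e-6, max_iter=300, seed=0):
    """Fuzzy c-means after Algorithm 1. Returns (u, centers), u[j][i] = membership
    of point j in cluster i."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Step 1: random memberships, normalised so each point's row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])

    objfcn = float("inf")  # plays the role of realmax
    centers = []
    for _ in range(max_iter):  # max_iter is a safety cap, not part of Algorithm 1
        # Eq. (2): each centre is the membership-weighted mean of the data.
        centers = []
        for i in range(c):
            w = [u[j][i] ** m for j in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[j] * points[j][d] for j in range(n)) / total
                for d in range(dim)))

        # Squared distances, floored to avoid division by zero (assumption).
        d2 = [[max(math.dist(x, v) ** 2, 1e-12) for v in centers] for x in points]

        # Eq. (3), written as normalised inverse distances.
        for j in range(n):
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2[j]]
            total = sum(inv)
            u[j] = [value / total for value in inv]

        # Eq. (1) and the stopping test of Algorithm 1.
        newobjfcn = sum(u[j][i] ** m * d2[j][i] for j in range(n) for i in range(c))
        error = abs(newobjfcn - objfcn)
        objfcn = newobjfcn
        if error <= eps:
            break
    return u, centers
```

Calling `fcm(points, c)` with `points` given as a list of coordinate tuples returns the membership rows and the `c` centers; different values of `seed` give different starting memberships, which is the source of the sensitivity discussed in Section 1.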

3 The initial center optimization method based on density clustering and grid clustering

Density-based clustering places all density-connected data points, together with the other points within the neighborhood range, into one cluster. Because a cab mobility trace is a continuous trajectory, density-based clustering cannot be applied directly to cab mobility traces: it assigns the whole trace to a single cluster and therefore cannot find reasonable cluster centers (the candidate gas station positions).

Grid-based clustering [17–24] partitions the data space into a finite number of cells to create a grid structure. The cluster centers are then obtained according to the density of the cells. Because the processing unit is no longer a single data point but a cell of data points, the computational complexity is significantly reduced. However, a known shortcoming of these methods is that the quality and accuracy of the clustering are low.

Based on the above analysis, this paper presents an initial center optimization method that combines the density-based method with the grid-based method. The method overcomes both the limitations of the density-based method and the low accuracy of the grid-based method. First, the data space is divided into cells according to the location of the data points by using a grid-based method. Then, the data points are assigned weights based on their distances to the cell center. Third, the cells are merged based on their density and location until the number of cells reaches the specified number of clusters. The initial centers of FCM are obtained by calculating the centers of the final cells. Using these centers as the initial centers overcomes the sensitivity of FCM to the initial values. Some basic definitions of the method are introduced next.

Definition 1. (Cell) The database D = {x₁, x₂, …, x_n} is divided into m cells cell_i (i = 1, 2, …, m), which satisfy the following requirements:

cell_i ∩ cell_j = Φ, 1 ≤ i ≠ j ≤ m;

cell₁ ∪ cell₂ ∪ … ∪ cell_m = D.

Definition 2. (Center of the Cell) The center of a cell, denoted cellcenter, is the data point in the cell that minimizes the sum of distances to the other points in the cell:

$cellcenter = \arg\min_{x_{i} \in cell} \sum_{j = 1}^{n} ∥ x_{i} - x_{j} ∥$ (4)

where x_i, x_j ∈ cell, n is the number of data points in the cell, and ∥·∥ is the Euclidean norm.
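Definition 2 selects a data point of the cell (a medoid) rather than the arithmetic mean. A minimal sketch, reading Eq. (4) as picking the minimizing point as the wording of the definition indicates; the function name `cell_center` is introduced here for illustration:

```python
import math


def cell_center(cell):
    """Return the point of the cell with the smallest summed Euclidean
    distance to all points of the cell (Definition 2)."""
    return min(cell, key=lambda p: sum(math.dist(p, q) for q in cell))


# One far-away point does not drag the centre away from the bulk of the cell:
cell = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
center = cell_center(cell)  # (2, 0), whereas the mean would be (3.2, 0)
```

The quadratic cost per cell is acceptable here because each cell holds only a small share of the data.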

Definition 3. (Weight of Data in Cell) Each data point x_i ∈ cell_j is assigned a weight that decreases with its distance to the cell center:

$W (x_{i}) = \frac{1}{∥ x_{i} - {cellcenter}_{j} ∥}$ (5)

where cellcenter_j is the center of cell_j.

Definition 4. (Density of Cell) The density of a cell, denoted D(cell), is the ratio of the sum of the weights of its data points to the volume of the cell:

$D ({cell}_{j}) = \frac{\sum_{x_{i} \in {cell}_{j}} W (x_{i})}{Vol ({cell}_{j})}$ (6)
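Definitions 3 and 4 can be sketched together as below. Two points are assumptions made here and not stated in the text: a point that coincides with the cell center has zero distance, so it is skipped rather than given an infinite weight, and the volume of a two-dimensional cell is taken to be its area (width times height), passed in by the caller. The function name `cell_density` is illustrative.

```python
import math


def cell_density(cell, center, volume):
    """Sum of the weights 1 / distance-to-centre (Eq. 5) divided by the cell
    volume (Eq. 6)."""
    total_weight = 0.0
    for x in cell:
        d = math.dist(x, center)
        if d > 0.0:  # assumption: skip points that coincide with the centre
            total_weight += 1.0 / d
    return total_weight / volume


# Centre at (0, 0); the other two points lie at distances 1 and 2.
density = cell_density([(0, 0), (1, 0), (0, 2)], (0, 0), volume=4.0)
# (1/1 + 1/2) / 4 = 0.375
```

Because of the inverse-distance weights, a cell whose points are packed tightly around its center gets a higher density than a cell with the same number of points spread toward its borders.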

Definition 5. (Similarity Degree between Cells) The similarity degree between cell_i and cell_j, denoted S (cell_i, cell_j), is described by the density and location of the cells and is defined as:

$S ({cell}_{i}, {cell}_{j}) = SD ({cell}_{i}, {cell}_{j}) \times (1 - SL ({cell}_{i}, {cell}_{j}))$ (7)

where SD (cell_i, cell_j), the similarity degree of the density of the two cells, is defined as:

$SD ({cell}_{i}, {cell}_{j}) = \frac{D ({cell}^{*})}{\sqrt{D ({cell}_{i}) \times D ({cell}_{j})}}$ (8)

where cell^* is composed of the data points located between the two adjacent cells cell_i and cell_j. SL (cell_i, cell_j), the similarity degree of the location of the two cells, is defined as:

$SL ({cell}_{i}, {cell}_{j}) = \frac{∥ {cellcenter}_{i} - {cellcenter}_{j} ∥}{\max ∥ {cellcenter}_{i} - {cellcenter}_{j} ∥}$ (9)

Algorithm 2 Improved FCM algorithm based on initial center optimization method

Input:

Data set X = {x₁, x₂, …, x_n}, the number of clusters c

Output:

A pattern matrix U and cluster centers V

1: Apply grid-based method to partition the data space into m cells;

2: For each cell do

3: Calculate the center of the cell by Eq. (4);

4: Calculate the weight of data in the cell by Eq. (5);

5: Calculate the density of the cell by Eq. (6);

6: end

7: Mark the $\sqrt{n} - n^{3 / 8}$ cells with the lowest density as noise cells;

8: Calculate the similarity degree between every two adjacent cells by Eq. (7)–Eq. (9);

9: while m > c do

10: Merge the two cells which have the maximum similarity;

11: Calculate the similarity degree between the new cell and its adjacent cells by Eq. (7)–Eq. (9);

12: end

13: Calculate the centers of the final cells as c initial centers;

14: Apply the FCM algorithm to classify the given data set.

From the above definitions, the larger the value of SD (cell_i, cell_j), the more similar the densities of the two cells, and the nearer the value of SL (cell_i, cell_j) is to 0, the closer the two cells are to each other. Therefore, the larger the value of S (cell_i, cell_j), the more similar the two cells.
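A small numeric sketch of Eqs. (7)–(9) may help. The densities are passed in as plain numbers because the text does not spell out how the region cell^* between two cells is delimited, and the maximum in Eq. (9) is assumed here to be taken over all pairs of cell centers; the function name and parameter names are illustrative.

```python
import math


def similarity(d_i, d_j, d_between, center_i, center_j, max_center_dist):
    """Similarity degree of two adjacent cells.

    d_i, d_j:         densities of the two cells (Eq. 6)
    d_between:        density of the region cell* between them
    max_center_dist:  largest centre-to-centre distance (assumed: over all pairs)
    """
    sd = d_between / math.sqrt(d_i * d_j)                   # Eq. (8)
    sl = math.dist(center_i, center_j) / max_center_dist    # Eq. (9)
    return sd * (1.0 - sl)                                  # Eq. (7)


# Same densities, different centre distances (5 versus 1, maximum 10):
far = similarity(4.0, 1.0, 1.0, (0, 0), (3, 4), max_center_dist=10.0)   # 0.5 * 0.5
near = similarity(4.0, 1.0, 1.0, (0, 0), (0, 1), max_center_dist=10.0)  # 0.5 * 0.9
```

With SD held fixed, the pair whose centers are nearer receives the larger similarity, since a smaller SL makes the factor (1 - SL) larger.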

4 Improved FCM algorithm based on initial center optimization method

The main shortcoming of FCM is its sensitivity to the initial values, which makes the algorithm prone to falling into local optima. To overcome this shortcoming, the improved FCM algorithm based on the initial center optimization method is proposed. The abstract procedure of the algorithm is listed in Algorithm 2, and its detailed steps are described below.

Step 1: Apply grid-based method to partition the data space

Let $m = \sqrt{n}$ [25] be the number of cells, and use equidistant horizontal and vertical lines to divide each axis of the data space into $\sqrt{m}$ intervals. This creates the grid structure, i.e., it partitions the data space into m cells.
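Step 1 can be sketched as follows for two-dimensional points. The rounding of the number of intervals per axis, the handling of points on the upper boundary, and the omission of empty cells are assumptions made here; the function name `grid_partition` is illustrative.

```python
def grid_partition(points):
    """Split 2-D points into a k x k grid with k*k close to sqrt(n) cells.
    Returns ({(row, col): [points]}, k); empty cells are omitted."""
    n = len(points)
    k = max(1, round(n ** 0.25))  # sqrt(m) intervals per axis, m = sqrt(n); rounding is an assumption
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    width = (max(xs) - x0) / k or 1.0   # fall back to 1.0 if all x are equal
    height = (max(ys) - y0) / k or 1.0
    cells = {}
    for p in points:
        # Points on the upper boundary are placed in the last interval.
        col = min(int((p[0] - x0) / width), k - 1)
        row = min(int((p[1] - y0) / height), k - 1)
        cells.setdefault((row, col), []).append(p)
    return cells, k


# 16 lattice points give m = sqrt(16) = 4 cells, i.e. 2 intervals per axis.
lattice = [(float(i), float(j)) for i in range(4) for j in range(4)]
cells, k = grid_partition(lattice)
```

Each point is assigned to its cell in constant time, which is where the grid-based stage gets its low computational cost.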

Step 2: Merge the cells based on the similarity degree between cells

Step 2.1: Calculate the center of each cell by Equation (4).

Step 2.2: Calculate the weight of data in each cell by Equation (5).

Step 2.3: Calculate the density of each cell by Equation (6).

Step 2.4: Sort all cells in descending order of density, and mark the last $\sqrt{n} - n^{3 / 8}$ cells as noise cells. The number of remaining cells is then $m = n^{3 / 8}$.

Step 2.5: Calculate the similarity degree between adjacent cells by Equations (7)–(9).

Step 2.6: Merge the two cells which have the maximum similarity degree, then calculate the similarity degree between the new cell and its adjacent cells.

Step 2.7: If the number of cells is more than c, go to Step 2.6; otherwise go to Step 3.
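Steps 2.4 to 2.7 can be sketched as below. This is a simplified illustration, not the authors' code: cells are identified by arbitrary keys, adjacency is given as a set of two-element frozensets, the similarity function is supplied by the caller (for example one built from Eqs. (7)–(9)), and the rounding of $n^{3 / 8}$ is an assumption. For brevity the loop re-evaluates all adjacent pairs in every pass instead of updating only the pairs that involve the merged cell, as Step 2.6 does.

```python
def drop_noise_cells(density, n):
    """Step 2.4: keep the n^(3/8) densest cells; the rest are noise cells.
    density maps a cell key to its density. Rounding is an assumption."""
    keep = max(1, round(n ** 0.375))
    ranked = sorted(density, key=density.get, reverse=True)
    return set(ranked[:keep])


def merge_cells(cells, adjacent, similarity, c):
    """Steps 2.5-2.7: merge the most similar adjacent pair until c cells remain.

    cells:      dict key -> list of points
    adjacent:   set of frozenset({key_a, key_b})
    similarity: symmetric function (points_a, points_b) -> float, caller-supplied
    """
    cells = {key: list(pts) for key, pts in cells.items()}
    adjacent = {pair for pair in adjacent if all(key in cells for key in pair)}
    while len(cells) > c and adjacent:
        best = max(adjacent,
                   key=lambda pair: similarity(*(cells[key] for key in pair)))
        a, b = tuple(best)
        cells[a] = cells[a] + cells.pop(b)  # the merged cell keeps key a
        # Former neighbours of b become neighbours of the merged cell.
        adjacent = {frozenset(a if key == b else key for key in pair)
                    for pair in adjacent if pair != best}
    return cells
```

For n = 256 points, for instance, Step 1 gives 16 cells and `drop_noise_cells` keeps the 8 densest, since 256^(3/8) = 8. The loop also stops if no adjacent pairs are left, which can leave more than c cells when the remaining cells are not connected.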

Step 3: Perform the Fuzzy c-means algorithm

Step 3.1: Calculate the centers of the final cells as c initial centers of FCM.

Step 3.2: Apply the FCM algorithm to classify the given data set.
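Algorithm 1 starts from a membership matrix, whereas Step 3.1 produces initial centers. One way to connect the two, assumed here rather than stated in the text, is to compute the first memberships from the given centers with Eq. (3) and then continue with the usual FCM iteration. The function name and the small floor on squared distances are illustrative choices.

```python
import math


def memberships_from_centers(points, centers, m=2.0):
    """Eq. (3): one membership row per point, computed from given centres."""
    u = []
    for x in points:
        # Squared distances, floored so a point lying on a centre is handled.
        d2 = [max(math.dist(x, v) ** 2, 1e-12) for v in centers]
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        u.append([value / total for value in inv])
    return u


# A point at distance 1 from the first centre and 3 from the second:
u0 = memberships_from_centers([(1.0, 0.0)], [(0.0, 0.0), (4.0, 0.0)])
# squared distances 1 and 9 give memberships 0.9 and 0.1 for m = 2
```

Because the centers come from the merged cells rather than from a random draw, repeated runs start from the same point and return the same result.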

5 Experimental results

In this section, the performance and effectiveness of the proposed clustering algorithm are evaluated on real data sets. The data comprise four San Francisco taxi GPS cab mobility trace data sets, named Tra1, Tra2, Tra3, and Tra4, which contain 23495, 5454, 21962, and 22792 data points, respectively. The proposed algorithm is compared with FCM on these data sets.

For each of the four data sets, the FCM algorithm was run 100 times with the number of clusters c = 5, and the clustering results were averaged. The proposed algorithm was applied to the same data sets, and its clustering results were compared with those of FCM. Tables 1–4 compare the clustering results of FCM and the proposed algorithm for Tra1, Tra2, Tra3, and Tra4, respectively. Each table lists five sets of FCM clustering results, their average, and the centers found by the proposed algorithm. From Tables 1–4, we can conclude that the clustering centers found by the proposed algorithm are more stable than those found by FCM.

Table 1

The clustering centers of FCM and the proposed algorithm for Tra1 data set

Algorithm	Center 1	Center 2	Center 3	Center 4	Center 5
FCM	(37.62, -122.40)	(37.76, -122.39)	(37.75, -122.44)	(37.80, -122.39)	(37.81, -122.42)
FCM	(37.63, -122.39)	(37.75, -122.38)	(37.76, -122.43)	(37.81, -122.39)	(37.79, -122.41)
FCM	(37.63, -122.40)	(37.74, -122.39)	(37.78, -122.43)	(37.80, -122.40)	(37.80, -122.42)
FCM	(37.64, -122.41)	(37.76, -122.40)	(37.77, -122.44)	(37.79, -122.39)	(37.79, -122.42)
FCM	(37.65, -122.40)	(37.75, -122.38)	(37.78, -122.42)	(37.80, -122.38)	(37.78, -122.43)
Average	(37.63, -122.40)	(37.75, -122.39)	(37.77, -122.43)	(37.80, -122.39)	(37.79, -122.42)
The proposed algorithm	(37.63, -122.40)	(37.75, -122.39)	(37.76, -122.43)	(37.80, -122.39)	(37.79, -122.42)

Table 2

The clustering centers of FCM and the proposed algorithm for Tra2 data set

Algorithm	Center 1	Center 2	Center 3	Center 4	Center 5
FCM	(37.74, -122.42)	(37.75, -122.40)	(37.77, -122.43)	(37.77, -122.41)	(37.79, -122.43)
FCM	(37.75, -122.41)	(37.75, -122.39)	(37.76, -122.44)	(37.78, -122.42)	(37.78, -122.43)
FCM	(37.74, -122.40)	(37.74, -122.40)	(37.77, -122.44)	(37.77, -122.41)	(37.80, -122.42)
FCM	(37.75, -122.41)	(37.76, -122.39)	(37.78, -122.43)	(37.79, -122.42)	(37.78, -122.43)
FCM	(37.73, -122.42)	(37.76, -122.38)	(37.78, -122.42)	(37.78, -122.40)	(37.79, -122.44)
Average	(37.74, -122.41)	(37.75, -122.39)	(37.77, -122.43)	(37.78, -122.41)	(37.79, -122.43)
The proposed algorithm	(37.74, -122.41)	(37.75, -122.39)	(37.77, -122.43)	(37.78, -122.41)	(37.79, -122.43)

Table 3

The clustering centers of FCM and the proposed algorithm for Tra3 data set

Algorithm	Center 1	Center 2	Center 3	Center 4	Center 5
FCM	(37.48, -122.12)	(37.61, -122.39)	(37.73, -122.40)	(37.78, -122.43)	(37.79, -122.40)
FCM	(37.49, -122.13)	(37.62, -122.38)	(37.74, -122.40)	(37.79, -122.44)	(37.78, -122.40)
FCM	(37.50, -122.11)	(37.62, -122.39)	(37.73, -122.40)	(37.78, -122.43)	(37.78, -122.40)
FCM	(37.48, -122.12)	(37.63, -122.40)	(37.74, -122.41)	(37.79, -122.42)	(37.77, -122.39)
FCM	(37.49, -122.11)	(37.63, -122.39)	(37.72, -122.40)	(37.80, -122.42)	(37.78, -122.41)
Average	(37.49, -122.12)	(37.62, -122.39)	(37.73, -122.40)	(37.79, -122.43)	(37.78, -122.40)
The proposed algorithm	(37.49, -122.12)	(37.62, -122.39)	(37.73, -122.40)	(37.79, -122.43)	(37.78, -122.40)

Table 4

The clustering centers of FCM and the proposed algorithm for Tra4 data set

Algorithm	Center 1	Center 2	Center 3	Center 4	Center 5
FCM	(37.60, -122.40)	(37.69, -122.39)	(37.76, -122.40)	(37.79, -122.43)	(37.78, -122.44)
FCM	(37.61, -122.39)	(37.69, -122.38)	(37.75, -122.41)	(37.80, -122.41)	(37.79, -122.44)
FCM	(37.60, -122.38)	(37.69, -122.39)	(37.74, -122.40)	(37.79, -122.41)	(37.78, -122.44)
FCM	(37.62, -122.39)	(37.69, -122.40)	(37.75, -122.40)	(37.78, -122.43)	(37.77, -122.45)
FCM	(37.62, -122.38)	(37.69, -122.39)	(37.75, -122.40)	(37.78, -122.43)	(37.78, -122.44)
Average	(37.61, -122.39)	(37.69, -122.39)	(37.75, -122.40)	(37.79, -122.42)	(37.78, -122.44)
The proposed algorithm	(37.61, -122.39)	(37.69, -122.39)	(37.75, -122.40)	(37.79, -122.41)	(37.78, -122.44)
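The comparison in the tables can be checked with a few lines of Python. The sketch below uses the Tra1 values of Table 1 and assumes that the centers listed in one column belong to the same cluster in every run; the function name `column_stats` is illustrative. For each center it computes the mean over the five FCM runs and the spread (maximum minus minimum) of each coordinate.

```python
# The five FCM rows of Table 1 (Tra1), as (latitude, longitude) pairs.
fcm_runs = [
    [(37.62, -122.40), (37.76, -122.39), (37.75, -122.44), (37.80, -122.39), (37.81, -122.42)],
    [(37.63, -122.39), (37.75, -122.38), (37.76, -122.43), (37.81, -122.39), (37.79, -122.41)],
    [(37.63, -122.40), (37.74, -122.39), (37.78, -122.43), (37.80, -122.40), (37.80, -122.42)],
    [(37.64, -122.41), (37.76, -122.40), (37.77, -122.44), (37.79, -122.39), (37.79, -122.42)],
    [(37.65, -122.40), (37.75, -122.38), (37.78, -122.42), (37.80, -122.38), (37.78, -122.43)],
]
# The "proposed algorithm" row of Table 1.
proposed = [(37.63, -122.40), (37.75, -122.39), (37.76, -122.43), (37.80, -122.39), (37.79, -122.42)]


def column_stats(runs):
    """Per centre: (mean, spread) of latitude and longitude over the runs,
    rounded to two decimals like the table entries."""
    stats = []
    for k in range(len(runs[0])):
        lats = [run[k][0] for run in runs]
        lons = [run[k][1] for run in runs]
        mean = (round(sum(lats) / len(lats), 2), round(sum(lons) / len(lons), 2))
        spread = (round(max(lats) - min(lats), 2), round(max(lons) - min(lons), 2))
        stats.append((mean, spread))
    return stats


stats = column_stats(fcm_runs)
means = [mean for mean, _ in stats]
# Largest coordinate-wise gap between the averaged FCM centres and the proposed centres.
max_gap = max(abs(a - b) for mean, prop in zip(means, proposed) for a, b in zip(mean, prop))
```

The computed means reproduce the Average row of Table 1, and the largest gap to the centers of the proposed algorithm is 0.01 degrees (the latitude of Center 3), while individual FCM runs differ from one another by up to a few hundredths of a degree.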

Figures 3–6 show the clustering results of the proposed algorithm and FCM for data sets Tra1, Tra2, Tra3, and Tra4, respectively. The data points in the figures are cab locations at different times, and the horizontal and vertical axes are the latitude and longitude of the cab location, respectively. Figure 3 shows that the proposed algorithm merges some fragmented clusters, so that connected data points are assigned to the same cluster. Figures 4–6 lead to the same conclusion. From Figs. 3–6, we can see that the clustering results of the proposed algorithm are more reasonable and more in line with the actual situation than those of FCM. We therefore conclude that the proposed algorithm yields better clustering results than the FCM algorithm.

Fig.3

The clustering results by using the proposed algorithm and FCM for Tra1 data set.

Fig.4

The clustering results by using the proposed algorithm and FCM for Tra2 data set.

Fig.5

The clustering results by using the proposed algorithm and FCM for Tra3 data set.

Fig.6

The clustering results by using the proposed algorithm and FCM for Tra4 data set.

The experimental results show that the proposed algorithm has the following advantages:

From Tables 1–4, we can see that the centers found by the proposed algorithm are nearly equal to the average of the clustering centers obtained by FCM, whereas the centers of individual FCM runs vary. Therefore, the proposed algorithm based on the initial center optimization method is more stable than FCM.

From Figs. 3–6, we can see that the proposed algorithm assigns connected data to the same cluster. Therefore, the clustering centers found by the proposed algorithm match the actual situation more closely.

From Figs. 3–6, we can see that the clustering results are not affected by noise. Therefore, the proposed algorithm is not sensitive to noise.

From Figs. 3–6, we can see that the clustering results of the proposed algorithm have higher clustering accuracy. Therefore, the proposed algorithm overcomes the low quality and accuracy of the grid-based method.

6 Conclusion

This paper presents an initial center optimization method that combines the density-based method with the grid-based method. The method overcomes both the limitations of the density-based method and the low accuracy of the grid-based method. First, the data space is divided into cells according to the location of the data points by using a grid-based method. Then, the data points are assigned weights based on their distances to the cell center. Third, the cells are merged based on their density and location until the number of cells reaches the specified number of clusters. The initial centers of FCM are obtained by calculating the centers of the final cells. Using these centers as the initial centers overcomes the sensitivity of FCM to the initial values. The performance and effectiveness of the proposed clustering algorithm are evaluated on four San Francisco taxi GPS cab mobility trace data sets, and the experimental results show that the proposed algorithm yields better clustering results than FCM.

References

1. Sander, Ester, Kriegel H.P., et al., Density-based clustering in spatial databases: The algorithm GDBSCAN and its applications, Data Mining and Knowledge Discovery 2(2) (1998), 169–194.

2. Kriegel H.P. and Pfeifle, Density-based clustering of uncertain data, IEEE International Conference on Data Mining, IEEE Computer Society (2005), 689–692.

3. Tran T.N., Wehrens and Buydens L.M.C., KNN-kernel density-based clustering for high-dimensional multivariate data, Computational Statistics and Data Analysis 51(2) (2006), 513–525.

4. Kriegel, Kröger, Sander, et al., Density-based clustering, Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery 1(3) (2011), 231–240.

5. Aliguliyev R.M., Performance evaluation of density-based clustering methods, Information Sciences 179(20) (2009), 3583–3602.

6. Viswanath and Pinkesh, I-DBSCAN: A fast hybrid density based clustering method, International Conference on Pattern Recognition, IEEE Computer Society 1 (2006), 912–915.

7. Halkidi and Vazirgiannis, A density-based cluster validity approach using multi-representatives, Pattern Recognition Letters 29(6) (2008), 773–786.

8. Daszykowski, Walczak and Massart D.L., Density-based clustering for exploration of analytical data, Analytical and Bioanalytical Chemistry 380(3) (2004), 370–382.

9. Bezdek J.C., Fuzzy mathematics in pattern classification, Ph.D. dissertation, Cornell University, Ithaca, NY, 1973.

10. Dave R.N. and Bhaswan, Adaptive fuzzy c-shells clustering and detection of ellipses, IEEE Trans Neural Networks 3 (1992), 643–662.

11. Krishnapuram, Nasraoui and Keller, The fuzzy c spherical shells algorithm: A new approach, IEEE Trans Neural Networks 3 (1992), 663–671.

12. Gong, Liang, Shi, et al., Fuzzy C-means clustering with local information and kernel metric for image segmentation, IEEE Transactions on Image Processing 22(2) (2013), 573–584.

13. Pedrycz and Rai, Collaborative clustering with the use of fuzzy c-means and its quantification, Fuzzy Sets Syst 159 (2008), 2399–2427.

14. Tsai and Lin, Fuzzy c-means based clustering for linearly and nonlinearly separable data, Pattern Recognit 44 (2011), 1750–1760.

15. Baraldi, Razavi-Far and Zio, Bagged ensemble of fuzzy c-means classifiers for nuclear transient identification, Ann Nucl Energy 38 (2011), 1161–1171.

16. Marcelloni, Feature selection based on a modified fuzzy c-means algorithm with supervision, Inf Sci 151 (2003), 201–226.

17. Sun, Liu and Zhao, Clustering algorithms research, Journal of Software 19(1) (2008), 48–61.

18. Park N.H. and Lee W.S., Statistical grid-based clustering over data streams, ACM SIGMOD Record 33(1) (2004), 32–37.

19. Thanigaivelu and Murugan, Grid-based clustering with predefined path mobility for mobile sink data collection to extend network lifetime in wireless sensor networks, IETE Technical Review 29(2) (2012), 133–147.

20. Qiu B.Z., Zhang X.Z. and Shen J.Y., Grid-based clustering algorithm for multi-density, Control and Automation 3(9) (2005), 1509–1512.

21. Zhuang, Pan and Wu, Energy-optimal grid-based clustering in wireless microsensor networks with data aggregation, International Journal of Parallel Emergent and Distributed Systems 25(6) (2009), 96–102.

22. Yong H.E. and Liu Q.B., Dynamic grid-based clustering over data stream, Application Research of Computers 25(11) (2008), 3281–3284.

23. Cun-Hua, Sun Z.H., et al., A mean approximation approach to a class of grid-based clustering algorithms, Journal of Software 14(7) (2003), 1267–1274.

24. Qiu B.Z. and Zhang X.Z., Grid-based clustering algorithm with the parameter automatization, Journal of Zhengzhou University 27(2) (2006), 91–93.

25. Ankerst, Breunig, Kriegel H.P. and Sander, OPTICS: Ordering points to identify the clustering structure, Proceedings of the ACM SIGMOD Int'l Conf. on Management of Data, Philadelphia, ACM Press, 1999, pp. 49–60.