The similarity measure for complex data may not precisely reflect the true data structure, which leads to suboptimal clustering performance for spectral clustering. In this paper, we propose a novel spectral clustering method which measures the similarity of data points based on the adaptive neighborhood in Kernel space. In Kernel space, by assigning the adaptive and optimal neighbors for each data point based on the local structure, the proposed method learns a sparse matrix as the similarity matrix for spectral clustering. The proposed method is able to explore the underlying similarity relationships between data points, and is robust to the complex data. To validate the efficacy of the proposed method, we perform experiments on both synthetic and real datasets in comparison with some existing spectral clustering methods. The experimental results demonstrate that the proposed method obtains quite promising clustering performance.
Clustering is an important tool in fields like machine learning, data mining, graph compression and many other tasks. Spectral clustering has attracted a lot of attention due to its high performance on some challenging clustering tasks. Because of the capacity of mining intrinsic geometric structures, spectral clustering is efficient for complex data [17, 3], and thus has superior performance compared to the traditional clustering methods. The applications of spectral clustering can be found in many research fields, including image segmentation [21], video retrieval [12], circuit layout [1] and bioinformatics [27].
Spectral clustering makes use of the spectrum of some normalized similarity matrix derived from the data to reveal the cluster structure. The similarity matrix is constructed with some kind of similarity measure method. Thus, how to measure the similarity of data points is critical to the performance of spectral clustering. To achieve proper clustering result, spectral clustering assumes that two nearby data points in the high density region of the reduced space (formed by the eigenvectors) have high similarity and belong to the same cluster. However, this assumption does not always hold. The complex data are often heterogeneous, high dimensional, and without prior knowledge. Due to the curse of dimensionality and complex data structure, the two data points maybe far away from each other in the original space [23]. Therefore, measuring the similarity of data points in the original space may not precisely reflect the underlying data structure, which leads to inaccurate clustering result.
In this paper, instead of measuring the similarity of data points in the original space, we consider to project the original data to a feature space with the Mercer Kernel and then measure the similarity. In the Kernel space, if the underlying structure can be precisely captured, the performance of spectral clustering can be improved when applied to complex data clustering. Aiming to achieve this goal, we propose a novel spectral clustering method that measures the similarity by learning the adaptive and optimal neighbors for each data point based on the local structure in Kernel space. In the Kernel space, the similarity of two data points is measured based on the probability that these two data points are neighbors. Probabilistic neighborhood has been reported to be effective in similarity and data feature learning [9, 8]. To the best of our knowledge, we are the first to learn the probabilistic neighborhood in the Kernel space for similarity measure. The Kernel methods have been successfully used to overcome the limitation of some existing data analysis techniques, such as Kernel PCA [2] and Kernel K-means [18]. In the proposed method, we introduce the kernel methods to measure the similarity for constructing the similarity matrix of spectral clustering, which increase the linear separability by mapping the data points into the projected feature space. We demonstrate the effectiveness of the proposed method by using both synthetic and real datasets. The experimental results demonstrate that the proposed method not only achieves good performance, but also outperforms the existing spectral clustering methods.
The rest of this paper is organized as follows. After reviewing the related work in Section 2, we give a brief overview of spectral clustering in Section 3. We detail the proposed method in Section 4. Experimental results are given in Section 5 and Section 6 concludes the paper.
Related work
A good similarity measure method to construct the similarity matrix can significantly improve the clustering results of spectral clustering. Recently, a lot of effort has been devoted to the problem of how to construct the similarity matrix for spectral clustering. The Gaussian kernel function has been widely used to measure the similarity for spectral clustering, which is defined as with the kernel parameter . is the Euclidean distance between data points and . Although Gaussian kernel function is simple to calculate, the optimal value of the kernel parameter is difficult to find.
Many spectral clustering methods which aim at finding the optimal value of the kernel parameter in the Gaussian kernel function have been proposed [19, 28, 29, 6]. Ng et al. [19] devoted to select the parameter automatically by comparing the results based on certain criteria. Zelnik-Manor and Perona [28] proposed a self-tuning spectral clustering method, which locally scaled the parameter by studying the local statistics of the neighborhood of data points. Cao et al. [6] optimized the local scaling parameter by calculating the maximum flows between data points. Li et al. [15] improved the parameter for more accurate similarity measure based on a warping model.
Instead of focusing on the kernel parameter, a family of spectral clustering methods measure the similarity of data points based on the -Nearest Neighbor (NN) graph [17]. In the NN graph, point is connected to point if is among the -nearest neighbors of or is among the -nearest neighbors of . According to the NN graph, the pairwise similarity is if point is connected to point , otherwise . For such kind of methods, the parameter is easier to find. And, the similarity matrix is sparse, which has higher computational efficiency for the solution of eigenvectors. Spectral clustering methods based on the NN graph can be found in [16, 24, 14].
An interesting alternative to the distance-based similarity measure is to use information regarding the shared nearest neighbors. In most cases, two data points belong to the same cluster not only because they are near in the distance, but also because they have many shared nearest neighbors which connect them in the same cluster [11, 26]. Shared nearest neighbor based similarity measure methods have been conducted by many researchers [13, 25]. Zhang et al. [29] used the shared nearest neighbors to explore the local density between data points, which had an effect of amplifying intra-cluster similarity. Ye and Sakurai [25] measured the similarity based on the closeness of shared nearest neighbors to improve the robust of spectral clustering. Besides the methods based on shared nearest neighbors, Beauchemin [4] proposed a similarity measure method for spectral clustering by a density estimator which relied on the K-means method with subbagging procedure.
In most of the previous work, the similarity of data points is measured in the original space. In this paper, we consider to measure the similarity of data points in the Kernel space, with the objective to explore the underlying similarity relationships between data points and improve the performance of spectral clustering.
Overview of spectral clustering
Given a set of data points in , the objective of clustering is to divide the data points into clusters. A general spectral clustering algorithm consists of two basic stages: similarity matrix construction and clustering.
The similarity matrix is , where is calculated by some kind of similarity measure methods. The normalized Laplacian matrix is then computed based on . Finally, the clustering step is performed based on the eigen-decomposition of the normalized Laplacian matrix .
We state the general spectral clustering algorithm as follows [17, 25] .
Construct a similarity matrix , where .
Compute the normalized Laplacian matrix as , where is the diagonal matrix with on the diagonal.
Compute the smallest eigenvectors of , which are , , …, .
Form the matrix using the eigenvectors as its columns.
Form the matrix from by normalizing the rows to norm 1, such as .
Let be a vector corresponding the row of .
Cluster the points with the K-means method.
Assign each data point to a given cluster if is assigned to this cluster.
The similarity matrix construction in the first step is crucial to the clustering result, since the following steps are based on it. In this paper, we focus on the first step to construct the similarity matrix , which is similar to that considered in the related work. Some work have also considered to improve other steps of spectral clustering. For example, the Nystrom method has been applied in the third step for the eigenvector extraction problem to reduce the computation cost [10]. Genetic algorithms have also been utilized in the seventh step to improve the results of spectral clustering [22].
The proposed method
In this section, we introduce the proposed method that performs Adaptive Similarity Measure in Kernel space for Spectral Clustering, which is referred to as the ASMK-SC method. We first introduce the Mercer Kernel. Then we learn the adaptive and optimal neighbors of each data point in the Kernel space. ASMK-SC measures the similarity of data points based on the obtained adaptive neighbors to learn a sparse matrix as the similarity matrix for spectral clustering.
Mercer Kernel
Given a set of data points in , a Mercer Kernel is calculated according to a function and expressed as
where performs a mapping from the original space to a high dimensional feature space . The relevant aspects in applications is that it is possible to calculate Euclidean distance in without knowing explicitly . In this way, Mercer Kernel allows large non-linear feature spaces to be explored while avoiding curse of dimensionality. The radial basis function (RBF) Kerne is one of the most popular Mercer kernels, which is defined as
where the parameter is to control the scale of the dot product of the two data points. For the sake of convenient, we use Eq. (2) to map the original space to its feature space. Note that as showed in the related work, the right side of Eq. (2) also has been used as a similarity function. In our method, we use it to perform data mapping but not similarity calculation.
The Euclidean distance between data points and in the feature space of Kernel is defined as
According to Mercer Kernel, can be calculated directly from Kernel values as follows.
Adaptive neighbors in Kernel space
Adaptive neighbor learning
The probabilistic neighborhood is learned based on the assumption that each data point can be connected to all the other data points with certain probabilities [9]. In this paper, we use the Euclidean distance to measure the distance for the calculation of probabilistic neighborhood. In the Kernel space, consider that is connected to as a neighbor with probability , . For all the data points, the probabilities to be connected to satisfy .
The data points that have smaller distances to should be connected to with higher probabilities. That is, should be large if is small. Thus, the probabilities () of all data points to be connected to can be determined by solving
However, the optimization problem in Eq. (5) has a trivial solution. That is, only the nearest neighbor of can be connected to it with probability 1. Other data points cannot be the neighbors of . To avoid the trivial solution, adjustment can be made by combining the following equation to (5).
The optimal solution of Eq. (6) is to make all data points be connected to with the same probability .
By combining Eq. (6) to Eq. (5), the objective function to obtain the adaptive neighbors in the Kernel space is defined as follows.
where is a regularization parameter controlling the trade off between the trivial solution () and the uniform distribution ().
Optimization
By adjusting the parameter , we would like to keep only the nearest neighbors for each data point in the Kernel space for local structure preservation. Inspired by [9], we provide an effective method to find the optimal parameter for the optimization in Eq. (7) .
As shown in Eq. (3), . Let and denote as a vector with the element as . Denote as a vector with the element as . Let denote a vector with all of its elements being 1. Equation (7) can be written in vector form as
where and are the Lagrangian multipliers. According to the KKT condition [5], the optimal solution can be obtained as
To keep only the nearest neighbors of , we sort () in descending order. Without loss of generality, we assume that , , …, are ordered from large to small. To satisfy that has only nonzero values, we set and . According to Eq. (10), we have
Since , we can further obtain
Since , by replacing in Eq. (10) with Eq. (12), the optimal value of can be obtained by
Then we show how to decide in Eq. (13). According to Eqs (11) and (12), we have the following inequality for .
To obtain the optimal solution that has only nonzero values, can be set as
Then, replace in Eq. (13) with the above value, we can obtain the optimal value of as
Compared to the regularization parameter , the number of nearest neighbors is much easier and more intuitive to tune. Parameter can be better handled by searching .
Similarity measure based on adaptive neighbors in Kernel space
We measure the similarity of data points based on the probabilistic neighborhood calculated in the Kernel space. Let denote the matrix with as its entries. According to Eq. (16), is not symmetric, i.e., . Since the similarity matrix for spectral clustering should be symmetric, we calculate based on and as
Thus, the similarity matrix is calculated as
Note that the number of nearest neighbors is a parameter of , as shown in Eqs (16) and (17). The similarity matrix is sparse, since is usually small compared to , which is computational efficiency for the solution of eigenvectors.
Spectral clustering is then performed based on the similarity matrix . The proposed ASMK-SC method is summarized in Algorithm 4.3. The detail of the last step in Algorithm 4.3 is performed as steps 4 8 in the general spectral clustering algorithm introduced in Section 3.
[h] The proposed ASMK-SC method[1] Data matrix ; Parameters , , ; Cluster indexes of , , …, ; Construct the -nearest neighbor graph in the Kernel space by computing in (4) ; to to ; Construct the similarity matrix by computing ; Compute a diagonal matrix with on the diagonal; Compute the normalized Laplacian matrix as ; Compute the smallest eigenvectors of ; Cluster the data points into clusters based on the smallest eigenvectors of by using the K-means method.
Complexity analysis
We show the computational cost of the key steps in the proposed ASMK-SC method.
We obtain the solution of Eq. (7) according to Eq. (16) without solving the optimization problem directly. Thus, the computational cost to measure the probabilistic neighborhood of each data point in the Kernel space is no so high. The complexity of similarity matrix construction of ASMK-SC is , which includes measuring the probabilistic neighborhood in Eq. (16) and adjusting the symmetry in Eq. (17). This computational complexity is similar to that of the spectral clustering methods which measure the similarity based on the NN graph.
In general, the complexity of computing the eigenvectors from a dense matrix is . However, similar to the NN based spectral clustering methods, the Laplacian matrix of the proposed ASMK-SC method is sparse. The eigenproblem can be solved by applying sparse eigensolvers [7], such as the variants of Lanczos/Arnoldi factorization (e.g., ARPACK) which have a cost of (number of restarted Arnoldi), where is the Arnoldi length used to compute the first eigenvectors of the Laplacian matrix.
Experimental results
We conduct the experiments on both synthetic and real datasets. We compare the proposed ASMK-SC method with three other state-of-the-art similarity measure methods for spectral clustering. The first one measures the similarity based on locally scaled parameter, i.e., ST-SC [28]. The second one measures the similarity based on local density, i.e., DA-SC [29]. The third one measures the similarity based on the NN graph, i.e., NN-SC [17]. To show the benefit of Kernel in our method, we also show the results by measuring the similarity based on the probabilistic neighborhood in the original space for spectral clustering, which is referred to as the ASM-SC method. The clustering results of K-means is also presented.
We evaluate the performance of different methods based on two widely used clustering metrics: NMI (Normalized Mutual Information) and ACC (Accuracy). In the experiments, we present the best result of each spectral clustering method obtained after exploring their parameters. The algorithm is repeated 20 times with random initializations, since the results of K-means (in the last step of spectral clustering) depend on the initialization. We present the average result with the standard deviation (std) for each method. We also evaluate the performance of the proposed method by varying the parameters and . We show the average results and the 95% confidence intervals when varying the parameter .
Clustering results on synthetic data
We evaluate different methods on six 2D synthetic datasets which are usually used in spectral clustering studies for method efficiency assessment, as shown in Fig. 1. The first toy data is 2-Spiral that contains two clusters of data randomly distributed in the two spiral shapes. Similarly, 4-Corner and 4-Circle consist of four clusters. 8-Gaussian contains eight clusters of data which obey the eight Gaussian distributions. 4S+Noise contains data points randomly distributed in four squares and the space among the four squares. Since the data points among the four squares make the four squares not well separated, these data points can be seen as noises corresponding to the four squares. And, the noises are referred to as a single cluster. Thus, 4S+Noise has five clusters. 2S+Circle contains three clusters of data randomly distributed in two squares and one circle. The two squares are fairly close to the circle which make the clustering task a challenging problem.
Tables 1 and 2 report the clustering results of different methods in terms of NMI and ACC, respectively. For these synthetic datasets, the spectral clustering methods have better performance than the K-means method, specially on 2-Spiral and 4-Circle. Most of the spectral clustering methods can find the correct clusters of 2-Spiral, 4-Corner and 4-Circle. The proposed ASMK-SC method outperforms the other methods on most of the synthetic datasets.
Clustering results (NMI% std) of different methods on synthetic data
Dataset
K-means
DA-SC
ST-SC
NN-SC
ASM-SC
ASMK-SC
2-Spiral
3.55 0.11
100 0.00
100 0.00
100 0.00
100 0.00
100 0.00
4-Corner
63.76 9.98
97.68 2.69
100 0.00
100 0.00
100 0.00
100 0.00
4-Circle
13.29 0.59
90.20 3.45
96.75 2.77
100 0.00
100 0.00
100 0.00
8-Gaussian
91.97 5.10
95.13 4.35
96.43 3.01
96.52 2.72
96.69 3.70
97.40 2.27
4S+Noise
70.61 4.61
81.01 0.88
82.00 2.58
81.66 1.25
81.90 1.56
82.66 1.75
2S+Circle
60.14 0.71
63.58 2.22
67.90 1.56
67.39 1.84
67.27 2.29
68.72 2.18
Six 2D synthetic datasets with true clusters denoted by different markers and colors
We further evaluate the performance of the proposed method by varying the parameter on the synthetic datasets. Since NN-SC and ASM-SC have the same parameter , we compare the proposed method with these two methods in the experiments. Figures 2 and 3 show the clustering results on the six synthetic datasets in terms of NMI and ACC, respectively. Although all the three methods can find the correct clusters of the 4-Circle dataset, the range of for the proposed method to find the correct clusters is longer than the other two methods. On the 2S+Circle dataset, NN-SC obtains the best result when 6, however, it is very unstable as varies. The performances of the proposed ASMK-SC method are better and more stable than NN-SC and ASM-SC on most of the synthetic datasets.
Clustering Results (ACC % std) of different methods on synthetic data
Dataset
K-means
DA-SC
ST-SC
NN-SC
ASM-SC
ASMK-SC
2-Spiral
61.03 0.17
100 0.00
100 0.00
100 0.00
100 0.00
100 0.00
4-Corner
64.58 2.27
98.02 0.52
100 0.00
100 0.00
100 0.00
100 0.00
4-Circle
37.62 1.04
89.63 4.33
95.47 3.63
100 0.00
100 0.00
100 0.00
8-Gaussian
89.75 5.96
93.06 5.06
95.88 3.21
95.98 3.69
95.85 3.24
97.42 2.51
4S+Noise
84.13 4.26
89.72 1.84
89.84 2.97
88.26 3.63
89.67 2.30
90.77 2.54
2S+Circle
85.10 0.31
86.82 1.37
88.90 0.18
87.80 1.49
88.66 0.96
89.19 0.68
NMI by varying the number of nearest neighbors on synthetic data.
Properties of real data
Dataset
# of samples
# of Features
# of Clusters
UMIST
575
400
20
JAFFE
213
676
10
AT&T
400
644
40
AR10P
130
2400
10
ORL
400
1024
40
COIL20
1440
1024
20
MNIST
2000
784
10
Isolet1
1560
617
26
Lung
203
3312
5
GLA-BRA-180
180
49151
4
ACC by varying the number of nearest neighbors on synthetic data.
Clustering results on real data
Clustering results (NMI % std) of different methods on real data
Dataset
K-means
DA-SC
ST-SC
NN-SC
ASM-SC
ASMK-SC
UMIST
63.05 1.61
67.09 1.81
64.31 1.29
81.05 1.62
84.07 2.11
85.82 1.85
JAFFE
77.71 5.15
85.54 4.57
88.03 6.51
91.95 3.02
93.40 3.70
95.37 2.73
AT&T
79.08 1.10
85.05 1.27
80.03 1.98
88.02 1.02
89.36 1.13
90.63 1.10
AR10P
19.82 3.66
29.22 2.31
28.17 3.09
29.63 2.13
28.40 2.71
30.21 1.94
ORL
71.44 1.85
79.75 1.49
79.41 0.76
78.73 1.04
82.36 0.70
82.83 0.93
COIL20
73.98 2.83
77.04 0.10
78.25 1.52
86.26 2.09
90.87 1.51
94.23 0.13
MNIST
51.52 2.17
52.74 1.46
52.91 1.68
52.101.56
63.18 2.06
66.86 1.89
Isolet1
73.64 1.56
72.05 1.53
72.37 1.57
72.22 1.25
77.12 1.20
78.26 1.22
Lung
49.58 2.84
60.17 4.55
54.34 4.25
62.89 7.40
63.39 6.80
65.81 5.49
GLA-BRA-180
25.53 2.64
24.34 1.73
26.32 2.31
25.80 0.34
28.94 0.96
29.83 0.52
Clustering results (ACC % std) of different methods on real data
Dataset
K-means
DA-SC
ST-SC
NN-SC
ASM-SC
ASMK-SC
UMIST
41.81 2.18
48.55 2.53
42.34 8.22
64.044 5.97
68.37 3.58
70.22 3.72
JAFFE
71.43 7.09
75.35 6.93
75.14 6.51
89.30 5.02
91.22 6.33
93.42 5.12
AT&T
59.21 2.98
70.15 3.21
61.26 2.93
75.13 2.87
77.39 2.46
79.10 2.27
AR10P
23.54 2.88
30.77 2.79
29.22 3.00
34.50 2.33
33.21 2.75
35.12 2.43
ORL
50.09 2.98
63.48 2.77
57.68 2.92
54.26 2.39
66.88 2.17
67.09 2.17
COIL20
58.48 5.87
67.24 0.23
59.89 4.23
77.73 3.86
80.71 4.15
89.60 1.25
MNIST
56.93 3.94
55.81 2.04
57.68 2.92
52.26 4.08
62.48 3.97
66.37 3.65
Isolet1
56.68 3.47
58.54 3.22
56.50 3.39
56.26 2.60
61.03 2.86
62.84 2.17
Lung
64.95 4.83
74.26 6.78
70.23 7.33
82.46 6.43
83.17 5.34
85.79 5.23
GLA-BRA-180
56.39 3.40
54.17 2.67
59.50 3.66
56.66 0.96
62.52 2.02
63.27 1.88
NMI by varying the number of nearest neighbors .
We use a diversity of ten public datasets to evaluate the performance of different methods. The datasets include five face image datasets, i.e., UMIST,1
one handwritten digit datasets, i.e., MNIST, one spoken letter recognition data, i.e., Isolet1, and two biological datasets, i.e., Lung and GLA-BRA-180. Their properties are summarized in Table 3.
The clustering results of different methods on the ten real datasets are shown in Tables 4 and 5 in terms of NMI and ACC, respectively. We can see from the two tables that the proposed ASMK-SC method outperforms other methods. The real datasets have high dimensions. Compared to the results on the synthetic datasets with two dimensions, ASMK-SC improves the clustering results more significantly on the real datasets. Compared to ASM-SC that is without Mercer Kernel, the proposed method has better performance. That is because Mercer Kernel can offer a more general way to represent the complex data, by which the clusters can be more accurately identified.
ACC by varying the number of nearest neighbors .
The proposed ASMK-SC method has two parameters, i.e., and . We evaluate the performance of ASMK-SC by varying the two parameters on six of the real datasets: JAFFE, AT&T, AR10P, COIL20, MNIST and LUNG. In Figs 4 and 5, we compare the clustering results among ASMK-SC, ASM-SC and NN-SC for variations in the parameter . We present the range of that gives the best clustering results for all the three clustering methods. From the clustering results, we can see that on most of the datasets ASMK-SC performs better and more stably than the other two methods. The results of ASMK-SC on the COIL20 dataset are very stable.
NMI by varying and the number of nearest neighbors .
ACC by varying and the number of nearest neighbors .
Finally, we study the performance variation of ASMK-SC with respect to the parameters and . The experimental results are shown in Figs 6 and 7 in terms of NMI and ACC, respectively. The range of to obtain the best result is found empirically. ASMK-SC can maintain good performance with a large range of . The results on JAFFE, AT&T, COIL20 and MNIST are quite stable to both the parameters and . The results on AR10P and LUNG are stable to the parameters . We can see from these results that the proposed ASMK-SC method is not very sensitive to the parameter .
Note that, on some of the datasets the proposed method could not obtain very significant improvements, and could not obtain ideal cluster results, such as AR10P. The reason maybe that we use the Euclidean distance as the distance metric to measure the similarity, which maybe not the best for the datasets. Another reason maybe that the datasets contain some redundant and noisy features that affect the accuracy of the clustering results. We will address these problems in the further work.
Conclusion
In this paper, we propose a novel spectral clustering method which measures the similarity of data points based on the adaptive neighborhood in the Kernel space. We introduce the kernel methods and apply probabilistic neighborhood to measure the similarity for constructing the similarity matrix of spectral clustering, which is able to explore the underlying similarity relationships between data points and improve the performance of spectral clustering. Experiments on various types of datasets demonstrate the advantages of the proposed method. In the future work, to further improve the clustering results, we will try different distance metrics to measure the data similarity and perform feature selection for data preprocessing. We will also consider some efficient methods to determine the parameter automatically to obtain the best results.
References
1.
AlpertC.J. and KahngA.B., Multiway partitioning via geometric embeddings, orderings and dynamic programming,IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems4(11) (1995), 1342–1358.
2.
SmolaA.ScholkopfB. and MullerK.R., Kernel principal component analysis, In Proceeding of Artificial Neural Networks, 1997, pp. 583–588.
3.
BachF.R. and JordanM.I., Learning spectral clustering, In Proceeding of Advances In Neural Information Processing Systems, 2004, pp. 305–312.
4.
BeaucheminM., A density-based similarity matrix construction for spectral clustering, Neurocomputing, 2015, pp. 835–844.
5.
BoydS. and VandenbergheL., Convex optimization, Cambridge university press, 2004.
6.
CaoJ.ChenP.ZhengandY. and DaiQ., A max-flow-based similarity measure for spectral clustering, ETRI Journal35(2) (2013), 311–320.
7.
ChenW.SongY.BaiH.LinC. and ChangE.Y., Parallel spectral clustering in distributed systems, IEEE Transactions on Pattern Analysis and Machine Intelligence33(3) (2011), 568–586.
8.
DuL. and ShenY., Unsupervised feature selection with adaptive structure learning, In Proceeding of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015, pp. 209–218.
9.
WangX.NieF. and HuangH., Clustering and projected clustering with adaptive neighbors. In Proceeding of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 977–986.
10.
FowlkesC.BelongieS.ChungF. and MalikJ., Spectral grouping using the nystrom method, IEEE Transactions on Pattern Analysis and Machine Intelligence26 (2004), 214–225.
11.
GuhaS.RastogiR. and KyuseokS., Rock: a robust clustering algorithm for categorical attributes, In Proceedings of 15th International Conference on Data Engineering, 1999, pp. 512–521.
12.
HuW.XieD.FuZ.ZengW. and MaybankS., Semantic-based surveillance video retrieval, IEEE Transactions on Image Processing16(4) (2007), 1168–1181.
13.
JarvisR.A. and PatrickE.A., Clustering using a similarity measure based on shared near neighbors, IEEE Transactions on ComputersC-22(11) (1973).
14.
LiX.HuW.ShenC.DickA. and ZhangZ., Context-aware hypergraph construction for robust spectral clustering, IEEE Transactions on Knowledge and Data Engineering26(10) (2013), 2588–2597.
15.
LiZ.LiuJ.ChenandS. and TangX., Noise robust spectral clustering, In Proceeding of ICCV, 2007.
16.
LucińskaM. and WierzchońS.T., Spectral clustering based on k-nearest neighbor graph, Computer Information Systems and Industrial Management7564 (2012), 254–265.
17.
LuxburgU., A tutorial on spectral clustering, Statistics and Computing17(4) (2007), 395–416.
18.
MasulliF.FilipponeM.CamastraF. and RovettaS., Robust similarity measure for spectral clustering based on shared neighbors, Pattern Recognition1(41) (2008), 176–190.
19.
NgA.Y.JordanM.I. and WeissY., On spectral clustering: Analysis and an algorithm, In Proceeding of Advances In Neural Information Processing Systems, 2002, pp. 849–856.
20.
SamariaF. and HarterA., Parameterisation of a stochastic model for human face identification, In Proceeding of IEEE Workshop on Applications of Computer Vision, 1994.
21.
ShiJ. and MalikJ., Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence22(8) (2000), 888–905.
22.
WangH.ChenJ. and GuoK., A genetic spectral clustering algorithm, Journal of Computational Information Systems7 (2011), 3245–3252.
23.
MaZ.YangY.ChangX.NieF. and ZhouX., A convex formulation for spectral shrunk clustering, In Proceeding of AAAI Conference on Artificial Intelligence2015, pp. 2532–2538.
24.
XiongC.JohnsonD.M. and CorsoJ.J., spectral active clustering via purification of the k nearest neighbor graph, In Proceedings of European Conference on Data Mining, 2012.
25.
YeX. and SakuraiT., Spectral clustering using robust similarity measure based on closeness of shared nearest neighbors, In Proceeding of International Joint Conference on Neural Networks (IJCNN), 2015, pp. 1–8.
26.
YeX. and SakuraiT., Robust similarity measure for spectral clustering based on shared neighbors, ETRI Journal38(3) (2016), 540–550.
27.
YuZ.LiL.YouJ. and HanG., Sc3: Triple spectral clustering based consensus clustering framework for class discovery from cancer gene expression profiles, IEEE/ACM Transactions on Computational Biology and Bioinformatics9(6) (2012), 175–1765.
28.
Zelnik-ManorL. and PeronaP., Self-tuning spectral clustering, In Proceeding of NIPS, 2005, pp. 1601–1608.
29.
ZhangX.LiJ. and YuH., Local density adaptive similarity measurement for spectral clustering, Pattern Recognition Letters32(2) (2011), 352–358.