Clustering based on adaptive local density with evidential assigning strategy

Abstract

A new clustering algorithm based on Adaptive Local Density (ALD) and Evidential $K$-Nearest Neighbors (EKNN) is proposed. In density peaks clustering, many existing density metrics fail to detect cluster centers on multi-density datasets, whereas ALD handles such datasets well because it makes better use of local information. To assign the remaining points after the cluster centers are detected, an assigning strategy in the framework of evidential theory, named EKNN, is developed. The advantage of EKNN is twofold. First, by fusing the information of the $K$ nearest neighbors, it reduces the risk of the so-called domino effect, a drawback of the classical algorithm clustering by fast search and find of density peaks (commonly abbreviated as DPC). Second, it can detect border and noise points simultaneously, since it derives a credal partition that captures the ambiguity and uncertainty of the data structure. Simulations on both synthetic and real-world datasets demonstrate the strong performance of ALD-EKNN compared with DPC and some of its successors.

Keywords

Clustering algorithm, adaptive local density, K-nearest neighbors, evidential theory, credal partition

1. Introduction

The task of cluster analysis is to divide a dataset into several groups or clusters according to their intrinsic characteristics or similarities, so that intracluster points are highly similar and intercluster points are dissimilar. Since clustering is a powerful tool for unveiling inherent, unknown and complex rules in the real world, it has become one of the most important tasks in many fields such as information extraction [23], machine learning [32], pattern recognition [16], image analysis [39] and bioinformatics [30]. So far, many clustering algorithms [42, 12, 3, 44] have been proposed, including partitioning clustering, hierarchical clustering, density-based clustering, grid-based clustering and distribution-based clustering.

The $K$-means clustering method [15] is a popular member of the partitioning family since it is simple and effective. Some variations [22, 19, 2] have been developed to improve the clustering performance of $K$-means. However, they fail to detect nonspherical clusters, whereas density-based clustering methods can deal with them. DBSCAN [9] is a representative density-based method, which can detect clusters with arbitrary shapes. It can also find the number of clusters automatically. Nonetheless, some user-defined parameters, such as the neighborhood radius $\varepsilon$ and the minimum number of points MinPts, must be specified in advance. In addition, the clustering performance of DBSCAN is poor on multi-density and high-dimensional datasets.

Recently, Rodriguez and Laio [35] proposed another density-based clustering method: clustering by fast search and find of density peaks (DPC). Under the assumption that a cluster center is surrounded by points with lower local densities and is far away from points with higher local densities, DPC can efficiently find cluster centers and then assign the remaining points to their corresponding groups. DPC is able to detect non-spherical clusters and to identify the number of clusters. Parmar et al. [31] compute the local density from a residual error to handle low-density datasets and assign the remaining points in ascending order of their residual errors. Du et al. [8] propose a density peak clustering algorithm called DPC-MD, which extends DPC to deal with mixed data (with both numerical and categorical values).

To the best of our knowledge, DPC and most of its successors cannot handle variable-density datasets well, because they use a global cut-off distance to compute the local density. Here, we introduce a strategy that calculates the local density from a local cut-off distance. Since the local information of each point is considered, the local density adapts to both variable-density and uniform-density datasets. In addition, the assigning strategy in DPC may propagate errors: once a point with higher density is assigned erroneously, many points with lower density may be assigned incorrectly as well. To reduce the probability of such wrong assignments, two modifications are made. First, the remaining points are assigned in ascending order of their minimum distances to the cluster centers. Second, the $K$ nearest neighbors of a point are used to assign it, whereas DPC uses only its single nearest higher-density neighbor. Considering that evidential theory is well suited to clustering problems [6, 26, 25], we introduce a novel assigning strategy based on the EKNN rule, which makes better use of ambiguity, uncertainty and ignorance to identify border points and noise points. Each of the $K$ nearest neighbors provides a piece of evidence about the assignment. Combining the $K$ pieces of evidence yields a final piece of evidence on which the decision is based. A point with large conflicting evidence is difficult to assign, since its $K$ pieces of evidence contradict one another strongly, and a point with large ignorance evidence is very likely to be a noise point. Integrating the two strategies (the adaptive local density strategy and the assigning strategy based on the EKNN rule) gives the proposed algorithm, clustering based on Adaptive Local Density and Evidential $K$-Nearest Neighbors rule (ALD-EKNN for short).

The main contributions of ALD-EKNN are: (1) a novel density metric named ALD is proposed to handle multi-density datasets; (2) EKNN reduces the risk of false assignment and thereby improves the clustering performance; (3) EKNN provides a credal partition that identifies border and noise points simultaneously.

Simulations on both synthetic and real-world datasets show the advantages of ALD-EKNN. The rest of the paper is organized as follows. In Section 2, the DPC algorithm and evidential theory are introduced briefly. Section 3 introduces our algorithm, including the adaptive local density strategy and the assigning strategy based on the EKNN rule. In Section 4, experiments are conducted to test our algorithm. Section 5 gives the conclusion.

2. Preliminaries

In this section, we will recall some related works and briefly introduce DPC algorithm and evidential theory.

2.1 Related works

DPC is a well-known clustering algorithm with many successors. To handle the local structure and high dimensionality of datasets, Du et al. [7] propose density peaks clustering based on $K$-nearest neighbors and principal component analysis (DPC-KNN-PCA). The local density is computed from the $K$ nearest neighbors, which overcomes DPC's drawback of missing many clusters, and PCA makes the method efficient on high-dimensional datasets. Liu et al. [45] propose adaptive density peak clustering based on $K$-nearest neighbors (ADPC-KNN) to address the weakness of the decision graph. First, the cluster centers are selected adaptively, although one cluster may be split into several clusters. Second, an aggregating strategy merges such clusters, which are called reachable clusters. After the cluster centers are found, some researchers focus on new assignment strategies to improve clustering performance. Xie et al. [43] propose a two-stage assigning strategy: in the first stage, non-outliers are assigned; in the second stage, outliers and the points left unassigned are assigned using fuzzy weighted $K$-nearest neighbors. Liu et al. [24] propose a two-step allocation method based on both nearest neighbors and shared neighbors. The first step assigns a point to the cluster of a cluster center if the point and the cluster center share more than $K/2$ neighbors. The second step allocates the remaining points to the clusters that occur most frequently among their neighbors. Based on evidential theory, Su et al. [38] propose belief peaks evidential clustering (BPEC). To find better cluster centers, belief peaks based on evidential theory are used instead of density peaks. In addition, a credal partition, which can yield better performance, is adopted to assign the remaining points. Gong et al. [13] propose a novel clustering method based on accumulative belief peaks. The belief peaks accumulate as the number of nearest neighbors $K$ increases, so the cluster centers can be found adaptively. However, none of the algorithms mentioned above handles variable-density datasets well.

2.2 Density peak clustering

In DPC [35], a cluster center is defined as a point that has a higher local density than its neighbors and a relatively large distance from the points with higher local densities. To detect cluster centers, two quantities are defined for each point. The first one is the local density $\rho_{i}$, which is defined either with a cut-off kernel, Eq. (1), or with a Gaussian kernel, Eq. (2):

$\displaystyle\rho_{i}=\sum_{j}\chi(d_{ij}-d_{c}),$ (1)

$\displaystyle\rho_{i}=\sum_{j}\exp\left(-\frac{d_{ij}^{2}}{d_{c}^{2}}\right),$ (2)

where $d_{ij}$ is a distance between point $i$ and point $j$ , $d_{c}$ is the cutoff distance, $\chi(x)$ is equal to 1 if $x$ is less than 0 and $\chi(x)$ is equal to 0 otherwise. The second one is distance $\delta_{i}$ , which is computed by

$\displaystyle\delta_{i}=\left\{\begin{array}{ll}\min\limits_{j:\rho_{j}>\rho_{i}}(d_{ij})&{\rm{if}}\;\exists j:\rho_{j}>\rho_{i},\\ \max\limits_{j}(d_{ij})&{\rm{otherwise.}}\end{array}\right.$ (3)

To compute the two quantities $\rho_{i}$ and $\delta_{i}$, the cutoff distance $d_{c}$ must be determined first. Assume that the total number of points in the dataset is $n$; then the total number of pairwise distances is $N_{t}=n(n-1)/2$. After sorting these $N_{t}$ distances in ascending order, the $n_{c}$-th distance is selected as the cutoff distance $d_{c}$. The parameter $n_{c}$ is computed by

$\displaystyle n_{c}={\rm{round}}(pN_{t}),$ (4)

where $p$ is a percentage to be specified in advance, recommended to be 1% or 2%. Under the assumption that cluster centers are surrounded by points with lower density and are far away from points with higher density, cluster centers are defined as the points with both large $\rho_{i}$ and large $\delta_{i}$. The graph of $\gamma_{i}=\rho_{i}\delta_{i}$, sorted in decreasing order, is plotted to help users choose cluster centers. After that, each remaining point is assigned to the cluster of its nearest neighbor with higher local density. More details can be found in [35].
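The quantities of this subsection can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names are hypothetical, the self-distance is excluded from the density sums (an assumption, since Eqs (1) and (2) do not state the summation range), and $\delta_{i}$ is taken as the distance to the nearest point of higher density, as described in the text.

```python
import math

def pairwise_distances(points):
    """Euclidean distance matrix as a list of lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def cutoff_distance(dist, p=0.02):
    """Eq. (4): the round(p * N_t)-th smallest pairwise distance."""
    n = len(dist)
    pair_d = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    n_c = max(1, round(p * len(pair_d)))  # guard against n_c = 0 on tiny datasets
    return pair_d[n_c - 1]

def local_density(dist, d_c, gaussian=True):
    """Eq. (2) (Gaussian kernel) or Eq. (1) (hard cut-off); self excluded (assumption)."""
    n = len(dist)
    rho = []
    for i in range(n):
        others = [dist[i][j] for j in range(n) if j != i]
        if gaussian:
            rho.append(sum(math.exp(-(d / d_c) ** 2) for d in others))
        else:
            rho.append(sum(1 for d in others if d < d_c))
    return rho

def delta_distance(dist, rho):
    """Distance to the nearest point of higher density; largest distance for the densest point."""
    n = len(dist)
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return delta

# Toy data: two tight groups of five points each (made up for illustration).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05)]
dist = pairwise_distances(points)
d_c = cutoff_distance(dist)
rho = local_density(dist, d_c)
delta = delta_distance(dist, rho)
gamma = [r * d for r, d in zip(rho, delta)]
ranking = sorted(range(len(points)), key=lambda i: -gamma[i])
print(ranking[:2])  # indices of the two points with the largest gamma
```

On this toy input the two largest $\gamma_{i}$ values belong to the middle point of each group, which is what the decision graph is meant to reveal.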

2.3 Evidence theory

Evidence theory, also known as Dempster-Shafer (DS) theory, is widely used to handle reasoning and decision-making problems [36, 37, 28]. In DS theory, independent pieces of evidence are represented by belief functions, and they can be combined by Dempster's rule (referred to as the DS rule) to support a decision. Let ${\Omega}=\{\omega_{1},\omega_{2},\ldots,\omega_{C}\}$ be a finite set of $C$ collectively exhaustive and mutually exclusive hypotheses or propositions. A mass function $m$ on ${\Omega}$ is a mapping from $2^{\Omega}$ to [0, 1] such that

$\displaystyle\sum_{A\subseteq{\Omega}}m(A)=1.$ (5)

Each subset $A$ of ${\Omega}$ satisfying $m(A)>0$ is called a focal element. Two other important functions, the belief function and the plausibility function, are defined, respectively, as

$\displaystyle\textit{bel}(A)=\sum_{\phi\neq B\subseteq A}m(B),$ (6)

$\displaystyle Pl(A)=\sum_{B\cap A\neq\phi}m(B)=1-\textit{bel}(\overline{A}).$ (7)

The contour function $pl$ is the restriction of the plausibility function to singletons, i.e., $pl(\omega_{c})=Pl(\{\omega_{c}\})$, $c=1,2,\ldots,C$. If there are two independent pieces of evidence, $m_{1}$ and $m_{2}$, their combination by the DS rule is expressed by

$\displaystyle m_{1}\bigoplus m_{2}(C)=\frac{1}{1-\kappa}\sum_{B\cap A=C}m_{1}(B)m_{2}(A).$ (8)

$\kappa$ in Eq. (8), which represents $m_{1}\bigoplus m_{2}(\phi)$ , is computed by

$\displaystyle\kappa=\sum_{B\cap A=\phi}m_{1}(B)m_{2}(A).$ (9)

To simplify the calculation, the combination of two contour functions is computed by

$\displaystyle pl_{1}\bigoplus pl_{2}(A)=\frac{pl_{1}(A)pl_{2}(A)}{1-\kappa}\quad\forall A\in{\Omega}.$ (10)
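As a small illustration of Eqs (5)-(9), the following Python sketch represents a mass function as a dictionary from `frozenset` to mass. The representation and the function names are assumptions made for this example only.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Eqs (8)-(9)."""
    joint = {}
    kappa = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + wa * wb
        else:
            kappa += wa * wb  # conflicting mass, Eq. (9)
    if kappa >= 1.0:
        raise ValueError("totally conflicting evidence cannot be normalized")
    return {s: w / (1.0 - kappa) for s, w in joint.items()}, kappa

def bel(m, a):
    """Eq. (6): total mass of the non-empty subsets of a."""
    return sum(w for s, w in m.items() if s and s <= a)

def pl(m, a):
    """Eq. (7): total mass of the sets that intersect a."""
    return sum(w for s, w in m.items() if s & a)

omega = frozenset({"w1", "w2", "w3"})
m1 = {frozenset({"w1"}): 0.6, omega: 0.4}
m2 = {frozenset({"w2"}): 0.5, omega: 0.5}
m12, kappa = dempster_combine(m1, m2)
print(kappa)                            # conflict between the two sources
print(bel(m12, frozenset({"w1"})), pl(m12, frozenset({"w1"})))
```

Here the two sources support different singletons, so the conflict $\kappa$ is the product $0.6\times 0.5$, and the remaining mass is renormalized by $1-\kappa$.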

3. Clustering based on adaptive local density and evidential K-nearest neighbors

Our clustering algorithm consists of two steps. First, cluster centers are detected by the adaptive local density strategy. Second, the remaining points are assigned by the assigning strategy based on the EKNN rule. The two strategies are introduced in turn below.

3.1 Adaptive local density strategy

In DPC, all points in the dataset use a global cut-off distance $d_{c}$ to compute the local density. On a multi-density dataset, this may cause points in a cluster with small intra-cluster distances to always have a larger local density than points in a cluster with large intra-cluster distances. As a result, the center of the cluster with large intra-cluster distances may be overlooked. Details are given in Section 4.1. To address this problem, an adaptive local density metric is proposed here.

In our strategy, each point has its own cut-off distance $d_{ci}$ , which is defined by

$\displaystyle d_{ci}=\mathop{\rm{quantile}}\limits_{j\in\textit{KNN}_{i}}(d_{ij},q),$ (11)

where $q$ is a quantile number such that $0\leqslant q\leqslant 1$ and $\textit{KNN}_{i}$ represents $K$ -nearest neighbors of the point $x_{i}$ . Here $K$ and $q$ are fixed in advance. After the local cut-off distance is obtained, the adaptive local density is measured by

$\displaystyle\rho_{i}=\sum_{j\in\textit{KNN}_{i}}\exp\left(-\frac{d_{ij}^{2}}{d_{cj}^{2}}\right).$ (12)

The distance $\delta_{i}$ is computed by Eq. (3). Finally we can use the plot of $\gamma_{i}=\rho_{i}\delta_{i}$ to select cluster centers.

The adaptive local density strategy is summarized in Algorithm 1:

Algorithm 1: The adaptive local density strategy
Input: Dataset $X$, parameters $K$ and $q$
Output: Cluster centers ${\Omega}$
1: Prepare the data by normalizing them;
2: Compute the distance matrix and get the $K$-nearest neighbors $\textit{KNN}_{i}$ of every point in dataset $X$;
3: Compute the cut-off distance $d_{ci}$ with Eq. (11);
4: Compute the local density $\rho_{i}$ for all points with the density metric of Eq. (12);
5: Compute the distance $\delta_{i}$ for all points according to Eq. (3);
6: Choose the cluster centers on the graph of $\gamma_{i}=\rho_{i}\delta_{i}$, sorted in decreasing order.
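The core of the adaptive local density strategy (Eqs (11) and (12)) can be sketched as follows. The function names are hypothetical, and the quantile uses linear interpolation between order statistics, which is an assumption because the text does not fix a quantile convention. Each neighbor $j$ contributes with its own cut-off distance $d_{cj}$, as written in Eq. (12).

```python
import math

def knn_indices(dist, k):
    """Indices of the k nearest neighbours of every point (self excluded)."""
    n = len(dist)
    return [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
            for i in range(n)]

def quantile(values, q):
    """Linear-interpolation quantile (assumed convention)."""
    v = sorted(values)
    pos = q * (len(v) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(v) - 1)
    return v[lo] + (pos - lo) * (v[hi] - v[lo])

def adaptive_local_density(dist, k, q):
    """Return (rho, d_c): adaptive densities and per-point cut-off distances."""
    n = len(dist)
    knn = knn_indices(dist, k)
    d_c = [quantile([dist[i][j] for j in knn[i]], q) for i in range(n)]   # Eq. (11)
    rho = [sum(math.exp(-(dist[i][j] / d_c[j]) ** 2) for j in knn[i])     # Eq. (12)
           for i in range(n)]
    return rho, d_c

# Toy data: a dense 3x3 grid, a sparse 3x3 grid and one isolated point.
dense = [(0.01 * x, 0.01 * y) for x in range(3) for y in range(3)]
sparse = [(1 + 0.1 * x, 1 + 0.1 * y) for x in range(3) for y in range(3)]
points = dense + sparse + [(0.5, 0.5)]
dist = [[math.dist(a, b) for b in points] for a in points]
rho, d_c = adaptive_local_density(dist, k=4, q=0.8)
print(rho[4], rho[13], rho[18])  # dense centre, sparse centre, isolated point
```

In this toy setup the centres of the dense and the sparse grid receive practically the same density, because the sparse grid is a scaled copy of the dense one, while the isolated point's density is close to zero since its neighbors all have small cut-off distances.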

3.2 Assigning strategy based on EKNN rule

Compared with DPC, our assigning strategy has two improvements. First, the remaining points are ranked in ascending order of their minimum distance, which is the minimum of the distances between the point $x_{i}$ and the cluster centers. Second, we use the evidential $K$-nearest neighbors rule to make the decision.

After detecting cluster centers, which can be regarded as a frame of discernment ${\Omega}=\{\omega_{1},\omega_{2},\ldots,\omega_{C}\}$, we divide all the points into two groups. Group one $X_{1}$ includes the cluster centers and their $K$-nearest neighbors, and group two $X_{2}$ includes the remaining points ($X\setminus X_{1}$). The cluster centers are labeled one by one. Then each cluster center shares its label with its $K$-nearest neighbors.

Once a point has been allotted, it can provide evidence for the points in group two. We therefore want a point closer to a cluster center to be assigned earlier, since it provides more reliable evidence. Following this idea, the points in group two are sorted in ascending order of their minimum distances, which are defined by

$\displaystyle d_{mi}=\min_{\omega_{c}\in{\Omega}}(d_{i\omega_{c}}),\quad x_{2i}\in X_{2},$ (13)

where $d_{i\omega_{c}}$ is the distance between the point $x_{2i}$ and cluster center $\omega_{c}$ .

Following the assigning order, the points in group two are assigned one by one. Without loss of generality, assume that $x_{2i}$ is a point in group two. To allot it, we find its $K$-nearest neighbors in group one, because these points provide $K$ pieces of evidence for the decision. A piece of evidence is created from $d_{ij}$, the distance between point $x_{2i}$ and one of its $K$-nearest neighbors (point $x_{1j}$). If point $x_{1j}$ belongs to cluster $\omega_{c}$, this piece of evidence is represented by the following mass function:

$\displaystyle\left\{\begin{array}{l}m_{ij}(\omega_{c})=\exp(-\gamma_{d}d_{ij}^{2}),\\ m_{ij}({\Omega})=1-\exp(-\gamma_{d}d_{ij}^{2}),\\ m_{ij}(A)=0\quad\forall A\in 2^{\Omega}\backslash\{\omega_{c},{\Omega}\},\end{array}\right.$ (14)

where $\gamma_{d}$ is a positive parameter. This mass function quantifies the possibility that the two points are in the same cluster. The smaller the distance $d_{ij}$ is, the larger $m_{ij}(\omega_{c})$ is, and the more likely the two points are in the same cluster. $m_{ij}({\Omega})$ conveys ignorance: the larger $m_{ij}({\Omega})$ is, the less informative this piece of evidence is. That is, for a noise point only ignorant evidence is available, since $d_{ij}$ is very large. After the $K$ pieces of evidence are combined by the DS rule, we obtain the combined mass function that guides the decision. When $C$ is large, computing the combined mass function is time-consuming. As can be seen in Eq. (14), the mass function is equal to 0 except for $m_{ij}(\omega_{c})$ and $m_{ij}({\Omega})$, so the combination of two pieces of evidence can be simplified as follows. $m_{i1}\bigoplus m_{i2}(\omega_{l})$ is computed by

$\displaystyle m_{i1}\bigoplus m_{i2}(\omega_{l})=m_{i1}({\Omega})M_{2}+m_{i2}({\Omega})M_{1}+M_{1}\cdot M_{2},$ (15)

where $l=1,2,\ldots,C$, $M_{1}=[m_{i1}(\omega_{1}),m_{i1}(\omega_{2}),\ldots,m_{i1}(\omega_{C})]$ and $M_{2}=[m_{i2}(\omega_{1}),m_{i2}(\omega_{2}),\ldots,m_{i2}(\omega_{C})]$, and the product $M_{1}\cdot M_{2}$ is taken element-wise, so that the $l$-th entry of the right-hand side gives $m_{i1}\bigoplus m_{i2}(\omega_{l})$; $m_{i1}\bigoplus m_{i2}(\Omega)$ is computed by

$\displaystyle m_{i1}\bigoplus m_{i2}({\Omega})=m_{i1}({\Omega})m_{i2}({\Omega});$ (16)

$m_{i1}\bigoplus m_{i2}(\phi)$ is computed by

$\displaystyle m_{i1}\bigoplus m_{i2}(\phi)=1-\sum_{l=1}^{C}m_{i1}\bigoplus m_{i2}(\omega_{l})-m_{i1}\bigoplus m_{i2}({\Omega}).$ (17)
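A minimal Python sketch of Eqs (14)-(17) is given below. The tuple layout (singleton masses, mass on ${\Omega}$, mass on the empty set) and the function names are assumptions made for illustration. Note that, as the equations are written, the conflicting mass is kept on the empty set instead of being normalized away.

```python
import math

def simple_mass(d, label, n_clusters, gamma_d):
    """Eq. (14): evidence from one neighbour at distance d with cluster index `label`."""
    s = math.exp(-gamma_d * d * d)
    singles = [0.0] * n_clusters
    singles[label] = s
    return singles, 1.0 - s, 0.0

def combine(ev1, ev2):
    """Eqs (15)-(17): combine two pieces of evidence; conflict stays on the empty set."""
    m1, o1, _ = ev1
    m2, o2, _ = ev2
    singles = [o1 * b + o2 * a + a * b for a, b in zip(m1, m2)]  # Eq. (15), entry by entry
    omega = o1 * o2                                              # Eq. (16)
    empty = 1.0 - sum(singles) - omega                           # Eq. (17)
    return singles, omega, empty

# Two neighbours at the same distance that belong to different clusters.
# gamma_d = 100 is a hypothetical value chosen for this example.
e1 = simple_mass(0.1, 0, n_clusters=2, gamma_d=100.0)
e2 = simple_mass(0.1, 1, n_clusters=2, gamma_d=100.0)
singles, omega, empty = combine(e1, e2)
print(singles, omega, empty)
```

Because the two neighbours disagree, the conflict equals the product of their singleton masses, here $\exp(-1)\cdot\exp(-1)=\exp(-2)$. Calling `combine` repeatedly on the running result and the next piece of evidence handles all $K$ neighbors.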

After the $K$ pieces of evidence are combined by repeating Eqs (15)–(17), we obtain the final mass function $m_{i}(\omega_{l})$ ($l=1,2,\ldots,C$). From a credal partition, fuzzy, rough and hard partitions can be deduced. Under the hard version, the point $x_{2i}$ is assigned to the cluster $\omega_{t}$ that satisfies

$\displaystyle m_{i}(\omega_{t})=\max_{l=1,2,\ldots,C}(m_{i}(\omega_{l}))$ (18)

After the point $x_{2i}$ is labeled, it is added to group one, so that it can provide evidence for the points remaining in group two. The assignment process ends when all the points in group two are labeled.

The process of EKNN is summarized in Algorithm 2:

Algorithm 2: Assigning strategy based on EKNN rule
Input: Dataset $X$ and its cluster centers ${\Omega}$
Output: Labeled $X$
1: Label the cluster centers ${\Omega}=\{\omega_{1},\omega_{2},\ldots,\omega_{C}\}$ one by one;
2: For each cluster center $\omega_{l}$, share its label $l$ with its $K$-nearest neighbors ($l=1,2,\ldots,C$);
3: Add these labeled points into group one $X_{1}$, and allot the remaining points into group two $X_{2}$;
4: for $i=1$ to $m$ (the number of points in group two) do
5:     Compute $d_{mi}$ by Eq. (13);
6: end for
7: Sort the points in group two in ascending order of their minimum distances $d_{mi}$;
8: for $i=1$ to $m$ do
9:     For the point $x_{2i}$, find its $K$-nearest neighbors in group one;
10:    Generate a piece of evidence for each of its $K$-nearest neighbors according to Eq. (14);
11:    for $j=2$ to $K$ do
12:        Compute $m_{i1}\bigoplus m_{ij}(\omega_{l})$ by Eq. (15), $m_{i1}\bigoplus m_{ij}({\Omega})$ by Eq. (16) and $m_{i1}\bigoplus m_{ij}(\phi)$ by Eq. (17);
13:        $m_{i1}(\omega_{l})\leftarrow m_{i1}\bigoplus m_{ij}(\omega_{l})$, $m_{i1}({\Omega})\leftarrow m_{i1}\bigoplus m_{ij}({\Omega})$, $m_{i1}(\phi)\leftarrow m_{i1}\bigoplus m_{ij}(\phi)$;
14:    end for
15:    $m_{i}(\omega_{l})\leftarrow m_{i1}(\omega_{l})$, $m_{i}({\Omega})\leftarrow m_{i1}({\Omega})$, $m_{i}(\phi)\leftarrow m_{i1}(\phi)$;
16:    Assign $x_{2i}$ according to Eq. (18);
17:    Add point $x_{2i}$ into group one;
18: end for
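The steps of Algorithm 2 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `eknn_assign` is hypothetical, cluster labels are 0-based, a point that is among the $K$ nearest neighbors of several centers keeps the label of the first center (the text does not specify this case), and the combination starts from the vacuous mass function, which gives the same result as starting from the first piece of evidence.

```python
import math

def eknn_assign(dist, centers, k, gamma_d):
    """Label all points given a distance matrix and the indices of the cluster centres.

    Returns (labels, masses), where masses[i] = (singleton masses, m(Omega), m(empty))
    for every point that was in group two.
    """
    n, c = len(dist), len(centers)
    labels = {ctr: l for l, ctr in enumerate(centers)}
    # Group one: centres plus their K nearest neighbours.
    for l, ctr in enumerate(centers):
        near = sorted((j for j in range(n) if j != ctr), key=lambda j: dist[ctr][j])[:k]
        for j in near:
            labels.setdefault(j, l)  # assumption: the first centre wins on overlap
    # Group two, sorted by the distance to the closest centre, Eq. (13).
    group_two = sorted((i for i in range(n) if i not in labels),
                       key=lambda i: min(dist[i][ctr] for ctr in centers))
    masses = {}
    for i in group_two:
        near = sorted(labels, key=lambda j: dist[i][j])[:k]  # K nearest labelled points
        singles, omega = [0.0] * c, 1.0                      # vacuous starting point
        for j in near:
            s = math.exp(-gamma_d * dist[i][j] ** 2)         # Eq. (14)
            e_single = [s if l == labels[j] else 0.0 for l in range(c)]
            e_omega = 1.0 - s
            singles = [omega * b + e_omega * a + a * b       # Eq. (15)
                       for a, b in zip(singles, e_single)]
            omega *= e_omega                                 # Eq. (16)
        empty = 1.0 - sum(singles) - omega                   # Eq. (17)
        labels[i] = max(range(c), key=lambda l: singles[l])  # Eq. (18)
        masses[i] = (singles, omega, empty)                  # point i now acts as group one
    return [labels[i] for i in range(n)], masses

# Toy data: two groups of five points on a line, centres at indices 2 and 7.
points = [(0.1 * t, 0.0) for t in range(5)] + [(2.0 + 0.1 * t, 0.0) for t in range(5)]
dist = [[math.dist(a, b) for b in points] for a in points]
labels, masses = eknn_assign(dist, centers=[2, 7], k=2, gamma_d=10.0)
print(labels)
```

The returned masses make it possible to inspect $m_{i}(\phi)$ and $m_{i}({\Omega})$ for each point after the hard assignment, which is how border and noise candidates would be flagged.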

3.3 Complexity analysis of ALD-EKNN

Assume that the dataset has $n$ points and $C$ clusters. ALD-EKNN has two stages. In the first stage, identifying the cluster centers based on adaptive local density, the distance matrix that stores the $K$-nearest neighbors of each point needs $Kn$ spaces. In addition, each point has four attributes (cut-off distance $d_{ci}$, local density $\rho_{i}$, distance $\delta_{i}$ and $\gamma_{i}$), so $4n$ spaces are needed. In the second stage, assigning the remaining points, there is a piece of evidence with $C+2$ focal elements for each point, so the evidence matrix needs $(C+2)n$ spaces. The overall space complexity of ALD-EKNN is therefore $O((C+6+K)n)$. It is smaller than that of DPC, whose space complexity is $O(n^{2})$, because $(C+6+K)$ is smaller than $n$ in most cases.

The time complexity of ALD-EKNN depends on the following parts: (1) computing the distance matrix and sorting the distances, $O(n^{2})$; (2) computing the cut-off distances $d_{ci}$, $O(n)$; (3) computing the local densities $\rho_{i}$, $O(Kn)$; (4) computing the distances $\delta_{i}$, $O(Kn)$; (5) computing $\gamma_{i}$, $O(n)$; (6) computing the distances between the points in group two and the cluster centers, $O(C(n-CK-C))$; (7) sorting the points in group two in ascending order, $O(C(n-CK-C))$; (8) assigning the remaining points in group two, $O(3C^{2}(n-CK-C))$. Considering that $C$ is always much smaller than $n$, the total time complexity of ALD-EKNN is $O(n^{2})$, which is equivalent to that of DPC.

4. Experiment results

This section consists of two parts. In Section 4.1, some illustrative examples are used to show the characteristics of ALD-EKNN. In Section 4.2, we compare the performance of ALD-EKNN with the Self-Organizing Map (SOM), DPC and some of DPC's successors on synthetic and real-world datasets. To compare the clustering results of different algorithms, many cluster quality indices, e.g. accuracy [17], F-measure [17], Normalized Mutual Information (NMI) [17] and the Adjusted Rand Index (ARI) [41], are good candidates. ARI is used here since it is a popular and comprehensive performance index. To make the results independent of the units of the different attributes, the input data points are normalized into [0, 1] by the min-max rule

$\displaystyle x_{ij}=\frac{x_{ij}-\min(x_{j})}{\max(x_{j})-\min(x_{j})}.$ (19)
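For reference, the two ingredients of the experimental setup can be sketched in plain Python: the min-max rule of Eq. (19) and the ARI in its standard pair-counting form. The function names are hypothetical, and mapping constant columns to 0 is an assumption, since Eq. (19) is undefined when the maximum and minimum of an attribute coincide.

```python
from collections import Counter
from math import comb

def min_max_normalize(rows):
    """Eq. (19), applied attribute by attribute; constant attributes map to 0 (assumption)."""
    cols = list(zip(*rows))
    lo = [min(col) for col in cols]
    span = [max(col) - min(col) for col in cols]
    return [[(x - l) / s if s else 0.0 for x, l, s in zip(row, lo, span)] for row in rows]

def adjusted_rand_index(truth, pred):
    """Adjusted Rand Index computed from the contingency table of two labelings."""
    n = len(truth)
    sum_ij = sum(comb(v, 2) for v in Counter(zip(truth, pred)).values())
    sum_a = sum(comb(v, 2) for v in Counter(truth).values())
    sum_b = sum(comb(v, 2) for v in Counter(pred).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case, e.g. one cluster in both labelings
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

print(min_max_normalize([[1, 10], [2, 20], [3, 30]]))
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, renamed labels
```

ARI depends only on which points are grouped together, so a clustering that matches the reference up to a renaming of the labels scores 1.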

In addition, many distance measures can be used to evaluate the dissimilarity between points, including the Euclidean distance [27], Manhattan distance [14], Mahalanobis distance [4] and Minkowski distance [18]. To obtain better performance, some researchers [34] combine several distance metrics, and others [1, 46] propose new distance metric learning methods. Distance measures play an important role in clustering and classification tasks and form a broad topic of their own, which is beyond the scope of this paper. We choose the Euclidean distance here since it is simple and widely used in the literature.

Figure 1. Artificial dataset.

4.1 Illustrative examples

Example 1: In this example, we introduce an artificial dataset with multi-densities to show the advantage of our algorithm on detecting cluster centers. As shown in Fig. 1, this dataset contains three clusters. Both cluster 1, with small intra-cluster distances, and cluster 2, with medium intra-cluster distances, contain 1000 points and cluster 3, with large intra-cluster distances, has 50 points. There is a noise point in the dataset, which is shown as a red star in Fig. 1. Besides the density metric in Eq. (2), two other density metrics in DPC-KNN [7] and ADPC-KNN [45] are applied here for comparison. They are expressed in Eqs (20) and (21) respectively.

$\displaystyle\rho_{i}=\exp\left(-\left(\frac{1}{K}\sum_{j\in\textit{KNN}_{i}}d_{ij}^{2}\right)\right),$ (20)

$\displaystyle\rho_{i}=\sum_{j\in\textit{KNN}_{i}}\exp\left(-\frac{d_{ij}^{2}}{d_{c}^{2}}\right).$ (21)

The cut-off distance $d_{c}$ is not needed in Eq. (20) or it can be regarded as a constant, 1. The cutoff distance $d_{c}$ in Eq. (21) is computed by

$\displaystyle d_{c}=\mu+\sqrt{\sum_{i=1}^{n}(\lambda_{i}-\mu)^{2}},$ (22)

where $\lambda_{i}=\max\limits_{j\in\textit{KNN}_{i}}(d_{ij})$ , $\mu$ is mean value of $\lambda_{i}$ , $n$ is the total number of points in the dataset.
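The two comparison metrics can be sketched as follows, taking Eqs (20)-(22) exactly as printed above. The function names are hypothetical, and the input `knn_dists[i]` is assumed to hold the distances from point $i$ to its $K$ nearest neighbors.

```python
import math

def dpc_knn_density(knn_dists):
    """Eq. (20): exponential of the negative mean squared KNN distance."""
    return [math.exp(-sum(d * d for d in row) / len(row)) for row in knn_dists]

def adpc_knn_cutoff(knn_dists):
    """Eq. (22) as printed: mu + sqrt(sum_i (lambda_i - mu)^2)."""
    lam = [max(row) for row in knn_dists]  # lambda_i: distance to the K-th neighbour
    mu = sum(lam) / len(lam)
    return mu + math.sqrt(sum((l - mu) ** 2 for l in lam))

def adpc_knn_density(knn_dists):
    """Eq. (21) with the single global cut-off distance of Eq. (22)."""
    d_c = adpc_knn_cutoff(knn_dists)
    return [sum(math.exp(-(d / d_c) ** 2) for d in row) for row in knn_dists]

# Made-up KNN distances for three points with K = 2.
knn_dists = [[0.1, 0.2], [0.1, 0.2], [0.3, 0.4]]
print(adpc_knn_cutoff(knn_dists))
print(dpc_knn_density(knn_dists))
print(adpc_knn_density(knn_dists))
```

Both metrics use one cut-off value for the whole dataset (the constant 1 in Eq. (20), the global $d_{c}$ in Eq. (21)), which is the property examined in the rest of this example.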

Figure 2. Graph of $\gamma$ via different algorithms.

Figure 2 shows the graph of $\gamma$ for ALD-EKNN, DPC, DPC-KNN and ADPC-KNN. Here, $K$ is set to 40 for ALD-EKNN, DPC-KNN and ADPC-KNN, and $p$ is 1% for DPC. With ALD-EKNN, we easily obtain three cluster centers. In contrast, DPC and ADPC-KNN find only two cluster centers, while DPC-KNN detects four. Considering that the dataset has three clusters, our algorithm ALD-EKNN achieves the best result.

To pinpoint the cause, we study the cut-off distance $d_{c}$. In what follows, we explain how $d_{c}$ affects the local density distribution and why our algorithm works best on such multi-density datasets. With the three density metrics (Eqs (2), (20) and (21)), the cut-off distances are 0.0051, 1 and 0.04, respectively. Figure 3 shows the exponential function $\exp(-d_{ij}^{2}/d_{c}^{2})$ for these cut-off distances. As shown in Fig. 3, for DPC, only small distances $d_{ij}$, less than about 0.01, contribute noticeably to the local density, because the small $d_{c}$ makes the function fall steeply; for ADPC-KNN, distances $d_{ij}$ less than about 0.08 contribute; for DPC-KNN, even a very large distance $d_{ij}$, such as 1, still contributes 0.3679 to the local density, because the function decreases very slowly with $d_{ij}$.
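The thresholds quoted above can be checked with a few lines of Python that evaluate the kernel $\exp(-d_{ij}^{2}/d_{c}^{2})$ at the three cut-off distances given in the text. The helper name `kernel` is chosen for this example only.

```python
import math

def kernel(d, d_c):
    """Contribution of a neighbour at distance d for cut-off distance d_c."""
    return math.exp(-(d / d_c) ** 2)

# (method, cut-off distance from the text, distance at which the kernel is evaluated)
cases = [("DPC", 0.0051, 0.01), ("ADPC-KNN", 0.04, 0.08), ("DPC-KNN", 1.0, 1.0)]
for name, d_c, d in cases:
    print(f"{name}: d_c = {d_c}, d = {d}, contribution = {kernel(d, d_c):.4f}")
```

For DPC and ADPC-KNN the contribution at the quoted distance has already dropped to roughly 0.02, whereas for DPC-KNN a distance of 1 still contributes $\exp(-1)\approx 0.3679$.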

Figure 3. Exponential function with different cut-off distances.

Figure 4. Distance statistical distribution.

Figure 4 displays the distribution of the distances $d_{ij}$ for one point in each of clusters 1, 2 and 3 and for the noise point. It can be seen from Fig. 4 that very few distances $d_{ij}$ are less than 0.01 for the point in cluster 3, so its local density is very small when measured by Eq. (2). This is why the cluster center in cluster 3 is missed, as shown in Fig. 2b. For the same reason, the cluster center in cluster 3 cannot be detected when the local density is measured by Eq. (21). With DPC-KNN, the cluster center in cluster 3 can be found because the local densities of all points in the dataset differ only slightly. Nonetheless, the noise point, which has a large $\delta_{i}$, is then also regarded as a cluster center, as shown in Fig. 2c. In summary, we may miss the center of a cluster with large intra-cluster distances if the cut-off distance $d_{c}$ is small, and we may regard a noise point as a cluster center if $d_{c}$ is large. To tackle the problem, we want a small cut-off distance for a noise point, so that its local density is very small, and a large cut-off distance for a cluster with large intra-cluster distances, so that its local density is large. If the cut-off distance is a constant, as in DPC, DPC-KNN and ADPC-KNN, it is very difficult to escape this dilemma.

In our algorithm, ALD-EKNN, the cut-off distance is no longer a constant; it adapts to the $K$-nearest neighbors of each point, as shown in Eq. (11). In this example, the noise point has cut-off distances of 0.0502 $\pm$ 0.0299, where 0.0502 is the mean of its 40 nearest neighbors' cut-off distances and 0.0299 is their standard deviation. The cluster center in cluster 3, point 2050, has cut-off distances of 0.1144 $\pm$ 0.0179. The local density of the noise point is 0.0002 and that of point 2050 is 28.9651. ALD-EKNN thus gives the noise point a very small local density while yielding a large local density for the point in the cluster with large intra-cluster distances. It therefore resolves the dilemma and detects the right cluster centers.

Example 2: In this example, we show the advantage of our EKNN strategy on another artificial dataset. The dataset has 400 two-dimensional points divided into four clusters. Figure 5a and b show the original partition and the clustering result of DPC, respectively. Comparing Fig. 5a with Fig. 5b, we can see that the green circular points in the black rectangle are assigned wrongly: they belong to cluster 4 but are assigned to cluster 2. The indices of these points are 140, 160, 308, 343, 364, 365, 369 and 392, as shown in Fig. 5b. (The index here represents the order of the points in the dataset.) Table 1 lists detailed information on these points: the point's index, the point's density, the index of the Point's $K$-nearest Higher-density Neighbor (PKHN), the label of the PKHN and the point's true label.

Figure 5. Clustering results on artificial dataset with four classes.

According to DPC's assigning rule, point 140 is assigned first among these eight points because it has the highest local density. Unfortunately, it is assigned wrongly, since its nearest higher-density neighbor, point 380, belongs to cluster 2. It then passes the wrong label on to other points with lower local density. This phenomenon is the domino effect, also called the ripple effect. The probable reason is that DPC considers only the first nearest higher-density neighbor when making the decision. Whenever an undetermined point and its first nearest higher-density neighbor are not actually in the same cluster, the point is assigned wrongly.

Table 1. The assigning information for some points

Point’s index	Point’s density	PKHN’s index	PKHN’s label	Point’s true label
140	3.237	380	2	4
364	3.178	140	2	4
160	3.112	140	2	4
308	2.065	364	2	4
365	1.862	308	2	4
343	1.833	308	2	4
369	1.530	343	2	4
392	1.007	369	2	4

To reduce the risk of wrong assignment, $K$ nearest higher-density neighbors are considered in our algorithm. These $K$ points offer $K$ pieces of evidence for the decision. Table 2 lists the index, local density, distance and label of point 140's 15 nearest higher-density neighbors, where the $i$th row corresponds to the $i$th nearest higher-density neighbor. As shown in Table 2, among its 15 nearest higher-density neighbors there are 6 points in cluster 2 and 9 points in cluster 4. After the $K$ pieces of evidence are combined, we obtain the final mass function, $m_{140}(\omega_{l})=[0,0.0004,0,0.0020]$. Point 140 is then assigned to the right cluster (cluster 4) according to Eq. (18).

In addition, $m_{140}(\phi)=0.9975$ tells us that point 140 is a border point. Figure 6 shows the contour surfaces of the conflicting and ignorance evidence obtained by our strategy. As shown in Fig. 6a, the points in the yellow area are difficult to assign because of their large conflicting evidence. Most of these points lie in the area where two clusters are very close, so it is intuitive that they are hard to assign. However, other hard assigning strategies, such as those of DPC and ADPC-KNN, fail to reveal such information. Figure 6b displays the contour surface of the ignorance evidence, $m({\Omega})$. The point in the yellow circle has relatively low ignorance evidence ($m({\Omega})=0.3248$), so it is not regarded as a noise point here. The point in the red circle can be viewed as a noise point because of its high ignorance evidence ($m({\Omega})=0.9989$). This is consistent with our intuition. In short, with the help of evidential theory, our strategy is able to discover both border points and noise points.

Table 2

Point 140’s 15-nearest higher-density neighbors

$i$ th	Index	Local density	Distance	Label
1	380	3.639	0.0550	2
2	358	4.049	0.0570	2
3	170	4.491	0.0622	2
4	302	4.659	0.0646	4
5	111	3.833	0.0719	4
6	363	4.968	0.0738	4
7	135	4.302	0.0757	2
8	350	5.134	0.0765	4
9	327	6.298	0.0779	4
10	105	3.813	0.0780	2
11	319	5.079	0.0782	4
12	347	4.596	0.0793	4
13	353	6.690	0.0841	4
14	329	3.572	0.0963	4
15	195	3.899	0.1013	2
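
To make the fusion of the $K$ pieces of evidence concrete, the following Python sketch combines the 15 neighbors of Table 2 with the unnormalized conjunctive rule. It is not the mass construction of the paper: it assumes a simple distance-based mass $\alpha\exp(-\gamma d^{2})$ on the neighbor's cluster with the remainder on $\Omega$, and the values of `ALPHA` and `GAMMA` are hypothetical. Its numbers are therefore not expected to match the mass function reported in the text. Picking the singleton with the largest mass is our reading of the decision step.

```python
import math

# Table 2: (distance, label) of the 15 nearest higher-density neighbors of point 140.
neighbors = [
    (0.0550, 2), (0.0570, 2), (0.0622, 2), (0.0646, 4), (0.0719, 4),
    (0.0738, 4), (0.0757, 2), (0.0765, 4), (0.0779, 4), (0.0780, 2),
    (0.0782, 4), (0.0793, 4), (0.0841, 4), (0.0963, 4), (0.1013, 2),
]

ALPHA = 0.95    # hypothetical discounting factor (assumption, not from the paper)
GAMMA = 100.0   # hypothetical distance scale (assumption, not from the paper)
CLUSTERS = (1, 2, 3, 4)


def simple_mass(distance, label):
    """Evidence of one neighbor: some mass on its cluster, the rest on Omega."""
    support = ALPHA * math.exp(-GAMMA * distance ** 2)
    mass = {c: 0.0 for c in CLUSTERS}
    mass[label] = support
    mass["omega"] = 1.0 - support
    mass["empty"] = 0.0
    return mass


def conjunctive(m1, m2):
    """Unnormalized conjunctive rule when focal sets are empty, singletons or Omega."""
    out = {"omega": m1["omega"] * m2["omega"]}
    for c in CLUSTERS:
        out[c] = m1[c] * m2[c] + m1[c] * m2["omega"] + m1["omega"] * m2[c]
    # Every other intersection is empty, so the remaining mass is conflict.
    out["empty"] = max(0.0, 1.0 - out["omega"] - sum(out[c] for c in CLUSTERS))
    return out


# Start from the vacuous mass function, the neutral element of the rule.
combined = {c: 0.0 for c in CLUSTERS}
combined.update({"omega": 1.0, "empty": 0.0})
for distance, label in neighbors:
    combined = conjunctive(combined, simple_mass(distance, label))

assigned = max(CLUSTERS, key=lambda c: combined[c])
print("assigned cluster:", assigned)
print("conflict m(empty) = %.4f" % combined["empty"])
```

Under these assumed parameters the nine neighbors from cluster 4 outweigh the six from cluster 2, and most of the mass ends up on the empty set, which is the signature of a border point discussed above.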

Figure 6.

Contour surfaces of the clustering results.

Example 3: In this example, the overlapping dataset fourclass [28], which has four clusters with 100 points each, is used to show the influence of the parameters $K$ and $q$ on the clustering performance. We consider $K\in\{10,20,\ldots,90\}$ and $q\in\{0.1,0.2,\ldots,0.9\}$ . Figure 7 displays the number of detected cluster centers and the ARI values versus $K$ and $q$ . Fig. 7a shows that the true number of cluster centers is always detected when $q$ is 0.8 or 0.9, whereas the algorithm fails to find the right number of cluster centers for certain values of $K$ when $q$ varies from 0.1 to 0.7. To further study the influence of $q$ , Fig. 7b displays, for each $q$ , the mean and standard deviation of the ARI over the different values of $K$ . The best clustering performance is achieved when $q$ is 0.8 or 0.9, where the mean ARI is higher and its standard deviation is lower. In the following experiments, $q$ is set to 0.8.
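
The ARI used in Fig. 7 and in the following experiments compares a clustering with the true labels. As a reminder of how it is computed, the following Python sketch implements the standard pair-counting form of the adjusted Rand index; the function name and the toy labels are ours and are not taken from the experiments.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index computed from the contingency table of two partitions."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both label lists must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pair of points to disagree on

    # Pairs of points grouped together in both, in the true, and in the predicted partition.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / comb(n, 2)   # agreement expected by chance
    maximum = 0.5 * (sum_true + sum_pred)         # largest attainable agreement
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both partitions
    return (sum_cells - expected) / (maximum - expected)


# Toy labels: the index ignores how clusters are named and penalizes random splits.
same_partition = adjusted_rand_index([0, 0, 1, 1], [7, 7, 3, 3])
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same_partition, crossed_partition)
```

An ARI of 1 means the two partitions are identical up to a renaming of the clusters, while values near 0 or below indicate agreement no better than chance.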

Figure 7.

Clustering performance via different parameters.

4.2 Experiments on synthetic and real-world datasets

In this subsection, ALD-EKNN is tested on synthetic and real-world datasets whose characteristics are shown in Table 3. There are 7 synthetic datasets and 7 real-world datasets, all widely used in the literature [20, 40, 11, 10, 21, 33]. a3, d31 and s2 have mild overlapping. dim1024 has 16 clusters in a very high-dimensional space. s4 has 15 clusters with severe overlapping. Unbalance has 8 clusters with multiple densities. All the real-world datasets are multi-dimensional, with 4 to 30 attributes.

To show the advantage of our assigning strategy, after detecting the cluster centers through the adaptive local density strategy, we assign the remaining points with different methods. Firstly, we compare EKNN with distance sorting (EKNN-Dis for short) against EKNN with density sorting (EKNN-Den for short). Secondly, we compare EKNN-Dis with the assigning strategy of ADPC-KNN, which assigns the remaining points to their nearest centers (NC for short). Figure 8 plots the ARI values obtained by EKNN-Dis against those obtained by EKNN-Den and NC. The two datasets dim1024 and unbalance are not shown in Fig. 8 because all three methods reach an ARI of 1 on them, so there is no difference among the methods. Figure 8a shows that EKNN-Dis outperforms EKNN-Den on 8 datasets, EKNN-Den outperforms EKNN-Dis on 2 datasets, and the two methods draw on parkinsons and iris. Fig. 8b shows that NC outperforms EKNN-Dis only on pima, EKNN-Dis outperforms NC on 9 datasets, and the two methods draw on d31 and parkinsons. Therefore, EKNN-Dis is superior to EKNN-Den and NC in most cases.
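
For reference, the NC baseline can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and already-detected centers; the function name and the toy points are hypothetical and do not come from the experiments.

```python
from math import dist


def assign_to_nearest_center(points, center_indices):
    """Give every point the position (0, 1, ...) of its closest cluster center."""
    centers = [points[i] for i in center_indices]
    return [
        min(range(len(centers)), key=lambda c: dist(point, centers[c]))
        for point in points
    ]


# Toy data: two tight groups, with points 0 and 2 taken as the detected centers.
toy_points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0)]
toy_labels = assign_to_nearest_center(toy_points, center_indices=[0, 2])
print(toy_labels)
```

NC looks only at the distance to the centers, so it uses neither the neighborhood of a point nor any measure of conflict, which is where it differs from the EKNN variants compared in Fig. 8.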

Table 3
Dataset description

Type	Dataset	Size	Attributes	Clusters	Source
Synthetic	a3	7500	2	50	[20]
	d31	3100	2	31	[40]
	dim1024	1024	1024	16	[40]
	R15	600	2	15	[40]
	s2	5000	2	15	[10]
	s4	5000	2	15	[10]
	Unbalance	6500	2	8	[33]
Real-world	Seeds	210	7	3	[21]
	Wdbc	569	30	2	[21]
	Iris	150	4	3	[21]
	Pima	768	8	2	[21]
	Wine	178	13	3	[21]
	Parkinsons	195	23	2	[21]
	Waveform	5000	21	3	[21]

Figure 8.

ARI values via different assigning methods.

Table 4

ARI values: Comparison between our algorithm and several other algorithms ${}^{1}$

Datasets	SOM	DPC	ADPC-KNN	DPC-KNN	FKNN-DPC	ALD-EKNN
a3	0.7592	0.9800	0.9700	0.9755	0.9538 (60)	0.9956 (70)
d31	0.7273	0.9364	0.9400	0.9358	0.9382 (50)	0.9491 (60)
dim1024	0.8790	1.0000	1.0000	1.0000	1.0000	1.0000 (60)
R15	0.8154	0.9821	0.9928	0.9928	0.9928	0.9928 (30)
s2	0.7254	0.9348	0.9419 (30)	0.9286	0.9240	0.9423 (40)
s4	0.5369	0.6333	0.6300	0.6268	0.6353 (80)	0.6501 (40)
Unbalance	0.8227	0.9884	1.0000	1.0000	0.9884 (80)	1.0000 (15)
Seeds	0.6514	0.6531	0.7700	0.7076	0.7900	0.7529 (12)
Wdbc	0.7253	0.4705	0.0000 (5)	0.5175	0.7860	0.8306 (10)
Iris	0.7060	0.7600	0.7600	0.7076	0.9220	0.8857 (15)
Pima	0.0912	0.0140	0.0200	0.0000	0.0130	0.0630 (80)
Wine	0.0058	0.6990	0.4536 (8)	0.7128	0.8520	0.8837 (25)
Parkinsons	0.1338	0.3910	0.2566 (5)	0.0266	0.3910	0.4135 (45)
Waveform	0.2588	0.2669	0.2500	0.2516	0.3500	0.4449 (15)

${}^{1}$ The value in brackets next to an ARI value is the parameter $K$ used; it is given for ALD-EKNN on all datasets and for the other algorithms on the datasets whose results were missing.

For comparison, other clustering algorithms, namely the Self-Organizing Map (SOM) [29], DPC [35] and several DPC successors, including ADPC-KNN [45], FKNN-DPC [43] and DPC-KNN [7], are applied to these datasets. Table 4 shows the ARI values of the six clustering algorithms on both the synthetic and the real-world datasets. The bold font marks the best result on each dataset. The parameter $K$ of our algorithm is given in Table 4 and the quantile $q$ is set to 0.8. The number of iterations of SOM is set to 100 and the number of clusters is assumed to be known. The parameters of DPC-KNN are specified following the suggestions in [7]. $p$ is set to 1% in DPC. The ARI values of FKNN-DPC and ADPC-KNN are taken from [43, 45]; values missing from those references were computed by us, and the parameters used are shown next to the ARI value. As shown in Table 4, ALD-EKNN achieves the best result on all 14 datasets except seeds, iris and pima. FKNN-DPC has the largest ARI on four datasets: dim1024, R15, seeds and iris. SOM has the best ARI only on pima and the worst ARI on 11 datasets. ADPC-KNN and DPC-KNN each have the best ARI on three datasets: dim1024, unbalance and R15. DPC has the best ARI only on dim1024.

To further compare these algorithms, a combination of the Friedman test and the Nemenyi test is applied. The Friedman test requires a ranking table, so we rank the algorithms on each dataset by their ARI values. Algorithms with the same ARI value share the average of the ranks they occupy; for example, on dim1024 five algorithms have the same ARI, so each of them receives rank 3, computed as $(1+2+3+4+5)/5$ . The resulting ranking over all datasets is given in Table 5. The Friedman test checks whether there are significant differences among the algorithms. The significance level is set to 0.05 in this case. The statistic $\tau_{F}$ follows the F-distribution with $k-1$ and $(k-1)(N-1)$ degrees of freedom and is computed as

$\displaystyle\tau_{F}=\frac{(N-1)\tau_{\chi^{2}}}{N(k-1)-\tau_{\chi^{2}}},$ (23)

where $k$ is the number of algorithms and $N$ is the number of datasets. The statistic $\tau_{\chi^{2}}$ follows the $\chi^{2}$ -distribution with $k-1$ degrees of freedom and is computed as

$\displaystyle\tau_{\chi^{2}}=\frac{12N}{k(k+1)}\left(\sum_{i=1}^{k}r_{i}^{2}-\frac{k(k+1)^{2}}{4}\right),$ (24)

where $r_{i}$ is the average rank of the $i$ th algorithm.
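
The ranking rule and Eq. (24) can be checked with a short Python sketch. The function names are ours; the ARI rows are copied from Table 4 and the average ranks from the last row of Table 5.

```python
def average_ranks(scores):
    """Rank scores in decreasing order (rank 1 = best); ties share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def friedman_chi2(avg_ranks, n_datasets):
    """Eq. (24): Friedman chi-square statistic from the average ranks."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


# Column order: SOM, DPC, ADPC-KNN, DPC-KNN, FKNN-DPC, ALD-EKNN.
dim1024_ranks = average_ranks([0.8790, 1.0, 1.0, 1.0, 1.0, 1.0])
a3_ranks = average_ranks([0.7592, 0.9800, 0.9700, 0.9755, 0.9538, 0.9956])

# Average ranks as reported in the last row of Table 5, with N = 14 datasets.
chi2 = friedman_chi2([5.29, 3.64, 3.50, 4.11, 2.86, 1.61], n_datasets=14)
print(dim1024_ranks, a3_ranks, round(chi2, 2))
```

The tie on dim1024 gives rank 3 to each of the five tied algorithms, and the rounded average ranks reproduce $\tau_{\chi^{2}}\approx 30.59$.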

In this case, the critical value is 2.550 for $k=6$ and $N=14$ . According to Eqs (24) and (23), $\tau_{\chi^{2}}$ and $\tau_{F}$ are 30.59 and 7.45, respectively. The Friedman test rejects the null hypothesis since $\tau_{F}$ (7.45) is larger than the critical value (2.550), which means that the assumption of no significant difference among the algorithms does not hold. Since there are significant differences among the six algorithms, we perform a post-hoc Nemenyi test to find out how ALD-EKNN differs from the other algorithms. The critical difference is 2.02 here, computed by Eq. (25),

$\displaystyle CD=q_{\alpha}\sqrt{\frac{k(k+1)}{6N}},$ (25)

where $q_{\alpha}$ is taken from Table 5(a) of reference [5] with the significance level $\alpha$ set to 0.05. As shown in Fig. 9, ALD-EKNN, FKNN-DPC and ADPC-KNN are covered by the red line, which means that these three algorithms do not differ significantly. However, ALD-EKNN differs significantly from DPC, DPC-KNN and SOM, since the differences in their average ranks exceed the critical difference. In addition, since our algorithm has the lowest average rank, it has the best overall performance among these algorithms.
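
The critical difference and the resulting comparison can be reproduced with the following Python sketch. It assumes $q_{\alpha}=2.850$, the tabulated value for six algorithms at $\alpha=0.05$, and uses the rounded average ranks of Table 5; the names are ours.

```python
import math

# Average ranks from the last row of Table 5.
AVERAGE_RANKS = {
    "SOM": 5.29,
    "DPC": 3.64,
    "ADPC-KNN": 3.50,
    "DPC-KNN": 4.11,
    "FKNN-DPC": 2.86,
    "ALD-EKNN": 1.61,
}
N_DATASETS = 14
Q_ALPHA = 2.850  # assumed tabulated Nemenyi value for k = 6, alpha = 0.05


def critical_difference(q_alpha, k, n):
    """Eq. (25): smallest difference in average rank that is significant."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


cd = critical_difference(Q_ALPHA, len(AVERAGE_RANKS), N_DATASETS)
reference = AVERAGE_RANKS["ALD-EKNN"]

# Algorithms whose average rank is worse than ALD-EKNN by more than CD.
significant = sorted(
    name
    for name, rank in AVERAGE_RANKS.items()
    if name != "ALD-EKNN" and rank - reference > cd
)
print(round(cd, 2), significant)
```

With these inputs the critical difference rounds to 2.02, and DPC, DPC-KNN and SOM fall outside it while FKNN-DPC and ADPC-KNN do not, in line with Fig. 9.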

Table 5

Performance ranking of our algorithm and several other algorithms

Datasets	SOM	DPC	ADPC-KNN	DPC-KNN	FKNN-DPC	ALD-EKNN
a3	6	2	4	3	5	1
d31	6	4	2	5	3	1
dim1024	6	3	3	3	3	3
R15	6	5	2.5	2.5	2.5	2.5
s2	6	3	2	4	5	1
s4	6	3	4	5	2	1
Unbalance	6	4.5	2	2	4.5	2
Seeds	6	5	2	4	1	3
Wdbc	3	5	6	4	2	1
Iris	6	3.5	3.5	5	1	2
Pima	1	4	3	6	5	2
Wine	6	3	5	4	2	1
Parkinsons	6	2	4	5	2	1
Waveform	4	3	6	5	2	1
Average	5.29	3.64	3.50	4.11	2.86	1.61

Figure 9.

Nemenyi test: Comparison between our algorithm and several other algorithms.

5. Conclusion

A novel clustering algorithm based on adaptive local density and the evidential $K$ -nearest neighbors rule is proposed here. Our strategy has several merits. Firstly, ALD-EKNN can handle both multi-density and invariable-density datasets through the adaptive local density strategy. Secondly, through the credal partition, which can mine the ambiguity and ignorance information of the data structure, ALD-EKNN can detect border points and noise points simultaneously. Additionally, based on the EKNN rule with distance sorting, ALD-EKNN reduces the risk of wrong assignment and thereby improves the clustering performance. Experiments on synthetic and real-world datasets show that ALD-EKNN is superior to DPC and some of its successors in terms of clustering results.

However, ALD-EKNN has a high time complexity when confronted with big data. Future studies are needed to address this limitation.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 51976032).

References

1. Yibang Ruan, Yanshan Xiao, Zhifeng Hao, A.C. and Bo Liu, A nearest-neighbor search model for distance metric learning, Information Sciences 552 (2021), 261–277.
2. Arthur and Vassilvitskii, k-means++: The advantages of careful seeding, in: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1027–1035, Society for Industrial and Applied Mathematics, 2007.
3. Chen and Schizas, I.D., Distributed information-based clustering of heterogeneous sensor data, Signal Processing 126 (2016), 35–51.
4. Davis, J.V., Kulis, Jain, Sra and Dhillon, I.S., Information-theoretic metric learning, in: Machine Learning, Proceedings of the Twenty-Fourth International Conference (ICML 2007), Corvallis, Oregon, USA, June 20–24, 2007, 2007.
5. Demšar and Schuurmans, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research 7(1) (2006), 1–30.
6. Denœux and Masson, M.-H., EVCLUS: Evidential clustering of proximity data, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 34(1) (2004), 95–109.
7. Ding and Jia, Study on density peaks clustering based on k-nearest neighbors and principal component analysis, Knowledge-Based Systems 99 (2016), 135–145.
8. Ding and Xue, A novel density peaks clustering algorithm for mixed data, Pattern Recognition Letters 97 (2017), 46–53.
9. Ester, Kriegel, H.-P., Sander et al., A density-based algorithm for discovering clusters in large spatial databases with noise, in: KDD, Vol. 96, pages 226–231, 1996.
10. Fränti and Virmajoki, Iterative shrinking method for clustering problems, Pattern Recognition 39(5) (2006), 761–775.
11. Fränti, Virmajoki and Hautamäki, Fast agglomerative clustering using a k-nearest neighbor graph, IEEE Transactions on Pattern Analysis and Machine Intelligence 28(11) (2006), 1875–1881.
12. Geng, Y.-a., Zheng, Zhuang and Xiong, RECOME: A new density-based clustering algorithm using relative KNN kernel density, Information Sciences 436 (2016), 13–30.
13. Gong, Z.G., Wang, P.H. and Wang, Cumulative belief peaks evidential k-nearest neighbor clustering, Knowledge-Based Systems 200 (2020), 105982.
14. Greche, Jazouli, Es-Sbai, Majda and Zarghili, Comparison between Euclidean and Manhattan distance measure for facial expressions classification, in: International Conference on Wireless Technologies, 2017.
15. Jain, A.K., Data clustering: 50 years beyond k-means, Pattern Recognition Letters 31(8) (2010), 651–666.
16. Jajoo, Kumar, Yadav, S.K., Adhikari and Kumar, Blind signal modulation recognition through clustering analysis of constellation signature, Expert Systems with Applications 90 (2017), 13–22.
17. Jiang, Zhong-Yang, Yu-Fang, Yong and Jie, Density core-based clustering algorithm with dynamic scanning radius, Knowledge-Based Systems 142 (2018), 58–70.
18. Kamimura and Uchida, Greedy network-growing by Minkowski distance functions, in: IEEE International Joint Conference on Neural Networks, 2004.
19. Kanungo, Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman and Wu, A.Y., An efficient k-means clustering algorithm: Analysis and implementation, IEEE Transactions on Pattern Analysis and Machine Intelligence (7) (2002), 881–892.
20. Kärkkäinen and Fränti, Dynamic local search for clustering with unknown number of clusters, in: Object Recognition Supported by User Interaction for Service Robots, Vol. 2, pages 240–243, IEEE, 2002.
21. Lichman et al., UCI machine learning repository, 2013.
22. Likas, Vlassis and Verbeek, J.J., The global k-means clustering algorithm, Pattern Recognition 36(2) (2003), 451–461.
23. Liu, Pham, T.D., Yan and Liang, Fuzzy mixed-prototype clustering algorithm for microarray data analysis, Neurocomputing 276 (2018), 42–54.
24. Liu, Wang and Yu, Shared-nearest-neighbor-based clustering by fast search and find of density peaks, Information Sciences 450 (2018), 200–226.
25. Liu, Z.-G., Dezert, Mercier and Pan, Belief c-means: An extension of fuzzy c-means algorithm in belief functions framework, Pattern Recognition Letters 33(3) (2012), 291–300.
26. Liu, Z.-G., Pan and Dezert, Evidential classifier for imprecise data based on belief functions, Knowledge-Based Systems 52 (2013), 246–257.
27. Liwei, Yan and Jufu, On the Euclidean distance of images, IEEE Transactions on Pattern Analysis and Machine Intelligence 27(8) (2005), 1334–1339.
28. Masson, M.-H. and Denœux, ECM: An evidential version of the fuzzy c-means algorithm, Pattern Recognition 41(4) (2008), 1384–1397.
29. Obermayer, Blasdel, G.G. and Schulten, Statistical-mechanical analysis of self-organization and pattern formation during the development of visual maps, Physical Review A 45(10) (1992), 7568–7589.
30. Pagnuco, I.A., Pastore, J.I., Abras, Brun and Ballarin, V.L., Analysis of genetic association using hierarchical clustering and cluster validation indices, Genomics 109(5–6) (2017), 438–445.
31. Parmar, Wang, Zhang, Tan, A.-H., Miao, Jiang and Zhou, REDPC: A residual error-based density peak clustering algorithm, Neurocomputing 348 (2018), 82–96.
32. Ping, Wang, Huang and Li, Adjustable preference affinity propagation clustering, Pattern Recognition Letters 85 (2017), 72–78.
33. Rezaei and Fränti, Set matching measures for external cluster validity, IEEE Transactions on Knowledge and Data Engineering 28(8) (2016), 2173–2186.
34. Rodrigues, E.O., Combining Minkowski and Cheyshev: New distance proposal and survey of distance metrics using k-nearest neighbours classifier, Pattern Recognition Letters 110 (2018), 66–71.
35. Rodriguez and Laio, Clustering by fast search and find of density peaks, Science 344(6191) (2014), 1492–1496.
36. Shafer, A mathematical theory of evidence, Vol. 42, Princeton University Press, 1976.
37. Smets, Decision making in the TBM: The necessity of the pignistic transformation, International Journal of Approximate Reasoning 38(2) (2005), 133–147.
38. and Denœux, BPEC: Belief-peaks evidential clustering, IEEE Transactions on Fuzzy Systems 27(1) (2019), 111–123.
39. Van Pham, Pham, L.T., Nguyen, T.D. and Ngo, L.T., A new cluster tendency assessment method for fuzzy co-clustering in hyperspectral image analysis, Neurocomputing 307 (2018), 213–226.
40. Veenman, C.J., Reinders, M.J.T. and Backer, A maximum variance cluster algorithm, IEEE Transactions on Pattern Analysis and Machine Intelligence 24(9) (2002), 1273–1280.
41. Vinh, N.X., Epps and Bailey, Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance, JMLR.org, 2010.
42. Xian, Tie, Guan and Rao, Quasi-cluster centers clustering algorithm based on potential entropy and t-distributed stochastic neighbor embedding, Soft Computing (44) (2018), 1–13.
43. Xie, Gao, Xie, Liu and Grant, P.W., Robust clustering by detecting density peaks and assigning points based on fuzzy weighted k-nearest neighbors, Information Sciences 354 (2016), 19–40.
44. Wang and Deng, DenPEHC: Density peak based efficient hierarchical clustering, Information Sciences 373 (2016), 200–218.
45. Yaohui, Zhengming and Fang, Adaptive density peak clustering based on k-nearest neighbors with aggregating strategy, Knowledge-Based Systems 133 (2017), 208–220.
46. Zhong, Zheng and Fu, SLMOML: Online metric learning with global convergence, IEEE Transactions on Circuits and Systems for Video Technology 28(10) (2017), 2460–2472.
