An improved fuzzy C-means clustering algorithm using Euclidean distance function

Abstract

The fuzzy C-means (FCM) clustering algorithm is a typical algorithm that uses the Euclidean distance for data clustering, and it is also one of the most popular fuzzy clustering algorithms. However, FCM does not perform well in noisy environments because of its probabilistic constraint. To improve the clustering accuracy of item varieties, an improved fuzzy C-means (IFCM) clustering algorithm is proposed in this paper. IFCM uses a Euclidean distance function as a new distance measure, which gives small weights to noisy data and large weights to compact data. FCM, possibilistic C-means (PCM) clustering, possibilistic fuzzy C-means (PFCM) clustering and IFCM are run to compare their clustering effects on several data samples. The clustering accuracies of IFCM on six datasets (IRIS, IRIS3D, IRIS2D, Wine, Meat and Apple) reach 92.7%, 92.0%, 90.7%, 81.5%, 94.2% and 88.0% respectively, which are the highest among the four algorithms. The simulation results show that IFCM has better robustness, higher clustering accuracy and better clustering centers, and that it can successfully cluster item varieties.

Keywords

Fuzzy clustering, FCM, PCM, Euclidean distance, distance function

1 Introduction

As a significant branch of pattern recognition [1], cluster analysis divides data into several classes based on their properties, so that data in the same cluster are as similar as possible and data in different clusters are as different as possible [2]. However, in practical problems, events are often accompanied by ambiguity and there are usually no clear boundaries between things. Fuzzy clustering can therefore reflect the objective world more effectively, and it is extensively applied in many scientific studies such as image recognition [3–6], machine vision [7–9] and text recognition [10–12].

The fuzzy clustering algorithm uses the distance measure function to calculate the distances from each sample data point to the clustering centers, so as to divide the sample data point into different clusters. Therefore, selecting a suitable distance measure function to apply to the fuzzy clustering algorithm can greatly improve the clustering performance of the algorithm.

However, considering the large amount of data on the network, the traditional fuzzy clustering algorithm cannot always present the classification results with high accuracy. To improve the clustering accuracy of data samples, an improved FCM clustering (IFCM) algorithm is proposed in this study. IFCM replaces the Euclidean distance in the original algorithm with a Euclidean distance function. The Euclidean distance function gives small weights to noisy data and large weights to compact data, so that the interference of noisy data is restrained and better clustering results are achieved. We carried out clustering tests on several datasets, and the results showed that the IFCM clustering algorithm can effectively classify data with high accuracy.

The remaining sections of this paper are organized as follows. A literature review of the formation and development of fuzzy clustering is provided in Section 2. Section 3 briefly introduces some clustering algorithms. In Section 4, the IFCM clustering algorithm is proposed. Section 5 carries out clustering experiments and describes the results. Finally, we make a summary and put forward further research in Section 6.

2 Literature review

In 1966, Bellman, Kalaba and Zadeh first introduced the concept of fuzzy sets into clustering algorithms [13]. In 1970, Ruspini first systematically proposed the fuzzy clustering algorithm based on minimizing a fuzzy objective function [14]. In 1987, Bezdek introduced the concept of the weighted index in the membership degree and proved the convergence of fuzzy C-means (FCM) clustering [15], and the FCM algorithm has developed rapidly since then. However, FCM has the disadvantage of being sensitive to noisy data. To overcome this shortcoming of FCM, Krishnapuram and Keller abandoned the probability constraints of FCM and constructed the possibilistic C-means (PCM) clustering [16]. PCM can cluster data containing noisy points or outliers, but it requires suitable initial cluster centers. Otherwise, it will lead to coincident clusters. In 2005, Pal et al. put forward possibilistic fuzzy C-means (PFCM) clustering based on the respective advantages of FCM and PCM [17]. Unfortunately, PFCM cannot always maintain excellent clustering performance, especially when the cluster sizes vary.

The distance in the FCM algorithm is usually measured by the Euclidean distance. When the sample space is within three dimensions, the Euclidean distance is the real distance from the sample points to the cluster centers. However, when feature similarity is high, using the Euclidean distance will magnify the effect of these features and reduce the clustering efficiency [18]. As a result, scholars have replaced the Euclidean distance in traditional FCM with different distance metrics, resulting in some improved algorithms [19–22]. In 2020, Gosain et al. attempted to use the Minkowski distance to solve the deficiency of FCM. They introduced an improved fuzzy possibilistic C-means (IFPCM) algorithm, applied this distance to FCM as well, and demonstrated that the Minkowski distance maintained effective coverage of convex data and multidimensional datasets [23]. IFPCM is robust to noisy and missing values and needs fewer iterations when processing data samples. In 2021, Zhao et al. proposed a general FCM clustering algorithm based on contraction mapping (cGFCM), which could be applied to the more general cases of using the Minkowski metric as the similarity measure [24]. The results showed that cGFCM could produce the highest accuracy and F-score compared with the other clustering algorithms. In 2022, Gao et al. designed an adaptive elastic distance based on membership and applied it to elastic fuzzy C-means (EFCM) to better identify intrinsic cluster structures. The adaptive elastic distance is like an elastic net that adaptively finds reliable points [25].

3 Related work

In this section, we briefly introduce the FCM, PCM and PFCM algorithms used for comparison in this paper.

3.1 FCM clustering

Given an unlabeled data set X = {x₁, x₂, …, x_n} ⊂ R^P, FCM divides X into c (1 < c < n) fuzzy subsets by minimizing the following objective function: $J_{FCM} (U, V) = \sum_{i = 1}^{c} \sum_{k = 1}^{n} u_{ik}^{m} D_{ik}^{2}$ (1) where V = {v₁, v₂, …, v_c} is the cluster center matrix, and v_i is the ith cluster center. U = [u_ik]_c×n ( $0 ⩽ u_{ik} ⩽ 1, \sum_{i = 1}^{c} u_{ik} = 1$ ) is the fuzzy membership matrix, and u_ik is the membership value of data point x_k in the cluster with center v_i. c is the number of clusters and n is the number of sample data points. The distance D_ik = ∥x_k - v_i∥ is the Euclidean distance from x_k to cluster center v_i. m (1 < m < ∞) is the weight index. Minimizing the objective function of FCM under the constraint conditions gives the following update equations: $u_{ik} = {[\sum_{j = 1}^{c} {(\frac{D_{ik}}{D_{jk}})}^{\frac{2}{m - 1}}]}^{- 1}$ (2) $v_{i} = \frac{\sum_{k = 1}^{n} u_{ik}^{m} x_{k}}{\sum_{k = 1}^{n} u_{ik}^{m}}$ (3)
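As an illustration, one FCM iteration following Equations (2) and (3) can be sketched in Python with NumPy. The function name `fcm_step`, the small guard `eps` against zero distances and the demo data are assumptions made for this sketch; they are not taken from the experiments in this paper.

```python
import numpy as np

def fcm_step(X, V, m=2.0, eps=1e-12):
    """One FCM iteration: Equation (2) for U, then Equation (3) for V.

    X has shape (n, p) with one sample per row; V has shape (c, p).
    """
    # D[i, k] is the Euclidean distance from sample x_k to center v_i.
    D = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
    D = np.maximum(D, eps)  # assumption: avoid division by zero
    # Equation (2): u_ik = 1 / sum_j (D_ik / D_jk)^(2 / (m - 1)).
    ratio = (D[:, None, :] / D[None, :, :]) ** (2.0 / (m - 1.0))
    U = 1.0 / ratio.sum(axis=1)
    # Equation (3): each center is the u^m-weighted mean of the samples.
    Um = U ** m
    V_new = (Um @ X) / Um.sum(axis=1, keepdims=True)
    return U, V_new

# Hypothetical toy data: two tight groups of two points each.
X_demo = np.array([[0.0, 0.0], [0.2, 0.1], [4.0, 4.0], [4.1, 3.9]])
V_demo = np.array([[0.5, 0.5], [3.5, 3.5]])
U_demo, V_next = fcm_step(X_demo, V_demo)
```

Each column of `U_demo` sums to one, which is the probabilistic constraint that makes FCM assign a membership of 0.5 to a noise point that is equally far from two centers.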

3.2 PCM clustering

PCM relaxes the normalization condition of fuzzy membership degree and specifies t_ik ∈ [0, 1], which solves the problem of FCM algorithm sensitivity to noise. In FCM, u_ik represents the fuzzy membership degree of sample x_k to the cluster center v_i, while t_ik generated by PCM represents the typical value of sample x_k to the cluster center v_i. Given an unlabeled data set X = { x₁, x₂, …, x_n } ⊂ R^P, the objective function of PCM is shown as follows: $J_{PCM} (T, V) = \sum_{i = 1}^{c} \sum_{k = 1}^{n} t_{ik}^{w} D_{ik}^{2} + \sum_{i = 1}^{c} γ_{i} \sum_{k = 1}^{n} {(1 - t_{ik})}^{w}$ (4)

Where T = [t_ik]_c×n is the typical value matrix. The typical value t_ik is between 0 and 1. w (1 < w < ∞) is the weight index. γ_i is the penalty parameter: $γ_{i} = K \frac{\sum_{k = 1}^{n} u_{ik}^{m} D_{ik}^{2}}{\sum_{k = 1}^{n} u_{ik}^{m}}$ (5)

Here, K is usually chosen to be 1. γ_i is generally obtained by using the training results of FCM when setting the initial value of PCM algorithm.

The values of t_ik and v_i that minimize Equation (4) are calculated by the gradient method, and the iterative formulas of t_ik and v_i are as follows: $t_{ik} = \frac{1}{1 + {(\frac{D_{ik}^{2}}{γ_{i}})}^{\frac{1}{w - 1}}}$ (6) $v_{i} = \frac{\sum_{k = 1}^{n} t_{ik}^{w} x_{k}}{\sum_{k = 1}^{n} t_{ik}^{w}}$ (7)
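The penalty parameter and the typical values can be sketched as follows. This sketch assumes the usual PCM form in which both γ_i and t_ik are computed from the squared distances D_ik², matching the D_ik² term of the objective function in Equation (4). The function names and the demo arrays are hypothetical.

```python
import numpy as np

def pcm_gamma(U, D2, m=2.0, K=1.0):
    """Penalty parameter gamma_i from FCM memberships U and squared distances D2."""
    Um = U ** m
    return K * (Um * D2).sum(axis=1) / Um.sum(axis=1)

def pcm_typicality(D2, gamma, w=2.0):
    """Typical values t_ik; unlike u_ik, the columns need not sum to one."""
    return 1.0 / (1.0 + (D2 / gamma[:, None]) ** (1.0 / (w - 1.0)))

# Hypothetical squared distances (c = 2 centers, n = 3 samples) and memberships.
D2_demo = np.array([[0.0, 1.0, 4.0], [4.0, 1.0, 0.0]])
U_demo = np.array([[0.9, 0.5, 0.1], [0.1, 0.5, 0.9]])
gamma_demo = pcm_gamma(U_demo, D2_demo)
T_demo = pcm_typicality(D2_demo, gamma_demo)
```

A sample whose squared distance equals γ_i receives a typical value of exactly 0.5, so γ_i acts as the scale at which a point stops being typical of cluster i.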

3.3 PFCM clustering

PFCM combines PCM and FCM to better identify data clusters by generating fuzzy membership and typical values, so it has a better clustering effect than them. Given an unlabeled data set X ={ x₁, x₂, …, x_n } ⊂R^P, the objective function of PFCM is shown as follows:

$\begin{matrix} J_{PFCM} (U, T, V) = \sum_{i = 1}^{c} \sum_{k = 1}^{n} ({au}_{ik}^{m} + {bt}_{ik}^{w}) D_{ik}^{2} \\ + \sum_{i = 1}^{c} γ_{i} \sum_{k = 1}^{n} {(1 - t_{ik})}^{w} \end{matrix}$ (8)

Where U = [u_ik]_c×n and T = [t_ik]_c×n respectively represent the fuzzy membership matrix and the typical value matrix. The fuzzy membership u_ik and the typical value t_ik are between 0 and 1. The fuzzy weight indexes satisfy m, w > 1 and the coefficients satisfy a, b > 0. γ_i is the penalty parameter of Equation (5) and is calculated by running FCM. Minimizing the objective function gives the fuzzy membership u_ik, the typical value t_ik and the cluster centers v_i as follows: $u_{ik} = {[\sum_{j = 1}^{c} {(\frac{D_{ik}}{D_{jk}})}^{\frac{2}{m - 1}}]}^{- 1}$ (9) $t_{ik} = \frac{1}{1 + {(\frac{b}{γ_{i}} D_{ik}^{2})}^{\frac{1}{w - 1}}}$ (10) $v_{i} = \frac{\sum_{k = 1}^{n} ({au}_{ik}^{m} + {bt}_{ik}^{w}) x_{k}}{\sum_{k = 1}^{n} ({au}_{ik}^{m} + {bt}_{ik}^{w})}$ (11)
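A single PFCM iteration can be sketched in the same way. The sketch assumes that the membership exponent is 2/(m − 1) on the distance ratio, as in Equation (2), and that the typical value uses the squared distance. The name `pfcm_step`, the guard `eps`, the demo data and the demo value of γ_i are assumptions for illustration only.

```python
import numpy as np

def pfcm_step(X, V, gamma, m=2.0, w=2.0, a=1.0, b=1.0, eps=1e-12):
    """One PFCM iteration returning memberships U, typical values T and centers."""
    # D2[i, k] is the squared Euclidean distance from x_k to v_i.
    D2 = np.sum((X[None, :, :] - V[:, None, :]) ** 2, axis=2)
    D2 = np.maximum(D2, eps)  # assumption: avoid division by zero
    # Membership: (D_ik / D_jk)^(2/(m-1)) equals (D2_ik / D2_jk)^(1/(m-1)).
    inv = D2 ** (-1.0 / (m - 1.0))
    U = inv / inv.sum(axis=0, keepdims=True)
    # Typical value with coefficient b and penalty parameter gamma_i.
    T = 1.0 / (1.0 + (b * D2 / gamma[:, None]) ** (1.0 / (w - 1.0)))
    # Centers: samples weighted by a * u^m + b * t^w.
    W = a * U ** m + b * T ** w
    V_new = (W @ X) / W.sum(axis=1, keepdims=True)
    return U, T, V_new

# Hypothetical toy data; gamma would normally come from a preliminary FCM run.
X_demo = np.array([[0.0, 0.0], [0.2, 0.1], [4.0, 4.0], [4.1, 3.9], [2.0, 9.0]])
V_demo = np.array([[0.5, 0.5], [3.5, 3.5]])
gamma_demo = np.array([1.0, 1.0])
U_demo, T_demo, V_next = pfcm_step(X_demo, V_demo, gamma_demo)
```

In this sketch the far point (2, 9) still receives memberships that sum to one, but its typical values for both clusters are small, which is how PFCM reduces its influence on the centers.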

4 Improved FCM (IFCM) clustering algorithm

4.1 A new distance metric

The traditional FCM usually uses Euclidean distance, but its clustering effect is poor in a noisy environment. In this section, we define a Euclidean distance function d (x, y) that gives less weight to noisy data and more weight to normal data, making the mean value more robust. $d (x, y) = 1 - {(1 + \frac{{∥ x - y ∥}^{2}}{ρ})}^{- 1}$ (12)

Where ρ (ρ > 0) is a coefficient and we know that one distance metric d (x, y) should satisfy the following three conditions [26]: $d (x, y) > 0, \forall x \neq y, d (x, x) = 0$ (13) $d (x, y) = d (y, x)$ (14) $d (x, y) ⩽ d (x, z) + d (z, y)$ (15)

Proof. Equations (13) and (14) follow directly from Equation (12), so only Equation (15) remains to be shown. $\begin{matrix} d (x, z) + d (z, y) - d (x, y) \\ = 1 - {(1 + \frac{{∥ x - z ∥}^{2}}{ρ})}^{- 1} - {(1 + \frac{{∥ z - y ∥}^{2}}{ρ})}^{- 1} \\ + {(1 + \frac{{∥ x - y ∥}^{2}}{ρ})}^{- 1} \\ ⩾ 1 - {(1 + \frac{{∥ x - z ∥}^{2}}{ρ})}^{- 1} - {(1 + \frac{{∥ z - y ∥}^{2}}{ρ})}^{- 1} \\ + {(1 + \frac{{∥ x - z ∥}^{2} + {∥ z - y ∥}^{2}}{ρ})}^{- 1} \end{matrix}$ Here the inequality step assumes ${∥ x - y ∥}^{2} ⩽ {∥ x - z ∥}^{2} + {∥ z - y ∥}^{2}$.

To make the proof easier, we set $\frac{{∥ x - z ∥}^{2}}{ρ} = a$ , $\frac{{∥ z - y ∥}^{2}}{ρ} = b$ , so the above formula is expressed as: $\begin{matrix} d (x, z) + d (z, y) - d (x, y) \\ ⩾ 1 - {(1 + a)}^{- 1} - {(1 + b)}^{- 1} + {(1 + a + b)}^{- 1} \\ ⩾ 1 - {(1 + a)}^{- 1} - {(1 + b)}^{- 1} + {(1 + a + b + ab)}^{- 1} \\ = 1 - \frac{1 + a + b}{1 + a + b + ab} = \frac{ab}{1 + a + b + ab} ⩾ 0 \end{matrix}$ (16)

Thus d (x, y) ⩽ d (x, z) + d (z, y), Equation (15) is verified and we can claim that the Euclidean distance function is a metric.
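The behaviour of the distance function in Equation (12) can be illustrated numerically. The sketch below checks only non-negativity, d(x, x) = 0, symmetry and boundedness on a few points. The function name `ifcm_distance`, the choice ρ = 1 and the sample points are assumptions for illustration.

```python
import numpy as np

def ifcm_distance(x, y, rho=1.0):
    """Distance function of Equation (12): 1 - (1 + ||x - y||^2 / rho)^(-1)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    sq = float(np.sum(diff ** 2))
    return 1.0 - 1.0 / (1.0 + sq / rho)

center = [0.0, 0.0]
near = [1.0, 0.0]    # squared Euclidean distance 1 from the center
far = [0.0, 10.0]    # squared Euclidean distance 100 from the center

d_near = ifcm_distance(near, center)   # 1 - 1/2 = 0.5
d_far = ifcm_distance(far, center)     # 1 - 1/101 = 100/101
```

The squared Euclidean distances of the two points differ by a factor of 100, while the values of d differ by less than a factor of 2 because d never exceeds 1. This saturation is what limits the pull of a distant noisy point on a cluster center.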

4.2 IFCM clustering

The Euclidean distance in FCM objective function is replaced by the functional form of Euclidean distance. Given an unlabeled data set X = {x₁, x₂, …, x_n} ⊂R^P, the objective function of IFCM is as follows: $J_{IFCM} (U, V) = \sum_{i = 1}^{c} \sum_{k = 1}^{n} u_{ik}^{m} [1 - {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 1}]$ (17)

Where V = {v₁, v₂, …, v_c} is the cluster center matrix; v_i is the ith cluster center. U = [u_ik]_c×n is the fuzzy membership matrix, where u_ik is the membership value of data point x_k in the cluster with center v_i. c is the number of clusters, and n is the number of clustered data points. m (1 < m < ∞) is the weight index. The coefficient ρ in Equation (12) is replaced by the variance of the samples σ²: $σ^{2} = \frac{1}{n} \sum_{k = 1}^{n} {∥ x_{k} - \bar{x} ∥}^{2}, \bar{x} = \frac{1}{n} \sum_{j = 1}^{n} x_{j}$ (18)

The Lagrangian equation of Equation (17) is constructed.

$\begin{matrix} L (α, u_{ik}, v_{i}) = \sum_{i = 1}^{c} \sum_{k = 1}^{n} u_{ik}^{m} \\ [1 - {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 1}] - α (\sum_{i = 1}^{c} u_{ik} - 1) \end{matrix}$ (19)

Take the partial derivative with respect to α: $\frac{\partial L (α, u_{ik}, v_{i})}{\partial α} = - (\sum_{i = 1}^{c} u_{ik} - 1) = 0$ (20)

Solving Equation (20) gives the following equation: $\sum_{i = 1}^{c} u_{ik} = 1$ (21)

Take the partial derivative with respect to u_ik:

$\begin{matrix} \frac{\partial L (α, u_{ik}, v_{i})}{\partial u_{ik}} = {mu}_{ik}^{m - 1} \\ [1 - {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 1}] - α = 0 \end{matrix}$ (22)

Solving Equation (22) for u_ik gives the following equation: $u_{ik} = {(\frac{α}{m})}^{\frac{1}{m - 1}} {[1 - {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 1}]}^{- \frac{1}{m - 1}}$ (23)

Substituting Equation (23) into Equation (21), we get the following equation: $\sum_{i = 1}^{c} {(\frac{α}{m})}^{\frac{1}{m - 1}} {[1 - {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 1}]}^{- \frac{1}{m - 1}} = 1$ (24)

Solving Equation (24) for ${(\frac{α}{m})}^{\frac{1}{m - 1}}$ gives: ${(\frac{α}{m})}^{\frac{1}{m - 1}} = {\{\sum_{j = 1}^{c} {[1 - {(1 + \frac{{∥ x_{k} - v_{j} ∥}^{2}}{σ^{2}})}^{- 1}]}^{- \frac{1}{m - 1}}\}}^{- 1}$ (25)

Putting Equation (25) into Equation (23), we can get the formula of u_ik: $u_{ik} = {\{\sum_{j = 1}^{c} {[\frac{1 - {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 1}}{1 - {(1 + \frac{{∥ x_{k} - v_{j} ∥}^{2}}{σ^{2}})}^{- 1}}]}^{\frac{1}{m - 1}}\}}^{- 1}$ (26)

Take the partial derivative with respect to v_i:

$\begin{matrix} \frac{\partial L (α, u_{ik}, v_{i})}{\partial v_{i}} = - \frac{2}{σ^{2}} \sum_{k = 1}^{n} u_{ik}^{m} (x_{k} - v_{i}) \\ {[1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}}]}^{- 2} = 0 \end{matrix}$ (27)

Simplifying Equation (27) gives the following equation:

$\begin{matrix} \sum_{k = 1}^{n} u_{ik}^{m} x_{k} {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 2} \\ = \sum_{k = 1}^{n} u_{ik}^{m} v_{i} {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 2} \end{matrix}$ (28)

Based on Equation (28), we can get the formula of v_i: $v_{i} = \frac{\sum_{k = 1}^{n} u_{ik}^{m} {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 2} x_{k}}{\sum_{k = 1}^{n} u_{ik}^{m} {(1 + \frac{{∥ x_{k} - v_{i} ∥}^{2}}{σ^{2}})}^{- 2}}$ (29)

The specific algorithm steps of IFCM are described as follows:

Initialization

(1) Fix m and c, n > c > 1, + ∞ > m > 1;

(2) Set the initial value of the cycle r = 1, the maximum number of cycles r_max and the iteration threshold ɛ (ɛ > 0);

(3) Select the initial cluster centers V⁽⁰⁾.

Repeat

Step 1: Update membership value U^(r) by Equation (26).

Step 2: Update cluster centers V^(r) by Equation (29).

Until (∥ V^(r) - V^(r-1) ∥ < ɛ) or r > r_max, the iteration comes to an end; otherwise r = r + 1, and return to Step 1.
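The steps above can be sketched in Python with NumPy as follows. Two points are assumptions of this sketch rather than statements from the derivation: the right-hand side of Equation (29), which itself contains v_i, is evaluated with the centers from the previous iteration, and a small constant `tiny` guards against a sample that coincides with a center. The function name `ifcm` and the initial centers in the usage lines are hypothetical; the data are the X₁₂ coordinates listed in Table 2.

```python
import numpy as np

def ifcm(X, V0, m=2.0, eps=1e-5, r_max=100, tiny=1e-12):
    """Sketch of IFCM: alternate Equation (26) and Equation (29)."""
    X = np.asarray(X, dtype=float)
    V = np.asarray(V0, dtype=float).copy()
    # Equation (18): mean squared distance of the samples to their mean.
    sigma2 = np.mean(np.sum((X - X.mean(axis=0)) ** 2, axis=1))
    for r in range(1, r_max + 1):
        # sq[i, k] = ||x_k - v_i||^2 with the current centers.
        sq = np.sum((X[None, :, :] - V[:, None, :]) ** 2, axis=2)
        k = 1.0 / (1.0 + sq / sigma2)       # (1 + ||x - v||^2 / sigma^2)^(-1)
        d = np.maximum(1.0 - k, tiny)       # distance function of Equation (12)
        # Step 1, Equation (26): memberships normalized over the clusters.
        inv = d ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
        # Step 2, Equation (29): weights u^m * (1 + ...)^(-2), using the
        # previous centers inside the weights (assumption of this sketch).
        W = (U ** m) * k ** 2
        V_new = (W @ X) / W.sum(axis=1, keepdims=True)
        converged = np.linalg.norm(V_new - V) < eps
        V = V_new
        if converged:
            break
    return U, V

# X12 coordinates from Table 2; x6 and x12 are the noisy points.
X12 = np.array([[-5.00, 0.00], [-3.34, 1.67], [-3.34, 0.00], [-3.34, -1.67],
                [-1.67, 0.00], [0.00, 0.00], [1.67, 0.00], [3.34, 1.67],
                [3.34, 0.00], [3.34, -1.67], [5.00, 0.00], [0.00, 10.00]])
V_init = np.array([[-1.0, 0.0], [1.0, 0.0]])  # hypothetical initial centers
U12, V12 = ifcm(X12, V_init, m=2.0)
```

The factor `k ** 2` in the weights is what distinguishes the center update from Equation (3) of FCM: a sample far from a center contributes with a weight that decays with its distance, in addition to its membership.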

5 Results and simulations

In this section, we run FCM, PCM, PFCM and IFCM on two kinds of datasets: artificial datasets and real datasets. The advantages of IFCM are verified by comparison from four aspects: clustering accuracy, cluster centers, iteration times and computing time.

5.1 Experimental environment

All algorithms are run in the environment shown in Table 1. We first set the iteration threshold ɛ = 0.00001 and the maximum iteration number r_max = 100.

Table 1
Experiment environment

Name	Configuration
System	Windows 10
Processor	Intel i5-6300HQ
Running memory	8GB
Software	Matlab R2019b

5.2 Artificial data clustering

5.2.1 X₁₂ data set

X₁₂ is an artificial two-dimensional dataset [17]. It has twelve data points including ten normal data points and two noisy data points (x₆ and x₁₂). The normal data points form two diamonds on either side of the y-axis and the noisy points have the same distance from the centers of the two classes. The coordinate diagram of X₁₂ is shown in Fig. 1.

Fig. 1

The distribution of data points in X₁₂.

Set m = 2.0, w = 2.0, a = 1.0, b = 1.0, and the initialization cluster centers are [17]: $V^{(0)} = [\begin{matrix} 0.08 & 0.36 \\ 0.41 & 0.99 \end{matrix}]$ (30)

Table 2 shows the fuzzy membership values of FCM and IFCM. IFCM does not relax the constraint $\sum_{i = 1}^{c} u_{ik} = 1$ , so the fuzzy membership values of x₆ and x₁₂ are all 0.5. However, IFCM assigns nearly uniform membership values to the points of a class, except for the point at the center of the class, which receives the highest membership value, so IFCM has a better clustering result than FCM. Figure 2 and Fig. 3 respectively show the cluster centers of FCM and IFCM. It can be clearly seen that the cluster centers of IFCM are closer to the real cluster centers than those of FCM.

Table 2

The fuzzy membership values of FCM and IFCM on X₁₂ data set

Data	X ₁₂		FCM		IFCM
	x	y	$U_{1}^{T}$	$U_{2}^{T}$	$U_{1}^{T}$	$U_{2}^{T}$
x ₁	–5.00	0.00	0.9364	0.0636	0.8389	0.1611
x ₂	–3.34	1.67	0.9673	0.0327	0.8476	0.1524
x ₃	–3.34	0.00	0.9897	0.0103	0.9980	0.0020
x ₄	–3.34	–1.67	0.8994	0.1006	0.8410	0.1590
x ₅	–1.67	0.00	0.9156	0.0844	0.8370	0.1630
x ₆	0.00	0.00	0.5000	0.5000	0.5000	0.5000
x ₇	1.67	0.00	0.0844	0.9156	0.1630	0.8370
x ₈	3.34	1.67	0.0327	0.9673	0.1524	0.8476
x ₉	3.34	0.00	0.0103	0.9897	0.0020	0.9980
x ₁₀	3.34	–1.67	0.1006	0.8994	0.1590	0.8410
x ₁₁	5.00	0.00	0.0636	0.9364	0.1611	0.8389
x ₁₂	0.00	10.00	0.5000	0.5000	0.5000	0.5000

Fig. 2

Cluster centers of FCM on X₁₂ data set.

Fig. 3

Cluster centers of IFCM on X₁₂ data set.

Table 3 shows the cluster centers obtained by clustering X₁₂ with FCM, PCM, PFCM and IFCM. The real cluster centers of the X₁₂ dataset are: $V^{(T)} = [\begin{matrix} - 3.34 & 3.34 \\ 0.00 & 0.00 \end{matrix}]$ (31)

Table 3

The cluster centers from FCM, PCM, PFCM and IFCM on X₁₂ data set

FCM	PCM	PFCM	IFCM
$[\begin{matrix} - 2.99 & 2.99 \\ 0.54 & 0.54 \end{matrix}]$	$[\begin{matrix} - 2.15 & 2.15 \\ 0.02 & 0.02 \end{matrix}]$	$[\begin{matrix} - 2.83 & 2.83 \\ 0.36 & 0.36 \end{matrix}]$	$[\begin{matrix} - 3.18 & 3.18 \\ 0.02 & 0.02 \end{matrix}]$

The results of the four clustering algorithms are compared by calculating the distances between their final cluster centers and the real cluster centers using the following formula [17]: $E_{*} = {∥ V^{(T)} - V_{*} ∥}^{2}$ (32)

Where the mark ‘*’ represents FCM, PCM, PFCM and IFCM. The calculation results: E_FCM = 0.4141, E_PCM = 1.4165, E_PFCM = 0.3897, E_IFCM = 0.0260. Obviously, the smaller E_* means the cluster centers are closer to the real cluster centers, so IFCM has the best cluster centers and PCM has the worst ones.
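The reported values of E_* for X₁₂ can be reproduced from the rounded centers in Table 3. The sketch below assumes that Equation (32) is evaluated per cluster center, that is, as the squared Euclidean distance between one estimated center and the corresponding real center; with the columns of each matrix read as centers, both centers give the same value because of the symmetry of the data. The helper name `center_error` is hypothetical.

```python
import numpy as np

# Columns are cluster centers: first row x-coordinates, second row y-coordinates.
V_true = np.array([[-3.34, 3.34], [0.00, 0.00]])
V_est = {
    "FCM": np.array([[-2.99, 2.99], [0.54, 0.54]]),
    "PCM": np.array([[-2.15, 2.15], [0.02, 0.02]]),
    "PFCM": np.array([[-2.83, 2.83], [0.36, 0.36]]),
    "IFCM": np.array([[-3.18, 3.18], [0.02, 0.02]]),
}

def center_error(V_ref, V):
    """Squared Euclidean distance between matching centers (one value per column)."""
    return np.sum((V_ref - V) ** 2, axis=0)

# By symmetry both columns give the same value, so the first one is reported.
E = {name: float(center_error(V_true, V)[0]) for name, V in V_est.items()}
for name, value in E.items():
    print(f"E_{name} = {value:.4f}")
```

For FCM, for example, this gives 0.35² + 0.54² = 0.1225 + 0.2916 = 0.4141, in agreement with the value stated above.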

We change the weight index m to show that IFCM has a better clustering effect on the noisy data. The final cluster centers and their distances from the real cluster centers are shown in Table 4. We can see that IFCM has a smaller distance than FCM, PCM and PFCM in all cases. As m increases, the distance becomes smaller, which means that the cluster centers move closer to the real cluster centers.

Table 4

The cluster centers of IFCM and distances with different values of m on X₁₂ data set

Parameters	Cluster centers	Distance
m = 2	$[\begin{matrix} - 3.1801 & 3.1801 \\ 0.0249 & 0.0249 \end{matrix}]$	0.0262
m = 2.5	$[\begin{matrix} - 3.1850 & 3.1850 \\ 0.0230 & 0.0230 \end{matrix}]$	0.0246
m = 3	$[\begin{matrix} - 3.1965 & 3.1965 \\ 0.0201 & 0.0201 \end{matrix}]$	0.0210
m = 3.5	$[\begin{matrix} - 3.2102 & 3.2102 \\ 0.0174 & 0.0174 \end{matrix}]$	0.0172
m = 4	$[\begin{matrix} - 3.2235 & 3.2235 \\ 0.0151 & 0.0151 \end{matrix}]$	0.0138

5.2.2 X₂₀ data set

X₂₀ is a dataset containing two clusters (9 data points per cluster) and two noisy points (x₁₉ and x₂₀). The data points of the two clusters form a rectangle with the same shape distributed on both sides of the y-axis. The two noise points are located on the y-axis and x₁₉ is further away from them. The coordinate distribution of X₂₀ is shown in Fig. 4.

Fig. 4

The distribution of data points in X₂₀.

Set m = 2.0, w = 2.0, a = 1.0, b = 1.0, and the initialization cluster centers are: $V^{(0)} = [\begin{matrix} 1.00 & 0.00 \\ - 1.00 & 0.00 \end{matrix}]$ (33)

Table 5 shows the fuzzy membership values obtained by running FCM and IFCM on the X₂₀ dataset. Although IFCM gives lower fuzzy membership values than FCM to most data points in a cluster, it gives the highest membership value of 1 to the points at the cluster centers (x₅ and x₁₄). As a result, the final cluster centers of IFCM are closer to the real centers. Figure 5 and Fig. 6 respectively show the cluster centers of FCM and IFCM on the X₂₀ dataset. The final cluster centers generated by FCM, PCM, PFCM and IFCM are shown in Table 6.

Table 5

The fuzzy membership values of FCM and IFCM on X₂₀ data set

Data	X ₂₀		FCM		IFCM
	x	y	$U_{1}^{T}$	$U_{2}^{T}$	$U_{1}^{T}$	$U_{2}^{T}$
x ₁	–3.00	1.00	0.9250	0.0750	0.8233	0.1767
x ₂	–2.00	1.00	0.9543	0.0457	0.8794	0.1206
x ₃	–1.00	1.00	0.8560	0.1440	0.7562	0.2438
x ₄	–3.00	0.00	0.9496	0.0504	0.8967	0.1033
x ₅	–2.00	0.00	0.9978	0.0022	1.0000	0.0000
x ₆	–1.00	0.00	0.9126	0.0874	0.8451	0.1549
x ₇	–3.00	–1.00	0.9081	0.0919	0.8269	0.1731
x ₈	–2.00	–1.00	0.9260	0.0740	0.8849	0.1151
x ₉	–1.00	–1.00	0.8224	0.1776	0.7602	0.2398
x ₁₀	1.00	1.00	0.1440	0.8560	0.2438	0.7562
x ₁₁	2.00	1.00	0.0457	0.9543	0.1206	0.8794
x ₁₂	3.00	1.00	0.0750	0.9250	0.1767	0.8233
x ₁₃	1.00	0.00	0.0874	0.9126	0.1549	0.8451
x ₁₄	2.00	0.00	0.0022	0.9978	0.0000	1.0000
x ₁₅	3.00	0.00	0.0504	0.9496	0.1033	0.8967
x ₁₆	1.00	–1.00	0.1776	0.8224	0.2398	0.7602
x ₁₇	2.00	–1.00	0.0740	0.9260	0.1151	0.8849
x ₁₈	3.00	–1.00	0.0919	0.9081	0.1731	0.8269
x ₁₉	0.00	10.00	0.5000	0.5000	0.5000	0.5000
x ₂₀	0.00	–6.00	0.5000	0.5000	0.5000	0.5000

Fig. 5

Cluster centers of FCM on X₂₀ data set.

Fig. 6

Cluster centers of IFCM on X₂₀ data set.

Table 6

The cluster centers from FCM, PCM, PFCM and IFCM on X₂₀ data set

FCM	PCM	PFCM	IFCM
$[\begin{matrix} 1.88 & 0.14 \\ - 1.88 & 0.14 \end{matrix}]$	$[\begin{matrix} 0.80 & - 0.02 \\ - 0.80 & - 0.02 \end{matrix}]$	$[\begin{matrix} 1.65 & 0.08 \\ - 1.65 & 0.08 \end{matrix}]$	$[\begin{matrix} 2.01 & - 0.01 \\ - 2.01 & - 0.01 \end{matrix}]$

The real cluster centers of the X₂₀ dataset are: $V^{(T)} = [\begin{matrix} 2.00 & 0.00 \\ - 2.00 & 0.00 \end{matrix}]$ (34)

Equation (32) is applied to calculate the distance from the final cluster centers of FCM, PCM, PFCM and IFCM to the real cluster centers of the X₂₀ dataset. The resulting distances of FCM, PCM, PFCM and IFCM are 0.0392, 2.8800, 0.2450 and 0.0002 respectively, so E_IFCM < E_FCM < E_PFCM < E_PCM. This indicates that IFCM is the least affected by noisy data points and its cluster centers are closest to the true centers.

To further show that IFCM has a better clustering effect on the X₂₀ data, the weight index m is also changed. Table 7 shows the final cluster centers and their distances from the real cluster centers with different values of m. Although the distances of IFCM on X₂₀ do not decrease with the increase of m as they do on X₁₂, the maximum value of 0.0020 is still smaller than the distances of the other three algorithms.

Table 7

The cluster centers of IFCM and distances with different values of m on X₂₀ data set

Parameters	Cluster centers	Distance
m = 2	$[\begin{matrix} 2.0053 & - 0.0148 \\ - 2.0053 & - 0.0148 \end{matrix}]$	0.00044
m = 2.5	$[\begin{matrix} 1.9912 & - 0.0145 \\ - 1.9912 & - 0.0145 \end{matrix}]$	0.00042
m = 3	$[\begin{matrix} 1.9788 & - 0.0132 \\ - 1.9788 & - 0.0132 \end{matrix}]$	0.00090
m = 3.5	$[\begin{matrix} 1.9717 & - 0.0117 \\ - 1.9717 & - 0.0117 \end{matrix}]$	0.00160
m = 4	$[\begin{matrix} 1.9684 & - 0.0103 \\ - 1.9684 & - 0.0103 \end{matrix}]$	0.00200

5.3 Real data clustering

5.3.1 Data introduction

(1) The IRIS dataset [27] has 150 data samples divided equally into three categories (Setosa, Versicolour and Virginica) with four attributes: sepal length, sepal width, petal length and petal width. (2) The Wine dataset has three categories of 59, 71 and 48 data samples respectively [28]. The total phenol data of attribute 7 and the proanthocyanins data of attribute 10 are selected for clustering analysis. (3) The spectra of the Meat data are collected using a Monitir Fourier transform infrared (FTIR) spectrometer system, and all spectra are in the region of 1000∼1800 cm⁻¹ [29, 30]. The dataset contains FTIR spectral data of three different kinds of fresh meat (chicken, pork and turkey), and each kind of meat has 40 samples. (4) The Apple dataset has four different varieties (Fuji, Huaniu, Gala and Huangjiao) with spectra ranging from 10000 to 4000 cm⁻¹ [31]. To reduce the error, the average value of the data collected three times is used as the final experimental data. The collected Apple dataset has 1557 dimensions. The detailed properties of the four datasets are shown in Table 8.

Table 8
The detailed properties of four datasets

Dataset	Class	Sample	Dimension
IRIS	3	150	4
Wine	3	178	2
Meat	3	120	448
Apple	4	200	1557

5.3.2 Clustering accuracy

First of all, we run FCM, PCM, PFCM and IFCM on the IRIS dataset. The coefficients are set as a = 4, b = 3. The initial cluster centers are: $V^{(0)} = [\begin{matrix} - 0.4326 & 0.2877 & 1.1892 & 0.1746 \\ - 1.6656 & - 1.1465 & - 0.0376 & - 0.1867 \\ 0.1253 & 1.1909 & 0.3273 & 0.7258 \end{matrix}]$ (35)

Because PCM is sensitive to the initial cluster centers, we choose the final cluster centers obtained by running FCM as the initial cluster centers of PCM. Table 9 shows the misclassification numbers from FCM, PCM, PFCM and IFCM on the IRIS dataset. The weight indexes m and w are changed to observe more situations. From Table 9, we can see that although PCM has appropriate initial cluster centers, it still produces coincident clusters, with the misclassification number from its typical value (T) remaining at 100. When m = 2, the fuzzy membership (U) from IFCM has its maximum misclassification number of 14. When m increases to 2.5, the misclassification number reaches its minimum value of 11, and it stabilizes at 12 as m increases further. Accordingly, the clustering accuracies of IFCM lie between 90.67% and 92.67%, which are higher than those of FCM, PCM and PFCM.

Table 9

The numbers of misclassification from FCM, PCM, PFCM and IFCM on IRIS data

Parameters	FCM	PCM	PFCM	IFCM
	U	T	U	T	U
m = w = 2	16	100	15	15	14
m = w = 2.5	15	100	14	15	11
m = w = 3	15	100	15	13	12
m = w = 3.5	14	100	13	14	12
m = w = 4	14	100	12	12	12
m = 2,w = 3	16	100	15	15	14
m = 3,w = 2	15	100	14	14	12
m = 2.5,w = 3	15	100	14	15	11
m = 3,w = 2.5	15	100	15	13	12
m = 3.5,w = 2.5	14	100	13	13	12
m = 3,w = 3.5	15	100	14	14	12
m = 3.5,w = 3	14	100	13	13	12
m = 2.5,w = 3.5	15	100	14	15	11
m = 3.5,w = 4	14	100	14	14	12
m = 2,w = 3.5	16	100	15	15	14
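The relation between the misclassification numbers in Table 9 and the clustering accuracies quoted above can be sketched as follows. How the misclassified samples are counted is not specified in the text; the sketch assumes that each sample is assigned to the cluster with the largest membership value and that cluster indices are matched to class labels by the permutation giving the fewest errors. The function names and the toy arrays are hypothetical.

```python
from itertools import permutations

import numpy as np

def misclassified(y_true, U):
    """Fewest errors over all one-to-one mappings of clusters to classes.

    y_true holds integer class labels 0..c-1; U has shape (c, n).
    """
    y_true = np.asarray(y_true)
    hard = np.argmax(U, axis=0)            # hardened cluster index per sample
    best = len(y_true)
    for perm in permutations(range(U.shape[0])):
        mapped = np.array(perm)[hard]      # cluster index -> class label
        best = min(best, int(np.sum(mapped != y_true)))
    return best

def accuracy(n_errors, n_samples):
    """Clustering accuracy in percent."""
    return 100.0 * (1.0 - n_errors / n_samples)

# Toy check with two clusters and four samples (hypothetical values).
U_toy = np.array([[0.9, 0.8, 0.2, 0.4], [0.1, 0.2, 0.8, 0.6]])
errors_toy = misclassified([1, 1, 0, 1], U_toy)

# IFCM on IRIS (n = 150): 11 and 14 misclassified samples in Table 9.
acc_best = accuracy(11, 150)
acc_worst = accuracy(14, 150)
```

With 150 samples, 11 and 14 misclassified samples correspond to accuracies of about 92.67% and 90.67%.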

Secondly, we use principal component analysis (PCA) [32] to reduce the dimensionality of the IRIS dataset to three. The scores plot of PC1, PC2 and PC3 is shown in Fig. 7. The first principal component (PC1), the second principal component (PC2) and the third principal component (PC3) account for 92.46%, 5.30% and 1.71% of the variance respectively. We run FCM, PCM, PFCM and IFCM on the IRIS-3D dataset. The coefficients are set as a = 5, b = 4. The initial cluster centers are: $V^{(0)} = [\begin{matrix} - 2.6842 & 0.3266 & 0.0215 \\ 0.8905 & - 1.6960 & 0.2035 \\ 1.9720 & - 0.1774 & - 0.0247 \end{matrix}]$ (36)

Fig. 7

The scores plot of PC1, PC2 and PC3.
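The dimensionality reduction used for the IRIS-3D data can be sketched with a PCA computed from the singular value decomposition of the mean-centered data. This is a generic sketch and not necessarily the routine used in the experiments: the function name `pca_reduce` is hypothetical, no feature scaling is applied, and the demo data are random numbers rather than the IRIS measurements.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its leading principal components.

    Returns the scores (n, n_components) and the fraction of the total
    variance explained by each retained component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each attribute
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T            # coordinates on PC1, PC2, ...
    explained = s ** 2 / np.sum(s ** 2)          # variance ratio per component
    return scores, explained[:n_components]

# Hypothetical stand-in for a 4-attribute dataset.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4)) * np.array([5.0, 2.0, 1.0, 0.5])
scores_demo, ratio_demo = pca_reduce(X_demo, 3)
```

The entries of `ratio_demo` play the role of the percentages quoted for PC1, PC2 and PC3, and the rows of `scores_demo` are the reduced samples passed to the clustering algorithms.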

As before, the initial cluster centers of PCM are the final cluster centers obtained by running FCM. The misclassification numbers from FCM, PCM, PFCM and IFCM are shown in Table 10. Table 10 shows that the misclassification numbers from IFCM are lower than those from FCM, PCM and PFCM. For PCM, the clustering accuracy is still poor: the misclassification number of the typical value (T) is 49 only when m = 3.5 or m = 4, and in the other cases it is still 100, the same as on the IRIS dataset. The misclassification numbers from FCM range from 14 to 16; they first decrease and then increase as m increases. When m = 2, the misclassification number of the fuzzy membership (U) from IFCM is 14, while for larger m it remains at 12, so the classification accuracies of IFCM range from 90.67% to 92%.

Table 10

The numbers of misclassification from FCM, PCM, PFCM and IFCM on IRIS-3D data

Parameters	FCM	PCM	PFCM	IFCM
	U	T	U	T	U
m = w = 2	16	100	16	15	14
m = w = 2.5	15	100	14	15	12
m = w = 3	14	100	14	14	12
m = w = 3.5	14	49	13	14	12
m = w = 4	15	49	13	12	12
m = 2,w = 3	16	100	16	15	14
m = 3,w = 2	14	100	14	14	12
m = 2.5,w = 3	15	100	14	15	12
m = 3,w = 2.5	14	100	15	14	12
m = 3.5,w = 2.5	14	49	13	14	12
m = 3,w = 3.5	14	100	14	14	12
m = 3.5,w = 3	14	49	14	13	12
m = 2.5,w = 3.5	15	100	14	15	12
m = 3.5,w = 4	14	49	13	14	12
m = 2,w = 3.5	16	100	16	15	14

Thirdly, the IRIS dataset is reduced to two-dimensional data by PCA. FCM, PCM, PFCM and IFCM are run on the IRIS-2D dataset. The coefficients are set as a = 5, b = 1. The initial cluster centers are: $V^{(0)} = [\begin{matrix} - 2.6842 & 0.3266 \\ 0.8905 & - 0.1696 \\ 1.9720 & - 0.1774 \end{matrix}]$ (37)

The initial cluster centers of PCM are the final cluster centers obtained by running FCM. Table 11 shows the misclassification numbers of FCM, PCM, PFCM and IFCM on the IRIS-2D dataset. The misclassification numbers of the fuzzy membership values (U) from FCM are 17 when m = 2 or m = 2.5, and in the other cases the numbers are 15. PCM performs better on this dataset than on the IRIS and IRIS-3D datasets, but it still has the most misclassifications of the four algorithms. The misclassification numbers of PFCM are 15 in most cases, and 16 or 17 in a few cases. The misclassification number of IFCM decreases from 16 to 14 when m increases from 2 and then remains constant. The clustering accuracy of IFCM ranges from 89.3% to 90.7%, which is higher than that of FCM, PCM and PFCM.

Table 11

The numbers of misclassification from FCM, PCM, PFCM and IFCM on IRIS-2D data

Parameters	FCM	PCM	PFCM	IFCM
	U	T	U	T	U
m = w = 2	17	100	17	17	16
m = w = 2.5	17	50	15	15	14
m = w = 3	15	50	15	15	14
m = w = 3.5	15	49	16	15	14
m = w = 4	15	49	15	15	14
m = 2,w = 3	17	100	17	16	16
m = 3,w = 2	15	50	15	15	14
m = 2.5,w = 3	17	50	15	15	14
m = 3,w = 2.5	15	50	15	15	14
m = 3.5,w = 2.5	15	49	15	15	14
m = 3,w = 3.5	15	50	15	15	14
m = 3.5,w = 3	15	49	15	15	14
m = 2.5,w = 3.5	17	50	15	15	14
m = 3.5,w = 4	15	49	16	15	14
m = 2,w = 3.5	17	100	17	17	16

Fourthly, we perform the experiments by running FCM, PCM, PFCM and IFCM on the Wine dataset. The coefficients are set as a = 1, b = 1. The initial cluster centers are: $V^{(0)} = [\begin{matrix} 0.4598 & 8.0943 \\ 1.9988 & 2.9551 \\ 2.8629 & 5.3018 \end{matrix}]$ (38)

The initial cluster centers of PCM are the final cluster centers obtained by running FCM. The misclassification numbers from FCM, PCM, PFCM and IFCM are shown in Table 12. For the Wine data, none of the four algorithms shows excellent clustering performance. The misclassification numbers of the fuzzy membership (U) from FCM are between 37 and 39. The misclassification numbers of PFCM vary greatly in different cases. IFCM still has the best clustering accuracy, and its misclassification numbers decrease as m increases.

Table 12

The numbers of misclassification from FCM, PCM, PFCM and IFCM on Wine data

Parameters	FCM (U)	PCM (T)	PFCM (U)	PFCM (T)	IFCM (U)
m = w = 2	39	128	78	75	37
m = w = 2.5	37	125	107	78	35
m = w = 3	37	176	107	107	34
m = w = 3.5	38	228	107	74	33
m = w = 4	38	228	107	99	33
m = 2,w = 3	39	128	78	38	37
m = 3,w = 2	37	176	107	107	34
m = 2.5,w = 3	37	125	37	75	35
m = 3,w = 2.5	37	176	107	71	34
m = 3.5,w = 2.5	38	228	107	69	33
m = 3,w = 3.5	37	176	84	107	34
m = 3.5,w = 3	38	228	107	68	33
m = 2.5,w = 3.5	37	125	40	36	35
m = 3.5,w = 4	38	228	107	80	33
m = 2,w = 3.5	39	128	39	75	37

Fifthly, we conduct the experiments on the Meat dataset by running FCM, PCM, PFCM and IFCM. The raw data have 448 dimensions, so we use PCA to reduce the dimension to 13 to obtain a better clustering effect. The coefficients are set as a = 3, b = 1. The initial cluster centers are: $V^{(0)} = {[\begin{matrix} 0.0274 & 0.0374 & 0.0217 \\ - 0.0035 & - 0.0043 & - 0.0022 \\ 0.0007 & - 0.0040 & - 0.0011 \\ - 0.0013 & - 0.0005 & - 0.0022 \\ 0.0027 & - 0.0015 & - 0.0005 \\ - 0.0002 & - 0.0025 & - 0.0002 \\ - 0.0018 & - 0.0025 & - 0.0006 \\ - 0.0020 & - 0.0002 & - 0.0006 \\ - 0.0005 & 0.0005 & - 0.0002 \\ - 0.0009 & - 0.0012 & - 0.0002 \\ - 0.0005 & 0.0000 & - 0.0005 \\ 0.0000 & - 0.0002 & 0.0000 \\ 0.0000 & - 0.0002 & - 0.0004 \end{matrix}]}^{T}$ (39)

The initial cluster centers of PCM are the final cluster centers of FCM. The misclassification numbers of the four algorithms are shown in Table 13. PCM avoids the cluster consistency problem only when m = 2; in all other cases its clustering performance is still poor, with an error rate as high as 100%. Interestingly, the misclassification numbers of the fuzzy membership values (U) from FCM and IFCM do not change with the weight index m and remain at 11 and 7, respectively. PFCM has a wide range of misclassification numbers, from a minimum of 9 to a maximum of 16. The accuracy of IFCM is 94.17%, so IFCM has the highest clustering accuracy among the four algorithms.

Table 13

The numbers of misclassification from FCM, PCM, PFCM and IFCM on Meat data

Parameters	FCM (U)	PCM (T)	PFCM (U)	PFCM (T)	IFCM (U)
m = w = 2	11	61	11	11	7
m = w = 2.5	11	120	11	11	7
m = w = 3	11	120	11	10	7
m = w = 3.5	11	120	12	10	7
m = w = 4	11	120	13	12	7
m = 2,w = 3	11	61	11	11	7
m = 3,w = 2	11	120	10	10	7
m = 2.5,w = 3	11	120	12	11	7
m = 3,w = 2.5	11	120	10	9	7
m = 3.5,w = 2.5	11	120	11	16	7
m = 3,w = 3.5	11	120	12	11	7
m = 3.5,w = 3	11	120	11	10	7
m = 2.5,w = 3.5	11	120	11	11	7
m = 3.5,w = 4	11	120	12	10	7
m = 2,w = 3.5	11	61	11	11	7

Finally, we perform the experiments on the Apple dataset by running FCM, PCM, PFCM and IFCM. First, we preprocess the infrared spectral data of the apple samples using multivariate scattering correction (MSC). To achieve a better clustering effect, the Apple dataset is then reduced to 4 dimensions by PCA, and linear discriminant analysis (LDA) [33] is used to extract the discriminant information from the compressed dataset. The optimal number of eigenvectors is usually the number of cluster centers minus one, so there are c - 1 = 3 feature vectors, i.e., DV1, DV2 and DV3. The scores plot of DV1, DV2 and DV3 is shown in Fig. 8. The coefficients are set as a = 4, b = 2. The initial cluster centers can be set as:

Fig.8

The scores plot of DV1, DV2 and DV3.

$V^{(0)} = [\begin{matrix} 0.0549 & - 0.1230 & - 0.0793 \\ 0.0473 & - 0.2567 & - 0.0961 \\ 0.0180 & - 0.3267 & - 0.0898 \\ 0.0096 & - 0.3563 & - 0.0004 \end{matrix}]$ (40)
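
The MSC preprocessing step applied to the apple spectra can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes the common form of MSC in which each spectrum is regressed on the mean spectrum and then corrected with the fitted intercept and slope. The function name `msc` and the toy spectra are hypothetical.

```python
def msc(spectra):
    """Scatter-correct each spectrum against the mean spectrum.

    Each spectrum x is fitted as x ~ intercept + slope * ref by least squares,
    and the corrected spectrum is (x - intercept) / slope.
    Assumes the reference spectrum is not constant and no slope is zero.
    """
    n = len(spectra)
    p = len(spectra[0])
    # Reference spectrum: mean over all samples (an assumption of this sketch).
    ref = [sum(s[j] for s in spectra) / n for j in range(p)]
    ref_mean = sum(ref) / p
    ref_var = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / p
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected, ref


# Two toy "spectra" that differ only by an offset and a scale factor.
spectra = [[1.0, 2.0, 4.0, 3.0],
           [3.0, 5.0, 9.0, 7.0]]
corrected, ref = msc(spectra)
```

Because the two toy spectra differ only by additive and multiplicative effects, both corrected spectra coincide with the mean spectrum, which is the behavior MSC is meant to produce before PCA and LDA are applied.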

The initial cluster centers of PCM are the final cluster centers of FCM. Table 14 shows the misclassification numbers of the four algorithms. For this dataset, PCM still clusters poorly, and the other three algorithms also produce relatively many misclassifications. The fuzzy membership values (U) from FCM give the most misclassifications (58) when m = 3.5 or m = 4. Under the same conditions, the misclassification numbers of the fuzzy membership values (U) from PFCM exceed 100, and those of the typical values (T) exceed 100 in most cases. The misclassification numbers of the fuzzy membership values (U) from IFCM decrease from 25 to 24 and then rise to 30 as m increases, and IFCM has the highest clustering accuracy, ranging from 85% to 88%.

Table 14

The numbers of misclassification from FCM, PCM, PFCM and IFCM on Apple data

Parameters	FCM (U)	PCM (T)	PFCM (U)	PFCM (T)	IFCM (U)
m = w = 2	36	151	33	33	25
m = w = 2.5	33	200	58	39	24
m = w = 3	35	200	58	59	24
m = w = 3.5	58	200	103	108	24
m = w = 4	58	200	103	112	30
m = 2,w = 3	36	151	36	35	25
m = 3,w = 2	35	200	57	44	24
m = 2.5,w = 3	33	200	58	39	24
m = 3,w = 2.5	35	200	58	59	24
m = 3.5,w = 2.5	58	200	102	59	24
m = 3,w = 3.5	35	200	58	59	24
m = 3.5,w = 3	58	200	103	108	24
m = 2.5,w = 3.5	33	200	39	37	24
m = 3.5,w = 4	58	200	102	108	24
m = 2,w = 3.5	36	151	34	34	25

5.3.3 Cluster centers

In this section, we evaluate the final cluster centers of the four algorithms by comparing the distances from the final cluster centers generated by FCM, PCM, PFCM and IFCM to the true cluster centers. The criterion is that the smaller the distance, the better the cluster centers. The mean values of each group of sample data are taken as the true cluster centers. The true cluster centers of IRIS, IRIS-3D, IRIS-2D, Wine, Meat and Apple are shown in Equations (41) to (46), respectively. We set the coefficients a = 1, b = 1, the indexes m = 2, w = 2 and the iteration threshold ɛ = 0.00001.

$V_{IRIS}^{(T)} = [\begin{matrix} 5.0060 & 3.4280 & 1.4620 & 0.2460 \\ 5.9360 & 2.7700 & 4.2600 & 1.3260 \\ 6.5880 & 2.9740 & 5.5520 & 2.0260 \end{matrix}]$ (41)

$V_{IRIS3D}^{(T)} = [\begin{matrix} - 2.6424 & 0.1908 & - 0.0136 \\ 0.5332 & - 0.2455 & 0.1183 \\ 2.1092 & 0.5466 & - 0.1047 \end{matrix}]$ (42)

$V_{IRIS2D}^{(T)} = [\begin{matrix} - 2.6424 & 0.1909 \\ 0.5332 & - 0.2455 \\ 2.0737 & 0.0547 \end{matrix}]$ (43)

$V_{Wine}^{(T)} = [\begin{matrix} 2.9824 & 5.5283 \\ 2.0808 & 3.0866 \\ 0.7815 & 7.3962 \end{matrix}]$ (44)

$V_{Meat}^{(T)} = {[\begin{matrix} 0.0153 & - 0.0204 & 0.0051 \\ - 0.0036 & - 0.0023 & 0.0059 \\ - 0.0009 & - 0.0006 & 0.0015 \\ - 0.0007 & 0.0000 & 0.0008 \\ 0.0003 & 0.0001 & - 0.0004 \\ - 0.0001 & - 0.0002 & 0.0004 \\ 0.0003 & - 0.0002 & - 0.0001 \\ 0.0003 & - 0.0002 & - 0.0001 \\ 0.0000 & 0.0002 & - 0.0001 \\ 0.0000 & 0.0000 & 0.0001 \\ 0.0003 & 0.0000 & - 0.0002 \\ - 0.0001 & 0.0001 & 0.0000 \\ 0.0001 & 0.0001 & - 0.0001 \end{matrix}]}^{T}$ (45)

$V_{Apple}^{(T)} = [\begin{matrix} 0.0399 & - 0.2641 & - 0.0214 \\ 0.0456 & 0.0017 & 0.0334 \\ - 0.0986 & - 0.0628 & 0.0046 \\ 0.0131 & 0.3252 & - 0.0166 \end{matrix}]$ (46)

We use Equation (32) to calculate the distances, where V_T and V_* represent the real cluster centers and the final cluster centers, respectively. Table 15 shows the distances of FCM, PCM, PFCM and IFCM. For IRIS, E_IFCM < E_PFCM < E_FCM < E_PCM, which means the final cluster centers of IFCM are better than those of FCM, PCM and PFCM. IFCM does not perform well on the IRIS-3D and Meat datasets: on IRIS-3D its distance (0.1943) is larger than those of FCM and PFCM, and on Meat its distance (0.00209) is the largest of the four algorithms. FCM has the minimum distance of 0.1418 on the IRIS-3D dataset, and PFCM has the best performance on the Meat dataset with a distance of only 0.00031. On the IRIS-2D dataset, the distance of IFCM is 0.0119 and satisfies E_IFCM < E_PFCM < E_FCM < E_PCM. For the Apple dataset, the distance of IFCM (0.6993) is smaller only than that of FCM and is 0.0114 larger than that of PFCM (0.6879). The distance of IFCM on the Wine dataset is 19.9186, which is smaller than that of FCM but slightly larger than those of PCM and PFCM. In summary, IFCM does not perform as well as PFCM on the IRIS-3D, Wine, Meat and Apple datasets, or as well as PCM on the Wine, Meat and Apple datasets. However, compared with the original FCM algorithm, the final cluster centers of IFCM are closer to the real cluster centers on the IRIS, IRIS-2D, Wine and Apple datasets.

Table 15

The distances of FCM, PCM, PFCM and IFCM

Dataset	E_FCM	E_PCM	E_PFCM	E_IFCM
IRIS	36.1472	37.1406	35.6731	35.2415
IRIS-3D	0.1418	1.7217	0.1767	0.1943
IRIS-2D	0.0781	1.2664	0.0235	0.0119
Wine	22.8871	19.1988	19.8450	19.9186
Meat	0.00042	0.00103	0.00031	0.00209
Apple	0.7164	0.4736	0.6879	0.6993
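
The comparison of final and true cluster centers can be illustrated with a short Python sketch. Equation (32) is not reproduced in this section, so the sketch assumes a squared Euclidean (Frobenius) distance between the two center matrices; the paper's actual formula may differ. The function name `center_distance` and the matrix `V_star` are hypothetical, while `V_true` is taken from Equation (43).

```python
def center_distance(V_true, V_final):
    """Sum of squared differences between matching rows of two center matrices.

    Assumption: row i of V_final describes the same cluster as row i of V_true,
    so the rows must be put in the same cluster order before calling this.
    """
    return sum(
        (t - f) ** 2
        for row_t, row_f in zip(V_true, V_final)
        for t, f in zip(row_t, row_f)
    )


# True IRIS-2D centers from Equation (43).
V_true = [[-2.6424, 0.1909],
          [0.5332, -0.2455],
          [2.0737, 0.0547]]

# Hypothetical final centers: each true center shifted by 0.1 along the first axis.
V_star = [[-2.5424, 0.1909],
          [0.6332, -0.2455],
          [2.1737, 0.0547]]

E = center_distance(V_true, V_star)  # three squared shifts of 0.1, about 0.03
```

Under this assumed definition, a smaller E means the final centers lie closer to the class means, which is the criterion used to read Table 15.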

5.3.4 Convergence results

In this section, we investigate the convergence of FCM, PCM, PFCM and IFCM by comparing their iteration numbers. We set the coefficients a = 1, b = 1, the indexes m = 2, w = 2 and the iteration threshold ɛ = 0.00001. The four algorithms are executed on the above six datasets, and the iteration numbers are shown in Table 16. On the IRIS dataset, IFCM needs the most iterations (48), while FCM, PCM and PFCM need 28, 39 and 32 iterations, respectively. On the IRIS-3D, IRIS-2D and Wine datasets, IFCM reaches convergence faster than PCM but slower than FCM and PFCM. FCM needs the same number of iterations (19) on the IRIS-3D, IRIS-2D and Meat datasets. On the Apple dataset, IFCM needs 81 iterations to converge, more than FCM (32 iterations), PCM (74 iterations) and PFCM (60 iterations).

Table 16
The iteration numbers of FCM, PCM, PFCM and IFCM

Dataset	FCM	PCM	PFCM	IFCM
IRIS	28	39	32	48
IRIS-3D	19	42	25	37
IRIS-2D	19	44	27	40
Wine	22	67	27	42
Meat	19	39	13	32
Apple	32	74	60	81
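
To make the notion of an iteration number concrete, the following Python sketch runs plain FCM (not IFCM) on a toy dataset and counts the iterations needed until the cluster centers stop moving. It uses the standard FCM membership and center updates with the squared Euclidean distance. The stopping rule (largest absolute change in any center coordinate below ɛ), the function name `fcm` and the toy data are assumptions made for illustration.

```python
def fcm(X, V, m=2.0, eps=1e-5, max_iter=100):
    """Plain fuzzy c-means; returns (centers, memberships, iterations used)."""
    c, n, dim = len(V), len(X), len(X[0])
    for it in range(1, max_iter + 1):
        # Squared Euclidean distances, floored to avoid division by zero.
        D = [[max(sum((x[j] - v[j]) ** 2 for j in range(dim)), 1e-12) for x in X]
             for v in V]
        # Membership update: u_ik = 1 / sum_j (d2_ik / d2_jk)^(1 / (m - 1)).
        U = [[1.0 / sum((D[i][k] / D[j][k]) ** (1.0 / (m - 1)) for j in range(c))
              for k in range(n)]
             for i in range(c)]
        # Center update: mean of the samples weighted by u_ik^m.
        V_new = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n)]
            total = sum(w)
            V_new.append([sum(w[k] * X[k][j] for k in range(n)) / total
                          for j in range(dim)])
        # Assumed stopping rule: largest coordinate change below eps.
        shift = max(abs(V_new[i][j] - V[i][j]) for i in range(c) for j in range(dim))
        V = V_new
        if shift < eps:
            break
    return V, U, it


# Toy data: two compact, well-separated groups (hypothetical values).
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
     [8.0, 8.0], [8.0, 9.0], [9.0, 8.0], [9.0, 9.0]]
V0 = [[2.0, 2.0], [6.0, 6.0]]
V, U, iters = fcm(X, V0, m=2.0, eps=1e-5, max_iter=100)
```

The value of `iters` plays the role of the entries in Table 16. An algorithm with a more elaborate distance measure or extra update steps would change only the body of the loop, not the way iterations are counted.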

5.3.5 Computing time

In this section, we run all four fuzzy clustering algorithms again under different parameter settings to compare their computing time. We set the iteration threshold ɛ = 0.00001 and the maximum iteration number γ_max = 100. The computing time of the four algorithms on the six datasets is shown in Table 17. PFCM requires the most computing time in most cases because it has to calculate both the fuzzy membership values and the typical values. Since the initial cluster centers of PCM need to be obtained by running FCM, PCM takes more time than FCM. Compared with FCM, IFCM has a more complicated distance formula, so IFCM also spends more time than FCM.

Table 17
The running times (s) of FCM, PCM, PFCM and IFCM on the six datasets

Dataset	Parameters	FCM	PCM	PFCM	IFCM
IRIS	a = b = 1,m = w = 2	0.0539	0.0751	0.1039	0.0602
	a = b = 1,m = w = 3	0.0499	0.0847	0.1242	0.0731
	a = b = 2,m = 2,w = 3	0.0542	0.1465	0.0847	0.0681
	a = b = 4,m = 3,w = 2	0.0637	0.0874	0.1136	0.0731
	a = 1,b = 2,m = 3,w = 4	0.0552	0.0792	0.1204	0.0787
	a = 1,b = 3,m = 4,w = 3	0.0534	0.0904	0.1781	0.0843
IRIS-3D	a = b = 1,m = w = 2	0.0466	0.0759	0.0948	0.0526
	a = b = 1,m = w = 3	0.0460	0.0780	0.1134	0.0678
	a = b = 2,m = 2,w = 3	0.0493	0.0829	0.1025	0.0559
	a = b = 4,m = 3,w = 2	0.0520	0.1039	0.1104	0.7161
	a = 1,b = 2,m = 3,w = 4	0.0410	0.0825	0.1313	0.0640
	a = 1,b = 3,m = 4,w = 3	0.0452	0.0880	0.1442	0.0683
IRIS-2D	a = b = 1,m = w = 2	0.0274	0.0571	0.0588	0.0405
	a = b = 1,m = w = 3	0.0311	0.0495	0.0522	0.0475
	a = b = 2,m = 2,w = 3	0.0259	0.0434	0.0586	0.0451
	a = b = 4,m = 3,w = 2	0.0275	0.0593	0.0627	0.0568
	a = 1,b = 2,m = 3,w = 4	0.0254	0.0496	0.0515	0.0464
	a = 1,b = 3,m = 4,w = 3	0.0342	0.0582	0.0652	0.0457
Wine	a = b = 1,m = w = 2	0.0374	0.0482	0.0567	0.0449
	a = b = 1,m = w = 3	0.0316	0.0513	0.0545	0.0479
	a = b = 2,m = 2,w = 3	0.0331	0.0454	0.0580	0.0427
	a = b = 4,m = 3,w = 2	0.0296	0.0539	0.0579	0.0418
	a = 1,b = 2,m = 3,w = 4	0.0362	0.0501	0.0591	0.0480
	a = 1,b = 3,m = 4,w = 3	0.0307	0.0479	0.0541	0.0412
Meat	a = b = 1,m = w = 2	0.0525	0.0884	0.0841	0.0715
	a = b = 1,m = w = 3	0.0412	0.0743	0.1069	0.0709
	a = b = 2,m = 2,w = 3	0.0389	0.0751	0.0977	0.0511
	a = b = 4,m = 3,w = 2	0.0426	0.0818	0.0895	0.0630
	a = 1,b = 2,m = 3,w = 4	0.0365	0.0736	0.1027	0.0687
	a = 1,b = 3,m = 4,w = 3	0.0462	0.0816	0.0943	0.0719
Apple	a = b = 1,m = w = 2	0.0116	0.0367	0.0753	0.0269
	a = b = 1,m = w = 3	0.0209	0.0619	0.0601	0.0449
	a = b = 2,m = 2,w = 3	0.0156	0.0301	0.0454	0.0254
	a = b = 4,m = 3,w = 2	0.0216	0.0337	0.0814	0.0319
	a = 1,b = 2,m = 3,w = 4	0.0233	0.0361	0.0933	0.0337
	a = 1,b = 3,m = 4,w = 3	0.0273	0.0601	0.1161	0.0444
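
Running times of this kind can be collected with a simple timing harness. The Python sketch below is only an illustration: the experiments in this paper were run in Matlab, and the use of repeated runs with the median, the function name `time_run` and the stand-in workload `dummy_clustering` are all assumptions.

```python
import time


def time_run(fn, *args, repeats=5):
    """Median wall-clock time in seconds of fn(*args) over several runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


def dummy_clustering(n):
    """Stand-in workload; a real test would call the clustering routine."""
    return sum(i * i for i in range(n))


elapsed = time_run(dummy_clustering, 10000)
```

Taking the median over several runs reduces the influence of one-off delays, which matters when the measured times are as short as the hundredths of a second reported in Table 17.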

6 Conclusion

This paper presented an improved FCM algorithm (IFCM) based on the Euclidean distance function. The new distance measure improved the robustness of the original algorithm (FCM) by giving smaller weights to noisy points and larger weights to normal data points. Experiments on two artificial datasets showed that IFCM clustered data containing noisy points better than the original FCM. The clustering accuracy, cluster centers, number of iterations and computing time of IFCM were examined on six real datasets (IRIS, IRIS-3D, IRIS-2D, Wine, Meat and Apple). The results showed that IFCM had the highest clustering accuracy of the four algorithms and that its final cluster centers were closer to the true centers than those of FCM on the IRIS, IRIS-2D, Wine and Apple datasets. Compared with PCM and PFCM, IFCM generally required less computing time while giving better clustering results. In addition, when the weight index was varied, IFCM still gave the best clustering results. In conclusion, IFCM clusters more accurately than FCM, PCM and PFCM. In future work, we will focus on selecting the weight index m to obtain better clustering results and will try to apply this distance measure to other clustering algorithms, such as the PCM and PFCM discussed in this paper, to improve their clustering performance.

Acknowledgments

This research was financially supported by Undergraduate Scientific Research Project of Jiangsu University (20AB0020), Anhui Province Scientific Research Planning Project (2022AH040), Talent Program of Chuzhou Polytechnic (YG2019026 and YG2019024) and Key Science Research Project of Chuzhou Polytechnic (YJZ-2020-12).

Conflict of interest

The authors declare no conflict of interest.

Data availability statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

1. Jing J.K., Ke S.Z., Li T.J., Wang, Energy method of geophysical logging lithology based on K-means dynamic clustering analysis, Environmental Technology & Innovation 23 (2021), 101534.

2. Kononenko, Kukar, Machine Learning and Data Mining, Woodhead Publishing 12 (2007), 321–358.

3. Ohri, Kumar, Review on self-supervised image recognition using deep neural networks, Knowledge-Based Systems 224(8) (2021), 107090.

4. Wang, Wang X.G., Liu W.Y., Unsupervised local deep feature for image recognition, Information Sciences 351 (2016), 67–75.

5. D.X., Li, Wang, Zhu T.G., Pornographic images recognition based on spatial pyramid partition and multi-instance ensemble learning, Knowledge-Based Systems 84 (2015), 214–223.

6. Tang N.N., Image recognition algorithm for exercise fatigue based on FPGA processor and motion image capture, Microprocessors and Microsystems 81(3) (2021), 103756.

7. Opiyo, Okinda, Zhou, Mwangi, Makange, Medial axis-based machine-vision system for orchard robot navigation, Computers and Electronics in Agriculture 185 (2021), 106153.

8. C.S., Zhang X.M., Huang Y.J., Tang C.G., Fatikow, A novel algorithm for defect extraction and classification of mobile phone screen based on machine vision, Computers and Industrial Engineering 146(1) (2020), 06530.

9. Malhotra, Jha, Fuzzy c-means clustering based colour image segmentation for tool wear monitoring in micro-milling, Precision Engineering-Journal of the International Societies for Precision Engineering and Nanotechnology 72 (2021), 690–705.

10. Wang K.L., Yi Y.H., Tang Z.W., Peng J.B., Multi-scene ancient Chinese text recognition with deep coupled alignments, Applied Soft Computing 108(4) (2021), 107475.

11. Shivakumara, Wu, Lu, Tan C.L., Blumenstein, Anami B.S., Fractals based multi-oriented text detection system for recognition in mobile video images, Pattern Recognition 68 (2017), 158–174.

12. J.M., Shivakumara, Lu, Tan C.L., Uchida, A new method for multi-oriented graphics-scene-3D text classification in video, Pattern Recognition 49 (2016), 19–42.

13. Bellman, Kalaba, Zadeh, Abstraction and pattern classification, Journal of Mathematical Analysis and Applications 13(1) (1966), 1–7.

14. Ruspini E.H., Numerical methods for fuzzy clustering, Information Sciences 2(3) (1970), 319–350.

15. Bezdek J.C., Hathaway R.J., Clustering with relational c-means partitions from pairwise distance data, Mathematical Modelling 9(6) (1987), 435–439.

16. Krishnapuram, Keller J.M., A possibilistic approach to clustering, IEEE Transactions on Fuzzy Systems 1(2) (1993), 98–110.

17. Pal N.R., Pal, Keller J.M., Bezdek J.C., A possibilistic fuzzy c-means clustering algorithm, IEEE Transactions on Fuzzy Systems 13(4) (2005), 517–530.

18. Khalilia M.A., Bezdek, Popescu, Keller J.M., Improvements to the relational fuzzy c-means clustering algorithm, Pattern Recognition 47(12) (2014), 3920–3930.

19. Gath, Geva A.B., Unsupervised optimal fuzzy clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence 11(7) (1989), 773–781.

20. Zeng, Wang X.Y., Duan X.J., Zeng, Xiao Z.Y., Feng, Kernelized Mahalanobis distance for fuzzy clustering, IEEE Transactions on Fuzzy Systems 29(10) (2021), 3103–3117.

21. X.H., Zhou H.X., Wu, Zhang T.F., A possibilistic fuzzy Gath-Geva clustering algorithm using the exponential distance, Expert Systems with Applications 184 (2021), 115550.

22. X.H., Zhu, Wu, Sun, Qiu S.W., Li, A hybrid fuzzy K-harmonic means clustering algorithm, Applied Mathematical Modelling 39(12) (2015), 3398–3409.

23. Chouikhi, Saad M.F., Alimi A.M., Improved fuzzy possibilistic C-means (IFPCM) algorithms using Minkowski distance, International Conference on Control, Automation and Diagnosis (2017), pp. 402–405.

24. Zhao K.X., Dai Y.P., Jia Z.Y., Ji, General fuzzy C-means clustering algorithm using Minkowski metric, Signal Processing 188 (2017), 108161.

25. Gao Y.L., Wang Z.H., Xie J.X., Pan J.Y., A new robust fuzzy c-means clustering method based on adaptive elastic distance, Knowledge-Based Systems 237 (2022), 107769.

26. K.L., Yang M.S., Alternative c-means clustering algorithms, Pattern Recognition 35(10) (2002), 2267–2278.

27. Thirunavkkarasu, Ajay S.S., Prakhar, Sachin, Classification of IRIS Dataset using Classification Based KNN Algorithm in Supervised Learning, International Conference on Computing Communication and Automation (2018), pp. 1–4.

28. Timm, Borgelt, Doring, Kruse, An extension to possibilistic fuzzy cluster analysis, Fuzzy Sets and Systems 147 (2004), 3–16.

29. Al-Jowder, Kemsley E.K., Wilson R.H., Mid-infrared spectroscopy and authenticity problems in selected meats: A feasibility study, Food Chemistry 59(2) (1997), 195–201.

30. Zheng W.B., Fu X.P., Ying Y.B., Spectroscopy-based food classification with extreme learning machine, Chemometrics and Intelligent Laboratory Systems 139 (2014), 42–47.

31. X.H., Zhu, Wu, Sun, Dai C.X., Discrimination of tea varieties using FTIR spectroscopy and allied Gustafson-Kessel clustering, Computers and Electronics in Agriculture 147 (2018), 64–69.

32. Rahoma, Imtiaz, Ahmed, Sparse principal component analysis using bootstrap method, Chemical Engineering Science (2021), pp. 116890.

33. C.N., Qi Y.F., Shao Y.H., Guo Y.R., Ye Y.F., Robust two-dimensional capped I (2,1)-norm linear discriminant analysis with regularization and its applications on image recognition, Engineering Applications of Artificial Intelligence 104 (2021), 104367.

34. Arqub O.A., Adaptation of reproducing kernel algorithm for solving fuzzy Fredholm-Volterra integrodifferential equations, Neural Computing & Applications 28 (2017), 1591–1610.

35. Alshammari, Al-Smadi M.H., Arqub O.A., Hashim, Alias M.A., Residual Series Representation Algorithm for Solving Fuzzy Duffing Oscillator Equations, Symmetry 12 (2020), 572.

36. Arqub O.A., Singh, Maayah, Alhodaly, Reproducing kernel approach for numerical solutions of fuzzy fractional initial value problems under the Mittag-Leffler kernel differential operator, Mathematical Methods in the Applied Sciences 2021 (2021), 1–22.

37. Arqub O.A., Singh, Alhodaly, Adaptation of kernel functions-based approach with Atangana-Baleanu-Caputo distributed order derivative for solutions of fuzzy fractional Volterra and Fredholm integrodifferential equations, Mathematical Methods in the Applied Sciences 2021 (2021), 1–28.
