Combining similarity and divergence measures for intuitionistic fuzzy information clustering

Abstract

In intuitionistic fuzzy clustering, the construction of an intuitionistic fuzzy similarity matrix (IFSM) is a fundamental issue in direct clustering analysis, since it determines both the clustering results and the computational effort. Many methods based on the axioms of intuitionistic fuzzy similarity relations are applicable to IFSM construction. However, most existing methods may yield a “counterintuitive result” in some cases and consume much computational time. In this paper, we propose a novel intuitionistic fuzzy clustering method to deal with these problems. First, based on the normalized Hamming distance, we define a similarity measure between intuitionistic fuzzy numbers (IFNs), from which a similarity measure between intuitionistic fuzzy sets (IFSs) is induced. Second, a divergence measure between IFSs is obtained by extending the dissimilarity of IFNs. Third, we construct an IFSM by combining the similarity and divergence measures so as to cluster the intuitionistic fuzzy information. Finally, two examples are presented to show the effectiveness and advantages of our method.

Keywords

Intuitionistic fuzzy sets, similarity measure, divergence measure, intuitionistic fuzzy similarity matrix, clustering analysis

1 Introduction

Cluster analysis, or clustering, divides a family of objects into several clusters such that the objects in the same cluster share similar properties [1]. It is a core tool that plays an important role in various fields, such as decision making [2, 3], data mining [4, 5], knowledge discovery [6, 7], clustering ensembles [8] and medical diagnosis [9]. In classical clustering analysis, an object belongs to exactly one cluster, the so-called “hard” clustering [10]. In practice, however, owing to the incompleteness of information, it is often unclear whether an object belongs to exactly one cluster, which leads to the so-called “soft” or fuzzy clustering. To address this, Bezdek [11] introduced the notion of fuzzy clustering, which marked the beginning of fuzzy clustering theory. Fuzzy sets, as important objects of study, are applied to explore the problems of fuzzy clustering. However, fuzzy sets characterize fuzziness by the membership degree alone, so they cannot describe the fuzziness of objects precisely in complex and volatile environments. Atanassov [12] therefore introduced the notion of an intuitionistic fuzzy set (IFS), which adds a hesitancy degree to fuzzy sets. Since then, IFS theory has attracted attention and has been applied to areas such as three-way decisions (3WD) [13–16], rough sets [17, 18], information systems [19–23], pattern recognition [24, 25] and image processing [26].

In the field of intuitionistic fuzzy set analysis, intuitionistic fuzzy clustering is a significant topic that has been studied for decades, and plentiful results have been obtained [27–32]. For example, in an early study, Zhang et al. [27] constructed an intuitionistic fuzzy equivalence matrix by using the transitive closure of an intuitionistic fuzzy similarity matrix (IFSM) and gave an effective method for clustering samples. More recently, Thong and Son [32] first calculated the most suitable number of clusters by using particle swarm optimization and then gave a method of automatic picture intuitionistic fuzzy clustering. Although these results each show their own advantages in clustering intuitionistic fuzzy information, they are usually time-consuming or lose information, because they all rely on an intuitionistic fuzzy equivalence matrix, the transitive closure technique or optimization methods [33]. Thus, in recent years, many authors [33–37] have proposed clustering intuitionistic fuzzy information directly by constructing an IFSM. It is pointed out in [33] that this technique reduces the computation time, since the clustering effort and time are mainly determined by the IFSM calculation.

How can the clustering effort be reduced so as to save computation time? Many authors [33–37] have worked on this question; e.g., Wang et al. [33] reduced the clustering effort by constructing an IFSM from an intuitionistic fuzzy similarity degree. These approaches do save time to some extent by reducing the clustering effort; however, they lead to a “counterintuitive result” in some cases. The question is therefore how to improve the methods so that they not only save time but also avoid “counterintuitive results”. To this end, in this paper we present a new intuitionistic fuzzy clustering approach based on IFSM construction.

This paper is organized as follows. In Section 2, we recall the basic concepts and classical methods of IFSM construction and briefly analyze some of their disadvantages. Section 3 gives a similarity measure between intuitionistic fuzzy numbers and the similarity and divergence measures between IFSs, on the basis of which we construct an IFSM. In Section 4, an intuitionistic fuzzy clustering method based on this IFSM construction is presented. In Section 5, some examples are given to illustrate the advantages of our method. Finally, we conclude the paper in Section 6.

2 Preliminary

Let us recall briefly some basic concepts concerning intuitionistic fuzzy theory and some classical approaches to IFSM construction.

2.1 Intuitionistic fuzzy set

Let X = {x₁, x₂, …, x_n} be fixed. An intuitionistic fuzzy set (IFS) A in X is defined as [12]: $A = \{(x_{k}, μ_{A} (x_{k}), ν_{A} (x_{k})) \mid x_{k} \in X\},$ (1) where μ_A: X → [0, 1] and ν_A: X → [0, 1] denote respectively the membership and non-membership degrees of the element x_k in the IFS A, satisfying 0 ≤ μ_A (x_k) + ν_A (x_k) ≤ 1 for all x_k ∈ X. Additionally, π_A (x_k) = 1 - μ_A (x_k) - ν_A (x_k) ∈ [0, 1] is called the hesitation degree of the element x_k in A. In particular, when π_A (x_k) = 0 for all x_k ∈ X, A reduces to a fuzzy set [38]. For clarity, Xu and Yager [39] further called a = (μ_a, ν_a) an intuitionistic fuzzy number (IFN), where μ_a ∈ [0, 1], ν_a ∈ [0, 1] and μ_a + ν_a ∈ [0, 1]. There are also other possible representations of intuitionistic fuzzy sets [27, 40].

Let us now state some operations on IFSs (see [12]). Assume A and B are two IFSs. Then

$\begin{matrix} (1)\ A \subseteq B \Leftrightarrow μ_{A} (x_{k}) \leq μ_{B} (x_{k}) \text{ and } ν_{A} (x_{k}) \geq ν_{B} (x_{k}) \text{ for all } x_{k} \in X; \\ (2)\ A = B \Leftrightarrow μ_{A} (x_{k}) = μ_{B} (x_{k}) \text{ and } ν_{A} (x_{k}) = ν_{B} (x_{k}) \text{ for all } x_{k} \in X; \\ (3)\ A^{c} = \{(x_{k}, ν_{A} (x_{k}), μ_{A} (x_{k})) \mid x_{k} \in X\}; \\ (4)\ A \oplus B = \{(x_{k}, 1 - (1 - μ_{A} (x_{k})) (1 - μ_{B} (x_{k})), ν_{A} (x_{k}) ν_{B} (x_{k})) \mid x_{k} \in X\}; \\ (5)\ A \cap B = \{(x_{k}, \min \{μ_{A} (x_{k}), μ_{B} (x_{k})\}, \max \{ν_{A} (x_{k}), ν_{B} (x_{k})\}) \mid x_{k} \in X\}; \\ (6)\ A \cup B = \{(x_{k}, \max \{μ_{A} (x_{k}), μ_{B} (x_{k})\}, \min \{ν_{A} (x_{k}), ν_{B} (x_{k})\}) \mid x_{k} \in X\} . \end{matrix}$
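The set operations above can be illustrated with a minimal Python sketch. It assumes an IFS over a finite universe is stored as a list of (μ, ν) pairs, one per element x_k; the type aliases, function names and sample data are hypothetical and are not part of the original formulation.

```python
from typing import List, Tuple

IFN = Tuple[float, float]  # (membership, non-membership)
IFS = List[IFN]            # one pair per element x_k of X

def is_subset(a: IFS, b: IFS) -> bool:
    """A ⊆ B: membership never larger, non-membership never smaller."""
    return all(ma <= mb and na >= nb for (ma, na), (mb, nb) in zip(a, b))

def complement(a: IFS) -> IFS:
    """A^c swaps membership and non-membership degrees."""
    return [(na, ma) for ma, na in a]

def intersection(a: IFS, b: IFS) -> IFS:
    """A ∩ B: min of memberships, max of non-memberships."""
    return [(min(ma, mb), max(na, nb)) for (ma, na), (mb, nb) in zip(a, b)]

def union(a: IFS, b: IFS) -> IFS:
    """A ∪ B: max of memberships, min of non-memberships."""
    return [(max(ma, mb), min(na, nb)) for (ma, na), (mb, nb) in zip(a, b)]

# Hypothetical data on a two-element universe.
A = [(0.1, 0.9), (0.4, 0.5)]
B = [(0.5, 0.3), (0.6, 0.2)]
print(is_subset(A, B))
print(intersection(A, B) == A, union(A, B) == B)
```

When A ⊆ B, the intersection returns A and the union returns B, which is a quick sanity check on operations (1), (5) and (6).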

A similarity measure S (A, B) between the IFSs A and B, as given in [40, 41], satisfies the following axioms:

$\begin{matrix} (S1)\ 0 \leq S (A, B) \leq 1; \\ (S2)\ S (A, B) = 1 \Leftrightarrow A = B; \\ (S3)\ S (A, B) = S (B, A); \\ (S4)\ \text{If } A \subseteq B \subseteq C, \text{ then } S (A, B) \geq S (A, C) \text{ and } S (B, C) \geq S (A, C) . \end{matrix}$

Later, Montes et al. [42] proposed the notion of a divergence measure between IFSs, which can be viewed as a particular case of the dissimilarity of IFSs and has the following properties:

$\begin{matrix} (D 1) D (A, B) = D (B, A); \\ (D 2) D (A, A) = 0; \\ (D 3) D (A \cap C, B \cap C) \leq D (A, B); \\ (D 4) D (A \cup C, B \cup C) \leq D (A, B) . \end{matrix}$

Let us proceed to introduce the notion of intuitionistic fuzzy similarity degree (see [27, 43]). Let A, B and C be IFSs, and let R (A, B) denote the relation between A and B. We call R (A, B) an intuitionistic fuzzy similarity degree if it satisfies the following conditions:

$\begin{matrix} (R1)\ R (A, B) \text{ is an IFN}; \\ (R2)\ R (A, B) = (1, 0) \Leftrightarrow A = B; \\ (R3)\ R (A, B) = R (B, A); \\ (R4)\ \text{If } A \subseteq B \subseteq C, \text{ then } R (A, C) \subseteq R (A, B) \text{ and } R (A, C) \subseteq R (B, C) . \end{matrix}$

Szmidt et al. [44] and Li [45] gave respectively the normalized Hamming distance and the dissimilarity between IFNs as follows:

Definition 1. ([44]) Let a₁ = (μ₁, ν₁) and a₂ = (μ₂, ν₂) be two IFNs. The normalized Hamming distance l (a₁, a₂) between them is defined as: $l (a_{1}, a_{2}) = \frac{1}{2} (| μ_{1} - μ_{2} | + | ν_{1} - ν_{2} | + | π_{1} - π_{2} |),$ (2) where π₁ and π₂ are the hesitation degrees of the IFNs a₁ and a₂, respectively.

Definition 2. ([45]) Let a₁ = (μ₁, ν₁) and a₂ = (μ₂, ν₂) be two IFNs. The dissimilarity d (a₁, a₂) between them is defined by: $d (a_{1}, a_{2}) = \frac{1}{2} (| μ_{1} - μ_{2} | + | ν_{1} - ν_{2} |) .$ (3) Clearly, 0 ≤ d (a₁, a₂) ≤ 1 by (3). From (3), the greater d (a₁, a₂) is, the more a₁ and a₂ diverge. In particular, d (a₁, a₂) = 1 arises only in the two extreme cases a₁ = (1, 0), a₂ = (0, 1) or a₁ = (0, 1), a₂ = (1, 0), which represent complete divergence, whereas d (a₁, a₂) = 0 means a₁ = a₂. Both cases agree with human cognition.
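As a small illustration of Definitions 1 and 2, the following Python sketch evaluates both quantities for IFNs stored as (μ, ν) tuples. The function names are hypothetical, and the hesitation degree is derived as π = 1 - μ - ν.

```python
def hesitation(a):
    """Hesitation degree pi = 1 - mu - nu of an IFN a = (mu, nu)."""
    return 1.0 - a[0] - a[1]

def hamming_distance(a1, a2):
    """Normalized Hamming distance l(a1, a2) of Definition 1."""
    return 0.5 * (abs(a1[0] - a2[0]) + abs(a1[1] - a2[1])
                  + abs(hesitation(a1) - hesitation(a2)))

def dissimilarity(a1, a2):
    """Dissimilarity d(a1, a2) of Definition 2 (no hesitation term)."""
    return 0.5 * (abs(a1[0] - a2[0]) + abs(a1[1] - a2[1]))

# Complete divergence: a1 = (1, 0) against a2 = (0, 1).
print(dissimilarity((1.0, 0.0), (0.0, 1.0)))
# Identical IFNs have zero dissimilarity and zero distance.
print(dissimilarity((0.3, 0.5), (0.3, 0.5)), hamming_distance((0.3, 0.5), (0.3, 0.5)))
```

The only difference between the two measures is the hesitation term |π₁ - π₂|, so l (a₁, a₂) ≥ d (a₁, a₂) always holds.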

2.2 Intuitionistic fuzzy similarity matrix

In order to facilitate this work, the concept of an intuitionistic fuzzy similarity matrix (IFSM) is given as follows:

Definition 3. ([46]) Let Z = (z_ij) _m×m be an intuitionistic fuzzy matrix, where z_ij = (μ_ij, ν_ij) is an intuitionistic fuzzy number for all i, j = 1, 2, …, m. Then Z is called an intuitionistic fuzzy similarity matrix if the following properties are satisfied:

$\begin{matrix} (1)\ \text{(Reflexivity)}: z_{ii} = (1, 0) \text{ for any } i = 1, 2, \ldots, m; \\ (2)\ \text{(Symmetry)}: z_{ij} = z_{ji}, \text{ i.e., } μ_{ij} = μ_{ji},\ ν_{ij} = ν_{ji} \text{ for any } i, j = 1, 2, \ldots, m . \end{matrix}$

2.3 Existing approaches to IFSM construction

This section mainly reviews some classical approaches to IFSM construction based on IFSs [27, 33–37]. For a multi-attribute decision making problem, assume that there is a discrete set of alternatives (or objects) denoted by A = {A₁, A₂, …, A_m} and a discrete set of attributes denoted by X = {x₁, x₂, …, x_n}. The characteristic of each alternative (object) on the attributes x_k (k = 1, 2, …, n) is expressed as: $A_{i} = \{(x_{k}, μ_{A_{i}} (x_{k}), ν_{A_{i}} (x_{k})) \mid x_{k} \in X\}, \quad i = 1, 2, \ldots, m,$ where π_{A_i} (x_k) = 1 - μ_{A_i} (x_k) - ν_{A_i} (x_k) is the uncertainty of x_k to A_i (1 ≤ i ≤ m). We can then construct the corresponding IFSM Z = (z_ij) _m×m, where z_ij = (μ_ij, ν_ij) = (μ_R (A_i, A_j), ν_R (A_i, A_j)) is an intuitionistic fuzzy similarity degree (1 ≤ i, j ≤ m).

Existing methods for IFSM construction from recent years are recalled below:

(1) (Zhang et al., 2007 [27]) $\begin{aligned} z_{ij} = \Big( & 1 - \big(\max_{x_{k} \in X} \{w_{1} | μ_{A_{i}} (x_{k}) - μ_{A_{j}} (x_{k}) |^{p} + w_{2} | ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k}) |^{p} + w_{3} | π_{A_{i}} (x_{k}) - π_{A_{j}} (x_{k}) |^{p}\}\big)^{\frac{1}{p}}, \\ & \big(\min_{x_{k} \in X} \{w_{1} | μ_{A_{i}} (x_{k}) - μ_{A_{j}} (x_{k}) |^{p} + w_{2} | ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k}) |^{p} + w_{3} | π_{A_{i}} (x_{k}) - π_{A_{j}} (x_{k}) |^{p}\}\big)^{\frac{1}{p}} \Big) . \end{aligned}$ (4)

(2) (Wang and Xu, 2011 [33]) $z_{ij} = \Big(1 - \frac{\sum_{k = 1}^{n} | ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k}) |}{n} - \frac{\sum_{k = 1}^{n} | π_{A_{i}} (x_{k}) - π_{A_{j}} (x_{k}) |}{n},\ \frac{\sum_{k = 1}^{n} | ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k}) |}{n}\Big) .$ (5)

(3) (Viattchenin, 2012 [34])

$z_{ij} = \Big(1 - \frac{\sum_{k = 1}^{n} \max \{| ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k}) |, | π_{A_{i}} (x_{k}) - π_{A_{j}} (x_{k}) |\}}{n},\ \frac{\sum_{k = 1}^{n} \max \{| ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k}) |\}}{n}\Big) .$ (6)

(4) (Feng et al., 2014 [35])

$\begin{matrix} z_{ij} = (\frac{1}{n} \sum_{k = 1}^{n} 1 - ((μ_{A_{i}} (x_{k}) - μ_{A_{j}} (x_{k}))^{2} + (ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k}))^{2})^{\frac{1}{2}}, \end{matrix}$ (7)

(5) (Wang et al., 2014 [36])

$\begin{aligned} z_{ij} &= \min_{1 \leq k \leq n} (μ_{ij}^{k}, ν_{ij}^{k}), \text{ where} \\ μ_{ij}^{k} &= \min \{\min \{1, 1 - μ_{A_{i}} (x_{k}) + μ_{A_{j}} (x_{k}), 1 - ν_{A_{j}} (x_{k}) + ν_{A_{i}} (x_{k})\}, \\ &\qquad\ \ \min \{1, 1 - μ_{A_{j}} (x_{k}) + μ_{A_{i}} (x_{k}), 1 - ν_{A_{i}} (x_{k}) + ν_{A_{j}} (x_{k})\}\}, \\ ν_{ij}^{k} &= \max \{\max \{0, \min \{μ_{A_{i}} (x_{k}) - μ_{A_{j}} (x_{k}), ν_{A_{j}} (x_{k}) - ν_{A_{i}} (x_{k})\}\}, \\ &\qquad\ \ \max \{0, \min \{μ_{A_{j}} (x_{k}) - μ_{A_{i}} (x_{k}), ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k})\}\}\} . \end{aligned}$ (8)

(6) (Li et al., 2014 [43])

$\begin{aligned} z_{ij} &= \big(\min \{L (A_{i}, A_{j}), H (A_{i}, A_{j})\},\ 1 - \max \{L (A_{i}, A_{j}), H (A_{i}, A_{j})\}\big), \text{ where} \\ L (A_{i}, A_{j}) &= \frac{\sum_{k = 1}^{n} w_{k} \min \{μ_{A_{i}} (x_{k}), μ_{A_{j}} (x_{k})\}}{\sum_{k = 1}^{n} w_{k} \max \{μ_{A_{i}} (x_{k}), μ_{A_{j}} (x_{k})\}}, \\ H (A_{i}, A_{j}) &= \frac{\sum_{k = 1}^{n} w_{k} \min \{1 - μ_{A_{i}} (x_{k}), 1 - μ_{A_{j}} (x_{k})\}}{\sum_{k = 1}^{n} w_{k} \max \{1 - μ_{A_{i}} (x_{k}), 1 - μ_{A_{j}} (x_{k})\}}, \end{aligned}$ (9)

and w_k is the weight of the element x_k ∈ X in the universe of discourse X.

(7) (Kacprzyk et al., 2016 [37])

$\begin{matrix} z_{ij} = (1 - \frac{\sum_{k = 1}^{n} \sqrt{(ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k}))^{2} + (π_{A_{i}} (x_{k}) - π_{A_{j}} (x_{k}))^{2}}}{2 n}, \\ \frac{\sum_{k = 1}^{n} \sqrt{(ν_{A_{i}} (x_{k}) - ν_{A_{j}} (x_{k}))^{2}}}{2 n}) . \end{matrix}$ (10)

Reviewing the previous approaches to IFSM construction, we find that unreasonable results may occur in some cases, e.g., with Kacprzyk et al.’s approach. Take the IFSs A₁ = {(x, 0.1, 0.9) |x ∈ X}, A₂ = {(x, 0.5, 0.3) |x ∈ X} and A₃ = {(x, 0.6, 0.3) |x ∈ X}, where X = {x₁} is the domain of discourse. Clearly A₁ ⊂ A₂ ⊂ A₃. By (10), we obtain z₁₂ = (0.6838, 0.3) and z₁₃ = (0.6959, 0.3). That is, z₁₂ ⊂ z₁₃, which indicates that the similarity between A₁ and A₂ is lower than that between A₁ and A₃ even though A₁ ⊂ A₂ ⊂ A₃; this does not agree with human intuition. The same phenomenon may occur in other approaches ([27, 36]). As pointed out in [43], these approaches lead to “counterintuitive results” because the fourth property of the intuitionistic fuzzy similarity degree (IFSD) is not satisfied in some cases. In the next section we propose an approach to constructing an intuitionistic fuzzy similarity matrix based on a new IFSD that satisfies properties (R1)-(R4) and can therefore overcome these shortcomings.
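The counterexample can be reproduced numerically. The sketch below is a direct transcription of (10) into Python under the assumption that an IFS is a list of (μ, ν) pairs; the function name `kacprzyk_degree` is hypothetical.

```python
import math

def kacprzyk_degree(A, B):
    """Entry z_ij of Eq. (10) for two IFSs given as lists of (mu, nu)."""
    n = len(A)
    first = second = 0.0
    for (ma, na), (mb, nb) in zip(A, B):
        pa, pb = 1 - ma - na, 1 - mb - nb  # hesitation degrees
        first += math.sqrt((na - nb) ** 2 + (pa - pb) ** 2)
        second += math.sqrt((na - nb) ** 2)
    return (1 - first / (2 * n), second / (2 * n))

# The three IFSs of the example on the one-element universe X = {x1}.
A1, A2, A3 = [(0.1, 0.9)], [(0.5, 0.3)], [(0.6, 0.3)]
z12 = kacprzyk_degree(A1, A2)
z13 = kacprzyk_degree(A1, A3)
print([round(v, 4) for v in z12])
print([round(v, 4) for v in z13])
```

Rounded to four decimals, the first components are 0.6838 and 0.6959, so the membership part of z₁₃ exceeds that of z₁₂ although A₂ lies between A₁ and A₃.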

3 An approach to constructing an intuitionistic fuzzy similarity matrix

In this section, we propose an approach to constructing an intuitionistic fuzzy similarity matrix. A similarity measure between intuitionistic fuzzy numbers (IFNs) is first given as:

Definition 4. Let a₁ = (μ₁, ν₁) and a₂ = (μ₂, ν₂) be two IFNs. A similarity measure s (a₁, a₂) between a₁ and a₂ is defined as: $s (a_{1}, a_{2}) = 1 - \frac{1}{2} (| μ_{1} - μ_{2} | + | ν_{1} - ν_{2} | + | π_{1} - π_{2} |),$ (11) where π_i = 1 - μ_i - ν_i denotes the hesitation degree of the IFN a_i for any i = 1, 2.

From (11), we know that s (a₁, a₂) is a real number between 0 and 1 (see Property 1). The larger (smaller) the value of s (a₁, a₂), the higher (lower) the similarity between a₁ and a₂; if s (a₁, a₂) = 1, the similarity reaches its maximum, namely a₁ = a₂, which means the two IFNs are identical.

As is analyzed above, some properties of the similarity of IFNs are further obtained.

Property 1. Let a₁ = (μ₁, ν₁) and a₂ = (μ₂, ν₂) be two IFNs. Then

$\begin{matrix} (1) 0 \leq s (a_{1}, a_{2}) \leq 1; \\ (2) s (a_{1}, a_{2}) = s (a_{2}, a_{1}); \\ (3) s (a_{1}, a_{2}) = s (a_{1}^{c}, a_{2}^{c}); \\ (4) s (a_{1}, a_{2}) = 1 \Leftrightarrow a_{1} = a_{2} . \end{matrix}$

Proof. (1) Let us first prove that 0 ≤ s (a₁, a₂) ≤ 1. (i) The conclusion is clear by (11) if π₁ = π₂. (ii) When π₁ ≠ π₂: (a) if μ₁ = μ₂ and ν₁ = ν₂, then 0 ≤ s (a₁, a₂) ≤ 1 holds; (b) if μ₁ = μ₂ and ν₁ ≠ ν₂, we obtain 0 ≤ s (a₁, a₂) ≤ 1; (c) if μ₁ ≠ μ₂ and ν₁ = ν₂, it yields 0 ≤ s (a₁, a₂) ≤ 1; (d) if μ₁ ≠ μ₂ and ν₁ ≠ ν₂, then 0 ≤ s (a₁, a₂) ≤ 1 follows in each of the following four cases: ① μ₁ > μ₂ and ν₁ > ν₂; ② μ₁ > μ₂ and ν₁ < ν₂; ③ μ₁ < μ₂ and ν₁ > ν₂; ④ μ₁ < μ₂ and ν₁ < ν₂, which correspond to cases (i), (ii), (iii) and (iv) in Fig. 1, respectively.

Without loss of generality, we only prove 0 ≤ s (a₁, a₂) ≤ 1 in the first case; the other cases are similar. It is well known that an IFN can be represented in two-dimensional coordinates [47], as shown in case (i) of Fig. 1. The membership degree (μ), non-membership degree (ν) and hesitation degree (π) are represented by a point inside the triangle FOE, so the points D and C are D = (μ₁, ν₁, π₁) and C = (μ₂, ν₂, π₂), respectively. For convenience, let h = |μ₁ - μ₂|, l₁ = |ν₁ - ν₂| and l₂ = |π₁ - π₂|, and let S be the area of the right trapezoid CHDG, so that |DH| = l₁, |GC| = l₂ and |CH| = h. By the trapezoid area formula, $S = \frac{1}{2} (l_{1} + l_{2}) h$, where l₂ = h + l₁ with l₁, l₂, h ∈ [0, 1], which gives $\frac{2 S}{h} = l_{1} + l_{2}$. Analogously, $S = S_{PCH} + S_{PHDG} = \frac{1}{2} h^{2} + l_{1} h$, namely $\frac{2 S}{h} = h + 2 l_{1}$, which yields l₁ + l₂ = h + 2l₁. So we obtain $s (a_{1}, a_{2}) = 1 - \frac{1}{2} (h + l_{1} + l_{2}) = 1 - (h + l_{1}) = 1 - l_{2} \geq 0$, and hence 0 ≤ s (a₁, a₂) ≤ 1. In addition, (2), (3) and (4) are immediate from (11).

Fig. 1. Geometrical representation of IFNs.

From Property 1, we can see that the definition of the similarity measure of IFNs coincides with human’s cognition. Using Definition 4 and Definition 2, we extend the similarity and dissimilarity measures of IFNs to the IFSs and give some new definitions.

Definition 5. Let X = {x₁, x₂, …, x_n} be a universe of discourse and A = {(x_k, μ_A (x_k), ν_A (x_k)) |x_k ∈ X}, B = {(x_k, μ_B (x_k), ν_B (x_k)) |x_k ∈ X} be two IFSs. The similarity measure S (A, B) between A and B is given by: $S (A, B) = \frac{1}{n} \sum_{k = 1}^{n} \Big(1 - \frac{1}{2} \big(| μ_{A} (x_{k}) - μ_{B} (x_{k}) | + | ν_{A} (x_{k}) - ν_{B} (x_{k}) | + | π_{A} (x_{k}) - π_{B} (x_{k}) |\big)\Big),$ (12) where π_A (x_k) and π_B (x_k) are the hesitation degrees of x_k to A and to B, respectively.

Property 2. Based on (12), we have: $\begin{matrix} (1)\ 0 \leq S (A, B) \leq 1; \\ (2)\ S (A, B) = 1 \Leftrightarrow A = B; \\ (3)\ S (A, B) = S (B, A) \text{ and } S (A, B) = S (A^{c}, B^{c}); \\ (4)\ \text{If } A \subseteq B \subseteq C, \text{ then } S (A, C) \leq \min \{S (A, B), S (B, C)\} . \end{matrix}$

Proof. (1), (2) and (3) are obvious. (4) Let A, B and C be any three IFSs such that A ⊆ B ⊆ C. Then for any x_k ∈ X, it follows that μ_A (x_k) ≤ μ_B (x_k) ≤ μ_C (x_k) and ν_A (x_k) ≥ ν_B (x_k) ≥ ν_C (x_k). Let $\begin{aligned} Q (A, C) (x_{k}) &= | μ_{A} (x_{k}) - μ_{C} (x_{k}) | + | ν_{A} (x_{k}) - ν_{C} (x_{k}) | + | π_{A} (x_{k}) - π_{C} (x_{k}) |, \\ Q (A, B) (x_{k}) &= | μ_{A} (x_{k}) - μ_{B} (x_{k}) | + | ν_{A} (x_{k}) - ν_{B} (x_{k}) | + | π_{A} (x_{k}) - π_{B} (x_{k}) | . \end{aligned}$ In what follows, we prove Q (A, C) (x_k) ≥ Q (A, B) (x_k) for all x_k ∈ X. For any π_A (x_k), π_B (x_k), π_C (x_k) ∈ [0, 1], where x_k ∈ X, there are six possible orderings among them: (a) π_A (x_k) ≤ π_B (x_k) ≤ π_C (x_k); (b) π_A (x_k) ≤ π_C (x_k) < π_B (x_k); (c) π_B (x_k) < π_A (x_k) ≤ π_C (x_k); (d) π_B (x_k) ≤ π_C (x_k) < π_A (x_k); (e) π_C (x_k) < π_A (x_k) ≤ π_B (x_k) and (f) π_C (x_k) < π_B (x_k) < π_A (x_k).

Case a: If π_A (x_k) ≤ π_B (x_k) ≤ π_C (x_k) for all x_k ∈ X, it follows Q (A, C) (x_k) =2 (ν_A (x_k) - ν_C (x_k)) and Q (A, B) (x_k) =2 (ν_A (x_k) - ν_B (x_k)), which yields Q (A, C) (x_k) ≥ Q (A, B) (x_k). Also we can obtain the same results under Case (b), Case (d) and Case (f) respectively by the similar method.

Case c: If π_B (x_k) < π_A (x_k) ≤ π_C (x_k) for all x_k ∈ X, then Q (A, C) (x_k) = 2 (ν_A (x_k) - ν_C (x_k)) and Q (A, B) (x_k) = 2 (μ_B (x_k) - μ_A (x_k)) hold. Suppose Q (A, C) (x_k) < Q (A, B) (x_k), i.e., ν_A (x_k) - ν_C (x_k) < μ_B (x_k) - μ_A (x_k). Then μ_A (x_k) + ν_A (x_k) < μ_B (x_k) + ν_C (x_k) ≤ μ_C (x_k) + ν_C (x_k), that is, 1 - π_A (x_k) < 1 - π_C (x_k). Hence π_A (x_k) > π_C (x_k), which contradicts π_A (x_k) ≤ π_C (x_k). Therefore Q (A, C) (x_k) ≥ Q (A, B) (x_k) holds. The same result is obtained in Case (e) by a similar argument.

From the discussion above, Q (A, C) (x_k) ≥ Q (A, B) (x_k) for all x_k ∈ X, which gives $S (A, C) = \frac{1}{n} \sum_{k = 1}^{n} \Big(1 - \frac{Q (A, C) (x_{k})}{2}\Big) \leq \frac{1}{n} \sum_{k = 1}^{n} \Big(1 - \frac{Q (A, B) (x_{k})}{2}\Big) = S (A, B) .$ Similarly, S (A, C) ≤ S (B, C). Therefore we obtain S (A, C) ≤ min {S (A, B), S (B, C)}.
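Property 2 (4) can also be checked empirically. The following Python sketch implements (12) and tests the inequality on randomly generated chains A ⊆ B ⊆ C. The sampling scheme, the helper names and the tolerance `1e-12` are illustrative assumptions; a random test supports the proof but does not replace it.

```python
import random

def similarity(A, B):
    """Similarity measure S(A, B) of Eq. (12); IFSs are lists of (mu, nu)."""
    total = 0.0
    for (ma, na), (mb, nb) in zip(A, B):
        pa, pb = 1 - ma - na, 1 - mb - nb
        total += 1 - 0.5 * (abs(ma - mb) + abs(na - nb) + abs(pa - pb))
    return total / len(A)

def random_chain(n, rng):
    """Sample IFSs with A ⊆ B ⊆ C and mu + nu <= 1 for every element."""
    A, B, C = [], [], []
    for _ in range(n):
        mu_c = rng.random()
        nu_c = rng.uniform(0, 1 - mu_c)
        mu_b = rng.uniform(0, mu_c)
        nu_b = rng.uniform(nu_c, 1 - mu_b)
        mu_a = rng.uniform(0, mu_b)
        nu_a = rng.uniform(nu_b, 1 - mu_a)
        A.append((mu_a, nu_a)); B.append((mu_b, nu_b)); C.append((mu_c, nu_c))
    return A, B, C

def check_property4(trials, seed):
    """Return True if S(A, C) <= min{S(A, B), S(B, C)} in every trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        A, B, C = random_chain(4, rng)
        if similarity(A, C) > min(similarity(A, B), similarity(B, C)) + 1e-12:
            return False
    return True

print(check_property4(1000, seed=0))
```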

In Property 2, we note that S (A, B) satisfies (S1)-(S4). Whence, we have the corresponding definition of divergence measure.

Definition 6. Given two IFSs A and B in X. The divergence measure D (A, B) between A and B is defined as:

$D (A, B) = \frac{1}{n} \sum_{k = 1}^{n} \Big(\frac{1}{2} \big(| μ_{A} (x_{k}) - μ_{B} (x_{k}) | + | ν_{A} (x_{k}) - ν_{B} (x_{k}) |\big)\Big) .$ (13)

Similarly, as a divergence measure of IFSs, this measure should meet (D1)-(D4). Before verifying them, we first obtain the following Lemma.

Lemma 1. For any a, b, c ∈ [0, 1],

$| \min \{a, c\} - \min \{b, c\} | \leq | a - b | \quad \text{and} \quad | \max \{a, c\} - \max \{b, c\} | \leq | a - b | .$ (14)

Proof. Since $\min \{a, c\} = \frac{a + c - | a - c |}{2}$ and $\big| | a - c | - | b - c | \big| \leq | a - b |$, we have:

$| \min \{a, c\} - \min \{b, c\} | = \frac{| a - b + | b - c | - | a - c | |}{2} \leq \frac{| a - b | + \big| | b - c | - | a - c | \big|}{2} \leq | a - b | .$ The inequality |max {a, c} - max {b, c}| ≤ |a - b| follows in the same way from max {a, c} = (a + c + |a - c|)/2. Therefore the conclusions hold.
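As a quick numerical sanity check of Lemma 1, the sketch below evaluates both inequalities of (14) on a grid of values in [0, 1]. The grid step 0.1 and the floating-point tolerance are arbitrary choices made for illustration.

```python
from itertools import product

def lemma1_holds(a, b, c, tol=1e-12):
    """Check both inequalities of Eq. (14) for one triple (a, b, c)."""
    bound = abs(a - b) + tol
    return (abs(min(a, c) - min(b, c)) <= bound
            and abs(max(a, c) - max(b, c)) <= bound)

# Exhaustive check on the grid {0, 0.1, ..., 1}^3.
grid = [i / 10 for i in range(11)]
print(all(lemma1_holds(a, b, c) for a, b, c in product(grid, repeat=3)))
```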

On the basis of Lemma 1, the following property is further obtained:

Property 3. Based on (13), we have $\begin{matrix} (1)\ 0 \leq D (A, B) \leq 1; \\ (2)\ D (A, B) = 0 \Leftrightarrow A = B; \\ (3)\ D (A, B) = D (B, A),\ D (A, A) = 0 \text{ and } D (A, B) = D (A^{c}, B^{c}); \\ (4)\ D (A \cap C, B \cap C) \leq D (A, B) \text{ and } D (A \cup C, B \cup C) \leq D (A, B); \\ (5)\ \text{If } A \subseteq B \subseteq C, \text{ then } D (A, C) \geq \max \{D (A, B), D (B, C)\}; \\ (6)\ \text{If } A \subseteq B \subseteq C \subseteq D, \text{ then } D (A, D) \geq D (B, C); \\ (7)\ D (A \cap B, B) = D (A, A \cup B) \text{ and } D (A \cap B, B) \leq D (A, B); \\ (8)\ D (A, A \cup B) \leq D (A \cap B, A \cup B) . \end{matrix}$ Proof. (1), (2) and (3) hold clearly.

(4) Recall that $A \cap C = \{(x_{k}, \min \{μ_{A} (x_{k}), μ_{C} (x_{k})\}, \max \{ν_{A} (x_{k}), ν_{C} (x_{k})\}) \mid x_{k} \in X\}$ and $B \cap C = \{(x_{k}, \min \{μ_{B} (x_{k}), μ_{C} (x_{k})\}, \max \{ν_{B} (x_{k}), ν_{C} (x_{k})\}) \mid x_{k} \in X\}$. According to Lemma 1, $\begin{aligned} D (A \cap C, B \cap C) &= \frac{1}{n} \sum_{k = 1}^{n} \Big(\frac{1}{2} \big(| μ_{A \cap C} (x_{k}) - μ_{B \cap C} (x_{k}) | + | ν_{A \cap C} (x_{k}) - ν_{B \cap C} (x_{k}) |\big)\Big) \\ &= \frac{1}{n} \sum_{k = 1}^{n} \Big(\frac{1}{2} | \min \{μ_{A} (x_{k}), μ_{C} (x_{k})\} - \min \{μ_{B} (x_{k}), μ_{C} (x_{k})\} |\Big) \\ &\quad + \frac{1}{n} \sum_{k = 1}^{n} \Big(\frac{1}{2} | \max \{ν_{A} (x_{k}), ν_{C} (x_{k})\} - \max \{ν_{B} (x_{k}), ν_{C} (x_{k})\} |\Big) \\ &\leq \frac{1}{n} \sum_{k = 1}^{n} \Big(\frac{1}{2} \big(| μ_{A} (x_{k}) - μ_{B} (x_{k}) | + | ν_{A} (x_{k}) - ν_{B} (x_{k}) |\big)\Big) \\ &= D (A, B), \end{aligned}$ which gives D (A ∩ C, B ∩ C) ≤ D (A, B); the inequality D (A ∪ C, B ∪ C) ≤ D (A, B) is obtained in the same way.

(5) Let A, B and C be three IFSs such that A ⊆ B ⊆ C. Then μ_A (x_k) ≤ μ_B (x_k) ≤ μ_C (x_k) and ν_A (x_k) ≥ ν_B (x_k) ≥ ν_C (x_k) for all x_k ∈ X. Thus, $\begin{aligned} D (A, B) &= \frac{1}{n} \sum_{k = 1}^{n} \Big(\frac{1}{2} \big(| μ_{A} (x_{k}) - μ_{B} (x_{k}) | + | ν_{A} (x_{k}) - ν_{B} (x_{k}) |\big)\Big) \\ &\leq \frac{1}{n} \sum_{k = 1}^{n} \Big(\frac{1}{2} \big(| μ_{A} (x_{k}) - μ_{C} (x_{k}) | + | ν_{A} (x_{k}) - ν_{C} (x_{k}) |\big)\Big) \\ &= D (A, C) . \end{aligned}$ Similarly, we obtain D (B, C) ≤ D (A, C), and hence D (A, C) ≥ max {D (A, B), D (B, C)}. In addition, according to (4), properties (6), (7) and (8) hold.

Property 3 shows that the new divergence measure of IFSs meets (D1)-(D4). Based on the similarity and divergence measures, some relations between the two measures are revealed as follows:

Theorem 1. Let A and B be two IFSs on X. Assume S (A, B) and D (A, B) are the similarity and divergence measures of A and B, respectively. Then we have:

$\begin{matrix} (1)\ 0 \leq S (A, B) + D (A, B) \leq 1; \\ (2)\ S (A, B) = S (B, A) \text{ and } D (A, B) = D (B, A); \\ (3)\ \text{If } S (A, B) = 1, \text{ then } D (A, B) = 0, \text{ and vice versa}; \\ (4)\ A = B \Leftrightarrow S (A, B) = 1 \text{ and } D (A, B) = 0 . \end{matrix}$ Proof. (1) We have $\begin{aligned} S (A, B) + D (A, B) &= \frac{1}{n} \sum_{k = 1}^{n} \Big(1 - \frac{1}{2} \big(| μ_{A} (x_{k}) - μ_{B} (x_{k}) | + | ν_{A} (x_{k}) - ν_{B} (x_{k}) | + | π_{A} (x_{k}) - π_{B} (x_{k}) |\big)\Big) \\ &\quad + \frac{1}{n} \sum_{k = 1}^{n} \Big(\frac{1}{2} \big(| μ_{A} (x_{k}) - μ_{B} (x_{k}) | + | ν_{A} (x_{k}) - ν_{B} (x_{k}) |\big)\Big) \\ &= \frac{1}{n} \sum_{k = 1}^{n} \Big(1 - \frac{1}{2} | π_{A} (x_{k}) - π_{B} (x_{k}) |\Big) \leq 1 . \end{aligned}$ Also S (A, B) + D (A, B) ≥ 0. Thus 0 ≤ S (A, B) + D (A, B) ≤ 1. It is clear that (2), (3) and (4) hold.
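The identity used in the proof of Theorem 1 (1), namely that S (A, B) + D (A, B) equals the mean of 1 - ½|π_A (x_k) - π_B (x_k)|, can be illustrated with a short Python sketch. The two IFSs are hypothetical sample data and the function names are assumptions made for this example.

```python
import math

def similarity(A, B):
    """S(A, B) of Eq. (12); IFSs are lists of (mu, nu) pairs."""
    return sum(1 - 0.5 * (abs(ma - mb) + abs(na - nb)
                          + abs((1 - ma - na) - (1 - mb - nb)))
               for (ma, na), (mb, nb) in zip(A, B)) / len(A)

def divergence(A, B):
    """D(A, B) of Eq. (13)."""
    return sum(0.5 * (abs(ma - mb) + abs(na - nb))
               for (ma, na), (mb, nb) in zip(A, B)) / len(A)

def hesitation_term(A, B):
    """Mean of 1 - |pi_A - pi_B| / 2, the closed form of S + D."""
    return sum(1 - 0.5 * abs((1 - ma - na) - (1 - mb - nb))
               for (ma, na), (mb, nb) in zip(A, B)) / len(A)

# Hypothetical IFSs on a two-element universe.
A = [(0.1, 0.9), (0.4, 0.5)]
B = [(0.5, 0.3), (0.6, 0.2)]
S, D = similarity(A, B), divergence(A, B)
print(math.isclose(S + D, hesitation_term(A, B)))
print(0 <= S + D <= 1)
```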

Theorem 1 characterizes the relations between the similarity and divergence measures of IFSs; these relations reflect the practical semantics of similarity and divergence and also agree with human cognition. In what follows we construct a tuple called the closeness degree of IFSs.

Definition 7. Let A and B be two IFSs on X, and R (A, B) be a binary relation on X × X. We call R (A, B) a closeness degree between A and B if $R (A, B) = (S_{R} (A, B), D_{R} (A, B)),$ (15) where S_R (A, B) and D_R (A, B) are the similarity and divergence measures between A and B under the intuitionistic fuzzy relation R, respectively.

Theorem 2. The function R (A, B) in Definition 7 is an intuitionistic fuzzy similarity degree.

Proof. We verify (R1)-(R4) in turn. (1) We first prove that R (A, B) is an IFN. By Property 2, Property 3 and Theorem 1, we have 0 ≤ S_R (A, B) ≤ 1, 0 ≤ D_R (A, B) ≤ 1 and 0 ≤ S_R (A, B) + D_R (A, B) ≤ 1, respectively. Therefore R (A, B) is an intuitionistic fuzzy number. (2) R (A, B) = (1, 0) ⇔ A = B is immediate from Theorem 1. (3) Obviously, S_R (A, B) = S_R (B, A) and D_R (A, B) = D_R (B, A). Thus R (A, B) = R (B, A). (4) If A ⊆ B ⊆ C, then S_R (A, C) ≤ S_R (A, B) and S_R (A, C) ≤ S_R (B, C) hold by Property 2. Also, D_R (A, C) ≥ D_R (A, B) and D_R (A, C) ≥ D_R (B, C) by Property 3; thus R (A, C) ⊆ R (A, B) and R (A, C) ⊆ R (B, C).

The new definition of intuitionistic fuzzy similarity degree can overcome the “counterintuitive results” that arise in some cases. For example, returning to the question in Section 2, let A₁ = {(x, 0.1, 0.9) |x ∈ X}, A₂ = {(x, 0.5, 0.3) |x ∈ X} and A₃ = {(x, 0.6, 0.3) |x ∈ X}, where X = {x₁}. We have A₁ ⊂ A₂ ⊂ A₃. Based on (15), it follows that R (A₁, A₂) = (0.4, 0.5) and R (A₁, A₃) = (0.4, 0.55). From the semantic viewpoint of both degrees, the similarity degree between A₁ and A₂ is the same as that between A₁ and A₃, while the divergence degree between A₁ and A₂ is lower than that between A₁ and A₃. This leads to R (A₁, A₂) ⊃ R (A₁, A₃), which indicates that the closeness degree between A₁ and A₂ is larger than that between A₁ and A₃ under the condition A₁ ⊂ A₂ ⊂ A₃, which is consistent with human intuition.
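The two closeness degrees quoted in this example can be recomputed with a few lines of Python. This is only a sketch that evaluates the similarity (12) and divergence (13) for IFSs stored as lists of (μ, ν) pairs; the function name `closeness` is hypothetical.

```python
def closeness(A, B):
    """Closeness degree R(A, B) = (S(A, B), D(A, B)) from Eqs. (12)-(13)."""
    n = len(A)
    s = d = 0.0
    for (ma, na), (mb, nb) in zip(A, B):
        dm, dn = abs(ma - mb), abs(na - nb)
        dp = abs((1 - ma - na) - (1 - mb - nb))  # hesitation difference
        s += 1 - 0.5 * (dm + dn + dp)
        d += 0.5 * (dm + dn)
    return (s / n, d / n)

A1, A2, A3 = [(0.1, 0.9)], [(0.5, 0.3)], [(0.6, 0.3)]
R12 = closeness(A1, A2)
R13 = closeness(A1, A3)
print(tuple(round(v, 4) for v in R12))
print(tuple(round(v, 4) for v in R13))
```

After rounding, the similarity components coincide while the divergence component of R (A₁, A₃) is the larger one, which matches the inclusion R (A₁, A₂) ⊃ R (A₁, A₃) stated above.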

Definition 8. Let A = {A₁, A₂, …, A_m} be a set of m (m ≥ 2) IFSs and R be a binary relation. Let R (A_i, A_j) = (S_R (A_i, A_j), D_R (A_i, A_j)) = (S_{R_ij}, D_{R_ij}) be an intuitionistic fuzzy similarity degree between A_i and A_j (1 ≤ i, j ≤ m). An intuitionistic fuzzy similarity matrix Z = (z_ij) _m×m is defined as: $Z = \begin{bmatrix} (S_{R_{11}}, D_{R_{11}}) & (S_{R_{12}}, D_{R_{12}}) & \dots & (S_{R_{1 m}}, D_{R_{1 m}}) \\ (S_{R_{21}}, D_{R_{21}}) & (S_{R_{22}}, D_{R_{22}}) & \dots & (S_{R_{2 m}}, D_{R_{2 m}}) \\ \vdots & \vdots & \ddots & \vdots \\ (S_{R_{m 1}}, D_{R_{m 1}}) & (S_{R_{m 2}}, D_{R_{m 2}}) & \dots & (S_{R_{mm}}, D_{R_{mm}}) \end{bmatrix} .$ From Definition 8, S_{R_ij} can be understood as the degree to which A_i is similar to A_j, and D_{R_ij} as the degree to which A_i is dissimilar to A_j. Furthermore, we call π_{R_ij} = 1 - S_{R_ij} - D_{R_ij} (i, j = 1, 2, …, m) the uncertainty of the similarity between A_i and A_j. Compared with (4)-(10), the proposed IFSM entry z_ij uses the defined similarity and divergence measures of IFSs as its membership and non-membership degrees, respectively; this gives it a practical meaning that the constructions in (4)-(10) may lack. As for computational effort, the proposed method takes less computational time than some existing methods, which is illustrated in detail in Example 2.
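The construction in Definition 8 can be sketched in Python as follows. The code assumes each IFS is a list of (μ, ν) pairs over the same universe; the function names and the three sample alternatives are hypothetical and serve only to show that the resulting matrix is reflexive and symmetric in the sense of Definition 3.

```python
def closeness(A, B):
    """(S(A, B), D(A, B)) computed from Eqs. (12) and (13)."""
    n = len(A)
    s = d = 0.0
    for (ma, na), (mb, nb) in zip(A, B):
        dm, dn = abs(ma - mb), abs(na - nb)
        dp = abs((1 - ma - na) - (1 - mb - nb))
        s += 1 - 0.5 * (dm + dn + dp)
        d += 0.5 * (dm + dn)
    return (s / n, d / n)

def build_ifsm(sets):
    """m x m matrix whose (i, j) entry is the closeness degree of A_i and A_j."""
    return [[closeness(a, b) for b in sets] for a in sets]

# Hypothetical alternatives described on two attributes.
alternatives = [
    [(0.1, 0.9), (0.4, 0.5)],
    [(0.5, 0.3), (0.6, 0.2)],
    [(0.6, 0.3), (0.7, 0.1)],
]
Z = build_ifsm(alternatives)
m = len(Z)
print(all(Z[i][i] == (1.0, 0.0) for i in range(m)))                  # reflexivity
print(all(Z[i][j] == Z[j][i] for i in range(m) for j in range(m)))   # symmetry
```

Because the matrix is symmetric, only the entries on or below the diagonal need to be computed in practice; the sketch computes all of them for simplicity.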

Definition 9. Let A = {A₁, A₂, …, A_m} be a set of m (m ≥ 2) IFSs, R be an intuitionistic fuzzy similarity degree and Z = ((S_{R_ij}, D_{R_ij})) _m×m be an intuitionistic fuzzy similarity matrix. For any α, β ∈ [0, 1] with α + β ≤ 1, the (α, β)-level cut-set $R_{α}^{β}$ of R is defined as: $R_{α}^{β} = \{(A_{i}, A_{j}) \in A \times A \mid S_{R_{ij}} \geq α \land D_{R_{ij}} \leq β\}, \quad i, j = 1, 2, \ldots, m,$ (16) i.e., $R_{α}^{β} (A_{i}) = \{A_{j} \in A \mid (A_{i}, A_{j}) \in R_{α}^{β}\} .$ We further call R_α = {(A_i, A_j) ∈ A × A | S_{R_ij} ≥ α} and R^β = {(A_i, A_j) ∈ A × A | D_{R_ij} ≤ β} the α-level cut-set on the similarity degree and the β-level cut-set on the divergence degree between IFSs generated by R, respectively.

Property 4. Let R and G be two intuitionistic fuzzy similarity degrees on X, and let α and β be a pair of thresholds satisfying 0 ≤ α + β ≤ 1. Then

$\begin{matrix} (1)\ R_{α}^{β} = R_{α} \cap R^{β}; \\ (2)\ \text{If } 0 \leq α_{1} \leq α_{2} \leq 1 \text{ and } 0 \leq β_{2} \leq β_{1} \leq 1, \text{ then } R_{α_{2}}^{β_{2}} \subseteq R_{α_{1}}^{β_{1}}; \\ (3)\ \text{If } R \subseteq G, \text{ then } R_{α}^{β} \subseteq G_{α}^{β}; \\ (4)\ (R \cap G)_{α} = R_{α} \cap G_{α},\ (R \cap G)^{β} = R^{β} \cap G^{β},\ (R \cap G)_{α}^{β} = R_{α}^{β} \cap G_{α}^{β}; \\ (5)\ (R \cup G)_{α} = R_{α} \cup G_{α},\ (R \cup G)^{β} = R^{β} \cup G^{β},\ (R \cup G)_{α}^{β} \supseteq R_{α}^{β} \cup G_{α}^{β} . \end{matrix}$

Proof. It is not difficult to prove that Property 4 holds based on related definitions.
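To make the cut-sets of Definition 9 concrete, the following Python sketch computes them from a small matrix and checks items (1) and (2) of Property 4 on it. The matrix `Z` holds hypothetical (S, D) values chosen for illustration, alternatives are referred to by their indices, and the function names are assumptions.

```python
def cut(Z, alpha, beta):
    """(alpha, beta)-level cut-set: index pairs with S >= alpha and D <= beta."""
    m = len(Z)
    return {(i, j) for i in range(m) for j in range(m)
            if Z[i][j][0] >= alpha and Z[i][j][1] <= beta}

def alpha_cut(Z, alpha):
    """alpha-level cut-set on the similarity degree."""
    m = len(Z)
    return {(i, j) for i in range(m) for j in range(m) if Z[i][j][0] >= alpha}

def beta_cut(Z, beta):
    """beta-level cut-set on the divergence degree."""
    m = len(Z)
    return {(i, j) for i in range(m) for j in range(m) if Z[i][j][1] <= beta}

def similarity_class(Z, i, alpha, beta):
    """Indices j with (A_i, A_j) in the (alpha, beta)-level cut-set."""
    return {j for (p, j) in cut(Z, alpha, beta) if p == i}

# Hypothetical IFSM for three alternatives.
Z = [
    [(1.0, 0.0), (0.8, 0.1), (0.4, 0.5)],
    [(0.8, 0.1), (1.0, 0.0), (0.5, 0.45)],
    [(0.4, 0.5), (0.5, 0.45), (1.0, 0.0)],
]
print(cut(Z, 0.7, 0.2) == alpha_cut(Z, 0.7) & beta_cut(Z, 0.2))  # Property 4 (1)
print(cut(Z, 0.7, 0.2) <= cut(Z, 0.5, 0.45))                     # Property 4 (2)
print(sorted(similarity_class(Z, 0, 0.7, 0.2)))
```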

In this section, we observe that an intuitionistic fuzzy similarity matrix is well constructed on the basis of the proposed similarity and divergence measure of IFSs, which is of the practical semantics of the similarity and divergence. Based on this matrix, the notion of (α, β)-level cut-sets of an intuitionistic fuzzy similarity degree and its similarity class are given so as to do clustering analysis directly.

4 A method for intuitionistic fuzzy information clustering

This section proposes a novel method for clustering intuitionistic fuzzy information directly, based on the similarity classes under intuitionistic fuzzy (α, β)-level cut-sets. The detailed steps are as follows:

Step 1: For a decision making problem, assume A = {A₁, A₂, …, A_m} is a discrete set of m alternatives to be clustered, and X = {x₁, x₂, …, x_n} is a collection of n evaluation attributes. The description of each alternative A_i with respect to all attributes x_k (k = 1, 2, …, n) is expressed by IFSs as:

$A_{i} = \{(x_{k}, μ_{A_{i}} (x_{k}), ν_{A_{i}} (x_{k})) \mid x_{k} \in X\}, \quad i = 1, 2, \ldots, m;\ k = 1, 2, \ldots, n,$ where μ_{A_i} (x_k) is the membership degree of x_k to the alternative A_i, and ν_{A_i} (x_k) is the non-membership degree of x_k to the alternative A_i. In addition, π_{A_i} (x_k) = 1 - μ_{A_i} (x_k) - ν_{A_i} (x_k) denotes the uncertainty of x_k to A_i.

Step 2: Using (114) and (115), calculate the intuitionistic fuzzy similarity degree z_ij = (S_{R_ij}, D_{R_ij}) (i, j = 1, 2, …, m) between A_i and A_j, where S_{R_ij} and D_{R_ij} are interpreted as the similarity and divergence degrees between them, respectively:

$S_{R_{ij}} = \frac{1}{n} \sum_{k = 1}^{n} \left(1 - \frac{1}{2} \left(| μ_{A_{i}}(x_{k}) - μ_{A_{j}}(x_{k}) | + | ν_{A_{i}}(x_{k}) - ν_{A_{j}}(x_{k}) | + | π_{A_{i}}(x_{k}) - π_{A_{j}}(x_{k}) |\right)\right)$ (17)

and

$D_{R_{ij}} = \frac{1}{n} \sum_{k = 1}^{n} \frac{1}{2} \left(| μ_{A_{i}}(x_{k}) - μ_{A_{j}}(x_{k}) | + | ν_{A_{i}}(x_{k}) - ν_{A_{j}}(x_{k}) |\right).$ (18)

This yields the intuitionistic fuzzy similarity matrix Z = (z_ij)_{m×m}.

Step 3: Delete all elements above the diagonal of the matrix Z = (z_ij)_{m×m} and label each element on the diagonal with the symbol of the corresponding alternative A_i (i = 1, 2, …, m). This yields the corresponding matrix $\tilde{Z}$.

Step 4: To carry out the clustering analysis, first determine the thresholds α and β from $\tilde{Z}$ as follows:

Step 4.1: Rank all membership (similarity) degrees S_{R_ij} (i, j = 1, 2, …, m) of the matrix $\tilde{Z}$ in descending order and append 0. Without loss of generality, denote the resulting sequence by G = {g₁, g₂, …, g_p, g_{p+1}}, where g_k (k = 1, 2, …, p + 1) is non-increasing in k and g_{p+1} = 0. First take α ∈ (g₂, g₁].

Step 4.2: For the α fixed in Step 4.1, rank in ascending order all non-membership (divergence) degrees D_{R_ij} of those entries of $\tilde{Z}$ that satisfy S_{R_ij} ≥ α (i, j = 1, 2, …, m), and append 1. Without loss of generality, denote the resulting sequence by H = {h₁, h₂, …, h_q, h_{q+1}}, where h_l (l = 1, 2, …, q + 1) is non-decreasing in l and h_{q+1} = 1. Take β ∈ [h₁, h₂).

Step 5: For the α and β given above, if S_{R_ij} ≥ α and D_{R_ij} ≤ β, the alternatives A_i and A_j are clustered into one class. Otherwise, they belong to different classes.

Step 6: For the pairs A_i and A_j placed in the same class in Step 5, delete all elements below the diagonal of $\tilde{Z}$ and mark with the nodal point “*” the intersection of the horizontal and vertical lines through the symbols of A_i and A_j on the diagonal. This gives the corresponding form $\hat{Z}$. Following the ideas presented in [33, 48], each “*” represents a class containing two elements. Unite the classes that share common elements to obtain the classes corresponding to the given α and β. After that, update the thresholds until α and β have been selected from every interval (g_{k+1}, g_k] (k = 1, 2, …, p) and [h_l, h_{l+1}) (l = 1, 2, …, q), respectively.
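Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration under our own assumptions: each IFS is stored as a list of (μ, ν) pairs, and the function names `similarity_divergence` and `build_ifsm` are hypothetical. The two sample alternatives are the cars A₁ and A₂ of Table 1 in Section 5.

```python
def similarity_divergence(a, b):
    """Return (S, D) for two IFSs given as equal-length lists of (mu, nu) pairs."""
    n = len(a)
    s_total = 0.0
    d_total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        pi_a = 1.0 - mu_a - nu_a  # hesitancy degree of the first IFS
        pi_b = 1.0 - mu_b - nu_b  # hesitancy degree of the second IFS
        d_mu = abs(mu_a - mu_b)
        d_nu = abs(nu_a - nu_b)
        d_pi = abs(pi_a - pi_b)
        s_total += 1.0 - 0.5 * (d_mu + d_nu + d_pi)  # summand of Eq. (17)
        d_total += 0.5 * (d_mu + d_nu)               # summand of Eq. (18)
    return s_total / n, d_total / n


def build_ifsm(alternatives):
    """Full m x m matrix Z whose entries are the pairs (S_ij, D_ij)."""
    return [[similarity_divergence(a, b) for b in alternatives] for a in alternatives]


# Cars A1 and A2 from Table 1 (six attributes each).
A1 = [(0.3, 0.5), (0.6, 0.1), (0.4, 0.3), (0.8, 0.1), (0.1, 0.6), (0.5, 0.4)]
A2 = [(0.6, 0.3), (0.5, 0.2), (0.6, 0.1), (0.7, 0.1), (0.3, 0.6), (0.4, 0.3)]

Z = build_ifsm([A1, A2])
s12, d12 = Z[0][1]
print(round(s12, 4), round(d12, 4))  # 0.8167 0.1333, the entry z_12 of Example 1
```

Since both measures are symmetric and every diagonal entry equals (1, 0), only the lower triangle needs to be computed in practice, which is what Step 3 exploits.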

Note that the proposed method can realize the intuitionistic fuzzy clustering analysis based on the intuitionistic fuzzy similarity matrix, which can be well established with the aid of the definitions of the similarity and divergence measures of IFSs.
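Steps 4 to 6 can be sketched in Python as follows. This is a sketch under stated assumptions rather than the procedure itself: the netting diagram with “*” marks is replaced by a union-find structure, which unites classes sharing common elements in the same way, and the names `alpha_levels`, `beta_levels` and `cluster` as well as the 3×3 sample matrix are hypothetical.

```python
def alpha_levels(Z):
    """Step 4.1: distinct S values of the lower triangle (diagonal included), descending, with 0."""
    m = len(Z)
    values = {Z[i][j][0] for i in range(m) for j in range(i + 1)}
    return sorted(values | {0.0}, reverse=True)


def beta_levels(Z, alpha):
    """Step 4.2: distinct D values of entries with S >= alpha, ascending, with 1."""
    m = len(Z)
    values = {Z[i][j][1] for i in range(m) for j in range(i + 1) if Z[i][j][0] >= alpha}
    return sorted(values | {1.0})


def cluster(Z, alpha, beta):
    """Steps 5 and 6: link i and j when S >= alpha and D <= beta, then unite linked classes."""
    m = len(Z)
    parent = list(range(m))

    def find(x):
        # Follow parent links to the class representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(m):
        for j in range(i):  # lower triangle only, as in Step 3
            s, d = Z[i][j]
            if s >= alpha and d <= beta:
                parent[find(i)] = find(j)

    classes = {}
    for i in range(m):
        classes.setdefault(find(i), []).append(i)
    return sorted(classes.values())


# Hypothetical 3x3 matrix, entries (S, D); indices 0, 1, 2 stand for A1, A2, A3.
Z = [
    [(1.0, 0.0), (0.8, 0.1), (0.6, 0.3)],
    [(0.8, 0.1), (1.0, 0.0), (0.7, 0.2)],
    [(0.6, 0.3), (0.7, 0.2), (1.0, 0.0)],
]

for alpha in alpha_levels(Z)[:-1]:        # upper end of each interval (g_{k+1}, g_k]
    for beta in beta_levels(Z, alpha)[:-1]:  # lower end of each interval [h_l, h_{l+1})
        print(alpha, beta, cluster(Z, alpha, beta))
```

Taking the upper end of each α interval and the lower end of each β interval is sufficient, because the clustering result is constant inside every pair of intervals.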

5 Illustrative examples

The following numerical examples show the effectiveness and advantages of our method compared to the previous methods.

Example 1. ([26]) An auto market plans to divide five different cars A_i (i = 1, 2, …, 5) into several types, each of which is characterized by six evaluation factors: (a) x₁-fuel economy; (b) x₂-coefficient of friction; (c) x₃-price; (d) x₄-comfort degree; (e) x₅-design; (f) x₆-safety coefficient. The evaluating values of each car with respect to these six factors x_k (k = 1, 2, …, 6) are denoted by the IFSs in Table 1.

Table 1
Evaluating values of each car with respect to these six factors

	x₁	x₂	x₃	x₄	x₅	x₆
A₁	(0.3, 0.5)	(0.6, 0.1)	(0.4, 0.3)	(0.8, 0.1)	(0.1, 0.6)	(0.5, 0.4)
A₂	(0.6, 0.3)	(0.5, 0.2)	(0.6, 0.1)	(0.7, 0.1)	(0.3, 0.6)	(0.4, 0.3)
A₃	(0.4, 0.4)	(0.8, 0.1)	(0.5, 0.1)	(0.6, 0.2)	(0.4, 0.5)	(0.3, 0.2)
A₄	(0.2, 0.4)	(0.4, 0.1)	(0.9, 0.0)	(0.8, 0.1)	(0.2, 0.5)	(0.7, 0.1)
A₅	(0.5, 0.2)	(0.3, 0.6)	(0.6, 0.3)	(0.7, 0.1)	(0.6, 0.2)	(0.5, 0.3)

The proposed clustering method is used to classify five cars, which involves the following steps:

Step 1: Calculate the intuitionistic fuzzy similarity matrix Z = ((S_{R_ij}, D_{R_ij}))_{5×5}, where

$Z = \left[\begin{matrix} (1, 0) & (0.8167, 0.1333) & (0.7667, 0.1500) & (0.7833, 0.1583) & (0.7167, 0.2167) \\ (0.8167, 0.1333) & (1, 0) & (0.8333, 0.1167) & (0.7500, 0.1583) & (0.7833, 0.1500) \\ (0.7667, 0.1500) & (0.8333, 0.1167) & (1, 0) & (0.7000, 0.1750) & (0.7167, 0.2167) \\ (0.7833, 0.1583) & (0.7500, 0.1583) & (0.7000, 0.1750) & (1, 0) & (0.7000, 0.2417) \\ (0.7167, 0.2167) & (0.7833, 0.1500) & (0.7167, 0.2167) & (0.7000, 0.2417) & (1, 0) \end{matrix}\right]$

Step 2: Delete all elements above the diagonal of the matrix Z = (z_ij)_{5×5} and label each element on the diagonal with the symbol of the corresponding object A_i (1 ≤ i ≤ 5), which yields the matrix $\tilde{Z}$:

$\tilde{Z} = \left[\begin{matrix} A_{1} (1, 0) & & & & \\ (0.8167, 0.1333) & A_{2} (1, 0) & & & \\ (0.7667, 0.1500) & (0.8333, 0.1167) & A_{3} (1, 0) & & \\ (0.7833, 0.1583) & (0.7500, 0.1583) & (0.7000, 0.1750) & A_{4} (1, 0) & \\ (0.7167, 0.2167) & (0.7833, 0.1500) & (0.7167, 0.2167) & (0.7000, 0.2417) & A_{5} (1, 0) \end{matrix}\right]$

Step 3: To carry out the clustering analysis, the thresholds α and β are first determined from $\tilde{Z}$, from which the clustering results of the cars are obtained as follows:

(1) When 0.8333 < α ≤ 1, no pair of distinct cars satisfies S_{R_ij} ≥ α (1 ≤ i, j ≤ 5) in $\tilde{Z}$, and we obtain the corresponding form $\hat{Z}$ below. Namely, each car forms its own class: {A₁}, {A₂}, {A₃}, {A₄}, {A₅}.

$\hat{Z} = \left[\begin{matrix} A_{1} (1, 0) & & & & \\ & A_{2} (1, 0) & & & \\ & & A_{3} (1, 0) & & \\ & & & A_{4} (1, 0) & \\ & & & & A_{5} (1, 0) \end{matrix}\right]$

(2) When 0.8167 < α ≤ 0.8333, (a) if 0 ≤ β < 0.1167, each car again forms its own class: {A₁}, {A₂}, {A₃}, {A₄}, {A₅}; (b) if 0.1167 ≤ β < 1, the cars A₂ and A₃ are clustered into one class and the following form is obtained. Hence, all cars are classified into four classes: {A₁}, {A₂, A₃}, {A₄}, {A₅}.

$\hat{Z} = \left[\begin{matrix} A_{1} (1, 0) & & & & \\ & A_{2} (1, 0) & & & \\ & * & A_{3} (1, 0) & & \\ & & & A_{4} (1, 0) & \\ & & & & A_{5} (1, 0) \end{matrix}\right]$

(3) When 0.7833 < α ≤ 0.8167, (a) if 0 ≤ β < 0.1167, each car forms its own class: {A₁}, {A₂}, {A₃}, {A₄}, {A₅}; (b) if 0.1167 ≤ β < 0.1333, the cars A₂ and A₃ are clustered into one class, so all cars are clustered into four classes: {A₁}, {A₂, A₃}, {A₄}, {A₅}; (c) if 0.1333 ≤ β < 1, all cars are classified into three classes: {A₁, A₂, A₃}, {A₄}, {A₅}.

(4) When 0.7667 < α ≤ 0.7833, (a) if 0 ≤ β < 0.1167, each car forms its own class: {A₁}, {A₂}, {A₃}, {A₄}, {A₅}; (b) if 0.1167 ≤ β < 0.1333, the cars A₂ and A₃ are clustered into one class, so all cars are divided into four classes: {A₁}, {A₂, A₃}, {A₄}, {A₅}; (c) if 0.1333 ≤ β < 0.1500, all cars are classified into three classes: {A₁, A₂, A₃}, {A₄}, {A₅}; (d) if 0.1500 ≤ β < 0.1583, all cars are classified into two classes: {A₁, A₂, A₃, A₅}, {A₄}; (e) if 0.1583 ≤ β < 1, all cars are clustered into one class: {A₁, A₂, A₃, A₄, A₅}.

(5) When 0.7500 < α ≤ 0.7667, (a) if 0 ≤ β < 0.1167, all cars are clustered into five classes: {A₁}, {A₂}, {A₃}, {A₄}, {A₅}; (b) if 0.1167 ≤ β < 0.1333, the cars A₂ and A₃ are clustered into one class, so all cars are classified into four classes: {A₁}, {A₂, A₃}, {A₄}, {A₅}; (c) if 0.1333 ≤ β < 0.1500, all cars are classified into three classes: {A₁, A₂, A₃}, {A₄}, {A₅}; (d) if 0.1500 ≤ β < 0.1583, all cars are clustered into two classes: {A₁, A₂, A₃, A₅}, {A₄}; (e) if 0.1583 ≤ β < 1, all cars are clustered into one class: {A₁, A₂, A₃, A₄, A₅}.

(6) When 0.7167 < α ≤ 0.7500, (a) if 0 ≤ β < 0.1167, all cars are clustered into five classes: {A₁}, {A₂}, {A₃}, {A₄}, {A₅}; (b) if 0.1167 ≤ β < 0.1333, the cars A₂ and A₃ are clustered into one class, so all cars are classified into four classes: {A₁}, {A₂, A₃}, {A₄}, {A₅}; (c) if 0.1333 ≤ β < 0.1500, all cars are clustered into three classes: {A₁, A₂, A₃}, {A₄}, {A₅}; (d) if 0.1500 ≤ β < 0.1583, all cars are divided into two classes: {A₁, A₂, A₃, A₅}, {A₄}; (e) if 0.1583 ≤ β < 1, all cars are classified into one class: {A₁, A₂, A₃, A₄, A₅}.

(7) When 0.7000 < α ≤ 0.7167, (a) if 0 ≤ β < 0.1167, all cars are clustered into five classes: {A₁}, {A₂}, {A₃}, {A₄}, {A₅}; (b) if 0.1167 ≤ β < 0.1333, the cars A₂ and A₃ are clustered into one class, so all cars are classified into four classes: {A₁}, {A₂, A₃}, {A₄}, {A₅}; (c) if 0.1333 ≤ β < 0.1500, all cars are classified into three classes: {A₁, A₂, A₃}, {A₄}, {A₅}; (d) if 0.1500 ≤ β < 0.1583, all cars are clustered into two classes: {A₁, A₂, A₃, A₅}, {A₄}; (e) if 0.1583 ≤ β < 1, all cars are clustered into one class: {A₁, A₂, A₃, A₄, A₅}.

(8) When 0 < α ≤ 0.7000, (a) if 0 ≤ β < 0.1167, each car forms its own class: {A₁}, {A₂}, {A₃}, {A₄}, {A₅}; (b) if 0.1167 ≤ β < 0.1333, the cars A₂ and A₃ are clustered into one class, so all cars are classified into four classes: {A₁}, {A₂, A₃}, {A₄}, {A₅}; (c) if 0.1333 ≤ β < 0.1500, all cars are clustered into three classes: {A₁, A₂, A₃}, {A₄}, {A₅}; (d) if 0.1500 ≤ β < 0.1583, all cars are classified into two classes: {A₁, A₂, A₃, A₅}, {A₄}; (e) if 0.1583 ≤ β < 1, all cars are clustered into one class: {A₁, A₂, A₃, A₄, A₅}.

As discussed above, the cars A_i (1 ≤ i ≤ 5) are clustered differently under different thresholds α and β; the results are summarized in Table 2. The thresholds α and β, however, are determined by the constructed IFSM. Therefore, the IFSM construction is a pivotal issue that determines both the clustering results and the computational effort. Moreover, it is pointed out in [25, 31] that the computational complexity of direct clustering analysis comes mainly from the IFSM computation. Thus, we can compare the elapsed time of calculating the IFSM to assess the computational complexity to a certain extent. To do so, in Example 2 we conduct experiments with simulated data and compare the proposed method with some previous methods; the experimental results are shown in Table 3 and Fig. 2.

Table 2

Clustering results of the proposed method under different thresholds α and β

Threshold α	Threshold β	Number of classes	Classes of the proposed method
(0.8333, 1.0000]	[0.0000, 1.0000)	5	{A₁}, {A₂}, {A₃}, {A₄}, {A₅}
(0.8167, 0.8333]	[0.0000, 0.1167)	5	{A₁}, {A₂}, {A₃}, {A₄}, {A₅}
	[0.1167, 1.0000)	4	{A₁}, {A₂, A₃}, {A₄}, {A₅}
(0.7833, 0.8167]	[0.0000, 0.1167)	5	{A₁}, {A₂}, {A₃}, {A₄}, {A₅}
	[0.1167, 0.1333)	4	{A₁}, {A₂, A₃}, {A₄}, {A₅}
	[0.1333, 1.0000)	3	{A₁, A₂, A₃}, {A₄}, {A₅}
(0.0000, 0.7833]	[0.0000, 0.1167)	5	{A₁}, {A₂}, {A₃}, {A₄}, {A₅}
	[0.1167, 0.1333)	4	{A₁}, {A₂, A₃}, {A₄}, {A₅}
	[0.1333, 0.1500)	3	{A₁, A₂, A₃}, {A₄}, {A₅}
	[0.1500, 0.1583)	2	{A₁, A₂, A₃, A₅}, {A₄}
	[0.1583, 1.0000)	1	{A₁, A₂, A₃, A₄, A₅}
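The case analysis of Example 1 can be checked mechanically. The sketch below hard-codes the lower triangle of Z (rounded to four decimals, as printed in Example 1) and evaluates one representative pair (α, β) per distinct outcome. It is a sketch under our own assumptions: the function name `classes_for` is hypothetical, and a union-find structure stands in for the “*” netting diagram.

```python
# Lower triangle of Z from Example 1: PAIRS[(i, j)] = (S, D) for cars A_{i+1} and A_{j+1}.
PAIRS = {
    (1, 0): (0.8167, 0.1333),
    (2, 0): (0.7667, 0.1500), (2, 1): (0.8333, 0.1167),
    (3, 0): (0.7833, 0.1583), (3, 1): (0.7500, 0.1583), (3, 2): (0.7000, 0.1750),
    (4, 0): (0.7167, 0.2167), (4, 1): (0.7833, 0.1500), (4, 2): (0.7167, 0.2167),
    (4, 3): (0.7000, 0.2417),
}


def classes_for(alpha, beta, m=5):
    """Classes obtained by linking pairs with S >= alpha and D <= beta and uniting them."""
    parent = list(range(m))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), (s, d) in PAIRS.items():
        if s >= alpha and d <= beta:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(m):
        groups.setdefault(find(i), []).append(f"A{i + 1}")
    return sorted(groups.values())


# Upper end of an alpha interval paired with the lower end of a beta interval.
REPRESENTATIVES = [
    (1.0000, 0.0000),
    (0.8333, 0.1167),
    (0.8167, 0.1333),
    (0.7833, 0.1333),
    (0.7833, 0.1500),
    (0.7833, 0.1583),
]

for alpha, beta in REPRESENTATIVES:
    result = classes_for(alpha, beta)
    print(alpha, beta, len(result), result)
```

For these six threshold pairs the sketch yields 5, 4, 3, 3, 2 and 1 classes, respectively. Lowering α below 0.7833 admits further pairs, but each of them has a divergence degree of at least 0.1500, so the outcomes for β < 0.1500 do not change.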

Example 2. As stated above, the computational complexity of direct clustering analysis comes mainly from the computation of the IFSM, so we assess it through simulated experiments on IFSM computation. In the following, we describe the experimental tool and the experimental datasets, and then compare the proposed method with Zhang et al.’s method [27], Feng et al.’s method [35], Wang et al.’s method [36], Li et al.’s method [43] and Kacprzyk et al.’s method [37].

(1) Experimental tool: The elapsed time of calculating an IFSM is measured in MATLAB.

(2) Experimental datasets: The datasets, generated at random by MATLAB, are regarded as intuitionistic fuzzy evaluations of cars on six attributes. Let A = {A₁, A₂, …, A_m} be a set of m cars, each described by six attributes: (a) x₁-fuel economy; (b) x₂-coefficient of friction; (c) x₃-price; (d) x₄-comfort degree; (e) x₅-design; (f) x₆-safety coefficient. The evaluating values of each car are represented by IFSs.

Simulated datasets are used for the comparison. The number m of alternatives (cars) is considered in two ways: at discrete points (i.e., m = 25, 50, 200, 400, 800, 1200, 1600, 2000) and over consecutive intervals of alternatives (i.e., m ∈ [5, 55], [60, 110], [115, 165], [170, 220], [225, 275], [280, 330], [335, 385], [390, 440], [445, 495]). The elapsed time of deriving the corresponding IFSM is measured for each method, and the simulated results are shown in Table 3 and Fig. 2.
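The experimental setup for the proposed measure can be sketched as follows. This is a Python illustration, not the MATLAB code behind Table 3, so absolute timings will differ. The sampling scheme for random IFSs (μ uniform on [0, 1), then ν uniform on [0, 1 − μ)) is our assumption, since the distribution is not specified, and the names `random_ifs`, `build_ifsm` and `time_ifsm` are hypothetical.

```python
import random
import time


def random_ifs(n_attributes, rng):
    """One alternative as n (mu, nu) pairs with mu + nu <= 1 (assumed sampling scheme)."""
    ifs = []
    for _ in range(n_attributes):
        mu = rng.random()
        nu = rng.random() * (1.0 - mu)
        ifs.append((mu, nu))
    return ifs


def build_ifsm(alternatives):
    """IFSM via Eqs. (17) and (18); the lower triangle is computed and mirrored."""
    m = len(alternatives)
    n = len(alternatives[0])
    Z = [[(1.0, 0.0)] * m for _ in range(m)]  # diagonal entries are (1, 0)
    for i in range(m):
        for j in range(i):
            s = 0.0
            d = 0.0
            for (mu_a, nu_a), (mu_b, nu_b) in zip(alternatives[i], alternatives[j]):
                d_mu = abs(mu_a - mu_b)
                d_nu = abs(nu_a - nu_b)
                d_pi = abs((1.0 - mu_a - nu_a) - (1.0 - mu_b - nu_b))
                s += 1.0 - 0.5 * (d_mu + d_nu + d_pi)
                d += 0.5 * (d_mu + d_nu)
            Z[i][j] = Z[j][i] = (s / n, d / n)
    return Z


def time_ifsm(m, n_attributes=6, seed=0):
    """Return (elapsed seconds, Z) for m random alternatives; only the IFSM step is timed."""
    rng = random.Random(seed)
    cars = [random_ifs(n_attributes, rng) for _ in range(m)]
    start = time.perf_counter()
    Z = build_ifsm(cars)
    return time.perf_counter() - start, Z


for m in (25, 50, 200):
    elapsed, _ = time_ifsm(m)
    print(m, f"{elapsed:.4f} s")
```

The double loop over pairs makes the cost grow roughly quadratically in m for a fixed number of attributes, which is in line with the growth of the elapsed times across the columns of Table 3.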

Table 3

Comparison of elapsed time of calculating the IFSM with some methods

Methods	Numbers of alternatives (cars)
	25	50	200	400	800	1200	1600	2000
Zhang et al.’s method (2007)	0.0085	0.0258	0.3747	1.5502	5.9118	15.3039	28.3976	44.9606
Feng et al.’s method (2014)	0.0053	0.0208	0.3083	1.2724	4.7008	12.5811	22.1916	37.1808
Wang et al.’s method (2014)	0.0296	0.1006	1.5895	6.4093	24.8670	57.4292	108.6780	177.6936
Li et al.’s method (2014)	0.0087	0.0214	0.3508	1.3441	5.8891	13.4476	24.3753	40.2911
Kacprzyk et al.’s method (2016)	0.0047	0.0144	0.2406	0.9450	4.2880	9.9003	18.0038	28.8264
The proposed method	0.0033	0.0118	0.1765	0.7033	2.5329	7.4867	13.2685	24.1709

From Table 3 and Fig. 2, we can see that the proposed method has, to some extent, the following advantages over the existing methods:

(1) The proposed method effectively reduces computational effort and time in comparison with Zhang et al.’s method (2007), Feng et al.’s method (2014), Wang et al.’s method (2014), Li et al.’s method (2014) and Kacprzyk et al.’s method (2016), and in particular with Wang et al.’s method: for m = 2000, Table 3 reports 24.1709 for the proposed method against 177.6936 for Wang et al.’s method. Moreover, the gap in elapsed time between the proposed method and Wang et al.’s method grows as the number of alternatives increases within the tested range. Therefore, the proposed method can reduce computation time in practical applications.

(2) The proposed method focuses on the practical semantics of the membership and non-membership degrees of the intuitionistic fuzzy similarity degree z_ij, which carry similarity and divergence features, respectively. Based on the proposed similarity and divergence measures, it overcomes the “counterintuitive results” that existing intuitionistic fuzzy similarity measures produce in some cases when the IFSM is constructed for direct clustering analysis.
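The comparison in item (1) can be made concrete with a few lines of arithmetic on Table 3. The sketch below uses only the elapsed times reported in Table 3 (whose unit is not stated there); the dictionary keys and the helper names `gaps` and `ratio_at_largest` are hypothetical.

```python
SIZES = [25, 50, 200, 400, 800, 1200, 1600, 2000]

# Elapsed times of the compared methods, copied from Table 3.
TIMES = {
    "Zhang et al. (2007)": [0.0085, 0.0258, 0.3747, 1.5502, 5.9118, 15.3039, 28.3976, 44.9606],
    "Feng et al. (2014)": [0.0053, 0.0208, 0.3083, 1.2724, 4.7008, 12.5811, 22.1916, 37.1808],
    "Wang et al. (2014)": [0.0296, 0.1006, 1.5895, 6.4093, 24.8670, 57.4292, 108.6780, 177.6936],
    "Li et al. (2014)": [0.0087, 0.0214, 0.3508, 1.3441, 5.8891, 13.4476, 24.3753, 40.2911],
    "Kacprzyk et al. (2016)": [0.0047, 0.0144, 0.2406, 0.9450, 4.2880, 9.9003, 18.0038, 28.8264],
}

# Elapsed times of the proposed method, copied from Table 3.
PROPOSED = [0.0033, 0.0118, 0.1765, 0.7033, 2.5329, 7.4867, 13.2685, 24.1709]


def gaps(row):
    """Difference in elapsed time to the proposed method for each number of alternatives."""
    return [t - p for t, p in zip(row, PROPOSED)]


def ratio_at_largest(row):
    """Elapsed time relative to the proposed method at the largest tested m."""
    return row[-1] / PROPOSED[-1]


for name, row in TIMES.items():
    print(f"{name}: gap at m={SIZES[-1]} is {gaps(row)[-1]:.4f}, ratio is {ratio_at_largest(row):.2f}")
```

On these figures the proposed method is the fastest in every column, the gap to Wang et al.’s method increases with m, and at m = 2000 Wang et al.’s method takes about 7.35 times as long as the proposed method.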

The approach to IFSM construction in this paper therefore saves computation time compared with the existing methods [27, 43]. In light of the conclusions presented in [27, 33], this indicates, to some extent, a reduction in the computational complexity of direct clustering analysis.

Fig.2

Comparison of elapsed time of calculating the IFSM with some methods.

6 Conclusion

Using the similarity and divergence measures of IFSs defined in this paper, we provide a new method for constructing the intuitionistic fuzzy similarity degree, in which the membership and non-membership degrees carry the practical semantics of similarity and divergence, respectively. This method handles the “counterintuitive problem” well. On this basis, we establish an intuitionistic fuzzy similarity matrix and propose a method for clustering intuitionistic fuzzy information directly. Finally, two examples are analyzed through computational experiments to illustrate the effectiveness and advantages of the proposed method. The experimental results show that the proposed method can, to some extent, reduce computational effort and time compared with some previous methods. In the future, we will focus on practical applications of the clustering method and results of this paper and on their extension to interval type-2 fuzzy clustering [49].

Acknowledgments

The authors appreciate receiving helpful suggestions from anonymous referees which improve the quality of the paper. This work was supported by the Natural Science Foundation of China (Nos. 71671086, 61773208, 61473157, 71732003, 61876157 and 61876079), the National Key Research and Development Program of China (No.2016YFD0702100), the Fundamental Research Funds for the Central Universities (No. 011814380021), the Central military equipment development of the “13th Five-Year” pre research project (No. 315050202) and Nanjing University Innovation and Creative Program for PhD candidate (No. CXCY17-08).

References

1. Jain A.K., Murty M.N., Flynn P.J., Data clustering: A review, ACM Computing Surveys 31 (1999), 264–323.
2. Delaney, Strough, Parker A.M., de Bruin W.B., Variations in decision-making profiles by age and gender: A cluster-analytic approach, Personality and Individual Differences 85 (2015), 19–24.
3. …, Liu, An interval type-2 fuzzy clustering solution for large-scale multiple-criteria group decision-making problems, Knowledge-Based Systems 114 (2016), 118–127.
4. Azimi, Ghayekhloo, Ghofrani, Hedieh, A novel clustering algorithm based on data transformation approaches, Expert Systems With Applications 76 (2017), 59–70.
5. Chen H.P., Shen X.J., Lv Y.D., Long J.W., A novel automatic fuzzy clustering algorithm based on soft partition and membership information, Neurocomputing 236 (2017), 104–112.
6. …, Zhang, Wang G.Y., A novel automatic fuzzy clustering algorithm based on soft partition and membership information, Knowledge-Based Systems 91 (2016), 189–203.
7. Bhatia, Rani, A parallel fuzzy clustering algorithm for large graphs using Pregel, Expert Systems With Applications 78 (2017), 135–144.
8. de Oliveira J.V., Szabo and de Castro L.N., Particle swarm clustering in clustering ensembles: Exploiting pruning and alignment free consensus, Applied Soft Computing 55 (2017), 141–153.
9. Thong N.T., Son L.H., HIFCF: An effective hybrid model between picture fuzzy clustering and intuitionistic fuzzy recommender systems for medical diagnosis, Expert Systems with Applications 42 (2015), 3682–3701.
10. Zhang, Xu Z.S., Liu S.S., Wang, Intuitionistic fuzzy MST clustering algorithms, Computers & Industrial Engineering 62 (2012), 1130–1140.
11. Bezdek J.C., Fuzzy mathematics in pattern classification, Applied Math Center, Cornell University, Ithaca (1973).
12. Atanassov K.T., Intuitionistic fuzzy sets, Fuzzy Sets and Systems 20 (1986), 87–96.
13. Liang D.C., Liu, Deriving three-way decisions from intuitionistic decision-theoretic rough sets, Information Sciences 300 (2015), 28–48.
14. Liang D.C., Xu Z.S., Liu, Three-way decisions based on decision-theoretic rough sets with dual hesitant fuzzy information, Information Sciences 396 (2017), 127–143.
15. Liang D.C., Xu Z.S., Liu, Three-way decisions with intuitionistic fuzzy decision-theoretic rough sets based on point operators, Information Sciences 375 (2017), 183–201.
16. H.X., Zhang L.B., Zhou X.Z., Huang, Cost-sensitive sequential three-way decision modeling using a deep neural network, International Journal of Approximate Reasoning 85 (2017), 68–78.
17. Huang, Guo C.X., Li H.X., Feng G.F., Zhou X.Z., An intuitionistic fuzzy graded covering rough set, Knowledge-Based Systems 107 (2016), 155–178.
18. Huang, Li H.X., Feng G.F., Zhuang Y.L., Distance-based information granularity and hierarchical structure for an intuitionistic fuzzy granular space, Fuzzy Information and Engineering 8 (2016), 147–168.
19. Huang, Li H.X., Wei D.K., Dominance-based rough set model in intuitionistic fuzzy information systems, Knowledge-Based Systems 28 (2012), 115–123.
20. Zhang X.Y., Wei, L S.Q., Xu W.H., Similarity degrees and uncertainty measures in intuitionistic fuzzy decision tables, Journal of Intelligent & Fuzzy Systems 31 (2016), 2767–2777.
21. Liu J.B., Zhou X.Z., Huang, Li H.X., A three-way decision model based on intuitionistic fuzzy decision systems, L. Polkowski, et al. (eds) Rough Sets. IJCRS 2017. Lecture Notes in Computer Science, vol. 10314. Springer, Cham.
22. Liu J.B., Zhou X.Z., Li H.X., Huang, Zhang L.B., Jia X.Y., An optimization view on intuitionistic fuzzy three-way decisions, H.S. Nguyen, et al. (eds) Rough Sets. IJCRS 2018. Lecture Notes in Computer Science, vol. 11103. Springer, Cham.
23. Zhang X.X., Chen D.G., Tsang E.C.C., Generalized dominance rough set models for the dominance intuitionistic fuzzy information systems, Information Sciences 378 (2017), 1–25.
24. Nguyen, A novel similarity/dissimilarity measure for intuitionistic fuzzy sets and its application in pattern recognition, Expert Systems With Applications 45 (2016), 97–107.
25. Chen S.M., Cheng S.H., Lan T.C., A novel similarity measure between intuitionistic fuzzy sets based on the centroid points of transformed fuzzy numbers with applications to pattern recognition, Information Sciences 343–344 (2016), 15–40.
26. Hassaballah, Ghareeb, A framework for objective image quality measures based on intuitionistic fuzzy sets, Applied Soft Computing 57 (2017), 48–59.
27. Zhang H.M., Xu Z.S., Chen, On clustering approach to intuitionistic fuzzy sets, Control and Decision 22 (2007), 882–888.
28. Z.S., Chen, Wu J.J., Clustering algorithm for intuitionistic fuzzy sets, Information Sciences 178 (2008), 3775–3790.
29. Z.S., Wu J.J., Intuitionistic fuzzy C-means clustering algorithm, Journal of Systems and Electronics 21 (2010), 580–590.
30. D.W., Xu Z.S., Liu S.S., Zhao, A spectral clustering algorithm based on intuitionistic fuzzy information, Knowledge-Based Systems 53 (2013), 20–26.
31. Chaira, A novel intuitionistic fuzzy C means clustering algorithm and its application to medical images, Applied Soft Computing 11 (2011), 1711–1717.
32. Thong P.H., Son L.H., A novel automatic picture fuzzy clustering method based on particle swarm optimization and picture composite cardinality, Knowledge-Based Systems 109 (2016), 48–60.
33. Wang, Xu Z.S., Liu S.S., Tang, A netting clustering analysis method under intuitionistic fuzzy environment, Applied Soft Computing 11 (2011), 5558–5564.
34. Viattchenin D.A., A method of construction of intuitionistic fuzzy tolerances based on a similarity measure between intuitionistic fuzzy sets, New Developments in Fuzzy Sets, Generalized Nets and Related Topics I (2012), 191–202.
35. Feng, Mi J.S., Zhang S.P., Belief functions on general intuitionistic fuzzy information systems, Information Sciences 271 (2014), 143–158.
36. Wang, Xu Z.S., Liu S.S., Yao Z.Q., Direct clustering analysis based on intuitionistic fuzzy implication, Applied Soft Computing 23 (2014), 1–8.
37. Kacprzyk, Viattchenin D.A., Shyrai, Szmidt, A novel similarity measure between intuitionistic fuzzy sets for constructing intuitionistic fuzzy tolerance, K. Atanassov, et al. (eds) Novel Developments in Uncertainty Representation and Processing. Advances in Intelligent Systems and Computing, vol. 401. Springer, Cham, 2016.
38. Zadeh, Fuzzy sets, Information and Control 8 (1965), 338–353.
39. Z.S., Intuitionistic fuzzy aggregation operators, IEEE Transactions on Fuzzy Systems 15 (2007), 1179–1187.
40. Hong D.H., Choi C.H., Multicriteria fuzzy decision making problems based on vague set theory, Fuzzy Sets and Systems 114 (2000), 103–113.
41. Song Y.F., Wang X.D., A new similarity measure between intuitionistic fuzzy sets and the positive definiteness of the similarity matrix, Pattern Anal Applic 20 (2017), 215–227.
42. Montes, Pal N.R., Janis, Montes, Divergence measures for intuitionistic fuzzy sets, IEEE Transactions on Fuzzy Systems 23(2) (2015), 444–456.
43. …, Wu J.M., Zhu J.J., Stochastic multi-criteria decision-making methods based on new intuitionistic fuzzy distance, Systems Engineering Theory & Practice 34(6) (2014), 1517–1524.
44. Atanassov, On Intuitionistic Fuzzy Sets Theory, Springer, Berlin, 2012.
45. Atanassov, Intuitionistic Fuzzy Sets, Springer, Heidelberg, 1999.
46. Atanassov K.T., Pasi, Yager R.R., Intuitionistic fuzzy interpretations of multi-criteria multi-person and multimeasurement tool decision making, International Journal of Systems Science 36 (2005), 859–868.
47. Atanassov, Geometrical interpretation of the elements of the intuitionistic fuzzy objects, Preprint IM-MFAIS-1-89, Sofia, Reprinted: Int J Bioautomation 20(S1) (2016), S27–S42.
48. Z.X., Fuzzy Mathematics and Its Application, Tianjin Science and Technology Press, Tianjin, 1983.
49. …, Liu X.W., An interval type-2 fuzzy clustering solution for large-scale multiple-criteria group decision-making problems, Knowledge-Based Systems 114 (2016), 118–127.