Abstract
Compared to hesitant fuzzy sets and intuitionistic fuzzy sets, dual hesitant fuzzy sets can model real-world problems more comprehensively. Dual hesitant fuzzy sets explicitly give a set of membership degrees and a set of non-membership degrees, which in turn imply a third important set of data: hesitant degrees. The traditional definition of distance between dual hesitant fuzzy sets considers only the membership degree and the non-membership degree, but the hesitant degree should also be taken into account. To this end, using these three data sets (membership degree, non-membership degree and hesitant degree), we first propose several new distance measurements for dual hesitant fuzzy sets (the generalized normalized distance, the generalized normalized Hausdorff distance and the generalized normalized hybrid distance), from which the corresponding similarity measurements can be obtained. In these distance definitions, the membership degree, non-membership degree and hesitant degree are of equal importance. Second, we propose a clustering algorithm that uses these distances in a dual hesitant fuzzy information system. Finally, a numerical example is used to illustrate the performance and effectiveness of the clustering algorithm. The clustering results obtained with the different distance measurements discussed in the paper are compared, which verifies the utility and advantage of the proposed distances. Our work provides a new way to improve the performance of clustering algorithms in dual hesitant fuzzy information systems.
Introduction
To address inaccurate and uncertain information, Zadeh [1] proposed the concept of fuzzy sets in 1965. The essence of a fuzzy set is to represent the membership degree of an element to a set with a number in [0, 1]. Fuzzy sets can thus describe an uncertain problem with a single membership degree. Atanassov [3] extended the fuzzy set and proposed the intuitionistic fuzzy set theory in 1983, using two numbers in [0, 1] to represent the membership degree and the non-membership degree of an element to a set. This theory can handle uncertain and inaccurate information more flexibly and can model real-life voting problems with support and opposition. With the rapid rise of group decision-making, the judgment of a single decision-maker has become less persuasive. For this reason, Torra [2] proposed the hesitant fuzzy set model in 2010, which uses a group of numbers in [0, 1] to represent the possible membership degrees of an element to a set. This model can describe problems precisely when there are multiple membership degrees. Based on the intuitionistic fuzzy set and the hesitant fuzzy set, Zhu et al. [4] put forward the dual hesitant fuzzy set theory in 2012, which uses two groups of numbers in [0, 1] to represent the possible membership degrees and non-membership degrees of an element to a set. This theory addresses decision-making problems with multiple membership degrees and non-membership degrees [6–10].
Dual hesitant fuzzy sets are more in line with the practical problems encountered in real life. Based on five bases of data, Zahari et al. [11] provided the definitions of three Z-score functions and applied them to multicriteria decision-making. According to the membership degrees and non-membership degrees, Meng et al. [13] defined the correlation coefficients of dual hesitant fuzzy sets and applied them to a clustering algorithm. Meng et al. [14] also proposed dual hesitant fuzzy preference relations and proved multiplicative consistency. On the basis of optimization models, dual hesitant fuzzy sets provide a method for group decision-making. From the reliability function and Weibull distribution, Kumar et al. [15] studied the reliability of dual hesitant fuzzy sets. By using the aggregation operator, the reliability of a system can be evaluated. In a dual hesitant fuzzy information system, Lu et al. [16] presented a novel three-phase linear programming technique for multidimensional analysis of preference (LINMAP), which can solve hybrid multicriteria group decision-making (MCGDM).
The distance between sets reflects their similarity to some extent: the smaller the distance, the more similar the two sets. Therefore, distance measurement can be applied to clustering algorithms. Xu [18] proposed several definitions of the distance between hesitant fuzzy sets in both continuous and discrete cases, and used them in the TOPSIS method in a hesitant fuzzy environment. Szmidt [19] studied multiple distance definitions between intuitionistic fuzzy sets using the membership degree and the non-membership degree. In the recent literature, cosine distance measures and cosine similarity measures have been proposed for different data environments (Pythagorean fuzzy set, intuitionistic fuzzy set, q-rung orthopair fuzzy set, complex q-rung orthopair fuzzy set); the axioms satisfied by these distances were proven, and the measures were then applied to multi-attribute decision-making [20–23]. According to the known membership degree and non-membership degree, Su [17] extended the Hamming distance, Euclidean distance and Hausdorff distance to define the generalized distance and generalized hybrid distance for dual hesitant fuzzy sets and used them for pattern recognition. Utilizing the mean values of membership degrees and non-membership degrees, Wang [10] defined multiple distances of dual hesitant fuzzy sets and applied them to multi-attribute decision-making. The similarity degree was also defined from the relationship between distance and similarity. Using the minimum difference between membership and non-membership, dual hesitant fuzzy distances were defined and applied to the TOPSIS decision-making algorithm [24, 25]. Zhang [12] proposed three generalized distances using membership degrees and non-membership degrees, and then applied them to pattern recognition in a dual hesitant fuzzy environment. These distances do not require the lengths of the dual hesitant fuzzy elements to be consistent, which avoids additional length processing.
These distance measurements underscore the importance of membership degrees and non-membership degrees. However, none of them uses the hesitant degrees, which leads to a loss of information to some extent. From a theoretical point of view, there remains a need for new distance measurements that use hesitant degrees, and this is the motivation of this paper.
The organization of this paper is as follows. In Section 2, some basic concepts of the hesitant fuzzy set, intuitionistic fuzzy set and dual hesitant fuzzy set are introduced. In Section 3, several new distances for dual hesitant fuzzy sets are defined, and some of their necessary properties are proved. In Section 4, a clustering algorithm for dual hesitant fuzzy sets is proposed using these distance measures. In Section 5, a numerical example of clustering analysis in a dual hesitant fuzzy environment is given, which verifies the feasibility and practicability of the clustering algorithm with the proposed distance measurements, and a comparative analysis is made. Finally, we conclude this paper in Section 6.
Preliminaries
Torra [2] first proposed the concept of a hesitant fuzzy set, which is defined as follows:
Atanassov [3] first proposed the concept of an intuitionistic fuzzy set, which is defined as follows:
Zhu et al. [4] defined the dual hesitant fuzzy set, an extension of the hesitant fuzzy set and the intuitionistic fuzzy set, which is defined as follows:
In order not to change the decision-maker’s preference for the attributes of an object and retain the most original information, we do not change the order of elements in the dual hesitant fuzzy set. We use
Note: when |h_H(x)| ≠ |g_H(x)| (where |h_H(x)| and |g_H(x)| denote the number of values in h_H(x) and g_H(x), respectively), take
the complement of A is defined as follows:
the union of A and B, denoted by A ∪ B, is given by
the intersection of A and B, denoted by A ∩ B, is given by
if B ⊆ A and A ⊆ B both hold, then A is said to be equal to B, denoted by A = B.
Note: when l_A(x) ≠ l_B(x), suppose l_A(x) < l_B(x); then we extend h_A(x) with min(h_A(x)) until its length reaches l_B(x), and extend g_A(x) with min(g_A(x)) until its length reaches l_B(x). In the same way, when l_A(x) > l_B(x), we extend h_B(x) with min(h_B(x)) until its length reaches l_A(x), and extend g_B(x) with min(g_B(x)) until its length reaches l_A(x).
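The extension rule in this note can be illustrated with a short Python sketch. The function names are hypothetical and are not taken from the paper, and appending the repeated minimum at the end of the list is an assumption, since the note does not state where the added values are placed.

```python
def extend_with_min(values, target_len):
    """Pad a list of degrees with its own minimum up to target_len entries.

    Assumption: the padding is appended at the end, so the original
    order of the existing values is left unchanged.
    """
    if not values:
        raise ValueError("cannot extend an empty list of degrees")
    missing = max(0, target_len - len(values))
    return list(values) + [min(values)] * missing


def extend_element(h, g, target_len):
    """Extend the membership list h and the non-membership list g together."""
    return extend_with_min(h, target_len), extend_with_min(g, target_len)


# Hypothetical element of A with length 2, extended to length 3.
h_a, g_a = extend_element([0.3, 0.5], [0.2, 0.4], 3)
# h_a is [0.3, 0.5, 0.3] and g_a is [0.2, 0.4, 0.2]
```

Lists that already have the target length are returned unchanged, so the same helper can be applied to both A and B without first checking which one is shorter.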
According to the definition of the dual hesitant fuzzy set, Su [17] proposed the generalized dual hesitant normalized distance, which is defined as follows:
Distance and similarity are important measures for dual hesitant fuzzy sets. Many authors use membership degrees and non-membership degrees to calculate distances but ignore the hesitant degree, which leads to a loss of information to some extent. We therefore propose some new distances that take the hesitant degree into consideration.
(1) 0 ≤ d(A, B) ≤ 1;
(2) d(A, B) = 0 if and only if A = B;
(3) d(A, B) = d(B, A).
For convenience, when we calculate the distance and similarity between dual hesitant fuzzy sets A and B, we denote l(AB(x)) = max(l(A(x)), l(B(x))).
Drawing on the well-known Hamming distance and Euclidean distance, we define a normalized Hamming distance and a normalized Euclidean distance on dual hesitant fuzzy sets as follows:
We can further extend Definition 3.2 and Definition 3.3 into a generalized normalized dual hesitant fuzzy distance:
Clearly, the distance derived from Eq. (7) takes into account all the three parameters (membership degree, non-membership degree, and hesitant degree) described in dual hesitant fuzzy sets, which contains much more information than the distance derived by Eq. (4). Besides, we can find some properties about this distance.
(1) 0 ≤ d_gnd(A, B) ≤ 1;
(2) d_gnd(A, B) = 0 if and only if A = B;
(3) d_gnd(A, B) = d_gnd(B, A).
which satisfies the property (1).
According to Definition 3.4
which satisfies the property (3).
In particular, if p = 1, then the generalized normalized distance of dual hesitant fuzzy sets reduces to the normalized Hamming distance of dual hesitant fuzzy sets; if p = 2, then it reduces to the normalized Euclidean distance of dual hesitant fuzzy sets.
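To make the role of the parameter p concrete, the following Python sketch computes one possible distance of this kind. It is not a reproduction of Eq. (7), which is not shown here; it rests on three labeled assumptions: the hesitant degree at each position is taken as 1 − h − g, the three p-th power differences are weighted equally with a factor 1/3, and all lists have already been extended to a common length. The function name and the data layout are hypothetical.

```python
def generalized_normalized_distance(A, B, p=2.0):
    """Sketch of a generalized normalized distance between two DHFSs.

    A and B are lists with one entry per element x_i of the universe;
    each entry is a pair (h, g) of equal-length lists of membership and
    non-membership degrees.

    Assumptions (not confirmed by the text): hesitant degree = 1 - h - g
    at each position, equal weight 1/3 for the three components, and
    lists already extended to a common length.
    """
    if len(A) != len(B):
        raise ValueError("A and B must be defined on the same universe")
    n = len(A)
    total = 0.0
    for (h_a, g_a), (h_b, g_b) in zip(A, B):
        length = len(h_a)
        if not (len(g_a) == len(h_b) == len(g_b) == length):
            raise ValueError("extend all lists to a common length first")
        element_sum = 0.0
        for ha, ga, hb, gb in zip(h_a, g_a, h_b, g_b):
            pi_a = 1.0 - ha - ga  # assumed hesitant degree of A
            pi_b = 1.0 - hb - gb  # assumed hesitant degree of B
            element_sum += (abs(ha - hb) ** p
                            + abs(ga - gb) ** p
                            + abs(pi_a - pi_b) ** p)
        total += element_sum / (3.0 * length)
    return (total / n) ** (1.0 / p)


# Hypothetical one-element sets: p = 1 gives a Hamming-type value,
# p = 2 a Euclidean-type value.
A = [([0.5], [0.3])]
B = [([0.2], [0.4])]
d_hamming = generalized_normalized_distance(A, B, p=1)
d_euclid = generalized_normalized_distance(A, B, p=2)
```

Under these assumptions the value stays in [0, 1], is zero for identical inputs and is symmetric in A and B, which matches properties (1) to (3) stated for the distance.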
In particular, (1) if p = 1, then the generalized normalized Hausdorff distance of dual hesitant fuzzy sets reduces to the normalized Hamming-Hausdorff distance of dual hesitant fuzzy sets:
Obviously, for these three particular sets A, B and C, d(A, B) = d(A, C) = d(B, C) should hold. However, the results show only that d_gd(A, C) = d_gd(B, C), while d_gd(A, B) ≠ d_gd(A, C) and d_gd(A, B) ≠ d_gd(B, C). Calculating the distances with the normalized Euclidean distance instead, we obtain:
We find that d_ned(A, B) = d_ned(A, C) = d_ned(B, C) holds. Likewise, when we use the generalized normalized distance and the generalized normalized Hausdorff distance to calculate the distances between A, B and C, we find that d(A, B) = d(A, C) = d(B, C) holds.
In this section, we use the proposed distances in a clustering algorithm for the dual hesitant fuzzy environment. Cluster analysis is an exploratory classification process that divides multiple objects into different categories.
Algorithm 4.1 Let X be a finite universe of discourse, and let A_i (i = 1, 2, ..., n) be a collection of dual hesitant fuzzy sets on X. Based on the clustering algorithm for intuitionistic fuzzy sets and the distances for dual hesitant fuzzy sets, we propose a clustering algorithm for the dual hesitant fuzzy environment:
Step 1: Calculate the distances between the A_i (i = 1, 2, ..., n) to constitute the association matrix.
Step 2: Use Definition 4.3 to calculate the equivalence association matrix, and use Definition 4.4 to calculate the λ-cutting matrix C_λ.
Step 3: In the λ-cutting matrix C_λ, if all elements of the i-th row are the same as the corresponding elements of the j-th row, then the dual hesitant fuzzy sets A_i and A_j are of the same type.
Example
Let U = {A1, A2, A3, A4, A5, A6, A7, A8, A9, A10} be a set of companies and V = {x1, x2, x3, x4, x5} be a set of attributes, representing political risk, interest rate risk, market risk, operation risk and purchasing power risk. Several expert groups are invited to make an overall evaluation of these companies from the five perspectives. Since each group may neither be fully sure of nor completely deny an indicator during the evaluation, there are both supporting and opposing opinions. In this case, expressing them with dual hesitant fuzzy information is more in line with actual needs. All the experiments were run on a personal computer with Windows 10, a 3.8-GHz Intel Xeon E-2244G CPU and 16 GB of RAM. The programs were written in Java. The assessment values are presented in Tables 1 and 2.
Dual hesitant fuzzy information (1)
Dual hesitant fuzzy information (2)
Step 1: Using the normalized Euclidean distance and similarity of dual hesitant fuzzy sets, the distance matrix is calculated as matrix d. According to the distance matrix d, the association matrix is calculated as matrix C.
Step 2: Calculate the composition matrix. Since C^2 ⊆ C does not hold, we further calculate C^4, C^8 and C^16. The results show that C^8 = C^16, so C^8 is an equivalence matrix.
Step 3: For a confidence level λ, clustering the dual hesitant fuzzy sets gives the possible classifications of A_i (i = 1, 2, ..., 10); see Table 3.
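Steps 1 to 3 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' Java program: the association matrix is assumed to be the similarity 1 − d, the composition is assumed to be the usual max-min composition of fuzzy relations, and the λ-cut is assumed to map entries greater than or equal to λ to 1. All function names are hypothetical, and indices are 0-based.

```python
def compose(C):
    """Max-min composition of a square matrix with itself (assumed rule)."""
    n = len(C)
    return [[max(min(C[i][k], C[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def equivalence_matrix(C):
    """Square C repeatedly (C^2, C^4, ...) until it no longer changes.

    Assumes C is reflexive (ones on the diagonal) and symmetric, so the
    sequence is non-decreasing and stabilizes after finitely many steps.
    """
    while True:
        squared = compose(C)
        if squared == C:
            return C
        C = squared


def lambda_cut(C, lam):
    """Assumed cut rule: entries >= lam become 1, all others become 0."""
    return [[1 if value >= lam else 0 for value in row] for row in C]


def cluster(C, lam):
    """Group the objects whose rows in the lambda-cutting matrix coincide."""
    cut = lambda_cut(equivalence_matrix(C), lam)
    groups = {}
    for index, row in enumerate(cut):
        groups.setdefault(tuple(row), []).append(index)
    return list(groups.values())


# Hypothetical 3 x 3 association matrix (similarity = 1 - distance).
C = [[1.0, 0.8, 0.3],
     [0.8, 1.0, 0.4],
     [0.3, 0.4, 1.0]]
print(cluster(C, 0.7))  # [[0, 1], [2]]
print(cluster(C, 0.4))  # [[0, 1, 2]]
```

Because max and min only select existing entries, the comparison of a matrix with its square is exact and needs no floating-point tolerance. Raising λ splits the objects into more categories, which is how the minimum confidence level needed to separate all objects can be read off.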
The clustering result obtained with the generalized dual hesitant fuzzy distance and p = 2
Next, we use the generalized distance proposed by Su, together with the generalized normalized distance, the generalized normalized Hausdorff distance and the generalized normalized hybrid distance of dual hesitant fuzzy sets proposed in this paper, to calculate the clustering results under different values of p.
Comparing the results, we find that the minimum confidence level λ required to completely separate A_i (i = 1, 2, ..., 10) differs with the parameter p and the distance measure, as shown in Table 4:
The minimum confidence level λ required to completely separate A_i (i = 1, 2, ..., 10)
Using four distance measures and five parameter values p = 1, 2, 3, 5, 10 for clustering in the dual hesitant fuzzy information system, the clustering results of this example show that A2 and A10 are placed in the same category even at a high confidence level. Therefore, we believe that A2 and A10 are the most similar. When the confidence level is low, A6 is the first to be separated into a category of its own, so we consider A6 to be quite different from the other objects.
From the above numerical analysis, we compare the influence of the different distances and of the value of the parameter p on the dual hesitant fuzzy clustering algorithm. It follows that, for a given distance, the larger the parameter p, the smaller the confidence level required to divide A_i (i = 1, 2, ..., 10) into specific categories. For the same parameter p, clustering with the generalized distance proposed by Su requires a higher confidence level than clustering with the distances proposed in this paper. Among the three distances proposed in this paper, when the sets are to be divided into the same number of categories, clustering with the generalized normalized distance needs the maximum confidence level and clustering with the generalized normalized Hausdorff distance needs the minimum confidence level.
When measuring the distance between dual hesitant fuzzy sets, this paper takes into account the hesitant degree, which is the latent information in dual hesitant fuzzy sets. On this basis, it proposes a new generalized normalized distance, a new generalized normalized Hausdorff distance and a new generalized normalized hybrid distance for dual hesitant fuzzy sets. The new generalized distances are then used to build a clustering algorithm for dual hesitant fuzzy sets. By comparing the influence of the different distances, and of different values of p in the distance definitions, on the clustering algorithm, the findings suggest that the clustering algorithm can effectively reduce the confidence level of the system, which implies that, at the same confidence level, the dual hesitant fuzzy sets can be classified more accurately. In conclusion, this paper objectively describes the human decision-making process and rationally reveals the scientific modeling of data.
Future studies will focus on three directions. First, probability can be introduced into decision-making research on the dual hesitant fuzzy set and its extension theories. Second, the dual hesitant fuzzy set and its extension theories can be combined with rough set theory and soft set theory for decision-making. Finally, the combination of dual hesitant fuzzy set theory and machine learning algorithms can be used in decision-making applications for practical problems.
Acknowledgments
This research is funded by the National Social Science Project (17XTQ013) and the project of the Qinghai Provincial Key Laboratory of IoT (2020-ZJ-Y16).
