Abstract
Hesitant fuzzy set theory provides an effective technique for researchers and engineers to cope with vagueness and uncertainty. In recent years, to explore the correlation between hesitant fuzzy sets, the traditional correlation measure from statistics has been studied extensively in hesitant fuzzy environments. In this study, extant studies of correlation measures in hesitant fuzzy contexts are recalled and analyzed. In view of the foregoing analysis, we find that the extant correlation coefficients have some limitations. Moreover, a few correlation coefficients are not in line with the traditional definition of correlation coefficients. To address the flaws of the existing proposals, a novel hesitant fuzzy correlation coefficient is proposed in this study. The new proposal not only overcomes the flaws of the old hesitant fuzzy correlation coefficients, but also shows several desirable characteristics. The weighted form of the newly defined correlation coefficient and its features are also investigated. Finally, three numerical examples concerning supplier selection and medical diagnosis are examined using the developed correlation coefficients to demonstrate their applicability. Comparison analyses with existing proposals highlight the efficiency of our proposals.
Introduction
When making decisions in real life, decision makers are usually confronted with various uncertainties. To manage such uncertainties, many different theories and techniques have been presented in recent decades. One of the most popular theories, the theory of fuzzy sets [1], was proposed to cope with fuzzy uncertainty in the 1970s. Since the dawn of that theory, it has achieved a great deal of success both in academic research [2] and in engineering applications [3, 4]. Meanwhile, for improved characterization of imprecision and uncertainty, the fuzzy set has been successively extended by researchers, and many higher order extensions, such as Atanassov’s intuitionistic fuzzy set [5], the type-2 fuzzy set [6], and the fuzzy multiset [7, 8], have been developed. In the last several years, the theory of the hesitant fuzzy set (HFS) was also introduced in [9, 10] by extending the fuzzy set. By permitting the membership to have several possible values, the HFS overcomes the difficulty of establishing the membership functions of fuzzy sets. In comparison with the other extensions outlined above, the theory of HFS can reflect the hesitancy of individuals more accurately and objectively in actual decision-making situations. Since its appearance, the HFS has attracted much attention from different areas [11–13].
Regarding decision making with HFSs, the existing literature contains abundant studies on the combination of multiple pieces of hesitant fuzzy information. Various types of aggregation functions have been developed for aggregating hesitant fuzzy information [14–20]. For instance, based on power aggregation operators, Zhang [14] presented several new aggregation operators for aggregating hesitant fuzzy information in decision contexts. Inspired by the prioritized aggregation operators [15], Wei [16] proposed some prioritized aggregation operators for HFSs and discussed their applicability by addressing hesitant fuzzy MAGDM. By extending Frank and Hamacher operations to HFSs, some generalized aggregation operators for HFSs together with their desirable properties were discussed in detail in [17–19]. The decision-making approaches mentioned above are developed on the basis of hesitant fuzzy aggregation functions. Meanwhile, there are numerous hesitant fuzzy decision-making approaches that are based on traditional decision-making techniques [21–32]. For example, Zhu et al. [21] developed a hesitant multiplicative programming method inspired by the analytic hierarchy process [22]. Zhang and Wei [24] applied the classical TOPSIS [25] and VIKOR [26] methods to resolve decision making using HFSs. Based on the theory of Social Choice, Alcantud et al. [27] developed a novel ranking method for HFSs and explored its applications to teaching performance assessment. By combining the traditional ELECTRE method [28] with the theory of HFS, Peng et al. [29] and Galo et al. [30] proposed two outranking approaches for uncertain multi-criteria decision-making. It is evident that the HFS theory has made great progress in decision making after nearly ten years of rapid development.
In recent years, the conventional correlation measure of two point sets has been studied extensively in hesitant fuzzy environments and other hesitant fuzzy-based contexts [33], which has further advanced hesitant fuzzy decision-making. In statistics, a correlation measure is usually used to assess how close two random variables are to having a linear relationship with each other. To investigate and better understand the relationship between two HFSs, many scholars have introduced the traditional concept of correlation into the hesitant fuzzy setting [34–37]. Subsequently, many correlation measure-based decision-making techniques have been presented and successfully employed to solve hesitant fuzzy decision problems. For instance, Xu and Xia [35] put forward several forms of correlation coefficients for HFSs and explored their application to diagnosing patients. Chen et al. [36] developed several types of correlation coefficients to measure the correlation of HFSs and applied them to the assessment and classification of software, as well as to the evaluation and management of business failure risk. Following the works in [35, 36], Liao et al. [37] recently developed a new formulation of the correlation coefficient for studying the relationship of HFSs. The correlation measure of HFSs is clearly the foundation of the aforesaid approaches. More importantly, the study of the correlation of HFSs is still an open problem. This paper therefore continues to improve the investigation of correlation coefficients for HFSs by simultaneously addressing the following flaws:
(1) The correlation coefficients presented in [35, 36] are not in line with the traditional definition of a correlation coefficient. In statistics, the value of a correlation coefficient lies between –1 and 1, while the values derived from the proposals in [35, 36] are always positive. In other words, their proposals neglect the negative correlation situation.
(2) In the definition of Liao et al. [37], the correlation coefficient for HFSs is calculated by averaging the means of correlation coefficients between all of their hesitant fuzzy elements (HFEs), which cannot cover the original information fully and thus causes a loss of information. Accordingly, the formula does not work well when the correlations between one HFS and several HFSs with the same means must be investigated.
(3) The correlation coefficient formulas in [35, 36] are defined under two assumptions. The first is that the HFEs under consideration have equal length, and the second is that their possible values are arranged in a special (descending or ascending) order. Whereas the second assumption can be easily satisfied, the first does not hold in most situations. Although the correlation coefficient of HFSs can still be obtained by using the extension principles of Chen et al. [36], the rationality of doing so requires further investigation.
Furthermore, supplier selection has always been an important issue in supply chain management. Well-rounded suppliers contribute greatly to the improvement of supply chain performance. Over the past decades, numerous multi-criteria decision-making approaches have been reported to cope with supplier selection problems [3, 38–41]. Govindan et al. [40] and Ho et al. [41] analyzed in detail the literature on approaches for supplier selection. One interesting characteristic is the preponderance of fuzzy analysis. In this paper, to explore an application of our proposals in supplier selection, a practical problem in this field will be analyzed. Moreover, to explore another application of the proposals in the field of medical diagnosis, a related problem will also be solved. As most patients’ symptom manifestations and physicians’ knowledge involve fuzzy concepts in medical diagnosis [42], the applications of the fuzzy set, Atanassov’s intuitionistic fuzzy set, and the HFS have been illustrated in the medical field [43–46]. In particular, the work in [35] demonstrates that the use of correlation coefficients between HFSs performs well in diagnosis problems. Hence, the applicability of the newly proposed correlation coefficients for HFSs will be validated by addressing a medical diagnosis problem.
With respect to the foregoing, the essence of this paper is to present novel correlation measures for HFSs and further to explore their practical applications in decision contexts. Specifically, Section 2 reviews several essential concepts about HFSs and performs an analysis of the current correlation coefficients for HFSs. Section 3 constructs a novel correlation coefficient for HFSs and discusses its desirable properties. The weighted form of the proposed correlation coefficient is presented in Section 3 as well. In Section 4, a supplier selection problem and a medical diagnosis problem are solved using our proposals to validate their applicability. Section 5 conducts two comparison analyses to highlight the efficiency of our proposals. Finally, Section 6 concludes this study with some suggestions for future work in this area.
Preliminaries
In this section, we recall some essential concepts about HFSs and review a few extant correlation measures of HFSs.
HFSs
For convenience, the HFS is simply symbolized by the following mathematical formula [35, 47].
In [35, 47], Xia and Xu developed a comparison law so as to compare different HFEs with the use of the following score function.
For two HFEs $h_A(x)$ and $h_B(x)$: $h_A(x) > h_B(x)$ if $S(h_A) > S(h_B)$; $h_A(x) < h_B(x)$ if $S(h_A) < S(h_B)$; and $h_A(x) = h_B(x)$ if $S(h_A) = S(h_B)$.
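The comparison law above can be sketched in a few lines. The score function itself is not reproduced in this text, so the sketch assumes the arithmetic mean of the possible values, which is the form commonly attributed to Xia and Xu; the function names and the sample HFEs are hypothetical.

```python
import math

def score(hfe):
    """Score of an HFE, ASSUMED here to be the arithmetic mean of its values."""
    if not hfe:
        raise ValueError("an HFE needs at least one membership value")
    return sum(hfe) / len(hfe)

def compare(h_a, h_b):
    """Return 1 if h_a > h_b, -1 if h_a < h_b, 0 if the scores tie."""
    s_a, s_b = score(h_a), score(h_b)
    if math.isclose(s_a, s_b, abs_tol=1e-12):  # guard against rounding noise
        return 0
    return 1 if s_a > s_b else -1

# Hypothetical HFEs of different lengths.
h_a = [0.2, 0.4, 0.6]
h_b = [0.5, 0.7]
result = compare(h_a, h_b)  # h_a has the smaller score, so h_a < h_b
```

Note that the comparison depends only on the scores, so two HFEs with different values but equal means are treated as equal under this law.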
In [37], Liao et al. presented some properties that correlation coefficients of HFSs should hold.
(1) $\rho(A, A) = 1$; (2) $\rho(A, B) = \rho(B, A)$; (3) $\rho(A, A^c) = -1$, where $A^c$ is defined as $A^c = \{h^c(x_i)\}$ with $h^c(x_i) = 1 - h_A(x_i)$; and (4) $-1 \le \rho(A, B) \le 1$.
In what follows, we conduct an analysis of extant studies concerning hesitant fuzzy correlation measures and clarify their flaws.
In data analysis, correlation measure is one widely used index in exploring the relationship of different data sets. Recently, the concept of correlation has been introduced into the hesitant fuzzy circumstance. In [35], Xu and Xia presented several different formulas of correlation coefficient to study the correlation between two different HFEs. In the following, we take one of them as an example.
From Definition 3, one can see that the above correlation coefficient is defined under two assumptions: 1) the lengths of different HFEs should be equal; and 2) their values should be arranged in a certain order.
Meanwhile, for two HFSs $A = \{h_A(x_i)\}$ and $B = \{h_B(x_i)\}$ over $X$, Chen et al. [36] developed four correlation coefficients to measure their correlation based on the same assumptions as in Refs. [35, 47]. One of these correlation coefficients is shown below.
Chen et al. [36] further developed the weighted form of Equation (3) as follows:
Note that the formula $\rho_\omega(A, B)$ in Equation (4) reduces to that in Equation (3) when $\omega_i = 1/n$.
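Equations (3) and (4) are not reproduced in this text. As a rough illustration only, the sketch below assumes the cosine-type form often cited for this family of coefficients, in which the correlation of two HFSs is a weighted sum of position-wise products of sorted, equal-length HFEs, divided by the geometric mean of the two self-correlations. The function names and sample data are hypothetical, and the actual equations may differ in detail.

```python
import math

def inner(h_a, h_b):
    """Mean of position-wise products of two sorted, equal-length HFEs."""
    if len(h_a) != len(h_b):
        raise ValueError("this form assumes HFEs of equal length")
    a, b = sorted(h_a), sorted(h_b)
    return sum(x * y for x, y in zip(a, b)) / len(a)

def cosine_type_corr(A, B, weights=None):
    """ASSUMED cosine-type coefficient; equal weights 1/n by default."""
    n = len(A)
    if weights is None:
        weights = [1.0 / n] * n
    c_ab = sum(w * inner(ha, hb) for w, ha, hb in zip(weights, A, B))
    c_aa = sum(w * inner(ha, ha) for w, ha in zip(weights, A))
    c_bb = sum(w * inner(hb, hb) for w, hb in zip(weights, B))
    return c_ab / math.sqrt(c_aa * c_bb)  # undefined if an HFS is all zeros

# Hypothetical HFS and its complement (each value g replaced by 1 - g).
A = [[0.1, 0.2], [0.8, 0.9]]
A_c = [[1 - g for g in h] for h in A]

rho_plain = cosine_type_corr(A, A_c)
rho_equal_w = cosine_type_corr(A, A_c, weights=[0.5, 0.5])
```

Under this assumed form, the weighted value with $\omega_i = 1/n$ coincides with the unweighted one because the common factor cancels, and the coefficient between an HFS and its complement is still positive, since every term is a product of nonnegative memberships.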
The correlation coefficients introduced in Definitions 3 and 4 have the following limitations.
(1) All of the formulas rest on two assumptions: first, the possible values in each HFE should be arranged in a certain sequence; second, the lengths of different HFEs should be equal. The first assumption can be easily satisfied, but the second does not hold in most situations. Given that different HFEs often have different lengths in real life, Chen et al. [36] developed the pessimistic principle (or optimistic principle), which lengthens the shorter HFE by repeating its minimum (or maximum) value until the shorter and the longer HFEs have the same length. Even though the hesitant fuzzy correlation coefficient can be obtained by using these principles, their rationality requires further discussion, because adding artificial values to an HFE alters its original information.
(2) The correlation coefficients proposed in Refs. [35, 36] are not in line with the conventional definition of a correlation coefficient in statistics. Traditionally, the value of a correlation coefficient varies from –1 to 1, so that it not only indicates the intensity of the correlation between two statistical variables but also reflects whether the correlation is negative or positive. However, the values derived from the proposals in Refs. [35, 36] are always positive. In other words, these proposals neglect the negative correlation situation. This indicates that the hesitant fuzzy correlation coefficient formulas in Refs. [35, 36] lack theoretical support, which further diminishes their applicability.
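The pessimistic and optimistic principles described above are simple to state in code. The following is a minimal sketch; the function name `extend_hfe` and the sample HFE are hypothetical, and the padded result is sorted in ascending order purely for readability.

```python
def extend_hfe(hfe, target_len, principle="pessimistic"):
    """Pad an HFE to target_len by repeating its min (pessimistic) or max (optimistic)."""
    if target_len < len(hfe):
        raise ValueError("target length is shorter than the HFE")
    if principle == "pessimistic":
        filler = min(hfe)
    elif principle == "optimistic":
        filler = max(hfe)
    else:
        raise ValueError("principle must be 'pessimistic' or 'optimistic'")
    return sorted(list(hfe) + [filler] * (target_len - len(hfe)))

def mean(values):
    return sum(values) / len(values)

short = [0.3, 0.5]                          # hypothetical HFE of length 2
pess = extend_hfe(short, 4, "pessimistic")  # [0.3, 0.3, 0.3, 0.5]
opti = extend_hfe(short, 4, "optimistic")   # [0.3, 0.5, 0.5, 0.5]
```

In this example the mean of the HFE moves from 0.4 to 0.35 under the pessimistic principle and to 0.45 under the optimistic one, which is a concrete instance of the concern that padding alters the original information.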
Under such circumstances, Liao et al. [37] put forward the following formulation of correlation coefficient for HFSs.
In this section, we first construct a novel correlation measure for HFSs and then, based on it, present a novel correlation coefficient for HFSs. Several desirable features of the newly proposed correlation coefficient and its weighted form are discussed.
A novel correlation coefficient between HFSs
Equation (7) reveals that all combinations of possible values in the two HFEs $h_A(x_i)$ and $h_B(x_i)$ are considered in the proposed correlation measure. The two assumptions regarding the existing correlation measures are clearly relaxed. More importantly, the correlation defined by Equation (7) satisfies the following theorem:
C (A, B) = C (B, A);
where
(2) One can reason from Equation (7) that
Given that
it follows that
Thus,
Using Definition 6, the following hesitant fuzzy correlation coefficient is derived.
The above calculation of the correlation coefficient does not need to consider the lengths of the HFEs $h_A(x_i)$ and $h_B(x_i)$. In addition, the ordering of the possible values in each HFE has no influence on the results. From Equation (8), we find that all combinations of possible values in the HFEs $h_A(x_i)$ and $h_B(x_i)$ are taken into consideration when computing $\rho(A, B)$. In the following, Example 1 is re-investigated to illustrate this correlation coefficient.
Similarly, we can calculate
Using Equation (8), we find
The difference between HFSs $A$ and $B$ is easily reflected by the calculation results of $\rho(A, C)$ and $\rho(B, C)$.
The proposed correlation coefficient satisfies the properties in Theorem 1, which are concluded in Theorem 3.
To facilitate the analysis of Theorem 3, a relevant lemma is presented.
Theorem 3 is proved as follows.
Given that
Further, if
Accordingly, it can be concluded from Lemma 1 that
Consequently, we have $-1 \le \rho(A, B) \le 1$, thus concluding the proof. □
From the analysis above, one can see that the newly defined correlation coefficient in Definition 7 not only surmounts the flaws of the hesitant fuzzy correlation coefficients in [35, 36], but also shares the same desirable properties as the one developed in [37]. Furthermore, the above numerical results clearly show the better performance of the proposed correlation coefficient compared with the one in [37]. The efficiencies of the different correlation coefficients in addressing practical decision-making problems will be discussed in detail in Section 5.
Suppose that $\omega_i$ denotes the normalized weight of $x_i$, as defined before; then the weighted correlation of HFSs is presented.
Particularly, if $\omega_i = 1/n$, then the formula $\rho_\omega(A, B)$ in Equation (11) reduces to that in Equation (8).
(1) $\rho_\omega(A, A) = 1$; (2) $\rho_\omega(A, B) = \rho_\omega(B, A)$; (3) $\rho_\omega(A, A^c) = -1$, where $A^c$ is defined as $A^c = \{h^c(x_i)\}$ with $h^c(x_i) = 1 - h_A(x_i)$; and (4) $-1 \le \rho_\omega(A, B) \le 1$.
Suppose that
In virtue of
then the following equation is obtained:
Given that
Using Lemma 1, one obtains the inequality below:
Therefore, one can obtain $-1 \le \rho_\omega(A, B) \le 1$. □
In what follows, two practical examples about supplier selection and medical diagnosis are examined using the developed correlation coefficients of HFSs to demonstrate their applicability.
Supplier selection using the correlation coefficient of HFSs
Hesitant fuzzy information by the department
Suppose that $rw = (rw_1, \ldots, rw_5)$ denotes the relative importance of the five attributes, which is not normalized, and that the third attribute is identified as the most important one. The department then specifies that $rw_3 = 1$. Next, by comparing each attribute with the third one, the relative importance of the remaining attributes is determined as $(rw_1, rw_2, rw_4, rw_5) = (0.83, 0.67, 0.33, 0.50)$. Clearly, $\min\{rw_1, \ldots, rw_5\} > 0$ and $\max\{rw_1, \ldots, rw_5\} = 1$. Finally, $w_j$ $(j = 1, \ldots, 5)$ is calculated as follows:
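The normalization formula is not reproduced in this text. A natural choice, assumed in the sketch below, is to divide each relative importance by the sum of all five, so that the weights are positive and add up to one; the formula actually used may differ. The variable names are hypothetical.

```python
# Relative importance of the five attributes, with rw_3 = 1 as the reference.
rw = [0.83, 0.67, 1.00, 0.33, 0.50]

# ASSUMPTION: sum normalization, w_j = rw_j / (rw_1 + ... + rw_5).
total = sum(rw)
w = [r / total for r in rw]

rounded = [round(x, 4) for x in w]
# Under this assumption: [0.2492, 0.2012, 0.3003, 0.0991, 0.1502]
```

With this choice the third attribute keeps the largest weight, and the ordering of the attributes by importance is unchanged.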
According to Equation (10) in Definition 8, we have
Further, as per Equation (11), we can obtain that
According to the values of $\rho_\omega(X^*, X_i)$ $(i = 1, 2, \ldots, 4)$, the four suppliers are ranked as $X_1 \succ X_3 \succ X_4 \succ X_2$. Accordingly, the best choice is $X_1$.
To identify a disease for each patient based on their signs and symptoms, we must analyze the relationship between the diagnoses’ symptoms and the patients’ symptoms. With the use of the proposed hesitant fuzzy correlation coefficient formula as presented in Definition 7, the correlation coefficient values are derived and presented in Fig. 1. It is determined from the results shown in Fig. 1 that Bob has a Stomach problem, while Ted, Al and Joe have Malaria.

Results obtained when applying the developed approach to the example.
Based on the medical diagnosis problem in Example 4 and another medical diagnosis problem, we perform two comparison analyses between the proposed correlation coefficient and two representative correlation coefficients [35, 37] to emphasize the efficiency of our proposal.
Comparison with the approach of Xu and Xia in medical diagnosis
We compare our proposal with the approach of Xu and Xia [35] based on Example 4 in this section. The results from applying the correlation coefficient formula of Xu and Xia to Example 4 are illustrated in Fig. 2. Figure 2 indicates that Joe has malaria, Bob has a stomach problem, and Al and Ted have viral fever. These results, however, differ from those generated by the proposed approach.

Results obtained when applying the approach of Xu and Xia [35] to Example 4.
From the results displayed in Figs. 1 and 2, one can find that all the correlation coefficient values obtained by the formula of Xu and Xia are limited to the interval [0, 1], but the values generated by the proposed correlation coefficient formula lie in the interval [–1, 1]. In other words, while the results obtained by the formula of Xu and Xia (see Fig. 2) can indicate the strength of the relationships between the symptoms of the patients and the symptoms’ characteristics of the diagnoses, they cannot demonstrate a negative or positive correlation. For example, it is evident from the symptom characteristics of Joe and the characteristics of the Chest problem that there is a negative correlation, while the correlation coefficient value obtained by the formula of Xu and Xia is 0.9750, suggesting that Joe likely has a Chest problem. This will greatly influence the rationality of the diagnosis.
Meanwhile, the correlation coefficient values in Fig. 2 are quite close to one another (from 0.9677 to 0.9969), which makes it difficult for doctors to make decisions. With such similar values, a doctor may have difficulty distinguishing among the various diagnoses, which further reduces the reliability of the diagnosis. In the proposed method, however, this problem does not arise, as the correlation coefficient values in Fig. 1 vary from –0.5715 to 0.9804 and the differences among these values are significant. To better illustrate this point, we present the results generated by the two methods on the same domain [–1, 1] in the form of two radar graphs (Figs. 3 and 4, respectively). Because the curves shown in Fig. 4 cannot reveal the differences among the diagnoses, it is impossible for a doctor to make a reliable and convincing diagnosis for each patient. In stark contrast, the curves presented in Fig. 3 clearly reflect the differences among the diagnoses. This demonstrates the superiority of the developed approach in medical diagnoses.

Results obtained when using the developed approach.

Results obtained when using the approach of Xu and Xia [35].
In this section, our proposal is compared with the approach of Liao et al. [37] based on two examples. First, the medical diagnosis problem considered in Example 4 is solved by the two approaches. Then another medical diagnosis problem described in Example 5 is analyzed to further facilitate the comparison of the two approaches.
The results obtained for Example 4 when using the formula of Liao et al. are illustrated in Figs. 5 and 6. From these results, one can determine that Ted, Joe and Al have Malaria, and Bob has a Stomach problem. While these results differ from those deduced from the method of Xu and Xia, they are the same as the results obtained by the proposed method. More specifically, comparing Fig. 1 (or Fig. 3) with Fig. 5 (or Fig. 6), we find that not only are the diagnosis results generated by the proposed correlation coefficient formula and that of Liao et al. the same, but the correlation coefficient values calculated by the two formulas are also the same. However, this does not mean that the formula proposed in this paper is the same as the formula developed by Liao et al. [37]. To illustrate this, we offer another adjusted example.

Results obtained when applying the approach of Liao et al. [37] to Example 4.

Results obtained when using the approach of Liao et al. [37].
Hesitant fuzzy information for patients P’
In the following, we employ our proposal and that of Liao et al. to calculate the results. The results from applying the two methods to this example are presented in Figs. 7 to 10. From Figs. 7 and 9, it is determined that Al’ and Joe’ have Malaria, Bob’ has a Stomach problem, and Ted’ has Viral fever. These results are quite different from those generated by the method of Liao et al. (see Figs. 8 and 10).

Results obtained when applying the developed approach to Example 5.

Results obtained when applying the approach of Liao et al. [37] to Example 5.

Results obtained when using the developed approach.

Results obtained when using the approach of Liao et al. [37].
Comparing Fig. 8 (or Fig. 10) with Fig. 5 (or Fig. 6), we find that the correlation coefficient values obtained with the formula of Liao et al. between the patients in Example 5 and the possible diagnoses are the same as those between the patients in Example 4 and the diagnoses, although the values of the various symptoms for the two sets of patients are quite different. This is because the correlation coefficient of the HFSs is calculated by averaging the means of the correlation coefficient values between all of their HFEs. The correlation coefficients between one HFS and several HFSs with the same means are therefore the same. For example, the mean of the HFE for each symptom of patient Al’ is equal to that of patient Al. By using the formula of Liao et al., we obtain the same diagnoses for Al’ and Al, even though the HFEs for each symptom characteristic of the two patients are completely different. Thus, the correlation coefficient values cannot distinguish between the symptom characteristics of the two sets of patients, which reflects the weakness of the method of Liao et al. However, comparing Fig. 7 (or Fig. 9) with Fig. 1 (or Fig. 3), it is evident that the correlation coefficient values obtained with the proposed formula between the two sets of patients and the set of possible diagnoses are different. The above analysis indicates that our proposed method is reasonable and convincing in medical diagnoses.
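The insensitivity described above can be reproduced with a small experiment. The formula of Liao et al. is not reproduced in this text, so the sketch below assumes that it amounts to the Pearson correlation of the HFE means, which is consistent with the description given here; the function names and all membership values are hypothetical and are not taken from Examples 4 or 5.

```python
import math

def hfe_means(A):
    """Mean of each HFE in an HFS (one HFE per symptom)."""
    return [sum(h) / len(h) for h in A]

def mean_based_corr(A, B):
    """ASSUMED mean-based coefficient: Pearson correlation of the HFE means."""
    ma, mb = hfe_means(A), hfe_means(B)
    a_bar, b_bar = sum(ma) / len(ma), sum(mb) / len(mb)
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(ma, mb))
    var_a = sum((x - a_bar) ** 2 for x in ma)
    var_b = sum((y - b_bar) ** 2 for y in mb)
    return cov / math.sqrt(var_a * var_b)  # undefined if all means coincide

# Two hypothetical patients whose HFEs differ but share the same means.
patient = [[0.2, 0.4], [0.7, 0.9], [0.5]]
patient_prime = [[0.3], [0.6, 0.8, 1.0], [0.1, 0.9]]
diagnosis = [[0.1, 0.3], [0.6], [0.9]]

rho_1 = mean_based_corr(patient, diagnosis)
rho_2 = mean_based_corr(patient_prime, diagnosis)

# Complement of the first patient: each value g is replaced by 1 - g.
complement = [[1 - g for g in h] for h in patient]
rho_c = mean_based_corr(patient, complement)
```

Here `rho_1` and `rho_2` agree up to rounding because only the means enter the computation, while `rho_c` is –1 up to rounding, in line with the third property of Theorem 1.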
After analyzing the existing studies regarding correlation measures of HFSs, we find that the extant proposals have some flaws. The old correlation coefficients depend on the number of possible values in each HFE and on the order of these values. Some of the existing correlation coefficients can only range from 0 to 1, which is not in line with the traditional definition of a correlation coefficient in statistics, where correlation coefficients can vary from –1 to 1. To resolve the weaknesses of the extant proposals, we present a new correlation coefficient for HFSs. Its desirable features and weighted form are also discussed. The proposals of this paper not only enjoy many ideal characteristics but also relax the assumptions required by the old proposals. Finally, the proposed correlation coefficients are implemented and applied to solve supplier selection and medical diagnosis problems. The experimental results verify the applicability and efficiency of our proposals.
In the future, it is worthwhile to apply the proposals to other fields, such as investment evaluations [20, 51] and cluster analyses [36], to broaden their applicability. In particular, Alcantud and Giarlotta [52] recently introduced a new model of HFSs, namely necessary and possible HFSs, which performs well in modelling collective decision situations. Therefore, extending the proposals of this paper to this novel model, or proposing other specific correlation coefficients for it, is an interesting topic for future research in this area.
Acknowledgments
This work was supported in part by the State Scholarship Fund of China (No. 201706690025), the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (No. 71521001), and in part by the National Natural Science Foundation of China (Nos. 71601066, 71501056, and 71501054).
