Abstract
The Pythagorean fuzzy set is a reliable technique for soft computing because of its ability to curb indeterminate data when compared with the intuitionistic fuzzy set. Among the several measuring tools in the Pythagorean fuzzy environment, the correlation coefficient is vital since it can measure the interdependency and interrelationship between any two arbitrary Pythagorean fuzzy sets (PFSs). Some techniques for calculating the correlation coefficient of PFSs (CCPFSs) from a statistical perspective have been proposed, but with some limitations, namely: (i) failure to incorporate all the parameters of PFSs, which leads to information loss, (ii) imprecise results, and (iii) lower performance indexes. Consequently, this paper introduces some new statistical techniques for computing CCPFSs by using Pythagorean fuzzy variance and covariance, which resolve these limitations with better performance indexes. The new techniques incorporate the three parameters of PFSs and are defined within the range [-1, 1] to show the strength of correlation between the PFSs and to indicate whether the PFSs under consideration are negatively or positively related. The validity of the new statistical techniques is tested on some numerical examples, wherein the new techniques show superior performance indexes in contrast to similar existing ones. To demonstrate the applicability of the new statistical techniques, some multi-criteria decision-making (MCDM) problems involving medical diagnosis and pattern recognition are solved via the new techniques.
Introduction
Decision-making (DM) is an integral part of decision science in which choices are identified and selected on the basis of the preferences of decision-makers via pertinent assessment criteria. DM problems may be multi-criteria, multi-attribute or multi-objective in nature. MCDM is an operational technique for curbing complex engineering problems in machine language, artificial intelligence, etc. Multi-attribute decision-making (MADM) deals with making choices among a finite number of decision alternatives with respect to multiple, usually conflicting, attributes. In multi-objective decision-making (MODM) problems, by contrast, the number of alternatives is infinite, and the trade-offs among the considered criteria are described by continuous functions. Nonetheless, decision-making is a herculean task because most data or available variables are imprecise and uncertain in nature. To cope with such fuzziness, Zadeh [67] introduced the fuzzy set to model human knowledge. It suffices to state that fuzzy models are limited because they consider only the membership grade of the concerned data.
Subsequently, Atanassov [1] proposed intuitionistic fuzzy sets (IFSs) to tackle uncertainties in practical problems more effectively. An IFS is described by a membership degree (MD) μ, a non-membership degree (NMD) ν and a hesitation margin (HM) π such that the sum of all the parameters is one and μ + ν ≤ 1. IFSs have been applied to solve a myriad of DM problems. Some distance and similarity measures between IFSs have been studied with application in pattern recognition problems [3, 58]. The concept of composite relations has been extended to the intuitionistic fuzzy environment with application in medical diagnosis [6, 53]. Similarly, the applications of IFSs in group DM based on Heronian aggregation operators and cluster algorithms have been examined [44, 64]. However, the notion of IFSs becomes insufficient to model decision-making problems where the sum of MD and NMD is more than one, i.e., μ + ν > 1. For instance, if we take μ = 0.5 and ν = 0.6, then μ + ν = 1.1 > 1, so an IFS cannot model this scenario.
To help decision-makers overcome this challenge, Atanassov [2] proposed a soft computing tool called the IFS of second type, also known as Pythagorean fuzzy sets (PFSs) [59, 62]. PFSs generalize IFSs in such a way that the sum of the MD and NMD may be greater than one, provided that the sum of their squares is at most one. Clearly, every IFS is a PFS, but not every PFS is an IFS. Yager [61] discussed some fundamental details on PFSs. Some modal operators on IFSs have been presented in the Pythagorean fuzzy context [11]. In terms of applications, PFSs have received considerable attention from numerous researchers, and the idea has been applied in a myriad of areas via many measuring techniques. The applications of PFSs in pattern recognition, personnel appointment and multiple criteria group decision making via similarity measures have been explored [12, 71]. Also, some distance measures between PFSs were discussed with applications [13, 70]. The notion of composite relations has been studied in the Pythagorean fuzzy setting with some applications in career placements and medical diagnosis [9, 10]. Some studies on the Choquet integral operators and Einstein prioritized aggregation operators under PFSs, interval-valued PFSs and Pythagorean hesitant fuzzy sets, and their applications in DM problems, have been considered [36, 43]. Several methods for the application of PFSs and interval-valued PFSs in MCDM and MADM have been examined [37, 68]. TOPSIS is pivotal in decision-making and, as such, it was extended to PFSs and Pythagorean hesitant fuzzy sets with incomplete weight information [40, 75]. Certain applications of other soft computing tools can be found in [4, 50].
In a quest to measure the interdependency, similarity and interrelationship between fuzzy sets, the notion of correlation coefficient was studied in the fuzzy environment to measure fuzzy data [5, 65]. By extension, Gerstenkorn and Manko [31] initiated the research on the correlation coefficient between IFSs (CCIFSs). Hong and Hwang [33] proposed CCIFSs in probability spaces. Hung [34] deployed a statistical tool to develop a CCIFSs technique. Mitchell [46] studied a new CCIFSs method based on an integral function. Park et al. [48] and Szmidt and Kacprzyk [54] independently extended the method in [34] by adding the hesitation margin of IFSs to the membership and non-membership grades considered in [34]. Hung and Wu [35] proposed a method for measuring CCIFSs based on the centroid method. Liu et al. [45] introduced a new CCIFSs method with application using some statistical tools. Thao et al. [57] proposed a CCIFSs method similar to the technique in [45]. Garg and Kumar [29] introduced a new CCIFSs method based on set pair analysis and used the method to address MCDM problems. The concept of correlation coefficient has been extended to intuitionistic multiplicative and complex intuitionistic fuzzy environments [27, 30]. A TOPSIS method based on correlation coefficient was proposed in [28] to address MCDM problems with intuitionistic fuzzy soft set information. Some modified correlation coefficient techniques with their application in real-life decision-making problems have been studied [16, 17]. Certain novel approaches of computing CCIFSs embedded with algorithms were studied and applied in the process of diagnosis [22, 25]. Other techniques of calculating CCIFSs have been studied and utilized in DM problems by different authors [16, 69].
Since PFSs have proven to be more effective in curbing uncertainties than IFSs, it is expedient to introduce the concept of correlation coefficient in the Pythagorean fuzzy environment. Garg [26] first studied the correlation coefficient between PFSs (CCPFSs) and applied the concept to address MCDM problems. Thao [56] proposed a CCPFSs technique by using the ideas of variance and covariance, and applied the approach to solve medical diagnosis problems. Singh and Ganie [51] proposed some CCPFSs methods with applications, but the procedures do not incorporate all the traditional parameters of PFSs. Ejegwa [15] proposed a CCPFSs method involving all the parameters of PFSs and applied the method to decision-making problems. A modified version of Szmidt and Kacprzyk's approach to computing the correlation coefficient was extended to the Pythagorean fuzzy environment and applied to solve certain problems of pattern recognition [20]. A Pythagorean fuzzy algorithmic approach incorporating a novel method of calculating CCPFSs, with a medical diagnosis application, has been initiated [24].
By examining the veracity of the methods of computing the correlation coefficient, we notice some limitations that need to be addressed. In short, the approaches of computing the correlation coefficient from a statistical viewpoint, as seen in [45, 57], do not take into account the complete parameters of IFSs/PFSs, thus rendering the outputs of the methods unreliable. The quest for a technique of calculating CCPFSs which involves all the parameters of PFSs (to avoid information loss) with a better performance index motivates this work. This paper proposes new techniques of computing CCPFSs by taking into account the complete parameters of PFSs using a Pythagorean fuzzy variance and covariance. The new approaches have better performance indexes in contrast to similar existing techniques. The objectives of this work are to: (i) evaluate the techniques of computing CCPFSs in [45, 57] to foster the introduction of new statistical CCPFSs techniques with accuracy and reliability in measuring CCPFSs, (ii) authenticate the new techniques mathematically, and numerically substantiate the advantages of the new techniques over the techniques in [45, 57], and (iii) determine the application of the new techniques in both pattern recognition and medical diagnosis problems.
The rest of the paper is organized as follows: Section 2 reviews some basic notions of PFSs and discusses some techniques of calculating the correlation coefficient studied in [45, 57], Section 3 presents the new techniques with comparative analysis, Section 4 addresses the applications of the new CCPFSs techniques in solving problems of medical diagnosis and pattern recognition, and Section 5 draws conclusions and supplies some recommendations for future investigation.
Preliminaries
This section reviews certain basic notions of PFSs and discusses some similar techniques of calculating the correlation coefficient studied in the literature.
Pythagorean fuzzy sets
Throughout this work S denotes a non-empty set upon which both IFS and PFS are defined.
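The constraints that separate an IFS from a PFS, as described in the Introduction, can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and the hesitation margin uses the standard PFS form π = √(1 − μ² − ν²), which is assumed here because the formal definition is not reproduced in this text.

```python
import math

def is_ifs_pair(mu, nu, tol=1e-12):
    """Check the IFS constraint: mu, nu in [0, 1] and mu + nu <= 1."""
    return 0 <= mu <= 1 and 0 <= nu <= 1 and mu + nu <= 1 + tol

def is_pfs_pair(mu, nu, tol=1e-12):
    """Check the PFS constraint: mu, nu in [0, 1] and mu^2 + nu^2 <= 1."""
    return 0 <= mu <= 1 and 0 <= nu <= 1 and mu**2 + nu**2 <= 1 + tol

def pfs_hesitation(mu, nu):
    """Hesitation margin under the assumed standard PFS definition."""
    return math.sqrt(max(0.0, 1.0 - mu**2 - nu**2))

# The pair from the Introduction: mu = 0.5, nu = 0.6.
mu, nu = 0.5, 0.6
print("valid IFS pair:", is_ifs_pair(mu, nu))
print("valid PFS pair:", is_pfs_pair(mu, nu))
print("PFS hesitation margin:", round(pfs_hesitation(mu, nu), 4))
```

For μ = 0.5 and ν = 0.6 the sum is 1.1, so the IFS check fails, while the sum of squares is 0.61, so the PFS check passes.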
A = B iff μ_A(s) = μ_B(s) and ν_A(s) = ν_B(s) for all s ∈ S. A ⊆ B iff μ_A(s) ≤ μ_B(s) and ν_A(s) ≥ ν_B(s) for all s ∈ S.
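The equality and inclusion relations above translate directly into element-wise comparisons. The sketch below is illustrative only: it assumes a PFS is stored as a dictionary mapping each element of S to a (μ, ν) tuple, and the function names and sample values are hypothetical.

```python
def pfs_equal(A, B, tol=1e-12):
    """A = B iff MD and NMD agree at every element of S."""
    if A.keys() != B.keys():
        return False
    return all(
        abs(A[s][0] - B[s][0]) <= tol and abs(A[s][1] - B[s][1]) <= tol
        for s in A
    )

def pfs_subset(A, B):
    """A is a subset of B iff mu_A <= mu_B and nu_A >= nu_B at every element of S."""
    if A.keys() != B.keys():
        raise ValueError("A and B must be defined on the same set S")
    return all(A[s][0] <= B[s][0] and A[s][1] >= B[s][1] for s in A)

# Hypothetical PFSs on S = {s1, s2}; each value is (MD, NMD).
A = {"s1": (0.3, 0.8), "s2": (0.5, 0.6)}
B = {"s1": (0.4, 0.7), "s2": (0.5, 0.6)}
print(pfs_subset(A, B), pfs_subset(B, A), pfs_equal(A, B))
```

Note that inclusion reverses the inequality on the non-membership degree: a larger set has higher membership and lower non-membership at every element.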
Some methods of computing CCIFSs/CCPFSs via variance and covariance have been studied in the literature [45, 57]. Notably, all these methods consider only the MD and NMD, without regard to the HM. We now recall the existing methods before discussing the new approaches of calculating CCPFSs, which are the mainstay of the paper.
The correlation coefficient ρ(A, B) between PFSs A and B satisfies: (i) -1 ≤ ρ(A, B) ≤ 1, (ii) ρ(A, B) = ρ(B, A), and (iii) if A = B, then ρ(A, B) = 1.
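The three conditions above can be tested numerically for any candidate correlation function. The sketch below is a hedged illustration: `satisfies_axioms` is a hypothetical helper, and the classical Pearson coefficient on plain lists of membership degrees is used only as a stand-in candidate. It is not any of the CCPFSs formulas discussed in this paper, whose equations are not reproduced in this text.

```python
import math

def pearson(x, y):
    """Classical Pearson correlation of two equal-length, non-constant lists.
    Stand-in candidate only; NOT a formula from the paper."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def satisfies_axioms(rho, A, B, tol=1e-9):
    """Check boundedness in [-1, 1], symmetry, and rho(A, A) = 1."""
    r_ab, r_ba, r_aa = rho(A, B), rho(B, A), rho(A, A)
    bounded = -1 - tol <= r_ab <= 1 + tol
    symmetric = abs(r_ab - r_ba) <= tol
    reflexive = abs(r_aa - 1) <= tol
    return bounded and symmetric and reflexive

# Hypothetical membership degrees over a three-element set S.
A = [0.1, 0.4, 0.7]
B = [0.3, 0.2, 0.9]
print(satisfies_axioms(pearson, A, B))
```

A check of this kind flags a candidate whose output leaves [-1, 1], which is the failure reported for the weighted method in [56] in Example 2.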
Liu et al.’s correlation coefficient method
A correlation coefficient method between IFSs, defined within the range [-1, 1], was introduced by Liu et al. [45]; we present it in the Pythagorean fuzzy setting. Let S = {s1, …, s_n} for n ∈ (1, ∞), and let A and B be arbitrary PFSs in S. Then, we have the following definition.
Thao et al. [57] introduced a tool for computing the correlation coefficient in the intuitionistic fuzzy context, which we present in the Pythagorean fuzzy setting as follows. Let S = {s1, …, s_n} for n ∈ (1, ∞), and let A and B be arbitrary PFSs in S.
Thao [56] introduced a method of computing CCPFSs via a statistical approach by considering only the MD and NMD. Let S = {s1, …, s_n} for n ∈ (1, ∞), and let A and B be arbitrary PFSs in S.
By considering the weights ω = {ω1, …, ω_n} of each of the elements of S = {s1, …, s_n} such that ω_i ≥ 0 and
Before introducing the new techniques of computing CCPFSs, it is necessary to establish the following concepts. Assume A and B are PFSs of S = {s1, …, s_n} for n ∈ (1, ∞).
Now, we give some properties of the variance and covariance of PFSs without proofs, since the proofs are straightforward.
Assume that
Thus,
Therefore,
We numerically verify the reliability of the new techniques for computing CCPFSs and compare the results with those of the existing similar methods in the Pythagorean fuzzy context.
Using the method in [45], we get E(A) = 0.2, 0.1 and E(B) = 0.2, 0.2667. By Eq. (3), we obtain Table 1.
Computation of ρ1 (A, B)
Hence,
Using the method in [57], we have E(A) = 0.2, 0.1 and E(B) = 0.2, 0.2667. By Eq. (8), we obtain Tables 2, 3 and 4.
Computation of
Computation of
Computation of
Hence, we have
Using the method in [56], we have E(A) = 0.2, 0.1 and E(B) = 0.2, 0.2667. By Eq. (12), we have Table 5.
Computation of ρ3 (A, B)
Hence,
By using Eq. (17), we get
By using the proposed methods, the expectations are computed as follows:
By Eq. (23), we get Tables 6 and 7.
Computations of
Computation of
Hence,
By using Eqs. (24) and (25), we have
Using the method in [45], we get E(A) = 0.5, 0.6 and E(B) = 0.54, 0.45. Using Eq. (3), we have Table 8.
Computation of ρ1 (A, B)
Hence,
Using the method in [57], we have E(A) = 0.5, 0.6 and E(B) = 0.54, 0.45. By Eq. (8), we get Tables 9, 10 and 11.
Computation of
Computation of
Computation of
Hence,
Using the method in [56], we have E(A) = 0.5, 0.6 and E(B) = 0.54, 0.45. By Eq. (12), we have Table 12.
Computation of ρ3 (A, B)
Thus,
By Eq. (17),
By the proposed methods, E(A) = 0.5, 0.6, 0.4781 and E(B) = 0.54, 0.45, 0.6646. By Eq. (23), we get Tables 13 and 14.
Computations of
Computation of
Hence,
By using Eqs. (24) and (25), we have
Here, we present the results obtained from Examples 1 and 2 in Table 15 for quick assessment.
Comparison of results
From Table 15, we observe that the new techniques of computing CCPFSs yield improved results with greater correlation coefficient values when compared with the techniques in [45, 57]. In Example 2, the techniques in [45, 56] indicate a perfect negative correlation between the PFSs, which is not true judging by the values of the proposed methods. The discrepancy is due to the omission of the hesitation margin in the techniques used in [45, 56]. The weighted method in [56] gives a misleading result that lies outside [-1, 1] for Example 2. In like manner, the technique in [57] shows that there is no correlation between the considered PFSs (Example 2), which is again inaccurate. However, the proposed methods yield reliable interpretations of the correlation between the PFSs in Example 2. Among the three new techniques of computing CCPFSs,
In summary, the following are some of the advantages of the new techniques of calculating CCPFSs over the existing techniques: (i) the proposed techniques are consistent with the statistical definition of CCPFSs, as seen in Definition 5; (ii) the proposed techniques are reliable tools for measuring CCPFSs, unlike the techniques in [45, 57], as seen in Example 2; (iii) the proposed techniques yield more accurate and reasonable results than the techniques in [45, 57], as seen in Examples 1 and 2; and (iv) the proposed techniques consider the impact of the hesitation margin in computation, thus giving no leeway for the incorrect interpretations seen in the existing similar techniques.
In this section, we solve two cases of MCDM problems, comprising pattern recognition and medical diagnosis, using the new techniques of computing CCPFSs.
Case 1
Assume we have three patterns P_i, for i = 1, 2, 3, which are represented by PFVs in the feature space S = {s1, s2, s3}. Suppose there is an uncategorized pattern U represented by PFVs in the same feature space S. Table 16 contains the Pythagorean fuzzy pattern representations.
Pythagorean fuzzy pattern representations
The expectations of the patterns are calculated by using Eq. (20). The aim is to classify U into one of the P_i, for i = 1, 2, 3. By applying the new techniques, i.e., Eqs. (23), (24) and (25), we obtain the results in Table 17.
Results for pattern recognition
From the values of the correlation coefficient obtained, it follows that the unknown pattern U can be classified with pattern P3 since the value of the correlation coefficient between P3 and U is the greatest. The decision is uniform using each of the proposed methods.
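The decision rule used here, assigning U to the pattern with the greatest correlation coefficient, can be sketched as follows. The function name and the scores are hypothetical placeholders for illustration; they are not the values of Table 17.

```python
def classify(scores):
    """Return the label whose correlation coefficient with the
    unknown pattern is the greatest."""
    if not scores:
        raise ValueError("scores must not be empty")
    return max(scores, key=scores.get)

# Hypothetical correlation coefficients between U and each known pattern.
hypothetical_scores = {"P1": 0.35, "P2": -0.10, "P3": 0.82}
print(classify(hypothetical_scores))
```

The same rule applies to a diagnosis setting, with diseases in place of patterns and the patient in place of U.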
Case 2
Suppose we have some diseases D_i, for i = 1, 2, 3, 4, 5, represented in PFPs, where D1 = viral fever, D2 = malaria, D3 = typhoid fever, D4 = stomach problem, D5 = chest problem. The set of symptoms of D is S = {s1, s2, s3, s4, s5}, where s1 = temperature, s2 = headache, s3 = stomach pain, s4 = cough, s5 = chest pain.
Assume a patient P shows some symptoms of S during medical examination, which are also captured in PFPs. The Pythagorean fuzzy medical information of P and D with respect to the clinical manifestations S is contained in Table 18.
Pythagorean fuzzy representations of diagnostic process
Now, we determine which of the diseases D relates most with the patient P with respect to the symptoms S by employing Eqs. (23), (24) and (25). The expectations of D1, D2, D3, D4, D5 and P are computed by Eq. (20). The correlation coefficient values between D and P are in Table 19.
Results for disease diagnosis
From the computations using Proposed methods I and II, the patient is suffering mainly from malaria. However, the patient should also be treated for viral fever and typhoid fever, in that order. Using Proposed method III, by contrast, the patient is said to be suffering from typhoid fever with less severity, and should also be treated for viral fever and malaria, in that order. It is worth noting that the patient has a negative correlation with both stomach and chest problems. The discrepancy between the results of Proposed methods I and II and those of Proposed method III arises because Proposed method III takes only the maximum of the variances, and so the severity of the diagnosis from Proposed method III is misleading.
We have discussed in detail some new techniques of computing CCPFSs from a statistical perspective via Pythagorean fuzzy variance and covariance, with their applications in DM. While scrutinizing the reliability of some statistical methods of computing CCIFSs and CCPFSs [45, 57], we observed some setbacks that need to be addressed, such as (i) failure to incorporate the complete parameters of IFSs/PFSs, which leads to information loss, (ii) imprecise results, and (iii) lower performance indexes. In contrast with the existing methods [45, 57], the new techniques remedy all the observed setbacks and so give more reliable output. Some numerical examples were given and analysed to show the advantages of the new techniques over the methods in [45, 57]. To determine the applicability of the new techniques of calculating CCPFSs, some decision-making problems, namely pattern recognition and diagnosis of disease, were addressed. By extending these techniques to an object-oriented environment, MCDM problems with larger populations could be addressed with ease. The proposed correlation measures could be applied to MAGDM with multi-granular hesitant fuzzy linguistic term sets [66], two-sided matching DM with multi-granular hesitant fuzzy linguistic term sets and fuzzy preference relations [72, 74], and social network group DM based on leadership and bounded confidence [73]. The introduced methods can only be utilized in a tri-parametric soft computing construct.
Competing interests
The authors declare that they have no conflicts of interest.
Acknowledgment
This work is supported by Foundation of Chongqing Municipal Key Laboratory of Institutions of Higher Education ([2017]3), Foundation of Chongqing Development and Reform Commission (2017[1007]), and Foundation of Chongqing Three Gorges University.
