Abstract
In this paper, tests of skewness and kurtosis are introduced under neutrosophic statistics. The necessary measures and neutrosophic forms of these estimators are introduced. The application of the proposed tests is illustrated using data associated with heart disease. The real example shows that the proposed tests are more flexible and informative than the existing tests under classical statistics. In addition, the analysis shows that the proposed tests give information about the measure of indeterminacy in the presence of uncertainty.
Introduction
Tests of kurtosis and skewness have been widely applied in the social sciences and psychology, where the data usually deviate from normality; see [1] and [2]. The test of kurtosis is applied to assess the height of the distribution relative to its central value. It determines whether the distribution is leptokurtic, mesokurtic (normal), or platykurtic. The test of skewness, on the other hand, determines whether the distribution is positively skewed, negatively skewed, or symmetric. A study of kurtosis and skewness helps to identify a suitable distribution for the data. Based on the information obtained from these tests, decision-makers can decide which distribution is suitable for the analysis of the data. The classical measures of kurtosis and skewness are calculated from the third and fourth moments about the mean, which increases the bias when the distribution is non-normal. Reference [3] introduced alternative measures of kurtosis and skewness based on robust estimators. References [4] and [5] pointed out that conventional estimators are affected by outliers. Reference [6] worked on a data trimming approach.
References [7] and [8] pointed out that, in a variety of fields, measurements are imprecise and recorded in intervals. For imprecise data, the traditional measures of kurtosis and skewness cannot be applied. In such situations, statistical tests using fuzzy logic are applied. References [9], [10] and [11] applied statistical tests under the fuzzy approach. Reference [12] worked on a non-parametric test for interval data.
Fuzzy logic does not provide information about the measure of indeterminacy. Neutrosophic logic has an edge over fuzzy logic as it gives information about the measures of truth, falseness, and indeterminacy; see [13]. According to [13], neutrosophic logic is more efficient than fuzzy logic and interval-based analysis. Applications of neutrosophic logic can be seen in References [14–26]. Reference [27] introduced the idea of neutrosophic statistics, which is applied when the data are uncertain, vague, or indeterminate. Neutrosophic statistics is a generalization of classical statistics and gives additional information about the measure of indeterminacy. References [28] and [29] presented methods to analyze neutrosophic numbers. Reference [30] proposed a new Mann-Whitney test under indeterminacy. Some applications of tests using neutrosophic statistics can be seen in [31] and [32].
The tests of kurtosis and skewness under neutrosophic statistics can be more efficient than the tests under classical statistics and the tests under fuzzy logic. To the best of our knowledge, after exploring the literature, there is no work on tests of kurtosis and skewness under an indeterminate environment. In this paper, we present the estimators of kurtosis and skewness under neutrosophic statistics. The application of the proposed tests is given with data associated with heart disease. It is expected that the proposed tests will be helpful in assessing the shape of medical data in the presence of uncertainty.
The proposed tests
In the following sections, we present the tests of kurtosis and skewness under neutrosophic statistics. We first introduce the two estimators of kurtosis and then the estimator of skewness. Suppose that $a_{Ni}$ and $b_{Ni}$ are two neutrosophic numbers of size $n_N$, where $i = 1, 2, 3, \ldots, n_N$. Let $X_{Ni} = a_{Ni} + b_{Ni} I_N$; $I_N \in [I_L, I_U]$ be a neutrosophic random number, where $a_{Ni}$ and $b_{Ni} I_N$ are the determined part and the indeterminate part of the neutrosophic number, respectively. Note that $I_N \in [I_L, I_U]$ is an indeterminate interval. The neutrosophic number $X_{Ni} \in [X_{Li}, X_{Ui}]$ reduces to the classical random number if $I_L = 0$. Based on this information, the neutrosophic average can be written as
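As a rough illustration of the notation above, the following sketch stores the determined parts $a_{Ni}$ and the coefficients $b_{Ni}$ in two lists and averages them separately. The component-wise average, the function name `neutrosophic_mean`, and the sample values are assumptions made for illustration only; they are not taken from the paper.

```python
from statistics import mean

def neutrosophic_mean(a, b, i_lower, i_upper):
    """Average X_Ni = a_Ni + b_Ni * I_N, assuming the mean acts
    component-wise on the determined and indeterminate parts."""
    a_bar = mean(a)
    b_bar = mean(b)
    # Evaluate a_bar + b_bar * I_N at both ends of [I_L, I_U]
    # (this ordering of the end points assumes b_bar >= 0).
    interval = (a_bar + b_bar * i_lower, a_bar + b_bar * i_upper)
    return a_bar, b_bar, interval

# Hypothetical values, not the heart-disease data.
a = [70.0, 72.0, 74.0]
b = [2.0, 4.0, 6.0]
a_bar, b_bar, interval = neutrosophic_mean(a, b, 0.0, 0.5)
print(a_bar, b_bar, interval)
```

With $I_L = 0$, the lower end of the interval equals the ordinary average of the determined parts, which matches the remark that the neutrosophic number reduces to the classical one in that case.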
References [33] and [34] argued that the classical kurtosis estimator is very sensitive to outliers in the data. Reference [34] suggested an alternative kurtosis estimator to assess the shape of the data. We now extend the estimator of kurtosis of [34], say $K_{Nr2}$, to neutrosophic statistics. Suppose that $X_{N(i)} = a_{N(i)} + b_{N(i)} I_N$; $I_N \in [I_L, I_U]$ is an ordered neutrosophic number. Let $U_{N0.05}$ denote the mean of the upper 5% of $X_{N(i)}$, $L_{N0.05}$ the mean of the lower 5% of $X_{N(i)}$, and $U_{N0.5}$ and $L_{N0.5}$ the means of the upper and lower 50% of $X_{N(i)}$, respectively.
The values of $U_{N0.05} \in [U_{L0.05}, U_{U0.05}]$ can be computed by
The values of $L_{N0.05} \in [L_{L0.05}, L_{U0.05}]$ can be computed by
The values of $U_{N0.5} \in [U_{L0.5}, U_{U0.5}]$ can be computed by
The values of $L_{N0.5} \in [L_{L0.5}, L_{U0.5}]$ can be computed by
Using the above-mentioned equations, the estimator $K_{Nr2} \in [K_{Lr2}, K_{Ur2}]$ is defined as follows
According to [3] and [35], the data have a light-tailed distribution if the value of $K_{Nr2}$ is less than 2, a medium-tailed distribution such as the normal distribution if $K_{Nr2}$ is from 2 to 2.6, a heavy-tailed distribution such as the double-exponential distribution if $K_{Nr2}$ is from 2.6 to 3.2, and a heavy-tailed distribution such as the Cauchy distribution if $K_{Nr2}$ is larger than 3.2.
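The tail means and the decision rule described above can be sketched as follows. The ratio form $(U_{0.05} - L_{0.05}) / (U_{0.5} - L_{0.5})$ is an assumption based on the standard robust kurtosis ratio that uses these four quantities, since the defining equation is not reproduced here. The rule for the number of observations in each tail, the handling of the boundary values 2, 2.6 and 3.2, the separate treatment of the lower and upper readings, and all function names and data are likewise illustrative assumptions.

```python
def tail_mean(sorted_values, fraction, upper):
    """Mean of the upper (or lower) `fraction` of already sorted values."""
    # Tail size rule is an assumption: nearest integer, at least one value.
    k = max(1, round(fraction * len(sorted_values)))
    tail = sorted_values[-k:] if upper else sorted_values[:k]
    return sum(tail) / k

def kurtosis_r2(values):
    """Assumed ratio form: (U_0.05 - L_0.05) / (U_0.5 - L_0.5)."""
    x = sorted(values)
    num = tail_mean(x, 0.05, True) - tail_mean(x, 0.05, False)
    den = tail_mean(x, 0.5, True) - tail_mean(x, 0.5, False)
    return num / den

def classify_r2(k):
    """Cut-offs 2, 2.6 and 3.2 as quoted in the text; ties are an assumption."""
    if k < 2.0:
        return "light-tailed"
    if k <= 2.6:
        return "medium-tailed (normal type)"
    if k <= 3.2:
        return "heavy-tailed (double-exponential type)"
    return "heavy-tailed (Cauchy type)"

# Hypothetical interval data: one lower and one upper reading per patient.
lower = [float(v) for v in range(1, 21)]
upper = [v + 2.0 for v in lower]
# Assumption: the estimator is applied separately to the two bounds.
k_lower, k_upper = kurtosis_r2(lower), kurtosis_r2(upper)
print(k_lower, classify_r2(k_lower))
print(k_upper, classify_r2(k_upper))
```

For these evenly spaced values both bounds give a ratio of about 1.9, which falls in the light-tailed range, as one would expect for uniform-like data.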
Reference [3] provided another estimator of kurtosis, say $K_{Nr3}$, to study the nature of the data. We now present the modification of the estimator $K_{Nr3}$ under neutrosophic statistics. Let $U_{N0.20}$ denote the mean of the upper 20% of $X_{N(i)}$ and $L_{N0.20}$ the mean of the lower 20% of $X_{N(i)}$. Based on this information, the estimator of kurtosis $K_{Nr3} \in [K_{Lr3}, K_{Ur3}]$ is defined by
According to [36], the numerator and denominator of $K_{Nr3}$ are excellent estimates of the standard deviation (SD) of the normal distribution and the double exponential distribution, respectively. References [3] and [35] pointed out that the data have a light-tailed distribution if $K_{Nr3}$ is less than 1.81, a medium-tailed distribution if $K_{Nr3}$ is from 1.81 to 1.87, and a heavy-tailed distribution if $K_{Nr3}$ is larger than 1.87.
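A minimal sketch of the second kurtosis estimator and its decision rule is given below. It assumes the ratio form $(U_{0.20} - L_{0.20}) / (U_{0.5} - L_{0.5})$, which is the standard robust ratio built from these tail means, because the defining equation is not reproduced here. The tail-size rule, the treatment of values exactly equal to 1.81 or 1.87, and the function names and data are illustrative assumptions.

```python
def tail_mean(sorted_values, fraction, upper):
    """Mean of the upper (or lower) `fraction` of already sorted values."""
    k = max(1, round(fraction * len(sorted_values)))  # assumed tail size rule
    tail = sorted_values[-k:] if upper else sorted_values[:k]
    return sum(tail) / k

def kurtosis_r3(values):
    """Assumed ratio form: (U_0.20 - L_0.20) / (U_0.5 - L_0.5)."""
    x = sorted(values)
    num = tail_mean(x, 0.20, True) - tail_mean(x, 0.20, False)
    den = tail_mean(x, 0.5, True) - tail_mean(x, 0.5, False)
    return num / den

def classify_r3(k):
    """Cut-offs 1.81 and 1.87 as quoted in the text; ties are an assumption."""
    if k < 1.81:
        return "light-tailed"
    if k <= 1.87:
        return "medium-tailed"
    return "heavy-tailed"

# Hypothetical, evenly spaced readings (not the heart-disease data).
values = [float(v) for v in range(1, 21)]
k3 = kurtosis_r3(values)
print(k3, classify_r3(k3))
```

The same function can be applied to the lower and upper readings of interval data in turn to obtain the pair $[K_{Lr3}, K_{Ur3}]$, under the assumption that the two bounds are analyzed separately.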
The estimator of skewness is used to determine whether the distribution is positively (+vely) skewed, negatively (-vely) skewed, or symmetric. Reference [3] proposed this estimator, which is based on the trimmed values of the ordered data. Let $m_{N0.25}$ denote the mean of the ordered data after the upper and lower 25% of the values are trimmed. The estimator of skewness under neutrosophic statistics, say $S_{Nr2} \in [S_{Lr2}, S_{Ur2}]$, is defined by
As mentioned by [37] and [38], the distribution is said to be symmetrical if the value of $S_{Nr2}$ is from 0.5 to 2, -vely skewed if $S_{Nr2}$ is less than 0.5, and +vely skewed if $S_{Nr2}$ is larger than 2.
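The skewness rule can be illustrated with the short sketch below. It assumes the ratio form $(U_{0.05} - m_{0.25}) / (m_{0.25} - L_{0.05})$, the usual robust skewness ratio built from a 25% trimmed mean and the 5% tail means, since the defining equation is not reproduced here. The tail-size rule, the treatment of the boundary values 0.5 and 2, and the function names and data are illustrative assumptions.

```python
def tail_mean(sorted_values, fraction, upper):
    """Mean of the upper (or lower) `fraction` of already sorted values."""
    k = max(1, round(fraction * len(sorted_values)))  # assumed tail size rule
    tail = sorted_values[-k:] if upper else sorted_values[:k]
    return sum(tail) / k

def trimmed_mean(sorted_values, fraction):
    """Mean after dropping `fraction` of the values from each end."""
    k = round(fraction * len(sorted_values))
    middle = sorted_values[k:len(sorted_values) - k]
    return sum(middle) / len(middle)

def skewness_r2(values):
    """Assumed ratio form: (U_0.05 - m_0.25) / (m_0.25 - L_0.05)."""
    x = sorted(values)
    m = trimmed_mean(x, 0.25)
    return (tail_mean(x, 0.05, True) - m) / (m - tail_mean(x, 0.05, False))

def classify_skewness(s):
    """Cut-offs 0.5 and 2 as quoted in the text; ties are an assumption."""
    if s < 0.5:
        return "negatively skewed"
    if s <= 2.0:
        return "symmetric"
    return "positively skewed"

# Hypothetical samples: one evenly spaced, one with a long right tail.
even = [float(v) for v in range(1, 21)]
right_tailed = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 7, 9, 12, 20, 40, 100]
s_even = skewness_r2(even)
s_right = skewness_r2(right_tailed)
print(s_even, classify_skewness(s_even))
print(s_right, classify_skewness(s_right))
```

The evenly spaced sample gives a ratio of 1, in the middle of the symmetric range, while the sample with a long right tail gives a ratio far above 2.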
In this section, we discuss the application of the proposed tests using medical data about heart disease. The data consist of three variables, namely pulse rate (PR), systolic pressure (SP), and diastolic pressure (DP). Reference [39] provided some basic analysis of the data. The data of 60 patients are reported in Table 1, where the first 11 values are original data selected from [39] and the remaining observations were generated through simulation. The cardiologist is interested in the nature of the three variables associated with heart disease. From Table 1, it can be noted that the three variables are measured in intervals rather than as exact values. Therefore, the existing tests of kurtosis and skewness under classical statistics cannot be applied to study the nature of these variables. Instead, the proposed tests of kurtosis and skewness can be applied to see whether or not the data of the three variables follow the normal distribution.
The medical data
The necessary neutrosophic descriptive statistics of the three variables for the estimators $K_{Nr2}$, $K_{Nr3}$, and $S_{Nr2}$ are shown in Tables 2, 3, and 4, respectively. The values of the three estimators are shown in Table 5. From Table 5, it can be seen that, according to the estimator $K_{Nr2}$, PR follows the medium-tailed neutrosophic normal distribution. It is interesting to note that the determined part of the variable SP follows the medium-tailed neutrosophic normal distribution, whereas the indeterminate part follows a neutrosophic heavy-tailed distribution such as the neutrosophic double exponential distribution. The variable DP follows the medium-tailed neutrosophic normal distribution. The estimator $K_{Nr3}$ indicates that PR, SP, and DP follow light-tailed distributions. The estimator $S_{Nr2}$ indicates that the variables PR and SP have a symmetric distribution. For the variable DP, the determined part follows the -vely skewed neutrosophic distribution (a neutrosophic distribution with a long tail on the left side), and the indeterminate part follows the neutrosophic symmetric distribution. From this study, it can be concluded that the proposed tests can guide the cardiologist in decision-making.
Estimator of kurtosis $K_{Nr2}$
Estimator of kurtosis $K_{Nr3}$
Estimator of skewness $S_{Nr2}$
Nature of the distributions
In this section, we compare the proposed tests with the existing tests under classical statistics proposed by [34]. We show that the proposed tests are more efficient in terms of the measure of indeterminacy. The proposed tests reduce to the tests by [34] if no indeterminacy is recorded in the data. For a fair comparison, we consider the same data as in the last section. The neutrosophic forms of the three variables PR, SP, and DP are shown in Table 6. Note that the first part of each neutrosophic form is the determined part of the test, which is the same as in [34], and the second part is the indeterminate part. For example, for the variable DP, the neutrosophic form of $S_{Nr2}$ is $S_{Nr2} = 0.37 + 1.40 I_N$; $I_N \in [0, 0.74]$. According to neutrosophic theory, the proposed tests reduce to the tests under classical statistics when $I_L = 0$. Therefore, the value 0.37 is the value of the skewness test under classical statistics, and the second term $1.40 I_N$ is the indeterminate part. From Table 6, it can be seen that the proposed test gives information about the measure of indeterminacy, which the existing tests do not provide. For the above-mentioned neutrosophic form, the measure of indeterminacy is 0.74. From this study, it can be concluded that the proposed tests provide results in indeterminate intervals together with the measure of indeterminacy. The existing tests, on the other hand, give only exact values, which may mislead the cardiologist in deciding on the nature of the variables associated with heart disease. The proposed test is also a generalization of the tests based on fuzzy logic and interval-based analysis. The neutrosophic form $S_{Nr2} = 0.37 + 1.40 I_N$; $I_N \in [0, 0.74]$ is also better structured, since we know that 0.37 is the determined part and $1.40 I_N$ is the fluctuating part around 0.37.
In addition, $I_N$ need not be an interval; it can be any subset, which is not the case in fuzzy logic and interval-based analysis. Therefore, the proposed test using neutrosophic statistics is more efficient than fuzzy logic and interval-based analysis.
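The figures quoted for the variable DP are consistent with one common way of writing an interval result as a neutrosophic form, sketched below. The convention $I_U = (\text{upper} - \text{lower}) / \text{upper}$ and the reading of 0.37 and 1.40 as the lower and upper values of the skewness estimator are assumptions inferred from the quoted form. The function name is hypothetical.

```python
def neutrosophic_form(lower, upper):
    """Return (determined part, coefficient of I_N, I_U), assuming the
    convention I_U = (upper - lower) / upper."""
    i_upper = (upper - lower) / upper
    return lower, upper, i_upper

# Values quoted in the text for the variable DP.
det, coeff, i_u = neutrosophic_form(0.37, 1.40)

print(round(i_u, 2))                  # 0.74, the measure of indeterminacy
print(round(det + coeff * 0.0, 2))    # 0.37, the classical value at I_N = 0
print(round(det + coeff * i_u, 2))    # 1.4, the upper value at I_N = I_U
```

Under this convention, setting $I_N = 0$ recovers the classical value and setting $I_N = I_U$ recovers the upper end of the interval, so the form carries the interval together with its measure of indeterminacy.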
Neutrosophic form of three variables
To see the effect of the measure of indeterminacy on the three estimators under neutrosophic statistics, a simulation study was performed for the variable DP. To see the trends in the three estimators, various values of the upper value of indeterminacy $I_U$ are considered. The values of the estimators $K_{Nr2}$, $K_{Nr3}$ and $S_{Nr2}$ are shown in Table 7.
The effect of indeterminacy on estimators
From Table 7, it can be seen that the value of the estimator $K_{Nr2}$ increases from 0.37 to 1.77 as $I_U$ changes from 0 to 1. As the values of $K_{Nr2}$ are less than 2 for all values of $I_U$, the distribution is light-tailed throughout. From the analysis, it can be concluded that $I_U$ does not affect the nature of the distribution indicated by $K_{Nr2}$. The nature of the distribution based on the values of $K_{Nr3}$ changes with $I_U$. For $I_U \leq 0.2$, the values of $K_{Nr3}$ are larger than 1.87, which indicates that the distribution is heavy-tailed. For $I_U > 0.2$, on the other hand, the values of $K_{Nr3}$ are less than 1.81, which indicates that the distribution is light-tailed. The values of $S_{Nr2}$ are from 0.5 to 2 when $I_U \leq 0.74$, which indicates that the distribution is symmetrical. The data are -vely skewed for the other values of $I_U$. Based on the simulation study, it can be concluded that the decision based on $K_{Nr2}$ is not affected by $I_U$, whereas $I_U$ affects the decisions based on the estimators $K_{Nr3}$ and $S_{Nr2}$. The effects of $I_U$ on the other variables can be studied along the same lines.
In this paper, tests of skewness and kurtosis were introduced under neutrosophic statistics. The necessary measures and neutrosophic forms of these estimators were introduced. The application of the proposed tests was illustrated using data associated with heart disease. The real application shows that the proposed tests are a generalization of the existing tests. The proposed tests provide results in indeterminacy intervals, which are needed in the presence of uncertainty. The proposed tests can be applied in medical science, decision science, big data analysis, and social science. Extending the proposed tests to big data is left for future research.
Acknowledgments
The authors are deeply thankful to the editor and reviewers for their valuable suggestions to improve the quality of the paper.
Author contribution
M.A and M.A.B wrote the paper.
Funding statement
This work was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah. The authors, therefore, gratefully acknowledge the DSR technical and financial support.
