Tests for skewness parameter of skew log Laplace distribution

Abstract

A skew Laplace distribution is a Laplace probability density function with an additional shape parameter that regulates the degree of skewness. Various forms of the skew Laplace distribution appear in the literature, including those defined by Mc Gill (1962), Holla and Bhattacharya (1968), Lingappaiah (1988), and Fernandez and Steel (1998). The skew log Laplace distribution is the probability distribution of a random variable whose logarithm follows a skew Laplace distribution. In this paper, the classical optimum tests for the skewness parameter of the skew log Laplace distribution (SLLD) derived from the Lingappaiah (1988) distribution are discussed. The uniformly most powerful test, the uniformly most powerful unbiased test and Wald’s sequential probability ratio test for the skewness parameter are compared. The exact likelihood ratio test and the Neyman structure test for the skewness parameter are derived for the case where the scale parameter is known. Finally, the underreported income of a Road Transport Company is analysed on the basis of the tests derived in this paper.

Keywords

Skew log Laplace distribution, uniformly most powerful test, uniformly most powerful unbiased test, sequential probability ratio test, likelihood ratio test, Neyman structure test

1. Introduction

The log Laplace distribution has been used by researchers to study growth rates, stock prices, exchange rates, income, and so on. Hartley and Revankar (1974) used the skew log Laplace distribution to model underreported income; their model is described below.

Let $Y$ be the true income, which is assumed to follow the two-parameter Pareto distribution given below

$\displaystyle f_{Y}(y;\alpha,\delta)=\left\{\begin{array}[]{ll}{\displaystyle\frac{\alpha}{\delta}}\left({\displaystyle\frac{\delta}{y}}\right)^{\alpha+1}&\text{if }y\geqslant\delta,\ \delta>0,\ \alpha>0,\\ 0,&\text{otherwise.}\end{array}\right.$ (1)

Since income is underreported, the true income $Y$ is unobserved. The underreported income and the observational error are denoted by $X$ and $U$ respectively. Thus $U=Y-X$ takes only non-negative values, with $0\leqslant U\leqslant Y$ . The proportion of income that is not reported is denoted by $W=U/Y=1-X/Y$ , and $W$ is distributed independently of $Y$ (Krishnaji, 1970) with the probability density function (p.d.f.) shown below

$\displaystyle f_{W}(w;\beta)=\left\{\begin{array}[]{ll}\beta(1-w)^{\beta-1},&0<w<1,\ \beta>0,\\ 0,&\text{otherwise.}\end{array}\right.$ (2)

Using Eqs (1) and (2), it can be shown that the p.d.f. of the underreported income $X$ is

$\displaystyle f_{X}(x;\alpha,\beta,\delta)=\left\{\begin{array}[]{ll}\left({\displaystyle\frac{\alpha\beta}{\alpha+\beta}}\right)\left({\displaystyle\frac{1}{\delta}}\right)\left({\displaystyle\frac{x}{\delta}}\right)^{\beta-1}&\text{if }0<x\leqslant\delta,\\ \left({\displaystyle\frac{\alpha\beta}{\alpha+\beta}}\right)\left({\displaystyle\frac{1}{\delta}}\right)\left({\displaystyle\frac{\delta}{x}}\right)^{\alpha+1}&\text{if }\delta<x<\infty,\\ 0,&\text{otherwise}\end{array}\right.$ (3)

where $\delta>0,\alpha>0,\beta>0$ , $\delta$ is scale parameter and $\alpha$ , $\beta$ are shape parameters.
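The construction behind Eq. (3) can be checked by simulation. The sketch below draws $Y$ from Eq. (1) and $W$ from Eq. (2), sets $X=Y(1-W)$, and compares the simulated share of values with $X\leqslant\delta$ against $\alpha/(\alpha+\beta)$, which is what integrating the first branch of Eq. (3) over $(0,\delta]$ gives. The function name, the seed and the parameter values are illustrative assumptions and do not come from the paper.

```python
import random

def simulate_underreported(n, alpha, beta, delta, seed=0):
    """Draw n values of X = Y * (1 - W) by inverse-CDF sampling.

    Y follows the Pareto law of Eq. (1); W has the density of Eq. (2),
    so 1 - W has CDF t**beta on (0, 1).
    """
    rng = random.Random(seed)
    xs = []
    for _ in range(n):
        u1 = 1.0 - rng.random()            # uniform on (0, 1]
        u2 = 1.0 - rng.random()
        y = delta * u1 ** (-1.0 / alpha)   # true income, Pareto(alpha, delta)
        reported_share = u2 ** (1.0 / beta)  # 1 - W
        xs.append(y * reported_share)
    return xs

alpha, beta, delta = 3.0, 2.0, 1500.0      # illustrative values only
xs = simulate_underreported(200_000, alpha, beta, delta)
share_below = sum(x <= delta for x in xs) / len(xs)
expected = alpha / (alpha + beta)          # P(X <= delta) implied by Eq. (3)
print(share_below, expected)
```

The two printed numbers should agree up to Monte Carlo error, which is of order $10^{-3}$ for this number of draws.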

The p.d.f. in Eq. (3) is the skew log Laplace distribution (SLLD) with parameters $(\delta,\alpha,\beta)$ . It can easily be shown that $\log X$ follows the skew Laplace distribution given by Lingappaiah (1988). When $\alpha=\beta$ , the SLLD is scale symmetric, in the sense that $X/\delta$ and $\delta/X$ have the same p.d.f. Let $\alpha/\beta=k$ , where $k>0$ ; then the p.d.f. is

$\displaystyle f_{X}(x,k,\beta,\delta)=\left\{\begin{array}[]{ll}\left({\displaystyle\frac{\beta k}{k+1}}\right)\left({\displaystyle\frac{1}{\delta}}\right)\left({\displaystyle\frac{x}{\delta}}\right)^{\beta-1}&\text{if }0<x\leqslant\delta,\\ \left({\displaystyle\frac{\beta k}{k+1}}\right)\left({\displaystyle\frac{1}{\delta}}\right)\left({\displaystyle\frac{\delta}{x}}\right)^{\beta k+1}&\text{if }\delta<x<\infty,\\ 0,&\text{otherwise.}\end{array}\right.$ (4)

The p.d.f. given in Eq. (4) is SLLD $(\delta,k,\beta)$ , where $k$ is skewness parameter. Testing for scale symmetry is to test whether $k=1$ . The p.d.f. given in Eq. (4) can also be written as shown below

$\displaystyle f_{X}(x,k,\beta,\delta)=\left\{\begin{array}[]{ll}\left(\left({\displaystyle\frac{\beta k}{k+1}}\right)\left({\displaystyle\frac{1}{\delta}}\right)\left({\displaystyle\frac{x}{\delta}}\right)^{\beta-1}\right)^{I_{x}}\left(\left({\displaystyle\frac{\beta k}{k+1}}\right)\left({\displaystyle\frac{1}{\delta}}\right)\left({\displaystyle\frac{\delta}{x}}\right)^{\beta k+1}\right)^{(1-I_{x})}&\text{if }0<x<\infty,\\ 0,&\text{otherwise,}\end{array}\right.$ (5)

where $I_{x}=1$ if $0<x\leqslant\delta$ and $I_{x}=0$ if $\delta<x<\infty$ .
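For numerical work, the density in Eq. (4) can be coded directly. The sketch below also includes a distribution function obtained by integrating Eq. (4) piecewise; that closed form is not stated in the text and is an assumption of this example, as are the function names and parameter values. A midpoint rule is used only to confirm that the two functions agree.

```python
def slld_pdf(x, k, beta, delta):
    """Density of SLLD(delta, k, beta) as written in Eq. (4)."""
    if x <= 0:
        return 0.0
    const = beta * k / ((k + 1) * delta)
    if x <= delta:
        return const * (x / delta) ** (beta - 1)
    return const * (delta / x) ** (beta * k + 1)

def slld_cdf(x, k, beta, delta):
    """CDF from integrating Eq. (4) branch by branch (derived here, not quoted)."""
    if x <= 0:
        return 0.0
    if x <= delta:
        return k / (k + 1) * (x / delta) ** beta
    return 1.0 - (delta / x) ** (beta * k) / (k + 1)

def midpoint_integral(f, a, b, steps=20_000):
    """Plain midpoint rule, adequate for a sanity check."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

k, beta, delta = 1.5, 2.0, 1.0             # illustrative values only
mass_below = midpoint_integral(lambda x: slld_pdf(x, k, beta, delta), 0.0, delta)
mass_mid = midpoint_integral(lambda x: slld_pdf(x, k, beta, delta), delta, 5.0)
print(mass_below, slld_cdf(delta, k, beta, delta))
```

With $k=1$ the same functions illustrate scale symmetry: `slld_cdf(t, 1, beta, delta)` equals `1 - slld_cdf(delta**2 / t, 1, beta, delta)` for $0<t\leqslant\delta$.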

Kozubowski and Podgorski (2003) have studied the properties and characterization of the SLLD. In this paper, tests for the skewness parameter of the SLLD are derived when the scale parameter is known. In Sections 2 and 3, the uniformly most powerful (UMP) test and the sequential probability ratio test (SPRT) for the skewness parameter are derived, respectively, when $\delta$ and $\beta$ are known. In Section 4, the uniformly most powerful unbiased (UMPU) test for the skewness parameter against a two sided alternative is discussed, again when $\delta$ and $\beta$ are known. The more general exact likelihood ratio test (LRT) and the Neyman structure test for the skewness parameter, when $\delta$ is known and $\beta$ is a nuisance parameter, are derived in Sections 5 and 6 respectively. A test for the skewness parameter based only on the number of observations less than or equal to the scale parameter (Binomial LRT) is also discussed. The critical values and the values of the power functions of the UMP, UMPU, LR and Neyman structure tests are tabulated in the Appendix, at level of significance $\alpha^{*}=$ 0.05, for different sample sizes. Power and average sample number (ASN) values for the SPRT are also presented in the Appendix. Section 7 demonstrates the application of these tests to the underreported income of bus drivers of a Road Transport Company.

2. Uniformly most powerful (UMP) test

Suppose $\textbf{X}=(X_{1},X_{2},\ldots,X_{n})$ is a random sample from SLLD $(\delta,k,\beta)$ given in Eq. (5). Assuming $\delta$ and $\beta$ are known, the joint probability density function is

$\displaystyle f_{\textbf{X}}(\textbf{x},k)=\left[{\displaystyle\frac{\beta k}{(k+1)}}\right]^{n}\left({\displaystyle\frac{1}{\delta}}\right)^{n}e^{-[(\beta-1)s_{1}+(\beta k+1)s_{2}]},$ (6)

where

$\displaystyle s_{1}=-\sum_{i=1}^{n}I_{x_{i}}\log\left({\displaystyle\frac{x_{i}}{\delta}}\right),\quad s_{2}=-\sum_{i=1}^{n}(1-I_{x_{i}})\log\left({\displaystyle\frac{\delta}{x_{i}}}\right).$ (7)

Let $R=\sum\limits_{i=1}^{n}I_{x_{i}}$ be the number of observations less than or equal to $\delta$ in a sample of size $n$ .

It can be proved that

$\displaystyle R\sim\textit{Bin}(n,p),\text{ where }p={\displaystyle\frac{k}{(k+1)}}\text{ and }q=1-p,$

$\displaystyle S_{1}\mid r\sim G(r,\beta)\text{ if }r=1,2,3,\ldots,n.\text{ If }r=0,\text{ then }s_{1}=0\text{ and }P(S_{1}=0)=q^{n}.$ (8)

$\displaystyle S_{2}\mid r\sim G(n-r,\beta k)\text{ if }r=0,1,2,\ldots,(n-1).\text{ If }r=n,\text{ then }s_{2}=0\text{ and }P(S_{2}=0)=p^{n}.$ (9)

$\displaystyle\text{Given }R=r,\ S_{1}\text{ and }S_{2}\text{ are independent},$ (10)

where $\textit{Bin}(a,b)$ represents the binomial distribution with $a$ trials and probability of success $b$ , and $G(a,b)$ represents the Gamma distribution with shape parameter $a$ and rate parameter $b$ (so that its mean is $a/b$ , in agreement with the exponent in Eq. (6)).
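The statistics $R$, $s_{1}$ and $s_{2}$ are simple to compute from a sample. The following sketch implements Eq. (7) and the definition of $R$ as written; the function name and the toy sample are illustrative assumptions.

```python
import math

def slld_statistics(xs, delta):
    """Return (r, s1, s2): R and the two sums of Eq. (7)."""
    r = 0
    s1 = 0.0
    s2 = 0.0
    for x in xs:
        if x <= delta:                     # I_x = 1
            r += 1
            s1 += -math.log(x / delta)
        else:                              # I_x = 0
            s2 += -math.log(delta / x)
    return r, s1, s2

# Toy sample with delta = 1: two observations at or below delta, one above.
r, s1, s2 = slld_statistics([math.exp(-1.0), 1.0, math.exp(2.0)], delta=1.0)
print(r, s1, s2)
```

Both sums are non-negative by construction, since each term is the absolute log-distance of an observation from $\delta$.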

The following theorem gives UMP test for skewness parameter $k$ .

Theorem 2.1.

The UMP test for testing $H_{0}:k=k_{0}$ against $H_{1}:k=k_{1}$ when $k_{1}>k_{0}$ of size $\alpha^{*}$ is given by

$\displaystyle\phi(\textbf{x})=\left\{\begin{array}[]{ll}1,&s_{2}<c,\\ \gamma,&s_{2}=c,\\ 0,&s_{2}>c,\\ \end{array}\right.$ (11)

where $s_{2}$ is as defined in Eq. (7), $c$ and $\gamma$ are such that $E_{k_{0}}[\phi(\textbf{x})]=\alpha^{*}$ .

The proof of the theorem is deferred to the Appendix. The UMP test of size $\alpha^{*}$ for testing $H_{0}:k=k_{0}$ against $H_{1}:k=k_{1}(k_{1}<k_{0})$ is described in the Appendix (Remark 2.1). The UMP test for symmetry against a one sided alternative hypothesis is obtained by putting $k_{0}=1$ in the tests described above. Critical values $c$ for level of significance $\alpha^{*}=0.05$ and the power of the test for different values of $k$ are calculated for sample sizes 10(10)50, under $H_{1}:k>1$ and $H_{1}:k<1$ , for $\beta=2$ (chosen arbitrarily). These values are presented in Tables 2 and 3 respectively, in the Appendix. Note that the cutoff points and the power of the test do not depend on the value of $\delta$ .
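A cutoff $c$ for the test in Eq. (11) can be computed numerically from Eqs (8)-(10). The sketch below is one possible implementation, not the authors' program: it assumes the second argument of $G(a,b)$ acts as a rate (which is what Eq. (6) implies), treats the null law of $S_{2}$ as a point mass $p_{0}^{n}$ at zero plus a binomial mixture of Gamma laws, and finds $c$ by bisection. It requires $p_{0}^{n}<\alpha^{*}$; since $S_{2}$ is then continuous at $c$, the randomization constant $\gamma$ plays no role. All function names are hypothetical, and the output has not been compared with Tables 2 and 3.

```python
import math

def erlang_cdf(x, shape, rate):
    """CDF of a Gamma law with integer shape and the given rate."""
    if x <= 0.0:
        return 0.0
    t = rate * x
    term = 1.0
    total = 1.0
    for j in range(1, shape):
        term *= t / j
        total += term
    return 1.0 - math.exp(-t) * total

def prob_s2_below(c, n, k, beta):
    """P_k(S_2 < c) for c > 0, mixing Eq. (9) over R ~ Bin(n, k/(k+1))."""
    p = k / (k + 1.0)
    prob = p ** n                          # R = n gives S_2 = 0 < c
    for r in range(n):
        weight = math.comb(n, r) * p ** r * (1.0 - p) ** (n - r)
        prob += weight * erlang_cdf(c, n - r, beta * k)
    return prob

def ump_cutoff(n, k0, beta, size=0.05):
    """Bisection for c with P_{k0}(S_2 < c) = size (needs p0**n < size)."""
    lo, hi = 0.0, 1.0
    while prob_s2_below(hi, n, k0, beta) < size:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if prob_s2_below(mid, n, k0, beta) < size:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n, beta, k0 = 10, 2.0, 1.0
c = ump_cutoff(n, k0, beta)
power_at_2 = prob_s2_below(c, n, 2.0, beta)   # power against k = 2
print(c, power_at_2)
```

Because $\delta$ enters only through the ratios $x_{i}/\delta$ inside $s_{2}$, it does not appear in any of these functions.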

Sequential tests are preferred over fixed sample size tests when sampling is costly. In the following section, the sequential probability ratio test for the skewness parameter is discussed.

3. Sequential probability ratio test (SPRT)

The sequential probability ratio test based on independent and identically distributed random variables from SLLD $(\delta,k,\beta)$ , for testing $H_{0}:k=k_{0}$ against $H_{1}:k=k_{1}(k_{1}>k_{0})$ when $\delta$ and $\beta$ are known, with strength $(\alpha^{*},\beta^{*})$ , is easily derived and is shown below.

SPRT at $n^{\text{th}}$ stage is as follows:

1. Reject $H_{0}$ if $D_{n}\geqslant\log A$ ,
2. Accept $H_{0}$ if $D_{n}\leqslant\log B$ ,
3. Continue sampling by taking one more observation if $\log B<D_{n}<\log A$ ,

where $z_{i}=\log\left({\displaystyle\frac{f_{1}{(x_{i})}}{f_{0}{(x_{i})}}}\right)$ , $D_{n}=\sum_{i=1}^{n}z_{i}=n\log\left[{\displaystyle\frac{k_{1}(k_{0}+1)}{k_{0}(k_{1}+1)}}\right]+\beta(k_{1}-k_{0})\sum_{i=1}^{n}(1-I_{x_{i}})\log\left({\displaystyle\frac{\delta}{x_{i}}}\right)$ , $A={\displaystyle\frac{(1-\beta^{*})}{\alpha^{*}}}$ and $B={\displaystyle\frac{\beta^{*}}{(1-\alpha^{*})}}$ .

On simplification, the SPRT at the $n^{\text{th}}$ stage is:

1. Reject $H_{0}$ if $s_{2}\leqslant{\displaystyle\frac{-\log A+na}{b}}$ ,
2. Accept $H_{0}$ if $s_{2}\geqslant{\displaystyle\frac{-\log B+na}{b}}$ ,
3. Continue sampling by taking one more observation if ${\displaystyle\frac{-\log A+na}{b}}<s_{2}<{\displaystyle\frac{-\log B+na}{b}}$ ,

where $a=\log\left[{\displaystyle\frac{k_{1}(k_{0}+1)}{k_{0}(k_{1}+1)}}\right]$ , $b=\beta(k_{1}-k_{0})$ and $s_{2}$ is as defined in Eq. (7).

It can easily be seen that the SPRT for testing $H_{0}:k=k_{0}$ against $H_{1}:k=k_{1}(k_{1}<k_{0})$ , with strength $(\alpha^{*},\beta^{*})$ , when $\delta$ and $\beta$ are known, is as follows. Here $b<0$ , so the inequalities in $s_{2}$ are reversed.

At the $n^{\text{th}}$ stage,

1. Reject $H_{0}$ if $s_{2}\geqslant{\displaystyle\frac{-\log A+na}{b}}$ ,
2. Accept $H_{0}$ if $s_{2}\leqslant{\displaystyle\frac{-\log B+na}{b}}$ ,
3. Continue sampling by taking one more observation if ${\displaystyle\frac{-\log B+na}{b}}<s_{2}<{\displaystyle\frac{-\log A+na}{b}}$ .
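The stopping rule can be written as a short routine. The sketch below works with $D_{n}$ itself rather than with the simplified bounds on $s_{2}$, so the same code covers both $k_{1}>k_{0}$ and $k_{1}<k_{0}$ (the sign of $b$ takes care of the reversed inequalities). The function name, the returned labels and the example inputs are illustrative assumptions.

```python
import math

def sprt_skewness(xs, delta, beta, k0, k1, alpha_star=0.05, beta_star=0.05):
    """Run the SPRT on observations xs in order; return (decision, n_used)."""
    log_a = math.log((1.0 - beta_star) / alpha_star)   # upper boundary, log A
    log_b = math.log(beta_star / (1.0 - alpha_star))   # lower boundary, log B
    a = math.log(k1 * (k0 + 1.0) / (k0 * (k1 + 1.0)))
    b = beta * (k1 - k0)
    d_n = 0.0
    for n, x in enumerate(xs, start=1):
        # z_i = a + b * (1 - I_x) * log(delta / x); the last term vanishes for x <= delta
        d_n += a
        if x > delta:
            d_n += b * math.log(delta / x)
        if d_n >= log_a:
            return "reject H0", n
        if d_n <= log_b:
            return "accept H0", n
    return "continue sampling", len(xs)

# k0 = 1 against k1 = 3 with beta = 2, delta = 1 (illustrative values)
print(sprt_skewness([0.5] * 20, 1.0, 2.0, 1.0, 3.0))
print(sprt_skewness([math.e], 1.0, 2.0, 1.0, 3.0))
```

In the first call every observation lies below $\delta$, so each step adds $a=\log 1.5$ to $D_{n}$ and the upper boundary $\log 19$ is crossed at the eighth observation. In the second call a single observation well above $\delta$ is enough to accept $H_{0}$.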

The properties of the SPRT are derived in the Appendix. The SPRT for symmetry can be obtained by substituting $k_{0}=1$ . The power and ASN values for strength $\alpha^{*}=\beta^{*}=0.05$ are computed for $\beta=2$ (chosen arbitrarily) and are presented in Tables 4 and 5 respectively for different values of $k$ .

The UMP tests discussed in Section 2 for the one sided alternative hypotheses have rejection regions in opposite tails of $s_{2}$ , which shows that a UMP test for testing $H_{0}:k=k_{0}$ against $H_{1}:k\neq k_{0}$ does not exist. Therefore, in the next section, the uniformly most powerful unbiased test for the skewness parameter $k$ against the two sided alternative hypothesis is derived, as described in Lehmann (1986).

4. Uniformly most powerful unbiased (UMPU) test

Suppose $\textbf{X}=(X_{1},X_{2},\ldots,X_{n})$ is a random sample from SLLD $(\delta,k,\beta)$ given in Eq. (5). $s_{2}$ is as defined in Eq. (7). In the following theorem, UMPU test for skewness parameter $k$ is derived.

Theorem 4.1.

The uniformly most powerful unbiased test for testing $H_{0}:k=k_{0}$ against $H_{1}:k\neq k_{0}$ of size $\alpha^{*}$ , when $\delta$ and $\beta$ are known, is given by

$\displaystyle\phi(\textbf{x})=\left\{\begin{array}[]{ll}1,&s_{2}<d_{1}\text{ or }s_{2}>d_{2},\\ \nu_{1},&s_{2}=d_{1},\\ \nu_{2},&s_{2}=d_{2},\\ 0,&d_{1}<s_{2}<d_{2},\\ \end{array}\right.$ (12)

where $d_{1},d_{2},\nu_{1}$ and $\nu_{2}$ are such that

$\displaystyle E_{k_{0}}[\phi(\textbf{X})]=\alpha^{*}\text{ and }E_{k_{0}}[\phi(\textbf{X})S_{2}]=\alpha^{*}E_{k_{0}}(S_{2}).$ (13)

The proof of the theorem is deferred to the Appendix. The UMPU test for symmetry, that is, for testing $H_{0}:k=1$ against $H_{1}:k\neq 1$ , is obtained by substituting $k_{0}=1$ . The critical values $d_{1}$ and $d_{2}$ for level of significance $\alpha^{*}=$ 0.05 and the power of the test for different values of $k$ are calculated for sample sizes 10(10)50, for $\beta=2$ (chosen arbitrarily). These values are presented in Table 6 in the Appendix.
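The two conditions in Eq. (13) can be solved numerically for $d_{1}$ and $d_{2}$. The sketch below is a hedged illustration, not the authors' program. It assumes that $G(a,b)$ in Eqs (8)-(9) is parameterized by a rate, that $p_{0}^{n}<\alpha^{*}$ so that $d_{1}>0$, and that randomization at $d_{1}$ and $d_{2}$ can be ignored because $S_{2}$ is continuous away from zero. It uses the identity $\int s\,g(s;m,\lambda)\,ds=(m/\lambda)\int g(s;m+1,\lambda)\,ds$ for Gamma densities and a nested bisection, which is a design choice of this example. All names are hypothetical and the output has not been compared with Table 6.

```python
import math

def erlang_cdf(x, shape, rate):
    """CDF of a Gamma law with integer shape and the given rate."""
    if x <= 0.0:
        return 0.0
    t = rate * x
    term = total = 1.0
    for j in range(1, shape):
        term *= t / j
        total += term
    return 1.0 - math.exp(-t) * total

def umpu_cutoffs(n, k0, beta, size=0.05):
    """Return (d1, d2, E[phi], E[phi*S2], E[S2]) under H0, per Eq. (13)."""
    p = k0 / (k0 + 1.0)
    rate = beta * k0
    # (binomial weight, Gamma shape n - r) for r < n; R = n puts mass p**n at S_2 = 0
    parts = [(math.comb(n, r) * p ** r * (1.0 - p) ** (n - r), n - r) for r in range(n)]
    mean_s2 = sum(w * m / rate for w, m in parts)
    upper = (n + 20.0 * math.sqrt(n) + 40.0) / rate    # stands in for +infinity

    def mass(d1, d2):                                  # P(d1 < S_2 < d2)
        return sum(w * (erlang_cdf(d2, m, rate) - erlang_cdf(d1, m, rate))
                   for w, m in parts)

    def partial_mean(d1, d2):                          # E[S_2 ; d1 < S_2 < d2]
        return sum(w * m / rate * (erlang_cdf(d2, m + 1, rate) - erlang_cdf(d1, m + 1, rate))
                   for w, m in parts)

    def bisect(f, lo, hi):                             # assumes f(lo) < 0 <= f(hi)
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if f(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    def d2_given(d1):                                  # acceptance mass equals 1 - size
        return bisect(lambda d2: mass(d1, d2) - (1.0 - size), d1, upper)

    d1_max = bisect(lambda d: (1.0 - size) - mass(d, upper), 0.0, upper)
    d1 = bisect(lambda d: partial_mean(d, d2_given(d)) - (1.0 - size) * mean_s2,
                0.0, d1_max)
    d2 = d2_given(d1)
    return d1, d2, 1.0 - mass(d1, d2), mean_s2 - partial_mean(d1, d2), mean_s2

d1, d2, size_check, lhs, mean_s2 = umpu_cutoffs(10, 1.0, 2.0)
print(d1, d2, size_check, lhs, 0.05 * mean_s2)
```

The last three printed values let the reader confirm both parts of Eq. (13): the size equals $\alpha^{*}$ and $E_{k_{0}}[\phi S_{2}]$ equals $\alpha^{*}E_{k_{0}}(S_{2})$ up to numerical tolerance.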

The tests for the skewness parameter described in Sections 2, 3 and 4 assume $\beta$ to be known. In the following section, the likelihood ratio test for the skewness parameter is discussed for the case where $\beta$ is unknown.

5. Exact likelihood ratio test (LRT)

Let $\textbf{X}=(X_{1},X_{2},\ldots,X_{n})$ denote $n$ independent and identical SLLD $(\delta,k,\beta)$ random variables, where $\delta$ is known and $\beta$ is unknown. The exact likelihood ratio test for skewness parameter $k$ is derived in the following theorem.

Theorem 5.1.

The LRT based on $n$ observations of size $\alpha^{*}$ , for testing $H_{0}:k=k_{0}$ against $H_{1}:k\neq k_{0}$ , where $\delta$ is known and $\beta$ is unknown is given by

$\displaystyle\phi(\textbf{x})=\left\{\begin{array}[]{ll}1,&v<c_{1}\text{ or }v>c_{2},\\ \gamma_{1},&v=c_{1},\\ \gamma_{2},&v=c_{2},\\ 0,&\text{otherwise},\end{array}\right.$ (14)

where $v=s_{1}/(s_{1}+k_{0}s_{2})$ and $s_{1},s_{2}$ are as defined in Eq. (7). $c_{1},c_{2},\gamma_{1}$ and $\gamma_{2}$ are obtained such that $E_{k_{0}}[\phi(\textbf{X})]=\alpha^{*}$ .

The proof of the theorem and relevant remarks are deferred to the Appendix. The test for symmetry is obtained by putting $k_{0}=1$ . The critical values $c_{1},c_{2}$ and the power of the LRT for symmetry are obtained for sample sizes 10(10)50 and $n=29$ at $\alpha^{*}=0.05$ , for different values of $k$ , and are shown in Table 7 in the Appendix. The two sided Binomial LRT is described in the Appendix (Remark 5.4). Its critical values and power are obtained for sample sizes 10(10)50 and $n=29$ at $\alpha^{*}=0.05$ and are shown in Table 8 in the Appendix. From Tables 7 and 8, it is clear that the power of the exact LRT is significantly greater than the power of the Binomial LRT, as expected, but the Binomial LRT is computationally simpler. A one sided LRT for symmetry is also derived in the Appendix (Remark 5.5).
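The null distribution of $v$ does not involve $\beta$, which is what makes the LRT exact. Under $H_{0}$, Eqs (8)-(10) give $S_{1}\mid r\sim G(r,\beta)$ and $k_{0}S_{2}\mid r\sim G(n-r,\beta)$ independently, so by the standard Gamma ratio result $V\mid r$ is Beta$(r,n-r)$ for $0<r<n$, with $V=0$ when $r=0$ and $V=1$ when $r=n$. The sketch below uses this to compute a cutoff for the test of symmetry. It assumes equal-likelihood-ratio endpoints of the form $c_{2}=1-c_{1}$ when $k_{0}=1$ and ignores randomization; both are assumptions of this example, and the Appendix may fix the cutoffs differently. Function names are hypothetical.

```python
import math

def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) for positive integers a, b (binomial-tail identity)."""
    m = a + b - 1
    return sum(math.comb(m, j) * x ** j * (1.0 - x) ** (m - j)
               for j in range(a, m + 1))

def prob_v_below(c, n, k0):
    """P_{k0}(V < c) for 0 < c < 1 under the Binomial-Beta mixture."""
    p = k0 / (k0 + 1.0)
    prob = (1.0 - p) ** n                  # R = 0 gives V = 0
    for r in range(1, n):
        weight = math.comb(n, r) * p ** r * (1.0 - p) ** (n - r)
        prob += weight * beta_cdf_int(c, r, n - r)
    return prob

def two_sided_size(c1, n):
    """Size of 'reject if V < c1 or V > 1 - c1' when k0 = 1."""
    return prob_v_below(c1, n, 1.0) + (1.0 - prob_v_below(1.0 - c1, n, 1.0))

def symmetric_cutoff(n, size=0.05):
    """Bisection for c1 in (0, 0.5) giving the requested size."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if two_sided_size(mid, n) < size:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c1 = symmetric_cutoff(29)
print(c1, 1.0 - c1, two_sided_size(c1, 29))
```

The value obtained for $n=29$ can be set beside the cutoff quoted from Table 7 in Section 7.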

The Neyman structure test is an alternative to the LRT when a nuisance parameter is present. In the next section it is derived for the skewness parameter $k$ in the presence of the nuisance parameter $\beta$ .

6. Neyman structure test

Let $\textbf{X}=(X_{1},X_{2},\ldots,X_{n})$ denote $n$ independent and identical SLLD $(\delta,k,\beta)$ random variables, where $\delta$ is known and $\beta$ is unknown. Neyman structure test for skewness parameter $k$ is derived in the following theorem.

Theorem 6.1.

The Neyman structure test for testing $H_{0}:k=k_{0}$ against $H_{1}:k>k_{0}$ based on the random sample of size $n$ drawn from SLLD $(\delta,k,\beta)$ , where $\delta$ is known and $\beta$ is unknown is given by

$\displaystyle\phi(\textbf{x})=\left\{\begin{array}[]{ll}1&\text{if }v>c,\\ \nu&\text{if }v=c,\\ 0,&\text{if }v<c,\end{array}\right.$ (15)

where $w=s_{1}+k_{0}s_{2}$ , $v=s_{1}/(s_{1}+k_{0}s_{2})=s_{1}/w$ , and $s_{1}$ , $s_{2}$ are as defined in Eq. (7). The constants $c$ and $\nu$ are obtained such that $E_{k_{0}}[\phi(\textbf{X})]=\alpha^{*}$ .

The proof of the theorem is deferred to the Appendix. The Neyman structure test and the LRT for the skewness parameter are identical for a one sided alternative hypothesis.
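The key property behind the test in Eq. (15) is that the null distribution of $v$ is free of the nuisance parameter $\beta$. A small simulation makes this visible: the sketch below draws samples from Eq. (4) with $k=k_{0}$ for two very different values of $\beta$ and estimates $P(V>c)$ in each case. The sampler, the seeds, the sample size and the threshold $c=0.7$ are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def draw_slld(rng, k, beta, delta):
    """One draw from Eq. (4) by inversion within each branch."""
    u = 1.0 - rng.random()                     # uniform on (0, 1]
    if rng.random() < k / (k + 1.0):           # X <= delta with probability k/(k+1)
        return delta * u ** (1.0 / beta)
    return delta * u ** (-1.0 / (beta * k))

def v_statistic(xs, delta, k0):
    """v = s1 / (s1 + k0 * s2) with s1, s2 from Eq. (7)."""
    s1 = sum(-math.log(x / delta) for x in xs if x <= delta)
    s2 = sum(-math.log(delta / x) for x in xs if x > delta)
    w = s1 + k0 * s2
    return s1 / w if w > 0.0 else 0.0

def tail_prob(beta, c, seed, n=10, k0=1.0, delta=1.0, reps=20_000):
    """Monte Carlo estimate of P(V > c) when k = k0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [draw_slld(rng, k0, beta, delta) for _ in range(n)]
        hits += v_statistic(xs, delta, k0) > c
    return hits / reps

p_small_beta = tail_prob(beta=0.5, c=0.7, seed=1)
p_large_beta = tail_prob(beta=5.0, c=0.7, seed=2)
print(p_small_beta, p_large_beta)
```

The two estimates should differ only by Monte Carlo error, so a cutoff $c$ calibrated at any one value of $\beta$ gives the same size for all $\beta$.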

In the following section, the LRT and the Binomial LRT for symmetry are applied to real data.

7. Application

The Road Transport Company is running buses on different routes. The bus halts at several stops in between the starting point and the destination. The fare paid by each passenger will be different, depending on distance traveled as they can get in and get out of the bus at any stop. The number of passengers traveling per trip is not fixed. The fare collected from the passengers by the bus drivers (income) per trip on any route depends on the number of passengers traveling and distance traveled by the passengers. Table 1 shows income (in Rs.) reported per trip, by bus drivers, on one of the routes, in the month of February, for 29 trips, between the two fixed locations.

Table 1
Income in Rs. reported per trip by bus drivers during 29 trips in February

3,160	2,152	1,798	1,871	2,535	1,753	4,190	2,181	2,423	3,094
3,228	2,886	1,237	1,524	1,889	1,629	3,954	2,967	745	1,896
1,595	2,025	1,373	2,205	2,184	2,208	1,379	1,625	1,446

The officer of the Road Transport Company knows that bus drivers are underreporting incomes. He knows that the bus drivers should generate an income of at least Rs. 1,500 per trip. Therefore the lower limit on true income, $\delta$ , is Rs. 1,500. The sample size is $n=29$ and the number of income values less than Rs. 1,500 in the sample is $R=5$ . Using Eq. (7), the values of the statistics are $S_{1}=3.0709$ , $S_{2}=5.3007$ . The MLEs of the parameters are obtained using Eqs (35) and (36) as $\hat{k}=0.7611$ , $\hat{\beta}=4.0813$ .
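Eqs (35) and (36) are given in the Appendix. As a cross-check, maximizing the logarithm of Eq. (6) jointly over $k$ and $\beta$ gives the closed forms $\hat{k}=\sqrt{s_{1}/s_{2}}$ and $\hat{\beta}=n/(s_{1}+\hat{k}s_{2})$. The sketch below assumes that these coincide with Eqs (35) and (36); it takes the quoted values of $S_{1}$ and $S_{2}$ as inputs, and the function name is hypothetical.

```python
import math

def slld_mle(n, s1, s2):
    """Closed-form maximizers of the log of Eq. (6) over (k, beta), delta known."""
    k_hat = math.sqrt(s1 / s2)
    beta_hat = n / (s1 + k_hat * s2)
    return k_hat, beta_hat

# Inputs are the statistics quoted in the text for the bus fare data.
k_hat, beta_hat = slld_mle(29, 3.0709, 5.3007)
print(round(k_hat, 4), round(beta_hat, 4))
```

With these inputs the closed forms agree with the quoted estimates $\hat{k}=0.7611$ and $\hat{\beta}=4.0813$ to within rounding.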

The LRT described in Section 5 is applied to test $H_{0}:k=1$ versus $H_{1}:k\neq 1$ . According to Theorem 5.1, $V=0.0991$ , which lies between 0 and the cutoff point $c_{1}=0.25$ (given in Table 7). Hence $H_{0}$ is rejected, and it can be concluded that the data support the claim that the distribution of underreported income is not scale symmetric.
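The test statistic can be recomputed from Table 1. The sketch below evaluates $R$ and $v=s_{1}/(s_{1}+k_{0}s_{2})$ with $k_{0}=1$ and $\delta=1500$, using natural logarithms in Eq. (7) and the table values as printed. Variable names are illustrative. The intermediate sums are computed from the printed table and are not checked here against the values of $S_{1}$ and $S_{2}$ quoted earlier in this section.

```python
import math

# Table 1: income in Rs. reported per trip, 29 trips
incomes = [
    3160, 2152, 1798, 1871, 2535, 1753, 4190, 2181, 2423, 3094,
    3228, 2886, 1237, 1524, 1889, 1629, 3954, 2967, 745, 1896,
    1595, 2025, 1373, 2205, 2184, 2208, 1379, 1625, 1446,
]
delta, k0 = 1500.0, 1.0

r = sum(x <= delta for x in incomes)                              # R
s1 = sum(-math.log(x / delta) for x in incomes if x <= delta)     # Eq. (7)
s2 = sum(-math.log(delta / x) for x in incomes if x > delta)      # Eq. (7)
v = s1 / (s1 + k0 * s2)
print(r, round(v, 4))
```

The resulting $R$ and $v$ can be compared directly with the cutoffs quoted from Tables 7 and 8.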

Using the Binomial LRT, the number of income values less than Rs. 1,500 in the sample is $R=5$ , which is less than the cutoff point $c^{\prime}_{1}=9$ (given in Table 8), leading to the same conclusion.

The comparison of fitting symmetric log Laplace distribution and skew log Laplace distribution (SLLD) to income data is carried out using Akaike’s information criterion (AIC). For any model $i$ , with the likelihood function $L_{i}$ its AIC is defined as

$\displaystyle(\textit{AIC})_{i}=-2\log L_{i}(\underline{\hat{\lambda}}\mid\underline{x})+2k_{i}+{\displaystyle\frac{2k_{i}(k_{i}+1)}{(n-k_{i}+1)}},$ (16)

where $k_{i}$ denotes the number of parameters of the $i^{\text{th}}$ model estimated from the given data. The AIC values for the symmetric log Laplace distribution and the skew log Laplace distribution are 456.9047 and 436.9354 respectively. The skew log Laplace distribution (SLLD) has the lower AIC and is therefore the more appropriate of the two models for describing underreported incomes.

Assuming that the underreported incomes of bus drivers follow SLLD $(\delta,k,\beta)$ , it can be concluded that the percentage of trips with underreported income below Rs. 1,500 is 43.22 ( $P(X<\delta=1500)=p$ and $\hat{p}=0.4322$ ). Hartley and Revankar (1974) defined the index of dishonesty as $E(X)/E(Y)$ . Using the distributions of $X$ and $Y$ given in Eqs (3) and (1) respectively, the index of dishonesty is ${\beta}/(\beta+1)$ . The estimate of the index of dishonesty is ${\hat{\beta}}/(\hat{\beta}+1)=0.8032$ , indicating that 80.32 percent of income is reported.
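Both summary figures follow from the estimates by simple arithmetic. Since $X=Y(1-W)$ with $W$ independent of $Y$, the index $E(X)/E(Y)$ equals $E(1-W)=\beta/(\beta+1)$ under Eq. (2), and $p=k/(k+1)$ is the probability of an observation at or below $\delta$ under Eq. (4). The short sketch below plugs in the quoted estimates; the variable names are illustrative.

```python
k_hat, beta_hat = 0.7611, 4.0813           # estimates quoted in the text

p_hat = k_hat / (k_hat + 1.0)              # estimated P(X <= delta)
index_of_dishonesty = beta_hat / (beta_hat + 1.0)   # estimated E(X)/E(Y)

print(round(100 * p_hat, 2), round(100 * index_of_dishonesty, 2))
```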

8. Concluding remarks

All the tests derived in this paper are classical optimum exact tests, whose cutoff points and power can be obtained using a simple computer program. The UMP test, the UMPU test and the LRT/Neyman structure test are randomized tests. All tests are applicable for any sample size. The analysis of underreported income based on the LRT/Neyman structure test indicates that the skew log Laplace distribution describes the observed underreported income better than the symmetric log Laplace distribution. Using the skew log Laplace distribution, it is concluded that 80.32 percent of income is reported.

Acknowledgments

The authors sincerely thank the members of the Editorial Board and the reviewer for their helpful comments and suggestions.

Appendix

References

Fernandez, & Steel, F.J. (1998). On Bayesian modeling of fat tails and skewness. J Am Stat Assoc, 93, 359-371.

Hartley, M.J., & Revankar, N.S. (1974). On the estimation of the Pareto Law from under-reported data. Journal of Econometrics, 2, 327-341.

Holla, M.S., & Bhattacharya, S.K. (1968). On a compound Gaussian distribution. Ann Inst Stat Math, 20, 331-336.

Kozubowski, T.J., & Podgorski (2003). Log-Laplace distributions. International Mathematical Journal, 3, 467-495.

Krishnaji (1970). Characterization of the Pareto distribution through a model of underreported incomes. Econometrica, 38, 251-255.

Lehmann, E.L. (1986). Testing Statistical Hypotheses (2nd ed., pp. 134-181). John Wiley and Sons Inc.

Lingappaiah, G.S. (1988). On two-piece double exponential distribution. J. Korean Statist. Soc, 17(1), 46-55.

Mc Gill, W.J. (1962). Random fluctuations of response rate. Psychometrika, 27, 3-17.