Model formulation on efficiency for median estimation under a fixed cost in survey sampling

Abstract

In survey sampling, researchers and users of statistics do not always consider which measure of location is most appropriate. As a result, they often choose the mean or total, which is covered far more widely in the finite population sampling literature than the median; the median is more complicated to handle because it depends on ordered data. Given the established usefulness of the median estimator for estimating economic indicators with high precision and efficiency, this study improves the estimation of the population median, both by gaining efficiency and by reducing bias. The study suggests an estimator of the population median under single and double sampling. In addition, the minimum mean square error is obtained for a given cost function under double sampling. Theoretical and empirical results show that the proposed estimators perform better when the variables considered come from a highly skewed distribution, such as income, expenditure, or scores. Moreover, the proposed estimators compete favorably with the existing estimators of their class, showing less bias and substantial gains in efficiency. The study also provides an appropriate way of constructing the cost function, which allows better evaluation than the cost function of an existing estimator considered in this work.

Keywords

Auxiliary variable, cost function, double sampling, mean square error

1. Introduction

In survey sampling, statisticians often come across variables that have highly skewed distributions, such as income, expenditure, or scores. In such situations, it becomes essential to consider which tool is most appropriate for measuring location. The median has been discussed far less widely in finite population sampling than the mean or total; because it depends on ordered data, it is more complicated to handle and thus deserves special attention. Kuk and Mak (1989) were the first to introduce the estimation of the population median of the study variate $Y$ using auxiliary information in survey sampling. Francisco and Fuller (1991) also considered the problem of estimating the median as part of the estimation of a finite population distribution function. Several authors have made useful contributions to improving the precision of survey estimates of population parameters using auxiliary variables. Notable among them are Bahl and Tuteja (1991), who proposed both exponential ratio and product estimators for estimating population medians; Singh, Singh, and Puertas (2003); and Kadilar and Cingi (2004), who attempted to modify the exponential ratio estimators by introducing several parameters to improve efficiency.

Because weight adjustments in survey sampling are gaining attention as a way of improving the precision of estimates, and given the robustness of the exponential estimators proposed by Bahl and Tuteja (1991), researchers in this area have adopted several procedures for modifying these estimators to enhance the performance of median estimators. In most cases, these procedures lead to the same result for the mean square error (MSE) of the median estimator. Singh and Solanki (2013), Aladag and Cingi (2015), and Enang et al. (2016) are but a few cases in point. As a deviation, Iseh (2020) calibrated a separate-ratio exponential estimator and obtained a better result than other existing estimators in single-phase sampling.

To further enhance the performance of the median estimators, authors such as Singh et al. (2001), Singh, Joarder, and Tracy (2003), and Singh, Singh, and Upadhyay (2007) have adopted the double sampling procedure to improve the efficiency of the estimators. Jhajj, Kaur, and Jhajj (2016), Baig, Masood, and Tarray (2019), and Iseh (2021), however, have modified the exponential estimators through double sampling, which has yielded fruitful results and outperformed existing estimators in single and two-phase sampling. Keeping in mind the usefulness of the median estimator in estimating economic indicators and the need for high precision and efficiency, this study seeks to improve the estimation of the population median, not only for gains in efficiency but also to achieve asymptotically unbiased estimates.

2. Methodology

2.1 Notations

Consider a finite population $U=\left\{{u_{1},u_{2},\ldots,u_{N}}\right\}$ of size $N$. Let $Y$, $X$, and $Z$ be the study, auxiliary, and support variables, respectively. Let $y_{i}$ denote the value of the study variable, and $x_{i}$ and $z_{i}$ the values of the auxiliary and support variables (known for every unit in the population), for the $i^{\text{th}}$ element of a sample drawn under simple random sampling without replacement (SRSWOR). Let $f_{Y}\left({M_{Y}}\right)$, $f_{X}\left({M_{X}}\right)$, and $f_{Z}\left({M_{Z}}\right)$ represent the density functions of the random variables evaluated at their medians, with $\widehat{M}_{y}$, $\widehat{M}_{x}$, and $\widehat{M}_{z}$ as the sample estimates of the population medians $M_{Y}$, $M_{X}$, and $M_{Z}$, respectively. Also, let $r$ be the integer satisfying $Y_{\left(r\right)}\leqslant M_{Y}\leqslant Y_{\left({r+1}\right)}$, where $Y_{\left(r\right)}$ is the $r^{\text{th}}$ order statistic of the sample, and let $P=\frac{r}{n}$ be the proportion of $y_{i}$ values in the sample that are less than or equal to the unknown population median $M_{Y}$. If $\varphi_{y}\left(r\right)$ denotes the $r$-quantile of $Y$, then $\widehat{M}_{y}=\varphi_{y}(0.5)$. The correlation coefficients are $\rho_{M_{Y}M_{X}}=4\left({P_{11}-0.25}\right)$, where $P_{11}=P\left({Y\leqslant M_{Y}\cap X\leqslant M_{X}}\right)$; $\rho_{M_{Y}M_{Z}}=4\left({P_{11}^{\ast}-0.25}\right)$, where $P_{11}^{\ast}=P\left({Y\leqslant M_{Y}\cap Z\leqslant M_{Z}}\right)$; and $\rho_{M_{X}M_{Z}}=4\left({P_{11}^{\ast\ast}-0.25}\right)$, where $P_{11}^{\ast\ast}=P\left({X\leqslant M_{X}\cap Z\leqslant M_{Z}}\right)$. Kuk and Mak (1989) defined a matrix of proportions $P_{ij}$ as shown in Table 1.

Table 1
Matrix of proportion

	${X}\leqslant{M}_{{X}}$ , ${Z}\leqslant{M}_{{Z}}$	$X>M_{X}$ , $Z>M_{Z}$	Total
${Y}\leqslant{M}_{{Y}}$	$P_{11}$	$P_{12}$	$P_{1}$
${Y}>{M}_{{Y}}$	$P_{21}$	$P_{22}$	$P_{2}$
Total	$P_{.1}$	$P_{.2}$	1
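As a minimal sketch of how the proportion $P_{11}$ and the coefficient $\rho_{M_{Y}M_{X}}=4(P_{11}-0.25)$ could be computed from paired data, the Python snippet below counts the units falling at or below both medians. The function name `median_correlation` and the six data pairs are hypothetical and serve only as an illustration; the medians default to the medians of the supplied data.

```python
from statistics import median


def median_correlation(y, x, m_y=None, m_x=None):
    """Return (P11, rho) with P11 = P(Y <= M_Y and X <= M_X), rho = 4 * (P11 - 0.25).

    If the medians are not supplied, the medians of the data are used
    (an assumption made for this illustration).
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    m_y = median(y) if m_y is None else m_y
    m_x = median(x) if m_x is None else m_x
    # Count units lying at or below both medians (the P11 cell of Table 1).
    both_below = sum(1 for yi, xi in zip(y, x) if yi <= m_y and xi <= m_x)
    p11 = both_below / len(y)
    return p11, 4.0 * (p11 - 0.25)


# Hypothetical paired data: x increases with y, so the two variables
# fall on the same side of their medians for every unit.
y = [12, 15, 11, 19, 14, 18]
x = [22, 27, 20, 30, 25, 29]
p11, rho = median_correlation(y, x)
print(p11, rho)
```

When the two variables always fall on the same side of their medians, $P_{11}=0.5$ and the coefficient equals 1; when they always fall on opposite sides, $P_{11}=0$ and it equals $-1$.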

2.2 Some standard derivations for special class of separate estimators

Adopting the concept of Srivastava (1971) and Srivastava and Jhajj (1995), let $u=\frac{\widehat{M}_{x}}{M_{X}}$ and $v=\frac{\widehat{M}_{z}}{M_{Z}}$, where $\widehat{M}_{y}=M_{Y}(1+e_{0})$, $\widehat{M}_{x}=M_{X}(1+e_{1})$, and $\widehat{M}_{z}=M_{Z}(1+e_{2})$; then

$\displaystyle E\left[\frac{\widehat{M}_{y}}{M_{Y}}\right]=E[u]=E[v]=1,$

$\displaystyle e_{0}=\frac{\widehat{M}_{y}-M_{Y}}{M_{Y}},\ e_{1}=\frac{\widehat{M}_{x}-M_{X}}{M_{X}},\ e_{2}=\frac{\widehat{M}_{z}-M_{Z}}{M_{Z}},\ e^{\prime}_{1}=\frac{\widehat{M}^{\prime}_{x}-M_{X}}{M_{X}},\ e^{\prime}_{2}=\frac{\widehat{M}^{\prime}_{z}-M_{Z}}{M_{Z}},$

$\displaystyle E\left({e_{0}}\right)=E\left({e_{1}}\right)=E\left({e_{2}}\right)=E(e^{\prime}_{1})=E(e^{\prime}_{2})=0,$

$\displaystyle E\left({e_{0}^{2}}\right)=\lambda_{1}C_{M_{Y}}^{2},\ E\left({e_{1}^{\prime 2}}\right)=\lambda_{2}C_{M_{X}}^{2},\ E\left({e_{2}^{\prime 2}}\right)=\lambda_{2}C_{M_{Z}}^{2},$

$\displaystyle E\left({e_{0}e_{1}}\right)=\lambda_{1}C_{M_{Y}}C_{M_{X}}\rho_{M_{X}M_{Y}},\ E\left({e_{0}e_{2}}\right)=\lambda_{1}C_{M_{Y}}C_{M_{Z}}\rho_{M_{Y}M_{Z}},\ E\left({e_{1}e_{2}}\right)=\lambda_{2}C_{M_{X}}C_{M_{Z}}\rho_{M_{X}M_{Z}},$

$\displaystyle\lambda_{1}=\frac{1}{4}\left({\frac{1}{n}-\frac{1}{N}}\right),\ \lambda_{2}=\frac{1}{4}\left(\frac{1}{n^{\prime}}-\frac{1}{N}\right),\ \lambda_{3}=\frac{1}{4}\left(\frac{1}{n}-\frac{1}{n^{\prime}}\right),\ k_{1}=\frac{C_{M_{Y}}\rho_{M_{X}M_{Y}}}{C_{M_{X}}},\ k_{2}=\frac{C_{M_{Y}}\rho_{M_{Y}M_{Z}}}{C_{M_{Z}}},$

$\displaystyle C_{M_{Y}}=\left\{{M_{Y}f_{Y}\left({M_{Y}}\right)}\right\}^{-1},\ C_{M_{X}}=\left\{{M_{X}f_{X}\left({M_{X}}\right)}\right\}^{-1},\ C_{M_{Z}}=\left\{{M_{Z}f_{Z}\left({M_{Z}}\right)}\right\}^{-1},$

where it is assumed that as $N\to\infty$ the distribution of the trivariate variable $\left({Y,X,Z}\right)$ approaches a continuous distribution with marginal densities $f_{Y}\left(y\right)$, $f_{X}\left(x\right)$, and $f_{Z}\left(z\right)$ for $Y$, $X$, and $Z$, respectively. This assumption holds in particular under a superpopulation model framework, which treats the values of $\left({Y,X,Z}\right)$ in the population as a realization of $N$ independent observations from a continuous distribution. We also assume that $f_{Y}\left({M_{Y}}\right)$, $f_{X}\left({M_{X}}\right)$, and $f_{Z}\left({M_{Z}}\right)$ are positive, so that $C_{M_{Y}}$, $C_{M_{X}}$, and $C_{M_{Z}}$ are well defined.
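The constants defined above are simple to evaluate once the sample sizes, medians, and density values are known. The following Python sketch computes $\lambda_{1}$, $\lambda_{2}$, $\lambda_{3}$, a coefficient $C_{M}$, and a ratio $k$; the function names and all numerical inputs are hypothetical and chosen only for illustration.

```python
def lambdas(N, n, n_prime):
    """Return (lambda_1, lambda_2, lambda_3) as defined in Section 2.2."""
    lam1 = 0.25 * (1.0 / n - 1.0 / N)
    lam2 = 0.25 * (1.0 / n_prime - 1.0 / N)
    lam3 = 0.25 * (1.0 / n - 1.0 / n_prime)
    return lam1, lam2, lam3


def c_median(M, f_M):
    """Return C_M = {M * f(M)}^(-1); requires M * f(M) > 0."""
    if M * f_M <= 0:
        raise ValueError("M * f(M) must be positive")
    return 1.0 / (M * f_M)


def k_ratio(c_y, c_aux, rho):
    """Return k = C_MY * rho / C_aux (k_1 for X, k_2 for Z)."""
    return c_y * rho / c_aux


# Hypothetical inputs, not taken from the paper's data.
lam1, lam2, lam3 = lambdas(N=100, n=20, n_prime=40)
c_y = c_median(M=50.0, f_M=0.02)
c_x = c_median(M=40.0, f_M=0.05)
k1 = k_ratio(c_y, c_x, rho=0.6)
print(lam1, lam2, lam3, c_y, c_x, k1)
```

A useful check on the three sampling fractions is the identity $\lambda_{1}=\lambda_{2}+\lambda_{3}$, which follows directly from their definitions.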

3. Related existing estimators in literature

This section considers some existing estimators with one and two auxiliary variables in single stage sampling.

i: The classical median estimator due to Gross (1980) is given by

$\displaystyle\widehat{M}_{C}=\widehat{M}_{y}$

$\displaystyle\textit{Var}\left(\widehat{M}_{C}\right)=\lambda_{1}M^{2}_{Y}C^{2}_{M_{Y}}$ (1)

ii: The classical ratio median estimator by Kuk and Mak (1989) is given by

$\displaystyle\widehat{M}_{\textit{CR}}=\widehat{M}_{y}\left(\frac{M_{X}}{\widehat{M}_{x}}\right)$

$\displaystyle B\left(\widehat{M}_{\textit{CR}}\right)=\lambda_{1}M_{Y}C^{2}_{M_{X}}(1-k_{1})$

$\displaystyle\textit{MSE}\left(\widehat{M}_{\textit{CR}}\right)=\lambda_{1}M^{2}_{Y}\left[C^{2}_{M_{Y}}+C^{2}_{M_{X}}(1-2k_{1})\right]$ (2)

iii: The exponential ratio median estimator following Bahl and Tuteja (1991) is given by

$\displaystyle\widehat{M}_{\textit{ER}}=\widehat{M}_{y}\textit{exp}\left[\frac{M_{X}-\widehat{M}_{x}}{M_{X}+\widehat{M}_{x}}\right]$

$\displaystyle B\left(\widehat{M}_{\textit{ER}}\right)=\frac{\lambda_{1}M_{Y}C^{2}_{M_{X}}(3-4k_{1})}{8}$

$\displaystyle\textit{MSE}\left(\widehat{M}_{\textit{ER}}\right)=\lambda_{1}M^{2}_{Y}\left[C^{2}_{M_{Y}}+\frac{C^{2}_{M_{X}}}{4}(1-4k_{1})\right]$ (3)

iv: The exponential product-type median estimator following Bahl and Tuteja (1991) is given by

$\displaystyle\widehat{M}_{\textit{PR}}=\widehat{M}_{y}\textit{exp}\left[\frac{\widehat{M}_{x}-M_{X}}{M_{X}+\widehat{M}_{x}}\right]$

$\displaystyle B\left(\widehat{M}_{\textit{PR}}\right)=\frac{\lambda_{1}M_{Y}C^{2}_{M_{X}}(4k_{1}-1)}{8}$

$\displaystyle\textit{MSE}\left(\widehat{M}_{\textit{PR}}\right)=\lambda_{1}M^{2}_{Y}\left[C^{2}_{M_{Y}}+\frac{C^{2}_{M_{X}}}{4}(1+4k_{1})\right]$ (4)

v: The alternative exponential median estimator due to Enang et al. (2016) is given by

$\displaystyle\widehat{M}_{A}=\alpha_{1}\left[\widehat{M}_{y}\textit{exp}\left[\frac{M_{X}-\widehat{M}_{x}}{M_{X}+\widehat{M}_{x}}\right]\right]+\alpha_{2}\left[\widehat{M}_{y}\textit{exp}\left[\frac{\widehat{M}_{x}-M_{X}}{M_{X}+\widehat{M}_{x}}\right]\right]$

$\displaystyle B\left(\widehat{M}_{A}\right)=\lambda_{1}M_{Y}C^{2}_{M_{X}}(4k_{1}-8k_{1}^{2}+1)$

$\displaystyle\textit{MSE}\left(\widehat{M}_{A}\right)=\lambda_{1}M^{2}_{Y}C^{2}_{M_{Y}}\left(1-\rho_{M_{Y}M_{X}}^{2}\right)$ (5)

vi: Shabbir and Gupta (2017) suggested a generalized difference-type estimator for the population median as

$\displaystyle\widehat{M}^{G}_{\textit{PP}}=\left[m_{1}\widehat{M}_{y}+m_{2}\left(M_{X}-\widehat{M}_{x}\right)\right]\left[\left(\frac{aM_{X}+b}{a\widehat{M}_{x}+b}\right)\textit{exp}\left\{\frac{\alpha_{2}\alpha(M_{X}-\widehat{M}_{x})}{a\{(\gamma-1)M_{X}+\widehat{M}_{x}\}+2b}\right\}\right]$

where $a$ and $b$ are defined to be unknown population parameters and $\alpha_{1}$ , $\alpha_{2}$ and $\gamma$ are scalar quantities which can take different values like $\alpha_{1}=b=0$ and $\alpha_{2}=a=\gamma=1$ , and

$\displaystyle m_{1\left(\textit{opt}\right)}=\frac{1-\frac{1}{2}\lambda_{1}M_{X}^{2}}{1+\lambda_{1}M_{Y}^{2}\left({1-\rho_{M_{Y}M_{X}}^{2}}\right)}$

and

$\displaystyle m_{2\left(\textit{opt}\right)}=\frac{M_{Y}}{M_{X}}\left[{1+m_{1\left(\textit{opt}\right)}\left\{{\frac{\rho_{M_{Y}M_{X}}C_{M_{Y}}}{C_{M_{X}}}-2}\right\}}\right].$

The expressions for the bias and the mean square error up to the first order of approximation are as follows:

$\displaystyle B\left(\widehat{M}^{G}_{\textit{PP}}\right)\cong(m_{1(\textit{opt})}-1)M_{Y}+m_{2(\textit{opt})}\left\{\lambda_{1}M_{Y}\left(\frac{3}{8}C^{2}_{M_{X}}-C_{M_{Y}}\right)+\lambda_{1}M_{X}C^{2}_{M_{X}}\right\}$

and

$\displaystyle\textit{MSE}\left(\widehat{M}^{G}_{\textit{PP}}\right)\cong\frac{\lambda_{1}M^{2}_{Y}}{1+\lambda_{1}M^{2}_{Y}\left(1-\rho_{M_{Y}M_{X}}^{2}\right)}\left[C^{2}_{M_{Y}}\left(1-\rho_{M_{Y}M_{X}}^{2}\right)\left(1-\lambda_{1}C^{2}_{M_{X}}\right)-\frac{1}{4}C^{4}_{M_{X}}\right]$ (6)

vii: Baig, Masood and Tarray (2019) suggested an improved class of difference-type estimators for population median using two auxiliary variables

$\displaystyle\widehat{M}^{I}_{P}=\left[\widehat{M}_{y}+m_{1}(M_{X}-\widehat{M}_{x})\right]\left[m_{2}\textit{exp}\left(\frac{M_{Z}-\widehat{M}_{z}}{M_{Z}+\widehat{M}_{z}}\right)+(1-m_{2})\textit{exp}\left(\frac{\widehat{M}_{z}-M_{Z}}{M_{Z}+\widehat{M}_{z}}\right)\right]$

$\displaystyle B\left(\widehat{M}_{P}^{I}\right)=\lambda_{1}\left[m_{1}M_{X}C_{M_{XZ}}\left(m_{2}-\frac{1}{2}\right)+M_{Y}C_{M_{YZ}}\left(\frac{1}{2}-m_{2}\right)\right]M_{Y}$

where

$\displaystyle m_{1\left(\textit{opt}\right)}=\frac{M_{Y}C_{M_{Y}}\left({\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}-\rho_{M_{Y}M_{X}}}\right)}{M_{X}C_{M_{X}}\left({1-\rho_{M_{X}M_{Z}}^{2}}\right)},$

$\displaystyle m_{2\left(\textit{opt}\right)}=\frac{C_{M_{Z}}\left({\rho_{M_{X}M_{Z}}^{2}-1}\right)+2C_{M_{Y}}\left({\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{X}}-\rho_{M_{Y}M_{Z}}}\right)}{2C_{M_{Z}}\left({\rho_{M_{X}M_{Z}}^{2}-1}\right)}$

$\displaystyle\textit{MSE}\left(\widehat{M}_{P}^{I}\right)=\frac{\lambda_{1}M^{2}_{Y}C^{2}_{M_{Y}}}{\left(1-\rho_{M_{X}M_{Z}}^{2}\right)}\left[1-\rho^{2}_{M_{X}M_{Z}}-\rho^{2}_{M_{Y}M_{X}}-\rho^{2}_{M_{Y}M_{Z}}+2\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{X}}\rho_{M_{Y}M_{Z}}\right]$ (7)

viii: Iseh (2021) proposed a separate ratio exponential estimator of the form

$\displaystyle\widehat{M}_{\textit{srs}}(\alpha)=\widehat{M}_{y}\left[\alpha\frac{M_{X}}{\widehat{M}_{x}}+(1-\alpha)\frac{\widehat{M}_{x}}{M_{X}}\right]\textit{exp}\left[\frac{(\widehat{M}_{z}-M_{Z})}{(M_{Z}+\widehat{M}_{z})}\right]$

The minimum bias, attained at

$\displaystyle\alpha=\frac{2C_{M_{X}}^{2}+2C_{M_{X}}C_{M_{Y}}\rho_{M_{X}M_{Y}}-C_{M_{X}}C_{M_{Z}}\rho_{M_{X}M_{Z}}}{4C_{M_{X}}^{2}},$

is

$\displaystyle\textit{Bias}_{\textit{opt}}\left(\widehat{M}_{\textit{srs}}(\alpha)\right)=\lambda_{1}M_{Y}\left[\frac{C^{2}_{M_{X}}}{2}+\frac{3}{8}C^{2}_{M_{Z}}+\frac{C_{M_{X}}C_{M_{Y}}\rho_{M_{X}M_{Y}}}{2}-\frac{C_{M_{X}}C_{M_{Z}}\rho_{M_{X}M_{Z}}}{4}-\frac{C^{2}_{M_{Z}}\rho^{2}_{M_{X}M_{Z}}}{4}-\frac{C_{M_{Y}}C_{M_{Z}}\rho_{M_{Y}M_{Z}}}{2}+C_{M_{Y}}C_{M_{Z}}\rho_{M_{X}M_{Z}}\rho_{M_{X}M_{Y}}\right]$

with the corresponding optimum MSE

$\displaystyle\textit{MSE}_{\textit{opt}}\left(\widehat{M}_{\textit{srs}}(\alpha)\right)=\lambda_{1}M^{2}_{Y}\left[C^{2}_{M_{Y}}+\frac{C^{2}_{M_{Z}}}{4}-\left(\frac{k_{2}}{2}-k_{1}\right)^{2}-C_{M_{Y}}C_{M_{Z}}\rho_{M_{Y}M_{Z}}\right]$ (8)
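As a numerical illustration of Eqs (1) to (4), the sketch below evaluates the first-order MSEs of the classical, ratio, exponential ratio, and exponential product-type estimators. The inputs are the Population I statistics reported in Table 2; the function name `first_order_mses` and the dictionary keys are hypothetical labels introduced here.

```python
def first_order_mses(N, n, M_y, M_x, f_y, f_x, rho_yx):
    """Evaluate Eqs (1)-(4) for one auxiliary variable X."""
    lam1 = 0.25 * (1.0 / n - 1.0 / N)
    c_y = 1.0 / (M_y * f_y)          # C_MY
    c_x = 1.0 / (M_x * f_x)          # C_MX
    k1 = c_y * rho_yx / c_x          # k_1
    base = lam1 * M_y ** 2
    return {
        "classical": base * c_y ** 2,                                        # Eq (1)
        "ratio": base * (c_y ** 2 + c_x ** 2 * (1 - 2 * k1)),                # Eq (2)
        "exp_ratio": base * (c_y ** 2 + 0.25 * c_x ** 2 * (1 - 4 * k1)),     # Eq (3)
        "exp_product": base * (c_y ** 2 + 0.25 * c_x ** 2 * (1 + 4 * k1)),   # Eq (4)
    }


# Population I statistics from Table 2.
pop1 = first_order_mses(N=69, n=17, M_y=2068, M_x=2011,
                        f_y=0.00014, f_x=0.00014, rho_yx=0.1505)
for name, value in pop1.items():
    print(f"{name:12s} {value:.3e}")
```

With these inputs the four values agree, to the two significant figures reported, with the corresponding Population I MSE entries in Table 3.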
4. The proposed estimators

Let $\left({u,v}\right)$ assume values in a bounded, closed, convex subset $R$ of the two-dimensional real space containing the point $\left({1,1}\right)$. Let $f\left({u,v}\right)$ be a function of $u$ and $v$ such that $f\left({1,1}\right)=1$ and the following conditions are satisfied:

(i) The function $f\left({u,v}\right)$ is continuous and bounded in $R$.
(ii) The first and second partial derivatives of $f\left({u,v}\right)$ exist and are continuous and bounded in $R$.

Following Srivastava (1971), this particular class of estimator of the population median, $M_{Y}$ , is defined as

$\displaystyle\widehat{M}_{\textit{srs}}(\pi)=\widehat{M}_{y}f(u,v)$ (9)

To obtain the bias and mean square error of $\widehat{M}_{\textit{srs}}(\pi)$ the technique for expansion of a general class of estimators by Srivastava (1971) is adopted. Hence, the function $f\left({u,v}\right)$ is expanded about the point $\left({1,1}\right)$ in a second order Taylor’s series as shown in Sections 4.1 and 5.2.

Equation (9) can be written explicitly for the single- and two-phase estimators, as shown in Eqs (10) and (18).

4.1 The proposed estimator under simple random sampling

$\displaystyle\widehat{M}_{\textit{srs}}(\pi)=\widehat{M}_{y}\left[\alpha\frac{M_{X}}{\widehat{M}_{x}}+(1-\alpha)\frac{\widehat{M}_{x}}{M_{X}}\right]\left\{\beta\textit{exp}\left[\frac{(M_{Z}-\widehat{M}_{z})}{(M_{Z}+\widehat{M}_{z})}\right]+(1-\beta)\textit{exp}\left[\frac{(\widehat{M}_{z}-M_{Z})}{(M_{Z}+\widehat{M}_{z})}\right]\right\}$ (10)

where $\alpha$ and $\beta$ are unknown constants whose values, obtained by minimizing the MSE of $\widehat{M}_{\textit{srs}}(\pi)$, are

$\displaystyle\alpha=\frac{C_{M_{X}}C_{M_{Y}}\rho_{M_{X}M_{Y}}+C_{M_{X}}^{2}-C_{M_{X}}C_{M_{Y}}\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}-C_{M_{X}}^{2}\rho_{M_{X}M_{Z}}^{2}}{2\left({1-\rho_{M_{X}M_{Z}}^{2}}\right)}$

$\displaystyle\beta=\frac{2C_{M_{Y}}C_{M_{Z}}\rho_{M_{Y}M_{Z}}+C_{M_{Z}}^{2}-2C_{M_{Y}}C_{M_{Z}}\rho_{M_{X}M_{Y}}\rho_{M_{X}M_{Z}}-C_{M_{Z}}^{2}\rho_{M_{X}M_{Z}}^{2}}{2\left({1-\rho_{M_{X}M_{Z}}^{2}}\right)},$

then

$\displaystyle\textit{MSE}_{\textit{opt}}\left(\widehat{M}_{\textit{srs}}(\pi)\right)=\frac{\lambda_{1}M_{Y}^{2}C_{M_{Y}}^{2}}{\left(1-\rho_{M_{X}M_{Z}}^{2}\right)}\left[1-\rho_{M_{X}M_{Y}}^{2}-\rho_{M_{X}M_{Z}}^{2}-\rho_{M_{Y}M_{Z}}^{2}+2\rho_{M_{X}M_{Y}}\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right]$ (11)

$\displaystyle\textit{Bias}_{\textit{opt}}\left(\widehat{M}_{\textit{srs}}(\pi)\right)=\frac{\lambda_{1}M_{Y}}{8\left[C_{M_{X}}^{2}C_{M_{Z}}^{2}\left(1-\rho_{M_{X}M_{Z}}^{2}\right)\right]^{2}}\left\{C_{M_{X}}^{2}C_{M_{Z}}^{2}(1-\rho_{M_{X}M_{Z}}^{2})\left(4C_{M_{X}}^{3}C_{M_{Z}}^{2}C_{M_{Y}}\right.\right.$
$\displaystyle\quad\left(\rho_{M_{X}M_{Y}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)+4C_{M_{X}}^{2}C_{M_{Z}}^{3}C_{M_{Y}}\left(\rho_{M_{Y}M_{Z}}-\rho_{M_{X}M_{Y}}\rho_{M_{X}M_{Z}}\right)+4C_{M_{X}}^{4}C_{M_{Y}}^{2}$
$\displaystyle\quad-4C_{M_{X}}^{4}C_{M_{Z}}^{2}\rho_{M_{X}M_{Z}}^{2}+8C_{M_{X}}^{2}C_{M_{Y}}^{2}C_{M_{Z}}^{2}\left(2\rho_{M_{X}M_{Y}}\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}-\rho_{M_{X}M_{Y}}^{2}-\rho_{M_{Y}M_{Z}}^{2}\right)$
$\displaystyle\quad\left.+C_{M_{X}}^{2}C_{M_{Z}}^{4}\left(1-\rho_{M_{X}M_{Z}}^{2}\right)\right)+8C_{M_{X}}^{2}C_{M_{Z}}^{2}C_{M_{Y}}\rho_{M_{X}M_{Z}}\left(C_{M_{Z}}\rho_{M_{Y}M_{Z}}-\rho_{M_{X}M_{Y}}\right)C_{M_{X}}^{2}C_{M_{Y}}$
$\displaystyle\quad\left.C_{M_{Z}}\left(\rho_{M_{X}M_{Y}}\rho_{M_{X}M_{Z}}-\rho_{M_{Y}M_{Z}}\right)\right\}$ (12)
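The optimum MSE in Eq (11) depends only on $\lambda_{1}$, $M_{Y}$, $C_{M_{Y}}$, and the three correlation coefficients, so it is straightforward to evaluate. The Python sketch below does this for the Population I statistics in Table 2; the function name `mse_opt_single_phase` is a hypothetical label introduced here.

```python
def mse_opt_single_phase(N, n, M_y, f_y, rho_xy, rho_xz, rho_yz):
    """Evaluate Eq (11), the optimum MSE of the proposed single-phase estimator."""
    if abs(rho_xz) >= 1:
        raise ValueError("rho_xz must lie strictly between -1 and 1")
    lam1 = 0.25 * (1.0 / n - 1.0 / N)
    c_y = 1.0 / (M_y * f_y)  # C_MY
    bracket = (1 - rho_xy ** 2 - rho_xz ** 2 - rho_yz ** 2
               + 2 * rho_xy * rho_xz * rho_yz)
    return lam1 * M_y ** 2 * c_y ** 2 * bracket / (1 - rho_xz ** 2)


# Population I statistics from Table 2.
mse_pop1 = mse_opt_single_phase(N=69, n=17, M_y=2068, f_y=0.00014,
                                rho_xy=0.1505, rho_xz=0.1431, rho_yz=0.3166)
print(f"{mse_pop1:.3e}")
```

Setting $\rho_{M_{X}M_{Z}}=\rho_{M_{Y}M_{Z}}=0$ in Eq (11) gives $\lambda_{1}M_{Y}^{2}C_{M_{Y}}^{2}\left(1-\rho_{M_{X}M_{Y}}^{2}\right)$, the form of Eq (5), which is a convenient sanity check on any implementation.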

4.2 Application

To validate the theoretical claims, empirical investigations are carried out using the data statistics in Table 2. To obtain the percent relative efficiencies ($\%\textit{RE}$) of the estimators, the MSE values of the existing and proposed estimators are computed as

$\displaystyle\%\textit{RE}=\frac{\textit{MSE}\left(\widehat{M}_{ex}\right)}{\textit{MSE}\left(\widehat{M}_{p}\right)}\times 100$

where $\textit{MSE}\left(\widehat{M}_{ex}\right)$ is the MSE of the classical median estimator and $\textit{MSE}\left(\widehat{M}_{p}\right)$ denotes the MSE of the estimator being compared with it.
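The percent relative efficiency is a one-line computation. The sketch below, with the hypothetical function name `percent_relative_efficiency`, applies it to the rounded Population I MSE values reported in Table 3 for the classical and proposed estimators; because the inputs are rounded, the result can differ slightly from the PRE column of that table.

```python
def percent_relative_efficiency(mse_reference, mse_candidate):
    """Return 100 * MSE(reference) / MSE(candidate)."""
    if mse_candidate <= 0:
        raise ValueError("mse_candidate must be positive")
    return 100.0 * mse_reference / mse_candidate


# Rounded MSE values for Population I as printed in Table 3.
mse_classical = 5.7e5
mse_proposed = 5.0e5
pre = percent_relative_efficiency(mse_classical, mse_proposed)
print(round(pre, 1))
```

A value above 100 means the candidate estimator has a smaller MSE than the classical median estimator; the classical estimator compared with itself scores exactly 100.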

4.3 Descriptive statistics

The data statistics for population I, II, III, and IV are given in Table 2.

Table 2
Data Statistics from four populations under simple random and two-phase sampling

Statistics	Population I	Population II	Population III	Population IV
${N}$	69	97	67	97
${n}$	17	33	15	24
${n}^{\prime}$	24	46	23	46
${M}_{Y}$	2068	1242	4.8	21.4
${M}_{X}$	2011	1233	7.0	22.8
${M}_{Z}$	2307	1207	151	22.6
${\rho}_{M_{Y}M_{X}}$	0.1505	0.2096	0.6624	0.48
${\rho}_{M_{X}M_{Z}}$	0.1431	0.15	0.7592	0.45
${\rho}_{M_{Y}M_{Z}}$	0.3166	0.123	0.8624	0.44
${f}_{Y}(M_{Y})$	0.00014	0.00021	0.0763	2.303
${f}_{X}(M_{X})$	0.00014	0.0002	0.0526	2.510
${f}_{Z}({M}_{Z})$	0.00013	0.0002	0.0024	2.398

Table 3

Results for numerical comparison of AB, MSE and PRE under simple random sampling

	Population I			Population II			Population III			Population IV
Est	AB	MSE	PRE	AB	MSE	PRE	AB	MSE	PRE	AB	MSE	PRE
$\widehat{M}_{C}$	0.0	$5.7\times 10^{5}$	100	0.0	$1.1\times 10^{5}$	100	0.0	0.66	100	0.0	$1.5\times 10^{-3}$	100
$\widehat{M}_{\textit{CR}}$	246.3	$9.9\times 10^{5}$	57	81.7	$1.9\times 10^{5}$	60	0.05	0.44	149	0.0	$1.4\times 10^{-3}$	107
$\widehat{M}_{\textit{ER}}$	87.3	$6.3\times 10^{5}$	90	28.3	$1.2\times 10^{5}$	95	0.01	0.39	170	0.0	$1.1\times 10^{-3}$	136
$\widehat{M}_{\textit{PR}}$	15.0	$8.0\times 10^{5}$	71	2.7	$1.7\times 10^{5}$	67	0.03	1.26	53	0.0	$2.4\times 10^{-3}$	63
$\widehat{M}_{A}$	$4.1\times 10^{2}$	$5.5\times 10^{5}$	102	150.9	$1.1\times 10^{5}$	105	0.02	0.37	179	0.0	$1.1\times 10^{-3}$	136
$\widehat{M}_{\textit{PP}}^{G}$	$2.0\times 10^{7}$	$4.9\times 10^{5}$	115	$2.9\times 10^{6}$	$1.0\times 10^{5}$	113	0.84	0.29	229	15.5	3.59	0.0
$\widehat{M}^{I}_{P}$	$4.6\times 10^{4}$	$5.0\times 10^{5}$	113	$5.0\times 10^{4}$	$1.1\times 10^{5}$	106	0.26	0.17	391	0.0	$1.0\times 10^{-3}$	150
$\widehat{M}_{\textit{srs}}(\alpha)$	208.0	$5.2\times 10^{5}$	109	89.6	$1.3\times 10^{5}$	88	0.07	0.20	327	0.0	$1.1\times 10^{-3}$	136
$\widehat{M}_{\textit{srs}}(\pi)$	208.3	$5.0\times 10^{5}$	113	66.0	$1.1\times 10^{5}$	106	0.09	0.17	391	0.0	$1.0\times 10^{-3}$	150

Population I: Let $y$, $x$, and $z$ be the number of fish caught by marine recreational fishermen in the USA in the years 1995, 1994, and 1993, respectively, as given by Singh (2003a).

Population II: Let $y$ be the district-wise tomato production (tonnes) in 2003, $x$ the district-wise tomato production (tonnes) in 2002, and $z$ the district-wise tomato production (tonnes) in 2001, as given by MFA (2004).

Population III: Let $y$ be the U.S. exports to Singapore in billions of Singapore dollars, $x$ the money supply figures in billions of Singapore dollars, and $z$ the local supply in U.S. dollars, as given by Aczel and Sounderpandian (2004).

Population IV: The study variable $y$ is the total fertility rate, the supplementary variable $x$ is the crude birth rate, and $z$ is the crude death rate. A transformation has been applied to the original data, and the minimum value is $10$ for all the variables. The transformation of variables is defined as $y=\textit{TFR}\times 10-4.2$, $x=\textit{CBR}-0.2$, and $z=\textit{CDR}\times 2+1.2$. Source: Silverman (1986) and Singh (2003b).

4.4 Results

The results that validate the theoretical claims for single-phase sampling, namely the bias, mean square error, and percent relative efficiency, are given in Table 3 for Populations I, II, III, and IV.

5. Two phase sampling

5.1 Existing estimators under two phase sampling

This section considers some existing estimators with one and two auxiliary variables in double sampling.

(i) Singh, Joarder, and Tracy (2003) suggested a ratio estimator for the median in two-phase sampling

$\displaystyle\widehat{M}_{\textit{SA}}=\frac{\widehat{M}_{y}}{\widehat{M}_{x}}\widehat{M}_{x}^{\prime}$

$\displaystyle B\left(\widehat{M}_{\textit{SA}}\right)=\lambda_{3}\frac{\left(1-\rho_{M_{X}M_{Y}}\right)}{4f_{Y}\left(M_{Y}\right)}$

$\displaystyle\textit{MSE}\left(\widehat{M}_{\textit{SA}}\right)=\frac{\left\{f_{Y}\left(M_{Y}\right)\right\}^{-2}}{4}\left[\lambda_{1}+\lambda_{3}\left(\frac{M_{Y}f_{Y}\left(M_{Y}\right)}{M_{X}f_{X}\left(M_{X}\right)}\right)\left\{\left(\frac{M_{Y}f_{Y}\left(M_{Y}\right)}{M_{X}f_{X}\left(M_{X}\right)}\right)-2\rho_{M_{X}M_{Y}}\right\}\right]$ (13)

(ii) Singh, Singh, and Upadhyay (2007) studied a ratio-type estimator of the median using two auxiliary variables

$\displaystyle\widehat{M}_{S}=\widehat{M}_{y}\left(\frac{\widehat{M}_{x}^{\prime}}{\widehat{M}_{x}}\right)^{\alpha_{1}}\left(\frac{M_{Z}}{\widehat{M}_{z}^{\prime}}\right)^{\alpha_{2}}\left(\frac{M_{Z}}{\widehat{M}_{z}}\right)^{\alpha_{3}}$

$\displaystyle B\left(\widehat{M}_{S}\right)\cong\frac{\left\{f_{Y}\left(M_{Y}\right)\right\}^{-2}}{8M_{Y}\left(1-\rho_{M_{X}M_{Z}}^{2}\right)^{2}}\left[\lambda_{3}(1-\rho_{M_{X}M_{Z}}^{2})\left\{\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)^{2}-2\rho_{M_{X}M_{Y}}\right.\right.$
$\displaystyle\quad\left.\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)+\left(\frac{M_{Y}f_{Y}\left(M_{Y}\right)}{M_{X}f_{X}\left(M_{X}\right)}\right)\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)\right\}$
$\displaystyle\quad+\lambda_{1}\left(\rho_{M_{Y}M_{Z}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{X}}\right)\left\{\left(\rho_{M_{Y}M_{Z}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{X}}\right)+\right.$
$\displaystyle\quad\left.2\rho_{M_{X}M_{Z}}\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)+\left(\frac{M_{Y}f_{Y}\left(M_{Y}\right)}{M_{X}f_{X}\left(M_{X}\right)}\right)\left(1-\rho_{M_{X}M_{Z}}^{2}\right)\right\}$
$\displaystyle\quad+\lambda_{2}\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)\left\{\rho_{M_{X}M_{Z}}^{2}\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)-2\rho_{M_{Y}M_{Z}}\rho_{M_{X}M_{Z}}\right.$
$\displaystyle\quad\left.\left.\left(1-\rho_{M_{X}M_{Z}}^{2}\right)+\left(\frac{M_{Y}f_{Y}\left(M_{Y}\right)}{M_{X}f_{X}\left(M_{X}\right)}\right)\rho_{M_{X}M_{Z}}\left(1-\rho_{M_{X}M_{Z}}^{2}\right)\right\}\right]$

$\displaystyle\textit{MSE}\left(\widehat{M}_{S}\right)\cong\frac{\left\{f_{Y}\left(M_{Y}\right)\right\}^{-2}}{4}\left[\lambda_{1}-\lambda_{2}\rho_{M_{X}M_{Z}}^{2}-\lambda_{3}\frac{\rho_{M_{Y}M_{X}}^{2}+\rho_{M_{Y}M_{Z}}^{2}-2\rho_{M_{Y}M_{X}}\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}}{\left(1-\rho_{M_{X}M_{Z}}^{2}\right)}\right]$ (14)

(iii) Jhajj, Kaur, and Jhajj (2016) defined a ratio-exponential-type estimator as

$\displaystyle\widehat{M}_{\textit{YH}}=\widehat{M}_{y}\left(\frac{M_{Z}}{\widehat{M}_{z}^{\prime}}\right)^{v_{1}}\left(\frac{M_{Z}}{\widehat{M}_{z}}\right)^{v_{2}}\exp\left[\frac{v_{3}\left(\widehat{M}_{x}-M_{X}\right)}{\left(M_{X}+\widehat{M}_{x}\right)}\right]$

$\displaystyle B\left(\widehat{M}_{\textit{YH}}\right)\cong\frac{\left\{f_{Y}\left(M_{Y}\right)\right\}^{-2}}{8M_{Y}\left(1-\rho_{M_{X}M_{Z}}^{2}\right)^{2}}\left[\lambda_{3}\left\{\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)^{2}\right.\right.$
$\displaystyle\quad-2\rho_{M_{X}M_{Y}}\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)\left(1-\rho_{M_{X}M_{Z}}^{2}\right)+\left(\frac{M_{Y}f_{Y}\left(M_{Y}\right)}{M_{X}f_{X}\left(M_{X}\right)}\right)$
$\displaystyle\quad\left.\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{X}}\right)+2\rho_{M_{X}M_{Z}}\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)-2\rho_{M_{Y}M_{Z}}\left(\rho_{M_{Y}M_{Z}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{X}}\right)$
$\displaystyle\quad\left.\left.(1-\rho_{M_{X}M_{Z}}^{2}\right)+\left(\frac{M_{Y}f_{Y}\left(M_{Y}\right)}{M_{Z}f_{Z}\left(M_{Z}\right)}\right)\left(\rho_{M_{Y}M_{Z}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{X}}\right)\left(1-\rho_{M_{X}M_{Z}}^{2}\right)\right\}+\lambda_{2}$
$\displaystyle\quad\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)\rho_{M_{X}M_{Z}}\left\{\rho_{M_{X}M_{Z}}\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)-2\rho_{M_{Y}M_{Z}}\left(1-\rho_{M_{X}M_{Z}}^{2}\right)\right\}$
$\displaystyle\quad\left.+\left(\frac{M_{Y}f_{Y}\left(M_{Y}\right)}{M_{Z}f_{Z}\left(M_{Z}\right)}\right)\rho_{M_{X}M_{Z}}\left(\rho_{M_{Y}M_{X}}-\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}\right)\left(1-\rho_{M_{X}M_{Z}}^{2}\right)\right]$

$\displaystyle\textit{MSE}\left(\widehat{M}_{\textit{YH}}\right)\cong\frac{\left\{f_{Y}\left(M_{Y}\right)\right\}^{-2}}{4}\left[\lambda_{1}-\lambda_{2}\rho_{M_{Y}M_{Z}}^{2}-\lambda_{3}\frac{\rho_{M_{Y}M_{X}}^{2}+\rho_{M_{Y}M_{Z}}^{2}-2\rho_{M_{Y}M_{X}}\rho_{M_{X}M_{Z}}\rho_{M_{Y}M_{Z}}}{\left(1-\rho_{M_{X}M_{Z}}^{2}\right)}\right]$ (15)

(iv) Baig, Masood and Tarray (2019) suggested an improved class of difference-type estimators for population median under two phase sampling with two auxiliary variables

$\displaystyle\widehat{M}_{P}^{I}=\left[\widehat{M}_{y}+m_{1}\left(\widehat{M}_{x}^{\prime}-\widehat{M}_{x}\right)\right]\left[m_{2}\exp\left(\frac{M_{Z}-\widehat{M}_{z}^{\prime}}{M_{Z}+\widehat{M}_{z}^{\prime}}\right)+\left(1-m_{2}\right)\exp\left(\frac{\widehat{M}_{z}^{\prime}-M_{Z}}{M_{Z}+\widehat{M}_{z}^{\prime}}\right)\right]$

where $m_{1}$ and $m_{2}$ are constants given as $m_{1\left(\textit{opt}\right)}=\frac{M_{Y}C_{M_{Y}}\rho_{M_{Y}M_{X}}}{M_{X}C_{M_{X}}}$ and $m_{2\left(\textit{opt}\right)}=\frac{1}{2}+\frac{M_{Y}\rho_{M_{Y}M_{Z}}}{C_{M_{Z}}}$

$\displaystyle B\left(\widehat{M}_{P}^{I}\right)=M_{Y}\frac{1}{4}\lambda_{1}\left(\frac{1}{2}-m_{2}\right)\rho_{M_{Y}M_{Z}}C_{M_{Y}}C_{M_{Z}}$

$\displaystyle\textit{MSE}\left(\widehat{M}_{P}^{I}\right)=M_{Y}^{2}\frac{C_{M_{Y}}^{2}}{4}\left[\lambda_{2}+\lambda_{3}\rho_{M_{Y}M_{X}}^{2}-\lambda_{1}\rho_{M_{Y}M_{Z}}^{2}\right]$ (16)

(v) Iseh (2021) proposed a separate ratio exponential estimator of the form

$\displaystyle\widehat{M}_{\textit{srs}}^{D}(\alpha)=\widehat{M}_{y}\left[\alpha\frac{\widehat{M}_{x}^{\prime}}{\widehat{M}_{x}}+(1-\alpha)\frac{\widehat{M}_{x}}{\widehat{M}_{x}^{\prime}}\right]\exp\left[\frac{\left(M_{Z}-\widehat{M}_{z}^{\prime}\right)}{\left(M_{Z}+\widehat{M}_{z}^{\prime}\right)}\right]$

The optimum MSE and the corresponding bias of $\widehat{M}_{\textit{srs}}^{D}(\alpha)$ are

$\displaystyle\textit{MSE}_{\textit{opt}}\left(\widehat{M}_{\textit{srs}}^{D}(\alpha)\right)=M_{Y}^{2}\left[\lambda_{1}C_{M_{Y}}^{2}-\lambda_{3}C_{M_{Y}}^{2}\rho_{M_{X}M_{Y}}^{2}+\lambda_{2}\left\{\frac{C_{M_{Z}}^{2}}{4}-C_{M_{Y}}C_{M_{Z}}\rho_{M_{Y}M_{Z}}\right\}\right]$ (17)

$\displaystyle\textit{Bias}_{\textit{opt}}\left(\widehat{M}_{\textit{srs}}^{D}(\alpha)\right)=M_{Y}\left[\lambda_{3}\frac{C_{M_{X}}^{2}}{2}+\lambda_{3}\frac{C_{M_{X}}C_{M_{Y}}\rho_{M_{X}M_{Y}}}{2}+\frac{3}{8}\lambda_{2}C_{M_{Z}}^{2}-\lambda_{3}C_{M_{Y}}^{2}\rho_{M_{X}M_{Z}}^{2}-\lambda_{2}\frac{C_{M_{Y}}C_{M_{Z}}\rho_{M_{Y}M_{Z}}}{2}\right]$

5.2 Proposed estimator under two phase sampling

$\displaystyle\widehat{M}_{\textit{srs}}^{D}(\pi)=\widehat{M}_{y}\left[\alpha\frac{\widehat{M}_{x}^{\prime}}{\widehat{M}_{x}}+(1-\alpha)\frac{\widehat{M}_{x}}{\widehat{M}_{x}^{\prime}}\right]\left\{\beta\exp\left[\frac{\left(M_{Z}-\widehat{M}_{z}^{\prime}\right)}{\left(M_{Z}+\widehat{M}_{z}^{\prime}\right)}\right]+(1-\beta)\exp\left[\frac{\left(\widehat{M}_{z}^{\prime}-M_{Z}\right)}{\left(M_{Z}+\widehat{M}_{z}^{\prime}\right)}\right]\right\}$ (18)

$\displaystyle\textit{MSE}_{\textit{opt}}\left(\widehat{M}_{\textit{srs}}^{D}(\pi)\right)=\frac{M_{Y}^{2}C_{M_{Y}}^{2}}{4}\left[\lambda_{2}+\lambda_{3}\rho_{M_{X}M_{Y}}^{2}-\lambda_{1}\frac{\rho_{M_{Y}M_{Z}}^{2}}{4}\right]$ (19)

where $\alpha=\frac{k_{1}+1}{2}$, $\beta=\frac{k_{2}+1}{2}$, and

$\displaystyle\textit{Bias}_{\textit{opt}}\left(\widehat{M}_{\textit{srs}}^{D}(\pi)\right)=\frac{M_{Y}}{4}\left\{\lambda_{3}\left[C_{M_{Y}}^{2}\rho_{M_{X}M_{Y}}^{2}-\frac{1}{2}C_{M_{X}}^{2}-\frac{1}{2}C_{M_{X}}C_{M_{Y}}\rho_{M_{X}M_{Y}}\right]-\lambda_{1}\left[\frac{1}{2}C_{M_{Y}}^{2}\rho_{M_{Y}M_{Z}}^{2}-\frac{1}{4}C_{M_{Y}}C_{M_{Z}}\rho_{M_{Y}M_{Z}}-\frac{1}{8}C_{M_{Z}}^{2}\right]\right\}$

Note: The proposed estimators in single and two-phase sampling are special members of the generalized class of estimators proposed by Srivastava (1971), $f\left(\widehat{M}_{y},u,v\right)$, which combines the estimator $\widehat{M}_{y}$ with two pieces of supplementary information, $u=\frac{\widehat{M}_{x}}{M_{X}}$ and $v=\frac{\widehat{M}_{z}}{M_{Z}}$, and satisfies $f\left(M_{Y},1,1\right)=M_{Y}$.

6. Optimum sample sizes for fixed cost and variances

6.1 Existing estimators

i) Following Gross (1980), the cost function for the usual median estimator is given as

$C_{0}=nC_{1}$

where $C_{0}$ is the fixed cost of the survey and $C_{1}$ is the cost per unit of obtaining information on the study variable. Substituting $n=C_{0}/C_{1}$ into the variance of the usual estimator gives the minimum variance

$\displaystyle\textit{Var}_{\min}\left(\widehat{M}_{C}\right)=\frac{1}{4}\left(\frac{C_{1}}{C_{0}}-\frac{1}{N}\right)M_{Y}^{2}C_{M_{Y}}^{2}$ (20)

ii) Following Singh, Joarder and Tracy (2001), the cost function for an estimator with a single auxiliary variable in double sampling is given as $C_{0}=nC_{1}+n^{\prime}C_{2}$ , where $C_{2}$ is the cost per unit of obtaining information on the auxiliary variable in the first phase. The minimum mean square error is

$\displaystyle\textit{MSE}_{\min}\left(\widehat{M}_{SA}\right)=\frac{C_{1}\left(\sqrt{V_{o}-V_{1}}+\sqrt{C_{2}}\right)^{2}}{C_{0}}-\frac{V_{o}}{N}$ (21)

where $V_{o}=\frac{\left\{f_{Y}\left(M_{Y}\right)\right\}^{-2}}{4}$ , and

$\displaystyle V_{1}=\left(\frac{M_{Y}}{M_{X}}\right)f_{X}\left(M_{X}\right)\left[\left(4\rho_{M_{X}M_{Y}}-1\right)\left\{f_{Y}\left(M_{Y}\right)\right\}^{-1}-\frac{1}{4}\left(\frac{M_{Y}}{M_{X}}\right)\left\{f_{X}\left(M_{X}\right)\right\}^{-1}\right]$

iii) Baig, Masood and Tarray (2019) proposed a cost function for two auxiliary variables under double sampling as

$\displaystyle C_{o}=n^{\prime}C_{1}+n\left(C_{2}+C_{3}\right)$ (22)

with minimum mean square error as

$\displaystyle\textit{MSE}_{\min}\left(\widehat{M}_{P}^{I}\right)=\frac{\left\{\sqrt{C_{1}\left(V_{o}-M_{1}\right)}+\sqrt{\left(C_{2}+C_{3}\right)\left(M_{1}-M_{2}\right)}\right\}^{2}}{C_{0}}-\frac{V_{o}-M_{2}}{N}$ (23)

where

$\displaystyle M_{1}=\frac{\left\{f_{Y}\left(M_{Y}\right)\right\}^{-2}}{4}\left(4\rho_{M_{X}M_{Y}}-1\right)^{2}\text{, and }M_{2}=\frac{\left\{f_{Y}\left(M_{Y}\right)\right\}^{-2}}{4}\left(4\rho_{M_{Y}M_{Z}}-1\right)^{2}.$

6.2 Under the suggested estimator

The question here is whether the reduction in variability is worth the extra expenditure required to observe the auxiliary variables.

Consider a cost function with $C_{0}$ as the fixed cost, $C_{1}$ as the cost per unit of obtaining information on the study variable in the second phase, and $C_{2}$ and $C_{3}$ as the costs per unit of obtaining information on the two auxiliary variables in the first phase. Then, following Singh, Joarder and Tracy (2001) and Allen et al. (2002), a cost function for two auxiliary variables under double sampling is given as

$\displaystyle C_{o}=nC_{1}+n^{\prime}\left(C_{2}+C_{3}\right)$ (24)

In what follows, the optimum first- and second-phase sample sizes for the fixed cost, as well as the fixed variance, case are obtained by considering the Lagrange function

$\displaystyle\varphi=\frac{M_{Y}^{2}C_{M_{Y}}^{2}}{4}\left[\lambda_{2}+\lambda_{3}\rho_{M_{X}M_{Y}}^{2}-\lambda_{1}\frac{\rho_{M_{Y}M_{Z}}^{2}}{4}\right]+\mu\left[nC_{1}+n^{\prime}\left(C_{2}+C_{3}\right)-C_{0}\right]$ (25)

where $\mu$ is the Lagrange multiplier. Differentiating Eq. (25) partially with respect to $n^{\prime}$ and $n$ , equating to zero, and solving gives, respectively,

$\displaystyle n_{\textit{opt}}^{\prime}=\frac{C_{0}\sqrt{M_{Y}^{2}C_{M_{Y}}^{2}\left(1-\rho_{M_{X}M_{Y}}^{2}\right)}}{\sqrt{\left(C_{2}+C_{3}\right)}\left[\sqrt{M_{Y}^{2}C_{M_{Y}}^{2}\left(\rho_{M_{X}M_{Y}}^{2}-\frac{1}{4}\rho_{M_{Y}M_{Z}}^{2}\right)C_{1}}+\sqrt{M_{Y}^{2}C_{M_{Y}}^{2}\left(1-\rho_{M_{X}M_{Y}}^{2}\right)\left(C_{2}+C_{3}\right)}\right]}$ (26)

$\displaystyle n_{\textit{opt}}=\frac{C_{0}\sqrt{M_{Y}^{2}C_{M_{Y}}^{2}\left(\rho_{M_{X}M_{Y}}^{2}-\frac{1}{4}\rho_{M_{Y}M_{Z}}^{2}\right)}}{\sqrt{C_{1}}\left[\sqrt{M_{Y}^{2}C_{M_{Y}}^{2}\left(\rho_{M_{X}M_{Y}}^{2}-\frac{1}{4}\rho_{M_{Y}M_{Z}}^{2}\right)C_{1}}+\sqrt{M_{Y}^{2}C_{M_{Y}}^{2}\left(1-\rho_{M_{X}M_{Y}}^{2}\right)\left(C_{2}+C_{3}\right)}\right]}$ (27)
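The step from Eq. (25) to Eqs. (26) and (27) can be sketched as follows. The shorthand $A$ , $V_{n}$ , and $V_{n^{\prime}}$ is introduced here only for illustration and is not the paper's notation; the sketch assumes $\lambda_{1}=\frac{1}{n}-\frac{1}{N}$ , $\lambda_{2}=\frac{1}{n^{\prime}}-\frac{1}{N}$ , and $\lambda_{3}=\frac{1}{n}-\frac{1}{n^{\prime}}$ , which is how these quantities appear in Eq. (28), and it assumes $\rho_{M_{X}M_{Y}}^{2}>\frac{1}{4}\rho_{M_{Y}M_{Z}}^{2}$ so that $V_{n}>0$ . Collecting the terms in $1/n$ and $1/n^{\prime}$ , with $A=\frac{M_{Y}^{2}C_{M_{Y}}^{2}}{4}$ , $V_{n}=A\left(\rho_{M_{X}M_{Y}}^{2}-\frac{1}{4}\rho_{M_{Y}M_{Z}}^{2}\right)$ , and $V_{n^{\prime}}=A\left(1-\rho_{M_{X}M_{Y}}^{2}\right)$ ,

$\displaystyle\varphi=\frac{V_{n}}{n}+\frac{V_{n^{\prime}}}{n^{\prime}}-\frac{A}{N}\left(1-\frac{\rho_{M_{Y}M_{Z}}^{2}}{4}\right)+\mu\left[nC_{1}+n^{\prime}\left(C_{2}+C_{3}\right)-C_{0}\right]$

$\displaystyle\frac{\partial\varphi}{\partial n}=-\frac{V_{n}}{n^{2}}+\mu C_{1}=0,\qquad\frac{\partial\varphi}{\partial n^{\prime}}=-\frac{V_{n^{\prime}}}{n^{\prime 2}}+\mu\left(C_{2}+C_{3}\right)=0$

$\displaystyle n=\sqrt{\frac{V_{n}}{\mu C_{1}}},\qquad n^{\prime}=\sqrt{\frac{V_{n^{\prime}}}{\mu\left(C_{2}+C_{3}\right)}},\qquad\frac{1}{\sqrt{\mu}}=\frac{C_{0}}{\sqrt{V_{n}C_{1}}+\sqrt{V_{n^{\prime}}\left(C_{2}+C_{3}\right)}}$

The last expression comes from substituting $n$ and $n^{\prime}$ into the cost constraint $nC_{1}+n^{\prime}\left(C_{2}+C_{3}\right)=C_{0}$ . Inserting it back gives sample sizes of the form of Eqs. (26) and (27), with each unit cost appearing under a square root; the common factor $\frac{1}{4}$ in $A$ cancels between numerator and denominator.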

Substituting Eqs. (26) and (27) into Eq. (19), the minimum MSE is obtained as

$\displaystyle\textit{MSE}_{\textit{opt}}\left(\widehat{M}_{\textit{srs}}^{D}(\pi)\right)=\frac{M_{Y}^{2}C_{M_{Y}}^{2}}{4}\left[\left(\frac{1}{n_{\textit{opt}}^{\prime}}-\frac{1}{N}\right)+\left(\frac{1}{n_{\textit{opt}}}-\frac{1}{n_{\textit{opt}}^{\prime}}\right)\rho_{M_{X}M_{Y}}^{2}-\left(\frac{1}{n_{\textit{opt}}}-\frac{1}{N}\right)\frac{\rho_{M_{Y}M_{Z}}^{2}}{4}\right]$ (28)
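The optimum allocation above can be checked numerically. The following is a minimal Python sketch, not the author's code: the function name `optimum_two_phase` and all input values in the usage lines are hypothetical and chosen only for illustration, and each unit cost is taken under the square root, as the Lagrange conditions for the cost function in Eq. (24) imply.

```python
import math


def optimum_two_phase(C0, C1, C2, C3, M_Y, C_MY, rho_xy, rho_yz, N):
    """Return (n_opt, n_prime_opt, mse_opt) in the spirit of Eqs. (26)-(28).

    Hypothetical helper: the argument names mirror the symbols in the text.
    """
    if C1 <= 0 or (C2 + C3) <= 0:
        raise ValueError("unit costs must be positive")
    A = (M_Y * C_MY) ** 2 / 4.0
    v_n = A * (rho_xy ** 2 - rho_yz ** 2 / 4.0)   # multiplies 1/n in the MSE
    v_np = A * (1.0 - rho_xy ** 2)                # multiplies 1/n' in the MSE
    if v_n <= 0.0 or v_np <= 0.0:
        raise ValueError("requires rho_yz^2 / 4 < rho_xy^2 < 1")

    denom = math.sqrt(v_n * C1) + math.sqrt(v_np * (C2 + C3))
    n_opt = C0 * math.sqrt(v_n / C1) / denom
    n_prime_opt = C0 * math.sqrt(v_np / (C2 + C3)) / denom

    # Plug the optimum sizes back into the MSE expression, Eq. (28).
    lam1 = 1.0 / n_opt - 1.0 / N
    lam2 = 1.0 / n_prime_opt - 1.0 / N
    lam3 = 1.0 / n_opt - 1.0 / n_prime_opt
    mse_opt = A * (lam2 + lam3 * rho_xy ** 2 - lam1 * rho_yz ** 2 / 4.0)
    return n_opt, n_prime_opt, mse_opt


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the paper's populations.
    n, n_prime, mse = optimum_two_phase(
        C0=3000, C1=250, C2=15, C3=5,
        M_Y=2.0, C_MY=1.0, rho_xy=0.8, rho_yz=0.6, N=1000,
    )
    print(f"n_opt = {n:.2f}, n'_opt = {n_prime:.2f}, MSE_opt = {mse:.5f}")
    print(f"cost used = {n * 250 + n_prime * (15 + 5):.2f}")
```

The returned sample sizes are real-valued and would need rounding in practice; the sketch also does not enforce $n\leq n^{\prime}\leq N$ , which a real design would have to check.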

7. Application

The data statistics for populations I, II, III, and IV given in Table 2 are used in the empirical investigation under the two-phase sampling scheme; the results are shown in Table 4.

Table 4
Numerical comparison of AB, MSE, and PRE under two-phase sampling

	Population I			Population II			Population III			Population IV
Est	AB	MSE	PRE	AB	MSE	PRE	AB	MSE	PRE	AB	MSE	PRE
$\widehat{M}_{C}$	0.0	$5.7\times 10^{5}$	100	0.0	$1.1\times 10^{5}$	100	0.0	2.22	100	0.0	$1.5\times 10^{-3}$	100
$\widehat{M}_{\textit{SA}}$	26.0	$7.3\times 10^{5}$	78	8.1	$1.5\times 10^{5}$	78	0.03	1.89	117	0.0	$1.4\times 10^{-3}$	107
$\widehat{M}_{S}$	14.7	$5.1\times 10^{5}$	112	2.1	$1.1\times 10^{5}$	103	0.09	0.57	390	0.0	$1.1\times 10^{-3}$	136
$\widehat{M}_{\textit{YH}}$	245.5	$5.1\times 10^{5}$	112	285.3	$1.1\times 10^{5}$	103	0.0	0.57	390	0.0	$1.1\times 10^{-3}$	136
$\widehat{M}_{P}^{I}$	25.7	$5.0\times 10^{5}$	112	1.4	$1.1\times 10^{5}$	104	0.35	0.13	1677	0.0	$1.0\times 10^{-3}$	150
$\widehat{M}_{\textit{srs}}^{D}(\alpha)$	95.0	$5.4\times 10^{5}$	106	44.6	$1.2\times 10^{5}$	93	0.04	1.03	216	0.0	$1.2\times 10^{-3}$	125
$\widehat{M}_{\textit{srs}}^{D}(\pi)$	37.8	$3.4\times 10^{5}$	168	15.5	$6.6\times 10^{4}$	170	0.09	1.25	178	0.0	$7.0\times 10^{-4}$	214
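The PRE columns of Table 4 can be related to the MSE columns with a short calculation. The sketch below assumes the usual definition, PRE = 100 × MSE of the reference estimator $\widehat{M}_{C}$ divided by the MSE of the estimator in question; this definition is not restated in this section, and the function name is hypothetical. The inputs are the rounded MSE values displayed in the table for $\widehat{M}_{C}$ and $\widehat{M}_{srs}^{D}(\pi)$ .

```python
def percent_relative_efficiency(mse_reference, mse_estimator):
    """PRE of an estimator relative to a reference estimator, in percent."""
    if mse_estimator <= 0:
        raise ValueError("MSE of the estimator must be positive")
    return 100.0 * mse_reference / mse_estimator


# (MSE of M_C, MSE of the proposed two-phase estimator), as displayed in Table 4.
TABLE4_MSE = {
    "I": (5.7e5, 3.4e5),
    "II": (1.1e5, 6.6e4),
    "III": (2.22, 1.25),
    "IV": (1.5e-3, 7.0e-4),
}

for population, (mse_c, mse_pi) in TABLE4_MSE.items():
    pre = percent_relative_efficiency(mse_c, mse_pi)
    print(f"Population {population}: PRE = {pre:.1f}")
```

With these rounded inputs the calculation gives values close to the tabulated 168, 178, and 214 for populations I, III, and IV. For population II it gives about 167 against the tabulated 170, presumably because the table was computed from unrounded MSE values.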

Population V: Consider the information provided in Population IV. In an institute, the Director fixed the cost at $C_{0}=\$3000$ for conducting a survey to estimate the median of the total fertility rate in the world. Source: Silverman (1986) and Singh (2003b).

Table 5
PRE of some estimators in two-phase sampling relative to $\widehat{M}_{C}$ for various sample sizes

$n$	$n^{\prime}$	$\widehat{M}_{\textit{SA}}$	$\widehat{M}_{P}^{I}$	$\widehat{M}_{\textit{srs}}^{D}(\alpha)$	$\widehat{M}_{\textit{srs}}^{D}(\pi)$
10	40	107.7	161.5	127.3	323.1
	50	107.7	168.0	127.3	381.8
	60	107.7	168.0	127.3	466.7
15	40	112.5	158.8	128.6	270.0
	50	108.0	168.8	128.6	337.5
	60	108.0	168.8	128.6	385.7
20	40	105.6	158.3	126.7	211.2
	50	111.8	158.3	126.7	271.4
	60	111.8	172.7	126.7	316.7
25	40	107.7	140.0	127.3	175.0
	50	107.7	155.6	127.3	233.3
	60	107.7	155.6	127.3	280.0
30	40	110.0	127.5	122.2	157.1
	50	110.0	157.1	122.2	220.0
	60	110.0	157.1	137.5	275.0

Table 6
PRE of some estimators in two-phase sampling relative to $\widehat{M}_{C}$ under a fixed cost

$C_{1}$ ($)	$C_{2}$ ($)	$C_{3}$ ($)	$\widehat{M}_{\textit{SA}}$	$\widehat{M}_{P}^{I}$	$\widehat{M}^{D}_{\textit{srs}}(\pi)$
250	15	0.0	12284.8	43469.2	47091.6
		2.5	12284.8	43469.2	47091.7
		5.0	12284.8	43469.2	43469.2
	20	0.0	11772.9	43469.2	43469.2
		2.5	11772.9	40364.3	40364.3
		5.0	11772.9	40364.3	37673.3
300	15	0.0	8561.8	33635.7	33635.7
		2.5	8561.8	31393.3	31393.3
		5.0	8561.8	31393.3	31393.3
	20	0.0	8261.4	31393.3	31393.3
		2.5	8261.4	31393.3	29431.3
		5.0	8261.4	29431.3	27700.0
350	15	0.0	6404.8	25218.8	25218.8
		2.5	6404.8	25218.8	23735.3
		5.0	6404.8	23735.3	22416.7
	20	0.0	6113.6	23735.3	22416.7
		2.5	6113.6	23735.3	21236.8
		5.0	6113.6	22416.7	21236.8

7.1 Results under two phase sampling

The absolute bias (AB), mean square error (MSE), and percent relative efficiency (PRE) of the existing and proposed estimators, computed to validate the theoretical claims for two phase sampling, are given in Table 4 for populations I, II, III, and IV. Table 5 shows the PRE of the proposed estimator and of the existing estimators for various sample sizes. In addition, Table 6 gives the PRE under a fixed cost of the survey for the proposed estimator $\widehat{M}_{srs}^{D}(\pi)$ and the existing estimators $\widehat{M}_{C}$ , $\widehat{M}_{SA}$ , and $\widehat{M}_{P}^{I}$ .

8. Discussion

From the results in Table 3, it is observed that the proposed estimator competes favorably with the existing estimators for the four populations considered in this study. As seen in the theoretical derivation, under single phase sampling the proposed estimator $\widehat{M}_{srs}(\pi)$ and the existing estimator $\widehat{M}_{P}^{\prime}$ have the same MSE and outperform the other existing estimators considered in this study, except $\widehat{M}_{PP}^{G}$ , which performs better in populations I and II. Because the proposed estimator is less biased than $\widehat{M}_{P}^{\prime}$ and $\widehat{M}_{PP}^{G}$ , and has a smaller MSE than $\widehat{M}_{\textit{PP}}^{G}$ in populations III and IV, it is considered the better estimator of the population median.

Under two-phase sampling, as shown in Table 4, the proposed estimator has a favorable bias compared to other existing estimators. In terms of MSE and PRE, the proposed estimator outperforms the other existing estimators in populations I, II, and IV, with PRE values of 168, 170, and 214 against 112, 104, and 150 for the next best estimator; in population III, $\widehat{M}_{P}^{I}$ attains the highest PRE (1677). Hence, $\widehat{M}_{srs}^{D}(\pi)$ shows a remarkable gain in efficiency over $\widehat{M}_{P}^{\prime}$ (which has the same PRE under single phase sampling) and over the other existing estimators considered in this study. This gain in efficiency gives direction to the formulation of models in median estimation and to the choice of auxiliary variables.

Again, under two phase sampling, population $V$ was examined with different first-phase sample sizes set against different second-phase sample sizes. As shown in Table 5, increasing the first-phase sample size while holding the second-phase sample size fixed results in outstanding performance of the proposed estimator in terms of gains in efficiency relative to the classical median estimator and the other existing estimators $\widehat{M}_{SA}$ and $\widehat{M}_{P}^{I}$ under two phase sampling. For example, with $n=10$ the PRE of $\widehat{M}_{srs}^{D}(\pi)$ rises from 323.1 at $n^{\prime}=40$ to 466.7 at $n^{\prime}=60$ .

In survey sampling, it is often imperative to find an estimator with a minimum MSE under a fixed cost of the survey (see Allen et al. (2002)). This is illustrated using data from population $V$ to estimate the median of the total fertility rate in the world. As shown in Table 6, the proposed estimator $\widehat{M}_{srs}^{D}(\pi)$ performs better in terms of gains in percent relative efficiency than the ratio estimator $\widehat{M}_{SA}$ and the difference estimator $\widehat{M}_{P}^{I}$ for a fixed cost of the survey $C_{0}=\$3000$ . However, $\widehat{M}_{P}^{I}$ (which uses the cost function of Baig, Masood and Tarray (2019) in Eq. (22), apparently an incorrect formulation) seems to have a slight edge over the proposed estimator as $C_{1}$ (the cost per unit of enumerating the study variable) increases. This is because those authors attached $C_{1}$ to the large first-phase sample size $n^{\prime}$ , thereby buying efficiency at a higher cost. In contrast, the proposed estimator enumerates the study variable only in the smaller second-phase sample, which is cost effective, improves efficiency, and agrees with the concept of a double sampling scheme.
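The difference between the two cost functions can be made concrete with a small calculation. The sketch below evaluates Eq. (24) and Eq. (22) for the same sample sizes; the function names and the sample sizes $n=8$ , $n^{\prime}=40$ are hypothetical and chosen only for illustration, while the unit costs are among those listed in Table 6.

```python
def cost_eq24(n, n_prime, C1, C2, C3):
    """Eq. (24): study variable costed on the second-phase sample n."""
    return n * C1 + n_prime * (C2 + C3)


def cost_eq22(n, n_prime, C1, C2, C3):
    """Eq. (22): C1 attached to the first-phase sample n'."""
    return n_prime * C1 + n * (C2 + C3)


# Hypothetical sample sizes; unit costs as in one row of Table 6.
n, n_prime = 8, 40
C1, C2, C3 = 250, 15, 5.0

print("Eq. (24) cost:", cost_eq24(n, n_prime, C1, C2, C3))
print("Eq. (22) cost:", cost_eq22(n, n_prime, C1, C2, C3))
```

For these illustrative values the design costs 2800 under Eq. (24), which is within $C_{0}=\$3000$ , but 10160 under Eq. (22), because the expensive unit cost $C_{1}$ is multiplied by the larger first-phase sample size.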

8.1 Conclusion

This study was conceived to elucidate the direction of formulating models for enhancing efficiency in the estimation of the population median. It has been observed that the proposed estimators in single and two-phase sampling are special members of the generalized class of estimators proposed by Srivastava (1971), here using the medians of two auxiliary variables. Given the several works on improving the exponential ratio estimator, which seems to be the most robust among the classes of estimators for estimating the population median, it is evident that the proposed estimator has favorable qualities in single-phase sampling and stands out in two-phase sampling compared to others. Having examined the proposed estimator and other existing estimators in double sampling with varying sample sizes and a fixed cost of the survey, it is clear that the former will be preferred for estimating the population median in two-phase sampling for greater gains in efficiency at a minimum cost. Consequently, the proposed estimator is suitable and recommended when the variable under study has a skewed distribution.

Acknowledgments

With profound gratitude, I acknowledge the anonymous reviewer for his expert review of the manuscript to see that it meets the required standard of the Journal.

References

1. Aczel, A.D., & Sounderpandian (2004). Complete business statistics (5th ed.). New York: McGraw Hill.

2. Aladag & Cingi (2015). Improvement in estimating the population median in simple random sampling and stratified random sampling using auxiliary information. Communications in Statistics-Theory and Methods, 45(5), 1013-1032. doi: 10.1080/03610926.2012.753090.

3. Allen, Singh, H.P., Singh, & Smarandache (2002). A generalized class of estimators of population median using two auxiliary variables in double sampling. In Randomness and Optimal Estimation in Data Sampling (2nd ed., pp. 26-43). American Research Press, USA.

4. Bahl & Tuteja, R.K. (1991). Ratio and product type exponential estimator. Journal of Information and Optimization Sciences, 12(1), 159-164.

5. Baig, Masood, & Tarray, T.A. (2019). Improved class of difference-type estimators for population median in survey sampling. Communications in Statistics-Theory and Methods. doi: 10.1080/03610926.2019.1622017.

6. Enang, E.I., Etuk, S.I., Ekpenyong, E.J., & Akpan, V.M. (2016). An alternative exponential estimator of population median. International Journal of Statistics and Economics, 17(3), 85-97.

7. Francisco, C.A., & Fuller, W.A. (1991). Quantile estimation with a complex survey design. The Annals of Statistics, 19, 454-469.

8. Gross, T.S. (1980). Median estimation in sample surveys. In American Statistical Association Proceedings of Survey Research Methodology Section, 181-184.

9. Iseh, M.J. (2020). Enhancing efficiency of ratio estimator of population median by calibration techniques. International Journal of Engineering Sciences & Research Technology, 9(8), 14-23.

10. Iseh, M.J. (2021). Towards the efficiency of the ratio estimator for population median in sampling survey. International Journal of Innovation Science, Engineering & Technology, 8(6), 518-533.

11. Jhajj, H.S., Kaur, & Jhajj (2016). Efficient family of estimators of median using two-phase sampling design. Communications in Statistics-Theory and Methods, 45(15), 4325-4331. doi: 10.1080/03610926.2014.911912.

12. Kadilar & Cingi (2004). Ratio estimators in simple random sampling. Applied Mathematics and Computation, 151, 893-902.

13. Kuk, A.Y.C., & Mak, T.K. (1989). Median estimation in the presence of auxiliary variable. Journal of the Royal Statistical Society, Series B, 51, 261-269. doi: 10.1111/j.2517-6161.1989.tb01763.x.

14. MFA (2004). Crops area production. Government of Pakistan, Ministry of Food, Agriculture and Livestock. Islamabad, Pakistan: Economic Wing.

15. Shabbir & Gupta (2017). A generalized class of difference-type estimator for population median in survey sampling. Hacettepe Journal of Mathematics and Statistics, 46, 1015-1028. doi: 10.15672/HJMS.201610614759.

16. Silverman, B.W. (1986). Density estimation for statistics and data analysis. Monographs on Statistics and Applied Probability. London: Chapman and Hall.

17. Singh (2003a). Advanced Sampling Theory with Applications: How Michael 'Selected' Amy (Vols. I and II). Kluwer Academic Publishers, the Netherlands.

18. Singh (2003b). Advanced Sampling Theory with Applications: How Michael 'Selected' Amy (Vol. 2). Springer Science & Business Media.

19. Singh, Joarder, A.H., & Tracy, D.S. (2001). Median estimation using double sampling. Australian and New Zealand Journal of Statistics, 43, 33-46. doi: 10.1111/1467-842X.00153.

20. Singh, H.P., & Upadhyaya, L.N. (2007). Chain ratio and regression-type estimators for median estimation in survey sampling. Statistical Papers, 48, 23-46. doi: 10.1007/s00362-006-0314y.

21. Singh, H.P., Singh, & Puertas, S.M. (2003). Ratio-type estimators for the median of finite populations. Allgemeines Statistisches Archiv, 87, 369-382.

22. Singh, H.P., & Solanki, R.S. (2013). Some classes of estimators for the population median using auxiliary information. Communications in Statistics, 42, 4222-4238.

23. Srivastava, S.K., & Jhajj, H.S. (1995). Classes of estimators of finite population mean and variance using auxiliary information. Jour. Ind. Soc. Ag. Statistics, 47(2), 119-128.

24. Srivastava, S.K. (1971). A generalized estimator for the mean of a finite population using multi-auxiliary information. Jour. Amer. Statist. Assoc., 66(334), 404-407.
