The Improved Estimation of Ratio of Two Population Proportions

Abstract

In this article, we first obtain the correct mean square error expression of Gupta and Shabbir’s linear weighted estimator of the ratio of two population proportions. We then suggest a general class of estimators of the ratio of two population proportions. The usual ratio estimator, the Wynn-type estimator, the Singh, Singh, and Kaur difference-type estimator, and the Gupta and Shabbir estimator are shown to be members of the suggested class.

Keywords

ratio of proportions, bias, mean square error, efficiency, auxiliary variable

Introduction

For estimating the population proportion of a character, Wynn (1976) proposed and studied an estimator that uses auxiliary information on another character (in the form of a known population proportion of the auxiliary character) when the population is classified into two classes according to both the main and auxiliary characters. Rao (1977) extended the results to the case in which the numbers of classes are equal for the main and auxiliary characters. Sometimes, however, the quantity of interest is the ratio of two population proportions rather than a single proportion. For example, in a specified region, we may be interested in the ratio of the proportion of persons suffering from lung cancer to the proportion of persons suffering from some other disease. Such a ratio indicates the extent to which one disease is spreading compared to another in that region, where the effect of causes such as smoking, drinking, poor environmental conditions, and standard of living is taken into account. Similarly, the incidence of tuberculosis infection is one of the indices used to study the epidemiology of tuberculosis in a community. It is defined as the proportion of newly infected individuals during a specified period among individuals exposed to the risk of infection during that period. A first test is made to identify the uninfected persons, who form the population exposed to risk between the two tests, and these uninfected persons are then tested again to determine the number newly infected during the observation period. Estimating the ratio of the proportion of newly infected persons to the proportion of persons uninfected at the first test is then the problem of interest. For this purpose, Singh, Singh, and Kaur (1986) suggested a difference-type estimator.
Later, Gupta and Shabbir (2008) suggested a linear weighted estimator of the ratio of population proportions. The objective of this study is to propose an alternative class of estimators of the ratio of population proportions that generalizes, and is more efficient than, the existing estimators.

Consider a population classified according to two variables y and x. The objective is to estimate the ratio of the proportions of population units falling in two specific categories of y and x. We assume that an auxiliary variable z, which is strongly associated with both y and x, is also available. For example, in epidemiological research, one may be interested in the prevalence of disease A relative to disease B, using information provided by an auxiliary characteristic (such as the extent of smoking) that is strongly associated with both A and B.

Let Ω = (Ω₁, Ω₂, …, Ω_N) be a finite population of size N. Let A = (A₁, A₂, …, A_a), B = (B₁, B₂, …, B_b), and C = (C₁, C₂, …, C_c) be the partitions of Ω according to the characteristics y, x, and z, respectively, and let $N_{i 00}$, $N_{0 j 0}$, and $N_{00 k}$ be the numbers of population units in the ith, jth, and kth subclasses A_i (i = 1, 2,…, a), B_j (j = 1, 2,…, b), and C_k (k = 1, 2,…, c), respectively, of Ω, such that $\sum_{i = 1}^{a} N_{i 00} = \sum_{j = 1}^{b} N_{0 j 0} = \sum_{k = 1}^{c} N_{00 k} = N$. We draw a simple random sample of size n without replacement from Ω. Let $n_{i 00}$, $n_{0 j 0}$, and $n_{00 k}$ be the sample quantities analogous to $N_{i 00}$, $N_{0 j 0}$, and $N_{00 k}$ for y, x, and z, respectively. Let us also define $N_{i j 0}$ to be the number of population units that belong to A_i ∩ B_j. We can similarly define $N_{i 0 k}$, $N_{0 j k}$, and the corresponding sample quantities. Let $P_{i 00} = \frac{N_{i 00}}{N}$, $P_{0 j 0} = \frac{N_{0 j 0}}{N}$, $P_{00 k} = \frac{N_{00 k}}{N}$, $P_{i j 0} = \frac{N_{i j 0}}{N}$, $P_{0 j k} = \frac{N_{0 j k}}{N}$, $P_{i 0 k} = \frac{N_{i 0 k}}{N}$, and $p_{i 00} = \frac{n_{i 00}}{n}$, $p_{0 j 0} = \frac{n_{0 j 0}}{n}$, $p_{00 k} = \frac{n_{00 k}}{n}$, $p_{i j 0} = \frac{n_{i j 0}}{n}$, $p_{0 j k} = \frac{n_{0 j k}}{n}$, $p_{i 0 k} = \frac{n_{i 0 k}}{n}$ be the population and sample proportions, respectively, for (i = 1, 2,…, a), (j = 1, 2,…, b), and (k = 1, 2,…, c).

We are interested in estimating

R = \frac{P_{i 00}}{P_{0 j 0}},

for (i = 1, 2,…, a) and (j = 1, 2,…, b), using the known value of $P_{00 k}$. We define the following terms:

ξ_{i 00} = \frac{p_{i 00} - P_{i 00}}{P_{i 00}}, ξ_{0 j 0} = \frac{p_{0 j 0} - P_{0 j 0}}{P_{0 j 0}}, ξ_{00 k} = \frac{p_{00 k} - P_{00 k}}{P_{00 k}}

such that

E (ξ_{i 00}) = E (ξ_{0 j 0}) = E (ξ_{00 k}) = 0

and to the first degree of approximation

E (ξ_{i 00}^{2}) = \frac{V a r (p_{i 00})}{P_{i 00}^{2}} = θ \frac{P_{i 00} (1 - P_{i 00})}{P_{i 00}^{2}} = λ_{i 00},

E (ξ_{0 j 0}^{2}) = \frac{V a r (p_{0 j 0})}{P_{0 j 0}^{2}} = θ \frac{P_{0 j 0} (1 - P_{0 j 0})}{P_{0 j 0}^{2}} = λ_{0 j 0},

E (ξ_{00 k}^{2}) = \frac{V a r (p_{00 k})}{P_{00 k}^{2}} = θ \frac{P_{00 k} (1 - P_{00 k})}{P_{00 k}^{2}} = λ_{00 k},

E (ξ_{i 00} ξ_{0 j 0}) = \frac{C o v (p_{i 00}, p_{0 j 0})}{P_{i 00} P_{0 j 0}} = θ \frac{(P_{i j 0} - P_{i 00} P_{0 j 0})}{P_{i 00} P_{0 j 0}} = λ_{i j 0},

E (ξ_{i 00} ξ_{00 k}) = \frac{C o v (p_{i 00}, p_{00 k})}{P_{i 00} P_{00 k}} = θ \frac{(P_{i 0 k} - P_{i 00} P_{00 k})}{P_{i 00} P_{00 k}} = λ_{i 0 k},

E (ξ_{0 j 0} ξ_{00 k}) = \frac{C o v (p_{0 j 0}, p_{00 k})}{P_{0 j 0} P_{00 k}} = θ \frac{(P_{0 j k} - P_{0 j 0} P_{00 k})}{P_{0 j 0} P_{00 k}} = λ_{0 j k}

where $θ = \frac{N - n}{n (N - 1)}$ .
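The moment terms above depend only on θ and the population proportions. As a minimal sketch (the function names `fpc_theta` and `lambda_terms` and all numerical inputs are illustrative assumptions, not taken from the article), they could be computed with exact fractions as follows:

```python
from fractions import Fraction


def fpc_theta(N, n):
    """theta = (N - n) / (n (N - 1)) for simple random sampling without replacement."""
    return Fraction(N - n, n * (N - 1))


def lambda_terms(theta, P_i, P_j, P_k, P_ij, P_ik, P_jk):
    """Return the six lambda quantities defined above, keyed by their subscripts."""
    return {
        "i00": theta * (1 - P_i) / P_i,
        "0j0": theta * (1 - P_j) / P_j,
        "00k": theta * (1 - P_k) / P_k,
        "ij0": theta * (P_ij - P_i * P_j) / (P_i * P_j),
        "i0k": theta * (P_ik - P_i * P_k) / (P_i * P_k),
        "0jk": theta * (P_jk - P_j * P_k) / (P_j * P_k),
    }


# Hypothetical population: N = 100, n = 20, and made-up proportions.
theta = fpc_theta(N=100, n=20)
lam = lambda_terms(
    theta,
    P_i=Fraction(1, 2), P_j=Fraction(2, 5), P_k=Fraction(3, 10),
    P_ij=Fraction(1, 4), P_ik=Fraction(1, 5), P_jk=Fraction(3, 20),
)
```

Keeping the values as `Fraction` objects avoids rounding when the terms are later combined into bias and MSE expressions.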

Known Estimators of Ratio of Proportions

In this section, we review some existing estimators of the ratio of population proportions R.

The Usual Ratio Estimator

The usual ratio estimator of ratio of population proportions R is defined as

{\hat{R}}_{0} = \frac{p_{i 00}}{p_{0 j 0}} .

The bias and mean square error (MSE) of ${\hat{R}}_{0}$ , to the first order of approximation, are given by:

B i a s ({\hat{R}}_{0}) \approx θ \frac{1}{P_{0 j 0}^{2}} [P_{i 00} - P_{i j 0}],

M S E ({\hat{R}}_{0}) \approx θ \frac{R}{P_{0 j 0}^{2}} [P_{i 00} + P_{0 j 0} - 2 P_{i j 0}] .
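As a rough numerical illustration of the two expressions above, the sketch below evaluates them for the category i = j = 1 of the population described in Tables 1–3 (N = 34). The function name `usual_ratio_bias_mse` is an assumption of this sketch, and θ is set to 1 as a placeholder because the sample size n is not stated in the text, so the outputs are the bias and MSE per unit of θ.

```python
from fractions import Fraction


def usual_ratio_bias_mse(theta, P_i, P_j, P_ij):
    """First-order bias and MSE of the usual ratio estimator p_i00 / p_0j0."""
    R = P_i / P_j
    bias = theta * (P_i - P_ij) / P_j**2
    mse = theta * R * (P_i + P_j - 2 * P_ij) / P_j**2
    return R, bias, mse


# Counts for i = j = 1 read from Tables 1-3: N_i00 = 22, N_0j0 = 23, N_ij0 = 19.
N = 34
R, bias_per_theta, mse_per_theta = usual_ratio_bias_mse(
    Fraction(1),  # placeholder theta; n is not given in the text
    Fraction(22, N), Fraction(23, N), Fraction(19, N),
)
```

Multiplying `bias_per_theta` and `mse_per_theta` by the actual θ would give the first-order bias and MSE for a specific sample size.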

The Wynn-Type Estimator

Wynn (1976) suggested a difference-type estimator of the population proportion. Singh, Singh, and Kaur (1986) modified it to estimate the ratio of two proportions and called the result the Wynn-type estimator. The estimator is defined as

{\hat{R}}_{W} = [\frac{p_{i 00}}{p_{0 j 0}} + (P_{00 k} - p_{00 k})] .

To the first order of approximation, the bias and MSE of ${\hat{R}}_{W}$ are given by

B i a s ({\hat{R}}_{W}) \approx B i a s ({\hat{R}}_{0}) .

M S E ({\hat{R}}_{W}) \approx M S E ({\hat{R}}_{0}) + θ [P_{00 k} (1 - P_{00 k})] - 2 R θ [\frac{P_{i 0 k}}{P_{i 00}} - \frac{P_{0 j k}}{P_{0 j 0}}] .

The Singh, Singh, and Kaur Estimator

The difference-type estimator of R suggested by Singh, Singh, and Kaur (1986) is as follows:

{\hat{R}}_{S} = [\frac{p_{i 00}}{p_{0 j 0}} + d (P_{00 k} - p_{00 k})],

where d is a constant.

To the first order of approximation, the bias and MSE of ${\hat{R}}_{S}$ are given by

B i a s ({\hat{R}}_{S}) \approx B i a s ({\hat{R}}_{0}),

M S E ({\hat{R}}_{S}) \approx M S E ({\hat{R}}_{0}) + d^{2} θ [P_{00 k} (1 - P_{00 k})] - 2 d R θ [\frac{P_{i 0 k}}{P_{i 00}} - \frac{P_{0 j k}}{P_{0 j 0}}] .

The MSE of ${\hat{R}}_{S}$ at equation (10) is minimized for

d = R \frac{[\frac{P_{i 0 k}}{P_{i 00}} - \frac{P_{0 j k}}{P_{0 j 0}}]}{P_{00 k} (1 - P_{00 k})} .

Thus, the minimum MSE of ${\hat{R}}_{S}$ is given by:

M S E {({\hat{R}}_{S})}_{min} \approx M S E ({\hat{R}}_{0}) - R^{2} θ \frac{{[\frac{P_{i 0 k}}{P_{i 00}} - \frac{P_{0 j k}}{P_{0 j 0}}]}^{2}}{P_{00 k} (1 - P_{00 k})} .
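The optimum d and the resulting minimum MSE can be checked numerically. The following is a minimal sketch under made-up inputs (θ and all proportions are hypothetical, and `ssk_mse` and `ssk_optimum` are names introduced here); D denotes the bracketed difference $\frac{P_{i 0 k}}{P_{i 00}} - \frac{P_{0 j k}}{P_{0 j 0}}$.

```python
from fractions import Fraction


def ssk_mse(mse_r0, theta, R, P_k, D, d):
    """First-order MSE of the difference-type estimator for a given constant d."""
    return mse_r0 + d**2 * theta * P_k * (1 - P_k) - 2 * d * R * theta * D


def ssk_optimum(mse_r0, theta, R, P_k, D):
    """Optimum d and the corresponding minimum MSE."""
    v = P_k * (1 - P_k)
    d_opt = R * D / v
    mse_min = mse_r0 - R**2 * theta * D**2 / v
    return d_opt, mse_min


# Hypothetical inputs, for illustration only.
theta = Fraction(1, 10)
P_i, P_j, P_k = Fraction(1, 2), Fraction(2, 5), Fraction(3, 10)
P_ij, P_ik, P_jk = Fraction(1, 4), Fraction(1, 5), Fraction(3, 20)

R = P_i / P_j
mse_r0 = theta * R * (P_i + P_j - 2 * P_ij) / P_j**2
D = P_ik / P_i - P_jk / P_j
d_opt, mse_min = ssk_optimum(mse_r0, theta, R, P_k, D)
```

Setting d = 0 recovers the MSE of the usual ratio estimator and d = 1 gives the Wynn-type estimator, so both can never have a smaller first-order MSE than the value at `d_opt`.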

The Gupta and Shabbir Estimator

The linear weighted estimator of R suggested by Gupta and Shabbir (2008) is

{\hat{R}}_{G S} = [α \frac{p_{i 00}}{p_{0 j 0}} + β_{k} (P_{00 k} - p_{00 k}) \frac{P_{00 k}}{p_{00 k}}],

where (α, β_k) are suitably chosen constants.

To the first order of approximation, the bias and MSE of ${\hat{R}}_{G S}$ are given by

B i a s ({\hat{R}}_{G S}) \approx (α - 1) R + α θ \frac{1}{P_{0 j 0}^{2}} (P_{i 00} - P_{i j 0}) + β_{k} θ (1 - P_{00 k}),

\begin{aligned} M S E ({\hat{R}}_{G S}) \approx {(α - 1)}^{2} R^{2} + α^{2} M S E ({\hat{R}}_{0}) + β_{k}^{2} θ P_{00 k} (1 - P_{00 k}) \\ - 2 α β_{k} R θ (\frac{P_{i 0 k}}{P_{i 00}} - \frac{P_{0 j k}}{P_{0 j 0}}) . \end{aligned}

The MSE of ${\hat{R}}_{G S}$ is minimized for

α = \frac{R^{2}}{R^{2} + {M S E}_{min} ({\hat{R}}_{S})} = α^{*},

β_{k} = α^{*} R \frac{(\frac{P_{i 0 k}}{P_{i 00}} - \frac{P_{0 j k}}{P_{0 j 0}})}{P_{00 k} (1 - P_{00 k})} = β_{k}^{*} .

Thus, the resulting minimum MSE of ${\hat{R}}_{G S}$ is given by:

{M S E}_{min} ({\hat{R}}_{G S}) \approx \frac{R^{2} {M S E}_{min} ({\hat{R}}_{S})}{R^{2} + {M S E}_{min} ({\hat{R}}_{S})} .

It is to be noted that the MSE expression of ${\hat{R}}_{G S}$ at equation (15) obtained by Gupta and Shabbir (2008) is not correct and thus the entire study carried out in the article by Gupta and Shabbir is erroneous except concerning the bias. Keeping this in view, we have proposed the generalized class of Gupta and Shabbir's estimator ${\hat{R}}_{G S}$ with its properties and obtained the correct MSE expression of Gupta and Shabbir estimator ${\hat{R}}_{G S}$ .

The Suggested Class of Estimators

We suggest the following generalized class of estimators of ratio of population proportions R:

{\hat{R}}_{S S} = [α (\frac{p_{i 00}}{p_{0 j 0}}) {(\frac{p_{00 k}}{P_{00 k}})}^{η_{k}} + β_{k} (P_{00 k} - p_{00 k})] {(\frac{P_{00 k}}{p_{00 k}})}^{δ_{k}},

where (α, β_k) are suitably chosen constants and (η_k, δ_k) are suitably chosen scalars (k = 1, 2,…, c).

It is interesting to note the following:

For $(α, β_{k}, η_{k}, δ_{k}) = (1, 0, 0, 0), {\hat{R}}_{S S} \to {\hat{R}}_{0},$ (Usual ratio estimator)

For $(α, β_{k}, η_{k}, δ_{k}) = (1, 1, 0, 0), {\hat{R}}_{S S} \to {\hat{R}}_{W},$ (Wynn-type estimator 1976)

For $(α, β_{k}, η_{k}, δ_{k}) = (1, d, 0, 0), {\hat{R}}_{S S} \to {\hat{R}}_{S},$ (Singh, Singh, and Kaur estimator 1986)

For $(α, β_{k}, η_{k}, δ_{k}) = (α, β_{k}, 1, 1), {\hat{R}}_{S S} \to {\hat{R}}_{G S},$ (Gupta and Shabbir, 2008).
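The four special cases listed above can be verified directly from the definition of the class. The sketch below uses hypothetical sample proportions and constants; the function name `r_ss` is introduced here for illustration.

```python
import math


def r_ss(p_i, p_j, p_k, P_k, alpha, beta, eta, delta):
    """Suggested class: [alpha (p_i/p_j)(p_k/P_k)^eta + beta (P_k - p_k)] (P_k/p_k)^delta."""
    core = alpha * (p_i / p_j) * (p_k / P_k) ** eta + beta * (P_k - p_k)
    return core * (P_k / p_k) ** delta


# Hypothetical sample proportions, known P_00k, and constants.
p_i, p_j, p_k, P_k = 0.30, 0.40, 0.55, 0.50
d, alpha, beta = 0.7, 0.9, 0.2

r_0 = r_ss(p_i, p_j, p_k, P_k, 1, 0, 0, 0)          # usual ratio estimator
r_w = r_ss(p_i, p_j, p_k, P_k, 1, 1, 0, 0)          # Wynn-type estimator
r_s = r_ss(p_i, p_j, p_k, P_k, 1, d, 0, 0)          # Singh, Singh, and Kaur
r_gs = r_ss(p_i, p_j, p_k, P_k, alpha, beta, 1, 1)  # Gupta and Shabbir
```

For (η_k, δ_k) = (1, 1) the factor (p_00k / P_00k) multiplying the ratio cancels against (P_00k / p_00k), which is why the Gupta and Shabbir form is recovered.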

Expressing equation (19) in terms of ξs, we have

{\hat{R}}_{S S} = [α R (1 + ξ_{i 00}) {(1 + ξ_{0 j 0})}^{- 1} {(1 + ξ_{00 k})}^{θ_{k}} - β_{k} P_{00 k} ξ_{00 k} {(1 + ξ_{00 k})}^{- δ_{k}}],

where θ_k = (η_k − δ_k).

We assume that $|ξ_{0 j 0}| < 1$ , $|ξ_{00 k}| < 1$ , so that ${(1 + ξ_{0 j 0})}^{- 1}$ , ${(1 + ξ_{00 k})}^{θ_{k}}$ , and ${(1 + ξ_{00 k})}^{- δ_{k}}$ are binomially expandable. Now expanding the right-hand side of equation (20), we have:

\begin{aligned} {\hat{R}}_{S S} & = [α R (1 + ξ_{i 00}) (1 - ξ_{0 j 0} + ξ_{0 j 0}^{2} - . . .) (1 + θ_{k} ξ_{00 k} + \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{00 k}^{2} + . . .) \\ - β_{k} P_{00 k} ξ_{00 k} (1 - δ_{k} ξ_{00 k} + \frac{δ_{k} (δ_{k} + 1)}{2} ξ_{00 k}^{2} - . . .)] \\ & = [α R (1 + ξ_{i 00} - ξ_{0 j 0} - ξ_{i 00} ξ_{0 j 0} + ξ_{0 j 0}^{2} + ξ_{i 00} ξ_{0 j 0}^{2} + θ_{k} ξ_{00 k} + θ_{k} ξ_{i 00} ξ_{00 k} \\ - θ_{k} ξ_{0 j 0} ξ_{00 k} - θ_{k} ξ_{i 00} ξ_{0 j 0} ξ_{00 k} + θ_{k} ξ_{0 j 0}^{2} ξ_{00 k} + θ_{k} ξ_{i 00} ξ_{0 j 0}^{2} ξ_{00 k} \\ + \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{00 k}^{2} + \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{i 00} ξ_{00 k}^{2} - \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{0 j 0} ξ_{00 k}^{2} \\ - \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{i 00} ξ_{0 j 0} ξ_{00 k}^{2} + \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{0 j 0}^{2} ξ_{00 k}^{2} + \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{i 00} ξ_{0 j 0}^{2} ξ_{00 k}^{2} - . . .) \\ - β_{k} P_{00 k} (ξ_{00 k} - δ_{k} ξ_{00 k}^{2} + \frac{δ_{k} (δ_{k} + 1)}{2} ξ_{00 k}^{3} - . . .)] . \end{aligned}

Now neglecting the terms of ξs with order greater than 2, we have:

\begin{aligned} {\hat{R}}_{S S} & \approx α R [1 + ξ_{i 00} - ξ_{0 j 0} - ξ_{i 00} ξ_{0 j 0} + ξ_{0 j 0}^{2} + θ_{k} ξ_{00 k} + θ_{k} (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) \\ + \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{00 k}^{2}] - β_{k} P_{00 k} (ξ_{00 k} - δ_{k} ξ_{00 k}^{2}) \end{aligned}

\begin{aligned} ({\hat{R}}_{S S} - R) & \approx R [α \{1 + ξ_{i 00} - ξ_{0 j 0} - ξ_{i 00} ξ_{0 j 0} + ξ_{0 j 0}^{2} + θ_{k} ξ_{00 k} \\ + θ_{k} (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) + \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{00 k}^{2}\} - 1] \\ - β_{k} P_{00 k} (ξ_{00 k} - δ_{k} ξ_{00 k}^{2}) . \end{aligned}

Taking expectation on both sides of equation (21), we get the bias of suggested class of estimators ${\hat{R}}_{S S}$ to the first order of approximation as:

\begin{aligned} B i a s ({\hat{R}}_{S S}) \approx R [α \{1 + λ_{0 j 0} - λ_{i j 0} + θ_{k} (λ_{i 0 k} - λ_{0 j k}) \\ + \frac{θ_{k} (θ_{k} - 1)}{2} λ_{00 k}\} - 1] + β_{k} P_{00 k} δ_{k} λ_{00 k} . \end{aligned}

Squaring both sides of equation (21) and neglecting terms of ξs with order greater than 2, we have:

\begin{aligned} {({\hat{R}}_{S S} - R)}^{2} & \approx R^{2} [α^{2} \{1 + ξ_{i 00}^{2} + 3 ξ_{0 j 0}^{2} + θ_{k}^{2} ξ_{00 k}^{2} + 2 ξ_{i 00} - 2 ξ_{0 j 0} + 2 θ_{k} ξ_{00 k} \\ - 4 ξ_{i 00} ξ_{0 j 0} + 4 θ_{k} (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) + θ_{k} (θ_{k} - 1) ξ_{00 k}^{2}\} + 1 \\ - 2 α \{1 + ξ_{i 00} - ξ_{0 j 0} + θ_{k} ξ_{00 k} - ξ_{i 00} ξ_{0 j 0} + ξ_{0 j 0}^{2} \\ + θ_{k} (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) + \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{00 k}^{2}\}] + {(β_{k})}^{2} P_{00 k}^{2} ξ_{00 k}^{2} \\ - 2 R β_{k} P_{00 k} [α \{ξ_{00 k} + (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) + (θ_{k} - δ_{k}) ξ_{00 k}^{2}\} \\ - ξ_{00 k} + δ_{k} ξ_{00 k}^{2}] . \end{aligned}

Taking expectation on both sides of equation (23), we get the MSE of suggested class of estimators ${\hat{R}}_{S S}$ to the first order of approximation as:

M S E ({\hat{R}}_{S S}) \approx R^{2} [1 + α^{2} γ_{1} + {(β_{k})}^{2} γ_{2} + 2 α β_{k} γ_{3} - 2 α γ_{4} - 2 β_{k} γ_{5}],

where

γ_{1} = [1 + λ_{i 00} + 3 λ_{0 j 0} - 4 λ_{i j 0} + 4 θ_{k} (λ_{i 0 k} - λ_{0 j k}) + θ_{k} \{2 θ_{k} - 1\} λ_{00 k}],

γ_{2} = R^{- 2} P_{00 k}^{2} λ_{00 k},

γ_{3} = R^{- 1} P_{00 k} [(2 δ_{k} - η_{k}) λ_{00 k} + (λ_{0 j k} - λ_{i 0 k})],

γ_{4} = [1 + λ_{0 j 0} - λ_{i j 0} + θ_{k} (λ_{i 0 k} - λ_{0 j k}) + (1 / 2) θ_{k} (θ_{k} - 1) λ_{00 k}],

γ_{5} = R^{- 1} δ_{k} P_{00 k} λ_{00 k} .

Minimizing equation (24) with respect to α and β_k yields the normal equations:

[\begin{matrix} γ_{1} & γ_{3} \\ γ_{3} & γ_{2} \end{matrix}] [\begin{matrix} α \\ β_{k} \end{matrix}] = [\begin{matrix} γ_{4} \\ γ_{5} \end{matrix}] .

Solving equation (25) for α and β_k, we get the optimum values of α and β_k, respectively, as

\begin{matrix} α = [\frac{γ_{2} γ_{4} - γ_{3} γ_{5}}{γ_{1} γ_{2} - γ_{3}^{2}}] = α_{(o p t)} \\ β_{k} = [\frac{γ_{1} γ_{5} - γ_{3} γ_{4}}{γ_{1} γ_{2} - γ_{3}^{2}}] = β_{k (o p t)} \end{matrix}\} .

Thus, the resulting minimum MSE of ${\hat{R}}_{S S}$ is given by

{M S E}_{min} ({\hat{R}}_{S S}) \approx R^{2} [1 - \frac{(γ_{2} γ_{4}^{2} - 2 γ_{3} γ_{4} γ_{5} + γ_{1} γ_{5}^{2})}{(γ_{1} γ_{2} - γ_{3}^{2})}] .
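To make the optimization step concrete, the sketch below computes γ₁ to γ₅, solves the 2 × 2 normal equations in closed form, and confirms that the MSE evaluated at the optimum equals the minimum MSE expression. The λ values, R, and P_00k are hypothetical, and the function names `gammas`, `mse_ss`, and `optimum` are assumptions of this sketch.

```python
from fractions import Fraction


def gammas(R, P_k, lam, eta, delta):
    """gamma_1 .. gamma_5 as defined above; t plays the role of theta_k = eta_k - delta_k."""
    t = eta - delta
    g1 = (1 + lam["i00"] + 3 * lam["0j0"] - 4 * lam["ij0"]
          + 4 * t * (lam["i0k"] - lam["0jk"]) + t * (2 * t - 1) * lam["00k"])
    g2 = P_k**2 * lam["00k"] / R**2
    g3 = P_k * ((2 * delta - eta) * lam["00k"] + lam["0jk"] - lam["i0k"]) / R
    g4 = (1 + lam["0j0"] - lam["ij0"] + t * (lam["i0k"] - lam["0jk"])
          + t * (t - 1) * lam["00k"] / 2)
    g5 = delta * P_k * lam["00k"] / R
    return g1, g2, g3, g4, g5


def mse_ss(R, g, alpha, beta):
    """First-order MSE of the suggested class for given alpha and beta_k."""
    g1, g2, g3, g4, g5 = g
    return R**2 * (1 + alpha**2 * g1 + beta**2 * g2
                   + 2 * alpha * beta * g3 - 2 * alpha * g4 - 2 * beta * g5)


def optimum(R, g):
    """Closed-form solution of the normal equations and the minimum MSE."""
    g1, g2, g3, g4, g5 = g
    det = g1 * g2 - g3**2
    alpha = (g2 * g4 - g3 * g5) / det
    beta = (g1 * g5 - g3 * g4) / det
    mse_min = R**2 * (1 - (g2 * g4**2 - 2 * g3 * g4 * g5 + g1 * g5**2) / det)
    return alpha, beta, mse_min


# Hypothetical lambda values (multiples of a made-up theta), R, and P_00k.
theta = Fraction(4, 99)
lam = {"i00": theta, "0j0": 3 * theta / 2, "00k": 7 * theta / 3,
       "ij0": theta / 4, "i0k": theta / 3, "0jk": theta / 4}
R, P_k = Fraction(5, 4), Fraction(3, 10)

g = gammas(R, P_k, lam, eta=1, delta=1)
alpha_opt, beta_opt, mse_min = optimum(R, g)
```

Because the MSE is a quadratic form in (α, β_k), any other pair of constants, such as (1, 0), cannot give a smaller first-order MSE when γ₁γ₂ − γ₃² > 0.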

Remark 1

Equation (27) provides only an ideal optimum MSE, since the optimum values of α and β_k, that is, α_(opt) and β_k(opt), involve unknown parameters. In practice, one can use reasonable values of these parameters known from prior studies (see Lui 1990; Murthy 1967; Srivastava 1967).

In the following, we have considered three cases of suggested generalized class of estimators ${\hat{R}}_{S S}$ for different values of (η_k, δ_k).

Case I: When (η_k, δ_k) = (0, 0)

For (η_k, δ_k) = (0, 0), the suggested generalized class of estimators ${\hat{R}}_{S S}$ at equation (19) reduces to the following estimator of ratio of population proportions R as

{\hat{R}}_{S S 1} = [α (\frac{p_{i 00}}{p_{0 j 0}}) + β_{k} (P_{00 k} - p_{00 k})] .

To the first degree of approximation, the bias and MSE of estimator ${\hat{R}}_{S S 1}$ are easily obtained from equations (22) and (24), respectively, as

B i a s ({\hat{R}}_{S S 1}) \approx R [α \{1 + λ_{0 j 0} - λ_{i j 0}\} - 1] .

M S E ({\hat{R}}_{S S 1}) \approx R^{2} [1 + α^{2} γ_{1 (1)} + {(β_{k})}^{2} γ_{2} + 2 α β_{k} γ_{3 (1)} - 2 α γ_{4 (1)}],

where

γ_{1 (1)} = [1 + λ_{i 00} + 3 λ_{0 j 0} - 4 λ_{i j 0}],

γ_{2} = R^{- 2} P_{00 k}^{2} λ_{00 k},

γ_{3 (1)} = R^{- 1} P_{00 k} (λ_{0 j k} - λ_{i 0 k}),

γ_{4 (1)} = [1 + λ_{0 j 0} - λ_{i j 0}] .

The MSE of estimator ${\hat{R}}_{S S 1}$ is minimized for

\begin{matrix} α = [\frac{γ_{2} γ_{4 (1)}}{γ_{1 (1)} γ_{2} - γ_{3 (1)}^{2}}] = α_{1 (o p t)} \\ β_{k} = [\frac{- γ_{3 (1)} γ_{4 (1)}}{γ_{1 (1)} γ_{2} - γ_{3 (1)}^{2}}] = β_{1 k (o p t)} \end{matrix}\} .

Thus, the resulting minimum MSE of estimator ${\hat{R}}_{S S 1}$ is given by:

{M S E}_{min} ({\hat{R}}_{S S 1}) \approx R^{2} [1 - \frac{γ_{2} γ_{4 (1)}^{2}}{γ_{1 (1)} γ_{2} - γ_{3 (1)}^{2}}] .

Case II: When (η_k, δ_k) = (1, 0)

For (η_k, δ_k) = (1, 0), the suggested generalized class of estimators ${\hat{R}}_{S S}$ at equation (19) reduces to the following estimator of ratio of population proportions R as:

{\hat{R}}_{S S 2} = [α (\frac{p_{i 00}}{p_{0 j 0}}) (\frac{p_{00 k}}{P_{00 k}}) + β_{k} (P_{00 k} - p_{00 k})] .

To the first degree of approximation, the bias and MSE of estimator ${\hat{R}}_{S S 2}$ are easily obtained from equations (22) and (24), respectively, as:

B i a s ({\hat{R}}_{S S 2}) \approx R [α \{1 + λ_{0 j 0} - λ_{i j 0} + λ_{i 0 k} - λ_{0 j k}\} - 1] .

M S E ({\hat{R}}_{S S 2}) \approx R^{2} [1 + α^{2} γ_{1 (2)} + {(β_{k})}^{2} γ_{2} + 2 α β_{k} γ_{3 (2)} - 2 α γ_{4 (2)}],

where

\begin{aligned} γ_{1 (2)} = [1 + λ_{i 00} + 3 λ_{0 j 0} - 4 λ_{i j 0} + 4 (λ_{i 0 k} - λ_{0 j k}) + λ_{00 k}], \\ γ_{3 (2)} = R^{- 1} P_{00 k} [λ_{0 j k} - λ_{00 k} - λ_{i 0 k}], \\ γ_{4 (2)} = [1 + λ_{0 j 0} - λ_{i j 0} + λ_{i 0 k} - λ_{0 j k}] . \end{aligned}

The MSE of estimator ${\hat{R}}_{S S 2}$ is minimized for:

\begin{matrix} α = [\frac{γ_{2} γ_{4 (2)}}{γ_{1 (2)} γ_{2} - γ_{3 (2)}^{2}}] = α_{2 (o p t)} \\ β_{k} = [\frac{- γ_{3 (2)} γ_{4 (2)}}{γ_{1 (2)} γ_{2} - γ_{3 (2)}^{2}}] = β_{2 k (o p t)} \end{matrix}\} .

Thus, the resulting minimum MSE of estimator ${\hat{R}}_{S S 2}$ is given by

{M S E}_{min} ({\hat{R}}_{S S 2}) \approx R^{2} [1 - \frac{γ_{2} γ_{4 (2)}^{2}}{γ_{1 (2)} γ_{2} - γ_{3 (2)}^{2}}] .

Remark 2: Corrected MSE of Gupta and Shabbir's estimator

It is interesting to note that, if we put $(η_{k}, δ_{k}) = (1, 1)$ in equation (24), we get the corrected MSE expression of ${\hat{R}}_{G S}$ (Gupta and Shabbir 2008) as:

M S E ({\hat{R}}_{G S}) \approx R^{2} [1 + α^{2} γ_{1 (1)} + β_{k}^{2} γ_{2} + 2 α β_{k} γ_{3}^{*} - 2 α γ_{4 (1)} - 2 β_{k} γ_{5}^{*}],

where $γ_{3}^{*} = R^{- 1} P_{00 k} [λ_{00 k} + λ_{0 j k} - λ_{i 0 k}]$ and $γ_{5}^{*} = R^{- 1} P_{00 k} λ_{00 k}$ (that is, $γ_{5}$ with $δ_{k} = 1$).

The MSE of ${\hat{R}}_{G S}$ is minimized for:

\begin{matrix} α = [\frac{γ_{2} γ_{4 (1)} - γ_{3}^{*} γ_{5}^{*}}{γ_{1 (1)} γ_{2} - γ_{3}^{* 2}}] = α_{(o p t)}^{*} \\ β_{k} = [\frac{γ_{1 (1)} γ_{5}^{*} - γ_{3}^{*} γ_{4 (1)}}{γ_{1 (1)} γ_{2} - γ_{3}^{* 2}}] = β_{k (o p t)}^{*} \end{matrix}\} .

Thus, the resulting minimum MSE of ${\hat{R}}_{G S}$ is given by:

{M S E}_{min} ({\hat{R}}_{G S}) \approx R^{2} [1 - \frac{(γ_{2} γ_{4 (1)}^{2} - 2 γ_{3}^{*} γ_{4 (1)} γ_{5}^{*} + γ_{1 (1)} γ_{5}^{* 2})}{(γ_{1 (1)} γ_{2} - γ_{3}^{* 2})}] .

Empirical Study

To judge the merits of suggested class of estimators ${\hat{R}}_{S S}$ over the other competitors, we use the data set earlier considered by Gupta and Shabbir (2008). The descriptions of population data set are as follows:

Data source: Cochran (1977:182).

The variables are defined as

y: Number of paralytic polio cases in “placebo” group.

x: Number of paralytic polio cases in “not inoculated” group.

z: Number of children in placebo group.

We have calculated the percentage relative efficiencies (PREs) of ${\hat{R}}_{0}$, ${\hat{R}}_{W}$ (Wynn 1976 type estimator), ${\hat{R}}_{S}$ (Singh, Singh, and Kaur 1986 estimator), ${\hat{R}}_{G S}$ (Gupta and Shabbir 2008 estimator), ${\hat{R}}_{S S 1}$, and ${\hat{R}}_{S S 2}$ with respect to ${\hat{R}}_{0}$ for the category (i = j = 1) and various choices of k. The joint frequencies for the variables are given in Tables 1–3. The findings are summarized in Table 4.

Table 1.

Joint Frequencies for y and z.

z/y	0–2	3–5	6–8	>8	N_00k
1–4.9	20	2	1	—	23
5–9.9	1	3	1	1	6
10–14.9	1	1	1	—	3
15–19.9	—	—	—	1	1
20–24.9	—	—	—	1	1
N_i00	22	6	3	3	N = 34

Table 2.

Joint Frequencies for x and z.

z/x	0–2	3–5	6–8	>8	N_00k
1–4.9	20	1	2	—	23
5–9.9	2	2	2	—	6
10–14.9	1	1	—	1	3
15–19.9	—	—	—	1	1
20–24.9	—	—	1	—	1
N_0j0	23	4	5	2	N = 34

Table 3.

Joint Frequencies for y and x.

y/x	0–2	3–5	6–8	>8	N_i00
0–2	19	2	1	—	22
3–5	2	2	2	—	6
6–8	2	—	—	1	3
>8	—	—	2	1	3
N_0j0	23	4	5	2	N = 34
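The population proportions used in the formulas follow from the margins of these tables. As a small sketch, Table 3 can be encoded as a nested list (reading each dash as a zero count, which is an assumption) and the margins and the ratio R for i = j = 1 recovered as follows:

```python
# Table 3 (rows: y categories, columns: x categories); dashes read as zero counts.
table_yx = [
    [19, 2, 1, 0],  # y: 0-2
    [2, 2, 2, 0],   # y: 3-5
    [2, 0, 0, 1],   # y: 6-8
    [0, 0, 2, 1],   # y: >8
]

N = sum(sum(row) for row in table_yx)
N_i00 = [sum(row) for row in table_yx]        # margins for y
N_0j0 = [sum(col) for col in zip(*table_yx)]  # margins for x

i, j = 0, 0  # zero-based indices for the category i = j = 1
P_i00 = N_i00[i] / N
P_0j0 = N_0j0[j] / N
P_ij0 = table_yx[i][j] / N
R = P_i00 / P_0j0
```

The same pattern applies to Tables 1 and 2 to obtain the proportions involving z.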

Table 4.

PREs of Different Estimators With Respect to ${\hat{R}}_{0}$ .

Estimator/k	1	2	3	4	5
${\hat{R}}_{0}$ (usual ratio estimator)	100.00	100.00	100.00	100.00	100.00
${\hat{R}}_{W}$ (Wynn 1976 type estimator)	75.03	65.69	84.88	93.78	93.78
${\hat{R}}_{S}$ (Singh, Singh, and Kaur 1986)	101.54	102.58	100.01	100.00	100.00
${\hat{R}}_{G S}$ (Gupta and Shabbir 2008)	119.33	119.40	69.58	96.31	96.31
${\hat{R}}_{S S 1}$ (suggested estimator in Case-I)	115.41	116.55	113.75	113.74	113.74
${\hat{R}}_{S S 2}$ (suggested estimator in Case-II)	118.10	108.30	114.73	113.74	113.74

Note: PRE = percentage relative efficiencies. Boldface numbers indicate the largest PRE for respective value of k.

The results in Table 4 show a gain in efficiency from using the suggested estimators ${\hat{R}}_{S S 1}$ and ${\hat{R}}_{S S 2}$, except for k = 1, 2, where the Gupta and Shabbir (2008) estimator ${\hat{R}}_{G S}$ performs better than the other estimators. The suggested estimators ${\hat{R}}_{S S 1}$ and ${\hat{R}}_{S S 2}$ are more efficient than the usual ratio estimator ${\hat{R}}_{0}$, the Wynn (1976) type estimator ${\hat{R}}_{W}$, and the Singh, Singh, and Kaur (1986) estimator ${\hat{R}}_{S}$ for all values of k (= 1, 2,…, 5), whereas the performance of the Gupta and Shabbir (2008) estimator ${\hat{R}}_{G S}$ is not consistent across k: its PRE falls below 100 for k = 3, 4, 5.

Similarly we can find the results for other choices of i and j (i = 1, 2, 3, 4; j = 1, 2, 3, 4) with various choices of k (= 1, 2,…, 5).
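For the Wynn-type estimator and the Singh, Singh, and Kaur estimator (at its optimum d), every term of the first-order MSE is proportional to θ, so their PREs relative to ${\hat{R}}_{0}$ do not depend on the sample size. The sketch below uses this to compute those two PREs from the counts in Tables 1–3 for i = j = 1; the function name `pre_wynn_ssk` is an assumption of this sketch, and dashes in the tables are read as zero counts.

```python
N = 34
N_i, N_j, N_ij = 22, 23, 19   # i = j = 1 (Tables 1-3)
N_k = [23, 6, 3, 1, 1]        # margins for z
N_ik = [20, 1, 1, 0, 0]       # Table 1, column y = 0-2
N_jk = [20, 2, 1, 0, 0]       # Table 2, column x = 0-2

P_i, P_j, P_ij = N_i / N, N_j / N, N_ij / N
R = P_i / P_j
mse0 = R * (P_i + P_j - 2 * P_ij) / P_j**2  # MSE of the usual ratio estimator divided by theta


def pre_wynn_ssk(k):
    """PREs of the Wynn-type and optimal difference-type estimators for category k (1-based)."""
    P_k = N_k[k - 1] / N
    D = (N_ik[k - 1] / N) / P_i - (N_jk[k - 1] / N) / P_j
    v = P_k * (1 - P_k)
    mse_w = mse0 + v - 2 * R * D      # Wynn-type, divided by theta
    mse_s = mse0 - R**2 * D**2 / v    # minimum MSE of the difference-type, divided by theta
    return 100 * mse0 / mse_w, 100 * mse0 / mse_s


pre = {k: pre_wynn_ssk(k) for k in range(1, 6)}
```

Under these assumptions the k = 1 values come out at about 75.03 and 101.54, which can be compared with the corresponding rows of Table 4. The PREs of ${\hat{R}}_{G S}$, ${\hat{R}}_{S S 1}$, and ${\hat{R}}_{S S 2}$ do depend on θ, so reproducing them would additionally require the sample size n.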

The Generalization of Suggested Class ${\hat{R}}_{S S}$

The suggested class of estimators at equation (19) can be generalized further by making use of all of the known proportions in the various categories of the auxiliary variable z. The generalized class is given by:

{\hat{R}}_{S S}^{(k)} = [α (\frac{p_{i 00}}{p_{0 j 0}}) \prod_{k = 1}^{c} {(\frac{p_{00 k}}{P_{00 k}})}^{θ_{k}} + \sum_{k = 1}^{c} β_{k} (P_{00 k} - p_{00 k}) {(\frac{P_{00 k}}{p_{00 k}})}^{δ_{k}}],

where (α, β_k) are suitably chosen constants and (i = 1, 2,…, a), (j = 1, 2,…, b), and (k = 1, 2,…, c).

Expressing equation (41) in terms of ξs, we have

\begin{aligned} {\hat{R}}_{S S}^{(k)} = [α R (1 + ξ_{i 00}) {(1 + ξ_{0 j 0})}^{- 1} \prod_{k = 1}^{c} {(1 + ξ_{00 k})}^{θ_{k}} \\ - \sum_{k = 1}^{c} β_{k} P_{00 k} ξ_{00 k} {(1 + ξ_{00 k})}^{- δ_{k}}] . \end{aligned}

We assume that $|ξ_{0 j 0}| < 1$ , $|ξ_{00 k}| < 1$ , so that ${(1 + ξ_{0 j 0})}^{- 1}$ , ${(1 + ξ_{00 k})}^{θ_{k}}$ , and ${(1 + ξ_{00 k})}^{- δ_{k}}$ are expandable. Now expanding the right-hand side of equation (42) and neglecting the terms of ξs with order greater than 2, we have:

\begin{aligned} {\hat{R}}_{S S}^{(k)} \approx [α R \{1 + ξ_{i 00} - ξ_{0 j 0} - ξ_{i 00} ξ_{0 j 0} + ξ_{0 j 0}^{2} + \sum_{k = 1}^{c} θ_{k} ξ_{00 k} \\ + \sum_{k = 1}^{c} θ_{k} (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) + \sum_{k < k^{'}}^{c} θ_{k} θ_{k^{'}} ξ_{00 k} ξ_{00 k^{'}} \\ + \sum_{k = 1}^{c} \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{00 k}^{2}\} - \sum_{k = 1}^{c} β_{k} P_{00 k} (ξ_{00 k} - δ_{k} ξ_{00 k}^{2})] \end{aligned}

\begin{aligned} ({\hat{R}}_{S S}^{(k)} - R) \approx [α R \{1 + ξ_{i 00} - ξ_{0 j 0} - ξ_{i 00} ξ_{0 j 0} + ξ_{0 j 0}^{2} + \sum_{k = 1}^{c} θ_{k} ξ_{00 k} \\ + \sum_{k = 1}^{c} θ_{k} (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) + \sum_{k < k^{'}}^{c} θ_{k} θ_{k^{'}} ξ_{00 k} ξ_{00 k^{'}} \\ + \sum_{k = 1}^{c} \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{00 k}^{2}\} - \sum_{k = 1}^{c} β_{k} P_{00 k} (ξ_{00 k} - δ_{k} ξ_{00 k}^{2}) - R] . \end{aligned}

Taking expectation on both sides of equation (43), we get the bias of generalized class of estimators ${\hat{R}}_{S S}^{(k)}$ to the first order of approximation as:

\begin{aligned} B i a s ({\hat{R}}_{S S}^{(k)}) \approx R [α \{1 + λ_{0 j 0} - λ_{i j 0} + \sum_{k = 1}^{c} θ_{k} (λ_{i 0 k} - λ_{0 j k}) \\ + \sum_{k = 1}^{c} \frac{θ_{k} (θ_{k} - 1)}{2} λ_{00 k}\} - 1] + \sum_{k = 1}^{c} β_{k} P_{00 k} δ_{k} λ_{00 k} . \end{aligned}

Squaring both sides of equation (43) and neglecting terms of ξs with order greater than 2, we have

\begin{aligned} {({\hat{R}}_{S S} - R)}^{2} \approx R^{2} [α^{2} {1 + ξ_{i 00}^{2} + 3 ξ_{0 j 0}^{2} + \sum_{k = 1}^{c} θ_{k}^{2} ξ_{00 k}^{2} \\ + 2 \sum_{k < k^{'} = 1}^{c} θ_{k} θ_{k^{'}} ξ_{00 k} ξ_{00 k^{'}} + 2 ξ_{i 00} \\ - 2 ξ_{0 j 0} + 2 \sum_{k = 1}^{c} θ_{k} ξ_{00 k} - 4 ξ_{i 00} ξ_{0 j 0} \\ + 4 \sum_{k = 1}^{c} θ_{k} (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) \\ + 2 \sum_{k < k^{'} = 1}^{c} θ_{k} θ_{k^{'}} ξ_{00 k} ξ_{00 k^{'}} + \sum_{k = 1}^{c} θ_{k} (θ_{k} - 1) ξ_{00 k}^{2}} \\ + 1 - 2 α {1 + ξ_{i 00} - ξ_{0 j 0} + \sum_{k = 1}^{c} θ_{k} ξ_{00 k} - ξ_{i 00} ξ_{0 j 0} + ξ_{0 j 0}^{2} \\ + \sum_{k = 1}^{c} θ_{k} (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) + \sum_{k < k^{'} = 1}^{c} θ_{k} θ_{k^{'}} ξ_{00 k} ξ_{00 k^{'}} \\ + \sum_{k = 1}^{c} \frac{θ_{k} (θ_{k} - 1)}{2} ξ_{00 k}^{2}}] + \sum_{k = 1}^{c} {{(β_{k}^{})}^{2} P_{00 k}^{2} ξ_{00 k}^{2}} \\ + 2 \sum_{k < k^{'} = 1}^{c} (β_{k} β_{k^{'}} P_{00 k} P_{00 k^{'}} ξ_{00 k} ξ_{00 k^{'}}) \\ - 2 R \sum_{k < k^{'} = 1}^{c} [{β_{k} P_{00 k} {α {ξ_{00 k} + (ξ_{i 00} ξ_{00 k} - ξ_{0 j 0} ξ_{00 k}) \\ + (θ_{k} - δ_{k}) ξ_{00 k}^{2}}} - ξ_{00 k} + δ_{k} ξ_{00 k}^{2}} \\ + α \sum_{k < k^{'} = 1}^{c} (θ_{k} β_{k^{'}} P_{00 k^{'}} ξ_{00 k} ξ_{00 k^{'}})] . \end{aligned}

Taking expectation on both sides of equation (45), we get the MSE of the generalized class of estimators ${\hat{R}}_{S S}^{(k)}$ to the first order of approximation as:

M S E ({\hat{R}}_{S S}^{(k)}) \approx R^{2} [1 + α^{2} γ_{1}^{(k)} + γ_{2}^{(k)} + 2 α γ_{3}^{(k)} - 2 α γ_{4}^{(k)} - 2 γ_{5}^{(k)}],

where

\begin{aligned} γ_{1}^{(k)} = [1 + λ_{i 00} + 3 λ_{0 j 0} - 4 λ_{i j 0} \\ + \sum_{k = 1}^{c} \{θ_{k} \{4 (λ_{i 0 k} - λ_{0 j k}) + (2 θ_{k} - 1) λ_{00 k}\}\}], \end{aligned}

γ_{2}^{(k)} = R^{- 2} \sum_{k = 1}^{c} \{{(β_{k})}^{2} P_{00 k}^{2} λ_{00 k}\},

\begin{aligned} γ_{3}^{(k)} = R^{- 1} \sum_{k = 1}^{c} \{β_{k} P_{00 k} ((2 δ_{k} - η_{k}) λ_{00 k} + (λ_{0 j k} - λ_{i 0 k}))\} \\ = R^{- 1} \sum_{k = 1}^{c} \{β_{k} Δ_{k}\}, \end{aligned}

γ_{4}^{(k)} = [1 + λ_{0 j 0} - λ_{i j 0} + \sum_{k = 1}^{c} \{θ_{k} \{(λ_{i 0 k} - λ_{0 j k}) + \frac{(θ_{k} - 1) λ_{00 k}}{2}\}\}],

γ_{5}^{(k)} = R^{- 1} \sum_{k = 1}^{c} \{β_{k} δ_{k} P_{00 k} λ_{00 k}\} .

Differentiating equation (46) partially with respect to α and β_k and equating the derivatives to zero, we get the following equations, respectively:

α γ_{1}^{(k)} + γ_{3}^{(k)} = [α γ_{1}^{(k)} + R^{- 1} \sum_{k = 1}^{c} {β_{k} Δ_{k}}] = γ_{4}^{(k)},

R^{- 2} β_{k} P_{00 k}^{2} λ_{00 k} = [R^{- 1} δ_{k} P_{00 k} λ_{00 k} - R^{- 1} α Δ_{k}] .

Solving equations (47) and (48), we get the optimum values of α and β_k, respectively, as

α = \frac{\{γ_{4}^{(k)} - \sum_{k = 1}^{c} (\frac{δ_{k} Δ_{k}}{P_{00 k}})\}}{\{γ_{1}^{(k)} - \sum_{k = 1}^{c} (\frac{Δ_{k}^{2}}{P_{00 k}^{2} λ_{00 k}})\}} = α_{(o p t)}^{(k)},

β_{k} = \frac{R}{P_{00 k}^{2} λ_{00 k}} [δ_{k} P_{00 k} λ_{00 k} - Δ_{k} α_{(o p t)}^{(k)}] = β_{k (o p t)}^{* *} .
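As a consistency check on the two optimum expressions above, the sketch below implements them as printed and verifies that, for a single auxiliary category (c = 1), they coincide with the closed-form solution of the 2 × 2 normal equations of the basic class. All numerical inputs are hypothetical, and the function name `generalized_optimum` is introduced here for illustration.

```python
from fractions import Fraction


def generalized_optimum(R, g1, g4, P, lam, Delta, delta):
    """Optimum alpha and list of beta_k; P, lam, Delta, delta are sequences indexed by k."""
    num = g4 - sum(d * D / p for d, D, p in zip(delta, Delta, P))
    den = g1 - sum(D**2 / (p**2 * l) for D, p, l in zip(Delta, P, lam))
    alpha = num / den
    betas = [R * (d * p * l - D * alpha) / (p**2 * l)
             for d, D, p, l in zip(delta, Delta, P, lam)]
    return alpha, betas


# Hypothetical single-category inputs with eta_k = delta_k = 1 (so theta_k = 0).
theta = Fraction(4, 99)
R, P_k, lam_k, delta_k = Fraction(5, 4), Fraction(3, 10), 7 * theta / 3, 1
lam_i0k, lam_0jk = theta / 3, theta / 4
Delta_k = P_k * ((2 * delta_k - 1) * lam_k + (lam_0jk - lam_i0k))
g1 = 1 + Fraction(9, 2) * theta  # 1 + lam_i00 + 3 lam_0j0 - 4 lam_ij0 for made-up lambdas
g4 = 1 + Fraction(5, 4) * theta  # 1 + lam_0j0 - lam_ij0 for the same lambdas

alpha, betas = generalized_optimum(R, g1, g4, [P_k], [lam_k], [Delta_k], [delta_k])

# The same optimum from the single-category normal equations.
g2 = P_k**2 * lam_k / R**2
g3 = Delta_k / R
g5 = delta_k * P_k * lam_k / R
det = g1 * g2 - g3**2
alpha_2x2 = (g2 * g4 - g3 * g5) / det
beta_2x2 = (g1 * g5 - g3 * g4) / det
```

For c > 1 the function simply accumulates one term per category in the two sums, following the expressions in the text.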

Putting equations (49) and (50) in equation (46), we get the minimum MSE of the generalized class of estimators ${\hat{R}}_{S S}^{(k)}$ to the first order of approximation as:

\begin{aligned} {M S E}_{min} ({\hat{R}}_{S S}^{(k)}) \approx R^{2} [1 + {(α_{(o p t)}^{(k)})}^{2} γ_{1}^{(k)} + R^{- 2} \sum_{k = 1}^{c} {{(β_{k (o p t)}^{* *})}^{2} P_{00 k}^{2} λ_{00 k}} \\ + 2 α_{(o p t)}^{(k)} R^{- 1} \sum_{k = 1}^{c} {β_{k (o p t)}^{* *} Δ_{k}} - 2 α_{(o p t)}^{(k)} γ_{4}^{(k)} \\ - 2 R^{- 1} \sum_{k = 1}^{c} {β_{k (o p t)}^{* *} δ_{k} P_{00 k} λ_{00 k}}] . \end{aligned}

It can be shown that the proposed class ${\hat{R}}_{S S}$ is more efficient than the estimators ${\hat{R}}_{0}$ , ${\hat{R}}_{W}$ (Wynn 1976 type estimator), ${\hat{R}}_{S}$ (Singh, Singh, and Kaur 1986 estimator), and ${\hat{R}}_{G S}$ (Gupta and Shabbir 2008).

Conclusion

We have suggested a general class of estimators of the ratio of two population proportions. The usual ratio estimator, the Wynn (1976) type estimator, the Singh, Singh, and Kaur (1986) difference-type estimator, and the Gupta and Shabbir (2008) estimator are members of the suggested class. We have also obtained the correct MSE expression of the Gupta and Shabbir linear weighted estimator of the ratio of two population proportions. The empirical study shows that the proposed estimators are superior to the usual ratio estimator, the Wynn-type estimator, and the Singh, Singh, and Kaur estimator, whereas the performance of the Gupta and Shabbir estimator is not consistent across the values of k. A generalized version of the proposed class has also been given.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Cochran, W. G. 1977. Sampling Techniques. 3rd ed. New York: John Wiley.

Gupta and Shabbir. 2008. “On Estimating the Ratio of Proportions of Two Categories of a Population Using Auxiliary Information.” Journal of the Indian Society of Agricultural Statistics 62:149–55.

Lui, K. J. 1990. “Modified Product Estimators of Finite Population Mean in Finite Sampling.” Communications in Statistics–Theory and Methods 19:3799–807.

Murthy, M. N. 1967. Sampling: Theory and Methods. Calcutta, India: Statistical Publishing Society.

Rao, T. J. 1977. “Optimum Allocation of Sample Size and Prior Distributions.” International Statistical Review 25:173–79.

Singh, R. K., Singh, and Kaur. 1986. “Estimating Ratio of Proportion Using Auxiliary Information.” Biometrical Journal 28:637–43.

Srivastava, S. K. 1967. “An Estimator Using Auxiliary Information in Sample Surveys.” Calcutta Statistical Association Bulletin 16:62–63.

Wynn, H. P. 1976. “An Unbiased Estimator in a Proportion.” The Statistician 25:225–28.
