A Dexterous Optional Randomized Response Model

Abstract

This article addresses the problem of estimating the proportion π_S of a population belonging to a sensitive group. An optional randomized response technique based on the Mangat model is proposed for stratified sampling, and its bias and mean squared error are obtained under proportional and Neyman allocation. Numerically, the suggested model is found to be more efficient than the Kim and Warde stratified randomized response model and the Mangat model.

Keywords

randomized response technique, estimation of proportion, stratified random sampling, sensitive attribute, bias, mean squared error

Introduction

The social survey is one of the most important tools for obtaining data on human populations, and it has proved highly practical for measuring opinions, attitudes, and behaviors across a wide band of interests. Surveys are conducted for many reasons, the most apparent being that certain facts are not available in the archives. For instance, if one is interested in the crime rate, information about unseen crimes or unreported victimization experience is not available in formal crime records. Sometimes, facts about the individuals in a population are inaccessible to investigators for legal reasons. Questionnaires, in particular those used in social surveys, generally consist of many items, and some of these items may concern sensitive or high-risk behavior that carries a social stigma. One problem with research on high-risk behavior is that respondents may consciously or unconsciously provide incorrect information. In psychological surveys, social desirability bias has been observed as a major cause of distortion in standardized personality measures. Survey researchers have similar concerns about the truth of survey findings on topics such as drunk driving, use of marijuana, tax evasion, illicit drug use, induced abortion, shoplifting, child abuse, family disturbances, cheating in examinations, HIV/AIDS, and sexual behavior. Thus, to obtain trustworthy data on such confidential matters, especially the sensitive ones, alternative procedures are required instead of open surveys. One such alternative, known as the randomized response technique (RRT), was first introduced by Warner (1965).

Subsequently, several other workers have proposed different Randomized Response (RR) strategies, for instance, see the review-oriented references like Greenberg et al. (1969), Fox and Tracy (1986), Mangat (1994) and the papers by Tracy and Mangat (1995), Kim and Elam (2005), Singh and Tarray (2012, 2013, 2014a, 2014b, 2015a, 2015b, 2015c), and Tarray and Singh (2015).

Hong, Yum, and Lee (1994) suggested a stratified RR technique under the proportional sampling assumption. Kim and Warde (2004) and Kim and Elam (2005) have presented a stratified RR technique using a Neyman allocation which is more efficient than a stratified RR technique using a proportional allocation. Kim and Elam (2007) have mentioned that the extension of the RRT to stratified random sampling may be useful if the investigator is interested in estimating the proportion of HIV/AIDS positively affected persons at different levels such as by rural areas or urban areas, age-groups, or income groups.

In stratified random sampling, the population to be surveyed is partitioned into strata, and a sample is then selected from each stratum by simple random sampling with replacement (SRSWR). To get the full benefit from stratification, it is assumed that the number of units in each stratum is known. In the stratified Warner’s randomized response model, an individual respondent in the sample from stratum i is instructed to use the randomization device R_i, which consists of a sensitive question (S) card with probability P_i and its negative question $(\bar{S})$ card with probability (1 − P_i). The respondent answers the question with “Yes” or “No” without reporting which question card he or she has drawn. Respondents in different strata use different randomization devices, each with its own preassigned probability. Under the assumption that these “Yes” or “No” reports are made truthfully and that P_i is set by the researcher, the probability of a “Yes” answer in stratum i for the stratified Warner’s RR model is given by:

Z_{i} = P_{i} π_{S i} + (1 - P_{i}) (1 - π_{S i}), for (i = 1, 2, \dots, k);

where Z_i is the proportion of “Yes” answers in stratum i and π_Si is the proportion of respondents with the sensitive trait in stratum i. Let n_i denote the number of units in the sample from stratum i and n denote the total number of units in the samples from all strata, so that $n = \sum_{i = 1}^{k} n_{i} .$ The maximum likelihood estimate ${\hat{π}}_{S}$ (which is unbiased) of the sensitive proportion $π_{S} = \sum_{i = 1}^{k} w_{i} π_{S i}$ is given by:

{\hat{π}}_{S} = \sum_{i = 1}^{k} w_{i} {\hat{π}}_{S i} = \sum_{i = 1}^{k} w_{i} [\frac{{\hat{Z}}_{i} - (1 - P_{i})}{2 P_{i} - 1}],

where w_i = (N_i/N) for (i = 1, 2, …, k) so that $w = \sum_{i = 1}^{k} w_{i} = 1$ , N is the number of units in the whole population and N_i is the total number of units in the stratum i and ${\hat{Z}}_{i}$ is a point estimate of Z_i .

The variance of ${\hat{π}}_{S}$ in equation (2) is given by:

V ({\hat{π}}_{S}) = \sum_{i = 1}^{k} w_{i}^{2} V ({\hat{π}}_{S i}) = \sum_{i = 1}^{k} \frac{w_{i}^{2}}{n_{i}} [π_{S i} (1 - π_{S i}) + \frac{P_{i} (1 - P_{i})}{{(2 P_{i} - 1)}^{2}}] = MSE ({\hat{π}}_{S}),

where MSE(·) denotes the mean squared error (MSE) of (·).

Under proportional allocation (i.e., n_i = n (N_i /N)), the variance/MSE of ${\hat{π}}_{S}$ is given by:

V {({\hat{π}}_{S})}_{P} = \frac{1}{n} \sum_{i = 1}^{k} w_{i} [π_{S i} (1 - π_{S i}) + \frac{P_{i} (1 - P_{i})}{{(2 P_{i} - 1)}^{2}}] = MSE ({\hat{π}}_{S})_{P},

which is due to Hong et al. (1994).

If the prior information on π_Si is available from the past experience, then under Neyman allocation:

\frac{n_{i}}{n} = \frac{w_{i} {[π_{S i} (1 - π_{S i}) + \frac{P_{i} (1 - P_{i})}{{(2 P_{i} - 1)}^{2}}]}^{1 / 2}}{\sum_{i = 1}^{k} w_{i} {[π_{S i} (1 - π_{S i}) + \frac{P_{i} (1 - P_{i})}{{(2 P_{i} - 1)}^{2}}]}^{1 / 2}} .

Kim and Warde (2004) obtained the minimal variance/MSE of the estimator ${\hat{π}}_{S}$ as

V {({\hat{π}}_{S})}_{O} = \frac{1}{n} {[\sum_{i = 1}^{k} w_{i} {\{π_{S i} (1 - π_{S i}) + \frac{P_{i} (1 - P_{i})}{{(2 P_{i} - 1)}^{2}}\}}^{1 / 2}]}^{2} = MSE ({\hat{π}}_{S})_{O} .
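As an illustration, the variance/MSE expressions for the stratified Warner estimator under proportional and Neyman allocation can be evaluated numerically. The following is a minimal sketch, not the authors' code; the function names and the two-stratum input values are assumptions made only for this example.

```python
import math


def warner_term(pi_s, p):
    """Per-stratum bracketed term: pi(1 - pi) + P(1 - P) / (2P - 1)^2."""
    return pi_s * (1 - pi_s) + p * (1 - p) / (2 * p - 1) ** 2


def mse_proportional(n, w, pi_s, p):
    """Variance/MSE of the stratified Warner estimator under proportional allocation."""
    return sum(wi * warner_term(ps, pj) for wi, ps, pj in zip(w, pi_s, p)) / n


def mse_neyman(n, w, pi_s, p):
    """Minimal variance/MSE of the same estimator under Neyman allocation."""
    root_sum = sum(wi * math.sqrt(warner_term(ps, pj)) for wi, ps, pj in zip(w, pi_s, p))
    return root_sum ** 2 / n


# Assumed illustrative inputs: two strata with a common device probability P = 0.6.
w, pi_s, p = [0.9, 0.1], [0.08, 0.13], [0.6, 0.6]
mse_p = mse_proportional(10, w, pi_s, p)
mse_o = mse_neyman(10, w, pi_s, p)
print(f"proportional: {mse_p:.6f}  Neyman: {mse_o:.6f}")
```

By the Cauchy-Schwarz inequality the Neyman value can never exceed the proportional one, which the two printed numbers should reflect.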

Proposed Model

In the proposed model, the population is partitioned into strata, and a sample is selected by SRSWR from each stratum. To get the full benefit from stratification, it is assumed that the number of units in each stratum is known. The randomized response device R_i and the method of sampling the respondents in each stratum i remain the same as in the Kim and Warde (2004) model. The procedure differs, however, in that the respondent is free to answer “Yes” or “No” either by using the RR device or without using it, and does not reveal to the interviewer which mode has been followed.

Let n_i denote the number of units in the sample from stratum i and n denote the total number of units in the samples from all strata, so that $n = \sum_{i = 1}^{k} n_{i} .$ If T_i is the probability that a respondent answers without using the RR device, then, assuming completely truthful reporting, the probability Y_i of a “Yes” answer in stratum i is given by:

\begin{array}{l} Y_{i} = T_{i} π_{S i} + (1 - T_{i}) Z_{i} \\ \quad = T_{i} π_{S i} + (1 - T_{i}) \{P_{i} π_{S i} + (1 - P_{i}) π_{y i}\}, \quad for i = 1, 2, \dots, k, \end{array}

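where Z_i denotes the probability of a “Yes” answer from a respondent who uses the RR device; in this model it takes the form $Z_{i} = P_{i} π_{S i} + (1 - P_{i}) π_{y i}$ rather than that of equation (1), and π_yi is the proportion of respondents in stratum i possessing the non-stigmatized attribute of the Mangat (1991) model.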
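The second line of the expression for Y_i can be evaluated directly. The sketch below is illustrative only: the function name `yes_probability` is an assumption, and the inputs are borrowed from the first configuration used later in the numerical comparison (T = 0.5, P = 0.6, π_y = 0.10).

```python
def yes_probability(t, p, pi_s, pi_y):
    """Probability of a 'Yes' answer in one stratum under the optional RR model."""
    device = p * pi_s + (1 - p) * pi_y  # 'Yes' probability when the RR device is used
    return t * pi_s + (1 - t) * device  # mix of direct answers and device answers


# Two strata with sensitive proportions 0.08 and 0.13.
y = [yes_probability(0.5, 0.6, pi_s, 0.10) for pi_s in (0.08, 0.13)]
print(y)
```

Setting T = 1 (every respondent answers directly) returns π_Si itself, which is a quick sanity check on the expression.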

For this procedure, we consider the following estimator of π_Si:

{\hat{π}}_{m i} = \frac{n_{i}^{'}}{n_{i}},

where $n_{i}^{'}$ is the observed number of “Yes” answers obtained from the n_i respondents included in the sample from stratum i.

Since the selection in different strata is made independently, the estimators for individual strata can be added together to obtain an estimator for the whole population. Thus, the estimator ${\hat{π}}_{S T}$ of $π_{S} = \sum_{i = 1}^{k} w_{i} π_{S i}$ is given by:

{\hat{π}}_{S T} = \sum_{i = 1}^{k} w_{i} {\hat{π}}_{m i} = \sum_{i = 1}^{k} w_{i} (n_{i}^{'} / n_{i}) .

Since $n_{i}^{'}$ is distributed as a binomial variate B (n_i, Y_i ), we, therefore, have the following theorems.

Theorem 2.1: The estimator ${\hat{π}}_{S T}$ is biased and the expression for bias is given by:

B ({\hat{π}}_{S T}) = \sum_{i = 1}^{k} w_{i} (1 - T_{i}) (1 - P_{i}) (π_{y i} - π_{S i}) .

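Proof: Since $E (n_{i}^{'}) = n_{i} Y_{i}$, we have $E ({\hat{π}}_{S T}) = \sum_{i = 1}^{k} w_{i} Y_{i}$, and the second line of equation (7) gives $Y_{i} - π_{S i} = (1 - T_{i}) (1 - P_{i}) (π_{y i} - π_{S i})$, from which the result follows.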

Theorem 2.2: The variance of the proposed estimator ${\hat{π}}_{S T}$ is given by:

V ({\hat{π}}_{S T}) = \sum_{i = 1}^{k} w_{i}^{2} \frac{Y_{i} (1 - Y_{i})}{n_{i}},

where Y_i is given by equation (7).

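Proof: The result follows from $V (n_{i}^{'}) = n_{i} Y_{i} (1 - Y_{i})$ and the independence of the samples drawn from different strata.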
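The bias and variance expressions of Theorems 2.1 and 2.2 can be checked by simulation. The following is a rough Monte Carlo sketch using only the standard library; the design values, the seed, and the function names are hypothetical and chosen purely for illustration.

```python
import random


def yes_probability(t, p, pi_s, pi_y):
    """Probability of a 'Yes' answer in one stratum under the optional RR model."""
    return t * pi_s + (1 - t) * (p * pi_s + (1 - p) * pi_y)


def draw_estimate(rng, w, n_i, y):
    """One simulated value of the stratified estimator sum_i w_i * n_i' / n_i."""
    estimate = 0.0
    for wi, ni, yi in zip(w, n_i, y):
        yes_count = sum(rng.random() < yi for _ in range(ni))  # n_i' ~ Binomial(n_i, Y_i)
        estimate += wi * yes_count / ni
    return estimate


# Hypothetical design: two strata, common T, P and pi_y.
w, n_i = [0.6, 0.4], [30, 20]
t, p, pi_y = 0.5, 0.7, 0.40
pi_s = [0.10, 0.20]
y = [yes_probability(t, p, ps, pi_y) for ps in pi_s]

rng = random.Random(12345)
draws = [draw_estimate(rng, w, n_i, y) for _ in range(20000)]
mean = sum(draws) / len(draws)
empirical_bias = mean - sum(wi * ps for wi, ps in zip(w, pi_s))
empirical_var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)

# Closed-form bias and variance from the two theorems.
theoretical_bias = sum(wi * (1 - t) * (1 - p) * (pi_y - ps) for wi, ps in zip(w, pi_s))
theoretical_var = sum(wi ** 2 * yi * (1 - yi) / ni for wi, yi, ni in zip(w, y, n_i))

print(f"bias: simulated {empirical_bias:.4f}, formula {theoretical_bias:.4f}")
print(f"variance: simulated {empirical_var:.5f}, formula {theoretical_var:.5f}")
```

The simulated and closed-form values should agree up to Monte Carlo error, which shrinks as the number of replications grows.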

Because the proposed estimator ${\hat{π}}_{S T}$ is biased, its performance must be assessed through its MSE. We know that

MSE ({\hat{π}}_{S T}) = V ({\hat{π}}_{S T}) + {(Bias ({\hat{π}}_{S T}))}^{2}

Thus, we state the following theorem.

Theorem 2.3: The MSE of the proposed estimator ${\hat{π}}_{S T}$ is given by:

MSE ({\hat{π}}_{S T}) = \sum_{i = 1}^{k} w_{i}^{2} \frac{Y_{i} (1 - Y_{i})}{n_{i}} + {[\sum_{i = 1}^{k} w_{i} (1 - T_{i}) (1 - P_{i}) (π_{y i} - π_{S i})]}^{2} .

Now, we will obtain the MSE $({\hat{π}}_{S T})$ under (i) proportional allocation and (ii) Neyman allocation.

i. Proportional allocation

The MSE $({\hat{π}}_{S T})$ under the proportional allocation is given in the following theorem.

Theorem 2.4: Under the proportional allocation (i.e., n_i = n (N_i /N)), the MSE $({\hat{π}}_{S T})$ is given by:

MSE {({\hat{π}}_{S T})}_{P} = \frac{1}{n} [\sum_{i = 1}^{k} w_{i} Y_{i} (1 - Y_{i})] + {[\sum_{i = 1}^{k} w_{i} (1 - T_{i}) (1 - P_{i}) (π_{y i} - π_{S i})]}^{2} .

Proof is simple so omitted.
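Theorem 2.4 can be verified numerically by substituting n_i = n w_i into the general MSE of Theorem 2.3. The sketch below does this for one assumed two-stratum configuration; all function names and input values are illustrative assumptions rather than part of the original derivation.

```python
def yes_probability(t, p, pi_s, pi_y):
    """Probability of a 'Yes' answer in one stratum under the optional RR model."""
    return t * pi_s + (1 - t) * (p * pi_s + (1 - p) * pi_y)


def squared_bias(w, t, p, pi_s, pi_y):
    """Square of sum_i w_i (1 - T_i)(1 - P_i)(pi_yi - pi_Si)."""
    bias = sum(wi * (1 - ti) * (1 - pj) * (py - ps)
               for wi, ti, pj, ps, py in zip(w, t, p, pi_s, pi_y))
    return bias ** 2


def mse_general(w, n_i, t, p, pi_s, pi_y):
    """MSE for arbitrary stratum sample sizes n_i (variance plus squared bias)."""
    y = [yes_probability(*args) for args in zip(t, p, pi_s, pi_y)]
    variance = sum(wi ** 2 * yi * (1 - yi) / ni for wi, yi, ni in zip(w, y, n_i))
    return variance + squared_bias(w, t, p, pi_s, pi_y)


def mse_proportional(n, w, t, p, pi_s, pi_y):
    """MSE under proportional allocation, n_i = n * w_i."""
    y = [yes_probability(*args) for args in zip(t, p, pi_s, pi_y)]
    variance = sum(wi * yi * (1 - yi) for wi, yi in zip(w, y)) / n
    return variance + squared_bias(w, t, p, pi_s, pi_y)


# Assumed inputs: n = 10, two strata, common T = 0.5, P = 0.6, pi_y = 0.10.
n, w = 10, [0.9, 0.1]
t, p, pi_s, pi_y = [0.5, 0.5], [0.6, 0.6], [0.08, 0.13], [0.10, 0.10]
n_i = [n * wi for wi in w]  # proportional allocation
mse_from_general = mse_general(w, n_i, t, p, pi_s, pi_y)
mse_from_theorem = mse_proportional(n, w, t, p, pi_s, pi_y)
print(mse_from_general, mse_from_theorem)
```

The two printed values should coincide up to floating-point rounding, since one factor of w_i cancels against n_i = n w_i.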

ii. Neyman allocation

Information on π_Si and T_i is usually unavailable. But if prior information on π_Si and T_i is available from past experience, it can be used to derive the following Neyman allocation formula.

Theorem 2.5: The Neyman allocation of n to $n_{1}, n_{2}, \dots, n_{k - 1}$, and $n_{k}$ that gives the minimum MSE subject to $n = \sum_{i = 1}^{k} n_{i}$ is approximately given by:

\frac{n_{i}}{n} = \frac{w_{i} \sqrt{Y_{i} (1 - Y_{i})}}{\sum_{i = 1}^{k} w_{i} \sqrt{Y_{i} (1 - Y_{i})}} .

Proof is simple so omitted.

Theorem 2.6: Under the Neyman allocation (15), the minimum MSE of ${\hat{π}}_{S T}$ is given by:

MSE {({\hat{π}}_{S T})}_{O} = \frac{1}{n} {[\sum_{i = 1}^{k} w_{i} \sqrt{Y_{i} (1 - Y_{i})}]}^{2} + {[\sum_{i = 1}^{k} w_{i} (1 - T_{i}) (1 - P_{i}) (π_{y i} - π_{S i})]}^{2} .

Proof: Inserting equation (15) in (13), one can easily get (16).
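The substitution described in the proof can be carried out numerically. The following sketch computes the Neyman allocation fractions and confirms that plugging them into the variance part of the MSE gives the closed form; the bias term does not depend on n_i and is therefore left out. Function names and input values are assumptions for illustration, and the resulting n_i are left real-valued rather than rounded to integers.

```python
import math


def yes_probability(t, p, pi_s, pi_y):
    """Probability of a 'Yes' answer in one stratum under the optional RR model."""
    return t * pi_s + (1 - t) * (p * pi_s + (1 - p) * pi_y)


def neyman_fractions(w, y):
    """Allocation fractions n_i / n proportional to w_i * sqrt(Y_i (1 - Y_i))."""
    roots = [wi * math.sqrt(yi * (1 - yi)) for wi, yi in zip(w, y)]
    total = sum(roots)
    return [r / total for r in roots]


def variance_term(w, n_i, y):
    """Variance part of the MSE: sum_i w_i^2 Y_i (1 - Y_i) / n_i."""
    return sum(wi ** 2 * yi * (1 - yi) / ni for wi, yi, ni in zip(w, y, n_i))


# Assumed inputs: n = 10, two strata, common T = 0.5, P = 0.6, pi_y = 0.10.
n, w = 10, [0.9, 0.1]
y = [yes_probability(0.5, 0.6, ps, 0.10) for ps in (0.08, 0.13)]

fractions = neyman_fractions(w, y)
n_opt = [n * f for f in fractions]  # real-valued stratum sample sizes

var_plugged = variance_term(w, n_opt, y)
var_closed = sum(wi * math.sqrt(yi * (1 - yi)) for wi, yi in zip(w, y)) ** 2 / n
var_prop = variance_term(w, [n * wi for wi in w], y)  # proportional allocation
print(fractions, var_plugged, var_closed, var_prop)
```

In practice the n_i would have to be rounded to integers, so the closed-form minimum is attained only approximately.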

Relative Efficiency

To compare the MSE of the proposed estimator ${\hat{π}}_{S T}$ with that of Mangat (1991) estimator ${\hat{π}}_{m}$ , we write MSE of Mangat (1991) estimator ${\hat{π}}_{m}$ for two strata in the population (i.e., k = 2) as

MSE ({\hat{π}}_{m}) = \frac{[Y (1 - Y)]}{n} + {[(1 - T) (1 - P) (π_{y} - π_{S})]}^{2},

where $Y = T π_{S} + (1 - T) \{P π_{S} + (1 - P) π_{y}\}$ and $π_{S} = w_{1} π_{S 1} + w_{2} π_{S 2} .$

Proportional Allocation

For two strata (i.e., k = 2) in the population and P = P₁ = P₂ , MSE( ${\hat{π}}_{S}$ ) in equation (4) reduces to:

MSE {({\hat{π}}_{S})}_{P} = \frac{1}{n} [\sum_{i = 1}^{2} w_{i} \{π_{S i} (1 - π_{S i}) + \frac{P (1 - P)}{{(2 P - 1)}^{2}}\}] .

Under the assumptions k = 2, P = P₁ = P₂ and T = T₁ = T₂ , $π_{y 1} = π_{y 2} = π_{y}$ , the MSE ${({\hat{π}}_{S T})}_{P}$ in equation (14) reduces to:

MSE {({\hat{π}}_{S T})}_{P} = \frac{1}{n} [\sum_{i = 1}^{2} w_{i} Y_{i}^{*} (1 - Y_{i}^{*})] + {[(1 - T) (1 - P) (π_{y} - π_{S})]}^{2},

where $π_{S} = (w_{1} π_{S 1} + w_{2} π_{S 2})$ and $Y_{i}^{*} = T π_{S i} + (1 - T) \{P π_{S i} + (1 - P) π_{y}\}$ , for i = 1, 2.

From equations (17), (18), and (19), the percent relative efficiency (PRE) of the proposed estimator ${\hat{π}}_{S T}$ (under proportional allocation) with respect to Mangat (1991) estimator ${\hat{π}}_{m}$ and the estimator ${\hat{π}}_{S}$ (under proportional allocation, i.e., Hong et al.’s estimator) are, respectively, given by:

PRE ({({\hat{π}}_{S T})}_{P}, {\hat{π}}_{m}) = \frac{MSE ({\hat{π}}_{m})}{MSE {({\hat{π}}_{S T})}_{P}} \times 100,

and

PRE ({({\hat{π}}_{S T})}_{P}, {({\hat{π}}_{S})}_{P}) = \frac{MSE {({\hat{π}}_{S})}_{P}}{MSE {({\hat{π}}_{S T})}_{P}} \times 100.

We have computed the percent relative efficiencies $PRE ({\hat{π}}_{S T}, {\hat{π}}_{m})$ and $PRE ({\hat{π}}_{S T}, {\hat{π}}_{S})$ for different values of n, P, w₁ , w₂ , $π_{S 1}, π_{S 2}, π_{y}$ , and T. Findings are shown in Tables 1 and 2, respectively.
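The two PRE values under proportional allocation can be computed with a few lines of code. The sketch below is an illustration under the stated two-stratum assumptions (common P, T, and π_y); the function names are hypothetical. With the inputs n = 10, π_S1 = 0.08, π_S2 = 0.13, T = 0.50, w₁ = 0.90, w₂ = 0.10, π_y = 0.10, and P = 0.6, the results round to 100.18 and 7,577.80, which agree with the first cells of Tables 1 and 2.

```python
def yes_probability(t, p, pi_s, pi_y):
    """Probability of a 'Yes' answer under the optional RR model."""
    return t * pi_s + (1 - t) * (p * pi_s + (1 - p) * pi_y)


def mse_mangat(n, t, p, pi_s, pi_y):
    """MSE of the unstratified Mangat (1991) estimator."""
    y = yes_probability(t, p, pi_s, pi_y)
    return y * (1 - y) / n + ((1 - t) * (1 - p) * (pi_y - pi_s)) ** 2


def mse_hong(n, w, p, pi_s_strata):
    """MSE of the stratified Warner estimator under proportional allocation."""
    device = p * (1 - p) / (2 * p - 1) ** 2
    return sum(wi * (ps * (1 - ps) + device) for wi, ps in zip(w, pi_s_strata)) / n


def mse_proposed(n, w, t, p, pi_s_strata, pi_y):
    """MSE of the proposed estimator under proportional allocation."""
    pi_s = sum(wi * ps for wi, ps in zip(w, pi_s_strata))
    y = [yes_probability(t, p, ps, pi_y) for ps in pi_s_strata]
    variance = sum(wi * yi * (1 - yi) for wi, yi in zip(w, y)) / n
    return variance + ((1 - t) * (1 - p) * (pi_y - pi_s)) ** 2


n, t, p, pi_y = 10, 0.50, 0.60, 0.10
w, pi_s_strata = [0.9, 0.1], [0.08, 0.13]
pi_s = sum(wi * ps for wi, ps in zip(w, pi_s_strata))

proposed = mse_proposed(n, w, t, p, pi_s_strata, pi_y)
pre_vs_mangat = 100 * mse_mangat(n, t, p, pi_s, pi_y) / proposed
pre_vs_hong = 100 * mse_hong(n, w, p, pi_s_strata) / proposed
print(round(pre_vs_mangat, 2), round(pre_vs_hong, 2))
```

Other cells can be explored by changing the inputs, although agreement with the remaining tabulated values has not been checked here.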

Table 1.

Percentage Relative Efficiency of the Proposed Estimator ${\hat{π}}_{S T}$ (Under Proportional Allocation) With Respect to Mangat (1991) Estimator ${\hat{π}}_{m}$ .

n	π_S1	π_S2	T	w₁	w₂	π_S	π_y	P = 0.60	P = 0.63	P = 0.66	P = 0.69	P = 0.72	P = 0.75	P = 0.78	P = 0.81
10	0.08	0.13	0.50	0.90	0.10	0.09	0.10	100.18	100.19	100.19	100.20	100.21	100.22	100.23	100.23
				0.80	0.20	0.09	0.20	100.25	100.26	100.28	100.29	100.31	100.32	100.34	100.36
				0.70	0.30	0.10	0.30	100.25	100.27	100.29	100.31	100.33	100.36	100.38	100.41
				0.60	0.40	0.10	0.40	100.23	100.25	100.27	100.29	100.32	100.35	100.38	100.41
				0.50	0.50	0.11	0.50	100.19	100.21	100.23	100.25	100.28	100.31	100.34	100.38
				0.40	0.60	0.11	0.60	100.15	100.16	100.18	100.21	100.23	100.26	100.29	100.32
				0.30	0.70	0.12	0.70	100.11	100.12	100.14	100.15	100.17	100.20	100.22	100.25
				0.20	0.80	0.12	0.80	100.07	100.08	100.09	100.10	100.11	100.13	100.15	100.17
				0.10	0.90	0.13	0.90	100.03	100.04	100.04	100.05	100.06	100.06	100.07	100.09
20	0.18	0.23	0.70	0.90	0.10	0.19	0.10	100.12	100.12	100.12	100.13	100.13	100.13	100.13	100.14
				0.80	0.20	0.19	0.20	100.20	100.20	100.21	100.21	100.22	100.22	100.23	100.23
				0.70	0.30	0.20	0.30	100.24	100.25	100.26	100.26	100.27	100.28	100.28	100.29
				0.60	0.40	0.20	0.40	100.25	100.26	100.27	100.28	100.29	100.30	100.31	100.32
				0.50	0.50	0.21	0.50	100.23	100.24	100.26	100.27	100.28	100.29	100.30	100.31
				0.40	0.60	0.21	0.60	100.20	100.21	100.22	100.23	100.25	100.26	100.27	100.28
				0.30	0.70	0.22	0.70	100.15	100.16	100.17	100.19	100.20	100.21	100.22	100.24
				0.20	0.80	0.22	0.80	100.10	100.11	100.12	100.13	100.14	100.15	100.16	100.17
				0.10	0.90	0.23	0.90	100.05	100.06	100.06	100.07	100.07	100.08	100.08	100.09
30	0.28	0.33	0.90	0.90	0.10	0.29	0.10	100.10	100.10	100.10	100.10	100.11	100.11	100.11	100.11
				0.80	0.20	0.29	0.20	100.18	100.18	100.18	100.18	100.18	100.19	100.19	100.19
				0.70	0.30	0.30	0.30	100.23	100.23	100.24	100.24	100.24	100.24	100.24	100.24
				0.60	0.40	0.30	0.40	100.26	100.26	100.27	100.27	100.27	100.27	100.27	100.27
				0.50	0.50	0.31	0.50	100.27	100.27	100.27	100.27	100.28	100.28	100.28	100.28
				0.40	0.60	0.31	0.60	100.25	100.25	100.25	100.26	100.26	100.26	100.26	100.27
				0.30	0.70	0.32	0.70	100.21	100.22	100.22	100.22	100.22	100.23	100.23	100.23
				0.20	0.80	0.32	0.80	100.16	100.16	100.16	100.16	100.17	100.17	100.17	100.17
				0.10	0.90	0.33	0.90	100.09	100.09	100.09	100.09	100.09	100.09	100.09	100.10

Table 2.

Percentage Relative Efficiency of the Proposed Estimator ${\hat{π}}_{S T}$ (Under Proportional Allocation) With Respect to the Hong et al.’s Estimator ${\hat{π}}_{S}$ (Under Proportional Allocation).

n	π_S1	π_S2	T	w₁	w₂	π_S	π_y	P = 0.60	P = 0.63	P = 0.66	P = 0.69	P = 0.72	P = 0.75	P = 0.78	P = 0.81
10	0.08	0.13	0.50	0.90	0.10	0.09	0.10	7,577.80	4,407.32	2,843.48	1,958.54	1,409.31	1,045.01	790.91	606.55
				0.80	0.20	0.09	0.20	5,845.35	3,458.87	2,270.51	1,591.26	1,165.16	879.24	677.31	528.80
				0.70	0.30	0.10	0.30	4,542.13	2,733.13	1,824.81	1,301.02	969.29	744.35	583.61	463.86
				0.60	0.40	0.10	0.40	3,581.66	2,187.45	1,483.01	1,074.12	813.28	634.96	506.34	409.44
				0.50	0.50	0.11	0.50	2,872.14	1,776.57	1,220.65	896.56	688.84	546.06	442.38	363.61
				0.40	0.60	0.11	0.60	2,341.58	1,464.09	1,017.61	756.66	588.99	473.41	389.16	324.80
				0.30	0.70	0.12	0.70	1,938.57	1,223.23	858.66	645.37	508.22	413.62	344.59	291.75
				0.20	0.80	0.12	0.80	1,627.32	1,034.86	732.69	555.89	442.29	364.04	307.03	263.42
				0.10	0.90	0.13	0.90	1,383.08	885.44	631.58	483.16	387.97	322.60	275.15	238.99
20	0.18	0.23	0.70	0.90	0.10	0.19	0.10	4,208.34	2,459.11	1,597.95	1,111.64	810.49	611.21	472.59	372.32
				0.80	0.20	0.19	0.20	3,986.42	2,334.30	1,520.40	1,060.43	775.36	586.56	455.09	359.88
				0.70	0.30	0.20	0.30	3,680.53	2,168.22	1,420.67	996.74	733.06	557.79	435.27	346.19
				0.60	0.40	0.20	0.40	3,331.58	1,979.93	1,308.45	925.70	686.37	526.39	413.92	331.66
				0.50	0.50	0.21	0.50	2,974.29	1,785.88	1,192.22	851.86	637.75	493.71	391.75	316.64
				0.40	0.60	0.21	0.60	2,632.56	1,597.91	1,078.31	778.76	589.20	460.86	369.37	301.45
				0.30	0.70	0.22	0.70	2,319.76	1,423.18	970.86	708.85	542.19	428.69	347.25	286.33
				0.20	0.80	0.22	0.80	2,041.38	1,265.15	872.12	643.60	497.66	397.81	325.76	271.50
				0.10	0.90	0.23	0.90	1,797.90	1,124.73	782.97	583.75	456.17	368.60	305.15	257.10
30	0.28	0.33	0.90	0.90	0.10	0.29	0.10	3,071.46	1,807.97	1,185.58	833.87	615.93	471.60	371.12	298.37
				0.80	0.20	0.29	0.20	3,036.03	1,787.08	1,172.01	824.53	609.25	466.73	367.53	295.71
				0.70	0.30	0.30	0.30	2,990.47	1,761.22	1,155.77	813.67	601.71	461.36	363.65	292.91
				0.60	0.40	0.30	0.40	2,935.73	1,730.84	1,137.09	801.45	593.38	455.53	359.52	289.97
				0.50	0.50	0.31	0.50	2,872.89	1,696.46	1,116.25	787.99	584.33	449.29	355.15	286.90
				0.40	0.60	0.31	0.60	2,803.11	1,658.60	1,093.51	773.45	574.64	442.67	350.57	283.71
				0.30	0.70	0.32	0.70	2,727.58	1,617.83	1,069.16	757.97	564.40	435.73	345.80	280.43
				0.20	0.80	0.32	0.80	2,647.49	1,574.70	1,043.47	741.70	553.69	428.50	340.86	277.05
				0.10	0.90	0.33	0.90	2,563.98	1,529.74	1,016.72	724.78	542.57	421.03	335.78	273.58

It is observed from Tables 1 and 2 that:

The values of the percent relative efficiencies $PRE ({({\hat{π}}_{S T})}_{P}, {\hat{π}}_{m})$ and $PRE ({({\hat{π}}_{S T})}_{P}, {({\hat{π}}_{S})}_{P})$ are greater than 100 for all parameter values considered. It follows that the proposed estimator ${\hat{π}}_{S T}$ (under proportional allocation) is more efficient than both the estimator ${\hat{π}}_{S}$ (under proportional allocation, i.e., Hong et al.’s estimator) and the Mangat (1991) estimator ${\hat{π}}_{m}$ . It is also observed from Table 1 that the values of $PRE ({({\hat{π}}_{S T})}_{P}, {\hat{π}}_{m})$ increase as the value of P increases, whereas Table 2 shows that the values of $PRE ({({\hat{π}}_{S T})}_{P}, {({\hat{π}}_{S})}_{P})$ decrease as the value of P increases.

We further note from Tables 1 and 2 that the gain in efficiency from using the suggested estimator ${\hat{π}}_{S T}$ under proportional allocation is large relative to Hong et al.’s (1994) estimator ${\hat{π}}_{S}$ (PRE values between 238.99 and 7,577.80 in Table 2) and small relative to the Mangat (1991) estimator ${\hat{π}}_{m}$ (PRE values between 100.03 and 100.41 in Table 1). Thus, the proposed estimator ${\hat{π}}_{S T}$ under proportional allocation is to be preferred over Hong et al.’s (1994) estimator ${\hat{π}}_{S}$ and the Mangat (1991) estimator ${\hat{π}}_{m}$ .

Neyman Allocation

Under the assumption k = 2 (i.e., two strata in the population), P = P₁ = P₂ , T = T₁ = T₂ , MSE( ${\hat{π}}_{S}$ ) in equation (6) and MSE( ${\hat{π}}_{S T}$ ) in equation (16), respectively, reduce to:

MSE {({\hat{π}}_{S})}_{N} = \frac{1}{n} {[\sum_{i = 1}^{2} w_{i} {\{π_{S i} (1 - π_{S i}) + \frac{P (1 - P)}{{(2 P - 1)}^{2}}\}}^{1 / 2}]}^{2},

and

MSE {({\hat{π}}_{S T})}_{N} = \frac{1}{n} {[\sum_{i = 1}^{2} w_{i} \sqrt{Y_{i}^{*} (1 - Y_{i}^{*})}]}^{2} + {[(1 - T) (1 - P) (π_{y} - π_{S})]}^{2} .

From equations (17), (22), and (23), PRE of the proposed estimator ${\hat{π}}_{S T}$ (under Neyman allocation) with respect to Mangat (1991) estimator ${\hat{π}}_{m}$ and the estimator ${\hat{π}}_{S}$ (under Neyman allocation, i.e., Kim and Warde’s [2004] estimator) are, respectively, given by:

PRE ({({\hat{π}}_{S T})}_{N}, {\hat{π}}_{m}) = \frac{MSE ({\hat{π}}_{m})}{MSE {({\hat{π}}_{S T})}_{N}} \times 100,

and

PRE ({({\hat{π}}_{S T})}_{N}, {({\hat{π}}_{S})}_{N}) = \frac{MSE {({\hat{π}}_{S})}_{N}}{MSE {({\hat{π}}_{S T})}_{N}} \times 100.

We have computed the percent relative efficiencies $PRE ({({\hat{π}}_{S T})}_{N}, {\hat{π}}_{m})$ and $PRE ({({\hat{π}}_{S T})}_{N}, {({\hat{π}}_{S})}_{N})$ for various values of n, P, w₁ , w₂ , $π_{S 1}, π_{S 2}, π_{y}$ , and T. Results are compiled in Tables 3 and 4, respectively.

Table 3.

Percentage Relative Efficiency of the Proposed Estimator ${\hat{π}}_{S T}$ (Under Neyman Allocation) With Respect to Mangat (1991) Estimator ${\hat{π}}_{m}$ .

n	π_S1	π_S2	T	w₁	w₂	π_S	π_y	P = 0.60	P = 0.63	P = 0.66	P = 0.69	P = 0.72	P = 0.75	P = 0.78	P = 0.81
10	0.08	0.13	0.50	0.90	0.10	0.09	0.10	154.59	154.69	154.80	154.91	155.03	155.14	155.27	155.39
				0.80	0.20	0.09	0.20	237.83	236.80	235.86	235.03	234.31	233.71	233.22	232.87
				0.70	0.30	0.10	0.30	353.88	348.95	344.34	340.07	336.15	332.62	329.51	326.83
				0.60	0.40	0.10	0.40	480.13	468.22	456.94	446.33	436.44	427.31	419.01	411.61
				0.50	0.50	0.11	0.50	568.79	548.60	529.37	511.16	494.04	478.07	463.34	449.96
				0.40	0.60	0.11	0.60	579.74	553.51	528.50	504.77	482.37	461.39	441.92	424.08
				0.30	0.70	0.12	0.70	517.60	489.85	463.40	438.30	414.60	392.39	371.73	352.74
				0.20	0.80	0.12	0.80	420.53	394.99	370.71	347.71	326.02	305.69	286.80	269.44
				0.10	0.90	0.13	0.90	323.53	301.93	281.45	262.08	243.86	226.82	211.01	196.48
20	0.18	0.23	0.70	0.90	0.10	0.19	0.10	153.60	153.29	153.01	152.75	152.52	152.31	152.13	151.98
				0.80	0.20	0.19	0.20	222.11	222.17	222.23	222.29	222.36	222.42	222.49	222.55
				0.70	0.30	0.20	0.30	310.83	310.14	309.51	308.95	308.45	308.01	307.64	307.34
				0.60	0.40	0.20	0.40	400.89	397.58	394.51	391.70	389.13	386.83	384.79	383.01
				0.50	0.50	0.21	0.50	456.56	449.02	442.00	435.51	429.56	424.15	419.31	415.04
				0.40	0.60	0.21	0.60	451.36	439.75	428.90	418.84	409.57	401.12	393.52	386.77
				0.30	0.70	0.22	0.70	393.68	379.90	367.02	355.04	344.00	333.92	324.81	316.72
				0.20	0.80	0.22	0.80	314.38	300.62	287.75	275.78	264.74	254.65	245.53	237.41
				0.10	0.90	0.23	0.90	238.98	226.61	215.03	204.27	194.34	185.26	177.05	169.74
30	0.28	0.33	0.90	0.90	0.10	0.29	0.10	151.39	151.21	151.05	150.90	150.76	150.64	150.53	150.43
				0.80	0.20	0.29	0.20	220.00	219.95	219.90	219.85	219.81	219.77	219.74	219.72
				0.70	0.30	0.30	0.30	301.99	302.00	302.02	302.04	302.06	302.07	302.09	302.11
				0.60	0.40	0.30	0.40	374.79	374.69	374.61	374.53	374.46	374.40	374.35	374.31
				0.50	0.50	0.31	0.50	404.66	404.18	403.74	403.34	402.97	402.64	402.35	402.10
				0.40	0.60	0.31	0.60	374.96	373.97	373.05	372.20	371.44	370.74	370.13	369.58
				0.30	0.70	0.32	0.70	304.13	302.74	301.46	300.28	299.20	298.22	297.35	296.59
				0.20	0.80	0.32	0.80	224.94	223.40	221.98	220.67	219.47	218.39	217.43	216.58
				0.10	0.90	0.33	0.90	158.26	156.78	155.40	154.13	152.98	151.94	151.01	150.19

Table 4.

Percentage Relative Efficiency of the Proposed Estimator ${\hat{π}}_{S T}$ (Under Neyman Allocation) With Respect to the Kim and Warde’s (2004) Estimator ${\hat{π}}_{S}$ .

n	π_S1	π_S2	T	w₁	w₂	π_S	π_y	P = 0.60	P = 0.63	P = 0.66	P = 0.69	P = 0.72	P = 0.75	P = 0.78	P = 0.81
10	0.08	0.13	0.50	0.90	0.10	0.09	0.10	7,858.05	4,571.23	2,949.47	2,031.43	1,461.45	1,083.22	819.27	627.67
				0.80	0.20	0.09	0.20	6,406.69	3,771.49	2,463.41	1,718.18	1,252.27	940.72	721.44	560.74
				0.70	0.30	0.10	0.30	5,388.57	3,194.51	2,102.35	1,478.28	1,086.86	824.17	638.53	501.82
				0.60	0.40	0.10	0.40	4,636.66	2,759.77	1,824.49	1,289.50	953.58	727.87	568.13	450.30
				0.50	0.50	0.11	0.50	4,076.44	2,431.48	1,611.71	1,142.77	848.33	650.52	510.56	407.33
				0.40	0.60	0.11	0.60	3,667.46	2,189.93	1,453.86	1,032.99	768.89	591.58	466.22	373.86
				0.30	0.70	0.12	0.70	3,375.03	2,016.50	1,340.05	953.51	711.12	548.53	433.70	349.20
				0.20	0.80	0.12	0.80	3,165.09	1,891.62	1,257.89	895.99	669.23	517.25	410.02	331.21
				0.10	0.90	0.13	0.90	3,009.54	1,798.70	1,196.52	852.88	637.73	493.67	392.14	317.61
20	0.18	0.23	0.70	0.90	0.10	0.19	0.10	4,339.34	2,529.75	1,640.15	1,138.50	828.31	623.36	481.00	378.17
				0.80	0.20	0.19	0.20	4,083.15	2,390.36	1,556.32	1,084.90	792.70	599.14	464.34	366.70
				0.70	0.30	0.20	0.30	3,836.47	2,253.81	1,472.73	1,030.49	755.87	573.62	446.44	354.11
				0.60	0.40	0.20	0.40	3,600.82	2,121.40	1,390.44	976.09	718.48	547.29	427.66	340.69
				0.50	0.50	0.21	0.50	3,386.93	1,999.86	1,314.04	925.00	682.95	521.98	409.39	327.47
				0.40	0.60	0.21	0.60	3,208.06	1,897.50	1,249.25	881.37	652.38	500.03	393.43	315.84
				0.30	0.70	0.22	0.70	3,069.60	1,818.00	1,198.75	847.24	628.39	482.75	380.83	306.61
				0.20	0.80	0.22	0.80	2,966.46	1,758.71	1,161.06	821.75	610.46	469.83	371.39	299.70
				0.10	0.90	0.23	0.90	2,888.76	1,714.04	1,132.64	802.52	596.94	460.08	364.28	294.50
30	0.28	0.33	0.90	0.90	0.10	0.29	0.10	3,122.58	1,835.52	1,202.02	844.33	622.86	476.32	374.38	300.63
				0.80	0.20	0.29	0.20	3,081.61	1,812.87	1,188.17	835.31	616.74	472.06	371.38	298.50
				0.70	0.30	0.30	0.30	3,029.53	1,783.73	1,170.11	823.40	608.57	466.31	367.27	295.57
				0.60	0.40	0.30	0.40	2,966.56	1,748.17	1,147.89	808.61	598.33	459.05	362.05	291.80
				0.50	0.50	0.31	0.50	2,898.62	1,709.59	1,123.64	792.39	587.04	450.99	356.23	287.57
				0.40	0.60	0.31	0.60	2,835.77	1,673.78	1,101.07	777.25	576.47	443.43	350.75	283.59
				0.30	0.70	0.32	0.70	2,785.89	1,645.36	1,083.15	765.23	568.08	437.43	346.39	280.42
				0.20	0.80	0.32	0.80	2,750.59	1,625.31	1,070.54	756.79	562.20	433.23	343.35	278.21
				0.10	0.90	0.33	0.90	2,726.92	1,611.94	1,062.19	751.23	558.35	430.49	341.38	276.79

The values of $PRE ({({\hat{π}}_{S T})}_{N}, {\hat{π}}_{m})$ and $PRE ({({\hat{π}}_{S T})}_{N}, {({\hat{π}}_{S})}_{N})$ are larger than 100 for all values of n, $π_{S 1}, π_{S 2}, π_{y}, w_{1}, w_{2}, P$ , and T considered here. So, we can say that the proposed estimator ${\hat{π}}_{S T}$ (under Neyman allocation) is more efficient than both Kim and Warde’s (2004) estimator ${\hat{π}}_{S}$ (under Neyman allocation) and the Mangat (1991) estimator ${\hat{π}}_{m}$ . We note from Table 3 that the values of $PRE ({({\hat{π}}_{S T})}_{N}, {\hat{π}}_{m})$ increase with P in only a few configurations (e.g., n = 10, w₁ = 0.90) and decrease as P increases in most of the others. Table 4 demonstrates that the values of $PRE ({({\hat{π}}_{S T})}_{N}, {({\hat{π}}_{S})}_{N})$ decrease as the value of P increases. It is further observed from Tables 3 and 4 that there is a substantial gain in efficiency from using the proposed estimator ${\hat{π}}_{S T}$ under Neyman allocation over Kim and Warde’s (2004) estimator ${\hat{π}}_{S}$ as well as the Mangat (1991) estimator ${\hat{π}}_{m}$ . Thus, our recommendation is to use the proposed estimator ${\hat{π}}_{S T}$ (under Neyman allocation) over Kim and Warde’s (2004) estimator ${\hat{π}}_{S}$ and the Mangat (1991) estimator ${\hat{π}}_{m}$ in the presence of prior information on π_S1 and π_S2.

To assess the performance of the proposed estimator ${\hat{π}}_{S T}$ under Neyman allocation relative to that under proportional allocation, we have computed the PRE of ${\hat{π}}_{S T}$ under Neyman allocation with respect to ${\hat{π}}_{S T}$ under proportional allocation by using the formula:

PRE ({({\hat{π}}_{S T})}_{N}, {({\hat{π}}_{S T})}_{P}) = \frac{MSE {({\hat{π}}_{S T})}_{P}}{MSE {({\hat{π}}_{S T})}_{N}} \times 100,

for different values of n, P, w₁ , w₂ , $π_{S 1}, π_{S 2}, π_{y}$ , and T.

The findings are shown in Table 5.

Table 5.

Percentage Relative Efficiency of the Proposed Estimator ${\hat{π}}_{S T}$ Under Neyman Allocation With Respect to ${\hat{π}}_{S T}$ Under Proportional Allocation.

n	π_S1	π_S2	T	w₁	w₂	π_S	π_y	P = 0.60	P = 0.63	P = 0.66	P = 0.69	P = 0.72	P = 0.75	P = 0.78	P = 0.81
10	0.08	0.13	0.50	0.90	0.10	0.09	0.10	154.31	154.40	154.50	154.60	154.70	154.81	154.92	155.03
				0.80	0.20	0.09	0.20	237.25	236.18	235.22	234.35	233.60	232.95	232.43	232.03
				0.70	0.30	0.10	0.30	352.99	348.01	343.34	339.01	335.03	331.44	328.25	325.49
				0.60	0.40	0.10	0.40	479.04	467.07	455.71	445.03	435.05	425.83	417.44	409.93
				0.50	0.50	0.11	0.50	567.72	547.46	528.16	509.87	492.65	476.59	461.76	448.27
				0.40	0.60	0.11	0.60	578.88	552.60	527.53	503.74	481.27	460.21	440.66	422.73
				0.30	0.70	0.12	0.70	517.05	489.26	462.77	437.63	413.89	391.62	370.91	351.86
				0.20	0.80	0.12	0.80	420.24	394.69	370.39	347.36	325.65	305.30	286.38	268.98
				0.10	0.90	0.13	0.90	323.43	301.82	281.33	261.96	243.73	226.68	210.85	196.31
20	0.18	0.23	0.70	0.90	0.10	0.19	0.10	153.42	153.10	152.82	152.56	152.32	152.12	151.93	151.77
				0.80	0.20	0.19	0.20	221.66	221.71	221.77	221.82	221.87	221.93	221.98	222.04
				0.70	0.30	0.20	0.30	310.08	309.37	308.72	308.14	307.62	307.16	306.77	306.45
				0.60	0.40	0.20	0.40	399.89	396.55	393.45	390.61	388.02	385.68	383.61	381.80
				0.50	0.50	0.21	0.50	455.49	447.93	440.88	434.35	428.37	422.93	418.05	413.75
				0.40	0.60	0.21	0.60	450.47	438.83	427.96	417.86	408.57	400.09	392.45	385.67
				0.30	0.70	0.22	0.70	393.08	379.28	366.38	354.39	343.32	333.22	324.09	315.97
				0.20	0.80	0.22	0.80	314.05	300.29	287.41	275.43	264.38	254.27	245.14	237.01
				0.10	0.90	0.23	0.90	238.86	226.48	214.90	204.14	194.20	185.12	176.90	169.59
30	0.28	0.33	0.90	0.90	0.10	0.29	0.10	151.24	151.06	150.89	150.74	150.60	150.48	150.37	150.27
				0.80	0.20	0.29	0.20	219.61	219.55	219.50	219.45	219.41	219.37	219.33	219.30
				0.70	0.30	0.30	0.30	301.29	301.30	301.31	301.32	301.34	301.35	301.36	301.37
				0.60	0.40	0.30	0.40	373.81	373.71	373.61	373.53	373.45	373.39	373.33	373.28
				0.50	0.50	0.31	0.50	403.58	403.10	402.65	402.24	401.86	401.53	401.23	400.97
				0.40	0.60	0.31	0.60	374.02	373.03	372.10	371.25	370.47	369.77	369.15	368.60
				0.30	0.70	0.32	0.70	303.48	302.09	300.80	299.62	298.53	297.55	296.68	295.91
				0.20	0.80	0.32	0.80	224.59	223.05	221.62	220.31	219.11	218.03	217.06	216.21
				0.10	0.90	0.33	0.90	158.13	156.64	155.26	153.99	152.84	151.79	150.86	150.05

We note from Table 5 that the values of $PRE ({({\hat{π}}_{S T})}_{N}, {({\hat{π}}_{S T})}_{P})$ exceed 100 for every configuration considered; they increase with P in only a few configurations (e.g., n = 10, w₁ = 0.90) and decrease as P increases in most of the others. We further note from the results of Table 5 that there is a gain in efficiency from using the suggested estimator ${\hat{π}}_{S T}$ under optimum (Neyman) allocation rather than under proportional allocation.

Discussion

This article addresses the problem of estimating the proportion π_S of the population belonging to a sensitive group using an optional RRT in stratified sampling. A stratified optional randomized response model based on the Mangat (1991) model has been proposed. It has been shown numerically that the proposed randomized response model is more efficient than the Hong et al. (1994), Mangat (1991), and Kim and Warde (2004) randomized response models.

Footnotes

Acknowledgments

The authors are grateful to the editor in chief and to the learned referees for their valuable suggestions regarding improvement of this article.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Cochran, W. G. 1977. Sampling Techniques. 3rd ed. New York: John Wiley.

Fox, J. A., and Tracy, P. E. 1986. Randomized Response: A Method of Sensitive Surveys. Newbury Park, CA: Sage.

Greenberg, Abul-Ela, Simmons, W. R., and Horvitz, D. G. 1969. “The Unrelated Question Randomized Response: Theoretical Framework.” Journal of the American Statistical Association 64:529–39.

Hong, Yum, and Lee. 1994. “A Stratified Randomized Response Technique.” Korean Journal of Applied Statistics 7:141–47.

Kim, J. M., and Elam, M. E. 2005. “A Two-Stage Stratified Warner’s Randomized Response Model Using Neyman Allocation.” Metrika 61:1–7.

Kim, J. M., and Elam, M. E. 2007. “A Stratified Unrelated Randomized Response Model.” Statistical Papers 48:215–33.

Kim, J. M., and Warde, W. D. 2004. “A Stratified Warner Randomized Response Model.” Journal of Statistical Planning and Inference 120:155–65.

Kim, J. M., and Warde, W. D. 2005. “A Mixed Randomized Response Model.” Journal of Statistical Planning and Inference 133:211–21.

Mangat, N. S. 1991. “An Optional Randomized Response Sampling Technique Using Non-stigmatized Attribute.” Statistica 51:595–602.

Mangat, N. S. 1994. “An Improved Randomized Response Strategy.” Journal of the Royal Statistical Society B 56:93–95.

Singh, H. P., and Tarray, T. A. 2012. “A Stratified Unknown Repeated Trials in Randomized Response Sampling.” Communication of the Korean Statistical Society 19:751–59.

Singh, H. P., and Tarray, T. A. 2013. “An Alternative to Kim and Warde’s Mixed Randomized Response Technique.” Statistica 73:379–402.

Singh, H. P., and Tarray, T. A. 2014a. “A Stratified Tracy and Osahan’s Two-Stage Randomized Response Model.” Communications in Statistics-Theory and Methods. doi:10.1080/03610926.2014.895839.

Singh, H. P., and Tarray, T. A. 2014b. “A Dexterous Randomized Response Model for Estimating a Rare Sensitive Attribute Using Poisson Distribution.” Statistics & Probability Letters 90:42–45.

Singh, H. P., and Tarray, T. A. 2015a. “A Dexterous Randomized Response Model for Estimating a Rare Sensitive Attribute Using Poisson Distribution.” Statistics & Probability Letters 90:42–45.

Singh, H. P., and Tarray, T. A. 2015b. “A Revisit to the Singh, Horn, Singh and Mangat’s Randomization Device for Estimating a Rare Sensitive Attribute Using Poisson Distribution.” Model Assisted Statistics and Applications 10:129–38.

Singh, H. P., and Tarray, T. A. 2015c. “Two-Stage Stratified Partial Randomized Response Strategies.” Communications in Statistics-Theory and Methods. doi:10.1080/03610926.2013.804571.

Tarray, T. A., and Singh, H. P. 2015. “A Proficient Randomized Response Model.” Istatistika: Jour. Turkey Statist. Assoc 8:1–12.

Tracy, D. S., and Mangat, N. S. 1995. “A Partial Randomized Response Strategy.” Test 4:315–21.

Warner, S. L. 1965. “Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias.” Journal of the American Statistical Association 60:63–69.