Negative adaptive cluster sampling

Abstract

Traditional sampling methods such as simple random sampling and stratified sampling are poorly suited to the study of rare and clustered populations. Such populations are frequently observed in the ecological, environmental and social sciences. In these situations, auxiliary information is often collected along with the variable of interest, and one would like to exploit it as fully as possible. We consider an auxiliary variable which is highly negatively correlated with the variable of interest. An initial random sample of a fixed size is drawn from the population under study. Networks are then formed around the units selected in this sample that satisfy a pre-specified condition with respect to the auxiliary variable, following the procedure given by Thompson (1990). The variable of interest is measured for the units included in these networks. In such a situation, negative adaptive cluster sampling (NACS) is of more practical interest than the conventional sampling designs: it can provide a more informative sample for the investigator and more efficient estimates of the population parameters of interest. The population parameters are estimated from the values of the variable of interest for the units included in the different networks. Several estimators of the population total of the variable of interest are proposed in this article. Their performance is compared using data collected in a pilot study conducted with the NACS method.

Keywords

Auxiliary variable; negative adaptation; NACS; Hansen-Hurwitz (HH) and Horvitz-Thompson (HT) type estimators; modified ratio, regression and product type estimators

1. Introduction

In many real life situations, it is required to estimate either the population mean or population total. In general, traditional sampling methods such as simple random sampling (SRS), stratified sampling etc. are used to draw a sample from the population and to estimate the population mean/total.

If, however, the population under study is rare and patchy with respect to the variable of interest, the traditional methods lead to poor estimates. The values of the variable of interest may be zero for many of the units selected in the sample, which leads to an underestimate of the population mean/total. On the other hand, the units for which the value of the variable of interest is large are likely to be clustered. Such patterns of clustering and patchiness are observed in many animal populations, in vegetation types and in epidemiological studies of rare and contagious diseases. In such situations, investigators depart from the predefined sampling plan and add nearby or associated units to the sample.

A design which uses the information gathered from earlier sampled units to draw or include the next unit in the sample is called an adaptive sampling design. Thompson (1990) introduced the adaptive cluster sampling (ACS) design. In this design, whenever the observed value of a selected unit satisfies a condition of interest, additional units are added to the sample from the neighborhood of that unit. This design is discussed in Section 2 of this article. Although the ACS design is appropriate for sampling from a rare and clustered population, it suffers from the drawback of losing control of the final sample size. Several suggestions have been made for limiting the final sample size of an adaptive cluster sample. For instance, Salehi and Seber (1997) suggested a two stage sampling design in which primary units are selected using a conventional design and secondary units within the selected primary units are subsampled using ACS. Brown (1994) proposed a design in which the initial sample is selected sequentially until the final sample size reaches a specified value. Lee (1998) developed a two phase design, in which the first phase sample is an ACS sample based on an auxiliary variable and the second phase sample is selected from the first phase using a probability proportional to size (PPS) with replacement sampling design. This design controls the number of measurements of the study variable but not that of the auxiliary variable. Salehi and Seber (2002) proposed an estimator of the population mean. Bahl and Tuteja (1991) proposed ratio and product type exponential estimators for estimating the mean of a finite population.

In Section 3 of this article, we have described adaptive cluster double sampling (ACDS) proposed by Martin Medina and Steven Thompson (2004). It is a method based on combining the idea of double sampling and ACS. It requires the availability of an inexpensive and easy to measure auxiliary variable.

In Section 4, we have proposed a new method, negative adaptive cluster sampling (NACS). In this method, the process of adding the units to the initial sample is different than that of ACS. In this sampling design, we consider two highly negatively correlated variables $X$ and $Y$ . $X$ is highly abundant in the population whereas $Y$ is rare. Taking observations on $X$ is rather economical and easy as compared to $Y$ . We have proposed the different estimators of population total and have derived variances of these estimators.

The pilot study is presented in Section 5. In Section 6, we have compared the performance of the estimators proposed in Section 4 on the basis of the data from pilot study. Lastly, the concluding remarks are incorporated.

2. Adaptive cluster sampling

In this design, we start with a rare and clustered population of $N$ units, ( $U_{1},U_{2},\ldots,U_{N}$ ) indexed by labels ( $1,2,\ldots,N$ ). A grid is formed of these units. Every unit in the population is assumed to be measurable with respect to variable of interest $Y$ .

Let $Y=(Y_{1},Y_{2},\ldots,Y_{N})$ be the vector of $Y$ values associated with the units $(U_{1},U_{2},\ldots,U_{N})$ respectively. The population total $\tau_{y}=\sum\nolimits_{i=1}^{N}Y_{i}$ is to be estimated. For that purpose, an initial sample of size $n$ is drawn by using any of the traditional sampling methods. Thompson (1990) used simple random sampling without replacement (SRSWOR). Whenever a selected unit is observed to satisfy the pre-decided condition $C$ , the adjacent neighboring units to the left, right, above and below are added to the sample. Further, if any of these units satisfy the condition $C$ , their neighbors are also added to the sample. This process is continued until the newly added neighboring units no longer satisfy the condition $C$ . The resulting final sample is called an adaptive sample. A unit of the initial sample that satisfies $C$ , together with the neighboring units reached from it that satisfy $C$ , constitutes a network. These units are called the network units. The units which do not satisfy $C$ but get added as neighbors of network units are called edge units. The collection of network units along with the edge units is called a cluster. These clusters are formed by the adaptive sampling procedure. Hence the design is called the ACS design.
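
The growth of a cluster from one initial unit can be illustrated with a short sketch. It is only a minimal illustration, assuming a rectangular grid stored as a list of lists and the four-neighbor rule described above; the names `grow_cluster`, `satisfies` and `counts` are hypothetical and are not part of Thompson's (1990) notation.

```python
from collections import deque


def grow_cluster(grid, start, satisfies):
    """Return (network, edge_units) grown from `start` on a rectangular grid.

    grid      : list of lists holding the observed values
    start     : (row, col) of a unit selected in the initial sample
    satisfies : function value -> bool, standing in for the condition C
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = start
    if not satisfies(grid[r0][c0]):
        # A unit that fails C is treated as a network of size one, no edge units.
        return {start}, set()
    network, edge = {start}, set()
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        # Four neighbors: above, below, left, right.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            cell = (nr, nc)
            if cell in network or cell in edge:
                continue
            if satisfies(grid[nr][nc]):
                network.add(cell)   # network unit: keep expanding from it
                queue.append(cell)
            else:
                edge.add(cell)      # edge unit: observed, but not expanded
    return network, edge


# Hypothetical 4 x 4 population; condition C is "value greater than zero".
counts = [
    [0, 0, 0, 0],
    [0, 5, 3, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 0],
]
network, edge = grow_cluster(counts, (1, 1), lambda y: y > 0)
```

In this toy grid the network consists of the three non-zero cells, and the cluster is the union of `network` and `edge`.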

This design differs from the classical sampling designs in the sense that the selection procedure depends upon the observed values of the variable of interest. The advantages of such sampling designs were described by Basu (1969) and Zacks (1969). Cassel et al. (1977) summarized the subsequent literature on designs that make use of observed values.

Classical estimators such as the sample mean or the mean of cluster means are biased when they are used with an adaptive design. Thompson (1990) developed unbiased estimators of the population mean and total, along with unbiased estimators of the variances of these estimators. These estimators are design unbiased; that is, their unbiasedness depends on the sample selection procedure rather than on assumptions about the population.

3. Adaptive cluster double sampling

ACS, introduced by Thompson (1990), has been found appropriate for sampling rare and clustered populations, but it suffers from the drawback of losing control of the final sample size. There have been several suggestions for limiting the final sample size of adaptive cluster samples; some of them are mentioned in Section 1. In the two phase design of Lee (1998) described there, travelling costs are increased because the second phase sample is selected after the first phase sample is completed. In the second phase of that design, the sampler cannot allocate the subsample near the places of interest. Moreover, the proposed unbiased estimators of the population mean do not take advantage of the relation between the variable of interest and the auxiliary variable. Felix et al. (2004) proposed a multiphase variant of ACS, obtained by combining the ideas of double sampling and ACS. It is called adaptive cluster double sampling (ACDS). In this design an auxiliary variable which is easy and inexpensive to measure is considered. This variable is used to select the first phase adaptive cluster sample. The network structure of this first phase sample is used to select the subsequent subsamples, which are selected using a conventional design. Values of the variable of interest are recorded only for the units selected in the final phase subsample, and the population mean is estimated by a regression type estimator. ACDS thus allows the sampler to overcome these drawbacks.

3.1 Procedure of ACDS

Let $U=(U_{1},U_{2},\ldots,U_{N})$ denote the finite population of $N$ units. $Y$ and $X$ are the variable of interest and auxiliary variable respectively. The values of $X$ and $Y$ associated with $U_{i}$ are denoted by $(X_{i},Y_{i})$ for $i=1,2,\ldots,N$ . Suppose that no information is available about the auxiliary variable before starting the sampling stage.

It is required to estimate the population total of $y$ -values given by

$\displaystyle\tau_{y}=\sum\nolimits_{i=1}^{N}Y_{i}$

In the first phase of ACDS, an ordinary adaptive cluster sample $S_{1}$ based on the values of $X$ is selected. It is implicitly assumed that the condition $C_{X}$ for additional sampling and the neighbors of each unit of $U$ are well defined. These definitions partition $U$ into $K$ networks $\{\Psi_{1},\Psi_{2},\Psi_{3},\ldots,\Psi_{K}\}$ .

Let $S_{0}$ denote the initial sample that is used to select adaptive sample $S_{1}$ . Let $n$ be the size of $S_{0}$ .

In the second phase, the sample $S_{2}$ is selected by using conventional method containing $K_{1}$ networks from the $K$ different networks intersected by $S_{0}$ . If $S_{2}$ is selected with replacement then the number of distinct networks in $S_{2}$ may be less than $K_{1}$ . These networks are denoted by $\{\Psi_{1},\Psi_{2},\Psi_{3},\ldots,\Psi_{k_{1}}\}$ .

The third phase consists of selecting a conventional subsample of units from each of the distinct networks in $S_{2}$ . Further, the $y$ -values associated with every unit in these subsamples are recorded. The ${k}_{2}$ subsamples of units are denoted by $S_{3_{i}}$ , $i=1,2,\ldots,k_{2}$ and they are assumed to be independently selected.

In this procedure the $X$ -value associated with every unit in the adaptive cluster sample ${S}_{1}$ has to be measured. Hence, the procedure does not control the number of measurements of the auxiliary variable but that of the variable of interest only. The measurements of the auxiliary variable are easy and inexpensive. Hence, one can use a relatively large initial sample which will increase the probability of intersecting networks with units satisfying the condition $C_{X}$ and that will improve the efficiency of the estimators. Using this method, Felix et al. (2004) have proposed a regression type estimator of the population mean of $Y$ .

4. Negative adaptive cluster sampling

Thompson (1990) introduced the idea of ACS, but this method has the drawback of an excessively large final sample size. If the selection, acquisition and measurement of the units in the population are difficult and expensive with respect to the variable of interest, then one has to think about some alternative sampling procedure. Felix et al. (2004) introduced ACDS.

They considered an auxiliary variable along with the variable of interest. The mean of the variable of interest can be estimated either by using ACDS or by using the ratio estimator in ACS (Dryver & Chao, 2007; Chutiman & Chiangpradit, 2014).

If $X$ and $Y$ are positively correlated then we do not get abundant auxiliary information, because the population under study is rare and clustered. Hence NACS cannot be used in such a situation.

In this article, we propose a new sampling design in which we consider two negatively correlated variables.

Consider some practical situations where such a negative relationship is observed.

1. The plateaus in the Western Ghats of the Sahyadri from Goa to Varandha Ghat (Bhor, Maharashtra, India) are rich in the aluminum ore laterite, because of which thorny plants are rarely observed there. Such plants are abundant on basalt plateaus. There are, however, some rare patches of thorny plants, which indicate the absence of aluminum in that part. If the interest is to estimate the total number of thorny plants in that area, NACS can be used effectively. Here the aluminum content of the soil is the auxiliary variable, and its presence can be detected easily.
2. The plateaus in the Western Ghats (Maharashtra, India) of the Sahyadri from Tamhini Ghat to Mumbai are dominated by basalt rocks. In this area Neem, Ziziphus and other thorny plants are highly abundant, but there are intermediate patches of semi-evergreen plants. The total number of these plants can be estimated by using NACS.
3. Suppose we want to estimate the total population of fish in a specified region under the sea. Let this region be subdivided to form a grid of locations. Bushes of a specific variety of plant are plentiful in the sea water, and the fish are repelled by them. There are, however, some rare sea plants in that region which provide food for the fish, and the fish are attracted towards these plants, so the fish present at these locations can be counted. Here we take the number of bushes as the auxiliary variable and the number of fish as the variable of interest. Further, it is observed that if the number of bushes of the specific variety at a location is greater than $C$ , then no fish will be found at that location.

The situations presented above show the negative correlation between the two variables.

In such a situation, we propose the following sampling design. In ACS, each unit in the initial sample is checked against the desired condition $C$ on the variable of interest, and networks are expanded around the units in the initial sample that satisfy $C$ . Here, we propose a different adaptive procedure. The variables are negatively correlated and the adaptive procedure involves the auxiliary variable instead of the variable of interest. We get clusters of units during the adaptation. Hence, this method is called negative adaptive cluster sampling.

In ACDS, the first phase units are determined by adaptation on an auxiliary variable, and the second phase units are then selected by some traditional method such as SRSWOR. In NACS, the adaptation is used to discover the networks in the population with reference to the auxiliary variable, and the networks corresponding to the variable of interest are then identified. There is no second phase in NACS. This is how NACS differs from ACDS, so in general NACS is not ACDS. NACS can, however, be looked upon as a particular case of ACDS in which the networks identified in the first phase, corresponding to the variable of interest, are taken as the second phase units.

Secondly, ACDS does not depend on the type of relationship between the auxiliary variable and the variable of interest. In contrast, NACS requires a negative relationship between them. The networks corresponding to the auxiliary variable and the variable of interest are discovered by using exactly opposite conditions on the two variables. Hence, the design is called NACS. The use of an auxiliary variable in the first phase sampling is justified by ACDS. We use the auxiliary information in NACS for the purpose of adaptation, and we assume that the population information on the auxiliary variable is known. In NACS, the networks are formed by using ACS with the auxiliary information. The corresponding $Y$ is observed only for those units which satisfy the condition $C_{X}$ . Since the population is rare and clustered, this gives a substantial reduction in sample size with respect to $Y$ . This reduced sample size is called the effective sample size.

Consider a population of $N$ units which can be observed and measured with respect to variables $X$ and $Y$ which are negatively correlated. Suppose the population is rare with respect to the variable of interest ( $Y$ ); equivalently we can say that it is highly abundant with respect to the auxiliary variable $X$ . Taking observations on $X$ is easy and inexpensive. The procedure of NACS is as follows:

Form a grid of population containing $N$ grid points of equal size and shape. Draw an initial sample of size $n$ grid points from this grid using simple random sampling without replacement (SRSWOR) or simple random sampling with replacement (SRSWR) method.

Check whether each of the selected units satisfies the condition $C_{X}$ . For each unit in the initial sample that satisfies $C_{X}$ , add the units to its left, right, above and below. These units are called the neighbors of that unit. If any of these neighbors satisfy the condition $C_{X}$ , add their neighbors to the sample as well. Continue in this way until the newly added neighbors no longer satisfy the condition $C_{X}$ . The set of neighboring units satisfying the condition ${C}_{X}$ , along with the corresponding unit of the initial sample that satisfies ${C}_{X}$ , constitutes a network. Thus in this design the networks are formed around the units selected in the initial sample that satisfy $C_{X}$ . Note that a unit selected in the initial sample which does not satisfy the condition $C_{X}$ forms a network of size one.

Suppose $K$ distinct clusters are formed with respect to $X$ population. A cluster includes the units in a network and the corresponding edge units. Edge units do not satisfy the condition $C_{X}$ . If all edge units in a cluster are dropped we get a network. From the $K$ clusters, we get the $K$ networks.

Observe the values of the variable of interest for all the units in these $K$ networks. The population total of $Y$ can then be estimated by using the following estimators, and estimates of their standard errors can be obtained. If we drop the auxiliary information and use the modified Hansen-Hurwitz and Horvitz-Thompson estimators, NACS reduces to ACS.
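
The steps above can be sketched in code. This is a minimal sketch under stated assumptions: the grid is rectangular, the initial sample is given as a list of cells, and the condition $C_{X}$ is passed in as a function. The names `nacs_networks`, `x_grid`, `y_grid` and the threshold `x <= 2` are hypothetical and chosen only for illustration.

```python
def nacs_networks(x_grid, initial_sample, c_x):
    """Return the distinct networks, defined on X, hit by the initial sample.

    Each network is a frozenset of (row, col) cells, so a network reached
    from two different initial units is stored only once.
    """
    rows, cols = len(x_grid), len(x_grid[0])
    networks = set()
    for start in initial_sample:
        network, stack = {start}, [start]
        while stack:
            r, c = stack.pop()
            if not c_x(x_grid[r][c]):
                # Only an initial unit can reach this branch: it fails C_X
                # and therefore stays a network of size one.
                continue
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and (nr, nc) not in network and c_x(x_grid[nr][nc]):
                    network.add((nr, nc))
                    stack.append((nr, nc))
        networks.add(frozenset(network))
    return networks


# Hypothetical data: X is abundant (9) except in one patch, where Y is present.
x_grid = [[9, 9, 9, 9], [9, 1, 0, 9], [9, 9, 2, 9], [9, 9, 9, 9]]
y_grid = [[0, 0, 0, 0], [0, 4, 6, 0], [0, 0, 3, 0], [0, 0, 0, 0]]
initial_sample = [(0, 0), (1, 2), (2, 2)]

networks = nacs_networks(x_grid, initial_sample, lambda x: x <= 2)
# Y is read for every unit of the K networks found on X.
y_totals = {net: sum(y_grid[r][c] for r, c in net) for net in networks}
```

Here two of the three initial units fall in the same network, so only two distinct networks are obtained, and $Y$ never has to be measured on the edge units.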

4.1 Modified Hansen-Hurwitz (HH) type estimator

It is based on draw by draw probabilities that a unit’s network is intersected by the initial sample. Let $n$ denote the initial sample size and $V$ denote the final adaptive sample size. The initial sample is selected by using SRSWR. Let $\Psi_{i}$ denote the network that includes unit $i$ . Let $m_{i}$ be the number of units in this network. The HH type estimator of the population total $\tau_{y}$ is given as:

$\displaystyle(\hat{\tau}_{y})_{HH}=\frac{N}{n}\sum\nolimits_{i=1}^{n}\overline{y}_{i}$ (1)

Where $\overline{y}_{i}$ is the average of the $y$ -values in the network that includes unit $i$ of the initial sample.

That is,

$\displaystyle\overline{y}_{i}=\frac{1}{m_{i}}\sum\limits_{j\in\Psi_{i}}y_{j},\quad i=1,2,\ldots,n$ (2)

Variance of $(\hat{\tau}_{y})_{HH}$ is:

$\displaystyle V{(\hat{\tau}_{y})}_{HH}=\frac{N(N-n)}{n(N-1)}\sum\limits_{i=1}^{N}\left(\overline{y}_{i}-\frac{\tau_{y}}{N}\right)^{2}$ (3)

The unbiased estimator of $V{(\hat{\tau}_{y})}_{HH}$ is:

$\displaystyle\hat{V}{(\hat{\tau}_{y})}_{HH}=\frac{N(N-n)}{n(n-1)}\sum\limits_{i=1}^{n}\left(\overline{y}_{i}-\frac{(\hat{\tau}_{y})_{HH}}{N}\right)^{2}$ (4)
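
As an illustration, Eqs. (1) and (4) can be evaluated directly from the network means attached to the initial sample units. The sketch below follows the two equations as printed; the function names and the numbers in the example are hypothetical.

```python
def hh_total(network_means, N):
    """Eq. (1): N/n times the sum of the network means of the initial units."""
    n = len(network_means)
    return N / n * sum(network_means)


def hh_variance_estimate(network_means, N):
    """Eq. (4): variance estimator of the HH type estimator of the total."""
    n = len(network_means)
    mean_hat = hh_total(network_means, N) / N
    ss = sum((w - mean_hat) ** 2 for w in network_means)
    return N * (N - n) / (n * (n - 1)) * ss


# Hypothetical example: N = 16 grid units, n = 4 initial units, and the
# mean of the y-values in the network of each initial unit.
w = [0.0, 2.0, 2.0, 0.0]
tau_hat = hh_total(w, N=16)
var_hat = hh_variance_estimate(w, N=16)
```

Note that one network mean is entered per initial unit, so a network hit twice by the initial sample contributes twice to the sum in Eq. (1).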

4.2 Modified Horvitz-Thompson (HT) type estimator

It is based on probabilities of the initial sample intersecting networks. The initial sample of size $n$ is drawn by using SRSWOR. The unbiased estimator of the population total $\tau_{y}$ is given as:

$\displaystyle{(\hat{\tau}_{y})}_{HT}=\sum\limits_{k=1}^{K}\frac{y_{k.}}{\alpha_{k}}I_{k}$ (5)

Where $y_{k.}$ is the sum of the $y$ -values in the network $k$ .

That is,

$\displaystyle y_{k.}=\sum\limits_{i\in\Psi_{k}}y_{i}$

$I_{k}$ is the indicator variable defined as follows:

$\displaystyle{I}_{k}=\left\{\begin{array}{ll}1&\text{if the initial sample intersects the network }k\\ 0&\text{otherwise}.\end{array}\right.$

The probability that the network $k$ is included in the sample is

$\displaystyle\alpha_{k}=1-\left[\frac{\binom{N-m_{k}}{n}}{\binom{N}{n}}\right],\quad k=1,2,\ldots,K.$ (6)

The probability that the networks $j$ and $k$ are both included in the sample is given by

$\displaystyle\alpha_{jk}=1-\left[\frac{\binom{N-m_{j}}{n}+\binom{N-m_{k}}{n}-\binom{N-m_{j}-m_{k}}{n}}{\binom{N}{n}}\right],\quad j,k=1,2,\ldots,K.$ (7)

Where $m_{k}$ is the number of units in the network $k$ in the population.

Note that, ${\alpha}_{jj}={\alpha}_{j}$ .

Variance of ${(\hat{\tau}_{y})}_{HT}$ is given by:

$\displaystyle V{(\hat{\tau}_{y})}_{HT}=\sum\limits_{j=1}^{K}\sum\limits_{k=1}^{K}\frac{\left({\alpha}_{jk}-{\alpha}_{j}{\alpha}_{k}\right)}{{\alpha}_{j}{\alpha}_{k}}y_{j.}y_{k.}$ (8)

The unbiased estimator of $V{(\hat{\tau}_{y})}_{HT}$ is given by:

$\displaystyle\hat{V}{(\hat{\tau}_{y})}_{HT}=\sum\limits_{j=1}^{K}\sum\limits_{k=1}^{K}\frac{\left({\alpha}_{jk}-{\alpha}_{j}{\alpha}_{k}\right)}{{\alpha}_{j}{\alpha}_{k}{\alpha}_{jk}}y_{j.}y_{k.}$ (9)

Summation is taken over the distinct networks included in the sample.
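
The inclusion probabilities in Eqs. (6) and (7) and the estimators in Eqs. (5) and (9) are simple to compute once the size and $y$ -total of each distinct sampled network are known. The following is a minimal sketch; the function names and the example data are hypothetical, and each network is assumed to be supplied as a pair (network total, network size).

```python
from math import comb


def alpha(N, n, m_k):
    """Eq. (6): probability that a network of size m_k is intersected."""
    return 1 - comb(N - m_k, n) / comb(N, n)


def alpha_joint(N, n, m_j, m_k):
    """Eq. (7): probability that two distinct networks are both intersected."""
    missed = comb(N - m_j, n) + comb(N - m_k, n) - comb(N - m_j - m_k, n)
    return 1 - missed / comb(N, n)


def ht_total(networks, N, n):
    """Eq. (5), summed over the distinct networks hit by the initial sample."""
    return sum(y / alpha(N, n, m) for y, m in networks)


def ht_variance_estimate(networks, N, n):
    """Eq. (9), summed over the distinct networks hit by the initial sample."""
    total = 0.0
    for j, (y_j, m_j) in enumerate(networks):
        a_j = alpha(N, n, m_j)
        for k, (y_k, m_k) in enumerate(networks):
            a_k = alpha(N, n, m_k)
            # alpha_jj is taken equal to alpha_j, as noted in the text.
            a_jk = a_j if j == k else alpha_joint(N, n, m_j, m_k)
            total += (a_jk - a_j * a_k) / (a_j * a_k * a_jk) * y_j * y_k
    return total


# Hypothetical example: N = 16, n = 4, one network of size 3 with total 10
# and one network of size 1 with total 0.
sampled = [(10, 3), (0, 1)]
tau_hat = ht_total(sampled, N=16, n=4)
var_hat = ht_variance_estimate(sampled, N=16, n=4)
```

For a network of size one, `alpha` reduces to $n/N$ , the usual SRSWOR inclusion probability of a single unit.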

The proposed estimators for NACS

4.3 Modified ratio type estimator

Chutiman and Chiangpradit (2014) proposed a ratio estimator of the population total of the variable of interest. It is based on the Raj estimators of the population totals of the auxiliary and the variable of interest. Raj estimator itself is an ordered estimator.

We propose a ratio estimator which is based on the HT estimators of the population totals of the two variables. The HT estimator is an unbiased and unordered estimator. So, the computational difficulty involved in our estimator is much lower than that in the ratio estimator proposed by Chutiman and Chiangpradit (2014).

Consider a population $U=\{1,2,\ldots,i,\ldots,N\}$ which is partitioned into $K$ networks, denoted as $\{\Psi_{1},\Psi_{2},\Psi_{3},\ldots,\Psi_{K}\}$ . Let the values of the auxiliary variable $X$ be known for all the population units.

Suppose a survey is conducted by using negative adaptive cluster sampling. The information on the variable of interest $Y$ is collected for all the units selected in the final sample.

Define $y_{k.}=\sum_{i\in\Psi_{k}}y_{i}$ and $x_{k.}=\sum_{i\in\Psi_{k}}x_{i}$ , the totals of $Y$ and $X$ over the units of network $\Psi_{k}$ .

The modified HT estimator of the population total of the variable $Y$ can be obtained by using inclusion probabilities of networks.

That is we can estimate the population total $\tau_{{y}}=\sum_{U}y_{k}=\sum_{\Psi_{k}}y_{k.}$ by defining the estimator:

$\displaystyle{(\hat{\tau}_{y})}_{HT}=\sum\limits_{S}\check{y}_{k.}=\sum\limits_{s_{k}}\sum\limits_{\Psi_{k}}\frac{y_{k}}{\Pi_{k}}=\sum\limits_{s_{k}}\left(\frac{\sum\limits_{\Psi_{k}}y_{k}}{\Pi_{k}}\right)=\sum\limits_{s_{k}}\frac{y_{k.}}{\Pi_{k}^{*}}=\sum\limits_{s_{k}}\check{y}_{k.}$ (10)

Note: i) $s_{k}$ is the set of units selected in the final adaptive sample from the $k^{\text{th}}$ network. ii) $\Pi_{k}=\Pi_{k}^{*}$ where $\Pi_{k}$ denotes the inclusion probability of the $k^{\text{th}}$ unit and $\Pi_{k}^{*}$ denotes the inclusion probability of the $k^{\text{th}}$ cluster, which includes the $k^{\text{th}}$ unit.

The population total of $X$ is $\tau_{x}=\sum_{U}x_{k}=\sum_{\Psi_{k}}x_{k.}$

The estimator of $\tau_{x}$ is:

$\displaystyle{(\hat{\tau}_{x})}_{HT}=\sum\limits_{S}\check{x}_{k.}=\sum\limits_{s_{k}}\sum\limits_{\Psi_{k}}\frac{x_{k}}{\Pi_{k}}=\sum\limits_{s_{k}}\left(\frac{\sum\limits_{\Psi_{k}}x_{k}}{\Pi_{k}}\right)=\sum\limits_{s_{k}}\frac{x_{k.}}{\Pi_{k}^{*}}=\sum\limits_{s_{k}}\check{x}_{k.}$ (11)

The generalized population ratio total is:

$\displaystyle\tau_{\textit{RAD}}=\frac{\tau_{{y}}}{\tau_{x}}\tau_{x}=R\tau_{x}$ (12)

Note that $E{(\hat{\tau}_{x})}_{HT}=\tau_{x}$ and $E{(\hat{\tau}_{y})}_{HT}=\tau_{{y}}$ .

Estimator of ${\tau}_{\textit{RAD}}$ is,

$\displaystyle\hat{\tau}_{\textit{RAD}}=\frac{{(\hat{\tau}_{y})}_{HT}}{{(\hat{\tau}_{x})}_{HT}}\tau_{x}=\hat{R}{\tau}_{x}$ (13)
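
A small numerical sketch of Eq. (13) may help. It assumes that the network totals $y_{k.}$ , $x_{k.}$ and the network inclusion probabilities $\Pi_{k}^{*}$ of the sampled networks are already available, and that $\tau_{x}$ is known; the function names and the numbers are hypothetical.

```python
def ht_sum(totals, probs):
    """HT estimator of a population total: sum of totals / inclusion probs."""
    return sum(t / p for t, p in zip(totals, probs))


def ratio_total(y_totals, x_totals, probs, tau_x):
    """Eq. (13): ratio of the two HT estimators, scaled by the known tau_x."""
    tau_y_hat = ht_sum(y_totals, probs)
    tau_x_hat = ht_sum(x_totals, probs)
    if tau_x_hat == 0:
        raise ZeroDivisionError("HT estimate of tau_x is zero")
    return tau_y_hat / tau_x_hat * tau_x


# Hypothetical example with two sampled networks and known tau_x = 120.
estimate = ratio_total(
    y_totals=[10, 0], x_totals=[3, 9], probs=[0.5, 0.25], tau_x=120
)
```

In this example the HT estimates are 20 for $\tau_{y}$ and 42 for $\tau_{x}$ , so $\hat{R}=20/42$ .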

Using the Taylor linearization technique about the point $(\tau_{x},\tau_{{y}})$ gives the approximation:

$\displaystyle\hat{\tau}_{\textit{RAD}}\cong{(\hat{\tau}_{y})}_{HT}+\frac{\tau_{y}}{\tau_{x}}\left(\tau_{x}-{(\hat{\tau}_{x})}_{HT}\right)$ (14)

Its approximate variance is given by:

$\displaystyle AV(\hat{\tau}_{\textit{RAD}})=\text{Var}\left({(\hat{\tau}_{y})}_{HT}-R{(\hat{\tau}_{x})}_{HT}\right)=\text{Var}\left(\sum\limits_{\Psi_{k}}\frac{(y_{k.}-Rx_{k.})}{\Pi_{k}^{*}}\right)=\sum\limits_{S}\sum\limits_{\Psi_{k}}\Delta_{kl}^{*}\check{E}_{k.}\check{E}_{l.}$ (15)

Where,

$\displaystyle\Delta_{kl}^{*}=\Pi_{kl}^{*}-\Pi_{k}^{*}\Pi_{l}^{*}$

$\displaystyle\check{E}_{k.}=\frac{E_{k.}}{\Pi_{k}^{*}}\text{ and }E_{k.}=y_{k.}-Rx_{k.}$

and $\Pi_{kl}^{*}$ denotes the joint inclusion probability of the networks $k$ and $l$ .

The variance estimator of the modified ratio estimator is:

$\displaystyle\hat{V}(\hat{\tau}_{\textit{RAD}})=\sum\limits_{k\&l\in\Psi_{k}}\sum\limits_{\Psi_{k}\in S}\check{\Delta}_{kl}^{*}\check{E}_{k.}\check{E}_{l.}$ (16)

Where,

$\displaystyle\check{\Delta}_{kl}^{*}=\frac{{\Delta}_{kl}^{*}}{\Pi_{kl}^{*}}$

4.4 Modified regression estimator

The modified regression estimator $\hat{\tau}_{\textit{RADD}}$ is a function of HT estimators ${(\hat{\tau}_{y})}_{HT}$ and ${(\hat{\tau}_{x})}_{HT}$

$\displaystyle\hat{\tau}_{\textit{RADD}}={(\hat{\tau}_{y})}_{HT}+\left(\sum\limits_{\Psi_{k}}x_{k.}-{(\hat{\tau}_{x})}_{HT}\right)\hat{\beta}_{1}$ (17)

By using the weighted least squares method we get

$\displaystyle\hat{\beta}_{1}=\left(\sum\limits_{s}\frac{x_{k.}^{2}}{\sigma_{k}^{2}\Pi_{k}^{*}}\right)^{-1}\left(\sum\limits_{s}\frac{x_{k.}y_{k.}}{\sigma_{k}^{2}\Pi_{k}^{*}}\right)$ (18)

By using the Taylor linearization technique about the point $(\tau_{x},\tau_{y})$ , $\hat{\beta}_{1}$ is approximated by

$\displaystyle\hat{\beta}_{1}^{0}=\beta_{1}+T^{-1}\left(\hat{t}-\hat{T}\beta_{1}\right)$ (19)

Where,

$\displaystyle\hat{T}=\sum\limits_{s}\frac{x_{k.}^{2}}{\sigma_{k}^{2}\Pi_{k}^{*}};\quad T=\sum\limits_{\Psi_{k}}x_{k.}^{2};\quad t=\sum\limits_{\Psi_{k}}x_{k.}y_{k.};\quad\hat{t}=\sum\limits_{s}\frac{x_{k.}y_{k.}}{\sigma_{k}^{2}\Pi_{k}^{*}}$

The approximate variance is given by:

$\displaystyle AV\left(\hat{\beta}\right)=T^{-1}VT^{-1}$ (20)

Let $E_{k.}$ be the residual of the population fit.

$\displaystyle E_{k.}=y_{k.}-x_{k.}\beta_{1}$

$\displaystyle V=\sum\sum\limits_{U}\Delta_{kl}^{*}\left(\frac{x_{k.}E_{k.}}{\Pi_{k}^{*}}\right)\left(\frac{x_{l.}E_{l.}}{\Pi_{l}^{*}}\right)$

The estimator of the variance of $\hat{\beta}$ is:

$\displaystyle\hat{V}\left(\hat{\beta}\right)=\left(\sum\limits_{s}\frac{x_{k.}^{2}}{\sigma_{k}^{2}\Pi_{k}^{*}}\right)^{-1}\hat{V}\left(\sum\limits_{s}\frac{x_{k.}^{2}}{\sigma_{k}^{2}\Pi_{k}^{*}}\right)^{-1}$ (21)

Where, $\hat{V}=\sum\sum\limits_{s}\check{\Delta}_{kl}^{*}\left(\frac{x_{k.}e_{k.}}{\Pi_{k}^{*}}\right)\left(\frac{x_{l.}e_{l.}}{\Pi_{l}^{*}}\right)$ .

Where $e_{k.}$ is the residual of the sample fit.

$\displaystyle e_{k.}=y_{k.}-x_{k.}\hat{\beta}_{1}$

$\displaystyle\hat{\tau}_{\textit{RADD}}={(\hat{\tau}_{y})}_{HT}+\left(\sum\limits_{\Psi_{k}}x_{k.}-{(\hat{\tau}_{x})}_{HT}\right)\hat{T}^{-1}\sum\limits_{s}\frac{x_{k.}y_{k.}}{\sigma_{k}^{2}\Pi_{k}^{*}}=\sum\limits_{s}\left(1+\left(\sum\limits_{\Psi_{k}}x_{k.}-{(\hat{\tau}_{x})}_{HT}\right)\hat{T}^{-1}\frac{x_{k.}}{\sigma_{k}^{2}}\right)\check{y}_{k.}$ (22)

It shows that the regression estimator can be expressed as a linear function of the $\Pi$ expanded values $\check{y}_{k.}$ for $k\in S$ .

Thus,

$\displaystyle\hat{\tau}_{\textit{RADD}}=\sum\limits_{s}g_{ks}^{*}\check{y}_{k.}$

With the sample dependent weights

$\displaystyle g_{ks}^{*}=1+\left(\sum\limits_{\Psi_{k}}x_{k.}-{(\hat{\tau}_{x})}_{HT}\right)\hat{T}^{-1}\frac{x_{k.}}{\sigma_{k}^{2}}$
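
The equivalence between the regression form in Eq. (17) and the weighted form $\sum_{s}g_{ks}^{*}\check{y}_{k.}$ can be checked numerically. The sketch below is illustrative only: it assumes $\sigma_{k}^{2}=1$ for every network unless other values are supplied (an assumption, not something stated in the text), and the function name and the data are hypothetical.

```python
def regression_total(y, x, probs, tau_x, sigma2=None):
    """Return (estimate, beta1_hat, g_weights) for the regression estimator.

    y, x   : network totals y_k. and x_k. of the sampled networks
    probs  : network inclusion probabilities Pi*_k
    tau_x  : known population total of X
    sigma2 : variance factors sigma_k^2 (assumed equal to 1 if omitted)
    """
    if sigma2 is None:
        sigma2 = [1.0] * len(y)  # assumption made only for this illustration
    T_hat = sum(xk ** 2 / (s * p) for xk, s, p in zip(x, sigma2, probs))
    t_hat = sum(xk * yk / (s * p) for xk, yk, s, p in zip(x, y, sigma2, probs))
    beta1 = t_hat / T_hat                                # Eq. (18)
    tau_y_hat = sum(yk / p for yk, p in zip(y, probs))
    tau_x_hat = sum(xk / p for xk, p in zip(x, probs))
    estimate = tau_y_hat + (tau_x - tau_x_hat) * beta1   # Eq. (17)
    g = [1 + (tau_x - tau_x_hat) * xk / (T_hat * s) for xk, s in zip(x, sigma2)]
    return estimate, beta1, g


# Hypothetical example with three sampled networks and known tau_x = 120.
y = [10, 0, 4]
x = [3, 9, 5]
probs = [0.5, 0.25, 0.5]
estimate, beta1, g = regression_total(y, x, probs, tau_x=120)

# The same estimate written as a weighted sum of the Pi-expanded y-values.
weighted = sum(gk * yk / p for gk, yk, p in zip(g, y, probs))
```

Up to floating point rounding, `weighted` and `estimate` agree, which is the content of Eq. (22).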

The regression estimator relates to the hypothetical population fit of the model $\xi$ , which produces $\beta_{1}$ , the fitted values

$\displaystyle y_{k.}^{0}=x_{k.}\beta_{1}$

and the population fit residuals

$\displaystyle E_{k.}=y_{k.}-y_{k.}^{0}$

The regression estimator becomes:

$\displaystyle\hat{\tau}_{\textit{RADD}}=\sum\limits_{s}g_{ks}^{*}\left(\check{y}_{k.}^{0}+\check{E}_{k.}\right)=\sum\limits_{\Psi_{k}}y_{k.}^{0}+\sum\limits_{s}g_{ks}^{*}\check{E}_{k.}$ (23)

The approximate variance of $\hat{\tau}_{\textit{RADD}}$ is,

$\displaystyle AV\left(\hat{\tau}_{\textit{RADD}}\right)=\sum\sum\limits_{U}\Delta_{kl}^{*}\check{E}_{k.}\check{E}_{l.}$ (24)

The estimator of $AV\left(\hat{\tau}_{\textit{RADD}}\right)$ is:

$\displaystyle\hat{V}(\hat{\tau}_{\textit{RADD}})=\sum\sum\limits_{s}\check{\Delta}_{kl}^{*}(g_{ks}^{*}\check{e}_{ks}^{*})(g_{ls}^{*}\check{e}_{ls}^{*})$ (25)

Where

$\displaystyle\check{e}_{ks}^{*}=\frac{e_{ks}^{*}}{\Pi_{k}^{*}}$

since $e_{ks}^{*}=\check{y}_{k.}-x_{k.}\hat{\beta}_{1}$ .

The above Eq. (25) is similar to that in Särndal et al. (1992).

4.5 Product estimator

Since the two variables are negatively correlated, it is of interest to define a product estimator of the population total based on the HT estimators of the population totals of the two variables.

The product estimator of the population total $\tau_{y}$ is defined as:

$\displaystyle\hat{\tau}_{\textit{RADE}}={(\hat{\tau}_{y})}_{HT}\exp\left(\frac{\tau_{x}-{(\hat{\tau}_{x})}_{HT}}{\tau_{x}+{(\hat{\tau}_{x})}_{HT}}\right)$ (26)
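
Eq. (26) is a one-line computation once the two HT estimates are available. The sketch below evaluates it exactly as written; the function name and the numerical inputs are hypothetical.

```python
from math import exp


def exponential_total(tau_y_ht, tau_x_ht, tau_x):
    """Eq. (26): HT estimate of tau_y times an exponential adjustment in X."""
    return tau_y_ht * exp((tau_x - tau_x_ht) / (tau_x + tau_x_ht))


# Hypothetical HT estimates with known tau_x = 120.
adjusted = exponential_total(tau_y_ht=28.0, tau_x_ht=52.0, tau_x=120.0)

# When the HT estimate of tau_x equals tau_x, no adjustment is made.
unchanged = exponential_total(tau_y_ht=28.0, tau_x_ht=120.0, tau_x=120.0)
```

The adjustment factor exceeds one when ${(\hat{\tau}_{x})}_{HT}$ falls below $\tau_{x}$ and is below one when it falls above.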

Let $e_{y}=\frac{{(\hat{\tau}_{y})}_{HT}-\tau_{y}}{\tau_{y}}$

$\displaystyle e_{x}=\frac{{(\hat{\tau}_{x})}_{HT}-\tau_{x}}{\tau_{x}}$

We get, $E\left(e_{y}\right)=E\left(e_{x}\right)=0$

$\displaystyle E\left(e_{y}^{2}\right)=\frac{1}{2}\frac{1}{\tau_{y}^{2}}\sum% \sum\limits_{i\neq j\in\Psi_{k}}\left(\pi_{i}\pi_{j}-\pi_{ij}\right)\left(% \frac{y_{i}}{\pi_{i}}-\frac{y_{j}}{\pi_{j}}\right)^{2}$ $\displaystyle E\left(e_{x}^{2}\right)=\frac{1}{2}\frac{1}{\tau_{x}^{2}}\sum% \sum\limits_{i\neq j\in\Psi_{k}}\left(\pi_{i}\pi_{j}-\pi_{ij}\right)\left(% \frac{x_{i}}{\pi_{i}}-\frac{x_{j}}{\pi_{j}}\right)^{2}$ $\displaystyle E\left(e_{x}e_{y}\right)=\frac{1}{2}\frac{1}{\tau_{x}\tau_{y}}% \sum\sum\limits_{i\neq j\in\Psi_{k}}\left(\pi_{i}\pi_{j}-\pi_{ij}\right)\left(% \frac{x_{i}}{\pi_{i}}-\frac{y_{j}}{\pi_{j}}\right)^{2}$

Hence,

$\displaystyle\hat{\tau}_{\textit{RADE}}=\tau_{y}\left(1+e_{y}\right)\exp\left[% \frac{\tau_{x}-\tau_{x}(1+e_{x})}{\tau_{x}+\tau_{x}(1+e_{x})}\right]=\tau_{y}% \left(1+e_{y}\right)\exp\left[\frac{{-e}_{x}}{2+e_{x}}\right]=\tau_{y}\left(1+% e_{y}\right)\exp\left[-\frac{1}{2}e_{x}\left(1+\frac{e_{x}}{2}\right)^{-1}% \right]\cong\tau_{y}\left(1+e_{y}\right)\exp\left[-\frac{1}{2}e_{x}\left(1-% \frac{1}{2}e_{x}+\frac{1}{4}e_{x}^{2}\right)\right]$

By neglecting the terms involving $e_{x}$ with power three and above we get,

$\displaystyle\hat{\tau}_{\textit{RADE}}\cong\tau_{y}\left(1+e_{y}\right)\exp\left[-\frac{1}{2}\left(e_{x}-\frac{1}{2}e_{x}^{2}\right)\right]\cong\tau_{y}\left(1+e_{y}\right)\left[1-\frac{\left(e_{x}-\frac{1}{2}e_{x}^{2}\right)}{2}+\frac{\left(e_{x}-\frac{1}{2}e_{x}^{2}\right)^{2}}{8}\right]\cong\tau_{y}\left(1+e_{y}\right)\left[1-\frac{\left(e_{x}-\frac{1}{2}e_{x}^{2}\right)}{2}+\frac{e_{x}^{2}}{8}\right]=\tau_{y}\left(1+e_{y}\right)\left[1-\frac{e_{x}}{2}+\frac{3}{8}e_{x}^{2}\right]\cong\tau_{y}\left[1+e_{y}-\frac{e_{x}}{2}+\frac{3}{8}e_{x}^{2}-\frac{1}{2}e_{x}e_{y}\right]$
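The second-order expansion of the exponential factor can be checked numerically. The sketch below compares $\exp\left[-e_{x}/(2+e_{x})\right]$ with $1-\frac{e_{x}}{2}+\frac{3}{8}e_{x}^{2}$ for a few small values of $e_{x}$ . The function names and the chosen values of $e_{x}$ are illustrative assumptions.

```python
import math


def exact_factor(e_x):
    """Exponential factor exp[-e_x / (2 + e_x)] from the derivation."""
    return math.exp(-e_x / (2.0 + e_x))


def second_order_factor(e_x):
    """Expansion 1 - e_x/2 + (3/8) e_x^2, keeping terms up to order two."""
    return 1.0 - e_x / 2.0 + 3.0 * e_x ** 2 / 8.0


# Illustrative relative errors; the gap should shrink like |e_x|^3.
for e_x in (-0.1, -0.01, 0.01, 0.1):
    gap = abs(exact_factor(e_x) - second_order_factor(e_x))
    print(f"e_x={e_x:+.2f}  exact={exact_factor(e_x):.6f}  "
          f"approx={second_order_factor(e_x):.6f}  gap={gap:.2e}")
```

The gap between the two expressions is of third order in $e_{x}$ , which is the order of the terms neglected in the derivation.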

The bias of $\hat{\tau}_{\textit{RADE}}$ is

$\displaystyle\text{Bias}\left(\hat{\tau}_{\textit{RADE}}\right)=E\left(\hat{\tau}_{\textit{RADE}}\right)-\tau_{y}\cong\tau_{y}E\left(e_{y}-\frac{e_{x}}{2}+\frac{3}{8}e_{x}^{2}-\frac{1}{2}e_{x}e_{y}\right)=\tau_{y}\left\{\frac{3}{16\tau_{x}^{2}}\sum\sum\limits_{i\neq j\in\Psi_{k}}(\pi_{i}\pi_{j}-\pi_{ij})\left(\frac{x_{i}}{\pi_{i}}-\frac{x_{j}}{\pi_{j}}\right)^{2}-\frac{1}{4\tau_{x}\tau_{y}}\sum\sum\limits_{i\neq j\in\Psi_{k}}(\pi_{i}\pi_{j}-\pi_{ij})\left(\frac{x_{i}}{\pi_{i}}-\frac{y_{j}}{\pi_{j}}\right)^{2}\right\}$

The variance of $\hat{\tau}_{\textit{RADE}}$ is given as

$\displaystyle V\left(\hat{\tau}_{\textit{RADE}}\right)=E\left(\hat{\tau}_{\textit{RADE}}-\tau_{y}\right)^{2}=E\left[\tau_{y}\left(1+e_{y}-\frac{e_{x}}{2}+\frac{3}{8}e_{x}^{2}-\frac{1}{2}e_{x}e_{y}\right)-\tau_{y}\right]^{2}\cong\tau_{y}^{2}E\left(e_{y}-\frac{e_{x}}{2}\right)^{2}=\tau_{y}^{2}E\left(e_{y}^{2}+\frac{1}{4}e_{x}^{2}-e_{x}e_{y}\right)$

$\displaystyle V\left(\hat{\tau}_{\textit{RADE}}\right)=\tau_{y}^{2}\left[\frac{1}{2}\frac{1}{\tau_{y}^{2}}\sum\sum\limits_{i\neq j\in\Psi_{k}}(\pi_{i}\pi_{j}-\pi_{ij})\left(\frac{y_{i}}{\pi_{i}}-\frac{y_{j}}{\pi_{j}}\right)^{2}+\frac{1}{8}\frac{1}{\tau_{x}^{2}}\sum\sum\limits_{i\neq j\in\Psi_{k}}(\pi_{i}\pi_{j}-\pi_{ij})\left(\frac{x_{i}}{\pi_{i}}-\frac{x_{j}}{\pi_{j}}\right)^{2}-\frac{1}{2}\frac{1}{\tau_{x}\tau_{y}}\sum\sum\limits_{i\neq j\in\Psi_{k}}(\pi_{i}\pi_{j}-\pi_{ij})\left(\frac{x_{i}}{\pi_{i}}-\frac{y_{j}}{\pi_{j}}\right)^{2}\right]$

$\displaystyle V\left(\hat{\tau}_{\textit{RADE}}\right)=\frac{1}{2}\sum\sum\limits_{i\neq j\in\Psi_{k}}(\pi_{i}\pi_{j}-\pi_{ij})\left(\frac{y_{i}}{\pi_{i}}-\frac{y_{j}}{\pi_{j}}\right)^{2}+\frac{1}{8}\frac{\tau_{y}^{2}}{\tau_{x}^{2}}\sum\sum\limits_{i\neq j\in\Psi_{k}}(\pi_{i}\pi_{j}-\pi_{ij})\left(\frac{x_{i}}{\pi_{i}}-\frac{x_{j}}{\pi_{j}}\right)^{2}-\frac{1}{2}\frac{\tau_{y}}{\tau_{x}}\sum\sum\limits_{i\neq j\in\Psi_{k}}(\pi_{i}\pi_{j}-\pi_{ij})\left(\frac{x_{i}}{\pi_{i}}-\frac{y_{j}}{\pi_{j}}\right)^{2}$ (27)

5. Pilot study

A pilot study was conducted using NACS. The aim was to estimate the total number of evergreen plants, which are rare in that region because of the presence of basalt rocks.

An area of 100 acres in the Tamhini Ghat was divided into 100 plots of 1 acre each, and the percentage of silica on each plot was measured. Measuring the percentage of silica in a soil sample from a one-acre plot takes considerably less time than counting the evergreen plants in that plot. Testing a soil sample for silica is also much cheaper than counting the evergreen plants: the cost of testing a soil sample was $2, whereas counting the evergreen plants in one acre cost $20. We therefore took the percentage of silica in a one-acre plot as the auxiliary variable.

The soil in the Western Ghats is of two types: basalt rock and laterite. After studying the nature of the soil, we observed an abundance of evergreen plants wherever the silica content of the soil was 20 percent or less. Hence we took $C_{x}=\left\{X\leqslant 20\right\}$ as the condition for adaptation.

A random sample of 10 plots was drawn from this area by SRSWOR. The plots selected in the initial sample are marked with ${}^{*}$ in Fig. 1, which shows the values of the auxiliary variable $X$ (percentage of silica in a plot).

Figure 1.

Silica (SiO${}_{2}$) % on the different plots of the square region. ${}^{*}$ in a square indicates selection in the initial sample.

Figure 2.

The values of the number of evergreen plants observed on the plots in the population.

Negative adaptive cluster sampling was then applied. Networks were formed around the plots selected in the initial sample that satisfied the condition ${C}_{x}$ . Each plot selected in the initial sample that did not satisfy ${C}_{x}$ (that is, with $X>20$ ) formed a network of size 1 (shown in yellow colour). There were 6 such networks of size 1 in the initial sample from the above population of the $X$ variable. There was also 1 network of size 13 (shown in blue colour) and another network of size 4 (shown in brown colour). Thus the total number of distinct networks in the sample was 8.

The clusters were formed by using the auxiliary information and domain knowledge about silica content and evergreen plants. These two variables are negatively correlated: where silica is abundant in the soil, evergreen plants are rare. A cluster consists of network units and edge units. The edge units of clusters of size greater than 1 were dropped to obtain the networks. Only the networks that satisfied the condition $C_{x}=\left\{X\leqslant 20\right\}$ were selected and measured for the survey variable (number of evergreen plants), as shown in Fig. 2.

6. Results and discussion

Because enumerating all possible samples was not feasible, each estimator was evaluated over $r$ repetitions, with $r$ taking the values 5,000, 10,000, 20,000 and 100,000. In our study the population size was 100 and the initial sample size was 10, so the number of possible samples was 1.731 $\times$ 10 ${}^{13}$ , which is far too large to enumerate. Hence we took $r$ repetitions for NACS. The initial sample size $n$ was varied as 10, 15, 20, 25, 35 and 45.
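The number of possible initial samples quoted above is the binomial coefficient for drawing 10 units out of 100 without replacement. A short check, using only the Python standard library:

```python
import math

N = 100  # population size (number of plots)
n = 10   # initial sample size

# Number of distinct SRSWOR samples of size n from N units.
n_samples = math.comb(N, n)
print(n_samples)           # exact integer count
print(f"{n_samples:.3e}")  # scientific notation, as quoted in the text
```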

To establish the conditions under which the estimators used in NACS are more efficient than the traditional estimators, we repeated the simulations for different numbers of repetitions.

The estimated population total over the $r$ samples is given by

$\displaystyle\hat{\tau}=\frac{\sum_{i=1}^{r}{(\hat{\tau}_{y})}_{i}}{r}$ (28)

The estimated mean square error of the estimator of the total is given by:

$\displaystyle\widehat{\textit{MSE}}(\hat{\tau}_{y})=\frac{\sum\nolimits_{i=1}^{r}\left(\left(\hat{\tau}_{y}\right)_{i}-\tau_{y}\right)^{2}}{r}$ (29)

where $\left(\hat{\tau}_{y}\right)_{i}$ is the value of the relevant estimator for the $i^{\text{th}}$ sample.
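Eqs. (28) and (29) can be sketched in a few lines. The function name `summarize` is hypothetical, and returning the square root of the estimated MSE as a standard-error-like quantity is an assumption made here for illustration; the text does not state how the tabulated SE values were obtained from Eq. (29).

```python
import math


def summarize(estimates, tau_y):
    """Average estimate (Eq. 28) and estimated MSE (Eq. 29) over r repetitions.

    estimates : list of estimator values, one per repetition
    tau_y     : true population total of the variable of interest
    """
    r = len(estimates)
    mean_estimate = sum(estimates) / r                    # Eq. (28)
    mse = sum((t - tau_y) ** 2 for t in estimates) / r    # Eq. (29)
    root_mse = math.sqrt(mse)  # assumption: used here as an SE-like summary
    return mean_estimate, mse, root_mse


# Toy values, not taken from the study.
print(summarize([1.0, 2.0, 3.0], tau_y=2.0))
```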

The estimates of $\tau_{y}$ and the corresponding estimated standard errors (SE) were calculated for the HT and HH type estimators ( $(\hat{\tau}_{y})_{HT}$ and $(\hat{\tau}_{y})_{HH}$ ) under ACS, the modified ratio and regression estimators ( $\hat{\tau}_{\textit{RAD}}$ and $\hat{\tau}_{\textit{RADD}}$ ) under NACS, and the two-phase estimator ( $\hat{\tau}$ ) of Särndal & Swensson (1987) under ACDS. We also estimated the population total of the interest variable by using the estimator $\hat{\tau}_{\textit{RADD}}$ under ACS. The results are presented in Tables 1 and 2. In addition, estimates of $\tau_{y}$ and the corresponding estimated SE were obtained for $\hat{\tau}_{\textit{RAD}}$ under ACS, $\hat{\tau}_{\textit{RADD}}$ under SRSWOR and $\hat{\tau}_{\textit{RADE}}$ under NACS. The results are presented in Table 4.

Table 1

Estimated values and SE of different estimators in ACS for the different values of $r$ and $n$

Number of samples ( $r$ )	Initial sample size ( $n$ )	$(\hat{\tau}_{y})_{HT}$	$\widehat{SE}((\hat{\tau}_{y})_{HT})$	$(\hat{\tau}_{y})_{HH}$	$\widehat{SE}((\hat{\tau}_{y})_{HH})$	$\hat{\tau}_{\textit{RADD}}$	$\widehat{SE}(\hat{\tau}_{\textit{RADD}})$
5000	10	2031.62	1316.18	2031.03	1561.31	2030.03	1326.19
	15	2033.26	973.95	2030.81	1266.50	2030.05	969.91
	20	2032.51	756.07	2031.37	1092.98	2033.93	754.95
	25	2031.08	599.86	2029.82	976.26	2030.24	599.23
	35	2032.12	383.25	2031.88	829.61	2031.98	384.17
	45	2030.19	244.15	2031.92	729.06	2030.18	241.85
10000	10	2029.31	1315.63	2038.90	1552.43	2031.27	1315.07
	15	2033.75	969.66	2032.07	1259.94	2033.64	969.54
	20	2032.51	755.62	2039.92	1091.23	2032.63	752.70
	25	2029.71	595.55	2031.45	970.16	2034.34	595.94
	35	2030.47	382.99	2030.45	828.79	2034.92	383.94
	45	2031.99	241.71	2035.60	723.35	2033.48	241.11
20000	10	2033.84	1314.15	2031.48	1537.71	2030.40	1308.67
	15	2030.97	967.08	2030.63	1258.89	2031.22	969.27
	20	2032.02	755.00	2033.18	1086.86	2031.85	752.64
	25	2032.05	598.03	2031.33	968.46	2032.84	594.92
	35	2031.53	383.49	2031.13	824.78	2032.79	381.93
	45	2031.71	237.25	2030.99	720.96	2033.05	237.62
100000	10	2029.16	1309.28	2031.04	1536.88	2031.43	1307.76
	15	2031.48	970.77	2034.84	1255.06	2032.35	968.57
	20	2029.15	754.17	2031.81	1078.86	2030.34	752.49
	25	2029.00	592.94	2031.10	965.18	2031.72	592.12
	35	2031.42	382.12	2033.61	810.07	2030.42	381.46
	45	2031.08	236.11	2031.18	720.00	2030.91	237.15

Table 2

Estimated values and SE of different estimators in NACS and ACDS for the different values of $r$ and $n$

Number of samples ( $r$ )	Initial sample size ( $n$ )	$\hat{\tau}_{\textit{RAD}}$	$\widehat{SE}(\hat{\tau}_{\textit{RAD}})$	$\hat{\tau}_{\textit{RADD}}$	$\widehat{SE}(\hat{\tau}_{\textit{RADD}})$	$\hat{\tau}$	$\widehat{SE}(\hat{\tau})$
		NACS				ACDS
5000	10	2174.91	1598.58	2033.39	1319.39	2023.72	1463.54
	15	2091.30	1132.63	2031.59	971.84	2025.94	1125.09
	20	2063.51	863.58	2032.81	756.87	2041.87	910.41
	25	2068.98	683.87	2033.67	597.45	2031.20	770.94
	35	2062.73	442.43	2030.12	384.63	2029.86	602.93
	45	2042.49	298.33	2031.42	241.82	2034.65	511.70
10000	10	2154.28	1597.53	2030.73	1316.04	2037.79	1460.22
	15	2114.63	1131.08	2032.16	970.07	2032.13	1115.55
	20	2080.71	856.13	2031.54	754.93	2030.93	909.49
	25	2071.95	677.41	2031.32	596.46	2025.96	769.57
	35	2051.42	440.30	2033.42	382.25	2035.08	598.60
	45	2042.54	293.42	2031.70	238.85	2033.21	511.19
20000	10	2178.62	1596.31	2033.10	1310.00	2027.63	1457.46
	15	2117.47	1129.47	2033.74	970.02	2035.04	1107.99
	20	2083.95	855.61	2033.67	753.12	2027.68	908.82
	25	2074.24	679.72	2033.52	596.23	2031.29	768.96
	35	2051.13	439.57	2032.67	381.35	2030.20	597.88
	45	2047.81	290.23	2031.25	238.84	2028.44	507.91
100000	10	2183.83	1592.43	2032.20	1309.67	2033.26	1451.42
	15	2114.00	1128.52	2031.48	963.04	2028.92	1107.41
	20	2085.86	854.92	2031.62	751.92	2030.62	894.39
	25	2066.96	673.99	2031.29	596.11	2031.44	757.87
	35	2050.03	424.25	2031.90	381.20	2032.70	594.83
	45	2041.78	284.18	2031.19	235.91	2033.18	506.43

To present the cost-benefit analysis of the new design, we calculated the expected sampling cost in ACS and the expected effective sampling cost in NACS.

The expected sampling cost in ACS is based on the final sample size ( $n_{s}$ ), whereas the expected sampling cost in NACS is based on both the final sample size ( $n_{s}$ ) and the effective sample size ( $n_{e}$ ).

$\displaystyle\text{Final sample size }(n_{s})=n+\sum_{k=1}^{K}\left(n_{k}-X_{k}\right)\delta_{k}$

where

$n_{k}$ : Size of the $k^{\text{th}}$ network in the population, $k=1,2,\ldots,K$ .

$X_{k}$ : Number of units included in the initial sample from the $k^{\text{th}}$ network, $k=1,2,\ldots,K$ .

$\displaystyle\delta_{k}=\left\{\begin{array}{ll}1&\text{if the initial sample includes a sampling unit from network }k\\ 0&\text{otherwise}\end{array}\right.$

$\displaystyle\text{Effective sample size }(n_{e})=\sum_{j=1}^{n_{s}}\delta_{C_{x}}(j)$

where

$\displaystyle\delta_{C_{x}}\left(j\right)=\left\{\begin{array}{ll}1&\text{if }U_{j}\text{ satisfies the condition }C_{x}\\ 0&\text{otherwise}\end{array}\right.$
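The two sample-size formulas can be sketched as follows. The function names and the toy inputs (four networks, an initial sample of three units) are hypothetical and are not taken from the pilot study.

```python
def final_sample_size(n, network_sizes, initial_counts):
    """n_s = n + sum_k (n_k - X_k) * delta_k, with delta_k = 1 when X_k > 0."""
    return n + sum(
        n_k - x_k
        for n_k, x_k in zip(network_sizes, initial_counts)
        if x_k > 0  # delta_k: network k was hit by the initial sample
    )


def effective_sample_size(satisfies_cx):
    """n_e = number of units in the final sample that satisfy C_x."""
    return sum(1 for flag in satisfies_cx if flag)


# Toy values (hypothetical): K = 4 networks, initial sample of n = 3 units.
network_sizes = [3, 1, 1, 2]   # n_k
initial_counts = [1, 1, 1, 0]  # X_k
n_s = final_sample_size(3, network_sizes, initial_counts)

# One flag per unit of the final sample: does the unit satisfy C_x?
satisfies_cx = [True, True, True, False, False]
n_e = effective_sample_size(satisfies_cx)

print(n_s, n_e)
```

In this toy case only the first network has more than one unit, so it contributes the two additional units; the two singleton networks that were hit do not satisfy $C_{x}$ and therefore do not count towards $n_{e}$ .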

The expected sample size under ACS is the total size of the included clusters for the variable of interest, whereas the expected effective sample size under NACS is the total size of the included networks for the variable of interest. Hence, we get $n_{e}<n_{s}$ .

In ACS, we consider only the variable of interest, and adaptation is made to obtain clusters, which include the network units and the edge units.

Thus, the expected sampling cost in ACS $=20E(n_{s})$ .

In NACS, we consider two variables: the auxiliary variable and the variable of interest. Using the auxiliary information, we determine the clusters, which give the expected sample size. Edge units are dropped from these clusters to obtain networks. The variable of interest is observed only on these networks, which gives the expected effective sample size.

Expected sampling cost in NACS $=2E(n_{s})+20E(n_{e})$ .

Values of $E(n_{s})$ and $E(n_{e})$ are obtained by averaging the values of $n_{s}$ and $n_{e}$ over the r repetitions.

The results are presented in Table 5.
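The two cost formulas can be written out directly, using the unit costs stated for the pilot study ($2 per soil test and $20 per plot count). The function names are hypothetical; the inputs in the example are the values of $E(n_{s})$ and $E(n_{e})$ reported in the first row of Table 5.

```python
COST_AUX = 2.0      # cost of testing one soil sample for silica (in $)
COST_SURVEY = 20.0  # cost of counting evergreen plants on one plot (in $)


def acs_cost(e_ns):
    """Expected sampling cost in ACS: 20 * E(n_s)."""
    return COST_SURVEY * e_ns


def nacs_cost(e_ns, e_ne):
    """Expected sampling cost in NACS: 2 * E(n_s) + 20 * E(n_e)."""
    return COST_AUX * e_ns + COST_SURVEY * e_ne


# First row of Table 5 (r = 5000, n = 10).
print(round(acs_cost(21.32), 2))
print(round(nacs_cost(21.32, 13.40), 2))
```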

To evaluate the performance of NACS, we compared the performance of the proposed modified regression estimator with that of the conventional regression estimator (SRSWOR). The results are shown in Table 3.

Table 3

Estimated values of the different estimators and their standard errors for initial sample of size 45 and number of repetitions equal to 100000

Estimator	Design	Estimate	Estimate of SE	Relative efficiency of NACS
$\hat{\tau}_{\textit{RADD}}$	ACS	2030.91	237.15	1.011
$\hat{\tau}_{\textit{RAD}}$	NACS	2041.78	284.18	1.451
$\hat{\tau}_{\textit{RADD}}$	NACS	2031.19	235.91	1.000
$\hat{\tau}_{y}$	ACDS	2033.18	506.43	4.608
${(\hat{\tau}_{y})}_{\textit{Reg}}$	SRSWOR	1981.99	558.94	5.614
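The text does not spell out how the last column of Table 3 is computed. The tabulated values are consistent with the ratio of squared estimated standard errors, taking $\hat{\tau}_{\textit{RADD}}$ under NACS as the baseline; the sketch below rests on that assumption.

```python
# Estimated standard errors from Table 3 (n = 45, r = 100000).
se = {
    ("RADD", "ACS"): 237.15,
    ("RAD", "NACS"): 284.18,
    ("RADD", "NACS"): 235.91,
    ("two-phase", "ACDS"): 506.43,
    ("Reg", "SRSWOR"): 558.94,
}

baseline = se[("RADD", "NACS")]

# Assumed definition: (SE of competitor / SE of RADD under NACS) squared.
relative_efficiency = {
    key: round((value / baseline) ** 2, 3) for key, value in se.items()
}

for key, value in relative_efficiency.items():
    print(key, value)
```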

ACDS controls the final sample size to some extent, but it requires a second-phase sample. We were interested in finding a sampling design that takes into account the type of relationship between the two variables, reduces the cost of sampling and is more precise than the earlier sampling designs.

NACS differs from ACDS. In ACDS the second-phase units are selected by some conventional sampling technique, so sampling variation is introduced in the second phase as well, which increases the standard error (SE) of the estimator. In NACS there are no second-phase units: we take observations on all the units included in the final adaptive sample. The sampling variation introduced at the second phase in ACDS is therefore absent in NACS. Hence NACS performs better than ACDS under the specified conditions.

To understand the working of NACS, let us consider the following hypothetical situation.

Suppose we have a population of 10 units. Let X be the auxiliary variable and Y the variable of interest. Let the values of X for these 10 units be {1, 2, 3, 8, 7, 10, 15, 14, 4, 3}.

Let the corresponding values of Y be {20, 15, 12, 0, 0, 0, 0, 0, 10, 11}. Here, X and Y are highly negatively correlated. We use the auxiliary information to select the sample with the condition $C_{x}=\{X\leqslant 5\}$ (say). A random sample of size 3 is drawn by SRSWOR; say the selected units have X values (2, 7, 15), with corresponding Y values (15, 0, 0). Using the adaptation condition $C_{x}$ we get three networks with X values {1, 2, 3}, {7} and {15}. Since the values 7 and 15 do not satisfy the condition $C_{x}$ , the corresponding Y values are not observed. We observe the Y values corresponding to the X values included in the first network, namely {20, 15, 12}. These values are used in the estimation.
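The hypothetical example can be reproduced with a small sketch. It assumes that the ten units lie on a line in the order listed and that the neighbourhood of a unit consists of its immediate left and right neighbours; the text does not state the neighbourhood, but this choice yields the networks given above. The function name `network_1d` is hypothetical.

```python
def network_1d(x, start, threshold):
    """Network of unit `start` on a line, for the condition x <= threshold.

    A unit that fails the condition forms a network of size 1. Otherwise the
    network is the maximal run of adjacent units satisfying the condition.
    """
    if x[start] > threshold:
        return [start]
    lo = hi = start
    while lo > 0 and x[lo - 1] <= threshold:
        lo -= 1
    while hi < len(x) - 1 and x[hi + 1] <= threshold:
        hi += 1
    return list(range(lo, hi + 1))


x = [1, 2, 3, 8, 7, 10, 15, 14, 4, 3]
y = [20, 15, 12, 0, 0, 0, 0, 0, 10, 11]
threshold = 5          # C_x = {X <= 5}
initial = [1, 4, 6]    # positions of the sampled X values 2, 7 and 15

networks = [network_1d(x, i, threshold) for i in initial]
network_x = [[x[j] for j in net] for net in networks]

# Y is observed only on networks whose units satisfy C_x.
observed_y = [y[j] for net in networks if x[net[0]] <= threshold for j in net]

print(network_x)
print(observed_y)
```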

In NACS, we do not take observations on the variable of interest for the edge units of the discovered networks. This is why we introduced the effective sample size $(n_{e})$ , and as noted above, $n_{e}<n_{s}$ . Thus, besides controlling the final sample size, NACS also reduces the cost of sampling. Hence, NACS is superior to ACS under this setup.

If the same estimator is used in ACS and NACS, the two designs are equally efficient, because they differ only at the design stage; what differs is the cost. As noted earlier, NACS is more cost-effective than ACS. The degree of correlation affects the performance of NACS: the two variables must be highly negatively correlated. The significance of the correlation can be tested by using the $t$ -test. When the correlation is weak, we do not advocate this method.
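As an illustration of the $t$ -test mentioned above, the sketch below applies the standard statistic $t=r\sqrt{n-2}/\sqrt{1-r^{2}}$ to the hypothetical ten-unit population. The choice of a two-sided test at the 5% level is an assumption made here; the text does not specify the form of the test. The critical value 2.306 is the tabulated two-sided 5% point of Student's $t$ with 8 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


x = [1, 2, 3, 8, 7, 10, 15, 14, 4, 3]
y = [20, 15, 12, 0, 0, 0, 0, 0, 10, 11]

r = pearson_r(x, y)
t = t_statistic(r, len(x))
T_CRIT_8DF = 2.306  # two-sided 5% critical value, 8 degrees of freedom

print(f"r = {r:.4f}, t = {t:.3f}, significant = {abs(t) > T_CRIT_8DF}")
```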

If the two variables are positively correlated then NACS reduces to ACS. There is no problem of losing control of the auxiliary variables. In both ACDS and NACS, we consider only one auxiliary variable. ACDS completely ignores the type of correlation between the two variables; NACS addresses this drawback. The effective sample size in NACS is smaller than the final sample size in the corresponding ACS, which reduces the cost of sampling in NACS.

NACS is not simply ACS applied to the variable X; it differs from it substantially, and many further aspects of NACS remain to be studied. In NACS, we observe the values of the variable of interest only for the units included in the networks of that variable. There is no double sampling. Hence, the sampling effort is smaller in NACS than in ACDS.

If we drop the auxiliary information then NACS reduces to ACS. In that case we used the condition $C_{y}=\{Y>0\}$ for adaptation. The modified HH and HT estimators were used for ACS. The expected final sample size and the expected effective sample size were computed, and the expected final sample size was found to be greater than the expected effective sample size.

The modified ratio and regression estimators were used for NACS, with the auxiliary information utilized in estimating the parameters. The modified regression estimator was found to be more efficient than the modified ratio estimator. Even though both estimators were biased, they gave more stable estimates. As the initial sample size increased, the standard error of the estimate decreased.

The modified regression estimator in NACS gave better results than its use in ACDS and ACS. This estimator was also found to be more efficient than the conventional regression estimator in SRSWOR, as shown in Table 3. In classical cluster sampling, comparisons are often made on the basis of cost: it is often less expensive (in terms of time and money) to sample units within a cluster than to select a new cluster. The same is true for NACS.

The relative bias in the ratio estimator ( $\hat{\tau}_{\textit{RAD}}$ ) used in ACS showed a consistent reduction with the increase in the initial sample size. This estimator was observed to be positively biased. The regression estimator ( $\hat{\tau}_{\textit{RADD}}$ ) used in SRSWOR also showed a reduction in the relative bias with an increase in the initial sample size. This estimator was found to be negatively biased. The other newly proposed product estimator ( $\hat{\tau}_{\textit{RADE}}$ ) was found to be positively biased. The relative bias in this estimator also showed a reduction with an increase in the initial sample size.

Table 4

Estimated values, relative bias and SE of different estimators using various sampling designs

		ACS			SRSWOR			NACS
Number of samples ( $r$ )	Initial sample size ( $n$ )	$\hat{\tau}_{\textit{RAD}}$	$\widehat{SE}(\hat{\tau}_{\textit{RAD}})$	Relative bias of $\hat{\tau}_{\textit{RAD}}$	$\hat{\tau}_{\textit{RADD}}$	$\widehat{SE}(\hat{\tau}_{\textit{RADD}})$	Relative bias of $\hat{\tau}_{\textit{RADD}}$	$\hat{\tau}_{\textit{RADE}}$	$\widehat{SE}(\hat{\tau}_{\textit{RADE}})$	Relative bias of $\hat{\tau}_{\textit{RADE}}$
5000	10	2210.39	1617.20	0.088	1644.51	1573.59	$-$ 0.190	2112.10	1426.17	0.039
	15	2102.83	1116.68	0.035	1800.16	1251.84	$-$ 0.113	2072.56	1037.91	0.024
	20	2081.72	855.22	0.024	1861.89	1026.63	$-$ 0.083	2073.19	1031.48	0.027
	25	2055.09	689.35	0.011	1907.20	880.70	$-$ 0.060	2033.73	630.42	0.001
	35	2047.36	450.05	0.008	1960.42	677.89	$-$ 0.034	2040.69	404.21	0.004
	45	2036.71	298.54	0.002	1975.47	560.23	$-$ 0.027	2035.80	257.11	0.002
10000	10	2198.03	1613.36	0.082	1614.02	1566.68	$-$ 0.205	2106.95	1428.10	0.037
	15	2119.59	1122.55	0.043	1757.91	1198.66	$-$ 0.134	2044.44	1034.71	0.006
	20	2086.72	864.86	0.027	1868.74	1023.86	$-$ 0.079	2065.56	794.98	0.017
	25	2065.10	677.29	0.016	1909.50	881.30	$-$ 0.059	2058.69	621.74	0.013
	35	2048.99	439.73	0.008	1937.47	697.98	$-$ 0.046	2049.77	393.83	0.009
	45	2041.08	287.84	0.004	1980.39	556.09	$-$ 0.024	2030.59	267.07	$-$ 0.0001
20000	10	2176.74	1592.60	0.071	1644.80	1573.61	$-$ 0.190	2094.24	1422.64	0.031
	15	2117.40	1134.99	0.042	1797.84	1221.34	$-$ 0.114	2070.86	1029.37	0.019
	20	2082.63	861.30	0.025	1854.13	1024.60	$-$ 0.087	2049.24	796.45	0.008
	25	2057.96	683.46	0.013	1914.27	881.05	$-$ 0.051	2042.47	628.31	0.005
	35	2051.99	445.81	0.010	1964.50	694.98	$-$ 0.032	2037.87	403.77	0.003
	45	2039.77	295.42	0.004	1982.50	558.42	$-$ 0.023	2033.73	261.34	0.001
100000	10	2185.95	1598.78	0.076	1632.23	1575.20	$-$ 0.196	2100.97	1428.57	0.034
	15	2115.53	1130.26	0.041	1784.16	1217.45	$-$ 0.121	2069.654	1031.73	0.019
	20	2082.77	859.85	0.025	1864.19	1016.43	$-$ 0.082	2055.78	793.03	0.012
	25	2066.74	680.64	0.017	1903.56	878.48	$-$ 0.062	2045.53	626.45	0.007
	35	2051.39	440.21	0.010	1952.62	691.26	$-$ 0.038	2038.69	403.88	0.003
	45	2044.35	288.84	0.006	1977.69	561.55	$-$ 0.026	2034.82	257.82	0.001

Among the three estimators in Table 4, the product estimator had the smallest relative bias across the different numbers of repetitions and initial sample sizes. This estimator was observed to be superior to the other two estimators. In NACS, the product estimator is superior to the modified ratio estimator and inferior to the modified regression estimator, as shown in Tables 1 and 4.

The total cost of NACS is expected to be much lower than that of ACS. Since it is assumed that the auxiliary variable is abundant and the interest variable is rare, the cost of selecting, acquiring and measuring units with respect to the auxiliary variable is expected to be much smaller than the corresponding cost in ACS. ACS involves the cost of measuring both edge units and network units.

Table 5

Comparison of sample sizes and costs of sampling using ACS and NACS

		ACS		NACS
Number of samples ( $r$ )	Initial sample size ( $n$ )	Expected final sample size $E(n_{s})$	Expected sampling cost (in $)	Expected sample size $E(n_{s})$	Expected effective sample size $E(n_{e})$	Expected sampling cost (in $)
5000	10	21.32	426.40	21.32	13.40	310.64
	15	28.22	564.40	28.22	16.43	385.04
	20	33.84	676.80	33.84	18.17	431.08
	25	38.90	778.00	38.90	19.35	464.80
	35	48.10	962.00	48.10	20.75	511.20
	45	56.53	1130.60	56.53	21.42	541.46
10000	10	21.38	427.60	21.38	13.45	311.76
	15	28.21	564.20	28.21	16.44	385.22
	20	33.87	677.40	33.87	18.19	431.54
	25	38.90	778.00	38.90	19.38	465.40
	35	48.05	961.00	48.05	20.72	510.50
	45	56.58	1131.60	56.58	21.44	541.96
20000	10	21.40	428.00	21.40	13.49	312.60
	15	28.18	563.60	28.18	16.40	384.36
	20	33.92	678.40	33.92	18.25	432.84
	25	38.93	778.60	38.93	19.38	465.46
	35	48.08	961.60	48.08	20.72	510.56
	45	56.52	1130.40	56.52	21.42	541.44
100000	10	21.39	427.80	21.39	13.47	312.18
	15	28.22	564.40	28.22	16.44	385.24
	20	33.88	677.60	33.88	18.23	432.36
	25	38.92	778.40	38.92	19.36	465.04
	35	48.06	961.20	48.06	20.73	510.72
	45	56.53	1130.60	56.53	21.41	541.26

Further, NACS incurs no cost for taking observations on edge units, whereas ACS does: at the stage of network formation, the edge units are dropped without inspecting the variable of interest.

NACS assumes that the values of the auxiliary variable are known for all units in the population. For large geographical areas, such information is very difficult to obtain. This limits the applicability of NACS, so further research is required in that direction. One may think about using the idea of double sampling in NACS.

The proposed NACS methodology was examined in a small pilot study. A larger population would give a more precise idea of the behaviour of the proposed estimators.

We presented results for numbers of samples ranging from 5,000 to 100,000, with initial sample sizes varying from 10 to 45, drawn with probability proportional to size with replacement and without replacement. Table 1 shows that as the initial sample size increased, the standard errors of the HH and HT estimators decreased. Hence, our proposed methodology is consistent with the principle of statistical regularity. The modified regression estimator is more efficient than the modified ratio estimator and the product estimator, as shown in Tables 1 and 4.

7. Conclusions

If information on the auxiliary variable is available and it is known that the auxiliary variable and the variable of interest are negatively correlated, then NACS gives a very close estimate of the population total. Ratio and regression type estimators are biased for the population total, but in this case the regression type estimator gives a closer estimate of the population total than the ratio type and product estimators. Moreover, the estimated standard error of the regression estimator is smaller than that of the ratio and product estimators.

The estimator based on SRSWOR gives a poor estimate of the population total with a high standard error. Hence, under NACS, use of the regression estimator is recommended. NACS reduces cost and effort substantially. This method is very useful for sampling rare species, plants and diseases. It also has several applications in environmental science, ecology, forestry, health science, the mining industry and market research. The proposed modified ratio and regression estimators for NACS give better results. As the sample size increases, the standard errors of these estimators decrease. The efficiency of the modified regression estimator is greater than that of the modified ratio estimator as well as the modified HH and HT estimators of ACS. NACS is cost-efficient compared with ACS and other conventional sampling methods.

References

Adeleke, I.A., Esan, E.O., & Ray (2008). Horvitz-Thompson theorem as a tool for generalization of probability sampling techniques. Ghana Journal of Development Studies, 5(1), 80-94.

Bahl & Tuteja, R.K. (1991). Ratio and product type exponential estimators. Journal of Information and Optimization Sciences, 12, 159-163.

Basu (1969). Role of sufficiency and likelihood principles in survey theory. Sankhya, Ser. A, 31, 441-453.

Brown, J.A. (1994). The application of adaptive cluster sampling in ecological studies. Statistics in Ecology and Environmental Monitoring, 2, 86-97.

Cassel, C.M., Särndal, C.E., & Wretman, J.H. (1977). Foundations of Inference in Survey Sampling. New York: Wiley.

Christman & Lan (2001). Inverse adaptive cluster sampling. Biometrics, 57(4), 1050-1058.

Chutiman & Chiangpradit (2014). Ratio estimator in adaptive cluster sampling without replacement of networks. Journal of Probability and Statistics, Article ID 726398.

Dryver, A.I., & Chao, C.T. (2007). Ratio estimators in adaptive cluster sampling. Environmetrics, 18(6), 607-620.

Felix-Medina, M.H., & Thompson, S.K. (2004). Adaptive cluster double sampling. Biometrika, 91, 877-891.

Lee (1998). Two phase adaptive cluster sampling with unequal probabilities selection. J. Korean Statist. Soc., 27, 265-278.

Salehi, M.M., & Seber, G.A.F. (1997). Two stage adaptive cluster sampling. Biometrics, 53, 959-970.

Salehi, M.M., & Seber, G.A.F. (2002). Unbiased estimators for restricted adaptive cluster sampling. Aust. New Zeal. J. Statist., 44, 63-74.

Särndal, C.E., & Swensson (1987). A general view of estimation for two phases of selection with applications to two-phase sampling and nonresponse. International Statistical Review, 55(3).

Särndal, C.E., Swensson, & Wretman, J.H. (1992). Model assisted survey sampling. New York: Springer-Verlag.

Singh (2003). Advanced Sampling Theory With Applications, Vol. I & II. Kluwer Academic Publishers.

Thompson, S.K. (1990). Adaptive cluster sampling. JASA, 85(412), 1050-1058.

Thompson, S.K. (2002). Sampling (2nd ed.). Wiley.

Zacks (1969). Bayes sequential design of fixed size samples from finite population. JASA, 64, 1342-1369.