Twenty exponential type estimators with imputation are suggested to overcome the problem of missing data for a study variable in sample surveys. The suggested estimators are shown to be more efficient than the mean, ratio and regression methods of imputation and the estimators given by Singh and Horn (2000), Singh and Deo (2003), Singh (2009), Gira (2015) and Singh et al. (2016). The biases and mean square errors of the suggested estimators are derived. Simulation studies are also carried out to compare the performances of the suggested estimators.
Non-response is a major problem encountered by practitioners in sample surveys. Repeated surveys are generally more prone to this problem than single-occasion surveys. For example, in milk yield surveys an animal may be sold or may die during the survey period. In agricultural surveys, crops may be destroyed by natural calamities or disease during the course of the experiment. Thus, observations may be missing for some of the time stages. Such non-response (missingness) may have different patterns and causes.
It is a well recognized fact that inference concerning population parameters can be wrong if suitable information about the nature of the non-response is not available. A natural question is what one needs to assume to justify ignoring the missingness mechanism. Rubin (1987) addressed three concepts: missing at random (MAR), observed at random (OAR) and parameter distinctness (PD). Heitjan and Basu (1996) clearly distinguished the meanings of missing at random (MAR) and missing completely at random (MCAR).
Many methods are used to reduce the negative impact of non-response in sample surveys. Imputation is one of them: it fills in the incomplete data so that standard analytic models in statistics can be applied, and thus handles the missing data problem at the beginning of the analysis. To deal with missing values effectively, Sande (1979) and Kalton et al. (1981) proposed imputation methods that make incomplete data sets structurally complete and their analysis simple. Imputation can also be carried out with the aid of an auxiliary variable, if one is available. For the MCAR response mechanism, Singh and Horn (2000) suggested a compromised method of imputation. Singh and Deo (2003), Ahmed et al. (2006), Toutenburg et al. (2008), Kadilar and Cingi (2008), Singh (2009), Singh et al. (2010), Diana and Perri (2010), Gira (2015), Singh et al. (2016) and Prasad (2016) have suggested several new imputation-based methods with the aid of an auxiliary variable.
Motivated by the recent work of Singh et al. (2016) and using the missing completely at random (MCAR) response mechanism, in this article we suggest ratio exponential type estimators with imputation to tackle the problem of non-response in sample surveys. The estimators are proposed for estimating the population mean, and their behavior is subsequently studied. The performances of the suggested estimators are compared with those of the existing estimators and suitable recommendations are made.
The outline of this paper is as follows. In Section 2, we consider several estimators of the finite population mean under non-response that are available in the literature. The formulation of the suggested estimators is given in Section 3. In Section 4, properties of the suggested estimators are discussed. A simulation study is conducted in Section 5, an analysis of the simulation results is given in Section 6 and some concluding remarks are given in Section 7.
Let us consider a finite population $U = (U_1, U_2, \ldots, U_N)$ of size $N$, with values $y_i$ of the study variable $y$ and values $x_i$ of the auxiliary variable $x$ $(i = 1, 2, \ldots, N)$. For the estimation of the population mean $\bar{Y}$, a random sample $s$ of size $n$ is drawn by simple random sampling without replacement (SRSWOR). Assuming the non-response to be random, suppose that there are $r$ responding and $(n - r)$ non-responding units among the $n$ sampled units. Let $R$ denote the set of the $r$ responding units and $R^c$ the set of the $(n - r)$ non-responding units.
When the non-response observations are discarded, it is customary to estimate the population mean $\bar{Y}$ by
$$\bar{y}_r = \frac{1}{r} \sum_{i \in R} y_i. \qquad (1)$$
When the non-response observations are not discarded and some imputation method is followed, the complete data set is specified by
$$y_{.i} = \begin{cases} y_i & \text{if } i \in R, \\ \tilde{y}_i & \text{if } i \in R^c. \end{cases} \qquad (2)$$
The general point estimator of the population mean $\bar{Y}$ takes the form
$$\bar{y}_s = \frac{1}{n} \sum_{i \in s} y_{.i} = \frac{1}{n} \left[ \sum_{i \in R} y_i + \sum_{i \in R^c} \tilde{y}_i \right]. \qquad (3)$$
Here, $\tilde{y}_i$ denotes the imputed value of the study variable for the $i$-th non-responding unit.
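The point estimator described above is the mean of the completed data set, that is, the observed values together with the imputed ones. The following is a minimal sketch with made-up numbers; the function name `point_estimator` and the sample values are hypothetical, and mean imputation is used only because it is the simplest choice of imputed value.

```python
def point_estimator(y_resp, imputed):
    """Mean of the completed data set: observed values plus imputed values."""
    n = len(y_resp) + len(imputed)
    return (sum(y_resp) + sum(imputed)) / n

# Hypothetical sample of n = 5 units; the last two did not respond.
y_resp = [12.0, 15.0, 9.0]      # observed y-values (responding units)
n_missing = 2                   # number of non-responding units

# Mean imputation: every missing value is replaced by the response mean.
y_bar_r = sum(y_resp) / len(y_resp)
imputed = [y_bar_r] * n_missing

estimate = point_estimator(y_resp, imputed)
```

With mean imputation the completed-data mean reduces to the response mean, so `estimate` and `y_bar_r` coincide here.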
Related review of some existing estimators
In this section, we consider several estimators for estimating the population mean under non-response.
Mean method of imputation
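Under the mean method of imputation, the point estimator Eq. 3 of the population mean $\bar{Y}$ becomes
$$\bar{y}_m = \frac{1}{r} \sum_{i \in R} y_i = \bar{y}_r, \qquad (4)$$
which is known as the response mean estimator of the population mean $\bar{Y}$. The variance of the response sample mean $\bar{y}_r$ is given by
$$V(\bar{y}_m) = \left( \frac{1}{r} - \frac{1}{N} \right) S_y^2 = \left( \frac{1}{r} - \frac{1}{N} \right) \bar{Y}^2 C_y^2, \qquad (5)$$
where $S_y^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{Y})^2$ and $C_y = S_y / \bar{Y}$ is the coefficient of variation of $y$.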
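The variance of the response mean can be checked empirically. The sketch below assumes the standard SRSWOR result $V(\bar{y}_r) = (1/r - 1/N) S_y^2$ with a fixed number $r$ of respondents selected completely at random; the population is synthetic and all sizes (`N`, `n`, `r`, `reps`) are illustrative choices, not values from the paper.

```python
import random
import statistics

random.seed(12345)

# Hypothetical finite population (synthetic values, for illustration only).
N, n, r = 200, 40, 30
population = [random.gauss(50.0, 10.0) for _ in range(N)]
S2_y = statistics.variance(population)   # population mean square (divisor N - 1)

# Variance of the response mean under SRSWOR with MCAR non-response.
theoretical = (1.0 / r - 1.0 / N) * S2_y

reps = 20000
means = []
for _ in range(reps):
    sample = random.sample(population, n)     # SRSWOR of size n
    respondents = random.sample(sample, r)    # MCAR: r respondents at random
    means.append(sum(respondents) / r)

empirical = statistics.pvariance(means)
ratio = empirical / theoretical
```

Up to Monte Carlo error, `ratio` should be close to 1, which is what the variance expression predicts when respondents are a random subsample of the SRSWOR sample.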
Ratio method of imputation
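Under the ratio method of imputation, the point estimator Eq. 3 of the population mean $\bar{Y}$ becomes
$$\bar{y}_{RAT} = \bar{y}_r \frac{\bar{x}_n}{\bar{x}_r}, \qquad (6)$$
which is known as the ratio estimator of the population mean $\bar{Y}$, where $\bar{x}_n$ and $\bar{x}_r$ are the means of the auxiliary variable over the $n$ sampled units and the $r$ responding units, respectively. The MSE of $\bar{y}_{RAT}$, obtained under the MCAR mechanism up to the first order of large-sample approximation, is given by
$$MSE(\bar{y}_{RAT}) = \bar{Y}^2 \left[ \left( \frac{1}{r} - \frac{1}{N} \right) C_y^2 + \left( \frac{1}{r} - \frac{1}{n} \right) \left( C_x^2 - 2 \rho C_y C_x \right) \right], \qquad (7)$$
where $C_y$ and $C_x$ are the coefficients of variation of $y$ and $x$ and $\rho$ is the correlation coefficient between them.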
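To see how ratio imputation leads to the ratio estimator, the sketch below imputes each missing value as $\hat{b} x_i$ with $\hat{b} = \sum_{i \in R} y_i / \sum_{i \in R} x_i$ (the usual ratio imputation rule, stated here as an assumption) and compares the completed-data mean with the closed form $\bar{y}_r \bar{x}_n / \bar{x}_r$. The sample values are hypothetical.

```python
# Hypothetical sample of n = 5 units; x is known for all of them.
x_resp, y_resp = [4.0, 6.0, 5.0], [8.0, 13.0, 9.0]   # responding units
x_miss = [7.0, 3.0]                                   # non-responding units

n, r = len(x_resp) + len(x_miss), len(x_resp)

# Ratio imputation: a missing y_i is replaced by b_hat * x_i.
b_hat = sum(y_resp) / sum(x_resp)
imputed = [b_hat * x for x in x_miss]

# Point estimator: mean of the completed data set.
y_bar_s = (sum(y_resp) + sum(imputed)) / n

# Closed form of the same estimator.
y_bar_r = sum(y_resp) / r
x_bar_r = sum(x_resp) / r
x_bar_n = (sum(x_resp) + sum(x_miss)) / n
closed_form = y_bar_r * x_bar_n / x_bar_r
```

Both `y_bar_s` and `closed_form` equal 10 for these numbers, illustrating that the imputation step and the closed-form estimator are two views of the same quantity.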
Regression method of imputation
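Under the regression method of imputation, the point estimator Eq. 3 of the population mean $\bar{Y}$ is given by
$$\bar{y}_{REG} = \bar{y}_r + \hat{b} \left( \bar{x}_n - \bar{x}_r \right), \qquad (8)$$
which is known as the regression estimator of the population mean $\bar{Y}$, where $\hat{b} = s_{xy} / s_x^2$ is the regression coefficient computed from the responding units. The MSE of $\bar{y}_{REG}$, obtained under the MCAR mechanism up to the first order of approximation, is given by
$$MSE(\bar{y}_{REG}) = \bar{Y}^2 C_y^2 \left[ \left( \frac{1}{r} - \frac{1}{N} \right) - \left( \frac{1}{r} - \frac{1}{n} \right) \rho^2 \right]. \qquad (9)$$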
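As a numerical illustration, the first-order variance and MSE expressions of the mean, ratio and regression imputation estimators can be evaluated from the summary values reported for Data 1. The sketch assumes the standard first-order forms written in terms of $\bar{Y}$, $C_y$, $C_x$ and $\rho$; the variable names are illustrative.

```python
# Data 1 summary values as tabulated in the paper (Cochran, 1977, page 325).
N, n, r = 10, 4, 3
Y_bar, C_y, C_x, rho = 101.1, 0.1449, 0.1281, 0.6515

theta_rN = 1.0 / r - 1.0 / N    # factor attached to the responding units
theta_rn = 1.0 / r - 1.0 / n    # additional factor due to non-response

# Assumed first-order expressions (standard forms under SRSWOR and MCAR).
mse_mean = Y_bar**2 * theta_rN * C_y**2
mse_ratio = Y_bar**2 * (theta_rN * C_y**2
                        + theta_rn * (C_x**2 - 2.0 * rho * C_y * C_x))
mse_reg = Y_bar**2 * C_y**2 * (theta_rN - theta_rn * rho**2)
```

For Data 1 these evaluate to roughly 50.07, 43.45 and 42.48, so the regression estimator is the most efficient of the three, as expected when the correlation is moderately high.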
Singh and Horn (2000) estimator
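Singh and Horn (2000) suggested a compromised imputation in survey sampling. Under this method of imputation, the point estimator Eq. 3 of the population mean $\bar{Y}$ becomes
$$\bar{y}_{COMP} = \alpha \bar{y}_r + (1 - \alpha) \bar{y}_r \frac{\bar{x}_n}{\bar{x}_r}, \qquad (10)$$
where $\alpha$ is a suitable constant. The optimum value of $\alpha$ is $\alpha_{opt} = 1 - \rho C_y / C_x$.
Now, taking the optimum value of $\alpha$ in Eq. 10, we obtain the optimum MSE of $\bar{y}_{COMP}$ under the MCAR mechanism up to the first order of approximation, given by
$$MSE(\bar{y}_{COMP})_{opt} = \bar{Y}^2 C_y^2 \left[ \left( \frac{1}{r} - \frac{1}{N} \right) - \left( \frac{1}{r} - \frac{1}{n} \right) \rho^2 \right]. \qquad (11)$$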
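The role of the optimum constant can be illustrated numerically. The sketch below assumes the usual form of the compromised estimator, $\alpha \bar{y}_r + (1 - \alpha) \bar{y}_r \bar{x}_n / \bar{x}_r$, with first-order MSE $\bar{Y}^2 [\theta_{r,N} C_y^2 + \theta_{r,n} \{ (1-\alpha)^2 C_x^2 - 2 (1-\alpha) \rho C_y C_x \}]$; these expressions, the function name `mse_comp` and the grid are assumptions made for illustration, evaluated with the Data 1 summary values.

```python
N, n, r = 10, 4, 3
Y_bar, C_y, C_x, rho = 101.1, 0.1449, 0.1281, 0.6515
theta_rN, theta_rn = 1.0 / r - 1.0 / N, 1.0 / r - 1.0 / n

def mse_comp(alpha):
    """Assumed first-order MSE of alpha*y_bar_r + (1 - alpha)*(ratio estimator)."""
    k = 1.0 - alpha
    return Y_bar**2 * (theta_rN * C_y**2
                       + theta_rn * (k**2 * C_x**2 - 2.0 * k * rho * C_y * C_x))

# Closed-form optimum and the regression-estimator MSE for comparison.
alpha_opt = 1.0 - rho * C_y / C_x
mse_opt = mse_comp(alpha_opt)
mse_reg = Y_bar**2 * C_y**2 * (theta_rN - theta_rn * rho**2)

# A coarse grid search over alpha in [-1, 2] should not beat the optimum.
grid_min = min(mse_comp(a / 1000.0) for a in range(-1000, 2001))
```

Under these assumptions the optimum MSE coincides with that of the regression estimator, which is consistent with the identical efficiency figures reported for these estimators in the simulation tables.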
Singh and Deo (2003) estimator
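Singh and Deo (2003) suggested imputation by power transformation in survey sampling. Under this method of imputation, the point estimator Eq. 3 of the population mean $\bar{Y}$ becomes
$$\bar{y}_{SD} = \bar{y}_r \left( \frac{\bar{x}_n}{\bar{x}_r} \right)^{\beta}, \qquad (12)$$
where $\beta$ is a suitable constant. The optimum value of $\beta$ is $\beta_{opt} = \rho C_y / C_x$.
Now, taking the optimum value of $\beta$ in Eq. 12, we obtain the optimum MSE of $\bar{y}_{SD}$ under the MCAR mechanism up to the first order of approximation, given by
$$MSE(\bar{y}_{SD})_{opt} = \bar{Y}^2 C_y^2 \left[ \left( \frac{1}{r} - \frac{1}{N} \right) - \left( \frac{1}{r} - \frac{1}{n} \right) \rho^2 \right]. \qquad (13)$$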
Singh (2009) estimator
Singh (2009) suggested a new method of imputation in survey sampling. Under this method of imputation, the point estimator Eq. 3 of the population mean becomes
where is a suitable constant. The optimum value of is .
Now, using the optimum value of in Eq. 14, we get the optimum MSE of the estimator under the MCAR mechanism up to the first order of large-sample approximation, given by
Gira (2015) estimator
Gira (2015) suggested a new method of ratio type imputation in sample surveys. Under this method of imputation, the point estimator Eq. 3 of the population mean becomes
where is a suitably chosen constant. The optimum value of is .
Now, using the optimum value of in Eq. 16, we get the optimum MSE of the estimator under the MCAR mechanism up to the first order of large-sample approximation, given by
Singh et al. (2016) estimator
Singh et al. (2016) proposed three single imputation methods and the resulting estimators of the population mean, which are given by
The MSEs of these estimators, obtained under the MCAR mechanism up to the first order of large-sample approximation, are given by
Formulation of the problem
In this section, ratio exponential type estimators with imputation have been suggested for estimating the population mean . The study variable after imputation for the suggested method of imputation becomes
where , for different suitable choices of and .
Under this suggested method of imputation, the point estimator Eq. 3 of the population mean is given by
Here one constant is suitably chosen so that the MSE of the suggested estimator is minimum, and the other two constants are either real numbers or known values of population parameters of the auxiliary variable, such as the standard deviation $S_x$, coefficient of kurtosis $\beta_2(x)$, coefficient of skewness $\beta_1(x)$, coefficient of variation $C_x$ and correlation coefficient $\rho$.
We remark that different choices of these two constants in Eq. 25 give twenty ratio exponential type estimators, as shown in Table 1.
Table 1. Some members of the suggested ratio exponential type estimators with imputation
Estimators
1
0
1
1
1
1
1
1
1
1
1
Properties of the suggested estimators
The biases and mean square errors (MSEs) of the suggested estimators, for different suitable choices of the constants, are derived up to the first order of large-sample approximation under the following transformations:
Under the above transformations, the estimators take the following forms:
where .
For different suitable choices of and , the can be changed in and the Eq. 26 can be written as
where
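Table 2. Parameters of data sets

| Parameter | Data 1 | Data 2 | Data 3 |
|---|---|---|---|
| Source | Cochran (1977), page 325 | Koyuncu and Kadilar (2009) | Lohr (1999) |
| $N$ | 10 | 923 | 3059 |
| $n$ | 4 | 180 | 428 |
| $r$ | 3 | 144 | 128 |
| $\bar{Y}$ | 101.1 | 436.4345 | 308582.4 |
| $\bar{X}$ | 58.8 | 11440.498 | 56.5 |
| $C_y$ | 0.1449 | 1.7183 | 1.3783 |
| $C_x$ | 0.1281 | 1.8645 | 1.2796 |
| $\beta_2(x)$ | 0.3814 | 18.7208 | 7.5 |
| $\beta_1(x)$ | 0.5764 | 3.9365 | 2.4 |
| $\rho$ | 0.6515 | 0.9543 | 0.677428 |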
Neglecting the terms in the errors having power greater than two, Eq. 27 can be written as
Taking expectations on both sides of Eq. 28, we get the biases of the proposed estimators up to the first order of large-sample approximation as
Now, squaring both sides of Eq. 28 and neglecting the terms in the errors having power greater than two, we have
Taking expectations on both sides of Eq. 30, we get the MSEs of the suggested estimators up to the first order of large-sample approximation as
Differentiating Eq. 31 with respect to the unknown constant and equating the derivative to zero, we get the optimum value of the constant as
After substituting this optimum value in Eq. 31, we obtain the optimum MSEs of the suggested estimators as
Simulation study
Table 3. Biases, mean square errors and percent relative biases of the suggested estimators for Data 1 ($N = 10$, $n = 4$, $r = 3$, $\rho = 0.6515$)

| Estimator no. | Absolute Bias | Mean square error (MSE) | Relative Bias (RB) | Percent relative Bias (PRB) |
|---|---|---|---|---|
| 1 | 0.303351 | 30.9235 | 0.0545508 | 5.45508 |
| 2 | 0.307324 | 31.0804 | 0.0551257 | 5.51257 |
| 3 | 0.301793 | 30.8638 | 0.0543230 | 5.43230 |
| 4 | 0.305661 | 31.0139 | 0.0548860 | 5.48860 |
| 5 | 0.303869 | 30.9436 | 0.0546262 | 5.46262 |
| 6 | 0.305958 | 31.0257 | 0.0549289 | 5.49289 |
| 7 | 0.303888 | 30.9444 | 0.0546290 | 5.46290 |
| 8 | 0.303146 | 30.9156 | 0.0545208 | 5.45208 |
| 9 | 0.303661 | 30.9355 | 0.0545959 | 5.45959 |
| 10 | 0.303420 | 30.9262 | 0.0545608 | 5.45608 |
| 11 | 0.303701 | 30.9371 | 0.0546018 | 5.46018 |
| 12 | 0.310144 | 31.1957 | 0.0555285 | 5.55285 |
| 13 | 0.300631 | 30.8199 | 0.0541525 | 5.41525 |
| 14 | 0.304248 | 30.9584 | 0.0546813 | 5.46813 |
| 15 | 0.307830 | 31.1008 | 0.0551983 | 5.51983 |
| 16 | 0.290497 | 30.4596 | 0.0526355 | 5.26355 |
| 17 | 0.309385 | 31.1643 | 0.0554205 | 5.54205 |
| 18 | 0.300949 | 30.8318 | 0.0541992 | 5.41992 |
| 19 | 0.306875 | 31.0623 | 0.0550610 | 5.50610 |
| 20 | 0.304145 | 30.9544 | 0.0546664 | 5.46664 |
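The tabulated relative biases appear to be the absolute bias divided by the square root of the MSE, with the percent relative bias being 100 times that ratio. This relationship is inferred from the reported numbers rather than stated in the text, so the sketch below should be read as a consistency check on the first three rows of the Data 1 table; the function name `relative_bias` is hypothetical.

```python
import math

# First three rows of the Data 1 table:
# (absolute bias, MSE, reported relative bias).
rows = [
    (0.303351, 30.9235, 0.0545508),
    (0.307324, 31.0804, 0.0551257),
    (0.301793, 30.8638, 0.0543230),
]

def relative_bias(abs_bias, mse):
    """Absolute bias scaled by the root mean square error (inferred definition)."""
    return abs_bias / math.sqrt(mse)

recomputed = [relative_bias(b, m) for b, m, _ in rows]
percent = [100.0 * rb for rb in recomputed]
max_gap = max(abs(rb - reported) for rb, (_, _, reported) in zip(recomputed, rows))
```

The recomputed values agree with the reported ones to within rounding of the tabulated MSEs.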
Table 4. Biases, mean square errors and percent relative biases of the suggested estimators for Data 2 ($N = 923$, $n = 180$, $r = 144$, $\rho = 0.9543$)

| Estimator no. | Absolute Bias | Mean square error (MSE) | Relative Bias (RB) | Percent relative Bias (PRB) |
|---|---|---|---|---|
| 1 | 2.51937 | 849.428 | 0.0864429 | 8.64429 |
| 2 | 2.51991 | 849.556 | 0.0864547 | 8.64547 |
| 3 | 2.52932 | 851.815 | 0.0866623 | 8.66623 |
| 4 | 2.52147 | 849.931 | 0.0864892 | 8.64892 |
| 5 | 2.52037 | 849.666 | 0.0864648 | 8.64648 |
| 6 | 2.51988 | 849.550 | 0.0864541 | 8.64541 |
| 7 | 2.51937 | 849.428 | 0.0864429 | 8.64429 |
| 8 | 2.51938 | 849.428 | 0.0864429 | 8.64429 |
| 9 | 2.51938 | 849.428 | 0.0864429 | 8.64429 |
| 10 | 2.51937 | 849.428 | 0.0864429 | 8.64429 |
| 11 | 2.51937 | 849.428 | 0.0864429 | 8.64429 |
| 12 | 2.51951 | 849.461 | 0.0864459 | 8.64459 |
| 13 | 2.52190 | 850.035 | 0.0864988 | 8.64988 |
| 14 | 2.51963 | 849.489 | 0.0864485 | 8.64485 |
| 15 | 2.51950 | 849.459 | 0.0864458 | 8.64458 |
| 16 | 2.52471 | 850.709 | 0.0865608 | 8.65608 |
| 17 | 2.51993 | 849.562 | 0.0864552 | 8.64552 |
| 18 | 2.52979 | 851.930 | 0.0866728 | 8.66728 |
| 19 | 2.52157 | 849.955 | 0.0864914 | 8.64914 |
| 20 | 2.52041 | 849.678 | 0.0864659 | 8.64659 |
In this section, the performance of the suggested estimators is evaluated using data sets previously used in the literature. The percent relative efficiencies (PREs) of the suggested estimators with respect to the mean method of imputation, ratio method of imputation, regression method of imputation, Singh and Horn (2000) estimator, Singh and Deo (2003) estimator, Singh (2009) estimator, Gira (2015) estimator and Singh et al. (2016) estimators are computed as
$$PRE(T, \hat{\theta}) = \frac{MSE(\hat{\theta})}{MSE(T)} \times 100,$$
where $T$ denotes a suggested estimator and $\hat{\theta}$ denotes one of the existing estimators listed above.
Simulation studies are carried out for different choices of the parameters $N$, $n$, $r$ and $\rho$. Results are presented in Tables 3–8.
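The percent relative efficiency is the ratio of the MSE of an existing estimator to that of a suggested estimator, multiplied by 100. As a small worked example, the sketch below recomputes one Data 1 entry, taking the first-order variance of the mean-imputation estimator as $(1/r - 1/N) \bar{Y}^2 C_y^2$ (an assumed standard form) and the MSE reported for the first suggested estimator; the function name `pre` is illustrative.

```python
def pre(mse_existing, mse_suggested):
    """Percent relative efficiency of a suggested estimator over an existing one."""
    return 100.0 * mse_existing / mse_suggested

# Data 1 summary values and the MSE reported for the first suggested estimator.
N, r = 10, 3
Y_bar, C_y = 101.1, 0.1449
mse_mean = (1.0 / r - 1.0 / N) * (Y_bar * C_y) ** 2
mse_suggested = 30.9235

pre_mean = pre(mse_mean, mse_suggested)
```

The result is about 161.93, matching the first entry of the Data 1 efficiency table; values above 100 indicate that the suggested estimator has the smaller MSE.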
Analysis of simulation results
Table 5. Biases, mean square errors and percent relative biases of the suggested estimators for Data 3 ($N = 3059$, $n = 428$, $r = 128$, $\rho = 0.677428$)

| Estimator no. | Absolute Bias | Mean square error (MSE) | Relative Bias (RB) | Percent relative Bias (PRB) |
|---|---|---|---|---|
| 1 | 2514.34 | 7.88E+08 | 0.0895888 | 8.95888 |
| 2 | 2554.18 | 7.92E+08 | 0.0907398 | 9.07398 |
| 3 | 2777.15 | 8.22E+08 | 0.0968356 | 9.68356 |
| 4 | 2607.23 | 7.99E+08 | 0.0922440 | 9.22440 |
| 5 | 2565.03 | 7.94E+08 | 0.0910499 | 9.10499 |
| 6 | 2541.51 | 7.91E+08 | 0.0903758 | 9.03758 |
| 7 | 2514.90 | 7.88E+08 | 0.0896052 | 8.96052 |
| 8 | 2518.55 | 7.88E+08 | 0.0897113 | 8.97113 |
| 9 | 2515.69 | 7.88E+08 | 0.0896281 | 8.96281 |
| 10 | 2515.06 | 7.88E+08 | 0.0896098 | 8.96098 |
| 11 | 2514.72 | 7.88E+08 | 0.0895999 | 8.95999 |
| 12 | 2531.15 | 7.90E+08 | 0.0900765 | 9.00765 |
| 13 | 2633.52 | 8.02E+08 | 0.0929775 | 9.29775 |
| 14 | 2535.79 | 7.90E+08 | 0.0902108 | 9.02108 |
| 15 | 2525.76 | 7.89E+08 | 0.0899205 | 8.99205 |
| 16 | 2726.17 | 8.15E+08 | 0.0954955 | 9.54955 |
| 17 | 2572.57 | 7.95E+08 | 0.0912649 | 9.12649 |
| 18 | 2878.08 | 8.39E+08 | 0.0993900 | 9.93900 |
| 19 | 2648.33 | 8.04E+08 | 0.0933869 | 9.33869 |
| 20 | 2588.22 | 7.96E+08 | 0.0917088 | 9.17088 |
Table 6. Percent relative efficiencies of the suggested estimators over the existing estimators for Data 1 ($N = 10$, $n = 4$, $r = 3$, $\rho = 0.6515$)

| Estimator no. | Mean | Ratio | Regression | Singh and Horn (2000) | Singh and Deo (2003) | Singh (2009) | Gira (2015) | Singh et al. (2016), first | Singh et al. (2016), second | Singh et al. (2016), third |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 161.930 | 140.511 | 137.383 | 137.383 | 137.383 | 137.383 | 137.383 | 101.426 | 124.083 | 110.381 |
| 2 | 161.113 | 139.801 | 136.689 | 136.689 | 136.689 | 136.689 | 136.689 | 100.914 | 123.457 | 109.824 |
| 3 | 162.243 | 140.783 | 137.649 | 137.649 | 137.649 | 137.649 | 137.649 | 101.622 | 124.324 | 110.595 |
| 4 | 161.458 | 140.101 | 136.982 | 136.982 | 136.982 | 136.982 | 136.982 | 101.130 | 123.722 | 110.059 |
| 5 | 161.825 | 140.419 | 137.294 | 137.294 | 137.294 | 137.294 | 137.294 | 101.360 | 124.003 | 110.310 |
| 6 | 161.396 | 140.048 | 136.930 | 136.930 | 136.930 | 136.930 | 136.930 | 101.092 | 123.675 | 110.018 |
| 7 | 161.821 | 140.416 | 137.290 | 137.290 | 137.290 | 137.290 | 137.290 | 101.358 | 124.000 | 110.307 |
| 8 | 161.971 | 140.547 | 137.418 | 137.418 | 137.418 | 137.418 | 137.418 | 101.452 | 124.115 | 110.409 |
| 9 | 161.867 | 140.456 | 137.330 | 137.330 | 137.330 | 137.330 | 137.330 | 101.387 | 124.035 | 110.338 |
| 10 | 161.916 | 140.498 | 137.371 | 137.371 | 137.371 | 137.371 | 137.371 | 101.417 | 124.073 | 110.372 |
| 11 | 161.859 | 140.449 | 137.323 | 137.323 | 137.323 | 137.323 | 137.323 | 101.381 | 124.029 | 110.333 |
| 12 | 160.517 | 139.285 | 136.184 | 136.184 | 136.184 | 136.184 | 136.184 | 100.541 | 123.001 | 109.418 |
| 13 | 162.474 | 140.983 | 137.845 | 137.845 | 137.845 | 137.845 | 137.845 | 101.767 | 124.501 | 110.752 |
| 14 | 161.747 | 140.352 | 137.228 | 137.228 | 137.228 | 137.228 | 137.228 | 101.312 | 123.944 | 110.257 |
| 15 | 161.007 | 139.710 | 136.600 | 136.600 | 136.600 | 136.600 | 136.600 | 100.848 | 123.376 | 109.752 |
| 16 | 164.396 | 142.651 | 139.475 | 139.475 | 139.475 | 139.475 | 139.475 | 102.971 | 125.973 | 112.062 |
| 17 | 160.679 | 139.425 | 136.321 | 136.321 | 136.321 | 136.321 | 136.321 | 100.642 | 123.125 | 109.528 |
| 18 | 162.411 | 140.928 | 137.791 | 137.791 | 137.791 | 137.791 | 137.791 | 101.728 | 124.452 | 110.709 |
| 19 | 161.206 | 139.883 | 136.769 | 136.769 | 136.769 | 136.769 | 136.769 | 100.973 | 123.529 | 109.888 |
| 20 | 161.768 | 140.371 | 137.246 | 137.246 | 137.246 | 137.246 | 137.246 | 101.325 | 123.960 | 110.271 |
Table 7. Percent relative efficiencies of the suggested estimators over the existing estimators for Data 2 ($N = 923$, $n = 180$, $r = 144$, $\rho = 0.9543$)

| Estimator no. | Mean | Ratio | Regression | Singh and Horn (2000) | Singh and Deo (2003) | Singh (2009) | Gira (2015) | Singh et al. (2016), first | Singh et al. (2016), second | Singh et al. (2016), third |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 388.045 | 305.876 | 304.303 | 304.303 | 304.303 | 304.303 | 304.303 | 121.846 | 126.547 | 123.004 |
| 2 | 387.987 | 305.830 | 304.257 | 304.257 | 304.257 | 304.257 | 304.257 | 121.828 | 126.528 | 122.986 |
| 3 | 386.958 | 305.019 | 303.450 | 303.450 | 303.450 | 303.450 | 303.450 | 121.505 | 126.192 | 122.659 |
| 4 | 387.816 | 305.695 | 304.123 | 304.123 | 304.123 | 304.123 | 304.123 | 121.774 | 126.472 | 122.931 |
| 5 | 387.937 | 305.790 | 304.218 | 304.218 | 304.218 | 304.218 | 304.218 | 121.812 | 126.512 | 122.970 |
| 6 | 387.990 | 305.832 | 304.259 | 304.259 | 304.259 | 304.259 | 304.259 | 121.829 | 126.529 | 122.986 |
| 7 | 388.045 | 305.876 | 304.303 | 304.303 | 304.303 | 304.303 | 304.303 | 121.846 | 126.547 | 123.004 |
| 8 | 388.045 | 305.876 | 304.303 | 304.303 | 304.303 | 304.303 | 304.303 | 121.846 | 126.547 | 123.004 |
| 9 | 388.045 | 305.876 | 304.303 | 304.303 | 304.303 | 304.303 | 304.303 | 121.846 | 126.547 | 123.004 |
| 10 | 388.045 | 305.876 | 304.303 | 304.303 | 304.303 | 304.303 | 304.303 | 121.846 | 126.547 | 123.004 |
| 11 | 388.045 | 305.876 | 304.303 | 304.303 | 304.303 | 304.303 | 304.303 | 121.846 | 126.547 | 123.004 |
| 12 | 388.031 | 305.864 | 304.291 | 304.291 | 304.291 | 304.291 | 304.291 | 121.842 | 126.542 | 122.999 |
| 13 | 387.769 | 305.657 | 304.086 | 304.086 | 304.086 | 304.086 | 304.086 | 121.759 | 126.457 | 122.916 |
| 14 | 388.018 | 305.854 | 304.281 | 304.281 | 304.281 | 304.281 | 304.281 | 121.838 | 126.538 | 122.995 |
| 15 | 388.031 | 305.865 | 304.292 | 304.292 | 304.292 | 304.292 | 304.292 | 121.842 | 126.542 | 123.000 |
| 16 | 387.461 | 305.415 | 303.845 | 303.845 | 303.845 | 303.845 | 303.845 | 121.663 | 126.357 | 122.819 |
| 17 | 387.984 | 305.828 | 304.255 | 304.255 | 304.255 | 304.255 | 304.255 | 121.827 | 126.527 | 122.985 |
| 18 | 386.906 | 304.978 | 303.409 | 303.409 | 303.409 | 303.409 | 303.409 | 121.488 | 126.175 | 122.643 |
| 19 | 387.805 | 305.686 | 304.114 | 304.114 | 304.114 | 304.114 | 304.114 | 121.771 | 126.469 | 122.928 |
| 20 | 387.932 | 305.786 | 304.214 | 304.214 | 304.214 | 304.214 | 304.214 | 121.810 | 126.510 | 122.968 |
Table 8. Percent relative efficiencies of the suggested estimators over the existing estimators for Data 3 ($N = 3059$, $n = 428$, $r = 128$, $\rho = 0.677428$)

| Estimator no. | Mean | Ratio | Regression | Singh and Horn (2000) | Singh and Deo (2003) | Singh (2009) | Gira (2015) | Singh et al. (2016), first | Singh et al. (2016), second | Singh et al. (2016), third |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 171.911 | 122.122 | 114.198 | 114.198 | 114.198 | 114.198 | 114.198 | 128.199 | 159.450 | 112.742 |
| 2 | 170.897 | 121.402 | 113.525 | 113.525 | 113.525 | 113.525 | 113.525 | 127.443 | 158.510 | 112.077 |
| 3 | 164.632 | 116.951 | 109.363 | 109.363 | 109.363 | 109.363 | 109.363 | 122.771 | 152.699 | 107.969 |
| 4 | 169.497 | 120.407 | 112.595 | 112.595 | 112.595 | 112.595 | 112.595 | 126.398 | 157.211 | 111.159 |
| 5 | 170.615 | 121.202 | 113.338 | 113.338 | 113.338 | 113.338 | 113.338 | 127.233 | 158.249 | 111.893 |
| 6 | 171.223 | 121.633 | 113.741 | 113.741 | 113.741 | 113.741 | 113.741 | 127.686 | 158.812 | 112.291 |
| 7 | 171.897 | 122.112 | 114.189 | 114.189 | 114.189 | 114.189 | 114.189 | 128.188 | 159.437 | 112.733 |
| 8 | 171.805 | 122.047 | 114.128 | 114.128 | 114.128 | 114.128 | 114.128 | 128.120 | 159.352 | 112.673 |
| 9 | 171.877 | 122.098 | 114.176 | 114.176 | 114.176 | 114.176 | 114.176 | 128.173 | 159.419 | 112.720 |
| 10 | 171.893 | 122.109 | 114.186 | 114.186 | 114.186 | 114.186 | 114.186 | 128.185 | 159.433 | 112.730 |
| 11 | 171.901 | 122.115 | 114.192 | 114.192 | 114.192 | 114.192 | 114.192 | 128.191 | 159.441 | 112.736 |
| 12 | 171.487 | 121.821 | 113.917 | 113.917 | 113.917 | 113.917 | 113.917 | 127.883 | 159.057 | 112.464 |
| 13 | 168.781 | 119.899 | 112.119 | 112.119 | 112.119 | 112.119 | 112.119 | 125.865 | 156.548 | 110.690 |
| 14 | 171.369 | 121.737 | 113.838 | 113.838 | 113.838 | 113.838 | 113.838 | 127.795 | 158.948 | 112.387 |
| 15 | 171.624 | 121.918 | 114.007 | 114.007 | 114.007 | 114.007 | 114.007 | 127.984 | 159.184 | 112.554 |
| 16 | 166.151 | 118.030 | 110.372 | 110.372 | 110.372 | 110.372 | 110.372 | 123.903 | 154.108 | 108.965 |
| 17 | 170.418 | 121.061 | 113.207 | 113.207 | 113.207 | 113.207 | 113.207 | 127.086 | 158.066 | 111.763 |
| 18 | 161.481 | 114.713 | 107.270 | 107.270 | 107.270 | 107.270 | 107.270 | 120.421 | 149.777 | 105.902 |
| 19 | 168.373 | 119.608 | 111.848 | 111.848 | 111.848 | 111.848 | 111.848 | 125.560 | 156.168 | 110.422 |
| 20 | 170.005 | 120.768 | 112.932 | 112.932 | 112.932 | 112.932 | 112.932 | 126.778 | 157.683 | 111.493 |
The following observations can be made from Tables 3–8.
From Tables 3–5 it is observed that the percent relative biases of the suggested estimators lie between about 9% and 10% in Table 5, slightly above 5% in Table 3 and slightly above 8% in Table 4.
From Table 6 it is clear that, for a 40% sampled population with a response rate of 75%, the percent relative efficiencies of the suggested estimators range from 160.517% to 164.396% with respect to the mean method of imputation; from 139.285% to 142.651% with respect to the ratio method of imputation; from 136.184% to 139.475% with respect to the regression method of imputation and the Singh and Horn (2000), Singh and Deo (2003), Singh (2009) and Gira (2015) estimators; and from 100.541% to 102.971%, 123.001% to 125.973% and 109.418% to 112.062% with respect to the three Singh et al. (2016) estimators, respectively.
From Table 7 it is clear that, for a 20% sampled population with a response rate of 80%, the percent relative efficiencies of the suggested estimators range from 386.906% to 388.045% with respect to the mean method of imputation; from 304.978% to 305.876% with respect to the ratio method of imputation; from 303.409% to 304.303% with respect to the regression method of imputation and the Singh and Horn (2000), Singh and Deo (2003), Singh (2009) and Gira (2015) estimators; and from 121.488% to 121.846%, 126.175% to 126.547% and 122.643% to 123.004% with respect to the three Singh et al. (2016) estimators, respectively.
From Table 8 it is seen that, for a 14% sampled population with a response rate of about 30%, the percent relative efficiencies of the suggested estimators range from 161.481% to 171.911% with respect to the mean method of imputation; from 114.713% to 122.122% with respect to the ratio method of imputation; from 107.270% to 114.198% with respect to the regression method of imputation and the Singh and Horn (2000), Singh and Deo (2003), Singh (2009) and Gira (2015) estimators; and from 120.421% to 128.199%, 149.777% to 159.450% and 105.902% to 112.742% with respect to the three Singh et al. (2016) estimators, respectively.
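The sampling fractions and response rates quoted in these observations follow directly from the sizes of the three data sets. The short sketch below recomputes them, assuming the three numbers listed with each data set are the population size, sample size and number of respondents, in that order; the dictionary layout is illustrative.

```python
# (population size N, sample size n, number of respondents r) per data set.
data_sets = {
    "Data 1": (10, 4, 3),
    "Data 2": (923, 180, 144),
    "Data 3": (3059, 428, 128),
}

rates = {}
for name, (N, n, r) in data_sets.items():
    sampling_fraction = 100.0 * n / N   # percent of the population sampled
    response_rate = 100.0 * r / n       # percent of sampled units responding
    rates[name] = (round(sampling_fraction, 1), round(response_rate, 1))
```

This gives 40.0% and 75.0% for Data 1, 19.5% and 80.0% for Data 2, and 14.0% and 29.9% for Data 3.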
Conclusions
In this work, ratio exponential type estimators with imputation have been suggested for estimating the population mean in sample surveys using known values of some population parameters. From Tables 3–5 it is observed that the percent relative biases of the suggested estimators are less than 10%. The performance of the suggested estimators is well supported by the simulation studies presented in Tables 6–8. We also carried out simulation studies for several more populations and found that the percent relative biases of the suggested estimators vary between 11% (minimum) and 50% (maximum). Based on the above analysis (Section 6), it is shown that the suggested estimators are more efficient than the mean method of imputation, ratio method of imputation, regression method of imputation and the estimators given by Singh and Horn (2000), Singh and Deo (2003), Singh (2009), Gira (2015) and Singh et al. (2016).
Hence, the use of known values of auxiliary variable parameters through the suggested estimators is highly beneficial, and the estimators may be recommended to survey statisticians for further use.
Acknowledgments
The author is thankful to the anonymous referee for valuable suggestions.
References
1. Ahmed, M. S., Al-Titi, O., Al-Rawi, Z., & Abu-Dayyeh, W. (2006). Estimation of a population mean using different imputation methods. Statistics in Transition, 7(6), 1247-1264.
2. Cochran, W. G. (1977). Sampling Techniques, 3rd edn, Wiley and Sons.
3. Diana, G., & Perri, P. F. (2010). Improved estimators of the population mean for missing data. Communications in Statistics – Theory and Methods, 39, 3245-3251.
4. Gira, A. A. (2015). Estimation of population mean with a new imputation methods. Applied Mathematical Sciences, 9(34), 1663-1672.
5. Heitjan, D. F., & Basu, S. (1996). Distinguishing ‘missing at random’ and ‘missing completely at random’. The American Statistician, 50, 207-213.
6. Kadilar, C., & Cingi, H. (2008). Estimators for the population mean in the case of missing data. Communications in Statistics – Theory and Methods, 37, 2226-2236.
7. Kalton, G., Kasprzyk, D., & Santos, R. (1981). Issues of non-response and imputation of income and program participation. Current Topics in Survey Sampling (Krewski, D., Platek, R., & Rao, J. N. K., eds), Academic Press, New York, 455-480.
8. Koyuncu, N., & Kadilar, C. (2009). Efficient estimators for the population mean. Hacettepe Journal of Mathematics and Statistics, 38(2), 217-225.
9. Lohr, S. L. (1999). Sampling: Design and Analysis. Duxbury Press, Kentucky.
10. Prasad, S. (2016). A note on imputing auxiliary variable for missing values in sample surveys. 2nd Research Summit on CEEE2016, NIT AP, ISBN: 978-93-85777-69-1.
11. Rubin, D. B. (1987). Multiple imputation for non-response in surveys. John Wiley, New York.
12. Sande, I. G. (1979). A personal view of hot-deck imputation procedures. Survey Methodology, Statistics Canada, 238-247.
13. Singh, G. N., Maurya, S., Khetan, M., & Kadilar, C. (2016). Some imputation methods for missing data in sample surveys. Hacettepe Journal of Mathematics and Statistics, 45(6), 1865-1880.
14. Singh, G. N., Priyanka, K., Kim, J. M., & Singh, S. (2010). Estimation of population mean using imputation techniques in sample surveys. Journal of Korean Statistical Society, 39(1), 67-74.
15. Singh, S., & Deo, B. (2003). Imputation by power transformation. Statistical Papers, 44, 555-579.
16. Singh, S. (2009). A new method of imputation in survey sampling. Statistics, 43, 499-511.
17. Toutenburg, H., & Srivastava, V. K. (2008). Amputation versus imputation of missing values through ratio method in sample surveys. Statistical Papers, 49, 237-247.