It is shown that for a small sample size, when maximum likelihood estimators lose their asymptotic properties, more accurate point and interval estimation of the Poisson distribution parameter requires a direct investigation of the properties of the normalized likelihood function. Alternative point and interval estimates of the distribution parameter are obtained.
The theory of the Poisson distribution is widely known and well covered in the educational literature. With a small sample size, estimates of the parameters obtained by the maximum likelihood method (MLM) can lose their unbiasedness and efficiency, which complicates the sound application of statistical methods to practical problems. In the present research we propose to investigate the likelihood surface directly in order to obtain more accurate point and interval estimates of the Poisson distribution parameter, and to calculate the expected value (EV) of the normalized likelihood function (NLF) as an alternative to solving the MLM system. The aim of this paper is to give a quantitative and qualitative description of the error of such an estimate of the Poisson-type distribution parameter for a small sample size. To illustrate the applicability and simplicity of the proposed method, one example taken from the classical literature on statistics is considered. A numerical comparison of the estimates obtained by the new method with previously known values is carried out.
Formulation of the problem
We consider a random variable (RV) having the Poisson-type distribution with a positive parameter and the probability function (PF) Eq. (1):
Then, following Patil et al. (1968, p. 14), we say that the random variable has the Poisson-type distribution with parameters , where and – are constant values. The PF of this RV has the form Eq. (2):
where .
The expected value (EV) and the variance of these random variables are:
The problem of point and interval estimation of the parameter of distribution Eq. (2) from a sample of limited size n of independent observations of the RV is important in practice.
Method of estimation
Consider n series of observations over a fixed period, in each of which a number of events was recorded. We denote the possible estimate by . The likelihood function (LF) for the results of these independent observations has the form Eq. (4):
where
In accordance with the MLM, the estimates of the parameter and of its variance are given by Eq. (6):
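As an illustration, the likelihood and the MLM estimates for the plain Poisson case can be written as below. The symbols are assumptions introduced here rather than the paper's own notation: λ is the parameter, k_i the count in the i-th series, n the number of series and m the total count. The variance estimate is taken as the inverse of the observed information, which is one common convention.

```latex
\begin{aligned}
L(\lambda) &= \prod_{i=1}^{n} \frac{\lambda^{k_i} e^{-\lambda}}{k_i!}
            = \frac{\lambda^{m} e^{-n\lambda}}{\prod_{i=1}^{n} k_i!},
\qquad m = \sum_{i=1}^{n} k_i, \\
\frac{d \ln L}{d\lambda} &= \frac{m}{\lambda} - n = 0
\;\Longrightarrow\; \hat{\lambda} = \frac{m}{n}, \\
\widehat{\operatorname{Var}}\bigl(\hat{\lambda}\bigr)
  &= \left( -\left.\frac{d^{2} \ln L}{d\lambda^{2}}\right|_{\lambda=\hat{\lambda}} \right)^{-1}
   = \frac{\hat{\lambda}^{2}}{m} = \frac{m}{n^{2}}.
\end{aligned}
```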
To normalize the LF we define a normalizing factor. Taking into account the known identity Eq. (7)
Eq. (9) can be regarded as the probability density of the estimates. It depends on the number of observations n and on the total number of occurrences of the event during the observations.
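A minimal sketch of such a density is given below. It assumes that, for the plain Poisson case, the NLF takes the gamma-type form n^(m+1) λ^m exp(-nλ) / m!, where λ is the parameter value and m the total number of events. The function names and the sample values are hypothetical. The check confirms numerically that this function integrates to one.

```python
import math

def nlf(lam, n, m):
    """Assumed normalized likelihood: n^(m+1) * lam^m * exp(-n*lam) / m!."""
    if lam < 0.0:
        return 0.0
    if lam == 0.0:
        return float(n) if m == 0 else 0.0
    log_f = (m + 1) * math.log(n) + m * math.log(lam) - n * lam - math.lgamma(m + 1)
    return math.exp(log_f)

def trapezoid(f, a, b, steps=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

n, m = 8, 10  # hypothetical sample: 8 observations, 10 events in total
area = trapezoid(lambda lam: nlf(lam, n, m), 0.0, 20.0)
print(f"area under the NLF: {area:.6f}")
```

Working with logarithms avoids overflow in the powers and factorials when n or m is large.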
Let us investigate some properties of Eq. (9). The first derivative Eq. (10) and the second derivative Eq. (11) of this function are
Equating the first and second derivatives to zero, we obtain the extremum Eq. (12) and the inflection points Eqs (13) and (14):
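For orientation, the derivation can be sketched as follows under the assumption that the NLF has the gamma-type form shown in the first line. Here λ denotes the parameter value, n the number of observations and m the total number of events; these symbols are assumed for this sketch.

```latex
\begin{aligned}
f(\lambda) &= \frac{n^{m+1}}{m!}\,\lambda^{m} e^{-n\lambda}, \qquad \lambda > 0, \\
f'(\lambda) &= f(\lambda)\left(\frac{m}{\lambda} - n\right), \\
f''(\lambda) &= f(\lambda)\,\frac{(n\lambda - m)^{2} - m}{\lambda^{2}}, \\
f'(\lambda) = 0 &\;\Longrightarrow\; \lambda_{0} = \frac{m}{n}, \\
f''(\lambda) = 0 &\;\Longrightarrow\; \lambda_{1,2} = \frac{m \mp \sqrt{m}}{n}.
\end{aligned}
```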
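It follows that on the interval between the inflection points the second derivative is negative, so the graph is convex upward there and reaches its maximum at the point of the extremum Eq. (12). On the intervals to the left and to the right of the inflection points the second derivative is positive and the graph is convex downward.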
Although the inflection points are located symmetrically about the point of maximum, the values of the NLF at these points differ, and its graph is asymmetric for a limited sample size. Indeed, if we consider the limiting case or , we have Eq. (15):
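The asymmetry can be checked numerically with a short sketch. It assumes the gamma-type form n^(m+1) λ^m exp(-nλ) / m! for the NLF, for which the mode is m/n and the inflection points are (m ± √m)/n. The function names and the sample values are hypothetical.

```python
import math

def nlf(lam, n, m):
    """Assumed normalized likelihood: n^(m+1) * lam^m * exp(-n*lam) / m!, lam > 0."""
    log_f = (m + 1) * math.log(n) + m * math.log(lam) - n * lam - math.lgamma(m + 1)
    return math.exp(log_f)

def inflection_points(n, m):
    """Roots of the second derivative of the assumed NLF."""
    root = math.sqrt(m)
    return (m - root) / n, (m + root) / n

n, m = 8, 10  # hypothetical sample: 8 observations, 10 events in total
mode = m / n
left, right = inflection_points(n, m)
f_left, f_right = nlf(left, n, m), nlf(right, n, m)
ratio = f_right / f_left

print(f"mode = {mode:.4f}, inflection points = {left:.4f}, {right:.4f}")
print(f"NLF at the inflection points: {f_left:.4f} (left), {f_right:.4f} (right)")
print(f"ratio right/left = {ratio:.4f}")
```

The two inflection points are equidistant from the mode, yet the density is higher at the right one, so the graph has a heavier right tail.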
The asymmetry of the NLF implies a difference between the mode, which corresponds to the MLM estimate, and the expected value of the distribution density of the estimates.
We now determine the expected value of the estimates and its variance, and also refine the variance of the MLM estimate Eq. (6). After a series of transformations that take Eqs (7) and (9) into account, we obtain Eqs (16)–(18):
Comparing Eq. (6) with Eq. (16) and Eq. (18) with Eq. (17), we see that the MLM estimate has a negative bias relative to the expected value, and that its refined variance exceeds the variance about the EV by the squared bias. These comparisons indicate that the EV Eq. (16) of the NLF is a preferable alternative to the MLM estimate Eq. (6) of the parameter, which is especially noticeable for a limited sample size.
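These relations can be illustrated with a small numerical sketch. It assumes the gamma-type NLF n^(m+1) λ^m exp(-nλ) / m!, for which the expected value is (m+1)/n, the variance about it is (m+1)/n², and the mean squared deviation about the MLM estimate m/n is (m+2)/n². The closed forms follow from this assumed density and are checked here by direct integration. All names and sample values are hypothetical.

```python
import math

def nlf(lam, n, m):
    """Assumed normalized likelihood: n^(m+1) * lam^m * exp(-n*lam) / m!, lam > 0."""
    log_f = (m + 1) * math.log(n) + m * math.log(lam) - n * lam - math.lgamma(m + 1)
    return math.exp(log_f)

def moment_about(center, power, n, m, upper=20.0, steps=40000):
    """Trapezoidal estimate of the integral of (lam - center)^power * nlf on (0, upper).

    The integrand vanishes at 0 for m >= 1 and is negligible at `upper`,
    so only interior nodes contribute.
    """
    h = upper / steps
    total = 0.0
    for i in range(1, steps):
        lam = i * h
        total += (lam - center) ** power * nlf(lam, n, m)
    return total * h

n, m = 8, 10  # hypothetical sample: 8 observations, 10 events in total
mle = m / n                          # mode of the NLF
ev = (m + 1) / n                     # expected value of the NLF
bias = mle - ev                      # equals -1/n
var_about_ev = (m + 1) / n ** 2
spread_about_mle = (m + 2) / n ** 2  # equals var_about_ev + bias**2

ev_num = moment_about(0.0, 1, n, m)
var_num = moment_about(ev, 2, n, m)
spread_num = moment_about(mle, 2, n, m)
print(f"MLM estimate {mle:.4f}, EV {ev:.4f}, bias {bias:.4f}")
print(f"variance about EV {var_num:.6f}, spread about MLM estimate {spread_num:.6f}")
```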
Parameter estimations
Based on the NLF as the distribution density of the estimates, it is possible to construct not only a point estimate of the parameter but also a confidence interval for a given confidence probability. For this it suffices to use the relations Eq. (19):
where . In the symmetric case, when the two quantities are assumed to be equal, their common value is easily determined from the given confidence level.
Substituting Eq. (9) into the left-hand sides of these equalities and taking into account the known identity Eq. (20)
after transformations, we obtain the equations Eq. (21) for computing the limits of the confidence interval for the parameter of Poisson-type distributions:
Since the left-hand sides of Eq. (21) correspond to the Poisson distribution function with the corresponding parameters, the confidence interval boundaries can be determined numerically from Eq. (21), by numerical integration of the NLF Eq. (9), or from tables of the Poisson distribution function.
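The numerical route can be sketched as follows. The sketch assumes that the lower limit cuts off probability alpha1 in the left tail of the NLF and the upper limit cuts off alpha2 in the right tail, and that the NLF has the gamma-type form for which these conditions reduce to Poisson distribution function values 1 - alpha1 and alpha2 at n times the limit. The function names, the bisection bracket and the sample values are hypothetical.

```python
import math

def poisson_cdf(m, mu):
    """P(K <= m) for K ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)
    total = term
    for k in range(1, m + 1):
        term *= mu / k
        total += term
    return total

def solve_mu(m, target, tol=1e-10):
    """Find mu with poisson_cdf(m, mu) == target by bisection.

    The Poisson distribution function decreases in mu, so the root is bracketed
    by 0 and a generous upper bound (a heuristic choice for this sketch).
    """
    lo, hi = 0.0, m + 10.0 * math.sqrt(m + 1.0) + 20.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(m, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nlf_interval(n, m, alpha1, alpha2):
    """Confidence limits for the parameter from the assumed NLF tail conditions."""
    lower = solve_mu(m, 1.0 - alpha1) / n
    upper = solve_mu(m, alpha2) / n
    return lower, upper

n, m = 5, 7  # hypothetical sample: 5 observations, 7 events in total
lower, upper = nlf_interval(n, m, 0.05, 0.05)
print(f"90% interval for the parameter: [{lower:.4f}, {upper:.4f}]")
```

Bisection is used because the Poisson distribution function is monotone in its parameter, so no derivative or starting guess is needed.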
Taking into account the known relation Eq. (22) between the Poisson and χ² distributions (Bolshev and Smirnov, 1983, p. 70),
where the first quantity is the probability integral and the second is the percentage point of the χ² distribution with the indicated number of degrees of freedom, defined by Eq. (23):
Thus, the confidence interval boundaries can also be found by interpolation (in terms of the number of degrees of freedom) in the tables of percentage points of the χ² distribution.
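The link to the chi-square tables can be checked numerically. The sketch below relies on the standard identity that, for K distributed as Poisson with mean μ, P(K ≤ m) equals one minus the chi-square distribution function with 2(m+1) degrees of freedom evaluated at 2μ. The chi-square distribution function is obtained by Simpson integration of its density so that the comparison is not circular. The value 9.542 is the tabulated lower 1% point for 22 degrees of freedom; the function names are hypothetical.

```python
import math

def poisson_cdf(m, mu):
    """P(K <= m) for K ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)
    total = term
    for k in range(1, m + 1):
        term *= mu / k
        total += term
    return total

def chi2_pdf(x, dof):
    """Chi-square density; this sketch assumes dof >= 2."""
    if x < 0.0:
        return 0.0
    if x == 0.0:
        return 0.5 if dof == 2 else 0.0
    half = dof / 2.0
    log_p = (half - 1.0) * math.log(x) - x / 2.0 - half * math.log(2.0) - math.lgamma(half)
    return math.exp(log_p)

def chi2_cdf(x, dof, steps=2000):
    """Simpson's rule on [0, x]; `steps` must be even."""
    h = x / steps
    total = chi2_pdf(0.0, dof) + chi2_pdf(x, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, dof)
    return total * h / 3.0

m = 10                      # total number of events
x = 9.542                   # tabulated lower 1% point for 2*(m+1) = 22 degrees of freedom
chi2_value = chi2_cdf(x, 2 * (m + 1))
poisson_value = poisson_cdf(m, x / 2.0)
print(f"chi-square distribution function at {x}: {chi2_value:.5f}")
print(f"1 - Poisson distribution function at {x / 2.0}: {1.0 - poisson_value:.5f}")
```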
We note that the estimate of the lower confidence boundary differs from the corresponding formula given in Bolshev and Smirnov (1983, p. 70). In the notation used here, the corresponding Eq. (24) would have the form
The difference arises because the latter relations are based on the general approach of the theory of interval estimation, whereas the interval estimates Eq. (24) are obtained from the NLF. The existing tables of confidence limits for the Poisson parameter need not be recalculated: to find the lower bound of the interval it is enough to apply a correction by shifting one line down in the table.
Thus, the confidence intervals determined by the relations Eq. (26) are not the narrowest. Narrower intervals for a given confidence probability can be based on the NLF Eq. (9), which leads to Eq. (24).
Example
An enterprise carries out quality control of its technical products. During the working period, 8 series of independent tests of the functioning of a technical device were performed and the following numbers of failures were recorded: . Assuming that the number of failures follows Poisson's law, it is required to construct the distribution density of the parameter estimates and to find the point estimate and the confidence interval boundaries for the failure rate of the output products with confidence probability 0.98.
In the adopted notation, the experimental results are 1, 0, 8, 10. We will assume 0.99. From Eqs (6), (18), (16), (17), (24) and (26) we obtain
According to Table 2.2a (Bolshev & Smirnov, 1983, pp. 166-167) of the percentage points of the χ² distribution, in view of Eq. (24) we have
Table 5.4a (Bolshev & Smirnov, 1983, p. 306) of confidence limits for the Poisson parameter gives exactly the latter boundaries [4.13, 20.14]. The left boundary 4.77 is found one line below. As a result, we obtain the confidence interval [4.77, 20.14] with confidence level 0.98.
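The quoted boundaries can be reproduced with a short sketch. It assumes that they refer to the expected total count for 10 recorded events with 0.01 cut off in each tail. The classical lower limit is then the root of the Poisson distribution function for m - 1 events and the NLF-based lower limit is the root for m events. Dividing by the number of series would give limits per series. The function names and the bisection bracket are hypothetical.

```python
import math

def poisson_cdf(m, mu):
    """P(K <= m) for K ~ Poisson(mu), summed term by term."""
    term = math.exp(-mu)
    total = term
    for k in range(1, m + 1):
        term *= mu / k
        total += term
    return total

def solve_mu(m, target, tol=1e-10):
    """Find mu with poisson_cdf(m, mu) == target by bisection (the function decreases in mu)."""
    lo, hi = 0.0, m + 10.0 * math.sqrt(m + 1.0) + 20.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(m, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

m = 10        # total number of recorded events (assumed from the quoted boundaries)
alpha = 0.01  # probability cut off in each tail, confidence level 0.98

classical_lower = solve_mu(m - 1, 1.0 - alpha)  # table value for m events
nlf_lower = solve_mu(m, 1.0 - alpha)            # one line below in the same table
upper = solve_mu(m, alpha)                      # shared upper limit

# Expected to agree with 4.13, 4.77 and 20.14 to two decimals.
print(f"classical: [{classical_lower:.2f}, {upper:.2f}]")
print(f"NLF-based: [{nlf_lower:.2f}, {upper:.2f}]")
```

The NLF-based lower limit for m events coincides with the classical lower limit for m + 1 events, which is why a one-line shift in the table is sufficient.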
Conclusions
Thus, the paper shows that for a limited sample size, when MLM estimates lose their asymptotic properties, more precise point and interval estimation of the parameter of Poisson-type distributions calls for a direct investigation of the graph and properties of the NLF. Alternative point and interval estimates of the distribution parameter are obtained. To illustrate the applicability of the proposed method, one example taken from the classical literature on statistics is considered. A numerical comparison of the estimates obtained by the new method with previously known values is carried out. The simplicity of the method makes it suitable for inclusion in statistical software packages. The estimation method described can also be extended to other distributions.
References
1. Patil, G. P., Joshi, S. W., & Rao, C. R. A Dictionary and Bibliography of Discrete Distributions. Oliver and Boyd Ltd., Edinburgh, 1968.
2. Bolshev, L. N., & Smirnov, N. V. Tables of Mathematical Statistics (in Russian). Nauka, Moscow, 1983.