Abstract
The Poisson’s binomial (PB) is the probability distribution of the number of successes in independent but not necessarily identically distributed binary trials. The independent non-identically distributed case emerges naturally in the field of item response theory, where answers to a set of binary items are conditionally independent given the level of ability, but with different probabilities of success. In many applications, the number of successes represents the score obtained by individuals, and the compound binomial (CB) distribution has been used to obtain score probabilities. It is shown here that the PB and the CB distributions lead to equivalent probabilities. Furthermore, one of the proposed algorithms to calculate the PB probabilities coincides exactly with the well-known Lord and Wingersky (LW) algorithm for CBs. Surprisingly, we could not find any reference in the psychometric literature pointing to this equivalence. In a simulation study, different methods to calculate the PB distribution are compared with the LW algorithm. Providing an exact alternative to the traditional LW approximation for obtaining score distributions is a contribution to the field.
Introduction
Lord (1980) pointed out that using item response theory (IRT), the frequency distribution of test scores,
Wang (1993) presented an explicit form for the distribution of the number of successes in independent but not necessarily identically distributed binary trials, the Poisson’s binomial (PB) distribution, and studied many of its properties. Wang pointed out that this distribution has played an important role in probability theory and that it dates back at least to Poisson (1837). Direct calculation of probabilities using the PB distribution suffers from computational problems similar to those of the compound binomial (CB) distribution. Nevertheless, efficient algorithms are available for the estimation of the PB distribution. To our knowledge, neither the PB distribution nor the algorithms to obtain the score probabilities appear to have been referenced in the psychometric literature.
First, we show that the CB and PB distributions lead to the same probabilities of the number of successes in independent binary trials. Second, we conduct a simulation study to evaluate the performance of these algorithms. Third, the well-known Lord and Wingersky (1984) algorithm is shown to be equivalent to one of the methods used for the estimation of the PB distribution.
Theoretical Models
PB
Let
If
Definition: Let
and the corresponding cumulative distribution function (CDF)
To use this distribution in practice, consider
To calculate the probability that two successes are obtained, we use Equation 1 to obtain
where $q_i = 1 - p_i$. Using similar calculations, the probabilities of obtaining 0, 1, and 3 successes are
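As a numerical illustration of the direct calculation, the following minimal sketch sums, for each number of successes k, the products of success and failure probabilities over all subsets of k trials. The function name `pb_pmf_direct` and the three success probabilities are hypothetical values chosen for illustration, not taken from the article.

```python
from itertools import combinations
from math import prod


def pb_pmf_direct(p):
    """PMF of the number of successes, by brute-force enumeration of subsets."""
    n = len(p)
    pmf = []
    for k in range(n + 1):
        total = 0.0
        # Each subset of k trial indices is one way to obtain exactly k successes.
        for subset in combinations(range(n), k):
            chosen = set(subset)
            total += prod(p[i] if i in chosen else 1.0 - p[i] for i in range(n))
        pmf.append(total)
    return pmf


# Hypothetical success probabilities for three independent trials.
p_example = [0.2, 0.5, 0.9]
pmf_example = pb_pmf_direct(p_example)
```

The number of subsets grows exponentially with the number of trials, which is the computational problem that motivates the algorithms reviewed in the next section.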
Algorithms for the Calculation of Probabilities
Both approximate and exact alternatives to direct calculation have been proposed for efficiently obtaining the distribution function of the PB model, and they are reviewed here. A well-known approximation used in general statistics is the normal approximation (NA), which is based on the central limit theorem and approximates the CDF of the PB distribution by
where
Not surprisingly, this approximation has also been used in psychometrics for the CB (e.g., Lord & Novick, 1968), which will later be shown to be equivalent to the PB.
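The NA can be sketched in a few lines. The version below assumes the continuity-corrected form, with mean equal to the sum of the success probabilities and variance equal to the sum of p(1 - p); the continuity correction of 0.5 and the function names are assumptions of this sketch rather than details confirmed by the text above.

```python
from math import erf, sqrt


def std_normal_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def pb_cdf_normal(k, p):
    """Normal approximation to P(X <= k) for success probabilities p."""
    mu = sum(p)
    sigma = sqrt(sum(pj * (1.0 - pj) for pj in p))
    # The +0.5 continuity correction is an assumption of this sketch.
    return std_normal_cdf((k + 0.5 - mu) / sigma)
```

For example, `pb_cdf_normal(9, [0.5] * 20)` is close to the exact binomial value of about 0.412.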
An improved version of the NA was described by Volkova (1996; see also Neammanee, 2005) and is known as the refined normal approximation (RNA). Compared with the NA, it adds a correction for the skewness of the distribution of X. Under this method, the CDF of the PB is approximated by
where
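A minimal sketch of the RNA follows. It assumes the form given in Hong (2013), in which the normal CDF is adjusted by a skewness term and the result is truncated to the interval [0, 1]; the exact expression used in the article's equation is not reproduced here, so the formula in the code should be read as an assumption.

```python
from math import erf, exp, pi, sqrt


def pb_cdf_rna(k, p):
    """Refined normal approximation to P(X <= k), assuming Hong's (2013) form."""
    mu = sum(p)
    sigma = sqrt(sum(pj * (1.0 - pj) for pj in p))
    # Skewness of X: third central moment divided by sigma cubed.
    gamma = sum(pj * (1.0 - pj) * (1.0 - 2.0 * pj) for pj in p) / sigma ** 3
    x = (k + 0.5 - mu) / sigma
    big_phi = 0.5 * (1.0 + erf(x / sqrt(2.0)))     # standard normal CDF
    small_phi = exp(-0.5 * x * x) / sqrt(2.0 * pi)  # standard normal density
    g = big_phi + gamma * (1.0 - x * x) * small_phi / 6.0
    # Truncate so that the result is a valid probability.
    return min(1.0, max(0.0, g))
```

When all success probabilities equal 0.5 the skewness is zero, and the sketch reduces to the plain normal approximation.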
Another approximate method, referred to here as the Poisson approximation (PA), is based on a famous inequality established by Le Cam (Le Cam, 1960; Steele, 1994) and uses the Poisson distribution, with
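The PA can be sketched as the CDF of a Poisson variable whose mean is the sum of the success probabilities. That choice of mean is the standard one in Le Cam's setting and is treated here as an assumption, since the defining expression is not reproduced above; the function name is hypothetical.

```python
from math import exp


def pb_cdf_poisson(k, p):
    """Poisson approximation to P(X <= k), with Poisson mean sum(p)."""
    mu = sum(p)
    term = exp(-mu)  # Poisson probability of zero successes
    cdf = term
    for j in range(1, k + 1):
        term *= mu / j  # builds mu**j * exp(-mu) / j! iteratively
        cdf += term
    return cdf
```

Le Cam's inequality bounds the total absolute difference between the exact and Poisson probabilities by twice the sum of the squared success probabilities, so the approximation is expected to be accurate mainly when every success probability is small.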
Among the exact methods, one algorithm for computing the distribution function of the PB was proposed by Fernandez and Williams (2010), who used polynomial interpolation and the Discrete Fourier Transform (DFT) to derive closed-form formulas for the PB’s probability distribution function and CDF. Later, Hong (2013) derived the same closed-form expressions in a simpler way. The method is based on the application of the DFT to the characteristic function of the PB distribution, and it is accordingly called the DFT-CF method. Using this method, the CDF of the PB distribution can be obtained through
where
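The idea behind the DFT-CF method can be sketched as follows: evaluate the characteristic function of X at n + 1 equally spaced frequencies and invert with a discrete Fourier transform. The sketch below uses a plain O(n²) inverse transform from the standard library instead of a fast Fourier transform, which is a simplification made here for self-containment; the function name is hypothetical.

```python
import cmath
from math import pi


def pb_pmf_dft_cf(p):
    """PMF of the number of successes via the DFT of the characteristic function."""
    n = len(p)
    omega = 2.0 * pi / (n + 1)
    # Characteristic function of X evaluated at omega * l, for l = 0, ..., n.
    cf = []
    for l in range(n + 1):
        z = cmath.exp(1j * omega * l)
        value = 1.0 + 0.0j
        for pj in p:
            value *= (1.0 - pj) + pj * z
        cf.append(value)
    # Inverse DFT recovers P(X = k); the imaginary parts vanish up to rounding.
    pmf = []
    for k in range(n + 1):
        s = sum(cf[l] * cmath.exp(-1j * omega * l * k) for l in range(n + 1))
        pmf.append((s / (n + 1)).real)
    return pmf


def pb_cdf_dft_cf(k, p):
    """CDF obtained by accumulating the PMF up to k."""
    return sum(pb_pmf_dft_cf(p)[: k + 1])
```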
Other exact methods are recursive and originated as approximate expressions for the calculation of the PB’s CDF. In an early work, Walsh (1955) proposed a method based on power expansions of
The probability generating function of a random variable
It follows that one can extract the values
so that
with the conditions
where
which coincide with the probabilities calculated using Equation 1. This strategy is also used in Lord (1980, Section 4.1). Next, we will show that the RF actually corresponds to the Lord and Wingersky (1984) recursive algorithm.
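The generating-function strategy can be illustrated by multiplying the factors of the probability generating function one trial at a time and reading off the coefficients. This is a minimal sketch of a trial-by-trial recursion; whether it matches the RF of the equations above term by term is an assumption, and the function name is hypothetical.

```python
def pb_pmf_recursive(p):
    """PMF of the number of successes, built up one trial at a time."""
    pmf = [1.0]  # with zero trials, zero successes has probability 1
    for pj in p:
        new = [0.0] * (len(pmf) + 1)
        for k, prob in enumerate(pmf):
            new[k] += prob * (1.0 - pj)  # trial fails: count stays at k
            new[k + 1] += prob * pj      # trial succeeds: count moves to k + 1
        pmf = new
    return pmf
```

Each trial costs one pass over the current probabilities, so the total work grows quadratically with the number of trials rather than exponentially.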
IRT Models
Let
where
Using Equation 9, for a given ability, the probabilities of earning each of the
respectively. As was the case for
As an alternative to the direct calculation of score probabilities, the CB distribution has traditionally been obtained using a recursion formula given by Lord and Wingersky (1984). The formula reads as follows
where
Note, (a) in both the PB and CB distributions, the number of elements in the sets
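A minimal sketch of the LW recursion for a conditional score distribution is given below. It assumes the two-parameter logistic model named in the simulation section, without a scaling constant; the item parameters `a` (discrimination) and `b` (difficulty) and the function names are hypothetical.

```python
from math import exp


def p_2pl(theta, a, b):
    """Probability of a correct answer under an assumed 2PL model."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def lw_score_dist(theta, items):
    """Score distribution at ability theta; items is a list of (a, b) pairs."""
    dist = [1.0]  # before any item is added, the score is 0 with probability 1
    for a, b in items:
        p = p_2pl(theta, a, b)
        new = [0.0] * (len(dist) + 1)
        for x in range(len(new)):
            # Score x arises from score x with a wrong answer,
            # or from score x - 1 with a correct answer.
            stay = dist[x] * (1.0 - p) if x < len(dist) else 0.0
            up = dist[x - 1] * p if x >= 1 else 0.0
            new[x] = stay + up
        dist = new
    return dist
```

For instance, `lw_score_dist(0.0, [(1.0, 0.0), (1.0, 0.0)])` gives probabilities 0.25, 0.5, and 0.25 for scores 0, 1, and 2, because each item is answered correctly with probability 0.5 at that ability.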
Simulation Example
In this simulation example, the R (R Core Development Team, 2014) package poibin (Hong, 2013) was used to obtain the CDFs for the PB distribution using all the methods described in the previous section (i.e., NA, RNA, PA, DFT-CF), in addition to the LW recursive algorithm. The test scores were simulated under the two-parameter logistic IRT model. Individuals were sampled from a
where
In many practical applications, the marginal score distribution
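One way to obtain a marginal score distribution is to average conditional score distributions over a grid of ability values. The sketch below assumes a standard normal ability distribution, an evenly spaced grid with normalized weights, and a 2PL model; all three are assumptions made for illustration, as are the function names and grid settings.

```python
from math import exp


def conditional_score_dist(theta, items):
    """Conditional score distribution under an assumed 2PL model."""
    dist = [1.0]
    for a, b in items:
        p = 1.0 / (1.0 + exp(-a * (theta - b)))
        new = [0.0] * (len(dist) + 1)
        for x, prob in enumerate(dist):
            new[x] += prob * (1.0 - p)
            new[x + 1] += prob * p
        dist = new
    return dist


def marginal_score_dist(items, n_points=61, lo=-6.0, hi=6.0):
    """Average conditional distributions over an assumed N(0, 1) ability grid."""
    step = (hi - lo) / (n_points - 1)
    nodes = [lo + i * step for i in range(n_points)]
    weights = [exp(-0.5 * t * t) for t in nodes]  # unnormalized normal density
    total = sum(weights)
    marginal = [0.0] * (len(items) + 1)
    for t, w in zip(nodes, weights):
        for x, prob in enumerate(conditional_score_dist(t, items)):
            marginal[x] += (w / total) * prob
    return marginal
```

Any of the conditional methods reviewed earlier could replace `conditional_score_dist` inside the averaging loop, which is how approximate and exact methods can be compared at the marginal level.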
Results
Table 1 shows the mean absolute error for each combination of the factors in the simulation for the marginal test score distribution
Table 1. Mean Absolute Error for the Estimation of Marginal Distributions for Different Test Lengths.
Note. LW = Lord and Wingersky; NA = normal approximation; RNA = refined normal approximation; PA = Poisson approximation.
Among the approximation methods, PA does not perform well in most cases. The RNA performs better than the NA in all cases, and both methods improve with the number of items. Interestingly, when the test length increases (
Discussion
We have shown that in the distribution of the number of successes in independent binary trials, the PB distribution is equivalent to the CB distribution seen in psychometrics. The LW recursive algorithm for CBs was shown to be equivalent to an RF derived for the PB distribution. Alternative methods, both approximate and exact, were introduced for the calculation of PB probabilities and evaluated in the IRT framework.
We used four different approaches for the estimation of the test score distributions. Some approximation methods were competitive in the cases where the score distributions were marginalized over
For the case of conditional distributions, approximate methods perform worse at extreme ability values than at average ability values. This could have consequences for equating methods that are based on conditional score distribution functions, such as local equating. Because a family of equating functions is defined by different values of
An advantage of having a compact and exact mathematical definition for the PB distribution is that random variables can be generated directly from this model. This could help in making fairer simulation studies, for instance, when comparing equating methods, as the advantages and disadvantages of using one or another method to simulate score data are not always clear (Sinharay, Holland, & von Davier, 2011). This topic is currently being investigated by the authors. Another potential benefit of knowing the exact mathematical definition of the PB distribution could arise in multistage testing if the conditional test scores of the individuals are used at each stage of testing (Haberman & von Davier, 2014). In this article, local independence is assumed in the specification of IRT models. The psychometric literature, however, has questioned the independence of items in IRT models (Gibbons & Hedeker, 1992; Wainer & Kiely, 1987). Exploring the score distribution in the non-independent, non-identically distributed case is an interesting topic for future research.
Recent work improving the described algorithms for the PB and showing new applications can be found in Barrett and Gray (2014) and, in the psychometric field, in Cai (2014). How the presented algorithms would extend to the case of polytomous item data (e.g., Thissen et al., 1995) is also a topic for future research.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Jorge González was partially funded by Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT) Grant 1150233. The research in this article by Marie Wiberg was funded by the Swedish Research Council Grant 2014-578.
References
