Abstract
Poisson and negative binomial regression procedures have proliferated and are now available in virtually all statistical packages. Alongside the regression procedures themselves are procedures for addressing the over-dispersion and excessive zeros commonly observed in count data. These approaches, zero-inflated Poisson and zero-inflated negative binomial models, use logit or probit models for the “excess” zeros and count regression models for the counted data. Although these models are often appropriate on statistical grounds, their interpretation may prove substantively difficult. This article explores this dilemma, using data from a study of individuals released from facilities maintained by the Massachusetts Department of Correction.
Introduction
Ordinary least squares (OLS) regression has served the social sciences well. But research by econometricians, biostatisticians, and others over the past few decades has identified numerous issues associated with violations of distributional and other assumptions underpinning the OLS model. Some of these can be managed, but others cannot. The latter arise in analyses of binary and multinomial categorical data, ordinal data, and data that are “counts,” which consist only of positive integers and zeros. Variables that take only these values abound in every domain of scientific inquiry and are most commonly associated with the Poisson distribution, the support of which is the set of integers greater than or equal to zero.
While the Poisson distribution addresses many of the issues associated with the use of OLS and is the preferred model for these types of variables, it is not without problems of its own and is in a number of ways quite restrictive. One of the central features of the Poisson distribution is that it includes only one parameter, λ, which captures both centrality and dispersion, such that the mean and variance are assumed to be equal. In a regression framework, this means that the conditional mean and variance must be equal. This creates numerous problems for many social science applications. First, the count variables observed in most such research projects rarely meet the criterion for equality of mean and variance, but instead are typically “overdispersed,” in that the variance exceeds the mean, due to the heterogeneity of the sample from which they are derived. An alternative distribution, the negative binomial (NB), is better able to manage this problem, because it features two parameters, one for the mean and a second, dispersion parameter that allows the variance to exceed the mean. As a result, regression models based on the negative binomial distribution have become the favored approach for multivariate analysis of count variables in the social sciences. Most statistical packages now include both Poisson and negative binomial regression (PR and NBR, respectively) procedures either as separate algorithms (as in R and Stata) or as part of a global set of algorithms under a Generalized Linear Model rubric (e.g., SPSS).
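The contrast between the two distributions can be checked numerically. The following is a minimal sketch, assuming the common “NB2” parameterization in which the variance equals μ + αμ²; the values of mu and alpha are illustrative and are not estimates from the study.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for a Poisson distribution with mean lam (> 0)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def nb2_pmf(k: int, mu: float, alpha: float) -> float:
    """P(Y = k) for an NB2 distribution with mean mu and dispersion alpha (> 0)."""
    r = 1.0 / alpha  # "size" parameter implied by the dispersion
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))
    return math.exp(log_p)

def mean_and_variance(pmf, upper: int = 2000):
    """Sum a pmf over 0..upper to approximate its mean and variance."""
    probs = [pmf(k) for k in range(upper + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var

# Illustrative values only (assumptions, not estimates from the article).
mu, alpha = 1.17, 1.0

pois_mean, pois_var = mean_and_variance(lambda k: poisson_pmf(k, mu))
nb_mean, nb_var = mean_and_variance(lambda k: nb2_pmf(k, mu, alpha))

print(f"Poisson: mean={pois_mean:.4f}, variance={pois_var:.4f}")
print(f"NB2:     mean={nb_mean:.4f}, variance={nb_var:.4f}")
```

Under these assumptions the Poisson variance equals its mean, whereas the NB2 variance exceeds the mean by αμ²; as alpha approaches zero the NB2 distribution approaches the Poisson.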
But the difficulties of modeling count variables do not end with over-dispersion. Many variables’ distributions include “excess” zeros, or are “zero-inflated” (ZI). Often the proportion of cases with values of zero on an outcome measure greatly exceeds what would be expected in either a Poisson or negative binomial distribution. While in some cases this could be due to the fact that the count for some subset of individuals in a study’s sample simply “happens” to be zero and generated by the same processes as non-zero values in the distribution being modeled, it is assumed that the zeros observed in ZI models may, in fact, be generated by two different processes. Ridout, Demetrio, and Hinde (1998) offered the example of the number of roots produced by a plant cutting. Plants that fail to propagate will have no roots, and this condition ipso facto produces a value of zero. Other plants may propagate but have no roots. However, the latter do not share the attributes of those that are “structurally” prevented from rooting because of failed propagation, and their lack of roots can be assumed to be a function of the processes generating values other than zero (termed sampling zeros). A similar issue arose in data on the number of fish caught by a sample of campers. A value of zero could result from either of two processes—lack of angling skills by those who chose to fish or having chosen not to fish at all. Both may yield a value of zero for “fish caught” but for different substantive reasons (University of California, Los Angeles [UCLA] Statistical Consulting Center, 2011).
These issues are easily managed using zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models. These approaches use “mixture models,” which incorporate a binary component that models the probability that an individual’s value on the dependent variable is a structural zero, and another that analyzes the count model itself, where individuals may have any value on the outcome measure, including zero (i.e., a sampling zero) and where the zeros are assumed to be the result of the same processes that yield other values. In most models, the binary component is fit using a logit or probit model whereas the count component (including the zeros) is modeled using a PR or NBR (Long, 1997).
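The mixture structure described above can be written out directly. The sketch below assumes the simplest case, a zero-inflated Poisson with no covariates, where pi is the probability of a structural zero and lam is the Poisson mean; both values are hypothetical.

```python
import math

def zip_pmf(k: int, pi: float, lam: float) -> float:
    """P(Y = k) under a zero-inflated Poisson mixture.

    pi  : probability that an observation is a structural zero
    lam : mean of the Poisson count process for everyone else
    """
    poisson = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    if k == 0:
        # A zero can come from either process.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

def zero_shares(pi: float, lam: float):
    """Split P(Y = 0) into its structural and sampling components."""
    structural = pi
    sampling = (1.0 - pi) * math.exp(-lam)
    return structural, sampling

# Hypothetical parameter values, chosen only for illustration.
pi, lam = 0.30, 1.5

structural, sampling = zero_shares(pi, lam)
total_prob = sum(zip_pmf(k, pi, lam) for k in range(200))

print(f"P(structural zero) = {structural:.4f}")
print(f"P(sampling zero)   = {sampling:.4f}")
print(f"P(Y = 0)           = {zip_pmf(0, pi, lam):.4f}")
```

In a fitted ZIP or ZINB model, pi and lam would each depend on covariates (through a logit or probit link and a log link, respectively); the decomposition of the zeros is the same.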
Beginning in the 1990s, recognition of the problems attending PR and NBR led to the development of a number of diagnostics that could be used to determine when NB or ZI models were needed. These included a test for over-dispersion (Cameron & Trivedi, 1998) and a test for whether a zero-inflated version of a PR or NBR should be considered (Vuong, 1989). (It should be noted that the Vuong test has itself been the focus of recent criticism centering on the conceptualization of “non-nested models” underlying the test; see Wilson, 2015.) The Vuong test, as well as visual inspection of a dependent variable’s distribution, offers guidance as to when a ZIP or ZINB model is an improvement over a basic Poisson or NBR model.
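For readers unfamiliar with the mechanics of the Vuong test, the following sketch computes the basic, uncorrected statistic from each model’s per-observation log-likelihood contributions. The function name and the input values are hypothetical; statistical packages may apply additional corrections, so this is not a reproduction of any particular package’s output.

```python
import math
import statistics

def vuong_statistic(loglik_a, loglik_b):
    """Uncorrected Vuong z statistic comparing model A with model B.

    loglik_a, loglik_b : per-observation log-likelihood contributions
    from the two fitted models, in the same observation order.
    Large positive values favor A; large negative values favor B.
    """
    if len(loglik_a) != len(loglik_b):
        raise ValueError("both models must cover the same observations")
    m = [a - b for a, b in zip(loglik_a, loglik_b)]
    sd = statistics.pstdev(m)  # population SD of the pointwise differences
    if sd == 0:
        raise ValueError("pointwise differences have zero variance")
    return math.sqrt(len(m)) * statistics.mean(m) / sd

# Hypothetical pointwise log-likelihoods for six observations.
ll_zinb = [-0.6, -1.2, -0.7, -2.1, -0.5, -1.9]
ll_nbr = [-0.8, -1.1, -0.9, -2.4, -0.7, -2.0]

v = vuong_statistic(ll_zinb, ll_nbr)
print(f"Vuong z = {v:.3f}")
```

The statistic is compared with the standard normal distribution. Swapping the two models reverses its sign.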
There are, however, dissenting voices regarding the need for ZINB and ZIP approaches even in the presence of excess zeros (cf. Allison, 2012; Xie, Tao, McHugo, & Drake, 2013). For example, in a blog posting titled “Do We Really Need Zero-Inflated Models?” Allison (2012) questioned on statistical grounds whether such models are needed and argued that straightforward PR and NBR models are adequate for handling most situations where excess zeros are encountered. In deciding whether to use a zero-inflated count model, it is therefore important to determine whether any attribute that will be available in the model can detect “true” structural zeros and, moreover, how significant results in the “inflation” portion of the model can be substantively interpreted. Allison noted further that, in his experience, log likelihood comparisons of ZINB and standard NB models typically yield trivial differences.
A Criminal Justice Example
We illustrate the dilemma surrounding how best to manage excess zeros using data from a study of recidivism among individuals who were “open mental health cases” while incarcerated in facilities operated by the Massachusetts Department of Correction. Each cohort member was followed for 2 years following his or her release. Data on personal characteristics, criminal history, and recidivism were obtained from multiple state agencies (see Hartwell et al., 2013). The study was reviewed and approved by university and public agency human subjects review committees.
One of the factors examined in this study was the number of times an individual was rearrested during the 2-year observation period. Because this outcome is a count, a count regression model was the appropriate choice for examining it. The distribution of rearrests is shown in Figure 1. Two features of this distribution guided the modeling approach. First, the variance (2.90) was roughly 2.5 times the mean (1.17), indicating that a Poisson model would be inappropriate and an NBR would be preferred. Second, the number of zeros is much larger than what would be expected in a negative binomial distribution. Such a distribution should have a smooth concave appearance, which is not the case here; indeed, observations with a value of zero outnumber those for all other values combined, suggesting a clear case of zero inflation. Analyses were carried out using Stata v. 14 (Stata Corp.).

Figure 1. Distribution of total arrests.
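A rough, back-of-the-envelope check on the share of zeros can be made from the summary figures alone. The sketch below uses the reported mean and variance to obtain a method-of-moments dispersion value and compares the zero probability implied by Poisson and NB2 distributions with the observed share of zeros given in the note to Table 1. It is an illustration under simplifying assumptions (no covariates, NB2 parameterization, moment matching rather than maximum likelihood), not a substitute for the fitted models.

```python
import math

# Summary figures reported in the article.
mean, variance = 1.17, 2.90   # mean and variance of rearrest counts
n_total, n_zero = 1349, 714   # sample size and number of zero counts

# ASSUMPTION: NB2 variance function, var = mu + alpha * mu**2,
# with alpha recovered by matching moments rather than by estimation.
alpha = (variance - mean) / mean ** 2

p0_poisson = math.exp(-mean)                     # Poisson P(Y = 0)
p0_nb = (1.0 + alpha * mean) ** (-1.0 / alpha)   # NB2 P(Y = 0)
p0_observed = n_zero / n_total

print(f"moment-based alpha   = {alpha:.3f}")
print(f"P(0), Poisson        = {p0_poisson:.3f}")
print(f"P(0), NB2 by moments = {p0_nb:.3f}")
print(f"observed share of 0s = {p0_observed:.3f}")
```

Because these are marginal quantities, they ignore the covariates that enter the regression models reported below and should be read only as a first look at how far the observed zeros depart from each distribution.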
Results
The results of two analyses are shown. The first, shown in Table 1, is the outcome of the ZINB analysis. The first set of incident rate ratios (IRRs), with their associated standard errors, 95% confidence intervals, and p values, represents the NBR results. The Vuong test was significant, indicating that a ZINB model was statistically superior to an NBR, and the alpha and log alpha tests, which were also significant, confirmed the appropriateness of a negative binomial analysis compared with a PR.
Table 1. Zero-Inflated Negative Binomial Regression for Total Post-Release Arrests (N = 1,349).
Note. Nonzero observations = 635; zero observations = 714; inflation model = logit; LR χ2(8) = 24.37; Pr > χ2 = .0020; log likelihood = −1,954.606; likelihood-ratio test of α = 0: chibar2(01) = 62.81; Pr ≥ chibar2 = .0000; Vuong test of ZINB versus standard negative binomial: z = 3.49; Pr > z = .0002. IRR = incident rate ratio; CI = confidence interval; ZINB = zero-inflated negative binomial.
As indicated, African American releasees had significantly higher IRRs compared with Whites, while age, as well as the major offense categories, provided various degrees of protection against multiple arrests. The “inflation model” used logistic regression to examine the same set of variables, since no variable or subset of variables in the data set could be hypothesized to be associated with cases having clearly structural zeros as they might in the studies cited above (e.g., “rootlessness” or “choosing not to fish”). One variable, having a juvenile record, had a significant effect on “not having a value of zero,” which is consistent with previous analyses showing that a juvenile record was a strong predictor of a post-release arrest in this population (Fisher et al., 2014).
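For reference, the two parts of a zero-inflated model are read on different scales. The sketch below shows the usual conversions, assuming a log link in the count part and a logit link in the inflation part; the coefficient values are hypothetical and are not taken from Table 1.

```python
import math

def irr_from_coefficient(beta: float) -> float:
    """Incident rate ratio implied by a count-model coefficient (log link)."""
    return math.exp(beta)

def percent_change_in_rate(irr: float) -> float:
    """Percentage change in the expected count per one-unit increase."""
    return (irr - 1.0) * 100.0

def structural_zero_probability(linear_predictor: float) -> float:
    """Probability of a structural zero from the logit inflation equation."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical coefficients, for illustration only.
beta_count = 0.25       # count-part coefficient on the log scale
eta_inflation = -0.40   # inflation-part linear predictor (logit scale)

irr = irr_from_coefficient(beta_count)
print(f"IRR = {irr:.3f} ({percent_change_in_rate(irr):+.1f}% in expected count)")
print(f"P(structural zero) = {structural_zero_probability(eta_inflation):.3f}")
```

An IRR above 1 indicates a higher expected count and an IRR below 1 a lower one. In the inflation equation, a negative coefficient lowers the modeled probability of being a structural zero.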
An NBR model was also fit, and its results are shown in Table 2. The findings of this model are essentially consistent with the ZINB model, but here the “Juvenile Record” factor was significant as part of the count model along with several other factors. One issue to consider in using an NBR as opposed to a ZINB approach is whether the overall fit of the models applied to these data differs significantly. In this case, the difference in log likelihood is minimal (−1,954.61 for the ZINB model vs. −1,974.59 for the NBR).
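One common way to weigh a difference in log likelihood against the extra parameters a zero-inflated model requires is through information criteria. The sketch below applies the standard AIC and BIC formulas to the log likelihoods reported in the table notes. The parameter counts are assumptions inferred from the reported degrees of freedom (eight covariates, an intercept, and a dispersion parameter in the NBR, plus eight covariates and an intercept in the logit inflation equation of the ZINB); they are not stated in the article.

```python
import math

def aic(loglik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 log L."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 log L."""
    return k * math.log(n) - 2.0 * loglik

n_obs = 1349
ll_zinb, ll_nbr = -1954.606, -1974.589   # reported log likelihoods

# ASSUMED parameter counts (not reported in the article).
k_nbr = 10    # 8 covariates + intercept + dispersion parameter
k_zinb = 19   # NBR parameters + 8 covariates + intercept in the logit part

loglik_gap = ll_zinb - ll_nbr
results = {
    "NBR": (aic(ll_nbr, k_nbr), bic(ll_nbr, k_nbr, n_obs)),
    "ZINB": (aic(ll_zinb, k_zinb), bic(ll_zinb, k_zinb, n_obs)),
}

print(f"log-likelihood gap (ZINB - NBR) = {loglik_gap:.3f}")
for name, (a, b) in results.items():
    print(f"{name}: AIC = {a:.1f}, BIC = {b:.1f}")
```

Lower values indicate a better trade-off between fit and complexity. Because the two criteria penalize additional parameters differently, they need not agree, and any conclusion drawn here depends on the assumed parameter counts.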
Table 2. Negative Binomial Regression for Total Post-Release Arrests.
Note. Number of observations = 1,349; LR χ2(8) = 61.65; Pr > χ2 = .0000; log likelihood = −1,974.589; pseudo R2 = .0154. Likelihood-ratio test of α = 0: chibar2(01) = 601.54; Pr ≥ chibar2 = .000. IRR = incident rate ratio; CI = confidence interval.
Discussion
The substantive significance of these findings has been discussed elsewhere (Fisher et al., 2014; Hartwell et al., 2013; Hartwell et al., 2016), but here we focus on analytic issues, in particular the findings relative to the ZINB model’s treatment of Juvenile Record compared with the results for this variable obtained in the NBR. In the ZINB model, Juvenile Record was seen as a predictor of having a value other than zero in the inflation model, but was not a significant predictor of sampling zeros in the count portion of the model. In the NBR, those with juvenile records were found to have higher arrest counts and incident rates than those without.
What, exactly, does this mean? If we follow Ridout and colleagues’ (1998) notion of structural versus sampling zeros, we would have to conclude, based on the findings of the inflation component of the ZINB model, that a juvenile record systematically causes one to be rearrested, that is, that it operates as a “structural non-zero.” As we have learned from previous analyses of these data, those with juvenile records were indeed more likely both to have been rearrested and to have been rearrested more quickly than those without (Fisher et al., 2014), but nothing suggests that this background carries with it a structural predisposition. The difficulty created by the ZINB analysis here is that one has to provide a substantive argument for the identification of a structural association that may not exist. Certainly, the systematic relationship between a juvenile record and future offending does not rise to the same level as the lack of fish caught by campers who chose not to fish (UCLA Statistical Consulting Center, 2011), the absence of roots on plant cuttings that fail to propagate (Ridout et al., 1998), or other examples of cases with “true” structural zeros provided in the literature on ZINB and ZIP models. However, observing an association between juvenile offending and a propensity to reoffend, and to do so frequently as an adult, is highly consistent with a range of findings from life course criminology research. Observing such a relationship, among other factors, as a feature of the heterogeneity of the sample that predicts the number of post-release arrests seems both explainable and predictable.
Conclusion
In his foreword to Long’s (1997) volume Regression Models for Categorical and Limited Dependent Variables, Richard Berk makes two salient points that are relevant to the present discussion. First, he argues that “one must be able to make the case that the statistical model maps well onto the empirical phenomena being studied” (cited in Long, 1997, p. xix). Second, he argues that the availability of easy-to-use software is “both a blessing and a curse. The blessing is that minimal computer skills are required; the curse is that minimal computer skills are required. Right answers and wrong answers are easy to obtain” (cited in Long, 1997, p. xix). A significant Vuong test, combined with the availability of ZIP and ZINB algorithms in most statistical packages, might compel some to use those procedures on statistical grounds or simply as a response to a statistical “technological imperative.” It is always important to consider, though, that the ultimate goal of any statistical analysis is to produce models that are not only statistically reasonable but also substantively interpretable.
Footnotes
Acknowledgements
The authors acknowledge the helpful comments of Jason Rydberg on earlier drafts of this article.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The data used in the analysis presented here were collected with the support of National Institute of Mental Health Grant 1RC1MH088716-01, Stephanie Hartwell, Principal Investigator.
