In this article we present a new INteger-valued AutoRegressive (INAR) model with the aim of extracting baseline patterns of cattle fallen stock registered over an 5-year period at a local scale. We introduce HINAR as a generalization of the classical Poisson-based INAR models whose innovations follow a Hermite distribution. In order to assess trends and seasonality in these time series, we fit different models with time-dependent parameters by specifying proper functions. Using real world examples, we illustrate how to estimate parameters by maximum likelihood and validate the fitted models. We also show a detailed method to forecast. Our proposed model supposes a good solution for studying discrete time series when the counts have many zeros, low counts and moderate overdispersion. This model has been applied to the analysis of fallen cattle registered at a local scale as part of the development of a veterinary syndromic surveillance system.
The well-known autoregressive moving average (ARMA) models are suitable for continuous or discrete time series which can be approximated conveniently by Gaussian distributions. However, in many situations in which discrete time series have low counts and/or many zeros, ARMA models can be inappropriate. Many models for discrete time series have been proposed in the literature. McKenzie (2003) summarizes some of them. One of the best-known proposals is the INteger-valued AutoRegressive models (abbreviated as INAR).
A classical INAR(p) process is defined by means of the recursion
where are fixed parameters and is a sequence of independent and identically distributed non-negative integer-valued random variables with mean ensuremath E(Zt) = μz and finite variance Var(Zt) = σ2z, being also independent of for all .
In this process, the integer discreteness of the series is ensured by the binomial thinning operator or binomial subsampling, defined as
where are independent and identically distributed Bernoulli random variables with , which are also independent of for all . Steutel and van Harn (1979) introduced for the first time the binomial thinning operator and McKenzie (1985) proposed to use this operator for count time series. Comprehensive reviews on INAR models can be found in McKenzie (2003), Jung and Tremayne (2006) and Scotto et al. (2015).
INAR(1) models are those that are mostly used in practice, being extended and generalized in different ways. Some authors proposed other extensions of the classical Poisson INAR(1) model considering geometric or zero-inflated Poisson innovations (see Jazi et al., 2012a; Jazi et al., 2012b). Weiß (2008) reviewed INAR(1) models based on different kinds of thinning operators. Zheng et al. (2007) constructed INAR(1) processes with random coefficient thinning operators describing its properties and methods of estimation. Another extension is the RCINAR processes by Gomes and Canto e Castro (2009).
Some studies have been conducted concerning change-point detection in binary time series (see Fokianos et al., 2014; Hudecová, 2013). Moreover, some authors have taken into account not only change-points detections in binary time series but also in INAR models. In this sense, Hudecová et al. (2015a) and Hudecová et al. (2015b) have developed on-line procedures for detecting change-points in the parameters of INAR(1) models, proposing a statistical test based on the probability generating function of the innovations.
Different INAR(p) models () have been considered in the literature depending on the joint distribution of . Alzaid and Al-Osh (1990) assumed a conditional multinomial distribution, while Du and Li (1991) imposed conditional independence. On the other hand, Zheng et al. (2005) extended INAR(p) models considering random thinning coefficients.
It is important to remark that in the INAR(p) model of Du and Li (1991), are binomial random variables with parameters and , all of them being mutually independent. The model of Du and Li (1991) is more tractable in practice and, henceforth, it will be considered in this work.
Parameter in equation (1.1) represents the proportion of counts which emigrates from state to state , and can be seen as the innovations or new events produced at time . In this way, the counts at time consist of survivors from the state , from the state , from the state , plus the innovations.
In the classical INAR(p) process, equation (1.1), the parameters are constants, but in many real world applications these parameters could vary over time taking into account, for instance, the trend and/or seasonality of the series. The distribution of the innovations is usually assumed to be Poisson, although other alternatives have been considered in the literature (zero-inflated Poisson, weighted Poisson, negative binomial, ).
For the INAR(1) model, it is interesting to remark that only when follows a rth-Hermite distribution, which includes the Poisson () and the Hermite distributions (), will the marginal distribution of the process belong to the same family as the innovations (see Weiß and Puig, 2015).
In this work, we propose a more general model that includes the classical INAR(p) of Du and Li (1991) with Poisson innovations. We will assume that the parameters can be time-dependent and the innovations will be Hermite distributed. In the next sections, we will see that our model is a flexible extension of the classical INAR(p) model, being applicable to a broader framework including the analysis of bovine fallen stock time series.
The HINAR(p) model
Let and be two independent Poisson variables with parameters and . Hence, the probability distribution of the variable is the Hermite distribution with parameters and (see Kemp and Kemp, 1965), and its probability mass function is expressed as
for , where is the integer part of .
Following is a summary of some important properties of the Hermite distribution whose probability generating function (pgf) is .
The moment generating function exists and is equal to , and its characteristic function is .
The mean and variance are and , respectively. The third central moment is and corresponds to the skewness. Whereas the fourth moment centred around the mean is and corresponds to the kurtosis.
Given two independent Hermite random variables such as Hermite and Hermite, the sum of both are also Hermite distributed, Hermite.
A consequence of property 2 is that the dispersion index satisfies . It can be noted that when , the Hermite distribution is just a Poisson distribution, and when , it is twice a Poisson distribution. Consequently, the Hermite distribution can be useful to model data which have a moderate overdispersion. In order to facilitate the computation of the Hermite probabilities and its quantiles, the hermite package in R could be a quality tool (see Moriña et al., 2015a, 2015b).
The third property implies that the Hermite distribution is closed under addition or convolution. This will be relevant for forecasting in Section 3. In fact, Puig and Valero (2007) proved that the rth-Hermite distributions are the unique count data distributions with parameters which are closed under binomial thinning and under addition. In this regard, Kemp and Kemp (1965) described more properties in further detail.
It is said that a distribution is Poisson zero-inflated if its proportion of zeros is greater than the proportion of zeros of a Poisson distribution with the same mean. Then, a good strategy to model time series, with low counts and many zeros, is to use Poisson zero-inflated distributions for the innovations. A direct solution is to consider the so-called zero-inflated Poisson distribution (Jazi et al., 2012a), but Hermite distribution is also Poisson zero-inflated because, according to equation (2.1), .
The novel extension of the conventional INAR(p) model proposed in this study considers that the innovations follow a Hermite distribution with time-dependent parameters and . This model will be called HINAR(p) and is expressed as
In addition, the former parameters of the model are functions of time , provided that the outputs of these functions fall in the interval [0, 1]. Each function introduces a new set of parameters () that have to be estimated. The functions acting as parameters of the Hermite distribution have to take positive values. In this sense, different choices of and have been considered for the purpose of exploring different patterns of trend and seasonality. For our examples, we have it found convenient to use a second-order trigonometric polynomial with a linear part, jointly with an appropriate link function. Specifically, for the Hermite parameters , the choice has been,
Notice that captures the possible trend of the series, while the other parameters capture the seasonalities of periods and respectively. Similarly, for , we have taken a logit link function leading to the expression,
In our examples, the time is measured in weeks. These are just epidemiological weeks, also called Epi or CDC weeks. Therefore, taking and , expressions in equations (2.3) and (2.4) would consider annual and biannual patterns of seasonality. It is quite common to find these patterns of seasonality in mortalities, in both beef and dairy cattle farms.
When coefficients and are constant, some results concerning stationarity will be provided in the next section.
HINAR(p) model with constant coefficients
Let be an INAR(p) process as in equation (1.1) with . Du and Li (1991) showed that to ensure the stationarity of the process, all the roots of the characteristic polynomial should be inside the unit circle which is equivalent to the more simple condition . Du and Li (1991) also showed that the correlation structure of this process is the same as the classical AR(p) model. Accordingly, under the stationarity assumption, the marginal distribution represented by the random variable satisfies,
Considering , taking expectations and using the property of independence under conditioning of the INAR(p) model, we obtain the following expression for the pgf of the marginal distribution in terms of the pgf of the innovations :
where . Notice that for , the expression in equation (2.6) is just the expression for discrete self-decomposability, and is just the pgf of a sum of independent Bernoulli random variables with different probabilities . The expectation E(X) = μ and variance Var(X) = σ2 of the marginal distribution can also be derived from equation (2.6). Taking the first and second derivatives with respect to in equation (2.6), evaluating them at and using the fact that , and , , we find,
If indicates the dispersion index of the innovations, the dispersion index of the marginal remains,
An immediate consequence is that . Consequently, for Poisson innovations and , the marginal distribution is not Poisson distributed because it is overdispersed. Hermite innovations have mean and variance equal to and , respectively. Because , Hermite innovations produce marginal overdispersed distributions with dispersion index in the interval,
It is interesting to point out that overdispersed innovations can produce marginals having the same dispersion index. For instance, an HINAR(2) model with and Hermite innovations has a marginal distribution with dispersion index .
As we have commented before, for the marginal distribution is Poisson (Hermite), if and only if the innovations are Poisson (Hermite) distributed. However, for , it is impossible to obtain a marginal Poisson (Hermite) distributed. To show this, we first need the following lemma:
where at least, andare non-zero. Then, , andare necessary conditions forto be a pgf.
Proposition Consider an INAR(p) process, , like in (1.1). It does not exist any sequence of independent and identically distributed count innovations giving a stationary marginal Poisson (Hermite) distribution.
Proof. Suppose that the marginal distribution is Poisson (Hermite). Then in equation (2.6) is a polynomial of degree 1 (2) and is a polynomial of degree p (2p). Then, is a polynomial of degree p (2p) with a negative coefficient of the leading term. Consequently, Lemma 1 shows that is not a pgf arriving to a contradiction.
The statement of proposition can also be extended to any distribution having a polynomic log-pgf (r-Hermite distributions).
In general, for the pgf of the marginal distribution can not be obtained in a closed form, but a MA-type representation can be derived from equation (2.6). When evaluating at , and . By iterating this scheme we obtain, , where , . As tends to infinity, tends to its fixed point , being an attractor, because . Therefore, we obtain the representation
For Hermite innovations , and the log-pgf of the marginal distribution can be numerically computed from equation (2.10). In this way, the probability of zero can be directly computed. Numerical derivatives at are required to calculate the other probabilities.
Parameter estimation
Parameters are estimated by means of the maximum likelihood method. According to the Total Probability Theorem, the likelihood function can be expressed as the product of probabilities conditioned to a set of past observations. Moreover, because of the existing dependence of order , the likelihood function can be expressed as
Considering that the following probabilities , , are equal to , the likelihood function conditioned to the initial observations remains,
In order to evaluate this likelihood function we need the following proposition.
Proposition Given a HINAR(p) process like in (2.2), then
where , , , , .
Proof Let X and Y be two independent count random variables, where is the sum of X and Y. Then, according to the Total Probability Theorem, it can be showed that
On the other hand, conditioned to the last observation, is the sum of independent binomial distributions plus a Hermite distribution. That is,
By induction, using the expression of the binomial probability function and the expression in equation (2.1), the proof holds.
Due to the complexity of the likelihood function of the model in equation (2.12), there are no closed expressions for the maximum likelihood estimation (MLE) of the parameters. Nevertheless, an appropriate numerical method can be used to maximize the likelihood function and obtain the MLE. In this article, a Newton-type algorithm provided by the function nlm in em R was used.
The code of the R-program built to estimate the parameters is included in the supplementary material. Notice that the number of parameters could be very large, so it is important to find parsimonious models. In this work, the model selection is based on both statistical significance of the parameters and Akaike information criterion (AIC) (or Bayesian information criterion, BIC). That is, the best model among all possible models is the one in which all parameters are statistically significant and which also shows a small value of AIC (or BIC).
Forecasting
Two kinds of predictions will be considered: the average behaviour of the series and the k-time-ahead distribution that will be studied for , due to the nature of our examples. McCabe et al. (2011) presented a non-parametric approach to estimate the forecasting distribution of for , when the process follows a classical INAR(p) model with arbitrary innovations. They estimate the forecasting distributions using non-parametric likelihood maximum estimators (NPMLE) which are basically obtained by maximizing the likelihood function of an INAR(p) model, but considering that the innovations are distributed following some discrete parameter-free distribution called , which is also estimated. As a result, the NPMLE estimated vector is obtained, where sequence corresponds to the estimate of the parameter-free discrete distribution . It is worth mentioning that this method does not allow to consider seasonality and/or time-dependent parameters, while the method proposed in this work allows to handle both seasonality and time-dependent parameters. However, for comparative purposes, the method of McCabe et al. (2011) will be used in the examples of Section 4.
The average behaviour of the time series
In order to compute point predictions for the average behaviour of the series, we apply an extension of the methodology described by Moriña et al. (2011). Accordingly, taking expectations in equation (2.2) and calling E(Xt)= t, we obtain the following recurrence relation,
For the first initial values of , we can consider arbitrary values. In our case, we have used the overall mean of the series.
The variance of is calculated using the delta method. First, for each , the expression of in equation (3.1) can be understood as a function of the parameters ,
Next, the gradient of is computed in order to apply the delta method technique. That is,
Finally, the variance of is computed by using the following expression which depends on the variance–covariance matrix of the MLE of and the above estimated gradient,
Furthermore, as the closed form of the gradient, equation (3.3), is intractable, it is numerically calculated by a first-order approximation of the derivatives, taking successive values of until the difference between two consecutive becomes insignificant
Computationally, we have built a R-program (see supplementary material) in which the gradient is calculated numerically using the numDeriv package (see Gilbert and Varadham, 2012). This method improves this first-order approximation by applying Richardson's extrapolation.
As usual, the confidence intervals of these point predictions are obtained by means of this commonly expression
The time-ahead forecasting distribution for
Given observations , to make predictions with the HINAR() model, it is required to know the distribution of . To do this, we first need the following lemma.
Lemma Let Hermite distributed according to equation (2.1), and . Then,
Proof The pgf of a -thinning of a count random variable has the form (see Puig and Valero, 2007). Because a -thinning of a Hermite distribution is also Hermite, and the log-pgf is a second order polynomial, equating its coefficients holds the proof.
In order to simplify the notation used in equation (2.2), we will denote , and . Assuming first that , notice that conditioned to is the sum of an independent binomial distribution with parameters , plus a Hermite distribution with parameters and . Consider now the case . Then, the distribution of can be determined through the last observed value of the series in the following way,
where and .
Therefore, the distribution of conditioned to is the sum of independent binomial random variables with parameters , plus a Hermite distribution with parameters and . This recursive procedure can be extended to , leading to the following proposition:
Proposition 3.
where
and
where and . Note that the property of closure under convolution of the Hermite distribution mentioned in Section 2 is essential to obtain this result. Consequently, the distribution of is the sum of a binomial with parameters and a Hermite() both independent. When tends to zero as increases, the long-term forecasting distribution of ( large) follows an approximate Hermite distribution with parameters and . The case in which the parameters are time constant (, and ) has special practical interest. In this situation, tends to zero as tends to infinity and Hermite, where the mean and variance are those of the marginal distribution given in equation (2.7), that is, and .
By replacing parameters in equations (3.9), (3.10) and (3.10) by their MLE, we can estimate regions of prediction of size , finding the lower and upper limits and that satisfy
Moriña et al. (2011) applied a similar methodology to predict the number of beds needed by month at the hospital level in the emergency service units. They used an INAR(2) model incorporating seasonality only in the innovations.
Examples of application
Our model has been used to extract the baseline patterns of mortality recorded at a farm level in two small populations of beef and dairy cattle. The continuous collection and analysis of automated data of fallen stock of cattle has shown a good potential as a health indicator (see Alba et al., 2015). If abnormal values of fallen cattle are out of the expected ranges according to the baseline patterns previously studied in a specific population, these peaks could indicate unusual events related to animal health issues. These signs could also pre-date the diagnosis and predict possible disease outbreaks. Several studies have been focused on the analysis of fallen stock in large populations that resulted in observations over time. However, many series should be studied at a local scale because important events may occur at this geographical level. That is, a region in Spain is organized at different geographical levels; province, county and municipality. Data recorded at a region or province is usually suitable to be studied by using ARMA models, while data recorded at a county and municipality has sometimes low values and/or many zeros. Also, in order to build a robust syndromic surveillance system based on cattle mortality, it would be quite necessary to explore and analyze fallen stock patterns at these different geographical levels. These patterns of fallen stock can be monitored by observing the number of carcasses collected at a farm level over time, although in small animal populations the series has many zeros and/or low counts. Therefore, our HINAR model could be a suitable tool to solve this problem.
Example 1. Modelling of fallen stock from a small population of beef cattle
The data of this study are provided by the Ministry of Agriculture, Food and Environment of Spain under strict agreement of confidentiality. These data record the fallen stock collected between January 2007 and December 2011 from 16 beef cattle farms with a total census of 1140 bovines placed in a small region of Catalonia. This dataset registers the number of bovine carcasses collected by the disposal services, the date of collection and the identification of each farm. For the purpose of studying the baseline patterns of fallen bovine of this region, the number of fallen cattle was aggregated by a week.
The range of carcasses collected by week is between 0 and 9 (see Figure 1). From 2007 to 2011 the median number of fallen cattle is 1, with a weekly average of 1.5. The standard deviation of the series is 1.7, a slight overdispersion is detected.
Evolution of fallen stock in a small beef cattle population from 2007 to 2011
It seems to be a change-point at (the beginning of 2009), according to the behaviour of the series shown in Figure 1. Change points in the parameters of a HINAR model can be tested using a likelihood ratio test (LRT). Under the null hypothesis, parameters , and are equal throughout the periods to and to , that is
The estimates of the parameters under the null hypothesis are , and , while the estimates without constraints are , , , , and . It leads to a LRT statistic equal to corresponding to a p-value equal to . Therefore, a model with constant parameters does not seem a good choice.
In order to fit the series properly, different hermite-based integer-valued autoregressive models (HINAR) are tested. That is, we have fitted several HINAR models with different temporal lags and patterns of trend and seasonality. The model we propose here is the best, based on our criteria for model selection (i.e., lowest AIC and BIC, and models with statistically significant coefficients). Accordingly, the estimated proposed model is a HINAR(1) with and a decreasing trend included in the Hermite parameter ,
The MLE of the parameters of the model are presented in Table 1. The low value of the standard error of the Hermite parameter implies that , and consequently the innovations of the model are not Poisson distributed. The AIC = 841.47 and BIC = 852.15 of this model are lower than the AIC = 915.13 and BIC = 922.25 of the equivalent Poisson-based integer-valued autoregressive model (PoINAR), leading to a LRT statistic equal to 75.65. Under the null hypothesis, , the LRT does not have an asymptotic distribution as expected, because belongs to the boundary of the domain of parameters. It can be shown that the asymptotic distribution is a 50:50 mixture of the constant 0 and the distribution, leading in this case to a p-value equal to .
Maximum likelihood estimates of the model proposed for the series of beef cattle
Parameter
ML estimate
s.e.
0.104
0.044
0.004
0.001
0.306
0.046
ACF and PACF of the residuals of model, in equation (4.1)
Predictions made in static cross-validation and forecasting of the HINAR(1) model for the beef cattle. The average behaviour of the time series (top) and the k time-ahead forecasting
distribution (bottom). Confidence limits at 95%
The auto-correlation function (ACF) and partial auto-correlation function (PACF) plots (Figure 2) allow us to validate the residuals of the model. It can be noted that their behaviour is close to white noise. The predictive performance of the proposed model is also assessed using two techniques of cross-validation. On the one hand, we use a static cross-validation in order to explore the predictive capability of the model. Using a static form, the dataset is divided into training and testing sets: the first is used to estimate the model; whereas the second is used for the model validation. In our case, the training set includes the first 80% of the observations registered over the period of study (up to 2011), while the testing sample includes the last 20% of the observations (2011). The model is fitted with the training set to predict the expected values over the period corresponding to the testing set. Then, the training set predictions are compared with the testing set observations. The average prediction of the series and the time-ahead forecasting distribution based on the estimated parameters according to equation (3.1) is shown in Figure 3. The estimates of these parameters based on the training sample (up to 2011) are similar (, and ) to those obtained based on the whole sample (see Table 1). In addition, the 95% confidence limits calculated from the time-ahead forecasting distribution show that the number of carcasses observed after 2011 agree with the proposed model. The approach proposed by McCabe et al. (2011) to estimate the time-ahead forecasting distributions is also performed and the results are shown in Figure 9. In this case, the NPMLE of the parameters (and their standard errors) are (), with an estimated distribution of the innovations equal to (), (), (), (), (), (), (), (), () and (). Figure 9 shows that the non-parametric-based 95% confidence limits (approximately ) are larger than those computed using our method (approximately ), summarized in Figure 3.
Finally, because this HINAR(1) model has a constant , the approximated long-term forecasting distribution can also be considered. However, this distribution is almost indistinguishable from the time-ahead forecasting distribution because has a very low value. On the other hand, we use a dynamic cross-validation which considers that the training and testing samples are changing over time. That is, an initial sample is used for estimating the model and making predictions for a near future time period. Once the predicted values are available, these are compared with the corresponding observed values which will be incorporated into the series over time. In that case, it is required to identify whether some abnormal peak is caused by a disease or a catastrophe, or whether it is only a regular random value of the series. If the peak is really anomalous, it will be replaced by its predicted value in order to keep the capability of the fitted model to detect abnormalities of this magnitude. This procedure is repeated in such a way that all the last available information is re-used to estimate the model, and it is also used to re-make predictions for a period without information. When such information is available, then predictions are re-validated. Hence, dynamic cross-validation can be performed at the same time the information is updated.
The dynamic cross-validation of the model in equation (4.1) is shown in Figure 4. This validation is built in the following way. First, an initial sample based on the information of 2007 is considered. Second, this initial sample is used to predict a week ahead. Next, this prediction is compared with the corresponding observed value; if this value is anomalous, it is studied to determine whether it is actually abnormal or not. Finally, if it is abnormal, it will be replaced by its predicted median value. This process is repeated as new information is incorporated in the sample.
Dynamic cross-validation of the HINAR(1) model for the beef cattle
Notice that the proposed model forecasts one week ahead. This elapsed time is considered appropriate to provide continuous information and gives responses avoiding the distortion that exists at a daily level due to the lack of carcasses collected over the weekend.
In this example, the values registered in weeks 58, 66, 75 and 81 could be anomalous according to the forecasting estimates. In our example, based on the available information, they have been considered as regular random values.
Example 2. Modelling of fallen stock from a small population of dairy cattle
In this example, the fallen stock data are also collected between January 2007 and December 2011 from three dairy cattle farms with a total census of 647 bovines placed in the same region as the previous example. A range between 0 and 7 is collected (see Figure 5), and from 2007 to 2011 the median number of collection recorded is 0.5, with a weekly average of 1 carcass. A slight overdispersion in the series with a dispersion index of 1.33 is also detected.
Evolution of fallen stock in a small dairy cattle population from 2007 to 2011
Maximum likelihood estimates of the model proposed for the series of dairy cattle
Parameter
ML estimate
s.e.
3.894
1.404
3.304
1.594
0.388
0.065
1.540
0.197
0.698
0.240
As in the previous example, different HINAR models are tested assessing trends and seasonality based on equations (2.3) and (2.4). Taking into account our criteria for model selection, the best model among all possible models is a HINAR(1) leading to the process,
where annual seasonality has been included in both parameters and . A summary of the MLE of the parameters is shown in Table 2. The AIC = 694.96 and BIC = 712.76 of this model are also lower than the AIC = 745.25 and BIC = 752.37 of the equivalent PoINAR. In this case, the asymptotic distribution of the LRT statistic is unknown since the null hypothesis takes the form of . However, the p-value of the LRT could be calculated using a parametric bootstrap method.
The plots of the ACF and PACF are shown in Figure 6. Notice that their behaviour is also close to white noise.
ACF and PACF of the residuals of model (4.2)
Predictions made in static cross-validation and forecasting of the HINAR(1) model for the dairy cattle. The average behaviour of the time series (top) and the k time-ahead forecasting distribution (bottom). Confidence limits at 95%
Finally, in order to validate our model not only the checking of its residuals is taken into account but also its capability as a predictive tool. As in the example 1, the static and dynamic cross-validation have been conducted. Figure 7 shows the average prediction of the series as well as the time-ahead forecasting distribution based on the estimated parameters according to equation (3.1). The estimates of the parameters based on the training sample (up to 2011) have also been similar (, , , and ) to those shown in Table 2. The 95% confidence limits calculated from the time-ahead forecasting distribution also agree with the proposed model. As in the first example, the approach of McCabe et al. (2011) is applied and the corresponding 95% confidence limits are shown in Figure 9. Here, the non-parametric estimates of the parameters are (), and the distribution of the innovations are (), (), (), (), () and (). Probabilities and are aggregated to () to avoid convergence problems. Comparing with the results obtained using our method (Figure 7), it is clear that the non-parametric-based approach of McCabe et al. (2011) completely ignores the seasonality pattern of the series, losing an important information.
Dynamic cross-validation of the HINAR(1) model for fallen dairy cattle
Figure 8 shows the dynamic cross-validation made for the model in equation (4.2). This validation has been conducted in the same way as the validation of example 1. The observed values agree with the model except for the value recorded in the week 153. However, further study at a field level is required to determine the causes associated with this event.
Discussion
Over the last few years the classical Poisson INAR model has been studied in detail and applied in different situations. The HINAR model is a reasonable generalization of the classical model allowing to cope with many time series in small scale studies having low values with many zeros. The HINAR model presented in this work allow to consider time-dependent coefficients, which are useful to include the trend and/or seasonality of the series. These reasons make the HINAR model more flexible than the classical Poisson INAR model. The main advantage of the INAR(1) model with Hermite innovations with respect to other innovations proposed in the literature is that the marginal distribution of the process (with constant coefficients) is also Hermite distributed. For time-dependent models and also for , the property of the Hermite distribution of being closed under independent addition simplifies notoriously the k time-ahead forecasting distribution.
Our model has been motivated by the study of mortalities in beef and dairy cattle farms at a small level. Specifically, these mortalities recorded at farm level have the potential of being used as an early warning system for assessing the impact and magnitude of abnormal events occurred over time. Moreover, these events can also have different impact at various geographical and temporal scales. For instance, a disease outbreak that causes mortality can occur in a wide geographical area or can be localized in a small area. In the first case, the analysis at a large scale is able to detect the outbreak, although in the second case it is required to analyze the data at a small scale in order to detect the anomaly. Accordingly, ARMA models are suitable for describing patterns of series at a large scale, but they are not as appropriate as the HINAR models for describing the profiles of the series at a small scale.
The 95% confidence limits of the forecasting distributions from the 1-week ahead to 48-week ahead based on the approach proposed by McCabe et al. (2011)
The series of beef cattle has been fitted using a HINAR(1) model with a decreasing trend included in the innovations, while the series of dairy cattle has been fitted using a HINAR(1) with an annual seasonality included in the -parameter and also in the innovations. These models have been selected based on different kind of validations. In both examples, the model validation was conducted analyzing the crude residuals () to identify if they look like white noise (see Figures 2 and 5), and then perform a cross-validation (see Figures 3 and 6). The static cross-validation is useful to validate the baseline patterns of the model, while the dynamic cross-validation is a better choice, if the aim of the study is to forecast future values. The new models appear to be more appropriate than the classical Poisson INAR models because the AIC and BIC are lower. We have explored several patterns of the trend of seasonality using linear and trigonometric functions with appropriate links. The use of non-parametric approaches like P-splines (Eilers et al., 2015) is a topic of further research.
Why are two very different models needed to model beef versus dairy cattle farms? The answer comes from the fact that beef cattle farms are characteristic of being extensive farms. These kind of farms are mainly kept in fields and may be housed for a part of the year. Furthermore, data are derived from a few number of farms which are quite heterogeneous and some of them were closed during the period of study, agreeing with the decreasing trend detected by the model. This trend can be explained as a consequence of the impact of the major economic crisis in Spain after 2008, where many farms decided to close leading to a decreasing number of beef cattle in the regions and a decreasing number of carcasses recorded. On the other hand, the profile of the dairy cattle farm is more sophisticated, appearing to be an annual seasonality. In dairy cattle, the farms are intensive in such a way that cattle may be housed throughout their lives. Furthermore, dairy cattle are quite sensitive to temperature changes. This fact agrees with the seasonality introduced in equation (4.2) and described in the Figure 7.
To sum up, the model proposed in this article can be an appropriate and less restrictive method for a fitting discrete time series with low counts and many zeros. Specifically, it allows us to fit series assuming non-Poisson innovations and time-dependent parameters.
Footnotes
Acknowledgments
This work was funded by grants MTM2012-31118 and MTM2015-69493-R from the Ministry of Economy and Competitiveness of Spain. We thank to the Ministry of Agriculture, Food and Environment of Spain for providing the data analyzed in the article. We also give special thanks to David Moriña who helped us with classical INAR models and their implementation in the R-software and Sandy Beitsch who carefully reviewed the article for English language. The authors thank the two referees and the associate editor for their valuable comments and suggestions.
References
1.
AlbaADóreaFCArineroLSánchezJCordónRPuigPCrawfordWR (2015) Exploring the surveillance potential of mortality data: Nine years of bovine fallen stock data collected in Catalonia (Spain). PLoS ONE10(4), e0122547, doi:10.1371/journal.pone.0122547.
2.
AlzaidAAAl-OshMA (1990) An integer-valued pth-order autoregressive structure (INAR(p)) process. Journal of Applied Probability27, 314–24.
3.
DuJGLiY (1991) The integer-valued autoregressive (INAR(p)) model. Journal of Time Series Analysis, 12, 129–42.
4.
EilersPHCMarxBDDurbanM (2015) Twenty years of P-splines. Statistics and Operations Research Transactions, 39, 149–86.
5.
FokianosKGombayEHusseinA (2014) Retrospective change detection for binary time series models. Journal of Statistical Planning and Inference, 145, 102–12.
6.
GilbertPVaradhamR (2012) numDeriv: Accurate numerical derivatives (R package). URL http://CRAN.R-project.org/package=numDeriv (last accessed 28 June 2016).
7.
GomesDCanto e CastroL (2009) Generalized integer-valued random coefficient for a first order structure autoregressive (RCINAR) process. Journal of Statistical Planning and Inference, 139, 4088–97.
8.
HudecováS (2013) Structural changes in autoregressive models for binary time series. Journal of Statistical Planning and Inference, 143, 1744–52.
9.
HudecováSHudecováMMeintanisS (2015a) Tests for time series of counts based on the probability generating function. Statistics, 49, 316–37.
10.
HudecováSHudecováMMeintanisS (2015b) Detection of changes in INAR models. In StelandA., Stochastic models, statistics and their applications, Springer Proceedings in Mathematics and Statistics. Vol. 122, pages 11–18.
11.
JaziMAJonesGLaiC (2012a) First-order integer valued AR processes with zero inflated poisson innovations. Journal of Time Series Analysis, 33, 954–63.
12.
JaziMAJonesGLaiC (2012b) Integer valued AR(1) with geometric innovations. Journal of the Iranian Statistical Society, 11, 173–90.
13.
JungRCTremayneAR (2006) Binomial thinning models for integer time series. Statistical Modelling, 6, 81–96.
14.
KempCDKempAW (1965) Some properties of the Hermite distribution. Biometrika Trust, 52, 381–94.
15.
McCabeBPMMartinMHarrisD (2011) Efficient probabilistic forecasts for counts. Journal of the Royal Statistical Society: Series B, 73, 253–72.
16.
McKenzieE (1985a) Some simple models for discrete variate time series. Water Resources Bull, 21, 645–50.
17.
McKenzieE (2003) Discrete variate time series. In Handbook of Statistics. pages 573–606. Amsterdam: Elsevier.
18.
MilneRKWestcottM (1993) Generalized multivariate Hermite distributions and related point processes. Annals of the Institute of Statistical Mathematics, 45, 367–81.
19.
MoriñaDPuigPRȷosJVilellaATrillaA (2011) A statistical model for hospital admissions caused by seasonal diseases. Statistics in Medicine, 30, 3125–36.
MoriñaDHiguerasMPuigPOliveiraM (2015b) Generalized hermite distribution modelling with the R package hermite. The R Journal, 7, 263–74.
22.
PuigPValeroJ (2007) Characterization of count data distributions involving additivity and binomial subsampling. Bernoulli, 13, 544–55.
23.
ScottoMGWeißCHGouveiaS (2015) Thinning-based models in the analysis of integer-valued time series: A review. Statistical Modelling, 15, 590–618.
24.
SteutelFWvan HarnK (1979) Discrete analogues of self-decomposability and stability. The Annals of Probability, 7, 893–9.
25.
WeißCH (2008) Thinning operations for modeling time series of counts---a survey. Advances in Statistical Analysis, 92, 319–41.
26.
WeißCHPuigP (2015) The marginal distribution of compound Poisson INAR(1) processes. Stochastic Models, Statistics and Their Applications, 122, 351–9.
27.
ZhengHBasawaIVDattaS (2005) Inference for pth-order random coefficient integer-valued autoregressive processes. Journal of Time Series Analysis, 27, 411–40.
28.
ZhengHBasawaIVDattaS (2007) First-order random coefficient integervalued autoregressive processes. Journal of Statistical Planning and Inference, 173, 212–29.