In many cases, only an unbalanced panel data set with some observations missing at random is available. This note derives a Cliff and Ord test for spatially autocorrelated disturbances for such data. In a small Monte Carlo simulation exercise, the performance of the proposed test is similar to that of its balanced counterpart. In almost all simulation experiments, the test is properly sized. As expected, the power of the test is lower, the higher the share of missing data.
For unbalanced panels, diagnostic tests for spatial correlation of the idiosyncratic disturbances seem to be unavailable. The existing tests are based on balanced panels. Specifically, Baltagi, Song, and Koh (2003) derive a Lagrange multiplier (LM) test under the random effects and the normality assumption, while Mutl and Pfaffermayr (2010) propose a Cliff and Ord test in the spirit of Burridge (1980) and Moran (1950) that is based on the within-transformed residuals.
This article follows the work of Haining, Griffith, and Bennett (1989), Martin (1984), and LeSage and Pace (2004) on spatial cross-section models with data missing at random (MAR) and derives a Cliff and Ord test for unbalanced panel data. An important consequence of the MAR assumption is that the missing data mechanism is ignorable, so that the likelihood of the observed unbalanced panel model can be based on the marginal distribution of the observed model (see Martin 1984; Little and Rubin 2002). The MAR assumption is plausible for possibly large cross sections of units, where the missingness of data depends only on the observed values of the outcome and the set of explanatory variables, but not on the outcome of the missing ones. A leading application is provided by LeSage and Pace (2004), who analyze the determinants of housing prices in a cross section under the MAR assumption. In their data, only the prices of houses actually sold are observed, while a spatial relation of the disturbances is assumed among all houses.
The proposed Cliff and Ord test on spatially correlated disturbances in unbalanced panels is based on the within-transformed data and is robust to the violation of the random effects assumption. This test is easy to implement, and in a small Monte Carlo simulation exercise it performs similarly to its balanced counterpart in terms of size and power. In almost all simulation experiments, the test is properly sized. As one would expect, the power of the test is lower, the higher the share of missing data. However, with negative spatial correlation of the disturbances and a distance-based spatial weight matrix, the power of the test can be very low. This phenomenon is also observed in balanced panels (see, e.g., Mutl and Pfaffermayr 2010), but seems to be aggravated in unbalanced ones.
The Cliff and Ord Test in an Unbalanced Panel
In the balanced but unobserved panel model, the data are assumed to be ordered first by time and then by units, so that the fast index i = 1, ..., N refers to units and the slow index t = 1, ..., T to time. The balanced panel model is given by:
The vector $y_N$ is the dependent variable, and the matrix $X_N$ comprises the set of exogenous variables excluding the constant, which is denoted by α. β is the corresponding parameter vector. $\mu_N$ is the vector of unit-specific effects and $\nu_N$ denotes the vector of the i.i.d. remainder disturbances. The vector of unit-specific effects is repeated in all time periods using the dummy design matrix $Z_\mu = \iota_T \otimes I_N$, where $\iota_T$ is a ($T \times 1$) vector of ones and $I_T$ and $I_N$ are identity matrices of dimension $T$ and $N$, respectively. Following Kapoor, Kelejian, and Prucha (2007), the disturbance term $u_N$ may be spatially correlated, involving the ($N \times N$) spatial weighting matrix $W_N$ and the spatial autocorrelation parameter ρ. Typically, the spatial weights are either based on contiguity or decline in some measure of distance between units. The assumptions stated below guarantee that $(I_N - \rho W_N)^{-1}$ exists.
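In the notation of Kapoor, Kelejian, and Prucha (2007), a specification matching this description can be sketched as follows. The symbols $y_N$, $X_N$, $u_N$, $\varepsilon_N$, $\mu_N$, $\nu_N$, and $Z_\mu$ are labels assumed for this sketch, and the exact layout of the model equation may differ.

```latex
\begin{aligned}
y_N &= \alpha\,\iota_{NT} + X_N \beta + u_N, \\
u_N &= \rho\,(I_T \otimes W_N)\,u_N + \varepsilon_N, \\
\varepsilon_N &= Z_\mu \mu_N + \nu_N, \qquad Z_\mu = \iota_T \otimes I_N .
\end{aligned}
```

Solving the second line for the disturbances gives $u_N = \bigl(I_T \otimes (I_N - \rho W_N)^{-1}\bigr)\varepsilon_N$, which is why the existence of the inverse is required.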
The observed panel is unbalanced with some observations MAR. It is derived from the unobserved balanced one using the selector matrix $S_N$, which is obtained from the identity matrix $I_{NT}$ by skipping all rows referring to missing values. n denotes the number of observed values, while the number of missing values is denoted by m, so that n + m = NT. Thus, the unbalanced model uses $S_N y_N$ and $S_N X_N$. Next, define $r_N$ as the vector of indicator variables taking the value 1 if unit i is observed at time t and 0 otherwise. According to the definition of Rubin (1976), data are MAR if missingness depends only on the observed values, but not on the missing ones. An important consequence of the MAR assumption is that the missing data mechanism is ignorable: the joint density of the observed values and $r_N$, which is obtained by integrating out the missing values, is the product of the density of $r_N$ given the observed values and the marginal distribution of the observed values. Hence, one can treat $r_N$ and, therefore, $S_N$ as given (see Little and Rubin 2002, 119) and base the inference on the marginal distribution of the observed values. Throughout, it is assumed that the share of missing observations converges to a finite constant.
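The construction of the selector matrix can be illustrated with a small sketch. The function name `selector_matrix` and the toy missingness pattern are hypothetical and only serve to show how rows of the identity matrix are skipped.

```python
import numpy as np

def selector_matrix(observed):
    """Return S (n x NT): the NT x NT identity with rows of missing cells dropped.

    `observed` is a boolean array of shape (T, N); flattening it row by row
    gives the ordering used in the text (slow index time, fast index units).
    """
    r = np.asarray(observed, dtype=bool).ravel()  # indicator vector of length N*T
    return np.eye(r.size)[r]

# Hypothetical toy panel: N = 3 units, T = 2 waves, unit 2 is missing in wave 2.
observed = np.array([[True, True, True],
                     [True, False, True]])
S = selector_matrix(observed)
n, NT = S.shape           # n observed values out of N*T cells
m = NT - n                # number of missing values

y_full = np.arange(6.0)   # stacked balanced vector (wave 1 first, then wave 2)
y_obs = S @ y_full        # keeps the entries of the observed cells only
```

Because the selector only drops rows, `S @ S.T` is the identity of dimension n, while `S.T @ S` is a diagonal matrix with zeros at the missing cells.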
Applying the selector matrix defined above yields the unbalanced panel model
with The projection matrix with removes the unit-specific effects to obtain the within-transformed observed model as
This uses as shown in the Appendix. The transformed model would also result from Anselin’s (1988) specification, where only the remainder disturbances are spatially correlated or the more general model considered in Baltagi and Liu (2012) and Lee and Yu (2010b). The variance–covariance matrix of the within-transformed disturbances of the observed unbalanced model follows immediately from equation (3) as
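The effect of the within transformation in the unbalanced panel can be checked numerically. The sketch below assumes the spatial error structure of Kapoor, Kelejian, and Prucha (2007), in which the unit effects are spatially filtered as well; the ring-shaped weight matrix, the missingness pattern, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, rho = 4, 3, 0.4

# Hypothetical row-normalized weights on a ring of four units (zero diagonal).
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5

# Missingness pattern (T x N): unit 0 is missing in wave 3, unit 2 in wave 2.
observed = np.ones((T, N), dtype=bool)
observed[2, 0] = observed[1, 2] = False
S = np.eye(N * T)[observed.ravel()]            # selector matrix, n x NT

Z_mu = np.kron(np.ones((T, 1)), np.eye(N))     # iota_T (x) I_N, repeats unit effects
D = S @ Z_mu                                   # unit dummies of the observed panel
P = D @ np.linalg.inv(D.T @ D) @ D.T           # D'D = diag(T_i): averages within units
Q = np.eye(S.shape[0]) - P                     # within projection of the observed model

A_inv = np.linalg.inv(np.eye(N) - rho * W)     # (I_N - rho W_N)^{-1}
mu = rng.normal(size=N)                        # arbitrary unit effects
spatial_effects = np.kron(np.eye(T), A_inv) @ Z_mu @ mu

residual_effect = Q @ S @ spatial_effects      # vanishes up to rounding error
```

The result follows because the spatially filtered unit effects are still constant over time for each unit, so the unit-specific demeaning removes them even when some waves are missing.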
Throughout it is assumed that the following assumptions hold:
Assumption 1: The real-valued random variables of the array satisfy . For each , the random variables are totally independent and identically distributed. Furthermore, for some , and .
Assumption 2: The spatial weights collected in WN are nonstochastic and
,
is nonsingular, the absolute row and column sums of the matrices WN and are uniformly bounded,
, where k1 is a positive constant that does not depend on N and .
Assumption 3: Data are MAR and at fixed , where k2 is a positive constant that does not depend on N.
Assumption 4: The elements of XN are nonstochastic and are uniformly bounded in absolute value.
These assumptions closely follow those made in the literature (see e.g., Lee and Yu 2010a; Kapoor, Kelejian, and Prucha 2007; Mutl and Pfaffermayr 2010 for a discussion). Assumption 3 is based on the literature on missing data and guarantees that the selection mechanism is ignorable, so that the estimation can be based on the marginal distribution of the observed values. In addition, this assumption maintains that the share of missing values does not vanish as N gets large. Rather, the “missing” portion of the sample is allowed to increase at the same rate as the observed portion.
Following Cornwell and Schmidt (1992), Khatri (1968) and, specifically, Rao (1973, 527), the log likelihood for the within-transformed observed model is based on the singular multivariate normal distribution, since the variance–covariance matrix of the within-transformed disturbances defined above does not have full rank:
where d(ρ) is the product of the positive eigenvalues of the variance–covariance matrix of the within-transformed disturbances. In the Appendix, the Cliff and Ord test for unbalanced panels is derived as the square root of an LM test for ρ = 0, and the corresponding test statistic is given by
where the residuals of the within-transformed unbalanced model are based on a consistent estimator of β. Under H0 with ρ = 0, the variance of the remainder disturbances is consistently estimated from these residuals (see Baltagi 2008; Lee and Yu 2010a). This test is easy to implement. One only needs the estimated residuals of the within-transformed unbalanced model and a spatial weighting matrix where all rows and columns that refer to missing data are skipped. In a balanced panel, the selector matrix is the identity matrix and the test statistic reverts to that considered by Mutl and Pfaffermayr (2010). The following asymptotic normality result for this test statistic under H0 can be established.
Theorem 1: Let Assumptions 1 to 4 hold and let be based on a consistent estimator of β. Then
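The implementation steps described above (within residuals plus a weighting matrix restricted to observed cells) can be sketched as follows. This is a Cliff and Ord type statistic built as a standardized quadratic form and is not necessarily identical to the article's expression; the function name, the variance estimator with n - N - K degrees of freedom, and the toy data are assumptions made for illustration.

```python
import numpy as np

def cliff_ord_statistic(y, X, W, observed):
    """Standardized quadratic form in within residuals (illustrative sketch).

    y: (n,) observed outcomes and X: (n, K) observed regressors, both stacked by
    wave and then by unit; W: (N, N) weights with zero diagonal;
    observed: (T, N) boolean pattern in which every unit appears at least once.
    """
    T, N = observed.shape
    S = np.eye(N * T)[observed.ravel()]                 # selector matrix
    D = S @ np.kron(np.ones((T, 1)), np.eye(N))         # unit dummies
    Q = np.eye(len(y)) - D @ np.linalg.inv(D.T @ D) @ D.T
    Wn = S @ np.kron(np.eye(T), W) @ S.T                # weights among observed cells
    C = 0.5 * Q @ (Wn + Wn.T) @ Q                       # symmetric, zero diagonal

    Xw, yw = Q @ X, Q @ y                               # within transformation
    beta = np.linalg.lstsq(Xw, yw, rcond=None)[0]
    resid = yw - Xw @ beta
    sigma2 = resid @ resid / (len(y) - N - X.shape[1])  # within variance estimate
    stat = resid @ C @ resid / (sigma2 * np.sqrt(2.0 * np.trace(C @ C)))
    return stat, C

# Toy data generated under H0 (rho = 0); all numbers are illustrative assumptions.
rng = np.random.default_rng(1)
N, T = 16, 5
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5
observed = np.ones((T, N), dtype=bool)
observed[2:, ::4] = False                               # units 0, 4, 8, 12 drop out after wave 2
S = np.eye(N * T)[observed.ravel()]
X_full = rng.uniform(size=(N * T, 1))
mu = np.tile(rng.normal(size=N), T)                     # unit effects repeated in each wave
y_full = 1.0 + 0.5 * X_full[:, 0] + mu + rng.normal(size=N * T)
stat, C = cliff_ord_statistic(S @ y_full, S @ X_full, W, observed)
```

The normalization uses the fact that a quadratic form in i.i.d. disturbances with a symmetric zero-diagonal matrix C has mean zero and variance 2σ⁴ tr(C²), so the statistic can be compared with standard normal critical values.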
The Monte Carlo simulations are based on the following model:
setting α = 1 and β = 0.5. The explanatory variable is generated as with and where denotes the uniform distribution on the interval . Xit is kept fixed in repeated samples. The remainder disturbances are generated as independent draws from The unit effects μi are left unspecified since they drop out when applying the within transformation.
The spatial weighting matrix is based on a regular lattice with 144 and 289 cells, respectively, containing one observation each. The first weighting scheme is a rook design, where every unit is surrounded by four neighbors. Alternatively, the distance between any two neighboring cells in the north–south and east–west direction is set to 1, and the spatial weighting scheme is based on the Euclidean distance dij. The spatial weight is then defined as a function of dij, with a cutoff that sets the weight to zero if the Euclidean distance exceeds a threshold. Both spatial weighting matrices are maximum-row normalized, following Kelejian and Prucha (2010b).
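A rook-type weighting matrix with maximum row sum normalization can be sketched as follows. The function name `rook_weights` is hypothetical, and the treatment of the lattice edge is an assumption: in this sketch, border cells simply have fewer than four neighbors.

```python
import numpy as np

def rook_weights(side):
    """Rook contiguity on a side x side lattice, scaled by the largest row sum."""
    N = side * side
    W = np.zeros((N, N))
    for r in range(side):
        for c in range(side):
            i = r * side + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < side and 0 <= cc < side:
                    W[i, rr * side + cc] = 1.0
    return W / W.sum(axis=1).max()   # one common scale factor for all rows

W144 = rook_weights(12)   # 12 x 12 = 144 cells
W289 = rook_weights(17)   # 17 x 17 = 289 cells
```

Dividing by a single scalar, rather than by each row's own sum, keeps the matrix symmetric and leaves the relative weights across units unchanged.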
For each simulation experiment, Table 1 exhibits the size of the test calculated as the share of rejections at ρ = 0. The power of the test is given by the share of rejections under the alternative ρ ≠ 0. Experiments 1 to 5 in the upper panel of Table 1 refer to the rook design with 10 percent, 30 percent, and 50 percent, respectively, missing observations in waves t > 2. Note that waves 1 and 2 are assumed to be fully observed and the pattern of missing values is kept fixed in repeated samples. At T = 5 this translates into an overall share of missing observations of 6 percent, 18 percent, and 30 percent, respectively.
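The overall shares follow from simple arithmetic, since only T - 2 of the T waves contain missing values. A minimal sketch, with a hypothetical helper name:

```python
def overall_missing_share(share_in_later_waves, T=5, fully_observed_waves=2):
    """Overall share of missing cells when only the later waves have gaps."""
    return share_in_later_waves * (T - fully_observed_waves) / T

# 10, 30, and 50 percent missing in waves 3 to 5 of a panel with T = 5
shares = [round(overall_missing_share(s), 2) for s in (0.10, 0.30, 0.50)]
```

With T = 5, three of the five waves are affected, so each per-wave share is scaled by 3/5.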
Table 1. Share of Rejections of the Cliff and Ord Test in the Unbalanced Panel, Maximum Row Sum Normalization.
Column headers give the overall share of missing observations and, in parentheses, the experiment number.

| ρ | 6% (1) | 6% (2) | 6% (3) | 6% (4) | 6% (5) | 18% (1) | 18% (2) | 18% (3) | 18% (4) | 18% (5) | 30% (1) | 30% (2) | 30% (3) | 30% (4) | 30% (5) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Rook design | | | | | | | | | | | | | | | |
| −0.9 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| −0.6 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| −0.4 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 0.998 | 0.999 | 0.993 | 1.000 | 1.000 | 0.989 | 0.994 |
| −0.3 | 0.995 | 1.000 | 1.000 | 0.991 | 0.996 | 0.975 | 1.000 | 1.000 | 0.969 | 0.971 | 0.915 | 0.997 | 0.990 | 0.905 | 0.905 |
| −0.2 | 0.857 | 0.989 | 0.995 | 0.865 | 0.858 | 0.733 | 0.958 | 0.956 | 0.752 | 0.729 | 0.583 | 0.864 | 0.812 | 0.604 | 0.589 |
| −0.1 | 0.331 | 0.582 | 0.607 | 0.326 | 0.315 | 0.245 | 0.453 | 0.448 | 0.242 | 0.250 | 0.187 | 0.333 | 0.293 | 0.184 | 0.192 |
| 0.0 | **0.052** | **0.051** | **0.051** | **0.049** | **0.051** | **0.049** | **0.049** | **0.053** | **0.043** | **0.052** | **0.045** | **0.050** | **0.048** | **0.045** | **0.051** |
| 0.1 | 0.331 | 0.589 | 0.597 | 0.303 | 0.328 | 0.255 | 0.453 | 0.445 | 0.236 | 0.265 | 0.199 | 0.334 | 0.298 | 0.184 | 0.193 |
| 0.2 | 0.851 | 0.991 | 0.995 | 0.867 | 0.855 | 0.730 | 0.957 | 0.952 | 0.733 | 0.735 | 0.591 | 0.874 | 0.813 | 0.584 | 0.591 |
| 0.3 | 0.995 | 1.000 | 1.000 | 0.998 | 0.996 | 0.974 | 1.000 | 1.000 | 0.978 | 0.974 | 0.913 | 0.997 | 0.991 | 0.915 | 0.916 |
| 0.4 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.999 | 0.994 | 1.000 | 1.000 | 0.995 | 0.993 |
| 0.6 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 0.9 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Distance-based spatial weights | | | | | | | | | | | | | | | |
| −0.9 | 0.192 | 0.338 | 0.676 | 0.187 | 0.197 | 0.116 | 0.206 | 0.407 | 0.106 | 0.109 | 0.060 | 0.116 | 0.208 | 0.053 | 0.063 |
| −0.6 | 0.083 | 0.150 | 0.349 | 0.072 | 0.086 | 0.052 | 0.097 | 0.202 | 0.042 | 0.049 | 0.030 | 0.063 | 0.105 | 0.026 | 0.029 |
| −0.4 | 0.040 | 0.076 | 0.160 | 0.033 | 0.040 | 0.029 | 0.052 | 0.099 | 0.023 | 0.032 | 0.023 | 0.036 | 0.057 | 0.019 | 0.022 |
| −0.3 | 0.029 | 0.047 | 0.094 | 0.023 | 0.026 | 0.025 | 0.038 | 0.061 | 0.019 | 0.022 | 0.021 | 0.031 | 0.044 | 0.017 | 0.022 |
| −0.2 | 0.022 | 0.033 | 0.054 | 0.019 | 0.022 | 0.024 | 0.030 | 0.041 | 0.020 | 0.023 | 0.023 | 0.031 | 0.032 | 0.020 | 0.020 |
| −0.1 | 0.028 | 0.033 | 0.037 | 0.025 | 0.027 | 0.032 | 0.034 | 0.037 | 0.028 | 0.028 | 0.029 | 0.032 | 0.036 | 0.026 | 0.029 |
| 0.0 | **0.047** | **0.050** | **0.047** | **0.045** | **0.049** | **0.043** | **0.044** | **0.048** | **0.043** | **0.044** | **0.042** | **0.046** | **0.045** | **0.042** | **0.042** |
| 0.1 | 0.084 | 0.081 | 0.105 | 0.081 | 0.085 | 0.076 | 0.075 | 0.087 | 0.072 | 0.076 | 0.074 | 0.072 | 0.078 | 0.065 | 0.070 |
| 0.2 | 0.147 | 0.151 | 0.211 | 0.145 | 0.147 | 0.128 | 0.127 | 0.173 | 0.129 | 0.134 | 0.115 | 0.111 | 0.137 | 0.111 | 0.113 |
| 0.3 | 0.242 | 0.243 | 0.386 | 0.254 | 0.237 | 0.207 | 0.210 | 0.298 | 0.202 | 0.209 | 0.178 | 0.179 | 0.227 | 0.174 | 0.174 |
| 0.4 | 0.375 | 0.379 | 0.598 | 0.384 | 0.381 | 0.321 | 0.321 | 0.479 | 0.324 | 0.324 | 0.267 | 0.257 | 0.363 | 0.265 | 0.268 |
| 0.6 | 0.698 | 0.690 | 0.915 | 0.707 | 0.696 | 0.621 | 0.601 | 0.842 | 0.622 | 0.615 | 0.520 | 0.497 | 0.719 | 0.533 | 0.514 |
| 0.9 | 0.973 | 0.967 | 1.000 | 0.975 | 0.971 | 0.955 | 0.939 | 0.998 | 0.958 | 0.958 | 0.920 | 0.885 | 0.993 | 0.928 | 0.918 |
Note: Boldface figures (the rows with ρ = 0.0) refer to the size of the test at a significance level of 5 percent. The spatial weighting matrix is maximum row normalized. Data are missing in waves 3 or higher. Waves 1 and 2 are fully observed. Experiment 1: normal disturbances, N = 144, T = 5; Experiment 2: normal disturbances, N = 289, T = 5; Experiment 3: normal disturbances, N = 144, T = 10; Experiment 4: log-normal disturbances, N = 144, T = 5; and Experiment 5: t(5) disturbances, N = 144, T = 5.
Each experiment is repeated 10,000 times, so that at a nominal size of 5 percent a 95 percent confidence interval of the calculated size is given by approximately [0.046, 0.054]. In almost all experiments, the simulated size of the Cliff and Ord test is contained in this 95 percent confidence interval, and the test possesses power against the alternative. Doubling the number of observations either in the cross-section dimension or in the time dimension (experiments 2 and 3) improves the power as expected. Also, the power tends to be smaller the higher the share of missing values, as one would expect.
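The interval for the simulated size can be reproduced with a normal approximation to the binomial share of rejections. The sketch below assumes a nominal size of 5 percent, which is inferred from the rejection shares at ρ = 0 in Table 1; the function name is hypothetical.

```python
import math

def size_interval(nominal=0.05, reps=10_000, z=1.96):
    """Normal-approximation 95 percent interval for a simulated rejection share."""
    half_width = z * math.sqrt(nominal * (1.0 - nominal) / reps)
    return nominal - half_width, nominal + half_width

lower, upper = size_interval()

# Three simulated sizes from the rook design rows at rho = 0.0 of Table 1:
# experiment 1 (6 percent), experiment 1 (18 percent), experiment 4 (18 percent).
inside = [lower <= s <= upper for s in (0.052, 0.049, 0.043)]
```

The half width shrinks with the square root of the number of replications, so the 10,000 repetitions pin the size down to within roughly 0.4 percentage points.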
Experiments 4 and 5 consider nonnormal disturbances, namely log-normal disturbances based on N(0, 1) draws (Experiment 4) or, alternatively, t(5) disturbances (Experiment 5). In both cases, the size of the Cliff and Ord test is nearly correct. In Experiment 4, the test seems somewhat undersized and its size falls outside the 95 percent confidence interval when 18 percent and 30 percent of the observations are missing. In addition, the power of the test tends to be smaller as compared to the case of normally distributed disturbances.
Finally, all experiments are repeated for the distance-based spatial weights (lower panel of Table 1). This weighting scheme implies a much higher spatial correlation of the disturbances. Again, the size of the Cliff and Ord test is nearly correct in experiments 1 to 3, but we now observe a slightly lower than nominal size in both experiments 4 and 5. Due to the higher spatial correlation of the disturbances, the power of the test is lower than under the rook design. Specifically, the power turns out to be very small if ρ is negative but large in absolute value (see Mutl and Pfaffermayr 2010 for similar results in the case of a balanced panel). A higher share of missing data reinforces this phenomenon. The reason seems to be that, under a negative ρ, positive and negative contributions to the mean of the test statistic tend to cancel out. The comparison of experiments 2 and 3 shows that the power increases much faster with T than with N. The reason is that the idiosyncratic disturbances are assumed to be uncorrelated over time, so that increasing T does not affect the spatial correlation of units, while increasing N means adding highly spatially correlated units.
Conclusion
This article derives a Cliff and Ord test for spatial correlation of the idiosyncratic disturbances for unbalanced panel data models where some observations are MAR. Under this assumption, missing data can be ignored and inference can be based on the marginal distribution of the observed values. The proposed test is easy to implement, as it only needs consistently estimated residuals of the within-transformed unbalanced panel model and a spatial weighting matrix where all rows and columns are skipped that refer to missing data.
A small Monte Carlo simulation exercise demonstrates that this test performs well in finite samples in terms of size and power under a rook design. With distance-based spatial weights, the power of the test is comparable only in case of positive spatial correlation of the disturbances. Similar to the case of the balanced panel, the power can be very low under negative spatial correlation of the disturbances. A high share of missing values seems to aggravate this phenomenon.
Appendix
Consider the observed within-transformed model
and observe that
where we define and use Therefore, we have
and
Following Kiefer (1980) and Rao (1973), the product of nonzero Eigenvalues d can be calculated by skipping one observation per unit (e.g., the last one). Specifically, define the () selector matrix DN that is obtained from In by skipping all rows referring to the last observation of a unit in the panel. Then, it follows that has rank n − N and its determinant, which is equal to d(ρ), is nonzero. The partial derivative of the log likelihood w.r.t. ρ is then given by
Under , we have
using Collecting terms yields
using (see Abadir and Magnus 2005, 231f). refers to the matrix selector of wave t. Then, one can define . is a diagonal matrix with zeros in the main diagonal if the corresponding observation is missing and with 1/Ti if it is nonmissing. Multiplying WN,t from the right by changes the columns of missing observations to zero and the others to Therefore, the typical diagonal element of is zero, since .
At true β and under ρ = 0 the score is, therefore, given by the quadratic form
Lemma 2: Let Assumptions 1 to 3 hold. Consider the sequence of quadratic forms , where is a symmetric nonsingular nonstochastic n × n matrix. Then, and is given by
Furthermore,
Proof: In order to derive the expectation and variance of rN, it is first demonstrated that CN has zero diagonal elements. Define where refers to the selector matrix of wave t with and observe that the N × N matrix has ones in the main diagonal and zero off-diagonal elements. In addition, all elements of the main diagonal that refer to missing values are zero. Multiplying a conformable matrix by from the left changes the rows corresponding to missing values to zeros; multiplying from the right replaces the columns referring to missing values by zeros. Now
Defining the N × N matrix , it can easily be verified that and that .
Some tedious calculations yield
and
Finally, the matrix has typical blocks of elements × , . Therefore, has zero diagonal elements, since the typical diagonal block possesses zero diagonal elements if WN does. As a result, one obtains
as shown above. Next consider , which is given by
since and The assumptions maintained in the Lemma guarantee that those made in Theorem 1 in Kelejian and Prucha (2001, 227) hold. Specifically, Assumption 2 implies that for some positive constant k. The claim of the Lemma then follows from Theorem 1 of Kelejian and Prucha (2001, 227).
Proof of Theorem 1: Part (i) follows from Lemma 2. The proof of part (ii) is similar to that of Theorem 1 in Mutl and Pfaffermayr (2010) and thus omitted.
Acknowledgments
I am grateful to two anonymous referees for helpful comments.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
Anselin, L., J. Le Gallo, and H. Jayet. 2008. “Spatial Panel Econometrics.” In The Econometrics of Panel Data: Fundamentals and Recent Developments in Theory and Practice, edited by L. Mátyás and P. Sevestre, chapter 19, 625–660. Berlin, Heidelberg, Germany: Springer.
Baltagi, B. H. 2008. Econometric Analysis of Panel Data. 4th ed. Chichester, UK: John Wiley.
Baltagi, B. H., and L. Liu. 2012. “Random Effects, Fixed Effects and Hausman's Test for the Generalized Spatial Panel Data Regression Model.” Working paper WP # 0029ECO-661-2012. The University of Texas at San Antonio, College of Business.
Baltagi, B. H., S. H. Song, and W. Koh. 2003. “Testing Panel Data Models with Spatial Error Correlation.” Journal of Econometrics 117:123–50.
Burridge, P. 1980. “On the Cliff-Ord Test for Spatial Correlation.” Journal of the Royal Statistical Society Series B (Methodological) 42:107–8.
Cornwell, C., and P. Schmidt. 1992. “Models for Which the MLE and the Conditional MLE Coincide.” Empirical Economics 17:67–75.
Haining, R., D. Griffith, and D. Bennett. 1989. “Maximum Likelihood with Missing Spatial Data and with an Application to Remotely Sensed Data.” Communications in Statistics - Theory and Methods 18:1875–94.
Kapoor, M., H. H. Kelejian, and I. R. Prucha. 2007. “Panel Data Models with Spatially Correlated Error Components.” Journal of Econometrics 140:97–130.
Kelejian, H. H., and I. R. Prucha. 2001. “On the Asymptotic Distribution of the Moran I Test with Applications.” Journal of Econometrics 104:219–57.
Kelejian, H. H., and I. R. Prucha. 2010a. “Spatial Models with Spatially Lagged Dependent Variables and Incomplete Data.” Journal of Geographical Systems 12:241–57.
Kelejian, H. H., and I. R. Prucha. 2010b. “Specification and Estimation of Spatial Autoregressive Models with Autoregressive and Heteroskedastic Disturbances.” Journal of Econometrics 157:53–67.
Khatri, C. G. 1968. “Some Results for the Singular Normal Multivariate Regression Models.” Sankhya: The Indian Journal of Statistics Series A 30:267–80.
Kiefer, N. M. 1980. “Estimation of Fixed Effect Models for Time Series of Cross-Sections with Arbitrary Intertemporal Covariance.” Journal of Econometrics 14:195–202.
Lee, L., and J. Yu. 2010a. “Estimation of Spatial Autoregressive Panel Data Models with Fixed Effects.” Journal of Econometrics 154:165–85.
Lee, L., and J. Yu. 2010b. “Some Recent Developments in Spatial Panel Data Models.” Regional Science and Urban Economics 40:255–71.
LeSage, J. P., and R. K. Pace. 2004. “Models for Spatially Dependent Missing Data.” Journal of Real Estate Finance and Economics 29:233–54.
Little, R. J. A., and D. B. Rubin. 2002. Statistical Analysis with Missing Data. 2nd ed. New Jersey: John Wiley.
Magnus, J. R., and H. Neudecker. 2007. Matrix Differential Calculus with Applications in Statistics and Econometrics. 2nd ed. New York: John Wiley.
Martin, R. J. 1984. “Exact Maximum Likelihood for Incomplete Data from a Correlated Gaussian Process.” Communications in Statistics - Theory and Methods 13:1275–88.
Moran, P. 1950. “Notes on Continuous Stochastic Phenomena.” Biometrika 37:17–23.
Mutl, J., and M. Pfaffermayr. 2010. “A Note on the Cliff and Ord Test for Spatial Correlation in Panel Models.” Economics Letters 108:225–28.
Nicoletti, C. 2006. “Nonresponse in Dynamic Panel Data Models.” Journal of Econometrics 132:461–89.
Rao, C. R. 1973. Linear Statistical Inference and Its Applications. 2nd ed. New Jersey: John Wiley.
Rubin, D. B. 1976. “Inference and Missing Data.” Biometrika 63:581–92.