Abstract
This article compares a general cross-lagged model (GCLM) to other panel data methods based on their coherence with a causal logic and pragmatic concerns regarding modeled dynamics and hypothesis testing. We examine three “static” models that do not incorporate temporal dynamics: random- and fixed-effects models that estimate contemporaneous relationships; and latent curve models. We then describe “dynamic” models that incorporate temporal dynamics in the form of lagged effects: cross-lagged models estimated in a structural equation model (SEM) or multilevel model (MLM) framework; Arellano-Bond dynamic panel data methods; and autoregressive latent trajectory models. We describe the implications of overlooking temporal dynamics in static models and show how even popular cross-lagged models fail to control for stable factors over time. We also show that Arellano-Bond and autoregressive latent trajectory models have various shortcomings. By contrasting these approaches, we clarify the benefits and drawbacks of common methods for modeling panel data, including the GCLM approach we propose. We conclude with a discussion of issues regarding causal inference, including difficulties in separating different types of time-invariant and time-varying effects over time.
Many methods exist for analyzing panel data (e.g., Arellano, 2003; Bollen & Curran, 2006; Box, Jenkins, & Reinsel, 2008). Yet, only some capitalize on the structure of panel data to offer a clear path to causal inferences. To justify such inferences, a theory of causality must be mapped onto a statistical model while addressing potential threats to causal inference (Granger, 1969, 1980). However, this is rarely done explicitly in most applications of panel data analysis.
We seek to promote a better understanding of causal modeling with panel data by showing the strengths and weaknesses of different panel data methods. For this, we use a coherence-based approach for comparing methods, while also being sensitive to their more pragmatic features. In terms of coherence, organization scholars note that it is crucial for developing and justifying theories by showing their link to existing logics and empirical findings (Locke & Golden-Biddle, 1997; Shepherd & Sutcliffe, 2011). In other words, a theory is justified if it fits with preexisting “background systems” or “webs of belief” in a community of researchers (see philosophical thought by Davidson, 1986; Lehrer, 2000; Quine & Ullian, 1970).
This approach is often implicitly used to justify methods, such as by arguing for their coherence with a psychometric logic (Bagozzi & Edwards, 1998; Edwards, 2011). We formally take this approach to evaluate panel data models based on their coherence with a typical view of causality, including (a) a cause→effect temporal order, (b) possible bidirectional effects among variables, and (c) controls for potential confounders. Yet, we also recognize that models have a pragmatic character in terms of the range of dynamic processes they can capture, the richness of information from hypothesis tests, and their ease of use. We use these criteria rather than, for example, Monte Carlo simulations because the latter start by assuming that one model is true—which is not knowable in practice—to show the foregone conclusion that others are problematic.
In what follows, we begin by describing the general cross-lagged model (GCLM) from our first article, treating its relation to causal inference and the importance of accounting for unit effects (i.e., stable factors) and temporal dynamics (i.e., the dependence of the future on the past; Baltagi, 2013b; Hsiao, 2014). We also note the range of system dynamics and hypothesis tests associated with a GCLM, including short-run and long-run effects. Then, we contrast the GCLM against alternative panel data models, many of which are very common in organization research.
Throughout, we distinguish static and dynamic models, where only dynamic models treat dependence of the future on the past with lagged effects. We first present static models: random-effects models; fixed-effects models (i.e., group-mean centered or within-group approaches); and latent curve models. We then treat dynamic models: cross-lagged models as structural equation models (SEM) or multilevel models (MLM), including with group-mean centering; econometric Arellano-Bond dynamic panel data models; and autoregressive latent trajectory models. By evaluating these methods based on a logic of causality and pragmatic concerns, we show how: static models make causal inference problematic by excluding lagged effects; unit effects are left uncontrolled in cross-lagged models and group-mean centering produces dynamic panel bias; and Arellano-Bond and autoregressive latent trajectory methods have various shortcomings. Online materials available at https://doi.org/10.26188/5c9ec7295fefd offer Mplus and Stata output, and Online Appendix A provides comparisons using data from Van Iddekinge et al. (2009) and Meier and Spector (2013), as in our first article.
We conclude with recommendations for how to match panel data models to theory and the context of research, as well as some of the limitations for causal inference associated with controlling for confounds in panel data models. We also note that there are other methods for analyzing panel data such as latent change score models (i.e., a model in differences; McArdle, 2001, 2009), but these can be seen as special cases of a GCLM (e.g., Voelkle & Oud, 2015).
The General Cross-Lagged Model (GCLM)
In our first article, we started with a cross-lagged model for a unit i at an occasion t, for N units and T occasions, measured for two variables
wherein
Here, AR and CL terms can be seen as indirect effects of past impulses on the future (seen by path tracing from an early

A Full GCLM With AR(1)MA(1)CL(1)CLMA(1) Effects.
wherein a unit effect
The GCLM strongly coheres with the logic of causality noted above: (a) causes precede effects via lagged predictors; (b) bidirectional effects are allowed by all variables predicting each other; (c) potential confounds are controlled as occasion effects and unit effects that can induce aggregate and unit-specific trends, respectively, while AR terms hold the past constant to assess predictors’ unique effects. As for the pragmatic nature of a GCLM: (a) MA and CLMA terms enhance the range of dynamic processes it can model, with MA and CLMA terms that allow large temporary effects (e.g., small positive AR/CL effects and large positive MA/CLMA effects) or small persistent effects (e.g., large positive AR/CL effects and moderate negative MA/CLMA effects); (b) hypothesis tests offer rich information as short-run effects that take the form of CL + CLMA terms (e.g.,
To give some context to these assertions, we offer an abridged description of the panel data and results from our first article. For this, it is important to keep in mind that estimating a GCLM requires choosing some number of unit effects (typically one
Results for Dynamic Models.
Note: Columns are named after the models described in the text as follows: Model 1 is our full AR(1)MA(2)CL(1)CLMA(1) GCLM; 2.1 is a cross-lagged SEM; 2.2 is a cross-lagged MLM; 2.3 is a group-mean or within-group centered cross-lagged model; 2.4 is a cross-lagged SEM with time-varying unit effects; 3.1 is a “system-GMM” model in differences and levels; 3.2 is a difference-GMM; 3.3 is an SEM version of a latent fixed-effects model; 4.1 is an ALT with a linear trend to represent unit effects; 4.2 is an ALT with a single time-varying unit effect. ALT = autoregressive latent trajectory; AR = autoregressive; CL = cross-lagged; CLMA = cross-lagged moving average; GCLM = general cross-lagged model; GMM = generalized method of moments; MA = moving average; MLM = multilevel model; SEM = structural equation model; SWB = subjective well-being.
* p < .05.
In terms of the dynamics implied by this model, income is highly persistent with an AR
Alternative Approaches to Panel Data Analysis
Different researchers often use different methods for panel data analysis (contrast Baltagi, 2013b; Pitariu & Ployhart, 2010; Raudenbush, 2001). Yet, only some of these clearly map a causal logic to model parameters. To treat the coherence of different methods with a logic of causality and their pragmatic features, we treat common static and dynamic models. There are many variations on these models, but we offer typical specifications and descriptions. For clarity, in our text and figures u refers to a random impulse whereas we use ∊ as a residual that, as we note, may be conflated with unmodeled lagged effects, occasion effects, and/or unit effects.
Static Models
We define static models as those that do not specify dependence of the future on the past (i.e., excluding lagged effects; Hsiao, 2014). Thus, by “static” we are referring to the nature of a statistical model rather than the data used for estimation. We begin with two common MLMs in the form of random- and fixed-effects specifications, and then discuss latent curve models.
Random-effects MLM
The MLM has gained substantial prominence in organizational science over the past 30 years (e.g., Hofmann, 1997). This approach recognizes that observations can be hierarchically structured, in our case T = 6 observations of SWB and income “nested” in N = 135 countries. With this clustering, an MLM estimates relationships while modeling variation in outcomes due to lower-level factors across N and T versus higher-level factors across N. For example, Bloom (1999) predicted baseball player performance at multiple seasons with variables such as contemporaneous performance opportunity; Gulati (1995) predicted firm alliances at multiple years with contemporaneous measures of firm interdependence. Thus, inherent in this approach is treating the data as if they were a collection of T cross-sections (see Figure 2).

Random-Effects Model When Latent Unit Effect Covariance Restricted to Zero (Dashed Line), but Fixed-Effects Model Accounting for Unit Effects When Latent Covariance Estimated.
With SEM notation, we show this in an MLM that controls for occasion effects:
where a subscript “0” on β indicates a contemporaneous relationship,
As Figure 2 shows, each
This approach causes concerns regarding causal inference and the pragmatic nature of the model. The first causal concern relates to the temporal nature of the effects. Without modeling temporal priority among the variables, effects like
The next concern relates to possible bidirectional effects and controlling for confounds. Because
Consider if
In terms of the pragmatic features of the model, although they are easy to estimate, the absence of lagged effects means that no dynamic processes can be accommodated, so short-run Granger-Sims tests and long-run impulse responses are precluded. This raises questions about the practical use of
In sum, random-effects MLM fails to adequately cohere with a logic of causality and suffers from pragmatic issues compared to a GCLM. By this, we do not mean that MLMs are wholly bad or wrong. They can be useful when directions of causal effects are known, unit effects can be assumed uncorrelated or irrelevant, and lagged effects are irrelevant, possibly because of noisy data or because they are too distant in time. Thus, we do not categorically recommend against the model. Instead, we merely clarify issues associated with limited causal inference and pragmatic concerns. Of course, many researchers understand some of these issues, which leads us to the more common fixed-effects MLM specification for analyzing panel data.
Fixed-effects MLM
The fixed-effects MLM is equivalent to Eqs. 3 and 4, but it controls for unit effects—this is what econometricians often mean by “fixed effects.” For example, Judge, Ilies, and colleagues do this to eliminate stable individual differences to estimate within-person relationships among affect, job attitudes, work stressors, and the like (Ilies, Johnson, Judge, & Keeney, 2011; Ilies, Scott, & Judge, 2006; Judge & Ilies, 2004; Judge, Scott, & Ilies, 2006). This eliminates person-specific trends to estimate
This can be done in various ways (Halaby, 2004), classically with predictors to (dummy) code
However, in terms of the coherence of the model with a causal logic and its pragmatic nature, the same problems exist as for the random-effects MLM, save for holding unit effects constant. Estimated as two separate models, this produces a within-country income→SWB term for Eq. 3:
In sum, static fixed-effects MLMs are common and control for unit effects. Yet, they fail to cohere with the logic of causality described previously and create pragmatic dilemmas for modeling dynamic effects over time. Again, this does not mean they are wholly problematic and may even be considered acceptable misspecifications when time lags are too distant or data are too noisy to observe lagged effects, but compared to a GCLM they have multiple limitations.
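The difference between leaving unit effects uncontrolled and holding them constant can be illustrated with a small simulation. This is a minimal sketch rather than the analysis reported in this article. It assumes a hypothetical data-generating process in which a stable unit effect (`eta`) raises both the predictor and the outcome, and the function names `simulate_panel`, `pooled_slope`, and `within_slope` are illustrative. The pooled estimator is a simplification that, like a random-effects specification with uncorrelated unit effects, does not remove the confounding, whereas group-mean centering does.

```python
import random


def simulate_panel(n_units=500, n_occasions=6, beta=0.5, seed=1):
    """Simulate y_it = beta * x_it + eta_i + e_it, where x_it also depends on eta_i.

    All values are hypothetical; eta_i plays the role of a stable unit effect.
    Returns a dict mapping each unit to a list of (x, y) observations.
    """
    rng = random.Random(seed)
    panel = {}
    for unit in range(n_units):
        eta = rng.gauss(0.0, 1.0)
        rows = []
        for _ in range(n_occasions):
            x = eta + rng.gauss(0.0, 1.0)
            y = beta * x + eta + rng.gauss(0.0, 1.0)
            rows.append((x, y))
        panel[unit] = rows
    return panel


def ols_slope(pairs):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


def pooled_slope(panel):
    """Ignore the panel structure and pool all unit-occasion observations."""
    return ols_slope([pair for rows in panel.values() for pair in rows])


def within_slope(panel):
    """Group-mean center x and y within each unit, then pool (within estimator)."""
    centered = []
    for rows in panel.values():
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        centered.extend((x - mean_x, y - mean_y) for x, y in rows)
    return ols_slope(centered)


panel = simulate_panel()
print("pooled slope:", round(pooled_slope(panel), 3))
print("within slope:", round(within_slope(panel), 3))
```

Under these assumed values, the pooled slope should land well above the generating value of 0.5, while the within slope should sit close to it. Neither estimate says anything about temporal order, which is the limitation of static models discussed above.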
Latent Curve Model (LCM)
Another extremely common model is the LCM (i.e., latent growth model; McArdle & Nesselroade, 2003), which estimates unit-specific trends over time. However, these are static models because they omit lagged effects, even though researchers often refer to them as indicating “dynamic” relationships among trends. For example, Pitariu and Ployhart (2010) illustrate this by predicting employee performance trends using trends in effort over time, while predicting both of these trends with a time-invariant measure of team diversity.
To critically explore this logic, we start with a simple and familiar example of an LCM using SEM notation (see Figure 3; for alternatives, see Bollen & Curran, 2006; Curran, 2003):

Latent Curve Model Showing Latent Means as

with all terms as before except for two changes. First,
Results show positive growth for income and SWB: income intercept
To first tackle the issue of causality as a temporal process, in LCM there seems to be “an effect of time” by using it as a predictor, leading some researchers to treat trends as if time was their cause (e.g., Curran & Bauer, 2011; Curran et al., 2012; Wang & Maxwell, 2015). Yet, “an effect of time” here is potentially misleading, as time defines causality rather than itself being a causal factor (Pitariu & Ployhart, 2010; Voelkle & Oud, 2015). In turn, by conceptualizing time as a cause, researchers can easily overlook the causal processes that may be of interest, such as socialization, institutionalization, or maturation. Indeed, “although time is inextricably linked to the concept of development, in itself it cannot explain any aspect of developmental change” (Baltes, Reese, & Nesselroade, 1988, p. 108). In turn, perhaps the GCLM offers a better way to treat trends as interactions among time- and unit-specific factors
This point leads to concerns over the LCM’s static nature. As Eqs. 5 and 6 show, there are no temporal dynamics as dependence of the future on the past, so causal effects are not modeled. Instead, the variables available for causal inference are time-invariant factors like
Next, in terms of bidirectional effects and controlling for confounds, without relying on a time-varying element, there is no way to know if
To explain,
Finally, in terms of LCM’s pragmatic nature, it allows descriptive curve fitting, but it does not incorporate dynamic effects. In turn, it offers little help for planning interventions—how should this be done using
Dynamic Models
Dynamic models are differentiated from static models by incorporating lagged effects such as AR, MA, CL, and CLMA terms. In what follows we discuss common dynamic models and explore their coherence with a causal logic along with their more pragmatic characteristics.
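To make the role of these four kinds of lagged terms concrete, the following minimal sketch traces how a one-unit impulse propagates through a two-variable system. It assumes a simplified first-order system without unit or occasion effects, the coefficient values are hypothetical rather than estimates from Table 1, and the function names are illustrative. At lag 1 an impulse acts through the lagged term plus the corresponding moving-average term (for cross effects, CL + CLMA), and at later lags it propagates only through AR and CL terms.

```python
def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def impulse_responses(lagged, moving_avg, horizon):
    """Responses of (x, y) at lags 0..horizon to a one-unit impulse.

    lagged[i][j]     : effect of variable j at t-1 on variable i at t
                       (AR terms on the diagonal, CL terms off it).
    moving_avg[i][j] : effect of the impulse to j at t-1 on variable i at t
                       (MA terms on the diagonal, CLMA terms off it).
    Entry [i][j] of each returned matrix is the response of variable i
    to an impulse in variable j.
    """
    responses = [[[1.0, 0.0], [0.0, 1.0]]]  # lag 0: impulse hits its own variable
    if horizon >= 1:
        # lag 1: lagged effect plus the moving-average effect of the same impulse
        responses.append([[lagged[i][j] + moving_avg[i][j] for j in range(2)]
                          for i in range(2)])
    for _ in range(2, horizon + 1):
        # later lags: the impulse propagates only through AR and CL terms
        responses.append(matmul2(lagged, responses[-1]))
    return responses


# Hypothetical coefficients; index 0 is x, index 1 is y.
lagged = [[0.8, 0.0],        # x_t on x_{t-1} (AR), x_t on y_{t-1} (CL)
          [0.1, 0.3]]        # y_t on x_{t-1} (CL), y_t on y_{t-1} (AR)
moving_avg = [[-0.2, 0.0],   # MA for x, CLMA from y to x
              [0.05, 0.1]]   # CLMA from x to y, MA for y

irf = impulse_responses(lagged, moving_avg, horizon=5)
for lag, matrix in enumerate(irf):
    print(lag, "response of y to an x impulse:", round(matrix[1][0], 4))
```

A static model corresponds to setting every entry of both matrices to zero, in which case an impulse has no consequences beyond the occasion at which it occurs.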
Cross-Lagged Models
As we have described, organizational researchers regularly use a cross-lagged model to analyze panel data as follows (see Figure 4a):

(a) Typical Cross-Lagged Model. (b) Cross-Lagged Model Modified to Account for Unit Effects η.

wherein all terms are as described previously, except we show a residual as
Model fit is adequate (CFI = .96, TLI = .95, SRMR = .07, RMSEA = .10), with AR terms showing high persistence for income
In terms of a logic of causation, the cross-lagged model has the benefit of incorporating lagged effects and accounts for the possibility of bidirectional effects. However, compared to GCLM results, SWB has much stronger persistence over time, which seems misaligned with past findings of SWB being mean-reverting (e.g., Clark, Frijters, & Shields, 2008; Diener & Lucas, 1999). Similarly, CL effects are larger and statistically significant, and the SWB→income effect changes sign. One reason for these differences is that conventional cross-lagged models do not control for unit effects
Similar problems arise in MLMs, which can be used to estimate similar models (Beal & Weiss, 2003; Bolger & Laurenceau, 2013; Griffin, 1997; Nezlek, 2001, 2008, 2011, 2012a, 2012b; see also Kling, Harvey, & Maclean, 2017). To start, a random-effects MLM also fails to control unit effects (e.g., Schonfeld & Rindskopf, 2007), which we show by estimating Eqs. 7 and 8 with Stata’s “xtreg, mle” (see Table 1, Model 2.2), resulting in similar AR terms of income
This problem of bias in MLM has been recognized and many authors attempt to solve it by group-mean centering their data (i.e., a “within-group” or WG model), as if to estimate a fixed-effects static model (e.g., Beal, Trougakos, Weiss, & Green, 2006; Bono, Foldes, Vinson, & Muros, 2007; Dalal, Lam, Weiss, Welch, & Hulin, 2009; Fisher & Noble, 2004; Gielnik, Spitzmuller, Schmitt, Klemann, & Frese, 2015; Hoffman, 2015; Ilies et al., 2006; Ilies et al., 2011; Rovine & Walls, 2006). This is often done because researchers believe that this centering presents no issues beyond those of static MLMs (e.g., Beal, 2015; Duckworth, Tsukayama, & May, 2010; Enders & Tofighi, 2007; see also Uy, Foo, & Aguinis, 2010). However, this is not the case—it causes “dynamic panel bias.”
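A brief simulation can make the direction of these biases concrete. It is offered only as an illustration under an assumed data-generating process, not as a way of adjudicating among models. The sketch generates a single AR(1) outcome with a stable unit effect, then estimates the AR term once while ignoring unit effects and once after group-mean centering. The parameter values and function names are hypothetical.

```python
import random


def simulate_ar1_panel(n_units=2000, n_occasions=6, rho=0.5, seed=7):
    """Simulate y_it = rho * y_i,t-1 + eta_i + u_it with a stable unit effect eta_i.

    Values are hypothetical. The first occasion is drawn from the stationary
    distribution so that early observations already reflect eta_i.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        eta = rng.gauss(0.0, 1.0)
        y = eta / (1.0 - rho) + rng.gauss(0.0, 1.0) / (1.0 - rho ** 2) ** 0.5
        series = [y]
        for _ in range(n_occasions - 1):
            y = rho * y + eta + rng.gauss(0.0, 1.0)
            series.append(y)
        panel.append(series)
    return panel


def ols_slope(pairs):
    """Slope from a simple regression of the second element on the first."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


def pooled_ar(panel):
    """AR estimate that leaves the unit effect uncontrolled."""
    return ols_slope([(s[t - 1], s[t]) for s in panel for t in range(1, len(s))])


def within_ar(panel):
    """AR estimate after group-mean centering the outcome and its lag."""
    centered = []
    for s in panel:
        pairs = [(s[t - 1], s[t]) for t in range(1, len(s))]
        mean_lag = sum(lag for lag, _ in pairs) / len(pairs)
        mean_out = sum(out for _, out in pairs) / len(pairs)
        centered.extend((lag - mean_lag, out - mean_out) for lag, out in pairs)
    return ols_slope(centered)


panel = simulate_ar1_panel()
print("generating AR term: 0.5")
print("pooled estimate:", round(pooled_ar(panel), 3))
print("group-mean centered estimate:", round(within_ar(panel), 3))
```

With few occasions per unit, the pooled estimate should fall above the generating value and the group-mean centered estimate below it, even though the number of units is large. Increasing `n_units` alone should not remove the downward bias, whereas increasing `n_occasions` should shrink it.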
To explain, centering the variables by subtracting unit means produces:
which attempts to control for unit effects by assuming a unit average such as
To understand this point, consider the T = 2 case for
For over 40 years econometricians have known about this problem in models that include lagged effects, which they refer to as dynamic panel bias (Nerlove, 1967, 1971; Sevestre & Trognon, 1985). As Nickell described in 1981, the negative bias for AR terms takes the form:
Although this bias is reduced as
To show the problem, we estimate Eqs. 9 and 10 using Stata’s “xtreg, fe” (see Table 1, Model 2.3), which group-mean centers all variables—the same results emerge when group-mean centering in cross-lagged SEM. The result is very small AR estimates for income
In sum, cross-lagged SEM and random-effects MLM are similarly biased, and group-mean centering or other approaches to estimate fixed-effects models lead to dynamic panel bias. This said, “the fact that these two estimators are likely to be biased in opposite directions [for AR effects] is useful. Thus we might hope that a candidate consistent estimator will lie between the…[two AR] estimates” (Bond, 2002, p. 144). There are various ways to do this (e.g., Allison, Williams, & Moral-Benito, 2017; Asparouhov et al., 2018; Hamaker et al., 2015), such as a cross-lagged SEM with
This improves model fit over the cross-lagged SEM, ostensibly because AR and CL effects are no longer tasked with accounting for unit effect (co)variance (CFI = .98, TLI = .97, SRMR = .03, RMSEA = .09). Also, AR terms are between the cross-lagged SEM/random-effects MLM and group-mean centered MLM estimates (see Table 1, Model 2.4). The AR term for income is
With this in mind, it is notable that many published cross-lagged models assume
In sum, classic cross-lagged SEMs and MLMs should be avoided when seeking to make causal inferences in panel data. Conveniently, with
Arellano-Bond Methods
There are many econometric approaches to panel data analysis found in organizational research, but a popular example is the Arellano-Bond (AB) method (see overviews in Bond, 2002; Bun & Sarafidis, 2015; for foundational work, see also Arellano & Bond, 1991; Arellano & Bover, 1995; Blundell & Bond, 1998; Holtz-Eakin, Newey, & Rosen, 1988). For example, Piening, Baluch, and Salge (2013) used the AB method to show a positive effect of HR practices on organizational performance (for other examples, see Barkema & Schijven, 2008; Foster, 2010; George, 2005; Goldstein, 2012). We examine this method by briefly outlining its logic and some of its causality-oriented and pragmatic dilemmas, with more details in Online Appendix B (see also Arellano, 2003; Baltagi, 2013b; Roodman, 2009a, 2009b).
The problem that AB methods address is that a unit effect
wherein this subtracts
AB methods attempt to eliminate
To evaluate this method, we first note that it allows a coherent temporal order for causal effects. Also, bidirectional causality and potential confounds are addressed by using lagged instruments, such that the past of a predictor should cause the future of an outcome, but the reverse should not be true. In turn, this is meant to eliminate reverse causation and confounding by common causes. Also, although the model excludes MA and CLMA terms and thus limits the range of potential dynamic processes it can model, there is the pragmatic benefit of a particular long-run effect that can be estimated (Baltagi, 2013b). Specifically, by estimating each model for an outcome separately, this allows a thought experiment wherein a predictor is increased by 1-unit and this is maintained over time, with effects “aggregating” at each occasion via AR terms. This allows computing a long-run effect shown here for the x→y case as
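The thought experiment just described can be sketched numerically. The sketch assumes a single CL term from x to y and a single AR term for y, with hypothetical values, and uses the standard geometric-series result that a sustained 1-unit increase accumulates to CL / (1 − AR) when the AR term lies strictly between −1 and 1. The function names are illustrative and the closed form is supplied here as an assumption about the quantity being described, not copied from the source.

```python
def long_run_effect(cl, ar):
    """Limit of the accumulated effect of a sustained 1-unit increase in x on y."""
    if abs(ar) >= 1.0:
        raise ValueError("The AR term must lie strictly between -1 and 1 to converge.")
    return cl / (1.0 - ar)


def accumulated_effect(cl, ar, n_occasions):
    """Track y's response when x is raised by 1 unit and held there.

    Uses the recursion effect_t = ar * effect_{t-1} + cl, starting from 0.
    """
    effect = 0.0
    path = []
    for _ in range(n_occasions):
        effect = ar * effect + cl
        path.append(effect)
    return path


cl_xy, ar_y = 0.2, 0.5  # hypothetical CL (x -> y) and AR (y) coefficients
path = accumulated_effect(cl_xy, ar_y, n_occasions=10)
print([round(value, 4) for value in path])
print("long-run effect:", long_run_effect(cl_xy, ar_y))
```

The occasion-by-occasion path approaches the closed-form value from below when both coefficients are positive, which shows how a modest CL term combined with a persistent outcome can imply a sizeable long-run effect.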
Yet, using the AB method requires checking its assumption that the information from instruments is unrelated to residuals (i.e., instruments should be related to outcomes only via predictors). This is checked by residual autocorrelation and Sargan/Hansen tests with a null hypothesis of no instrument-residual covariance, meaning that small p-values entail rejecting the assumption of valid instruments (i.e., large p-values imply assumptions are met). If these tests show small p-values, instruments can be lagged further until a valid set of instruments is found.
Unfortunately, this approach leads to concerns related to controlling for confounds and the method’s practical implementation. First, using too few instruments causes inefficiency (i.e., large SEs), but using too many causes overfitting that reintroduces unit effects. Also, if changes over time are systematic, differences can correlate with
To show this, we first took a system-GMM approach using Stata’s “xtabond2,” with all available lags to instrument the equations in differences and a single lag for equations in levels (see Table 1, Model 3.1; see Online Appendix B and Stata output). The AR terms show a larger estimate for income when compared to the previous cross-lagged SEM controlling for
When checking the assumption of instrument validity, the Sargan/Hansen tests show p < .5 in levels equations. Yet, with many instruments this p-value is biased toward 1, and thus even this large p-value suggests a potential correlation among
To address these issues, we estimate a second model in differences only to reduce the instrument count, which shows better Sargan/Hansen test results (the smallest p = .683). Here, AR terms are somewhat acceptable but much less efficient (see Table 1, Model 3.2), with income
To overcome the issues of GMM, maximum-likelihood approaches in SEM exist (Allison et al., 2017; Bai, 2013; Moral-Benito, 2013), such as Stata’s “xtdpdml” tool (Williams, Allison, & Moral-Benito, 2018). To emphasize the pragmatic value of SEM for panel data, we illustrate a similar approach using separate models for income and SWB, with a first occasion t = 1 allowed to freely correlate with

Alternative to Arellano-Bond Methods Using SEM (Showing an Income→SWB Model and Excluding Covariance Labels for Concision).
The models we estimate show adequate fit (for income, CFI = .99, TLI = .97, SRMR = .02, RMSEA = .07; for SWB, CFI = .97, TLI = .94, SRMR = .03, RMSEA = .11), and results are consistent with the cross-lagged SEM with covariance among unit effects in Figure 4b (see Table 1, Model 3.3 and compare with Model 2.4). The AR term for income is
In sum, AB methods have issues in terms of controlling for confounds while also having the pragmatic problem of being difficult to use. This may be why researchers caution that “where system GMM offers the most hope, it may offer the least help” (Roodman, 2009b). To overcome the problems of AB methods, an SEM approach using either our GCLM or separate models for dependent variables can be used. In our view, this shows the benefits of our SEM framework in general, even for those trained in an econometric tradition.
Autoregressive Latent Trajectories
The autoregressive latent trajectory (ALT) model combines cross-lagged and LCM methods (Bollen & Curran, 2004, 2006; Bollen & Zimmer, 2010; Curran & Bollen, 2001). Although ALT is not common in organizational research, we include it here because it models both trends and lagged effects, which addresses many causal and pragmatic concerns we have with other methods. We show an ALT as follows (Figure 6a):

(a) Autoregressive Latent Trajectory Model With a Linear Trajectory and Showing Latent Means as μ Terms (Excluding Covariance Labels for Concision). (b) Autoregressive Latent Trajectory Model With a Model-Estimated Trajectory and Showing Latent Means as μ Terms (Excluding Covariance Labels for Concision).

wherein all terms are as before for the LCM, with
Although ALT models are often given substantive interpretations for trends associated with
In our demonstration, maximum-likelihood estimation of the first ALT failed, as we often encounter in the presence of missing data in early occasions that are not treated as dependent variables—a potential pragmatic issue of the ALT. To solve this, we used a Bayes procedure with Markov Chain Monte Carlo estimation with “uninformative” or “diffuse” priors to approximate maximum-likelihood results (Muthén & Asparouhov, 2012; see Online Appendix C for more details). For consistency, we report results using frequentist concepts such as t-values and p-values, but rely on their Bayesian analogues that are based on posterior distributions.
Results for the first model in Figure 6a show effects that are different from those above (Table 1, Model 4.1). The AR term for income is much smaller than that found in our GCLM, with
Alternatively, the second ALT in Figure 6b with a single term
Although the second ALT with time-varying unit effects shows AR terms that appear more reasonable, it has an income→SWB effect that is not supported in our GCLM or the cross-lagged model that controls for
Perhaps what is most important about these comparisons is that we do not find major differences between AR and CL effects when using models that account for time-varying unit effects terms and allow occasion effects in SEM (Figure 4b, Table 1, Model 2.4; Figure 6b, Table 1, Model 3.3). Indeed, these models show effects that are similar to those from our full GCLM (Figure 1, Table 1, Model 1), unlike models that: ignore
Discussion
We have compared various static and dynamic panel data models to our GCLM. The static models we treat are random- and fixed-effect models and LCMs, which offer no clear path to causal inference as a temporal process, with LCMs assuming all trends are systematic rather than having elements of randomness. The dynamic models have other problems: typical cross-lagged models fail to control for
Although we do not compare all panel data models that appear in organization science, other models can often be understood in ways that are consistent with the kinds of comparisons we draw above (e.g., Chow, Ho, Hamaker, & Dolan, 2010; Hamaker & Dolan, 2009; Hamaker, Nesselroade, & Molenaar, 2007; Nesselroade, McArdle, Aggen, & Meyers, 2002). For example, latent change models and related approaches merely estimate effects among variables in differences (e.g., Box et al., 2008; McArdle, 2001, 2009; McArdle & Hamagami, 2001, 2004). These models can be reparameterized as cross-lagged SEMs or panel vector autoregressive models (see Allen & Fildes, 2001; Arellano, 2003; Bai, 2013; Baltagi, 2013a, 2013b; McArdle & Nesselroade, 2014; Moral-Benito, 2013; Usami, Hayes, & McArdle, 2015; Voelkle & Oud, 2015). Thus, such models will encounter the same issues we describe when they do not properly control for
When considering such parameters, as we have noted, a process should be mapped onto a statistical model using theory and previous findings, as well as substantive and statistical checking. For this purpose, not all of the terms that the GCLM includes need to be specified, but researchers should know that they are available in SEM if they are deemed to be of interest. Indeed, past research has recognized all of the terms included in the GCLM in various ways (e.g., du Toit & Browne, 2001, 2007; Hamaker, Dolan, & Molenaar, 2002; Hamaker et al., 2015; Hamaker & Grasman, 2015), and therefore a GCLM can be seen as bringing these terms together in a coherent and easy-to-implement SEM framework—facilitated by our online supplemental material that allows automatically generating Mplus program code using an Excel file.
This said, especially the comparison of our model with the ALT brings up important issues regarding competition, so to speak, among unit effects and the AR, MA, CL, and CLMA terms that we use for causal inference. The problem is that unit effects can be parameterized in a wide variety of ways (just as AR, MA, CL, and CLMA terms can be), and these specifications will produce different kinds of competition among parameters to explain auto- and cross-covariation (as illustrated by the ALT models we estimate). This issue, in a very general sense, was first discussed by “Student” (1914), who treated time as a predictor in order to detrend longitudinal data (for historical developments, see Hooker, 1905; Tintner, 1940; Yule, 1921, 1926). As Yule (1921) noted in commenting on Student’s approach, the problem was that “if ‘Student’ [1914] desires to remove from his figures secular movements, periodic movements, uniform movements, and accelerated movements—well the reader is left wondering with what sort of movements he does desire to deal…. He desires to find the correlation between x and y when every component in each of the variables is eliminated which can well be called a function of the time, and nothing is left but residuals such that the residual of a given year is uncorrelated with those that precede or that follow it…. [However], the only residuals which it is easy to conceive as being totally uncorrelated with one another in the manner supposed are errors of observation” (pp. 502-504).
Again, Yule pointed this out long ago, noting that, “it is not my view alone but the view of most writers on the subject up to 1914, that the essential difficulty of the time-correlation problem is the difficulty of isolating for study different components in the total movement of each variable” (1921, p. 501). Unfortunately, no single solution to this problem exists—or can exist—that is applicable to all research contexts. Given the uncertainties here, we recommend including a single time-varying unit effect term
In conclusion, as we noted in our first article, panel data models are not a panacea for unconditional causal inference, just as randomized controlled trials are not (Cartwright & Hardie, 2012). From a practical perspective, causal inference under any approach is meant to allow using past observations to plan and execute actions such as interventions or policy changes that are designed to work for a set of specific purposes (Heckman, 2003, 2005). This practical orientation should be kept in mind when both accounting for trends with unit effects and interpreting AR, MA, CL, and CLMA terms along with their Granger-Sims and impulse-response counterparts. No empirical method secures the future against uncertainty, but panel models like the GCLM can be a useful complement to other methods for making plans and acting in the face of uncertainty.
Supplemental Material
Supplemental Material, Paper_II_Online_Appendices_final_version_Word2016 for From Data to Causes II: Comparing Approaches to Panel Data Analysis by Michael J. Zyphur, Manuel C. Voelkle, Louis Tay, Paul D. Allison, Kristopher J. Preacher, Zhen Zhang, Ellen L. Hamaker, Ali Shamsollahi, Dean C. Pierides, Peter Koval and Ed Diener in Organizational Research Methods
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Australian Research Council’s Future Fellowship scheme (Project FT140100629).
Supplemental Material
Supplemental material for this article is available online at https://journals-sagepub-com.web.bisu.edu.cn/doi/suppl/10.1177/1094428119847280