A popular way to reduce confounding in observational studies is to use each study participant as his or her own control. This is possible when both the exposure and the outcome are time varying and have been measured at several time points for each individual. The case-time-control method is a special case, which, under certain assumptions, allows the analyst to control for confounding by time-varying covariates, while controlling for all time-stationary characteristics of the study participants. There are two formulations of the case-time-control method. One formulation requires that the exposure be binary, and the other requires that there be no more than two time points per individual. In this article the author proposes a generalization of the case-time-control method for nonbinary exposures and an arbitrary number of time points. The author derives the asymptotic properties of the resulting estimator and assesses its finite sample properties in a simulation study.
A common aim of sociological and epidemiological research is to estimate the causal effect of an exposure (or treatment) on a binary outcome. In observational (nonrandomized) studies, confounding bias is always a concern. If both the exposure and the outcome are time varying and have been measured at several time points for each individual, then confounding by all time-stationary covariates can be eliminated by using each individual as his or her own control. In epidemiological jargon, this strategy is broadly referred to as the case-crossover method (Maclure 1991). A common analytic tool is the conditional logistic regression model, which conditions on the individual and thereby implicitly controls for all individual-specific characteristics that are constant over time (Allison and Christakis 2006).
In many situations the binary outcome is “absorbing”; that is, after it has occurred, the individual is no longer part of the study. The most obvious example is when the outcome is an indicator of death (i.e., 0 for alive, 1 for dead). Often, there may also be time-varying covariates, such as age, income, and socioeconomic status, that we wish to control for in the conditional logistic regression model. However, when at least one of these confounders, such as age, is monotonically increasing with time, it is no longer possible to fit the model. This is because such a confounder perfectly predicts the outcome within the individual, making it impossible to get a within-individual estimate of the coefficient for that covariate (Allison and Christakis 2006).
The case-time-control method (Suissa 1995) has been proposed as a solution to this problem. Briefly, the idea is to let the exposure and outcome switch places in the conditional logistic regression model. Provided that the exposure is not monotonically increasing with time, the model will no longer have convergence problems.
The original formulation of the case-time-control method by Suissa (1995) does not require the exposure to be binary. Indeed, in his application of the method, Suissa also considered dichotomous, trichotomous, and continuous coding of the exposure. However, this formulation works only when the exposure and outcome are observed at exactly two time points for all individuals. Allison and Christakis (2006) used a slightly different formulation of the case-time-control method, which allows an arbitrary number of time points but requires that the exposure be binary. Fafchamps, van der Leij, and Goyal (2010) proposed a related method to control for confounding by time, based on detrending of the exposure. This method works for nonbinary exposures and an arbitrary number of time points, but it is somewhat ad hoc and has not been justified by theoretical arguments.
The aim of this article is to propose a theoretically justified generalization of the case-time-control method for nonbinary exposures and an arbitrary number of time points. The article is structured as follows. Section 2 presents a motivating example with real data, then Section 3 introduces basic notation and definitions and reviews the formulations of the case-time-control method by Suissa (1995) and Allison and Christakis (2006). Section 4 reviews the detrending method proposed by Fafchamps et al. (2010) and shows by example that this method can give considerable bias. Section 5 presents my generalization of the case-time-control method for nonbinary exposures and an arbitrary number of time points, and Section 6 carries out a small simulation study to assess the finite sample properties of the proposed method. Section 7 returns to the motivating example and reanalyzes it with my proposed generalization of the case-time-control method. Like all statistical methods, the case-time-control method relies on certain assumptions, and Section 8 concludes the article by providing a detailed discussion of these.
2. Motivating Example
As a motivating example I use the data set teenpov, borrowed from Allison (2009), which contains information on 1,151 teenage girls who were interviewed annually for five years beginning in 1979. The data set is structured so that one row corresponds to one interview, which makes a total of 1,151 × 5 = 5,755 rows. Such data are often referred to as “panel data.” I use the following variables: ID (a unique subject-identifier), nonpov (1 if the girl is currently not in poverty according to U.S. federal standards, 0 otherwise), hours (the number of hours currently worked per week), inschool (1 if the girl is currently enrolled in school, 0 otherwise), spouse (1 if the girl is currently living with a spouse, 0 otherwise), age (the girl’s current age), and mother (1 if the girl currently has at least one child, 0 otherwise).
The aim of this article is to investigate how much each additional working hour increases the chance of making a transition from poverty to nonpoverty. With this research question in mind, I restrict attention to those girls who were in poverty at the first interview, and I follow them until the first interview when they were no longer in poverty or until the fifth interview, whichever came first. After this restriction, the data set contains 1,342 interviews or rows from 401 girls. My outcome is nonpov, and my “exposure” is hours. Arguably, the covariates inschool, spouse, age, and mother may be important confounders, which I thus wish to control for in the analysis. I note that all these covariates are time varying; that is, they may change during follow-up within each individual.
As an initial analysis I consider the logistic regression model
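$$\text{logit}\,\Pr(\text{nonpov}=1 \mid \text{hours}, \text{inschool}, \text{spouse}, \text{age}, \text{mother}) = \alpha + \psi\,\text{hours} + \gamma_1\,\text{inschool} + \gamma_2\,\text{spouse} + \gamma_3\,\text{age} + \gamma_4\,\text{mother}.$$
Here, the parameter $\psi$ is the conditional log-odds ratio between hours and nonpov, given inschool, spouse, age, and mother. I fit the logistic regression model with the glm function in R, by typing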
> fit <- glm(formula=nonpov~hours+inschool+spouse+age+mother, family="binomial", data=teenpov)
and I obtain the model summary by typing
> summary(fit)
The estimate of the coefficient $\psi$ for hours is equal to 0.033, which indicates that each additional working hour multiplies the odds of making a transition from poverty to nonpoverty by $e^{0.033} \approx 1.03$. This association is highly statistically significant, so it is most likely not a chance finding.
However, we may still worry about unmeasured confounding. For instance, it is easy to imagine that unmeasured genetic or lifestyle factors affect both a person’s ability or ambition to work and the person’s ability to make a transition from poverty to nonpoverty. A popular way to deal with unmeasured confounding in sociology and epidemiology is to use the individual as his or her own control; in epidemiological jargon this is referred to as the “case-crossover method.” The method requires repeated measures of the outcome and the exposure, which is exactly what we have in the teenpov data set. Toward this end I consider the conditional logistic regression model
$$\text{logit}\,\Pr(\text{nonpov}_{ij}=1 \mid i, \text{hours}_{ij}) = \alpha_i + \psi\,\text{hours}_{ij},$$
where i stands for “individual” and j indexes the interviews within an individual. By conditioning on the individual, the model automatically controls for all time-stationary characteristics of the individual, such as genetic make-up (Allison 2009). These covariates are absorbed by the individual-specific intercept $\alpha_i$. Conditional logistic regression is a special case of the broad class of models often referred to as “fixed effects regression models” (Allison 2009). I fit the conditional logistic regression model with the clogit function from the survival package by typing
> library(survival)
> fit <- clogit(formula=nonpov~hours+strata(ID), data=teenpov)
The conditional logistic regression estimate of $\psi$ (the coefficient for hours) is twice as large as that from the ordinary logistic regression and still highly significant.
However, even though the conditional logistic regression model automatically controls for all time-stationary covariates, it does not automatically control for any time-varying covariates. Thus, we may wish to repeat the conditional logistic regression analysis and control for the variables inschool, spouse, age, and mother, by explicitly adding these as regressors in the model. Unfortunately, though, this is not possible. The reason is that both age and mother are monotonically increasing in time, and the outcome (nonpoverty) is absorbing; that is, it can occur only at the last observed time point for each individual. Thus, when running an analysis that conditions on the individual, like conditional logistic regression, age and mother perfectly predict the outcome, and the model does not converge (Allison and Christakis 2006). This is precisely the problem the case-time-control method is intended to solve.
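The nonconvergence can be made concrete with a small base-R sketch. It assumes a hypothetical individual observed at three interviews whose outcome occurs at the last one, and evaluates that individual's conditional logistic log-likelihood contribution, $\gamma\,\text{age}_{J} - \log\sum_j \exp(\gamma\,\text{age}_j)$, over a grid of values for the age coefficient $\gamma$. The ages, the grid, and the function name cond_loglik are illustrative assumptions and are not taken from teenpov.

```r
# Hypothetical individual: three interviews, outcome at the last one.
age <- c(15, 16, 17)   # monotonically increasing covariate (assumed values)
y   <- c(0, 0, 1)      # absorbing outcome occurs at the last time point

# Conditional log-likelihood contribution of this individual for a model that
# contains age only: gamma * age[event] - log(sum(exp(gamma * age))).
# The maximum is subtracted inside the log-sum for numerical stability.
cond_loglik <- function(gamma) {
  eta <- gamma * age
  eta[y == 1] - (max(eta) + log(sum(exp(eta - max(eta)))))
}

gamma_grid <- c(0, 0.5, 1, 2, 5, 10)
ll <- sapply(gamma_grid, cond_loglik)
print(round(ll, 5))
```

Because the event always coincides with the largest age within the individual, the contribution keeps increasing toward 0 as $\gamma$ grows, so no finite maximizer exists. The same holds for every individual who experiences the outcome, which is why the fitting algorithm cannot converge.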
In Section 3 I review the case-time-control method, as formulated by Suissa (1995) and Allison and Christakis (2006). Neither of these formulations works for my data, because my exposure (hours) is nonbinary, and there are more than two time points per individual. In Section 5 I propose a generalization of the case-time-control method that can handle nonbinary exposures with more than two time points, and in Section 7 I reanalyze the data with this generalized method.
3. The Original Formulations of the Case-time-control Method
Let x and y be the time-varying exposure and outcome of interest, respectively. Assume that repeated measures on both x and y are available, and let $x_{ij}$ and $y_{ij}$ be the jth measure of the exposure and outcome, respectively, for individual i, where $i = 1, \ldots, n$ and $j = 1, \ldots, J_i$. Let $t_{ij}$ be the time point at which $x_{ij}$ and $y_{ij}$ are measured for individual i. Note that the time points do not have to be regularly spaced or the same for all individuals. Assume that the outcome is binary (0/1) and absorbing, so that it can occur only at the last time point; that is, $y_{ij} = 0$ for $j < J_i$. Let $v_{ij}$ be a vector of time-varying covariates we wish to control for in the analysis, which may contain elements that increase monotonically with time as well as nonincreasing elements. In particular, $v_{ij}$ may include time itself. Finally, I define a time-stationary variable, $g_i$, which is an indicator of whether individual i experiences the outcome at the last time point; that is, $g_i = 1$ if $y_{iJ_i} = 1$, and $g_i = 0$ otherwise. For all variables, I suppress the indices i and j when not needed. Table 1 illustrates example data for two individuals. The first individual ($i = 1$) is observed at time points 0 and 1 and never experiences the outcome, so that $g = 0$ at $j = 1, 2$ for this individual. The second ($i = 2$) is observed at time points 0, 2, and 4 and experiences the outcome at $t = 4$, so that $g = 1$ at $j = 1, 2, 3$ for this individual.
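Table 1. Example Data for Two Individuals.

| i | j | t | x | y | g |
|---|---|---|---|---|---|
| 1 | 1 | 0 | 2 | 0 | 0 |
| 1 | 2 | 1 | 5 | 0 | 0 |
| 2 | 1 | 0 | 3 | 0 | 1 |
| 2 | 2 | 2 | 2 | 0 | 1 |
| 2 | 3 | 4 | 9 | 1 | 1 |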
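As a small illustration of this data layout, the following base-R sketch enters the example data for the two individuals in long format and derives the time-stationary indicator g from y. The object names dat and J are my own, and the rows are assumed to be sorted by j within each individual.

```r
# Example data for two individuals, one row per measurement (long format).
dat <- data.frame(
  i = c(1, 1, 2, 2, 2),
  j = c(1, 2, 1, 2, 3),
  t = c(0, 1, 0, 2, 4),
  x = c(2, 5, 3, 2, 9),
  y = c(0, 0, 0, 0, 1)
)

# g is constant within an individual: it equals the outcome at the last
# time point, copied to every row of that individual.
dat$g <- ave(dat$y, dat$i, FUN = function(yy) yy[length(yy)])

# Number of time points per individual.
J <- tapply(dat$j, dat$i, max)
print(dat)
```

Storing g on every row is convenient because models that use g, such as the two-time-point formulation reviewed below, can then be fit directly on the long-format data.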
Let $x^*$ be an arbitrary exposure level. The target parameter is the log-odds ratio
$$\psi = \log\left\{\frac{\Pr(y=1 \mid x=x^*+1, i, v)\,\Pr(y=0 \mid x=x^*, i, v)}{\Pr(y=0 \mid x=x^*+1, i, v)\,\Pr(y=1 \mid x=x^*, i, v)}\right\}, \tag{1}$$
which measures the association of the outcome with a one-unit increase in the exposure, at given levels of i and v. In principle, this log-odds ratio could depend on both i and v; in statistical jargon, we would then say that there is effect modification by, or interaction with, i and v. I assume, though, that $\psi$ is constant across levels of i and v.
Ideally we would like to give $\psi$ a causal interpretation. Toward this end, let $y_{x^*}$ be the potential outcome for a given individual when hypothetically assigned a fixed exposure level $x^*$ (Pearl 2009; Rubin 1974). In the potential outcome framework we say that we have conditional exchangeability if the potential outcome $y_{x^*}$ is conditionally independent of x, given i and v. Conditional exchangeability is a technical condition that holds when i and v are sufficient for confounding control (Pearl 2009). Under conditional exchangeability, $\psi$ can be interpreted as the causal log-odds ratio
$$\log\left\{\frac{\Pr(y_{x^*+1}=1 \mid i, v)\,\Pr(y_{x^*}=0 \mid i, v)}{\Pr(y_{x^*+1}=0 \mid i, v)\,\Pr(y_{x^*}=1 \mid i, v)}\right\},$$
which measures the causal effect on the outcome of increasing the exposure by one unit, at a given level of i and v (Pearl 2009).
I consider the conditional (fixed effects) logistic regression model
$$\text{logit}\,\Pr(y_{ij}=1 \mid i, x_{ij}, v_{ij}) = \alpha_i + \psi x_{ij} + \gamma^T v_{ij}. \tag{2}$$
In this model, all time-stationary covariates are absorbed into the individual-specific intercept $\alpha_i$. For notational convenience I model the effect of v linearly in equation (2), but I emphasize that the linear effect of v can be replaced with a nonlinear function (e.g., splines) in model 2, as well as in all subsequent models.
As mentioned above, standard conditional logistic regression software cannot fit the model in equation (2). This is because time perfectly predicts the outcome within the individual, making it impossible for the estimate of $\gamma$ to converge (Allison and Christakis 2006). If the exposure is binary (0 or 1), this problem can be solved by assuming a conditional logistic regression model for the exposure, of the form
$$\text{logit}\,\Pr(x_{ij}=1 \mid i, y_{ij}, v_{ij}) = \alpha_i^* + \psi y_{ij} + \gamma^{*T} v_{ij}. \tag{3}$$
Model 3 assumes, like model 2, that the log-odds ratio between the exposure and the outcome is constant across levels of i and v. Because of the symmetry of the odds ratio, it then follows that the parameter $\psi$ in model 3 is identical to the parameter $\psi$ in model 2 (e.g., Chen 2007). However, this does not mean that the two models give the same estimate of $\psi$. Indeed, whereas model 2 does not converge, model 3 can be fit with conditional logistic regression software without convergence problems, provided that the exposure is not monotonically increasing with time. Let $\hat\psi_x$ denote the estimate of $\psi$ on the basis of model 3.
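The symmetry of the odds ratio can be checked numerically in the simplest possible setting. The sketch below is a simplification under stated assumptions: it uses ordinary (unconditional) logistic regression on a single hypothetical 2 x 2 table, with made-up counts, rather than the conditional models 2 and 3. It only illustrates that regressing y on x and regressing x on y return the same log-odds ratio.

```r
# Hypothetical 2 x 2 table: counts n for each combination of binary x and y.
dat <- data.frame(
  x = c(0, 0, 1, 1),
  y = c(0, 1, 0, 1),
  n = c(40, 10, 20, 30)
)

# Logistic model for the outcome given the exposure, and for the exposure
# given the outcome, using the cell counts as weights.
fit_y <- glm(y ~ x, family = binomial, data = dat, weights = n)
fit_x <- glm(x ~ y, family = binomial, data = dat, weights = n)

# Both slopes equal the log of the cross-product ratio (40 * 30) / (10 * 20).
log_or <- log((40 * 30) / (10 * 20))
print(c(outcome_model  = unname(coef(fit_y)["x"]),
        exposure_model = unname(coef(fit_x)["y"]),
        cross_product  = log_or))
```

In this saturated example the two fitted slopes coincide. In the fixed effects setting the two models share the same parameter but need not give the same estimate, as noted above.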
In the formulation of the case-time-control method by Allison and Christakis (2006), $\psi$ is estimated by fitting model 3. However, Suissa (1995), who originally proposed the method, used a slightly different formulation. Restricting attention to a scenario in which all individuals are observed at exactly two time points, coded as 0 and 1, Suissa proposed using conditional logistic regression software to fit the model
$$\text{logit}\,\Pr(t_{ij}=1 \mid i, x_{ij}, g_i) = \alpha_i^{\dagger} + \lambda x_{ij} + \psi^{\dagger} g_i x_{ij}. \tag{4}$$
Using Bayes’s rule (together with the definition of the variable g), it can be shown (see Appendix A) that the parameter $\psi^{\dagger}$ in model 4 is equal to the target parameter $\psi$, when $v = t$. This equivalence does not require the exposure to be binary, but it does require that the log-odds ratio between the exposure and the outcome be constant across levels of i and t, as in models 2 and 3. From the equality of $\psi^{\dagger}$ and $\psi$, it follows that a consistent estimate of $\psi^{\dagger}$ is also a consistent estimate of $\psi$. Let $\hat\psi_t$ denote the estimate of $\psi$ on the basis of model 4.
The two formulations of the case-time-control method, based on models 3 and 4, have different pros and cons. The formulation based on model 3 does not require the time variable to be binary, and it can thus be used when there are more than two time points for some individuals. However, it requires the exposure to be binary, in order for the logistic model for x to make sense. The formulation based on model 4 does not require the exposure to be binary, and it can thus be used for nonbinary exposures as well. However, it requires the time variable to be binary, in order for the logistic model for t to make sense. Also, whereas model 3 can handle additional covariates that increase monotonically with time, model 4 cannot. If both the exposure and the time variable are binary, then it can be shown (see again Appendix A) that models 3 and 4 give the same estimate of $\psi$ (i.e., $\hat\psi_x = \hat\psi_t$).
4. Detrending the Exposure
Fafchamps et al. (2010) proposed a related method to control for confounding by time on the basis of a detrending of the exposure. In their proposal, a linear fixed effects regression model is fitted for the exposure, of the form
$$E(x_{ij} \mid i, v_{ij}) = a_i + \delta^T v_{ij}. \tag{5}$$
The fitted model is used to create residuals of the form $r_{ij} = x_{ij} - \hat a_i - \hat\delta^T v_{ij}$. Finally, $\psi$ is estimated by fitting the conditional logistic regression model
$$\text{logit}\,\Pr(y_{ij}=1 \mid i, r_{ij}) = \alpha_i + \psi r_{ij}. \tag{6}$$
Fafchamps et al. (2010) argued informally that this detrending of the exposure should remove any dependence of the exposure on time, so that the resulting estimate of $\psi$ is not confounded by time.
To gain further insight into this method, suppose we replace the logistic model in equation (2) with the linear fixed effects regression model
$$E(y_{ij} \mid i, x_{ij}, v_{ij}) = \alpha_i + \psi x_{ij} + \gamma^T v_{ij}. \tag{7}$$
It can be shown (see Appendix B) that the estimate of $\psi$ obtained from fitting model 7 is identical to the estimate obtained by fitting the fixed effects regression model
$$E(y_{ij} \mid i, r_{ij}) = \alpha_i + \psi r_{ij}, \tag{8}$$
where r is the residual defined above. Thus, detrending the exposure does indeed give the desired result in linear outcome models.
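This equivalence for linear outcome models can be checked numerically. The sketch below uses simulated data under assumptions of my own (a balanced panel, time as the only time-varying covariate, a continuous outcome, and arbitrary coefficient values); it compares the exposure coefficient from the full linear fixed effects model with the coefficient for the detrended exposure.

```r
set.seed(1)
# Hypothetical balanced panel: n individuals with J time points each.
n <- 50; J <- 4
id      <- factor(rep(seq_len(n), each = J))
time_pt <- rep(seq_len(J), times = n)
a       <- rep(rnorm(n), each = J)                       # individual intercepts
x       <- a + 0.5 * time_pt + rnorm(n * J)              # exposure with a time trend
y       <- a + 0.3 * x + 0.2 * time_pt + rnorm(n * J)    # outcome, linear in x and time

# Linear fixed effects outcome model with x and time as regressors.
fit_full <- lm(y ~ x + time_pt + id)

# Detrending: linear fixed effects model for x given time, then residuals.
r <- resid(lm(x ~ time_pt + id))

# Linear fixed effects outcome model with the residual as the only regressor.
fit_res <- lm(y ~ r + id)

print(c(full      = unname(coef(fit_full)["x"]),
        detrended = unname(coef(fit_res)["r"])))
```

The two coefficients agree up to numerical precision, which is an instance of the Frisch-Waugh-Lovell result for least squares. No analogous identity holds for the conditional logistic model.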
However, there is no theoretical justification for detrending the exposure in logistic outcome models. Indeed, we can easily find examples for which this method gives considerable bias. I give a simulated example here, in which I simulated 1,000 samples of 1,000 individuals each. For each individual, I simulated two normally distributed individual-specific intercepts, one for the exposure model and one for the outcome model. I then simulated repeated measures of a normally distributed exposure and a Bernoulli-distributed outcome at five time points, with the outcome probability given by the expit function (the inverse of the logit function) applied to a linear predictor in which the coefficient for the exposure was $\psi$. The true value of $\psi$ was equal to 0.5. To mimic an absorbing outcome, I excluded all observations beyond the first time point at which $y = 1$, for each individual. For each sample I fitted model 5 with $v = t$ and constructed residuals r as described above. Finally, I used these residuals to estimate $\psi$ in model 6. The mean (over the 1,000 samples) estimate of $\psi$ was 0.32, which is severely biased. Repeating the simulation with 50 time points gave a nearly identical estimate.
5. Generalization of the Case-time-control Method
To generalize the case-time-control method for nonbinary exposures I follow Allison and Christakis (2006) and use a fixed effects regression model for the exposure. This strategy is motivated by the fact that such a model can be fitted for the exposure without convergence problems, provided that the exposure is not monotonically increasing with time (Allison and Christakis 2006). I assume that the exposure has an exponential family distribution
$$p(x \mid i, y, v) = \exp\left\{\frac{x\theta - b(\theta)}{\phi} + c(x, \phi)\right\}, \tag{9}$$
with canonical parameter $\theta$ and dispersion parameter $\phi$. I model the conditional mean of x as $h\{E(x \mid i, y, v)\} = \eta = \alpha_i^* + \beta y + \gamma^{*T} v$, where the link function $h$ is either the identity link, the log link, or the logit link. Thus, the model for the exposure becomes a generalized linear model (GLM; McCullagh and Nelder 1989). I assume that $h$ is the canonical link, so that $\theta = \eta$. Combining this assumption with the fact that $E(x \mid i, y, v) = b'(\theta)$ (McCullagh and Nelder 1989) shows that $b(\theta) = \theta^2/2$ when $h$ is the identity link, $b(\theta) = e^{\theta}$ when $h$ is the log link, and $b(\theta) = \log(1 + e^{\theta})$ when $h$ is the logit link.
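As before, let $x^*$ be an arbitrary value of x, and let $\theta_1$ and $\theta_0$ denote the canonical parameter in exposure model 9 when $y = 1$ and $y = 0$, respectively, at the given levels of i and v. We see that
$$\begin{aligned}
\psi &= \log\left\{\frac{\Pr(y=1 \mid x=x^*+1, i, v)\,\Pr(y=0 \mid x=x^*, i, v)}{\Pr(y=0 \mid x=x^*+1, i, v)\,\Pr(y=1 \mid x=x^*, i, v)}\right\} \\
&= \log\left\{\frac{p(x^*+1 \mid i, y=1, v)\,p(x^* \mid i, y=0, v)}{p(x^*+1 \mid i, y=0, v)\,p(x^* \mid i, y=1, v)}\right\} \\
&= \frac{\{(x^*+1)\theta_1 - b(\theta_1)\} + \{x^*\theta_0 - b(\theta_0)\} - \{(x^*+1)\theta_0 - b(\theta_0)\} - \{x^*\theta_1 - b(\theta_1)\}}{\phi} \\
&= \frac{\theta_1 - \theta_0}{\phi} \\
&= \frac{\beta}{\phi},
\end{aligned} \tag{10}$$
where the first equality follows from the definition of $\psi$, the second equality follows from Bayes’s rule, the third equality follows from the exposure model 9, and the fifth equality follows from the canonical link assumption together with the assumed mean structure, in which $\beta$ is the coefficient for y. From equation (10), we observe that the target parameter $\psi$ is equal to the GLM parameter $\beta$ divided by the dispersion parameter $\phi$. I thus propose to estimate $\beta$ and $\phi$, then take the ratio of these to obtain an estimate of $\psi$. Let $\hat\psi$ denote the resulting estimate of $\psi$. If x is binary, $\phi = 1$, $c(x, \phi) = 0$, and $h$ is the logit link, then the model in equation (9) reduces to the logistic model in equation (3), and $\hat\psi$ reduces to $\hat\psi_x$. In this sense, my proposal generalizes the “standard” case-time-control method. I proceed by considering nonbinary exposures, for which I assume that $h$ is either the identity link or the log link.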
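The relation between the log-odds ratio and the ratio of the GLM coefficient to the dispersion parameter can be verified numerically for a normally distributed exposure. In the sketch below, all numeric values (the intercept, the coefficient 0.75, the variance 1.5, the prior probability, and the exposure level) are illustrative assumptions. For fixed i and v, the posterior odds of y are computed by Bayes's rule and compared at exposure levels one unit apart.

```r
# Assumed illustrative values: x | y ~ Normal(alpha + beta * y, variance = phi).
alpha  <- 2
beta   <- 0.75
phi    <- 1.5
p_y1   <- 0.3    # assumed Pr(y = 1) at the given individual and covariate level
x_star <- 1.2    # arbitrary exposure level

# Pr(y = 1 | x) by Bayes's rule.
post_y1 <- function(x) {
  num <- dnorm(x, mean = alpha + beta, sd = sqrt(phi)) * p_y1
  den <- num + dnorm(x, mean = alpha, sd = sqrt(phi)) * (1 - p_y1)
  num / den
}
log_odds <- function(p) log(p / (1 - p))

# Log-odds ratio for a one-unit increase in the exposure.
psi_bayes <- log_odds(post_y1(x_star + 1)) - log_odds(post_y1(x_star))
print(c(psi_bayes = psi_bayes, beta_over_phi = beta / phi))
```

Changing x_star or p_y1 leaves the result unchanged, which reflects that the prior odds cancel and that the log-odds ratio does not depend on the exposure level under this model.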
To estimate $\beta$, I use conditional generalized estimating equations (CGEE; Goetgeluk and Vansteelandt 2008). This method is applicable when the link function $h$ is the identity link or the log link and provides consistent estimates of $\beta$ without making any parametric assumptions apart from the mean model for $E(x \mid i, y, v)$. As a by-product, we are able to relax a certain independence assumption, which is needed for consistent estimation when $h$ is the logit link (see Section 8).
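To estimate the dispersion parameter $\phi$ I proceed as follows. I first define a residual from the fitted mean model and use it to estimate the individual-specific intercept for each individual. I next estimate the conditional mean $\mu_{ij} = E(x_{ij} \mid i, y_{ij}, v_{ij})$ by plugging the estimated intercept and regression coefficients into the mean model, which gives $\hat\mu_{ij}$. Now, from standard GLM theory we see that $\text{var}(x_{ij} \mid i, y_{ij}, v_{ij}) = \phi V(\mu_{ij})$, where $V(\mu) = 1$ when the link function $h$ is the identity link and $V(\mu) = \mu$ when $h$ is the log link. Thus, for a given individual i we can construct a moment estimator $\hat\phi_i$ of $\phi$ (Pawitan 2001) from the squared residuals $(x_{ij} - \hat\mu_{ij})^2$, scaled by $V(\hat\mu_{ij})$.
This estimator is not feasible for those individuals with only one time point ($J_i = 1$). When $h$ is the log link, this estimator is also not feasible for those individuals with $x_{ij} = 0$ for all j, because $\hat\mu_{ij} = 0$ and $V(\hat\mu_{ij}) = 0$ for these individuals. I obtain a “pooled” estimate $\hat\phi$ of $\phi$ by averaging all feasible individual-specific estimates. Finally, I construct an estimate of $\psi$ as
$$\hat\psi = \frac{\hat\beta}{\hat\phi}.$$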
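The estimation steps can be sketched in base R for the identity link. This is one possible reading and not the author's implementation: it assumes that within-individual (demeaned) least squares can stand in for CGEE under the identity link, that the individual intercept is estimated by the mean residual, and that each individual-specific moment estimate divides the residual sum of squares by $J_i - 1$. The simulated data, the parameter values ($\beta = 0.75$, $\phi = 1.5$), and all object names are hypothetical, and the outcome is not made absorbing, for brevity.

```r
set.seed(2)
# Hypothetical data: identity link, time as the only time-varying covariate.
n <- 4000; J <- 4
beta <- 0.75; gam <- 0.2; phi <- 1.5
id      <- rep(seq_len(n), each = J)
time_pt <- rep(seq_len(J), times = n)
b       <- rep(rnorm(n), each = J)          # individual-specific intercepts
y       <- rbinom(n * J, 1, 0.3)            # outcome (not absorbing here)
x       <- rnorm(n * J, mean = b + beta * y + gam * time_pt, sd = sqrt(phi))

# Step 1: estimate the regression coefficients by within-individual least
# squares (an assumed stand-in for CGEE under the identity link).
center <- function(z) z - ave(z, id)
fit <- lm(center(x) ~ 0 + center(y) + center(time_pt))
beta_hat <- unname(coef(fit)[1])
gam_hat  <- unname(coef(fit)[2])

# Step 2: individual intercepts as mean residuals, then fitted means.
res0      <- x - beta_hat * y - gam_hat * time_pt
alpha_hat <- ave(res0, id)
mu_hat    <- alpha_hat + beta_hat * y + gam_hat * time_pt

# Step 3: individual moment estimates of phi (V(mu) = 1 for the identity
# link), pooled over individuals with more than one time point.
J_i     <- tapply(x, id, length)
phi_i   <- tapply((x - mu_hat)^2, id, sum) / (J_i - 1)
phi_hat <- mean(phi_i[J_i > 1])

# Step 4: ratio estimate of the log-odds ratio.
psi_hat <- beta_hat / phi_hat
print(c(beta_hat = beta_hat, phi_hat = phi_hat, psi_hat = psi_hat))
```

With these assumed values the target log-odds ratio is 0.75 / 1.5 = 0.5, so psi_hat should land close to 0.5 in a sample of this size. Standard errors would additionally require the sandwich variance described next, which this sketch does not compute.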
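To assess the uncertainty in $\hat\psi$ due to sampling variability, it is desirable to derive its asymptotic distribution. I first define $\theta$ as the vector of all parameters being estimated, with estimator $\hat\theta$. The estimator $\hat\theta$ is an M-estimator (Stefanski and Boos 2002) that solves the equation system
$$\sum_{i=1}^{n} U_i(\hat\theta) = 0. \tag{11}$$
If the link function $h$ is the identity link and if the underlying assumptions of the case-time-control method hold (see Section 8), then it can be shown that the estimating function is unbiased; that is, $E\{U_i(\theta)\} = 0$ (see Appendix C). It follows from standard theory for M-estimators (Stefanski and Boos 2002) that $\sqrt{n}(\hat\theta - \theta)$ is asymptotically normal with mean 0 and variance given by the “sandwich formula”:
$$E\left\{\frac{\partial U_i(\theta)}{\partial \theta^T}\right\}^{-1} \text{var}\{U_i(\theta)\} \left[E\left\{\frac{\partial U_i(\theta)}{\partial \theta^T}\right\}^{-1}\right]^T. \tag{12}$$
I obtain a consistent estimate of the variance for $\hat\theta$ by replacing $\theta$ in equation (12) by $\hat\theta$ and the population moments in equation (12) by their sample counterparts.
If $h$ is the log link, then the estimating function is biased; that is, $E\{U_i(\theta)\} \neq 0$. As a consequence, $\hat\phi$ and $\hat\psi$ are biased as well. However, the estimated individual-specific intercept does converge to its true value as the $J_i$s increase, and thus we would expect that the bias in $\hat\phi$ and $\hat\psi$ goes to 0 as the $J_i$s increase as well. In the next section I confirm this property by simulation.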
I end this section with a technical remark. When x has a causal effect on y, the exposure model in equation (9) cannot be interpreted as a data-generating (i.e., causal) model. From a purely mathematical perspective, this is not a problem, because a data-generating model (i.e., a model for the outcome) can always be reformulated (using Bayes’s rule) as a model for the exposure. However, from a subject-matter perspective, we may worry that the postulated exposure model implies a complex or unrealistic (or even impossible) data-generating model. This is not the case, though. It follows immediately from results in Chen (2007) that the probability distributions or densities $p(x \mid y = 0, i, v)$ and $p(y \mid x = 0, i, v)$ are variation independent of each other, and of the conditional odds ratio between x and y, given i and v. This means that a model for $p(x \mid y, i, v)$ does not impose any restrictions on $p(y \mid x, i, v)$, apart from those implied by the model for the odds ratio. In particular, the exposure model 9 does not rule out (or imply) that the outcome follows a standard logistic model.
6. Simulation
I carried out a small simulation study to assess the finite sample properties of the proposed estimator $\hat\psi$. I first considered a scenario in which the link function $h$ is the identity link. I generated 1,000 samples of 1,000 individuals each. For each individual, I simulated two normally distributed individual-specific intercepts, one for the outcome model and one for the exposure model. I then simulated repeated measures of a Bernoulli-distributed outcome and a normally distributed exposure from models for $p(y \mid i, t)$ and $p(x \mid i, y, t)$, respectively, where the outcome probability was specified through expit, the inverse of the logit function. The true values of $\phi$ and $\psi$ were 1.5 and 0.5, respectively. For each individual, the outcome and exposure were generated at $T$ time points, with T being a common constant for all individuals. To mimic an absorbing outcome, I excluded all observations beyond the first time point at which $y = 1$, for each individual. In this way the number of time points per individual, $J_i$, becomes a random variable with $J_i \le T$. I first generated data with $T = 2$. Each sample was analyzed with the method described in Section 5, with $v = t$. I calculated the mean and standard deviation (over the 1,000 samples) of $\hat\phi$ and $\hat\psi$, together with the mean standard error, as estimated by the “sandwich formula.” I then repeated the procedure for $T = 3$, $T = 4$, and $T = 5$. Table 2 shows the simulation results. We observe that both $\hat\phi$ and $\hat\psi$ are virtually unbiased for all values of $T$, and their mean standard errors agree well with their standard deviations.
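The step that mimics an absorbing outcome can be sketched in base R as follows. The toy data and the object names are hypothetical; the code keeps, for each individual, all rows up to and including the first row with an event, and drops everything after it.

```r
# Hypothetical long-format data: two individuals, five simulated time points each.
dat <- data.frame(
  id = rep(1:2, each = 5),
  t  = rep(1:5, times = 2),
  y  = c(0, 0, 1, 0, 1,    # individual 1: first event at t = 3
         0, 0, 0, 0, 0)    # individual 2: no event
)

# Within each individual, count the events strictly before the current row.
# Rows with at least one earlier event lie beyond the first event.
events_before <- ave(dat$y, dat$id, FUN = function(yy) cumsum(yy) - yy)
dat_abs <- dat[events_before == 0, ]

# Number of retained time points per individual (random in a simulation).
J <- table(dat_abs$id)
print(dat_abs)
```

After this step the outcome can equal 1 only in the last retained row of an individual, which matches the definition of an absorbing outcome used throughout.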
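Table 2. Simulation Results for the Generalized Case-time-control Method When the Link Function Is the Identity Link.

| $\max(J_i)$ | $\hat\phi$ ($\phi = 1.5$): Mean | SD | Mean SE | $\hat\psi$ ($\psi = 0.5$): Mean | SD | Mean SE |
|---|---|---|---|---|---|---|
| 2 | 1.50 | .07 | .07 | .50 | .11 | .11 |
| 3 | 1.50 | .06 | .06 | .50 | .08 | .08 |
| 4 | 1.50 | .05 | .05 | .50 | .06 | .06 |
| 5 | 1.50 | .05 | .05 | .50 | .05 | .05 |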
I next considered a scenario in which the link function $h$ is the log link. I repeated the simulation described above but used a different distribution for the exposure. Unless $\phi = 1$, the distribution in equation (9) has no simple closed-form expression when $h$ is the log link, which makes simulation difficult. To bypass this problem, I used a common “trick” and instead generated x from a negative binomial distribution with mean equal to $\mu = E(x \mid i, y, t)$ and variance equal to $\phi\mu$. This distribution has the desired ratio between the variance and the mean, and it converges to the Poisson distribution when $\phi$ approaches 1 (Pawitan 2001). Table 3 shows the simulation results. Observe that there is a slight bias in both $\hat\phi$ and $\hat\psi$ for small $\max(J_i)$, but this bias goes to 0 as $\max(J_i)$ increases, as expected. All mean standard errors agree well with the corresponding standard deviations.
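A negative binomial exposure with a given mean and a variance proportional to the mean can be generated in R as sketched below. The parameterization is my own choice and may differ from the one used in the study: rnbinom with arguments size and mu has mean mu and variance mu + mu^2/size, so setting size = mu / (phi - 1) gives variance phi * mu whenever phi exceeds 1. The mean values are illustrative assumptions.

```r
# Assumed illustrative values for the conditional mean and the dispersion.
mu  <- c(0.5, 2, 10)
phi <- 1.5

# size = mu / (phi - 1) makes the negative binomial variance
# mu + mu^2 / size equal to phi * mu (requires phi > 1).
size   <- mu / (phi - 1)
nb_var <- mu + mu^2 / size

# Draw a large sample for each mean and compare moments.
set.seed(3)
m   <- 100000
grp <- rep(seq_along(mu), each = m)
x   <- rnbinom(length(grp), size = size[grp], mu = mu[grp])
print(rbind(target_var = phi * mu,
            nb_var     = nb_var,
            sample_var = as.numeric(tapply(x, grp, var))))
```

As phi approaches 1 the size parameter grows without bound and the distribution approaches the Poisson distribution, in line with the limiting behavior described above.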
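Table 3. Simulation Results for the Generalized Case-time-control Method When the Link Function Is the Log Link.

| $\max(J_i)$ | $\hat\phi$ ($\phi = 1.5$): Mean | SD | Mean SE | $\hat\psi$ ($\psi = 0.5$): Mean | SD | Mean SE |
|---|---|---|---|---|---|---|
| 2 | 1.39 | .06 | .06 | .54 | .08 | .08 |
| 3 | 1.44 | .05 | .05 | .52 | .05 | .05 |
| 4 | 1.47 | .05 | .05 | .51 | .04 | .04 |
| 5 | 1.50 | .04 | .04 | .50 | .03 | .03 |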
As a comparison I repeated the simulations above for $T = 2$ but analyzed the data with the conditional logistic regression model 4. I calculated the mean and standard deviation of the resulting estimate $\hat\psi_t$, together with the mean standard error, as estimated by the inverse Fisher information. Table 4 shows the simulation results. When the link function is the identity link, $\hat\psi_t$ in Table 4 is almost unbiased, and its standard deviation is slightly bigger than the standard deviation of $\hat\psi$ in the first row in Table 2. When the link function is the log link, the bias of $\hat\psi_t$ in Table 4 is similar to the bias of $\hat\psi$ in the first row in Table 3; however, the standard deviation of $\hat\psi_t$ is almost twice the standard deviation of $\hat\psi$.
Table 4. Simulation Results for the Conditional Logistic Regression Model 4.

| Link | $\hat\psi_t$ ($\psi = 0.5$): Mean | SD | Mean SE |
|---|---|---|---|
| Identity | .51 | .14 | .14 |
| Log | .55 | .16 | .14 |
Appendix D presents additional simulations in which I used the same setup as described above but with three variations: smaller samples, nonlinear effects of time, and no exposure effect. In these additional simulations we observe similar patterns as in the simulation above, with one notable exception: when there is no exposure effect (i.e., when $\psi = 0$), we observe that the standard deviation of the model 4 estimate $\hat\psi_t$ is very similar to the standard deviation of the proposed estimate $\hat\psi$. This is an interesting finding for which I currently have no explanation; I recognize this as an interesting topic for future research.
I end this section by following up the technical remark at the end of Section 5. This simulation is based on models for $p(y \mid i, t)$ and $p(x \mid i, y, t)$. When x has a causal effect on y, neither of these models can be interpreted as a data-generating (i.e., causal) model. Thus, we may worry that the postulated models imply a complex or unrealistic data-generating model for the outcome. However, using Bayes’s rule, it is easy to show that the models I have used in the simulation imply a standard logistic regression model for $\Pr(y = 1 \mid i, x, t)$, in which the coefficient for x is $\psi = \beta/\phi$. By marginalizing out y from the postulated exposure model, we see that $p(x \mid i, t)$ is a mixture of two normal distributions with means equal to $E(x \mid i, y = 0, t)$ and $E(x \mid i, y = 1, t)$ and mixing probabilities equal to $\Pr(y = 0 \mid i, t)$ and $\Pr(y = 1 \mid i, t)$, respectively. A simulation from this “data-generating” model is mathematically equivalent to the simulation that I carried out above, but it is less computationally convenient.
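The Bayes's rule argument can be written out for a normally distributed exposure. The derivation below is a sketch under assumptions of my own: the exposure is taken to be normal with variance $\phi$ and conditional mean $\mu_0 + \beta y$, where $\mu_0$ denotes the mean when $y = 0$ (which may depend on i and t). These symbols are introduced for illustration and need not match the exact simulation models.

```latex
\begin{aligned}
\text{logit}\,\Pr(y=1 \mid i, x, t)
  &= \text{logit}\,\Pr(y=1 \mid i, t)
     + \log\frac{p(x \mid i, y=1, t)}{p(x \mid i, y=0, t)} \\
  &= \text{logit}\,\Pr(y=1 \mid i, t)
     + \frac{(x-\mu_0)^2 - (x-\mu_0-\beta)^2}{2\phi} \\
  &= \left\{\text{logit}\,\Pr(y=1 \mid i, t)
     - \frac{\beta\,(2\mu_0+\beta)}{2\phi}\right\}
     + \frac{\beta}{\phi}\,x .
\end{aligned}
```

The term in braces does not involve x, so the implied outcome model is logistic and linear in x with slope $\beta/\phi$, which agrees with the ratio form of the target parameter in Section 5.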
7. Motivating Example Revisited
I now return to the motivating example from Section 2. My aim is to estimate the conditional log-odds ratio between hours and nonpov, given all time-stationary covariates as well as the time-varying covariates inschool, spouse, age, and mother. Following my proposal in Section 5, I used the exposure model 9, with $x =$ hours, $y =$ nonpov, and $v =$ (inschool, spouse, age, mother). I modeled the conditional mean of x as linear in y and v on the scale of the link function $h$, with an individual-specific intercept, where I let $h$ be the identity link. I estimated $\beta$ and $\phi$ as described in Section 5 and took their ratio to obtain the estimate $\hat\psi$, which indicates that each additional working hour increases the odds of making a transition from poverty to nonpoverty by a factor of $e^{\hat\psi}$. This estimate is close to the estimate I obtained by using conditional logistic regression without controlling for inschool, spouse, age, and mother (see Section 2), which indicates that the confounding by these time-varying covariates may not be very strong in this example. The standard error for $\hat\psi$ was 0.012, which gives a 95 percent confidence interval for $\psi$ equal to $\hat\psi \pm 1.96 \times 0.012$.
8. Underlying Assumptions of the Case-time-control Method
Like all statistical methods, the case-time-control method relies on certain assumptions. In particular, it relies on the exposure model 9, which assumes that the exposure has an exponential family distribution, with a mean that is linear (after transformation with the link function $h$) in i, y, and v. If the model is misspecified, bias may occur. I note that although bias due to model misspecification is a concern, it is not unique to the case-time-control method; it is a potential problem for all model-based statistical methods. I also note that the model assumptions can be relaxed by, for instance, modeling the effect of v with a polynomial or with splines.
Another issue is that some individuals do not contribute to the parameter estimates when using fixed effects models such as models 3 and 9. To contribute to the estimates, an individual must have variation in x as well as variation in either y or v (or both). When the exposure is binary, it is possible that most individuals are unexposed (i.e., have $x = 0$) all through follow-up, in which case most individuals will not contribute to the parameter estimates. This does not violate any assumption of the method per se, but it has important implications for the interpretation of the results. If those few individuals who become exposed at some point (and thus contribute to the estimates) are systematically different (e.g., if they are older) than those individuals who never become exposed, then the results may not be generalizable to the whole population. When the exposure is quantitative, as in my motivating example (Section 2), this problem is less of a concern because most individuals will have variation in the exposure and will thus also contribute to the parameter estimates.
The most important (and subtle) problem with the case-time-control method is that even if the exposure model is correct, certain independence assumptions are required in order for the parameter estimates to be consistent. I begin by defining the vectors $y_i = (y_{i1}, \ldots, y_{iJ_i})$ and $v_i = (v_{i1}, \ldots, v_{iJ_i})$. For the logistic exposure model 3, the usual conditional logistic regression likelihood is derived by considering the joint distribution of the exposures $(x_{i1}, \ldots, x_{iJ_i})$, given i, $y_i$, and $v_i$. The derivation proceeds by factorizing this joint distribution into individual-specific contributions, as specified by the individual-level exposure model. However, in order for this factorization to be valid, the following two assumptions are required:
$$p(x_{i1}, \ldots, x_{iJ_i} \mid i, y_i, v_i) = \prod_{j=1}^{J_i} p(x_{ij} \mid i, y_i, v_i) \tag{13}$$
and
$$p(x_{ij} \mid i, y_i, v_i) = p(x_{ij} \mid i, y_{ij}, v_{ij}). \tag{14}$$
Assumption (13) states that the exposures at all time points should be conditionally independent, given the individual and the vector of outcomes and the vector of time-varying covariates. Assumption (14) states that the exposure at each time point should be independent of the vector of outcomes and the vector of time-varying covariates, given the individual and the current value of the outcome and the current value of the time-varying covariates. If either of these assumptions is violated, then the usual expression for the conditional logistic regression likelihood is not valid, and the resulting estimates are generally biased (Goetgeluk and Vansteelandt 2008; Zetterqvist et al. 2016). Jensen et al. (2014) carried out a simulation study to investigate the sensitivity of the case-time-control method to assumption (13). They observed some bias when the assumption was violated but noted that the bias was minor for small exposure effects.
Although the reason for requiring assumptions (13) and (14) is rather technical, the assumptions have strong practical implications. Assumption (13) rules out carryover effects from the exposure at one time point to the exposure at later time points, and assumption (14) rules out carryover effects from the outcome at one time point to the exposure at later time points. Such carryover effects are likely to be present in many scenarios. In my motivating example (Section 2), it is clear that both current working hours (exposure) and current poverty status (outcome) may affect working hours at later time points, in which case both assumptions (13) and (14) would be violated to some degree.
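The kind of exposure carryover that assumption (13) rules out can be illustrated with a toy simulation. The sketch below follows a single hypothetical individual with a binary exposure whose log odds increase by `carryover` when the previous exposure was 1. The data-generating model, the function names, and the parameter values are illustrative assumptions, not the simulation design used in this article.

```python
import math
import random


def simulate_exposure(n_steps, carryover, intercept=0.0, seed=1):
    """Simulate a binary exposure series for one individual.

    The probability of exposure at each time point follows a logistic model
    with log odds `intercept + carryover * previous_exposure` (assumed model).
    """
    rng = random.Random(seed)
    x_prev = 0
    series = []
    for _ in range(n_steps):
        p = 1.0 / (1.0 + math.exp(-(intercept + carryover * x_prev)))
        x = 1 if rng.random() < p else 0
        series.append(x)
        x_prev = x
    return series


def lag_dependence(series):
    """Return P(x_t = 1 | x_{t-1} = 1) - P(x_t = 1 | x_{t-1} = 0)."""
    after_exposed = [b for a, b in zip(series, series[1:]) if a == 1]
    after_unexposed = [b for a, b in zip(series, series[1:]) if a == 0]
    return (sum(after_exposed) / len(after_exposed)
            - sum(after_unexposed) / len(after_unexposed))


# Without carryover the difference is close to zero; with carryover it is not.
no_carryover = lag_dependence(simulate_exposure(20000, carryover=0.0))
with_carryover = lag_dependence(simulate_exposure(20000, carryover=1.5))
```

A difference clearly above zero within an individual signals that the exposures are not conditionally independent across time points, which is the situation in which assumption (13) is violated.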
However, if we use exposure model 9 with an identity link or log link, only assumption (14) is required. This is because the CGEE technique I use for estimation (see Section 5) does not require that we consider the joint distribution of the exposures (Goetgeluk and Vansteelandt 2008; Zetterqvist et al. 2016). Thus, in the reanalysis of the motivating example (Section 7), we may allow current working hours to have an effect on later working hours. However, we must still assume that current poverty status has no effect on later working hours. It is not obvious how much bias is introduced by violating assumption (14), and I recognize this as an important topic for future research.
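In the case-time-control method, conditional logistic regression is used to model the exposure. However, in more standard applications of conditional logistic regression, the outcome is modeled instead. By symmetry, the necessary assumptions are then seen as assumptions (13) and (14), but with the exposure and outcome “swapped,” that is,
$$y_{it} \perp \{y_{it'} : t' \neq t\} \mid (i, \mathbf{x}_i, \mathbf{v}_i) \tag{15}$$
and
$$y_{it} \perp (\mathbf{x}_i, \mathbf{v}_i) \mid (i, x_{it}, v_{it}), \tag{16}$$
where $\mathbf{x}_i$ and $\mathbf{v}_i$ denote the vectors of exposures and time-varying covariates for individual $i$ across all time points.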
That conditional logistic regression relies on assumptions (15) and (16) is rarely recognized by analysts and rarely stated in standard statistical textbooks. This is unfortunate, because violations of these assumptions may induce bias in the estimates (Sjölander et al. 2012).
9. Discussion
The case-time-control method makes it possible to control for confounding by time while also controlling for all time-stationary characteristics of the study participants. Until now, the method has been restricted to binary exposures or data with only two time points. In this paper I have proposed a generalization of the case-time-control method for nonbinary exposures and an arbitrary number of time points, on the basis of a GLM for the exposure. I have derived the asymptotic properties of the resulting estimator, and I have assessed its finite sample properties in a simulation study. The simulation results indicate that the method works well, regardless of the number of time points, when using an identity link in the GLM. When using a log link, the estimator is somewhat biased for a small number of time points. However, for two time points its bias appears to be no larger than that of the conventional case-time-control estimator.
Lack of software implementation is often a major obstacle for practitioners who wish to use new methodology. To facilitate the use of the proposed methods, I have written an R function that calculates the parameter estimates, together with their “sandwich” standard errors, as described in Section 5. The R function is provided in Appendix E.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by the Swedish Research Council (grant 340-2012-6007).
Author Biography
Arvid Sjölander is a statistician and an associate professor in biostatistics in the Department of Medical Epidemiology and Biostatistics at Karolinska Institutet. He received a PhD in causal inference, and he is still working with problems and questions related to causality. His other research topics of interest include attributable fractions, analysis methods for sibling comparison designs, and doubly robust estimation.
References
Allison, Paul D. 2009. Fixed Effects Regression Models: Quantitative Applications in the Social Sciences. Thousand Oaks, CA: Sage.
Allison, Paul D., and Nicholas A. Christakis. 2006. “Fixed-effects Methods for the Analysis of Nonrepeated Events.” Pp. 155–72 in Sociological Methodology, Vol. 36, edited by Ross M. Stolzenberg. Boston, MA: Blackwell.
Campbell, Cameron D., and James Z. Lee. 2009. “Long-term Mortality Consequences of Childhood Family Context in Liaoning, China, 1749–1909.” Social Science and Medicine 68(9):1641–48.
Chen, Yun H. 2007. “A Semiparametric Odds Ratio Model for Measuring Association.” Biometrics 63(2):413–21.
Fafchamps, Marcel, Marco J. van der Leij, and Sanjeev Goyal. 2010. “Matching and Network Effects.” Journal of the European Economic Association 8(1):203–31.
Goetgeluk, Sylvie, and Stijn Vansteelandt. 2008. “Conditional Generalized Estimating Equations for the Analysis of Clustered and Longitudinal Data.” Biometrics 64(3):772–80.
Jensen, Aksel K. G., Thomas A. Gerds, Peter Weeke, Christian Torp-Pedersen, and Per K. Andersen. 2014. “On the Validity of the Case-time-control Design for Autocorrelated Exposure Histories.” Epidemiology 25(1):110–13.
Jin, Lei, and Nicholas A. Christakis. 2009. “Investigating the Mechanism of Marital Mortality Reduction: The Transition to Widowhood and Quality of Health Care.” Demography 46(3):605–25.
Kacperczyk, Aleksandra J. 2013. “Social Influence and Entrepreneurship: The Effect of University Peers on Entrepreneurial Entry.” Organization Science 24(3):664–83.
Maclure, Malcolm. 1991. “The Case-crossover Design: A Method for Studying Transient Effects on the Risk of Acute Events.” American Journal of Epidemiology 133(2):144–53.
Pawitan, Yudi. 2001. In All Likelihood: Statistical Modelling and Inference Using Likelihood. Oxford, UK: Oxford University Press.
Pearl, Judea. 2009. Causality: Models, Reasoning, and Inference. 2nd ed. New York: Cambridge University Press.
Rubin, Donald B. 1974. “Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies.” Journal of Educational Psychology 66(5):688–701.
Sjölander, Arvid, Anna L. V. Johansson, Cecilia Lundholm, Daniel Altman, Catarina Almqvist, and Yudi Pawitan. 2012. “Analysis of 1:1 Matched Cohort Studies and Twin Studies, with Binary Exposures and Binary Outcomes.” Statistical Science 27(3):395–411.
Sørensen, Jesper B. 2007. “Bureaucracy and Entrepreneurship: Workplace Effects on Entrepreneurial Entry.” Administrative Science Quarterly 52(3):387–412.
Stefanski, Leonard A., and Dennis D. Boos. 2002. “The Calculus of M-estimation.” American Statistician 56(1):29–38.
Suissa, Samy. 1995. “The Case-time-control Design.” Epidemiology 6(3):248–53.
Winter, Sidney G., Gabriel Szulanski, Dimo Ringov, and Robert J. Jensen. 2012. “Reproducing Knowledge: Inaccurate Replication and Failure in Franchise Organizations.” Organization Science 23(3):672–85.
Zetterqvist, Johan, Stijn Vansteelandt, Yudi Pawitan, and Arvid Sjölander. 2016. “Doubly Robust Methods for Handling Confounding by Cluster.” Biostatistics 17(2):264–76.