Abstract
Missing data is almost inevitable in correlated-data studies. For non-Gaussian outcomes with moderate to large sequences, direct-likelihood methods can involve complex, hard-to-manipulate likelihoods. Popular alternative approaches, like generalized estimating equations, that are frequently used to circumvent the computational complexity of full likelihood, are less suitable when scientific interest, at least in part, is placed on the association structure; pseudo-likelihood (PL) methods are then a viable alternative. When the missing data are missing at random, Molenberghs et al. (2011, Statistica Sinica, 21,187–206) proposed a suite of corrections to the standard form of PL, taking the form of singly and doubly robust estimators. They provided the basis and exemplified it in insightful yet primarily illustrative examples. We here consider the important case of marginal models for hierarchical binary data, provide an effective implementation and illustrate it using data from an analgesic trial. Our doubly robust estimator is more convenient than the classical doubly robust estimators. The ideas are illustrated using a marginal model for a binary response, more specifically a Bahadur model.
Keywords
Introduction
Incomplete data has become an important concern for applied statisticians, especially in longitudinal and otherwise hierarchical outcome data. When the vector
The choice of inferential framework for analysing incomplete data will depend largely upon the nature of missingness. Conventionally, the process driving the latter is classified according to the terminology of Little and Rubin (2002, Chap. 6). When missingness is independent of both the observed and unobserved outcomes, it is called missing completely at random (MCAR), while when the missingness is independent of the unobserved measurements, conditional on the observed ones, the process is said to be missing at random (MAR). When neither MCAR nor MAR holds, missingness is termed missing not at random (MNAR).
Very commonly, direct likelihood is used as the basis for analysing correlated outcomes under MAR. The unified modelling framework provided by the linear mixed model, yielding both random effects as well as marginally interpretable regression parameters, is the dominant choice for Gaussian outcomes, while generalized linear mixed models remain popular for non-Gaussian outcomes, though marginalization is not always straightforward. Other likelihood-based options for marginal inference exist, such as the Bahadur (1961) model and the multivariate Dale or global odds ratio model (Molenberghs and Lesaffre, 1994, 1999) for binary data, but these involve complex likelihoods, can be computationally prohibitive in moderate to large studies, and are vulnerable to misspecification.
These issues have motivated the development of alternatives to likelihood, perhaps the most popular of which being generalized estimating equations or GEE (Liang and Zeger, 1986; Diggle et al., 2002; Molenberghs and Verbeke, 2005), along with variations or extensions such as GEE2 (Liang et al., 1992) and alternating logistic regressions (Carey et al., 1993), when association parameters are also of scientific interest. Standard GEE is valid only under MCAR, but a weighted version (WGEE; Robins et al., 1995) has been developed, using Horvitz-Thompson ideas (Cochran, 1977), to allow valid use of GEE under MAR. The WGEE approach, however, tends to be biased when the model for the weights is misspecified (Beunckens et al., 2008; Molenberghs and Kenward, 2007). To this end, doubly robust approaches (Scharfstein et al., 1999; Van der Laan and Robins, 2003; Bang and Robins, 2005; Rotnitzky, 2009; Birhanu et al., 2011), which further supplement the use of weights with a predictive model for the unobserved responses, given the observed ones, have been constructed. This not only removes or at least alleviates bias but also increases efficiency.
PL methods (le Cessie and van Houwelingen, 1991; Geys et al., 1998, 1999; Aerts et al., 2002) comprise yet another alternative to full likelihood. While the term ‘PL’ has various meanings in the literature, we take it here to mean the replacement of a likelihood function by a simpler function that still allows a consistent and asymptotically normal estimator of the model parameter vector, albeit with potentially reduced precision (Arnold and Strauss, 1991). This is in contrast to GEE methods, where the score equations are replaced with alternative, simpler functions.
PL is different to full likelihood and is therefore not guaranteed to be valid under MAR. Rubin (1976) provided conditions for ignorability that are sufficient but not always necessary. Yi et al. (2011) provide an example, using a pairwise (pseudo-)likelihood method for incomplete longitudinal binary data, that is ignorable under MAR, even though it is not a full likelihood approach. Molenberghs et al. (2011), on the other hand, propose a suite of corrections to PL in its standard form, also to ensure its validity under MAR. These corrections hold for PL in general and follow both single and double robustness ideas. They showed that, in contrast to the GEE case and in particular for both robust versions, PL-based estimating equations admit very convenient simplifications.
Molenberghs et al. (2011) applied the methodology to multivariate Gaussian responses and to a conditional model for clustered binary data. They provided a general outline with predominantly illustrative examples using normal and binary data. However, the marginal modelling of longitudinal binary data is very common in practice. Molenberghs et al. (2011) only sketched the methodology using a marginal Bahadur model for the binary responses; they did not pursue it in detail. The further development of doubly robust PL for incomplete hierarchical binary data under MAR is the central theme of this article.
The theoretical part, estimating equations and precision estimators, are calculated and reported for the first time Application is shown through a case study and easy-to-use SAS code is provided.
It should be clear that we are not fitting the full Bahadur model. In fact, we use its first and second moments only, because this allows us to describe the marginal mean function, while providing the vehicle to take correlations and incompleteness into account. Note that there is a similar connection between standard and weighted GEE for binary data on the one hand and the Bahadur model on the other. The latter connection was studied in detail by Molenberghs and Kenward (2010). Note that apart from very simple settings, the Bahadur model is prohibitive to fit (Aerts et al., 2002).
The rest of the article is organized as follows. Section 2 introduces the necessary background and concepts from PL and incomplete data. Our contribution, that is, PL based on the Bahadur model, is the subject of Section 3. Analysis of the case study can be found in Section 4.
Background on pseudo-likelihood and incomplete data
Let the random variable
The principal idea behind PL is to replace a numerically challenging joint density (and hence likelihood) by a simpler function assembled from suitable factors. Arnold and Strauss (1991) gave a formal, general definition and studied its statistical properties.
Our attention will be confined to so-called pairwise marginal likelihood, in which the conventional log-likelihood
is replaced by
Maximization of Eq. (2.1) can be done, subject to adequate regularity conditions, by solving the PL (score) equations, which are obtained by differentiating the logarithmic PL and equating the resulting derivative to zero. Suppose that
Where
General formulation
We begin this section by introducing some further notation. The response vector
Molenberghs et al. (2011) considered three classes of estimating equations for pairwise likelihood, respectively naive, singly robust (‘sr’), and doubly robust (‘dr’). For each of these three, the original authors further considered: complete cases (CC; using only subjects will all planned measurements observed), complete pairs (CP; where all complete pairs from incomplete sequences are also added) and available cases (AC; where additionally single observations from incomplete pairs are used), leading to nine sets of estimating equations. The word ‘naive’ refers to the fact that these estimating equations would generally lead to biased estimators under MAR. Here, only the response is modelled with a Bahadur model. For the single robust setting a weight model is introduced, using a logistic structure. For the double robust version the model was further extended with a predictive model for the unobserved outcomes using again a Bahadur model. All these estimating equations are presented in Table 1.
In this table,
where
in which case, for example, the single robust version of the CP estimating equation can be re-expressed as:
An important result is that all three doubly robust versions coincide (Molenberghs et al., 2011), that is,
It is thus not necessary to explicitly model the missing data mechanism. Further, under exchangeability, (Molenberghs et al., 2011) showed that the expectations in
Estimating equations for pairwise pseudo-likelihood. Abbreviations used: CC: complete cases; CP: complete pairs; AC: available pairs; sr: singly robust; dr: doubly robust
To introduce the Bahadur model (Bahadur, 1961), we follow the developments in Molenberghs et al. (2011). Denote
The multivariate Bahadur probabilities are
where
Based on the definitions made in Section 3.2, the log-likelihood terms from a pairwise Bahadur model take the following form:
Starting from PL contribution (3.5), pairwise and conditional contributions to the score equation take the form as follows
where
The expectations of these over the conditional distribution of the unobserved outcomes given the observed ones are further required. Evidently, because Eqs. (3.6) and (3.7) are linear in the triplet
where
Combining Eqs. (3.6) and (3.7) with Eq. (3.8) leads to
Many of the probabilities in the predictive model, that is, the ones involving
where
In the naive case, uncertainty stems from the θ parameter only. The asymptotic variance-covariance matrix in Eq. (2.2) can then be consistently estimated by
where
In the singly robust case, we must also take into account uncertainty coming from estimating the
In the doubly robust case, for the general expression, the weight model is complemented with a predictive model. The score function for this conditional Bahadur model is
From Eqs. (3.1), (3.15) can be simplified to the following expressions
More detailed calculations and complete formulas can be found in Supplementary material B. See also Bang and Robins (2005) and Rotnitzky (2009).
In this section, we apply the proposed methodology to data from a clinical trial designed to investigate an analgesic drug. All analyses have been performed with SAS (version 9.4). First, the Bahadur model, using three different estimating equations for CC, CP and AC, was fitted with an NLMIXED procedure. To make use of NLMIXED's functionality, an objective function is formulated of which the first derivative coincides with the estimating function under consideration. For optimization, the default Quasi-Newton technique was applied. Further, to estimate the precision, a sandwich-type robust variance estimator was used and, to perform the calculations, the IML procedure was implemented. The Bahadur model, based on the full likelihood, was again fitted in an NLMIXED procedure with similar settings. For more details, see Supplementary material C.
Analgesic clinical trial
The analgesic trial was a single-arm clinical trial involving 395 patients who were given analgesic treatment for pain caused by chronic non-malignant disease. Treatment was to be administered for 12 months and assessed by means of a five-point ‘Global Satisfaction Assessment’ (GSA) scale: (1) very good; (2) good; (3) indifferent; (4) bad; (5) very bad. As it is frequently of interest to physicians to classify a patient's status as either improving or worsening, some analyses have considered a dichotomized version, GSABIN, which is 1 if GSA
The analgesic trial. Absolute and relative frequencies of the five GSA categories for each of the four follow-up times
The analgesic trial. Absolute and relative frequencies of the five GSA categories for each of the four follow-up times
GSA was rated by each person four times during the trial: at months 3, 6, 9 and 12. An overview of the frequencies per follow-up time is given in Table 2. Inspection of Table 2 reveals varying totals per column, due to missingness. At three months, 10 subjects lack a measure, with these numbers being 93, 168 and 172 at subsequent times.
The analgesic trial. Overview of missingness patterns and the frequencies with which they occur. ‘O’ indicates observed and ‘M’ indicates missing
An overview of the extent of missingness (Table 3) indicates that only around 40% of the subjects have complete data. Both dropout and intermittent patterns of missingness occur–the former amounting to roughly 40%, with less than 20% for the latter.
For all ensuing analyses of the analgesic trial data, we consider only completers and dropouts, that is, a subset of 328 patients from the original dataset, and the dichotomized outcome (GSABIN). We first build a logistic regression for the dropout indicator, in terms of the previous outcome
The highly significant p value
Preliminary analyses have indicated that, among a set of potential covariates, the linear and quadratic effects of time
For the correlation across the within-subject outcomes, we posit a Toeplitz type correlation structure:
where
The resulting parameter estimates, along with corresponding standard errors, for model specification Eq. (4.1), with a Toeplitz correlation structure (Eq. 4.2), using full likelihood and estimating equations from Table 1 are presented in Table 4. The variability of the estimated weights, or additionally the variability of the estimated parameters of the predictive model, is incorporated in the computation of the standard errors. The high degree of similarity with the results of full likelihood indicate that the extra variability induced by the weights, or additionally by the parameters of the predictive model, does not seem to have a large impact on either the estimates or their standard errors.
Analgesic trial. Parameter estimates (empirically-corrected standard errors) for naive, singly and doubly robust pairwise likelihood and for full likelihood
Similar results are observed throughout the whole table, but in particular for the parameter estimates under full likelihood, naive AC and the doubly robust cases. Moreover, substantial efficiency over full likelihood seems to be gained under the naive AC and doubly robust approaches. Whereas these observations are not surprising for the doubly robust case, precisely because of their property, the relatively good performance of the naive AC case seems counterintuitive. However, under exchangeability, as shown before, the naive AC can be seen as a doubly robust estimator, given that then the expectation in these estimation equations can be removed because observed and unobserved components from a subject's history are interchangeable. To this effect, we assessed the plausibility of the Toeplitz correlation structure of the analgesic trial data, using full likelihood (approximate F-test in NLMIXED), and determined that the three correlation parameters
Next, we consider the CP versions, both single and doubly robust. The estimates for the parameters seem reasonably close to those under full likelihood. In addition, the standard errors under the singly robust case seem comparable, but those of the doubly robust case are generally larger than those from full likelihood, a result that could be attributed to the fact of single robustness. The estimates for the
PL approaches have become a practical alternative to full likelihood methods, particularly for applications involving complex likelihood forms. In view of the various issues arising from marginally modelling incomplete non-Gaussian longitudinal data, we move away from conditional PL, and focus on marginal PL, considering the specific case of incomplete longitudinal binary data, as proposed in Molenberghs et al. (2011). While the numerical and computational issues accompanying the likelihood expressions of the marginal model for the binary longitudinal responses are circumvented by means of substituting pairwise PL expressions for their full likelihood counterparts, the incompleteness in the data is addressed using concepts of inverse probability weighting and predictive terms in the form of expectations, thereby yielding singly and doubly robust estimators. This expands the set of tools available for fitting marginal models to incomplete non-Gaussian longitudinal data.
In this article, we assessed the performance of PL approaches proposed in Molenberghs et al. (2011), in order to provide practical insight into alternative strategies for marginal models for non-Gaussian incomplete longitudinal data. The analysis of the case study demonstrates the feasibility and adequacy of the proposed methodology. Singly robust estimators with correctly specified dropout model and our doubly robust estimators were found to be at least as efficient as direct likelihood methods. Moreover, under full or near exchangeability, the naive available case version is as efficient as the doubly robust estimators, allowing one to invoke double robustness without having to use weights or expectations.
While the situation examined in this article focuses on dropout, in principle, the general methodology applies for non-monotone missingness as well; one then has to pay particular attention to the construction of both weights and predictions, and some non-trivial algebraic challenges will emerge. Other possibilities include imputing all missing cases or imputing only non-monotone missing cases to render the missingness monotone and subsequently using PL on the imputed datasets. Also, while multiple imputation approaches generally prescribe Gaussian type data, variations for non-Gaussian data can be utilized and seem reasonably stable even with model misspecification; see, for instance, Beunckens et al. (2008).
Supplementary materials
More detailed information on calculations, data and SAS code can be found in the accompanying supplementary materials through the link: http://www. statmod.org/smij/archive.html.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The authors gratefully acknowledge support from the IAP research Network #P6/03 of the Belgian Government (Belgian Science Policy).
