The log-odd logistic-Weibull regression model under informative censoring

Abstract

A defining feature of survival data is that the random variable of interest is not always observed, so some observations are censored. The usual methods assume that censored observations carry no information about the distribution of the response variable (non-informative censoring). In other words, an observation is taken to be censored simply because the event of interest (failure or death) did not occur during the study period. However, in many situations the survival time is clearly affected by the censoring mechanism, so its effect must be included in the analysis. If informative censoring is wrongly treated as non-informative, the results of the analysis can be biased, which weakens the model's predictive power. Therefore, we incorporate an informative censoring mechanism into the odd log-logistic Weibull regression model, based on the method described in Huang and Wolfe (2002), and analyze the resulting changes in the parameter estimates. We obtain maximum likelihood estimates of the parameters from censored data and evaluate local influence on the estimates under different perturbation schemes. In addition, we define martingale and deviance residuals to detect outliers and evaluate the model assumptions. We show that the proposed regression model is useful for the analysis of real data and may give more realistic fits than other special regression models.

Keywords

Frailty; gamma distribution; informative censoring mechanism; log-odd log-logistic Weibull regression model; residual analysis

1. Introduction

In general, survival data differ from the data found in other statistical problems in that the variable of interest is not always observed. One of the main characteristics of such data is the presence of censoring. Censoring can be defined as a partial observation of the response variable, and it can arise for many reasons, such as the end of the study before the occurrence of the event, death of an individual in the sample from a cause other than the one studied, or abandonment of treatment by the patient, among others. Survival models generally assume that the failure and censoring times are independent (non-informative censoring). However, there are many other causes and types of censoring that determine the end of an observation period (right censoring). In clinical trials and other medical and biological studies, for example, various reasons can cause censoring to be informative. According to Lagakos (1979), these are:

• When individuals withdraw from a clinical trial for reasons that can be related to the therapy being analyzed;
• When individuals are removed from a clinical trial by design and are no longer followed for survival times, but have already experienced a specific critical event;
• When individuals in a study experience a failure caused by an event of secondary interest and there is censoring of the failure time caused by the event of primary interest.

In all these cases, it is necessary to verify whether these partial observations bring useful information about the distribution of the failure times. As reported by Lagakos (1979), a situation where an individual withdraws from a clinical trial for failure to adapt to the treatment of interest can be considered a case of informative censoring, since the reason for censoring is related to the subsequent survival time of that individual. The problem is that once non-informative censoring has been mistakenly assumed in a data set, the results of the analysis can be biased and the usual methods will overestimate or underestimate the survival function (Staplin et al., 2012). Also, the biases of the estimates tend to increase as the number of censored observations rises. In response to this problem, some authors have formulated proposals to model survival data under the assumption of informative censoring. Lagakos and Williams (1978) proposed a model with a scalar parameter $\theta$ that governs the hidden censoring mechanism. For $\theta=1$, the likelihood function only depends on the survival distribution. In contrast, for $0\leqslant\theta<1$, the censoring effects become increasingly important for estimating the model parameters. Wang et al. (2001) modeled the occurrence of recurrent events with the use of a non-stationary Poisson process and a latent variable, considering the distribution parameters of the (informative) censoring and the latent variables as nuisance parameters. Then, they considered a multiplicative intensity model for nonparametric estimation of the hazard function. Rotnitzky et al. (2007) obtained estimators of the survival function in the presence of right-censoring mechanisms. In the same study, they also performed a sensitivity analysis of the estimator representing the potentially informative censoring, assuming that after the adjustment for all the prognostic factors, the failure and censoring times are independent.
In this same line of reasoning, Scharfstein and Robins (2002) presented a method that, besides fitting models considering informative censoring, allows simultaneously quantifying the sensitivity of the inference for residual dependence between the failure and the censoring due to uncontrolled factors. However, this method does not allow more than one censoring mechanism, meaning that all the censoring of the data should be considered as informative. Therefore, the model of Rotnitzky et al. (2007) is an extension of this method under multiple causes of censoring.

In all these works, the greatest difficulty is in identifying the dependent censoring mechanism in the data (Tsiatis, 1975). For this reason, several authors have proposed sensitivity analysis methods to assess the effects of dependent censoring on estimates of time-to-failure distribution parameters (Siannis, 2004; Siannis et al., 2005; Zhang & Heitjan, 2006; Huang & Zhang, 2008). Siannis (2004) and Siannis et al. (2005) used a proportional hazards structure along with a linear predictor to allow estimation of the individual changes of the estimates. A more comprehensive sensitivity analysis for models in the presence of informative censoring was proposed by Siannis (2011) using the Cox proportional hazards model, which is more flexible than standard parametric survival models. In the context of competing risks, Lu and Tsiatis (2011) adopted auxiliary covariates to obtain estimators to quantify informative censoring that are more efficient than those that disregard these covariates. Freitas and Rodrigues (2013) considered the standard exponential cure rate model under informative censoring and investigated, through a simulation study, the impact of informative censoring on the coverage probabilities and lengths of the asymptotic confidence intervals of the parameters of interest. The objectives of this work are to fit a new parametric model to real survival data assuming that the failure and censoring times are conditionally independent, given a frailty (informative censoring mechanism), and to analyze the variations in the estimates of the parameters when considering the usual informative censoring methods.

For the assessment of model adequacy, we develop diagnostic studies to detect possible influential or extreme observations that can cause distortions to the results of the analysis. Further, we compare two types of residuals to assess departures from the error assumptions as well as to detect outlying observations in the log-odd logistic-Weibull (LOLLW) regression model with informative censoring.

The paper is organized as follows. In Section 2, we define the LOLLW regression model. In Section 3, we study the informative censoring mechanism in the location-scale regression model. In Section 4, we adopt several diagnostic measures under three perturbation schemes in the proposed regression model with informative censoring and we define two kinds of residuals from the fitted model to assess departures from the error distribution assumption and to detect outlying observations. In Section 5, we analyze a real data set to show the flexibility, practical relevance and applicability of our regression model. We offer some concluding remarks in Section 6.

2. The LOLLW regression model

Most generalized Weibull distributions have been proposed in the reliability literature to provide better fits to certain data sets than the traditional two- or three-parameter Weibull models. See, for example, the distributions discussed in Tables 1 and 2 of Pham and Lai (2007). Recently, da Cruz et al. (2015) introduced a three-parameter odd log-logistic Weibull (OLLW) distribution having probability density function (pdf)

$\displaystyle f(t;\alpha,\gamma,\lambda)=\frac{\alpha\gamma t^{\gamma-1}\left\{\exp\left[-\left(\frac{t}{\lambda}\right)^{\gamma}\right]\right\}^{\alpha}\left\{1-\exp\left[-\left(\frac{t}{\lambda}\right)^{\gamma}\right]\right\}^{\alpha-1}}{\lambda^{\gamma}\left[\left\{1-\exp\left[-\left(\frac{t}{\lambda}\right)^{\gamma}\right]\right\}^{\alpha}+\left\{\exp\left[-\left(\frac{t}{\lambda}\right)^{\gamma}\right]\right\}^{\alpha}\right]^{2}},\quad t>0,$ (1)

where $\lambda>0$ is a scale parameter and $\alpha>0$ and $\gamma>0$ are shape parameters. Henceforth, we denote by $T$ a random variable with pdf (1). The survival function of $T$ is

$\displaystyle S(t;\alpha,\gamma,\lambda)=1-\frac{\left\{1-\exp\left[-\left(\frac{t}{\lambda}\right)^{\gamma}\right]\right\}^{\alpha}}{\left\{1-\exp\left[-\left(\frac{t}{\lambda}\right)^{\gamma}\right]\right\}^{\alpha}+\left\{\exp\left[-\left(\frac{t}{\lambda}\right)^{\gamma}\right]\right\}^{\alpha}}.$

Then, the hazard rate function (hrf) of $T$ is $h(t;\alpha,\gamma,\lambda)=f(t;\alpha,\gamma,\lambda)/S(t;\alpha,\gamma,\lambda)$. The great flexibility of this model to fit lifetime data is due to the different forms of the hrf: (i) if $\alpha=1$, it is the Weibull hazard function; (ii) if $\alpha\in(0,1)$, it can be bathtub-shaped for some values of $\lambda$ and $\gamma$; (iii) if $\alpha>1$, it is unimodal for special combinations of $\lambda$ and $\gamma$.

Figure 1.

Plots of the OLLW density for some parameter values. (a) For different values of $\alpha$, $\gamma$ and $\lambda$. (b) For different values of $\alpha$ and $\lambda$ with $\gamma=2.45$.

Some plots of the density of $T$ for selected parameter values, including well-known distributions, are displayed in Fig. 1a and b. A characteristic of the OLLW distribution is that its pdf can be monotone (increasing or decreasing), unimodal, bimodal or increasing-decreasing-increasing, among other shapes, depending on the parameter values.
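For illustration, the density in Eq. (1), the survival function and the hrf can be evaluated with a short Python sketch. The function names are illustrative assumptions (they are not taken from any package), and the parameter values in the example call are arbitrary.

```python
import math

def _weibull_survival(t, gamma, lam):
    """Weibull survival exp[-(t/lambda)^gamma], the building block of Eq. (1)."""
    return math.exp(-((t / lam) ** gamma))

def ollw_pdf(t, alpha, gamma, lam):
    """OLLW density as written in Eq. (1)."""
    s = _weibull_survival(t, gamma, lam)
    num = alpha * gamma * t ** (gamma - 1) * s ** alpha * (1 - s) ** (alpha - 1)
    den = lam ** gamma * ((1 - s) ** alpha + s ** alpha) ** 2
    return num / den

def ollw_survival(t, alpha, gamma, lam):
    """OLLW survival function."""
    s = _weibull_survival(t, gamma, lam)
    return 1 - (1 - s) ** alpha / ((1 - s) ** alpha + s ** alpha)

def ollw_hazard(t, alpha, gamma, lam):
    """OLLW hazard rate h = f / S."""
    return ollw_pdf(t, alpha, gamma, lam) / ollw_survival(t, alpha, gamma, lam)

# Arbitrary example: for alpha = 1 the hrf should coincide with the Weibull hazard.
h_ollw = ollw_hazard(1.3, 1.0, 2.45, 2.0)
h_weibull = (2.45 / 2.0) * (1.3 / 2.0) ** (2.45 - 1)
```

Comparing `h_ollw` with `h_weibull` gives a quick check of case (i) above; other values of `alpha` can be plotted on a grid of `t` to explore the remaining shapes.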

Other works have also used the odd log-logistic family of distributions. For example, Mendoza et al. (2016) presented the exponentiated log-logistic geometric distribution with dual activation and Cordeiro et al. (2016) considered the odd log-logistic generalized half-normal lifetime distribution. Recently, da Silva et al. (2017) introduced the odd log-logistic Student $t$ distribution, da Cruz et al. (2017) proposed the bivariate odd log-logistic Weibull regression model for oral health-related quality of life, Prataviera et al. (2018a) presented a generalized odd log-logistic flexible Weibull regression model with applications in repairable systems and Prataviera et al. (2018b) defined the heteroscedastic odd log-logistic generalized gamma regression model for censored data.

Let $Y=\log(T)$ be a random variable having the LOLLW distribution. Recently, da Cruz et al. (2015) proposed the LOLLW regression model given by

$\displaystyle y_{i}=\log(t_{i})=\mu_{i}+\sigma z_{i},\quad i=1,\ldots,n,$ (2)

where $\mu_{i}=\mathbf{x}_{i}^{T}\bm{\beta}$, $\bm{\beta}=(\beta_{1},\ldots,\beta_{p})^{T}$, $\sigma>0$ and $\alpha>0$ are unknown parameters and $\mathbf{x}_{i}^{T}=(x_{i1},\ldots,x_{ip})$ is the explanatory variable vector modeling the linear predictor $\mu_{i}$. Hence, the linear predictor vector $\bm{\mu}=(\mu_{1},\ldots,\mu_{n})^{T}$ of the LOLLW regression model is simply $\bm{\mu}=\mathbf{X}\bm{\beta}$, where $\mathbf{X}=(\mathbf{x}_{1},\ldots,\mathbf{x}_{n})^{T}$ is a known model matrix. Equation (2) is also referred to as the log-location-scale or accelerated failure time model.

We define the standardized random variable $Z_{i}=(Y_{i}-\mu_{i})/\sigma$ . The density function of the response $Y_{i}$ can be expressed as

$\displaystyle f(y_{i}|\mathbf{x})=\frac{\alpha\exp(z_{i})\{\exp[-\exp(z_{i})]\}^{\alpha}\{1-\exp[-\exp(z_{i})]\}^{\alpha-1}}{\sigma[\{1-\exp[-\exp(z_{i})]\}^{\alpha}+\{\exp[-\exp(z_{i})]\}^{\alpha}]^{2}},$ (3)

where $z_{i}=(y_{i}-\mathbf{x}_{i}^{T}\bm{\beta})/\sigma$, $y_{i}\in\mathbb{R}$, $\alpha>0$ and $\sigma>0$.

The survival function, hrf and cumulative hrf of $Y_{i}$ are given by

$\displaystyle S(y_{i}|\mathbf{x})=1-\frac{\{1-\exp[-\exp(z_{i})]\}^{\alpha}}{\{1-\exp[-\exp(z_{i})]\}^{\alpha}+\{\exp[-\exp(z_{i})]\}^{\alpha}},$

$\displaystyle h(y_{i}|\mathbf{x})=\frac{\alpha\exp(z_{i})\{1-\exp[-\exp(z_{i})]\}^{\alpha-1}}{\sigma\{\{1-\exp[-\exp(z_{i})]\}^{\alpha}+\{\exp[-\exp(z_{i})]\}^{\alpha}\}}$

and

$\displaystyle H(y_{i}|\mathbf{x})=\alpha\exp(z_{i})+\log\left\{\{1-\exp[-\exp(z_{i})]\}^{\alpha}+\{\exp[-\exp(z_{i})]\}^{\alpha}\right\},$

respectively.
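As a numerical cross-check of these three functions, the sketch below evaluates them at one point and compares the cumulative hrf with $-\log S$. It writes $u=\exp[-\exp(z)]$, so that the survival function above equals $u^{\alpha}/\{(1-u)^{\alpha}+u^{\alpha}\}$ and the cumulative hrf is $\alpha\exp(z)$ plus the logarithm of that denominator. The function name and the numerical values are illustrative assumptions.

```python
import math

def lollw_quantities(y, mu, sigma, alpha):
    """Return (survival, hazard, cumulative hazard) of the LOLLW model at y."""
    z = (y - mu) / sigma
    u = math.exp(-math.exp(z))                 # exp[-exp(z)]
    one_minus_u = -math.expm1(-math.exp(z))    # 1 - u, computed stably
    denom = one_minus_u ** alpha + u ** alpha
    surv = 1.0 - one_minus_u ** alpha / denom
    hazard = alpha * math.exp(z) * one_minus_u ** (alpha - 1) / (sigma * denom)
    cum_hazard = alpha * math.exp(z) + math.log(denom)
    return surv, hazard, cum_hazard

# Arbitrary evaluation point and parameter values.
surv, hazard, cum_hazard = lollw_quantities(0.3, 0.5, 0.8, 1.7)
gap = cum_hazard + math.log(surv)   # should be numerically zero
```

A value of `gap` close to zero indicates that the three expressions are mutually consistent at the chosen point.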

3. Informative censoring in the regression model

In this section, we construct the marginal likelihood function under informative censoring, i.e., by considering that the censored times carry information about the times to failure (informative or dependent censoring). We follow the method described by Huang and Wolfe (2002), using the gamma distribution to build the frailty distribution and assuming that it acts in multiplicative form (Santos Jr., 2012). Under a right censoring mechanism, let $T$ be a random variable representing the failure time of an observation (time until the occurrence of the event of interest), and $C$ be another random variable, which represents the censoring time associated with this observation. For $i=1,\ldots,n$, the observed data consist of the pairs $(y_{i},\delta_{i})$, where $y_{i}=\min(T_{i}^{*},C_{i}^{*})$ is the logarithm of the time observed for the $i$th observation ($T_{i}^{*}=\log(T_{i})$, $C_{i}^{*}=\log(C_{i})$) and $\delta_{i}$ is the failure indicator, i.e., $\delta_{i}=1$ if $T_{i}^{*}\leqslant C_{i}^{*}$ and $\delta_{i}=0$ if $T_{i}^{*}>C_{i}^{*}$. Further, the density and survival functions of $T_{i}^{*}$ and $C_{i}^{*}$ are denoted by $f_{T^{*}}(y_{i};\bm{\kappa})$, $S_{T^{*}}(y_{i};\bm{\kappa})$, $f_{C^{*}}(c_{i}^{*};\bm{\nu})$ and $S_{C^{*}}(c_{i}^{*};\bm{\nu})$, respectively, where $\bm{\kappa}$ is the parameter vector associated with the failure time distribution and $\bm{\nu}$ is the parameter vector associated with the censoring time.

If an association exists between the failure and censoring times, the standard methods used to analyze censored data may not be robust, so a structure able to incorporate this dependence is needed. For this purpose, based on the work of Huang and Wolfe (2002), we consider that the random variables $T^{*}$ and $C^{*}$ are independent conditionally on a frailty $W$ (random effect). As in Santos Jr. (2012), we assume that $W\sim\text{Gamma}(\phi,\phi)$. Although other distributions can be used, such as the uniform, Weibull and log-normal (Vaupel & Yashin, 1983), the gamma frailty distribution is the most widely used in semi-parametric models, basically due to its algebraic convenience. Choosing a frailty distribution with both parameters equal to $\phi$, so that its mean is one and its variance is $1/\phi$, guarantees identifiability of the model.

Considering the model in which the frailty acts in a multiplicative form, the conditional hazard function for the logarithm of the failure times is

$\displaystyle h_{T^{*}}(y|\bm{\kappa},W)=h_{0}^{(T^{*})}(y|\bm{\kappa})W,$ (4)

where $h_{0}^{(T^{*})}(y|\bm{\kappa})$ is the baseline hazard function for the logarithm of the failure times. The conditional survival function is

$\displaystyle S_{T^{*}}(y|\bm{\kappa},W)=\exp\left(-WH_{0}^{(T^{*})}(y|\bm{\kappa})\right),$ (5)

where $H_{0}^{(T^{*})}(y|\bm{\kappa})$ is the baseline cumulative hazard function for $T^{*}$. Likewise, the hazard and survival functions can be defined for the censoring times. Hence, the conditional hazard function for the censoring times can be expressed as

$\displaystyle h_{C^{*}}(y|\bm{\nu},W)=h_{0}^{(C^{*})}(y|\bm{\nu})W,$ (6)

where $h_{0}^{(C^{*})}(y|\bm{\nu})$ is the baseline hazard function for the logarithm of the censoring times. The conditional survival function has the form

$\displaystyle S_{C^{*}}(y|\bm{\nu},W)=\exp\left(-WH_{0}^{(C^{*})}(y|\bm{\nu})\right),$ (7)

where $H_{0}^{(C^{*})}(y|\bm{\nu})$ is the baseline cumulative hazard function for $C^{*}$.

Also, the marginal survival can be expressed as

$\displaystyle S(y|\phi)=\int_{W}S(y,W)dW=\int_{W}S(y|W)f(W)dW.$

Then, considering that $W\sim\text{Gamma}(\phi,\phi)$ and $S(y|W)=\exp(-WH_{0}(y))$ , we have

$\displaystyle S(y|\phi)=\left(\frac{\phi}{\phi+H_{0}(y)}\right)^{\phi}.$ (8)
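Equation (8) can be checked numerically by integrating $\exp(-WH_{0}(y))$ against the gamma density with shape and rate both equal to $\phi$. The sketch below uses a composite Simpson rule written with the standard library only; the function names, the truncation point `upper` and the values $\phi=2$ and $H_{0}(y)=0.7$ are arbitrary assumptions made for the illustration (the rule as written drops the endpoint terms, which vanish for $\phi>1$).

```python
import math

def gamma_frailty_density(w, phi):
    """Gamma density with shape phi and rate phi (mean one), for w > 0."""
    log_f = phi * math.log(phi) - math.lgamma(phi) + (phi - 1) * math.log(w) - phi * w
    return math.exp(log_f)

def marginal_survival_numeric(cum_hazard, phi, upper=60.0, steps=20000):
    """Composite Simpson approximation of the integral over (0, upper).

    steps must be even; the endpoint terms are omitted because the
    integrand is zero at w = 0 for phi > 1 and negligible at w = upper.
    """
    h = upper / steps
    total = 0.0
    for k in range(1, steps):
        w = k * h
        weight = 4.0 if k % 2 else 2.0
        total += weight * math.exp(-w * cum_hazard) * gamma_frailty_density(w, phi)
    return total * h / 3.0

def marginal_survival_closed(cum_hazard, phi):
    """Closed form of Eq. (8)."""
    return (phi / (phi + cum_hazard)) ** phi

numeric = marginal_survival_numeric(0.7, 2.0)
closed = marginal_survival_closed(0.7, 2.0)
```

The two values should agree to several decimal places, which illustrates that Eq. (8) is the Laplace transform of the frailty density evaluated at $H_{0}(y)$.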

So, when considering the logarithms of the failure or censoring times, the marginal survival function changes only through the cumulative hazard function, which is determined by the distribution assumed for the corresponding times. The maximum likelihood estimators (MLEs) under the assumption that $T^{*}$ and $C^{*}$ are conditionally independent, given a frailty $W$, are found by maximizing the marginal likelihood function. By including the frailty $W$ in the joint distribution of $T_{i}^{*}$ and $C_{i}^{*}$, the marginal likelihood function has the form

$\displaystyle\begin{aligned}L(\bm{\kappa},\bm{\nu},\phi|D)&=\prod_{i=1}^{n}\int_{W_{i}}\{P[T_{i}^{*}=y_{i},C_{i}^{*}>y_{i},W_{i}]\}^{\delta_{i}}\{P[C_{i}^{*}=y_{i},T_{i}^{*}>y_{i},W_{i}]\}^{1-\delta_{i}}dW_{i}\\&=\prod_{i=1}^{n}\int_{W_{i}}\{P[T_{i}^{*}=y_{i},C_{i}^{*}>y_{i}|W_{i}]\}^{\delta_{i}}\{f(W_{i})\}^{\delta_{i}}\{P[C_{i}^{*}=y_{i},T_{i}^{*}>y_{i}|W_{i}]\}^{1-\delta_{i}}\{f(W_{i})\}^{1-\delta_{i}}dW_{i}\\&=\prod_{i=1}^{n}\int_{W_{i}}\{P[T_{i}^{*}=y_{i},C_{i}^{*}>y_{i}|W_{i}]\}^{\delta_{i}}\{P[C_{i}^{*}=y_{i},T_{i}^{*}>y_{i}|W_{i}]\}^{1-\delta_{i}}f(W_{i})dW_{i}.\end{aligned}$ (9)

By assuming that $T^{*}$ and $C^{*}$ are conditionally independent given the frailty $W$ , the marginal likelihood function Eq. (9) can be rewritten as

$\displaystyle L(\bm{\kappa},\bm{\nu},\phi|D)=\prod_{i=1}^{n}\int_{W_{i}}\{P[T_{i}^{*}=y_{i}|W_{i}]\}^{\delta_{i}}\{P[C_{i}^{*}>y_{i}|W_{i}]\}^{\delta_{i}}\{P[C_{i}^{*}=y_{i}|W_{i}]\}^{1-\delta_{i}}\{P[T_{i}^{*}>y_{i}|W_{i}]\}^{1-\delta_{i}}f(W_{i})dW_{i}.$ (10)

By substituting the functions Eqs (4)–(7) into Eq. (10) and considering the relations

$\displaystyle f_{T^{*}}(y_{i}|\bm{\kappa},W_{i})=h_{T^{*}}(y_{i}|\bm{\kappa},W_{i})S_{T^{*}}(y_{i}|\bm{\kappa},W_{i})\text{ and }f_{C^{*}}(y_{i}|\bm{\nu},W_{i})=h_{C^{*}}(y_{i}|\bm{\nu},W_{i})S_{C^{*}}(y_{i}|\bm{\nu},W_{i}),$

the likelihood function can be rewritten as

$\displaystyle L(\bm{\kappa},\bm{\nu},\phi|D)=\prod_{i=1}^{n}\int_{W_{i}}\{h_{T^{*}}(y_{i}|\bm{\kappa},W_{i})\}^{\delta_{i}}S_{T^{*}}(y_{i}|\bm{\kappa},W_{i})\{h_{C^{*}}(y_{i}|\bm{\nu},W_{i})\}^{1-\delta_{i}}S_{C^{*}}(y_{i}|\bm{\nu},W_{i})f(W_{i})dW_{i}.$

By substituting the density function $f(W)$ from the frailty $W\sim\text{Gamma}(\phi,\phi)$ , we obtain

$\displaystyle\begin{aligned}L(\bm{\kappa},\bm{\nu},\phi|D)&=\prod_{i=1}^{n}\int_{W_{i}}\{h_{T^{*}}(y_{i}|\bm{\kappa},W_{i})\}^{\delta_{i}}S_{T^{*}}(y_{i}|\bm{\kappa},W_{i})\{h_{C^{*}}(y_{i}|\bm{\nu},W_{i})\}^{1-\delta_{i}}S_{C^{*}}(y_{i}|\bm{\nu},W_{i})\frac{\phi^{\phi}}{\Gamma(\phi)}W_{i}^{\phi-1}\exp(-\phi W_{i})dW_{i}\\&=\prod_{i=1}^{n}\frac{\phi^{\phi}}{\Gamma(\phi)}\left\{h_{0}^{(T^{*})}(y_{i}|\bm{\kappa})\right\}^{\delta_{i}}\left\{h_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right\}^{1-\delta_{i}}\int_{W_{i}}W_{i}^{\phi}\exp\left\{-W_{i}\left[\phi+H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})+H_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right]\right\}dW_{i}.\end{aligned}$

The kernel of a gamma distribution appears in the integral, i.e.

$\displaystyle W_{i}\sim\text{Gamma}\left(\phi+1;\phi+H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})+H_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right).$

So, the marginal likelihood function reduces to

$\displaystyle L(\bm{\kappa},\bm{\nu},\phi|D)=\prod_{i=1}^{n}\left\{\frac{h_{0}^{(T^{*})}(y_{i}|\bm{\kappa})^{\delta_{i}}h_{0}^{(C^{*})}(y_{i}|\bm{\nu})^{1-\delta_{i}}\phi^{\phi}\Gamma(\phi+1)}{\Gamma(\phi)\left[\phi+H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})+H_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right]^{\phi+1}}\right\}.$ (11)
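The step leading to Eq. (11) rests on the standard gamma integral. As a worked check, with $A_{i}$ a shorthand introduced here only for the term in square brackets:

```latex
A_{i}=\phi+H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})+H_{0}^{(C^{*})}(y_{i}|\bm{\nu}),
\qquad
\int_{0}^{\infty}W_{i}^{\phi}\exp(-W_{i}A_{i})\,dW_{i}=\frac{\Gamma(\phi+1)}{A_{i}^{\phi+1}},
\qquad
\frac{\phi^{\phi}}{\Gamma(\phi)}\cdot\frac{\Gamma(\phi+1)}{A_{i}^{\phi+1}}
=\left(\frac{\phi}{A_{i}}\right)^{\phi+1}.
```

The last equality uses $\Gamma(\phi+1)=\phi\,\Gamma(\phi)$, so the frailty enters each likelihood contribution only through the ratio $\phi/A_{i}$ raised to the power $\phi+1$.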

The marginal likelihood function Eq. (11) has a closed form whenever the baseline hazard functions and cumulative hazard functions have closed forms. Hence, the log-likelihood function is

$\displaystyle l(\bm{\kappa},\bm{\nu},\phi|D)=\sum_{i=1}^{n}\left\{\delta_{i}\log\left[h_{0}^{(T^{*})}(y_{i}|\bm{\kappa})\right]+(1-\delta_{i})\log\left[h_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right]+\phi\log(\phi)+\log[\Gamma(\phi+1)]-\log[\Gamma(\phi)]-(\phi+1)\log\left[\phi+H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})+H_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right]\right\}.$

Consider a sample $(y_{1},\mathbf{x}_{1}),\ldots,(y_{n},\mathbf{x}_{n})$ of $n$ independent observations, where the random response is defined by $y_{i}=\min\{T_{i}^{*},C_{i}^{*}\}$. We assume informative censoring, with the LOLLW distribution of Section 2 for the log-lifetime and a LOLLW distribution without covariates for the log-censoring, which reduces to the log-Weibull when $\alpha^{(C^{*})}=1$. The log-likelihood function for the vector $\bm{\theta}=(\bm{\kappa}^{\top},\bm{\nu}^{\top},\phi)^{\top}$, where $\bm{\kappa}=\left(\alpha^{(T^{*})},\sigma^{(T^{*})},\bm{\beta}^{(T^{*})}\right)^{\top}$, $\bm{\nu}=\left(\alpha^{(C^{*})},\sigma^{(C^{*})},\mu^{(C^{*})}\right)^{\top}$ and $\bm{\beta}^{(T^{*})}=(\beta_{1},\ldots,\beta_{p})^{\top}$, is obtained from the models Eqs (4) and (6) as

$\displaystyle l(\bm{\theta}|D)=n\phi\log(\phi)+n\log[\Gamma(\phi+1)]-n\log[\Gamma(\phi)]+\sum_{i=1}^{n}\delta_{i}\log\left[h_{0}^{(T^{*})}(y_{i}|\bm{\kappa})\right]+\sum_{i=1}^{n}(1-\delta_{i})\log\left[h_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right]-(\phi+1)\sum_{i=1}^{n}\log\left[\phi+H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})+H_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right],$ (12)

where $n$ is the sample size (the terms depending on $\phi$ alone appear once for each of the $n$ observations, censored or not, as in the preceding sum),

$\displaystyle h_{0}^{(T^{*})}(y_{i}|\bm{\kappa})=\frac{\alpha^{(T^{*})}\exp\left(z_{i}^{(T^{*})}\right)\left(1-u_{i}^{(T^{*})}\right)^{\alpha^{(T^{*})}-1}}{\sigma^{(T^{*})}\left\{\left(1-u_{i}^{(T^{*})}\right)^{\alpha^{(T^{*})}}+\left[u_{i}^{(T^{*})}\right]^{\alpha^{(T^{*})}}\right\}},$

$\displaystyle h_{0}^{(C^{*})}(y_{i}|\bm{\nu})=\frac{\alpha^{(C^{*})}\exp\left(z_{i}^{(C^{*})}\right)\left(1-u_{i}^{(C^{*})}\right)^{\alpha^{(C^{*})}-1}}{\sigma^{(C^{*})}\left\{\left(1-u_{i}^{(C^{*})}\right)^{\alpha^{(C^{*})}}+\left[u_{i}^{(C^{*})}\right]^{\alpha^{(C^{*})}}\right\}},$

$\displaystyle H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})=\log\left\{\left(1-u_{i}^{(T^{*})}\right)^{\alpha^{(T^{*})}}+\left[u_{i}^{(T^{*})}\right]^{\alpha^{(T^{*})}}\right\}-\alpha^{(T^{*})}\log\left(u_{i}^{(T^{*})}\right),$

$\displaystyle H_{0}^{(C^{*})}(y_{i}|\bm{\nu})=\log\left\{\left(1-u_{i}^{(C^{*})}\right)^{\alpha^{(C^{*})}}+\left[u_{i}^{(C^{*})}\right]^{\alpha^{(C^{*})}}\right\}-\alpha^{(C^{*})}\log\left(u_{i}^{(C^{*})}\right),$

$\displaystyle u_{i}^{(T^{*})}=\exp\left\{-\exp\left[z_{i}^{(T^{*})}\right]\right\},\quad u_{i}^{(C^{*})}=\exp\left\{-\exp\left[z_{i}^{(C^{*})}\right]\right\},$

$\displaystyle z_{i}^{(T^{*})}=\frac{y_{i}-\mathbf{x}_{i}^{\top}\bm{\beta}^{(T^{*})}}{\sigma^{(T^{*})}}\text{ and }z_{i}^{(C^{*})}=\frac{y_{i}-\mu^{(C^{*})}}{\sigma^{(C^{*})}}.$

We consider the regression structure only in the log-lifetime, which is the main interest. Future research can be developed to add the regression structure simultaneously in both log-lifetime and log-censoring.

The log-likelihood can be maximized either directly, by using SAS (NLMixed procedure), R (optim) or the MaxBFGS routine in the matrix programming language Ox (Doornik, 2007), or by solving the nonlinear likelihood equations obtained by differentiating Eq. (12). Initial values for $\sigma^{(T^{*})}$, $\bm{\beta}^{(T^{*})}$, $\sigma^{(C^{*})}$, $\mu^{(C^{*})}$ and $\phi$ can be taken from the fit of the log-Weibull regression model with informative censoring, i.e., the model with $\alpha^{(T^{*})}=\alpha^{(C^{*})}=1$.
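As a hedged illustration of the objective function, the sketch below codes the log-likelihood by summing the logarithm of the per-observation terms of Eq. (11), using the standard library only. It is not the authors' implementation: the function names, the ordering of the parameter vector and the one-observation example are assumptions made here, and no safeguards against invalid parameter values are included.

```python
import math

def lollw_baseline(z, alpha, sigma):
    """Return the baseline hazard and cumulative hazard at standardized value z."""
    u = math.exp(-math.exp(z))                 # exp[-exp(z)]
    one_minus_u = -math.expm1(-math.exp(z))    # 1 - u, computed stably
    denom = one_minus_u ** alpha + u ** alpha
    h0 = alpha * math.exp(z) * one_minus_u ** (alpha - 1) / (sigma * denom)
    H0 = math.log(denom) + alpha * math.exp(z)  # log(denom) - alpha*log(u)
    return h0, H0

def lollw_ic_loglik(theta, y, delta, X):
    """Marginal log-likelihood under gamma frailty.

    Assumed ordering: theta = (alpha_T, sigma_T, beta_1..beta_p,
                               alpha_C, sigma_C, mu_C, phi).
    """
    p = len(X[0])
    a_t, s_t = theta[0], theta[1]
    beta = theta[2:2 + p]
    a_c, s_c, mu_c, phi = theta[2 + p:]
    # Terms in phi alone, added once per observation as in the log of Eq. (11).
    const = phi * math.log(phi) + math.lgamma(phi + 1) - math.lgamma(phi)
    total = 0.0
    for yi, di, xi in zip(y, delta, X):
        mu_t = sum(b * x for b, x in zip(beta, xi))
        h_t, H_t = lollw_baseline((yi - mu_t) / s_t, a_t, s_t)
        h_c, H_c = lollw_baseline((yi - mu_c) / s_c, a_c, s_c)
        total += const + di * math.log(h_t) + (1 - di) * math.log(h_c)
        total -= (phi + 1) * math.log(phi + H_t + H_c)
    return total

# One hypothetical observation with an intercept-only design.
theta_example = (1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0)
value = lollw_ic_loglik(theta_example, [0.2], [1], [[1.0]])
```

The negative of `lollw_ic_loglik` can then be handed to any general-purpose numerical optimizer, with the positivity constraints on the shape, scale and frailty parameters handled, for example, by optimizing on the log scale.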

Let $\hat{\bm{\theta}}$ be the MLE of $\bm{\theta}$ . Approximate confidence intervals and hypothesis tests on the model parameters require the $(p+6)\times(p+6)$ total observed information matrix $-\ddot{\mathbf{L}}(\bm{\theta})$ . Under general conditions, the asymptotic distribution of $(\hat{\bm{\theta}}-\bm{\theta})$ is $N_{p+6}(0,I(\bm{\theta})^{-1})$ , where $I(\bm{\theta})$ is the expected information matrix. In practice, we can replace $I(\bm{\theta})$ by $-\ddot{\mathbf{L}}(\hat{\bm{\theta}})$ , i.e. the observed information matrix evaluated at $\hat{\bm{\theta}}$ . The observed information matrix $-\ddot{\mathbf{L}}(\bm{\theta})$ can be obtained from the authors upon request.

We can construct approximate confidence intervals for the parameters based on the multivariate normal $N_{p+6}(0,\{-\ddot{\mathbf{L}}(\hat{\bm{\theta}})\}^{-1})$ distribution. Further, likelihood ratio (LR) statistics can be used to compare the LOLLW regression model with informative censoring and some of its sub-models. We can compute the maximum values of the unrestricted and restricted log-likelihoods to obtain LR statistics for testing some sub-models of the LOLLW regression model with informative censoring. For example, the test of $H_{0}:\alpha^{(T^{*})}=\alpha^{(C^{*})}=1$ versus $H:H_{0}$ is not true is equivalent to comparing the LOLLW and log-Weibull regression models with informative censoring. In this case, the LR statistic is $w=2\{\ell(\hat{\bm{\theta}})-\ell(\tilde{\bm{\theta}})\}$, where $\hat{\bm{\theta}}$ and $\tilde{\bm{\theta}}$ are the MLEs under $H$ and $H_{0}$, respectively. For large samples, $w$ has approximately a chi-square distribution with two degrees of freedom.
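For the two-degree-of-freedom case, the chi-square survival function has the closed form $\exp(-w/2)$, so the approximate $p$-value of this LR test needs no statistical library. The sketch below uses made-up maximized log-likelihood values purely for illustration; the function name is an assumption.

```python
import math

def lr_test_two_df(loglik_full, loglik_restricted):
    """LR statistic and asymptotic p-value for a test with two restrictions."""
    w = max(2.0 * (loglik_full - loglik_restricted), 0.0)
    p_value = math.exp(-w / 2.0)   # survival function of chi-square with 2 df
    return w, p_value

# Hypothetical maximized log-likelihoods (not results from the paper).
w_stat, p_val = lr_test_two_df(-100.0, -103.0)
```

A small `p_val` would favor the LOLLW model with informative censoring over its log-Weibull sub-model.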

4. Checking model: Diagnostic and residual analysis

An important step in the analysis of a fitted model is to check for possible deviations from the model assumptions. In this context, it is important to detect the presence of outliers in the data and to evaluate their impact on the inferential results. Therefore, an analysis of the residuals can help to validate the stability and robustness of the inferential results.

Influence diagnostics are important in the analysis of real data, since they can reveal an inadequate model fit or influential observations. Since regression models are sensitive to the underlying model assumptions, performing a sensitivity analysis is strongly advisable. Cook (1986) used this idea to motivate the assessment of influence. He suggested that more confidence can be placed in a model that is relatively stable under small modifications. Another approach, also suggested by Cook (1986), is to weight observations instead of removing them. Previous works on local influence curvatures in regression models for censored data are due to Escobar and Meeker (1992), Ortega et al. (2003, 2009, 2011), Silva et al. (2008), Silva et al. (2009) and Hashimoto et al. (2010). The calculation of local influences can be carried out for model Eq. (2) with informative censoring. If the likelihood displacement $\textit{LD}(\bm{\omega})=2\{l(\hat{\bm{\theta}})-l(\hat{\bm{\theta}}_{\bm{\omega}})\}$ is used, where $\hat{\bm{\theta}}_{\bm{\omega}}$ denotes the MLE under the perturbed model, then the normal curvature for $\bm{\theta}$ at direction $\mathbf{d}$, $\|\mathbf{d}\|=1$, is $C_{\mathbf{d}}(\bm{\theta})=2|\mathbf{d}^{T}\bm{\Delta}^{T}[\ddot{\mathbf{L}}(\bm{\theta})]^{-1}\bm{\Delta}\mathbf{d}|$, where $\bm{\Delta}$ is a $(p+6)\times n$ matrix which depends on the perturbation scheme. The elements of this matrix are given by $\Delta_{vi}=\partial^{2}l(\bm{\theta}|\bm{\omega})/\partial\theta_{v}\partial\omega_{i}$, $i=1,2,\ldots,n$ and $v=1,2,\ldots,p+6$, evaluated at $\hat{\bm{\theta}}$ and $\bm{\omega}_{0}$, where $\bm{\omega}_{0}$ is the no-perturbation vector. For the LOLLW regression model with informative censoring, the elements of $\ddot{\mathbf{L}}(\bm{\theta})$ can be obtained from the authors upon request.
We can also calculate the normal curvatures $C_{\mathbf{d}}(\bm{\kappa})$, $C_{\mathbf{d}}(\bm{\nu})$ and $C_{\mathbf{d}}(\phi)$ to produce various index plots, such as the index plot of $\mathbf{d}_{\max}$, the eigenvector corresponding to $C_{\mathbf{d}_{\max}}$, the largest eigenvalue of the matrix $\mathbf{B}=-\bm{\Delta}^{T}[\ddot{\mathbf{L}}(\bm{\theta})]^{-1}\bm{\Delta}$, and the index plots of $C_{\mathbf{d}_{i}}(\bm{\kappa})$, $C_{\mathbf{d}_{i}}(\bm{\nu})$ and $C_{\mathbf{d}_{i}}(\phi)$, which are together called the total local influence (Lesaffre & Verbeke, 1998), where $\mathbf{d}_{i}$ denotes an $n\times 1$ vector of zeros with one at the $i$th position. Thus, the curvature at the direction $\mathbf{d}_{i}$ has the form $C_{i}=2|\bm{\Delta}_{i}^{T}[\ddot{\mathbf{L}}(\bm{\theta})]^{-1}\bm{\Delta}_{i}|$, where $\bm{\Delta}_{i}$ denotes the $i$th column of $\bm{\Delta}$. It is common practice to flag cases for which $C_{i}\geqslant 2\bar{C}$, where $\bar{C}=\frac{1}{n}\sum_{i=1}^{n}C_{i}$.
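Once $\bm{\Delta}$ and $\ddot{\mathbf{L}}(\hat{\bm{\theta}})$ are available, these quantities reduce to a few lines of linear algebra. The sketch below assumes NumPy is available; the function name and the small matrices in the example are made up for illustration and do not come from the fitted model.

```python
import numpy as np

def local_influence(delta_mat, hessian):
    """Total local influence C_i, direction d_max and flagged cases.

    delta_mat: (k x n) matrix Delta; hessian: (k x k) matrix L-ddot at the MLE.
    """
    solved = np.linalg.solve(hessian, delta_mat)      # [L-ddot]^{-1} Delta
    b_mat = -delta_mat.T @ solved                     # B = -Delta^T [L-ddot]^{-1} Delta
    c_i = 2.0 * np.abs(np.diag(b_mat))                # C_i = 2|Delta_i^T [L-ddot]^{-1} Delta_i|
    # Symmetrize before the eigen-decomposition to guard against rounding.
    eigval, eigvec = np.linalg.eigh((b_mat + b_mat.T) / 2.0)
    d_max = eigvec[:, np.argmax(eigval)]              # eigenvector of the largest eigenvalue
    flagged = np.where(c_i >= 2.0 * c_i.mean())[0]    # cases with C_i >= 2 * mean(C)
    return c_i, d_max, flagged

# Toy example: two parameters, three observations (made-up numbers).
toy_hessian = -np.eye(2)
toy_delta = np.array([[1.0, 0.0, 0.0],
                      [0.0, 2.0, 0.0]])
c_i, d_max, flagged = local_influence(toy_delta, toy_hessian)
```

Index plots of `c_i` and of the absolute values of `d_max` against the observation index then show which cases deserve closer inspection.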

Next, for three perturbation schemes, we calculate the matrix:

$\displaystyle\bm{\Delta}=(\Delta_{vi})_{(p+6)\times n}=\left(\frac{\partial^{2}l(\bm{\theta}|\bm{\omega})}{\partial\theta_{v}\partial\omega_{i}}\right)_{(p+6)\times n},\quad v=1,\ldots,p+6\text{ and }i=1,\ldots,n.$

We consider the model Eq. (2) with informative censoring and its log-likelihood function Eq. (12). Let $\bm{\omega}=(\omega_{1},\ldots,\omega_{n})^{T}$ be the vector of weights.

4.1 Case-weight perturbation

In this case, the log-likelihood function has the form

$\displaystyle l(\bm{\theta}|\bm{\omega})=\{\phi\log(\phi)+\log[\Gamma(\phi+1)]-\log[\Gamma(\phi)]\}\sum_{i=1}^{n}\omega_{i}+\sum_{i=1}^{n}\delta_{i}\omega_{i}\log\left[h_{0}^{(T^{*})}(y_{i}|\bm{\kappa})\right]+\sum_{i=1}^{n}(1-\delta_{i})\omega_{i}\log\left[h_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right]-(\phi+1)\sum_{i=1}^{n}\omega_{i}\log\left[\phi+H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})+H_{0}^{(C^{*})}(y_{i}|\bm{\nu})\right],$

where $0\leqslant\omega_{i}\leqslant 1$, $\bm{\omega}_{0}=(1,\ldots,1)^{T}$ and $h_{0}^{(T^{*})}$, $h_{0}^{(C^{*})}$, $H_{0}^{(T^{*})}$ and $H_{0}^{(C^{*})}$ are the functions defined after Eq. (12). Here, $\bm{\Delta}=(\bm{\Delta}_{\bm{\kappa}}^{\top},\bm{\Delta}_{\bm{\nu}}^{\top},\bm{\Delta}_{\phi}^{\top})^{\top}$ can be calculated numerically.
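One way to carry out this numerical calculation, assuming the perturbed log-likelihood is the weighted sum $\sum_{i}\omega_{i}l_{i}(\bm{\theta})$ of per-observation contributions $l_{i}$, is to note that $\Delta_{vi}$ then equals $\partial l_{i}/\partial\theta_{v}$ at the MLE and to approximate it by central differences. The sketch below is generic: `per_obs_loglik` is a hypothetical user-supplied function, and the quadratic toy contribution is used only so that the result can be checked by hand.

```python
def case_weight_delta(per_obs_loglik, theta_hat, n, step=1e-6):
    """Central-difference approximation of Delta under case-weight perturbation.

    per_obs_loglik(theta, i) must return the contribution l_i(theta) of
    observation i; the result is a list of k rows with n entries each.
    """
    k = len(theta_hat)
    delta = [[0.0] * n for _ in range(k)]
    for v in range(k):
        up = list(theta_hat)
        down = list(theta_hat)
        up[v] += step
        down[v] -= step
        for i in range(n):
            diff = per_obs_loglik(up, i) - per_obs_loglik(down, i)
            delta[v][i] = diff / (2.0 * step)
    return delta

# Toy check with l_i(theta) = -(y_i - theta_0)^2 / 2, whose derivative is y_i - theta_0.
toy_y = [1.0, 2.0, 4.0]
toy_delta = case_weight_delta(lambda th, i: -0.5 * (toy_y[i] - th[0]) ** 2, [2.0], 3)
```

For the model considered here, `per_obs_loglik` would return the $i$th summand of the log-likelihood evaluated at the given parameter vector.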

4.2 Response perturbation

Next, we consider that each $y_{i}$ is perturbed as $y_{iw}=y_{i}+\omega_{i}S_{y}$ , where $S_{y}$ is a scale factor that may be estimated by the standard deviation of the observed response $y$ and $\omega_{i}\in\mathbb{R}$ . The perturbed log-likelihood function can be expressed as

$\displaystyle l(\bm{\theta}|\bm{\omega})=n\phi\log(\phi)+n\log[\Gamma(\phi+1)]-n\log[\Gamma(\phi)]+\sum_{i=1}^{n}\delta_{i}\log\left[h_{0}^{\dagger(T^{*})}(y_{i}|\bm{\kappa})\right]+\sum_{i=1}^{n}(1-\delta_{i})\log\left[h_{0}^{\dagger(C^{*})}(y_{i}|\bm{\nu})\right]-(\phi+1)\sum_{i=1}^{n}\log\left[\phi+H_{0}^{\dagger(T^{*})}(y_{i}|\bm{\kappa})+H_{0}^{\dagger(C^{*})}(y_{i}|\bm{\nu})\right],$

where

$\displaystyle h_{0}^{\dagger(T^{*})}(y_{i}|\bm{\kappa})=\frac{\alpha^{(T^{*})}\exp\left(z_{i}^{\dagger(T^{*})}\right)\left(1-u_{i}^{\dagger(T^{*})}\right)^{\alpha^{(T^{*})}-1}}{\sigma^{(T^{*})}\left\{\left(1-u_{i}^{\dagger(T^{*})}\right)^{\alpha^{(T^{*})}}+\left[u_{i}^{\dagger(T^{*})}\right]^{\alpha^{(T^{*})}}\right\}},$

$\displaystyle h_{0}^{\dagger(C^{*})}(y_{i}|\bm{\nu})=\frac{\alpha^{(C^{*})}\exp\left(z_{i}^{\dagger(C^{*})}\right)\left(1-u_{i}^{\dagger(C^{*})}\right)^{\alpha^{(C^{*})}-1}}{\sigma^{(C^{*})}\left\{\left(1-u_{i}^{\dagger(C^{*})}\right)^{\alpha^{(C^{*})}}+\left[u_{i}^{\dagger(C^{*})}\right]^{\alpha^{(C^{*})}}\right\}},$

$\displaystyle H_{0}^{\dagger(T^{*})}(y_{i}|\bm{\kappa})=\log\left\{\left(1-u_{i}^{\dagger(T^{*})}\right)^{\alpha^{(T^{*})}}+\left[u_{i}^{\dagger(T^{*})}\right]^{\alpha^{(T^{*})}}\right\}-\alpha^{(T^{*})}\log\left(u_{i}^{\dagger(T^{*})}\right),$

$\displaystyle H_{0}^{\dagger(C^{*})}(y_{i}|\bm{\nu})=\log\left\{\left(1-u_{i}^{\dagger(C^{*})}\right)^{\alpha^{(C^{*})}}+\left[u_{i}^{\dagger(C^{*})}\right]^{\alpha^{(C^{*})}}\right\}-\alpha^{(C^{*})}\log\left(u_{i}^{\dagger(C^{*})}\right),$

$\displaystyle u_{i}^{\dagger(T^{*})}=\exp\left\{-\exp\left[z_{i}^{\dagger(T^{*})}\right]\right\},\quad u_{i}^{\dagger(C^{*})}=\exp\left\{-\exp\left[z_{i}^{\dagger(C^{*})}\right]\right\},$

$\displaystyle z_{i}^{\dagger(T^{*})}=\frac{(y_{i}+\omega_{i}S_{y})-\mathbf{x}_{i}^{\top}\bm{\beta}^{(T^{*})}}{\sigma^{(T^{*})}},\quad z_{i}^{\dagger(C^{*})}=\frac{(y_{i}+\omega_{i}S_{y})-\mu^{(C^{*})}}{\sigma^{(C^{*})}},$

and $\bm{\omega}_{0}=(0,\ldots,0)^{T}$ . The matrix $\bm{\Delta}=(\bm{\Delta}_{\bm{\kappa}}^{\top},\bm{\Delta}_{\bm{\nu}}^{\top},% \bm{\Delta}_{\phi}^{\top})^{\top}$ is found numerically.
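The perturbed baseline quantities above can be evaluated directly. The following is a minimal sketch, not the authors' implementation: the function name `lollw_baseline` and its argument names are hypothetical, and `mu` stands for either $\mathbf{x}_{i}^{\top}\bm{\beta}^{(T^{*})}$ or $\mu^{(C^{*})}$. Setting `omega=0` recovers the unperturbed functions.

```python
import numpy as np

def lollw_baseline(y, mu, sigma, alpha, omega=0.0, s_y=1.0):
    """Baseline hazard h0 and cumulative hazard H0 under response perturbation.

    y     : observed log-time(s)
    mu    : location (x_i' beta for failure times, mu_c for censoring times)
    sigma : scale parameter
    alpha : extra shape parameter of the odd log-logistic generator
    omega : perturbation added to the response, scaled by s_y (assumed names)
    """
    z = ((np.asarray(y, dtype=float) + omega * s_y) - mu) / sigma
    ez = np.exp(z)
    u = np.exp(-ez)                       # u = exp{-exp(z)}
    denom = (1.0 - u) ** alpha + u ** alpha
    h0 = alpha * ez * (1.0 - u) ** (alpha - 1.0) / (sigma * denom)
    # log(u) = -exp(z), so -alpha*log(u) = alpha*exp(z)
    H0 = np.log(denom) + alpha * ez
    return h0, H0
```

With `alpha = 1` the cumulative hazard reduces to $\exp(z)$, the extreme-value (log-Weibull) case, which gives a quick sanity check of the implementation.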

4.3 Explanatory variable perturbation

Consider now an additive perturbation on a particular continuous explanatory variable, say $X_{q}$ , by setting $x_{iq\omega}=x_{iq}+\omega_{i}S_{q}$ , where $S_{q}$ is a scale factor and $\omega_{i}\in\mathbb{R}$ . The perturbed log-likelihood function is

$\displaystyle l(\bm{\theta}|\bm{\omega})=r\phi\log(\phi)+r\log[\Gamma(\phi+1)]% -r\log[\Gamma(\phi)]+\sum_{i=1}^{n}\delta_{i}\log\left[h_{0}^{{\dagger}{% \dagger}(T^{*})}(y_{i}|\bm{\kappa})\right]+\sum_{i=1}^{n}(1-\delta_{i})\log% \left[h_{0}^{{\dagger}{\dagger}(C^{*})}(y_{i}|\bm{\nu})\right]-(\phi+1)\sum_{i% =1}^{n}\log\left[\phi+H_{0}^{{\dagger}{\dagger}(T^{*})}(y_{i}|\bm{\kappa})+H_{% 0}^{{\dagger}{\dagger}(C^{*})}(y_{i}|\bm{\nu})\right],$

where

$\displaystyle h_{0}^{{\dagger}{\dagger}(T^{*})}(y_{i}|\bm{\kappa})=\frac{\alpha^{(T^{*})}\exp\left(z_{i}^{{\dagger}{\dagger}(T^{*})}\right)\left(1-u_{i}^{{\dagger}{\dagger}(T^{*})}\right)^{\alpha^{(T^{*})}-1}}{\sigma^{(T^{*})}\left\{\left(1-u_{i}^{{\dagger}{\dagger}(T^{*})}\right)^{\alpha^{(T^{*})}}+\left[u_{i}^{{\dagger}{\dagger}(T^{*})}\right]^{\alpha^{(T^{*})}}\right\}},$

$\displaystyle h_{0}^{{\dagger}{\dagger}(C^{*})}(y_{i}|\bm{\nu})=\frac{\alpha^{(C^{*})}\exp\left(z_{i}^{{\dagger}{\dagger}(C^{*})}\right)\left(1-u_{i}^{{\dagger}{\dagger}(C^{*})}\right)^{\alpha^{(C^{*})}-1}}{\sigma^{(C^{*})}\left\{\left(1-u_{i}^{{\dagger}{\dagger}(C^{*})}\right)^{\alpha^{(C^{*})}}+\left[u_{i}^{{\dagger}{\dagger}(C^{*})}\right]^{\alpha^{(C^{*})}}\right\}},$

$\displaystyle H_{0}^{{\dagger}{\dagger}(T^{*})}(y_{i}|\bm{\kappa})=\log\left\{\left(1-u_{i}^{{\dagger}{\dagger}(T^{*})}\right)^{\alpha^{(T^{*})}}+\left[u_{i}^{{\dagger}{\dagger}(T^{*})}\right]^{\alpha^{(T^{*})}}\right\}-\alpha^{(T^{*})}\log\left(u_{i}^{{\dagger}{\dagger}(T^{*})}\right),$

$\displaystyle H_{0}^{{\dagger}{\dagger}(C^{*})}(y_{i}|\bm{\nu})=\log\left\{\left(1-u_{i}^{{\dagger}{\dagger}(C^{*})}\right)^{\alpha^{(C^{*})}}+\left[u_{i}^{{\dagger}{\dagger}(C^{*})}\right]^{\alpha^{(C^{*})}}\right\}-\alpha^{(C^{*})}\log\left(u_{i}^{{\dagger}{\dagger}(C^{*})}\right),$

$\displaystyle u_{i}^{{\dagger}{\dagger}(T^{*})}=\exp\left\{-\exp\left[z_{i}^{{\dagger}{\dagger}(T^{*})}\right]\right\},\quad u_{i}^{{\dagger}{\dagger}(C^{*})}=\exp\left\{-\exp\left[z_{i}^{{\dagger}{\dagger}(C^{*})}\right]\right\},$

$\displaystyle z_{i}^{{\dagger}{\dagger}(T^{*})}=\frac{y_{i}-\mathbf{x}_{i}^{{\dagger}{\dagger}\top}\bm{\beta}^{(T^{*})}}{\sigma^{(T^{*})}},\quad z_{i}^{{\dagger}{\dagger}(C^{*})}=\frac{y_{i}-\mu^{(C^{*})}}{\sigma^{(C^{*})}},$

$\mathbf{x}_{i}^{{\dagger}{\dagger}\top}\bm{\beta}=\beta_{1}+\beta_{2}x_{i2}+% \ldots+\beta_{q}(x_{iq}+\omega_{i}S_{q})+\ldots+\beta_{p}x_{ip}$ and $\bm{\omega}_{0}=(0,\ldots,0)^{T}$ . The matrix $\bm{\Delta}=(\bm{\Delta}_{\bm{\kappa}}^{\top},\bm{\Delta}_{\bm{\nu}}^{\top},% \bm{\Delta}_{\phi}^{\top})^{\top}$ is determined numerically.
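Since $\bm{\Delta}$ is obtained numerically, one possible route is a central finite-difference approximation of the mixed derivatives $\partial^{2}l(\bm{\theta}|\bm{\omega})/\partial\theta_{j}\partial\omega_{i}$ evaluated at $(\hat{\bm{\theta}},\bm{\omega}_{0})$. The sketch below is only an illustration under that assumption; the paper does not state which numerical scheme was used, and the names `delta_matrix`, `loglik` and `eps` are hypothetical.

```python
import numpy as np

def delta_matrix(loglik, theta_hat, omega0, eps=1e-4):
    """Approximate Delta[j, i] = d^2 l / (d theta_j d omega_i).

    loglik    : callable (theta, omega) -> perturbed log-likelihood (user supplied)
    theta_hat : parameter estimates, length p
    omega0    : no-perturbation vector, length n
    eps       : finite-difference step (assumed value; tune to the problem scale)
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    omega0 = np.asarray(omega0, dtype=float)
    p, n = theta_hat.size, omega0.size
    delta = np.empty((p, n))

    def shifted(j, i, s_theta, s_omega):
        # Evaluate l with theta_j and omega_i moved by +/- eps.
        theta = theta_hat.copy()
        omega = omega0.copy()
        theta[j] += s_theta * eps
        omega[i] += s_omega * eps
        return loglik(theta, omega)

    for j in range(p):
        for i in range(n):
            delta[j, i] = (
                shifted(j, i, 1, 1) - shifted(j, i, 1, -1)
                - shifted(j, i, -1, 1) + shifted(j, i, -1, -1)
            ) / (4.0 * eps ** 2)
    return delta
```

The same routine applies to any of the three perturbation schemes, because only the callable `loglik` changes.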

The assessment of the fitted model is an important part of data analysis, particularly in regression models, and residual analysis is a helpful tool to validate the fitted model. Examination of residuals can be used, for instance, to detect the presence of outlying observations, the absence of components in the systematic part of the model and departures from the error and variance assumptions. However, finding appropriate residuals in non-normal regression models has been an important topic of research, particularly under censoring. For more details, see Ortega et al. (2009), Hashimoto et al. (2010) and Silva et al. (2011).

4.4 Martingale residual

In parametric lifetime models with informative censoring, the martingale residual can be expressed as

$\displaystyle r_{M_{i}}=\delta_{i}\left\{1+\log\left[S^{(T^{*})}(y_{i}|\bm{\kappa},\phi)\right]\right\}+(1-\delta_{i})\left\{\log\left[S^{(C^{*})}(y_{i}|\bm{\nu},\phi)\right]\right\},$

where $\delta_{i}=1$ indicates that the observation is uncensored and $\delta_{i}=0$ indicates that it is censored, and $S^{(T^{*})}(y|\bm{\kappa},\phi)$ and $S^{(C^{*})}(y|\bm{\nu},\phi)$ are the survival functions for the failure and censoring times calculated from Eqs (5) and (7), respectively.

Thus, the martingale residual for the LOLLW regression model with informative censoring takes the form

$\displaystyle r_{M_{i}}=\left\{\begin{array}[]{ll}1+\phi\log\left\{\frac{\phi}% {\phi+H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})}\right\}&\text{ if }\delta_{i}=1,\\ \phi\log\left\{\frac{\phi}{\phi+H_{0}^{(C^{*})}(y_{i}|\bm{\nu})}\right\}&\text% { if }\delta_{i}=0,\end{array}\right.$ (13)

where $H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})$ and $H_{0}^{(C^{*})}(y_{i}|\bm{\nu})$ are defined in Section 3.
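Equation (13) translates directly into code. The sketch below assumes that the cumulative baseline hazards have already been evaluated at the fitted parameters; the function and argument names are hypothetical.

```python
import numpy as np

def martingale_residuals(delta, H_T, H_C, phi):
    """Martingale residuals of Eq. (13).

    delta : censoring indicators (1 = failure, 0 = censored)
    H_T   : baseline cumulative hazard of the failure times at y_i
    H_C   : baseline cumulative hazard of the censoring times at y_i
    phi   : frailty parameter
    """
    delta = np.asarray(delta)
    H_T = np.asarray(H_T, dtype=float)
    H_C = np.asarray(H_C, dtype=float)
    r_fail = 1.0 + phi * np.log(phi / (phi + H_T))   # delta_i = 1
    r_cens = phi * np.log(phi / (phi + H_C))         # delta_i = 0
    return np.where(delta == 1, r_fail, r_cens)
```

By construction the residual is at most 1 for a failure and at most 0 for a censored observation, which explains the strong asymmetry that motivates the deviance transformation.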

4.5 Modified deviance residual

Another possibility is to use a transformation of the martingale residual based on the deviance component residual for the Cox proportional hazard model with no time-dependent explanatory variables as introduced by Therneau et al. (1990), and defined by

$\displaystyle r_{D_{i}}=\text{sign}(r_{M_{i}})\left\{-2\left[r_{M_{i}}+\delta_% {i}\log\left(\delta_{i}-r_{M_{i}}\right)\right]\right\}^{1/2},$

where $r_{M_{i}}$ is the martingale residual presented in Eq. (13). Thus, the modified deviance residual for the LOLLW regression model with informative censoring takes the form

$\displaystyle r_{D_{i}}=\left\{\begin{array}[]{ll}\text{sign}(\hat{r}_{M_{i}})% \left\{-2\left[1+\log\left(\frac{\phi}{\phi+H_{0}^{(T^{*})}(y_{i}|\bm{\kappa})% }\right)^{\phi}+\log\left\{-\phi\log\left(\frac{\phi}{\phi+H_{0}^{(T^{*})}(y_{% i}|\bm{\kappa})}\right)\right\}\right]\right\}^{1/2}&\text{ if }\delta_{i}=1,% \\ \text{sign}(\hat{r}_{M_{i}})\left\{-2\phi\log\left(\frac{\phi}{\phi+H_{0}^{(C^% {*})}(y_{i}|\bm{\nu})}\right)\right\}^{1/2}&\text{ if }\delta_{i}=0.\end{array% }\right.$

We use this transformation to obtain a new residual symmetrically distributed around zero.
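The transformation of Therneau et al. (1990) can be applied to any vector of martingale residuals. The following is a minimal sketch with hypothetical names; the term $\delta_{i}\log(\delta_{i}-r_{M_{i}})$ is evaluated only for failures, so that censored cases with $r_{M_{i}}=0$ do not produce an undefined $0\times\log 0$.

```python
import numpy as np

def deviance_residuals(r_m, delta):
    """Modified deviance residuals computed from martingale residuals.

    r_m   : martingale residuals (r_m < 1 for failures, r_m <= 0 if censored)
    delta : censoring indicators (1 = failure, 0 = censored)
    """
    r_m = np.asarray(r_m, dtype=float)
    fail = np.asarray(delta) == 1
    log_term = np.zeros_like(r_m)
    log_term[fail] = np.log(1.0 - r_m[fail])   # delta_i * log(delta_i - r_Mi)
    inner = -2.0 * (r_m + log_term)
    # inner is non-negative in theory; the clip guards against rounding error
    return np.sign(r_m) * np.sqrt(np.maximum(inner, 0.0))
```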

5. Application

In this section, we analyze the informative censoring mechanism by means of a real data set from the book by Collett (2003). During treatment for leukemia, patients often undergo bone marrow transplantation to help bring their blood cell count to a normal level. However, this can trigger a potentially fatal side effect, graft-versus-host disease, in which the transplanted cells attack the host cells. The data come from $n=37$ patients who were in remission from acute myeloid leukemia (AML) or acute lymphocytic leukemia (ALL), or were suffering from chronic myeloid leukemia (CML), and who received a non-depleted allogeneic bone marrow transplant. In this application, we consider the following variables:

• $t_{i}$: survival time (in days) of patients who were in complete remission from AML or ALL, or in the chronic phase of CML, and received a non-depleted allogeneic bone marrow transplant;
• $\delta_{i}$: censoring indicator (0 $=$ censored, 1 $=$ failure);
• $x_{i1}$: pregnancy of the donor (0 $=$ no, 1 $=$ yes);
• $x_{i2}$: graft-versus-host disease (0 $=$ no, 1 $=$ yes).

5.1 Descriptive analysis

For these 37 patients, the event of interest occurred in 17 of them, i.e., 46% of the observations failed and 54% were censored. Some descriptive measures are given in Table 1 for the bone marrow transplant data considering failure and censoring times separately.

Table 1
Descriptive statistics for the bone marrow transplant data

	Mean	Median	SD	Skewness	Kurtosis	Min.	Max.
Failure times	271.82	142	306.91	1.76	2.18	41	1181
Censoring times	1008.05	1006	319.16	0.12	$-$ 1.52	572	1504

Figure 2 displays the histograms for all observed times, the failure times only and for the censoring times only. For the three cases it is difficult to choose a distribution, since they do not have known forms.

Table 2 gives the MLEs, their standard errors (SEs) in parentheses and the values of the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Consistent Akaike Information Criterion (CAIC). For the censoring times, the AIC, CAIC and BIC values are lower for the Weibull distribution, thus indicating a better fit than the OLLW model. However, for the failure times, the OLLW model provides the lowest values of these statistics.
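For reference, the three criteria can be computed from the maximized log-likelihood as sketched below. The text does not spell out the formulas, so the definitions used here are assumptions: AIC $=-2\hat{l}+2k$, BIC $=-2\hat{l}+k\log n$ and CAIC $=\text{AIC}+2k(k+1)/(n-k-1)$, where $k$ is the number of parameters and $n$ the sample size. The function name is hypothetical.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, BIC, CAIC) under the assumed definitions.

    loglik : maximized log-likelihood
    k      : number of estimated parameters
    n      : sample size (must exceed k + 1 for CAIC)
    """
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * math.log(n)
    caic = aic + 2.0 * k * (k + 1) / (n - k - 1)
    return aic, bic, caic
```

Lower values indicate a better trade-off between fit and complexity, which is how the comparison between the Weibull and OLLW fits is read.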

Figure 3a and b display the histograms of the censoring and failure times, respectively, together with the fitted Weibull and OLLW densities. Figure 3a shows that the Weibull distribution yields a better fit for the censoring times than the OLLW model. However, Fig. 3b reveals that the OLLW distribution gives a better fit to the failure times. Based on the plots in Fig. 3, it is reasonable to consider that the times to the events follow the OLLW distribution.

Table 2

MLEs, SEs (in parentheses) and the AIC, CAIC and BIC statistics for the Weibull and OLLW models under censoring and failure times fitted to the bone marrow transplant data

Model	$\gamma$	$\lambda$	$\alpha$	AIC	CAIC	BIC
Censoring times
Weibull	3.6468	1121.06	1	289.8	290.1	293.0
	(0.6518)	(72.6025)	(–)
OLLW	7.5219	1152.68	0.4348	290.4	291.1	295.5
	(4.1264)	(84.0535)	(0.2704)
Failure times
Weibull	1.0581	278.0	1	228.5	229.3	230.1
	(0.1857)	(67.6267)	(–)
OLLW	0.4829	279.95	3.9955	219.6	221.5	222.1
	(0.5308)	(37.1882)	(4.4379)

Figure 2.

Histograms for the: (a) Observed times (full sample). (b) Failure times. (c) Censoring times.

Next, we fit the LOLLW regression model with informative censoring

$\displaystyle y_{i}=\beta_{0}+\beta_{1}x_{i1}+\beta_{2}x_{i2}+\sigma z_{i},$

where the variable $Y_{i}$ has the LOLLW distribution given by Eq. (3) (for $i=1,2,\ldots,37$ ).

Table 3 lists the MLEs for the LOLLW regression model with non-informative and informative censoring. The covariate $x_{i2}$ becomes significant at the 5% level when informative censoring is considered. Also, the SEs of the MLEs are much smaller under informative censoring, thus indicating a better fit in this case.

Table 3

MLEs and their SEs for the LOLLW regression model fitted to the bone marrow transplant data under non-informative and informative censoring

	Non-informative			Informative
$\bm{\theta}$	MLE	SE	$p$ -value	MLE	SE	$p$ -value
$\alpha$	0.2084	0.2820	–	0.0973	0.0188	–
$\sigma$	0.2357	0.3153	–	0.1304	0.0098	–
$\mu_{c}$	–	–	–	7.0042	0.0713	–
$\sigma_{c}$	–	–	–	0.2457	0.0434	–
$\phi$	–	–	–	5.2310	3.9063	–
$\beta_{0}$	8.0774	0.5790	$<$ 0.001	9.2274	0.0032	$<$ 0.001
$\beta_{1}$	$-$ 1.0546	0.4662	0.0304	$-$ 0.9774	0.3430	$<$ 0.001
$\beta_{2}$	$-$ 1.1286	0.6205	0.0780	$-$ 2.2651	0.3216	$<$ 0.001

Figure 3.

The fitted Weibull and OLLW densities. (a) Censoring times and (b) Failure times.

5.2 Comparing non-nested models

Note that the LOLLW regression model with informative censoring and the LOLLW regression model with non-informative censoring are non-nested. A generalized LR statistic for discriminating among non-nested models is discussed in the book by Cameron and Trivedi (1998, p. 184). Consider two non-nested models: model $F_{\theta}$ with density function $f(y_{i}|\mathbf{x}_{i},\bm{\theta})$ and model $G_{\gamma}$ with density function $g(y_{i}|\mathbf{x}_{i},\bm{\gamma})$ . The statistic, which is based on a distance between the two models measured in terms of the Kullback-Leibler information criterion, is

$\displaystyle T_{\textit{LR},\textit{NN}}=\left\{\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\log\frac{f(y_{i}|\mathbf{x}_{i},\hat{\bm{\theta}})}{g(y_{i}|\mathbf{x}_{i},\hat{\bm{\gamma}})}\right\}\div\left\{\frac{1}{n}\sum_{i=1}^{n}\left(\log\frac{f(y_{i}|\mathbf{x}_{i},\hat{\bm{\theta}})}{g(y_{i}|\mathbf{x}_{i},\hat{\bm{\gamma}})}\right)^{2}-\left(\frac{1}{n}\sum_{i=1}^{n}\log\frac{f(y_{i}|\mathbf{x}_{i},\hat{\bm{\theta}})}{g(y_{i}|\mathbf{x}_{i},\hat{\bm{\gamma}})}\right)^{2}\right\}^{1/2}.$

For strictly non-nested models, the statistic $T_{\textit{LR},\textit{NN}}$ converges in distribution to a standard normal distribution under the null hypothesis of equivalence of the models. Thus, the null hypothesis is not rejected if $|T_{\textit{LR},\textit{NN}}|\leqslant z_{\frac{\alpha}{2}}$ . On the other hand, we reject (at significance level $\alpha$ ) the null hypothesis of equivalence of the models in favor of model $F_{\theta}$ being better (or worse) than model $G_{\gamma}$ if $T_{\textit{LR},\textit{NN}}>z_{\alpha}$ (or $T_{\textit{LR},\textit{NN}}<-z_{\alpha}$ ).
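The statistic can be computed from the individual log-density contributions of the two fitted models. The sketch below is only an illustration with hypothetical names, and it assumes that the numerator is standardized by the sample standard deviation of the log-ratios, i.e. the square root of the variance term.

```python
import numpy as np

def vuong_statistic(logf, logg):
    """Generalized LR statistic for two non-nested models.

    logf : log f(y_i | x_i, theta_hat) for each observation
    logg : log g(y_i | x_i, gamma_hat) for each observation
    """
    m = np.asarray(logf, dtype=float) - np.asarray(logg, dtype=float)
    n = m.size
    # Sample standard deviation of the individual log-ratios (assumed form).
    omega_hat = np.sqrt(np.mean(m ** 2) - np.mean(m) ** 2)
    return (m.sum() / np.sqrt(n)) / omega_hat
```

Swapping the roles of the two models only changes the sign of the statistic, so the direction of the comparison depends on which model is labelled $F_{\theta}$.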

We shall use Eq. (3) to represent the pdf under informative censoring ( $f(y_{i}|\mathbf{x}_{i},\bm{\theta})$ ) and non-informative censoring ( $g(y_{i}|\mathbf{x}_{i},\bm{\gamma})$ ).

The generalized LR statistic is $T_{\textit{LR},\textit{NN}}=$ $-$ 22.6911. Since $T_{\textit{LR},\textit{NN}}<$ $-$ 1.96, we reject (at the 5% significance level) the null hypothesis of equivalence of the LOLLW models with informative censoring and non-informative censoring. Further, the value of this statistic indicates that the model with informative censoring is the best model for the current data.

5.3 Local and total influence analysis

In this section, we analyze local influences with respect to the bone marrow transplant data using the LOLLW regression model with informative censoring.

5.4 Case-weight perturbation

We apply the local influence framework developed in Section 4 under case-weight perturbation. The maximum curvature is $C_{\mathbf{d}_{\max}}=$ 1.9204. In Fig. 4a, we plot the eigenvector corresponding to $\mathbf{d}_{\max}$ . The plot of the total influence $C_{i}$ is displayed in Fig. 4b. The observations $\sharp$ 1, $\sharp$ 8, $\sharp$ 25 and $\sharp$ 28 clearly stand out from the others.
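For illustration, the quantities reported here can be obtained from $\bm{\Delta}$ and the Hessian of the log-likelihood. The sketch below assumes the usual normal curvature of Cook (1986), $C_{\mathbf{d}}=2|\mathbf{d}^{\top}\bm{\Delta}^{\top}\ddot{\mathbf{L}}^{-1}\bm{\Delta}\mathbf{d}|$, with $C_{i}$ obtained by taking $\mathbf{d}$ as the $i$th unit vector; the function and argument names are hypothetical.

```python
import numpy as np

def local_influence(delta, hessian):
    """Maximum curvature, its direction, and total local influence.

    delta   : (p x n) matrix of mixed derivatives d^2 l / d theta d omega'
    hessian : (p x p) Hessian of the log-likelihood at the MLE
    """
    delta = np.asarray(delta, dtype=float)
    hessian = np.asarray(hessian, dtype=float)
    # B = Delta' (-L..)^{-1} Delta, an n x n symmetric matrix
    B = delta.T @ np.linalg.solve(-hessian, delta)
    eigval, eigvec = np.linalg.eigh((B + B.T) / 2.0)
    k = int(np.argmax(np.abs(eigval)))
    c_dmax = 2.0 * abs(eigval[k])      # maximum normal curvature
    d_max = eigvec[:, k]               # direction shown in the index plot
    c_i = 2.0 * np.abs(np.diag(B))     # total local influence per case
    return c_dmax, d_max, c_i
```

Index plots of `d_max` and `c_i` against the observation number then give displays of the kind shown in Figs 4 and 5.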

Figure 4.

(a) Index plot of $\mathbf{d}_{\max}$ for $\bm{\theta}$ (case-weight perturbation) and (b) total local influence for $\bm{\theta}$ (case-weight perturbation) based on the current fitted model to the bone marrow transplant data.

5.5 Response variable perturbation

Here, the influence of perturbations on the observed survival times is analyzed. The value for the maximum curvature is $C_{\mathbf{d}_{\max}}=$ 97.708. Figure 5a displays the plot of $\mathbf{d}_{\max}$ versus the observation index, which reveals that observation $\sharp$ 9 stands out from the others. Figure 5b displays the plot of the total local influence ( $C_{i}$ ), which indicates that the observations $\sharp$ 9 and $\sharp$ 25 stand out.

5.6 Impact of the detected influential observations

The diagnostic analysis detected the cases $\sharp$ 25 and $\sharp$ 28 as potentially influential observations. The observation $\sharp$ 25 corresponds to the lowest censoring time ( $t_{25}=$ 572) and the observation $\sharp$ 28 to the highest failure time ( $t_{28}=$ 1181), in agreement with the minimum and maximum values reported in Table 1.

In order to reveal the impact of these two observations on the parameter estimates, we refit the model under some situations. First, we eliminate each of them individually. Next, we remove both potentially influential observations from the original data set.

Table 4 gives the relative change (in percentage) of each estimate, defined by $\textit{RC}_{\bm{\theta}_{j}}=[(\hat{\bm{\theta}}_{j}-\hat{\bm{\theta}}_{j}(A))/\hat{\bm{\theta}}_{j}]\times 100$ , and the corresponding $p$ -value, where $\hat{\bm{\theta}}_{j}(A)$ is the MLE of $\bm{\theta}_{j}$ after the set $A$ of observations is removed. Table 4 considers the following sets: $A_{1}=\{\sharp 25\}$ , $A_{2}=\{\sharp 28\}$ and $A_{3}=\{\sharp 25,\sharp 28\}$ . It indicates that the estimates of the LOLLW regression model with informative censoring are not highly sensitive to the deletion of the outstanding observations. In general, the significance of the estimates does not change (at the significance level of 5%) after removing the set $A$ . Hence, we do not have inferential changes after removing the observations highlighted in the diagnostic plots.
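The relative change is a one-line computation. A small sketch follows; the function name is hypothetical, and the inputs in the usage note are the rounded estimates of $\beta_{2}$ reported for the full data and for the set $A_{1}$.

```python
import numpy as np

def relative_change(theta_full, theta_reduced):
    """Relative change (in %) of an estimate after deleting a set of cases.

    theta_full    : MLE from the complete data
    theta_reduced : MLE after removing the set A of observations
    """
    theta_full = np.asarray(theta_full, dtype=float)
    theta_reduced = np.asarray(theta_reduced, dtype=float)
    return (theta_full - theta_reduced) / theta_full * 100.0
```

For instance, `relative_change(-2.2651, -2.4508)` gives roughly $-8.2$, in line with the entry for $\hat{\beta}_{2}$ under $A_{1}$; small discrepancies arise because the tabulated estimates are rounded.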

Table 4
Relative changes [RC in %], estimates and their $p$ -values (in parentheses) for some sets

Set(A)	$\hat{\alpha}$	$\hat{\sigma}$	$\hat{\mu}_{c}$	$\hat{\sigma}_{c}$	$\hat{\phi}$	$\hat{\beta}_{0}$	$\hat{\beta}_{1}$	$\hat{\beta}_{2}$
$A$	–	–	–	–	–	–	–	–
	0.0973	0.1304	7.0042	0.2457	5.2310	9.2274	$-$ 0.9774	$-$ 2.2651
	(–)	(–)	(–)	(–)	(–)	( $<$ 0.0010)	(0.0075)	( $<$ 0.0010)
$A_{1}$	[ $-$ 1.0637]	[ $-$ 0.1992]	[ $-$ 0.2986]	[1.3553]	[ $-$ 2.4480]	[ $-$ 0.8348]	[ $-$ 12.5455]	[ $-$ 8.2005]
	0.0983	0.1307	7.0251	0.2424	5.3590	9.3044	$-$ 1.1001	$-$ 2.4508
	(–)	(–)	(–)	(–)	(–)	( $<$ 0.0010)	(0.0075)	( $<$ 0.0010)
$A_{2}$	[3.2922]	[ $-$ 1.3677]	[0.2182]	[ $-$ 20.3612]	[0.3638]	[ $-$ 0.3187]	[11.6610]	[ $-$ 1.5614]
	0.0943	0.1322	6.9889	0.2957	5.2119	9.2568	$-$ 0.8635	$-$ 2.3004
	(–)	(–)	(–)	(–)	(–)	( $<$ 0.0010)	(0.0273)	( $<$ 0.0010)
$A_{3}$	[4.8551]	[4.6130]	[ $-$ 0.5171]	[17.4850]	[22.2952]	[1.5168]	[ $-$ 67.8432]	[10.8243]
	0.0925	0.1244	7.0405	0.2027	4.0647	9.0874	$-$ 1.6406	$-$ 2.0199
	(–)	(–)	(–)	(–)	(–)	( $<$ 0.0010)	( $<$ 0.0010)	( $<$ 0.0010)

Figure 5.

(a) Index plot of $\mathbf{d}_{\max}$ for $\bm{\theta}$ (simultaneous response perturbation) and (b) total local influence for $\bm{\theta}$ (simultaneous response perturbation) based on the model fitted to the bone marrow transplant data.

5.7 Residual analysis

To detect possible outlying observations in the fits of the LOLLW regression model with non-informative censoring and with informative censoring, Fig. 6 provides the index plots of $r_{D_{i}}$ . Figure 6a indicates that the residuals are not randomly scattered around zero for the LOLLW regression model with non-informative censoring; the residuals also form two distinct groups. In contrast, Fig. 6b shows residuals that are much more randomly scattered around zero for the LOLLW regression model with informative censoring. It also indicates that this regression model is more appropriate to fit the data since it does not present outliers.

Figure 6.

Index plot of the deviance component residuals for the bone marrow transplant data. (a) LOLLW regression model under non-informative censoring. (b) LOLLW regression model under informative censoring.

6. Conclusions

We introduce and study the log-odd log-logistic Weibull (LOLLW) distribution and construct the LOLLW regression model to investigate the informative censoring mechanism in a type of location-scale regression model. We use maximum likelihood to estimate the model parameters. We adopt several diagnostic measures considering three perturbation schemes in the new regression model with informative censoring. We define two kinds of residuals from the fitted model to assess departures from the error distribution assumption and to detect outlying observations. The flexibility, practical relevance and applicability of the proposed regression model are illustrated by means of a real data set. For these data, the fitted LOLLW regression model with informative censoring performs better than the fitted model with non-informative censoring: it yields smaller standard errors for the maximum likelihood estimates of the model parameters and better-behaved residuals.

Acknowledgments

We are very grateful to a referee and an associate editor for helpful comments that considerably improved the paper. We gratefully acknowledge financial support from CAPES and CNPq.

References

1. Cameron, A.C., & Trivedi, P.K. (1998). Regression Analysis of Count Data. Cambridge University Press, New York.

2. Collett (2003). Modelling Survival Data in Medical Research. Chapman and Hall, London.

3. Cook, R.D. (1986). Assessment of local influence (with discussion). Journal of the Royal Statistical Society B, 48, 133-169.

4. Cordeiro, G.M., Alizadeh, Pescim, R.R., & Ortega, E.M.M. (2016). The odd log-logistic generalized half-normal lifetime distribution: Properties and applications. Communications in Statistics – Theory and Methods, 46, 4195-4214.

5. da Cruz, J.N., Ortega, E.M.M., & Cordeiro, G.M. (2015). The log-odd log-logistic Weibull regression model: Modeling, estimation, influence diagnostics and residual analysis. Journal of Statistical Computation and Simulation, 86, 1516-1538.

6. da Cruz, J.N., Ortega, E.M.M., Cordeiro, G.M., Suzuki, A.K., & Mialhe, F.L. (2017). Bivariate odd-log-logistic-Weibull regression model for oral health-related quality of life. Communications for Statistical Applications and Methods, 24, 271-290.

7. da Silva, A.B., Cordeiro, G.M., Ortega, E.M.M., & Silva, G.O. (2017). The odd log-logistic student t distribution: Theory and applications. Journal of Agricultural Biological and Environmental Statistics, 22, 615-639.

8. Doornik, J.A. (2007). An Object-Oriented Matrix Language Ox 5. Timberlake Consultants Press, London.

9. Escobar, L.A., & Meeker, W.Q. (1992). Assessing influence in regression analysis with censored data. Biometrics, 48, 507-528.

10. Freitas, L.A., & Rodrigues (2013). Standard exponential cure rate model with informative censoring. Communications in Statistics – Simulation and Computation, 42, 8-23.

11. Hashimoto, E.M., Ortega, E.M.M., Cancho, V.G., & Cordeiro, G.M. (2010). The log-exponentiated Weibull regression model for interval-censored data. Computational Statistics and Data Analysis, 54, 1017-1035.

12. Huang, & Wolfe, R.A. (2002). A frailty model for informative censoring. Biometrics, 58, 510-520.

13. Huang, & Zhang (2008). Regression survival analysis with an assumed copula for dependent censoring: A sensitivity analysis approach. Biometrics, 64, 1090-1099.

14. Lagakos, S.W. (1979). General right censoring and its impact on the analysis of survival data. Biometrics, 35, 139-156.

15. Lagakos, S.W., & Williams, J.S. (1978). A cone class of variable-sum models. Biometrika, 65, 181-189.

16. Lesaffre, & Verbeke (1998). Local influence in linear mixed models. Biometrics, 54, 570-582.

17. , & Tsiatis, A.A. (2011). Semiparametric estimation of treatment effect with time-lagged response in the presence of informative censoring. Lifetime Data Analysis, 17, 566-593.

18. Mendoza, N.V.R., Ortega, E.M., & Cordeiro, G.M. (2016). The exponentiated log-logistic geometric distribution: Dual activation. Communications in Statistics – Theory and Methods, 13, 3838-3859.

19. Ortega, E.M.M., Bolfarine, & Paula, G.A. (2003). Influence diagnostics in generalized log-gamma regression models. Computational Statistics and Data Analysis, 42, 165-186.

20. Ortega, E.M.M., Cancho, V.G., & Paula, G.A. (2009). Generalized log-gamma regression models with cure fraction. Lifetime Data Analysis, 15, 79-106.

21. Ortega, E.M.M., Cordeiro, G.M., & Hashimoto, E.M. (2011). A log-linear regression model for the Beta-Weibull distribution. Communications in Statistics – Simulation and Computation, 40, 1206-1235.

22. Pham, & Lai, C.D. (2007). On recent generalizations of the Weibull distribution. IEEE Transactions on Reliability, 56, 454-458.

23. Prataviera, Ortega, E.M.M., Cordeiro, G.M., Pescim, R.R., & Verssani, B.A.W. (2018a). A new generalized odd log-logistic flexible Weibull regression model with applications in repairable systems. Reliability Engineering and System Safety, 176, 13-26.

24. Prataviera, Ortega, E.M.M., Cordeiro, G.M., & da Silva, A.B. (2018b). The heteroscedastic odd log-logistic generalized gamma regression model for censored data. Communications in Statistics – Simulation and Computation, 48, 1-25.

25. Rotnitzky, Farall, Bergesio, & Scharfstein (2007). Analysis of failure time data under competing censoring mechanisms. Journal of the Royal Statistical Society B, 69, 307-327.

26. Santos, P.C., Jr. (2012). Análise de sobrevivência na presença de censura informativa. Dissertação (Mestrado em Estatística) – Universidade Federal de Minas Gerais-Belo Horizonte/MG.

27. Scharfstein, D.O., & Robins, J.M. (2002). Estimation of the failure time distribution in the presence of informative censoring. Biometrika, 89, 617-634.

28. Siannis (2004). Applications of a parametric model for informative censoring. Biometrics, 60, 704-714.

29. Siannis (2011). Sensitivity analysis for multiple right censoring processes: Investigating mortality in psoriatic arthritis. Statistics in Medicine, 6, 77-91.

30. Siannis, Copas, & Lu (2005). Sensitivity analysis for informative censoring in parametric survival models. Biostatistics, 6, 77-91.

31. Silva, G.O., Ortega, E.M.M., & Cordeiro, G.M. (2009). A log-extended Weibull regression model. Computational Statistics and Data Analysis, 53, 4482-4489.

32. Silva, G.O., Ortega, E.M.M., Garibay, V.C., & Barreto, M.L. (2008). Log-Burr XII regression models with censored data. Computational Statistics and Data Analysis, 52, 3820-3842.

33. Silva, G.O., Ortega, E.M.M., & Paula, G.A. (2011). Residuals for log-Burr XII regression models in survival analysis. Journal of Applied Statistics, 38, 1435-1445.

34. Staplin, N.D., Kimber, A.C., Collett, & Roderick, P.J. (2014). Dependent censoring in piecewise exponential survival models. Statistical Methods in Medical Research, 24, 325-341.

35. Therneau, T.M., Grambsch, P.M., & Fleming, T.R. (1990). Martingale-based residuals for survival models. Biometrika, 77, 147-160.

36. Tsiatis (1975). A nonidentifiability aspect of the problem of competing risks. Proceedings of the National Academy of Sciences, 72, 20-22.

37. Vaupel, J.W., & Yashin, A.I. (1983). The Deviant Dynamics of Death in Heterogeneous Populations. International Institute for Applied Systems Analysis Research Report, Laxenburg, Austria.

38. Wang, M.C., Qin, & Chiang, C.T. (2001). Analyzing recurrent event data with informative censoring. Journal of the American Statistical Association, 96, 1057-1065.

39. Zhang, & Heitjan, D.F. (2006). A simple local sensitivity analysis tool for nonignorable coarsening: Application to dependent censoring. Biometrics, 62, 1260-1268.
