Shared frailty models are widely used despite their limitations, and correlated frailty models can overcome some of these disadvantages. In this paper, we introduce correlated compound Poisson frailty models with two different baseline distributions, namely the generalized log-logistic and the generalized Weibull. We develop a Bayesian estimation procedure based on the Markov chain Monte Carlo (MCMC) technique to estimate the parameters of these models. We present a simulation study to compare the true values of the parameters with the estimated values. We also apply these models to the real-life bivariate kidney infection data set of McGilchrist and Aisbett (1991) and suggest a better model for the data.
The frailty model is a random effect model for time-to-event data and an extension of Cox’s proportional hazards model. Shared frailty models, in which individuals in the same cluster share a common frailty, are the most commonly used frailty models in the literature. Frailty models (Vaupel et al., 1979) are used in survival analysis to account for unobserved heterogeneity in individual risks of disease and death. The frailty is usually modeled as an unobserved random variable acting multiplicatively on the baseline hazard function. Hanagal and Dabade (2013), Hanagal and Bhambure (2014a, 2014b, 2015, 2016, 2017), Hanagal and Pandey (2014a, 2014b, 2015a, 2015b, 2016), Hanagal et al. (2017a) and Hanagal and Kamble (2014) analyzed kidney infection data and Australian twin data using shared gamma, inverse Gaussian and positive stable frailty models with different baseline distributions for the multiplicative model. Hanagal and Sharma (2013, 2015a, 2015b, 2015c) analyzed acute leukemia data, kidney infection data and diabetic retinopathy data using shared gamma and inverse Gaussian frailty models for the multiplicative model. Hanagal and Bhambure (2014b) developed a shared inverse Gaussian frailty model based on the reversed hazard rate for Australian twin data. Hanagal et al. (2017b), Hanagal and Pandey (2020) and Hanagal (2021) discussed correlated gamma, correlated inverse Gaussian and correlated positive stable frailty models for bivariate survival data to analyze kidney infection data, and Hanagal and Pandey (2017) proposed correlated gamma frailty models for bivariate survival data based on the reversed hazard rate for Australian twin data. Hanagal (2011, 2017, 2019) gave an extensive literature review of different shared frailty models.
Aalen (1992) introduced a compound Poisson distribution as a mixing distribution in survival models, extending the one studied by Hougaard (1986, 2000). Quite often, hazard rates or intensities rise at the start, reach a maximum and then decline, so the intensity has a unimodal shape with a finite mode. Examples include (1) death rates for cancer patients, where the longer the patient lives beyond a certain time, the better his or her chances become, and (2) divorce rates, where the maximal rate of divorce occurs after a few years, which means that most marriages go through a crisis and then improve (Aaberge et al., 1989). The population intensity starts to decline simply because the high-risk individuals have already died or been divorced, and so forth.
Shared frailty explains correlations between subjects within clusters. However, it does have some limitations. Firstly, it forces the unobserved factors to be the same within the cluster, which may not always reflect reality. For example, at times it may be inappropriate to assume that all partners in a cluster share all their unobserved risk factors. Secondly, the dependence between survival times within the cluster is based on marginal distributions of survival times.
To avoid these limitations, correlated frailty models have been developed for the analysis of multivariate failure time data, in which associated random variables are used to characterize the frailty effect for each cluster. Correlated frailty models provide not only the variance parameters of the frailties, as in shared frailty models, but also an additional parameter for modeling the correlation between frailties in each group.
Compound Poisson frailty
An important feature of the compound Poisson distribution is that the total integral under the intensity (hazard rate) is finite; that is, the distribution is defective. In practical terms, this means that some individuals have zero susceptibility; they will ‘survive forever’. For instance, some patients survive their cancer, some people never marry, some marriages are not prone to be dissolved, and so on. In medicine, there are several examples of diseases that primarily attack people with a particular susceptibility, for instance of a genetic kind, while other people have virtually zero susceptibility to the disease. Another example is fertility: some couples are unable to conceive children, so the distribution of time to first childbirth for a population of couples will be defective. In unemployment data, one also faces the fact that some people may be completely unable to get a job. For such data, a compound distribution with positive mass at zero can be a suitable choice. The compound Poisson distribution is convenient because it has an explicit Laplace transform and accommodates the feature that some people may have zero susceptibility. Aalen (1992) considered the compound Poisson distribution as a mixing distribution in survival analysis. Aalen and Tretli (1999), Moger and Aalen (2005) and Hanagal (2010a, 2010b, 2010c, 2010d) have also considered compound Poisson frailty models. Hanagal and Dabade (2012) and Hanagal and Kamble (2015) developed compound Poisson frailty models to analyze kidney infection data.
A random variable is said to have the compound Poisson distribution if is given by
where is Poisson distributed with mean , while are independent and identically gamma distributed random variables with scale parameter and shape parameter .
Thus density function of compound Poisson variate is given by,
The parameter set for the compound Poisson distribution is .and the Laplace transform of is given by
The mean and variance of are
For identifiability of the model, we assume has expected value equal to one i.e. . Under the restriction which leads to . Under this restriction, variance of is given by . Laplace transform under the restriction is,
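The construction described above (a Poisson number of independent gamma terms, with the mean fixed at one) can be checked numerically. The following is a minimal sketch under stated assumptions, not the paper's implementation: the labels `rho` (Poisson mean), `nu` (gamma scale parameter, treated here as a rate, following Aalen's 1992 convention) and `eta` (gamma shape) are chosen for illustration, the unit-mean restriction is taken to be rho = nu/eta, and the closed forms used are (eta + 1)/nu for the variance and exp{-(nu/eta)[1 - (nu/(nu + s))^eta]} for the Laplace transform. All function names are hypothetical.

```python
import math
import random

def sample_poisson(mean, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def sample_cp_frailty(nu, eta, rng):
    """One unit-mean compound Poisson frailty: a Poisson(rho) number of gamma terms."""
    rho = nu / eta                      # assumed restriction giving E(Z) = 1
    n = sample_poisson(rho, rng)
    # random.gammavariate takes (shape, scale); a gamma rate nu means scale 1/nu
    return sum(rng.gammavariate(eta, 1.0 / nu) for _ in range(n))

def cp_laplace(s, nu, eta):
    """Laplace transform E[exp(-s Z)] under the unit-mean restriction."""
    return math.exp(-(nu / eta) * (1.0 - (nu / (nu + s)) ** eta))

def cp_variance(nu, eta):
    """Variance of Z under the unit-mean restriction."""
    return (eta + 1.0) / nu

rng = random.Random(1)
nu, eta = 2.0, 1.0                      # illustrative values only
draws = [sample_cp_frailty(nu, eta, rng) for _ in range(50000)]
mean = sum(draws) / len(draws)
var = sum((z - mean) ** 2 for z in draws) / (len(draws) - 1)
zero_share = sum(z == 0.0 for z in draws) / len(draws)   # share with zero susceptibility
print(mean, var, zero_share)
```

Under these assumptions the sample mean should be close to one, the sample variance close to `cp_variance(nu, eta)`, and the share of exact zeros close to exp(-rho), which is also the limit of the Laplace transform as s grows. That positive mass at zero is what makes the lifetime distribution defective.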
Correlated frailty
The correlated frailty model is an important concept in the area of multivariate frailty models. It is a natural extension of the shared frailty approach on the one hand and of the univariate frailty model on the other. In the correlated frailty model, the frailties of individuals in a cluster are correlated but not necessarily shared. This enables the inclusion of additional correlation parameters, which in turn makes it possible to address questions about associations between event times. Furthermore, associations are no longer forced to be the same for all pairs of individuals in a cluster. This makes the model especially appropriate when the association between event times is of special interest, for example in genetic studies of event times in families. The conditional survival function in the bivariate case (here without observed covariates) is
where and are two correlated frailties. The distribution of the random vector needs to be specified and determines the association structure of the event times in the model. Integrating the above bivariate survival function over and , we get unconditional bivariate survival function as
where (, ) has some known bivariate frailty distribution.
We are assuming that the frailties are acting multiplicatively on the baseline hazard function and that the observations in a pair are conditionally independent, given the frailties. Hence, the hazard of the individual in pair has the form
where denotes age or time, is a vector of observed covariates, is a vector of regression parameters describing the effect of the covariates , are baseline hazard functions, and are frailties. Bivariate correlated frailty models are characterized by the joint distribution of a two-dimensional vector of frailties . If the two frailties are independent, the resulting lifetimes are independent, and no clustering is present in the model. If the two frailties are equal, the shared frailty model is obtained as a special case of the correlated frailty model with correlation one between the frailties (Wienke 2011).
In order to derive a marginal likelihood function, the assumption of conditional independence of lifespan, given the frailty, is used. Let be a censoring indicator for individual in pair . Indicator is 1 if the individual has experienced the event of interest, and 0 otherwise. According to Eq. (5), the conditional survival function of the th individual in the th pair is
with denoting the cumulative baseline hazard function. The contribution of individual in pair to the conditional likelihood is given by
where stands for observation time of individual from pair . Assuming the conditional independence of lifespan, given the frailty, and integrating out the frailty, we obtain the marginal likelihood function
where is the probability density function of the corresponding frailty distribution. All these formulas can be easily extended to the multivariate case, but need a specification of the correlation structure between individuals in a cluster in terms of the multivariate density function, which complicates analysis. For more details see (Hanagal, 2011; Wienke, 2011).
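Written out, the quantities described in this section take the following standard forms for a multiplicative frailty model. The notation is assumed here for illustration: $Z_{ij}$ is the frailty, $h_{0j}$ and $H_{0j}$ are the baseline hazard and cumulative baseline hazard, $X_{ij}$ is the covariate vector, $\beta$ the vector of regression parameters, $\delta_{ij}$ the censoring indicator and $t_{ij}$ the observation time of individual $j$ in pair $i$, and $f$ is the joint frailty density.

```latex
\begin{align*}
h(t \mid X_{ij}, Z_{ij}) &= Z_{ij}\, h_{0j}(t)\, \exp(\beta' X_{ij}), \\
S(t \mid X_{ij}, Z_{ij}) &= \exp\{-Z_{ij}\, H_{0j}(t)\, \exp(\beta' X_{ij})\}, \\
L_{ij}(t_{ij}, \delta_{ij} \mid Z_{ij}) &= \left[ Z_{ij}\, h_{0j}(t_{ij})\, \exp(\beta' X_{ij}) \right]^{\delta_{ij}}
  \exp\{-Z_{ij}\, H_{0j}(t_{ij})\, \exp(\beta' X_{ij})\}, \\
L &= \prod_{i=1}^{n} \int\!\!\int L_{i1}\, L_{i2}\, f(z_{i1}, z_{i2})\, dz_{i1}\, dz_{i2}.
\end{align*}
```

In the last line each $L_{ij}$ is evaluated at $Z_{ij} = z_{ij}$, and the product over the two members of a pair uses the conditional independence of the lifetimes given the frailties.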
Correlated compound Poisson frailty model
Let be an infinitely divisible frailty variable with Laplace transformation and , then there exist random variables each with univariate Laplace transform such that the Laplace transform of is given by:
where is the correlation coefficient between and and has the range, .
The respective bivariate survival model is identifiable under mild regularity conditions on provided that . The case is known as the shared frailty model.
The above equation can be extended to multivariate case () as below (Hanagal, 2019).
The case leads to shared frailty. If , are mutually independent.
Let be the compound Poisson distributed with parameters and , and Laplace transform
The bivariate Laplace transform for the correlated compound Poisson frailty model is given by
where .
The correlated frailty model with compound Poisson frailty distribution in the presence of covariates is characterized by the bivariate survival function of the form:
where and are the cumulative baseline hazard functions of the life time random variables and respectively and .
The bivariate distribution in the presence of covariates, when the frailty variable is degenerate is given by
According to different assumptions on the baseline distributions we get different correlated compound Poisson frailty models.
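The way a bivariate survival function is built from a univariate Laplace transform can be sketched in a few lines. This is an illustration under assumptions rather than the paper's code: the bivariate transform is taken to be the usual correlated-frailty construction L(s1 + s2)^corr L(s1)^(1 - corr) L(s2)^(1 - corr), the compound Poisson transform is taken in its unit-mean form with assumed parameter labels `nu` and `eta`, and the survival function is obtained by evaluating the transform at the cumulative baseline hazards multiplied by exp(linear predictor). All names are hypothetical.

```python
import math

def cp_laplace(s, nu, eta):
    """Laplace transform of a unit-mean compound Poisson frailty (assumed form)."""
    return math.exp(-(nu / eta) * (1.0 - (nu / (nu + s)) ** eta))

def correlated_laplace(s1, s2, corr, laplace):
    """Bivariate transform L(s1+s2)^corr * L(s1)^(1-corr) * L(s2)^(1-corr)."""
    return (laplace(s1 + s2) ** corr
            * laplace(s1) ** (1.0 - corr)
            * laplace(s2) ** (1.0 - corr))

def bivariate_survival(cum_haz1, cum_haz2, linear_pred, corr, nu, eta):
    """S(t1, t2): the bivariate transform at the covariate-scaled cumulative hazards."""
    scale = math.exp(linear_pred)
    return correlated_laplace(cum_haz1 * scale, cum_haz2 * scale, corr,
                              lambda s: cp_laplace(s, nu, eta))

def no_frailty_survival(cum_haz1, cum_haz2, linear_pred):
    """Degenerate frailty (Z = 1): the two lifetimes are independent."""
    return math.exp(-(cum_haz1 + cum_haz2) * math.exp(linear_pred))

# Illustrative values only: cumulative hazards 0.4 and 0.7, linear predictor 0.2
s = bivariate_survival(0.4, 0.7, 0.2, corr=0.5, nu=2.0, eta=1.0)
print(round(s, 4))
```

Setting `corr=1` collapses the construction to the shared frailty transform L(s1 + s2), and `corr=0` gives the product L(s1) L(s2), that is, independent frailties. Setting either argument to zero recovers the univariate transform, so the marginals do not depend on `corr`.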
Baseline distributions
Generalized Weibull distribution
In survival analysis, the Weibull distribution is widely used to model lifetime data. Gupta and Kundu (1999) proposed the generalized exponential distribution by raising the distribution function of the exponential distribution to a positive power. In a similar way, the generalized Weibull distribution is constructed by taking a power of the distribution function of the Weibull distribution. We use the generalized Weibull distribution as a baseline distribution. If a continuous random variable follows the generalized Weibull distribution, then its survival function, cumulative hazard function and hazard rate are, respectively,
If failure rate is constant (exponential), failure rate is monotonic (Weibull), failure rate is decreasing, failure rate is increasing, failure rate is bathtub or increasing and failure rate is unimodal or decreasing.
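The three baseline functions can be written down directly once a parametrization is fixed. The sketch below assumes one: the Weibull survival function is taken as exp(-lam * t^gamma) and the generalized Weibull distribution function as its complement raised to the power `alpha`. The parameter names and function names are hypothetical labels, not the paper's notation.

```python
import math

def gw_survival(t, alpha, lam, gamma):
    """S(t) = 1 - (1 - exp(-lam * t**gamma))**alpha (assumed parametrization)."""
    return 1.0 - (1.0 - math.exp(-lam * t ** gamma)) ** alpha

def gw_cum_hazard(t, alpha, lam, gamma):
    """H(t) = -log S(t)."""
    return -math.log(gw_survival(t, alpha, lam, gamma))

def gw_hazard(t, alpha, lam, gamma):
    """h(t) = f(t) / S(t), with f the generalized Weibull density."""
    weibull_surv = math.exp(-lam * t ** gamma)
    density = (alpha * lam * gamma * t ** (gamma - 1.0) * weibull_surv
               * (1.0 - weibull_surv) ** (alpha - 1.0))
    return density / gw_survival(t, alpha, lam, gamma)

# With alpha = 1 and gamma = 1 the hazard is constant (exponential case)
for t in (0.5, 1.0, 2.0):
    print(t, gw_hazard(t, alpha=1.0, lam=0.7, gamma=1.0))
```

Under this parametrization, `alpha = 1` gives the ordinary Weibull hazard lam * gamma * t^(gamma - 1), and `alpha = gamma = 1` gives the constant hazard `lam`, matching the exponential and Weibull special cases listed above.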
Generalized log-logistic distribution
Lehmann family (Deshpande & Purohit, 2005), is a very useful family of life distributions generated from a given survival function and extensively used to model the effect of covariates. Let be an arbitrary known survival function. If is positive then
is also a survival function. If, in particular, is the positive integer , then it represents the survival function of where ’s are i.i.d. random variables with as the common survival function. The hazards are proportional times. Lehmann family is also known as the proportional hazards family. The survival function of the log-logistic distribution is given by,
If is positive then
is also a survival function we call as generalized log-logistic survival function. The distribution function, the cumulative hazard rate function and the hazard rate of generalized log-logistic distribution are as follows.
If failure rate is decreasing, failure rate is increasing.
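The Lehmann-family construction is easy to see in code. The sketch below assumes the log-logistic survival function has the form 1 / (1 + lam * t^gamma), so that raising it to the power `alpha` gives the generalized log-logistic survival function. That parametrization and all names are assumptions made for illustration.

```python
import math

def gll_survival(t, alpha, lam, gamma):
    """S(t) = (1 + lam * t**gamma)**(-alpha): log-logistic survival raised to alpha."""
    return (1.0 + lam * t ** gamma) ** (-alpha)

def gll_cum_hazard(t, alpha, lam, gamma):
    """H(t) = alpha * log(1 + lam * t**gamma)."""
    return alpha * math.log1p(lam * t ** gamma)

def gll_hazard(t, alpha, lam, gamma):
    """h(t) = alpha * lam * gamma * t**(gamma - 1) / (1 + lam * t**gamma)."""
    return alpha * lam * gamma * t ** (gamma - 1.0) / (1.0 + lam * t ** gamma)

# The hazard with power alpha is alpha times the log-logistic hazard
t = 1.4
ratio = gll_hazard(t, 3.0, 0.5, 2.0) / gll_hazard(t, 1.0, 0.5, 2.0)
print(ratio)
```

The ratio of the two hazards equals `alpha` at every time point, which is the proportional hazards property of the Lehmann family.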
Proposed models
Substituting cumulative hazard function for the generalized log-logistic and the generalized Weibull baseline distribution in Eqs (14) and (15), we get the unconditional bivariate survival functions at time and as,
Hereafter we refer to Eqs. (25), (26), (27) and (28) as Model I, Model II, Model III and Model IV, respectively. Model I and Model II use the generalized Weibull baseline with and without compound Poisson frailty; likewise, Model III and Model IV use the generalized log-logistic baseline with and without compound Poisson frailty.
Likelihood specification and Bayesian estimation of parameters
Suppose there are individuals under study, whose first and second observed failure times are represented by (). Let and be the observed censoring times for the individual () for first and second recurrence times respectively. We also assume that the censoring time is independent of the lifetimes of individuals.
The contribution of the bivariate life time random variable of the individual in likelihood function is given by,
and the likelihood function is,
where , and are respectively the frailty parameter , the vector of baseline parameters and the vector of regression coefficients. For without frailty model, likelihood function is
In Eq. (30) the frailty parameters , and are absent and in Eq. (29) they are present. The counts and are the number of individuals for which first and second failure times () lie in the ranges ; ; and respectively and
Substituting cumulative hazard functions , hazard functions , and survival function for four proposed models into the last relations we get the likelihood function given by Eq. (29) for Model I and Model III and Eq. (30) for Model II and Model IV.
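For reference, a common way to write the four kinds of likelihood contribution for bivariate right-censored data is sketched below. The symbols are assumptions made for illustration: $S(t_1, t_2)$ is the unconditional bivariate survival function, $(t_{1i}, t_{2i})$ are the failure times and $(c_{1i}, c_{2i})$ the censoring times of individual $i$, and $n_1, n_2, n_3, n_4$ are the numbers of individuals falling in the four censoring patterns.

```latex
\begin{align*}
f_1(t_{1i}, t_{2i}) &= \frac{\partial^2 S(t_{1i}, t_{2i})}{\partial t_{1i}\, \partial t_{2i}},
  & t_{1i} < c_{1i},\ t_{2i} < c_{2i}, \\
f_2(t_{1i}, c_{2i}) &= -\frac{\partial S(t_{1i}, c_{2i})}{\partial t_{1i}},
  & t_{1i} < c_{1i},\ t_{2i} > c_{2i}, \\
f_3(c_{1i}, t_{2i}) &= -\frac{\partial S(c_{1i}, t_{2i})}{\partial t_{2i}},
  & t_{1i} > c_{1i},\ t_{2i} < c_{2i}, \\
f_4(c_{1i}, c_{2i}) &= S(c_{1i}, c_{2i}),
  & t_{1i} > c_{1i},\ t_{2i} > c_{2i},
\end{align*}
\[
L = \prod_{i=1}^{n_1} f_1(t_{1i}, t_{2i}) \prod_{i=1}^{n_2} f_2(t_{1i}, c_{2i})
    \prod_{i=1}^{n_3} f_3(c_{1i}, t_{2i}) \prod_{i=1}^{n_4} f_4(c_{1i}, c_{2i}).
\]
```

Each factor is obtained from the bivariate survival function alone, which is why substituting the cumulative hazard, hazard and survival functions of a chosen baseline is enough to specify the likelihood.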
Unfortunately, computing the maximum likelihood estimators (MLEs) involves solving a high-dimensional optimization problem for these four models. Because the method of maximum likelihood fails to estimate the parameters owing to convergence problems in the iterative procedure, we use the Bayesian approach. Moreover, standard maximum likelihood based inference methods may not be suitable for small sample sizes or for situations with heavy censoring (see Kheiri et al., 2007). Thus, in our problem a Bayesian approach, which does not suffer from these difficulties, is a natural choice, even though it is relatively computationally intensive.
Simulation study
The Bayesian approach is now widely used to estimate model parameters, because advances in computing technology have made the computations in Bayesian analysis feasible.
To evaluate the performance of the Bayesian estimation procedure, we carry out a simulation study. For the simulation, we consider only one covariate, which we assume to follow the normal distribution. The two frailty variables are assumed to follow the compound Poisson distribution with known variance and correlation. The lifetimes of an individual are conditionally independent given the frailties. We assume that the baseline follows either the generalized Weibull distribution or the generalized log-logistic distribution.
Because Bayesian methods are time-consuming, we generate only fifty pairs of lifetimes.
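One possible way to generate such data is sketched below. It is an illustration under several assumptions, not the procedure used in the paper: the two unit-mean compound Poisson frailties are made correlated by adding a shared compound Poisson component to two independent ones (which yields correlation `corr`), the baseline is a generalized Weibull with survival 1 - (1 - exp(-lam * t^gamma))^alpha, a single fixed censoring time is used, and every parameter value is arbitrary. All names are hypothetical.

```python
import math
import random

def sample_poisson(mean, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def sample_cp(poisson_mean, nu, eta, rng):
    """Compound Poisson draw: a Poisson number of gamma(shape eta, rate nu) terms."""
    n = sample_poisson(poisson_mean, rng)
    return sum(rng.gammavariate(eta, 1.0 / nu) for _ in range(n))

def sample_correlated_frailties(nu, eta, corr, rng):
    """Unit-mean frailties (z1, z2) with correlation corr via a shared component."""
    rho = nu / eta
    shared = sample_cp(corr * rho, nu, eta, rng)
    z1 = shared + sample_cp((1.0 - corr) * rho, nu, eta, rng)
    z2 = shared + sample_cp((1.0 - corr) * rho, nu, eta, rng)
    return z1, z2

def gw_inverse_cum_hazard(h, alpha, lam, gamma):
    """Solve H0(t) = h for the assumed generalized Weibull baseline."""
    if h <= 0.0:
        return 0.0
    if h > 30.0:
        # exp(-h) is near machine precision here; use -log(w) ~ h + log(alpha)
        neg_log_w = h + math.log(alpha)
    else:
        one_minus_surv = -math.expm1(-h)                  # 1 - exp(-h)
        w = -math.expm1(math.log(one_minus_surv) / alpha) # Weibull survival
        neg_log_w = -math.log(w)
    return (neg_log_w / lam) ** (1.0 / gamma)

def simulate_pairs(n_pairs, nu, eta, corr, beta, baseline, censor_time, rng):
    """Rows (t1, d1, t2, d2, x): observed times, event indicators, covariate."""
    rows = []
    for _ in range(n_pairs):
        x = rng.gauss(0.0, 1.0)                           # one normal covariate
        row = []
        for z in sample_correlated_frailties(nu, eta, corr, rng):
            if z == 0.0:
                t = math.inf                              # zero susceptibility
            else:
                target = rng.expovariate(1.0) / (z * math.exp(beta * x))
                t = gw_inverse_cum_hazard(target, *baseline)
            row += [min(t, censor_time), int(t <= censor_time)]
        rows.append((*row, x))
    return rows

rng = random.Random(2024)
data = simulate_pairs(50, nu=2.0, eta=1.0, corr=0.5, beta=0.5,
                      baseline=(2.0, 0.5, 1.5), censor_time=10.0, rng=rng)
print(data[0])
```

Lifetimes are drawn by inversion: given the frailty and the covariate, the cumulative hazard at the failure time is a unit exponential variable, so the failure time is the baseline inverse cumulative hazard of that variable divided by the frailty and the covariate effect. Individuals with zero frailty never fail and are always censored.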
A widely used prior for frailty parameters (, ) are the gamma distributions . In addition, we assume that the regression coefficients are normal with mean zero and large variance say 1000. Similar types of prior distributions are used in Ibrahim et al. (2001), Sahu et al. (1997) and Santos and Achcar (2010). So in our study we also use same non informative prior for frailty parameters (, ) and regression coefficients . Since we do not have any prior information about baseline parameters, and , prior distributions are assumed to be flat. We consider two different non-informative prior distributions for baseline parameters, one is and another is . All the hyper-parameters and are known. Here is the gamma distribution with the shape parameter and the scale parameter and represents uniform distribution over the interval . For correlation parameter, we use uniform distribution . We use different value of baseline parameters for Model I and Model III. We assume the value of the hyper-parameters as and .
Table 1. Simulation results on correlated compound Poisson frailty with generalized Weibull baseline (Model I). Burn-in period 2000; autocorrelation lag 300.

| Parameter (true value) | Estimate | Standard error | Lower credible limit | Upper credible limit |
|---|---|---|---|---|
| (2.0) | 1.981 | 0.0423 | 1.9552 | 2.0613 |
| (2.2) | 2.172 | 0.0951 | 2.1517 | 2.2574 |
| (1.50) | 1.463 | 0.0816 | 1.3841 | 1.5661 |
| (2.50) | 2.541 | 0.0551 | 2.4361 | 2.5753 |
| (2.50) | 2.462 | 0.0653 | 2.3926 | 2.5738 |
| (2.90) | 2.791 | 0.0481 | 2.7517 | 2.9654 |
| (0.5) | 0.483 | 0.0321 | 0.4512 | 0.4991 |
| (0.5) | 0.475 | 0.0432 | 0.4405 | 0.4885 |
| (0.8) | 0.813 | 0.0708 | 0.7291 | 0.8775 |
| (0.50) | 0.475 | 0.0602 | 0.4675 | 0.5433 |
Table 2. Simulation results on correlated compound Poisson frailty with generalized log-logistic baseline (Model III). Burn-in period 1300; autocorrelation lag 120.

| Parameter (true value) | Estimate | Standard error | Lower credible limit | Upper credible limit |
|---|---|---|---|---|
| (2.0) | 2.012 | 0.0241 | 1.9332 | 2.1362 |
| (2.2) | 2.225 | 0.0238 | 2.1816 | 2.2534 |
| (1.5) | 1.456 | 0.0337 | 1.3214 | 1.5613 |
| (2.5) | 2.451 | 0.0321 | 2.4214 | 2.5547 |
| (2.5) | 2.444 | 0.0428 | 2.3928 | 2.5771 |
| (2.9) | 2.762 | 0.0234 | 2.7352 | 3.0722 |
| (0.5) | 0.512 | 0.0461 | 0.4962 | 0.5311 |
| (0.5) | 0.523 | 0.0453 | 0.4933 | 0.5405 |
| (0.8) | 0.832 | 0.0631 | 0.7409 | 0.8751 |
| (0.5) | 0.532 | 0.0362 | 0.4867 | 0.5578 |
Table 3. Posterior summary for kidney infection data: Model I (generalized Weibull baseline with compound Poisson frailty). Burn-in period 3000; autocorrelation lag 300.

| Parameter | Estimate | Standard error | Lower credible limit | Upper credible limit |
|---|---|---|---|---|
| | 3.1821 | 0.2110 | 2.6428 | 3.6518 |
| | 20.5474 | 0.4298 | 19.7235 | 20.9925 |
| | 2.1427 | 0.0446 | 1.7603 | 2.3543 |
| | 3.9172 | 0.0924 | 3.5464 | 4.1123 |
| | 23.5026 | 0.4535 | 22.9491 | 24.0278 |
| | 2.2534 | 0.0623 | 1.9685 | 2.3764 |
| | 0.4681 | 0.0321 | 0.4175 | 0.4962 |
| | 0.4215 | 0.0297 | 0.3984 | 0.4574 |
| | 0.5623 | 0.0320 | 0.4962 | 0.6181 |
| | 0.0412 | 0.0080 | 0.0031 | 0.0482 |
| | 2.1723 | 0.2126 | 2.4685 | 1.8104 |
| | 0.1842 | 0.0272 | 0.2472 | 0.1014 |
| | 0.0332 | 0.0288 | 0.0812 | 0.0224 |
| | 1.6834 | 0.3713 | 2.2104 | 1.0109 |
Table 4. Posterior summary for kidney infection data: Model II (generalized Weibull baseline without frailty). Burn-in period 12000; autocorrelation lag 180.

| Parameter | Estimate | Standard error | Lower credible limit | Upper credible limit |
|---|---|---|---|---|
| | 2.4849 | 0.3094 | 1.8992 | 3.0831 |
| | 0.2031 | 0.0686 | 0.0905 | 0.3589 |
| | 0.6049 | 0.0781 | 0.4601 | 0.7620 |
| | 5.0401 | 0.5055 | 4.9199 | 5.2492 |
| | 0.3222 | 0.0814 | 0.1758 | 0.4942 |
| | 0.5129 | 0.0616 | 0.3882 | 0.6333 |
| | 0.0007 | 0.0027 | 0.0044 | 0.0063 |
| | 1.0716 | 0.3169 | 1.6756 | 0.4608 |
| | 0.0159 | 0.0278 | 0.0677 | 0.0375 |
| | 0.0041 | 0.0066 | 0.0167 | 0.0078 |
| | 0.0012 | 0.0018 | 0.0021 | 0.0046 |
We run two parallel chains for Model I using two sets of prior distributions with different starting points, using the Metropolis-Hastings algorithm and the Gibbs sampler based on normal transition kernels. We iterate both chains 100000 times. The prior distribution has no effect on the posterior summaries, because the parameter estimates are nearly the same and the convergence rate of the Gibbs sampler is almost the same for both prior sets. The results were also similar for both chains. Table 1 presents the estimates and the credible intervals of the parameters of Model I based on the simulation study. Table 2 gives the estimates and the credible intervals of all the parameters of Model III based on the simulation study. The Gelman-Rubin convergence statistic (Gelman & Rubin, 1992) values are nearly equal to one, the Geweke test (Geweke, 1992) statistic values are quite small, and the corresponding p-values are large enough to conclude that the chains attain the stationary distribution. The simulated parameter values are autocorrelated up to a certain lag, so only iterations spaced by that lag are selected as samples from the posterior distribution.
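The mechanics of a sampler with a normal transition kernel, a burn-in period and thinning by an autocorrelation lag can be illustrated on a toy problem. The sketch below is a generic random-walk Metropolis sampler for a single parameter, not the sampler used for the frailty models: the target (a normal density with mean 2 and standard deviation 0.5), the step size, the chain length, the burn-in and the lag are all arbitrary illustrative choices, and the function names are hypothetical.

```python
import math
import random

def random_walk_metropolis(log_post, start, step, n_iter, rng):
    """Random-walk Metropolis with a normal transition kernel of std. dev. `step`."""
    current, current_lp = start, log_post(start)
    chain = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio); 1 - random() avoids log(0)
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain

def thin(chain, burn_in, lag):
    """Drop the burn-in draws, then keep every `lag`-th draw."""
    return chain[burn_in::lag]

def toy_log_post(theta):
    """Toy target: normal with mean 2 and standard deviation 0.5, up to a constant."""
    return -0.5 * ((theta - 2.0) / 0.5) ** 2

rng = random.Random(42)
chain = random_walk_metropolis(toy_log_post, start=0.0, step=1.0, n_iter=60000, rng=rng)
sample = thin(chain, burn_in=2000, lag=20)
posterior_mean = sum(sample) / len(sample)
print(len(sample), posterior_mean)
```

Posterior summaries such as the mean and credible limits are then computed from the thinned sample only. In a multi-parameter model the same accept or reject step would be applied to one parameter at a time within each sweep.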
Analysis of kidney infection data
To illustrate the Bayesian estimation procedure, we use the kidney infection data of McGilchrist and Aisbett (1991). The data consist of recurrence times of infection, counted from the insertion of the catheter until its removal owing to infection, for 38 kidney patients using portable dialysis equipment. For each patient, the first and second recurrence times (in days) are recorded. The first and second recurrence times are taken to be independent apart from the common frailty component. The data include five risk variables: age, sex and the disease types GN, AN and PKD, where GN, AN and PKD stand for glomerulonephritis, acute nephritis and polycystic kidney disease.
The first and second recurrence times to infection are the two responses, and the five covariates are age, sex and the presence or absence of the disease types GN, AN and PKD. We first check the goodness of fit of the data for the compound Poisson frailty model with the two baseline distributions and then apply the Bayesian estimation procedure. To check the goodness of fit for the kidney data set, we use the Kolmogorov-Smirnov (K-S) test for the two baseline distributions. The p-values for Model I are 0.77411 and 0.78781 for the first and second recurrence times, respectively, and those for Model III are 0.71181 and 0.66831. Thus, the K-S test gives no statistical evidence to reject the hypothesis that the data come from Model I and Model III in the marginal case, and we assume that these models also fit in the bivariate case. For bivariate data, we first carry out the goodness-of-fit test in the univariate case and then proceed to the bivariate case if a test is feasible; most researchers use this approach. The K-S test applies to the univariate case because it is based on ranking the lifetime data. Bivariate data cannot be ordered in this way, so the K-S test does not carry over to the bivariate case. Another possibility is the chi-square goodness-of-fit test. However, the data consist of only 38 patients, a very small sample for the bivariate case, and some observations are censored. The bivariate table would contain cells with very small counts (less than 5), so this test is not appropriate.
As in the simulation study, we assume the same set of prior distributions. We run two parallel chains for all four models using two sets of prior distributions with different starting points, using the Metropolis-Hastings algorithm and the Gibbs sampler based on normal transition kernels. We iterate both chains 100000 times. As in the simulation study, the parameter estimates are nearly the same for both sets of priors, so the estimates do not depend on the choice of prior distributions. The convergence rate of the Gibbs sampler is almost the same for both prior sets. Both chains also show similar results, so we present the analysis for only one chain and one prior set. A pseudo-random sample from the posterior distribution can be obtained by taking values from a single run of the Markov chain at widely spaced time points (the autocorrelation lag) after the burn-in period. The autocorrelation of the parameters becomes almost negligible after a certain lag. The Geweke test values are near zero, the corresponding p-values are quite high, and the Gelman-Rubin statistics for all the parameters of all four models are very close to one.
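The Gelman-Rubin diagnostic compares the variation between parallel chains with the variation within them. The following sketch computes the basic potential scale reduction factor for one parameter from equally long chains. It omits the degrees-of-freedom correction of the original proposal, the test chains are synthetic normal draws rather than output from the frailty models, and the function name is hypothetical.

```python
import math
import random

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equally long chains of one parameter."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Average within-chain variance
    within = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    # Between-chain variance of the chain means, scaled by the chain length
    between = n * sum((mu - grand) ** 2 for mu in means) / (m - 1)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

rng = random.Random(0)
agree = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(2)]
differ = [[rng.gauss(0.0, 1.0) for _ in range(2000)],
          [rng.gauss(3.0, 1.0) for _ in range(2000)]]
print(gelman_rubin(agree), gelman_rubin(differ))
```

Chains that explore the same distribution give a value very close to one, while chains stuck in different regions give a value well above one, which is why values near one are read as evidence of convergence.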
Table 5. Posterior summary for kidney infection data: Model III (generalized log-logistic baseline with compound Poisson frailty). Burn-in period 3500; autocorrelation lag 300.

| Parameter | Estimate | Standard error | Lower credible limit | Upper credible limit |
|---|---|---|---|---|
| | 34.8414 | 0.9914 | 33.7435 | 36.6461 |
| | 551.5072 | 10.6481 | 527.0581 | 578.8421 |
| | 3.4142 | 0.1144 | 3.2752 | 3.6165 |
| | 31.2122 | 1.8429 | 28.9423 | 33.0204 |
| | 524.6104 | 12.5724 | 502.8461 | 543.2315 |
| | 2.6182 | 0.0540 | 2.4014 | 2.8210 |
| | 0.4275 | 0.0085 | 0.4012 | 0.4521 |
| | 0.3918 | 0.0074 | 0.3706 | 0.4147 |
| | 0.5912 | 0.0813 | 0.5272 | 0.6512 |
| | 0.0361 | 0.0075 | 0.0113 | 0.0513 |
| | 2.8804 | 0.1928 | 3.1877 | 2.5791 |
| | 0.2757 | 0.0174 | 0.3641 | 0.0287 |
| | 0.0434 | 0.0231 | 0.0752 | 0.0262 |
| | 1.8783 | 0.3484 | 2.1552 | 1.4142 |
Table 6. Posterior summary for kidney infection data: Model IV (generalized log-logistic baseline without frailty). Burn-in period 6500; autocorrelation lag 200.

| Parameter | Estimate | Standard error | Lower credible limit | Upper credible limit |
|---|---|---|---|---|
| | 1.5606 | 0.1592 | 1.2381 | 1.8637 |
| | 0.0091 | 0.0044 | 0.0018 | 0.0182 |
| | 1.2828 | 0.1834 | 0.9650 | 1.6735 |
| | 1.2855 | 0.1448 | 1.0186 | 1.5377 |
| | 0.0042 | 0.0019 | 0.0010 | 0.0082 |
| | 1.4494 | 0.1601 | 1.1739 | 1.7712 |
| | 0.0008 | 0.0014 | 0.0036 | 0.0018 |
| | 0.8767 | 0.2923 | 1.4305 | 0.2643 |
| | 0.1785 | 0.2649 | 0.4027 | 0.7270 |
| | 0.0088 | 0.0144 | 0.0368 | 0.0172 |
| | 0.0235 | 0.0142 | 0.0012 | 0.0509 |
Table 7. Comparison of AIC, BIC and DIC of the proposed frailty models with existing models.

| Correlated frailty | Baseline | AIC | BIC | DIC |
|---|---|---|---|---|
| Compound Poisson | Generalized Weibull | 650.6 | 670.2 | 635.8 |
| Without frailty | Generalized Weibull | 690.3 | 708.3 | 678.1 |
| Compound Poisson | Generalized log-logistic | 662.5 | 680.5 | 652.9 |
| Without frailty | Generalized log-logistic | 696.9 | 715.0 | 683.9 |
| Positive stable | Generalized Weibull | 660.6 | 682.5 | 646.2 |
| Positive stable | Generalized log-logistic | 670.7 | 694.6 | 663.4 |
| Inverse Gaussian | Generalized Weibull | 682.8 | 704.0 | 666.1 |
| Inverse Gaussian | Generalized log-logistic | 686.4 | 707.7 | 673.2 |
| Gamma | Generalized Weibull | 683.7 | 706.6 | 669.5 |
| Gamma | Generalized log-logistic | 692.2 | 715.1 | 674.5 |
The Gelman-Rubin convergence statistic values are nearly equal to one, the Geweke test statistic values are close to zero, and the corresponding p-values are close to 0.5, which implies that the chains attain the stationary distribution. The posterior means, standard errors and lower and upper credible limits for Models I to IV are presented in Tables 3, 4, 5 and 6. The estimated variances of the frailty variables under Model I and Model III show that there is a high degree of heterogeneity, or frailty, in the kidney infection data and that models with frailty are a proper choice for fitting the data. The estimated correlation coefficients between the two frailties under Model I and Model III show that the two frailty variables are correlated.
To choose among Model I, Model II, Model III and Model IV, we use Bayes factors. The Bayes factors for Model I against Model II and for Model III against Model IV are both high and strongly support the frailty models, namely Model I and Model III, for the kidney infection data set. The Bayes factor for Model I against Model III is 12.56, which is high and strongly supports Model I over Model III. Some patients are expected to be much more prone to infection than others with the same covariate values. This is not surprising: in the data set there is a male patient with infection times 8 and 16 and another male patient with infection times 152 and 562. Table 7 shows that the frailty models are better than the models without frailty and that Model I is better than Model III. The regression coefficients differ across the four models. The credible interval of the regression coefficient of sex does not contain zero, which indicates that the covariate sex is significant in all the models. In Model I and Model III, the regression coefficient of the disease type PKD is also significant. The negative value of the regression coefficient of sex indicates that female patients have a slightly lower risk of infection. The negative value of the regression coefficient of the disease type PKD indicates that patients without the disease type PKD have a lower risk of infection in Model I and Model III. To compare the proposed models, we use the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the deviance information criterion (DIC). The four proposed models are compared using the AIC, BIC and DIC values given in Table 7. Model I (generalized Weibull distribution with correlated compound Poisson frailty) has the smallest AIC value, and the same holds for the BIC and DIC values.
We also observe that the correlated compound Poisson frailty models (Models I and III) are better than the models without frailty (Models II and IV). We also compare our proposed models with the correlated gamma, correlated inverse Gaussian and correlated positive stable frailty models suggested by Hanagal et al. (2017b), Hanagal and Pandey (2020) and Hanagal (2021). The AIC, BIC and DIC values of these existing correlated frailty models are presented in Table 7, and we observe that the proposed correlated compound Poisson frailty model with the generalized Weibull baseline distribution performs better than these existing correlated frailty models for the kidney infection data set.
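The comparison can be reproduced mechanically from the reported criteria. The short sketch below copies the AIC, BIC and DIC values from Table 7 and picks the model with the smallest value of each criterion. The variable names are hypothetical and nothing is recomputed from the data.

```python
criteria = ("AIC", "BIC", "DIC")

# (frailty, baseline) -> (AIC, BIC, DIC), copied from Table 7
table7 = {
    ("Compound Poisson", "Generalized Weibull"): (650.6, 670.2, 635.8),
    ("Without frailty", "Generalized Weibull"): (690.3, 708.3, 678.1),
    ("Compound Poisson", "Generalized log-logistic"): (662.5, 680.5, 652.9),
    ("Without frailty", "Generalized log-logistic"): (696.9, 715.0, 683.9),
    ("Positive stable", "Generalized Weibull"): (660.6, 682.5, 646.2),
    ("Positive stable", "Generalized log-logistic"): (670.7, 694.6, 663.4),
    ("Inverse Gaussian", "Generalized Weibull"): (682.8, 704.0, 666.1),
    ("Inverse Gaussian", "Generalized log-logistic"): (686.4, 707.7, 673.2),
    ("Gamma", "Generalized Weibull"): (683.7, 706.6, 669.5),
    ("Gamma", "Generalized log-logistic"): (692.2, 715.1, 674.5),
}

# Smaller is better for all three criteria
best = {name: min(table7, key=lambda model: table7[model][k])
        for k, name in enumerate(criteria)}
for name in criteria:
    print(name, best[name])
```

All three criteria select the compound Poisson frailty with the generalized Weibull baseline, and for each frailty distribution the generalized Weibull baseline has the smaller value of every criterion.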
Conclusions
In this paper, we discuss results for correlated compound Poisson frailty models with two different baseline distributions. Different priors give the same parameter estimates. The convergence rate of the Gibbs sampling algorithm does not depend on the choice of prior distributions in our proposed models for the kidney infection data. The estimates of the frailty variance under Model I and Model III show strong evidence of a high degree of heterogeneity, or frailty, in the population of patients. The estimated correlation coefficients between the two frailties under Model I and Model III show that the two frailty variables are correlated. Sex is the only covariate that is significant in all models. The negative value of its regression coefficient indicates that female patients have a slightly lower risk of infection. The negative value of the regression coefficient of the disease type PKD indicates that patients without the disease type PKD have a lower risk of infection in Model I and Model III.
To choose among Model I, Model II, Model III and Model IV, we also use the Bayes factor, and we observe that Model I is the best. We also observe that the correlated compound Poisson frailty models (Models I and III) are better than the models without frailty. We can further conclude that the correlated compound Poisson frailty model with the generalized Weibull baseline distribution fits better than the correlated compound Poisson frailty model with the generalized log-logistic baseline distribution. In Table 7, we also compare our proposed models with the correlated gamma, correlated inverse Gaussian and correlated positive stable frailty models suggested by Hanagal et al. (2017b), Hanagal and Pandey (2020) and Hanagal (2021). We observe that the proposed correlated compound Poisson frailty model with the generalized Weibull baseline distribution performs better than these existing correlated frailty models for the kidney infection data set. On the basis of the above analysis, we suggest the new correlated compound Poisson frailty model with the generalized Weibull baseline distribution as the best among the proposed models for the kidney infection data set.
Footnotes
Acknowledgments
I thank the referee and the editorial board member for the valuable suggestions and comments which improved the earlier version of the manuscript.
Conflict of interest
The manuscript has no conflict of interest.
References
1.
Aaberge, R., Kravdal, O., & Wennemo, T. (1989). Unobserved heterogeneity in models of marriage dissolution. Discussion Paper 42, Central Bureau of Statistics, Norway.
2.
Aalen, O. O. (1992). Modelling heterogeneity in survival analysis by the compound Poisson distribution. Annals of Applied Probability, 2, 951-72.
3.
Aalen, O. O., & Tretli, S. (1999). Analyzing incidence of testis cancer by means of a frailty model. Cancer Causes Control, 10, 285-92.
4.
Deshpande, J. V., & Purohit, S. G. (2005). Life Time Data: Statistical Models and Methods. World Scientific, New Jersey.
5.
Gelman, A., & Rubin, D. B. (1992). A single series from the Gibbs sampler provides a false sense of security. In Bayesian Statistics 4 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford Univ. Press. pp. 625-632.
6.
Genest, C., & Mackay, J. (1986). Joy of Copulas: Bivariate distributions with uniform marginals. The American Statistician, 40(4), 280-283.
7.
Geweke, J. (1992). Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments. In Bayesian Statistics 4 (eds. J. M. Bernardo, J. Berger, A. P. Dawid and A. F. M. Smith), Oxford: Oxford University Press, pp. 169-193.
8.
Gupta, R. D., & Kundu, D. (1999). Generalized exponential distributions. Australian and New Zealand Journal of Statistics, 41(2), 173-188.
9.
Hanagal, D. D. (2010a). Correlated compound Poisson frailty model for the bivariate survival data. International Journal of Statistics and Management Systems, 5, 127-40.
10.
Hanagal, D. D. (2010b). Modeling heterogeneity for bivariate survival data by compound Poisson distribution. Model Assisted Statistics and Applications, 5(1), 1-9.
11.
Hanagal, D. D. (2010c). Modeling heterogeneity for bivariate survival data by Weibull distribution. Statistical Papers, 51(4), 947-58.
12.
Hanagal, D. D. (2010d). Modeling heterogeneity for bivariate survival data by the compound Poisson distribution with random scale. Statistics and Probability Letters, 80, 1781-90.
13.
Hanagal, D. D. (2011). Modeling Survival Data Using Frailty Models. Chapman & Hall/CRC, New York.
14.
Hanagal, D. D. (2017). Frailty Models in Public Health. Handbook of Statistics, 37(B), 209-247. Elsevier Publishers, Amsterdam.
15.
Hanagal, D. D. (2019). Modeling Survival Data Using Frailty Models. Second Edition. Springer Nature, Singapore.
16.
Hanagal, D. D. (2021). Correlated positive stable frailty models. Communications in Statistics, Theory and Methods, 50(23), 5617-5633.
17.
Hanagal, D. D., & Bhambure, S. M. (2014a). Shared inverse Gaussian frailty model based on reversed hazard rate for modeling Australian twin data. Journal of Indian Society for Probability and Statistics, 15, 9-37.
18.
Hanagal, D. D., & Bhambure, S. M. (2014b). Analysis of kidney infection data using shared positive stable frailty models. Advances in Reliability, 1, 21-39.
19.
Hanagal, D. D., & Bhambure, S. M. (2015). Comparison of shared gamma frailty models using Bayesian approach. Model Assisted Statistics and Applications, 10, 25-41.
20.
Hanagal, D. D., & Bhambure, S. M. (2016). Modeling bivariate survival data using shared inverse Gaussian frailty model. Communications in Statistics, Theory and Methods, 45(17), 4969-4987.
21.
Hanagal, D. D., & Bhambure, S. M. (2017). Modeling Australian twin data using shared positive stable frailty models based on reversed hazard rate. Communications in Statistics, Theory and Methods, 46(8), 3754-3771.
22.
Hanagal, D. D., & Dabade, A. D. (2012). Modeling heterogeneity in bivariate survival data by compound Poisson distribution using Bayesian approach. International Journal of Statistics and Management Systems, 7(1-2), 36-84.
23.
Hanagal, D. D., & Dabade, A. D. (2013). Modeling of inverse Gaussian frailty model for bivariate survival data. Communications in Statistics, Theory and Methods, 42(20), 3744-3769.
24.
Hanagal, D. D., & Kamble, A. T. (2014). Bayesian estimation in shared positive stable frailty models. Journal of Data Science, 13, 615-640.
25.
Hanagal, D. D., & Kamble, A. T. (2015). Bayesian estimation in shared compound Poisson frailty models. Journal of Reliability and Statistical Studies, 8(1), 159-180.
26.
Hanagal, D. D., & Pandey, A. (2014a). Inverse Gaussian shared frailty for modeling kidney infection data. Advances in Reliability, 1, 1-14.
27.
Hanagal, D. D., & Pandey, A. (2014b). Gamma shared frailty model based on reversed hazard rate for bivariate survival data. Statistics and Probability Letters, 88, 190-196.
28.
Hanagal, D. D., & Pandey, A. (2015a). Gamma frailty models for bivariate survival data. Journal of Statistical Computation and Simulation, 85(15), 3172-3189.
29.
Hanagal, D. D., & Pandey, A. (2015b). Inverse Gaussian shared frailty models with generalized exponential and generalized inverted exponential as baseline distributions. Journal of Data Science, 13(2), 569-602.
30.
Hanagal, D. D., & Pandey, A. (2016). Inverse Gaussian shared frailty models based on reversed hazard rate. Model Assisted Statistics and Applications, 11, 137-151.
31.
Hanagal, D. D., & Pandey, A. (2017). Correlated gamma frailty models for bivariate survival data based on reversed hazard rate. International Journal of Data Science, 2(4), 301-324.
32.
Hanagal, D. D., & Pandey, A. (2020). Correlated inverse Gaussian frailty models for bivariate survival data. Communications in Statistics, Theory and Methods, 49(4), 845-863.
33.
Hanagal, D. D., Pandey, A., & Sankaran, P. G. (2017a). Shared frailty model based on reversed hazard rate for left censoring data. Communications in Statistics, Simulation and Computation, 46(1), 230-243.
34.
Hanagal, D. D., Pandey, A., & Ganguly, A. (2017b). Correlated gamma frailty models for bivariate survival data. Communications in Statistics, Simulation and Computation, 46(5), 3627-3644.
35.
Hanagal, D. D., & Sharma, R. (2013). Modeling heterogeneity for bivariate survival data by shared gamma frailty regression model. Model Assisted Statistics and Applications, 8, 85-102.
36.
Hanagal, D. D., & Sharma, R. (2015a). Bayesian inference in Marshall-Olkin bivariate exponential shared gamma frailty regression model under random censoring. Communications in Statistics, Theory and Methods, 44(1), 24-47.
37.
Hanagal, D. D., & Sharma, R. (2015b). Comparison of frailty models for acute leukaemia data under Gompertz baseline distribution. Communications in Statistics, Theory and Methods, 44(7), 1338-1350.
38.
Hanagal, D. D., & Sharma, R. (2015c). Analysis of bivariate survival data using shared inverse Gaussian frailty model. Communications in Statistics, Theory and Methods, 44(7), 1351-1380.
39.
Hougaard, P. (1986). Survival models for heterogeneous populations derived from stable distributions. Biometrika, 73, 387-396.
40.
Hougaard, P. (2000). Analysis of Multivariate Survival Data. Springer, New York.
Kheiri, S., Kimber, A., & Meshkani, M. R. (2007). Bayesian analysis of an inverse Gaussian correlated frailty model. Computational Statistics and Data Analysis, 51, 5317-5326.
43.
McGilchrist, C. A., & Aisbett, C. W. (1991). Regression with frailty in survival analysis. Biometrics, 47, 461-466.
44.
Moger, T. A., & Aalen, O. O. (2005). A distribution for multivariate frailty based on the compound Poisson distribution with random scale. Lifetime Data Analysis, 11, 41-59.
45.
Sahu, S. K., Dey, D. K., Aslanidou, H., & Sinha, D. (1997). A Weibull regression model with gamma frailties for multivariate survival data. Lifetime Data Analysis, 3, 123-137.
46.
Santos, C. A., & Achcar, J. A. (2010). A Bayesian analysis for multivariate survival data in the presence of covariates. Journal of Statistical Theory and Applications, 9, 233-253.
47.
Vaupel, J. W., Manton, K. G., & Stallard, E. (1979). The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography, 16, 439-454.
48.
Wienke, A. (2011). Frailty Models in Survival Analysis. Chapman & Hall/CRC, New York.