A copula-based approach to joint modelling of multiple longitudinal responses with multimodal structures

Abstract

This article introduces a flexible modelling strategy that extends the familiar mixed-effects models for analysing longitudinal responses to the multivariate setting. By introducing a flexible multivariate multimodal distribution, this strategy relaxes the normality assumption usually imposed on the random-effects. We use copulas to construct a multimodal form of elliptical distributions that can deal with the multimodality of responses and the non-linearity of the dependence structure. Moreover, the proposed model can flexibly accommodate clustered subject-effects for multiple longitudinal measurements, which is particularly useful when several subpopulations exist but cannot be directly identified. Since the implied marginal distribution does not have a closed form, we approximate the associated likelihood functions with a computational methodology based on the Gauss–Hermite quadrature, which in turn enables us to apply standard optimization techniques. We conduct a simulation study to highlight the main properties of the theoretical part and to make a comparison with regular mixture distributions. The results suggest that the new strategy deserves attention in practice. We illustrate the usefulness of our model by analysing a real-life dataset taken from a low back pain study.

Keywords

clustered random-effects; copula function; Gaussian quadrature; low back pain; multiple longitudinal responses; multimodality; non-linear dependence

1 Introduction

Linear mixed-effects (LME) models have been progressively extended in recent studies to analyse correlated data, including longitudinal or clustered data, wherein a set of subjects is repeatedly measured under different conditions or over different periods (Laird and Ware (1982)). A routine assumption in fitting various mixed models is the normality of the underlying subject-effects, though it may be violated in practical applications. For example, in the presence of outliers, the random-effects may follow a distribution with heavier tails than the normal distribution. Another realistic situation involves the existence of latent subpopulations in the data generating process, mainly when significant categorical covariates have been omitted by mistake from the fixed part of the models.

Although the conventional likelihood-based estimates of fixed effects might be robust to non-normality of random-effects (Butler and Louis (1992)), the same is invalid for the prediction of random-effects (Zhang and Davidian (2001)). Also, the maximum likelihood estimate of model parameters, including variance components, suffers from the loss of efficiency and incorrect computation of standard errors (Pinheiro et al. (2001)). In recent years, the main aim of several types of research has been centering on choosing suitable distributions for the random-effects and calibrating them efficiently to the observed data. In some applications, to avoid misleading inference, it is suggested that the collected measurements must be classified based on the adoption of a multimodal distribution for random-effects.

The choice of statistical methods in the literature to set up a multimodal structure has mostly focused on a mixture of multiple unimodal components. An example of a finite mixture of normal distributions in LME models is given by (Verbeke and Lesaffre (1996)) and detailed further by (Verbeke and Molenberghs (2000)). Another application, using the skew-t components, is proposed by (Lin (2010)) to allow the accommodation of both skewness and thick tails for random-effects.

In this article, we extend the familiar mixed-effects models to the analysis of multiple longitudinal responses by utilizing an innovative modelling strategy to cover possible multimodality of data. We fit separate LME models to all response variables and join them by allowing a suitable multivariate multimodal distribution for the random-effects.

There are several issues in the application of mixture distributions to fit LME models for multiple responses. One is identifiability (Hennig (2000)), which arises from the unstructured forms of the covariance matrices and the large number of unknown parameters and complicates the use of traditional estimation procedures. Any mixture distribution intended to simplify the computation requires convincing prior information on the number of components. Fixing this number to avoid overfitting, and using the same distribution for all clusters, are rather strict restrictions. Other constraints comprise the linearity of the dependence structure between variables and the similarity of all marginal distributions.

The importance of dealing with these challenges motivated us to investigate alternative strategies that offer great flexibility in the joint modelling of multimodal data. We construct a new multivariate distribution by combining copula functions (Sklar (1959)) with a member of the elliptical distributions (Fang et al. (1990)) called the double Gamma (DG) distribution. It is a suitable choice for analysing longitudinal data that exhibit multimodality since it can cover most distributional peaks through a limited number of parameters without the need for strong prior information. Moreover, this option has an advantage over mixture distributions, which tend to require additional components or parameters to capture more peaks. Furthermore, the use of copulas separates the dependence structure of a multivariate distribution from the individual marginal distributions. When analysing data with copulas, the analyst should be aware of the various forms of association between variables and of the tail behaviour of the related distributions (Joe (2014); Durante and Sempi (2015)). For instance, a copula may be useful for observed data that indicate correlation in the extreme tails but not elsewhere in the distribution.

The proposed strategy constructs a multimodal structure for the underlying random-effects based on the DG distribution and a suitable copula. It has several key advantages, such as (a) facilitating the fitting of general mixed models with multiple responses, (b) presenting great flexibility for modelling the non-linear correlations between responses, (c) allowing diverse margins for each response variable, (d) joining various types of response variables with multimodal behaviours, (e) managing the impact of hidden subpopulations with different characteristics in terms of peaks that cannot be directly observed through the values of responses, (f) being useful for analysing clustered data, and (g) avoiding incorrect inference when the normality of random-effects is violated, clustering of measurements occurs, or some significant categorical covariates have been omitted accidentally.

We use the maximum likelihood approach to fit the proposed models. The likelihood functions do not have closed-form expressions because of complicated integrals. We therefore rely on numerical techniques to jointly approximate the integrals and maximize the underlying likelihoods. The estimation process for the proposed joint mixed model with non-normal random-effects is a difficult task. A useful procedure for fitting complex mixed models is NLMIXED in SAS (High and Elrayes (2017); SAS Institute (2018); Toenges and Jahn-Eimermacher (2020)). First, the Gauss–Hermite quadrature technique is used to approximate the integrals over the random-effects distributions. Then, the model fitting strategy is adapted to non-normal random-effects by reformulating the integrated likelihood function (Liu and Yu (2007); Liu and Huang (2008)). The resulting SAS programming statements remain similar to those of general mixed models. At each iteration of the process, the gradient vector and the approximate Hessian matrix of the likelihood function are computed by summing over subjects and quadrature points, and the parameter estimates are updated via the Newton–Raphson algorithm. The entire procedure is repeated until convergence. Finally, this procedure yields parameter estimates along with their approximate standard errors based upon the diagonal elements of the inverse observed-Hessian matrix. The normal approximation theory is then convenient for statistical inference (Pinheiro and Bates (1995); Molenberghs et al. (2009); Jiang (2017)).

We examine the usefulness of our methodology in the analysis of low back pain (LBP) and its related disabilities, which have grown in most industrialized countries and are amongst the most frequent reasons for consulting a primary care physician. Prevention of LBP in primary stages is a principal public health problem worldwide since it contributes to the prevention of disabilities because of back pain in progressive stages. To evaluate the contributions of some factors to the acute LBP, we apply our strategy to re-analyse a real-life dataset taken from a prospective cohort study on LBP (Park et al. (2010)).

The remainder of the article is organized as follows. Section 2 introduces the proposed univariate DG distribution and reports its main properties. Section 3 presents a short introduction to copula functions together with the construction of a new multivariate multimodal distribution using copulas. Section 4 specifies the multivariate mixed-effects models and extends them to the joint analysis of multiple multimodal responses. Section 5 describes maximum likelihood estimation based on the Gauss–Hermite quadrature. Section 6 provides simulation studies to evaluate the performance of our proposed model. Section 7 illustrates our methodology by analysing a real-life dataset taken from the LBP study and compares our proposed model with several competitors.

2 Double Gamma distribution

A special case of elliptical distributions (Fang et al. (1990)) is the DG distribution (Johnson et al. (2004); Nguyen and Chen (2009)), defined as follows.

Definition. The random variable $X$ follows the DG distribution with parameters $μ \in \mathbb{R}$, $σ > 0$ and $α > 0$ if its probability density function (PDF) is of the form

f (x; μ, σ, α) = \frac{1}{2 Γ (α) σ^{α}} | x - μ |^{α - 1} \exp (- | x - μ | / σ),     x \in \mathbb{R} .

(2.1)

We denote $X \sim DG (μ, σ, α)$. By conducting basic statistical techniques, it is easy to show that $E (X) = μ$, $Var (X) = α (α + 1) σ^{2}$, the kurtosis measure is $(α + 3) (α + 2) / [α (α + 1)]$, and the cumulative distribution function (CDF) of $X$ is

F (x) = \frac{1}{2} (1 + sign (x - μ) F_{α} (| x - μ | / σ)),     x \in \mathbb{R},

where $F_{α}$ is the standard gamma distribution function with shape parameter $α$ and $sign (t)$ equals $0$ for $t = 0$, $- 1$ for $t < 0$, and $+ 1$ for $t > 0$.

Figure 1:

The density plot of $DG (0, 1, α)$ for (left) $α = 0.6$, (centre) $α = 1$ and (right) $α = 3$

The univariate DG distribution is symmetric about $μ$ and its shape depends on $α$ . Figure 1 shows density plots of $DG (0, 1, α)$ for some values of $α$ . For $0 < α < 1$ the density function (2.1) tends to infinity at $X = μ$ , whereas for $α > 1$ it has a local minimum at $μ$ with two modes. The special case $α = 1$ refers to a generalization of the double-exponential distribution introduced by (Gómez et al. (1998)).
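As an illustration of the definition above, the following Python sketch evaluates the DG density, CDF and quantile function through the gamma distribution in SciPy. The function names `dg_pdf`, `dg_cdf`, `dg_ppf` and `dg_var` are hypothetical and not part of the original article; the quantile function is obtained by inverting the CDF, and the variance uses the closed form $α (α + 1) σ^{2}$.

```python
import numpy as np
from scipy import stats


def dg_pdf(x, mu=0.0, sigma=1.0, alpha=1.0):
    """DG density (2.1): a gamma density in |x - mu| / sigma, shared equally by both signs."""
    z = np.abs(np.asarray(x, dtype=float) - mu) / sigma
    return 0.5 * stats.gamma.pdf(z, a=alpha) / sigma


def dg_cdf(x, mu=0.0, sigma=1.0, alpha=1.0):
    """DG distribution function: 0.5 * (1 + sign(x - mu) * F_alpha(|x - mu| / sigma))."""
    d = np.asarray(x, dtype=float) - mu
    return 0.5 * (1.0 + np.sign(d) * stats.gamma.cdf(np.abs(d) / sigma, a=alpha))


def dg_ppf(u, mu=0.0, sigma=1.0, alpha=1.0):
    """DG quantile function, obtained by inverting dg_cdf."""
    s = 2.0 * np.asarray(u, dtype=float) - 1.0
    return mu + sigma * np.sign(s) * stats.gamma.ppf(np.abs(s), a=alpha)


def dg_var(sigma=1.0, alpha=1.0):
    """Closed-form variance alpha * (alpha + 1) * sigma^2."""
    return alpha * (alpha + 1.0) * sigma ** 2


if __name__ == "__main__":
    # For alpha > 1 the density vanishes at mu and has two symmetric modes.
    grid = np.linspace(-6.0, 6.0, 13)
    print(np.round(dg_pdf(grid, 0.0, 1.0, 3.0), 4))
```

For $α > 1$ the two modes sit at $μ \pm (α - 1) σ$, which follows from the mode of the underlying gamma density.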

3 Multimodal double Gamma copula

Consider the random vector $(U_{1}, \dots, U_{p})^{⊤}$ where each $U_{i}$, $i = 1, \dots, p$, is uniformly distributed over the unit interval $[0, 1]$. On the unit hyper-cube $[0, 1]^{p}$, a $p$-dimensional copula function $C$ can be defined as the joint CDF of $(U_{1}, \dots, U_{p})^{⊤}$. (Sklar (1959)) shows that for any $p$-dimensional random vector $X = (X_{1}, \dots, X_{p})^{⊤}$ with joint CDF $F (x_{1}, \dots, x_{p})$ and continuous margins $F_{1} (x_{1}), \dots, F_{p} (x_{p})$, a unique copula function $C$ exists on $Ran F_{1} \times \dots \times Ran F_{p}$, where $Ran F_{k}$ denotes the range of $F_{k}$ for $k = 1, \dots, p$, such that $F (x_{1}, \dots, x_{p})$ can be represented through this copula and its margins as

F (x_{1}, \dots, x_{p}) = C (F_{1} (x_{1}), \dots, F_{p} (x_{p})),     (x_{1}, \dots, x_{p})^{⊤} \in \mathbb{R}^{p} .

(3.1)

The main objective is to construct a multivariate distribution through a specific copula and a set of univariate margins. Using this fact, we construct a class of flexible multivariate multimodal distributions through a combination of copulas with multimodal univariate margins, such as the DG. This strategy helps us to analyse multivariate responses that have a multimodal structure. When selecting suitable margins, we note that a unimodal distribution indicates an unclustered population, while the existence of several distinct modes indicates a clustered population for the measurements of each joined variable.

As already mentioned, closed-form expressions are available for the PDF and CDF of the univariate DG distribution. Thus, we introduce at least one DG distribution as the marginal of copula to consequently provide a collection of multivariate multimodal distributions. The dependence between related variables is then specified by making use of an assigned copula.

An attractive feature of copulas is to facilitate constructing scale-invariant measures of dependence that remain unchanged under monotonically increasing transformations of the marginal distributions. Thus, the specialist can express any scale-invariant dependence measure in terms of the underlying copula (Trivedi and Zimmer (2007); Durante and Sempi (2015)). A well-known type of these measures is Kendall's $τ$ given by

τ_{K} = 4 \int_{0}^{1} \int_{0}^{1} C (u_{1}, u_{2}) dC (u_{1}, u_{2}) - 1 .

If the copula $C$ and margins $F_{1} (x_{1}), \dots, F_{p} (x_{p})$ are continuous and differentiable then the joint density function, corresponding to the joint distribution (3.1), is given by

f (x_{1}, \dots, x_{p}) = c (F_{1} (x_{1}), \dots, F_{p} (x_{p})) \prod_{k = 1}^{p} f_{k} (x_{k}),     (x_{1}, \dots, x_{p})^{⊤} \in \mathbb{R}^{p},

where $f_{k} (x_{k})$ is the density corresponding to the marginal CDF $F_{k} (x_{k})$ for $k = 1, \dots, p$ and the copula density $c$ is the derivative of the copula $C$.

The proposed approach can accommodate multimodality. To show this, Figure 2 presents the contour plots of familiar Archimedean copulas (Nelsen (2006)) coupled with DG and normal margins.

Note that the value of Kendall's $τ$ for all given copulas equals 0.5. These figures indicate that the number of peaks is a function of the assigned margins. The copula function reflects only a particular dependence structure. The choice of copula directly controls what parts of the implied distribution are more associated. As mentioned by (Frees and Valdez (1998)), Frank copula imposes a specific radially symmetric dependence structure. For the Clayton copula, the dependence in the lower-left region is strong relative to that in the upper-right region, while for the Gumbel copula the dependence in the upper-right region is stronger than that in the lower-left region.
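To make the link between Kendall's $τ$ and the copula parameter concrete, the sketch below converts $τ = 0.5$ into the parameter of the Clayton, Gumbel and Frank families. It assumes the usual one-parameter forms of these copulas and their standard relations ($τ = θ / (θ + 2)$ for Clayton, $τ = 1 - 1 / θ$ for Gumbel, and the Debye-function expression for Frank); these relations and the function names are not taken from the article itself.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq


def clayton_theta(tau):
    """Invert tau = theta / (theta + 2) for the Clayton copula."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_theta(tau):
    """Invert tau = 1 - 1 / theta for the Gumbel copula."""
    return 1.0 / (1.0 - tau)


def frank_tau(theta):
    """Kendall's tau of the Frank copula: 1 - 4 / theta * (1 - D1(theta))."""
    # D1 is the first Debye function; the integrand tends to 1 as t -> 0.
    integrand = lambda t: t / np.expm1(t) if t > 0.0 else 1.0
    debye1 = quad(integrand, 0.0, theta)[0] / theta
    return 1.0 - 4.0 / theta * (1.0 - debye1)


def frank_theta(tau):
    """Solve frank_tau(theta) = tau numerically for positive dependence."""
    return brentq(lambda th: frank_tau(th) - tau, 1e-6, 50.0)


if __name__ == "__main__":
    for name, fn in [("Clayton", clayton_theta), ("Gumbel", gumbel_theta), ("Frank", frank_theta)]:
        print(name, fn(0.5))
```

The three families therefore share the same overall strength of association while distributing it differently over the unit square, which is the point made by the comparison of the contour plots.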

Figure 2:

The density plot of the copula with the standard normal as both margins (left), the DG as one of two margins (centre), and the DG as both margins (right)

4 Specification of multimodal multivariate LME models

Multivariate linear mixed-effects (MLME) modelling is an appropriate technique to describe, in terms of a set of fixed covariates, the variation in multiple responses that are measured repeatedly over time for each subject. Let $p$ responses be measured for $N$ subjects. For each subject $i = 1, 2, \dots, N$, let the response vector $Y_{i}^{k} = {(Y_{i 1}^{k}, \dots, Y_{i n_{i}}^{k})}^{⊤}$ collect the measurements of the $k$th response ($k = 1, \dots, p$) at $n_{i}$ different time periods. The traditional LME model assumes that the response vector $Y_{i}^{k}$ for each $k$ satisfies

Y_{i}^{k} | b_{i}^{k} \sim N_{n_{i}} (X_{i}^{k} β^{k} + Z_{i}^{k} b_{i}^{k}, D_{i}^{k}),

where $X_{i}^{k}$ and $Z_{i}^{k}$ are $n_{i} \times r$ and $n_{i} \times q$ known covariate matrices related to the $r$-dimensional vector of unknown fixed regression coefficients $β^{k}$ and the $q$-dimensional vector of random-effects $b_{i}^{k} = (b_{i 1}^{k}, \dots, b_{iq}^{k})$, respectively, and $D_{i}^{k}$ denotes an $n_{i} \times n_{i}$ covariance matrix (Fieuws and Verbeke (2006)). A usual assumption in fitting MLME models is that the measures $Y_{i 1}^{k}, \dots, Y_{i n_{i}}^{k}$ for each $k$ are conditionally independent given $b_{i}^{k}$. It simply results in $D_{i}^{k} = σ_{k}^{2} I_{n_{i}}$, where $I_{n_{i}}$ is an $n_{i}$-dimensional identity matrix. Furthermore, all random-effects are assumed to be normally distributed. In practice, this naïve assumption can likely be violated if special classification exists for some responses.

In this article, we propose an extension of MLME models for the analysis of multiple clustered responses. We construct a new model by allowing the response vector $Y_{i}^{k}$, conditioned on the random-effects $b_{i}^{k}$, to follow a known distribution with PDF $g^{k} (Y_{i}^{k} | b_{i}^{k}; θ^{k})$, where the vector of unknown parameters $θ^{k}$ possibly depends on some covariates. By assuming that all elements of $Y_{i}^{k}$ are independent given $b_{i}^{k}$, we introduce a multivariate distribution that uses a proposed multimodal copula for the random-effects to take the correlated responses into account. That is, fitting a separate LME model for each response and then joining the random-effects of all responses through the multimodal copula distribution specifies the joint model. This strategy also allows choosing any marginal density with the bimodal/multimodal property, such as the univariate double Gamma, for each random effect, to construct a multivariate multimodal density function.

In the regression modelling methodology, the marginal expectation of responses is commonly assumed to depend only on the covariates, that is, $E (Y_{i}^{k}) = X_{i}^{k} β^{k}$. It is desirable to keep this property in our proposed model, which we do by assuming that the marginal mean of each random effect is zero. Under these assumptions, the $k$th LME model determines the marginal correlation between the $k$th response's measurements at two time points $j$, $s = 1, \dots, n_{i}$ within subject $i$, as $Corr (Y_{ij}^{k}, Y_{is}^{k}) = σ_{i (j, s)}^{k} / \sqrt{σ_{i (j, j)}^{k} σ_{i (s, s)}^{k}}$ for $j \neq s$, where $σ_{i (j, s)}^{k} = Z_{ij}^{k ⊤} Cov (b_{i}^{k}) Z_{is}^{k} + σ_{k}^{2} I (j = s)$, with $Z_{ij}^{k ⊤}$ being the $j$th row of the matrix $Z_{i}^{k}$ and $I (\cdot)$ denoting the indicator function. Moreover, the dependence between the response-specific random-effects generates the correlation between the measurements of different responses $k$, $l = 1, \dots, p$, given by $Corr (Y_{ij}^{k}, Y_{is}^{l}) = σ_{i (j, s)}^{(k, l)} / \sqrt{σ_{i (j, j)}^{k} σ_{i (s, s)}^{l}}$, where $σ_{i (j, s)}^{(k, l)} = Z_{ij}^{k ⊤} Cov (b_{i}^{k}, b_{i}^{l}) Z_{is}^{l}$ and $Cov (b_{i}^{k}, b_{i}^{l})$ is computed from the defined copula and Hoeffding's Lemma (Pumi and Lopes (2012)).

For illustration, consider the following simple model

Y_{i}^{k} | b_{i}^{k} \sim N_{n_{i}} (X_{i}^{k} β^{k} + b_{i}^{k} J_{n_{i}}, σ_{k}^{2} I_{n_{i}}),

where $J_{n_{i}}$ denotes an $n_{i}$-dimensional vector of ones. For all $k$ and $l$ we have

Corr (Y_{ij}^{k}, Y_{is}^{l}) = Cov (b_{i}^{k}, b_{i}^{l}) / \sqrt{Var (Y_{ij}^{k}) Var (Y_{is}^{l})},

with $Var (Y_{ij}^{k}) = Var (b_{i}^{k}) + σ_{k}^{2}$, where $Var (b_{i}^{k})$ can be obtained from the marginal distribution of the random intercept $b_{i}^{k}$ and

Cov (b_{i}^{k}, b_{i}^{l}) = \int_{0}^{1} \int_{0}^{1} \frac{C (u_{k}, u_{l}) - u_{k} u_{l}}{f_{k} (F_{k}^{- 1} (u_{k})) f_{l} (F_{l}^{- 1} (u_{l}))} {du}_{k} {du}_{l},

where $F_{k}^{- 1} (u_{k})$ and $F_{l}^{- 1} (u_{l})$ denote the marginal quantiles and $f_{k} (\cdot)$ and $f_{l} (\cdot)$ denote the marginal PDFs of $b_{i}^{k}$ and $b_{i}^{l}$, respectively.

The above expressions reveal that the correlation between measurements of responses is directly related to the correlation measure between responses-specific random-effects. It also shows that measures within each response variable may be correlated even if measures between two mixed responses are uncorrelated.
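The covariance formula based on Hoeffding's Lemma can be evaluated numerically. The sketch below is only an illustration under stated assumptions: it works on the original scale of the random intercepts, that is, it integrates $C (F_{k} (x), F_{l} (y)) - F_{k} (x) F_{l} (y)$ over a finite grid in $(x, y)$, which is the same integral after the change of variables $u = F (x)$ and avoids the singular denominator near $0$ and $1$. The function names, the grid limits and the use of standard normal margins in the demonstration are hypothetical choices; a DG distribution function could be passed in the same way.

```python
import numpy as np
from scipy import stats


def hoeffding_cov(copula, cdf_k, cdf_l, lo=-10.0, hi=10.0, n=1001):
    """Riemann-sum approximation of the Hoeffding covariance integral on [lo, hi]^2."""
    grid = np.linspace(lo, hi, n)
    step = grid[1] - grid[0]
    fk = cdf_k(grid)[:, None]   # F_k(x) down the rows
    fl = cdf_l(grid)[None, :]   # F_l(y) across the columns
    return float(np.sum(copula(fk, fl) - fk * fl) * step * step)


def clayton_copula(u, v, theta=2.0):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta)."""
    with np.errstate(divide="ignore"):
        return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)


def cross_corr(cov_b, var_bk, var_bl, sigma2_k, sigma2_l):
    """Correlation between two responses in the random-intercept model."""
    return cov_b / np.sqrt((var_bk + sigma2_k) * (var_bl + sigma2_l))


if __name__ == "__main__":
    cov_b = hoeffding_cov(clayton_copula, stats.norm.cdf, stats.norm.cdf)
    print("Cov(b1, b2):", cov_b)
    print("Corr(Y1, Y2):", cross_corr(cov_b, 1.0, 1.0, 1.0, 4.0))
```

With the independence copula the integral is zero, so the responses are uncorrelated across $k$ and $l$, while measurements within each response remain correlated through $Var (b_{i}^{k})$.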

The inference for the vector of unknown model parameters $Θ$ (which includes the parameter vectors $θ^{1}, \dots, θ^{p}$, the parameters of the marginal distributions of the random-effects and the parameter of the selected copula function $C$) in fitting the proposed model is based on the log-likelihood function $ℓ (Θ | Y) = \sum_{i} ln [f (Y_{i}; Θ)]$, where $f (Y_{i}; Θ)$ is the marginal density function of the response vector $Y_{i} = (Y_{i}^{1}, \dots, Y_{i}^{p})^{⊤}$, which can be obtained by integrating out the random-effects vector $b_{i} = (b_{i}^{1}, \dots, b_{i}^{p})^{⊤}$ as

\begin{matrix} f (Y_{i}; Θ) & = & \int \prod_{k = 1}^{p} \{g^{k} (Y_{i}^{k} | b_{i 1}^{k}, \dots, b_{iq}^{k}; θ^{k}) \prod_{h = 1}^{q} f_{kh} (b_{ih}^{k})\} \\ \times c (F_{11} (b_{i 1}^{1}), \dots, F_{pq} (b_{iq}^{p})) {db}_{i 1}^{1} \dots {db}_{iq}^{1} \dots {db}_{i 1}^{p} \dots {db}_{iq}^{p}, \end{matrix}

(4.1)

where $F_{kh} (\cdot)$ and $f_{kh} (\cdot)$, for $k = 1, \dots, p$ and $h = 1, \dots, q$, are the CDF and PDF of the presumed marginal distribution for the random effect $b_{ih}^{k}$, respectively, and $c$ is the density function of the selected copula $C$.

5 Maximum likelihood estimation

To carry out inference on $Θ$, the direct maximization of the marginal log-likelihood function involves solving complex integrals, which calls for numerical techniques. In this article, we use the Gauss–Hermite quadrature to approximate the log-likelihood function so that inference on the parameters can be carried out in user-friendly software packages, such as SAS or R. Gaussian quadrature approximates an integral with a given kernel by a weighted average of the integrand evaluated at predetermined points, called nodes. Tables of the weights and nodes are available for the adopted kernel (Abramowitz and Stegun (1964)). Gaussian quadrature for multiple integrals is numerically complicated (Davis and Rabinowitz (2007)). Numerical techniques often use the efficient algorithm proposed by (Golub (1973)).
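The basic quadrature rule can be sketched in a few lines of Python. The example below is a minimal illustration, not the NLMIXED implementation: it uses NumPy's `hermgauss`, whose nodes and weights refer to the kernel $\exp (- x^{2})$, and rescales them so that the rule approximates an expectation under the standard normal distribution. The function name `gauss_hermite_expectation` is hypothetical.

```python
import numpy as np


def gauss_hermite_expectation(g, n_nodes=20):
    """Approximate E[g(Z)] for Z ~ N(0, 1) with an n-point Gauss-Hermite rule."""
    # Nodes x and weights w satisfy: integral of exp(-x^2) p(x) dx ~ sum(w * p(x)).
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    # Substituting z = sqrt(2) * x turns the kernel exp(-x^2) into the N(0, 1) density.
    return float(np.sum(w * g(np.sqrt(2.0) * x)) / np.sqrt(np.pi))


if __name__ == "__main__":
    print(gauss_hermite_expectation(lambda z: z ** 2))   # second moment of N(0, 1)
    print(gauss_hermite_expectation(np.exp))             # moment generating function at 1
```

The rule is exact for polynomial integrands of degree up to $2 n - 1$ with $n$ nodes, which is why a moderate number of nodes is often sufficient for smooth integrands.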

In fitting LME models for non-normal responses, the Gaussian quadrature technique can approximate the marginal density function directly by a weighted average of the integrand when the random-effects follow a normal distribution and the dimension of the random-effects vector is not large (Lesaffre and Spiessens (2001); McCulloch and Searle (2001); Gueorguieva (2001)). Because the random-effects in our proposed model are not normally distributed, the estimation process is not straightforward when we use the Gaussian quadrature to fit it.

Nevertheless, following the reformulation used by (Liu and Yu (2007)), we can multiply and divide the integrand in (4.1) by a standardized multivariate normal density and rewrite the resulting function over the vector of normal random-effects $α_{i} = {(α_{i}^{1}, \dots, α_{i}^{p})}^{⊤}$, where $α_{i}^{k} = (α_{i 1}^{k}, \dots, α_{iq}^{k})$ for $k = 1, \dots, p$, as

\begin{matrix} f (Y_{i}; Θ) & = & \int \prod_{k = 1}^{p} \{g^{k} (Y_{i}^{k} | α_{i 1}^{k}, \dots, α_{iq}^{k}; θ^{k}) \prod_{h = 1}^{q} f_{kh} (α_{ih}^{k})\} \\ \times c (F_{11} (α_{i 1}^{1}), \dots, F_{pq} (α_{iq}^{p})) \frac{ϕ_{pq} (α_{i}; 0, I_{pq})}{ϕ_{pq} (α_{i}; 0, I_{pq})} d α_{i 1}^{1} \dots d α_{iq}^{1} \dots d α_{i 1}^{p} \dots d α_{iq}^{p}, \end{matrix}

where $ϕ_{pq} (α_{i}; 0, I_{pq})$ denotes the normal density function of the $pq$-dimensional vector $α_{i}$ with zero mean vector and identity covariance matrix. Thus, the Gaussian quadrature technique can easily be applied to approximate the integrand

\prod_{k = 1}^{p} \{g^{k} (Y_{i}^{k} | α_{i 1}^{k}, \dots, α_{iq}^{k}; θ^{k}) \prod_{h = 1}^{q} f_{kh} (α_{ih}^{k})\} c (F_{11} (α_{i 1}^{1}), \dots, F_{pq} (α_{iq}^{p})) / ϕ_{pq} (α_{i}; 0, I_{pq}) .

This technique only requires that the PDF of random-effects has a closed form expression as is available for our proposed multimodal copulas.
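The reformulation can be illustrated for the simplest case of one response with a single DG random intercept, where no copula term is needed. The sketch below is written under these simplifying assumptions and is not the authors' SAS code: the conditional density is a product of normal densities, the random-intercept density is divided by the standard normal density, and the ratio is integrated with a non-adaptive Gauss–Hermite rule. All function names are hypothetical, and the adaptive-quadrature reference value is included only as a check.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad


def dg_pdf(a, sigma, alpha):
    """Zero-mean DG density, as in (2.1) with mu = 0."""
    return 0.5 * stats.gamma.pdf(np.abs(a) / sigma, a=alpha) / sigma


def conditional_density(y, mean, sigma_e, a):
    """g(y | a): independent normal responses sharing the random intercept a."""
    resid = (np.asarray(y, dtype=float) - mean)[:, None] - np.atleast_1d(a)[None, :]
    return np.prod(stats.norm.pdf(resid, scale=sigma_e), axis=0)


def marginal_likelihood_gh(y, mean, sigma_e, re_pdf, n_nodes=60):
    """Integrate g(y | a) * re_pdf(a) / phi(a) against the N(0, 1) kernel phi."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    a = np.sqrt(2.0) * x                       # nodes rescaled to the N(0, 1) kernel
    ratio = re_pdf(a) / stats.norm.pdf(a)      # multiply-and-divide step
    integrand = conditional_density(y, mean, sigma_e, a) * ratio
    return float(np.sum(w * integrand) / np.sqrt(np.pi))


def marginal_likelihood_reference(y, mean, sigma_e, re_pdf, limit=15.0):
    """Adaptive quadrature of g(y | a) * re_pdf(a), split at the kink a = 0."""
    f = lambda a: float(conditional_density(y, mean, sigma_e, a)[0] * re_pdf(a))
    return quad(f, -limit, 0.0)[0] + quad(f, 0.0, limit)[0]


if __name__ == "__main__":
    y = np.array([1.2, 1.9, 1.6])              # illustrative responses, zero fixed part
    dg = lambda a: dg_pdf(a, 1.0, 3.0)
    print(marginal_likelihood_gh(y, 0.0, 1.0, dg))
    print(marginal_likelihood_reference(y, 0.0, 1.0, dg))
```

Because the ratio of the DG density to the normal density grows in the tails, a non-adaptive rule may need more nodes than in the normal case; the number of nodes used here is an arbitrary choice for the illustration.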

6 Simulation studies

We conduct two simulation studies to highlight the performance of our modelling methodology in comparison to normal and mixture models. To obtain the maximum likelihood estimation of model parameters, we use the Gauss–Hermite quadrature, implemented in the NLMIXED procedure of SAS (SAS Institute (2018)).

To design the first simulation study a specific mixed-effects model is considered for illustrative purposes. In particular, we generate 100 datasets from the bivariate LME model

y_{ij}^{k} = β_{0}^{k} + β_{1}^{k} x_{j} + β_{2}^{k} x_{i}^{k} + b_{i}^{k} + e_{ij}^{k},

(6.1)

for $k = 1, 2$, $i = 1, \dots, 100$ and $j = 1, \dots, 5$, where $e_{ij}^{1} \overset{iid}{\sim} N (0, 1)$ and $e_{ij}^{2} \overset{iid}{\sim} N (0, 4)$ are mutually independent for all $i$ and $j$, the covariate $x_{j} = j - 3$ contains values changing within subjects and the same for all subjects, and $x_{i}^{1}$ and $x_{i}^{2}$ are assumed to be the subject level covariates and are drawn uniformly in the range $(10, 20)$. True values of fixed parameters are set to $β_{0}^{1} = 20$, $β_{1}^{1} = 4$, $β_{2}^{1} = 3$, $β_{0}^{2} = 10$, $β_{1}^{2} = 6$, and $β_{2}^{2} = 7$.

To illustrate the usefulness of our proposed multimodal copulas in accommodating multimodality, we select the Clayton copula and generate the random intercepts $b_{i}^{1}$ and $b_{i}^{2}$ from a bivariate distribution according to the Clayton copula with margins $DG (0, 1, 3)$ and $DG (0, 2, 5)$. We set $θ = 2$ to correspond to Kendall's $τ = 0.5$.

To generate $b_{i}^{1}$ and $b_{i}^{2}$, we first draw variates $(u_{1}, u_{2})$ from the following process (Nelsen (2006)):

1. Generate two independent uniform random variables $u_{1}$ and $v$.

2. Set $u_{2} = {((v^{- θ / (1 + θ)} - 1) u_{1}^{- θ} + 1)}^{- 1 / θ}$.

Then, we generate $b_{i}^{k}$ for $k = 1, 2$, by evaluating at $u_{k}$ the quantile function of $DG (μ_{k}, σ_{k}, α_{k})$, which follows from inverting the CDF in Section 2 and is given by $F^{- 1} (u) = μ + σ \, sign (2 u - 1) F_{α}^{- 1} (| 2 u - 1 |)$, where $F_{α}^{- 1}$ denotes the quantile function of a $Gamma (α, 1)$ distribution.
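The data-generating steps of the first simulation study can be sketched in Python as follows. This is an illustrative reconstruction, not the code used for the reported results: the function names and the random seed are hypothetical, the DG quantile is obtained by inverting the CDF of Section 2, and $N (0, 4)$ is read as a normal distribution with variance 4.

```python
import numpy as np
from scipy import stats


def dg_ppf(u, mu, sigma, alpha):
    """DG quantile function obtained by inverting the CDF of Section 2."""
    s = 2.0 * np.asarray(u, dtype=float) - 1.0
    return mu + sigma * np.sign(s) * stats.gamma.ppf(np.abs(s), a=alpha)


def clayton_sample(n, theta, rng):
    """Conditional sampling from the Clayton copula (steps 1 and 2 above)."""
    u1 = rng.uniform(size=n)
    v = rng.uniform(size=n)
    u2 = ((v ** (-theta / (1.0 + theta)) - 1.0) * u1 ** (-theta) + 1.0) ** (-1.0 / theta)
    return u1, u2


def simulate_dataset(rng, n_subj=100, n_time=5, theta=2.0):
    """One dataset from model (6.1); returns y with shape (2, n_subj, n_time)."""
    u1, u2 = clayton_sample(n_subj, theta, rng)
    b = np.stack([dg_ppf(u1, 0.0, 1.0, 3.0), dg_ppf(u2, 0.0, 2.0, 5.0)])
    x_time = np.arange(1, n_time + 1) - 3.0              # x_j = j - 3
    x_subj = rng.uniform(10.0, 20.0, size=(2, n_subj))   # subject-level covariates
    beta = np.array([[20.0, 4.0, 3.0], [10.0, 6.0, 7.0]])
    sd_e = np.array([1.0, 2.0])                          # error variances 1 and 4
    noise = rng.normal(size=(2, n_subj, n_time)) * sd_e[:, None, None]
    y = (beta[:, 0, None, None]
         + beta[:, 1, None, None] * x_time[None, None, :]
         + beta[:, 2, None, None] * x_subj[:, :, None]
         + b[:, :, None]
         + noise)
    return y, b, x_subj


if __name__ == "__main__":
    rng = np.random.default_rng(2024)    # arbitrary seed for reproducibility
    y, b, x_subj = simulate_dataset(rng)
    print(y.shape, stats.kendalltau(b[0], b[1])[0])
```

Since the DG quantile function is increasing, Kendall's $τ$ of the generated intercepts equals that of the underlying uniform pairs, so its sample value should be close to 0.5 for large samples.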

Histograms of the generated random intercepts in Figure 3 demonstrate the existence of two modes for each one. Also, in Figure 4 (left) and (centre), the scatter plot and the distribution surface of random intercepts show that two partitions exist and the non-linearity of the dependence structure is present.

Figure 3:

Histograms of the generated random-effects from the Clayton copula with DG margins

Figure 4:

The scatter plot (left), the surface plot (centre) and the contour plot (right) of the generated random-effects from the Clayton copula with DG margins

Table 1:

Simulation results based on 100 generated data sets of model (6.1) when the random-effects have been generated from (a) Clayton copula and (b) bivariate normal. Parameter estimates (Est) and their standard errors (SE) are reported.

Model	M1		M2		M3		M4
Parameters	Est	SE	Est	SE	Est	SE	Est	SE
(a) Multimodal random-effects
$β_{0}^{1} = 20$	19.821	0.798	19.873	0.792	20.081	0.764	20.153	0.799
$β_{0}^{2} = 10$	9.867	0.689	10.113	0.668	9.922	0.524	9.899	0.759
$β_{1}^{1} = 4$	4.014	0.271	4.013	0.264	4.013	0.231	4.019	0.285
$β_{1}^{2} = 6$	5.986	0.235	6.014	0.219	5.988	0.187	5.898	0.311
$β_{2}^{1} = 3$	3.067	0.172	2.944	0.167	2.957	0.155	3.61	0.188
$β_{2}^{2} = 7$	6.895	0.379	7.098	0.369	7.056	0.348	7.212	0.423
$σ_{e^{1}}^{2} = 1$	1.008	0.153	1.006	0.149	1.004	0.144	1.009	0.229
$σ_{e^{2}}^{2} = 4$	4.011	0.149	4.009	0.146	4.008	0.145	4.23	0.161
(b) Normal random-effects
$β_{0}^{1} = 20$	19.935	0.349	20.069	0.613	20.132	0.745	19.878	0.512
$β_{0}^{2} = 10$	9.951	0.371	10.166	0.581	10.187	0.654	10.067	0.423
$β_{1}^{1} = 4$	3.996	0.093	4.005	0.105	4.012	0.116	3.994	0.101
$β_{1}^{2} = 6$	5.997	0.106	6.009	0.109	5.991	0.114	6.007	0.107
$β_{2}^{1} = 3$	3.019	0.098	2.948	0.161	3.065	0.192	2.963	0.123
$β_{2}^{2} = 7$	7.008	0.027	6.949	0.192	7.073	0.193	7.045	0.179
$σ_{e^{1}}^{2} = 1$	0.999	0.127	1.008	0.229	0.988	1.241	1.009	0.221
$σ_{e^{2}}^{2} = 4$	4.004	0.214	4.006	0.248	4.011	0.254	4.006	0.232

For each of the 100 generated datasets, Model (6.1) was fitted by assuming that $e_{ij}^{k} \overset{iid}{\sim} N (0, σ_{e^{k}}^{2})$ for $k = 1, 2$, and that the random intercepts $b_{i}^{1}$ and $b_{i}^{2}$ are distributed as

M1: the bivariate normal $N_{2} (0, Σ_{b})$.

M2: the mixture distribution $\sum_{j = 1}^{2} π_{j} ϕ (μ_{j}, Σ_{b})$ with $\sum_{j = 1}^{2} π_{j} = 1$. Here, the condition $\sum_{j = 1}^{2} π_{j} μ_{j} = 0$ is required so that the mean value of the random-effects is zero. Also, it is necessary to assume a common covariance matrix for all components to avoid an unbounded likelihood (Böhning (1999); Verbeke and Lesaffre (1996)).

M3: the Clayton copula with margins $DG (0, σ_{1}, α_{1})$ and $DG (0, σ_{2}, α_{2})$.

M4: the Gaussian copula with margins $DG (0, σ_{1}, α_{1})$ and $DG (0, σ_{2}, α_{2})$.

For comparison, we report the parameter estimates and their standard errors for each model in Table 1(a). We also compute the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to select the best-fitted model. These values show that M3 is the best-fitted model for the generated datasets. The estimates of the shape parameters for the DG margins are also significant.

A comparison of the models shows that most parameter estimates are nearly unbiased and similar across all fitted models. In model M3, biases and standard errors are small. The efficiency of the $β_{2}^{1}$ and $β_{2}^{2}$ estimates, associated with the subject-level covariates $x_{i}^{1}$ and $x_{i}^{2}$, is improved in comparison with the case of normally distributed random-effects. This evidence suggests that our proposed model deserves further consideration in practical applications. Also, the adoption of incorrect assumptions (e.g., normality) for the random-effects distribution may reduce the efficiency of regression parameter estimates. Similar findings have been reported by (McCulloch and Neuhaus (2011)) for the estimate of intercepts. Our results show that the efficiency of the $β_{0}^{1}$ and $β_{0}^{2}$ estimates may be degraded when normality is assumed while the random-effects distribution is far from normal. The efficiency of the $β_{1}^{1}$ and $β_{1}^{2}$ estimates is nearly equal in all models, which shows that the distribution of random-effects does not have a remarkable influence on longitudinal effects. This fact was already noted by (Verbeke and Lesaffre (1996)) in a specific LME model. The estimates of the scale parameters differ across the fitted models but are not comparable because they are on different scales.

Figure 4 (right) displays the scatter plot of predicted random intercepts from model M3, with the super-imposed contour plots of the fitted Clayton copula with DG margins. It demonstrates that the additional flexibility provided by the proposed distribution is sufficient to capture quite accurately the true multimodality of random intercepts.

Afterward, we design the second simulation by assuming that the random intercepts follow a bivariate normal distribution and illustrate that the proposed model still provides reasonable estimation results. The evidence suggests that the proposed model can be adopted in practical applications as a substitute even if the classical model is correct. Specifically, we generate $b_{i}^{1}$ and $b_{i}^{2}$ from a bivariate normal distribution with mean $0$ and covariance matrix $\begin{pmatrix} 1 & 1 \\ 1 & 4 \end{pmatrix}$.

For each of the 100 generated datasets, we again fit Model (6.1) by assuming that $b_{i}^{1}$ and $b_{i}^{2}$ follow M1–M4. As expected, the AIC and BIC values choose the normal model as the best-fitted model. The results, given in Table 1(b), show that the estimates of most parameters are relatively unbiased in all models. The fixed-effects estimates in model M4 are notably close to those in model M1. The Gaussian copula can evidently cover the linear dependence structure between the generated normal random intercepts, so there is no efficiency loss associated with using the Gaussian copula. The comparison of the findings for M3 and M4 shows that changing the copula can increase the bias and standard error of some parameter estimates. As a result, a correct specification of both the random-effects distribution and the copula function is challenging. The success of data analysis using copulas relies on a suitable choice of margins that gives a clear picture of the fitted joint distribution.

7 Data analysis strategies for the low back pain study

We reanalyse a real-life dataset, which is taken from a prospective cohort study on LBP (Park et al. (2010)), to illustrate the usefulness of our proposed MLME model. The main aim of the study was to explore the effect of a treatment package composed of herbal medicine, acupuncture, bee venom acupuncture and a Korean version of spinal manipulation (Chuna) on LBP. We show that our methodology is useful when a complex structure involving the multimodality of bivariate responses is to be analysed.

7.1 Data description

The institutional review boards (IRBs) of both the University of North Carolina and Jaseng hospital in Korea have organized the LBP study. The dataset, collected from November 2006 to October 2007, includes a sample of 127 patients in total, none of whom had previously been treated for LBP at the Jaseng hospital. We excluded cases from the original sample according to exclusion criteria such as back pain caused by non-spinal or soft tissue issues, pregnancy, spinal tumour, rheumatoid arthritis, a history of back surgery, vertebral fracture, dislocation, suspected concurrent severe neurological symptoms and major organ transplantation (such as the heart, kidney or liver). The control measurement was taken at baseline, and follow-up measurements were taken at weeks 4, 8, 12, 16, 20 and 24. Patients were $34.7 \pm 8.4$ years old (mean $\pm$ standard deviation), and $41.6\%$ were female.

In our modelling process, we jointly analyse the visual analogue scale (VAS) (0–10) of back pain (Jensen et al. (1986)) and the Oswestry Disability Index (ODI) (Beurskens et al. (1996)). The model includes several medical and demographic factors such as the patients' age, sex, body mass index (BMI), surgery recommendation (0 = recommended and 1 = not recommended), baseline measures of the two responses, and the quality-of-life variables in two subcategories, mental health and physical health. These two main summary measures are aggregated from eight subscale items (physical functioning, role-physical, bodily pain, general health, vitality, social functioning, role-emotional and mental health) of the SF-36 Health-Related Quality of Life Questionnaire (Ware et al. (1995)) and are defined as scores ranging from 0 to 100, where a higher score indicates a better level of health.

A preliminary descriptive analysis shows that the number of patients who are in the normal BMI category (18.5–23), overweight ($> 23$), obesity and underweight ($< 18.5$) are, respectively, $63$ ($42\%$), $37$ ($24.7\%$) and $48$ ($32\%$). Based on these categories and other unmeasured factors, a hidden classification may exist in the structure of the collected data. We take this possibility into account in our data analysis process.

7.2 Data analysis

The individual profiles plot, not shown here, shows that both ODI and VAS levels increase over time for most patients and substantial inter-patient variation exists. Thus, we consider the following random-intercepts models for responses $(Y_{ij}^{1}, Y_{ij}^{2}) \equiv ({VAS}_{ij}, {ODI}_{ij})$ ,

$$Y_{ij}^{k} = X_{ij} β^{k} + b_{i}^{k} + e_{ij}^{k}, \qquad (7.1)$$

for $i = 1, \dots, 127$ and $j = 1, \dots, 6$. Before fitting any proposed model, the analyst commonly performs a preliminary data analysis by fitting a familiar structural model to examine whether the underlying structure holds or alternatives are necessary. Following Verbeke and Molenberghs (2000, Ch. 9) for the selection of a mean structure and the residual covariance structure, we fit two separate mixed models (7.1), one for each of ${VAS}_{ij}$ and ${ODI}_{ij}$; that is, $e_{ij}^{k} \overset{iid}{\sim} N (0, σ_{e^{k}}^{2})$ and $b_{i}^{k} \overset{iid}{\sim} N (0, σ_{b^{k}}^{2})$ for $k = 1, 2$, and all are mutually independent. In brief, the preliminary mean structure analysis of our longitudinal data indicated that only the main fixed-effects of the associated covariates should be included in each mixed model. Moreover, to specify an appropriate covariance matrix for the error components $e_{ij}^{k}$, $k = 1, 2$, we examined some structures available in the SAS procedure MIXED. We found that, among frequently used covariance structures, an appropriate structure for the residual components, in the presence of random-effects, is of the form $σ_{e^{k}}^{2} I_{6}$, for $k = 1, 2$.

Next, to verify whether a correlation exists between the two responses in Model (7.1), we examine a preliminary correlation structure. The empirical correlation between the pair (ODI, VAS) was 0.62, suggesting that a bivariate model may fit significantly better than two separate univariate LME models. Results of the fitted models show that the correlation between the predicted random intercepts of the two separate models for the ODI and the VAS is high (0.83), which may suggest that a model with one shared random intercept should also fit well. However, a comparison of the two fitted models with shared and separate random intercepts shows that the sharing strategy does not give a better fit according to the AIC and BIC values. Thus, our preliminary data analysis indicates that a plausible model should assume that $(b_{i}^{1}, b_{i}^{2})$ follows a bivariate distribution.

As a preliminary bivariate mixed-effects model, we let $b_{i}^{1}$ and $b_{i}^{2}$ be correlated and bivariate-normally distributed. Then, we fit the bivariate LME model M1 using the SAS procedure MIXED (Thiébaut et al. (2002)). However, based on histograms of the predicted random intercepts from model (7.1), shown in Figure 5, we observed that the random intercept associated with VAS deviates from normality and has a multimodal shape, while the random intercept associated with ODI may follow a normal distribution. Also, the related density surface and scatter plot of the predicted random intercepts from model (7.1), shown in Figure 6 (left) and (centre), indicate that the joint distribution of the intercepts $(b_{i}^{1}, b_{i}^{2})$ may be bimodal.

Figure 5: Histograms of the predicted random intercepts from model (7.1) in the low back pain study.

Figure 6: The scatter plot (left) and the surface plot (centre) of the predicted random intercepts from the bivariate mixed model M1, and the contour plots of the Clayton copula with DG margins (right), in the low back pain study.

The above evidence motivates us to examine the ability of our proposed strategy to classify patients without any prior information on the group structure. The strong dependence observed in the lower-left region of the predicted random intercepts from model (7.1) may be captured by the Clayton copula. Thus, we specify a multimodal bivariate distribution for the random intercepts by utilizing the univariate DG distribution for each random intercept and the Clayton copula to join them. Because the joint distribution of random intercepts is multimodal, we also fit bivariate LME models in which the random intercepts follow a finite mixture of normal components and the bivariate DG distribution.

For comparison, we fit the mixed-effects model (7.1) by assuming that the random intercepts are distributed according to the already introduced specifications for Models M1–M3 and the two following models.

M4: The Clayton copula with margins $N (0, σ_{1}^{2})$ and $N (0, σ_{2}^{2})$.

M5: The Clayton copula with margins $DG (0, σ_{1}, α_{1})$ and $N (0, σ_{2}^{2})$.
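To make the construction concrete, the following is a minimal Python sketch of the joint log-density of two random intercepts joined by a Clayton copula with normal margins (the first of the two specifications above). The function names are illustrative assumptions; a DG margin would be obtained by swapping in the DG density and distribution function for the normal ones.

```python
import math


def normal_pdf(x, sd):
    """Density of N(0, sd^2) at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, sd):
    """Distribution function of N(0, sd^2) at x."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def clayton_log_density(u, v, theta):
    """Log-density of the Clayton copula at (u, v), for theta > 0."""
    return (math.log(1.0 + theta)
            - (theta + 1.0) * (math.log(u) + math.log(v))
            - (1.0 / theta + 2.0) * math.log(u ** -theta + v ** -theta - 1.0))


def joint_log_density(b1, b2, sd1, sd2, theta):
    """Log-density of (b1, b2): Clayton copula joining two normal margins."""
    u, v = normal_cdf(b1, sd1), normal_cdf(b2, sd2)
    return (clayton_log_density(u, v, theta)
            + math.log(normal_pdf(b1, sd1))
            + math.log(normal_pdf(b2, sd2)))
```

As the dependence parameter `theta` approaches zero, the Clayton copula approaches independence, so the joint log-density reduces to the sum of the two marginal log-densities; this gives a quick sanity check of an implementation.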

Table 2 shows the estimation results. We report the SAS code for M5, as an example, in the Appendix. The values of the model selection criteria show that M5 is the best-fitting model, while M3 is the second-best one. The dependence parameter of the Clayton copula and the shape parameter of the DG margin in M5 are both significant. We observe that the standard errors of the fixed effects associated with most covariates in the normal model M1 are larger than those in the models assuming multimodality, and are smallest in the selected model M5. The same is true for the estimates of the random-effects variances.

Figure 6 (right) displays the scatter plot of the predicted intercepts with the superimposed contour plots of the fitted model M5. This figure indicates that our proposed model can flexibly capture the multimodality and the non-linear dependence between the random intercepts.

Table 2: Estimate (standard error) of model parameters under the fitted models M1–M5 for the low back pain study. ODI = Oswestry Disability Index; VAS = visual analogue scale.

Parameter	M1	M2	M3	M4	M5
Fixed effects parameters
Baseline $^{ODI}$	$0.59 (0.67)$	$0.45 (0.54)$	$0.43 (0.52)$	$0.48 (0.59)$	$0.43 (0.33)$
Female $^{ODI}$	$0.38 (0.91)$	$0.53 (0.53)$	$0.46 (0.46)$	$0.36 (0.62)$	$0.54 (0.42)$
Age $^{ODI}$	$0.92 (0.74)$	$0.85 (0.52)$	$0.75 (0.42)$	$0.78 (0.57)$	$0.72 (0.36)$
BMI $^{ODI}$	$1.99 (1.39)$	$2.09 (1.26)$	$1.84 (1.11)$	$2.41 (1.29)$	$2.89 (0.91)$
Surgery recommendation $^{ODI}$	$0.13 (1.82)$	$0.33 (1.19)$	$0.52 (1.13)$	$0.46 (1.25)$	$0.42 (1.07)$
Physical health $^{ODI}$	$-0.64 (0.84)$	$-1.44 (0.76)$	$-1.09 (0.53)$	$-1.14 (0.69)$	$-1.05 (0.31)$
Mental health $^{ODI}$	$-0.78 (0.64)$	$-1.34 (0.46)$	$-1.11 (0.23)$	$-1.17 (0.47)$	$-1.98 (0.14)$
Baseline $^{VAS}$	$0.09 (0.09)$	$0.09 (0.07)$	$0.03 (0.06)$	$0.07 (0.08)$	$0.09 (0.05)$
Age $^{VAS}$	$0.27 (0.37)$	$0.34 (0.29)$	$0.59 (0.14)$	$0.51 (0.32)$	$0.64 (0.11)$
BMI $^{VAS}$	$0.03 (0.89)$	$0.19 (0.64)$	$0.09 (0.34)$	$0.07 (0.68)$	$0.13 (0.29)$
Surgery recommendation $^{VAS}$	$1.46 (1.91)$	$1.92 (1.36)$	$1.55 (1.28)$	$1.25 (1.56)$	$1.56 (1.13)$
Physical health $^{VAS}$	$-0.07 (0.09)$	$-0.02 (0.06)$	$-0.08 (0.05)$	$-0.01 (0.06)$	$-0.09 (0.04)$
Mental health $^{VAS}$	$-0.55 (0.36)$	$-0.58 (0.19)$	$-0.86 (0.16)$	$-0.52 (0.28)$	$-0.95 (0.13)$
Variance components
$Var (b_{1})$	$16.74 (5.21)$	$16.01 (4.08)$	$14.96 (3.82)$	$15.91 (3.92)$	$14.15 (3.27)$
$Var (b_{2})$	$16.61 (5.82)$	$14.21 (4.30)$	$12.52 (4.41)$	$15.13 (4.35)$	$12.11 (4.22)$
$σ_{e^{1}}^{2}$	$1.97 (0.027)$	$1.93 (0.026)$	$1.94 (0.027)$	$1.92 (0.025)$	$1.91 (0.024)$
$σ_{e^{2}}^{2}$	$1.46 (0.35)$	$1.39 (0.23)$	$1.36 (0.16)$	$1.45 (0.22)$	$1.34 (0.14)$
Model selection criteria
AIC	$7791$	$7403$	$7307$	$7665$	$7273$
BIC	$7845$	$7458$	$7363$	$7612$	$7326$

7.3 Medical results based on the best-fitted model

Results of the best-fitting model show that both responses are not significantly related to gender or surgery recommendation for patients with a considerable amount of pain or disability at the beginning of the study. An intervention may be associated with the degree of patient improvement, according to the changes in the follow-up values of VAS and ODI. Previous researchers have already reported this result (Karaman et al. (2011)).

As expected, our analysis shows that a significant positive relationship exists between age and both ODI and VAS. It means that an increase in age leads to higher disability and pain severity, which suggests that the risk of disabling back pain rises at older ages. Accordingly, a suitable management policy for LBP in elderly patients is desirable. Another finding of our study is the significant positive relationship between BMI and both ODI and VAS. It means that the pain severity and the risk of disabling back pain rise in the overweight category. We also observed a relationship between disability and chronic pain, and a significant negative relationship of physical and mental health with both ODI and VAS. These results show that the reduction in pain and the improvement in disability are correlated. Also, an increase in quality of life is noticeable when excluding the effect of other factors. The negative correlation of quality of life with chronic low back pain is in concordance with other studies (e.g., Di Iorio et al. (2007)). As expected, given the bilateral relationship, decreased disability also had an impact on the physical and mental component scores of quality of life. However, we omitted this direction in our study and only investigated the effect of quality-of-life components on the pain severity and disability of patients.

8 Discussion

A primary requirement of analysing multiple responses in mixed-effects modelling is to construct a multivariate distribution from desired marginal distributions with a given dependence structure. A flexible tool is the copula model, which extends MLME models so that the dependence between multiple correlated responses is not necessarily limited to a linear dependence structure. Furthermore, the proposed methodology is useful when the marginal distributions of responses are non-normal. Besides, it is convenient for modelling heterogeneous data with some unobserved subpopulations. The strategy was to specify a separate LME model for each response, with random intercepts distributed as the DG. Then, we used a copula function to join the random intercepts of the two response variables. Since the copula separates the effect of the univariate margins from their dependence structure, the strategy offers greater flexibility in designing mixed-effects models that are applicable in real empirical applications. It is also helpful when several peaks exist in the joint distribution or in any of the marginal distributions of responses but are not directly detectable. A simple extension is to control the unobserved subject heterogeneity by letting some regression coefficients be heterogeneous across subjects and consequently fit random-slopes models. It is a topic of our future research.

We should mention that our proposed strategy for jointly modelling clustered data differs in methodology from other tools in the literature. In the analysis of binary and continuous responses, Gueorguieva and Agresti (2001) proposed a correlated probit model without using the copula approach. Lambert and Vandenhende (2002) offered an adaptable way of modelling the dependence between the components of non-normal multivariate longitudinal data by using the copula but without considering the multimodal structure of data. Also, to relax the normality assumption in the multivariate longitudinal setting, Nai Ruscone and Osmetti (2017) introduced the implementation of the D-vine copula function. Our proposed strategy uses familiar copulas to illustrate how two types of dependence between variables and time appear in the multivariate-clustered longitudinal data framework.

Our proposed strategy can be used as an attractive alternative to multivariate mixture modelling since it can cover multimodality with fewer parameters and without employing any selection method to determine the number of mixture components. It is not, however, applicable to research studies whose aims concentrate only on classification, clustering, or discrimination of the population under investigation. Although our simulation studies show that the proposed strategy is convenient for the analysis of multimodal correlated data, further research is required to illustrate the strengths and weaknesses of our modelling strategy compared with finite mixture models.

We employed a numerical integration technique using the Gauss–Hermite quadrature to carry out statistical inference through the maximum likelihood approach. Although most commonly available software packages, such as SAS or R, can be used to implement the technique, in our experience, fitting multiple mixed-effects models with several responses is somewhat complicated. The optimization algorithms may terminate due to non-convergence, and the analyst needs carefully selected initial values. Thus, for future work, we suggest exploring other estimation approaches based on Bayesian computation. These can be implemented easily in software packages such as OpenBUGS, Stan, or JAGS in R.
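As an illustration of the quadrature step, the sketch below approximates the marginal density of a single observation in a random-intercept model by Gauss–Hermite quadrature, using NumPy. A normal random intercept is assumed here only because the marginal density then has a closed form against which the approximation can be checked; it is not the DG or copula specification of the paper. The function names are illustrative, and the default of 30 nodes mirrors the `qpoints=30` option in the Appendix.

```python
import numpy as np


def marginal_density_gh(y, mu, sd_b, sd_e, n_points=30):
    """Approximate f(y) = integral of N(y; mu + b, sd_e^2) N(b; 0, sd_b^2) db.

    Uses Gauss-Hermite nodes x_k and weights w_k for the weight exp(-x^2),
    with the change of variable b = sqrt(2) * sd_b * x.
    """
    x, w = np.polynomial.hermite.hermgauss(n_points)
    b = np.sqrt(2.0) * sd_b * x
    cond = np.exp(-0.5 * ((y - mu - b) / sd_e) ** 2) / (sd_e * np.sqrt(2.0 * np.pi))
    return float(np.sum(w * cond) / np.sqrt(np.pi))


def marginal_density_exact(y, mu, sd_b, sd_e):
    """Closed form for the normal case: y ~ N(mu, sd_b^2 + sd_e^2)."""
    var = sd_b ** 2 + sd_e ** 2
    return float(np.exp(-0.5 * (y - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var))


if __name__ == "__main__":
    print(marginal_density_gh(0.5, 0.0, 1.0, 1.0))
    print(marginal_density_exact(0.5, 0.0, 1.0, 1.0))
```

For a non-normal random-effects density, the same nodes and weights can be reused after multiplying the integrand by the ratio of the target density to the normal density, at the cost of a less accurate approximation when the two densities differ strongly.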

Appendix

A sketch of the SAS programme is available below. The maximization of a general log-likelihood function proceeds through the 'general' option of the 'model' statement. The variable 'lastid' is set to 1 for the last record of each patient and to zero otherwise.

proc nlmixed data=pain qpoints=30;

parms...;

bounds s2a1 > 0, s2a2 > 0, s2e1 > 0, s2e2 > 0;

if var=1 then do; mu=b01+b11*female +...+ a1; s2e=s2e1; end;

else if var=2 then do; mu=b02+b12*female +...+ a2; s2e=s2e2; end;

/* conditional normal log-density of one record, up to an additive constant */

loglike = -0.5*log(s2e) - (y-mu)**2/(2*s2e);

/* DG margin for a1: density p1 and distribution function F1 */

z1 = a1/sqrt(s2a1);

p1 = 0.5/(sqrt(s2a1)*gamma(alpha))*abs(z1)**(alpha-1)*exp(-abs(z1));

F1 = 0.5+0.5*sign(z1)*cdf('gamma',abs(z1),alpha,1); if F1 > 0.9999 then F1=0.9999;

/* normal margin for a2; pdf and cdf take the standard deviation, hence sqrt(s2a2) */

p2 = pdf('Normal', a2, 0, sqrt(s2a2));

F2 = cdf('Normal', a2, 0, sqrt(s2a2)); if F2 > 0.9999 then F2=0.9999;

/* log-density of (a1, a2): Clayton copula density times the two marginal densities */

logclaytonden = log(theta+1) + log(p1) + log(p2) - (theta+1)*(log(F1)+log(F2))

- (1/theta+2)*log(F1**(-theta)+F2**(-theta)-1);

lognormalden = -a1**2/2 - a2**2/2;

if lastid=1 then loglike = loglike + logclaytonden - lognormalden;

model y ~ general(loglike);

random a1 a2 ~ normal([0,0],[1,0,1]) subject=patient; run;
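The adjustment applied on the last record of each patient in the sketch above can be read as a likelihood reformulation in the spirit of Liu and Yu (2007): the random effects are declared standard normal so that Gaussian quadrature applies, and the integrand is corrected by the ratio of the target density to the normal one. The notation below is introduced here for illustration only: $g$ denotes the copula-based joint density of $(a_1, a_2)$ with marginal densities $p_1, p_2$ and distribution functions $F_1, F_2$, $\phi$ denotes the standard normal density, and $f(y_i \mid a_1, a_2)$ denotes the conditional density of the responses of patient $i$.

```latex
\begin{align*}
L_i &= \iint f(y_i \mid a_1, a_2)\, g(a_1, a_2)\, da_1\, da_2 \\
    &= \iint f(y_i \mid a_1, a_2)\,
       \frac{g(a_1, a_2)}{\phi(a_1)\,\phi(a_2)}\,
       \phi(a_1)\,\phi(a_2)\, da_1\, da_2 , \\[4pt]
\log g(a_1, a_2) &= \log c\bigl(F_1(a_1), F_2(a_2); \theta\bigr)
       + \log p_1(a_1) + \log p_2(a_2), \\[4pt]
\log c(u, v; \theta) &= \log(1 + \theta) - (\theta + 1)(\log u + \log v)
       - \Bigl(\tfrac{1}{\theta} + 2\Bigr)
         \log\bigl(u^{-\theta} + v^{-\theta} - 1\bigr).
\end{align*}
```

The term $\log g - \log \phi(a_1) - \log \phi(a_2)$ is added once per patient, which is why the programme restricts it to the last record; the normalizing constant of $\phi$ only shifts the log-likelihood by a constant.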

Acknowledgments

The authors are grateful to the Editor and anonymous reviewers for their positive comments. The first author is also grateful to the Graduate Office of the University of Isfahan for its support.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The authors received no financial support for the research, authorship and/or publication of this article.

References

Abramowitz and Stegun (1964) Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables. North Chelmsford, MA: Courier Corporation.

Beurskens, De Vet and Koke (1996) Responsiveness of functional status in LBP: Comparison of different instruments. Pain, 95, 71–6.

Böhning (1999) Computer-assisted Analysis of Mixtures and Applications: Meta-analysis, Disease Mapping and Others, volume 81. London: CRC Press.

Butler and Louis (1992) Random effects models with non-parametric priors. Statistics in Medicine, 11, 1981–2000.

Davis and Rabinowitz (2007) Methods of Numerical Integration. North Chelmsford, MA: Courier Corporation.

Di Iorio, Abate, Guralnik, Bandinelli, Cecchi, Cherubini, Corsonello, Foschini, Guglielmi, Lauretani, Volpato, Abate and Ferrucci (2007) From chronic low back pain to disability, a multifactorial mediated pathway: The InCHIANTI study. Spine, 32, E809.

Durante and Sempi (2015) Principles of Copula Theory. New York, NY: Chapman & Hall/CRC.

Fang, Kotz and Ng (1990) Symmetric Multivariate and Related Distributions. New York, NY: Chapman & Hall.

Fieuws and Verbeke (2006) Pairwise fitting of mixed models for the joint modeling of multivariate longitudinal profiles. Biometrics, 62, 424–31.

Frees and Valdez (1998) Understanding relationships using copulas. North American Actuarial Journal, 2, 1–25.

Golub (1973) Some modified matrix eigenvalue problems. SIAM Review, 15, 318–34.

Gómez, Gómez-Villegas and Marín (1998) A multivariate generalization of the power exponential family of distributions. Communications in Statistics: Theory and Methods, 27, 589–600.

Gueorguieva (2001) A multivariate generalized linear mixed model for joint modelling of clustered outcomes in the exponential family. Statistical Modelling, 1, 177–93.

Gueorguieva and Agresti (2001) A correlated probit model for joint modeling of clustered binary and continuous responses. Journal of the American Statistical Association, 96, 1102–12.

Hennig (2000) Identifiability of models for clusterwise linear regression. Journal of Classification, 17, 273–96.

High and Elrayes (2017) Fitting complex statistical models with PROCs NLMIXED and MCMC. SAS Global Forum Conference, SAS Institute Inc., Cary, NC, USA.

Jensen, Karoly and Braver (1986) The measurement of clinical pain intensity: A comparison of six methods. Pain, 27, 117–26.

Jiang (2017) Asymptotic Analysis of Mixed Effects Models: Theory, Applications and Open Problems. New York, NY: Chapman & Hall/CRC.

Joe (2014) Dependence Modeling with Copulas. New York, NY: Chapman & Hall/CRC.

Johnson, Kotz and Balakrishnan (2004) Continuous Univariate Distributions. Hoboken, NJ: Wiley-Interscience.

Karaman, Tüfek, Kavak, Kaya, Yildirim, Uysal and Çelik (2011) 6-month results of Transdiscal Biacuplasty on patients with discogenic low back pain: Preliminary findings. International Journal of Medical Sciences, 8, 1.

Laird and Ware (1982) Random-effects models for longitudinal data. Biometrics, 38, 963–74.

Lambert and Vandenhende (2002) A copula-based model for multivariate non-normal longitudinal data: Analysis of a dose titration safety study on a new antidepressant. Statistics in Medicine, 21, 3197–217.

Lesaffre and Spiessens (2001) On the effect of the number of quadrature points in a logistic random effects model: An example. Journal of the Royal Statistical Society: Series C (Applied Statistics), 50, 325–35.

Lin (2010) Robust mixture modeling using multivariate skew-t distributions. Statistics and Computing, 20, 343–56.

Liu and Huang (2008) The use of Gaussian quadrature for estimation in frailty proportional hazards models. Statistics in Medicine, 27, 2665–83.

Liu and Yu (2007) A likelihood reformulation method in non-normal random effects models. Statistics in Medicine, 27, 3105–24.

McCulloch and Neuhaus (2011) Misspecifying the shape of a random effects distribution: Why getting it wrong may not matter. Statistical Science, 26, 388–402.

McCulloch and Searle (2001) Generalized, Linear and Mixed Models. New York, NY: Wiley.

Molenberghs, Kenward and Verbeke (2009) Discussion of likelihood inference for models with unobservables: Another view. Statistical Science, 24, 273–79.

Nai Ruscone and Osmetti (2017) Modelling the dependence in multivariate longitudinal data by pair copula decomposition. In SMPS 2016. Advances in Intelligent Systems and Computing: Soft Methods for Data Science, volume 456, edited by Ferraro, Giordani, Vantaggi, Gagolewski, Ángeles Gil, Grzegorzewski and Hryniewicz, pages 373–80. Cham: Springer.

Nelsen (2006) An Introduction to Copulas. Berlin: Springer Science & Business Media.

Nguyen and Chen (2009) A connection between the double gamma model and Laplace sample mean. Statistics and Probability Letters, 79, 1305–10.

Park, Shin, Choi, Youn, Lee, Kwon, Lee, Kang, Ha and Shin (2010) Integrative package for low back pain with leg pain in Korea: A prospective cohort study. Complementary Therapies in Medicine, 18, 78–86.

Pinheiro and Bates (1995) Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics, 4, 12–35.

Pinheiro, Liu and Wu (2001) Efficient algorithms for robust estimation in linear mixed-effects models using the multivariate t distribution. Journal of Computational and Graphical Statistics, 10, 249–76.

Pumi and Lopes (2012) Parameterization of copulas and covariance decay of stochastic processes with applications. arXiv preprint arXiv:1204.3339.

SAS Institute (2018) SAS/STAT® 15.1 User's Guide: The NLMIXED Procedure. Cary, NC: SAS Institute Inc.

Sklar (1959) Fonctions de répartition à n dimensions et leurs marges. Publications de l'Institut de Statistique de l'Université de Paris, 8, 229–31.

Thiébaut, Jacqmin-Gadda, Chêne, Leport and Commenges (2002) Bivariate linear mixed models using SAS Proc MIXED. Computer Methods and Programs in Biomedicine, 69, 249–56.

Toenges and Jahn-Eimermacher (2020) Computational issues in fitting joint frailty models for recurrent events with an associated terminal event. Computer Methods and Programs in Biomedicine, 188. URL https://doi.org/10.1016/j.cmpb.2019.105259 (29 October 2020).

Trivedi and Zimmer (2007) Copula Modeling: An Introduction for Practitioners. Hanover, MA: Now Publishers.

Verbeke and Lesaffre (1996) A linear mixed-effects model with heterogeneity in the random-effects population. Journal of the American Statistical Association, 91, 217–21.

Verbeke and Molenberghs (2000) Linear Mixed Models for Longitudinal Data. New York, NY: Springer-Verlag.

Ware, Kosinski, Bayliss, McHorney, Rogers and Raczek (1995) Comparison of methods for the scoring and statistical analysis of SF-36 health profile and summary measures: Summary of results from the medical outcomes study. Medical Care, 33, 264–79.

Zhang and Davidian (2001) Linear mixed models with flexible distributions of random effects for longitudinal data. Biometrics, 57, 795–802.
