Fitting Nonlinear Mixed-effects Models With Alternative Residual Covariance Structures

Abstract

Nonlinear mixed-effects models are models in which one or more coefficients of the growth model enter in a nonlinear manner, such as appearing in the exponent of the growth function. In their applications, the within-individual residuals are often assumed to be independent with constant variance across time, an assumption that implies that the assumed growth function fully accounts for the dependencies and patterns of variation in the data. Studies have shown that a poorly specified within-individual residual covariance structure of a linear mixed-effects model can impact the estimated covariance matrix of the random effects at the second level, model fit, and statistical inference. The consequences for nonlinear mixed-effects models are not, however, clearly understood. This is due in part to the differences in the estimation needs of the two types of models. Using empirical data examples, this work illustrates that the impact of fitting alternative residual covariance structures in nonlinear mixed-effects models does not entirely parallel the results from studies of strictly linear mixed-effects models, and it calls on researchers to consider alternative structures when fitting nonlinear mixed-effects models.

Keywords

longitudinal data analysis, nonlinear mixed-effects models, residual covariance structure, correlated residuals, hierarchical linear models

Accounting for dependencies and heterogeneity of variance in longitudinal data is central to fitting mixed-effects models. Although a particular growth model may account for much of the within-individual variation and covariation in a longitudinal response, some dependencies may remain or the assumption of homogeneity of variance may not be met. Not accounting for dependencies in the residuals at the first level of a mixed-effects model that is strictly linear in its parameters has been shown to result in poorer model fit and biased parameter estimates (Chi and Reinsel 1989; Ferron, Daily, and Yi 2002; Kwok, West, and Green 2007; Sivo, Fan, and Witta 2005). The impact of not accounting for heterogeneity of variance in the residuals may have similar implications. With the consequences of a poorly specified residual covariance structure well understood for linear mixed-effects models, steps to assess the adequacy of the structure are commonly outlined in articles and books describing the general methodology (Muthén 1997; Raudenbush and Bryk 2002; Singer and Willett 2003) and applied in practice (Roberts and Adams 2018; Wickrama, Lorenz, and Conger 1997).

Nonlinear mixed-effects models, in which one or more of the growth coefficients enters the function in a nonlinear, or nonadditive, way, are inherently more complex than those within the family of linear mixed-effects models (Davidian and Giltinan 1995, 2003). Nonlinear mixed-effects models differ from the linear version in several important ways. Perhaps one of the most important differences is that the former provides a highly flexible means of modeling longitudinal data. That is, longitudinal data that follow complex forms of change, such as data that tend toward an asymptote or that follow multiple phases of change, require a nonlinear modeling framework that allows for any specified form of growth. Although linear mixed-effects models can be used to model some complex forms of change, such as by applying a quadratic or cubic growth function to capture nonlinear change, linear mixed-effects models are limited to polynomial functions that may not provide an intuitive understanding of a longitudinal process. Conversely, nonlinear mixed-effects models allow for a longitudinal process to be modeled by any kind of function. This can be ideal for directly relating aspects of growth or change to hypotheses about a longitudinal process (Cudeck 1996; Fitzmaurice, Laird, and Ware 2004).

Given the complexity of some longitudinal data, it can be challenging to define a growth model that adequately captures a longitudinal process. Although a quadratic growth model may approximate some forms of nonlinear change, it may not do well in characterizing a process that tends to level off with time. Alternatively, an exponential growth function may improve upon the characterization of such a process. More complex models can be anticipated for responses expected to follow multiple phases of change over time, such as the bereavement response to the loss of a spouse (Infurna et al. 2017), and so a model that allows for transitions between different response functions over the course of time may be necessary. In any case, there is likely to be some degree of misfit between the observed longitudinal response and the hypothesized form of change, especially given that a hypothesized model serves at best to approximate a behavior. Although this issue is a problem common to both linear and nonlinear models, the problem may be greater when fitting nonlinear models because their applications naturally involve more complex forms of change (e.g., Lee and Rojewski 2009; Zhao et al. 2016). Thus, concerns about dependencies in the residuals, as well as heterogeneity of variance, at the first level of a nonlinear mixed-effects model may play a greater role when considering these models simply given their greater complexity.

Adding to the complexity in fitting nonlinear mixed-effects models are cases where longitudinal data are incomplete and the source of the missing data is not ignorable (Xu and Blozis 2011). Although mixed-effects models allow for missing response data in longitudinal studies, if the source of the missing data is not ignorable, more complex modeling procedures are needed (Molenberghs and Kenward 2007). Nonignorable sources of missing data can compromise statistical inference from a model, and consequently, erroneous conclusions about the data may be drawn. One of the problems with nonignorable missingness is that it can be an additional source of dependency in the data or heterogeneity of variance, that if not addressed, can mean that an assumption of independent residuals or homogeneity of variance will not be met. In some cases, a covariate that is related to the missingness can be included in the model, and the missingness is then ignorable. If, however, such covariates are either not available or do not adequately address the source, then a model that accommodates greater flexibility in the residual covariance structure can be an important aspect of fitting a model to ensure valid statistical inference.

Under a linear mixed-effects model, unlike a nonlinear mixed-effects model, it is possible to derive analytically the possible consequences of a misspecified within-individual residual structure. For both linear and nonlinear mixed-effects models, the marginal distribution of the response is a multidimensional integral of the joint distribution of the response and the random effects (Davidian and Giltinan 2003). For a linear mixed-effects model, the integration can be handled algebraically, and the result is a closed-form expression for the marginal distribution of the response. Given this, it is possible to show analytically how a misspecified within-individual residual structure affects the parameters that characterize the random effects covariance structure (see Online Appendix A). The same, however, is not true for a nonlinear mixed-effects model that includes a nonlinear random effect because the integration cannot generally be done algebraically, and so the derivation available for the linear model cannot be carried out. Thus, although the issues that arise when fitting linear mixed-effects models may be thought to arise as well when fitting nonlinear mixed-effects models, it is not possible to derive an analytic solution to understand the potential problems, particularly with regard to the random effects. Further, given the great complexity of formulating and estimating nonlinear mixed-effects models, it is not clear whether the consequences that have been documented for linear mixed-effects models parallel those for nonlinear mixed-effects models.
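
The contrast between the two model types can be sketched in symbols. The notation below is an assumption made for illustration and is not taken from the article's appendix: $y_{i}$ is the response vector for individual $i$, $b_{i}$ the random effects with covariance matrix $Φ$, $Θ_{i}$ the within-individual residual covariance matrix, and $X_{i}$ and $Z_{i}$ the fixed- and random-effects design matrices of a linear model.

```latex
% Marginal density of the response for individual i (both model types)
p(y_i) = \int p(y_i \mid b_i)\, p(b_i)\, db_i

% Linear case (assumed notation): y_i = X_i \beta + Z_i b_i + e_i,
% with b_i \sim N(0, \Phi) and e_i \sim N(0, \Theta_i) independent.
% The integral then has a closed form:
y_i \sim N\!\left( X_i \beta,\; Z_i \Phi Z_i^{\top} + \Theta_i \right)
```

In the linear case, $Φ$ and $Θ_{i}$ enter the marginal covariance as a sum, which suggests how a restriction placed on one can be absorbed by the other. When $b_{i}$ enters the mean function nonlinearly, no comparable closed form is generally available.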

In practice, choosing among alternative residual covariance structures for a linear mixed-effects model is straightforward using commercial software programs, such as SAS PROC MIXED, that include several options. Using the TYPE option in the REPEATED statement of PROC MIXED, for instance, users can specify one of the many available covariance structures, such as a first-order autoregressive structure or heterogeneity of variance. Conversely, programs for fitting nonlinear mixed-effects models, such as SAS PROC NLMIXED, do not readily offer such options. Harring and Blozis (2014) showed how PROC NLMIXED could be used to fit a nonlinear mixed-effects model for which the within-individual residual covariance structure allows for correlations between residuals or for heterogeneity of variance. This is accomplished by relying on the GENERAL loglikelihood option in the MODEL statement. With this, researchers may more readily consider how alternative residual structures can impact model fit and the parameter estimates of the model. Nonlinear mixed-effects models provide for a greater range of functional forms for longitudinal data relative to linear mixed-effects models, and so understanding the impact of fitting alternative residual structures is important as nonlinear mixed-effects models increase in popularity across many domains of study in the social and behavioral sciences, especially in light of recent applications of nonlinear mixed-effects models that do not consider alternatives to the standard assumption of independence of the within-individual residuals with constant variance (Marceau, Abar, and Jackson 2015; Zhao et al. 2016).

This article evaluates the use of alternative covariance structures for longitudinal data fit by a nonlinear mixed-effects model with a focus on model selection, model fit, and sensitivity of parameter estimates under different assumptions about the within-individual residual covariance structure. This article extends earlier work in a number of ways. First, we consider data from a learning study presented in Harring and Blozis (2014). In Harring and Blozis, many different functions were applied to the learning data, and model fit comparisons were done for models that all assumed the level 1 residuals were independent with constant variance. Here, we consider these data to see whether different residual covariance structures should also be considered during the process of selecting a growth function to describe the data. A central assumption about a mixed-effects model is that the chosen function, such as a logistic versus an exponential growth function, represents the general form of change for all individuals in a population with subject-specific coefficients that allow for individual differences in specific aspects that characterize change. The residuals at the first level of the model represent the discrepancies between the fitted model and the data. One question is whether considering alternative residual covariance structures while selecting a growth function provides insight into finding a suitable representation of the data. Also in Harring and Blozis’s work, a first-order autoregressive residual structure was shown to provide a superior fit to the learning data relative to the fit of a model that assumed independent residuals with constant variance and other residual structures, but it was not shown how this preferred residual structure impacted parameter estimates and statistical inference. We, therefore, aim to also study the impact of these assumptions on model inference.

We also extend earlier work by considering incomplete longitudinal data for which the missingness (i.e., whether data are missing or not) may not be ignorable. Parameter estimates from a mixed-effects model are unbiased if the source of the missing data is ignorable, such that the missingness is independent of the missing values (Laird 1988). One approach to addressing nonignorable missingness in a mixed-effects model is to incorporate indicator variables that represent patterns of missing data. In a random-effects pattern-mixture model, indicator variables are included in the model to account for between-subject variation in the random effects that is attributable to patterns of missing data. Conditional on these patterns, the missingness is assumed to be ignorable (Little 1995). Based on prior work on linear mixed-effects models in which the use of alternative residual covariance structures was studied, it is understood that changes in the assumptions about the residual covariance structure at the first level of the model can lead to changes in the covariance structure of the random effects at the second level (Chi and Reinsel 1989). Thus, this article explores the impact of fitting alternative residual covariance structures in the context of fitting a nonlinear random-effects pattern-mixture model. If, for instance, data are best fit with a model that uses an alternative residual covariance structure and this results in a change in the covariance structure of the random effects at the second level of the model, then this could possibly lead to changes in the pattern-mixture model component that serves to address nonignorable missingness.

In fitting models to the different data sets, we provide maximum likelihood (ML) estimates (see Harring and Blozis 2014) and Bayesian estimates. Bayesian estimation via Markov chain Monte Carlo (MCMC) methods (Gelfand and Smith 1990; Wakefield 1996) of hierarchical models such as nonlinear mixed-effects models provides an alternative to ML (Davidian and Giltinan 1995), which relies on large-sample asymptotic results based on normal theory for valid statistical inference. For data sets with small samples, it is often difficult to ensure that such approximations of the sampling distributions of parameter estimates are valid (De la Cruz-Mesia and Marshall 2006). In addition, the uncertainty of parameter estimates and future predictions is relatively difficult to evaluate under ML, especially in cases for which general model assumptions such as independent residuals and normality are relaxed (Li, Stewart, and Weiskittel 2012). Thus, the primary benefits of using a Bayesian approach include its ability to generate the full posterior distribution of estimated parameters through samples generated by simulation algorithms during model fitting, from which various statistics (mean, mode, median, SD, etc.) can be easily calculated (Levy and Mislevy 2016). Because we provide results using both methods of estimation, the two sets of results can be compared to study the sensitivity of estimates to the approach to estimation. We note as well that many applications of nonlinear mixed-effects models using Bayesian estimation adopt a simple specification of the residual covariance structure, such as a conditionally independent structure with constant variance, and so structures that allow for serial correlation within a Bayesian framework are rare (Broemeling 2016).

The remainder of this article is organized as follows: For background, mixed-effects models are reviewed first. A set of residual covariance structures is then described that allow for correlations between residuals and heterogeneity of variance. Similar to Harring and Blozis (2014), among others who have studied this problem for linear mixed-effects models (e.g., Kwok et al. 2007), longitudinal measures are assumed to be assessed at discrete points in time. Two empirical data sets are then described and analyzed to study the impact of different residual structures. In each example, a set of nonlinear mixed-effects models that are judged as suitable contenders in describing the longitudinal data is applied along with different residual structures to address possible dependencies and heterogeneity of variance in the data conditional on the assumed individual-specific growth function. Fitting the alternative residual structures provides a means of evaluating the standard assumption that the residuals are independent with constant variance, as well as understanding the impact of this assumption on model fit and the estimated parameters.

Mixed-effects Models

Mixed-effects models are a major tool for the analysis of longitudinal data due to their emphasis on the individual, as well as the flexibility in how the models may be specified. A common function is selected to describe the responses of individuals in a population, but the particular coefficients of the function, such as the rate of change, may be unique to the individual. That is, all individuals in a population are assumed to follow the same functional form, but the parameters of the growth function can vary between individuals. The fact that the model can be formulated to highlight individual differences in the characteristics that describe change has made these models widely popular in many disciplines of study. Although applications of linear mixed-effects models that are based on polynomial functions (Raudenbush and Bryk 2002) are clearly predominant in the behavioral sciences, the need for nonlinear mixed-effects models to characterize measures that change at nonconstant rates is apparent in several areas of study such as learning, development, and personality. For example, longitudinal responses from these types of studies can follow forms of change that are best described by a growth function that includes an asymptote (Choi, Harring, and Hancock 2009) or possibly by two or more growth functions that are linked together to allow for distinct phases in change (Cudeck and Klebe 2002; Kohli and Harring 2013; Zhao et al. 2016).

For instance, common to learning data are responses that tend toward an upper or lower asymptote (e.g., Susman et al. 1998). In developmental studies that are carried out over extended periods of time, a response can tend to follow one course (e.g., increasing linear growth) and then shift to a different course (e.g., decreasing linear growth) (Schlotz et al. 2011). Problems such as these often require a nonlinear growth model to account for complex patterns in the data. Also of value, nonlinear models can be specified to yield a parsimonious model based on highly interpretable parameters (Cudeck 1996). Indeed, the need for nonlinear growth models is apparent in major areas of behavioral study including development (Burke, Shrout, and Bolger 2007; Grimm, Ram, and Hamagami 2011; Laursen, Little, and Card 2013; Roberts 1986) and personality (West et al. 2011). Choi et al. (2009), for example, describe the utility of a logistic function to understand longitudinal data that follow an S-shaped form of change. They emphasize the appealing aspects of using a nonlinear function to capture meaningful features of a measured response such as an asymptote to represent the long-run tendency of a response.

We first review a mixed-effects model for a normal variable in which the growth function can be nonlinear, or nonadditive, in its parameters. Let $y_{t i}$ be a measured response at occasion $t$ for individual $i$, with $t = 1, \ldots, n_{i}$ and $i = 1, \ldots, N$, where $n_{i}$ is the number of observations for individual $i$, and $N$ is the total number of individuals in a random sample. In a nonlinear mixed-effects model, the response is assumed to follow a function that may include one or more random effects that enter the model in a nonlinear way and may be expressed generally as (cf. Davidian and Giltinan 1995):

y_{t i} = f (β, b_{i}, X_{i}) + e_{t i},

where $f (\cdot)$ simply denotes the function selected to describe the longitudinal response, $β$ is a set of fixed effects that do not vary between individuals, $b_{i}$ is a set of random effects that vary between individuals, and $X_{i}$ contains one or more variables that are used in the model to predict $y_{t i}$. Variables in $X_{i}$ can include measures of time, variables that covary with the response over time, or variables that are constant for the individual over the study period. Each random effect often corresponds to a fixed effect although this is not a requirement of the model. For a fixed effect that has a corresponding random effect, the sum of the two is a mixed effect, for example, $β_{1 i} = β_{1} + b_{1 i}$.

At the first level of the model, the residual $e_{t i}$ is often assumed in practice to be independent between assessments for a given individual and to be distributed the same across individuals, typically as normally distributed with mean equal to zero and constant variance across time. At the second level of the model, the random effects are assumed to be independent between individuals and to be normally distributed with means equal to zero and covariance matrix, denoted here by $Φ$. In a model that includes two random effects, $b_{0 i}$ and $b_{1 i}$, for instance, the matrix $Φ$ would be

Φ = [\begin{matrix} φ_{b 0} \\ φ_{b 1 b 0} & φ_{b 1} \end{matrix}],

where $φ_{b 0}$ and $φ_{b 1}$ are the variances of the random effects, and $φ_{b 1 b 0}$ is the covariance between them. The variances of the random effects characterize the extent to which a given effect varies between individuals, and so larger variances correspond to greater individual differences. The residual at level 1 and the random effects at level 2 are often assumed to be independent, and the covariance matrices corresponding to the two levels are assumed to be positive definite (Laird and Ware 1982).
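
The two-level specification can be sketched by simulating one individual's responses. The growth function, parameter values, and function names below are hypothetical choices made for illustration, not the models fitted in this article. The sketch assumes a function that rises from zero toward an asymptote, two mixed effects (asymptote and rate), and independent residuals with constant variance.

```python
import numpy as np

def growth(t, asymptote, rate):
    # Hypothetical f: rises from 0 toward `asymptote`; `rate` enters nonlinearly.
    return asymptote * (1.0 - np.exp(-rate * t))

def simulate_individual(t, beta, Phi, sigma2, rng):
    # Level 2: random effects b_i ~ N(0, Phi), independent across individuals.
    b_i = rng.multivariate_normal(np.zeros(2), Phi)
    # Mixed effects: beta_{0i} = beta_0 + b_{0i}, beta_{1i} = beta_1 + b_{1i}.
    beta_i = beta + b_i
    # Level 1: independent residuals with constant variance sigma2.
    e_i = rng.normal(0.0, np.sqrt(sigma2), size=t.shape)
    return growth(t, beta_i[0], beta_i[1]) + e_i, beta_i

t = np.arange(4.0)                            # four measurement occasions
beta = np.array([10.0, 0.5])                  # assumed fixed effects
Phi = np.array([[1.0, 0.02], [0.02, 0.01]])   # assumed random-effects covariance
rng = np.random.default_rng(1)
y_i, beta_i = simulate_individual(t, beta, Phi, sigma2=0.25, rng=rng)
```

Larger diagonal elements of `Phi` would spread the individual-specific coefficients `beta_i` further from `beta`, which corresponds to greater individual differences.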

Accounting for Within-individual Dependencies

An essential part of formulating a longitudinal model is accounting for the within-individual dependencies in the data, given that longitudinal data are typically autocorrelated within person. A mixed-effects model addresses these dependencies by assuming that at least part of the dependencies is due to person-specific function parameters. As an example, the responses of individuals from a population may all tend to follow the same nonlinear form of growth such as a logistic curve (Choi et al. 2009), but growth at the individual level may vary, with individuals having unique response levels and unique rates of change. A mixed-effects model includes subject-specific coefficients that allow the fitted growth curves to better align with an individual’s response. Given a suitable growth model, variation in the responses may be well accounted for by the growth function that includes the subject-specific coefficients, and as a result, the residuals conditional on the subject-specific growth model are independent. In such cases, it would then be appropriate to interpret a model that assumes that the residuals are independent between measurement occasions and possibly that the variances of the residuals are constant over time. If, however, some correlations in the residuals remain, this may be an indication that the chosen function is not adequate in summarizing the responses or that additional covariates may be needed to address the dependencies. In either case, inference from a model that assumes that the within-individual residuals are independent may be problematic. Thus, the ability to model dependencies in data not fully captured by an assumed growth function may be important, particularly for situations in which the source of the dependencies is not well understood, such as when additional covariates are not available to address the dependencies. 
In these latter cases, the only recourse may be to specify a covariance structure that simply allows for correlations between the residuals at different occasions. In other kinds of problems, it may be that the assumption of homogeneity of variance is not tenable, and for these cases, it might be appropriate to allow for heterogeneity in the variances of the residuals. Although both situations are commonly considered in applications of linear mixed-effects models, they are rarely considered in applications of nonlinear mixed-effects models (Harring and Blozis 2014).

Related to the need for alternative residual covariance structures in fitting nonlinear mixed-effects models are the analytic methods that are required for estimation of these models. That is, there can be practical challenges in the estimation of nonlinear mixed-effects models in general, a problem that quite often is due to a nonlinear random coefficient. Estimation of a nonlinear mixed-effects model that includes a nonlinear random coefficient can present a challenge because the likelihood function for a given model may not be tractable, meaning that it is not possible to express the likelihood function in such a way as to directly solve for the parameters. For a linear mixed-effects model, for instance, a likelihood function may be reexpressed as a log of the likelihood function, and differential calculus applied to obtain an analytic solution. Commonly applied algorithms, such as Newton-Raphson, may then be implemented in a straightforward manner to obtain parameter estimates. For a nonlinear mixed-effects model, much effort has been made to develop methods to solve this often complex estimation problem. Solutions include methods that rely on linear approximations to a nonlinear function, such as a first-order Taylor series approximation (Beal and Sheiner 1982; Pinheiro and Bates 1995). Alternatively, if the nonlinear coefficients are fixed across individuals and only coefficients that enter the function in a linear way are random, then estimation may be carried out using methods commonly used for estimation of linear models (Blozis 2007; Blozis and Cudeck 1999). Thus, a central issue seems to concern problems for models that include nonlinear random effects.
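
As a sketch of the linearization idea, a first-order Taylor expansion of the growth function about $b_{i} = 0$ can be written as follows. The expansion point and the symbol $Z_{i}(β)$ are assumptions made for illustration; other expansion points are possible, and this is not necessarily the approximation used in the analyses reported here. $Φ$ denotes the random-effects covariance matrix and $Θ_{i}$ the within-individual residual covariance matrix.

```latex
% First-order expansion of f about b_i = 0 (one common choice)
f(\beta, b_i, X_i) \approx f(\beta, 0, X_i) + Z_i(\beta)\, b_i,
\qquad
Z_i(\beta) = \left. \frac{\partial f(\beta, b_i, X_i)}{\partial b_i^{\top}} \right|_{b_i = 0}

% Approximate marginal distribution implied by the expansion
y_i \sim N\!\left( f(\beta, 0, X_i),\; Z_i(\beta)\, \Phi\, Z_i(\beta)^{\top} + \Theta_i \right)
\quad \text{(approximately)}
```

Under this approximation the model has the same form as a linear mixed-effects model with design matrix $Z_{i}(β)$, which is why methods developed for linear models can then be applied.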

PROC NLMIXED estimates nonlinear mixed-effects models for normally distributed data under the assumption that the residuals are independent with constant variance. As described, estimation of a nonlinear mixed-effects model is rather complicated, and much of the literature has focused on the available approaches and the consequences of choosing among them (e.g., Harring and Blozis 2016). To date, the program does not include options for the within-individual residual structure, unlike PROC MIXED, which provides many options for the residual covariance structure. Following Harring and Blozis (2014), however, it is possible to use SAS PROC NLMIXED to fit several alternative within-individual covariance structures of a nonlinear mixed-effects model that includes nonlinear random effects. This can be done by implementing the MODEL statement with the GENERAL option in which users provide a loglikelihood function with direct specification of both the mean function and the covariance structure of a given model. As detailed in Harring and Blozis (2014), a multivariate normal density function requires both the inverse and the determinant of the within-individual covariance structure. To allow for serial correlation, for instance, the variance matrix of the level 1 residuals is separated from a matrix that represents the serial correlations among the residuals (see Davidian and Giltinan 2003). In the expression of the density function, the inverse and determinant of the within-individual covariance matrix are then obtained by combining those of the variance matrix and the correlation matrix. With this general strategy, it is possible to fit patterned matrices that include a first-order autoregressive structure, a symmetric banded-1 Toeplitz structure, compound symmetry, and an independent structure with nonconstant variance.
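
The role of the inverse and determinant can be sketched outside of SAS. The Python function below is an illustration only and is not the PROC NLMIXED syntax of Harring and Blozis (2014). It evaluates the multivariate normal log density of one individual's responses conditional on the random effects, assuming a common variance `sigma2` and a correlation matrix `Gamma`, so that the covariance matrix is `sigma2 * Gamma`. The function name and example values are hypothetical.

```python
import numpy as np

def mvn_loglik(y, mu, sigma2, Gamma):
    """Log density of y ~ N(mu, sigma2 * Gamma), with Gamma a correlation matrix."""
    n = y.size
    r = y - mu
    # Determinant: log|sigma2 * Gamma| = n * log(sigma2) + log|Gamma|.
    _, logdet_Gamma = np.linalg.slogdet(Gamma)
    logdet = n * np.log(sigma2) + logdet_Gamma
    # Inverse: (sigma2 * Gamma)^{-1} = Gamma^{-1} / sigma2, applied via a solve.
    quad = r @ np.linalg.solve(Gamma, r) / sigma2
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

# Example: a banded-1 Toeplitz correlation matrix for four occasions
# (assumed lag-1 correlation of 0.3, zero correlation beyond lag 1).
Gamma = np.eye(4) + 0.3 * (np.eye(4, k=1) + np.eye(4, k=-1))
y_i = np.array([1.0, 2.0, 3.0, 4.0])    # hypothetical observed responses
mu_i = np.array([1.2, 1.9, 2.7, 4.1])   # hypothetical fitted values given b_i
loglik_i = mvn_loglik(y_i, mu_i, sigma2=2.0, Gamma=Gamma)
```

Passing an identity matrix for `Gamma` recovers the standard assumption of independent residuals with constant variance. The marginal likelihood would still require integrating this conditional density over the distribution of the random effects.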

Alternative Residual Covariance Structures

In fitting a growth model to data for which the rate of change in the response is not constant over time, there may be several functions considered to be suitable for describing the data. Learning and developmental processes, for instance, may be well summarized by a function that allows for the responses to tend toward an asymptote. Functions of this kind may include any member of the Richards (1959) family of functions such as a logistic, exponential, or Gompertz function. A response may be assumed to follow a function that relies only on time, or the function may additionally include covariates such as time-varying covariates in which a response is assumed to be due to time along with variables that also change with time. Naturally, moderators (i.e., predictors) of the coefficients of a given function may be incorporated into the model, such as allowing person-level attributes to moderate the change rate or response level. In practice, one may fit several suitable functions both with and without covariates and evaluate model fit between them.

A measure of how well a model accounts for the responses is the within-individual residuals, that is, the discrepancies between an individual’s observed scores and those predicted by a model. Relatively large residuals can suggest that a model fits poorly relative to a competing model. Among a set of fitted models, the one that yields the smallest average residual may be judged as best. Even with the best fitting of a given selection of models, it may be that some degree of correlation between the residuals remains, a result that would suggest that the chosen function did not fully account for the within-individual dependencies in the data. The source of the correlation may be known, in which case covariates may be added to the model to help address the dependencies, or the source may be unknown, in which case one may only be able to allow for correlations between the residuals. In a different situation, it may be that the variances of the residuals are not equal across the different time points. Similar to addressing correlations between residuals, addressing differences in the residual variances may be done by introducing covariates into a model or by specifying a model that allows the variances to differ.

Simulation studies have investigated the impact of ignoring correlated residuals in linear growth models (Ferron et al. 2002; Kwok et al. 2007; Sivo et al. 2005) and have shown poorer model fit and biased estimates of the variances and covariances of the random effects for conditions under which the level 1 residuals were assumed to be independent with constant variance. Depending on the particular software that is used to fit a linear model with random effects, there may be several different options available for the residual structure. SAS PROC MIXED, as noted earlier, offers many options for covariance structures. Latent growth models fitted within a structural equation model framework can also handle heterogeneity of variance as well as covariances between residuals (e.g., Willett and Sayer 1994). Thus, not only does the option of fitting alternative residual structures allow for possible improvement in model fit and parameter estimates, it allows for assessments of the tenability of the commonly assumed structure of conditional independence and homogeneity of variance.

For nonlinear mixed-effects models, Harring and Blozis (2014) developed syntax for PROC NLMIXED to fit several alternative residual structures that are more typically used in fitting linear mixed-effects models. These structures include one that allows heterogeneity in the residual variances over time, assuming that the residuals are independent between occasions. Other structures allow for correlations between the residuals and include a first-order autoregressive structure, compound symmetry, and a symmetric banded-1 Toeplitz structure. These residual structures are briefly reviewed here. As in many applications of linear mixed-effects models, these particular residual structures are applied to discrete measures of time such that there is a common set of measurement occasions across individuals, although some individuals may not have complete data for all occasions. The residual covariance structures presented in Harring and Blozis’s work are shown here assuming four measurement occasions. Naturally, these structures can be generalized to handle a different number of measurement occasions.

Heterogeneity of Variance

A residual covariance structure $Θ_{i}$ that allows for heterogeneity of the residual variances, assuming independence between the residuals, is

Θ_{i} = [\begin{matrix} σ_{1}^{2} \\ 0 & σ_{2}^{2} \\ 0 & 0 & σ_{3}^{2} \\ 0 & 0 & 0 & σ_{4}^{2} \end{matrix}],

where the subscript on the variances denotes the four distinct variances at each occasion. Additionally, the subscript $i$ is used here and henceforth to indicate that the residual covariance matrix can differ among individuals with regard to its dimensions (see Jennrich and Schluchter 1986) but typically not otherwise (although see Davidian and Giltinan 1995). This is useful for situations in which the number of measurement occasions is different for different individuals, and thus, the dimensions of $Θ_{i}$ will vary accordingly. For instance, if an individual is measured at the first and third measurement occasions, the individual will have a 2 × 2 covariance matrix with rows and columns that correspond to the first and third occasions:

Θ_{i} = [\begin{matrix} σ_{1}^{2} \\ 0 & σ_{3}^{2} \end{matrix}] .
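As a minimal sketch of this structure, the following Python fragment builds the diagonal matrix for four occasions and then extracts the rows and columns for an individual observed only at the first and third occasions. The function names and the variance values are hypothetical and chosen only for illustration; they are not estimates from this article.

```python
def heterogeneous_cov(variances):
    """Diagonal residual covariance matrix with one variance per occasion."""
    n = len(variances)
    return [[variances[r] if r == c else 0.0 for c in range(n)] for r in range(n)]


def subset_cov(theta, observed):
    """Keep the rows and columns for the occasions (0-based) that were observed."""
    return [[theta[r][c] for c in observed] for r in observed]


# Hypothetical variances for four occasions (illustrative values only).
theta_full = heterogeneous_cov([4.0, 3.0, 2.5, 2.0])

# An individual measured only at the first and third occasions.
theta_i = subset_cov(theta_full, [0, 2])
print(theta_i)  # [[4.0, 0.0], [0.0, 2.5]]
```

The same subsetting step applies to the correlated structures reviewed next, because each individual's matrix is obtained by selecting the rows and columns of the full matrix that correspond to the observed occasions.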

Autoregressive Residual Structure

A first-order autoregressive residual structure allows for the residuals between adjacent time points to covary and assumes that the magnitude of the correlation lessens as scores move further apart in time:

Θ_{i} = σ^{2} [\begin{matrix} 1 \\ ρ & 1 \\ ρ^{2} & ρ & 1 \\ ρ^{3} & ρ^{2} & ρ & 1 \end{matrix}],

where $σ^{2}$ is a common variance, and $ρ$ is an autoregressive coefficient that is bounded between $0$ and $1$. Thus, the variances are constant across time, but the residuals may covary between time points in a systematic manner.
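The pattern can be sketched in a few lines of Python: the covariance between occasions $r$ and $c$ is $σ^{2} ρ^{|r - c|}$. The function name is hypothetical; the values $σ^{2} = 11.8$ and $ρ = 0.36$ are borrowed from the ML estimates reported later in Table 2 purely to make the decay visible, and four occasions are assumed as in the display above.

```python
def ar1_cov(sigma2, rho, n_occasions):
    """First-order autoregressive covariance: sigma2 * rho ** |lag|."""
    return [
        [sigma2 * rho ** abs(r - c) for c in range(n_occasions)]
        for r in range(n_occasions)
    ]


theta = ar1_cov(sigma2=11.8, rho=0.36, n_occasions=4)

# Print each row rounded for readability; covariances shrink with the lag.
for row in theta:
    print([round(value, 3) for value in row])
```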

Compound Symmetry

Compound symmetry assumes that the residuals have constant variance across time, and the covariances between the residuals maintain a constant magnitude no matter how far apart the measures lie with regard to the measurement occasions:

Θ_{i} = [\begin{matrix} σ^{2} + σ_{1} \\ σ_{1} & σ^{2} + σ_{1} \\ σ_{1} & σ_{1} & σ^{2} + σ_{1} \\ σ_{1} & σ_{1} & σ_{1} & σ^{2} + σ_{1} \end{matrix}],

where $σ_{1}$ is the common covariance, and $σ^{2} + σ_{1}$ is the common variance.
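A short Python sketch of this structure follows. The function name and the values of $σ^{2}$ and $σ_{1}$ are hypothetical. The final lines compute the implied correlation, $σ_{1} / (σ^{2} + σ_{1})$, which is the same for every pair of occasions.

```python
def compound_symmetry_cov(sigma2, sigma1, n_occasions):
    """Common variance sigma2 + sigma1 and common covariance sigma1."""
    return [
        [sigma2 + sigma1 if r == c else sigma1 for c in range(n_occasions)]
        for r in range(n_occasions)
    ]


# Hypothetical values (illustrative only).
theta_cs = compound_symmetry_cov(sigma2=6.0, sigma1=2.0, n_occasions=4)

# The implied correlation does not depend on the lag between occasions.
correlation = theta_cs[1][0] / theta_cs[0][0]
print(correlation)  # 0.25
```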

Symmetric Banded-1 Toeplitz Structure

A version of a banded-1 Toeplitz structure is one in which the variances are constant and the covariances between residuals separated by one time lag are equal, and all other covariances are set equal to 0:

Θ_{i} = [\begin{matrix} σ^{2} \\ σ_{1} & σ^{2} \\ 0 & σ_{1} & σ^{2} \\ 0 & 0 & σ_{1} & σ^{2} \end{matrix}],

where $σ^{2}$ is a common error variance, and $σ_{1}$ is the covariance between residuals that are one time lag apart.
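The banded pattern can be sketched as follows, again with a hypothetical function name and hypothetical values. Only residuals exactly one lag apart receive a nonzero covariance.

```python
def banded_toeplitz_cov(sigma2, sigma1, n_occasions):
    """Constant variance, lag-1 covariance sigma1, zero at longer lags."""
    def entry(r, c):
        lag = abs(r - c)
        if lag == 0:
            return sigma2
        if lag == 1:
            return sigma1
        return 0.0

    return [[entry(r, c) for c in range(n_occasions)] for r in range(n_occasions)]


# Hypothetical values (illustrative only).
theta_toep = banded_toeplitz_cov(sigma2=8.0, sigma1=3.0, n_occasions=4)
for row in theta_toep:
    print(row)
```

In contrast to the first-order autoregressive structure, the covariance here drops to exactly zero beyond one lag rather than decaying gradually.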

Examples

The first example comes from a learning study in which performance measures were obtained over a series of consecutive trials within a single study session. Models for learning data may benefit from consideration of alternative residual covariance structures to adequately capture patterns in the data. That is, although a nonlinear mixed-effects model, such as one based on an exponential growth function that includes an asymptote, may do well in accounting for individual differences in the responses across trials by including subject-specific growth coefficients, it is possible that, after fitting the model, the trial-specific residuals remain correlated within person. This can result if an individual’s responses do not change entirely according to the specified growth function and perhaps a different function is warranted. Thus, finding that some correlation between the residuals remains after fitting a model may suggest the need for an alternative growth model, such as one that uses a different function or one that includes time-varying covariates. It may also be that the residual variances are not equal across trials. Differences in residual variances across trials could reflect a number of possible sources. For example, the within-individual variation toward the end of the study trials could increase due to participant fatigue for some participants and not for others. Alternatively, some participants may show greater stability in performance toward the end of the learning trials that would be reflected as a decrease in variation for them but not for others. We explore such possibilities for the learning data. The second example is from a study of clinical status for a sample of individuals diagnosed with schizophrenia. For these data, responses were recorded weekly although some data are missing. Valid inference from a mixed-effects model requires that the source of the missing data is ignorable.
Previous analyses of these data using linear mixed-effects models suggest that the source of the missing data may not be ignorable (Hedeker and Gibbons 1997). Here, a nonlinear random-effects pattern-mixture model is applied to account for nonignorable missingness and is further evaluated under alternative assumptions about the residual covariance structure. Of particular interest for these data is whether the assumptions made about the residual covariance structure have any impact on the conclusions drawn about the missing data.

In considering alternative residual covariance structures, a sensitivity analysis is performed in each of the examples to evaluate the parameter estimates of the growth models under different assumptions about the residual structure. The purpose of doing such an investigation follows from work on linear mixed-effects models. For instance, Chi and Reinsel (1989) describe how changes in the assumptions about the residual covariance structure at the first level of a linear mixed-effects model can lead to changes in the estimated covariance structure of the random effects at the second level of the model. Specifically, they discuss that in a linear mixed-effects model, the random effects may represent serial correlation between measures within person and that a first-order autoregressive structure combined with random-effects may be more appropriate in such cases. Here, we study the impact of applying these alternative residual covariance structures on model specification and statistical inference of nonlinear mixed-effects models.

In addition to studying the sensitivity of parameter estimates under different assumptions about the residual covariance structure, comparisons are made with regard to the estimation method. As discussed previously, estimation via ML relies on large-sample asymptotic results based on normal theory for valid statistical inference, making it difficult to ensure valid approximations of the sampling distributions of parameter estimates when sample sizes are not large (De la Cruz-Mesia and Marshall 2006). The benefits of using a Bayesian approach include the ability to generate the full posterior distribution of estimated parameters through samples generated by simulation algorithms during model fitting. Comparing results obtained via ML and Bayesian estimation thus allows for evaluation of the sensitivity of results to the chosen estimation method. Here, ML estimation was carried out using PROC NLMIXED with SAS version 9.4. The Bayesian analyses were executed using WinBUGS software version 1.4.3 (Spiegelhalter et al. 2002), a widely adopted program for Bayesian analysis in which estimated parameters may be obtained from their posterior distributions using MCMC algorithms. We used the R packages BRugs (Thomas 2004) and R2WinBUGS (Sturtz, Ligges, and Gelman 2005) to link R and WinBUGS for data input and output and graph generation for convergence diagnostics. Syntax for fitting models to the learning data is provided in Online Appendix B.

Performance on a Flight Simulation Task

Data from a computerized skill acquisition task that was presented in Kanfer and Ackerman (1989; also see Harring and Blozis 2014) are studied here. The data represent the number of planes brought in safely for a series of ten 10-minute intervals, a task designed to simulate the role of an air traffic controller. Participants were allowed 10-minute breaks immediately following the fourth and seventh trials. The data are the performance measures for 140 participants. Data from the first interval were excluded from analysis assuming that the measures could reflect the participants’ adjustments to the task. Sample means and the covariance matrix of performance scores for the nine intervals ($t = 1, \ldots, 9$), along with a display of scores for a subset of participants, are given in Harring and Blozis (2014). These data have been used in other methodological contributions, and similar to these studies, the responses are treated as continuous measures. Data are complete across all trials for all participants.

Similar to Harring and Blozis (2014), we considered functions that allow for an upper asymptote. Specifically, we fit models using an exponential and a logistic function (cf. Browne 1993):

Exponential: y_{t i} = β_{1 i} - (β_{1 i} - β_{0 i}) exp {- β_{2 i} (t - 1)} + e_{t i},

Logistic: y_{t i} = \frac{β_{0 i} β_{1 i}}{β_{0 i} + (β_{1 i} - β_{0 i}) exp {- β_{2 i} (t - 1)}} + e_{t i} .

The function coefficients include random effects so that the functions are specific to each participant (with the random effects assumed to covary with each other). As described in Browne (1993), these particular forms of an exponential and logistic function have parameters that have essentially the same interpretation. Thus, for both functions, $β_{0 i}$ is an individual’s performance level at the start of the trial blocks ($t = 1$, where the term $t - 1$ in the exponent equals zero), $β_{1 i}$ is an individual’s potential performance level, and $β_{2 i}$ combined with values of $t$ allows for the nonconstant rate of change. The exponential function assumes a gradual slowing in the response toward the end of the period, whereas the logistic function assumes a gradual amount of change initially and a gradual slowing in the response toward the end of the period.
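The shared interpretation of the coefficients can be checked numerically. The sketch below codes the two functions as displayed above for a single individual, without the residual term. The function names are hypothetical, and the coefficient values (18.5, 39.5, and 0.71) are borrowed from the ML estimates of the logistic model in Table 2 and applied to both functions only for illustration. Under this parameterization, both curves equal $β_{0 i}$ when $t = 1$ and approach $β_{1 i}$ as $t$ grows.

```python
import math


def exponential_growth(t, b0, b1, b2):
    """Exponential function: b1 - (b1 - b0) * exp(-b2 * (t - 1))."""
    return b1 - (b1 - b0) * math.exp(-b2 * (t - 1))


def logistic_growth(t, b0, b1, b2):
    """Logistic function: b0 * b1 / (b0 + (b1 - b0) * exp(-b2 * (t - 1)))."""
    return b0 * b1 / (b0 + (b1 - b0) * math.exp(-b2 * (t - 1)))


# Illustrative coefficient values for one individual.
b0, b1, b2 = 18.5, 39.5, 0.71

# Compare the two trajectories over the nine analyzed trials.
for t in range(1, 10):
    print(t, round(exponential_growth(t, b0, b1, b2), 2),
          round(logistic_growth(t, b0, b1, b2), 2))
```

With the same coefficient values, the logistic curve rises more slowly than the exponential curve over the first few trials, which reflects the gradual initial change described above.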

For a set of models that assume the same growth function (i.e., either logistic growth or exponential growth), the different residual structures reviewed earlier were applied to understand whether assumptions made about the residuals play a role in the selection of a growth model for these data. As a set, the alternative residual structures provide a means of assessing the assumptions that the residuals are independent and that their variances are constant across trials. The models were estimated using the default method for PROC NLMIXED, Gauss–Hermite quadrature, that is implemented by using the GENERAL loglikelihood option in the MODEL statement. Syntax for fitting a nonlinear mixed-effects model with these alternative residual structures is provided by Harring and Blozis (2014). Additionally, the noad (for nonadaptive Gaussian quadrature) option was selected. Estimates were based on 30 quadrature points. Lesaffre and Spiessens (2001) and Pinheiro and Bates (1995) discuss the role of specifying the number of quadrature points when applying adaptive and nonadaptive Gaussian quadrature. Starting values were generated by first fitting a completely fixed-effects model to obtain fixed-effects estimates and a constant residual variance at level 1. Using those results as an updated set of starting values, we added in a stepwise fashion random effects to the second level of the model, beginning with a random intercept, then a random asymptote, and finally a random rate parameter. Starting values were updated with each step. In a final step, the final model that was deemed to provide the best fit to the data was fit again using Bayesian estimation methods. Bayesian estimates for the best fitting growth model, as well as a model that assumed that the residuals were conditionally independent with constant variance, are provided. Details regarding specific assumptions made for the Bayesian estimation are provided in Online Appendix C.

Results

Indices of model fit based on ML estimation are in Table 1. In addition to −2 times the loglikelihood, we report the Akaike information criterion (AIC; Akaike 1974) and the Bayesian information criterion (BIC; Schwarz 1978). Smaller values of the AIC and BIC indicate preferred models in terms of relative model fit. Across models, the logistic function was preferred over the exponential function regardless of the assumptions made about the residual structure. This suggests that for these data, the choice of a growth function was not sensitive to changes in the assumptions about the within-individual residuals. Thus, the pursuit of selecting a suitable nonlinear growth function for these data does not seem to require that different residual covariance structures be considered throughout the process.

Table 1.

Indices of Model Fit for Performance Scores on a Flight Simulation Task using ML Estimation.

Exponential function: $β_{1} - (β_{1} - β_{0}) exp {- β_{2} (t - 1)}$
Level-1 covariance structure	q	−2lnL	AIC	BIC
Independent, constant variance	10	7,591.8	7,611.8	7,641.2
First-order autoregressive	11	7,384.8	7,406.8	7,439.1
Compound symmetry	11	7,580.5	7,602.5	7,634.9
Symmetric banded-1 Toeplitz	11	7,457.5	7,479.5	7,511.9
Independent, heterogeneous variances	18	7,545.1	7,581.1	7,634.1
Logistic function: $\frac{β_{0} β_{1}}{β_{0} + (β_{1} - β_{0}) exp {- β_{2} (t - 1)}}$
Level-1 covariance structure	q	−2lnL	AIC	BIC
Independent, constant variance	10	7,410.7	7,430.7	7,460.1
First-order autoregressive	11	7,300.0	7,322.0	7,354.4
Compound symmetry	11	7,389.1	7,411.1	7,443.4
Symmetric banded-1 Toeplitz	11	7,321.5	7,343.5	7,375.8
Independent, heterogeneous variances	18	7,379.1	7,415.1	7,468.1

Note: n = 140. q is the number of model parameters. Models were fit using ML via nonadaptive Gaussian quadrature and 30 quadrature points. ML = maximum likelihood.

For either growth function, model fit was most improved if a first-order autoregressive residual structure was assumed, suggesting that dependencies in the data remain after fitting the growth function. The fact that the other residual structures did not improve model fit to the same extent or greater sheds light on the learning response and is worth exploring. Specifically, using the AIC value to evaluate the relative fit of compound symmetry to the independent structure, there was a slight improvement in model fit, whereas the BIC suggests that the added complexity of compound symmetry provided no improvement. Although compound symmetry allows for scores to covary over time, the added restriction that the dependencies are constant across all covariances, regardless of the distance between trials, is not consistent with the data. The symmetric banded-1 Toeplitz model improved model fit, but comparisons of model fit suggest that the assumption that scores two or more trials apart are independent is not consistent with the data. The structure that assumes independence between residuals and different residual variances across trials improved model fit as well but not to the same extent as the first-order autoregressive structure. The fact that the first-order autoregressive structure provided the best relative fit overall suggests that the need to address decreasing dependencies in the residuals across trials was most important and addressing possible differences in the residual variances was not. Consistent with studies involving linear mixed-effects models, model fit for nonlinear mixed-effects models may be improved by relaxing the usual assumption that the residuals are independent with constant variance (Chi and Reinsel 1989).

Table 2 reports the estimated fixed effects and variances and covariances of the random coefficients based on the logistic growth model assuming that the residuals are independent with constant variance and then assuming a first-order autoregressive (AR[1]) structure. Sensitivity of estimates to assumptions about the within-individual errors is studied by making comparisons between the estimates, shedding light on the impact that this alternative structure has on model inference. Bayesian estimates for these two models are provided as well. As shown in Table 2, there are only slight differences in the estimated fixed effects between models and estimation methods, so conclusions regarding the typical performance trajectory are fairly consistent across models and estimation methods. Thus, the assumptions about the residual covariance structure seem to have little impact on the report of the typical performance trajectory.

Table 2.

ML and Bayesian Estimates of a Logistic Growth Function Under Independence Versus First-order Autoregressive (AR(1)) for Performance Scores on a Flight Simulation Task.

	Level 1 Residual Covariance Structure by Estimation Method
	ML		Bayesian
Parameter	$Θ = I σ^{2}$	$Θ = AR (1)$	$Θ = I σ^{2}$	$Θ = AR (1)$
$β_{0}$	18.5 (0.44)	18.9 (0.73)	20.24 (0.81)	18.0 (0.84)
$β_{1}$	39.5 (0.60)	39.4 (0.72)	40.63 (0.85)	39.5 (0.87)
$β_{2}$	0.71 (0.037)	0.72 (0.05)	0.64 (0.050)	0.66 (0.06)
$φ_{b 0}$	108.5 (5.15)	115.1 (8.86)	88.77 (11.12)	83.52 (11.02)
$φ_{b 1 b 0}$	41.7 (4.65)	49.5 (9.33)	35.64 (8.22)	35.11 (7.78)
$φ_{b 1}$	73.0 (7.26)	59.4 (9.49)	63.81 (9.97)	49.22 (9.11)
$φ_{b 2 b 0}$	−2.01 (0.18)	−2.77 (0.35)	−1.81 (0.45)	−1.45 (0.41)
$φ_{b 2 b 1}$	−1.82 (0.18)	−1.37 (0.30)	−1.24 (0.48)	−0.64 (0.45)
$φ_{b 2}$	0.16 (0.020)	0.15 (0.025)	0.11 (0.020)	0.12 (0.03)
$σ^{2}$	8.78 (0.42)	11.8 (1.06)	8.88 (0.49)	13.25 (1.44)
$ρ$		0.36 (0.054)		0.38 (0.10)
AIC	7,383.3	7,324.6
BIC	7,412.8	7,356.9
DIC			6,328	6,297

Note: n = 140. ML estimates were obtained using nonadaptive Gaussian quadrature with 30 quadrature points. Bayesian estimation was carried out using diffuse priors on all parameters and 5,000 replications with a burn-in of 4,000. Standard deviations of the respective sampling distributions are in parentheses. Smaller values of the AIC and BIC (ML estimation), as well as the DIC (Bayesian estimation), indicate relatively better fitting models. AIC = Akaike information criterion; BIC = Bayesian information criterion; DIC = deviance information criterion; ML = maximum likelihood.

Perhaps the more important differences between models are in the estimated variances of the random effects. Using ML, the estimated variance of the random intercept $φ_{b 0}$ increased from 108.5 to 115.1 after allowing for correlations between the within-person residuals; conversely, the estimated variance of the random asymptote $φ_{b 1}$ decreased from 73.0 to 59.4 after allowing these correlations. The estimated variance of the random rate parameter $φ_{b 2}$ using ML showed only a slight change (from $0.16$ assuming independence to $0.15$ allowing for correlations). This pattern of results is inconsistent with simulation studies based on linear mixed-effects models in which the variances of the random effects generally decrease after allowing for correlations between the residuals at the first level of the model (Chi and Reinsel 1989). In contrast, Bayesian estimates of the variances of the random intercept and asymptote decreased after allowing for correlations between the residuals. Estimates of the variance of the random rate parameter were fairly comparable between models (from $0.11$ assuming independence to $0.12$ allowing for correlations). Results from Bayesian methods are consistent with investigations based on linear models. Assuming that these differences in the estimated variances of the intercept and asymptote between models are meaningful, the results here suggest a sensitivity of the estimated variances of the random effects, both to the assumptions about the residual covariance structure and the method of estimation. The implication here is that a study of the individual differences in performance, as done by examination of the covariance structure of the random effects at the second level of the model, is impacted by the assumptions made about the residual covariances at the first level of the model but also by the estimation method.
In sum, assumptions about the residual covariance structure can influence both model fit and the estimated parameters of the model, a finding that is generally consistent with the literature on linear mixed-effects models (Chi and Reinsel 1989), but the pattern of results is not. Although beyond the scope of this article, the finding of differences in estimates based on the two estimation methods suggests a need for future work to better understand these patterns of results.

Severity of Illness Ratings

Data from the National Institute of Mental Health Schizophrenia Collaborative Study are studied in an application of a nonlinear mixed-effects model for which different mean and covariance structures are considered. The data are based on Item 79 of the Inpatient Multidimensional Psychiatric Scale (Lorr and Klatt 1966), a seven-point ordinal scale measuring severity of illness: 1 = normal, not at all ill; 2 = borderline mentally ill; 3 = mildly ill; 4 = moderately ill; 5 = markedly ill; 6 = severely ill; and 7 = among the most extremely ill. Data are available for 437 patients for whom weekly severity ratings were planned for up to six weeks in addition to a baseline measure, with all patients having some missing data. Table 3 displays the sample descriptives of the severity ratings for week $t = 0, \ldots, 6$ (with 0 denoting baseline) with the respective sample sizes at each occasion. These data have been considered elsewhere to illustrate applications of linear random coefficient models including models that assume a continuous response (Hedeker and Gibbons 1997). Figure 1 is a display of the raw scores for an arbitrarily selected subset of individuals who received the placebo (upper plot) or a drug (lower plot).

Table 3.

Summary Statistics of Weekly Severity of Illness Ratings by Placebo Versus Drug.

Week
Placebo	Baseline	1	2	3	4	5	6
Mean	5.4	5.0	5.8	4.7	5.5	4.3	4.3
Standard deviation	0.8	1.2	0.6	1.2	0.7	2.5	1.5
Minimum	3	1	5	1.5	5	2.5	1.5
Maximum	7	7	6.5	6.5	6	6	6.5
Sample size	107	105	5	87	2	2	70
Drug	Baseline	1	2	3	4	5	6
Mean	5.4	4.4	3.3	3.8	2.5	2.9	3.1
Standard deviation	0.9	1.2	1.6	1.5	1.3	1.6	1.4
Minimum	2	1	1	1	1	1.5	1
Maximum	7	7	6	7	5	6	7
Sample size	327	321	9	287	9	7	265

Note: N = 437.

Figure 1.

A plot of severity of illness ratings for a subsample of 20 individuals (drug group upper plot; placebo group lower plot).

In a longitudinal study, whether data are missing or not may be related to the missing data. In such cases, the missingness is said to be nonignorable. Consequently, inference from a longitudinal model without accounting for the missingness may be misleading (Molenberghs and Kenward 2007). Thus, an important goal for the data analysis here was to address a potential source of nonignorable missingness due to attrition, as has been considered for these data in previous studies that relied on linear mixed-effects models that assumed independence and constant variance for the residuals (Hedeker and Gibbons 1997). This is done here in the context of fitting a nonlinear mixed-effects model, in addition to fitting different residual covariance structures. Following Hedeker and Gibbons (1997), the measures are first studied as a function of time, measured in weeks, in addition to a treatment indicator variable that denoted whether a patient received one of four psychiatric medications ($drug_{i} = 0$ if the patient was given a placebo, $drug_{i} = 1$ if the patient was given a psychiatric drug). To account for possible nonignorable missingness, a nonlinear random-effects pattern-mixture model was applied to the data. An indicator of dropout, $dropout_{i}$, was created where $dropout_{i} = 0$ if a patient was measured at week 6 and $dropout_{i} = 1$ if not. The dropout indicator served to account for variation in responses due to participant attrition. Approximately 23 percent of the participants were identified here as having dropped from the study by the last measurement occasion.
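The coding of the dropout indicator can be sketched as follows. The patient identifiers and the sets of observed weeks are hypothetical and do not come from the study data; only the rule (0 if measured at week 6, 1 otherwise) follows the definition given above.

```python
def dropout_indicator(observed_weeks, last_week=6):
    """Return 0 if a rating was recorded at the last week, otherwise 1."""
    return 0 if last_week in observed_weeks else 1


# Hypothetical patients with the weeks at which a rating was recorded.
patients = {
    "A": {0, 1, 3, 6},  # measured at week 6
    "B": {0, 1, 3},     # no rating at week 6
    "C": {0, 6},        # intermittent missingness, but measured at week 6
}

dropout = {pid: dropout_indicator(weeks) for pid, weeks in patients.items()}
print(dropout)  # {'A': 0, 'B': 1, 'C': 0}
```

Note that under this rule a patient with intermittent missing ratings who returns for week 6 is not classified as a dropout.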

As a first step in the analysis, different growth functions that included $drug_{i}$ as a covariate were fitted to the data, including linear growth, quadratic growth, linear growth using a square root transformation of the week of measurement (see Hedeker and Gibbons 1997), exponential growth (see the exponential function in equation [1]), and logistic growth (see the logistic function in equation [2]). The covariate $drug_{i}$ was included as a moderator of the coefficients of a given growth function. Under these models, valid inference is based on the assumption that the missingness is ignorable conditional on the growth model that includes $drug_{i}$ as a covariate. That is, the missingness could be related to any of the observed response data, as well as the covariate $drug_{i}$, but is assumed not to be related to the missing data. As was done in the first example, the different growth functions that included $drug_{i}$ as a covariate were fitted to the data while also considering alternative residual covariance structures. Consistent with the findings from the first example, indices of model fit (available upon request) suggested that the choice of a growth function did not depend on the assumptions made about the residual covariance structure at the first level of the model. From model fit comparisons, a logistic growth model including $drug_{i}$ as a covariate provided the best relative fit to the data and was provisionally accepted as a suitable model to describe severity of illness ratings.

Using the logistic growth model to describe illness ratings, $dropout_{i}$ was added as a between-subjects covariate, along with $drug_{i}$, to create a second model that was used specifically to evaluate the impact of participant dropout, conditional on $drug_{i}$, on each of the coefficients of the logistic function. This kind of model is generally known as a random-effects pattern-mixture model in which an indicator variable that denotes a pattern of missing data is included as a subject-level covariate and serves to address data that are missing not at random (Little 1995; Molenberghs and Kenward 2007). In a random-effects pattern-mixture model, the missingness is assumed to be ignorable within each pattern of missing data. To fit this second model to the illness ratings, the logistic growth model was extended to include both $drug_{i}$ and $dropout_{i}$, as well as their interaction, as predictors of the three function coefficients:

Logistic: y_{t i} = \frac{β_{0 i} β_{1 i}}{β_{0 i} + (β_{1 i} - β_{0 i}) exp {- β_{2 i} (t - 1)}} + e_{t i},

([2] repeated)

where

β_{0 i} = β_{00} + β_{01} drug_{i} + β_{02} dropout_{i} + β_{03} (drug_{i} \times dropout_{i}) + r_{0 i},

β_{1 i} = β_{10} + β_{11} drug_{i} + β_{12} dropout_{i} + β_{13} (drug_{i} \times dropout_{i}) + r_{1 i},

β_{2 i} = β_{20} + β_{21} drug_{i} + β_{22} dropout_{i} + β_{23} (drug_{i} \times dropout_{i}) + r_{2 i},

and with $t$ denoting the week of measurement. The coefficient $β_{00}$ denotes the mean initial response for those assigned to the placebo condition and who did not drop from the study by the last occasion. The coefficients $β_{01}$ and $β_{02}$ are the effects of having received a drug treatment or having dropped from the study, respectively, net of the combined effects of these two covariates, denoted by $β_{03}$. Similar interpretations can be made for the asymptote $β_{1 i}$ and rate parameter $β_{2 i}$. In each of the expressions for the growth coefficients, the residuals of the level 2 equations, $r_{0 i}$, $r_{1 i}$, and $r_{2 i}$, are conditional residuals after accounting for variation in the growth coefficients by the person-level covariates, $drug_{i}$ and $dropout_{i}$, and their interaction. Conditional on the covariates and time, the missingness was assumed to be ignorable. Thus, a key difference between the two models was whether $dropout_{i}$ was important in accounting for individual differences in the growth coefficients, a result that would suggest that the missingness is not ignorable, even after accounting for the effects of $drug_{i}$.
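To illustrate how the level 2 equations translate into group-specific coefficients, the sketch below evaluates the expected value of one growth coefficient (the random residual set to its mean of zero) for each of the four combinations of the two indicators. The function name and the fixed-effect values are hypothetical and are not estimates from this study.

```python
def coefficient_mean(fixed, drug, dropout):
    """Expected growth coefficient given the two person-level indicators.

    fixed holds (intercept, drug effect, dropout effect, interaction effect).
    """
    intercept, b_drug, b_dropout, b_interaction = fixed
    return intercept + b_drug * drug + b_dropout * dropout + b_interaction * drug * dropout


# Hypothetical fixed effects for the asymptote (illustrative values only).
asymptote_fixed = (4.5, -1.5, 0.5, -0.25)

groups = {
    "placebo + no dropout": (0, 0),
    "placebo + dropout": (0, 1),
    "drug + no dropout": (1, 0),
    "drug + dropout": (1, 1),
}

asymptotes = {
    label: coefficient_mean(asymptote_fixed, drug, dropout)
    for label, (drug, dropout) in groups.items()
}
for label, value in asymptotes.items():
    print(label, value)
```

Only the group with both indicators equal to 1 involves the interaction coefficient, which is why the intercept alone describes the placebo group that completed the study.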

Next, models assuming ignorable missingness conditional on $drug_{i}$ and assuming ignorable missingness conditional on $drug_{i}$, $dropout_{i}$, and their interaction were fitted to the data assuming different residual covariance structures to evaluate the models according to the different assumptions about the within-person residuals. We considered a first-order autoregressive structure and a symmetric banded-1 Toeplitz structure to evaluate whether the residuals had dependencies between time points. Additionally, a structure that assumed independent residuals with heterogeneous variances was applied. Note that given the sparse data collected at three of the seven planned measurement occasions (i.e., weeks 2, 4, and 5 with sample sizes of 14, 11, and 9, respectively), one should be cautious about the estimates of the individual variances based on the limited data for those occasions (see Table 3 for sample sizes at each measurement occasion).

It is important to note that it is not possible to confirm with absolute certainty the role that missingness plays in statistical inference (Molenberghs and Kenward 2007). This is due to the fact that analyses are conducted using only observed data. The missing data are not available to conduct any test concerning a relationship between the missing data and the missingness, and the missingness itself may not be fully understood simply by examination of the observed data. Thus, a common goal for an analysis that includes some information about the missing data process is to evaluate the conclusions drawn from an analysis that is based on assumptions that are made about the missing data. Generally, it may be best practice to consider multiple mechanisms that may explain the missing data, keeping in mind that the actual missing data process may not be correctly specified (simply because it is not likely to be known). Thus, although we provide values of the AIC and BIC for the different models fitted using ML, one must be cautious in drawing conclusions about the missing data mechanism. As an added note, the BIC, unlike the AIC, uses sample size in its calculation. SAS PROC NLMIXED calculates the BIC as: BIC = $2 f (\hat{θ}) + p log (N),$ where $f (\cdot)$ is the negative of the marginal loglikelihood function, $\hat{θ}$ is the estimated parameter set, $p$ is the number of model parameters, and $N$ is the number of subjects. Starting values for this empirical example were generated in a manner similar to those generated for the two learning examples. ML estimation methods were used for estimation in this example.
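The two indices can be reproduced from the reported value of −2 times the loglikelihood, which equals $2 f (\hat{θ})$ in the notation above. The sketch below uses the standard definition of the AIC (−2lnL plus twice the number of parameters) and the BIC expression given above, applied to the first row of Table 4 (−2lnL = 4,617.3, 10 parameters, 437 subjects). The function names are hypothetical.

```python
import math


def aic(neg2_loglik, n_params):
    """AIC = -2lnL + 2p."""
    return neg2_loglik + 2 * n_params


def bic(neg2_loglik, n_params, n_subjects):
    """BIC = -2lnL + p * log(N), with N the number of subjects."""
    return neg2_loglik + n_params * math.log(n_subjects)


# First row of Table 4: independent residuals with constant variance.
print(round(aic(4617.3, 10), 1))       # 4637.3
print(round(bic(4617.3, 10, 437), 1))  # 4678.1
```

Because the penalty per parameter is $log(437) \approx 6.08$ for the BIC and 2 for the AIC, the BIC penalizes the additional parameters of the pattern-mixture model more heavily, so the two indices can favor different models.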

Results

For each model that differed according to the residual covariance structure, the three coefficients of the logistic function that included $drug_{i}$ as a covariate were initially assumed to have random effects. For all models, the variance of the nonlinear rate parameter was close to zero, and interval estimates included zero as interior points. It was provisionally assumed that the nonlinear rate parameter did not vary across subjects. Table 4 provides AIC and BIC values obtained for the logistic growth models based on different assumptions about the missing data and the residual covariance structure. Within models that assumed either ignorable missingness or nonignorable missingness, the residual structure assuming independent residuals with nonconstant variance was preferred over models that assumed other residual covariance structures. With regard to the tenability of the assumption of ignorable missingness, conclusions drawn from the analyses depend on the index of model fit that is used for the comparisons. That is, the more complex model was preferred according to the AIC, such that model fit was improved by including $dropout_{i}$ and its interaction with $drug_{i}$, suggesting that the missingness was not ignorable if $drug_{i}$ alone was included as a covariate. Conversely, the simpler model was preferred according to the BIC, such that no appreciable improvement in model fit resulted when relaxing the assumption of ignorable missingness and including $dropout_{i}$ and its interaction with $drug_{i}$ in the model. To better understand the different conclusions, a comparison is made of the estimated model parameters resulting from the different models.

Table 4.

Model Fit for a Logistic Growth Function Applied to Psychiatric Illness Severity Ratings Under Different Missing Data Assumptions Using ML Estimation.

Ignorable Missingness Conditional on Drug
Level-1 covariance structure	q	−2lnL	AIC	BIC
Independent, constant variance	10	4,617.3	4,637.3	4,678.1
First-order autoregressive	11	4,683.6	4,705.6	4,750.4
Symmetric banded-1 Toeplitz	11	4,614.0	4,636.0	4,680.9
Independent, nonconstant variance	16	4,551.0	4,583.0	4,648.3
Ignorable Missingness Conditional on Drug, Dropout and Their Interaction
Level-1 covariance structure	q	−2lnL	AIC	BIC
Independent, constant variance	16	4,588.6	4,620.6	4,685.9
First-order autoregressive	17	4,574.9	4,608.9	4,678.2
Symmetric banded-1 Toeplitz	17	4,584.0	4,618.0	4,687.4
Independent, nonconstant variance	22	4,526.2	4,570.2	4,660.0

Notes: n = 437. q is the number of model parameters. Models were fit using ML via nonadaptive Gaussian quadrature and 30 quadrature points. Smaller values of the AIC and BIC indicate relatively better model fit.
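The fit indices in Table 4 follow directly from the deviance, the number of parameters, and the number of subjects. The Python sketch below recomputes them, assuming AIC = −2lnL + 2q and the PROC NLMIXED form of the BIC described in the text, BIC = −2lnL + q log(n), with n = 437. The function names, panel labels, and data layout are illustrative and are not part of the original analysis. Because the tabled deviances are rounded, recomputed values may differ from the table in the last digit.

```python
import math

N_SUBJECTS = 437  # n reported in the notes to Table 4


def aic(deviance, q):
    """AIC = -2lnL + 2q."""
    return deviance + 2 * q


def bic(deviance, q, n=N_SUBJECTS):
    """BIC = -2lnL + q * log(n), with n the number of subjects."""
    return deviance + q * math.log(n)


# (panel, level-1 structure, q, -2lnL) copied from Table 4.
# The panel labels are shorthand chosen for this sketch.
table4 = [
    ("drug only", "Independent, constant variance", 10, 4617.3),
    ("drug only", "First-order autoregressive", 11, 4683.6),
    ("drug only", "Symmetric banded-1 Toeplitz", 11, 4614.0),
    ("drug only", "Independent, nonconstant variance", 16, 4551.0),
    ("drug, dropout, interaction", "Independent, constant variance", 16, 4588.6),
    ("drug, dropout, interaction", "First-order autoregressive", 17, 4574.9),
    ("drug, dropout, interaction", "Symmetric banded-1 Toeplitz", 17, 4584.0),
    ("drug, dropout, interaction", "Independent, nonconstant variance", 22, 4526.2),
]

# Print the recomputed indices for every fitted model.
for panel, structure, q, dev in table4:
    print(f"{panel:28s} {structure:35s} "
          f"AIC={aic(dev, q):8.1f} BIC={bic(dev, q):8.1f}")

# Smaller is better for both indices; find the preferred model under each.
best_aic = min(table4, key=lambda row: aic(row[3], row[2]))
best_bic = min(table4, key=lambda row: bic(row[3], row[2]))
print("Preferred by AIC:", best_aic[0], "/", best_aic[1])
print("Preferred by BIC:", best_bic[0], "/", best_bic[1])
```

Under these formulas both indices select the nonconstant-variance structure, but the AIC selects the model that includes dropout and its interaction with drug, whereas the BIC selects the model with drug alone, which is the disagreement described in the text.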

Table 5 provides ML estimates of the logistic growth models under ignorable and nonignorable missingness, assuming either that the residuals were independent with constant variance or that they were independent with nonconstant variance.¹ The fixed effects estimates that relate to the initial response and the rate parameter are fairly comparable between models that make the same assumption about the missingness, suggesting that the assumption about the residual variances had no appreciable impact on the initial response and rate of change for the typical individual under each assumption about missingness. Depending on whether the residuals were assumed to be independent with constant variance or with nonconstant variance, we note differences in the estimates relating to the effects of covariates on the asymptote under the assumption of nonignorable missingness but not ignorable missingness. As a result, the estimated asymptotes for the four groups (placebo + no dropout, placebo + dropout, drug + no dropout, and drug + dropout) differ depending on whether the variances of the residuals at the first level of the model are assumed to be constant or not across the study period. This may be important from a clinical perspective, given that different conclusions may be drawn about the patients’ potential health status levels.

Table 5.

ML Estimates of a Logistic Growth Function Applied to Psychiatric Illness Severity Ratings.

	Missingness
	Ignorable		Nonignorable
Parameter	$Θ = Iσ^{2}$	$Θ = Iσ_{t}^{2}$	$Θ = Iσ^{2}$	$Θ = Iσ_{t}^{2}$
$β_{00}$, initial response	5.36(0.08)	5.36(0.08)	5.20(0.10)	5.22(0.10)
$β_{01}$, $drug_{i}$	0.04(0.10)	0.04(0.10)	0.20(0.12)	0.19(0.11)
$β_{02}$, $dropout_{i}$			0.35(0.17)	0.28(0.16)
$β_{03}$, $drug_{i}$ × $dropout_{i}$			−0.38(0.21)	−0.35(0.20)
$β_{10}$, asymptote	2.53(0.15)	2.53(0.15)	4.05(0.24)	4.00(0.29)
$β_{11}$, $drug_{i}$	−0.02(0.17)	−0.02(0.17)	−1.52(0.29)	−1.25(0.33)
$β_{12}$, $dropout_{i}$			−2.48(0.27)	−3.30(0.29)
$β_{13}$, $drug_{i}$ × $dropout_{i}$			2.27(0.44)	2.87(0.39)
$β_{20}$, rate	0.03(0.01)	0.03(0.01)	0.20(0.07)	0.17(0.07)
$β_{21}$, $drug_{i}$	0.12(0.02)	0.12(0.02)	−0.07(0.07)	−0.02(0.08)
$β_{22}$, $dropout_{i}$			−0.19(0.07)	−0.17(0.07)
$β_{23}$, $drug_{i}$ × $dropout_{i}$			0.28(0.10)	0.21(0.09)
$σ_{0}^{2}$, var($e_{i0}$)*	0.53	0.11	0.54	0.18
$σ_{1}^{2}$, var($e_{i1}$)		0.47		0.46
$σ_{2}^{2}$, var($e_{i2}$)		0.27		0.20
$σ_{3}^{2}$, var($e_{i3}$)		0.57		0.53
$σ_{4}^{2}$, var($e_{i4}$)		0.37		0.28
$σ_{5}^{2}$, var($e_{i5}$)		0.46		0.58
$σ_{6}^{2}$, var($e_{i6}$)		0.94		0.91
Estimated level 2 covariance matrix, $\hat{Φ}$	$\begin{bmatrix} 0.40 & \\ 0.26 & 1.58 \end{bmatrix}$	$\begin{bmatrix} 0.64 & \\ 0.049 & 1.47 \end{bmatrix}$	$\begin{bmatrix} 0.36 & \\ 0.29 & 1.49 \end{bmatrix}$	$\begin{bmatrix} 0.55 & \\ 0.18 & 1.50 \end{bmatrix}$
$- 2 ln L$	4,617.3	4,551.0	4,588.6	4,526.2
AIC	4,637.3	4,583.0	4,620.6	4,570.2
BIC	4,678.1	4,648.3	4,685.9	4,660.0

Note: n = 437. ML estimates obtained using nonadaptive Gaussian quadrature with 30 quadrature points. Standard errors are in parentheses. *The residual covariance structure $Θ = Iσ^{2}$ assumes that the residuals are independent between weeks with common variance $σ_{0}^{2}$. BIC = $2 f(\hat{θ}) + p \log(n)$, where $f(\cdot)$ is the negative of the marginal loglikelihood function, $\hat{θ}$ is the estimated parameter set, $p$ is the number of model parameters, and $n$ is the number of subjects. AIC = Akaike information criterion; BIC = Bayesian information criterion; ML = maximum likelihood.
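To illustrate how the asymptote coefficients in Table 5 translate into group-level values, the Python sketch below combines the estimates obtained under nonignorable missingness. It rests on two assumptions that are not stated in this section: that $drug_{i}$ and $dropout_{i}$ are coded 0/1 (placebo = 0, no dropout = 0), and that the covariates enter the asymptote additively as $β_{10} + β_{11} drug_{i} + β_{12} dropout_{i} + β_{13} drug_{i} × dropout_{i}$. The names used in the code are illustrative only.

```python
# Asymptote coefficients under nonignorable missingness, copied from Table 5.
asymptote_coefs = {
    "constant variance": {"b10": 4.05, "b11": -1.52, "b12": -2.48, "b13": 2.27},
    "nonconstant variance": {"b10": 4.00, "b11": -1.25, "b12": -3.30, "b13": 2.87},
}

# ASSUMPTION: 0/1 dummy coding for both covariates (not confirmed in the text).
groups = {
    "placebo + no dropout": (0, 0),
    "placebo + dropout": (0, 1),
    "drug + no dropout": (1, 0),
    "drug + dropout": (1, 1),
}


def group_asymptote(coefs, drug, dropout):
    """Assumed additive form: b10 + b11*drug + b12*dropout + b13*drug*dropout."""
    return (coefs["b10"]
            + coefs["b11"] * drug
            + coefs["b12"] * dropout
            + coefs["b13"] * drug * dropout)


# Implied asymptote for each group under each residual covariance structure.
implied = {
    structure: {
        name: round(group_asymptote(coefs, drug, dropout), 2)
        for name, (drug, dropout) in groups.items()
    }
    for structure, coefs in asymptote_coefs.items()
}

for structure, values in implied.items():
    print(structure)
    for name, value in values.items():
        print(f"  {name:22s} {value:.2f}")
```

If the coding assumption holds, the implied asymptotes differ most between the two residual structures for the placebo + dropout group; a different coding scheme would change these values.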

The estimated variances of the random intercept and asymptote differed across the assumed residual covariance structures and the assumptions about missingness. Generally, the estimated variance of the random intercept, $φ_{b0}$, increased from 0.40 to 0.64 (assuming ignorable missingness) and from 0.36 to 0.55 (assuming nonignorable missingness) when the residual variances were allowed to differ, suggesting an impact of allowing for heterogeneity of variance in the residuals. As shown in Table 5, under ignorable missingness the estimated residual variance at the first time point was small (0.11) when allowing for heterogeneity of variance in the residuals, in contrast to the estimated residual variance of 0.53 under the assumption of homogeneity of variance. Thus, allowing for differences in the residual variance resulted in an increase in the estimated measure of individual differences at the initial assessment. Less notable were differences in the estimated variances of the random asymptote, $φ_{b1}$, within models that made the same assumption about missingness. Finally, the different estimates of the level 2 covariance structure for the random effects, particularly with regard to the random intercept, meant a change in the estimated correlation between the random intercept and asymptote. Under the assumption of ignorable missingness, the correlation decreased from $r = .32$ for the model that assumed constant variance to $r = .06$ for the model that allowed for different variances. Under the assumption of nonignorable missingness, it decreased from $r = .40$ to $r = .20$.
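The correlations between the random intercept and asymptote can be recovered from the level 2 covariance matrices in Table 5 as the covariance divided by the product of the two standard deviations. The Python sketch below applies this to the tabled entries; the dictionary keys and function name are illustrative. Because the tabled variances and covariances are rounded, the recomputed correlations can differ in the second decimal from those reported in the text.

```python
import math

# Entries of the estimated level 2 covariance matrix from Table 5:
# (variance of intercept, covariance, variance of asymptote).
phi_hat = {
    ("ignorable", "constant variance"): (0.40, 0.26, 1.58),
    ("ignorable", "nonconstant variance"): (0.64, 0.049, 1.47),
    ("nonignorable", "constant variance"): (0.36, 0.29, 1.49),
    ("nonignorable", "nonconstant variance"): (0.55, 0.18, 1.50),
}


def correlation(var_intercept, covariance, var_asymptote):
    """Correlation implied by a 2 x 2 covariance matrix."""
    return covariance / math.sqrt(var_intercept * var_asymptote)


# One correlation per combination of missingness assumption and structure.
correlations = {key: correlation(*entries) for key, entries in phi_hat.items()}

for (missingness, structure), r in correlations.items():
    print(f"{missingness:13s} {structure:21s} r = {r:.3f}")
```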

Discussion

Considering alternative residual structures at the first level of a mixed-effects model that is linear in its parameters is made straightforward by major statistical software programs such as SAS and SPSS. The need for covariance structures as alternatives to the standard structure, which assumes that the residuals are independent with constant variance, to improve model fit and provide more appropriate parameter estimates has been documented in several studies (Chi and Reinsel 1989; Ferron et al. 2002; Kwok et al. 2007; Sivo et al. 2005). Options for fitting alternative residual covariance structures for nonlinear mixed-effects models are limited in commercial software and require some effort to modify standard syntax for estimation of these models (Harring and Blozis 2014). Thus, applications of models that make use of residual covariance structures that differ from the standard assumption of independence and homogeneity of variance are rare, and the impact of relaxing the standard assumption about the residuals is therefore not well understood. Based on the empirical examples presented in this article, considering alternative residual structures for a nonlinear mixed-effects model can be an important element in the process of model assessment and inference, particularly given that nonlinear mixed-effects models are much more complex than those in the family of linear mixed-effects models.

The first example presented in this article was based on learning data. In the example, model fit was improved by allowing for dependencies in the level 1 residuals by using a first-order autoregressive structure, a finding that suggested some dependency between scores even after fitting subject-specific growth curves. Using ML estimation, allowing for dependencies in the residuals resulted in an increased estimate of the variance of the random intercept and a decreased estimate of the variance of the random asymptote. Conversely, both of the estimated variances decreased using Bayes estimation. Only the latter result using Bayes estimation is consistent with what others have described for linear mixed-effects models. Chi and Reinsel (1989), for example, describe how the random effects of a growth model may serve to account for a serial correlation between measures within person and that this serial correlation may be better represented at the first level of the model so as to not inflate the importance of a random effect. Chi and Reinsel’s work relied on ML for estimation of the linear mixed-effects models, yet here it was Bayesian estimation that produced results consistent with this earlier work. ML produced results inconsistent with these patterns, a finding that suggests a sensitivity of the variance estimator to both the assumptions about the covariance structure and the estimation procedure.

The second example presented in this article illustrated the impact of considering alternative residual structures for longitudinal measures of illness severity for a patient population for which data were incomplete and the missingness possibly not ignorable. In the example provided, assumptions about the residual covariance structure had their most notable impact on a model that assumed nonignorable missingness. Although conclusions about missingness for the patient data reported here are, in general, consistent with other studies that suggest that missingness is not likely to be ignorable (see Hedeker and Gibbons 1997), the particular conclusions about the patterns of missing data were driven in part by the assumptions that were made about the residual covariance structure. Such a finding suggests possible implications of fitting alternative residual covariance structures for the understanding of the mechanisms that give rise to missing data.

The work presented here represents one aspect of the process of evaluating model fit for longitudinal data, with a focus on nonlinear mixed-effects models. Certainly, the issue of model specification for longitudinal data more generally can present a great challenge to researchers. Although our focus and that of others concerns mixed-effects models and latent growth models and, in particular, the need to consider alternative residual covariance structures at the first level of a model, a higher level challenge is selecting a model that provides a framework for testing hypotheses and a good representation of the data (Liu, Rovine, and Molenaar 2012). Added to these issues is the need to address features of the data, such as data that are not measured at the same time points for all individuals. Also relevant to the estimation of longitudinal models is the potential increase in the difficulty of estimation, particularly as models become increasingly complex. Estimation of nonlinear mixed-effects models is indeed complex relative to linear models. Estimating such models with more complex specifications of the residual covariance structure can further increase computational demand. Increased model complexity can also reduce the precision of parameter estimates. Although these can be unfortunate consequences of considering more complex models, the ultimate goal is to provide a reasonable representation of the data. Considering different residual covariance structures, whether a growth model is linear or nonlinear in its parameters, can yield information in the more general process of evaluating longitudinal models.

As a final note, results from the present investigation suggest some sensitivity of parameter estimates to the choice of estimation method, namely, ML or Bayes. In the analysis of the learning data provided here, discrepancies between ML and Bayes led to different interpretations. In particular, a first-order autoregressive structure was needed to describe the data (similar to the problems considered for linear mixed-effects models in the extant literature). As discussed, the estimated variance of the random intercept increased under ML, whereas it decreased under Bayes, after allowing for the autocorrelation. The pattern resulting from the Bayes estimates is consistent with findings from linear models, but results from ML were inconsistent with previous research. Although certainly beyond the scope of this article, this finding suggests a need for future work to understand the role of estimation choices in fitting such complex models.

Supplemental Material

Supplemental Material, Appendices_Blozis_Harring for Fitting Nonlinear Mixed-effects Models With Alternative Residual Covariance Structures by Shelley A. Blozis and Jeffrey R. Harring in Sociological Methods & Research

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Supplemental Material

Supplemental material for this article is available online.

Note

References

Akaike. 1974. “A New Look at the Statistical Model Identification.” IEEE Transactions on Automatic Control 19(6):716–23. doi: 10.1109/TAC.1974.1100705.

Beal, S. L., and Sheiner, L. B. 1982. “Estimating Population Kinetics.” CRC Critical Reviews in Biomedical Engineering 8:195–222.

Blozis, S. A. 2007. “A Newton Procedure for a Conditionally Linear Mixed-effects Model.” Behavior Research Methods 39(4):695–708.

Blozis, S. A., and Cudeck. 1999. “Conditionally Linear Mixed-effects Models with Latent Variable Covariates.” Journal of Educational & Behavioral Statistics 24(3):245–70.

Broemeling, L. D. 2016. Bayesian Methods for Repeated Measures Data. Boca Raton, FL: Chapman & Hall/CRC.

Browne, M. W. 1993. “Structured Latent Curve Models.” Pp. 171–97 in Multivariate Analysis: Future Directions 2, edited by Cuadras, C. M., and Rao, C. R. Amsterdam, the Netherlands: Elsevier Science.

Burke, C. T., Shrout, P. E., and Bolger. 2007. “Individual Differences in Adjustment to Spousal Loss: A Nonlinear Mixed Model Analysis.” International Journal of Behavioral Development 31(4):405–15.

Chi, E. M., and Reinsel, G. C. 1989. “Models for Longitudinal Data with Random Effects and AR(1) Errors.” Journal of the American Statistical Association 84(406):452–59.

Choi, Harring, J. R., and Hancock, G. R. 2009. “Latent Growth Modeling for Logistic Response Functions.” Multivariate Behavioral Research 44(5):620–45.

Cudeck. 1996. “Mixed-effects Models in the Study of Individual Differences with Repeated Measures Data.” Multivariate Behavioral Research 31(3):371–403.

Cudeck, and Klebe, K. J. 2002. “Multiphase Mixed-effects Models for Repeated Measures Data.” Psychological Methods 7(1):41–63.

Davidian, and Giltinan, D. M. 1995. Nonlinear Models for Repeated Measurement Data. New York: Chapman & Hall.

Davidian, and Giltinan, D. M. 2003. “Nonlinear Models for Repeated Measures Data: An Overview and Update.” Journal of Agricultural, Biological, and Environmental Statistics 8(4):387–419.

De la Cruz-Mesia, and Marshall. 2006. “Non-linear Random Effects Models with Continuous Time Autoregressive Errors: A Bayesian Approach.” Statistics in Medicine 25(9):1471–84.

Ferron, Dailey, and Yi. 2002. “Effects of Misspecifying the First-level Error Structure in Two-level Models of Change.” Multivariate Behavioral Research 37(3):379–403.

Fitzmaurice, G. M., Laird, N. M., and Ware, J. H. 2004. Applied Longitudinal Analysis. 2nd ed. New York: Wiley.

Gelfand, A. E., and Smith, A. F. M. 1990. “Sampling-based Approaches to Calculating Marginal Densities.” Journal of the American Statistical Association 85(410):398–409.

Grimm, K. J., Ram, and Hamagami. 2011. “Nonlinear Growth Curves in Developmental Research.” Child Development 82(5):1357–71.

Harring, J. R., and Blozis, S. A. 2014. “Fitting Correlated Residual Error Structures in Nonlinear Mixed-effects Models Using SAS PROC NLMIXED.” Behavior Research Methods 46(2):372–84.

Harring, J. R., and Blozis, S. A. 2016. “A Note on Recurring Misspecifications When Fitting Nonlinear Mixed Models.” Multivariate Behavioral Research 51(6):805–17.

Hedeker, and Gibbons, R. D. 1997. “Application of Random-effects Pattern-mixture Models for Missing Data in Longitudinal Studies.” Psychological Methods 2(1):64–78.

Infurna, F. J., Wiest, Gerstorf, Ram, Schup, Wagner, G. G., and Heckhausen. 2017. “Changes in Life Satisfaction When Losing One’s Spouse: Individual Differences in Anticipation, Reaction, Adaptation and Longevity in the German Socio-economic Panel Study (SOEP).” Aging and Society 37(5):899–934.

Jennrich, R. I., and Schluchter, M. D. 1986. “Unbalanced Repeated Measures Models with Structured Covariance Matrices.” Biometrics 42(4):805–20.

Kanfer, and Ackerman, P. L. 1989. “Motivation and Cognitive Abilities: An Integrative/Aptitude-treatment Interaction Approach to Skill Acquisition.” Journal of Applied Psychology 74(4):657–90.

Kohli, and Harring, J. R. 2013. “Modeling Growth in Latent Variables Using a Piecewise Function.” Multivariate Behavioral Research 48(3):370–97.

Kwok, West, S. G., and Green, S. B. 2007. “The Impact of Misspecifying the Within-subject Covariance Structure in Multiwave Longitudinal Multilevel Models: A Monte Carlo Study.” Multivariate Behavioral Research 42(3):557–92.

Laird, N. M. 1988. “Missing Data in Longitudinal Studies.” Statistics in Medicine 7(1-2):305–15.

Laird, N. M., and Ware, J. H. 1982. “Random-effects Models for Longitudinal Data.” Biometrics 38(4):963–74.

Laursen, Little, T. D., and Card, N. A. 2013. Handbook of Developmental Research Methods. New York: Guilford Press.

Lee, I. H., and Rojewski, J. W. 2009. “Development of Occupational Aspiration Prestige: A Piecewise Latent Growth Model of Selected Influences.” Journal of Vocational Behavior 75:82–90.

Lesaffre, and Spiessens. 2001. “On the Number of Quadrature Points in a Logistic Random Effects Model: An Example.” Applied Statistics 50(3):325–35.

Levy, and Mislevy, R. J. 2016. Bayesian Psychometric Modeling. Boca Raton, FL: Chapman and Hall/CRC.

Stewart, and Weiskettel. 2012. “A Bayesian Approach for Modelling Non-linear Longitudinal/Hierarchical Data with Random Effects in Forestry.” Forestry: An International Journal for Forest Research 85(1):17–25.

Little, R. J. A. 1995. “Modeling the Drop-out Mechanism in Repeated-measures Studies.” Journal of the American Statistical Association 90(431):1112–21.

Liu, Rovine, M. J., and Molenaar, P. C. M. 2012. “Selecting a Linear Mixed Model for Longitudinal Data: Repeated Measures Analysis of Variance, Covariance Pattern Model, and Growth Curve Approaches.” Psychological Methods 17(1):15–30.

Lorr, and Klett, C. J. 1966. Inpatient Multidimensional Psychiatric Scale: Manual. Palo Alto, CA: Consulting Psychologists Press.

Marceau, Abar, C. C., and Jackson, K. M. 2015. “Parental Knowledge is a Contextual Amplifier of Associations of Pubertal Maturation and Substance Use.” Journal of Youth and Adolescence 44(9):1720–34.

Molenberghs, and Kenward. 2007. Missing Data in Clinical Studies. Chichester, England: John Wiley.

Muthén. 1997. “Latent Variable Modeling of Longitudinal and Multilevel Data.” Sociological Methodology 27(1):453–80.

Pinheiro, J. C., and Bates, D. M. 1995. “Approximations to the Log-likelihood Function in the Nonlinear Mixed-effects Model.” Journal of Computational and Graphical Statistics 4(1):12–35.

Raudenbush, S. W., and Bryk, A. S. 2002. Hierarchical Linear Models: Applications and Data Analysis Methods. 2nd ed. Thousand Oaks, CA: Sage.

Richards, F. J. 1959. “A Flexible Growth Function for Empirical Use.” Journal of Experimental Botany 10(2):290–301.

Roberts, W. L. 1986. “Nonlinear Models of Development: An Example from the Socialization of Competence.” Child Development 57(5):1166–78.

Roberts, A. R., and Adams, K. B. 2018. “Quality of Life Trajectories of Older Adults Living in Senior Housing.” Research on Aging 40(6):511–34.

Schlotz, Hammerfald, Ehlert, and Gaab. 2011. “Individual Differences in the Cortisol Response to Stress in Young Healthy Men: Testing the Roles of Perceived Stress Reactivity and Threat Appraisal Using Multiphase Latent Growth Curve Modeling.” Biological Psychology 87(2):257–64.

Schwarz. 1978. “Estimating the Dimension of a Model.” Annals of Statistics 6(2):461–64.

Singer, J. D., and Willett. 2003. Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. New York: Oxford University Press.

Sivo, Fan, and Witta. 2005. “The Biasing Effects of Unmodeled ARMA Time Series Processes on Latent Growth Curve Model Estimates.” Structural Equation Modeling 12:215–31.

Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and Van Der Linde. 2002. “Bayesian Measures of Model Complexity and Fit.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64(4):583–639.

Sturtz, Ligges, and Gelman, A. E. 2005. “R2WinBUGS: A Package for Running WinBUGS from R.” Journal of Statistical Software 12(3):1–16.

Susman, E. P., Murphy, J. R., Zerbe, G. O., and Jones, R. H. 1998. “Using A Nonlinear Mixed Model to Evaluate Three Models of Human Stature.” Growth, Development, and Aging 62(4):161–71.

Thomas. 2004. BRugs User Manual, Version 1.0. Helsinki, Finland: Dept of Mathematics & Statistics, University of Helsinki.

Wakefield, J. C. 1996. “The Bayesian Analysis of Population Pharmacokinetic Models.” Journal of the American Statistical Association 91(433):62–75.

West, S. G., Ryu, Kwok, O.-M., and Cham. 2011. “Multilevel Modeling: Current and Future Applications in Personality Research.” Journal of Personality 79(1):2–50.

Wickrama, F. O., Lorenz, F. O., and Conger, R. D. 1997. “Parental Support and Adolescent Physical Health Status: A Latent Growth-curve Analysis.” Journal of Health and Social Behavior 38(2):149–63.

Willett, J. B., and Sayer, A. G. 1994. “Using Covariance Structure Analysis to Detect Correlates and Predictors of Individual Change Over Time.” Psychological Bulletin 116(2):363–81.

Blozis, S. A. 2011. “Sensitivity Analysis of Mixed Models for Incomplete Longitudinal Data.” Journal of Educational and Behavioral Statistics 36(2):237–56.

Zhao, Luo, Chu, C. T., and Epstein, L. H. 2016. “A Two-part Mixed Effects Model for Cigarette Purchase Task Data.” Journal of the Experimental Analysis of Behavior 106:242–53.
