Abstract
A multistate model is used to describe employment history. Transition-specific rates are defined using generalized gamma distributions and Gompertz distributions. This flexible parametric modelling of the rate of change is combined with latent classes for unobserved propensity to change jobs. The propensity is described by two latent classes which can be interpreted as consisting of movers and stayers. The modelling is illustrated by analysing longitudinal data from the German Life History Study.
Introduction
Multistate models are used to describe stochastic processes where the change of status is of interest. Many applications can be found in biostatistics with the illness-death model as the quintessential example. An illness-death model can be defined by three states: a healthy state, an ill state and the dead state. Of interest can be risk factors for the onset of the illness or expected duration in the ill state. In social statistics and in demography, multistate models are used to study processes such as changes in region of residence, employment history, or changes in marital status.
Statistical methods for multistate models are typically not discipline-specific. As an example, a three-state illness-death model for medical data can be quite similar to a three-state model for employment history when the states in the latter are defined as employed, unemployed and retired.
The aim of this article is to introduce the generalized gamma distribution (Stacy, 1962) for flexible parametric continuous-time multistate modelling in demography. The application is with respect to employment history, where states are defined corresponding to the number of past jobs and intermediate spells without a job. The time scale in our model for employment history is time since entry into the labour market. We use the generalized gamma distribution to model transition-specific time dependence, and we show that using this distribution improves the statistical inference when combined with transition-specific exponential and Gompertz distributions. In addition, we show that parametric models can be extended with a definition of latent classes.
For continuous-time multistate models, the generalized gamma distribution is discussed in Jackson (2016). The parametric modelling is within a wider methodological framework for multistate models described by, for example, Kalbfleisch and Lawless (1985), Kay (1986), Hougaard (2000), Jackson (2011), and Van den Hout (2017). The latent-class model that we define can be seen as a random-effects model with a discrete distribution for the random effects. Latent-class models for discrete-time multistate models are discussed in, for example, Vermunt et al. (1999) and Bartolucci et al. (2012). An example in demography is Dias and Willekens (2005), who discussed determining the number of latent classes. For continuous-time multistate models which include random effects see, for example, Hougaard (2000), Putter and Van Houwelingen (2015), and Van den Hout (2017). For multistate modelling in demography, an overview with a wide range of random-effects structures is given by Bijwaard (2014).
When multistate models are applied in biostatistics, we can see the latent-class model as a frailty model. For example, if progression through a set of states denotes a deterioration of health, a model with two latent classes can distinguish individuals who move quickly through the states (the frail individuals) from those who move less quickly (the more healthy ones). In the current article, the latent classes are defined with respect to employment history and will allow us to distinguish individuals who tend to change job more quickly (the movers) from those who tend to stay put (the stayers). In the application, we will show that such a distinction can lead to a model that fits the data better.
Our multistate model for employment history combines the generalized gamma distribution with the latent-class approach. Although this distribution and the latent-class approach have been discussed separately in the literature (see the references above), our contribution is to combine the two and thereby define a flexible statistical modelling framework. The article illustrates this by defining a series of models, and by model comparison, validation and interpretation. We show that the modelling framework is general and allows for a wide range of applications in demography.
German Life History Study
The German Life History Study (1980–2005) provides retrospective life course information for Germans born between 1919 and 1951 (Mayer, 2015). The study is often used to investigate education, employment history and family formation.
For the current article, we use the Blossfeld–Rohwer subsample of the German Life History Study (Blossfeld and Rohwer, 2002). These data are available in the Biograph package in the R software. This package is introduced by Willekens (2014), who used the Blossfeld–Rohwer subsample in a three-state model for employment history.
Eleven-state process for employment history in GLHS
This subsample contains data for 201 individuals on job episodes and spells without a job. The start and end of each job episode are available on a time scale in months. Individual background information is given by covariates such as age, gender and education.
Interviews for GLHS were conducted in 1981. In what follows, we propose a statistical model for employment history as known at the time of the interview. This history is defined by a series of mutually exclusive states. Because not many individuals in GLHS have had more than six jobs, we restrict the modelling of employment history up to (and including) the sixth job.
Figure 1 shows the process that we will model in this article. States 1–6 denote the first job up to the sixth job. States 7–11 denote episodes between jobs; that is, state 7 denotes having no job after the first job, state 8 denotes having no job after the second job, and so on. With the imposed restriction to a maximum of six jobs, state 6 acts as an absorbing state.
It is possible to define a process with fewer states by collapsing the five no-job states into one state. However, the model comparison in Section 6 shows that it is worthwhile to distinguish the five no-job states, as illustrated in Figure 1.
In the subsample (denoted GLHS from now on), all individuals start in state 1, but they do so at different times and different ages. The time scale in GLHS is century months. For example, the first individual in the data is a man who starts his first job at century month 555. In calendar years, this is approximately 1900 + 555/12 = 1946.25, that is, early 1946.
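As a small illustration of the century-month time scale, the following sketch converts a century month into a calendar year and month. It assumes the common convention that century month 1 corresponds to January 1900; this convention and the function name `century_month_to_date` are assumptions made for illustration and are not taken from the GLHS documentation.

```python
def century_month_to_date(cmc):
    """Convert a century month to (year, month).

    Assumption: century month 1 is January 1900.
    """
    if cmc < 1:
        raise ValueError("century month must be a positive integer")
    year = 1900 + (cmc - 1) // 12   # completed years since 1900
    month = (cmc - 1) % 12 + 1      # calendar month, 1 = January
    return year, month


# Start of the first job of the first individual in the data.
print(century_month_to_date(555))
```

Under the stated convention, century month 555 falls in the first half of 1946.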
For the 201 individuals in GLHS, Figure 2 shows the start time of the first job (in calendar years) and the age at which the individuals started their first job. The graph shows a cohort effect in the sense that the age at the start of the first job is increasing with increasing calendar year for the start of the first job. Therefore, we will include cohort effects in the statistical model for employment history.
Table 1 is the state table for the 11-state employment history in GLHS. This table shows the frequencies for successive pairs of states. Right-censored histories are presented in the
For the 201 individuals in GLHS, a two-dimensional representation of the start time of the first job and the age at the start of the first job
State table for GLHS: number of times for each successive pair of states. The states are defined by employment history; see Figure 1
To model potential time dependence in employment history, we need to specify a time scale for the process. Possible options are time in current job, age and time since entry into the labour market. For our statistical model, we use the third option, following the choice of Blossfeld and Rohwer (2002, Chapter 2) for their model for the rate of leaving the current job. This option implies, for example, that our modelling allows two individuals who have been in their third job equally long to have a different distribution for moving to the fourth job due to a difference in time since entry into the labour market. It allows us to investigate the effects of gender and education while controlling for time spent in the labour market.
To model change of state in continuous time, we specify continuous parametric distributions for transition time
Our modelling is based on continuous-time stochastic processes, which are defined as
In our application, we have finite
Let
If a transition is not possible from r to s, we define
For
This definition of
Exponential and Gompertz distribution
The transition-specific hazard function for the exponential distribution is given
The transition-specific hazard function for the Gompertz distribution is
Models with exponential and Gompertz distributions can be extended in the usual way by log-linear regression:
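The two hazard shapes and the log-linear extension can be sketched as follows. The parameterization is an assumption made for illustration: a baseline rate exp(intercept + coefficients times covariates), a constant hazard for the exponential distribution, and a hazard of the form rate * exp(xi * t) for the Gompertz distribution. The function names and the numerical values are hypothetical and are not the article's notation or estimates.

```python
import math


def log_linear_rate(intercept, coefs, covariates):
    """Baseline rate exp(intercept + sum_j coefs[j] * covariates[j])."""
    linear_predictor = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return math.exp(linear_predictor)


def exponential_hazard(t, rate):
    """Constant hazard: the value does not depend on t."""
    return rate


def gompertz_hazard(t, rate, xi):
    """Gompertz hazard rate * exp(xi * t); xi = 0 gives the exponential case."""
    return rate * math.exp(xi * t)


def gompertz_survival(t, rate, xi):
    """Survivor function exp(-cumulative hazard) for the Gompertz hazard."""
    if xi == 0.0:
        return math.exp(-rate * t)
    return math.exp(-(rate / xi) * (math.exp(xi * t) - 1.0))


# Hypothetical values: intercept -4, two dummy covariates, t in months.
rate = log_linear_rate(-4.0, [0.5, -0.2], [1, 0])
print(exponential_hazard(24.0, rate), gompertz_hazard(24.0, rate, -0.01))
```

A negative value of xi gives a hazard that decreases with time since entry into the labour market, and a positive value gives an increasing hazard.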
Generalized gamma distribution
The generalized gamma distribution is a very flexible continuous parametric distribution for event time
Following the presentation and terminology in Cox et al. (2007), the generalized gamma distribution
for location μ, scale
where
The log-normal distribution is defined as
Re-parameterization by
which is a standard representation of the Weibull density. The median of this distribution is
For the multistate model, we define transition-specific hazard as
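The special cases mentioned above can be checked numerically with a minimal sketch of the generalized gamma distribution in a location, scale and shape parameterization. The shape symbol `q`, the function names, and the series used for the incomplete gamma function are assumptions made for illustration; the article's own notation may differ. With `q = 1` the sketch reduces to the Weibull distribution, with `q = 1` and `sigma = 1` to the exponential distribution, and with `q = 0` to the log-normal distribution.

```python
import math


def _reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x) via its power series.

    Adequate for moderate x; not intended for extreme arguments.
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total) and n < 10000:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def gengamma_density(t, mu, sigma, q):
    """Density with location mu, scale sigma > 0 and shape q."""
    w = (math.log(t) - mu) / sigma
    if q == 0.0:  # log-normal limit
        return math.exp(-0.5 * w * w) / (sigma * t * math.sqrt(2.0 * math.pi))
    g = 1.0 / (q * q)
    log_f = (math.log(abs(q)) + g * math.log(g) - math.log(sigma * t)
             - math.lgamma(g) + g * (q * w - math.exp(q * w)))
    return math.exp(log_f)


def gengamma_survival(t, mu, sigma, q):
    """Survivor function P(T > t)."""
    w = (math.log(t) - mu) / sigma
    if q == 0.0:
        return 0.5 * math.erfc(w / math.sqrt(2.0))
    g = 1.0 / (q * q)
    p = _reg_lower_gamma(g, g * math.exp(q * w))
    return 1.0 - p if q > 0.0 else p


def gengamma_hazard(t, mu, sigma, q):
    """Hazard as density divided by survivor function."""
    return gengamma_density(t, mu, sigma, q) / gengamma_survival(t, mu, sigma, q)


# Exponential special case: the hazard is constant and equal to exp(-mu).
print(gengamma_hazard(5.0, 2.0, 1.0, 1.0), math.exp(-2.0))
```

Evaluating `gengamma_hazard` over a grid of times for other values of `sigma` and `q` shows the range of shapes, including arc-shaped hazards, that the distribution can represent.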
A latent-class model is defined by specifying K classes, probabilities
The class-specific parameters induce class-specific distributions for the transition times. As an example, consider the exponential model and
The distinction between individuals who move quickly through the states and individuals who tend not to move so quickly is described as tracking by Satten (1999): there is a correlation between the transition times within individuals, and rapid progressors can be distinguished from slow progressors. In the literature, a mover/stayer model can also denote a stochastic process that is a mixture of two processes, one of which has transition hazards equal to zero; see, for example, Frydman (1984) and Cole et al. (2005). This option, however, will not be explored in the current article.
Latent-class models can be seen as random-effects models where the random effects have a discrete distribution; see, for example, the generalized linear mixed models as discussed in Aitkin (1999). When the latent-class model is fitted as a random-effects model, the classes are primarily aimed at capturing unobserved heterogeneity; interpretation of the classes is not of primary interest.
Estimation
For individual i, for
Consider a time interval
Individual contributions to the likelihood function for the fixed-effects models are given by
for
Individual contributions to the likelihood function for the latent-class models are given by
where
The likelihood function for the fixed-effects models is the standard format for survival data extended to multiple event times within individuals. Note that possible association between observations within an individual is not modelled. In contrast, the likelihood function for the latent-class models allows a correlation between observations through the latent-class parameters.
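The structure of the two likelihood contributions can be sketched for the simplest case of constant (exponential) hazards. The data layout (a list of episodes with state, duration and next state), the function names, and the numerical values are hypothetical. The sketch also assumes that each latent class has its own set of hazards; this is one way to express class-specific parameters and not necessarily the article's specification.

```python
import math


def log_lik_fixed(episodes, hazards):
    """Log-likelihood contribution of one individual under constant hazards.

    episodes: list of (state, duration, next_state); next_state is None
              when the episode is right-censored.
    hazards:  dict mapping (from_state, to_state) to a constant hazard.
    """
    total = 0.0
    for state, duration, next_state in episodes:
        exit_rate = sum(h for (r, _), h in hazards.items() if r == state)
        total -= exit_rate * duration                        # log survivor function
        if next_state is not None:
            total += math.log(hazards[(state, next_state)])  # log hazard at the event
    return total


def log_lik_latent(episodes, class_hazards, class_probs):
    """Log of sum_k pi_k * L_i(theta_k), computed with log-sum-exp."""
    terms = [math.log(p) + log_lik_fixed(episodes, h)
             for p, h in zip(class_probs, class_hazards)]
    m = max(terms)
    return m + math.log(sum(math.exp(v - m) for v in terms))


# Hypothetical history: 24 months in job 1, then 36 months in job 2 (censored).
episodes = [(1, 24.0, 2), (2, 36.0, None)]
movers = {(1, 2): 0.03, (1, 7): 0.01, (2, 3): 0.03, (2, 8): 0.01}
stayers = {(1, 2): 0.01, (1, 7): 0.005, (2, 3): 0.01, (2, 8): 0.005}
print(log_lik_latent(episodes, [movers, stayers], [0.4, 0.6]))
```

Because the class is shared by all episodes of an individual, the mixture is taken over the product of episode contributions, which is how the latent classes induce correlation between the transition times within an individual.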
For the maximum likelihood estimation, we use optim, the general-purpose optimizer in the R software (R Core Team, 2013). This versatile optimizer can maximize the log-likelihood function without the need to provide explicit expressions for the derivatives. The optimizer will return the numerically differentiated Hessian matrix if requested.
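The idea of derivative-free maximization followed by numerical differentiation can be sketched in a one-parameter setting. The sketch below is not the optimizer used in the article: it swaps in a golden-section search and a central finite difference, applied to the log-likelihood of a single constant hazard exp(beta). The event count and the exposure are hypothetical numbers.

```python
import math


def neg_log_lik(beta, n_events, exposure):
    """Negative log-likelihood for a constant hazard exp(beta)."""
    return -(n_events * beta - math.exp(beta) * exposure)


def golden_section_min(f, lo, hi, tol=1e-10):
    """Derivative-free minimization of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of the second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


# Hypothetical data: 40 transitions observed during 2000 person-months.
objective = lambda beta: neg_log_lik(beta, 40, 2000.0)
beta_hat = golden_section_min(objective, -10.0, 0.0)
std_err = 1.0 / math.sqrt(second_derivative(objective, beta_hat))
print(beta_hat, std_err)
```

In this simple case the result can be checked against the closed form: the estimate equals the log of events divided by exposure, and the standard error equals one over the square root of the number of events.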
The covariance of a function of estimated model parameters can be derived by simulation. An important example of such a function is the set of transition probabilities for a specified time interval. For the simulation, parameter vectors
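A simulation of this kind can be sketched as follows, under the assumption (a common choice, not confirmed by the text above) that parameter vectors are drawn from a multivariate normal distribution centred at the maximum likelihood estimate with the estimated covariance matrix. The estimate, the covariance matrix and the function of interest are hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def draw_mvn(mean, lower, rng):
    """One draw from N(mean, lower * lower^T)."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


def simulate_interval(func, mean, cov, n_draws=2000, seed=1):
    """Approximate 95% interval for func(theta) from simulated parameter vectors."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    values = sorted(func(draw_mvn(mean, lower, rng)) for _ in range(n_draws))
    return values[int(0.025 * n_draws)], values[int(0.975 * n_draws) - 1]


# Hypothetical estimate: intercept and one covariate effect on the log hazard.
theta_hat = [-4.0, 0.3]
cov_hat = [[0.04, -0.01], [-0.01, 0.09]]
# Function of interest: probability of no transition within 60 months.
stay_prob = lambda th: math.exp(-math.exp(th[0] + th[1]) * 60.0)
print(stay_prob(theta_hat), simulate_interval(stay_prob, theta_hat, cov_hat))
```

The same set of draws can also be used to compute an empirical covariance matrix when the function returns several quantities, such as a full row of transition probabilities.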
Data analysis
We start the data analysis for GLHS with the basic exponential model. This model can be seen as an intercepts-only model for the location parameters and has as many parameters as transitions in the process:
Dummy variables for the cohort effects are defined as
For the modelling of the effects of the dummy variables, we distinguish three types of transitions: from one job to the next (A), from a job to no job (B) and from no job to the next job (C). The model for the location parameters with the covariates is thus given by
and has AIC = 6437.2, which shows a clear improvement over the intercepts-only model.
It is possible to restrict the intercepts
To investigate non-constant transition hazards, we discuss two models. For the first model, we add Gompertz
Table 1 shows that there is good information for moving out of states 1 and 2. To allow for more flexible parametric shapes, we fit generalized gamma distributions for
For
Next, we extend Model
for
Model
Parameter estimates for the mover/stayer model for the GLHS data on employment history (Model
). Time scale is months since entry into the labour market. Estimated standard errors in parentheses. Estimated intercepts are not included
For women born in 1939–1941 and with less than 12 years of education, Figure 3 depicts the hazards in Model
The flexibility of the generalized gamma distribution is illustrated in Figure 3 by the arc-shape hazards transitions
Looking at the estimated standard errors in Table 2, we see that there are clear cohort effects. The younger cohorts have higher hazards for all transitions. The effects for gender illustrate that men (coded by
Goodness of fit for Model
Transition-specific hazards for women born in 1939–1941 and with less than 12 years of education. Black curves for class 1, and grey curves for class 2 (with 95% confidence bands)
For survival prior to state 2 and for survival prior to state 3, the model-based curves are close to the Kaplan–Meier curves. For the higher states, Figure 4 shows some lack of agreement at the later times, part of which is due to the right-censoring. Using the non-parametric Kaplan–Meier estimation as a data summary, the overall similarity with the model-based prediction validates the main features of the fitted model.
Hazards such as the ones in Figure 3 inform about employment history in the sense that they show the differences between the classes and the rates of moving to a next job. Transition probabilities can be used for additional inference. For example, to compare the two classes, we can look at the probabilities for changing states within a 5-year period. Say we do this for a man in birth cohort 1949–1951, with more than 12 years of education, who has been in the labour market for 5 years and is currently in his second job. The probability to move to (or stay in) state
Comparison of predicted model-based survival and Kaplan–Meier survivor curves. Solid lines for mean of predicted survival given baseline GLHS data for the 201 individuals. Dashed lines for the Kaplan–Meier curves (with 95% confidence intervals)
which we will denote
If this man is in class
For the computation of these probabilities, we used a piecewise-constant approximation (defined by a 1-month grid) to take the non-constant hazards into account. These transition probabilities illustrate the differences between the classes on the scale of probabilities. For example, we see that for staying in the second job the difference between the classes is quite large (0.231 vs. 0.572 for classes 1 and 2, respectively). When in class 1, the man has a consistently higher chance to be in another job after 5 years than when he is in class 2, illustrating that class 1 is the class of the movers.
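A piecewise-constant approximation of this kind can be sketched as a product of matrix exponentials over a 1-month grid. The details are assumptions made for illustration: hazards are frozen at the midpoint of each grid cell, the matrix exponential is computed with a truncated Taylor series, and the three-state progressive example with Gompertz-type hazards uses hypothetical values, not estimates from the article.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=30):
    """Matrix exponential by a truncated Taylor series (adequate for small m)."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    for k in range(1, terms):
        power = [[v / k for v in row] for row in mat_mul(power, m)]  # m^k / k!
        for i in range(n):
            for j in range(n):
                result[i][j] += power[i][j]
    return result


def transition_probabilities(hazards, n_states, t_start, t_end, step=1.0):
    """Approximate P(t_start, t_end) with hazards constant within each grid cell.

    hazards: dict mapping (r, s), with 0-based states, to a function of time.
    """
    probs = [[float(i == j) for j in range(n_states)] for i in range(n_states)]
    t = t_start
    while t < t_end - 1e-12:
        width = min(step, t_end - t)
        mid = t + 0.5 * width
        gen = [[0.0] * n_states for _ in range(n_states)]
        for (r, s), h in hazards.items():
            rate = h(mid)
            gen[r][s] += rate * width
            gen[r][r] -= rate * width
        probs = mat_mul(probs, mat_exp(gen))
        t += width
    return probs


# Hypothetical progressive process 0 -> 1 -> 2, time in months since entry.
hazards = {(0, 1): lambda t: 0.03 * math.exp(-0.005 * t),
           (1, 2): lambda t: 0.02 * math.exp(-0.005 * t)}
for row in transition_probabilities(hazards, 3, 60.0, 120.0):
    print([round(p, 3) for p in row])
```

Each row of the resulting matrix gives the probabilities of being in each state at the end of the 5-year interval, given the state at its start, so the rows sum to one.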
If we want to allocate the above specified man (with identifier i, say) to class
where
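One standard way to allocate an individual to a class is to compute posterior class probabilities with Bayes' rule, combining the class probabilities with the class-specific likelihood contributions of the individual's observed history. Whether this matches the article's exact expression is an assumption; the function name and the numerical values below are hypothetical.

```python
import math


def posterior_class_probs(class_log_liks, class_probs):
    """P(class k | data) = pi_k * L_k / sum_l pi_l * L_l, computed on the log scale."""
    logs = [math.log(p) + ll for p, ll in zip(class_probs, class_log_liks)]
    m = max(logs)
    weights = [math.exp(v - m) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical class-specific log-likelihoods for one individual's history.
print(posterior_class_probs([-12.4, -13.1], [0.4, 0.6]))
```

The individual can then be allocated to the class with the largest posterior probability.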
To summarize the GLHS data analysis, Model
This article illustrates the use of the generalized gamma distribution for a progressive multistate process in demography. In addition, it is shown that an extension to a latent-class structure is possible and can lead to improved statistical inference.
Because the multistate process in the application is progressive, the likelihood function for the time-dependent model can be constructed using transition-specific hazard functions and state-specific survivor functions. For multistate processes where back-and-forth transitions between the states are possible, the distributional options are the same, but the estimation becomes more complex. For such a case, we would propose to use a piecewise-constant approximation to the continuous-time parametric shape. The approximation would consist of a series of exponential models with changing hazard specifications; see, for example, Blossfeld and Rohwer (2002) and Van den Hout (2017).
This article combines parametric hazard models with a discrete distribution for unobserved heterogeneity. Other options are possible. Putter et al. (2007) discuss a semi-parametric alternative for the fixed-effects models, and Hougaard (2000) discusses parametric choices for the distribution for unobserved heterogeneity. Advantages of parametric hazard models are efficiency and the option to predict the process outside the range defined by the data. With respect to the discrete distribution for unobserved heterogeneity, as long as this distribution is defined by a few classes, this option is computationally advantageous and, of course, it circumvents the need to specify a parametric shape for a latent characteristic.
Acknowledgements
The creation of Figure 2 is based on R code on
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
