Abstract
State-switching models such as hidden Markov models or Markov-switching regression models are routinely applied to analyse sequences of observations that are driven by underlying non-observable states. Coupled state-switching models extend these approaches to address the case of multiple observation sequences whose underlying state variables interact. In this article, we provide an overview of the modelling techniques related to coupling in state-switching models, thereby forming a rich and flexible statistical framework particularly useful for modelling correlated time series. Simulation experiments demonstrate the relevance of being able to account for an asynchronous evolution as well as interactions between the underlying latent processes. The models are further illustrated using two case studies related to (a) interactions between a dolphin mother and her calf as inferred from movement data and (b) electronic health record data collected on 696 patients within an intensive care unit.
Introduction
Hidden Markov models (HMMs) are flexible statistical models for sequential data in which the observations are assumed to depend on an underlying latent state process. They have successfully been applied in various areas, starting with speech recognition in the 1970s (Baker, 1975) and nowadays including fields such as psychology (Visser et al., 2002), finance (Bulla and Bulla, 2006), medicine (Langrock et al., 2013) and ecology (Michelot et al., 2016). When modelling multiple observed variables using HMMs, one usually assumes either (a) a single state process underlying all observed variables (e.g., the speed and tortuosity of an animal's movement are both driven by its behavioural mode) or (b) variable-specific but independent state processes (e.g., multiple animals separated in space will have independent behavioural modes; Langrock et al., 2012). However, there are also scenarios in which neither of these assumptions is valid. For example, multiple individuals may interact due to spatial proximity, the underlying volatilities of different financial markets may affect each other, and body functions may be coupled through physiological mechanisms. In such cases, each process of interest will have its own sequence of underlying states, but the different state processes are coupled.
Coupled hidden Markov models (CHMMs) extend the basic HMM framework by assuming distinct but correlated state sequences that underlie the observed variables, hence ‘coupling’ the state processes. Since their first appearance in Brand (1997), they have been further developed and applied, for example, to classify electroencephalography data (Michalopoulos and Bourbakis, 2014), to model interactions of suspects in forensics (Brewer et al., 2006) and to detect bradycardia events from electrocardiography data (Ghahjaverestan et al., 2016). CHMMs can be considered as established tools within the engineering literature, where they are commonly applied in classification tasks, for example, emotion recognition from audio–visual signals (Lin et al., 2012) or gesture recognition from hand tracking data (Brand et al., 1997). As a full probabilistic model for sequential data, CHMMs can however also be useful for other inferential purposes, including forecasting future observations as well as general inference on the data-generating process.
In this work, we argue that the full potential of CHMMs for such statistical modelling challenges to date has not been recognised, as evidenced by the fact that these models have only very rarely been used in such a context; some notable exceptions are Sherlock et al. (2013), Johnson et al. (2016) and Touloupou et al. (2020). We set out to fill this gap by introducing the CHMM formulation, in particular discussing the various simplifying assumptions that one may or may not want to make, and by presenting inferential tools available for CHMMs. Furthermore, we discuss the inclusion of covariates and introduce a coupled Markov-switching regression (CMSR) model which allows the observed variables to depend on covariates. Simulation studies are used to highlight practical issues that are relevant when modelling multiple interacting processes, thereby showcasing the potential benefits of the CHMM framework compared to more basic model formulations. Finally, we illustrate the practical use of CHMMs in two case studies. First, we consider a simple CHMM for studying the behaviour of a dolphin mother and calf pair. Second, we apply a CMSR model to electronic health record data collected by the University of California in Los Angeles (UCLA) to model the evolution of important vital signs over time, controlling for age and sex of the patients. A detailed model comparison for both case studies as well as data and
Hidden and coupled hidden Markov models
Hidden Markov models
Basic model formulation and inference
An HMM is a doubly stochastic process comprising an observable time series
The HMM likelihood can be written as
HMMs for multivariate time series
We now consider multivariate time series
Since such multivariate HMMs assume the observed processes to be driven by a single state sequence, the
Coupled hidden Markov models
Consider
Cartesian product model
Instead of modelling each state variable
Dependence structure of the Cartesian product CHMM with
We note here that the use of the label ‘coupled HMM’ is not consistent in the literature, and that the Cartesian product model is not always regarded as a CHMM (see, for example, Brand, 1997; Brand et al., 1997; Nefian et al., 2002). Other authors use the Cartesian product formulation as a framework for estimation of other coupled models (see, for example, Rezek et al. 2000; Ghosh et al. 2017). In this contribution, the label CHMM refers to all models that couple several HMMs via the state process, and we regard the Cartesian product model as one way to specify such a CHMM.
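To make the Cartesian product formulation concrete, the following minimal Python sketch enumerates the composite state space for two state variables and builds a transition probability matrix (TPM) over it. It relies on a standard fact rather than on anything specific to this article: if the individual chains were independent, the joint TPM would be the Kronecker product of the individual TPMs, so that independent HMMs arise as a special case of the Cartesian product model. The function names and all numerical values are hypothetical and chosen for illustration only.

```python
from itertools import product


def composite_states(n_states):
    """Enumerate all state tuples, given the number of states of each variable."""
    return list(product(*(range(n) for n in n_states)))


def kronecker(a, b):
    """Kronecker product of two square matrices given as nested lists.

    Row (i, k) and column (j, l) of the result hold a[i][j] * b[k][l],
    which matches the ordering produced by composite_states.
    """
    return [
        [a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
        for i in range(len(a))
        for k in range(len(b))
    ]


# Two state variables with two states each (hypothetical values).
states = composite_states([2, 2])
gamma_1 = [[0.9, 0.1], [0.2, 0.8]]  # TPM of the first chain
gamma_2 = [[0.7, 0.3], [0.4, 0.6]]  # TPM of the second chain

# Joint TPM over the four composite states under independence.
joint_tpm = kronecker(gamma_1, gamma_2)
```

A general Cartesian product CHMM replaces `joint_tpm` by an unrestricted row-stochastic matrix of the same dimension, which is what allows the state variables to interact.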
The Cartesian product model contains instantaneous correlations between the states, that is, the transition probabilities
CHMM structure with contemporaneous conditional independence assumption for
time series
This model formulation involves
In the CHMM representations discussed above, there is no parameter explicitly representing direct variable-to-variable effects, which makes interpretation difficult (Brand, 1997). Saul and Jordan (1999) offer a remedy to this caveat by combining the contemporaneous conditional independence assumption (2.1) with a mixture representation for the marginal transition probabilities:
The CHMM originally proposed by Brand (1997) is described by a factorisation based on contemporaneously conditionally independent state variables:
In a Bayesian framework, Sherlock et al. (2013) propose to directly model the influence of state
Coupled Markov-switching regression
We now turn to models which account for the influence of covariates. For example, the transition probabilities of the state process of an HMM can be expressed as a function of covariates using an appropriate link function such as the multinomial logit (Zucchini et al., 2016). While this approach can in principle be applied to CHMMs, it will often be infeasible as even a basic CHMM typically involves a high number of transition probabilities, such that model complexity can be prohibitive. The incorporation of covariates into the observation process — often referred to as Markov-switching regression (MSR; Langrock et al., 2017) — is more promising for the CHMM setting. MSR models were first introduced for econometric time series, in which case they can be used, for example, to investigate if covariate effects differ between periods of high and low economic growth, respectively (Hamilton, 2008). The MSR framework can be transferred to the CHMM setting by relating the
Simulation study
We provide simulation experiments to illustrate the consequences of neglecting or misspecifying the dependence structure in the state process. More specifically, we simulate data from a CHMM as the true data-generating process — that is, multiple time series with interacting underlying state processes — and demonstrate the consequences of either completely neglecting the interaction (by fitting separate univariate HMMs) or incorrectly assuming full synchronicity (by fitting a multivariate HMM).
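As an illustration of what simulating from such a model involves, the sketch below draws two coupled state sequences and the corresponding observations from a Cartesian product CHMM with two states per variable. It is not the simulation design used in this article: the TPM, the Gaussian state-dependent distributions, the uniform initial distribution and all names are assumptions made purely for illustration.

```python
import random

# Composite states (state of variable 1, state of variable 2).
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Hypothetical TPM over composite states: the chains tend to be in the
# same state, but asynchronous state pairs occur and can persist.
TPM = [
    [0.90, 0.04, 0.04, 0.02],
    [0.10, 0.70, 0.05, 0.15],
    [0.10, 0.05, 0.70, 0.15],
    [0.02, 0.04, 0.04, 0.90],
]

# Hypothetical state-dependent means, MEANS[variable][state].
MEANS = [[0.0, 3.0], [0.0, 3.0]]


def simulate_chmm(tpm, states, means, sd, n, seed=1):
    """Simulate n time points from a Cartesian product CHMM.

    Observations are assumed Gaussian and conditionally independent
    across variables, given the composite state.
    """
    rng = random.Random(seed)
    k = len(states)
    idx = rng.randrange(k)  # uniform initial distribution (assumption)
    state_seq, obs = [], []
    for _ in range(n):
        s = states[idx]
        state_seq.append(s)
        obs.append(tuple(rng.gauss(means[v][s[v]], sd) for v in range(len(s))))
        # Draw the next composite state from the current row of the TPM.
        idx = rng.choices(range(k), weights=tpm[idx])[0]
    return state_seq, obs


state_seq, obs = simulate_chmm(TPM, STATES, MEANS, sd=1.0, n=500)
```

Fitting separate univariate HMMs to the two components of `obs` would ignore the dependence encoded in `TPM`, whereas a multivariate HMM would rule out the asynchronous pairs `(0, 1)` and `(1, 0)`.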
The data-generating process we consider is a Cartesian product CHMM with
Estimation accuracy
Figure 3 displays the state-dependent densities as obtained in the 1 000 runs, for each of the three model formulations considered. Under the correct CHMM specification, but also under the incorrect model specification using two separate univariate HMMs, the true state-dependent densities were generally well recovered in the estimation. In other words, even when neglecting the correlation of the two state processes the estimation is fairly accurate at the level of the observation process.
Estimated state-dependent densities obtained in 1 000 simulation runs. The upper panel displays the results of the fitted CHMMs, the middle panel corresponds to the multivariate HMMs and the bottom panel to the estimated univariate HMMs. The thick lines show the true underlying densities
The classification performance is compared using the globally decoded (Viterbi) state sequences obtained for both the training and the test data. Table 1 displays the average percentage of falsely decoded states across all simulation runs under the univariate HMMs, the multivariate HMM and the CHMM, respectively. The multivariate HMM has the largest classification error as it cannot correctly identify the state pair if
Average percentage of falsely decoded states in the Viterbi sequence
To compare the forecasting performance, we consider the conditional log-likelihood of the test set given the training data,
In summary, our simulations show that misspecifying the dependence structure in the state process has various undesirable consequences. Mistaking two separate, highly correlated state sequences for a single state sequence led to substantially biased estimators, a high classification error and poor forecasting performance. Distinguishing two such state sequences but failing to account for their correlation negatively affected the forecasting and classification performance.
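The conditional log-likelihood of a test set given the training data, as used in the forecasting comparison, can be obtained as the difference between two log-likelihoods, log L(training and test data) minus log L(training data), each evaluated with the forward algorithm. The following sketch illustrates this for a univariate HMM with Gaussian state-dependent distributions; the Gaussian assumption and all function names are ours and serve only as an illustration. For a Cartesian product CHMM the same recursion would run over the composite states, with the state-dependent density replaced by the product of the variable-specific densities under conditional independence.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of the normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def forward_loglik(obs, delta, tpm, means, sds):
    """HMM log-likelihood via the scaled forward algorithm.

    delta is the initial state distribution, tpm the transition
    probability matrix, means/sds the state-dependent parameters.
    """
    n = len(delta)
    phi = [delta[j] * normal_pdf(obs[0], means[j], sds[j]) for j in range(n)]
    total = sum(phi)
    loglik = math.log(total)
    phi = [p / total for p in phi]  # rescale to avoid numerical underflow
    for x in obs[1:]:
        phi = [
            sum(phi[i] * tpm[i][j] for i in range(n))
            * normal_pdf(x, means[j], sds[j])
            for j in range(n)
        ]
        total = sum(phi)
        loglik += math.log(total)
        phi = [p / total for p in phi]
    return loglik


def conditional_loglik(train, test, delta, tpm, means, sds):
    """log f(test | train) = log L(train, test) - log L(train)."""
    full = forward_loglik(train + test, delta, tpm, means, sds)
    return full - forward_loglik(train, delta, tpm, means, sds)
```

Here `train` and `test` are plain lists of observations, with the test set directly following the training data in time.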
Case studies
We illustrate the application of CHMMs in two case studies. First, we analyse movements of a dolphin mother and its calf using a Cartesian product CHMM. Subsequently, we apply a CMSR model to data on vital signs of patients hospitalised in the intensive care unit (ICU), controlling for sex and age. Parameters were estimated via numerical likelihood maximisation using the R function
Movements of dolphin mother and calf
HMMs are routinely used to analyse animal movement data, with the model's state process interpreted as a proxy for an animal's behavioural modes (e.g., resting, foraging or relocating) determining the observed movement patterns (Langrock et al., 2012). Here we consider movement data from a bottlenose dolphin mother and calf pair which was simultaneously tagged with 3D accelerometers and magnetometers for
It is certain that the two animals interact, that is, that the behaviours of mother and calf influence each other. To account for these interactions, instead of fitting two separate univariate HMMs, one to each individual, we consider CHMMs within which the two animals’ separate behavioural state sequences are correlated. To avoid restrictive assumptions regarding the interaction, we use a Cartesian product CHMM with bivariate state vectors; indeed, the AIC favoured this ‘full’ CHMM over the alternative model formulations that involve more restrictive assumptions (an AIC-based model comparison is provided in the Online Supplementary Material). Tortuosity was modelled using state-dependent beta distributions. The observed zeros (
Estimated state-dependent distributions for tortuosity of the dolphin mother and calf, respectively, weighted by the stationary distribution of the bivariate Markov chain
The estimated state-dependent beta distributions are displayed in Figure 4. For both animals, the model identifies similar movement patterns, with state 1 capturing low tortuosity values (approximate straight-line movement; means
Tortuosity time series of dolphin mother and calf. Viterbi-decoded states differing between mother and calf are highlighted
Steady-state (stationary) probabilities of the state process as implied by the estimated TPM
The identification of such differences can be used as a starting point for further biological inference. For example, environmental covariates could be incorporated for further investigations into the role and the causes of different state combinations. Overall, the results suggest that the movement behaviour of mother and calf is well adapted to each other.
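The stationary probabilities of a state process are fully determined by its TPM: they form the distribution that is left unchanged by one transition step. The sketch below computes them by power iteration and marginalises the resulting distribution over composite states to obtain the stationary state probabilities of a single individual. The function names and the numerical values in the usage example are hypothetical and are not the estimates obtained for the dolphin data.

```python
def stationary_distribution(tpm, tol=1e-12, max_iter=10000):
    """Stationary distribution of a Markov chain by power iteration.

    Assumes an irreducible, aperiodic chain so that the iteration converges.
    """
    n = len(tpm)
    delta = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(delta[i] * tpm[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, delta)) < tol:
            return new
        delta = new
    return delta


def marginal(delta, states, variable, n_states):
    """Marginal stationary distribution of one state variable."""
    out = [0.0] * n_states
    for prob, s in zip(delta, states):
        out[s[variable]] += prob
    return out


# Usage with a hypothetical two-state TPM.
delta_example = stationary_distribution([[0.9, 0.1], [0.2, 0.8]])
```

Applied to the TPM of a Cartesian product CHMM, `stationary_distribution` returns one probability per state pair, and `marginal` then gives the long-run proportion of time each animal spends in each of its own states.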
In our second case study, we analyse electronic health record (EHR) data of patients hospitalised in the ICU of the Ronald Reagan UCLA Medical Center. We use a subset of the data also considered in Alaa and van der Schaar (2018) and Alaa et al. (2018). ICU patients usually suffer from severe illnesses and injuries and are intensively observed by the nurses and physicians. However, as these patients are at increased risk, it is important to understand the progression of diseases and to identify early indications of a forthcoming deterioration. Modelling and analysing the physiological processes over time could help to detect critical developments early and support the decision-making of the physicians. State-switching time series models provide an intuitive and convenient framework for modelling the evolution of a system over time, and hence for quantifying the risk of an impending deterioration of a patient's health state.
The data contain hourly measurements of four major vital signs: heart rate (in beats per minute, bpm), respiratory rate (in breaths per minute, bpm), and systolic and diastolic blood pressure (in millimetres of mercury, mmHg). We did not consider diastolic blood pressure as it is strongly correlated with systolic blood pressure (Pearson correlation of 0.58). The dataset further contains information about sex, age, admission type and location for each patient. The medical diagnosis, however, is omitted. In order to reduce the substantial patient heterogeneity caused by the underlying diseases, in this case study we consider only the patients who undergo dialysis, and restrict our analysis to patients with known sex and age who stayed in the ICU for more than
The observed vital signs do not evolve synchronously over time—for example, an increase in the heart rate is not necessarily accompanied by a change in blood pressure (cf. Figure 6).
Example time series for heart rate and systolic blood pressure, respectively. The dashed lines highlight intervals with an elevated heart rate that does not appear to be in synchrony with the evolution of the observed systolic blood pressure
Figure 7 illustrates the estimated state-dependent distributions for male patients with the median age
Estimated state-dependent distributions for heart rate, respiratory rate and systolic blood pressure, respectively, for 62-year-old males
Table 3 gives the estimates of the parameters associated with the state-dependent process, showing only small effects of the covariates considered. According to the model, we would expect to observe slightly lower heart rates, respiratory rates and systolic blood pressures for older patients. In the case of respiratory rate and systolic blood pressure this is an unexpected result, which may be due to the exceptional circumstances of the patients considered, all of whom were being treated in the ICU.
Estimated parameters (and standard errors) associated with the state-dependent distributions for heart rate, respiratory rate and systolic blood pressure, respectively
The estimated effects of sex are relatively small.
The diagonal elements of the estimated
Off-diagonal elements of the estimated transition probability matrix. The diagonal entries lie between
The main advantage of the full Cartesian product CMSR model is that it allows us to derive a completely data-driven dependence structure describing how the multivariate state process evolves over time. While our model is still somewhat simplistic, for example, with regard to the conditional independence assumption, it offers an idea of the type of inference that can be made on the joint evolution of heart rate, respiratory rate and blood pressure. Such results could be used, for example, to develop risk scores based on the probabilities of switching to deterioration states or to cluster the different courses of diseases based on the patients’ Viterbi sequences.
CHMMs constitute a natural extension of basic HMMs to address scenarios with multiple time series whose underlying state processes interact. The explicit modelling of dependencies between the state variables can increase estimation accuracy, may decrease state classification error and generally provide new opportunities for meaningful inference related to the correlation between processes. The potential of CHMMs has already been recognised in particular in engineering, where these models have been applied in various classification and signal processing tasks such as action recognition (Brand et al., 1997), audio–visual speech recognition (Nefian et al., 2002), bearing fault recognition (Zhou et al., 2016), and EEG, ECG and PCG classification (Michalopoulos and Bourbakis, 2014; Oliveira et al., 2002). Due to technological advances for example in animal tracking and in EHRs (as illustrated in Section 4), and generally the rapid growth in the amount of multi-stream data collected, we anticipate CHMMs to gain popularity also in other statistical modelling tasks such as forecasting or general inference on data-generating processes. In addition to the application areas showcased in the present article, CHMMs could for example be useful to model the spread of infection in individual-based epidemic models (Touloupou et al., 2020), for exploiting dependencies between different economic markets in financial risk management (Cao et al., 2019) or to accommodate the spatio-temporal correlation of meteorological and geophysical time series (Stoner and Economou, 2019).
The main barrier to CHMMs becoming much more widely used in applied statistics is the models’ complexity arising from a curse of dimensionality: the number of model parameters very rapidly increases as the number of state variables or the number of states per variable increases, leading to high computational costs and numerical problems. Without imposing constraints on the model structure, CHMM-based analyses thus risk being limited to scenarios with only moderate numbers of variables and states. One possible way forward may be
Possible hierarchical model with a global state
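The growth in the number of transition probabilities can be quantified with a short calculation. The sketch below counts the free transition probabilities for two model structures, assuming for simplicity that every state variable has the same number of states: the unrestricted Cartesian product model, whose TPM over the composite states has one free row-stochastic row per composite state, and a model in which the state variables are conditionally independent given the previous composite state. Parameters of the state-dependent distributions and of the initial distribution are not counted, and the function names are hypothetical.

```python
def n_params_cartesian(n_vars, n_states):
    """Free transition probabilities of the unrestricted Cartesian product model.

    The TPM has k = n_states ** n_vars rows, each with k - 1 free entries.
    """
    k = n_states ** n_vars
    return k * (k - 1)


def n_params_cond_indep(n_vars, n_states):
    """Free transition probabilities under contemporaneous conditional independence.

    Each variable has, for every previous composite state, a distribution
    over its own n_states states, i.e. n_states - 1 free entries.
    """
    k = n_states ** n_vars
    return n_vars * k * (n_states - 1)


# Number of free transition probabilities for a few configurations.
for n_vars, n_states in [(2, 2), (3, 2), (3, 3)]:
    print(
        n_vars,
        n_states,
        n_params_cartesian(n_vars, n_states),
        n_params_cond_indep(n_vars, n_states),
    )
```

With three variables and three states each, for instance, the unrestricted model already involves 702 free transition probabilities, compared with 162 under contemporaneous conditional independence.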
Acknowledgments
The authors received no financial support for the research, authorship, and/or publication of this article.
