Abstract
Background:
Clinical studies of telemedicine (TM) programs for chronic illness have demonstrated mixed results across settings and populations. With recent uptake in use of digital health modalities, more precise patient classification may improve outcomes, efficiency, and effectiveness.
Objective:
The purpose of the research was to develop a predictive score that measures the influence of patient characteristics on TM interventions. The central hypothesis is that disease type, illness severity, and the social determinants of health influence outcomes, including resource utilization, and can be precisely characterized.
Methods:
The retrospective study evaluated the feasibility of creating a patient “Telemedicine ImPact” (TIP) score derived from a Virginia Medicare and Medicaid claims data set. Claims were randomly selected, stratified by disease type, and matched by illness severity into a TM intervention group (N = 7,782) and a nontelemedicine “usual care” control cohort (N = 7,981). The individual records were then summarized into 15,762 cases with 80% of the cases used to develop, train, and test four predictive models (hospital utilization, readmissions, total utilization, and mortality) using 10-fold cross-validation.
Results:
Bayesian supervised machine learning achieved reference model performance index area under the curve for receiver operating characteristic (AUC/ROC) ≥0.85. Posterior probabilities for each outcome model were generated on a “hold-back” set of 3,082 cases. Robust parametric statistical methods enabled dimension reduction, model validation, and derivation of a reliable composite scaled score that quantified the overall health risk for each case. The TM intervention cohort demonstrated higher total utilization (representing the sum of inpatient, outpatient, and prescription use) and lower mean inpatient utilization than the usual standard of care. This finding suggests TM-based care may shift the composition of health resource utilization, reducing hospitalizations while increasing outpatient services, adjusted for patient differences.
Conclusions:
The creation of a patient score using machine learning to predict the effect of TM on outcomes is feasible. Adoption of the TIP score may reduce variability in results by more precisely accounting for the effects of patient characteristics on health outcomes and utilization. More consistent outcome prediction may lead to greater support for digital health.
Introduction
The results of multiple studies point to disparate health outcomes originating in the social determinants of health (SDOH). 1,2 SDOH may contribute to the prevalence of chronic illness and growing health care inequity in diverse vulnerable patient populations. 3–5 The presence of anxiety or depression is positively correlated with increased hospitalizations. 6–8 Depression reduces adherence to treatment in patients with chronic illness and can worsen symptoms. 9,10 Poor health literacy and poverty may limit access to care and self-efficacy and reduce adherence to treatment. 11–13
Telemedicine (TM) is thought to support earlier intervention, reducing the effects of illness exacerbations and readmissions. 14 Studies have documented positive effects in improving access to care, health outcomes, and self-management. 15–18 Despite an abundance of studies, mixed results are prevalent. 19–23 The challenge remains to devise more effective digital health strategies that reach diverse populations with chronic conditions. 24–27
The strategic use of data to identify high-risk patients and optimize treatment strategies has a long history and continues to find novel applications across clinical research and practice. 28–30 Risk classification is used to assess lifestyle factors that increase health risks, with significant positive public health benefits. 31,32 A risk model in this context could be applied to screen for and tailor TM treatment to better match patient need. The model, embodied in a risk-based score, could characterize the impact of patient characteristics on TM outcomes. The development of a patient risk classifier is intended to (1) array the population to identify high-need patients, (2) inform tailoring of TM interventions through objective criteria, and (3) adjust for patient-specific differences to increase the likelihood of achieving desired outcomes.
Research Questions and Hypothesis
The primary objective of this study was to create a reliable, valid score to control for patient characteristics in the assessment of TM effectiveness. The central hypothesis is that disease diagnosis, illness severity, and the SDOH can be used to develop a robust and valid score that controls for patient variability. Adequate study controls increase the likelihood of discerning the true treatment effects of TM-based programs for chronic illness management.
The research questions guided the analysis to (1) identify the most influential characteristics from a set of candidate predictor and target outcome variables, generate probabilistic models, and create a composite TIP score quantifying patient health risks, (2) determine whether the score covaries by disease, and (3) assess group differences in resource utilization by comparing those who use TM with a “usual standard of care,” nontelemedicine (NTM) cohort. The study sought to establish the predictive validity of a TIP score that generalizes across patient characteristics, including disease type.
Materials and Methods
Design
The research was approved by Virginia Commonwealth University Office of Research Subjects Protection under the exempt category per 45 CFR 46 with the study number IRB HM 20017231. The retrospective cohort design included a TM intervention group (N = 7,782) and NTM control cohort (N = 7,981). Randomized selection into each cohort used intervention or control criteria, including stratification by disease type and matching by illness severity.
Sample
Claims from adults (21 years and older) in the Virginia Health Institute Commonwealth of Virginia All Payer Claims Database (APCD) provided the authoritative source for the data. 33 Cases included four primary chronic illnesses: heart failure (HF), chronic obstructive pulmonary disease (COPD), diabetes, and hypertension. In addition, co-occurring illnesses included anxiety, depression, substance use, and end-stage kidney disease. 34–36 International Classification of Diseases (ICD) 9/10 diagnosis codes identified the selected chronic illnesses. 37 At least one chronic illness from this set of target diagnoses was required for inclusion. A risk adjustment factor, the Milliman Advanced Risk Adjuster for concurrent risk of hospitalization (included in the APCD), provided the illness severity cohort matching criteria. 38,39 The Centers for Medicare and Medicaid Services “Healthcare Common Procedure Coding System” and American Medical Association Current Procedural Terminology code sets (and code modifiers) were used to select the appropriate encounters into the TM and NTM cohorts. 40
Approach
To answer research question 1, robust parametric and nonparametric techniques were used for variable selection and model creation. A bivariate correlation matrix, principal components analysis, and Bayes unsupervised learning supported the exploratory analysis and dimension reduction, which selected input parameters from a larger set of candidate variables. Final models were created using Bayes supervised machine learning. Each outcome model was resampled using 10-fold cross-validation, with a target model performance criterion of area under the curve for receiver operating characteristic (AUC/ROC) >0.80. Posterior probabilities generated from the outcome models were scaled to derive a single composite score for each case in the hold-back set of data. Analysis of covariance (ANCOVA) tests, using the same set of Bayesian model predictors, provided confirmation of results.
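The resampling step can be illustrated with a minimal sketch. It assumes synthetic data and swaps in a small Gaussian naive Bayes classifier as a stand-in for the Bayesian network models the study built in Bayesia Lab; the function names (`auc`, `fit_gaussian_nb`, `posterior_positive`, `cross_validated_auc`) are hypothetical and only show how a 10-fold cross-validated AUC can be computed.

```python
import math
import random
from statistics import mean, pvariance


def auc(scores, labels):
    """Rank-based AUC: chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_gaussian_nb(X, y):
    """Estimate a prior, feature means, and feature variances for each class."""
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        cols = list(zip(*rows))
        model[c] = (
            len(rows) / len(X),
            [mean(col) for col in cols],
            [pvariance(col) + 1e-9 for col in cols],  # small floor avoids zero variance
        )
    return model


def posterior_positive(model, x):
    """Posterior P(y = 1 | x) under the naive independence assumption."""
    log_joint = {}
    for c, (prior, mus, variances) in model.items():
        ll = math.log(prior)
        for v, mu, var in zip(x, mus, variances):
            ll += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        log_joint[c] = ll
    top = max(log_joint.values())  # subtract the max for numerical stability
    weights = {c: math.exp(v - top) for c, v in log_joint.items()}
    return weights[1] / (weights[0] + weights[1])


def cross_validated_auc(X, y, k=10, seed=0):
    """Mean AUC over k folds; each fold is scored by a model trained on the other folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    fold_aucs = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        scores = [posterior_positive(model, X[i]) for i in fold]
        fold_aucs.append(auc(scores, [y[i] for i in fold]))
    return mean(fold_aucs)


# Synthetic illustration only: one informative feature, one noise feature.
rng = random.Random(42)
y = [i % 2 for i in range(400)]
X = [[rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]
cv_auc = cross_validated_auc(X, y)
print(f"10-fold cross-validated AUC: {cv_auc:.3f}")
```

Averaging the fold-level AUC values is one common convention; pooling the held-out predictions before computing a single AUC is another, and the source does not state which was used.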
To answer research question 2, an ANCOVA test used the TIP score as the outcome variable, with patient characteristics (including disease diagnosis) as covariates, and the cohort designation (TM or NTM) as the predictor variable. This test sought to identify the explained variance in the score associated with a patient's chronic illness.
To answer research question 3, a multivariate analysis of covariance (MANCOVA) test was conducted using hospital and total utilization as outcome variables. The score and disease-related variables were entered as covariates, with the intervention cohort (TM or NTM) entered as the predictor variable. The covariates eliminated the variance associated with disease, leaving the remaining variance explained by the intervention. IBM SPSS (Version 26) and Bayesia Lab (Version 8) were used for the data analysis. 41,42
Data Collection
A total of 15,763 unique cases, aged 21 years and older, were derived from ∼3,500,000 claims records. Multiple claims were aggregated to a single deidentified (patient) case with utilization data for inpatient, outpatient, and prescription medications. After data cleaning and integrity checks, the training data set sample size was 12,680 cases. After eliminating one outlier, the “hold-back” set of data comprised 3,082 cases. Four variables (two predictors and two outcome variables) with positive skew were log transformed. Tables 1 and 2 detail the set of predictors and outcome variables, respectively, included in the data analysis. The variable coding procedures for aggregation from individual claims to deidentified cases are detailed in the Supplementary Appendix A. Table 3 lists the patient attributes of the TM intervention and NTM “usual standard of care” control groups. The cohorts appeared to be reasonably matched by sample size and demographics.
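The aggregation, transformation, and hold-back steps might look roughly like the sketch below. The claim records, field names, and helper functions are hypothetical; the use of `log1p` (so that zero utilization stays defined) is an assumption, since the text says only that skewed variables were log transformed.

```python
import math
import random
from collections import defaultdict

# Hypothetical claim records: (case_id, service_type, units of use).
claims = [
    ("A", "inpatient", 2), ("A", "outpatient", 5), ("A", "prescription", 12),
    ("B", "outpatient", 1), ("B", "prescription", 3),
    ("C", "inpatient", 9), ("C", "outpatient", 40), ("C", "prescription", 80),
    ("D", "outpatient", 2),
    ("E", "prescription", 6),
]


def aggregate_claims(records):
    """Collapse many claims into one utilization row per deidentified case."""
    cases = defaultdict(lambda: {"inpatient": 0, "outpatient": 0, "prescription": 0})
    for case_id, service, units in records:
        cases[case_id][service] += units
    return dict(cases)


def add_log_total(cases):
    """Add total utilization and a log-transformed copy to reduce positive skew."""
    for row in cases.values():
        total = row["inpatient"] + row["outpatient"] + row["prescription"]
        row["total"] = total
        row["log_total"] = math.log1p(total)  # assumption: log(1 + x)
    return cases


def split_hold_back(case_ids, train_fraction=0.8, seed=0):
    """Randomly reserve a hold-back set that is never used for model training."""
    ids = sorted(case_ids)
    random.Random(seed).shuffle(ids)
    cut = round(train_fraction * len(ids))
    return ids[:cut], ids[cut:]


cases = add_log_total(aggregate_claims(claims))
train_ids, hold_back_ids = split_hold_back(cases)
print(len(train_ids), "training cases;", len(hold_back_ids), "hold-back cases")
```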
Table 1. Candidate Predictor Variables
APCD, All Payer Claims Database; COPD, chronic obstructive pulmonary disease; HF, heart failure; ICD, International Classification of Diseases; MARA, Milliman advanced risk adjuster; NTM, nontelemedicine; TM, telemedicine; VHI, Virginia Health Institute.
Table 2. Candidate Outcome Variables
Table 3. Patient Attributes in Cohort Data
SD, standard deviation; TM, Telemedicine.
Data Analysis and Results
Model Development
The first research question was informed by the results of parametric approaches to variable selection and Bayes machine learning on the set of predictors to target outcomes. Four predictive models were generated: hospital utilization, readmissions, total utilization, and mortality. Outcome variable factor loading results are detailed in the Supplementary Appendix B. A subset of 8 (of 23) candidate independent variables was found to be predictive of outcomes. Different subsets of patient characteristics served as predictors for separate outcome models, as given in Table 4. One variable, the illness severity risk adjuster, served as a predictor across the four models. Five other variables (the intervention cohort [TM or NTM], technical experience, Medicare/Medicaid subscriber status, depression, and age) appear in more than one model. Two variables (the number of comorbidities and provider zip code poverty rate) were each found to be predictors in a single Bayes outcome model (total utilization and readmissions, respectively). The four Bayes network models, including performance metrics, are detailed in the Supplementary Appendices C–F. Each outcome model achieved a reference index ROC ≥0.85.
Table 4. Summary of Model Predictors by Target Outcome
ROC, receiver operating characteristic.
Follow-up ANCOVA tests were conducted as confirmation (and comparison) for the two continuous outcome variables, inpatient and total utilization. The intervention cohort served as the predictor in each test. Parameter estimates are given in Table 5. The only difference between the Bayes model and the ANCOVA results was the age variable for total utilization, which was not significant.
Table 5. Parameter Estimates for Inpatient and Total Utilization
ANCOVA confirmatory results provided for the two continuous Bayes outcome models.
TM and NTM cohorts.
ANCOVA, analysis of covariance; CI, confidence interval.
Note: Parameter estimates provided with p-values, partial η² effect size, and robust standard errors (99% CI).
TIP (Composite Score) Development
A batch utility generated posterior probabilities for each of the four outcomes in the hold-back set of 3,082 cases. An adjustment factor uniformly discounted for the differing risk probabilities across the discrete categorical distributions for each Bayes model outcome; the individual item probabilities were then scaled and summed into a single composite TIP score for each case. Robust percentile estimates placed the TIP score between 1.61 at the 5th percentile and 3.89 at the 95th percentile. The distribution is negatively skewed, with lower scores corresponding to higher risk patients (scores of ≤2.56) and higher scores corresponding to healthier patients (≥3.70). A perfect score (lowest possible risk) equals 4.00, and the highest health risk scores could, in theory, asymptotically approach 0. Using test construction evaluation measures, including item test analysis, the TIP score was found to be reliable (Cronbach's alpha = 0.654) and demonstrated criterion validity.
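A rough sketch of the scoring idea follows, under stated assumptions. Each of the four outcome models is assumed to yield a posterior probability of an adverse outcome, and each item is scored as one minus that probability, so that four items sum to a maximum of 4.00. The study's adjustment factor is not specified in the text and is not reproduced here; `tip_like_score` and the sample probabilities are hypothetical. Cronbach's alpha uses its standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of the total).

```python
from statistics import variance


def tip_like_score(adverse_probs):
    """Sum of (1 - p) over the outcome models: 4.0 is lowest risk, values near 0 are highest risk."""
    return sum(1.0 - p for p in adverse_probs)


def cronbach_alpha(item_rows):
    """Cronbach's alpha for a list of cases, each a list of k item scores."""
    k = len(item_rows[0])
    columns = list(zip(*item_rows))
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical posterior probabilities of an adverse outcome per case, in the order
# hospital utilization, readmission, total utilization, mortality.
posteriors = [
    [0.10, 0.05, 0.20, 0.01],
    [0.60, 0.40, 0.70, 0.15],
    [0.30, 0.20, 0.35, 0.05],
    [0.80, 0.55, 0.75, 0.30],
    [0.05, 0.02, 0.10, 0.01],
]

item_scores = [[1.0 - p for p in row] for row in posteriors]
scores = [tip_like_score(row) for row in posteriors]
alpha = cronbach_alpha(item_scores)
print([round(s, 2) for s in scores])
print(f"Cronbach's alpha on the illustrative items: {alpha:.3f}")
```

Because the illustrative probabilities are invented, the resulting alpha says nothing about the reported value of 0.654; the sketch only shows how item scores, a composite, and a reliability estimate relate.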
Cohort Differences in Mean TIP Scores
The second research question provides validation of the score across disease-related factors including comorbidities. ANCOVA was conducted using the TIP as the outcome variable. Covariates included the number of comorbidities, illness severity, HF, COPD, diabetes, and hypertension diagnoses. The TM intervention served as the predictor of interest. The test was significant for the intervention (F(1,3074) = 1,218.14, p < 0.001, partial η² = 0.280) with an adjusted R² = 0.85. The significant control variables included the number of comorbidities (β = −0.137, p < 0.001, partial η² = 0.158) and the illness severity risk adjustment factor (β = −1.0, p < 0.001, partial η² = 0.521). There was a statistically significant mean difference in the TIP score between the TM intervention and the usual standard of care. The score was lower for TM cases (mean = 2.834) than for usual care (NTM) cases (mean = 3.256) (99% confidence interval [CI]). Robust parameter estimates are given in Table 6. TIP appears to improve outcome prediction by identifying a patient's likely response to TM, while controlling for diagnosis, illness severity, and comorbidities. One covariate in the model, a hypertension diagnosis, was found significant with a small effect size (p < 0.001, partial η² = 0.011). With that exception, the score appears to be independent of disease category and, therefore, generalizable.
Table 6. TIP Score (Outcome Variable) Analysis of Covariance Summary of Parameter Estimates
TIP = telemedicine impact.
Bolded rows are significant at p < 0.001 and effect size ≥0.01 with robust standard errors. Covariates appearing in the model are evaluated at the following values: comorbidities = 2.62, illness severity = 0.8002, HF = 0.20, COPD = 0.27, diabetes = 0.31, and hypertension = 0.66.
Resource use in TM
The third research question was informed by statistically significant group differences in resource use found between patients using TM and the usual standard of care. MANCOVA identified a significant effect for hospital and total utilization with the intervention cohort as the predictor variable (TM or NTM) (λ: F(2,3070) = 85.49, p < 0.001, partial η² = 0.053). TIP and disease diagnoses were entered as the set of control variables. The total utilization adjusted R² = 0.43 and the hospital utilization adjusted R² = 0.48. Table 7 summarizes the robust parameter estimates that met the threshold significance level of p < 0.001 and an effect size (partial η²) ≥0.01 for a large sample. The results highlight the mean differences for each utilization measure. Total utilization is significantly higher for the TM intervention (mean = 3.226) and lower for the NTM group (mean = 2.878) (p < 0.001, 99% CI). By contrast, hospital utilization (figure 36) is significantly lower in the TM cohort (mean = 0.356) and higher in the NTM cohort (mean = 0.479) (p < 0.001, 99% CI). The findings suggest a significant effect on resource utilization associated with the TM intervention, adjusted by the TIP. A “shifting” effect in use of TM was found in reduced inpatient services and increased prescriptions and outpatient services for the management of chronic illness.
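As a quick arithmetic check, partial η² can be recovered from a reported F statistic and its degrees of freedom using the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below assumes this identity applies to the multivariate test here, which holds when the effect has a single dimension, as with a two-group predictor; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# MANCOVA effect of the intervention cohort: F(2,3070) = 85.49.
eta_sq = partial_eta_squared(85.49, 2, 3070)
print(round(eta_sq, 3))  # 0.053, matching the reported effect size
```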
Table 7. Parameter Estimates for Multivariate Analysis of Covariance with Two Outcomes (Hospital and Total Utilization) and Estimated Marginal Means
Covariates appearing in the model are evaluated at the following values: TIP score = 3.0483, HF = 0.20, COPD = 0.27, diabetes = 0.31, hypertension = 0.66, anxiety = 0.28, depression = 0.36, substance use = 0.29, end-stage kidney disease = 0.24, comorbidities = 2.62.
Discussion
Cohort Differences
The differences in TIP scores indicated that patients in the TM cohort had higher health risks. The summary of ANCOVA results for TIP (outcome) and the intervention (predictor) with the disease covariates demonstrated effective control for group differences, with a large proportion of variation explained (adjusted R² = 0.85). The MANCOVA for the two outcome variables (hospital and total utilization) by the intervention, with the TIP score and the individual set of diseases as covariates, resulted in an adjusted R² = 0.48 and 0.43, respectively. This showed a significant share of explained variation in resource use for the intervention, adjusted for disease factors.
The TIP as shown in the two utilization models demonstrated a useful control and highlighted a key effect. The use of hospital services was lower in the TM group than the control. At the same time, the use of outpatient and prescription services was higher than the control. This makes sense in the context of TM programs that deliver care in geographically dispersed rural areas lacking robust medical services infrastructure.
Control Variables and Effect Sizes
The small effect sizes found in this study suggest that smaller experimental studies may not adequately control for subject variation. This could account for mixed results in the form of type 2 errors that mask true treatment effects. Controlling for traditional demographic factors may not adequately cover differences in the patient mix between treatment groups, thereby increasing the potential for selection bias. Other adjustments may be necessary, including better characterization of the patient's individual level of health needs, as suggested by the TIP score distribution. More precise characterization of the impact of the SDOH, including factors relevant to vulnerable underserved populations, may improve accuracy of outcome prediction in planning TM programs. Although the full scope of health determinants is not available in claims data, a significant effect was revealed in the composition of resources in the TM group as compared with usual care.
TM Adoption Policy Barriers
A patient's predisposing, enabling, and need characteristics are important factors in the prediction of health outcomes. 43,44 This study suggests a role for the empirical assessment of a patient's health risks and needs when considering digital health programs to deliver care. The TM intervention cohort was concentrated in distinct areas of the Commonwealth of Virginia lacking the full scope of medical resources. This conforms to the TM reimbursement policy in effect during the years under study (2016–2017). 45,46 Policy rules through 2019 restricted the use of TM to designated rural areas with low medical resources. Yet, even in these areas, barriers to adoption remained. 47–49 Until 2020, TM covered a small set of services and encounters in a limited range of health care settings.
Interim changes in reimbursement rules (during the 2020 pandemic) have increased adoption, since TM encounters protect both provider and patient while providing access to care for a wide range of chronic and acute conditions. It is uncertain whether the short-term changes will revert once the current crisis ends. A future large-scale study, using data from the 2020 interim policy changes as a natural experiment, could complement these findings by increasing understanding of digital health effectiveness beyond low-resourced regions. An assessment of the impact of patient characteristics using the TIP score could discern the effect of TM use across the broader population.
As borne out by the findings in this study and by previous research, TM serves as a distinguishing factor, as well as an essential bridge for delivering medical care to underserved patient populations. The prevalence of co-occurring behavioral health diagnoses (depression, anxiety, and substance use) lends additional credence to the need to identify and stratify the treatment of patients with both chronic illness and additional complex needs to achieve expected outcomes. 50,51 The TIP score could assist by providing an empirical basis for tailoring treatment, including indicating resources and strategies needed to reduce health disparities.
Conclusions
A subset of patient characteristics contributes to the creation of a reliable and valid TIP score to improve patient treatment and the consistency of TM program results. The predictive score demonstrated the association of a few risk factors with resource use. Predictive modeling could lead to improved screening and digital health practice innovations, including development of targeted participation criteria. Classification methods can provide patient-specific risk stratification supporting care planning and resource allocation decisions for large patient case loads. Busy care coordinators and nurse case managers may find the simple indicator provided by the TIP a helpful input to identify patients needing more intensive therapy or more suitable approaches to treatment.
A TIP score may provide an important indicator supporting diverse patient populations with chronic illness. Matching treatment intensity (and support) to patient need using an empirically derived indicator of risk may reduce the high variability in TM programs across settings and populations by adjusting for patient characteristics. More consistent outcome prediction may lead to greater support for digital health.
Footnotes
Acknowledgments
This article is based on the dissertation published in the VCU Scholar's Compass on February 18, 2020 (Crump, 2020). 52 Thank you to Virginia Commonwealth University, The College of Health Professions, the Dissertation Committee, and the staff at the Virginia Health Institute for their amazing support for the research.
Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received for this article.
Supplementary Material
Supplementary Appendix A
Supplementary Appendix B
Supplementary Appendix C
Supplementary Appendix D
Supplementary Appendix E
Supplementary Appendix F
References
