Abstract
Introduction:
Back pain is among the most common presentations in primary care offices. National organizations have standardized the appropriate use of imaging for acute low-back pain (LBP). The objective of this study was to evaluate the use of imaging in LBP between telemedicine and in-person clinical encounters.
Methods:
This retrospective cohort compared secondary data from 20,624 telemedicine and office encounters in a large nonprofit health system from July 1, 2019, to June 30, 2021. The proportion of patients aged 18–50 years who did not receive imaging for acute LBP (X-ray, computed tomography, or magnetic resonance imaging) within 28 days of the provider encounter was measured according to Healthcare Effectiveness Data and Information Set specifications. Performance was compared across race, ethnicity, age, body mass index, overall risk score, and insurance type. Chi-squared tests determined significant differences between cohorts (p < 0.05).
Results:
Patients seen via telemedicine had significantly lower rates of imaging within 28 days of their physician encounter (office: 16.32%, telemedicine: 11.20%; difference: 5.12%; p < 0.01). This was consistent across racial, ethnic, and risk score subgroups.
Discussion:
For practices and health systems, telemedicine might be a higher value approach for initial evaluation of acute LBP in primary care. For policy makers, telemedicine can save on health care costs without negatively impacting quality performance measures.
Conclusions:
Telemedicine is unlikely to compromise quality of acute LBP care, supporting this virtual space as an alternative care venue. The most beneficial use of telemedicine might be triaging initial encounters of acute LBP in primary care. Stronger evidence could support its long-term potential for driving value through cost savings.
Introduction
Back pain is common, with an estimated lifetime prevalence of up to 85%, and is the fifth leading presentation in physician offices. 1 Low-back pain (LBP) correlates with middle age, low fitness level, smoking, low strength of surrounding spinal musculature, excess body weight, psychological factors, and occupational factors. 1 Most patients who experience back pain have underlying chronic back pain that acutely and temporarily worsens, affecting activities of daily living; this common presentation of back pain frequently results in long-term disability. 1,2
Upon presentation, physicians should initially assess for “red flag” symptoms, indicative of potentially serious pathologies. 3 These red flag symptoms include fever, sudden pain with spinal tenderness, trauma, serious medical conditions, and sudden or progressive neurological deficit including bowel or bladder incontinence and saddle paresthesias. 4 Mainstays of conservative treatment include nonsteroidal anti-inflammatory drugs and physical therapy. With conservative management, back pain often resolves within 3 months and imaging is not necessary. 2
Many sources have outlined the appropriate use of imaging for LBP, including the National Quality Forum (NQF) and the Centers for Medicare and Medicaid Services (CMS). The goal of these organizations is to strive continuously in excellence and value-based care for all patients by driving measurable health outcomes. 5 These measurable outcomes are reportable with Healthcare Effectiveness Data and Information Sets (HEDIS), which is used nationally for performance improvement within electronic clinical data systems. HEDIS measures are authored by entities such as the National Committee for Quality Assurance and their Core Measures Quality Collaborative who stipulate criteria for quality performance measures across multiple domains of primary care, including LBP. 5
In accordance with the NQF and CMS guidelines for LBP, early imaging is not recommended in the first 4 weeks without red flags. 3,4,6,7 Imaging may be indicated in patients with red flag symptoms or back pain that persists for more than 3 months and fails conservative measures 3; however, many patients still undergo early imaging, which is associated with greater long-term health care costs and increased days of work missed. 8 Cost savings have been estimated in the hundreds of millions of dollars if these clinical guidelines were followed more closely. 9
During the SARS-CoV-2 pandemic, telemedicine visits increased exponentially in number, becoming a routine alternative venue for health services. 10 Recent studies have compared the visit content between telemedicine and in-person visits, 11 and telemedicine might be a feasible way to mitigate overutilization of health care resources. 12 But despite the pandemic-augmented increase in research surrounding telemedicine, there is limited literature comparing specific quality performance measures between telemedicine and office visits. After the SARS-CoV-2 pandemic resolves, many believe that telemedicine will have continued utility. 13,14 Thus, further study is needed to determine the measurable efficacy and quality of telemedicine visits, specifically for patients with LBP. 11,15
Accordingly, the objective of this study was to compare the HEDIS quality performance of telemedicine and in-person visits for patients with LBP.
Methods
We retrospectively evaluated the quality of in-office and telemedicine outpatient primary care in a large nonprofit health system in South Central Pennsylvania over a 2-year time frame (July 1, 2019, to June 30, 2021). Using Epic's SlicerDicer (a sophisticated electronic medical record [EMR] data mining tool with population-level data analysis capabilities), we acquired de-identified patient data across the integrated 8-hospital health system (>200 outpatient care sites with ∼2,600 clinicians). We compared quality of care between encounter venues by adapting CMS's HEDIS measure 16 for “Use of Imaging Studies for LBP.” The study was deemed exempt from full board review by the WellSpan Health Institutional Review Board (1760263-1), and STROBE format for cohort studies was followed.
A patient data model was used in SlicerDicer to mine numerators and denominators according to CMS's HEDIS measure, which was authored by the NQF 4 and included in the Core Quality Measures Collaborative Consensus Core Set for Accountable Care Organization and Patient Centered Medical Home/Primary Care. 5 The measure is defined as the proportion of patients who did not receive imaging studies (plain X-ray, magnetic resonance imaging [MRI], computed tomography [CT] scan) within 28 days of an encounter when the primary diagnosis was LBP. Inclusion criteria were built according to the HEDIS denominator specification: all 18- to 50-year-olds with a patient encounter for LBP from July 1, 2019, to June 30, 2021. We patterned the HEDIS-specified exclusions to avoid counting patients with diagnoses warranting clinically appropriate imaging (Fig. 1).

Fig. 1. Conceptual schema of SlicerDicer patient data model with filter hierarchy.
We made two notable adjustments to the HEDIS denominator: (1) exclusion of emergency departments from care locations and (2) inclusion of only ambulatory medical service lines and primary care service lines. For numerators, we also patterned the HEDIS specification: patients receiving an X-ray, CT, or MRI within the 28 days following a diagnosis of LBP. To ensure sequential measuring of patient encounter and imaging of acute LBP, we set conditional criteria in SlicerDicer: first visit, then image. In other words, if a patient had a CT, MRI, or X-ray of the back 29 days or more after the encounter, they were not captured in the measure.
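The numerator logic described above (first visit, then image, within 28 days) can be illustrated with a minimal Python sketch. This is not the SlicerDicer data model used in the study; the `Encounter` record, its field names, and the choice to count same-day imaging are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from datetime import date

WINDOW_DAYS = 28  # imaging window taken from the measure definition in the text


@dataclass
class Encounter:
    """Hypothetical record; field names are illustrative, not SlicerDicer's."""
    encounter_date: date
    imaging_dates: list[date] = field(default_factory=list)


def imaged_within_window(enc: Encounter, window_days: int = WINDOW_DAYS) -> bool:
    """True if any back imaging falls 0..window_days days AFTER the encounter.

    Imaging before the encounter is ignored ("first visit, then image").
    Counting same-day imaging (day 0) is an assumption of this sketch.
    """
    return any(
        0 <= (img - enc.encounter_date).days <= window_days
        for img in enc.imaging_dates
    )


def imaging_rate(encounters: list[Encounter]) -> float:
    """Share of encounters followed by imaging inside the window."""
    if not encounters:
        raise ValueError("at least one encounter is required")
    imaged = sum(imaged_within_window(e) for e in encounters)
    return imaged / len(encounters)


def hedis_performance(encounters: list[Encounter]) -> float:
    """Proportion NOT imaged, which is how the measure itself is defined."""
    return 1.0 - imaging_rate(encounters)


# Toy data, invented for illustration only (not study data).
sample = [
    Encounter(date(2020, 1, 1), [date(2020, 1, 29)]),   # day 28: counted
    Encounter(date(2020, 1, 1), [date(2020, 1, 30)]),   # day 29: not counted
    Encounter(date(2020, 1, 1), [date(2019, 12, 31)]),  # before visit: not counted
    Encounter(date(2020, 1, 1)),                        # no imaging
]
```

In this sketch a lower `imaging_rate` corresponds to a higher `hedis_performance`, which matches how the imaging percentages reported in this study relate to the measure.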
We stratified the office and telemedicine data separately by race, ethnicity, age, body mass index (BMI), and overall adult risk score to assess whether demographic representation was comparable. The overall risk score was designed in WellSpan Health's Epic build and profiles patient risk according to the number of diagnoses, social determinants of health, and health care utilization. Scoring categories were defined as: low risk = 0–8, medium risk = 9–16, high risk = >16. We explain details of this scoring system in the supplemental methods of a previous publication. 17 We also measured the composition of exclusions between cohorts to understand if there was a discrepancy (a sampling bias) in the efficiency of detecting red flags between office and telemedicine visits.
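The risk score cutoffs stated above can be written as a small lookup. The sketch below is illustrative only: the function name is hypothetical, the scoring itself is computed inside the health system's Epic build and is not reproduced here, and treating a missing score as a separate "no value" group is an assumption based on how such patients are reported in the Results.

```python
from typing import Optional


def risk_category(score: Optional[float]) -> str:
    """Map an overall adult risk score to the categories given in the text.

    Cutoffs from the text: low = 0-8, medium = 9-16, high = >16.
    A missing score is returned as "no value" (assumption of this sketch).
    """
    if score is None:
        return "no value"
    if score < 0:
        raise ValueError("risk score cannot be negative")
    if score <= 8:
        return "low"
    if score <= 16:
        return "medium"
    return "high"
```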
STATISTICAL ANALYSIS
We calculated percent differences between office and telemedicine quality performance and red flag exclusion composition with N-1 chi-squared tests. The demographics and other frequencies were provided directly by SlicerDicer when data were collected, and MedCalc 18 was used to compare the proportions (HEDIS numerator/HEDIS denominator) of telemedicine and office encounters by race, ethnicity, and overall adult risk score. White patients were the largest racial group in our sample and served as the reference group in racial comparisons. Differences in proportions were considered statistically significant at p < 0.05.
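For readers unfamiliar with the N-1 chi-squared test, the following Python sketch shows the calculation for a 2 × 2 comparison of two proportions. It is not the MedCalc implementation used in the study. The imaged-patient counts in the usage lines are approximations reconstructed here from the reported cohort sizes and percentages (15,290 × 16.32% and 5,334 × 11.20%); they are assumptions, not figures taken from the study tables.

```python
import math


def n_minus_1_chi_squared(events_a: int, n_a: int, events_b: int, n_b: int):
    """N-1 chi-squared test for two independent proportions.

    Uses the Pearson 2x2 statistic multiplied by (N - 1) / N and refers it
    to a chi-squared distribution with 1 degree of freedom.
    Returns (chi2, two_sided_p).
    """
    a, b = events_a, n_a - events_a   # group A: events, non-events
    c, d = events_b, n_b - events_b   # group B: events, non-events
    n = n_a + n_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = (n - 1) * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Approximate counts reconstructed from reported percentages (assumption).
OFFICE_N, TELE_N = 15290, 5334
office_imaged = round(0.1632 * OFFICE_N)
tele_imaged = round(0.1120 * TELE_N)

difference = office_imaged / OFFICE_N - tele_imaged / TELE_N
chi2, p_value = n_minus_1_chi_squared(office_imaged, OFFICE_N, tele_imaged, TELE_N)
print(f"difference = {difference:.4f}, chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

With these reconstructed counts the difference is about 5.1 percentage points and the p-value falls well below 0.01, which agrees with the reported overall comparison; exact values would require the original numerators.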
Results
There were 20,624 total patients meeting the inclusion criteria (15,290 office and 5,334 telemedicine). There were comparable compositions of demographic subgroups (Table 1) and red flag exclusions between office and telemedicine cohorts (Table 2). The largest representation in each demographic category for both cohorts was: female, white, non-Hispanic, ages 35–50 years, BMI 30–30.9, and low overall risk. The composition of racial subgroups for office and telemedicine cohorts, respectively, was found to be white (76.19% vs. 79.17%), other (15.50% vs. 13.57%), black (7.41% vs. 6.61%), and Asian (0.89% vs. 0.66%). More than 85% of all patients in each cohort were aged 25–50 years, and most of the sample were class 1 and class 2 obese.
Table 1. Back Imaging Quality Performance Comparing Primary Care Office and Telemedicine Encounters by Health System Patients from April 1, 2020, to June 30, 2021, Stratified by Race, Ethnicity, Age, and Body Mass Index
Demographic comparison of office and telemedicine HEDIS quality measures (percentages of patients with an X-ray, CT, or MRI within 28 days of the encounter). Note, the office or telemedicine encounter was linked to the imaging report populating in the EMR, thus avoiding duplicate counts in the HEDIS numerators or denominators between cohorts. The subcategories of insurance type represent a crossover in patient encounters where cost-sharing (including self-pay) can trigger counts in multiple categories within SlicerDicer. Thus, insurance type is not a precise method for tabulating patient encounters, and these total numbers do not exactly match the race-tabulated totals.
Total patient count was found by taking the average of race and ethnicity encounter numbers to reflect the most likely true population numbers, since unlike age and BMI, these numbers do not change within the time frame.
Age is recorded over the 2-year time frame within the EMR; thus, there is a surplus in estimate with these methods. BMI is also recorded throughout the time frame, and thus, the EMR counts the same patient in multiple categories, overestimating the true total. These numbers were included to reveal the consistency in percentages of the data.
BMI, body mass index; CT, computed tomography; EMR, electronic medical record; HEDIS, Healthcare Effectiveness Data and Information Set; MRI, magnetic resonance imaging.
Table 2. Red Flag Exclusion Composition Between Office and Telemedicine Cohorts
CI, confidence interval.
Patients with “no value” in overall risk score are reported in Table 1 for completeness (although there were only 33 total office patients and 9 total telemedicine patients, representing only 0.22% and 0.17% of these cohort subgroups, respectively). The predominant payer type in both cohorts was commercial insurance (office: 57.4%, telemedicine: 57.8%), followed by Medicaid (office: 35.8%, telemedicine: 34.7%) and Medicare/Government (office: 6.9%, telemedicine: 7.8%). Across insurance types, rates of back imaging were modestly lower in the telemedicine cohort. By payer, rates were modestly lower for commercial insurance (office: 13.8%, telemedicine: 10.4%) than for Medicaid (office: 17.2%, telemedicine: 11.3%) and Medicare/Government (office: 17.6%, telemedicine: 10.9%). Statistically significant differences were observed between office and telemedicine cohorts across race, ethnicity, and overall adult risk score (Table 3).
Table 3. Statistical Analysis of Quality Performance by Encounter Type from April 1, 2020, to June 30, 2021
Delta (Δ) represents the difference in office and telemedicine imaging rates (office − telemedicine).
Discussion
This >20,000 patient cohort study found that telemedicine encounters for lower back pain had significantly lower rates of imaging within 28 days of their physician encounter (and better HEDIS performance) compared with office encounters. These results were consistent across racial, ethnic, and risk score subgroup analysis.
Our study responds to primary care leaders' call for reducing high-cost care by investigating the quality of telemedicine. 19 We know that telemedicine decreases costs, improves access to care, 20,21 positively impacts patient experience, 22–25 and has potential for improving clinical outcomes. 26–33 What is clear from our analysis is that telemedicine had superior quality performance (Table 3). Overall and in each of the subgroup analyses, there were consistently >5% differences favoring telemedicine's quality performance over office encounters. Moreover, in some of these subgroups, quality performance favoring telemedicine was greater (e.g., Asians and Hispanic/Latino minorities seen via telemedicine were imaged less than half as much as the respective office cohort).
This is important to interpret in the context of sociodemographic similarity in subgroup composition (Table 1) and red flag exclusions not playing a role in sampling bias (Table 2). Perhaps the most meaningful takeaway is that telemedicine can be implemented without decreasing quality performance. In fact, adding telemedicine could enhance the health system quality performance overall, which alludes to the value-based care application of telemedicine in LBP care (see Supplementary Data S1 for cost analysis example).
What is unclear is why office cohorts had higher imaging rates. The most obvious explanation is the COVID-19-induced shift to telemedicine, where fear of the pandemic or distrust in the ability to obtain imaging safely could have influenced patient decisions. Perhaps imaging was ordered for more patients who never showed up. Perhaps patients with higher pain levels were more inclined to seek in-office evaluation. There was no way of detecting these types of issues in our methods. Another consideration for higher office imaging rates could be related to race and ethnicity, notably in minorities compared with white, non-Hispanics. However, it is unclear which racial minorities also identified as white or “other” (e.g., our “other” category represented about 16% of the population, Table 1). Furthermore, it is not clear how many of these would fall into the other racial categories and impact the quality performance of those racial subgroups.
Thus, language and cultural norms could be considered in postulations on the discordance of office and telemedicine quality performance. Language could be a barrier. It is plausible that less consistency exists in use of translation services in the office versus telemedicine. Or perhaps more patients show up to office visits with overly confident, partially English-speaking family members, leading to miscommunication. Cultural norms in minorities could also foster the expectation that providers will “do something” at appointments. This could partially explain the contrast with telemedicine appointments where “virtual” patients have comparably less access to health care resources (i.e., in-office or close-by imaging).
To help generalize our findings, we found a few trends in our data. There were positive correlations in age and BMI with imaging rates, where older and more obese patients were imaged at higher rates (Table 1). This is consistent with the literature in describing concordance of back pain with age and BMI. 34,35 Health entities considering telemedicine as a first-line venue for LBP care might target low-risk, 35- to 50-year-olds who are class 1 or 2 obese. Another trend was a moderately higher positive correlation between overall risk score and imaging rate (statistically significant). This finding should be interpreted in the context of a low representation of medium- and high-risk patients (Table 1). Still, this comorbidity trend is consistent with the literature, 36,37 and it is reasonable that in riskier populations, physicians might have lower thresholds for obtaining imaging. Especially with comorbidities such as cancer or bowel and genitourinary diseases, there could be a crossover of symptomatology masquerading as LBP red flags, thus warranting appropriate use of imaging.
Our results are likely generalizable to other U.S. northeastern populations given the demographic comparability of our sample to U.S. census data. 38 Additional interpretation on generalizability, including analyses of cohort similarities and the comparability of our findings to national quality performance measures, can be found in the Supplementary Data S1.
LIMITATIONS
We recognize limitations regarding sample selection. First, although we followed HEDIS specifications for numerator selection, we cannot be certain that only patients with initial occurrences of acute back pain were captured. In other words, we may have captured those with a recurrence of chronic back pain masquerading as “new” acute LBP. Second, a specific limitation regards our health system's rurality and representation of the Plain Community (i.e., Amish and Mennonite communities). These patients tend to self-pay, which could affect imaging rates. Third, the digital divide (both limited access to technology and limited knowledge of how to use it) affects vulnerable populations such as migrant and seasonal farm workers, the elderly, long-term care patients, and African Americans. These populations are less likely to use online platforms even when these patients have internet access, 39,40 which highlights the broader limitation of telemedicine: providing benefits only to those privileged to access it.
We recognize the limitation of telemedicine in the detection of red flags. For example, evaluation for spinal tenderness and neurological deficits can be challenging during telemedicine visits. In our methods, these red flags would only be excluded if these chief complaints were documented. This poses a potential for differences in red flag detection rates and could have an unintentional effect on discrepant imaging prescriptions between the two cohorts.
We also recognize limitations regarding our methodology. First, a lack of randomization precludes drawing conclusions about superiority or noninferiority of telemedicine. Second, we do not know how or why patients opted for telemedicine versus office visits. More granular methods would have to account for the etiology of care venue assignment. Third, we could not measure the implications of harm in using telemedicine; thus, caution should be used when considering telemedicine visits (e.g., not ordering imaging to achieve the quality performance point at the expense of missing a spinal abscess). Fourth, being a process measure, the LBP HEDIS performance does not measure patients appropriately imaged (an outcome measure); in fact, those patients are excluded. Because telemedicine is new enough that we do not reliably know if this is an appropriate level of care (beyond this study's scope), there may be a need for creating telemedicine-specific quality measures. This highlights an area for future research, where more specific outcome measures could support more confident recommendations about telemedicine.
Conclusions
We found better HEDIS quality performance (use of imaging for LBP) for telemedicine over office encounters in a large health system, supporting this virtual space as a potentially acceptable alternative care venue. The most beneficial use of telemedicine might be triaging initial encounters of acute LBP in primary care. At the very least, telemedicine is unlikely to compromise quality of care in LBP, and stronger evidence could support its long-term potential for driving value through cost savings.
Health entities seeking to implement telemedicine would need to consider the racial, ethnic, patient-risk, and sociocultural implications that are unique to their geographical areas. Further research could broaden this narrow-lens study to also examine patient satisfaction, return on investment, and more sensitive quality measures for appropriate imaging across different clinical workflows.
Footnotes
Authors' Contributions
D.B. and Y.J. conceived the conceptual framework of the project. A.W. was the principal investigator. A.W. and Y.J. served as mentors and advisors for the project. A.P. and D.B. completed the literature review and authored the introduction. D.B. and K.B. drafted the article. All authors contributed to the critical editing process, including revisions. D.B. completed data collection. D.B. and N.B. collaborated on statistical analysis. A.P., A.W., D.B., K.B., N.B., and Y.J. made ICMJE authorship contributions and are accountable for all aspects of work.
Acknowledgments
Steve Strom provided expert consultation for SlicerDicer data acquisition. Steve Strom and Theodore Bell were key advisors in data acquisition and statistical analysis. Dr. Brian Pollak was a key clinical advisor for telemedicine in primary care.
Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received for this article.
Supplementary Material
Supplementary Data
References
