Abstract
There is increasing acceptance that social and behavioral determinants of health (SBDH) impact health outcomes, but electronic health records (EHRs) are not always set up to capture the full range of SBDH variables in a systematic manner. The purpose of this study was to explore rates and trends of social history (SH) data collection, 1 element of SBDH, in a structured portion of an EHR within a large academic integrated delivery system. EHR data for individuals with at least 1 visit in 2017 were included in this study. Completeness rates were calculated for how often each SH variable was assessed and documented. Logistic regressions identified factors associated with assessment rates for each variable. A total of 44,166 study patients had at least 1 SH variable present. Tobacco use and alcohol use were the most frequently captured SH variables. Black individuals were more likely to have their alcohol use assessed (odds ratio [OR] 1.21) compared with White individuals, whereas White individuals were more likely to have their “smokeless tobacco use” assessed (OR 0.92). There were also differences between insurance types. Drug use was more likely to be assessed in the Medicaid population for individuals who were single (OR 0.95) compared with the commercial population (OR 1.05). SH variable assessment is inconsistent, which makes it difficult to use EHR data to better understand the impact of SBDH on health outcomes. Standards and guidelines on how and why to collect SBDH information within the EHR are needed.
Introduction
Social and behavioral determinants of health (SBDH) include the environment in which individuals live, learn, work, and play, as well as other social and behavioral factors that may not typically be considered the primary focus of health care services. 1,2 SBDH factors such as income, neighborhood resources, and tobacco use can place populations at risk of health disparities and poor outcomes. 1 –6 Providers are increasingly recognizing the importance of addressing SBDH factors as part of improving health outcomes and addressing disparities, but there is not full consensus on how best to identify and address SBDH factors in clinical settings.
There are currently no standards within most electronic health record (EHR) systems regarding the collection of SBDH information. Typically, only a limited number of elements are captured in various structured components of the EHR, such as the social history (SH) table, patient demographics table, or in vendor-specific SBDH tables such as the Social Determinants wheel in Epic. 2,3,7 Other elements are captured haphazardly in the “unstructured” clinician notes section of the EHR. 2,7,8 The Social Interventions Research and Evaluation Network, the National Academy of Medicine, and other national organizations have made recommendations on what variables to capture in the EHR, but no standard collection practices have been adopted across EHR vendors. 1,3,9,10
The SBDH data entered as “free text” within the clinician note section are hard to extract, interpret, and utilize for anything beyond the review of the past care notes for that single patient. 11 –14 The structured SH table is a common location within the EHR to record information about individual behaviors, including tobacco, alcohol, and illicit drug use. This information is categorical and structured, but data collection processes are not always consistent within or across care settings. There are limited studies looking at the data completeness and quality of these EHR SH tables. 15,16
Recent studies have compared SBDH information found in structured and free text portions of the EHR patient record. Feller et al explored the availability of select SBDH information in free text, structured, and combined formats. This study focused on alcohol use, substance use, sexual orientation, sexual activity, and housing status. The authors found that about 60% of records had more accurate results when combining both structured and free text information. Of the specific SBDH domains, sexual orientation was the most commonly collected information and IV drug use was the least. This study did not specifically review the SH table. 14 Another study investigated the availability of information on housing issues, such as homelessness, housing instability, and structural challenges associated with residence, in free text and structured sections of the EHR. It found that there were multiple approaches for collecting SBDH information and that completeness rates ranged from 3.3% to 95% for structured data elements. Again, this study did not specifically focus on the SH table when exploring completion rates. 13
There is currently limited information on what patient, provider, visit or system factor(s) may impact the collection of SBDH information. One challenge in undertaking research of this type is that there are varied ways these data can be captured within and across EHR systems and organizations. Furthermore, clinician behavior and unconscious bias may also impact capture of such information. 16,17 Gaining better understanding of what type of patients have more complete records can help support strategies to improve data collection and use of SBDH information in the EHR to address disparities. This study aimed to characterize the completeness of the SH table in the Epic EHR for patients within a large academic integrated delivery system. It also sought to understand factors that may impact data collection rates.
Methods
This study was considered exempt by the Johns Hopkins Bloomberg School of Public Health Institutional Review Board (IRB). This is a secondary use data study.
Data source
This study used a limited data set extracted from the Johns Hopkins Health System (JHHS) EHR. Epic EMR (Wisconsin;
Study sample
To be eligible, individuals must have been enrolled for all of 2017 in either an employee health plan (EHP) or a Medicaid managed care plan, Priority Partners (PP), from Johns Hopkins Health Care (JHHC). JHHC is affiliated with JHHS, but services may be received either within or external to JHHS; to ensure that individuals had information within the JHHS Epic EHR system, only JHHC enrollees with at least 1 Johns Hopkins Community Physicians (JHCP) visit between 2013 and 2017 were eligible to be included. JHCP is a large primary and specialty health network affiliated with Johns Hopkins with >40 locations throughout Maryland and Washington, DC. 18
Eligibility was further restricted to individuals with at least 1 outpatient visit in 2017. Number of visits was calculated using the Johns Hopkins Adjusted Clinical Group (ACG) outpatient visit count score from the claims record. Outpatient visits were counted as 1 event for each unique combination of patient ID, provider ID, and date of service for qualifying types of visits, which included physician office visits, outpatient hospital visits, bill types 13x or 14x, and settings for which the place of service code was “other.” 19 The number of visits was categorized as 0, 1–3, 4–9, and 10 or more visits.
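The visit-counting rule described above can be illustrated with a minimal Python sketch. This is not the ACG system's implementation: the `Claim` fields, the function names, and the sample claims are hypothetical, and the sketch assumes the claim lines have already been filtered to qualifying visit types.

```python
from typing import Iterable, NamedTuple


class Claim(NamedTuple):
    # Hypothetical claim-line fields; a real claims extract will differ.
    patient_id: str
    provider_id: str
    service_date: str  # ISO date string, e.g. "2017-03-14"


def count_visits(claims: Iterable[Claim]) -> dict[str, int]:
    """Count one visit per unique (patient, provider, date) combination."""
    unique_events = {(c.patient_id, c.provider_id, c.service_date) for c in claims}
    counts: dict[str, int] = {}
    for patient_id, _, _ in unique_events:
        counts[patient_id] = counts.get(patient_id, 0) + 1
    return counts


def visit_category(n_visits: int) -> str:
    """Map a visit count to the categories 0, 1-3, 4-9, and 10+."""
    if n_visits <= 0:
        return "0"
    if n_visits <= 3:
        return "1-3"
    if n_visits <= 9:
        return "4-9"
    return "10+"


# Illustrative claim lines only; not study data.
claims = [
    Claim("p1", "dr_a", "2017-01-05"),
    Claim("p1", "dr_a", "2017-01-05"),  # duplicate line, same event
    Claim("p1", "dr_b", "2017-01-05"),  # different provider, separate event
    Claim("p2", "dr_a", "2017-02-10"),
]
visits = count_visits(claims)
categories = {pid: visit_category(n) for pid, n in visits.items()}
```

Deduplicating on the three-field key before counting means that multiple claim lines for the same encounter contribute a single visit.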
Variables
The key dependent variable of interest was a binary indicator (“not assessed” or “assessed”) of whether the following SH measures were assessed and documented in the EHR SH table by a health professional (clinician or medical assistant): “Alcohol use,” “Tobacco use,” “Smoking Tobacco use,” “Smokeless tobacco use,” and “Illegal drug use.” “Not assessed” was defined as either missing completely or assigned the value of “not asked,” “unknown,” or “unspecified.” Some SH variables captured different degrees of granularity for the same or similar behaviors: for example, the “Tobacco use” variable captured whether an individual used tobacco or not regardless of form (eg, chew, smoking) and was captured as “yes” or “no.” The “Smoking tobacco status” variable had more details around the amount smoked (eg, heavy smoker, passive smoker). If there was any classification of the behavior, regardless of the degree of granularity (eg, “yes,” “no,” “currently smoking”), the variable was coded as “assessed.” When individuals had >1 SH record, the last known value for each variable was extracted for this analysis.
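The coding rule for the dependent variable can be sketched as follows. This is an illustration rather than the study's extraction code: the function names and the `(date, value)` record layout are hypothetical, and "last known value" is interpreted here as the value on the most recent record, which is an assumption.

```python
from typing import Optional

# Values treated as "not assessed"; a missing value (None or blank) also counts.
NOT_ASSESSED_VALUES = {"not asked", "unknown", "unspecified"}


def is_assessed(value: Optional[str]) -> bool:
    """Return True when a social history value carries any classification."""
    if value is None or not value.strip():
        return False
    return value.strip().lower() not in NOT_ASSESSED_VALUES


def last_known_value(records: list[tuple[str, Optional[str]]]) -> Optional[str]:
    """Return the value from the most recent record (assumed interpretation).

    `records` holds hypothetical (ISO date, value) pairs for one patient and
    one variable; an empty list means no record exists.
    """
    if not records:
        return None
    # ISO date strings sort chronologically, so max() picks the latest record.
    return max(records, key=lambda record: record[0])[1]


# Illustrative history only; not study data.
history = [("2016-05-01", "Never smoker"), ("2017-08-20", "Not asked")]
latest = last_known_value(history)
```

Under this interpretation, a patient whose latest record says “not asked” is coded as not assessed even if an earlier record held a classification; taking the last non-missing value instead would be a different, equally plausible reading.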
Demographic variables were extracted from the structured patient demographic table within the EHR. Variables included age (calculated by date of birth), gender, and race (categorized as White, Black/African American, other, unknown). Age was grouped as adolescents (14–17 years old), young adults (18–29), middle-aged adults (30–64), and older adults (65+). Adolescents were included because it is thought that by age 14 they have some autonomy and can play an active role in their health. Young adults were included as there is evidence of differences in behaviors between younger adults, middle-aged adults, and older adults. 20,21
Marital status was aggregated to identify those with partners compared with those without partners. Those who indicated they were single, divorced, separated, widowed, marital status unknown, other, or declined to answer were characterized as having no partner. Those who indicated they were married or had a significant other were characterized as having a partner. Anyone who was enrolled in PP (the JHHC Medicaid managed care plan) for any part of 2017 had their insurance status coded as PP, even if they were dually enrolled.
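The covariate groupings described above can be sketched in Python. The function names and status strings are hypothetical (actual EHR values may be spelled differently), and returning `None` for ages under 14 is an assumption, since the study reports no group below adolescents.

```python
from typing import Optional

# Marital status values grouped as "has a partner"; every other value
# (including unknown, other, and declined) is grouped as "no partner".
PARTNERED_STATUSES = {"married", "significant other"}


def age_group(age_years: int) -> Optional[str]:
    """Return the study's age group, or None for ages under 14 (assumption)."""
    if age_years < 14:
        return None
    if age_years <= 17:
        return "adolescent"
    if age_years <= 29:
        return "young adult"
    if age_years <= 64:
        return "middle-aged adult"
    return "older adult"


def has_partner(marital_status: Optional[str]) -> bool:
    """Collapse marital status into partner versus no partner."""
    if marital_status is None:
        return False
    return marital_status.strip().lower() in PARTNERED_STATUSES


def insurance_plan(enrolled_in_pp_during_2017: bool) -> str:
    """Code any PP enrollment during 2017 as PP; otherwise EHP."""
    return "PP" if enrolled_in_pp_during_2017 else "EHP"
```

Note that under this grouping, an unknown marital status is indistinguishable from "single" in the models.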
Analysis
All analyses were conducted using Stata/SE 15. Frequencies were calculated to assess data completeness for each categorical variable of interest. Race and marital status categories were grouped due to low numbers in a few categories.
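As a rough illustration of the completeness calculation (the study itself used Stata), the rate for each variable is the share of patients coded as assessed. The function name and the example flags below are hypothetical and are not study data.

```python
def completeness_rates(assessed: dict[str, list[bool]]) -> dict[str, float]:
    """Percentage of patients assessed for each variable, to 1 decimal place."""
    rates: dict[str, float] = {}
    for variable, flags in assessed.items():
        if not flags:
            raise ValueError(f"no patients supplied for {variable!r}")
        rates[variable] = round(100.0 * sum(flags) / len(flags), 1)
    return rates


# Hypothetical assessed/not-assessed flags for 4 patients.
example = {
    "Alcohol use": [True, True, False, True],
    "Illegal drug use": [True, False, False, True],
}
rates = completeness_rates(example)
```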
Crude and adjusted odds ratios (ORs) were calculated for each SH variable using multivariable logistic regression to determine what factors may make assessment more likely. Modeling was conducted on the entire population and then on each insurance plan separately. All demographic factors were included in all the models for all behaviors. Reference groups were age 30–64, White race, female gender, no partner, PP enrollment, and 1–3 visits. A P-value ≤0.05 was considered significant. An OR >1 for an SH variable indicated the group was more likely to be assessed on that variable than the reference group.
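To make the interpretation of an OR concrete, the sketch below computes a crude (unadjusted) OR with a Wald 95% CI from a 2 × 2 table. It covers only the crude case; the adjusted ORs reported in this study came from multivariable logistic regression. The function name and the counts are hypothetical.

```python
import math


def crude_odds_ratio(
    group_assessed: int,
    group_not_assessed: int,
    ref_assessed: int,
    ref_not_assessed: int,
    z: float = 1.96,
) -> tuple[float, float, float]:
    """Return (OR, lower, upper) using the Wald interval on the log-odds scale."""
    cells = (group_assessed, group_not_assessed, ref_assessed, ref_not_assessed)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (group_assessed * ref_not_assessed) / (
        group_not_assessed * ref_assessed
    )
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / count for count in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 90 of 100 assessed in the group, 80 of 100 in the reference.
odds_ratio, lower, upper = crude_odds_ratio(90, 10, 80, 20)
```

With these counts the crude OR is 2.25, meaning the odds of assessment in the group are 2.25 times those in the reference group. A CI that includes 1 indicates the difference is not statistically significant at the corresponding level.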
Results
A total of 44,166 individuals were included in the study sample. The total population was primarily female (67%) and either Black/African American (49%) or White (38%). The population was split between the 2 insurance plans included in this study, with 40% in EHP and 60% in PP, and the plans differed somewhat in demographics (Table 1).
Table 1. Population Characteristics Including the Total Population and Separated by Insurance Plan
EHP, employee health plan; NA, not asked; PP, priority partners.
Table 2 shows the frequency with which each behavior in the SH table was assessed. More detailed frequencies captured by category for each behavior (eg, heavy smoker, light smoker, passive smoker, never smoker), as well as more detailed demographics, can be found in Supplementary Appendix A in Supplementary Data. “Alcohol use,” “Tobacco use,” and “Smoking tobacco” were assessed the most frequently (83%, 89.6%, and 89.6%, respectively), whereas “Smokeless tobacco use” and “Illegal drug use” were assessed less frequently (64.3% and 77.4%, respectively).
Table 2. Frequency of Each Structured Social History Behavior as Assessed versus Not Assessed
EHP, employee health plan; PP, priority partners.
There was a high degree of collinearity between insurance plan and partner status. In almost all models these variables were statistically significant for each behavior when incorporated into the model alone but were no longer significant for a majority of behaviors when both insurance plan and partner status were included in the model (Supplementary Appendix B). Tables 3–6 show the adjusted OR, P-values, and 95% confidence interval (CI) for each behavior and the variables included in the full models. The number of outpatient visits as calculated by the ACG system was significant in all models. The OR increased as the number of visits increased. This was consistent for the total population (ie, where both insurance plan types were combined), for each insurance plan separately, and for each behavior.
Table 3. Adjusted Odds Ratio for Alcohol Use Being Assessed or Not in the Electronic Health Record for the Total Population and by Each Insurance Plan
Models included only the factors listed and all factors listed were included in the model.
CI, confidence interval; EHP, employee health plan; NA, not asked; OR, odds ratio; PP, priority partners.
Table 6. Adjusted Odds Ratio for Drug Use Being Assessed or Not in the Electronic Health Record for the Total Population and by Each Insurance Plan
Models included only the factors listed and all factors listed were included in the model.
CI, confidence interval; EHP, employee health plan; NA, not asked; OR, odds ratio; PP, priority partners.
Alcohol use assessment
Overall, individuals who were female or Black were more likely to have their alcohol use assessed (Table 3). ORs for males compared with females were 0.81, 0.72, and 0.87 in the total, EHP, and PP populations, respectively. In the total population, individuals who were identified as Black were more likely to have their alcohol use assessed compared with their White counterparts (OR 1.21, 95% CI 1.13–1.29). For the Medicaid population, the elderly group was more likely to have their alcohol use assessed compared with the 30–64 reference group (OR 1.36). There was no difference for this age group compared with the reference group in the EHP population. In the EHP population, having a partner increased the odds that alcohol use was assessed.
Tobacco use assessment and smoking tobacco use status assessment
These 2 behaviors are captured separately within the SH table, but the odds of being assessed were similar (Table 4 and Supplementary Appendix C). For example, in the total population, the OR for the elderly group was 1.48 (CI 1.20–1.83) for both behaviors, and in the EHP population it was 1.16 (CI 1.01–1.33) for Black individuals for both behaviors. In the total population, assessment differed significantly between adolescents (14–17) and the 30–64 reference group (OR 0.82) but not between young adults (18–29) and the reference group (OR 0.99). In the EHP population, being female or Black was associated with significantly higher odds of being assessed (OR 0.88 for males relative to females and OR 1.16 for Black relative to White individuals), but these associations were not significant in the total population or in the PP population alone.
Table 4. Adjusted Odds Ratio for “Tobacco Use” Being Assessed or Not in the Electronic Health Record for the Total Population and by Each Insurance Plan
Models included only the factors listed and all factors listed were included in the model.
CI, confidence interval; EHP, employee health plan; NA, not asked; OR, odds ratio; PP, priority partners.
Smokeless tobacco use
For the total population, both having a partner and being enrolled in EHP were statistically significant (OR 1.08 and 1.07, respectively) (Table 5). Having a partner was significant in the EHP population (OR 1.1 [CI 1.03–1.20]) but not in the PP population (OR 1.07 [CI 0.99–1.16]). In all models, adolescents had lower odds of having their status assessed than the 30–64 reference group (OR 0.71 total, 0.83 EHP, and 0.69 PP). Gender remained statistically significant, with females more likely to be assessed (OR 0.87 [CI 0.83–0.91] for males relative to females). In the total population, White individuals had greater odds of being assessed than Black individuals (OR 0.92 [CI 0.88–0.97]) or individuals of other races (OR 0.87 [CI 0.81–0.94]); within the individual plans, the difference was significant only in PP (OR 0.89 [CI 0.84–0.95]).
Table 5. Adjusted Odds Ratio for “Smokeless Tobacco Use” Being Assessed or Not in the Electronic Health Record for the Total Population and by Each Insurance Plan
Models included only the factors listed and all factors listed were included in the model.
CI, confidence interval; EHP, employee health plan; NA, not asked; OR, odds ratio; PP, priority partners.
Illegal drug use assessment
Similar to other behaviors, those aged 30–64 had higher odds than adolescents of having their drug use assessed in the total population (OR 0.33, 95% CI 0.30–0.36 for adolescents relative to the 30–64 reference group) (Table 6). Being female (OR 0.75, 95% CI 0.71–0.79 for males relative to females) or Black (OR 1.30, 95% CI 1.23–1.37) was also associated with higher odds of being assessed.
Discussion
The frequency with which SH variables were captured in the EHR varied by behavior, with tobacco use and smoking tobacco use collected most frequently and smokeless tobacco use and drug use collected least frequently. The odds of having a behavior assessed also varied by behavior. For many variables, being Black or female increased the odds that an SH behavior would be assessed, but for smokeless tobacco use White individuals were more likely to be assessed. Whether having a partner was significantly associated with assessment of an SH variable varied by insurance plan.
Differences in SH variable capture could be due to various policies, guidelines, and system architecture of the EHR during the study timeframe. For example, the federal “Meaningful Use” EHR incentive program required smoking status to be captured for reimbursement purposes. Nationally, assessment of that variable increased from 50% of individuals (13+ years old) in stage 1 of Meaningful Use to 80% in stage 2 of Meaningful Use. 22,23 Even though the Meaningful Use program has effectively ended, the measures and EHR certifications have been rolled into what is now known as the electronic Clinical Quality Measures within the Medicare program. Thus, financial incentives remain strongly linked to the EHR's ability to capture smoking status. 24 This is just 1 policy that could account for the greater assessment of “smoking tobacco use” and “tobacco use” compared with other SH variables captured in the SH table.
The varying frequencies and predictors of assessment further underscore that providers are not capturing SH information in a standard manner. 17 This could be due to a lack of standards and policies that would make capturing this information easier and more consistent. 7,8,15,25 These standards are needed both in how information is captured and who is capturing the information to make the data more reliable and usable for multiple stakeholders. For example, there are multiple ways in the Epic EHR to collect information about tobacco use, but only 1 way to capture information about alcohol or drug use. ORs for both “Tobacco use” and “Smoking tobacco use” were almost identical, which indicates providers were taking the time to assess the behavior in multiple ways. However, the ORs for different behaviors (eg, tobacco use, alcohol use) differed, meaning that providers were often only completing information for 1 behavior, not all 4.
The odds of assessing behavior increased as the number of outpatient visits increased. This is consistent with more visits translating into more opportunities to complete the SH table. Patients who use more health care may also be sicker, and providers may be more inclined to capture behavior information to assist in understanding their patients' needs. 16 In contrast, odds of having behaviors assessed differed by insurance plan, which was likely due to the types of populations that enroll in these plans. Those enrolled in PP are a Medicaid population and may have other social issues to be addressed, making it less likely for providers to assess all social behaviors and more likely for them to focus on the most pressing social needs of the patient. Often providers will document social needs in the clinical notes section, making it difficult for other providers or researchers to access this information. 13,14,26
Capturing behavioral information on patients is important. Participating in these behaviors (eg, drinking alcohol, smoking tobacco) can lead to increased disease burden and poor outcomes. Smoking can exacerbate chronic conditions such as asthma or chronic obstructive pulmonary disease and can lead to poorer outcomes in pregnancy and cancer treatment. Excessive alcohol use can lead to liver disease and mental health issues. 27 –29 It is important for care managers and providers to understand their patients beyond just their medical conditions. These behaviors can be treated through various interventions and social services, but if they are unknown or not documented, there is no way to provide treatment and in turn improve health outcomes. If the information is not collected, the extended care team may not be able to address issues in a preventative manner.
Social and behavioral information may also play an important role in understanding the health of populations. Without standards and clear and consistent guidelines, the information captured in the EHR is largely unusable for assessing population health. Geographic or neighborhood-level information is largely used as a proxy to understand the health of populations in a certain area, but it may not be accurate for each individual. 13,29 –31 Improving SBDH variable assessment, including SH variables, for all individuals will allow for more accurate information about entire populations. A better understanding of a population and their behavior will allow policy makers to develop programs to improve the health of larger populations.
This analysis is one of the first of its kind to investigate multiple behaviors captured within the structured SH table in an EHR within a large academic delivery system. Although unstructured data such as clinical notes may provide more information about individuals' SBDH, that information is not easily accessible. The sample is similar to that of the state of Maryland, making these findings generalizable to other health systems in the state. 32,33 Multiple covariates were explored including race, having a partner, age, and number of outpatient visits. Few other articles have incorporated these demographic variables. The authors also explored 2 different insurance plans, whose enrollees have different medical and social needs. The results showed that it is important to capture information on all individuals, not just a few.
Limitations of this study
This article explored only 1 component of the EHR where SH information is captured. Clinician notes can contain SH and SBDH information, but natural language processing or exploration of the free text of EHR records was beyond the scope of this article. This article also only explored 1 health system and 1 installation of Epic. Other health systems may capture information in different ways, and even the health system explored in this study has changed how information is collected since 2017. A review of data collection practices with more recent data should be conducted to help understand how changes to the EHR have potentially changed SBDH data collection.
The last known value for each behavior captured in the SH table within 1 year was used. It was unknown if that information had changed over time or if the change in behavior caused information to be captured more or less frequently. These values may have been captured at different times or at different visits (eg, alcohol may have been assessed during a presurgical visit, whereas smoking was last assessed during a routine physical). This may have impacted the accuracy of the measures of how often variables were assessed.
There were also about 8000 individuals (15% of the population) who had a primary care visit in 2017 but had no SH information recorded. These individuals may have gotten other types of care at different health centers, they could have only been enrolled for a short time period within 2017, or other factors may have contributed to them not having an SH encounter. Other studies exploring the types of visits where social and behavioral information is captured are needed.
Finally, the authors had all administrative claims but EHR data from only 1 site. This may have skewed some of the results, particularly in relation to number of visits. Visits were calculated using claims information, which included visits that occurred outside of the health system. Information on behaviors may have been captured outside of the health system at other physician offices, making some of the numbers lower than they would be if the authors had access to all EHR records.
Conclusion
The capture of social and behavioral information is important to best address patient needs, but these data are not consistently collected. Individuals who have more contacts with a health system appear to have greater chances that this information is captured within the EHR during these interactions. Other patient-, provider-, and encounter-related factors also appear to be related to increased SH variable assessment. SH and SBDH variables should be captured for all patients.
Developing clearer standards and guidelines will help make this data collection more consistent and reliable. Creating an easy workflow to capture this information will ease the burden on providers and patients to collect and report accurate information. Organizations such as the Centers for Medicare & Medicaid Services should continue to create incentives for providers to capture SBDH information. With more reliable data around behaviors, researchers will be able to utilize these data to study the impact of SBDH variables on various health outcomes.
Footnotes
Acknowledgment
The authors thank Thomas Richards for his help in understanding data preparation and assisting in database management.
Authors' Contributions
Conceptualization, data curation, analysis, writing original draft, writing review, and editing by Dr. Lasser. Conceptualization, writing review, and editing by Dr. Gudzune and Dr. Lehmann. Writing review and editing by Dr. Kharrazi. Conceptualization, writing review and editing, and supervision by Dr. Weiner.
Author Disclosure Statement
All authors report being employed by Johns Hopkins University or Johns Hopkins Health System. There are no other conflicts of interest.
Funding Information
This project was conducted as part of a doctoral dissertation and was not funded.
Supplementary Material
Supplementary Data
