Abstract
Obesity is a potentially modifiable risk factor for many diseases, and a better understanding of its impact on health care utilization, costs, and medical outcomes is needed. The ability to accurately evaluate obesity outcomes depends on a correct identification of the population with obesity. The primary objective of this study was to determine the prevalence and accuracy of International Classification of Diseases, Ninth Revision (ICD-9) coding for overweight and obesity within a US primary care electronic health record (EHR) database compared against actual body mass index (BMI) values from recorded clinical patient data; characteristics of patients with obesity who did or did not receive ICD-9 codes for overweight/obesity also were evaluated. The study sample included 5,512,285 patients in the database with any BMI value recorded between January 1, 2014, and June 30, 2014. Based on BMI, 74.6% of patients were categorized as being overweight or obese, but only 15.1% of patients had relevant ICD-9 codes. ICD-9 coding prevalence increased with increasing BMI category. Among patients with obesity (BMI ≥30 kg/m2), those coded for obesity were younger, more often female, and had a greater comorbidity burden than those not coded; hypertension, dyslipidemia, type 2 diabetes mellitus, and gastroesophageal reflux disease were the most common comorbidities. Key findings: US outpatients with overweight or obesity are not being reliably coded, making ICD-9 codes undependable sources for determining obesity prevalence and outcomes. BMI data available within EHR databases offer a more accurate and objective means of classifying overweight/obese status.
Introduction
To study the nature and scope of obesity-related complications, whether from a medical or a payer perspective, the population being studied must be identified accurately and completely for the findings to be valid. Although medical claims data are well recognized as useful for large-scale evaluations of disease epidemiology, including medical and economic outcomes, previous research has demonstrated that coding for obesity and body mass index (BMI) in medical claims is inconsistent at best and significantly underrepresents populations with overweight or obesity. 10–16 The reasons are numerous, but it is likely that ICD-9 coding for overweight and obesity is often overlooked unless it is the primary reason patients are seeking treatment. In addition, there are provider reimbursement challenges for obesity management, which make the reimbursement code alone poorly suited to identifying patients who may be overweight or obese.
Increased adoption of electronic health record (EHR) technology by physicians in the United States is providing a new source of health care data for outcomes research purposes. Such data are based on information actually measured during the routine process of patient care and include biometric data, such as BMI, not available in claims databases. The Quintiles electronic medical record (EMR; QuintilesIMS, Durham, NC) research database is a commercially available, high-quality source of anonymized patient-level ambulatory medical records. This database captures a range of demographic and clinical variables, including BMI, and is increasingly being used for outcomes research pertaining to various disease states. 17–22
The primary objective of this study was to determine the prevalence and accuracy of ICD-9 coding for obesity and overweight within the Quintiles EMR, compared to BMI values calculated from patient data. The study also was designed to analyze characteristics of patients with overweight or obesity in the database (based on BMI values) who did or did not receive ICD-9 codes for overweight or obesity. A third objective was to examine whether the use of ICD-9 coding for overweight and obesity changed between 2011 and 2014.
Methods
Overall study design and data collection
This was a cross-sectional, 2-part analysis of data from a US primary care EMR database, Quintiles EMR. The Quintiles EMR database is used in the ambulatory care setting, with more than 1300 sites in 49 states. There are more than 30,000 clinicians using the system, resulting in approximately 35 million patients in the database. Source data for the Quintiles EMR database are captured with the GE Centricity (GE Healthcare, Chicago, IL) user interface. Participating physicians are in middle- to large-size group practices, with approximately 85% being primary care providers (eg, family practice, internal medicine, obstetrics/gynecology, pediatrics). Geographic areas served by the Quintiles EMR database are similar to the overall US population and demographic characteristics are similar to utilizers of health care in the National Ambulatory Medical Care Survey. 23 This database may or may not document care provided in other health care sectors, such as inpatient services and procedures (eg, imaging, same-day surgery, acute rehabilitation, long-term care).
Demographic data collected from the Quintiles EMR database included patient age, sex, race, and geographic region of the United States. BMI measurements were determined from actual patient height and weight values in the database as input by clinical staff. Weight category classifications (underweight, normal, overweight, Class I-III Obesity) were based on BMI cutoffs set by the National Heart, Lung, and Blood Institute. These analyses were limited to adult patients aged 20 years and older.
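The weight classification described above can be illustrated with a minimal sketch. It assumes metric units and the commonly published NHLBI cutoffs (18.5, 25, 30, 35, and 40 kg/m2); the function names are hypothetical and this is not the study's actual code.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2 from measured weight and height."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def weight_category(bmi_value: float) -> str:
    """Map a BMI value to a weight category (cutoffs assumed from NHLBI guidance)."""
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:
        return "normal"
    if bmi_value < 30.0:
        return "overweight"
    if bmi_value < 35.0:
        return "obesity class I"
    if bmi_value < 40.0:
        return "obesity class II"
    return "obesity class III"
```

For example, a patient recorded at 95 kg and 1.75 m would have a BMI of about 31.0 kg/m2 and fall into Class I obesity under these cutoffs.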
ICD-9 codes consist of 3, 4, or 5 digits, with the first 3 digits being mandatory. Greater specificity is achieved by the addition of 1 or 2 digits following a decimal point. The following ICD-9 codes for overweight/obesity were used in this study: 278 (Overweight, obesity and other hyperalimentation); 278.0 (Overweight and obesity); 278.00 (Obesity unspecified); 278.01 (Morbid obesity); 278.02 (Overweight); and 278.03 (Obesity hypoventilation syndrome). Patients with claims evidence of pregnancy or gestational diabetes were not included because of expected BMI fluctuations related to pregnancy.
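One way to turn a patient's recorded codes into a coding status is sketched below. The grouping of 278, 278.0, 278.00, 278.01, and 278.03 as obesity codes and 278.02 as the overweight code follows the table footnotes in this article; the function name and the assumption that codes are stored as dotted strings are illustrative only.

```python
# Code groups as listed in the table footnotes of this article.
OBESITY_CODES = {"278", "278.0", "278.00", "278.01", "278.03"}
OVERWEIGHT_CODES = {"278.02"}


def coding_status(codes):
    """Return 'obesity', 'overweight', 'both', or 'none' for a patient's ICD-9 codes.

    Codes are assumed to be dotted strings (e.g. '278.00') and are matched exactly,
    so '278.0' and '278.00' are treated as distinct codes.
    """
    recorded = {code.strip() for code in codes}
    has_obesity = bool(recorded & OBESITY_CODES)
    has_overweight = bool(recorded & OVERWEIGHT_CODES)
    if has_obesity and has_overweight:
        return "both"
    if has_obesity:
        return "obesity"
    if has_overweight:
        return "overweight"
    return "none"
```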
Part I: Prevalence and characteristics of patients receiving ICD-9 codes for overweight/obesity
The primary analysis was a point-in-time assessment of the prevalence of ICD-9 coding for patients with overweight and obesity in the Quintiles EMR database and comparative characteristics of coded versus non-coded groups. Inclusion criteria for this analysis were: an available BMI measurement in the Quintiles EMR database between January 1, 2014, and June 30, 2014; ≥20 years of age on the index date; no evidence of pregnancy or gestational diabetes; and at least 3 months of follow-up data after the index BMI (first recorded BMI during the study period) to provide an adequate time window for capturing ICD-9 coding.
All 3-month pre-index and 3-month post-index medical records were searched for ICD-9 codes for overweight and obesity. The rationale for searching 3 months before and after the index BMI was to search for a sufficiently long interval for the provider to have responded to the patient's increased weight, but not so long that the patient's weight was likely to have changed substantially. The proportions of patients coded for overweight and obesity by ICD-9 codes were compared with the proportions of patients in corresponding weight categories based on actual height and weight measurements. Among patients with BMI-categorized obesity (BMI ≥30 kg/m2), characteristics of age, sex, race, region of the United States, and comorbidities were compared between subgroups with and without ICD-9 codes for overweight/obesity. Comorbidities were determined based on ICD-9 codes. The Charlson comorbidity index (CCI), a method widely used to measure the burden of comorbid disease and predict mortality, 24 was calculated. This index consists of a list of 17 diagnoses and assigns a weight from 1 to 6 to each diagnosis. The summary score is the sum of weighted values.
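The search of records 3 months before and after the index BMI could be sketched as follows. This is an illustration only: it assumes calendar-month arithmetic with inclusive boundaries, and the function names and the (date, code) record layout are hypothetical rather than taken from the study.

```python
import calendar
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the last day of the month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def codes_in_window(index_date: date, diagnoses, months: int = 3):
    """Return the set of codes dated within +/- `months` of the index BMI date.

    `diagnoses` is an iterable of (date, code) pairs; boundaries are inclusive
    (an assumption, since the article does not state how boundaries were handled).
    """
    start = shift_months(index_date, -months)
    end = shift_months(index_date, months)
    return {code for when, code in diagnoses if start <= when <= end}
```

With an index BMI on February 15, 2014, the window under these assumptions runs from November 15, 2013, to May 15, 2014.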
Descriptive statistics for all study variables were calculated. Categorical variables are described using the proportion of patients in each category, and continuous variables are described using mean and standard deviation. Logistic regression was conducted to obtain odds ratios (ORs) for receiving an ICD-9 code for overweight or obesity.
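To make the odds ratio concrete, the sketch below computes a crude (unadjusted) OR with a Wald 95% confidence interval from a 2×2 table. This is a simpler stand-in for illustration, not the multivariable logistic regression used in the study; the function name is hypothetical and any counts passed to it are the reader's own.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int):
    """Crude odds ratio and Wald 95% CI from a 2x2 table.

    a: group 1, coded      b: group 1, not coded
    c: group 2, coded      d: group 2, not coded
    Returns (OR, lower, upper) for group 1 relative to group 2.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf method).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper
```

A logistic regression with a single binary predictor would yield the same point estimate; the adjusted ORs reported in the article additionally account for the other covariates in the model.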
Part II: Coding trends over time
This analysis compared 4 sequential years of coding patterns using data from January 1, 2011, to September 30, 2014. Patients for whom index BMI values were available during each calendar year (2011–2014) were grouped into conventional BMI categories, and the database was searched for any ICD-9 codes for obesity or overweight over the whole year. The annual prevalence of coding was examined by BMI category and compared across the years. This analysis was performed on all patients with BMI data, and for male and female subsets separately.
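The annual prevalence calculation can be sketched as a simple tally by year and BMI category. The record layout (year, BMI category, coded flag) and the function name are assumptions made for illustration; they are not drawn from the study database.

```python
from collections import defaultdict


def annual_coding_prevalence(records):
    """Percentage of patients coded, keyed by (year, bmi_category).

    `records` is an iterable of (year, bmi_category, coded) tuples, one per
    patient-year, where `coded` is True if any relevant ICD-9 code was found
    during that calendar year.
    """
    totals = defaultdict(int)
    coded_counts = defaultdict(int)
    for year, category, coded in records:
        key = (year, category)
        totals[key] += 1
        coded_counts[key] += int(coded)
    return {key: 100.0 * coded_counts[key] / totals[key] for key in totals}
```

Running the same function on male-only and female-only subsets of the records would give the sex-specific prevalences.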
Results
Part I: Prevalence and characteristics of patients receiving ICD-9 codes for overweight/obesity
The study sample for this analysis included 5,512,285 patients with any BMI value recorded between January 1, 2014, and June 30, 2014. According to BMI values, 74.6% of the study cohort had a BMI ≥25.0 and thus were eligible to be coded for overweight or obesity. However, only 15.1% of all patients (n = 833,763) had ICD-9 codes for overweight or obesity.
The most commonly used ICD-9 code was 278.00 (Obesity, unspecified) in 10.6% of patients, followed by 278.02 (Overweight) in 2.6% of patients, and 278.01 (Morbid obesity) in 2.5% of patients. Codes 278.0 (Overweight and obesity), 278 (Overweight, obesity, and other hyperalimentation), and 278.03 (Obesity hypoventilation syndrome) were each used in <1% of patients.
Overweight and obesity coding by BMI category
Of all patients with an “overweight” BMI (25.0–29.9 kg/m2; 32.3% of the population), 3.86% were correctly coded for overweight only, 2.79% were incorrectly coded for obesity only, and 0.19% were coded for both obesity and overweight (Table 1). Of all patients with an “obese” BMI (≥30.0 kg/m2; 42.3% of the population), 27.3% were correctly coded for obesity only, 2.3% were incorrectly coded for overweight only, and 0.4% were coded for both obesity and overweight.
Shaded cells designate correct coding according to recorded BMI.
ICD-9 278 (Overweight, obesity and other hyperalimentation); ICD-9 278.0 (Overweight and obesity); ICD-9 278.00 (Obesity unspecified); ICD-9 278.01 (Morbid obesity); ICD-9 278.03 (Obesity hypoventilation syndrome).
ICD-9 278.02 (Overweight).
BMI, body mass index; ICD-9, International Classification of Diseases, Ninth Revision.
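The comparison of coding status against measured BMI can be expressed as a small classification rule. The sketch below is illustrative: the labels and function name are hypothetical, and the treatment of uncoded patients with BMI <25 kg/m2 as "correct" is an assumption rather than something stated in the article.

```python
def concordance(bmi_value: float, coded: str) -> str:
    """Compare a coding status with the status expected from measured BMI.

    `coded` is one of 'none', 'overweight', 'obesity', or 'both'.
    Returns 'correct', 'not coded', 'both', or 'incorrect'.
    """
    if coded not in {"none", "overweight", "obesity", "both"}:
        raise ValueError(f"unknown coding status: {coded!r}")
    if bmi_value >= 30.0:
        expected = "obesity"
    elif bmi_value >= 25.0:
        expected = "overweight"
    else:
        expected = "none"
    if coded == expected:
        return "correct"
    if coded == "none":
        return "not coded"
    if coded == "both":
        return "both"
    return "incorrect"
```

Under this rule, a patient with a BMI of 27 kg/m2 and an obesity code is counted as incorrectly coded, and a patient with a BMI of 32 kg/m2 and no code is counted as not coded.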
The percentage of patients coded with an accurate ICD-9 code increased with increasing BMI category, with approximately two thirds of subjects in the highest BMI category (≥50 kg/m2) having a correct ICD-9 code for obesity (278.00, Obesity, unspecified; 278.01, Morbid obesity; 278.03, Obesity hypoventilation syndrome) and less than 1% of patients in this BMI category having a code for overweight (278.02). In the lowest BMI category for obesity (30–34.9 kg/m2), 17% of patients were coded for obesity. The ICD-9 code for morbid obesity (278.01) appeared to be applied most accurately, with a majority (70.9%) of its use occurring in patients with BMI ≥40 kg/m2.
Table 2 presents the demographics of the subgroup of study patients with an index BMI ≥30 kg/m2, as well as comparative demographics for those coded for obesity and those who were not. The patients who were coded for obesity (28.0% of eligible patients) were younger, more often female, and had a greater comorbidity burden compared with patients not coded for obesity but who were obese as assessed by actual BMI. Among patients with BMI ≥30 kg/m2, the mean (±SD [standard deviation]) CCI score was higher among patients coded for obesity than for those not coded for obesity. Even among patients in the highest CCI score category (5+), only 37% were coded for obesity.
P values for coded vs non-coded patients; t test for continuous variables, chi-square test for categorical variables.
BMI, body mass index; CCI, Charlson comorbidity index; SD, standard deviation.
The mean BMI was significantly higher in the coded versus the non-coded group, but it was still notably high in the non-coded group (Table 2). Geographically, the highest proportion of coded patients was observed in the Northeast region of the United States. By race category, the highest proportion of coding occurred among patients classified as Native American and Multiethnic. White patients comprised the greatest percentage of patients in the population with BMI ≥30 kg/m2, yet the prevalence of obesity coding among them was the lowest of the major racial groups evaluated.
Baseline characteristics for patients with an index BMI of 25–29.9 kg/m2, by whether or not they were coded for overweight, are provided in online Supplementary Table S1 (Supplementary Data are available online).
Coding and specific comorbidities
The most common comorbidities among patients with a BMI ≥30 kg/m2 were hypertension, dyslipidemia, type 2 diabetes mellitus, and gastroesophageal reflux disease (Table 3). Among patients coded for these comorbidities, the prevalence of obesity coding concomitant with these diagnoses ranged from 32.5% to 36.9%.
Index BMI = first recorded BMI measurement during the study period.
Comorbidity confirmed by existing diagnosis within a ±3-month window around index BMI.
ICD-9 codes 278, 278.0, 278.00, 278.01, 278.03.
P values for coded vs non-coded patients; t test for continuous variables, chi-square test for categorical variables.
BMI, body mass index; CVD, cardiovascular disease; GERD, gastroesophageal reflux disease; HIV, human immunodeficiency virus; ICD-9, International Classification of Diseases, Ninth Revision; NAFLD, nonalcoholic fatty liver disease; T2DM, type 2 diabetes mellitus.
In contrast, comorbidities such as Prader-Willi Syndrome, metabolic syndrome, sleep apnea, nonalcoholic fatty liver disease, and Cushing syndrome were coded much less frequently (Table 3). However, patients with these comorbid conditions were more likely to also be accurately coded for obesity (44.3% to 55.4%).
Corresponding data for patients with an index BMI of 25–29.9 kg/m2 (overweight) are provided in online Supplementary Table S2.
Table 4 provides logistic regression model estimates of the odds of being coded for obesity according to demographic and clinical characteristics. Younger adults had a higher probability of being coded compared with older adults, and female sex was associated with an OR of about 1.3 relative to male sex. Native American, Multiracial, Hispanic, and black patients had higher ORs for being coded relative to white patients. The probability of being coded for obesity increased with increasing CCI category. Specific comorbidities with the highest ORs included Prader-Willi Syndrome, metabolic syndrome, and sleep apnea.
From regression analysis.
CCI, Charlson comorbidity index; CI, confidence interval; CVD, cardiovascular disease; HIV, human immunodeficiency virus; NAFLD, nonalcoholic fatty liver disease; T2DM, type 2 diabetes mellitus.
Similar patterns were observed for coding of overweight (online Supplementary Table S3), although CCI category had much less influence on the probability of being coded. The comorbidities with the highest ORs for overweight coding were HIV, metabolic syndrome, and prediabetes.
Part II: Coding trends over time
Among patients with BMI ≥30 kg/m2, the prevalence of ICD-9 coding for obesity increased slightly each year from 2011 to 2014 (Fig. 1), with a slightly greater increase apparent between 2013 and 2014 than between other years, particularly within BMI categories ≥35 kg/m2. For all years, the prevalence of coding for obesity was low, but it increased as BMI category increased. Similar trends were observed for both males and females, though the prevalence of coding was consistently higher among women than men (Table 5). Similar trends were observed for coding among males and females even when analyzed by age categories (data not shown).

Prevalence of coding for obesity by index body mass index category, by year, 2011 to 2014.
ICD-9 codes 278, 278.0, 278.00, 278.01, 278.03.
Index BMI = first recorded BMI measurement during the study period.
BMI, body mass index; ICD-9, International Classification of Diseases, Ninth Revision.
Similar trends were observed for coding for overweight, though coding for overweight was observed in very low proportions of patients overall (online Supplementary Fig. S1). Among patients with a BMI consistent with the classification of overweight (BMI 25–29.9 kg/m2), the proportion of patients with ICD-9 coding for overweight was very low: 1.8% in 2011, 2.4% in 2012, 3.4% in 2013, and 3.9% in 2014. Prevalence of ICD-9 coding for overweight by year and sex is provided in online Supplementary Table S4.
Discussion
This study provided a unique opportunity to examine the prevalence and trends associated with clinician ICD-9 coding for overweight and obesity based on actual patient BMI data. Results demonstrate that ICD-9 codes should not be relied on to estimate population prevalence rates of obesity or overweight, given the likelihood of capturing only a portion of relevant patients and generating inaccurate conclusions about the magnitude of the obesity problem. Furthermore, even in other conditions where weight may be an effect modifier, using ICD-9 codes for obesity/overweight will not provide an effective control of confounding.
Even for patients in the very highest BMI categories of 40 kg/m2 and greater, only one third to just over one half were coded for obesity in any given year. Although the frequency of coding for all categories of overweight or obesity increased each year of the study, the overall frequency remained extremely low for such an important disease.
Interestingly, patients with obesity and rarely coded comorbidities such as Prader-Willi Syndrome, metabolic syndrome, sleep apnea, nonalcoholic fatty liver disease, and Cushing syndrome were the most likely to receive codes for obesity. Among this group, Prader-Willi Syndrome and Cushing syndrome are epidemiologically uncommon, occur very rarely in any population, and are treated by specialists. The fact that obesity is a core presentation of both of these conditions may increase its likelihood of being coded. In contrast, the other comorbid conditions for which obesity coding is common, such as metabolic syndrome, sleep apnea, and nonalcoholic fatty liver disease, are actually quite common epidemiologically; however, they are also commonly underdiagnosed and thus likely to be undercoded. When one of these comorbidities has been diagnosed and is being treated, concurrent obesity coding may not be surprising, as each of these comorbidities is driven by obesity. Additionally, because it is not common to code for these comorbidities or for obesity, it is possible that particular clinicians are more likely to assess these conditions, code for them, or both. Surprisingly, the rate of obesity coding was lower among other obesity-driven comorbidities such as type 2 diabetes, hypertension, and dyslipidemia. Perhaps their high population prevalence, in addition to the frequent comorbid presentation of all 3, means that they do not provide an automatic trigger for clinicians to also code for obesity.
Possible reasons for incomplete obesity and overweight coding include the time burden of documentation (including the need to prioritize other codes in a time-limited visit and the lack of time to search for the code), lack of knowledge that ICD codes for overweight and obesity exist, EHR system issues wherein the codes do not come up in searches, the limit of only 4 codes per procedure/visit (with higher prioritization for “mainstream” diseases), and a nihilistic view of coding weight status related to a perception or reality of limited or no reimbursement for management of obesity, because many payers currently offer no increased payment, or only a minimal economic incentive, for doing so. It is also worth noting that the data revealed an unexpected “false positive” phenomenon in which a small percentage of patients with BMIs below the overweight threshold (<25 kg/m2) were coded as overweight or obese, which could suggest a certain level of operational or administrative error in coding in general.
A lack of obesity coding behavior also may reflect a general aversion among clinicians toward addressing the topic of weight with their patients. Reasons for such behavior can include time limitations, failure to prioritize obesity as a clinical issue, negative stereotypes and attitudes about people with obesity, low levels of emotional rapport with people with obesity, and poor expectations for weight loss success. 25–28 Some of these challenges may be addressed by improving health care provider training in weight management.
Another likely phenomenon of the obesity epidemic is that, as the mean BMI in the United States has risen, patients with BMIs ≥25 kg/m2 (overweight) or ≥30 kg/m2 (obese) are subconsciously assessed as reflecting the norm rather than a population at increased health risk that requires intervention. In addition, clinicians are likely unaware of the biological basis of obesity, and hence may be cynical about their ability to facilitate weight loss. This is probably driven by past disappointments with diet and exercise interventions or historic failures of older weight loss medications. Data from the National Ambulatory Medical Care Survey revealed a significant decline in primary care physician-based weight counseling efforts from 1995–1996 to 2007–2008 in the United States. 29 This is an alarming trend, given the millions of individuals with overweight/obesity, which has well-established health consequences such as diabetes, heart disease, fatty liver, and others, and the fact that overweight/obesity is a modifiable risk factor.
A number of major changes have occurred in recent years with potential impact on the medical coding environment in the United States. The US Health Information Technology for Economic and Clinical Health Act, passed in 2009, provides financial incentives to hospitals and outpatient clinics that utilize EHRs meeting certain requirements, known as “meaningful use” standards. As part of these standards, BMI calculations must be included in the EHRs. EHR adoption among outpatient providers reportedly rose from 17% in 2006 to 78% in 2013. 30 The Affordable Care Act (ACA) went into full effect in January 2014, giving millions of new patients access to the US health care system, though adding administrative burdens to medical practices as they adjust to greater patient demand, multitudes of regulations, and additional payer models. However, the ACA also has strict documentation rules, reinforcing the need for medical coding accuracy to avoid denial of reimbursement. The current data showed a slight but noticeable increase in obesity coding prevalence in 2014 relative to other years, most evident among patients with BMI ≥35 kg/m2. Although this finding coincides with the full implementation of the ACA in the same calendar year, possible causal associations cannot be determined. The ACA does require full coverage for preventive care obesity screening and counseling by most insurers, possibly increasing the financial incentive for physicians to code and bill for weight-related services. Further studies looking at coding trends over time will be needed to evaluate the ongoing impact of the ACA and of Medicare and commercial payer quality reporting requirements on coding compliance and accuracy.
Another change affecting the US health care system, and which the current data do not reflect, is the implementation of the ICD-10 coding system in October 2015. The ICD-10 system represents a tremendous expansion of coding complexity, as it contains approximately 72,000 diagnosis codes compared with 14,000 in the ICD-9 system. Previous numeric ICD codes for diseases have been replaced with a vastly expanded set of alphanumeric codes. Although the intention is greater specificity and accuracy, it remains to be seen if this new system will achieve the desired goals given the much greater effort on behalf of the clinical team to search, identify, and document these new codes. The present data may provide a useful benchmark for future studies to examine obesity coding trends with the ICD-10 system.
Previous US studies of varying methodology and sample size have reported under-recognition and poor coding practices for obesity, including studies utilizing Medicare records, 31 the Mayo Clinic Rochester primary care database, 13 and US military health system EMRs. 14 The present data corroborate and expand on these prior findings, providing additional robust evidence from a multimillion patient sample with broad national representation of outpatient coding practices in the United States. The present findings add to a growing body of evidence substantiating the critical limitations of relying on medical coding for tracking obesity burden, epidemiology, expenditures, and outcomes.
Although not all providers will accept and enact the requirement, as of January 2014 all health care providers in the United States are now federally mandated to use EHR systems. Hopefully, this will increase the scope and utility of such databases for outcomes research, though it is unclear to what extent researchers will have access to these improved data sets.
What should be equally concerning about these data is the implication that if clinicians are not coding obesity, they likely are not treating obesity 13; prior research has revealed that documentation of obesity correlates with interventional recommendations. 32 Given that two thirds of the US population is overweight or obese, the medical community needs to proactively embrace treating overweight and obesity much as they do for other chronic diseases such as depression or hypertension. In fact, the magnitude of the problem represents a tremendous opportunity for clinicians to make an impact on numerous weight-related diseases simply by maintaining awareness of obesity as a disease and being proactive in recommending evidence-based and effective weight-loss measures. Proactive physician communication, intervention, and advice have been shown to be effective motivators for weight-loss behavior, 33 and medical therapy is often indicated.
Conclusions
This analysis found that patients with measured BMIs indicating overweight or obese status in outpatient settings in the United States still are not reliably coded in claims data as such. Physicians, payers, and administrators who are dedicated to evaluating medical and economic outcomes related to obesity must be aware that medical coding data for overweight and obesity are wholly inadequate for these purposes. BMI determinations from clinical height and weight data available within EHR databases are, in theory, easily accessible and provide an accurate, objective means of classifying overweight and obesity status. The apparent high degree of discordant obesity coding may signal a need for efforts to correct underlying perceptions and attitudes of physicians regarding the medical relevance of obesity and the need for effective intervention.
Footnotes
Acknowledgments
Statistical analysis assistance was provided by Paige Meade, MPH of Novo Nordisk, Inc. Writing assistance was provided by Sandra Westra, PharmD of Churchill Communications.
Author Disclosure Statement
Ms. Mocarski and Dr. Smolarz are employees of Novo Nordisk Inc. and hold stock options in Novo Nordisk A/S. Dr. Tian was an employee of Novo Nordisk at the time this research was being conducted. Drs. McAna and Crawford have no potential conflicts of interest to report. The authors received the following financial support for the research, authorship, and/or publication of this article: This study was funded by Novo Nordisk, Inc.
