Natural Language Processing to Assess End-of-Life Quality Indicators in Cancer Patients Receiving Palliative Surgery

Abstract

Background:

Palliative surgical procedures are frequently performed to reduce symptoms in patients with advanced cancer, but quality is difficult to measure.

Objective:

To determine whether natural language processing (NLP) of the electronic health record (EHR) can be used to (1) identify a population of cancer patients receiving palliative gastrostomy and (2) assess documentation of end-of-life process measures in the EHR.

Design/Setting:

Retrospective cohort study of 302 adult cancer patients who received a gastrostomy tube at a single tertiary medical center.

Measurements:

Sensitivity and specificity of NLP compared with the gold standard of manual chart abstraction in identifying a palliative indication for gastrostomy tube placement and documentation of goals of care discussions, code status determination, palliative care referral, and hospice assessment.

Results:

Among 302 cancer patients who underwent gastrostomy, 68 (22.5%) were classified by NLP as having a palliative indication for the procedure compared to 71 patients (23.5%) classified by human coders. Human chart abstraction took >2600 times longer than NLP (28 hours vs. 38 seconds). NLP identified the correct patients with 95.8% sensitivity and 97.4% specificity. NLP also identified end-of-life process measures with high sensitivity (85.7%–92.9%) and specificity (96.7%–98.9%). In the two months leading up to palliative gastrostomy placement, 20.6% of patients had goals of care discussions documented. During the index hospitalization, 67.7% had goals of care discussions documented.

Conclusions:

NLP offers opportunities to identify patients receiving palliative surgical procedures and can rapidly assess established end-of-life process measures with an accuracy approaching that of human coders.

Introduction

Palliative surgery represents 6%–20% of all operations performed by surgical oncologists and accounts for more than 1000 procedures per year at tertiary cancer centers.^1–4 While the mortality risk and other major complications of these procedures have been described in the literature, few data exist regarding associated process measures of high-quality end-of-life care, including preoperative goals of care discussions and establishment of a healthcare proxy.

Natural language processing (NLP) refers to computational methods that enable machines to process and analyze written text. Through NLP, “free text” medical notes, which represent 70%–80% of all data in electronic health records (EHRs), can be rapidly scanned to detect prespecified indicators.⁵ In this study, we developed and tested NLP methods using existing data from the EHR to (1) retrospectively identify cancer patients who received a palliative venting gastrostomy tube for refractory nausea and vomiting, and (2) identify quality benchmarks for processes of care, such as documentation of preoperative goals of care. We chose to focus on venting gastrostomy because it is among the most common palliative surgical procedures in cancer patients and is associated with a poor prognosis, with a median survival of 38 days.^6,7

Methods

Data source and study population

The primary data source was the Partners HealthCare Research Patient Data Registry. This registry gathers data from multiple hospital EHRs at Partners HealthCare. These data are linked to the EHR and include admission notes, consultation notes, procedure notes, operative reports, and discharge summaries. We focused the analysis on a single tertiary medical center captured within this database. This study was approved by the Partners Institutional Review Board (Protocol #2016P001014).

We used International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) and Current Procedural Terminology (CPT) administrative codes to identify cancer patients (ICD-9-CM 140–209) who received a gastrostomy tube (ICD-9-CM 43.11, 43.19, 44.32 or CPT 49440) from January 1, 2012, to March 31, 2016. Patients who had a gastrostomy tube placed but did not have a cancer diagnosis were not included in this study.
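The code-based cohort definition above can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the registry's query interface: the record layout (`patient_id`, `code_system`, `code`, `service_date`), the code-system labels, and the function names are all hypothetical. The sketch also applies the study window only to the gastrostomy procedure, because the text does not say whether the cancer diagnosis date was constrained.

```python
from datetime import date

# Hypothetical record layout: (patient_id, code_system, code, service_date).
GASTROSTOMY_ICD9_PROCEDURES = {"43.11", "43.19", "44.32"}
GASTROSTOMY_CPT = {"49440"}
STUDY_START, STUDY_END = date(2012, 1, 1), date(2016, 3, 31)


def is_cancer_diagnosis(code):
    """True for ICD-9-CM diagnosis codes whose category falls in 140-209."""
    category = code.split(".")[0]
    return category.isdigit() and 140 <= int(category) <= 209


def select_cohort(records):
    """Return IDs of patients with a cancer diagnosis and an in-window gastrostomy."""
    has_cancer, has_gastrostomy = set(), set()
    for patient_id, system, code, service_date in records:
        if system == "ICD9_DX":
            if is_cancer_diagnosis(code):
                has_cancer.add(patient_id)
        elif STUDY_START <= service_date <= STUDY_END:
            if system == "ICD9_PX" and code in GASTROSTOMY_ICD9_PROCEDURES:
                has_gastrostomy.add(patient_id)
            elif system == "CPT" and code in GASTROSTOMY_CPT:
                has_gastrostomy.add(patient_id)
    # Patients with a gastrostomy but no cancer diagnosis drop out here.
    return has_cancer & has_gastrostomy
```

Keeping the two criteria in separate sets makes the exclusion of non-cancer gastrostomy patients explicit as a set intersection.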

To determine if the gastrostomy tube was placed for feeding or palliative venting, we used both manual chart review and NLP methods. We then compared these methods to determine the effectiveness of NLP.

Manual chart review

Methodology for determining the indication for the gastrostomy procedure through manual chart review has been previously described.⁸ In short, a single researcher noted whether the gastrostomy was indicated for venting a malignant obstruction (palliative indication). A second researcher reviewed a 20% random sample of charts. Inter-rater agreement was excellent (κ = 0.97).
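For readers unfamiliar with the agreement statistic, the sketch below computes a kappa for two raters. It assumes the reported κ is Cohen's kappa, which the text does not state explicitly, and the function name and toy labels are illustrative only.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of charts")
    n = len(rater_a)
    # Observed proportion of charts on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)
```

A value near 1 indicates agreement well beyond what chance would produce; 0 indicates agreement no better than chance.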

Manual chart review was also used to identify documentation of validated end-of-life process measures: goals of care discussions, code status determination, palliative care consultation, and assessment for hospice.⁹ Briefly, the EHRs of 20 randomly selected patients with a total of 1710 clinical notes were annotated by 2 researchers. The presence or absence of process measures was determined at the note level. In the case of disagreement, a third human coder reviewed the specific note in question and broke the tie.

Natural language processing

NLP, in the form of regular expressions, was used to identify palliative procedures and quality process measures. Regular expressions match patterns of characters exactly as they are specified. Our NLP software, ClinicalRegex, identifies predefined keywords or phrases within the clinical notes, taking into account variations in language and punctuation. NLP identified patients who had a palliative procedure by searching for documentation of the key word “venting” within one week of their procedure.
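The keyword-plus-time-window rule described above might look like the following. This is a hedged sketch and not the ClinicalRegex implementation: the note layout (`note_date`, `text`), the function name, and the reading of "within one week" as seven days on either side of the procedure are assumptions.

```python
import re
from datetime import date, timedelta

# Whole-word, case-insensitive match for the single key word used in the study.
VENTING = re.compile(r"\bventing\b", re.IGNORECASE)
WINDOW = timedelta(days=7)  # assumption: +/- 7 days around the procedure date


def has_palliative_indication(notes, procedure_date):
    """notes: iterable of (note_date, text) pairs for one patient."""
    return any(
        abs(note_date - procedure_date) <= WINDOW and VENTING.search(text) is not None
        for note_date, text in notes
    )
```

Because the pattern is anchored on word boundaries, it matches "Venting" or "venting," but not a different word form such as "vent".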

Words and phrases that were found to be associated with a specific end-of-life process were added to a key term library (Table 1). These key term libraries were refined and validated by manual review of notes flagged by NLP, as well as manual review of notes not flagged by NLP. This iterative process resulted in improvements in NLP performance over time.

Table 1.

Natural Language Processing Keyword Library

Process measure	Keywords
Clarifying code status: Conversations with patients or family members about preferences for cardiopulmonary resuscitation and intubation. Includes limitations on life-sustaining treatment and confirmation, by the patient or family, of full code status. Does not include presumed full code status or if obtained from other sources (i.e., review of records, according to team).	Limitations on code status: dnr, dnrdni, dni, do not resuscitate, do-not-resuscitate, do not intubate, do-not-intubate, chest compressions, no defibrillation, no endotracheal intubation, no mechanical intubation, shocks, cmo, comfort measures Full code status: Full code confirmed, full code d/w, full code discussed, full code verified, would like to be full code, wishes to be full code, would like to remain full code, wishes to remain full code, wish to be full code, remaining full code, full code MOLST
Goals of care discussions: Conversations with patients or family members about the patient's goals, values, or priorities for treatment and outcomes. Includes statements that conversation occurred as well as listing specific goals.	Goals of care, goc, goals for care, goals of treatment, goals for treatment, treatment goals, family meeting, family discussion, family discussions, debility/goals of care, goc/coping, patient goals
Palliative care referral: Documentation that palliative care specialists were involved or that consultation was considered or offered, regardless of whether consultation occurred.	Pallcare, palliative care, pall care, pallcare, palliative medicine, supportive care
Hospice assessment: Documentation that hospice was discussed, prior enrollment in hospice, patient preferences regarding hospice, and assessments that the patient did not meet hospice criteria.	Hospice
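A keyword library like Table 1 can be turned into note-level flags with a few lines of code. The sketch below uses an abbreviated subset of the Table 1 terms and is only an illustration of the general technique; the dictionary keys, the function names, and the rule that spaces and hyphens are interchangeable are assumptions, not a description of how ClinicalRegex handles variation.

```python
import re

# Abbreviated subset of the Table 1 keyword library (keys are hypothetical labels).
KEYWORDS = {
    "goals_of_care": ["goals of care", "goc", "family meeting", "treatment goals"],
    "code_status": ["dnr", "dni", "do not resuscitate", "full code confirmed"],
    "palliative_care": ["palliative care", "pall care", "pallcare", "supportive care"],
    "hospice": ["hospice"],
}


def compile_term(term):
    """Whole-phrase, case-insensitive pattern tolerant of spaces vs. hyphens."""
    words = re.split(r"[\s-]+", term)
    body = r"[\s-]+".join(re.escape(word) for word in words)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


PATTERNS = {
    measure: [compile_term(term) for term in terms]
    for measure, terms in KEYWORDS.items()
}


def flag_note(text):
    """Return the set of process measures whose keywords appear in one note."""
    return {
        measure
        for measure, patterns in PATTERNS.items()
        if any(pattern.search(text) for pattern in patterns)
    }
```

For example, `flag_note("Family meeting held; pt is DNR/DNI. Hospice discussed.")` flags goals of care, code status, and hospice, while a note with none of the terms returns an empty set.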

The NLP keyword library was then used to enumerate the instances of process documentation at three distinct time points in the procedural timeline: two months before gastrostomy tube placement, during admission for gastrostomy tube placement, and after the procedure admission. The percentage of patients with documentation at each of these three time points was then calculated to determine a process measure score.
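The process measure score described above reduces to a per-window percentage. The sketch below is one way to compute it for a single process measure; it assumes "two months" is approximated as 60 days, uses the cumulative "during or after admission" window from Table 4, and takes a hypothetical input layout of flagged-note dates plus admission and discharge dates per patient.

```python
from datetime import date, timedelta

LOOKBACK = timedelta(days=60)  # assumption: "two months" approximated as 60 days
WINDOWS = ("before", "during", "during_or_after")


def window_flags(note_dates, admit, discharge):
    """Did any flagged note for this patient fall in each time window?"""
    return {
        "before": any(admit - LOOKBACK <= d < admit for d in note_dates),
        "during": any(admit <= d <= discharge for d in note_dates),
        "during_or_after": any(d >= admit for d in note_dates),
    }


def process_measure_score(patients):
    """patients: list of (flagged_note_dates, admit, discharge) for one measure.

    Returns the percentage of patients with documentation in each window.
    """
    if not patients:
        raise ValueError("at least one patient is required")
    flags = [window_flags(*patient) for patient in patients]
    return {
        window: round(100 * sum(f[window] for f in flags) / len(flags), 1)
        for window in WINDOWS
    }
```

The denominator is every patient in the cohort, including those with no flagged notes, so the score reads directly as the share of patients with documentation.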

Statistical analysis

Patient characteristics were described using means, standard deviations, interquartile ranges, and medians, as appropriate. NLP was evaluated by sensitivity and specificity when compared against manual chart review. True positives were defined as notes in which both NLP and the human coders identified a process measure. False positives were defined as notes that NLP flagged as containing a process measure but the human coders did not. False negatives were defined as notes that NLP did not flag but that the human coders scored as containing a process measure. True negatives were notes flagged by neither NLP nor the human coders.
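The note-level definitions above translate directly into the two performance metrics: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP), where a true negative is a note flagged by neither NLP nor the human coders. The following is a minimal sketch with a hypothetical function name and boolean inputs, one pair of labels per note.

```python
def sensitivity_specificity(nlp_flags, human_flags):
    """Return (sensitivity, specificity) of NLP against human-coded note labels."""
    if len(nlp_flags) != len(human_flags):
        raise ValueError("both label lists must cover the same notes")
    pairs = list(zip(nlp_flags, human_flags))
    tp = sum(1 for nlp, human in pairs if nlp and human)          # both flagged
    fp = sum(1 for nlp, human in pairs if nlp and not human)      # NLP only
    fn = sum(1 for nlp, human in pairs if not nlp and human)      # human only
    tn = sum(1 for nlp, human in pairs if not nlp and not human)  # neither
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one human-positive and one human-negative note")
    return tp / (tp + fn), tn / (tn + fp)
```

Human coding is treated as the reference standard here, so both denominators are defined by the human labels.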

Results

Patient characteristics

We identified 302 cancer patients receiving tube gastrostomy using administrative codes. Of those 302 patients, 70 had the word “venting” documented within one week of the procedure. Manual chart review identified 71 of the 302 as receiving a venting gastrostomy tube for a palliative indication, whereas NLP identified 68 of the 71 (sensitivity 95.8% and specificity 97.4%). Alternative language to indicate “venting” (e.g., “vent”) accounted for the three cases missed by NLP.
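As a check on the reported sensitivity, the counts in this paragraph can be worked through directly: manual review found 71 palliative cases, NLP found 68 of them (true positives), and 3 were missed (false negatives). The symbols TP and FN are shorthand introduced here for illustration.

```latex
\[
\text{sensitivity} = \frac{TP}{TP + FN} = \frac{68}{68 + 3} = \frac{68}{71} \approx 0.958
\]
```

This matches the 95.8% sensitivity stated above.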

Baseline demographic and clinical characteristics of patients identified by NLP are presented in Table 2. The mean age was 62.8 years, with the majority of individuals being female (70.6%), non-Hispanic white (77.9%), and relying upon private insurance (60.3%). Ovarian (16.2%), colorectal (16.2%), and pancreatic cancers (14.7%) were the most common malignancies.

Table 2.

Baseline Demographics for Patients Identified by Natural Language Processing as Receiving Gastrostomy Tube for Palliative Indication

	Patients (n = 68)
Age, mean (SD)	62.8 (11.2)
Females, n (%)	48 (70.6)
Race, n (%)
White	53 (77.9)
African-American	4 (5.9)
Hispanic	1 (1.5)
Other/no response	10 (14.7)
Insurance, n (%)
Medicare	24 (35.3)
Medicaid/Mass Health	2 (2.9)
Private	41 (60.3)
Self-pay	1 (1.5)
Cancer diagnosis, n (%)
Ovarian	11 (16.2)
Colorectal	11 (16.2)
Pancreatic	10 (14.7)
Genitourinary	6 (8.8)
Hepatobiliary	5 (7.4)
Gastric	5 (7.4)
Other	20 (29.4)
Survival past procedure, days, median (IQR)	50 (21–143)

IQR, interquartile range.

NLP identification of process measures

NLP evaluated process measures for the 68 patients who were identified as having a venting gastrostomy tube. The performance of NLP for process measures is shown in Table 3. NLP identified goals of care discussions, code status clarification, palliative care, and hospice discussions with high sensitivity (85.7%, 90.8%, 92.9%, and 89.6%, respectively) and high specificity (96.7%, 90.5%, 98.2%, and 98.9%, respectively) compared to human coders.

Table 3.

Definitions of Process Measures Based on Cancer Quality-Assessing Symptoms, Side Effects, and Indicators of Supportive Treatment^12,13

			NLP performance
	Process measure	NLP library	Sensitivity (95% CI)	Specificity (95% CI)
Denominator: Patients with billing codes for cancer AND gastrostomy procedure AND key word “venting” documented within one week of the procedure.	Patient with cancer requiring placement of venting gastrostomy.	Venting	95.8 (88.4–97.0)	97.4 (94.5–99.1)
Numerator: Patients in the denominator who have EHR documentation of process measure.	ASSIST: IF a patient is newly known to have advanced cancer after a surgery, diagnostic test, or physical examination, THEN a discussion, including prognosis and advance care planning, should be documented within one month or a reason given why such a discussion did not occur.	Goals of care	85.7 (84.6–87.4)	96.7 (89.6–91.3)
		Code status	90.8 (89.4–91.1)	90.5 (97.6–98.8)
	ASSIST: IF an outpatient with advanced cancer dies an expected death, THEN he or she should have been referred for palliative care within six months before death (hospital-based or community hospice) or there should be documentation why there was no referral.	Palliative care	92.9 (91.8–94.2)	98.2 (97.6–98.9)
		Hospice	89.6 (88.2–91.0)	98.9 (98.2–99.3)

Performance of NLP compared to gold standard human coding.

ASSIST, assessing symptoms, side effects, and indicators of supportive treatment; NLP, natural language processing.

Documentation of end-of-life quality metrics was assessed for each patient two months before procedure date, during the procedure admission, and after the procedure admission. Table 4 reports the process measure score (percent of patients with documentation for the specified quality metric) during each of these time periods. In the two months before the procedure, 25.0% of patients had conversations regarding code status documented, 20.6% had goals of care discussions documented, 10.3% had documentation of palliative care referral, and 27.9% were considered for hospice. Comparatively, during the procedure admission, 64.7% had code status documentation, 67.6% had goals of care documentation, 33.8% had documentation of palliative care, and 63.2% had documentation of hospice conversations. These percentages continued to rise when considering documentation after the procedure admission.

Table 4.

Natural Language Processing Assessed Process Measure Score Before, During, and After Admission for Venting Gastrostomy Tube

	End-of-life quality metric
	Goals of care (%)	Code status (%)	Palliative care (%)	Hospice (%)
Two months before admission	20.6	25.0	10.3	27.9
During admission	67.7	64.7	33.8	63.2
During or after admission	80.9	75.0	47.1	88.2

Discussion

In this study, we showed that NLP can be applied to EHR data to both identify patients receiving palliative procedures and analyze the documentation of end-of-life process measures. The model performed with high sensitivity and specificity (85.7%–95.8% and 90.5%–98.9%, respectively) across each domain. Process measure scores revealed an increased frequency of end-of-life conversations at the time of a palliative procedure and after the procedure. The documentation rates varied by process measure, with palliative care consultation being the least likely to be recorded.

Several methodological barriers impede the development and implementation of quality measures in palliative surgery.¹⁰ Relevant processes such as establishment of a healthcare proxy and discussion of goals of care are poorly captured by administrative data. As a result, the current literature on palliative surgery outcomes is largely limited to single-institution retrospective studies, which may not be generalizable and are influenced by regional variations in end-of-life care intensity.¹¹ Moreover, data extraction from chart review is laborious, subject to interpreter bias, and impractical in larger studies.^12–14 Multisite and population-based data are needed, but the use of large databases is hindered by the limitations of administrative claims codes.

Through NLP, we can measure the delivery of quality end-of-life care to a far greater degree than by using administrative codes alone. Documentation of patient preferences is important in providing goal-concordant care in palliative care populations.¹⁵ Measuring outcomes for palliative surgery patients should emphasize the quality and value of end-of-life care as typical metrics such as 30-day mortality are not relevant to patients with limited survival.

Although our study demonstrates the ability of NLP to identify and analyze quality measures in surgical palliative care, limitations exist. Rule-based NLP models only detect phrases in notes if they match the specified keywords. There is wide variety in free-text clinical notes, which means that a nonexhaustive keyword library will likely miss some instances of end-of-life care delivery. Yet, as demonstrated by the high reported sensitivity and specificity, these instances are rare. Finally, NLP is dependent on the quality of documentation. If clinicians deliver end-of-life care, but fail to document, then no data can be captured, no matter how advanced or rigorous the NLP.

Conclusion

NLP can effectively and accurately identify patients receiving palliative procedures and assess the quality of end-of-life care. These methods open the door to the practical implementation and measurement of quality metrics in the field of palliative surgery.

Footnotes

Acknowledgment

The authors would like to thank Elise Brannen and Kaci Seibert for help annotating clinical notes. Funding: Dr. Lindvall was supported by a Junior Faculty Career Development Award from the National Palliative Care Research Center (NPCRC), New York City, New York; Pilot Award (NINR U24) from the Palliative Care Research Cooperative Group (PCRC), Denver, Colorado; and Gloria Spivak Faculty Advancement Award, Boston, MA.

Author Disclosure Statement

No competing financial interests exist.

References

1. McCahill, Krouse, Chu, et al.: Indications and use of palliative surgery-results of Society of Surgical Oncology survey. Ann Surg Oncol, 2002; 9:104–112.

2. Krouse, Nelson, Farrell, et al.: Surgical palliation at a cancer center: Incidence and outcomes. Arch Surg, 2001; 136:773–778.

3. Miner, Brennan, Jaques: A prospective, symptom related, outcomes analysis of 1022 palliative procedures for advanced cancer. Ann Surg, 2004; 240:719–726; discussion 726–727.

4. Kwok, Hu, Dodgion, et al.: Invasive procedures in the elderly after stage IV cancer diagnosis. J Surg Res, 2015; 193:754–763.

5. Murdoch, Detsky: The inevitable application of big data to health care. JAMA, 2013; 309:1351–1352.

6. Lilley, Scott, Goldberg, et al.: Survival, healthcare utilization, and end-of-life care among older adults with malignancy-associated bowel obstruction: Comparative study of surgery, venting gastrostomy, or medical management. Ann Surg, 2018; 267:692–699.

7. Wright, Chakraborty, Helyer, et al.: Predictors of survival in patients with non-curative stage IV cancer and malignant bowel obstruction. J Surg Oncol, 2010; 101:425–429.

8. NQF-Endorsed Palliative Care and End-of-Life Care Endorsement Maintenance Standards. www.qualityforum.org/Publications/2012/04/Palliative_Final_Report.aspx. National Quality Forum. 2012 (last accessed April 1, 2018).

9. Lilley, Columbus, Cooper: Inferring palliative intent from administrative data: Validation of a claims-based case definition for venting gastrostomy tube. J Pain Symptom Manage, 2017; 53:e3–e5.

10. Lilley, Khan, Johnston, et al.: Palliative care interventions for surgical patients: A systematic review. JAMA Surg, 2016; 151:172–183.

11. Kwok, Semel, Lipsitz, et al.: The intensity and variation of surgical care at the end of life: A retrospective cohort study. Lancet, 2011; 378:1408–1413.

12. Walling, Tisnado, Asch, et al.: The quality of supportive cancer care in the veterans affairs health system and targets for improvement. JAMA Intern Med, 2013; 173:2071–2079.

13. …, Lorenz, O'Neill, et al.: Cancer Quality-ASSIST supportive oncology quality indicator set: Feasibility, reliability, and validity testing. Cancer, 2010; 116:3267–3275.

14. Aldridge, Meier: It is possible: Quality measurement during serious illness. JAMA Intern Med, 2013; 173:2080–2081.

15. Lilley, Lindvall, Lillemoe, et al.: Measuring process of care in palliative surgery: A novel approach using natural language processing. Ann Surg, 2018; 267:823–825.