Abstract
Background:
Palliative surgical procedures are frequently performed to reduce symptoms in patients with advanced cancer, but quality is difficult to measure.
Objective:
To determine whether natural language processing (NLP) of the electronic health record (EHR) can be used to (1) identify a population of cancer patients receiving palliative gastrostomy and (2) assess documentation of end-of-life process measures in the EHR.
Design/Setting:
Retrospective cohort study of 302 adult cancer patients who received a gastrostomy tube at a single tertiary medical center.
Measurements:
Sensitivity and specificity of NLP compared to gold standard of manual chart abstraction in identifying a palliative indication for gastrostomy tube placement and documentation of goals of care discussions, code status determination, palliative care referral, and hospice assessment.
Results:
Among 302 cancer patients who underwent gastrostomy, 68 (22.5%) were classified by NLP as having a palliative indication for the procedure compared to 71 patients (23.5%) classified by human coders. Human chart abstraction took >2600 times longer than NLP (28 hours vs. 38 seconds). NLP identified the correct patients with 95.8% sensitivity and 97.4% specificity. NLP also identified end-of-life process measures with high sensitivity (85.7%–92.9%) and specificity (96.7%–98.9%). In the two months leading up to palliative gastrostomy placement, 20.5% of patients had goals of care discussions documented. During the index hospitalization, 67.6% had goals of care discussions documented.
Conclusions:
NLP offers opportunities to identify patients receiving palliative surgical procedures and can rapidly assess established end-of-life process measures with an accuracy approaching that of human coders.
Introduction
Palliative surgery represents 6%–20% of all operations performed by surgical oncologists and accounts for more than 1000 procedures per year at tertiary cancer centers.1–4 While the mortality risk and other major complications for these procedures have been described in the literature, few data exist regarding associated process measures of high-quality end-of-life care, including preoperative goals of care discussions and establishment of a healthcare proxy.
Natural language processing (NLP) refers to computational methods that enable machines to process and analyze written text. Through NLP, “free text” medical notes, which represent 70%–80% of all data in electronic health records (EHRs), can be rapidly scanned to detect prespecified indicators.5 In this study, we developed and tested NLP methods using existing data from the EHR to (1) retrospectively identify cancer patients who received a palliative venting gastrostomy tube for refractory nausea and vomiting, and (2) identify quality benchmarks for processes of care, such as documentation of preoperative goals of care. We chose to focus on venting gastrostomy because it is among the most common palliative surgical procedures in cancer patients and is associated with poor prognosis, with a median survival of 38 days.6,7
Methods
Data source and study population
The primary data source was the Partners HealthCare Research Patient Data Registry. This registry gathers data from multiple hospital EHRs at Partners HealthCare. These data are linked to the EHR and include admission notes, consultation notes, procedure notes, operative reports, and discharge summaries. We focused the analysis on a single tertiary medical center captured within this database. This study was approved by the Partners Institutional Review Board (Protocol #2016P001014).
We used International Classification of Diseases, Ninth Revision, Clinical Modification (ICD9-CM) and Current Procedural Terminology (CPT) administrative codes to identify cancer patients (ICD9-CM 140–209) who received a gastrostomy tube (ICD9-CM 43.11, 43.19, 44.32 or CPT 49440) from January 1, 2012, to March 31, 2016. Patients who had a gastrostomy tube placed but did not have a cancer diagnosis were not included in this study.
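The code-based cohort definition above can be sketched as a simple filter. This is a minimal illustration, not the registry query used in the study: the record layout (`diagnosis_codes`, `procedures`) and the function names are hypothetical, and diagnosis codes are assumed to be stored in dotted form (e.g., "153.9").

```python
from datetime import date

# Study window and procedure codes as stated in the text.
STUDY_START = date(2012, 1, 1)
STUDY_END = date(2016, 3, 31)
GTUBE_ICD9_PROCEDURES = {"43.11", "43.19", "44.32"}
GTUBE_CPT = {"49440"}


def is_cancer_diagnosis(icd9_code: str) -> bool:
    """True if a dotted ICD9-CM diagnosis code falls in categories 140-209."""
    category = icd9_code.strip().split(".")[0]
    # V and E codes (e.g., "V10.05") are not numeric and are excluded.
    return category.isdigit() and 140 <= int(category) <= 209


def in_cohort(patient: dict) -> bool:
    """Hypothetical record: diagnosis_codes is a list of code strings,
    procedures is a list of (code, date) pairs."""
    has_cancer = any(is_cancer_diagnosis(c) for c in patient["diagnosis_codes"])
    has_gtube = any(
        (code in GTUBE_ICD9_PROCEDURES or code in GTUBE_CPT)
        and STUDY_START <= when <= STUDY_END
        for code, when in patient["procedures"]
    )
    return has_cancer and has_gtube
```

A patient with a qualifying procedure code but no diagnosis in the 140–209 range is excluded by this filter, which mirrors the exclusion described in the text.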
To determine if the gastrostomy tube was placed for feeding or palliative venting, we used both manual chart review and NLP methods. We then compared these methods to determine the effectiveness of NLP.
Manual chart review
Methodology for determining the indication for the gastrostomy procedure through manual chart review has been previously described.8 In short, a single researcher noted whether the gastrostomy was indicated for venting a malignant obstruction (palliative indication). A second researcher reviewed a 20% random sample of charts. Inter-rater agreement was excellent (κ = 0.97).
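For readers unfamiliar with the κ statistic, the following sketch shows how agreement between two reviewers on a binary label could be computed. It assumes the reported κ is Cohen's kappa, which the text does not state explicitly; the function name and inputs are illustrative.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning binary labels (True/False)
    to the same charts. Assumes both lists are non-empty and equal length."""
    n = len(rater_a)
    # Observed agreement: fraction of charts where the raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal rate of "palliative".
    p_a = sum(bool(a) for a in rater_a) / n
    p_b = sum(bool(b) for b in rater_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # so perfect agreement is reported by convention here.
        return 1.0
    return (observed - expected) / (1 - expected)
```

Kappa corrects raw percent agreement for the agreement expected by chance, which matters when one label (here, a non-palliative indication) is much more common than the other.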
Manual chart review was also used to identify documentation of validated end-of-life process measures: goals of care discussions, code status determination, palliative care consultation, and assessment for hospice.9 Briefly, the EHRs of 20 randomly selected patients with a total of 1710 clinical notes were annotated by 2 researchers. The presence or absence of process measures was determined at the note level. In the case of disagreement, a third human coder reviewed the specific note in question and broke the tie.
Natural language processing
NLP, in the form of regular expressions, was used to identify palliative procedures and quality process measures. Regular expressions identify patterns of characters as they are specified. Our NLP software, ClinicalRegex, identifies predefined keywords or phrases within the clinical notes, taking into account variations in language and punctuation. NLP identified patients who had a palliative procedure by searching for documentation of the key word “venting” within one week of their procedure.
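The keyword search for “venting” can be sketched with Python's standard `re` module. This is not the ClinicalRegex implementation; the function name and note representation are hypothetical, and “within one week” is assumed here to mean seven days before or after the procedure date.

```python
import re
from datetime import date, timedelta

# Whole-word, case-insensitive match for the key word named in the text.
VENTING = re.compile(r"\bventing\b", re.IGNORECASE)


def has_palliative_indication(notes, procedure_date, window_days=7):
    """notes is a list of (note_date, note_text) pairs.

    Returns True if any note dated within window_days of the procedure
    (assumed to mean before or after) contains the word "venting".
    """
    window = timedelta(days=window_days)
    return any(
        abs(note_date - procedure_date) <= window
        and VENTING.search(text) is not None
        for note_date, text in notes
    )
```

Restricting the search to a window around the procedure limits matches to notes that plausibly describe the indication for that specific gastrostomy.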
Words and phrases that were found to be associated with a specific end-of-life process were added to a key term library (Table 1). These key term libraries were refined and validated by manual review of notes flagged by NLP, as well as manual review of notes not flagged by NLP. This iterative process resulted in improvements in NLP performance over time.
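A key term library of this kind can be represented as a mapping from each process measure to a list of phrases, compiled into one regular expression per measure. The sketch below is illustrative only: the terms shown are hypothetical placeholders, not the contents of Table 1, and the function names are assumptions.

```python
import re

# Hypothetical placeholder terms; the study's actual library is in Table 1.
KEY_TERMS = {
    "goals_of_care": ["goals of care", "family meeting"],
    "code_status": ["code status", "dnr", "full code"],
    "palliative_care": ["palliative care"],
    "hospice": ["hospice"],
}


def compile_library(library):
    """Build one case-insensitive pattern per process measure.

    Words inside a phrase may be separated by any run of whitespace or
    hyphens, so "goals of care" also matches "goals-of-care".
    """
    compiled = {}
    for measure, terms in library.items():
        alternatives = [
            r"[\s\-]+".join(re.escape(word) for word in term.split())
            for term in terms
        ]
        compiled[measure] = re.compile(
            r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE
        )
    return compiled


def flag_note(text, compiled):
    """Return the set of process measures whose key terms appear in a note."""
    return {m for m, pattern in compiled.items() if pattern.search(text)}
```

Under this design, refining the library after manual review amounts to adding or removing phrases in the mapping and recompiling, without changing the matching code.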
Table 1. Natural Language Processing Keyword Library
The NLP keyword library was then used to enumerate the instances of process documentation at three distinct time points in the procedural timeline: two months before gastrostomy tube placement, during admission for gastrostomy tube placement, and after the procedure admission. The percentage of patients with documentation at each of these three time points was then calculated to determine a process measure score.
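The process measure score described above can be sketched as follows for a single process measure. The record layout and function names are hypothetical, and “two months” is assumed to mean 60 days before the procedure date; the text does not give the exact definition.

```python
from datetime import date, timedelta


def period_of(note_date, procedure_date, admit, discharge, lookback_days=60):
    """Assign a note to 'before', 'during', 'after', or None (out of range)."""
    if admit <= note_date <= discharge:
        return "during"
    if note_date > discharge:
        return "after"
    # Assumption: "two months before" is a 60-day lookback from the procedure.
    if procedure_date - timedelta(days=lookback_days) <= note_date < admit:
        return "before"
    return None


def process_measure_scores(patients):
    """Percent of patients with at least one flagged note in each period.

    Each patient is a dict with procedure_date, admit, discharge, and
    flagged_note_dates (dates of notes NLP flagged for the measure).
    """
    counts = {"before": 0, "during": 0, "after": 0}
    if not patients:
        return {name: 0.0 for name in counts}
    for p in patients:
        periods = {
            period_of(d, p["procedure_date"], p["admit"], p["discharge"])
            for d in p["flagged_note_dates"]
        }
        for name in counts:
            if name in periods:
                counts[name] += 1
    return {name: round(100 * c / len(patients), 1) for name, c in counts.items()}
```

Note that the score counts patients, not notes: a patient with several flagged notes in the same period contributes once to that period.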
Statistical analysis
Patient characteristics were described using means, standard deviations, interquartile ranges, and medians, as appropriate. NLP was evaluated by sensitivity and specificity when compared against manual chart review. True positives were defined as those notes where both NLP and the human coders agreed about the presence of a process measure. False positives were defined as the notes that NLP flagged for containing a process measure, but the human coders did not. False negatives were defined as the notes that NLP did not identify, yet were scored for a process measure by human coders. True negatives were defined as the notes that neither NLP nor the human coders flagged for a process measure.
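These definitions translate directly into a confusion-matrix calculation. The sketch below treats the human coders' labels as the reference standard; the function names are illustrative.

```python
def confusion_counts(nlp_flags, human_flags):
    """Count (tp, fp, fn, tn) for paired note-level labels.

    nlp_flags and human_flags are equal-length sequences of booleans,
    one pair per note; human_flags is the reference standard.
    """
    tp = fp = fn = tn = 0
    for nlp, human in zip(nlp_flags, human_flags):
        if nlp and human:
            tp += 1
        elif nlp:
            fp += 1
        elif human:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Share of human-positive notes that NLP also flagged."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of human-negative notes that NLP also left unflagged."""
    return tn / (tn + fp)
```

As a patient-level check against the reported results, NLP captured 68 of the 71 patients identified by manual review, so `sensitivity(68, 3)` gives 68/71, or about 95.8%.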
Results
Patient characteristics
We identified 302 cancer patients receiving tube gastrostomy using administrative codes. Of those 302 patients, 70 had the word “venting” documented within one week of the procedure. Manual chart review identified 71 of the 302 as receiving a venting gastrostomy tube for a palliative indication, whereas NLP identified 68 out of the 71 (sensitivity 95.8% and specificity 97.4%). Alternative language to indicate “venting” (e.g., “vent”) resulted in the three cases of missed identification by NLP.
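The missed cases illustrate how an exact keyword behaves with word variants. The sketch below contrasts the exact pattern with a hypothetical broader one; the broader pattern is shown only for illustration and was not evaluated in this study.

```python
import re

# Exact key word, as used in the study.
EXACT = re.compile(r"\bventing\b", re.IGNORECASE)
# Hypothetical broader pattern covering "vent", "vented", and "venting".
BROADER = re.compile(r"\bvent(?:ing|ed)?\b", re.IGNORECASE)

note = "G-tube placed to vent the stomach."

exact_hit = EXACT.search(note) is not None      # variant "vent" is missed
broader_hit = BROADER.search(note) is not None  # variant "vent" is caught
```

A broader pattern would trade some specificity for sensitivity: it does not match “ventilator” because of the word boundary, but it would match “vent” used as shorthand for a ventilator, so any such change would need the same manual validation as the original key term.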
Baseline demographic and clinical characteristics of patients identified by NLP are presented in Table 2. The mean age was 62.8 years, with the majority of individuals being female (70.6%), non-Hispanic white (77.9%), and relying upon private insurance (60.3%). Ovarian (16.2%), colorectal (16.2%), and pancreatic cancers (14.7%) were the most common malignancies.
Table 2. Baseline Demographics for Patients Identified by Natural Language Processing as Receiving Gastrostomy Tube for Palliative Indication
IQR, interquartile range.
NLP identification of process measures
NLP evaluated process measures for the 68 patients who were identified as having a venting gastrostomy tube. The performance of NLP for process measures is shown in Table 3. NLP identified goals of care discussions, code status clarification, palliative care, and hospice discussions with high sensitivity (85.7%, 90.8%, 92.9%, and 89.6%, respectively) and high specificity (96.7%, 90.6%, 98.2%, and 98.9%, respectively) compared to human coders.
Table 3. Performance of NLP Compared to Gold Standard Human Coding
ASSIST, assessing symptoms, side effects, and indicators of supportive treatment; NLP, natural language processing.
Documentation of end-of-life quality metrics was assessed for each patient two months before procedure date, during the procedure admission, and after the procedure admission. Table 4 reports the process measure score (percent of patients with documentation for the specified quality metric) during each of these time periods. In the two months before the procedure, 25.0% of patients had conversations regarding code status documented, 20.5% had goals of care discussions documented, 10.3% had documentation of palliative care referral, and 27.9% were considered for hospice. By comparison, during the procedure admission, 64.7% had code status documentation, 67.6% had goals of care documentation, 33.8% had documentation of palliative care, and 63.2% had documentation of hospice conversations. These percentages continued to rise when considering documentation after the procedure admission.
Table 4. Natural Language Processing Assessed Process Measure Score Before, During, and After Admission for Venting Gastrostomy Tube
Discussion
In this study, we showed that NLP can be applied to EHR data to both identify patients receiving palliative procedures and analyze the documentation of end-of-life process measures. The model performed with high sensitivity and specificity (85.7%–95.8% and 90.5%–98.9%, respectively) across each domain. Process measure scores revealed an increased frequency of end-of-life conversations at the time of a palliative procedure and after the procedure. The documentation rates varied by process measure, with palliative care consultation being the least likely to be recorded.
Several methodological barriers impede the development and implementation of quality measures in palliative surgery.10 Relevant processes such as establishment of a healthcare proxy and discussion of goals of care are poorly captured by administrative data. As a result, the current literature on palliative surgery outcomes is largely limited to single-institution retrospective studies, which may not be generalizable and are influenced by regional variations in end-of-life care intensity.11 Moreover, data extraction from chart review is laborious, subject to interpreter bias, and impractical in larger studies.12–14 Multisite and population-based data are needed, but the use of large databases is hindered by the limitations of administrative claims codes.
Through NLP, we can measure the delivery of quality end-of-life care to a far greater degree than by using administrative codes alone. Documentation of patient preferences is important in providing goal-concordant care in palliative care populations.15 Measuring outcomes for palliative surgery patients should emphasize the quality and value of end-of-life care, as typical metrics such as 30-day mortality are not relevant to patients with limited survival.
Although our study demonstrates the ability of NLP to identify and analyze quality measures in surgical palliative care, limitations exist. Rule-based NLP models only detect phrases in notes if they match the specified keywords. There is wide variety in free-text clinical notes, which means that a nonexhaustive keyword library will likely miss some instances of end-of-life care delivery. Yet, as demonstrated by the high reported sensitivity and specificity, these instances are rare. Finally, NLP is dependent on the quality of documentation. If clinicians deliver end-of-life care, but fail to document, then no data can be captured, no matter how advanced or rigorous the NLP.
Conclusion
NLP can effectively and accurately identify patients receiving palliative procedures and assess the quality of end-of-life care. These methods open the door to the practical implementation and measurement of quality metrics in the field of palliative surgery.
Footnotes
Acknowledgment
The authors would like to thank Elise Brannen and Kaci Seibert for help annotating clinical notes. Funding: Dr. Lindvall was supported by a Junior Faculty Career Development Award from the National Palliative Care Research Center (NPCRC), New York City, New York; Pilot Award (NINR U24) from the Palliative Care Research Cooperative Group (PCRC), Denver, Colorado; and Gloria Spivak Faculty Advancement Award, Boston, MA.
Author Disclosure Statement
No competing financial interests exist.
