Abstract
Barrett’s esophagus is the only known precursor for esophageal adenocarcinoma. The stepwise progression of nondysplastic Barrett’s to low-grade dysplasia, high-grade dysplasia, intramucosal carcinoma, and invasive cancer provides an opportunity for endoscopic surveillance, with the goal to detect and effectively treat early neoplastic changes endoscopically, in order to avoid surgery and chemoradiation. Multiple limitations in the current endoscopic surveillance practice have led to missed early neoplasia and therefore reduced the effectiveness of the current surveillance practice. Artificial intelligence (AI)-based clinical decision support systems have been developed to provide additional assistance to physicians performing diagnostic and therapeutic gastrointestinal endoscopy. In this article, we review the current endoscopic surveillance practice for Barrett’s esophagus, its limitations, the potential role of AI to improve these limitations, and our suggested framework to integrate AI into Barrett’s clinical practice.
Introduction
Barrett’s esophagus (BE), defined as the replacement of the squamous epithelium of the distal esophagus by intestinal metaplasia, is the only known precursor for esophageal adenocarcinoma (EAC). 1 Patients with BE have a 30–40-fold higher risk of developing EAC as compared with the general population. 2 The incidence of EAC has risen significantly in the past decade in Western countries. Although EAC and BE are known to be diseases of elderly White males, recent studies have reported an alarming 40% rise in the EAC rate among the 45–64 age group and a 50% rise in the prevalence of BE (from 304 to 466 per 100,000 patients). 3 The majority of patients with EAC are still diagnosed at advanced stages, with 5-year survival rates remaining below 20%. 4 The stepwise progression of nondysplastic BE (NDBE) to low-grade dysplasia (LGD), high-grade dysplasia (HGD), intramucosal carcinoma, and invasive cancer provides an opportunity for screening and surveillance. 5 Currently, all gastrointestinal societies recommend endoscopic surveillance once a diagnosis of BE has been established.1,5–7 The goal of endoscopic surveillance is to detect early neoplastic changes that can be treated endoscopically, without the need for chemotherapy, radiation, or surgical resection. As such, endoscopic surveillance is an integral part of the EAC prevention paradigm. Despite this, multiple limitations in the current approach result in missed early neoplasia and therefore reduce the effectiveness of the current surveillance practice.
In recent years, artificial intelligence (AI) has emerged as a promising tool to improve the care of patients with BE. Potential applications include the detection of subtle dysplasia and cancer, quality control in upper endoscopy, and reduction of the considerable interobserver variability in identifying and distinguishing dysplasia histologically. Here, we review the current endoscopic surveillance practice for BE and its limitations, the potential role of AI in addressing these limitations, and our suggested framework to integrate AI into Barrett’s clinical practice.
Current Status of BE Surveillance and Limitations
High-quality examination of the BE mucosa is the foundation of endoscopic surveillance. It entails washing and cleaning the surface mucosa; taking adequate time for inspection using both high-definition white-light imaging and chromoendoscopy, either dye-based (using methylene blue or acetic acid) or electronic (narrow band imaging [NBI], blue laser imaging, i-Scan optical enhancement [i-Scan], or Fujinon intelligent chromoendoscopy); identifying landmarks such as the squamocolumnar junction, gastroesophageal junction (GEJ), and the diaphragmatic hiatus; and defining the BE segment with the Prague C (circumferential) and M (maximal) criteria for standardized BE reporting. 8 Although a direct correlation between inspection time and dysplasia detection has been shown, the BE exam is time-consuming and challenging, and endoscopists may not adhere to these protocols, especially in busy practices. 9
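The Prague C and M values reduce to two subtractions once the landmarks have been located. The following Python sketch illustrates this under the assumption that every landmark is recorded as an insertion depth in centimeters from the incisors; the function name and parameters are hypothetical and do not come from any reporting system.

```python
def prague_classification(gej_cm, circumferential_top_cm, maximal_top_cm):
    """Return the Prague C and M values (in cm) and a formatted label.

    All inputs are insertion depths measured from the incisors, so more
    proximal landmarks have smaller values (assumed convention).
    """
    c = gej_cm - circumferential_top_cm  # circumferential extent above the GEJ
    m = gej_cm - maximal_top_cm          # maximal extent, including tongues
    if c < 0 or m < 0:
        raise ValueError("Barrett's margins must lie proximal to the GEJ")
    if m < c:
        raise ValueError("maximal extent cannot be shorter than circumferential extent")
    return c, m, f"C{c:g}M{m:g}"


# Hypothetical example: GEJ at 40 cm, circumferential margin at 38 cm,
# longest tongue reaching 35 cm.
print(prague_classification(40, 38, 35))  # (2, 5, 'C2M5')
```

Any automated measurement would still depend on correct identification of the GEJ, which is the step most subject to interobserver variability.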
Once the BE mucosa has been carefully examined, sampling should be performed. Visible lesions should be categorized based on morphological appearance in accordance with the Paris classification, sampled separately (either by the biopsy forceps or endoscopic resection) and placed in a separate bottle. 10 Due to the subtle nature of the dysplastic lesions, it can be difficult, sometimes even for Barrett’s experts, to identify these lesions.11,12 At the same time, it is known that the majority of surveillance endoscopies are conducted by non-Barrett’s experts in community settings. 13 It has been reported that the sensitivity and specificity of Barrett’s nonexperts are 79% and 75% for the detection of Barrett’s-related neoplasia, and up to 26% of neoplastic lesions are missed on initial endoscopic evaluation.15,16
The targeted biopsy/resection of the visible lesions should be followed by the Seattle protocol, defined as nontargeted four-quadrant biopsies every 2 cm (in the absence of dysplasia) or every 1 cm (with a history of dysplasia). The Seattle protocol biopsies are required because only 13% of early neoplasia appears as macroscopically visible nodules on endoscopic examination. 17 This protocol has been shown to increase the yield of dysplasia detection. 18 However, adherence to this protocol is time-consuming, with data showing that only 51.2% of endoscopists are compliant with the Seattle protocol, with rates further decreasing as the length of the BE segment increases. 19 In addition, even with full compliance with the Seattle protocol, due to the patchy distribution of dysplasia and EAC, these areas can be missed. As a result of the limitations, a significant proportion of dysplastic lesions are missed on the initial endoscopic evaluation. One systematic review and meta-analysis showed that up to 26.6% of lesions with HGD and EAC can be missed. 15 Another study reported a 26% rate for post-endoscopy esophageal carcinoma. 16
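The sampling burden implied by the Seattle protocol can be made concrete with a short calculation. The Python sketch below is illustrative only: it assumes the number of biopsy levels is the segment length divided by the sampling interval, rounded up, and the function name is hypothetical. Some endoscopists also sample both ends of the segment, which would add one level.

```python
import math


def seattle_protocol_biopsies(segment_length_cm, history_of_dysplasia=False):
    """Estimate the nontargeted biopsy levels and total specimens.

    Four-quadrant biopsies are taken every 2 cm without dysplasia, or
    every 1 cm with a history of dysplasia. Counting levels as
    ceil(length / interval) is an assumption made for this sketch.
    """
    if segment_length_cm <= 0:
        raise ValueError("segment length must be positive")
    interval_cm = 1 if history_of_dysplasia else 2
    levels = math.ceil(segment_length_cm / interval_cm)
    return levels, levels * 4  # four quadrants per level


print(seattle_protocol_biopsies(6))        # (3, 12)
print(seattle_protocol_biopsies(6, True))  # (6, 24)
```

The linear growth in specimens with segment length is consistent with the observation that adherence declines as the BE segment lengthens.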
Histology is critical in guiding the time intervals at which surveillance endoscopy should be performed after the initial evaluation. For patients with NDBE, it is recommended to perform surveillance endoscopies every 3–5 years, depending on the society guidelines and the length of the Barrett’s segment.1,5–7 Despite the criteria for the diagnosis of dysplasia being clearly defined, there is high interobserver variability among even expert pathologists, particularly in the diagnosis of LGD. 20 If LGD is detected, it should be confirmed by expert pathology review, given that most LGD diagnosed in the community is downgraded to NDBE when reviewed by experienced gastrointestinal pathologists. 20 Recent guidelines suggest that endoscopic eradication therapy (EET) can be considered for confirmed LGD cases. However, if EET is not pursued, surveillance in 6 months done by a Barrett’s expert using advanced imaging followed by annual surveillance is also reasonable. All major gastroenterology societies recommend EET for confirmed HGD, given the high rates of progression to EAC, and the efficacy and cost-effectiveness of EET in the treatment of HGD.1,5–7,21
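The histology-driven pathway described above can be summarized as a simple lookup. The Python sketch below only restates the recommendations in this section; the function name, the labels, and the routing of unconfirmed HGD to expert review are assumptions made for illustration, and the sketch is not a substitute for society guidelines.

```python
def suggested_next_step(histology, confirmed_by_expert=False):
    """Map a histology result to the management described in the text.

    histology: "NDBE", "LGD", or "HGD" (illustrative labels).
    confirmed_by_expert: whether dysplasia was confirmed on expert review.
    Sending unconfirmed HGD to expert review is an assumption; the text
    states this requirement explicitly only for LGD.
    """
    grade = histology.strip().upper()
    if grade == "NDBE":
        return "surveillance endoscopy every 3-5 years"
    if grade not in ("LGD", "HGD"):
        raise ValueError(f"unknown histology: {histology!r}")
    if not confirmed_by_expert:
        return "expert pathology review"
    if grade == "LGD":
        return "consider EET; otherwise expert surveillance at 6 months, then annually"
    return "EET recommended"


print(suggested_next_step("LGD"))
print(suggested_next_step("HGD", confirmed_by_expert=True))
```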
Early-stage EAC (also known as superficial EAC) includes Tis (also known as HGD), T1a (limited to mucosa), and T1b (limited to submucosa). Cancers deeper than these should be treated with surgical resection and lymph node dissection. In recent years, endoscopic therapy has emerged as an alternative to esophagectomy for the treatment of some superficial EAC, therefore, determining the depth of tumor invasion is extremely valuable as it can guide the optimal treatment plan. Most T1 cancers can be completely resected endoscopically, which can provide precise staging as well as potentially curative resection. Lesions limited to mucosa (T1a) can be effectively treated endoscopically via endoscopic mucosal resection (EMR) with excellent long-term outcomes reported in the literature. 22 However, the optimal treatment approach for T1b lesions is less well defined, as the risk of lymph node metastasis in T1b lesions can be significant. In general, in patients with T1b lesions who are fit for surgery, the standard of care has been esophagectomy with regional lymphadenectomy. Some studies have reported promising survival data for endoscopic resection in patients with T1b EAC with “low-risk features,” defined as lesions <2 cm, with moderately or well-differentiated histology, and superficial submucosal invasion (<500 μ or T1b SM1).23,24 The American Gastroenterological Association (AGA) and European Society of Gastrointestinal Endoscopy (ESGE) guidelines recommend EMR for most smaller esophageal lesions, reserving endoscopic submucosal dissection for selected cases, such as lesions larger than 15 mm, poorly lifting lesions, and lesions at risk for carcinoma with submucosal invasion.25,26 While determining the depth of tumor invasion is extremely valuable for guiding the optimal treatment approach, even under trained eyes, it is extremely difficult to differentiate the different depths, which can result in choosing an inappropriate resection plan.
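The "low-risk features" cited for T1b EAC combine three criteria, all of which must be met. A minimal Python sketch of that conjunction follows; the function name, parameter names, and differentiation labels are hypothetical.

```python
def has_low_risk_t1b_features(lesion_size_cm, differentiation, submucosal_depth_um):
    """Check the three low-risk criteria quoted in the text for T1b EAC.

    differentiation: "well", "moderate", or "poor" (illustrative labels).
    submucosal_depth_um: depth of submucosal invasion in micrometers.
    """
    return (
        lesion_size_cm < 2
        and differentiation.strip().lower() in {"well", "moderate"}
        and submucosal_depth_um < 500
    )


print(has_low_risk_t1b_features(1.5, "moderate", 300))  # True
print(has_low_risk_t1b_features(1.5, "poor", 300))      # False
```

Two of the three inputs (differentiation and invasion depth) are only known after resection and histology, which is why accurate pre-resection depth prediction is clinically valuable.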
Overall, the aforementioned challenges have led to modest effectiveness of endoscopic surveillance and highlight the profound need for innovative solutions to improve the effectiveness of current surveillance strategies.
AI in Gastroenterology: A Brief Review
In the past few years, AI has been increasingly utilized in medicine to improve patient outcomes and physician performance. AI comprises the use of computers or machines to perform critical analysis in cognitive tasks, similar to humans. 27 Machine learning (ML), a subdiscipline of AI, describes algorithms that learn from preexisting data. Deep learning (DL) is a subtype of ML in which a multilayered neural network, such as a convolutional neural network (CNN), receives input, learns specific patterns, and processes this information through its layers to produce an output. 28 In gastrointestinal endoscopy, DL has been the mainstay of the development and expansion of computer-aided detection (CAD) and computer-aided diagnosis (CADx). CAD systems are trained using still images to discern characteristics of neoplastic lesions, eventually progressing to video recordings, and then to live endoscopy. AI has the potential to improve several aspects of endoscopic surveillance for BE.
AI in BE
AI for the diagnosis of BE-related neoplasia (Table 1)
The use of AI in medicine has grown rapidly, improving dictation and electronic medical record (EMR) software in clinics. It has also been incorporated into hardware, creating intelligent prostheses and surgical assistant robots. 29 Neural networks are a subset of AI composed of interconnected nodes arranged in layers, including an input layer and an output layer. 28 When the output of one node reaches a certain threshold, it activates and passes its information to the next node. DL refers to a neural network of three or more layers that can analyze unstructured information in the form of text or images and categorize it based on structured training. 28 DL was first applied in endoscopy as an adjunct in colorectal cancer surveillance. 30 Computer-aided diagnosis (CADx) systems were trained using still images to discern characteristics of neoplastic lesions, eventually progressing to video recordings, and then to live endoscopy. These CADx systems could predict histological diagnoses of lesions depicted on NBI with a sensitivity and specificity of 93%. 31 Given the current performance deficiencies in Barrett’s surveillance, application of DL methods was not only needed but also welcomed. AI systems can improve the detection of early neoplastic lesions suitable for minimally invasive endoscopic treatment, which is the goal of endoscopic surveillance. This improved diagnostic accuracy in the community has the potential to reduce the number of referral cases and associated costs, especially once a standardized CNN is offered to community sites. In addition, AI-powered decision support can increase community physicians’ confidence, limiting unnecessary repeat upper endoscopies and thereby avoiding adverse events and reducing costs.
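The threshold behavior of a node can be illustrated with a toy forward pass. The Python sketch below is a deliberately simplified illustration with hand-picked weights, not a trained network or any published CADx architecture; all names and values are hypothetical. With a threshold of zero, the activation shown is equivalent to the widely used rectified linear unit.

```python
def node_output(inputs, weights, bias, threshold=0.0):
    """Weighted sum of the inputs; passed on only if it exceeds the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return weighted_sum if weighted_sum > threshold else 0.0


def forward(inputs, layers):
    """Propagate inputs through a list of layers.

    Each layer is a list of (weights, bias) pairs, one pair per node.
    """
    activations = inputs
    for layer in layers:
        activations = [node_output(activations, w, b) for w, b in layer]
    return activations


# Two-input toy network: one hidden layer with two nodes, one output node.
hidden = [([1.0, -1.0], 0.0), ([1.0, 1.0], -4.0)]
output = [([1.0, 1.0], 0.0)]
print(forward([2.0, 1.0], [hidden, output]))  # [1.0]
```

In this example the second hidden node stays silent because its weighted sum (-1.0) falls below the threshold, so only the first node contributes to the output.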
With AI, random biopsies (Seattle protocol) can be replaced by “high-quality targeted biopsies.” This can improve the sampling sensitivity and reduce missed lesions, speed up examination time, limit the use of staining resulting in shorter test time and improved patient experience, and reduce the number of biopsies and associated costs.
Initial Barrett’s AI studies utilized small datasets of retrospectively collected still images and demonstrated impressive diagnostic performance of CNN, with sensitivity, specificity, and accuracy of 83–96%, 42–100%, and 85–92%.14,32–46
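For readers comparing the figures reported across studies, sensitivity, specificity, and accuracy all derive from the same four counts. The Python sketch below shows the standard definitions; the counts in the example are hypothetical and are not drawn from any cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: neoplastic cases called neoplastic / missed.
    tn/fp: non-neoplastic cases called non-neoplastic / falsely flagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical test set: 100 neoplastic and 100 non-neoplastic images.
print(diagnostic_metrics(tp=90, fp=20, tn=80, fn=10))  # (0.9, 0.8, 0.85)
```

Because accuracy depends on the proportion of neoplastic cases in the test set, studies with enriched datasets are not directly comparable with those that reflect real-world prevalence.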
Studies Applying AI in the Detection of Neoplasia in Patients with Barrett’s Esophagus
AI, artificial intelligence; AUROC, area under the receiver operating characteristic curve; BE, Barrett’s esophagus; BERN, Barrett’s esophagus-related neoplasia; CAD, computer-aided detection; CLI, color-linked imaging; CNN, convolutional neural network; EAC, esophageal adenocarcinoma; EC, esophageal cancer; GE, gastroesophageal; H&E, hematoxylin and eosin; HGD, high-grade dysplasia; LGD, low-grade dysplasia; NBI, narrow band imaging; NDBE, nondysplastic Barrett’s esophagus; NLP, natural language processing; WLI, white light imaging.
The performance of CNN in these early studies inspired further research, integrating CNN in real-time video analysis and direct comparison with general endoscopists. Video-based studies trained CNN to assess the lumen in real time, breaking down the screen into 60 × 60 pixel frames, labeling them as either “neoplastic” or “non-neoplastic.” Studies investigating CNN using real-time video identified BE lesions from video frames with accuracy, sensitivity, and specificity ranges of 84–90%, 83–92%, and 82–89%.47–56 Next, studies bolstered their datasets to enhance neural network training. The Barrett's Oesophagus Imaging for Artificial Intelligence (BONS-AI) consortium, for example, used data from 15 international centers demonstrating a synergistic effect of CADx during video, improving endoscopist sensitivity from 67% to 79% without compromising specificity. 51 Recent studies utilized prospective data to diversify the dataset in which CNN is tested, with sensitivity and specificity of 84–100% and 79–100%, respectively.36,45,46,57,58 These studies directly assess the performance of CNN using images of early BE neoplasia and non-BE dysplasia, benchmarked by performance by general endoscopists. 59 The methodological rigor of recent studies has shown that CNN can be an effective tool in improving endoscopic detection of BE neoplasia in real time, with a synergistic effect when used with endoscopists. These promising results have highlighted a performance benefit over conventional diagnostic endoscopy, calling into question the current gold standard. Application of CNN raises the diagnostic performance of BE detection during screening endoscopy, closing the diagnostic performance gap between community hospitals and specialized centers.
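The patch-based approach described for video analysis can be sketched as follows. This Python example only shows how a frame could be divided into 60 × 60 pixel tiles and each tile labeled; `classify_tile` is a hypothetical placeholder for a trained CNN, and discarding partial tiles at the frame edges is an assumption rather than a detail reported by the studies.

```python
def tile_frame(frame_height, frame_width, tile_size=60):
    """Yield (top, left, bottom, right) boxes for non-overlapping tiles.

    Partial tiles at the right and bottom edges are dropped (assumption).
    """
    for top in range(0, frame_height - tile_size + 1, tile_size):
        for left in range(0, frame_width - tile_size + 1, tile_size):
            yield top, left, top + tile_size, left + tile_size


def label_tiles(frame_height, frame_width, classify_tile, tile_size=60):
    """Apply a classifier to every tile and return {box: label}."""
    return {
        box: classify_tile(box)
        for box in tile_frame(frame_height, frame_width, tile_size)
    }


def classify_tile(box):
    """Placeholder classifier: flags one fixed tile so the sketch runs."""
    return "neoplastic" if box == (60, 60, 120, 120) else "non-neoplastic"


labels = label_tiles(120, 180, classify_tile)
print(len(labels))                  # 6
print(labels[(60, 60, 120, 120)])   # neoplastic
```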
AI for depth of invasion (Table 2)
As stated earlier, preoperative differentiation between mucosal (T1a) and submucosal (T1b) cancers has relevant therapeutic and prognostic implications. A pivotal study by Knabe et al. demonstrated an overall accuracy rate of 73% in the classification of T stages in Barrett’s cancer. 60 Another study by Ebigbo et al. demonstrated sensitivity and specificity of 77% and 64% for differentiating between T1a and T1b lesions. 61 Van der Laan et al. demonstrated that AI-assisted gastroenterologists performed better than their unassisted counterparts in the identification of dysplasia in optical biopsies of BE generated by endocytoscopy. 44 Although its diagnostic performance currently remains comparable to current gold standard measures, the CADx system can improve with more robust training regimens and expansion of labeled datasets, with real potential to serve as a timely and accurate biopsy predictor during real-time endoscopy.
Performance of Studies Applying AI for Predicting the Depth of Invasion
BE, Barrett’s esophagus; CNN, convolutional neural network.
AI for Barrett’s related histology (Table 3)
As stated earlier, the clinical care of patients with esophageal diseases relies on the histological evaluation of biopsies and resection specimens. In this regard, the application of AI to digital endoscopy will play a key role in improving and standardizing clinical care. Gehrung et al. applied DL to the Cytosponge-TFF3 test, a noninvasive diagnostic test for BE. 62 DL was implemented to create a human-in-the-loop triage process, with AI flagging difficult cases for expert pathology review with similar sensitivity and specificity to fully manual reviews. 62 DL can interpret voice biomarkers, particularly voice signal periodicity, to differentiate patients with normal esophageal mucosa, gastroesophageal reflux disease (GERD), and BE. 63 TissueCypher implements DL to interpret biomarker expression and tissue structures of esophageal biopsies to generate scores denoting low, mid, and high risk of progression from NDBE to HGD or EAC. 64 AI has also been applied to augment primary prevention through the advent of electronic health record-based ML models. These models can identify high-risk patients from EMR documentation by generating a predictive risk score, improving the efficacy of primary prevention screening. 65 CNN has also been applied in the analysis of histopathology.66–69 Tomita et al. developed a CNN that differentiated normal tissue, NDBE, BE with dysplasia, and adenocarcinoma with a mean accuracy of 83%. 70 Faghani et al. demonstrated CNN could categorize histopathology slides into NDBE, LGD, or HGD with an F1 score between 0.91 and 1.0. 71 When applied with wide angle transepithelial sampling, computerized microscopy can improve diagnostic accuracy of BE.72,73 This technology creates a three-dimensional rendering of the tissue biopsy, utilizing neural networks to compare it with thousands of images of BE dysplasia. 
In turn, this CNN can identify pathological biopsies with much greater accuracy and reduce interobserver variability among pathologists compared with forceps biopsy. 74
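The F1 score quoted for histopathology classification is the harmonic mean of precision and recall, which simplifies to a single expression in the confusion-matrix counts. The Python sketch below uses that standard definition; the counts are hypothetical and do not reproduce any cited result.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        raise ValueError("F1 is undefined when there are no positives or predictions")
    return 2 * tp / denominator


# Hypothetical per-class counts for one histology category.
print(round(f1_score(tp=91, fp=9, fn=9), 2))  # 0.91
```

For a multiclass problem such as NDBE versus LGD versus HGD, the score is computed per class in this way, which is why a range rather than a single value is reported.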
Performance of Studies Applying AI for Predicting Histology
AUROC, area under the receiver operating characteristic curve; BE, Barrett’s esophagus; CNN, convolutional neural network; H&E, hematoxylin and eosin; HGD, high-grade dysplasia; LGD, low-grade dysplasia; NDBE, nondysplastic Barrett’s esophagus; NLP, natural language processing.
AI for quality control of upper gastrointestinal endoscopy (Table 4)
AI has the potential to be integrated during endoscopy to augment efficiency and assure quality. Quality improving systems such as WISENSE can monitor blind spots, automatically generate photo documentation, and time the procedure, improving not just Adenoma Detection Rate (ADR) but also endoscopy efficiency. 75 In addition, AI has been implemented to reduce interobserver variance, with algorithms designed to automatically detect the GEJ and interpret BE extension using the Prague classification (Table 4).61,62 One study developed an AI system that automatically identifies the squamous-columnar junction and GEJ on images. 57 Another study worked on an AI system that automatically determined the extension of the BE according to the Prague classification. 76 With the extension of BE as a relevant factor for risk stratification, AI-assisted standardized and automated reporting has the potential to significantly improve patient care.
Performance of Studies Applying AI for Quality Improvement
BE, Barrett’s esophagus; CNN, convolutional neural network.
AI can integrate the results of the endoscopy to produce guideline-based recommendations on further surveillance measures, while also completing billing documentation. With AI, an endoscopy “report card” can be calculated, analyzing the endoscopist’s ability to avoid blind spots, effectively irrigate, resect polyps, and biopsy suspicious lesions. For training programs, CNN can be used as a standardized measure to evaluate the competency of trainees as they transition to independent practice. These advantages serve to improve not just the efficiency but also the performance of endoscopists (Table 5).
Applications of AI as a Quality Improvement Tool
AI, artificial intelligence; BE, Barrett’s esophagus; CNN, convolutional neural network.
AI systems approved for clinical use
Several AI systems have been developed, demonstrating impressive ability of the CADx systems to identify dysplasia and early EAC. WISE VISION (NEC Corp, Tokyo, Japan) can differentiate between NDBE and dysplastic BE and offers a visual representation of the area that has been classified as HGD or neoplastic. 77 CADU (Odin Medical Ltd, London, UK) can differentiate between NDBE and dysplastic BE and offers a visual representation when the observed image is deemed dysplastic. 78
Barriers to implementation
Several barriers have slowed the integration of AI into the endoscopy suite. From a development standpoint, more diverse, well-annotated images are needed to create validated, robust neural networks. Preliminary studies highlight the limitations of training ML models exclusively on high-quality image datasets, which yields neural networks that generalize poorly to real-world conditions.33,42,79
From a financial standpoint, governing bodies must purchase AI products designed by consortiums and industry companies, evaluating the expected life of the machines and the need for upgrading software. To allow for equitable access, medical centers across the authority’s jurisdiction would need to be accounted for. This results in both a substantial one-time cost and recurring costs for upgrades and repair.
For experienced endoscopists, AI is a novel, and at times, frighteningly powerful technology. Machine algorithms operate decisively and more accurately than humans; however, users still feel entitled to transparency and accountability in decision-making, coined the “black box” criticism.38,80 Thankfully, systems are being trained to integrate explanations into decision-making, allowing the user to make an informed decision and overrule when needed. 81
Procedurally, the AI-generated alert boxes during endoscopy can be viewed as “distracting” or “redundant” by users. 82 Thankfully, the user experience will only improve with continual feedback, with ongoing studies aimed to minimize distractions. 83 Ultimately, CNN should be viewed as a support tool with immense growth potential, aimed to improve patient care and physician efficiency.
A proposed value chain
The promise of AI is to improve patients’ outcomes, physician performance, and workload. To summarize the various benefits of integrating AI into Barrett’s clinical practice, we have developed a value chain, encompassing the entire continuum of care.
Once AI becomes a clinical reality available to physicians, the proposed use during upper endoscopy will be with live video images that will be sent to the AI application and analyzed in real time. The application will be a full workflow solution starting with a quality algorithm capable of monitoring blind spots and mucosal cleanliness. It will then help reduce interobserver variability by automatically measuring the size of the Barrett’s segment and documenting the Prague classification. Throughout the procedure, blind spots would be identified, and photos automatically captured and stored (Figure 1).

A potential value chain depicting artificial intelligence (AI) integration in Barrett’s esophagus (BE) surveillance and management.
As CADx is integrated into surveillance endoscopy, detection rates of HGD and early EAC can improve. Optical biopsy could also spare patients further diagnostic procedures for tissue sampling and streamline therapeutic management. These measures can reduce health care costs through more effective malignancy screening, lowering the costs associated with chemotherapy and surgical interventions. Moreover, they can improve patient safety by protecting patients from the risks of surgery or further, unnecessary investigations. Endoscopists also stand to benefit from the integration of CADx systems. At the end of the procedure, an impression would be generated, summarizing the salient findings and providing recommendations on management and surveillance interval based on guideline recommendations. The AI tool could then proceed with automated clinical and procedural documentation that is incorporated into daily clinical practice. AI tools will likely also be available to help optimize patient scheduling and procedural workflow.
Conclusions
EAC is a morbid disease, in part because dysplasia and early EAC are poorly identified during surveillance endoscopy. 84 AI offers convenience and growth opportunities, creating an “arms race” between consortiums to build CNNs that are accurate, reliable, and scalable. CADx systems can differentiate NDBE from LGD, HGD, and early EAC with impressive accuracy, sensitivity, and specificity. Their ability to predict the depth of infiltration could offer substantial financial and safety benefits. In addition, CNNs can improve endoscopic efficiency, offer comparative performance metrics, and serve as a new gold standard for trainees. 85 Their multiple applications in quality improvement and assurance make them a worthwhile investment for private medical centers and governing health authorities. For the practicing gastroenterologist, this technology still requires further validation before routine incorporation, but nonetheless has the exciting potential to advance the field of endoscopy.
Footnotes
Authors’ Contributions
Conception and design: J.B., H.J.K., M.F.B., and N.P. Analysis and interpretation of data: J.B., M.F.B., and N.P. Drafting of article: J.B. and H.J.K. Critical revision of the article for important intellectual content: M.F.B. and N.P. Final approval of the article: M.F.B. and N.P.
Author Disclosure Statement
N.P.: VP of Medical Affairs at Satisfai Health, Speaker for Phathom Pharmaceuticals. M.F.B.: Founder, Chairman, and CMO at Satisfai Health. J.B.: No conflicts to disclose. H.J.K.: No conflicts to disclose.
Funding Information
No funding was received for this article.
