Abstract
Background:
Laparoscopic Heller myotomy (LHM), pneumatic dilatation (PD), and peroral endoscopic myotomy (POEM) are common treatments for esophageal achalasia. The available evidence is restricted to pairwise analyses, and a comparison of PD versus POEM is missing. The aim of this network meta-analysis (NMA) was to comprehensively compare the outcomes of these three approaches in patients with esophageal achalasia.
Materials and Methods:
PubMed, EMBASE, and Web of Science databases were consulted. A systematic review and a fully Bayesian study level arm-based random effect NMA were performed.
Results:
Nineteen studies (14 observational studies and 5 randomized controlled trials) and 4407 patients were included. Overall, 50.4% underwent LHM, 42.8% PD, and 6.8% POEM. The postoperative dysphagia remission was statistically significantly improved in POEM compared with LHM and PD (risk ratio [RR] = 1.21; 95% credible intervals [CIs] = 1.04–1.47 and RR = 1.40; 95% CIs = 1.14–1.79, respectively). Postoperative gastroesophageal reflux disease (GERD) rate was higher in POEM than in LHM and PD (RR = 1.75; 95% CIs = 1.35–2.03 and RR = 1.36; 95% CIs = 1.18–1.68, respectively). Postoperative Eckardt score was significantly lower in POEM than in LHM and PD (standardized mean difference [smd] = −0.6; 95% CIs = −1.4 to −0.2 and smd = −1.2; 95% CIs = −2.3 to −0.2, respectively). No statistically significant differences were found comparing LHM and PD in any of the analyzed outcomes.
Conclusions:
In the short-term follow-up, POEM seems to be associated with better dysphagia improvement and higher postoperative GERD than LHM and PD. The choice of the ideal initial management should be left to multidisciplinary team discussion and personalized for each patient.
Introduction
Esophageal achalasia is characterized by the loss of inhibitory innervation of the lower esophageal sphincter (LES) with consequent inadequate relaxation upon swallowing, higher baseline LES pressures, and loss of physiological peristalsis.1,2 Laparoscopic Heller myotomy (LHM), endoscopic pneumatic dilatation (PD), and peroral endoscopic myotomy (POEM) are common options in the management of idiopathic esophageal achalasia. 3
LHM and graded PD have been shown to be equally effective in previous randomized trials and meta-analyses.4,5 POEM is a promising technique that has gained increasing acceptance among gastroenterologists and surgeons. 6 Recent meta-analyses comparing POEM and LHM showed improved results in terms of dysphagia relief with higher postoperative pathologic gastroesophageal reflux disease (GERD).7,8 Previous studies analyzed these techniques in a pairwise comparison (LHM versus PD or LHM versus POEM); however, a comprehensive analysis is lacking and no definitive consensus has been reached on the best treatment.
The aim of this network meta-analysis (NMA) was to comprehensively compare the short-term outcomes of these three main approaches in patients with esophageal achalasia. In addition, it was intended to compare POEM with PD, which have not been directly compared to date.
Materials and Methods
The study was conducted according to the Preferred Reporting Items for Systematic Reviews and Network Meta-analyses (PRISMA-NMA) statement. 9 A broad literature examination between 2000 and December 31, 2018, was performed by 3 independent authors (A.A., C.G.R., and E.R.) to identify the English-written studies on LHM, PD, and POEM for the treatment of esophageal achalasia. PubMed, EMBASE, and Web of Science databases were consulted using the terms “esophageal achalasia” and “endoscopic dilatation” and “Heller myotomy” and “POEM.”
Inclusion and exclusion criteria
To be included in the study, articles had to satisfy the following conditions: (1) studies comparing outcomes among LHM, PD, and POEM; (2) articles written in English; (3) articles with the longest follow-up or the largest sample size when two or more articles were published by the same institution or study group, or used the same data set. Studies were excluded if (1) they were not written in English, (2) they had no clear methodology, or (3) they had <10 patients per study arm.
Data extraction
Three authors (A.A., E.R., and D.B.) individually extracted data from eligible studies. Analyzed data included study design, number of patients, age, gender, treatment, mean follow-up (months), improvement of dysphagia, post-procedural GERD symptoms, post-procedural evidence of reflux esophagitis, post-procedural evidence of GERD (24-hour pH study), complications, hospital length of stay (days), and mortality. Disagreements were resolved by discussion and, if no agreement could be reached, a fourth senior author (L.B.) made the decision. The protocol was recorded in PROSPERO (International prospective register of systematic reviews; CRD42018091899).
Quality assessment
Three authors (A.A., E.R., and G.B.) independently assessed the risk of bias and quality of the included studies. The Newcastle-Ottawa Scale (NOS), a 0 to 9 point scale, was adopted for observational studies. A score of 0 to 4 indicated a poor quality study, whereas a score of 5 to 9 indicated a study of high methodological quality. 10 The methodological quality of the randomized controlled trials (RCTs) was evaluated using the Jadad scale. 11 A three-item questionnaire forms the basis for the Jadad score; a trial could receive a score between 0 (poor quality) and 5 (rigorous). Disagreements on study quality were resolved by discussion.
Outcomes
Primary outcomes were postoperative dysphagia rate and postoperative GERD. Postoperative dysphagia was considered as a dichotomous variable (presence or absence). Postoperative GERD was defined as (1) presence of GERD symptoms, and/or (2) evidence of reflux esophagitis, and/or (3) abnormal 24-hour pH monitoring. Secondary outcomes were postoperative Eckardt score and iatrogenic perforation.
Statistical analysis
A fully Bayesian random effect NMA was performed, in which indirect comparisons were made by deriving information from a common comparator. Bayesian inference allows accurate estimates for small samples and should yield exact coverage that does not depend on sample size.12,13 Analysis was performed using the binomial/log likelihood.14,15 Noninformative prior distributions were normal (0, 1000) for log risk ratios and relative effects, and gamma (0.001, 0.001) for the random effect precision. An unrelated mean effects model was used for pairwise comparisons. 14 Descriptive statistics and the distributions of baseline participant variables across studies and treatment comparisons helped us to evaluate lack of transitivity.
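The idea of deriving an indirect comparison from a common comparator can be illustrated with a minimal sketch. It uses a simple adjusted indirect comparison on the log risk ratio scale, which is a frequentist shortcut and not the Bayesian arm-based model fitted in this study. The function names, the normal approximation used to back-calculate standard errors from interval limits, and all numeric inputs are hypothetical and serve illustration only.

```python
import math

def se_from_interval(lower, upper, z=1.96):
    """Approximate SE of a log risk ratio from interval limits (assumes normality)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def indirect_comparison(rr_ab, se_ab, rr_bc, se_bc, z=1.96):
    """Indirect A-versus-C risk ratio through the common comparator B."""
    log_rr_ac = math.log(rr_ab) + math.log(rr_bc)  # log ratios add
    se_ac = math.sqrt(se_ab ** 2 + se_bc ** 2)     # variances add
    lower = math.exp(log_rr_ac - z * se_ac)
    upper = math.exp(log_rr_ac + z * se_ac)
    return math.exp(log_rr_ac), lower, upper

# Hypothetical inputs: A = POEM, B = LHM (common comparator), C = PD.
rr_poem_pd, low, high = indirect_comparison(
    rr_ab=1.25, se_ab=se_from_interval(1.05, 1.50),
    rr_bc=1.10, se_bc=se_from_interval(0.90, 1.35),
)
print(f"indirect RR = {rr_poem_pd:.2f} ({low:.2f} to {high:.2f})")
```

Because the two standard errors combine, the indirect estimate is always less precise than either direct estimate that feeds it.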
Given that this network only included unclosed loops, no formal assessment of consistency was performed. Between-trial variances and the I2-index were calculated to assess heterogeneity. An I2-index value of 25% connotes low heterogeneity, 50% moderate heterogeneity, and 75% high heterogeneity. 16 Inference was performed using the posterior distribution mean and the relative 95% credible intervals (CIs), based on the marginal posterior distribution obtained by Markov chain Monte Carlo (MCMC), drawing 400,000 iterations after a burn-in period of 40,000. Statistical significance was considered when the 95% CI did not encompass the null hypothesis value. We conducted a sensitivity analysis regarding the choice of prior distribution.
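A minimal sketch of how these quantities can be computed is given here. It assumes the Higgins and Thompson definition of I2 based on Cochran's Q, which may differ from the between-trial variance formulation used in the Bayesian model, and it summarizes a posterior sample by its mean and its 2.5th and 97.5th percentiles. The function names and the simulated draws are hypothetical.

```python
import random
import statistics

def i_squared(q, df):
    """I2 (%) from Cochran's Q and its degrees of freedom, truncated at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

def posterior_summary(draws):
    """Posterior mean and equal-tailed 95% credible interval from MCMC draws."""
    cuts = statistics.quantiles(draws, n=40)  # cut points every 2.5%
    return statistics.fmean(draws), cuts[0], cuts[-1]

def excludes_null(lower, upper, null_value):
    """True when the interval does not contain the null value."""
    return lower > null_value or upper < null_value

# Simulated log risk ratio draws stand in for real MCMC output.
rng = random.Random(1)
draws = [rng.gauss(0.3, 0.1) for _ in range(4000)]
mean, lower, upper = posterior_summary(draws)
print(f"log RR = {mean:.2f} ({lower:.2f} to {upper:.2f}); "
      f"significant: {excludes_null(lower, upper, 0.0)}")
```

The null value is 1 for a risk ratio and 0 for a log risk ratio or a mean difference, so it is passed explicitly.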
Model convergence was assessed by analyzing history, running mean, density, and Brooks–Gelman–Rubin diagnostic plots.17,18 We depicted rank probabilities for all competing treatments. When missing, the variance of continuous variables was estimated. 19 All analyses were performed using JAGS and R.20,21
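The Brooks–Gelman–Rubin diagnostic compares between-chain and within-chain variability. The sketch below computes the basic potential scale reduction factor for one parameter from several equal-length chains; the diagnostic plot repeats this calculation over growing portions of the chains. The function name, the simulated chains, and the rule of thumb that values close to 1 suggest convergence are assumptions for illustration and are not taken from this study.

```python
import random
import statistics

def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    chain_vars = [statistics.variance(chain) for chain in chains]
    within = statistics.fmean(chain_vars)          # mean within-chain variance
    between = n * statistics.variance(chain_means)  # between-chain variance
    pooled = (n - 1) / n * within + between / n     # pooled variance estimate
    return (pooled / within) ** 0.5

rng = random.Random(0)
# Three chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Three chains stuck around different values (not converged).
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 3.0, 6.0)]

print(f"well-mixed chains: {gelman_rubin(mixed):.3f}")
print(f"separated chains:  {gelman_rubin(stuck):.3f}")
```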
Review of network geometry
The spectrum of comparisons among the different techniques for esophageal achalasia within the network of published studies was examined. The network geometry for each outcome was appraised separately and network graphs are provided. The connections between treatment approaches were assessed (i.e., which approaches were compared head-to-head in the selected studies and which were connected only indirectly through a common comparator), together with the amount of evidence informing each comparison.
Results
Systematic review
One thousand six hundred sixty-six publications were found. After duplicates and nonrelevant studies were removed, 421 publications were reviewed. Nineteen studies met the inclusion criteria. The selection process is shown in Figure 1. Fourteen studies were non-RCTs (10 retrospective and 4 prospective) and 5 were RCTs. The NOS for observational studies ranged from 7 to 8 and the Jadad scale for RCTs ranged from 3 to 5, suggesting a fair methodological quality of the included studies.

The PRISMA-NMA checklist diagram. PRISMA-NMA, Preferred Reporting Items for Systematic Reviews and Network Meta-analyses.
Four thousand four hundred seven patients were included. Of these, 2221 (50.4%) underwent LHM, 1888 (42.8%) PD, and 298 (6.8%) POEM. Demographic data of all patients according to the surgical treatment are presented in Tables 1 and 2. The age of the patient population ranged from 30.8 to 64.1 and 54.3% were males. Preoperative body mass index was investigated in 10 studies and ranged from 22.9 to 28.8 kg/m2. Manometric data, symptoms duration, radiological stage (I–IV), and previous treatments were not specified in all the included studies. The mean preoperative resting LES pressure in LHM, PD, and POEM was 31.8, 32, and 29.8 mmHg, respectively. The mean preoperative Eckardt score in LHM, PD, and POEM was 7.2, 7, and 7.2, respectively. Quality of life was reported in four studies using the Short-Form 36 (SF-36) questionnaire and in two studies using the Personal General Well Being score.
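The reported proportions follow directly from the patient counts given above, as the short check below shows (the variable names are illustrative).

```python
# Patient counts per treatment as reported in the text.
counts = {"LHM": 2221, "PD": 1888, "POEM": 298}

total = sum(counts.values())
# Share of each treatment, in percent, rounded to one decimal place.
shares = {name: round(100.0 * n / total, 1) for name, n in counts.items()}

print(total, shares)
```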
Demographic and Clinical Data of Patients Undergoing Laparoscopic Heller Myotomy and Pneumatic Dilatation
Values are reported as mean ± SD or median (range).
BMI, body mass index; LES, lower esophageal sphincter; LHM, laparoscopic Heller myotomy; NR, not reported; PD, pneumatic dilatation; PS, propensity score; RCT, randomized controlled trial; RS, retrospective.
Demographic and Clinical Data of Patients Undergoing Laparoscopic Heller Myotomy and Peroral Endoscopic Myotomy
Values are reported as mean ± SD or median (range).
BMI, body mass index; LES, lower esophageal sphincter; LHM, laparoscopic Heller myotomy; NR, not reported; POEM, peroral endoscopic myotomy; PR, prospective; RS, retrospective.
Network meta-analysis
Dysphagia remission
The dysphagia remission rate is similar when comparing LHM and PD (Risk Ratio [RR] = 1.20; 95% CIs = 0.89–1.39; I2 = 12.4%) and significantly improved when comparing POEM and LHM (RR = 1.21; 95% CIs = 1.04–1.47; I2 = 0.0%). The NMA indirect comparison shows that POEM is associated with a significantly improved dysphagia remission rate compared with PD (RR = 1.40; 95% CIs = 1.14–1.79) (Fig. 2a). The standard error of Log RR is 0.13 (95% CIs = 0.08–0.24) and the global I2-index is 18.9%. The rank plot (Fig. 2b) showed the empirical probabilities that each treatment is ranked first to third.
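Rank probabilities of this kind are typically obtained by ranking the treatments within each posterior draw and counting how often each treatment occupies each position. The sketch below shows that counting step; the function name and the tiny set of draws are hypothetical and do not reproduce the posterior samples of this study.

```python
def rank_probabilities(draws_by_treatment, higher_is_better=True):
    """Empirical probability that each treatment holds each rank across draws."""
    names = list(draws_by_treatment)
    n_draws = len(draws_by_treatment[names[0]])
    counts = {name: [0] * len(names) for name in names}
    for i in range(n_draws):
        # Order the treatments by their sampled value in draw i.
        ordered = sorted(names,
                         key=lambda name: draws_by_treatment[name][i],
                         reverse=higher_is_better)
        for rank, name in enumerate(ordered):
            counts[name][rank] += 1
    return {name: [c / n_draws for c in counts[name]] for name in names}

# Hypothetical sampled remission probabilities for three treatments (3 draws).
toy_draws = {
    "A": [0.90, 0.85, 0.70],
    "B": [0.80, 0.60, 0.75],
    "C": [0.70, 0.65, 0.95],
}
probs = rank_probabilities(toy_draws)
for name, row in probs.items():
    print(name, [round(p, 2) for p in row])
```

For an outcome where lower values are preferable, such as the GERD rate, `higher_is_better=False` reverses the ordering.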

Dysphagia remission.
Gastroesophageal reflux disease
The postoperative GERD is similar when comparing LHM and PD (RR = 0.74; 95% CIs = 0.28–1.37; I2 = 21.1%) and significantly higher when comparing POEM and LHM (RR = 1.75; 95% CIs = 1.35–2.03; I2 = 6.3%). The NMA indirect comparison shows higher postoperative GERD when comparing POEM and PD (RR = 1.36; 95% CIs = 1.18–1.68) (Fig. 3a). The standard error of Log RR is 0.75 (95% CIs = 0.27–1.41) and the global I2-index is 20.9%. The rank plot (Fig. 3b) shows the empirical probabilities that each treatment is ranked first to third.

Postoperative GERD.
Secondary outcomes
The postoperative Eckardt score is similar when comparing LHM versus PD (mean difference = −0.7; 95% CIs = −1.6 to 0.27; I2 = 14.3%) and significantly lower when comparing POEM versus LHM (mean difference = −0.6; 95% CIs = −1.4 to −0.2; I2 = 17.5%). Similarly, the NMA indirect comparison shows that POEM is associated with a significantly lower postoperative Eckardt score than PD (mean difference = −1.2; 95% CIs = −2.3 to −0.2) (Fig. 4a, b). The global I2-index is 13.4%.
The risk of iatrogenic esophageal perforation is similar when comparing LHM versus PD (RR = 0.59; 95% CIs = 0.21–1.31; I2 = 11.3%) and POEM versus LHM (RR = 0.66; 95% CIs = 0.23–1.82; I2 = 0.0%). The NMA indirect comparison shows that POEM is associated with a significantly lower risk of perforation than PD (RR = 0.53; 95% CIs = 0.39–0.96). The global I2-index is 0.0%.
The robustness of the results was confirmed by the sensitivity analysis, and the leverage plots do not show evidence of study outliers. For all considered outcomes, there was no evidence of MCMC nonconvergence. The assessments of confidence in the estimates using Confidence in Network Meta-Analysis (CINeMA) show moderate to very low confidence, essentially due to study limitations and imprecision.

Postoperative Eckardt score.
Discussion
This study shows that POEM seems to be associated with improved dysphagia remission and a lower postoperative Eckardt score than LHM and PD in the short term. However, POEM also seems to be associated with higher postoperative gastroesophageal reflux than LHM and PD.
LHM with or without fundoplication has represented the standard treatment for achalasia. 40 Recent studies have shown that graded endoscopic PD is as safe and effective as LHM, with similar outcomes in the medium-term follow-up.4,5 Conversely, POEM is a novel technique with promising short-term outcomes but an unknown long-term risk of gastroesophageal reflux and Barrett's esophagus.7,8,41 It has been shown that achalasia patients undergoing distal esophageal myotomy are at increased risk of developing GERD, thus suggesting that an antireflux procedure should always be added. 42 Although POEM appears to be safe and highly effective in terms of symptom resolution, it remains to be established whether improvement of dysphagia will be maintained in the long run and whether further technical refinements of the technique will keep gastroesophageal reflux under control. 6
In this meta-analysis, POEM was found to have a significantly higher symptom resolution than PD and LHM. These results are not surprising as both POEM and LHM provide one-shot and possibly definitive relief of esophageal outflow obstruction. In contrast, PD is offered as a sequential therapy that may consist of one to three repeat sessions.4,5,43,44
Data on post-POEM gastroesophageal reflux are controversial because of the limited number of patients and follow-up. Previous retrospective series suggested that POEM is associated with a high rate of pathologic reflux measured at 24-hour pH-study. Swanstrom et al. and Von Renteln et al. reported a pathologic reflux in 46% and 53.4% of patients, respectively. Moreover, esophagitis was reported in 28% and 42% of patients, respectively.45,46 A multicenter study reported esophagitis in 37.5% of patients at a median follow-up of 29 months (range 24–41). 47 By contrast, Bhayani et al. reported similar results in terms of pathologic acid exposure comparing POEM and LHM at 6 months. 32 Similarly, Teitelbaum et al. described similar postoperative reflux rates after POEM and LHM combined with Dor or Toupet fundoplication. 36 Two recent meta-analyses showed that the postoperative pathologic GERD seems significantly higher after POEM than after LHM with fundoplication.7,8 Equally, the present meta-analysis showed a higher GERD rate for POEM than for PD and LHM. As advocated by some authors, the lack of hiatal dissection, the maintenance of the anatomical antireflux mechanisms, and a selective myotomy of circular muscle layer in POEM may be helpful to significantly reduce postoperative reflux, but future studies are necessary to deeply investigate this issue.48,49
The safety of POEM has been previously demonstrated with most of the adverse events being self-limited and not altering the postoperative course and outcomes.30–39 In this meta-analysis, the iatrogenic esophageal perforation rate was significantly lower after POEM than after PD but similar to LHM. This is conceivable since esophageal myotomy performed either endoscopically or laparoscopically allows a sharp muscle layer division compared with the blunt and less controlled muscle stretching in PD. 50
In current clinical practice, the choice between the three treatment options should be guided by multidisciplinary consensus, available local expertise, severity of disease, manometric features, comorbidity, and patient expectations.51,52 The fact that PD may require sequential dilatations and that POEM may be associated with a higher incidence of GERD in the long term should be clearly discussed with the patient. In addition, besides symptom scores and objective outcomes, a comparison of comprehensive quality of life assessments for each therapeutic option is necessary to minimize the risk of over- or underestimating the effect of treatment. Combining these efforts may eventually lead to optimal decision-making shared with the patient.
This study has several limitations. RCTs were available only for the comparison of LHM versus PD, and there was no RCT comparing LHM versus POEM; this might result in a possible selection bias because of the difference in study design. The inclusion of observational studies could be considered a limitation due to their intrinsic bias; however, the a priori exclusion of observational studies in systematic reviews is inappropriate and inconsistent with a comprehensive evidence-based approach. 53 The surgical and endoscopic techniques reported in the reviewed studies varied. An antireflux procedure was not consistently added to the myotomy and, if added, either a Dor or a Toupet fundoplication was fashioned. PD was performed in a single session or in multiple sessions, using different balloon diameters and durations of inflation. Also, POEM was performed on either the anterior or the posterior esophageal wall and was either full thickness or limited to the circular muscle layer. An additional limitation of this meta-analysis is that various outcome measures and grading scales were used in the studies, and that patients were not stratified according to the Chicago classification. Finally, POEM studies have a significantly shorter follow-up time than LHM and PD studies. Further prospective high-quality studies or RCTs with direct comparison of procedures using standardized techniques and outcomes are needed to validate the results of this meta-analysis. The imprecision must be considered for some of the outcomes, and the treatment ranking should be cautiously interpreted. Therefore, surgeons should choose the most appropriate treatment by comprehensively considering the ranking, related costs, and individual experience.
As a result, multiple treatments may be necessary throughout a patient's lifetime. There is a need to provide all stakeholders (physicians, patients, health policy managers, and insurers) with both objective and patient-reported outcomes to establish the best evidence-based principles in the treatment of this disease.
Conclusions
The treatment of esophageal achalasia is evolving. Multidisciplinary team discussion and further research are needed to clarify the ideal management. In the short term, POEM seems to be more effective than LHM and PD in terms of dysphagia relief, but with higher postoperative reflux. The incidence of postoperative GERD, especially after POEM, needs further objective evaluation in both the short- and long-term follow-up. The role of POEM as the initial treatment of esophageal achalasia should be further investigated in prospective trials.
Footnotes
Acknowledgment
This study was supported by AIRES (Associazione Italiana Ricerca Esofago).
Authors' Contributions
A.A., E.R., and C.G.R. did the literature search. A.A., D.B., and L.B. formed the study design. Data collection was done by A.A., E.R., C.G.R., G.O., and G.B. A.A., G.B., and L.B. analyzed the data. A.A., D.B., and L.B. interpreted the data and wrote the article. A.A., D.B., G.M., G.C., and L.B. critically reviewed the article.
Disclaimer
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Disclosure Statement
No competing financial interests exist.
