Comparative Benefits and Harms of Complementary and Alternative Medicine Therapies for Initial Treatment of Major Depressive Disorder: Systematic Review and Meta-Analysis

Abstract

Objectives:

To compare the benefits and harms of exercise and complementary and alternative medicine (CAM) treatments with those of second-generation antidepressants (SGAs) for major depressive disorder (MDD).

Design:

Systematic review and meta-analysis.

Settings:

Outpatient clinics.

Subjects:

Adults, aged 18 years and older, with MDD receiving an initial treatment attempt with SGA.

Interventions:

Any CAM or exercise intervention compared with an SGA.

Outcome measures:

Treatment response, remission, change in depression rating, adverse events, treatment discontinuation, and treatment discontinuation due to adverse events.

Results:

We found 22 randomized controlled trials for direct comparisons and 127 trials for network meta-analyses, including trials of acupuncture, omega-3 fatty acids, S-adenosyl methionine, St. John's wort, and exercise. For most treatment comparisons, we found no differences between treatment groups for response and remission. However, the risk of bias of these studies led us to conclude that the strength of evidence for these findings was either low or insufficient. The risk of treatment harms and treatment discontinuation attributed to adverse events was higher for selective serotonin reuptake inhibitors than for St. John's wort.

Conclusions:

Although we found little difference in the comparative efficacy of most CAM therapies or exercise and SGAs, the overall poor quality of the available evidence base tempers any conclusions that we might draw from those trials. Future trials should incorporate patient-oriented outcomes, treatment expectancy, depressive severity, and harms assessments into their designs; antidepressants should be administered over their full dosage ranges; and larger trials using methods to reduce sampling bias are needed.

Introduction

Major depressive disorder (MDD) is the most prevalent and disabling form of depression, affecting more than 16% of U.S. adults (lifetime).¹ In any given year, nearly 7% of the U.S. adult population (∼17.5 million people in 2014) experiences an episode of MDD that warrants treatment.¹ Most patients receiving care obtain treatment in primary care settings,² where second-generation antidepressants (SGAs) are the most commonly recommended treatment interventions.³ However, some evidence supports a variety of other interventions to treat patients with depression.

Use of complementary and alternative medicines (CAM) among the general U.S. adult population is common⁴ and increases among those with chronic medical conditions. In a nationally representative sample of U.S. adults⁵ with self-reported depression or anxiety, more than 50% reported using CAM therapies. As a result, several professional organizations have published statements or practice guidelines about CAM use for MDD, including the American Psychiatric Association (APA),⁶ the Canadian Network for Mood and Anxiety Treatments (CANMAT),⁷ and the Department of Veterans Affairs.⁸ Although recommendations from these organizations support the use of St. John's wort for mild-to-moderate depression, recommendations for other CAM therapies are more equivocal because of a paucity of high-quality evidence.

Pharmacotherapy remains the most common intervention for patients with MDD. However, more than 60% of patients experience at least one adverse effect during treatment, often leading to treatment cessation.⁹ Further, ∼70% of patients with MDD do not achieve remission after initial pharmacologic treatment.¹⁰ Therefore, both patients and providers may wish to consider other treatment options, including the combination of antidepressants with CAM therapies. Previously, we reported on the comparative effectiveness of SGAs versus psychological, complementary, and exercise treatments for MDD.^11–13 Here, we present our full findings on the comparative benefits and harms of CAM and exercise interventions with SGAs for MDD. In addition, we report on the comparisons among CAM and exercise interventions themselves.

Materials and Methods

Search strategy

Detailed methods and search strategy are available in the full report on the use of nonpharmacological treatments for patients with MDD.¹² Briefly, we searched MEDLINE®, EMBASE, the Cochrane Library, AMED (Allied and Complementary Medicine Database), PsycINFO, and CINAHL (Cumulative Index to Nursing and Allied Health Literature) for studies published from January 1990 through September 2015. For comparisons of benefits, we included only randomized controlled trials (RCTs); for comparisons of harms, we also included nonrandomized studies (i.e., nonrandomized controlled trials, cohort studies, and case-control studies). We used a combination of Medical Subject Headings terms and keywords for CAM interventions. To find unpublished studies, we searched ClinicalTrials.gov, the World Health Organization's International Clinical Trials Registry Platform, Drugs@FDA, the European Medicines Agency, the National Institute of Mental Health website, the American Psychological Association website, Scopus, the Conference Proceedings Citation Index, and reference lists of pertinent reviews and included trials.

Study selection, data abstraction, and quality assessment

Two trained team members independently reviewed all abstracts and full-text articles by using predefined inclusion and exclusion criteria. Our population of interest included adult outpatients of all races and ethnicities with MDD, which was consistent with the Diagnostic and Statistical Manual, Fourth edition (DSM-IV) or diagnostic criteria. We used a structured data abstraction form. Trained reviewers initially abstracted trial data, which a senior reviewer reviewed for accuracy and completeness. Two independent reviewers assessed trial risk of bias by using the Cochrane Risk of Bias tool¹⁴ (rated as low, medium, or high). Two reviewers independently graded the strength of evidence based on the guidance established for the Evidence-based Practice Center program of the U.S. Agency for Healthcare Research and Quality.¹⁵ Grades reflect the confidence that the estimate of an outcome of interest is close to the true effect and are rated as high, moderate, low, or insufficient. At each step, disagreements were resolved by consensus or by involving a third reviewer.

Analysis plan

Because we were aware of the dearth of studies directly comparing CAM interventions, we planned to conduct network meta-analyses, which took a larger evidence base into account. Network meta-analysis allows for the estimation of comparative treatment effects across trials based on a common comparator such as placebo or a standard treatment. In the absence of head-to-head trials, such indirect comparisons allow for pooling of results to better visualize the available evidence base. When available, direct comparisons using meta-analysis are typically preferred and often represent higher strength of evidence compared with indirect comparisons. However, evidence suggests that network meta-analyses agree with head-to-head trials if component studies are similar and treatment effects are expected to be consistent in patients in different trials.¹⁶
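The idea of an indirect comparison through a common comparator can be illustrated with a minimal sketch. The example below uses the simple adjusted indirect comparison (often attributed to Bucher), which is a simpler technique than the hierarchical model we fitted; all relative risks and confidence intervals in it are hypothetical and the function names are illustrative only.

```python
import math

Z = 1.96  # normal quantile for a 95% confidence interval


def se_from_ci(lower, upper):
    """Recover the standard error of a log relative risk from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z)


def indirect_rr(rr_ac, ci_ac, rr_bc, ci_bc):
    """Compare A with B indirectly through the common comparator C.

    On the log scale the indirect estimate is the difference of the two
    direct estimates, and the variances of the two estimates add.
    """
    log_rr = math.log(rr_ac) - math.log(rr_bc)
    se = math.hypot(se_from_ci(*ci_ac), se_from_ci(*ci_bc))
    return (math.exp(log_rr),
            math.exp(log_rr - Z * se),
            math.exp(log_rr + Z * se))


# Hypothetical inputs: treatment A vs. placebo and treatment B vs. placebo.
rr, lower, upper = indirect_rr(1.50, (1.10, 2.05), 1.25, (0.90, 1.74))
print(f"A vs. B (indirect): RR {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Because the two variances add, the indirect interval is wider than either direct interval, which is one reason direct comparisons usually carry a higher strength of evidence.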

We conducted network analyses with a hierarchical frequentist approach by using random effects models.^17,18 We included all placebo- and active-controlled RCTs that were homogeneous in study populations and outcome assessments and were part of a connected network. We built on a database of relevant RCTs from a previous report on the comparative efficacy and safety of second-generation antidepressants.¹⁹ We included double-blinded RCTs of at least 6 weeks' duration. For interventions for which double blinding was not possible or not performed, we required that outcome assessors were blinded. For network meta-analyses, we excluded studies conducted only in participants who were older than 55 years of age.

Our primary outcome measure was response to treatment on the Hamilton Depression Rating Scale (HAM-D), which was defined as a 50% improvement of scores from baseline. We chose this outcome because most studies used the HAM-D and reported data on response to treatment. We recalculated response rates for each study by using the number of all randomized patients as the denominator to reflect a true intent-to-treat (ITT) analysis. With this approach, we attempted to correct variations in results of modified ITT analyses encountered in individual studies.
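The recalculation described above can be sketched as follows. This is an illustrative example only: the `Arm` structure, its field names, and the counts are hypothetical, and the response rule simply encodes the 50% improvement criterion on the HAM-D.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    randomized: int   # all patients randomized to the arm
    analyzed: int     # patients in the trial's own (modified ITT) analysis
    responders: int   # patients meeting the response criterion


def is_responder(baseline_hamd, final_hamd):
    """Response: at least a 50% improvement in HAM-D score from baseline."""
    return (baseline_hamd - final_hamd) / baseline_hamd >= 0.5


def reported_rate(arm):
    """Response rate as a trial might report it, using analyzed patients."""
    return arm.responders / arm.analyzed


def itt_rate(arm):
    """Response rate with all randomized patients as the denominator."""
    return arm.responders / arm.randomized


# Hypothetical arm: 100 randomized, 80 analyzed, 40 responders.
arm = Arm(randomized=100, analyzed=80, responders=40)
print(f"reported: {reported_rate(arm):.0%}, ITT: {itt_rate(arm):.0%}")
```

Using all randomized patients as the denominator implicitly counts patients without outcome data as nonresponders, so the recalculated rate can only be equal to or lower than the rate a trial reported.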

The data provided information on the probability of the response of treatment j out of K possible treatments in study i (p_ij). We applied a generalized linear model with random effects. The logit for the random effects model^16,17,20 can be expressed as:

\begin{align*} \operatorname{logit}(p_{ij}) = \mu_i + \delta_{ij} + \sum_{k=1}^{K} \frac{\delta_{ik}}{K}, \end{align*}

where all $\delta_{i1}$ and $(\delta_{i2}, \ldots, \delta_{ik}) \sim N\left[(d_2, \ldots, d_k), \Sigma\right]$

We fit all models by using PROC GLIMMIX in SAS version 9.3 (SAS Institute, Cary, NC), specifying a binomial likelihood and logit link function. For ease of interpretation, we present the relative risks and 95% confidence intervals (CI) of outcomes of interest for all possible comparisons among our treatments of interest. For all network meta-analyses, we conducted sensitivity analyses with and without high risk-of-bias studies, but we report findings with high risk-of-bias studies because sensitivity results were similar to the full results.
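For readers less familiar with the summary measure, the sketch below shows how a relative risk and its 95% CI can be computed from the counts of a single two-arm comparison by using the standard log-scale approximation. It is not the SAS procedure described above, and the counts are hypothetical.

```python
import math


def relative_risk(events_1, n_1, events_2, n_2, z=1.96):
    """Relative risk of group 1 vs. group 2 with a log-scale Wald interval."""
    rr = (events_1 / n_1) / (events_2 / n_2)
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(1 / events_1 - 1 / n_1 + 1 / events_2 - 1 / n_2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 45 of 100 responders vs. 30 of 100 responders.
rr, lower, upper = relative_risk(45, 100, 30, 100)
print(f"RR {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A CI that excludes 1 corresponds to a statistically significant difference at the 5% level; a CI that includes 1 does not.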

Results

Our searches for the full report identified 8,317 citations; we included 22 RCTs for direct comparisons and 127 trials for network meta-analyses (Fig. 1). We included CAM trials of acupuncture, omega-3 fatty acids, S-adenosyl methionine, and St. John's wort (Hypericum perforatum [L.]); we also included trials of exercise. No studies of meditation, mindfulness-based therapies, or yoga met our inclusion criteria. For all included trials of direct comparison, the SGA was a selective serotonin reuptake inhibitor (SSRI). For network meta-analyses, we found trials of both SSRIs and selective norepinephrine reuptake inhibitors.

FIG. 1.

PRISMA diagram for CAM and exercise treatment of major depressive disorder. CAM, complementary and alternative medicine; KQ, key question; MA, meta-analysis; PRISMA, preferred reporting items for systematic reviews and meta-analyses; SR, systematic review.

For direct comparisons, we identified 20 RCTs (22 articles) including 2,600 participants comparing a CAM therapy with an SSRI and 2 RCTs (4 articles) including 309 participants comparing exercise with an SSRI. Half of the trial comparisons were with fluoxetine; other SSRIs included sertraline (5 trials), paroxetine (2), citalopram (2), and escitalopram (1). Most trials made comparisons with moderate or low doses of the SSRIs, but no trial used the full, approved range of antidepressant dosage. All trials enrolled participants exclusively from outpatient settings and excluded patients who had additional Axis I disorders, high suicidal risk, progressive medical diseases, or who used psychotherapy, electroconvulsive therapy, or psychotropic medications. Most participants had moderate-to-severe depression as measured by the HAM-D.²¹ Treatment durations ranged from 6 to 16 weeks. Trials were conducted in a variety of countries, including Brazil (1 trial), Canada (1), China (3), Denmark (1), Germany (5), Iran (1), Sweden (1), and the United States (7). Table 1 presents study characteristics, main outcomes, and risk-of-bias ratings for all 22 CAM and exercise trials that we identified.

Table 1.

Acupuncture, Exercise, Omega-3 Fatty Acids, S-Adenosyl-L-Methionine, and St. John's Wort Versus Selective Serotonin Reuptake Inhibitors: Study Characteristics, Main Outcomes, and Risk-of-Bias Ratings

Trial	N^a	Duration (weeks)	Mean baseline HAM-D score	Treatment (dose, mg/day)^b	Response^c (%) and significance level	Remission^c (%) and significance level	Risk-of-bias rating
Acupuncture
Huang et al.²³	98	6	24.1	Fluoxetine (20–40); Scalp EA (36)	65 vs. 56; p = NR	NR	Medium
Qu et al.²⁵; Chen et al.^48,d	160	6	24.4	Paroxetine (10–40); Paroxetine + MA (18); Paroxetine + EA (18)	42 vs. 70 (MA) vs. 70 (EA); p = 0.004 for SGA vs. MA or EA	22.9 vs. 22.6 (MA) vs. 28.6 (EA); p = 0.72	Medium
Song et al.²⁴	90	6	25.3	Fluoxetine (20); EA (30)	NR	NR	High^e
Sun et al.^22,f,g	75	6	23.3	Fluoxetine (20); EA #1 (30); EA #2 (30)	60 vs. 75 vs. 75; p = 0.16	NR	High^f
Zhang et al.²⁶	80	6	24.1	Fluoxetine (20–30) + sham MA (30); Fluoxetine (10) + MA (30)	80 vs. 78; p = 0.79	NR	Medium
Exercise
Blumenthal et al.⁴²; Babyak et al.⁴⁹	156	16-week treatment; 6-month follow-up	NR; per group: range from 17 to 19	Sertraline (50–200); Aerobic exercise (three times per week); Sertraline + aerobic exercise (three times per week)	At 16 weeks: NR	68.8 vs. 60.4 vs. 65.5; p = 0.67	Medium
Blumenthal et al.⁴³; Hoffman et al.⁵⁰	153	16-week treatment	NR; per group: range from 16 to 17	Sertraline (50–200); Supervised aerobic exercise (three times per week); Home-based aerobic exercise (three times per week)	At 16 weeks: NR	47 vs. 45 vs. 40; p = 0.66	Medium
Omega-3 fatty acids
Gertsik et al.²⁸	42	8	25.3	Citalopram (20–40); EPA 1,800 + DHA 400 + other 200 + citalopram (20–40)	14 vs. 17; NR	18 vs. 44; NR	High^h
Jazayeri et al.²⁷	48	8	30.0	Fluoxetine (20); EPA (1,000); Fluoxetine 20 + EPA (1,000)	50 vs. 56 vs. 81^h; p = 0.43, p = 0.005, p = 0.009	NR	Highⁱ
SAMe
Mischoulon et al.²⁹	129	12	19.2	Escitalopram (10–20); SAMe (1,600–3,200)	34 vs. 36; p > 0.05	28 vs. 28; p > 0.05	High^j
St. John's wort
Behnke et al.³⁰	70	6	20.4	Fluoxetine (40); Calmigen (300)	66 vs. 55; p = 0.41	NR	Medium
Bjerkenstedt et al.³¹	113	4–6	24.7	Fluoxetine (20); LI160 (900)	37 vs. 38; NS	28 vs. 24; NR	Low
Brenner et al.³²	30	7	21.5	Sertraline (50–75); LI160 (600–900)	40 vs. 47; NS	NR	High^{k, l}
Davidson et al.³³	224	8	22.7	Sertraline (50–100); LI160 (900–1,500)	24 vs. 14; NR	25 vs. 24; NR	Medium
Fava et al.³⁴; Papakostas et al.^51,e	92	12	19.6	Fluoxetine (20); LI160 (900)	NR	30 vs. 38; NS	High^{m, n}
Gastpar et al.³⁵	241	12	22.1	Sertraline (50); STW3 (612)	69 vs. 74; NS	NR	Medium
Gastpar et al.³⁶	258	6	21.9	Citalopram (20); STW3-VI (900)	56 vs. 54; p = 0.63	NR	Low
Harrer et al.^37,o	161	6	NR	Fluoxetine (10); LoHyp-57 (400)	72 vs. 71; NR	NR	Medium
Moreno et al.³⁸	40	8	NR	Fluoxetine (20); Iperisan (900)	55 vs. 20; p = 0.021	12 vs. 35; NR	High^p
Schrader³⁹	240	6	19.6	Fluoxetine (20); Ze117 (500)	40 vs. 60; p = 0.05	NR	Medium
Szegedi et al.⁴⁰	251	6	25.5	Paroxetine (20–40); WS5570 (900–1,800)	73 vs. 86; p = 0.08	43 vs. 61; p = 0.02	Medium
van Gurp et al.⁴¹	90	12	19.3	Sertraline (50–100); Swiss herbal remedies (900–1,800)	NR	NR	Medium

^a Total number of randomized participants in relevant arms of trial.

^b Acupuncture and exercise dose recorded as number of sessions.

^c Response and remission are measured on the HAM-D.

^d The Chen et al. trial had a substantial overlap of participants (n = 105) with the Qu et al. trial.

^e Very little information provided on randomization procedures and analytic methods.

^f High differential attrition; completers analysis.

^g Trial included two active electroacupuncture groups, with different sets of points, designed to treat depression.

^h Fluoxetine versus EPA versus fluoxetine + EPA. p-values are for fluoxetine versus EPA, fluoxetine versus combination, and EPA versus combination, respectively.

^i Unclear randomization methods; high attrition; completers analysis.

^j High attrition.

^k For dichotomous outcomes (e.g., response and remission), we rated the risk of bias for these trials as medium because dropouts were counted as remission failures.

^l High attrition, unclear randomization methods.

^m Not included in meta-analyses because it is a reanalysis of Fava et al.³⁴

^n High attrition, unclear randomization methods.

^o Not included in response and remission meta-analyses because of the age of the trial population (60–80 years).

^p Completers analysis.

DHA, docosahexaenoic acid; EA, electroacupuncture; EPA, eicosapentaenoic acid; HAM-D, Hamilton Depression Rating Scale; MA, manual acupuncture; mg/day, milligram per day; N, number; NR, not reported; NS, reported as not significant; SAMe, S-adenosyl-L-methionine; SGA, second-generation antidepressants.

Treatment benefits and comparative effectiveness

Acupuncture

Three trials compared acupuncture monotherapy with an SSRI. Response rates were similar for interventions in two trials^22,23 and were not reported for the third.²⁴ We found no statistically significant differences in treatment response rates from network meta-analysis (RR 1.25; 95% CI: 0.71–2.2). The CI, however, encompassed clinically relevant differences between treatments.

Two trials compared acupuncture plus SSRI combination therapy with SSRI monotherapy. One trial reported a higher treatment response rate for combination manual acupuncture (70% vs. 42%, p = 0.004) and electroacupuncture (70% vs. 42%, p = 0.004) compared with paroxetine alone, but no differences in remission.²⁵ The other trial reported no difference in response rate between combination acupuncture plus fluoxetine versus sham acupuncture plus fluoxetine.²⁶

In head-to-head comparisons using network meta-analyses, acupuncture had higher response rates than omega-3 fatty acids (RR 2.45; 95% CI: 1.20–5.03); its response rates were not statistically different from those of all other interventions.

Omega-3 fatty acids

One trial compared omega-3 fatty acid monotherapy with an SSRI.²⁷ It reported no differences in treatment response. However, based on network meta-analyses, we found higher response rates for patients treated with an SSRI (RR 1.96; 95% CI: 1.26–3.05) compared with those taking omega-3 fatty acids.

Two trials compared a combination of omega-3 fatty acid plus either fluoxetine or citalopram with SSRI monotherapy. For both trials, participants treated with combination therapy were more likely to benefit than those treated with either SSRI or omega-3 fatty acid monotherapy (combination treatment vs. citalopram: 44% vs. 18% for remission, p-value not reported²⁸; combination treatment vs. fluoxetine: 81% vs. 50% for treatment response, p = 0.005, EPA monotherapy vs. fluoxetine: 56% vs. 50%, p = 0.43).²⁷

In head-to-head comparisons using network meta-analyses, omega-3 fatty acids were inferior to St. John's wort (RR 0.41; 95% CI: 0.25–0.66) and SGAs (RR 0.51; 95% CI: 0.32–0.80). We found no significant differences between omega-3 fatty acids and other interventions.

S-adenosyl methionine

We identified one trial comparing S-adenosyl-L-methionine (SAMe) monotherapy with escitalopram alone.²⁹ It reported no significant differences between interventions for treatment response and remission. Similarly, we found no differences in response rates from network meta-analysis (RR 1.22; 95% CI: 0.66–2.26). We did not identify any trials comparing SAMe combination therapy with SGA monotherapy. In head-to-head comparisons using network meta-analyses, we found no significant differences for any comparisons with SAMe.

St. John's wort

Twelve trials (1,806 participants) compared St. John's wort with various SSRIs.^30–41 Trials used a variety of commercially available standardized extracts (e.g., LI-160, WS5570, Ze117, STW3, Calmigen, Iperisan, Swiss herbal remedies), which were most often standardized to 0.12%–0.28% hypericin; doses ranged from 300 mg to 1,800 mg of the standardized extract daily. Based on the HAM-D, most participants had severe depression.

Overall, treatment response, remission, and magnitude of change on the HAM-D scale were similar between participants treated with St. John's wort and those receiving an SSRI (Fig. 2). Sensitivity analysis using dose of the SSRI or treatment duration showed no statistical difference between the SSRIs and St. John's wort. Sensitivity analysis stratified by St. John's wort preparation showed a larger benefit in treatment response relative to an SSRI for Ze117³⁹ than for other St. John's wort preparations (RR 0.66; 95% CI: 0.51–0.87), but this preparation was used in only a single trial. When stratifying by study country of origin, we found no statistical difference in estimates between studies conducted in Germany (RR 0.90; 95% CI: 0.76–1.06) and those conducted in non-German countries (RR 1.07; 95% CI: 0.85–1.33). We did not find any trials comparing combination therapy with SGA monotherapy.

FIG. 2.

St. John's wort versus second-generation antidepressants: response, remission, and change in HAM-D. (A and B) report absolute response/total participants in that study arm; change in HAM-D (C). For all plots, box size indicates relative study weights. CI, confidence interval; HAM-D, Hamilton Depression Rating Scale; SGA, second generation antidepressants; SJW, St. John's wort.

In head-to-head comparisons using network meta-analyses, St. John's wort was superior to omega-3 fatty acids (RR 2.44; 95% CI: 1.52–3.93) and SGAs (RR 1.24; 95% CI: 1.05–1.48). Its response rate was also higher than that of exercise (RR 2.37; 95% CI: 0.98–5.75), although this difference did not reach statistical significance. We found no differences between St. John's wort and other interventions.
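Several relative risks in this section describe the same comparison in opposite directions (e.g., omega-3 fatty acids vs. St. John's wort and St. John's wort vs. omega-3 fatty acids). The short sketch below shows the arithmetic linking the two forms: the point estimate is inverted and the CI bounds are inverted and swapped. The function name is illustrative; the input values are the rounded figures reported in the text.

```python
def invert_rr(rr, lower, upper):
    """Reverse the direction of a relative risk and its confidence interval."""
    # Inverting swaps the bounds: the old upper bound becomes the new lower one.
    return 1 / rr, 1 / upper, 1 / lower


# Omega-3 fatty acids vs. St. John's wort, as reported: RR 0.41 (0.25-0.66).
rr, lower, upper = invert_rr(0.41, 0.25, 0.66)
print(f"St. John's wort vs. omega-3: RR {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The inverted point estimate and lower bound reproduce the reported RR of 2.44 and lower bound of 1.52; the upper bound differs slightly from the reported 3.93 because the published inputs are rounded to two decimal places.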

Exercise

We identified two trials comparing aerobic exercise monotherapy with SSRIs.^42,43 Neither trial found a statistically significant difference in remission rates between interventions. One trial compared a combination of exercise plus sertraline with sertraline monotherapy.⁴² It reported no differences in treatment response. In network meta-analysis comparing exercise with SGAs, we found no significant difference in treatment response rate (RR 0.53; 95% CI: 0.22–1.26).

In head-to-head comparisons using network meta-analyses, exercise had lower response rates compared with St. John's wort (RR 0.42; 95% CI: 0.17–1.02), although the difference did not quite reach statistical significance. We found no significant differences between exercise and all other interventions.

Harms and treatment discontinuation

Few trials adequately assessed differences in harms; no trial reported harms data by using a validated scale. Most trials combined spontaneous patient-reported adverse events with a regular clinical examination. Rarely did authors report whether adverse events were prespecified and defined. No trial was designed to assess specific adverse events as primary outcomes.

The risk of treatment harms was higher for SSRIs than for both acupuncture (RR 3.96; 95% CI: 3.40–4.62, 21 trials, 3,128 participants) and St. John's wort (RR 1.19; 95% CI: 1.05–1.34, 8 trials, 1,427 participants) (Fig. 3). Treatment harms did not differ between combination acupuncture plus SGA and SGA alone. Evidence was absent or inadequate to draw conclusions about the risk of harms for other CAM treatments or exercise.

FIG. 3.

Comparison of overall risk of harms of second-generation antidepressants with CAM interventions. Strength of evidence rated as high, moderate, low, or insufficient based on the Agency for Healthcare Research and Quality (AHRQ) guidance. CI, confidence interval; SGA, second-generation antidepressant.

The risk of treatment discontinuation because of adverse events (Fig. 4) was higher for SGAs than for both exercise (RR 21.0; 95% CI: 1.19–269.0) and St. John's wort (RR 1.70; 95% CI: 1.12–2.60). For comparisons of other monotherapies or combination treatments for which we had at least one trial, we did not find any differences in treatment discontinuation between intervention groups (Fig. 5).

FIG. 4.

Comparison of treatment discontinuation because of adverse event rates of SSRIs with CAM interventions. Strength of evidence rated as high, moderate, low, or insufficient based on the Agency for Healthcare Research and Quality guidance.

FIG. 5.

Comparison of overall discontinuation rates from SSRIs with other CAM Interventions. Strength of evidence rated as high, moderate, low, or insufficient based on the Agency for Healthcare Research and Quality (AHRQ) guidance. CAM, complementary and alternative medicine; CI, confidence interval; SAMe, S-adenosyl-L-methionine; SOE, strength of evidence; SSRI, selective serotonin reuptake inhibitor.

Discussion

Main findings of the study

For most treatment comparisons of CAM interventions or exercise with SGAs, we found no differences between treatment groups for response and remission. However, with the exception of St. John's wort, the risk of bias of these studies was either medium (50% of studies) or high (50% of studies); this finding led us to conclude that the strength of evidence for these findings was either low or insufficient. For St. John's wort, although our meta-analysis included only trials of medium or low risk of bias, we concluded that the strength of evidence was low because the comparisons with antidepressants were done with either moderate or low doses of the antidepressants. For all CAM therapies, we concluded that more evidence of a higher quality was needed to adequately assess the benefits and harms of these treatments compared with antidepressants.

Both Canadian (CANMAT) and American (APA) guidelines support the use of omega-3 fatty acid supplementation for patients with MDD based on evidence of modest efficacy versus placebo and low risk of harms.^6,7 However, a recent systematic review concluded that the evidence to support efficacy was weak; any benefit was likely to be small and not clinically meaningful.⁴⁴ Similarly, these guidelines agree about the positive benefit of using either exercise or St. John's wort for treating patients with mild-to-moderate MDD. For all three interventions, however, we concluded that the current strength of evidence did not support a strong recommendation for their use in managing patients with MDD. Given the low risk of harms associated with the CAM treatments or exercise, clinicians may choose to use them for patients with strong interest in such therapies, but they should be cautious about prolonged treatment trials for individual patients when benefit is equivocal, especially given the demonstrated and comparable efficacy of SGAs and cognitive behavioral therapy.^13,19

For acupuncture and SAMe, evidence is not yet sufficient to recommend a treatment trial in lieu of treatments with proven efficacy. For acupuncture, treatment guidelines and evidence from high-quality systematic reviews concur that the evidence of benefit is insufficient, although the risk of harms appears to be very low.^6–8,45–47 The Canadian guideline recommends SAMe as treatment only after failure of an initial proven therapy, whereas the American guideline calls for more studies to determine efficacy and does not recommend its use. Our results do not support the use of either of these therapies for first-line treatment of MDD.

Contrary to findings for treatment response and remission, we found greater risk of harms for certain SSRIs than for either acupuncture or St. John's wort. For both comparisons, the strength of evidence was sufficient to conclude that clinicians could have confidence that differences in harms exist. Further, patients were more likely to discontinue SSRIs than St. John's wort because of treatment-specific adverse events. Given the demonstrated superiority of St. John's wort to placebo, clinicians may find it reasonable to consider an initial treatment trial in patients seeking to use this supplement. However, St. John's wort is well known to interact with a variety of other medications, so its use should be guided by practitioners with a good understanding of its herb-drug interactions.

Limitations of the evidence base

We encountered numerous methodologic shortcomings in this evidence base. Among the more important ones were unclear randomization methods, high loss to follow-up, small sample sizes, and inadequate dosing for the SSRIs used in the comparative studies. These shortcomings most often led to low strength of evidence ratings, which tempered most of our findings. Although many of the treatments we evaluated may, indeed, be beneficial for managing patients with MDD, or at least patients with mild and perhaps moderate MDD, the quality of the current evidence base precludes any estimation of comparative efficacy with well-demonstrated interventions for MDD.

Risk of harms was often not adequately assessed. Most trials did not employ an objective instrument or scale to measure harms, and few trials used a systematic approach for evaluating them. Even for trials that did assess harms, the methods were often poorly described, precluding a thorough assessment of whether the approaches were adequate and unbiased. Further, most trials were small and of a short duration; these factors limited the validity and generalizability of their harms assessments.

Treatment expectancy (i.e., patients expecting a positive outcome) was rarely factored into trial designs or results. Expectancy may play a large role in determining outcomes for CAM interventions, especially in countries where a treatment is commonly used (e.g., acupuncture in China or St. John's wort in Germany). We found only a single study, a re-analysis of the U.S. Hypericum Depression Trial,⁵² that considered the role of expectancy.⁵³ Interestingly, those researchers concluded that treatment expectancy was more strongly associated with outcomes than the treatment that patients actually received. Treatment expectancy may also play an important role for trials of acupuncture.¹¹

Next steps

Considering the various shortcomings that we identified in the evidence base, we offer the following recommendations for investigators leading future trials:

1. Study outcomes should include patient-centered measures such as functional capacity, quality of life, and comparative harms;

2. Treatment expectancy and depressive severity should be considered treatment effect modifiers;

3. Comparative studies should incorporate appropriate dose escalation protocols extending over the full range of antidepressant dosages; and

4. Additional CAM therapies with evidence of positive treatment efficacy for MDD, such as yoga and meditation, should be included in future comparative studies.

Conclusions

Although we found little difference in the comparative efficacy of most CAM therapies and SGAs for treating patients with MDD, the overall poor quality of the available evidence base tempers any conclusions that we might draw from those trials. An important exception is that SSRIs may lead to more adverse events and treatment cessation when compared with acupuncture or St. John's wort. Although this evidence base does not provide definitive answers about the comparative benefits and harms of these interventions, clinicians with knowledge of the safety profile of St. John's wort may want to consider a trial of its use for the initial treatment of MDD in patients with a strong inclination to use CAM therapy.

Footnotes

Acknowledgments

The authors would like to thank Aysegul Gozu, MD, MPH, from the U.S. Agency for Healthcare Research and Quality, for support and advice throughout the project. They also express their appreciation to RTI colleagues Meera Viswanathan, PhD, Director of the RTI International–University of North Carolina Evidence-based Practice Center, for dedicated encouragement and leadership and to Loraine Monroe, for exceptional document preparation efforts. They also extend their gratitude to Irma Klerings, MA, from Danube University, Krems, Austria, for literature searches. This study was supported by Contract 290-2012-00008i from the U.S. Agency for Healthcare Research and Quality to RTI International.

Author Disclosure Statement

This project was funded under Contract No. HHSA290201200008i from the Agency for Healthcare Research and Quality, the U.S. Department of Health and Human Services. The authors of this article are responsible for its content. Statements in the article should not be construed as endorsement by the Agency for Healthcare Research and Quality or the U.S. Department of Health and Human Services. AHRQ retains a license to display, reproduce, and distribute the data and the report from which this article was derived under the terms of the agency's contract with the authors.

This topic was nominated by the American College of Physicians and selected by AHRQ for systematic review by an EPC. A representative from AHRQ served as a Contracting Officer's Technical Representative, provided technical assistance during the conduct of the full evidence report, and provided comments on the draft versions of the full evidence report. AHRQ did not directly participate in the literature search, determination of study eligibility criteria, data analysis or interpretation, or preparation, review, or approval of the article for publication.

References

1. Kessler, Berglund, Demler, et al. The epidemiology of major depressive disorder: Results from the National Comorbidity Survey Replication (NCS-R). JAMA, 2003; 289:3095–3105.

2. Institute of Medicine. Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: The National Academies Press, 2001.

3. Mojtabai, Olfson. National patterns in antidepressant treatment by psychiatrists and general medical providers: Results from the national comorbidity survey replication. J Clin Psychiatry, 2008; 69:1064–1074.

4. Clarke, Black, Stussman, et al. Trends in the use of complementary health approaches among adults: United States, 2002–2012. National health statistics reports; no 79. Hyattsville, MD: National Center for Health Statistics, 2015.

5. Kessler, Soukup, Davis, et al. The use of complementary and alternative therapies to treat anxiety and depression in the United States. Am J Psychiatry, 2001; 158:289–294.

6. Freeman, Fava, Lake, et al. Complementary and alternative medicine in major depressive disorder: The American Psychiatric Association Task Force report. J Clin Psychiatry, 2010; 71:669–681.

7. Ravindran, Lam, Filteau, et al. Canadian Network for Mood and Anxiety Treatments (CANMAT) Clinical guidelines for the management of major depressive disorder in adults. V. Complementary and alternative medicine treatments. J Affect Disord, 2009; 117 Suppl 1:S54–S64.

8. Williams, Gierisch, McDuffie, et al. An Overview of Complementary and Alternative Medicine Therapies for Anxiety and Depressive Disorders: Supplement to Efficacy of Complementary and Alternative Medicine Therapies for Posttraumatic Stress Disorder. Washington DC: U.S. Department of Veterans Affairs, 2011.

9. Gartlehner, Thieda, Hansen, et al. Comparative risk for harms of second-generation antidepressants: A systematic review and meta-analysis. Drug Saf, 2008; 31:851–865.

10. Gartlehner, Hansen, Morgan, et al. Second-Generation Antidepressants in the Pharmacologic Treatment of Adult Depression: An Update of the 2007 Comparative Effectiveness Review. (Prepared by the RTI International–University of North Carolina Evidence-based Practice Center, Contract No. 290-2007-10056-I.) AHRQ Publication No. 12-EHC012-EF. Rockville, MD: Agency for Healthcare Research and Quality, 2011. www.effectivehealthcare.ahrq.gov/reports/final.cfm.

11. Gartlehner, Gaynes, Amick, et al. Comparative benefits and harms of antidepressant, psychological, complementary, and exercise treatments for major depression: An evidence report for a clinical practice guideline from the American College of Physicians. Ann Intern Med, 2016; 164:331–341.

12. Gartlehner, Gaynes, Amick, et al. Nonpharmacological Versus Pharmacological Treatments for Adult Patients with Major Depressive Disorder. (Prepared by the RTI International-University of North Carolina Evidence-based Practice Center, Contract No. 290-2012-00008i) Rockville, MD: Agency for Healthcare Research and Quality, 2015. www.effectivehealthcare.ahrq.gov/reports/final.cfm

13. Amick, Gartlehner, Gaynes, et al. Comparative benefits and harms of second generation antidepressants and cognitive behavioral therapies in initial treatment of major depressive disorder: Systematic review and meta-analysis. BMJ, 2015; 351:h6019.

14. Higgins JPT, Green, eds. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0. The Cochrane Collaboration, 2011. Available from http://handbook.cochrane.org

15. Berkman, Lohr, Ansari, et al. Grading the Strength of a Body of Evidence When Assessing Health Care Interventions for the Effective Health Care Program of the Agency for Healthcare Research and Quality: An Update. Methods Guide for Effectiveness and Comparative Effectiveness Reviews (Prepared by the RTI-UNC Evidence-based Practice Center under Contract No. 290-2007-10056-I) AHRQ Publication No. 13(14)-EHC130-EF. Rockville MD: Agency for Healthcare Research and Quality, 2013. www.effectivehealthcare.ahrq.gov/reports/final.cfm

16. Glenny, Altman, Song, et al. Indirect comparisons of competing interventions. Health Technol Assess, 2005; 9:1–134, iii–iv.

17. Jones, Roger, Lane, et al. Statistical approaches for conducting network meta-analysis in drug development. Pharm Stat, 2011; 10:523–531.

18. Hong, Carlin, Shamliyan, et al. Comparing Bayesian and frequentist approaches for multiple outcome mixed treatment comparisons. Med Decis Making, 2013; 33:702–714.

19. Gartlehner, Hansen, Morgan, et al. Comparative benefits and harms of second-generation antidepressants for treating major depressive disorder: An updated meta-analysis. Ann Intern Med, 2011; 155:772–785.

20. , Ades. Combination of direct and indirect evidence in mixed treatment comparisons. Stat Med, 2004; 23:3105–3124.

21. Zimmerman, Martinez, Young, et al. Severity classification on the Hamilton Depression Rating Scale. J Affect Disord, 2013; 150:384–388.

22. Sun, Zhao, Ma, et al. Effects of electroacupuncture on depression and the production of glial cell line-derived neurotrophic factor compared with fluoxetine: A randomized controlled pilot study. J Altern Complement Med, 2013; 19:733–739.

23. Huang, Htut, Li, et al. Studies on the clinical observation and cerebral glucose metabolism in depression treated by electro-scalp acupuncture compared to fluoxetine. Int J Clin Acupunct, 2005; 14:7–26.

24. Song, Zhou, Fan, et al. Effects of electroacupuncture and fluoxetine on the density of GTP-binding-proteins in platelet membrane in patients with major depressive disorder. J Affect Disord, 2007; 98:253–257.

25. , Huang, Zhang, et al. A 6-week randomized controlled trial with 4-week follow-up of acupuncture combined with paroxetine in patients with major depressive disorder. J Psychiatr Res, 2013; 47:726–732.

26. Zhang, Yang, Zhong. Combination of acupuncture and fluoxetine for depression: A randomized, double-blind, sham-controlled trial. J Altern Complement Med, 2009; 15:837–844.

27. Jazayeri, Tehrani-Doost, Keshavarz, et al. Comparison of therapeutic effects of omega-3 fatty acid eicosapentaenoic acid and fluoxetine, separately and in combination, in major depressive disorder. Aust N Z J Psychiatry, 2008; 42:192–198.

28. Gertsik, Poland, Bresee, et al. Omega-3 fatty acid augmentation of citalopram treatment for patients with major depressive disorder. J Clin Psychopharmacol, 2012; 32:61–64.

29. Mischoulon, Price, Carpenter, et al. A double-blind, randomized, placebo-controlled clinical trial of S-adenosyl-L-methionine (SAMe) versus escitalopram in major depressive disorder. J Clin Psychiatry, 2014; 75:370–376.

30. Behnke, Jensen, Graubaum, et al. Hypericum perforatum versus fluoxetine in the treatment of mild to moderate depression. Adv Ther, 2002; 19:43–52.

31. Bjerkenstedt, Edman, Alken, et al. Hypericum extract LI 160 and fluoxetine in mild to moderate depression: A randomized, placebo-controlled multi-center study in outpatients. Eur Arch Psychiatry Clin Neurosci, 2005; 255:40–47.

32. Brenner, Azbel, Madhusoodanan, et al. Comparison of an extract of hypericum (LI 160) and sertraline in the treatment of depression: A double-blind, randomized pilot study. Clin Ther, 2000; 22:411–419.

33. Davidson JRT, Gadde, Fairbank, et al. Effect of Hypericum perforatum (St John's wort) in major depressive disorder: A randomized controlled trial. J Am Med Assoc, 2002; 287:1807–1814.

34. Fava, Alpert, Nierenberg, et al. A double-blind, randomized trial of St John's wort, fluoxetine, and placebo in major depressive disorder. J Clin Psychopharmacol, 2005; 25:441–447.

35. Gastpar, Singer, Zeller. Efficacy and tolerability of hypericum extract STW3 in long-term treatment with a once-daily dosage in comparison with sertraline. Pharmacopsychiatry, 2005; 38:78–86.

36. Gastpar, Singer, Zeller. Comparative efficacy and safety of a once-daily dosage of hypericum extract STW3-VI and citalopram in patients with moderate depression: A double-blind, randomised, multicentre, placebo-controlled study. Pharmacopsychiatry, 2006; 39:66–75.

37. Harrer, Schmidt, Kuhn, et al. Comparison of equivalence between the St. John's wort extract LoHyp-57 and fluoxetine. Arzneimittelforschung, 1999; 49:289–296.

38. Moreno, Teng, Almeida, et al. Hypericum perforatum versus fluoxetine in the treatment of mild to moderate depression: A randomized double-blind trial in a Brazilian sample. Rev Bras Psiquiatr, 2006; 28:29–32.

39. Schrader. Equivalence of St John's wort extract (Ze 117) and fluoxetine: A randomized, controlled study in mild-moderate depression. Int Clin Psychopharmacol, 2000; 15:61–68.

40. Szegedi, Kohnen, Dienel, et al. Acute treatment of moderate to severe depression with hypericum extract WS 5570 (St John's wort): Randomised controlled double blind non-inferiority trial versus paroxetine. BMJ, 2005; 330:503.

41. van Gurp, Meterissian, Haiek, et al. St John's wort or sertraline? Randomized controlled trial in primary care. Can Fam Physician, 2002; 48:905–912.

42. Blumenthal, Babyak, Moore, et al. Effects of exercise training on older patients with major depression. Arch Intern Med, 1999; 159:2349–2356.

43. Blumenthal, Babyak, Doraiswamy, et al. Exercise and pharmacotherapy in the treatment of major depressive disorder. Psychosom Med, 2007; 69:587–596.

44. Appleton, Perry, Sallis, et al. Omega-3 fatty acids for depression in adults. Cochrane Database Syst Rev, 2014:CD004692.

45. Sorbero, Reynolds, Colaiaco, et al. Acupuncture for Major Depressive Disorder: A Systematic Review. Santa Monica, CA: RAND Corporation, 2015.

46. Smith, Hay, Macpherson. Acupuncture for depression. Cochrane Database Syst Rev, 2010:CD004046.

47. Melchart, Weidenhammer, Streng, et al. Prospective investigation of adverse effects of acupuncture in 97733 patients. Arch Intern Med, 2004; 164:104–105.

48. Chen, Lin, Wang, et al. Acupuncture/electroacupuncture enhances antidepressant effect of seroxat: the symptom checklist-90 scores. Neural Regen Res, 2014; 9:213–222.

49. Babyak, Blumenthal, Herman, et al. Exercise treatment for major depression: maintenance of therapeutic benefit at 10 months. Psychosom Med, 2000; 62:633–638.

50. Hoffman, Blumenthal, Babyak, et al. Exercise fails to improve neurocognition in depressed middle-aged and older adults. Med Sci Sports Exerc, 2008; 40:1344–1352.

51. Papakostas, Crawford, Scalia, et al. Timing of clinical improvement and symptom resolution in the treatment of major depressive disorder. A replication of findings with the use of a double-blind, placebo-controlled trial of Hypericum perforatum versus fluoxetine. Neuropsychobiology, 2007; 56:132–137.

52. Hypericum Depression Trial Study Group. Effect of Hypericum perforatum (St John's wort) in major depressive disorder: A randomized controlled trial. JAMA, 2002; 287:1807–1814.

53. Chen, Papakostas, Youn, et al. Association between patient beliefs regarding assigned treatment and clinical response: Reanalysis of data from the Hypericum Depression Trial Study Group. J Clin Psychiatry, 2011; 72:1669–1676.

54. Prady, Burch, Vanderbloemen, et al. Measuring expectations of benefit from treatment in acupuncture trials: A systematic review. Complement Ther Med, 2015; 23:185–199.