Dear Editor:
Comments on the review have criticized its methodology and conclusions on the basis that it excluded many studies relevant to the research question. Walach et al. criticized it for only including RCTs, pointing out that randomization to a meditation group is incompatible with the way meditation is actually practiced, in which the person makes a commitment to change their personal habits and dedicates time to work meditation into their schedule. They argue that meditation effects can be best assessed by long-term cohort studies, which compare people practicing meditation with a similar group of people receiving usual care. Walach et al. cite cohort studies on depression, which show much stronger effects than the RCTs reported by Goyal et al. 7 In another comment, Loucks also pointed out that only including RCTs excluded many relevant studies and was contrary to the 2009 Institute of Medicine report that called for studies comparing meditation techniques with usual care. 8
Goyal et al. replied that RCTs do require engagement because the subjects make a conscious choice to participate in such trials. 9 If that is so, however, then RCTs, like cohort studies, do not control for self-selection, and hence do not control for unknown prognostic variables that may accompany self-selection, such as genetic factors. RCTs are considered the “gold standard” of medical research because they provide two equivalent groups for which treatment allocation is the only difference, which in theory allows any significant difference observed between the groups to be interpreted as caused by the treatment. However, for behavioral studies, in which self-selection cannot truly be eliminated, RCTs appear to be little better than controlled studies (CTs) for establishing causality. Like RCTs, CTs can create groups that are equivalent on pertinent covariates, such as age, sex, health status, and medication regimen, and they appear to be as good at establishing valid conclusions. For example, a study of the truth survival of 474 conclusions from meta-analyses, CTs, and RCTs in research on cirrhosis and hepatitis from 1945 to 1999 found that the 20-year survival of conclusions derived from meta-analyses was lower (57%) than that of conclusions from CTs (87%) or RCTs (85%); there was virtually no difference between CTs and RCTs. 10
Consequently, we recommend that future reviews of meditation include cohort studies, CTs, and RCTs, reporting and comparing the results for each level of evidence. A similar approach was taken in Eppley et al.'s systematic review and meta-analysis of the effects of meditation and relaxation techniques on trait anxiety. 4 These authors reported the N and the effect size d (±SE) at three levels of experimental rigor for studies on Transcendental Meditation (TM) and trait anxiety: (1) all independent outcomes, all research designs, N = 35, d = 0.70 (±0.068); (2) studies matched with controls on population, adjusted for duration, attrition, and follow-up hours, N = 23, d = 0.77 (±0.06); and (3) randomized assignment, alternate treatment, published in journals or dissertations, authors neutral or negative toward TM, N = 4, d = 0.89 (±0.19). 4 These findings show that the effect was robust across levels of experimental rigor.
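As a rough check on the overlap between these three estimates, the sketch below converts each reported d (±SE) into an approximate 95% confidence interval. It assumes a normal (Wald) approximation, d ± 1.96 × SE, which is our simplification and not necessarily the method used by Eppley et al.; the function and variable names are illustrative only.

```python
# Approximate 95% confidence intervals for the three rigor levels quoted above.
# Assumption: normal (Wald) approximation, CI = d +/- 1.96 * SE.

Z_95 = 1.96  # two-sided 95% critical value of the standard normal


def wald_ci(d, se, z=Z_95):
    """Return the (lower, upper) bounds of d +/- z * se."""
    return (d - z * se, d + z * se)


# (label, N, d, SE) as quoted in the text
levels = [
    ("all designs", 35, 0.70, 0.068),
    ("matched controls", 23, 0.77, 0.06),
    ("randomized, neutral authors", 4, 0.89, 0.19),
]

intervals = [wald_ci(d, se) for _, _, d, se in levels]

# The intervals share a common region if the largest lower bound
# does not exceed the smallest upper bound.
overlap_low = max(lo for lo, _ in intervals)
overlap_high = min(hi for _, hi in intervals)
all_overlap = overlap_low <= overlap_high

for (label, n, d, se), (lo, hi) in zip(levels, intervals):
    print(f"{label:28s} N={n:2d}  d={d:.2f}  95% CI [{lo:.2f}, {hi:.2f}]")
print("intervals share a common region:", all_overlap)
```

Under this approximation the three intervals overlap (roughly 0.65 to 0.83 is common to all of them), and the widest interval belongs to the level with N = 4.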
In another comment, Rutledge et al. 11 pointed out that Goyal et al.'s decision to include only RCTs with participants with “a clinical condition” or a “stressed population” apparently excluded RCTs with active controls on other populations that appear to be relevant to the review's mission to address the issue of causality. In their reply, Goyal et al. agreed that reviews with different selection criteria could come to different conclusions. 9 The study that Rutledge et al. cited 5 located 10 RCTs on anxiety with active controls, for which d = 0.50, p = 0.0000005, which is quite different from the two RCTs on TM and anxiety with nil effect that Goyal et al. reported. 1
In the following, we examine how differences in selection criteria can result in such different outcomes and conclusions.
Figure 1 shows the forest plot from our previous meta-analysis comparing TM to treatment-as-usual controls (16 studies), and Figure 2 shows TM compared with active treatment controls (10 studies). 5 Goyal et al. reported the results of only two of these studies (Paul-Labrador and Smith), which showed nil effects (Paul-Labrador: d = 0.04, see Fig. 1; Smith: d = −0.09, see Fig. 2). That they reported only two TM studies whereas we reported 16 is partly due to differences in inclusion criteria. Goyal et al.'s inclusion criteria were: RCT, which excluded Brooks (1985), a quasi-RCT; adult participants, which excluded four studies on adolescents (Barnes, and So studies 1, 2, and 3); participants had to be undergoing clinical treatment or be a high-anxiety population, which excluded three studies on adults not undergoing clinical treatment (Dillbeck, Gaylord, and Sheppard); and attention control or active treatment control, which excluded two studies that used waitlist controls (Ballou and Nidich).

Figure 1. Forest plot of 15 randomized controlled trials (RCTs) and one quasi-randomized study (Brooks) comparing Transcendental Meditation (TM) with treatment-as-usual controls. PT, psychotherapy; PP, prepost; WL, wait-list; CSM, corporate stress management; R, relaxation; ATT, attention.

Figure 2. Forest plot of nine TM RCTs and one quasi-RCT compared with active alternative treatments: group therapy (GT), EMG biofeedback with progressive relaxation (EMG), a Taoist meditation technique (Tao), Periodic Somatic Inactivity (PSI), and progressive relaxation (PR). R, simple relaxation; NAP, napping.

These inclusion criteria, which excluded 10 of our 16 studies, still left six studies that appear to meet Goyal et al.'s criteria (Brautigam, Gore, Kondwani, Paul-Labrador, Raskin, and Smith). For the synthesis of these six studies, the standardized difference in the means was d = −0.31, with a standard error of SE = 0.15 and a 95% confidence interval of −0.60 to −0.01 (p = 0.045).
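For readers who want to see how the interval and p-value relate to d and SE, here is a minimal sketch assuming a two-sided z-test on the pooled estimate (our assumption; the meta-analysis software's exact procedure is not described in this letter). It uses only the rounded d and SE quoted above, and the function name is hypothetical.

```python
import math


def wald_summary(d, se, z_crit=1.96):
    """Return (z, two-sided p, (ci_low, ci_high)) for an estimate d with standard error se.

    Assumes the estimate is approximately normally distributed.
    """
    z = d / se
    # Two-sided p-value of a standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    ci = (d - z_crit * se, d + z_crit * se)
    return z, p, ci


# Rounded values quoted in the text for the six-study synthesis
z, p, (ci_low, ci_high) = wald_summary(-0.31, 0.15)
print(f"z = {z:.2f}, p = {p:.3f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

With the rounded inputs this gives an interval of about −0.60 to −0.02 and a p-value near 0.04. The small gaps from the reported −0.01 and 0.045 are of the size one would expect from rounding d and SE to two decimals.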
It can be seen in Figures 1 and 2 that there is considerable heterogeneity between the TM studies on anxiety, with effect sizes ranging from zero to large effects of >1.0. The heterogeneity of the 16 studies was quantified as I² = 50.5%, indicating that 50.5% of the variance was true between-studies variance. Meta-regression indicated that all of the heterogeneity could be accounted for by pretest anxiety levels. 5 For example, the studies with the largest effect sizes seen in Figure 1 (Brooks, Raskin, Ballou, and Brautigam) were on high-anxiety populations, which were, respectively, war veterans with PTSD, psychiatric anxiety patients, prison inmates, and drug abusers in rehabilitation. On the other hand, the studies with low effect sizes were conducted on populations with low pretreatment anxiety; in Paul-Labrador, for example, pretreatment anxiety was in the 48th percentile, below normal, and the study's effect size was virtually zero. It is important to point out that this meta-regression result was not due to regression to the mean, because the control groups that scored high on pretreatment anxiety did not show a regression to lower anxiety at posttest. 5
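The I² statistic is conventionally derived from Cochran's Q as I² = max(0, (Q − df)/Q) × 100, with df = k − 1 for k studies. The sketch below illustrates that relationship. The Q value in it is not reported in this letter: it is a hypothetical number back-calculated from the quoted I² with k = 16, used only to show the arithmetic.

```python
def i_squared(q, k):
    """Higgins' I^2 (in percent) from Cochran's Q and the number of studies k."""
    df = k - 1
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0


K_STUDIES = 16    # number of studies in the synthesis, from the text
Q_ASSUMED = 30.3  # hypothetical: back-calculated from I^2 = 50.5%, not a reported value

i2 = i_squared(Q_ASSUMED, K_STUDIES)
print(f"I^2 = {i2:.1f}%")
```

When Q is no larger than its degrees of freedom, the function returns 0, meaning the observed spread is no greater than sampling error alone would produce.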
We suggest that systematic reviews and meta-analyses of psychological variables would do well to group studies according to pretreatment levels of the dependent variables, parallel to the common practice for reviews of hard clinical outcomes. For example, in meta-analyses of blood pressure, studies typically are grouped by whether the participants are normotensive, pre-hypertensive, hypertensive, and so on. Goyal et al. 1 tacitly assumed that patients undergoing clinical treatment would have elevated anxiety, but they had no mechanism to assess whether this actually was the case. Some chronic clinical conditions such as hypertension may not be associated with elevated anxiety. In the case of Paul-Labrador, anxiety was a secondary variable with no power analysis, and the primary interest was in the effects of TM on metabolic syndrome in patients with coronary heart disease. Arguably, then, the Paul-Labrador study should not have been included in our meta-analysis or in that of Goyal et al. If it is excluded, then the synthesis of the five remaining studies that appear to meet the Goyal criteria is d = −0.45, SE = 0.16 (−0.76 to −0.13), p = 0.006. Paul-Labrador was a large study (N = 103) relative to the others and carried a lot of weight in the synthesis, which explains why the p-value is so much smaller when it is excluded.
With regard to Smith, the second of the two studies that Goyal et al. included, although TM was not significantly more effective than the control (d = −0.09), it is important to note that independently both treatments had significant effects on reducing anxiety compared with a waitlist control (TM: d = −0.74, p = 0.01; PSI: d = −0.69, p = 0.01).
However, more important than the question of whether these five or six additional studies met Goyal et al.'s criteria and should have been included is the question about the validity of using “clinical patients” as an inclusion criterion. Many things besides clinical treatment could elevate anxiety. We know of no reason why a treatment that was effective for adults who were anxious because of stress at home, school, or office would not also be effective for people who were anxious because of undergoing clinical treatment. Studies on adolescents should also be included because normative data show them to be an anxious population. 12 Therefore, to assess the efficacy of a treatment on anxiety, all groups with elevated anxiety should be included. Three previous meta-analyses that included all groups found that TM was highly effective for reducing anxiety. 4–6
From our previous meta-regression, 5 we calculated what the effect sizes would be for studies that were grouped by pretreatment anxiety levels, and we found that for participants in the 90th–100th percentile on pretreatment anxiety, the effect sizes were very large: between −0.96 and −1.17. For studies with participants with pretreatment anxiety in the 70th–80th percentile range, the effect sizes were more moderate: −0.53 to −0.74. This kind of information is much more useful than a synthesis of results from a mix of populations with different or unknown pretreatment levels on the dependent variable being studied.
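The four numbers above are consistent with a straight-line relationship between pretreatment percentile and effect size. As a hedged illustration, the sketch below draws a line through the two outer reported points (70th percentile, d = −0.53; 100th percentile, d = −1.17) and checks the interior values. This is a reconstruction from the published ranges under a linearity assumption, not the fitted meta-regression model itself, and all names are illustrative.

```python
# Reported endpoints: (pretreatment anxiety percentile, effect size d)
P_LOW, D_LOW = 70.0, -0.53
P_HIGH, D_HIGH = 100.0, -1.17

# Straight line through the two endpoints (assumed linear relationship)
SLOPE = (D_HIGH - D_LOW) / (P_HIGH - P_LOW)  # change in d per percentile point
INTERCEPT = D_LOW - SLOPE * P_LOW


def predicted_d(percentile):
    """Effect size implied by the reconstructed line at a given percentile."""
    return INTERCEPT + SLOPE * percentile


for pct in (48, 70, 80, 90, 100):
    print(f"percentile {pct:3d}: predicted d = {predicted_d(pct):+.2f}")
```

The interior predictions land within 0.01 of the reported −0.74 and −0.96, and extrapolating the line down to the 48th percentile gives an effect close to zero, in line with the near-zero result in Paul-Labrador.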
A final point on establishing causality is that Rutledge et al. 11 commented that a major limitation of Goyal et al.'s review 1 is that it excluded important hard clinical outcomes such as blood pressure, rates of mortality, and cardiovascular disease. 13–15 Goyal et al. replied that including clinical measures would be desirable but was beyond their budget. 9 We suggest that hard physiological variables should be included as a means of cross-validating soft self-report measures. For example, studies have found that the acute and long-term physiological effects of TM are the opposite of those produced by psychological stress and anxiety. TM produces acute reductions in respiratory rate, skin conductance, plasma lactate, and cortisol, and long-term reductions in resting baseline levels of heart rate, respiratory rate, spontaneous skin resistance responses, plasma lactate, and cortisol, 16–18 which together provide a plausible mechanism for the subjective experience of reduced anxiety.
Conclusion
We have shown that TM is effective in reducing anxiety in populations undergoing clinical treatment, but the results are much more robust when all studies are included. 4–6 We suggest that future systematic reviews and meta-analyses of psychological variables such as anxiety, depression, and anger further investigate the role of pretreatment levels in influencing the magnitude of outcomes and consider grouping studies according to pretreatment levels. We also suggest that future reviews of behavioral interventions on psychological variables include physiological correlates, include cohort studies, controlled trials, and RCTs, and that they report and compare results from all levels of evidence to provide a comprehensive overview of treatment efficacy.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
