Abstract
Happier people are healthier, but does becoming happier lead to better health? In the current study, we deployed a comprehensive, 3-month positive psychological intervention as an experimental tool to examine the effects of increasing subjective well-being on physical health in a nonclinical population. In a 6-month randomized controlled trial with 155 community adults, we found effects of treatment on self-reported physical health—the number of days in the previous month that participants felt healthy or sick, as assessed by questions from the Centers for Disease Control and Prevention’s Behavioral Risk Factor Surveillance System Questionnaire. In a subsample of 100 participants, we also found evidence that improvements in subjective well-being over the course of the program predicted subsequent decreases in the number of sick days. Combining experimental and longitudinal methodologies, this work provides some evidence for a causal effect of subjective well-being on self-reported physical health.
Keywords
Happiness feels good, but do its benefits extend beyond the psychological domain to include physical wellness? Evidence of a link between subjective well-being and health is growing, rooted primarily in cross-sectional and prospective longitudinal studies, along with experiments leveraging momentary mood manipulations in the laboratory (Lyubomirsky, King, & Diener, 2005). There remains, however, a dearth of studies examining the impact of long-term increases in subjective well-being on physical health in healthy adults (Diener, Pressman, Hunter, & Delgadillo-Chase, 2017). Perhaps as a result of this missing line of externally valid experimental research, the causal impact of enduring happiness on improved physical health remains a subject of debate (Liu et al., 2016). In the present research, we aimed to fill this gap in the literature using a 12-week positive psychology intervention (PPI) as an experimental tool.
The Link Between Happiness and Health
The link between happiness and health is supported by correlational, longitudinal, and experimental evidence (Lyubomirsky et al., 2005). For instance, happier people have better cardiovascular health (Boehm, Vie, & Kubzansky, 2012) and immune functioning (Marsland, Cohen, Rabin, & Manuck, 2006), engage in healthier behaviors (Boehm et al., 2012), and live longer lives (Diener & Chan, 2011). In one particularly notable prospective study, the positivity of autobiographies written by nuns around age 22 predicted their survival rates from ages 75 to 95 (Danner, Snowdon, & Friesen, 2001). Synthesizing this broad-ranging literature, meta-analytic evidence suggests that subjective well-being is linked to increased survival in both healthy and ill samples (Chida & Steptoe, 2008), short-term immune functioning, pain mitigation, endocrine response, and long-term cardiovascular health, as well as slower disease progression (Howell, Kern, & Lyubomirsky, 2007) and recovery in ill patients (Lamers, Bolier, Westerhof, Smit, & Bohlmeijer, 2012).
Despite this consistent evidence, the literature on the health benefits of happiness is plagued by a relative lack of long-term, externally valid experimental evidence in healthy samples, such as from randomized controlled trials (considered the gold standard in the medical sciences). Indeed, the existing experimental evidence comes primarily from brief positive-affect inductions in the laboratory and measurement of the immediate, short-term effects on indicators of physiological health (Pressman & Cohen, 2005).
The handful of studies examining the health effects of increasing subjective well-being over time have primarily focused on populations with particular health issues. Some of these interventions have targeted the alleviation of subjective ill-being, such as stress or depression, producing reductions in all-cause mortality 2 years later in men (Linden, Phillips, & Leclerc, 2007), decreased future cardiac events (Rutledge, Redwine, Linke, & Mills, 2013), and modest improvements in physical health (O’Neil, Sanderson, Oldenburg, & Taylor, 2011). Other interventions have focused more directly on increasing subjective well-being but only in specific clinical populations with existing health issues. For instance, PPIs have produced health-behavior changes in chronic cardiopulmonary-disease patients (Charlson et al., 2014), greater medication adherence in hypertensive African Americans (Ogedegbe et al., 2012), and increased physical activity in coronary-care patients (Peterson et al., 2012).
Intensive PPIs do seem to improve health behaviors and health indicators within clinical populations. But does happiness lead to better health even in a nonclinical population?
The Present Research
Though the link between subjective well-being and health is well supported by existing evidence, this evidence is limited by an enduring absence of randomized controlled trials that experimentally examine the causal effects of increasing subjective well-being on health outcomes within healthy, nonclinical populations. In the present study, we aimed to address this gap in the psychological literature by conducting a randomized controlled trial examining the health effects of a 12-week PPI, focused explicitly on increasing subjective well-being through empirically supported exercises and activities (Lyubomirsky & Layous, 2013; Quoidbach, Mikolajczak, & Gross, 2015). We randomly assigned healthy adults to an active-treatment group and a wait-list control group. Participants completed measures of subjective well-being and physical health before treatment (pretest), over the course of treatment (weekly), immediately after treatment (posttest, 3 months after baseline), and 3 months after treatment (follow-up, 6 months after baseline).
Our main goal was to utilize a comprehensive PPI as an experimental tool to examine whether subjective well-being can improve health outcomes. Past research suggests that subjective well-being can lead to improvements in health through multiple mechanisms, including promoting healthy behavior (e.g., Charlson et al., 2014; Ogedegbe et al., 2012; Peterson et al., 2012), improving resistance to illnesses such as the common cold (Cohen, Alper, Doyle, Treanor, & Turner, 2006), and increasing antibody activity in response to viruses (Marsland et al., 2006). Thus, we hypothesized that by increasing subjective well-being, our PPI treatment would improve health. Given that our PPI was based on existing research linking the suggested activities with greater subjective well-being, we expected to observe relatively steady improvements in subjective well-being that would accrue over the duration of the program. We expected a similar linear trajectory on physical health outcomes, but with delayed effects preceded by improvements in subjective well-being.
Method
Participants
Participants were 155 community adults (age: M = 45.36 years, SD = 13.52; 78% women); 55 individuals (age: M = 49.15 years, SD = 11.91; 75% women) were recruited in Kelowna, British Columbia, and received the intervention through weekly in-person group sessions led by trained clinicians (wait-list control group: n = 28; treatment group: n = 27); 100 individuals (age: M = 43.28 years, SD = 13.95; 80% women) were recruited in Charlottesville, Virginia, and received the intervention through weekly online sessions (wait-list control group: n = 50; treatment group: n = 50). Full demographic information is provided in Table S8 in the Supplemental Material available online.
All participants completed a baseline assessment, 133 (85.8%) completed a posttest assessment at the conclusion of treatment, and 127 (81.9%) completed a follow-up assessment 3 months after program completion. Because of a procedural artifact, we were not able to match the weekly survey data for the 55 participants recruited in Kelowna; the results based on data from the weekly surveys, therefore, are based on the 100 participants recruited in Charlottesville. Participants received no compensation for completing the program modules, the weekly surveys, or the baseline assessment; they were compensated for the posttest ($10) and the follow-up assessments ($15).
Recruitment and randomization procedures
To reach a broad community sample, we utilized a variety of advertising methods, including social media, community flyers, newspaper advertisements, e-mails to university staff LISTSERVs, and local talk-radio interviews. Individuals (N = 462) expressed interest via e-mail or an online link. Of those, 258 individuals completed the prescreening questionnaire. Participants were ineligible to participate if they were younger than 25, older than 75, or met criteria for moderately severe depression by scoring 15 or more on the Patient Health Questionnaire-9 (PHQ-9; Kroenke, Spitzer, & Williams, 2001). Sixty-seven of the interested individuals did not complete the prescreening; of those who completed prescreening, 4 participants were outside the age-eligibility range and 15 reported symptoms indicative of moderately severe depression (and were referred to the appropriate mental health services). Seventeen eligible participants declined to participate or failed to show up to receive informed-consent paperwork, which was delivered in person to each participant. All remaining individuals (n = 155) consented to participate in the study, including being randomly assigned to either an active-treatment group (n = 77) or a wait-list control group (n = 78). We employed stratified randomization to ensure equivalency between conditions in gender and in participants with moderate depression (PHQ-9 score ≥ 10). Following Consolidated Standards of Reporting Trials (CONSORT; Moher et al., 2012), we provide a full CONSORT diagram of the study samples separately for each site, from recruitment and eligibility screening to randomization and attrition (see Fig. S1 in the Supplemental Material).
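The screening rule and the stratified assignment described above can be sketched in a few lines. This is a minimal illustration only: the participant records are hypothetical, and shuffling within each stratum and then alternating arms is an assumed way of balancing strata, not necessarily the procedure used in the trial.

```python
import random
from collections import defaultdict

def is_eligible(age, phq9):
    """Screening rule from the text: ages 25-75 and a PHQ-9 score below 15."""
    return 25 <= age <= 75 and phq9 < 15

def stratified_assign(participants, seed=0):
    """Assign participants to arms within gender x depression strata.

    ASSUMPTION: shuffle within each stratum, then alternate arms. The
    article reports stratification on gender and PHQ-9 >= 10 but does not
    describe the allocation mechanics.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for p in participants:
        strata[(p["gender"], p["phq9"] >= 10)].append(p)
    assignment = {}
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        arms = ["treatment", "control"]
        rng.shuffle(arms)  # randomize which arm receives an odd extra member
        for i, p in enumerate(members):
            assignment[p["id"]] = arms[i % 2]
    return assignment

# Hypothetical prescreened pool (illustrative values, not study data).
pool = [
    {"id": i, "gender": "F" if i % 4 else "M",
     "age": 25 + (i * 7) % 51, "phq9": (i * 3) % 18}
    for i in range(40)
]
eligible = [p for p in pool if is_eligible(p["age"], p["phq9"])]
arms = stratified_assign(eligible)
```

Under this scheme the two arms differ by at most one participant within every stratum, which is the property stratification is meant to secure.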
Attrition
Attrition information across the comprehensive assessments is presented in Table 1. Our attrition rate over 6 months from baseline assessment to follow-up assessment was 18%. Notably, the attrition rates were comparable across modalities (see Fig. S1). The attrition rate for the participants who received the treatment online was 16%—much lower than is typically observed in online interventions. Though some participants completed the program modules online on their own, all participants were recruited from the local community, were introduced to the treatment program and the study team during several introductory in-person sessions, and completed all major assessments in the lab. All participants also received reminders to complete their program modules and schedule assessments via personally addressed e-mails and phone calls.
Table 1. Means, Standard Deviations, and Effect Sizes for Health Measures at Each Assessment Point
Noncompletion information for the weekly surveys throughout treatment is shown in Table 2 (see Table S3 in the Supplemental Material for details). Participants completed, on average, 7.7 out of the 10 weekly online surveys that we administered over the duration of treatment. Though satisfactory overall, this completion rate differed between conditions: Control participants, on average, completed 8.9 out of 10 surveys, but treatment participants completed 6.4 out of 10 surveys. We ran additional comparisons between the groups at baseline to probe for a possible failure of random assignment that could explain these differences. We found no significant differences on key demographics, age: t(98) = −1.63, p = .106; sex: t(96) = 0.59, p > .250; education: t(98) = −0.10, p > .250, with the exception of income, t(97) = −2.00, p = .048, for which the treatment group reported slightly higher income than the control group. The two groups did not differ in overall wealth, including real estate, investments, and other assets, t(95) = −0.98, p > .250. The groups also did not differ on other relevant constructs, such as stress, t(98) = 1.51, p = .134, and depression, t(98) = 0.07, p > .250. Further, we found no differences at baseline in the number of days in the previous month that (a) people felt sick, t(98) = 0.81, p > .250; (b) physical health affected activity, t(98) = 1.23, p = .221; or (c) people felt healthy and full of energy, t(98) = 0.21, p > .250. Thus, we did not find evidence suggesting failure of random assignment.
Table 2. Person-Level Descriptive Statistics and Treatment Effects for the Weekly Measure of Sick Days
We also compared people who completed the last weekly survey (Week 10) with those who did not on key related outcomes. Again, we found no differences at baseline between completers and noncompleters on stress, t(98) = 0.37, p > .250, and depression, t(98) = 1.33, p = .187, or in the number of days in the previous month that (a) they felt sick, t(98) = −0.26, p > .250; (b) physical health affected activity, t(98) = 0.10, p > .250; or (c) people felt healthy and full of energy, t(98) = −1.17, p = .246. Still, the analyses of the weekly measures need to be interpreted with caution.
Treatment
We employed a comprehensive 12-week PPI: Enduring Happiness and Continued Self-Enhancement (ENHANCE). ENHANCE features 10 weekly modules, or principles of happiness. The program modules can be administered in in-person group sessions led by a trained clinician or self-administered via a custom-developed Web platform.
ENHANCE is a comprehensive PPI that includes a wide assortment of activities and skills that have been associated with higher subjective well-being, such as self-affirmation, mindfulness, gratitude, positive social interactions, and prosocial behavior (Quoidbach et al., 2015). Creating a program that features a variety of empirically supported activities was a key theoretical motivator in conceptualizing ENHANCE. Indeed, existing research and theory show that activity variety combats hedonic adaptation (Sheldon & Lyubomirsky, 2012) and maximizes the opportunity for person–activity fit (Schueller, 2010). Each weekly module featured (a) an hour-long lesson administered online or in the in-person group sessions, with information and exercises on target principles of happiness; (b) a weekly writing assignment (e.g., keeping a gratitude journal, writing about one’s values); and (c) an active behavioral component to integrate and apply the principle in daily life (e.g., guided mindfulness meditations, sending a gratitude letter).
The program modules were organized to build and expand on each other to produce incremental increases in subjective well-being over time. The first section of the program, The Core Self (Modules 1–3), targeted self-discovery and future planning, including activities centered on value affirmation, goal pursuit, and character strengths. The second section, The Experiential Self (Modules 4–6), dealt with the manner in which participants experience the external world and their internal feelings, including mindfulness, self-compassion, and savoring. Lastly, the third section, The Social Self (Modules 7–10), involved building and maintaining healthy social lives, including fostering positive interactions with strong and weak social ties and engaging in prosocial behavior. Further details on the specific activities in the modules are available in Table S1 in the Supplemental Material and in our published design-and-rationale article (Kushlev et al., 2017).
Weekly assessments
Participants completed weekly reports of health and well-being obtained through a brief online survey for each of the 10 active weeks of treatment. Both treatment and control participants received these weekly surveys on Sundays, coinciding with the release of the new weekly module for the treatment group. Treatment participants were instructed to complete the survey from the previous week before beginning the current week’s module.
Each weekly assessment included the Scale of Positive and Negative Experience (SPANE; Diener et al., 2010) to measure positive and negative affect “in the past week.” Participants also completed single-item measures of life satisfaction (“In the past week, how satisfied were you with things in your life?”) and meaning in life (“In the past week, to what extent did you feel a sense of meaning and purpose in your life?”; see Tables S2 and S4 in the Supplemental Material for descriptive statistics). These measures were used as manipulation checks in the present research to establish whether our PPI successfully improved subjective well-being. Other measures beyond the scope of this article, such as person–activity fit, were also included (see Kushlev et al., 2017).
Each week, we used an item from the Behavioral Risk Factor Surveillance System (BRFSS; Centers for Disease Control and Prevention, 2014) as our primary measure of health: “Thinking about your physical health, which includes physical illness and injury, for how many days during the past week was your physical health not good?” (see Table 2 for descriptive statistics; week-by-week descriptive statistics are provided in Table S3). This item combines breadth (i.e., not focusing on any specific health condition) and specificity (i.e., clearly defining physical health symptoms within a specific and recent period of time).
Comprehensive assessments
Self-reported indicators of health
In addition to completing brief weekly assessments over the duration of treatment, participants completed a larger baseline assessment prior to random assignment and identical assessments at a 3-month posttest (immediately after the intervention) and a 6-month follow-up (3 months after the end of the intervention). These assessments contained items from the BRFSS, including the number of days in the past month that participants (a) felt sick, (b) were prevented from usual activity by their health, and (c) felt healthy and full of energy (see Table 1 for descriptive statistics).
Objective indicators of health
In addition to gathering self-report measures, we obtained basic objective indicators of health (systolic blood pressure, diastolic blood pressure, and weight) at each of the three comprehensive assessments. At each assessment, we measured blood pressure twice, calculating the average systolic and diastolic blood pressure from the two measurements. We used weight at each assessment to calculate body mass index (BMI) on the basis of height measured at baseline. We treated each of those objective measures as continuous indicators of health (see Table 1 for descriptive statistics). Data and materials are available on the Open Science Framework at https://osf.io/hs9gf/.
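As a minimal sketch of how these objective indicators are derived, the snippet below averages two blood pressure readings and computes BMI as weight in kilograms divided by squared height in meters. The metric units, function names, and example values are assumptions made for illustration; the article does not report the units of the raw measurements.

```python
def mean_blood_pressure(readings):
    """Average repeated (systolic, diastolic) readings from one assessment."""
    systolic = sum(r[0] for r in readings) / len(readings)
    diastolic = sum(r[1] for r in readings) / len(readings)
    return systolic, diastolic

def body_mass_index(weight_kg, height_m):
    """BMI = weight (kg) / height (m) squared."""
    return weight_kg / height_m ** 2

# Hypothetical participant: height is taken once at baseline and reused,
# whereas weight is measured again at every assessment.
baseline_height_m = 1.70
weights_kg = {"baseline": 72.0, "posttest": 71.0, "follow_up": 70.5}

bmi_by_assessment = {
    name: body_mass_index(w, baseline_height_m) for name, w in weights_kg.items()
}
systolic, diastolic = mean_blood_pressure([(122, 80), (118, 78)])
```

Because height is held fixed at its baseline value, any change in BMI across assessments reflects a change in weight only.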
Power and significance
We employed a Neyman-Pearson approach to null-hypothesis significance testing (α = .05). Because we were examining multiple outcomes to test the same hypothesis, we adjusted the p values for each outcome to keep the family-wise error rate (FWER) smaller than .05. We classified our measures into two families: (a) our four primary subjective measures of physical health, as measured by the self-report Centers for Disease Control and Prevention questions before, during, and after treatment, and (b) the three auxiliary objective measures of health (BMI, systolic blood pressure, and diastolic blood pressure) measured before and after treatment. To be conservative, we did not distinguish between weekly and comprehensive assessments of the self-report BRFSS questions. We chose the Holm-Bonferroni step-down method of controlling the FWER (Holm, 1979), whereby the observed p values are first ordered by size, from smallest to largest. The smallest p value is then adjusted to p × m (where m is the total number of comparisons), the next-smallest p value is adjusted to p × (m − 1), and so forth. Hypotheses are tested sequentially, starting from the outcome with the smallest p value, until a nonsignificant test is reached; the effects on all untested outcomes are considered nonsignificant. We chose this approach because it provides a strong, conservative control of FWER while preserving power better than the classic Bonferroni method.
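The step-down logic can be made concrete with a short sketch of the standard Holm-Bonferroni adjustment. The p values below are hypothetical and serve only to show the mechanics. Taking the running maximum is the usual way to encode the stopping rule, so that an outcome tested after a nonsignificant one can never appear significant.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multipliers run m, m - 1, ..., 1 from the smallest p value upward.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce step-down order
        adjusted[idx] = running_max
    return adjusted

# Hypothetical family of four outcomes (not the study's p values).
raw = [0.010, 0.030, 0.020, 0.200]
adjusted = holm_adjust(raw)
significant = [p < 0.05 for p in adjusted]
```

In this example only the first outcome survives correction: its adjusted value is 4 × .010 = .04, whereas the next-smallest p value becomes 3 × .020 = .06, at which point testing stops.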
We preregistered a plan to detect an effect size (d) of 0.4 with 80% power in a design-and-rationale article (Kushlev et al., 2017). Using the package powerlmm (Version 0.4.0; Magnusson, 2018) in R, we estimated power for our two-level mixed design; we specified different model parameters for the comprehensive and weekly assessments to reflect the differences between each method in number of observations (3 vs. 10), number of participants (155 vs. 100), period of time between assessments (3 months vs. 1 week), and the measurement scales (0–30 days vs. 0–7 days). For the three comprehensive assessments, we specified 3 measurements per 155 participants with a subject-level random intercept of 1.5 and a within-subjects residual of 0.5. As there were only three time points per slope, we constrained the random slope value to 0. Considering that the most conservative Holm adjustment to the alpha level for a family of four outcomes is an α/4 of .0125, we estimated that we would have power greater than 99% to detect effects equal to or greater than the preregistered d of 0.4 (where d is the effect size at last measurement, standardized on the basis of the standard deviation of the outcome variable at baseline). For a small d of 0.2, we had 60% power at an α/4 of .0125, 64% power at an α/3 of .0167, 70% power at an α/2 of .025, and 79% power at an α of .05.
For the weekly assessments, we specified 10 measurements per 100 participants with a subject-level random intercept of 1, subject-level random slope of 0.01, and within-subjects residual of 1. For the preregistered d of 0.4, we found that we had 62% power at an α/4 of .0125, 66% power at an α/3 of .0167, 71% power at an α/2 of .025, and 80% power at an α of .05. For a small d of 0.2, we had 14% power at an α/4 of .0125, 16% power at an α/3 of .0167, 20% power at an α/2 of .025, and 29% power at an α of .05. For a medium d of 0.5, we had 84% power at an α/4 of .0125, 87% power at an α/3 of .0167, 90% power at an α/2 of .025, and 94% power at an α of .05.
Results
Weekly assessments: effects across the course of treatment
Analytic strategy
For the weekly assessments over the duration of treatment, we employed multilevel growth models to estimate the effect of treatment on outcomes over time. Thus, we compared change over time between experimental conditions as indicated by the fixed-effect interaction between time (as a Level 1 within-subjects factor) and treatment (as a Level 2 between-subjects factor). We modeled the random intercept of sick days and the random effect of time, allowing for each participant’s score over time to be modeled as observed. Because fixed effects were of primary interest, the covariance between the random intercept and slope was constrained to 0 (a diagonal covariance structure) for parsimony and to allow direct comparison between models (see Tables 3 and 4); this analytical decision had negligible effects on the effect-size estimates. For ease of interpretability, time was coded from −9 (Week 1) to 0 (Week 10) so that the model intercept can be interpreted as the expected value for the control group at the end of the intervention (Week 10) and the treatment coefficient can be interpreted as the instantaneous effect of treatment at Week 10 (Tables 3 and 4).
Table 3. Fixed Components From the Growth Models for the Number of Weekly Sick Days Throughout the Course of Treatment
Note: Participants: n = 98, observations: n = 767. Models are based on the weekly number of sick days after removing outliers and have a diagonal covariance structure. Incidence-rate ratio (IRR) is equal to exp(b) and represents the rate of incidence of sick days. Treatment IRR for the conditional model is the instantaneous effect of treatment during the last week of the intervention (0 = control, 1 = treatment; Time: 0 = Week 10, or last week; −9 = Week 1). CI = confidence interval.
Table 4. Random Components From the Growth Models for the Number of Weekly Sick Days Throughout the Course of Treatment
Note: The ID subscripts indicate between-subjects statistics. The R2 for multilevel models is the marginal variance explained, defined as the variance explained by fixed effects. For alternative model specifications and diagnostics, see https://rpubs.com/KKushlevPhD/ENHANCE-Diagnostics. AIC = Akaike information criterion.
The observed number of sick days fit the theoretical Poisson distribution substantially better than the normal distribution (λ = .84; see Fig. S2 in the Supplemental Material). We thus specified a Poisson distribution for the mixed models predicting the number of sick days; Poisson distributions are well suited for nonnormally distributed frequency outcomes with a high occurrence of nonevents (i.e., days with no symptoms; Bolker et al., 2009).
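To illustrate why a Poisson distribution suits this outcome, the sketch below evaluates the Poisson probability mass function at the reported rate. Reading λ = .84 as the expected number of sick days per week is an assumption based on the context of the sentence above.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 0.84  # rate reported in the text; interpretation as weekly mean is assumed
probs = {k: poisson_pmf(k, lam) for k in range(8)}  # 0 to 7 sick days in a week

for k, p in probs.items():
    print(f"{k} sick days: {p:.3f}")
```

At this rate roughly 43% of person-weeks are expected to contain no sick days at all, and the probabilities fall off quickly for larger counts, a shape that a normal distribution cannot reproduce. The Poisson distribution is unbounded, but at this rate the probability of more than 7 events is negligible.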
To provide standard measures of effect size, we reverse-calculated Cohen’s ds from the Wald t test for mixed models. The model effect sizes thus represent the overall condition effect sizes over the course of treatment. In the Supplemental Material, we additionally provide week-by-week Cohen’s ds for each outcome (see Table S2 for subjective-well-being outcomes; see Tables S3 and S6 for number of sick days), as well as Pearson correlations decomposed into between- and within-subjects variance (Table S4). Because Cohen’s d effect sizes (and the Wald t tests) assume a normal distribution, we also calculated incidence-rate ratios (IRRs) for the number of sick days. The rate of incidence of sick days provides a standard effect-size measure that is directly interpretable in terms of the original metric.
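The article does not state the exact conversion used, so the following sketch assumes the common approximation d = 2t / √df. Under that assumption, the formula reproduces several of the t, df, and d triplets reported in the Results (the three subjective-well-being slopes and the BMI slope).

```python
import math

def d_from_t(t, df):
    """Approximate Cohen's d from a t statistic (assumed formula: 2t / sqrt(df))."""
    return 2 * t / math.sqrt(df)

# (t, df, reported d) as given in the Results of this article.
reported = {
    "life satisfaction": (3.98, 666, 0.31),
    "positive affect": (3.13, 667, 0.24),
    "negative affect": (-3.45, 665, -0.27),
    "body mass index": (-1.77, 229, -0.23),
}

for name, (t, df, d_reported) in reported.items():
    print(f"{name}: computed d = {d_from_t(t, df):.2f}, reported d = {d_reported:.2f}")
```

Agreement to two decimals for these outcomes suggests, but does not prove, that this is the conversion the effect sizes rest on.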
Manipulation check: subjective well-being
First, we explored whether the intervention produced the expected changes in subjective well-being over time during the course of treatment. We found a significant Treatment × Time interaction on life satisfaction, b = 0.08, SE = 0.02, β = 0.27, t(666) = 3.98, p < .001, d = 0.31, R2 = .03; positive affect, b = 0.05, SE = 0.02, β = 0.19, t(667) = 3.13, p = .002, d = 0.24, R2 = .03; and negative affect, b = −0.06, SE = 0.02, β = −0.21, t(665) = −3.45, p = .001, d = −0.27, R2 = .06. As shown in Figure 1, experimental participants grew more satisfied with their lives and reported increasingly higher positive affect and lower negative affect throughout the course of treatment, whereas control participants experienced little change. Treatment also increased participants’ meaning in life compared with the control condition, b = 0.05, SE = 0.02, β = 0.15, t(667) = 2.48, p = .013, d = 0.19, R2 = .02. Week-by-week statistical comparisons and effect sizes showed that the benefits of our PPI accrued over time, with larger differences beginning to emerge only after the midpoint of treatment (see Table S2).

Fig. 1. Effect of treatment and no treatment on three indicators of subjective well-being (life satisfaction, positive affect, and negative affect) over time. The x-axis represents each survey administered after each weekly program module. Dots are group means, and error bars are standard deviations. Slopes show best-fitting regressions, and the gray bands represent standard errors of the mean.
Reported physical health
The conditional growth model revealed a significant Treatment × Time interaction on the number of sick days, IRR = 0.91, d = −0.32, p = .042, pHolm-Bonferroni = .042. Specifically, people in the treatment condition experienced significantly lower rates of sickness over the course of the study (Fig. 2; see Table S5 in Supplemental Material for model details). The instantaneous effect of treatment during Week 10 was an IRR of 0.35, 95% confidence interval (CI) = [0.18, 0.71], suggesting that, compared with control participants, treatment participants had only one third the incidence of sick days at the end of the intervention (see Table S5). In Table 2, we additionally show this same effect of treatment during Week 10, quantified as a standardized mean difference between conditions (d = −0.35).

Fig. 2. Effect of treatment and no treatment on the number of sick days over the course of treatment. The x-axis represents each survey administered after each weekly program module. Dots are group means, and error bars are standard deviations. Slopes show best-fitting regressions, and the gray bands represent standard errors of the mean.
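The two incidence-rate ratios reported for the weekly sick-day model can be combined to see what the fixed effects imply week by week. The sketch below is a simplified reading of a log-link model with time coded from −9 (Week 1) to 0 (Week 10). It ignores the random effects and the uncertainty around both estimates, and the function names are illustrative.

```python
IRR_TREATMENT_WEEK10 = 0.35  # instantaneous treatment effect at Week 10 (from the text)
IRR_INTERACTION = 0.91       # Treatment x Time interaction, per week (from the text)

def code_time(week):
    """Map Weeks 1..10 onto -9..0 so that Week 10 is the reference point."""
    return week - 10

def rate_ratio(week):
    """Treatment-to-control ratio of expected sick days implied by the fixed effects."""
    return IRR_TREATMENT_WEEK10 * IRR_INTERACTION ** code_time(week)

for week in range(1, 11):
    print(f"Week {week:2d}: implied rate ratio = {rate_ratio(week):.2f}")
```

Because the interaction IRR is below 1, the implied ratio shrinks by about 9% with each additional week, which is why the group difference is largest at Week 10, where the time code is 0 and the ratio equals the treatment IRR itself.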
Diagnostic analyses indicated the presence of outliers in the outcome variable: Only 0.5% of observations included being sick for 6 days a week, and only 2.1% included being sick all 7 days. Recoding 6 and 7 days into 5 days showed better fit to the Poisson distribution, and simulation-based scaled residual plots indicated that using the transformed variable in the Poisson mixed linear models better satisfied assumptions of the distribution of residuals (see Fig. S3 in the Supplemental Material). We reanalyzed the data with the transformed outcome variable, excluding outliers, to examine whether this adjustment would impact the effects (see Tables 3 and 4 for model details). The Treatment × Time interaction on the transformed number of sick days remained significant with no changes to the effect sizes, IRR = 0.91, p = .032, pHolm-Bonferroni = .032, with an instantaneous effect of treatment during Week 10 of IRR = 0.36, 95% CI = [0.18, 0.71].
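The recoding step amounts to capping each weekly count at 5. A minimal sketch follows; the example counts are hypothetical and the helper names are illustrative.

```python
def cap_sick_days(days, ceiling=5):
    """Recode any weekly count above `ceiling` to `ceiling` (6 and 7 become 5)."""
    return [min(d, ceiling) for d in days]

def share_recoded(days, ceiling=5):
    """Proportion of observations that the cap changes."""
    return sum(d > ceiling for d in days) / len(days)

# Hypothetical weekly sick-day reports (not study data).
weekly_counts = [0, 0, 1, 0, 2, 7, 0, 3, 6, 0]
capped = cap_sick_days(weekly_counts)
changed = share_recoded(weekly_counts)
```

Capping keeps every observation in the analysis while limiting the influence of the few extreme weeks, so the number of observations is unchanged by the transformation.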
Effect size
Because there is no single standard measure of effect size for conditional growth models, we estimated several additional effect-size measures. All effect-size measures were estimated for the original untransformed variable (including any outliers). One way to quantify the effect size is by calculating the standardized mean difference between conditions averaged across all weeks. In Table 2, we can see that this effect (d = −0.18) was somewhat smaller than the Treatment × Time effect and the last-week-of-treatment effect. This pattern suggests that the effect of treatment was not immediate but rather accrued over the course of the PPI. Indeed, as with subjective well-being, week-by-week comparisons indicated meaningful differences emerging only after Week 5 (see Tables S3 and S6). As another standardized measure of average effect size, we also offer a table of correlations between all variables in Table S4, decomposed into between- and within-subjects variance; the average between-subjects effect (r) of treatment was −.09.
Does subjective well-being explain differences in subjective health?
Taking further advantage of the longitudinal nature of the data, we examined whether week-to-week changes in subjective well-being predicted differences in subjective health. We ran time-lagged correlational analyses predicting sick days in a given week from the difference in subjective well-being between that week and the previous week, while controlling for sick days during the previous week. To retain Week 1 responses, we conservatively assumed no increase in well-being from Week 0 (baseline). Over and above being sick in the previous week (IRR > 1.45, 95% CI = [1.36, 1.55], z = 10.98, p < .001, R2 = .16), week-to-week changes in positive affect, IRR = 0.75, 95% CI = [0.66, 0.85], z = −4.42, p < .001, R2 = .03; negative affect, IRR = 1.23, 95% CI = [1.09, 1.39], z = 3.39, p < .001, R2 = .02; and life satisfaction, IRR = 0.83, 95% CI = [0.75, 0.91], z = −3.83, p < .001, R2 = .02, significantly predicted how sick people felt during the current week. Changes in meaning in life did not significantly predict sick days in these analyses, IRR = 0.91, 95% CI = [0.80, 1.03], z = −1.46, p = .144, R2 = .01. Given that the treatment did not target any health behaviors, such as exercise, these analyses suggest that the treatment effects we observed on subjective health are at least in part explained by the efficacy of the intervention in raising subjective well-being.
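The structure of the time-lagged predictors can be sketched for a single participant as follows. The data are hypothetical, and the article does not say how previous-week sick days were defined for Week 1, so the `prior_sick_days` argument is an assumption introduced for illustration.

```python
def lagged_rows(sick_days, well_being, prior_sick_days):
    """Build one analysis row per week for a single participant.

    sick_days, well_being: lists of weekly values, starting at Week 1.
    prior_sick_days: value used as the previous-week control for Week 1
        (ASSUMPTION: the article does not specify this).
    """
    rows = []
    for i in range(len(sick_days)):
        if i == 0:
            wb_change = 0.0  # as in the text: no change assumed from baseline
            sick_prev = prior_sick_days
        else:
            wb_change = well_being[i] - well_being[i - 1]
            sick_prev = sick_days[i - 1]
        rows.append({
            "week": i + 1,
            "sick_now": sick_days[i],    # outcome
            "sick_prev": sick_prev,      # control variable
            "wb_change": wb_change,      # predictor of interest
        })
    return rows

# Hypothetical participant (not study data).
rows = lagged_rows(sick_days=[2, 1, 0, 0], well_being=[4.0, 5.0, 5.0, 6.0],
                   prior_sick_days=2)
```

Each row pairs the current week's sick days with the change in well-being leading into that week, which is the form a count regression with these two predictors would take as input.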
Comprehensive assessments: do the treatment effects on health persist?
Self-reported health
Next, we explored changes in health outcomes in the comprehensive assessments from baseline to follow-up (i.e., up to 3 months after the intervention). As with the weekly measure, we tested for omnibus effects using multilevel growth models with a Poisson distribution. Because of the small number of repeated observations in these outcome variables, however, we constrained the random slope to 0 to avoid model nonconvergence. Overall, from baseline to follow-up, treatment participants were less likely than control participants to experience sick days over the past month, IRR = 0.81, p = .0008, pHolm-Bonferroni = .0025, d = −0.12, or days in which their health prevented daily activities, IRR = 0.79, p = .0016, pHolm-Bonferroni = .0032, d = −0.09. Treatment participants were also about 15% more likely to have days of feeling healthy and full of energy, IRR = 1.15, p = .00001, pHolm-Bonferroni = .00008, d = 0.39 (for descriptive statistics and effect sizes, see Table 1; rain plots are displayed in Fig. 3). Model details are available in Table S7 in the Supplemental Material.

Fig. 3. Rain plots of three subjective health indicators (number of days of feeling sick, number of days sickness prevented activity, and number of days of feeling healthy) at baseline, after treatment, and at follow-up 3 months after treatment, separately for the control and treatment groups. Dots indicate individual data points. The top and bottom of each box indicate the 75th and 25th percentile, respectively. The horizontal line in each box indicates the group median, and the whiskers represent values within 1.5 times the interquartile range from the upper and lower quartiles. The wavy plots to the right of each box indicate the density of the data.
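The box elements named in the caption above (quartiles, median, and whiskers at 1.5 times the interquartile range) can be computed as in the following Python sketch. The data are hypothetical, the function name `box_summary` is illustrative, and the quartile method is an assumption: plotting libraries differ in how they interpolate percentiles, so values from the original figures may not match exactly.

```python
from statistics import quantiles
from typing import Dict, List


def box_summary(values: List[float]) -> Dict[str, object]:
    """Box-plot summary: quartiles, median, and 1.5 * IQR whiskers."""
    # The "inclusive" method treats the data as the full population of interest.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers extend to the most extreme data points inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "beyond_whiskers": [v for v in values if v < lower_fence or v > upper_fence],
    }


# Hypothetical monthly sick-day counts for one group at one assessment.
days_sick = [0, 1, 2, 3, 4, 5, 6, 7, 30]
summary = box_summary(days_sick)
print(summary)
```

In this made-up example the single report of 30 days falls beyond the upper whisker, which is the kind of point that appears only as an individual dot in a rain plot.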
Objective health
In addition to the self-report measures, we obtained basic objective health measures—blood pressure and BMI—at each of the three comprehensive assessments (for descriptive statistics and effect sizes, see Table 1; rain plots are displayed in Fig. 4). We specified a growth mixed model with a Gaussian distribution. We did not find any treatment effects over time on BMI, b = −0.16, SE = 0.09, t(229) = −1.77, d = −0.23, p = .079, pHolm-Bonferroni = .236; systolic blood pressure, b = −0.73, SE = 1.33, t(232) = −0.55, d = −0.07, p > .250, pHolm-Bonferroni = 1; or diastolic blood pressure, b = −0.40, SE = 0.77, t(231) = −0.53, d = −0.07, p > .250, pHolm-Bonferroni = 1. Thus, we found no evidence that the PPI had any effects on the objective health outcomes.
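The test statistics reported above can be sanity-checked from the coefficients and standard errors, since each statistic is the estimate divided by its standard error. The Python sketch below does this for the three objective outcomes. It uses a standard normal approximation for the two-sided p value, which is an assumption made for simplicity: the reported tests use t distributions with about 230 degrees of freedom, and the inputs are rounded, so small discrepancies from the reported values are expected. The function names are illustrative.

```python
from statistics import NormalDist


def wald_statistic(b: float, se: float) -> float:
    """Test statistic for a coefficient: estimate divided by standard error."""
    return b / se


def two_sided_p_normal(stat: float) -> float:
    """Two-sided p value under a standard normal approximation (assumed)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(stat)))


# (b, SE) pairs as reported in the paragraph above.
reported = {
    "BMI": (-0.16, 0.09),
    "systolic blood pressure": (-0.73, 1.33),
    "diastolic blood pressure": (-0.40, 0.77),
}

results = {}
for outcome, (b, se) in reported.items():
    stat = wald_statistic(b, se)
    results[outcome] = (stat, two_sided_p_normal(stat))
    print(f"{outcome}: statistic = {stat:.2f}, approximate p = {results[outcome][1]:.3f}")
```

The recomputed statistics land close to the reported t values, and the approximate p values agree with the reported pattern: none of the three objective outcomes reaches conventional significance.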

Fig. 4. Rain plots of three objective health indicators (body mass index, systolic blood pressure, and diastolic blood pressure) at baseline, after treatment, and at follow-up 3 months after treatment, separately for individuals in the control and treatment groups. Dots indicate individual data points. The top and bottom of each box indicate the 75th and 25th percentile, respectively. The horizontal line in each box indicates the group median, and the whiskers represent values within 1.5 times the interquartile range from the upper and lower quartiles. The wavy plots to the right of each box indicate the density of the data.
Discussion
In a randomized controlled trial, we found evidence that a psychological intervention specifically designed to boost subjective well-being had downstream effects on self-reported physical health. Over the course of the program, treatment participants reported increasing levels of subjective well-being compared with control participants; week-to-week changes in subjective well-being over the course of treatment, in turn, predicted subsequent changes in the number of sick days reported by participants. In addition, from baseline through 3 months after the end of treatment, participants in the PPI reported fewer days in which they felt sick or in which their health interfered with normal activities. These findings suggest that increasing subjective well-being can make people feel healthier. However, we found no evidence that our treatment to enhance subjective well-being had any effects on objective indicators of health, including BMI and blood pressure. Thus, while helping to fill a critical gap in the literature, the present research further highlights the need for more prospective randomized trials employing the best practices in clinical science. A critical next step would be to see whether our findings can be replicated and extended using an active control group, thus controlling for possible placebo effects and demand characteristics inherent in our wait-list control design.
We found effects of our PPI on the self-reported number of days participants felt healthy or sick but found no effects on physiological indicators of health, including weight and blood pressure. This pattern of findings is consistent with that of past research, which has shown that subjective well-being predicts better immune functioning (Cohen et al., 2006; Marsland et al., 2006); in contrast, relatively little evidence exists to suggest that subjective well-being impacts weight or cardiovascular health in the short term (for reviews, see Diener et al., 2017; Howell et al., 2007). Indeed, our objective health indicators, blood pressure and BMI, were selected for their ease and affordability of measurement rather than for existing evidence of associations between these physiological indicators and higher subjective well-being. Thus, subsequent work building on the present research ought to include objective physiological indicators of health that have been more strongly linked to well-being in past research, such as indicators of immune functioning, endocrine response, inflammation, or wound healing (Diener et al., 2017). Additional objective indicators, such as medical-visit information and age of death, can also be collected in future studies. Collaborations between medical and psychological researchers can be particularly fruitful in providing better objective measures of health.
Relatedly, because changes in physical health often unfold slowly over time (Howell et al., 2007), future studies need to extend the assessment of physiological health indicators from months to years. Such longitudinal studies would also aid in answering longstanding theoretical questions about the connection between body and mind. Can a PPI have cumulative effects on health across the life span through a positive feedback loop (Fredrickson & Losada, 2005)? Does the life stage at which one engages with a PPI—during childhood, young adulthood, or middle age—moderate effects on health across a lifetime? Although early administration as part of children’s education curriculum might set people on the right track for compounded health benefits over the life course, undertaking such an intervention in middle age might be associated with greater agency and commitment.
It is also important to acknowledge the limitations arising from our wait-list control design. First, we suspect that this design may have produced the differential attrition we observed between conditions in weekly assessments: Although the weekly survey was the only task control participants were asked to complete each week, the treatment participants could complete the surveys only after completing their modules, writing exercises, and applied activities. Second, the wait-list design does not control for placebo effects or demand characteristics. To probe whether our effects on subjective well-being are primarily due to such design effects, we included several non-self-report measures of well-being. In a positive- and negative-memory recall task (Seidlitz & Diener, 1993), for example, treatment participants recalled a higher ratio of positive to negative life memories across the course of the study compared with control participants (Heintzelman et al., 2019). Of course, because health outcomes and health behaviors were intentionally not targeted by our intervention, we would not necessarily expect demand effects on our health outcomes.
Finally, our goal in employing a randomized-controlled-trial design was to examine whether meaningful increases in subjective well-being over time could have downstream consequences for physical health outcomes. An intrinsic issue with any randomized controlled trial of a comprehensive psychological intervention, however, is that the multimodal nature of the intervention creates a third variable problem. In other words, it is possible that any of the skills targeted by our intervention may have produced direct effects on subjective well-being and concurrent direct effects on health outcomes. Even a design using an active control condition could not properly address all possible confounding variables. Of course, only a well-controlled lab environment can establish clear causality by manipulating isolated factors. The primary contribution of our randomized-controlled-trial design is to add to the evidence predominant in the existing literature on the causal effects of subjective well-being on health—which comes precisely from brief lab manipulations or longitudinal studies with no manipulations (Diener et al., 2017).
Conclusion
Using an externally valid manipulation to boost subjective well-being beyond the laboratory, the current work joins a growing body of literature suggesting that increasing subjective well-being can lead to feeling healthier. Happiness might be good not only for the mind but for the body as well.
Supplemental Material
Supplemental material, Kushlev_OpenPracticesDisclosure_rev for Does Happiness Improve Health? Evidence From a Randomized Controlled Trial by Kostadin Kushlev, Samantha J. Heintzelman, Lesley D. Lutes, Derrick Wirtz, Jacqueline M. Kanippayoor, Damian Leitner and Ed Diener in Psychological Science
Supplemental material, Kushlev_Supplemental_Material_rev for Does Happiness Improve Health? Evidence From a Randomized Controlled Trial by Kostadin Kushlev, Samantha J. Heintzelman, Lesley D. Lutes, Derrick Wirtz, Jacqueline M. Kanippayoor, Damian Leitner and Ed Diener in Psychological Science
Supplemental material, Kushlev_Transparency_Report for Does Happiness Improve Health? Evidence From a Randomized Controlled Trial by Kostadin Kushlev, Samantha J. Heintzelman, Lesley D. Lutes, Derrick Wirtz, Jacqueline M. Kanippayoor, Damian Leitner and Ed Diener in Psychological Science
Footnotes
Acknowledgements
We thank Shigehiro Oishi (Columbia University), who contributed to the development of materials used in the present work.
Transparency
Action Editor: Brent W. Roberts
Editor: D. Stephen Lindsay
Author Contributions
K. Kushlev and S. J. Heintzelman contributed equally to the study design, study implementation, and writing of this manuscript, and they share first authorship. E. Diener served in an advisory role throughout the research process. L. D. Lutes, D. Wirtz, J. M. Kanippayoor, and D. Leitner spearheaded the conceptualization, material development, participant recruitment, and implementation of the in-person portion of the trial. All of the authors approved the final version of the manuscript for publication.
