Abstract
Objective:
To evaluate the effects of financial incentives on physical activity (PA).
Data Sources:
MEDLINE, Embase, 7 other databases, and 2 trial registries until July 17, 2019.
Study Inclusion and Exclusion Criteria:
Randomized controlled trials with adults aged ≥18 years assessing the effect of financial incentives on PA. Any comparator was eligible provided the only difference between groups was the incentive strategy.
Data Extraction:
Two independent reviewers extracted data and assessed study quality. Of 5765 records identified, 57 records (51 unique trials; n = 17 773 participants) were included.
Data Synthesis:
Random-effects models pooled data for each of the 5 PA domains.
Results:
Financial incentives increase leisure time PA (gym or class attendance; standardized mean difference [95% CI], 0.46 [0.28-0.63], n = 5057) and walking behavior (steps walked; 0.25 [0.13-0.36], n = 3254). No change in total minutes of PA (0.52 [−0.09 to 1.12], n = 968), kilocalories expended (0.19 [−0.06 to 0.44], n = 247), or the proportion of participants meeting PA guidelines (risk ratio [95% CI] 1.53 [0.53-4.44], n = 650) postintervention was observed. After intervention has ceased, incentives sustain a slight increase in leisure time PA (0.10 [0.02-0.18], n = 2678) and walking behavior (0.11 [0.00-0.22], n = 2425).
Conclusions:
Incentives probably improve leisure time PA and walking at intervention end, and small improvements may be sustained over time once incentives have ceased. They lead to little or no difference in kilocalories expended or minutes of PA. It is uncertain whether incentives change the likelihood of meeting PA guidelines because the certainty of the evidence is low.
Introduction
Adults who engage in regular physical activity (PA) experience many health benefits, including a reduced risk of chronic disease and improved quality of life. 1,2 However, only 1 in 3 adults in high-income countries meet the World Health Organization (WHO) weekly PA recommendations. 3 Increasingly, behavioral economic principles are used in health interventions to motivate people to be physically active. 4 -7 For example, offering immediate financial incentives can increase PA participation by offsetting one’s present bias to overemphasize the immediate opportunity costs of participation (eg, physical discomfort) at the expense of gaining long-term health benefits. 5,8 Financial incentives can take various forms (eg, cash rewards, 9 lottery draws, 10 or financial penalties for not engaging in a healthy behavior 11 ) and have been implemented in university, 12 workplace, 13 and health care settings. 11,14
Previous systematic reviews suggest financial incentives may increase PA in the short and long term. 15 -21 Systematic reviews conducted to date have major limitations, including a narrow scope (eg, focused on a single setting, population, 15,17 or PA domain 15,16,20 ), a limited search strategy 17,18,20,21 or methodological approach to evidence synthesis, 16,21 and/or failure to assess impact of publication bias or quality of studies. 16 -19 Given these limitations, and the recent rapid increase in trials evaluating financial incentives for PA (12 new trials published in 2017-2018 alone), an up-to-date high-quality systematic review and meta-analysis is required.
Objective
This systematic review and meta-analysis aims to provide an evidence synthesis of randomized controlled trials (RCTs) that evaluated the effectiveness of financial incentives on PA participation in adults. Secondary objectives are to describe the (1) structure and design attributes of different financial incentives using an established framework, 22 (2) degree to which financial incentives are theory-informed, 23 (3) behavior change techniques used in designing interventions, 24 and (4) social ecological context under which financial incentives are studied. 25
Methods
Our review was guided by Cochrane Methodological Expectations of Cochrane Intervention Reviews 26 and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses. 27 The protocol was prospectively registered on the International Prospective Register of Systematic Reviews (# CRD42017068263) and published. 28 The minor deviations from the protocol are described in Supplementary File 1A. We resolved disagreements over eligibility, extraction, or quality via discussion with a biostatistician (J.K.) and/or third reviewer (R.S.H.).
Data Sources
One author (M.N.L.), with assistance from an academic librarian, conducted the search that combined relevant terms for population, intervention, and outcome (Supplementary File 1B). We searched the following databases from inception to July 17, 2019: MEDLINE (Ovid), Embase (Ovid), Cochrane Central Register of Controlled Trials (CENTRAL), Cumulative Index to Nursing and Allied Health Literature (CINAHL) (EBSCO), Web of Science (Clarivate), Scopus (Elsevier), PsycINFO (Ovid), EconLit (EBSCO), SPORTDiscus (EBSCO), and National Health Service (NHS) Economic Evaluation Database. We searched ClinicalTrials.gov and the WHO International Clinical Trials Registry Platform and screened reference lists of relevant systematic reviews 15 -19,29 and included studies. If a study appeared in a trial registry but was unpublished (eg, a dissertation), we contacted the trial author and offered the opportunity to provide preliminary or unpublished results.
Inclusion and Exclusion Criteria
Selection was based around the Population, Intervention, Comparison, Outcome framework. 30 Any English-language RCTs (published or unpublished) that met the following criteria were included. (1) Population: adults aged ≥18 years. (2) Intervention: at least 1 treatment group receiving a financial incentive to increase participant PA. We defined a financial incentive as any material reward or penalty applied to an individual or group to encourage participation in PA, including, but not limited to, cash rewards, deposit contracts, and vouchers that could be exchanged for goods. (3) Comparison: any comparator was eligible provided the only difference between the comparison and intervention groups was the financial incentive strategy under investigation. Trials comparing 2 or more financial incentives were eligible provided all other treatment elements were similar across trial arms. (4) Outcome: any quantitative measure of PA, measured both prior to randomization and following intervention. We defined PA as any bodily movement produced by skeletal muscles that requires energy expenditure, 31 including leisure time activity as well as planned, structured, and purposeful activity.
Study Selection
Two authors (M.N.L., M.H.) independently screened titles and abstracts of articles and reviewed the full text of potentially eligible studies. One reviewer (M.N.L.) examined all included articles to link reports of the same RCT together and avoid including duplicates.
Data Extraction and Quality Assessment
Two reviewers (M.N.L., M.H.) independently extracted data and outcomes from included studies. One reviewer (M.N.L.) extracted intervention characteristics about the financial incentives. A second reviewer (M.H.) audited 10% of this information for accuracy. We extracted the following:
Study characteristics: author and year of publication, number of trial arms, setting and social ecological level according to Golden and Earp, 25 sample size, and funding source(s).
Participant characteristics: health condition, age, sex.
Intervention characteristics: treatments applied in eligible comparison (control) and intervention groups. We used the Behaviour Change Taxonomy to code behavior change techniques, 23 item 1 of the Theory Coding Scheme 23 to assess use of theory, the Template for Intervention Description and Replication (TIDieR) checklist to extract essential intervention details, as proposed by experts, 32,33 and described intervention attributes using a modified framework for financial incentives. 22,34 This framework describes the direction, form, magnitude and calculated total magnitude, certainty of receiving the incentive (ie, lottery model), target behavior, frequency, timing of assessment, immediacy of receiving the incentive, schedule (ie, fixed magnitude of incentive or variable magnitude of incentive over time), and recipient of the incentive.
Outcomes: We used the Strath decision matrix to categorize PA outcomes into 5 domains 35 : (1) leisure time activity (eg, gym visits), (2) walking behavior (eg, steps/day), (3) meeting PA guidelines (eg, proportion of participants achieving public health recommendations), (4) kilocalories expended (eg, kcal/d), and (5) total minutes of PA (eg, moderate-to-vigorous PA/week). For each study, when possible, we extracted 1 outcome per domain measured at 3 time points: baseline, immediately posttreatment, and at the longest measured follow-up. Where multiple outcomes were measured under the same domain, we established post hoc decision rules 36 that prioritized objective measures over subjective measures, absolute follow-up scores over change scores, intention-to-treat over per-protocol analyses, and primary analyses over sensitivity analyses.
We extracted frequency counts for dichotomous data and calculated effect sizes via risk ratios (RRs) with 95% confidence intervals. We extracted point estimates with measures of variability for continuous data and calculated mean differences (MDs) and 95% confidence intervals. Where necessary, we used CIs or standard errors to calculate standard deviations (SD). We contacted authors to request missing or unclear information (Supplementary File 1C). We used DigitizeIt (https://www.digitizeit.de/) to extract information from graphs where necessary. When required, we imputed missing data by calculating the median of other SDs reported under the domain as per Cochrane methods. 37
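The SD recovery steps described above can be illustrated with a minimal Python sketch. It is not the authors' workflow (the review reports using RevMan and R); the function names and the example numbers are hypothetical. It assumes the reported interval is a 95% CI around a single group mean and uses the large-sample normal multiplier of 1.96; for small groups a t-based multiplier would be more appropriate.

```python
import math
import statistics


def sd_from_se(se, n):
    """Recover a group SD from the standard error of its mean: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, z=1.96):
    """Recover a group SD from a 95% CI around the group mean.

    Assumes a large-sample normal interval, so the CI width is 2 * z * SE.
    """
    se = (upper - lower) / (2 * z)
    return sd_from_se(se, n)


def impute_sd(reported_sds):
    """Impute a missing SD as the median of the SDs reported in the same domain."""
    return statistics.median(reported_sds)


# Hypothetical example: 100 participants, mean daily steps with 95% CI 6608 to 7392.
example_sd = sd_from_ci(6608, 7392, n=100)

# Hypothetical SDs from other trials in the same domain, used for imputation.
example_imputed = impute_sd([1800, 2100, 2500])
```

In this hypothetical case the CI width of 784 steps implies an SE of 200 and therefore an SD of about 2000 steps.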
Two review authors (M.N.L., M.H.) independently used the Cochrane risk of bias tool 38,39 to assess individual study bias. We evaluated (1) random sequence generation, (2) allocation concealment, (3) blinding of personnel and participants, (4) blinding of outcome assessment, (5) incomplete outcome data, (6) selective reporting, and (7) other bias. One reviewer (M.N.L.) rated quality of intervention reporting using the TIDieR checklist 32 and another (M.H.) independently verified 10%.
Data Synthesis and Analysis
We pooled data for each domain of PA at 2 time points: (1) end of intervention and (2) longest follow-up. We only pooled data where outcome measures within the activity domain were sufficiently homogeneous with respect to the scale of the measurement tool (eg, for walking behavior, we pooled continuous outcome measures [steps/day], but not categorical measures [“walking goals met”]). We did not include data from trials in a meta-analysis if (1) estimates of effect or variance could not be computed or (2) the trial only compared financial incentives head-to-head.
Following Cochrane guidelines, 37 if trials included multiple eligible study arms, we created a single pair-wise comparison at the study level by combining all relevant comparison intervention arms into a single control group and all relevant financial incentive arms into a single treatment group. We used Review Manager (RevMan; Cochrane Collaboration) to pool means, SDs, and sample sizes so that each trial made only one contribution to each meta-analysis. For continuous outcomes, we calculated standardized mean differences (SMDs; 95% CI) between groups; for dichotomous outcomes, we calculated RRs (95% CI). After producing pooled means, SDs, and sample sizes, we combined these contrasts across studies in a random-effects meta-analysis conducted in R 3.4.3 (R Core Team, 2017) with the meta 40 and metafor 41 packages (Supplementary File 1D).
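Combining arms and computing an SMD can be sketched as follows. This is an illustrative Python example, not the RevMan or R code used in the review. The arm-combining formula is the standard one for merging two groups into one; the SMD is computed as Hedges' g with a common large-sample variance approximation, which is an assumption, since the exact SMD variant used by the software is not stated here. All function names and example values are hypothetical.

```python
import math


def combine_arms(n1, m1, sd1, n2, m2, sd2):
    """Merge two arms into one group, returning (n, mean, sd).

    The combined SD accounts for within-arm variance and for the
    difference between the two arm means.
    """
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
           + (n1 * n2 / n) * (m1 - m2) ** 2) / (n - 1)
    return n, mean, math.sqrt(var)


def hedges_g(n_t, m_t, sd_t, n_c, m_c, sd_c):
    """Standardized mean difference (Hedges' g) and its standard error."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_t + n_c - 2))
    d = (m_t - m_c) / pooled_sd
    j = 1 - 3 / (4 * (n_t + n_c) - 9)  # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return j * d, j * math.sqrt(var_d)


# Hypothetical trial: two incentive arms merged, then compared with one control arm.
n_t, m_t, sd_t = combine_arms(40, 7400, 2100, 45, 7100, 1900)
g, se_g = hedges_g(n_t, m_t, sd_t, 50, 6600, 2000)
```

Merging arms before computing the contrast means each trial contributes one effect estimate, which avoids counting the same control participants in two correlated comparisons.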
We assessed statistical heterogeneity using the I2 statistic, with values above 30%, 50%, and 75% considered moderate, substantial and considerable, 37 respectively. We calculated a prediction interval to see the distribution of true intervention effects and to compute an approximate range of effects expected in similar studies. 42 We conducted random-effects meta-analyses using separate estimates of τ2 within each subgroup by measurement type (eg, objective, subjective) and health status (eg, healthy, obese/overweight, other chronic disease). We estimated separate intervention effects by incentive dimension (eg, direction, form, magnitude) for separate subgroups if trials included multiple arms that varied by an incentive dimension. We conducted 2 sensitivity analyses. One sensitivity analysis removed studies where we imputed data. The second sensitivity analysis was restricted to studies at low risk of bias on domains we determined a priori to be the greatest threat to internal validity 43 (ie, allocation concealment, blinding of assessors, and/or incomplete outcome data). Where 10 or more studies were pooled, we assessed publication bias by contour-enhanced funnel plots with power regions using R 3.4.3 (R Core Team, 2017) using the metaviz package. 44,45
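A random-effects pooling step with the I2 statistic can be sketched in a few lines of Python. This is a simplified illustration, not the meta/metafor analysis in Supplementary File 1D. It assumes the DerSimonian-Laird estimator of τ2, which may differ from the estimator the authors used, and it omits the prediction interval. Function names and the input values are hypothetical; the heterogeneity labels follow the thresholds stated above.

```python
import math


def random_effects_dl(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at 0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "i2": i2,
            "ci": (pooled - 1.96 * se, pooled + 1.96 * se)}


def describe_i2(i2):
    """Label heterogeneity using the thresholds of 30%, 50%, and 75%."""
    if i2 > 75:
        return "considerable"
    if i2 > 50:
        return "substantial"
    if i2 > 30:
        return "moderate"
    return "not flagged"


# Hypothetical SMDs and their variances from three trials.
result = random_effects_dl([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
```

Because τ2 is added to every study's variance, the random-effects weights are more equal across studies than fixed-effect weights, and the pooled CI widens as heterogeneity grows.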
We calculated the cost of financial incentives by standardizing reported measures of time spent in activities into metabolic equivalent of task (MET)-hours gained per person based on existing formulas for PA outcome translation. 46 We included only the value of the financial incentives themselves and did not consider costs incurred in their development or administration (eg, personnel).
Quality of the Body of Evidence
We used the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework to assess the body of evidence for each PA domain. 47 Two authors (M.N.L., M.H.) used GRADEpro software (McMaster University, 2015, developed by Evidence Prime Inc, available from gradepro.org) to independently assess the body of evidence. We initially considered evidence “high quality” and downgraded it if there was concern about risk of bias, indirectness, inconsistency, imprecision, or risk of publication bias. 47 We summarize the contents of the Summary of Findings table using descriptions adapted from Cochrane plain language summaries.
Results
We identified 8316 records through databases and 42 citations from other sources, yielding 5765 titles after removing duplicates (Figure 1). We identified 14 completed but unpublished RCTs and 13 ongoing trials (Supplementary File 1E). Fifty-one unique studies (n = 17 773 participants, Supplementary File 1F) from 8 countries were included in qualitative synthesis and 39 studies (n = 9181) in meta-analysis. Forty-eight RCTs (94%) compared an intervention with at least 1 financial incentive to a comparison group with no financial incentive, and 3 RCTs (6%) compared 2 financial incentives head-to-head. 48 -50 Participants ranged in age from 18 to 81 years, and 67% were female. The median sample size was 120 (range: 12-3515).

Figure 1. Flow diagram of review process.
Most trials included more than 1 financial incentive arm and/or comparison arm, although not all arms were eligible for inclusion (Table 1). Median length of intervention was 10.5 weeks. Overall, 95 different incentive arms were extracted (Table 2). Settings included workplaces (k = 14), schools (k = 13), communities (k = 15) and health care (k = 8). Twenty-one trials measured leisure time PA, 23 measured walking behavior, 3 measured attainment of PA guideline recommendations, 5 measured kilocalories expended, and 7 measured total PA. Thirty-one trials measured PA at a later follow-up after the incentive had ceased (median: 24 weeks, range: 4-104). All 51 trials targeted behavior change at the intrapersonal level of the social ecological model (Supplementary File 1F).
Summary of Findings and Quality of the Body of Evidence.
Abbreviations: GRADE, Grading of Recommendations Assessment, Development and Evaluation; RR, risk ratio; SMD, standardized mean difference.
a GRADE Working Group grades of evidence: High = This research provides a very good indication of the likely effect. The likelihood that the effect will be substantially different is low. Moderate = This research provides a good indication of the likely effect. The likelihood that the effect will be substantially different is moderate. Low = This research provides some indication of the likely effect. However, the likelihood that it will be substantially different is high. Very low = This research does not provide a reliable indication of the likely effect. The likelihood that the effect will be substantially different is very high. Substantially different = a large enough difference that it might affect a decision.
b Inconsistency considered serious when there was minimal to no overlap of CIs or heterogeneity was high (I2 ≥ 50%) and could not be explained.
c Imprecision was considered serious when the 95% CI crossed the line of no effect and was wide, such that the interpretation of the data would be different if the true effect were at the lower bound or the upper bound of the CI or there are very few events.
d Inconsistency considered very serious when heterogeneity was high (I2 ≥ 50%) and could not be explained, when there was no overlap of CIs, or when only 1 study was assessed (I2 unavailable).
e Jeffery et al, 1998 51 , has a higher weight given the sample size and is ranked at high risk of bias for 2 of 3 prioritized risk of bias domains. The remaining 2 studies have a rating of “unclear” or “high” for at least 1 prioritized domain, indicating that there is a substantial risk of bias across most of the body of available evidence.
Subgroup Analyses.a
Abbreviation: SMD, standardized mean difference.
a Pooled random-effects analysis of standardized mean differences (SMD) comparing financial incentives to no financial incentives at the end of treatment and at longest follow-up.
b ni = Participants in the intervention arm; nc = participants in the comparison arm.
c Calculated per day in 2018 US$.
The included trials investigated cash, donations, reimbursements, vouchers or goods/services, and financial incentive dimensions varied considerably across trials (Supplementary File 1G). Use of theory and behavior change techniques to develop financial incentives also varied (Supplementary File 1H and 1I). Only 16 (31%) trials described their interventions fully according to TIDieR guidelines (Supplementary File 1J). The methodological domain most commonly classified at high risk of bias in individual studies was blinding of participants and personnel (Supplementary File 1K), with 49 trials (96%) deemed at high risk.
Synthesis of Data
We present PA outcomes from individual trials in Supplementary File 1L. Of the 51 trials, 39 provided data amenable to meta-analysis. Six trials did not report estimates of effect or variance, 3 studies reported outcomes that could not be combined, and 3 studies only compared financial incentives head-to-head (Supplementary File 1M).
Leisure time PA
Twenty-one (41%) trials measured leisure time PA. Of these, 15 (n = 5057) measured gym attendance and were suitable to pool. Evidence suggests financial incentives had a moderate effect on increasing gym attendance (SMD [95% CI] = 0.46 [0.28-0.63], P < .0001; I2 = 84%; Figure 2A), and in 6 trials (n = 2678), financial incentives sustained a slight increase in gym attendance at the longest follow-up after incentives had ceased (0.10 [0.02-0.19], P = .0154; I2 = 3.3%).

Figure 2. Pooled random-effects model of financial incentives on domains of physical activity. Reported as standardized mean differences (SMD) or risk ratios (RR) with 95% CI. A, Leisure time physical activity (gym attendance). B, Walking behavior (average daily steps). C, Meeting physical activity guideline recommendations. D, Kilocalories expended. E, Total physical activity minutes.
Walking behavior
Twenty-three trials (45%) used Fitbits, pedometers, or other objective measures to assess walking behavior. Of these, 20 (n = 3254) measured daily steps and were suitable to pool. Evidence suggests financial incentives had a small effect on increasing average daily steps (0.25 [0.13-0.36], P < .01; I2 = 55%; Figure 2B) at the end of intervention. In 14 trials, financial incentives demonstrated a sustained small effect on walking behavior at the longest follow-up after incentives had ceased (0.11 [0.00-0.22], P = .07; I2 = 39%). The cost of these interventions was US$5.60 2018/MET-hour/person for increasing daily steps (Supplementary File 1N).
Physical activity guidelines
Two studies (4%) measured the proportion of participants meeting PA guidelines using the International Physical Activity Questionnaire (n = 650) and were suitable to pool. There was insufficient evidence to determine if financial incentives increased or decreased the proportion of participants meeting PA guidelines (RR = 1.53, 95% CI: 0.53-4.44, P = .43; I2 = 86%; Figure 2C) at the end of intervention. Only 1 trial (n = 599) measured this outcome after incentives had ceased, showing incentives reduced the proportion of participants meeting PA guidelines relative to control at 6 months after intervention (RR [95% CI] = 0.63 [0.55-0.73]).
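As a worked check of how a risk ratio, its CI, and its P value relate, the reported end-of-intervention estimate can be re-derived on the log scale. The Python sketch below is illustrative only and is not the authors' code. It assumes the interval is a normal-approximation (Wald) interval on the log RR, and the function name is hypothetical.

```python
import math


def p_from_ratio_ci(rr, lower, upper, z_crit=1.96):
    """Back-calculate SE(log RR), z, and a 2-sided P value from an RR and its 95% CI.

    Assumes the CI was built as exp(log RR +/- z_crit * SE).
    """
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(rr) / se_log
    p = math.erfc(abs(z) / math.sqrt(2))  # 2-sided normal tail probability
    return se_log, z, p


# Values reported for meeting PA guidelines at the end of intervention.
se_log, z, p = p_from_ratio_ci(1.53, 0.53, 4.44)
```

Under these assumptions the standard error of the log RR is about 0.54 and the 2-sided P value is about .43, which agrees with the reported value. The wide interval on the ratio scale reflects the small number of pooled studies and their high heterogeneity.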
Kilocalories expended
Five (11%) studies measured kilocalories expended and 3 (n = 247) were suitable to pool. There was insufficient evidence to determine if financial incentives increased or decreased kilocalories expended (0.19 [−0.06 to 0.44], P = .13; I2 = 0; Figure 2D) at the end of the intervention period. No trials measured this outcome at a later follow-up.
Total PA
Seven (17%) studies measured total PA, and of these, 6 (n = 968) measured minutes of PA and were suitable to pool. There was insufficient evidence to determine if financial incentives increased or decreased total minutes of PA at the end of intervention (0.52 [−0.09 to 1.12], P = .07; I2 = 84%; Figure 2E) or at the longest follow-up (0.09 [−0.33 to 0.51], P = .13; I2 = 56%).
Sensitivity Analyses
Findings were similar in sensitivity analyses removing imputed data (Supplementary File 1O) and removing studies rated as unclear or high risk of bias (Supplementary File 1P).
Subgroup Analyses
Findings were similar whether outcomes were measured objectively or subjectively (Figure 2). The test for subgroup differences indicates that there is no statistically significant subgroup effect for leisure time PA (P = .97), suggesting that measurement type does not modify the effect of interventions with financial incentives in comparison to interventions without financial incentives. However, a far smaller number of trials and participants contributed data to the subjective subgroup (k = 4, n = 330) than to the objective subgroup (k = 9, n = 3568), meaning that the analysis may not be able to detect subgroup differences. All studies measured walking behavior objectively and meeting PA guidelines subjectively. Only a small number of studies were pooled for total minutes of PA, meaning that the subgroup analysis is unlikely to produce useful findings.
Although there were some variations in estimates of effectiveness when analyses were subgrouped according to health status and incentive design features, the overlapping confidence intervals, combined with the small numbers of studies pooled, do not provide evidence that any financial incentives are more effective than others (Table 1).
Quality of the Body of the Evidence
Table 2 summarizes findings and the quality of the body of evidence. Contour-enhanced funnel plots with power regions for leisure time PA and walking behavior suggest that meta-analytic summary effects for these outcomes may be driven by underpowered but significant trials (Supplementary File 1Q).
Conclusions
This systematic review and meta-analysis provides an up-to-date synthesis of RCT evidence about the effectiveness of financial incentives for increasing PA participation in adults. We found moderate quality evidence that financial incentives had a moderate effect on increasing leisure time PA and a small effect on walking behavior at the end of intervention. High-quality evidence showed financial incentives sustained a slight increase in leisure time PA, and moderate-quality evidence showed that financial incentives might sustain a slight increase in walking behavior at longest follow-up after intervention has ceased. The evidence contributing to comparisons of kilocalories expended and total minutes of PA was low quality and was consistent with financial incentives leading to no change in these outcomes. The evidence associated with proportions of participants meeting PA guidelines was very low quality and was consistent with no change immediately at the end of intervention.
Our findings are consistent with previous reviews that evaluated effects of financial incentives on leisure time PA 16 -19,21 and walking behavior. 15,20,21 We observed a smaller effect on daily steps (SMD = 0.25) than Gong et al (SMD = 0.39), 15 which may be due to the larger number of trials (k = 20 vs k = 2) and participants (n = 3254 vs n = 242) pooled in our review. In absolute terms, we observed a slightly larger MD in daily steps (754 steps) than that observed by Mitchell et al (607 steps) at the end of the intervention period 20 and a smaller difference at longest follow-up (MD = 459 steps vs MD = 376 steps). These differences are probably due to differing statistical approaches in meta-analysis; we combined multiple incentive arms from one study to generate a single pair-wise comparison, as opposed to including 2 or more correlated comparisons. Ours is the first review to find evidence of sustained small effects of financial incentives on leisure time PA following cessation of the incentive. These sustained effects should be considered cautiously, given the heterogeneity of secondary follow-up time points, which ranged from 4 to 104 weeks (median: 24 weeks).
Previously published reviews have made recommendations about the most effective design features 20,21 ; however, we did not find sufficient evidence in our subgroup analyses to draw any definitive conclusions. One review suggests incentives should be designed to be delivered within 7 days of a goal achievement, 20 and another suggests they are more likely to be effective if there are multiple opportunities to earn incentives. 21 While we acknowledge variation in the magnitude of effect sizes observed with different financial incentive features in our analyses, given the overlapping CIs across subgroups and the small number of trials and participants contributing to data in certain subgroups, there was no evidence in these data about which incentive designs were most effective.
A practical implication to consider is the small-to-moderate effect sizes observed in this meta-analysis which vary by PA outcome. While there are known clinical benefits for people who consistently meet PA guidelines, there is limited information on the clinical benefits associated with leisure time PA or minimum daily steps needed to achieve any health benefits. 52,53 Recent prospective studies suggest an inverse dose-response association between daily steps and mortality for adults. 54,55 There is some evidence that suggests even small changes in step count at a magnitude of 700 steps/day could have clinically relevant health benefits for chronic disease populations. 56 -58 Among older women, as few as approximately 4400 steps/day was significantly related to lower mortality rates compared with approximately 2700 steps/day. 55 With more steps/day, mortality rates progressively decreased before leveling at approximately 7500 steps/day. In addition, leisure time PA patterns characterized by 1 or 2 sessions/week may reduce all-cause, cardiovascular disease, and cancer mortality risks regardless of adherence to prevailing PA guidelines. 59
Another practical implication for policymakers to consider is the cost-effectiveness of an intervention. Although estimates of the economic burden of physical inactivity vary, 60 the direct cost of incentive interventions to increase walking (estimated at US$5.60 2018/MET-hour/person, Supplementary File 1N) is higher than the potential health care cost saving calculated from credible estimates of the cost of inactivity 46,61,62 (ranging from US$0.22 to US$1.15 2018/MET-hour/person). On the other hand, it may be that health gains from increased activity would justify the net cost of incentives. In the future, financial incentives may specifically be designed to have low public costs to overcome the substantial financial burden associated with inadequate levels of PA. They also have the capacity to be easily scaled up for use at the population level, with minimal effort from health service providers, given that implementing financial incentives does not require specialized staff or resources. 5,63 For example, mobile applications that provide continuous financial incentives to be physically active, such as Sweatcoin and the Vitality Activity Reward program, have respectively had a reach of more than 20 million users globally 64 and 415 000 adults in South Africa. 65 Many corporate wellness programs, private insurers, government agencies, and public health programs are already using financial incentives for health promotion. 65 -68 Improving their efficacy is an important consideration for policy and practice as these programs continue to expand.
Detailed descriptions of interventions in RCTs are necessary to allow public health programs to implement effective interventions and tailor incentives to different contexts and populations. 4,23,32,70,71 In our review, fewer than one-third of trials described interventions in sufficient detail to allow replication. Future research in this field should use the CONSORT statement and TIDieR checklist to facilitate detailed and transparent reporting. Similarly, many studies failed to explicitly describe the theory postulating why the financial incentive should lead to increased PA. It is unclear if authors failed to use theory in designing interventions or simply did not report their use of theory in intervention design. Use of behavior change and behavioral economics theory in future research may enhance effectiveness of financial incentive interventions. 4,23,70,71
A key strength of our systematic review is that it represents a more comprehensive approach compared to previous reviews, 15 -21 which have provided a fragmented view of the evidence by focusing only on evidence in specific settings, populations, PA domains, or time periods. The largest and most recently published systematic review searched for publications from 1996 to June 2018, 21 synthesizing the evidence for only 23 eligible studies. In contrast, our rigorous and extensive search strategy identified 51. Our systematic review is the first to evaluate effects of financial incentives on total kilocalories, minutes of PA, and meeting PA guidelines and establish evidence for the sustained effects of incentives on leisure time PA. We also reported on secondary objectives of interest to health practitioners, including the quality of trial reporting using the TIDieR guidelines, identification of incentive design, behavior change techniques, theory, and socioecological levels of these interventions.
Another strength of our study is the explicit, transparent, and appropriate methodological approach to synthesizing data. We minimized potential biases by registering the systematic review (PROSPERO # CRD42017068263) and publishing a protocol. 28 We provide open data and materials and document, explain, and justify any changes from the protocol. Previous reviews used vote counting procedures to synthesize evidence, 20,21 which has no statistical rationale, 72 poor performance validity, 73 and can result in misleading conclusions, particularly when statistical power is low. 72,74 We used more robust methods by conducting random-effects meta-analysis and explored reasons for statistical heterogeneity. When multiple incentive arms were featured in one study, we followed appropriate analysis techniques by combining arms to create pair-wise comparisons, whereas other reviews included multiple incentive arms in one meta-analysis without addressing the problem of correlated comparisons. 15,20,21 Finally, we used GRADE to assess the quality of the body of evidence and adhered to Cochrane MECIR standards 26 and PRISMA guidelines 27 to maximize robustness and assure the quality of the review.
Our systematic review has several limitations. Because we restricted our inclusion criteria to studies written in English, there may be language bias. The evidence consisted largely of trials from the United States (34/51 studies), and thus, findings may not be generalizable elsewhere. Furthermore, because we included a broad range of ages and populations, the pooled findings may not be generalizable to specific populations. Vulnerable populations were underrepresented; only 2 studies were conducted in low-income countries or socioeconomically disadvantaged populations. 50,75 Despite efforts to obtain data from unpublished trials, 14 authors did not respond to our requests to provide data (Figure 1). Although we did not detect publication bias, the possibility of selective reporting in included studies may undermine generalizability of findings. We encourage researchers to report their findings, regardless of outcomes, no later than 1 year after study completion date (as recommended by ClinicalTrials.gov).
Despite moderate-to-high quality evidence for financial incentives increasing leisure time PA and walking behavior, only low-to-very-low quality evidence was found for other PA outcomes. This low-to-very-low quality evidence was driven by a limited number of small trials with variable treatment effects, which resulted in inconsistent and imprecise effect estimates (Table 1). Future trials should consider evaluating domains of PA that are relevant to the target population and ensure a priori that the trial is powered to detect a minimal clinically important difference. More studies with sample sizes adequately powered to detect meaningful change will reduce heterogeneity in estimates and increase our confidence in the estimated range of possible treatment effects for different measures of PA. This is a rapidly changing field of research, and inclusion of new trials may change our findings.
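As a rough illustration of what powering a trial a priori involves, the sketch below applies the common normal-approximation formula for a two-arm parallel trial with equal allocation, n per group = 2(z_{1-α/2} + z_{1-β})² / d², where d is the target standardized mean difference. The function name `n_per_group` and the default values for alpha and power are assumptions chosen for illustration; the review does not prescribe a particular minimal clinically important difference or sample size method.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(smd, alpha=0.05, power=0.80):
    """Approximate participants needed per arm for a two-sided test.

    smd: target standardized mean difference (hypothetical input)
    alpha: two-sided type I error rate
    power: desired probability of detecting the target difference
    """
    if smd == 0:
        raise ValueError("smd must be nonzero")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / smd**2)
```

Under these assumptions, halving the target effect roughly quadruples the required sample: `n_per_group(0.5)` returns 63 per arm, whereas `n_per_group(0.25)` returns 252. The approximation ignores attrition and clustering, so a real trial would typically need more participants.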
A final clinical consideration is the limited evidence about sustainability of changes in PA once financial incentives are withdrawn. Although our review showed sustained effects of incentives on gym attendance and daily steps, effect sizes were small and diluted from the moderate effects observed at the end of the intervention. Furthermore, of the 31 trials in the review that measured PA beyond the incentive period, only 10 had a follow-up period longer than 6 months. Given that relatively few RCTs evaluated sustained effects of financial incentive interventions beyond the intervention period, future research should aim to evaluate intervention effects for at least 12 months following the incentive period.
This review highlights the potential role of financial incentives to promote and maintain PA in adults across diverse settings and populations, with the most robust evidence supporting the use of financial incentives to increase walking behavior in the short term. There are preliminary data to suggest financial incentives can promote PA maintenance; however, more research is needed to examine the conditions under which financial incentives are likely to drive long-term, postincentive adherence to PA. Future research is required to design and evaluate cost-effective financial incentive interventions that have a sustainable impact on PA behavior. Researchers should design studies that are adequately powered to detect clinically meaningful treatment effects, consider potential treatment mediators and moderators (eg, gender), include long-term follow-up, and transparently report interventions and findings using established guidelines. Programs should consider the cost-effectiveness, sustainability, and acceptability of this behavior change technique compared to other approaches before we confidently advocate its widespread implementation in public health policy. 4,76,77
So What? (Implications for Health Promotion Practitioners and Researchers)
What is already known on this topic?
Previous systematic reviews suggest financial incentives may increase walking behavior in the short term. There has been a recent rapid increase in trials evaluating financial incentives for physical activity (PA) and in the implementation of financial incentive programs in industry settings to encourage healthy behaviors.
What does this article add?
In this meta-analysis of 39 individual trials including 9181 individual participants, there is moderate quality evidence that financial incentives improved leisure time PA and walking behavior at the end of the intervention period and moderate-to-high quality evidence of small, sustained effects at the longest follow-up after the incentive had been withdrawn. Evidence suggests financial incentives did not change kilocalories expended, total PA minutes, or the proportion of people meeting PA guidelines. The magnitude of effect sizes varied with different financial incentive features; however, there is no evidence about which incentive designs were most effective, given the overlapping CIs across subgroups and the small number of trials and participants contributing data in some subgroups. Although our review showed sustained long-term effects of financial incentives on gym attendance and daily steps, effect sizes were small and diluted from the moderate effects observed at the end of the intervention.
What are the implications for health promotion practice or research?
Given that relatively few RCTs evaluated sustained effects of financial incentive interventions beyond the intervention period, future research should prioritize RCTs that are adequately powered to detect intervention effects for at least 12 months following the incentive period. Programs should consider cost-effectiveness, sustainability, and acceptability of this behavior change technique compared to other approaches before we confidently advocate its widespread implementation in public health policy.
Supplemental Material
Supplemental Material, sj-pdf-1-ahp-10.1177_0890117120940133 for The Impact of Financial Incentives on Physical Activity: A Systematic Review and Meta-Analysis by My-Linh Nguyen Luong, Michelle Hall, Kim L. Bennell, Jessica Kasza, Anthony Harris and Rana S. Hinman in American Journal of Health Promotion
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study is supported by funding from the National Health and Medical Research Council Centre of Research Excellence (#1079078). M.N.L. is supported by a Melbourne International Research Scholarship and an Australian Government Research Training Program Scholarship. K.B. is supported by a National Health and Medical Research Council Principal Research Fellowship (#1058440). M.H. is supported by a Sir Randal Heymanson Fellowship from The University of Melbourne. R.S.H. is supported by a National Health and Medical Research Council Senior Research Fellowship (#1154217).
Authors’ Note
PROSPERO Registration: # CRD42017068263. All authors have participated in the work and approved the manuscript.
Supplemental Material
Supplemental material for this article is available online.
