Abstract
High-intensity exercise has recently emerged as a potent alternative to aerobic regimens, with ramifications for health and brain function. As part of this trend, single sessions of intense exercise have been proposed as powerful, noninvasive means for transiently enhancing cognition. However, findings in this field remain mixed, and a thorough synthesis of the evidence is lacking. Here, we synthesized the literature in a meta-analysis of the acute effect of high-intensity exercise on executive function. We included a total of 1,177 participants and 147 effect sizes across 28 studies and found a small facilitating effect (d = 0.24) of high-intensity exercise on executive function. However, this effect was significant only compared with rest (d = 0.34); it was not significant when high-intensity exercise was compared with low-to-moderate intensity exercise (d = 0.07). This suggests that intense and moderate exercise affect executive function in a comparable manner. We tested a number of moderators that together explained a significant proportion of the between-studies variance. Overall, our findings indicate that high-intensity cardiovascular exercise might be a viable alternative for eliciting acute cognitive gains. We discuss the potential of this line of research, identify a number of challenges and limitations it faces, and propose applications to individuals, society, and policies.
Our modern societies are becoming increasingly inactive. On average, people spend 5 to 9 hr of their waking day engaged in sedentary behaviors (Al-Nakeeb et al., 2012; Matthews et al., 2008; Ruiz et al., 2011), with detrimental effects on physiological and psychological health (Iannotti et al., 2009; Tremblay, Colley, Saunders, Healy, & Owen, 2010). In contrast, physical exercise has a positive impact on the body (Warburton & Bredin, 2017) and the brain (Gomez-Pinilla & Hillman, 2013), including several well-documented neurobiological effects (for a review, see Moreau & Conway, 2013).
Despite the sizeable benefits associated with active lifestyles, many individuals do not exercise regularly, the most common reasons being related to a lack of motivation or time constraints (Anderson, 2003). In an effort to propose exercise regimens that are more efficient and engaging, researchers have explored different ways to shorten exercise regimens. Among these, high-intensity regimens have recently gained popularity because of their potential to elicit health benefits comparable with, and sometimes even surpassing, those elicited by longer workouts (Gibala, Little, Macdonald, & Hawley, 2012; Weston, Wisløff, & Coombes, 2014). These regimens come in various forms but are typically based on intense cardiovascular exercise interleaved with rest periods.
The long-term impact of physical exercise, including the chronic effects of high-intensity regimens, has been studied extensively (for a general review, see Hillman, Erickson, & Kramer, 2008). This body of research has demonstrated that chronic exercise exerts a positive influence on a wide range of variables, including those related to cognitive performance (Moreau & Conway, 2013; Moreau, Kirk, & Waldie, 2017). Although long-term commitment to exercise is key in inducing profound, lasting changes, studying the short-term effects of exercise remains important in further understanding the mechanisms subserving change at different time scales. In this dynamic, several theories have attempted to explain the effects of acute exercise on cognition (Tomporowski, 2003), emphasizing in the process the fundamental discrepancies between short-term effects and long-term, or chronic, outcomes. For example, the reticular-activating hypofrontality model (Audiffren, 2016) posits that acute exercise forces the brain to shift metabolic resources away from specific regions such as the prefrontal cortex to instead favor structures that support exercise, such as the reticular formation and motor cortices. This process typically facilitates performance on sensory and motor tasks, whereas the associated hypofrontality is thought to temporarily impair executive function (Audiffren, 2016; Dietrich & Audiffren, 2011). A number of electroencephalography studies support this view, reporting postexercise alterations in the form of shifts in the amplitude or latency of event-related potential components thought to reflect an increased allocation of cognitive resources after exercise (Hillman, Snook, & Jerome, 2003; Kamijo et al., 2009; O’Leary, Pontifex, Scudder, Brown, & Hillman, 2011).
Exercise also leads to acute changes—typically increases—in the concentration of several neurochemicals in the brain (for a review, see Moreau & Conway, 2013). Examples of these neurochemicals include the catecholamines (noradrenaline, dopamine), cortisol, brain-derived neurotrophic factor, and possibly serotonin (McMorris, Turner, Hale, & Sproule, 2016). These neurochemicals influence brain function in a complex manner: Whereas moderate increases in catecholamine concentrations appear to facilitate performance on most cognitive tasks, excessive concentrations (from high-intensity and/or prolonged exercise) can inhibit cognition instead (McMorris et al., 2016), consistent with the inverted-U hypothesis (Yerkes & Dodson, 1908). In addition, physical and emotional fatigue (Barnes & Van Dyne, 2009) may impair cognitive performance at higher intensities of exercise.
Although neurochemical accounts and the reticular-activating hypofrontality model can help explain the impact of exercise on cognition while exercising, they might tell us very little about postexercise effects. This question has practical importance given that many individuals may be motivated to exercise during breaks if it has beneficial effects on both physical health and cognitive performance. Yet compared with that of low-to-moderate-intensity exercise (e.g., Ludyga, Gerber, Brand, Holsboer-Trachsler, & Pühse, 2016), the acute effect of high-intensity exercise on subsequent cognitive performance is not well understood (Browne et al., 2017; McMorris, 2016). Exercise-induced impairments in cognitive performance associated with high-intensity training could be temporary and thus not representative of subsequent effects after the bout. For example, one might posit that detrimental effects would rapidly subside to allow general cognitive improvements similar to, or even surpassing, those observed after aerobic exercise (Kao, Westfall, Soneson, Gurd, & Hillman, 2017; Samuel et al., 2017). This would be consistent with the profound physiological effects but rapid physical recovery typically observed following these types of regimens (MacInnis & Gibala, 2017). In fact, a number of studies have found facilitating effects of high-intensity exercise on postexercise cognitive performance (Kao et al., 2017) despite the well-established impairments reported during high-intensity exercise (Samuel et al., 2017). On the other hand, it is also possible that the debilitating effects of high intensity on cognition remain after the bout (Ludyga, Pühse, Lucchi, Marti, & Gerber, 2019; Mekari et al., 2015) or that the effect of high-intensity exercise on cognitive performance depends on moderators such as fitness level or age (Browne et al., 2017). 
In line with this idea, several high-intensity exercise studies have failed to find facilitating effects (Browne et al., 2017; McMorris & Hale, 2012).
Moreover, more fine-grained data might be required to truly understand the dynamics of exercise-induced effects on cognition. For example, a meta-analysis by Chang and colleagues found impairing effects up to 1 min after the bout and facilitating effects beyond 1 min after exercise (Y. K. Chang, Labban, Gapin, & Etnier, 2012). To complicate things further, methodological differences between studies have possibly contributed to these discrepancies; for example, the threshold for high-intensity exercise varies substantially across protocols (McMorris, 2015). Specifically, the following thresholds, which are based on either maximum heart rate (HRmax) or maximum power output (Wmax), are common in the literature: (a) ≥ 77% HRmax (American College of Sports Medicine, 2010; Y. K. Chang et al., 2012) and (b) ≥ 80% Wmax (Borer, 2003; Browne et al., 2017; McMorris, Hale, Corbett, Robertson, & Hodgson, 2015). Using conversion formulas (Arts & Kuipers, 1994; Lounana, Campion, Noakes, & Medelli, 2007), (a) can be converted to approximately 59.5% Wmax, and (b) corresponds to 88.6% HRmax. In addition, a wide range of populations is included in these studies; for example, the age range spans from children to older adults, whereas fitness levels range from sedentary to highly active, or even professionally trained, individuals. These differences in thresholds, units of measurement, and demographics may account for some of the inconsistencies between studies.
The aforementioned inconsistencies are best addressed via meta-analytic investigations, yet previous reviews and meta-analyses have included only a relatively low number of studies. The most recent literature search for a meta-analysis on the acute effect of exercise on cognition was conducted in 2010 (Y. K. Chang et al., 2012). In the 9 years since, many studies have been published, including several that examined intermittent forms of high-intensity exercise typically shorter than traditional aerobic exercise (Klika & Jordan, 2013) and often associated with higher self-reported enjoyment (Thum, Parsons, Whittle, & Astorino, 2017). In addition, several potential moderators of the influence of exercise on cognition have been identified (Y. K. Chang et al., 2012; Etnier, Nowell, Landers, & Sibley, 2006; Etnier et al., 1997; Lambourne & Tomporowski, 2010; McMorris & Hale, 2012; Sibley & Etnier, 2003; Tomporowski, 2003), providing the basis for a deeper understanding at the mechanistic level.
Present Study
In the present meta-analysis, we focused on the effect of cardiovascular high-intensity exercise on executive function. Although restrictive, our emphasis on executive performance was motivated by three distinct factors. First, executive function is a central component of cognition known to influence many cognitive processes and with ecological relevance to various domains, ranging from academic to professional (e.g., Diamond, 2013). Second, intervention studies have demonstrated repeatedly that executive performance is malleable and that it can be improved given adequate regimens or suitable environments (e.g., Diamond & Lee, 2011). Finally—and perhaps most importantly—there are clear theoretical predictions, generated on the basis of previous literature (e.g., Audiffren, 2016; Dietrich & Audiffren, 2011), about the effects of exercise on executive function. In our view, predictions related to other cognitive domains are less well informed theoretically and thus more prone to spurious findings.
We aimed to answer two research questions. First, what is the effect of a single bout of high-intensity exercise on executive performance? This question is important given the mixed findings reported in previous studies. Second, how is the effect of high-intensity exercise on executive performance moderated by the characteristics of exercise, cognitive tasks, research protocols, and participants? We predicted an effect of high-intensity exercise on executive function but did not predict a direction because of inconsistencies in previous studies (Y. K. Chang et al., 2012; Ludyga et al., 2016; Verburgh, Königs, Scherder, & Oosterlaan, 2014). On the basis of previous findings, we also postulated larger nondirectional effects for longer exercise durations (Browne et al., 2017; Y. K. Chang et al., 2012; Ludyga et al., 2016; Miller, Hanson, Tennyck, & Plantz, 2019) and a greater facilitating effect for cycling compared with running protocols (Lambourne & Tomporowski, 2010). Because of the relative scarcity of studies examining subcategories of high-intensity exercise and exercise rhythm, we did not make predictions for these moderators. We also hypothesized a greater facilitating effect for cognitive tasks administered more than 1 min after exercise (Browne et al., 2017; Y. K. Chang et al., 2012). Finally, we predicted larger effects in within-subjects (vs. between-groups) studies (Y. K. Chang et al., 2012) and lower-quality (vs. higher-quality) studies (Etnier et al., 1997), larger effects on executive function when contrasting high-intensity exercise with rest rather than with low-intensity exercise (Ludyga et al., 2016), and a greater facilitating effect for high-fitness participants and for individuals 14 years of age or older on the basis of previous literature (Y. K. Chang et al., 2012).
In addition, we ran exploratory analyses to test potentially meaningful interactions (e.g., trade-offs, additive effects, confounds) between moderators thought to be informative given previously unexplored relationships or prior mixed findings.
Method
This meta-analysis followed guidelines from the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA-P) 2015 statement (Moher et al., 2015; Shamseer et al., 2015).
Eligibility criteria
Participant, intervention, comparator, and outcome (PICO) criteria were used to determine eligibility (Moher et al., 2015). Specifically, studies had to include an acute bout of high-intensity exercise as an independent variable and performance on at least one standardized test of executive function as a dependent variable. For a study to be included, the executive-function task(s) also needed to be administered at least once after the exercise bout. Studies also needed to include a control/comparison group or condition, with random allocation to groups/conditions, and needed to be in English (see Fig. 1).

Fig. 1. Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) flow diagram of literature search and study inclusion.
High-intensity threshold
Two thresholds have been used in the past to define high-intensity exercise: ≥ 77% HRmax (American College of Sports Medicine, 2010; Y. K. Chang et al., 2012) or ≥ 80% Wmax (Borer, 2003; Browne et al., 2017; McMorris et al., 2015). When exercise intensity was expressed in terms of Wmax, maximal oxygen uptake (VO2max), or other units, we converted those values into HRmax equivalents. Details of this conversion are presented later in the Data Preprocessing section.
We chose our intensity thresholds (high, very high, and maximal) by integrating the two definitions. High intensity corresponded to 77% to 88.5% HRmax, 59.5% to 79.9% Wmax, or an equivalent—similar to the first threshold. Very high intensity corresponded to 88.6% to 99.9% HRmax, 80% to 99.9% Wmax, or an equivalent—similar to the second threshold. Maximal intensity corresponded to ≥ 100% HRmax, ≥ 100% Wmax, or an equivalent.
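The three bands can be expressed as a simple lookup. The sketch below is illustrative only: the function name `classify_intensity` is hypothetical, and it assumes intensity has already been converted to %HRmax. Because the stated bands leave a small gap between 88.5% and 88.6%, the sketch assigns values in that gap to the high band, which is an assumption rather than a rule given in the text.

```python
def classify_intensity(pct_hrmax: float) -> str:
    """Map an intensity expressed in %HRmax to the bands described above.

    Hypothetical helper; the handling of values between 88.5 and 88.6
    is an assumption made for illustration.
    """
    if pct_hrmax < 77.0:
        return "below threshold"  # not high intensity under either definition
    if pct_hrmax < 88.6:
        return "high"             # 77% to 88.5% HRmax
    if pct_hrmax < 100.0:
        return "very high"        # 88.6% to 99.9% HRmax
    return "maximal"              # >= 100% HRmax
```

For example, `classify_intensity(90.0)` falls in the very-high band, whereas `classify_intensity(77.0)` falls in the high band.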
Definitions of acute exercise and executive function
Acute exercise refers to exercise performed on a single day (American College of Sports Medicine, 2010). Only standardized cognitive tasks or analogues were included. For search purposes, we included cognitive domains that are either subcomponents of, or related to, executive function, such as attention, cognitive control, cognitive flexibility, fluid intelligence, inhibitory control, planning, and working memory. See Table S2 in the Supplemental Material available online for details on the full categorization process.
Search
Sources
We searched the following databases: Scopus, PubMed, PsycINFO, ScienceDirect, Web of Science, ProQuest, SPORTDiscus, and Google Scholar. Searches were performed between January 10 and January 15, 2018; all studies published up until the date of the search were considered for inclusion. We also searched relevant reviews and meta-analyses (Browne et al., 2017; Y. K. Chang et al., 2012; Cooper, Dring, & Nevill, 2016; Lambourne & Tomporowski, 2010; McMorris & Hale, 2012; Stork, Banfield, Gibala, & Martin Ginis, 2017), together with reference lists, to identify other relevant articles (see details at http://osf.io/59cgd).
Strategy
In each database, a minimum of two different searches were performed. We used the following combination of key terms or phrases across all databases: “HIIT,” “HICT,” “HIT,” “High intensity interval,” and “High intensity intermittent,” plus either “exercise” or “training,” plus any of “stroop,” “cognitive,” “cognition,” “memory,” “learning,” “attention,” “perception,” “language,” and “executive func*” (* = truncated). In addition to these two searches, minor variations were performed to find all relevant articles. Search exemplars in the Scopus database are provided in Table S1 in the Supplemental Material. A record of all search terms and procedures is available at http://osf.io/59cgd.
Selection
E. Chou screened studies for initial inclusion on the basis of titles and abstracts, with advice on inclusion criteria from D. Moreau. If an initial decision could not be made, E. Chou examined the full text for eligibility. Articles that E. Chou could not categorize alone on the basis of the outlined criteria were referred to D. Moreau, who decided which articles to include. The full selection process is presented in Figure 1. All full-text studies considered and the rationale behind their inclusion or exclusion are available at http://osf.io/kqwyc.
Data collection
E. Chou collected the data. For all studies, information was collected on a number of variables. The complete list of these variables is included in List S1 in the Supplemental Material; full data are available at http://osf.io/72pfv. E. Chou contacted authors by e-mail (n = 11) to request unpublished data, with the deadline for replying set to March 30, 2018. Five authors provided us with unpublished data; we included data from four of these authors because they matched all inclusion criteria.
Moderators
Key variables identified from previous studies were included as moderators in all analyses. These variables included research-protocol characteristics (study design, comparison group, and study quality), exercise characteristics (duration, intensity, mode, and rhythm), cognitive-task characteristics (timing of cognitive testing, cognitive-task domain, and baseline cognitive testing), and sample characteristics (fitness and age). We provide further details about each of these variables hereafter.
Research-protocol characteristics
We characterized whether studies included a within-subjects or between-groups design. This distinction was thought to be particularly important in the context of exercise interventions—a meta-analysis by Y. K. Chang et al. (2012) found significant improvements in cognitive performance following exercise in within-subjects, repeated measures studies (d = 0.11) but not in between-groups, single-measure studies (d = 0.06), although the difference between the two types of designs was not itself significant. We also documented whether studies compared cognitive performance following high-intensity exercise with performance after rest or with performance following lower intensities of exercise. We included the comparison group as a moderator on the basis of previous findings showing that light-to-moderate exercise elicits cognitive gains that are quantitatively different from test-retest effects and because of current discrepancies regarding the effect of high-intensity exercise on cognition (Browne et al., 2017; Y. K. Chang et al., 2012; McMorris & Hale, 2012).
Finally, studies were categorized into three levels of quality (low, average, and high) on the basis of the design properties. This is important given that previous studies have identified an increase in the magnitude of reported effect sizes within a study as the number of threats to internal validity (i.e., the risk of bias) increases (Etnier et al., 1997). To mitigate the risk of bias from individual studies, we assessed study quality following Cochrane guidelines (Higgins et al., 2011). Study quality was assessed on the basis of the variables described in List S1 in the Supplemental Material. E. Chou graded study quality according to whether specific criteria were met (no = 0, partial = 1, yes = 2). Items that were not reported or not applicable were not used in the calculation of the quality score. Criteria deemed more likely to affect the risk of bias were given greater weighting than other criteria (minimum weighting = 1; maximum weighting = 3). Although the two types of designs (within subjects/between groups) had different assessment criteria, this difference was reconciled by conversions into comparable percentage scores. These assessments are available in Tables S3 and S4 in the Supplemental Material.
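The scoring scheme described above can be sketched as a weighted percentage. This is an assumption about how the grades and weights were combined, because the exact aggregation is not spelled out in this section; the function name `quality_percentage` is hypothetical, and the cutoffs separating low, average, and high quality are not stated here and are therefore not encoded.

```python
from typing import Optional, Sequence, Tuple


def quality_percentage(items: Sequence[Tuple[Optional[int], int]]) -> float:
    """Return a weighted quality score as a percentage of the maximum attainable.

    Each item is a (grade, weight) pair. Grades follow the text
    (no = 0, partial = 1, yes = 2); None marks an item that was not
    reported or not applicable and is left out of the calculation.
    Weights range from 1 to 3. The aggregation itself is an assumption.
    """
    scored = [(grade, weight) for grade, weight in items if grade is not None]
    if not scored:
        raise ValueError("no gradable items")
    earned = sum(grade * weight for grade, weight in scored)
    possible = sum(2 * weight for _, weight in scored)
    return 100.0 * earned / possible
```

Expressing the result as a percentage of the attainable maximum is what makes within-subjects and between-groups studies comparable even though their criteria differ.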
Exercise characteristics
Exercise duration was categorized as follows: 0 to 5 min, 6 to 10 min, 11 to 20 min, and > 20 min. This categorization was based on the results of Y. K. Chang et al. (2012), who, across all intensities, found exercise durations of 0 to 10 min to have a small, negative effect on cognition after exercise and durations of 11 to 20 min and > 20 min to have small, positive effects on cognition after exercise. In addition, we characterized exercise intensity as high, very high, and maximal on the basis of the aforementioned criteria. Previous findings have been equivocal: Y. K. Chang et al. (2012) reported cognitive improvements for both hard and very hard exercise (when tested at least 1 min after exercise), whereas McMorris and Hale (2012) found moderate—but not what they referred to as heavy—exercise to improve processing speed during and after exercise. Finally, we distinguished studies on the basis of both the modality and rhythm of exercise. The former distinction was motivated by a previous meta-analysis (Lambourne & Tomporowski, 2010) that found cycling to lead to better cognitive performance after exercise compared with running. The latter variable (continuous vs. intermittent) was modeled because of the rising number of studies using interval-training paradigms. Previous studies have not examined this variable in detail.
Cognitive-task characteristics
We analyzed the effect of time of testing using the categories of 0 to 1 min, 1 to 10 min, and > 10 min after exercise. This categorization was informed by a previous meta-analysis (Y. K. Chang et al., 2012) that found no effect of high-intensity exercise on cognitive performance when tests were administered immediately (0–1 min) after high-intensity exercise and small-to-moderate facilitating effects when tests were administered > 1 min after high-intensity exercise. Executive-function tasks were further categorized relative to the specific cognitive domains they tapped, namely attention, cognitive flexibility, inhibitory control, and working memory. When a task targeted more than one domain, it was labeled on the basis of its main subdomain. Finally, studies either involved baseline testing and postexercise testing or postexercise testing only. We added this moderator variable given that studies with both time points may be less biased than those for which only posttest performance was available.
Sample characteristics
Participants were characterized as low fitness, moderate fitness, or high fitness. Y. K. Chang et al. (2012) found high-fitness participants to be the only group with significant cognitive improvements both immediately following exercise (0–1 min) and after a postexercise delay (> 1 min), whereas low-fitness participants showed improvements only immediately after exercise and moderate-fitness participants showed improvements only after a postexercise delay. However, Etnier et al. (2006) did not find a clear relationship between fitness and cognitive performance overall. Like previous researchers (see Y. K. Chang et al., 2012), we further categorized age into the following groups: children (6–13 years), adolescents (14–17 years), young adults (18–30 years), middle-aged adults (31–60 years), and older adults (> 60 years). Note that Y. K. Chang et al. (2012) found significant improvements in cognition following exercise for all age groups except children ages 6 to 13.
Continuous moderators
When studies had continuous data available, we also conducted continuous-moderator analyses for the following moderators: exercise duration at high intensity, exercise intensity of high-intensity condition, timing of cognitive testing, study quality, and participant age. Although continuous-moderator analyses help to increase statistical power compared with categorical-moderator analyses (MacCallum, Zhang, Preacher, & Rucker, 2002), we chose to primarily report categorical analyses because they allowed the inclusion of a larger number of relevant studies.
Analyses
The search strategy identified 2,320 records through database searching and 383 additional records through other search methods. Of these, 146 full-text articles were examined for eligibility. Twenty-eight studies were included in the final meta-analysis (see Fig. 1). For clarity, we present characteristics of within-subjects and between-groups studies separately; however, these two types of protocols were analyzed together in our multilevel model. Tables 1 and 2 present an overview of research protocols and sample characteristics for within-subjects and between-groups studies, respectively.
Table 1. Research Protocol and Sample Characteristics for All Within-Subjects Studies Included in the Meta-Analysis
Note: ADHD = attention-deficit/hyperactivity disorder; BMI = body mass index; HIE = high-intensity exercise; HIIT = high-intensity intermittent exercise; LT = lactate threshold; — = data not provided; VO2max = maximal oxygen uptake; Vt = ventilatory threshold.
(a) Whole-group analyses (not by fitness). (b) Excluded ADHD group from analysis. (c) One outlier effect size. (d) Experiment 1 only.
Table 2. Research Protocol and Sample Characteristics for All Between-Groups Studies Included in the Meta-Analysis
Note: AT = anaerobic threshold; BMI = body mass index; HIE = high-intensity exercise; HIIT = high-intensity interval training; — = data not provided; PACER = Progressive Aerobic Cardiovascular Endurance Run; VO2max = maximal oxygen uptake. The Baecke questionnaire is from Baecke, Burema, and Frijters (1982).
(a) Group 1 only. (b) Eight independent samples (4 testing delays times 2 conditions). (c) Two outlier effect sizes; excluded resistance exercise group. (d) Excluded laser-therapy groups. (e) Excluded results after exercise blocks 2 and 3.
Data preprocessing
When exercise intensity was not expressed in terms of HRmax—for example, %Wmax, %VO2max, or percentage of heart rate reserve (%HRR, where HRR = HRmax – HRrest)—it was converted into HRmax (Arts & Kuipers, 1994; Lounana et al., 2007). In addition, when average age was available, HRmax was estimated using the formula HRmax = 208 − (0.7 × Age) from Tanaka, Monahan, and Seals (2001). Note that based on these conversion formulas, 77% HRmax corresponded to approximately 59.5% Wmax. When intensity could not be converted to an HRmax equivalent, such as with scores from the Rated Perceived Exertion scale, the article was examined thoroughly to determine the intensity on the basis of its description. For example, exercise to fatigue or exhaustion was coded as maximal intensity. We excluded articles from our analyses if accurate categorization was not possible. Exercise characteristics are further described in Tables 3 and 4.
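Two of the steps above are simple arithmetic and can be sketched directly. The function names are hypothetical. The first function applies the Tanaka et al. (2001) formula quoted in the text. The second uses the stated definition HRR = HRmax − HRrest together with the usual heart-rate-reserve calculation of a target heart rate, which is an assumption about how %HRR values were handled. The Wmax and VO2max conversions depend on the cited formulas, which are not reproduced in this section and are therefore left out.

```python
def estimate_hrmax(age_years: float) -> float:
    """Estimate maximum heart rate as HRmax = 208 - (0.7 x Age)."""
    return 208.0 - 0.7 * age_years


def hrr_to_pct_hrmax(pct_hrr: float, hr_rest: float, hr_max: float) -> float:
    """Convert an intensity in %HRR to %HRmax (illustrative assumption).

    Target heart rate = HRrest + (%HRR / 100) * (HRmax - HRrest),
    then expressed as a percentage of HRmax.
    """
    target_hr = hr_rest + (pct_hrr / 100.0) * (hr_max - hr_rest)
    return 100.0 * target_hr / hr_max
```

For a sample with a mean age of 20 years, `estimate_hrmax(20)` gives 194 beats per minute; with a resting heart rate of 60, an intensity of 70% HRR corresponds to roughly 79% HRmax under this sketch.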
Table 3. Exercise Characteristics for All Within-Subjects Studies Included in the Meta-Analysis
Note: HIE = high-intensity exercise; HIIT = high-intensity interval training; Cont = continuous; Int = intermittent; HRmax = maximum heart rate; LT = lactate threshold; — = data not provided; THI = exercise duration at high intensity; Ttotal = total exercise duration; VO2max = maximal oxygen uptake; ƛ = maximum intensity.
(a) This was described as moderate intermittent exercise but included 4.5 min of heavy exercise and 1 min of severe exercise. (b) These modalities were analyzed under “running” in moderation analyses. (c) Vigorous exercise measured with an accelerometer. (d) No equation available.
Table 4. Exercise Characteristics for All Between-Groups Studies Included in the Meta-Analysis
Note: AT = anaerobic threshold; HIE = high-intensity exercise; HIIP = high-intensity intermittent-exercise protocol; Cont = continuous; Int = intermittent; HRmax = maximum heart rate; PACER = Progressive Aerobic Cardiovascular Endurance Run; THI = exercise duration at high intensity; Ttotal = total exercise duration; ƛ = maximum intensity.
(a) This modality was analyzed under “running” in the moderation analyses.
Studies reported two main types of cognitive test scores: those with postexercise scores only and those with both baseline and postexercise scores. For the latter, we calculated mean-difference (posttest-pretest) scores. Baseline scores were subtracted from the postbout scores for each study, correcting for bias and adjusting for the direction of each cognitive test—higher scores indicated better performance in some instances (e.g., accuracy), whereas lower scores indicated better performance for other measures (e.g., reaction time). Pooled standard deviation values (SDpooled) were calculated for each mean-difference score. Where available, moderator-variable data related to each mean-difference/raw score were recorded in additional columns. Note that when studies included more than one posttest session, only the testing session closest in time to the exercise bout was analyzed, given the potential for practice effects to contaminate subsequent testing.
Because different cognitive tests measured performance on different scales, and because the cognitive test scores within each group or condition were available in either mean-difference or raw-score format, we calculated bias-corrected Cohen’s d (Cohen, 1992) to standardize the differences in mean scores between groups or conditions. For between-groups studies, we compared the scores between different groups under different conditions; for within-subjects studies, we compared the scores between the same group under different conditions. Cognitive-task characteristics are further described in Tables 5 and 6.
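The standardization step can be illustrated with a minimal sketch for two independent groups. It uses the conventional pooled standard deviation and the common small-sample correction factor 1 − 3/(4df − 1); whether these exact expressions match the authors' computation is an assumption, and the function names are hypothetical. The sketch does not handle the dependence between conditions in within-subjects designs.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def bias_corrected_d(mean_exp: float, mean_ctrl: float,
                     sd_exp: float, sd_ctrl: float,
                     n_exp: int, n_ctrl: int,
                     higher_is_better: bool = True) -> float:
    """Standardized mean difference with a small-sample bias correction.

    The sign is flipped for measures where lower scores indicate better
    performance (e.g., reaction time), so that a positive value always
    means better performance in the exercise group or condition.
    """
    d = (mean_exp - mean_ctrl) / pooled_sd(sd_exp, n_exp, sd_ctrl, n_ctrl)
    if not higher_is_better:
        d = -d
    df = n_exp + n_ctrl - 2
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # assumed form of the correction
    return d * correction
```

With this convention, a faster mean reaction time after exercise yields a positive effect size once `higher_is_better=False` is passed.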
Table 5. Cognitive-Task Characteristics for All Within-Subjects Studies Included in the Meta-Analysis
Note: N = number of postexercise cognitive test repetitions; t = approximate timing of first postexercise cognitive test (minutes after exercise/control); WISC-R = Wechsler Intelligence Scale for Children–Revised (Wechsler, 1974); ITPA = Illinois Test of Psycholinguistic Abilities.
(a) Includes only the tests analyzed in our meta-analysis. (b) Defined as whether the study has a preexercise cognitive test. (c) Barring one study on long-term memory (Etnier et al., 2016), for which we analyzed all timings, we analyzed only the first of these postexercise cognitive tests to minimize the contribution of test–retest effects.
Table 6. Cognitive-Task Characteristics for All Between-Groups Studies Included in the Meta-Analysis
Note: The cognitive tests listed include only those used in our meta-analysis. N = number of postexercise cognitive test repetitions; t = approximate timing of first postexercise cognitive test (minutes after exercise/control). The Hopkins Verbal Learning Test is a product of PAR (Lutz, FL). The Benton Visual Retention Test is described in Benton (1992). The Symbol Digits Modalities test is described in Smith (1982). The Rey Auditory and Verbal Learning Test is described in Schmidt (1996).
(a) This indicates whether the study had a preexercise cognitive test. (b) In this study, prefrontal and hippocampal measures were recorded in summary z scores. (c) All test sessions were examined because this study examined test timing as an independent variable. (d) Both of these were examined because the first repetition corresponded to working memory and the second to long-term memory.
Main analyses
We conducted all analyses in the R software environment (Version 3.5.1; R Core Team, 2018) using the metafor (Version 2.1; Viechtbauer, 2010) and multcomp (Version 1.4; Hothorn et al., 2013) packages. We ran restricted maximum-likelihood, multilevel, random-effects meta-analyses to estimate overall effects as well as the heterogeneity between effect sizes. In addition, we conducted mixed-effects meta-analyses to investigate whether the observed heterogeneity could be explained by the effects of moderator variables. The R code for our analyses is available online at http://osf.io/pg53m.
We conducted a multilevel analysis to account for the dependency between effect sizes (Assink & Wibbelink, 2016). Multilevel analyses group effect sizes on the basis of higher-level clustering variables (Konstantopoulos, 2011) to prevent inflation and overconfidence in meta-analytic estimates (Van den Noortgate, López-López, Marín-Martínez, & Sánchez-Meca, 2013). We used a procedure that adds random effects at each level of possible dependency to reduce inflation while preserving valuable information provided by studies that report multiple effect sizes (Viechtbauer, 2010).
Our multilevel analysis had three clustering variables (i.e., levels). First, we modeled dependency at the study-design level (i.e., within-subjects and between-groups designs). Second, we modeled dependency at the sample level—within samples reporting results in multiple tests and/or samples with two or more comparison groups. For example, a group undergoing three sessions (resting, low-intensity exercise, and high-intensity exercise) included dependency across all difference scores given that the high-intensity session score was compared with both the resting and low-intensity session scores. Finally, we modeled dependency within the cognitive domains for which multiple test scores were reported (e.g., response time and accuracy in a Stroop task) within individual samples.
Outliers
Effect sizes whose residuals were more than three standard deviations from the mean were considered outliers and excluded from our main analysis. Three effect sizes met this criterion (see Fig. S1 in the Supplemental Material). In addition, Cook’s distance was used as an exploratory estimate to detect outliers. This method takes into account the relative influence of each effect size on the overall estimate. Effect sizes with a Cook’s distance more than three times the mean Cook’s distance were labeled as possible outliers, although they were not excluded from our main analyses. Sixteen effects met this criterion (see Fig. S2 in the Supplemental Material); however, because this criterion was not defined a priori, we provided the full data set before outlier removal, together with the R code, at http://osf.io/cauxq.
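The two flagging rules described above can be sketched as follows. This is a minimal illustration rather than the authors' R code: the function names are hypothetical, the residuals and Cook's distances are assumed to have been computed already from the fitted model, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def flag_residual_outliers(residuals, n_sd=3.0):
    """Return indices of residuals more than n_sd standard deviations from the mean."""
    center = mean(residuals)
    spread = pstdev(residuals)  # assumption: population SD; sample SD is equally plausible
    if spread == 0:
        return []
    return [i for i, r in enumerate(residuals) if abs(r - center) > n_sd * spread]


def flag_influential(cooks_distances, multiple=3.0):
    """Return indices whose Cook's distance exceeds `multiple` times the mean distance."""
    cutoff = multiple * mean(cooks_distances)
    return [i for i, d in enumerate(cooks_distances) if d > cutoff]
```

Under the procedure described above, only effect sizes flagged by the first rule would be excluded; those flagged by the second rule would merely be labeled as possible outliers.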
Interaction analyses
To test whether the moderating effects of individual categorical moderators were dependent on other moderator variables, we conducted interaction analyses. We limited the scope of interactions to two moderators at a time and chose to include the following six moderators in the interaction analyses: comparison group, exercise rhythm, exercise intensity, exercise duration, timing of cognitive testing, and cognitive domain. These moderators were chosen because they were thought to relate to each other in terms of their influence on the effect of high-intensity exercise on executive function. As these analyses involved multiple pairwise comparisons between each level of each moderator, we controlled the familywise error rate using the Holm method (Holm, 1979).
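As a rough illustration of the procedure described above, the sketch below enumerates the pairs of moderators and applies the Holm step-down adjustment to a list of p values. The function name and the example p values are hypothetical; the actual analyses were run in R.

```python
from itertools import combinations

# The six moderators named in the text.
MODERATORS = [
    "comparison group",
    "exercise rhythm",
    "exercise intensity",
    "exercise duration",
    "timing of cognitive testing",
    "cognitive domain",
]

# Every unordered pair of moderators: 6 choose 2 = 15 pairs.
PAIRS = list(combinations(MODERATORS, 2))


def holm_adjust(p_values):
    """Return Holm-adjusted p values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

For example, `holm_adjust([0.01, 0.04, 0.03])` multiplies the smallest p value by 3 and the next by 2, and the largest inherits the running maximum.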
Heterogeneity and model comparisons
We estimated heterogeneity (between-studies variability in effect sizes) in two ways. First, we calculated the I² statistic, which estimates the percentage of between-studies variation that is due to true differences (e.g., differences in research-protocol characteristics, differences in exercise intensity) rather than error. Second, using Cochran’s Q statistic, we tested whether the moderators could account for the between-studies heterogeneity. We tested whether a model including the significant moderator reduced the model’s residual heterogeneity to nonsignificance. If residual heterogeneity was still significant, we combined all moderators in our model (including the nonsignificant ones) to account for any residual heterogeneity. We evaluated whether the three models (no moderator, significant moderators only, all moderators) were different from each other in terms of goodness of fit and parsimony using the Akaike information criterion–corrected (AICc) and the Bayesian information criterion (BIC).
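For readers unfamiliar with these statistics, the sketch below computes Cochran's Q and the classic single-level I² from a set of effect sizes and sampling variances. It is an assumption-laden simplification: in a multilevel model such as the one used here, I² is typically derived from the estimated variance components, so this formula is not expected to reproduce the multilevel estimate. The function names are hypothetical.

```python
def cochran_q(effects, variances):
    """Return (Q, df) using inverse-variance weights around the pooled effect."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    return q, len(effects) - 1


def i_squared(q, df):
    """Classic I^2 (percentage): share of Q in excess of its expected value under homogeneity."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0
```

When all effect sizes are identical, Q is 0 and I² is 0%; as Q grows relative to its degrees of freedom, I² approaches 100%.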
Results
Across 28 studies with a range of protocols, as well as exercise, cognitive task, and sample characteristics, acute high-intensity exercise had a small, significant, facilitating effect on executive function after exercise compared with other conditions (lower-intensity exercise and rest combined). The meta-analytic estimate of this effect was d = 0.24, 95% confidence interval (CI) = [0.13, 0.35]. Our main measure of heterogeneity (I² = 49.08% across all studies) indicated that about half the variability between effect sizes was potentially due to real differences between studies, that is, differences that are not related to random error. Figure 2 provides a general overview of our findings.

Fig. 2. Forest plot of effect sizes for all included studies. The sizes of the squares represent relative sample sizes. The diamond at the bottom represents the overall effect. The dotted vertical line represents an effect size (d) of zero. CI = confidence interval; RE = random effects.
Moderator analyses
Categorical-moderator analyses
Comparison group was a significant moderator (p = .002; see Table 7 for details); that is, high-intensity exercise had a facilitating effect on executive function compared with rest (d = 0.34), but its effect was not significantly different from that of lower-intensity exercise (d = 0.07). Study quality was a marginally significant moderator (p = .08), with the facilitating effect decreasing in magnitude as study quality increased. Other moderator variables did not yield significant moderating effects despite effect sizes being significantly different from zero at specific levels of those moderators. All details are reported in Table 7.
Categorical Moderator Analyses
Note: Boldface type indicates effects that were both significant and had an adequate (> 5) number of individual studies supplying the effects. HRmax = maximum heart rate; N = number of independent studies supplying categorical data.
a These were also analyzed as continuous moderators. b This was a significant moderator; the effect of exercise on cognitive performance after exercise was different on at least two different levels of this moderator.
p < .10. *p < .05. **p < .01. ***p < .001.
Because executive function is an umbrella term that encompasses many different subdomains, we also examined the subcomponents attention, cognitive flexibility, inhibitory control, and working memory separately. High-intensity exercise had a small, facilitating effect on cognitive performance for each of these four subdomains, a finding that was consistent with our overall analysis. Figure 3 shows forest plots for each of these subdomains.

Fig. 3. Forest plots of effect sizes for working memory (a), inhibitory control (b), cognitive flexibility (c), and attention (d). The sizes of the squares represent within-domain differences in sample size only. The diamond at the bottom represents the overall effect. The dotted vertical line represents an effect size (d) of 0. CI = confidence interval; RE = random effects.
Continuous-moderator analyses
None of the planned continuous moderators yielded a significant effect (see Table 8 for details); that is, when examined separately, the effect of high-intensity exercise on executive function did not change significantly as a function of unit increases or decreases in any of the moderators. The effect of an exploratory moderator, percentage of male participants, was marginally significant (p = .058), with the facilitating effect of high-intensity exercise on executive function decreasing as the percentage of male participants increased.
Continuous Moderator Analyses
Note: dproj = projected Cohen’s d; d0 = estimated effect when moderator is 0; N = number of independent studies supplying continuous data; Δ
a This does not include testing for long-term memory.
p < .10. *p < .05. **p < .01. ***p < .001.
Interaction analyses
Of the 15 pairs of moderators tested, two potentially meaningful interactions emerged. The first was the interaction between comparison group and exercise intensity, and the second was the interaction between exercise rhythm and timing of cognitive testing. When lower-intensity exercise was the comparison group, participants undergoing very-high-intensity exercise (three studies) experienced significantly larger improvements in executive performance compared with those undergoing high-intensity exercise (eight studies; Δd = 0.35, p = .002). When exercise was intermittent, participants tested 0 to 1 min after exercise (10 studies) experienced significantly larger improvements in executive performance compared with those tested > 10 min after exercise (three studies; Δd = 0.77, p = .009). Most interaction analyses for which both moderators had three or more levels (i.e., both moderators belonged to any of exercise intensity, exercise duration, timing of cognitive testing, or cognitive domain) yielded nonsignificant pairwise comparisons as a result of the low number of studies for each level and because of the Holm correction. These analyses are available at http://osf.io/9jz3s.
Heterogeneity and model comparisons
The total heterogeneity present in the model without moderators was significant, Cochran’s Q(146) = 264.151, p < .0001. Only the categorical moderator comparison group significantly reduced heterogeneity (p < .0001), whereas the effects of the categorical moderator study quality and the continuous moderator percentage of male participants were marginally significant (p = .082 and p = .058, respectively).
Although the inclusion of comparison group in the model significantly reduced heterogeneity, the remaining heterogeneity was still significant, QE(145) = 263.449, p < .0001. The inclusion of additional moderators (i.e., study design, comparison group, cognitive-task domain, categorical study quality, categorical exercise intensity, baseline cognitive testing, and percentage male) available for most effect sizes decreased heterogeneity substantially, although it remained significant, QE(132) = 230.5262, p < .0001.
Both AICc and BIC criteria provided converging evidence in their assessments of model fit and parsimony. The best model was the model with comparison group as the moderator (AICc = 151.7, BIC = 166.2), followed by the model without moderators (AICc = 158.8, BIC = 170.5). Meanwhile, the model incorporating the most moderators was least preferred (AICc = 169, BIC = 215.1). Together, these results suggested that the nonsignificant moderators did not account for enough residual heterogeneity to justify their inclusion in the meta-analytic model.
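The ranking above can be made concrete by expressing each model's AICc as a difference from the best model and converting those differences to Akaike weights. The sketch below uses the AICc values reported in the text. The weights are an illustrative addition and were not part of the original analysis, and the model labels and function name are hypothetical.

```python
from math import exp

# AICc values as reported in the text; the labels are illustrative.
AICC = {
    "no moderators": 158.8,
    "comparison group": 151.7,
    "seven moderators": 169.0,
}


def akaike_weights(aicc_by_model):
    """Return (deltas, weights): AICc differences from the best model and normalized weights."""
    best = min(aicc_by_model.values())
    deltas = {name: value - best for name, value in aicc_by_model.items()}
    relative = {name: exp(-0.5 * delta) for name, delta in deltas.items()}
    total = sum(relative.values())
    weights = {name: rel / total for name, rel in relative.items()}
    return deltas, weights


deltas, weights = akaike_weights(AICC)
preferred = min(deltas, key=deltas.get)  # model with the lowest AICc
```

Lower AICc indicates a better trade-off between fit and complexity, so the preferred model is the one with a difference of zero.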
Assessment of metabias
We assessed bias in three different ways. First, we plotted funnel plots to detect possible publication bias. Funnel plots help visualize the relationship between each effect size and its corresponding standard error. In a meta-analysis unaffected by publication bias, effect sizes from larger samples, which have smaller standard errors, should cluster around the mean effect. In contrast, effect sizes from smaller samples, with larger standard errors, are expected to be more dispersed around the mean. In addition, all effect sizes are expected to be represented somewhat equally on both sides of the mean, forming a symmetric, inverse, funnel-like shape. Asymmetry in funnel plots may result from reporting biases, poor methodological quality, true heterogeneity, artifacts, or chance (Sterne et al., 2011). The funnel plot for our overall analysis (Fig. 4) was fairly symmetrical and evenly distributed, and few effects fell outside the 99% CI, suggesting that the present meta-analysis was not substantially affected by publication bias.

Fig. 4. Funnel plot of effect sizes for studies in the meta-analysis. Each dot represents an individual effect size and is plotted as a function of standard error. Light gray and dark gray triangles denote 95% and 99% confidence intervals, respectively, for the effect sizes, given the absence of publication (or small-study) bias. The vertical line represents the random-effects-model estimate (d = 0.24).
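The triangular regions in a funnel plot follow directly from the standard error: at a given standard error, the expected range of effect sizes is the pooled estimate plus or minus a normal quantile times that standard error. The sketch below illustrates the idea under that normal-approximation assumption; the function names are hypothetical and the plotting itself is omitted.

```python
Z_95 = 1.96   # normal quantile for a two-sided 95% interval
Z_99 = 2.576  # normal quantile for a two-sided 99% interval


def funnel_limits(estimate, standard_error, z=Z_95):
    """Return (lower, upper) bounds of the funnel at a given standard error."""
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


def outside_funnel(effect, standard_error, estimate, z=Z_99):
    """True if an effect size falls outside the funnel region at its standard error."""
    lower, upper = funnel_limits(estimate, standard_error, z)
    return effect < lower or effect > upper
```

Because the half-width grows linearly with the standard error, the bounds fan out toward the less precise studies, producing the funnel shape.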
Second, we analyzed whether study quality (categorical) affected the estimated effect. According to our criteria (Tables S3 and S4 in the Supplemental Material), there were 23 high-quality studies, 4 average-quality studies, and 1 low-quality study. Study quality had a marginally significant moderating effect on the effect of high-intensity exercise on executive function (p = .082). Whereas high-quality studies reported a small facilitating effect (d = 0.19, 95% CI = [0.07, 0.31]), average-quality studies reported a moderate facilitating effect (d = 0.59, 95% CI = [0.26, 0.92]), and there were too few low-quality studies to reliably estimate an average effect size. Our finding suggested that studies of lower quality reported larger effect sizes, possibly because of poor control of confounding variables. This result fits well with findings from earlier research (Etnier et al., 1997) and might suggest a biasing effect from average-quality studies.
Third, to further address metabias, we conducted a p-curve analysis (Simonsohn, Nelson, & Simmons, 2014) for significant (p < .05) effect sizes in our studies. The rationale for the p-curve analysis is that the p-value distribution for a true effect that is not influenced by reporting bias should contain fewer high (p > .04) than low (p ≤ .01) significant p values (Simonsohn et al., 2014). To construct a p curve, we first converted Cohen’s d values to t values, adjusting for study design (within-subjects or between-groups). We then generated the curve using an online app (available at http://www.p-curve.com/app4). The p curve (Fig. 5) was right-skewed, with fewer large (p > .04) than small (p ≤ .01) p values. This suggested no clear sign of publication bias for the significant effect sizes analyzed. The power estimate of the studies examined was relatively high (74%, 90% CI = [56%, 86%]).

Fig. 5. P curve for all significant (p < .05) effect sizes. The observed p curve includes 35 statistically significant (p < .05) results, of which 23 were p < .025; 112 additional results were entered but excluded from the p curve because they were p > .05. The 90% confidence interval is given for the power estimate.
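Two steps of the p-curve procedure lend themselves to a short sketch: converting Cohen's d to a t value, and a simple binomial check of right skew based on how many significant results fall below p = .025. The conversions below are common textbook formulas and are assumptions here, not necessarily the ones the authors used; in particular, the within-subjects conversion assumes that d was standardized by the standard deviation of the difference scores. The binomial check is a simplification of the tests reported by the p-curve app. All function names are hypothetical.

```python
from math import comb, sqrt


def t_from_d_between(d, n1, n2):
    """Between-groups design: t = d / sqrt(1/n1 + 1/n2)."""
    return d / sqrt(1.0 / n1 + 1.0 / n2)


def t_from_d_within(d, n):
    """Within-subjects design, assuming d is based on the SD of difference scores."""
    return d * sqrt(n)


def binomial_right_skew_p(n_significant, n_below_025):
    """One-sided binomial p value for observing at least n_below_025 results
    with p < .025 among n_significant results, if each half were equally likely."""
    tail = sum(comb(n_significant, k) for k in range(n_below_025, n_significant + 1))
    return tail / 2 ** n_significant


# Counts taken from the figure caption: 35 significant results, 23 with p < .025.
p_skew = binomial_right_skew_p(35, 23)
```

A small value of `p_skew` would indicate more low p values than expected under a flat distribution, which is the pattern a right-skewed p curve reflects.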
Discussion
Short but intense bursts of exercise have gained traction in recent years as a viable substitute for aerobic exercise to elicit cognitive improvements (e.g., Moreau et al., 2017). Whether this form of exercise, typically referred to as high-intensity exercise, is associated with facilitating or debilitating effects on cognitive performance immediately following exercise, however, remains unclear. The present meta-analysis provides a quantitative answer to this question, together with an analysis of the specific variables that help explain some of the past discrepancies. In particular, we focused on the effect of high-intensity exercise, defined as exercise above 77% of one’s maximum heart rate, on postexercise executive performance. We identified 28 studies examining this relationship, with comparisons to either lower-intensity exercise or rest, and investigated subdomains of executive function, including attention, cognitive flexibility, inhibitory control, and working memory.
Overall, high-intensity exercise had a small, significant facilitating effect on executive function after exercise. This finding was in line with our prediction of an effect, although we did not hypothesize a direction. Importantly, this result was inconsistent with theoretical predictions of impairing effects, such as the inverted-U theory (Yerkes & Dodson, 1908), the reticular-activating hypofrontality model (Audiffren, 2016; Dietrich & Audiffren, 2011), and neurochemical hypotheses (McMorris et al., 2016). However, it aligns with a previous meta-analysis by Y. K. Chang et al. (2012), who found that high-intensity exercise improved postexercise cognitive performance, and with a meta-analysis by Ludyga et al. (2016), who reported gains in executive performance associated with acute bouts of moderate-intensity exercise. Ludyga et al. (2016) stated that their meta-analysis “examined effects of aerobic exercise on executive function based on moderate intensity,” and further indicated that it therefore “remains unclear how other exercise intensities influence executive control and if those effects are further moderated by the subjects’ characteristics” (p. 1622). Our meta-analysis fills this gap in the literature, providing corroborating evidence with more intense but shorter bouts of exercise.
The general facilitating effect of high-intensity exercise should not, however, obscure the inherently complex dynamics related to the effect of exercise on cognition. Because studies typically average performance on a given cognitive task, a lot of information can be lost about the fine-grained relationship between exercise and cognitive performance (see, e.g., Moreau & Corballis, 2018). For instance, it is possible that high-intensity exercise leads to an early dip in performance as a result of competing physiological resources followed by a subsequent increase in performance as a result of higher levels of activation. If both of these phases (debilitating effect followed by a facilitating effect) occur within the course of a cognitive task, average performance could represent a very coarse measure of the effect of exercise on cognitive performance. The lack of systematic analyses investigating cognitive dynamics after exercise, together with inconsistencies in defining thresholds of high intensity, perhaps explains some of the discrepancies between our results and those of previous meta-analyses (Y. K. Chang et al., 2012; Lambourne & Tomporowski, 2010; McMorris & Hale, 2012). With this caveat in mind, moderator analyses helped to further identify factors influencing the effect of high-intensity exercise on executive function. We discuss these factors in detail hereafter.
What characteristics of exercise matter in eliciting cognitive enhancement?
The type of comparison or control used in the included studies was a significant moderator. Specifically, high-intensity exercise was associated with superior executive performance compared with rest but not with lower-intensity exercise; in the latter comparison, the two exercising conditions elicited similar gains in executive function. These findings are in line with our hypothesis of a larger effect for comparisons to resting conditions. In contrast with the predictions made by a number of models, for example, psychological (Thum et al., 2017) or neurochemical (McMorris et al., 2016) theories, high-intensity exercise does not appear to impair executive function. This is despite some of the physiological changes induced by exercise (e.g., increases in lactate and catecholamines, shifts in blood flow to cortical regions) being larger following high-intensity regimens (McMorris et al., 2016), suggesting that rather than following an inverted-U function, the facilitating effect of exercise holds at high exercise intensities, at least after the bout.
To test this idea, we used common subcategories to further characterize high-intensity exercise, namely high (77%–88.5% HRmax), very high (88.6%–99.9% HRmax), and maximal (100% HRmax). Whereas high-intensity exercise was associated with a small, facilitating effect, exercise at very-high and maximal intensities was associated with nonsignificant effects. However, this moderator did not significantly affect the relationship between exercise and executive function in categorical-moderator analyses, whereas in continuous-moderator analyses there was only a small, nonsignificant decrease in facilitating effects as intensity increased, particularly at the highest intensities (> 88.5% HRmax). The inconsistencies between past theoretical (McMorris et al., 2016) and meta-analytical (Y. K. Chang et al., 2012) studies are perhaps not surprising given that they relied on different thresholds. Using a threshold of > 88.5% HRmax, McMorris and Hale (2012) did not find facilitating effects, whereas McMorris et al. (2016) predicted detrimental effects; using a threshold of > 77% HRmax, Y. K. Chang et al. (2012) reported improved postexercise performance. Our results, derived using the lower threshold, align with past meta-analytic literature (Y. K. Chang et al., 2012; McMorris & Hale, 2012) and support the idea of an optimal intensity (i.e., 77%–88.5% HRmax) of high-intensity exercise to induce improvements in executive function.
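The intensity subcategories described above amount to a simple threshold rule on the percentage of maximum heart rate. The sketch below is a hypothetical coding helper, not the authors' procedure; how values falling between the stated bands (e.g., between 88.5% and 88.6%) were handled is an assumption, and here anything above 88.5% and below 100% is coded as very high.

```python
def classify_intensity(pct_hrmax):
    """Map a percentage of maximum heart rate to an intensity category."""
    if pct_hrmax >= 100:
        return "maximal"
    if pct_hrmax > 88.5:
        return "very high"
    if pct_hrmax >= 77:
        return "high"
    # Below the 77% threshold used to define high-intensity exercise.
    return "lower intensity"
```

For instance, a protocol at 80% HRmax would be coded as high, and one at 95% HRmax as very high.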
Other exercise characteristics could potentially affect executive function. Just as higher exercise intensities are thought to be associated with greater physiological effects, so are longer bouts of exercise (Awopetu, 2014). We therefore expected greater effects on cognitive performance as exercise duration increased. Although duration as a continuous moderator trended in this direction, it was nonetheless nonsignificant. Likewise, different exercise durations (0–5, 6–10, 11–20, and > 20 min) led to small, facilitating effects (some significant, some nonsignificant) that were similar in magnitude to one another. Note that this could be due to the larger effects associated with higher intensities being compensated by shorter durations; overall, the correlation between intensity and duration was negative and marginally significant. Furthermore, whereas the magnitude of the facilitating effect was consistent across different durations, its latency could potentially vary independently. For example, some longer exercise bouts might be associated with a delayed, prolonged peak of the facilitating effect. We could not examine this relationship because too few studies examined duration as an independent variable. Fortunately, the field is moving toward the continuous measurement of physiological (e.g., lactate, serum brain-derived neurotrophic factor, heart rate) and cognitive changes during and after exercise, aspects that will help further our understanding of the complex, dynamic interactions between physiology and cognition.
We also explored how exercise rhythm—either continuous or intermittent—affected behavior. Intermittent-exercise protocols have become increasingly popular, perhaps because of their cardiorespiratory (Elliott, Rajopadhyaya, Bentley, Beltrame, & Aromataris, 2015; Milanović, Sporiš, & Weston, 2015; J. S. Ramos, Dalleck, Tjonna, Beetham, & Coombes, 2015; Weston et al., 2014) and psychoaffective (Thum et al., 2017) benefits compared with traditional continuous exercise. We examined whether this form of exercise had comparable benefits on immediate executive function but did not find a difference between the effects of intermittent and continuous exercise compared with rest—both types of protocols were associated with small, facilitating effects. This result is consistent with physiological responses to exercise, which appear to be similar regardless of whether rest periods are interleaved throughout the exercise protocol (Arnardóttir, Boman, Larsson, Hedenström, & Emtner, 2007; Safarimosavi, Mohebbi, & Rohani, 2018). Note that this could be an important advantage of intermittent exercise—with less exercise volume overall but with higher peak intensities compared with continuous exercise, similar effects can potentially be elicited.
Two caveats need to be acknowledged, however. First, these short-term similarities may not reflect long-term equivalence, both in terms of mechanisms and outcomes. Several studies have investigated the physiological effect of chronic high-intensity exercise (Maillard, Pereira, & Boisseau, 2018), as well as the relationship between acute and chronic physiological effects (Tonoli et al., 2012), yet the extent to which acute effects translate or even relate to long-term change at the cognitive level remains unknown. Second, the high intensity typically associated with intermittent exercise may not be desirable for specific populations (e.g., low-fit individuals, older adults). In this regard, acute exercise at more moderate intensities (55% to 70% HRmax) might be preferred for individuals prone to injury or at risk of cardiovascular conditions (Ludyga et al., 2016). The rhythm and intensity of exercise remain dissociable, however, and further research will allow exploring whether lower intensities of intermittent exercise could elicit similar improvements in executive function (Kujach et al., 2018; Ludyga et al., 2016).
Finally, an additional characteristic of exercise was worth investigating in our view: When exercising at high intensity, does it matter whether one runs, cycles, or swims? In other words, is the modality of exercise relevant, or should it just be treated as a personal preference with no direct impact on cognition? Most studies examined either running or cycling, as these modalities were relatively easy to measure and control. Because the two modalities have important physiological differences (Millet, Vleck, & Bentley, 2009), it was plausible that these would be reflected in differences in executive function. However, we did not find differences attributable to modalities of exercise in the present meta-analysis; both were associated with small facilitating effects, with a marginally significant effect in the case of cycling and a significant effect in the case of running. This result differed from our prediction: We hypothesized, on the basis of Lambourne and Tomporowski’s (2010) meta-analysis, that cycling would elicit a greater facilitating effect on executive function. Note that in Lambourne and Tomporowski (2010), the difference between modalities was more pronounced when cognitive tests were administered during, rather than after, exercise, which could suggest that the observed benefit of cycling compared with running in their study was due to the difficulty of performing cognitive tasks while running rather than true differences in the effect of running versus cycling. Nevertheless, the disparity between the findings of the present study and those reported by Lambourne and Tomporowski (2010) points to exercise modality as a potential moderator of interest for future investigation.
How were our results influenced by testing, protocol, and sample characteristics?
In addition to variables related to exercise itself, we tested moderators related to the specific measurements used to assess executive function, for example, whether the timing of testing had an effect on cognitive performance. We did not find a significant effect of timing; tests conducted in the ranges 0 to 1, 1 to 10, and > 10 min after a bout were all associated with small, facilitating effects. These results differed from our predictions, which were made on the basis of Y. K. Chang et al. (2012), who found that tests administered more than 1 min after a bout, but not those administered 0 to 1 min after a bout, elicited facilitating effects. Several studies administered the same postexercise tasks repeatedly to further elucidate the temporal characteristics of exercise-induced cognitive change. We chose not to analyze tests beyond the first repetition (i.e., posttest) to reduce confounds from practice effects. Recent studies examining the influence of test timing have strived to control practice effects; for example, Zimmer et al. (2017) randomly assigned participants to multiple groups with only one posttest session, which was scheduled at a different time after exercise (0, 30, 60, and 90 min). They found a moderating effect of test timing on cognitive performance, with a detrimental effect of high-intensity exercise on executive function at the earlier (0, 30 min) but not the later (60, 90 min) time points. Furthermore, performance decline was correlated with blood-lactate concentrations, suggesting that the observed cognitive effects might be mediated by physiological variables (Zimmer et al., 2017).
Furthermore, we tested whether our findings were contingent on a comparison against a baseline measure of executive performance. Contrary to our prediction, the presence or absence of baseline cognitive testing did not significantly moderate the results; studies with baseline testing (i.e., repeated-measure studies) and studies without baseline testing (i.e., single-measure studies) both showed small, facilitating effects of exercise on executive performance. Our findings also did not differ depending on the subdomain of executive function we focused on (i.e., attention, cognitive flexibility, inhibitory control, or working memory). These results matched our general prediction and were in line with findings from Y. K. Chang et al. (2012), who found an overall facilitating effect of exercise on executive function when collapsing all intensities and test timings.
Most studies included in the current meta-analysis were of high quality, with several average-quality studies and only one low-quality study. Study quality was a marginally significant moderator of the influence of high-intensity exercise on executive function, with a moderate, facilitating effect for average-quality studies and a small, facilitating effect for high-quality studies. This finding was in line with our prediction (although we did not predict the direction of the effect) as well as previous research (Etnier et al., 1997). The larger effects in average-quality studies possibly arose because of poor control of confounding variables, and low- and average-quality studies generally had small sample sizes (see Tables S3 and S4 in the Supplemental Material). Because of the paucity of lower-quality studies in the present meta-analysis, care should be taken in interpreting the effects of study quality.
Moreover, study design (whether the experiment used a within-subjects or between-groups design) did not moderate the relationship between high-intensity exercise and executive function. Both types of design were associated with nonsignificant, small, facilitating effects of high-intensity exercise on executive function, with substantial heterogeneity in the results. This result differed from our prediction (which was based on the results of Y. K. Chang et al., 2012) of a larger effect for within-subjects studies. Our results suggest that most within-subjects studies included in the present meta-analysis controlled adequately for confounds, leading to noninflated effects, which may not have been the case in previous meta-analyses.
Finally, we also tested the influence of sample characteristics on the relationship between exercise and executive performance. These included fitness level, age, and sex. None of these moderators showed a significant effect despite prior studies pointing to the contrary (e.g., Y. K. Chang et al., 2012; Etnier et al., 2006). Note that for fitness level, there were important caveats that prevented a full examination of this moderator, which we detail in the limitations section. With respect to age, few studies examined participants who were not young adults, limiting our ability to investigate the effect of this moderator.
Interactions and model comparisons
We tested a number of interactions between moderators to examine whether the effect of a moderator variable was contingent on another one. From the 15 pairs of moderators defined a priori, we found two significant interactions after correcting for multiple comparisons. First, when the comparison group was a lower intensity exercise group, those individuals undergoing very-high-intensity exercise experienced significantly larger facilitating effects than those undergoing high-intensity exercise. Second, when exercise was intermittent, those who were tested 0 to 1 min after exercise experienced significantly larger facilitating effects than those who were tested > 10 min after exercise. However, given the low number of studies supplying effects at particular levels (three effect sizes for very-high- vs. lower-intensity exercise; three effect sizes for intermittent exercise and testing > 10 min after a bout), these results should be interpreted with caution.
To further understand relationships between moderator variables, we compared a number of models that included single or multiple moderators. Of all models, the model with seven moderators had the lowest residual heterogeneity, although this heterogeneity remained significantly larger than zero. Model comparisons showed the model with one moderator (comparison group) to be a better fit than both the model with the largest number of moderators and the model with no moderators. Thus, including the one significant moderator from our analyses in the model was beneficial despite the inherent penalty for model complexity. In contrast, adding other moderators to the model was detrimental to overall fit despite the resulting decrease in heterogeneity.
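The trade-off between fit and complexity described above can be sketched with an information criterion. The example below uses the Akaike information criterion (AIC) as an assumption, since the text does not state which comparison criterion was used. The log-likelihoods and parameter counts are hypothetical values chosen only to reproduce the qualitative pattern, not estimates from this meta-analysis.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical log-likelihoods and parameter counts (illustrative only)
models = {
    "no moderators": {"log_lik": -120.0, "n_params": 2},
    "comparison group only": {"log_lik": -114.0, "n_params": 3},
    "seven moderators": {"log_lik": -110.0, "n_params": 16},
}

scores = {name: aic(m["log_lik"], m["n_params"]) for name, m in models.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("Preferred model:", best)
```

In this sketch the seven-moderator model has the highest log-likelihood, yet its larger number of parameters gives it the worst AIC. The single-moderator model improves fit enough to offset its one additional parameter.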
Were our results biased?
Our analyses suggested low bias overall, as illustrated by the funnel plot shown in Figure 4 and corroborated by the p-curve analysis shown in Figure 5, which indicated a low probability of p-hacking for the significant effect sizes. Furthermore, analyses of study quality suggested that although most studies were not significantly biased (i.e., high-quality studies), a few were biased toward large effects (i.e., low- and average-quality studies) compared with the overall meta-analytic estimate. Because the inclusion of lower-quality studies did not substantially alter funnel-plot symmetry, it is reasonable to assume that the overall meta-analytic estimate was not substantially affected by biased studies. Importantly, the meta-analytic estimate of a subset of the meta-analytic data composed exclusively of high-quality studies remained significant, showing a small facilitating effect of high-intensity exercise. Altogether, these assessments of quality suggested a low degree of bias; however, given that we could not obtain all the potentially relevant data we requested, it remains possible that our findings were affected by publication bias, at least to a certain extent.
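Funnel-plot symmetry is often quantified with a regression of standardized effects on precision, commonly known as Egger's test. The sketch below is offered only as an illustration of that idea; it is an assumption that such a regression is a useful stand-in here, because the text reports visual inspection and p-curve analysis rather than this test. The function name `egger_regression` and the input values are hypothetical. The sketch returns point estimates only and ignores the dependence between multiple effect sizes drawn from the same study.

```python
def egger_regression(effects, std_errors):
    """Regress standardized effects (d / SE) on precision (1 / SE).

    Returns (intercept, slope). An intercept near zero is consistent with
    a symmetric funnel plot; the slope estimates the underlying effect.
    """
    precision = [1.0 / se for se in std_errors]
    standardized = [d / se for d, se in zip(effects, std_errors)]
    n = len(effects)
    mean_x = sum(precision) / n
    mean_y = sum(standardized) / n
    # Ordinary least squares for a single predictor
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(precision, standardized))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical, perfectly symmetric effects scattered around d = 0.24
std_errors = [0.1, 0.1, 0.2, 0.2, 0.4, 0.4]
effects = [0.34, 0.14, 0.44, 0.04, 0.64, -0.16]

intercept, slope = egger_regression(effects, std_errors)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

Because the made-up effects are scattered symmetrically at every level of precision, the intercept is close to zero and the slope recovers the value around which they were generated. A formal test would also require the standard error of the intercept.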
Limitations of the present study
We should point out a few limitations to the present meta-analysis. First, the study was not preregistered, thus possibly increasing the risk of confirmation bias and providing weaker control over theory-driven choices in the selection, coding, and interpretation of studies (Lakens, Hilgard, & Staaks, 2016). With meta-analyses, errors may arise in the implementation of inclusion criteria, extraction of data, and coding of study characteristics; however, to mitigate bias, we thoroughly documented each step of the present meta-analysis and made this documentation publicly available to increase transparency and allow reproducibility.
Our meta-analysis was also limited by inherent features of the studies we included. For example, a majority of the within-subjects studies, and a few of the between-groups studies, did not include baseline scores. This is problematic given that baseline testing reduces error by providing a reference for each individual (Higgins et al., 2011). In several instances, it also proved difficult to precisely quantify participant fitness because of missing information and typical lack of individual data. Complex relationships such as the one linking exercise and executive function can be understood only with a detailed assessment of the dynamics of cognitive performance, that is, of the specific characteristics and their evolution in time. This type of analysis relies on more than summary statistics and instead requires individual trial-by-trial data. The studies we reviewed in this meta-analysis did not include this information, but we believe future studies should collect, retain, and ideally share these types of data to allow more detailed analyses.
In addition, the categorizations we made with regard to exercise intensities and cognitive domains could have influenced the results. With respect to the former, we should note that we did not convert intensity scores into VO2max, arguably considered the best measure for exercise testing (Fletcher et al., 2013). Rather, we converted intensity scores into %HRmax using conversion formulas (Arts & Kuipers, 1994; Lounana et al., 2007) that were possibly not representative of the samples included in the current meta-analysis. This was our preference to allow the inclusion of a larger set of studies, possibly to the slight detriment of overall quality—the moderator analysis for exercise intensity should be interpreted with this limitation in mind.
Likewise, our categorization of cognitive tasks under the umbrella term of executive function helped simplify our analyses and our interpretation of results. However, there remain important differences in definitions of executive function (see, e.g., Diamond, 2013; Miyake et al., 2000), and different theoretical accounts could lead to variations in how the evidence is assessed. We attempted to mitigate this effect by examining subdomains of executive function, which provided a finer-grained analysis and increased overall accuracy and transparency.
Finally, particular characteristics of exercise remain largely unexplored. For example, few studies explored the distinction between high (77%–88.5% HRmax), very-high (88.6%–99.9% HRmax), and maximal (100% HRmax) intensities. Very few studies examined the adolescent, middle-aged, and older-adult age groups or investigated sex differences in the influence of high-intensity exercise on executive function. Importantly, these additional questions do not necessarily require additional studies; many of these analyses could be carried out if demographic data were openly available.
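The intensity bands listed above can be expressed as a simple lookup, which may be convenient when recoding openly available data. The following is a minimal sketch: the function name `intensity_category` and the label for values below 77% HRmax are hypothetical, and the handling of values that fall between the stated bands (e.g., 88.55%) is an assumption.

```python
def intensity_category(pct_hrmax):
    """Map an intensity in %HRmax to the bands named in the text.

    Thresholds follow the stated bands: high (77-88.5), very high
    (88.6-99.9), and maximal (100). Values between bands are assigned
    to the lower band, which is an assumption of this sketch.
    """
    if pct_hrmax >= 100.0:
        return "maximal"
    if pct_hrmax >= 88.6:
        return "very high"
    if pct_hrmax >= 77.0:
        return "high"
    # Hypothetical label for intensities below the high-intensity range
    return "below high intensity"


for value in (60.0, 77.0, 88.5, 88.6, 99.9, 100.0):
    print(value, intensity_category(value))
```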
Concluding Remarks
In a meta-analysis including 147 effect sizes across 28 studies, we found that high-intensity exercise had a small, facilitating effect on executive performance and that this effect was comparable with that of more moderate, but longer, forms of exercise. We believe our findings have important applications for individuals, society, and policy. Specifically, they demonstrate the relevance of high-intensity exercise for those seeking time-efficient ways to induce immediate cognitive improvements, such as children who are often constrained to sedentary learning environments, professionals whose work conditions involve long hours sitting at a desk, or older individuals with limited opportunities to exercise. In these various contexts, recognizing the benefits of high-intensity exercise could promote healthier, more productive lifestyles, with a positive impact on the community at large.
Author Contributions
D. Moreau conceived the study; provided statistical and research expertise; supervised the literature search, data collection, and data analyses; and provided funding for the project. E. Chou performed the literature search, collected the data, and conducted the analyses. D. Moreau and E. Chou wrote the manuscript, and both authors approved the final version for submission. E. Chou is the guarantor.
Supplemental Material
Supplemental material for The Acute Effect of High-Intensity Exercise on Executive Function: A Meta-Analysis by David Moreau and Edward Chou in Perspectives on Psychological Science
Footnotes
Acknowledgements
We thank Beau Gamble for his help with data analysis, and three anonymous reviewers for their constructive comments.
Action Editor
Laura A. King served as action editor for this article.
Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
Funding
D. Moreau is supported by the Royal Society of New Zealand (Marsden) and the Neurological Foundation of New Zealand. E. Chou was supported by a University of Auckland summer research scholarship (2017–2018).
