Abstract
There is no single gold standard test to diagnose sport-related concussion (SRC). Concussion-related exercise intolerance, that is, inability to exercise to the individual's appropriate level due to exacerbation of concussion-like symptoms, is a frequent finding in athletes early after SRC that has not been systematically evaluated as a diagnostic test of SRC. We performed a systematic review and proportional meta-analysis of studies that evaluated graded exertion testing in athletes after SRC. We also included studies of exertion testing in healthy athletic participants without SRC to assess specificity. PubMed and Embase were searched in January 2022 for articles published since 2000. Eligible studies performed graded exercise tolerance tests in symptomatic concussed participants (> 90% of subjects had an SRC, seen within 14 days of injury), in participants at the time of clinical recovery from SRC, in healthy athletes, or in a combination of these groups. Study quality was assessed using the Newcastle-Ottawa Scale. Twelve articles met inclusion criteria, most of which were of poor methodological quality. The pooled estimate of incidence of exercise intolerance in participants with SRC equated to an estimated sensitivity of 94.4% (95% confidence interval [CI]: 90.8, 97.2). The pooled estimate of incidence of exercise intolerance in participants without SRC equated to an estimated specificity of 94.6% (95% CI: 91.1, 97.3). The results suggest that exercise intolerance measured on systematic testing within 2 weeks of SRC may have excellent sensitivity for helping to rule in the diagnosis of SRC and excellent specificity for helping to rule out SRC. A prospective validation study to determine the sensitivity and specificity of exercise intolerance on graded exertion testing for diagnosing SRC after head injury as the source of symptoms is warranted.
Introduction
Sport-related concussion (SRC) is a form of mild traumatic brain injury (mTBI) and a significant public health concern. 1,2 The Centers for Disease Control and Prevention estimates that up to 3.8 million concussions occur annually in the United States from sports and recreational activity, although this number may be an underestimate. 3 The clinical presentation of SRC includes a wide range of physical, behavioral, cognitive, sleep, somatic, and/or emotional signs and symptoms, unexplained by another cause (e.g., medications or cervical trauma), that typically appear within minutes or hours of injury, and may spontaneously resolve within days. Hence, many SRC may not be recognized or reported. 2 There has been a lack of consensus on the diagnostic criteria for SRC, 4 although the field has recently moved toward a more standardized assessment and definition. 2 Nevertheless, SRC diagnosis remains a clinical one with no single gold standard test to objectively confirm the diagnosis. Diagnosis is made using a combination of self-reported symptoms that can be linked to a concussive head injury, 5 impairments on a concussion-focused physical examination, 6,7 and additional tests that may include neurocognitive testing, 8 reaction time, 9 and other measures. 10
The ability to exercise without return/worsening of symptoms is required for athletes to successfully complete a graduated return-to-play protocol before unrestricted return to sport. 2 This can be conceptualized as restoration of normal exercise tolerance after SRC. Concussion-related exercise intolerance, that is, inability to exercise to the individual's appropriate level due to exacerbation of concussion-like symptoms and/or excessive fatigue, 11,12 is a frequent finding in athletes early after SRC that has been characterized as a physiological biomarker of SRC. 13,14 Data from the measurement of the degree of exercise intolerance on graded exertion testing within 10 days of injury are used to prescribe individualized sub-symptom threshold aerobic exercise as treatment after SRC, 15 which has been shown to reduce the duration of symptoms and the incidence of persisting post-concussive symptoms (PPCS) beyond one month. 16 Additionally, the degree of exercise intolerance in the first 10 days after SRC, that is, the lower the heart rate threshold (HRt) at which more than mild symptom exacerbation occurs, has been associated with longer duration of recovery when compared with those who have a higher HRt (i.e., better exercise tolerance). 17,18
Diagnostic measures provide information about the presence or absence of a condition, in this case SRC. Exercise intolerance as a diagnostic test of SRC has not been systematically evaluated using traditional cross-sectional studies. To assess the sensitivity and specificity of exercise intolerance for the diagnosis of SRC, we performed a systematic review of studies that evaluated graded exertion testing in symptomatic athletes within 14 days of injury. Because there were limited publications that included exercise tolerance on graded exertion testing in both injured participants and in healthy controls, we performed an exploratory proportional meta-analysis, primarily of single-arm studies with emphasis on homogeneity of included samples to estimate the sensitivity and specificity of exercise intolerance on graded exertion testing for the diagnosis of SRC.
To establish accurate and useful estimates in this context, studies had to use clear and nearly uniform definitions of exercise intolerance and of SRC. We included studies of graded exertion testing in participants with SRC but excluded those that had exercise intolerance as an inclusion criterion. For example, Leddy and colleagues 15 performed graded exertion testing at the initial visit in a randomized controlled trial (RCT) of adolescents after SRC. Exercise intolerance was an inclusion criterion for randomization, so this study was not included in this analysis. We hypothesized that concussion-related exercise intolerance on graded exertion testing would have excellent estimated sensitivity and specificity for the diagnosis of SRC.
Methods
This systematic review was deemed non-human subjects research and an exemption from the need for consent was granted. PRISMA guidelines for systematic reviews were followed. 19
Study definitions and populations
Sensitivity
Test sensitivity is the number of subjects correctly identified as having the disease divided by the number of subjects in the population with the disease. 20 For this sample, we used patients with symptomatic SRC who were within 2 weeks of injury.
Specificity
Test specificity is the probability of the test being negative when the disease screened for is absent. 20 For this sample, we used patients with recent SRC at the time of clinical clearance (i.e., they had recovered from their SRC and were cleared for unrestricted return to sport), or healthy controls who had performed graded exertion testing as part of a concussion-related research study.
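In the notation of a standard 2 × 2 diagnostic table, the two definitions can be written as shown below. The symbols TP, FN, TN, and FP (true positives, false negatives, true negatives, and false positives) are introduced here for illustration and do not appear in the original text.

```latex
\[
\text{Sensitivity} = \frac{TP}{TP + FN},
\qquad
\text{Specificity} = \frac{TN}{TN + FP} = 1 - \frac{FP}{TN + FP}
\]
```

Under this reading, TP counts participants with SRC who show exercise intolerance and FP counts participants without SRC who show exercise intolerance, so specificity is one minus the incidence of exercise intolerance in the non-SRC sample.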
Literature search
Senior authors (MNH, BSW, JJL) identified keywords and a research data scientist with years of experience performing systematic reviews was involved in developing the search strategy. PubMed and Embase were searched in January 2022 for human subjects research published since 2000 using a combination of the following terms: concussion, exercise test, treadmill test, graded exertion test, exercise tolerance/intolerance, and Buffalo Concussion Treadmill Test (BCTT). Search syntaxes for both databases are included in Supplementary Appendix S1. To assess the effectiveness of our search, we confirmed retrieval of three studies that we knew assessed exercise testing in participants with SRC. 16,21,22 Duplicate articles were removed and a title/abstract screen was performed by two separate reviewers (EL, MSZN). The lead author (MNH) reviewed the results of the first 50 screens from both reviewers as a quality control measure and weekly meetings were held during the screening phase to address any concerns. Cohen's Kappa of inter-rater reliability was calculated. 23 The references of included articles were screened for possible additional articles, and authors of included articles were contacted to ask if they knew of any other articles or abstracts that should be included.
Study selection criteria
Inclusion criteria
Studies were included if they performed graded exercise tolerance tests in participants with acute SRC (< 14 days since injury), in non-concussed athletes, or both. SRC was required to be diagnosed using standard criteria set forth by the Concussion In Sport Group (CISG) 2 and the study population had to be primarily sport-related injuries (> 90%). For studies that included only healthy controls, the exercise tolerance test protocol had to be identical to an exercise protocol used for concussion assessment, such as the BCTT 24 or a modified Balke protocol. 25 Non-specific concussion-like symptoms such as headaches and dizziness during graded exercise can be due to a variety of non-pathological conditions in healthy non-athletes, but athletes are typically not expected to have exercise intolerance. Hence, we included only studies that had athletic controls who participated in organized sport or an equivalent activity (e.g., recreational sport or regular exercise).
Exclusion criteria
Studies were excluded if they did not perform the first exercise tolerance assessment within 14 days of injury, did not provide results of exercise tolerance assessment or comment on the incidence of exercise intolerance, included participants with non-SRC (> 10%), or included injuries that were more severe than concussion/mTBI (e.g., intracranial bleed).
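The selection rules can be sketched as two small predicates, one per study arm, because several included studies contributed only their control arm. This is an illustrative sketch, not the screening tool used in the review: the function and parameter names are hypothetical, and the boundary handling (14 days inclusive, strictly more than 90% sport-related) is an assumption where the text leaves it open.

```python
def src_arm_eligible(days_to_first_test: float,
                     pct_sport_related: float,
                     reports_intolerance: bool,
                     has_more_severe_injury: bool,
                     intolerance_was_inclusion_criterion: bool) -> bool:
    """Return True if a concussed arm meets the stated selection criteria."""
    if days_to_first_test > 14:            # first test must fall within 14 days (assumed inclusive)
        return False
    if pct_sport_related <= 90:            # sample must be > 90% sport-related
        return False
    if not reports_intolerance:            # incidence of exercise intolerance must be reported
        return False
    if has_more_severe_injury:             # e.g., intracranial bleed
        return False
    if intolerance_was_inclusion_criterion:  # would inflate the sensitivity estimate
        return False
    return True


def control_arm_eligible(controls_are_athletic: bool,
                         uses_concussion_protocol: bool,
                         reports_intolerance: bool) -> bool:
    """Return True if a non-concussed arm meets the stated selection criteria."""
    return controls_are_athletic and uses_concussion_protocol and reports_intolerance


# Hypothetical arm: tested at day 6, but only 60% of the sample was sport-related.
print(src_arm_eligible(6, 60, True, False, False))  # False
```

Evaluating arms separately mirrors how a study could have an ineligible concussed arm and still contribute an eligible control arm.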
Data extraction
The full text of screened articles was read, and the following data were extracted by three reviewers (MNH, MSZN, AN) using a standard data form: study author, study year, study design, SRC sample size, healthy/control sample size, age, sex, days from injury to assessment, definition of study population, method of exercise tolerance assessment, and incidence of concussion symptom exacerbation (i.e., exercise intolerance). Two researchers (MSZN, AN) shared work and extracted data from half the articles each, and one researcher (MNH) performed the secondary extraction. The lead author reviewed all discrepancies and discussed them with the research assistants during weekly research meetings. No agreement statistic was performed for extracted data.
Risk of bias and level of evidence
Study quality assessment was performed using the Newcastle-Ottawa Scale, 26 a validated checklist that assesses risk of bias (ROB) in observational, case control, and cohort studies, and grades the following domains: selection, comparability, and outcome/exposure. Good quality corresponds to 3–4 points in selection, 1–2 points in comparability, and 2–3 points in outcome/exposure. Fair quality corresponds to 2 points in selection, 1–2 points in comparability, and 2–3 points in outcome/exposure. Poor quality corresponds to 0–1 points in selection or 0 points in comparability, or 0–1 points in outcome/exposure. Level of Evidence was determined using guidelines set forth by Melnyk and associates, 27 which uses a seven-level grading system that ranges from systematic review of RCTs (Level 1) to expert opinion (Level 7). As with the data extraction process, two research assistants split the work followed by a secondary assessment by the lead author. All discrepancies were discussed, and the lead author made the final decision.
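The quality thresholds described above can be expressed as a short classification rule. The sketch below assumes the usual Newcastle-Ottawa domain maxima (4 points for selection, 2 for comparability, 3 for outcome/exposure); the function name is hypothetical.

```python
def nos_quality(selection: int, comparability: int, outcome: int) -> str:
    """Map Newcastle-Ottawa domain scores to a good/fair/poor rating."""
    if not (0 <= selection <= 4 and 0 <= comparability <= 2 and 0 <= outcome <= 3):
        raise ValueError("domain score out of range")
    # Poor: any one domain falls below its minimum.
    if selection <= 1 or comparability == 0 or outcome <= 1:
        return "poor"
    # At this point comparability is 1-2 and outcome is 2-3.
    if selection == 2:
        return "fair"
    return "good"  # selection is 3-4


# A single-arm study with no usable comparison group scores 0 in comparability.
print(nos_quality(4, 0, 3))  # poor
```

A score of 0 in comparability alone is enough for a poor rating, which is why studies without a usable comparison arm are rated poor regardless of their other domain scores.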
Statistical analysis
Individual studies' demographics and incidence of concussion-induced exercise intolerance on graded exertion tests were tabulated and displayed in forest plots. A meta-analysis was performed by calculating the pooled incidence estimate of exercise intolerance with 95% confidence intervals (CIs) using a random effects model with double arcsine transformation (Freeman-Tukey) in participants with and without SRC separately. To include studies with zero incidence, a continuity correction was performed by adding 0.5 to the numerator and denominator of incidence. 28 The pooled incidence estimate and CI for participants with SRC is the meta-analysis sensitivity estimate for exercise intolerance and, similarly, one minus the incidence estimate for participants without SRC is the meta-analysis specificity estimate. All analyses were performed using JBI System for the Unified Management, Assessment and Review of Information (JBI SUMARI) (JBI, Adelaide, Australia). 29
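A minimal sketch of this kind of pooling is given below. It is not a reproduction of the JBI SUMARI computation, whose internals are not described here. It assumes a DerSimonian-Laird estimate of between-study variance, Miller's closed-form inverse of the Freeman-Tukey transform, and back-transformation at the harmonic mean sample size. The 0.5 continuity correction is not applied, because the transform is defined for zero counts. All function names and the example counts are hypothetical.

```python
import math


def ft_transform(events, n):
    """Freeman-Tukey double arcsine transform of events/n and its approximate variance."""
    t = math.asin(math.sqrt(events / (n + 1))) + math.asin(math.sqrt((events + 1) / (n + 1)))
    return t, 1.0 / (n + 0.5)


def ft_inverse(t, n):
    """Back-transform to a proportion (Miller's inverse), clamped to [0, 1]."""
    t_min = math.asin(math.sqrt(1.0 / (n + 1)))              # transform of 0 events
    t_max = math.asin(math.sqrt(n / (n + 1))) + math.pi / 2  # transform of n events
    if t <= t_min:
        return 0.0
    if t >= t_max:
        return 1.0
    s = math.sin(t)
    inner = s + (s - 1.0 / s) / n
    root = math.sqrt(max(0.0, 1.0 - inner * inner))
    return 0.5 * (1.0 - math.copysign(1.0, math.cos(t)) * root)


def pool_proportions(studies, z=1.96):
    """Random-effects pooled proportion with CI from a list of (events, n) pairs."""
    k = len(studies)
    ts, vs = zip(*(ft_transform(e, n) for e, n in studies))
    w = [1.0 / v for v in vs]                                 # inverse-variance weights
    t_fixed = sum(wi * ti for wi, ti in zip(w, ts)) / sum(w)
    q = sum(wi * (ti - t_fixed) ** 2 for wi, ti in zip(w, ts))  # Cochran's Q
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]                     # random-effects weights
    t_pooled = sum(wi * ti for wi, ti in zip(w_re, ts)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    n_h = k / sum(1.0 / n for _, n in studies)                # harmonic mean sample size
    return tuple(ft_inverse(t, n_h) for t in (t_pooled, t_pooled - z * se, t_pooled + z * se))


# Hypothetical counts of exercise-intolerant participants in three non-SRC samples.
incidence, low, high = pool_proportions([(2, 40), (0, 25), (4, 60)])
specificity = (1 - incidence, 1 - high, 1 - low)  # bounds swap when taking the complement
print(incidence, low, high, specificity)
```

Pooling on the transformed scale stabilizes the variance of proportions close to 0 or 1, which is the situation for both the SRC and the non-SRC samples here.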
Results
The database search identified 352 unique articles, and other sources identified one medical school poster. Three hundred five articles were removed during title/abstract screen by two independent reviewers. There were 35 disputed articles among the screeners, which were reviewed by the lead author. The Cohen's Kappa of inter-rater agreement during screening was 0.673, which corresponds to substantial agreement. 30 The full texts of 48 articles were read and a total of 12 articles was included in the analysis. Figure 1 presents the PRISMA flowchart of study inclusion and exclusion.
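For readers unfamiliar with the statistic, Cohen's kappa for two screeners making include/exclude decisions can be computed from the four cells of their agreement table. The sketch below uses hypothetical counts, because the per-reviewer decisions are not reported here, and it assumes the commonly used Landis and Koch bands for the verbal label.

```python
def cohens_kappa(both_include, only_a_includes, only_b_includes, both_exclude):
    """Cohen's kappa for two raters making a binary include/exclude decision."""
    n = both_include + only_a_includes + only_b_includes + both_exclude
    observed = (both_include + both_exclude) / n
    a_inc = (both_include + only_a_includes) / n      # rater A's inclusion rate
    b_inc = (both_include + only_b_includes) / n      # rater B's inclusion rate
    expected = a_inc * b_inc + (1 - a_inc) * (1 - b_inc)  # agreement expected by chance
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Verbal label using the Landis and Koch bands (assumed scale)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Hypothetical screening table: 20 joint includes, 15 joint excludes, 15 disagreements.
print(cohens_kappa(20, 5, 10, 15))
print(agreement_label(0.673))  # substantial
```

Kappa discounts the agreement expected by chance, so it can be much lower than raw percent agreement when most articles are excluded by both screeners.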

Figure 1. PRISMA flowchart.
Table 1 presents the characteristics of the included studies: sex distribution, age, and type of graded exertion test used. Except for three studies, 31–33 all evaluated either SRC or healthy participants but not both. The study by Cordingley and co-workers 34 assessed exercise tolerance in children with symptomatic concussion at their initial assessment (n = 106) and again after they had clinically recovered (n = 65). The initial assessments could not be used because they were >14 days since injury, so we used only the recovered assessment in our analysis for the control arm. The 2021 study by Leddy and colleagues 35 had two arms, acutely concussed and a healthy, athletic control arm, but it had exercise intolerance as an inclusion criterion for the concussed sample, so only the control arm was included. Similarly, Bogdanowicz and associates 33 studied two separate populations, an adolescent athlete group and a non-athletic adult group (medical students), so only half the sample (adolescent athletes) was included in this analysis.
Table 1. Description of Included Studies
BCBT, Buffalo Concussion Bike Test; BCMT, Buffalo Concussion March-in-Place Test; BCTT, Buffalo Concussion Treadmill Test; mSRT, modified Shuttle Run Test; RCT, randomized controlled trial; SRC, sport-related concussion.
Study population
All SRC studies evaluated adolescents, whereas the two studies 36,37 of non-SRC controls included college-aged adults. The CISG 2 definition of SRC was uniform across studies. The study by Chizuk and colleagues 22 included some non-SRC but qualified for inclusion because >90% were SRC. The definition of controls in the studies, however, was less uniform. The Leddy group 31–33,35 used a very homogeneous definition of controls: healthy adolescents (aged 13–18 years) who played at least one organized sport, did not experience a concussion in the past year, and had no more than three life-time concussions. Studies on controls 37–39 that reported on the normal (non-concussed) response to aerobic exercise for the purpose of sport-concussion management used athletic controls who exercised regularly. To develop an exertion test for active-duty service members, Prim and associates 36 studied young, healthy adults who were exercising 3 times a week; hence, they were athletic and thus included in the analysis. Cordingley and co-workers 34 performed a longitudinal cohort study of exercise intolerance at the time of symptomatic concussion and after clinical recovery. They did not include a non-concussed sample per se, but we included their recovered time-point exercise assessment in our analysis because these were adolescent athletes who had begun a return-to-play protocol after having been declared clinically recovered from their concussion.
Graded exertion tests
Three studies employed cycle leg ergometers, 31,32,38 two used a 6-min step test, 36 and one used a 15-min march-in-place test to assess exercise tolerance after concussion. 33 The most common graded exertion test used to evaluate exercise tolerance after SRC or within the non-concussed population was the BCTT.
Table 2 presents results of the study quality analysis. Except for two RCTs, the studies were either cross sectional, case-control, or cohort (Level of Evidence 4). Most studies received a poor-quality score because they either did not have a comparison arm or had one that we could not include. Thus, they were treated as observational studies in the context of this meta-analysis and scored poorly in the comparability domain.
Table 2. Study Quality Assessment
Level of Evidence grading system: 1 = systematic review and meta-analyses of randomized controlled trials; 2 = one or more randomized controlled trials; 3 = controlled trial; 4 = case-control or cohort study; 5 = systematic review of descriptive or qualitative studies; 6 = single descriptive or qualitative study; 7 = expert opinion.
For the specificity determination, Figure 2 presents the forest plot of proportions of participants without SRC who had exercise intolerance. The pooled estimate of incidence of exercise intolerance in participants without SRC was 5.4% (95% CI: 2.7, 8.9). This corresponds to a specificity of 94.6% (95% CI: 91.1, 97.3).
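The conversion from pooled incidence to specificity is a complement, so the confidence bounds swap places: the upper bound of the incidence becomes the lower bound of the specificity. Using the pooled values reported above:

```latex
\[
\text{Specificity} = 1 - 0.054 = 0.946,
\qquad
95\%\ \text{CI} = (1 - 0.089,\ 1 - 0.027) = (0.911,\ 0.973)
\]
```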

Figure 2. Forest plot of proportions of participants without SRC who had exercise intolerance. CI, confidence interval; SRC, sport-related concussion.
For the sensitivity determination, Figure 3 presents the forest plot of proportions of participants with SRC who had exercise intolerance. The pooled estimate of incidence of exercise intolerance in participants with SRC was 94.4% (95% CI: 90.8, 97.2), which is the estimated sensitivity.

Figure 3. Forest plot of proportions of SRC participants with concussion-related exercise intolerance. CI, confidence interval; SRC, sport-related concussion.
Discussion
This systematic review identified studies that allowed us to estimate the sensitivity and specificity of concussion-related exercise intolerance on graded exertion testing for the diagnosis of SRC. Because we did not find a single publication that specifically studied the sensitivity and specificity of exercise intolerance on graded exertion testing in patients acutely after SRC versus a matched control group, we performed a proportional meta-analysis to estimate the incidence of concussion-like exercise intolerance separately in patients with and without SRC. For this reason, the results should be considered exploratory until they can be validated in prospective studies of exercise testing in concussed patients compared with matched controls.
Nevertheless, we found excellent diagnostic sensitivity and specificity for early (1–14 days post-injury) exercise intolerance for SRC, with an estimated sensitivity of 94.4% and an estimated specificity of 94.6%. These values are comparable or superior to those of other common measures used to test for the diagnosis of SRC. One study of a computerized neurocognitive test, the Immediate Post-concussion Assessment and Cognitive Test (ImPACT), reported a sensitivity and specificity of 55.0% and 97.5%, respectively, within 1 day of injury, 40 whereas another study done within 3 days of injury reported 91.4% sensitivity and 69.1% specificity. 41 The King-Devick (KD) Test, a rapid number reading test, has a sensitivity and specificity of 86% and 90%, respectively, for diagnosing SRC when used on the sideline, although the diagnostic accuracy declines a few days after injury. 42,43 The Sport Concussion Assessment Tool (SCAT), a multi-modal assessment including self-reported symptoms, a neurocognitive screen, a vestibular exam, and a brief oculomotor exam, is reported to have sensitivities of 80.4–89.1% and specificities of 69.0–80.9% when used for immediate assessment on sport sidelines. 44,45
Most included articles had equal sex distribution, with only three being male-dominant. The incidence of exercise intolerance in male-dominant studies of non-concussed controls (3.1%, 10.3%, and 3.2%) was not significantly different from the pooled mean incidence (5.4%) in the entire group without SRC, suggesting that the incidence of exercise intolerance is similar in healthy males and females. All included SRC studies involved adolescents, whereas the control studies included both adolescents and young adults. We were unable to control for the covariate of age due to the limitations of an exploratory proportional meta-analysis. Gaetz and Iverson 38 reported an exercise intolerance rate of 12% in a healthy young adult population (aged 18–24 years), which suggests that non-adolescent athletes may demonstrate more exercise intolerance than younger athletes.
Because all the included studies evaluated athletes, our results may not generalize to non-athletes. One potentially important study that merits further discussion did not meet inclusion criteria because 40% of the sample was non-SRC: Orr and colleagues 18 performed graded exertion testing using the Bruce protocol in concussed adolescents (aged 12–16 years) 5–7 days from injury and reported that only 46% of their population was exercise intolerant, that is, 54% of their acutely concussed sample was not. Exercise tolerant participants had few clinical indicators of concussion, were safely transitioned to a return-to-activity protocol, and all recovered within 10 days. The exercise intolerant participants, however, had high clinical indicators of concussion and experienced prolonged recovery (mean recovery time of 45.6 days). The authors commented on the usefulness of graded exertion testing for concussion management. We contacted the authors to confirm the non-athlete nature of their study population before exclusion. Future studies should assess the incidence of exercise intolerance in the non-sport population.
Exercise intolerance has been considered to be a “physiological biomarker” of concussion 17 because there is evidence of abnormal regulation of cerebral blood flow (CBF) during exercise that is associated with reduced exercise performance. 46 Conversely, recovery of exercise tolerance has been associated with return to normal brain function during cognitive tasks and normalization of local CBF on functional magnetic resonance imaging (MRI). 47 The diagnosis of SRC currently is made by a combination of history and physical examination findings after an appropriate mechanism of injury, 2 and there is a differential diagnosis for the clinical presentation after head injury because post-traumatic symptoms are non-specific and may originate from preexisting conditions and/or a concomitant cervical injury. 11
One goal for the field, therefore, is to find more objective means to support the clinical diagnosis of brain injury as the source of post-traumatic symptoms. 2 Demonstration of exercise intolerance in the days to 2 weeks after SRC may help to increase clinician confidence that the source of symptoms after head trauma is due to brain injury as opposed to other potential sources. 17 Our study compared the incidence of concussion-related exercise intolerance in patients with and without a diagnosed SRC. It did not examine the sensitivity and specificity of exercise intolerance for SRC diagnosis after any head injury. To accomplish this, researchers would have to enroll concussed athletes and compare them with athletic controls with a recent head injury but who were not diagnosed with SRC.
Clinical implications
Sub-symptom threshold aerobic exercise treatment has been proven to facilitate recovery from SRC when it is based upon the individual patient's heart rate threshold identified on systematic exercise testing performed within 10 days of injury. 15,16 Clinicians without their own equipment can partner with physical therapists, athletic trainers, or exercise physiologists, practitioners with the expertise and equipment to conduct exercise testing. When the diagnosis of concussion may be in question because of non-specificity of symptoms or lack of an accurate history, clinicians can employ the principle of “exercise intolerance” to help raise or lower the pre-test probability of the diagnosis without the need for questionably reliable computerized cognitive tests, 48 sophisticated imaging, or expensive and thus far unproven blood or other fluid biomarkers.
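How a test result shifts the probability of a diagnosis can be illustrated with likelihood ratios. The sketch below plugs in the exploratory pooled estimates from this review (sensitivity 94.4%, specificity 94.6%). The 50% pre-test probability is an arbitrary value chosen for illustration, and the function names are hypothetical. Because the pooled estimates are unvalidated, the outputs are illustrative rather than clinical guidance.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability through the odds form of Bayes' theorem."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


lr_pos, lr_neg = likelihood_ratios(0.944, 0.946)
pre = 0.50  # hypothetical pre-test probability of SRC

print(round(lr_pos, 1))                               # 17.5
print(round(post_test_probability(pre, lr_pos), 3))   # 0.946 if exercise intolerant
print(round(post_test_probability(pre, lr_neg), 3))   # 0.056 if exercise tolerant
```

Under these assumptions, a finding of exercise intolerance moves an uncertain diagnosis to a probable one, and normal exercise tolerance moves it to an unlikely one.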
Limitations
The major limitation of this study is that we were unable to perform a traditional comparative meta-analysis and so our results require validation. Another limitation is that our results are subject to selection bias because all the studies of exercise tolerance within 14 days of SRC were from the same group of researchers at the University at Buffalo. This is not necessarily surprising because the concept of exercise tolerance testing in SRC 49 and of using the data to prescribe sub-symptom threshold aerobic exercise to improve recovery 50 originated in Buffalo. Our clinic offers graded exertion testing within the initial clinical encounter by experts in exercise rehabilitation for SRC, so patients with noticeable exercise intolerance may preferentially seek care there. Our results would not be achievable by clinics that do not have resources to conduct exercise testing. Nevertheless, centers with equipment and that use the “Buffalo protocol” for treatment based on exercise testing in SRC patients have reported good outcomes in those with prolonged symptoms. 51–53
The generalizability of our findings is limited, and more studies of the sensitivity and specificity of exercise tolerance after SRC should be conducted at other centers. Another limitation is that there is no gold standard diagnostic test for concussion, so the determination of sensitivity and specificity is based upon the clinical diagnosis of concussion by experienced physicians. Lastly, this review is subject to publication bias. We searched two of the largest databases (PubMed and Embase), which encompass 91% of the published literature, 54 and we contacted sport-concussion researchers who performed graded exertion testing. Still, we were unable to review unpublished work or literature that was not included in these databases.
Conclusions
The results of this proportional meta-analysis suggest that exercise intolerance measured on systematic testing within 2 weeks of SRC has excellent (94%) sensitivity for the diagnosis of SRC and excellent (95%) specificity for helping to rule out SRC. When combined with a careful history and concussion-relevant physical examination, systematic determination of exercise tolerance could significantly raise or lower the pre-test probability of concussion to help clinicians manage their patients more successfully. A prospective validation study to determine the sensitivity and specificity of exercise intolerance to diagnose concussion as the source of symptoms after head injury is warranted.
Footnotes
Transparency, Rigor, and Reproducibility Summary
The methods of this exploratory meta-analysis were guided by the Preferred Reporting Items for Systematic reviews and Meta-Analyses guidelines (PRISMA). The screening of articles was facilitated by the web-based software platform JBI SUMARI. Two databases, namely, PubMed and Embase, were initially searched from 2000 to Jan 20, 2022, with an updated search before submission to confirm no new articles of interest were published that may change our results. Following initial screen for duplicate articles, two authors (EL and MSZN) independently assessed the eligibility of retrieved articles based on titles and abstracts. Data on the 12 included articles have been comprehensively reported in the manuscript and a list of all full-text publications screened for data extraction is provided in the Supplement File. For other details concerning the search process and outcomes, please contact the corresponding author.
Acknowledgments
The authors thank the University at Buffalo Health Sciences Library for helping to develop our search strategy.
Authors' Contributions
The authors contributed as follows. MNH: writing – original draft (lead); formal analysis (lead); writing – review and editing (equal); methodology. EL: data curation; review and editing (equal). MSZN: data curation; review and editing (equal). AN: data curation; review and editing (equal). HMC: data curation; review and editing (equal). JCM: formal analysis; review and editing (equal). JIM: formal analysis; review and editing (equal). BSW: conceptualization; supervision; review and editing (equal). JJL: conceptualization; supervision; review and editing (equal).
Funding Information
No funding was received for this work.
Author Disclosure Statement
No competing financial interests exist.
Supplementary Material
Supplementary Appendix S1
