Abstract
BACKGROUND:
Minimum cardiorespiratory fitness (CRF) has been recommended for firefighters due to job requirements. Thus, it is important to identify accurate and readily available methods to assess CRF in this population. Non-exercise CRF estimates (NEx-CRF) have been proposed but this approach requires validation in this population.
OBJECTIVE:
To evaluate the accuracy of a NEx-CRF, as compared to a field maximum exercise test, among career military firefighters of both genders using a comprehensive agreement analysis.
METHODS:
We evaluated the accuracy of a NEx-CRF estimate compared to the Cooper 12 min running test among 702 male and 106 female firefighters.
RESULTS:
Cooper and NEx-CRF tests yielded similar CRF in both genders (differences <1.8±4.7 ml·kg–1·min–1; effect size <0.34). However, NEx-CRF underestimated Cooper-derived CRF among the fittest firefighters. NEx-CRF showed moderate to high sensitivity/specificity to detect fit or unfit firefighters (71.9% among men and 100% among women). Among men, the NEx-CRF method correctly identified most firefighters with less than 11 METs or greater than 13 METs, but showed lower precision to discriminate those with CRF between 11–13 METs.
CONCLUSIONS:
The NEx-CRF method to estimate firefighters’ CRF may be considered as an alternative method when an exercise-based method is not available or may be used to identify those who require more traditional testing (CRF 11–13 METs).
Keywords
Introduction
Cardiorespiratory fitness (CRF) is a component of health-related physical fitness [1] and is associated with lower overall and cardiovascular mortality among the general population [2–5]. This positive association between a high CRF and a reduced mortality has also been found among men with cardiovascular disease and among those with some highly prevalent comorbidities such as hypertension and diabetes [6, 7]. Additionally, evaluating CRF is common in epidemiological studies focused on evaluating precursors of cardiovascular diseases and/or mortality [5].
CRF is an important confounding factor that must be taken into account in risk analysis for cardiovascular disease risk factors such as fatness, elevated cholesterol levels, and elevated blood pressure [8, 9]. In 1996, Blair and colleagues [8] showed that fit people with any combination of smoking, high blood pressure, or high cholesterol level had lower mortality than low-fit people with none of these risk factors. More recently, a 2014 meta-analysis addressing the obesity paradox found that obese and overweight fit individuals had a mortality risk similar to that of normal-weight fit individuals, whereas unfit participants, regardless of their BMI, had a higher mortality risk than normal-weight fit individuals [9].
Among firefighters, low CRF has also been associated with metabolic syndrome [10], ECG and autonomic exercise testing abnormalities [11], and poor body composition [12]. In addition to the unequivocal association between CRF and many health outcomes [10, 13], a minimum level of CRF has been recommended for firefighters due to the inherent high physical demands of firefighting [14–17].
A 2016 study reported that the professions with the highest physical workload, as estimated by METs during work time, such as agricultural workers, craftsmen, and labourers, typically perform most of their work activities at a moderate intensity (3.0–5.9 METs), with normally less than a minute per day of work physical activity requiring more than 9 METs (very high intensity) [18]. However, firefighting tasks even in the absence of live-fire can require energy expenditure between 7.1–12.9 METs [19].
It is well known that firefighters’ job activities involve intense physical and psychological demands, which make firefighting a hazardous profession associated with high on-duty mortality [20, 21]. Based on physiological demands and the known health risks, the United States National Fire Protection Association (NFPA) recommends a CRF equivalent to 12 METs as a minimum for firefighters’ safe job performance [22].
Therefore, the assessment of CRF among firefighters is highly important both for health purposes and for job requirements. Traditionally, CRF has been evaluated with exercise testing, as measured by maximal or submaximal treadmill tests or by field tests [23, 24]. One of the field tests most commonly used worldwide is the Cooper 12-min running test, a very low-cost test that has been widely used among militaries around the world [12, 25]. Importantly, treadmill exercise testing is costly and time-consuming, requires qualified staff, and increases acute cardiovascular risk among susceptible individuals [26, 27].
Accordingly, novel approaches to estimate CRF that reduce risk, time and cost are needed. Non-exercise (N-Ex) CRF estimates based on self-reported physical activity questionnaires (SRPAs), along with some easily evaluated measures (e.g., BMI, age, gender), are an attractive alternative. N-Ex CRF has been shown to be accurate both for CRF assessment and for long-term cardiovascular and mortality risk classification [3, 28–31]. N-Ex CRF estimates have also been used in studies among firefighters [32–34]. However, since firefighters go through occupationally mandated physical training and might be fitter and have more muscle mass than the general population, the parameters used to predict N-Ex CRF, such as BMI, age, and gender, may affect CRF estimates differently than in the general population. Also, the most common SRPAs that are used in N-Ex CRF estimates emphasize leisure physical activities (PA) or do not include occupational PA, which is high among firefighters [35, 36] and may influence aerobic capacity estimates. As a result, N-Ex CRF requires validation in this unique occupational group.
Therefore, we aimed to evaluate the accuracy of an N-Ex CRF as compared to a field maximum exercise test among career military firefighters of both genders by a comprehensive agreement analysis that includes both inferential statistics and epidemiological indices.
Methods
Study design and subjects
We conducted a cross-sectional study with a convenience sample of Brazilian military firefighters. Participants were career military firefighters from the Brazil Federal District (Brasilia) Military Firefighter Brigade (CBMDF, the Portuguese acronym). Data were collected in 2017 during the CBMDF mandatory annual physical fitness assessment. Experienced firefighter instructors with prior training in the CBMDF physical fitness protocol conducted all physical fitness tests and data collection. Trained researchers administered the questionnaire booklet, which was added for research purposes only. Firefighters were informed that none of their questionnaire responses would be used for any occupational purpose and were blinded to the fact that it would generate a CRF estimate.
The annual physical fitness assessment was performed over 6 consecutive weeks, following a schedule established by military personnel. In order to include participants from all ranks, age ranges and genders, participants were recruited and questionnaires completed on a daily basis during that period. Data were collected during more than 80% of the days on which the physical fitness assessment was scheduled. The main research proposal was previously disclosed on the CBMDF website and firefighters were verbally invited to participate. All firefighters aged 18 to 49 years, with no work restrictions, and who were released for the physical fitness assessment following medical screening were eligible. Firefighters aged 50 years and older were not included because they perform a submaximal physical fitness evaluation. The study was approved by the University of Brasilia Faculty of Health Sciences Ethics Committee on Human Research (CEP-FS-UnB-CAAE:16473613.9.0000.0030), and an authorization from the CBMDF was also issued. All participants signed an informed consent form.
Anthropometric assessment and resting functional variables
Participants’ heights to the nearest 0.1 cm and weights to the nearest 1.0 kg were measured using a Welmy® altimeter and calibrated medical scale. Participants’ BMIs were calculated as weight (kg)/height squared (m²). Participants were measured in a physical education uniform (light clothes) without shoes. Resting blood pressure and heart rate were measured after five minutes of resting in a seated position. Anthropometric data and CRF estimated from the Cooper 12 min running test were extracted from the CBMDF database and analyzed in a de-identified (anonymous) fashion.
Exercise cardiorespiratory fitness assessment
Cardiorespiratory fitness (CRF) was assessed using the Cooper 12-min running test, which is an indirect estimate of the maximum oxygen consumption (VO2max in mL · kg–1 · min–1) [37]. During the Cooper test, firefighters were instructed to run as far as possible in 12 minutes on a standardized 400 m running track available in the CBMDF facilities. All tests were supervised by qualified CBMDF staff and performed on the same athletic running track to improve the test’s precision and ensure standardization among participants. The covered distance in meters was then converted into an oxygen uptake estimate (VO2max) using a validated formula: VO2max = (Distance – 504.9)/44.73 [38]. For some analyses of CRF, VO2max was converted to METs by dividing VO2max values by 3.5 [39].
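The two conversions described above can be sketched in Python as follows. This is only an illustration of the stated formulas, not the code used in the study; the function names and the example distance are assumptions introduced here.

```python
def cooper_vo2max(distance_m: float) -> float:
    """Estimate VO2max (mL·kg^-1·min^-1) from the distance (m) covered in 12 minutes."""
    return (distance_m - 504.9) / 44.73


def vo2_to_mets(vo2max: float) -> float:
    """Convert VO2max (mL·kg^-1·min^-1) to METs, taking 1 MET = 3.5 mL·kg^-1·min^-1."""
    return vo2max / 3.5


# Hypothetical example: a firefighter who covers 2400 m in 12 minutes.
vo2 = cooper_vo2max(2400)
mets = vo2_to_mets(vo2)
print(f"VO2max = {vo2:.1f} mL/kg/min ({mets:.1f} METs)")
```

Under this formula, a CRF of 12 METs (42 mL · kg–1 · min–1) corresponds to a distance of roughly 2384 m in 12 minutes.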
Non-exercise cardiorespiratory fitness assessment (N-Ex)
CRF was also estimated by a validated algorithm, proposed by Jackson and colleagues, that includes a self-reported physical activity pattern (SRPA), BMI, age, and gender, according to the equation: VO2max = 56.363 + 1.921(SRPA) – 0.381(age) – 0.754(BMI) + 10.987(gender; female = 0, male = 1) [28]. In the SRPA questionnaire, firefighters were instructed to choose the one out of eight options that best characterized their physical activity pattern over the last month. The N-Ex CRF model has been shown to be appropriate for use among adult populations [28] and has been used in studies among firefighters [32, 40], although, as pointed out previously, without a specific accuracy analysis within this population. Furthermore, a systematic review found that this N-Ex CRF estimate had the highest methodological and statistical scores among the available non-exercise CRF prediction models [41].
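As a minimal sketch, the equation above can be written as a small Python function. The function name, the argument names and the example values are hypothetical, and coding the eight SRPA options as integers from 0 to 7 is an assumption made here rather than something stated in the text.

```python
def nex_crf(srpa: int, age: float, bmi: float, male: bool) -> float:
    """N-Ex VO2max estimate (mL·kg^-1·min^-1) from the equation given in the text.

    srpa: self-reported physical activity score (assumed coded 0-7, eight options)
    age:  age in years
    bmi:  body mass index in kg/m^2
    male: True for men (coded 1), False for women (coded 0)
    """
    if not 0 <= srpa <= 7:
        raise ValueError("SRPA score is assumed to lie between 0 and 7")
    gender = 1 if male else 0
    return 56.363 + 1.921 * srpa - 0.381 * age - 0.754 * bmi + 10.987 * gender


# Hypothetical example: 38-year-old man, BMI 26.0 kg/m^2, SRPA score 5.
estimate = nex_crf(srpa=5, age=38, bmi=26.0, male=True)
print(f"N-Ex VO2max = {estimate:.1f} mL/kg/min ({estimate / 3.5:.1f} METs)")
```

With all other inputs equal, the equation places a man 10.987 mL · kg–1 · min–1 above a woman, and each additional SRPA point adds 1.921 mL · kg–1 · min–1.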
Statistical analysis
The hypothesis of normal distribution was tested with the Shapiro-Wilk test and rejected for some variables. However, large sample sizes have been demonstrated to enhance the probability of type I error in distribution analyses [42]. Thus, based on the apparently normal distribution assessed by visual inspection of histograms and the high overlap of expected and observed values in the Q-Q plots [43], parametric tests were applied and data are expressed as mean±standard deviation.
Intra- and inter-subject comparisons were performed by dependent and independent t-tests, respectively. Agreement between Cooper-derived CRF and N-Ex CRF was assessed using the Bland & Altman method [44]. The hypothesis that the bias between measurements was different from zero was evaluated by a one-sample t-test. Effect sizes (ES) of the differences were calculated by the formula: ES = √(t²/(t² + df)). Correlation between the two CRF estimate methods was determined by the Pearson correlation coefficient and the association between categorical data was analyzed by the chi-square or Fisher’s exact test.
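The agreement statistics described above can be illustrated with a short Python sketch using only the standard library. The data are invented for illustration, the function names are hypothetical, and taking the difference as Cooper minus N-Ex is an assumption about the sign convention.

```python
import math
import statistics


def bland_altman(cooper, nex):
    """Return bias, SD of the paired differences and the 95% limits of agreement."""
    diffs = [c - n for c, n in zip(cooper, nex)]  # assumed direction: Cooper minus N-Ex
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, sd, limits


def one_sample_t(bias, sd, n):
    """t statistic for the hypothesis that the mean difference (bias) is zero."""
    return bias / (sd / math.sqrt(n))


def effect_size(t, df):
    """ES = sqrt(t^2 / (t^2 + df)), as given in the text."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


# Invented VO2max values (mL/kg/min) for five participants.
cooper = [40.0, 44.0, 38.0, 46.0, 42.0]
nex = [39.0, 42.0, 39.0, 43.0, 41.0]

bias, sd, (lower, upper) = bland_altman(cooper, nex)
t = one_sample_t(bias, sd, len(cooper))
es = effect_size(t, df=len(cooper) - 1)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f}), ES = {es:.2f}")
```

A positive bias under this convention means that the N-Ex estimate is, on average, lower than the Cooper-derived value.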
Additional agreement analysis was done using the following epidemiological indexes: 1) total agreement (TA), as the sum of the percentages of true positive (TP) and true negative (TN) values (TA = TP + TN); 2) N-Ex CRF questionnaire sensitivity (sensitivity = [TP/(TP + FN)] × 100%), where FN is false negative; 3) N-Ex CRF questionnaire specificity (specificity = [TN/(TN + FP)] × 100%), where FP is false positive; 4) positive predictive value (PPV = [TP/(TP + FP)] × 100%); 5) negative predictive value (NPV = [TN/(TN + FN)] × 100%) [45]. All epidemiological indexes were calculated as point values with 95% confidence intervals (95% CI).
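The indexes above can be computed from paired classifications, as in the following Python sketch. Several choices here are assumptions and not taken from the text: "positive" is taken to mean meeting the CRF threshold, the Cooper test is treated as the reference, the Wilson score interval is used for the 95% CI (the CI method is not stated), and all names and data are hypothetical.

```python
import math


def pct(numerator, denominator):
    """Percentage, or NaN when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else float("nan")


def agreement_indices(cooper_mets, nex_mets, threshold=12.0):
    """Total agreement, sensitivity, specificity, PPV and NPV (all in %)."""
    tp = fn = fp = tn = 0
    for ref, test in zip(cooper_mets, nex_mets):
        ref_fit, test_fit = ref >= threshold, test >= threshold
        if ref_fit and test_fit:
            tp += 1
        elif ref_fit and not test_fit:
            fn += 1
        elif not ref_fit and test_fit:
            fp += 1
        else:
            tn += 1
    return {
        "total_agreement": pct(tp + tn, tp + tn + fp + fn),
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
        "ppv": pct(tp, tp + fp),
        "npv": pct(tn, tn + fn),
    }


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed CI method)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Invented MET values for six male firefighters.
cooper = [12.5, 11.0, 13.2, 10.4, 12.1, 11.8]
nex = [12.2, 11.5, 12.8, 10.9, 11.7, 12.3]
print(agreement_indices(cooper, nex, threshold=12.0))
```

Changing `threshold` to 11 or 13 METs, or to 9.5 METs for women, reproduces the kind of alternative cut-off analysis reported in the Results.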
Results
Descriptive characteristics of the study sample are shown in Table 1. Men were significantly older and had higher BMI, systolic blood pressure (SBP), diastolic blood pressure (DBP), and VO2max than women. Women had significantly higher resting heart rate than men.
Descriptive characteristics of the sample
RHR: resting heart rate; BP: blood pressure; VO2max: oxygen uptake; *t-test independent; #chi-square; ##Fisher test; scalar variables are expressed as mean±SD and categorical variables as n,%.
The difference between VO2max estimated by the Cooper 12 min running test and the N-Ex method (questionnaire) among men was significant but with a very small effect size. Among women, the VO2max estimates obtained by the two methods were also significantly different and showed a small effect size. Table 2 shows the differences in VO2max based on the two CRF methods for the whole sample and for CRF terciles as measured by the Cooper test for both genders.
Cardiorespiratory fitness (VO2max) estimated by the Cooper 12 min running test and N-Ex questionnaire
N-Ex: VO2max estimated by questionnaire; Cooper test: 12 min running test; p-value: paired t-test; ES: effect size. Cut-off points for men: 1st tercile: ≤40.4 ml·kg–1·min–1; 2nd tercile: >40.4 and ≤44.6 ml·kg–1·min–1; 3rd tercile: >44.6 ml·kg–1·min–1. Cut-off points for women: 1st tercile: ≤33.1 ml·kg–1·min–1; 2nd tercile: >33.1 and ≤37.7 ml·kg–1·min–1; 3rd tercile: >37.7 ml·kg–1·min–1.
Figure 2 shows the Bland-Altman plot comparison between VO2max measured by the Cooper test and by the N-Ex method. The differences in prediction accuracy are shown by gender and by tercile within each gender.

Diagram of the recruiting study sample. FFs: Firefighters.

Bland-Altman plots assessing agreement between VO2max estimated by the Cooper test and by the N-Ex method. Dotted lines represent 95% limits of agreement (±1.96 SD). Men: (a) all men, n = 702; (c) first tercile (lowest), n = 246; (e) second tercile, n = 224; (g) third tercile (highest), n = 232. Women: (b) all women, n = 106; (d) first tercile (lowest), n = 35; (f) second tercile, n = 39; (h) third tercile (highest), n = 32. VO2Cooper: Cooper 12 min running test; VO2N-Ex: questionnaire.
Epidemiological indices of agreement between measures in men are shown in Table 3. We used a single CRF cut-off point for each gender (12 METs for men and 9.5 METs for women), which corresponds to the evaluation criteria adopted by the fire department where the volunteers came from. We observed good total agreement among men, with moderate to high sensitivity and specificity, representing a moderate capacity of the N-Ex CRF questionnaire to correctly identify those with CRF ≥ 12 METs and those with CRF < 12 METs. Alternative cut-off points (≥11 and ≥13 METs) for men are also presented. When the 11 METs threshold was used in the N-Ex CRF estimates, we found a much higher sensitivity (almost 90%) and a lower specificity. The opposite was observed when the 13 METs cut-off point was used.
Agreement between N-Ex method (questionnaire) and Cooper 12 min running test for defining CRF in male firefighters (n = 702)
Among women, maximum total agreement, sensitivity and specificity were observed, indicating that the N-Ex CRF questionnaire correctly identified women with CRF ≥ 9.5 METs and those with CRF < 9.5 METs. In other words, the N-Ex CRF questionnaire among female firefighters showed no false positives or false negatives for CRF categories, resulting in predictive values of 100%.
Agreement indices calculated by gender and age categories, applying the Cooper test standard values instead of a single cut-off point, showed good sensitivity and poor specificity.
The correlation between VO2max estimated by the Cooper test and the one estimated by the N-Ex CRF questionnaire was moderate for men (Fig. 3a) and women (Fig. 3b).

Correlation between VO2 estimated by the Cooper test and N-Ex questionnaire in 702 military male firefighters with the line of identity and the 95% confidence interval (a) and 106 military female firefighters with the line of identity and the 95% confidence interval (b).
Discussion
In this cross-sectional study among middle-aged career military firefighters, we found that cardiorespiratory fitness estimated by the non-exercise cardiorespiratory fitness algorithm and by the Cooper 12 min running test yielded similar results, both in men and women. Good agreement was verified both by inferential statistics and by epidemiological agreement indices. Bland-Altman plots reinforce the small bias for both groups. Further, the correlation analysis showed a moderate positive association between measures. Among men, the N-Ex CRF method correctly identified most firefighters with CRF below 11 METs and above 13 METs but showed lower precision in discriminating those with CRF between 11–13 METs. The precision to discriminate between fit and unfit firefighters significantly increased (sensitivity > 84%) when gender- and age-specific Cooper standard cut-off points were used. Statistically and functionally significant differences and wider limits of agreement were found among the fittest firefighters (third tercile). In other words, the N-Ex CRF estimates were more accurate among the intermediate and less fit firefighters.
A systematic review published in 2004 investigated the ability of several N-Ex models to predict CRF and evaluated their applicability to epidemiologic studies [41]. Only five of the 23 studies reviewed fulfilled all quality criteria, including the one from Jackson et al. [28]. Among those five, the variables included in the models explained 62% to 77% of the variation of the estimated CRF [41]. It is therefore expected that any accuracy assessment of an N-Ex CRF estimate will result in quality indices somewhat below 100% agreement.
A study performed with young (∼20 yrs) male conscripts in military service found a significant difference (∼–1.5 METs) between the N-Ex CRF and the 12 min running test estimates [46]. Even though the mean age of our sample was almost 20 years older, the mean CRF estimated from the Cooper 12 min running test was very similar in both samples (42.6±5.9 vs 42.6±4.3 ml·kg–1·min–1). The difference between the exercise and the N-Ex CRF estimates found in the former study is similar to the one that we found among the fittest volunteers. Of note, the fittest firefighters in our sample also tended to be the youngest ones and those with lower BMI.
Considering the importance of the assessment of CRF, both for health and for job performance evaluations, a tendency to underestimate the actual CRF (by about 1.5 METs) among the fittest volunteers seems less critical than other possible errors or misclassifications, such as an overestimation among the less fit individuals. In our study, the third tercile cut-off point among men was equal to 44.2 ml·kg–1·min–1 or 12.6 METs. Thus, an N-Ex CRF underestimation within the fittest firefighters will not dramatically affect the qualitative interpretation of the group. The relatively small magnitude of the differences, their direction (underestimation), and the fact that they occurred only in the fittest group mitigate the potential negative impact of using the N-Ex CRF in fire service fitness evaluations.
Mean absolute differences of 0.2 and 2.0 ml·kg–1·min–1 and relative differences of 0.6% and 5.6% were observed in the intermediate and lower fitness groups of male firefighters, respectively (Table 2). Dolezal et al. [47] predicted VO2max in male firefighters using a submaximal treadmill test and observed absolute and relative differences of 0.94 ml·kg–1·min–1 and 11%, the latter being almost two times the relative difference found in our lower fitness group. Also, the 95% confidence intervals of their agreement limits (±13.1) were wider than those found in the present study. Evans and colleagues reviewed 19 studies using equations derived from submaximal exercise tests to predict VO2max in men and women of various ages and fitness levels [48]. Differences between the limits of agreement of estimated and measured VO2max ranged from 0.0 to 8.1 ml·kg–1·min–1. Our findings are similar to, and sometimes better than, those of the equations analyzed by Evans et al. [48], even though our results are based on an N-Ex estimate instead of a submaximal exercise test.
The N-Ex CRF assessment provided precise and accurate mean values among those with intermediate and lower values of CRF. Individuals with intermediate or low CRF are of greatest concern when a physical fitness assessment is used to discern their ability to cope with the strenuous job-related physical and physiological demands [15].
Due to the importance of classifying firefighters as “fit” or “unfit”, either for job-performance clearance or for training purposes [49, 50], we also analyzed the accuracy of the N-Ex CRF estimates by some epidemiological agreement indices in order to test its capacity to identify firefighters that meet or do not meet the 12-METs threshold (minimum CRF proposed for firefighters’ safety) [22, 50]. A novel contribution of our study is the inclusion of female firefighters. Considering the expected gender difference in CRF and the consequent rationale to establish a gender-specific cut-off point for health outcome analysis [51], we used the value of 9.5 METs among the female firefighters for fitness classification. The 9.5 METs value for women was chosen because it corresponds to the 12-METs cut-off point used for men in the Cooper gender-specific CRF classification. In that analysis, we found a meaningful difference in agreement between genders. The N-Ex CRF estimates showed perfect (100%) sensitivity and specificity among women, while among men the corresponding values were around 70%–75% (Tables 3, 4). To the best of our knowledge, this is the first study to evaluate the accuracy of an N-Ex method for CRF estimation among female firefighters. This novelty prevents a comparison with previous studies, but the results per se show that the N-Ex method has a high ability to accurately classify female firefighters as fit or unfit when compared against the Cooper 12 min running test, which is used worldwide. The perfect agreement among women is probably influenced by the relatively small sample size, which reduces the chances of more heterogeneous values. Among male firefighters, we found moderate to high levels of sensitivity and specificity (74.1% and 71.9%, respectively).
Considering that non-exercise equations to estimate CRF usually explain less than 80% of the CRF variation [41], one could argue that those levels of agreement are accurate enough to be useful given the method's logistical advantages: lower cost and the absence of the health risks associated with exercise testing. However, a 20–25% error in identifying those who are fit enough or not fit enough for some job-related activities may result in important practical problems for individuals or for the fire service. Importantly, the N-Ex method correctly identifies most firefighters with CRF values below 11 METs or above 13 METs, making it very useful to identify those who might require more traditional testing (CRF between 11 and 13 METs).
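One way such a screening rule for male firefighters might be expressed is sketched below in Python. The function name, the category labels and the treatment of values exactly equal to 11 or 13 METs (sent for testing) are assumptions made for illustration; the study does not prescribe an implementation.

```python
def triage_nex_crf(nex_mets: float, lower: float = 11.0, upper: float = 13.0) -> str:
    """Sort a male firefighter's N-Ex CRF estimate (METs) into a screening category."""
    if nex_mets < lower:
        return "likely below the 12-MET minimum"
    if nex_mets > upper:
        return "likely meets the 12-MET minimum"
    # Intermediate estimates are where the N-Ex method discriminates least well.
    return "refer for exercise-based testing"


# Hypothetical N-Ex estimates in METs.
for value in (10.2, 12.0, 13.6):
    print(value, "->", triage_nex_crf(value))
```

The design simply reserves exercise-based testing for the intermediate band, where the reported sensitivity and specificity at the 12-MET threshold were lowest.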
Despite the strengths of our study, including the comprehensive agreement analysis performed and the inclusion of female firefighters, some limitations exist. We compared the N-Ex CRF estimate against a reference method (Cooper 12 min running test) that is not the gold standard for CRF evaluation [23]. However, we aimed to compare the N-Ex CRF quality against a method that is regularly employed worldwide and has long been recognized as a feasible, valid and low-cost method [23]. Also, the Portuguese version of the self-reported physical activity questionnaire based on the Jackson et al. equations is not yet fully validated in regard to its cross-cultural adaptation. However, the version we used was translated by an experienced researcher in the physical activity field, competent in both English and Portuguese, and a first version was previously tested on a small group and adjusted before the final version was used. Also, this adjusted Portuguese version had been previously used and none of the volunteers reported any difficulty in understanding the questionnaire [52]. The very high consistency of our data, i.e., convergent findings irrespective of the method of analysis, and the similarity of the estimated CRF based on both methods (Cooper test and N-Ex CRF), both for men and women, suggest that the potential limitations associated with the translated version used in our study did not affect the results. Even though our results are based on military firefighters, our findings are likely generalizable to other similar populations including law enforcement, military, and other fire service organizations.
In conclusion, this study conducted among middle-aged male and female military firefighters showed that the N-Ex CRF estimates are accurate enough to be used in this population. Good accuracy as compared to the Cooper 12 min running test was verified by inferential statistics, Bland-Altman plots, correlation analysis and CRF categories. The precision of the N-Ex method to identify fit or unfit male firefighters was very good when applying specific Cooper gender and age cut-off points but lower when applying the 12 MET threshold. Among men, the N-Ex CRF estimate with alternative cut-off points was shown to be a practical and low-cost method to identify those who might require more traditional testing, i.e., those with an intermediate CRF estimate (between 11 and 13 METs). The N-Ex CRF estimates using a threshold of 9.5 METs showed perfect sensitivity and specificity among women for identifying fit and unfit firefighters. The use of the N-Ex CRF estimates among the fittest firefighters (men: >12.6 METs; women: >10.8 METs) deserves special attention because fitness tends to be underestimated in this group.
Conclusion
The consistency of the results across multiple statistical methods of analysis indicates that the non-exercise method to estimate firefighters’ CRF may be an acceptable alternative when an exercise-based method is not available, feasible, recommended, or allowed. There may be concerns regarding the use of an N-Ex CRF estimate to make decisions about personnel selection or career promotion due to the possibility of false information. However, there is a well-acknowledged need to evaluate CRF in the fire service, and many organizations lack the time, resources, or skills to perform exercise-based testing. We have shown that the N-Ex method may be a reasonable alternative to the Cooper 12 min running test.
Conflict of interest
LCS is a military firefighter of the CBMDF. RM is a retired officer of the CBMDF. No other potential conflicts of interest relevant to our study exist.
Acknowledgments
We thank the successive commands of the Federal District (Brasilia) Military Firefighter Brigade (CBMDF) for allowing the Brasilia Firefighters Study (BFS) to be conducted. We also thank CAPES Brazil (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior) for supporting DRFSM and EMKVKS with scholarships (Finance code 001).
