Abstract
BACKGROUND:
Law enforcement is a largely sedentary profession interspersed with physically demanding activity that requires high levels of fitness. It is imperative that agencies training law enforcement recruits maximise their fitness during their time at the academy.
OBJECTIVE:
The aim of this study was to investigate changes in physical fitness during academy training.
METHODS:
Retrospective data for 10 academy recruit classes, totalling 715 participants, were collected from a US law enforcement agency. The changes in performance on two standardised tests were used as outcome measures. Comparisons were made between percentiles utilising one-way ANOVA and a linear mixed model (LMM).
RESULTS:
Overall, higher percentiles were found to have smaller improvements in physical fitness than lower percentiles. The results of the LMM support this supposition, showing that lower physical fitness scores resulted in greater improvements in a generalised fitness assessment (value = –0.45, standard error = 0.02, p < 0.001) and an occupational assessment (value = –0.49, standard error = 0.02, p < 0.001).
CONCLUSION:
The results of this study suggest that recruits with lower physical fitness will see greater improvements during academy training. This could be due to a ceiling effect for the more fit but may also be due to recruits of higher physical fitness being under trained during academy. Utilising ability-based training and prescribing an appropriate workload to recruits of higher fitness may improve overall recruit fitness upon graduation.
Introduction
Law enforcement is a profession that combines long periods of sedentary behaviour with spurts of sudden and highly demanding physical activity [1, 2]. For example, an officer may transition from sitting in a patrol car to a maximal effort sprint to chase and apprehend a suspect [1]. In addition, officers must perform these activities while carrying up to 10 kg of additional occupational load [3]. This additional load has a multitude of negative impacts such as decreasing occupational performance, increasing metabolic requirements, and increasing injury risk in officers performing their tasks [4–6]. To mitigate these negative effects, it is imperative that officers develop and maintain sufficiently high fitness levels [2]. To prepare new recruits for this occupational demand, law enforcement agencies employ periods of training known as ‘academies’ to develop the necessary physical (and mental) skills to serve in this occupation [7–9].
Law enforcement recruits are usually employed from the general population and will have varying levels of fitness and physical capabilities [10–12], which are influenced by multiple factors such as gender [13] and age [1]. To develop the necessary physical skills to become an effective officer, academies typically employ physical training sessions [8, 14]. Not only do higher levels of physical fitness allow for effective completion of occupational tasks [5, 15], but higher fitness levels can also promote improvements in long-term physical [16] and psychological [17] health. Better fitness could also decrease officer injury risk, as fitness has been identified as a significant predictor of injury [18], and accounting for physical fitness differences also decreases the gender-related differences in injuries seen in tactical personnel [13]. Due to these numerous beneficial impacts of fitness, it is vital for physical fitness to be developed during the academy period, and ideally maintained throughout an officer’s career. Unfortunately, the nature of law enforcement (e.g., shift work and long working hours) [19] makes the maintenance of physical fitness difficult. This is reflected in the decreasing physical fitness levels that can be seen as officers progress through their career [20]. As such, physical training at academies presents an opportunity to optimise an officer’s fitness and mitigate the impacts of fitness loss due to occupational factors.
Previous research in law enforcement academies has found that recruits are able to significantly improve their physical fitness, such as aerobic fitness, muscular strength, and muscular endurance, by graduation [8, 14]. However, academies often employ a generic, “one size fits all”, training approach whereby all recruits are required to complete the same exercises with given loads and at given speeds and distances [7]. As such, the exercises being performed typically do not consider the broad range of initial fitness levels amongst recruits [10, 11]. Adopting a “one size fits all” training approach may be too demanding for those with lower fitness levels, while simultaneously not challenging enough for those with a higher baseline. This discrepancy can potentially lead to overtraining and increased risk of injury for less fit individuals, while not being a sufficient stimulus to develop fitness for those with higher fitness levels [21]. Additionally, academies often focus on aerobic fitness and muscular endurance training as these activities can be readily applied to train a large number of recruits simultaneously and with little equipment [14]. This training approach occurs despite officers commonly encountering law enforcement tasks that require muscular strength, power, and anaerobic fitness [15, 22]. As such, it is vital that these fitness components are sufficiently developed to adequately prepare recruits for the occupational demands of policing.
Physical fitness is a vital component to the success and long-term health of law enforcement officers [5, 16], but physical training delivery is often a generic “one size fits all” approach [7] that focuses on aerobic capacity and muscular endurance as the core components of academy training [14]. Research has suggested the implementation of ability-based training, particularly to account for between-sex differences [23], but limited research has been conducted to assess how a “one size fits all” approach affects the physical fitness change across recruits of different baseline levels. The aims of this article were twofold. The first aim was to profile changes in recruit fitness, grouped by quintiles of baseline fitness on entering the academy, following the academy’s use of a generic physical training program. The second aim was to compare changes in fitness across various fitness components (e.g., endurance, strength, power, etc.) to determine the impacts of this generic program across fitness components. Based on Selye’s General-Adaptation-Syndrome [24], it was hypothesised that recruits with lower baseline fitness (lower quintiles) would see larger improvements in fitness over the course of their academy training than fitter recruits (higher quintiles), and that changes in aerobic and muscular endurance would be greater than the changes in measures of an anaerobic nature (strength, power, and agility) regardless of initial fitness level.
Methods
Subjects
Data were retrospectively collected from a US state law enforcement agency for 10 academy recruit classes, recruited from the general population, totalling 715 participants. Of these 715 participants, 604 were male (age = 26.70±5.22 yrs, height = 175.98±7.37 cm, body mass = 83.16±12.29 kg), 110 were female (age = 26.69±4.64 yrs, height = 162.63±6.56 cm, body mass = 65.32±12.08 kg), with one participant not disclosing their sex. Ethical approval was sought and granted by the Bond University Human Research Ethics Committee and the California State University, Fullerton Institutional Review Board (HSR-17-0037).
Protocols
Initial and final testing dates were approximately 20 weeks apart. The medicine ball toss (MBT), 75-yard pursuit run (75PR), and 20-m multi-stage fitness test (MSFT) were typically conducted the week prior to the academy. Two physical test batteries were conducted: the PT500 (an agency-specific composite of six physical fitness tests) and the work sample test battery (WSTB) (a state-mandated assessment of occupational skills). The initial PT500 was conducted during the first week of the academy, while the WSTB was conducted at various points during the beginning of the academy. The assessments are described in detail below. Though some assessments have a maximum number of repetitions that count towards the score, if the number of repetitions recorded exceeded the maximum, the raw number was kept to reflect the recruit’s ability. Assessments were performed in standard physical training attire (shorts, t-shirt, and running shoes) and overseen by academy staff.
Medicine Ball Toss (MBT)
The MBT was conducted independently of the PT500 and WSTB. This is an assessment of upper body power and has previously been used in recruit populations [25, 26]. Recruits sat on the ground, with head, shoulders, and lower back against a concrete wall. The recruits then tossed a 2-kg medicine ball (Champion Barbell, Texas, USA), lightly dusted with chalk, as far as possible using a two-handed chest pass. A standard tape measure was used to measure the perpendicular distance from the wall to the closest chalk mark made by the ball landing. Two trials were completed with a recovery time ranging from 30-60 s. Results were recorded to the nearest 0.01 m, with the furthest of the two trials being recorded.
75-Yard Pursuit Run (75PR)
The 75PR (68.6 m) was also performed to measure agility and occupational performance and has previously been used as an assessment with law enforcement recruits [25]. This test consisted of a recruit completing five linear sprints about a square grid with sides measuring 12.1 m, while also completing four 45° direction changes across the grid. Recruits were also required to step over three barriers (2.4 m long and 0.2 m high) during three of the five sprints. Time was recorded using a handheld stopwatch (Professional Digital Stopwatch Timer, LuckyStone) that began on initiation of movement and ended with the recruit crossing the finish line, measured to the nearest 0.1 s (Figure 1).

75PR diagram.
20-m Multi-Stage Fitness Test (MSFT)
An MSFT was completed independently of the PT500 and WSTB. Standard procedures were adopted for the MSFT, with recruits required to run back and forth between two lines 20 m apart, to measure aerobic fitness [25, 27]. This assessment has previously been used and validated in this population [27]. Running speed was standardised by pre-recorded auditory cues played from an iPad handheld device (Apple Inc., Cupertino, California) connected to a portable speaker (ION Block Rocker, Cumberland, Rhode Island) via Bluetooth. The speaker was located in the centre of the running area so each recruit could clearly hear the auditory cues but positioned so as not to interfere with the recruits’ running. The test was stopped when the recruit was unable to reach the markers twice in a row within the time allotted by the auditory cues, or voluntarily stopped running. Scores were recorded according to the final stage the recruit was able to achieve, and then used to calculate the total number of completed shuttles. Relative VO2Max (ml/kg/min) was estimated for each recruit based on Ramsbottom et al. [28].
PT500
The PT500 is a composite score of six assessments: maximal push-ups, sit-ups, and mountain climbers completed in 120 s, maximal pull-ups, 201-m run, and 2.4-km run [9]. The PT500 is an established standard of fitness assessment that has been used historically within this law enforcement agency [9, 29]. Recruits completed the above assessments in typical physical training attire. The push-ups, sit-ups, and mountain climbers were completed on an outdoor, concrete surface with a partner, who ensured correct techniques and counted the number of repetitions. Pull-ups were completed on an outdoor pull-up bar. The 201-m and 2.4-km runs were completed on an athletic track at the agency’s training facility. Recruits completed the runs in groups of 10-15. Specific procedures for each of the assessments are detailed below, as well as the scoring system for each regarding the final PT500 score.
Push-ups
Upper-body muscular endurance was assessed via a maximal push-up test where recruits completed as many repetitions as possible in 120 s. The technique used for this assessment has been used previously in law enforcement populations [8, 30–35]. Recruits started in the standard “up” position, with the body straight, hands positioned shoulder-width apart, and fingers pointed forwards. A water bottle was placed under the recruit’s chest to determine the correct depth of the “bottom” position of the push-up. Upon starting the assessment, academy staff began timing the 120 s, and recruits flexed their elbows, lowering themselves until their chest touched the water bottle. Recruits then extended their elbows, returning to the start position. This technique was completed as many times as possible in the 120 s. Recruits were awarded one point per push-up completed, with a maximum score of 50.
Sit-ups
To test abdominal muscular endurance, the maximum number of sit-ups able to be completed in 120 s was assessed. The technique used during this assessment has been previously used in law enforcement populations [8, 30–35]. Recruits lay on their backs, with knees flexed to 90°, feet flat on the ground, and hands cupped behind their ears. Each recruit had a partner holding their feet to the ground during the test. Upon starting the assessment, training staff began timing. Recruits raised their shoulders from the ground until their elbows touched their knees while keeping feet flat on the ground and hands cupped behind their ears. The recruits then lowered themselves down until their shoulder blades contacted the ground. This technique was performed as many times as possible in 120 s. For the first 50 repetitions, recruits were given one point per repetition, while for the next 25 repetitions recruits were given two points per repetition, resulting in a maximum score of 100.
Mountain Climbers (MC)
MC involves isometric work in the trunk and upper limb musculature with dynamic movement occurring in the hip and knee joints and assesses muscular endurance. Previous research in law enforcement populations has conducted this assessment using a similar procedure [29, 36]. Recruits started in the standard “up” position of a push-up and maintained this position with arms extended throughout the test. Maintaining a neutral spine, recruits alternated flexing the hip and knee for each leg, bringing the knee close to the chest and foot underneath the body with each repetition. Recruits began on the start command, with staff timing the 120 s. The first 40 mountain climbers completed counted as one point each, while the next 20 were given three points each, resulting in a maximum score of 100.
Pull-ups
The pull-up test provides a second measure of upper body endurance [37], and has been previously used as an assessment tool, with similar technique, in law enforcement recruits [9, 29]. The recruits’ start position involved hanging on the bar in a vertical position, hands shoulder width apart, and using a pronated grip. While maintaining a vertical body alignment, recruits pulled themselves upwards until their chin rose above the bar. Recruits would then lower themselves until their arms were fully extended. This technique was continued until the recruit could no longer raise their chin over the bar. Each repetition counted as three points with a maximum score of 60 points.
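The repetition-based scoring rules described above for push-ups, sit-ups, mountain climbers, and pull-ups can be illustrated with a minimal sketch. The function names are hypothetical and are not taken from the agency's scoring materials; the scoring of the two runs is not described in this section and is therefore omitted. The caps apply to points only, since raw repetition counts above the maximum were retained in the data.

```python
def tiered_points(reps, tier_one_reps, tier_one_value, tier_two_reps, tier_two_value):
    """Points when an initial block of repetitions is worth one value
    and the following block is worth another; extra repetitions earn nothing."""
    reps = max(0, reps)
    first = min(reps, tier_one_reps)
    second = min(max(reps - tier_one_reps, 0), tier_two_reps)
    return first * tier_one_value + second * tier_two_value


def push_up_points(reps):
    # One point per push-up, capped at 50 points.
    return min(max(0, reps), 50)


def sit_up_points(reps):
    # First 50 repetitions: 1 point each; next 25: 2 points each (maximum 100).
    return tiered_points(reps, 50, 1, 25, 2)


def mountain_climber_points(reps):
    # First 40 repetitions: 1 point each; next 20: 3 points each (maximum 100).
    return tiered_points(reps, 40, 1, 20, 3)


def pull_up_points(reps):
    # Three points per pull-up, capped at 60 points.
    return min(max(0, reps) * 3, 60)


if __name__ == "__main__":
    # Hypothetical recruit: 62 push-ups, 60 sit-ups, 45 mountain climbers, 7 pull-ups.
    print(push_up_points(62), sit_up_points(60), mountain_climber_points(45), pull_up_points(7))
```

In this sketch a recruit who exceeds a cap keeps the raw count in the data set but earns no further points, which matches the handling of repetitions described in the protocols.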
201-m Run
The 201-m run provides a measure of anaerobic capacity [38] and has been previously utilised in law enforcement recruits with a similar technique [9, 29]. The 201-m distance was marked on a running track and timed by academy staff members with a handheld stopwatch (Professional Digital Stopwatch Timer, LuckyStone). Upon the start command, training staff started timing until the recruits passed the distance marker. Run time was recorded for each recruit to the nearest 0.1 s.
2.4-km Run
The 2.4-km run measures aerobic capacity [38] and has been commonly used as an assessment tool in law enforcement populations [8, 40]. For the 2.4-km run assessment, recruits were required to complete six laps around a 400 m athletics track at the academy training facility. Recruits were instructed to run this distance as quickly as possible. Run time was recorded for each recruit on a handheld stopwatch (Professional Digital Stopwatch Timer, LuckyStone) to the nearest 0.1 s.
Work Sample Test Battery (WSTB)
The WSTB is a California (U.S.A.) mandated group of tests each recruit must complete. Recruits must obtain a minimum score of 384 to graduate from the academy. Points are awarded relative to the completion time of each task [41]. All tests were performed outdoors on structures specifically designed for the training facility. Recruits wore their standard physical training attire. The procedures are explained in detail by the Peace Officer Standards and Training [41], but have been described briefly below. This battery of tests has been analysed before in research on law enforcement recruits [9].
99-yard Obstacle Course (99OC)
Simulating a foot pursuit, recruits were instructed to complete the 99-yard (90.53 m) course as quickly as possible while remaining on the concrete track. During this run, recruits also had to clear three 0.15 x 0.15 m curbs, and one 0.86 m high obstacle (Figure 2).

Diagrammatic representation of the 99OC.
Body Drag (BD)
The body drag simulates the ability of an officer to move an injured individual to safety. Recruits were required to drag a 74.8 kg dummy for a distance of 9.75 m. Initially, recruits were required to pick up the dummy by wrapping their arms underneath the arms of the dummy and extending their hips and knees. Timing was initiated as soon as the recruit began dragging the dummy. Recruits dragged the dummy by walking backwards over the complete 9.75 m, at which point timing was stopped. Time was recorded to the nearest 0.1 s.
Chain Link Fence Climb (CLF)
Beginning 4.6 m away from the fence, recruits were required to run up to and scale the fence using whatever technique they chose, without using the side supports to assist their climb. Recruits were given two attempts to scale the fence. Once the fence was cleared, recruits were required to run 22.9 m as quickly as possible to complete the test. Staff measured the time to complete the task using a handheld stopwatch (Professional Digital Stopwatch Timer, LuckyStone). Time was recorded to the nearest 0.1 s.
Solid Wall Fence Climb (SW)
As per the CLF, recruits ran 4.6 m before clearing the SW with any technique and then running 22.9 m upon clearance. The only difference between the two tests (CLF and SW) was the type of fence that needed to be cleared, with this test utilising a solid wall instead of a chain link fence. Time was recorded to the nearest 0.1 s using a handheld stopwatch (Professional Digital Stopwatch Timer, LuckyStone).
500-yard Run (500 R)
A 500 yard (457.2 m) distance was marked on an athletics track by training staff. Recruits were instructed to run this distance as quickly as possible with training staff standing at the finish line timing each recruit to the nearest 0.1 s using a handheld stopwatch (Professional Digital Stopwatch Timer, LuckyStone).
Training program
Variations exist between the training programs of various classes. This is due in part to different members of staff overseeing different classes, in addition to external factors such as weather. However, each class was required to complete 36 physical training sessions over the course of a 22-week academy. Sessions varied over the course of the academy, but typically consisted of two to four sessions per week, with each session lasting approximately one to two hours [14]. It should be noted that some weeks, often occurring at the end of the academy, did not have any physical training sessions. Sessions were overseen by staff instructors who had previously undergone a mandated two-week physical training instructor course [14]. These sessions tended to focus on aerobic and muscular endurance exercises, with an emphasis on long distance running (e.g., 5 km runs) and circuit training [14, 43]. Despite this focus, there were still periods of training that incorporated elements of anaerobic fitness, such as muscular strength and power. These sessions included exercises, such as squats, hang cleans, lunges, or rows, with external loads (i.e., ammunition cans or weight plates) and/or medicine ball tosses. Further information on the training program has previously been described [14].
Statistical analysis
Hard copy data were collected and transferred into a Microsoft Excel (version 16.0) spreadsheet, where they were organised and cleaned. Data were only kept if initial and final results were present. Recruits may not have an initial or final score due to factors such as injuries or separation, whereby a recruit leaves the academy. Data were separated into quintiles based on initial PT500 scores, resulting in the 715 participants described above. Only the PT500 was used for quintile grouping as this measure was consistently assessed during the first week of the academy and was therefore likely to be the most consistent initial measure of fitness. These separated data were exported into RStudio (version 1.2.5042, 2020) for analysis.
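The data preparation described above (retaining only recruits with both an initial and a final result, then grouping by initial PT500 score) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the exact rule for forming quintiles is not reported, so a rank-based split into five groups of near-equal size is assumed, ties are broken by input order, and the record fields `initial` and `final` are hypothetical names.

```python
def complete_cases(records):
    """Keep only records that have both an initial and a final result."""
    return [r for r in records if r["initial"] is not None and r["final"] is not None]


def assign_quintiles(initial_scores):
    """Label each score 20, 40, 60, 80 or 100 according to its rank.

    Assumption: quintiles are rank-based groups of near-equal size;
    tied scores are ordered by their position in the input.
    """
    n = len(initial_scores)
    order = sorted(range(n), key=lambda i: initial_scores[i])
    labels = [0] * n
    for rank, index in enumerate(order):
        labels[index] = 20 * (rank * 5 // n + 1)
    return labels


if __name__ == "__main__":
    # Hypothetical recruits; the second has no final score and is dropped.
    records = [
        {"initial": 310, "final": 380},
        {"initial": 150, "final": None},
        {"initial": 420, "final": 445},
    ]
    kept = complete_cases(records)
    print(len(kept), assign_quintiles([r["initial"] for r in kept]))
```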
Data analysis consisted of a paired t-test comparison of initial and final results for each quintile within each fitness assessment. Significance was set at α = 0.05 a priori. Effect sizes (Cohen’s d) for between-group comparisons were then calculated for each fitness test by dividing the difference between the means by the pooled SD [44]. Interpretation of effect sizes was then performed based on Hopkins [45], which states that values less than 0.20 are considered a trivial effect; 0.20 to 0.60 a small effect; 0.60 to 1.20 a moderate effect; 1.20 to 2.00 a large effect; 2.00 to 4.00 a very large effect; and greater than 4.00 an extremely large effect.
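As a worked illustration of the effect size calculation and its interpretation, the sketch below computes Cohen's d as the difference between two means divided by a pooled SD and maps the result onto the thresholds listed above. The degrees-of-freedom weighted form of the pooled SD and the treatment of boundary values (lower bound inclusive) are assumptions, as neither is specified in the text; the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(initial, final):
    """Difference between means divided by the pooled SD.

    Assumption: the pooled SD weights each sample variance by its
    degrees of freedom.
    """
    n_a, n_b = len(initial), len(final)
    var_a, var_b = stdev(initial) ** 2, stdev(final) ** 2
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean(final) - mean(initial)) / pooled_sd


def hopkins_label(d):
    """Qualitative descriptor for the magnitude of an effect size."""
    size = abs(d)
    if size < 0.20:
        return "trivial"
    if size < 0.60:
        return "small"
    if size < 1.20:
        return "moderate"
    if size < 2.00:
        return "large"
    if size <= 4.00:
        return "very large"
    return "extremely large"


if __name__ == "__main__":
    # Hypothetical repetition counts before and after training.
    initial = [10, 12, 14, 16, 18]
    final = [13, 15, 17, 19, 21]
    d = cohens_d(initial, final)
    print(round(d, 2), hopkins_label(d))
```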
A one-way ANOVA was performed to examine for significant differences between the change in fitness levels during the academy by percentile group: 0-20th, 21st-40th, 41st-60th, 61st-80th, 81st-100th. These percentiles are named 20th, 40th, 60th, 80th, and 100th respectively throughout the rest of the paper. Levene’s tests were performed to assess for homogeneity of variance. If not significant, a one-way ANOVA was performed. If the Levene’s test was significant, then a robust version of the ANOVA was used. The robust ANOVA used involved trimmed means and bootstrapping as proposed by Wilcox [46] and has been described as able to tolerate deviations from homogeneity of variance [47]. A post hoc analysis was then performed to identify the specific fitness changes that were significantly different. As post hoc analyses tend to perform poorly when sample sizes are not equal, a robust post hoc test was performed for all variables even if Levene’s test was non-significant [47]. The robust post hoc test utilised involved trimmed means and bootstrapping as proposed by Wilcox [46], which has been suggested to be capable of controlling for Type I error rates in both confidence intervals and p-values [47].
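To illustrate the general idea behind the robust comparisons, the sketch below computes a trimmed mean and a percentile bootstrap confidence interval for the difference in trimmed means between two groups. It is a simplified illustration, not the exact procedure of Wilcox [46] or the routine used in this analysis: the 20% trimming proportion, the number of bootstrap samples, the fixed seed, and the function names are all assumptions, and the data are hypothetical.

```python
import random
from statistics import mean


def trimmed_mean(values, proportion=0.2):
    """Mean after discarding the given proportion of values from each tail."""
    ordered = sorted(values)
    cut = int(len(ordered) * proportion)
    return mean(ordered[cut:len(ordered) - cut])


def bootstrap_trimmed_diff(group_a, group_b, n_boot=2000, proportion=0.2, seed=1):
    """Percentile bootstrap 95% interval for trimmed_mean(a) - trimmed_mean(b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(trimmed_mean(resample_a, proportion)
                     - trimmed_mean(resample_b, proportion))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical changes in score for a low and a high percentile group.
    low_group = [120, 131, 128, 140, 135, 126, 145, 133, 129, 138]
    high_group = [20, 35, 28, 31, 24, 27, 33, 22, 30, 26]
    print(bootstrap_trimmed_diff(low_group, high_group))
```

An interval that excludes zero would suggest that the two groups differ in their typical change; trimming reduces the influence of extreme values and unequal variances on that comparison.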
A linear mixed model (LMM) with maximum likelihood estimation was utilised to explore the relationship between initial fitness measures (e.g., PT500 and WSTB) and change in physical fitness. Due to the nature of the data collection (i.e., utilising a desktop analysis), each recruit within a class was assumed to have experienced the same training. Given variations in training staff and programs, individual training classes were treated as a random effect. All variables mentioned in the procedures section were explored for potential relationships with change in physical fitness. If a recruit separated, or left the academy, all further measures past the week of separation were marked as zero. This was performed to continue to account for recruits that were intended to undergo, but failed to complete, the training.
A stepwise approach was utilised to choose the best fitting model, wherein each variable was individually modelled as a potential predictor of fitness change. Comparisons between the models’ Akaike information criterion (AIC) and Bayesian information criterion (BIC) scores were then conducted, with the lowest score suggesting the best fit. The best fitting model was carried forward and the remaining variables added individually as a predictor. This process was repeated (the addition of a predictor resulting in the lowest AIC and BIC scores) until further additions of a predictor did not significantly (p < 0.05), with AIC and BIC as a reference, improve model fit. All statistical analyses were conducted using R statistical software [48] (RStudio version 1.2.5042) with packages tidyverse [49], pander [50], furniture [51], texreg [52], psych [53], lme4 [54], gee [55], effects [56], performance [57], interactions [58], lattice [59], patchwork [60], and devtools [61]. Due to the complexity in analysing residuals of a GLMM [62], model diagnostics were performed using the DHARMa package [63] to effectively examine residuals.
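The model comparison step can be illustrated with the standard definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of estimated parameters, n the number of observations, and ln L the maximised log-likelihood. The sketch below only shows how candidate models would be ranked once fitted; the log-likelihoods, parameter counts, and model names are hypothetical and are not results from this study.

```python
from math import log


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * log(n_obs) - 2 * log_likelihood


def rank_models(candidates, n_obs):
    """Score each candidate and return the names with the lowest AIC and BIC.

    candidates maps a model name to (log_likelihood, n_params).
    """
    scores = {
        name: (aic(ll, k), bic(ll, k, n_obs))
        for name, (ll, k) in candidates.items()
    }
    best_by_aic = min(scores, key=lambda name: scores[name][0])
    best_by_bic = min(scores, key=lambda name: scores[name][1])
    return scores, best_by_aic, best_by_bic


if __name__ == "__main__":
    # Hypothetical fits: adding sex as a predictor raises the log-likelihood.
    candidates = {
        "initial_score": (-3500.0, 4),
        "initial_score + sex": (-3490.0, 5),
    }
    scores, best_by_aic, best_by_bic = rank_models(candidates, n_obs=700)
    print(scores, best_by_aic, best_by_bic)
```

In a forward stepwise procedure, the model with the lowest scores would be carried forward and each remaining variable added in turn, stopping once no addition lowers the criteria.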
Results
Demographic information for each quintile (20th, 40th, 60th, 80th, and 100th) can be found in Table 1.
Recruit demographics by fitness percentile
Key: Percentile based off initial fitness scores as measured by PT500; SD: standard deviation.
Sample sizes varied across tests and percentiles and can be found in Table 2, along with the initial and final mean scores of each assessment. It should be noted that the 75PR, MBT, MSFT and associated VO2Max were not performed for all 10 classes, thus resulting in a smaller sample size for these assessments. The WSTB tended to have a smaller sample size than the tests comprising it (e.g., CLF, SW, etc.) despite these tests being conducted at the same time. The reason for this is unknown, but could possibly be due to failure to complete all tests. Instead of calculating scores, results were left as originally recorded to reduce any bias.
Mean, SD, and sample size of each fitness assessment separated by quintile
The results from the paired samples t-tests demonstrated only a few non-significant differences between initial and final measures. No significant differences between initial and final fitness results were found in the 20th percentile 99OC; 40th percentile MBT and 75PR; 60th percentile 99OC and CLF; 80th percentile 99OC, BD, CLF, SW, and 500 R; and 100th percentile push-ups, BD, CLF, SW, 500 R, and WSTB. The 80th and 100th percentiles had a higher number of non-significant differences in fitness results. Depending on the test, either similar effect sizes between percentiles are present, as can be seen with the 99OC, or effect size decreases with higher percentiles, as seen in the 2.4-km run. Confidence intervals and effect sizes are presented in Table 3.
Changes in recruit fitness separated by initial percentile ranking
Non-significant results of the Levene’s test existed for the 2.4-km run (p = 0.55), MSFT (p = 0.10), 99OC (p = 0.45), BD (p = 0.92), and pull-ups (p = 0.24). These fitness tests underwent a standard one-way ANOVA while the other measures underwent a robust ANOVA as per the methods section. Significant differences existed upon completion of the ANOVA for all fitness assessments except for 99OC and 75PR (Table 4). A post hoc analysis was completed after the initial ANOVA test for all fitness assessments except for 99OC and 75PR. The post hoc analysis revealed the mean difference significantly differed between each percentile within the PT500. Each percentile saw a significantly greater mean change compared to the higher percentiles. For example, the 20th percentile improved significantly more (132.5 points) compared to the 40th (95.4 points), 60th (81.8 points), 80th (63.0 points), and 100th (27.2 points) percentiles. A similar trend was seen in the MC and push-ups. For the other tests associated with the PT500, non-significant differences in the mean change often occurred only between adjacent percentiles (e.g., 20th and 40th percentile); however, the trend of lower percentiles seeing larger increases compared to higher percentiles continued.
Statistical results of Levene’s Test and ANOVA for between quintile comparisons
*one-way ANOVA used.
The post hoc analysis for the WSTB showed more non-significant results in the mean differences compared to the PT500 and associated tests. Non-significant changes still tended to occur between the higher percentiles, for example the 60th, 80th, and 100th percentiles. As in the PT500 and associated tests, the results of the WSTB suggest that lower percentiles see larger effect sizes compared to the higher percentile groups.
While the PT500, WSTB and their associated tests tended to show larger increases in the lower percentile groups, the last four assessments (MSFT, VO2Max, MBT, 75PR) show either larger or similar increases in the highest percentile groups, as seen in the estimated VO2Max results (Table 5).
Pairwise post hoc test p-values comparing quintile groups
Key: *** p-value< 0.001, Italics = significant value.
Results from the LMM support the findings of the percentile comparisons. Initial scores were a significant predictor for both the PT500 (value = –0.45, standard error = 0.02, p < 0.001) and the WSTB (value = –0.49, standard error = 0.02, p < 0.001). These results indicate that a one-point increase in initial score would result in a 0.45-point and 0.49-point decrease in fitness change for the PT500 and WSTB, respectively. Males were more likely to see significantly greater improvements in both the PT500 (value = 37.4, standard error = 8.43, p < 0.001) and the WSTB (value = 39.5, standard error = 6.4, p < 0.001) when compared to females. Body mass index (BMI) was found to be a significant predictor for both the PT500 (value = –2.5, standard error = 0.86, p = 0.004) and WSTB (value = –2.0, standard error = 0.54, p < 0.001), showing those with higher BMI saw smaller improvements in fitness. Age was not found to significantly improve model fit.
Discussion
The aim of this study was to analyse the changes in fitness of different law enforcement recruits based on their fitness at the commencement of training and to compare the changes in anaerobic and aerobic conditioning following completion of a typical law enforcement physical training program. It was found that, in general, recruits in the lower percentiles experienced a greater improvement in fitness than the recruits within the higher percentiles, suggesting the “one size fits all” training program may be providing an inadequate stimulus, particularly for those with higher levels of fitness. This may be due to a variety of reasons, with one potential explanation being that the recruits of higher fitness were exposed to a suboptimal stimulus not sufficient to improve performance. However, it may also be partly explained by a cap on the score that can be achieved, in combination with an insufficient training stimulus for the fitter recruits. For example, the 100th percentile group did not see a significant increase in push-up repetitions, likely because initially these recruits were already close to the maximum 50 repetitions. In terms of a measurement cap, the maximum points available may be limiting actual improvements of recruits in the higher percentiles. The 20th percentile group was able to experience greater improvements in assessments without a measurement cap, such as CLF, SW, and 500 R, potentially suggesting under stimulation of the fitter recruits (100th percentile). Conversely, the top percentile had a similar, if not greater, improvement in the rest of the fitness assessments (e.g., MSFT, VO2Max, MBT, 75PR) compared to other percentiles. The reason for this is unknown but could be due to these tests not being factored into academy scoring, which could limit higher scores to those who are more motivated. Fit recruits may have a history of exercise participation and therefore may be more motivated to complete these assessments.
If individuals with lower physical fitness were less motivated to complete these assessments, then the true expression of their fitness improvements may be hindered. Lastly, sex was found to have a significant impact on the change in physical fitness, with male recruits seeing a greater improvement during training than their female counterparts.
The supposition of lower percentiles experiencing greater improvement is supported by the ANOVA and post hoc analysis, which show significant differences between the lower and higher percentiles for the PT500 and associated tests. The WSTB and its associated tests tended to have smaller changes compared to the PT500 and associated tests. The smaller change experienced by the top percentiles could be due to the PT programs focusing more on muscular endurance and aerobic capacity [42, 43], as research has linked a portion of the WSTB (i.e., CLF, SW, and BC) to muscular power and strength [22, 64]. The WSTB and associated tests did not see as great an improvement, measured via effect size, when compared to the PT500. This suggests that the current physical training program results in greater improvements in aerobic fitness compared to anaerobic fitness, strength, and power. Previous research in the same law enforcement population has found that the physical training program mainly focuses on muscular endurance (through circuit training) and aerobic fitness (through slow, long duration runs) [42, 43]. The addition of anaerobic fitness, strength, and power training to this program may lead to a more well-rounded fitness profile. This is crucial as strength and power are vital to a police officer’s ability to perform their job efficiently and effectively, having previously been linked to occupational task performance [22, 64].
Overall, these results suggest that individuals with lower levels of fitness, as expected, experience significantly greater improvements in physical fitness compared to their fitter peers. 
One adjustment to the training program that may account for varying fitness levels is the implementation of ability-based training (ABT). ABT changes the training stimulus depending on baseline fitness scores, potentially avoiding overtraining for those with lower fitness levels and providing a greater training stimulus for individuals with higher levels of fitness [11]. It should be noted that some modifications based on ability may have been implemented. However, retrospective analysis of the physical training programs could not determine what ability-based modifications (if any) had been incorporated. ABT has previously been implemented in law enforcement populations and shown to reduce injury rates and improve aerobic performance [7]. Lockie et al. [23] discussed at length the numerous benefits of ABT and how it could be implemented effectively in a police recruit population. Though law enforcement academies do not have ideal staff-to-recruit ratios for strength and conditioning programs [23], ABT can be implemented effectively. Lockie et al. [23] detailed two methods by which this can be done. For example, one method involves separating recruits into three groups depending on initial aerobic fitness results [23]. Three grids of varying sizes, with the lower fitness class using the smaller grid, can then be constructed, allowing recruits to complete interval training by sprinting one side of the grid and jogging the other [23]. The variations in grid size will help to ensure that each recruit receives a training stimulus more in line with their current fitness level [23]. 
Though some modifications may potentially be made based on fitness in the current program, this process can be formalised by using structured variations within training programs to ensure an optimal training stimulus.
To mitigate the over-dominance of aerobic and muscular endurance training and improve anaerobic and strength/power outcomes, improving staff knowledge of strength and conditioning principles could help ensure improvements across all relevant fitness components. Personnel with this knowledge may also be better suited to adjusting training loads to account for varying fitness levels. Previous research in military training has shown that training programs run by more knowledgeable instructors lead to greater improvements in fitness, even with the same training volume, potentially due to more effective programming, greater instruction capabilities, and/or a more individualised training load [65]. Certifications such as the Tactical Strength and Conditioning Facilitator course developed by the National Strength and Conditioning Association may result in greater improvements in recruit physical fitness by improving staff knowledge. Working alongside or hiring a Certified Strength and Conditioning Coach may also improve the structure of the physical training program.
Although there is a need to develop recruit strength and power, and improvements in these measures may manifest as improved performance on occupational tasks such as the BD, CLF, and SW [22, 64], law enforcement agencies may be limited in their ability to implement strength- and power-based training programs. Such limitation may be partly due to a lack of the equipment and facilities necessary to cater to a large number of recruits. 
This barrier can potentially be overcome by using non-traditional equipment, such as sandbags, ammunition canisters, or body armour, and by utilising unilateral exercises [23, 66].
A final consideration is a potential survivor effect, whereby the results for the lower fitness percentiles are derived only from those trainees who ‘survived’ training. Trainees with lower levels of fitness are known to be at a greater risk of injury [10, 68]. While the lower fitness percentiles generally made the most notable gains in fitness, this should not be taken as direct support for continuing group training as the ideal training method. As these individuals are at a higher risk of injury, this methodology may cause unnecessary injuries to recruits through excessive training load.
Apart from a lack of injury data, several limitations are present in this study. One limitation was the lack of fitness tests that specifically measured muscular strength and power. Estimated repetition maximum testing for resistance training exercises and tests that measure explosive power (e.g., vertical jump, standing broad jump) would likely give a greater indication of the true change in these components. This is crucial as these components have previously been linked to occupational tasks such as fence jumps and body drags [15]. Though the tests within the WSTB do require muscular strength and power, they may be limited by other factors such as technique. Another limitation is the lack of uniform training programs, as variations in training styles may impact fitness development. Variations also existed between testing dates, particularly for the WSTB, which can affect the improvements measured across the course of the academy. However, these limitations are offset by the large sample size, which produces an overall picture of recruit fitness. Further, these data were analysed after splitting the sample into quintiles. 
While presenting a practical overview of fitness, this approach can introduce statistical limitations, namely inflation of the Type I error rate [69]. Additional analysis that can control for these errors, as well as for regression to the mean and heterogeneity between responders and non-responders, could strengthen the results in this paper. Caution should be exercised when interpreting these findings. Because these data were drawn from a single law enforcement academy, the results should not be generalised to other academies. Future research should implement testing protocols that are more purely focused on measuring strength and power, as well as standardised training programs, to improve the strength of the results. The addition of injury data and measures of training load (such as Rating of Perceived Exertion) can provide further detail on how training load impacts physical fitness.
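The regression to the mean concern noted above can be illustrated with a small simulation. The sketch below is not a reanalysis of the study data: it generates hypothetical baseline and follow-up scores for recruits whose true fitness does not change at all, with measurement noise added at each test occasion. All parameter values (score mean, spread, and noise) are assumptions chosen for illustration, and the sample size of 715 simply mirrors the number of participants reported in the abstract.

```python
import random
import statistics


def simulate_no_true_change(n=715, true_sd=50.0, noise_sd=25.0, seed=0):
    """Simulate baseline scores and change scores with NO true fitness change.

    All parameter values are illustrative assumptions, not study estimates.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(300.0, true_sd) for _ in range(n)]
    # Each test occasion adds independent measurement noise to the same truth.
    baseline = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    followup = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    change = [f - b for b, f in zip(baseline, followup)]
    return baseline, change


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def mean_change_by_quintile(baseline, change):
    """Mean change score within each baseline quintile, lowest to highest."""
    order = sorted(range(len(baseline)), key=baseline.__getitem__)
    size = len(order) // 5
    groups = [order[i * size:(i + 1) * size] for i in range(4)]
    groups.append(order[4 * size:])  # last group takes any remainder
    return [statistics.fmean([change[i] for i in g]) for g in groups]


baseline, change = simulate_no_true_change()
print("slope of change on baseline:", round(ols_slope(baseline, change), 2))
print("mean change by baseline quintile:",
      [round(m, 1) for m in mean_change_by_quintile(baseline, change)])
```

Under these assumptions, the lowest baseline quintile appears to improve and the highest appears to decline even though no recruit's true fitness changed, and the slope of change on baseline is negative. This is why a negative association between baseline score and change cannot, by itself, separate a genuine training effect from measurement artefact.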
Conclusion
Recruits with lower levels of fitness were more consistently able to improve their fitness level during the course of the academy compared to their fitter counterparts, suggesting a potential under-stimulus for those with a higher baseline. Law enforcement academies can implement ABT programs to ensure all recruits receive an adequate stimulus to increase fitness. Larger changes were seen in aerobic capacity and muscular endurance. Adding a focus on muscular strength and anaerobic exercises may better prepare recruits for tasks they will commonly undertake in law enforcement (e.g., body drag, fence climb). Law enforcement agencies can invest more resources and training in physical training instructors to better implement ABT programs with an added focus on muscular strength and anaerobic exercises.
Conflicts of interest
The authors have no conflicts of interest to declare.
Acknowledgments
The authors would like to acknowledge the law enforcement training staff for their support, without which this research would not have been achievable.
Informed consent
Due to the retrospective nature of the study, and because the data had previously been collected as part of standard operating procedures, informed consent was not required of the participants.
Ethical considerations
Ethical approval was sought and granted by the Bond University Human Research Ethics Committee and the California State University, Fullerton Institutional Review Board (HSR-17-0037).
Reporting guidelines
This manuscript follows the EQUATOR Network reporting guidelines, particularly the STROBE checklist on cohort studies.
Funding
This research was supported by an Australian Government Research Training Program Scholarship.
Data availability
Given the sensitive nature of the data, they are available only upon reasonable request.
