Decomposing orienteering performance into speed, instability, and mistake severity using split times

Abstract

Final time and rank do not fully characterize orienteering performance: athletes with similar outcomes can differ substantially in consistency and in the severity of major navigational mistakes. We propose a split-time framework that decomposes performance into three interpretable components: speed, instability, and mistake severity. The framework combines a descriptive layer based on leg-level relative log-times with an additive model for log split times that includes athlete effects (baseline pace), leg effects (common difficulty), and athlete-leg residuals. Under a complete split-time table, least-squares estimation has a transparent closed-form representation in terms of doubly centered log split times. Using elite split-time data from the 2025 World Orienteering Championships middle- and long-distance finals, we show that speed is most closely aligned with final outcome, while instability and mistake severity provide complementary execution-level information rather than replacements for official results. We also report bootstrap uncertainty intervals and a minimal missing-data sensitivity analysis. The proposed framework is simple to compute, interpretable in practice, and extensible for applied sports analytics.

Keywords

orienteering, split times, sports analytics, performance decomposition, instability, navigational mistakes

Introduction

In orienteering, athletes navigate through a prescribed sequence of controls while choosing their own routes between successive controls, so competitive performance depends on both running capacity and navigational execution. Official race results are typically summarized by total finishing time and final rank. These quantities determine competition outcomes, but they do not fully describe how an athlete performed during the race. Athletes with similar total times can arrive there through very different profiles: one may execute consistently across most legs, whereas another may combine fast running with one or two major navigational errors. From a performance-analysis perspective, these profiles are meaningfully different even when the final time gap is small.

The increasing availability of split-time data makes it possible to move beyond outcome-based summaries and study how performance unfolds over the course of a race. Split times contain leg-level information on execution, and therefore offer a natural basis for separating overall pace from within-race variation (Casado et al., 2021; Hébert-Losier et al., 2015; Sha et al., 2024). Many organizers already provide online split-time tools that display cumulative or leg-by-leg positions across athletes, which is an important improvement over final ranks alone. The present framework is intended to complement such tools rather than duplicate them: by adjusting for athlete-level pace and common leg-level effects, it separates overall speed, shared course structure, residual instability, and unusually severe athlete-leg deviations in a single interpretable decomposition.

This adjustment is useful because a raw split display shows where time was lost, but not always whether a loss was large relative to the athlete’s own baseline and to the field-wide difficulty of that leg. The proposed model provides leg-adjusted residual profiles for athletes and common leg-effect summaries for organizers. It can therefore be used as an additional layer in post-race split tools: athlete pages could display speed, instability, and mistake-severity summaries alongside the usual split table, while organizer views could highlight legs that were broadly slow across the field or unusually decisive for particular athletes.

This paper proposes a split-time framework that decomposes orienteering performance into three interpretable components: overall speed, within-race instability, and mistake severity. We begin with a descriptive representation based on leg-level relative log-times, then formalize these quantities with an additive model for log split times that includes athlete effects, leg effects, and athlete-leg residuals. The model separates overall pace, common leg structure, and extreme underperformance on particular legs.

Our goal is not to replace official results, nor to compress performance into a single universal score. Instead, we provide a decomposition that is simple to compute, interpretable in practice, and statistically grounded. The framework is intended to support athlete comparison and race review by separating overall pace from consistency and severe underperformance, while also providing course-level summaries of common leg difficulty. Because the available data are split times rather than direct measurements of physiology, cognition, or route choice, the framework should not be read as a definitive separation of physical and navigational skill. It can, however, indicate how much of a race outcome is associated with overall pace versus residual leg-specific disruptions after common leg difficulty has been removed.

Despite the competitive importance of split times in elite orienteering, the sport remains comparatively underrepresented in the quantitative sports analytics literature. In particular, there has been relatively little work on simple statistical frameworks for decomposing official split-time results into interpretable dimensions of race performance. Orienteering is especially suitable for this purpose because it naturally combines leg-level split structure, navigation uncertainty, and substantial heterogeneity across legs.

The contributions of this paper are threefold. First, we introduce a split-time representation of orienteering performance in terms of speed, within-race instability, and mistake severity. Second, we provide a corresponding additive model for log split times that separates athlete-level pace, common leg difficulty, and athlete-leg specific deviations. Third, using elite-level championship data, we show that athletes with similar final times can nevertheless exhibit markedly different performance profiles across these dimensions.

More broadly, this paper does not aim to build a highly elaborate predictive model. Its focus is an interpretable split-time decomposition that is practically useful for athlete comparison, race analysis, and post-race course review in a sport where official outcomes often mask meaningful differences in execution.

The remainder of the paper is organized as follows. Section “Related work” reviews related work. Section “Data structure and notation” introduces notation and data structure. Section “Descriptive split-time decomposition” presents descriptive split-based measures. Section “A statistical model for split-time decomposition” develops the statistical model and the corresponding model-based decomposition. Section “Estimation and implementation” discusses estimation and implementation. Section “Empirical analysis” presents the empirical analysis, including robustness evidence from multiple elite races. Section “Discussion” concludes with limitations and future directions.

Related work

This paper relates to three broad strands of work: performance evaluation in sport, split-based analysis of race data, and interpretable additive decompositions for repeated measurements.

First, in sports analytics more broadly, a common objective is to separate observed outcomes into interpretable components such as underlying ability, contextual difficulty, and event-specific fluctuation. This perspective appears in team sports through decompositions into offensive and defensive strength, home-field effects, and match-level shocks, and in athlete-level settings where repeated observations are used to distinguish baseline performance from contextual variation (Albert et al., 2024; Baio and Blangiardo, 2010; Frees, 2004; Hsiao, 2003). Our setting differs in that the data arise from a common ordered sequence of legs within a race, but the underlying motivation is similar: to separate stable performance level from structured context and residual deviation.

Second, the paper is connected to the use of split times and intermediate race measurements for performance profiling. In race-based sports, final outcomes often conceal substantial heterogeneity in how performances unfold over time or across segments (Casado et al., 2021; Díaz et al., 2018, 2019; Sha et al., 2024). Split-level analyses provide a process-oriented view of pacing patterns, segment-specific strengths, and deviations from expected execution (Boullosa et al., 2023; Corbí-Santamaría et al., 2023; Ćuk et al., 2024; Haney and Mercer, 2011; Nikolaidis and Knechtle, 2017). Related scoring systems in endurance sports also use relative finishing times as the basic comparison scale; for example, the FIS cross-country skiing race-points formula is based on an athlete’s time relative to the winner’s time (International Ski Federation, 2017). Our use of log-times follows the same relative-time motivation but additionally makes multiplicative split-time ratios additive. In orienteering, this perspective is especially natural because split times are routinely recorded and major navigational mistakes often appear as unusually poor performance on one or a small number of legs (Hébert-Losier et al., 2015; Sailer et al., 2019).

Third, the proposed framework is methodologically close in spirit to simple additive decompositions and fixed-effects representations for two-way data arrays (Frees, 2004; Gelman et al., 2020; Hsiao, 2003). We use a deliberately simple decomposition on the log split-time scale, with athlete effects representing baseline pace and leg effects representing common leg difficulty across the field. The value of this approach is not that it fully captures the complexity of orienteering performance, but that it provides a transparent and interpretable first-order representation that can be computed directly from split-time tables.

The present paper also connects to prior work on orienteering performance, perception, and context. Existing studies have examined physiological predictors of performance, the role of experience, visual search and map reading, cognitive load, and movement trajectories or GPS-derived contextual information (Creagh and Reilly, 1997; Gasser, 2016; Gasser and Vogel, 2021; Guzmán et al., 2008; Larsson et al., 2002; Liu, 2019; Millet et al., 2010; Sailer et al., 2019). Other work has focused more directly on cognitive processing, map interpretation, running speed under cognitive load, and error formation in orienteering (Barrell and Cooper, 1986; Chaloupská, 2015; Cheshikhina, 1993; Seiler, 1996). Recent optimization work on rogaining also emphasizes the balance between physical and cognitive skill demands in orienteering-like contest design (Van Bulck et al., 2025). These studies highlight that orienteering performance depends not only on running capacity but also on navigation, decision making, and interaction with terrain and course structure. Our contribution is complementary: rather than modeling the full cognitive or biomechanical mechanism of race execution, we propose a split-time based decomposition that separates overall pace, within-race consistency, and severe mistakes in a simple and interpretable statistical framework.

The present paper differs from more informal split-based summaries in two respects. First, rather than compressing performance into a single composite score, we emphasize decomposition into separate dimensions: overall speed, within-race instability, and mistake severity. Second, we distinguish ordinary residual variation from extreme underperformance, which is especially relevant in orienteering where one or two large mistakes may dominate an otherwise strong race. The contribution is therefore not a comprehensive generative model of race behavior, but a statistically grounded framework for interpretable split-time decomposition.

Data structure and notation

Consider a single race within a fixed event and class. Suppose there are $n$ athletes and $m$ legs. For athlete $a \in \{1, \dots, n\}$ and leg $i \in \{1, \dots, m\}$, let $T_{a i} > 0$ denote the recorded split time on leg $i$, and let

T_{a} = \sum_{i = 1}^{m} T_{a i}

denote the total time of athlete $a$. In the empirical analysis, split times and total times are measured in seconds. The log-time variables below are therefore logarithms of times recorded in seconds; differences on the log scale are dimensionless log ratios. The speed components are reported on this log-time scale, while instability and mistake severity are residual log-time quantities. Throughout, we assume that the analysis is conducted separately within a homogeneous race setting, such as a single event-class combination. In applications, incomplete split records, retirements, or mispunches would typically be excluded from the main analysis.

In addition to the raw split-time table, we allow the use of a leg-specific reference profile. Let $T_{i}^{ref}$ denote a reference time for leg $i$, and define

T^{ref} = \sum_{i = 1}^{m} T_{i}^{ref} .

Possible choices of $T_{i}^{ref}$ include the fastest split on leg $i$, the median split among the top $K$ finishers, or a trimmed mean computed from a selected reference group. We will use such reference quantities primarily in the descriptive layer of the framework.
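As a minimal sketch of the second option, the snippet below computes a legwise median among the top $K$ finishers. The function name `reference_profile`, the toy table, and the use of total time as a stand-in for official rank are illustrative assumptions, not part of the original analysis.

```python
from statistics import median

def reference_profile(splits, total_times, k=10):
    """Legwise median split among the k fastest finishers.

    splits: per-athlete lists of leg times in seconds (complete table).
    total_times: total times used here to order athletes (assumed proxy for rank).
    """
    order = sorted(range(len(splits)), key=lambda a: total_times[a])
    top = order[:k]
    n_legs = len(splits[0])
    return [median(splits[a][i] for a in top) for i in range(n_legs)]

# Hypothetical toy table: 4 athletes, 3 legs (seconds).
splits = [
    [100.0, 200.0, 150.0],
    [110.0, 190.0, 160.0],
    [120.0, 230.0, 155.0],
    [105.0, 400.0, 170.0],
]
totals = [sum(row) for row in splits]
t_ref = reference_profile(splits, totals, k=3)
print(t_ref)
```

With `k=3`, the fourth athlete's large loss on the second leg does not enter the reference, which is the practical reason for restricting the reference group to top finishers.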

For most of the paper, we focus on split times within a single race. When multiple races are analyzed, the same framework can be applied separately by race, or extended to a hierarchical model that pools information across races. Such extensions are discussed in Section “Discussion”.

Descriptive split-time decomposition

We first define a descriptive layer that does not rely on a fully specified probabilistic model. For athlete $a$ and leg $i$, define the relative log-time

r_{a i} = \log \left( \frac{T_{a i}}{T_{i}^{ref}} \right) .

This quantity is zero when athlete $a$ matches the reference on leg $i$, positive when the athlete is slower than the reference, and negative when the athlete is faster.

Using the relative log-times $r_{a i}$, we define three descriptive performance measures.

Overall speed

A natural split-aggregated summary of overall speed is

\mathrm{Speed}_{a}^{\mathrm{desc}} = \log \left( \frac{T_{a}}{T^{ref}} \right) .

Smaller values correspond to better overall speed relative to the reference.

Within-race instability

We define instability as the standard deviation of the leg-level relative log-times,

\mathrm{Instability}_{a}^{\mathrm{desc}} = \operatorname{sd} (r_{a 1}, \dots, r_{a m}) .

This quantity captures how uneven the athlete’s leg-by-leg performance is relative to the reference profile. An athlete whose relative performance is similar across legs will have a smaller instability value, whereas an athlete with substantial leg-to-leg fluctuation will have a larger value.

Mistake severity

To capture unusually poor legs, we define

\mathrm{Mistake}_{a}^{\mathrm{desc}} = \max_{1 \leq i \leq m} r_{a i} .

This summary emphasizes the worst leg relative to the reference profile and is intended as a simple measure of major mistake severity.
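The three descriptive measures can be sketched for a single athlete as follows. The function name, the toy split times, and the choice of the sample standard deviation (divisor $m - 1$) are assumptions made for illustration; the text does not specify which standard-deviation convention is used.

```python
import math
from statistics import stdev

def descriptive_measures(leg_times, ref_times):
    """Return (speed, instability, mistake) on the log-time scale."""
    r = [math.log(t / t_ref) for t, t_ref in zip(leg_times, ref_times)]
    speed = math.log(sum(leg_times) / sum(ref_times))
    instability = stdev(r)  # sample SD; an assumed convention
    mistake = max(r)
    return speed, instability, mistake

# Hypothetical reference profile and two athletes with the same total time.
ref = [100.0, 200.0, 150.0]
steady = [110.0, 220.0, 165.0]   # 10% slower than the reference on every leg
erratic = [100.0, 200.0, 195.0]  # matches the reference, then loses 30% on one leg

speed_s, inst_s, mist_s = descriptive_measures(steady, ref)
speed_e, inst_e, mist_e = descriptive_measures(erratic, ref)
print(speed_s, inst_s, mist_s)
print(speed_e, inst_e, mist_e)
```

Both toy athletes finish in 495 seconds and therefore share the same descriptive speed, but only the second shows non-negligible instability and a large worst-leg value.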

Remark 1.

The three descriptive measures play different roles. The speed measure reflects overall race level, the instability measure reflects dispersion of relative performance across legs, and the mistake measure reflects the most extreme underperformance on a single leg. These dimensions are complementary and should not, in general, be expected to rank athletes identically.

Remark 2.

The reference profile enters only through $T_{i}^{ref}$. Different choices of reference may change the absolute numerical values of the descriptive measures, so sensitivity analyses with respect to reference construction are important in practice.

A statistical model for split-time decomposition

Because split times are positive and are naturally compared through relative rather than absolute differences, we work on the log-time scale (Casado et al., 2021; Sha et al., 2024). This has two advantages. First, multiplicative slowdowns or speedups are transformed into additive deviations. Second, relative comparisons of the form $T_{a i} / T_{i}^{ref}$ become log differences, which aligns the descriptive representation with the additive model introduced below.
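As a small numerical illustration of the first advantage, the sketch below applies the same proportional slowdown to a short and a long leg. The leg times and the 25% slowdown are hypothetical values chosen for the example.

```python
import math

short_leg, long_leg = 60.0, 600.0  # hypothetical clean split times (seconds)
slowdown = 1.25                    # the athlete is 25% slower on each leg

# Absolute time lost depends on leg length.
abs_loss = [t * slowdown - t for t in (short_leg, long_leg)]

# The log-scale deviation is the same on both legs: log(1.25).
log_loss = [math.log(t * slowdown) - math.log(t) for t in (short_leg, long_leg)]

print(abs_loss)
print(log_loss)
```

The absolute losses differ by a factor of ten, whereas the log-scale deviations coincide, which is why legs of very different length can be compared on the same residual scale.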

Let

Y_{a i} = \log T_{a i} .

We propose the additive split-time model

Y_{a i} = μ_{a} + δ_{i} + ε_{a i}, \quad a = 1, \dots, n, \quad i = 1, \dots, m, \tag{1}

where $μ_{a}$ is an athlete-specific baseline speed effect, $δ_{i}$ is a leg-specific difficulty effect, and $ε_{a i}$ is an athlete-leg specific residual deviation.

The interpretation of (1) is straightforward. The term $μ_{a}$ represents the athlete’s overall pace level on the log-time scale, averaged across the course. The term $δ_{i}$ captures the fact that some legs are systematically slower than others for all athletes, due to length, climb, terrain, technical complexity, or other common course features. The residual $ε_{a i}$ measures how much athlete $a$ deviates from their expected performance on leg $i$ after accounting for baseline speed and common leg difficulty. Accordingly, the fitted leg effects may also be viewed as a course-level summary of which legs were systematically slower or faster across the field, which may be useful not only for athlete analysis but also for post-race review of course structure.

Model (1) yields a direct statistical decomposition of performance into three components.

Model-based speed

The athlete-specific speed effect is summarized by

\mathrm{Speed}_{a}^{\mathrm{model}} = μ_{a} .

Smaller values correspond to faster athletes on the log-time scale.

Model-based instability

Once $μ_{a}$ and $δ_{i}$ are removed, the spread of residual deviations reflects within-race consistency. A natural model-based measure is

\mathrm{Instability}_{a}^{\mathrm{model}} = \sqrt{\frac{1}{m} \sum_{i = 1}^{m} ε_{a i}^{2}} .

This quantity is small when athlete $a$ performs close to their expected level on most legs, and large when leg-level performance fluctuates substantially.

Model-based mistake severity

A natural measure of mistake severity is the maximum positive residual,

\mathrm{Mistake}_{a}^{\mathrm{model}} = \max_{1 \leq i \leq m} ε_{a i} .

Because large positive residuals correspond to unexpectedly slow splits after controlling for baseline pace and leg difficulty, they serve as a simple proxy for major mistakes.

Remark 3.

The additive model in (1) is intentionally simple. It is not designed to capture all aspects of orienteering performance, such as route choice, terrain interactions, pack effects, or serial dependence across legs. Its purpose is instead to provide a transparent first-order decomposition of split times into overall speed, common leg difficulty, and residual leg-level deviations.

Identifiability and normalization

Model (1) is unchanged if one adds a constant $c$ to all athlete effects and subtracts the same constant from all leg effects. Accordingly, $(μ_{a}, δ_{i})$ is identifiable only up to an additive constant without further normalization. Throughout, we impose the centering constraint

\sum_{i = 1}^{m} δ_{i} = 0. \tag{2}

Under this normalization, the athlete effects are interpreted relative to the average leg difficulty on the log-time scale. Other equivalent normalizations are possible, such as centering the athlete effects or fixing one leg effect to zero. The specific choice does not affect fitted values or residuals and therefore does not affect the instability or mistake summaries derived below.

Connection between descriptive and model-based quantities

The descriptive and model-based decompositions are closely related, but they are not identical. In the descriptive layer, performance is summarized relative to a chosen reference profile through

r_{a i} = \log T_{a i} - \log T_{i}^{ref} .

In the model-based layer, the split time is decomposed into an athlete-specific level, a common leg effect, and an athlete-leg specific residual.

If the reference profile is such that

\log T_{i}^{ref} \approx δ_{i} + c

for some constant $c$ not depending on $i$, then the relative log-times $r_{a i}$ approximately remove common leg difficulty up to an additive athlete-independent shift. In that case, the descriptive summaries based on the mean, spread, and upper tail of $r_{a i}$ may be viewed as empirical approximations to the corresponding model-based notions of speed, instability, and mistake severity.

It is worth noting that the descriptive speed measure $\log (T_{a} / T^{ref})$ is directly tied to total race time and is therefore especially intuitive in practice, whereas the model-based speed effect $μ_{a}$ is defined on the mean log split-time scale. An alternative descriptive speed summary,

\frac{1}{m} \sum_{i = 1}^{m} r_{a i},

aligns more directly with the additive model.

Estimation and implementation

For a fixed race, the additive model in (1) may be estimated by least squares under the identifiability constraint (2). Define

L (μ, δ) = \sum_{a = 1}^{n} \sum_{i = 1}^{m} (Y_{a i} - μ_{a} - δ_{i})^{2} .

We estimate the athlete and leg effects by

(\hat{μ}, \hat{δ}) \in \underset{μ, δ}{\operatorname{argmin}} \; L (μ, δ) \quad \text{subject to} \quad \sum_{i = 1}^{m} δ_{i} = 0. \tag{3}

The fitted residuals are then

{\hat{ε}}_{a i} = Y_{a i} - {\hat{μ}}_{a} - {\hat{δ}}_{i} .

This yields the empirical model-based summaries

\begin{aligned} \widehat{\mathrm{Speed}}_{a} & = {\hat{μ}}_{a}, \\ \widehat{\mathrm{Instability}}_{a} & = \left( \frac{1}{m} \sum_{i = 1}^{m} {\hat{ε}}_{a i}^{2} \right)^{1 / 2}, \\ \widehat{\mathrm{Mistake}}_{a} & = \max_{1 \leq i \leq m} {\hat{ε}}_{a i} . \end{aligned}

Closed-form solution under a complete split-time table

When the split-time table is complete, the least-squares estimator in (3) has a simple closed-form representation. Let

\begin{aligned} {\bar{Y}}_{a \cdot} & = \frac{1}{m} \sum_{i = 1}^{m} Y_{a i}, \\ {\bar{Y}}_{\cdot i} & = \frac{1}{n} \sum_{a = 1}^{n} Y_{a i}, \\ {\bar{Y}}_{\cdot \cdot} & = \frac{1}{n m} \sum_{a = 1}^{n} \sum_{i = 1}^{m} Y_{a i} . \end{aligned}

Proposition 1.

Under the constraint $\sum_{i = 1}^{m} δ_{i} = 0$, the least-squares solution to (3) satisfies

\begin{aligned} {\hat{μ}}_{a} & = {\bar{Y}}_{a \cdot}, \\ {\hat{δ}}_{i} & = {\bar{Y}}_{\cdot i} - {\bar{Y}}_{\cdot \cdot}, \end{aligned}

and therefore

{\hat{ε}}_{a i} = Y_{a i} - {\bar{Y}}_{a \cdot} - {\bar{Y}}_{\cdot i} + {\bar{Y}}_{\cdot \cdot} .

Proof.

Differentiating the least-squares criterion $L (μ, δ)$ with respect to $μ_{a}$ and setting the derivative to zero gives

\frac{\partial L}{\partial μ_{a}} = - 2 \sum_{i = 1}^{m} (Y_{a i} - μ_{a} - δ_{i}) = 0,

so that

m μ_{a} = \sum_{i = 1}^{m} Y_{a i} - \sum_{i = 1}^{m} δ_{i} .

Under the constraint $\sum_{i = 1}^{m} δ_{i} = 0$, this yields

{\hat{μ}}_{a} = \frac{1}{m} \sum_{i = 1}^{m} Y_{a i} = {\bar{Y}}_{a \cdot} .

Next, differentiating with respect to $δ_{i}$ gives

\frac{\partial L}{\partial δ_{i}} = - 2 \sum_{a = 1}^{n} (Y_{a i} - μ_{a} - δ_{i}) = 0,

hence

n δ_{i} = \sum_{a = 1}^{n} Y_{a i} - \sum_{a = 1}^{n} μ_{a} .

Using ${\hat{μ}}_{a} = {\bar{Y}}_{a \cdot}$, we have

\frac{1}{n} \sum_{a = 1}^{n} {\hat{μ}}_{a} = \frac{1}{n} \sum_{a = 1}^{n} {\bar{Y}}_{a \cdot} = {\bar{Y}}_{\cdot \cdot},

and therefore

{\hat{δ}}_{i} = \frac{1}{n} \sum_{a = 1}^{n} Y_{a i} - \frac{1}{n} \sum_{a = 1}^{n} {\hat{μ}}_{a} = {\bar{Y}}_{\cdot i} - {\bar{Y}}_{\cdot \cdot} .

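Finally,

{\hat{ε}}_{a i} = Y_{a i} - {\hat{μ}}_{a} - {\hat{δ}}_{i} = Y_{a i} - {\bar{Y}}_{a \cdot} - {\bar{Y}}_{\cdot i} + {\bar{Y}}_{\cdot \cdot},

as claimed. The fitted leg effects also satisfy constraint (2), because $\sum_{i = 1}^{m} ({\bar{Y}}_{\cdot i} - {\bar{Y}}_{\cdot \cdot}) = m {\bar{Y}}_{\cdot \cdot} - m {\bar{Y}}_{\cdot \cdot} = 0$, and since $L$ is a convex quadratic, this stationary point is a minimizer.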

This representation makes the decomposition especially transparent. The fitted athlete effect is the athlete’s mean log split time across all legs, the fitted leg effect is the average leg difficulty relative to the grand mean, and the residual is the doubly centered log split time. Thus the model-based decomposition can be interpreted directly as a decomposition of the observed split-time table into athlete-level pace, leg-level common structure, and athlete-leg specific departure from expectation.
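The closed form in Proposition 1 can be sketched directly from a complete split-time table. The function names and the toy table below are illustrative assumptions; the computation itself is the double centering of log split times described above.

```python
import math

def fit_additive(times):
    """Closed-form least-squares fit for a complete table of split times."""
    n, m = len(times), len(times[0])
    y = [[math.log(t) for t in row] for row in times]
    row_mean = [sum(row) / m for row in y]                       # athlete means
    col_mean = [sum(y[a][i] for a in range(n)) / n for i in range(m)]  # leg means
    grand = sum(row_mean) / n
    mu = row_mean
    delta = [c - grand for c in col_mean]
    resid = [[y[a][i] - mu[a] - delta[i] for i in range(m)] for a in range(n)]
    return mu, delta, resid

def residual_summaries(resid_row):
    """Return (instability, mistake) for one athlete's fitted residuals."""
    m = len(resid_row)
    instability = math.sqrt(sum(e * e for e in resid_row) / m)
    mistake = max(resid_row)
    return instability, mistake

# Hypothetical toy table (seconds): 3 athletes, 4 legs.
times = [
    [100.0, 210.0, 150.0, 300.0],
    [105.0, 205.0, 240.0, 310.0],
    [120.0, 250.0, 170.0, 330.0],
]
mu, delta, resid = fit_additive(times)
summaries = [residual_summaries(row) for row in resid]
for a, (inst, mist) in enumerate(summaries):
    print(a, round(mu[a], 4), round(inst, 4), round(mist, 4))
```

In this toy table the second athlete's slow third leg produces the largest positive residual, even though that athlete's mean log split time is not the slowest.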

Interpretation of the residual-based summaries

The quantity $\widehat{\mathrm{Instability}}_{a}$ is the root mean square of the fitted leg-level deviations for athlete $a$. It should therefore be interpreted as a measure of within-race residual dispersion after adjusting for overall pace and common leg difficulty. Larger values indicate greater leg-to-leg fluctuation, whereas smaller values indicate more consistent execution relative to the athlete’s expected level within that race.

Similarly, $\widehat{\mathrm{Mistake}}_{a}$ is the largest positive fitted deviation and is intended to summarize the most severe unexpectedly slow leg. This choice is simple and interpretable, though it is naturally sensitive to a single extreme observation. For this reason, alternative summaries based on, for example, the mean of the largest two positive residuals or upper-quantile features of the positive residual distribution may also be useful in practice.

Incomplete split-time tables

If some split times are missing, the same least-squares criterion may be minimized over the observed entries only, provided that the remaining design retains sufficient connectivity for identification of the relevant athlete and leg effects. In practice, however, missingness in split-time data may be informative (for example, due to retirement, mispunch, or severe disruption during the race), and results based on incomplete tables should therefore be interpreted with appropriate caution.
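One way to minimize the criterion over observed entries is alternating averaging of athlete and leg effects, sketched below. This is an illustrative choice of algorithm, not necessarily the one used for the reported results; the function name, the `None` encoding of missing splits, and the fixed iteration count are assumptions.

```python
import math

def fit_additive_observed(times, n_iter=200):
    """Least squares over observed entries via alternating updates.

    times: table of split times with None for missing entries. Every
    athlete and every leg must have at least one observed entry, and the
    observed pattern must be connected for the effects to be identified.
    """
    n, m = len(times), len(times[0])
    y = [[None if t is None else math.log(t) for t in row] for row in times]
    mu = [0.0] * n
    delta = [0.0] * m
    for _ in range(n_iter):
        for a in range(n):
            obs = [i for i in range(m) if y[a][i] is not None]
            mu[a] = sum(y[a][i] - delta[i] for i in obs) / len(obs)
        for i in range(m):
            obs = [a for a in range(n) if y[a][i] is not None]
            delta[i] = sum(y[a][i] - mu[a] for a in obs) / len(obs)
        # Re-impose the centering constraint without changing fitted values.
        shift = sum(delta) / m
        delta = [d - shift for d in delta]
        mu = [u + shift for u in mu]
    return mu, delta

# Hypothetical toy table with one missing split.
times = [
    [100.0, 210.0, 150.0],
    [105.0, None, 160.0],
    [120.0, 250.0, 170.0],
]
mu, delta = fit_additive_observed(times)
print([round(u, 4) for u in mu], [round(d, 4) for d in delta])
```

On a complete table, these updates reproduce the row-mean and centered column-mean solution of Proposition 1 after the first pass.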

Uncertainty quantification

Because empirical summaries are estimated from finite race-specific samples, we report uncertainty intervals in addition to point estimates. For each race-class analysis, we use nonparametric athlete-level bootstrap resampling to construct 95% confidence intervals for correlations between model-based measures and final outcomes (total time and rank). For selected close-time case-study pairs, we use a leave-one-leg-out (LOLO) sensitivity analysis: for each leg $j$, we recompute the pairwise difference after removing leg $j$ and summarize the resulting distribution of leave-one-leg-out differences. This diagnostic follows the general logic of leave-one-out influence analysis, in which observations are removed one at a time to assess their effect on a fitted summary or conclusion (Cook, 1977).

The resampling unit is chosen to match the target estimand. For correlation summaries, each bootstrap sample resamples athletes with replacement (size $n$), recomputes the decomposition-based summaries, and then recomputes the correlation of interest. Let $θ$ denote a generic target statistic and $\{θ^{* (b)}\}_{b = 1}^{B}$ the bootstrap replicates; the reported 95% nonparametric interval is the percentile interval

\left[ q_{0.025} \left( \{θ^{* (b)}\}_{b = 1}^{B} \right), \; q_{0.975} \left( \{θ^{* (b)}\}_{b = 1}^{B} \right) \right] .
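The athlete-level percentile bootstrap can be sketched as follows for the correlation between model instability and total time. The synthetic data, the use of the Pearson correlation, the number of replicates, and the simple order-statistic quantile rule are all assumptions made for this illustration.

```python
import math
import random

def instability(times):
    """RMS doubly centered log-split residual per athlete (complete table)."""
    n, m = len(times), len(times[0])
    y = [[math.log(t) for t in row] for row in times]
    row = [sum(r) / m for r in y]
    col = [sum(y[a][i] for a in range(n)) / n for i in range(m)]
    grand = sum(row) / n
    return [
        math.sqrt(sum((y[a][i] - row[a] - col[i] + grand) ** 2 for i in range(m)) / m)
        for a in range(n)
    ]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    if sxx == 0.0 or syy == 0.0:
        return None  # degenerate resample; skipped below
    return sxy / math.sqrt(sxx * syy)

def bootstrap_ci(times, n_boot=500, seed=0):
    """95% percentile interval; leg effects are refitted in every replicate."""
    rng = random.Random(seed)
    n = len(times)
    reps = []
    for _ in range(n_boot):
        sample = [times[rng.randrange(n)] for _ in range(n)]  # resample athletes
        r = pearson(instability(sample), [sum(row) for row in sample])
        if r is not None:
            reps.append(r)
    reps.sort()
    lo = reps[int(0.025 * (len(reps) - 1))]
    hi = reps[int(0.975 * (len(reps) - 1))]
    return lo, hi

# Synthetic table: 12 athletes, 8 legs; slower athletes are made noisier.
gen = random.Random(1)
times = [
    [math.exp(5.0 + 0.05 * a + 0.1 * i + gen.gauss(0.0, 0.05 * (1 + a / 6)))
     for i in range(8)]
    for a in range(12)
]
lo, hi = bootstrap_ci(times, n_boot=500, seed=0)
print(round(lo, 3), round(hi, 3))
```

Resampling whole athletes keeps each athlete's leg sequence intact, so the within-athlete residual structure is preserved in every replicate.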

Sensitivity analysis for incomplete split-time tables

To assess whether complete-case filtering materially changes the qualitative conclusions in the present data, we run a simple sensitivity analysis that retains athletes with at most one missing leg and imputes missing leg times by legwise medians within each race-class table. We then recompute the same summary measures and outcome associations. This check is intended as a practical robustness diagnostic rather than as a full missing-data model.
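A minimal sketch of this retention and imputation step is given below. The function name, the `None` encoding of missing splits, and the choice to compute legwise medians over the retained athletes only are assumptions, since the text does not specify the exact reference set for the medians.

```python
from statistics import median

def impute_legwise_median(times, max_missing=1):
    """Keep athletes with at most max_missing missing legs, then fill gaps."""
    kept = [row for row in times if sum(t is None for t in row) <= max_missing]
    m = len(kept[0])
    leg_median = [
        median(row[i] for row in kept if row[i] is not None) for i in range(m)
    ]
    return [
        [leg_median[i] if row[i] is None else row[i] for i in range(m)]
        for row in kept
    ]

# Hypothetical toy table (seconds); None marks a missing split.
times = [
    [100.0, 200.0, 150.0],
    [110.0, None, 160.0],
    [120.0, 230.0, 155.0],
    [None, None, 170.0],   # two missing legs: dropped
]
filled = impute_legwise_median(times)
print(filled)
```

Because a median-imputed split is by construction an unremarkable leg for that athlete, this check will tend to understate instability and mistake severity for the affected athletes, which is consistent with treating it as a robustness diagnostic only.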

Optional aggregated score

Although our emphasis is on multicomponent performance decomposition rather than a single ranking score, the fitted residuals also support natural aggregated summaries when one-number comparisons are needed for downstream applications. One simple choice is

I_{a} = \frac{1}{m} \sum_{i = 1}^{m} {\hat{ε}}_{a i}^{2} .

This may be interpreted as an overall residual energy after adjusting for athlete-level pace and leg-level difficulty. We nevertheless view such aggregation as secondary, because it intentionally compresses distinct aspects of performance into a single number.

Empirical analysis

We illustrate the proposed framework using official split-time results from the 2025 World Orienteering Championships middle-distance final. The 2025 championships were held in the Kuopio region of Finland. The middle-distance final took place on technical forest terrain, with official course information listing 5.8 km, 255 m climb, and 18 controls for the men’s course and 5.0 km, 230 m climb, and 16 controls for the women’s course (International Orienteering Federation, 2025b). We also use the long-distance final as a robustness illustration; the corresponding courses were 16.0 km, 565 m climb, and 27 controls for men, and 13.3 km, 475 m climb, and 23 controls for women (International Orienteering Federation, 2025a). In the official XML result files, split information is recorded at successive controls as cumulative times; leg split times were obtained by differencing consecutive cumulative control times, with the final run-in to the finish treated as an additional segment when needed. Public split result files do not provide leg-specific length and climb for each segment, so these physical course attributes are not entered as direct covariates. Instead, their shared effect on split times is absorbed indirectly by the fitted common leg effects, together with terrain, route-choice complexity, and other leg-level features observed by the field as a whole.
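The differencing step that turns cumulative control times into leg split times can be sketched as follows. The function name and the example times are hypothetical, and the sketch assumes cumulative times are measured in seconds from the athlete's start; it does not reproduce the parsing of the official XML files.

```python
def leg_splits(cumulative, finish_time=None):
    """Difference cumulative control times into leg split times.

    cumulative: elapsed seconds at each control, in course order.
    finish_time: optional elapsed time at the finish; when given, the
    run-in from the last control is returned as an additional segment.
    """
    points = [0.0] + list(cumulative)
    if finish_time is not None:
        points.append(finish_time)
    return [later - earlier for earlier, later in zip(points, points[1:])]

# Hypothetical athlete: three controls plus the run-in to the finish.
splits = leg_splits([95.0, 250.0, 410.0], finish_time=440.0)
print(splits)
```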

The analysis was conducted separately for the men’s and women’s races. For each race, we restricted attention to athletes with complete split records. We then computed both the descriptive summaries introduced in Section “Descriptive split-time decomposition” and the model-based summaries derived from the additive log split-time decomposition in Sections “A statistical model for split-time decomposition”–“Estimation and implementation”. In the descriptive layer, the reference profile was constructed as the legwise median split among the top 10 finishers ordered by official rank.

Overall associations with final outcome

We first examined how the proposed summaries relate to total finishing time and final rank. Table 1 reports correlations between the model-based measures and these outcomes in the men’s and women’s races, together with 95% bootstrap confidence intervals. We use total time in seconds because it is the official performance outcome and the quantity displayed in standard result and split-time tools. Rank is included as a second outcome because it is scale-free and directly tied to competition placement; correlations with log total time would instead emphasize proportional rather than absolute time differences. As expected, the speed component is most strongly aligned with final outcome in both races. By contrast, model instability and mistake severity remain associated with outcome but less strongly.

Table 1.

Correlations of the model-based performance measures with total finishing time and final rank in the men’s and women’s races. Values are point estimates with 95% bootstrap confidence intervals in brackets.

Metric	Men: total time	Men: rank	Women: total time	Women: rank
Model speed	0.974 [0.953, 0.988]	0.935 [0.909, 0.970]	0.948 [0.901, 0.989]	0.975 [0.962, 0.986]
Model instability	0.753 [0.598, 0.869]	0.703 [0.567, 0.814]	0.643 [0.373, 0.829]	0.571 [0.397, 0.722]
Model mistake severity	0.672 [0.497, 0.818]	0.646 [0.498, 0.771]	0.507 [0.221, 0.760]	0.454 [0.224, 0.652]

This pattern is important for interpretation. If instability and mistake severity were nearly redundant with total time, then the decomposition would add little beyond the official result itself. Instead, the weaker associations indicate that these dimensions are best interpreted as complementary race-execution descriptors. In other words, the speed component primarily summarizes how fast an athlete was overall, whereas the instability and mistake components help describe how the race unfolded.

Figure 1 provides a complementary visual summary. The horizontal axis shows the model-based speed effect, the vertical axis shows model-based instability, and the color scale represents model-based mistake severity. Several features are immediately visible. First, athletes with faster overall pace tend to cluster toward the left side of each panel, as expected. Second, even among athletes with similar speed values, there remains substantial heterogeneity in instability and mistake severity. Third, the athletes highlighted for the case studies below lie close to each other in speed but far apart in instability, illustrating that similar overall competitiveness need not imply similar split-time structure.

Figure 1.

Model-based speed–instability plots for the men’s and women’s races. The horizontal axis shows the athlete-specific speed effect, with smaller values corresponding to faster overall pace; the vertical axis shows model-based instability, measured by the root mean square residual after adjusting for athlete-level pace and common leg difficulty; color indicates model-based mistake severity. Red circles highlight the close-time case-study athletes discussed in the text. (a) Men and (b) Women

Taken together, Table 1 and Figure 1 are consistent with the intended interpretation of the decomposition: speed behaves as an overall performance-level summary, while instability and mistake severity provide additional information about race execution that is not fully visible from total time alone.

To reduce the appearance of selective reporting, Figures 9 and 10 display the broader distributions of close-time pairs and indicate where the selected case-study pairs fall within those distributions. The examples discussed in the main text should therefore be viewed as interpretable illustrations drawn from a larger set of close-time comparisons rather than as isolated anecdotes; substantial instability/mistake contrasts also appear in the broader comparison sets.
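A broader set of close-time comparisons can be enumerated with a few lines of code. The sketch below lists all athlete pairs whose total times differ by at most a chosen gap and records the absolute difference in a summary measure for each pair. The function name `close_time_pairs` and the 30-second threshold are hypothetical choices made for illustration, and the input values are synthetic.

```python
from itertools import combinations

def close_time_pairs(total_s, metric, max_gap_s=30.0):
    """Return (i, j, time gap, |metric difference|) for all close-time pairs.

    total_s: total finishing times in seconds, one per athlete.
    metric:  a per-athlete summary such as instability or mistake severity.
    max_gap_s is an assumed threshold, not a value taken from the paper.
    """
    pairs = []
    for i, j in combinations(range(len(total_s)), 2):
        gap = abs(total_s[i] - total_s[j])
        if gap <= max_gap_s:
            pairs.append((i, j, gap, abs(metric[i] - metric[j])))
    return pairs

# Synthetic illustration: only athletes 0 and 1 finish within 30 s.
total_s = [2500.0, 2520.0, 2600.0, 2700.0]
instability = [0.10, 0.30, 0.12, 0.20]
pairs = close_time_pairs(total_s, instability)
print(pairs)
```

The fourth element of each tuple gives the distribution against which a selected case-study pair can be located.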

Case study: men’s race

A representative example from the men’s race is provided by Quentin Moulet and Matt Doyle. Their total times differed by only 22 seconds, with Moulet finishing in 2543 seconds and Doyle in 2565 seconds. Despite this small overall gap, their within-race profiles were markedly different.

Table 2 indicates that Moulet had a much larger model-based instability value than Doyle ($0.301$ versus $0.106$), together with a much larger model-based mistake-severity value ($1.031$ versus $0.272$). The corresponding absolute differences were $0.195$ for model instability and $0.759$ for model mistake severity. Under LOLO sensitivity, the leave-one-leg-out ranges were $[0.082, 0.213]$ for instability difference and $[0.295, 0.849]$ for mistake-severity difference, with maximum absolute influence $0.113$ and $0.464$, respectively. This contrast is also visible in the men’s panel of Figure 1, where the two athletes occupy nearby horizontal positions but are clearly separated vertically. Thus, although their overall pace levels were similar, their races differed substantially in leg-to-leg consistency.
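The leave-one-leg-out (LOLO) quantities quoted above can be sketched as follows. The code assumes a complete table of log split times with athletes in rows and legs in columns, instability defined as the root mean square of the doubly centered residuals, and mistake severity defined as the largest residual. The maximum absolute influence is taken as the largest change in the pairwise difference when a single leg is removed and the decomposition is refitted. The function names and the synthetic table are illustrative assumptions.

```python
import numpy as np

def residuals(log_t):
    """Doubly centered log split times (rows: athletes, columns: legs)."""
    return (log_t
            - log_t.mean(axis=1, keepdims=True)   # remove athlete pace
            - log_t.mean(axis=0, keepdims=True)   # remove common leg effect
            + log_t.mean())                       # add back grand mean

def summaries(log_t):
    """Per-athlete instability (RMS residual) and mistake severity (max residual)."""
    r = residuals(log_t)
    return np.sqrt((r ** 2).mean(axis=1)), r.max(axis=1)

def lolo_difference(log_t, a, b):
    """LOLO range and maximum absolute influence for athlete a minus athlete b."""
    def contrast(table):
        inst, mist = summaries(table)
        return np.array([inst[a] - inst[b], mist[a] - mist[b]])
    full = contrast(log_t)
    # Refit after deleting each leg in turn.
    loo = np.array([contrast(np.delete(log_t, j, axis=1))
                    for j in range(log_t.shape[1])])
    return {
        "full": full,                                   # [instability, mistake]
        "range": (loo.min(axis=0), loo.max(axis=0)),
        "max_influence": np.abs(loo - full).max(axis=0),
    }

# Synthetic illustration: athlete 0 loses a large amount of time on leg 7.
rng = np.random.default_rng(0)
n_athletes, n_legs = 30, 15
log_t = (rng.normal(0.0, 0.10, size=(n_athletes, 1))       # athlete pace
         + rng.normal(5.0, 0.40, size=(1, n_legs))         # leg difficulty
         + rng.normal(0.0, 0.05, size=(n_athletes, n_legs)))
log_t[0, 7] += 1.0
out = lolo_difference(log_t, a=0, b=1)
print(out["full"], out["range"], out["max_influence"])
```

In this synthetic example, removing the single disrupted leg collapses the mistake-severity difference. A narrow LOLO range would instead indicate a contrast spread across many legs.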

Table 2.

Selected close-time case studies illustrating different within-race performance profiles. In both races, the two athletes have nearly identical total times, but markedly different instability and mistake-severity values.

Race	Athlete	Rank	Total (s)	Instability	Mistake
Men	Quentin Moulet	33	2543	0.301	1.031
Men	Matt Doyle	35	2565	0.106	0.272
Women	Lucie Dittrichova	28	2620	0.295	1.091
Women	Jasmina Gassner	29	2624	0.087	0.176

The difference becomes even more apparent in Figure 2(a), which displays the model-based residual profiles for the two athletes. Doyle’s residuals remain comparatively close to zero across most legs, indicating a relatively even race after adjusting for overall pace and common leg difficulty. By contrast, Moulet exhibits pronounced positive residual spikes, especially on later legs, indicating one or more legs on which he was much slower than expected relative to his own race baseline. In substantive terms, Moulet’s final result appears to have been shaped by a small number of severe underperforming legs embedded within an otherwise competitive overall performance, whereas Doyle’s race was much more stable from leg to leg.

Figure 2.

Model-based residual profiles for the selected close-time athlete pairs. Positive residuals indicate legs on which the athlete was slower than expected after adjusting for overall pace and common leg difficulty. In both panels, one athlete exhibits one or more pronounced positive spikes, corresponding to substantially larger instability and mistake-severity measures, despite a nearly identical final time. (a) Quentin Moulet vs. Matt Doyle and (b) Lucie Dittrichova vs. Jasmina Gassner

This example illustrates the main motivation of the framework: athletes with nearly identical total times can still exhibit markedly different race structures, a distinction that is not visible from total time alone.

Case study: women’s race

A similarly clear example appears in the women’s race, where Lucie Dittrichova and Jasmina Gassner finished only 4 seconds apart, with total times of 2620 and 2624 seconds, respectively. Again, however, the decomposition indicates that these nearly identical final outcomes correspond to very different split-time profiles.

As reported in Table 2, Dittrichova had a much larger model-based instability value than Gassner ($0.295$ versus $0.087$), together with a dramatically larger model-based mistake-severity value ($1.091$ versus $0.176$). The corresponding absolute differences were $0.209$ for model instability and $0.915$ for model mistake severity. Under LOLO sensitivity, the leave-one-leg-out ranges were $[0.049, 0.226]$ for instability difference and $[0.056, 0.971]$ for mistake-severity difference, with maximum absolute influence $0.159$ and $0.859$, respectively. This difference is visible already in the women’s panel of Figure 1, where the two athletes are close in speed but widely separated in instability. Thus, as in the men’s example, similarity in final outcome does not imply similarity in race structure.

Figure 2(b) clarifies the source of this difference. Dittrichova shows a very large positive residual spike on one leg, together with several additional deviations away from zero, whereas Gassner’s residuals remain much more moderate across the course. This suggests that Dittrichova’s race was strongly shaped by a major single-leg error, while Gassner’s race was substantially more even from leg to leg.

The fact that an analogous pattern appears in both the men’s and women’s races is encouraging. It suggests that the proposed decomposition is not merely identifying isolated anecdotes, but is capturing a recurring structural feature of elite split-time data: athletes with comparable total times can nevertheless differ substantially in consistency and in the severity of their worst mistakes.

Common leg effects as a course-level summary

In addition to athlete-level summaries, the additive split-time model also yields estimated common leg effects, namely the fitted quantities $\hat{\delta}_{i}$ for each leg. These effects provide a course-level view of which legs were systematically slower or faster across the field after accounting for athlete-specific pace. In this sense, the model summarizes not only how athletes differed from one another, but also how the course itself was expressed statistically through the split-time table.

Figure 3 displays the estimated common leg effects for the men’s and women’s middle-distance finals. Several legs stand out as clearly slower than average across the field, while others appear uniformly faster. This is consistent with the interpretation that the fitted leg effects absorb shared structure arising from leg length, climb, terrain, route-choice complexity, and other course features common to all athletes.

Figure 3.

Estimated common leg effects for the men’s and women’s middle-distance finals. Larger values indicate legs that were systematically slower across the field on the log-time scale after accounting for athlete-specific pace. These effects summarize shared leg-level time structure and may reflect a combination of leg length, climb, terrain, route-choice complexity, and other common course features. The finish segment is shown for completeness but is not directly comparable to ordinary course legs. (a) Men and (b) Women

These leg effects should not be interpreted as a pure measure of intrinsic technical difficulty. Rather, they are common leg-level time effects that summarize how demanding each leg appeared at the field level. Nevertheless, they may provide useful quantitative information for post-race course review. In particular, they can help identify sections of a course that appeared broadly demanding, relatively straightforward, or potentially decisive across competitors, offering an additional perspective for course planners alongside traditional qualitative assessment.
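Under a complete split-time table, the common leg effects have a simple closed form. The sketch below assumes athletes in rows, legs in columns, and a sum-to-zero identification constraint, so that each leg effect is the mean log split time on that leg minus the grand mean. The function name `leg_effects` and the small table of split times are illustrative assumptions.

```python
import numpy as np

def leg_effects(split_seconds):
    """Common leg effects on the log-time scale for a complete table.

    Each effect is the leg's mean log split time minus the grand mean,
    so the effects sum to zero across legs (an assumed constraint).
    """
    log_t = np.log(np.asarray(split_seconds, dtype=float))
    return log_t.mean(axis=0) - log_t.mean()

# Synthetic illustration: 3 athletes, 4 legs, times in seconds.
splits = [
    [100, 200, 150, 50],
    [110, 230, 160, 55],
    [120, 210, 170, 60],
]
effects = leg_effects(splits)
slowest_leg = int(np.argmax(effects))  # zero-based index of the slowest leg
print(np.round(effects, 3), slowest_leg)
```

Because the effects are computed on raw log split times, a long leg receives a large effect even if it is technically easy, which is why they should be read as field-level time effects rather than as technical difficulty.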

Interpretation of the empirical illustration

Taken together, these results are consistent with the intended interpretation of the proposed decomposition at both the athlete level and the leg level. The speed dimension behaves as a summary of overall race level and is most closely aligned with total time and rank. The instability dimension captures residual leg-to-leg dispersion after adjusting for athlete-specific pace and common leg difficulty. The mistake-severity dimension highlights races in which one or a few unusually poor legs dominate the performance profile. At the same time, the fitted leg effects provide a complementary course-level summary of which legs were collectively slower or faster across the field.

The empirical value of the framework is therefore not that it replaces total time, but that it supplements total time with interpretable information about race structure. In particular, it helps distinguish between athletes who were similarly competitive overall but differed in execution consistency, and between races characterized by diffuse variability versus races dominated by one major mistake. This distinction is especially relevant in orienteering, where one navigational error can substantially alter the final result without fully describing the rest of the athlete’s race.

Moreover, the same qualitative distinction appears not only in the middle-distance final but also in the long-distance final, suggesting that the decomposition captures a recurring structural feature of elite split-time data rather than an isolated race-specific phenomenon.

Additional robustness illustration: WOC long-distance final

To assess whether the same qualitative pattern extends beyond the middle-distance final, we repeated the analysis on the 2025 World Orienteering Championships long-distance final, again separately for the men’s and women’s races. The results were qualitatively consistent with those reported above.

First, the speed component again showed the strongest association with final outcome. In the men’s long-distance race, model speed had correlation $0.987$ with total time, while model instability and mistake severity had correlations $0.680$ and $0.522$. In the women’s long-distance race, the corresponding correlations were $0.983$, $0.683$, and $0.544$. Thus, as in the middle-distance final, overall speed remained most closely aligned with final performance, while instability and mistake severity captured additional variation not reducible to total time alone.

Second, close-time pairs with markedly different within-race profiles again appeared in both races. In the men’s long-distance race, Christian Michelsen and Mate Baumholczer finished only 25 seconds apart, and Michelsen had larger model-based instability and mistake-severity values; the corresponding absolute differences were $0.090$ and $0.582$. Under LOLO sensitivity, the corresponding ranges were $[0.006, 0.103]$ and $[0.077, 0.691]$, with maximum absolute influence $0.084$ and $0.505$. In the women’s long-distance race, Luboslava Weissova and Elif Gokce Avci Ataman finished only 17 seconds apart, and Ataman had larger model-based instability and mistake-severity values; the corresponding absolute differences were $0.127$ and $0.785$. Under LOLO sensitivity, the corresponding ranges were $[0.004, 0.141]$ and $[0.086, 0.788]$, with maximum absolute influence $0.123$ and $0.699$. In both cases, the residual profiles indicated that one athlete’s race was shaped by a pronounced local disruption, whereas the other’s deviations were smaller or more diffuse across the course.

These additional long-distance results further support the proposed interpretation. The decomposition is not tied to a single race format or isolated example: across both middle- and long-distance elite races, total time is most directly reflected by speed, while instability and mistake severity provide complementary information about race execution.

Limitations of the current empirical illustration

The present empirical study should be viewed as an initial illustration rather than a comprehensive validation exercise. The main analyses were conducted on complete split-time tables. Although the number of excluded athletes was modest in the present race illustrations (see Table 4), incomplete split records may still be informative because they can arise from retirement, mispunch, or severe race disruption. Accordingly, the complete-case analysis may somewhat underrepresent the most extreme forms of instability or mistake severity. In practical terms, the current decomposition is most directly applicable to finishers with reasonably complete split profiles. As a minimal sensitivity check, we repeated the decomposition after allowing athletes with up to one missing leg and imputing missing leg times by legwise medians; the main correlation patterns were qualitatively unchanged. We also note that uncertainty intervals are wider for mistake-related summaries, which is expected because tail-focused max-residual measures are more sample-sensitive.
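The missing-data sensitivity check described above can be sketched as follows. The code keeps athletes with at most one missing leg and fills each gap with the median time on that leg. It assumes that missing splits are coded as `NaN`, that medians are computed on the raw seconds scale, and that they are taken over the retained athletes only; these details and the function name `impute_legwise_median` are illustrative assumptions.

```python
import numpy as np

def impute_legwise_median(split_seconds, max_missing=1):
    """Keep athletes with at most `max_missing` missing legs and impute.

    Returns the imputed table for retained athletes and a boolean mask
    marking which rows of the input were retained.
    """
    t = np.asarray(split_seconds, dtype=float)
    keep = np.isnan(t).sum(axis=1) <= max_missing
    kept = t[keep].copy()
    medians = np.nanmedian(kept, axis=0)       # legwise medians, ignoring NaN
    rows, cols = np.where(np.isnan(kept))
    kept[rows, cols] = medians[cols]           # fill each gap with its leg median
    return kept, keep

# Synthetic illustration: athlete 1 misses one leg, athlete 2 misses two.
nan = np.nan
splits = [
    [100, 200, 150],
    [110, nan, 160],
    [nan, nan, 170],
    [120, 220, 180],
]
filled, keep = impute_legwise_median(splits)
print(filled)
print(keep)
```

A legwise median ignores the athlete's own pace, so an imputed leg tends to pull the residual for that athlete toward the field rather than toward the athlete's baseline; this is one reason to treat the check as a minimal sensitivity analysis only.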

Although we report qualitatively consistent findings from both the middle- and long-distance championship finals, the empirical evidence remains limited to a small number of elite races and was designed primarily to assess interpretability rather than predictive performance. A fuller empirical study should examine additional races and compare patterns across formats, investigate richer handling of incomplete split records, and evaluate extensions that incorporate route-choice, GPS, and course covariates.

Nonetheless, even this limited illustration indicates that the proposed framework can reveal meaningful differences in race structure that are obscured by official outcome summaries alone. This suggests that split-time decomposition may provide a useful basis for more systematic quantitative performance analysis in elite orienteering.

Discussion

This paper proposes a simple and interpretable framework for decomposing orienteering performance from official split times into three components: speed, within-race instability, and mistake severity. The framework is designed to complement final time and rank by describing how performance was produced across the course. The empirical analyses show that split-time data contain meaningful execution-level structure beyond official summaries.

From a sports-analytics perspective, the contribution is intentionally practical. The proposed quantities are straightforward to compute, directly interpretable, and usable for athlete comparison, race review, exploratory analysis, and post-race course evaluation. At the same time, the additive model provides a statistical formulation that links these summaries to a transparent decomposition of baseline pace, common leg difficulty, and athlete-leg-specific departures.

For athlete feedback, the decomposition can be read as a compact post-race diagnostic. The speed component summarizes the athlete’s overall pace level in the race, instability summarizes how uneven the adjusted leg profile was, and mistake severity identifies the largest unexpectedly slow leg after accounting for the athlete’s own baseline and common leg difficulty. For organizers and course planners, the same fitted table provides leg effects that summarize which sections were broadly slow across the field. These summaries can help distinguish a race decided mainly by overall speed from one in which residual disruptions or highly demanding legs played a larger role, while still acknowledging that split times alone cannot fully separate physiological from navigational causes.

The proposed framework has several strengths. It is tailored to the split-time tables naturally produced by orienteering events, emphasizes interpretability over opaque composite scoring, and accommodates both descriptive summaries and model-based estimation. It also highlights a practically important distinction between overall pace and race execution quality.

A further practical strength is that the framework provides information not only about athletes but also about the course. The fitted common leg effects summarize which sections were systematically slow or fast across the field after adjusting for athlete-specific pace, while residual deviations indicate where unusually large athlete-specific disruptions occurred. Although these quantities do not by themselves determine whether a course was well designed, they can still support post-race review by highlighting legs that appeared uniformly straightforward, broadly demanding, or disproportionately decisive.

In future work, one could extend the present analysis in four focused directions: broader summaries of close-time pair distributions across races, richer treatment of incomplete split records, integration of route-choice/GPS/course features, and hierarchical modeling that pools information across races and events. In the present paper, however, our aim is more limited: to provide a transparent and interpretable decomposition framework together with practical diagnostics that clarify how the illustrative case studies and complete-case analyses were constructed.

At the same time, the current formulation has limitations. The additive model ignores dependence across legs and interactions between athletes and specific terrain or technical features. The mistake measure is intentionally simple and does not distinguish among different sources of severe delay, such as route-choice inefficiency, relocation after loss of contact, or physical collapse. In addition, comparisons across races remain delicate because race formats, course characteristics, and competitive depth vary.

The extensions outlined above may yield a richer quantitative description of orienteering performance while preserving the core decomposition proposed here (Larsson et al., 2002; Liu, 2019; Maddison and Ni Mhurchu, 2009; Mottet and Saury, 2013; Omodei and McLennan, 1994; Sailer et al., 2019; Tan, 2008).

Overall, we view the present framework as a first step toward a more systematic statistical analysis of split times in orienteering. Its primary value is a transparent decomposition of race performance beyond total time, which can serve as a basis for future methodological and applied work.

Footnotes

Acknowledgements

The author thanks friends from the Tsinghua University Orienteering Team (THO), whose discussions and shared enthusiasm for orienteering helped inspire this work.

ORCID iD

Xieheng Wang

Ethical considerations

Not applicable. This study used publicly available race-result records and did not involve intervention, recruitment, or identifiable private human-subject data.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Author contributions

Single-author manuscript. The author contributed to all aspects of the study, including conceptualization, data curation, methodology development, analysis, visualization, and writing.

Funding statement

The author received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interest

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

The data analyzed in this study were obtained from the official online results pages for the 2025 World Orienteering Championships middle-distance and long-distance finals. Leg-level split times were derived from publicly available XML result files containing control-level cumulative times and final results. The analysis repository is publicly available at . It contains scripts for XML parsing and split extraction, decomposition and summary-table construction, and figure reproduction for the main and appendix results, together with dependency and usage documentation.

Appendix

References

Albert, Baumer and Marchi (2024) Analyzing Baseball Data with R. 3rd ed. Boca Raton, FL: Chapman and Hall/CRC.

Baio and Blangiardo (2010) Bayesian hierarchical model for the prediction of football results. Journal of Applied Statistics 37(2): 253–264.

Barrell and Cooper (1986) Cognitive processes in orienteering: The interpretation of contours and responses to the map as a whole. Scientific Journal of Orienteering 2: 25–46.

Boullosa, Patrocinio, Renfree, et al. (2023) Short-term speed variability as an index of pacing stochasticity in athletic running events. Journal of Functional Morphology and Kinesiology 8(2): 86.

Casado, Hanley, Jiménez-Reyes, et al. (2021) Pacing profiles and tactical behaviors of elite runners. Journal of Sport and Health Science 10(5): 537–549.

Chaloupská (2015) Analysis of erring in selected orienteering runners. Journal of Human Sport and Exercise 10(Extra 1): 340–344.

Cheshikhina (1993) Relationship between running speed and cognitive processes in orienteering: Two empirical studies. Scientific Journal of Orienteering 9(1–2): 49–59.

Cook (1977) Detection of influential observation in linear regression. Technometrics 19(1): 15–18.

Corbí-Santamaría, Herrero-Molleda, García-López, et al. (2023) Variable pacing is associated with performance during the OCC ultra-trail du mont-blanc 2017–2021. International Journal of Environmental Research and Public Health 20(4): 3297.

Creagh and Reilly (1997) Physiological and biomechanical aspects of orienteering. Sports Medicine 24(6): 409–418.

Ćuk, Marković, Weiss, et al. (2024) Running variability in marathon: Evaluation of the pacing variables. Medicina 60(2): 218.

Díaz, Fernández-Ozcorta and Santos-Concejero (2018) The influence of pacing strategy on marathon world records. European Journal of Sport Science 18(6): 781–786.

Díaz, Renfree, Fernández-Ozcorta, et al. (2019) Pacing and performance in the 6 world marathon majors. Frontiers in Sports and Active Living 1: 54.

Frees (2004) Longitudinal and Panel Data: Analysis and Applications in the Social Sciences. Cambridge: Cambridge University Press.

Gasser and Vogel (2021) Older orienteers perform better—is experience key? Science & Sports 36(4): e151–e157.

Gasser (2016) Predictors of average speed in orienteering: The number of controls is crucial. Sportverletzung · Sportschaden 30(2): 90–94.

Gelman, Hill and Vehtari (2020) Regression and Other Stories. Cambridge: Cambridge University Press.

Guzmán and Pablos (2008) Perceptual-cognitive skills and performance in orienteering. Perceptual and Motor Skills 107(1): 159–164.

Haney and Mercer (2011) A description of variability of pacing in marathon distance running. International Journal of Exercise Science 4(2): 133–140.

Hébert-Losier, Platt and Hopkins (2015) Sources of variability in performance times at the world orienteering championships. Medicine & Science in Sports & Exercise 47(7): 1523–1530.

Hsiao (2003) Analysis of Panel Data. 2nd ed. Cambridge: Cambridge University Press.

International Orienteering Federation (2025a) World orienteering championships 2025: Long distance. https://orienteering.sport/event/world-orienteering-championships-2025/long/ (accessed 5 May 2026).

International Orienteering Federation (2025b) World orienteering championships 2025: Middle distance. https://orienteering.sport/event/world-orienteering-championships-2025/middle/ (accessed 5 May 2026).

International Ski Federation (2017) Rules for FIS Cross-Country Points 2017–2018. International Ski Federation.

Larsson, Burlin, Jakobsson, et al. (2002) Analysis of performance in orienteering with treadmill tests and physiological field tests using a differential global positioning system. Journal of Sports Sciences 20(7): 529–535.

Liu (2019) Visual search characteristics of precise map reading by orienteers. PeerJ 7: e7592.

Maddison and Ni Mhurchu (2009) Global positioning system: A new opportunity in physical activity measurement. International Journal of Behavioral Nutrition and Physical Activity 6: 73.

Millet, Divert, Banizette, et al. (2010) Changes in running pattern due to fatigue and cognitive load in orienteering. Journal of Sports Sciences 28(2): 153–160.

Mottet and Saury (2013) Accurately locating one’s spatial position in one’s environment during a navigation task: Adaptive activity for finding or setting control flags in orienteering. Psychology of Sport and Exercise 14(2): 189–199.

Nikolaidis and Knechtle (2017) Effect of age and performance on pacing of marathon runners. Open Access Journal of Sports Medicine 8: 171–180.

Omodei and McLennan (1994) Studying complex decision making in natural settings: Using a head-mounted video camera to study competitive orienteering. Perceptual and Motor Skills 79(3 Pt 2): 1411–1425.

Sailer, Martin, Gaia, et al. (2019) Analyzing performance in orienteering from movement trajectories and contextual information. In: Proceedings of the 15th international conference on location-based services, pp.141–146.

Seiler (1996) Cognitive processes in orienteering: A review. Scientific Journal of Orienteering 12(2): 50–65.

Sha, Jiang, et al. (2024) Pacing strategies in marathons: A systematic review. Heliyon 10(17): e36760.

Tan (2008) The application of GPS/GIS navigation and positioning system in cross-country orienteering. In: 2008 International conference on computer science and software engineering, pp.582–584. DOI: 10.1109/CSSE.2008.167.

Van Bulck, Pääkkönen, Jacquet, et al. (2025) Designing a sports orienteering contest: Physical versus cognitive skills in rogaining. International Transactions in Operational Research 33: 68–89.