Comparing algorithmic trading strategies by analogies to machine learning

Abstract

In technical analysis-based algorithmic trading strategies, we use historical price patterns to predict future prices and trade accordingly. This is analogous to machine learning, where we use existing data patterns to classify or predict new patterns. This paper uses this analogy and explains trading strategies as a machine learning classification problem. We derive simple approximations that relate the performance of trading strategies to machine learning statistics. We introduce a new performance measure, the Return Efficiency Index. This index provides a link between trading strategy return statistics and classification accuracy. It has a simple geometric interpretation, similar to the ROC index in machine learning, and can be used to compare strategies in terms of their ability to capture the potential returns possible with the underlying assets. We illustrate the proposed approach by a detailed comparison of daily trading strategies designed by analogy to nearest neighbor classification, widely used in machine learning, and to some strategies based on deep learning.

Keywords

technical analysis algorithmic trading machine learning nearest neighbor classification trading signals return efficiency index

Introduction

In the fundamental analysis of stock pricing, we focus on financial statements and economic statistics to predict the intrinsic value. By contrast, in technical analysis, we examine historical market data to find patterns that predict future asset price movements and design trading strategies accordingly. This is very similar to what we do in machine learning (Bishop, 2016; Hastle, 2018) when we use existing data to find patterns that help us classify or predict unseen data patterns. Not surprisingly, recent years have seen increased interest in using machine learning techniques to design new trading algorithms or improve existing ones.

These include combining technical indicators with machine learning models (Ayala et al., 2021; Beg et al., 2019), deep learning (Agrawal et al., 2019; Buachuen and Kantavat, 2023; Chen, 2020; Chen et al., 2023; Dash and Dash, 2016; Gu et al., 2019; Mahfooz et al., 2022; Park and Shin, 2019; Pholsri, 2023; Wang et al., 2018; Yao et al., 2022; Yu et al., 2023; Zhang et al., 2021; Zhong and Enke, 2019), natural language processing (Meesad and Boonmatham, 2023), ML-based FOREX trading (Gerlein et al., 2016), trend prediction (Grigoryan, 2017; Jagadisha et al., 2022; Nousi et al., 2018; Ntakaris et al., 2018; Oyewola et al., 2019; Pradip et al., 2018), sentiment analysis (Ashitha et al., 2023; Joiner et al., 2022; Mndawe et al., 2022), reinforcement learning (Tran Van et al., 2023; Tsantekidis et al., 2020; Zarkias et al., 2019), ensemble methods (Chavarnakul and Enke, 2009; Choudhry and Garg, 2008; Kim and Won, 2018; Pasupulety et al., 2019), portfolio construction (Ndikum, 2020; Padhi et al., 2022), risk management (Karthik, 2023), high-frequency and short-term trading (Bitvai and Cohn, 2015; Wong et al., 2023), to name just a few. Comparisons of some machine learning methods in technical analysis are presented in Hsu et al. (2016), Khan et al. (2023), Lumoring et al. (2023), and Patel et al. (2015).

In these models, the effectiveness of machine learning is evaluated using standard machine learning metrics (such as accuracy and the confusion matrix), and the effectiveness of trading is evaluated using financial metrics (such as return and volatility). We propose a unified approach to evaluating strategies using machine learning. We derive (approximate) expressions relating the entries of the confusion matrix to return and volatility. We introduce a new index, which we call the return efficiency index. This index is analogous to the area under the curve (AUC) measure in machine learning but reflects both the accuracy of the underlying classifier and the return characteristics of the assets. This new metric has a simple interpretation: it is the proportion of the possible return range, between the worst and the best trading strategies, that the strategy captures. It allows a direct comparison of the performance of different trading strategies with possibly different underlying assets, as well as a comparison to a random coin flip. It has a simple geometrical interpretation as the cosine similarity between the metrics that reflect the accuracy of the strategy as a classifier and the underlying return profiles of the traded securities. To illustrate, we consider two daily strategies built by analogy to nearest-neighbor classification: Growth-Value (trading the S&P Growth vs. S&P Value) and Market-Cash (trading S&P-500 vs. cash) strategies.

This paper is organized as follows: In Section “Machine-learning interpretation of trading strategies”, we describe trading strategies in terms of machine learning and list some important statistics from an ML viewpoint. In Section “Analysis of strategy performance by machine learning metrics” we relate returns to label counts. We compare simple and logarithmic returns in terms of accuracy and volatility and justify our choice of logarithmic returns. In Section “Analysis of volatility and sharpe ratios by corresponding machine learning metrics”, we present a framework to express volatility and Sharpe ratios in terms of the confusion matrix and its corresponding ratios. In Section “The “return efficiency index”” we introduce and discuss the return efficiency index to compare trading strategies. We illustrate this with a detailed example in Section “A detailed example”. In Section “Example: k-NN “winners” and “losers” trading strategies” we introduce “Winner” and “Loser” Growth-Value and Market-Cash strategies based on analogies to $k$ -NN. In Section “Results and discussion” we present a detailed comparison and show that “loser” strategies outperform “winner” strategies with the best choice being the Growth-Value strategies. In Section “Choosing the number of nearest neighbors and transaction costs” we address the question of choosing the best value of $k$ and show that this value is $k = 1$ . We summarize our key findings and conclude in Section “Concluding remarks”.

Machine-learning interpretation of trading strategies

In a typical trading strategy, for any subsequent time period $t_{i}$ , we choose between investing in asset $A$ or asset $B$ according to some rule for that strategy. In our discussion, we assume that our periods are days, and our rule tries to predict whether the daily return for $A$ is higher or lower than the corresponding daily return for $B$ .

In the ideal case, we would like to invest in $A$ for the day $d_{i}$ if the daily return for $A$ for that day is higher than the daily return for $B$ . Similarly, we would like to invest in $B$ for the day $d_{i}$ if the daily return for $B$ for that day is higher than the daily return for $A$ . Accordingly, we can assign the so-called “True” (the so-called “Ground Truth”) labels $T_{i}$ to each trading day $d_{i}$ as follows:

a true label $T_{i} =$ “+” for the day $d_{i}$ means we would like to be invested in $A$ for that day

a true label $T_{i} =$ “−” for the day $d_{i}$ means we would like to be invested in $B$ for that day.

However, in the context of trading algorithms, this assignment of Ground Truth labels can only be done for past historical data. We illustrate this with the following two examples.

Example 1

Security $A$ is the S&P-500 index and security $B$ is cash. We call such strategies Market-Cash strategies. They can be implemented by trading the SPY Exchange-Traded Fund. The benchmark for this class of strategies is the “Buy-and-Hold” strategy: passive investing in the S&P-500 index.

In these strategies, we would like to be invested on positive-return days and be in cash on negative-return days. We try to predict $P_{i} =$ “+” (invest) or $P_{i} =$ “−” (cash) decisions for day $d_{i}$ based on historical returns of the S&P-500 index.

Suppose that for 10 consecutive days $d_{1}, \dots, d_{10}$ , the daily returns $r_{i}$ for S&P-500 index are known. We can then assign daily true labels as shown in Table 1 below:

Table 1.

Assignment of true labels $T_{i}$ for market-cash (MC) strategies.

Day	$d_{1}$	$d_{2}$	$d_{3}$	$d_{4}$	$d_{5}$	$d_{6}$	$d_{7}$	$d_{8}$	$d_{9}$	$d_{10}$
$r_{i}$	1	3	$-$ 2	$-$ 4	2	2	$-$ 1	3	1	$-$ 1
$T_{i}$ :	$+$	$+$	$-$	$-$	$+$	$+$	$-$	$+$	$+$	$-$

Positive true labels (“+”) are assigned to six days $d_{1}$ , $d_{2}$ , $d_{5}$ , $d_{6}$ , $d_{8}$ and $d_{9}$ with non-negative daily returns $r_{i} \geq 0$ . On these six days, we would like to be invested in the S&P-500 index. By contrast, negative true labels (“−”) are assigned to the remaining four days $d_{3}$ , $d_{4}$ , $d_{7}$ , and $d_{10}$ with negative daily returns $r_{i} < 0$ . We want to remain in a cash position on these four days.
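The labelling rule used for Table 1 can be sketched in a few lines of Python. This is only an illustration of the rule stated above; the function name `true_labels_market_cash` is hypothetical, and the returns are the percentage values copied from the table.

```python
def true_labels_market_cash(returns):
    """Assign "+" to days with non-negative return and "-" otherwise."""
    return ["+" if r >= 0 else "-" for r in returns]


# Daily S&P-500 returns (in %) for days d_1..d_10, copied from Table 1.
table1_returns = [1, 3, -2, -4, 2, 2, -1, 3, 1, -1]

labels = true_labels_market_cash(table1_returns)
print(labels)
print("days with true '+':", labels.count("+"))
print("days with true '-':", labels.count("-"))
```

For Growth-Value strategies the same rule would be applied to the difference between the two daily returns instead of a single return series.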

Example 2

Security $A$ is the S&P Growth index and security $B$ is the S&P Value index. We call such strategies Growth-Value strategies. They can be implemented by trading the S&P Growth and S&P Value Exchange-Traded Funds, SPYG and SPYV respectively. In these strategies, we are always fully invested. The benchmark for this class of strategies is the “Buy-and-Hold” strategy: passive investing in the S&P Growth or S&P Value indices.

We try to predict $P_{i} =$ “+” (invest in S&P Growth index) or $P_{i} =$ “−” (invest in S&P Value index) decisions for day $d_{i}$ based on comparative returns of Value and Growth indices. For example, we believe that Growth stocks have higher average returns than Value stocks.

Suppose that for 10 consecutive days $d_{1}, d_{2}, \dots, d_{10}$ the daily returns $r_{i}^{G}$ for Growth and daily returns $r_{i}^{V}$ for Value indices are known. We can then assign daily True labels as shown in Table 2 below:

Table 2.

Assignment of true labels $T_{i}$ for growth-value (GV) strategies.

$d_{i}$	$d_{1}$	$d_{2}$	$d_{3}$	$d_{4}$	$d_{5}$	$d_{6}$	$d_{7}$	$d_{8}$	$d_{9}$	$d_{10}$
$r_{i}^{G}$	1	2	$-$ 3	$-$ 1	3	2	$-$ 2	5	1	1
$r_{i}^{V}$	0	1	$-$ 2	$-$ 2	2	1	$-$ 1	2	3	$-$ 1
$T_{i}$	$+$	$+$	$-$	$-$	$+$	$+$	$-$	$+$	$+$	$-$

Positive true labels (“+”) are assigned to six days $d_{1}$ , $d_{2}$ , $d_{5}$ , $d_{6}$ , $d_{8}$ and $d_{9}$ when the Growth index outperforms the Value index ( $r_{i}^{G} \geq r_{i}^{V}$ ). On these days, we would like to be invested in the S&P Growth index. By contrast, negative true labels (“−”) are assigned to the remaining four days $d_{3}$ , $d_{4}$ , $d_{7}$ , and $d_{10}$ when the Growth index underperforms the Value index ( $r_{i}^{G} < r_{i}^{V}$ ). On these days, we would like to invest in the S&P Value index.

Schematic Representation: A trading strategy decides on investing for the day $d_{i}$ by predicting a label $P_{i}$ according to some rule. For every day, we have a predicted label, and our trading strategy is implemented as follows:

label $P_{i} =$ “+” for the day $d_{i}$ means we invest in $A$ for that day

label $P_{i} =$ “−” for the day $d_{i}$ means we invest in $B$ for that day

If $P^{+}$ and $P^{-}$ represent days with predicted “+” and “−” labels, then schematically, our trading strategy can be represented as follows:

\text{Strategy:} \quad \underbrace{\overbrace{+, +, \dots, +, +}^{\text{predicted ``+'' } P^{+}}}_{\text{invested in } A}, \quad \underbrace{\overbrace{-, -, \dots, -, -}^{\text{predicted ``$-$'' } P^{-}}}_{\text{invested in } B}

Trading according to true labels represents an ideal strategy where $P_{i} = T_{i}$ : we never make a mistake in predicting labels. However, in practice, that is not possible. Therefore, our strategy would make mistakes in predicting “+” and/or “−” labels. As a result, its performance will be lower than that of an ideal strategy. The higher the accuracy of our strategy in predicting true labels, the better its performance.

A similar situation arises in many problems in machine learning, such as supervised learning. In these problems, we are given true “+” and “−” labels for some datasets and we need to design rules for classifying the unseen data points. In the language of machine learning, we are given true labels for past trading days and are asked to predict future labels. However, our predictions of labels are not perfect, and on some days, we predict labels incorrectly. The resulting classification statistics are summarized in the so-called 4-element “confusion” matrix. Therefore, as in machine learning, we can split all days into four disjoint groups:

$TP$ (true positive): on these days, the true label was $+$ and we correctly invested in security $A$

$FP$ (false positive): on these days, the true label was $-$ but we incorrectly predicted it as “+”. As a result, we invested in security $A$ instead of $B$ . Such misclassification of true “−” labels is called a Type I error

$TN$ (true negative): on these days, the true label was $-$ and we correctly invested in security $B$

$FN$ (false negative): on these days, the true label was $+$ but we incorrectly predicted it as “−”. As a result, we invested in security $B$ instead of $A$ . Such misclassification of true “+” labels is called a Type II error

Define $T^{+} = (TP + FN)$ and $T^{-} = (TN + FP)$ . Then, $T^{+}$ and $T^{-}$ represent the number of days with true “+” and “−” labels, respectively. From the above entries in the confusion matrix, we can compute the following ratios:

True Positive Rate (recall or sensitivity): $TPR = TP / T^{+}$

True Negative Rate (specificity): $TNR = TN / T^{-}$

Accuracy: $ACC = (TP + TN) / (T^{+} + T^{-})$

Positive Predictive Value (precision): $PPV = TP / (TP + FP)$

Negative Predictive Value: $NPV = TN / (TN + FN)$

Prevalence: $π^{+} = T^{+} / T$ , where $T = T^{+} + T^{-}$ is the total number of days

The entries of the confusion matrix, the rates, and the number of positive and negative labels are related as follows:

\begin{aligned} TP & = T^{+} \cdot TPR, FN = T^{+} (1 - TPR), \\ TN & = T^{-} \cdot TNR, FP = T^{-} (1 - TNR) \end{aligned}

(1)

Intuitively, the performance of the trading strategy would depend on the number of False Positive days (Type I error), False Negative days (Type II error), and the difference in returns of securities $A$ and $B$ on these days. We will quantify this more precisely in the next section.
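As a small illustration of these definitions, the following Python sketch counts the four confusion-matrix entries and computes the ratios listed above, using the convention $T^{+} = TP + FN$ and $T^{-} = TN + FP$ . The true labels are those of Table 1; the predicted labels are hypothetical and chosen only for the example, as are the function names.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count TP, FN, TN, FP for "+"/"-" labels, with T+ = TP + FN."""
    pairs = Counter(zip(true_labels, predicted_labels))
    return {
        "TP": pairs[("+", "+")],  # true "+", predicted "+"
        "FN": pairs[("+", "-")],  # true "+", predicted "-"
        "TN": pairs[("-", "-")],  # true "-", predicted "-"
        "FP": pairs[("-", "+")],  # true "-", predicted "+"
    }


def rates(c):
    """Ratios from the confusion counts (assumes no zero denominators)."""
    t_pos = c["TP"] + c["FN"]
    t_neg = c["TN"] + c["FP"]
    return {
        "TPR": c["TP"] / t_pos,
        "TNR": c["TN"] / t_neg,
        "ACC": (c["TP"] + c["TN"]) / (t_pos + t_neg),
        "PPV": c["TP"] / (c["TP"] + c["FP"]),
        "NPV": c["TN"] / (c["TN"] + c["FN"]),
        "prevalence": t_pos / (t_pos + t_neg),
    }


true_labels = list("++--++-++-")  # true labels T_i from Table 1
predicted = list("+--+++-+--")    # hypothetical predicted labels P_i

counts = confusion_counts(true_labels, predicted)
stats = rates(counts)
print(counts)
print(stats)
```

With these hypothetical predictions, equation (1) can be checked directly: $T^{+} \cdot TPR$ recovers $TP$ and $T^{-} \cdot TNR$ recovers $TN$ .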

Analysis of strategy performance by machine learning metrics

In machine learning, the metrics are derived from counting the number of correctly identified labels. On the other hand, the total returns are computed by multiplication. Therefore, to analyze the performance of trading strategies in terms of machine learning metrics, we need to consider approximations to total returns that are additive (Hudson and Gregoriou, 2015).

Approximating total returns

We start by considering two commonly used methods to express total returns in terms of suitable averages. Recall that if we have $k$ days with daily returns $r_{1}, \dots, r_{k}$ then the total return over these $k$ days is

R = (1 + r_{1}) (1 + r_{2}) \dots (1 + r_{k}) - 1

(2)

We consider the following two commonly used methods to approximately relate individual returns $\{r_{i}\}$ to the total return $R$ :

simple averaging method:

μ^{(AVE)} = \frac{(r_{1} + \dots + r_{k})}{k}

Then from the above equation (2), we have

R = (r_{1} + \dots + r_{k}) + O (r_{i}^{2}) \approx k μ^{(AVE)}

(3)

This method approximates the total return for $k$ days as $R \approx (r_{1} + \dots + r_{k})$ . This method is simple but it ignores the compounding effect.

average logarithmic return:

μ^{(LOG)} = \frac{\log (1 + r_{1}) + \dots + \log (1 + r_{k})}{k}

If $P_{0}, P_{1}, \dots, P_{k}$ denote prices on days $d_{0}, d_{1}, \dots, d_{k}$ , then for this return we have

\begin{aligned} μ^{(LOG)} & = \frac{1}{k} \left[\log \left(\frac{P_{1}}{P_{0}}\right) + \dots + \log \left(\frac{P_{k}}{P_{k - 1}}\right)\right] \\ & = \frac{1}{k} \log \left(\frac{P_{k}}{P_{0}}\right) = \frac{\log (1 + R)}{k} \end{aligned}

(4)

This method approximates the total return $R$ for $k$ days by $\log (1 + R)$ . Logarithmic returns account for the compounding effect by calculating the logarithm of the ratio of the final price $P_{k}$ to the initial price $P_{0}$ .

To compare and interpret the two methods, we proceed as follows: from the definition of the total compounded return over $k$ days in equation (2), we have

(1 + R)^{1 / k} = {[(1 + r_{1}) \dots (1 + r_{k})]}^{1 / k}

(5)

The term on the right in equation (5) is the geometric mean of terms $(1 + r_{i})$ . We can transform this geometric mean into an arithmetic mean by taking logarithms to obtain

\frac{1}{k} \log (1 + R) = \frac{\log (1 + r_{1}) + \dots + \log (1 + r_{k})}{k} = μ^{(LOG)}

(6)

Using the first-order Taylor expansion $f (x) \approx f (a) + f^{'} (a) (x - a)$ applied to the function $f (x) = \log (1 + x)$ at the point $a = 0$ gives us $\log (1 + x) \approx x$ . Applying this to $\log (1 + R)$ we obtain

k μ^{(LOG)} = \log (1 + R) \approx R

(7)

From equations (3) and (7), we can approximate total return by adding average simple or logarithmic returns. The accuracy would depend on the values of $r_{i}$ and the number of days $k$ .

All of these are approximations to true return. To illustrate the differences between the methods, let us consider two simple examples:

Example 3

Consider an investment of $100 that increases 20% ( $r_{1} = 0.2$ ) on the first day from $100 to $120 and then decreases 16.67% ( $r_{2} = - 0.167$ ) on the second day back to the original amount of $100. The total return is therefore $R = 0$ . For the two approximation methods, we have

simple averaging method: $μ^{(AVE)} = (r_{1} + r_{2}) / 2 = 0.0167$ . By equation (3), this gives an approximate total return of $2 μ^{(AVE)} \approx 0.033$ , that is, a final amount of about $103.33 instead of $100

average log return method: The average log return is calculated as

μ^{(LOG)} = \frac{\log (1 + 0.2) + \log (1 - 0.167)}{2} \approx 0

which is the same as the actual total return. In this example, simple averaging overestimates the total return. The average log return captures the compounding effect better.

Example 4

Suppose that for 10 days the % returns were $\{1, 3, - 2, - 4, 2, 2, - 1, 3, 1, - 1\}$ . There are $k_{1} = 6$ days $\{d_{1}, d_{2}, d_{5}, d_{6}, d_{8}, d_{9}\}$ with positive returns $r_{1} = 0.01$ , $r_{2} = 0.03$ , $r_{5} = 0.02$ , $r_{6} = 0.02$ , $r_{8} = 0.03$ , and $r_{9} = 0.01$ . There are $k_{2} = 4$ days $\{d_{3}, d_{4}, d_{7}, d_{10}\}$ with negative returns $r_{3} = - 0.02$ , $r_{4} = - 0.04$ , $r_{7} = - 0.01$ , and $r_{10} = - 0.01$ .

The total return for this strategy is

\begin{aligned} R & = (1.01 \cdot 1.03 \cdot 0.98 \cdot 0.96 \cdot 1.02 \cdot 1.02 \cdot 0.99 \\ & \quad \cdot 1.03 \cdot 1.01 \cdot 0.99) - 1 = 0.03821 \end{aligned}

(8)

A $100 investment would grow to $103.82 after 10 days. Let us compare the two methods for this example:

simple averaging method: we compute a simple average of all returns over 10-day period

\begin{aligned} μ^{(AVE)} & = \frac{1}{10} [0.01 + 0.03 - 0.02 - 0.04 + 0.02 + 0.02 \\ & \quad - 0.01 + 0.03 + 0.01 - 0.01] = 0.0040 \end{aligned}

Using this method, the approximate total return is $k μ^{(AVE)} = 0.04$ , so a $100 investment would grow to $104.00 after 10 days.

average logarithmic return:

μ^{(LOG)} = \frac{\log (1 + r_{1}) + \dots + \log (1 + r_{10})}{10} = 0.003750

Using this method, a $100 investment would grow to $103.75 after 10 days.

When we compare the two methods, we see that simple averaging gives the worst result because it ignores the effect of compounding. The logarithmic return method is better giving us the closest value.

The above example illustrates a well-known result: for long-term time-series data, simple averaging can generate very inaccurate results as it ignores the effect of compounding.
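The arithmetic of Example 4 can be checked with a short Python sketch. It computes the exact compounded return of equation (2) and the two additive approximations $k μ^{(AVE)}$ and $k μ^{(LOG)}$ from equations (3) and (7); the variable names are chosen for this illustration only.

```python
import math

# Daily returns of Example 4 (as fractions, not percent).
returns = [0.01, 0.03, -0.02, -0.04, 0.02, 0.02, -0.01, 0.03, 0.01, -0.01]
k = len(returns)

# Exact compounded total return, equation (2).
exact = math.prod(1 + r for r in returns) - 1

# Mean simple return and mean logarithmic return.
mu_ave = sum(returns) / k
mu_log = sum(math.log1p(r) for r in returns) / k

# Additive approximations to the total return.
approx_simple = k * mu_ave  # equation (3)
approx_log = k * mu_log     # equation (7)

print(f"exact total return      : {exact:.5f}")
print(f"k * mean simple return  : {approx_simple:.5f}")
print(f"k * mean log return     : {approx_log:.5f}")
```

Note that $k μ^{(LOG)}$ equals $\log (1 + R)$ exactly, so exponentiating it recovers the exact return; the approximation error enters only through the last step $\log (1 + R) \approx R$ .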

To see the relationship between the two returns, we proceed as follows (Hudson and Gregoriou, 2015). Recall the Taylor series expansion of $\log (1 + r)$ around $r = 0$

\begin{aligned} \log (1 + r) & = \log (1) + r - \frac{r^{2}}{2} + \frac{r^{3}}{3} - \frac{r^{4}}{4} + \dots \\ & = r - \frac{r^{2}}{2} + O (r^{3}) \end{aligned}

(9)

Therefore, for the mean logarithmic return in equation (4) we have

\begin{aligned} μ^{(LOG)} & \approx \frac{1}{k} \left[\left(r_{1} - \frac{r_{1}^{2}}{2}\right) + \dots + \left(r_{k} - \frac{r_{k}^{2}}{2}\right)\right] \\ & = μ^{(AVE)} - \frac{1}{2} \cdot \frac{r_{1}^{2} + \dots + r_{k}^{2}}{k} \\ & = μ^{(AVE)} - \frac{1}{2} \left[σ^{2} (r) + {(μ^{(AVE)})}^{2}\right] \\ & = \left[μ^{(AVE)} - \frac{1}{2} σ^{2} (r)\right] - \frac{1}{2} {(μ^{(AVE)})}^{2} \end{aligned}

For large $k$ and a small mean daily return $μ^{(AVE)}$ , the last term is negligible, and the mean daily logarithmic return $μ^{(LOG)}$ is approximately the mean daily return $μ^{(AVE)}$ minus one-half the variance $σ^{2} (r)$ of daily returns. In particular, the total return estimated from the logarithmic daily return $μ^{(LOG)}$ generally differs from the one estimated from the simple average return $μ^{(AVE)}$ , and the two estimates differ most at times of high volatility.
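A quick numerical check of this second-order relation, sketched in Python with the ten daily returns of Example 4 as sample data, is given below. The population variance is used for $σ^{2} (r)$ , which is the convention under which the mean of squared returns equals $σ^{2} (r) + (μ^{(AVE)})^{2}$ ; the variable names are illustrative.

```python
import math
import statistics

# Daily returns of Example 4 (as fractions).
returns = [0.01, 0.03, -0.02, -0.04, 0.02, 0.02, -0.01, 0.03, 0.01, -0.01]

mu_ave = statistics.fmean(returns)
var = statistics.pvariance(returns)  # population variance sigma^2(r)
mu_log = statistics.fmean(math.log1p(r) for r in returns)

# Second-order approximation: mu_ave - (sigma^2 + mu_ave^2) / 2.
second_order = mu_ave - 0.5 * (var + mu_ave ** 2)

print(f"mean log return           : {mu_log:.6f}")
print(f"second-order approximation: {second_order:.6f}")
print(f"mean simple return        : {mu_ave:.6f}")
```

For these returns the second-order expression agrees with $μ^{(LOG)}$ to about six decimal places, while $μ^{(AVE)}$ itself is larger by roughly half the variance.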

Example 5

We illustrate the differences in accuracy of using returns for different daily returns (from $r_{i} = 0.05 %$ to $r_{i} = 0.50 %$ ) for different time periods $T$ (from 25 to 250 days) in Table 3.

Table 3.

Comparison of exact (E), simple (S) and logarithmic (L) results for total return $R$ (%) for different daily return rates and time periods.

	25 days			50 days			125 days			250 days
$r_{i} (%)$	E	S	L	E	S	L	E	S	L	E	S	L
0.05	1.26	1.25	1.25	2.53	2.50	2.50	6.45	6.25	6.25	13.31	12.50	12.50
0.10	2.53	2.50	2.50	5.12	5.00	5.00	13.31	12.50	12.49	28.39	25.00	24.99
0.15	3.82	3.75	3.75	7.78	7.50	7.49	20.61	18.75	18.74	45.46	37.50	37.47
0.20	5.12	5.00	5.00	10.51	10.00	9.99	28.37	25.00	24.98	64.79	50.00	49.95
0.25	6.44	6.25	6.24	13.30	12.50	12.48	36.63	31.25	31.21	86.68	62.50	62.42
0.30	7.78	7.50	7.49	16.16	15.00	14.98	45.42	37.50	37.44	111.46	75.00	74.89
0.35	9.13	8.75	8.73	19.09	17.50	17.47	54.76	43.75	43.67	139.52	87.50	87.35
0.40	10.50	10.00	9.98	22.09	20.00	19.96	64.71	50.00	49.90	171.29	100.00	99.80
0.45	11.88	11.25	11.22	25.17	22.50	22.45	75.28	56.25	56.12	207.25	112.50	112.25
0.50	13.28	12.50	12.47	28.32	25.00	24.94	86.53	62.50	62.34	247.95	125.00	124.69

For example, when $r_{i} = 0.10 %$ , the exact return over $T = 25$ days is 2.53%, while the simple and logarithmic returns are both 2.50%. If we consider a longer period of 250 days, the exact return is 28.39%, compared to 25.00% for simple returns and 24.99% for logarithmic returns. For example, if we take $r_{i} = 0.40 %$ , then the exact return over 25 days is 10.50%, while the simple and logarithmic returns are 10.00% and 9.98%. Over 250 days, the exact return grows to 171.29%, compared to 100.00% for simple returns and 99.80% for logarithmic returns. The difference between the exact returns and the values obtained by simple or logarithmic averaging thus becomes larger for higher daily returns and longer time periods.
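The entries of Table 3 follow from three one-line formulas for a constant daily return $r$ over $k$ days: the exact value $(1 + r)^{k} - 1$ , the simple estimate $k r$ , and the logarithmic estimate $k \log (1 + r)$ . The Python sketch below reproduces the row for $r_{i} = 0.10 %$ ; the function name is hypothetical.

```python
import math


def total_return_estimates(daily_return, days):
    """Return (exact, simple, logarithmic) total-return values."""
    exact = (1 + daily_return) ** days - 1
    simple = days * daily_return
    log_based = days * math.log1p(daily_return)
    return exact, simple, log_based


# Reproduce the Table 3 row for a daily return of 0.10%.
for days in (25, 50, 125, 250):
    e, s, l = total_return_estimates(0.001, days)
    print(f"{days:3d} days: E={100 * e:6.2f}  S={100 * s:6.2f}  L={100 * l:6.2f}")
```

The other rows of the table are obtained by changing the first argument.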

In real market scenarios, we have both positive and negative returns. Table 14 compares real market results and shows that logarithmic returns are closer to exact values than simple averaging. As for the volatility of logarithmic returns, using an approximation in equation (9) and ignoring cubic and higher terms, we have

\begin{aligned} {VAR}^{(LOG)} & = \frac{1}{k} \left[\log^{2} (1 + r_{1}) + \dots + \log^{2} (1 + r_{k})\right] - {(μ^{(LOG)})}^{2} \\ & \approx \frac{1}{k} \left[{\left(r_{1} - \frac{r_{1}^{2}}{2}\right)}^{2} + \dots + {\left(r_{k} - \frac{r_{k}^{2}}{2}\right)}^{2}\right] \\ & \quad - \frac{1}{k^{2}} {\left[\left(r_{1} - \frac{r_{1}^{2}}{2}\right) + \dots + \left(r_{k} - \frac{r_{k}^{2}}{2}\right)\right]}^{2} \\ & \approx \frac{1}{k} \left[r_{1}^{2} + \dots + r_{k}^{2}\right] - \frac{1}{k^{2}} {\left[r_{1} + \dots + r_{k}\right]}^{2} = σ^{2} (r) \end{aligned}

(10)

Therefore, the volatility of logarithmic returns is approximately equal to the volatility of simple returns.

We can also derive some of these results more formally if we make the commonly used assumption in finance that prices follow a log-normal distribution (Satchell and Knight, 2001). Let $X$ be a normally distributed random variable, $X \sim N (μ, σ^{2})$ , and let $Y = \exp (X)$ . In other words, $Y$ represents prices and $X$ represents the returns. Then $Y$ follows a log-normal distribution with mean and variance given by (Johnson and Kotz, 1970)

E [Y] = \exp (μ + \frac{σ^{2}}{2})

and

V [Y] = [\exp (σ^{2}) - 1] \exp (2 μ + σ^{2})

Using the approximation $\exp (t) \approx 1 + t + t^{2} / 2$ , we obtain

μ^{(LOG)} = \exp (E [X]) - 1 = \exp (μ) - 1 \approx μ + \frac{μ^{2}}{2}

(11)

and for small $μ$ and $σ$ , we have

\begin{aligned} μ^{(AVE)} & = E [\exp (X) - 1] = E [\exp (X)] - 1 \\ & = \exp \left(μ + \frac{σ^{2}}{2}\right) - 1 \\ & \approx μ + \frac{σ^{2}}{2} + \frac{1}{2} \left(μ^{2} + μ σ^{2} + \frac{σ^{4}}{4}\right) \\ & \approx μ + \frac{σ^{2}}{2} + \frac{μ^{2}}{2} = μ^{(LOG)} + \frac{σ^{2}}{2} \end{aligned}

(12)

Therefore, the average of daily returns is approximately $σ^{2} / 2$ higher than the average of daily logarithmic returns. The difference would depend on the volatility of daily returns.
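The size of this gap can be checked numerically from the closed-form expressions in equations (11) and (12), without any simulation. In the Python sketch below, the values of $μ$ and $σ$ are hypothetical daily parameters chosen only for illustration.

```python
import math

# Hypothetical daily parameters of the normal variable X (assumed values).
mu = 0.0005
sigma = 0.01

# Closed-form expressions from equations (11) and (12).
mu_log = math.exp(mu) - 1
mu_ave = math.exp(mu + sigma ** 2 / 2) - 1

gap = mu_ave - mu_log
print(f"mu_ave - mu_log : {gap:.8f}")
print(f"sigma^2 / 2     : {sigma ** 2 / 2:.8f}")
```

For small $μ$ and $σ$ the two printed values agree closely; the agreement degrades as $σ$ grows, which is consistent with the approximation being second order.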

It is instructive to examine Table 18 and relate the differences in approximating annual return results to daily return volatility for S&P-500 provided in Table 14. Here are some observations:

For most years (16 out of 23), simple averaging has given us better results than logarithmic averaging.

For the simple averaging method, the average relative error was $μ = 17.92 %$ with a very large standard deviation of $σ = 32.88 %$ .

The approximation was the worst in 2011, when its relative error was 138.94%. The exact return for B&H was 1.89%, and the simple averaging method gave a value of 4.53%, a relative error of more than 100%.

For the average logarithmic returns, the average relative error was $μ = 8.31 %$ with a much lower standard deviation of $σ = 5.37 %$ . Its worst performance was in 2008, when the relative error was about 25%. Interestingly, this method has a relative error of less than 1% in 2011.

For the 5 years with the highest daily return volatility ( $σ^{2} (r) = 2.60 %$ in 2009, $σ^{2} (r) = 2.10 %$ in 2020, $σ^{2} (r) = 1.68 %$ in 2009, $σ^{2} (r) = 1.67 %$ in 2002 and $σ^{2} (r) = 1.67 %$ in 2002) we had very large differences in accuracy between the two methods.

For the 5 years with the lowest daily return volatility, ( $σ^{2} (r) = 0.42 %$ in 2017, $σ^{2} (r) = 0.63 %$ in 2006, $σ^{2} (r) = 0.65 %$ in 2005, $σ^{2} (r) = 0.71 %$ in 2014 and $σ^{2} (r) = 0.70 %$ in 2004) we had very small differences in both return estimates and they were within $10 %$ of the exact results.

Overall, the average relative error for logarithmic returns is roughly half that for simple returns (8.31% vs. 17.92%), with a much lower standard deviation (5.37% vs. 32.88%).

Therefore, for computing strategy returns, we will use average logarithmic returns. Using such returns supports time additivity and retains the compounding effect. We emphasize that although it provides a good approximation and captures the compounding effect, it is still an approximate value.

Evaluating strategy performance

The ability to relate returns in terms of averages allows us to analyze and compare the strategies by partitioning trading days into disjoint groups of correctly and incorrectly classified true labels and then (approximately) relate the strategy returns to average returns for these groups.

We proceed as follows. Let $r_{A}^{+}$ and $r_{B}^{+}$ denote the average log returns for $A$ and $B$ on days with true “ $+$ ” labels. Similarly, let $r_{A}^{-}$ and $r_{B}^{-}$ denote the average log returns for $A$ and $B$ on days with true “ $-$ ” labels.

Finally, let us define

Δ^{+} = r_{A}^{+} - r_{B}^{+} \quad \text{and} \quad Δ^{-} = r_{B}^{-} - r_{A}^{-}

A trading strategy invests in $A$ or $B$ for each day $d_{i}$ based on predicted labels $P_{i}$ for that day. Since the order of days is not important for the final return, we can schematically describe any trading strategy “Str” as follows:

\text{``str'':} \quad \underbrace{\overbrace{+, +, \dots, +, +}^{\text{predicted ``+'' } P^{+}}}_{\text{invested in } A}, \quad \underbrace{\overbrace{-, -, \dots, -, -}^{\text{predicted ``$-$'' } P^{-}}}_{\text{invested in } B}

We can measure the performance of strategies relative to the ideal (“max”) strategy and the worst (“min”) strategy, as well as to the buy-and-hold benchmark strategies $A$ and $B$ (tracking errors). Each of these strategies can be represented schematically as follows:

a generic strategy “str” is schematically composed of four parts, corresponding to entries in the confusion matrix as follows:

\text{``str'':} \quad \overbrace{\underbrace{\underbrace{+, \dots, +}_{TP \times r_{A}^{+}}}_{\text{inv. in } A}, \underbrace{\underbrace{+, \dots, +}_{FN \times r_{B}^{+}}}_{\text{inv. in } B}}^{\text{true ``+'' } T^{+}}, \quad \overbrace{\underbrace{\underbrace{-, \dots, -}_{TN \times r_{B}^{-}}}_{\text{inv. in } B}, \underbrace{\underbrace{-, \dots, -}_{FP \times r_{A}^{-}}}_{\text{inv. in } A}}^{\text{true ``$-$'' } T^{-}}

ideal (“max”) strategy: predicted labels are all correct, $P_{i} = T_{i}$ for each day $i$ . This means that we invest in $A$ on all days with true positive labels and invest in $B$ on all days with true negative labels.

\text{``max'':} \quad \underbrace{\overbrace{\underbrace{+, \dots, +}_{TP \times r_{A}^{+}}, \underbrace{+, \dots, +}_{FN \times r_{A}^{+}}}^{\text{true ``+'' labels } T^{+}}}_{\text{inv. in } A}, \quad \underbrace{\overbrace{\underbrace{-, \dots, -}_{TN \times r_{B}^{-}}, \underbrace{-, \dots, -}_{FP \times r_{B}^{-}}}^{\text{true ``$-$'' labels } T^{-}}}_{\text{inv. in } B}

worst (“min”) strategy: predicted labels are all incorrect $P_{i} \neq T_{i}$ for each day $i$ . This means that we invest in $B$ on all days with true positive labels and invest in $A$ on all days with true negative labels as follows:

\text{``min'':} \quad \underbrace{\overbrace{\underbrace{+, \dots, +}_{TP \times r_{B}^{+}}, \underbrace{+, \dots, +}_{FN \times r_{B}^{+}}}^{\text{true ``+'' labels } T^{+}}}_{\text{inv. in } B}, \quad \underbrace{\overbrace{\underbrace{-, \dots, -}_{TN \times r_{A}^{-}}, \underbrace{-, \dots, -}_{FP \times r_{A}^{-}}}^{\text{true ``$-$'' labels } T^{-}}}_{\text{inv. in } A}

Buy-and-Hold (B&H) “A” strategy: all predicted labels $P_{i} =$ “+” for each day $i$ . This means that we invest in security $A$ on all days as follows:

\text{B\&H } A: \quad \underbrace{\overbrace{\underbrace{+, \dots, +}_{TP \times r_{A}^{+}}, \underbrace{+, \dots, +}_{FN \times r_{A}^{+}}}^{\text{true ``+'' labels } T^{+}}, \quad \overbrace{\underbrace{-, \dots, -}_{TN \times r_{A}^{-}}, \underbrace{-, \dots, -}_{FP \times r_{A}^{-}}}^{\text{true ``$-$'' labels } T^{-}}}_{\text{invested in } A}

Buy-and-Hold (B&H) “B” strategy: all predicted labels $P_{i} =$ “−” for each day $i$ . This means that we invest in security $B$ on all days as follows:

\text{B\&H } B: \quad \underbrace{\overbrace{\underbrace{+, \dots, +}_{TP \times r_{B}^{+}}, \underbrace{+, \dots, +}_{FN \times r_{B}^{+}}}^{\text{true ``+'' labels } T^{+}}, \quad \overbrace{\underbrace{-, \dots, -}_{TN \times r_{B}^{-}}, \underbrace{-, \dots, -}_{FP \times r_{B}^{-}}}^{\text{true ``$-$'' labels } T^{-}}}_{\text{invested in } B}

We now make a connection between trading statistics and machine learning statistics by approximately expressing the strategy returns and other metrics in terms of confusion matrix counts and average daily logarithmic returns. We will use the notation “*” and the prefix ML (machine learning) to emphasize that these are approximations.

With this notation, we summarize ML-returns as follows:

\begin{aligned} R_{str}^{*} & = TP \cdot r_{A}^{+} + FN \cdot r_{B}^{+} + TN \cdot r_{B}^{-} + FP \cdot r_{A}^{-} \\ R_{max}^{*} & = T^{+} r_{A}^{+} + T^{-} r_{B}^{-} \\ R_{min}^{*} & = T^{+} r_{B}^{+} + T^{-} r_{A}^{-} \\ R_{A}^{*} & = T^{+} r_{A}^{+} + T^{-} r_{A}^{-} \\ R_{B}^{*} & = T^{+} r_{B}^{+} + T^{-} r_{B}^{-} \end{aligned}

(13)
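As a minimal sketch of equation (13), the ML-returns can be computed directly from the confusion matrix counts and the four average daily log returns. The function name `ml_returns`, its argument names, and the dictionary keys are illustrative choices, not part of the paper. The sample inputs are the rounded Growth (as $A$) and Value (as $B$) figures from the detailed example later in the text.

```python
def ml_returns(tp, fn, tn, fp, r_a_pos, r_a_neg, r_b_pos, r_b_neg):
    """Approximate (ML) log returns of equation (13).

    r_a_pos / r_a_neg: average daily log return of A on true "+" / "-" days.
    r_b_pos / r_b_neg: the same averages for B.
    """
    t_pos = tp + fn  # number of days with true "+" labels
    t_neg = tn + fp  # number of days with true "-" labels
    return {
        "str": tp * r_a_pos + fn * r_b_pos + tn * r_b_neg + fp * r_a_neg,
        "max": t_pos * r_a_pos + t_neg * r_b_neg,
        "min": t_pos * r_b_pos + t_neg * r_a_neg,
        "A": t_pos * r_a_pos + t_neg * r_a_neg,
        "B": t_pos * r_b_pos + t_neg * r_b_neg,
    }


# Illustrative inputs only: A = Growth, B = Value, counts of Strategy X.
example = ml_returns(tp=4, fn=2, tn=2, fp=2,
                     r_a_pos=0.0246, r_a_neg=-0.0177,
                     r_b_pos=0.0148, r_b_neg=-0.0151)
```

By construction, the `"str"` entry always lies between the `"min"` and `"max"` entries whenever $Δ^{+} \geq 0$ and $Δ^{-} \geq 0$.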

The underperformance of the strategy relative to the ideal strategy is

R_{max}^{*} - R_{s t r}^{*} = \underset{type II loss}{\underset{⏟}{FN \cdot Δ^{+}}} + \underset{type I loss}{\underset{⏟}{FP \cdot Δ^{-}}}

The first contribution to underperformance is due to our error in predicting days with true “+” labels. The second contribution is due to our error in predicting days with true “−” labels.

The ML-tracking errors ${TE}_{A}^{*}$ and ${TE}_{B}^{*}$ relative to $A$ and $B$ are given by

\begin{aligned} {TE}_{A}^{*} & = R_{s t r}^{*} - R_{A}^{*} = - FN \cdot Δ^{+} + TN \cdot Δ^{-} \\ {TE}_{B}^{*} & = R_{s t r}^{*} - R_{B}^{*} = TP \cdot Δ^{+} - FP \cdot Δ^{-} \end{aligned}

To outperform benchmark $A$ we need $FN < TN (Δ^{-} / Δ^{+})$. To outperform benchmark $B$ we need $FP < TP (Δ^{+} / Δ^{-})$. To outperform both benchmarks, we need

FN < TN (Δ^{-} / Δ^{+}) and FP < TP (Δ^{+} / Δ^{-})
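The tracking errors and the two outperformance conditions can be checked numerically. The sketch below assumes $Δ^{+} > 0$ and $Δ^{-} > 0$; the function names are hypothetical.

```python
def ml_tracking_errors(tp, fn, tn, fp, delta_pos, delta_neg):
    """ML-tracking errors of the strategy relative to benchmarks A and B."""
    te_a = -fn * delta_pos + tn * delta_neg
    te_b = tp * delta_pos - fp * delta_neg
    return te_a, te_b


def beats_benchmarks(tp, fn, tn, fp, delta_pos, delta_neg):
    """Outperformance conditions; assumes delta_pos > 0 and delta_neg > 0."""
    beats_a = fn < tn * (delta_neg / delta_pos)
    beats_b = fp < tp * (delta_pos / delta_neg)
    return beats_a, beats_b
```

Each condition is just the statement that the corresponding tracking error is positive, so the two functions should always agree in sign.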

The ML-based metric for the underperformance of Buy-and-Hold in $A$ and $B$ relative to the ideal strategy is

R_{max}^{*} - R_{A}^{*} = T^{-} Δ^{-} and R_{max}^{*} - R_{B}^{*} = T^{+} Δ^{+}

(14)

On the $T^{-} = (TN + FP)$ days with true “−” labels, security $B$ outperforms security $A$ by $Δ^{-} = (r_{B}^{-} - r_{A}^{-})$ per day on average. Unlike the ideal strategy, the Buy-and-Hold strategy BH-A is fully invested in $A$ during these days, resulting in the loss given by equation (14). Although we make no trading decisions in a Buy-and-Hold strategy, we can intuitively split this loss into two components. The first is the potential return lost on the $TN$ days, which our strategy correctly identified as “−”. The second is the potential return lost on the $FP$ days, which our strategy incorrectly identified as “+”.

Machine-learning interpretations for market-cash

For the Market-Cash strategies, security $A$ is the S&P-500 index, and security $B$ is cash. For the S&P-500, we have $r_{A}^{+} \geq 0$ and $r_{A}^{-} < 0$. For cash, we have $r_{B}^{-} = r_{B}^{+} = 0$, and therefore $Δ^{+} = r_{A}^{+}$ and $Δ^{-} = - r_{A}^{-}$. The expressions relating returns to machine learning metrics therefore reduce to a much simpler form. For the Market-Cash strategy we have:

MC: \overset{true ''+'' labels T^{+}}{\overset{⏞}{\underset{S\&P}{\underset{⏟}{\underset{T P}{\underset{⏟}{+, \dots, +}}}}, \underset{cash}{\underset{⏟}{\underset{F N}{\underset{⏟}{+, \dots, +}}}}}}, \overset{true ''-'' labels T^{-}}{\overset{⏞}{\underset{cash}{\underset{⏟}{\underset{T N}{\underset{⏟}{-, \dots, -}}}}, \underset{S\&P}{\underset{⏟}{\underset{F P}{\underset{⏟}{-, \dots, -}}}}}}

The expressions for ML-returns in equation (13) are reduced to a much simpler form

\begin{aligned} R_{s t r}^{*} & = TP \cdot r_{A}^{+} + FP \cdot r_{A}^{-} \\ R_{max}^{*} & = T^{+} r_{A}^{+} \\ R_{min}^{*} & = T^{-} r_{A}^{-} \\ R_{A}^{*} & = T^{+} r_{A}^{+} + T^{-} r_{A}^{-} \end{aligned}

(15)

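The difference in ML-returns with the ideal strategy is given below, where we write $R_{M C}^{*}$ for $R_{s t r}^{*}$ and $r_{M}^{\pm}$ for $r_{A}^{\pm}$ because $A$ is the market index:

R_{max}^{*} - R_{M C}^{*} = FN \cdot r_{M}^{+} - FP \cdot r_{M}^{-}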

This has a simple and intuitive interpretation: our under-performance relative to the ideal strategy is composed of two parts:

for FP days, we misclassified the true negative days as “+” and invested on losing days, thereby losing approximately $(- FP \cdot r_{M}^{-})$;

for FN days, we misclassified the true positive days as “−”, stayed in cash, and missed the positive days. This resulted in an “opportunity” loss of approximately $(FN \cdot r_{M}^{+})$.

Finally, from equations (15) we can estimate the ML-tracking error as

{TE}_{A}^{*} = R_{M C}^{*} - R_{A}^{*} = \underset{type II loss}{\underset{⏟}{- FN \cdot r_{A}^{+}}} \; \underset{gain on TN days}{\underset{⏟}{- TN \cdot r_{A}^{-}}}

Therefore, the tracking error consists of two components:

For $FN$ days, the market was positive. The market-cash strategy was in cash, while the Buy-and-Hold was invested. This contributed a loss of approximately $FN \cdot r_{M}^{+}$ to the tracking error.

For $TN$ days, the market was negative. The market-cash strategy was in cash, while the Buy-and-Hold was invested. This contributed a gain of approximately $TN \cdot | r_{M}^{-} |$ to the tracking error.

Analysis of volatility and Sharpe ratios by corresponding machine learning metrics

We now consider volatility. We will use $S$ to denote volatility with appropriate subscripts to indicate specific trading strategies or benchmarks, and $S^{*}$ to represent ML-volatility computed from the confusion matrix. If $σ$ denotes the standard deviation of daily returns, then the volatility over $t$ days is given by $S = σ \sqrt{t}$ . Recall that the standard deviation of simple returns is approximately equal to the standard deviation of logarithmic returns as shown in equation (10). Let $π^{+} = T^{+} / T$ denote the prevalence. As before, we assume that daily returns are independent.

Let $σ_{A}$ and $σ_{B}$ denote the standard deviations of daily returns for $A$ and $B$. Then the volatilities of the Buy-and-Hold strategies in $A$ and $B$ are

S_{A} = \sqrt{T} σ_{A} and S_{B} = \sqrt{T} σ_{B}

In Appendix A we derive the following expressions for ML-volatility and ML-Sharpe ratios for different strategies. We summarize our results below: ML-Volatility:

ideal strategy:

S_{max}^{*} = \sqrt{π^{+} S_{A}^{2} + (1 - π^{+}) S_{B}^{2}}

worst strategy:

S_{min}^{*} = \sqrt{π^{+} S_{B}^{2} + (1 - π^{+}) S_{A}^{2}}

generic strategy:

S_{s t r}^{*} = \sqrt{(\frac{TPR}{PPV}) π^{+} S_{A}^{2} + (\frac{TNR}{NPV}) (1 - π^{+}) S_{B}^{2}}

(16)

Ignoring the risk-free rate, we can write the expressions relating the Sharpe Ratios with confusion matrix entries. We will use the notation $SR$ to denote the Sharpe ratio. ML-Sharpe Ratio:

For the Buy-and-Hold strategies $A$ and $B$ we have

\begin{aligned} {SR}_{A}^{*} & = \frac{T^{+} r_{A}^{+} + T^{-} r_{A}^{-}}{S_{A}} \\ {SR}_{B}^{*} & = \frac{T^{+} r_{B}^{+} + T^{-} r_{B}^{-}}{S_{B}} \end{aligned}

ideal strategy:

{SR}_{max}^{*} = \frac{T^{+} r_{A}^{+} + T^{-} r_{B}^{-}}{\sqrt{π^{+} S_{A}^{2} + (1 - π^{+}) S_{B}^{2}}}

worst strategy:

{SR}_{min}^{*} = \frac{T^{+} r_{B}^{+} + T^{-} r_{A}^{-}}{\sqrt{π^{+} S_{B}^{2} + (1 - π^{+}) S_{A}^{2}}}

generic strategy:

{SR}_{s t r}^{*} = \frac{TP \cdot r_{A}^{+} + FN \cdot r_{B}^{+} + TN \cdot r_{B}^{-} + FP \cdot r_{A}^{-}}{\sqrt{(P^{+} / T) S_{A}^{2} + (P^{-} / T) S_{B}^{2}}}

The proposed ML-metrics are approximations and cannot be used to exactly compare two trading strategies. However, these approximations of strategy performance could offer insights to explain the relative underperformance of strategies in terms of prediction accuracy. For example, if we can improve the strategy by identifying only one extra true positive day, this means that we increase $TP$ by 1 and decrease $FN$ by 1. This results in an increase of $Δ^{+} = (r_{A}^{+} - r_{B}^{+})$ to the return in the numerator. This will also increase $P^{+}$ by 1 and decrease $P^{-}$ by 1. As a result, the expression in the denominator under the square root will increase by $(S_{A}^{2} - S_{B}^{2}) / T = (σ_{A}^{2} - σ_{B}^{2})$ .

On the other hand, if we can improve the strategy by identifying only one extra true negative day, this means that we increase $TN$ by 1 and decrease $FP$ by 1. This results in an increase of $Δ^{-} = (r_{B}^{-} - r_{A}^{-})$ to the return in the numerator. This will also decrease $P^{+}$ by 1 and increase $P^{-}$ by 1. As a result, the expression in the denominator under the square root will decrease by $(S_{A}^{2} - S_{B}^{2}) / T$ .

Finally, if we can improve the strategy by identifying one extra true positive day and one extra true negative day, then $P^{+}$ and $P^{-}$ remain unchanged. Therefore, the volatility remains unchanged, whereas the return increases by $(Δ^{+} + Δ^{-})$, increasing the Sharpe’s ratio.

This means that, when $A$ is the more volatile security ($S_{A} > S_{B}$), we always increase both the return and the Sharpe’s ratio by increasing our accuracy in identifying only the negative labels, because the return rises while the volatility falls. On the other hand, if we increase our accuracy in identifying only the positive labels, we increase the return but not necessarily the Sharpe’s ratio, because the volatility rises as well. Increasing the accuracy for both positive and negative labels increases both the return and the Sharpe’s ratio.

For the Market-Cash strategies, $A$ is the S&P-500 index and $B$ is cash. In this case, we have $r_{B}^{-} = 0$, $r_{B}^{+} = 0$, and $S_{B} = 0$. Therefore, the above expressions for volatility and Sharpe’s ratio reduce to a much simpler form

S_{M C}^{*} = S_{A} \sqrt{P^{+} / T} and {SR}_{M C}^{*} = \frac{TP \cdot r_{A}^{+} + FP \cdot r_{A}^{-}}{S_{A} \sqrt{P^{+} / T}}
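The following sketch computes the ML-volatility of a generic strategy in two ways: from the predicted-label counts $P^{+}$ and $P^{-}$ (as in the Sharpe ratio denominator) and from the rates in equation (16). It relies on the identity $(TPR / PPV) π^{+} = P^{+} / T$, so both forms should agree. Function names are hypothetical, and the rate-based form assumes $TP > 0$ and $TN > 0$ so that the ratios are defined.

```python
import math


def ml_volatility(tp, fn, tn, fp, s_a, s_b):
    """ML-volatility from predicted-label counts P+ and P-."""
    t = tp + fn + tn + fp
    p_pos = tp + fp  # days predicted "+": invested in A
    p_neg = tn + fn  # days predicted "-": invested in B
    return math.sqrt((p_pos / t) * s_a ** 2 + (p_neg / t) * s_b ** 2)


def ml_volatility_eq16(tp, fn, tn, fp, s_a, s_b):
    """Same quantity written with TPR, PPV, TNR, NPV as in equation (16)."""
    t = tp + fn + tn + fp
    prevalence = (tp + fn) / t
    tpr, ppv = tp / (tp + fn), tp / (tp + fp)
    tnr, npv = tn / (tn + fp), tn / (tn + fn)
    return math.sqrt((tpr / ppv) * prevalence * s_a ** 2
                     + (tnr / npv) * (1 - prevalence) * s_b ** 2)


def ml_sharpe(ml_return, ml_vol):
    """ML-Sharpe ratio, ignoring the risk-free rate (same units assumed)."""
    return ml_return / ml_vol
```

Setting `s_b = 0.0` gives the Market-Cash special case $S_{A} \sqrt{P^{+} / T}$.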

The “return efficiency index”

In the previous section, we derived (approximate) expressions for strategy performance in terms of machine learning metrics. But how should we compare any two strategies? Many metrics are used in finance, such as tracking error, Sharpe ratio, drawdowns, and others. One drawback of such metrics is that they do not take into account how close the trading strategy is to the ideal case. Maybe it was not possible to significantly outperform the benchmark.

Is there a way to quantify this and is there a way to compare strategies not by comparing their relative absolute performance but by assigning them a universal score from 0 to 1, reflecting their ability to capture the maximum possible return?

We suggest proceeding as follows: consider the worst and the best trading strategies with corresponding returns $R_{min}$ and $R_{max}$, respectively. For any strategy, we have $R_{min} \leq R_{s t r} \leq R_{max}$. Unless all daily returns are the same, we have $R_{min} < R_{max}$. We define the strategy return efficiency index $I_{s t r}$ as

(return) efficiency index I_{s t r} = \frac{R_{s t r} - R_{min}}{R_{max} - R_{min}}

(17)

For brevity, we will refer to it as the efficiency index. The above formula is analogous to “min-max” scaling of data widely used in machine learning.

The above definition implies that for any strategy $0 \leq I_{s t r} \leq 1$ . The numerator $(R_{s t r} - R_{min})$ is the excess return compared to the worst return $R_{min}$ , whereas the denominator is the maximum possible excess return generated by predicting all true labels correctly.

Therefore, the return capture efficiency has the following simple interpretation: it tells us what fraction of the possible return range (from best to worst possible strategy) is captured by our strategy. For the worst strategy, this index is 0, and for the best possible strategy, it is 1.
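Equation (17) is a min-max scaling and takes only a few lines to compute. In the sketch below the function name is hypothetical, and the guard against $R_{max} \leq R_{min}$ reflects the degenerate case in which all daily returns coincide.

```python
def efficiency_index(r_str, r_min, r_max):
    """Return efficiency index of equation (17): min-max scaled return."""
    if r_max <= r_min:
        raise ValueError("r_max must exceed r_min for the index to be defined")
    return (r_str - r_min) / (r_max - r_min)
```

For instance, with the bounds $R_{min}^{*} = 0.0180$ and $R_{max}^{*} = 0.0872$ from the Growth-Value example later in the text, a strategy return of 0.06 maps to an index of about 0.607.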

For any strategy, its efficiency index would depend on how good the strategy is in predicting positive and negative true labels. To express the efficiency index in terms of machine learning metrics, we compute ML-efficiency index $I_{s t r}^{*}$ as follows. As before, recall $Δ^{+} = r_{A}^{+} - r_{B}^{+}$ and $Δ^{-} = r_{B}^{-} - r_{A}^{-}$ . Note that $Δ^{+} > 0$ and $Δ^{-} > 0$ .

From our equations (13) we obtain the following for the ML-efficiency index:

I_{s t r}^{*} = \frac{TP \cdot Δ^{+} + TN \cdot Δ^{-}}{T^{+} Δ^{+} + T^{-} Δ^{-}}

(18)

We can rewrite this in terms of recall $TPR$ and specificity $TNR$ as follows:

\begin{aligned} I_{s t r}^{*} & = TPR \cdot [\frac{T^{+} Δ^{+}}{T^{+} Δ^{+} + T^{-} Δ^{-}}] \\ & \quad + TNR \cdot [\frac{T^{-} Δ^{-}}{T^{+} Δ^{+} + T^{-} Δ^{-}}] \end{aligned}

(19)

To interpret the index in equation (19) we note that for the ideal strategy $I_{max} = 1$ and its efficiency index can be written as

I_{max}^{*} = [\frac{T^{+} Δ^{+}}{T^{+} Δ^{+} + T^{-} Δ^{-}}] + [\frac{T^{-} Δ^{-}}{T^{+} Δ^{+} + T^{-} Δ^{-}}]

(20)

The first term of $I_{max}^{*}$ in equation (20) is the fraction of the excess return over $R_{min}$ captured by the positive true labels, and the second term is the fraction of the excess return captured by the negative true labels.

Therefore, the (return) efficiency index $I_{s t r}$ of any strategy has the following interpretation: it is the weighted sum of the fractions of excess return captured by the positive and negative true labels, with recall $TPR$ and specificity $TNR$ as the corresponding weights.

Next, consider a buy-and-hold strategy of investing only in $A$ or only in $B$. These can be thought of as trading strategies where all predicted labels are “+” (for $A$) or all are “−” (for $B$).

From equations (13) and (19) the return efficiency indices for these buy-and-hold strategies are

\begin{aligned} I_{A}^{*} & = \frac{T^{+} Δ^{+}}{T^{+} Δ^{+} + T^{-} Δ^{-}} \\ I_{B}^{*} & = \frac{T^{-} Δ^{-}}{T^{+} Δ^{+} + T^{-} Δ^{-}} \end{aligned}

(21)

Therefore, we can rewrite the efficiency index for any strategy in terms of the efficiency indices of the buy-and-hold strategies for $A$ and $B$ as follows:

I_{s t r}^{*} = TPR \cdot I_{A} + TNR \cdot I_{B}

(22)

This provides an alternative interpretation of the efficiency index: it is the weighted sum of the efficiency indices of buy-and-hold strategies taken with weights $TPR$ and $TNR$ respectively.
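To illustrate that equations (18) and (22) describe the same quantity, the sketch below computes the ML-efficiency index both directly from the counts and as the weighted sum $TPR \cdot I_{A}^{*} + TNR \cdot I_{B}^{*}$. Function names are hypothetical; the inputs are assumed to satisfy $T^{+} > 0$, $T^{-} > 0$ and $Δ^{\pm} > 0$.

```python
def ml_efficiency_index(tp, fn, tn, fp, delta_pos, delta_neg):
    """ML-efficiency index from confusion counts, equation (18)."""
    t_pos, t_neg = tp + fn, tn + fp
    return ((tp * delta_pos + tn * delta_neg)
            / (t_pos * delta_pos + t_neg * delta_neg))


def ml_efficiency_from_rates(tp, fn, tn, fp, delta_pos, delta_neg):
    """Same index as TPR * I_A + TNR * I_B, equation (22)."""
    t_pos, t_neg = tp + fn, tn + fp
    denom = t_pos * delta_pos + t_neg * delta_neg
    i_a = t_pos * delta_pos / denom  # buy-and-hold A, equation (21)
    i_b = t_neg * delta_neg / denom  # buy-and-hold B, equation (21)
    tpr, tnr = tp / t_pos, tn / t_neg
    return tpr * i_a + tnr * i_b
```

A perfect classifier (no false positives or false negatives) yields an index of 1, and a classifier that is wrong on every day yields 0.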

We can re-write the above equation (22) in terms of the (unscaled) dot product as follows:

I_{s t r}^{*} = \underset{U_{1}}{\underset{⏟}{(TPR, TNR)}} \cdot \underset{U_{2}}{\underset{⏟}{(I_{A}, I_{B})}}

The above expression for $I_{s t r}^{*}$ is the dot product of two vectors:

the vector $U_{1} = (TPR, TNR)$ is machine learning related. It describes the accuracy of our strategy as a classifier predicting positive and negative labels.

the vector $U_{2} = (I_{A}, I_{B})$ is returns-related. It describes the return “profiles” defined by the benchmarks $A$ and $B$ .

Geometrically, this unscaled dot product $U_{1} \cdot U_{2}$ is related to the cosine of the angle $α$ between $U_{1}$ and $U_{2}$ via $U_{1} \cdot U_{2} = \| U_{1} \| \, \| U_{2} \| \cos α$. This is illustrated in Figure 1.

Figure 1.

AUC Score (left) and Return Efficiency Index (right).

We can compare our strategy to benchmarks by comparing the corresponding efficiency indices instead of examining the tracking errors. From equations (21) and (18) we have

\begin{aligned} I_{s t r}^{*} - I_{A}^{*} & = \frac{TN \cdot Δ^{-} - FN \cdot Δ^{+}}{T^{+} Δ^{+} + T^{-} Δ^{-}} \\ I_{s t r}^{*} - I_{B}^{*} & = \frac{TP \cdot Δ^{+} - FP \cdot Δ^{-}}{T^{+} Δ^{+} + T^{-} Δ^{-}} \end{aligned}

(23)

Therefore, our ability to outperform the benchmark $A$ depends on our ability to correctly identify labels and on the relative values of $Δ^{+}$ and $Δ^{-}$. From equation (23) we have

\begin{aligned} I_{s t r}^{*} - I_{A}^{*} > 0 & \leftrightarrow FN < TN (Δ^{-} / Δ^{+}) \\ I_{s t r}^{*} - I_{B}^{*} > 0 & \leftrightarrow FP < TP (Δ^{+} / Δ^{-}) \end{aligned}

Finally, consider a “random flip” strategy $p$-Random where we flip a coin to decide the predicted labels. Let $p$ be the probability of “heads” (we invest in security $A$) and $(1 - p)$ be the probability of “tails” (we invest in security $B$). This means that, on average, out of $T^{+}$ true “+” labels we correctly identify $p T^{+}$, and out of $T^{-}$ true “−” labels we correctly identify $(1 - p) T^{-}$. This is equivalent to $TPR = p$ and $TNR = (1 - p)$. Therefore, from equation (19) we obtain

I_{random}^{*} = p \cdot I_{A}^{*} + (1 - p) \cdot I_{B}^{*}

(24)

Therefore, the efficiency index of the random strategy is a weighted average of $I_{A}^{*}$ and $I_{B}^{*}$ with weights $p$ and $(1 - p)$, respectively.

From equations (19) and (24), it follows that to outperform a random strategy, we must have

\begin{aligned} I_{s t r}^{*} > I_{random}^{*} & \leftrightarrow (TPR - p) I_{A}^{*} + (TNR - (1 - p)) I_{B}^{*} > 0 \\ & \leftrightarrow TPR \cdot I_{A}^{*} - (1 - TNR) \cdot I_{B}^{*} > p (I_{A}^{*} - I_{B}^{*}) \end{aligned}

If $I_{A}^{*} > I_{B}^{*}$, then our strategy outperforms a $p$-random flip strategy if

\frac{TPR \cdot I_{A}^{*} - (1 - TNR) \cdot I_{B}^{*}}{I_{A}^{*} - I_{B}^{*}} > p

(25)
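The threshold in equation (25) can be evaluated as follows. This is a sketch under the stated assumption $I_{A}^{*} > I_{B}^{*}$; the function name is hypothetical.

```python
def equivalent_random_p(tpr, tnr, i_a, i_b):
    """Left-hand side of equation (25).

    The strategy outperforms a p-random flip strategy for every p below
    the returned value. Assumes i_a > i_b.
    """
    if i_a <= i_b:
        raise ValueError("equation (25) assumes i_a > i_b")
    return (tpr * i_a - (1 - tnr) * i_b) / (i_a - i_b)


def random_flip_index(p, i_a, i_b):
    """Efficiency index of the p-random flip strategy, equation (24)."""
    return p * i_a + (1 - p) * i_b
```

At the returned threshold, the random flip index equals the strategy index $TPR \cdot I_{A}^{*} + TNR \cdot I_{B}^{*}$, which gives a quick consistency check.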

Therefore, we can compare our strategy to the efficiency index of some benchmarks and to a random strategy. This is similar to analyzing classifiers in machine learning, where we compute the so-called Area Under Curve and compare it to $1 / 2$ to see how different our predictions on labels are from those generated by a random coin flip.

The equation (19) gives us a universal way to compare any strategies in terms of their ability to capture the potential excess return over the worst strategy.

Example 6

Consider the Market-Cash (MC) strategies. For these strategies, we have $r_{B}^{+} = r_{B}^{-} = 0$ and therefore $Δ^{+} = r_{M}^{+}$ and $Δ^{-} = - r_{M}^{-}$. Note that $r_{M}^{-} < 0$ and therefore $Δ^{-} > 0$. These quantities can be interpreted as the absolute average log returns for positive and negative return days, respectively.

The efficiency index of the strategy and its difference from the efficiency index $I_{M}^{*}$ of the benchmark S&P-500 index are given by

\begin{aligned} I_{s t r}^{*} & = \frac{TP \cdot r_{M}^{+} + TN \cdot | r_{M}^{-} |}{T^{+} r_{M}^{+} + T^{-} | r_{M}^{-} |} \\ I_{s t r}^{*} - I_{M}^{*} & = \frac{TN \cdot | r_{M}^{-} | - FN \cdot r_{M}^{+}}{T^{+} r_{M}^{+} + T^{-} | r_{M}^{-} |} \end{aligned}

The market-cash strategy can outperform the index if we do not have too many False Negative (FN) predictions, namely, we must have

FN < TN (| r_{M}^{-} | / r_{M}^{+})

A detailed example

Let us present a detailed example of comparing two strategies, $X$ and $Y$ . Strategy $X$ is a Growth-Value strategy and strategy $Y$ is a Market-Cash strategy. We assume that we have the following daily data for eleven days $d_{0}, \dots, d_{10}$ starting with day $d_{0}$ . The detailed results for returns $r_{i}$ , balances $B_{i}$ , true labels $T_{i}$ and predicted labels $P_{i}$ for each day $d_{i}$ are summarized in Table 4.

Table 4.

Daily data for $X$ and $Y$ strategies (returns $r_{i}$ in %, balances $B_{i}$ in dollars, true labels $T_{i}$, predicted labels $P_{i}$).

	S&P		G		V		Strategy $X$				Strategy $Y$
$d_{i}$	$r_{i}$	$B_{i}$	$r_{i}$	$B_{i}$	$r_{i}$	$B_{i}$	$r_{i}$	$B_{i}$	$T_{i}$	$P_{i}$	$r_{i}$	$B_{i}$	$T_{i}$	$P_{i}$
$d_{1}$	1	101	1	101	0	100	1	101	+	+	1	101	+	+
$d_{2}$	3	104	2	103	1	101	2	103	+	+	3	104	+	+
$d_{3}$	-2	102	-3	100	-2	99	-3	100	-	+	-2	102	-	+
$d_{4}$	-4	98	-1	99	-2	97	-2	98	-	-	0	102	-	-
$d_{5}$	2	100	3	102	2	99	3	101	+	+	0	102	+	-
$d_{6}$	2	102	2	104	1	100	2	103	+	+	3	105	+	+
$d_{7}$	-1	101	-2	102	-1	99	-1	102	-	-	-1	104	-	+
$d_{8}$	3	104	5	107	2	101	2	104	+	-	0	104	+	-
$d_{9}$	1	105	2	109	3	104	3	107	+	-	2	106	+	+
$d_{10}$	-1	104	-1	108	-1	103	-1	106	-	+	-1	105	-	+

The growth of a $100 investment in the S&P Buy-and-Hold strategies and in our two strategies is shown below in Figure 2.

Figure 2.

Example of Strategy Comparison.

We first compute the relative performance of the two strategies over the 10 days $d_{1}, \dots, d_{10}$.

Our results are summarized in Table 5. All strategies start with the same balance $B_{0} = \$100$. The higher final balance (after rounding to the nearest integer) of $106 is achieved by the Growth-Value strategy $X$, while the lower final balance of $105 is achieved by the Market-Cash strategy $Y$. The corresponding total returns for these strategies are $R_{X} = 6 \%$ and $R_{Y} = 5 \%$, respectively. The total return of Strategy $X$ is therefore one percentage point higher than that of Strategy $Y$.

Table 5.

Performance measures for strategies.

	Buy-and-Hold			Strategy
Metrics	S&P	Gr.	Val	$X$	$Y$
Final balance $B$ (dollars)	104	108	103	106	105
Total return $R$ (%)	4	8	3	6	5
(Daily) st.dev. (%)	2.2	2.4	1.7	2.1	1.6
Volatility (risk) $S$ (%)	7.0	7.5	5.3	6.5	5.2
Tracking error $E$ (%)	*	4	-2	2	1
Sharpe ratio $S R$	0.6	1.1	0.6	0.9	1.0
MDD - max drawdown (%)	-6	-4	-5	-5	-2
# Trades	1	1	1	5	5

Next, we compare volatility. If $σ$ denotes the standard deviation of the daily returns $\{r_{1}, \dots, r_{10}\}$ over $t = 10$ days, then we can compute the risk (volatility of returns over $t$ days) as $S = σ \sqrt{t}$. The volatility of Strategy $X$ is $S_{X} = 6.51$, which is higher than the volatility $S_{Y} = 5.15$ of Strategy $Y$.

From the above, we can compute the corresponding Sharpe ratios for the two strategies. For Strategy $X$, the Sharpe’s ratio (after rounding to one decimal) is ${SR}_{X} = R_{X} / S_{X} = 0.9$, whereas for Strategy $Y$ the Sharpe’s ratio is ${SR}_{Y} = R_{Y} / S_{Y} = 1.0$. Therefore, Strategy $X$ has a higher absolute return than Strategy $Y$, but not on a risk-adjusted basis as shown by the Sharpe’s ratio.

Next, we examine the maximum drawdowns. For Strategy $X$, the maximum decrease was from $B_{2} = \$103$ to $B_{4} = \$98$. This gives the maximum drawdown $MDD = (B_{4} - B_{2}) / B_{2} \approx - 5 \%$. For Strategy $Y$, the largest decrease in balance was from $B_{2} = \$104$ to $B_{3} = \$102$. This gives the maximum drawdown $MDD = (B_{3} - B_{2}) / B_{2} \approx - 2 \%$. For the S&P-500, the largest decrease was from $B_{2} = \$104$ to $B_{4} = \$98$, giving us the maximum drawdown $MDD \approx - 6 \%$. For the S&P Growth index, the maximum decrease was from $B_{2} = \$103$ to $B_{4} = \$99$, giving us $MDD \approx - 4 \%$. Finally, for the S&P Value index, the maximum decrease was from $B_{2} = \$103$ to $B_{4} = \$98$, giving us $MDD \approx - 5 \%$.

We summarize the results (after rounding) in Table 5.
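The metrics above can be recomputed from a column of daily returns. The sketch below makes two assumptions that are not stated in the text: balances compound daily simple returns without intermediate rounding, and $σ$ is the population standard deviation (`statistics.pstdev`). With the S&P returns of Table 4 it lands, after rounding, on the S&P column of Table 5. The function name and dictionary keys are hypothetical.

```python
import math
import statistics


def performance_summary(returns_pct, start_balance=100.0):
    """Final balance, daily st.dev., volatility and max drawdown.

    returns_pct: daily simple returns in percent, e.g. 3 means +3%.
    """
    balances = [start_balance]
    for r in returns_pct:
        balances.append(balances[-1] * (1 + r / 100))

    daily_sd = statistics.pstdev(returns_pct)  # population st.dev. (assumed)
    volatility = daily_sd * math.sqrt(len(returns_pct))  # S = sigma * sqrt(t)

    # Maximum drawdown: largest relative drop from a running peak.
    peak = balances[0]
    max_drawdown = 0.0
    for b in balances:
        peak = max(peak, b)
        max_drawdown = min(max_drawdown, (b - peak) / peak)

    return {
        "final_balance": balances[-1],
        "daily_sd": daily_sd,
        "volatility": volatility,
        "max_drawdown_pct": 100 * max_drawdown,
    }


# S&P daily returns d_1..d_10 from Table 4.
sp500 = [1, 3, -2, -4, 2, 2, -1, 3, 1, -1]
sp500_summary = performance_summary(sp500)
```

Small differences from the tabulated balances are expected if the published figures were rounded day by day.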

Machine learning description of strategy $X$

From Table 4 we can represent this strategy $X$ schematically as:

X : \overset{true ''+'' labels T^{+}}{\overset{⏞}{\underset{Growth r_{G}^{+}}{\underset{⏟}{\underset{T P}{\underset{⏟}{d_{1}, d_{2}, d_{5}, d_{6}}}}}, \underset{Value r_{V}^{+}}{\underset{⏟}{\underset{F N}{\underset{⏟}{d_{8}, d_{9}}}}}}}, \overset{true ''-'' labels T^{-}}{\overset{⏞}{\underset{Value r_{V}^{-}}{\underset{⏟}{\underset{T N}{\underset{⏟}{d_{4}, d_{7}}}}}, \underset{Growth r_{G}^{-}}{\underset{⏟}{\underset{F P}{\underset{⏟}{d_{3}, d_{10}}}}}}}

There are $T^{+} = 6$ days $\{d_{1}, d_{2}, d_{5}, d_{6}, d_{8}, d_{9}\}$ with positive true labels. The average log returns of the Growth and Value indices on these days are

\begin{aligned} r_{G}^{+} & \approx \frac{1}{6} (\log (1.01) + \log (1.02) + \log (1.03) \\ & \quad + \log (1.02) + \log (1.05) + \log (1.02)) = 0.0246 \\ r_{V}^{+} & \approx \frac{1}{6} (\log (1.00) + \log (1.01) + \log (1.02) \\ & \quad + \log (1.01) + \log (1.02) + \log (1.03)) = 0.0148 \end{aligned}

There are $T^{-} = 4$ days $\{d_{3}, d_{4}, d_{7}, d_{10}\}$ with negative true labels. The average log returns of the Growth and Value indices on these days are

\begin{aligned} r_{G}^{-} & \approx \frac{1}{4} (\log (0.97) + \log (0.99) \\ & \quad + \log (0.98) + \log (0.99)) = - 0.0177 \\ r_{V}^{-} & \approx \frac{1}{4} (\log (0.98) + \log (0.98) \\ & \quad + \log (0.99) + \log (0.99)) = - 0.0151 \end{aligned}

This gives us $Δ^{+} = r_{G}^{+} - r_{V}^{+} = 0.0098$ and $Δ^{-} = r_{V}^{-} - r_{G}^{-} = 0.0026$.
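These averages can be reproduced from the Growth and Value columns of Table 4. The sketch below uses natural logarithms and encodes the true labels of days $d_{1}, \dots, d_{10}$ as a string; the helper names are hypothetical.

```python
import math


def average_log_return(returns_pct):
    """Average daily log return for simple returns given in percent."""
    return sum(math.log(1 + r / 100) for r in returns_pct) / len(returns_pct)


def split_by_label(returns_pct, labels, label):
    """Returns on the days whose true label equals `label`."""
    return [r for r, t in zip(returns_pct, labels) if t == label]


# Table 4, days d_1..d_10.
growth = [1, 2, -3, -1, 3, 2, -2, 5, 2, -1]
value = [0, 1, -2, -2, 2, 1, -1, 2, 3, -1]
true_labels = "++--++-++-"

r_g_pos = average_log_return(split_by_label(growth, true_labels, "+"))
r_v_pos = average_log_return(split_by_label(value, true_labels, "+"))
r_g_neg = average_log_return(split_by_label(growth, true_labels, "-"))
r_v_neg = average_log_return(split_by_label(value, true_labels, "-"))

delta_pos = r_g_pos - r_v_pos
delta_neg = r_v_neg - r_g_neg
```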

Next, we examine the confusion matrix for Strategy $X$ :

CF = [\begin{matrix} T P & F P \\ F N & T N \end{matrix}] = [\begin{matrix} 4 & 2 \\ 2 & 2 \end{matrix}]
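The counts in this matrix follow from pairing the true and predicted label columns of Table 4. A minimal sketch, with hypothetical function names and the labels of days $d_{1}, \dots, d_{10}$ encoded as strings:

```python
def confusion_counts(true_labels, predicted_labels):
    """Return (TP, FN, TN, FP) for sequences of "+" / "-" labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = pairs.count(("+", "+"))
    fn = pairs.count(("+", "-"))
    tn = pairs.count(("-", "-"))
    fp = pairs.count(("-", "+"))
    return tp, fn, tn, fp


def rates(tp, fn, tn, fp):
    """Recall, specificity and accuracy from the confusion counts."""
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "ACC": (tp + tn) / (tp + fn + tn + fp),
    }


# Strategy X columns T_i and P_i from Table 4.
true_x = "++--++-++-"
pred_x = "+++-++---+"
counts_x = confusion_counts(true_x, pred_x)
rates_x = rates(*counts_x)
```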

The corresponding recall $TPR = 2 / 3$ , specificity $TNR = 1 / 2$ and accuracy $ACC = 3 / 5$ . From Table 5, we have $R_{X} = 6$ , $R_{G} = 8$ and $R_{V} = 3$ . From equation (13) we have

\begin{aligned} R_{max}^{*} & = T^{+} r_{G}^{+} + T^{-} r_{V}^{-} \\ & = 6 \cdot 0.0246 + 4 \cdot (- 0.0151) \\ & = 0.0872 \\ R_{min}^{*} & = T^{+} r_{V}^{+} + T^{-} r_{G}^{-} \\ & = 6 \cdot 0.0148 + 4 \cdot (- 0.0177) \\ & = 0.0180 \end{aligned}

(26)

The Predicted Positive Value $PPV = TP / (TP + FP) = 2 / 3$ and the Predicted Negative Value $PNV = TN / (TN + FN) = 1 / 2$ . The ML-volatility of strategy- $X$ is then

S_{X}^{*} = \sqrt{PPV \cdot S_{A}^{2} + PNV \cdot S_{B}^{2}} \approx 6.51

and this gives us the corresponding Sharpe’s ratio

S R_{X} = 0.92

We now compute the return efficiency indices $I_{G}$ , $I_{V}$ and $I_{X}$ . From Table 5 and equation (26) we have

I_{G} = 0.896, I_{V} = 0.173, I_{X} = 0.607

To compute the equivalent $p$-random strategy, we obtain from equation (25)

\begin{aligned} p & = \frac{TPR \cdot I_{G} - (1 - TNR) \cdot I_{V}}{I_{G} - I_{V}} \\ & = \frac{(2 / 3) \cdot 0.896 - (1 - 0.5) \cdot 0.173}{0.896 - 0.173} \\ & \approx 0.707 \end{aligned}

The strategy recovered about 3/5 of the difference in returns $(R_{max}^{*} - R_{min}^{*}) = 0.0692$ between the best and the worst strategy. In terms of the return efficiency index, we have $I_{V} < I_{X} < I_{G}$: strategy $X$ outperforms the buy-and-hold strategy in Value but underperforms the buy-and-hold strategy in Growth. This strategy also outperforms any random flip strategy with $p \leq 0.707$.

Machine learning description of strategy $Y$

For this strategy, security $A$ is the market (“M”), the S&P-500 index, and security $B$ is cash (“C”). From Table 4 we can represent Strategy $Y$ schematically as:

Y : \overset{true ''+'' labels T^{+}}{\overset{⏞}{\underset{S\&P}{\underset{⏟}{\underset{T P}{\underset{⏟}{d_{1}, d_{2}, d_{6}, d_{9}}}}}, \underset{cash}{\underset{⏟}{\underset{F N}{\underset{⏟}{d_{5}, d_{8}}}}}}}, \overset{true ''-'' labels T^{-}}{\overset{⏞}{\underset{cash}{\underset{⏟}{\underset{T N}{\underset{⏟}{d_{4}}}}}, \underset{S\&P}{\underset{⏟}{\underset{F P}{\underset{⏟}{d_{3}, d_{7}, d_{10}}}}}}}

There are $T^{+} = 6$ days $d_{1}$, $d_{2}$, $d_{5}$, $d_{6}$, $d_{8}$ and $d_{9}$ with positive true labels, and there are $T^{-} = 4$ days $d_{3}$, $d_{4}$, $d_{7}$ and $d_{10}$ with negative true labels. The corresponding average log returns of the S&P-500 (market) index on these days are

\begin{aligned} r_{M}^{+} & \approx \frac{1}{6} (\log (1.01) + \log (1.03) + \log (1.02) \\ & \quad + \log (1.02) + \log (1.03) + \log (1.01)) = 0.0198 \\ r_{M}^{-} & \approx \frac{1}{4} (\log (0.98) + \log (0.96) \\ & \quad + \log (0.99) + \log (0.99)) = - 0.0203 \end{aligned}

The returns for investing in cash on these days are $r_{C}^{-} = r_{C}^{+} = 0$. This gives $Δ^{+} = (r_{M}^{+} - r_{C}^{+}) = 0.0198$ and $Δ^{-} = (r_{C}^{-} - r_{M}^{-}) = 0.0203$.

The confusion matrix is

CF = [\begin{matrix} T P & F P \\ F N & T N \end{matrix}] = [\begin{matrix} 4 & 3 \\ 2 & 1 \end{matrix}]

The corresponding recall is $TPR = 2 / 3$, specificity $TNR = 1 / 4$ and accuracy $ACC = 1 / 2$. From Table 5 we have $R_{Y} = 5 \%$, $R_{M} = 4 \%$ and $R_{C} = 0 \%$. For the ML-returns of the best and worst strategies, from equation (15) we have

\begin{aligned} R_{max}^{*} & = T^{+} r_{M}^{+} = 6 \cdot 0.0198 = 0.1186 \\ R_{min}^{*} & = T^{-} r_{M}^{-} = 4 \cdot (- 0.0203) = - 0.0811 \end{aligned}

(27)

The Predicted Positive Value $PPV = TP / (TP + FP) = 4 / 7 \approx 0.57$ and the Predicted Negative Value $PNV = TN / (TN + FN) = 1 / 3$ . The ML-volatility of strategy- $Y$ is then

S_{Y}^{*} = S_{M} \sqrt{PPV} \approx 5.2

and this gives us the corresponding Sharpe’s ratio

S R_{Y} = 1.0

We now compute the return efficiency indices $I_{M}$ , $I_{C}$ and $I_{Y}$ . From Table 5 and equation (27) we have

I_{M} = 0.606, I_{C} = 0.406, I_{Y} = 0.656

To compute the equivalent $p$-random strategy, we obtain from equation (25)

\begin{aligned} p & = \frac{TPR \cdot I_{M} - (1 - TNR) \cdot I_{C}}{I_{M} - I_{C}} \\ & = \frac{(2 / 3) \cdot 0.606 - (1 - 1 / 4) \cdot 0.406}{0.606 - 0.406} \approx 0.5 \end{aligned}

The strategy recovered $65.6 %$ of the difference in returns $(R_{max}^{*} - R_{min}^{*}) = 0.1997$ between the best and the worst strategy. Its return efficiency index is greater than that of a random flip strategy.

In terms of efficiency indices (a machine learning-based metric), $I_{Y} - I_{M} = 0.05$. This tells us that the strategy provides only a small advantage over buy-and-hold in the market index: it recovers only 5 percentage points more of the potential return.

Machine learning comparison of X and Y

We summarize the comparison results for strategies $X$ and $Y$ in Table 6.

Table 6.

Machine learning metrics for strategies.

	Strategy
Metrics	$X$	$Y$
True Positive (TP)	4	4
False Positive (FP)	2	3
True Negative (TN)	2	1
False Negative (FN)	2	2
Recall (TPR)	0.67	0.67
Specificity (TNR)	0.5	0.25
Accuracy (ACC)	0.6	0.5
Precision (PPV)	0.67	0.57
Predicted Negative Value (PNV)	0.5	0.33
Return Efficiency Index $I$	0.61	0.65
Equivalent Random Flip probability	0.71	0.5

Strategy $X$ predicts the negative true labels at a higher rate (TNR of 0.5 vs. 0.25) and the positive true labels at the same rate (TPR of 0.67), resulting in higher overall accuracy ACC (0.6 vs. 0.5). However, in terms of return efficiency, strategy $Y$ is more efficient: it recovers about 65% of the potential return vs. 61% for $X$. Recall that this potential return depends on the benchmarks used and is different for these strategies. For the Growth-Value strategy, the range of returns $(R_{max}^{*} - R_{min}^{*})$ was about 6.9%, whereas for Market-Cash the range was much wider, about 20%. For the equivalent random strategy, Growth-Value requires a larger value of $p$ (0.71 vs. 0.5). This is consistent with higher accuracy for $X$ and a lower True Negative Rate for $Y$.

Comparing these two strategies, we see that strategy $X$ has a higher return, but this comes at the cost of higher volatility and larger drawdowns; both strategies trade equally often (5 trades each in Table 5). The risk-adjusted return of $X$ is lower than that of $Y$ as measured by the Sharpe’s ratio.

Example: k-NN “winners” and “losers” trading strategies

In the previous sections, we drew an analogy between a trading strategy that chooses the appropriate asset to invest in and a machine learning problem of predicting a label. We explore this idea further and suggest some trading strategies based on analogies to machine learning.

One of the simplest algorithms for classification in machine learning is the so-called $k$-NN (nearest neighbor) classification (Bishop, 2016). In this method, we assume that we are given a distance metric $D (x, y)$ between any two points $x$ and $y$. To assign a label to any point $x$, we find the closest $k$ labeled points (the so-called “neighbors”) $x_{1}, \dots, x_{k}$ of $x$ and assign a label to $x$ based on the majority of labels from these neighbors. With two classes, the number $k$ must be odd to have a well-defined predicted label. The simplest case is $k = 1$: we assign a label to $x$ based on the label of its closest neighbor.

An example is illustrated in Figure 3 where we have six True (“Ground Truth”) labels in the training set.

Figure 3.

Example of Nearest Neighbor Classification.

In this example, we need to assign a label to point $A$. If we take $k = 1$, then the nearest neighbor is point 1 with a (ground truth) “green” label. In this case, we assign the label “green” to $A$. If we take $k = 3$ neighbors from the training set, the nearest three neighbors to $A$ are point 1 (“green”), point 2 (“red”) and point 3 (“red”). The majority of these labels are “red” and therefore $A$ will be assigned the label “red”. Finally, take $k = 5$. The nearest five neighbors to $A$ are point 1 (“green”), point 2 (“red”), point 3 (“red”), point 4 (“green”) and point 5 (“green”). Most of these five points have the ground truth label “green,” and therefore $A$ will be assigned “green”. This example shows that the final label depends on the value of $k$. In practice, this value is chosen experimentally.

Let us consider a trading strategy based on this analogy. Unlike typical scenarios in applying supervised machine learning algorithms, where ground truth labels are known in advance (Bishop, 2016), in many trading algorithms these ground truth labels are known only for historical data. This is illustrated in Figure 4 where we need to make a prediction for $A$ based on historical ground truth labels $T_{1}, \dots, T_{6}$.

Figure 4.

Next-Day Label Prediction by $k$ -NN Analogy.

By analogy to $k$ -NN in machine learning, for any two days $d_{i}$ and $d_{j}$ define the distance as the number of days in between $D (d_{i}, d_{j}) = | i - j |$ . With this definition, the neighbors of any day $d_{i}$ are the previous days. In the simplest case of $k = 1$ , the nearest neighbor of $d_{i}$ is the previous day $d_{i - 1}$ . In this case, we assign a predicted label $P_{i}$ to day $d_{i}$ based on the true label $T_{i - 1}$ of the previous day $d_{i - 1}$ . Since the ground truth label for day $i - 1$ is “green”, we assign “green” as predicted label for day $d_{i}$ . In the more general setting of $k > 1$ , we assign a predicted label $P_{i}$ to day $d_{i}$ based on the majority of ground truth labels of its $k$ “nearest neighbors”, namely the labels $T_{i - 1}, T_{i - 2}, \dots T_{i - k}$ of the $k$ preceding days ${d_{i - 1}, d_{i - 2}, \dots, d_{i - k}}$ .

If the predicted label $P_{i}$ is taken to be the majority of the true labels of the $k$ neighbors, we will call such a strategy a $k$ -winners strategy. We denote this strategy as AB- $k$ W to indicate that it trades in two securities $A$ and $B$ , and uses the majority (“winning”) of true labels $T_{i - 1}, \dots, T_{i - k}$ of the previous $k$ days (“neighbors”). Formally, in the AB- $k$ W strategy, we assign the predicted label $P_{i}$ for day $d_{i}$ as:

P_{i} = Majority (T_{i - 1}, \dots, T_{i - k})

If the predicted label $P_{i}$ is taken to be the minority of the true labels of the $k$ neighbors, we will call such a strategy a $k$ -losers strategy. We denote such a strategy as AB- $k$ L to indicate that it trades in two securities $A$ and $B$ , and uses the minority (“losing”) of true labels of the previous $k$ days (“neighbors”). Formally, in AB- $k$ L strategies, the predicted label for day $d_{i}$ is:

P_{i} = Minority (T_{i - 1}, \dots, T_{i - k})

Market-cash $k$ -NN strategies

In this class of strategies, we predict the choice of investments for the next day (market or cash) based on the daily returns of the S&P-500 index over the last $k$ days.

Recall that in Market-Cash strategies, to each day $d_{i}$ we assign a true label $T_{i} =$ “+” or $T_{i} =$ “−” depending on the daily return $r_{i}$ of the S&P-500 index as follows:

T_{i} = \text{“+”} \text{ if } r_{i} \geq 0 \text{ and } T_{i} = \text{“−”} \text{ if } r_{i} < 0

Once the true labels are assigned to each trading day, we generate a predicted label (trading signal) $P_{i + 1}$ for the day $d_{i + 1}$ based on the majority (“winning”) or the minority (“losing”) of true labels in the $k$ previous day(s).

Our trading algorithm invests in day $d_{i + 1}$ based on predicted label $P_{i + 1}$ for that day as follows:

predicted label $P_{i + 1} =$ “+”: (re)invest in S&P-500 Index for day $d_{i + 1}$

predicted label $P_{i + 1} =$ “−”: be in Cash for day $d_{i + 1}$

In MC- $k$ W (Market-Cash $k$ -winners) strategies, we predict the next day label $P_{i + 1} =$ “+” if most daily returns of the S&P-500 index over the previous $k$ days were non-negative. In such strategies, we tend to believe that the current positive momentum has some inertia and will continue. In MC- $k$ L (Market-Cash $k$ -losers) strategies, we predict the next day label $P_{i + 1} =$ “−” if most daily returns of the S&P-500 index over the previous $k$ days were non-negative. In these strategies, we tend to believe that the current positive momentum is about to change.

Example 7

We illustrate this by the following example of $k = 1$ and $k = 3$ for 11 days $d_{0}$ , $d_{1}$ , …, $d_{10}$ with true and predicted labels as shown in Table 7.

Table 7.

Predicted labels for different MC- $k$ * trading strategies.

Day	$d_{0}$	$d_{1}$	$d_{2}$	$d_{3}$	$d_{4}$	$d_{5}$	$d_{6}$	$d_{7}$	$d_{8}$	$d_{9}$	$d_{10}$
$T_{i}$ (MC)	$+$	$+$	$-$	$-$	$+$	$+$	$+$	$-$	$-$	$+$	$-$
$P_{i}$ (MC-1W)	n/a	$+$	$+$	$-$	$-$	$+$	$+$	$+$	$-$	$-$	$+$
$P_{i}$ (MC-1L)	n/a	$-$	$-$	$+$	$+$	$-$	$-$	$-$	$+$	$+$	$-$
$P_{i}$ (MC-3W)	n/a	n/a	n/a	$+$	$-$	$-$	$+$	$+$	$+$	$-$	$-$
$P_{i}$ (MC-3L)	n/a	n/a	n/a	$-$	$+$	$+$	$-$	$-$	$-$	$+$	$+$

For $k = 3$ , we can assign a signal starting on the day $d_{3}$ . The three preceding (“neighbor”) days have true labels $T_{0} =$ “+”, $T_{1} =$ “+” and $T_{2} =$ “−”, respectively. This means that on these days, the S&P-500 index had mostly non-negative daily returns. The majority of these labels are “+” and therefore, the 3-day losers strategy MC-3L assigns the predicted label (trading signal) $P_{3} =$ “−” for the fourth day $d_{3}$ , indicating a cash position.

For example, consider the MC-3W (“winners”) strategy and the assignment of predicted labels starting with day $d_{3}$ . The true labels for the preceding three (“neighbor”) days $d_{0}$ , $d_{1}$ , and $d_{2}$ were “+”, “+” and “−” respectively. The majority of these three labels was “+”. Therefore, for the $k = 3$ nearest neighbor winning strategy MC-3W, we would assign the predicted label $P_{3} =$ “+” for the day $d_{3}$ , which suggests being invested in the S&P-500 index for the day $d_{3}$ . By contrast, for the same day $d_{3}$ we would assign the predicted label $P_{3} =$ “−” by the “losers” strategy MC-3L. The MC-3L strategy suggests being in a cash position for the day $d_{3}$ .
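The majority and minority rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `k_winners` and `k_losers` are hypothetical, and the sketch assumes an odd $k$ so that a majority always exists. It applies the rules to the true labels $T_{i}$ of Table 7.

```python
def k_winners(true_labels, k):
    """Predicted label for each day i >= k: the majority of the k preceding
    true labels T_{i-1}, ..., T_{i-k}. Days without k predecessors get None.
    Assumes k is odd, so ties cannot occur."""
    preds = [None] * len(true_labels)
    for i in range(k, len(true_labels)):
        window = true_labels[i - k:i]
        preds[i] = "+" if window.count("+") > window.count("-") else "-"
    return preds


def k_losers(true_labels, k):
    """With two labels and odd k, the minority label is the opposite
    of the majority label."""
    flip = {"+": "-", "-": "+", None: None}
    return [flip[p] for p in k_winners(true_labels, k)]


# True labels T_0, ..., T_10 of the Market-Cash example (Table 7).
T = list("++--+++--+-")

print("MC-1W:", k_winners(T, 1))
print("MC-1L:", k_losers(T, 1))
print("MC-3W:", k_winners(T, 3))
print("MC-3L:", k_losers(T, 3))
```

For day $d_{3}$ the sketch gives “+” for MC-3W and “−” for MC-3L, in agreement with the discussion above.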

Growth-value $k$ -NN strategies

In this class of strategies, we predict the choice of investments for the next day (growth or value) based on the relative performance of the corresponding indices over the last $k$ days (“neighbors”).

Recall that for Growth-Value strategies, to each day $d_{i}$ we assign a true label $T_{i} =$ “+” or $T_{i} =$ “−” depending on the daily returns $r_{i}^{G}$ and $r_{i}^{V}$ of the Growth and Value indices as follows:

T_{i} = \text{“+”} \text{ if } r_{i}^{G} \geq r_{i}^{V} \text{ and } T_{i} = \text{“−”} \text{ if } r_{i}^{G} < r_{i}^{V}

Once the true labels are assigned to each trading day, we generate a predicted label (trading signal) $P_{i + 1}$ for the day $d_{i + 1}$ based on the true labels assigned to the previous $k$ day(s). Our trading algorithm invests in day $d_{i + 1}$ based on the predicted label $P_{i + 1}$ for that day as follows:

predicted label $P_{i + 1} =$ “+”: choose S&P-500 Growth index for day $d_{i + 1}$

predicted label $P_{i + 1} =$ “−”: choose S&P-500 Value index for day $d_{i + 1}$

In GV- $k$ W (Growth-Value $k$ -winners) strategies, we predict the next day label $P_{i + 1} =$ “+” if, on most of the previous $k$ days, the daily return of the S&P Growth index was higher than the daily return of the S&P Value index. We predict the next day label $P_{i + 1} =$ “−” if, on most of the previous $k$ days, the daily return of the S&P Growth index was lower than the daily return of the S&P Value index. In such strategies, we tend to believe that the current overperformance of one index over the other has some inertia and will continue. By contrast, in GV- $k$ L (Growth-Value $k$ -losers) strategies, we believe that the current overperformance of one index over another is about to change.
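To make the trading rule concrete, the following sketch simulates a one-day “losers” strategy that switches daily between two securities $A$ and $B$ . It is an illustration under stated assumptions: the function names are hypothetical, the daily returns are made-up simple returns (as fractions, not percent), and transaction costs are ignored. For a Market-Cash strategy, the returns of $B$ would all be zero.

```python
def true_labels(r_a, r_b):
    """True label is '+' on days when A did at least as well as B, else '-'."""
    return ["+" if a >= b else "-" for a, b in zip(r_a, r_b)]


def one_day_losers_growth(r_a, r_b, start=100.0):
    """Growth of `start` dollars under a k = 1 'losers' rule:
    predict the opposite of the previous day's true label,
    then hold A on a '+' prediction and B on a '-' prediction."""
    labels = true_labels(r_a, r_b)
    balance = start
    for i in range(1, len(labels)):
        predicted = "-" if labels[i - 1] == "+" else "+"
        daily_return = r_a[i] if predicted == "+" else r_b[i]
        balance *= 1.0 + daily_return
    return balance


# Hypothetical daily simple returns for four days (illustration only).
growth = [0.010, -0.005, 0.004, 0.002]
value = [0.002, 0.003, -0.001, 0.006]

print(true_labels(growth, value))            # ['+', '-', '+', '-']
print(one_day_losers_growth(growth, value))  # roughly 101.31
```

No trade is made on the first day, since it has no preceding day from which to form a prediction.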

Example 8

We illustrate this by the following example of $k = 1$ and $k = 3$ for 11 days $d_{0}$ , $d_{1}$ , …, $d_{10}$ with true and predicted labels as shown in Table 8 below:

Table 8.

Predicted labels for different GV- $k$ * trading strategies.

Day	$d_{0}$	$d_{1}$	$d_{2}$	$d_{3}$	$d_{4}$	$d_{5}$	$d_{6}$	$d_{7}$	$d_{8}$	$d_{9}$	$d_{10}$
True Label	$+$	$+$	$-$	$-$	$+$	$+$	$-$	$+$	$-$	$+$	$-$
GV-1W	n/a	$+$	$+$	$-$	$-$	$+$	$+$	$-$	$+$	$-$	$+$
GV-1L	n/a	$-$	$-$	$+$	$+$	$-$	$-$	$+$	$-$	$+$	$-$
GV-3W	n/a	n/a	n/a	$+$	$-$	$-$	$+$	$+$	$+$	$-$	$+$
GV-3L	n/a	n/a	n/a	$-$	$+$	$+$	$-$	$-$	$-$	$+$	$-$

Table 9.

Summary metrics for KNN, LSTM and CNN models.

Strategy	GV-1L	3-Month		6-Month
Strategy	GV-1L	CNN	LSTM	CNN	LSTM
Average ML Statistics
TPR (%)	52	42	43	41	41
TNR (%)	55	60	61	62	62
Acc.(%)	53	49	50	49	49
Average Trading Statistics
Final $	2,642	383	300	338	304
Annual $R$	16.6	7.6	6.7	7.3	6.6
MDD	$-$ 15.2	$-$ 16.5	$-$ 16.9	$-$ 16.7	$-$ 16.5
Volatility	18.6	17.7	17.7	17.9	17.7
Sharpe	1.2	0.7	0.7	0.7	0.7
Summary Statistics of Return Efficiency Index (REI)
Median	0.41	0.35	0.36	0.35	0.34
Mean	0.40	0.34	0.33	0.34	0.34
St. dev.	0.10	0.08	0.08	0.08	0.09

For $k = 3$ , we can assign a signal starting on the day $d_{3}$ . The first three days $d_{0}$ , $d_{1}$ and $d_{2}$ have true labels $T_{0} =$ “+”, $T_{1} =$ “+” and $T_{2} =$ “−”, respectively. This means that on these days, the S&P Growth index overperformed the Value index (in terms of the number of days when $r_{i}^{G} \geq r_{i}^{V}$ ). The majority of these true labels are “+” and therefore, the 3-day “winners” strategy GV-3W assigns the predicted label (trading signal) $P_{3} =$ “+” for the day $d_{3}$ , indicating investment in the S&P Growth index. By contrast, the 3-day “losers” strategy GV-3L assigns the predicted label (trading signal) $P_{3} =$ “−” for the day $d_{3}$ , indicating investment in the S&P Value index.

Results and discussion

We now turn to the analysis of results for the nearest neighbor Growth-Value and Market-Cash strategies using machine learning metrics.

We present the following comparisons:

growth comparison

comparison of returns and machine learning metrics

comparison of volatility and drawdowns

comparison of tracking errors and Sharpe ratios

comparison by return efficiency ratio

choosing the number of neighbors $k$ and transaction costs

The detailed tables for each year as well as summary statistics (min, max, median, average and standard deviation) are presented in the Appendix. In most tables, we used the color “green” to identify the best value, “red” to identify the worst value, and “yellow” to identify the median value.

Growth comparison

We start by considering the three Buy-and-hold strategies investing in the three indices S&P-500, S&P-Growth and S&P-Value, and the $k = 1$ day Growth-Value and Market-Cash strategies. The growth of $100 for these is shown in Figure 5.

Figure 5.

Comparison of Growth.

As can be seen from this graph, the highest growth is achieved by the Growth-Value “Loser” strategy GV-1L with $k = 1$ . The detailed annual growth is summarized in Table 10.

Table 10.

Comparison of annual end balances of investment strategies.

	Buy-and-Hold			Growth-Value		Market-Cash
Year	S&P	G	V	GV-1W	GV-1L	MC-1W	MC-1L
2001	88	74	94	60	117	93	95
2002	69	50	77	25	157	67	103
2003	89	65	97	26	238	73	122
2004	98	68	110	27	281	77	127
2005	103	70	116	26	308	73	141
2006	119	76	141	29	367	81	147
2007	125	84	143	29	413	71	176
2008	79	53	91	16	303	41	194
2009	100	72	106	19	408	43	234
2010	115	84	123	21	501	43	265
2011	117	88	122	20	547	46	256
2012	136	100	143	19	745	53	258
2013	180	133	188	24	1,045	60	299
2014	204	153	211	26	1,222	62	332
2015	207	161	205	26	1,278	62	333
2016	232	172	240	29	1,419	59	392
2017	282	218	276	35	1,720	63	445
2018	269	218	252	34	1,604	69	388
2019	353	285	331	47	2,016	83	424
2020	418	381	336	60	2,137	72	577
2021	538	503	419	85	2,486	82	659
2022	440	355	397	65	2,161	72	615
2023	555	461	486	85	2,642	78	717
IRR	7.7	6.9	7.1	$-$ 0.7	15.3	$-$ 1.1	8.9

The GV-1L strategy ends with the highest balance of $2,642 whereas the Market-Cash MC-1L ends with $717. Both strategies outperform the Buy-and-hold S&P-500, S&P Growth and S&P Value strategies that yielded $555, $461 and $486 respectively. Both Growth-Value and Market-Cash “Winner” strategies resulted in a loss.
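The IRR row of Table 10 can be checked from the end balances. The sketch below assumes that IRR here means the compound annual growth rate of a single $100 investment held for 23 years; this section does not state the formula, so that reading is an assumption, and the function name is hypothetical.

```python
def annualized_growth_rate(final_balance, years, start=100.0):
    """Compound annual growth rate that turns `start` into `final_balance`
    over `years` years (assumed definition of the IRR row in Table 10)."""
    return (final_balance / start) ** (1.0 / years) - 1.0


# End balances for 2023 from Table 10.
final_balances = {"S&P": 555, "G": 461, "V": 486,
                  "GV-1W": 85, "GV-1L": 2642, "MC-1W": 78, "MC-1L": 717}

for name, balance in final_balances.items():
    rate = 100.0 * annualized_growth_rate(balance, years=23)
    print(f"{name}: {rate:.1f}%")
```

Under this assumption the computed rates agree with the tabulated IRR values to one decimal place.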

Comparison of returns and machine learning metrics

A detailed breakdown of annual returns for 23 years for these strategies is presented in Table 11. A detailed Table 13 of machine learning metrics is presented in the Appendix.

Table 11.

Comparison of annual returns of investment strategies.

	Buy-and-Hold			Growth-Value		Market-Cash
Year	S&P	G	V	GV-1W	GV-1L	MC-1W	MC-1L
2001	$-$ 11.8	$-$ 25.9	$-$ 5.6	$-$ 40.2	17.0	$-$ 7.1	$-$ 5.0
2002	$-$ 21.6	$-$ 32.1	$-$ 18.0	$-$ 58.5	34.2	$-$ 28.0	8.9
2003	28.2	28.3	25.2	6.1	51.4	8.9	17.8
2004	10.7	5.3	13.2	0.7	18.4	5.9	4.5
2005	4.8	2.8	5.4	$-$ 1.2	9.6	$-$ 5.4	10.8
2006	15.8	9.0	21.6	11.4	19.0	11.1	4.3
2007	5.1	10.8	1.4	$-$ 0.2	12.5	$-$ 11.9	19.3
2008	$-$ 36.8	$-$ 37.4	$-$ 36.3	$-$ 45.7	$-$ 26.5	$-$ 42.8	10.6
2009	26.4	37.0	17.1	19.3	34.4	5.0	20.3
2010	15.1	16.2	15.5	9.4	22.7	1.4	13.5
2011	1.9	4.6	$-$ 0.7	$-$ 4.9	9.2	5.5	$-$ 3.4
2012	16.0	14.2	17.2	$-$ 1.8	36.3	15.2	0.7
2013	32.3	32.6	31.8	24.6	40.3	14.1	16.0
2014	13.5	14.8	12.2	10.2	16.9	2.3	11.0
2015	1.2	5.1	$-$ 3.2	$-$ 2.7	4.6	1.0	0.3
2016	12.0	6.8	17.1	12.6	11.1	$-$ 5.0	17.9
2017	21.7	27.2	15.4	21.1	21.2	7.4	13.4
2018	$-$ 4.6	$-$ 0.1	$-$ 9.0	$-$ 2.5	$-$ 6.7	9.2	$-$ 12.6
2019	31.2	30.8	31.7	37.1	25.7	20.1	9.2
2020	18.3	33.5	1.4	27.7	6.0	$-$ 13.1	36.1
2021	28.7	32.0	24.9	41.7	16.3	12.8	14.1
2022	$-$ 18.2	$-$ 29.4	$-$ 5.3	$-$ 23.1	$-$ 13.1	$-$ 12.4	$-$ 6.6
2023	26.2	30.0	22.2	30.0	22.2	8.3	16.5
$max$	32.3	37.0	31.8	41.7	51.4	20.1	36.1
$min$	$-$ 36.8	$-$ 37.4	$-$ 36.3	$-$ 58.5	$-$ 26.5	$-$ 42.8	$-$ 12.6
$M$	13.5	10.8	13.2	6.1	17.0	5.0	10.8
$μ$	9.4	9.4	8.5	3.1	16.6	0.1	9.5
$σ$	18.3	22.3	16.6	25.4	17.4	14.6	10.8

Table 12.

Yearly confusion matrix statistics for strategies.

	GV-1L						MC-1L
	Confusion Matrix				True Label		Confusion Matrix				True Label
Year	TP	FP	TN	FN	$T^{+}$	$T^{-}$	TP	FP	TN	FN	$T^{+}$	$T^{-}$
2001	67	60	67	54	121	127	64	59	64	61	125	123
2002	81	54	80	37	118	134	73	60	72	47	120	132
2003	73	52	73	54	127	125	66	42	66	78	144	108
2004	70	61	71	50	120	132	60	46	61	85	145	107
2005	66	65	66	55	121	131	65	47	65	75	140	112
2006	67	76	66	42	109	142	58	52	58	83	141	110
2007	71	42	71	67	138	113	71	41	71	68	139	112
2008	66	61	67	59	125	128	63	65	62	63	126	127
2009	67	40	67	78	145	107	62	48	63	79	141	111
2010	68	59	68	57	125	127	63	43	62	84	147	105
2011	70	56	69	57	127	125	59	56	60	77	136	116
2012	75	59	75	41	116	134	57	55	57	82	139	111
2013	70	49	70	63	133	119	64	39	64	85	149	103
2014	66	46	66	74	140	112	65	37	66	84	149	103
2015	66	48	67	71	137	115	66	65	66	55	121	131
2016	63	60	63	66	129	123	74	40	74	64	138	114
2017	66	40	66	79	145	106	67	40	67	77	144	107
2018	60	48	59	84	144	107	60	57	59	75	135	116
2019	58	70	59	65	123	129	57	45	57	93	150	102
2020	57	46	57	93	150	103	72	35	72	74	146	107
2021	59	68	59	66	125	127	63	42	64	83	146	106
2022	64	75	63	49	113	138	61	80	61	49	110	141
2023	62	45	63	80	142	108	63	46	63	78	141	109
$max$	81	76	80	93	150	142	74	80	74	93	150	141
$min$	57	40	57	37	109	103	57	35	56	47	110	102
$M$	66	56	67	63	127	125	63	46	64	77	141	111
$μ$	67	56	67	63	129	122	64	50	64	74	138	114
$σ$	6	11	5	14	11	11	5	11	5	12	11	10

Table 13.

Comparison of machine learning metrics.

	Recall, Specificity, and Accuracy						Precision, NPV, and Prevalence
	TPR		TNR		ACC		PPV		NPV		$π^{+}$
Year	GV	MC	GV	MC	GV	MC	GV	MC	GV	MC	GV	MC
2001	55	51	53	52	54	52	53	52	55	51	49	50
2002	69	61	60	55	64	58	60	55	68	61	47	48
2003	57	46	58	61	58	52	58	61	57	46	50	57
2004	58	41	54	57	56	48	53	57	59	42	48	58
2005	55	46	50	58	52	52	50	58	55	46	48	56
2006	61	41	46	53	53	46	47	53	61	41	43	56
2007	51	51	63	63	57	57	63	63	51	51	55	55
2008	53	50	52	49	53	49	52	49	53	50	49	50
2009	46	44	63	57	53	50	63	56	46	44	58	56
2010	54	43	54	59	54	50	54	59	54	42	50	58
2011	55	43	55	52	55	47	56	51	55	44	50	54
2012	65	41	56	50	60	45	56	51	65	41	46	56
2013	53	43	59	62	56	51	59	62	53	43	53	59
2014	47	44	59	64	52	52	59	64	47	44	56	59
2015	48	55	58	50	53	52	58	50	49	55	54	48
2016	49	54	51	65	50	59	51	65	49	54	51	55
2017	46	47	62	63	53	53	62	63	46	47	58	57
2018	42	44	55	51	47	47	56	51	41	44	57	54
2019	47	38	46	56	46	45	45	56	48	38	49	60
2020	38	49	55	67	45	57	55	67	38	49	59	58
2021	47	43	46	60	47	50	46	60	47	44	50	58
2022	57	55	46	43	51	49	46	43	56	55	45	44
2023	44	45	58	58	50	50	58	58	44	45	57	56
$max$	69	61	63	67	64	59	63	67	68	61	59	60
$min$	38	38	46	43	45	45	45	43	38	38	43	44
$M$	53	45	55	57	53	50	56	57	53	45	50	56
$μ$	52	47	55	57	53	51	55	57	52	47	51	55
$σ$	7	6	5	6	4	4	5	6	7	6	5	4

Before discussing the overall results, let us focus on 2023 and illustrate the computation of returns for the Growth-Value and Market-Cash strategies, using the 2023 machine learning statistics and average returns from Tables 12, 13 and 14. To that end, recall the general equation (13) linking the ML-return of a strategy with machine learning metrics:

\begin{aligned} R_{str}^{*} & = \underbrace{T^{+} (TPR)}_{TP} r_{A}^{+} + \underbrace{T^{+} (1 - TPR)}_{FN} r_{B}^{+} \\ & + \underbrace{T^{-} (TNR)}_{TN} r_{B}^{-} + \underbrace{T^{-} (1 - TNR)}_{FP} r_{A}^{-} \end{aligned}

Table 14.

Daily risk and return details.

	St. Dev (risk)			Average Log Return for True Labels
	Buy-and-Hold			GV				MC
Year	$σ_{M}$	$σ_{G}$	$σ_{V}$	$r_{G}^{+}$	$r_{G}^{-}$	$r_{V}^{+}$	$r_{V}^{-}$	$r_{M}^{+}$	$r_{M}^{-}$
2001	1.39	2.58	1.09	1.51	$-$ 1.67	$-$ 0.00	$-$ 0.04	1.00	$-$ 1.12
2002	1.67	2.09	1.55	1.09	$-$ 1.25	$-$ 0.08	$-$ 0.08	1.25	$-$ 1.32
2003	1.04	1.15	1.04	0.49	$-$ 0.30	$-$ 0.09	0.27	0.79	$-$ 0.83
2004	0.70	0.70	0.66	0.26	$-$ 0.20	$-$ 0.11	0.19	0.51	$-$ 0.60
2005	0.65	0.64	0.61	0.24	$-$ 0.20	$-$ 0.07	0.11	0.48	$-$ 0.56
2006	0.63	0.68	0.62	0.29	$-$ 0.16	0.05	0.10	0.47	$-$ 0.47
2007	1.00	0.92	1.05	$-$ 0.00	0.09	$-$ 0.32	0.41	0.67	$-$ 0.78
2008	2.60	2.35	2.57	$-$ 0.34	$-$ 0.03	$-$ 1.06	0.68	1.52	$-$ 1.87
2009	1.68	1.61	1.75	0.18	0.05	$-$ 0.31	0.57	1.16	$-$ 1.26
2010	1.13	1.19	1.12	0.35	$-$ 0.22	0.00	0.11	0.73	$-$ 0.88
2011	1.45	1.36	1.49	$-$ 0.08	0.12	$-$ 0.48	0.49	0.96	$-$ 1.11
2012	0.80	0.76	0.88	0.04	0.06	$-$ 0.34	0.42	0.59	$-$ 0.60
2013	0.70	0.69	0.66	0.27	$-$ 0.06	0.04	0.18	0.55	$-$ 0.52
2014	0.71	0.77	0.65	0.35	$-$ 0.31	0.11	$-$ 0.03	0.48	$-$ 0.58
2015	0.97	1.00	0.97	0.17	$-$ 0.16	$-$ 0.14	0.14	0.75	$-$ 0.68
2016	0.82	0.84	0.83	0.20	$-$ 0.16	$-$ 0.08	0.21	0.57	$-$ 0.59
2017	0.42	0.45	0.45	0.17	$-$ 0.01	$-$ 0.10	0.27	0.33	$-$ 0.27
2018	1.07	1.23	0.95	0.50	$-$ 0.68	0.08	$-$ 0.20	0.67	$-$ 0.82
2019	0.79	0.81	0.80	0.22	$-$ 0.00	$-$ 0.05	0.26	0.57	$-$ 0.57
2020	2.10	2.18	2.21	0.38	$-$ 0.28	$-$ 0.37	0.55	1.21	$-$ 1.49
2021	0.82	1.03	0.82	0.62	$-$ 0.39	$-$ 0.12	0.29	0.63	$-$ 0.63
2022	1.53	1.93	1.21	1.25	$-$ 1.27	0.47	$-$ 0.43	1.28	$-$ 1.14
2023	0.82	0.84	0.84	0.23	$-$ 0.06	$-$ 0.06	0.27	0.66	$-$ 0.64
$max$	2.60	2.58	2.57	1.54	0.12	0.48	0.71	1.56	$-$ 0.27
$min$	0.42	0.45	0.45	$-$ 0.31	$-$ 1.64	$-$ 1.02	$-$ 0.42	0.33	$-$ 1.84
$M$	0.97	1.00	0.95	0.26	$-$ 0.16	$-$ 0.08	0.22	0.67	$-$ 0.68
$μ$	1.11	1.21	1.08	0.38	$-$ 0.30	$-$ 0.12	0.21	0.78	$-$ 0.83
$σ$	0.53	0.61	0.52	0.42	0.47	0.28	0.26	0.32	0.37

For Growth-Value GV-1L strategy, $A$ is the S&P-500 Growth and $B$ is the S&P-500 Value index. For Market-Cash, $A$ is the S&P-500 index and $B$ is cash.

Growth-Value: From Table 14, the average log returns for GV-1L in 2023 were $r_{G}^{+} = 0.24$ , $r_{G}^{-} = - 0.06$ , $r_{V}^{+} = - 0.06$ , and $r_{V}^{-} = 0.27$ . For confusion matrix counts, from Table 12 we obtain: $TP = 62$ , $FN = 80$ , $TN = 63$ , and $FP = 45$ . Substituting these values in the above equation for $R_{str}^{*}$ , we get

\begin{aligned} R_{GV-1L}^{*} & = TP \cdot r_{G}^{+} + FN \cdot r_{V}^{+} \\ & + TN \cdot r_{V}^{-} + FP \cdot r_{G}^{-} \approx 24.4\% \end{aligned}

Market-Cash: From Table 14, the average log returns for MC-1L in 2023 were $r_{A}^{+} = r_{M}^{+} = 0.66$ , $r_{A}^{-} = r_{M}^{-} = - 0.63$ and $r_{B}^{+} = r_{B}^{-} = 0$ . For confusion matrix counts, from Table 12 we have $TP = 63$ , $FN = 78$ , $TN = 63$ , and $FP = 46$ . Substituting these values in the above equation for $R_{str}^{*}$ , the $FN$ and $TN$ terms vanish because cash earns no return, and we get:

R_{MC-1L}^{*} = TP \cdot r_{M}^{+} + FP \cdot r_{M}^{-} \approx 12.6\%
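The two substitutions above can be reproduced with a short script. This is a sketch of equation (13) written in terms of confusion matrix counts; the function name `ml_return` is hypothetical, and the inputs are the 2023 values quoted in the text.

```python
def ml_return(tp, fn, tn, fp, r_a_pos, r_a_neg, r_b_pos, r_b_neg):
    """ML-return of a two-security strategy from confusion matrix counts.

    TP and FP days are spent in security A (on '+' and '-' days respectively);
    FN and TN days are spent in security B (on '+' and '-' days respectively).
    Average daily returns are given in percent."""
    return tp * r_a_pos + fn * r_b_pos + tn * r_b_neg + fp * r_a_neg


# Growth-Value GV-1L, 2023: A = Growth index, B = Value index.
r_gv = ml_return(tp=62, fn=80, tn=63, fp=45,
                 r_a_pos=0.24, r_a_neg=-0.06, r_b_pos=-0.06, r_b_neg=0.27)

# Market-Cash MC-1L, 2023: A = S&P-500 index, B = cash (zero return).
r_mc = ml_return(tp=63, fn=78, tn=63, fp=46,
                 r_a_pos=0.66, r_a_neg=-0.63, r_b_pos=0.0, r_b_neg=0.0)

print(f"GV-1L: {r_gv:.1f}%")  # 24.4%
print(f"MC-1L: {r_mc:.1f}%")  # 12.6%
```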

We now examine the returns data in Table 11. The outperformance of the GV-1L strategy over other strategies is very significant: about 700 basis points higher than the buy-and-hold strategies and about 600 basis points higher than MC-1L. This is somewhat unexpected: intuitively, we would expect the opposite, since in Growth-Value strategies we are always invested and the indices themselves are correlated, whereas in Market-Cash we seek to avoid losses by having a cash position. However, these results tell us that our intuition is wrong. One plausible explanation for this is that on most days markets overreact to news and this is somewhat corrected in the next day(s).

We also observe that the standard deviation of annual returns of the MC-1L strategy is the lowest at 10.8, compared with 17.4 for GV-1L and, for the Buy-and-hold strategies, 18.3 for S&P-500, 22.3 for S&P-Growth and 16.6 for S&P-Value. This stability makes MC-1L a viable option for those seeking a more stable return.

Since both “winner” strategies resulted in a loss, we remove them from the analysis and focus on a comparison of the Growth-Value and Market-Cash “loser” strategies. We first consider the case $k = 1$ .

In the 23 years from 2001 to 2023, compared with MC-1L, the GV-1L strategy had a higher True Positive Rate (TPR) in 17 years, a higher True Negative Rate (TNR) in 10 years, and higher overall accuracy in 18 years. The difference in TPR was quite significant (median values of 53 vs. 45), whereas the underperformance in TNR was minor (55 vs. 57), as was the difference in overall accuracy (53 vs. 51). The significant overperformance in TPR resulted in significant overperformance of the GV-1L strategy.

One exception to this is the year 2020. In that year, the GV-1L strategy returned 6% (vs. its median return $M = 17.0\%$ ), whereas MC-1L returned 36.1% (vs. its median return $M = 10.8\%$ ). We can explain this by examining the machine learning metrics in Table 13 and the average benchmark returns in Table 14.

Growth-Value GV-1L: for most years, the number of “+” and “−” days is about equal, with median values of $T^{+} = 127$ and $T^{-} = 125$ , respectively. However, in 2020, the number of “+” days increased significantly to $T^{+} = 150$ and the number of “−” days decreased significantly to $T^{-} = 103$ . At the same time, as seen from Table 13, for that year the True Positive Rate dropped to $TPR = 0.38$ compared to its median $TPR = 0.53$ . This resulted in $TP = 150 \cdot 0.38 = 57$ , much lower than the median value of $TP = 127 \cdot 0.53 = 67$ . The True Negative Rate remained at its typical (median) value of $TNR = 0.55$ . As seen from Table 14, although the value of $r_{G}^{+} = 0.4$ was higher than the median value of $r_{G}^{+}$ , the decrease in $TPR$ , together with the shift in the number of “+” days, resulted in a big drop in returns for this strategy. Also, 2020 was one of the few years when $r_{G}^{+} < r_{V}^{-}$ and $r_{G}^{-} > r_{V}^{+}$ . Compared to typical values, the overall decline in $TP$ to 57 contributed to the significant underperformance of the GV-1L strategy for that year, with an unusually low return of $6\%$ compared to the typical annual return of $17\%$ .

Market-Cash MC-1L: in 2020, the number of “+” days $T^{+} = 146$ and the number of “−” days $T^{-} = 107$ were close to the median values of $T^{+} = 141$ and $T^{-} = 111$ . As seen from Table 13, for that year the True Positive Rate was $TPR = 0.49$ compared to its median $TPR = 0.45$ . This resulted in $TP = 146 \cdot 0.49 = 72$ , much higher than the median value of $TP = 141 \cdot 0.45 = 63$ . The True Negative Rate was $TNR = 0.67$ compared to its median value $TNR = 0.57$ . This resulted in $TN = 107 \cdot 0.67 = 72$ , much higher than the median value $TN = 111 \cdot 0.57 = 63$ . As seen from Table 14, for 2020 the returns $r_{M}^{+} = 1.23$ and $r_{M}^{-} = - 1.46$ were about twice the median values of $r_{M}^{+} = 0.67$ and $r_{M}^{-} = - 0.68$ respectively. However, a much higher value for $TN$ means that we remained in cash positions for more of the “−” days, and a much higher value for $TP$ means that we took more advantage of higher returns on the “+” days. The number of correctly predicted true labels $(TP + TN) = 144$ is significantly higher than this number in a typical year ( $63 + 63 = 126$ ). All this resulted in an unusually high return of $36.1\%$ for the MC-1L strategy, compared to the typical value of $10.8\%$ .

The above discussion illustrates how one can connect machine learning and return statistics and explain the difference in strategy performance.
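The rates used in this discussion follow directly from the confusion matrix counts in Table 12. The sketch below uses the standard definitions of the metrics; the function name `confusion_metrics` is hypothetical, and the inputs are the 2020 counts for GV-1L and MC-1L.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard confusion matrix rates, returned as fractions."""
    t_pos = tp + fn          # number of true '+' days
    t_neg = tn + fp          # number of true '-' days
    total = t_pos + t_neg
    return {
        "TPR": tp / t_pos,             # recall
        "TNR": tn / t_neg,             # specificity
        "ACC": (tp + tn) / total,      # accuracy
        "PPV": tp / (tp + fp),         # precision
        "NPV": tn / (tn + fn),         # negative predictive value
        "prevalence": t_pos / total,   # share of '+' days
    }


# 2020 counts from Table 12.
gv_2020 = confusion_metrics(tp=57, fp=46, tn=57, fn=93)
mc_2020 = confusion_metrics(tp=72, fp=35, tn=72, fn=74)

for name, metrics in (("GV-1L", gv_2020), ("MC-1L", mc_2020)):
    print(name, {key: round(100 * value) for key, value in metrics.items()})
```

Rounded to whole percentages, these values match the 2020 row of Table 13.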

Comparison of volatility and drawdowns

Next, we consider the volatility. We saw in Section “Analysis of volatility and sharpe ratios by corresponding machine learning metrics” that increasing TPR increases both the returns and the volatility, whereas increasing TNR increases the return by a smaller amount and decreases volatility. A detailed comparison of volatility and drawdowns, including a summary of median values, is presented in the Appendix in Table 15.

Table 15.

Annual maximum drawdown and volatility of investment strategies.

	Maximum Drawdowns					Annual Volatility
	Buy-and-Hold			$k = 1$ day		Buy-and-Hold			$k = 1$ day
Year	S&P	G	V	GV	MC	S&P	G	V	GV	MC
2001	$-$ 28.8	$-$ 48.6	$-$ 18.1	$-$ 32.0	$-$ 19.7	21.9	40.7	17.2	33.4	16.2
2002	$-$ 33.0	$-$ 40.1	$-$ 31.2	$-$ 26.4	$-$ 18.0	26.4	33.1	24.5	29.2	19.5
2003	$-$ 13.7	$-$ 12.8	$-$ 15.3	$-$ 13.9	$-$ 6.5	16.5	18.2	16.5	17.2	10.9
2004	$-$ 7.5	$-$ 10.8	$-$ 7.5	$-$ 6.5	$-$ 6.6	11.1	11.1	10.5	11.2	7.4
2005	$-$ 7.0	$-$ 8.0	$-$ 6.2	$-$ 5.6	$-$ 4.9	10.3	10.2	9.8	10.1	7.1
2006	$-$ 7.6	$-$ 9.0	$-$ 7.4	$-$ 7.7	$-$ 6.8	10.0	10.7	9.8	10.6	7.0
2007	$-$ 9.9	$-$ 9.0	$-$ 12.0	$-$ 10.1	$-$ 5.7	15.9	14.6	16.7	15.9	11.0
2008	$-$ 47.1	$-$ 48.0	$-$ 47.0	$-$ 41.8	$-$ 20.9	41.4	37.4	40.9	39.9	33.3
2009	$-$ 27.1	$-$ 23.0	$-$ 30.6	$-$ 28.4	$-$ 14.7	26.6	25.5	27.8	26.9	18.9
2010	$-$ 15.7	$-$ 16.3	$-$ 14.5	$-$ 14.3	$-$ 8.0	17.9	18.8	17.8	18.1	14.6
2011	$-$ 18.6	$-$ 16.5	$-$ 21.9	$-$ 16.8	$-$ 16.1	23.0	21.5	23.7	22.2	16.3
2012	$-$ 9.7	$-$ 8.3	$-$ 11.2	$-$ 6.3	$-$ 9.2	12.7	12.0	13.9	13.1	9.5
2013	$-$ 5.6	$-$ 5.8	$-$ 5.1	$-$ 4.3	$-$ 4.7	11.1	10.9	10.4	10.8	7.7
2014	$-$ 7.3	$-$ 7.4	$-$ 7.4	$-$ 7.4	$-$ 4.2	11.2	12.2	10.3	11.3	8.0
2015	$-$ 11.9	$-$ 11.8	$-$ 13.6	$-$ 13.2	$-$ 11.7	15.4	15.8	15.3	15.7	12.6
2016	$-$ 9.2	$-$ 9.5	$-$ 8.6	$-$ 10.8	$-$ 3.8	13.1	13.4	13.3	13.8	8.8
2017	$-$ 2.6	$-$ 2.4	$-$ 4.4	$-$ 3.3	$-$ 1.8	6.7	7.2	7.2	7.2	4.9
2018	$-$ 19.3	$-$ 20.6	$-$ 19.2	$-$ 19.5	$-$ 19.9	17.0	19.5	15.1	17.1	14.0
2019	$-$ 6.6	$-$ 6.4	$-$ 7.7	$-$ 7.3	$-$ 6.6	12.5	12.9	12.8	13.0	9.1
2020	$-$ 33.7	$-$ 31.3	$-$ 36.9	$-$ 36.0	$-$ 17.6	33.5	34.7	35.2	35.3	26.3
2021	$-$ 5.1	$-$ 8.7	$-$ 5.8	$-$ 7.9	$-$ 4.3	13.0	16.3	13.0	15.3	9.4
2022	$-$ 24.5	$-$ 32.3	$-$ 17.9	$-$ 19.5	$-$ 17.0	24.2	30.6	19.2	26.3	18.0
2023	$-$ 10.0	$-$ 9.1	$-$ 10.9	$-$ 9.8	$-$ 7.0	13.0	13.3	13.3	13.5	9.3
$max$	$-$ 2.6	$-$ 2.4	$-$ 4.4	$-$ 3.3	$-$ 1.8	41.4	40.7	40.9	39.9	33.3
$min$	$-$ 47.1	$-$ 48.6	$-$ 47.0	$-$ 41.8	$-$ 20.9	6.7	7.2	7.2	7.2	4.9
$M$	$-$ 10.0	$-$ 10.8	$-$ 12.0	$-$ 10.8	$-$ 7.0	15.4	15.8	15.1	15.7	10.9
$μ$	$-$ 15.7	$-$ 17.2	$-$ 15.7	$-$ 15.2	$-$ 10.2	17.6	19.2	17.1	18.6	13.0
$σ$	11.5	13.6	11.2	10.8	6.2	8.4	9.7	8.4	9.0	6.8

Not surprisingly, the much higher TPR for GV-1L translated into significantly higher volatility. In fact, in each of the 23 years from 2001 to 2023, the volatility of the GV-1L strategy was higher than that of MC-1L, by about 50% as measured by the median values (15.7 vs. 10.9). For the maximum drawdowns, GV-1L had higher drawdowns than MC-1L in 19 out of 23 years, also by about 50% as measured by the median values ( $- 10.8\%$ vs. $- 7\%$ ).

For volatility, recall the equation (16) linking ML-volatilities of the benchmarks and machine learning metrics:

S_{str}^{*} = \sqrt{\underbrace{(TPR / PPV) π^{+} S_{A}^{2}}_{(TP + FP) σ_{A}^{2}} + \underbrace{(TNR / NPV) (1 - π^{+}) S_{B}^{2}}_{(TN + FN) σ_{B}^{2}}}

From the above, we compute the volatilities of the two strategies for 2023. For illustration, we will compute these from the ratios.

Growth-Value: from Tables 13 and 15 we have $TPR = 0.44$ , $PPV = 0.58$ , $π^{+} = 0.57$ , $S_{A} = 13.3$ and $TNR = 0.58$ , $NPV = 0.44$ , $S_{B} = 13.3$ . Substituting this in equation (16) we obtain $S_{GV}^{*} = 13.3$ .

Market-Cash: from Tables 13 and 15 we have $TPR = 0.45$ , $PPV = 0.58$ , $π^{+} = 0.56$ , $S_{A} = 13.0$ and $TNR = 0.58$ , $NPV = 0.45$ , $S_{B} = 0$ . Substituting this in equation (16) we obtain $S_{MC}^{*} = 8.6$ .
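Equation (16) can be evaluated numerically as follows. This is a sketch, with a hypothetical function name, that takes the 2023 ratios from Table 13 and the 2023 annual volatilities of the underlying indices from Table 15 (13.3 for both Growth and Value, 13.0 for the S&P-500, and 0 for cash).

```python
import math


def ml_volatility(tpr, ppv, tnr, npv, prevalence, s_a, s_b):
    """ML-volatility of a two-security strategy from equation (16).

    (TPR / PPV) * prevalence is the share of days spent in security A,
    (TNR / NPV) * (1 - prevalence) is the share of days spent in security B."""
    share_a = (tpr / ppv) * prevalence
    share_b = (tnr / npv) * (1.0 - prevalence)
    return math.sqrt(share_a * s_a ** 2 + share_b * s_b ** 2)


# Growth-Value GV-1L, 2023: A = Growth index, B = Value index.
s_gv = ml_volatility(tpr=0.44, ppv=0.58, tnr=0.58, npv=0.44,
                     prevalence=0.57, s_a=13.3, s_b=13.3)

# Market-Cash MC-1L, 2023: A = S&P-500 index, B = cash (zero volatility).
s_mc = ml_volatility(tpr=0.45, ppv=0.58, tnr=0.58, npv=0.45,
                     prevalence=0.56, s_a=13.0, s_b=0.0)

print(f"GV-1L: {s_gv:.1f}")  # 13.3
print(f"MC-1L: {s_mc:.1f}")  # 8.6
```

Because the Growth and Value indices had the same volatility in 2023 and the two day shares sum to roughly one, the Growth-Value result is essentially the volatility of either index.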

We could consider other measures of volatility, for example, the integrated volatility $S^{*}$ defined by $S^{*} = (r_{1}^{2} + \dots + r_{T}^{2})$ . This integrated volatility can be expressed in terms of the average return $R$ and volatility $S$ by $S^{*} = S^{2} - R^{2}$ , and therefore, can be expressed in terms of confusion matrix counts. Similarly, from equation (10) we can consider integrated volatility in terms of logarithmic returns.

Comparison of tracking errors and Sharpe ratios

A detailed comparison of tracking errors and Sharpe ratios is presented in Table 16.

Table 16.

Comparison of tracking errors and Sharpe ratios.

	Return	Tracking Error					Sharpe Ratio
Year	S&P	G	V	GV	MC	S&P	G	V	GV	MC
2001	$-$ 11.8	$-$ 14.1	6.2	28.8	6.7	$-$ 0.5	$-$ 0.6	$-$ 0.3	0.5	$-$ 0.3
2002	$-$ 21.6	$-$ 10.5	3.6	55.7	30.5	$-$ 0.8	$-$ 1.0	$-$ 0.7	1.2	0.5
2003	28.2	0.2	$-$ 3.0	23.3	$-$ 10.4	1.7	1.6	1.5	3.0	1.6
2004	10.7	$-$ 5.4	2.5	7.7	$-$ 6.2	1.0	0.5	1.3	1.6	0.6
2005	4.8	$-$ 2.1	0.5	4.8	6.0	0.5	0.3	0.5	1.0	1.5
2006	15.8	$-$ 6.8	5.7	3.1	$-$ 11.6	1.6	0.8	2.2	1.8	0.6
2007	5.1	5.6	$-$ 3.7	7.4	14.2	0.3	0.7	0.1	0.8	1.8
2008	$-$ 36.8	$-$ 0.6	0.5	10.3	47.3	$-$ 0.9	$-$ 1.0	$-$ 0.9	$-$ 0.7	0.3
2009	26.4	10.6	$-$ 9.3	8.1	$-$ 6.1	1.0	1.5	0.6	1.3	1.1
2010	15.1	1.2	0.4	7.7	$-$ 1.6	0.8	0.9	0.9	1.3	0.9
2011	1.9	2.8	$-$ 2.6	7.4	$-$ 5.3	0.1	0.2	0.0	0.4	$-$ 0.2
2012	16.0	$-$ 1.8	1.2	20.3	$-$ 15.3	1.3	1.2	1.2	2.8	0.1
2013	32.3	0.3	$-$ 0.5	7.9	$-$ 16.3	2.9	3.0	3.0	3.7	2.1
2014	13.5	1.3	$-$ 1.3	3.4	$-$ 2.5	1.2	1.2	1.2	1.5	1.4
2015	1.2	3.8	$-$ 4.4	3.3	$-$ 1.0	0.1	0.3	$-$ 0.2	0.3	0.0
2016	12.0	$-$ 5.2	5.1	$-$ 0.9	5.9	0.9	0.5	1.3	0.8	2.0
2017	21.7	5.5	$-$ 6.3	$-$ 0.5	$-$ 8.3	3.2	3.8	2.1	3.0	2.7
2018	$-$ 4.6	4.5	$-$ 4.4	$-$ 2.2	$-$ 8.1	$-$ 0.3	0.0	$-$ 0.6	$-$ 0.4	$-$ 0.9
2019	31.2	$-$ 0.4	0.5	$-$ 5.5	$-$ 22.0	2.5	2.4	2.5	2.0	1.0
2020	18.3	15.1	$-$ 17.0	$-$ 12.3	17.8	0.5	1.0	0.0	0.2	1.4
2021	28.7	3.3	$-$ 3.8	$-$ 12.4	$-$ 14.6	2.2	2.0	1.9	1.1	1.5
2022	$-$ 18.2	$-$ 11.2	12.9	5.1	11.5	$-$ 0.8	$-$ 1.0	$-$ 0.3	$-$ 0.5	$-$ 0.4
2023	26.2	3.8	$-$ 4.0	$-$ 3.9	$-$ 9.7	2.0	2.3	1.7	1.6	1.8
$max$	32.3	15.1	12.9	55.7	47.3	3.2	3.8	3.0	3.7	2.7
$min$	$-$ 36.8	$-$ 14.1	$-$ 17.0	$-$ 12.4	$-$ 22.0	$-$ 0.9	$-$ 1.0	$-$ 0.9	$-$ 0.7	$-$ 0.9
$M$	13.5	0.3	$-$ 0.5	5.1	$-$ 5.3	0.9	0.8	0.9	1.2	1.0
$μ$	9.4	0.0	$-$ 0.9	7.2	0.0	0.9	0.9	0.8	1.2	0.9
$σ$	18.3	6.8	6.0	14.4	16.1	1.2	1.2	1.1	1.1	0.9

In 17 out of 23 years, the tracking error of GV-1L was superior to that of Market-Cash; for GV-1L the median value was $M = 5.1$ and the mean value was $μ = 7.2$ . The Market-Cash strategy offered very little advantage over S&P Buy-and-Hold. In 2008, it outperformed the index by $47.3\%$ . Without this year, the strategy would probably result in a loss. By contrast, the Growth-Value strategy would still outperform the benchmark even if we were to remove its best year, 2002.

Next, we consider the Sharpe ratios. First, we illustrate the computation for 2023 using the previous results that we computed using machine learning and return statistics:

Growth-Value: $R_{GV}^{*} = 24.4$ and $S_{GV}^{*} = 13.3$ . This gives ML-Sharpe’s ratio $S R_{GV}^{*} = R_{GV}^{*} / S_{GV}^{*} \approx 1.8$

Market-Cash: $R_{MC}^{*} = 12.6$ and $S_{MC}^{*} = 8.6$ . This gives ML-Sharpe’s ratio $S R_{MC}^{*} = R_{MC}^{*} / S_{MC}^{*} \approx 1.5$

We now examine Sharpe ratios in more detail. From Table 16, GV-1L has the highest Sharpe ratio, with a median and mean value of 1.2. This value is higher than that of the benchmarks ( $0.8 - 0.9$ range). By contrast, the Sharpe ratio of MC-1L is lower than that of GV-1L and is comparable to that of the benchmarks. In fact, in 15 out of 23 years, the Sharpe ratio of GV-1L was higher than that of MC-1L.

Comparison of strategies by return efficiency ratios

In the previous sections, we compared the strategies by examining returns, volatility, and Sharpe’s ratio. We now ask the question: how efficient are the strategies, and do they outperform a simple random flip strategy? To that end, we compute the return efficiency index of the strategies and compare it to the return efficiency index of random flip.

The results are presented in Figure 6.

Figure 6.

Return Efficiency Index of Strategies.

The detailed comparison is presented in the Appendix in Table 17. In the 23 years from 2001 to 2023, in terms of the Return Efficiency Index, GV-1L outperformed the random flip strategy in 16 years, with a mean difference of $0.038$ in return efficiency ratios. MC-1L outperformed the random flip in 14 years by a much smaller mean difference of $0.013$ . The median value of the Return Efficiency Index for GV-1L was 0.41, about 25% higher than the median value of 0.32 for MC-1L. Not only does GV-1L give a higher return and Sharpe ratio, but it is also more efficient as a strategy in capturing the potential returns range.

Table 17.

Return efficiency ratios (C means cash).

	Buy-and-Hold				Growth-Value		Market-Cash
Year	S&P	G	V	C	GV-1L	Rand.	MC-1L	Rand.
2001	0.19	0.11	0.14	0.23	0.18	0.13	0.21	0.21
2002	0.14	0.17	0.21	0.19	0.38	0.19	0.21	0.17
2003	0.32	0.33	0.32	0.22	0.45	0.32	0.28	0.27
2004	0.37	0.35	0.42	0.30	0.47	0.38	0.33	0.33
2005	0.36	0.39	0.42	0.33	0.47	0.40	0.40	0.34
2006	0.42	0.34	0.51	0.30	0.48	0.42	0.33	0.36
2007	0.30	0.46	0.35	0.28	0.48	0.40	0.37	0.29
2008	0.08	0.28	0.29	0.13	0.37	0.29	0.15	0.11
2009	0.21	0.41	0.29	0.15	0.39	0.35	0.20	0.18
2010	0.30	0.40	0.39	0.24	0.46	0.40	0.29	0.27
2011	0.22	0.41	0.35	0.21	0.45	0.38	0.20	0.21
2012	0.37	0.37	0.40	0.28	0.57	0.39	0.28	0.32
2013	0.44	0.43	0.42	0.25	0.53	0.43	0.34	0.34
2014	0.39	0.44	0.40	0.30	0.46	0.42	0.37	0.34
2015	0.29	0.46	0.35	0.29	0.45	0.41	0.29	0.29
2016	0.36	0.35	0.45	0.29	0.39	0.40	0.40	0.33
2017	0.54	0.49	0.35	0.29	0.41	0.42	0.44	0.41
2018	0.27	0.40	0.32	0.29	0.35	0.36	0.23	0.28
2019	0.42	0.41	0.42	0.25	0.35	0.42	0.30	0.33
2020	0.17	0.34	0.22	0.14	0.23	0.28	0.21	0.16
2021	0.39	0.30	0.28	0.24	0.24	0.29	0.32	0.32
2022	0.16	0.21	0.33	0.21	0.29	0.27	0.19	0.18
2023	0.38	0.44	0.37	0.25	0.37	0.40	0.33	0.31
$max$	0.54	0.49	0.51	0.33	0.57	0.43	0.44	0.41
$min$	0.08	0.11	0.14	0.13	0.18	0.13	0.15	0.11
$M$	0.32	0.39	0.35	0.25	0.41	0.39	0.29	0.29
$μ$	0.31	0.36	0.35	0.25	0.40	0.35	0.29	0.28
$σ$	0.11	0.10	0.08	0.05	0.10	0.08	0.08	0.08

Table 18.

A comparison of simple vs. logarithmic returns ( $App. = T^{+} r^{+} + T^{-} r^{-}$ ).

		True Label		Average Daily Return				Average Log Returns
Year	B&H	$T^{+}$	$T^{-}$	$r^{+}$	$r^{-}$	App.	$|\%E|$	$r^{+}$	$r^{-}$	App.	$|\%E|$
2001	-11.76	125	123	1.01	-1.11	-10.11	14.01	1.00	-1.12	-12.51	6.37
2002	-21.58	120	132	1.26	-1.30	-20.83	3.48	1.25	-1.32	-24.31	12.65
2003	28.18	144	108	0.80	-0.82	26.20	7.04	0.79	-0.83	24.83	11.88
2004	10.70	145	107	0.52	-0.60	10.78	0.81	0.51	-0.60	10.16	5.06
2005	4.83	140	112	0.48	-0.56	5.25	8.66	0.48	-0.56	4.72	2.34
2006	15.85	141	110	0.47	-0.47	15.21	4.01	0.47	-0.47	14.71	7.17
2007	5.15	139	112	0.67	-0.78	6.28	21.99	0.67	-0.79	5.02	2.54
2008	-36.79	126	127	1.56	-1.84	-37.36	1.53	1.52	-1.87	-45.87	24.67
2009	26.35	141	111	1.17	-1.24	26.93	2.19	1.16	-1.26	23.39	11.25
2010	15.06	147	105	0.73	-0.88	15.63	3.82	0.73	-0.88	14.03	6.79
2011	1.89	136	116	0.97	-1.10	4.53	138.94	0.96	-1.11	1.89	0.47
2012	15.99	139	111	0.59	-0.60	15.64	2.20	0.59	-0.60	14.83	7.23
2013	32.31	149	103	0.55	-0.52	28.62	11.41	0.55	-0.52	28.00	13.32
2014	13.46	149	103	0.49	-0.58	13.27	1.47	0.48	-0.58	12.64	6.12
2015	1.23	121	131	0.75	-0.68	2.41	95.63	0.75	-0.68	1.22	1.12
2016	12.00	138	114	0.57	-0.59	12.18	1.56	0.57	-0.59	11.33	5.56
2017	21.71	144	107	0.33	-0.27	19.88	8.42	0.33	-0.27	19.64	9.49
2018	-4.57	135	116	0.68	-0.81	-3.23	29.27	0.67	-0.82	-4.67	2.22
2019	31.22	150	102	0.57	-0.56	27.97	10.42	0.57	-0.57	27.17	12.97
2020	18.33	146	107	1.23	-1.46	22.47	22.57	1.21	-1.49	16.84	8.16
2021	28.73	146	106	0.63	-0.62	26.11	9.12	0.63	-0.63	25.26	12.09
2022	-18.18	110	141	1.29	-1.13	-17.13	5.73	1.28	-1.14	-20.06	10.39
2023	26.18	141	109	0.66	-0.63	24.11	7.91	0.66	-0.64	23.25	11.19
$max$	32.31	150	141	1.56	-0.27	28.62	138.94	1.56	-0.27	28.00	24.67
$min$	-36.79	110	102	0.33	-1.84	-37.36	0.81	0.33	-1.84	-45.87	0.47
$M$	13.46	141	111	0.67	-0.68	13.27	7.91	0.67	-0.68	12.64	7.23
$μ$	9.40	138	114	0.78	-0.83	9.34	17.92	0.78	-0.83	7.46	8.31
$σ$	18.25	10.6	10.4	0.32	0.37	17.40	32.88	0.32	0.37	18.60	5.37

For ML-Return Efficiency Ratios, recall the equation linking ratios of the benchmarks and machine learning metrics:

I_{str}^{*} = TPR \cdot I_{A} + TNR \cdot I_{B} \qquad (28)

Let us illustrate the computation of the Return Efficiency Ratios of the two strategies for 2023:

Growth-Value: we have $TPR = 0.44$ , $TNR = 0.58$ , $I_{A} = 0.44$ , and $I_{B} = 0.37$ . Substituting this in the above equation, we obtain $I_{GV}^{*} \approx 0.41$ .

From equation (25) we compute the probability $p_{1}$ of the equivalent $p$ -strategy:

p_{1} \approx \frac{TPR \cdot I_{A} - (1 - TNR) \cdot I_{B}}{I_{A} - I_{B}} = 0.55

Market-Cash: we have $TPR = 0.45$ , $TNR = 0.58$ , $I_{A} = 0.38$ , and $I_{B} = 0.25$ . Substituting this in the above equation, we obtain $I_{MC}^{*} \approx 0.32$ .

From equation (25) we compute the probability $p_{2}$ of the equivalent $p$ -strategy:

p_{2} \approx \frac{TPR \cdot I_{A} - (1 - TNR) \cdot I_{B}}{I_{A} - I_{B}} = \frac{0.45 \cdot 0.38 - (1 - 0.58) \cdot 0.25}{0.38 - 0.25} \approx 0.51

By the return efficiency index, the GV-1L strategy is superior to a $p = 1 / 2$ random flip strategy ( $p_{1} \approx 0.55$ ), whereas the MC-1L strategy is essentially equivalent to one ( $p_{2} \approx 0.51$ ). Although the $TPR$ and $TNR$ are similar for both strategies, the return efficiency indices of the benchmarks are quite different.
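The two computations above can be reproduced with a few lines of Python. This is only a sketch: the function names are illustrative and not part of the paper, and the inputs are the rounded 2023 values quoted in the text, so the outputs agree with the text only up to rounding.

```python
def ml_return_efficiency(tpr, tnr, i_a, i_b):
    """ML approximation of a strategy's Return Efficiency Index (equation (28))."""
    return tpr * i_a + tnr * i_b


def equivalent_p(tpr, tnr, i_a, i_b):
    """Probability p of the random-flip p-strategy with the same index."""
    if i_a == i_b:
        raise ValueError("I_A and I_B must differ")
    return (tpr * i_a - (1.0 - tnr) * i_b) / (i_a - i_b)


# Rounded 2023 inputs quoted in the text
gv = dict(tpr=0.44, tnr=0.58, i_a=0.44, i_b=0.37)  # Growth-Value
mc = dict(tpr=0.45, tnr=0.58, i_a=0.38, i_b=0.25)  # Market-Cash

for name, args in (("Growth-Value", gv), ("Market-Cash", mc)):
    index = ml_return_efficiency(**args)
    p = equivalent_p(**args)
    print(f"{name}: index = {index:.2f}, equivalent p = {p:.2f}")
```

A value of the equivalent probability above 1/2 indicates that the strategy captures more of the available return range than a fair random flip between the two benchmarks.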

Choosing the number of nearest neighbors and transaction costs

In Nearest Neighbor algorithms, one of the important hyperparameters (besides the distance metric) is $k$ , the number of neighbors to use. Since the decision on predicting labels is made by the majority (minority) of true labels, the number $k$ must be odd and, in the simplest case, $k = 1$ . The value of $k$ is chosen by comparing the results for different $k$ .
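To make the role of $k$ concrete, the following minimal sketch implements majority voting among the $k$ nearest neighbors. The Euclidean distance, the two-dimensional feature vectors, and the toy data are assumptions made purely for illustration; they are not the features or the distance used by the strategies in this paper.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Predict a label by majority vote among the k nearest training points."""
    if k % 2 == 0:
        raise ValueError("k must be odd to avoid ties between two labels")
    # Rank training points by Euclidean distance to the query (assumed metric)
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two "-" points near the origin, three "+" points near (1, 1)
train_x = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
train_y = ["-", "-", "+", "+", "+"]
query = (0.2, 0.1)

for k in (1, 3, 5):
    print(k, knn_predict(train_x, train_y, query, k))
```

In this toy example the prediction for small $k$ follows the closest points, while for $k$ equal to the size of the dataset it is simply the most frequent label, regardless of the query.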

We start by comparing the growth for different $k$ . The results are presented in Figures 7, 8 and 9. We can see that as we increase $k$ , the performance of the Growth-Value strategy degrades rapidly. Intuitively, we can explain this as follows: in daily movements, the market overreacts to positive and negative news. This is corrected in the very near term (the next day). However, this effect is very short-lived. For very large $k$ , we will end up predicting labels depending on the proportion of true labels in the past $k$ trading days for both strategies. To illustrate this further, consider the annual returns for 2023 for different $k$ :

Growth-Value: annual returns $R_{GV}$ tend to decrease with $k$ : $R_{GV} = 22.2\%$ for $k = 1$ , $R_{GV} = 18.0\%$ for $k = 3$ , $R_{GV} = 20.5\%$ for $k = 5$ , $R_{GV} = 17.7\%$ for $k = 7$ , and $R_{GV} = 18.2\%$ for $k = 9$ .

Market-Cash: annual returns $R_{MC}$ also show a decreasing trend: $R_{MC} = 16.5\%$ for $k = 1$ , $R_{MC} = 11.5\%$ for $k = 3$ , $R_{MC} = 10.4\%$ for $k = 5$ , $R_{MC} = 15.0\%$ for $k = 7$ , and $R_{MC} = 12.5\%$ for $k = 9$ .

Figure 7.

Return Efficiency Index For Different $k$ .

Figure 8.

Annual Return Efficiency Index for Trading Algorithms.

Figure 9.

Growth, Maximum Drawdown, Volatility And Number Of Transactions For Different $k$ .

In 2023, the Growth-Value strategy’s returns were higher for lower $k$ because it capitalized on short-term market overreactions. As $k$ increased, returns decreased due to reduced responsiveness. For the Market-Cash strategy, returns also declined with higher $k$ , showing the market’s lack of memory and short-lived price movements. Over the whole range, both strategies showed diminishing returns with increasing $k$ .

Next, we compare maximum drawdowns (MDD) and volatility. This is presented in Figure 9. As we can see from this graph, the MDD for both strategies remains stable, with the MDD for the Growth-Value strategy about 50% higher than for the Market-Cash strategy. If we examine annualized volatility, we note that its value does not change with $k$ . On average, the annualized volatility of Growth-Value is about 18%, whereas the volatility of Market-Cash is about 13%. The volatility of the Growth-Value strategy is about 50% higher than that of Market-Cash and is similar to the volatility of the S&P-500 index. Since the returns decrease for larger $k$ whereas volatility does not change, the risk-adjusted returns as measured by the Sharpe ratio will decrease for larger $k$ as well.

We analyzed the differences between the return efficiency index of the strategies and that of a random flip for different $k$ values. For GV, these differences were 0.047 for $k = 1$ , 0.038 for $k = 3$ , 0.022 for $k = 5$ , 0.016 for $k = 7$ , and 0.017 for $k = 9$ . For MC, the differences were 0.013 for $k = 1$ , 0.006 for $k = 3$ , 0.008 for $k = 5$ , 0.008 for $k = 7$ , and 0.003 for $k = 9$ . As $k$ increases, the differences for both strategies tend to decrease towards 0. This means that for larger $k$ , the return efficiency of the strategies approaches that of a random flip strategy. However, this convergence is not towards a $p = 1 / 2$ random flip strategy, but most likely towards a random flip with probability $p$ equal to the prevalence $π^{+} = T^{+} / T$ . For larger $k$ , the probability that the majority label is “+” would correspond to the proportion of “+” labels in the data. A similar situation is often encountered in machine learning, where increasing the number of neighbors does not necessarily increase the accuracy, and for large $k$ the accuracy simply converges to the proportion of labels of one class in the dataset.

Finally, we compare the number of transactions, and hence the transaction costs, for different $k$ in Figure 9. The number of transactions is comparable for the Growth-Value and Market-Cash strategies and drops rapidly as we increase $k$ . For $k = 1$ , the number of transactions is about 120, indicating that, on average, these strategies would require trading every two days. For larger $k$ , the number of trades drops to about 30, indicating trading every 5-7 days.
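As a rough illustration of how a transaction count relates to the trading frequency, the sketch below counts a transaction whenever the predicted label changes from one day to the next. This counting rule, the helper names, the example label sequence, and the figure of 252 trading days per year are all assumptions made for illustration.

```python
def count_transactions(labels):
    """Count position switches: days whose label differs from the previous day's."""
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


def average_days_between_trades(n_days, n_transactions):
    """Average number of trading days between consecutive transactions."""
    return n_days / n_transactions if n_transactions else float("inf")


# Hypothetical sequence of predicted labels for eight consecutive days
predicted = list("++-+--++")
print(count_transactions(predicted))

# About 120 transactions in an assumed 252-day year
print(round(average_days_between_trades(252, 120), 1))
```

Under these assumptions, about 120 transactions in a year correspond to a trade roughly every two trading days, in line with the figure quoted above for $k = 1$ .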

Additional examples on using return efficiency index

In this paper, we presented a detailed analysis of a simple trading strategy based on $k$ -nearest neighbor ( $k$ -NN) classification, which is popular in machine learning (Bishop, 2016). One of our main results is the concept of the “Return Efficiency Index”, which can be used to connect machine learning accuracy and returns of strategies and to provide a universal metric to compare these algorithms.

Therefore, we can apply a similar analysis to describe the performance of trading algorithms based on other machine learning algorithms (Joiner et al., 2022). For example, consider two popular deep learning architectures (Ahmed et al., 2022; Hu et al., 2021) for label prediction: LSTM (Long-Short Term Memory) and CNN (Convolutional Neural Networks). In both architectures, we predict the next trading label based on input patterns of 10-day trading periods.

In Figure 8, we compare these deep-learning-based models with $k$ -NN based models using the Return Efficiency Index. The index captures not only accuracy but also robustness in financial prediction tasks. It is a key metric for comparing the reliability of models.

A summary comparison is presented in Table 9.

We observe the following:

3-Month vs. 6-month Performance: For 6-month forecasts, CNN and LSTM have nearly identical performance, with average Return Efficiency Index values both rounding to 0.34. For 3-month forecasts, CNN performs slightly better than LSTM (CNN: 0.34, LSTM: 0.33). This shows that both models handle short- and medium-term trends similarly well.

GV-1L: The GV-1L strategy shows the highest average Return Efficiency Index at 0.40, particularly excelling in earlier years. However, its value declines over time and falls below that of both LSTM and CNN in later years.

CNN Architecture: The CNN model includes convolutional layers with max-pooling and dense layers.

The Return Efficiency Index is a valuable tool for evaluating models. It links the forecasting accuracy of label prediction (Machine Learning) to trading algorithm financial performance. As such, it could be an additional useful metric for comparing trading algorithms.

Concluding remarks

In this paper, we presented an approach to analyze and explain the performance of algorithmic trading strategies by using an analogy to classification in machine learning. We derived explicit expressions that qualitatively relate strategy statistics to machine learning metrics derived from the underlying confusion matrices. We introduced a new performance metric, the Return Efficiency Index. This new index provides a link between the return performance of trading strategies and the accuracy of the strategy as a machine learning classifier. This new metric gives a universal scale to compare any trading strategies in terms of their ability to capture the possible returns and outperform a random flip strategy. We applied our approach to trading strategies designed by analogies to machine learning. Future work will focus on applying other methods and ideas from machine learning to algorithmic trading.

Footnotes

Acknowledgements

We want to thank Metropolitan College, Boston University, for its support.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Author contributions

All authors contributed equally to the effort.

Funding

This research was conducted without any external funding. All aspects of the study, including design, data collection, analysis, and interpretation, were carried out using the resources available within the authors’ institution.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

All the relevant data and analysis are available via:

Appendix A: Details on ML-volatility and ML-sharpe ratio computations

We will use $S$ to denote volatility with appropriate subscripts to indicate specific trading strategies or benchmarks, and $S^{*}$ to represent ML-volatility computed from the confusion matrix. If $σ$ denotes the standard deviation of daily returns, then the volatility over $t$ days is given by $S = σ \sqrt{t}$ . Recall that the standard deviation of simple returns is approximately equal to the standard deviation of logarithmic returns as shown in equation (10). Let $π^{+} = T^{+} / T$ denote the prevalence. As before, we assume that daily returns are independent.

Let $σ_{A}$ and $σ_{B}$ denote the standard deviations of daily returns for $A$ and $B$ . Then for the volatilities of the buy-and-hold strategies $A$ and $B$ we have

S_{A} = \sqrt{T} σ_{A} \quad \text{and} \quad S_{B} = \sqrt{T} σ_{B}

For the ideal strategy, we have for ML-volatility

S_{max}^{*} = \sqrt{T^{+} σ_{A}^{2} + T^{-} σ_{B}^{2}} = \sqrt{π^{+} S_{A}^{2} + (1 - π^{+}) S_{B}^{2}}

Similarly, for the worst strategy we have

S_{min}^{*} = \sqrt{T^{+} σ_{B}^{2} + T^{-} σ_{A}^{2}} = \sqrt{π^{+} S_{B}^{2} + (1 - π^{+}) S_{A}^{2}}

For a generic strategy, we have

\begin{aligned} S_{str}^{*} & = \sqrt{(TP + FP) σ_{A}^{2} + (TN + FN) σ_{B}^{2}} \\ & = \sqrt{(\frac{P^{+}}{T}) S_{A}^{2} + (\frac{P^{-}}{T}) S_{B}^{2}} \end{aligned}

Since

P^{+} = (\frac{TPR}{PPV}) π^{+} T \quad \text{and} \quad P^{-} = (\frac{TNR}{NPV}) (1 - π^{+}) T

we can rewrite the expression for ML-volatility for a generic strategy as

S_{str}^{*} = \sqrt{(\frac{TPR}{PPV}) π^{+} S_{A}^{2} + (\frac{TNR}{NPV}) (1 - π^{+}) S_{B}^{2}}
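The two equivalent forms of the ML-volatility of a generic strategy can be checked numerically. The sketch below is illustrative only: the function names and the confusion-matrix counts are hypothetical, and $S_{A}$ , $S_{B}$ are assumed to be buy-and-hold volatilities over the same $T$ days.

```python
import math


def ml_volatility(tp, fp, tn, fn, s_a, s_b):
    """ML-volatility of a generic strategy from confusion-matrix counts."""
    total = tp + fp + tn + fn
    p_plus = tp + fp   # days predicted "+": the strategy holds A
    p_minus = tn + fn  # days predicted "-": the strategy holds B
    return math.sqrt((p_plus / total) * s_a**2 + (p_minus / total) * s_b**2)


def ml_volatility_from_rates(tpr, ppv, tnr, npv, prevalence, s_a, s_b):
    """The same quantity expressed through TPR, PPV, TNR, NPV and prevalence."""
    weight_a = (tpr / ppv) * prevalence
    weight_b = (tnr / npv) * (1.0 - prevalence)
    return math.sqrt(weight_a * s_a**2 + weight_b * s_b**2)


# Hypothetical confusion matrix over T = 250 days, volatilities in percent
tp, fp, tn, fn = 60, 40, 70, 80
s_a, s_b = 20.0, 15.0

t_plus, t_minus = tp + fn, tn + fp
from_counts = ml_volatility(tp, fp, tn, fn, s_a, s_b)
from_rates = ml_volatility_from_rates(
    tpr=tp / t_plus, ppv=tp / (tp + fp),
    tnr=tn / t_minus, npv=tn / (tn + fn),
    prevalence=t_plus / (t_plus + t_minus), s_a=s_a, s_b=s_b,
)
print(from_counts, from_rates)
```

Setting `fp = fn = 0` in `ml_volatility` recovers the expression for the ideal strategy, since then the predicted and true label counts coincide.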

Ignoring the risk-free rate, we can write the expressions relating the Sharpe Ratios with confusion matrix entries. We will use the notation $SR$ to denote the Sharpe ratio.

For the Buy-and-Hold strategies $A$ and $B$ we have

\begin{aligned} {SR}_{A}^{*} & = \frac{R_{A}^{*}}{S_{A}} = \frac{T^{+} r_{A}^{+} + T^{-} r_{A}^{-}}{S_{A}} \\ {SR}_{B}^{*} & = \frac{R_{B}^{*}}{S_{B}} = \frac{T^{+} r_{B}^{+} + T^{-} r_{B}^{-}}{S_{B}} \end{aligned}

The ML-Sharpe’s ratio for the ideal strategy

{SR}_{max}^{*} = \frac{R_{max}^{*}}{S_{max}^{*}} = \frac{T^{+} r_{A}^{+} + T^{-} r_{B}^{-}}{\sqrt{π^{+} S_{A}^{2} + (1 - π^{+}) S_{B}^{2}}}

The ML-Sharpe’s ratio for the worst strategy

{SR}_{min}^{*} = \frac{R_{min}^{*}}{S_{min}^{*}} = \frac{T^{+} r_{B}^{+} + T^{-} r_{A}^{-}}{\sqrt{π^{+} S_{B}^{2} + (1 - π^{+}) S_{A}^{2}}}

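Note that although $R_{max}^{*} > R_{min}^{*}$ , it is possible that $SR_{max}^{*} < SR_{min}^{*}$ , because the ideal and the worst strategies generally have different ML-volatilities. Consider first the balanced case, in which the numbers of true positive and true negative labels are the same, so that $T^{+} = T^{-}$ and $π^{+} = 0.5$ . In this case $S_{max}^{*} = S_{min}^{*}$ , and from the above two equations we obtain

SR_{max}^{*} = (\frac{r_{A}^{+} + r_{B}^{-}}{r_{A}^{-} + r_{B}^{+}}) \cdot SR_{min}^{*}

Under our assumption $r_{A}^{-} < r_{B}^{-} < r_{B}^{+} < r_{A}^{+}$ we have $(r_{A}^{+} + r_{B}^{-}) > (r_{A}^{-} + r_{B}^{+})$ , so in the balanced case the two volatilities coincide and $SR_{max}^{*} > SR_{min}^{*}$ . The reversal $SR_{max}^{*} < SR_{min}^{*}$ can therefore occur only when $π^{+} \neq 0.5$ and $S_{A} \neq S_{B}$ . For example, if $π^{+}$ is large and $S_{A}$ is much larger than $S_{B}$ , then $S_{max}^{*}$ is close to $S_{A}$ and $S_{min}^{*}$ is close to $S_{B}$ , and the higher volatility of the ideal strategy can outweigh its higher return.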

Finally, the ML-Sharpe’s ratio for a generic strategy is

{SR}_{str}^{*} = \frac{R_{str}^{*}}{S_{str}^{*}} = \frac{TP \cdot r_{A}^{+} + FN \cdot r_{B}^{+} + TN \cdot r_{B}^{-} + FP \cdot r_{A}^{-}}{\sqrt{(P^{+} / T) S_{A}^{2} + (P^{-} / T) S_{B}^{2}}}
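The expression for the ML-Sharpe ratio of a generic strategy translates directly into code. The sketch below uses hypothetical confusion-matrix counts, average daily returns (in percent), and benchmark volatilities (in percent); the function name is illustrative, and the risk-free rate is ignored as in the text.

```python
import math


def ml_sharpe(tp, fn, tn, fp, r_a_pos, r_b_pos, r_b_neg, r_a_neg, s_a, s_b):
    """ML-Sharpe ratio of a generic strategy from confusion-matrix counts.

    r_a_pos, r_a_neg: average daily returns of A on "+" and "-" days
    r_b_pos, r_b_neg: average daily returns of B on "+" and "-" days
    s_a, s_b: buy-and-hold volatilities of A and B over the same T days
    """
    total = tp + fn + tn + fp
    # ML-return: A is held on predicted "+" days (TP, FP), B on predicted "-" days (TN, FN)
    ml_return = tp * r_a_pos + fn * r_b_pos + tn * r_b_neg + fp * r_a_neg
    ml_vol = math.sqrt(((tp + fp) / total) * s_a**2 + ((tn + fn) / total) * s_b**2)
    return ml_return / ml_vol


# Hypothetical inputs over T = 250 days
sharpe = ml_sharpe(
    tp=70, fn=55, tn=65, fp=60,
    r_a_pos=0.8, r_b_pos=0.5, r_b_neg=-0.4, r_a_neg=-0.7,
    s_a=20.0, s_b=14.0,
)
print(round(sharpe, 2))
```

Calling the same function with `fn = fp = 0` gives the ML-Sharpe ratio of the ideal strategy, and with `tp = tn = 0` that of the worst strategy.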

Appendix B: Detailed tables and figures

This Appendix contains tables with detailed annual statistics for the strategies.

References

Agrawal, Khan, Kumar (2019) Stock price prediction using technical indicators: A predictive model using optimal deep learning. International Journal of Recent Technology and Engineering. https://api.semanticscholar.org/CorpusID:219325700 .

Ahmed, Hassan, Mstafa (2022) A review on deep sequential models for forecasting time series data. Applied Computational Intelligence and Soft Computing.

Ashitha, Sakshi, Vishal, et al. (2023) Prediction and sentiment analysis of stock using machine learning. International Journal for Research in Applied Science and Engineering Technology. DOI: https://doi.org/10.22214/ijraset.2023.53169.

Ayala, García-Torres, Noguera, et al. (2021) Technical analysis strategy optimization using a machine learning approach in stock market indices. Knowledge-based Systems 225: 107119. DOI: https://doi.org/10.1016/J.KNOSYS.2021.107119.

Beg, Awan, Ali (2019) Algorithmic machine learning for prediction of stock prices. DOI: 10.4018/978-1-5225-7805-5.CH007.

Bishop (2016) Pattern Recognition and Machine Learning. New York: Springer.

Bitvai, Cohn (2015) Day trading profit maximization with multi-task learning and technical analysis. Machine Learning 101: 187–209. DOI: https://doi.org/10.1007/s10994-014-5480-x.

Buachuen, Kantavat (2023) Automated stock trading system using technical analysis and deep learning models. In: Proceedings of the 13th international conference on advances in information technology, pp.1–9. DOI: 10.1145/3628454.3631670.

Chavarnakul, Enke (2009) A hybrid stock trading system for intelligent technical analysis-based equivolume charting. Neurocomputing 72: 3517–3528. DOI: https://doi.org/10.1016/j.neucom.2008.11.030.

Chen (2020) Using machine learning algorithms on prediction of stock price. Journal of Modeling and Optimization 12(2): 84–99. DOI: https://doi.org/10.32732/jmo.2020.12.2.84.

Chen, Xie (2023) Two-stage attentional temporal convolution and lstm model for financial data forecasting. In: International conference on electronic information engineering and data processing (EIEDP 2023), volume 12700, pp.122–130. SPIE.

Choudhry, Garg (2008) A hybrid machine learning system for stock market forecasting. World Academy of Science, Engineering and Technology, International Journal of Computer, Electrical, Automation, Control and Information Engineering 2: 689–692.

Dash (2016) A hybrid stock trading framework integrating technical analysis with machine learning techniques. The Journal of Finance and Data Science 2: 42–57. DOI: https://doi.org/10.1016/J.JFDS.2016.03.002.

Gerlein, McGinnity, Belatreche, et al. (2016) Evaluating machine learning classification for financial trading: An empirical approach. Expert Systems With Applications 54: 193–207. DOI: https://doi.org/10.1016/j.eswa.2016.01.018.

Grigoryan (2017) Stock market trend prediction using support vector machines and variable selection methods. In: Proceedings of the 2017 international conference on applied mathematics, modelling and statistics application (AMMSA 2017), pp.210–213. DOI: 10.2991/ammsa-17.2017.45.

Kelly, Xiu (2019) Empirical asset pricing via machine learning. In: Chicago Booth Research Paper No. 18-04, 31st Australasian Finance and Banking Conference 2018, Yale ICF Working Paper No. 2018-09. DOI: 10.2139/ssrn.3159577. https://ssrn.com/abstract=3159577.

Hastle (2018) Elements of Statistical Learning. New York: Springer.

Hsu, Lessmann, Sung, et al. (2016) Bridging the divide in financial market forecasting: Machine learners vs. financial economists. Expert Systems With Applications 61: 215–234. DOI: https://doi.org/10.1016/j.eswa.2016.05.033.

Hu, Zhao, Khushi (2021) A survey of forex and stock price prediction using deep learning. Applied System Innovation 4(1): 9.

Hudson, Gregoriou (2015) Calculating and comparing security returns is harder than you think: A comparison between logarithmic and simple returns. International Review of Financial Analysis 38: 151–162. DOI: https://doi.org/10.1016/j.irfa.2014.10.008.

Jagadisha, Raghuram, Praveen, et al. (2022) Stock price movement prediction using machine learning. International Journal of Advanced Research in Science, Communication and Technology. DOI: https://doi.org/10.48175/ijarsct-7774.

Johnson, Kotz (1970) Distributions in Statistics. New York: Wiley.

Joiner, Vezeau, Wong, et al. (2022) Algorithmic trading and short-term forecast for financial time series with machine learning models; state of the art and perspectives. In: 2022 IEEE international conference on recent advances in systems science and engineering (RASSE), pp.1–9. DOI: 10.1109/RASSE54974.2022.9989592.

Karthik (2023) Applications of machine learning in predictive analysis and risk management in trading. International Journal of Innovative Research in Computer Science and Technology 11(6): 18–25. DOI: https://doi.org/10.55524/ijircst.2023.11.6.4.

Khan, Shah, Shahid, et al. (2023) A performance comparison of machine learning models for stock market prediction with novel investment strategy. PloS One 18(9): e0286362.

Kim, Won (2018) An ensemble model integrating machine learning algorithms with technical indicators for stock price prediction. Expert Systems with Applications 107: 123–130. DOI: https://doi.org/10.1016/j.eswa.2018.04.021.

Lumoring, Chandra, Agung, et al. (2023) A systematic literature review: Forecasting stock price using machine learning approach. In: 2023 International conference on data science and its applications (ICoDSA), pp.129–133. DOI: 10.1109/ICoDSA58501.2023.10277318.

Mahfooz, Iftikhar, Khan (2022) Improving stock trend prediction using lstm neural network trained on a complex trading strategy. International Journal for Research in Applied Science and Engineering Technology 10(7): 4361–4371. DOI: https://doi.org/10.22214/ijraset.2022.45961.

Meesad, Boonmatham (2023) A combination of machine learning-based natural language processing with technical analysis for stock trading. Indonesian Journal of Electrical Engineering and Computer Science 30: 422–434. DOI: https://doi.org/10.11591/ijeecs.v30.i1.pp422-434.

Mndawe, Paul, Doorsamy (2022) Development of a stock price prediction framework for intelligent media and technical analysis. Applied Sciences 12(2): 719. DOI: https://doi.org/10.3390/app12020719.

Ndikum (2020) Machine learning algorithms for financial asset price forecasting. ArXiv abs/2004.01504.

Nousi, Tsantekidis, Passalis, et al. (2018) Machine learning for forecasting mid-price movements using limit order book data. IEEE Access 7: 64722–64736. DOI: https://doi.org/10.1109/ACCESS.2019.2916793.

Ntakaris, Kanniainen, Gabbouj, et al. (2018) Mid-price prediction based on machine learning methods with technical and quantitative indicators. PloS one 15. DOI: https://doi.org/10.2139/ssrn.3213389.

Oyewola, Dada, Olaoluwa, et al. (2019) Predicting nigerian stock returns using technical analysis and machine learning. European Journal of Electrical Engineering and Computer Science 3. DOI: https://doi.org/10.24018/EJECE.2019.3.2.65.

Padhi, Padhy, Bhoi, et al. (2022) An intelligent fusion model with portfolio selection and machine learning for stock market prediction. Computational Intelligence and Neuroscience 2022. DOI: https://doi.org/10.1155/2022/7588303.

Park, Shin (2019) Stock price prediction using deep learning models and technical analysis indicators. IEEE Access 7: 115463. DOI: https://doi.org/10.1109/ACCESS.2019.2933336.

Pasupulety, Anmol, Mohan (2019) Predicting stock prices using ensemble learning and sentiment analysis. In: 2019 IEEE second international conference on artificial intelligence and knowledge engineering (AIKE), pp.215–222. DOI: 10.1109/AIKE.2019.00045.

Patel, Shah, Thakkar, et al. (2015) Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Systems With Applications 42: 259–268. DOI: https://doi.org/10.1016/j.eswa.2014.07.040.

Pholsri (2023) Combining technical analysis and deep learning models for stock market trading. DOI: 10.58837/chula.the.2022.103.

Pradip, Bari, Nandhini (2018) Stock market prediction using machine learning. Journal of Computational and Theoretical Nanoscience. DOI: https://doi.org/10.1166/jctn.2020.8405.

Satchell, Knight (2001) Return Distributions in Finance (Quantitative Finance). New York: Butterworth-Heinemann.

Tran Van, Nguyen Bao, Pham Minh (2023) Integrated hybrid approaches for stock market prediction with deep learning, technical analysis, and reinforcement learning. In: Proceedings of the 12th international symposium on information and communication technology, pp.213–220. DOI: 10.1145/3628797.3629018.

Tsantekidis, Passalis, Toufa, et al. (2020) Price trailing for financial trading using deep reinforcement learning. IEEE Transactions on Neural Networks and Learning Systems 32: 2837–2846. DOI: https://doi.org/10.1109/tnnls.2020.2997523.

Wang, Sun, Liu, et al. (2018) Financial markets prediction with deep learning. In: 2018 17th IEEE international conference on machine learning and applications (ICMLA), pp.97–104. DOI: 10.1109/ICMLA.2018.00022.

Wong, Figini, Raheem, et al. (2023) Forecasting of stock prices using machine learning models. In: 2023 IEEE international systems conference (SysCon), pp.1–7. DOI: 10.1109/SysCon53073.2023.10131091.

Yao, Chang, et al. (2022) Stock price analysis and forecasting based on machine learning. In: Third international conference on computer science and communication technology (ICCSCT 2022), volume 12506, pp.1503–1510. SPIE.

Yang, Yoon (2023) The design of an intelligent lightweight stock trading system using deep learning models: Employing technical analysis methods. MDPI Systems 11(9). DOI: https://doi.org/10.3390/systems11090470.

Zarkias, Passalis, Tsantekidis, et al. (2019) Deep reinforcement learning for financial trading using price trailing. In: ICASSP 2019 - 2019 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp.3067–3071. DOI: 10.1109/ICASSP.2019.8683161.

Zhang, Yuan, et al. (2021) A hybrid deep learning model for stock price prediction using technical analysis indicators. Journal of Finance and Data Science 7: 67–78. DOI: https://doi.org/10.1016/j.jfds.2021.01.003.

Zhong, Enke (2019) Predicting the daily return direction of the stock market using hybrid machine learning algorithms. Financial Innovation 5: 1–20. DOI: https://doi.org/10.1186/s40854-019-0138-0.
