Abstract
This paper deals with the risk associated with the mis-estimation of mean-reversion of residuals in statistical arbitrage. The main idea in statistical arbitrage is to exploit short-term deviations in returns from a long-term equilibrium across several assets. This kind of strategy heavily relies on the assumption of mean-reversion of idiosyncratic returns, reverting to a long-term mean after some time. But little is known regarding the assessment of this kind of risk. In this paper, we propose a simple scheme that controls the risk associated with estimating mean-reversions by using portfolio selections and screenings. Realizing that each residual has a different mean-reversion time, the ones that are fast mean-reverting are selected to form portfolios. Further control is imposed by allowing the trading activity only when the goodness-of-fit of the estimation for trading signals is sufficiently high. We design a dynamic asset allocation strategy with market and dollar neutrality, formulated as a constrained optimization problem, which is implemented numerically. The improved reliability and robustness of this strategy are demonstrated through back-testing with real data. It is observed that its performance is robust to a variety of market conditions. We further provide some answers to the puzzle of choosing the number of factors to use, the length of estimation windows, and the role of transaction costs, which are crucial issues with direct impact on the strategy.
Keywords
Introduction
With the rapid evolution of statistical methods for analyzing large data sets and the establishment of automated markets, statistical arbitrage has become a common investment strategy with both hedge funds and investment banks. Although there is no consensus on what statistical arbitrage is, its main idea is a trading or investment strategy that exploits short-term deviations from a long-term equilibrium across the assets. Pairs trading is one of the earliest forms of statistical arbitrage, and it is widely used. This strategy consists of buying a certain asset which is below some equilibrium price and selling another correlated asset which is above it, expecting that the two will tend to equilibrium so that the trade generates profits. In other words, it is a bet on relative positions, rather than absolute positions, in a way that the resulting portfolio is unaffected by the equilibrium itself, and is relatively insensitive to market behavior. This kind of strategy heavily relies on the belief of mean-reversion of idiosyncratic returns or residuals, reverting to a long-term mean within some time that can be quantified. Although residuals of highly correlated assets generally converge to each other after diverging, there is no rule that asserts this has to happen. Thus, the essential risk of statistical arbitrage lies in the reliable quantification of the mean reversion characteristics of residuals.
There are very few studies in the literature that address the issue of the mean-reversion of residuals from the statistical arbitrage point of view. Thus, the objective of this paper is to address the question:
Can we control the risk from the mean-reversion behavior of residuals in statistical arbitrage?
The first and main contribution of this paper is that we consider statistical methods for the risk-control of mean-reversion. Motivated by the observation that each residual has a different rate of mean-reversion, we carry out a statistical training to assess the quality of each residual relative to its rate of mean-reversion, from which only fast mean-reverting residuals are selected to form portfolios. Regular updates of the portfolio and further screening using the goodness-of-fit in the estimation of trading signals are also imposed to boost the reliability of the strategy.
Another contribution of our work is the transformation of the asset allocation problem in statistical arbitrage, with market- and dollar-neutrality conditions, to a constrained optimization problem that can be solved numerically. In this strategy, the threshold-based rule defines the sign (long or short) of the position of each asset and the neutrality conditions determine the investment amount.
Lastly, we demonstrate the performance of our strategy with real market data. The out-of-sample results show that, compared to other strategies, the proposed risk-control method works very well, giving persistently higher Sharpe ratios in all the different market regimes encountered in the last several years, before and after the 2008 crisis.
Basic aspects of statistical arbitrage
Implementing a statistical arbitrage strategy involves several steps. We briefly run through the necessary ones in the context of generalized pairs trading, where each pair consists of an individual asset and its corresponding common factors.
First of all, we describe how to construct residual returns. Consider a factor model
Next, the factor loadings (L) are estimated by multivariate regression of R on the subspace spanned by F. Finally, the residual (U) is the remaining part, after subtracting the estimated common factors from the original returns:
The next step is modeling the residual processes, and we adopt the Ornstein–Uhlenbeck (OU) process:
The last step is trading. For each asset, a trading signal is generated as a normalized level of each residual using the estimated OU parameters. Then we open or close a position whenever its signal becomes active or inactive, respectively.
Risk control for reliable residuals
Although the strategy described above is well designed and widely used, there are several issues that must be considered. First and most critically, there is no guarantee that a residual reverts to some mean. Clearly, a slow mean reversion is unfavorable in statistical arbitrage, since it increases uncertainties in spread convergence, and hence decreases the chance of getting profits. Second, the parameter estimation for generating trading signals may not always be satisfactory. Signals with large estimation errors cannot be trusted. Third, the estimated mean-reversion times or the correlation structure of residuals are not constant over rolling time windows. They can and do change with time.
Consequently, to improve the reliability of the statistical arbitrage strategy, we propose the following controls.
Strategic reliability – Residuals that have long estimated mean-reversion time should be excluded from the strategy. We only consider residuals with relatively fast mean-reversion for portfolio selection.
Statistical reliability – Any trading signal with large estimation error is screened out.
Temporal reliability – Portfolio choices based on the estimated mean-reversion time must be updated regularly.
For the first control, we introduce a reliability score for each residual by the average of rates of mean-reversion (
As discussed, a trading signal determines whether to buy or sell the asset. However, how much one needs to allocate to each asset is still a non-trivial problem, since there are several constraints. In particular, market neutrality is crucial in our strategy, since it allows the portfolio to be relatively uncorrelated from market movements. Dollar neutrality and leverage constraints are also necessary for a fair assessment of strategies. In short, for each time t, the required conditions can be stated as follows,
These conditions can be consolidated to formulate an optimization problem. We do not write the full expression here, but it turns out to have the form:
Back-tests
The performance of our method is evaluated by analyzing the cumulative profit and loss (PnL) with real data. As for market data, we use 3780 records (Jan 2, 2000–Dec 31, 2014) of daily prices for 378 stocks of the S&P500. We use portfolio sizes of 25, 50, 75, and 100 stocks, out of a total of 378 stocks. The performance is evaluated within different market regimes. We consider five regimes, each spanning a two-year period (504 business days): pre-crisis (2005–2006), in-crisis (2007–2008), post-crisis (2009–2010), afterward (2011–2012), and recent years (2013–2014). The results are compared when using different portfolios as well as with other parameter choices. As comparison groups, we first consider a portfolio where stocks are randomly chosen. In addition, portfolios where stocks are selected by capitalization, one with the highest and the other with the lowest, are also investigated.

The effect of mean-reversion controlled portfolio on Sharpe ratios. The size of portfolio is 75, the transaction cost is 5 bp per volume, and
Fig. 1 provides a snapshot of the back-testing results. We see that our portfolio selection using faster mean-reversions performs better than any of the comparison portfolios: the Sharpe ratios of the controlled portfolios are much higher. By calculating the length of days of active positions as a proxy of realized mean-reversion time, we check that our training for the mean-reversion speeds is effective in the out-of-sample tests. Furthermore, the selective trading via the
We also see that the investment adjustment derived from the proposed optimization problem plays an important role in enhancing the robustness of the strategy. Regardless of regimes, the Sharpe ratios are more stable than those from uniform asset allocation, which shows the importance of the optimization problem regarding market-neutrality. Lastly, we also investigate the effect of parameters such as number of factors, window lengths, and transaction costs. From the results we find that the performance with shorter windows is more sensitive to the impact of higher transaction costs, as might be expected. Furthermore, we see that in volatile markets only a small number of factors is sufficient for our strategy to work well. The issues of portfolio size and the decreasing performance in recent years are discussed as well.
Pairs trading has been studied extensively from various perspectives. A seminal work is [12], where the authors present a very simple relative value trading rule with U.S. data from 1963 to 2002. They use the Euclidean squared distance of price time series to identify pairs, and show that a simple rule to open/close a trade generates statistically and economically significant excess returns. This work has been extended by [7] and [8]. They refine the formation of portfolios by allowing securities within the 48 Fama-French industries and by measuring the number of zero crossings. Nevertheless, there are several limitations, including the fact that none of these studies take into account trading costs.
There are not many studies devoted to mean reversion in residuals in the context of statistical arbitrage. In [10] a Gaussian linear state-space process is suggested for modeling mean-reverting spreads arising in pairs trading. Their point of view is that the observed process must be seen as a noisy realization of an underlying hidden process, so the comparison between the two can lead to a more profitable strategy. They suggest using the EM (expectation-maximization) algorithm for estimating model parameters. Although their model is fully tractable, its applicability in practice is debatable, because the model restricts the long-run relationship between the two stocks to one of return parity. In [21] a framework for forecasting is developed using a co-integration method. Stocks with similar common trends are preselected and are assumed to be non-stationary within the framework of a common trend model as in [20] and in the Arbitrage Pricing Theory of [19]. In Vidyamurthy a method is also developed for finding the optimal triggering level for each pair, by counting the number of times that each trigger level is exceeded. However, Vidyamurthy’s arguments are informal, depend on heuristics, and lack explanations for decisions that affect other aspects of the strategy, which makes the approach of questionable value for practitioners.
The Ornstein–Uhlenbeck process has played a significant role in modeling residuals. Closed form expressions are obtained in [4] for the mean and variance of the trade duration and the strategy return, using OU modeling. The effect of transaction costs is also considered. In [14] stochastic control theory with OU-modeling is used, and an optimal dynamic strategy for mean-reverting arbitrage opportunities is presented. But transaction costs are still missing in the paper, and so their daily rebalancing is not expected to outperform the threshold-based trading rule.
Recently, [2] carried out an extensive study of statistical arbitrage in the U.S. equities market. The authors use principal component analysis and sector ETFs to find factors, and analyze the performance of statistical arbitrage strategies using residual modeling. Our study is motivated by their work, while our main focus is on the risks associated with mean-reversion times, portfolio selections, and the reliability of estimations. We also enforce market- and dollar-neutrality by solving a constrained optimization problem, which is also different from what they do. Our approach is especially effective when there is a restriction on the number of stocks in a portfolio. We also study in detail the effect of various parameter choices, such as the number of factors, estimation windows, capitalizations, regimes, or transaction costs.
Outline
The remainder of the paper is organized as follows. In Section 2, we first recall the factor model framework and explain how the residuals are generated. Then the estimation of mean-reversion times is discussed. In addition, we introduce a risk control method to search for reliable residuals. Section 3 explains the trading rule and the required balance conditions for the strategy. We propose an optimization problem for the asset allocation. The numerical results and back-tests are presented in Section 4. We compare results with different parameter choices and with various other portfolios. We first investigate the effects of our control methods and the proposed optimization problem for asset allocations. We further explore other issues, such as capitalization, market regimes, number of factors, training window lengths, transaction costs, and portfolio size. An issue with the market in recent years and the results from a time-varying number of factors are also addressed. Finally, we conclude in Section 5.
Risk-control of mean-reversion times of residual processes
In this section we explain how residuals are constructed, calibrated, and controlled.
Construction of residuals
Most trading strategies in statistical arbitrage only concern the residual returns of assets. Thus, the way residuals are determined directly affects the performance of such strategies. By definition of residuals in factor models, the problem of determining residuals is equivalent to that of determining common factors.
Let us consider a set of N securities (e.g., stocks) each represented by a time series of returns and let T be the number of time series observations. For
where
In this paper, we adopt statistical factor models in which factors are unobservable and so are extracted from asset returns. That is, one needs to determine factors, their loadings, the number of factors, and residuals, based on the information taken from market returns. We follow [2] for constructing factors. The number of factors is also an important quantity to be specified. One simple and common way is to choose p factors that explain a certain level of variance. In this study, we mainly consider a constant number of factors.
From the data matrix R, a correlation matrix can be obtained according to the formula:
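As a minimal sketch of this step, the snippet below computes an empirical correlation matrix from a T × N table of returns. The function name, the row-per-day layout, and the use of the sample mean and the (T − 1)-normalized sample standard deviation are assumptions made for illustration; they are not taken from the formula in the text.

```python
import math

def correlation_matrix(returns):
    """Empirical correlation matrix of asset returns.

    returns: list of T rows (days), each a list of N asset returns.
    Assumes every asset has non-zero sample variance.
    """
    T = len(returns)
    N = len(returns[0])
    means = [sum(row[i] for row in returns) / T for i in range(N)]
    stds = [
        math.sqrt(sum((row[i] - means[i]) ** 2 for row in returns) / (T - 1))
        for i in range(N)
    ]
    # Standardize each asset's return series (zero mean, unit variance).
    Y = [[(row[i] - means[i]) / stds[i] for i in range(N)] for row in returns]
    # Correlation = average product of standardized returns.
    return [
        [sum(Y[t][i] * Y[t][j] for t in range(T)) / (T - 1) for j in range(N)]
        for i in range(N)
    ]
```

The principal components used as statistical factors would then come from the eigen-decomposition of this matrix, which is left out of the sketch.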
Calibration of residuals
For the estimation of the mean-reversion time of residual processes, we employ the Ornstein–Uhlenbeck (OU) process.
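A common way to calibrate an OU process on daily data is a least-squares AR(1) regression of the integrated residual on its own lag. The sketch below illustrates that approach under stated assumptions: the function name `fit_ou`, the time step of 1/252 years, the rejection of windows whose slope falls outside (0, 1), and the normalized signal (last level minus the estimated mean, divided by the equilibrium standard deviation) are illustrative choices, not the paper's exact specification. It relies on the exact discretization X_{t+1} = a + b X_t + ξ with b = exp(−κΔt) and m = a / (1 − b).

```python
import math

def fit_ou(x, dt=1.0 / 252):
    """Least-squares AR(1) fit x[t+1] = a + b*x[t] + noise, mapped to OU parameters.

    x: integrated (cumulative) residual over one estimation window.
    Returns None when the fitted slope does not indicate mean reversion.
    """
    x0, x1 = x[:-1], x[1:]
    n = len(x0)
    mean0, mean1 = sum(x0) / n, sum(x1) / n
    sxx = sum((u - mean0) ** 2 for u in x0)
    sxy = sum((u - mean0) * (v - mean1) for u, v in zip(x0, x1))
    syy = sum((v - mean1) ** 2 for v in x1)
    b = sxy / sxx
    a = mean1 - b * mean0
    ss_res = sum((v - (a + b * u)) ** 2 for u, v in zip(x0, x1))
    r2 = 1.0 - ss_res / syy          # goodness-of-fit of the regression
    if not 0.0 < b < 1.0:
        return None                  # assumption: discard non-mean-reverting fits
    kappa = -math.log(b) / dt        # mean-reversion speed (per year)
    m = a / (1.0 - b)                # long-term mean
    sigma_eq = math.sqrt(ss_res / (n - 2) / (1.0 - b * b))  # equilibrium std
    return {
        "kappa": kappa,
        "m": m,
        "sigma_eq": sigma_eq,
        "r2": r2,
        "signal": (x[-1] - m) / sigma_eq,  # assumed dimensionless signal
    }
```

The mean-reversion time is the reciprocal of `kappa`, so a larger `kappa` corresponds to a faster-reverting residual.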
Rewriting the factor model, we have
Risk-control methods
Ranks by mean-reverting speed
The mean-reversion time of residuals is hard to analyze and control, since it reflects idiosyncratic features of individual assets, which may not be easily interpreted with economic indicators. For example, the mean-reversion dynamics of residuals do not have to be similar to those of the original returns. There is no fundamental theory for this. However, our approach starts from the premise that some stocks have a more distinguishable correlation structure than others. Then their residuals, the remainders after removing factors, are better decoupled from the market and hence are more reliable for market-neutral strategies. In other words, these residuals are less trending or faster mean-reverting. Since there is no structural model for this mechanism, we develop empirical schemes using market data.
Note that in OU modeling, the parameter κ represents the mean-reverting speed. In order to find “good” residuals, we first filter out slow mean-reversions. This is because if a residual is not reverting for a long time after opening a position on it, it will be hard to find an optimal time to close the position. Hence, we accept residuals which have relatively faster mean reversion speed.
To quantify the mean-reversion speed of each residual, we take a simple average of mean-reverting speed (
Selective trading via R²-value screening
As we discussed, the estimated OU parameters are used for generating trading signals. Poor estimation cannot provide reliable signals. In order to improve statistical reliability, we accept estimation results only with high enough goodness-of-fit. As a measure of goodness-of-fit, we adopt

An exemplary statistics of mean reversion time and
The mean-reversion dynamics of residuals are not stationary, but change in time. The validity of residuals’ ranks based on the training within a time window can deteriorate with time. Thus, the portfolio selection must be updated regularly. The updating interval is set to be the same as the length of the estimation window, for consistency of market information. By doing this, the current portfolio maintains up-to-date information on mean-reversion times.
Robust trading algorithm and optimization
In this section, we give an overview of the trading strategy we implement. First, a threshold-based rule is explained. We then focus on developing an investment strategy that incorporates several constraints that are needed to make it robust.

A schematic diagram for a long/short strategy. When the trading signal hits 1.25, the stock is presumed to be over-valued, so we sell the stock to open the short position (green dots), expecting it to go down soon. If it hits 0.5, we buy the stock to close the short position, which leads us to obtain profits. A similar strategy can be applied to the opening and closing of a long position (red dots).
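The threshold rule in the diagram can be sketched as a small state machine. The short-side levels 1.25 (open) and 0.5 (close) are taken from the caption; mirroring them to −1.25 and −0.5 for the long side is an assumption, and the function names are hypothetical.

```python
def update_position(position, signal, open_level=1.25, close_level=0.5):
    """One step of the threshold rule.

    position: -1 (short), 0 (flat), +1 (long).
    Long-side thresholds are assumed to mirror the short-side ones.
    """
    if position == 0:
        if signal >= open_level:
            return -1            # over-valued: open short
        if signal <= -open_level:
            return 1             # under-valued: open long
        return 0
    if position == -1 and signal <= close_level:
        return 0                 # close short
    if position == 1 and signal >= -close_level:
        return 0                 # close long
    return position              # otherwise keep the current position


def run_rule(signals):
    """Apply the rule to a sequence of daily signals, starting flat."""
    position, path = 0, []
    for s in signals:
        position = update_position(position, s)
        path.append(position)
    return path
```

Because a closed position is only reopened on a later step, the rule never flips directly from short to long within a single day.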
Note that the strategy involves three main steps.
Residual generation: do PCA for historical data from
Portfolio selection: estimate parameters with OU-modeling of residuals and select stocks using confidence criteria.
Trading: dynamically adjust the portfolio by opening and closing the long/short positions during the trading horizon. The investment amount for each asset is determined by solving a constrained optimization problem.
From the estimated OU parameters, we generate dimensionless trading signals as
Three constraints
For the practical implementation of the strategy and the proper assessment of its performance, we impose the following three conditions.
Market-neutrality
A market-neutral strategy generates returns that are independent of the market environment. Recall we have the factor model for
Dollar-neutrality
We also want our strategy to be dollar-neutral, which just means symmetry between long and short holdings,
Leverage: Total amount of investment
When it comes to comparing different strategies, the total absolute amount matters. In order to normalize this effect, we need to restrict the total investment.
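To make the three conditions concrete, the following sketch checks whether a given allocation satisfies them. It assumes that market-neutrality means zero net exposure to every factor (the allocation-weighted sum of loadings vanishes), that dollar-neutrality means the signed amounts sum to zero, and that the leverage condition fixes the sum of absolute amounts. All names and the tolerance are hypothetical.

```python
def allocation_diagnostics(q, loadings):
    """Return (factor exposures, net dollar position, gross investment).

    q: signed dollar amount per asset (positive = long, negative = short).
    loadings: per-asset list of factor loadings (N rows, p columns).
    """
    p = len(loadings[0])
    exposure = [sum(q[i] * loadings[i][k] for i in range(len(q))) for k in range(p)]
    net_dollar = sum(q)
    gross = sum(abs(v) for v in q)
    return exposure, net_dollar, gross


def satisfies_constraints(q, loadings, leverage, tol=1e-9):
    """True if q is market-neutral, dollar-neutral, and uses the target leverage."""
    exposure, net_dollar, gross = allocation_diagnostics(q, loadings)
    return (
        all(abs(e) <= tol for e in exposure)
        and abs(net_dollar) <= tol
        and abs(gross - leverage) <= tol
    )
```

For example, with a single factor and loadings of 1.0 for every asset, the allocation `[0.5, -0.25, -0.25]` passes all three checks for a leverage of 1.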
Optimization for asset allocations
The rule displayed in Fig. 3 determines only whether to buy or sell an asset. The main objective of this section is to determine how much. First of all, it is worthwhile to note that not all assets’ allocations need to be changed each time. For example, if the trading signal of an asset does not indicate a change of its position from time t to
Let us start out by defining the following sets of assets:
In order to find
Again, the first constraint is for the sign of each investment that is determined by trading signals. The second constraint is for a dollar-neutrality, and the third is from the leverage condition. The summations over
We can rewrite the problem in a simpler vector-matrix form. Let us employ notation for two vectors,
Here
Empirical results
The back-testing results are presented in this section. We start out by describing the testing setups.

Diagram for trading scheme in a single testing cycle. The mean-reversion speeds (or the inverse of the mean-reversion times) are estimated with a moving estimation window, and the quality scores are recorded until the end of the training period. Based on the quality scores, high-ranked stocks are selected to form the trading portfolio. When the trading signal is judged to be statistically reliable, the optimization problem for asset allocations is solved for trading. Note that the estimation windows for training and for trading are set to be the same.
Schemes
Fig. 4 provides a schematic diagram of how the data is used for back-testing. During the “portfolio selection” period, we use historical data to estimate the mean-reversion speed of each asset’s residual. The estimations are collected with a moving window, and they are sorted by the average mean-reversion speed. At the end of training, the high-ranked stocks are chosen to form the trading portfolio. In the “trading” period, the trading signals are generated. Then we dynamically adjust the asset allocations based on the solution of the optimization problem we discussed in Section 3.3. Note that the OU-estimations are carried out in both periods: one is for sorting estimated mean-reversion speeds and the other is for generating signals.
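The selection step described above can be sketched as follows. The dictionary layout, the function name, and the decision to skip stocks with no valid estimates are assumptions for illustration.

```python
def select_portfolio(kappa_history, size):
    """Pick the `size` stocks with the highest average mean-reversion speed.

    kappa_history: dict mapping a ticker to the list of kappa estimates
    collected over the moving windows of the training period.
    Tickers with no estimates are skipped (an assumption of this sketch).
    """
    scores = {
        name: sum(ks) / len(ks) for name, ks in kappa_history.items() if ks
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:size]
```

Reversing the sort order would produce a portfolio of the slowest mean-reverting residuals instead.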
Parameters
Trading results cannot of course be independent of the choice of parameters and of the testing environments. Therefore, we consider the following parameter ranges for our tests.
Estimation window: 30, 60, 90, 120 days
Number of factors: 5, 10, 15, 20 factors are removed to obtain residuals.
Trading periods: 5 regimes – 2005–2006/2007–2008/2009–2010/2011–2012/2013–2014
Portfolio selection: risk-controlled, high-capitalization, low-capitalization, random
The window lengths are chosen for reliable estimations. For example, 5 days are insufficient to derive mean-reversion properties from daily data. On the other hand, a window that is too long can cause over-fitting or overemphasis of old information. We choose the range from 30 to 120 days, which is also consistent with the actual reporting cycles of companies. As for the number of factors, our intention is to find the number of principal components that effectively remove strong cross-correlations between assets. From the returns data of 378 stocks, we found that taking out more than 30 principal components makes residuals too noisy, which may cause instability in estimations. In order to have meaningful mean-reverting residuals, we extract 1 to 20 factors. Lastly, the performance of the trading strategy is evaluated and summarized for each two-year period. Although the overall performance for the whole period (ten years in this paper) may be important for long-term profits, persistent performance with no bad periods is preferred. Thus, five different two-year periods are considered: pre-crisis (2005–2006), in-crisis (2007–2008), post-crisis (2009–2010), afterward (2011–2012), and recent years (2013–2014).
The performance from trading can be measured by the cumulative profit-and-loss (PnL). We use the following formula.
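As an illustration of how transaction costs enter a cumulative PnL, the sketch below uses a generic bookkeeping that is an assumption of this example rather than a reproduction of the paper's formula: each day the PnL gains the position-weighted returns and loses a proportional cost on the traded volume (the absolute change in positions). The portfolio is assumed to start flat, and financing terms are ignored.

```python
def cumulative_pnl(positions, returns, cost_bp=5.0):
    """Cumulative PnL path with proportional transaction costs.

    positions[t][i]: dollar amount held in asset i over day t.
    returns[t][i]:   return of asset i over day t.
    cost_bp:         cost in basis points per unit of traded volume.
    """
    eps = cost_bp * 1e-4
    prev = [0.0] * len(positions[0])      # assumption: start with no positions
    total, path = 0.0, []
    for q, r in zip(positions, returns):
        traded = sum(abs(a - b) for a, b in zip(q, prev))
        total += sum(a * b for a, b in zip(q, r)) - eps * traded
        path.append(total)
        prev = q
    return path
```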
For each period, the annualized Sharpe ratio for the PnL is calculated as the standardized excess returns:
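A minimal sketch of an annualized Sharpe ratio from daily PnL increments is given below. It assumes 252 trading days per year, treats the daily increments as already being excess returns (no separate risk-free adjustment), and uses the sample standard deviation; these conventions and the function name are assumptions.

```python
import math

def annualized_sharpe(daily_pnl, periods_per_year=252):
    """Mean over standard deviation of daily PnL increments, annualized."""
    n = len(daily_pnl)
    mean = sum(daily_pnl) / n
    var = sum((x - mean) ** 2 for x in daily_pnl) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)
```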
Market data
Our analysis is based on stocks from S&P500. We extracted
Sample PnL plots
Before presenting the back-test results, we first present typical PnL plots (see Fig. 5) that are generated from controlled and random portfolios with different transaction costs. Our trading algorithm works quite well regardless of the prevailing market regime.

Sample results of cumulative PnL for each period. The results are generated using two portfolios (controlled and random) with two transaction costs (of 5 bp and 10 bp per trading volume). Here the portfolio size is

The out-of-sample performance from four different portfolios. The number of selected stocks is 75, and the transaction costs are 5 bp per volume. (Left) Without
As illustrated in Section 2.3, stocks are chosen based on the quality scores we calculate before each period so as to form the trading portfolio. We also compare performance with other portfolios, including a randomly chosen portfolio, and high- and low- capitalized firms portfolios. The left plot in Fig. 6 shows the performance of the four different portfolios. The portfolio size here is
In contrast, the performance with a randomly chosen portfolio is different. The Sharpe ratios of the random portfolio are lower and more unstable. It is also clear that the mean reversion strategy with the risk-controlled portfolio is far superior to low- and high- cap portfolios as well. Moreover, the controlled portfolio gives high-returns and low-volatility, whereas for other portfolios, the high returns are accompanied by high volatilities.
To further assess the impact of our approach, we also test with an anti-controlled portfolio, which is formed from the stocks with long mean-reversion times. We present the performance from this portfolio in Table 10 in the Appendix, and as expected, the results are the worst among all the portfolios considered. Although there are a few parameter choices that yield good performance, the overall fluctuations of PnL are significantly larger than those from other portfolios. We also confirm this with experiments using other portfolio sizes.

Mean-reversion times (in days): estimated (Left) and realized (Right) values are presented. The mean-reversion controlled portfolio exhibits significantly shorter trading intervals compared to the other three portfolios. Note that the values are averaged.
To confirm and illustrate the effect of mean-reversion control in the out-of-sample tests, we also investigate the realized mean-reversion time of the residuals. By this, we mean the realized buy-sell time intervals, or the duration between opening and closing of a position:
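The realized buy-sell interval can be measured directly from a position path. In the sketch below, the function name, the encoding of positions as −1, 0, +1, and the exclusion of trades still open at the end of the sample are assumptions.

```python
def holding_durations(position_path):
    """Length in days of each completed trade for one asset.

    position_path: daily positions, -1 (short), 0 (flat), +1 (long).
    A trade runs from the first non-zero day to the day the position
    returns to zero; trades still open at the end are ignored.
    """
    durations, opened_at = [], None
    for day, pos in enumerate(position_path):
        if pos != 0 and opened_at is None:
            opened_at = day
        elif pos == 0 and opened_at is not None:
            durations.append(day - opened_at)
            opened_at = None
    return durations
```

Averaging these durations over all assets in a portfolio gives a proxy for its realized mean-reversion time.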
The reliability of a trading signal relies on the estimation quality of OU-parameters. Rather than keeping and using all signals, we reject some of them if the goodness of fit of the corresponding OU estimations is significantly lower than an acceptable level. The least squares estimation uses two consecutive integrated residuals,
The effect of selective trading via

The performance surface as a function of number of factors (p) and window length (T), for two different transaction costs settings:
As we have seen in Fig. 6, in addition to the random portfolio, we also compare other portfolios that are formed according to capitalization. The assets are sorted by market capitalization, and low- and high-capitalized assets are selected to form portfolios for trading. We note first that none of them outperform the controlled portfolio in all periods. Second, we find that the high-cap portfolio performs very well during 2007–2008 and 2009–2010, compared to the low-cap portfolio. The most likely explanation for this is that small firms rely less on market information and so are less sensitive to market changes, as compared to large firms. Consequently, during volatile periods market impact is greater on large firms, and this is reflected by principal component analysis. Thus the residuals of large-cap portfolios are well-decoupled from the market. The third observation is that the volatility resulting from high-cap portfolios is lower than for other portfolios. The volatility level on average is about
Number of factors, window lengths, and transaction costs
Most studies in statistical arbitrage do not provide sufficient guidance or discussion about the role of important parameters, such as the number of factors, window lengths, and transaction costs. In this subsection we consider some issues associated with them.
First, the number of factors to be used varies depending on the portfolio size and the window length, since a larger portfolio or a longer window generally requires more factors to be removed to get reliable residuals. This is because when using statistical factor models, with the factors constructed from observed data, the explanatory power of a single factor can decrease when the data size increases. Second, setting a window length must take into account the trade-off between the benefit from accessing more data and the loss from using old information. Lastly, transaction costs also matter. For example, frequent trading may increase the chances of gains, but at the same time it will bring losses if the transaction costs are high.
We have found that the relations among these quantities are rather complicated, and a specific parameter set that leads to best performance is not clearly identified. However, considering an extensive range of parameters (see Fig. 8), we observe the following. Performance with short windows is more sensitive to the number of factors. If the window is short, not many factors can be used. Thus, subtracting too many factors quickly degrades performance, compared to using longer windows. The results with shorter window (with a smaller number of factors) are best when transaction costs are small. However, performance drops significantly as the transaction costs increase (to 15 bp and above). In particular, if the transaction costs are high, then using long windows provides better and more stable performance. This is not surprising, since when it comes to trading environments with high transaction costs, frequent trading activities are not desirable.
The performance of a strategy considered (denoted by Ω) depends on three elements: the number of factors (p), the length of estimation window (T), and transaction cost (ϵ):

Sharpe ratios with respect to portfolio sizes: X axis represents portfolio sizes (25, 50, 75, and 100 stocks) and Y axis represents Sharpe ratios. The results are presented with transaction costs of 5 bp. Note that the values on the Y axis are averaged ones over all number of factors, estimation windows, and periods. In general, the performance is better with larger portfolio sizes. However, for controlled portfolio, using
In many cases we consider, as the portfolio size increases the strategy becomes more stable, as can be seen in Fig. 9. This is consistent with the well-known idea of diversification. Since the total leverage is fixed regardless of the number of participating stocks, the optimization process has more range and gives more stable results. However, this is not true in general. For example, in the case of a controlled portfolio we find that using

Performance comparison with and without solving the optimization problem for asset allocations. Without optimized asset allocations, the strategy is unstable and cannot achieve market-neutrality.
This also illustrates the role of the proposed optimization problem (Eq. (43)) given the total leverage. Fig. 10 presents the performance without doing the optimization for asset allocation. That is, we allocate an identical amount to each asset, given the long/short signal of each. As expected, the performance is very unstable, since the allocation in this case may not satisfy the market-neutral conditions.
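The uniform baseline described above can be sketched as follows: each active asset receives the same absolute amount, signed by its long/short signal and scaled to a fixed total leverage. The helper names and the single-factor exposure check are assumptions for illustration.

```python
def uniform_allocation(signs, leverage=1.0):
    """Equal absolute amount per active asset, signed by the trading signal.

    signs: +1 (long), -1 (short), or 0 (no position) for each asset.
    """
    active = sum(1 for s in signs if s != 0)
    if active == 0:
        return [0.0] * len(signs)
    unit = leverage / active
    return [s * unit for s in signs]


def factor_exposure(q, betas):
    """Net exposure of allocation q to a single factor with loadings betas."""
    return sum(a * b for a, b in zip(q, betas))
```

With two long positions and one short, for instance, the uniform allocation has a non-zero net dollar position and, in general, a non-zero factor exposure, which is the kind of violation the optimized allocation is designed to remove.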
We found that our investment algorithm becomes less effective in recent years, 2013–2014. We need to examine this period more carefully. Since the essence of our approach is based on mean-reversion time control, we can check whether there is any unexpected change in realized mean-reversion times. In fact, Fig. 7 already offers a clue. We see that the buy-sell interval for the controlled portfolio is relatively longer both in 2005–2006 and 2013–2014. We note that the selected residuals in the controlled portfolio are expected to have shorter mean-reversion times. However, if they do not behave as predicted, the reliability of the whole approach may drop. It appears that in 2005–2006 the relatively longer realized mean-reversion time did not affect performance, but in 2013–2014 it seems to have done so. Unfortunately, we have not found a clear explanation for why this happens, only a guess that the residuals in the recent market period have abnormally higher temporal correlations, which results in slower mean-reversions than expected.
Time-varying number of factors by variance explanations
So far, we have used a fixed number of factors. This assumption can be relaxed by introducing a dynamic number of factors. Estimating the number of factors in high-dimensional factor models is beyond the scope of this study.34

Dynamics of the number of factors that explain different levels of variance. The numbers differ with the estimation window: 60 days (top) and 120 days (bottom). The total number of stocks is 378. The scaled S&P500 index is shown for comparison. The correlation is larger in volatile markets.

The largest eigenvalue of the correlation matrix of normalized original returns with a moving window. The largest eigenvalue increases significantly when the market becomes volatile. Shorter windows respond more sensitively to the market.
Results with a dynamic number of factors with moving windows are presented in Fig. 11. The numbers range from 1 to 25, which shows that the correlation among assets varies over time. The market also becomes very condensed, in the sense that few factors are needed, during late 2008, mid-2010, and late 2011. High market volatility is directly linked to strong correlation among assets, as the dynamics of the largest eigenvalue depicted in Fig. 12 also illustrate. Given a variance level, the corresponding number of factors is smaller when using shorter windows (60 days in Fig. 11). This is because each factor accounts for a higher proportion of variance than it does with longer windows (120 days in Fig. 11).
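As a rough illustration of how a variance level maps to a number of factors, the sketch below counts the principal components of a sample correlation matrix needed to reach a given fraction of total variance. The function name and the synthetic one-factor data are assumptions made for this example; the sketch is not the code used for Fig. 11.

```python
import numpy as np

def n_factors_for_variance(returns, level):
    """Smallest number of principal components of the correlation matrix
    whose eigenvalues add up to at least `level` of the total variance.

    returns: array of shape (n_days, n_assets) within one estimation window.
    level:   target fraction of explained variance, e.g. 0.2 for 20%.
    """
    returns = np.asarray(returns, dtype=float)
    corr = np.corrcoef(returns, rowvar=False)     # columns are assets
    eigvals = np.linalg.eigvalsh(corr)[::-1]      # sorted in descending order
    explained = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(explained, level)) + 1
    return min(k, len(eigvals))                   # guard against round-off at level = 1

# Synthetic data: one common "market" factor plus idiosyncratic noise.
rng = np.random.default_rng(0)
market = rng.standard_normal((250, 1))
noise = rng.standard_normal((250, 10))
calm = 0.3 * market + noise       # weak common factor, low correlation
stressed = 3.0 * market + noise   # strong common factor, high correlation

print(n_factors_for_variance(calm, 0.5), n_factors_for_variance(stressed, 0.5))
```

When the common factor dominates, the largest eigenvalue absorbs most of the variance and a single factor reaches the target level, which mirrors the drop in the number of factors during volatile periods.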
With a dynamic number of factors, we repeat the back-tests as before. Table 2 and Table 3 show the results for a controlled portfolio with transaction costs of 5 bp and 20 bp, respectively. The 20% variance level performs best when transaction costs are low. However, performance at variance explanation levels of 30% to 40% is more stable in the high-cost setting. We point out that no single level of variance explanation yields the best performance at all times. This suggests that choosing a dynamic number of factors solely on the basis of variance explanation might not be ideal.
Mean-reversion is the critical property for many statistical arbitrage strategies, and this paper studies its risk in detail. We have proposed a new approach to control the risk of mean-reversion time. Following the well-known Ornstein–Uhlenbeck modeling of residual processes, stocks are ranked by their estimated mean-reversion speeds, and the highest-ranked stocks are selected to form the trading portfolio. This simple portfolio formation procedure, combined with selective trading based on the goodness-of-fit of the trading-signal estimation, effectively makes the strategy more reliable. Moreover, the optimization problem for trading volume plays a crucial role in obtaining market and dollar neutrality. All of these contribute to robust performance regardless of market regime. The back-test results have demonstrated the effectiveness of the approach.
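The ranking step summarized above can be sketched as follows. The sketch assumes daily observations, a discrete AR(1) least-squares fit X[t+1] = a + b X[t] + noise for each residual series, and the standard OU relation κ = -ln(b)/Δt. The function names, the choice Δt = 1/252, and the simulated series are illustrative assumptions, not the implementation used in the back-tests.

```python
import numpy as np

def estimate_kappa(x, dt=1.0 / 252):
    """Least-squares AR(1) fit; returns the implied OU mean-reversion speed."""
    x = np.asarray(x, dtype=float)
    b, a = np.polyfit(x[:-1], x[1:], 1)   # slope b, intercept a
    if not 0.0 < b < 1.0:
        return 0.0                        # no usable mean reversion in this window
    return -np.log(b) / dt

def select_fast_reverting(residuals, n_select, dt=1.0 / 252):
    """Rank the columns of `residuals` by estimated kappa, fastest first."""
    kappas = np.array([estimate_kappa(col, dt) for col in residuals.T])
    order = np.argsort(kappas)[::-1]
    return order[:n_select], kappas

def simulate_ar1(b, n, rng):
    """Simulated residual series with AR(1) coefficient b (illustration only)."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = b * x[t - 1] + rng.standard_normal()
    return x

rng = np.random.default_rng(1)
residuals = np.column_stack([
    simulate_ar1(0.95, 2000, rng),   # slow mean reversion
    simulate_ar1(0.50, 2000, rng),   # fast mean reversion
    simulate_ar1(0.90, 2000, rng),   # intermediate
])
selected, kappas = select_fast_reverting(residuals, n_select=1)
print(selected, kappas)
```

A smaller AR(1) coefficient implies a larger κ and hence a shorter mean-reversion time, so the second series is the one selected in this toy example.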
Statistical arbitrage algorithms generally involve many parameters. There are, however, no “best” parameters to use. The exploration of the associated parameter regimes provides some insight into the relation between these parameters and trading performance.
Footnotes
Acknowledgement
Joongyeub Yeo gratefully acknowledges the support of ILJU Foundation Scholarship.
Supplemental tables
Summary of tables in appendix. Tables with other parameters are available on request.

| Table | Portfolio | N | ϵ | -screening |
|---|---|---|---|---|
| Table 2 | Controlled | 100 (dynamic p) | 5 bp | No |
| Table 3 | Controlled | 100 (dynamic p) | 20 bp | No |
| Table 4 | Controlled | 75 | 5 bp | No |
| Table 5 | Controlled | 75 | 5 bp | Yes |
| Table 6 | Random | 75 | 5 bp | No |
| Table 7 | Random | 75 | 5 bp | Yes |
| Table 8 | Low-cap | 75 | 5 bp | No |
| Table 9 | High-cap | 75 | 5 bp | No |
| Table 10 | Anti-controlled | 75 | 5 bp | No |
| Table 11 | Controlled | 100 | 5 bp | No |
| Table 12 | Controlled | 50 | 5 bp | No |
| Table 13 | Controlled | 25 | 5 bp | No |
| Table 14 | Controlled | 75 | 10 bp | No |
| Table 15 | Controlled | 75 | No cost | No |

Dynamic number of factors: a controlled portfolio with a dynamic number of factors is applied, based on variance explanation by principal components. Transaction costs are 5 bp. All values are annualized.
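| Windows | Var. Exp. | 2005–2006 | 2007–2008 | 2009–2010 | 2011–2012 | 2013–2014 | All: Mean | All: Return | All: Volatility |
|---|---|---|---|---|---|---|---|---|---|
| 30 | 20% | 2.32 | 3.10 | 2.47 | 3.17 | 2.52 | 2.72 | 40.03% | 5.22% |
| 30 | 30% | 2.18 | 2.99 | 2.53 | 2.50 | 2.07 | 2.45 | 34.74% | 5.22% |
| 30 | 40% | 2.69 | 3.19 | 2.18 | 2.71 | 1.55 | 2.46 | 32.42% | 5.00% |
| 30 | 50% | 1.52 | 2.80 | 2.53 | 2.79 | 3.08 | 2.54 | 27.94% | 4.46% |
| 30 | 60% | 0.92 | 1.65 | 3.12 | 2.04 | 1.64 | 1.87 | 16.72% | 4.12% |
| 60 | 20% | 2.19 | 2.35 | 1.45 | 2.31 | 0.49 | 1.76 | 25.79% | 6.09% |
| 60 | 30% | 2.80 | 2.17 | 1.45 | 2.34 | 0.79 | 1.91 | 27.75% | 6.06% |
| 60 | 40% | 1.59 | 2.25 | 1.54 | 1.81 | 2.11 | 1.86 | 23.93% | 5.58% |
| 60 | 50% | 1.60 | 3.26 | 1.33 | 1.77 | 1.42 | 1.88 | 22.49% | 5.16% |
| 60 | 60% | 1.83 | 2.30 | 0.94 | 1.59 | 0.48 | 1.43 | 14.95% | 4.81% |
| 90 | 20% | 1.39 | 1.55 | 1.83 | 1.93 | 0.02 | 1.35 | 17.08% | 5.78% |
| 90 | 30% | 1.02 | 1.64 | 1.83 | 1.97 | 0.08 | 1.31 | 16.50% | 5.71% |
| 90 | 40% | 1.78 | 1.96 | 1.71 | 1.80 | 1.12 | 1.67 | 19.49% | 5.20% |
| 90 | 50% | 1.05 | 2.47 | 1.42 | 3.01 | 0.55 | 1.70 | 17.76% | 4.71% |
| 90 | 60% | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00% | -% |
| 120 | 20% | 1.15 | 0.69 | 2.30 | 1.33 | 1.98 | 1.49 | 17.44% | 5.98% |
| 120 | 30% | 1.45 | 0.65 | 2.30 | 1.33 | 2.03 | 1.55 | 17.60% | 5.85% |
| 120 | 40% | 1.70 | 1.90 | 2.09 | 1.51 | 1.43 | 1.72 | 21.32% | 5.38% |
| 120 | 50% | 1.07 | 0.72 | 1.28 | 1.08 | 1.80 | 1.19 | 11.58% | 5.18% |
| 120 | 60% | 1.06 | 1.58 | 1.56 | 1.25 | 0.15 | 1.12 | 11.56% | 4.63% |
| Mean | | 1.65 | 2.06 | 1.89 | 2.01 | 1.33 | 1.79 | 21.95% | 5.27% |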
κ represents the rate of mean reversion, σ the volatility, and m the long-term mean value of X.
Other goodness-of-fit measures can be used, but in this paper we use
As will be discussed later, the dimensions of these quantities can change depending on the previous state of the portfolio.
There are stocks coming in and out of the pool of S&P500. We have only picked stocks which have persisted during the 15-year period.
This is verified with various choices on the number of extracted factors and estimation windows.
Two kinds of time windows are mentioned in this paper. One is for estimating the sample correlation matrix and PCA, and the other is for parameter estimation of residual modeling. For simplicity and consistency, we fix the same length for both.
There are many possible ways to model the residuals. However, OU modeling is the most frequently used approach for mean-reverting processes and is convenient for parameter estimation.
Maximum likelihood (ML) estimation is also frequently used. It can be shown that the LS estimator is equivalent to the ML estimator, and we focus on the LS estimator in the rest of the paper.
One problem with the continuous OU model is estimation bias in the mean-reversion estimator. Standard estimation methods, such as least squares, maximum likelihood, or the generalized method of moments, are known to produce biased estimators for the mean-reversion parameter [22]. Researchers have developed several methods to reduce this bias. For example, [] adopts the jackknife method to resolve this issue.
The mean-reversion time can be defined by its inverse, scaled by the interval of discrete observations:
The estimated mean-reversion parameter κ generally depends on the length of the estimation window. Thus, the mean-reversion speed must be normalized by the estimation window. However, since our selection step is conducted only within a fixed window, this normalization is not actually necessary here.
If the number selected is too small, then the solution of the optimization problem for the market-neutral condition becomes inaccurate, since it can be hard to find a feasible combination of dollar quantities that cancel out.
The
For selective trading, trades are rejected if the estimation error is too high.
Other conditions may be used for the objective function. Our choice is made solely for better numerical stability in solving the problem.
We found that the matrices and vectors used in this optimization problem are sparse, since the proportion of active signals is usually small.
Or the inverse of the estimated mean-reversion times.
Other parameters are also considered later.
The number of factors can vary with the asset universe used.
We use
The volatility increases by a limited amount.
Here we assume that the relations can change with the total number of stocks (N) and data frequency (
However, a commonly used method is to determine the number of principal components that explain a certain level of total variance of returns. Instead of fixing the number of factors, we now fix the percentage of explained variance, such as 20%, 30%, 40%, 50%, and 60%, and the number of factors changes according to the variance levels.
