Algorithmic pairs trading with expert inputs: a fuzzy statistical arbitrage framework

Abstract

Pairs trading is a widespread market-neutral trading strategy that aims to exploit the relationship between pairs of financial instruments in efficient markets, where predicting the movements of individual assets is theoretically not possible. Following statistical analysis, the strategy buys the underpriced asset of a pair while short selling the overpriced one. The predicted price relationship is determined through analysis of the historical spread between the members of the pair. The investor expects that, in an efficient market, the price difference will converge and the stocks will return to their ‘fair value’, at which point the positions are closed and a profit is realized. The main focus of this study is the contribution of a fuzzy engine to the existing pairs trading strategy. The widespread classical ‘crisp’ technique is chosen, applied, and compared with the developed ‘fuzzy’ model throughout the paper. To further improve this contribution, expert opinions extracted from the Bloomberg database are also integrated into the fuzzy decision-making process. Most studies simply ignore transaction costs; as a final robustness check, transaction costs are also considered here. The improvement achieved by the developed fuzzy technique is observed to be even more pronounced in this case.

Keywords

pairs trading algorithmic trading fuzzy statistical arbitrage

1 Introduction

Studied for decades by financial and industrial engineers, mathematicians, and market quants, pairs trading is a popular market-neutral trading strategy that tries to take advantage of differences in the prices of similar stocks. In principle, the strategy buys the stock that is underpriced relative to its counterpart in the pair, short sells the other stock, and expects the prices of the two stocks to converge within the intended time horizon. Once the convergence happens, both positions are closed. In this study, the strategy is applied to the end-of-day data from year 2013 to 2014 of the energy sector stocks in the Nasdaq Index.

Pairs trading aims to exploit this mispricing by examining pairs of assets instead of a single stock and by using historical data to predict future movements. Strategies of this type belong to the broad class of statistical arbitrage strategies, whose overall performance has been analyzed extensively [22]. Pairs trading is a particular kind within this class.

As with any other algorithmic trading method, the strength of this strategy and its variants diminishes over time as more market actors employ it for profit. However, Do and Faff [21] find evidence that during turbulent times in the financial markets, such as the global crisis in 2008, the performance of pairs trading strategies is quite high. Elliott et al. also point out a significant increase in the efficiency of pairs trading strategies when the market is out of equilibrium [23]. Huck comments on market timing and finds a strong connection between the returns of pairs trading strategies and the volatility level in the market [18]. The sensitivity of the returns of pairs trading strategies is well explained by Huck [19]. In line with these findings, we believe that in a complex and vague environment such as the financial markets, using ‘crisp’ rules and strategies may lead to missed opportunities in the trade signal decision-making step of any algorithm. Therefore, we propose using a fuzzy inference model to better exploit these opportunities. We believe fuzzy logic holds great potential for further research in similar trading environments involving human actors, as it allows imitation of the human decision-making process.

This study focuses on signal generation for opening and winding up the position in a trade. We consider the co-integration measure during the pairs generation period and the spread measure for opening and closing positions in the corresponding pair. Other measures studied in the literature include stationarity, partial co-integration [16], and the stochastic control approach [3]; see Krauss [6] for a comparison of the effectiveness of these techniques. The main focus of this study, however, is the contribution of the fuzzy engine to the existing pairs trading strategy, so the most common techniques are chosen and used throughout the paper. To further improve this contribution, expert opinions extracted from the Bloomberg database are also integrated into our fuzzy decision-making engine. Most studies simply ignore transaction costs, although Do and Faff [4] claim that transaction costs in fact diminish the efficiency of pairs trading strategies. As a final robustness check, transaction costs are also considered.

The rest of this paper is structured as follows. Section 2 introduces the concepts of statistical arbitrage, specifically the pairs trading strategy for efficient stock markets, and lists previous studies. Section 3 presents our materials and methods, consisting of the traditionally used ‘crisp’ strategy and the proposed fuzzy model. Section 4 lays out detailed results with fuzzy engine figures and compares them to the traditional ‘crisp’ pairs trading technique. Conclusions are discussed in Section 5.

2 Statistical arbitrage & pairs trading

Algorithmic trading strategies usually make use of the quantitative indicators that are formulated from the past price information. These indicators create crisp (buy or sell) trading recommendations. Most of those models are not based on strong theory and ignore fundamental information about the price; yet, most of the time they have been proven to be quite successful in terms of making profitable trades.

Pairs trading, one of the most successful algorithmic trading strategies, was first discovered in the 1980s by a team of quantitative traders on Wall Street led by Nunzio Tartaglia. After the team employed the strategy and made millions of dollars in the market, they announced it to the market in 1987. Two years later, in 1989, the group fell apart and its members started working in various places, which made the strategy even more common and widespread. Since then pairs trading has grown significantly and has become a market-standard trading strategy adopted by many investment banks, hedge funds, and other financial institutions [10].

The starting point of this study is the seminal paper of Gatev et al. [8], where the concept of pairs trading is described in its most basic form. The overall procedure can be analyzed in two separate steps. The first step is choosing two stocks whose prices tend to move closely together in the time interval called the formation period. The second step is watching the pair in the following time interval, called the trading period. If the prices of the stocks move away from each other, the relatively underpriced one is bought and the relatively overpriced one is sold short. When the difference between the stock prices reverts to its original level, both positions are closed and a profit is realized. At the end of the trading period all open positions are closed regardless of the size of the spread.

Gatev et al. [8] find evidence that pairs trading works well even when transaction costs are included. Do and Faff [21] apply the same methodology to a broader set of data and find that it still works, although at a declining rate. Those studies use daily data, with a training period of 12 months and a trading period of 6 months. The same framework has been adapted in this study, as the aim is not to find the best possible pairs trading strategy. Rather, this study focuses on the improvements attained by adding a fuzzy engine in different ways.

Application of fuzzy systems theory to financial problems is not new in the literature. Gradejovic and Gencay [17] apply fuzzy logic to a portfolio optimization problem and report an improvement in the performance of the algorithmic trading considered. Kahraman and Kaya [5] employ a similar technique in an investment analysis problem and use fuzzy logic to improve the estimation of interest rates. Kablan [1] makes use of fuzzy logic reasoning mechanisms to measure momentum in a high frequency trading problem. Bayram and Akat [15] use fuzzy logic techniques to make better decisions in pairs trading. Although this paper adopts a similar strategy of integrating a fuzzy logic engine into a financial problem, it differs significantly in how it does so. In all of the above studies fuzzy logic is intended to improve a crisp technical measure. In this study, in addition, expert opinions are converted to a technical indicator via fuzzy logic theory.

Another important aspect of pairs trading strategies is the impact of transaction costs on performance, which has attracted considerable interest in the field. Primbs and Yumada [11] develop a model predictive control approach to trading a portfolio of pairs of stocks under proportional transaction costs. Do and Faff [4] examine the impact of trading costs on pairs trading profitability in the U.S. equity market from 1963 to 2009. They report that the strategy remains profitable, but its efficiency declines significantly. An important contribution of this paper is addressing the issue of transaction costs. With the help of expert opinions, the fuzzy-enriched pairs trading strategy avoids some unnecessary trades. The impact of the fuzzy engine is even more visible when transaction costs are taken into consideration.

Many studies in the literature try to incorporate fuzzy logic techniques into the world of finance. However, most of them are applications of fuzzy expert systems for decision-making support in a stock trading process [9]. Zapart [7] improves the performance of statistical arbitrage strategies via neural networks. Kablan [2] employs neuro-fuzzy systems for high frequency trading and stock price movement prediction. None of these works combines pairs trading strategies with fuzzy logic systems.

From a fuzzy logic theory perspective, the study most closely related to this work is the paper by Cao et al. [12], which proposes a fuzzy genetic algorithm framework for financial pairs mining to discover pair relationships between financial entities such as stocks and markets. Its findings show that 13 highly correlated pairs, out of a total of 32 tested, came from different sectors. This suggests, as we also mention in our conclusion, that potential pairs do not necessarily come from the same sector, as is presumed by traders and financial researchers [14].

Kablan mentioned the cumulated order quantity of large order executing institutional traders and proposed a novel way of momentum analysis named as ‘fuzzy momentum analysis’ which makes use of fuzzy logic reasoning mechanisms [1]. Kablan is only interested in high frequency trading and large volume trading. In this study only the end of day data is considered as it is the most common approach in the pairs trading strategies in general.

3 Materials and methods

Aligned with the aim of this study, and to demonstrate the contribution to the domain of statistical arbitrage, the widespread pairs trading strategy (referred to as the classical ‘crisp’ strategy throughout this paper) is employed and compared with the developed fuzzy-logic-based method. The classical ‘crisp’ method uses a co-integration measure for the pair selection process and distance (spread) values for trade signal generation.

3.1 Description of the ‘crisp’ method

The crisp method of pairs trading can be described mainly in two parts. The pairs are chosen based on an observation of the stock price movements over a 12-month period which we call the training period. The pairs that are detected to be co-integrated are traded for the next 12 months following the training period which is called the trading period. The choices of the lengths of the training period and the trading period are arbitrary. One could try different time frames, however these lengths remain the same throughout the entire study.

3.1.1 Training period

The end-of-day data from year 2013 to 2014 of the energy sector stocks in Nasdaq Index are considered. All stocks that dropped out of the index during the time period are screened out. Hence there is no missing data point for any of the 67 stocks used (Fig. 1 – Process Box 1).

Fig. 1

Flow diagram of the proposed algorithm.

As a result there are 2211 possible pairs to consider for trading. The Engle-Granger co-integration test [20] is run for all 2211 possible pairs. All pairs found not to be co-integrated are discarded, which brings the number of pairs under consideration down to 240 (Fig. 1 – Process Box 2–4).
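The pair count follows from choosing 2 stocks out of 67. As a minimal Python sketch (the paper's own implementation is in Matlab, and the ticker names below are placeholders rather than the study's data), the candidate pairs can be enumerated as follows:

```python
from itertools import combinations
from math import comb


def candidate_pairs(tickers):
    """Return every unordered pair of distinct tickers."""
    return list(combinations(sorted(set(tickers)), 2))


# Placeholder tickers standing in for the 67 stocks (hypothetical names).
tickers = [f"STOCK{i:02d}" for i in range(67)]
pairs = candidate_pairs(tickers)
print(len(pairs), comb(67, 2))  # prints "2211 2211"
```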

Next, a return index series is created for each stock over the training period. At the start, every stock price is normalized to 1, so that all stocks start at the same price. From this point onwards the stock return data is used instead of the stock price data (Fig. 1 – Process Box 5).
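The normalization step can be illustrated with a short Python sketch. The function name and the sample prices are illustrative assumptions; the sketch simply divides each price by the first price of the series, which is one straightforward way to make every series start at 1.

```python
def return_index(prices):
    """Rescale a price series so that it starts at 1."""
    if not prices or prices[0] <= 0:
        raise ValueError("need a non-empty series with a positive first price")
    first = prices[0]
    return [p / first for p in prices]


# Made-up prices, for illustration only.
print(return_index([20.0, 21.0, 19.0]))  # [1.0, 1.05, 0.95]
```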

3.1.2 Trading period

Following the calculations and analysis in the training period, the pairs are sorted by the co-integration measure and grouped for the trading period: the top 50 (1–50), the next 50 (51–100), and all co-integrated pairs are traded based on the historical distance criterion. Working with different groups serves as a robustness check both for the efficiency of the pairs trading technique used and for the contribution of the fuzzy method.
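The grouping can be sketched in Python as below. This assumes that the pairs are ranked by ascending p-value of the co-integration test (the caption of Table 5 mentions p-values) and that a smaller p-value means a more strongly co-integrated pair; the function name, the dictionary layout, and the toy p-values are hypothetical.

```python
def rank_groups(pvalues, group_size=50):
    """Sort pairs by ascending p-value and slice them into groups.

    pvalues maps a pair (a tuple of two tickers) to the p-value of its
    co-integration test. Ranking by ascending p-value is an assumption.
    """
    ranked = sorted(pvalues, key=pvalues.get)
    return {
        "top": ranked[:group_size],
        "next": ranked[group_size:2 * group_size],
        "all": ranked,
    }


# Toy p-values, for illustration only.
toy = {("A", "B"): 0.03, ("A", "C"): 0.01, ("B", "C"): 0.04,
       ("C", "D"): 0.02, ("A", "D"): 0.045}
groups = rank_groups(toy, group_size=2)
print(groups["top"])   # [('A', 'C'), ('C', 'D')]
print(groups["next"])  # [('A', 'B'), ('B', 'C')]
```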

The common practice for the historical distance metric is the standard deviation approach. A position in a pair is opened if the return indices diverge by more than two historical standard deviations, which was calculated in the training period. The position is closed at the next time the indices intersect, and a profit is realized as a result. If the indices do not meet before the end of the trading period, the position is closed no matter what. In this case a profit or a loss could be realized (Fig. 1 – Process Box 8).
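The rule above can be sketched as a small state machine in Python. This is an illustrative reading of the 2σ rule rather than the authors' Matlab code: the function name, the sign convention for positions, and the sample series are assumptions, and a position is treated as closed on the first day the difference between the two return indices reaches zero or changes sign.

```python
def crisp_positions(index_a, index_b, sigma, k=2.0):
    """Daily position under the k-sigma distance rule.

    +1: short A / long B (A is relatively overpriced)
    -1: long A / short B
     0: no position
    """
    positions = []
    current = 0
    last = len(index_a) - 1
    for t, (a, b) in enumerate(zip(index_a, index_b)):
        spread = a - b
        if current == 0:
            if spread > k * sigma:
                current = 1
            elif spread < -k * sigma:
                current = -1
        elif current * spread <= 0:
            # The indices have met or crossed: close the position.
            current = 0
        if t == last:
            current = 0  # forced close at the end of the trading period
        positions.append(current)
    return positions


# Made-up return indices; sigma would come from the training period.
a = [1.00, 1.05, 1.12, 1.08, 1.01, 0.99, 1.00]
b = [1.00] * 7
print(crisp_positions(a, b, sigma=0.05))  # [0, 0, 1, 1, 1, 0, 0]
```

In this example the position opens on the third day, when the indices are more than 0.10 apart, and closes on the sixth day, when they cross.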

3.2 Description of the fuzzy model

The proposed fuzzy algorithm in this study is in line with the classical crisp method for the training part. This makes the crisp and the fuzzy strategies identical in choosing appropriate pairs for trade via Engle-Granger co-integration testing. The novelty and contribution of the fuzzy strategy lie in the trading period. While the classical strategy uses a spread of two standard deviations, 2σ, to generate buy and sell signals, the proposed fuzzy model uses fuzzy-logic-based variables of spread and expert inputs for the same purpose. Figure 1 shows the flowchart of the algorithm built to execute and compare both strategies. The two strategies diverge at the <8. Generate ‘Crisp’ Position Array> and <9. Generate ‘Fuzzy’ Position Array> process steps of the algorithm. The differences between the performances of the algorithms are analyzed and the results are presented in Section 4. The details of the fuzzy signal generation are given in the following subsection (Fig. 1 – Process Box 6,7).

3.2.1 Trading period under the fuzzy algorithm

The proposed method in this study is a pairs trading strategy employing fuzzy logic with three inputs: (i) spread, (ii) expert opinions on the first stock of the pair, ‘expStock1’, and (iii) expert opinions on the second stock of the pair, ‘expStock2’.

The expert analyst inputs extracted from the Bloomberg Terminal database, partly displayed in Table 1, are structured as daily buy/sell/hold recommendations by several stock market brokers, which may coincide or differ. Overlapping and numerous inputs increase the strength rating of a specific recommendation for the corresponding stock, and vice versa. This strength rating then determines the fuzzy variable used in the fuzzy decision-making process, as demonstrated in Figs. 4, 5. As described in the results section of this paper, the analyst inputs, employed as fuzzy variables alongside the spread measure, serve as an inhibiting factor in the order signal generation process, yielding better choices and a smaller number of transactions (Fig. 1 – Process Box 8–10).

Fig. 2

Proposed fuzzy inference structure. Spread is the measure of deviation from the historical distance between normalized price series, expStock1 and expStock2 are the buy/hold/sell recommendations by experts concerning the first and the second stock.

Fig. 3

Membership functions for the input variable ‘spread’, according to the standard deviation from historical mean. Positive membership values indicate taking short position in the first member of the pair while buying the second, negative values indicate buying the first member while short selling the second.

Fig. 4

Membership functions for the input ‘expStock1’ analyst recommendation for the first stock of the corresponding pair.

Fig. 5

Membership functions for the input ‘expStock2’, analyst recommendation for the second stock of the corresponding pair.

Table 1

Analyst data layout extracted from Bloomberg Terminal database

Date	Total	Buys	Holds	Sells	Rating
12/6/2012	17	14	1	2	4.471
12/5/2012	17	14	1	2	4.471
12/4/2012	17	14	1	2	4.471
12/3/2012	17	14	1	2	4.471
12/2/2012	16	14	1	1	4.625
12/1/2012	16	14	1	1	4.625
11/30/2012	16	14	1	1	4.625
11/29/2012	17	14	1	2	4.471
11/28/2012	17	14	1	2	4.471
11/27/2012	17	14	1	2	4.471
11/26/2012	17	14	1	2	4.471
11/25/2012	17	14	1	2	4.471
11/24/2012	17	14	1	2	4.471
11/23/2012	17	14	1	2	4.471
11/22/2012	17	14	1	2	4.471

The three fuzzy inputs (spread, expStock1, expStock2) are processed through the ‘sugeno’ fuzzy inference engine in the Matlab Fuzzy Logic Toolbox and yield one of three signals: ‘tradePositive’, ‘tradeNegative’ or ‘noTrade’. Expert opinions are well suited to serve as linguistic variables in the fuzzy inference structure. Therefore, the analyst data acquired from the Bloomberg database have been converted to a technical indicator via fuzzy logic theory. To give the reader an idea of the nature of this data, a portion of it is shown in Table 1.
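To make the mechanics of a Sugeno-type engine concrete, the following Python sketch implements a zero-order Sugeno inference with the same three inputs and three output labels. It is not the authors' Matlab model: the membership functions, their breakpoints, the rule base, the 1 to 5 rating scale, the cutoff, and the mapping of ‘tradePositive’ to "short stock 1, buy stock 2" are all assumptions made for illustration, since the actual shapes are given only in Figs. 3–6.

```python
def ramp_up(x, lo, hi):
    """Membership rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def sugeno_signal(spread_sigmas, exp_stock1, exp_stock2, cutoff=0.5):
    """Zero-order Sugeno inference returning one of the three signal names."""
    # Fuzzify the spread, measured in historical standard deviations
    # (hypothetical breakpoints at 1 and 2 sigma).
    high_pos = ramp_up(spread_sigmas, 1.0, 2.0)
    high_neg = ramp_up(-spread_sigmas, 1.0, 2.0)
    small = 1.0 - max(high_pos, high_neg)

    # Fuzzify analyst ratings on an assumed 1 (sell) to 5 (buy) scale.
    buy1, sell1 = ramp_up(exp_stock1, 3.0, 4.0), 1.0 - ramp_up(exp_stock1, 2.0, 3.0)
    buy2, sell2 = ramp_up(exp_stock2, 3.0, 4.0), 1.0 - ramp_up(exp_stock2, 2.0, 3.0)

    # Each rule is (firing strength, constant output); AND = min, OR = max.
    rules = [
        # Wide positive spread and no expert objection: short 1, buy 2.
        (min(high_pos, 1.0 - buy1, 1.0 - sell2), 1.0),
        # Wide negative spread and no expert objection: buy 1, short 2.
        (min(high_neg, 1.0 - sell1, 1.0 - buy2), -1.0),
        # Small spread: stay out.
        (small, 0.0),
        # Experts contradict the trade: stay out (the inhibiting effect).
        (min(high_pos, max(buy1, sell2)), 0.0),
        (min(high_neg, max(sell1, buy2)), 0.0),
    ]
    total = sum(w for w, _ in rules)
    if total == 0.0:
        return "noTrade"
    # Sugeno defuzzification: weighted average of the rule outputs.
    score = sum(w * z for w, z in rules) / total
    if score > cutoff:
        return "tradePositive"
    if score < -cutoff:
        return "tradeNegative"
    return "noTrade"


print(sugeno_signal(2.5, 3.0, 3.0))  # tradePositive
print(sugeno_signal(2.5, 4.5, 3.0))  # noTrade
print(sugeno_signal(1.8, 3.0, 3.0))  # tradePositive
```

The second call shows the inhibiting role of the expert input: the spread is wide, but a strong buy rating on the first stock blocks the signal to short it. The third call shows how graded membership can fire a signal before the spread reaches 2σ.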

This paper, to our knowledge, is the first study utilizing expert analyst recommendations as inputs for algorithmic trading, through fuzzy inference. This serves as a critical contribution to algorithmic trading for the domain of financial engineering. Flow diagram of the proposed algorithm is presented in Fig. 1.

The algorithm is implemented using Matlab R2018a software and employing Fuzzy Logic Toolbox for the trade signal generation step. Bloomberg terminal data was used to download the analyst data for each stock and the stock price information was downloaded from Yahoo Finance open-source online database.

The stock data from the US Nasdaq energy sector was chosen as the sample dataset due to its high trade volume, its volatility, and the number of stocks remaining in the index for two consecutive years (67 stocks). The stocks used for analysis in this study are listed in Table 2. The stock data is downloaded automatically by the algorithm during code execution from the Yahoo Finance database URL. This makes it possible to plug any online database into the algorithm for potential further research.

Table 2

Nasdaq energy sector stocks used as inputs for this study

AXAS	FANG	IEP	PDCE	TGA
AHGP	DMLP	ISRL	PLUG	USEG
ARLP	EROC	LGCY	PSTR	VNR
AETI	EXXI	LLEX	PSIX	VNRAP
AMCF	ESCR	LNCO	PNRG	VTNR
AREX	ESCRP	LINE	PFIE	WRES
BLDP	EVEP	MPET	RCON	WLB
BKEP	FES	MARPS	REXX	WPRT
BKEPP	FXEN	MMLP	ROSE	WWD
BBEP	GLRI	MEMP	ROYL	ZAZA
CLMT	GPOR	MGEE	SAEX	ZN
CPST	HNRG	MCEP	SEV
CRZO	HERO	ORIG	TESO
CCLP	HOLI	PTEN	TRCH

The sixty-seven stocks that stay in the index for the two consecutive years yield 2211 possible pairs. Using the candidate pairs information, the proposed algorithm proceeds to gather historical price information for the corresponding stocks for further analysis. Through the Yahoo Finance gateway, the price information for the unique stocks in the 2211 pairs is downloaded for the year 2012. Following the construction of the price data for the pairs, ‘Engle-Granger’ co-integration testing [20] is used to reduce the set of pairs and reach the best candidates for pairs trading. This method has been used in numerous studies throughout the statistical arbitrage literature [3, 24]. It tests whether a combination of not-necessarily stationary series is stationary. After every possible pair in our research is tested, the remaining co-integrated candidates are listed in Table 3 (240 pairs out of 2211).
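The two steps of the Engle-Granger procedure can be sketched in plain Python as follows. This is a simplified illustration, not the routine used in the study: it omits lag augmentation and p-value computation, the function names and the synthetic data are assumptions, and the threshold of about -3.34 quoted in the comment is the commonly tabulated asymptotic 5% critical value for two series with a constant, which should be checked against tables for the actual sample size.

```python
import random


def ols(y, x):
    """Least-squares fit y = alpha + beta * x; returns (alpha, beta)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta


def engle_granger_stat(y, x):
    """Two-step Engle-Granger statistic without lag augmentation.

    Step 1: regress y on x and keep the residuals.
    Step 2: regress the residual changes on the lagged residuals and
            return the t-statistic of that slope (a Dickey-Fuller statistic),
            together with the estimated slope beta from step 1.
    """
    alpha, beta = ols(y, x)
    resid = [yi - alpha - beta * xi for yi, xi in zip(y, x)]
    lagged = resid[:-1]
    delta = [b - a for a, b in zip(resid[:-1], resid[1:])]
    sll = sum(e * e for e in lagged)
    rho = sum(e * d for e, d in zip(lagged, delta)) / sll
    sse = sum((d - rho * e) ** 2 for e, d in zip(lagged, delta))
    se = (sse / (len(delta) - 1) / sll) ** 0.5
    return rho / se, beta


# Synthetic example: x is a random walk, y tracks 2 * x plus small noise,
# so the two series are co-integrated by construction.
rng = random.Random(0)
x = [10.0]
for _ in range(249):
    x.append(x[-1] + rng.gauss(0, 1))
y = [2.0 * xi + rng.gauss(0, 0.1) for xi in x]

stat, beta = engle_granger_stat(y, x)
# A statistic well below roughly -3.34 points to co-integration (assumed threshold).
print(stat < -3.34, round(beta, 1))
```

In practice a library routine with automatic lag selection and proper p-values would be preferable to this hand-rolled version.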

Table 3

Pairs chosen for trade, based on the Engle & Granger cointegration measure

AXAS	MGEE	PFIE	SAEX	PSTR	RCON	LINE	RCON	LGCY	WPRT	SEV	WPRT
AXAS	MCEP	PFIE	TRCH	LINE	PFIE	MEMP	TRCH	SAEX	WPRT	ORIG	TRCH
AXAS	PFIE	PFIE	VNR	EROC	RCON	AXAS	LGCY	ORIG	SAEX	SEV	USEG
AXAS	RCON	PFIE	WRES	AXAS	VNR	AHGP	REXX	AXAS	ISRL	AMCF	ZN
AHGP	RCON	PFIE	WWD	AXAS	DMLP	DMLP	PNRG	PNRG	TRCH	MCEP	TRCH
ARLP	RCON	RCON	REXX	CPST	PFIE	PSTR	WPRT	MARPS	SEV	ESCR	PFIE
AETI	RCON	RCON	ROSE	MMLP	REXX	ISRL	ROYL	MPET	WPRT	ESCRP	ROSE
BKEP	RCON	RCON	SAEX	AXAS	ARLP	ISRL	VNR	ESCRP	WRES	BKEPP	MEMP
BKEPP	MGEE	RCON	TESO	MEMP	SAEX	ISRL	SAEX	BKEPP	DMLP	BKEPP	PNRG
BKEPP	PFIE	RCON	TRCH	EROC	PFIE	ISRL	PSTR	ESCRP	MGEE	EROC	WPRT
BKEPP	RCON	RCON	VNR	RCON	USEG	HOLI	TESO	PSTR	WRES	PFIE	WLB
BKEPP	SAEX	RCON	WRES	TRCH	WWD	CRZO	PFIE	CCLP	WLB	ARLP	WLB
BKEPP	TRCH	RCON	WLB	MCEP	PFIE	TESO	WLB	ESCR	ISRL	ISRL	ROSE
BBEP	RCON	RCON	WPRT	AETI	REXX	MMLP	PNRG	ISRL	LGCY	AMCF	ZAZA
CLMT	RCON	RCON	WWD	BKEP	WLB	ESCR	RCON	BKEP	HERO	AMCF	SEV
CCLP	RCON	SAEX	TRCH	PNRG	WLB	AREX	SEV	BKEP	REXX	BKEPP	MCEP
DMLP	RCON	SAEX	VNR	ROSE	SAEX	LINE	TRCH	MPET	MCEP	SAEX	WWD
ESCRP	PFIE	PFIE	WPRT	FES	RCON	AXAS	PNRG	MCEP	WPRT	ROYL	TGA
ESCRP	RCON	PNRG	REXX	AXAS	ORIG	PSTR	PFIE	SAEX	WRES	ISRL	MEMP
ESCRP	SAEX	AMCF	FXEN	AXAS	WRES	VNR	WPRT	HOLI	VTNR	ARLP	MMLP
ESCRP	TRCH	LGCY	PFIE	AXAS	MEMP	BKEP	TESO	BKEPP	VNR	AHGP	WLB
GLRI	LNCO	HERO	RCON	TRCH	WPRT	ARLP	PFIE	AXAS	WLB	HOLI	TRCH
HOLI	RCON	TRCH	VNR	ESCRP	VNR	ESCRP	ISRL	REXX	WLB	ROYL	TRCH
ISRL	LINE	AXAS	BKEPP	HNRG	ROYL	ESCRP	WPRT	CLMT	CCLP	ARLP	DMLP
ISRL	PFIE	PDCE	RCON	ISRL	TRCH	CLMT	WLB	MARPS	ZN	AXAS	CRZO
ISRL	WRES	PFIE	ROSE	AXAS	TRCH	AETI	WLB	AETI	PNRG	ISRL	ORIG
MPET	RCON	AREX	ISRL	BBEP	SAEX	MEMP	MGEE	ARLP	PNRG	PFIE	ROYL
MMLP	RCON	PNRG	PFIE	AREX	RCON	MEMP	VNR	PSTR	TRCH	HNRG	PFIE
MEMP	PFIE	GPOR	RCON	AXAS	ESCRP	BKEPP	WPRT	ISRL	WPRT	MGEE	PNRG
MEMP	RCON	BKEP	CCLP	AHGP	ARLP	AMCF	MARPS	DMLP	TRCH	BKEP	CLMT
MGEE	PFIE	RCON	VTNR	EROC	ISRL	EROC	ESCR	VNR	WRES	CCLP	PNRG
MGEE	RCON	LGCY	RCON	IEP	RCON	AXAS	BBEP	AXAS	AHGP	ARLP	REXX
MGEE	SAEX	DMLP	PFIE	AHGP	AETI	WRES	WPRT	ESCR	WRES	PFIE	TESO
MGEE	TRCH	AXAS	SAEX	TRCH	WRES	CLMT	TESO	AXAS	MPET	PNRG	SAEX
MCEP	RCON	AREX	PFIE	AHGP	MMLP	MPET	PFIE	TRCH	WLB	CRZO	MCEP
ORIG	PFIE	CRZO	RCON	EROC	TRCH	BBEP	ISRL	CRZO	PNRG	ROYL	SAEX
ORIG	RCON	CPST	RCON	ISRL	RCON	MCEP	PNRG	LINE	ROYL	MPET	TRCH
PTEN	RCON	BBEP	TRCH	DMLP	MCEP	ROSE	TRCH	CLMT	TRCH	FXEN	ZN
PNRG	RCON	BKEPP	ESCRP	RCON	SEV	MPET	PSTR	BBEP	WRES	BKEP	PTEN
PFIE	RCON	BBEP	PFIE	AXAS	WPRT	RCON	ROYL	AHGP	PNRG	AETI	MMLP

The ‘training’ step of the algorithm involves analysis of the stock price series and calculation of means and variances using the end-of-day price data for the training year of 2012. Historical spread values are determined following the normalization.

The algorithm proceeds to the trading phase following the analysis and calculations in the training step. This part of the algorithm is designed to be able to run in real time using the corresponding stock price gateway. However, for the sake of analysis, historical price information for the second year, 2013, is used. Price data is normalized and trade is executed using ‘crisp’ and ‘fuzzy’ methods consecutively.

The trading stage of the proposed method is based mainly on the notion of mispricing within pairs, taking their co-integrating nature into consideration. A short selling signal is generated for the higher valued stock and a buying signal for the lower valued stock. Positions are unwound when the spread reverts to the determined minimum.

The spread of prices in the strategy is denoted as: $\log(p_{t}^{A}) - \gamma \log(p_{t}^{B}) = \mu \pm \Delta$ (1) where A and B are the selected co-integrated stocks, with corresponding nonstationary time series $\{\log(p_{t}^{A})\}$ and $\{\log(p_{t}^{B})\}$; $\gamma$ is the co-integration coefficient; $\mu$ is the historical arithmetic mean of the spread series; and $\Delta$ is the pre-determined rule value for taking and unwinding a position [10].
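Equation (1) can be turned into a small Python check. The function names, the returned labels, and the sample values are illustrative assumptions; the direction of the trade follows the convention that a spread above the upper band means stock A is expensive relative to stock B.

```python
from math import log


def log_spread(price_a, price_b, gamma):
    """Left-hand side of Eq. (1): log(p_A) - gamma * log(p_B)."""
    return log(price_a) - gamma * log(price_b)


def band_signal(spread, mu, delta):
    """Classify a spread value against the band mu +/- delta."""
    if spread > mu + delta:
        return "short A / long B"
    if spread < mu - delta:
        return "long A / short B"
    return "inside band"


# Made-up values: mu and delta would be estimated in the training period.
s = log_spread(1.30, 1.00, gamma=1.0)
print(band_signal(s, mu=0.0, delta=0.2))  # short A / long B
```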

The constructed fuzzy inference engine is shown in Fig. 2. Figures 3–5 show the fuzzy membership functions for the variables ‘spread’, ‘expert suggestions for stock 1’ and ‘expert suggestions for stock 2’. The three outputs of the fuzzy strategy are shown in Fig. 6: tradePositive, tradeNegative and noTrade. Output ‘tradePositive’ is the signal for buying the lower priced stock of the pair while short selling the higher priced one, and vice versa. Output ‘noTrade’ stands for keeping a closed position or winding out an open position.

Fig. 6

Output variables by the fuzzy inference engine for trade signal generation.

In the ‘crisp’ trading phase, trade is executed using the widely accepted distance strategy with the crisp rule of 2σ historical deviation. Following this step, the constructed fuzzy strategy is executed with the Matlab Fuzzy Logic Toolbox, using the spread and the expert opinions on the separate stocks as fuzzy inputs.

Stock markets inherently apply several fixed and variable costs to each executed transaction. Fixed transaction costs are an obvious source of loss when evaluating a trading strategy. Any algorithmic trading strategy that requires a high number of transactions may see its gains eroded by the high aggregate transaction costs. Therefore, the rule base of the proposed strategy is constructed so that no signal to take a position in a pair is generated when experts do not recommend or encourage it, in order to avoid high relative cost and the risk of bankruptcy. The investment value is always 1 unit; this may be extended to different investment patterns and differing risk profiles of the selected pair [17].

Following the signal generation by both the crisp and the fuzzy strategies, the signals are collected in separate matrices, and short and long returns are calculated for both strategies and compared with and without trading costs.

4 Results

The algorithm was run using the stock price data from year 2012 for the training phase and 2013 for the trading phase, and results from the crisp and fuzzy strategies were obtained. All open positions were wound out on the last trading day of the year regardless of what the rule base recommended. Figures 7–11 show several randomly chosen trades in a year (252 trading days), with the generated signals, to give a better overview of both strategies in different conditions. As the figures demonstrate, the two strategies may differ in decision, timing, and length of the trade.

Fig. 7

Several ‘crisp’ and ‘fuzzy’ signals for buy and short sell orders, showing that the strategies, although similar, may differ in signal generation for opening and unwinding positions.

Fig. 8

Comparison of ‘crisp’ and ‘fuzzy’ signals, revealing that the crisp strategy yields more transactions in the same time frame and for the same pair, which leads to increased transaction cost.

Fig. 9

Another example, showing one extra ‘crisp’ signal for opening a reverse position and a pair of ‘crisp’ and ‘fuzzy’ signals opened and unwound simultaneously for the same pair.

Fig. 10

The fuzzy engine may generate early signals, which makes it possible to exploit a profit opportunity that the crisp rule base may miss.

Fig. 11

An additional signal generation and an early opening by the fuzzy engine for the same pair in the same time frame.

At the end of the trading year of 2013, the results of the two strategies are compared in terms of their financial profits; the main results are given in Table 4. Both strategies use the training data of 2012 and the distance method for signal generation, and the fuzzy strategy additionally uses expert analyst data as linguistic inputs.

Table 4

Performance of the classical method and the proposed fuzzy strategy, with numerical fuzzy inputs (spread and volatility, Panel A) and with expert inputs for each member of the pair (Panel B). Transaction cost: 0.005

	Crisp Strategy	Fuzzy Strategy	Difference
PANEL-A (fuzzified inputs: spread, volatility)
Profit	84.48	91.56	7.08 (8.38%)
Number of transactions	5626	8862	3236 (57.52%)
Profit following trans. cost	56.35	47.25	-9.10 (-16.15%)
PANEL-B (fuzzified inputs: spread, expert opinions)
Profit	84.48	92.63	8.15 (9.65%)
Number of transactions	5626	4780	-846 (-15.04%)
Profit following trans. cost	56.35	68.73	12.38 (21.97%)

Table 4 summarizes the returns of pairs trading strategies under the crisp and two different fuzzy algorithms for the trading period of year 2013 for all the co-integrated pairs of stocks in the Energy Sector of the NASDAQ index. The profits generated under all algorithms are nominal values since the trading strategy does not require any initial capital, as the funding for buying a stock comes from the short sales of its counterpart in the pair. Note that the short sales costs are ignored in this study.

In Panel A of Table 4, the inputs for the fuzzy engine are the historical spread and the volatility of the historical spread, both of which are originally quantitative variables. These numerical variables are converted to linguistic variables, and an improvement of 8.38% is observed in the profit generated. However, this improvement holds only under the assumption of no transaction costs. When a modest transaction cost is introduced, there is no improvement at all; in fact, the profit is cut by 16.15% compared to the crisp algorithm. This is simply due to the high number of transactions under the fuzzy method: it makes more profit by opening and closing many more positions. If the level of the transaction cost is higher, this may even create a bankruptcy risk for the trader. The transaction cost used in Table 4 is 0.005, or 0.5%. That is, for a position traded (opened or closed) that is worth 1 dollar, a cost of 0.005 dollars, or 0.5 cents, is incurred.
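The after-cost figures in Panel A of Table 4 are consistent with subtracting the cost per transaction, times the number of transactions, from the gross profit, with each transaction worth 1 unit. The Python sketch below reproduces that arithmetic; the function name is an assumption, and the inputs are taken from Table 4.

```python
def net_profit(gross_profit, n_transactions, cost_per_transaction=0.005):
    """Gross profit minus a fixed cost on each 1-unit transaction."""
    return gross_profit - n_transactions * cost_per_transaction


crisp = net_profit(84.48, 5626)    # crisp strategy
fuzzy_a = net_profit(91.56, 8862)  # fuzzy strategy, Panel A
print(round(crisp, 2), round(fuzzy_a, 2))  # 56.35 47.25

# Relative change of the fuzzy result against the crisp one, in percent.
print(round(100 * (fuzzy_a - crisp) / crisp, 2))  # -16.15
```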

In Panel B of Table 4, the fuzzy inputs to the engine are the historical spread and the expert opinions for each member of the pair. The historical spread is inherently a quantitative variable, whereas the expert opinion is a qualitative one. The contribution of the fuzzy algorithm with this combination of inputs is clearer. With no transaction cost, the profit improves on the crisp algorithm by 8.15 units, or 9.65%. This time, however, the improvement does not come from taking many more positions in the pairs. On the contrary, the number of transactions is 15.04% lower than under the crisp algorithm. This makes the fuzzy algorithm even more attractive once transaction costs are included, as the improvement is then 21.97%.

For a further robustness check, different groups of co-integrated pairs are compared. Table 5 presents the results. If only the top 50 most co-integrated pairs are considered, the improvement due to the fuzzy algorithm is very slight: 0.62 units, or about 2% of the crisp return. In the next 50 most co-integrated pairs the difference is much larger: 8.22 units, or about 24%. When all the co-integrated pairs are considered the difference is 9.65%, as seen in Table 4. This 9.65% difference can be regarded as the average percentage improvement among all pairs. Determining which group shows the greatest improvement requires further research.

Table 5

Returns for groups of pairs taken from the generated list of co-integrated pairs, sorted by the p-values of the co-integration test

Pair group	Crisp Return	Fuzzy Return	Difference
Top 50	30.00	30.62	0.62
51–100	34.05	42.27	8.22
All pairs	84.47	92.63	8.16
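
The Difference column, together with a relative improvement, can be recomputed from the returns in Table 5 with a short Python sketch. It assumes that relative improvement is measured as the difference divided by the crisp return; this base is an assumption, so the resulting percentages need not coincide exactly with those quoted in the text.

```python
# Returns copied from Table 5: (crisp return, fuzzy return).
table5 = {
    "Top 50": (30.00, 30.62),
    "51-100": (34.05, 42.27),
    "All pairs": (84.47, 92.63),
}


def improvement(crisp, fuzzy):
    """Return (absolute difference, relative improvement in % of crisp)."""
    diff = fuzzy - crisp
    return diff, 100.0 * diff / crisp


results = {group: improvement(c, f) for group, (c, f) in table5.items()}
for group, (diff, pct) in results.items():
    print(f"{group:<10} difference = {diff:5.2f}  relative = {pct:5.2f}%")
```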

To summarize, Table 4 Panel B and Table 5 show that the fuzzy algorithm consistently improves the performance of the crisp pairs trading strategy. When market trading costs are taken into account, which may differ from market to market and from one type of transaction to another, the fuzzy strategy also performs better, owing to the linguistic analyst inputs that make the fuzzy signal generation engine more hesitant to open positions in the proposed algorithm. The fuzzy engine yielded its profit with 15% fewer transactions, which suggests that the strategy is more robust than the crisp rule-based method. With the inclusion of trading costs, the difference between the crisp and the fuzzy strategies increased to almost 22%. Naturally, trading costs reduce the profitability of the strategy in either case. However, the reduction in profits is much more dramatic in the crisp case.

5 Conclusion

Using sample data from sixty-seven stocks in the NASDAQ energy sector, this study shows the improvement obtained by fuzzy logic techniques over the standard crisp pairs trading strategy, which is based on the co-integration approach for choosing the pairs to trade and the spread approach for opening and closing positions in the pair. The main contribution of this application comes from the additional input of expert opinions on the individual stocks. With the help of the fuzzy logic system, it is possible to convert this qualitative data into an indicator, a fuzzy variable, and feed it into the strategy. On the other hand, it is well documented in the literature that pairs trading returns are quite sensitive to market volatility and transaction costs. Pairs trading strategies tend to work better when the market is highly volatile, and the returns diminish significantly when transaction costs are included.

The time period chosen in this study is between 2012 and 2013, which is a remarkably steady time span for the financial markets. The key contribution of this study is the improvement observed even in such a period. In addition, the difference between crisp pairs trading and fuzzy pairs trading becomes even more visible when transaction costs are included. This can be interpreted as fuzzy pairs trading significantly cutting down the number of unnecessary positions opened. In Section 4, where the impact of transaction costs is analyzed, the number of transactions is compared along with the returns, and the difference is quite noticeable.

For future studies, we recommend considering several market-related and linguistic inputs through more sophisticated fuzzy models. Buy and sell signals may be generated as strong or weak buy and sell signals in order to determine different investment structures. As co-integration does not necessarily require the stocks of a pair to belong to the same sector or even the same market, several sectors and markets may be analyzed. Furthermore, we conclude that fuzzy logic holds great potential for use in financial markets, where human actors are involved, as it allows the human decision-making process to be imitated.

References

1. Kablan and Ng, High frequency trading using fuzzy momentum analysis, Proc World Congr, 2010.

2. Kablan, Adaptive neuro fuzzy inference systems for high frequency financial trading and forecasting, 3rd Int Conf Adv Eng Comput Appl Sci ADVCOMP 2009, 2009, pp. 105–110.

3. Tourin and Yan, Dynamic pairs trading using the stochastic control approach, J Econ Dyn Control 37(10) (2013), 1972–1981.

4. Do and Faff, Are pairs trading profits robust to trading costs? J Financ Res 35(2) (2012), 261–287.

5. Kahraman and Kaya İ., Investment analyses using fuzzy probability concept / investicijų analizė taikant tikimybinę neapibrėžtųjų aibių koncepciją, Technol Econ Dev Econ 16(1) (2010), 43–57.

6. Krauss, Statistical arbitrage pairs trading strategies: Review and outlook, J Econ Surv 31(2) (2017), 513–545.

7. Zapart, Statistical arbitrage trading with wavelets and artificial neural networks, 2003 IEEE Int Conf Comput Intell Financ Eng 2003 Proceedings, 2003.

8. Gatev, Goetzmann W.N. and Rouwenhorst K.G., Pairs trading: Performance of a relative-value arbitrage rule, Rev Financ Stud 19(3) (2006), 797–827.

9. Tiryaki and Ahlatcioglu, Fuzzy stock selection using a new fuzzy ranking and weighting algorithm, Appl Math Comput 170(1) (2005), 144–157.

10. Vidyamurthy, Pairs trading, Quantitative Methods and Analysis, Wiley Finance, 2004.

11. Primbs J.A. and Yamada, Pairs trading under transaction costs using model predictive control, Quant Financ 18(6) (2018), 885–895.

12. Cao, Zhang, Luo and Luo, Stock Data Mining through Fuzzy Genetic Algorithms, in Joint Conference on Information Sciences, 2006.

13. Triantafyllopoulos and Montana, Dynamic modeling of mean-reverting spreads for statistical arbitrage, Comput Manag Sci 8 (2011), 23–49.

14. Cao, Luo, Ni, Luo and Zhang, Stock Data Mining through Fuzzy Genetic Algorithm, in Proceedings of the 2006 Joint Conference on Information Sciences (JCIS 2006), 2006.

15. Bayram and Akat, Statistical Arbitrage with Fuzzy Logic, in International Fuzzy Systems Symposium, 2015.

16. Clegg and Krauss, Pairs trading with partial cointegration, Quant Financ 18(1) (2018), 121–138.

17. Gradojevic and Gençay, Fuzzy logic, trading uncertainty and technical trading, J Bank Financ 37(2) (2013), 578–586.

18. Huck, Pairs trading: Does volatility timing matter? Appl Econ 47(57) (2015), 6239–6256.

19. Huck, The high sensitivity of pairs trading returns, Appl Econ Lett 20(14) (2013), 1301–1304.

20. Engle R.F. and Granger C.W.J., Co-integration and error correction: Representation, estimation, and testing, Econometrica 55(2) (1987), 251–276.

21. Faff and Do, Does simple pairs trading still work? Financ Anal J 66(4) (2010), 83–95.

22. Fernholz and Maguire, The statistics of statistical arbitrage logarithmic return of portfolios, 63(5) (2014), 46–52.

23. Elliott R.J., van der Hoek and Malcolm W.P., Pairs trading, Quant Financ 5(3) (2005), 271–276.

24. Zeng and Lee C.-G., Pairs trading: Optimal thresholds and profitability, Quant Financ 14(11) (2014), 1881–1893.