An efficient isomorphic CNN-based prediction and decision framework for financial time series

Abstract

Financial time series prediction and trading decision-making are priorities of computational intelligence for researchers in academia and the finance industry due to their broad application areas and substantial impact. However, these tasks remain challenging because financial time series exhibit various complex statistical properties, and the mechanisms behind the underlying processes are largely unknown. A significant number of machine-learning-based methods, especially deep-learning-based models, have been proposed and have demonstrated impressive results. Nevertheless, due to the high complexity of massive, nonlinear, and nonindependent data and the difficulty and time cost of training complicated deep learning models, the performance of online trading decisions is still inadequate for practical application. This paper proposes the Integrated Framework of Forecasting Based Online Trading Strategy (IFF-BOTS) to deliver better prediction performance and dynamic decisions for real-world online trading systems. Our method adopts a novel isomorphic convolutional neural network (CNN)-based forecaster-classifier-executor architecture to exploit CNN-based integrated price and trend prediction and direct-reinforcement-learning-based trading decision-making. IFF-BOTS can also achieve better real-time performance for online trading. We empirically compare the proposed approach with state-of-the-art prediction and trading methods on real-world S&P and DJI datasets. The results show that IFF-BOTS outperforms its competitors in prediction metrics, trading profits, and real-time performance.

Keywords

Time series analysis, algorithmic trading, deep learning, reinforcement learning, convolutional neural networks

1. Introduction

Financial time series analysis and associated applications have been studied extensively for many years. Among them, algorithmic trading, i.e., quantitative trading, is a time-honored topic widely discussed in modern artificial intelligence [22] and time series analysis. Specifically, algorithmic trading refers to executing trading orders using automated preprogrammed trading instructions accounting for variables such as time, price, and volume to generate profits at high speed and frequency. As shown in Fig. 1, we first get market data for signal research and perform backtesting. Then, with signals and their aggregation, we can establish position and risk models and finally provide execution logic based on various decision-making methods. Algorithmic trading has many advantages over human traders, e.g., execution at the best possible prices, avoidance of significant price changes, and reduced risk of manual errors or of human traders’ emotional and psychological factors. Essentially, the process of trading is well depicted as an online continuous decision-making problem involving two critical challenges: financial market representation and optimal trading decision-making. First, financial market representation is usually considered one of the most challenging issues among time series analyses due to its noise and volatile features. Financial data contain a significant amount of noise and jumps, resulting in nonstationary, high-noise, nonlinear and chaotic time series data [7]. To mitigate such complexity, handcrafted features, e.g., the relative strength index (RSI) and other stochastic technical indicators, have been extensively explored for technical analysis in quantitative finance [20]. However, a widely known limitation of technical analysis is its poor generalization ability. Rather than exploiting predefined handcrafted features, can we learn more robust feature representations directly from data?
Second, due to the high dynamics of the trading process, trading decision-making is systematic work that should take several practical factors into account. Changing trading positions too frequently will only lead to significant losses due to transaction costs and slippage. Effective modeling of current market conditions and combining historical actions and the corresponding positions are crucial for trading strategy and policy learning. However, although various empirically universal properties are observed, the mechanism behind financial market dynamics is unrevealed to a large extent, which leads to modeling suffering from nearly impossible complexity. How can we incorporate such optimal dynamic decision-making into the real-world trading system without affecting real-time performance? As a result, algorithmic trading has become much more challenging and has attracted significant attention in recent decades. Moreover, conventional methods, including stochastic processes and model-based approaches, are inadequate to capture the complex temporal dependencies of financial markets or make decent decisions in a timely manner.

Figure 1.

General model of algorithmic trading system.

Recently, with the rapid development and impressive achievements in machine learning, leveraging machine learning approaches in algorithmic trading has become a highly researched topic. Although time series forecasting has been an active topic for several decades, financial time series prediction remains quite challenging due to its complexity. Therefore, numerous studies have been published on machine-learning-based models with relatively better performances than classical techniques. Specifically, in recent years, deep learning has demonstrated a strong ability to learn feature representations of complex data such as high-dimensional nonlinear data, temporal data, spatial data, and graph data, and has gained great success in various applications. Financial time series forecasting is no exception. As such, an increasing number of prediction models based on multiple deep-learning-based approaches, especially LSTM-based approaches, have been introduced and have achieved state-of-the-art performance in recent years. On the other hand, in contrast to conventional machine learning methods, reinforcement learning learns behavior: it deals with what actions a subject should take to achieve the highest reward in an environment. Such characteristics make reinforcement learning suitable for trading, especially when we combine reinforcement learning and deep learning, i.e., deep reinforcement learning. Deep reinforcement learning demonstrates significant advantages and many achievements in various aspects. In practical applications, the successes of deep reinforcement learning have been shown extensively in many tasks, including robot navigation and helicopter control. The application of deep learning and reinforcement learning in algorithmic trading has achieved a good number of impressive results.

However, we still face difficulties applying reinforcement learning in highly dynamic and nonindependent financial markets. First, forecasting performance is low for noisy, nonstationary, and nonlinear financial time series. Second, convergence issues and mode collapse occur frequently during model training. Last, neural network training is time-consuming, which may not be acceptable for real-time online trading. This leads to an interesting question in the context of the financial market: how can we leverage both deep learning and reinforcement learning to design an efficient trading approach for financial markets?

Inspired by deep learning’s strong capability of financial market feature representation and reinforcement learning’s significant achievement in decision-making, we are motivated to propose the Integrated Framework of Forecasting Based Online Trading Strategy (IFF-BOTS). IFF-BOTS can satisfy better prediction performance and dynamic decisions for real-world online trading systems. Our method adopts a novel isomorphic convolutional neural network (CNN) based forecaster-classifier-executor architecture, which can empower temporal feature capture and make the best use of the strong pattern classification capability of convolutional neural networks. Furthermore, we adopt an integrated prediction indicator to leverage forecasting and corresponding classification precision. With such quantitative prediction, we define an improved trading strategy based on an effective combination of direct reinforcement learning and a predictive heuristic strategy. Unlike most existing methods that mainly adopt deep neural networks for feature extraction and representation of financial time series, the proposed forecaster-classifier-executor architecture can unify prediction together with trading decisions. The main contributions of the proposed IFF-BOTS are as follows:

• We propose a forecaster-classifier-executor framework that exploits deep-learning-based financial time series prediction and reinforcement-learning-based decision-making to achieve a better Sharpe Ratio and real-time performance.

• We capture the regularities of financial time series data and perform forecasting with a computationally efficient isomorphic CNN-based architecture and a novel integrated prediction indicator that makes the best use of forecasting and uncertainty for the trading strategy.

• We present an improved direct-reinforcement-learning-based strategy that benefits from deep feature extraction and a prediction-based heuristic strategy.

• We empirically evaluate the proposed framework on two real-world stock datasets with widely used measurements: Total Profits and Sharpe Ratio.

The rest of this paper is organized as follows. Section 2 presents a brief review of the related works. Section 3 introduces the proposed IFF-BOTS architecture and trading strategy. In Section 4, we introduce the experimental settings and compare the results of the proposed IFF-BOTS with those of three state-of-the-art methods on the test datasets. Finally, Section 5 summarizes the whole paper and suggests possible future work.

2. Related work

We briefly review existing work on deep-learning-based and reinforcement-learning-based financial time series forecasting and trading strategies. More comprehensive literature reviews can be found in recent surveys [24, 26].

2.1 Deep-learning-based price prediction

Price prediction of any given stock is the most studied financial application. We observed the same trend within deep learning implementations. Depending on the prediction time horizon, different input parameters are chosen, varying from high-frequency trading (HFT) and intraday price movements to daily, weekly, or even monthly stock close prices. Additionally, technical, fundamental analysis, and social media feeds or sentiments are among the different parameters used for the prediction models. In [8], deep neural networks and lagged stock returns were used to predict the Korea Composite Stock Price Index (KOSPI). [6] applied raw price data as the input to LSTM models. Meanwhile, some studies implement multiple deep learning models for performance comparison using raw price (OCHLV) data for forecasting. Among the noteworthy studies, [12, 25] compared RNN, stacked RNN, MLP, LSTM, CNN, GRU, and ARIMA. [5] used cooperative neuro-evolution, RNN (Elman network), and DFNN to predict stock prices in NASDAQ. [16] applied CNN $+$ LSTM. [28] proposed the novel deep and wide neural network (DWNN), a combination of RNN and CNN. [29] implemented a state frequency memory (SFM) recurrent network. In [1], a deep neural network and 25 fundamental features were used for the prediction of Japan Index constituents. [15] implemented LSTM with transfer learning using text mining through financial news and stock market data. Similarly, [17] used LSTM to predict the next-day stock price using corporate action events and a macroeconomic index. [3] used textual information and stock prices through a paragraph vector and LSTM for forecasting prices, and comparisons were provided with different classifiers.

There were also multiple hybrid models that used primarily technical analysis features as their inputs to the deep learning model. Recently, [14] used market microstructure-based trade indicators as inputs into an RNN with Graves LSTM detecting the buy-sell pressure of movements in the Istanbul Stock Exchange Index (BIST) to perform price prediction for intelligent stock trading. Meanwhile, some papers prefer CNN models. [2] used 250 features, including order details, to predict a private brokerage company’s real data on risky transactions. They used CNN and LSTM for stock price forecasting.

2.2 Reinforcement learning-based trading decision-making

The nature of trading requires counting the profits in an online manner. Not all reinforcement learning methods are ideal for such online decision-making. While value-function-based RL methods are plausible for offline scheduler problems, the actor-based framework is more suitable for dynamic online trading [19] due to two advantages: i) flexible objectives for optimization and ii) continuous descriptions of market conditions. [18] proposed the direct reinforcement learning trading model based on recurrent reinforcement learning, which does not shed light on the side of feature learning. Robust feature representation is vital to machine learning performances. In the context of stock data learning, various feature representation strategies have been proposed from multiple views [4, 10]. Failure to extract robust features may adversely affect the performance of a trading system in handling market data with high uncertainty. With demonstrated strong feature representation capability, deep learning is adopted for feature extraction combined with reinforcement learning for online trading. [9, 21] exploited deep reinforcement learning approaches in stock and commodity future markets using RNN and LSTM for feature engineering. [13] introduced CNN, RNN, and LSTM to improve the policy gradient function of actor-critic reinforcement learning for cryptocurrency trading.

In summary, there are impressive achievements in deep-learning-based or reinforcement-learning-based financial time series prediction and trading approaches. However, there are still three difficulties: i) the prediction performance still cannot satisfy real-world applications, especially for predictive trading decision making; ii) the real-time performance suffers from multiple problems during neural network training, such as failure or slowness to converge and mode collapse; and iii) forecasting and dynamic decision-making are not well harmonized together.

3. Integrated framework of forecasting-based online trading strategy

3.1 Problem formulation

3.1.1 CNN-based time series prediction

Consider a one-dimensional time series $x_{n}=(x_{0},x_{1},\cdots,x_{n-1})$. Given a model with parameter values $\theta$, the task for a predictor is to output the next value $x_{t+1}$ conditional on the historical series $x_{0},x_{1},\cdots,x_{t}$. This can be done by maximizing the likelihood function

$\displaystyle p(x\mid\theta)=\prod_{t=0}^{N-1}p(x(t+1)\mid x(0),\ldots,x(t),\theta).$ (1)

The CNN [11] is a deep neural network consisting of convolutional layers based on the convolutional operation. It is the most common model used for vision and image processing-based classification problems such as image classification, object detection, and image segmentation. The advantage of CNN compared to conventional deep learning models is its strong pattern recognition capability. Furthermore, filtering with a kernel window lets CNN architectures process data with fewer parameters, which is beneficial for computing and storage. Typical CNN architectures contain different layers: convolutional, max-pooling, dropout, and fully connected layers. The convolution operation can be formulated as:

$\displaystyle s(t)=(x*w)(t)=\sum_{a=-\infty}^{\infty}x(a)w(t-a)$ (2)

where $t$ denotes time, $s$ denotes the feature map, $w$ denotes the kernel, $x$ denotes the input, and $a$ denotes the summation variable.
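As a minimal sketch of Eq. (2) for finite sequences, the following Python function computes the full discrete convolution of an input with a kernel. The function name `conv1d_full` and the sample values are illustrative assumptions, and samples outside the given sequences are assumed to be zero.

```python
def conv1d_full(x, w):
    """Full discrete convolution s(t) = sum_a x(a) * w(t - a) for finite lists.

    Assumption: values outside the index range of x or w are treated as zero,
    so the infinite sum in Eq. (2) reduces to a finite one.
    """
    n, m = len(x), len(w)
    s = [0.0] * (n + m - 1)
    for t in range(n + m - 1):
        for a in range(n):
            k = t - a  # index into the kernel, i.e. w(t - a)
            if 0 <= k < m:
                s[t] += x[a] * w[k]
    return s


# A length-2 averaging-style kernel applied to a short series.
print(conv1d_full([1, 2, 3], [1, 1]))  # [1.0, 3.0, 5.0, 3.0]
```

Note that many deep learning libraries implement the convolutional layer as a cross-correlation, i.e., without flipping the kernel; since the kernel is learned, the two formulations are interchangeable in practice.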

$\displaystyle Z_{i}=\sum_{j}W_{i,j}x_{j}+b_{i}$ (3)

$\displaystyle y=\operatorname{softmax}(Z)$ (4)

$\displaystyle\operatorname{softmax}(Z_{i})=\frac{\exp(Z_{i})}{\sum_{j}\exp(Z_{j})}$ (5)

where $W$ denotes the weights, $x$ denotes the input, $b$ denotes the bias, and $Z$ denotes the output of the neurons. At the end of the network, the softmax function is used to obtain the output. Equations (4) and (5) illustrate the softmax function, where $y$ denotes the output.
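The affine layer and softmax output of Eqs. (3) to (5) can be sketched in plain Python as follows. The helper names `dense` and `softmax` are illustrative assumptions, as is the use of one bias per output neuron; subtracting the maximum before exponentiation is a common numerical-stability step that does not change the result.

```python
import math


def dense(W, x, b):
    """Affine layer: Z_i = sum_j W[i][j] * x[j] + b[i] (one bias per output neuron)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def softmax(z):
    """softmax(Z_i) = exp(Z_i) / sum_j exp(Z_j), computed in a stable way."""
    z_max = max(z)  # shifting by the maximum avoids overflow in exp
    exps = [math.exp(v - z_max) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Three output neurons, e.g. one per trend class (hypothetical weights).
Z = dense([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [2.0, 2.0], [0.0, 0.0, 0.0])
y = softmax(Z)
print(y)  # three equal probabilities, since all Z_i coincide
```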

The backpropagation process is used for CNN model learning. The most commonly used optimizers, e.g., SGD and RMSProp, are used to find the optimal CNN parameters. The hyperparameters of CNN are similar to the hyperparameters of other deep learning models. CNN-based time series prediction can be implemented as shown in Fig. 2.

3.1.2 Reinforcement-learning-based trading decision

Reinforcement learning (RL) [22] is a type of learning that differs from supervised and unsupervised learning models. It does not require a preliminary dataset that has been labeled or clustered before. There are different areas in which it is used: game theory, control theory, multiagent systems, operations research, robotics, information theory, investment portfolio management, simulation-based optimization, playing Atari games, and statistics. This learning method mimics the basics of how humans learn.

Figure 2.

CNN-based time series prediction.

Figure 3.

Discrete-time Markov decision process.

As shown in Fig. 3, reinforcement learning is mainly based on a Markov decision process (MDP), which is used to formalize the RL environment. The objective is to choose a policy (a sequence of actions) to maximize the cumulative value function. An MDP is a five-tuple consisting of the state (a finite set of states), the action (a finite set of actions), the reward function (a scalar feedback signal), the state transition probability matrix $p(s^{\prime},r\mid s,a)$, where $s^{\prime}$ denotes the next state, $r$ denotes the reward, $s$ denotes the state, and $a$ denotes the action, and the discount factor $\gamma$ (the present value of future rewards). The return $G_{t}$ is the total discounted reward. Equation (6) illustrates the total return, where $t$ denotes time and $k$ denotes the summation index over future time steps.

$\displaystyle G_{t}=r_{t+1}+\gamma r_{t+2}+\gamma^{2}r_{t+3}+\cdots=\sum_{k=0}^{\infty}\gamma^{k}r_{t+k+1}$ (6)
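For a finite episode, the discounted return of Eq. (6) can be computed with a simple backward recursion. The sketch below assumes the list `rewards` holds $r_{t+1}, r_{t+2}, \ldots$ up to the end of the episode; the function name is illustrative.

```python
def discounted_return(rewards, gamma):
    """G_t = sum_k gamma**k * r_{t+k+1} for a finite list of future rewards.

    rewards[0] is r_{t+1}, rewards[1] is r_{t+2}, and so on (assumed layout).
    """
    g = 0.0
    # Work backwards using G_t = r_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


print(discounted_return([1.0, 1.0, 1.0], 0.5))  # 1 + 0.5 + 0.25 = 1.75
```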

Based on the above classic model, we formulate the trading decision in Moody’s direct reinforcement learning trading model [18]. We define $\{p_{1},p_{2},\ldots,p_{t}\}$ as the price sequences released from the exchange center. Then, the return at time point $t$ is easily determined by $z_{t}=p_{t}-{p_{t-1}}$ . Based on the current market conditions, the real-time trading decision (policy) $\delta_{t}\in\{\textit{long},\textit{neutral},\textit{short}\}=\{1,0,-1\}$ is made at each time point $t$ . With the symbols defined above, the profit $R_{t}$ made by the trading model is obtained by

$\displaystyle R_{t}=\delta_{t-1}z_{t}-c|\delta_{t}-\delta_{t-1}|$ (7)

where the first term is the profit/loss made from the market fluctuations and the second term is the transaction cost $c$ incurred when flipping trading positions at time point $t$. This mandatory fee is paid to the brokerage company only if $\delta_{t}\neq\delta_{t-1}$; when two consecutive trading decisions are the same, no transaction cost is applied. The function in Eq. (7) is the value function defined in the typical direct reinforcement learning framework. After obtaining the value function at each time point, we represent the accumulated value throughout the whole training period as:
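To make Eq. (7) concrete, the following sketch computes the per-step profit $R_{t}$ from a price sequence and a sequence of decisions. The function name, the alignment of the two lists, and the interpretation of $c$ as an absolute cost in price units are assumptions made for illustration.

```python
def step_profits(prices, decisions, c):
    """R_t = delta_{t-1} * z_t - c * |delta_t - delta_{t-1}|, with z_t = p_t - p_{t-1}.

    prices[t] and decisions[t] refer to the same time point (assumed alignment).
    Returns the list [R_1, ..., R_{n-1}] for n price points.
    """
    if len(prices) != len(decisions):
        raise ValueError("prices and decisions must have the same length")
    profits = []
    for t in range(1, len(prices)):
        z_t = prices[t] - prices[t - 1]                   # market return at time t
        cost = c * abs(decisions[t] - decisions[t - 1])   # zero if the position is unchanged
        profits.append(decisions[t - 1] * z_t - cost)
    return profits


# Long, long, then flip to short (a change of 2 positions costs 2c).
print(step_profits([100, 102, 101, 105], [1, 1, -1, -1], 0.5))  # [2.0, -2.0, -4.0]
```

Summing the returned list gives the Total Profits objective discussed next.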

$\displaystyle\max_{\Theta}U_{T}\{R_{1},\ldots,R_{T}\mid\Theta\}$ (8)

where $U_{T}\{\cdot\}$ is the accumulated reward in the period $\{1,\cdots,T\}$. Intuitively, the most straightforward reward is the Total Profits (TP) made in period $T$, i.e., $U_{T}=\sum_{t=1}^{T}R_{t}$. Other complicated reward functions, e.g., the risk-adjusted returns, can also be used here as the objective function. Accordingly, a major contribution of direct reinforcement learning frameworks is introducing a reasonable strategy to learn the trading policy directly. In detail, a nonlinear function is adopted in direct reinforcement learning to approximate the trading action (policy) at each time point by:

$\displaystyle\delta_{t}=\tanh[\langle\bm{W},f_{t}\rangle+b+u\delta_{t-1}]$ (9)

where $\langle\cdot,\cdot\rangle$ is the inner product, $f_{t}$ defines the feature vector of the current market condition at time $t$, and $\{\textbf{W},u,b\}$ are the coefficients for the feature regression. The optimization of direct reinforcement learning aims to learn such a family of parameter sets $\varTheta=\{\textbf{W},u,b\}$ to maximize the global reward in Eq. (8). Based on deep-learning-based prediction and reinforcement-learning-based trading decision methods, the three difficulties mentioned earlier turn into three key problems: i) how to exploit a prediction-decision framework to combine the advantages of the two machine learning models; ii) how to achieve precise and computationally efficient predictions to assure online trading decisions; and iii) how to define an effective trading strategy based on the prediction results, especially in highly dynamic scenarios. To perform effective algorithmic trading in real-world trading systems, we need to answer these questions reasonably.
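The recurrent policy of Eq. (9) can be sketched as a forward pass over a sequence of feature vectors. The function name, the initial action `delta_init`, and the parameter values in the example are hypothetical; the sketch covers only the forward evaluation, not the optimization of the parameters against Eq. (8).

```python
import math


def drl_policy(features, weights, b, u, delta_init=0.0):
    """delta_t = tanh(<W, f_t> + b + u * delta_{t-1}) over a sequence of feature vectors.

    delta_init is the action assumed before the first time step (an assumption).
    """
    delta_prev = delta_init
    deltas = []
    for f_t in features:
        inner = sum(w * f for w, f in zip(weights, f_t))  # inner product <W, f_t>
        delta_prev = math.tanh(inner + b + u * delta_prev)
        deltas.append(delta_prev)
    return deltas


# Two time steps with hypothetical two-dimensional market features.
print(drl_policy([[1.0, -1.0], [0.5, 0.5]], [0.3, 0.2], b=0.1, u=0.5))
```

The tanh output lies in $(-1, 1)$; how it is mapped to the discrete positions $\{1,0,-1\}$ is not specified in this section and is therefore left out of the sketch.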

3.2 Proposed framework: IFF-BOTS

To address the first and most fundamental problem, i.e., how to exploit a prediction-decision framework to combine the advantages of the two machine learning models, we first need to examine the gaps between deep-learning-based prediction and reinforcement-learning-based trading decisions. In brief, there are two main issues to be addressed: i) as with most existing deep reinforcement learning methods for algorithmic trading, deep learning is adopted mainly for feature extraction instead of a harmonized prediction-decision methodology. Similarly, reinforcement learning is leveraged in decision-making, standalone from prediction; ii) dynamic trading decisions are dependent on critical information available at the time, e.g., predicted prices, risks, trends, or sentiments. However, the performance of predictions is not static, and it varies greatly due to the dynamics and complexities. Therefore, how can we make the best use of such uncertain predictions and take optimal trading actions?

Figure 4.

Architecture of IFF-BOTS framework.

As shown in Fig. 4, on the basis of CNN-based time series prediction and direct reinforcement learning-based dynamic decision models, we propose an integrated framework of forecasting-based online trading strategy (IFF-BOTS) to exploit and unify both models into a single framework for algorithmic trading in real-world online trading systems, which automatically provides continuous trading decisions without the involvement of domain experts. Specifically, the framework contains three main components: the forecaster (F), the classifier (C), and the executor (E). The left part contains the isomorphic CNN-based forecaster F and classifier C, which share all neural layers except for the last layer. Convolutional neural networks can capture temporal dependencies and extract feature representations from the input financial time series. Based on different neural layers and activation functions, forecaster F and classifier C can achieve direct price forecasting and multiclassification of the trends with corresponding likelihood. The trends obtained from multiclassifier C align with trading decisions, i.e., $\{\textit{long},\textit{neutral},\textit{short}\}$.

Furthermore, the extracted feature representations can also contribute to executor E as hidden features for conventional reinforcement-learning-based trading decision-making methods. Such integrated prediction and feature extraction can offer more details to help executor E make better trade decisions. The integrated prediction indicator $\varPhi$ is measured by the difference between forecasted price and current price from Forecaster F, the type of trend, and its corresponding likelihood from classifier C. On the one hand, the more consistent the forecasted price and the trend, the more accurate the prediction is. On the other hand, the more significant the likelihood of the trend, the better the prediction. Thus, it is much easier to make decent, or even sometimes, straightforward decisions with better predictions. The integrated prediction indicator $\varPhi$ is defined as follows:

$\displaystyle\Phi=\left\{\begin{array}{ll}\lambda\times\hat{t}&\text{if }(\hat{y}>0\text{ and }\hat{t}=1)\text{ or }(\hat{y}<0\text{ and }\hat{t}=-1)\\ \lambda\times\hat{t}+(1-\lambda)\times\hat{y}&\text{if }(\hat{y}>0\text{ and }\hat{t}=-1)\text{ or }(\hat{y}<0\text{ and }\hat{t}=1)\\ (1-\lambda)\times\hat{y}&\text{if }\hat{t}=0\end{array}\right.$ (10)

where $\hat{y}\in\mathbb{R}$ denotes the difference between the forecasted price and the current price, and $\hat{t}\in\{1,0,-1\}$ denotes the classified trend, i.e., $\{\textit{long},\textit{neutral},\textit{short}\}$.
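The case distinction in Eq. (10) can be sketched as a small function. Two points are assumptions rather than statements from the text: $\lambda$ is taken to be a weight in $[0,1]$, and the case $\hat{y}=0$ with $\hat{t}\neq 0$, which Eq. (10) does not cover, is handled by the mixed formula of the second branch.

```python
def integrated_indicator(y_hat, t_hat, lam):
    """Integrated prediction indicator Phi following the three cases of Eq. (10).

    y_hat: forecasted price minus current price.
    t_hat: classified trend, 1 (long), 0 (neutral) or -1 (short).
    lam:   weighting parameter lambda (assumed to lie in [0, 1]).
    """
    if t_hat not in (1, 0, -1):
        raise ValueError("t_hat must be 1, 0 or -1")
    if t_hat == 0:
        return (1 - lam) * y_hat
    if y_hat * t_hat > 0:
        # Forecast and trend agree in direction: use the trend term only.
        return lam * t_hat
    # Forecast and trend disagree (or y_hat == 0, an assumption not covered by Eq. (10)).
    return lam * t_hat + (1 - lam) * y_hat


print(integrated_indicator(2.0, 1, 0.5))   # consistent: 0.5
print(integrated_indicator(-2.0, 1, 0.5))  # inconsistent: 0.5 - 1.0 = -0.5
print(integrated_indicator(3.0, 0, 0.5))   # neutral trend: 1.5
```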

However, the decision-making mechanism should tackle inaccurate predictions or even uncertainties, i.e., predictions with poor likelihood. Trade decision-making is a discrete-time Markov decision process, and reinforcement learning demonstrates impressive state-of-the-art results for such continuous trading decision scenarios. Therefore, as shown in the right part, the IFF-BOTS presents an improved direct-reinforcement-learning-based trading strategy. Unlike conventional direct reinforcement learning, the proposed trading strategy leverages integrated predictions and deep feature learning to combine prediction-based heuristic trading and the dynamics of direct reinforcement learning. Therefore, the trading decision function $\mathcal{D}$ would be appropriately defined by a weighted combination of the prediction-based heuristic trading from forecaster F and classifier C and the dynamic decision from direct-reinforcement-learning-based executor E with deep features, which can be formulated as Eq. (11):

$\displaystyle\mathcal{D}=(1-p(\hat{t}))\cdot\mathcal{D}_{s}^{t}(f)+p(\hat{t})\cdot\mathcal{H}_{s}^{t}(\Phi)$ (11)

where $\mathcal{H}_{s}^{t}(\cdot)$ denotes the heuristic strategy for state $s$ and time $t$ based on integrated prediction indicator $\varPhi$ , $\mathcal{D}_{s}^{t}(\cdot)$ denotes trading action from direct reinforcement learning for state $s$ and time $t$ with extracted feature representation $f$ , and $p(\hat{t})\in[0,1]$ denotes the likelihood of trend $\hat{t}$ in Eq. (10). $p(\hat{t})$ is available from the softmax function of classifier C.
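As a brief illustration of Eq. (11), the sketch below blends the two actions with the trend likelihood as the weight. The function and argument names are hypothetical, and both actions are assumed to be already computed as numbers in $[-1,1]$.

```python
def combined_decision(p_trend, drl_action, heuristic_action):
    """D = (1 - p) * D_rl + p * H, where p is the likelihood of the classified trend.

    drl_action:       action suggested by the direct-reinforcement-learning executor.
    heuristic_action: action suggested by the prediction-based heuristic strategy.
    """
    if not 0.0 <= p_trend <= 1.0:
        raise ValueError("p_trend must be a probability in [0, 1]")
    return (1.0 - p_trend) * drl_action + p_trend * heuristic_action


# A confident trend (p = 0.75) pulls the decision towards the heuristic action.
print(combined_decision(0.75, -1.0, 1.0))  # 0.5
```

With a low likelihood the weight shifts towards the reinforcement-learning action, which matches the intent of relying on the executor when the prediction is uncertain.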

However, considering that trading decisions should be made online for real-world trading systems, we must take real-time performance into account, which is the second key problem to be handled.

3.3 Isomorphic CNN-based forecaster and classifier structure

The second problem is how to achieve precise and computationally efficient predictions to assure online trading decisions. Although LSTM is adopted for time series prediction in most situations, here we must address real-time performance seriously because it is critical for actual trading. Therefore, we adopt an isomorphic CNN-based forecaster-classifier-executor structure, as shown in Fig. 4, which leads to i) effective calculation of the forecasted price, the type of trend, and the corresponding likelihood, which can be exploited to refine the prediction: according to Eq. (10), we compute the integrated prediction indicator $\varPhi$ to provide a comprehensive understanding of the prediction and subsequently make better trading decisions; ii) computational efficiency due to CNN’s compressed feature representation and training advantage over LSTM; and iii) simultaneous training of forecaster F, classifier C, and the data regularity learning of executor E because of the isomorphic structure. In brief, the designed isomorphic neural network structure is capable of price prediction, trend prediction with probabilistic metrics, and deep feature learning. Moreover, it reduces the computational complexity of neural network training and prediction for online trading.

3.4 Dynamic trading decision

To reasonably address the third key problem, i.e., how to define an effective trading strategy based on the prediction results, especially in highly dynamic scenarios, we improve classic direct reinforcement learning by introducing the prediction-based heuristic trading strategy. Most existing approaches take advantage of reinforcement-learning-based continuous decision-making to address the complicated dynamics of realistic financial markets. Moreover, some of them also adopt deep learning for financial market feature extraction. However, the proposed framework leverages both deep-learning-based prediction and deep feature representation instead of only feature representation and adopts more proactive trading decisions, i.e., predictive decisions. On the one hand, IFF-BOTS better uses CNN’s demonstrated strong capability of nonlinear feature learning and classification for time series prediction, which is more advantageous than feature engineering for further trading execution. On the other hand, the predictive decision would contribute to a better decision for a highly dynamic financial market due to a combination of prediction and the Markov decision process. Given a decent forecast indicated by an integrated prediction indicator, a heuristic strategy could be easily selected. However, in regard to an uncertain prediction measured by the integrated prediction indicator $\varPhi$ , an improved direct-reinforcement-learning-based method can suggest the optimal action based on Eq. (11) because it is continuously trained and has more informative predictive inputs. This improved hybrid mechanism for trading strategy, as shown in the right part of Fig. 4, can achieve better robustness and decision performance.

4. Evaluation

4.1 Experimental settings

4.1.1 Datasets

To verify the proposed algorithmic trading method, we conduct experiments on real-world financial data:¹ Standard & Poor’s 500 Index (S&P) and Dow Jones Industrial Average Index (DJI). For S&P, we select a period from 2001-01-02 to 2018-12-26, approximately eighteen years. As shown in Fig. 5a, there are significant jumps and minor fluctuations in both training (blue) and testing (red) periods, and we perform one-day ahead forecasting. Regarding DJI, to differentiate it from S&P, we choose another time window with a different shape, from 1990-01-02 until 2008-08-31, approximately eighteen years covering the global financial crisis of 2008, which is clearly visible in the testing period (red) in Fig. 5b. Overall, these two real-world datasets contain various fluctuations over long periods, including the global financial crisis. Meanwhile, their trends and shapes are noticeably different from each other, which allows us to compare both the performance and the robustness of the proposed method.

¹ https://finance.yahoo.com/.

Figure 5.

Datasets of S&P and DJI.

Table 1

General information of two real-world datasets

Dataset	Start	End	Training start	Training end	Test start	Test end	Transaction cost (TC)
S&P	2000-01-02	2018-12-26	2001-01-02	2017-05-17	2017-05-18	2018-12-26	0.1%
DJI	1990-01-02	2008-08-31	1990-01-02	2008-06-01	2007-06-04	2008-08-31	0.1%

4.1.2 Evaluation measures

The main goal of algorithmic trading is to gain more profits and take fewer risks. Thus, the most popular metrics, Total Profits and Sharpe Ratio, are adopted for comparison. Total Profits (TP) gained by trading is defined as follows:

$\displaystyle{TP}_{T}=\sum_{i=1}^{T}(\textit{Pos}_{i}R_{i+1}-c|\textit{Pos}_{i}-\textit{Pos}_{i-1}|)$ (12)

where $R_{t}$ is the return defined in Eq. (9), $\textit{Pos}_{i}$ is the position at time $i$ , and $c$ is the transaction cost.
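A minimal sketch of the Total Profits computation in Eq. (12) is given below. The list layout, the function name, and the flat initial position $\textit{Pos}_{0}=0$ are assumptions; the sample numbers are made up for illustration.

```python
def total_profits(positions, next_returns, c, initial_position=0):
    """TP_T = sum_i (Pos_i * R_{i+1} - c * |Pos_i - Pos_{i-1}|).

    positions[k]    is Pos_{k+1}, the position held at step k+1.
    next_returns[k] is R_{k+2}, the return realized after holding positions[k].
    initial_position is Pos_0 (assumed flat unless stated otherwise).
    """
    if len(positions) != len(next_returns):
        raise ValueError("positions and next_returns must have the same length")
    total = 0.0
    prev = initial_position
    for pos, ret in zip(positions, next_returns):
        total += pos * ret - c * abs(pos - prev)
        prev = pos
    return total


# Enter long, hold, then flip to short before a falling step.
print(total_profits([1, 1, -1], [2.0, -1.0, -3.0], 0.5))  # 1.5 - 1.0 + 2.0 = 2.5
```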

Moreover, in modern portfolio theory, risk-adjusted profits are more widely used to evaluate a trading system’s performance than Total Profits. This paper will also consider the Sharpe Ratio (SR), commonly used in many trading-related works. The Sharpe Ratio is the ratio of the average return to the standard deviation of the returns calculated in period $\{1,2,\cdots,T\}$ as follows:

$\displaystyle{SR}_{T}=(\operatorname{mean}(R_{t})/\operatorname{std}(R_{t}))$ (13)

where $R_{t}$ is the return defined in Eq. (9), and $\operatorname{mean}(\cdot)$ and $\operatorname{std}(\cdot)$ denote the mean and standard deviation, respectively. To simplify the computation, we follow the direct reinforcement learning trading model [18] and use the moving Sharpe Ratio instead. In general, the moving Sharpe Ratio is obtained from a first-order Taylor expansion of the typical Sharpe Ratio and can be updated incrementally.
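The following Python sketch illustrates Eq. (13) together with one possible incremental variant. The exact form of the moving Sharpe Ratio used in the experiments is not spelled out in this section; the sketch assumes exponential moving estimates of the first and second moments of the returns, in the spirit of the direct reinforcement learning literature. The function names, the adaptation rate `eta`, and the use of the population standard deviation are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def sharpe_ratio(returns: Sequence[float]) -> float:
    """Eq. (13): mean of the returns divided by their standard deviation.

    Uses the population standard deviation (an assumption).
    """
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / n
    if variance == 0.0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return mean / math.sqrt(variance)


def moving_sharpe_ratio(returns: Sequence[float], eta: float = 0.01) -> List[float]:
    """Incrementally updated Sharpe estimate after each new return.

    a and b are exponential moving averages of R_t and R_t**2;
    eta is a hypothetical adaptation rate.
    """
    a, b = 0.0, 0.0
    estimates = []
    for r in returns:
        a += eta * (r - a)
        b += eta * (r * r - b)
        variance = b - a * a
        # Report 0.0 while the variance estimate is not yet positive.
        estimates.append(a / math.sqrt(variance) if variance > 0.0 else 0.0)
    return estimates


print(sharpe_ratio([1.0, 2.0, 3.0]))
print(moving_sharpe_ratio([1.0, -1.0], eta=0.5))
```

The incremental version needs only the two running averages, so each update costs constant time regardless of how many returns have been observed.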

Since the proposed IFF-BOTS is a prediction-based trading framework, we also evaluate financial time series prediction. The framework adopts both direct price forecasting and trend classification. Accordingly, we follow generic forecasting and classification measurements: root mean square error (RMSE) and mean absolute percentage error (MAPE) for forecasting, and the F1-score, a composite indicator of precision and recall, for classification.
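For reference, the three prediction measures can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the evaluation code used in the experiments: MAPE is returned as a fraction rather than a percentage (which appears consistent with the magnitudes in Table 2), and the F1-score is computed for a single class chosen as positive, since the averaging scheme over trend classes is not specified here.

```python
import math
from typing import Hashable, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between actual and forecast prices."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, returned as a fraction (0.1 = 10%)."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return sum(ratios) / len(ratios)


def f1_score(
    labels_true: Sequence[Hashable],
    labels_pred: Sequence[Hashable],
    positive: Hashable,
) -> float:
    """Harmonic mean of precision and recall for the class `positive`."""
    pairs = list(zip(labels_true, labels_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(mape([100.0, 200.0], [110.0, 180.0]))
print(f1_score(["up", "up", "down", "down"], ["up", "down", "up", "down"], "up"))
```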

Another important measurement is the time cost, including training duration, measured in seconds, especially for real-world applications. It is of great importance that both prediction and trading decision-making be carried out in a real-time manner.

4.1.3 State-of-the-art methods

We evaluate the performance of the proposed IFF-BOTS on the datasets described above. First, to demonstrate the performance of the isomorphic CNN-based forecaster-classifier-executor structure, we compare its prediction performance with three state-of-the-art methods, namely LSTM recurrent neural networks (LSTM-RNN) [6], dilated convolution long short-term memory networks (DC-LSTM) [27], and convolutional LSTM (CNN-LSTM) [16], based on RMSE, MAPE, and F1-score. Then, we compare the overall trading results to evaluate the decision-making performance based on the Sharpe Ratio (SR) and Total Profits (TP). In the practical implementation, we compare IFF-BOTS with two sets of trading methods: i) Prediction-based heuristic trading methods, including LSTM-RNN-H, DC-LSTM-H, and CNN-LSTM-H. These approaches only trust trading signals with a high likelihood: a signal is issued only if the predicted likelihood for one direction is higher than a certain threshold $\tau$. ii) Deep direct reinforcement learning methods, including LSTM-RNN-DDRL, DC-LSTM-DDRL, and CNN-LSTM-DDRL. These trading strategies adopt LSTM-RNN, DC-LSTM, and CNN-LSTM only for feature extraction and then use direct reinforcement learning for trading decisions. Finally, we measure the training time of each method to demonstrate the real-time performance.
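The threshold rule used by the prediction-based heuristic baselines can be sketched as below. This is a minimal illustration rather than the baselines' actual code: the function name `heuristic_signal`, the encoding of positions as +1 (long), -1 (short), and 0 (no trade), and the choice to stay out of the market when neither likelihood exceeds $\tau$ are assumptions.

```python
def heuristic_signal(p_up: float, p_down: float, tau: float) -> int:
    """Map predicted direction likelihoods to a trading signal.

    Returns +1 (long) or -1 (short) only when the corresponding
    likelihood exceeds the threshold tau; otherwise returns 0.
    """
    if p_up > tau and p_up >= p_down:
        return 1
    if p_down > tau:
        return -1
    return 0  # assumed behavior: no trusted signal, so no trade


# With tau = 0.6, only sufficiently confident predictions trigger a trade.
for p_up, p_down in [(0.8, 0.2), (0.3, 0.7), (0.55, 0.45)]:
    print(p_up, p_down, heuristic_signal(p_up, p_down, tau=0.6))
```

A larger $\tau$ trades less often but only on more confident predictions, which is the trade-off such a heuristic exposes.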

4.1.4 Experimental environments

We perform all the experiments on a single server with a 6-core Intel Core (Coffee Lake) CPU and an NVIDIA GeForce 2070s GPU. The software environment includes Anaconda 3, Python 3.8, CUDA 11.1, and the necessary runtime libraries such as TensorFlow-GPU 2.2.0 and cuDNN 7.6.5. All methods are implemented with Keras.

To compare the different methods under comparable hyperparameters, we use relatively stringent parameter settings: (i) no more than four neural network layers are adopted for any of the models, to ensure acceptable real-time performance; (ii) the number of hidden neurons is no more than 64; (iii) we adopt the Softmax activation function for the output layer of classifier C, a linear activation function for the output layer of forecaster F, and the Tanh activation function for the output layer of executor E; (iv) the number of training epochs is less than 100; (v) we use the RMSprop optimizer for the forecaster F and the classifier C, and the Adam optimizer for the executor E; (vi) we use a mini-batch size of $m=128$ for training; and (vii) we stop training the generator when the decrease of its loss slows down.

To mitigate the experimental randomness, we ran each method for each dataset ten times with different randomized seeds.

4.2 Experimental results

This section presents results that demonstrate the overall performance of IFF-BOTS in both prediction and trading on real-world financial time series datasets, and provides some insights into the performance comparison among different algorithmic trading methods.

To make the result clear, we show the best performance among the compared methods in bold. The last row of the tables indicates the ranking of each approach within each dataset.

Table 2
Prediction metrics

	S&P				DJI
	RMSE	MAPE	Training duration	F1	RMSE	MAPE	Training duration	F1
DC-LSTM	66.0359	0.0218	40.2s	0.4515	433.51	0.0315	43.6s	0.3032
CNN-LSTM	77.1052	0.0261	48.9s	0.4338	308.42	0.0224	49.0s	0.3168
LSTM-RNN	61.9310	0.0197	88s	0.3615	428.44	0.0285	80.0s	0.4497
IFF-BOTS	42.8710	0.0128	10.3s	0.483	226.97	0.0158	9.66s	0.4933
Rank of IFF-BOTS	1	1	1	1	1	1	1	1

Figure 6.

Datasets of S&P and DJI.

4.2.1 Prediction results and analysis

As shown in Fig. 6 and Table 2, IFF-BOTS’s isomorphic CNN-based architecture demonstrates clear advantages over LSTM-RNN, DC-LSTM, and CNN-LSTM. Two observations can be drawn from the experimental results: i) Although performance on the training sets looks similarly good in Fig. 6a and b, IFF-BOTS outperforms its competitors on the test sets in Fig. 6c and d. Its RMSE, MAPE, and F1 are consistently the best. More precisely, for S&P, the metrics are better than those of the second-best method by 44.46%, 53.90%, and 6.97%, respectively; for DJI, they are better by 35.95%, 41.77%, and 9.70%. ii) The training duration is also much shorter, i.e., 10.3 seconds and 9.66 seconds for S&P and DJI, respectively, compared with the other methods, especially LSTM-RNN.
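The relative improvements quoted above can be approximately recovered from Table 2 with the short Python sketch below. The convention used here is an assumption inferred from the numbers: for error measures (RMSE, MAPE) the gap to the second-best method is divided by the IFF-BOTS value, while for F1 the gap is divided by the second-best value. Under this convention the results agree with the reported percentages to within roughly 0.1 percentage points (for example, the DJI RMSE gain comes out at about 35.9%).

```python
def gain_lower_is_better(ours: float, runner_up: float) -> float:
    """Relative gain in percent for error measures such as RMSE and MAPE."""
    return 100.0 * (runner_up - ours) / ours


def gain_higher_is_better(ours: float, runner_up: float) -> float:
    """Relative gain in percent for scores such as F1."""
    return 100.0 * (ours - runner_up) / runner_up


# (IFF-BOTS value, best competing value) taken from Table 2.
table2 = {
    "S&P": {"RMSE": (42.8710, 61.9310), "MAPE": (0.0128, 0.0197), "F1": (0.483, 0.4515)},
    "DJI": {"RMSE": (226.97, 308.42), "MAPE": (0.0158, 0.0224), "F1": (0.4933, 0.4497)},
}

gains = {}
for dataset, metrics in table2.items():
    for name, (ours, runner_up) in metrics.items():
        if name == "F1":
            gains[(dataset, name)] = gain_higher_is_better(ours, runner_up)
        else:
            gains[(dataset, name)] = gain_lower_is_better(ours, runner_up)
        print(f"{dataset} {name}: {gains[(dataset, name)]:.2f}%")
```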

4.2.2 Trading results and analysis

A. Comparisons with prediction-based heuristic trading methods

Figure 7.

Overall trading results in the test set.

Figure 7c and d show the trading results of the compared prediction-based heuristic methods, including IFF-BOTS, LSTM-RNN-H, DC-LSTM-H, and CNN-LSTM-H. In addition, the prices of the test sets are shown in Fig. 7a and b; the two series fluctuate in markedly different ways. In brief, IFF-BOTS outperforms the compared approaches on both S&P and DJI, as can be seen from the profit and loss (P&L) curves in Fig. 7c and d. For S&P, the P&L curve of IFF-BOTS grows much more steadily than the others, while for DJI, IFF-BOTS shows a clear advantage, although it struggles in a few intervals. The quantitative evaluations are summarized in Table 3. IFF-BOTS clearly leads in both TP and SR: compared with the runner-up, the figures are 499.39:100.16 for TP and 1.321:0.283 for SR on S&P, and 7794.06:3333.9 for TP and 1.51:0.8229 for SR on DJI.

Table 3

Overall TP and SR results of prediction-based heuristic trading methods

	S&P		DJI
	TP	SR	TP	SR
DC-LSTM-H	100.16	0.283	$-$2643.01	$-$0.695
CNN-LSTM-H	53.09	0.133	3333.9	0.8229
LSTM-RNN-H	$-$0.402	$-$0.0169	$-$315.09	$-$0.221
IFF-BOTS	499.39	1.321	7794.06	1.51
Rank of IFF-BOTS	1	1	1	1

Table 4

Overall TP and SR results of DDRL-based trading methods

	S&P		DJI
	TP	SR	TP	SR
DC-LSTM-DDRL	133.98	0.269	$-$552.4	$-$0.127
CNN-LSTM-DDRL	21.39	0.054	2524.8	0.529
LSTM-RNN-DDRL	78.312	0.1811	$-$146.03	$-$0.045
IFF-BOTS	499.39	1.321	7794.06	1.51
Rank of IFF-BOTS	1	1	1	1

B. Comparisons with DDRL-based methods

For deep-direct-reinforcement-learning-based categories, Fig. 7e and f present the trading comparison among IFF-BOTS, LSTM-RNN-DDRL, DC-LSTM-DDRL, and CNN-LSTM-DDRL. Similarly, the prices of the test sets are shown in Fig. 7a and b.

In summary, IFF-BOTS outperforms the compared approaches on both S&P and DJI. For S&P, the P&L curve of IFF-BOTS achieves better profitability than the other methods. For DJI, IFF-BOTS still leads the P&L curves, although CNN-LSTM-DDRL also performs well. As shown in Table 4, IFF-BOTS beats its main competitor on S&P, DC-LSTM-DDRL, by 499.39:133.98 for TP and 1.321:0.269 for SR. Likewise, on DJI, IFF-BOTS has the advantage over CNN-LSTM-DDRL, with 7794.06:2524.8 for TP and 1.51:0.529 for SR.

4.2.3 Experiments for robustness analysis

The robustness of IFF-BOTS is worth exploring further. Following the previous experiments on the S&P dataset, we conduct additional experiments with different hyperparameters, since IFF-BOTS is built on CNN layers and RNN nodes, whose settings may affect the overall performance. To probe the representation learning ability of the neural networks, we vary the number of CNN layers from 2 to 4 and the number of RNN nodes from 1 to 4. As shown in Table 5, more hidden layers generally lead to better TP and SR because improved feature representation directly benefits forecaster F, classifier C, and executor E. Moreover, additional RNN nodes in the direct reinforcement learning model have little impact on model performance, as discussed for the direct reinforcement learning model. As expected, the runtime rises correspondingly with more CNN layers and RNN nodes. Therefore, we conclude that the proposed IFF-BOTS is generally robust to different CNN layer and RNN node settings.

Table 5
Robustness results

CNN Layer# of F and C	2		3		4
RNN Node# of E	TP	SR	TP	SR	TP	SR
1	499.39	1.321	523.81	1.278	885.57	1.888
2	481.14	1.318	588.7	1.515	822.26	1.682
4	361.91	1	475.47	1.132	858.9	1.81

5. Discussion and conclusion

This paper proposes a novel algorithmic trading framework, IFF-BOTS, to deliver better financial time series prediction performance and dynamic trading decisions for real-world online trading systems. IFF-BOTS features a novel isomorphic CNN-based forecaster-classifier-executor architecture for feature representation of nonstationary, nonlinear, and highly noisy financial data. The framework adopts an integrated prediction to leverage both price forecasting and trend classification. Moreover, it exploits the CNN’s strong capability for pattern classification and nonlinear feature learning. Furthermore, we present an improved direct-reinforcement-learning-based strategy that benefits from deep feature extraction and a prediction-based heuristic strategy based on quantitative prediction. Although training neural networks is involved, the real-time performance of IFF-BOTS is suitable for most online real-world trading systems because of its isomorphic structure and shallow layers. In practice, IFF-BOTS achieves the best metrics and ranking on the real-world S&P and DJI datasets compared to several state-of-the-art prediction-based heuristic trading methods and deep-direct-reinforcement-learning-based trading methods. For future work, we plan to: introduce anomaly detection and classification for financial time series so that turning points and jumps can be detected and predicted, which would be of crucial importance for algorithmic trading; account for dynamic modeling of slippage and transaction execution, such as trade orders based on the execution status; and, in terms of applications, extend IFF-BOTS to high-frequency trading (HFT) markets and orchestrate it as a software-as-a-service in cloud environments.

Footnotes

Acknowledgments

This work has been partially supported by the China Scholarship Council, the National Natural Science Foundation of China under Grant No. 71971174, the Science and Technology Program of Sichuan Province under Grant Nos. 2021JDR0222 and 2020YFG0326, and the Talent Program of Xihua University under Grant No. Z202047.

References

1. Abe and Nakayama, Deep Learning for Forecasting Stock Returns In The Cross-section, Advances In Knowledge Discovery and Data Mining, Springer International Publishing, 2018, 273–284.

2. Abroyan, Neural networks for financial market risk classification, Signal Process 1(2) (2017).

3. Akita, Yoshihara, Matsubara and Uehara, Deep Learning for Stock Prediction Using Numerical and Textual Information, in: 2016 IEEE/ACIS 15th International Conference on Computer and Information Science, ICIS, IEEE, 2016.

4. Ang K.K. and Quek, Stock trading using RSPOP: A novel rough set-based neuro-fuzzy approach, IEEE Transactions on Neural Networks 17(5) (2006), 1301–1315.

5. Chandra and Chand, Evaluation of co-evolutionary neural network architectures for time series prediction with mobile application in finance, Applied Soft Computing 49 (2016), 462–473.

6. Chen, Zhou and Dai, A LSTM-based Method for Stock Returns Prediction: A Case Study of China Stock Market, in: 2015 IEEE International Conference on Big Data, Big Data, IEEE, Information Systems, ICIIS, IEEE, 2015.

7. Chen, Jiang, Zhang W.-G. and Chen, A novel graph convolutional feature based convolutional neural network for stock trend prediction, Information Sciences 556 (2021), 67–94.

8. Chong, Han and Park F.C., Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies, Expert Systems with Applications 83 (2017), 187–205.

9. W.D., Agent Inspired Trading Using Recurrent Reinforcement Learning and LSTM Neural Networks, arXiv preprint: 1707.07338, 2017.

10. Deng, Kong, Bao and Dai, Sparse coding-inspired optimal trading system for HFT industry, IEEE Transactions on Industrial Informatics 11(2) (2015), 467–475.

11. Goodfellow, Bengio and Courville, Deep Learning, MIT Press, 2016.

12. Hiransha, Gopalakrishnan E.A., Menon V.K. and Soman K.P., NSE stock market prediction using deep-learning models, Procedia Computer Science 132 (2018), 1351–1362.

13. Jiang and Liang, A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem, arXiv preprint: 1706.10059, 2017.

14. Karaoglu and Arpaci, A Deep Learning Approach for Optimization of Systematic Signal Detection in Financial Trading Systems With Big Data, International Journal of Intelligent Systems, 2018, 31–36.

15. Kraus and Feuerriegel, Decision Support From Financial Disclosures With Deep Neural Networks and Transfer Learning, Decision Support Systems, 2017, 38–48.

16. Liu, Zhang and Ma, CNN-LSTM Neural Network Model for Quantitative Strategy Analysis In Stock Markets, Neural Information Processing, Springer International Publishing, 2017, 198–206.

17. Minami, Predicting equity price with corporate action events using LSTM-RNN, J. Math. Finance 8(1) (2018), 58–63.

18. Moody and Saffell, Learning to trade via direct reinforcement, IEEE Transactions on Neural Networks 12(4) (2001), 875–889.

19. Moody and Saffell, Reinforcement Learning for Trading, Advances in Neural Information Processing Systems, 1999, 917–923.

20. Neely C.J., Rapach D.E. and Zhou, Forecasting the equity risk premium: The role of technical indicators, Management Science 60(7) (2014), 1772–1791.

21. Peng, Feng, Kong, Ren and Dai, Deep direct reinforcement learning for financial signal representation and trading, IEEE Transactions on Neural Networks and Learning Systems 28(3) (2017), 653–664.

22. Richard S.S. and Andrew G.B., Introduction to Reinforcement Learning, Vol. 135, MIT Press, Cambridge, 1998.

23. Saad E.W., Prokhorov D.V. and Wunsch D.C., II, Comparative Study of Stock Trend Prediction Using Time Delay, Recurrent and Probabilistic Neural Networks, IEEE Transactions on Neural Networks 9(6) (1998), 1456–1470.

24. Sato, Model-Free Reinforcement Learning for Financial Portfolios: A Brief Survey, arXiv preprint: 1904.04973, 2019.

25. Selvin, Vinayakumar, Gopalakrishnan E.A., Menon V.K. and Soman K.P., Stock Price Prediction Using LSTM, RNN and CNN-sliding Window Model, in: 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI, IEEE, 2017.

26. Sezer, Gudelek and Ozbayoglu, Financial time series forecasting with deep learning: A systematic literature review: 2005–2019, Applied Soft Computing 90 (2020), 106181.

27. Wang, Peng, Gao and Jiang, A dilated convolution network-based LSTM model for multi-step prediction of chaotic time series, Computational and Applied Mathematics 39 (2020), Article 30.

28. Yuan, Zhang and Shao, Deep And Wide Neural Networks on Multiple Sets of Temporal Data With Correlation, in: Proceedings of The 2018 International Conference on Computing and Data Engineering ICCDE 2018, ACM Press, 2018.

29. Zhang, Aggarwal and Qi, Stock Price Prediction Via Discovering Multi-frequency Trading Patterns, in: Proceedings of The 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining – KDD17, ACM Press, 2017.


1 https://finance.yahoo.com/.
