Abstract
Financial prediction has always been an attractive area of study, not only for academic research but also for commercial applications. In particular, forecasting stock market price movements at least one day ahead has been a goal of many traders. Earning a high profit from a suitable investment is the dream of every investor, but it requires sound financial knowledge, analytical capability and the ability to discover the nonlinear patterns hidden within stock market data. Because neural networks have an inherent capability to learn and approximate nonlinear functions from historical data, they are an attractive tool for financial prediction. In this study, a predictor model using a Chebyshev Polynomial Neural Network (CPNN) is developed for one-day-ahead prediction of the closing price of stock indices. The parameters of the predictor model are estimated using the Differential Evolution (DE) algorithm. Being a parallel direct search algorithm, DE has the strength of finding a global optimal solution regardless of the initial values of its few control parameters. Furthermore, the DE based algorithm aims to achieve an optimal solution with a rapid convergence rate. A comparative study of training the CPNN using DE against the traditional back propagation (BP) algorithm and the particle swarm optimization (PSO) algorithm is also provided on two benchmark stock indices. Experimental results reveal the efficiency of the hybrid predictor model in terms of two well-known error metrics, Root Mean Square Error and Mean Absolute Percentage Error.
Keywords
Introduction
Financial prediction has always been an attractive area of study, not only for academic research but also for commercial applications. In particular, forecasting stock market price movements at least one day ahead has been a goal of many traders. The motivation is, obviously, the hope of financial gain. However, stock prices are strongly affected by many highly interrelated external factors (economic, social, political and even psychological), coupled with the complexity of the market's internal behavior: prices (stock indices) change nonlinearly and share data are very noisy. Therefore, it is generally very difficult to forecast the movements of stock markets. The traditional statistical methods used to predict stock prices, including moving averages, exponential smoothing and linear regression, are simple. However, the significant nonlinear and time-varying characteristics of market behavior make it difficult for those traditional methods to reveal the stock market's internal patterns. Hence, developing more realistic models to predict stock prices more effectively and accurately is of great research interest in financial data mining.
In this paper, the detailed architecture and mathematical modeling of a Chebyshev Polynomial Neural Network (CPNN) are described for one-day-ahead prediction of the closing price of two well-known benchmark stock indices. The CPNN is a single-layer, single-neuron neural network like the Functional Link Artificial Neural Network (FLANN). Unlike Multilayer Perceptron (MLP) networks, which use a number of hidden layers with a number of neurons in each, the CPNN captures the hidden nonlinearity between input and output patterns by expanding the input pattern through a set of Chebyshev polynomials in the functional expansion layer. In contrast to the traditional FLANN, in which trigonometric functions are used in the functional expansion, the CPNN uses a set of Chebyshev polynomial functions. Evaluating Chebyshev polynomials involves less computation than evaluating trigonometric functions, so the CPNN offers faster training than the FLANN. The main advantages of the CPNN are its simple structure, faster convergence and reduced computational complexity, obtained by increasing the dimensionality of the input pattern with a set of linearly independent nonlinear functions. The back propagation algorithm is the most commonly used algorithm for training the network, but it suffers from extensive computation, relatively slow convergence and possible divergence under certain conditions. Therefore, to escape local optima and to obtain maximum accuracy, the differential evolution method is proposed for training the Chebyshev FLANN and optimizing its parameters. Being a parallel direct search algorithm, DE has the strength of finding a global optimal solution regardless of the initial values of its few control parameters. Furthermore, the DE based algorithm aims to achieve an optimal solution with a rapid convergence rate.
The training of the CPNN using DE is also compared with the traditional back propagation (BP) algorithm and the particle swarm optimization (PSO) algorithm. To test the model performance, two well-known stock market indices, the Bombay Stock Exchange index (BSE SENSEX) and the Standard & Poor's 500 (S&P500), are taken as experimental data. The Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) are used for model validation.
The rest of the paper is organized as follows. Section 2 reviews the literature on stock prediction using neural networks. Section 3 discusses the basic CPNN architecture and the learning algorithms used with it. Section 4 presents the simulation study that demonstrates the prediction performance of the proposed model, and compares the proposed model with back propagation and PSO based learning of the CPNN for predicting financial time series data. Finally, conclusions are drawn in the last section.
Literature review
The strong ability of Artificial Neural Networks (ANNs) to deal with nonlinear problems and to capture deterministic as well as random features makes them very attractive for time series modeling and financial prediction. An ANN can identify complex nonlinear relationships in historical data that are difficult to capture using traditional forecasting models. Over the years, many scholars have worked on predicting stock prices using neural networks and have made great achievements [2, 14, 20, 27]. It has been proved mathematically that a correctly configured neural network can approximate any function, given the right data.
An artificial neural network is a large network of connected processing units (neurons). It is an abstract, simplified simulation of the human brain that reflects some of its basic characteristics. Generally, a neural network has a multi-layered topology comprising an input layer, hidden layers and an output layer. The use of neural networks for time series analysis relies on historical data and a process called network learning. Applying neural networks to time series analysis and prediction involves the following standard steps: data preprocessing, network selection, training and testing [3]. Various ANN based methods such as the Multilayer Perceptron (MLP), Radial Basis Function Neural Network (RBF), Local Linear Wavelet Neural Network (LLWNN), Recurrent Neural Network (RNN) and Functional Link Artificial Neural Network (FLANN) are extensively used for stock market prediction [2, 12, 15, 22, 23]. Since most ANN based applications need a substantial amount of data and consume a great deal of training time, it is always preferable to develop computationally efficient algorithms.
MLP neural networks are the most widely used by researchers because of their inherent capability to approximate any nonlinear function to a high degree of accuracy. It has been proved that an MLP neural network can approximate any complex continuous function, which enables it to learn any complicated relationship between the input and the output of a system [4, 7, 10]. However, these models have a high computational cost and need a large number of training iterations because of their hidden layers [11, 28]. To overcome these limitations, a different kind of ANN, the Functional Link ANN, which has a single-neuron, single-layer architecture, has been proposed in the literature. In general, functional link based neural network models are single-layer ANN structures with a higher rate of convergence and a lower computational load than an MLP structure. The mathematical expressions and computations are evaluated as in an MLP. They are therefore very suitable for the analysis of stock data [5]. The FLANN, originally proposed by Pao, is a single-layer network in which the original input pattern is expanded to a higher dimensional space using nonlinear functions. The hyperplanes generated provide greater discrimination capability in the input pattern space [12, 15, 25]. This expansion eliminates the hidden layers and makes the learning algorithm simpler. Applications of FLANN models to nonlinear problems such as channel equalization, nonlinear dynamic system identification, electric load forecasting, earthquake prediction and financial forecasting have demonstrated its viability, robustness and ease of computation. Earlier FLANNs used trigonometric functions for functional expansion.
The Chebyshev polynomial neural network (CPNN) is a single-layer, single-neuron neural network similar to the FLANN, except that its input pattern is expanded into a nonlinear high dimensional space using Chebyshev polynomials instead of trigonometric functions. The Chebyshev FLANN performs better than the MLP and the trigonometric FLANN in terms of computational complexity, convergence rate and prediction accuracy [11, 24].
The well-known back propagation algorithm is commonly used to update the weights of the Chebyshev FLANN. The learning process of the BP algorithm involves two steps: the forward propagation of information and the backward propagation of error. In each iteration, an input pattern is applied and the output is computed. After a given input sample set has been processed by the network, the outputs are compared with the target outputs. The deviations, or errors, are used to adjust the weights and bias of the network repeatedly so as to minimize a given cost function.
DECPNN: A hybrid stock predictor model using Differential Evolution and Chebyshev Polynomial neural network
A financial time series data with N points is represented by
Chebyshev polynomial neural network (CPNN)
Chebyshev Polynomial Neural Network (CPNN) is a single layer neural network in which the original input pattern in lower dimensional space is expanded to a higher dimensional space by using a set of Chebyshev orthogonal functions instead of using trigonometric functions [12, 18, 19, 24]. The Chebyshev polynomials are a set of polynomials denoted by CH
The recursive formula to generate higher order Chebyshev polynomials is given by
With the order
Architecture of Chebyshev Polynomial neural network (CPNN).
After expansion, the weighted sum of the components of the enhanced input pattern is obtained using the following formula:
The weighted sum is then passed through a hyperbolic tangent (tanh) nonlinear function to produce the output.
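The expansion and forward pass described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the standard Chebyshev recurrence T_0(x) = 1, T_1(x) = x, T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x), and it assumes that a single constant term is placed ahead of the per-input terms, a layout chosen because it yields 7 terms for 2 inputs at order 3. The function names `chebyshev_expand` and `cpnn_forward` are hypothetical.

```python
import numpy as np

def chebyshev_expand(x, order):
    """Expand a 1-D input pattern with Chebyshev polynomials T_1..T_order.

    Standard recurrence: T_0(x) = 1, T_1(x) = x,
    T_{n+1}(x) = 2 x T_n(x) - T_{n-1}(x).
    Assumed layout: [1, T_1(x_1..x_n), T_2(x_1..x_n), ...].
    """
    x = np.asarray(x, dtype=float)
    t_prev = np.ones_like(x)   # T_0 evaluated at every input
    t_curr = x.copy()          # T_1 evaluated at every input
    terms = [t_curr]
    for _ in range(2, order + 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        terms.append(t_curr)
    # One shared constant term followed by the polynomial terms.
    return np.concatenate([[1.0]] + terms)

def cpnn_forward(x, weights, order):
    """Single-neuron output: tanh of the weighted sum of the expanded pattern."""
    phi = chebyshev_expand(x, order)
    return np.tanh(np.dot(weights, phi))

phi = chebyshev_expand([0.2, 0.5], order=3)
print(len(phi))  # 2 inputs x 3 polynomial terms + 1 constant term
```

Because the inputs are scaled before training, they stay inside the interval on which the Chebyshev polynomials are bounded, which keeps the expanded terms well conditioned.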
Learning is a process by which the neural network adjusts itself in response to inputs in order to produce the desired outputs. During the process of learning, the network modifies its connection weights based on the inputs received so that its outputs come closer to the actual or target values. Various learning algorithms used for CPNN are as follows.
Back propagation learning
The BP learning algorithm is a supervised learning method used to train a neural network on a data set containing a number of input and output pairs. The learning process of the BP algorithm can roughly be divided into two phases: the forward propagation of information and the backward propagation of error. In each iteration, an input pattern is applied and the output is computed. After a given input sample set has been processed by the network, the outputs are compared with the target outputs. The deviations, or errors, are used to adjust the weights and bias of the network repeatedly so as to minimize a given cost function.
The gradient of the cost function is given by
The update rule for the weight
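As an illustration of gradient based learning for a single tanh neuron, the following sketch performs one weight update on an already expanded pattern. It assumes a half squared error cost E = 0.5 (d - y)^2 and a fixed learning rate; both are assumptions, since the cost function and update rule of the original are not reproduced here. The name `bp_update` and the parameter `lr` are hypothetical.

```python
import numpy as np

def bp_update(weights, phi, target, lr=0.1):
    """One gradient descent step for a single tanh neuron.

    Assumed cost: E = 0.5 * (target - y)**2 with y = tanh(w . phi).
    Chain rule: dE/dw = -(target - y) * (1 - y**2) * phi.
    """
    y = np.tanh(np.dot(weights, phi))
    error = target - y
    grad = -error * (1.0 - y ** 2) * phi
    new_weights = weights - lr * grad
    return new_weights, 0.5 * error ** 2

# Hypothetical expanded pattern and target, for illustration only.
phi = np.array([1.0, 0.3, -0.82])
w = np.zeros(3)
costs = []
for _ in range(50):
    w, cost = bp_update(w, phi, target=0.5)
    costs.append(cost)
```

Repeating the step on the same pattern drives the cost down gradually, which illustrates the slow, local character of gradient descent that motivates the population based alternative.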
Differential Evolution (DE) is a population-based stochastic function optimizer that combines simple arithmetic operators with the classical operators of recombination, mutation and selection to evolve a randomly generated starting population into a final solution. DE is increasingly popular in optimization because of several advantages: it reproduces the same results consistently over many trials, it can find the global minimum of a multi-modal function regardless of the initial values of its parameters, it converges quickly, and it has only a small number of parameters to set at the start of the algorithm. The optimization process is conducted by means of three main operations: mutation, crossover and selection. In each generation, the individuals of the current population become target vectors. For each target vector, the mutation operation produces a mutant vector by adding the weighted difference between two randomly chosen vectors to a third vector. The crossover operation generates a new vector, called the trial vector, by mixing the parameters of the mutant vector with those of the target vector. If the trial vector obtains a better fitness value than the target vector, the trial vector replaces the target vector in the next generation. The mutant vector can be generated using any one of the following strategies:
DE/Rand1
DE/Rand2
DE/Best1
DE/Best2
DE/Current to Best
The random numbers used in mutation are mutually exclusive integers generated in the range [1, np) and F is the scaling factor. The flow chart of DE based learning is shown in Fig. 2.
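The five strategies named above can be sketched using their commonly cited textbook forms. These forms are an assumption and may differ in detail from the expressions used by the authors. The function `mutate`, the table `NEEDED` and the strategy keys are hypothetical names.

```python
import numpy as np

# Number of mutually distinct random indices each strategy draws.
NEEDED = {"rand1": 3, "rand2": 5, "best1": 2, "best2": 4, "current_to_best": 2}

def mutate(pop, i, best, F, strategy, rng):
    """Return a mutant vector for target index i.

    pop  : (np, dim) array of individuals
    best : index of the individual with the lowest fitness value
    F    : scaling factor applied to each difference vector
    """
    # Random indices are mutually distinct and differ from the target index.
    candidates = [k for k in range(len(pop)) if k != i]
    r = rng.choice(candidates, size=NEEDED[strategy], replace=False)
    x = pop
    if strategy == "rand1":
        return x[r[0]] + F * (x[r[1]] - x[r[2]])
    if strategy == "rand2":
        return x[r[0]] + F * (x[r[1]] - x[r[2]]) + F * (x[r[3]] - x[r[4]])
    if strategy == "best1":
        return x[best] + F * (x[r[0]] - x[r[1]])
    if strategy == "best2":
        return x[best] + F * (x[r[0]] - x[r[1]]) + F * (x[r[2]] - x[r[3]])
    if strategy == "current_to_best":
        return x[i] + F * (x[best] - x[i]) + F * (x[r[0]] - x[r[1]])
    raise ValueError(f"unknown strategy: {strategy}")

rng = np.random.default_rng(0)
pop = rng.uniform(-1.0, 1.0, size=(6, 7))  # 6 individuals, 7 weights each
mutant = mutate(pop, i=0, best=2, F=0.5, strategy="rand1", rng=rng)
```

Note that the two-difference strategies need at least six individuals in this sketch, because five distinct indices other than the target must be available.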
Flow chart of DE based learning of CPNN.
Steps of DE based learning are:
1. Expand the input pattern using the functional expansion block.
2. Initialize the position of each individual according to the population size.
3. Find the fitness function value of each individual, i.e. the error obtained by applying the weights specified in each individual to the expanded input and applying the nonlinear tanh() function at the output unit.
4. Create a mutated individual.
5. Create a new vector, called the trial vector, by mixing the parameters of the mutant vector with those of the target vector using the crossover operation.
6. Compare the fitness values of the trial and target vectors, and select the vector with the lower fitness value for the next generation.
7. Repeat steps 4, 5 and 6 until a termination condition is reached, such as a predefined number of iterations being completed or the error satisfying the default precision value.
8. Fix the weights equal to the position of the vector having the minimum fitness value and use the network for testing.
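The steps of DE based learning can be sketched end to end as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it uses the DE/rand/1 mutation with binomial crossover, RMSE as the fitness value, and an initial weight range of [-1, 1]. The names `train_de`, `rmse_fitness` and `chebyshev_expand` and all default parameter values are hypothetical.

```python
import numpy as np

def chebyshev_expand(x, order):
    """Constant term plus T_1..T_order of every input (assumed layout)."""
    x = np.asarray(x, dtype=float)
    t_prev, t_curr = np.ones_like(x), x.copy()
    terms = [t_curr]
    for _ in range(2, order + 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        terms.append(t_curr)
    return np.concatenate([[1.0]] + terms)

def rmse_fitness(weights, Phi, targets):
    """Error of the tanh output unit over all expanded training patterns."""
    outputs = np.tanh(Phi @ weights)
    return np.sqrt(np.mean((targets - outputs) ** 2))

def train_de(X, targets, order=3, pop_size=5, F=0.5, CR=0.9,
             max_iter=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: functional expansion of every input pattern.
    Phi = np.array([chebyshev_expand(x, order) for x in X])
    dim = Phi.shape[1]
    # Step 2: random initial positions (the range is an assumption).
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    # Step 3: fitness value of each individual.
    fit = np.array([rmse_fitness(w, Phi, targets) for w in pop])
    history = [fit.min()]
    for _ in range(max_iter):
        new_pop, new_fit = pop.copy(), fit.copy()
        for i in range(pop_size):
            # Step 4: DE/rand/1 mutation with mutually distinct indices.
            others = [k for k in range(pop_size) if k != i]
            r1, r2, r3 = rng.choice(others, size=3, replace=False)
            mutant = pop[r1] + F * (pop[r2] - pop[r3])
            # Step 5: binomial crossover; one position always comes
            # from the mutant so the trial differs from the target.
            mask = rng.random(dim) <= CR
            mask[rng.integers(dim)] = True
            trial = np.where(mask, mutant, pop[i])
            # Step 6: keep whichever vector has the lower fitness value.
            trial_fit = rmse_fitness(trial, Phi, targets)
            if trial_fit < fit[i]:
                new_pop[i], new_fit[i] = trial, trial_fit
        pop, fit = new_pop, new_fit
        history.append(fit.min())
        # Step 7: stop early once the error reaches the precision value.
        if fit.min() < tol:
            break
    # Step 8: the best individual becomes the fixed weight vector.
    return pop[np.argmin(fit)], history

# Toy data for illustration only: two inputs in [0, 1], target is their mean.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(30, 2))
y = X.mean(axis=1)
weights, history = train_de(X, y, max_iter=50)
```

Because selection never replaces a target vector with a worse trial vector, the best fitness value recorded in `history` cannot increase from one generation to the next.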
Data Collection and Preparation: Initially, a sample financial data set comprising the daily closing prices is collected. The samples of the data set are divided into training and test sets.
Data Normalization: To improve performance, all the inputs are first scaled between 0 and 1 using min-max normalization as follows:
Output of data normalization
Training set representing the input and desired output of the network
Chebyshev functional expansion of training input
Initial population and its fitness value
A sample training data set containing the closing prices of 10 days and their corresponding normalized values is shown in Table 1.
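Min-max normalization to the range [0, 1] can be sketched as below, using the standard form (x - min) / (max - min). The function names and the sample prices are hypothetical and are not the values of Table 1.

```python
import numpy as np

def min_max_normalize(prices):
    """Scale a price series to [0, 1]; also return the bounds used.

    Assumes the series is not constant (max > min).
    """
    prices = np.asarray(prices, dtype=float)
    lo, hi = prices.min(), prices.max()
    return (prices - lo) / (hi - lo), lo, hi

def denormalize(scaled, lo, hi):
    """Map network outputs back to the original price scale."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

# Hypothetical closing prices, for illustration only.
scaled, lo, hi = min_max_normalize([10.0, 15.0, 20.0])
```

Keeping the bounds makes it possible to convert predictions back to index values before computing error measures on the original scale.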
Creating Network Structure: The number of input layer nodes is set by choosing a suitable window size and the technical indicators to be used for training. For example, with a window size of 2, the number of input layer nodes of the CPNN model is set to 2, representing the closing index of the 2 preceding days, and the number of output nodes is set to 1, representing the closing index of the 3rd day. A suitable order is then chosen for the expansion. With order 3, the 2-dimensional input pattern is expanded to a pattern of dimension 7 using the basis functions of the CPNN. Then a weight vector of size 7 is initially set to random values between
Training and Testing the Network: To train the network, data from the training set are fed to the network so that it can adjust its weights. The DE based learning process is conducted by means of three main operations: mutation, crossover and selection. After functional expansion, the position of each individual is randomly initialized within [
Training details of DE based CPNN for one iteration are described as follows:
With a population size of 5, the initial population and the fitness value obtained for each individual are shown in Table 4.
For the first individual, with a mutation scale of 0.5 and a crossover rate of 0.9, the outputs of mutation, crossover and selection are as follows:
Mutation:
(0.3654 0.3013 0.3452
Crossover:
Selection:
Because the fitness value of the trial vector is lower than that of the target vector, the trial vector is chosen as the target vector for the next generation.
Output of iteration 1
RMSE comparison of CPNN with BP, PSO, DE based learning during training of S&P500 data.
RMSE comparison of CPNN with BP, PSO, DE based learning during training of BSE data.
After mutation, crossover and selection have been applied to each individual, the new individuals generated and their corresponding fitness values are shown in Table 5. The same procedure for generating new individuals continues until a termination condition is satisfied.
In the testing process, the network is validated on the sample data, i.e. the performance of the prediction system is evaluated using common error measures such as the Root Mean Square Error (RMSE) and the Mean Absolute Percentage Error (MAPE). The RMSE and MAPE are defined as:
where
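For reference, the two error measures can be computed as in the sketch below, which uses their standard definitions: RMSE is the square root of the mean squared difference between actual and predicted values, and MAPE is the mean of the absolute errors relative to the actual values, expressed as a percentage. The function names and sample values are hypothetical.

```python
import numpy as np

def rmse(actual, predicted):
    """Root Mean Square Error between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.sqrt(np.mean((actual - predicted) ** 2))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes no actual value is zero, which holds for index levels.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((actual - predicted) / actual)) * 100.0

# Hypothetical closing values and predictions, for illustration only.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(rmse(actual, predicted), mape(actual, predicted))
```

RMSE depends on the scale of the data, so it should be compared only across models evaluated on the same series, whereas MAPE is scale independent.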
The CPNN, with its single-layer neural network structure, offers a great advantage in reduced computational complexity compared to a Multilayer Perceptron Network (MLP). The reduction is achieved by using a simple polynomial expansion in the CPNN in lieu of the multiple hidden layers, and the neurons in each layer, employed in an MLP. Consider a simple MLP with n input layer nodes, a single hidden layer of m neurons and 1 output neuron; the number of weights to be estimated during the learning stage is [
Output of DECPNN for S&P500 data set with prediction horizon 1.
Output of DECPNN for BSE data set with prediction horizon 1.
Performance comparison of CPNN with different learning methods
In this study, sample data comprising the daily closing prices of two popular stock indices, the Bombay Stock Exchange (BSE) and the Standard & Poor's 500 (S&P500), are taken to compare the performance of the CPNN with BP, PSO and DE based learning. The total number of samples for BSE is 1200 from 1
The direct method of prediction is used in this study. Initially, with window size 5 and prediction horizon 1, the input patterns and their corresponding output patterns are prepared from the normalized data set. Accordingly, the number of input layer nodes of the CPNN models is set to 6, representing the closing index of the 5 preceding days and their simple moving average, and the number of output nodes is set to 1, representing the closing index of the 6th day.
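The preparation of input and output patterns can be sketched as a sliding window over the normalized series. The sketch assumes that the simple moving average is taken over the same 5 values that form the window, which is an interpretation of the description above. The function name `make_patterns` is hypothetical.

```python
import numpy as np

def make_patterns(series, window=5, horizon=1):
    """Build (input, target) pairs from a normalized closing price series.

    Each input holds `window` consecutive values plus their simple moving
    average (assumed to span the same window); the target is the value
    `horizon` steps after the end of the window.
    """
    series = np.asarray(series, dtype=float)
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        past = series[start:start + window]
        inputs.append(np.append(past, past.mean()))
        targets.append(series[start + window + horizon - 1])
    return np.array(inputs), np.array(targets)

# Hypothetical normalized series, for illustration only.
series = np.arange(10) / 10.0
X, y = make_patterns(series)
print(X.shape, y.shape)  # 6 inputs per pattern, one target each
```

With window size 5 and horizon 1, a series of N points yields N - 5 patterns, each with 6 inputs and the following day's value as the target.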
Figures 3 and 4 show the RMSE obtained during training of the CPNN using back propagation, PSO and DE based learning for the S&P500 and BSE data sets, respectively. The RMSE and MAPE values obtained during testing on the BSE and S&P500 data sets using the CPNN with different learning algorithms are shown in Table 6. Figures 5 and 6 show the one-day-ahead predictions for the S&P500 and BSE data sets using the CPNN.
Conclusion
The Chebyshev Polynomial Neural Network is a single-layer ANN structure similar to the FLANN, except that its input pattern is expanded into a nonlinear high dimensional space using a set of orthogonal Chebyshev polynomials instead of trigonometric functions. The functional expansion block increases the dimensionality of the input vector, which eliminates the hidden layers used in an MLP while still helping to solve complex nonlinear problems by generating nonlinear decision boundaries. The prime advantage of the CPNN is its reduced computational complexity without compromising performance. This reduction is achieved by using a simple polynomial expansion in the CPNN in lieu of the multiple hidden layers, and the neurons in each layer, employed in a Multilayer Perceptron Network. The polynomial functional expansion block reduces the number of weights to be estimated during training compared to the weights and bias values used in the Multilayer Perceptron Network. The traditional BP algorithm used to train the CPNN may become stuck in local minima, and it also takes a large number of iterations to converge. To overcome this limitation, the Differential Evolution method has been used to train the CPNN. Experimental results reveal that the CPNN trained using DE produces a better average percentage error with a faster convergence rate than both BP and PSO based learning, thereby improving the prediction of stock prices.
