DECPNN: A hybrid stock predictor model using Differential Evolution and Chebyshev Polynomial neural network

Abstract

Financial prediction has always been an attractive area of study, not only for academic research but also for commercial applications. In particular, forecasting the price movement of a stock market at least one day ahead has been a goal of many traders. Gaining high profit with suitable investment is the dream of every investor, but it requires proper financial knowledge, analytical capability and the ability to discover the nonlinear patterns hidden within the stock market data. Since neural networks have an inherent capability of learning and approximating nonlinear functions from historical data, they are a very attractive tool for financial prediction. In this study, a predictor model using a Chebyshev Polynomial neural network (CPNN) is developed for one-day-ahead prediction of the closing price of stock indices. The parameters of the predictor model are estimated using the Differential Evolution (DE) algorithm. Being a parallel direct search algorithm, DE has the strength of finding the global optimal solution regardless of the initial values of its few control parameters. Furthermore, the DE based algorithm aims to achieve an optimal solution with a rapid convergence rate. A comparative study of training the CPNN using DE against the traditional back propagation (BP) algorithm and the Particle swarm optimization (PSO) algorithm is also provided on two benchmark stock indices. Experimental results clearly reveal the efficiency of the hybrid predictor model in terms of two well-known error metrics, Root mean square error and Mean absolute percentage error.

Keywords

Neural network, Chebyshev Polynomial neural network, back propagation learning, Differential Evolution, Particle swarm optimization

1. Introduction

Financial prediction has always been an attractive area of study, not only for academic research but also for commercial applications. In particular, forecasting the price movement of a stock market at least one day ahead has been a goal of many traders. The motivation is obviously the hope of financial gain. However, stock prices are strongly affected by many highly interrelated external factors (economic, social, political and even psychological), coupled with the complexity of the market's internal behavior, such as nonlinear changes in the price (stock index) and highly noisy share data. Therefore, it is generally very difficult to forecast the movements of stock markets. The traditional statistical methods used in the prediction of stock prices, including moving average, exponential smoothing and linear regression, are simple. However, the significant nonlinear and time-varying characteristics of market behavior make it difficult for those traditional methods to reveal the stock market's internal pattern. Hence, developing more realistic models that predict stock prices more effectively and accurately is of great research interest in financial data mining.

In this paper, the detailed architecture and mathematical modeling of a Chebyshev Polynomial neural network (CPNN) are described for one-day-ahead prediction of the closing price of two well-known benchmark stock indices. The CPNN is a single layer, single neuron neural network like the Functional Link Artificial Neural Network (FLANN). Instead of the hidden layers and hidden neurons used in Multi Layer Perceptron (MLP) networks, the CPNN captures the hidden nonlinearity between input and output patterns by expanding the input pattern through a set of Chebyshev polynomials in the functional expansion layer. In contrast to the traditional FLANN, in which trigonometric functions are used in the functional expansion, the CPNN uses a set of Chebyshev polynomial functions. The evaluation of Chebyshev polynomials involves less computation than that of trigonometric functions; therefore, the CPNN offers faster training than the FLANN. The main advantages of the CPNN are its simple structure, faster convergence and reduced computational complexity, obtained by increasing the dimensionality of the input pattern with a set of linearly independent nonlinear functions. The back propagation algorithm is the most commonly used algorithm for training the network, but it suffers from extensive computation, relatively slow convergence and possible divergence under certain conditions. To avoid becoming trapped in local optima and to obtain maximum accuracy, the differential evolution method is proposed for training the Chebyshev FLANN and optimizing its parameters. Being a parallel direct search algorithm, DE has the strength of finding the global optimal solution regardless of the initial values of its few control parameters. Furthermore, the DE based algorithm aims to achieve an optimal solution with a rapid convergence rate.
The training of the CPNN using DE is also compared with the traditional back propagation (BP) algorithm and the Particle swarm optimization (PSO) algorithm. To test the model performance, two well-known stock market indices, the Bombay Stock Exchange index (BSE SENSEX) and the Standard & Poor's 500 (S&P 500), are taken as experimental data. The Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) are used for model validation.

The rest of the paper is organized as follows. Section 2 reviews the literature in the area of stock prediction using neural networks. Section 3 discusses the basic CPNN architecture and the learning algorithms used for it. The simulation study demonstrating the prediction performance of the proposed model is carried out in Section 4, which also compares the proposed model with back propagation and PSO based learning of the CPNN for predicting financial time series data. Finally, conclusions are drawn in the last section.

2. Literature review

The strong ability of Artificial Neural Networks (ANNs) to deal with nonlinear problems and to capture deterministic as well as random features makes them very attractive for time series modeling and financial prediction. From historical data, they can identify complex nonlinear relationships that are difficult to capture using traditional forecasting models. Over the years, many scholars have worked on predicting stock prices using neural networks and have made great achievements [2, 14, 20, 27]. It has been proved mathematically that a correctly configured neural network can approximate any function once the right data are given.

An artificial neural network is a large network of connected processing units (neurons). It is an abstract, simplified simulation of the human brain that reflects the brain's basic characteristics. Generally, a neural network has a multi-layered topology comprising an input layer, hidden layers and an output layer. The use of neural networks for time series analysis relies on historical data and a process called network learning. The application of neural networks to time series analysis and prediction involves the following standard steps: data preprocessing, selection of the network, training and testing [3]. Various ANN based methods such as the Multi Layer Perceptron (MLP), Radial Basis Function Neural Network (RBF), Local Linear Wavelet Neural Network (LLWNN), Recurrent Neural Network (RNN) and Functional Link Artificial Neural Network (FLANN) are extensively used for stock market prediction [2, 12, 15, 22, 23]. Since most ANN based applications need a substantial amount of data and consume plenty of time for training, it is always preferable to develop computationally efficient algorithms.

MLP neural networks are the most widely used by researchers because of their inherent capability to approximate any nonlinear function to a high degree of accuracy. It has been proved that an MLP neural network can approximate any complex continuous function, which enables it to learn any complicated relationship between the input and the output of a system [4, 7, 10]. However, these models have a high computational cost and need a large number of training iterations because of the presence of hidden layers [11, 28]. To overcome these limitations, a different kind of ANN, the Functional Link ANN with a single neuron and single layer architecture, has been proposed in the literature. In general, functional link based neural network models are single-layer ANN structures with a higher rate of convergence and a lower computational load than an MLP structure, while their mathematical expression and computation are evaluated in the same way as for an MLP. They are therefore very suitable for the analysis of stock data [5]. The FLANN, originally proposed by Pao, is a single layer network in which the original input pattern is expanded to a higher dimensional space using nonlinear functions. The hyper-planes generated provide greater discrimination capability in the input pattern space [12, 15, 25]. By applying this expansion, the hidden layers are eliminated, which makes the learning algorithm simpler. The wide application of FLANN models to nonlinear problems such as channel equalization, nonlinear dynamic system identification, electric load forecasting, earthquake prediction and financial forecasting has demonstrated their viability, robustness and ease of computation. Earlier FLANNs used trigonometric functions for functional expansion.
The Chebyshev polynomial neural network (CPNN) is a single layer, single neuron neural network similar to the FLANN, except that its input pattern is expanded into the nonlinear high dimensional space using Chebyshev polynomials instead of trigonometric functions. The Chebyshev FLANN performs better than the MLP and the trigonometric FLANN in terms of computational complexity, convergence rate and prediction accuracy [11, 24].

The well-known back propagation (BP) algorithm is commonly used to update the weights of the Chebyshev FLANN. The learning process of the BP algorithm involves two steps: the forward propagation of information and the backward propagation of error. In each iteration an input pattern is applied and the output is computed. After a given input sample set has been processed by the network, the outputs are compared with the target outputs. The deviations, or errors, are used to adjust the weights and bias of the network repeatedly so as to minimize a given cost function $E_{k}$, the cost function at the $k$th instant. Normally the gradient of the cost function with respect to the weights is determined and the weights are incremented by a fraction of the negative gradient at each iteration [13]. Although the BP neural network has been widely applied in many fields, the BP algorithm still has some weaknesses, such as a tendency to fall into local minima, slow convergence and poor generalization of the trained network [28]. In particular, it is hard to obtain a well-trained network if the algorithm falls into a local minimum. To avoid these common drawbacks and to increase accuracy, several improvements have been proposed, including the additional momentum method, the self-adaptive learning rate adjustment method, and the use of search algorithms such as GA, PSO and DE in the training step to optimize network parameters such as the weights and the number of hidden units in the hidden layer. In [1, 9], a method is proposed to analyze and forecast the closing price of a stock by integrating the global search of the Genetic Algorithm with the parallel processing of the back propagation algorithm, in order to overcome the slow convergence rate and to achieve an optimal solution for the neural network.
Although the genetic algorithm is a parallel search algorithm, its efficiency is low: because of its complex genetic manipulations (selection, reproduction, crossover, mutation and so on), the neural network training time grows exponentially with the scale and complexity of the problem. To increase the efficiency and accuracy of stock prediction, a PSO-BP hybrid algorithm combining Particle Swarm Optimization (PSO) with BP is proposed in [6]. A feed-forward neural network trained with a hybrid PSO-SA algorithm is proposed in [26]; it produces better classification performance than neural networks trained by simulated annealing or by the back propagation algorithm alone. The hybrid PSO-SA algorithm combines the parallel search approach of PSO with the selective random search and global search properties of simulated annealing, and thus combines the advantages of both approaches to achieve better performance. An improved PSO has been developed for training the trigonometric FLANN for data classification, which also provides better results than the MLP and the back propagation based FLANN [24, 25]. In comparison to classical evolutionary algorithms such as genetic algorithms, evolutionary programming and PSO, the differential evolution (DE) algorithm uses a rather greedy and less stochastic approach to problem solving. DE also incorporates an efficient way of self-adapting mutation using small populations. It is able to reproduce the same results consistently over many trials, unlike PSO, which is more dependent on the randomized initialization of individuals. The DE algorithm resembles the genetic algorithm in using similar operators: crossover, mutation and selection. The main difference in constructing better solutions is that genetic algorithms rely on crossover while DE relies on the mutation operation [16].
A hybrid trigonometric FLANN using GA, PSO and DE for weight optimization has been proposed for data classification in [8]. It has been shown that this classifier performs better than a FLANN based on back propagation. A comparative study between Differential Evolution (DE) and Particle Swarm Optimization (PSO) in the training and testing of a feed-forward neural network for the prediction of daily stock market prices has shown that DE provides faster convergence and better accuracy than the PSO algorithm in the prediction of fluctuating time series [21]. The Differential Evolution based FLANN has also shown its superiority over the back propagation based trigonometric FLANN in Indian stock market prediction [17].

3. DECPNN: A hybrid stock predictor model using Differential Evolution and Chebyshev Polynomial neural network

A financial time series with $N$ points is represented by $X(t)=\{x_{1},x_{2},\ldots,x_{N}\}$, where $x_{k}$ $(k=1,2,\ldots,N)$ denotes the index value at the $k$th time instant. The aim of the ANN is to predict the financial index value $x_{N+1}$ from the preceding observation sequence over a window of size $W$, i.e., $[x_{N-W+1},\ldots,x_{N-1},x_{N}]^{T}$. During training, past index data points are used as desired outputs, and the window-sized sequences preceding each of these data points are applied as the corresponding inputs. Training continues over a large number of iterations until the chosen error metric between the ANN output and the desired output attains a minimum value. Once training is complete, the network can be used for the prediction of future values.
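The windowing described above can be sketched in a few lines. This is only an illustrative sketch: the function name `make_windows` and the toy series are assumptions introduced here, not part of the original model description.

```python
def make_windows(series, window):
    """Return (inputs, targets): each input holds `window` consecutive
    values and the target is the value that immediately follows them."""
    inputs, targets = [], []
    for end in range(window, len(series)):
        inputs.append(list(series[end - window:end]))
        targets.append(series[end])
    return inputs, targets


# Toy series (hypothetical values, not taken from the paper's data sets).
series = [0.10, 0.20, 0.15, 0.30, 0.25]
inputs, targets = make_windows(series, window=2)
for x, d in zip(inputs, targets):
    print(x, "->", d)
```

With window size $W$, a series of $N$ points yields $N-W$ input and desired-output pairs.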

3.1 Chebyshev polynomial neural network (CPNN)

The Chebyshev Polynomial Neural Network (CPNN) is a single layer neural network in which the original input pattern in a lower dimensional space is expanded to a higher dimensional space by using a set of Chebyshev orthogonal functions instead of trigonometric functions [12, 18, 19, 24]. The Chebyshev polynomials are a set of polynomials denoted by $\textit{Ch}_{p}(x)$, where $p$ is the order and $-1<x<1$ is the argument of the polynomial. These polynomials are the orthogonal polynomials obtained as the solutions to the Chebyshev differential equation. The zeroth and first order Chebyshev polynomials are given by $\textit{Ch}_{0}(x)=1$ and $\textit{Ch}_{1}(x)=x$, respectively. The higher order polynomials are

$\displaystyle\begin{aligned}\textit{Ch}_{2}(x)&=2x^{2}-1\\ \textit{Ch}_{3}(x)&=4x^{3}-3x\\ \textit{Ch}_{4}(x)&=8x^{4}-8x^{2}+1\end{aligned}$ (1)

The recursive formula to generate higher order Chebyshev polynomials is given by

$\displaystyle\textit{Ch}_{p+1}(x)=2x\textit{Ch}_{p}(x)-\textit{Ch}_{p-1}(x)$ (2)
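As a quick check of the recurrence in Eq. (2), the sketch below evaluates $\textit{Ch}_{p}(x)$ iteratively and compares it with the closed forms of Eq. (1). The function name `chebyshev` and the test point $x=0.5$ are illustrative choices, not taken from the paper.

```python
def chebyshev(p, x):
    """Evaluate Ch_p(x) with the recurrence Ch_{p+1} = 2x*Ch_p - Ch_{p-1}."""
    if p == 0:
        return 1.0
    prev, curr = 1.0, x  # Ch_0(x) and Ch_1(x)
    for _ in range(p - 1):
        prev, curr = curr, 2.0 * x * curr - prev
    return curr


x = 0.5  # arbitrary point inside (-1, 1)
closed_forms = {
    2: 2 * x**2 - 1,
    3: 4 * x**3 - 3 * x,
    4: 8 * x**4 - 8 * x**2 + 1,
}
for p, expected in closed_forms.items():
    print(p, chebyshev(p, x), expected)
```

Each polynomial costs one multiplication and one subtraction beyond the previous two, which is the source of the low evaluation cost compared with trigonometric expansions.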

With order $p$, any $d$-dimensional input pattern $X=[x_{1},x_{2},\ldots,x_{d}]^{T}$ is expanded to an $m$-dimensional pattern $\textit{CHX}$ by Chebyshev functional expansion as $\textit{CHX}=[1,\textit{Ch}_{1}(x_{1}),\textit{Ch}_{2}(x_{1}),\ldots,\textit{Ch}_{p}(x_{1}),\textit{Ch}_{1}(x_{2}),\textit{Ch}_{2}(x_{2}),\ldots,\textit{Ch}_{p}(x_{2}),\ldots,\textit{Ch}_{1}(x_{d}),\textit{Ch}_{2}(x_{d}),\ldots,\textit{Ch}_{p}(x_{d})]^{T}$, where $m=pd+1$. The functional expansion with order 3 for a two-dimensional input pattern is shown in Fig. 1.

Figure 1.

Architecture of Chebyshev Polynomial neural network (CPNN).

After expansion, the weighted sum of the components of the enhanced input pattern is obtained using the following formula:

$\displaystyle\text{Weighted sum }=\sum\limits_{j=1}^{m}{w_{j}}\textit{CHX}_{j}$ (3)

The weighted sum is then passed through a hyperbolic tangent (tanh) nonlinear function to produce the output $y$. The error obtained by comparing this output with the desired output is used to update the weights of the FLANN structure by a weight updating algorithm during the learning process.
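The expansion and forward pass can be sketched as follows, assuming the layout of $\textit{CHX}$ given above (a leading 1 followed by $\textit{Ch}_{1},\ldots,\textit{Ch}_{p}$ of each input). The function names `expand` and `forward` are hypothetical, and the weights in the usage lines are placeholders rather than trained values.

```python
import math


def expand(pattern, order):
    """Chebyshev functional expansion: [1, Ch_1(x_1), ..., Ch_p(x_d)]."""
    expanded = [1.0]
    for x in pattern:
        prev, curr = 1.0, x  # Ch_0(x) and Ch_1(x)
        for _ in range(order):
            expanded.append(curr)
            prev, curr = curr, 2.0 * x * curr - prev
    return expanded


def forward(weights, pattern, order):
    """Weighted sum of the expanded pattern (Eq. 3) passed through tanh."""
    chx = expand(pattern, order)
    weighted_sum = sum(w * c for w, c in zip(weights, chx))
    return math.tanh(weighted_sum)


# Two inputs expanded with order 3 give a 7-dimensional pattern (m = 3*2 + 1).
chx = expand([0.0, 0.228184], order=3)
print([round(c, 5) for c in chx])
print(forward([0.1] * 7, [0.0, 0.228184], order=3))  # placeholder weights
```

For the input pair (0, 0.228184), the expanded values agree with the first row of Table 3 up to rounding.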

3.2 Learning methods for CPNN

Learning is a process by which the neural network adjusts itself in response to inputs in order to produce the desired outputs. During the process of learning, the network modifies its connection weights based on the inputs received so that its outputs come closer to the actual or target values. Various learning algorithms used for CPNN are as follows.

3.2.1 Back propagation learning

The BP learning algorithm is a supervised learning method used to train a neural network on a data set containing a number of input and output pairs. The learning process of the BP algorithm can roughly be divided into two phases: the forward propagation of information and the backward propagation of error. In each iteration an input pattern is applied and the output is computed. After a given input sample set has been processed by the network, the outputs are compared with the target outputs. The deviations, or errors, are used to adjust the weights and bias of the network repeatedly so as to minimize a given cost function $E_{k}$, the cost function at the $k$th instant. Normally the gradient of the cost function with respect to the weights is determined and the weights are incremented by a fraction of the negative gradient at each iteration. Let the cost function at the $k$th instant be

$\displaystyle E_{k}=\frac{1}{2}[{d_{k}-y_{k}}]^{2}=\frac{1}{2}e_{k}^{2}$ (4)

The gradient of the cost function is given by

$\displaystyle\frac{\partial E_{k}}{\partial W}=-e_{k}\frac{\partial y_{k}}{\partial W}$ (5)

Since $y_{k}$ is the tanh of the weighted sum, $\partial y_{k}/\partial w_{j}=(1-y_{k}^{2})\,\textit{Ch}_{j}(X)$, and the update rule for the weight $w_{j}$ becomes:

$\displaystyle w_{j,k+1}=w_{j,k}+\alpha e_{k}(1-y_{k}^{2})\textit{Ch}_{j}(X)$ (6)

where $w_{j,k+1}$ is the weight at the $(k+1)$th step, $w_{j,k}$ is the weight at the $k$th step, $e_{k}$ is the error at the $k$th step, $y_{k}$ is the output obtained at the $k$th step, $\textit{Ch}_{j}(X)$ is the value of the $j$th expanded unit, and $\alpha$ is the learning rate.
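A single online update of this kind might look like the sketch below. It assumes a tanh output unit, so the activation slope is taken as $1-y_{k}^{2}$; the function name `bp_step`, the example pattern, the target and the learning rate are all illustrative assumptions.

```python
import math


def bp_step(weights, chx, target, lr):
    """One gradient-descent update for a single tanh neuron.

    Returns the updated weights and the error measured before the update.
    """
    y = math.tanh(sum(w * c for w, c in zip(weights, chx)))
    e = target - y
    slope = 1.0 - y * y  # derivative of tanh at the current output
    new_weights = [w + lr * e * slope * c for w, c in zip(weights, chx)]
    return new_weights, e


# Repeatedly presenting one (hypothetical) expanded pattern shrinks the error.
weights = [0.0, 0.0, 0.0]
chx, target = [1.0, 0.5, -0.5], 0.5
errors = []
for _ in range(200):
    weights, e = bp_step(weights, chx, target, lr=0.1)
    errors.append(e)
print(errors[0], errors[-1])
```

In practice the update is applied over all training patterns for many epochs; the single-pattern loop here only shows the direction of the correction.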

3.2.2 Differential Evolution based learning

Differential Evolution (DE) is a population-based stochastic function optimizer that combines simple arithmetic operators with the classical operators of recombination, mutation and selection to evolve a randomly generated starting population toward a final solution. The popularity of DE in optimization is growing because of several advantages: it is able to reproduce the same results consistently over many trials, it can find the global minimum of a multi-modal function regardless of the initial values of its parameters, it converges quickly, and it has only a small number of parameters to set at the start of the algorithm. The optimization process is conducted by means of three main operations: mutation, crossover and selection. In each generation, the individuals of the current population become target vectors. For each target vector, the mutation operation produces a mutant vector by adding the weighted difference between two randomly chosen vectors to a third vector. The crossover operation generates a new vector, called the trial vector, by mixing the parameters of the mutant vector with those of the target vector. If the trial vector obtains a better fitness value than the target vector, the trial vector replaces the target vector in the next generation. The mutant vector can be generated using any one of the following strategies:

DE/Rand1

$\displaystyle v_{ij}=x_{r_{1}j}+F\times(x_{r_{2}j}-x_{r_{3}j})$ (7)

where $r_{1}\neq r_{2}\neq r_{3}\neq i$

DE/Rand2

$\displaystyle v_{ij}=x_{r_{1}j}+F\times(x_{r_{2}j}-x_{r_{3}j})+F\times(x_{r_{4}j}-x_{r_{5}j})$ (8)

where $r_{1}\neq r_{2}\neq r_{3}\neq r_{4}\neq r_{5}\neq i$

DE/Best1

$\displaystyle v_{ij}=x_{\textit{best},j}+F\times(x_{r_{1}j}-x_{r_{2}j})$ (9)

where $r_{1}\neq r_{2}\neq i$

DE/Best2

$\displaystyle v_{ij}=x_{\textit{best},j}+F\times(x_{r_{1}j}-x_{r_{2}j})+F\times(x_{r_{3}j}-x_{r_{4}j})$ (10)

where $r_{1}\neq r_{2}\neq r_{3}\neq r_{4}\neq i$

DE/Current to Best

$\displaystyle v_{ij}=x_{ij}+F\times(x_{\textit{best},j}-x_{ij})+F\times(x_{r_{1}j}-x_{r_{2}j})$ (11)

where $r_{1}\neq r_{2}\neq i$

The random indices $r_{1},\ldots,r_{5}$ used in mutation are mutually distinct integers generated in the range [1, np), and $F$ is the scaling factor. The flow chart of DE based learning is shown in Fig. 2.
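The five strategies of Eqs. (7) to (11) differ only in the base vector and the number of difference terms, which the following sketch makes explicit. The function name `mutate`, the strategy labels and the zero-based indexing are assumptions made for illustration; note that DE/Rand2 needs at least six individuals so that five distinct indices other than $i$ exist.

```python
import random

# Number of distinct random indices each strategy draws.
NEEDED = {"rand1": 3, "rand2": 5, "best1": 2, "best2": 4, "current-to-best": 2}


def mutate(pop, i, best, F, strategy, rng=random):
    """Return the mutant vector v_i for target index i (0-based indices)."""
    others = [k for k in range(len(pop)) if k != i]
    r = rng.sample(others, NEEDED[strategy])  # mutually distinct, all != i
    x, dims = pop, range(len(pop[i]))
    if strategy == "rand1":
        return [x[r[0]][j] + F * (x[r[1]][j] - x[r[2]][j]) for j in dims]
    if strategy == "rand2":
        return [x[r[0]][j] + F * (x[r[1]][j] - x[r[2]][j])
                + F * (x[r[3]][j] - x[r[4]][j]) for j in dims]
    if strategy == "best1":
        return [x[best][j] + F * (x[r[0]][j] - x[r[1]][j]) for j in dims]
    if strategy == "best2":
        return [x[best][j] + F * (x[r[0]][j] - x[r[1]][j])
                + F * (x[r[2]][j] - x[r[3]][j]) for j in dims]
    # "current-to-best"
    return [x[i][j] + F * (x[best][j] - x[i][j])
            + F * (x[r[0]][j] - x[r[1]][j]) for j in dims]


# Hypothetical population of six 2-dimensional individuals.
rng = random.Random(1)
pop = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(6)]
print(mutate(pop, i=0, best=3, F=0.5, strategy="rand1", rng=rng))
```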

Figure 2.

Flow chart of DE based learning of CPNN.

Steps of DE based learning are:

Step 1:

Expand the input pattern using the functional expansion block.

Step 2:

Initialize the position of each individual according to the population size.

Step 3:

Find the fitness function value of each individual, i.e. the error obtained by applying the weights specified in each individual to the expanded input and applying the nonlinear tanh() function at the output unit.

Step 4:

Create a mutated individual $v_{i}$ for each individual target vector $x_{i}$ using any one of the mutation strategies.

Step 5:

Create a new vector, called the trial vector, by mixing the parameters of the mutant vector with those of the target vector using the following crossover operation:

$\displaystyle u_{ij}=\begin{cases}v_{ij} & \text{if }\textit{rand}\,[0,1]\leqslant cr\text{ or }j=j_{\textit{rand}}\\ x_{ij} & \text{otherwise}\end{cases}$ (12)

Step 6:

Compare the fitness values of the trial and target vectors, and carry the vector with the lower fitness value into the next generation.

Step 7:

Repeat steps 4, 5 and 6 until a termination condition is reached, such as a predefined number of iterations or an error that satisfies the default precision value.

Step 8:

Fix the weights equal to the position of the vector having the minimum fitness value and use the network for testing.
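The steps above can be sketched as a generic minimizer. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the DE/Rand1 strategy, a fixed number of generations as the termination condition, and a hypothetical `fitness` callable that would, for the CPNN, return the training error of a weight vector (Step 1 would happen inside it). The `sphere` function is only a stand-in so the example runs on its own.

```python
import random


def de_minimize(fitness, dim, pop_size=10, F=0.5, cr=0.9,
                generations=100, seed=0):
    """Minimize `fitness` with DE/Rand1 mutation and binomial crossover."""
    rng = random.Random(seed)
    # Step 2: random initial positions in [-1, 1].
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
           for _ in range(pop_size)]
    # Step 3: fitness of every individual.
    fit = [fitness(x) for x in pop]
    for _ in range(generations):  # Step 7: fixed iteration budget
        next_pop, next_fit = list(pop), list(fit)
        for i in range(pop_size):
            # Step 4: mutant from three distinct vectors other than i.
            candidates = [k for k in range(pop_size) if k != i]
            r1, r2, r3 = rng.sample(candidates, 3)
            # Step 5: binomial crossover; j_rand forces one mutant component.
            j_rand = rng.randrange(dim)
            trial = []
            for j in range(dim):
                if rng.random() <= cr or j == j_rand:
                    trial.append(pop[r1][j] + F * (pop[r2][j] - pop[r3][j]))
                else:
                    trial.append(pop[i][j])
            # Step 6: greedy selection keeps the vector with lower fitness.
            f_trial = fitness(trial)
            if f_trial < fit[i]:
                next_pop[i], next_fit[i] = trial, f_trial
        pop, fit = next_pop, next_fit
    # Step 8: the best vector supplies the final weights.
    best = min(range(pop_size), key=lambda k: fit[k])
    return pop[best], fit[best]


def sphere(x):
    """Stand-in fitness; a CPNN would use its training error instead."""
    return sum(v * v for v in x)


weights, error = de_minimize(sphere, dim=3)
print(weights, error)
```

Because selection is greedy, the best fitness in the population can never get worse from one generation to the next.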

3.3 Detailed steps of prediction using DECPNN

Step 1:
Data Collection and Preparation: Initially, a sample financial data set comprising the daily closing prices is collected. The samples of the data set are then divided into training and test sets.
Step 2:
Data Normalization: To improve performance, all the inputs are first scaled between 0 and 1 using min-max normalization as follows:

$\displaystyle y=\frac{x-x_{\min}}{x_{\max}-x_{\min}}$ (13)

where $y$ is the normalized value, $x$ is the value to be normalized, $x_{\min}$ is the minimum value of the series to be normalized, and $x_{\max}$ is the maximum value of the series to be normalized.
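Eq. (13) can be applied to the ten closing prices of Table 1 as in the sketch below; the function name `min_max_normalize` is an illustrative choice, and the guard for a constant series is an added assumption rather than something stated in the text.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1] with y = (x - x_min) / (x_max - x_min)."""
    lo, hi = min(series), max(series)
    span = hi - lo
    if span == 0:
        raise ValueError("series is constant; min-max scaling is undefined")
    return [(x - lo) / span for x in series]


# Closing prices listed in Table 1.
prices = [1132.99, 1136.52, 1137.14, 1141.69, 1144.98,
          1146.98, 1136.22, 1145.68, 1148.46, 1136.03]
normalized = min_max_normalize(prices)
for p, y in zip(prices, normalized):
    print(p, round(y, 6))
```

The minimum and maximum should be taken from the training data so that the same scaling can be reused on the test set.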

Table 1
Output of data normalization

Original data	Normalized data
1132.99	0
1136.52	0.228184
1137.14	0.268261
1141.69	0.562379
1144.98	0.775048
1146.98	0.904331
1136.22	0.208791
1145.68	0.820297
1148.46	1
1136.03	0.196509

Table 2
Training set representing the input and desired output of the network

Input 1	Input 2	Desired output
0	0.228184	0.268261
0.228184	0.268261	0.562379
0.268261	0.562379	0.775048
0.562379	0.775048	0.904331
0.775048	0.904331	0.208791
0.904331	0.208791	0.820297
0.208791	0.820297	1
0.820297	1	0.196509

Table 3
Chebyshev functional expansion of training input

Sample no.	1	$\textit{Ch}_{1}(x_{1})$	$\textit{Ch}_{2}(x_{1})$	$\textit{Ch}_{3}(x_{1})$	$\textit{Ch}_{1}(x_{2})$	$\textit{Ch}_{2}(x_{2})$	$\textit{Ch}_{3}(x_{2})$
1	1	0	$-1$	0	0.228184	$-0.89586$	$-0.63703$
2	1	0.228184	$-0.89586$	$-0.63703$	0.268261	$-0.85607$	$-0.72756$
3	1	0.268261	$-0.85607$	$-0.72756$	0.562379	$-0.36746$	$-0.97568$
4	1	0.562379	$-0.36746$	$-0.97568$	0.775048	0.2014	$-0.46286$
5	1	0.775048	0.2014	$-0.46286$	0.904331	0.635629	0.245307
6	1	0.904331	0.635629	0.245307	0.208791	$-0.91281$	$-0.58997$
7	1	0.208791	$-0.91281$	$-0.58997$	0.820297	0.345775	$-0.25302$
8	1	0.820297	0.345775	$-0.25302$	1	1	1

Table 4
Initial population and its fitness value

Vector	Col 1	Col 2	Col 3	Col 4	Col 5	Col 6	Col 7	Fitness value
X1	0.3790	0.4900	$-0.0019$	0.0860	0.1609	0.0814	0.3627	0.3440
X2	0.4889	0.0277	0.4009	$-0.2533$	0.2298	0.4283	$-0.0157$	0.3601
X3	$-0.4995$	$-0.0205$	0.0747	0.1664	0.3908	0.0801	0.3449	0.8087
X4	0.3654	0.3013	0.3452	$-0.4165$	0.4823	$-0.4830$	$-0.2906$	0.2159
X5	0.1126	$-0.2722$	0.2386	0.1260	0.2690	$-0.3791$	0.0523	0.4766

A sample training data set containing the closing prices of 10 days and their corresponding normalized values is shown in Table 1.
Step 3:
Creating Network Structure: The number of input layer nodes is set by choosing a suitable window size and the technical indicators to be used for training. For example, with window size 2, the number of input layer nodes of the CPNN model is set to 2, representing the closing index of the 2 preceding days, and the number of output nodes is set to 1, representing the closing index of the 3rd day. Then a suitable order is chosen for expansion. With order 3, the 2-dimensional input patterns are expanded to patterns of dimension 7 using the basis functions of the CPNN. A weight vector of size 7 is then initialized with random values between $-1$ and 1. With window size 2, the rearranged input and output values of the training sample are given in Table 2, and the expanded input patterns with order 3 are shown in Table 3.
Step 4:
Training and Testing the Network: For training, data from the training set are fed to the network so that the network can adjust its weights. The DE based learning process is conducted by means of three main operations: mutation, crossover and selection. After functional expansion, the position of each individual, which specifies the weights of the network, is randomly initialized within [$-1$, 1] according to the population size. Then the fitness function value of each individual is calculated, i.e. the RMSE obtained by applying the weights specified in that individual to the expanded input and applying the nonlinear tanh() function at the output unit. In each generation, the individuals of the current population become target vectors. A mutated individual $v_{i}$ is created for each target vector $x_{i}$ using any one of the mutation strategies. A trial vector is then created by mixing the parameters of the mutant vector with those of the target vector using the crossover operation. The fitness values of the trial and target vectors are compared, and the vector with the lower fitness value is selected for the next generation. The process of mutation, crossover and selection is repeated until a termination condition is reached, such as a predefined number of iterations or an error that satisfies the default precision value. Finally, the weights are fixed to the position of the vector having the minimum fitness value and the network is used for testing.

The training details of the DE based CPNN for one iteration are described as follows.

With population size 5, the initial population and the fitness value obtained for each individual are shown in Table 4.

For the first individual, with mutation scale 0.5 and crossover rate 0.9, the outputs of mutation, crossover and selection are as follows:

Mutation:

$\bm{X_{1}}=$ 0.3790 0.4900 $-0.0019$ 0.0860 0.1609 0.0814 0.3627

$\bm{r_{1}}=4$, $\bm{r_{2}}=2$, $\bm{r_{3}}=3$, $\bm{F}=0.5$

$V_{1}=X_{4}+F(X_{2}-X_{3})$

$=$ (0.3654 0.3013 0.3452 $-0.4165$ 0.4823 $-0.4830$ $-0.2906$) $+$ 0.5{(0.4889 0.0277 0.4009 $-0.2533$ 0.2298 0.4283 $-0.0157$) $-$ ($-0.4995$ $-0.0205$ 0.0747 0.1664 0.3908 0.0801 0.3449)}

$\bm{V_{1}}=$ 0.8596 0.3254 0.5083 $-0.6264$ 0.4018 $-0.3089$ $-0.4709$

Crossover:

$\bm{cr}=$ 0.9

$\bm{X_{1}}=$ 0.3790 0.4900 $-0.0019$ 0.0860 0.1609 0.0814 0.3627

$\bm{V_{1}}=$ 0.8596 0.3254 0.5083 $-0.6264$ 0.4018 $-0.3089$ $-0.4709$

$r=$ 0.6952 0.5358 0.1239 0.8530 0.2703 0.5650 0.4170

$j_{\textit{rand}}=$ 4 4 4 7 2 5 2

$\bm{U_{1}}=$ 0.8596 0.3254 0.5083 $-0.6264$ 0.4018 $-0.3089$ $-0.4709$

Selection:

$fv(U_{1})=$ 0.2299

$fv(X_{1})=$ 0.3440

Since the fitness value of the trial vector is less than that of the target vector, the trial vector is chosen as the target vector for the next generation.
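The mutation and crossover arithmetic of this worked example can be reproduced with a few lines of code. The vectors, $F$, $cr$ and the random numbers $r$ are the ones listed above; the variable names are illustrative, and the crossover here applies only the $r\leqslant cr$ test, which is an assumption that already selects every mutant component in this case.

```python
# Individuals taken from Table 4.
X1 = [0.3790, 0.4900, -0.0019, 0.0860, 0.1609, 0.0814, 0.3627]
X2 = [0.4889, 0.0277, 0.4009, -0.2533, 0.2298, 0.4283, -0.0157]
X3 = [-0.4995, -0.0205, 0.0747, 0.1664, 0.3908, 0.0801, 0.3449]
X4 = [0.3654, 0.3013, 0.3452, -0.4165, 0.4823, -0.4830, -0.2906]
F, cr = 0.5, 0.9

# Mutation with r1 = 4, r2 = 2, r3 = 3: V1 = X4 + F * (X2 - X3).
V1 = [a + F * (b - c) for a, b, c in zip(X4, X2, X3)]

# Crossover: take the mutant component wherever the random number is <= cr.
r = [0.6952, 0.5358, 0.1239, 0.8530, 0.2703, 0.5650, 0.4170]
U1 = [v if rj <= cr else x for v, x, rj in zip(V1, X1, r)]

print(["%.4f" % v for v in V1])
print(U1 == V1)  # every r is below cr, so the trial equals the mutant
```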

Table 5
Output of iteration 1

Vector	Col 1	Col 2	Col 3	Col 4	Col 5	Col 6	Col 7	Fitness value
X1	0.8596	0.3254	0.5083	$-0.6264$	0.4018	$-0.3089$	$-0.4709$	0.2299
X2	0.4889	0.0277	0.4009	$-0.2533$	0.2298	0.4283	$-0.0157$	0.3601
X3	0.3173	$-0.0205$	$-0.0297$	0.0044	0.2872	$-0.3742$	0.2253	0.2629
X4	0.3654	0.3013	0.3452	$-0.4165$	0.4823	$-0.4830$	$-0.2906$	0.2159
X5	0.9282	0.2129	0.3626	$-0.2935$	0.1148	0.4290	$-0.0068$	0.3393

Figure 3.
RMSE comparison of CPNN with BP, PSO, DE based learning during training of S&P500 data.

Figure 4.
RMSE comparison of CPNN with BP, PSO, DE based learning during training of BSE data.

The first individual of the next generation is therefore $\bm{X_{1}}(t+1)=$ 0.8596 0.3254 0.5083 $-0.6264$ 0.4018 $-0.3089$ $-0.4709$

Applying mutation, crossover and selection to each individual, the new individuals generated and their corresponding fitness values are shown in Table 5. The same procedure of generating new individuals is continued until a termination condition is satisfied.

In the testing process, the network architecture is validated on the sample data, i.e. the performance of the prediction system is evaluated using common error measures, the Root Mean Square Error (RMSE) and the Mean Absolute Percentage Error (MAPE). The RMSE and MAPE are defined as:

$\displaystyle\textit{RMSE}=\sqrt{\frac{1}{n}\sum\limits_{k=1}^{n}(y_{k}-\hat{y}_{k})^{2}}$ (14)

$\displaystyle\textit{MAPE}=\frac{1}{n}\sum\limits_{k=1}^{n}\left|\frac{y_{k}-\hat{y}_{k}}{y_{k}}\right|\times 100$ (15)

where $y_{k}$ is the actual closing price on the $k$th day, $\hat{y}_{k}$ is the predicted closing price on the $k$th day, and $n$ is the number of test data.
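Both error measures are straightforward to compute; the sketch below follows Eqs. (14) and (15) directly. The function names and the two-point example are illustrative only and are not results from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, Eq. (14)."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error in percent, Eq. (15).

    Every actual value must be non-zero.
    """
    n = len(actual)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / n


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
actual = [100.0, 200.0]
predicted = [110.0, 190.0]
print(rmse(actual, predicted))  # both errors are 10, so the RMSE is 10
print(mape(actual, predicted))  # mean of 10% and 5%
```

MAPE should be computed on the original price scale, since min-max normalized series contain a zero for which the ratio is undefined.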
3.4 Computational complexity of CPNN

Original data	Normalized data
1132.99	0
1136.52	0.228184
1137.14	0.268261
1141.69	0.562379
1144.98	0.775048
1146.98	0.904331
1136.22	0.208791
1145.68	0.820297
1148.46	1
1136.03	0.196509

Input	Desired output
0	0.228184	0.268261
0.228184	0.268261	0.562379
0.268261	0.562379	0.775048
0.562379	0.775048	0.904331
0.775048	0.904331	0.208791
0.904331	0.208791	0.820297
0.208791	0.820297	1
0.820297	1	0.196509

Sample no.	Col 1	Col 2	Col 3	Col 4	Col 5	Col 6	Col 7
1	1	0	-1	0	0.228184	-0.89586	-0.63703
2	1	0.228184	-0.89586	-0.63703	0.268261	-0.85607	-0.72756
3	1	0.268261	-0.85607	-0.72756	0.562379	-0.36746	-0.97568
4	1	0.562379	-0.36746	-0.97568	0.775048	0.2014	-0.46286
5	1	0.775048	0.2014	-0.46286	0.904331	0.635629	0.245307
6	1	0.904331	0.635629	0.245307	0.208791	-0.91281	-0.58997
7	1	0.208791	-0.91281	-0.58997	0.820297	0.345775	-0.25302
8	1	0.820297	0.345775	-0.25302	1	1	1
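The expanded patterns in the table above are consistent with a constant term followed by the Chebyshev polynomials $T_1$ to $T_3$ of each input, generated by the recurrence $T_{k+1}(x) = 2xT_k(x) - T_{k-1}(x)$. The sketch below follows that reading; the function name is hypothetical and the column layout is inferred from the table rather than taken from a stated definition.

```python
def chebyshev_expand(inputs, order):
    """Return [1, T_1(x1)..T_order(x1), T_1(x2)..T_order(x2), ...] for the given inputs."""
    expanded = [1.0]
    for x in inputs:
        t_prev, t_curr = 1.0, x  # T_0(x) and T_1(x)
        for _ in range(order):
            expanded.append(t_curr)
            # Chebyshev recurrence: T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x)
            t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return expanded

# First sample of the table: inputs 0 and 0.228184, order 3
row = chebyshev_expand([0.0, 0.228184], order=3)
print([round(v, 5) for v in row])
```

With two inputs and order 3 the expanded pattern has $2 \times 3 + 1 = 7$ components, matching the seven columns of the table.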

Vector	Col 1	Col 2	Col 3	Col 4	Col 5	Col 6	Col 7	Fitness value
X1	0.3790	0.4900	-0.0019	0.0860	0.1609	0.0814	0.3627	0.3440
X2	0.4889	0.0277	0.4009	-0.2533	0.2298	0.4283	-0.0157	0.3601
X3	-0.4995	-0.0205	0.0747	0.1664	0.3908	0.0801	0.3449	0.8087
X4	0.3654	0.3013	0.3452	-0.4165	0.4823	-0.4830	-0.2906	0.2159
X5	0.1126	-0.2722	0.2386	0.1260	0.2690	-0.3791	0.0523	0.4766

Vector	Col 1	Col 2	Col 3	Col 4	Col 5	Col 6	Col 7	Fitness value
X1	0.8596	0.3254	0.5083	-0.6264	0.4018	0.3089	-0.4709	0.2299
X2	0.4889	0.0277	0.4009	-0.2533	0.2298	0.4283	-0.0157	0.3601
X3	0.3173	-0.0205	-0.0297	0.0044	0.2872	-0.3742	0.2253	0.2629
X4	0.3654	0.3013	0.3452	0.4165	0.4823	-0.4830	-0.2906	0.2159
X5	0.9282	0.2129	0.3626	-0.2935	0.1148	0.4290	-0.0068	0.3393

CPNN, having a single-layer neural network structure, offers reduced computational complexity compared to a Multilayer Perceptron (MLP) network. The reduction is achieved by using a simple polynomial expansion in place of the multiple hidden layers, and the neurons in each layer, that an MLP employs. Consider a simple MLP with a single hidden layer of $m$ neurons, $n$ input layer nodes and 1 output neuron: the number of weights to be estimated during the learning stage is [ $m(n+1)+m+1$ ]. For a CPNN of order $m$ with $n$ input layer nodes and 1 output neuron, the number of weights to be estimated is [ $nm+1$ ], which is $2m$ fewer than for an MLP with a single hidden layer of $m$ neurons.
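The two weight counts can be compared numerically. The sketch below uses 6 inputs and order 3, the CPNN configuration of the experiments; the hidden layer size of 3 for the MLP is a hypothetical choice made only so that $m$ is the same in both formulas, and the function names are illustrative.

```python
def mlp_weight_count(n_inputs, n_hidden):
    """Weights and biases of an MLP with one hidden layer and one output neuron."""
    return n_hidden * (n_inputs + 1) + n_hidden + 1

def cpnn_weight_count(n_inputs, order):
    """Weights of a CPNN: one per expanded term plus one for the constant term."""
    return n_inputs * order + 1

n, m = 6, 3
print(mlp_weight_count(n, m))   # m(n+1) + m + 1
print(cpnn_weight_count(n, m))  # nm + 1
```

For $n=6$ and $m=3$ the CPNN count equals the weight vector size of 19 used in the experiments.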

Figure 5.
Output of DECPNN for S&P500 data set with prediction horizon 1.

Figure 6.
Output of DECPNN for BSE data set with prediction horizon 1.

Table 6

Performance comparison of CPNN with different learning methods

Data set	Type of learning for CPNN	RMSE test	MAPE test
BSE	BP	0.0623	6.4172
BSE	PSO	0.0509	5.3158
BSE	DE	0.0449	4.5213
S&P500	BP	0.0480	1.7321
S&P500	PSO	0.0387	1.3956
S&P500	DE	0.0365	1.2790

4. Experimental result analysis

In this study, daily closing prices from two popular stock indices, the Bombay Stock Exchange (BSE) and the Standard & Poor's 500 (S&P500), are used to compare the performance of CPNN with BP, PSO and DE based learning. The BSE data set contains 1200 samples from 1st January 2004 to 31st December 2008, and the S&P500 data set contains 653 samples from 4th January 2010 to 12th December 2012. Both data sets are divided into two parts: an in-sample set used for training and validation, and an out-of-sample set used for testing. For the BSE data set, 800 patterns are used for training and validation and the remaining 400 patterns for testing. Similarly, for the S&P500 data set, 400 patterns are used for training and validation, leaving the remaining 253 samples for testing. A 5-fold cross validation is used to measure the generalization ability of the network: the in-sample data set is divided into 5 groups, of which 4 are used for training and the other for validation. The entire process is carried out for 20 independent runs. Finally, the trained network is applied to the out-of-sample data. To improve performance, all inputs are first scaled between 0 and 1 using the min-max normalization given in Eq. (13).
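The 5-fold split of the in-sample data can be sketched as below for the 800 BSE patterns. The function name is hypothetical, and the use of contiguous, unshuffled folds is an assumption, since the text does not state how the 5 groups are formed.

```python
def k_fold_indices(n_samples, k=5):
    """Split indices 0..n_samples-1 into k contiguous folds (assumed, not shuffled)."""
    fold_size = n_samples // k
    folds = []
    for i in range(k):
        start = i * fold_size
        # The last fold absorbs any remainder when n_samples is not divisible by k
        stop = n_samples if i == k - 1 else start + fold_size
        validation = list(range(start, stop))
        training = list(range(0, start)) + list(range(stop, n_samples))
        folds.append((training, validation))
    return folds

# In-sample BSE patterns: 4 groups (640 patterns) train, 1 group (160 patterns) validates
folds = k_fold_indices(800)
for training, validation in folds:
    print(len(training), len(validation))
```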

The direct method of prediction is used in this study. With window size 5 and prediction horizon 1, the input patterns and their corresponding output patterns are prepared from the normalized data set. Accordingly, the number of input layer nodes of the CPNN models is set to 6, representing the closing index of the previous 5 days and their simple moving average, and the number of output nodes is set to 1, representing the closing index of the 6th day. With order 3, the 6-dimensional input patterns are expanded to patterns of dimension 19 using the corresponding basis functions in the functional expansion block of the CPNN. A population of size 20 is then randomly initialized with values between $-1$ and 1, where each individual represents the weight vector of size 19 for the network. The mutation scale and crossover rate for DE based learning are set to 0.5 and 0.9 respectively. For network convergence, the number of iterations is set to 500 and the minimum error rate to 0.001. At the end of training, the weights are frozen for testing the network. The Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) are used to compare the performance of CPNN with BP learning, PSO based learning and DE based learning in predicting the closing price of the BSE and S&P500 indices one day in advance.
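The construction of the 6-dimensional input patterns can be sketched as follows. The function name is hypothetical, and placing the simple moving average after the five closing values is an assumption, since the ordering of the six inputs is not stated. The ten normalized values are those of the worked example in Section 3.

```python
def build_patterns(series, window=5, horizon=1):
    """Pair each window of closes, plus its simple moving average, with the next close."""
    patterns = []
    for start in range(len(series) - window - horizon + 1):
        closes = list(series[start:start + window])
        sma = sum(closes) / window
        target = series[start + window + horizon - 1]
        patterns.append((closes + [sma], target))
    return patterns

# Ten normalized closing values from the worked example
normalized = [0.0, 0.228184, 0.268261, 0.562379, 0.775048,
              0.904331, 0.208791, 0.820297, 1.0, 0.196509]
patterns = build_patterns(normalized)

order = 3
expanded_dim = len(patterns[0][0]) * order + 1  # 6 inputs * order 3 + constant term
print(len(patterns), len(patterns[0][0]), expanded_dim)
```

Each individual in the DE population then holds one weight per expanded component, which gives the weight vector size of 19 quoted above.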

Figures 3 and 4 show the RMSE obtained during training of CPNN using back propagation, PSO and DE based learning for the S&P500 and BSE data sets respectively. The RMSE and MAPE values obtained during testing on the BSE and S&P500 data sets using CPNN with the different learning algorithms are shown in Table 6. Figures 5 and 6 show the one-day-ahead prediction for the S&P500 and BSE data sets using the DE-trained CPNN (DECPNN).

5. Conclusion

The Chebyshev Polynomial Neural Network is a single-layer ANN structure, similar to the FLANN except that its input pattern is expanded into a nonlinear high-dimensional space using a set of orthogonal Chebyshev polynomials instead of trigonometric functions. The functional expansion block increases the dimensionality of the input vector, which eliminates the hidden layers used in an MLP while still allowing complex nonlinear problems to be solved by generating nonlinear decision boundaries. The prime advantage of CPNN is its reduced computational complexity without compromising performance. This reduction is achieved by using a simple polynomial expansion in place of the multiple hidden layers, and the neurons in each layer, employed in a Multilayer Perceptron network; the polynomial functional expansion block reduces the number of weights to be estimated during training compared to the weights and bias values of the MLP. The traditional BP algorithm used to train the CPNN may get stuck in local minima and also takes a large number of iterations to converge. To overcome this limitation, the Differential Evolution method has been used to train the CPNN. Experimental results show that the CPNN trained using DE produces lower RMSE and MAPE on both test sets, with a faster convergence rate, than both BP and PSO based learning, thereby improving the prediction of stock prices.
