Hybrid modified weighted water cycle algorithm and Deep Analytic Network for forecasting and trend detection of forex market indices

Abstract

This paper presents forecasting and trend analysis of foreign currency exchange rates in the financial market using a hybrid Deep Analytic Network (DAN) optimized by a modified water cycle algorithm, called Weighted WCA (WWCA), which has better generalization capability than the traditional WCA. DAN comprises several stacked Kernel Ridge Regression (KRR) autoencoders in a multilayer nonlinear regression architecture that provides better generalization and accuracy using a regularized least squares technique. Further, DAN with a wavelet kernel function is particularly attractive for its strong data fitting and generalization ability, along with its simple execution procedure, high speed, and better performance in comparison to the least squares support vector machine (LSSVM). The output from the DAN is fed to a weighted KRR module to reject noise and outliers in the data and to make DAN a more robust predictor of the forex markets. To obtain optimal values of the wavelet kernel parameters, a modified metaheuristic water cycle algorithm, the proposed WWCA, is utilized. Applications of this new approach to forex rate prediction and trend analysis on three stock markets provide successful results and validate its superiority over some well-known approaches such as ANN, SVM, Naïve-Bayes, and ELM.

Keywords

Forecasting, Deep Analytic Network, wavelet kernel function, classification, weighted water cycle algorithm, trend detection

1. Introduction

The forex market is a major financial market in the world economy, and forex rates are key indices that play a major role in economic growth. Forex rates are affected by political, economic, and psychological factors that are correlated with each other. The non-stationary and highly fluctuating nature of the forex market makes it complicated to study future indices in financial markets. Forex price forecasting is therefore treated as a difficult task, and there are numerous studies on currency exchange market prediction.

It is well known that machine learning techniques such as artificial neural networks (ANNs) are capable of capturing the nonlinearities in highly noisy data series and have shown better accuracy than linear models. Hybrid models are more capable than individual models. Hybrid methods that integrate ANNs, such as ANN-autoregressive moving average (ANN-ARMA) [1], ANN with fuzzy logic system (ANFIS) with generalized autoregressive conditional heteroskedasticity (GARCH) [2], ANN-GARCH [3], ANN with support vector machine (ANN-SVM), radial basis function neural network (RBFNN) with ARMA and wavelet transform (WT), wavelet transform with recurrent neural network (RNN) [4], multiple ANN classifiers [5], ANN-SVM [6], fuzzy interval network with statistical features [7], functional link artificial neural network [8], ensemble-based multi-objective neural classifier [9], and neural networks for currency exchange rate and volatility prediction [10, 11], have been applied successfully in the stock, forex, and energy markets, where linear models could not beat the integrated methods. Further, the extreme learning machine (ELM) has been developed to train single-layer feedforward neural networks; it is a non-iterative technique that avoids iterative learning and therefore trains very fast. In contrast, traditional ANNs commonly use gradient-based learning, which leads to slow convergence, is easily trapped in local minima, and suffers from overfitting. In ELM, the learning process starts with random feature mapping, and the output layer parameters are then solved by a pseudo-inverse least squares solution, which bypasses the usual iterative procedure. However, the choice of the number of hidden-layer neurons and of the activation function is still an unresolved problem. The kernel-based extreme learning machine (KELM) has been developed as a solution.

Non-iterative methods such as ELM, KELM, and kernel ridge regression (KRR) are in great demand for their better generalization capability. In recent years, different KRR-based methods have been developed for the solution of regression/classification problems in various research fields [12, 13, 14, 15, 16, 17, 18, 19, 20]. These methods are all single-layer structures.

In recent years, the multilayer (deep) neural network has received great attention as a machine learning approach, and deep neural networks (DNNs) are used in many tasks. With multiple hidden layers, deep learning yields meaningful feature representations [21] and performs feature representation learning [22]. In a stacked autoencoder (AE), several AEs are stacked together to form a multilayer neural network that learns deep, meaningful features [23]. Deep neural networks have proved very successful in several tasks, but they are not the solution for all types of tasks [24]. Traditional DNNs such as CNN, LSTM, and other architectures have parameters that are optimized through training by backpropagation (BP). DNNs are very effective for regression and classification problems, but because of the iterative learning process they suffer from large data requirements to counter overfitting, local minima, slow convergence, high computational complexity, network adaptability and scalability problems, and high sensitivity to the learning rate setting. Besides, DNNs require a correct choice of the number of hidden layers, the number of neurons in each hidden layer, the activation function, etc. to provide good prediction accuracy. Over the last few years, multilayer models, specifically stacked autoencoder based multilayer structures, have received increasing attention. Such structures can lead to better feature representation for regression/classification problems [25] and dimensionality reduction [26].

Thus, in this work, a DAN comprising a stack of kernel machine (KRR) autoencoders in a multilayer architecture is proposed for currency exchange rate prediction and trend detection. Each kernel machine autoencoder can be trained independently using generalized least squares without the BP algorithm. In contrast to BP-trained networks, the KRR module maps the set of predictors into a high-dimensional feature space through a kernel that satisfies the Mercer condition, which avoids the dimensionality problem. This approach bypasses the computational complexity of standard ridge regression when the number of predictor variables is large compared to the number of time series observations. Due to its simple execution procedure, high speed, generalization capability, and accuracy, KRR has attracted great attention in various fields for solving regression and classification problems. To deal with highly complex data and high-level feature reduction, hierarchical structure learning by multilayer representation is adopted in this work, using KRR-based stacked autoencoders followed by a weighted KRR classifier.

Recently, a robust formulation of ELM has been outlined in [27] to take care of the noise and outliers in the data samples. In a similar way, the improved weighted DAN presented in this paper uses a variable weighting factor based on the error residuals to provide better accuracy by rejecting noise and outliers in financial data. The wavelet kernel is chosen for the kernel machine because it fits the data very closely and is known to give the DAN good generalization and stability. Further, to improve forecasting accuracy, the kernel parameters are optimized by an improved water cycle algorithm, which achieves higher accuracy due to its fast convergence to the global optimum. The water cycle algorithm has been implemented in [28, 29] for the optimal solution of unknown parameters. In the proposed variant, a weighted minimum of chaotic mapping functions generates the random series instead of random numbers in [0, 1]. To implement the proposed hybrid method, we have used three foreign currency exchange (forex) datasets, since forex prediction is a challenging and important issue in the financial market. The forecasting results show that WWCA-Wavelet-DAN yields better prediction performance in terms of RMSE, MAE, and MAPE than some well-established methods for daily exchange rate prediction.

In the financial market, forex trend detection is also important. The trend depends on the behavior of past forex prices and can be determined using technical indicators, which are based on past price values such as the closing, open, low, and high prices of the forex market. These indicators describe the general movement of the forex market, which is nonlinear in nature and needs attention for future trend prediction. Contributions on trend analysis in financial markets based on technical analysis are presented in [30, 31, 32, 33, 34, 35, 36, 37]. With an aim to develop a good forecasting tool for forex price and trend analysis, in this study we have proposed the robust non-iterative kernel ridge regression with wavelet kernel (Wavelet-DAN), with kernel parameters optimized by an improved water cycle algorithm, i.e. WWCA. The price movement direction is determined along with the future price prediction. To accomplish this task, the proposed WWCA-Wavelet-DAN approach uses six technical indicators computed from past values: moving average (MA), moving average convergence and divergence (MACD), relative strength index (RSI), %K indicator, %D indicator, and Larry Williams' R (R indicator). These are fed as inputs to the proposed system for classifying the trend.

The details of the proposed method and its implementation are presented in the subsequent sections. The remainder of the paper is organized as follows. Section 2 describes the Wavelet Kernel Ridge Regression proposed for forex price prediction. Section 3 explains the modified WWCA used for optimizing the kernel parameters. Section 4 explains the overall framework of the proposed prediction method, the forex datasets considered for day-ahead price prediction and trend detection, the input variables used to implement the proposed model, and the performance metrics used for evaluating accuracy; it also describes the numerical results of forex price prediction. Section 5 summarizes the forex trend detection procedure and its prediction results, with discussion and comparisons to some popular methods. Section 6 draws the overall conclusion, followed by the relevant references.

2. Deep robust analytic network for forex rate prediction

The DAN architecture comprises several layers of stacked KRR autoencoders, with the final layer being a weighted KRR that suppresses the effect of outliers in the data. Nonlinear kernel ridge regression (KRR) utilizes the well-established kernel trick that transforms the time series data into a high dimensional feature space, thereby providing linear separability, fast processing speed, generalization, and accuracy. It uses kernel functions that satisfy Mercer's condition; these can be either local or global depending on their data fitting capabilities. The $m$ dimensional data input vector is represented as $X^{0}=\{x_{i}^{0}\}=[x_{i1}^{0},x_{i2}^{0},x_{i3}^{0},\ldots,x_{im}^{0}]$ , $i=$ 1, 2, 3, …, $N$ , where $N$ is the total number of training samples. Thus, if the DAN comprises $L$ layers of stacked KRR autoencoders, the output of the $l$ th layer is obtained as $X^{l}=\{x_{i}^{l}\}=[x_{i1}^{l},x_{i2}^{l},x_{i3}^{l},\ldots,x_{im}^{l}]$ , $l=$ 1, 2, 3, …, $L$ . Using the kernel trick, the input data matrix is transformed by a mapping function $\phi(.)$ to a higher dimensional space as $\phi(X^{0})$ , and its kernel representation is $K^{1}=\phi^{T}(X^{0})\phi(X^{0})$ . In a similar way, the kernel matrix for the $l$ th layer KRR autoencoder is obtained as $K^{l}=\phi^{T}(X^{l-1})\phi(X^{l-1})$ and the output is $X^{l}$ . For the $l$ th layer KRR autoencoder, training is initiated by minimizing an objective function using generalized least squares in the form

$\displaystyle\text{Minimize}\ J_{\textit{lKRR}}=\frac{1}{2}\left\|\beta^{l}\right\|^{2}+\frac{1}{2}C\sum\limits_{i=1}^{N}\left\|\xi_{i}^{l}\right\|^{2}$ (1)

$\displaystyle\text{subject to}\ x_{i}^{l-1}-\phi_{i}^{l}(x_{i}^{l-1})\cdot\beta^{l}=\xi_{i}^{l},\quad i=1,2,\ldots,N$ (2)

where $\beta^{l}$ is the weight vector for the $l$ th layer KRR autoencoder, expressed as $\beta^{l}=[\beta_{1}^{l},\beta_{2}^{l},\ldots,\beta_{N}^{l}]^{T}$ , and $C$ is a positive constant that penalizes the squared error; it is known as the regularization parameter and needs to be fixed by the user. The value of $C$ is set as $C=2^{\gamma}$ , $\gamma>$ 0, and $\xi_{i}^{l}$ is the model error for the $i$ th pattern. To solve the optimization problem in Eq. (1), Lagrange multipliers $\alpha_{i}^{l}$ are applied to yield the following expression:

$\displaystyle\text{Minimize}\ J_{\textit{lKRR}}=\frac{1}{2}\left\|\beta^{l}\right\|^{2}+\frac{1}{2}C\sum\limits_{i=1}^{N}\left\|\xi_{i}^{l}\right\|^{2}+\sum\limits_{i=1}^{N}\alpha_{i}^{l}\left(x_{i}^{l-1}-\phi_{i}^{l}(x_{i}^{l-1})\cdot\beta^{l}-\xi_{i}^{l}\right)$ (3)

Differentiating $J_{\textit{lKRR}}$ with respect to $\beta^{l}$ , $\alpha^{l}$ , and $\xi^{l}$ and equating the derivatives to zero, the following expressions are obtained:

$\displaystyle\frac{\partial J_{\textit{lKRR}}}{\partial\beta^{l}}=0\to\beta^{l}=\Phi^{l}\alpha^{l}$ (4)

$\displaystyle\frac{\partial J_{\textit{lKRR}}}{\partial\alpha_{i}^{l}}=0\to(\Phi^{l})^{T}\beta^{l}-X^{l-1}+\xi^{l}=0$ (5)

$\displaystyle\frac{\partial J_{\textit{lKRR}}}{\partial\xi_{i}^{l}}=0\to\alpha^{l}=C\xi^{l}$ (6)

Thus, solving Eqs (4) to (6), the values of $\alpha^{l}$ and $\beta^{l}$ are obtained as:

$\displaystyle\alpha^{l}=\left((\Phi^{l})^{T}\Phi^{l}+\frac{I}{C}\right)^{-1}(X^{l-1})^{T}$ (7)

$\displaystyle\beta^{l}=\Phi^{l}\left((\Phi^{l})^{T}\Phi^{l}+\frac{I}{C}\right)^{-1}(X^{l-1})^{T}$ (8)
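The step from Eqs (4) to (6) to Eqs (7) and (8) can be written out explicitly. The short derivation below uses the same symbols and reads the data matrix in Eq. (5) in the orientation used in Eq. (7), i.e. as $(X^{l-1})^{T}$ ; it is a worked restatement and not an additional result.

```latex
% Substitute Eq. (4), \beta^{l} = \Phi^{l}\alpha^{l}, and Eq. (6), \xi^{l} = \alpha^{l}/C, into Eq. (5):
(\Phi^{l})^{T}\Phi^{l}\alpha^{l} + \frac{1}{C}\alpha^{l} = (X^{l-1})^{T}
\;\Longrightarrow\;
\left((\Phi^{l})^{T}\Phi^{l} + \frac{I}{C}\right)\alpha^{l} = (X^{l-1})^{T}
\;\Longrightarrow\;
\alpha^{l} = \left((\Phi^{l})^{T}\Phi^{l} + \frac{I}{C}\right)^{-1}(X^{l-1})^{T}
% Eq. (8) then follows from Eq. (4):
\beta^{l} = \Phi^{l}\alpha^{l}
          = \Phi^{l}\left((\Phi^{l})^{T}\Phi^{l} + \frac{I}{C}\right)^{-1}(X^{l-1})^{T}
```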

Using Kernel matrix $\beta^{l}$ is obtained as:

$\displaystyle\beta^{l}=\Phi^{l}\left(K^{l}+\frac{I}{C}\right)^{-1}(X^{l-1})^{T}$ (9)

where $K^{l}=(\Phi^{l})^{T}\Phi^{l}$ and the elements of the kernel matrix are obtained as:

$\displaystyle k_{i,j}^{l}=(\phi_{i}^{l})^{T}\phi_{j}^{l},\quad\Phi^{l}=[\phi_{1}^{l},\phi_{2}^{l},\ldots,\phi_{N}^{l}]$ (10)

Hence the output from $l$ th KRR-AE is obtained as:

$\displaystyle X^{l}=(\Phi^{l})^{T}\beta^{l}=(\Phi^{l})^{T}\Phi^{l}\left(K^{l}+\frac{I}{C}\right)^{-1}(X^{l-1})^{T}=\hat{K}^{l}\left(K^{l}+\frac{I}{C}\right)^{-1}(X^{l-1})^{T}$ (11)

Thus for $l=L$ , $X^{L}$ is obtained as:

$\displaystyle X^{L}=(\Phi^{L})^{T}\beta^{L}=\hat{K}^{L}\left(K^{L}+\frac{I}{C}\right)^{-1}(X^{L-1})^{T}$ (12)
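The closed-form layer transform of Eq. (11) can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function name `krr_autoencoder_layer` is hypothetical, samples are stored as rows (so the transposes of Eq. (11) are absorbed into the array layout), and the kernel matrices are passed in precomputed so that any Mercer kernel can be used.

```python
import numpy as np

def krr_autoencoder_layer(K_train, K_hat, X_prev, C):
    """Closed-form KRR-AE transform in the spirit of Eq. (11).

    K_train : (N, N) kernel matrix on the training inputs of this layer.
    K_hat   : (M, N) kernel between the samples to transform and the training inputs.
    X_prev  : (N, m) training inputs of this layer, one sample per row.
    C       : positive regularization constant.
    """
    N = K_train.shape[0]
    # Solve (K + I/C) A = X_prev instead of forming the inverse explicitly.
    A = np.linalg.solve(K_train + np.eye(N) / C, X_prev)
    return K_hat @ A

# Sanity check with an identity kernel matrix (illustrative only):
# (I + I/C)^{-1} X = C / (C + 1) * X, so C = 4 shrinks the data by a factor 0.8.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
K = np.eye(3)
X_next = krr_autoencoder_layer(K, K, X, C=4.0)
```

Using `np.linalg.solve` rather than an explicit inverse is a numerical design choice of this sketch; the source only states the closed-form expression.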

After the input data has been transformed by passing it through $L-1$ successive KRR-AEs, the final predictor in the $L$ th layer comprises a weighted KRR to take care of the outliers in the input samples. The objective function is obtained using the error vector $\xi^{L}$ as:

$\displaystyle\text{Minimize}\ J_{\textit{LKRR}}=\frac{1}{2}\left\|\beta^{L}\right\|^{2}+\frac{1}{2}C\sum\limits_{i=1}^{N}w_{i}\left\|\xi_{i}^{L}\right\|^{2}$ (13)

$\displaystyle\text{subject to}\ t_{i}-\beta^{L}\cdot\phi_{i}^{L}(x_{i}^{L})=\xi_{i}^{L},\quad i=1,2,\ldots,N$ (14)

where the target $T=[t_{1},t_{2},t_{3},\ldots,t_{i},\ldots,t_{N}]^{T}$ , $i=$ 1, 2, 3, …, $N$ .

As in the earlier formulation, Lagrange multipliers are applied to yield the following expression:

$\displaystyle\text{Minimize}\ J_{\textit{LKRR}}=\frac{1}{2}\left\|\beta^{L}\right\|^{2}+\frac{1}{2}C\sum\limits_{i=1}^{N}w_{i}\left\|\xi_{i}^{L}\right\|^{2}+\sum\limits_{i=1}^{N}\alpha_{i}^{L}\left(t_{i}-\phi_{i}^{L}(x_{i}^{L})\cdot\beta^{L}-\xi_{i}^{L}\right)$ (15)

Differentiating $J_{\textit{LKRR}}$ with respect to $\beta^{L}$ , $\alpha^{L}$ , and $\xi^{L}$ and equating the derivatives to zero, the following expressions are obtained:

$\displaystyle\frac{\partial J_{\textit{LKRR}}}{\partial\beta^{L}}=0\to\beta^{L}=\Phi^{L}\alpha^{L}$ (16)

$\displaystyle\frac{\partial J_{\textit{LKRR}}}{\partial\alpha_{i}^{L}}=0\to(\Phi^{L})^{T}\beta^{L}-T+\xi^{L}=0$ (17)

$\displaystyle\frac{\partial J_{\textit{LKRR}}}{\partial\xi_{i}^{L}}=0\to\alpha^{L}=CW\xi^{L}$ (18)

Thus, solving Eqs (16) to (18), the value of $\beta^{L}$ is obtained as:

$\displaystyle\beta^{L}=\Phi^{L}\left((\Phi^{L})^{T}\Phi^{L}+\frac{I}{CW}\right)^{-1}T$ (19)

Using Kernel matrix $\beta^{L}$ is obtained as

$\displaystyle\beta^{L}=\Phi^{L}\left(K^{L}+\frac{I}{CW}\right)^{-1}T$ (20)

where $K^{L}=(\Phi^{L})^{T}\Phi^{L}$ , the elements of the kernel matrix are obtained as $k_{i,j}^{l}=(\phi_{i}^{l})^{T}\phi_{j}^{l}$ with $l=L$ , and $W$ is a diagonal matrix represented as

$\displaystyle W=\textit{Diag}[w_{1},w_{2},w_{3},\ldots,w_{N}]$ (21)

Hence the output from the weighted KRR predictor is obtained as

$\displaystyle O_{p}=(\Phi^{L})^{T}\beta^{L}=(\Phi^{L})^{T}\Phi^{L}\left(K^{L}+\frac{I}{CW}\right)^{-1}T=\hat{K}^{L}\left(K^{L}+\frac{I}{CW}\right)^{-1}T$ (22)

where for the test sample $x_{\textit{test}}$ the kernel matrix $\hat{K}^{L}$ is obtained as:

$\displaystyle\hat{K}^{L}=k\left[\begin{array}{c}x_{\textit{test}},\Phi_{1}^{L}\\ x_{\textit{test}},\Phi_{2}^{L}\\ \vdots\\ x_{\textit{test}},\Phi_{N}^{L}\end{array}\right]$

From Eqs (19) and (20) it is observed that the empirical loss weight matrix $W$ influences the values of the output vector and the output for a particular test pattern. Initially the loss weight matrix $W$ is chosen as the identity matrix, $W=I$ , and the residual error is calculated as:

$\displaystyle\xi_{i}^{L}=t_{i}-\phi(x_{i}^{L}).\beta^{L},\quad i=1,2,\ldots,N$ (23)

A simple formula given by Huber [45] is used here to compute the individual weight parameters $w_{i}$ as:

$\displaystyle w_{i}=\begin{cases}1&\textit{if}\ \left|\frac{\xi_{i}^{L}}{k}\right|\leqslant 1\\ \left|\frac{k}{\xi_{i}^{L}}\right|&\textit{if}\ \left|\frac{\xi_{i}^{L}}{k}\right|>1\end{cases}$ (24)

Here the value of $k$ is set as $k=$ 1.345.
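The weighting rule of Eq. (24) can be illustrated with a short sketch. The function name `huber_weights` is hypothetical; the only inputs are the residuals of Eq. (23) and the constant $k$ stated above.

```python
import numpy as np

def huber_weights(residuals, k=1.345):
    """Per-sample loss weights following Eq. (24)."""
    r = np.abs(np.asarray(residuals, dtype=float))
    w = np.ones_like(r)
    large = r > k            # samples with |xi / k| > 1
    w[large] = k / r[large]  # down-weight large residuals
    return w

weights = huber_weights([0.5, -1.345, 2.69, -5.38])
```

For these four residuals the weights are 1, 1, 0.5, and 0.25: residuals within $k$ keep full weight, and larger residuals are down-weighted in inverse proportion to their magnitude.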

Regarding the choice of kernel functions, it is well known that the wavelet kernel, of either Mexican Hat or Morlet type, has strong function fitting capability and strong generalization ability. Therefore, in this paper a Morlet type wavelet kernel function is used for the $l$ th layer stacked KRR-AE, and it is described as:

$\displaystyle K_{W}(x_{i}^{l},x_{j}^{l})=\cos\left(a\frac{\left\|x_{i}^{l}-x_{j}^{l}\right\|}{b}\right)\exp\left(-\frac{\left\|x_{i}^{l}-x_{j}^{l}\right\|^{2}}{2c}\right)$ (25)

The parameters $a$ , $b$ , and $c$ must be chosen appropriately for accurate prediction in problems such as financial time series forecasting. To find the optimal values of these parameters, the first-layer KRR-AE is chosen and a weighted mapping function based water cycle algorithm (WWCA) is used, which is described below. Kernel parameters can be optimized in a similar manner in subsequent layers. The optimized parameters $a$ , $b$ , and $c$ of the wavelet kernel are used for the successive KRR-AEs of the stack.
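As an illustration of Eq. (25), the kernel matrix between two sets of samples can be computed as follows. This is a sketch under the assumption that samples are stored as rows; the function name `wavelet_kernel` and the parameter values in the usage lines are hypothetical and are not the optimized values of the paper.

```python
import numpy as np

def wavelet_kernel(XA, XB, a, b, c):
    """Morlet-type wavelet kernel of Eq. (25) between the rows of XA and XB."""
    XA = np.atleast_2d(np.asarray(XA, dtype=float))
    XB = np.atleast_2d(np.asarray(XB, dtype=float))
    diff = XA[:, None, :] - XB[None, :, :]
    sq = np.sum(diff**2, axis=-1)   # squared Euclidean distances
    d = np.sqrt(sq)                 # Euclidean distances
    return np.cos(a * d / b) * np.exp(-sq / (2.0 * c))

X = np.array([[0.0], [1.0], [2.0]])
K = wavelet_kernel(X, X, a=np.pi, b=1.0, c=0.5)
```

The resulting matrix is symmetric with ones on the diagonal, since both factors of Eq. (25) equal 1 at zero distance.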

2.1 Pseudo code for DAN

Algorithm 1: The pseudo code of DAN.

Number of layers in the stack $L=$ 2.

Input: kernel parameters $a_{l}$ , $b_{l}$ , and $c_{l}$ , regularization parameter $C_{l}$ and input matrix $X^{0}$ , number of samples or patterns $N$ , start with $l=$ 1.

Output: New updated data $X^{l}$ of layer l

(1) Evaluate kernel matrix $K^{l}\leftarrow K(x_{i}^{l},x_{j}^{l},a,b,c)$ where $x_{i}^{l}$ is $i$ -th training sample and $x_{j}^{l}$ is $j$ -th training sample of layer $l$ , respectively.

(2) Evaluate output weight for the $l$ th layer $\beta^{l}=\Phi^{l}(K^{l}+\frac{I}{C})^{-1}(X^{l-1})^{T}$ .

(3) Evaluate new updated data $X^{l}=\hat{K}^{l}(K^{l}+\frac{I}{C})^{-1}(X^{l-1})^{T}$ .

(4) For $l=L$ , choose the loss weight matrix for the $L$ th layer as $W=I$ .

(5) Find the output weight $\beta^{L}=\Phi^{L}(K^{L}+\frac{I}{CW})^{-1}T$ .

(6) Evaluate the loss weight matrix $W$ using Eq. (24).

(7) Recalculate the output weight $\beta^{L}$ from Eq. (20).

(8) Evaluate the final output as $O_{p}(\textit{test})=\hat{K}^{L}(K^{L}+\frac{I}{CW})^{-1}T$ .

End

In simpler form the steps are:

At the first step, $X^{0}=X$

for $l=1$ to $L$ do

if $l<L$ then

Create the KRR-AE layers from the first to the ( $L-1$ )th in a hierarchical manner and pass the input samples to them

Train the KRR-AE

Calculate the transformed output $X^{l}$ for the input $X^{l-1}$ and pass it to the next layer as input

else

The final layer, i.e. the $L$ th layer, is used for prediction

The output of the ( $L-1$ )th AE is passed to the KRR classifier at the $L$ th layer

Train the $L$ th layer

end if

end for

The flow chart is shown in Fig. 1.

Figure 1. Flow of the DAN processing.
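The steps of Algorithm 1 can be sketched end to end as below. This is a minimal illustration under stated assumptions and not the authors' code: the names `dan_fit_predict`, `wavelet_kernel`, and `huber_weights` are hypothetical, samples are stored as rows, the same kernel parameters and regularization constant are used in every layer, a single Huber reweighting pass is performed, and the toy data and parameter values are invented for the example.

```python
import numpy as np

def wavelet_kernel(XA, XB, a, b, c):
    # Morlet-type wavelet kernel, Eq. (25), between the rows of XA and XB.
    sq = np.sum((XA[:, None, :] - XB[None, :, :]) ** 2, axis=-1)
    return np.cos(a * np.sqrt(sq) / b) * np.exp(-sq / (2.0 * c))

def huber_weights(residuals, k=1.345):
    # Eq. (24): weight 1 for small residuals, k / |residual| otherwise.
    r = np.abs(residuals)
    return np.where(r <= k, 1.0, k / np.maximum(r, 1e-12))

def dan_fit_predict(X_train, t_train, X_test, a, b, c, C, n_layers=2):
    """Stack (n_layers - 1) KRR-AEs, then apply a weighted KRR predictor."""
    N = X_train.shape[0]
    H_train, H_test = X_train, X_test
    for _ in range(n_layers - 1):
        K = wavelet_kernel(H_train, H_train, a, b, c)
        A = np.linalg.solve(K + np.eye(N) / C, H_train)    # Eq. (7)
        K_hat = wavelet_kernel(H_test, H_train, a, b, c)
        H_train, H_test = K @ A, K_hat @ A                 # Eq. (11)
    K = wavelet_kernel(H_train, H_train, a, b, c)
    K_hat = wavelet_kernel(H_test, H_train, a, b, c)
    # First pass with W = I, then one reweighting pass with Huber weights.
    alpha = np.linalg.solve(K + np.eye(N) / C, t_train)
    w = huber_weights(t_train - K @ alpha)                 # Eqs (23), (24)
    alpha = np.linalg.solve(K + np.diag(1.0 / (C * w)), t_train)
    return K_hat @ alpha, w                                # Eq. (22)

# Toy usage: a sine curve with one injected outlier.
x = np.linspace(0.0, 1.0, 20)[:, None]
t = np.sin(2 * np.pi * x[:, 0])
t[5] += 5.0
pred, w = dan_fit_predict(x, t, np.array([[0.25], [0.75]]),
                          a=1.0, b=1.0, c=0.1, C=10.0)
```

The term $I/(CW)$ is implemented as a diagonal matrix with entries $1/(Cw_{i})$ , which is how the sketch interprets the notation of Eq. (20).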

3. Modified weighted water cycle algorithm (WWCA)

WCA is a population-based optimization algorithm developed by observing the movement of streams and rivers towards the sea in the natural water cycle, starting from the creation of rivers. In this process, water passes through evaporation, transpiration, condensation, etc. A stream or a river is generated whenever water flows downhill. Thus most rivers originate in high places such as mountains, where melting snow flows from one place to another. On their way to the sea, rivers carry rainwater and stream water from different places as they flow downhill. Here the sea is treated as the optimum destination for the streams and rivers. The water cycle algorithm is based on this natural concept. In the initial raining process, a random population of raindrops is generated to start the algorithm.

In this study, a modified version of the water cycle algorithm is applied to optimize the kernel parameters of the proposed Wavelet-DAN predictive approach by minimizing a cost function, in order to enhance the accuracy. Here, the cost function given in Eq. (26) is directly proportional to the discharge intensity. In the WCA process, the initial kernel parameters are treated as the positions of the raindrops. The objective function (cost function) for this optimization is the sum of the squared residual errors and is expressed as:

$\displaystyle\textit{CF}=\sum\limits_{i=1}^{N}(\xi_{i}^{l-1})^{2}$ (26)

Suppose, the number of total rain drops in the population is ‘ $N_{\textit{TR}}$ ’ where each raindrop contains ‘ $N_{g}$ ’ variables. The formulation of the population of raindrops (RN) is:

$\displaystyle\text{population of raindrops}=\left[\begin{array}{c}\textit{RN}_{1}\\ \textit{RN}_{2}\\ \vdots\\ \textit{RN}_{N_{\textit{TR}}}\end{array}\right]$ (27)

$\displaystyle=\left[\begin{array}{cccc}k_{1}^{1}&k_{2}^{1}&\cdots&k_{N_{g}}^{1}\\ k_{1}^{2}&k_{2}^{2}&\cdots&k_{N_{g}}^{2}\\ \vdots&\vdots&\vdots&\vdots\\ k_{1}^{N_{\textit{TR}}}&k_{2}^{N_{\textit{TR}}}&\cdots&k_{N_{g}}^{N_{\textit{TR}}}\end{array}\right]_{N_{\textit{TR}}\times N_{g}}$ (28)

The discharge intensity is obtained for all the initially generated raindrops. The raindrop with the lowest cost function value, i.e. discharge intensity, is treated as the supreme raindrop and termed the sea. The raindrops with the next lowest cost function values are treated as rivers, and the remaining raindrops are streams that move to the rivers and the sea.

Say, among the rain drops, $N_{r}$ : number of rivers, $N_{\textit{ST}}$ : number of streams.

So, $N_{\textit{SR}}=N_{r}+1$ which is the summation of rivers $N_{r}$ and one sea and $N_{\textit{ST}}=N_{\textit{TR}}-N_{\textit{SR}}$ .

Once the sea, rivers, and streams are identified, Eq. (27) can be reformulated as

$\displaystyle\left[\begin{array}{c}\textit{sea}\\ \textit{river}_{1}\\ \textit{river}_{2}\\ \vdots\\ \textit{stream}_{N_{\textit{SR}}+1}\\ \textit{stream}_{N_{\textit{SR}}+2}\\ \vdots\\ \textit{stream}_{N_{\textit{TR}}}\end{array}\right]$ (29)

The destination of each stream, a river or the sea, is decided by the calculated discharge intensity values, and one stream can enter only one river or the sea. The number of streams that enter each river or the sea is formulated as:

$\displaystyle\textit{Nstr}_{i}=\textit{round}\left(\left|\frac{\textit{CN}_{i}}{\sum\limits_{j=1}^{N_{\textit{SR}}}\textit{CN}_{j}}\right|\times N_{\textit{ST}}\right)$ (30)

and

$\displaystyle\textit{CN}_{i}=\textit{CF}_{i}-\textit{CF}_{N_{\textit{SR}}+1},\quad i=1,2,3,\ldots,N_{\textit{SR}}$ (31)
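A small numerical illustration of Eqs (30) and (31) follows. The function name `allocate_streams` and the cost values are hypothetical; the costs are assumed to be sorted in ascending order so that the first $N_{\textit{SR}}$ entries are the sea and the rivers.

```python
import numpy as np

def allocate_streams(costs_sorted, n_sr):
    """Number of streams assigned to the sea and each river, Eqs (30) and (31).

    costs_sorted : cost of every raindrop, sorted in ascending order.
    n_sr         : N_SR, the number of rivers plus one sea.
    """
    costs_sorted = np.asarray(costs_sorted, dtype=float)
    n_st = costs_sorted.size - n_sr                 # N_ST
    cn = costs_sorted[:n_sr] - costs_sorted[n_sr]   # Eq. (31)
    share = np.abs(cn / cn.sum())
    return np.round(share * n_st).astype(int)       # Eq. (30)

counts = allocate_streams([1.0, 2.0, 3.0, 5.0, 6.0, 7.0, 8.0, 9.0, 9.5, 10.0], n_sr=3)
```

With these ten costs and $N_{\textit{SR}}=3$ , the sea receives 3 streams and the two rivers receive 2 each. Because of the rounding in Eq. (30), the counts do not always sum to $N_{\textit{ST}}$ in general, so a practical implementation usually needs a small correction step; that step is not described in the source.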

The streams flow to river or to sea, whereas flow of rivers is always towards sea.

Hence, they modify their existing positions to move towards sea i.e. the final destination.

The positions of streams are modified by Eq. (32) to flow towards a river:

$\displaystyle K_{\textit{stream}}(t+1)=K_{\textit{stream}}(t)+\textit{rand}\times C\times(K_{\textit{river}}(t)-K_{\textit{stream}}(t))$ (32)

The positions of streams are modified by Eq. (33) to flow towards the sea:

$\displaystyle K_{\textit{stream}}(t+1)=K_{\textit{stream}}(t)+\textit{rand}\times C\times(K_{\textit{sea}}(t)-K_{\textit{stream}}(t))$ (33)

The positions of rivers are modified by Eq. (34) to flow towards the sea:

$\displaystyle K_{\textit{river}}(t+1)=K_{\textit{river}}(t)+\textit{rand}\times C\times(K_{\textit{sea}}(t)-K_{\textit{river}}(t))$ (34)

where rand is a random value in [0, 1] and $C$ is a value in [1, 2]. The difference term, for example $K_{\textit{sea}}(t)-K_{\textit{stream}}(t)$ , gives the current distance between a stream or river and its destination.
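The three updates in Eqs (32) to (34) share one form, which the following sketch makes explicit. The helper name `move_towards` is hypothetical, and drawing a single scalar random number per move is an assumption; a separate draw per dimension is an equally plausible reading of the equations.

```python
import numpy as np

def move_towards(position, target, C, rng):
    """One WCA position update, Eqs (32) to (34): step from position toward target."""
    rand = rng.random()  # scalar draw in [0, 1); assumption of this sketch
    return position + rand * C * (target - position)

rng = np.random.default_rng(0)
stream = np.array([0.0, 0.0, 0.0])
river = np.array([1.0, 2.0, 4.0])
new_stream = move_towards(stream, river, C=2.0, rng=rng)
step_fraction = new_stream[0] / river[0]
```

With $C$ in [1, 2], the step fraction lies between 0 and 2, so a stream may overshoot its river or the sea, which lets it explore positions on the far side of the target.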

The discharge intensity of each raindrop is computed after the positions of the streams and rivers are modified. If the new discharge intensity of a river is better than that of its connected sea, the river becomes the sea and the sea becomes a river. The streams previously connected to that river are then connected to the new sea. A similar process is followed between the sea and streams, and between the sea and rivers, according to the flow intensity.

When the streams and rivers join the sea, the evaporation process starts and causes the sea water to evaporate. This is an important phase of the water cycle algorithm: as in the natural water cycle, water evaporates and later condenses and falls to the earth as rain. In this process a tiny value, $R_{\textit{max}}\approx$ 0, is taken to represent the convergence of a stream or river with the sea. Once the evaporation condition is satisfied, the raining process starts.

The condition of evaporation for river to join sea will be:

$\displaystyle\left\|K_{\textit{sea}}-K_{\textit{river}}^{i}\right\|<R_{\textit{max}}\ \text{or}\ \textit{rand}<0.1,\quad i=1,2,\ldots,(N_{\textit{SR}}-1)$ (35)

If the conditions stated above are satisfied then new raindrops are created by the raining process to generate new streams like the natural process. The positions of the newly generated streams are obtained by:

$\displaystyle K_{\textit{stream}}=\textit{lb}+\textit{rand}(1,N_{g})\times(\textit{ub}-\textit{lb})$ (36)

where lb and ub are the lower and upper bounds of decision variables.

The value of $R_{\textit{max}}$ is reduced with Eq. (37); a small value of $R_{\textit{max}}$ controls the search intensity near the sea to achieve the best parameters, i.e. to reach the sea.

$\displaystyle R_{\textit{max}}(t+1)=R_{\textit{max}}(t)-(R_{\textit{max}}(t)/\textit{Maxiter})$ (37)

where $t=$ 1, 2, …, Maxiter and Maxiter is the maximum number of iterations.
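The effect of Eq. (37) over a full run can be checked with a few lines. The function name `rmax_schedule` and the starting value are hypothetical.

```python
def rmax_schedule(r0, max_iter):
    """Values of R_max over the iterations, following Eq. (37)."""
    values = [r0]
    for _ in range(max_iter):
        r = values[-1]
        values.append(r - r / max_iter)  # R_max(t+1) = R_max(t) - R_max(t) / Maxiter
    return values

schedule = rmax_schedule(1e-3, max_iter=100)
```

Each step multiplies $R_{\textit{max}}$ by $(1-1/\textit{Maxiter})$ , so after Maxiter iterations the value has shrunk to roughly $1/e$ (about 37%) of its starting value rather than to zero.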

Continue the above process till the last iteration to meet the convergence criteria and find the best solution.

Pseudo-code of WCA:

Initially select the parameters of WCA such as $N_{\textit{TR}}$ , $N_{\textit{SR}}$ , and Maxiter.

Decide the number of streams that flow to the rivers and sea utilizing the following:

$\displaystyle N_{\textit{SR}}=N_{r}+1,\quad N_{\textit{ST}}=N_{\textit{TR}}-N_{\textit{SR}}.$

Here $N_{r}$ : number of rivers, 1: one sea, $N_{\textit{ST}}$ : number of streams, $N_{\textit{TR}}$ : population size.

Generate the initial population of raindrops randomly and form the streams, rivers, and sea.

Find the intensity of flow, which indicates how many streams flow to their corresponding rivers and sea, utilizing Eq. (30).

while ( $t<$ Maxiter )

for $i=1:N_{\textit{TR}}$

streams move to their corresponding rivers and sea utilizing Eqs (32) and (33)

compute the value of the cost function of the created stream

if cost_stream ${}_{\text{new}}$ $<$ cost_river

river $=$ stream ${}_{\text{new}}$

if cost_stream ${}_{\text{new}}$ $<$ cost_sea

sea $=$ stream ${}_{\text{new}}$

end if

end if

rivers move towards the sea utilizing Eq. (34)

compute the value of the cost function of the created river

if cost_river ${}_{\text{new}}$ $<$ cost_sea

sea $=$ river ${}_{\text{new}}$

end if

end for

for $i=1:N_{\textit{SR}}-1$

if ( $\|$ sea $-$ river $\|<R_{\textit{max}}$ or rand $<$ 0.1 )

generate new streams utilizing Eq. (36)

end if

end for

decrease $R_{\textit{max}}$ with Eq. (37)

end while

Obtain final results
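The pseudo-code can be turned into a compact runnable sketch. This is a simplified illustration and not the authors' implementation: the function name `wca_minimize`, the default parameter values, and the sphere test function are hypothetical; streams are assigned to the sea and rivers in round-robin order instead of by Eq. (30); candidate positions are clipped to the bounds; and the raining step re-samples the streams attached to a river that satisfies Eq. (35).

```python
import numpy as np

def wca_minimize(cost, lb, ub, n_tr=30, n_sr=4, max_iter=100, C=2.0, r_max=1e-3, seed=0):
    """Simplified water cycle algorithm; index 0 is the sea, 1..n_sr-1 are rivers."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, dtype=float), np.asarray(ub, dtype=float)
    pop = lb + rng.random((n_tr, lb.size)) * (ub - lb)   # initial raindrops
    costs = np.array([cost(p) for p in pop])
    order = np.argsort(costs)
    pop, costs = pop[order], costs[order]
    n_st = n_tr - n_sr
    owner = np.arange(n_st) % n_sr   # assumption: round-robin instead of Eq. (30)

    def swap(i, j):
        pop[[i, j]] = pop[[j, i]]
        costs[[i, j]] = costs[[j, i]]

    for _ in range(max_iter):
        for s in range(n_st):        # streams flow to their river or the sea
            i, j = n_sr + s, owner[s]
            pop[i] = np.clip(pop[i] + rng.random() * C * (pop[j] - pop[i]), lb, ub)
            costs[i] = cost(pop[i])
            if costs[i] < costs[j]:  # stream is better: exchange roles
                swap(i, j)
                if costs[j] < costs[0]:
                    swap(j, 0)
        for j in range(1, n_sr):     # rivers flow to the sea, Eq. (34)
            pop[j] = np.clip(pop[j] + rng.random() * C * (pop[0] - pop[j]), lb, ub)
            costs[j] = cost(pop[j])
            if costs[j] < costs[0]:
                swap(j, 0)
        for j in range(1, n_sr):     # evaporation and raining, Eqs (35), (36)
            if np.linalg.norm(pop[0] - pop[j]) < r_max or rng.random() < 0.1:
                streams = n_sr + np.flatnonzero(owner == j)
                pop[streams] = lb + rng.random((streams.size, lb.size)) * (ub - lb)
                costs[streams] = [cost(p) for p in pop[streams]]
        r_max -= r_max / max_iter    # Eq. (37)
    return pop[0], costs[0]

def sphere(x):
    # Hypothetical test function with its minimum at the origin.
    return float(np.sum(x**2))

best_x, best_cost = wca_minimize(sphere, lb=[-5.0, -5.0], ub=[5.0, 5.0])
```

In the setting of this paper, `cost` would evaluate Eq. (26) for a candidate vector of kernel parameters, and `lb` and `ub` would bound $a$ , $b$ , and $c$ . Because the sea is only replaced by a strictly better candidate, the returned cost never exceeds the best cost in the initial population.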

3.1 Chaotic maps

Chaotic mapping functions are well known in the study of nonlinear systems. A chaotic system is highly sensitive to its parameters and initial conditions: small changes in either may lead to a large variation in the future behaviour of the system [39]. Chaos mapping helps prevent premature convergence and provides diversity in the search space during optimization, and thus speeds up convergence. With an aim to enhance the convergence speed of the optimization process, chaos functions are utilized in this work. The chaotic mapping functions expressed below are well illustrated in the literature on nonlinear systems:

(a) Sinusoidal chaos map [38]:

$\displaystyle x_{t+1}=mx_{t}^{2}\sin(\pi x_{t}),\quad m=2.3$ (38)

The two dimensional sinusoidal map is chaotic with parameter $m=$ 2.3 and the initial values of the variables are taken between [0 1].

(b) Sine chaos map [40]:

$\displaystyle x_{t+1}=0.25\,p\sin(\pi x_{t}),\quad 0<p\leqslant 4,\quad x_{0}\in(0,1)$ (39)

The sinusoidal map in Eq. (38) is a two dimensional map and the sine map in Eq. (39) is a unimodal map.
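The two maps can be iterated directly to generate chaotic series. In the sketch below the function names are hypothetical, and the choices $p=4$ and $x_{0}=0.7$ are assumptions made only for illustration; the source states the range of $p$ but not the value used.

```python
import math

def sinusoidal_map(x, m=2.3):
    """One step of the sinusoidal chaos map, Eq. (38)."""
    return m * x * x * math.sin(math.pi * x)

def sine_map(x, p=4.0):
    """One step of the sine chaos map, Eq. (39); p = 4 is an assumed value."""
    return 0.25 * p * math.sin(math.pi * x)

def chaotic_series(step, x0, n):
    """Iterate a map n times from x0 and return the generated values."""
    series, x = [], x0
    for _ in range(n):
        x = step(x)
        series.append(x)
    return series

sinusoidal_series = chaotic_series(sinusoidal_map, 0.7, 50)
sine_series = chaotic_series(sine_map, 0.7, 50)
```

Both series stay within [0, 1] when started in that interval, which is what allows them to stand in for random numbers drawn from the same range.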

It is reported in [41, 42] that using a chaotic map instead of random numbers in (0,1) for random number generation increases diversity in the solution space and hence the efficiency of the algorithm. Thus, in this analysis, to introduce diversity, the sinusoidal and sine chaos maps are combined in a weighted minimum form and associated with the WCA. The initial values are chosen in (0,1), and the chaotic series is then generated with the above formulation.

3.2 Modified weighted water cycle algorithm (WWCA)

In order to enhance the efficiency of the WCA approach, chaotic functions are associated with the water cycle algorithm. In the traditional WCA, the positions of streams and rivers are modified with random values to find the destination, i.e. the sea, by obtaining the minimum objective function value at the position modification stage, and the positions of rivers and streams are exchanged accordingly to determine the sea. The value of $C$ is assigned beforehand in WCA, and this value affects how the algorithm balances exploration and exploitation. In WWCA, the value of $C$ is instead determined from the weighted minimum combination of two chaos functions: the two chaos functions are combined with two random weights, and the weighted minimum value is used as a multiplier to obtain $C$ . With this minimum relation of the two chaos functions, the modified positions of the streams and rivers in WWCA are calculated using Eqs (40)–(43).

$\displaystyle K_{\textit{stream}}(t+1)=K_{\textit{stream}}(t)+\eta(t)\times(K_{\textit{river}}(t)-K_{\textit{stream}}(t))$ (40)

$\displaystyle K_{\textit{stream}}(t+1)=K_{\textit{stream}}(t)+\eta(t)\times(K_{\textit{sea}}(t)-K_{\textit{stream}}(t))$ (41)

$\displaystyle K_{\textit{river}}(t+1)=K_{\textit{river}}(t)+\eta(t)\times(K_{\textit{sea}}(t)-K_{\textit{river}}(t))$ (42)

$\displaystyle\eta(t)=\{\eta_{i}+(\eta_{j}-\eta_{i})\times(t/t_{\textit{max}})\}\times\textit{chaos}_{\textit{com}}(t)$ (43)

where $\textit{chaos}_{\textit{com}}(t)=\min[q_{1}(\text{sinusoidal chaos}),q_{2}(\text{sine chaos})]$ using Eqs (38) and (39).

$\displaystyle q_{1}\in(0,1),\quad q_{2}\in(0,1)$

where $\eta_{i}$ is the initial value and $\eta_{j}$ is the final value of the linearly varying operator. These values are chosen on a trial and error basis; here $\eta_{i}$ is chosen to be 1 and $\eta_{j}$ is chosen to be 2. Further, $t$ is the current iteration, $t_{\textit{max}}$ is the maximum number of iterations, and $\textit{chaos}_{\textit{com}}(t)$ denotes the combined chaotic series used in the algorithm.
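The position updates of Eqs (40)–(43) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used in the experiments: the weighted minimum is read as $\min(q_{1}\times\text{sinusoidal value},\,q_{2}\times\text{sine value})$, the two chaotic values are passed in as plain numbers, and the function names `chaos_com`, `eta` and `move_towards` are hypothetical.

```python
def chaos_com(sinusoidal_value, sine_value, q1, q2):
    """Weighted minimum of the two chaotic values (assumed reading of the min[...] term)."""
    return min(q1 * sinusoidal_value, q2 * sine_value)

def eta(t, t_max, chaos_value, eta_i=1.0, eta_j=2.0):
    """Eq. (43): linear ramp from eta_i to eta_j, scaled by the combined chaos value."""
    return (eta_i + (eta_j - eta_i) * (t / t_max)) * chaos_value

def move_towards(position, target, step):
    """Eqs (40)-(42): shift every coordinate of `position` towards `target`."""
    return [k + step * (g - k) for k, g in zip(position, target)]

# Illustrative numbers only; they are not taken from the experiments.
c = chaos_com(sinusoidal_value=0.6, sine_value=0.9, q1=0.5, q2=0.8)
step = eta(t=5, t_max=25, chaos_value=c)
new_stream = move_towards([1.0, 2.0], [3.0, 6.0], step)
```

The same `move_towards` helper covers a stream moving to a river, a stream moving to the sea, and a river moving to the sea; only the target changes.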

Instead of taking random values in [0, 1] as in the original WCA, the nonlinear chaotic functions with weighted minimum are used to modify the positions of the streams and rivers and to drive the evaporation process. This enhances the exploitation capacity of the water cycle algorithm and hence helps to avoid premature convergence.

The evaporation and raining process with the nonlinear chaotic functions mutually provides a solution to avoid premature convergence and in this solution the evaporation condition with the chaos function is verified on the basis of the following pseudo code:

$\displaystyle\textit{if}\ \left\|K_{\textit{sea}}-K_{\textit{river}}^{j}\right\|<\textit{chaos}_{\textit{com}}(t)\ \textit{or}\ \textit{chaos}_{\textit{com}}(t)<0.1,\quad i=1,2,3,\ldots N_{\textit{ST}}$ (44)

$\displaystyle\quad\textit{implement the process of raining}$

end if

The value of $d_{\textit{max}}$ is also chosen chaotically and it lies in the range [1.00E-17, 1.00E-01].

The chaotic formulation in WWCA enhances the search diversification of WCA.

The chosen chaos functions in combined form are used to generate the raining process during iteration as:

$\displaystyle K_{\textit{stream}(\textit{new})}=\textit{lb}+\textit{chaos}_{% \textit{com}}(t)\times(\textit{ub}-\textit{lb})$ (45)
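The evaporation check of Eq. (44) and the raining step of Eq. (45) can be illustrated as follows. The sketch assumes a Euclidean norm for the distance between the sea and a river and applies one combined chaos value to every dimension, as Eq. (45) is written; the function names and the numbers are hypothetical.

```python
import math

def needs_raining(sea, river, chaos_value):
    """Condition of Eq. (44): river close to the sea, or a small chaos value."""
    distance = math.sqrt(sum((s - r) ** 2 for s, r in zip(sea, river)))
    return distance < chaos_value or chaos_value < 0.1

def rain_new_stream(lb, ub, chaos_value):
    """Eq. (45): place a new stream between the lower and upper bounds."""
    return [l + chaos_value * (u - l) for l, u in zip(lb, ub)]

sea, river = [1.0, 1.0], [1.0, 1.2]
c = 0.3  # illustrative combined chaos value
if needs_raining(sea, river, c):
    stream = rain_new_stream(lb=[0.0, 0.0], ub=[10.0, 5.0], chaos_value=c)
```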

The chaotic functions allow diversification in the problem search space by providing more iterations in the chaos formulation. The movement of streams and rivers based on the chaos formulation through Eqs (34)–(37), with its generated chaotic patterns, is quite efficient in providing exploitation/exploration during the raining process. The chaos function with more iterations also permits better convergence around the local best and hence helps to avoid premature convergence.
4. Framework of the methodology proposed for forex prediction

Figure 2.

Outline of the WWCA-Wavelet-DAN forecasting processing.

The implementation process of the proposed WWCA-Wavelet-DAN model for forecasting forex price is depicted in Fig. 2 as a flow chart. The numerical experiment goes through the following steps for its implementation for currency exchange rate prediction. The experimental results support the capability and efficiency of the proposed method.

4.1 Numerical experimentation

4.1.1 Performance evaluation measures

The proposed model is evaluated with three performance metrics, Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE), for one day ahead prediction (refer to Eqs (46)–(48)), as these measures are commonly used for evaluating the performance of a prediction model.

$\displaystyle\text{MAPE}=\frac{1}{\textit{NT}}\sum_{i=1}^{\textit{NT}}\left|\frac{t_{i}-y_{i}}{t_{i}}\right|\times 100$ (46)

$\displaystyle\text{MAE}=\frac{1}{\textit{NT}}\sum_{i=1}^{\textit{NT}}\left|t_{i}-y_{i}\right|$ (47)

$\displaystyle\text{RMSE}=\sqrt{\frac{1}{\textit{NT}}\sum\limits_{i=1}^{\textit{NT}}\left(t_{i}-y_{i}\right)^{2}}$ (48)

Here NT $=$ total number of test data, $t_{i}$ : actual output, $y_{i}$ : predicted output.
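For illustration, the three error measures can be computed as below. MAE is implemented with its standard definition, the mean of $|t_{i}-y_{i}|$; the target and prediction values are made up and the function names are hypothetical.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - y) / t) for t, y in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error (standard definition)."""
    return sum(abs(t - y) for t, y in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((t - y) ** 2 for t, y in zip(actual, predicted)) / len(actual))

# Made-up scaled prices, for illustration only.
t = [0.50, 0.40, 0.80]
y = [0.45, 0.44, 0.80]
scores = (mape(t, y), mae(t, y), rmse(t, y))
```

Note that MAPE is undefined when an actual value $t_{i}$ equals zero, which can occur for the minimum of a series scaled to [0, 1].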

4.1.2 Input variables selection based on daily prices: Closing prices and technical indicators (TIs)

The daily closing prices in most cases play an important role in predicting the next day's price. There are several influential technical indicators, such as Moving Average (MA), Moving Average Convergence and Divergence (MACD), Relative Strength Index (RSI), Stochastic K (%K) and D (%D) indicators, and Larry William’s R (%R), which help traders in financial markets to follow the trend of financial time series such as stock market prices and currency exchange rates and to take trading decisions, and which have proved successful in several applications. These technical indicators are formulated below:

The value of MA is the simple mean calculated as per Eq. (49) on closing prices for a fixed number of past days fd (here fd $=$ 25),

$\displaystyle\textit{MA}(\textit{fd})=\frac{\sum\limits_{i=t-\textit{fd}+1}^{t}\textit{cp}_{i}}{\textit{fd}}$ (49)

where fd: fixed number of past days, $\textit{cp}_{i}$ : closing price of $i^{\text{th}}$ day.

The value of MACD is obtained using Eq. (50) based on daily closing price values.

$\displaystyle\text{MACD}(i)=\text{EMA}12(i)-\text{EMA}26(i)$ (50)

$\displaystyle\text{EMA}(i)=(\textit{cp}(i)-\text{EMA}(i-1))\times\textit{Multiplier}+\text{EMA}(i-1)$ (51)

$\displaystyle\text{where}\ \textit{Multiplier}=2/(\text{fixed number of past days}+1)$ (52)
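A short Python sketch of the MA, EMA and MACD calculations is given below. Two details are assumptions, since they are not specified in the text: the simple mean is taken over exactly fd closing prices, and the EMA recursion is seeded with the first closing price. The function names are hypothetical.

```python
def moving_average(cp, fd):
    """Simple mean of the last fd closing prices (cf. Eq. (49))."""
    if len(cp) < fd:
        raise ValueError("need at least fd closing prices")
    return sum(cp[-fd:]) / fd

def ema_series(cp, span):
    """EMA recursion of Eqs (51)-(52); the seed cp[0] is an assumption."""
    multiplier = 2.0 / (span + 1)
    ema = [cp[0]]
    for price in cp[1:]:
        ema.append((price - ema[-1]) * multiplier + ema[-1])
    return ema

def macd_series(cp, fast=12, slow=26):
    """Eq. (50): MACD(i) = EMA12(i) - EMA26(i)."""
    return [f - s for f, s in zip(ema_series(cp, fast), ema_series(cp, slow))]

ma = moving_average([1.0, 2.0, 3.0, 4.0], fd=2)
ema = ema_series([1.0, 2.0, 3.0], span=3)
flat_macd = macd_series([4.2] * 30)
```

For a constant price series both EMAs equal the price, so the MACD is zero throughout, which gives a quick sanity check of the recursion.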

RSI is a momentum indicator and its value is calculated using Eq. (53)

$\displaystyle\textit{RSI}(i)=100-\frac{100}{1+\textit{RS}}$ (53)

$\displaystyle\text{where}\ \textit{RS}=\frac{\text{mean}(\text{past $t$ days up closing prices})}{\text{mean}(\text{past $t$ days down closing prices})}$ (54)
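As an illustration of Eqs (53) and (54), the sketch below interprets the numerator and denominator of RS as the average upward and average downward price change over the past $t$ days, which is the common convention; this reading, the default $t=14$ and the function name `rsi` are assumptions.

```python
def rsi(cp, t=14):
    """RSI of Eqs (53)-(54) computed over the last t price changes."""
    if len(cp) < t + 1:
        raise ValueError("need at least t + 1 closing prices")
    # Day-to-day changes over the last t days.
    changes = [b - a for a, b in zip(cp[-t - 1:-1], cp[-t:])]
    avg_up = sum(c for c in changes if c > 0) / t
    avg_down = sum(-c for c in changes if c < 0) / t
    if avg_down == 0:
        return 100.0  # no down moves in the window: RSI saturates at 100
    rs = avg_up / avg_down
    return 100.0 - 100.0 / (1.0 + rs)

balanced = rsi([1.0, 2.0, 1.0, 2.0, 1.0], t=4)
rising = rsi([1.0, 2.0, 3.0, 4.0, 5.0], t=4)
falling = rsi([5.0, 4.0, 3.0, 2.0, 1.0], t=4)
```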

The values of %K and %D oscillators are calculated using Eqs (55) and (56) based on daily closing prices, high and low prices.

$\displaystyle\%K(i)=\frac{\textit{cp}(i)-\textit{Lp}_{t}}{\textit{Hp}_{t}-% \textit{Lp}_{t}}$ (55)

where $\textit{cp}(i)$ : current day’s closing price, $\textit{Lp}_{t}$ : lowest price of the past $t$ days period, $\textit{Hp}_{t}$ : highest price of the past $t$ days period.

$\displaystyle\%D(i)=(\%K(i)+\%K(i-1)+\,\%K(i-2))/3$ (56)

%R is a stochastic oscillator and its value is obtained using Eq. (57).

$\displaystyle\%R(i)=\frac{\textit{Hp}_{t}-\textit{cp}(i)}{\textit{Hp}_{t}-\textit{Lp}_{t}}\times 100$ (57)

where $\textit{cp}(i)$ : current day’s closing price, $\textit{Hp}_{t}$ : highest price of the past $t$ days, $\textit{Lp}_{t}$ : lowest price of the past $t$ days.
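The three stochastic indicators of Eqs (55)–(57) can be sketched as follows. The sketch assumes that the past $t$ days window includes the current day and that separate daily high and low series are available; the function names and sample prices are hypothetical.

```python
def percent_k(cp, high, low, t):
    """Eq. (55): position of the latest close within the t-day price range."""
    hp, lp = max(high[-t:]), min(low[-t:])
    return (cp[-1] - lp) / (hp - lp)

def percent_d(k_values):
    """Eq. (56): mean of the three most recent %K values."""
    return sum(k_values[-3:]) / 3.0

def percent_r(cp, high, low, t):
    """Eq. (57): distance of the latest close from the t-day high, times 100."""
    hp, lp = max(high[-t:]), min(low[-t:])
    return (hp - cp[-1]) / (hp - lp) * 100.0

# Made-up daily prices, for illustration only.
high = [10.0, 12.0, 11.0]
low = [8.0, 9.0, 9.0]
close = [9.0, 11.0, 10.0]
k = percent_k(close, high, low, t=3)
r = percent_r(close, high, low, t=3)
d = percent_d([0.2, 0.5, 0.8])
```

Both `percent_k` and `percent_r` divide by the price range, so a window in which the high equals the low needs separate handling.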

4.2 Data sets used for currency exchange rate and trend detection

Table 1a
Forex datasets used for daily exchange rate prediction

	MYR/USD, MXN/USD, BRL/USD
	No. of trading days	Period of trading days
Total dataset	900	03 January 2012 to 05 August 2015
Training dataset	600	03 January 2012 to 23 May 2014
Testing dataset	300	26 May 2014 to 05 August 2015

Table 1b

Forex datasets used for trend prediction

	MYR/USD, MXN/USD, BRL/USD
	No. of trading days	Period of trading days
Total dataset	1044	03 January 2012 to 31 December 2015
Total dataset (year wise)	261	03 January to 31 December (each year)
Training Period (year wise)	131	$\sim$ 50% of total 261 instances ( $\sim$ January to June of each year)
Testing Period (year wise)	130	$\sim$ 50% of total 261 instances ( $\sim$ July to December of each year)

${}^{*}$ Each year: year 2012, 2013, 2014, 2015.

In this paper we examine the daily exchange rate data of three currencies, i.e. Malaysian Ringgit (MYR), Mexican Peso (MXN), and Brazilian Real (BRL), against the US dollar (USD), and the daily observations are used for one day ahead exchange rate prediction. The whole dataset covers the time span from 03 January 2012 to 05 August 2015, a total of 900 data pairs (Saturday and Sunday data pair values are missing). The dataset is partitioned into training and testing parts. The training part contains the initial 600 data pairs of the whole dataset and is used to train the model proposed in this paper; the testing part contains the remaining 300 data pairs. To test the efficiency of the model, we have chosen the observations from 26 May 2014 to 05 August 2015, where fluctuations in the datasets are more prominent. The details of the MYR/USD, MXN/USD, and BRL/USD datasets discussed above are given in Table 1a.

Trend detection is carried out on the three currency exchange markets, i.e. MYR/USD, MXN/USD, and BRL/USD. Each of these datasets carries 1044 data instead of 900 data in total, as the trends are detected for the entire period of each year up to December 2015. The details are given in Table 1b.

The inputs to the proposed model comprise the lagged scaled prices of the past 5 days for day ahead price prediction, i.e. the prediction of the day ( $d$ ) price depends on the prices on days ( $d$ $-$ 1), ( $d$ $-$ 2), ( $d$ $-$ 3), ( $d$ $-$ 4), and ( $d$ $-$ 5), and all the data in each time series are arranged in this manner. The information available in the observed data pairs is used for day ahead prediction. The observed data pairs are scaled between 0 and 1 using Eq. (58). Figure 3 shows each of the scaled datasets used for examining the WWCA-Wavelet-DAN model.

$\displaystyle x_{\textit{scaled}}=\frac{x-x_{\text{min}}}{x_{\text{max}}-x_{% \text{min}}}$ (58)

where $x_{\textit{scaled}}=$ observed scaled value, $x=$ observed value, $x_{\text{max}}=$ highest value of the dataset, $x_{\text{min}}=$ lowest value of the dataset.
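The construction of the scaled, lagged input and target pairs described above can be sketched as follows. The price values are made up, and the function names `min_max_scale` and `make_lagged_pairs` are hypothetical.

```python
def min_max_scale(values):
    """Eq. (58): map the values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant series cannot be scaled")
    return [(v - lo) / (hi - lo) for v in values]

def make_lagged_pairs(series, lags=5):
    """Inputs are the prices on days d-1 ... d-lags; the target is the price on day d."""
    pairs = []
    for d in range(lags, len(series)):
        inputs = [series[d - k] for k in range(1, lags + 1)]
        pairs.append((inputs, series[d]))
    return pairs

# Made-up closing prices, for illustration only.
prices = [4.0, 4.2, 4.1, 4.4, 4.3, 4.6, 4.5]
scaled = min_max_scale(prices)
pairs = make_lagged_pairs(scaled, lags=5)
```

With seven observations and five lags, only two input and target pairs are produced, since the first five days have no complete lag history.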

The six influential technical indicators explained above, i.e. MA, RSI, MACD, %K, %D and %R, are used as input variables to the considered predictive models. The calculated technical indicator values are scaled between 0 and 1 by following the procedure in Eq. (58).

Table 2

Performance comparison of wavelet based DAN model with other predictive models in terms of MAPE, MAE, and RMSE (Models: WWCA-Wavelet-DAN, Wavelet-DAN, WWCA-Poly-DAN, WWCA-tanh-DAN, WWCA-Gaussian-DAN, ELM, BPNN, SVR)

	WWCA-Wavelet-DAN			Wavelet-DAN			WWCA-Poly-DAN
	MAPE	MAE	RMSE	MAPE	MAE	RMSE	MAPE	MAE	RMSE
MYR/USD	1.3537	0.0091	0.0123	1.5546	0.0112	0.0159	1.4166	0.0099	0.0147
MXN/USD	1.3850	0.0097	0.0137	1.5803	0.0109	0.0176	1.4396	0.0102	0.0161
BRL/USD	1.3301	0.0072	0.0106	1.5452	0.0086	0.0127	1.4025	0.0077	0.0111
	WWCA-tanh-DAN			WWCA-Gaussian-DAN			ELM
	MAPE	MAE	RMSE	MAPE	MAE	RMSE	MAPE	MAE	RMSE
MYR/USD	1.4787	0.0108	0.0153	1.4429	0.0103	0.0149	2.7408	0.0154	0.0211
MXN/USD	1.5002	0.0106	0.0172	1.4407	0.0099	0.0157	2.8259	0.0166	0.0228
BRL/USD	1.4336	0.0080	0.0119	1.4124	0.0079	0.0116	2.3056	0.0133	0.0195
	BPNN			SVR			RELM
	MAPE	MAE	RMSE	MAPE	MAE	RMSE	MAPE	MAE	RMSE
MYR/USD	3.3796	0.0175	0.0261	3.0017	0.0166	0.0218	2.5527	0.0131	0.0201
MXN/USD	3.8408	0.0227	0.0313	3.2284	0.0206	0.0276	2.6213	0.0137	0.0211
BRL/USD	3.1690	0.0167	0.0249	2.9835	0.0157	0.0238	2.1466	0.0118	0.0167
	WWCA-Wavelet-KRR			Wavelet-KRR			WWCA-Poly-KRR
	MAPE	MAE	RMSE	MAPE	MAE	RMSE	MAPE	MAE	RMSE
MYR/USD	1.7273	0.0104	0.0145	1.7842	0.0133	0.0174	1.6605	0.0120	0.0161
MXN/USD	1.7806	0.0111	0.0152	1.8786	0.0128	0.0189	1.7932	0.0119	0.0177
BRL/USD	1.6156	0.0093	0.0136	1.8044	0.0117	0.0141	1.6141	0.0095	0.0132
	WWCA-tanh-KRR			WWCA-Gaussian-KRR
	MAPE	MAE	RMSE	MAPE	MAE	RMSE	MAPE	MAE	RMSE
MYR/USD	1.6418	0.0117	0.0176	1.7143	0.0118	0.0159	2.8971	0.0166	0.0234
MXN/USD	1.7513	0.0124	0.0194	1.7642	0.0112	0.0164	2.8278	0.0179	0.0233
BRL/USD	1.7988	0.0093	0.0127	1.7255	0.0082	0.0151	2.6218	0.014t33	0.0195

Figure 3.

MYR/USD, MXN/USD, and BRL/USD datasets containing 1044 data.

4.3 Forex price prediction results from the conducted experiment and discussion

The proposed DAN model using the wavelet kernel is first trained for one-day ahead prediction on the three datasets from the MYR/USD, MXN/USD and BRL/USD forex markets and is then tested on the same markets as per Table 1a. The performance of the proposed model is then compared with that of models using other kernel functions, namely the polynomial, hyperbolic tangent, and Gaussian kernels; these models are named Poly-DAN, tanh-DAN, and Gaussian-DAN. The proposed Wavelet-DAN model is first trained with random values assigned to the wavelet kernel function parameters. Initially we run the model ten times using different random values and measure the MAPE, MAE and RMSE values in each run. The averages of the ten measured MAPE, MAE and RMSE values are then taken as the final prediction errors for the kernel function based DAN model. In order to achieve better accuracy, we use the modified water cycle algorithm, i.e. WWCA, to optimize the kernel parameters in the first KRR-AE, use the best obtained values in the stacked KRR-AEs of the subsequent layers (WWCA-Wavelet-DAN), and then measure the performance on the three datasets used for prediction analysis. A similar process is applied to the DAN approach with the other kernel functions, i.e. Poly-DAN, tanh-DAN, and Gaussian-DAN, and the optimized versions of these models are known as WWCA-Poly-DAN, WWCA-tanh-DAN, and WWCA-Gaussian-DAN. Further, a comparison is also made with the non-iterative ELM technique, BPNN and SVR (support vector regression) to test the robustness of the proposed model.

Figure 4.

(a) MYR/USD Testing results with errors. (b) Modified WWCA and WCA convergence curve for MYR/USD results.

The findings of the experimental results during testing using the considered predictive models are reported in Table 2. With the optimized WWCA-Wavelet-DAN model, MYR/USD shows MAPE, MAE and RMSE values of 1.3537, 0.0091 and 0.0123, respectively, while MXN/USD shows MAPE, MAE and RMSE values of 1.3850, 0.0097 and 0.0137, respectively. BRL/USD yields the lowest errors, with MAPE, MAE and RMSE values of 1.3301, 0.0072 and 0.0106, respectively, for day-ahead price prediction. The performance in these test cases is quite satisfactory in terms of MAPE, MAE and RMSE. The MAPE, MAE and RMSE values of the other kernel variants are also given in Table 2. The target vs. predicted results of MYR/USD, MXN/USD, and BRL/USD obtained with the proposed WWCA-Wavelet-DAN approach are depicted in Figs 4, 5, and 6 along with their corresponding convergence curves. For these forex datasets, the wavelet kernel function yields the highest accuracy, whereas the other kernel functions report lower accuracy, as can be observed from the measured MAPE, MAE, and RMSE values presented in Table 2. From the table it is also clear that the conventional WWCA-DAN reveals lower prediction accuracies with the four different kernel variants in all cases as compared to the robust WWCA-DAN model using a weighted KRR model as the final predictor.

Figure 5.

(a) MXN/USD Testing results with errors. (b) Modified WWCA and WCA convergence curve for MXN/USD results.

As the non-optimized approaches provide lower accuracy, for illustration purposes we have shown in Table 2 the measured errors of the proposed Wavelet-RDAN approach for the considered datasets. From the convergence curves of the three datasets for the optimized WWCA-Wavelet-DAN shown in Figs 4b, 5b, and 6b, it is evident that the errors converge in fewer than five iterations in all the test cases, whereas WCA takes more iterations to obtain the optimized values of the kernel parameters. After a few more iterations, WCA converges to almost the same objective function value, with a slightly higher error. In all the cases the population size is fixed at 50 and the number of iterations at 25; in each case WWCA is more effective than WCA with regard to error convergence. Thus we used the proposed WWCA technique with our proposed DAN network for optimizing the kernel parameters for day ahead currency exchange rate prediction.

Figure 6.

(a) BRL/USD Testing results with errors. (b) Modified WWCA and WCA convergence curve for BRL/USD results.

MYR/USD and BRL/USD converge in as few as five iterations, whereas MXN/USD takes approximately eight iterations to converge and to reach the optimal wavelet kernel function parameter values, i.e. $a$ , $b$ and $c$ . The corresponding $a$ , $b$ and $c$ values, in the order of the datasets described above, are: $a=$ 4.14, $b=$ 2.43, $c=$ 3.23 (for MYR/USD), $a=$ 3.15, $b=$ 6.10, $c=$ 4.11 (for MXN/USD) and $a=$ 2.2, $b=$ 4.18, $c=$ 3.58 (for BRL/USD). The optimal values differ from dataset to dataset. Further, for comparison, ELM, BPNN, and SVR (support vector regression) have been applied. The ELM experimental results for the three datasets exhibit higher errors than the proposed model and than the other kernel functions attached to the RKRR models. BPNN exhibits the highest error and cannot stand against the DAN and ELM approaches. SVR yields lower errors than BPNN but higher errors than the proposed model and than the DAN results with the other kernel functions.

The considered models are examined with the Friedman test [43] with Scheffé’s procedure at the 95% confidence level to identify the best model by performance. The obtained Friedman test values are presented in Table 3. Among the mean rankings of the considered models, the proposed WWCA-Wavelet-RDAN model obtains the lowest mean rank (MR) value, which indicates the superiority of the model over the other models by ranking comparison. With the proposed WWCA-Wavelet-DAN model, MYR/USD yields an MR value of 2.9654, MXN/USD yields an MR value of 3.0206 and BRL/USD yields an MR value of 2.8933, each of which is lower than the MR values of the other models. The comparison of the proposed model with the other models gives Scheffé’s p-values (p) below 0.05, which indicates significant differences among the models. The detailed comparisons are provided in Table 3 for the three datasets.
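To illustrate how mean ranks of the kind reported for the Friedman test can be obtained, the sketch below ranks the models on each test sample by absolute error (rank 1 for the smallest error, ties sharing the average rank) and averages the ranks per model. It also evaluates the standard Friedman chi-square statistic without tie correction. Ranking on per-sample absolute errors is an assumption, Scheffé’s post hoc procedure is not covered, and the error values are made up.

```python
def average_ranks(row):
    """Rank values in ascending order (1 = smallest), averaging tied ranks."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks

def friedman_mean_ranks(errors):
    """errors[n][k] is the absolute error of model k on test sample n."""
    n, k = len(errors), len(errors[0])
    totals = [0.0] * k
    for row in errors:
        for j, r in enumerate(average_ranks(row)):
            totals[j] += r
    mean_ranks = [s / n for s in totals]
    # Friedman statistic (no tie correction).
    chi2 = 12.0 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4.0
    )
    return mean_ranks, chi2

# Made-up errors for three models on three samples.
errors = [[0.01, 0.02, 0.03],
          [0.01, 0.03, 0.02],
          [0.02, 0.02, 0.04]]
mr, chi2 = friedman_mean_ranks(errors)
```

A lower mean rank indicates a model that tends to have smaller errors across the test samples.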

Table 3

Scheffé’s procedure based on Friedman test with 95% confidence level for MYR/USD, MXN/USD, BRL/USD

Prediction model	MR	Diff	$p$	MR	Diff	$p$	MR	Diff	$p$
	Forex datasets
	MYR/USD			MXN/USD			BRL/USD
WWCA-Wavelet-DAN (A1)	2.9654	–	–	3.0206	–	–	2.8933	–	–
WWCA-Poly-DAN (A2)	3.8666	A1–A2 $-$ 0.9012	0.0035	3.8453	A1–A2 $-$ 0.8247	0.0044	3.6743	A1–A2 $-$ 0.7810	0.0032
WWCA-tanh-DAN (A3)	4.1691	A1–A3 $-$ 1.2037	0.0007	3.9460	A1–A3 $-$ 0.9254	0.0019	3.7713	A1–A3 $-$ 0.8780	0.0026
WWCA-Gaussian-DAN (A4)	3.9760	A1–A4 $-$ 1.0106	0.0000	3.7553	A1–A4 $-$ 0.7347	0.0000	3.7253	A1–A4 $-$ 0.8320	0.0002
ELM (A5)	4.5665	A1–A5 $-$ 1.6011	0.0000	4.8050	A1–A5 $-$ 1.7844	0.0000	4.0371	A1–A5 $-$ 1.1438	0.0000
BPNN (A6)	5.3333	A1–A6 $-$ 2.3679	0.0000	5.0664	A1–A6 $-$ 2.0458	0.0000	5.0369	A1–A6 $-$ 2.1436	0.0000
SVR (A7)	4.8843	A1–A7 $-$ 1.9189	0.0000	4.8899	A1–A7 $-$ 1.8693	0.0000	4.2228	A1–A7 $-$ 1.3295	0.0000

Note: MR: mean rank, $p$ : Scheffé’s $p$ -value. For ease of comparison, the above forecasting approaches are identified as A1, A2, A3, A4, A5, A6 and A7. The $p$ -values are shown to four decimal places and indicate significant differences among the approaches under consideration.

Table 4

Price trend in each year on historical data of MYR/USD, MXN/USD, and BRL/USD

Dataset	Year	Increase	Decrease	Increase (%)	Decrease (%)	Total
MYR/USD	2012	127	134	48.66	51.34	261
	2013	156	105	59.77	40.23	261
	2014	134	127	51.34	48.66	261
	2015	137	124	52.49	47.51	261
	Total	554	490	53.07	46.93	1044
MXN/USD	2012	120	141	45.98	54.02	261
	2013	137	124	52.49	47.51	261
	2014	119	142	45.59	54.41	261
	2015	124	137	47.51	52.49	261
	Total	500	544	47.89	52.11	1044
BRL/USD	2012	111	150	42.53	57.47	261
	2013	116	145	44.44	55.56	261
	2014	118	143	45.21	54.79	261
	2015	116	145	44.44	55.56	261
	Total	461	583	44.16	55.84	1044

5. Forex trend detection on MYR/USD, MXN/USD, BRL/USD datasets

Trend prediction, formulated as a binary classification problem, is examined on three currency exchange markets, i.e. MYR/USD, MXN/USD, and BRL/USD, over a period of four years from 3 January 2012 to 31 December 2015; each dataset carries 1044 currency exchange prices.

5.1 Forex trend calculation on historical data for target fixation

The daily exchange price directions of these datasets, in terms of 1 (up)/0 (down), are calculated year wise for the entire dataset. The up and down instances are obtained on the 1044 data by taking the difference between consecutive daily prices with a simple rule base: if the price on the current day exceeds the price on the previous day, the instance is considered an up trend (1); in the reverse case it is considered a down trend (0); and if both conditions fail, it is considered to have no trend. The number of up/down instances calculated with this rule base in each year for the MYR/USD, MXN/USD, and BRL/USD datasets is given in Table 4, and the number of up/down instances for the further division into training and testing as per Table 1b is reported in Table 5.
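The rule base for fixing the target trends can be written compactly as below. This is only a sketch: the prices are made up, the function name `trend_labels` is hypothetical, and the use of `None` for the no-trend case is an illustrative choice.

```python
def trend_labels(prices):
    """1 if the price rose versus the previous day, 0 if it fell, None if unchanged."""
    labels = []
    for prev, cur in zip(prices, prices[1:]):
        if cur > prev:
            labels.append(1)
        elif cur < prev:
            labels.append(0)
        else:
            labels.append(None)  # neither condition holds: no trend
    return labels

# Made-up daily prices, for illustration only.
labels = trend_labels([4.10, 4.12, 4.12, 4.08, 4.15])
ups = labels.count(1)
downs = labels.count(0)
```

Counting the 1 and 0 labels per year gives up/down totals of the kind tabulated for each dataset.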

Table 5
Price trend in each year on historical data of MYR/USD, MXN/USD, and BRL/USD selected for training ${}^{*}$ and testing ${}^{*}$

		Training			Testing
Dataset	Year	Increase	Decrease	Total	Increase	Decrease	Total
MYR/USD	2012	63	68	131	64	66	130
	2013	78	53	131	78	52	130
	2014	71	60	131	63	67	130
	2015	69	62	131	68	62	130
	Total	281	243	524	273	247	520
MXN/USD	2012	56	75	131	64	66	130
	2013	68	63	131	69	61	130
	2014	67	64	131	52	78	130
	2015	61	70	131	63	67	130
	Total	252	272	524	248	272	520
BRL/USD	2012	56	75	131	55	75	130
	2013	57	74	131	59	71	130
	2014	67	64	131	51	79	130
	2015	57	74	131	59	71	130
	Total	237	287	524	224	296	520

${}^{*}$ Training: 50% of total data in each year, ${}^{*}$ Testing: next 50% of total data in each year.

Table 6

Classification accuracy obtained during training and testing of MXN/USD datasets using WWCA-Wavelet-DAN and other kernel function based DAN approach

		WWCA-Poly-DAN				WWCA-tanh-DAN
Dataset	Year	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)
	2012	81.74	0.8189	76.80	0.7071	82.53	0.8218	80.00	0.7615
	2013	88.88	0.8833	81.60	0.7850	76.98	0.7563	81.60	0.7965
MXN/USD	2014	83.33	0.8346	80.80	0.7736	82.53	0.8281	85.60	0.8085
	2015	85.00	0.8588	78.00	0.7839	88.09	0.8598	78.40	0.7611
	Avg.	84.74	0.8489	79.30	0.7624	82.53	0.8165	81.40	0.7819
		WWCA-Gaussian-DAN				WWCA-Wavelet-DAN
Dataset	Year	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)
	2012	82.53	0.8226	80.00	0.7573	83.33	0.8293	80.00	0.7664
	2013	88.09	0.8760	82.40	0.7963	89.68	0.8917	82.40	0.7885
MXN/USD	2014	84.92	0.8504	84.80	0.8081	84.88	0.8508	85.60	0.8163
	2015	88.88	0.8654	80.80	0.7778	88.88	0.8727	82.40	0.7442
	Avg.	86.11	0.8536	82.00	0.7849	86.69	0.8611	82.60	0.7788

Avg: Average, Trn: Training, Tst: Testing, PCCA: Percentage correct classification accuracy, Fm: F-measure.

The MYR/USD dataset contains 554 up instances, i.e. 53.07% of the total 1044 data, and 490 down instances, i.e. 46.93% of the total. Each year contains 261 data: 127 up trends and 134 down trends are found in 2012, 156 up trends and 105 down trends in 2013, 134 up trends and 127 down trends in 2014, and 137 up trends and 124 down trends in 2015. The MXN/USD dataset contains 500 up instances, i.e. 47.89% of the total 1044 instances, and 544 down instances, i.e. 52.11% of the total. Out of the 261 instances in each year, 120 up trends and 141 down trends are found in 2012, 137 up trends and 124 down trends in 2013, 119 up trends and 142 down trends in 2014, and 124 up trends and 137 down trends in 2015. Similarly, the BRL/USD dataset contains 461 up instances, i.e. 44.16% of the total 1044 instances, and 583 down instances, i.e. 55.84% of the total. Out of the 261 instances in each year, 111 up trends and 150 down trends are found in 2012, 116 up trends and 145 down trends in 2013, 118 up trends and 143 down trends in 2014, and 116 up trends and 145 down trends in 2015.

Each dataset carries 261 data in each year, out of which 131 data ( $\sim$ 50%) are reserved for training and 130 data ( $\sim$ 50%) are kept for testing our models. The up/down trends are obtained year wise for the training and testing periods, and the numbers of up/down instances are shown in Table 5. From the table we notice that, in the year 2012, the training data of MYR/USD contain 63/68 up/down instances and the testing data contain 64/66 up/down instances. In 2013, 78/53 up/down instances are found in the training period and 78/52 up/down instances in the testing period. Similarly, in 2014, 71 up instances and 60 down instances are found in the training period, whereas 63 up and 67 down instances are found in the testing period. In 2015, 69/62 up/down instances are found in the training period and 68/62 up/down instances in the testing period. Table 5 can be referred to for the details of the up/down instances of the other two markets, i.e. MXN/USD and BRL/USD.

5.2 Forex trend prediction against targeted trends

Table 7
Classification accuracy obtained during training and testing of MYR/USD datasets using WWCA-Wavelet-DAN and other kernel function based DAN approach

		WWCA-Poly-DAN				WWCA-tanh-DAN
Dataset	Year	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)
	2012	82.53	0.8308	75.20	0.8166	78.57	0.7874	74.00	0.7860
	2013	81.75	0.8271	75.20	0.7704	80.00	0.8137	74.40	0.7508
MYR/USD	2014	80.15	0.8062	76.80	0.7914	80.15	0.8072	74.40	0.7511
	2015	82.53	0.8308	76.80	0.7914	80.00	0.8223	74.00	0.7506
	Avg.	81.74	0.8237	76.00	0.7925	79.68	0.8077	74.20	0.7596
		WWCA-Gaussian-DAN				WWCA-Wavelet-DAN
Dataset	Year	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)
	2012	80.00	0.8068	76.00	0.8026	80.15	0.8000	76.00	0.8000
	2013	80.95	0.8154	74.40	0.7681	82.53	0.8308	76.80	0.7936
MYR/USD	2014	80.95	0.8154	74.40	0.7681	81.75	0.8206	76.80	0.7914
	2015	80.95	0.8154	74.40	0.7681	80.15	0.8000	75.20	0.7704
	Avg.	80.71	0.8133	74.80	0.7767	81.15	0.8129	76.20	0.7889

Avg: Average, Trn: Training, Tst: Testing, PCCA: Percentage correct classification accuracy, Fm: F-measure.

Table 8

Classification accuracy obtained during training and testing of BRL/USD datasets using WWCA-Wavelet-DAN and other kernel function based DAN approach

		WWCA-Poly-DAN				WWCA-tanh-DAN
Dataset	Year	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)
	2012	85.71	0.8269	80.00	0.7669	80.95	0.7818	83.84	0.7835
	2013	80.00	0.7788	80.00	0.7833	82.53	0.7908	81.74	0.7723
BRL/USD	2014	81.60	0.8025	77.60	0.7705	80.95	0.8033	83.20	0.7407
	2015	80.15	0.7899	79.36	0.7736	80.95	0.7647	84.00	0.8077
	Avg.	81.87	0.7995	79.24	0.7736	81.35	0.7852	83.20	0.7761
		WWCA-Gaussian-DAN				WWCA-Wavelet-DAN
Dataset	Year	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)
	2012	84.12	0.8113	85.71	0.8272	85.71	0.8407	86.40	0.8383
	2013	85.71	0.8302	81.60	0.8217	86.40	0.8353	84.92	0.8427
BRL/USD	2014	81.74	0.8099	83.20	0.7407	85.71	0.8406	85.60	0.8106
	2015	84.92	0.8155	83.20	0.8073	87.30	0.8547	86.40	0.8549
	Avg.	84.12	0.8167	83.43	0.7992	86.28	0.8428	85.83	0.8341

Avg: Average, Trn: Training, Tst: Testing, PCCA: Percentage correct classification accuracy, Fm: F-measure.

The three datasets taken from MYR/USD, MXN/USD, and BRL/USD are analyzed for trend detection by training the proposed WWCA-Wavelet-DAN model. Once the model is well trained, it is tested on the data held out for testing, and the prediction accuracy is measured in terms of PCCA (Percentage correct classification accuracy) (Eq. (59)) and Fm (F-measure) (Eqs (60)–(62)). The obtained results are then discussed and compared with the results obtained with other kernel variants, namely the Gaussian, hyperbolic tangent and polynomial kernel functions.

$\displaystyle\text{PCCA}=\frac{\textit{tp}+\textit{tn}}{\textit{tp}+\textit{tn% }+\textit{fp}+\textit{fn}}$ (59)

tp $=$ true positive, tn $=$ true negative, fp $=$ false positive, fn $=$ false negative.

$\displaystyle\text{Fm}=2\times\frac{\textit{pn}\times\textit{rl}}{\textit{pn}+\textit{rl}},\quad\textit{pn}=\text{precision},\ \textit{rl}=\text{recall}$ (60)

$\displaystyle\textit{pn}=\frac{\textit{tp}}{\textit{tp}+\textit{fp}}$ (61)

$\displaystyle\textit{rl}=\frac{\textit{tp}}{\textit{tp}+\textit{fn}}$ (62)
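The two classification measures can be computed from the target and predicted classes as sketched below. PCCA is multiplied by 100 here so that it is expressed in percent, which is an assumption made to match the way accuracies are tabulated; the class labels are made up and the function names are hypothetical.

```python
def confusion_counts(target, predicted):
    """Count tp, tn, fp, fn with class 1 (up) as the positive class."""
    pairs = list(zip(target, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def pcca(tp, tn, fp, fn):
    """Eq. (59), expressed in percent (the factor 100 is an assumption)."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)

def f_measure(tp, fp, fn):
    """Eqs (60)-(62): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    pn = tp / (tp + fp)
    rl = tp / (tp + fn)
    return 2.0 * pn * rl / (pn + rl)

# Made-up target and predicted classes, for illustration only.
target = [1, 1, 0, 0, 1, 0, 1, 0]
predicted = [1, 0, 0, 0, 1, 1, 1, 0]
tp, tn, fp, fn = confusion_counts(target, predicted)
accuracy = pcca(tp, tn, fp, fn)
fm = f_measure(tp, fp, fn)
```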

Table 9

Price trend on historical data of MXN/USD, MYR/USD, BRL/USD selected for training ${}^{**}$ and testing ${}^{**}$

		Training			Testing
Dataset	Year	Increase	Decrease	Total	Increase	Decrease	Total
MXN/USD	2012–2015	283	239	522	271	251	522
MYR/USD	2012–2015	257	265	522	243	279	522
BRL/USD	2012–2015	227	295	522	234	288	522

${}^{**}$ Training: first 50% of the data of the dataset (522 data), ${}^{**}$ Testing: next 50% of the data of the dataset (522 data).

Table 10

Classification accuracy obtained during training ${}^{**}$ and testing ${}^{**}$ of MXN/USD, MYR/USD and BRL/USD with reference to Table 9 using WWCA-Wavelet-DAN and other predictive approaches

		Training		Testing		Training		Testing
Dataset	Year	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)	Trn (PCCA)	Trn (Fm)	Tst (PCCA)	Tst (Fm)
		WWCA-Poly-DAN				WWCA-tanh-DAN
MXN/USD	2012–2015	83.33	0.8191	83.10	0.7885	83.52	0.8245	84.45	0.8146
MYR/USD	2012–2015	79.11	0.8111	82.72	0.8375	78.58	0.7963	78.00	0.7884
BRL/USD	2012–2015	81.22	0.7832	85.02	0.8243	80.84	0.7768	84.83	0.8225
		WWCA-Gaussian-DAN				WWCA-Wavelet-DAN
MXN/USD	2012–2015	85.63	0.8434	83.49	0.7852	86.00	0.8614	84.45	0.8213
MYR/USD	2012–2015	79.11	0.8130	81.19	0.8280	80.00	0.8147	81.57	0.8261
BRL/USD	2012–2015	82.79	0.7926	85.60	0.8299	84.26	0.8255	86.04	0.8464
		Naive-Bayes Classifier				Support Vector Classifier
MXN/USD	2012–2015	78.50	0.7756	77.60	0.7813	79.11	0.8056	79.00	0.7859
MYR/USD	2012–2015	74.60	0.7582	73.00	0.7374	75.13	0.7634	74.55	0.7517
BRL/USD	2012–2015	76.15	0.7604	76.00	0.7636	78.05	0.7827	80.12	0.7866
		ANN				RELM
MXN/USD	2012–2015	78.33	0.7894	77.80	0.7702	81.35	0.8102	82.40	0.8005
MYR/USD	2012–2015	75.00	0.7586	73.83	0.7319	77.05	0.7597	76.00	0.7364
BRL/USD	2012–2015	77.83	0.7753	77.24	0.7683	78.15	0.7555	82.05	0.8117
		WWCA-Poly-KRR				WWCA-tanh-KRR
MXN/USD	2012–2015	81.57	0.7963	81.00	0.7787	80.88	0.8156	83.15	0.8008
MYR/USD	2012–2015	78.24	0.7977	81.05	0.8241	76.85	0.7783	77.15	0.7731
BRL/USD	2012–2015	80.00	0.7663	84.00	0.8184	79.15	0.7606	83.47	0.8156
		WWCA-Gaussian-KRR				WWCA-Wavelet-KRR
MXN/USD	2012–2015	82.66	0.8375	81.26	0.7764	84.75	0.8550	82.40	0.8162
MYR/USD	2012–2015	77.55	0.7812	79.85	0.8165	78.80	0.8025	80.30	0.8181
BRL/USD	2012–2015	81.00	0.7162	83.50	0.8158	82.60	0.8146	84.00	0.8376

Avg: Average, Trn: Training, Tst: Testing, PCCA: Percentage correct classification accuracy, Fm: F-measure.

The scaled values of the MA, RSI, MACD, %K, %D, and %R technical indicators are employed as inputs to the proposed method, with the aim of correctly classifying the trend against the targeted trend. Given these inputs, the model yields continuous output values, which are then used to find the trend with a decision point fixed at the mean ( $\theta$ ) of the output values. By rule, if the output is higher than $\theta$ then an up trend is found, and if the output is lower than $\theta$ then a down trend is found. The calculated up/down trends are classified as class-up/class-down and fixed as target classes denoted by 1/0. The training and testing sets mentioned in Table 1b are used for finding the class during training and testing and for measuring the classification accuracy against the target classes.
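The decision rule with the threshold $\theta$ can be sketched as follows. The output values are made up, the function name `classify_by_mean` is hypothetical, and assigning an output exactly equal to $\theta$ to the down class is an assumption, since the rule above does not cover that case.

```python
def classify_by_mean(outputs):
    """Label each continuous output 1 (up) if it exceeds the mean theta, else 0 (down)."""
    theta = sum(outputs) / len(outputs)
    classes = [1 if o > theta else 0 for o in outputs]
    return classes, theta

# Made-up continuous model outputs, for illustration only.
classes, theta = classify_by_mean([0.2, 0.6, 0.4, 0.8])
```

The resulting class list can be compared with the target classes to obtain the classification accuracy.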

The prediction accuracies of the proposed model and of the other kernel variants are reported in Table 6 for the MXN/USD dataset. From the obtained results, the proposed Wavelet-KRR model with WWCA optimized values (WWCA-Wavelet-DAN) attains the highest or near-highest PCCA in almost every case and yields the highest average PCCA (APCCA) over the four years 2012, 2013, 2014 and 2015: 86.69% during training and 82.60% during testing, which are the best accuracies as compared to the other kernel variants, i.e. the polynomial, tanh, and Gaussian kernels. The APCCAs obtained with the polynomial, tanh and Gaussian kernel functions are lower: 84.74%, 82.53%, and 86.11%, respectively, during training and 79.30%, 81.40%, and 82.00%, respectively, during testing. The year wise accuracies of these variants are in almost every case lower than those of the wavelet kernel based DAN model. The calculated Fm values are also given in the table. From the tabulated values and the analysis of the results, the wavelet kernel is the most efficient in finding the correct classes during training and testing for the MXN/USD dataset.

Table 7 shows that for the MYR/USD dataset the highest classification accuracy is obtained with the proposed model, with an average accuracy of 81.15% in training and 76.20% in testing across all years. Its year-wise classification accuracy is also superior to that of the other kernel variants in almost every year. In contrast, the accuracy of the Poly-DAN approach over the years varies between 75.20% and 76.80% during testing and from 80.15% to 82.53% during training, which is comparatively lower than that of the proposed model. Similarly, the tanh-DAN and Gaussian-DAN models are less efficient than the proposed one in these cases. The corresponding Fm values also support the WWCA-Wavelet-DAN approach. The table shows that the proposed method is consistently more accurate than the other kernel variants when performance is compared year-wise, and hence the proposed WWCA-Wavelet-DAN model is the best among the predictive models considered here.

The trend prediction results for the third dataset, BRL/USD, are summarized in Table 8. Here too, the proposed approach achieves better performance in all the considered cases. Its average classification accuracy is 85.83%, the highest among the considered models. Its year-wise classification accuracies are also higher than those obtained with the other kernel variants. In 2012 and 2015 the highest accuracy is 86.40%, whereas in 2013 and 2014 it is slightly lower. In all cases WWCA-Wavelet-DAN maintains higher accuracies in both training and testing. The Fm values reported in the table confirm the efficiency of the proposed model in all the case studies. Overall, the proposed model achieves promising performance on the MYR/USD, MXN/USD, and BRL/USD datasets.

For further verification, the entire data across all four years are partitioned into two phases: the first 50% of the data are used for training and the remaining 50% for testing. Each phase contains 522 data points, covering the periods 03 Jan. 2012 to 31 Dec. 2013 and 01 Jan. 2014 to 31 Dec. 2015 for training and testing, respectively. The up and down trends on historical data during training and testing are summarized in Table 9, and the corresponding trend prediction accuracies are reported in Table 10. Comparing the results in Table 10 with the four-year average PCCAs presented in Table 6, Table 7, and Table 8 for MXN/USD, MYR/USD, and BRL/USD shows similar results, with the proposed WWCA-Wavelet-DAN model again achieving better accuracy. Table 10 also compares the robust WWCA-WDAN with WWCA-WKRR; the proposed WWCA-WDAN model achieves higher classification accuracies in all cases.
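The 50/50 partition described above is a chronological split: the series is cut at its midpoint without shuffling, so every test sample is later in time than every training sample. A minimal sketch is given below; the helper name `chronological_split` is hypothetical, and integers stand in for the 1044 daily records.

```python
def chronological_split(samples, train_fraction=0.5):
    """Split a time-ordered sequence into (train, test) without shuffling."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# 1044 time-ordered placeholders, i.e. 522 samples per phase as in the text
samples = list(range(1044))
train, test = chronological_split(samples)
print(len(train), len(test))  # 522 522
```

Keeping the temporal order avoids leaking future information into training, which a random split of a time series would do.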

This variation in the training and testing datasets for the three forex markets again verifies the model's efficiency, and the PCCA and Fm values obtained in the experiment support the superiority of the proposed WWCA-Wavelet-DAN model over the other kernel variants. The model is further compared with the Naïve-Bayes classifier, the support vector classifier, and ANN. The classification results in Table 10 clearly indicate that WWCA-Wavelet-DAN (bold font) achieves higher PCCA and Fm values on all the datasets. The WWCA-Wavelet-DAN model thus outperforms all the predictive models considered in this study, including the other kernel variants, for forex trend detection.

6. Conclusion

The study and analysis of the forex market is one of the most complex tasks in financial markets and remains a challenge to human intelligence. In this work, the non-iterative Wavelet-DAN model is proposed for forex price prediction and trend detection in order to achieve promising accuracy with low computational overhead. To maximize forecasting and classification accuracy, a modified WCA algorithm, WWCA, is proposed and used to optimize the kernel parameters. The model is compared with optimized DAN and KRR models combined with other kernel functions, namely the polynomial, hyperbolic tangent, and Gaussian kernels. The experimental results confirm that the optimized WWCA-Wavelet-DAN model gives the best prediction accuracy, with the lowest RMSE, MAE, and MAPE values among the considered kernel functions for the MYR/USD, MXN/USD, and BRL/USD forex datasets. It is also superior to the well-known non-iterative ELM and RELM techniques and to the traditional BPNN technique, which requires a higher number of iterations for convergence and incurs high computational overhead.

Forex trend prediction, treated here as a classification problem, is solved using the WWCA-Wavelet-DAN approach on the three forex datasets (MXN/USD, MYR/USD, BRL/USD). The classification accuracies obtained in the experiments, in terms of PCCA and F-measure, demonstrate its superiority over the other kernel variants and over well-known classifiers such as the Naïve-Bayes classifier, the support vector classifier, and ANN. The proposed model offers a simple execution procedure and is efficient for nonlinear time series prediction such as currency exchange rates. From the results and discussion, we conclude that the proposed approach is efficient and can be treated as a promising machine expert approach for nonlinear time series prediction in highly noisy environments.

Footnotes

Conflict of interest

The authors declare that they have no conflict of interest for this paper.

Future scope

This paper focuses on day-ahead currency exchange rate prediction and daily price trend detection. In future work, the authors will focus on trading, using the predicted factors to support stock investment decisions. The proposed method can also be applied to other forecasting problems such as stock price forecasting, electricity price forecasting, and load forecasting. Other variants of chaos functions will also be considered and verified for currency exchange rate prediction and trend detection.
