Abstract
Machine learning, a subset of artificial intelligence, offers a promising set of algorithms for tackling increasingly complex challenges. The ability of these algorithms to address tasks without explicit programming, coupled with the expanding availability of computational resources and information transparency, has made it possible to use them to forecast prices. In recent years, cryptocurrency has grown in popularity and has seen wider adoption as a payment method, and cryptocurrency trading and mining have become potentially very lucrative ventures. However, due to the instability of cryptocurrency prices, making accurate predictions can be quite challenging. One way of approaching this challenge is to treat it as a time-series forecasting problem. A particularly promising method for this type of problem is the long short-term memory (LSTM) artificial neural network. However, the forecasting accuracy of machine learning models is highly dependent on adequate hyperparameter settings. Thus, this work presents an improved variation of the arithmetic optimization algorithm, tasked with selecting the best hyperparameter values of an LSTM network that makes price predictions. The presented approach has been evaluated on publicly available real-world Ethereum trading price data. The results of a comparative analysis against several popular metaheuristics indicate that the presented method achieved excellent results and outperformed the aforementioned algorithms in one- and four-step-ahead predictions.
Introduction
In the last decade, cryptocurrencies have seen astounding proliferation. This digital payment system employs advanced cryptographic methods to safeguard and manage transactions that are made in a peer-to-peer manner among individuals. The benefits of cryptocurrencies over conventional payment methods include decentralization, security, anonymity, immutability, ease of financial transfers, and decreased transaction costs [1], making them an increasingly popular means of payment. The pioneer of all cryptocurrencies is Bitcoin, first presented in Satoshi Nakamoto’s 2008 article [2]. The idea gained widespread acceptance and was used in a variety of applications. It also influenced the development of cryptocurrency ecosystems and gave rise to many additional virtual currencies, including Ethereum, Tether, Binance, Stellar, Cardano, XRP, and Dash [3]. There are presently more than 20,000 different types of cryptocurrencies accessible for public trade. This demonstrates the crucial role cryptocurrencies play as a medium of electronic transactions and as a financial asset, which has sparked a lot of interest among investors, researchers, developers, and the general public [4, 5].
The pillar on which the cryptocurrency market operates is the blockchain: an open ledger of all transactions that take place inside the system, available to all users. As a result, everyone using the system may see how each record has been balanced. Consequently, a centralized authority is no longer required, and the entities directly involved in the transaction are given control [6]. The ledger consists of an ever-growing sequence of blocks that include the sender’s and receiver’s public keys, a timestamp, a cryptographic hash of the preceding block, and the quantity of transferred coins [7]. Wallet addresses are the labels given to these public keys. Each transaction must be authorized by the sender with a private key. After confirmation, the transaction is promptly broadcast throughout the network. Since the system is decentralized, transactions are validated by network members through an activity called mining. By solving a cryptographic puzzle, miners in a cryptocurrency network can validate transactions. A miner must verify the transaction, mark it as authentic, and then broadcast it throughout the network. Each node then updates its database with the completed transactions. For this computational work, miners are granted a virtual reward, encouraging more individuals to participate in the network. Once the transaction has been verified, the transaction procedure ends successfully. Ultimately, the resistance of the blockchain to tampering is what gives a cryptocurrency its legitimacy.
One distinctive quality of most cryptocurrencies is highly unpredictable price fluctuations, which have an immediate impact on investments and are driven by a variety of variables. To make strategic judgments and maximize investment returns, professional traders require a reliable forecasting tool [6]. The assessment of impending price changes is made more challenging by the minimal correlation of digital currencies with traditional assets. Furthermore, numerous factors may affect the price of cryptocurrencies, including changes in macroeconomic conditions, impactful world events, misleading news, governmental regulations, and volatile social media content [8].
Determining and effectively interpreting complex relations requires models with sufficient complexity. Additionally, a model needs to be capable of handling data formulated as a time series. While recurrent neural networks show great promise for time series, problems intrinsic to this approach, such as vanishing gradients, have led researchers to develop improved methods. One notable approach is the long short-term memory (LSTM) [9] neural network. By using memory cells and a sophisticated set of control gates, the LSTM network can select and retain data important for future forecasts and recall it when needed. These abilities make LSTM neural networks well suited for price forecasting.
Like many artificial intelligence (AI) and machine learning (ML) algorithms, LSTM networks expose a set of control parameters that govern specific behaviors. While these parameters help ensure good general performance across different problems, they require appropriate tuning to suit a specific task. The number of hyperparameters has increased with more modern algorithms. The process of selecting specific values has traditionally been handled manually. However, the increasing number of possible combinations makes this process difficult. Therefore, methods of automation have been developed in the form of hyperparameter tuning. By framing the selection process as an optimization task, the choice of values can be automated. Nevertheless, determining optimal hyperparameters is often understood to be an NP-hard problem, and algorithms capable of handling this form of optimization are needed.
When handling optimization problems, few groups of algorithms have shown the same success as swarm intelligence algorithms. By simulating behaviors inspired by natural groups cooperating toward a goal, such as hunting or reproduction, swarm intelligence algorithms have demonstrated their ability to address complex optimizations. By refining a population of agents that represent potential solutions through carefully crafted exploration and exploitation mechanisms, swarm algorithms can even handle NP-hard problems. This makes swarm intelligence algorithms promising contenders for optimizing hyperparameter selection.
The principal focus of this manuscript is a cryptocurrency called Ether, whose blockchain is provided by the Ethereum platform, presented by Vitalik Buterin in 2013 [10] and operational since 2015. We propose a method for univariate Ethereum price forecasting that employs long short-term memory networks optimized using a metaheuristic called the arithmetic optimization algorithm (AOA). This approach outperforms other modern metaheuristics when the mean and best cases across 15 independent runs are considered, while performing marginally worse in other situations. Furthermore, this work aims to expand on previous research [11].
The primary scientific contributions of the conducted research can be summarized as follows:
A proposal for a novel improved AOA metaheuristic that improves on the already admirable performance of the original.
The introduction of a novel LSTM-based approach for accurately forecasting Ethereum prices five steps ahead.
The application of the introduced metaheuristic to optimizing the hyperparameter values of an LSTM network to further improve performance.
The rest of this work is organized as follows: Section 2 presents the most relevant published literature tackling similarly challenging issues using novel ML and AI algorithms, together with a brief background overview of LSTM neural networks and swarm intelligence. Section 3 describes in detail the methods that form the backbone of this research and the newly introduced improved algorithm. In Section 4 and Section 5, the experimental setup and attained results are presented and discussed. Finally, Section 6 concludes the conducted research and presents proposals for future work.
With the increased availability of computational resources, researchers have started exploring potential applications of computational methods to address ever more complex problems. A recently popular subject is the topic of secure and distributed transactions. By combining peer-to-peer technology with advanced cryptographic protocols, researchers have developed fully digital currencies that offer several advantages over traditional methods, including independence from central authorities such as governments and banks. This paradigm shift in computing has resulted in novel approaches being developed. The initial white paper that introduced Bitcoin [2] presented several protocols for handling secure transactions in a distributed peer-to-peer network. These principles inspired many researchers to develop variations of the original cryptocurrency, each with its own unique contributions [12]. Nevertheless, Bitcoin had a huge first-to-market advantage and quickly established market dominance that it maintains to this day.
While Bitcoin had many advantages by being the first cryptocurrency on the market, newer currencies have managed to carve out their own potentially profitable niches. Some interesting examples tether a virtual coin to the United States Dollar in hopes of stabilizing price volatility. Other currencies reward users for performing certain actions or for making their own hardware available for community use. However, few have seen the rapid success of Ethereum [13]. Ether, built on the Ethereum blockchain, was introduced in 2015 and quickly established a market position second only to Bitcoin. The rapid economic growth of Ethereum has made it a potentially very lucrative venture. However, the lack of tethering of virtual currencies to real-world assets makes them prone to rapid and drastic variations in price [14]. These changes make cryptocurrencies a risky venture and often deter potential investors. A need for a robust approach that is capable of accurately forecasting prices while accounting for rapid variations is apparent.
By formulating price data as a time-series problem, sophisticated algorithms capable of working within this domain can be applied to make predictions. One promising approach that has seen an increase in popularity within the last decade is the application of neural networks. Artificial neural networks mimic structures observed in biological brains. By mathematically modeling neurons and the synapses that connect them, neural networks have been able to address problems in dynamic environments and adjust connections to adapt without the need for explicit programming. By increasing the number of layers in the network architecture, more complex tasks may be addressed, and deep networks have in recent years shown great promise for such tasks. However, this comes at the cost of increased computational demands. Recent global silicon shortages have limited the immediate availability of computational resources. This has further emphasized the need for more optimized and lighter networks capable of making the best of the available hardware. Therefore, optimization plays an increasingly crucial role in model development and testing.
Traditional networks are not designed to handle time series and rely on single-point predictions. One approach for handling problems in the time domain is to use recurrent neural networks (RNNs) [15], which leverage cyclical connections within the network. This allows previous inputs to affect subsequent predictions. However, RNNs have several intrinsic issues that can cause problems in practice. One notable shortcoming is the problem of vanishing gradients [16], which makes training less effective and more difficult. Therefore, methods capable of overcoming these shortcomings have been developed by researchers [9].
An approach that shows great promise when applied to time-series forecasting is the LSTM [9] network. This variant of recurrent networks uses a special cell state and various gates that allow the network to retain a certain amount of information and release it when deemed necessary. Researchers have applied this type of network to various fields, including oil price forecasting [17]. However, to attain good performance, adequate control parameters need to be selected and optimized. While this process has traditionally been handled through trial and error, the increasing number of parameters in newer algorithms has created a need for techniques that automate the selection process.
Hyperparameter tuning is an increasingly popular method used by researchers to improve the performance of existing models. By formulating algorithm performance as an optimization task and applying optimization algorithms, the process of selection can be automated. This in turn further boosts the overall performance of the resulting methods. Researchers have successfully applied this method to several algorithms and across multiple fields. Some notable examples include optimizing XGBoost parameters for various purposes [18, 19, 20], as well as optimizing LSTM parameters [21, 22].
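As a rough illustration of how hyperparameter selection can be framed as an optimization task, the sketch below searches a small parameter space for the lowest objective value. It is illustrative only: plain random search stands in for a metaheuristic, `toy_objective` is a hypothetical placeholder for training a model and returning its validation error, and the parameter names and bounds are assumptions rather than values taken from this work. The default population size and iteration count mirror the six individuals and eight iterations reported in the experimental setup.

```python
import random

def random_search(objective, bounds, population=6, iterations=8, seed=0):
    """Sample candidate hyperparameter sets and keep the one with the lowest objective."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(iterations):
        for _ in range(population):
            # Draw every hyperparameter uniformly from its allowed range.
            candidate = {name: rng.uniform(low, high)
                         for name, (low, high) in bounds.items()}
            score = objective(candidate)
            if score < best_score:
                best_params, best_score = candidate, score
    return best_params, best_score

def toy_objective(params):
    # Hypothetical stand-in for "train the model and return its validation MSE".
    return (params["learning_rate"] - 0.01) ** 2 + (params["dropout"] - 0.1) ** 2

# Hypothetical search space; not the ranges used in the experiments.
bounds = {"learning_rate": (0.0001, 0.1), "dropout": (0.0, 0.5)}
best_params, best_score = random_search(toy_objective, bounds)
```

A metaheuristic replaces the uniform sampling step with guided updates of the population, while the objective function and bounds stay the same.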
LSTM
An LSTM network [23] is a variation of an RNN that enables data to be maintained within cell states. As a result, past inputs have an impact on future outputs, making it particularly suited to time-series predictions. A conventional network’s cells are replaced with memory nodes in the network layers to provide the ability to store data. Three types of gates, namely input, output, and forget gates, form the building blocks of memory nodes, which work by judiciously capturing and releasing the data that passes through them.
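Data flowing through a forget gate $f_t$ is governed by Eq. (1), written here in the standard LSTM formulation:

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \tag{1}$$

where $x_t$ is the input at time step $t$, $h_{t-1}$ is the previous output, $W_f$ and $U_f$ are the weight matrices of the forget gate, $b_f$ is its bias vector, and $\sigma$ is the sigmoid function.

Data is chosen and stored in memory cells by the ensuing step. The sigmoid function determines which values will be updated, as established per Eq. (2) for the input gate $i_t$:

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \tag{2}$$

Here, $W_i$, $U_i$, and $b_i$ are the learnable weights and bias of the input gate.

The vector of candidate values $\tilde{C}_t$ is computed as

$$\tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c) \tag{3}$$

in which $W_c$, $U_c$, and $b_c$ are the corresponding weights and bias.

Upon making a decision on what data will be recorded, the state of a specific cell $C_t$ is updated as

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4}$$

where $\odot$ denotes element-wise multiplication.

The sigmoid function is also employed to determine the output gate $o_t$:

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \tag{5}$$

where the range of $o_t$ is $(0, 1)$, and $W_o$, $U_o$, and $b_o$ are the weights and bias of the output gate.

Last but not least, the output value $h_t$ is obtained as

$$h_t = o_t \odot \tanh(C_t) \tag{6}$$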
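To make the interplay of the forget, input, and output gates concrete, the following sketch computes a single LSTM time step for one scalar input and one hidden unit. It assumes the standard LSTM formulation; the parameter names (`W_f`, `U_f`, `b_f`, and so on) are illustrative labels, not identifiers taken from the implementation used in this work.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step for a scalar input and a single hidden unit.

    p is a dict of scalar weights and biases (assumed names, standard formulation).
    """
    f_t = sigmoid(p["W_f"] * x_t + p["U_f"] * h_prev + p["b_f"])      # forget gate
    i_t = sigmoid(p["W_i"] * x_t + p["U_i"] * h_prev + p["b_i"])      # input gate
    c_hat = math.tanh(p["W_c"] * x_t + p["U_c"] * h_prev + p["b_c"])  # candidate value
    c_t = f_t * c_prev + i_t * c_hat                                  # new cell state
    o_t = sigmoid(p["W_o"] * x_t + p["U_o"] * h_prev + p["b_o"])      # output gate
    h_t = o_t * math.tanh(c_t)                                        # new output
    return h_t, c_t

# With all parameters at zero every gate equals 0.5 and the candidate is 0,
# so the cell state is simply halved.
zero_params = {name: 0.0 for name in (
    "W_f", "U_f", "b_f", "W_i", "U_i", "b_i",
    "W_c", "U_c", "b_c", "W_o", "U_o", "b_o")}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
```

In a full network the scalars become matrices and vectors, and the products with the gates are taken element-wise.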
Beni and Wang first proposed the concept of swarm intelligence in 1989 in the context of the intelligent behavior of cellular robotic systems [24]. This area of study examines both man-made and natural systems in which many individual units work together to create favorable outcomes for an entire population through decentralized control and self-organization. A typical swarm intelligence system consists of several discrete “boids” [25] that interact both locally and with their surroundings. Boids are an example of emergent behavior: unbeknownst to the individuals, intelligent global behavior arises from seemingly random actions taken by individual agents that adhere to a set of basic rules, in the absence of a centralized framework to govern their coordination.
The core principles behind the functioning of swarm intelligence algorithms predominantly stem from nature. The most famous nature-inspired algorithm, initially intended to simulate social behavior [26], is particle swarm optimization (PSO). This class of algorithms can be applied to various tasks that can be formulated as optimization problems. Other notable representatives of this family are the artificial bee colony (ABC) [27], firefly algorithm (FA) [28], sine cosine algorithm (SCA) [29], bat algorithm (BA) [30], and many others [31, 32, 29]. In addition, various hybrid optimization methods combine the best traits of swarm algorithms and conventional machine learning models, and they show an increasing trend in popularity among researchers. A few noteworthy cases include the two-stage GA-PSO-ACO algorithm [33], interactive search algorithm (ISA) [34], ABC-BA [35], Swarm-TWSVM [36], and those used for forecasting, such as a hybrid consisting of support vector machines (SVM) and RNNs, called RSVR [37].
Although swarm intelligence algorithms differ in how they approach problems and, according to the no free lunch (NFL) theorem, no single approach is best for all problems [38], they often perform optimization tasks exceptionally well. Even NP-hard problems may be addressed by swarm intelligence algorithms with satisfactory outcomes, using reasonable resources and in a fair amount of time.
Metaheuristic algorithms have also recently been used to deal with various issues in the domain of artificial intelligence, such as stock market forecasting [39], plant classification [40], intrusion detection and security [41, 42, 43], spam detection [44], and medical applications [45, 46, 47, 48], to mention a few.
Proposed method
Arithmetic optimization algorithm
The AOA is a recent algorithm that draws inspiration from fundamental arithmetic operators in mathematics [49]. The procedure of refining solutions in metaheuristic algorithms is subdivided into two elementary stages, referred to collectively as the search phase. These stages are exploration and exploitation. During the first stage, the algorithm scours unexplored areas of the problem space, while the second stage focuses on areas that have already been investigated.
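The optimization process initiates with a randomized matrix $X$ of candidate solutions:

$$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,n} \\ \vdots & \ddots & \vdots \\ x_{N,1} & \cdots & x_{N,n} \end{bmatrix}$$

in which $N$ is the number of candidate solutions and $n$ is the number of dimensions of the problem. Following the original AOA [49], the search phase is selected in each iteration by the Math Optimizer Accelerated (MOA) function:

$$\mathrm{MOA}(C_{Iter}) = Min + C_{Iter} \times \frac{Max - Min}{M_{Iter}}$$

where $C_{Iter}$ is the current iteration, $M_{Iter}$ is the maximum number of iterations, and $Min$ and $Max$ are the minimum and maximum values of the accelerated function.

In the exploration phase, Division ($D$) and Multiplication ($M$) operators are used, applied when a random number $r_1 > \mathrm{MOA}$:

$$x_{i,j}(C_{Iter}+1) = \begin{cases} best(x_j) \div (\mathrm{MOP} + \epsilon) \times ((UB_j - LB_j) \times \mu + LB_j), & r_2 < 0.5 \\ best(x_j) \times \mathrm{MOP} \times ((UB_j - LB_j) \times \mu + LB_j), & \text{otherwise} \end{cases}$$

whereby $best(x_j)$ is the $j$-th position of the best solution found so far, $\epsilon$ is a small number that prevents division by zero, $UB_j$ and $LB_j$ are the upper and lower bounds of the $j$-th dimension, $\mu$ is a control parameter of the search process, and $r_2$ is a random number.

Here, Math Optimizer Probability (MOP) is a coefficient defined as

$$\mathrm{MOP}(C_{Iter}) = 1 - \frac{C_{Iter}^{1/\alpha}}{M_{Iter}^{1/\alpha}}$$

where $\alpha$ is a sensitivity parameter that defines the exploitation accuracy over the iterations.

Then, using the Addition ($A$) and Subtraction ($S$) operators, the exploitation phase is performed:

$$x_{i,j}(C_{Iter}+1) = \begin{cases} best(x_j) - \mathrm{MOP} \times ((UB_j - LB_j) \times \mu + LB_j), & r_3 < 0.5 \\ best(x_j) + \mathrm{MOP} \times ((UB_j - LB_j) \times \mu + LB_j), & \text{otherwise} \end{cases}$$

Candidate solutions gravitate towards divergence from the potentially almost optimal solution whenever $r_1 > \mathrm{MOA}$, and towards convergence to it whenever $r_1 < \mathrm{MOA}$.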
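The exploration and exploitation rules of the basic AOA can be sketched as follows. This is a minimal sketch based on the original AOA description [49], not the implementation used in this work: the function names are hypothetical, the default values (`moa_min=0.2`, `moa_max=1.0`, `alpha=5`, `mu=0.499`) are commonly used settings assumed here, and clipping to the bounds is an added assumption.

```python
import random

def moa(c_iter, m_iter, moa_min=0.2, moa_max=1.0):
    """Math Optimizer Accelerated: grows linearly from moa_min to moa_max."""
    return moa_min + c_iter * (moa_max - moa_min) / m_iter

def mop(c_iter, m_iter, alpha=5.0):
    """Math Optimizer Probability: shrinks from 1 to 0 over the iterations."""
    return 1.0 - (c_iter ** (1.0 / alpha)) / (m_iter ** (1.0 / alpha))

def aoa_update(best, lb, ub, c_iter, m_iter, mu=0.499, eps=1e-12, rng=random):
    """Generate one new candidate position around the best solution found so far."""
    accel, prob = moa(c_iter, m_iter), mop(c_iter, m_iter)
    new_position = []
    for j in range(len(best)):
        scale = (ub[j] - lb[j]) * mu + lb[j]
        r1, r2, r3 = rng.random(), rng.random(), rng.random()
        if r1 > accel:
            # Exploration: Division or Multiplication operator.
            value = best[j] / (prob + eps) * scale if r2 < 0.5 else best[j] * prob * scale
        else:
            # Exploitation: Subtraction or Addition operator.
            value = best[j] - prob * scale if r3 < 0.5 else best[j] + prob * scale
        # Assumption: keep the candidate inside the search bounds.
        new_position.append(min(max(value, lb[j]), ub[j]))
    return new_position

candidate = aoa_update(best=[1.0, -2.0], lb=[-5.0, -5.0], ub=[5.0, 5.0],
                       c_iter=3, m_iter=10, rng=random.Random(1))
```

Because MOA increases with the iteration counter, exploration dominates early iterations and exploitation dominates later ones.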
Despite the good performance of the basic AOA, certain drawbacks have been observed [50], signifying that there is room for further improvement. A notable tendency of the original algorithm to get stuck in sub-optimal regions can lead to premature convergence, in turn resulting in reduced accuracy. This section describes a proposed strategy to mitigate this problem and improve on the excellent results presented by the original metaheuristic.
A popular and effective approach for addressing algorithm drawbacks is hybridization. By incorporating mechanisms from other well-performing metaheuristics, a novel hybrid algorithm is created, with improved overall performance.
Following extensive testing, the firefly algorithm (FA) [28] has been selected to address the shortcomings of the AOA due to its particularly powerful exploration mechanism. Accordingly, the proposed algorithm is named the AOA with firefly search (AOA-FS).
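The improved version of the algorithm introduces the search mechanism utilized by the original FA [28], as shown in Eq. (11):

$$x_i^{t+1} = x_i^t + \beta_0 e^{-\gamma r_{i,j}^2}(x_j^t - x_i^t) + \alpha^t(\kappa - 0.5) \tag{11}$$

where $x_i^t$ and $x_j^t$ are the positions of fireflies $i$ and $j$ at iteration $t$, with $j$ being the brighter one, $\beta_0$ is the attractiveness at distance zero, $\gamma$ is the light absorption coefficient, $r_{i,j}$ is the distance between the two fireflies, $\alpha^t$ is the randomization parameter, and $\kappa$ is a random number drawn from a uniform distribution.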
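The firefly search step can be sketched as below. This is a minimal sketch of the basic FA movement rule [28], in which a firefly moves towards a brighter one with an attraction that decays with distance plus a small random perturbation. The function name and default parameter values are assumptions for illustration, and how this step is combined with the AOA in the proposed method is not shown.

```python
import math
import random

def firefly_move(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.2, rng=random):
    """Move firefly x_i towards the brighter firefly x_j (basic FA rule)."""
    # Squared Euclidean distance between the two fireflies.
    dist_sq = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    # Attractiveness decays exponentially with the squared distance.
    beta = beta0 * math.exp(-gamma * dist_sq)
    # Attraction term plus a uniform random perturbation in [-alpha/2, alpha/2).
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(x_i, x_j)]

# With no absorption and no randomness the firefly lands exactly on the brighter one.
moved = firefly_move([0.0, 0.0], [1.0, 2.0], beta0=1.0, gamma=0.0, alpha=0.0)
```

Larger values of `gamma` make distant fireflies nearly invisible to each other, which keeps the search local, while `alpha` controls the amount of random exploration.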
The proposed algorithm’s pseudocode is presented in Algorithm 3.2.
Arithmetic optimization firefly search algorithm pseudocode
Define values for
In this manuscript, univariate forecasting of Ethereum prices was conducted using daily closing prices gathered from Investing.com, a service that offers news and analysis on worldwide financial markets. During experimentation, the first 70% of the available samples were used for model training, with the remaining 30% reserved for evaluation. The dataset and the train/test split are shown in Fig. 1.
Fig. 1. Train/test split of closing Ethereum price used in experiments.
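The 70/30 split described above can be sketched as a chronological cut, so that the evaluation samples always come after the training samples. The function name is hypothetical, and it is an assumption that the split is performed on the ordered series without shuffling.

```python
def chronological_split(series, train_percent=70):
    """Split an ordered series into an earlier training part and a later test part."""
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(series) * train_percent // 100
    return series[:cut], series[cut:]

# Illustrative values only, not real price data.
prices = [100.0, 101.5, 99.8, 102.3, 104.1, 103.7, 105.2, 106.0, 104.9, 107.4]
train, test = chronological_split(prices)
```

Shuffling before splitting would leak future prices into the training set, which is why the order of the series is preserved.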
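The evaluation of the scheme is performed using the following metrics: mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), shown respectively in Eqs (12)–(15):

$$MSE = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2 \tag{12}$$

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2} \tag{13}$$

$$MAE = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \tag{14}$$

$$MAPE = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \tag{15}$$

In all these equations, $y_i$ and $\hat{y}_i$ denote the actual and predicted values of the $i$-th sample, and $N$ is the number of samples.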
Table 1. Objective function metrics over 15 runs for five-step ahead.
Fig. 2. LSTM comparative analysis visualizations.
Fig. 3. LSTM comparative analysis visualizations.
Fig. 4. Five steps ahead predictions made by best-performing model optimized by each metaheuristic.
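The four error metrics used for evaluation (MSE, RMSE, MAE, and MAPE) can be computed as in the sketch below. It uses the standard definitions; expressing MAPE as a percentage is an assumption, and the function names are illustrative.

```python
import math

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Illustrative values only.
actual = [1.0, 2.0, 4.0]
predicted = [1.0, 1.0, 2.0]
scores = {"MSE": mse(actual, predicted), "RMSE": rmse(actual, predicted),
          "MAE": mae(actual, predicted), "MAPE": mape(actual, predicted)}
```

MSE and RMSE penalize large errors more heavily than MAE, while MAPE is scale-free, which helps when price levels change substantially over the evaluated period.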
All tested models were independently implemented in Python, using TensorFlow and Keras. The prediction model used six input steps, representing six days’ worth of data, to predict prices five steps (days) ahead. The optimization algorithms used a population of six individuals and eight iterations per optimization run. Furthermore, the LSTM parameters were constrained to the following ranges: number of neurons
To demonstrate the improvements made by the introduced algorithm, it has been subjected to a comparative analysis against several well-known metaheuristics, including the original AOA [49], SCA [29], FA [28], BA [30], and ABC [27]. Since these algorithms were tasked with optimizing LSTM network hyperparameters, they are given an LSTM suffix to improve clarity. Furthermore, experimental findings, followed by a comparison with other contemporary algorithms applied to LSTM optimization, are presented in two types of tables.
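The six-input, five-output setup described above implies that the price series is cut into overlapping windows before training. The sketch below shows one way to build such windows; the function name is hypothetical and the data are placeholders.

```python
def make_windows(series, n_in=6, n_out=5):
    """Build (input, target) pairs: n_in past values predict the next n_out values."""
    inputs, targets = [], []
    for start in range(len(series) - n_in - n_out + 1):
        inputs.append(series[start:start + n_in])
        targets.append(series[start + n_in:start + n_in + n_out])
    return inputs, targets

# A series of 12 placeholder values yields two complete windows.
inputs, targets = make_windows(list(range(12)))
```

Each target row holds the one- to five-step-ahead values for the corresponding input row, so per-step metrics can be computed column by column.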
Table 2. The R2, MAE, MSE, and RMSE metrics of best-generated LSTM with 5 steps ahead.
Fig. 5. Five steps ahead predictions of the best performing AOA-FS-LSTM model (above) and original AOA-LSTM model (below) vs actual price values.
The following section presents and discusses the results attained by each of the metaheuristic-optimized models applied to cryptocurrency forecasting. The models are evaluated in several independent runs to ensure fair grounds for an objective comparison and to account for the random nature of metaheuristic algorithms. Furthermore, each metaheuristic has been evaluated under identical test conditions, with the same population size and number of iterations. Objective metrics are captured throughout the executions in order to gauge the performance of each algorithm.
Table 2 shows the R2, MAE, MSE, and RMSE performance indicators obtained in the best run, for each step separately. The proposed algorithm outperformed all contemporary metaheuristics in one and four steps ahead, while still attaining admirable results in two-, three-, and five-step-ahead predictions. The LSTM-SCA model outperformed the proposed algorithm in two-steps-ahead predictions, consistent with the NFL theorem, while the basic AOA achieved better results for three and five steps ahead. However, the proposed algorithm outperformed all other algorithms in overall results.
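Alongside the error metrics, the R2 indicator reported for the best run measures how much of the variance in the actual values is explained by the predictions. A minimal sketch using the standard coefficient of determination is shown below; the function name is illustrative, and it is an assumption that R2 here refers to this standard definition.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Illustrative values only.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A value of 1 indicates a perfect fit, while a model that always predicts the mean of the actual values scores 0.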
The best, worst, mean, median, standard deviation, and variance of the objective function (MSE) over 15 independent executions are shown in Table 1 for five-steps-ahead prediction. The best results are marked in bold text.
The median and best performance over 15 independent runs were attained by the novel proposed algorithm, while the LSTM-BA algorithm performed better in the worst case and arithmetic mean, further supporting the NFL theorem.
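The summary statistics reported over independent runs can be computed from the final objective value of each run, as in the sketch below. The function name and the sample values are illustrative, and the use of the sample (rather than population) standard deviation and variance is an assumption.

```python
import statistics

def summarize_runs(objective_values):
    """Summarize final objective values (lower is better) across independent runs."""
    return {
        "best": min(objective_values),
        "worst": max(objective_values),
        "mean": statistics.mean(objective_values),
        "median": statistics.median(objective_values),
        "std": statistics.stdev(objective_values),     # sample standard deviation
        "var": statistics.variance(objective_values),  # sample variance
    }

# Placeholder values, not results from the experiments.
summary = summarize_runs([4.0, 1.0, 3.0, 2.0])
```

Reporting the median next to the mean is useful because a single poor run can shift the mean noticeably when the number of runs is small.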
To further emphasize the improvements made, a visual comparison of the result distributions of the proposed LSTM-AOA-FS algorithm can be seen in Fig. 2. Additionally, the convergence speeds of all tested algorithms are compared in Fig. 3.
The forecasts made by each best-performing tested model can be seen in Fig. 4.
Finally, the overall best-performing prediction model, the AOA-FS-LSTM model, is shown alongside the original AOA-LSTM model in Fig. 5.
The research presented in this paper tackles the potentially very lucrative process of forecasting cryptocurrency prices, focusing on Ether running on the Ethereum blockchain. Since price changes can be formulated as a time series, an appropriate method has been selected for forecasting. Due to their ability to effectively retain data from previous inputs through the use of memory cells and various gates, LSTM neural networks have been applied to price forecasting. However, as LSTM networks, much like other AI algorithms, require adequate hyperparameter tuning to ensure the desired functionality, this work also introduces a novel improved version of the AOA. By leveraging the powerful search mechanism of the FA, the shortcomings of the original can be overcome. Owing to this mechanism, the novel algorithm is dubbed the AOA-FS. It is applied to selecting appropriate LSTM network hyperparameter values, improving performance and accuracy.
The attained results suggest that the introduced approach is favorable for predicting market prices with reasonable resources and within an acceptable time frame. Furthermore, the proposed approach attained the best results in mean and best cases in 15 independent runs and outperformed other algorithms in one- and four-step-ahead predictions, while performing marginally worse in other cases, suggesting that the proposed algorithm has potential when used for optimization and prediction.
Future research will focus on further refining the proposed approach and prediction model accuracy through additional optimization. On top of that, a goal for future research will be exploring the potential for applying the proposed novel algorithm in various fields.
