Abstract
The volatility of gold prices significantly influences global financial stability, which calls for reliable models that produce precise forecasts to minimize investment risk and maximize profitability. Recently, both machine learning and deep learning approaches have gained significant traction for time series forecasting across scientific and industrial domains. In this paper, we propose the ForCNN model, which takes grayscale images as input rather than traditional numerical data. The approach combines a visual image representation of the time series with a deep 2D convolutional neural network to extract important features and generate accurate forecasts. We carried out extensive experiments on two real-world gold closing price datasets and showed that ForCNN outperformed most of the state-of-the-art deep learning techniques, such as MLP, CNN1D, LSTM, CNN-LSTM, BiLSTM, and CNN-BiLSTM, in terms of accuracy measures. Furthermore, portfolio performance evaluation using Cumulative Return, Average Daily Return, and Sharpe Ratio indicates that ForCNN achieves superior profitability and stronger risk-adjusted performance, underscoring its effectiveness in practical financial forecasting applications.
Introduction
Gold plays a pivotal role in the global economy as one of the most preferred commodities for investment worldwide. However, the gold market exhibits pronounced volatility and sharp fluctuations due to inflation, supply-demand imbalances, and political disruption. An extensive body of research has addressed the development of gold price prediction algorithms. Traditional time series techniques such as ARIMA (Autoregressive Integrated Moving Average) modeling (Guha and Bandyopadhyay, 2016) and multilinear regression have been used to forecast future gold prices; however, these classic algorithms rely on assumptions such as stationarity and linear correlation between past observations. In addition to these statistical methods, various machine learning algorithms such as MLP (Multilayer Perceptron), SVM (Support Vector Machine), KNN (K-Nearest Neighbor), DT (Decision Trees), and SVR (Support Vector Regression) have been deployed to predict the price of gold (Das et al., 2022). In recent years, researchers have used deep learning models such as ANN, CNN, LSTM, BiLSTM and many others to produce gold price forecasts with better accuracy (Alameer et al., 2019; Livieris et al., 2020; Lu et al., 2021; Tuncer et al., 2022). Most of these deep learning techniques take a numeric representation of the time series as input. Although CNNs were originally designed to extract features from visual images, they are typically fed 1D numeric vectors when handling time series data. Despite the use of images in some time series analysis projects, no prior study has explored the use of grayscale images as input to neural networks for generating point forecasts. The spatial structure encoded in a visual representation of the time series offers a more insightful perspective, even though the same information is embedded in the numerical values.
Researchers found that combining visual time series representations with a deep 2D convolutional neural network (CNN) can extrapolate complex patterns and hence outperform other state-of-the-art deep learning models. Inspired by these findings, the contributions of our research are summarized as follows:
We propose a new approach based on the image-based ForCNN model, with hyperparameter optimization, to predict the gold price for the next time step.
We investigate the performance of the benchmark models MLP, CNN1D, LSTM, CNN-LSTM, BiLSTM, and CNN-BiLSTM with hyperparameter optimization and compare their prediction accuracy with that of the ForCNN model.
We examine the performance of the proposed approach through an empirical evaluation on two datasets of closing gold prices, in GBP (Great British Pound) and USD (United States Dollar), and further assess its investment potential using portfolio performance metrics such as Cumulative Return (CR), Average Daily Return (ADR), Annualized Return (AR), Annualized Volatility (AV), Sharpe Ratio (SR), and Maximum Drawdown (MD).
The outline of the paper is as follows: Sections 1 and 2 provide a brief introduction and a review of the literature on the use of deep learning and statistical models in gold price forecasting. Sections 3 and 4 describe the architecture of the ForCNN model, the datasets, the accuracy measures, and the benchmark models. The results of the current work and the conclusions are discussed in Sections 5 and 6.
Literature Review
The challenging task of predicting and modeling the volatility of financial series (Gezici and Sefer, 2024; Pala and Sefer, 2024; Uygun and Sefer, 2025) has attracted considerable attention from both researchers and practitioners. Banhi Guha and Gautam Bandyopadhyay published a paper on the prediction of gold prices in 2016 (Guha and Bandyopadhyay, 2016). Thereafter, in 2017, Sandya N. Kumari et al. demonstrated the strength of GARCH models for modeling and forecasting the volatility and non-linearity of gold prices (Kumari and Tan, 2018). Despite the effectiveness of ARIMA and GARCH models in gold price forecasting, machine learning and deep learning algorithms (Pirani et al., 2022) have recently gained popularity due to their ability to analyze complex trends and non-linear patterns and to handle larger datasets. Arash Tashakkori et al. established the efficacy of the MLP (Multilayer Perceptron) for forecasting gold prices and showed that MLP neural networks can capture the complex relationships behind gold price fluctuations (Tashakkori et al., 2024). Furthermore, R. Hafezi and A. N. Akhavan built an intelligent network equipped with a meta-heuristic algorithm known as the BAT algorithm, which was designed to enable the ANN to manage fluctuations in gold prices efficiently (Hafezi and Akhavan, 2018). In another study, Zakaria Alameer et al. trained a Multilayer Perceptron using the WOA (Whale Optimization Algorithm) to forecast gold prices and achieved higher precision than other baseline models (Alameer et al., 2019). Moreover, Andres Vidal and Werner Kristjanpoller developed a hybrid model that integrated LSTM and VGG19 in 2019 to enhance the precision of gold price volatility forecasts. Since the network was fed with images as input, it was capable of preserving both the stationary and dynamic characteristics of the time series (Vidal and Kristjanpoller, 2020).
In 2021, Yu-Chen Chen and Wen-Chen Huang utilized CNN and LSTM models to analyze market behavior and generate trading strategies for the S&P 500, using commodity prices such as gold and crude oil as indicators (Chen and Huang, 2021). Wenjie Lu et al. presented a CNN-BiLSTM-AM model to forecast the stock closing price of the next day. The CNN extracted intricate features from the input, and the BiLSTM used those features to forecast the next-day closing price. To increase forecast accuracy, an attention mechanism (AM) was then applied to capture the impact of feature states at multiple past instants on the closing price (Lu et al., 2021). On the other hand, Margustin Salim and Arif Djunaidy created a hybrid CNN-LSTM model in 2023 that takes images as input. In this work, the time series data were converted to images by the Gramian Angular Field (GAF) technique (Salim and Djunaidy, 2024) for training the neural network architecture. In a related study, Ioannis E. Livieris et al. developed a hybrid CNN-LSTM model to estimate future gold prices and movements. Two distinct versions of their model were built, each with two convolutional layers and different numbers of filters. The first model showed improved forecasting ability, while the second performed better on the classification problem of estimating the direction of gold price movement (Livieris et al., 2020). Extending the image-based approach to general time series, Semenoglou et al. introduced the ForCNN model in 2022 and tested it on the time series of the M3 and M4 forecasting competitions. Their results verified that ForCNN outperformed ResNet, VGG-19 and other baseline models in terms of accuracy (Semenoglou et al., 2023). Nevertheless, the ForCNN model has not yet been evaluated for forecasting gold prices, which exhibit volatility and non-linear fluctuations.
We have conducted a comprehensive performance evaluation of the ForCNN model and compared it with other benchmark models to validate its effectiveness for forecasting gold price time series.
Methodology
A time series is defined as an ordered sequence of observations taken over equally spaced time intervals. A time series Y of length n can be represented as follows:
$$Y = \{y_1, y_2, \ldots, y_n\},$$
where $y_t$ denotes the observation at time step $t$.
Time Series Preprocessing
The methodological approach followed in this paper to forecast gold time series includes two steps. In the first step, the
Model Architecture
Let
The activation is applied after the first two convolutions, while the third convolution omits ReLU before residual addition to preserve gradient flow.
Here, W represents the latent embedding vector containing the compressed temporal–spatial representation of the input image X.
Here,
Experimental Setup
Dataset
This analysis uses two gold price datasets to validate the usefulness of the ForCNN model. The first dataset comprises per-minute gold price data in GBP from January 2014 to December 2023 (HistData, 2025). The data were divided into a training set and a testing set. The training set contains daily gold closing prices from January 2014 to December 2020, while the testing set covers the subsequent period from January 2021 to December 2023. We transformed the per-minute data into hourly data for our research objectives. The second dataset consists of hourly gold closing prices in USD (MetaTrader 5, 2024). For this dataset, the training set spans January 2020 to December 2023 and the testing set spans January 2024 to April 2025.
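The per-minute to hourly conversion and the chronological split can be illustrated with a minimal sketch. The paper does not state how the hourly value was derived, so taking the last price observed within each hour is an assumption here; the function names and the sample prices are hypothetical and only serve the illustration.

```python
from datetime import datetime

def to_hourly_close(rows):
    """Collapse (timestamp, price) per-minute rows into one close per hour.

    Assumption: the last price observed inside an hour is that hour's close.
    """
    hourly = {}
    for ts, price in sorted(rows):
        hour = ts.replace(minute=0, second=0, microsecond=0)
        hourly[hour] = price  # later minutes overwrite earlier ones
    return sorted(hourly.items())

def chronological_split(series, test_start):
    """Split without shuffling so the test period strictly follows training."""
    train = [(t, p) for t, p in series if t < test_start]
    test = [(t, p) for t, p in series if t >= test_start]
    return train, test

# Made-up prices around the GBP train/test boundary (January 2021).
rows = [
    (datetime(2020, 12, 31, 22, 58), 1390.1),
    (datetime(2020, 12, 31, 22, 59), 1390.4),
    (datetime(2020, 12, 31, 23, 59), 1391.0),
    (datetime(2021, 1, 4, 0, 0), 1392.5),
    (datetime(2021, 1, 4, 0, 30), 1393.2),
]
hourly = to_hourly_close(rows)
train, test = chronological_split(hourly, datetime(2021, 1, 1))
```

Splitting by date rather than at random keeps future observations out of the training set, which matters for any forecasting evaluation.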
Forecasting Accuracy Measures
It is important to establish direct comparisons between our findings and other benchmarks in the literature. Consequently, we adopted the standard metric used for the ForCNN model to analyze and compare the predictive accuracy of the examined models. Specifically, the measure used for evaluating the ForCNN model was the symmetric mean absolute percentage error (sMAPE); in addition, we calculated the RMSE (Root Mean Square Error) and MAE (Mean Absolute Error). The formulae for these error metrics are as follows:
In addition to conventional forecasting accuracy metrics, several portfolio-based performance measures (Uygun and Sefer, 2025) were computed to evaluate the effectiveness of the forecasting models from a trading perspective, including Cumulative Return (CR), Average Daily Return (ADR), Annualized Return (AR), Annualized Volatility (AV), Sharpe Ratio (SR), and Maximum Drawdown (MD). Let the actual daily prices be denoted by
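As a minimal sketch, the three accuracy measures can be computed as below. The code assumes the widely used definitions: sMAPE as the mean of 2|y - f| / (|y| + |f|) expressed in percent, RMSE as the square root of the mean squared error, and MAE as the mean absolute error. The function names and sample values are illustrative, not taken from the paper.

```python
import math

def smape(actual, forecast):
    """Symmetric MAPE in percent (assumed form: 200/h * sum |y-f| / (|y|+|f|))."""
    terms = [2.0 * abs(a - f) / (abs(a) + abs(f))
             for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)

def rmse(actual, forecast):
    """Root mean square error."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Illustrative values only.
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
scores = {
    "sMAPE": smape(actual, forecast),
    "RMSE": rmse(actual, forecast),
    "MAE": mae(actual, forecast),
}
```

sMAPE is scale-free, whereas RMSE and MAE are expressed in the currency of the series, so the latter two are not directly comparable between the GBP and USD datasets.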
The daily return from the actual prices is computed as:
The trading signal
The corresponding strategy return is then given by:
Using these strategy returns, the following performance metrics are derived:
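The chain from prices to portfolio metrics can be sketched as follows. Several choices here are assumptions rather than the paper's stated definitions: a long/short signal (+1 if the forecast exceeds the previous actual price, otherwise -1), 252 periods per year for annualization, simple (non-compounded) annualization of the mean return, and a zero risk-free rate in the Sharpe Ratio. The function and variable names are hypothetical.

```python
import math

def portfolio_metrics(actual, predicted, periods_per_year=252):
    """Return CR, ADR, AR, AV, SR and MD for a sign-based strategy.

    predicted[t] is taken to be the forecast of actual[t] made at t-1.
    """
    returns, signals = [], []
    for t in range(1, len(actual)):
        returns.append((actual[t] - actual[t - 1]) / actual[t - 1])
        # Assumed rule: long (+1) if a rise is forecast, otherwise short (-1).
        signals.append(1 if predicted[t] > actual[t - 1] else -1)
    strategy = [s * r for s, r in zip(signals, returns)]

    n = len(strategy)
    adr = sum(strategy) / n
    std = math.sqrt(sum((r - adr) ** 2 for r in strategy) / (n - 1))

    # Track the compounded portfolio value and its worst fall from a peak.
    value, peak, max_dd = 1.0, 1.0, 0.0
    for r in strategy:
        value *= 1.0 + r
        peak = max(peak, value)
        max_dd = min(max_dd, value / peak - 1.0)

    ar = adr * periods_per_year
    av = std * math.sqrt(periods_per_year)
    sr = ar / av if av > 0 else float("nan")  # risk-free rate assumed zero
    return {"CR": value - 1.0, "ADR": adr, "AR": ar,
            "AV": av, "SR": sr, "MD": max_dd}

# Illustrative prices: one forecast that always signals "up", one perfect.
actual = [100.0, 102.0, 101.0, 103.0]
always_up = [100.0, 103.0, 103.0, 104.0]
m_hold = portfolio_metrics(actual, always_up)
m_perfect = portfolio_metrics(actual, actual)
```

With the always-long signal the strategy reduces to buy-and-hold, so its cumulative return equals the price change over the period, while a perfect directional forecast never loses and therefore has zero drawdown.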
Benchmark Models
We considered six benchmark models to evaluate the relative forecast performance of the proposed ForCNN model. For each model, the gold price trend over the last 24 h was fixed as the input sequence used to forecast the next hour. The main purpose of choosing a sequence length of 24 is to allow a direct comparison between models trained on image data and models trained on numerical data. The network architecture and the corresponding optimal hyperparameter values for each model are described in detail in the appendix.
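The 24-step input windows can be built as in the sketch below. The second function shows one simple way a window could be rasterized into a grayscale image, and it is only an illustrative assumption: it is not the preprocessing used in this paper or in the original ForCNN work. The function names, the image height, and the toy series are hypothetical.

```python
def make_windows(series, seq_len=24):
    """Pair each run of seq_len values with the value that follows it."""
    inputs, targets = [], []
    for start in range(len(series) - seq_len):
        inputs.append(series[start:start + seq_len])
        targets.append(series[start + seq_len])
    return inputs, targets

def window_to_image(window, height=24):
    """Rasterize a window as a height x len(window) grayscale grid.

    Assumption for illustration: min-max scale the window and light one
    white pixel (255) per time step, with the top row as the highest price.
    """
    lo, hi = min(window), max(window)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat window
    image = [[0] * len(window) for _ in range(height)]
    for col, value in enumerate(window):
        row = round((hi - value) / span * (height - 1))
        image[row][col] = 255
    return image

# Toy series of 30 hourly values; real inputs would be gold closing prices.
prices = [float(i) for i in range(30)]
X, y = make_windows(prices)
img = window_to_image(X[0])
```

The numerical benchmarks would consume the rows of `X` directly, while an image-based model would consume a grid such as `img`, so both see the same 24 observations.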
MLP
The Multilayer Perceptron, or MLP (Tashakkori et al., 2024), is a fully connected feedforward network that is commonly used for data analysis tasks such as regression and classification. An MLP has one or more hidden layers, each made up of neurons that apply non-linear activation functions to their input.
CNN1D
Artificial neural networks that apply convolutional filters along one dimension are known as 1D CNNs (Hu et al., 2023). CNN-1D models are very effective and successful for time series forecasting applications. To enhance feature learning and reduce computational complexity, they make use of shared weights and sparse connections.
LSTM
In 1997, S. Hochreiter and J. Schmidhuber proposed the LSTM (Hochreiter and Schmidhuber, 1997), a particular kind of recurrent neural network that helps to address vanishing and exploding gradients. An LSTM architecture contains multiple LSTM layers, and each layer contains multiple LSTM units operating in parallel. Each unit consists of an LSTM cell composed of a cell state and three gates: forget gate (
Finally, cell state (
CNN-LSTM
The CNN, introduced by LeCun et al. in 1998, is a multilayer deep neural network that can process both numerical data and image data. Recently, the combination of CNN and LSTM, known as CNN-LSTM (Livieris et al., 2020; Widiputra et al., 2021), has been widely applied in time series forecasting, signal and natural language processing, image recognition and other areas. The CNN-LSTM architecture uses CNN layers to extract complex features from the input and an LSTM layer for sequence prediction.
BiLSTM
Bidirectional LSTM, often known as BiLSTM (Stankovic et al., 2023), is a sequence-processing model that consists of two LSTMs, one reading the input forward and the other in reverse. By taking into account the context both before and after each position in the input sequence, the network can capture dependencies in both directions and produce more thorough predictions.
CNN-BiLSTM
The CNN-BiLSTM (Lu et al., 2021) model integrates the advantages of CNNs and Bidirectional Long Short-Term Memory (BiLSTM) networks (Zhang et al. 2023).
Results and Discussion
Forecasting Accuracy and Visual Comparison of Forecasts
Table 1 reports the forecast accuracy of the benchmark models and the proposed ForCNN model for the gold time series in GBP. The second, third, and fourth columns of the table give the scores according to the sMAPE, RMSE, and MAE error measures. These results indicate that using grayscale images to train the ForCNN model can significantly enhance forecasting accuracy. ForCNN outperforms the MLP benchmark by 0.77%, 72.48%, and 71.45% with respect to sMAPE, RMSE, and MAE, respectively. Compared with CNN-1D, ForCNN improves by 0.75%, 72.44%, and 71.38% on the same metrics. It surpasses LSTM by 0.77%, 72.36%, and 71.33% in sMAPE, RMSE, and MAE. Furthermore, ForCNN outperforms the CNN-LSTM model by 0.78% for sMAPE, 71.43% for RMSE, and 81.37% for MAE. It also performs better than BiLSTM by 0.76%, 72.35%, and 71.32%, and exceeds the accuracy of CNN-BiLSTM by 0.76%, 72.48%, and 71.42% across the same measures.
Table 1. Error Comparison Between ForCNN and Benchmark Models for the Gold Price Dataset in GBP.
The predictive accuracy of the benchmark models alongside the proposed ForCNN model for the gold time series in USD is reported in Table 2. Compared with MLP, ForCNN is 3.13% more accurate in sMAPE, 90.84% in RMSE, and 90.74% in MAE. It also delivers 2.90%, 90.29%, and 90.16% better results than CNN-1D, and shows improvements of 3.016%, 90.55%, and 90.53% over LSTM on the same error metrics, respectively. ForCNN also outperforms CNN-LSTM in prediction accuracy by 2.37%, 88.88%, and 88.56% in sMAPE, RMSE, and MAE, respectively. It outperforms BiLSTM by 2.82%, 90.11%, and 90.03%, and improves on CNN-BiLSTM by 2.15%, 79.91%, and 88.31% on the same error metrics. Figures 2–8 and Figures 9–15 illustrate the forecasting plots for the GBP and USD gold price time series, respectively.
Table 2. Error Comparison Between ForCNN and Benchmark Models for the Gold Price Data in USD.
Financial Performance Evaluation on USD Dataset
The results of the portfolio evaluation in Table 3 reveal that the proposed ForCNN model consistently outperforms all benchmark approaches, achieving the highest cumulative and annualized returns with a superior Sharpe Ratio of 1.3959. Its strong profitability, coupled with moderate volatility and minimal drawdown, demonstrates the robustness of the model in capturing market dynamics and maintaining stable risk-adjusted performance.
Table 3. Performance Comparison of Models Based on Portfolio Metrics.
Table 4. Ablation Study of the ForCNN Model on USD Dataset.
Hyperparameter Optimization
The hyperparameters for all the models were optimized using the Tree-structured Parzen Estimator (TPE) implemented in the HyperOpt library. TPE is a Bayesian optimization approach that models
Ablation Study
As presented in Table 4, we examined the effect of varying architectural parameters such as the number of blocks, the number of layers, residual connections, and the number of filters. The configuration obtained through hyperparameter optimization (5 blocks, 2 layers, residual = False, 32 filters) yielded the most consistent and stable results. A model variant with one layer showed a marginal reduction in error (from 3.426 to 3.402 in MAE and from 5.294 to 5.289 in RMSE); however, this difference was not statistically significant. Therefore, the final model in the paper retains the optimized configuration for forecasting.
The proposed ForCNN model and all the benchmark models used in this work were trained on a Lenovo ThinkSystem SR650 with the following configuration: two Intel(R) Xeon(R) Gold 6248 CPUs @ 2.50 GHz, 64 cores, 192 GB RAM, and a 2 TB HDD. The code was executed in Python 3.8.9.
Conclusion
Deep learning and machine learning algorithms in finance are frequently upgraded to handle the vast amount of data accessible nowadays. This paper investigated the significance of using visual time series representation and deep
Figure 1. Architecture of the ForCNN model.
Figure 2. Plot of time series and the corresponding forecast given by ForCNN for the gold price data in GBP.
Figure 3. Plot of time series and the corresponding forecast given by MLP for the gold price data in GBP.
Figure 4. Plot of time series and the corresponding forecast given by CNN1D for the gold price data in GBP.
Figure 5. Plot of time series and the corresponding forecast given by LSTM for the gold price data in GBP.
Figure 6. Plot of time series and the corresponding forecast given by CNN-LSTM for the gold price data in GBP.
Figure 7. Plot of time series and the corresponding forecast given by BiLSTM for the gold price data in GBP.
Figure 8. Plot of time series and the corresponding forecast given by CNN-BiLSTM for the gold price data in GBP.
Figure 9. Plot of time series and the corresponding forecast given by ForCNN for the gold price data in USD.
Figure 10. Plot of time series and the corresponding forecast given by MLP for the gold price data in USD.
Figure 11. Plot of time series and the corresponding forecast given by CNN1D for the gold price data in USD.
Figure 12. Plot of time series and the corresponding forecast given by LSTM for the gold price data in USD.
Figure 13. Plot of time series and the corresponding forecast given by CNN-LSTM for the gold price data in USD.
Figure 14. Plot of time series and the corresponding forecast given by BiLSTM for the gold price data in USD.
Figure 15. Plot of time series and the corresponding forecast given by CNN-BiLSTM for the gold price data in USD.
Footnotes
Acknowledgments
This work has been financially supported by the Department of Science and Technology (DST), Government of India, through the INSPIRE fellowship with no. DST/INSPIRE Fellowship/2022/IF220134. The Centre for Computational Modeling and Simulation, National Institute of Technology Calicut, has provided assistance for all computational work.
Credit Authorship Contribution Statement
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Department of Science and Technology, INSPIRE, Government of India, (grant number DST/INSPIRE Fellowship/2022/IF220134).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data and Code Availability
The datasets in GBP and USD are obtained from the websites https://www.histdata.com and https://www.metatrader5.com, respectively, and the code is available at
.
