Analysis of artificial neural network performance based on influencing factors for temperature forecasting applications

Abstract

Artificial neural network (ANN)-based methods belong to one of the fastest-growing research fields within the artificial intelligence ecosystem, and many novel contributions have been developed in recent years. They are applied in many contexts, although some “influencing factors” such as the number of neurons, the number of hidden layers, and the learning rate can impact the performance of the resulting ANN-based applications. This paper provides an in-depth analysis of ANN performance based on such factors for real-world temperature forecasting applications. An improved back propagation algorithm for such applications is also presented. By using the results of this paper, researchers and practitioners can analyse the issues encountered when applying ANN-based models to their own specific applications, with the aim of achieving better performance indexes.

Keywords

Artificial neural network, optimization, modeling and simulation, improved back propagation neural network, temperature forecasting applications

1. Introduction

In recent years, weather forecasting has been playing a vital role in day-to-day life. Weather warnings are important forecasts because they are used to protect life and property [1]. Weather forecasts can be used for smart homes, load forecasting, renewable energy forecasting, fire hazard prevention, weather reporting and so on [24]. They serve a variety of end users. For instance, utility companies use them to estimate demand over the coming days and improve the reliability, availability, serviceability, and usability of the provided services [2]. Among these, it is worth mentioning some services provided over communication networks, such as real-time communication and multimedia services (VoIP, online games, audio and video streaming), solutions for fully distributed architectures (clouds and grids), evolutionary network services, and storage area networking solutions.

Within the weather forecasting scope, temperature forecasting with high accuracy is a challenging task because of the influence of different atmospheric parameters [23]. Many researchers have proposed temperature forecasting models. In this paper, we propose a temperature forecasting model using artificial neural networks (ANN). ANN-based methods try to mimic the functionality of the human brain to solve complex problems, and they feature massively parallel, fully interconnected structures [6]. They are composed of different layers: the input layer, one or more hidden layers, and the output layer. The input layer consists of input neurons, which transfer the information or data to the next layer (hidden layer) via weighted connections and activation functions. The same happens for the subsequent layers. The model parameters, such as weights and biases, are modified during the training process by using the so-called training data set. Moreover, before starting such a training process, it is important to set the hyper parameters to manage the training and testing data sets with ease. Some of the hyper parameters are the number of hidden neurons, the number of hidden layers, and the activation functions. The error is computed as the difference between the forecasted and the target outputs, and a back propagation process is used to reduce it. In neural network-based applications, weight adjustment, hyper parameter selection, minimization of the cost function, regularization, generalization and convergence are the major elements to take into account for achieving satisfactory stability, robustness, and performance indexes.

In this paper, we address the problem of identifying adequate ANN model parameter values (such as the number of hidden layers, the number of hidden neurons, the weights) for temperature forecasting applications. All these elements are referred to as “influencing factors”, since they impact the overall model performances. The outcome of this paper can be used to implement both specific temperature forecasting applications and Application Programming Interface (API) libraries for e.g. Android-based devices.

The paper has been organized as follows. Section 2 provides some preliminaries on ANN-based models and their influencing factors. Section 3 describes the ANN model and design procedures for temperature prediction applications. It also presents the performance indexes used in the paper. Section 4 analyses how the influencing factors affect the performance indexes of the resulting ANN-based model. Section 5 concludes the paper.

2. Background on the ANN influencing factors

ANN learning methods can be classified as follows: (i) Supervised learning [22] – Neural network with known targets; (ii) Unsupervised learning [7] – Neural network with unknown targets; (iii) Reinforcement learning [8] – Neural network aimed at maximizing expected returns. In this paper, we focus on the first type of learning. Once the hyper parameters have been defined, the neural network output errors can be minimized by a weight optimization process, known as neural network training. The neural network training process usually starts with small variance and large bias, since the neural network output is far away from the desired targets and the influence of the training data set is low. On the other hand, such training process ends with smaller bias (which can solve the underfitting problem) and larger variance (which can give rise to poor generalization of the resulting model). Different algorithms can be adopted for the ANN training process: Delta rule [20], Perceptron learning algorithm [19], Back propagation algorithm [21], Kohonen Learning Rule [10], and Meta heuristic Optimization algorithm [9]. In this paper, a feed-forward ANN with the back-propagation learning algorithm is used: the error of the output is computed and propagated back to the input layer through the hidden layers. During such back-propagation phase, the ANN weights are adjusted by means of the gradient descent method. The logistic or sigmoid function is usually adopted in back-propagation learning thanks to its relevant features like continuity, shape, and differentiability.

The selection of an appropriate learning rate is fundamental for the ANN training process. If the learning rate is too large, the weight updates take large steps and can overshoot a minimum, oscillate around it, or end up in a poor local minimum. If the learning rate is too small, the gradient descent can be very slow. In ANNs, different activation functions can be used to compute the output signals from the applied total net input signals. The usage of such activation functions is motivated by the following main needs: (i) To improve the computation power and performance of the resulting trained model; (ii) To solve complex problems featured by non-linear dynamics; (iii) To provide output activation with respect to the weighted input neurons. Different types of activation functions can be used, in particular: linear activation function, step activation function, logistic or sigmoidal activation function (sig), hyperbolic tangent activation function (tanh), rectified linear unit activation function (ReLU), see [16,18]. The selection of proper activation functions depends on the addressed layer as well as the required performance indexes.
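The activation functions listed above can be written down in a few lines. The following is a minimal Python sketch, not taken from the paper: the function names are illustrative, and the two derivative helpers are included only because back-propagation needs them for the sigmoid and tanh cases.

```python
import math


def linear(x: float) -> float:
    """Identity activation: the output equals the net input."""
    return x


def step(x: float, threshold: float = 0.0) -> float:
    """Binary step: 1 when the net input reaches the threshold, else 0."""
    return 1.0 if x >= threshold else 0.0


def sigmoid(x: float) -> float:
    """Logistic function, split by sign to avoid overflow for large |x|."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return math.tanh(x)


def relu(x: float) -> float:
    """Rectified linear unit: passes positive inputs, clips negatives to 0."""
    return max(0.0, x)


def sigmoid_prime(x: float) -> float:
    """Derivative of the logistic function, s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_prime(x: float) -> float:
    """Derivative of tanh, 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2
```

The sigmoid and tanh functions are smooth and bounded, which is why they pair naturally with gradient-based training, whereas the step function has a zero derivative almost everywhere and cannot be trained with plain back-propagation.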

Besides the learning rate, the momentum factor can be used to improve the network stability and minimize the training time. It aids faster convergence by considering past changes in ANN weight updates during the training process. Moreover, neural network oscillations and local convergence to local minima due to higher learning rate can be overcome by the momentum factor.
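The interplay between learning rate and momentum can be illustrated on a toy problem. The sketch below is an assumption-laden example rather than the paper's network: it minimises the one-dimensional function f(w) = w² with gradient descent plus a momentum term, and the function name, starting point, and step count are all hypothetical.

```python
def gradient_descent(lr: float, momentum: float,
                     steps: int = 50, w0: float = 5.0) -> float:
    """Minimise the toy objective f(w) = w**2 and return the final w.

    Each update adds the current gradient step and a fraction
    (the momentum factor) of the previous weight change.
    """
    w = w0
    prev_delta = 0.0
    for _ in range(steps):
        grad = 2.0 * w                               # derivative of w**2
        delta = -lr * grad + momentum * prev_delta   # step plus momentum term
        w += delta
        prev_delta = delta
    return w


# Same number of steps, different settings (illustrative values only).
slow = gradient_descent(lr=0.01, momentum=0.0)           # small rate: slow progress
fast = gradient_descent(lr=0.1, momentum=0.0)            # larger rate: closer to 0
diverged = gradient_descent(lr=1.1, momentum=0.0)        # too large: |w| grows
with_momentum = gradient_descent(lr=0.01, momentum=0.9)  # small rate helped by momentum
```

On this toy objective, the small learning rate alone leaves `w` far from the minimum after 50 steps, the overly large rate makes the iterates grow in magnitude, and adding momentum to the small rate brings `w` closer to zero in the same number of steps.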

Finally, we mention the importance of dimensioning the hidden layer by selecting a proper number of hidden neurons. Underfitting can happen in case of too few hidden neurons, whereas a high number of hidden neurons can cause overfitting and slow convergence during the ANN training process. There is no general rule for deciding the number of hidden layers and the number of hidden neurons in each selected hidden layer within a neural network. Researchers can decide the number of hidden layers and the number of hidden neurons in each hidden layer by means of a trial and error method, specific mathematical formulations, and heuristic techniques, see [4,5,12–15,17,27]. An improper number of hidden layers and hidden neurons in each hidden layer can cause different issues like no stability guarantee, poor generalization, slow convergence rate, convergence into local minima, underfitting and overfitting.

3. The proposed ANN model and design procedure for temperature forecasting applications

As recalled in the introduction, temperature forecasting plays an important role within the weather forecasting scope, and many researchers have proposed temperature forecasting models, see [3,11,25,26]. Temperature forecasting is used for smart homes, load forecasting, renewable energy forecasting, fire hazard prevention, weather reporting and so on. Temperature forecasting with high accuracy is a challenging task because of the influence of different atmospheric parameters.

The proposed ANN model for temperature forecasting applications processes six meteorological parameters as inputs. The procedure used to set up such ANN model involves three main steps, namely: the ANN design and training approach definition, the data set collection and normalization, and finally, the ANN model training and testing.

3.1. ANN design and training approach definition

We apply the ANN back propagation algorithm with the momentum factor to train the ANN model for temperature forecasting applications. Such algorithm is composed of three main steps: the feed-forward step, the error computation step, and the ANN weight update step. The momentum factor can help achieve a faster convergence rate during the ANN training process.

The input layer consists of the input parameters related to the temperature forecasting application, while the output layer has one output neuron, that is to say, the forecasted temperature. The ANN output error is obtained by calculating the difference between the computed and the target temperature values, the latter given in the available data set. In the following section, a deep analysis of the ANN output error and convergence rate based on ANN model influencing factors is performed. The ANN architecture is depicted in Fig. 1, while the procedural steps of the ANN design and back-propagation algorithm are shown in Algorithm 1.

Algorithm 1

Procedural steps of the proposed algorithm

As for the ANN architecture shown in Fig. 1, the input layer consists of the following data $$[K_{1}, K_{2}, K_{3}, K_{4}, K_{5}, K_{6}] = [T, WS, SI, RH, DP, PW] \tag{1}$$ which are the temperature, the wind speed, the solar irradiance, the relative humidity, the dew point, and the precipitation of water contents, respectively.

The output layer provides the forecasted temperature $$J = [T_{f}] \tag{2}$$ which is compared with the given target temperature value. The computed error is backpropagated up to the input layer and the ANN model weights are updated by means of the gradient descent method.

Fig. 1. The proposed ANN architecture.

The synaptic weight vectors of input layer to the hidden layer are the following $$SV = [SV_{11}, SV_{12}, \dots, SV_{1h}, SV_{21}, SV_{22}, \dots, SV_{2h}, SV_{31}, SV_{32}, \dots, SV_{3h}, SV_{41}, SV_{42}, \dots, SV_{4h}, SV_{51}, SV_{52}, \dots, SV_{5h}, SV_{61}, SV_{62}, \dots, SV_{6h}] \tag{3}$$

The net input to the hidden neuron $q$ is $$Y_{inq} = \sum_{p=1}^{6} K_{p} SV_{pq}, \quad q = 1, 2, \dots, h \tag{4}$$ while the corresponding output of the hidden layer is $$Y_{q} = f\left(\sum_{p=1}^{6} K_{p} SV_{pq}\right) \tag{5}$$ where h is the number of hidden neurons and f is the activation function.

The synaptic weight vectors of hidden layer to the output layer can be defined as follows $$SW = [SW_{1}, SW_{2}, \dots, SW_{h}]. \tag{6}$$

As a consequence, the net input of the output layer becomes $$Z_{in} = \sum_{q=1}^{h} Y_{q} SW_{q} \tag{7}$$ and the output of the overall ANN model is $$Z = f\left(\sum_{q=1}^{h} Y_{q} SW_{q}\right). \tag{8}$$
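The feed-forward computation of Eqs. (4)–(8) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: the net input of each hidden neuron q is taken as the sum over the six inputs p, the activation f is tanh (the function the paper reports using), no bias terms are included because the equations do not show any, and the function name, weight values, and input values are hypothetical.

```python
import math
from typing import Callable, List, Tuple


def forward(k: List[float],
            sv: List[List[float]],
            sw: List[float],
            f: Callable[[float], float] = math.tanh
            ) -> Tuple[List[float], List[float], float, float]:
    """Feed-forward pass of a single-hidden-layer network.

    k  : input vector [K_1, ..., K_6]
    sv : input-to-hidden weights, sv[p][q] corresponds to SV_pq
    sw : hidden-to-output weights, sw[q] corresponds to SW_q
    Returns (Y_in, Y, Z_in, Z).
    """
    h = len(sw)
    # Eq. (4): net input of each hidden neuron q, summed over inputs p
    y_in = [sum(k[p] * sv[p][q] for p in range(len(k))) for q in range(h)]
    # Eq. (5): hidden-layer outputs
    y = [f(v) for v in y_in]
    # Eq. (7): net input of the single output neuron
    z_in = sum(y[q] * sw[q] for q in range(h))
    # Eq. (8): network output
    return y_in, y, z_in, f(z_in)


# Hypothetical example: six normalized inputs, two hidden neurons.
inputs = [1.0] * 6
sv_demo = [[0.1, 0.1] for _ in range(6)]
sw_demo = [0.5, 0.5]
y_in_demo, y_demo, z_in_demo, z_demo = forward(inputs, sv_demo, sw_demo)
```

Storing `sv` as a list of rows indexed by input p mirrors the subscript order of $SV_{pq}$, which keeps the code easy to check against the equations.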

The computed error in the output layer is $$E = (T_{r} - Z)\, d^{'}(Z_{in}) \tag{9}$$ where $T_{r}$ is the target temperature and $d^{'}(Z_{in})$ is the derivative of the activation function evaluated at the net input of the output layer. The computed error (E) is back propagated to the hidden layer. In particular, each hidden neuron ($Y_{q}, q = 1, 2, \dots, h$) sums its delta inputs from the output layer neurons; since in our case there is only one neuron in the output layer, the sum reduces to a single term $$E_{inq} = E \, SW_{q}. \tag{10}$$

The error in the hidden layer can be computed as follows $$E_{q} = E_{inq}\, d^{'}(Y_{inq}) \tag{11}$$ where $d^{'}(Y_{inq})$ is the derivative of the activation function evaluated at the net input of the hidden layer.

The synaptic weight updating process is described by the following expressions $$\begin{aligned} SW_{q}(n+1) &= SW_{q}(n) + \alpha E Y_{q} + \eta\,[SW_{q}(n) - SW_{q}(n-1)] && (12) \\ SV_{pq}(n+1) &= SV_{pq}(n) + \alpha E_{q} K_{p} + \eta\,[SV_{pq}(n) - SV_{pq}(n-1)] && (13) \end{aligned}$$ where $\alpha$ is the learning rate and $\eta$ is the momentum factor.
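One training iteration combining the feed-forward step, the error terms of Eqs. (9)–(11), and the momentum updates of Eqs. (12)–(13) might look like the sketch below. It is a minimal illustration, not the authors' MATLAB implementation: tanh is assumed in both layers, each hidden neuron receives the single back-propagated term E·SW_q, the helper names and the demo values (three hidden neurons, one repeated training sample) are hypothetical, and the returned half squared error is used only to monitor progress.

```python
import math
from typing import List


def tanh_prime(x: float) -> float:
    """Derivative of tanh evaluated at the net input x."""
    return 1.0 - math.tanh(x) ** 2


def train_step(k: List[float], target: float,
               sv: List[List[float]], sw: List[float],
               prev_dsv: List[List[float]], prev_dsw: List[float],
               alpha: float = 0.01, eta: float = 0.9) -> float:
    """Update sv and sw in place for one sample; return 0.5 * error**2."""
    n_in, h = len(k), len(sw)
    # Feed-forward step (Eqs. 4-8)
    y_in = [sum(k[p] * sv[p][q] for p in range(n_in)) for q in range(h)]
    y = [math.tanh(v) for v in y_in]
    z_in = sum(y[q] * sw[q] for q in range(h))
    z = math.tanh(z_in)
    # Error terms (Eqs. 9-11), computed before any weight is changed
    e = (target - z) * tanh_prime(z_in)
    e_q = [e * sw[q] * tanh_prime(y_in[q]) for q in range(h)]
    # Weight updates with momentum (Eqs. 12-13); prev_d* hold W(n) - W(n-1)
    for q in range(h):
        d_sw = alpha * e * y[q] + eta * prev_dsw[q]
        sw[q] += d_sw
        prev_dsw[q] = d_sw
        for p in range(n_in):
            d_sv = alpha * e_q[q] * k[p] + eta * prev_dsv[p][q]
            sv[p][q] += d_sv
            prev_dsv[p][q] = d_sv
    return 0.5 * (target - z) ** 2


# Hypothetical demo: repeat the update on one normalized sample.
k = [0.5] * 6
sv = [[0.1] * 3 for _ in range(6)]
sw = [0.1] * 3
prev_dsv = [[0.0] * 3 for _ in range(6)]
prev_dsw = [0.0] * 3
errors = [train_step(k, 0.5, sv, sw, prev_dsv, prev_dsw) for _ in range(200)]
```

Keeping the previous weight changes in separate arrays makes the momentum term of Eqs. (12)–(13) explicit, at the cost of doubling the memory used for the weights.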

3.2. Data collection and normalization

The real-time data (acting as the ANN input parameters and target temperatures) are collected from the National Oceanic and Atmospheric Administration (United States of America) and cover the period from January 2016 to December 2018. The input parameters are the temperature (°C), the wind speed (m/s), the solar irradiance (W/m²), the relative humidity (%), the dew point (%), and the precipitation of water contents (%). The target temperature (contained in such data set) is used to compute the error with respect to the forecasted temperature. The data set spans over two years, with data sampled every 10 minutes. The resulting data set contains 1051200 data samples.

For the considered application, the data set normalization is needed to minimize some training process issues (e.g., high value inputs can reduce the effect of small value inputs) and enhance the performance of the resulting trained ANN model. In particular, the min-max normalization is adopted for the proposed neural network. The scaled input is calculated as follows $$K_{p}^{'} = \left(\frac{K_{p} - K_{min}}{K_{max} - K_{min}}\right) (K_{max}^{'} - K_{min}^{'}) + K_{min}^{'} \tag{14}$$ where $K_{p}$ is the real time input data, $K_{min}$ is the lowest input data, $K_{max}$ is the highest input data, $K_{min}^{'}$ is the lowest target value, and $K_{max}^{'}$ is the highest target value. After that, the training and testing process can be performed and output error metrics are evaluated.
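The min-max normalization of Eq. (14) can be sketched as below. The function name and the sample values are hypothetical, and the target range [-1, 1] used in the example is an assumption chosen to match the output range of the tanh activation; the paper does not state the range it used.

```python
from typing import List


def min_max_scale(values: List[float],
                  new_min: float = 0.0,
                  new_max: float = 1.0) -> List[float]:
    """Rescale values linearly from [min(values), max(values)] to [new_min, new_max].

    Implements Eq. (14), with K_min and K_max taken from the data itself.
    """
    k_min, k_max = min(values), max(values)
    if k_max == k_min:
        raise ValueError("cannot scale a constant series: K_max equals K_min")
    span = k_max - k_min
    return [(v - k_min) / span * (new_max - new_min) + new_min for v in values]


# Hypothetical temperature readings in degrees Celsius.
scaled = min_max_scale([10.0, 15.0, 20.0], new_min=-1.0, new_max=1.0)
# scaled == [-1.0, 0.0, 1.0]
```

In practice each of the six input parameters would be scaled separately, so that a quantity with large raw values such as solar irradiance does not dominate one with small raw values such as wind speed.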

3.3. Training and testing of the ANN model

The design and training parameters of the ANN model for temperature forecasting applications are given in Table 1.

Table 1
Design parameters of the ANN model

ANN model design and training parameters	Value
Number of input neurons	6
Number of hidden layers	1
Output neuron	1
Number of epochs	2000
Learning rate	0.01
Momentum factor	0.9
Threshold value	1

The hyperbolic tangent sigmoid activation function has been used in both the hidden and output layers. As shown in the next section, the number of hidden neurons in the hidden layer has been selected via a trial and error approach. The temperature forecasting model is constructed by using the training data set, while the corresponding performance indexes are assessed via the testing data set. In particular, 70% of the data samples (735840) are used for the training phase, while the remaining 30% (315360) are used for the testing phase.
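The sample counts quoted above follow directly from the data set size. The sketch below reproduces that arithmetic and shows one possible way to split a sequence of samples; the function names are hypothetical, and the chronological (unshuffled) split is an assumption, since the paper does not say how samples were assigned to each subset.

```python
from typing import List, Sequence, Tuple


def split_counts(n_samples: int, train_percent: int) -> Tuple[int, int]:
    """Return (training count, testing count) using integer arithmetic."""
    n_train = n_samples * train_percent // 100
    return n_train, n_samples - n_train


def train_test_split(samples: Sequence[float],
                     train_percent: int = 70) -> Tuple[List[float], List[float]]:
    """Split samples in their original order: first part trains, rest tests."""
    n_train, _ = split_counts(len(samples), train_percent)
    return list(samples[:n_train]), list(samples[n_train:])


# Counts for the data set size reported in the paper.
n_train, n_test = split_counts(1051200, 70)
# n_train == 735840, n_test == 315360

# Small hypothetical series to show the split itself.
train_part, test_part = train_test_split([float(i) for i in range(10)])
```

Integer arithmetic is used for the counts so that the result does not depend on floating-point rounding of the percentage.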

The following indexes are used for the performance evaluation: Correlation coefficient (R), Mean Absolute Percentage Error (MAPE), Mean Square Error (MSE), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Mean Relative Error (MRE), and the time in minutes needed to process the given training and testing data set and compute such error metrics for the given ANN model. The latter index shows the convergence time during the ANN model training process. More specifically, the formulation of the error metrics is $$\begin{aligned} R &= 1 - \left(\frac{\sum_{s=1}^{N} (T_{s}^{'} - T_{s}^{f})}{\sum_{s=1}^{N} T_{s}^{f}}\right)^{2} && (15) \\ MAPE &= \frac{100}{N} \sum_{s=1}^{N} \left| \frac{T_{s}^{'} - T_{s}^{f}}{\bar{T}_{s}} \right| && (16) \\ MSE &= \frac{1}{N} \sum_{s=1}^{N} (T_{s}^{'} - T_{s}^{f})^{2} && (17) \\ MAE &= \frac{1}{N} \sum_{s=1}^{N} \left| T_{s}^{'} - T_{s}^{f} \right| && (18) \\ RMSE &= \sqrt{\frac{1}{N} \sum_{s=1}^{N} (T_{s}^{'} - T_{s}^{f})^{2}} && (19) \\ MRE &= \frac{1}{N} \sum_{s=1}^{N} \left| \frac{T_{s}^{'} - T_{s}^{f}}{\bar{T}_{s}} \right| && (20) \end{aligned}$$ where N is the total amount of data samples, $T_{s}^{'}$ is the temperature target output, ${\bar{T}}_{s}$ is its average, and $T_{s}^{f}$ is the forecasted temperature output.
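The error metrics of Eqs. (15)–(20) can be computed with a short helper such as the one sketched here. The function name and the three-sample example are hypothetical. R follows the paper's own definition in Eq. (15) rather than the Pearson correlation, MAE is computed with absolute differences as is standard, and the average target temperature is assumed to be taken over all N samples.

```python
import math
from typing import Dict, List


def error_metrics(target: List[float], forecast: List[float]) -> Dict[str, float]:
    """Compute R, MAPE, MSE, MAE, RMSE and MRE for paired series.

    Assumes the forecast sum and the target mean are non-zero,
    otherwise Eqs. (15), (16) and (20) are undefined.
    """
    n = len(target)
    t_bar = sum(target) / n                            # average target temperature
    diffs = [t - f for t, f in zip(target, forecast)]
    r = 1.0 - (sum(diffs) / sum(forecast)) ** 2        # Eq. (15)
    mre = sum(abs(d / t_bar) for d in diffs) / n       # Eq. (20)
    mse = sum(d * d for d in diffs) / n                # Eq. (17)
    return {
        "R": r,
        "MAPE": 100.0 * mre,                           # Eq. (16)
        "MSE": mse,
        "MAE": sum(abs(d) for d in diffs) / n,         # Eq. (18)
        "RMSE": math.sqrt(mse),                        # Eq. (19)
        "MRE": mre,
    }


# Hypothetical target and forecast temperatures in degrees Celsius.
metrics = error_metrics([20.0, 22.0, 24.0], [21.0, 22.0, 23.0])
```

With these definitions MAPE is exactly 100 times MRE and RMSE is the square root of MSE, so only four of the six error values carry independent information.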

4. Performance analysis of the proposed ANN model based on its influencing factors

In this section, the performance indexes of the ANN model for temperature forecasting applications are assessed by varying the different influencing factors. Such evaluation has been performed by using the MATLAB platform (version 2013) executed on an Acer personal computer with a Pentium(R) Dual-Core processor running at 2.30 GHz and 2 GB of RAM.

4.1. ANN model performance analysis with various hidden neurons

The ANN model has only one hidden layer. Tables 2 and 3 show the performance index values when the number of hidden neurons varies from 1 to 30. As shown in Table 2 in bold, six hidden neurons in the hidden layer achieve the best performance. For this analysis, the data set has been classified as follows: 70% for training purpose and 30% for testing purpose.

Table 2
Performance analysis with number of hidden neurons 1–15

Number of hidden neurons	R	MAPE	MSE	MAE	RMSE	MRE	Time (min)
1	0.99777	1.7227	0.6579	0.3877	0.8111	0.0172	2.06
2	0.99989	0.2257	0.0192	0.0508	0.1387	0.0023	2.28
3	0.9999	0.2126	0.0177	0.0478	0.1332	0.0021	2.26
4	1	0.1824	0.0224	0.0411	0.1498	0.0018	2.28
5	1	0.1550	0.0215	0.0349	0.1468	0.0015	2.34
6	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.26
7	1	0.4387	0.1873	0.0987	0.4327	0.0044	1.41
8	1	0.2985	0.0954	0.0672	0.3089	0.0030	2.41
9	1	0.6955	0.3779	0.1565	0.6148	0.0070	3.19
10	1	0.6937	0.3878	0.1561	0.6227	0.0069	3.10
11	1	1.0056	0.6999	0.2263	0.8366	0.0101	3.31
12	1	0.7317	0.4305	0.1647	0.6562	0.0073	2.59
13	1	1.0493	0.7634	0.2362	0.8737	0.0105	3.13
14	1	0.8437	0.5487	0.1899	0.7408	0.0084	3.49
15	1	1.0960	0.8116	0.2467	0.9009	0.0110	2.47

Table 3

Performance analysis with number of hidden neurons 16–30

Number of hidden neurons	R	MAPE	MSE	MAE	RMSE	MRE	Time (min)
16	1	1.2397	0.9241	0.2790	0.9613	0.0124	2.01
17	1	0.9426	0.6399	0.2122	0.8000	0.0094	2.23
18	1	1.3522	1.0660	0.3043	1.0325	0.0135	1.25
19	1	1.3354	1.0485	0.3006	1.0240	0.0134	4.46
20	1	1.0038	0.7495	0.2259	0.8657	0.0100	1.34
21	1	1.4838	1.2413	0.3340	1.1141	0.0148	1.13
22	1	1.5249	1.2588	0.3432	1.1220	0.0152	7.22
23	1	1.5045	1.2567	0.3386	1.1210	0.0150	2.10
24	1	1.5524	1.2764	0.3494	1.1298	0.0155	1.18
25	1	1.5589	1.2804	0.3509	1.1316	0.0156	7.19
26	1	1.3362	1.0132	0.3008	1.0066	0.0134	4.44
27	1	1.5959	1.2911	0.3592	1.1363	0.0160	4.59
28	1	1.1945	0.8391	0.2689	0.9160	0.0119	3.40
29	1	1.5023	1.1666	0.3381	1.0801	0.0150	6.56
30	1	2.8261	5.9625	0.6361	2.4418	0.0283	12.10
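The trial and error selection of the number of hidden neurons amounts to evaluating each candidate and keeping the one with the lowest error. The sketch below illustrates this with a hypothetical helper, `select_hidden_neurons`, whose `evaluate` callback would normally train and test a network; here the callback simply looks up the MSE values reported in Table 2 for 1 to 15 hidden neurons, and using MSE as the selection criterion is an assumption.

```python
from typing import Callable, Iterable, Optional, Tuple


def select_hidden_neurons(candidates: Iterable[int],
                          evaluate: Callable[[int], float]
                          ) -> Tuple[Optional[int], float]:
    """Return the candidate with the smallest error and that error."""
    best_h: Optional[int] = None
    best_err = float("inf")
    for h in candidates:
        err = evaluate(h)          # in practice: train and test a network with h neurons
        if err < best_err:
            best_h, best_err = h, err
    return best_h, best_err


# MSE values copied from Table 2 (hidden neurons 1 to 15).
table2_mse = {
    1: 0.6579, 2: 0.0192, 3: 0.0177, 4: 0.0224, 5: 0.0215,
    6: 0.0059, 7: 0.1873, 8: 0.0954, 9: 0.3779, 10: 0.3878,
    11: 0.6999, 12: 0.4305, 13: 0.7634, 14: 0.5487, 15: 0.8116,
}

best_h, best_mse = select_hidden_neurons(table2_mse, table2_mse.__getitem__)
# best_h == 6, best_mse == 0.0059
```

The same loop applies unchanged to the other influencing factors, for example by passing a list of learning rates and an evaluation function that returns the processing time.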

Figures 2–7 illustrate the following aspects:

Fig. 2 shows the temperature data input as a function of the data samples.

Fig. 3 compares the target temperature with the one forecasted by the proposed ANN model.

Fig. 4 shows the error metrics (MSE) as a function of the data samples.

Fig. 5 shows that the proposed temperature forecasting model achieves the regression $R = 1$.

Fig. 6 shows the error metrics as a function of the number of neurons within the hidden layer.

Fig. 7 shows the time in minutes to process the training and testing data set as function of the number of neurons within the hidden layer.

Fig. 2. Portion of original temperature vs data samples.

Fig. 3. Portion of original target temperature against forecasted temperature.

Fig. 4. Evaluation error metric (MSE) vs number of data samples.

Fig. 5. Regression graph.

Fig. 6. Error metrics vs hidden neurons.

Fig. 7. Time in minutes vs hidden neurons.

4.2. ANN model performance analysis with various momentum factors

Table 4 shows the performance analysis with different momentum factor values, using the ANN model with 6 hidden neurons. The shortest processing time (2.26 minutes) is achieved with a momentum factor equal to 0.9. As expected, the momentum factor only affects the convergence time during the ANN model training process, while the other performance indexes remain unchanged.

Table 4
Performance analysis with different momentum factor values

Momentum factor	R	MAPE	MSE	MAE	RMSE	MRE	Time (min)
0.1	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.15
0.2	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.13
0.3	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.37
0.4	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.25
0.5	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.34
0.6	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.19
0.7	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.55
0.8	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.11
0.9	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.26
1	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.06

4.3. ANN model performance analysis with various learning rates

Table 5 shows the performance analysis with different learning rate values. As before, the ANN model with 6 hidden neurons and the momentum factor set to 0.9 has been used. The shortest processing time (2.26 minutes) is achieved with a learning rate equal to 0.01. Like the momentum factor, the learning rate only affects the convergence time during the ANN model training process, while the other performance indexes remain unchanged.

Table 5
Performance analysis with different learning rates

Learning rate	R	MAPE	MSE	MAE	RMSE	MRE	Time (min)
0.001	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.20
0.01	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.26
0.1	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.30
0.2	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.03
0.3	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.27
0.4	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.01
0.5	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.30
0.6	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.08
0.7	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	4.37
0.8	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	5.04
0.9	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	4.30
1	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	3.33

4.4. ANN model performance analysis with various training and testing data set portioning

Table 6 shows the performance analysis with different splits of the available data between the training and testing data sets. As before, the ANN model with 6 hidden neurons has been used. The best performance indexes are achieved with the following data portioning: 70% for the training data set and 30% for the testing one.

Table 6
Performance analysis with various training and testing data sets

Training set (%)	Testing set (%)	R	MAPE	MSE	MAE	RMSE	MRE	Time (min)
90	10	1	0.6522	0.5909	0.0713	0.7687	0.0065	2.32
80	20	1	0.5090	0.3713	0.0623	0.6093	0.0051	2.53
70	30	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.26
60	40	1	0.4480	0.2362	0.1147	0.4860	0.0045	2.34
50	50	1	0.1068	0.0143	0.0265	0.1197	0.0011	2.50
40	60	1	0.1338	0.0184	0.0316	0.1356	0.0013	2.68
30	70	1	0.1652	0.0312	0.0428	0.1768	0.0017	2.17
20	80	1	0.0797	0.0062	0.0174	0.0787	7.9740e-04	2.42
10	90	1	0.0962	0.0089	0.0241	0.0943	9.6211e-04	3.04

4.5. ANN model performance analysis with various hidden layers

Finally, we have assessed the performance indexes of the ANN model for temperature forecasting applications with one, two and three hidden layers. As shown in Table 7, the best performance indexes are achieved with one hidden layer. As the number of hidden layers increases, the convergence speed and stability are significantly impacted: the processing time grows from 2.26 minutes with one hidden layer to 10.32 minutes with three, and all error metrics worsen.

Table 7
ANN model performance analysis with various hidden layers

Number of hidden layers	R	MAPE	MSE	MAE	RMSE	MRE	Time (min)
1	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.26
2	0.9899	0.2125	0.0431	0.0582	0.2076	0.0021	8.10
3	0.9697	2.3156	0.9210	0.3155	0.9596	0.0232	10.32

From this analysis, we can conclude that the proposed ANN model for temperature forecasting applications consists of 6 inputs, one single hidden layer with 6 hidden neurons, a momentum factor set to 0.9, and a learning rate set to 0.01. The data set is split into 70% for training and 30% for testing. This way, it achieves the following performance indexes: $R = 1$, $MAPE = 0.0728$, $MSE = 0.0059$, $MAE = 0.0164$, $RMSE = 0.0770$, $MRE = 7.2796 \times 10^{-4}$, Time = 2.26 minutes. Finally, as for the operational usage of the resulting trained ANN model, temperature values can be predicted over a given time window by feeding the network with the time profile of the relevant atmospheric parameters.

5. Conclusion

This paper has presented an ANN-based model for real-world temperature forecasting applications. It has investigated how the ANN model influencing factors (e.g., the number of neurons of the hidden layer, the learning rate, the momentum factor) and the portioning of the available data into training and testing data sets can affect the resulting model performance indexes. Prediction error metrics, convergence rate, generalization, and stability have been considered in this analysis, which can be summarized as follows:

As for the ANN model underfitting and overfitting issues, a proper number of hidden neurons has to be determined.

The activation function selection is important to capture the non-linear/complex aspects of the application at hand.

Both the learning rate and the momentum factor affect the convergence rate during the ANN model training process.

An improper classification of training and testing data set can affect the ANN model prediction performance.

The ANN model generalization can be achieved by examining the various ANN model influencing factors.

The results and the remarks of this paper can be used by both researchers and engineers when applying ANN-based models to their own specific applications. The setting of such influencing factors is a key element for the resulting ANN-based model to work. The presented optimization procedure has addressed the optimal value of each influencing factor individually. As future work, we plan to improve the neural network performance using meta-heuristic optimization algorithms. Moreover, the results of this manuscript can be extended to other weather aspects (such as humidity). Such results can be used to implement weather forecasting applications for, e.g., wireless, cellular, and mobile communication services, with the aim of improving their reliability, availability, serviceability, and usability.

Acknowledgement

The authors gratefully acknowledge the contribution of the National Oceanic and Atmospheric Administration (United States of America) for the provision of data set used to train and test the proposed ANN-based temperature forecasting model.

References

1. Abhishek, M.P. Singh, Ghosh and Anand, Weather forecasting model using artificial neural network, Procedia Technology 4 (2012), 311–318. doi:10.1016/j.protcy.2012.05.047.

2. Ahmed, D.H. Vu, K.M. Muttaqi and A.P. Agalgaonkar, Load forecasting under changing climatic conditions for the city of Sydney, Australia, Energy (Elsevier) 142(C) (2018), 911–919.

3. Curceac, Ternynck, T.B.M.J. Ouarda et al., Short-term air temperature forecasting using nonparametric functional data analysis and SARMA models, Environ Model Softw. 111 (2019), 394–408. doi:10.1016/j.envsoft.2018.09.017.

4. D’Angelo, Tipaldi, Glielmo and Rampone, Spacecraft autonomy modeled via Markov decision process and associative rule-based machine learning, in: 2017 IEEE International Workshop on Metrology for AeroSpace (MetroAeroSpace), 2017, pp. 324–329. doi:10.1109/MetroAeroSpace.2017.7999589.

5. D’Angelo, Tipaldi, Palmieri and Glielmo, A data-driven approximate dynamic programming approach based on association rule learning: Spacecraft autonomy as a case study, Information Sciences 504 (2019), 501–519. doi:10.1016/j.ins.2019.07.067.

6. Fausett, Fundamentals of Neural Networks: Architectures, Algorithms, and Applications, Prentice-Hall, Inc., USA, 1994.

7. Hinton and Sejnowski, Unsupervised Learning: Foundations of Neural Computation, 1999, MIT Press. ISBN 978-0262581684.

8. L.P. Kaelbling, M.L. Littman and A.W. Moore, Reinforcement learning: A survey, Journal of Artificial Intelligence Research 4 (1996), 237–285. doi:10.1613/jair.301.

9. B.W. Kernighan and Lin, An efficient heuristic procedure for partitioning graphs, Bell System Technical Journal 49(2) (1970), 291–307. doi:10.1002/j.1538-7305.1970.tb01770.

10. Kohonen, Self-organized formation of topologically correct feature maps, Biological Cybernetics 43(1) (1982), 59–69. doi:10.1007/bf00337288.

11. Krzemień, Fire risk prevention in underground coal gasification (UCG) within active mines: Temperature forecast by means of MARS models, Energy 170 (2018), 777–790. doi:10.1016/j.energy.2018.12.179.

12. Madhiarasan and S.N. Deepa, A novel criterion to select hidden neuron numbers in improved back propagation networks for wind speed forecasting, Applied Intelligence 44(4) (2016), 878–893.

13. Madhiarasan and S.N. Deepa, ELMAN neural network with modified grey wolf optimizer for enhanced wind speed forecasting, Circuits and Systems 7(10) (2016), 2975–2995.

14. Madhiarasan and S.N. Deepa, New criteria for estimating the hidden layer neuron numbers for recursive radial basis function networks and its application in wind speed forecasting, Asian Journal of Information Technology 15(21) (2016), 4377–4391.

15. Madhiarasan and S.N. Deepa, Comparative analysis on hidden neurons estimation in multi layer perceptron neural networks for wind speed forecasting, Artificial Intelligence Review 48(4) (2017), 449–471.

16. Nair and G.E. Hinton, Rectified linear units improve restricted Boltzmann machines, in: 27th International Conference on Machine Learning, ICML’10, Omnipress, USA, 2010, pp. 807–814. ISBN 9781605589077.

17. Nardone, Santone, Tipaldi, Liuzza and Glielmo, Model checking techniques applied to satellite operational mode management, IEEE Systems Journal 13(1) (2019), 1018–1029. doi:10.1109/JSYST.2018.2793665.

18. Nwankpa, Ijomah, Gachagan and Marshall, Activation Functions: Comparison of trends in Practice and Research for Deep Learning, 2018, available at: arXiv:1811.03378.

19. Rosenblatt, The Perceptron—a perceiving and recognizing automaton, Report 85-460-1, Cornell Aeronautical Laboratory, 1957.

20. Russell, “The Delta Rule”. University of Hartford, 2016, Archived from the original on 4, Retrieved 5 November 2012.

21. Russell and Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, Englewood Cliffs, 1995, p. 578. ISBN 0-13-103805-2.

22. S.J. Russell and Norvig, Artificial Intelligence: A Modern Approach, 3rd edn, Prentice Hall, 2010. ISBN 9780136042594.

23. Santhosh Baboo and Kadar Shereef, An efficient weather forecasting system using artificial neural network, International Journal of Environmental Science and Development 1(4) (2010), 321–326.

24. Sharma, Gummeson, Irwin, Zhu and Shenoy, Leveraging weather forecasts in renewable energy systems, Sustainable Computing: Informatics and Systems 4(3) (2014), 160–171.

25. Spencer, Alfandi and Al-Obeidat, Forecasting temperature in a smart home with segmented linear regression, Procedia Computer Science 155 (2019), 511–518. doi:10.1016/j.procs.2019.08.071.

26. Sun, Wang, Zhen, Mi, Liu, Wang and Lu, Research on short-term module temperature prediction model based on BP neural network for photovoltaic power forecasting, in: Proc. IEEE Power Energy Soc. General Meeting, 26–30 Jul. 2015, pp. 1–5.

27. Tipaldi, Feruglio, Pierre and D’Angelo, On applying AI-driven flight data analysis for operational spacecraft model-based diagnostics, Annual Reviews in Control 49 (2020), 197–211. doi:10.1016/j.arcontrol.2020.04.012.

Percentage of data set		Performance indexes

Training set	Testing set	R	MAPE	MSE	MAE	RMSE	MRE	Time (min)
90	10	1	0.6522	0.5909	0.0713	0.7687	0.0065	2.32
80	20	1	0.5090	0.3713	0.0623	0.6093	0.0051	2.53
70	30	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.26
60	40	1	0.4480	0.2362	0.1147	0.4860	0.0045	2.34
50	50	1	0.1068	0.0143	0.0265	0.1197	0.0011	2.50
40	60	1	0.1338	0.0184	0.0316	0.1356	0.0013	2.68
30	70	1	0.1652	0.0312	0.0428	0.1768	0.0017	2.17
20	80	1	0.0797	0.0062	0.0174	0.0787	7.9740e-04	2.42
10	90	1	0.0962	0.0089	0.0241	0.0943	9.6211e-04	3.04
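
As an illustration of how the performance indexes reported in the table above could be computed from observed and forecast temperatures, the following sketch uses standard textbook definitions. These definitions are an assumption, since the formulas are not stated in this part of the paper; the tabulated values are, however, consistent with RMSE ≈ √MSE and MAPE ≈ 100 × MRE. The function name `performance_indexes` and the sample values are hypothetical.

```python
import math


def performance_indexes(actual, predicted):
    """Return common forecast error measures for two equal-length sequences.

    Assumed (standard) definitions, not taken from the paper:
    R is the Pearson correlation, MRE the mean of |error| / |actual|,
    and MAPE is MRE expressed in percent.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # Relative error requires non-zero actual values.
    mre = sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n

    # Pearson correlation; undefined if either series is constant.
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)

    return {
        "R": r,
        "MAPE": 100.0 * mre,
        "MSE": mse,
        "MAE": mae,
        "RMSE": math.sqrt(mse),
        "MRE": mre,
    }


# Hypothetical temperatures, for illustration only.
metrics = performance_indexes([10.0, 20.0, 30.0, 40.0], [11.0, 19.0, 31.0, 39.0])
```

Applying such a function to the held-out portion of each training/testing split would yield one row of indexes per split, which is the layout the table uses.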

Table 7
ANN model performance analysis with various hidden layers

Number of hidden layers	R	MAPE	MSE	MAE	RMSE	MRE	Time (min)
1	1	0.0728	0.0059	0.0164	0.0770	7.2796e-04	2.26
2	0.9899	0.2125	0.0431	0.0582	0.2076	0.0021	8.10
3	0.9697	2.3156	0.9210	0.3155	0.9596	0.0232	10.32

Table 1
Design parameters of the ANN model

ANN model design and training parameters

Number of input neurons = 6

Number of hidden layers = 1

Output neuron = 1

Number of epochs = 2000

Learning rate = 0.01

Momentum factor = 0.9

Threshold value = 1
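
To make the role of the Table 1 parameters concrete, the sketch below trains a network with one hidden layer using plain per-sample back propagation with a momentum term. It is not the improved back propagation algorithm of this paper. Several choices are assumptions made only for illustration: the hidden-layer size (`N_HIDDEN`, not listed in Table 1), the tanh hidden activation, the linear output, and the weight initialization range. How the threshold value enters the authors' model is not stated in this section, so it is left out. The function name `train` is hypothetical.

```python
import math
import random

# Values taken from Table 1.
N_INPUTS = 6
EPOCHS = 2000
LEARNING_RATE = 0.01
MOMENTUM = 0.9
# Assumption: the hidden-layer size is not listed in Table 1; 4 is a placeholder.
N_HIDDEN = 4


def train(samples, targets, epochs=EPOCHS, seed=0):
    """Train a 6-N_HIDDEN-1 network and return a prediction function."""
    rng = random.Random(seed)
    # Each row holds the input weights of one hidden neuron plus a bias (last entry).
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS + 1)]
           for _ in range(N_HIDDEN)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN + 1)]
    # Momentum terms: the previous weight change of every weight.
    v_h = [[0.0] * (N_INPUTS + 1) for _ in range(N_HIDDEN)]
    v_o = [0.0] * (N_HIDDEN + 1)

    def forward(x):
        xb = list(x) + [1.0]  # append constant bias input
        h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in w_h]
        hb = h + [1.0]
        y = sum(w * v for w, v in zip(w_o, hb))  # linear output neuron
        return xb, hb, y

    for _ in range(epochs):
        for x, t in zip(samples, targets):
            xb, hb, y = forward(x)
            delta_o = y - t  # gradient of 0.5 * (y - t)^2 with respect to y
            # Hidden deltas use the output weights from before this update.
            delta_h = [delta_o * w_o[j] * (1.0 - hb[j] ** 2)
                       for j in range(N_HIDDEN)]
            for j in range(N_HIDDEN + 1):
                v_o[j] = MOMENTUM * v_o[j] - LEARNING_RATE * delta_o * hb[j]
                w_o[j] += v_o[j]
            for j in range(N_HIDDEN):
                for i in range(N_INPUTS + 1):
                    v_h[j][i] = (MOMENTUM * v_h[j][i]
                                 - LEARNING_RATE * delta_h[j] * xb[i])
                    w_h[j][i] += v_h[j][i]

    return lambda x: forward(x)[2]
```

Calling `train(samples, targets)` with a list of six-element input vectors and a matching list of target values returns a function that maps one input vector to a forecast. Inputs are assumed to be scaled to a small range such as [0, 1] before training.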