Abstract
In the hot strip rolling process, accurate prediction of bending force helps improve the accuracy of strip crown and flatness, and thereby the strip shape quality. Because outliers and noise are commonly present in the data generated in the rolling process, the uncertainty of the prediction results should be described quantitatively in addition to the prediction accuracy. Therefore, for the first time, the authors establish an interval prediction model for bending force in the hot strip rolling process. In this paper, we use an Artificial Neural Network (ANN) and the whale optimization algorithm (WOA) to build a prediction interval model (WOA-ANN) for bending force in hot strip rolling. Based on the point prediction by the ANN, interval prediction is completed by using lower upper bound estimation (LUBE) and WOA, and three indexes are used to evaluate the performance of the model. Real-world data from a steel factory are used to determine the optimal network structure and parameters of the interval prediction model. Furthermore, the proposed WOA-ANN model is compared with interval prediction models built with three other optimization algorithms. The experimental results show that the proposed WOA-ANN model has high reliability and narrow interval width, and can complete the interval prediction of bending force in hot strip rolling well. This study provides a more detailed and rigorous basis for setting the bending force in the hot strip rolling process.
Introduction
Iron and steel industry plays an important role in the development of human society and is one of the industrial bases that industrialized countries must have. With the continuous development of society, the iron and steel industry is also in an important period of strategic adjustment, and it is necessary to take high quality as the core point of current and future development. Therefore, the iron and steel industry urgently needs to establish a cyber-physical system supported by information technology to comprehensively improve the production technology and product quality.
Strip shape is one of the important indicators of the quality of rolled strip products, and shape quality has therefore become a hot issue for many scholars [1, 2]. However, the rolling environment is very complex, and many factors affect the strip shape. To improve the strip shape quality, an accurate bending force prediction model is needed to compensate for the influence of external factors on the strip shape accuracy. Because the strip shape can be improved by applying a certain bending force to the work roll, the strip shape quality is determined in large part by the set value of the bending force. The bending force therefore directly affects the adjustment ability of the shape control mechanism.
With the rapid development of artificial intelligence algorithms, Artificial Neural Network (ANN) has been widely applied in different aspects prediction in rolling process, including rolling force prediction [3–6], bending force prediction [7, 8], mechanical properties prediction [9–13], temperature prediction [14], thermal crown of work rolls prediction [15, 16], strip crown and flatness prediction [17–20]. Hydraulic bending rolls technology is one of the most basic and important methods for hot rolling strip shape control. The preset bending force accuracy directly affects strip shape and flatness, especially strip head, and further affects production efficiency. Therefore, high-precision prediction of bending force will benefit the closed-loop feedback control of bending force, so it is necessary to study the prediction of bending force. Wang et al. [21] established a prediction model of bending force in hot strip rolling by using neural network optimized by genetic algorithm (GA), which can be used for on-line controlling and rolling schedule optimization. Li et al. [7] used ANN, Support Vector Machine (SVR), Classification and Regression Tree (CART), Bagging Regression Tree (BRT), Least Absolute Shrinkage and Selection operator (LASSO) and Gaussian Process Regression (GPR) to predict the bending force in hot strip rolling. The experimental results show that GPR model is the best prediction model for bending force. Li et al. [22] chose BP network based on LM algorithm to establish the prediction model of bending force. This model improves the convergence speed, overcomes the defect of local minimum, and has high prediction accuracy.
In iron and steel production, digital twin technology can use the data collected from field instruments, equipment, processes, etc., combined with artificial intelligence technology, to predict unknown quantities, such as the bending force in the hot strip rolling process. Although the existing studies can predict the bending force, they are all point predictions, and point predictions have obvious defects. For example, a point prediction provides only a single predicted value and carries no information about its accuracy. That is to say, point prediction cannot express the probability that the prediction is correct, and prediction errors still exist. The reliability of the rolling process must be guaranteed, so the evaluation of uncertainty is very important for the whole rolling process.
In recent years, probabilistic prediction methods have been widely studied to quantify forecast uncertainty effectively. Probabilistic prediction generally takes three forms: probability density functions, quantiles and intervals. Among these, the interval forecast, as the most intuitive form, has attracted wide attention. Extending the single-point prediction of the target value to an interval composed of upper and lower bounds not only gives the prediction result, but also expresses how reliable that result is. Conventional interval forecast methods such as the Delta method [23, 24], the Bayesian method [25, 26], the Mean-variance Estimation (MVE) method [27] and the Bootstrap method [28, 29] rely on specific data distribution assumptions. As a nonparametric method, the lower upper bound estimation (LUBE) method can directly construct appropriate prediction intervals (PIs) [30] from the inputs in an unsupervised learning mode, based on a feedforward neural network. Compared with the other methods, the LUBE method requires less computation and does not need a data distribution assumption.
In this paper, the interval prediction of bending force in the hot rolling process is studied for the first time. ANN and the LUBE method are used to construct PIs within a single-objective framework, in which a comprehensive fitness function is taken as the training objective of the model. Considering the high complexity and nonlinearity of the objective function, this paper uses a meta-heuristic algorithm, the whale optimization algorithm (WOA), to solve this problem effectively. To illustrate the performance of this prediction model, it is compared with models based on three other meta-heuristic algorithms: particle swarm optimization (PSO) [31], grey wolf optimizer (GWO) [32] and simulated annealing (SA) [33].
The remainder of this paper is organized as follows. Section 2 mainly introduces the evaluation index of interval prediction and LUBE method. Section 3 describes the WOA-ANN model in detail, including WOA and the whole prediction process. In Section 4, the interval prediction of real bending force data in hot strip rolling process is carried out, and the network topology and related parameters are determined by experiments. Section 5 evaluates and discusses the performance of the WOA-ANN model in this paper. Finally, Section 6 concludes this paper with some remarks for future study.
Background of PI
Interval prediction is the range of estimated value under the condition of guaranteeing a certain prediction probability. The whole interval is composed of upper and lower bound. Like point prediction, interval prediction also needs some evaluation indices to evaluate the accuracy and quality of PIs. This section first introduces the evaluation index of interval prediction, and then introduces the construction method of PIs based on neural network.
PI evaluation indexes
High-quality interval prediction model requires greater reliability and narrower width. In order to evaluate the performance of these two aspects, PI coverage probability (PICP) and PI normalized average width (PINAW) are usually used to quantitatively measure the quality of interval prediction.
PICP is an important feature of interval prediction, which represents the probability that the objective value is located in the interval, it is defined as follows [34]:
As an evaluation index of interval prediction, the value of PICP lies between 0 and 1, and a larger PICP is better. The ideal value of PICP is 1, which means that all targets lie within the PI. Although PICP is an important index for evaluating the quality of a PI, it is not the only one: if the PI is wide enough, PICP will be 1, but such an interval conveys very little deterministic information and is of little practical significance. Therefore, the width of the interval should be considered along with the PICP.
In this paper, the quantitative measurement of the width of PI is defined as the PINAW, which can be defined as follows:
Ideally, we would like to include as many targets as possible within the narrowest possible interval, and it is easy to see that a high PICP and a low PINAW are required to achieve a high quality PI. As can be seen from the above (1) and (3) for calculating PICP and PINAW, these two requirements are contradictory. Therefore, it is necessary to define a comprehensive fitness function based on PI, called coverage width-based criterion (CWC), which is used to optimize the above two evaluation indicators [35]. The specific definition is as follows:
The single objective function CWC balances the accuracy and effectiveness of PI. Therefore, the interval prediction model is established by optimizing CWC.
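The defining equations for PICP, PINAW and CWC are not reproduced in this text. As a minimal sketch, the three indexes can be computed as below, assuming the forms commonly used in the LUBE literature: PICP as the fraction of covered targets, PINAW as the mean interval width divided by the range of the targets, and CWC = PINAW (1 + γ exp(−η (PICP − μ))) with γ = 0 when PICP ≥ μ and γ = 1 otherwise. The function names, default values and toy numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def picp(targets, lower, upper):
    """Fraction of targets that fall inside their prediction interval."""
    covered = sum(1 for y, lo, up in zip(targets, lower, upper) if lo <= y <= up)
    return covered / len(targets)


def pinaw(targets, lower, upper):
    """Mean interval width, normalized by the range of the targets."""
    target_range = max(targets) - min(targets)
    mean_width = sum(up - lo for lo, up in zip(lower, upper)) / len(targets)
    return mean_width / target_range


def cwc(targets, lower, upper, mu=0.95, eta=50.0):
    """Coverage width-based criterion in the assumed LUBE form.

    gamma switches the exponential penalty on only when PICP < mu.
    """
    coverage = picp(targets, lower, upper)
    width = pinaw(targets, lower, upper)
    gamma = 0.0 if coverage >= mu else 1.0
    return width * (1.0 + gamma * math.exp(-eta * (coverage - mu)))


# Toy values, made up for illustration (not from the paper's data set).
y_true = [10.0, 12.0, 14.0, 16.0]
lower_b = [9.0, 11.0, 15.0, 15.0]
upper_b = [11.0, 13.0, 17.0, 17.0]
print(picp(y_true, lower_b, upper_b), pinaw(y_true, lower_b, upper_b))
```

With these toy values three of the four targets are covered, so PICP is below the nominal level and the exponential penalty dominates CWC; once PICP reaches μ, CWC reduces to PINAW.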
The LUBE method is implemented by a feedforward NN with two output nodes, the structure is shown in Fig. 1. The upper node represents the upper bound of the PI and the lower node represents the lower bound of the PI. In fact, this method belongs to unsupervised learning since the upper and lower bounds are not known during the training process. The output of the NN is expressed as:

LUBE architecture.
Because CWC transforms an optimization problem with two conflicting objectives into a single-objective optimization problem, the LUBE method can construct PIs efficiently and directly by minimizing CWC. Moreover, because CWC is nonlinear and nondifferentiable, it cannot be minimized by the gradient descent method. Therefore, an intelligent optimization algorithm is needed to minimize CWC by adjusting the weight parameters of the neural network.
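To make the two-output structure concrete, the following is a minimal sketch of a LUBE-style forward pass with one hidden layer. It assumes a sigmoid hidden layer, a linear output layer, and that the first output node is the upper bound and the second the lower bound; the function and parameter names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lube_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Return (lower, upper) for one input vector x.

    w_hidden: one weight list per hidden node.
    w_out: exactly two weight lists; the first feeds the upper-bound node,
    the second the lower-bound node (an assumed ordering).
    """
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    upper, lower = (
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    )
    return lower, upper


# Two inputs, two hidden nodes; zero hidden weights give hidden outputs of 0.5.
bounds = lube_forward(
    x=[0.0, 0.0],
    w_hidden=[[0.0, 0.0], [0.0, 0.0]],
    b_hidden=[0.0, 0.0],
    w_out=[[1.0, 1.0], [0.5, 0.5]],
    b_out=[0.0, 0.0],
)
print(bounds)
```

Because no target bounds exist, the weights in such a network are scored only through the CWC of the intervals they produce, which is why a derivative-free optimizer is needed.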
ANN
ANN is a kind of simulation of biological nerve, which is composed of a large number of neurons connected widely. It is a highly complex nonlinear and adaptive learning system. ANN generally includes input layer, hidden layer and output layer. As a common ANN, Backpropagation (BP) network includes forward propagation and back propagation.
In forward propagation, the signals are transmitted in sequence from input layer to hidden layer and then to output layer, and each layer uses activate function to process the output of neurons. A typical neuron is shown in Fig. 2.

Schematic illustration of neurons.
The input vector X = (x1, x2, …, xn) of each layer may be the input of the model or the output of the neurons of the upper layer. The parameters to be determined are W and B, where W = (w1, w2, …, wn) is the connection weight between the neuron of this layer and the neurons of the upper layer, B is the bias, and the output of the neuron is y, which can be defined as follows:
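The equation is not reproduced in this text. Assuming the standard form, in which the output is the activation function applied to the weighted sum of the inputs plus the bias, a single neuron can be sketched as follows (the function name and the default activation are illustrative choices):

```python
import math


def neuron_output(x, w, b, activation=math.tanh):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return activation(z)


# Linear activation: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0
print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0, activation=lambda z: z))
```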
Back propagation uses gradient descent method to adjust the network parameters by using the error between the predicted value of the output layer and the real value, so as to make the error smaller and further make the predicted value closer to the real value.
WOA is a meta-heuristic algorithm, which was proposed by Mirjalili and Lewis in 2016 [36]. Its inspiration comes from the hunting method of large humpback whales [36], which is called bubble-net hunting method [37]. Under this behavior, humpback whales like to prey on small fish or krill near the water surface, and this foraging is accomplished by creating unique bubbles through the circular or “9”-shaped path shown in Fig. 3.

Bubble-net feeding behavior of humpback whales [36].
The mathematical model of WOA is as follows:
Humpback whales can identify the position of prey and then encircle them. The WOA algorithm assumes that the current optimal candidate solution is the objective prey or close to the best candidate. After defining the best search agent, other search agents will update their positions to find the best search agent. This process is represented by the following formula:
Two approaches are designed to model the bubble-net behavior of humpback whales:
a: Shrinking encircling mechanism
This behavior is realized by reducing the
b: Spiral updating position
The method first calculates the distance between the whale located at (X, Y) and the prey located at (X*, Y*), and then creates a spiral-based equation between the whale and the prey position to simulate the spiral movement of humpback whales:
Because humpback whales swim around their prey and swim along a spiral path at the same time, in order to model these two simultaneous behaviors, it is assumed that the probability of updating each behavior is 50%, and the mathematical expression is as follows:
When the humpback whale selects a randomly selected search agent, this stage can perform a global search. If the value of

The flowchart of WOA.
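The update equations are not reproduced in this text. The sketch below follows the standard WOA formulation of Mirjalili and Lewis as an assumption: a coefficient A = 2a·r − a and C = 2r with random r in [0, 1], an encircling or random-search move chosen by |A|, and a logarithmic spiral move chosen with probability 0.5. For brevity A and C are drawn once per agent rather than per dimension, and all names are hypothetical.

```python
import math
import random


def woa_step(positions, best, a, b=1.0, rng=random):
    """One WOA position update for every search agent.

    positions: list of position vectors; best: best position found so far;
    a: control value, usually decreased linearly from 2 to 0 over iterations;
    b: constant defining the shape of the logarithmic spiral.
    """
    new_positions = []
    for x in positions:
        A = 2.0 * a * rng.random() - a
        C = 2.0 * rng.random()
        if rng.random() < 0.5:
            # |A| < 1: shrink towards the best agent (exploitation);
            # otherwise move relative to a random agent (global search).
            ref = best if abs(A) < 1.0 else rng.choice(positions)
            new_x = [r - A * abs(C * r - xi) for r, xi in zip(ref, x)]
        else:
            # Spiral update around the best agent.
            l = rng.uniform(-1.0, 1.0)
            spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
            new_x = [abs(r - xi) * spiral + r for r, xi in zip(best, x)]
        new_positions.append(new_x)
    return new_positions


random.seed(0)
swarm = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
moved = woa_step(swarm, best=[0.0, 0.0], a=1.0)
print(moved)
```

In a full optimizer this step would be followed by evaluating the fitness of each new position and updating the best agent before the next iteration.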
The implementation flowchart of WOA-ANN is shown in Fig. 5, the module diagram of it is shown in Fig. 6, and the main steps are described in detail as follows:

The flowchart of WOA-ANN.

Module diagram of WOA-ANN.
(I) Dataset partition and preprocess. For the prediction model, the input of the prediction model is the related data which have influence on the bending force. The original data should be divided into training data set and testing data set. The training data set is used to train the model, and the testing data set is used to verify the generalization ability of the model. At the same time, because the dimensions of the original data are different, normalization is needed, which in turn will help accelerate the training of the model.
(II) The training of point prediction model is completed on the training data set.
a) Because the WOA-ANN model in this paper is based on point prediction, the parameters of ANN network are initialized randomly at first. And the training set is used to complete the establishment of ANN point prediction model, the connection weight matrix W and biases B between each layer of ANN network are obtained and stored respectively.
b) The weight matrices and biases obtained between the ANN input layer and hidden layer, and between the hidden layers, are used directly as the weight and bias parameters of the corresponding layers of the ANN in the interval prediction model. Because interval prediction includes an upper and a lower bound, it needs two output nodes, while the ANN has only one output node in point prediction. Therefore, the weight matrix W o (n × 1) between the hidden layer and the output layer obtained from the point prediction is copied to form W ho with the shape n × 2. If the two columns of W ho obtained by copying W o are identical, the weights of the two output nodes are the same, and the upper and lower bounds of the interval will completely coincide, which does not achieve the purpose of interval prediction. Therefore, to speed up convergence during the model iteration, W ho is processed so that the upper and lower bounds of the interval are separated at the beginning of the iterative process. The specific method is to add a random matrix R with values in [-0.1, 0.1] to W ho to obtain the matrix W no . This is illustrated in Fig. 6.
(III) Determination of the best parameters of WOA-ANN model.
a) WOA algorithm is used to optimize the connection weight matrix W no (n × 2). When executing WOA, the CWC function is taken as the fitness function, and the weight matrix W no (n × 2) to be optimized is taken as the initial position X of each individual in the population of WOA. Firstly, the position of each individual at moment t is updated by three position updating strategies of WOA to obtain the position at moment t + 1. Each individual is input into the training data set to obtain the prediction interval of all data, and the CWC of each individual is calculated. The optimal CWCopt of the population at moment t + 1 and the corresponding optimal position Xopt are selected. Secondly, compare with the current population optimal CWCgopt, if CWCopt is less than CWCgopt, make CWCgopt equal to CWCopt, and when CWCopt is not less than CWCgopt, CWCgopt remains unchanged. The process is continued until the termination condition is satisfied (This will be explained in detail in 3.4), the best W no is obtained, that is, W opt .
b) The weight and bias of each layer except the output layer obtained from step (II) and the newly obtained optimum W opt are used on the testing data set. The evaluation indexes PICP, PINAW and CWC under different parameters are compared to determine the optimum parameters of the prediction model, including network structure, related parameters of WOA algorithm, etc.
(IV) Use the best interval prediction model obtained in step (III) to predict PI. And according to the PI, calculate PICP, PINAW and CWC to evaluate the prediction performance of the WOA-ANN model in this paper.
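The initialization in step (II) b) can be sketched as follows: the n × 1 output weights of the point model are copied into two columns and perturbed with uniform random values in [-0.1, 0.1]. This is a minimal illustration with hypothetical names; the paper does not state which column corresponds to the upper bound, so no such labeling is assumed here.

```python
import random


def expand_output_weights(w_o, low=-0.1, high=0.1, rng=random):
    """Turn the n x 1 point-prediction weights into perturbed n x 2 weights.

    w_o: list of n hidden-to-output weights from the point model (W_o).
    Returns W_no as n rows of two weights, one per output node.
    """
    w_ho = [[w, w] for w in w_o]  # W_ho: copy the single column twice
    # W_no = W_ho + R, with each entry of R drawn uniformly from [low, high]
    return [[w + rng.uniform(low, high) for w in row] for row in w_ho]


random.seed(1)
w_point = [0.5, -0.2, 0.1]
w_no = expand_output_weights(w_point)
print(w_no)
```

Each WOA individual can then start from such a perturbed matrix, so that the two bounds differ from the first iteration onward.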
In this paper, the termination condition of model training is the convergence condition of the model, which in general is reaching the maximum number of iterations. However, during the iterative process the optimal solution may stop changing at a certain iteration and remain unchanged until the maximum number of iterations is reached, in which case continuing to train the model is meaningless. Therefore, when the value of the CWC function remains unchanged over six consecutive iterations of model training, this paper considers that the model has converged and found the optimal solution, which avoids unnecessary waste of time. For these reasons, there are two convergence conditions in this paper: one is reaching the maximum number of iterations, which is 50, and the other is that the CWC function remains unchanged over six consecutive iterations.
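The two stopping rules can be sketched as a small loop. Here `step` is a hypothetical callable that performs one optimization iteration and returns the best CWC found in it, and "unchanged" is interpreted as "no improvement over the best value so far", which is an assumption about the paper's exact criterion.

```python
def run_until_converged(step, max_iter=50, patience=6):
    """Iterate until max_iter is reached or the best CWC has not improved
    for `patience` consecutive iterations.

    Returns (best_cwc, iterations_run).
    """
    best = float("inf")
    unchanged = 0
    for iteration in range(1, max_iter + 1):
        current = step()
        if current < best:
            best = current
            unchanged = 0
        else:
            unchanged += 1
        if unchanged >= patience:
            break
    return best, iteration


# A made-up CWC trace that stalls at 3.0 after the third iteration.
trace = iter([5.0, 4.0, 3.0] + [3.0] * 100)
result = run_until_converged(lambda: next(trace))
print(result)
```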
Determination of the best model
In order to get a high precision prediction model of bending force, the actual data collected by the factory are used to determine the model parameters. This section first describes the data set, and then about the establishment of the interval prediction model (WOA-ANN).
The parameters involved in the model are mainly determined through experiments. They include the structure of the ANN, which has an impact on the prediction results; b, which affects the convergence rate of the model in WOA; the population size P N ; and the hyper-parameters of the CWC. The specific influence of each parameter on the model and the selection of parameter values will be explained in detail later.
Three evaluation indexes (PICP, PINAW, CWC) are used to evaluate the model. For unbiased comparison, each experiment is repeated five times. To determine the best parameters of the model, the average performance of the model is evaluated by the average value of CWC. A smaller CWC indicates better prediction performance of the model, and therefore a better choice of parameter values. The standard deviation of each index reflects the stability of the model.
Data set
Figure 7 shows the complete rolling process in a typical HSR process. The HSR process consists of 6 key parts: the reheating furnace, the roughing mill, the hot coil box and flying shear, the finishing mill, the laminar cooling, and the coiler. The key equipment of the production line is a finishing mill group composed of 8 groups of stands, which determines the final shape of the strip. Each group of stand consists of a pair of work rolls and a pair of backup rolls. The spacing between the stand is 5.5 m. The whole line is equipped with work roll shifting and hydraulic roll bending systems to control flatness and plate crown.

Schematic layout of hot strip rolling (HSR).
A single batch consists of a coil of rough steel, which enters the reheating furnace to be reheated to the appropriate temperature. Next, the strip passes through the roughing mill, where its thickness and width are reduced to close to the desired value. Then, the strip enters the finishing mill section, where the strip is carefully milled to the required width and thickness. The profile of the strip can be controlled by changing the bending forces between the two work rolls [38]. The strip thickness and flatness are measured in real time by an X-ray gauge at the end of the finishing stands as shown in Fig. 7. Measuring the final dimensions of the strip is vital for the mill controllers. The controllers adjust mill parameters in real time with feedback from the gage to minimize strip flatness. Next, the strip is cooled by water to an appropriate final temperature. Finally, the strip is coiled and is ready for shipment.
In the hot strip rolling process, the bending force is usually related to the production environment, rolling parameters, strip characteristics and other conditions. Therefore, this paper collected the final stand rolling data of a 1580 mm HSR process in a steel factory for experiments. The input variables used for the proposed prediction model of bending force are entrance temperature (°C), entrance thickness (mm), exit thickness (mm), strip width (mm), rolling force (kN), rolling speed (m/s), roll shifting (mm), yield strength (MPa), and target profile (μm). The output variable of the model is the bending force (kN). A total of 1444 pieces of steel data are employed in the experiments, and 70% of them are used as the training set. Table 1 shows the data distributions for each input variable.
Description of input parameters
Because the input data represent different characteristics, they have different dimensions and units, which will affect the prediction results. In order to eliminate the dimensional influence among the input data, it is necessary to standardize the data, so that the input data are in the same order of magnitude, and then make the input data comparable. The most common processing method is to normalize the data, and the normalization formula is as follows:
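The normalization formula is not reproduced in this text. A minimal sketch, assuming the common min-max form x' = (x − x_min) / (x_max − x_min) applied to each input variable separately, is given below; the function name and the handling of a constant column are illustrative choices.

```python
def min_max_normalize(column):
    """Scale a list of values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: avoid division by zero.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Made-up values for illustration only.
print(min_max_normalize([2.0, 4.0, 6.0]))
```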
The network structure of ANN has a great impact on the accuracy and stability of the whole interval prediction model. In this paper, the best network structure is determined, which mainly includes the number of hidden layers, the activation function of each layer, the learning rate and the number of nodes in each layer. In the CWC fitness function, the controlling parameter μ is set to be the nominal confidence level 0.95.
Determination of the number of hidden layers of the network
The number of the hidden layers of the network affects the accuracy of the prediction and the training time of the model. Therefore, in this paper, according to general experience, the network performance is tested when the network layer number is 1, 2, 3, and 4, in which “Tanh” is used for the activation function of hidden layers and the linear activation function is used for the output layer. The test results are shown in Table 2.
Comparison of evaluation indexes of different number of hidden layers
It can be seen that when the number of hidden layers increases from 1 to 4, PICP satisfies the nominal confidence level (0.95). When the number of hidden layers is 1, the mean and standard deviation of CWC are the lowest, which are 0.3997 and 0.1682, respectively, so the network with one hidden layer is determined.
Nonlinear activation function can make neural network approximate any complex function, so it is necessary to select and determine the appropriate activation function when designing neural network.
When using neural network for regression prediction, “Line” is generally selected as the activation function of the output layer. For the activation function of the hidden layer, this paper tests the influence of “Sigmoid”, “Tanh” and “Relu” on the prediction performance of the model, and the results are shown in Table 3.
Comparison of evaluation indexes of different activation functions
It can be seen from Table 3 that when the output layer uses the linear activation function and the hidden layer uses “Sigmoid”, the mean and standard deviation of CWC predicted by the model are the lowest, at 0.2933 and 0.0117 respectively. This indicates that the WOA-ANN model has the best prediction performance, with higher prediction accuracy and better stability, when the combination of activation functions is “Sigmoid” and “Line”.
As an important hyper-parameter of the neural network, the learning rate directly affects the update of the weights and biases of each layer during backpropagation, and thus the accuracy of prediction. A suitable learning rate is therefore one of the important factors that determine the final prediction results of the whole interval prediction model, so it must be set reasonably.
Generally, the learning rate is between [0, 1]. In this experiment, six learning rates are selected, which are 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, respectively. Experiments are carried out for each learning rate, and the experimental results are shown in Table 4. It can be seen from Table 4 that when the learning rate is 0.005, the mean and standard deviation of CWC are the minimum, which are 0.2860 and 0.0107, respectively. The prediction performance of the model is also the best, so 0.005 is finally selected as the learning rate of the network.
Comparison of evaluation indexes of different learning rates
In a neural network, the number of neural nodes plays a vital role in the final prediction results and the generalization ability of the model. If the number of nodes is too large, the model itself will have redundancy and affect the final generalization ability of the model. On the other hand, if the number of nodes is too small, the accuracy of the model may be affected. Therefore, a reasonable number of network nodes needs to be determined to enhance the generalization ability of the model and improve the accuracy of the model. In this experiment, according to the dimension of input data, the number of nodes in input layer is 9. Because the interval prediction model has two output variables, the number of network nodes in the output layer is 2. Therefore, the key to the experiment in this paper is to determine the number of nodes in the hidden layer.
Generally, the number of nodes in the hidden layer is the direct cause of “overfitting”, and there is no ideal analytical formula to determine the number of nodes in the hidden layer, which is usually based on empirical formula. In this paper, 3–15 were selected to determine the appropriate number of nodes in the hidden layer. The experimental results are shown in Table 5.
Comparison of evaluation indexes of different neuron numbers in hidden layer
Table 5 fully demonstrates the influence of the number of nodes in hidden layer on the final performance of the interval prediction model in this paper. It can be seen that when the number of nodes in the hidden layer is 9, the CWC value of the interval prediction model is the minimum, and the interval prediction performance is the best.
Through the above four experiments, the parameters of the neural network have been determined, as shown in Table 6.
Parameters of ANN
There are some parameters in WOA algorithm, which need to be adjusted according to specific experiments in order to achieve better experimental results. In this paper, many experiments are performed to determine the value of b and the size of the population in the WOA algorithm.
Determination of b
b is a constant that determines the shape of the logarithmic spiral in the WOA algorithm. Its value affects the iterative performance of the WOA algorithm and thus the accuracy of the whole model. Therefore, it is necessary to determine the appropriate b value in the experiment.
Generally, the value of b is 1. To find the best value of b for the model, b is decreased from a maximum of 3 in steps of 0.5, giving six groups of experiments; the results are shown in Table 7. It can be seen from Table 7 that when the value of b is 1, the prediction performance of the WOA-ANN model is the best, so the value of b is set to 1.
Comparison of evaluation indexes of different b values
In the population-based algorithm, the population size P N affects the search ability of the algorithm and the calculation amount of the whole algorithm. The selection of P N is determined according to specific problems. Generally, the P N is smaller for simple problems and larger for complex problems.
In this paper, the parameter to be optimized is the connection weight between the output layer and the hidden layer of the network. Therefore, the minimum value of the population in this experiment is 50, and the maximum value is 300, which is incremented by 50 each time, 6 kinds of population settings are obtained. It should be noted that the specific form of the weight is
It can be seen from Table 8 that when the population size is 200, the value of CWC is the minimum, and the performance of the WOA-ANN model is the best at this time. Based on the above experiments, the parameters of the WOA algorithm are finally determined as shown in Table 9.
Comparison of evaluation indexes of different population sizes
Parameters of WOA
The CWC is used as the fitness function of the proposed WOA-ANN model, as mentioned in Section 2.1, η, as a hyper-parameter involved in CWC, will punish PI with poor quality when PICP does not reach the confidence level, so it is necessary to discuss η value. Therefore, this paper compares the interval prediction results under different η.
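To illustrate how η scales the penalty, the sketch below evaluates the factor that multiplies PINAW, assuming the common LUBE form CWC = PINAW (1 + γ exp(−η (PICP − μ))) with γ = 1 only when PICP < μ. The function name and the example coverage of 0.93 are illustrative assumptions, not results from the paper.

```python
import math


def cwc_penalty(picp, mu=0.95, eta=50.0):
    """Multiplier applied to PINAW in the assumed CWC form."""
    gamma = 0.0 if picp >= mu else 1.0
    return 1.0 + gamma * math.exp(-eta * (picp - mu))


# Penalty at a coverage of 0.93 (0.02 below mu) for several eta values.
penalties = {eta: cwc_penalty(0.93, eta=eta) for eta in (20, 50, 90)}
print(penalties)
```

Under this assumed form, the same shortfall in coverage is penalized more heavily as η grows (roughly 2.5, 3.7 and 7.0 times PINAW for η = 20, 50 and 90), while intervals that already reach the confidence level are not penalized at all.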
Generally, the value of η is 50. So, in this paper, the maximum value of η is 90, which decreases by 10 in turn, and the experimental results are shown in Table 10. It can be seen that when η is 20, the prediction performance of the WOA-ANN model is the best, so the value of η is determined as 20.
Comparison of evaluation indexes with different η
Since the prediction of bending force in the hot strip rolling process has so far been limited to point prediction, and no method for its interval prediction has been studied, this paper is the first to propose an interval prediction model, WOA-ANN, for bending force in the hot strip rolling process. To evaluate the performance of the proposed WOA-ANN model, it is compared with the PSO-ANN, GWO-ANN and SA-ANN models under a nominal confidence level of 0.95. The evaluation indexes of the four prediction models for bending force are shown in Table 11, and the corresponding PI results are shown in Fig. 8.
Comparison of evaluation indexes of different models under confidence level of 0.95

PIs results of different models.
As can be seen from Table 11, the WOA-ANN model proposed in this paper differs little from the PSO-ANN, GWO-ANN and SA-ANN models on the training data set. The computational speed of a model on the testing data set is not affected by the training data set, and in practical application a trained model is applied to testing data rather than to the training data, so this paper pays more attention to the accuracy of the model on the testing data set. The computational times of the WOA-ANN, PSO-ANN, GWO-ANN and SA-ANN models on the testing data set are similar: 0.5153 seconds, 0.5102 seconds, 0.5250 seconds and 0.5198 seconds, respectively. However, the WOA-ANN model has the lowest CWC, so it has the best prediction performance. In addition, the standard deviation of the WOA-ANN model is the smallest, which indicates that the WOA-ANN model is more stable than the other three models for interval prediction of bending force.
As shown in Fig. 8, the PIs constructed by the four interval prediction models enclose most of the target values between their upper and lower bounds, and the bounds predicted by the four models follow trends similar to the real data. Compared with the PSO-ANN, GWO-ANN and SA-ANN models, the WOA-ANN model predicts a narrower interval width, and its bounds follow the trend of the data more closely. This implies that the WOA-ANN model has better prediction quality for the bending force than the other models.
Figure 9 shows the variation of CWC for the four models during the iteration process. As can be seen from Fig. 9, the CWC, which serves as the fitness function, drops rapidly at the beginning of the iteration for all four models. This indicates that when the PICP does not satisfy the nominal confidence level, the CWC is dominated by the value of the PICP. As the optimization proceeds, CWC continues to decrease, but more slowly, and gradually converges to a certain value. This shows that once PICP satisfies the nominal confidence level, CWC no longer depends on PICP and is instead governed by PINAW, and it finally converges to the optimal value. It can also be seen from Fig. 9 that, ordered from the most iterations to the fewest, the models are WOA-ANN, GWO-ANN, PSO-ANN and SA-ANN, which means that the iteration times of the four models on the training data set follow the same order. This is consistent with Table 11. Although the iteration time of the WOA-ANN model on the training data set is slightly longer than that of the other three models, its final converged CWC value is smaller. This means that the WOA-ANN model performs better than the other three models; in other words, it gives the best prediction results.

Convergence curve of different models.
Figure 10 is a line chart of the improvement rates of PINAW and CWC obtained according to Table 11. The CWC and PINAW improvement rates of the WOA-ANN model compared to other models are respectively defined as follows:

Improvement rates of PINAW and CWC.
Compared with the SA-ANN model, the improvement rate of the WOA-ANN model in CWC is the most pronounced, at 76.17%. Compared with the PSO-ANN and GWO-ANN models, the CWC improvement rates are 5.95% and 23.69%, respectively. As can be seen from Fig. 10, on the premise of meeting the confidence level, the interval width predicted by the WOA-ANN model is improved by 5.95%, 1.94% and 54.42% compared with the PSO-ANN, GWO-ANN and SA-ANN models, respectively.
The above analysis shows that the WOA-ANN model in this paper has the best overall performance and that its prediction results are more accurate and stable, so it can construct high-quality PIs of bending force in the hot strip rolling process.
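As an illustration of how such improvement rates can be computed, the sketch below assumes the usual relative-reduction form, (baseline − proposed) / baseline × 100%, which is appropriate for indexes such as CWC and PINAW where smaller is better. This form and the function name `improvement_rate` are assumptions made for illustration, and the numbers in the usage line are hypothetical, not values from Table 11.

```python
def improvement_rate(baseline, proposed):
    """Relative reduction, in percent, of `proposed` with respect to `baseline`.

    Positive values mean the proposed model has the smaller (better) index.
    """
    if baseline == 0:
        raise ValueError("baseline index must be non-zero")
    return (baseline - proposed) / baseline * 100.0

# Hypothetical index values, for illustration only
rate = improvement_rate(baseline=0.5, proposed=0.25)
```

Under this assumed definition, a rate of 0% means the two models have the same index value, and a negative rate would mean the compared model is better.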
Table 12 shows the comparison of evaluation indexes of the WOA-ANN model under different confidence levels. It can be seen that the average PICP values are 0.8351, 0.9048, 0.9182 and 0.9898 at confidence levels of 0.80, 0.85, 0.90 and 0.98, respectively, all of which satisfy the nominal confidence levels. This shows that the WOA-ANN model is effective for constructing PIs of bending force in the hot strip rolling process.
Comparison of evaluation indexes of WOA-ANN model under different confidence levels
When the confidence level is 0.85, 0.90 or 0.98, the mean and standard deviation of CWC are equal to those of PINAW, which means that the PICP in each of the five experiments satisfies the nominal confidence level. This indicates that the WOA-ANN model is more stable at these three confidence levels. When the confidence level is 0.80, the mean value of CWC is not equal to the mean value of PINAW, and the standard deviation of CWC is larger than at the other three confidence levels. This indicates that the WOA-ANN model can still effectively predict the bending force at a confidence level of 0.80, but its stability is not as good as at the other three confidence levels.
Figure 11 shows the PIs of the WOA-ANN model under different confidence levels. It can be seen from Fig. 11 that at confidence levels of 0.80, 0.85, 0.90 and 0.98, the upper and lower bounds of the PIs constructed by the WOA-ANN model follow trends similar to the real data. Moreover, the higher the confidence level, the more real data the PIs contain and the wider the PIs become, which further confirms that PICP and PINAW are two mutually restrictive indicators.
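The link between the CWC and PINAW statistics and the coverage of each run can be made explicit. The following worked equation assumes the CWC form commonly used in the LUBE literature, with μ denoting the nominal confidence level; the paper's own definition is given in Section 2.1 and may differ in detail.

```latex
\[
\mathrm{CWC} = \mathrm{PINAW}\,\Bigl(1 + \gamma(\mathrm{PICP})\, e^{-\eta\,(\mathrm{PICP}-\mu)}\Bigr),
\qquad
\gamma(\mathrm{PICP}) =
\begin{cases}
0, & \mathrm{PICP} \ge \mu,\\
1, & \mathrm{PICP} < \mu.
\end{cases}
\]
```

Under this assumed form, if every run satisfies PICP ≥ μ, then γ = 0 in every run and CWC = PINAW run by run, so their means and standard deviations coincide. If any run falls short of μ, the exponent is positive and that run's CWC exceeds twice its PINAW, which raises both the mean and the spread of CWC above those of PINAW.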

Results of PIs of WOA-ANN model at different confidence levels.
The bending force of hot rolled strip is of great significance to the hot rolling process, because the setting of bending force is one of the important factors that determine the shape quality of hot rolled strip. With accurate prediction of the bending force, the control accuracy of strip flatness can be improved and high-quality strip can be obtained. However, outliers and noise are common in the actual production process, which leads to great uncertainty in the point prediction of bending force. In order to quantitatively describe this uncertainty and provide more accurate guidance for production, the interval prediction of bending force is proposed for the first time in this paper. Compared with traditional point prediction, interval prediction not only considers the accuracy of prediction but, more importantly, also quantifies the uncertainty of the prediction results.
In the LUBE framework, the WOA algorithm is used to minimize the comprehensive evaluation index CWC of the PIs in order to optimize the output layer weights. Based on the evaluation indexes, different aspects of PI performance are evaluated, the best parameters of the model are determined, and the optimal PIs are constructed. To evaluate the performance of the WOA-ANN model in this paper, it is compared with the PSO-ANN, GWO-ANN and SA-ANN models. The accuracy and stability of these models in constructing PIs of the bending force in hot strip rolling are compared based on the mean and standard deviation of the three metrics. In addition, to illustrate the effectiveness of the WOA-ANN model, PIs are constructed under confidence levels of 0.80, 0.85, 0.90, 0.95 and 0.98, and the performance of the WOA-ANN model is evaluated from different aspects according to the three evaluation indexes.
Actual bending force data are used in the experiments. The experimental results show that, compared with the PSO-ANN, GWO-ANN and SA-ANN models, the WOA-ANN model has higher accuracy and more stable performance for interval prediction of bending force in hot strip rolling, achieving the best overall performance. Under different confidence levels, the model can construct high-quality PIs, which indicates that the WOA-ANN model in this paper has excellent prediction performance.
The WOA-ANN model proposed in this paper can guide the production process, which will help improve the adjustment ability of the shape control mechanism, thus improving the flatness quality in the hot strip rolling process and yielding greater production and economic benefits. Furthermore, the interval prediction model established in this paper can also be applied to other prediction tasks in the rolling process after training and adjustment.
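The optimization step summarized above can be sketched as a generic whale optimization loop that minimizes an arbitrary fitness function. This is a simplified illustration of the standard WOA update rules (encircling, random search and spiral update), not the authors' implementation. The function name `woa_minimize`, its parameters and the box-bound clipping are assumptions. In the paper's setting, `fitness` would map a vector of output layer weights to the CWC of the resulting PIs.

```python
import math
import random

def woa_minimize(fitness, dim, bounds, n_agents=30, n_iter=100, b=1.0, seed=0):
    """Minimize `fitness` over a box with a basic whale optimization loop."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(v):
        return min(max(v, lo), hi)

    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=fitness)[:]
    best_fit = fitness(best)
    history = [best_fit]  # best fitness found so far, recorded per iteration

    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 towards 0
        for i, x in enumerate(agents):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best agent; otherwise explore
                # around a randomly chosen agent
                ref = best if abs(A) < 1.0 else rng.choice(agents)
                new = [ref[d] - A * abs(C * ref[d] - x[d]) for d in range(dim)]
            else:
                # spiral update around the best agent
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [abs(best[d] - x[d]) * spiral + best[d] for d in range(dim)]
            agents[i] = [clip(v) for v in new]
        for x in agents:
            f = fitness(x)
            if f < best_fit:
                best, best_fit = x[:], f
        history.append(best_fit)
    return best, best_fit, history
```

The returned `history` plays the role of a convergence curve: it is non-increasing by construction, because the best solution is only replaced when a strictly better one is found.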
As future work, we will consider proposing new evaluation indexes to assess the quality of PIs from different aspects, processing the input data of the model such as feature extraction, and trying to use other networks and optimization algorithms for higher quality interval prediction of hot rolled strip bending roll forces. In the future, we will also study the interval prediction of other related indicators in the hot strip rolling production process or some parameters that affect the hot rolling production process, such as interval prediction of crown and rolling force. In addition, the research on interval prediction will be carried out for different varieties of steel, so that the model has wider applicability and stronger universality. This paper makes corresponding supplements in the conclusion section for the contribution, motivation and future work of the research work.
Statements and declarations
Author contribution
Conceptualization, funding acquisition and supervision of this research project were performed by Feng Luan and Xu Li. The experimental investigations, analysis and the first draft of the manuscript were performed by Xianghua Tian. The real data acquired and analyses were performed by Yan Wu and Nan Chen. All authors commented on previous versions of the manuscript and all authors read and approved the final manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (No. U20A20187), the Liaoning Revitalization Talents Program (No. XLYC2007087), and the Fundamental Research Funds for the Central Universities (No. N2007006 and No. N180708009).
Data availability
All authors confirm that the data supporting the findings of this study are available within the article.
Code availability
Not applicable.
Ethics approval
The manuscript has not been submitted to any other journal for simultaneous consideration. The submitted work is original and has not been published elsewhere in any form or language.
Consent to participate
All authors voluntarily agree to participate in this research study.
Consent for publication
All authors voluntarily agree to publish this research study.
Conflict of interest
The authors declare no competing interests.
