Abstract
In this study, support vector machine (SVM) and back-propagation (BP) neural network models were combined to predict the workload of physical machines in cloud computing, so as to improve the working efficiency of the physical machines and the service quality of cloud computing. The combined SVM and BP model was then simulated and analyzed in MATLAB and compared with the individual SVM, BP and radial basis function (RBF) prediction models. The results showed that the average error of the combined SVM and BP model was 0.670%, whereas the average errors of the SVM, BP and RBF models were 0.781%, 0.759% and 0.708%, respectively. In the multi-step prediction, the prediction accuracy of the SVM, BP, RBF and SVM + BP models was 89.3%, 94.6%, 96.3% and 98.5% in the first step; 87.4%, 93.1%, 95.2% and 97.8% in the second step; 83.5%, 90.3%, 93.1% and 95.7% in the third step; 79.1%, 87.4%, 90.5% and 93.2% in the fourth step; 75.3%, 81.3%, 85.9% and 91.1% in the fifth step; and 71.1%, 76.6%, 82.1% and 89.4% in the sixth step, respectively.
Introduction
The development of information and communication technology has changed the business models and management concepts of enterprises, not only reshaping the interaction between enterprises and their stakeholders, but also playing a key role in improving consumer satisfaction [1]. Data centers provide a variety of intelligent support for the operation and management of enterprises in the era of the digital economy. However, as enterprises develop, data centers need to deal with the rapid growth of data and business, and it is difficult for a single data center to meet the production needs of enterprises [2]. Cloud computing has emerged as one of the most promising solutions. Cloud computing [3] is an Internet-based business computing model that combines heterogeneous, inexpensive physical machines and network devices to provide a configurable, convenient and on-demand shared pool of computing resources (including network, server, storage, application software, and services). These resources can be made available quickly with relatively little administrative effort and interaction. Although cloud computing can provide huge computing resources, those resources are supported by costly physical machines and power. In order to maintain legitimate and normal profits, enterprises need to make full use of cloud resources [4]. The most direct way is to add physical machines, but the cost rises once the number of machines exceeds a certain limit. Another way is to optimize the allocation of virtual machines on physical machines.
In cloud computing services, task requirements differ from user to user, but the way users apply cloud computing also follows certain rules; therefore the workload on a physical machine changes randomly while following hidden laws. By mining these rules, the change of the workload can be predicted to improve the working efficiency of the physical machine. Algorithms that can mine the variation law of the workload include the support vector machine (SVM), the back-propagation (BP) neural network, etc. These algorithms can be regarded as intelligent algorithms. SVM makes classification predictions on data by finding a hyperplane that separates the data. The advantage of SVM is that it does not need much training data, but the disadvantage is that it takes a long time to process big data. The BP neural network approaches the nonlinear change law step by step through activation functions and reverse weight adjustment. Its advantage is that it can process big data quickly, but its disadvantage is that the accuracy of the mined rules decreases when the sample size is small.
Relevant studies are as follows. Chen et al. [5] proposed estimating the completion time of cloud computing tasks with artificial neural networks to predict workload and make allocation decisions. They found through simulation experiments that the method could significantly reduce the estimation time and improve the prediction speed. Kumar et al. [6] predicted cloud computing workload with a Long Short-Term Memory recurrent neural network (LSTM-RNN) to solve the problem of dynamic resource allocation and consumption in cloud computing, ran simulation tests on three Web server log benchmark datasets, and found that this method could reduce the mean square error to $3.17 \times 10^{-3}$. Barati et al. [7] proposed an improved support vector regression scheme for cloud load prediction. Its improvement lay in determining three key parameters of the SVM by applying a hybrid of the genetic algorithm and the particle swarm optimization (PSO) algorithm. The simulation results showed that the improved SVM model had higher prediction accuracy than the traditional model. Li et al. [16] embedded domain knowledge into a neural network to predict the workload of cloud computing and found through simulation that the model could achieve higher prediction accuracy. Kumar et al. [17] combined the adaptive differential evolution algorithm with a neural network to predict the workload of cloud computing and then carried out simulation experiments on the benchmark datasets of HTTP traces of the NASA and Saskatchewan servers. They found that the prediction accuracy of the model improved significantly, with the prediction error reduced by up to 168 times compared to the traditional BP neural network.
This study combined SVM with the BP neural network for workload prediction and then used MATLAB to simulate the prediction of the workload of data center physical machines in a company providing cloud computing services. The results of the simulation experiment demonstrated that the combination of SVM and the BP neural network had a higher prediction accuracy than the individual SVM and BP neural network models in both single-step and multi-step prediction. Compared to the prediction methods in the studies mentioned above, this study combined SVM with the BP neural network using an optimal weighting rule, which effectively compensated for the shortcomings of the two models and improved the prediction accuracy.
Cloud computing load
As shown in Fig. 1, cloud computing services can be divided into four layers: the application layer, the platform layer, the infrastructure layer and the virtualization layer. Virtualization is a decoupling technology that separates the underlying physical devices from the upper operating system and software. It builds and manages a virtual layer through a software or firmware management program to map physical resources into logical virtual resources. Its goal is to maximize the utilization efficiency and flexibility of resources [18]. The workload of cloud computing depends on the task requirements submitted by users, so the change of the cloud load is random. If physical machines are allocated in a fixed way, the allocation is not flexible enough, which can affect the user experience. Work tasks can instead be migrated to an appropriate physical machine according to the change of the user's task load, which improves resource utilization. However, the randomness of the workload makes it difficult for a general load allocation strategy to achieve balance; hence it is necessary to pre-set the allocation strategy by predicting the change of the workload.

Fig. 1. Basic architecture of cloud computing.
A support vector machine (SVM) was used to establish a prediction model for the load generated by cloud computing over a short period of time. The short-time load sequence of cloud computing is obtained by means of a nonlinear mapping function [8]:
The kernel function $k(x_i, x)$ consists of two kernel functions, of which $k_1(x_i, x)$ is a polynomial kernel function and $k_2(x_i, x)$ is a radial basis function (RBF).
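A kernel built from a polynomial part and an RBF part can be sketched as follows. The convex combination with a mixing weight `lam` is an assumption made for illustration, as are the function names and all default parameter values (`degree`, `coef0`, `gamma`); the text only states that the kernel consists of the two components.

```python
import math


def polynomial_kernel(x_i, x, degree=2, coef0=1.0):
    """Polynomial kernel k1: (x_i . x + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x_i, x))
    return (dot + coef0) ** degree


def rbf_kernel(x_i, x, gamma=0.5):
    """RBF kernel k2: exp(-gamma * ||x_i - x||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x))
    return math.exp(-gamma * sq_dist)


def mixed_kernel(x_i, x, lam=0.5, degree=2, coef0=1.0, gamma=0.5):
    """Assumed mixing rule: lam * k1 + (1 - lam) * k2 with 0 <= lam <= 1."""
    k1 = polynomial_kernel(x_i, x, degree, coef0)
    k2 = rbf_kernel(x_i, x, gamma)
    return lam * k1 + (1.0 - lam) * k2
```

With `lam = 1` the sketch reduces to the polynomial kernel and with `lam = 0` to the RBF kernel, so the weight trades the global behavior of the former against the local behavior of the latter.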
Equation (1) shows that the workload of cloud computing is a time series, so the prediction function for the workload at the next moment can be expressed as:
A BP network is a multi-layer feedforward network with unidirectional propagation. It has three or more layers, including an input layer, a middle (hidden) layer and an output layer. Its basic principle [13] is to feed the input data through the forward propagation formula, compare the calculated result with the expected result, and adjust the parameters of the forward propagation formula in the reverse direction according to the error. The formula of forward propagation is:
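The forward pass and the reverse adjustment described above can be illustrated with a small network in plain Python. This is a minimal sketch under stated assumptions: a sigmoid hidden activation, a linear output unit and a squared-error loss are assumed, and the class name `TinyBP`, the random initialization and the learning rate are hypothetical choices. Only the 6-8-1 layer sizes follow the experimental parameters reported later.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyBP:
    """One-hidden-layer network trained by back-propagation (illustrative)."""

    def __init__(self, n_in=6, n_hidden=8, seed=0):
        rng = random.Random(seed)
        # Small random initial weights (an assumption for this sketch).
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Forward propagation: hidden = sigmoid(W1 x + b1), out = w2 . hidden + b2."""
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, output

    def train_step(self, x, target, lr=0.1):
        """One gradient step on the loss 0.5 * (output - target) ** 2."""
        hidden, output = self.forward(x)
        err = output - target
        for j, h in enumerate(hidden):
            # Error propagated back to hidden unit j (uses w2[j] before its update).
            grad_hidden = err * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= lr * err * h
            self.b1[j] -= lr * grad_hidden
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_hidden * xi
        self.b2 -= lr * err
        return 0.5 * err * err
```

Calling `train_step` repeatedly on (input window, next value) pairs drives the output toward the target; in practice the loop would stop once the error falls below a target value.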
In cloud computing services, the task resource requests submitted by different users are different, and so are the times at which users request computing resources. The workload changes of physical machines in cloud computing services are therefore almost random, but users' habits and task requests follow potential laws. In other words, the workload changes of physical machines in cloud computing seem random but have hidden laws. These rules can be mined through the intelligent algorithms above, so as to predict the workload and improve the working efficiency of the physical machine.
There are different types of prediction methods for cloud computing workload. As mentioned above, SVM and the BP network each have their own advantages in predicting cloud computing workload. Both can predict the workload accurately within a certain range, but the real-time changes of cloud computing make the workload unstable, so it is difficult for a single model to achieve stable and accurate prediction in long-term operation. For example, the SVM algorithm in this study has a good prediction effect when the amount of data is small, but as the amount of data increases, the computational difficulty of the model rises sharply, which wastes a great deal of computing resources; the BP algorithm can deal with big data because of its ability to process data in parallel, but it can fall into local extrema during training, a problem that SVM does not have. Therefore, combining the two algorithms makes it possible to give full play to their advantages and improve the prediction performance of the model.
This study combined SVM with the BP model, instead of combining linear regression with BP, because the change of the workload is random but regular and the change law is mostly nonlinear. Another reason is that a major problem in cloud computing workload prediction is that the number of tasks requested by users is random and unevenly distributed over time, which leads to few predictions in some time periods and too many in others. Neither SVM nor BP alone can fully cope with the workload prediction of cloud computing, given their respective advantages and disadvantages; therefore, they are combined [19].
Fig. 2. The workflow of the cloud computing workload prediction model based on SVM and BP.
As shown in Fig. 2, SVM is combined with the BP neural network to predict the workload. The flow is as follows:
(1) the SVM model and the BP model are trained using training samples;
(2) in the process of training, the prediction results of the two models are combined using the optimal weighting rule; the calculation formula of the rule is shown in Equation (7);
(3) the prediction results of the combined model obtained by the optimal weighting rule are compared with the actual values; if the error exceeds the specified range, the two models in the combined model are adjusted reversely according to the error, and the above steps are repeated until the error is within the specified range. The calculation formula of the error is:
(4) after that, the combined model is tested with test samples.
The weight calculation formula of the optimal weight rule [14] is:
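One common way to realize an optimal weighting rule for two forecasts is to choose weights that sum to one and minimize the sum of squared errors of the combined forecast on the training data. The sketch below assumes that form; it is not necessarily the rule of [14], and the function names are hypothetical.

```python
def optimal_weights(actual, pred_a, pred_b):
    """Weights (w_a, w_b) with w_a + w_b = 1 that minimize the sum of
    squared errors of w_a * pred_a + w_b * pred_b (assumed rule)."""
    err_a = [p - y for p, y in zip(pred_a, actual)]
    err_b = [p - y for p, y in zip(pred_b, actual)]
    s_aa = sum(e * e for e in err_a)
    s_bb = sum(e * e for e in err_b)
    s_ab = sum(ea * eb for ea, eb in zip(err_a, err_b))
    denom = s_aa + s_bb - 2.0 * s_ab
    if denom == 0.0:
        # Both models make identical errors; any split gives the same result.
        return 0.5, 0.5
    w_a = (s_bb - s_ab) / denom
    return w_a, 1.0 - w_a


def combine(pred_a, pred_b, w_a, w_b):
    """Weighted combination of two prediction series."""
    return [w_a * a + w_b * b for a, b in zip(pred_a, pred_b)]
```

Under this rule the model with the smaller error receives the larger weight, and errors of opposite sign partly cancel. The unconstrained solution can fall outside the interval [0, 1]; clipping the weights is a further design choice not covered here.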
Experimental environment
The BP neural network model and the SVM algorithm were implemented using the Model Predictive Control Toolbox, the Neural Network Toolbox and Simulink in MATLAB [15]. The experiment was carried out on a laboratory server running Windows 7 with an i7 processor and 16 GB of memory.
Experimental parameters
SVM: the kernel function selected is shown in equation (9); the value of the insensitive loss function parameter ɛ was between $10^{-4}$ and $10^{-1}$ and was set to 0.01 in this study; the penalty parameter was set to 1.
BP neural network: the number of nodes was 6 in the input layer, 8 in the hidden layer and 1 in the output layer, the target error was 0.0001, the learning rate was 1, and the initial weight was 0.
Another prediction model for comparison
In order to further verify the prediction performance of the combined prediction model proposed in this study, it was compared not only with the individual SVM and BP models but also with an RBF model. The RBF model [20] is a radial basis function neural network model, which is also an intelligent algorithm. The relevant parameters of the RBF neural network used in the comparison experiment are as follows: a Gaussian function (which is an RBF) was selected as the kernel function in the hidden layer; the target error was set to 0.0001; the initial weight between the hidden layer and the output layer was set to 0; and the learning rate was set to 0.1.
Experimental data
The workload of physical machines providing virtual machine services in an enterprise that provides cloud computing was sampled. Firstly, ten physical machines were selected randomly; then the workload of each physical machine was sampled every 30 minutes during the peak period of business. The sampling lasted for 7 days, and the samples were taken as training samples. The peak period lasted 8 hours per day, which gives 16 samples per machine per day and therefore 10 × 7 × 16 = 1120 sample points in the training set.
Experimental projects
(1) Short-time single-step prediction
Ten physical machines were sampled as mentioned above. The workload of a physical machine at seven consecutive moments was randomly sampled as a detection sample; the workload at the first six consecutive moments was taken as the input, and the actual value at the seventh moment was used for verifying the prediction accuracy of the model.
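The six-in, one-out sample layout can be illustrated with a sliding window over a workload series. The function name `make_windows` is hypothetical, and the sketch enumerates every window in order rather than sampling windows at random as in the experiment.

```python
def make_windows(series, n_inputs=6, n_targets=1):
    """Split a workload series into (inputs, targets) pairs of
    consecutive moments using a sliding window."""
    samples = []
    span = n_inputs + n_targets
    for start in range(len(series) - span + 1):
        inputs = series[start:start + n_inputs]
        targets = series[start + n_inputs:start + span]
        samples.append((inputs, targets))
    return samples
```

For one machine and one 8-hour peak period sampled every 30 minutes (16 points), `make_windows(series)` yields 10 single-step samples; setting `n_targets=6` yields the twelve-moment layout with 5 samples.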
(2) Long-time multi-step prediction
Ten physical machines were sampled as mentioned above. The workload of a physical machine at twelve consecutive moments was randomly sampled as a detection sample; the workload at the first six consecutive moments was taken as the input, and the workload at the remaining six moments was used for verifying the accuracy of the multi-step prediction.
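A recursive multi-step forecast, in which each predicted value is fed back as an input for the next step, can be sketched as follows. The one-step predictor is passed in as a function; `mean_predictor` is a hypothetical stand-in for a trained model and is not one of the models used in this study.

```python
def recursive_forecast(predict_one, history, n_steps=6, n_inputs=6):
    """Predict n_steps ahead by feeding each prediction back into the
    input window of the one-step model."""
    window = list(history[-n_inputs:])
    forecasts = []
    for _ in range(n_steps):
        y_next = predict_one(window)
        forecasts.append(y_next)
        # Drop the oldest value and append the new prediction.
        window = window[1:] + [y_next]
    return forecasts


def mean_predictor(window):
    """Stand-in one-step model: the mean of the current window."""
    return sum(window) / len(window)
```

Because later steps depend on earlier predictions rather than on measured values, any error made in one step is carried into the following steps.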
Experimental results
The predicted results of the test samples are shown in Fig. 3 and Table 1. The comparison of the predicted values and actual values suggested that all four prediction models could effectively predict the workload of the cloud computing physical machine at the next moment, but it is difficult to judge the advantages and disadvantages of the four models from a direct comparison of the predicted and actual values, whether in the column chart in Fig. 3 or in the data in Table 1. Therefore the four models were compared in terms of error. As shown in Fig. 3, the prediction error of the proposed model was obviously smaller than that of the other three prediction models; the testing errors of the individual SVM and BP neural network models were relatively close, with that of the BP neural network model being slightly smaller. The average testing error of the individual SVM prediction model was 0.781%, that of the individual BP prediction model was 0.759%, that of the RBF model was 0.708%, and that of the proposed model was 0.670%. These findings indicate that the proposed model had the highest prediction accuracy of the four.
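Assuming the average error is a mean absolute percentage error over the test samples (an assumption; the exact error formula may differ), it could be computed as in the sketch below. The function name is hypothetical.

```python
def mean_abs_pct_error(actual, predicted):
    """Mean absolute percentage error, in percent.
    Assumes every actual value is non-zero."""
    errors = [100.0 * abs(p - y) / abs(y) for y, p in zip(actual, predicted)]
    return sum(errors) / len(errors)
```

For example, predictions of 101 and 198 against actual values of 100 and 200 are each off by 1%, so the mean error is 1%.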

Fig. 3. Single-step prediction error of workload in cloud computing.
Table 1. The actual value of cloud computing workload and the predicted values of the four models.
As shown in Fig. 4, the prediction accuracy of the SVM model for the workload of the 10 physical machines was 89.3%, 87.4%, 83.5%, 79.1%, 75.3% and 71.1% in the first to sixth steps, respectively; that of the BP model was 94.6%, 93.1%, 90.3%, 87.4%, 81.3% and 76.6%; that of the RBF model was 96.3%, 95.2%, 93.1%, 90.5%, 85.9% and 82.1%; and that of the combined SVM and BP model was 98.5%, 97.8%, 95.7%, 93.2%, 91.1% and 89.4%. In the multi-step prediction, the prediction accuracy of all four models gradually decreased as the number of prediction steps increased. The reason was that, when the four models predicted the workload of a physical machine over multiple steps, the prediction result of each step was included in the input data of the next step, so the error gradually accumulated and the accuracy fell as the number of steps grew. For the same number of prediction steps, the combined SVM and BP model had the highest prediction accuracy, for the following reasons. When the individual SVM model processed big data, the computational difficulty was high and the prediction performance was poor, so its accuracy was low. The BP model could handle big data, and its accuracy was higher than that of the SVM model, but its error was still large because of local extrema in training. Combining the SVM and BP models reduced the influence of the defects of the two models.

Fig. 4. The multi-step prediction accuracy of the four prediction models.
This paper briefly introduced SVM and BP neural networks and combined them through the optimal weighting rule for the prediction of cloud computing workload. A simulation was then carried out in MATLAB on the workload of physical machines in the data center of a company providing cloud computing services. The results are as follows. (1) The four models could effectively predict the workload at the next moment. The average errors of the combined SVM and BP model, the BP neural network model, the SVM model and the RBF model were 0.670%, 0.759%, 0.781% and 0.708%, respectively. (2) In the multi-step prediction of workload, the prediction accuracy of the four models decreased as the number of prediction steps increased; for the same number of prediction steps, the prediction accuracy of the SVM model was the lowest, and that of the combined SVM and BP model was the highest.
Future research will aim to further improve the accuracy of the prediction model and to reduce the loss of accuracy in multi-step prediction.
