Abstract
Cloud computing offers internet-based services to customers. Infrastructure as a Service (IaaS) offers consumers virtual computing resources, including networking, hardware, and storage. Cloud-hosting startup delays hardware resource allocation by several minutes, and predicting computing demand can address this problem. A further challenge is the need for effective SLA management to prevent SLA breaches and the repercussions of such violations. This paper examines Exponential Smoothing and Artificial Neural Network (ANN) models for managing SLAs from the point of view of both cloud customers and cloud providers. We propose an Exponential Smoothing and Artificial Neural Network model (ESANN) that detects SLA violations and predicts CPU utilization from time series data. The model covers SLA monitoring, energy consumption, CPU utilization, and prediction accuracy. The performance comparison showed that combining these algorithms was the best way to create a dynamic cloud data centre that uses its resources efficiently. Experiments show that the suggested approach helps cloud providers reduce service breaches and penalties. ESANN outperforms Exponential Smoothing, LSTM, RACC-MDT, and ARIMA by 6.28%, 16.2%, 27.33%, and 31.2%, respectively, on the combined performance indicator of Energy SLA Violation, which measures both energy consumption and SLA compliance.
Introduction
As a relatively new kind of distributed computing, cloud computing has recently become more popular. Using virtualization, it provides customers with on-demand access to a variety of physical resources, including processing power, storage space, and bandwidth capacity, along with the scalability and dependability necessary to meet the QoS constraints outlined in the SLA. Every cloud-based programme has infrastructure requirements such as processing power, storage space, and I/O speed, and the IaaS provider makes these available as a service. The term “workload prediction” refers to the process of projecting the future workload of a physical machine by analysing historical workload traces. With accurate workload prediction techniques, the IaaS provider can plan capacity efficiently and so manage the resources of the cloud data centre more efficiently. In order to satisfy service level agreements, service providers often overprovision resources, which lowers total physical resource utilisation [1, 2].
A Google trace provides a real-world dataset that is used to test the effectiveness of the suggested workload prediction method. The experimental results validate the efficiency of the suggested method in handling time series prediction problems in cloud settings. With many algorithms available, choosing one for a forecasting task is a compromise among three factors: solution complexity, prediction accuracy, and data characteristics. A single algorithm, however, is unable to forecast workload accurately when applied to a widely distributed environment. A prediction model is therefore constructed for a large distributed environment to provide interval values for future CPU and energy usage, both in the near term (one step ahead) and in the longer term (multistep ahead), so that both immediate and distant possibilities may be considered. Furthermore, current methods for reshaping workloads in grids or clouds rely only on single-model forecasting [3]. The hybrid approach used here, which is based on statistical analysis, can estimate CPU utilization and energy consumption in the cloud. A job comprises one or more tasks, and each task runs on a single machine. When submitting a job to a cloud provider, customers are expected to state the CPU, energy, and I/O requirements for its completion so that the work can be completed successfully. The person who submits the job may, however, be unaware of the resources it actually requires. As a direct consequence, it is challenging to make accurate projections of the future CPU and energy consumption of the data centre. CPU and energy utilization affect both the work that can be done and the performance of the cloud itself, making them the most essential criteria. The workload pattern submitted to the cloud may be steady, trending, seasonal, busy, or irregular, among many other possible patterns.
For the prediction model to be accurate, each of these patterns has to be represented. An ESANN model is therefore constructed to anticipate the future CPU and energy use of the cloud data centre. The workloads considered make heavy use of both CPU and power. To forecast CPU and energy use, we analyse the task utilization table of the Google trace using statistical techniques. A time series may be modelled by providing a mathematical description of the underlying process, and time series forecasting takes advantage of this characterization. In forecasting, past data are used to find an appropriate model that matches the data, and the output of that model is the set of predictions [4, 5].
The linear components are predicted by the exponential smoothing model, whereas the ANN uses the original data and the residuals obtained from the exponential smoothing model to predict the nonlinear components. Both sets of projections therefore build on the results of the exponential smoothing model. Working with the residual (error) values may make the nonlinear features of the original data more noticeable. The combined results of the two models are then used to predict the load on the CPU and RAM. CPU utilization and SLA violation forecasts from ESANN are combined with prior CPU usage measurements to produce a new time series. Savitzky-Golay filtering is applied to this new time series, yielding a set of values that serve as a basis for forecasting CPU and energy use. Anticipating the future CPU and energy utilization of the cloud workload in this way supports better management of cloud quality of service.
This paper’s primary contributions are:
We propose a model that incorporates exponential smoothing and an ANN for predicting cloud data centre resource needs. The model’s main contribution is its combined linear and nonlinear cloud prediction technique. The framework for user-level resource management and allocation guarantees excellent SLA achievement for the cloud instances. ESANN outperforms Exponential Smoothing, LSTM, RACC-MDT, and ARIMA by 6.28%, 16.2%, 27.33%, and 31.2%, respectively, on the combined performance indicator of Energy SLA Violation, which measures both energy consumption and SLA compliance. We carry out an extensive experimental assessment on publicly accessible data sets from the Google cluster trace for use across various cloud environments.
The remainder of this paper is organized as follows. Section 2 presents the related work on time series models, the linear exponential smoothing model, and the nonlinear ANN model. Section 3 explains the suggested model. Section 4 covers the experimental evaluation, and Section 5 presents the conclusion and future work.
In recent years, companies that offer cloud services have been working to solve the twin challenge of delivering quality of service while also competing for limited resources. Statistical approaches provide a solution to this issue. Accurate forecasting is required for purposes such as sales forecasting, marketing research, financial forecasting, and many more. Time series forecasting is becoming more common in a variety of fields, including the prediction of wind speed, electric load, and crime. Reviews of time series forecasting with ARIMA and ANN models are given in [6, 7, 8, 9].
Wu et al. [10] designed an adaptive hybrid prediction method (AH Model) based on autoregression (AR) to forecast n-step-ahead CPU demand in the computational grid. They also presented a grid-optimized, Savitzky-Golay filter-based adaptive confidence window method. They combined the mean value with past interval data and adaptive parameters to anticipate the future CPU load, and used Mean Square Error (MSE) analysis to establish the past period used in the forecast. Many research groups have attempted to forecast how much CPU time the cloud will use. The amount of work being done in the cloud changes with time and is correlated over longer periods. The CPU load is therefore forecast from historical data on CPU use.
Shaw et al. [11] suggested using simple and double exponential smoothing (DES) prediction techniques in order to reduce the total number of virtual machine (VM) migrations. The algorithm analyses the upcoming load to identify an appropriate destination server for the chosen virtual machine and decides whether or not migration has to be done. The findings demonstrated that the suggested method greatly reduced the total number of migrations and the energy used while still maintaining the SLA. The DES method had the highest effectiveness: the energy used and the number of SLA violations (SLAVs) were lowered by 34.59% and 63.92%, respectively. Therefore, this research aims to compare the DES method with the seasonal exponential smoothing algorithm in situations where the workloads at the CDC may have seasonal patterns. These situations include the intraday, intra-week, intra-month, intra-quarter, and intra-year periods.
Beloglazov et al. [12] offered local regression (LR) and robust local regression (LRR) algorithms as ways to forecast future workloads. These methods first predict a server’s CPU use and then compare that prediction with a dynamic threshold in order to identify servers that are overloaded. Recently, the majority of researchers have begun to base such decisions on a statistical analysis of historical data. This is because whether a migration happens depends on both current and future data trends. As a result, these strategies prevent unneeded migrations from being started whenever there is a significant increase in the amount of work to be done.
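The predict-then-compare idea described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it fits an ordinary least-squares line to a short window of recent CPU utilization values instead of the weighted local regression of [12], and it uses a fixed threshold where the original methods use a dynamic one. The function names and the sample values are hypothetical.

```python
def predict_next(window):
    """Fit y = a + b*x by ordinary least squares and extrapolate one step ahead."""
    n = len(window)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(window) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(window))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # value of the fitted line at the next time step


def is_overloaded(window, threshold):
    """Flag a host whose predicted CPU utilization reaches the threshold."""
    return predict_next(window) >= threshold


# Hypothetical CPU utilization fractions for one host (not from the paper).
recent = [0.5, 0.6, 0.7, 0.8]
print(predict_next(recent), is_overloaded(recent, threshold=0.85))
```

Because the decision uses the predicted value and not the latest sample, a host with a rising trend is flagged before it actually crosses the threshold.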
An adaptive forecasting model for workload and other cloud data centre management procedures was presented by Zharikov et al. [13]. Six forecasting approaches were considered for predicting the state of resources in a cloud data centre: Simple Exponential Smoothing (SES), Holt’s Linear Trend (HOLT), Holt’s Damped Trend (DHOLT), Auto-Regressive Integrated Moving Average (ARIMA), Linear Regression with Trend (LR), and TBATS. In addition, the authors analysed 77 data windows ranging in size from 8 to 66 measurements in order to determine which combination of approach and data window produces the best results. A real-world dataset was gathered over the course of one month, with a measurement interval of five minutes, for the purpose of constructing and evaluating the suggested models. In our view, the study’s two most significant limitations are that it produces only a single projected value (a one-term forecast) and that predicted values may fall outside the range [0%–100%].
Shaw et al. [14] proposed an ensemble technique for VM workload in a public cloud. Their methodology was developed with the goal of ensuring SLA compliance while also driving power efficiency at the host level. This is accomplished through their Predictive Interference and Energy Aware (PIEA) algorithm, which profiles the workload of each VM present in the environment. Their ensemble algorithm maintains the balance between the resources that are used and the resources that are over-committed by keeping track of the resources that are available on the physical servers. According to Rahmanian et al. [15], this method collects information from the host server and interfaces with it in order to ascertain the workload and classify it accordingly. An ensemble of SVM, ANN, and Logistic Regression models is used in this method. This method ensures that both success criteria are satisfied, resulting in a reduction of SLA violations by over 70 percent and a saving of over 30 percent in energy usage.
Wang et al. [16] proposed an additional method for putting an ensemble-based algorithm into practice. In that study, the authors discuss their approach to profiling complex system architectures, using the Accuracy Based Error Pruning algorithm. All relevant information was stored in a single repository and retrieved to simulate CPU consumption. Decision Trees, K Nearest Neighbours, and SVMs are employed. By constructing a framework, eliminating non-essential information, and predicting the configuration, they improved the accuracy of their predictions in comparison with regression-based methods. A subsequent study on application profiling, Guo et al. [17], looks at a strategy for assessing application workload using the PARIS framework, creating a High-Performance Computing environment and collecting measurements in the laboratory for use in machine learning models. Compared with the conventional approach of using fault injection to forecast workload needs, which may take a considerable amount of time, a machine learning model is both more accurate and less time-consuming. The authors state that “Application resilience is inherently a regression concern.” The primary objective of this effort is to reduce the amount of silent data corruption that occurs in HPC clusters. An ensemble technique (MLPR and gradient boosting) is used to assess the effectiveness of their prediction mechanism.
Yan et al. [18] suggested the generalized regression neural network (GRNN), a novel kind of artificial neural network. It is a specific kind of radial basis function (RBF) network. The GRNN requires just one design parameter to be determined, the spread factor. The spread factor sets the RBF’s coverage, and hence the extent to which the training samples contribute to the final result. To take advantage of the strengths of both models, Zhang et al. [19] suggested a hybrid approach to forecasting in which an ARIMA model and an ANN are combined. The linear part of a time series is represented by the ARIMA model, whereas the nonlinear part is represented by an ANN fitted to the ARIMA residuals. Similar to boosting, a popular ensemble strategy for reducing prediction bias, this approach uses several models to improve accuracy. The idea was motivated by the observation that a combination of models almost always performs better than any one model operating in isolation.
Qin et al. [20] suggested a dual-stage attention mechanism known as DA-RNN for the purpose of solving multivariate forecasting issues. In the first stage of attention, known as the input attention, different driving series are assigned different weights based on their importance to the prediction at each time step. This takes place during the processing of sensory data. This process is carried out before the input is sent on to the S2S network’s encoder. The encoder receives a set of values at each time step that come from a variety of external driving series, the combined influence of which has been amplified or dampened as necessary. The weights that are assigned to the values of each driving series are generated by another MLP that is trained in tandem with the S2S network. In the second stage of attention, also known as temporal attention, a standard Bahdanau-style attention scheme is applied. For each prediction iteration, this method is utilized to assign relative importance to the encoder outputs. The model was validated using two independent data sets. Comprehensive comparisons and analyses were performed between the ARIMA model, the NARX RNN model, the encoder-decoder model, the attention RNN model, an input attention RNN model, and the DA-RNN model. In both datasets, the DA-RNN model performed much better than any of the other models [21, 22, 23].
Zoumi et al. [24] introduced a novel technique designed to facilitate load balancing inside Cloud infrastructures used in 5G-VCC systems. The MACO method allocates a pheromone value, also known as a weight, to each virtual machine (VM). The VM for each service request is then selected as the one with the most pheromone, using a mechanism akin to the decision-making process used by ants to choose the best pathways. The pheromone value of a VM is decreased when it is selected, taking into account the workload of the associated service. Michailidis et al. [25] proposed a 3-D geometrical representation of the MEC-enabled network and an optimization method to minimize the WTEC of vehicles and ARSU subject to transmit power allocation, task allocation, and time slot scheduling for both computation and communication. An approach to network slicing for 5G networks that prioritizes the efficiency of cutting-edge automotive services is presented in [26]. Table 1 lists research on various methods used for predicting time series data in cloud data centres.
Various methods used for predicting time series datasets
Time series forecasting models
The primary objective of this study is to locate and evaluate a statistical model that is capable of making accurate predictions. A time series is a sequence of data points observed at successive points in time. Its frequency indicates how often the observations are recorded. A time series is univariate if it contains records of a single variable, and multivariate if it contains records of two or more variables. Time series may be either continuous or discrete. In continuous time series, observations are recorded at every instant in time, whereas in discrete time series they are recorded at regular intervals. A time series may be broken down into four parts: trends, periods, seasons, and outliers, and the observed data can be cleaned of these. When analysing time series data, it is important to examine these factors.
Following these tests, the best three models are chosen, and further tests are then conducted against the residual data. The final choice is the model with the lowest RMSE score in Eq. (1). The mean absolute percentage error of that model is then reported as the measure of its accuracy. A basic seasonal exponential smoothing model is used as the baseline for comparing time-series data sets. The results section compares each model’s RMSE and MAPE values.
The Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) are used to assess prediction errors and accuracy. MAPE measures the absolute disparity between the predicted and actual values as a percentage of the actual value.
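As a minimal sketch of how the two error metrics can be computed, the following Python functions use the standard textbook definitions of RMSE and MAPE. The function names and the sample values are illustrative assumptions and are not taken from the experiments; MAPE is returned in percent here, and dividing by 100 gives the fractional form.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical CPU utilization values in percent (illustration only).
actual = [50.0, 60.0, 80.0]
predicted = [45.0, 66.0, 80.0]
print(rmse(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large individual errors more heavily, while MAPE is scale-free, which is why the two are usually reported together.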
In Eqs (3) and (4), the observation is denoted by
The level, trend, and seasonal smoothing coefficient (
Exponential smoothing was first used more than sixty years ago in research on stock exchanges. The method was developed with the seasonal trend in time-series forecasting in mind. Its most important advantage is that it requires less memory during the training phase [35]. The following equations, which measure the level, trend, and seasonality factors, are the foundation of the method. Note that the weighted aggregate of the level, trend, and seasonality is computed using Eqs (4) and (5). The approach is called exponential smoothing because the weight assigned to each demand observation decays exponentially with its age. To demonstrate this, we begin by looking at the exponential smoothing model once again.
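The exponentially decaying weights and the level, trend, and seasonality updates can be illustrated with a short Python sketch. It assumes the classic textbook forms of simple exponential smoothing and of the additive Holt-Winters recursion; the paper's own equations may differ in detail, and the function names and the parameters `alpha`, `beta`, and `gamma` are chosen here for illustration.

```python
def ses_level(series, alpha):
    """Simple exponential smoothing: return the final smoothed level."""
    level = series[0]  # initialise with the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def ses_weights(alpha, n):
    """Weight given to the observation k steps in the past, for k = 0..n-1."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


def holt_winters_step(y, level, trend, season, alpha, beta, gamma):
    """One additive Holt-Winters update for a new observation y.

    `season` is the seasonal index from one season length ago.
    """
    new_level = alpha * (y - season) + (1 - alpha) * (level + trend)
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    new_season = gamma * (y - new_level) + (1 - gamma) * season
    return new_level, new_trend, new_season


print(ses_weights(0.5, 4))           # each weight is half the previous one
print(ses_level([1.0, 2.0, 3.0], 0.5))
```

With `alpha = 0.5` the most recent observation carries weight 0.5, the one before it 0.25, and so on, which is the geometric decay that gives the method its name.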
Here, the weight assigned to
The structure of a neuron in the human brain is extremely comparable to the structure of an ANN neuron. ANNs are used to approximate the many nonlinearities that are present in the data. The connection between the output (ý
In Eq. (9), the model parameters
Because of their nonlinear and continuously differentiable nature, logistic functions are used for the modelling of temporal data. This functionality is considered to be an advantageous quality for network model. Therefore, ANN model carries out the exponential smoothing functional mapping between the previous observations and t by using Eq. (11); the function
Exponential Smoothing and ANN are combined in the hybrid model, and within that framework an estimated confidence interval is used to provide accurate prediction outcomes. The model projects the workload n steps into the future from the present period. The time series given as input to the hybrid model represented by Eq. (12) is a function of both linear and nonlinear components.
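The decomposition into a linear and a nonlinear part can be sketched as follows. This is a simplified illustration and not the authors' implementation: simple exponential smoothing stands in for the linear model, a small one-hidden-layer network with logistic hidden units is trained on lagged residuals only, and the two forecasts are added. All names, the network size, the learning rate, and the synthetic series are assumptions made for this example.

```python
import math
import random


def ses_one_step(series, alpha):
    """One-step-ahead forecasts: element t is the forecast made for series[t]."""
    level = series[0]
    forecasts = [level]
    for y in series[:-1]:
        level = alpha * y + (1 - alpha) * level
        forecasts.append(level)
    return forecasts


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_residual_net(inputs, targets, hidden=4, epochs=300, lr=0.05, seed=0):
    """Train a tiny network (logistic hidden layer, linear output) with SGD."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        return h, sum(w * hj for w, hj in zip(w2, h)) + b2

    for _ in range(epochs):
        for x, target in zip(inputs, targets):
            h, out = forward(x)
            err = out - target  # gradient of 0.5 * squared error w.r.t. output
            for j in range(hidden):
                grad_hidden = err * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_hidden
                for i in range(n_in):
                    w1[j][i] -= lr * grad_hidden * x[i]
            b2 -= lr * err
    return lambda x: forward(x)[1]


def hybrid_forecast(series, alpha=0.5, lags=3):
    """Next-step forecast = linear (smoothing) part + nonlinear (network) part."""
    linear = ses_one_step(series, alpha)
    residuals = [y - f for y, f in zip(series, linear)]
    inputs = [residuals[t - lags:t] for t in range(lags, len(residuals))]
    targets = residuals[lags:]
    net = train_residual_net(inputs, targets)
    next_linear = alpha * series[-1] + (1 - alpha) * linear[-1]
    next_nonlinear = net(residuals[-lags:])
    return next_linear + next_nonlinear


# Synthetic CPU utilization fractions (illustration only, not trace data).
cpu = [0.30 + 0.10 * math.sin(t / 3) + 0.002 * t for t in range(40)]
print(hybrid_forecast(cpu))
```

The network only has to model what the smoothing step leaves unexplained, which is the motivation for fitting it to residuals.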
The linear components are denoted by the
In Eq. (13), the value
The order of the model may be specified by
In Eq. (17) are calculated using the neural network;
Figure 1 shows the system model of the Exponential Smoothing and ANN model (ESANN), which is used to predict a confidence interval for future CPU and memory use. Data gathered from Google clusters form the basis of the research. During data preprocessing, missing values are removed and the information is turned into a time series. The time series is then fed into an Exponential Smoothing model, which makes predictions about future CPU and memory use based on historical trends. The Dickey-Fuller test is used to determine whether the original CPU and memory consumption time series is stationary. Some nonlinear components, called residuals, are beyond the scope of the Exponential Smoothing model. The original data sequence is therefore supplemented with these residuals and fed into the ANN model, which forecasts the CPU and memory load. The values predicted by the two models are added together to obtain the final forecast values. These values are then combined with the previous history to create a new time series. In this history, information from more recent times is likely to be more valuable than data from much further back. The Savitzky-Golay filter [37] is then applied to this new data sequence in an effort to correct any erroneous values [36]. We use the smoothed data to create a confidence interval for an n-step-ahead estimate of CPU and energy consumption. A strategy for minimising the overall energy consumption is suggested, together with a method for estimating the prediction accuracy and CPU utilisation rates, as set out in Algorithm 1.
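The smoothing and interval step can be sketched in plain Python. The filter below is the well-known five-point quadratic Savitzky-Golay smoother with coefficients (-3, 12, 17, 12, -3)/35. The window length, the handling of the end points, and the way the interval is formed (point forecast plus or minus z times the standard deviation of the difference between raw and smoothed values) are assumptions made for this illustration; the text does not specify them.

```python
import statistics

# Five-point quadratic Savitzky-Golay smoothing coefficients.
SG_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG_NORM = 35.0


def savgol5(series):
    """Smooth interior points; the two points at each end are left unchanged."""
    smoothed = [float(v) for v in series]
    for t in range(2, len(series) - 2):
        window = series[t - 2:t + 3]
        smoothed[t] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


def interval(point_forecast, raw, smoothed, z=1.96):
    """Illustrative band around a forecast from the raw-vs-smoothed spread."""
    spread = statistics.pstdev([r - s for r, s in zip(raw, smoothed)])
    return point_forecast - z * spread, point_forecast + z * spread


# Hypothetical noisy utilization values (illustration only).
raw = [0.40, 0.44, 0.39, 0.47, 0.45, 0.52, 0.48, 0.55, 0.53, 0.58]
smooth = savgol5(raw)
print(interval(0.60, raw, smooth))
```

Where SciPy is available, `scipy.signal.savgol_filter(raw, window_length=5, polyorder=2)` applies the same kind of filter and also handles the end points.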
Working model of ESANN.
Experimental cases
ESANN parameters
Comparing the consumption of energy.
The proposed approach is evaluated on a dataset released by Google in 2019. This release incorporates data from a single cluster’s worth of processing cells over twenty-nine days. Google provides a schema that defines the data set [40]. The data set contains workload traces, records of resource demands, and records of actual resource usage for every job and task run during the twenty-nine days. The workload traces from Google’s compute cluster cover several different types of activities [38, 39]. The trace lists the CPU, memory, and disk I/O time requested and used by each activity. A timestamp attribute, recorded as a 64-bit integer, is associated with every record in the collection; timestamps are given in microseconds. Each task and machine has its own unique 64-bit identifier. A time series of CPU and memory utilization values is generated from the Google cloud trace for this research. Different types of time series data are shown in Fig. 2. A break in the time series data corresponds to the transition between the training and testing phases. The models were fitted using the training set and then validated using the testing set. Python code was used to identify the optimal model order for the ARIMA forecasts, and Python is also used for the detailed time series modelling. The n-step prediction used 100 data points, and the ANN and Exponential Averaging models for it were developed in Python.
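The step from raw trace records to a regular time series can be sketched as follows. This is a hedged illustration: the record layout (a microsecond timestamp paired with a CPU usage value), the ten-minute bucket, the 80/20 split, and the function names are assumptions for this example and do not reproduce the actual trace schema.

```python
from collections import defaultdict

BUCKET_US = 10 * 60 * 1_000_000  # ten minutes expressed in microseconds


def to_time_series(records, bucket_us=BUCKET_US):
    """Average CPU usage per fixed-width time bucket, skipping missing values.

    `records` is an iterable of (timestamp_us, cpu_usage) pairs.
    Returns a list of (bucket_start_us, mean_cpu_usage) sorted by time.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp_us, cpu in records:
        if cpu is None:  # drop records with a missing measurement
            continue
        bucket = timestamp_us // bucket_us
        sums[bucket] += cpu
        counts[bucket] += 1
    return [(b * bucket_us, sums[b] / counts[b]) for b in sorted(sums)]


def train_test_split(series, train_fraction=0.8):
    """Chronological split: earlier points for fitting, later ones for testing."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Hypothetical records (illustration only).
records = [(0, 0.2), (300_000_000, 0.4), (600_000_000, None), (900_000_000, 0.6)]
series = to_time_series(records)
print(series)
print(train_test_split(series, 0.5))
```

The split is chronological, not random, so that the model is always tested on data that come after the data it was fitted on.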
We investigated and experimented with workload traces from a Google compute cluster, which record the CPU, memory, and disk requirements of each job. Reports on the use of all available resources are produced every 10 minutes. Four workloads are used to test the proposed method: Workload1, with 50 virtual machines and 50 provisioning managers; Workload2, with 100 of each; Workload3, with 150 of each; and Workload4, with 200 of each. We focused mostly on CPU and power use, but our method may be extended to incorporate other resources, such as storage space. ESANN is compared with four state-of-the-art methods: Exponential Smoothing, LSTM, RACC-MDT, and ARIMA. Table 2 shows the workloads, and the parameters used in this experiment are detailed in Table 3.
Energy consumption analysis
ESANN consolidates migrating VMs onto as few hosts as possible. Active servers therefore use more CPU while idle hosts rest. Fig. 2 compares the energy consumption of the methods. Compared with Exponential Smoothing, LSTM, RACC-MDT, and ARIMA, the proposed approach saves 11.26%, 19.31%, 26.5%, and 32.14% of energy, respectively. The ESANN technique uses power efficiently as long as it prevents SLA breaches.
Comparison analysis of prediction results in 10 mins interval
Comparison analysis of average CPU Utilization.
Accuracy prediction of different methods
Comparing the SLA violations.
Comparison analysis of energy SLA violation.
Figure 3 shows the average CPU consumption of Server1 and Server2. FFD server utilization is low. Data are gathered once per second, allowing us to calculate the CPU load in real time. ESANN places the highest CPU demand on Server1, whereas Exponential Smoothing, LSTM, RACC-MDT, and ARIMA place a lower demand on Server2. The data show peak and off-peak CPU usage, which meets our modelling needs. ESANN can consolidate high-demand virtual machines, since its average memory utilization is close to 100%.
Comparison analysis prediction results.
Accuracy prediction comparison of different methods.
A metric that is independent of processing load is needed to quantify the SLA experienced by a virtual machine’s end user. SLA Violation is a comprehensive metric with which the cloud service provider can measure this in terms of CPU utilization. ESANN outperforms existing algorithms such as Exponential Smoothing, LSTM, RACC-MDT, and ARIMA. The goal of this research is to reduce energy consumption by consolidating onto a small number of servers with as few SLA violations as possible. ESANN averages 7.2%, 9.1%, 15.4%, and 21.2% fewer SLA breaches than Exponential Smoothing, LSTM, RACC-MDT, and ARIMA, respectively.
Energy SLA violations
To save operating costs, cloud service providers consolidate virtual machines onto fewer servers. Consumers focus on service performance, which must stay consistent throughout consolidation. Cloud providers want to reduce energy usage without breaching service level agreements. The performance measure of the service is therefore the trade-off between energy utilization and SLA violations. ESANN reduces energy use without violating Service Level Agreements. In Fig. 5, ESANN lowers the energy SLA violation more than Exponential Smoothing, LSTM, RACC-MDT, and ARIMA. Compared with Exponential Smoothing, LSTM, RACC-MDT, and ARIMA, ESANN reduces the combined data centre energy utilization and SLA violations by 6.28%, 16.2%, 27.33%, and 31.2%, respectively.
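The text does not state the formula behind the combined Energy SLA Violation indicator. A definition commonly used in the VM-consolidation literature is the product of total energy consumption and the SLA violation metric, and the sketch below assumes that form. It also shows how a relative reduction such as the percentages above can be computed. The function names and all numbers are hypothetical and are not results from the experiments.

```python
def energy_sla_violation(energy_kwh, sla_violation):
    """Combined metric, assumed here to be energy multiplied by SLA violation."""
    return energy_kwh * sla_violation


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - proposed) / baseline


# Made-up values for two methods (illustration only).
baseline_esv = energy_sla_violation(energy_kwh=150.0, sla_violation=0.0020)
proposed_esv = energy_sla_violation(energy_kwh=140.0, sla_violation=0.0018)
print(relative_reduction(baseline_esv, proposed_esv))
```

A product form rewards a method only if it lowers one quantity without raising the other by a larger factor, which matches the trade-off described above.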
We take the Quality of Service (QoS) values for that SLA from a prior time period and use them to forecast the QoS values for 10:00 AM to 11:30 AM. This allows us to assess the accuracy of the prediction algorithms. The time series data sets from the preceding days were used to train the ESANN algorithm. Table 4 and Fig. 6 compare the actual and predicted levels of quality of service. The forecast results are shown at intervals of ten minutes, in milliseconds (ms).
Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) are the two metrics used to determine the accuracy of each technique. Table 5 and Fig. 7 show the overall prediction accuracy achieved by the various approaches.
Conclusion
Accurate forecasting of future workload is required to enhance quality of service and boost resource utilization in a cloud environment. This study builds a multi-step-ahead CPU and energy use prediction model using the Exponential Smoothing and ANN approach. Testing showed that this idea works well in the fast-changing setting of cloud computing. In this method, the Savitzky-Golay filter is combined with Exponential Smoothing and the ANN model. The accuracy of the model’s forecasts is validated using publicly accessible Google trace data. In this article, we have provided a concise explanation of the ESANN framework and concentrated on its post-interaction phase module. This module is in charge of predicting the quality of service (QoS), detecting potential SLA violations, and advising the most appropriate course of action to avoid them. In light of these results, the main goal of this article has been accomplished. The system model we have developed, based on an ESANN approach, has demonstrated its efficacy for workload prediction and SLA performance in cloud environments. The demonstrated accuracy of the ESANN model in predicting workloads paves the way for improved resource allocation, proactive scaling, and efficient SLA compliance. The proposed ESANN method achieves an RMSE of 0.874865174 and a MAPE of 0.149580211, which are lower and hence better than those of state-of-the-art methods including Exponential Smoothing, LSTM, RACC-MDT, and ARIMA.
Furthermore, our work extends beyond theoretical innovation, offering practical benefits to cloud service providers and users alike. By enhancing SLA performance through more accurate workload prediction, cloud providers can bolster their reputation, reduce penalties, and increase customer satisfaction. At the same time, cloud users can expect more reliable and consistent service, aligning their operational objectives with provider commitments. We also showed ESANN’s performance when constrained by a variety of SLA requirements. ESANN guarantees SLA compliance with an 80% speed improvement over existing methods such as Exponential Smoothing, LSTM, RACC-MDT, and ARIMA. ESANN also predicts CPU efficiency with over 90% accuracy and response time with 18.5% MAPE. In future work, improving the accuracy of the prediction will require dynamically adjusting the size of the sliding window over the recent history being evaluated, as well as dynamically assigning weights to the data points in the recent past.
