Abstract
Considering limited computing resources, resource over-reservation, and virtual machine stability, this paper proposes an adaptive dynamic management method for virtual machine resources in cloud/fog computing systems. The task processing of cloud/fog computing is modeled under the resource constraints of a cloud or fog computing node. Based on the tradeoff between resource utilization efficiency and system stability, virtual machine resources are scheduled elastically to realize on-demand use of cloud/fog computing resources. The experimental results show that the proposed method can effectively improve the resource efficiency of cloud/fog computing and guarantee the quality of service under a dynamic task workload.
Introduction
In recent years, with the rapid development of the mobile Internet, the number of Internet applications has increased rapidly, placing new demands on computing resources. However, the traditional computing mode provisions a fixed quantity of computing resources. When the network load changes dynamically, this easily leads to overprovisioning or underprovisioning of computing resources. The emergence of cloud computing transforms traditional computing resources into an on-demand service model. Through software virtualization technology, hardware resources such as CPU, memory and hard disk space can be abstracted as virtual computing resources that can be adjusted and configured on site, and the amount of computing resources can be adjusted dynamically according to the actual load demand. Moreover, the recent emergence of fog computing extends cloud computing resources closer to the user edge, which enables rapid response to user services and computing processing [1, 2]. In such systems, the elastic scheduling and adaptive allocation of computing resources directly determine the utilization efficiency of virtual machine resources, but they are also constrained by the limited resources of a single computing node and by the differences in resources among computing nodes. How to reasonably schedule virtual machine resources on demand directly affects the stability of the system and the efficiency of hardware resources.
However, in an actual cloud computing system, improving the efficiency of virtual machine resources requires more than optimizing their on-demand use and allocation. The utilization efficiency of the hardware computing resources and the stability of system operation, both of which are determined by where virtual machines are scheduled, must also be considered together. Specifically, the on-demand scheduling of virtual machines in a cloud computing system is limited by the physical computing resources of a single computing node. If virtual machines are concentrated on one computing node, the utilization efficiency of hardware resources is maximized, but the computing resources of that node are oversold to the virtual machines [3], which reduces the stability of the system; and vice versa. Although existing research work optimizes the on-demand allocation and elastic scheduling of cloud computing virtual machine resources, it does not comprehensively consider the impact of the virtual machine scheduling location on hardware resource utilization efficiency and system stability. Therefore, this paper studies the scheduling optimization of virtual machine resources in the cloud computing system, and models the virtual machine task processing based on the hardware resource constraints of the computing nodes. On the premise that task service quality is assured, the required virtual machine computing resources are allocated and scheduled on demand based on the balance between hardware resource utilization and system stability.
Related works
Currently, there have been many relevant studies on virtual machine resource scheduling for cloud/fog computing systems. A variety of resource scheduling schemes are comprehensively analyzed and compared in the literature. These schemes can be classified into load-prediction, resource-aware and SLA-aware approaches. In [5], an elastic scheduling method for container computing resources in cloud computing systems is proposed, which realizes the on-demand allocation of container computing resources based on load prediction results. In [6], based on the automatic scalability of public cloud HPC cluster computing resources, an automatic scalability model for HPC idle-time computing is introduced that exploits the elastic and abundant resources of the public cloud. In [7], an adaptive management method for cloud computing resources based on container technology is proposed, and a resource architecture better suited to containers, together with a scheduling method between resources, is designed to improve the rationality of container resource scheduling in the cloud computing system. In [8], an automatic scaling mechanism for cloud computing resources is proposed based on Kubernetes, which combines responsive expansion and elastic scaling tolerance to ensure the reliability of the system and improve the application load capacity. In [9], an application-oriented elastic scaling algorithm based on a neural network is presented, which can predict the workload and response time of cloud computing applications and give appropriate resource scheduling strategies. In addition, a prediction-based elastic scaling strategy for cloud computing is proposed in [10], which also presents the performance of three prediction models. A container-based elastic scheduling strategy is proposed in [11]. This strategy can achieve fine-grained resource scheduling and elastic scaling based on load status, and improve the service response capability and resource utilization efficiency of the cloud computing system.
Analysis and modelling
Computing node model with limited hardware resources
A cloud computing system is generally a large-scale cluster with a large number of computing nodes. Therefore, a cloud computing system can be considered to have a relatively unlimited capacity of hardware computing resources. In reality, however, the limited hardware computing resources of individual cloud computing nodes, and especially of fog computing nodes, affect virtual machine scheduling. The most obvious impact is on the utilization efficiency and computing performance of hardware resources, as shown in Fig. 1.
Virtual machine structure corresponding to computing nodes with limited hardware computing resources.
The computing resources of the virtual machine are mainly composed of CPU and memory. Here,
1) Computing resource sharing and overselling issue
When
When
where
Similarly, when
When
Equation (4) indicates that the total memory space configured by the virtual machine on the computing node
2) Hardware resource use efficiency and stability issues of the system
As shown in Fig. 2, when the virtual machine scheduling strategy adopts, for example, load balancing, the virtual machines are evenly scheduled to each computing node in the system. This reduces the degree of overselling of system hardware computing resources, that is, it reduces the number of virtual machines on a single computing node, which in turn reduces the task load on the system's hardware computing resources. It also reduces the instability of virtual machine operation caused by the failure of a single computing node. However, it lowers the efficiency with which the hardware resources of the computing nodes are used; and vice versa.
Define the use efficiency evaluation index of the system hardware computing resources as follows:
Similarly, define the virtual machine stability evaluation index as follows:
Where
Generally, a computing system that receives tasks can be regarded as a queuing system. That is, when a task from the network enters the computing system, it first enters the memory buffer to queue, and is then handled by the CPU in the order in which tasks entered the system. When the waiting time of a task exceeds the delay required for task QoS assurance, the task is lost [13, 14]. In this paper, the typical virtual machine configuration is taken to be one CPU per virtual machine, that is,
Queuing model of computing system task processing.
This paper takes a typical telecommunication service task flow model as an example: task arrivals follow a homogeneous Poisson process, the processing time of a task on the CPU follows a negative exponential distribution, and tasks entering the virtual machine are served in first come, first served (FCFS) order. According to queuing theory, the task processing of a virtual machine is therefore described as an M/M/1/K model. Suppose the task can be described as
Assuming that the memory size of the virtual machine is
Within the unit time
According to [4, 5], the idle probability of virtual machine CPU is:
The rejection rate of tasks, that is, the average number of rejected tasks is:
The intensity of tasks actually entering the virtual machine, that is, the average number of tasks actually admitted, is:
The average length of task queues within virtual machine is:
The average response time for the task is:
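As an illustration, the five quantities listed above can be computed with the standard textbook results for an M/M/1/K queue. The following is a minimal sketch, not the paper's own derivation: the names `lam` (task arrival rate), `mu` (CPU service rate) and `K` (maximum number of tasks the virtual machine can hold, including the one in service) are assumptions of this sketch and may not match the paper's notation. The "average length" is taken here as the mean number of tasks in the virtual machine, queued plus in service, which is also an assumption.

```python
def mm1k_metrics(lam, mu, K):
    """Steady-state metrics of an M/M/1/K queue (textbook formulas).

    lam: task arrival rate, mu: service rate, K: system capacity
    (tasks waiting plus the one in service). All names are hypothetical.
    """
    if lam <= 0 or mu <= 0 or K < 1:
        raise ValueError("lam and mu must be positive and K must be >= 1")
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        # Degenerate case rho == 1: all K + 1 states are equally likely.
        p_idle = 1.0 / (K + 1)
        p_block = p_idle
        mean_tasks = K / 2.0
    else:
        p_idle = (1 - rho) / (1 - rho ** (K + 1))
        p_block = rho ** K * p_idle          # probability the system is full
        mean_tasks = (rho / (1 - rho)
                      - (K + 1) * rho ** (K + 1) / (1 - rho ** (K + 1)))
    lam_eff = lam * (1 - p_block)            # tasks actually admitted per unit time
    return {
        "p_idle": p_idle,
        "rejected_rate": lam * p_block,      # tasks rejected per unit time
        "effective_arrival_rate": lam_eff,
        "mean_tasks": mean_tasks,
        "mean_response_time": mean_tasks / lam_eff,  # Little's law
    }


if __name__ == "__main__":
    for name, value in mm1k_metrics(lam=8.0, mu=10.0, K=5).items():
        print(f"{name}: {value:.4f}")
```

The mean response time follows from Little's law applied to the admitted tasks only, so rejected tasks do not contribute to the delay.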
The virtual machine scheduling and task allocation structure in the cloud/fog computing system is shown in Fig. 3.
Virtual machine scheduling and task allocation structure of cloud/fog computing system.
As shown in Fig. 3, when a task arrives at the cloud/fog computing system, it is first allocated by the task load balancing service in the system, that is, the task is allocated to a virtual machine of the system for processing according to the corresponding strategy (for example, the round robin strategy [15]). At the same time, the management platform in the system analyzes the task load information collected by the load balancing service, obtains the corresponding virtual machine scheduling decision according to the required optimization goal, and finally completes the elastic scheduling of the virtual machines in the system according to that decision.
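The round robin allocation mentioned above can be sketched as follows. This is only an illustrative example: the function name `round_robin_dispatch` and the task and virtual machine identifiers are hypothetical, and the load balancing service in the paper may collect additional load information that is not modeled here.

```python
from collections import Counter
from itertools import cycle


def round_robin_dispatch(task_ids, vm_ids):
    """Assign each task to the next virtual machine in a fixed cyclic order."""
    if not vm_ids:
        raise ValueError("at least one virtual machine is required")
    vm_cycle = cycle(vm_ids)
    return {task: next(vm_cycle) for task in task_ids}


if __name__ == "__main__":
    assignment = round_robin_dispatch(range(7), ["vm0", "vm1", "vm2"])
    # Per-VM task counts are the kind of load information that a
    # management platform could use for its scheduling decision.
    print(Counter(assignment.values()))
```

With round robin, the task load of any two virtual machines differs by at most one task, which is why the per-VM arrival intensity can be approximated as the total intensity divided by the number of virtual machines.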
On the basis of requirements of computing resource utilization, system stability and task service quality assurance, the optimization objectives of this paper are described as follows:
1) Task service quality assurance
The primary goal of the cloud/fog computing system is to ensure the service quality of the task. The processing delay of the task is the main indicator of its service quality. The delay refers to the time from when the task enters the computing system to the completion of its computation. The task's service quality assurance goal is expressed as follows:
2) Optimization of hardware computing resource utilization efficiency
As mentioned above, starting as many virtual machines on as few computing nodes as possible can improve the efficiency of hardware computing resources. Therefore, the second optimization goal of this paper is to increase the value of Eq. (12) as much as possible, denoted as:
However, a larger value of this indicator corresponds to a greater degree of overselling of hardware computing resources, which reduces the task processing performance of the virtual machines.
3) Guarantee of system stability
The stability of system operation is also one of the main factors affecting the quality of service of the task. Another optimization objective of this paper is to reduce the value of Eq. (13) as far as possible, which means starting the virtual machines on nodes that are as scattered as possible, to avoid the simultaneous failure of many virtual machines caused by the failure of a single computing node. Therefore, the system stability guarantee optimization problem can be described as the system stability not higher than the required system failure probability (
In summary, there is a contradiction between the use efficiency of hardware computing resources and system stability. Therefore, the two optimization goals can only be balanced on demand according to actual need.
To address the above optimization objectives, this paper proposes the Resource Utility and System Stability Tradeoff Virtual Machine Scheduling based on QoS (RU-2S QoS) algorithm.
Algorithm 1 is divided into two parts, the expansion and the contraction of virtual machine resources, that is, it schedules the virtual machines on demand according to the current task load. First, the task load distribution of each virtual machine in the system is obtained according to the task load balancing strategy. Then, the required number of virtual machines is obtained according to the required task service quality guarantee. Finally, according to the required computing resource usage efficiency and system stability thresholds, virtual machines are started or shut down on the appropriate computing nodes.
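The three steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the RU-2S QoS algorithm itself: it assumes round robin splits the total arrival intensity evenly across virtual machines, uses the textbook M/M/1/K mean response time as the only QoS criterion (task rejection is ignored), and replaces the efficiency and stability thresholds with a single hypothetical cap `max_vms_per_node`. All function and parameter names are introduced here for illustration.

```python
def mean_response_time(lam, mu, K):
    """Mean response time of admitted tasks in an M/M/1/K queue."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        p_block, mean_tasks = 1.0 / (K + 1), K / 2.0
    else:
        p_idle = (1 - rho) / (1 - rho ** (K + 1))
        p_block = rho ** K * p_idle
        mean_tasks = (rho / (1 - rho)
                      - (K + 1) * rho ** (K + 1) / (1 - rho ** (K + 1)))
    return mean_tasks / (lam * (1 - p_block))


def required_vms(total_lam, mu, K, max_delay, max_vms=1000):
    """Smallest VM count whose per-VM mean response time meets max_delay."""
    for n in range(1, max_vms + 1):
        if mean_response_time(total_lam / n, mu, K) <= max_delay:
            return n
    raise ValueError("delay bound cannot be met within max_vms")


def place_vms(n_vms, n_nodes, max_vms_per_node):
    """Fill nodes one by one up to the cap, so few nodes are used
    (efficiency) while no node hosts more than the cap (stability)."""
    if n_vms > n_nodes * max_vms_per_node:
        raise ValueError("not enough node capacity under the given cap")
    placement, remaining = [], n_vms
    for _ in range(n_nodes):
        on_node = min(max_vms_per_node, remaining)
        placement.append(on_node)
        remaining -= on_node
    return placement


if __name__ == "__main__":
    n = required_vms(total_lam=30.0, mu=10.0, K=5, max_delay=0.15)
    print(n, place_vms(n, n_nodes=4, max_vms_per_node=3))
```

Comparing the returned count with the number of virtual machines currently running indicates whether to expand or contract. Lowering `max_vms_per_node` spreads the virtual machines over more nodes and favors stability, while raising it favors hardware utilization.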
In order to verify the performance of the proposed virtual machine scheduling algorithm in cloud/fog computing system, this paper uses C
Virtual Machine Load Balance Scheduling Algorithm (RR): this algorithm can be seen as the application of the Round Robin (RR) strategy to virtual machine scheduling. It distributes virtual machines evenly across the computing nodes and therefore obtains the best guarantee of system operation stability.
Maximum Computing Resource Utilization Efficiency Algorithm (Max-U): this algorithm aims to maximize the utilization efficiency of computing resources. It neglects the guarantee of task service quality and system stability, and starts all virtual machines on one computing node.
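The placement behavior of the two baselines can be illustrated with a short sketch. The function names and the `cpu_oversell_ratio` measure are assumptions introduced here for illustration. The ratio simply divides the virtual CPUs configured on a node by its physical CPUs, and it is not the efficiency or stability index defined in the paper.

```python
def place_round_robin(n_vms, n_nodes):
    """RR baseline: start virtual machines on the nodes in cyclic order."""
    placement = [0] * n_nodes
    for vm in range(n_vms):
        placement[vm % n_nodes] += 1
    return placement


def place_max_u(n_vms, n_nodes):
    """Max-U baseline: start every virtual machine on the first node."""
    placement = [0] * n_nodes
    placement[0] = n_vms
    return placement


def cpu_oversell_ratio(vms_on_node, physical_cpus, vcpus_per_vm=1):
    """Configured virtual CPUs per physical CPU; above 1 means oversold."""
    return vms_on_node * vcpus_per_vm / physical_cpus


if __name__ == "__main__":
    for place in (place_round_robin, place_max_u):
        placement = place(10, 4)
        worst = max(cpu_oversell_ratio(v, physical_cpus=4) for v in placement)
        print(place.__name__, placement, worst)
```

Under RR the most loaded node hosts the fewest virtual machines possible, so a single node failure affects the fewest virtual machines. Under Max-U one node hosts all of them, which maximizes its utilization but oversells it as soon as the number of virtual machines exceeds its physical CPUs.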
1) Task service quality assurance
Task service quality assurance results.
It can be seen from Fig. 4 that Max-U can guarantee the quality of service of the task when the load is light, with a response delay of no more than 0.5 ms. This is because the hardware computing resources of a single computing node are sufficient for the scheduled virtual machines, so the computing resources are not oversold. When the load increases, the number of virtual machines grows and the computing resources of the single computing node become oversold, resulting in a decrease in task processing performance, so the quality of service of the task cannot be guaranteed. RR schedules virtual machines using the load balancing strategy, which guarantees and obtains the optimal quality of service. RU-2S QoS, in contrast to Max-U, is aware of the task service quality during scheduling, so it can also guarantee the task service quality, which demonstrates its effectiveness in task service quality assurance.
2) Efficiency and stability of system
As can be seen from Fig. 5, Max-U mainly focuses on the utilization efficiency of computing resources. The virtual machines are started on a single computing node, which yields the highest computing resource utilization efficiency index. RR uses a balancing method to distribute virtual machines across different computing nodes, and its computing resource efficiency is the lowest.
In contrast, RU-2S QoS considers both computing resource utilization efficiency and system stability, and schedules virtual machines on the basis of task service quality assurance. Its computing resource utilization efficiency is higher than that of RR but lower than that of Max-U, which shows that the algorithm achieves its design goal.
3) System operation stability
Table 1 indicates that Max-U, which maximizes computing resource utilization efficiency by concentrating virtual machines on a single computing node, has the highest system failure probability index. RR distributes the virtual machines across the computing nodes, avoiding their concentration on any one node as far as possible, and obtains the lowest system failure probability and therefore the highest system stability. In contrast, the system failure probability of RU-2S QoS does not exceed the required value (10%). Although its system failure probability is higher than that of RR, it also guarantees the quality of service of the task and optimizes the use of computing resources, which demonstrates the effectiveness of the algorithm design.
Results of system failure probability
Results of computational resource efficiency.
Aiming at the on-demand scheduling of virtual machines in cloud/fog computing systems, this paper proposes a virtual machine adaptive allocation and target computing node scheduling algorithm. The algorithm not only considers the quality of service of tasks, but also takes into account the efficiency of computing resources and the stability of the system. Experimental results show that the algorithm proposed in this paper can simultaneously guarantee the service quality of the task, the efficiency of computing resources and the stability of the system.
