Abstract
The improvement of load balance scheduling algorithm is an effective means to realize an optimal scheduling of cloud computing service. This paper analyzes the existing load balance scheduling algorithm and proposes that the key to cloud service quality lies in the control of response time. Therefore, the direct algorithms are superior than intelligence algorithms. In this article, the advantages and disadvantages of Throttled algorithm, Throttled algorithm and Active Monitoring algorithm are compared. On this basis, the system response time is taken as a basis of improving the scheduling algorithm, and a stateful composite weight improvement algorithm is proposed.
Introduction
The nature of cloud computing service is to make a comprehensive integration and optimal scheduling of resources under the framework of Internet. Its core is how to effectively conduct task allocation, carry out a reasonable scheduling of resources and finally realize load balance at a high use ratio of resources. The rapidly developed technologies and continuously updated equipment make the present network very complex and bring a large number of data transmission and communication problems between different architectures, systems and standards. The dynamics and heterogeneity of cloud computing determines the complexity of resources scheduling system [3]. The cloud computing resources are geographically distributed and naturally heteroid, and each organization and management domain has its own resource management strategy. So, how to solve the network congestion problem and how to guarantee a timely, stable, correct and effective web service have become the core content concerned by the resource scheduling of cloud computing. Besides, both “how to” questions are also one of the main contents that the cloud computing need to investigate at present [1].
Implement the cloud service quality control by load balance
Zhan et al. [20], Madni et al. [12] think that cloud computing resource scheduling can be regarded as a reasonable way to deal with a set of tasks submitted by users and a collection of cloud resources. That is, on the basis of considering the processing efficiency of the task, the rational utilization of resources and the related costs,and how to determine which resource the task should be allocated to [20, 12]. In this sense, the goal of resource scheduling includes task execution time, resource utilization, cost control, etc. [19]. The goal of cloud resource scheduling is to realize the key index of cloud service quality control.
The load of scheduling center is reflected by the computation cost of data center in judging the performance and state of all virtual machines and making a balanced scheduling. As a control hub of load scheduling, the scheduling center must have strong computing and processing capabilities so as to rapidly, timely and efficiently conduct task scheduling. As far as a stateless scheduling algorithm is concerned, the computation cost of control center is small, but it could not gain virtual machines and queue state well which may result in a unbalanced load. As far as a stateful scheduling algorithm is concerned, the computation cost of control center may show a geometric growth as the complexity degree of state variable increases, which will lead to an overloaded scheduling center and thereby affect the entire service performance. Therefore, only if the parameters with an appropriate computing complexity are selected as the scheduling indexes and foundation, could the balanced scheduling be realized on the premise of guaranteeing the system service quality [17].
Load balance is a low-cost, effective and transparent method, which is based on the existing network structure to extend the bandwidth of network devices and server, enhance the throughput and network data handling capacity and improve network flexibility. It is also an important performance index of cloud service system. Therefore, load balance contains two meanings: one is dispersing a large number of concurrent access or data traffic to process respectively in multiple node devices, thus reducing users’ time to wait for a response; the other one is sharing the overweight load operation of single node to other receptive node devices for concurrent processing, thus substantially raising the processing capacity of entire system. In a word, load balance means taking advantages of concurrent and cluster and fully utilizing the existing system resources so as to expand the system processing capability [15].
Load balance is one of the top considerations of cloud service provider. Only the load balance that reaches a certain degree can sufficiently and effectively implement the computation capability of all software and hardware resources in a could computing architecture, and realize cost minimization when each service performance index is satisfied [4]. So, in the eyes of a cloud service provider, load balance is just an optimization question of how to realize the fully utilization of resources, timely service response, cost minimization and other objectives. It can obtain a good operability to achieve load balance by scheduling algorithm, in which the intelligence optimization algorithm and direct scheduling algorithm are applied generally.
Application of intelligence optimization algorithm in resources scheduling of cloud computing
As in the resources scheduling of cloud computing, it is required to make a reasonable deployment between each node according to the service condition of network, CPU, memory and other resources, so this is a typical combinational optimization problem. Such kind of problems often do not have a good analysis feature and normally do not apply to continuous function optimization method; thereby a lot of researches use the intelligence optimization algorithm. There are so many intelligence optimization algorithms frequently used in resources scheduling, such as simulated annealing, neural network, genetic algorithm, ant colony algorithm, particle swarm optimization algorithm, etc. Ricciardi et al. [11] proposed a hybrid routing and wavelength assignment algorithm. When the network load is increased, it dynamically switches the load balancing scheme to better allocate the available communication resource. Adhikari and Amgoth [6] designed an effective strategy to configure the server based on the number and size of the input task to find the appropriate allocator and maximize the use of the computing resource. Zuo et al. [5] put forward a multi-objective task scheduling model which minimizes the user’s task overhead and minimizes the task completion time by dividing the tasks submitted by users into computing intensive tasks and data intensive tasks according to the task’s requirement for different resources, and uses the ant colony (Ant Colony, AC) algorithm to solve the problem. Abdullahi et al. [7] studied how to minimize the task scheduling problem of user task completion time.
Among the intelligence algorithms applied in resources scheduling of cloud computing, there are a lot of researches and applications in ant colony algorithm. For example Zhou and Cao [14], a comparison is made between ant colony algorithm and some other algorithms in the CouldSim simulated environment, whose results show that this algorithm has a better scheduling performance and load balance. Hua et al. [16] taking a prediction of potential available nodes’ quality as the hormonal factors of ant colony algorithm, the influence of bandwidth, quality of track, responsive time and other factors in a cloud environment on the computing resource allocation are analyzed, and the GridSim’s simulated results indicate that its ant colony algorithm has a shorter responsive time and better operating quality than the simulated annealing and genetic algorithm. Liu et al. [18], when a bi-directional ant exchange mechanism is imported into the ant colony optimization scheduling strategy under the MapReduce framework, it is able to quickly discover suitable virtual machine resources, and consequently enables Master nodes to rapidly allocate virtual machines according to user tasks so as to overcome the weaknesses of large scale of nodes, low resource allocation of single node and inefficiently searching of effective computing resource in a cloud computing environment.
M. Sharma and P. Sharma [9], in the CloudSim simulated environment, the time that various ant colony algorithms consume in optimal resource matching under different parameter values is researched. The research results show that the average optimization consuming time is above 3.6 seconds, which is unbearable to a cloud service system. Because for users, the system responsive time added with network delay and task processing time is far more than users’ average endurance. Although intelligence optimization algorithm usually can solve complicated NP problem or nonlinear optimization problem in a satisfactory manner, it still faces prematurity, failed optimizing, too long time consumed and other problems. For optimization problems whose results have universal guiding significance, these defects are not big problems. However, for a cloud service system, since each access needs to be re-optimized, these defects will be fatal. Letting users to wait for a long time is equivalent to an inferior service, and an ineffective access after a long period of waiting will totally destroy the credibility and image of cloud service system. Hence, this paper considers that some time-consuming and indeterminate intelligence optimization algorithms are unfit for the load balance algorithm application of cloud computing. Comparatively, the simple and direct task scheduling algorithm with smaller computing complexity and less consuming time has unparalleled advantage than intelligence algorithms.
Application of direct scheduling algorithm in could computing resource scheduling
Common direct scheduling algorithms include Round Robin algorithm, weighted Round Robin algorithm, Throttled algorithm, active monitoring algorithm, efficiency algorithm and so on [13]. Wickremasinghe et al. [2], CloudAnalyst tool is used to compare several main direct scheduling algorithm. According to whether it is required to obtain virtual machine and queue information, these algorithms fall into stateless and stateful scheduling algorithms. Among them, Round Robin algorithm and weighted Round Robin algorithm are stateless scheduling algorithms, while Throttled algorithm, active monitoring algorithm and efficiency algorithm are stateful scheduling algorithms.
Round Robin algorithm
Round Robin scheduling algorithm is also called polling scheduling algorithm, a simple scheduling technology using time slice principle which orderly allocates requirements to all virtue machines and distributes sliced time of processor to nodes to use [10]. Suppose the current system has
Diagram of Round Robin algorithm.
Diagram of Throttled algorithm.
The premise of this algorithm is that all virtue machines have the same performance and need not care for each queue length and response speed, making it so easy to realize. But when demand service interval largely changes over time, the difference of server processing capacity will lead to a varied queue length, so this algorithm will easily result in unbalanced load between servers. In order to reflect the service level discrepancies of different virtue machines due to their varied service capabilities, a weighted Round Robin scheduling algorithm is derived [8]. This algorithm uses weights to represent the service capability level of virtue machine and conducts scheduling according to weights, so as to ensure that high-performance servers can obtain a high use rate. This, to some extent, avoids relative overload of low-performance servers and improves use rate of high-performance servers. Nevertheless, the weighted Round Robin algorithm that does not distinguish unit weight and the Round Robin algorithm that does not distinguish single virtue machine are the same in their nature.
Throttled scheduling algorithm sets a state variable for each virtue machine and uses Busy/Idle to identify the working condition (rhombic state variable in Fig. 2) of this virtue machine. When demand arrives, allocation center will appoint an “Idle” virtue machine to response and “busy” virtue machines not to response. Thus, the Throttled algorithm employing state identification is able to effectively avoid overload, and its mode of execution is shown in Fig. 2.
Active Monitoring algorithm
The state variable of Active Monitoring scheduling algorithm does not mean identifying whether virtue machine is busy or not but to denote the length Ln (instructions/task numbers) of virtue machines’ executing queue. When new demands arrive, the algorithm will place it in the instruction queue of virtue machine (red state variable in the following figure) with least load (shortest queue), allowing virtue machine loads to be equal. This algorithm is executed as shown in Fig. 3.
Diagram of Active Monitoring algorithm.
Different scheduling algorithms can achieve distinct load balance state, which is reflected as varied resource use rate and meanwhile directly showed in system responsive time. Load unbalance always makes some virtue machines to have a long queue of tasks and some other virtue machines to have no tasks to process. If demands are crowded in the long task queue of virtue machine and hard to be processed in time, their responsive time would be necessarily longer. For a cloud center reaching to a certain load level, all resources should be effectively and fully utilized and allvirtue machine queue should keep in an appropriate traffic state, thereby improving the responsive time of entire cloud service systemand increasing the cloud service quality. Consequently, this paper took system responsive time as a basis of modifying scheduling algorithm and presented a stateful composite weight improvement algorithm.
Deficiency of existing direct scheduling algorithm
The scheduling algorithms without state variables treat all virtue machines equally and is unable to conduct task allocation and adjustment according to their performance discrepancies, task processing, queue length and other conditions, which is easy to result in load unbalance phenomenon, such as Round Robin algorithm. Stateful scheduling algorithm provides a reasonable hint for load allocation by a simply state judgement and guides tasks flowing to low load virtue machine so as to avoid overload phenomenon. The design of state variables is decisive to a stateful scheduling algorithm. Too single state variables are difficult to accurately reflect load condition and thus unable to effectively carry out balanced scheduling. Introduction of state variables is an essential condition to study direct optimization scheduling algorithm, in which the so-called state variables refer to the variables to record the working condition, machine performance and other information of virtue machines. Through acquiring state variables, virtue machines are distinctively treated, enabling scheduler to allocate one task to a virtue machine that may process such task in time, and this will improve the use rate of total cloud resources and reduce the waiting time of demands.
Throttled algorithm’s state variables merely record whether virtue machines are idle or not. Virtue machines without any task to process are in an idle state, while virtue machines being processing tasks are in a busy state. Such state recording will lead to a too frequently altered state of virtue machines. Therefore, virtue machines frequently interacting with task allocation center will also lead to the waste and overload of a large amount of resources in server. ESCE algorithm is a similar case. In this algorithm, only frequently scanning task queue and virtue machine list could guarantee that tasks are timely allocated to available virtue machines. Such continuous scanning will bring huge computing burden to server. Active Monitoring algorithm takes the executing queue length of virtue machine, which reflects the bust state of a virtue machine to some extent and enables each new task to be allocated to a virtue machine with the shortest queue. However, because queue length could not accurately reflect the performance, time consumed in processing task and other information of virtue machine, so the advantages of high-performance virtue machines can not be fully exposed and the execution and waiting time of some tasks are too long.
Diagram of dynamic composite weight algorithm relationship
Introduction of state variables is an essential condition to study direct optimization scheduling algorithm, in which the so-called state variables refer to the variables to record the working condition, machine performance and other information of virtue machines. Through acquiring state variables, virtue machines are distinctively treated, enabling scheduler to allocate one task to a virtue machine that may process such tasks in time. This will improve the use rate of total cloud resources, reduce the waiting time of demands and make virtue machine reach to a certain load level. In this state, all resources would be effectively and fully utilized, and all virtue machines’ queue traffic states are equivalent. On this account, both the responsive time of entire cloud service system and the cloud service quality can be improved.
Integrating virtue machine performance, task queue length and other information, a composite weight based stateful scheduling algorithm was proposed. This algorithm uses state variables of comprehensive information to describe the processing capability of virtue machine, allowing each latest task in task allocation center to be allocated to the most suitable virtue machine and be processed in the shortest period of time, so as to substantially improve the responsive time of cloud service system. For a cost-sensitive cloud service provider, it not only has to promote responsive speed but also controls cost. So, on the basis of composite weight algorithm, it is needed to add a virtue dynamic increase/decrease mechanism, which means real-time adding or destroying virtue machines according to task number and intensity, reducing system operation cost and raising resource use rate on the premise of ensuring that cloud service system reaches responsive time limitations. A could service operator could design and use cloud files in a reasonable range, thus controlling the cloud design cost. The relationship between composite weight algorithm and dynamic composite weight algorithm is shown in Fig. 4.
Algorithm relationship diagram.
Thus it can be seen that the dynamic composite weight algorithm, proposed on basis of load balance need and stateful algorithm, can be divided into two levels. That’s to say, a composite weight algorithm is built after an improvement based on stateful algorithm, so as to increase responsive speed of cloud service system under existing hardware conditions. Dynamic composite weight algorithm is built after a secondary improvement based on composite weight algorithm, so as to control the operating cost of cloud service system through dynamic application and destroy mechanism of virtue machine. Dynamic composite weight algorithm considers both responsive speed and operating cost, which could effectively improve service quality of cloud system.
The improvement of scheduling algorithm can effectively raise task processing efficiency and control operating cost under existing conditions, so it is a feasible method to control cloud service system quality. Composite weight algorithm and dynamic composite weight algorithm improved on basis of stateful algorithm are the basis of an optimal scheduling, which mirrors the capabilities of virtue machines to receive and process new tasks by virtue of the descriptions of how fast to achieve an idle state, in order to ensure a task being processed in time. Meanwhile, dynamic application and destroy mechanism of virtue machine could control operating cost to some extent.
Footnotes
Acknowledgments
The authors would like to thank Quality Evaluation Standard Model of Cloud Computing Service in Zhejiang Key Soft Science Project (2016C25024), Zhejiang Natural Science Funds (LY16C200006), Zhejiang Key Research Base Tasks on Philosophy and Social Science (12JDCY02Z).
