Abstract
Cloud computing delivers computing services such as servers, networking, storage and software over the Internet, so that an organization can store and access data as a service at any time. Cloud service providers (CSPs) offer these services and charge users for them. When a request is made to use the services, the service provider allocates a feasible number of virtual machines (VMs). Determining the optimum amount of resources required at runtime to satisfy the user’s request is not a trivial task. Therefore, the cardinal issue in the cloud ecosystem is managing the allocation of resources to an application so as to abide by the service level agreements (SLAs). The fundamental objective of cloud service management is to design a self-adjusting auto-scaler that responds to elastic workload and optimizes the allocation of resources at reduced cost. The notable issue is how and when resources are to be allocated or de-allocated in order to honor the agreed SLAs. In this paper, we propose a resource provisioning framework that integrates autonomic computing with Fuzzy Q-Learning and Chebyshev’s inequality. The auto-scaling mechanism is implemented in the four phases of the proposed autonomic MAPE loop framework: Monitoring, Analysis, Planning and Execution. The framework follows the control MAPE loop structure, using Chebyshev’s inequality for prediction in the analysis phase and fuzzy Q-learning in the planning phase, where human intervention in the form of fuzzy rules ensures efficacious provisioning of VMs.
A comparative analysis has been performed with three combinations: (i) a linear regression model (LRM) in the analysis phase with the fuzzy based Q-learning approach (FBQ-LA) in the planning phase, (ii) Chebyshev’s Inequality in the analysis phase with FBQ-LA in the planning phase, and (iii) Chebyshev’s Inequality in the analysis phase with Q-Learning in the planning phase. Experimental results show that the proposed autonomic model based on Chebyshev’s inequality and FBQ-LA outperforms the existing model in terms of improved VM provisioning, lower cost and reduced response time.
Introduction
Cloud computing is a pioneering technology that provides bulk data storage at a remote location with easy access through the Internet. Its pay-per-use model has stimulated a large number of organizations to rent a variety of resources over the Internet [1]. With the phenomenal growth in cloud computing, innumerable application providers are hosting their applications on the cloud. Many cloud service providers, such as Amazon EC2, rent out their resources at a profit. In fact, the ‘pay-per-use’ policy of cloud computing has revolutionized the business environment. End-user requests for the services offered by a SaaS provider are fulfilled by renting infrastructure from an IaaS provider. Because users’ demand is dynamic, as CSPs are well aware, static resource provisioning may not yield an optimum solution. In static provisioning, a surge in users’ demand leads to under-provisioning, which disrupts or delays the response to users’ requests and thereby breaches the SLA. Conversely, when traffic decreases, over-provisioning occurs, which wastes resources and incurs higher costs for the CSP.
Managing resources for elastic workload in a cloud environment is a challenging task for the CSP [2–5]. Therefore, a better strategy that meets the SLA while also benefiting the CSP is indispensable.
To strike a balance between minimizing SLA violations and reducing the cost to the CSP [6–7], intelligent and automated strategies for efficient auto scaling are needed, and autonomic computing provides a suitable paradigm. An autonomic system is self-managing, self-healing, self-configuring and self-protecting [8–10]. To achieve auto scaling, IBM provides a framework called the control MAPE (Monitor, Analyze, Plan, Execute) loop. A system that adopts the MAPE loop repeatedly cycles through its four phases. The monitoring phase iteratively collects information about workload and resources. The analysis phase analyzes the information gathered by the monitoring phase. The planning phase then decides whether to scale up or scale down, and the execution phase carries out that decision by adding or removing the appropriate VMs. This work presents an improved version of the analysis and planning phases in terms of VM provisioning, reduced cost and reduced response time.
This research focuses on the design of an efficient autonomic framework. Our work follows the MAPE loop architecture, wherein Chebyshev’s inequality is used to forecast workload from the workload history supplied by the monitoring phase, and the planning phase uses fuzzy based Q-Learning to improve the learning accuracy and convergence speed of Q-Learning. The research contributions of this paper are summarized as follows: (i) an autonomic framework, inspired by IBM’s MAPE loop model, for auto scaling of resources; (ii) an enhancement of the analysis phase through a more accurate prediction model; (iii) the use of reinforcement learning, i.e. Fuzzy Q-Learning, as the decision maker in the planning phase; and (iv) a number of experiments using real-world NASA workload data to evaluate the performance of the proposed work.
The rest of the paper is organized as follows.
Background
Autonomic computing
Autonomic computing refers to computing systems that are self-managing and adjust to their environment as required. The term was coined by IBM [11], along with a reference model named the MAPE control loop, which is shown in Fig. 1. MAPE stands for Monitoring, Analysis, Planning and Execution. In the control MAPE loop, elements such as the operating system, VMs, CPU, storage and other services are considered data center elements. The sensors observe the managed elements and collect information such as the waiting time and response time of cloud services. The effectors carry out the required changes, such as adding or removing VMs. The autonomic manager consists of four sub-phases, viz. monitor, analysis, planning and execution. It collects information about resources from the sensors and applies the required changes through the effectors.

Fig. 1. Autonomic computing MAPE-K Loop framework.
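The control loop described above can be sketched in a few lines. This is only an illustrative skeleton: `ManagedSystem`, `mape_step`, `mean_forecast`, `threshold_plan` and the capacity of 100 requests per VM are all hypothetical names and values, and the analysis and planning policies are placeholders rather than the methods proposed in this paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ManagedSystem:
    """Toy stand-in for the managed data center elements (hypothetical)."""
    vms: int = 1
    history: List[int] = field(default_factory=list)


def mape_step(system: ManagedSystem,
              observed_requests: int,
              analyze: Callable[[List[int]], float],
              plan: Callable[[float, int, int], int]) -> int:
    """Run one Monitor-Analyze-Plan-Execute cycle and return the VM delta."""
    # Monitor: the sensor records the latest workload sample.
    system.history.append(observed_requests)
    # Analyze: turn the workload history into a forecast.
    predicted = analyze(system.history)
    # Plan: choose a scaling action (positive adds VMs, negative removes VMs).
    delta = plan(predicted, observed_requests, system.vms)
    # Execute: the effector applies the action, keeping at least one VM.
    system.vms = max(1, system.vms + delta)
    return delta


def mean_forecast(history: List[int]) -> float:
    """Placeholder analysis policy: forecast the mean of the history."""
    return sum(history) / len(history)


def threshold_plan(predicted: float, current: int, vms: int,
                   per_vm: int = 100) -> int:
    """Placeholder planning policy with an assumed capacity per VM."""
    demand = max(predicted, current)
    capacity = vms * per_vm
    if demand > capacity:
        return 1
    if demand < capacity - per_vm and vms > 1:
        return -1
    return 0
```

Passing the analysis and planning policies in as functions mirrors the way the framework swaps different methods into those two phases while the monitoring and execution phases stay the same.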
To date, various resource provisioning techniques based on machine learning have been used. Among them, Neural Networks [12–14], Genetic Algorithms [15], Markov Chains [16], etc. are widely adopted. However, we favor reinforcement learning (RL) for two reasons: (i) obtaining a training dataset of workload in the cloud is not possible, and reinforcement learning does not require one; (ii) RL is suitable for learning in a dynamic and complex environment such as a cloud.
Related works
This section explores related work in two parts: (i) autonomic computing and fuzzy Q-learning (FQ-L) for auto scaling in the cloud, and (ii) FQ-L for tuning the rules of a fuzzy controller for knowledge evolution.
Autonomic computing and use of FQ-L for auto scaling
Xu et al. [20] proposed a two-level resource management framework that provisions resources to each individual virtual container, where fuzzy logic is used in the local controller of the virtual container to handle the uncertainties of fluctuating workloads. Maurer et al. [21] presented a simulation engine embedded with Case Based Reasoning (CBR) for knowledge management and decision making. Mao et al. [22] presented a methodology based on a monitor-control loop that aims to meet the user’s specified deadline in a cost-efficient way. Ritter et al. [23] introduced a dynamic-provisioning, cost-efficient autonomic framework for multi-tenant system topologies. This model provides provisioning capabilities, supports the client’s requests, uses resources cost-effectively and shares resources among multiple tenants. Frey et al. [24] discussed autonomic resource management in virtualized data centers using fuzzy logic. Jamshidi et al. [25] presented a type-2 fuzzy system that handles the uncertainty of elastic workloads, using fuzzy logic to specify elasticity rules. Jamshidi et al. [26] proposed a self-learning fuzzy controller, FQL4KE, that modifies fuzzy rules automatically at runtime and supports dynamic elasticity management. Singh et al. [27] presented an energy-aware autonomic framework for energy-efficient scheduling of resources in data centers. Amiri et al. [28] addressed the problem of resource and energy wastage by combining Reinforcement Learning with a fuzzy approach for dynamic resource distribution. Arani et al. [29] proposed a hybrid framework that combines Reinforcement Learning and autonomic computing. This work reduces SLA violations by minimizing the total cost and also increases resource utilization. Arabnejad et al. [30] introduced two approaches for cloud auto scaling, namely fuzzy SARSA learning and fuzzy Q-learning. Aslanpour et al. [31] proposed a control MAPE-K loop architecture that emphasizes reducing the total cost through a cost-saving super professional executor.
FQ-L for tuning rules of fuzzy controller for knowledge evolution
Fuzzy Q-learning is not a new approach to decision making and knowledge evolution, and a good number of notable works exist in this field. Glorennec et al. [32] explored a dynamic version of the fuzzy Q-learning method (DFQ-L) and compared it with basic fuzzy Q-learning. DFQ-L removes the drawbacks of both Q-learning and fuzzy Q-learning. Berenji et al. [33] proposed a fuzzy learning methodology called GARIC-Q, which uses intelligent agents controlled by FQL for incremental dynamic programming. Oh et al. [34] proposed a hybrid algorithm that combines the advantages of fuzzy Q-learning and conventional Q-learning. Jouffe et al. [35] discussed two methods, fuzzy actor-critic learning and fuzzy Q-learning, for online tuning of the conclusion part of a Fuzzy Inference System (FIS). Bonarini et al. [36] introduced two strategies for distributing reinforcement to handle situations arising from interactions among the rules. In an FIS, the interaction between fuzzy rules creates a problem in the learning process because the reinforcement coming from the rules is incoherent. Boumehraz et al. [37] proposed a Reinforcement Learning mechanism for tuning the rules of a fuzzy inference system. Er et al. [38] introduced dynamic fuzzy Q-learning (DFQL) for tuning rules online, together with a novel self-organizing learning algorithm for automatic identification of structures using Q-Learning. Cabrerizo et al. [39] presented a methodology that addresses the challenges of group decision making, analyzed using a fuzzy system.
Proposed work
Design of fuzzy controller
A fuzzy inference system (FIS) maps a set of inputs to an output through fuzzy rules. An FIS with N fuzzy rules can be defined as;
Where,
Where, $\alpha_i(x)$ is the rule strength.
Expert knowledge in the form of fuzzy rules can be applied to a given situation to take appropriate actions. Fuzzy rules take the IF-THEN form, which captures human knowledge and can also handle complex situations.
To implement the fuzzy controller, this work takes two input parameters and one output parameter. The first input is the predicted value obtained from the analysis phase, and the second is the number of requests in a given time (i.e. the workload from the monitoring phase). The output is the scaling action to be performed. Three scaling actions are used: Scale In (removing VMs), Scale Out (adding VMs) and No Operation. As a first step, the input set is divided into fuzzy states using membership values. A triangular membership function is used to measure the membership degree, and each input value is associated with a linguistic term. The predicted value is labeled with three linguistic terms, ‘Low’, ‘Medium’ and ‘High’, and the workload is labeled with the same three terms. Scale Out adds at most 4 VMs, i.e. Scale Out {+4, +3, +2, +1}; Scale In removes at most 4 VMs, i.e. Scale In {–4, –3, –2, –1}; and No Operation is {0}. Therefore, the output takes one of the values {–4, –3, –2, –1, 0, +1, +2, +3, +4}. Rules formed from expert knowledge using these parameters are shown below.
Where, $s_i$ (1 ≤ i ≤ 9) is the state and $a_i$ (1 ≤ i ≤ 9) is the action.
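To make the fuzzification step concrete, the following sketch computes triangular membership degrees for the three linguistic terms and the firing strength of each of the 9 (prediction, workload) rules. The breakpoints, the normalization of both inputs to [0, 1], and the use of the product to combine the two memberships are assumptions made for illustration; the names `triangular`, `fuzzify` and `rule_strengths` are hypothetical.

```python
from itertools import product
from typing import Dict, List

TERMS = ("Low", "Medium", "High")

# Assumed breakpoints (left foot, peak, right foot) on inputs scaled to [0, 1].
BREAKPOINTS = {
    "Low": (-0.5, 0.0, 0.5),
    "Medium": (0.0, 0.5, 1.0),
    "High": (0.5, 1.0, 1.5),
}


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x: float) -> Dict[str, float]:
    """Membership degree of x in each linguistic term."""
    return {term: triangular(x, *BREAKPOINTS[term]) for term in TERMS}


def rule_strengths(prediction: float, workload: float) -> List[float]:
    """Firing strength of the 9 rules, ordered (Low, Low), (Low, Medium), ...

    The prediction term varies slowest, so rule 1 is (Low, Low), rule 3 is
    (Low, High) and rule 4 is (Medium, Low).
    """
    p, w = fuzzify(prediction), fuzzify(workload)
    return [p[pt] * w[wt] for pt, wt in product(TERMS, TERMS)]
```

With these breakpoints the memberships of any input in [0, 1] sum to 1, so the 9 rule strengths also sum to 1 and can be used directly as weights.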
Dynamic resource provisioning can also be achieved through auto scaling, which is a decision-making problem. Virtual machines are allocated to cloud-based applications by monitoring the current workload and future predictions. The idea is to keep the response time of the system below the desired response time given in the SLA. To satisfy this constraint, we use Fuzzy Q-Learning in the planning phase of autonomic computing. Characteristics of the system such as workload and response time are continuously monitored. The fuzzy rules obtained above are continuously tuned by Q-Learning, the reinforcement learning approach used in the autonomic MAPE loop architecture, to achieve optimal results. A state is modeled by (prediction, workload), for which FQ-L takes the most suitable action ‘a’, i.e. scaling VMs in or out. FQ-L is described as follows.
Where, N is the number of rules, $\mu_i(x)$ is the firing strength of rule i for input signal x and $a_i$ is the consequent function for the fired rule.
Where, the value of Q(s, a) indicates how desirable it is to take action a in state s, either once or repeatedly.
Where, max(q[i, $a_k$]) is the maximum of the q values which can be achieved in state s’.
Where, γ is the discount rate determining the importance of future reward.
Where, η is the learning rate, whose value is taken between 0 and 1.
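The symbols defined above fit the commonly cited fuzzy Q-learning update, which the sketch below illustrates. It is a minimal example under stated assumptions, not necessarily the exact formulation used in this work: the firing strengths are assumed to sum to 1, the values γ = 0.8 and η = 0.1 are placeholders, and the names `fql_update`, `global_action` and `greedy_choices` are hypothetical.

```python
from typing import List, Sequence

# VM deltas in the column order used in this paper: scale in, no-op, scale out.
ACTIONS = [-4, -3, -2, -1, 0, 1, 2, 3, 4]


def greedy_choices(q: List[List[float]]) -> List[int]:
    """For each rule, the index of the action with the largest q value."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


def global_action(strengths: Sequence[float], chosen: Sequence[int]) -> float:
    """Strength-weighted combination of the actions chosen by each rule."""
    return sum(a * ACTIONS[k] for a, k in zip(strengths, chosen))


def fql_update(q: List[List[float]],
               strengths: Sequence[float],
               chosen: Sequence[int],
               reward: float,
               next_strengths: Sequence[float],
               gamma: float = 0.8,
               eta: float = 0.1) -> float:
    """Apply one fuzzy Q-learning update in place and return the TD error."""
    # Q(s, a): strength-weighted q values of the actions actually chosen.
    q_sa = sum(a * q[i][k] for i, (a, k) in enumerate(zip(strengths, chosen)))
    # V(s'): strength-weighted best q value of each rule in the next state.
    v_next = sum(a * max(row) for a, row in zip(next_strengths, q))
    # Temporal-difference error, discounted by gamma.
    delta = reward + gamma * v_next - q_sa
    # Each rule's chosen entry moves in proportion to its firing strength.
    for i, (a, k) in enumerate(zip(strengths, chosen)):
        q[i][k] += eta * a * delta
    return delta
```

Because each entry is updated in proportion to its rule's firing strength, a single observation adjusts several rules at once.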
A resource provisioning framework has been constructed that matches the control MAPE loop. All three cloud service models, viz. SaaS, PaaS and IaaS, are accommodated in the framework. The working of the framework is shown in Fig. 2. The SaaS layer provides cloud services to end users. The PaaS layer is responsible for provisioning resources to the cloud services offered by SaaS. The IaaS layer contains the data center in which VMs are hosted, and it provides VMs to the SaaS layer. As shown in Fig. 2, the main units of the resource provisioning mechanism based on the control MAPE loop are Monitoring, Analysis, Planning and Execution. These four units are discussed below in the context of resource provisioning.

Fig. 2. Framework of autonomic computing.
The proposed approach is formulated with the following equations and notations. U is the number of users requesting a cloud service at a given time, with each user having a request $R_u$.
The total number of requests at a given time is calculated as;
Where,
Where, Cloudlets is the total number of tasks being executed, Cloudlet PE is the total number of processing elements required by a task, and Cloudlet Length is the size of a particular task. PE is the number of processing elements assigned to the requested VM, and MIPS is the number of million instructions executed per second. The total cost incurred by the SaaS provider for processing all the requested cloud services is calculated as;
Where, each VM cost depends on VM price and the duration for which the VM is activated.
Penalty cost is calculated from SLA violations. The objective is to minimize the total cost, i.e. Minimize (TotalCost = VMCost + PenaltyCost).
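The objective above can be illustrated with a small calculation. In this sketch the VM cost is the sum of price times active duration, as stated in the text, while the penalty model (a flat fee for every response time that exceeds the SLA limit) is an assumption made only for illustration. The function names and all numeric values are hypothetical.

```python
from typing import Iterable, Tuple


def vm_cost(leases: Iterable[Tuple[float, float]]) -> float:
    """Sum of price_per_hour * hours_active over all leased VMs."""
    return sum(price * hours for price, hours in leases)


def penalty_cost(response_times: Iterable[float],
                 sla_limit: float,
                 penalty_per_violation: float) -> float:
    """Assumed penalty model: a flat fee for each response above the limit."""
    violations = sum(1 for t in response_times if t > sla_limit)
    return violations * penalty_per_violation


def total_cost(leases: Iterable[Tuple[float, float]],
               response_times: Iterable[float],
               sla_limit: float,
               penalty_per_violation: float) -> float:
    """TotalCost = VMCost + PenaltyCost."""
    return (vm_cost(leases)
            + penalty_cost(response_times, sla_limit, penalty_per_violation))
```

The two terms pull in opposite directions: leasing more VMs raises the VM cost but lowers the number of SLA violations, which is why the planner has to balance them.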
The proposed work follows the control MAPE loop. Each component of the MAPE architecture plays its role in order to avoid the SLA violation. Various notations with definitions that are used in our work are summarized in Table 1.
Table 1. Model notations
In the monitoring phase, both user metrics and resource metrics are monitored continuously. The user monitor collects the number of VMs leased, $N_i(\Delta t)$, and the number of requests, $W_i(\Delta t)$, for executing a cloud service $S_i$ in the $\Delta t$-th time interval. The resource monitor observes the CPU utilization of the VMs, $U_i(\Delta t)$.
The analysis phase uses Chebyshev’s Inequality to predict the future workload for cloud services by processing the output of the monitoring phase.
Chebyshev’s Inequality for predicting the future workload:
Chebyshev’s inequality guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more than a certain distance from the mean [40–41]. Specifically, no more than $1/k^2$ of the distribution’s values can be more than k standard deviations away from the mean (or equivalently, at least $1 - 1/k^2$ of the distribution’s values are within k standard deviations of the mean). In other words, the probability that X deviates from its expectation (or mean) by less than k times the standard deviation of X is greater than or equal to $1 - 1/k^2$. In mathematical notation;
$$P\left(|X - E(X)| < k\sigma_y\right) \geq 1 - \frac{1}{k^2}$$
Where, E(X) is the expectation or the mean of the sample, $\sigma_y$ is the standard deviation and k is a constant generally taken to be either 3 or 6. This implies that;
Similarly,
From equation (14) and (15), we conclude that:
Taking k = 6 we get;
This equation when applied to predict workload is;
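As a rough illustration of how the bound applies to a workload history, the sketch below computes the guaranteed coverage for a given k and the interval of k standard deviations around the sample mean. It is not the exact predictor of this paper: the use of the population standard deviation and the function names `chebyshev_coverage` and `chebyshev_interval` are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def chebyshev_coverage(k: float) -> float:
    """Lower bound 1 - 1/k^2 on the probability of lying within k sigma."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k <= 1 the bound is not informative, so it is clamped at zero.
    return max(0.0, 1.0 - 1.0 / k ** 2)


def chebyshev_interval(history: Sequence[float],
                       k: float = 6.0) -> Tuple[float, float]:
    """Interval [mean - k*sigma, mean + k*sigma] for a workload history."""
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation (assumed)
    return mu - k * sigma, mu + k * sigma
```

For k = 6 the bound guarantees that at least 35/36 (about 97.2%) of the values lie inside the interval, whatever the shape of the workload distribution.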
In the planning phase, we use a fuzzy Q-learning approach to take actions that optimize the total cost by adjusting the number of resources. Based on the 9 fuzzy rules, an action is chosen and the Q-value table is updated until an optimal solution is reached. A positive reward is given if the chosen action is appropriate; otherwise a negative or low reward is given. For each rule, the optimal action is the one with the maximum q value in the Q table. The execution phase then carries out the action selected in the planning phase by adding or removing VMs. The types of VMs considered for Q-Learning and Fuzzy Q-Learning are shown in Table 2 and Table 3.
Table 2. Types of VMs considered for Q-Learning
Table 3. Types of VMs considered for Fuzzy Q-Learning
This section illustrates the experimental results of the proposed work discussed in previous sections. Firstly, the experimental setup and its assumptions are briefly described. Then, a comparative analysis of the proposed work with existing work is presented.
Experimental setup
The experiments presented in this section were carried out using the CloudSim 3.0 toolkit and MATLAB version 2015a. The monitoring, analysis and execution phases are validated using CloudSim [42], whereas the planning phase is validated using MATLAB. The design of the fuzzy controller and the implementation of Fuzzy Q-Learning and Q-Learning are done in MATLAB. For simulation, four heterogeneous VM types, i.e. Large, Extra-large, Medium and Small, are created when Q-Learning is used in the planning phase, whereas homogeneous VMs of small size are used when the FBQ-L methodology is implemented. Configuration details of these VMs, which have different costs and capacities, are presented in Table 2 and Table 3. Real-world workload traces with different characteristics are taken from NASA data [43], which gives more realistic results. The real workload is collected at 1-hour intervals over 12 hours.
Comparative analysis of proposed work with existing work
We compare our proposed approach with the autonomic framework proposed by Arani et al. [29], who use a hybrid approach of autonomic computing and reinforcement learning. Their framework is inspired by the MAPE-K loop, with LRM in the analysis phase and Q-Learning for decision making in the planning phase. We first compare our proposed prediction technique with LRM. We then analyze combinations of LRM or Chebyshev’s Inequality based prediction in the analysis phase with Q-Learning or Fuzzy Q-Learning in the planning phase.
Chebyshev’s inequality method with linear regression model (LRM) in analysis phase
We use Chebyshev’s Inequality in our framework to predict the next hour’s workload and compare it with LRM, which is used for the same prediction. The analysis shows that Chebyshev’s Inequality gives results very close to the actual workload, whereas LRM shows large variation; prediction by Chebyshev’s inequality is therefore more accurate than LRM. The simplicity of Chebyshev’s inequality is a further point in its favor. As shown in Fig. 3, LRM deviates widely from the actual workload in the 2nd hour, whereas Fig. 4 shows how this difference is reduced by Chebyshev’s inequality.

Fig. 3. LRM based prediction.

Fig. 4. Chebyshev’s Inequality based prediction.
From the above, it is concluded that Chebyshev’s inequality gives a better prediction than LRM. Because the LRM prediction is not very accurate, we apply fuzzy rules to the prediction values obtained by LRM to manage fluctuations in workload and give more realistic results. For this, we formulated 9 fuzzy rules corresponding to 9 states, so a 9x9 matrix was constructed that shows how each rule affects the action to be taken: scale in, scale out or no operation. The columns of each rule signify the number of VMs to be removed or added. The 5th column signifies no operation, the first four columns are scale in actions with values ranging from –4 to –1, and the last four columns are scale out actions with values ranging from +1 to +4. The first rule says that if the prediction is low and the workload is low, then no operation is to be taken, so the Q-value for no operation, i.e. Q (1, 5), should get the highest value. This was indeed the case, and after 27 iterations the values were seen to be constant. Hence, the stopping criterion was met and the Q-matrix was optimized. This is shown in Fig. 5.

Fig. 5. Optimized Q (1, 5).
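The column layout of the 9x9 Q-matrix and the stopping criterion can be sketched as follows. The mapping from 1-based column index to VM delta follows the description in the text; the tolerance used to decide that the values have become constant and the function names are assumptions.

```python
from typing import List, Sequence


def column_to_action(col: int) -> int:
    """Map a 1-based column (1..9) to a VM delta (-4..+4); column 5 is no-op."""
    if not 1 <= col <= 9:
        raise ValueError("column must be between 1 and 9")
    return col - 5


def best_action(q_row: Sequence[float]) -> int:
    """VM delta of the column with the highest Q-value in one rule's row."""
    col = max(range(len(q_row)), key=q_row.__getitem__) + 1
    return column_to_action(col)


def has_converged(previous: List[List[float]],
                  current: List[List[float]],
                  tol: float = 1e-6) -> bool:
    """True when no Q-value changed by more than tol between iterations."""
    return all(abs(a - b) <= tol
               for prev_row, curr_row in zip(previous, current)
               for a, b in zip(prev_row, curr_row))
```

For the first rule, a row whose largest entry is in the 5th column yields a delta of 0, which corresponds to the no operation action.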
In the third rule (when the prediction is low and the workload is high), a large number of VMs must be added. Every Scale Out column is therefore updated: the 9th column of the third row gets the highest value, and the other Scale Out columns are also optimized, with values less than that of the 9th column. The Q-values for the 8th and 9th columns of the third rule are shown in Fig. 6(a) and 6(b). Q (3, 9) is greater than Q (3, 8), so a Scale Out operation with 4 VMs is carried out. In this scenario, too, the values were optimized after 27 iterations.

Fig. 6. (a) Optimized Q (3, 8). (b) Optimized Q (3, 9).
The fourth rule says that if the prediction is medium and the workload is low, then the number of VMs should be decreased slightly. Every Scale In column is updated: the 3rd column of the fourth row gets the highest value, and the other Scale In columns are also optimized, with values less than that of the 3rd column. The values are optimized after the 27th iteration. By this criterion, the number of VMs to be removed is 3. The Q-values of (4, 3) and (4, 4) are shown in Fig. 7(a) and 7(b).

Fig. 7. (a) Optimized Q (4, 3). (b) Optimized Q (4, 4).
With this framework, the total cost and penalty cost can be minimized. Applying fuzzy rules and forcing the states to take action in accordance with the outcome of the rules not only handles the case of dynamic workload changes, but also minimizes the total cost incurred both for the user and the CSP.
Chebyshev’s inequality gives very accurate predictions. Hence, there is no need to pass the predicted values to the fuzzy controller, as the prediction already produces a near-optimal result. Fuzzy controllers are needed only when there is a lot of uncertainty. This case should be considered an important topic for further investigation.
Chebyshev’s inequality (proposed) in the analysis phase with Q-learning in planning phase (existing)
Arani et al. [29] proposed a planning phase for the MAPE loop based on Q-Learning. In their work, the CPU utilization of a cloud service at a particular time is evaluated, and three states (i.e. Over Utilization, Under Utilization and Normal Utilization) are considered based on it. In our work, we adopt the same state definitions. On this basis, a 3x3 matrix was constructed in the planning phase, where the rows are states and the columns are actions. As shown in Fig. 8(a), 8(b) and 8(c), the three states (Over Utilization, Under Utilization and Normal Utilization) form the rows and the corresponding actions Scale Out, Scale In and No-Operation form the columns. The matrix is optimized by Q-Learning, and suitable actions are chosen based on the maximum q-values. The cost comparison of Fuzzy Q-Learning and Q-Learning in the planning phase is depicted in Fig. 9. From these results, it is observed that Q-Learning converges at a much slower rate, as can be seen by comparing Fig. 7(a), (b) with Fig. 8(a), (b), (c).

Fig. 8. (a) Normal Utilization with No-operation. (b) Over-utilization with Scale Out. (c) Under-utilization with Scale In.

Fig. 9. Comparison of VMs cost by using LRM in analysis phase with Q-L or Fuzzy Q-L in planning phase.
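For comparison with the fuzzy variant, the 3x3 tabular Q-Learning baseline described above can be sketched as follows. The utilization thresholds of 0.2 and 0.8, the values γ = 0.8 and η = 0.1, and the names `classify` and `q_update` are assumptions for illustration; the update itself is the standard Q-Learning rule.

```python
from typing import Dict

UTIL_STATES = ("over", "under", "normal")
SCALE_ACTIONS = ("scale_out", "scale_in", "no_op")


def classify(cpu_utilization: float,
             low: float = 0.2, high: float = 0.8) -> str:
    """Map CPU utilization in [0, 1] to a state (thresholds are assumed)."""
    if cpu_utilization > high:
        return "over"
    if cpu_utilization < low:
        return "under"
    return "normal"


def new_q_table() -> Dict[str, Dict[str, float]]:
    """3x3 table with states as rows and actions as columns."""
    return {s: {a: 0.0 for a in SCALE_ACTIONS} for s in UTIL_STATES}


def q_update(q: Dict[str, Dict[str, float]],
             state: str, action: str, reward: float, next_state: str,
             gamma: float = 0.8, eta: float = 0.1) -> None:
    """Standard Q-Learning update applied in place."""
    best_next = max(q[next_state].values())
    q[state][action] += eta * (reward + gamma * best_next - q[state][action])
```

Each update here changes exactly one table entry, whereas a fuzzy Q-learning update spreads the same observation across every rule that fired.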
We can also observe that the Q-values are optimized after 80 iterations in all three cases. From Figs. 5–7, on the other hand, we conclude that FQ-L optimizes earlier: the optimization time of Q-Learning is more than double that of FQ-L.
Thus, it is concluded that the use of FQ-L reduces response time. With FQ-L, the exact number of VMs is obtained and these VMs are assigned when required. VM cost is also minimized by using homogeneous VMs of the small type (shown in Fig. 9). Q-Learning stores Q-values in a lookup table, which is impractical for large state and action spaces, whereas FQ-L handles large numbers of Q-values easily. FQ-L shortens training by embedding prior knowledge into the rules. As a consequence, the SLA is complied with under the proposed framework.
We adopted the MAPE model proposed by Arani et al. [29] for comparative analysis. Our work comprises phase wise (analysis phase and planning phase) comparison with the existing approach. The proposed analysis model is compared with existing LRM model. After that, three types of combinations are taken for analysis; (i) LRM in analysis phase (existing) with FBQ-LA in planning phase (proposed), (ii) Chebyshev’s Inequality in analysis phase (proposed) with FBQ-LA in planning phase (proposed), and (iii) Chebyshev’s Inequality in analysis phase (proposed) with Q-Learning in planning phase (existing). Finally, a cost comparison analysis using LRM in analysis phase with Q-Learning/Fuzzy Q-Learning in planning phase is performed.
Conclusion and future work
In this paper, resource provisioning for handling elastic demand for cloud services has been considered. Handling demand in a cloud environment is a challenging issue that requires an efficient auto-scaling mechanism. To address it, we have proposed an autonomic framework based on Chebyshev’s inequality and a Fuzzy based Q-learning approach (FBQ-LA) for cloud infrastructure management. The proposed work reduces both the wastage and the shortage of resources by restricting SLA violations. The total cost is cut by dynamically allocating the right number of homogeneous resources as and when required. The implementation of fuzzy rules for each state further increases the robustness of the MAPE loop. The fluctuating nature of the workload for a given cloud service is handled well by incorporating human knowledge in the form of fuzzy rules. Moreover, using Fuzzy Q-Learning in the planning phase speeds up learning and convergence, which reduces the response time. The limitation of the proposed approach is that, when FBQ-L is applied in the planning phase, it has been analyzed only with a group of homogeneously configured VMs.
In future work, heterogeneous VMs will be considered for dynamic allocation with FBQ-L to obtain more realistic outcomes. The fuzzy based SARSA learning methodology will be applied in autonomic computing and the performance of both the approaches with respect to an optimal solution will be evaluated. The control MAPE-K loop can be made more dynamic by applying well formulated fuzzy rules for each and every phase. Further enhancement of MAPE loop can be done with the use of dynamic fuzzy Q-Learning.
Acknowledgments
The authors wish to express their gratitude and heartiest thanks to the editor and anonymous reviewers for their valuable suggestions in improving the paper significantly. The authors also thank the Department of Computer Science & Engineering,
