Abstract
Spot pricing is an auction based market mechanism introduced by Amazon EC2 to sell available cloud resources at a low cost without any availability guarantee. Bid prices submitted by users directly affect reliability of the instance acquired. In this context, we propose a Perceptive Bidding Strategy (PBS) for spot instances which presents different bid prices for different time during a day across all Amazon EC2 regions to assist users optimize their bids for obtaining cheaper spot instances with greater reliability. The strategy is firmly grounded on one-day-ahead spot price predictions using Random Forests. The bid price is not static but varies with job size to minimize cost and volatility of spot instances. We also develop a spot billing simulator to validate the bidding strategy against historical data. The simulator utilizes bid prices generated by PBS and Amazon’s real spot price traces of the bidding period for empirical investigations. We evaluate PBS for varying job sizes for a period of two weeks on different compute instance types and demonstrate that PBS greatly reduces job execution costs up to 80% while maintaining high reliability up to 80 to 90%. We compare PBS with several static bidding strategies and show that PBS provides spot instances with reliability for varying job sizes and greatly reduces risk of high execution costs.
Introduction
Spot pricing has been offered by Amazon EC2 since 2009 through market based auction mechanism to sell its spare capacity at a low cost with no availability guarantee. As the public cloud providers offerings mature, more and more spare capacity will be wasted. Spot pricing being highly dynamic, where price can change every minute, there is no real transparency into how much money would be saved and at what cost. Complexity in deciding the bid price to achieve reliability as well as cost savings may be another reason why, despite its extremely low prices, spot market has low utilization. Despite the substantial growth of spot instance usage [1], users today still fear using spot instances for their application not being sufficient fault tolerant. Applications with high computing power requirements, fault tolerance and no real time availability constraints can be executed at a significantly low cost by utilizing spot virtual machines. The problem is challenging since submitted price bid directly affects execution cost and reliability of spot instances. The more bid price spot users bid, greater is the reliability guarantee but at a risk of higher cost. Spot users would like spot instance availability long enough to serve their job requirements. Faced with the diversity and volatility of resources with market oriented pricing systems within IaaS providers such as Amazon EC2, differing questions emerge including:
How best can we exploit the diversity among different regions, and their respective spot markets in order to improve the opportunities to obtain spot instances with reliability at low cost? What is the expected lifetime of an instance at a given bid price that can serve users needs? What price to bid, when to bid, where (region) to bid to achieve both performance and cost savings?
The Perceptive Bidding Strategy for spot instances proposed in this work utilizes one-day-ahead spot price predictions using Random Forests. The bidding strategy minimizes execution cost and improves reliability of spot instances for fault tolerant jobs by three ways:
As spot prices change considerably during different hours of a day, the strategy presents different bid prices to bid at different hours of a day to minimize execution cost. Price to bid varies with job size to ensure expected job lifetime equals or exceeds job size in order to guarantee service availability. Exploits spot markets in multiple EC2 regions to minimize cost and volatility of spot instances.
Given the pricing history and the demands of the application, this paper presents a dynamic bidding strategy for spot instances based on future spot price forecasts using spot pricing history. Our objective is to minimize execution cost while minimizing the risk of high execution cost and volatility of spot instances. We expect that the bidding strategy would help spot users in reducing some limitations and would increase their confidence in using cloud spot market. The rest of the paper is organized as follows. Section 2 discusses literature review. Section 3 discusses the bidding framework including Amazon EC2 spot billing mechanism, spot price prediction using Random Forests, assumptions and the proposed Perceptive Bidding Strategy. Section 4 presents various bidding metrics used to assess the bidding strategy, its evaluation and analysis. Section 5 presents discussion and conclusion of bidding strategy.
Literature review
A number of spot bidding strategies both static and dynamic as shown in Fig. 1 are proposed by authors to bid spot instances effectively and reduce the chances of out-of-bid failure. Bidding strategies differ due to different job requirements, interruption overheads and failure tolerance. Static bidding strategies bid at static price and dynamic bidding strategies calculate the bid price according to the probability distribution of market prices existing in the spot pricing history and application performance requirements.
Taxonomy of spot bidding strategies.
Chohan et al. [2] use spot instances as accelerators in order to reduce the execution time of MapReduce jobs without giving consideration to any deadline constraints or check pointing. The authors consider bid prices independent of market prices. However in reality, we observe gradual and steep fluctuations (when spot prices rise above on-demand price) in spot pricing. When price fluctuations are gradual, which occur for most of the time, spot prices are market driven. Andrzejak et al. [3] proposed a probabilistic decision model in order to pre-compute fixed bid price and used hourly and optimal check pointing strategy to avoid loss due to out-of-bid failures. Bid price prediction methods based on probability distribution methods do not model the time durations between spot price fluctuations correctly and therefore do not provide sufficient information regarding lifetime of spot instance at a given bid price [4]. Mattess et al. [5] use spot instances in combination with local computing resources to manage peak computing loads efficiently and reduce computation cost. The authors request spot instances only when spot price is below the maximum bid price and then bid at a very high maximum bid price. Bidding at maximum bid price when spot prices are low can delay job execution. Although, it reduces the chances of instance being interrupted, it does not provide any guarantee regarding duration for which job can run without being interrupted. Liu [6] use spot instances in the map phase by bidding at the lowest price and use regular instances in reduce phase of MapReduce jobs in order to handle the unexpected job terminations in spot instance market. Yi et al. [7] examine many check pointing strategies for different bid values and different types of instances. After each instance’s interruption, their approach decides a new instance type, location, and price that reduces the total running time. Taifi et al. [8] found that over protective check pointing strategies can lead to slowdown of 2 to 3 times the failure free execution runtimes. Leslie et al. [9] propose a resource allocation and job scheduling framework that utilizes spot and on-demand clusters, Amazon’s multiple purchasing options, instance types and availability zones in order to provide a cost effective and low volatility means to run both moldable and rigid jobs. The authors match jobs to instances based on their past 60 days historical reliability. The bid price is calculated based on the average spot price. Due to the increasing growth in spot instance usage patterns, inclusion of past 60 days history traces to determine bid price is doubtful. Kushwaha and Simmhan [10] bid at 70% of on-demand price without considering any check pointing strategy and find that chance of an out-of-bid event is more frequent in longer jobs. Menache et al. [11] proposed bid price for spot VMs as either fixed or set as the weighted average of the past spot prices plus a safety parameter. Karunakaran and Sundarraj [12] show that for jobs with flexible completion time, bidding close to reserved price is an attractive strategy in order to minimize expected cost while bidding above on-demand although seems to be attractive, is not useful. Abundo et al. [13] model bidding problem as a Q-Learning problem. When the net slack time of an application is greater than zero, their bidding strategy bids on-demand price for spot instances assuming on-demand prices can be less than maximum spot price over a given period of time. Qu et al. [14] bid spot instances for web applications at truthful bid price and on-demand price bidding strategy. The authors conclude that increasing bid prices alone is not enough to guarantee high availability. Sharma and Irwin [15] suggest bidding on-demand prices for spot instances and make use of migration techniques in the case of spot instance interruption. The authors argue that bidding strategy does not have any significant impact on cost savings and availability ratio.
Dynamic bidding strategies
Kaminski and Szufel [16] proposed an adaptive bidding strategy for bidding spot instances that bids close to spot price in order to lower execution cost and delay. The authors conclude that bidding close to spot price can significantly lower cost but requires frequent check pointing. Wu et al. [17] find that since adaptive bidding algorithm [8] seeks the cheapest instance, it withdraws many bids and leads to high deadline miss rate. Zafer et al. [18] employ Semi Markov chain models to characterize spot prices and propose near-optimal bidding strategies. The authors propose an optimal bidding strategy which calculates the bid according to the probability distribution of all the market prices in the spot pricing history and the remaining deadline at the beginning of each instance hour. If the remaining execution time equals to the deadline, the application is executed on on-demand instances. Wu et al. [17] find that optimal bidding strategy [18] rarely gets spot instances and all the jobs are actually executed using on-demand instances as the strategy assumes all prices in the price history follow uniform distribution which is apparently not the case in practice. Tang et al. [19] formulate bidding problem as a Constrained Markov Decision Process and design an optimal bidding strategy that utilizes both the dynamic pricing model and the state transition intelligence. The authors obtain an optimal randomized bidding strategy through linear programming. However, such multidimensional models suffer from scalability limitations when considering multiple Amazon EC2 spot markets with multiple bids for better availability and profitability [4]. Wallace et al. [20] present a predictive model based on artificial neural networks for short-term prediction of future spot prices. The authors predict spot price for the following time interval for approximately every 1.3 hours. One prediction step takes about 12 seconds in their model. Their model is significantly slow than Random Forests based prediction model used in this paper. Guo et al. [21] reduce the costs of a distributed and storage service by using spot and on-demand instances. The authors bid in availability zone that has minimum failure probability. The bidding decision can change with each bidding interval if the spot prices fluctuate. Poola et al. [22] propose bidding strategy to exploit spot market until latest time to switch to on-demand instances does not reach. Their bidding strategy bids prices closer to the spot price in the beginning of the execution and gradually increases bid price closer to the on-demand price as the workflow nears the completion based on the current spot price, on-demand price. The failure probability at a particular bid price is calculated based on past one month spot history. Their bidding strategy emphasizes in meeting deadline constraints rather than cost minimization. Wolski and Brevik [23] predict the bid value in AWS EC2 spot market for users such that the instance’s requested lifetime exceeds a fixed duration with a given probability. Their strategy focuses on ensuring the lifetime of the instance with a given probability at any bid price and not on optimizing execution cost. Bidding in Amazon EC2 for spot instances is challenging as Amazon EC2 keeps on expanding its global infrastructure. New regions keep coming up and capacity at each region is also added significantly. So there is always a surplus capacity available. Amazon EC2 only publicizes its pricing history. It does not publicize the size of the spot pool available, the number of users bidding for an instance type and their bid price. Amazon sorts all received bids in a decreasing order, and use the size of the spot pool to find the cut-off price. Such cut-off price is called market price. All the bids above the cut off price will be granted spot instances. Once a new bid is received, Amazon re-sorts the bids and then a new market price is determined. With the real possibility of all the users bidding on-demand price or maximum price, market price would automatically increase sooner. Bidding high can improve availability, but there is no guarantee on availability and there is always a risk of high spot price and consequently high execution cost is involved. Few bidding strategies make use of fault tolerant mechanisms such as check-pointing and migration. These mechanisms further increase complexity, execution cost and time. Switching quickly to another region worsens the price volatility problem. At times, when spot availability is full, instance capacity cannot be always guaranteed. This would leave cloud customers with more interruptions in services. Moreover, previous bidding strategies do not suggest bid prices based on job size, computation time and also do not compare the price dynamics across different Amazon EC2 regions.
Bidding framework
Amazon EC2 spot billing mechanism
Currently AWS spot instances are billed at per hour granularity. When a spot instance is launched, customer will pay the current market price at the time of launch. AWS re-evaluates spot market price every hour, and adjusts the billing for customer’s instance after each hour of run time. Similar to on-demand instances, partial hours are billed as full hours except in the case where a spot instance is terminated by AWS. In this case, the customer will not be charged for the partial hour which is commonly known as free-lunch. If the customer terminates his instance, he will be charged for the partial hour billed as a full hour. The maximum price that user submits as part of his request is not necessarily what he will pay per hour, but is rather the maximum price he would be willing to pay to keep it running. User should set a maximum price for his request that is high enough to provide whatever probability he would like that his instances run for the amount of time that he desires within a given time frame. Amazon will only charge him for the current market price and never above his maximum bid. Bid submitted is independent of the running instance. User cannot change his maximum price once the instance is running. So, if his bid is not closed or canceled, he cannot change it. In order to change the price of his bid, he must cancel it and place another bid. Amazon has built it this way to make the market more efficient by encouraging customers to bid the maximum price they would be willing to pay for the job when they submit the bid. High bid prices cannot stop instance interruption; instead, instances will be interrupted when the size of the spot pool drops to zero.
Spot price prediction
1) Spot history dataset: We collate Amazon EC2 spot history traces of all compute instance types and Amazon EC2 regions from March 2015 to April 2016 from Amazon EC2 console using Amazon Web Service Command Line Interface [24] which provides past 90 days spot history traces. The history traces collected are used to study future evolution of spot pricing in different regions, determine the right amount of history data to consider, forecast future spot prices using Random Forests, check accuracy of the forecasts and evaluate the bidding strategy. Spot price history traces consist of time stamps when spot price changes. ap-northeast-1a c3.large Linux/UNIX 0.0286 2016-06-27T05:37:0Z. We use two attributes namely spot price and time for prediction purpose. The predictor vector
2) Random forests: Tree based algorithms are supervised non parametric algorithms that have low bias. Random Forests [25] is an ensemble method which uses bootstrapping and averaging the results obtained from multiple trees (also called bagging) to reduce the variance. It also uses random subspaces along with bagging so that different prediction trees are de-correlated. Bootstrapping creates multiple datasets for multiple trees by picking each time random samples of data with replacement. The samples which are not selected are called Out-of-bag samples and are used for computing Out-of-bag Error (mean squared error) in Random Forests. Out-of-bag samples eliminate the need for a separate validation dataset in Random Forests. Random Forests are also highly computationally efficient.
Let
3) Feature importance: Random Forests use the out-of-bag samples to construct feature importance to measure the prediction strength of each input variable.Let
Table 1 lists the abbreviations used for Amazon EC2 regions and Linux/Unix compute instance types considered for evaluation purpose. Feature Importance of various predictor variables for different instance types and across different Amazon EC2 regions averaged over a period of 100 to 150 days spot price prediction identified by Random Forests using out-of-bag samples is listed in Table 2. The most important features in spot pricing characterized by Random Forests are Day, Weekday and Hour. These features identify important variables that necessitate the time to bid decision criteria in a bidding strategy for cost optimization and reliability of spot instances. The feature importance also justifies the size of the training dataset size equal to 7 days which incorporates all the three important features by using past 7 days history traces.
Abbreviation used
Feature importance in spot pricing
Reduction in out-of-bag error with different leaf size, number of grown trees and training dataset size.
4) Fitting random forests model: Random Forests model have a few parameters to tune to reduce out-of-bag error which includes leaf size, number of trees grown and training dataset size. Figure 2a shows the effect of different leaf sizes (1 to 100) and number of trees grown (1 to 300) on out-of-bag error. Lower leaf size results in over fitting the data and higher leaf size increases variance. We therefore consider an optimal leaf size of 5 to train the prediction model. Increasing the number of trees grown above 50 does not show any reduction in out-of-bag error. Therefore the number of trees grown considered for training the Random Forests model is 50. Figure 2b shows that out-of-bag error (validation error) increases with increasing training dataset size. This is due to the fact that spot prices change considerably with time and therefore do not require any large training dataset size for prediction. The Random Forests model therefore considers past 7 days spot price history traces as a single training data to train the prediction model.
Comparison of prediction errors between different non-parametric machine learning algorithms.
MAPE, MCPE in one-day-ahead spot price predictions across different Amazon EC2 regions.
5) Mean consequential percentage error: We also propose a new metric Mean Consequential Percentage Error adapted from Kaggle competitions [27]. Consider
6) Comparison of random forests based predictions with other machine learning algorithms: We compare accuracy of one-day-ahead spot price predictions using tree based algorithms including Random Forests with other non parametric machine learning algorithms namely Neural Networks and Support Vector Machines. Prediction accuracy is assessed in terms of Mean Squared Error
Considering on-demand price as an upper bound to bid spot instances, we make predictions only for periods where spot price is less than on-demand price. Figure 4 shows one-day-ahead forecast accuracy in terms of MAPE and MCPE across Amazon EC2 regions and compute instances. The results show that
1) Assumptions:
If no out-of-bid situation takes place during spot instance lifetime, its availability is equal to the availability of on-demand instance which is equal to 99.9%. Instance type allocated to execute a job does not change until the job is executed completely. Spot price at time
2) Perceptive bidding strategy: Perceptive Bidding Strategy (PBS) is an offline bidding strategy that aims to aid the bid decision in three ways:
It allows the user to make informed decisions on how much to bid depending on the job size, a choice that directly influences the risk of instance failure and monetary spending; Decide the best point in time to start a new spot instance for a job, thus seeking to cover the period that will yield minimum cost, and; Where to bid seeking to obtain maximum reliability of spot instances by exploiting the diversity of regions at Amazon EC2.
Simulation parameters
Simulation parameters
Spot price variations on different weekdays and during different hours of a day.
The rationale behind combining these different pieces of information is to avoid hasty decision that may increase costs such as acquiring instances when spot prices are higher, acquiring instances in a region with higher spot price variations. PBS takes advantage of a metric called variability in spot prices among different regions to achieve both reliability and cost savings goals at the same time. Customer can choose the region with least price variability for bidding if latency, security and compliance requirements (in case of health care, banking, finance, education) are met.
Block diagram showing various steps of the proposed Perceptive Bidding Strategy.
Time to bid is an important variable in spot instance acquisition. Bidding low price during peak hours may lead to severe degradation in application performance while bidding high during non peak hours may lead to high risk of increased cost. Table 3 lists various simulation parameters used and their values. Figure 5 shows spot price variations during different hours in region R1 for I1 Linux/Unix instance type on different weekdays. A visual inspection of spot prices during 24 hours of a day reveals that spot prices change significantly at an interval of every 3 hours. Perceptive Bidding Strategy uses Random Forests based one-day-ahead spot price predictions and calculates bid time and bid values in cents at an interval of every 3 hours between 0 to 24 hours which corresponds to 8 different bid time and bid values. Figures 6 and 7 show the block diagram and algorithm of the proposed bidding strategy. Bid value for an interval of every three hours is the maximum predicted value during three hour interval to prevent out-of-bid failure during this interval.
where predicted price is price calculated using Random Forests for every three hours and
Bid price depends on job size. As the job size increases, bid price also increases. Bid once made cannot be changed during job execution without instance termination. Therefore, for greater job size, maximum of bid values in
The strategy can be used for greater job lengths up to 48 hours when bidding is initiated on off peak day, i.e. Saturday. We do not recommend the use of bidding strategy for greater job lengths as spot prices can vary up to 64% on different weekdays [28]. Using bidding strategy for greater job lengths would have increased risk of high execution cost.
Bid values according to PBS
Algorithm perceptive bidding strategy.
Bid prices for different job sizes according to PBS
3) Computation of bid values and bid price: Table 4 shows start time, end time of the time interval in minutes and bid values generated by the bidding strategy on 24-03-2016 for I5 instance type in region R7 using Eq. (4). A total number of 8 bid values per day are calculated. Minimum bid granularity G is 0.001 cents. Table 5 shows different bid prices and bid time for varying job sizes calculated using Eqs (5) and (6). As seen in Table 4, the lowest bid price is in row 2 at bid time 181 minutes, therefore for job size up to 180 minutes, the optimal bid time is 181 minute (i.e. 03:01 h) and bid price is 54.18 cents. For job size up to 360 minutes, the two lowest priced consecutive intervals are rows 2 and 3. So, bid time is 181 minute (i.e. 03:01 h) and bid price is the maximum price of the two intervals i.e. 54.23 cents. For job size up to 540 minutes, three lowest priced consecutive intervals are rows 2, 3 and 4. So, bid time is 181 minute (i.e. 03:01 h) and bid price is the maximum price of the three intervals i.e. 54.43 cents. For job size up to 720 minutes, the four lowest priced consecutive intervals are rows 1, 2, 3 and 4. So, bid time is 0 minute (i.e. 00:00 h) and bid price is maximum price of the four intervals i.e. 57.73 cents and so on.
4) Computation of additive constant
Bid metrics
Job completed: If the spot price never exceeds bid price once the instance is granted and until the job of size
Execution cost of job: For a job of size
where spot price is the price at the beginning of each hour after an instance is allocated.
Maximum cost of job:
where bid price is the price specified by the user for executing a job of size
On-demand cost of job: For a job of size
Reliability of bidding strategy: It is the ratio of number of jobs completed to the number of jobs submitted.
Actual cost to on-demand cost ratio: It is the ratio of the average cost for successfully completing job using spot instances with respect to cost using on-demand instances.
Variability: Variability is the difference between maximum and minimum bid price during 24 hours.
Increase in reliability with alpha parameter value for short and long jobs on different weekdays.
We analyse the performance of PBS against several static bidding strategies namely Bid Min, Bid Average, and Bid Max price based on past history traces, Bid Current spot price, Bid 70% of on-demand, Bid On-demand strategies. Figures 9 and 10 compare different static spot bidding strategies for region R9, instance type I5 and region R5, instance type I9 for a period of 2 weeks on different weekdays. The figures clearly show that Bid Min price bidding strategy hardly gets any spot instances. Bid Average price, Bid Current spot price may sometimes get and sometimes may not get spot instances and thereby have low reliability. Deadline constraint also may not be met.
Comparison between different bidding strategies for region R9 and I5 instance type.
Comparison between different bidding strategies for region R5 and I9 instance type.
1) Influence of bid time and job size: Figures 9 and 10 also show Amazon EC2 spot prices, forecasted prices using Random Forests and the bid prices generated by PBS for next 24 hours for a period of 2 weeks. The figures clearly show that volatility of instances change with weekday and time (hour). Volatility is low on Saturday and Sunday in comparison to other weekdays. Volatility is high on Wednesday, Thursday and Friday. Volatility is also high during peak hours and low during off-peak hours. Peak and off-peak hours vary for different regions. We observe that the bid prices generated by PBS can always get spot instances when spot prices are low. This suggests that one can achieve spot instance reliability for job sizes up to 900 minutes or more at minimal cost when spot prices are lowest i.e., during off peak hours. For higher job sizes, there is a risk of instance interruption in regions where spot prices vary greatly during peak hours.
Reliability of executing jobs on different spot compute instance types and varying job sizes.
Actual cost using PBS to on-demand cost ratio.
Cost reduction using PBS compared to on-demand bidding strategy.
Spot price variations on different weekdays and during different hours of a day.
2) Influence of job size on reliability:Figure 11a shows average reliability of executing jobs of varying job sizes using PBS across different Amazon EC2 regions. Although PBS takes into account job size and therefore bids higher prices for higher job size, reliability reduces to a great extent with increase in job size. Maximum reliability for job size of 180 minutes in different regions varies from 99% to 86%. In some regions reliability decreases by 4% to 5% only for higher job size while in others there is a strong influence of job size on reliability. We observe a remarkable decrease in reliability of 30% to 40% for higher job size in some regions. Execution of higher job size requires job to be executed during peak hours also when spot instances are highly volatile leading to low reliability.
3) Influence of instance types on reliability: Figure 11b shows reliability of executing jobs using PBS on different compute instance types from I1 to I5 for varying job size. We find that reliability of higher instance type i.e. instances with greater infrastructure (I5) is considerably less than the reliability of instance types with smaller infrastructure (I1) and reliability reduces considerably with increase in job size for higher instance types. Running short job is more cost effective on higher instance type during off peak hours. Long job should be run on less volatile instance type with lower infrastructure.
4) Actual cost to on-demand cost ratio (ACDR): Actual cost is the cost charged by Amazon EC2 which is equal to the sum of lowest bid at which instance is granted each hour. Spot instances are available at a very low cost using PBS in comparison to on-demand instances. Figure 12 shows ratio of actual cost using PBS to on-demand cost for instance type I4 across different Amazon EC2 regions for job size of 180 minutes. ACDR varies from 0.13 to 0.23 across different regions. This shows that on average one can achieve cost reduction from 77% to 87% in comparison to on-demand bidding strategy and still achieve reliability using PBS for spot instance acquisition. Bidding in most cost effective region can further reduce execution cost upto 10%. We performed about 215 simulations across different regions for different instance types and job sizes ranging from 3 to 12 hours to compare PBS with on-demand bidding strategy. On-demand bidding strategy guarantees maximum availability and is an upper bound to bid for spot instances. PBS is able to reduce execution cost by 10% and more for 50% of simulations with a minor reduction in availability up to 6% in comparison to on-demand bidding strategy as shown in Fig. 13. This cost reduction is achieved as PBS bids for instances when spot prices are low.
5) Influence of Amazon EC2 spot markets on volatility: Figure 14 shows average daily variations in bid prices generated by PBS for a period of 14 days for instance type I4 across different Amazon EC2 regions. In some regions there is a large variation between lowest and highest bid price within each day while in others, it is very small. Price variation in different regions is therefore an important factor in deciding region to bid for spot instance acquisition for minimizing the volatility of spot instances. Lowest bid prices are nearly same across regions R1, R3, R5, R6, R7 but price variation is least across Regions R5 and R6. Perceptive bidding strategy bids in a region with least price variation for acquiring spot instances which results in significant cost savings and reliability of spot instances.
6) Maximum cost of job: Maximum cost is the cost of acquiring spot instances assuming user’s submitted bid price is lowest and therefore, it is the market price i.e., spot price. Table 6 presents average hourly on-demand cost, maximum cost that can be incurred using PBS and the actual cost of using spot instances during evaluation period of I1 to I10 instance types for job size of 180 minutes across all Amazon EC2 regions. We observe that maximum costs are comparable with the actual costs incurred. This shows that PBS is efficient in minimizing execution cost without any risk of increased execution cost.
Hourly average on-demand cost, maximum cost and actual cost of executing jobs on different instance types across Amazon EC2 regions
The main objective of the proposed bidding strategy is to understand the pricing dynamics in the Amazon EC2 spot market and based on the history traces estimate minimum bids. To the best of our knowledge, our approach is the first to quantify volatility of spot markets and qualify risk associated with bidding in them. Similar to several static bidding strategies, PBS strategy is based on past history traces, but it differs in the size of the training dataset. Earlier past history based bidding strategies use 3 months history traces as the training dataset. PBS uses only past 7 days history traces to minimize prediction errors as spot instance usage patterns change very fast. Instead of using same bid price for all job sizes as used in previous static and dynamic bidding strategies, PBS uses different bid prices depending on the job size to explain EC2 spot market behavior. Previous bidding strategies are suitable for small job sizes only when executed during off peak hours and have very low reliability for long jobs. Deadline constraint also may not be met for short jobs. Previous bidding strategies aim to obtain spot instances, but not on optimizing execution costs. Bidding strategies such as Bid Max, Bid On-demand have high risk of increased execution cost. PBS greatly reduces risk of high execution cost (up to 6 times) with a minor reduction in reliability (up to 6%). A key benefit of our approach is that it suggests bid time also along with bid price to minimize cost, execution time and volatility of spot instances as spot prices vary greatly during different hours of a day and during different days of week. Another key differentiation is that we do not limit our bidding strategy to a single Amazon EC2 region and allow direct comparison of pricing dynamics with and across multiple regions.
Efficient bidding strategies for spot instances are crucial in lowering execution costs while avoiding out-of-bid failure, thereby promoting spot instance usage by raising user’s confidence. PBS proposed in this paper is based on one-day-ahead spot price forecasts using Random Forests. Analysis of experimental outcomes verifies that PBS can reduce user’s costs up to 80% to 85% without any increase in job completion time. The results are consistent across various regions and instance types. Simulation results illustrate that bidding strategy performs well in terms of instance reliability, monetary savings and execution time for job size up to 24 hours without any need for complex procedures such as check pointing. By tailoring bid price based on job size, the proposed bidding strategy is promising for executing real fault tolerant applications on spot instances. The bidding strategy does not aim to increase the spot price, but to guarantee spot instance on user’s bid price. The bidding strategy can be used for job size greater than 24 hours also by making use of one-week-ahead predictions. We do not suggest use of PBS for job size greater than 24 hours except on low volatility days (Saturday, Sunday) as it is likely that cost benefits are lost. Also, risk of cost escalation in bidding for long job size increases greatly. Experimental results verify that PBS outperforms previous bidding strategies in terms of lowering monetary cost, volatility of instances and reducing risk of high execution costs greatly. We expect that the bidding strategy would help spot users in reducing some limitations and would increase their confidence in using cloud spot market.
Footnotes
Authors’ Bios
