Abstract
The multi-objective service selection problem is a basic problem in Service Computing and it is NP-Hard. This paper proposes a novel Bi-Ant colony optimization (NBACO) algorithm for this problem. Two objective functions related to response time and cost attributes are considered. For each objective, a heuristic function and a pheromone updating policy are defined against the characteristics of this problem. Then, a combined state transition rule is designed based on them. It uses preposition skyline query (PSQ) algorithm for each service class to reduce the candidate services at the beginning of NBACO. The algorithm has been tested in nine cases and compared to the related MOACO algorithm and Co-Evolution algorithm for this problem. The efficiency of NBACO is greatly improved by using PSQ. The result demonstrates that our approach is effective and better than MOACO and Co-Evolution.
Introduction
Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction [1]. Clouds are next-generation data-storage and computing systems. It connects and manages distributed computers as a personalized inventory to meet a specific service-level agreement between service providers and users [2]. With the rapid development of Cloud Computing, a large number of cloud services are emerged on the Internet. These services have different abilities to run the applications of users across the systems or platforms of multiple enterprises over the Internet. More and more users establish applications by composing theseservices [3].
Service composition combines multiple services to collaborate with each other based on a business process to form a composite service. Business process defines the logic between the services in a composite service. There is one composite service which consists of many tasks and has specific business logic between tasks. It is necessary to choose a service for a task when a composite service is generated. Service selection is the process which chooses a service from a lot of candidate services for the task. There may be multiple service providers competing to offer the same functionality with different quality of service. Quality of Service (QoS) has become a central criterion for differentiating these competing service providers and plays a major role in determining the success or failure of the composed service [4].Current service QoS standards focus on several attributes, such as response time, cost, throughput, reliability, security, etc. Therefore, QoS-aware service composition has to be formulated as amulti-objective optimization problem [5].
Most existing research usually converts multi-objective problem to the single objective problem through specifying a utility function beforehand that defines priorities or weights for the different QoS attributes. The optimal solution is the one with maximum utility according to that function. The drawback of such an approach is that it is hard to determine the weights precisely and thus cannot find the optimal solution. Therefore, multi-objective service selection problem is not to identify single optimal solution, but rather a set of alternative solutions, known as Pareto Set. A solution is Pareto Set if there does not exist other solutions which have better QoS values for some attributes while having at least equivalent values for other attributes.
Heuristic algorithms can be applied to find the solutions with high efficiency. A lot of researchers present respectively particle swarm algorithm [6], multi-objective bees algorithm [7], multi-objective genetic algorithm [8, 9], chemical reaction optimization algorithm [10, 11] to solve it or similar problem. From their experiments, all algorithms can get the better solutions and the speed is quick. In order to improve the precision of solutions, some researchers present some mixed algorithm to solve this problem. In [12], the authors propose a fully polynomial time approximation scheme (A-FPTAS). It combines polynomial complexity with formal result precision guarantees according to the additive ɛ-metric. The experimental results show that a heuristic version of A-FPTAS has lower runtime and higher precision than competing randomized methods that are particle swarm algorithm and genetic algorithm. In [13], the authors present a cooperative evolution(Co-evolution) algorithm consisting of stochastic particle swarm optimization and simulated annealing to solve this problem. These two papers all mix a heuristic algorithm and other non- heuristic algorithm to improve precision of the solutions. The experiment results show they are effective. But all above heuristic algorithms have the common characteristic that the length of the solutions must be fixed. They are not flexible.
Ant colony optimization(ACO) algorithm is effective to solve the variable length pathfinding problem. In [14], the authors present several Multi-Objective Ant Colony Optimization algorithms to solve TSP problem. The experiments show that it is better in solving one of the commented problems than some of the most famous Multi-Objective Evolutionary Algorithms. In [15], the authors propose several multiple-objective ant colony optimization algorithms which are used to find the path for military problem. Its experiments show that ant colony algorithm is better to solve path-finding problem. In [16], the authors proposed a multi-objective MOACO algorithm for web services selection problem. The algorithm designs one heuristic function and one policy of pheromone for all objectives. It doesn’t design heuristic function and pheromone policy for every objective against its feature.
Recently services appear on the Internet with a rapid growth. When the number of services is huge, it is hard to find the best solution for a composite service using all kinds of intelligent optimization algorithms. So how to reduce the size of the services is very important.
Skyline query is the first proposed solution in database area to deal with the multi-criteria data analysis and decision making. Recently many researchers adopt skyline query algorithm in service selection. In [17], the authors present a Cloud Service Selection System using the Skyline algorithm. The system allows users to search and find services that match their requirements. In [18], the authors propose an approach to reduce the number of candidate services and select services for composition using skyline. In [19], the authors propose a skyline computation approach for service selection. The approach produces multi-service skyline which is effectively and optimally accessed by users. All these approaches focus on reducing the candidate services through skyline computation. How to use the skyline service to produce a composite service instance is not mentioned.
In this paper, a novel Bi-Ant colony optimization algorithm(NBACO) is designed to solve the problem of multi-objectives service selection. The problem is firstly modeled as a weighted directed acyclic graph. Secondly, a skyline query algorithm is applied for each service class to reduce the size of candidate services. Thirdly, a heuristic function and a policy of pheromone updating for every objective related to response time and cost are defined. It gives a combined state transition rule that is calculated by two pheromone matrices and heuristic functions. Lastly, the algorithm finds paths in the graph using the NBACO. The algorithm has been tested on nine different data sets. The results show the performance of NBACO is better than the performance of MOACO and Co-Evolution. The organization of this paper is as follows. In Section 2, the multi-objective service selection problem and model are introduced. In Section 3, NBACO for solving the multi-objective service selection problem is defined in detail. Section 4 shows the experimental scheme, the results of NBACO, and its comparison with MOACO and Co-Evolution. Section 5 summarizes the main work of the paper and puts forward the nextresearch work.
Problem definition and model
Generally, one composite service consists of many tasks and has specific business logic between tasks. One task is an abstract service and corresponding to one service set. A service set is a group services provided by different providers with the same functions, but not the same QoS values. Service selecting is to find the optimal service from every service set for every task that satisfies certain constrains. A composite service instance is generated after service selection.
Composite service can be considered as abstract workflow ℑ which contains a set of abstract services S. Each abstract service element S i (i ∈ [0, ∥S ∥] - 1) is a service class S i . S i = {si1, si2, …, s im } consists of all concrete services that potentially have different QoS values and have the same functionality. Each service has a QoS Vector Q s = {q1 (s), q2 (s), …, q r (s)}, which is used to describe all the QoS attributes’ values of service s. The i-th QoS attribute’s value for it is defined by function q i (s). Then, for a given composite service CS = {s1, s2, …, s n }, it consists of n (n ∈ [1, ∥S ∥]) service components, its QoS vector is QCS = . The i-th QoS attribute value of CS is . There are several different structures for service composition. Sequential structure is considered in this paper, because loop structure, branch structure, and parallel structure can be transformed to the sequential structure. The transformation techniques used in this paper is presented in [20]. Thus we can aggregate the corresponding values of component services to compute the .
Problem of service selection is a path-finding problem in a graph. In order to solve it, an abstract workflow for a composite service is considered a weighted directed acyclic graph G =< N, E >, as shown in Fig. 1, where N = S ∪ D ∪ S1 ∪ S2, …, S n . S is the virtual starting node and D is the target node. S and D have no concrete service set. E is the set of all the edges in the graph. An edge indicates that one service calls another service after it has been executed completely. A composite service instance is a path P in Fig. 1.
The goal of NBACO algorithm is to find a Pareto Set of path P such that each P’s QoS attributes are minimized, i.e., it satisfies:
In Equation (1), and represent the path P’s calculation functions of response time and cost, respectively. In Equation (2), g i (P) ≤0 defines m’s constrains. The goal is to minimize and . The calculation of the objective functions related to response time and cost have been discussedin [16].
ACO (Ant Colony Optimization) algorithm is used to solve the path-finding problem. Ants travel the search space to find the optimal path. The size of searching space decides the efficiency of ACO algorithm. It is a better way to reduce the searching space in order to improve the efficiency of the algorithm. In NBACO algorithm, a skyline query process is applied before ACO algorithm begins to run. The goal of the skyline query is to abandon some services which maybe not be the candidate service in every service class S. It will produce a subset of service class S. Then a new graph is constructed by the subsets of all service class. Bi-ant colony optimization algorithm is used to find the best optimal path over thenew graph.
Service class processing based on Skyline
Each service class S has many concrete services. As the number of concrete sevices increases, the number of composite service instances will grow exponentially. To improve searching speed, the searching space must be shrunk. Since not every concrete service is a potential candidate, a skyline query process can be applied to shrink the service set of service classes.
In the NBACO algorithm, the preposition skyline query (PSQ) algorithm which is presented in [21] is used. PSQ algorithm is a parallel skyline approach based on Map-Reduce framework. It first divides a service class S into a number of subsets. In each map task, a skyline query algorithm is used to shrink a subset. The reduce task merges all query results made by each map task and uses a skyline query algorithm to shrink the results. Then a subset of service class S, denoted as SKY (S) is produced (see Definition 1). The key point of a skyline query algorithm is dominance (see Definition 2).
In order to improve the efficiency of PSQ algorithm, a filter mechanism is then applied. A master is added to the Map-Reduce framework. Each map task selects the best result from query results and sends it to the master after a skyline query is finished. The master receives the results and selects a best result as a filter value. Then it sends the filter value to the map tasks which have not begun to run a skyline query. These map tasks firstly filter the subset using the received value and then run a skyline query. The process of PSQ algorithm is explained by an example, see Fig. 2.
In Fig. 2, if a service class S is divided into four subsets and sent to four map tasks, the process of PSQ algorithm includes six steps: Subset is filtered by filter value. If the value is null, ignore it. Each map task runs a skyline query for each subset. The query results are produced. A local filter value is selected and sent to master. Master receives all local filter values and select a global filter value. The global filter value is sent to the map tasks which have not begun to run. If all map tasks begin to run, ignore it. Each map task sends the query results to reduce task. Reduce task executes a skyline query and produces a new subset which is SKY (S).
PSQ algorithm uses a filter value to shrink the subset. The efficiency of the PSQ algorithm can be improved.
NBACO algorithm
In an ACO algorithm, by one iteration, each ant finds a whole path which satisfies the constraints. An ant finds a path by going through the graph. The pheromone information is left in the edges after the ants visited. The desirability of the edges is conveyed by these trails, and they will be taken into account by the following ants. The objective of this paper is to minimize the response time and the cost. For this purpose, we set two weights for every edge. Every edge has a pheromone matrix and heuristics function in a single ant colony. A composite state transition rule is used in NBACO algorithm. The rule is computed by using two pheromone matrices and heuristic functions. We will define the key factor and give pseudo code of NBACO in detail in this section.
Heuristics functions
Every objective has a heuristics function. The heuristics function is used to direct the search for the ants. Every edge in the graph is assigned a value that contains the heuristic information. Heuristic functions of a given edge e
ij
are Equations (3) and (4).
In Equations (3) and (4), S i represents the set of node that an ant has searched until node i. k represents a member of S i . q t (k) represents the value of response time of node k. q c (k) represents the value of cost of node k. q t (j) and q c (j) have the same meaning as q t (k) and q c (k). The heuristic information of edge e ij are η t (i, j) and η c (i, j) respectively.
Pheromone contains important part of priori knowledge. Pheromone of every edge is set initially with the same value, usually noted as τ0. Pheromone is changed after each generation. There are two kinds of change policy, updating and evaporating. Pheromone will be updated on the ant searching path. The amount of increased pheromone is related to the quality of paths in the solutions, that is Pareto Set, defined in Equations (5) and (6).
τ
t
(i, j) is the response time’s pheromone before updating and is the response time’s pheromone after updating. τ
c
(i, j) is the cost’s pheromone before updating and is the cost’s pheromone after updating. In Equation (7), M
t
is the response time’s mean value in the current solutions. In Equation (8), M
c
is the cost’s mean value in the current solutions.
In Equations (7) and (8), n is the number of paths in the solutions. Once all ants have found paths in each generation, the pheromone will be updated in the paths of current Pareto Set. The smaller M t and M c are, the better Pareto Set is. When Pareto Set is better, the pheromone is more and the effect is greater for the following ant.
There is a pheromone evaporating which is performed by ants in each generation. This formula is Equation (9).
In Equation (9), ρ is in the range [1] representing evaporating rate.
Each ACO algorithm has a State Transition Rule(STR). The rule decides an ant moving direction. It plays an important role. In NBACO algorithm, a Combined State Transition Rule(CSTR) is used. The CSTR groups heuristic and pheromone information of the response time and cost attributes, as in Equation (10).
In Equation (10), τ t (i, j) is the response time pheromone matrix. τ c (i, j) is the cost pheromone matrix. η t (i, j) is the response time’s heuristic function. η c (i, j) is the cost’s heuristic function. α is weighting parameter of pheromone information. β is weighting parameter of heuristic information. λ is a user-defined parameter that is between 0 and 1. N (S P ) is the optional edges by ant k in currentnode i.
In NBACO, a roulette wheel is used to decide the next node which an ant moves. The probability of each next node j is computed by prob k (i, j).
The process of the NBACO algorithm for multi-objective service selection is: start with PSQ algorithm for each service class to produce SKY(S); k ants go to find the m solutions in each generation; then form and update Pareto Set; after n’s iteration, the last Pareto Set is the optimal solutions. The pseudo code for the algorithm is shown in Algorithm 1 and Algorithm 2.
use PSQ algorithm to shrink S and get SKY (S)
{Construct graph, see Fig. 1 using all SKY (S).}
Construct_Graph()
{Initialize pheromone trials on the edges in search space.}
Initialize_Pheromone()
{finding a composite service. Algorithm 2}
P = Find_Path()
{evaluate path.}
Evaluate(P)
{if the path is non-dominated, it is added to Pareto Set}
Insert(Pareto Set, P)
{pheromone evaporate on each edge. Equation (9)}
Pheromone_Evaporate ()
{deposit pheromone trails in the path related to the current
Pareto Set. Equations (5), (6)}
Global_Pheromone_Update()
{initialize P.}
P = ø;
{get the begin node}
c_node = b_node
P.add(c_node)
flag = F
{consideration of the possible nodes for the ant to select}
Ni = Possible_Nodes(c_node)
{check if there is any possible node}
{calculate heuristic value using Equations (3) and (4)}
Calculate_heuristic();
{Set the probability of being chosen using Equation (9)}
Calculate_probability();
{select the next node against probability}
n_node = Select_node(c_node, Ni)
{insert the node to the path}
P.insert(n_node)
{the ant goes to the next node}
c_node = n_node
{check if the node is the target one}
return(P)
Experimental evaluation
In this section, an experimental scheme and evaluation of NBACO algorithm for multi-objective service selection problem are presented. The goal of the algorithm is obtained the smallest response time and cost of the CS. The PSQ algorithm is a parallel skyline approach based on Map-Reduce framework. So the experiments were conducted on a six-node shared-nothing cluster. Each node is equipped with one Core(i7), 2.93GHZ, 2GB RAM. All nodes are connected by Giga Bit Ethernet network. The parallel solutions are executed with six nodes by default. One node does reduce task and the other five nodes do map task. We utilized the default configuration in Hadoop. The algorithm is implemented by C++ & Java programming languages. One comparison is made among NBACO, MOACO and Co-Evolution, and the other comparison is made between NBACO with PSQ and NBACOwithout PSQ.
Experimental setup
In our evaluation experiments, a huge number of services are needed and the services require different distributions of QoS attributes value in order to test our algorithm. We use a publicly available synthetic generator to generate three datasets. The three datasets are respectively a correlated data set(c_data), an anti-correlated data set(a_data) and an independent data set(i_data). The QoS parameter values in c_data are positively correlated. The QoS parameter values in a_data are negatively correlated. The QoS parameter values in i_data are randomly. There are 100000 QoS vectors in each dataset. One vector has the two QoS attributes of response time and cost for a concrete service. In order to test the effectiveness of NBACO algorithm in all kinds of situations, we assume a composite service is respectively consists of 10 different abstract service classes, 20 different abstract service classes and 40 different abstract service classes. Every dataset generated above is randomly divided into 10 different service classes, 20 different service classes and 40 different service classes. According to this method, nine different scale test cases are generated, which are shown in Table 1. For example, case 4 represents a composite service with twenty abstract service classes, and in every service class there are 5000 concrete services from the anti-correlated data set. The other test cases are analogy. Each QoS vectors contains two random values which represent the values of the response time and cost attributes. The global constrain is that the overall utility value [4] is bigger than 0.1.
The parameters of the algorithm is important. But in this paper we do not make effort to find the best parameter settings. The parameters of the other two compared algorithms are set as the same as the original papers. In NBACO the parameters are set as below: the number of ants k in each generation is 50, ρ is 0.25, α is 0.5, β is 2, and λ is 0.5.
The convergence of the algorithm
In this experiment the NBACO algorithm, Co-Evolution algorithm and MOACO algorithm are compared in convergence. The cases in Table 1 are run to evaluate the convergence of the algorithms. Firstly, the Hyper-volume value of current Pareto Set is computed in every fixed number of fitness number. Then a line chart of Hyper-volume value is drawn to observe the convergence of the algorithms. The convergence condition is that the difference between the former Hyper-volume value and the current Hyper-volume’s value is less than 1*10–6. All of the line chart is shown in Fig. 3.
From Fig. 3, we observe that the algorithms are of convergence trend when the fitness number is bigger than 1*106. NBACO algorithm’s convergence speed is much faster than other two algorithms.
Analysis based on Hyper-volume indicator
Comparison among MOACO, Co-Evolution and NBACO algorithms
In this experiment the NBACO algorithm, Co-Evolution algorithm and MOACO algorithm are compared. We run independently each algorithm ten times on each test case. The termination condition for all the algorithms is that the fitness number is 1.0*106. The boxplot of hyper-volume is shown Fig. 4. The obtained max, min, mean and standard variance of Hyper-volume are given in Table 2.
From Fig. 4 and Table 2, we can see that the max value, min value, mean value and median value of NBACO algorithm are bigger than MOACO and Co-Evolution algorithms. The Hyper-volume of MOACO algorithm is the smallest. From case 1 to case 6, the Hyper-volume of Co-Evolution algorithm are close to the NBACO. But in case 7, case 8 and case 9, the Hyper-volume is significantly better than other two algorithms.
Comparison between PSQ and no-PSQ of NBACO algorithms
We make another comparison, it is between the NBACO algorithm with PSQ and NBACO algorithm without PSQ. We run independently each algorithm ten times on each test case in hadoop. The termination condition for all the algorithms is that the fitness number is 1.0*106. The boxplot of hyper-volume is shown Fig. 5.
From Fig. 5, we can see that the max value, min value, mean value and median value of NBACO algorithm with PSQ are bigger than NBACO algorithm without PSQ. With the quantity of services in each service class decreasing, the difference between the two algorithms is smaller.
Performance analysis based on PSQ
In the experiments, we evaluate the effect of service quantity on our NBACO algorithm. We assume that composite scale is 10. We generate separately five datasets for a_data, c_data and i_data. The size of the five datasets are respectively 1000,10000, 100000, 500000, and 1000000. We run independently NBACO algorithm with PSQ and without PSQ ten times on each data size. The termination condition for all the algorithms is that the fitness number is 1.0*106. Then we compute the average time of ten execution times and draw a figure see Fig. 6.
From Fig. 6 the size of service class has an effect on the execution time of the algorithms. When the size is smaller, the NBACO algorithm with PSQ spend more time than without PSQ. But when the size is larger or largest, it spend less time than the NBACO with PSQ.
Conclusions
In this paper a NBACO algorithm is proposed for solving multi-objective service selection problem. At beginning of the NBACO, a preposition skyline query is applied for each service class. Its goal is to reduce the quantity of the candidate services. In the algorithm we define two heuristic functions and two policies for pheromone updating. A combined state transition rule is used. The rule computes the probability of every node by an ant selected. The probability is computed by the heuristic and pheromone information for each node. We generate nine different test cases to test NBACO, Co-Evolution and MOACO algorithms. The results show that NBACO is more effective than MOACO and Co-Evolution for solving multi-objective service selection problem. We evaluate the effect of service quantity on NBACO algorithm with PSQ and without PSQ. The results show that NBACO with PSQ is fit for a large number of candidate services. The NBACO algorithm provides a useful way to solve the multi-objective service selection problem. It can also be applied to solve general multi-objective selectionproblems.
Footnotes
Acknowledgments
This work was supported by the National Natural Science Foundation Program of China (61572116, 61572117), the Special Fund for Fundamental Research of Central Universities of Northeastern University (N150408001, N150404009), and National Natural Science Foundation of China (No. 61202085, No. 61300019, No. 61402090).
