Abstract
Software as a Service (SaaS) is emerging as a leading model for cloud service delivery, enabling service providers to remotely deliver hosted, developed and managed software over the Internet. In parallel, some IT services are moving from traditional Internet services to cloud services based on peer-to-peer (P2P) technologies. However, the P2P-based cloud is a large-scale, heterogeneous and highly dynamic environment whose performance depends strongly on its ability to keep SaaS services persistently available. In this paper, we propose an approach for improving SaaS service availability in order to meet service quality requirements and maintain performance in a P2P-based cloud environment. It is mainly based on a new hybrid clustering mechanism that provides an optimal virtual infrastructure, organizing the system peers into distinct clusters, each represented by a virtual node; together, these nodes form a virtual layer. This layer allows not only the distribution of peer providers but also the formation of condensed areas of each service of interest for a set of neighboring peers, which improves the availability probability of services in specific regions. In addition, we propose a service availability measurement model that is based on the system’s virtual layer and takes into account different entities at different levels. The experimental results show that the proposed approach improves the probability of SaaS service availability and the reliability of the P2P-Cloud system. It addresses the large-scale nature of distributed systems and offers a favorable trade-off between availability, performance and cost when maintaining QoS.
Keywords

Introduction
Over the last decade, cloud computing has been considered the pillar of the future Internet generation and one of the most revolutionary concepts of information technology in the 21st century.
Software as a Service (SaaS) is viewed as a promising IT service model and the leading cloud service model. It consists of delivering hosted, developed and managed software applications to users via the Internet [3, 4]. These services are offered on a pay-per-use or free basis and are accessible from anywhere with an Internet connection, relieving users of the burden of installing and running applications locally [1, 2]. As a result, the operational and financial benefits of the SaaS model are making it the dominant model for IT service delivery and attracting the interest of researchers and professionals. According to a recent report, the U.S. technology research and advisory company Gartner estimates that SaaS services will benefit from significant revenue growth, and that software providers will move from “Cloud-first” to “Cloud only”. This means that “consumption of licensed software will continue to fall, while SaaS and pay-per-use cloud consumption models will continue to grow”. Simultaneously, certain IT services are migrating from traditional Internet services to clouds based on P2P systems [5, 6], which have emerged as an attractive and promising alternative model offering a significant improvement in the design of large-scale distributed systems and in the evolution of Internet architectures. Such systems consist of overlay networks formed by a set of heterogeneous, autonomous, dynamic and interconnected peers that voluntarily participate in the network with equivalent functionality. Each node acts as both client and server, and collaborates and communicates with the others without relying on an expensive and complex central infrastructure. Depending on the degree of centralization, P2P networks can be divided into: 1) hybrid P2P networks, 2) decentralized (pure) P2P networks, and 3) structured P2P networks with a distributed hash table [7, 8].
The union of these technologies is becoming one of the buzzwords of the next-generation industry, and its services have gained popularity because of their mobility, low cost, storage and computing power, etc. Like any technology that evolves constantly, it requires special attention to its various challenges. Under such a P2P-based cloud architecture, the ability of the system to perform its task depends on the reliability of the P2P environment, which in turn depends on the physical and logical architecture of the network and on the nodes composing the system. In other words, task execution is allocated to nodes according to client needs. Therefore, the design of these environments must take into account the physical features of the peers on which the system operates (processor speed, memory size, disk capacity, etc.), the logical characteristics that govern their dynamic behavior (inactivity, inertia, churn, free-riding, etc.), and network characteristics (topology, bandwidth, latency, etc.) [9]. The performance of the system is influenced by these features: any hardware or software failure makes a peer inactive and unable to meet clients’ needs, preventing service providers from properly supplying the requested services. P2P-based cloud service providers must achieve service levels identical to or even better than those of traditional cloud-based services in order to ensure the quality of their services, maintain client loyalty and maximize their sustainable financial gains [10, 11].
Recently, permanent and consistent service availability has become one of the most critical factors for the success of Internet-based services and applications, especially services provided via cloud-P2P environments. Traditionally, availability was limited to local installations of hardware and software resources that businesses and consumers deployed and maintained. With the advance of cloud services, these resources are migrating in large numbers into the cloud, where availability refers to the probability that the system, network, hardware and software that collectively provide these services are available during their use. In addition, service availability in a cloud-P2P system is defined as the property of being accessible and functional at the request of an authorized entity. It reflects the system’s ability to meet customer demands and is also related to the probability of the temporal and functional availability of peer providers [11, 12]. Therefore, ensuring service availability in such service-oriented systems is more than a simple guarantee of server availability. It is also about providing a reliable and robust communication infrastructure between providers and clients that can flexibly and efficiently assure the required functionalities of the services, their availability and other features covered by the term Quality of Cloud Service (QoCS), including response time, performance, speed, scalability, trust, etc. [13, 14]. Achieving high availability in a P2P-based cloud environment is a complex task, mainly because end-users have high expectations and generally do not realize how challenging this requirement is for system architects. Adopting mature and powerful technologies in these environments has become necessary in order to analyze and respond effectively to the current challenges and maintain reasonable performance.
In this context, data mining has emerged as a relatively new and interdisciplinary field of computing. It consists of discovering potentially valid and useful new models from huge amounts of data using methods at the intersection of artificial intelligence, machine learning, statistics and database systems that have been proposed and refined over time. It enables the extraction of structured information and useful knowledge for decision making from unstructured or semi-structured web data sources in order to identify trends, establish relationships between data and predict future behavior, which contributes to reducing personal storage and infrastructure costs as well as providing efficient and secure services to users [1, 2, 15, 16]. Clustering is considered one of the most powerful and widely used categories of data mining, and integrating its techniques in such environments has become very important. It has become clear that the real power of large amounts of data, and of their efficient processing on distributed platforms, will be realized through the use of clustering.
In this paper, we extend the work proposed by Achache et al. in [6] in order to meet the requirements of service availability and maintain performance in a dynamic “hybrid P2P network-based SaaS-Cloud for remote treatment” environment. Since the system proposed in [6] is built on a peer-to-peer network, its performance is highly dependent on its ability to maintain service availability, which is related to the availability of the network peers. In order to estimate service availability from the peers, we propose an approach that improves service availability by overcoming the problems of system failures and peer dynamics. The proposed approach first analyzes the individual behavior of clients (users/participants) and forms profile data for each of them. This profile data is then used to classify and aggregate the system’s clients according to their profile similarities using a hybrid clustering mechanism. The mechanism aims to provide an optimal virtual infrastructure that organizes the system’s peers into distinct clusters, where each cluster is represented by a virtual node and the virtual nodes together form a virtual layer. As the choice of clustering techniques depends on the nature of the analyzed data, we propose to improve the quality of the clustering of user profiles by combining, in a sequential hybridization process, two well-known “hard” and “soft” clustering algorithms: “Hierarchical Agglomerative Classification” and “Thresholded Fuzzy C-Means” [17, 18, 19, 20]. This hybridization suits the nature of the manipulated data, allowing each client profile vector to be automatically associated with the appropriate number of sufficiently close clusters, without knowing a priori either the number of clusters or how many clusters the vector should belong to. Consequently, a virtual layer is formed and added to the architecture of the system proposed in [6].
This layer allows not only the distribution of peer providers but also the formation of condensed areas of each service of interest to a set of neighboring peers, which greatly contributes to improving the availability probability of these services in specific regions of the virtual layer at request time. In addition, the proposed model for measuring service availability is based primarily on the virtual layer of the system and considers different entities at different levels of the system. To achieve the best performance in a service provision system, the model uses different types of availability from the literature [21, 22, 23, 24], refined to include special considerations in their calculation because of their particular use here.
In summary, the main contributions of this work are as follows:
1) We form the profile data of the system’s clients by dynamically monitoring and analyzing their behaviors and their interactions with each other and with the services offered by the system. The profile data collected from each client are represented as a profile vector with several attributes, and the set of all these vectors forms a matrix of client profile vectors.
2) We propose a hybrid unsupervised learning approach that combines, in a sequential process, the Hierarchical Agglomerative Classification algorithm with the Thresholded Fuzzy C-Means algorithm to improve the quality of the automatic semi-fuzzy clustering of clients according to their profile vector similarities.
3) The clustering mechanism provides an efficient representation for forming the virtual layer of the system, resulting in an optimal virtual infrastructure that contributes greatly to improving service availability. We consider the virtual layer as the kernel for estimating the availability of system services and of the peers holding them.
4) In the absence of formal methods for measuring service availability in the P2P networks on which the cloud architecture is built, several availability metrics are redefined and reformulated for use at different levels of the P2P-based SaaS cloud service provisioning system.
5) We have implemented an experimental framework for our approach and adapted it to the PeerSim simulator. A number of preliminary experiments are conducted to explore the performance of the proposed clustering mechanism on profile data. In addition, the performance of the proposed approach is evaluated using a set of simulations and compared with the results previously reported in [6] on the basis of several performance criteria, particularly average service availability.
6) To the best of our knowledge, we are the first to apply clustering algorithms to client profile data in a P2P-based SaaS cloud service provisioning system in order to improve service availability.
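The sequential hybridization named in the contributions can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the coupling of the two stages (agglomerative clustering supplies the number of clusters and the initial centers, fuzzy c-means then refines them), the centroid linkage, and the parameters `cut_distance` and `threshold` are all assumptions made for illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def hac_centers(points, cut_distance):
    """Stage 1 (hard): centroid-linkage agglomerative clustering.
    Merging stops once the two closest clusters are farther apart than
    cut_distance, so the number of clusters is not fixed in advance."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        d, i, j = min(
            (dist(centroid(clusters[i]), centroid(clusters[j])), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cut_distance:
            break
        clusters[i] += clusters.pop(j)
    return [centroid(c) for c in clusters]

def memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership degrees for fixed centers."""
    rows = []
    for p in points:
        d = [dist(p, c) for c in centers]
        if min(d) == 0.0:  # the point sits exactly on a center
            hits = [1.0 if x == 0.0 else 0.0 for x in d]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [x ** (-2.0 / (m - 1.0)) for x in d]
            rows.append([v / sum(inv) for v in inv])
    return rows

def update_centers(points, u, old_centers, m=2.0):
    """Fuzzy c-means center update: mean weighted by membership ** m."""
    centers = []
    for i, old in enumerate(old_centers):
        w = [row[i] ** m for row in u]
        total = sum(w)
        if total == 0.0:  # no point supports this center; keep it as is
            centers.append(old)
            continue
        centers.append([sum(wn * p[k] for wn, p in zip(w, points)) / total
                        for k in range(len(old))])
    return centers

def hybrid_cluster(points, cut_distance, threshold=0.3, m=2.0, iters=50):
    """Stage 2 (soft): refine the HAC centers with fuzzy c-means, then keep
    for each point every cluster whose membership reaches the threshold
    (falling back to the single best cluster if none does)."""
    centers = hac_centers(points, cut_distance)
    for _ in range(iters):
        centers = update_centers(points, memberships(points, centers, m), centers, m)
    assigned = []
    for row in memberships(points, centers, m):
        keep = [i for i, v in enumerate(row) if v >= threshold]
        assigned.append(keep or [row.index(max(row))])
    return centers, assigned

# Hypothetical 2-D profile vectors: two tight groups and one profile in between.
profiles = [[0, 0], [0, 1], [1, 0], [11, 11], [11, 10], [10, 11], [5.5, 5.5]]
centers, assigned = hybrid_cluster(profiles, cut_distance=8.0)
```

In this toy data set the in-between profile ends up attached to both clusters while the others belong to exactly one, which is the semi-fuzzy behavior the thresholding step is meant to produce.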
The remainder of this paper is organized as follows: The literature review on enhancing and maintaining availability in cloud computing is presented in Section 2. Section 3 provides a complete and detailed description of the approach we propose to improve service availability and performance in cloud computing SaaS based on a hybrid P2P network for remote processing. We describe the experimental framework and present the results in Section 4. A comparative study is discussed in Section 5. Finally, Section 6 concludes the paper with future research directions.
Service availability is considered as one of the key features to maintain QoS in service-oriented distributed environments. Many research efforts and actual studies have been devoted to developing mechanisms to ensure and increase resource availability and performance in distributed environments, especially in cloud computing. The authors in [25] presented a taxonomy to classify research papers and solutions from cloud providers. By adopting this taxonomy proportionally, this section discusses some of the works that aim to improve and maintain availability in cloud computing environments.
Some authors base their availability guarantee approaches on Service Level Agreements (SLAs) between service providers and users, which refer to the probability of providing a service according to defined requirements. In [13], the authors proposed a full SLA management solution for cloud computing from a customer-centric perspective in order to maintain the required availability of cloud services. The approach measures and monitors cloud service availability using systems that detect client requests for cloud services and a black-box monitoring system that collects data sets (success or failure, and latency). In addition, the authors used goodness-of-fit analysis (probability distribution) to assess how well availability is represented. In [26], the authors proposed an approach for automatically generating service rankings for peer-to-peer cloud computing systems, based on a ranking over multiple confidence factors: SLA monitoring, the resource equity algorithm, the payment method and the tariff scheme. The approach automatically measures the values of QoS parameters, SLA violations, SLA monitoring and SLA penalties. This SLA information is then analyzed to generate the ranking list for the user as a cloud provider. This approach can help the user obtain a trusted service at lower cost.
In other research papers, the availability of cloud services is improved by developing new policies or adopting standard replication and scheduling methods. Saadat and Rahmani [27] proposed a pre-fetching-based dynamic data replication algorithm (PDDRA). Based on the file access history of the grid sites, PDDRA predicts future needs and pre-fetches the files to the requesting grid site, so that the next time that site needs a file, it is available locally. This significantly reduces access latency, response time and bandwidth consumption. However, the optimal location of replicas was not studied in that article. In [28], the authors presented an algorithm that takes into account the number of file requests and the response time to place the replica in the optimal cluster site. As a result, the average execution time of the tasks is minimized. The authors in [29] proposed a data replication system for cloud computing that uses a frequent pattern mining algorithm to identify the popularity of data and generate an adaptive threshold for replication, which proved to be a better mechanism for ensuring an efficient replication process in the cloud system. System availability and data failure probability are measured to identify the location where replicated data will be stored. Mohammed et al. in [30] proposed a Light-Weight Data Replication Strategy (LWDRS). The strategy adaptively selects data files to replicate using a lightweight time series technique called Holt linear exponential smoothing (HLES), which analyzes the recent pattern of data file requests and provides predictions for future data requests. Once a replication factor based on file popularity exceeds a specific threshold, the replication signal is triggered. The strategy uses heuristic search to dynamically determine the number of replicas as well as the actual data nodes for replication.
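To make the forecasting step behind LWDRS more concrete, the following sketch implements the textbook form of Holt's linear exponential smoothing and a threshold check on the forecast. It is only an illustration: the smoothing constants, the function names, and the use of the raw forecast as the popularity measure compared against the threshold are assumptions, not details taken from [30].

```python
def holt_forecast(series, alpha=0.5, beta=0.3, horizon=1):
    """Forecast `horizon` steps ahead with Holt's linear exponential smoothing.

    alpha smooths the level, beta smooths the trend (both assumed values).
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Common initialization: first value as level, first difference as trend.
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        previous = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - previous) + (1 - beta) * trend
    return level + horizon * trend

def should_replicate(request_counts, threshold, alpha=0.5, beta=0.3):
    """Hypothetical trigger: replicate when the predicted number of requests
    for the next period exceeds the threshold."""
    return holt_forecast(request_counts, alpha, beta) > threshold

# A file whose request count grows steadily is flagged for replication,
# while one whose popularity is fading is not.
rising = should_replicate([10, 20, 30, 40], threshold=45)
fading = should_replicate([40, 30, 20, 10], threshold=45)
```

Because the method tracks both a level and a trend, a steadily growing request count is extrapolated beyond its last observed value, which is what lets a replication decision anticipate demand rather than react to it.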
In [31], the authors proposed a data replication based scheduling (DRBS) technique that integrates data replication and task scheduling in order to reduce data access time in the cloud, improve load balancing and improve the performance of data-intensive applications. In [22], the author clarified the quantitative and qualitative definitions of availability for end-to-end cloud computing services by considering all components of the service infrastructure. The results show that high availability requires redundancy for components such as network connections and data centers. Other replication strategies and algorithms have been proposed, including: the Dynamic Data Replication Strategy (D2RS) [32], the Modified Fast Spread (MFS) strategy based on the standard Fast Spread technique [33], the Bandwidth Hierarchy Replication (BHR) strategy [34], the Modified BHR strategy [35], the Hierarchical Data Replication (HRM) model [36], the Cost-Effective Dynamic Data Replication Management scheme (CDRM) for a cloud storage system with support for the Hadoop Distributed File System (HDFS) [14], and a data replication algorithm based on the minimum-cost maximum-flow problem [37]. These studies show that replication is an essential technique for improving data accessibility and response time, maximizing the rate of successful task execution and minimizing bandwidth consumption, thereby ensuring availability and improving the performance of large-scale data applications in the cloud.
Other researchers have considered fault-tolerance techniques to protect cloud computing systems from any kind of hardware or software failure and maintain service availability. In [38], a replication technique called BVACQ (Binary Vote Assignment on Cloud Quorum) has been proposed to preserve data availability and consistency in case of cloud failure. This technique combines replication and fault tolerance mechanisms by considering the fragmentation of the distributed database. In addition, it takes into account the management of transactions, which helps preserve consistency and increase data availability and system reliability. In [40] the authors proposed a fault tolerance mechanism to ensure application availability by addressing software rejuvenation based on virtual machines. A software rejuvenation agent is installed in each virtual machine and monitors applications for software aging and application failures. In this solution, the detection is done at the application level, where replication is done at the VM level, which raises the issue of propagating application failure to waiting VMs. Singh et al. in [41] proposed intelligent failover strategies for cloud data centers using built-in checkpoint algorithms, which include support for load balancing algorithms and multi-level checkpoint. The envisaged failover strategy will run at the application layer and provide high availability for the Platform as a Service (PaaS) functionality of cloud computing. The study [42] investigated the importance of adaptive resource management for fault tolerance in cloud computing applications. The authors extended this concept with an online controller to provide a heuristic algorithm to restore application requirements at runtime in case of failure. Markov chains and queuing network algorithms are used to estimate the availability and performance attributes of a different implementation. 
The proposed approach increased availability and reduced the degradation of system response times compared to traditional static schemes. Other efforts [43, 44] have been devoted to developing middleware and fault-tolerant system architectures able to handle various software failures in order to ensure the availability and reliability of server applications in a virtualized cloud VDC environment. In addition, some works [45, 46, 47] have developed interesting approaches to achieve adaptive and scalable service availability in OpenStack cloud computing environments and to ensure continuous operation of all cloud services in case of failure. In these works, the researchers attempt to address the weaknesses of existing fault-tolerance approaches by proposing improvements in failover, monitoring and weighting mechanisms, which have helped to minimize average response time and operating costs and to improve the degree of required service availability and system performance. Table 1 summarizes the main works above, highlighting their advantages and limitations. Additionally, it presents the main cloud computing performance criteria that have been considered, including: (a) availability, (b) scalability, (c) reliability, (d) cost reduction, (e) execution time reduction, (f) bandwidth consumption, (g) load balancing.
By examining these works closely, we find that the interpretation of availability may differ from one paper to another and from one provider to another. Most of these techniques have a positive effect on system availability and performance and help avoid frequent outages; however, they cannot guarantee the desired availability because of their various limitations. In addition to the dynamic volume of data in distributed environments, which itself requires careful management, the techniques mentioned above generally produce huge amounts of data circulating in the network, which limits their ability to maintain system performance. This large amount of data requires more storage space and is costly and complex to maintain and manage consistently, which unnecessarily increases management and storage costs. These techniques also require global knowledge of the network, which makes them more difficult to implement and limits the performance of the search for the required data. Besides processing and execution delays in wide area networks, difficulty in accessing data, increased response time, limited network bandwidth, lower priority of fault recovery and longer peer inactivity time, they create a bottleneck that is very sensitive to simultaneous data access in the distributed environment.
Summary of the main papers on improving availability in the cloud through replication, scheduling and fault tolerance mechanisms
In response to these limitations, another research path has advocated a different policy that applies clustering techniques to conveniently and efficiently manage dynamic and heterogeneous cloud resources. Malathy et al. in [48] proposed a resource clustering approach to form clusters based on the identification of the resource usage distribution for a group of nodes with similar resource usage patterns. This
approach aims to improve the performance of the cloud environment in resource utilization. It uses two complementary techniques including: resource utilization histograms to provide statistical information for resource capacities and resource clustering based on Expectation Maximization clustering algorithm “EM” used to obtain compact representation of a set of similar cloudlets to achieve scalability. The experimental results show that the proposed approach is able to provide high accuracy for resource availability and discovery. However, it suffers from low efficiency and high network traffic when the number of clusters becomes large. The paper [49] proposes an architecture based on clustering virtual machines in cloud data centers in terms of various attributes such as RAM, OS type, hardware configuration, etc. Resource clustering applies the K-Means clustering method that helped virtual machines reconfigure themselves and facilitate resource scheduling. It will lead to a better user experience with higher resource availability and scalability. The system proposed in [50] focuses on clustering virtual machines taking into account the size of the requested task and the bandwidth level. The main objective of the proposed virtual machine clustering is to match the task to the appropriate virtual machine for execution with the bandwidth to achieve high availability and maintain high reliability in cloud computing. The proposed algorithm proves its performance offering reduced task execution time with higher efficiency. Wu et al. [51] proposed a classification strategy for cloud service resources. The research presented an improvement to the original naive Bayesian classification algorithm by introducing weights to calculate similarity for different features in the resource dataset. In addition, they implemented a parallel programming model using Hadoop combined with MapReduce to implement the parallelization of the improved classification algorithm. 
The experimental results show that the proposed strategy is effective in large-scale dynamic cloud environments and offers improved resource classification performance via parallelization. However, this research does not consider applications and only classifies resources of different types. Yoori et al. in [52] proposed a method for dynamic clustering of similar resources by applying the self-organizing map (SOM) algorithm and the k-means algorithm in hybrid cloud environments. Based on the resulting clusters, a cost-effective method that reflects the characteristics of applications is applied to recommend resources to cloud users. Experimental results demonstrate the effectiveness of the proposed methods in terms of execution time, availability and resource utilization costs. In [53], a novel technique for optimal resource discovery and dynamic resource allocation is proposed. The technique uses the Modified Hierarchical Agglomerative Clustering (MHAC) algorithm for resource discovery and tree construction. Subsequently, the resources are allocated using the hybrid artificial bee colony and cuckoo search (HABCCS) algorithm. Here, the artificial bee colony is used to optimize the tree construction path and the cuckoo search is used to modify the artificial bee colony algorithm. The reported results indicate that this technique improves the availability of the desired resources, which are allocated efficiently with minimal computation time. Meenakshi et al. [54] proposed an efficient cloud resource provisioning technique using k-means clustering and the gray wolf optimization (GWO) partitioning technique. The technique uses GWO for prioritization and k-means clustering to analyze the QoS metrics, so as to allocate the cloud resources in the best possible way and meet the users’ demands.
The experimental results prove the performance of the proposed approach in terms of clustering accuracy effectively providing satisfactory service QoS in terms of availability, load balancing, memory usage and execution time. These research papers have shown that existing resource and service clustering approaches tend to focus on classifying small datasets and their performance decreases in large-scale environments. Table 2 summarizes the above approaches focusing on their benefits and drawbacks. Further, it introduces the main considered performance criteria, namely: (a) availability, (b) scalability, (c) cost reduction, (d) execution time, (e) bandwidth consumption, (f) load balancing, (g) accuracy.
Summary of the main papers on improving availability in the cloud through clustering techniques
Since the durable availability, accessibility and efficient management of services in distributed environments are of vital importance for both service customers and providers, integrating clustering techniques from data mining into service-oriented distributed environments such as SaaS models has become necessary in order to maintain reasonable performance [55, 56]. Recently, a limited number of theoretical and practical studies favoring the integration of clustering techniques in the SaaS model of cloud computing have been developed. These works were surveyed in a recent paper [57], where the authors found that the different approaches focus mainly on assessing the quality of SaaS services rather than on improving the SaaS solutions provided. Although the studies cited in this section do not address the use of clustering techniques to improve the availability of SaaS-Cloud services based on P2P networks, we consider that it could be an important step towards a highly available P2P-based cloud service provisioning platform.
In this section, we first give a brief description of the network model of the SaaS-Cloud system based on an H-P2P network for remote service processing, proposed by Achache et al. in [6], on which our solution is executed. Then, we describe the metadata structure used to represent the system’s clients. Finally, we present our proposed approach, which aims to enhance service availability and maintain a reliable and high-performance P2P network-based SaaS cloud service delivery system.
System model
Achache et al. in [6] took a first step towards distributing the SaaS model for the research field. The system proposed by the authors consists of a new cloud architecture that uses a hybrid P2P network (H-P2P) as infrastructure to provide services in SaaS form to peers who do not have them. It allows peers to use and execute SaaS services that are initially provided by the cloud but hosted and executed by the peers themselves. These services are executed by invocation, using Remote Method Invocation (RMI) technology, on the peers hosting them. This architecture is a distributed dynamic environment with a minimal infrastructure covering four types of entities: 1) the administrator, responsible for feeding the system with services and provisioning them; 2) Partial Cloud Servers (PCS), which are connected on a network layer and are responsible for service indexing; 3) Participant Clients, who host, provide and remotely execute services on behalf of the User Clients; and 4) User Clients, who exploit these services. Together, the PCS and the clients, which are geographically distributed, form the sets of super peers and normal peers respectively, hence constituting the H-P2P network. As a result, the normal peers are grouped into clusters managed by super peers, forming the real network layer of the system. In addition, the clients exploit the concept of P2P networks to connect directly to each other when a service is invoked, without any third-party intervention.
Methodology
The proposed approach tracks the dynamic and rapidly changing behavior of peers in the real network layer of the service delivery system in order to collect their information. This information on individual behaviors forms profile data, organized as a matrix of client profile vectors. A clustering mechanism uses this matrix to provide an optimal virtual infrastructure that reorganizes the peers in the network. In other words, the proposed clustering mechanism performs a new grouping of the network peers in addition to their clustering in the real network layer. The peers are dispatched, according to the similarity of their profiles, into several clusters forming a Virtual Layer (VL). Each cluster is represented by a Virtual Node (
In our service delivery system, the proposed service availability measurement is based primarily on the virtual layer: the availability probability of a service is computed by taking into account the availability probability of the virtual nodes and, through them, the availability probability of the service providers in the real network layer. To measure the availability of these entities, we use different types of availability metrics from the literature: “time-based availability, activity-based availability, availability based on the presence of k-of-n”. Our proposal is structured in three major steps, illustrated in the general scheme in Fig. 1 and detailed separately in the next subsections.
Global scheme of the proposed approach.
3.2.1.1. Client profiles data structure
Client profile information in the P2P-based SaaS-cloud service delivery system is maintained in a 2D matrix. A client profile is a collection of information that characterizes the client's behavior in the network. We denote
Client profile vector matrix.
Each profile vector
Hosted services: this criterion provides an indication of all the services hosted by a peer. Interest towards services: is a
Time-based features: measure the temporal behavior of peers: (a) connection time, (b) connection duration and (c) temporal availability.
Activity-based features: characterize the level of peer activities and include: (a) individual availability based on the interaction (activity), (b) number of received service requests, (c) number of successful service provisioning requests, (d) number of failed service provisioning requests, (e) number of search requests and finally (f) the average of services hosted during a session.
The above features can vary from one session to another and during one session.
3.2.1.2. Feature normalization
Data normalization is the process of scaling raw data without changing their nature; it aims to produce high-quality clusters and improve performance [61].
Several attributes of the profile data have a very large range; for example, the value of “number of received service requests” can vary in the range [0, 150]. These features are therefore normalized using the Z-score method, which is widely regarded as an effective normalization method for obtaining accurate and efficient clustering results. The Z-score technique scales the values of an attribute X based on its mean and standard deviation to ensure that the distribution of features has a mean
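As a minimal illustration of Z-score scaling, the sketch below standardizes one feature column. The sample values and the function name `z_score` are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
from statistics import fmean, pstdev


def z_score(values):
    """Scale a feature column to zero mean and unit standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    if sigma == 0.0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical "number of received service requests" values in [0, 150].
requests = [0, 30, 60, 90, 150]
scaled = z_score(requests)
```

After scaling, features with wide ranges no longer dominate distance computations between profile vectors.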
3.2.2.1. Clients’ profiles data clustering
The proposed data clustering mechanism combines the two algorithms HAC and TFCM in order to overcome their limitations and benefit from the advantages of both algorithms. It starts with the application of the HAC algorithm on the data. The HAC algorithm takes as input a user profile matrix formed by a set of profile vectors specified to each system peer
Step 1: Considering each of the profile vectors
Calculate the hops between each successive pairs of partitions
Search for the largest hop among the calculated hops between each two successive partitions: Since we are trying to minimize the loss of inter-class inertia, the final partition to be considered is the one that precedes the largest hop-loss. Retrieve the partition
Hierarchical ascending clustering algorithm with Ward’s criterion.
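The sketch below shows one way to read the procedure above: agglomerative clustering with Ward's criterion, followed by selection of the partition that precedes the largest hop. It is a naive illustration rather than the authors' implementation. The function names are hypothetical, and treating the Ward merge cost (the loss of inter-class inertia caused by a merge) as the "hop" is an assumption.

```python
def ward_cost(a, b):
    """Loss of inter-class inertia caused by merging clusters a and b."""
    na, nb = len(a), len(b)
    ca = [sum(col) / na for col in zip(*a)]  # centroid of a
    cb = [sum(col) / nb for col in zip(*b)]  # centroid of b
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * d2


def hac_partition_before_largest_hop(vectors):
    """Merge clusters greedily, then return the partition preceding the largest hop."""
    clusters = [[tuple(v)] for v in vectors]  # start with one cluster per vector
    partitions = [list(clusters)]
    costs = []
    while len(clusters) > 1:
        # Pick the pair of clusters whose merge loses the least inertia.
        cost, i, j = min(
            (ward_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        costs.append(cost)
        partitions.append(list(clusters))
    if not costs:
        return partitions[0]
    largest = max(range(len(costs)), key=costs.__getitem__)
    # partitions[largest] is the state just before the costliest merge.
    return partitions[largest]


# Hypothetical two-dimensional profile vectors forming two distant groups.
profiles = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
partition = hac_partition_before_largest_hop(profiles)
```

The number of clusters in the returned partition is not fixed in advance, which is what allows it to seed a later fuzzy clustering step.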
Based on this partition, information and data required for initialization of the TFCM algorithm can be extracted. Considering the user profile matrix VPM presented previously
Step 1(Initialization step). First, the number of clusters
The result will be a
Finally, we arbitrarily assign a real value to the fuzziness degree parameter “
Step 3. Updating the membership matrix
Step 4. If
Threshold step: Evaluate the degree of membership
Normalization step: this step allows to calculate the new degrees of membership
Modified thresholded fuzzy C-means.
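To make the threshold and normalization steps concrete, the sketch below computes the standard fuzzy C-means membership degrees of one point, zeroes those that fall below a threshold, and rescales the remainder to sum to 1. The function names, the threshold value and the fallback used when every degree falls below the threshold are assumptions, not details taken from the algorithm described above.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point with respect to each cluster center."""
    dists = [math.dist(point, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # The point coincides with a center: full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]


def threshold_and_normalize(memberships, threshold):
    """Drop weak memberships, then rescale the rest so they sum to 1."""
    kept = [u if u >= threshold else 0.0 for u in memberships]
    total = sum(kept)
    if total == 0.0:
        # Assumed fallback: keep only the strongest membership.
        best = max(range(len(memberships)), key=memberships.__getitem__)
        return [1.0 if i == best else 0.0 for i in range(len(memberships))]
    return [u / total for u in kept]


# Hypothetical point and centers.
raw = fcm_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)], m=2.0)
final = threshold_and_normalize(raw, threshold=0.2)
```

With such a threshold, a peer stays in only those clusters to which it is meaningfully close, so the number of clusters per peer does not have to be set beforehand.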
As a continuation of the HAC algorithm, the TFCM algorithm allows first of all to add a membership degree factor
3.2.2.2. Virtual nodes and virtual layer formation
The set of clusters obtained in the final partition is denoted by
Therefore, the fuzzy client partition and the resulting virtual nodes together form a virtual layer that will be added to the architecture proposed by Achache et al. [6].
As mentioned above, the virtual layer allows the formation of condensed locations of each service, which increases the probability of its availability at the time of its request in a specific region. In addition, since the geographic location of peers is one of the characteristics taken into account during the clustering process, peers from different and similar geographic areas and time zones tend to be grouped together in the same cluster. This will have a positive impact on service availability and system performance. On the one hand, geographical proximity between neighboring peers increases the probability of finding close nodes that respond to service demands. On the other hand, the geographical diversity of neighboring peers is advantageous in terms of availability allowing services to be hosted and executed by peers in different geographical locations and to have a presence in several physical areas.
When a system client searches for a service, the search is first carried out in the virtual layer rather than across all system users: the requesting client contacts only the providers that are part of its virtual cluster. If this search fails, it is relaunched in the “Partial Cloud Servers” layer: the requesting client sends a service search request to its own Partial Cloud Server in order to locate the service and find a provider. This increases the availability probability of the required services, reduces the search space and improves the response time.
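The two-level lookup described above can be sketched as follows. The data structures (a dictionary of virtual neighbors and a service index held by the Partial Cloud Server) and all names are hypothetical simplifications for illustration only.

```python
def find_provider(service, virtual_neighbors, pcs_index):
    """Look for a provider in the virtual cluster first, then fall back to the PCS.

    virtual_neighbors: dict mapping peer id -> set of services hosted by that peer
    pcs_index: dict mapping service name -> list of peer ids indexed by the PCS
    """
    # Level 1: only the peers of the requester's virtual cluster are contacted.
    for peer, hosted in virtual_neighbors.items():
        if service in hosted:
            return peer, "virtual layer"
    # Level 2: the request goes to the requester's own Partial Cloud Server.
    for peer in pcs_index.get(service, []):
        return peer, "partial cloud server"
    return None, "not found"


# Hypothetical peers and services.
neighbors = {"p1": {"ocr"}, "p2": {"translate", "ocr"}}
index = {"resize": ["p9"], "ocr": ["p1", "p2", "p7"]}

local_hit = find_provider("translate", neighbors, index)
fallback_hit = find_provider("resize", neighbors, index)
miss = find_provider("compile", neighbors, index)
```

Only requests that miss in the virtual cluster reach the Partial Cloud Server, which is how the search space stays small for most requests.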
The proposed availability measurement model refers to the efficient calculation and evaluation of availability in a dynamic network. The basic idea of our service availability assessment is mainly based on the use of the virtual layer of the system. In other terms, the service availability evaluation relies on the calculation of the availability of some entities including the availability of the network peers as well as the availability of the virtual nodes. To achieve the best performances in a service delivery system, different types of availability would be important at different levels of the system. These availabilities maintain the basic core of their traditional definitions, therefore they may appear similar to those typically used in P2P networks but are refined to include special considerations in their calculations due to their particular use:
Service Provider Availability: as the peer providers constitute the unit that performs the task of hosting and providing services to service requesters, increasing their availability probability is important for the service provision system. The Temporal Availability Probability of a service provider
where
where
where
Availability of the service provision system: The availability of a system during a measuring period reflects its ability to guarantee its main objective, which is to maintain an adequate rate of provision of the requested services under all constraints. Service availability and network peer availability are therefore both important when we want to know the overall state of the system at a given time. In a service provisioning network, we refer to this as the service provisioning system availability. In a time-based approach, availability is the probability that the service delivery system is in an active state during a given time interval, and it is influenced by the temporal availability of the peers constituting the system. The Time-based Availability Probability of the Service Provision System can be expressed as follows:
In the case of the activity-based approach, availability determines the probability that the service provision system will be functional for a specified period of time. This is reflected in the average ability of peers to be functional in terms of the effective delivery of the requested services over the same period of time. The calculation of the Activity-based Availability Probability of the Service Provision System is similar to the previous one.
In the case of presence-based availability, the availability probability of the service provisioning system would be equal to the average availability probability of the virtual nodes representing the different clusters, which is also equal to the average availability probabilities of the different clusters in the network with
Service availability: In the cloud-P2P service delivery system, service availability means that a service must be usable by clients when they need it, for as long as is necessary to deliver it effectively, and in a way that satisfies peer requesters. This requires that at least one peer provider hosting the required service be available at the time of the request. The Service Availability Probability during a given time interval is assessed on the virtual layer of the system, where the peers are dispersed over a number of clusters represented by virtual nodes. It is expressed as a ratio. The numerator is the sum, over the virtual nodes, of each node's availability probability weighted by the availability values of those of its peers that possess the service. The denominator is the same sum with each of those peer availability values replaced by its maximum possible value, which is 1. The Service Availability Probability is obtained as follows:
where
The service availability in our approach is conditioned by the availability of other entities in a closed cycle as shown in Fig. 5. In other words, a service is available once the service provision system on which it runs is available. Furthermore, a system is available if there is at least one available peer belonging to one or more clusters represented by Virtual Nodes in addition to communication links. Moreover, a Virtual Node is available if there are at least
Availability cycle.
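The ratio described in words for the Service Availability Probability can be sketched as below. This is one reading of that description, not the paper's formula: the input layout, the function name and the sample probabilities are hypothetical.

```python
def service_availability(virtual_nodes):
    """Weighted ratio of actual to maximal provider availability for one service.

    virtual_nodes: list of (node_availability, provider_availabilities) pairs,
    where provider_availabilities holds the availability probability of each
    peer in that cluster that hosts the service.
    """
    numerator = 0.0
    denominator = 0.0
    for node_availability, provider_availabilities in virtual_nodes:
        # Actual contribution: node availability times its providers' availabilities.
        numerator += node_availability * sum(provider_availabilities)
        # Maximal contribution: every provider at its maximum availability of 1.
        denominator += node_availability * len(provider_availabilities)
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical layer: two virtual nodes, with two and one providers of the service.
layer = [(0.9, [0.8, 0.6]), (0.5, [1.0])]
probability = service_availability(layer)
```

Under this reading the result lies between 0 and 1 and equals 1 only when every provider of the service is fully available.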
This section starts with a brief description of the experimental environment and methodology that we used. Then, we analyze the obtained simulation results in order to measure the impact of the proposed approach on service availability and system performance.
Experimental setting
We have implemented a prototype of a hybrid P2P network-based SaaS-Cloud for remote treatment that relies on a hybrid fuzzy clustering mechanism for client profiles in order to improve the availability of the services these clients hold. The prototype retains the core of the implementation carried out in [6] while taking other considerations into account. To study the performance of the proposed approach and its impact on service availability, our experiments follow three different scenarios:
Scenario 1: we implemented a hybrid P2P-based SaaS cloud service provisioning system without a clustering mechanism. It serves as a baseline for evaluating the performance of the proposed approach and produces the cache data used during network initialization in the other scenarios.
Scenario 2: we implemented a P2P-based SaaS cloud service provisioning system that uses different clustering algorithms (FCM, HAC), applied separately on the overlay network.
Scenario 3: we implemented a P2P-based SaaS cloud service provisioning system that uses the proposed hybrid clustering algorithm (HAC-TFCM) as its clustering mechanism.
The proposed approach is compared with the system proposed by Achache et al. [6] in order to study service availability in both architectures. In addition, the proposed clustering approach is evaluated against different algorithms to measure its quality as well as its impact on service availability and system performance. The PeerSim simulator [62, 63] is used as the experimental framework, and the results are studied using the metrics described in the next section.
Configuration settings
A large number of dynamic simulations using the cycle-based PeerSim model were performed to evaluate our prototype. A simulation begins by reading the configuration file, a plain ASCII text file composed essentially of key-value pairs that holds all simulation parameters for the objects involved in the experiment, including the definition of the network variables and the declaration of the node structures, protocols, dynamics and observers used [62, 63]. For clarity, our experiments used variable test configurations obtained by combining the parameters presented in Table 3. All simulations were run on a DELL G5 15 5587 PC with a 2.2 GHz; Turbo 4.1 GHz/9 Intel Core i7 processor, 32 GB of memory and a 1 TB HDD.
The administrator: represented by a Control class that simulates its behavior in the cloud layer. It is accessible by all peers and uses a “Global Service List” as its data structure. This list contains all the information related to the system’s services (including their names, categories, inputs and outputs) and is used by peers as a reference for selecting the services to host or request. Besides the tasks allocated to the administrator in the system proposed in [6], including feeding the system with services and managing them, this node is responsible for scheduling the clustering process in the system implementing the proposed approach. This is done by specifying the cycles or moments at which node clustering is carried out.
Partial Cloud Servers: represented by a class that has the indexing and search modules and the tools for communicating with neighbors at the “Partial Cloud Servers” layer. This class implements the Control interface with a main method “execute ()” that informs each client of its PCS address. It uses a “Partial list of hosted services” describing the services hosted locally by its peers.
Clients: represented by a class containing the modules needed to perform various tasks, including communicating with their PCS and virtual neighbors, hosting, searching for and requesting services, and sending results. This class implements the CDProtocol interface, which inherits from the Protocol class, and has the nextCycle () method, which implements the tasks that a peer may perform during its execution: 1) hosting services, 2) searching for services, 3) deleting services and 4) the peer latency. Each of these actions is selected by the system with a probability of 0.25. This class uses the following data structures:
Local list of hosted services, which contains all hosted services specific to each peer.
Partial list of virtual neighbors, which contains all the neighboring nodes in the virtual layer. This list is used first as the basis for searching for a provider of a required service; if the search fails, the peer switches to searching via its Partial Cloud Server.
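The per-cycle client behavior described above can be illustrated with a small cycle-driven loop. The sketch is written in Python for brevity and does not use the PeerSim API; the class `Peer`, its fields, the method `next_cycle` and the service catalogue are hypothetical.

```python
import random

ACTIONS = ("host", "search", "delete", "idle")  # "idle" stands for the peer latency


class Peer:
    """Toy client that performs one randomly chosen action per cycle."""

    def __init__(self, catalogue, rng):
        self.catalogue = list(catalogue)  # stand-in for the Global Service List
        self.hosted = set()               # stand-in for the local list of hosted services
        self.searches = []                # services this peer has searched for
        self.rng = rng

    def next_cycle(self):
        # A uniform choice among four actions gives each a probability of 0.25.
        action = self.rng.choice(ACTIONS)
        service = self.rng.choice(self.catalogue)
        if action == "host":
            self.hosted.add(service)
        elif action == "delete":
            self.hosted.discard(service)
        elif action == "search":
            self.searches.append(service)
        return action


rng = random.Random(42)  # fixed seed so that runs are reproducible
peer = Peer(["ocr", "translate", "resize"], rng)
history = [peer.next_cycle() for _ in range(4000)]
counts = {a: history.count(a) for a in ACTIONS}
```

Over many cycles each action is selected in roughly a quarter of the cycles, which matches the 0.25 probability stated for the four behaviors.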
Based on the configuration file, the cycle-driven simulation engine is loaded to configure the network by initializing the nodes and network protocols as instances of the classes implementing one or more interfaces. A single prototype instance is built with the object constructor, and the remaining instances are created from it by the “clone ()” method of the “Node” class. The initialization phase is carried out by control objects that are scheduled to run only at the beginning of each experiment. After initialization, components (protocols and controls) are called by the cycle-driven engine once per cycle, although some components are configured to run only in certain cycles until the end of the simulation [62, 63]. At the end, the statistical data collected by the observer components during the simulation is formatted and sent to standard output for further processing, i.e. analysis or storage. Our observer components inherit from the “Observer” class of PeerSim and implement a modified Analyze() method.
To evaluate the performance of our system and the impact of the proposed approach on service availability, we used two groups of evaluation criteria: criteria commonly used to estimate the validity of the hybrid fuzzy clustering mechanism, and criteria that measure the quality of service availability. These criteria are:
Clustering evaluation criteria
As the actual structure of the input experimental data is unknown where data class labels and external information are not available, the validation of the fuzzy clustering approach requires the use of internal measures that rely only on the information contained in the analyzed data set [64, 65]. We present below some of the most common internal measures used in this paper to estimate the results quality of the hybrid fuzzy clustering mechanism compared to traditional hard clustering techniques and to determine its impact on service availability and system performance.
Partition Coefficient (PC): is considered as the first validity index proposed by Bezdek in [66] to measure the amount of “overlapping” between clusters. The
where
where
Xie and Beni’s Index (XB): This is one of the most widely used fuzzy clustering validity indices proposed by Xie and Beni in 1991 with parameter
A smaller value of the
where
where
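For reference, the two indices named above can be computed as in the sketch below, which follows their standard textbook definitions (Bezdek's partition coefficient and the Xie-Beni index). The variable names, the default fuzzifier value and the toy data are assumptions for illustration.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition_coefficient(U):
    """PC: mean of squared memberships; U[i][k] is the membership of point k in cluster i."""
    n = len(U[0])
    return sum(u * u for row in U for u in row) / n


def xie_beni(U, X, V, m=2.0):
    """XB: fuzzy compactness divided by n times the minimum center separation."""
    n = len(X)
    compactness = sum(
        (U[i][k] ** m) * sqdist(X[k], V[i])
        for i in range(len(V))
        for k in range(n)
    )
    separation = min(
        sqdist(V[i], V[j])
        for i in range(len(V))
        for j in range(i + 1, len(V))
    )
    return compactness / (n * separation)


# Hypothetical one-dimensional data with two well-separated clusters.
X = [(0.0,), (1.0,), (10.0,), (11.0,)]
V = [(0.5,), (10.5,)]
U = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]

pc = partition_coefficient(U)
xb = xie_beni(U, X, V)
```

A PC close to 1 indicates little overlap between clusters, while a small XB indicates clusters that are compact relative to the distance between their centers.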
During each cycle, measurements are captured and analyzed at the end of the simulation to estimate the performance of the proposed system in terms of service availability and meet the quality of service required by users. We have focused on the following performance indicators:
Success Search: the search for a requested service takes a positive value if the Partial Cloud Server finds at least one participant that hosts the requested service.
Failure Search: the search for a requested service fails if the Partial Cloud Servers cannot find a service provider.
Success Result: expresses the successful return of results by the participants after execution of the requested services.
Failure Result: expresses the failure of the participating nodes to correctly provide the result of the requested service.
Number of Resource Nodes: the number of nodes that host the requested service.
Maximum Average Service Availability: the average of the maximum probability that services are available at the time of the request and able to perform their required tasks correctly for users.
Minimum Average Service Availability: the minimum average probability that services are available at the time of the request and able to perform their required tasks correctly for users.
Results and discussion
We conducted 500 tests for each scenario in order to assess the impact of the proposed approach on improving services and entities availability and system performance. The experimental results are expressed as the mean variance of all identical experiments under the scenarios described above, and varying under five other simulation scenarios depending on the size of the P2P network (size: 10000, 20000, 30000, 40000, 50000).
In order to conduct comparative experiments, we first examined our proposed Hybrid Fuzzy Clustering Method “HAC-TFCM” in scenarios of clustering of multidimensional user profile data by comparing it with widely used methods, namely the Fuzzy C-Means (FCM) and Hierarchical Ascending Classification (HAC) methods. Experimental data are dynamic in nature and are collected during the network’s operation within each experiment. Moreover, these data are considered as initial inputs for all the algorithms and formulas used in this study. In the case of FCM (Fuzzy C-Means), we performed a series of clustering validation experiments that reflect the properties of the environment by varying the number of clusters between 2 and 6. The U membership matrix was randomly initialized. The value of the fuzzification parameter
In this section, the proposed hybrid clustering approach is evaluated with the six internal validity indices introduced previously. The diagram in Fig. 6 shows the trend in the average variation of the cluster validity indices over the experimental dataset captured during the simulation scenarios. To facilitate visualization, all indicators are normalized so that their minimum value is 0 and their maximum value is 1. Note that clustering quality is best when all indicators are as small as possible, except for the PC (Partition Coefficient) index.
Average internal validity index values of hybrid fuzzy clustering mechanism, fuzzy C-means and hierarchical ascending classification.
Validation of fuzzy C-means, hierarchical ascending classification and hybrid hierarchical ascending classification with thresholded fuzzy C-means, applied on client profile dataset
The results show that the
The main objectives of the user profile clustering experiments were to study the ability of fuzzy clustering to discover good client relationships and to design a highly available P2P-based cloud SaaS service system that improves the client experience. We now show the impact of the proposed approach on the variation of the Success_Search and Failure_Search percentages per cycle for the different simulation scenarios (Fig. 7).
Success search percentage vs failure search percentage.
According to the statistics, we initially observe that during the first simulation cycles, the percentage of Failure_Search is greater than the percentage of Success_Search. From the 4
In service-oriented systems, the major concern is to ensure the successful provision of services that reflect their ability to provide the resource provisioning function on demand. Figure 8 shows the evolution of the percentage of the two measures: Success_Result and Failure_Result according to the demands made in the different simulation scenarios. These statistics are represented by assuming the same failure rate for service providers (0.25%) as well as the probability of no failure in peer providers in different systems.
Success result percentage vs failure result percentage.
According to these statistics, it can be observed that the percentage of Success_Result in the system proposed in [6] has continuously increased to reach a maximum of 78.91% in the 19
Here, we will study the behavior of the probability of SaaS service availability taking into account the impact of the cloud-P2P service provisioning system conception.
Mean service availability percentage.
Figure 9 shows the variation in the percentage of service availability in the different simulation scenarios. The availability of services in the system implementing the proposed approach increases rapidly at the beginning of the simulation, reaching a percentage of 61.38% at the 5
As illustrated in Fig. 9, the percentage of service availability in the system adopting the proposed approach follows a trend similar to that of the other systems, while remaining higher than in the system of [6]. Moreover, comparing the three curves for the scenarios that use a clustering algorithm, we notice that as the algorithm performs worse on the experimental data set and clustering quality decreases, the improvement in service availability shrinks and gradually approaches the level of the system in [6], where clustering is not used.
Average, min and max values of the curves of Fig. 10
The results of these simulations are summarized in Table 5 in which the statistics show an incremental trend between clustering quality and service availability, which again justifies the assumption that an optimal clustering model is necessary for the design of a high-performance and reliable service provisioning system. In addition, the statistics presented show that the clustering and virtual layer formation method considered by the proposed approach has a considerable effect in improving the percentage of SaaS service availability in a P2P-based cloud architecture compared to the architecture proposed in [6]. Efficient provider distribution involves the formation of condensed zones of required services hosted or executed by a set of peers. As a result, the condensed location of a required service in a specific region of the virtual layer increases the probability of its availability at the time of its request.
The proposed approach achieves an Average Service Availability of 88% versus 76% for the system proposed in [6], an increase of 12 percentage points. Moreover, the system implementing the proposed approach reaches a Maximum Average Service Availability of 89.19% and a Minimum Average Service Availability of 67.43% in the availability calculations, versus 78.94% and 40.13% in the architecture proposed by [6], which means that even in failure cases the system remains relatively efficient. Therefore, as more peer providers belonging to one or more clusters represented by virtual nodes are available, in addition to the communication links, the system is more available and responds faster. Our approach also appears to combine the three kinds of availability effectively: time-based availabilities for better numerical analysis, activity-based availabilities for more accurate information about the system during its operation, and presence-based availabilities for monitoring the overall state of the system entities.
The proposed approach copes well with the large-scale nature of the environment, allowing the design of a SaaS service provision system based on a P2P network that is able to find a good compromise between client satisfaction in terms of quality of service and the cost of the system.
Here, the most relevant existing works on improving availability in the cloud are compared with our proposed work in order to show its advantages. It should be noted that these works differ from our approach in their performance analysis methods: each adopts a distinct experimental environment and relies on evaluation criteria different from those used in our paper. Therefore, the comparison is made in terms of the design of the proposed techniques as well as their impact on the performance of the cloud environment, especially on the availability of services and resources.
In order to ensure high availability and reliability, cloud service providers introduce various mechanisms such as redundancy, scheduling and fault tolerance in their cloud systems. Following these mechanisms, the authors in [14, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 40, 41, 42] have presented various effective strategies. Although these approaches have helped to enhance data availability and accessibility, prevent network disruptions and ensure system continuity, they still suffer from problems related to their core operation, which relies mainly on global knowledge of the network and on complex information that must be stored and computed during processing. In addition to the volume of dynamic data analyzed in large-scale distributed environments, these techniques typically produce huge amounts of data flowing through the network, resulting in an unnecessary increase in effort and cost in terms of storage, processing, financial overhead, maintenance overhead, and complex and consistent data management. Moreover, the processing and execution delays in large networks, the increase in peer inactivity time and the limitation of network bandwidth create bottlenecks that are very sensitive to concurrent data access in the distributed environment.
In contrast to these works, the proposed approach meets the requirements of low-cost availability by maintaining a virtual and dynamic layer added to the P2P network-based SaaS cloud service system architecture. This layer is built using a hybrid clustering mechanism over client profile vectors, which are dispatched, according to the similarity of their profiles, into several clusters. It enables the distribution of peer providers and the creation of condensed areas of services hosted on multiple nodes, thus enhancing the availability probability of services in specific regions at the time of demand. In addition, the service search process is initially carried out in the virtual layer instead of across all system users and, in case of failure, is relaunched in the real “Partial Cloud Servers” layer. The proposed approach provides high availability of the services as well as of the peers holding them, efficient communication and network efficiency. In other words, our system is able to ensure that clients interact continuously and correctly with SaaS applications and services, and that peer churn (normally by disconnection, abnormally by failure) does not affect system performance. Therefore, the proposed approach provides a fault-tolerant infrastructure that responds largely to the dynamic nature of peers as well as their varying behaviors in the real network layer of the service delivery system. It reduces the search space and communication cost, keeps network traffic low, and improves the response time and the success rate of service execution.
By adopting various clustering methods, other authors [48, 49, 50, 51, 52, 53, 54] have proposed several techniques to maintain reliability and availability adequate to their cloud users’ needs. However, these techniques suffer from several limitations, including high network traffic, large overhead, poor load balancing, reduced efficiency in large-scale environments, and simple data handling and clustering methods. As its clustering mechanism, the proposed approach uses a fuzzy hybrid clustering method, “HAC-TFCM”, to perform a new clustering of network peers in addition to their clustering in the real network layer. The proposed clustering technique improves on the techniques presented in [48, 49], as it allows each client to be automatically associated with the appropriate number of clusters judged to be close, without a priori knowledge of the number of clusters or of how many clusters the client should belong to. Compared to [51], our approach is able to manage complex and dynamic data consisting of the profile vectors of the system’s clients, which characterize their network behavior. Compared to [52, 54], our system is scalable and its performance remains stable in large-scale environments, allowing the design of a P2P network-based SaaS service delivery system able to achieve a good trade-off between client satisfaction in terms of service quality and system cost. In addition, our paper is distinct from the works discussed above in that it offers a service availability measurement model that considers different entities at several levels of the system and uses various types of availability.
Conclusion
Modern systems require effective approaches that minimize overall costs and improve overall reliability while offering the best quality of service. In this paper, we proposed an approach to improve the persistent availability of SaaS services in a “Hybrid P2P Network-Based Remote Treatment SaaS cloud computing” environment. The goal is to meet the requirements of on-demand service provisioning and to maintain cloud computing performance by overcoming the problems of system failures and peer dynamics. The approach primarily tracks the dynamic and rapid rhythm of these peers by analyzing their individual behaviors and representing them as a matrix of client profile vectors. In addition, a Hybrid Fuzzy Clustering Mechanism has been proposed that combines, in a sequential hybridization process, the “Hierarchical Ascending Classification” algorithm and the “Thresholded Fuzzy C-Means” algorithm, allowing each client profile vector to be automatically associated with the appropriate number of clusters considered close, without any prior knowledge of the number of clusters or of the number of clusters to which it should belong. It aims to provide a virtual and optimal infrastructure that organizes the system peers, based on their profile similarity, into distinct clusters, each represented by a virtual node; together, these nodes form a virtual layer. This layer provides both the distribution of peer service providers and the formation of dense zones of each service that interests a set of peers who host or use it, which greatly improves the availability probability of these services in specific regions of the virtual layer at request time.
In addition, a service availability measurement model has been proposed, which is mainly based on the use of the virtual layer of the system and takes into account different entities at different levels of the system including the probability of availability of service providers and the probability of availability of virtual nodes. To obtain the best performance in a service provision system, we used different types of availability existing in the literature to measure the availability of these entities: “time-based availability, activity-based availability, availability based on the presence of k-of-n” which are refined to include special considerations in their calculations due to their particular use.
We examined the proposed hybrid fuzzy clustering method using six internal validity indices. The experimental results indicate that the hybrid fuzzy clustering algorithm “HAC-TFCM” outperforms FCM and HAC in the efficient and reliable clustering of the client profile data set, creating better separated and more meaningful clusters with high compactness. Furthermore, the results of the experimental evaluation with PeerSim provide evidence that the proposed approach can effectively improve the probability of SaaS service availability and the reliability of the cloud-P2P system for remote service provisioning, offering an average service availability of 89% with a 7% improvement. These results suggest that the proposed approach responds well to the large-scale nature of distributed systems and is able to achieve a good compromise for maintaining quality of service in terms of availability, performance and cost.
As future perspectives, it would be interesting to extend the proposed approach by taking into account other measures of cloud services quality and to test our approach on benchmark datasets.