A quorum-based intelligent replicas management in data grids to improve performances

Abstract

Today, data replication is widely used in large-scale systems such as data grids to meet the requirements and challenges of managing these huge amounts of data. Indeed, replication is used to reduce the data access cost and access latency, to improve availability and performance, and to ensure reliability in grids. Most proposed replication strategies deal read-only case when accessing data, few approaches studies data consistency when the updates are allowed in data grids. In this paper, we present a contribution using a quorum-based management of replicas approach to maintain the consistency of these replicas, whose goal is to minimize quorum sizes to decrease replication costs and improve performances.

Keywords

Data grid data replication consistency quorum agent

1. Introduction

Replication is a technique widely used by distributed systems on a medium and large scale, for database communities and even for grid communities. However, the grid environment presents different challenges, such as reduced latency and the availability of dynamic resources [10]. Given the complexity of such an environment, it is very difficult for a human to make decisions about data management especially about those that need to be replicated. To this end, the solicitation of good replication strategy is required (for the right placement of replicas to improve performance). However, replication strategies introduce an additional load related to the management of these replicas and especially to the consistency (of replicated data) that must be assured and maintained. The basic principle for maintaining the consistency of replicated data is to require the locking of at least one common copy for conflicting operations (reads/writes) [25]. Different replication protocols have been proposed to ensure consistency of replicated data. The simplest solution is ROWA (Read One Write All); where one copy is sufficient for a read operation while all copies are required to successfully perform a write operation. Another protocol known as Quorum Consensus (or majority vote) was suggested; where a write quorum (number of copies) is required to perform a write operation, and a read quorum is required to perform a read operation. All quorum-based protocols aim to minimize sizes of quorums in order to reduce data access costs. In this context, we contribute with a new replica management approach in data grid, based on quorums, to improve data access performance. This approach aims to minimize sizes of quorums by distributing the replicas of the grid into groups. To evaluate our approach, we compared it with two approaches (ROWA and Quorum Consensus approaches). The obtained results show that our approach significantly improves data access performance.

The rest of the paper is organized as follows. In the second section, some work related to data replication and to data consistency based on quorums will be presented. In section three, our approach will be presented. The results of the experiments will be discussed in section four, where our approach will be compared to the ROWA approach and the Quorum Consensus approach. Section five provides a comparative study of related work. The sixth section will be reserved for conclusion and future work.

2. Literature review

Replication strategies for data grids is an active research domain [1, 2, 5, 24]. These strategies are centralized, which makes them likely to be failing and can become a bottleneck when data requests increase. A common point of these strategies is that they only consider read-only cases to access the data. Therefore, these approaches are not suitable for applications where data replicas can be modified [9].

Few data replication techniques have focused on consistency in data grids [3, 16]. Some of them have used quorum-based protocols such as Quorum Consensus, Tree Quorum and Grid Quorum. These protocols are primarily intended for database systems [8]. They are considered (as well as ROWA protocol) the most popular in replica consistency management. Each protocol is mainly distinguished from the others by the number of replicas involved in performing a read or write operation.

The ROWA protocol (Read One Write All) [12] is a simple protocol. As its name indicates, it needs only one replica to read a data, but it requires all replicas to write a data.

The principle of the Quorum Consensus protocol [15] (also known as the majority vote) is based on suffrage (voting). In literature, the quorum is defined as the number of votes required for an election to be valid. Thus, since each data has several copies, we can apply this principle to modify or consult data. Therefore, the basic idea is to have the authorization of several copies (quorum) before performing read or write operation on a data. For example, for a data having $n$ copies, we can specify that authorization of more than n/2 copies are required to ensure that a change will always be made on a majority of copies.

Let $n$ : the number of copies of a data, Qr: read quorum, Qw: write quorum.

A read operation on a data requires Qr votes to be able to perform it, and a write operation on a data requires Qw votes to be able to perform it,

In the Quorum Consensus protocol, the read and write quorums must satisfy the following constraints:

(1)
$Qr+Qw>n$ .
(2)
$2Qw>n$ .

The first constraint guarantees the exclusion between a read operation and a write operation on a same data, while the second constraint guarantees the exclusion between two write operations on a same data.

The Tree quorum protocol [4] is based on a logical tree topology, the replicas are logically arranged in a tree structure. A write operation requires a quorum formed of the root, the majority of its children, the majority of the children of each of these children, and so on. Consequently, two simultaneous write operations will have at least one common replica at each level of the tree. The read quorum is formed of the root. If the root is not available, the quorum will be formed by the majority of its children. If one child of this majority is unavailable, he will be replaced by the majority of these children and so on.

In the Grid quorum protocol [23], the replicas are logically arranged in grid structure. The rectangular grid is the simplest form of the Quorum Grid. The $n$ replicas of a data are logically organized in a rectangular grid of $r$ rows and $c$ columns ( $n=$ r.c). A write quorum Qw is formed of an entire column and a replica of each remaining columns (Qw $=r+c-1$ ). A read quorum Qr is formed of one replica of each column (Qr $=$ c ).

In order to improve the performances of a data replication protocol in large-scale systems, it is important not to overlook a few factors such as: the number of replicas used, the sites where replicas will be deposited, and how to maintain the consistency of these replicas.

In the following, we present other important works that have proposed data replication protocols based on quorums and that take into account some of the factors cited above to achieve their objectives.

In [8] the authors propose a new quorum-based data replication protocol in order to reduce the cost of data updates, to provide high availability and to maintain data consistency, with good response times and small read and write quorum sizes. They have developed an algorithm called DDG by logically decomposing the network into regions (zones). Each region groups the sites belonging to the same geographical partition. This decomposition is logically structured in a two-dimensional form. Where each line of this structure represents a region of the network, and the elements of the same row contain the geographically close sites. The DDG algorithm allows only one replica per region. The number of replicas for this approach should not exceed $N=R=\sqrt{S}$ , where $R$ represents the number of regions (zones) in the grid and $S$ represents the number of sites in the grid.

In [14], a protocol named A2DS was proposed. It can be adapted to any two dimensional structure. The replicas are organized logically into any 2D structure (straight line, triangle, square, trapezoid, and rectangle). The authors introduce the new idea of obtaining other 2D structures (hexagon and octagon) by the composition of several basic 2D structures. In this protocol, a write quorum is formed of all replicas of any single level of the 2D structure, and a read quorum is formed of any single replica of every level of the 2D structure. Another protocol called ReadTwoWriteMajority was also proposed, where replicas are arranged logically into the rectangle structure for an even number of replicas, and are arranged logically into the trapezoid structure for an odd number of replicas. This protocol was proposed in order to bypass the drawbacks of ROWA protocol (write operations) and to preserve its advantages (read operations). The authors carried out a comparative analysis on the adaptation of these protocols on the different 2D structures.

The authors of [17] propose a data replication protocol named DR2M. This protocol organizes the nodes into a two-dimensional logical structure, they use voting technique to improve the availability and data access performance by reducing the number of replicas of data for read or write operations. The DR2M protocol uses quorums to organize nodes into clusters. It calculates the best number of quorums and the best number of nodes for each quorum. To replicate a data, the protocol selects the node located in the middle of the diagonal of the nodes of the structure.

The replication protocol presented in [18] is named TQR (Triple Quorum Replication). As its name indicates, this protocol is structured on the intersection of three quorums groups. It aims to reduce quorum sizes and maintain high data availability even when the network size increases. In this work, the authors have focused on the selection of copies to be replicated (which copy is selected to be replicated) by proposing a new replicas control strategy. In this approach some assumptions are made such as the number of nodes in a quorum, the similarity of read quorums and write quorums, and the same availability for both read and write operations.

The authors of [21] propose a new replication protocol called Dynamic Hybrid Protocol which combines between the grid quorum protocol and the tree quorum protocol. A logical tree structure is used for this protocol, it is composed of the two logical structures of the grid quorum and the tree quorum. It is a topology based on three configuration parameters (tree height, grid depth and number of descendants). The authors indicate in their analytical study that in order to obtain high write availability, we must reduce the height of the tree and the depth of the grid, and we must increase the number of descendants. However, in order to obtain high read availability, we must reduce the height of the tree and the number of descendants and we must increase the depth of the grid.

In [22], the authors propose a dynamic load-balancing strategy based on the quorum system to improve the performance of replicas consistency. Nodes containing replicas are represented in a binary tree logical structure. To maintain data replicas consistency, when a read or write access request is addressed to a replica, a quorum is formed from the tree by the path from the root to a tree leaf. To improve performance, the authors complement replica consistency by a load balancing strategy according to the state of each quorum node. This strategy is based on the permutation between the nodes of the quorum sets in order to reduce the load and the communication time of read or write access requests.
3. Proposed approach

The number of replicas, the consistency of these replicas, and the nodes that contain the replicas are very important concepts for designing and developing a replication strategy.

In addition, if replication strategies are dynamic and quorum-based, quorum values (quorum sizes) must be calculated every time the number of replicas changes. This operation is very costly especially on a large scale (data grid). This influences the system performance such as response times.

In this context, and in order to minimize the cost and to improve replication performance we contribute by a new replica management approach named Imbricated Quorum (IQ), which is based on the formation of groups of quorum.

3.1 Principle

In a data grid, a dynamic replication strategy moves, adds or deletes the replicas dynamically. Adding and removing replicas inevitably influence quorum values (quorum sizes). Therefore, these values must also be managed dynamically.

However, the dynamic definition of quorum sets is costly. Because each time the number of replicas changes, the read and write quorums values must be recalculated, before any new read or write operation. Therefore, the problem is to minimize the number of these calculations. Which leads us to reduce the quorum sets sizes. For this, we reduce the number of replicas involved in a quorum by distributing replicas in groups. Consequently, the number of replicas involved in read or write operations is reduced.

Let $n$ be the number of copies of a distributed data in data grid. By applying the Quorum Consensus protocol, the value of the read quorum $Qr=\left\lceil\frac{n}{2}\right\rceil$ and the value of the write quorum $Qw=\left\lceil\frac{n+1}{2}\right\rceil$ . If the number of copies changes, the quorums values must be updated by all replication responsible entities (replicas consistency services that are geographically distant in the grid). This update is costly which reduces performance (response time $\ldots$ ). To minimize this cost in order to improve performance, we propose a technique that consists in minimizing the number of times the quorum values must be recalculated. For this, we distribute the set of replicas into groups, consequently the sizes of quorum applied to these groups will be considerably reduced. Each group will have its own replicas, hence the possibility to apply the principle of the Quorum Consensus protocol on groups. Next, we apply a second time the Quorum Consensus on replicas inside each group chosen separately (i.e. groups belonging to group quorums).

Figure 1.

Example of distribution of replicas in groups.

For example, in Fig. 1, the total number of replicas $n=$ 22 distributed into 3 groups.

Group 1 contains 8 replicas and each of groups 2 and 3 contains 7 replicas. By applying Quorum Consensus, the read quorum value Qr $=$ 11 and the write quorum value Qw $=$ 12. However, using our technique we obtain for the first quorum applied on the groups Qw $=$ Qr $=$ 2. Then, we choose 2 out of 3 groups. Inside each chosen group, we obtain the quorums applied on the replicas Qr $=$ 4 and Qw $=$ 4 (to the maximum Qw $=$ 5). Therefore, the global read quorum Qr $=$ 8 $=$ 4 $+$ 4 and the maximum of the global write quorum Qw $=$ 9 $=$ 5 $+$ 4 (it may be that Qw is equal to only 8 $=$ 4 $+$ 4). Thus, the quorum sizes are reduced compared to those of the Quorum Consensus. The difference between quorum sizes of the two approaches equals 6 (3 for Qw and 3 for Qr).

On the other hand, the updating cost of the quorum values is reduced if the number of replicas changes. For example, if one replica is added to Group 1, only the quorum values of this group will be updated (Qr $=$ 5 and Qw $=$ 5) and not for the whole grid. In addition, this update can be performed asynchronously if the group in question is not asked. It be noted that even the replicas updates can be performed asynchronously when the groups are not solicited.

3.2 Definition of quorum sizes

In the following, we define the quorum sizes of our approach (Imbricated Quorum).

We will use cluster term instead group term.

Below are some notations to be used.

$n$ : the number of replicas

Nclt: the total number of clusters in the grid such as Nclt $>$ 2.

rep_clt ${}_{i}$ : the number of replicas in cluster $i$ such as rep_clt ${}_{i}>$ 2 and 1 $\leqslant i\leqslant$ Nclt.

QCw: write quorum of grid (applied on clusters).

QCr: read quorum of grid (applied on clusters).

Qw ${}_{i}$ : write quorum of cluster $i$ (applied on replicas of cluster $i$ ).

Qr ${}_{i}$ : read quorum of cluster $i$ (applied on replicas of cluster $i$ ).

Quorums obtained by applying Imbricated Quorum are:

Write quorum on clusters concerned by a write access is:

$\displaystyle\textit{QCw}=\left\lceil\frac{\textit{Nclt}+1}{2}\right\rceil$ (1)

Read quorum on clusters concerned by a read access is:

$\displaystyle\textit{QCr}=\left\lceil\frac{\textit{Nclt}}{2}\right\rceil$ (2)

Write quorum on replicas of cluster $i$ concerned by a write access is:

$\displaystyle{Qw}_{i}=\left\lceil\frac{{\textit{rep\_clt}}_{i}+1}{2}\right\rceil$ (3)

Read quorum on replicas of cluster $i$ concerned by a read access is:

$\displaystyle{Qr}_{i}=\left\lceil\frac{{\textit{rep\_clt}}_{i}}{2}\right\rceil$ (4)

Global write quorum of Imbricated Quorum (quorum on the replicas of the whole grid) is:

$\displaystyle Qw=\sum\limits_{i=1}^{\left\lceil\frac{Nclt+1}{2}\right\rceil}{% \left\lceil\frac{{\textit{rep\_clt}}_{i}+1}{2}\right\rceil}$ (5)

Global read quorum of Imbricated Quorum (quorum on the replicas of the whole grid) is:

$\displaystyle Qr=\sum\limits_{i=1}^{\left\lceil\frac{\textit{Nclt}}{2}\right% \rceil}{\left\lceil\frac{{\textit{rep\_clt}}_{i}}{2}\right\rceil}$ (6)

We will show that the quorum sizes obtained by our approach (Imbricated Quorum) are better than those obtained by Quorum Consensus.

We have:

Write quorum of Quorum Consensus is:

$\displaystyle Qw=\left\lceil\frac{n+1}{2}\right\rceil$ (7)

Read quorum of Quorum Consensus is:

$\displaystyle Qr=\left\lceil\frac{n}{2}\right\rceil$ (8)

Nclt $>$ 2; rep_clt ${}_{i}>$ 2 $\forall$ 1 $\leqslant i\leqslant$ Nclt

For write quorum,

$\displaystyle\sum\limits_{i=1}^{\textit{Nclt}}{\textit{rep\_{clt}}_{i}=}n$ (9)

$\displaystyle\Rightarrow\sum\limits_{i=1}^{\left\lceil\frac{Nclt+1}{2}\right% \rceil}{\textit{rep\_{clt}}_{i}<}n$ (10)

$\displaystyle\Rightarrow\sum\limits_{i=1}^{\left\lceil\frac{\textit{Nclt}+1}{2% }\right\rceil}{\left\lceil\frac{{\textit{rep\_clt}}_{i}+1}{2}\right\rceil{<}}% \left\lceil\frac{n+1}{2}\right\rceil$ (11)

Then the write quorum size of Imbricated Quorum is less than that of Quorum Consensus.

For read quorum,

$\displaystyle\sum\limits_{i=1}^{\textit{Nclt}}{\textit{rep\_{clt}}_{i}=}n$ (12)

$\displaystyle\Rightarrow\sum\limits_{i=1}^{\left\lceil\frac{\textit{Nclt}}{2}% \right\rceil}{\textit{rep\_{clt}}_{i}{<}}n$ (13)

$\displaystyle\Rightarrow\sum\limits_{i=1}^{\left\lceil\frac{\textit{Nclt}}{2}% \right\rceil}{\left\lceil\frac{{\textit{rep\_clt}}_{i}}{2}\right\rceil{<}}% \left\lceil\frac{n}{2}\right\rceil$ (14)

Then the read quorum size of Imbricated Quorum is less than that of Quorum Consensus.

3.3 Formation of groups (clusters)

To form the groups of replicas, we first distribute all the nodes of the grid in groups. In distributed systems, there are several methods for grouping a set of nodes and for determining the leaders of these groups [7, 11, 13, 20, 19]. They can be based on different metrics such as network latency, bandwidth, free storage space, neighbourhood, availability $\ldots$ etc.

Since node availability is a very important parameter in the grids and to improve performance, in our approach we propose a grouping method based on the availability. The formation of groups by this method is done on basis of the availability of the nodes, where the most available node is likely to be the leader of its group. Each grid node has a value that indicates its availability. The node that has the highest availability value is selected. In case of equality between several nodes, the node with the smallest identifier will be selected.

This grouping is performed independently of the replicas and their placements (grouping of all nodes of the grid and not of the replicas).

This task is performed in three steps:

Step 1: For each node, we proceed as follows: among each node and these neighbours, we designate the most available node (called chief).

For example, for node 0 (see Fig. 2 and Table 1), the most available node among all nodes {0, 3, 9} is node 3, since its availability equals 70% among {35%, 70 %, 68%}.

When this step is completed for all the nodes of Fig. 2, we will obtain the results of Table 1.

Table 1
Set of nodes and their chief

Node	Availability	Chief (most available node)
0	35	3
1	70	6
2	60	1
3	70	8
4	70	6
5	90	8
6	95	6
7	50	9
8	95	8
9	67	6

Table 2

Set of nodes and their leader

Node	Availability	Chief	Leader
0	35	3	8
1	70	6	6
2	60	1	6
3	70	8	8
4	70	6	6
5	90	8	8
6	95	6	6
7	50	9	6
8	95	8	8
9	67	6	6

Figure 2.

Example of grouping based on availability.

Step 2: Then, from these nodes and their chief (most available), we determine the leaders of the clusters. For each node, we point at its chief, and for the latter (chief) we point again at its chief and so on, until a chief points at itself. Thus, the leader of the first node will be the last chief obtained (which points at itself).

For example, the chief of node 0 is 3, node 3 points at 8 and 8 is the chief of itself. Thus, the leader of node 0 will be node 8, and so on.

When this step is completed for all the nodes, we will obtain the results of Table 2.

Step 3: Then the nodes having the same leader form the same cluster.

The number of formed clusters will equal to the number of obtained leaders (see Fig. 2).

This method favours the most available nodes, which improves the data availability in grid if these nodes possess replica.

Algorithm 1 describes the availability-based grouping method.

The runtime complexity of this algorithm is $O(n^{2})$ , where $n$ is the number of nodes in grid. The worst scenario for this algorithm is that each node has n-1 neighbours, where we obtain as results only one cluster and only one leader (note that this case is unrealistic). When determining chiefs (steps: 5 to 15), runtime complexity in worst cases is O(n.(n-1)). But runtime complexity is O(1) for phase of leaders determination (steps: 16 to 25) and for phase of clusters forming (steps: 26 to 37), since there is only one cluster.

3.4 Formation of quorums and query processing

Each time the number of replicas is changed (adding or deleting replicas), the quorum sizes of Quorum Consensus protocol must be updated for the entire system. However, in our approach, quorum updates will only be done within the cluster where the number of replicas has changed.

The formation of quorums and query processing are performed by the cooperation of a set of agents, where we assign to each cluster an agent that will represent its leader (called intra-cluster agent), in addition to an agent that coordinates tasks between the different clusters (called inter-cluster agent). Each intra-cluster agent will have a global view of its cluster (see Fig. 3), and the inter-cluster agent will have a global view of the data grid (of all clusters of the data grid).

Figure 3.

Diagrammatic representation of grid environment including nodes, clusters, agents.

Thus, we distinguish two types of agents:

Inter-cluster agent who is responsible for:

•

Clusters formation.

•

Update read and write quorum sizes of grid (if necessary in case the cluster number changes) by applying Eq. (1) or Eq. (2).

•

The locking and unlocking of the leaders nodes involved in a quorum when accessing replicas (read/write).

•

Diffusion of updates to leaders in case of a write operation.

Intra-cluster agents that represent the cluster leader nodes (i.e. each cluster has an intra-cluster agent). Each agent is responsible for:

•

Update read and write quorum sizes of its cluster by applying Eq. (3) or Eq. (4).

•

The locking and unlocking of replicas of its cluster involved in a quorum when accessing replicas (read/write).

•

Update replicas of its cluster in case of a write operation.

Each replica of a data in the grid has a version number used to determine the most recent replica. When a node requests read or write access to a data, a query that asks for the version numbers of the different replicas is formulated first. When the most recent replica is determined (having the highest version number), an effective access (read or write) is performed to the node possessing this replica (the most recent).

So, when a node requests access to a data, it sends request to its cluster leader. When its leader receives the request, the intra-cluster agent forwards it to the inter-cluster agent. The inter-cluster agent sends a query to each intra-cluster agent, asking it to determine the most recent replica in its cluster. Each intra-cluster agent determines the replica having the latest version, then sends the address of that replica and its version to the inter-cluster agent. The inter-cluster agent collects the responses from different intra-cluster agents, and when the quorum (applied on clusters) is reached, it determines the most recent replica in whole grid and it transmits the version number and address of this replica to the intra-cluster agent that issued the query (cluster agent of node that wanted to access the data). After the access operation is completed, each intra-cluster agent unlocks the locked replicas of its cluster, and the inter-cluster agent frees the locked leaders. In writing case, the update of the replicas and their versions (new versions) must be achieved before unlocking.

Figure 4.

Activity diagram for cooperation between agents.

To locate the most recent replica in each cluster, each intra-cluster agent sends request to each node of its cluster which possess a replica, asking it the replica version and address. Then it collects the responses from different nodes, and when the quorum (applied on replicas of its cluster) is reached, it determines the most recent replica in its cluster and transmits the version number and address of this replica to the inter-cluster agent.

The activity diagram of Fig. 4 illustrates this cooperation between agents for responding to access request to a data sent by the node of cluster “i”.

Algorithm 2 describes the process to be executed by the agents to determine the most recent replica for a data access operation.

Algorithm 2 can be executed in two modes: either in intra-cluster mode by an intra-cluster agent or in inter-cluster mode by the inter-cluster agent.

In this algorithm, the word “element” is generic, it can represent: either a cluster node possessing a replica if the algorithm is executed in intra-cluster mode or a cluster leader if the algorithm is run in inter-cluster mode.

In the case of read access request, the variable $Q$ represents the read quorum size, and Lock function locks the replicas in read mode. However, in write access case, the variable $Q$ represents the write quorum size, and the Lock function locks the replicas in write mode.

In order to improve performance, determination of the most recent replica as well as its address in a cluster can be performed asynchronously. Indeed, when a leader is free and not solicited, its intra-cluster agent can determine the most recent replica in its cluster.

4. Simulation

To evaluate our approach, we used a simulator called “GREP-SIM” [6]. This simulator allows us to generate any tree hierarchical topology. We have modelled a tree topology inspired by the “CERN” data grid architecture.

In this simulator, each node in the grid is capable of specifying the performance of its processor, the locally replicated data, its storage capacity, and the detailed history of the requests passing through this node (the passage of a request by this node during its routing from a client to its serving node). We can also assign to each node other information, such as its availability.

During the simulation, the number of replicas does not remain fixed and changes, since in this simulator, replicas can be added, deleted and placed dynamically. This will allow us to evaluate our approach in a dynamic environment where the number of replicas changes, and compare it with two other approaches: the ROWA approach and especially the Quorum Consensus approach. The whole system of the grid will be influenced by the change of number of replicas with the Quorum Consensus approach, unlike our approach where only the cluster where the number of replicas has changed will be influenced.

We have extended this simulator by adding and integrating our IQ technique of replicas management (grouping of nodes, formation of quorums and queries processing).

4.1 Simulated grid model

Figure 5 shows a reduced example of the simulated grid topology. It consists of 12 nodes (one root and 11 intermediate nodes), and 12 leaves representing the clients. This simulator allows us to generate a tree topology with a large number of nodes (500 nodes in the simulation below). It also allows us to generate and distribute replicas of the different data.

4.2 Simulation results

The simulations were performed using defined parameters in Table 3.

Table 3
Simulation parameters

Number of data	10
Number of nodes	500
Number of clusters	From 3 to 50
Data sizes	From 100 MB to 2 GB
Number of replicas	From 1 to 100 per data
Number of requests	1000

Figure 5.

Structure of the simulated grid.

The simulator begins to form the clusters. Then, a random distribution of the replicas will be done. From this distribution and for the same chronological order of arrival of the client requests, the simulator calculates response times for each request (read or write), by applying the three replication protocols: ROWA, Quorum Consensus (QC) and Imbricated Quorum (IQ). Thus, we can compare our protocol to the other two protocols.

The results obtained and modelled by the graph of Fig. 6 represent the average response times of some clients (first 14 clients). It shows that performance is clearly improved by using our approach. Indeed, we note a decrease of 39.85% of the IQ average response time compared to ROWA and 23.71% compared to QC.

Figure 6.

Average response time per client.

Figure 7.

Average response times per number of replicas.

The best response times for clients 3 and 10 are obtained by ROWA protocol, because these clients have issued a negligible number of write requests in relation to the number of read requests.

Figure 8.

(a) Average response time per request type; (b). Average response time per request type; (c). Gap between average response times per request type.

The results obtained and modelled by the graph of Fig. 7 represent the average response times per number of replicas. It shows that our approach significantly improves performance. IQ average response time is 17.07% less than ROWA, and is 9.71% less than QC.

The following results show where our approach is interesting. The results illustrated by Fig. 8a represents the average response times per number of write requests (number of write requests among 1000 requests). IQ approach improves average response times by 75.38% compared to ROWA approach, and by 51.98% compared to QC approach.

Figure 8b (detailed in Fig. 8a) shows that if number of write requests is very small (almost zero), ROWA approach is better than IQ approach, but IQ approach is always better than QC.

We also note that our approach is better than the other two approaches since the gap between the average response times of IQ and of the other two approaches (ROWA and QC) increases rapidly when the number of write requests increases. This gap is almost linear as illustrated in Fig. 8c.

5. Comparative discussion

In the following, based on the literature review of Section 2, we summarize a critical study of the main related works.

We start with the two approaches with which we compared our approach in simulation.

The strength of the ROWA protocol [12] is the reliability of the read operations. Its advantages are low costs and high availability for read operations. However, its drawbacks are high costs and none fault tolerance for write operations, since all replicas must be updated for each write operation (failed to update only one replica causes the failure of entire write operation).

In case writing, the Quorum Consensus protocol [15] improves costs and offers greater availability. However, in case of reading the costs are higher because of the number of copies involved in the operation.

With these two approaches and especially with the Quorum Consensus approach, the whole system of the grid will be influenced by the change of number of replicas, unlike our approach where only the cluster where the number of replicas has changed will be influenced. When the number of replicas is changed, the quorum sizes of Quorum Consensus protocol must be updated for the whole system. However, the quorum sizes in our approach must be updated only within the cluster where the number of replicas has changed.

Other points can also be noted for the following proposed approaches.

In [8], since the proposed replication algorithm is limited by the number of replicas, it can only replicate data in very few places, which represents a limit for this approach, because it provides low availability especially in the case where sites possessing replicas fail.

In [14], the authors carried out a comparative analysis on the adaptation of the two proposed protocols on the different 2D structures and did not experiment them to see these performances in terms of response times.

In [17], the authors studied and experimented only the performances of the availability and they neglected the performances of the response times and the consistency costs.

In [18], the proposed approach is limited by several assumptions (number of nodes in a quorum, the similarity of read quorums and write quorums, and the same availability for both read and write operations). These authors also studied and experimented only the performances of the availability and they neglected the performances of the response times and the consistency costs.

In [21], the authors only studied their protocol on a reduced number of nodes, and no precision were given on the large scale.

In the proposed approach in [22], the number of replicas is fixed since the system is assumed to be static during the application of the proposed simulation algorithms.

Like these related works, our approach is based on quorums. And like these works, our approach aims to minimize the quorum sizes.

However, in these works as noted above, the following three points are not considered at the same time, namely the number of replicas, the scaling, and the response time performance. In our work, we take into account these three points. Indeed, we have proposed a replicas management approach in large-scale distributed systems (such as data grids), where the number of replica vary (not fixe), and which aims to improve performance by decreasing quorum sizes.

6. Conclusion and future work

In this work, we contributed to the replicated data management in data grids. We designed a quorum-based approach. This approach, called Imbricated Quorum, has been formulated with the aim of minimizing the read and write quorum sizes, to minimize the number of replicas involved in read or write operations, based on the distribution of all replicas of the grid into groups.

The simulated data grid is logically modelled by a tree architecture. Our approach was compared to ROWA approach and Quorum Consensus approach.

The results obtained show that our approach (Imbricated Quorum) improves data access performance in large-scale systems such as data grids. Indeed, these results show that our approach is considerably better than the other two approaches (ROWA and QC), since the quorum size in IQ is significantly lower than the other two approaches. These results are based on specific working environments, nevertheless, they are promising.

In the future, we project to extend this work in several directions. In the near future, it is envisaged to discuss the contribution of clustering to our approach. We plan to enrich our approach by the principle of collaboration and distributed negotiation between agents. We will also look at the study of availability, the addition of the notion of weighted votes and even the study of load balancing to improve performance. Finally, we project to validate our proposed approach on real data grids.

Footnotes

Authors’ Bios

Farouk Bouharaoua graduated in the Department of Computer Science in the Faculty of Sciences and Engineering Sciences at the University of Mostaganem, Abdelhamid Ibn Badis, Algeria, where he received his Magister degree in Computer Science in 2007. Currently, he is a PhD student at University of Oran 1, Ahmed Ben Bella, Algeria. His research interests are data grid, grid computing, distributed system, data replication and consistency.

Ghalem Belalem graduated in the Department of Computer Science in the Faculty of Exact and Applied Sciences at the University of Oran 1, Ahmed Ben Bella, Algeria, where he received his PhD degree in Computer Science in 2007. His current research interests are grid computing, cloud computing, mobile environments and replicas placement and consistency in large scale systems.

References

Chervenak

Schuler

Kesselman

Koranda

and Moe

, Wide-area data replication for scientific collaboration, International Journal of High Performance Computing and Networking 5(3) (2008), 124–134.

Elghirani

Subrata

and Zomaya

A.Y.

, Intelligent scheduling and replication: a synergistic approach, Concurrency and Computation: Practice and Experience 21(3) (2009), 357–376.

Mamat

Radi

Deris

M.M.

and Ibrahim

, Performance of update propagation techniques for data grid, in: IEEE International Conference on Computer and Communication Engineering (ICCCE), Kuala Lumpur, Malaysia (May 13–15, 2008), 332–335.

Agrawal

and El Abbadi

, The Tree Quorum Protocol: An Efficient Approach for Managing Replicated Data, in: Proceeding of the 16th International Conference on Very Large Databases, (VLDB), Brisbane, Australia (August 13–16, 1990), 243–254.

Nukarapu

Tang

Wang

and Lu

, Data replication in data intensive scientific applications with performance guarantee, IEEE Transactions on Parallel and Distributed Systems 22(8) (2011), 1299–1306.

Belalem

and Bouhraoua

, Dynamic strategy of placement of the replicas in data grid, in: Proceeding of the 9th International Conference of Parallel Computing Technologies, (PaCT 2007) Malyshkin

, eds, Lecture Notes in Computer Science, vol 4671. Springer Berlin Heidelberg, Pereslavl-Zalessky, Russia, (September 3–7, 2007), 496–506.

Martin

and Hassanein

, Cluster-based replication for large-scale mobile ad-hoc networks, in: IEEE 2005 International Conference on Wireless Networks, Communications and Mobile Computing, Maui, HI, USA (June 13–16, 2005), 552–557.

Abawajy

J.H.

and Deris

M.M.

, Data Replication Approach with Consistency Guarantee for Data Grid, IEEE Transactions on Computers 63(12) (2014), 2975–2987.

Pérez

J.M.

García-Carballeira

Carretero

Calderón

and Fernández

, Branch replication scheme: A new model for data replication in large scale data grids, Future Generation Computer Systems 26(1) (2010), 12–20.

10.

Deris

M.M.

Abawajy

J.H.

and Mamat

, An efficient replicated data access approach for large-scale distributed systems, Future Generation Computer Systems 24(1) (2008), 1–9.

11.

Mitton

Busson

and Fleury

, Efficient broadcasting in self-organizing sensor networks, International Journal of Distributed Sensor Networks 2(2) (2006), 161–187.

12.

Bernstein

P.A.

and Goodman

, An algorithm for concurrency control and recovery in replicated distributed databases, ACM Transaction Database Systems 9(4) (1984), 596–615.

13.

Sharma

Kant

and Chauhan

, A comparative study of cluster-based data replication techniques for MANETs, International Journal of Information Technology 2(2) (2010), 665–667.

14.

Basmadjian

and de Meer

, An Arbitrary 2D Structured Replica Control Protocol, in: 17th GI/ITG Conference on Communication in Distributed Systems (KiVS’11), Kiel- Germany (March 8–11, 2011), 157–168.

15.

Thomas

R.H.

, A majority consensus approach to concurrency control for multiple copy databases, ACM Transactions on Database Systems (TODS) 4(2) (1979), 180–209.

16.

Grace

R.K.

and Manimegalai

, Dynamic replica placement and selection strategies in data grids – a comprehensive survey, Journal of Parallel and Distributed Computing 74(2) (2014), 2099–2108.

17.

Latip

Othman

Abdullah

Ibrahim

and Sulaiman

M.N.

, Quorum-based data replication in grid environment, International Journal of Computational Intelligence Systems 2(4) (2009), 386–397.

18.

Latip

Mizan

Ghazali

N.F.

Kadir

R.A.

and Hanandeh

F.A.

, Replica control protocol: Triple Quorum Replication (TQR) in Data Grid, in: IEEE 2013 5th International Conference on Computer Science and Information Technology (CSIT), Amman, Jordan (March 27–28, 2013), 303–307.

19.

Alumuttairi

R.M.

Wankar

Negi

and Rao

C.R.

, Replica selection in data grids using preconditioning of decision attributes by k-means clustering (K-RSDG), in: IEEE 2010 Second Vaagdevi International Conference on Information Technology for Real World Problems (VCON), Warangal, AP, India (December 9–11, 2010), 18–23.

20.

Souli-Jbali

Hidri

M.S.

and Ayed

R.B.

, Dynamic data replication-driven model in data grids, in: Computer Software and Applications Conference (COMPSAC), 2015 IEEE 39th Annual, Taichung, Taiwan (July 01–05, 2015), 393–397.

21.

Choi

S.C.

and Youn

H.Y.

, Dynamic hybrid replication effectively combining tree and grid topology, The Journal of Supercomputing 59(3) (2012), 1289–1311.

22.

Sarra

Amar

and Hafida

, A load balancing strategy for replica consistency maintenance in data grid systems, Informatica 37(3) (2013), 345–353.

23.

Cheung

S.Y.

Ammar

M.H.

and Ahamad

, The grid protocol: A high performance scheme for maintaining replicated data, IEEE Transactions on Knowledge and Data Engineering 4(6) (1992), 582–592.

24.

Andronikou

Mamouras

Tserpes

Kyriazis

and Varvarigou

, Dynamic QOS-aware data replication in grid environments based on data importance, Future Generation Computer Systems 28(3) (2012), 544–553.

25.

Sawai

Shinohara

Kanzaki

Hara

and Nishio

, Quorum-based consistency management among replicas in ad hoc networks with data update, in: Proceedings of the 2007 ACM Symposium on Applied Computing, ACM, Seoul, Korea (March 11–15, 2007), 955–956.

A quorum-based intelligent replicas management in data grids to improve performances

Abstract

Keywords

1. Introduction

2. Literature review

3.1 Principle

Table 1 Set of nodes and their chief

4.1 Simulated grid model

4.2 Simulation results

Table 3 Simulation parameters

6. Conclusion and future work

Footnotes

Authors’ Bios

References

Table 1
Set of nodes and their chief

Table 3
Simulation parameters