Abstract
In cloud storage reliability management, a conflict occurs between the cloud storage provider (CSP) and users. The CSP considers the storage overhead as it adopts a replication mechanism to achieve data reliability. Users incur an auditing cost for verifying CSP’s reliability through remote data integrity checking. Both the CSP and users would like to spend a lower cost on the data reliability management. However, if one side decides to reduce his cost, such as, reducing stored copies for each datum, the data reliability would become lower. The opposite side would face a conflict that he has to increase his cost on the data reliability, such as, increasing auditing frequency. These conflicting situations may lead to an unsatisfied state of the data reliability if users and CSP arbitrarily make decisions according their own preferences. In this paper, we describe a game-theoretic analysis to design optimal strategies for users and the CSP. We present two cases of the user-CSP game: the strategic game and Bayesian game. We determine the Nash Equilibrium (NE) and Bayesian Nash Equilibrium (BNE) to get stable solutions for these games.
In the numerical study, we use the pricing strategy adopted by Amazon S3 to explain our analysis. We analyze the effect of the parameter setting with replica number n, data sensitive level, and punishment on the NE and BNE. Our results show that if a user wants to store important data in the CSP, he should choose the
Introduction
Problem description
Cloud storage services, such as Amazon S3 and Dropbox, become popular in recent years. On these systems, users can put their files in cloud storage through the Internet and retrieve them when needed. Because users store their files remotely, they expect that the CSP could provide reliable storage. Replication of data is an effective method for achieving high reliability. In the replication mechanism, each file is stored in multiple replicas. However, more redundancy means more storage space. For example, if a storage system adopts a three-replica strategy, the overall storage space is three times the original file size. The CSP may attempt to use fewer replicas to reduce the storage cost. Thus, the CSP considers a trade-off between reliability and the storage cost.
When a user stores data on the cloud, he loses control over them. If the user chooses the reduced replication mode, the data would be stored with less reliability. To detect data corruption, the user wants to audit the storage service of CSP through data integrity checking. If the user detects errors in the stored data, he could claim compensation from the CSP for the loss. Furthermore, the CSP may consider whether the user generates fake audit results to claim more compensations. The CSP could reject the user’s compensation if the audit results are not credible. In order to be credible, a user may authorize an independent third party to audit the data for integrity. Eventually, the user must cover the auditing cost. Therefore, the user needs to consider whether to conduct an audit or not.
We consider a situation where the CSP faces competition from another CSP. We assume that users can share their experience via a social network, e.g., Facebook and Twitter, after using the CSP’s service. If the CSP’s service is not trustable, an opportunistic user may leave the CSP for another one. Thus, the CSP should consider user’s staying probability (loyalty) when adopting a reliability policy.
In this paper, we address the conflict between a user and the CSP, where a user prefers less auditing and the CSP prefers less replication. The data reliability would become less if the CSP decides to reduce the number of copies. In this situation, the audit process is a necessary for the user. However, it conflicts with the user’s preference. On the other hand, if the user would like to reduce the auditing cost, he anticipates that the CSP provides more replicas to improve the data reliability. This results in a contradiction since the CSP prefers less storage cost. For example, the CSP may prefer to adopt two-replica in the
Game-theoretic analysis
A game-theoretic analysis is to find out optimal strategies for a user and the CSP. We treat a user and the CSP as rational players who make decisions depending their preferences only. Their strategies affect each other. We consider two cases in our game model. The first involves a loyal user, who does not prefer to leave from the CSP. This is called the loyalist game, which is a strategic game. The second involves an opportunistic user, who may switch to another CSP, if the original CSP’s service is unreliable or expensive. This is called the opportunist game, which is a Bayesian game. In these two games, we identify the Nash Equilibrium (NE) and Bayesian Nash Equilibrium (BNE), respectively. The CSP could use the statistic approach, such as the data mining, to analyze the user’s background and behaviors from the user’s information. The analysis results can be used to determine whether the user is loyal or opportunistic. For example, if a user’s access frequency is higher than others, his loyalty is relatively high. The CSP could take the user with high access frequency as a loyal user. According to this result, the CSP could chooses an appropriate game model.
In the loyalist game, the CSP would prefer to choose the
We use the pricing strategy adopted by Amazon S3 to explain our analysis. The study [1] showed that Amazon S3 has the highest market share. We improve our previous numerical study [2] by additionally considering the effect of punishment setting and data sensitive level on the NE and BNE. In the previous paper, we do not consider the parameters over punishment strength and data sensitive level. In this paper, we consider two cases of punishment (including: the severe punishment and mild punishment) and three data sensitive levels (including: non-critical and reproducible data, low-value critical data, and high-value critical data). We would like to analyze the effect of players’ strategy decisions from the variable punishment settings and data sensitive level. The evaluation results of the loyalist game show that if a user chooses to store important data in the CSP, he has more benefit for choosing the
Related work
To manage data reliability, data redundancy schemes have been applied in cloud storage systems. These schemes fall into two categories: replication and erasure code. Replication-based schemes, such as Amazon S3 and Google Cloud Storage, are the simplest for increasing data reliability. Li et al. [3,4] proposed a cost-effective data replication mechanism for reducing storage cost. Their approach assumes that the failure rate of cloud storage is an exponential distribution over the storage duration. The minimum number of replicas is determined using the evaluation of the data replication strategy. Thus, the required storage space is reduced.
Ateniese et al. [5] introduced the provable data possession (PDP) model, which allows users to remotely verify their stored data in an untrusted server without retrieving them. In this model, users can minimize their communication cost during data verification. Wang et al. [6,7] proposed a public cloud data auditing system. A user can authorize a third parity to audit data integrity at low communication and computation costs. Hao et al. [8] and Barsoum [9] introduced a multiple-replica remote data possession verification protocol for determining whether the CSP meets its promise of multiple replicas of data.
Some studies have analyzed the trade-off between the benefits and costs in a large-scale cloud storage system. For the CSP, Pamies et al. [10] proposed an approach for analyzing the costs of different redundancy schemes. They introduced a set of rules for determining the optimal level of redundancy for each system configuration. Wang et al. [11] proposed a quantitative model for evaluating redundancy strategies. This model suggested suitable redundancy configurations. For users, Mian et al. [12] provided a cost model for estimating the resource cost in a public cloud. It helped users evaluate their workloads during data-intensive operations.
Li et al. [13] proposed a hierarchical pricing system for cloud service providers to set reasonable prices. In this system, higher-level cloud service providers provide higher capability facilities. If user’s tasks cannot be executed in the lower-level cloud service providers, they will be transmitted to the upper level and make more charges to the user.
Niyato et al. [14] presented a hierarchical cooperative game model for analyzing the cooperative behavior of multiple cloud service providers. In this model, a group of CSPs can cooperate to establish a resource and revenue sharing mechanism. They used the stochastic linear programming game to obtain solutions. The performance evaluation showed that the cooperation strategy can lead to a higher profit for some parameter settings. For example, if the available network bandwidth of the CSP is less than 4.7 Gbps, the CSP should combine its resource with the available resources of another CSP to offer the service.
Game theoretic model
In this section, we introduce the basic concepts of the game-theoretic model used in this study. Game theory [15,16] is a useful technique for analyzing the conflict situation between decision makers, whose objective is to maximize their payoffs. It is an approach for studying behaviors of competitors and choices of appropriate strategies. In our game-theoretic model, we use a two-player strategic game and two-player Bayesian game, which are defined as follows.
Two-player strategic game
A two-player strategic game is a tuple of
We call
If player 1’s strategy The Nash equilibrium of a two-player strategic game
In the two-player strategic game, each player is aware of the other player’s strategy set and utility function. Now, we consider the case where a player does not have complete information, such as the strategy set or the utility function of the other player. In this incomplete information game, the player makes the decision according to his belief. The belief is the probability distribution over the types of the other player. We call this incomplete information game a two-player Bayesian game, which is defined as follows.
A two-player Bayesian game is a tuple of
Before player 1 makes a decision, he chooses his type
Each player may choose a different strategy for each type. Let
For player 1’s type
For the type
After player 1 determines the best-response strategies for type
Similarly, we can define
The BNE is the specific strategy profile
The Bayesian Nash Equilibrium of a two-player Bayesian game
The BNE could be interpreted as the NE of the strategic game if each type in a Bayesian game is regarded as a distinct player in a strategic game. For example, we consider a two-player Bayesian game
Game assumptions
Let
Notation
Notation
Let
First, we consider the payoffs of
Next, we set the parameters for
In addition, we consider that
In summary, we consider the following two situations in our analysis. The first is that
In the loyalist game,
Payoff matrix of the loyalist game
Payoff matrix of the loyalist game
We assume that
For
For
If
Otherwise,
Suppose that Suppose that
Lemma 1 indicates that
For
If
Otherwise,
For
If
Otherwise,
Suppose that Suppose that
Lemma 2 indicates that
From Lemma 1 and Lemma 2, the Nash Equilibrium
The
For
If
If
Otherwise,
For
If
Otherwise,
For
We prove the theorem through case analysis.
If the decrement of storage cost
We consider that the loyalist game is played in multi rounds. In this multi-round game model, each player knows other player’s action of the previous round. We observe that
We model the opportunist game as a two-player Bayesian game, where
Payoff matrices of the opportunist game
Payoff matrices of the opportunist game
We identify the best-response strategy row for each player.
The strategies
For
For
If
Otherwise,
Suppose that Suppose that If If
Apparently,
Let α be user’s staying probability in
For
If
Otherwise,
For
If
Otherwise,
For
If
Otherwise,
For
If
Otherwise,
Suppose that Suppose that Suppose that Suppose that
After we identify
The
If
If
If
Otherwise,
We prove the theorem through case analysis.
Suppose that Suppose that Suppose that For other cases, any case of Lemma 4 cannot correspond with cases of Lemma 3. Thus,
In case, if
We use the Amazon S3 pricing model (up to March 2014) to illustrate the equilibrium conditions in our game models. When a user uses the first 1 TB storage in a month, he would be charged
Parameter settings
We assume that punishments of
As mentioned in the game assumption of Section 4,
We assume that

Subsequently, we set the failure rate
In the loyalist game, we analyze the NE among various n-replicas, data sensitivity levels, and punishment types. The different setting represents the different game model. We find the strategy profile of NE for each game model. Notice that the replicas number n is not necessarily an integer. If we consider adopting the particular erasure coding approach, n could be taken as the ratio of the total replicas number to the number of source data. In this situation, n could be a real number. For example, in a
NE in the mild punishment
NE in the mild punishment
Note:

Figure 1 shows
By analyzing the intersection of Fig. 1(a), Fig. 1(b), Fig. 1(c), and Fig. 1(d), we could determine the
Figure 2 shows
Notice that, for
In this work, we have the following discoveries. First, for
In the opportunist game, we would like to determine the BNE among various settings with the different reduced redundancy strategies n, and
Figure 3 shows
NE in the severe punishment
NE in the severe punishment
Note:

BNE for the level 1 data in the severe punishment
Note:
In summary, if
This study provides a game-theoretic model for analyzing the conflict between a user and
In the opportunist game, if the prediction of
The present findings provide a reference for a user and
We are currently extending the study to consider
