Repeated Game for Distributed Estimation in Autonomous Clustered Wireless Sensor Networks

Abstract

A commonly encountered problem in wireless sensor network (WSN) applications is reconstructing the state of nature, that is, distributed estimation of a parameter of interest from the WSN's observations. However, distributed estimation in autonomous clustered WSNs faces the vital problem of sensors' selfishness: each sensor autonomously decides whether or not to transmit its observations to the fusion center (FC) and is no longer controlled by the FC. Thus, to encourage cooperation among selfish sensors, infinitely and finitely repeated games are first formulated to model sensors' behaviors. Then, the existence of Nash equilibriums for the infinitely and finitely repeated games is discussed. Finally, simulation results show that the proposed Nash equilibrium strategies are effective.

1. Introduction

Wireless sensor networks (WSNs) have attracted increasing attention due to their wide range of applications, such as industrial control and monitoring, home automation, military surveillance, environment monitoring, and health care. WSNs usually comprise a large number of small, energy-limited sensor nodes [1–7]. Different from traditional WSNs with fully cooperative nodes [8], some WSNs consist of selfish and autonomous nodes. In such WSNs, it is common for nodes to behave selfishly in pursuit of their own aims. In other words, not all nodes are willing to cooperate to accomplish the network task. Such noncooperation can degrade network performance.

Specifically, in the traditional distributed estimation problem [8, 9], nodes are required to cooperate fully to estimate a scalar parameter under inherent limitations, such as limited energy and limited network bandwidth. In a practical WSN, these limitations constrain the design of estimation methods. Generally, the main goal is to reduce the total energy consumption while achieving a given estimation performance under these limitations. For example, in the recent literature, the distributed estimation problem in the presence of attacks is discussed, and schemes that jointly estimate the statistical description of the attacks and the parameter of interest are proposed to deal with the attacked observations [10]. Additionally, a distributed estimation method based on observation prediction is studied in [11], in which the innovations of sensors' observations are locally predicted and transmitted to the fusion center (FC). These recent advances usually assume that all sensors are selfless and can be controlled arbitrarily by the FC.

However, in autonomous WSNs with selfish nodes, nodes may not be willing to cooperatively estimate a parameter at the cost of consuming their own limited battery resources. Each node autonomously decides whether or not to transmit its observations to the FC and is no longer controlled by the FC. Consequently, a node may find that transmitting its observations to the FC is not in its best interest. Such behavior degrades the network's estimation accuracy for the parameter of interest, and this selfish refusal to transmit eventually impairs the nodes' own interests. Hence, to encourage cooperation among selfish nodes and improve the final estimation accuracy, it is necessary to design rules and punishment mechanisms that make cooperative behavior self-enforcing.

It is noted that such rules and punishment mechanisms usually are modeled as repeated games, in which the selfish nodes know when and how to cooperate in order to obtain potential interests over multiple periods. For example, the repeated game model has been adopted for packet forwarding problems in ad hoc networks. In [12, 13], the interactions of nodes' forwarding and rejection are modeled as repeated games. In [12], as a punishment strategy, a generous tit-for-tat (TFT) is proposed to enforce the nodes to cooperate. Meanwhile, in [13], three learning algorithms for different information structures are proposed to achieve the desired efficient cooperation equilibrium. Additionally, the repeated game model has been applied to address selfish behavior in the media access control (MAC) problem of sensor networks. For example, in [14], a contention window select game (CWSG) is defined, and a penalizing mechanism based on repeated games is proposed to prevent nodes' noncooperation.

We propose two simple repeated games instead of the extensive game [15] to meet the given estimation performance requirement. Different from the decentralized methods [16–18], our game-theoretic approach is distributed and each node is selfish. To counter the selfishness of nodes, a grim trigger strategy and the tit-for-tat strategy are introduced for the infinitely repeated estimation game, under which each sensor cooperates voluntarily. Meanwhile, multiple subgame-perfect Nash equilibriums of the finitely repeated estimation game are discussed to describe the cooperation behaviors.

Our main contributions are as follows: (1) two kinds of repeated game models for distributed estimation in WSNs are formulated, namely, the infinitely repeated estimation game and the finitely repeated estimation game; (2) their Nash equilibriums and subgame-perfect Nash equilibriums are derived; and (3) the effectiveness of the resulting strategies is verified in simulations.

2. System Model

2.1. Distributed Estimation Problem

Let us consider a distributed WSN with an FC as shown in Figure 1. This sensor network consists of K selfish nodes that observe a physical phenomenon θ (a scalar parameter of interest), such as the temperature or moisture of soil. The nodes are selfish in the sense that the FC does not dictate any scheduling policy to the local nodes. Instead, all the local nodes choose their transmission policies by themselves to selfishly maximize their own interests. Here, the network channel is assumed to be error-free and can be implemented by orthogonal time/frequency/code division multiple access (TDMA/FDMA/CDMA).

Figure 1

Sensor networks with selfish nodes.

As shown in Figure 1, two virtual cluster heads (CHs) and ( $K - 2$ ) cluster nodes (CNs) are grouped into two clusters by a distributed clustering algorithm. It is assumed that each virtual cluster is regarded as a community of interests and that CNs are inclined to be scheduled by their virtual CH to maximize their community's interests. In other words, there are two different communities of interests. Each CH has two jobs: (1) negotiating with the FC and (2) scheduling the actions of its CNs, including itself.

It is assumed that the observation of node k at time t is described as

\begin{matrix} x_{t}^{k} = θ + n_{t}^{k}, \end{matrix}

(1)

where $n_{t}^{k}$ is zero-mean additive white Gaussian noise (AWGN) with variance $σ^{2}$ . Additionally, the noise samples $\{n_{t}^{k}\}$ are independent and identically distributed (i.i.d.) across time and across nodes. Due to the channels' bandwidth constraint, the same one-bit quantizer with threshold τ is adopted at each node.

Here, we review a key result from [16] concerning the distributed estimation problem in cooperative WSNs. It is assumed that a set of indicator variables (binary observations) is transmitted by the local nodes and that the classical maximum likelihood estimator (MLE) [16] is adopted at the FC. According to Proposition $1$ in [16], the Cramer-Rao lower bound (CRLB) is inversely proportional to the number of nodes K. Since the CRLB serves as the benchmark for the estimation variance, the smaller the CRLB, the better the estimation performance. Therefore, to meet a given estimation performance $B (θ)$ , there exists a required minimum number of participants $K_{0}$ , that is, nodes that transmit their observations voluntarily.
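The dependence of the CRLB on the number of participants can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: it uses the Fisher information of K i.i.d. one-bit observations $b = 1\{x > τ\}$ of the model in (1), and an estimator that inverts the empirical fraction of ones, which may differ in detail from the exact expressions in [16]. The function names and the numerical target are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf, pdf, and inverse cdf


def crlb_one_bit(theta, tau, sigma, k):
    """CRLB for theta from k i.i.d. bits b = 1{theta + n > tau}, n ~ N(0, sigma^2)."""
    z = (theta - tau) / sigma
    q = STD_NORMAL.cdf(z)               # P(b = 1)
    dq = STD_NORMAL.pdf(z) / sigma      # derivative of q with respect to theta
    fisher_per_bit = dq ** 2 / (q * (1.0 - q))
    return 1.0 / (k * fisher_per_bit)


def min_participants(theta, tau, sigma, target):
    """Smallest K0 such that the CRLB with K0 transmitting nodes is <= target."""
    return math.ceil(crlb_one_bit(theta, tau, sigma, 1) / target)


def simulate_bits(theta, tau, sigma, k, rng):
    """One bit per node: 1 if the noisy observation exceeds the threshold."""
    return [1 if theta + rng.gauss(0.0, sigma) > tau else 0 for _ in range(k)]


def mle_one_bit(bits, tau, sigma):
    """Estimate theta by inverting the empirical fraction of ones."""
    k = len(bits)
    # Clip away from 0 and 1 so that the inverse cdf stays finite.
    q_hat = min(max(sum(bits) / k, 0.5 / k), 1.0 - 0.5 / k)
    return tau + sigma * STD_NORMAL.inv_cdf(q_hat)
```

For instance, with θ = τ and σ = 1 the single-bit bound equals π/2, so a hypothetical target of 0.4 gives `min_participants(0.0, 0.0, 1.0, 0.4) == 4`.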

The problem in distributed estimation arises because these selfish nodes have the authority to decide for themselves whether to transmit their binary information at each estimation stage. The FC cannot make unilateral decisions or dictate nodes' behaviors. For example, the decentralized power optimization schemes in [17, 18], which schedule the observation flow by solving Karush-Kuhn-Tucker (KKT) systems, are no longer suitable for autonomous WSNs. It is natural to assume that every node selfishly optimizes its own interest, for example, by maximizing its energy efficiency.

It is worth underlining that interactions among nodes happen not just once but repeatedly. Different from the extensive form game in [19], the special class of extensive form games called repeated games can explain why ongoing estimation tasks produce behavior very different from that observed in the one-time interaction in [19]. Additionally, the extensive form game in [19] assumes that all the nodes cooperate fully. In other words, the refined Nash equilibrium in [19] is not suitable for describing nodes' selfishness and autonomy. Therefore, motivated by the punishment mechanisms in [12, 13], the estimation problem in autonomous WSNs is reformulated as a repeated game to capture this autonomy and improve local nodes' energy efficiency.

2.2. Repeated Game

The repeated game theory is considered as a formal framework to model a multiplayer sequential decision making process. The model of repeated games has two versions: the horizon may be finite or infinite. It is noted that the results in the above two cases are different. Thus, in order to apply the model of repeated games in distributed estimation problems, an appropriate horizon (finite or infinite horizon) is required to be determined. In the following, some concepts of a repeated game are firstly introduced. Then, we formulate the distributed estimation system into an appropriate repeated game.

The stage game G is the basic component of a repeated game and can be represented by the triple $〈N, (A_{i}), (u_{i})〉$ , where $N$ , $A_{i}$ , and $u_{i}$ denote the set of players, the finite action set of player i, and the payoff function of player i, respectively. Additionally, $G^{T}$ denotes the game in which the stage game is played for T periods. If T is infinite, the game is called an infinitely repeated game. The infinitely repeated game is formally defined following [20]. Here, $a^{t}$ denotes the action profile in period t and $δ^{t}$ is the discount factor δ raised to the power t. It is assumed that the same δ is adopted for all the players.

Definition 1.

The infinitely repeated game $G^{\infty}$ of the stage game G for the discount factor δ is the extensive game with perfect information and simultaneous moves in which

(i) the set of players is N,

(ii) the set of terminal histories is the set of infinite sequences $(a^{1}, a^{2}, \dots)$ of action profiles in G,

(iii) the player function assigns the set of all players to every proper subhistory of every terminal history,

(iv) the set of actions available to player i after any history is $A_{i}$ ,

(v) each player i evaluates each terminal history $(a^{1}, a^{2}, \dots)$ according to its discounted average $(1 - δ) \sum_{t = 1}^{\infty} δ^{t - 1} u_{i} (a^{t})$ .

The formal description of finitely repeated games is very similar to that of infinitely repeated games and is given as follows.

Definition 2.

For any positive integer T, the T-period finitely repeated game $G^{T}$ is the extensive game with perfect information and simultaneous moves that satisfies all the conditions of Definition 1 when the symbol ∞ is replaced by T. Meanwhile, it is assumed that each player's preferences are represented by the mean payoff $\sum_{t = 1}^{T} u_{i} (a^{t}) / T$ .
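The two evaluation criteria in Definitions 1 and 2 can be sketched numerically as follows. This is a minimal illustration, assuming that an infinite payoff stream is approximated by a long finite prefix; the function names are hypothetical.

```python
def discounted_average(payoffs, delta):
    """(1 - delta) * sum over t >= 1 of delta^(t-1) * u(a^t), truncated to len(payoffs)."""
    return (1.0 - delta) * sum(delta ** t * u for t, u in enumerate(payoffs))


def mean_payoff(payoffs):
    """Mean payoff over a finite horizon of T = len(payoffs) periods."""
    return sum(payoffs) / len(payoffs)
```

For a constant stream of payoff 3, the discounted average of a long prefix is close to 3 for any δ in (0, 1), whereas the mean payoff of a finite stream weights all periods equally.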

3. Repeated Estimation Game

It is noted that the CRLB is inversely proportional to K and also depends on other parameters of the distributed estimation problem, such as θ, τ, and σ [16]. In other words, the estimation performance of the MLE depends on K, θ, τ, and σ. Selfish nodes are battery powered and cannot be recharged once their energy is exhausted. Therefore, the number of available nodes K changes as the estimation task is repeated. Additionally, it is assumed that the physical phenomenon is stable and that the same MLE is adopted at each stage of the repeated estimation task.

To maintain the performance of the MLE for as long as possible, the K selfish nodes should survive as long as possible. However, the cooperation problem among selfish nodes in sequential estimation tasks has not been addressed by traditional estimation methods. Repeated games can deal with the problem of nodes' survival: the selfish nodes know when and how to cooperate so that the nodes are kept alive evenly over many periods [20]. Thus, the following repeated estimation game is introduced to explore the impact of nodes' selfishness on the estimation performance.

3.1. Stage Game

To be concrete, in the case of the estimation problem, we need to review several notions, namely, the stage game, the game history, and the strategy of a player. The stage game consists of a set of players, a set of actions, and a payoff function for each player. Here, the set of players is $\{1, 2\}$ (i.e., the two virtual clusters introduced in Section 2.1). The action set of each cluster is $\{\text{Cooperation}, \text{Defection}\}$ , where $\text{Defection}$ denotes that no node in the cluster transmits its observation at the current stage. In contrast, $\text{Cooperation}$ denotes that $K_{1}$ nodes ( $K_{1} \in (0, K_{0}]$ ) transmit their observations in cluster $1$ and $K_{2}$ nodes ( $K_{2} \in (0, K_{0}]$ ) transmit their observations in cluster $2$ . $K_{1}$ and $K_{2}$ can be expressed as

\begin{matrix} (K_{1}, K_{2}) = \begin{cases} (⌈\frac{K_{0}}{2}⌉, K_{0} - ⌈\frac{K_{0}}{2}⌉), & a^{t} = (C, C); \\ (K_{0}, 0), & a^{t} = (C, D); \\ (0, K_{0}), & a^{t} = (D, C); \\ (0, 0), & a^{t} = (D, D) . \end{cases} \end{matrix}

(2)

Assume that all these nodes are rational and aim at maximizing their cluster's interest. Thus, the set of players is the two clusters and each cluster's strategy space is $\{C, D\}$ , where C denotes “Cooperation” and D denotes “Defection.” As shown in (2), if both clusters choose “C,” $K_{1} = ⌈K_{0} / 2⌉$ nodes transmit their observations in cluster $1$ and $K_{2} = K_{0} - ⌈K_{0} / 2⌉$ nodes transmit their observations in cluster $2$ . If both clusters choose “D,” no nodes transmit observations in the network. If one cluster chooses “D” and the other chooses “C,” then no nodes transmit observations in the cluster with “D” and $K_{0}$ nodes transmit observations in the cluster with “C.”

According to the results in Section 2.1, there exists a minimum number $K_{0}$ of nodes that must transmit their observations voluntarily. Thus, if at least one of the clusters chooses strategy “C,” a total of $K_{0}$ nodes transmit their observations to the FC and the given estimation performance is satisfied. In that case, a cluster that chooses “C” obtains some benefit rather than nothing.
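The mapping in (2) from an action profile to the numbers of transmitting nodes can be written as a short function. This is a minimal sketch; the function name and the string encoding of actions ("C" and "D") are assumptions made for illustration.

```python
import math


def participants(action_profile, k0):
    """Return (K1, K2) for the stage action profile, following (2)."""
    a1, a2 = action_profile
    if a1 == "C" and a2 == "C":
        k1 = math.ceil(k0 / 2)   # cluster 1 takes the larger half when K0 is odd
        return k1, k0 - k1
    if a1 == "C":
        return k0, 0             # cluster 1 alone meets the requirement
    if a2 == "C":
        return 0, k0             # cluster 2 alone meets the requirement
    return 0, 0                  # nobody transmits
```

Whenever at least one cluster cooperates, the two numbers sum to $K_{0}$ , so the estimation requirement is met.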

Players' payoff function can be given as

\begin{matrix} u_{k} (a^{t}) = \begin{cases} α, & a^{t} = (C, C); \\ β, & a_{k}^{t} = C, \ a_{j}^{t} = D \ (j \neq k); \\ γ, & a_{k}^{t} = D, \ a_{j}^{t} = C \ (j \neq k); \\ ρ, & a^{t} = (D, D) . \end{cases} \end{matrix}

(3)

It is noted that the payoff function $u_{k} (\cdot)$ represents a player's preference. For example, if strategy $(C, C)$ is adopted, the estimation performance is satisfied, whereas if strategy $(D, D)$ is adopted, the estimation performance is not satisfied. Thus, $u_{k} (C, C) = α > ρ = u_{k} (D, D)$ for every player if and only if players prefer strategy $(C, C)$ to strategy $(D, D)$ . Similarly, if strategy $(C, D)$ is adopted, the 1st player's estimation performance is also satisfied at the cost of consuming more of its residual energy ( $K_{1} = K_{0}$ sensors transmit their observations in cluster $1$ instead of $K_{1} = ⌈K_{0} / 2⌉$ ). Thus, the payoff of player $1$ becomes smaller because it meets the performance requirement at the cost of consuming more residual energy ( $K_{2} = K_{0} - K_{1}$ ), but the payoff of player $2$ becomes the greatest because it meets the performance requirement without consuming any residual energy. In other words, $γ > α > β > ρ$ .
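The payoff function in (3) and its ordering constraint can be encoded as a lookup table. The sketch below is illustrative only; the function name is hypothetical, and the numerical values used in the usage note are those chosen later in Section 4.

```python
def stage_payoffs(action_profile, alpha, beta, gamma, rho):
    """Return (u_1, u_2) for one stage according to (3)."""
    if not gamma > alpha > beta > rho:
        raise ValueError("payoffs must satisfy gamma > alpha > beta > rho")
    table = {
        ("C", "C"): (alpha, alpha),
        ("C", "D"): (beta, gamma),   # player 1 bears the whole cost
        ("D", "C"): (gamma, beta),   # player 2 bears the whole cost
        ("D", "D"): (rho, rho),
    }
    return table[tuple(action_profile)]
```

With γ = 4, α = 3, β = 1, and ρ = 0, the profile (C, D) yields the payoffs (1, 4).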

3.2. Infinitely Repeated Game

It is well known that a strategy of a player in an infinitely repeated game should specify an action of the player for every sequence of outcomes. For the case of the estimation problem, a grim trigger strategy is defined as follows: $s_{k} (ϕ) = C$ and

\begin{matrix} s_{k} (a^{1}, \dots, a^{t}) = \begin{cases} C, & \text{if } a_{j}^{τ} = C \text{ for } τ = 1, \dots, t; \\ D, & \text{otherwise}, \end{cases} \end{matrix}

(4)

where $s_{k} (ϕ) = C$ denotes that player k chooses C at the start of the game and $s_{k} (a^{1}, \dots, a^{t}) = C$ denotes that player k chooses C after any history in which every previous action of player j was C. The grim trigger strategy (labeled as Grim) is illustrated in Figure 2.

Figure 2

A grim trigger strategy for a repeated estimation game.

Another strategy (the tit-for-tat strategy, labeled as TFT) is shown in Figure 3. The strategy can be described in a very compact way: start by cooperating and then do whatever the other player did on the previous iteration.

Figure 3

The tit-for-tat strategy for a repeated estimation game.

Now, suppose each player has selected a strategy $s_{i}$ for playing the infinitely repeated estimation game. The pair of strategies $(s_{1}, s_{2})$ determines exactly how the game will proceed, which allows us to discuss whether the pair is a Nash equilibrium.
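How a pair of strategies determines the course of play can be sketched in a few lines. In this illustration, a strategy is assumed to be a function of the player's own history and the opponent's history; the names `grim`, `tit_for_tat`, `always_defect`, and `play` are hypothetical, and `always_defect` is included only as an example of a deviation.

```python
def grim(own_history, opponent_history):
    """Cooperate until the opponent has defected once, then defect forever."""
    return "D" if "D" in opponent_history else "C"


def tit_for_tat(own_history, opponent_history):
    """Cooperate first, then copy the opponent's previous action."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(own_history, opponent_history):
    """Hypothetical deviating strategy, used only for comparison."""
    return "D"


def play(strategy_1, strategy_2, periods):
    """Return the list of action profiles generated by a pair of strategies."""
    h1, h2 = [], []
    for _ in range(periods):
        a1 = strategy_1(h1, h2)   # both players move simultaneously,
        a2 = strategy_2(h2, h1)   # seeing only the past periods
        h1.append(a1)
        h2.append(a2)
    return list(zip(h1, h2))
```

Under (Grim, Grim) or (TFT, TFT), the outcome is (C, C) in every period; against a player that always defects, both strategies cooperate once and then defect.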

Proposition 3.

For the infinitely repeated estimation game, strategy profile (Grim, Grim) is a Nash equilibrium if and only if $δ \geq (γ - α) / (γ - ρ)$ ; strategy profile (TFT, TFT) is a Nash equilibrium if and only if $δ \geq (γ - α) / (α - β)$ and $δ \geq (γ - α) / (γ - ρ)$ .

Proof.

Suppose that player $1$ adheres to the strategy TFT. If player $2$ deviates by choosing “D” in the first estimation period, then player $1$ chooses “D” in the second estimation period and continues to choose “D” until player $2$ reverts to “C.” As shown in Figure 3, player $2$ then has two choices: reverting to “C” or adhering to “D.” If player $2$ reverts to “C” after each defection, so that the two players alternate between “C” and “D,” its corresponding payoffs are $(γ, β, γ, β, \dots)$ , with a discounted average of

\begin{matrix} U_{2} = (1 - δ) \cdot \frac{γ}{(1 - δ^{2})} + (1 - δ) \cdot δ \cdot \frac{β}{(1 - δ^{2})} = \frac{(γ + δ β)}{(1 + δ)}, \end{matrix}

(5)

while for adhering to “D,” its corresponding payoffs are $(γ, ρ, ρ, \dots)$ , with a discounted average of

\begin{matrix} U_{2} = (1 - δ) \cdot γ + (1 - δ) \cdot δ \cdot \frac{ρ}{(1 - δ)} = (1 - δ) \cdot γ + δ \cdot ρ . \end{matrix}

(6)

If player $2$ also adheres to the tit-for-tat strategy, its corresponding payoffs are $(α, α, \dots)$ , with a discounted average of α. According to formulas (5) and (6), the tit-for-tat strategy of each player is the best response to the strategy TFT of the other player if and only if

\begin{matrix} δ \geq \frac{(γ - α)}{(α - β)}, \\ δ \geq \frac{(γ - α)}{(γ - ρ)} . \end{matrix}

(7)

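This completes the proof that strategy profile (TFT, TFT) is a Nash equilibrium. Similarly, if player $1$ adheres to Grim and player $2$ deviates by choosing “D,” player $1$ chooses “D” in every subsequent period, so the best that player $2$ can obtain is the payoff sequence $(γ, ρ, ρ, \dots)$ with the discounted average given in (6). This does not exceed α if and only if $δ \geq (γ - α) / (γ - ρ)$ , so strategy profile (Grim, Grim) is a Nash equilibrium under this condition. This completes the proof.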
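The conditions of Proposition 3 can be checked numerically. The sketch below simply evaluates the thresholds and the deviation payoffs in (5) and (6); the function names are hypothetical, and the values in the usage note are the simulation parameters of Section 4.

```python
def grim_threshold(alpha, beta, gamma, rho):
    """Smallest delta for which (Grim, Grim) is a Nash equilibrium."""
    return (gamma - alpha) / (gamma - rho)


def tft_threshold(alpha, beta, gamma, rho):
    """Smallest delta for which (TFT, TFT) is a Nash equilibrium."""
    return max((gamma - alpha) / (alpha - beta), (gamma - alpha) / (gamma - rho))


def deviation_payoffs(alpha, beta, gamma, rho, delta):
    """Discounted averages (5) and (6) obtained by a deviating player."""
    alternate = (gamma + delta * beta) / (1.0 + delta)   # payoffs (γ, β, γ, β, ...)
    defect = (1.0 - delta) * gamma + delta * rho         # payoffs (γ, ρ, ρ, ...)
    return alternate, defect
```

With γ = 4, α = 3, β = 1, and ρ = 0, the thresholds are 1/4 for (Grim, Grim) and 1/2 for (TFT, TFT), so δ = 1/2 satisfies both conditions, the latter with equality.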

3.3. Finitely Repeated Game

The strategy space of a repeated game is difficult to illustrate even if the game is repeated just $2$ times. To determine how to play a finitely repeated estimation game, the equilibrium in the one-shot version of the game is investigated first. As an example, consider the simplest situation, in which two players play the estimation game twice. The players interact repeatedly with the stage payoffs shown in Table 1.

Table 1

Payoff of the repeated estimation game.

P1 \ P2	C	D
C	α, α	β, γ
D	γ, β	ρ, ρ

The repeated estimation game $G^{T}$ ( $T = 2$ ) can be expressed in the extensive form. As shown in Figure 4, there are four histories at $t = 1$ : ( $C, C$ ), ( $C, D$ ), ( $D, C$ ), and ( $D, D$ ). It is easily derived that a reduced game for any history starting at $t = 1$ is expressed as Table 2.

Table 2

Payoff of the reduced two-stage game.

P1 \ P2	C	D
C	$π_{1} + α, π_{2} + α$	$π_{1} + β, π_{2} + γ$
D	$π_{1} + γ, π_{2} + β$	$π_{1} + ρ, π_{2} + ρ$

Figure 4

Sum of the two stage-game payoffs.

For example, after ( $C, C$ ) in the initial round, each player's payoffs are increased by $π_{1} = π_{2} = α$ ; after ( $C, D$ ) in the initial round, the 1st player's payoffs are increased by $π_{1} = β$ and the 2nd player's payoffs are increased by $π_{2} = γ$ ; after ( $D, C$ ) in the initial round, the 1st player's payoffs are increased by $π_{1} = γ$ and the 2nd player's payoffs are increased by $π_{2} = β$ ; after ( $D, D$ ) in the initial round, each player's payoffs are increased by $π_{1} = π_{2} = ρ$ .

Since a player's preferences in the stage game do not change when a constant is added to his payoffs, the set of Nash equilibriums of the reduced estimation game is the same as that of the stage game (namely, the game of the initial round). This leads to the following general result on finitely repeated game equilibriums [21], whose proof is omitted here.

Lemma 4.

For the finitely repeated game $G^{T}$ , assume that the stage game has a unique Nash equilibrium $s^{*}$ . Then, $G^{T}$ has a unique subgame-perfect Nash equilibrium (SPNE), in which $s^{*}$ is played at each round independent of the history of the previous rounds.

As shown in Table 1, the two players' sets of actions are the same and their preferences have the following characteristics:

\begin{matrix} u_{1} (a_{1}^{t}, a_{2}^{t}) = u_{2} (a_{2}^{t}, a_{1}^{t}), \end{matrix}

(8)

for every action pair $(a_{1}^{t}, a_{2}^{t})$ . This two-player strategic game at any stage is therefore a symmetric game and has a unique mixed strategy Nash equilibrium, in which each player's mixed strategy assigns probability $(β - ρ) / (γ + β - α - ρ)$ to C and probability $(γ - α) / (γ + β - α - ρ)$ to D.
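The mixed-strategy probabilities follow from the usual indifference condition: when the opponent plays C with probability p, a player must obtain the same expected payoff from C and from D. The sketch below checks this with exact rational arithmetic; the function names are hypothetical and integer payoffs are assumed.

```python
from fractions import Fraction


def mixed_equilibrium(alpha, beta, gamma, rho):
    """Probability of C in the symmetric mixed-strategy Nash equilibrium."""
    return Fraction(beta - rho, gamma + beta - alpha - rho)


def expected_payoff(own_action, p_coop, alpha, beta, gamma, rho):
    """Expected stage payoff when the opponent plays C with probability p_coop."""
    if own_action == "C":
        return p_coop * alpha + (1 - p_coop) * beta
    return p_coop * gamma + (1 - p_coop) * rho
```

For γ = 4, α = 3, β = 1, and ρ = 0, the probability of C is 1/2, and both actions then give an expected payoff of 2.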

In other words, there are multiple Nash equilibriums in the one-shot stage game of the finitely repeated estimation game: $(C, D)$ , $(D, C)$ , and the mixed strategy that assigns probability $(β - ρ) / (γ + β - α - ρ)$ to C and probability $(γ - α) / (γ + β - α - ρ)$ to D. Hence, the uniqueness condition of Lemma 4 does not hold. In fact, there are multiple SPNEs in the finitely repeated estimation game, and some of them are given as follows:

(1) $(C, D), (C, D), \dots, (C, D)$ (T even rounds).

(2) $(D, C), (D, C), \dots, (D, C)$ (T even rounds).

(3) $(C, D), (D, C), (C, D), (D, C), \dots, (C, D), (D, C)$ (T even rounds).

(4) $(D, C), (C, D), (D, C), (C, D), \dots, (D, C), (C, D)$ (T even rounds).

Here, the first strategy profile means that (for $T = 2$ ) the 1st player's first move is to play C and its second move is to play C after every possible history, while the 2nd player's first move is to play D and its second move is to play D after every possible history. The average payoff for the first strategy profile is $(β, γ)$ . Similarly, the average payoffs for the second, third, and fourth strategy profiles are $(γ, β)$ , $((β + γ) / 2, (β + γ) / 2)$ , and $((β + γ) / 2, (β + γ) / 2)$ , respectively. These strategy profiles are SPNEs because, in each of $(C, D)$ and $(D, C)$ , every player's action is a best response to the other player's action in every subgame.
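The pure-strategy Nash equilibriums of the stage game in Table 1 and the average payoffs along an equilibrium path can be verified by direct enumeration. The following sketch is illustrative; the function names are hypothetical and integer payoffs are assumed so that exact fractions can be used.

```python
from fractions import Fraction
from itertools import product

ACTIONS = ("C", "D")


def payoff_table(alpha, beta, gamma, rho):
    """Stage payoffs (u_1, u_2) as in Table 1."""
    return {("C", "C"): (alpha, alpha), ("C", "D"): (beta, gamma),
            ("D", "C"): (gamma, beta), ("D", "D"): (rho, rho)}


def pure_nash_equilibria(table):
    """Action profiles from which no player gains by a unilateral deviation."""
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        u1, u2 = table[(a1, a2)]
        best_1 = all(u1 >= table[(d, a2)][0] for d in ACTIONS)
        best_2 = all(u2 >= table[(a1, d)][1] for d in ACTIONS)
        if best_1 and best_2:
            equilibria.append((a1, a2))
    return equilibria


def average_payoffs(path, table):
    """Mean payoff of each player along a finite path of action profiles."""
    total_1 = sum(table[a][0] for a in path)
    total_2 = sum(table[a][1] for a in path)
    return Fraction(total_1, len(path)), Fraction(total_2, len(path))
```

With γ = 4, α = 3, β = 1, and ρ = 0, the enumeration returns (C, D) and (D, C), and the alternating path over T = 24 rounds gives each player the average payoff (β + γ)/2 = 5/2.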

According to Proposition 3 and Lemma 4, the Nash equilibriums of the proposed repeated estimation games address the problem of nodes' selfishness and distribute nodes' actions evenly. It is noted that the MLE [16] can be extended to nonideal channels [22]. Moreover, nonideal channels have no effect on the proposed game because no additional information is exchanged among nodes. Thus, the results can also be applied to nonideal channels.

4. Simulation Results

In this section, simulation results obtained with Matlab are presented. Ten sensor nodes are randomly deployed in a square region of 200 m × 200 m. It is assumed that the minimum number of participants $K_{0}$ is equal to $4$ . The MLE is adopted by the FC, which is located at $(100, 100)$ . The nodes are randomly divided into two virtual clusters with $5$ members each. As shown in Figure 5, the cluster with sensors ( $1, 2, 3, 4, 5$ ) is the 1st player and the cluster with sensors ( $6, 7, 8, 9, 10$ ) is the 2nd player. For efficiency and fairness, within each cluster, the nodes with the most residual energy are selected in order to play the repeated estimation game. The discount factor δ is set to $1 / 2$ . The payoffs γ, α, β, and ρ are set to $4$ , $3$ , $1$ , and $0$ , respectively.

Figure 5

The players' actions for the infinitely repeated estimation game: cooperation.

An energy dissipation model similar to the simple model in [9] is adopted for nodes' radio hardware. In this model, $E_{\text{elec}}$ denotes the electronics energy consumption and $ϵ_{fs}$ and $ϵ_{mp}$ are energy factors. The energy consumption of sensor i in a stage game is expressed as

\begin{matrix} E_{(i)} = l E_{\text{elec}} + l ϵ_{fs} d_{(i, FC)}^{2}, \end{matrix}

(9)

where $d_{(i, FC)}$ denotes the distance from sensor i to the FC. The initial energy of each node is set to $5$ J. Because each sensor quantizes its local estimate by using a one-bit quantizer, the packet length l is assumed to be $10$ bits, including $9$ header bits, for simplicity.
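The energy model in (9) can be evaluated as follows. The paper does not state numerical values for the electronics energy and the free-space factor, so the defaults below (50 nJ/bit and 10 pJ/bit/m²) are assumptions chosen only for illustration, as are the function names.

```python
def stage_energy(distance_m, bits=10, e_elec=50e-9, eps_fs=10e-12):
    """Energy in joules spent by one sensor for one transmission, as in (9).

    e_elec (J/bit) and eps_fs (J/bit/m^2) are assumed values, not from the paper.
    """
    return bits * e_elec + bits * eps_fs * distance_m ** 2


def transmissions_until_depleted(distance_m, initial_energy=5.0, **kwargs):
    """Number of full transmissions a sensor can afford from its initial energy."""
    return int(initial_energy // stage_energy(distance_m, **kwargs))
```

Under these assumed constants, a sensor located 100 m from the FC spends 1.5 μJ per 10-bit packet, and the cost grows with the square of the distance.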

As shown in Figure 5, strategy profile (Grim, Grim) is adopted and both clusters choose the “C” strategy. It is noted that strategy profile (Grim, Grim) is a Nash equilibrium for the chosen parameters (γ, α, β, ρ, and δ), which is consistent with Proposition 3. Additionally, according to the definition of the stage game in Section 3.1, $4$ sensors transmit their observations at each stage ( $2$ sensors in cluster $1$ and $2$ sensors in cluster $2$ ). Sensors ( $2, 3, 4$ ) in cluster $1$ and sensors ( $6, 9, 10$ ) in cluster $2$ cooperate more often than the other sensors. Because of the energy efficiency and fairness requirement, sensors that are farther from the FC consume more energy per transmission and therefore cooperate fewer times. For example, sensors ( $1, 2, 6, 7$ ) are selected to be the actual players at stage $1$ of the infinitely repeated estimation game. At stages $2$ and $3$ , sensors ( $3, 4, 8, 9$ ) and sensors ( $2, 5, 6, 10$ ) are selected to be the actual players, respectively. At stage $3$ , sensor $2$ is the nearest to the FC in cluster $1$ , and sensor $6$ is the nearest to the FC in cluster $2$ . Additionally, as shown in Figure 6, sensors' residual energy varies with the player (cluster). The members of the 2nd player are relatively closer to the FC than the members of the 1st player. Thus, the energy cost of the 2nd player is less than that of the 1st player when playing the same strategy.

Figure 6

Sensors' residual energy for the infinitely repeated estimation game: cooperation.

To show the effectiveness of the SPNEs for the finitely repeated estimation game, the strategy profile “ $(C, D), (D, C), \dots, (C, D), (D, C)$ ” is adopted by the two players. As shown in Figures 7 and 8, the resulting distributions of cooperation times and residual energy are similar to those of the infinitely repeated estimation game. For example, sensors ( $2, 3, 4, 5$ ) are selected to be the actual players at stage $1$ of the finitely repeated estimation game. At stages $2$ , $3$ , and $4$ , sensors ( $7, 8, 9, 10$ ), sensors ( $1, 2, 3, 4$ ), and sensors ( $6, 8, 9, 10$ ) are selected to be the actual players, respectively.

Figure 7

The players' actions for the finitely repeated estimation game: $(C, D), (D, C), \dots, (C, D), (D, C)$ (SPNE, $T = 24$ ).

Figure 8

Sensors' residual energy for the finitely repeated estimation game: $(C, D), (D, C), \dots, (C, D), (D, C)$ (SPNE, $T = 24$ ).

Moreover, the numbers of transmissions (cooperation times) of the sensors are depicted in Figure 9. In both the infinitely and the finitely repeated estimation games, the sensors ( $2, 3, 4, 6, 9, 10$ ) that are closer to the FC transmit more often. Meanwhile, it is assumed that a player's payoff is divided evenly within the cluster as follows: (1) if the player adopts strategy “D,” all of its sensors obtain the same payoff; (2) if the player adopts strategy “C,” its active sensors divide the payoff evenly. For the sake of comparison, sensors' payoffs for the infinitely and finitely repeated estimation games are defined as their respective algebraic sums without the discount factor, as shown in Figure 10. For the infinitely repeated estimation game, the sensors ( $2, 3, 4, 6, 9, 10$ ) that are closer to the FC obtain larger payoffs; that is, more work earns more pay. However, for the finitely repeated estimation game, the payoff is shared evenly among all members of a cluster whenever the cluster adopts strategy “D.” As a result, the payoffs of the sensors are almost the same for the finitely repeated estimation game in Figure 10.
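The payoff-sharing rule described above can be written as a short function. This is a minimal sketch of rules (1) and (2); the function name and argument layout are hypothetical, and at least one active sensor is assumed when the cluster plays “C.”

```python
def sensor_payoffs(members, active, action, cluster_payoff):
    """Split one stage payoff of a cluster among its sensors.

    "D": every member receives an equal share.
    "C": only the transmitting (active) sensors share the payoff.
    """
    receivers = members if action == "D" else active
    share = cluster_payoff / len(receivers)
    return {s: (share if s in receivers else 0.0) for s in members}
```

Summing these per-stage shares over all stages, without discounting, gives per-sensor totals of the kind plotted in Figure 10.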

Figure 9

The players' times of transmissions or cooperation: infinitely and finitely repeated estimation games.

Figure 10

The players' payoffs: infinitely and finitely repeated estimation games.

5. Conclusions

In this paper, we study repeated games for distributed estimation in WSNs. Two kinds of repeated estimation games (infinitely and finitely repeated) are investigated, and the existence of their Nash equilibriums is established. In particular, the profiles (Grim, Grim) and (TFT, TFT) for the infinitely repeated estimation game and some SPNEs for the finitely repeated estimation game are discussed in detail. Finally, simulation results show that some Nash equilibriums of the proposed infinitely and finitely repeated games are effective.

Footnotes

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by National Natural Science Foundation, China (61403089, 61162008, and 61573153), Program for Guangzhou Municipal Colleges and Universities (1201431034), Guangdong Science & Technology Project (2013B0104) and Guangzhou Education Bureau Science and Technology Project (2012A082), and Guangzhou Science and Technology Foundation (nos. 2014J4100142, 2014J410023).

References

1. Yang, Chen, Sun, “Energy-efficient probabilistic area coverage in wireless sensor networks,” IEEE Transactions on Vehicular Technology, vol. 64, no. 1, pp. 367–377, 2015. doi: 10.1109/TVT.2014.2300181

2. Chen, Shen X. S., Sun, “Mobility and intruder prior information improving the barrier coverage of sparse sensor networks,” IEEE Transactions on Mobile Computing, vol. 13, no. 6, pp. 1268–1282, 2014. doi: 10.1109/tmc.2013.129

3. Chen, Yau D. K. Y., Sun, “Cross-layer optimization of correlated data gathering in wireless sensor networks,” IEEE Transactions on Mobile Computing, vol. 11, no. 11, pp. 1678–1691, 2012. doi: 10.1109/tmc.2011.210

4. Chen, Cheng, Sun, “Maintaining quality of sensing with actors in wireless sensor networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 9, pp. 1657–1667, 2012. doi: 10.1109/tpds.2012.100

5. Chen, Sun, “Novel deployment schemes for mobile sensor networks,” Sensors, vol. 7, no. 11, pp. 2907–2919, 2007. doi: 10.3390/s7112907

6. Liu, “A novel joint logging and migrating traceback scheme for achieving low storage requirement and long lifetime in WSNs,” AEU - International Journal of Electronics and Communications, vol. 69, no. 10, pp. 1464–1482, 2015. doi: 10.1016/j.aeue.2015.06.016

7. Liu, Chen, “Analysis and improvement of send-and-wait automatic repeat-request protocols for wireless sensor networks,” Wireless Personal Communications, vol. 81, no. 3, pp. 923–959, 2015. doi: 10.1007/s11277-014-2164-6

8. Liu, Chen, Zhou, Shu, “Position-based adaptive quantization for target location estimation in wireless sensor networks using one-bit data,” Wireless Communications and Mobile Computing, 2015. doi: 10.1002/wcm.2576

9. Liu, Chen, Zhang, Xiang, Zhou, “Adaptive quantization for distributed estimation in cluster-based wireless sensor networks,” AEU - International Journal of Electronics and Communications, vol. 68, no. 6, pp. 484–488, 2014. doi: 10.1016/j.aeue.2013.12.004

10. Zhang, Blum R. S., Conus, “Asymptotically optimum distributed estimation in the presence of attacks,” IEEE Transactions on Signal Processing, vol. 63, no. 5, pp. 1086–1101, 2015. doi: 10.1109/tsp.2014.2386281

11. Bouchoucha, Ahmed M. F., Al-Naffouri T. Y., Alouini, “Distributed estimation based on observations prediction in wireless sensor networks,” IEEE Signal Processing Letters, vol. 22, no. 10, pp. 1530–1533, 2015. doi: 10.1109/lsp.2015.2411852

12. Srinivasan, Nuggehalli, Chiasserini C. F., Rao R. R., “Cooperation in wireless ad hoc networks,” in Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications (INFOCOM ′03), pp. 808–817, IEEE, San Francisco, Calif, USA, March-April 2003. doi: 10.1109/INFCOM.2003.1208918

13. Pandana, Han, Liu K. J. R., “Cooperation enforcement and learning for optimizing packet forwarding in autonomous wireless networks,” IEEE Transactions on Wireless Communications, vol. 7, no. 8, pp. 3150–3163, 2008. doi: 10.1109/twc.2008.070213

14. Yan, Xiao, Huang, “On selfish behavior in wireless sensor networks: a game theoretic case study,” in Proceedings of the 3rd International Conference on Measuring Technology and Mechatronics Automation (ICMTMA ′11), pp. 752–756, Shanghai, China, January 2011. doi: 10.1109/icmtma.2011.472

15. Liu, Chen, “Adaptive quantization for distributed estimation in energy-harvesting wireless sensor networks: a game-theoretic approach,” International Journal of Distributed Sensor Networks, vol. 2014, Article ID 217918, 9 pages, 2014. doi: 10.1155/2014/217918

16. Ribeiro, Giannakis G. B., “Bandwidth-constrained distributed estimation for wireless sensor networks-part I: Gaussian case,” IEEE Transactions on Signal Processing, vol. 54, no. 3, pp. 1131–1143, 2006. doi: 10.1109/tsp.2005.863009

17. Xiao J.-J., Cui, Luo Z.-Q., Goldsmith A. J., “Power scheduling of universal decentralized estimation in sensor networks,” IEEE Transactions on Signal Processing, vol. 54, no. 2, pp. 413–422, 2006. doi: 10.1109/tsp.2005.861898

18. Liu, “Energy-efficient scheduling of distributed estimation with convolutional coding and rate-compatible punctured convolutional coding,” IET Communications, vol. 5, no. 12, pp. 1650–1660, 2011. doi: 10.1049/iet-com.2010.0560

19. Liu, Zhang, Liu, “Distributed estimation based on game theory in energy harvesting wireless sensor networks,” in Proceedings of the 33rd Chinese Control Conference (CCC ′14), pp. 401–404, IEEE, Nanjing, China, July 2014. doi: 10.1109/chicc.2014.6896656

20. Osborne M. J., An Introduction to Game Theory, Oxford University Press, Oxford, UK, 2004.

21. Osborne M. J., Rubinstein, A Course in Game Theory, The MIT Press, Cambridge, Mass, USA, 1994.

22. Liu, Chen, “Decentralized estimation over noisy channels in cluster-based wireless sensor networks,” International Journal of Communication Systems, vol. 25, no. 10, pp. 1313–1329, 2012. doi: 10.1002/dac.1308