Abstract
The paper focuses on the problem of time consistency of an interval Shapley-like value in cooperative interval dynamic games. This problem has not been investigated in the game theory literature for the considered class of dynamic games. It is proved that the interval Shapley-like value is time inconsistent, and to satisfy the time consistency condition for the interval value, it is necessary to introduce a new control variable, so-called imputation distribution procedure. The imputation distribution procedure is proposed in an explicit form.
Introduction
In recent papers on cooperative games it is usually supposed that the characteristic function, which shows the worth of any coalition, is given. But this approach is difficult to introduce in dynamic and differential games since for the understanding evolution of the characteristic function from one subgame to another it is necessary to understand the mechanism how the characteristic function can be calculated in each subgame. Because of that in our setting we suppose that the characteristic function is defined in a classical way for each coalition in the sense of von Neumann and Morgenstern [15] –as value of a zero-sum game between the coalition (player 1) and its complement (player 2), where the payoff to the coalition is the sum of players’ payoffs from it. This approach was used in many papers on cooperative differential games.
When cooperation in dynamic and differential games is considered, the important problem of time consistency of the solution concept arises. This problem was for the first time mentioned in the paper of Petrosyan [16]. In the mentioned paper and also in later research, authors proposed [8, 17] a special payment algorithm for players to sustain time consistency of the given cooperative solution. In this paper, we try to use an approach based only on ideas formulated in [17–19] to dynamic cooperative games with interval payoffs. An interval payoff means that one cannot evaluate a precise value for the worth obtainable by each coalition, but instead knows a lower bound and an upper bound for each coalition value.
Branzei et al. [4, 5] provided the interval game-theoretic model to handle bankruptcy situations with interval claims and considered two Shapley-like values for the class of games. Alparslan Gök et al. [1] studied selections of cooperative interval games which are classical cooperative games. Alparslan Gök et al. [2] then characterized the interval Shapley value with the properties of additivity, efficiency, symmetry and dummy player, which are straightforward generalizations of the corresponding properties in the classical cooperative game theory. It is worth mentioning that the proposed interval Shapley value is defined on the strictly limited interval games, namely, size monotonic. From that, many axiomatization approach of the interval Shapley value have been developed [3, 10]. Meng [11] extended recently the interval Shapley values to the so called type-2 interval games, where the player participation levels and the coalition values are both interval numbers. Methods of interval arithmetic and analysis (cf. Moore [14]) have played a key role for the models of games based on interval uncertainty. By using Moore’s subtraction operator and an introduced total order with respect to the set of closed intervals, Han et al. [6] studied the new notions of interval cores and Shapley-like values. We aim at investigating time consistency of one of such interval Shapley-like values in cooperative interval dynamic games. Although the considered interval value is not efficient, it is defined for all interval games. The theory of interval cooperative values, for example, the interval Shapley value and its extension to the case of games with a coalition structure (the Owen value), is developed in [12, 13], in which authors study a fuzzy case.
The paper has the following structure. In Section 2 we define an interval dynamic game and construct a cooperative game based on the given interval dynamic game. Here we define mathematical operations on closed and bounded intervals using interval calculus. In this section we also propose a way of constructing the characteristic function—the function of coalitions—in the cooperative interval dynamic game and consider the interval Shapley-like value as a cooperative solution concept. The time consistency problem of the cooperative solution concept in cooperative interval dynamic games is studied in Section 3. A numerical example, which illustrates our theoretical results, is considered in Section 4 and this section concludes the paper.
Cooperation in interval dynamic games
Operations on closed intervals
At first, we introduce mathematical operations on closed intervals which we will use in the paper [6]. Let be the set of all closed and bounded intervals on . For any two intervals and define addition, subtraction and multiplication operators in the following way: , ,
For intervals define the addition operator in the similar way:
Define now operators of the order. Let |I| be the length of interval I, i.e., . We say that I is weakly better then J (I ≽ J or J ≼ I) if and only if and , I is better then J (I ≻ J or J ≺ I) if and only if I ≽ J and I ≠ J, I is weakly superior to J (I ≿ J or J ≾ I) if and only if , I and J are related by an indifference relationship (I ∼ J) if and only if , I is superior to J (I ≻ J or J ≺ I) if and only if .
Interval dynamic game with perfect information
In this section, we consider more simple class of dynamic games, so-called finite multistage games with perfect information [9]. Let X be a finite set and F be a mapping that assigns a subset F x ⊂ X to each element x ∈ X. The case F x =∅ is also possible. A pair G = (X, F) is called a graph. Each element x ∈ X is a node of the graph, and a pair (x, y), where y ∈ F x , is an arc. We suppose that the graph is a tree, and x0 is its root.
Let N be a finite set of players with |N| = n. Define an interval multistage game with perfect information as follows.
The set of nodes X is split up into n + 1 disjoint sets (player partition) X1, …, X
n
, Xn+1, i.e., , X
i
∩ X
j
= ∅, i ≠ j, and F
x
=∅ for all x ∈ Xn+1. Here set X
i
consists of nodes in which player i ∈ N makes a move, and set Xn+1 is the set of terminal nodes, For all x ∈ X, n closed intervals , are given. Here interval is the interval of possible payoffs to player i ∈ N in node x ∈ X.
The game develops as follows. Suppose the root x0 ∈ X i 1 , then in x0 player i1 makes a move and chooses a node x1 ∈ F x 0 . If x1 ∈ X i 2 , then in x1 player i2 makes a move and chooses a node x2 ∈ F x 1 etc. Thus, if on k-th step a node xk-1 ∈ X i k , then player i k makes a move and chooses a node from set F x k-1 . The game ends once a terminal node xℓ ∈ Xn+1 is reached, i.e., if F x ℓ =∅.
Each player i, who makes a move in node x ∈ X i , knows the node x and can reconstruct all previous nodes since graph G is a tree. Therefore, the players have perfect information.
Define a payoff to player i ∈ N in game Γ. Let (x0, x1, …, xℓ) such that x
k
= F
x
k-1
, k = 1, … , ℓ, and xℓ ∈ Xn+1 be a trajectory (or a path) on graph tree G starting in the root x0 and ending in the terminal node xℓ. The payoff to player i in the game is defined as
The set of all possible strategies of player i we denote as U i . An n-tuple u = (u1, …, u n ), where u i ∈ U i , is a strategy profile in game Γ, and Cartesian product is a set of all strategy profiles.
Each strategy profile uniquely defines a game trajectory and, therefore, players’ payoffs. Suppose we have a strategy profile u = (u1, …, u n ) which defines a trajectory (x0, x1, …, xℓ). One can introduce a payoff function of player i in the following way: K i (u1, …, u n ) = H i (x0, x1, …, xℓ), i ∈ N.
Cooperative interval dynamic game with perfect information
In papers on cooperative dynamic games with completely certain, non-interval payoffs it is supposed that before the game starts, players choose a strategy profile and the corresponding trajectory maximizing the sum of players’ payoffs. In cooperative dynamic games with interval payoffs, we cannot exactly find the maximal value that is why one needs to specify it. Similar to this approach, we suppose that before the game starts, players choose a strategy profile and corresponding trajectory , , maximizing the sum
The strategy profile we call the optimal strategy profile, and the corresponding trajectory is the cooperative (median) trajectory. It is obvious that the cooperative trajectory may not be unique. If it is the case, we fix one optimal strategy profile from the set of all optimal strategy profiles and the corresponding cooperative trajectory.
Any subset S of set of players N we call a coalition. The essential feature of any cooperative game is a characteristic function that shows the “worth” of any coalition.
Define a cooperative interval dynamic game.
Suppose that characteristic function v has already specified. There are different approaches of defining the characteristic function, and one of the most used definition of the characteristic function for cooperative interval dynamic games we propose below in Subsection 2.4.
The set of all cooperative interval multistage games Γ C with player set N we denoted by IG N .
A solution of the cooperative interval dynamic game is an interval value, a function , satisfying condition ∑i∈NΨ i (v) ∼ v (N). The i-th component of the interval value represents the interval payoff to player i ∈ N. The set of all interval values in game Γ C we denote by C (v).
A solution concept is a rule that assign a subset M (v) ⊂ C (v) to each cooperative interval dynamic game Γ C . Consider an interval value which is based on the Shapley value: the interval Shapley-like value Φ (v) = (Φ1 (v) , …, Φ n (v)), where for i ∈ N,
In this section, we propose a way of construction the characteristic function for any S ⊆ N in the cooperative interval dynamic game. Let be the cooperative trajectory maximizing (1), then denote
It is obvious that the median of v (N) equals (1). For all non-empty coalitions S ⊂ N let
It is easy to prove that the superadditivity condition is satisfied, i.e., v (S ∪ T) ≿ v (S) + v (T) for all disjoint coalitions S and T from the set of players N.
Suppose that before the game starts, players chose the interval value Ψ (v) ∈ M (v). It means that playing cooperatively and choosing strategies which maximize (), players expect each of them to get Ψ i (v) as a payoff at the end of the game.
When the game Γ is realized along the cooperative trajectory , in each node players are involved in a new interval multistage game with perfect information , k = 1, … , ℓ which is a subgame of the initial game Γ starting in with player i’s payoff:
Note that the Bellman’s equation is fulfilled for the expression (1) and the remaining part of the cooperative trajectory , starting from node , maximizes the sum of players’ payoffs in subgame , i.e.,
It means that the trajectory is also the cooperative trajectory in subgame .
Before entering the subgame , payoff to player i ∈ N belongs to the interval . On the other hand, in the beginning of game player i expects her payoff to belong to interval Ψ
i
(v) according to interval value Ψ (v). Therefore, in subgame player i’s expected payoff should satisfy the condition
Here we have a question: will the interval value in subgame , k = 1, … , ℓ be optimal in the same sense as the interval value Ψ (v) in game Γ, i.e., will if Ψ (v) ∈ M (v)? If it does not, in subgame players will not agree on the same solution concept as in game which may break cooperation at all and change the cooperative strategy profile and, therefore, the cooperative trajectory in subgame .
Similar to set C (v), introduce sets , k = 1, … , ℓ as follows: , and a solution concept in subgame : a rule that assign a subset to each subgame . If we suppose that players along the cooperative trajectory agree on the solution concept M (v), the interval value should belong to the same solution concept as Ψ (v), i.e., , k = 1, … , ℓ, and the condition (7) will be satisfied. Unfortunately, in most classes of cooperative dynamic games it is hard to find the solution concept satisfying the equality (7). To avoid this difficulty, introduce a special payment scheme at each stage of the cooperative interval game such that payments at each stage would not exceed the sum of players’ payoffs at this stage and payments at stages from k to ℓ in subgame , k = 1, … , ℓ would belong to the same solution concept as the interval value Ψ (v). Let Φ (v) = M (v), i.e., the solution concept is the interval Shapley-like value Φ (v) calculated by formula (2). For any subgame , k = 1, … , ℓ along the optimal trajectory calculate the interval Shapley-like value using the formula: for i ∈ N,
Consider a median element φ (v) = (φ1 (v) , …, φ
n
(v)) of the interval Shapley-like value Φ (v), where
Introduce an imputation distribution procedure [17].
IDP β has the following interpretation: β ik is a payment to player i at stage k in game Γ (or at the first stage in subgame ), and is a sum of player i’s payments at first k stages in game Γ.
From Equation (11) it follows that in game Γ each player i ∈ N expects her payoff to be equal to φ i (v) ∈ Φ i (v).
Time consistency of imputation distribution procedure β means that if at each stage along the cooperative trajectory payments are chosen according to β, then in each subgame , k = 1, … , ℓ players will expect payments which are optimal in subgame in the same sense as in game Γ. Therefore, for the median element φ (v) of the interval Shapley-like value Φ (v) one can define IDP β = {β ik } as follows:
From Definition 1 it follows that
At the same time, , k = 0, … , ℓ which means time consistency of imputation distribution procedure β of the median element φ i (v) ∈ Φ i (v).
Consider a numerical example which illustrates theoretical results. Suppose that the set of players N = {1, 2, 3}, |N| = n = 3. The game Γ on graph tree G = (X, F) is represented in Fig. 1. Here set X = X1 ∪ X2 ∪ X3 ∪ X4 is a set of nodes {x0, x1, …, x10}. Suppose that X1 = {x0}, X2 = {x1, x2}, X3 = {x4, x6} and X4 = {x3, x5, x7, x8, x9, x10}. It means that player i = 1, 2, 3 makes a move in nodes from set X i , and the game ends in nodes from set X4. In each node n intervals are given (players’ possible payoffs). For instance, in the root x0 payoff to player 1 belongs to the interval , payoff to player 2 belongs to the interval , payoff to player 3 belongs to the interval .
Using (1), one can find that the maximal value equals 46 along the trajectory (x0, x2, x6, x9). This trajectory is the cooperative trajectory. Therefore, to calculate the interval Shapley-like value along (x0, x2, x6, x9) one needs to construct the characteristic function (see (3)–(4)). In Table 1, interval values of the characteristic function are given for all coalitions S ⊆ N. Therefore, the interval Shapley-like value Φ (v) = (Φ1 (v) , Φ2 (v) , Φ3 (v)), calculated by formula (2), has the form: .
Consider now subgame Γ x 2 on the cooperative trajectory (x0, x2, x6, x9) and calculate the new interval Shapley-like value Φ (v x 2 ) = (Φ1 (v x 2 ) , Φ2 (v x 2 ) , Φ3 (v x 2 )) in this subgame. Intervals of the characteristic function for all S ⊆ N are shown in Table 2, column 2 (see (8)–(9)). Therefore, we have .
Now check condition (7) in node x2. For player 1 we obtain . This fact shows that we need to use a payment scheme, IDP β, and make it time consistent. To find matrix β, we calculate the remaining interval Shapley-like values in all nodes on the cooperative trajectory (x0, x2, x6, x9): Φ (v x 6 ) and Φ (v x 9 ). In Table 2, columns 4 and 6, intervals of characteristic functions and for all S ⊆ N are represented. Using values v x 6 (S) and v x 9 (S), S ⊆ N, we obtain: , Φ (v x 9 ) = ([3, 11] , [- 3, 5] , [1, 9]).
Now find the time-consistent imputation distribution procedure β of the median element φ
i
(v) = (φ1 (v) , φ2 (v) , φ3 (v)) ∈ Φ
i
(v). Here we have: , , , φ (v
x
9
) = (7, 1, 5). Therefore, according to (13), we obtain stage payments to player 1 along the cooperative trajectory:
Payments to players 2 and 3 are calculated in a similar way: , , , , , , β23 = 1, β33 = 5. The matrix representation of the IDP β is as follows:
For instance, consider player 1. If we take the median element as a payoff, the IDP means that player 1 in the root x0 should be paid , in x2 should be paid , in x6 should be paid (or, more specifically, should return) , and in x9 should be paid 7. In the whole game she gets , that is the same value that she expects to get as her entry of the median element φ (v) of the interval Shapley-like value Φ (v). Even if player 1 checks equality (12) in x2, it will hold. In other words, what she expects to get (the median element φ1 (v x 2 ) =9 of the interval Shapley-like value Φ1 (v x 2 )) in subgame Γ x 2 is exactly the difference . The same result is true for node x6: what player 1 expects to get (the median element of the interval Shapley-like value Φ1 (v x 6 )) in subgame Γ x 6 is exactly the difference . For node x9: what player 1 expects to get (the median element φ1 (v x 9 ) =7 of the interval Shapley-like value Φ1 (v x 9 )) in subgame Γ x 9 is the difference . This shows that player 1 will always agree on the median element of the interval Shapley-like value and will not break cooperation. The same result is true for player 2 and player 3. Thus, time-consistent IDP β of the median element φ i (v) = (φ1 (v) , φ2 (v) , φ3 (v)) ∈ Φ i (v) guarantees its realization and cooperation of players in the game along the cooperative trajectory (x0, x2, x6, x9).
Conclusion
In the paper, we have shown that in cooperative dynamic games with uncertain payoffs, namely interval payoffs, the problem of time consistency takes place. It means that in general, the cooperative outcome will not be realized as the game develops along the prescribed cooperative agreement. As a cooperative solution of the game, we have considered the interval Shapley-like value, an interval value based on the Shapley value that satisfies the indifference efficiency property. To realize the interval value in the dynamic game, we have introduced a mechanism of stage payments that reallocates the solution over time redistributing players’ stage payoffs—an imputation distribution procedure. The mentioned procedure has been implemented not only for the median element of the interval Shapley-like value; however, for an arbitrary element belonging to the interval value, we have provided two conditions which the procedure must satisfy. Our findings also correlate with those of crisp cooperative dynamic games, i.e., when the game becomes crisp, it can be easily seen that the interval Shapley-like value coincides with the Shapley value for crisp cooperative dynamic game. In this case, the imputation distribution procedure for the median value (or any arbitrary element belonging to the interval value) also coincides with the imputation distribution procedure for the crisp Shapley value (since the median value or the mentioned arbitrary element will coincide with the crisp Shapley value). As for limitations of our study, the theory has been developed for the interval value which is not efficient, so in future one can study the time consistency problem for other cooperative interval values, for instance, for the interval Shapley value satisfying the efficiency property. In this case one needs to redetermine the characteristic function of the game (to fulfill the property of size monotonicity mentioned in Alparslan Gök et al. [2]) and then calculate the solution, which was not a goal in the present research.
Acknowledgments
Authors thank two anonymous referees for their comments, which have improved the paper. Authors also acknowledge support from the Russian Foundation for Basic Research (Grant 13-01-91160), Saint Petersburg State University (Grants 9.38.245.2014, 9.42.1456.2015), the National Natural Science Foundation of China (Grants 71171163, 71271171, and 71311120091), the Science and Technology Research and Development Program in Shaanxi Province of China (Grant 2014KW03-01), and the Fundamental Research Funds for the Central Universities of China (Grant 3102014JCQ01072).
