Abstract
This paper considers the
Introduction
For stochastic games, such as Backgammon (Tesauro, 1995), bridge (Scott, 1991), Mahjong (Lin et al., 2011) and Chinese dark chess (CDC) (Chen et al., 2010), a critical problem during searching is how to deal with chance nodes. A better understanding of how chance nodes affect the behavior of searching can lead to the development of better algorithms.
This paper uses CDC as the research object. Since it is currently infeasible to completely solve the
The remainder of this paper is organized as follows. Section 2 introduces the rules of CDC, and Section 3 introduces the indexing function of positions and the method used to completely solve the game. Section 4 describes the experiment environment and shows the experiment results, while Section 5 considers implementation issues and discusses experiment results. Section 6 provides concluding remarks.

The initial board.
Chinese dark chess
Chinese dark chess (CDC) is a popular version of Banqi (Chen et al., 2010), which is a variation of Chinese chess. CDC uses the same set of pieces as Chinese chess, but only requires half of the game board and places pieces inside the squares instead of at the intersections. There are 32 pieces: 16 red and 16 black. In this paper, the red pieces and the black pieces are denoted by upper-case letters and lower-case letters, respectively. Each color has one king (K or k), two guards (G or g), two ministers (M or m), two rooks (R or r), two knights (N or n), two cannons (C or c) and five pawns (P or p). Each piece has two sides, dark and revealed. A piece’s type is marked on the revealed side, and the dark sides of all the pieces are identical. A piece is unrevealed if its dark side is facing up, and it is revealed if its marked side is facing up. At the beginning of a game, all the pieces are unrevealed so players do not know the type of any piece. The first player flips one of the unrevealed pieces and then owns the color of that piece, which means the second player owns the other color. In each turn, a player can either flip an unrevealed piece to reveal its identity, or to move one of his/her own revealed pieces. A revealed piece can be moved to an adjacent empty square or be used to capture a lower-ranked or same-ranked revealed opponent piece in the adjacent square with the exception that a cannon cannot capture the opponent’s cannon by moving. There is a special capture rule for cannons which will be described later. A player loses the game if he/she cannot make any legal move. Let
Game solving and fair game
In the domain of computer games, a game can be solved at three different levels, ultra-weakly solved, weakly solved, and strongly solved (van den Herik et al., 2002). If a game is ultra-weakly solved, the game-theoretical value of the initial position is determined. If a game is weakly solved, an optimal strategy from the beginning of the game is provided. If a game is strongly solved, the optimal moves of all legal positions are provided. Note that in imperfect information games and stochastic games, an optimal move does not mean 100% win, lose or draw. Instead, an optimal move in imperfect information games or stochastic games is the move having the maximum expected value of either the winning chance, the winning plus the drawing chance, or the difference between the winning and the losing chance. In this paper, we solve
Retrograde analysis
The retrograde analysis methods are widely used in solving games (Thompson, 1986). In the process of retrograde analysis, it is important to determine the set of next positions and the set of previous positions of a given position (Sturtevant and Saffidine, 2017). In the imperfect information games, one chance node has multiple possible next positions, therefore a backward propagation method is not feasible. A top-down search method is usually used for these imperfect information games (Bonnet and Viennot, 2017). Note that although Kriegspiel (Ciancarini and Favini, 2010) is also an imperfect information game, it is different from other imperfect information games. A position in Kriegspiel does not only contain the pieces’ locations of the current player but also the “partial information” of the opponent’s pieces’ locations.
Related works
In stochastic games and imperfect information games, chance nodes are involved in the search tree. In dice games, such as Backgammon (Tesauro, 1995) and “EinStein würfelt nicht!” (Althöfer and Voigt, 2014), legal moves are determined by dice rolls. At the beginning of each turn, the player first rolls the dice and then chooses one of the legal moves decided by the outcome of the dice roll. In other games, such as Mahjong and CDC, the legal moves are separated into deterministic moves and stochastic moves. The state transitions of deterministic moves are fixed. On the other hand, the state transition of stochastic moves is not fixed. For a stochastic move, there are multiple possible outcomes. In the search tree of a stochastic game, every node is a chance node. In the search tree of an imperfect information game, only the nodes related to the stochastic moves are chance nodes.
Previous work on CDC (Chang et al., 2010, 2012, 2015; Yen et al., 2015) has mainly focused on how to deal with the uncertainty caused by flip moves. Because a player knows neither the type nor the color of an unrevealed piece until it is revealed unless the remaining pieces are all of the same type, the game is stochastic, and the probability distribution of a flip move changes over the course of the game. Several games involve probability behavior, e.g., Backgammon, bridge (Scott, 1991), Mahjong (Lin et al., 2011) and Kriegspiel (Ciancarini and Favini, 2010). However, in contrast to Backgammon, the probability distribution of a flip move in CDC changes during the game (Tesauro, 1995). This dynamic behavior of CDC is more like that in bridge and Mahjong, which are normally played by 4 players, making them more difficult to analyze than 2-player games. On the other hand, CDC is not like Kriegspiel, which a player only knows part of the information of his/her opponent. In CDC, both players know the locations of revealed pieces of both sides but have no information about the exact type of any unrevealed piece. In Kriegspiel, a player can use deterministic moves to gain more information and to reduce the uncertainty. In CDC, a player can never gain any information of an unrevealed piece, unless he/she makes a flip move to reduce the number of unrevealed pieces. That is, a player cannot gain extra information without making a stochastic move in CDC. As a research topic, CDC contains the following important properties: (1) it is a 2-player game; (2) it is a stochastic game and the probability distribution of a flip move changes during the course of playing the game; and (3) at the endgame stage, when all pieces are revealed it becomes a deterministic game.
Methods
In this section, we describe a bottom-up approach similar to the retrograde analysis method used in (Thompson, 1986) to solve
CDC
Similar to
We start with a basic indexing function, which is easy to use but lacks optimization. The basic indexing function contains two parts, the locations of the pieces and whether a piece is revealed or not. We use a

Illustration of the basic indexing function. The material combination is “KMgp”. The prefix “d” of a letter denotes that piece is unrevealed. In Fig. 2(b) and (c), there are an unrevealed red minister, “dM”, and an unrevealed black guard, “dg”.
Before we describe the details of our indexing function, we first introduce two basic operators in our indexing function: a function to calculate the i
th
combination and a function to calculate the i
th
permutation. We start with defining the ith combination of an ordered set. For a given set with n distinct elements, there are
Now we consider the case of the ith permutation. Similar to the ith combination, there are
Now we describe how to construct our indexing function using the method proposed by Knuth (Knuth, 2008) to calculate the ith combination and the ith permutation of an ordered set and their inverse functions. First, we present six attributes that can encode a position uniquely. A position with a given material combination can be recorded by the following six attributes: (1) TPIECE: the total number of pieces on the board; (2) LPIECE: the number of revealed pieces on the board; (3) SQUAREID: the combination order of the occupied squares as a subset of all squares; (4) LSQUAREID: the combination order of the squares occupied by the revealed pieces as a subset of the occupied squares; (5) LSET: the combination order of the revealed pieces as a subset of the material combination; and (6) LPER: the permutation order of the set of revealed pieces, ordered according to their occupied squares’ IDs. The range and notation of each attribute are listed in Table 1.
In Fig. 3, the total number of pieces is 4 and the number of revealed pieces is 2. The squares occupied by the pieces are 0, 1, 2 and 7; hence the combination order equals
The attributes of a position
The representative values of the red turned positions and the black turned positions are saved separately, so the turn information is not used in the indexing function. Because there is a 1–1 mapping between the ith combination and integers as well as a 1–1 mapping between ith permutation and integers, each index represents a unique valid position.

Index of the position shown is

Dependency graph of “KGkg”.
For each material combination, we determine the representative value of positions from the case of having no unrevealed pieces to the case of having only unrevealed pieces. Because flipping an unrevealed piece is a non-reversible move, the number of unrevealed pieces cannot increase during the course of a game. Hence, the representative values of positions consisting of k unrevealed pieces depend on those consisting of

DFS(state s, depth d)

CDC database updates
This section describes the experimental settings and discusses the experimental results.
Experimental settings
In the experiments, we calculate all 1,790 equal-piece-number (4 red pieces and 4 black pieces) material combinations of
In a position with n unrevealed pieces, there are
Observations hold on symmetric material combinations
In
Table 2 lists the experimental results of symmetric material combinations. Column 1 and 2 show the index and the material combination of pieces, columns 3 to 6 show the experimental results. Columns 3 and 4 show the representative value and the standard deviation when the first move is “A2”; Columns 5 and 6 show the representative value and the standard deviation when the first move is “B2”. Results show that the first player is disadvantageous for most material combinations. The only exceptions are “KGCPkgcp”, “KGGCkggc” and “GCPPgcpp”. The standard deviation is 0.9; that is, although the expected result favors the second player, the first player also has a good chance to win.
Table 3 lists the results of the second move of “KGCPkgcp” if the first move is “B2.” There are four possible outcomes, that is, either a king, a guard, a cannon or a pawn is revealed. Each subtable has eight columns. Column 1 shows the square ID which is chosen by the second move, and columns 2 to 8 list the corresponding expected difference of each remaining unrevealed piece. Under the mini-max assumption, the second player plays “Flip C2” if the first player reveals a cannon; otherwise, the second player plays “Flip D2”.
Experimental results of symmetric material combinations: the results of choosing different first moves
Experimental results of symmetric material combinations: the results of choosing different first moves
Choices of the second move after the first move “Flip B2” is made for “KGCPkgcp”
As we mention in Section 4.2, there are two possible first moves which are “A2” and “B2” for each material combination. For each material combination, there are eight possible outcomes
In Fig. 5, a point (
Fig. 6 shows the representative values of choosing “A2” as the first move and separately choosing “B2” as the first move, for all material combinations. The x-axis is the expected winning probability and the y-axis is the expected losing probability. For example, the point

The distribution of

Resulting distributions of two possible first moves “Flip A2” and “Flip B2”.
Although this experiment only shows the result for
We now consider a strategical consideration to decide where to make the first move.
For each initial position, there are eight possible outcomes after the first move is made. For a first move x and each possible outcome i, we compare the representative value
Table 4 lists the changes of the representative value before and after the first move is made for all 1,790 possible material combinations. The first column shows
Here we define a “stable” first move to be one such that when it is played, the probability of having a better representative value equals the probability of having a worse one. Hence the game will likely remain as it was after the move is made. By using our definition, in Table 4, the material combinations achieving
Value of
where
“A2” or “B2” and the corresponding numbers of material combination (MC)
Value of
In Table 5, each row is separated into three categories according to the relationship between
A fundamental problem in the game tree search is how to compare two nodes with representative values in the form of distributions instead of exact numerical values. Traditionally, distributions are represented by their expected representative values which are
Numbers of material combinations breaking down using the resulting distributions and representative values of playing “A2” and “B2”
Numbers of material combinations breaking down using the resulting distributions and representative values of playing “A2” and “B2”
In total, there are 131 possible symmetric material combinations consisting of 4 pieces for each player. Although the total number of symmetric material combinations is 131, some material combinations are equivalent (Chen et al., 2015), namely, they have the same capturing relationship between pieces. For example, “KGMPkgmp” and “KGRPkgrp” have the same relationship since the roles of K, G and P are the same in both material combinations. That is, both M and R are less than K and G, but greater than P. Hence if we treat M in the first material combination as R in the second material combination, the two material combinations are considered as equivalent (Chen et al., 2015). There are 24 symmetric material combinations consisting of 4 pieces that are not equivalent. Also, there are 1,790 non-equivalent material combinations consisting of 4 red pieces and 4 black pieces if we do not require the red side and the black side to have the same piece set.
Although the experimental results show the unfairness of the game due to where the first move is made, we discover conditions for it to become fair later on. Figure 7 shows the three fairest positions with two revealed pieces and where the next move should be in “KGMPkgmp” combination.

Fair positions after a sequence of flips in “KGMPkgmp”. All of the three positions are fair because the expected difference is 0.
Implementation issues
Section 3 discusses the indexing function used in
To ensure the result of each position to be correct, each computed representative value is verified. There are two common verifying methods. First, each position’s representative value is obtained by the values returned from some of its children according to the properties of the search procedure, and for each position, there should be one child position causing the representative value of its parent position to be recorded as it is. Second, since the game board is symmetric, a position should have the same representative value after up-down flipping, left-right flipping, or clockwise rotation. We have verified our results using both methods.
Opponent model
In this paper, we use the difference of the winning chance and losing chance as the objective function. Here we show this objective function leads to the best score in a game tournament. In a tournament, a single win, draw and lose normally gain 2, 1 and 0 points, respectively. If the number of winning events, draw events and losing events of a nondeterministic state are
Possible usage of
CDC
In Section 1, we have mentioned that we want to gain insight into
Conclusion
This paper considers CDC on a
Footnotes
Acknowledgements
This work was supported in part by the Ministry of Science and Technology of Taiwan under contracts 104-2221-E-001-021-MY3, 106-2221-E-305-016-MY2, 106-2221-E-305-017-MY2, 107-2634-F-259-001- and 107-2634-F-009-011-.
