Abstract
An algorithm is described for constructing a chess opening book, to be used in play against an opponent using another book that is given. The book under construction is represented as a Directed Acyclic Graph, with chess positions as nodes and moves as directed arcs between nodes. Each position is given an absolute position probability that a game played between the given and being-constructed book will eventually reach that position. Books for White and Black are constructed separately. Assuming a book is being constructed for White, a position priority queue repeatedly chooses the highest-probability unexpanded position for expansion. If it is Black’s turn to move, moves with move probabilities for the position are copied from the given book. If it is White’s turn to move, a search determines the best move, which is added with move probability one. More-probable positions are searched to a greater depth. Several books for White, representing a range of build CPU time, were constructed to be used in play against the book that comes with Crafty version 25.0.1, CB25.0.1. The strongest, B29, took 12 CPU-days to build. In 1000-game matches against Black/CB25.0.1 (Black using CB25.0.1 book), White/B29 scored 64 ELO points higher than did White/CB25.0.1.
Forewarned, forearmed;
To be prepared is half the victory.
Introduction
Computer methods for constructing opening books, for games like chess, are classified into two types: passive and active (Chen et al., 2015). A passively-constructed book is based on a historical record of games played by highly-ranked human experts and strong computer engines, as something to be imitated. An actively-constructed book is built from scratch, and its moves are chosen algorithmically, without regard to the historical record.
Passively-constructed books
Passively-constructed books have been constructed for Chinese chess, chess, and checkers (Chen et al., 2015). A substantial industry has grown up around passively-constructed books for chess (CPW, 2017). Two state-of-the-art passively-constructed books for chess are the book that comes with Crafty (Hyatt, 1999), and the book that Deep Blue used to defeat Kasparov (Campbell, 1999). The earliest books of this type simply imitated the historically popular moves, and this led to problems. Hyatt recounts: With older chess programs (Cray Blitz to name one that I worked on) …the opening book was the weakest link in the way the program played. We actually used to “hold our breath” until we dropped out of the book and started searching on our own, hoping that the first search result would be something that was not already losing. I can recall several games that were played at ACM Computer Chess tournaments where a program was lost before it ever searched the first node.
The book that comes with version 25.0.1 of Crafty, which we will call CB25.0.1, is based on approximately 63,000 grandmaster and 106,000 engine games. Crafty plays tens of thousands of games per year on Internet chess servers, and the results of these games are used to continually augment and refine its book (Hyatt, 1999). One problem was that opponents would discover a line of play favorable for them, and play that line over and over until they found a way to win. This problem was solved by instituting automated learning to repair the book so that it would never lose the same way twice. Another countermeasure is to provide variability in the book, so that from a given position more than one response was available, and these were chosen at random during the game.
Another problem with passively-constructed books is that the evaluation function of the search engine might be unaware of goals and issues that experts had in mind when they played the moves in the historical database. Then when a game drops out of the book, the engine finds itself at cross-purposes with the position it is in, and unable to carry forward with the plan envisioned by the book. Even worse, a move that was played in the past may simply be a mistake. These problems have been widely noted, and are addressed by filtering, or validating, the historical moves to favor ones that the engine prefers (Hyatt, 1999; Campbell, 1999; Chen et al., 2015).
Actively-constructed books
Actively-constructed books, using search algorithms such as alpha-beta and Monte-Carlo search, have been built for Awari, Othello, Amazons, and Go (Chen et al., 2015). Buro (1999) constructs or augments a book for Othello by iteratively expanding the leaf node that is reached by following the path of best moves for both sides (the principal variation) from the start node in a DAG. If the opponent, at game time, always chooses the best move, then the game remains in the book, and the book provides us with the best reply. The Achilles heel of this approach is that the opponent may have a slightly different evaluation function or may play a move that is different from the best move, but nearly as good, thus casting us out of Buro’s book. It would be more reasonable to consider additional moves that are only near-optimal, in a controlled way. Lincke (2001) does this by expanding according to a linear combination of shallowness with best-first. Buro’s figure of merit for choosing the next node for expansion is the difference between the node’s lookahead value, and the best lookahead value among the node’s siblings. Lincke’s figure of merit additionally contains a term proportional to the distance of the node from the DAG root. A move has a higher figure of merit for expansion if it has a combination of a better evaluation (relative to its best sibling) and a shallower depth from the start of the game. Thus, Lincke’s algorithm guarantees that an opponent’s possible near-best move near the DAG root will not be perpetually ignored. Lincke (2001) evaluates his method for the games of Awari and Othello. In his dissertation, Lincke (2002) also used this approach to build a book for chess and collected statistics for it, such as depth distribution, average number of predecessors at a node, and average degree at a given depth. However, the book’s playing strength was not evaluated in match games. Lincke (2002) concluded “…from the shallowness of the book it is obvious that the book is not very useful (neither for computers nor for humans)”. To the author’s knowledge, there are no published reports of match evaluations of actively-constructed chess books.
Perhaps one reason that little effort has been made for actively-constructed chess books is that there is such tremendous respect for the consensus, developed over many years of play by many expert players, of the best opening moves, which is known as opening theory. Passive book construction is viewed as a way of harvesting that vast expertise (Campbell, 1999).
Problem formulation
As a result of improvements in search algorithms and hardware, chess engines have improved to the point where some are stronger than any human being. Given the increased power of chess engines, it makes sense to investigate using that power to build actively-constructed books. Instead of harvesting existing knowledge, we seek to use computing power to actively search for novel lines of play.
To keep the problem tractable, we make several simplifying assumptions.
The constructed book will be built to be used in play against another, given, book. The given book must provide move probabilities that add to one for the moves that it recommends from a given position. This assumption actually allows a great deal of flexibility that can be used to tailor the constructed book for a number of different uses. For example, the given book could be based on an arbitrarily large compendium of grandmaster and engine games, with a large number of moves (but with move probabilities still adding to one) from a given position. Although a historical record of games will be used, the constructed book will still be deemed actively-constructed, because the recommended move from a position will be determined solely by search, without regard to what was done historically in that position. The given book will be used only to predict the moves the opponent is likely to make. In this paper, we will use CB25.0.1 as the given book. Books for White and Black are to be constructed separately. For this paper, only books for White were actually built. It is assumed that what works for White also works for Black. The constructed books will contain only a single recommended move for White per position, which makes White’s play deterministic when in the book. When any game comes out of the book with a weak position for White, a future opponent can repeat Black’s moves from this game, leading always to the same weak position. Fixing this problem would require much more computing time, depending on how often additional moves for White are computed.
Constructing the opening book
The build process will be described assuming the constructed book is for White. The book is constructed from scratch. Every recommended move in the constructed book is the result of an alpha-beta engine search. In this way, all the power and knowledge of the search engine is captured and put into the book. Any future enhancements that make the engine more powerful will likewise increase the power of constructed books.
During the execution of the construction program, the book is represented as a Directed Acyclic Graph (DAG) from the starting position. In Fig. 1, the starting position of chess is position 0. Each node represents a chess position; each arc represents a move from the node at the arc’s tail to the node at its head. A transposition occurs when more than one move leads to a position, such as at position 12.
Move probability
Associated with each move in the DAG is the probability that the move will be chosen from the position at the move’s start position. Where Black is to move, these probabilities are copied, with minimal calculation, from the given book. The sum of move probabilities from a given position is 1. For example, in Fig. 1, Black’s moves from position 1 are d5, c6, e5, and c5, with probabilities 0.1, 0.2, 0.4, and 0.3, respectively. Currently, the book construction algorithm gives only one move where it is White’s move, so that move has probability 1.
Position probability
Associated with each position is the probability that the game will eventually reach the position, in a game between the given and the constructed book. Whereas a move probability is conditional on the game reaching the start position of the move, a position probability is absolute. The probability of the starting position is 1. For any other position, the probability is the sum, over all moves leading to it, of the probability of that move times the probability of the start position of the move. When a new move is added that leads to an existing position (a transposition), the probability of the position is updated. For example, in Fig. 1 the probability of position 12 is given in terms of the probabilities of positions 9 and 10:

Opening book represented as a DAG.

Pseudo-code for book construction.
A position priority queue based on position probability controls DAG construction (Fig. 2). Initially, the position priority queue contains only the beginning position of chess, with probability 1.0. Repeatedly, the position P with highest probability Q is removed from the position priority queue and moves from P are added into the DAG. Positions resulting from the moves are added to the DAG and the position priority queue. If it is White’s turn to move at P, a search is made from P to determine the best move, which is added to the DAG with probability 1. If it is Black’s turn and P is in the given book, the given book provides the moves from P with their probabilities. Because of this, the constructed book will only be effective when used against an opponent using the given book – if Black plays a move not in the given book, White will be cast out of the constructed book. If it is Black’s turn and P is not in the given book, all moves from P are added with equal probability. Probabilities are propagated to positions resulting from added moves, as described above.

Continuation principle.
Each White move is chosen by a Crafty engine search to a specified depth. This specified depth is controlled according to the following two principles:
More probable positions should be searched to a greater depth than less probable positions.
The Continuation Principle is justified as follows. In Fig. 3, the position R at the root of a search has been searched to a certain depth, choosing White’s best move to position S. The leaf positions of this search are referred to as the search’s ‘horizon’. According to the given book, Black has two replies, leading to T and U. T is searched to a deeper depth than U, so that T’s search horizon sees beyond R’s search horizon, but U’s does not. There is an ‘ignorance gap’ between the search horizons of U and R. When a search returns a minimax value V at the root of a search for White, the search’s lookahead tree constitutes a proof of the fact that there exists a strategy for White, such that no matter what Black does, White can force the game play to a leaf node of the tree that has a value at least as good for White as V. But when the horizon of U’s search does not reach as far as R’s horizon, the proper continuation may not be found, because the payoff for the move chosen by R’s search may be within the ignorance gap, and so U’s search might produce a blunder.
Depth of search is controlled by the probability of the root position of the search, according to the following rough heuristic: Assume, as in Fig. 4, that the starting position of chess, position 0, is searched to a depth of D ply, yielding the best move for White to position S. Suppose that the given book has K equally probable moves for each of its positions where Black is to move. Then the number of positions at depth
Thus, for a position with White to move that has probability Q, we ask Crafty to search the position to a depth of

Depth of search is controlled by position probability.
When the algorithm expands a Black-to-move position to its multiple children in the DAG, its position probability is divided accordingly. Thus as the DAG becomes deeper, the position probability Q of DAG leaves decreases exponentially and monotonically (with rare exceptions in some cases when transpositions occur) to zero. The search depth,
Similar reasoning tells us that the engine used to build the book should be the same as the engine that will use the book. If the engine changes in any significant way, such as changes to the leaf evaluation function, the book should be constructed again, from scratch.
In summary, depth of search is controlled by the parameters B, D, and d. For all books constructed for this paper, d was 16 and B was 3. 10 books were constructed, in which D varies from 20 ply for B20 to 29 ply for B29. A position with White to move and probability Q is searched by Crafty’s alpha-beta search routine to a depth of

Termination condition.
To avoid cycles within the book, a check is made, before adding a move to the DAG, to see if doing so would create a cycle. If it would, the move is not added, maintaining the acyclic character of the DAG. Potential cycles occurred rarely in book construction. During construction of B20 through B28 no potential cycles were detected. In B29, there were 18 cases, all far from the starting position at low-probability positions, where a move was not added because it would have created a cycle. Since B29 had 190K moves, this decreased the size of the book by less than 0.01%. In the very rare cases where the absence of a move meant that a game would drop out of the book, this simply meant that responsibility for dealing with potentially repeated game positions would be transferred to the engine, which is already capable of doing so.
Output
When the build finishes, it writes out the White moves from the DAG as a sequence of text lines. Each line contains the recommended move with the FEN string of the position it is recommended for. The following line says that in the position occurring after 1. e4 e5, White should play Nf3:
Nf3 rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq –
Match results

Results of 1000-game matches comparing White/B20–B29 to White/CB25.0.1.
Outcomes of 1000-game matches
Ten books for White, B20 through B29, were constructed. The numerical suffix of the constructed book name is the depth D of the search, in ply, at the starting position. In this section, “Side/Book” refers to player “Side” using “Book” as its book. E.g. “White/B29” is White using book B29, and “Black/CB25.0.1” is Black using CB25.0.1. Eleven 1000-game matches were played with White/CB25.0.1 and White/B20 through White/B29, against Black/CB25.0.1, with results shown in Fig. 6 and Table 1. All conditions were kept constant across all the matches, with the Crafty engine used for both sides, with CB25.0.1 always used for Black’s book, and endgame databases turned off. Each match was played using Xboard with time control –tc 0:18 –inc 0.3 (18 seconds initially on each clock, with 0.3 seconds added at each move), to allow for a 1000-game match to finish in a reasonable amount of time (∼18 hours). Each side started in its book, and when the book was exited, the Crafty version 25.0.1 search engine took over for that side. The only difference among the eleven matches is the book used for White: CB25.0.1, and B20 through B29. When CB25.0.1 gives more than one move for Black in a position, Crafty selects its move according to CB25.0.1’s move probabilities, using a physically random process based on low-order bits of the PC hardware clock. Table 1 gives White’s Win-Loss-Draw record for each match, along with consequent % score, White’s ELO advantage over Black, and ±99% ELO confidence bounds (Linscott, 2017). Figure 6 plots White ELO advantage for each constructed book over Black/CB25.0.1 with 99% confidence bounds, in comparison to White/CB25.0.1’s ELO advantage over Black/CB25.0.1 (horizontal dotted line at 26 ELO). Table 2 gives build time on a 2.4 GHz PC, # of lines (moves), and first move for each constructed book. The build time increases roughly threefold for every increase of D by one.
B20 through B28 are clearly inferior to B29, so they would never be used in competition if B29 were available. B20 took only 2.2 minutes to build and contains only 26 moves. The data for B20 through B28 are presented only to show the trend of increasing strength and increasing build time with increasing D, and thus the likely worth and cost of creating even deeper books.
Build times, number of lines, and recommended first move, for B20 through B29
As an example of the possibility of creating novel lines of play, the following is game 591 of the White/B29 vs. Black/CB25.0.1 match (Fishburn, 2017):
Nf3 c5 2. e3 Nf6 3. d4 d5 4. c4 e6 5. Bd3 Be7 6. O-O O-O 7. Nc3 dxc4 8. Bxc4 a6 9. dxc5 Bxc5 10. Bd3 Nc6 11. a3 Qc7 12. b4 Bd6 {+0.22/15 0.6} 13. Bb2 Ng4 {+0.55/14 1.6} 14. Bxh7+ Kh8 {+0.52/15 1.1} 15. h3 Nh2 {+1.25/13 0.8} 16. Ng5 Be5 {+6.51/14 1.5} 17. Qh5 {+9.74/15 0.5} g6 {+10.05/15 0.6} 18. Bxg6+ {+10.01/14 0.5} Kg7 {+18.97/15 1.2} 19. Qh7+ {+17.84/13 0.5} Kf6 {+18.97/10 0.1} 20. Nce4+ {+18.04/13 0.4} Ke7 {+18.97/10 0.1} 21. Nxf7 {+20.03/13 0.6} Nd4 {+19.14/13 0.8} 22. Nxe5+ {+21.44/13 1.0} Kd8 {+21.00/13 0.5} 23. Nf7+ {+21.55/13 0.6} Qxf7 {+22.17/13 0.8} 24. Bxf7 {+22.16/11 0.4} Nhf3+ {+22.60/13 1.0} 25. gxf3 {+22.17/11 0.4} Ne2+ {+25.62/13 2.6} 26. Kh1 {+24.25/12 0.8} Kc7 {+23.63/12 0.7} 27. Qg7 {+30.95/13 0.5} Kb8 {+30.67/13 0.4} 28. Be5+ {+327.52/19 0.6} Ka7 {+327.53/10 0.1} 29. Qxf8 {+327.54/17 0.1} Rb8 {+327.55/17 0.2} 30. Bxb8+ {+327.56/15 0.1} Kxb8 {+327.57/21 0.2} 31. Nd6 {+327.58/13 0.1} Kc7 {+327.59/22 0.2} 32. Qxc8+ {+327.60/11 0.1} Kxd6 {+327.61/44 0.2} 33. Qc5+ {+327.62/9 0.1} Kd7 {+327.63/10 0.1} 34. Rad1+ {+327.64/7 0.1} Nd4 {+327.65/10 0.1} 35. Rxd4# {+327.66/5 0.1}
(The comments in this game for engine moves are in the format “{search_value/ply_searched seconds_searched}”.) In this game, White’s
We can draw several conclusions from the data:
White/CB25.0.1 scored a 26 point ELO advantage over Black/CB25.0.1, which is considered to be the approximate natural advantage of White over Black. Under the same conditions, White/B29 scored 64 ELO points higher, at 90 ELO points advantage over Black/CB25.0.1.
Using books B27, B28 or B29, White has a greater than 26 point ELO advantage over Black/CB25.0.1, with 99% certainty.
Although there is some randomness in the trend, the White ELO advantage of the constructed books goes up at an average rate of about 8 ELO points for every increase of D by one.
It was generally thought that the historical-games-based opening book of chess, which has been built up from many grandmaster games over many decades and even centuries (and now also from engine games), is so deep that no machine-generated book could ever hope to beat it. However, it is possible that the machine-generated book will drop the opponent prematurely out of its book while continuing in its own book for several plies, from the point of exit. This was observed with the larger constructed books. For example, most games in the B29 match proceeded on a line in which White remained in B29 for three or four more moves after Black exited CB25.0.1. By choosing moves for White purely based on search, and by assigning Black’s move probabilities based on historical records of games, new opening lines may be discovered, adding to opening theory.
Another factor favoring a computed book over a games-based book is that the search engine may be oblivious to the goals and issues that the grandmaster or a different engine had in mind when making a particular move. Then when a game exits the games-based book and the engine takes over, it doesn’t know how to take advantage of its position.
By contrast, when White in a game leaves our actively-constructed book at a position P, its engine generates a lookahead tree with a horizon that sees at least to the horizons that the book construction process saw when it searched for moves along the path from the beginning position of chess down to P. The leaf nodes at this horizon are evaluated with the same evaluation function that was used in constructing the book. The engine at game time can see and achieve the goal that it saw and promised at book build time. The computed book is well matched to the engine.
The approach is flexible because the construction process can be tailored to serve various purposes:
A general-purpose book could be constructed by using a given book that is arbitrarily large. A book could be constructed for a match against a particular player, whether person or engine, by biasing the given book towards the historical record of that player. A particular opening could be analyzed by choosing the root position of the constructed DAG from that opening. When a constructed book is used in games, these games could be added to the given book, and a new constructed book built from the augmented given book. In this way, the constructed book could “learn” from experience, and avoid losing in the same way it has lost in the past, or find a way to win in a position from which it previously achieved only a draw. The parameter d can be matched to anticipated game time controls.
Future work
This paper has shown that it is possible to algorithmically construct an opening book that is purpose-built to be played against a given book. Match evaluations show that the constructed book can then be used effectively in play against an opponent using the given book. The question now becomes the more challenging one – whether it is possible to actively construct an effective book, without prior knowledge of the books it will be used against. As a possible answer to this question, the approach should be tried with larger given books, such as the 2-million game ChessBase database (ChessBase, 2017). In the CB25.0.1 given book that was used, only four replies to
For all books computed in this work, B and d were set at 3 and 16, respectively. Other values should be tried to see if they give a better result.
All searches were performed as standard Crafty alpha-beta searches. One modification that suggests itself would be to favor moves that exit from the given book, the earlier the better. This could be a term added to leaf evaluations, or at the root of the subtree where the exit takes place. The idea is that as much of the game as possible should occur with White in book but Black out of book.
A position with Black to move that is outside the given book is currently fully expanded with all its moves assigned equal probability. This should be replaced with some heuristic that assigns less probability to moves that are obviously inferior.
The position priority queue structure of the build program provides a natural framework for parallelism. All searches called for in the position priority queue (positions where it is White’s turn to move), at any one time, are independent of each other and can be done in parallel. Active book construction is thus an ideal application for distributed computing environments such as generic job-level search (Wu et al., 2013), which can package these searches into jobs to be run by remote processors in a job-level system.
Footnotes
Acknowledgements
Thanks go to Prof. Robert Hyatt, who has made available in Crafty a valuable open-source platform for experimentation in computer chess.
