Abstract
Chess, once famously referred to as the drosophila of artificial intelligence (AI) research, has been a significant domain for developing intelligent AI agents capable of achieving super-human performance in domains previously dominated by humans. However, the emphasis on ever-increasing playing strength has come at the cost of neglecting other fundamental aspects of intelligent agents, such as the ability to explain the rationale behind their decisions in human-understandable terms. The need for such capabilities may be greater now than ever, partly because such agents may be capable of learning novel concepts of interest to humans, as recently demonstrated in the game of chess. In this paper, we survey the state of explainable AI in chess-playing agents, arguing that chess holds promise as a suitable domain for explainable AI research.
Keywords
Introduction
Today’s chess programs have reached a super-human level of play (Romstad et al. (2023); Pascutto et al. (2023); Silver et al. (2018)). Over the past few decades, research into computer chess has concentrated first and foremost on developing new techniques and algorithms for improved gameplay while mostly neglecting other important aspects of artificial intelligence, such as how to build intelligent chess-tutoring systems. Contemporary chess engines are thus of limited use to regular club players who want to use them to improve their understanding of the game, although they are valuable to master-level players for assisting with analysis and opening-repertoire preparation. Moreover, recent advancements in the field, where deep neural networks (DNNs) are used both for evaluating board positions and for action selection in the think-ahead process (e.g., in MCTS), as in AlphaZero (Silver et al. (2018)), seem to make the decision-making process even less transparent to humans. In many situations it is almost impossible, even for expert-level players, to understand the rationale for why a particular game position is evaluated in favor of one player over the other.
The above-mentioned interpretability problem is not specific to chess engines, or to other game-playing programs for that matter. As AI systems in diverse fields, such as healthcare and banking, become increasingly complex and ubiquitous, the need for humans to understand their decisions becomes increasingly crucial, not only to learn from the systems but also because of concerns about correctness, ethics, and trust. This need has spurred a renewed interest in the field of explainable AI, that is, in designing AI systems whose decisions can be better understood by humans.
In this paper,1
A previous version of the paper was presented at the (non-archival) IJCAI 2023 Workshop on Explainable Artificial Intelligence (XAI).
The rest of the paper is structured as follows. First, we give an up-to-date overview of desirable properties of explainable AI systems. Then we review intelligent chess-playing agents from the perspective of explainable agency, covering both the current state and the potential the domain holds as a suitable test-bed. Next, we survey recent work on explainable game-playing agents and, finally, we conclude.
Explainable AI (Schwalbe and Finzel (2023)), also referred to as interpretable or transparent AI, concentrates on developing AI techniques that can explain the reasons behind their decisions in a way easily understandable to humans. This is in contrast to so-called black-box AI techniques, most notably artificial neural networks (ANNs), that give answers without providing much insight into how a particular decision was reached. A related problem arises when an AI system provides a detailed trace of its internal sub-decisions as an explanation, but the amount of information is, from a practical standpoint, too large for a human to process and analyze, effectively rendering the explanation useless. This problem is typically referred to as information overload, information explosion or, more informally, infobesity.
Saliency maps (Kadir and Brady (2001)) visually represent how much a given input affects a model’s output. They are commonly used in image classification to visualize relevant aspects of a given input image to a given classification, for example, highlighting areas of interest (Simonyan et al. (2014)).
The Local Interpretable Model-Agnostic Explanations (LIME) is a well-known framework for model interpretation (Ribeiro et al. (2016)). It interprets individual model predictions based on approximating the model around a given prediction using a local linear explanation model. More recently, SHAP (SHapley Additive exPlanations) was proposed as a unified framework for interpreting predictions, unifying several existing methods (including LIME) as well as presenting new ones (Lundberg and Lee (2017)).
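To make the additive-attribution idea behind SHAP concrete, the following minimal sketch computes exact Shapley values for a toy evaluator by averaging each feature's marginal contribution over all feature orderings. It does not use the SHAP library, which relies on approximations for realistic models. The evaluator `toy_eval`, the feature names, and the numeric weights are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            contrib[f] += after - before
    return {f: total / len(orderings) for f, total in contrib.items()}


def toy_eval(present):
    """Hypothetical evaluator: scores a position from the set of features present."""
    score = 0.0
    if "extra_pawn" in present:
        score += 1.0
    if "bishop_pair" in present:
        score += 0.5
    # Interaction term: the bonus applies only when both features are present.
    if "rook" in present and "open_file" in present:
        score += 0.4
    return score


FEATURES = ["extra_pawn", "bishop_pair", "rook", "open_file"]
phi = shapley_values(toy_eval, FEATURES)
for name in FEATURES:
    print(f"{name}: {phi[name]:+.2f}")
```

The interaction bonus is split evenly between the two features that jointly cause it, and the attributions sum to the difference between the full evaluation and the evaluation with no features present. Exact enumeration grows factorially with the number of features, which is why practical tools sample or approximate.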
One drawback of the methods mentioned above is that they mainly work on detecting the sensitivity of the network’s predictions to the individual input parameters. In the case of DNNs, the input parameters are typically low-level features that are not necessarily meaningful for human-based interpretation (e.g., a single pixel in an image). Recently, concept-based explanation methods have shown promise because they go beyond per-sample input features to explain higher-level, human-friendly concepts across entire datasets. Testing with Concept Activation Vectors (TCAV) (Kim et al. (2018)) allows one to quantify how vital arbitrary user-defined (binary and non-binary) concepts are to neural-network predictions via model-probing (Alain and Bengio (2018)). The Automatic Concept-Based Explanations (ACE) method (Ghorbani et al. (2019)) further extends this line of concept-based explanations (in image recognition) by automatically discovering concepts, as opposed to them being human-provided.
In the context of intelligent autonomous agents, particularly in planning (Chakraborti et al. (2018)), the ability of an agent to explain the reasoning behind its decisions has been labeled explainable agency (Langley et al. (2017)). It requires four distinct abilities: the agent must be able to (i) explain decisions made during plan generation, (ii) report which actions it executed at different levels of abstraction, (iii) show how actual events diverged from planned ones and what adaptations were necessary, and, finally, (iv) communicate its decisions and reasoning effectively in a formalism natural to humans. Furthermore, work on explainable agency in systems based on heuristic search tends to distinguish between two types of self-explanations: process-oriented and preference-oriented. The former emphasizes the (thought) process leading to finding the solutions, whereas the latter focuses on the solutions themselves without concern for how they were found (Langley (2019)).

Figure 1: From the game Fischer vs. Rossolimo, U.S.A. Championship 1955-6. Fischer’s commentary after the move [...] (from Fischer (2012)): on [...] (if [...]) ([...] is again met by [...] and the Knight beats the Bishop in the ending) [...], etc. Black’s best chance, however, is to try and reach sanctuary with [...]. White undoubtedly has the initiative, but it’s hard to get at the King.
The state of explainable agency in contemporary chess-playing agents is weak, which may not be surprising given the lack of research focus. Although there exists some research literature on intelligent chess-tutoring systems (Guid et al. (2013); Sadikov et al. (2006)), it is sparse, and the reported work is either restricted to simple chess endgames or is somewhat preliminary. When it comes to strong chess-playing agents (engines), their explanation capability is embarrassingly poor.
The explanations chess engines typically provide for a given board position are limited to the projected best continuation of play, the so-called principal variation (PV), and a single numeric score evaluating the merit of the board position at the tip of the PV. Although some engines can provide a somewhat more detailed explanation upon request, e.g., how different components such as material, mobility, pawn structure, and king safety contribute to the score, most engines do not offer such an option (and ANN-based engines are inherently incapable of doing so).
Specifically, when measured against the four abilities required for explainable agency, engines fall short on all counts. For example, although an engine shows a plan (the PV), it provides no explanation for why that line of play is desirable, nor why it was chosen over other possible continuations. Also, if a seemingly more promising plan was initially followed during the search but then refuted by the opponent, the observer will never know. One could in some cases have the chess engine (programmatically) output more detailed search information, but without a proper human-like level of abstraction for reasoning and communicating, one would soon fall prey to infobesity. This is in clear contrast to how an expert human would explain the rationale for her play, i.e., by combining variations and higher-level goals, showing potential alternatives and the reasons why they do or do not work, etc. Fig. 1 illustrates this: the commentary explains why an initially promising alternative does not work (blocking the check by moving a Knight to c6) and gives a long-term strategic evaluation (the Knight is superior to the Bishop in an ending that may arise).
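As a rough illustration of the two kinds of output described above, the sketch below pairs a PV and a single score with a per-component breakdown of a hand-crafted evaluation. The `EngineOutput` structure, the component names, and all numbers are hypothetical; no real engine's output format or API is implied.

```python
from dataclasses import dataclass


@dataclass
class EngineOutput:
    pv: list        # principal variation as a list of move strings
    score_cp: int   # evaluation at the tip of the PV, in centipawns


def explain_score(components):
    """Rank evaluation components by absolute contribution and return their total."""
    ranked = sorted(components.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked, sum(components.values())


# Hypothetical values for a single position.
output = EngineOutput(pv=["e4", "e5", "Nf3"], score_cp=35)
components = {"material": 0, "mobility": 20, "pawn_structure": -5, "king_safety": 20}

ranked, total = explain_score(components)
lines = [f"{name}: {value:+d} cp" for name, value in ranked]
print("PV:", " ".join(output.pv), "| score:", output.score_cp, "cp")
print("\n".join(lines))
```

Such a breakdown is only possible when the evaluation is an explicit sum of named terms; a neural-network evaluation offers no such decomposition directly, which is the limitation noted above.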
It is also worth noting that in chess (and other abstract games), both the process and preference types of self-explanation are relevant. A human player looking to improve would be interested in a chess-tutoring system explaining (at a human-friendly level of abstraction, of course) both how a desirable game position (solution) could be reached and why it is preferable. Explaining the reasoning process is a non-trivial task: in addition to the preferred continuation, the agent needs to show what other candidate lines of play it considered, explain why they were worth considering, and explain why they ultimately proved inferior to the chosen line of play. This is different from some other problem domains, where only one or the other (process or preference) is relevant, as hypothesised in the original paper on the topic (Langley (2019)).
Although many abstract board games provide a good testbed for AI and XAI research because they have simple rules yet require non-trivial strategies to play well, chess offers additional benefits. Chess is a popular game in Western culture, and its following is increasing, making it more relatable as a problem domain than most other abstract board games. Most importantly, the vast amount of literature on chess provides prime examples of how chess decisions are best communicated and explained to humans, from beginners to experts, at different levels of detail. Not many problem domains offer the same benefit, in games or otherwise. Furthermore, the game is easy to scale in complexity, ranging from trivial endgames to complex middle-game patterns. For research, the ability to scale the domain difficulty is instrumental (some other abstract board games offer similar benefits, but few to the same degree). From a technical standpoint, there are also good arguments for using chess as a problem domain. First, the reasoning approaches used in top chess-playing agents are either minimax- or simulation-based (e.g., alpha-beta and MCTS search, respectively); this allows researchers to experiment with techniques in both of those dominant heuristic-search paradigms. Second, many open-source chess-playing agents are freely available, which provides researchers with a valuable head start and a more objective way of evaluating their results (i.e., as opposed to using custom-built software).
Some earlier work defining desirable properties of machine learning systems took a holistic view, in which learning performance is only one of several criteria for evaluating the capabilities of such systems. For example, Michie (1988) defined three criteria for machine learning: (i) weak criterion: the learning system improves its performance through experience; (ii) strong criterion: additionally, it can describe what it has learned in explicit symbolic form; (iii) ultra-strong criterion: additionally, the symbolic description is human-understandable and suited for improving the human’s performance at the task. AlphaZero-style agents still have a long way to go to fulfill the last two criteria and, by this definition, are considered weak learners, as argued in (Bratko (2018)).
Finally, research into explainable AI is not orthogonal to other relevant AI research directions. An important aspect of explainable AI, as we have seen, is to be able to reason and communicate at an abstract level that is natural to humans. Internal representations used for that purpose can provide synergies with other types of reasoning and learning approaches. For example, when applying reinforcement learning in chess, sparse rewards and the absence of tangible sub-goals can make learning convergence slow. The aforementioned internal representations could potentially also be useful for expediting reinforcement learning, e.g., by introducing sub-goals (e.g., to mate the opponent’s king in a given endgame, one must first restrain it to any side of the board, then fix it to an adjacent corner, and only then go for the mate). Active learning, where a learning system may ask humans only a limited number of queries to help expedite its learning process and require fewer labeled examples, is another example of where synergies may occur.
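One way the sub-goal idea could be wired into reinforcement learning is potential-based reward shaping, a standard technique that the text above does not itself name. The sketch below is a toy under stated assumptions: the state is reduced to the defending king's square, and the potential simply counts how far the "edge, then corner" sub-goals have progressed. The function names and the discount value are hypothetical.

```python
GAMMA = 0.99  # assumed discount factor


def potential(king_sq):
    """Sub-goal progress for the defending king at (file, rank), each in 0..7.

    0 = king in the open, 1 = king on an edge, 2 = king in a corner.
    """
    file, rank = king_sq
    return int(file in (0, 7)) + int(rank in (0, 7))


def shaped_reward(env_reward, king_before, king_after, gamma=GAMMA):
    """Add the shaping term gamma * phi(s') - phi(s) to the sparse game reward."""
    return env_reward + gamma * potential(king_after) - potential(king_before)


# The attacker's move forces the king from the center to the back rank:
# a positive shaping signal even though the game reward is still zero.
print(shaped_reward(0.0, (3, 3), (3, 0)))
# The king is then driven from the edge into the corner.
print(shaped_reward(0.0, (3, 0), (0, 0)))
```

The same abstract predicates ("king on the edge", "king in the corner") that define the potential are also the vocabulary one would want in a human-readable explanation, which is the synergy suggested above.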
Survey of recent work
Chess provides many exciting challenges for explainable AI. First, neural-network-based evaluation functions are gaining ground, often learning intricate concepts not immediately visible to humans (McGrath et al. (2022)). Thus, developing computational explainability methods for probing those networks to help analyze the knowledge encoded there is valuable. Second, explainability research has focussed on model interpretability, with little attention paid to the think-ahead process until the recent emergence of so-called explainable search (Baier and Kaisers (2020)). Third, the development of general chess-tutoring systems has been hampered, among other things, by a lack of research attention. Hopefully, advances in explainable AI will pave the way for more capable tutoring systems. We now survey recent work in chess (and relevant work in a few other abstract board games) along those three dimensions.
Explaining evaluations
Saliency maps are commonly used in image classification to visualize which aspects of a given input are relevant to the produced output. They have recently been adapted to help visualize and better understand game board evaluations. Fritz and Fürnkranz (2021) analyze the use of the Specific and Relevant Feature Attribution (SARFA) method in chess using different chess engines, pinpoint some of the pros and cons of such an approach, and propose improvements to address the identified shortcomings. Along similar lines, Pálsson and Björnsson (2022) evaluate the applicability and effectiveness of several saliency-map-based methods for explaining the evaluation of positions in the game of Breakthrough, demonstrating that the more applicable methods (like Shapley Value Sampling and LIME) provide valuable insights into the importance of game pieces and other domain-dependent knowledge learned by the model.
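The basic idea behind perturbation-based saliency on a game board can be sketched as follows: remove one piece at a time and record how much the evaluation changes. This is a deliberately simplified scheme, not SARFA or any of the surveyed methods. The board encoding, the material-only `toy_eval`, and the example position are hypothetical.

```python
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}


def toy_eval(board):
    """Material balance from White's view; uppercase = White, lowercase = Black."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUES[piece.upper()]
        score += value if piece.isupper() else -value
    return score


def piece_saliency(board, evaluate):
    """Saliency of a piece = evaluation drop when that piece is removed."""
    base = evaluate(board)
    saliency = {}
    for square, piece in board.items():
        if piece.upper() == "K":
            continue  # removing a king would leave an illegal position
        perturbed = {sq: p for sq, p in board.items() if sq != square}
        saliency[square] = base - evaluate(perturbed)
    return saliency


# Hypothetical position given as a square -> piece mapping.
board = {"e1": "K", "e8": "k", "d1": "Q", "a8": "r", "c3": "N", "h7": "p"}
sal = piece_saliency(board, toy_eval)
for square, value in sorted(sal.items(), key=lambda kv: -abs(kv[1])):
    print(square, board[square], f"{value:+d}")
```

With a material-only evaluator the saliency of each piece is just its signed material value. The approach becomes informative when `evaluate` is a learned model, where the same loop reveals which pieces the model treats as decisive in a particular position.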
McGrath et al. (2022) analyze the knowledge acquired by AlphaZero in chess, drawing an analogy to chess concepts learned by humans, by applying (linear) concept probes to the neural network and by conducting a behavioral analysis of the agent’s opening play. The probing examination showed that many human-understandable chess concepts could be accurately regressed from the AlphaZero neural network both after and during training. Furthermore, a qualitative analysis of AlphaZero’s play by GM Vladimir Kramnik shed light on how its chess knowledge (as judged by an expert observer) developed alongside its training. The work shows, for example, that the agent’s opening knowledge undergoes a period of rapid development around the same time that many human-like concepts become predictable from network activations.
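A linear concept probe can be sketched in a few lines: fit a linear model that predicts a concept label from a layer's activations and measure how well it generalizes. The example below uses synthetic "activations" in which a binary concept is, by construction, linearly encoded along a hidden direction. It is a toy logistic probe in plain Python, not the probing setup of the surveyed work; the data generator, dimensions, and hyperparameters are all assumptions.

```python
import math
import random


def make_dataset(n, rng, direction):
    """Synthetic activations; the concept holds when the projection is positive."""
    data = []
    for _ in range(n):
        act = [rng.gauss(0.0, 1.0) for _ in direction]
        label = 1 if sum(a * d for a, d in zip(act, direction)) > 0 else 0
        data.append((act, label))
    return data


def train_probe(data, dim, epochs=50, lr=0.1):
    """Logistic-regression probe trained with plain stochastic gradient descent."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for act, label in data:
            z = sum(wi * ai for wi, ai in zip(w, act)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - label
            w = [wi - lr * err * ai for wi, ai in zip(w, act)]
            b -= lr * err
    return w, b


def accuracy(data, w, b):
    hits = 0
    for act, label in data:
        score = sum(wi * ai for wi, ai in zip(w, act)) + b
        hits += (score > 0) == (label == 1)
    return hits / len(data)


rng = random.Random(0)
direction = [1.0, -2.0, 0.5, 0.0]  # hypothetical concept direction
train = make_dataset(200, rng, direction)
test = make_dataset(100, rng, direction)
w, b = train_probe(train, dim=len(direction))
acc = accuracy(test, w, b)
print(f"held-out probe accuracy: {acc:.2f}")
```

High held-out accuracy indicates that the concept is linearly decodable from the activations; it does not by itself show that the network uses the concept when making decisions, which is why probing is typically combined with behavioral analysis.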
Using Stockfish (Romstad et al. (2023)), a world-class superhuman-strength chess-playing engine, as a testbed, Pálsson and Björnsson (2023) show how recent interpretability techniques, including surrogate models and concept probing, can illuminate human-understandable chess concepts learned by the engine’s neural network. Furthermore, the work contrasts the state evaluations of the learned neural network to that of its counterpart hand-crafted evaluation model. They identify and explain critical differences in the game state assessments by doing so. For example, the neural network could statically detect threats such as forks, promotions, and attacking potentials, which would require a look-ahead search in the classical version of Stockfish. Also of interest was the low agreement on king-safety evaluation between hand-crafted and neural-network models – the neural network had seemingly discovered an alternative and more effective way of evaluating king safety.
Lovering et al. (2022) present closely related work, albeit in the game of Hex. They use model probing and behavioral tests to investigate what information is encoded, and how, in an AlphaZero-style agent trained to play Hex. Their analyses suggest that the agent’s neural network learned to represent and use concepts humans consider important for the game. However, they found gaps in the encoded knowledge, for example concerning dead cells and a lack of urgency to go for an imminent win. They also show that the training encodes short-term end-game-related concepts in the final layers of the network, whereas it encodes concepts related to long-term planning in the middle layers, and that MCTS typically discovers relevant concepts before the neural network learns to encode them. This partially resembles how expert human players learn: dynamic tactical motifs, over time, become seen as statically detectable patterns.
Whereas the abovementioned work on concept probing checks only for the presence of pre-defined human-constructed concepts, Schut et al. (2023) take that approach a step further by using the AlphaZero agent to discover new chess concepts, thus hopefully extending the scope of existing human chess knowledge. They first employ convex optimization to find suitable concept candidates, then filter out non-novel or non-teachable candidates based on spectral analysis and the notion of informativeness, respectively; finally, they validate the remaining candidates by presenting them to strong chess players and checking whether the new knowledge results in improved play (i.e., play better aligned with AlphaZero’s move choices).
Explainable search
Explainable search is a recently emerging research direction targeted toward explaining the decisions of search-based agents in sequential decision-making domains, such as chess.
As of today, the most mature research towards that goal is (arguably) in the field of autonomous planning, with some recent work making noteworthy headway towards explainable planning. Chakraborti et al. (2020) provide a comprehensive survey of Explainable AI Planning (XAIP) and compare it to earlier efforts in the field in terms of techniques, target users, and delivery mechanisms, with a particular focus on the role of explanations in the design of effective human-in-the-loop planning systems.
More generally, Baier and Kaisers (2020) highlight six research challenges relating to explainable search: (i) explanations as conversations; (ii) explanations as a two-way street; (iii) explanations in long-term interactions with users; (iv) explanation-aware search; (v) counterfactual explanations of search; and, finally, (vi) integrated explanations of search and evaluation. Whereas all of the above challenges apply to explainable AI for chess, some seem more relevant than others, particularly integrating explanations of search and evaluation. Explanation-aware search is also exciting and may become essential to future intelligent chess-tutoring systems.
Finally, Baier and Kaisers (2021) make some initial steps towards addressing some of the abovementioned challenges as applied to MCTS-based agents. They also rightfully acknowledge the need for a more robust and flexible formalization of explainable search and the need for carefully constructed user studies to get informative feedback on how preferred and practical explanations of search should look in practice. In chess, we believe this need is already partially met by the existing rich chess literature on compelling explanations, thus further supporting the case of chess being an ideal and fitting domain for explainable AI research.
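A very simple form of counterfactual explanation of search is to report, alongside the chosen move, each rejected root move together with the reply that refutes it. The sketch below does this with plain negamax on a tiny hand-made game tree. It is not the method of the surveyed work and does not involve MCTS; the tree, the move labels, and the leaf values are hypothetical.

```python
def negamax(node):
    """Return (score for the side to move, principal variation) for a node.

    A leaf is a number: the static evaluation from the side to move's view.
    An inner node is a dict mapping move labels to child nodes.
    """
    if not isinstance(node, dict):
        return node, []
    best_score, best_line = None, []
    for move, child in node.items():
        child_score, child_line = negamax(child)
        score = -child_score  # the child's score is from the opponent's view
        if best_score is None or score > best_score:
            best_score, best_line = score, [move] + child_line
    return best_score, best_line


def explain_root(tree):
    """For every root move, report its score and the opponent's best reply line."""
    report = []
    for move, child in tree.items():
        child_score, reply_line = negamax(child)
        report.append((move, -child_score, reply_line))
    report.sort(key=lambda entry: entry[1], reverse=True)
    return report


# Hypothetical two-ply tree: our move, then the opponent's reply.
# Leaf values are from our point of view (we are to move again at the leaf).
tree = {
    "block_check": {"capture_blocker": -2, "quiet_reply": 1},
    "move_king": {"quiet_reply": 0, "slow_attack": 1},
}

score, pv = negamax(tree)
report = explain_root(tree)
print("chosen:", pv, "score:", score)
for move, value, refutation in report[1:]:
    print(f"rejected {move}: score {value:+d} after {refutation}")
```

The output mirrors the structure of a human annotation ("blocking the check fails to the capture"), but at the level of raw moves and numbers. Turning such a report into an explanation at a human-friendly level of abstraction is the open problem discussed above.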
Chess tutoring systems
There is little to no recent work on building intelligent chess-tutoring systems, except for possibly DecodeChess (DecodeChess), a commercial online platform where one can submit chess games for annotations. Their explanation approach is proprietary, and thus, little is known; however, they state that they use cutting-edge cognitive computing approaches to emulate abstract human thinking and look for and match relevant chess concepts to the positions at hand. The provided text explanations refer to known chess concepts. However, they often come across as elementary and superficial and far from being as insightful as annotations one is accustomed to in the chess literature. Although some older research on intelligent chess-tutoring systems exists, e.g. (Guid et al. (2013); Sadikov et al. (2006)), it is sparse, and the reported work is either restricted to simple chess endgames or is somewhat preliminary.
Another recent development worth mentioning uses natural-language generation techniques, including large language models, to provide online chess commentary, some in conjunction with symbolic reasoning (see, e.g., (Zang et al. (2019); Lee et al. (2022)) for such approaches as well as an overview of this recently emerging field).
Developing a fully general and robust tutoring system for chess is a lofty research goal that would need to address more or less all of the above-listed research challenges of explainable search. A more attainable yet still challenging research goal would be to develop methods to fully annotate chess games with insightful human-like comments (like those in Fig. 1). For that, one would not need to be concerned with the interactive or two-way aspect of the explanation. Instead, one could concentrate on the tasks of integrating evaluation and think-ahead explanations, even altering them based on the level of expertise of the intended audience. 2
The International Computer Games Association (ICGA), formerly named the International Computer Chess Association (ICCA), used to hand out prizes to recognize chess systems providing good annotations; to further encourage such research, maybe it is time to reinstate that practice.
Chess-playing agents are now playing at a level far exceeding even the strongest human grandmasters. However, there are still ample opportunities to further improve the intelligence of chess agents, not by further improving their playing strength but by improving their explainable agency.
We believe that, in this respect, chess will continue to serve as an important domain for AI research. For example, it offers exciting challenges and future research directions in XAI, including: (i) explaining evaluations in a human-friendly manner for audiences of different levels of expertise (building on the rich existing chess literature); (ii) explaining the reasoning process in combination with the evaluations (in games, the think-ahead process is also of importance when explaining decisions); and (iii) discovering and explaining concepts not yet fully appreciated by humans (some of the best chess programs play at a super-human level), to name a few.
