Abstract
Tibetan Jiu Chess, a recognized national intangible cultural heritage, is characterized by limited game data and significant parsing challenges. In this study, we leverage the retrieval-augmented generation (RAG) framework, prompt engineering, artificial intelligence (AI) agents, and large language models (LLMs) to construct a question-and-answer (Q&A) system tailored for Tibetan Jiu Chess. Additionally, we developed a specialized algorithm for Jiu Chess game parsing, which integrates with the LLM to enable intelligent and accurate game interpretation. Experimental results demonstrate that the Q&A system effectively addresses two types of Tibetan Jiu Chess questions, achieving notably higher accuracy in knowledge-based questions compared to baseline systems. This Q&A system not only addresses the gap in Tibetan Jiu Chess analysis but also pioneers a new approach to the preservation and transmission of Tibetan chess culture.
Introduction
With the continuous development of artificial intelligence (AI) technology, question-and-answer (Q&A) systems, as an important aspect of natural language processing, have garnered significant attention and research. In specialized fields, the high knowledge threshold, extensive domain-specific terminology, and rapid knowledge updates render general Q&A systems inadequate for meeting users’ specific needs. Consequently, developing Q&A systems tailored to these domains has become increasingly important.
Tibetan Jiu Chess is recognized as a national intangible cultural heritage that urgently requires protection and preservation (Ma & Tao, 2017). This study develops a Q&A system for Tibetan Jiu Chess that intelligently analyzes chess games, addresses the gap in intelligent Q&A systems for Tibetan chess, provides foundational data services for scientific research, and promotes the protection and preservation of this intangible cultural heritage.
The retrieval-augmented generation (RAG) framework integrates retrieval techniques with generative modeling approaches to enhance the quality and accuracy of content produced by large language models (LLMs) in natural language processing tasks. The RAG framework effectively leverages the strengths of traditional deep learning models (Maktapwong et al., 2022; Wang et al., 2022), which generate fluent text, and template-matching models (Chaidrata et al., 2021; Setyawan et al., 2018), which deliver specific and accurate information. The RAG framework includes a retrieval module and an LLM-based generation module (Es et al., 2023). Because it incorporates an external knowledge source during generation, retrieving relevant information from a document collection and integrating it with user queries, it is well suited to a Q&A system for Tibetan Jiu Chess.
Prompt learning (Liu et al., 2023) introduces relevant prompting information, allowing pretrained LLMs (Devlin et al., 2019; Radford et al., 2018) to more effectively understand and generalize knowledge acquired from a limited number of labeled samples. Research by Schick and Schütze (2021) on prompt learning methods demonstrates that constructing suitable prompt templates and mappers for specific downstream tasks can significantly enhance performance, especially in low-data scenarios. Therefore, prompt learning methods are well-suited for Tibetan Jiu Chess Q&A systems with limited sample sizes.
On their own, LLMs lack planning abilities, tool utilization, and human-like memory, which prevents them from performing intelligent analysis of Tibetan Jiu Chess game records. From a software engineering perspective, an AI agent is a computer program based on LLMs that exhibits planning abilities, memory functions, and tool utilization (Wu et al., 2023), allowing it to complete assigned tasks autonomously.
This study addresses the scarcity of Tibetan Jiu game record data and the parsing difficulties by establishing a domain documentation set and an expert knowledge base for game record analysis. It also integrates RAG technology and an algorithm for parsing Tibetan Jiu game records based on expert knowledge as core components within an LLM, thereby constructing a Tibetan Jiu Chess Q&A system.
Experiments demonstrate that the Q&A system can teach Tibetan Jiu Chess fundamentals to beginners, introduce basic formations, and interpret game records. It analyzes each move, strategy, and its underlying significance, providing players with a comprehensive understanding of the game. The system is rich in educational resources for Tibetan Jiu Chess, providing precise and professional-level game record analysis and rule explanations. Users can access information and guidance on Tibetan Jiu Chess anytime and anywhere through the system, eliminating the need to wait for or seek out experts, significantly enhancing learning efficiency, and increasing motivation. This study not only fills a gap in the analysis of Tibetan Jiu Chess game records but also opens new avenues for the protection and inheritance of intangible cultural heritage through AI.
Background
Specialized Domain Q&A System
Enabling computers to process information and perform tasks as adeptly as humans is a primary aim of AI, and it has directly fueled the rise and development of Q&A systems (Nematzadeh et al., 2018). Most traditional Q&A systems are based on either deep learning generative technologies or template-matching frameworks. Q&A systems utilizing deep learning generative techniques (Liu et al., 2019) often struggle with low-quality responses, such as inconsistent content and grammatical errors, while demanding substantial computational resources. Such systems may fail to provide accurate answers and raise concerns regarding data privacy. Conversely, template-matching Q&A systems (Sneiders, 2002) impose greater demands on knowledge bases, requiring considerable human effort and resources to compile template libraries that closely match users’ questions.
In recent years, methods for optimizing traditional knowledge base question answering (KBQA) using LLMs, as well as entirely LLM-based KBQA approaches, have emerged. Leveraging LLMs facilitates more precise construction of knowledge bases, improved understanding of user questions, and enhanced answer generation (Heyi et al., 2023). Common LLMs include GPT-3.5, GPT-4 (Sanderson, 2023), LLaMA-2 (Touvron et al., 2023), and ChatGLM-2 (Zeng et al., 2022), among others. According to comparative data from multiple related studies, GPT-4 has exhibited superior performance across various tasks among English language models. While these large models demonstrate tremendous potential in Q&A scenarios, limitations persist when applied to specialized domains, primarily due to the following reasons:
The issue of model hallucination (Tonmoy et al., 2024). When large models generate text or execute commands, they may produce information that appears plausible but contradicts factual accuracy, logical context, or input instructions. This is particularly challenging in specialized domains where factual accuracy and domain-specific logic are critical.
The issue of model data bias (Wang et al., 2023). Data bias can be particularly detrimental in specialized domains, especially when the data involves domain-specific knowledge. Common biases, such as historical data bias, sampling bias, labeling bias, and data imbalance, can hinder the model's ability to accurately interpret domain-specific terms and contexts. This, in turn, may lead to the generation of misleading or biased responses when addressing specialized problems, undermining the accuracy and fairness of the system. For instance, when constructing legal Q&A systems, data collected from the internet may reflect biases related to gender or race (Caliskan et al., 2017). The process of assigning labels to text may also introduce biases stemming from the subjective judgments of the annotators. Such biased responses can ultimately affect judicial fairness (Ranjan et al., 2024).
The issue of model accuracy. Large models are pretrained on broad datasets; when applied to specialized domains, the discrepancies between the general data they were trained on and the required specialized knowledge can lead to an insufficient understanding of that domain. This deficiency results in the model's inability to accurately answer detailed and complex questions within that domain.
Fine-tuning is challenging. When the context or data scope changes, large models require updates to their parameters, resulting in high overall costs. Additionally, domain knowledge may be inadequately learned when data is insufficient, resulting in incomplete mastery of specialized knowledge.
To address these issues, several approaches have been proposed such as introducing AI Agents to optimize the inference process of large models (Guo et al., 2024), and implementing RAG architectures for specialized domain knowledge Q&A scenarios (Chen et al., 2024).
RAG, proposed by Lewis et al. (2020), centers on augmenting a generative model with a retrieval module. This module searches for contextually relevant information from a document corpus in response to queries and integrates it into the generation process, enhancing the accuracy and reliability of the generated content. In specialized domains, RAG technology is often combined with LLMs to produce higher-quality, domain-specific responses, thereby mitigating issues of model hallucination (Chen et al., 2024). To address indexing challenges, advanced RAG enhances its indexing techniques using sliding window methods, fine-grained segmentation, and metadata merging (Zhao et al., 2024).
The workflow of RAG is as follows: First, documents are loaded and segmented into text chunks. These text chunks are converted into vectors and stored in an efficient vector database. When a user submits a query, the system transforms the input into a vector and searches for similar document snippets in the vector database. Finally, these snippets and the query are fed into a generative model. The LLM integrates prompt templates, consolidates internal and external information, and generates precise answers through rewriting, summarization, or direct quotation.
The final step typically involves creating a composite input, referred to as a “prompt,” that includes the original question and the retrieved information. A well-crafted prompt can help the model better understand the context of the question and the required type of response.
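The workflow above can be illustrated with a minimal, self-contained sketch. It is not the system's implementation: the bag-of-words `embed` function stands in for a learned embedding model, the in-memory list stands in for a vector database, and the function names, sample chunks, and prompt wording are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: a real system uses a learned embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, snippets):
    """Combine the retrieved snippets and the original question into one prompt."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


# Hypothetical knowledge chunks paraphrased from the rules described in this paper.
chunks = [
    "The Jiu Chess board has 14 lines and 196 intersection points.",
    "A square formation allows the capture of one opponent piece.",
    "The layout phase ends when the board is completely filled.",
]
query = "How many intersection points does the board have?"
top = retrieve(query, chunks, k=1)
prompt = build_prompt(query, top)
```

The resulting `prompt` string is what would be passed to the generative model; swapping `embed` for a real embedding model changes the ranking quality but not the overall flow.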
Prompt Engineering
Few-shot learning aims to enhance a model's generalization capabilities by maximizing the use of available training data when sample sizes are limited. LLMs demonstrate robust few-shot learning capabilities. While traditional fine-tuning methods can achieve good performance on downstream tasks with relatively small data sets, they are prone to overfitting when fine-tuning is conducted with limited data (Zhang et al., 2024). Moreover, the computational and time costs associated with fine-tuning large models can be substantial.
Prompt learning is an emerging method in few-shot learning. Its core concept involves designing specific instructions or contexts—termed “prompts”—to provide contextual cues to the model, thereby guiding it to extract more valuable information from existing data (Giray, 2023). Compared to traditional fine-tuning methods, prompt learning performs significantly better in scenarios with limited data, as the model can understand and execute tasks through prompts without needing extensive additional training data. This approach mitigates the risk of overfitting when training data is limited and considerably reduces overall costs (Liu et al., 2023).
AI Agent
An AI Agent is a form of intelligent agent system based on LLMs, endowed with capabilities such as autonomous understanding, perception, planning, memory, and tool utilization (Wang et al., 2024). Primarily composed of four key modules—memory, planning, tool usage, and action—AI agents can learn and execute tasks independently through sensing and decision-making processes.
The composition of the AI Agent is depicted in Figure 1.
Memory Module: This module encompasses both short-term and long-term memory. Short-term memory is context-dependent, and information is forgotten once it exceeds its capacity. In contrast, long-term memory retains information persistently, facilitating knowledge retrieval over extended periods.
Planning Module: Centered around an LLM, it employs prompting strategies, such as chain-of-thought (Wang et al., 2022), to optimize the planning process.
Action & Tools Module: This module involves text generation and the invocation of external tools. The former relies on the model's fundamental capabilities for text generation, while the latter extends the model's functions through tools integrated with domain knowledge and rules.

Figure 1. Artificial intelligence (AI) agent architecture diagram.
The integration of AI agents not only substantially enhances the reasoning efficiency and quality of large models but also increases the system's flexibility and adaptability, enabling it to demonstrate greater value in complex and dynamic real-world applications, particularly in highly specialized domains of inference and understanding (McTear, 2022; Schwartz et al., 2023). The incorporation of AI agents allows for a deeper exploration of large models’ application potential, achieving a significant leap in the capability to handle professional domain knowledge (Huq et al., 2024).
In the field of Tibetan Jiu Chess, there remains a gap in domain-specific Q&A systems. The primary issue is the scarcity of sample data for Tibetan Jiu Chess notation interpretation and question answering, which makes it impractical to construct a Q&A system using conventional machine learning or by fine-tuning pretrained models. Therefore, the focus shifts to retrieval-augmented generation architectures and prompt engineering with LLMs. Additionally, little knowledge has been documented about game state analysis and piece formation recognition, which leaves the relevant rules insufficiently understood and hinders the systematic establishment of an expert knowledge base and interpretation algorithms. Interpretation algorithms can effectively utilize the accumulated experience and specialized knowledge of experts by simulating their thought processes to address problems requiring expertise. Therefore, delivering high-performance interpretation algorithms is essential for resolving the issue of Tibetan Jiu Chess notation interpretation. By integrating RAG technology, the Tibetan Jiu game record analysis algorithm, and advanced LLMs, an AI agent can provide a precise and user-friendly Q&A system for game record analysis. In the realm of Tibetan Jiu Chess competition, relatively mature gaming algorithms (Li et al., 2018, 2022) and operational Jiu Chess platforms exist, providing favorable conditions for the subsequent collection of game record data.
Tibetan Jiu Chess Rules Introduction
Tibetan Jiu Chess Equipment
Tibetan Jiu Chess equipment consists of the board and the pieces. The most common type of Tibetan Jiu Chess board is the 14-line variant, featuring a total of 196 intersection points. Central to the board is a diagonal line known as the “Jiuwu” line. The pieces used in Tibetan Jiu Chess are often interchangeable with those used in the game of Go, due to their similar design.
Tibetan Jiu Chess Rules
The Tibetan Jiu Chess playing process is divided into two phases: layout and battle of the pieces. At the start of a Tibetan Jiu Chess match, each player controls pieces of one color. During the layout phase, both sides alternately place one piece at a time on the board, and no pieces may be moved until the board is completely filled. Once the layout phase concludes, the pieces on the “Jiuwu” line are removed, marking the transition to the battle phase. Subsequently, the player who placed the last piece during the layout phase makes the first move in the battle phase. During this phase, a piece can only be moved to an unoccupied point on the board, advancing by one space at a time.
During the battle phase, a player can capture an opponent's piece in three ways: by jump capture, by square capture, and by shape capture.
Jump Capture: If one of your pieces is adjacent to an opponent's piece and there is an empty space on the opposite side, you can jump over the opponent's piece to occupy the empty space, simultaneously capturing the opponent's piece. As shown in Figure 2(a), jump captures can be performed continuously until the conditions for a jump capture are no longer met; such a series is referred to as a “multistep jump capture” or “multiple captures.”
Square Capture: When four pieces of the same color occupy the corners of a square on the grid, a “square formation” is created, allowing the capture of one opponent's piece, as shown in Figure 2(b1). Three pieces in a triangular arrangement form a “Chess Gate” or “Single Chess Gate,” as shown in Figure 2(b2). If two square formations are made in one move, the player may capture two of the opponent's pieces, a condition called a “Double Chess Gate,” as shown in Figure 2(b3).
Shape Capture: During gameplay, when pieces are arranged into specific shapes, such as the Dalian shape and others, as shown in Figure 2(c1) and (c2), the player may capture one or more of the opponent's pieces.

Figure 2. Tibetan Jiu Chess capture formation diagram.
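The jump capture and square capture conditions described above can be sketched as simple board checks. This is only an illustration under stated assumptions: the integer encoding (`EMPTY`, `BLACK`, `WHITE`), the restriction to the four orthogonal directions, and the function names are hypothetical and not taken from the system described in this paper.

```python
SIZE = 14                      # 14-line board, 196 intersection points
EMPTY, BLACK, WHITE = 0, 1, 2  # hypothetical cell encoding
DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # assumption: orthogonal jumps only


def in_bounds(r, c):
    return 0 <= r < SIZE and 0 <= c < SIZE


def jump_captures(board, r, c):
    """Return (landing_point, captured_point) pairs for single jumps from (r, c)."""
    me = board[r][c]
    if me == EMPTY:
        return []
    results = []
    for dr, dc in DIRECTIONS:
        mid = (r + dr, c + dc)
        land = (r + 2 * dr, c + 2 * dc)
        if not in_bounds(*land):
            continue
        is_opponent = board[mid[0]][mid[1]] not in (EMPTY, me)
        if is_opponent and board[land[0]][land[1]] == EMPTY:
            results.append((land, mid))
    return results


def forms_square(board, r, c):
    """True if (r, c) is a corner of a unit square whose four corners share its color."""
    me = board[r][c]
    if me == EMPTY:
        return False
    for dr in (-1, 0):
        for dc in (-1, 0):
            top, left = r + dr, c + dc
            corners = [(top, left), (top, left + 1), (top + 1, left), (top + 1, left + 1)]
            if all(in_bounds(y, x) and board[y][x] == me for y, x in corners):
                return True
    return False


board = [[EMPTY] * SIZE for _ in range(SIZE)]
board[5][5], board[5][6] = BLACK, WHITE
jumps = jump_captures(board, 5, 5)
```

A multistep jump capture could be built on top of `jump_captures` by applying a jump, removing the captured piece, and calling the function again from the landing point until it returns an empty list.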
The rules for determining victory or defeat in Jiu Chess are as follows:
If a player has fewer than 14 pieces remaining, that player is judged the loser.
If one player forms two or more stable Dalian while the opponent has no chess gate, the opponent is judged the loser.
Even if a player has more pieces, if none can be moved effectively to form a square (or another strategic shape specific to Jiu Chess), that player is judged the loser.
If a player exceeds the time limit for a move, that player is judged the loser.
Dalian includes both flat-mouthed and oblique-mouthed variations, with the flat-mouthed Dalian depicted in Figure 2(c1) and the oblique-mouthed Dalian depicted in Figure 2(c2).
Tibetan Jiu Chess Knowledge Collection and Processing
We extensively collected and rigorously analyzed a variety of literature on Tibetan Jiu Chess, including historical records, ethnographic works, and contemporary research findings. Building on this foundation, we developed a comprehensive documentary knowledge base for Tibetan Jiu Chess that holistically covers the rule system, professional terminology, and detailed explanations of strategic shapes.
In May 2023, we collected 23 game records from a Tibetan Jiu Chess tournament in Aba, Sichuan. In October 2023, we gathered 31 game records from another tournament in Qinghai. In March 2024, we visited the Tibetan Jiu Chess Association in Aba, Sichuan, where we consulted with experts and collected an additional nine game records. Through systematic on-site research, we visited local chess clubs and conducted extensive interviews and exchanges with seasoned players, including Zezhenmeiduo, the vice president of the Tibetan Jiu Chess Association. We utilized video recordings to capture games live, asking players to explain their moves during play and documenting their analyses of the employed strategies. After our trips, we manually entered the recorded games into a Tibetan Jiu Chess applet (Chen et al., 2023), generating smart game format (SGF) files to facilitate subsequent analysis. Through various channels, we collected nearly 200 game records from multiple sources, including daily games among experts, competitions, practice sessions by ordinary scholars, automatic game platforms developed by research teams, and mobile WeChat mini-programs. After screening, we found that game records from practice sessions and AI self-play had relatively low quality due to incomplete games, missing data, or insufficient skill levels, making them unsuitable for algorithm development. Consequently, we excluded these records. Ultimately, for our parsing experiments, we used high-quality game records from the 2023 National Tibetan Jiu Chess Competition. After processing, we obtained 87 high-quality game records, totaling 5,434 moves. Due to unforeseen circumstances, the 2024 National Tibetan Jiu Chess Competition was not held, so we currently only have data from expert-level players from 2023.
Collecting and analyzing these curated expert-level game records and annotations forms the foundation for algorithmic learning and understanding of game dynamics. This data is instrumental in understanding various common and advanced tactics and strategies, thereby paving the way for the construction of an expert knowledge database. A critical aspect involves devising an effective representation of the game state, encompassing details such as the board configuration, the positions of individual pieces, and which side is to move next. Designing rule-based algorithms is essential for deciphering the intent behind each move. This includes scrutinizing piece placements, recognizing specific movement sequences that foreshadow particular strategies, and ensuring that the game record analysis algorithm can perform complex logical deductions and provide decision support.
Reconstruction of SGF Format Game Records
In the process of collecting Tibetan Jiu Chess records, SGF files are commonly used to store game record data. This game record format consists of numbers, letters, and symbols, making it nonintuitive and challenging for untrained readers to understand directly. Therefore, the first step involves decoding these abstract records in SGF format and converting them into intuitively understandable visual representations, allowing chess players, researchers, and enthusiasts to appreciate and analyze each move effectively. The restored game record data will be stored in a dedicated database, allowing subsequent users to quickly retrieve the corresponding game situation information when posing questions. The game record reconstruction interface is shown in Figure 3.

Figure 3. The Tibetan Jiu Chess game reconstruction interface diagram.
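As a rough illustration of how such records can be decoded, the sketch below extracts placement moves with a regular expression and replays them onto a 14 × 14 grid. It assumes Go-style SGF move properties (`;B[xy]` and `;W[xy]` with letters `a` to `n`); the actual encoding produced by the Jiu Chess applet, in particular for battle-phase moves and captures, may differ, and the pattern, function names, and sample string are hypothetical.

```python
import re

# Assumption: Go-style move properties, column letter first, then row letter.
MOVE_RE = re.compile(r";\s*([BW])\[([a-n])([a-n])\]")


def parse_moves(sgf_text):
    """Return a list of (color, column, row) tuples with zero-based coordinates."""
    moves = []
    for color, col, row in MOVE_RE.findall(sgf_text):
        moves.append((color, ord(col) - ord("a"), ord(row) - ord("a")))
    return moves


def replay_layout(moves, size=14):
    """Place each parsed move on an empty board; '.' marks an empty point."""
    board = [["." for _ in range(size)] for _ in range(size)]
    for color, col, row in moves:
        board[row][col] = color
    return board


sample = "(;SZ[14];B[gg];W[hh];B[gh])"  # hypothetical record fragment
moves = parse_moves(sample)
board = replay_layout(moves)
```

Each row of `board` can then be joined into a string for a quick text rendering, or handed to a graphical front end for the kind of visual reconstruction described above.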
The regular expression
Due to the specialized nature of Tibetan Jiu Chess knowledge and the scarcity of related data, fully adapting a general LLM to this specialized domain through simple fine-tuning is challenging. Therefore, this system relies heavily on an external Tibetan Jiu Chess knowledge base and an AI agent's game analysis tool to achieve accurate responses. This requires the large model to possess strong capabilities in Chinese language understanding, multiturn dialogue, and coherent response generation. Based on these requirements, we decided to integrate the M3E model with GPT-4 and incorporate them into the Tibetan Jiu Chess Q&A system.
When selecting an LLM, we compared GPT-4, which offers the best overall performance, with the Chinese model ChatGLM-2; GPT-3.5 and LLaMA-2 were not considered because they are designed primarily for English. When processing Chinese text, GPT-4 often misinterprets words, provides inaccurate answers, lacks perspective, and fabricates information. For example, when asked to “introduce the Jiuwu in Tibetan Jiu Chess,” GPT-4 provided an incorrect response. In contrast, ChatGLM-2 performed well. In a set of 20 tests, GPT-4 achieved an accuracy of 45%, while ChatGLM-2 reached 65%.
To improve GPT-4's understanding of specialized Chinese vocabulary related to Tibetan Jiu Chess, we introduce the M3E model. M3E integrates multimodal and multigranularity information, excelling in Chinese semantic understanding and providing strong support for vocabulary related to ethnic minority cultures. Combined with GPT-4, M3E enhances its semantic understanding of Chinese texts related to Tibetan Jiu Chess. After integrating M3E, GPT-4's accuracy in the same tests increased to 80%, surpassing both GPT-4 alone and ChatGLM-2.
In analyzing chess notation problems, the ability of large models to engage in multiturn reasoning and process complex contexts is crucial. We tested a regular and an expert chess game. The results showed that GPT-4's reasoning in multiturn dialogues surpassed that of ChatGLM-2, as it could infer the impact of current moves based on previous ones. For example, GPT-4 identified that the white player overlooked the black player's bridge formation, allowing consecutive jumps and captures, while ChatGLM-2 provided no valuable reasoning. Additionally, ChatGLM-2 struggled with information retention in its responses.
Therefore, in this study, we selected GPT-4 and M3E as the core components of the Q&A system. Their integration not only mitigates GPT-4's limitations in comprehending domain-specific vocabulary but also significantly enhances the system's overall performance.
Design and Implementation of the Tibetan Jiu Chess Q&A System
This Tibetan Jiu Chess Q&A system adopts the LangChain framework, leverages an LLM and prompt engineering techniques, and incorporates an AI agent that integrates the RAG tool with a specialized Tibetan Jiu Chess parsing algorithm. Working together, these components provide users with precise and efficient answers to queries related to Tibetan Jiu Chess knowledge.
As shown in Figure 4, upon receiving a user's question, the AI agent formats it using a preset prompt shown in Figure 5, guiding the LLM to accurately interpret the user's intention. The prompt is crucial at this stage, helping to identify the problem category. For knowledge explanations related to Tibetan Jiu Chess, the AI agent invokes the internal RAG tool to retrieve relevant points from the Tibetan Jiu Chess document library. If the issue concerns chess analysis, the AI agent transmits the chess data to the analysis algorithm via an application programming interface (API) for in-depth analysis. After completing the tool call, the AI agent integrates the results into the corresponding prompt preset templates, and the LLM generates a clear, professional answer.

Figure 4. Business process flowchart for the Tibetan Jiu Chess Q&A system.

Figure 5. Prompt template for question type identification diagram.
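The dispatch logic described above (identify the question type, call the matching tool, fill a prompt template, and let the LLM answer) can be sketched in plain Python. This is a simplified stand-in rather than the LangChain-based implementation: the keyword classifier replaces the LLM-driven type identification, and the tool functions, template wording, and all names are hypothetical.

```python
def classify_question(question):
    """Stand-in for LLM-based type identification (assumption: simple keyword rules)."""
    q = question.lower()
    keywords = ("game record", "move", "sgf")
    return "game_analysis" if any(k in q for k in keywords) else "knowledge"


def rag_tool(question):
    """Placeholder for retrieval from the document library."""
    return f"[retrieved passages for: {question}]"


def analysis_tool(question):
    """Placeholder for the API call to the game record parsing algorithm."""
    return f"[parser output for: {question}]"


TOOLS = {"knowledge": rag_tool, "game_analysis": analysis_tool}

TEMPLATES = {
    "knowledge": "Using the reference material:\n{result}\nAnswer the question: {question}",
    "game_analysis": "Using the parsing result:\n{result}\nExplain for the user: {question}",
}


def answer(question, llm):
    """Route the question to a tool, fill the matching template, and call the LLM."""
    kind = classify_question(question)
    tool_result = TOOLS[kind](question)
    prompt = TEMPLATES[kind].format(question=question, result=tool_result)
    return llm(prompt)


# An echo function stands in for the LLM so the routing can be inspected directly.
reply = answer("Explain move 12 in this game record.", llm=lambda p: p)
```

Keeping the tools and templates in dictionaries keyed by question type means a third category could be added without touching `answer` itself.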
The process by which the system uses the RAG tool to generate Q&A responses related to Tibetan Jiu Chess knowledge is shown in Figure 6.

Figure 6. Example diagram of the response generation process in a retrieval-augmented generation (RAG) tool.
Effective knowledge preprocessing of Tibetan Jiu Chess documents is essential prior to retrieval by the RAG tool. To enhance processing efficiency and accuracy, we apply customized segmentation at punctuation marks, such as commas and periods, ensuring that each segment remains an independent and coherent knowledge fragment. This segmentation yields 267 such blocks.
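A punctuation-based segmentation of this kind might look like the sketch below. The merging rule, the `min_chars` threshold, and the function name are assumptions introduced so that very short fragments are not emitted on their own; the paper does not specify how coherence of each segment is enforced.

```python
import re

# Split after Chinese or Western commas, periods, semicolons, and similar marks.
PUNCT_BOUNDARY = r"(?<=[，。；！？,.;!?])"


def segment(text, min_chars=20):
    """Split text at punctuation, merging pieces until each segment has min_chars characters."""
    pieces = [p.strip() for p in re.split(PUNCT_BOUNDARY, text) if p.strip()]
    segments, buffer = [], ""
    for piece in pieces:
        # Pieces are joined with a space; Chinese text would normally be joined without one.
        buffer = piece if not buffer else buffer + " " + piece
        if len(buffer) >= min_chars:
            segments.append(buffer)
            buffer = ""
    if buffer:
        # Attach a short trailing fragment to the previous segment if there is one.
        if segments:
            segments[-1] += " " + buffer
        else:
            segments.append(buffer)
    return segments


text = "The board has 14 lines, giving 196 intersection points. Pieces resemble Go stones."
fine = segment(text, min_chars=20)
coarse = segment(text, min_chars=40)
```

Raising `min_chars` trades retrieval granularity for more self-contained segments; splitting on zero-width boundaries with `re.split` requires Python 3.7 or later.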
We then employ the pretrained M3E model to vectorize the segmented text. For Tibetan Jiu Chess knowledge texts, we set the sliding window size to 512 tokens and apply a dynamic gradient clipping strategy with a threshold of 2.0 to optimize the encoding of long sequences. This step is essential for accurately capturing the subtle nuances within the knowledge. This vectorization significantly improves the quality and adaptability of knowledge embeddings, establishing a solid foundation for retrieval and application. The vectorized data is integrated into the Chroma vector database, ensuring rapid and accurate access to Tibetan Jiu Chess knowledge through its efficient indexing and retrieval mechanisms.
When an AI agent invokes the RAG tool, the retriever component first encodes the input data using a text feature vector generation algorithm, producing the query vector
During the design phase of the algorithm for parsing Tibetan Jiu Chess records, the scarcity of available game notation materials and the uniqueness of the game's rules limited our ability to utilize deep neural networks or other data-driven algorithms. Consequently, we developed a Tibetan Jiu Chess expert knowledge base to store knowledge of various shapes and moves, totaling 394 entries, and created a rule-based algorithm for parsing chess records. The game record analysis tool within the AI agent calls this rule-based algorithm via an API to answer questions related to Tibetan Jiu game record analysis.
Based on the unique rules of Tibetan Jiu Chess, establishing a comprehensive expert knowledge base is essential before analyzing the game record. This knowledge base is represented as a two-dimensional array

Figure 7. Diagram illustrating the triangular shape and the Dalian formations.
Triangle formation: A highly threatening formation in Tibetan Jiu Chess, where moving any piece to the gap allows for the formation of a square, enabling the arbitrary removal of one of the opponent's pieces. Within a
Dalian formation: In Tibetan Jiu Chess, Dalian formations are categorized into single Dalian and double Dalian. The representation of Dalian formations in the expert knowledge base is more intricate. For instance,
In conjunction with the completed Tibetan Jiu Chess expert knowledge base, a game record parsing rule algorithm has been developed. The algorithm's workflow is illustrated in Figure 8.

Figure 8. Diagram of the algorithm flow for parsing chess records according to rules.
Based on the steps relevant to the situation that needs parsing, we retrieve the chess status from the dictionary to obtain the two-dimensional arrays corresponding to the starting position, ending position of the moving piece, and the position of the opponent's captured piece. We compare the state of each two-dimensional array with the formation knowledge in the Tibetan Jiu Chess expert knowledge base, matching it with the move intentions defined in the rule algorithm, and storing the results for subsequent queries.
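The comparison of board states against formation knowledge can be pictured as template matching over small two-dimensional arrays. The sketch below is illustrative only: the knowledge base contents, the encoding (1 for the player's own piece, 0 for an empty point), and the function names are assumptions, and the real knowledge base with its 394 entries is not reproduced here.

```python
# Hypothetical miniature knowledge base: formation name -> list of 2D templates.
KNOWLEDGE_BASE = {
    "triangle": [
        [[1, 1], [1, 0]],
        [[1, 1], [0, 1]],
        [[1, 0], [1, 1]],
        [[0, 1], [1, 1]],
    ],
    "square": [[[1, 1], [1, 1]]],
}


def matches_at(board, pattern, top, left):
    """True if every cell of the pattern equals the board cell at the given offset."""
    return all(
        board[top + i][left + j] == cell
        for i, row in enumerate(pattern)
        for j, cell in enumerate(row)
    )


def find_formations(board):
    """Return (name, top, left) for every template that matches somewhere on the board."""
    found = []
    n = len(board)
    for name, patterns in KNOWLEDGE_BASE.items():
        for pattern in patterns:
            h, w = len(pattern), len(pattern[0])
            for top in range(n - h + 1):
                for left in range(n - w + 1):
                    if matches_at(board, pattern, top, left):
                        found.append((name, top, left))
    return found


board = [[0] * 14 for _ in range(14)]
for r, c in [(3, 3), (3, 4), (4, 3)]:
    board[r][c] = 1
before = find_formations(board)

board[4][4] = 1  # filling the gap turns the triangle into a square
after = find_formations(board)
```

Comparing `before` and `after` for the board states on either side of a move is one simple way to label the move's intention, for example "completes a square from a triangle".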
At the end of the game, the SGF parsing algorithm retrieves the game result information recorded in the SGF file, including the winner, resignation, and forfeiture. In actual competition, if one player has fewer formations and cannot destroy all of the opponent's formations, they will be judged as the loser. The parsing algorithm scans each point on the board individually and propagates nonzero points to construct a two-dimensional array representing the largest individual formation. Using this array, the algorithm searches the expert knowledge base for all formations related to the current base point and counts and stores the matched formations. Based on the matching results, the algorithm provides an analysis of the win–loss situation to the user.
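One plausible reading of the "propagation" step is a flood fill that gathers the connected same-color points around a base point and crops them into a small two-dimensional array for matching. The sketch below follows that reading; the use of orthogonal connectivity, the zero-for-empty encoding, and the function names are assumptions rather than details confirmed by the paper.

```python
from collections import deque


def connected_region(board, start):
    """Collect orthogonally connected points sharing the color found at start."""
    r0, c0 = start
    color = board[r0][c0]
    if color == 0:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < len(board) and 0 <= nc < len(board[0])
            if inside and (nr, nc) not in seen and board[nr][nc] == color:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen


def crop(board, region):
    """Return the bounding-box array of a region, with points outside the region set to 0."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    top, bottom, left, right = min(rows), max(rows), min(cols), max(cols)
    return [
        [board[r][c] if (r, c) in region else 0 for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


board = [[0] * 6 for _ in range(6)]
for r, c in [(1, 1), (1, 2), (2, 1), (4, 4)]:
    board[r][c] = 1
board[2][2] = 2  # an opposing piece that must not join the region

region = connected_region(board, (1, 1))
formation = crop(board, region)
```

The cropped array can then be compared against knowledge base templates, and the number of matches per side counted to support the win or loss analysis.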
The process of invoking the game analysis tool by the AI agent is shown in Figure 9. The parsing algorithm's results are combined with a preset Tibetan Jiu Chess notation analysis prompt template, generating standardized output via LLM technology.

Figure 9. Diagram for calling the game analysis tool.
The performance evaluation of a Q&A system is crucial to understanding its viability and practicality. The test environment and software versions primarily used in this study are: Windows 11 operating system, Python 3.8, Langchain framework, GPT-4 interface, Chroma database, and so forth.
Determining Evaluation Metrics for Experiments
To validate the performance of the Tibetan Jiu Chess Q&A system developed in this study, we reference existing Chess-related Q&A systems for guidance. For example, in the Chess Comment Generation Q&A system (Kim et al., 2024), which addresses the lack of explainability in existing Chess analysis system responses, evaluation metrics include accuracy, relevance, fluency, and comment length. In ChessGPT (Feng et al., 2023), the primary task is to build a GPT model integrating strategic learning and language modeling in Chess. The evaluation metrics for this system include the accuracy of Chess modeling, value judgment accuracy, and the match rate and accuracy of generated moves compared to standard answers in terms of strategic ability.
Because no other Tibetan Jiu Chess Q&A systems exist for comparison, we draw on the evaluation strategies of Chess Q&A systems while accounting for the characteristics of Tibetan Jiu Chess: a relatively small and balanced sample, and a strong reliance on expert knowledge, which makes answer accuracy the central concern. Therefore, we select accuracy as the primary evaluation metric, as shown in the formula below.
$$\text{Accuracy} = \frac{x}{y} \times 100\%$$
In the formula, x represents the number of questions answered correctly and y represents the total number of questions asked. By reviewing relevant literature and conducting fieldwork in regions where Tibetan Jiu Chess is popular, we collected 87 questions pertaining to the domain knowledge of Tibetan Jiu Chess. By analyzing games and consulting experts, we compiled a total of 37 game records for Tibetan Jiu Chess. Carefully collecting and organizing the moves in these game records yielded 2,153 move entries. After manual verification and categorization, this material was prepared as two sets of questions, one focusing on knowledge explanation and the other on game record analysis, totaling 1,050 question pairs. Of these, 50 questions relate to Tibetan Jiu Chess knowledge and 1,000 concern game record analysis.
We invited Mr. Awang Bianba, Vice Chairman of the Tibetan Chess Association of the Tibet Autonomous Region and Chief Referee of the Tibetan Jiu Chess event in the national Tibetan chess competition, to conduct a professional evaluation of the accuracy of the Q&A system's game record analysis responses. The evaluation focused on three areas: first, the evaluation of formations; second, the prediction of move intentions, such as whether a move is meant to disrupt the opponent's next formation; and third, the assessment of which side, Black or White, holds an advantage.
The results of the experimental tests are presented in Table 1. Before the input content was enhanced and the output standardized with prompt templates, the accuracy rate was 84% for knowledge-based questions and 87.3% for game analysis questions. After targeted prompt templates were introduced, accuracy improved for both question types: knowledge-based questions rose to 96%, a gain of 12 percentage points, and game analysis questions rose to 91.7%, a gain of 4.4 percentage points. Because move intentions in Tibetan Jiu Chess are relatively simpler and clearer than in Chess, and the language used to describe them is more fixed and unified, the accuracy of our system's responses surpasses the 63% accuracy rate of the Chess comment generation Q&A system. In ChessGPT, results vary by capability, with most accuracy rates exceeding 90%.
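The arithmetic behind these figures can be checked with a short sketch. The accuracy metric is the share of correctly answered questions. The correct-answer counts below are not reported in the text; they are back-computed here from the stated accuracy rates and question set sizes (50 knowledge questions, 1,000 game analysis questions) and should be read as an assumption. The names `accuracy`, `results`, and `summary` are illustrative.

```python
def accuracy(correct: int, total: int) -> float:
    """Accuracy as a percentage: correctly answered / total questions."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Counts inferred from the reported rates and set sizes (assumption).
results = {
    "knowledge": {"total": 50, "without": 42, "with": 48},
    "game_analysis": {"total": 1000, "without": 873, "with": 917},
}

summary = {}
for name, r in results.items():
    before = accuracy(r["without"], r["total"])
    after = accuracy(r["with"], r["total"])
    # The difference of two percentages is measured in percentage points.
    summary[name] = (before, after, after - before)
    print(f"{name}: {before:.1f}% -> {after:.1f}% "
          f"(+{after - before:.1f} percentage points)")
```

Expressing the gains in percentage points avoids confusion with relative improvement; for knowledge-based questions, for instance, the relative gain over the 84% baseline would be about 14%.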
Comparison Table of Correct Answer Rates for Various Questions With and Without Prompt Templates in the System.
A comparison of the data in Table 1 shows that prompt templates significantly improved the accuracy rates for both types of Tibetan Jiu Chess questions in the Q&A system of this study. The improvement is particularly pronounced for knowledge-based questions. Prompts help the model avoid misinterpreting ambiguous queries. This prevents the retrieval of incorrect information from prior knowledge and ensures that all relevant data is accurately integrated into the final response, enhancing the precision and completeness of answers. The prompt-assisted response to a knowledge-based question about Tibetan Jiu Chess, shown in No. 1 of Table 2, illustrates the improved quality and reliability of the answers provided.
Tibetan Jiu Chess Q&A System Example Dialogue.
In chess game analysis, prompts can precisely direct the model's interpretation, preventing erroneous explanations due to vague concepts and ensuring answer reliability. In more complex chess analyses, prompts serve as information providers. They provide critical details often omitted in problem statements, such as “jump capture,” “square capture,” and other essential information for understanding the chess position. By providing these key details, prompts help the model build a more comprehensive understanding, leading to detailed and accurate answers.
The content in No. 2 of Table 2 illustrates how the Q&A system, tailored to specific requirements, effectively retrieved detailed information about a designated chess game, enabling in-depth analysis of a particular move in subsequent steps. The prompt-assisted responses for Tibetan Jiu Chess game analysis in Nos. 3 to 6 of Table 2 show how prompts contribute to comprehensive and accurate explanations.
An AI agent utilizes specific prompt templates to invoke tools for precise knowledge acquisition. With clear instructions and guiding principles, these templates effectively orchestrate the algorithm's behavior, enabling accurate identification of complex patterns in chess. The analysis results are output in a standardized format, demonstrating the deep integration of algorithms, theoretical knowledge, and large models in chess game analysis.
Compared to Tibetan Jiu Chess knowledge-based questions, the accuracy of game record analysis questions depends more on the analytical capabilities of the Tibetan Jiu Chess game record parsing algorithm within the AI agent. Currently, this system primarily caters to beginners and casual chess players. For questions posed by these users, this Q&A system can accurately analyze game records and provide relevant answers.
Although this Q&A system demonstrates high accuracy in handling routine questions, the expert-knowledge-based algorithm faces challenges with complex chess positions or deep strategies, especially in predicting opponents’ intentions. In games involving long-term planning or psychological tactics, the current model often fails to capture crucial changes, leading to deviations in analysis. For example, when moves are made during the early stages of constructing a formation that only fully materializes after several steps, the system struggles to recognize this underlying intention.
To address this, we will expand high-quality datasets in four key areas to enhance the system's robustness. First, we will engage with Tibetan Jiu Chess masters and national competitions to collect expert game records. Second, we will continuously improve the AI's skill level in Tibetan Jiu Chess, enabling it to contribute to data collection through self-play. Third, we will broaden the reach of the Jiu Chess WeChat mini-program to attract high-level players, facilitating data collection. Fourth, we will expedite the development and deployment of a real-time offline intelligent collection system for Jiu Chess. By gathering large-scale high-quality game data, we will support deep learning model training. Introducing deep learning for large-scale sample learning, beyond rule-based expert knowledge, will help the system better understand players’ underlying strategies by learning complex patterns from numerous games.
Conclusions and Future Work
This paper introduces the Tibetan Jiu Chess Q&A system, designed to address the knowledge shortage in this field and the challenges associated with chess notation analysis. The system integrates AI agents, RAG, and chess notation analysis algorithms based on Tibetan Jiu Chess expert knowledge, and builds a comprehensive Tibetan Jiu Chess document knowledge base to realize the Q&A system's functionality. By designing targeted prompts, the system accurately acquires and analyzes Tibetan Jiu Chess knowledge. Additionally, the guiding role of templates ensures that the algorithm accurately identifies complex patterns in chess games and outputs results in a unified format, significantly improving the accuracy of the Q&A system's answers.
The Q&A system excels in standard game record analysis. It exhibits high accuracy and reliability in Tibetan Jiu Chess analysis tasks, effectively meeting most user needs. The system has significantly advanced the preservation and study of Tibetan chess culture, highlighting AI's potential in cultural heritage protection. This research not only offers a model for the digital protection of intangible cultural heritage but also predicts that AI will play an increasingly important role in cultural transmission, ushering in a new era of integration between technology and traditional board games.
In the future, we will expand our collection channels for high-quality chess games, covering a wider range of game types and scenarios. This will enlarge our chess game database, supporting the training and optimization of deep learning models. In turn, this will enhance the system's adaptability to complex scenarios and strengthen its robustness. We will also incorporate a Tibetan–Chinese bilingual mode into the system to enhance performance and user experience, thereby providing more comprehensive services and support to Tibetan Jiu Chess enthusiasts.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (grant no. 62276285 and 6223601).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
