Abstract
A nonogram is a pen and paper single-player logic game in which players paint each cell of a two-dimensional grid according to clues for specific rows and columns. This paper improves on the method proposed by Wu et al. by adding a freedom parameter which can significantly reduce the total computation cost of maximal painting. Combining a well-designed bitboard with BMI instructions in the CPU, we reduce memory loading while accelerating operations. Our Nonogram program, named Requiem, solved all 1000 puzzles in every Nonogram Tournament from 2011 to 2018 faster than all previous medal programs. Furthermore, we beat four other teams to win the TAAI 2017 Tournament, six other teams to win the ICGA 2018 Tournament and three other teams to win the TAAI 2018. To our knowledge, Requiem offers the best puzzle solving performance at the present time.
Introduction
Nonograms, also called Paint by Numbers puzzles, were invented by a Japanese graphics editor named Non Ishida in 1987. Players are presented with a blank two-dimensional grid, and color in certain squares based on a series of clues to reveal a resulting image. Figure 1 shows a
Players typically go through the grid and address the most obvious clues first. For example, in Fig. 2, the clue is “1 2”. Initially, the solution for all five cells is unknown, and we thus label the line “uuuuu”. All the 3 possible patterns for this line are shown in Fig. 2. We can see that the fourth cell must be black, while the four remaining cells are uncertain, and thus the line is denoted as “uuu1u”. Most Nonogram puzzles can be solved through repeated iterations of this process for each row and column, a process called maximal painting (Wu et al., 2013).
Solving Nonograms has been shown to be NP-complete (Ueda and Nagao, 1996). At present, only puzzles with moderate size can be solved in a reasonable amount of time. The ratio of black cells in the solution will affect the difficulty of solving it. Having more black cells in the solution means that more cells can be determined in advance using the maximal painting operations. For example, the puzzle in Fig. 1 can easily be completely solved by repeatedly using the maximal painting operations on each row and column. If the number of black cells in the solution is smaller, we can randomly guess some cells as being black, and then apply the maximal painting operations on each row and column to determine many unknown cells. Hence a “difficult” Nonogram puzzle needs an appropriate ratio of black cells in the solution (Batenburg et al., 2009). In each Nonogram tournament from

A nonogram example, (a) blank puzzle, (b) corresponding solution.

Maximal painting.
In the rest of this paper, Section 2 presents our ideas for improving the efficiency of line solving. The proposed technique involves the “freedom” number of each line which is the difference between the upper bound of the number of white pixels and the number of current white pixels of the line. The new algorithm can quickly stop the max-painting when the freedom number equals zero. Section 3 compares the advantages and disadvantages of three so-called Fully Probing (FP) methods to solve more pixels before backtracking. Among FP1, FP2, and FP3, FP1 is the most efficient way since it can solve almost the same number of pixels while only using 80% execution time on average. From this concrete comparison, we use only FP1 to let our program Requiem solve more puzzles in much less time. Section 4 describes our policy for selecting unknown cells and its speedup in backtracking. Employing “⩾” in the condition for selecting unknown cells can improve the execution time for 30% of the Nonogram problems. In Section 5, a high-performance bitboard data structure for implementing parallel instructions is proposed to store lines and clues. Experiments are presented in Section 6. We conduct complete experiments and show that the new techniques can dramatically reduce the computation cost of previous programs. Concluding remarks are given in Section 7.
Many researchers have proposed algorithms to solve for Nonograms, with various methods for treating the maximal painting of a line (Wolter). Yen et al. (2010) generated all available solutions for a line based on available clues and generated solutions as in Fig. 2. Wu et al. (2013) applied dynamic programming techniques (Cormen et al., 2009) to derive a method for maximal painting with time complexity
As with previous researchers, regular expressions are utilized to represent the various patterns that the line needs to match. We use 1 (or 0) to indicate that a cell is painted black (or white), and use u to indicate that a cell’s state is uncertain. Each clue
Let
Note that ⊕ is a symmetric operation (i.e.,
The following formulas (Wu et al., 2013; Chen and Hung, 2017) are developed and a dynamic programming approach proposed in Batenburg et al. (2009) can be applied to derive the maximal painting with a time complexity of
Here ϵ means null or conflicted, and
We found that the maximal painting in Wu et al. (2013) and Chen and Hung (2017) involves many impossible situations. For the example of Fig. 2,
From this example, we see that the clue
If we let
Note that d denotes the length of each black segment in D. The summation of
Bacchus and van Run (1995) defined a description d as “l-consistent” if
This is a similar definition as our freedom formula. But they used this to only show the difficulty of solving a Nonogram. In Lemma 4.1 and Lemma 4.2 of Bacchus and van Run (1995), they categorized two kinds of Nonograms: Simple type if
The previous maximal painting P and
Here c is a single character that means conflicted, and
Let
The previous function
For the example in Fig. 2,
For this example, we have a linear space with 5 slots and we need to put 2 black segments with 1 and 2 slots respectively, and we need at least one slot to separate them. So there are already 4 slots needed and we have an extra free slot (
We see that
In Table 1, we randomly generated 10000 puzzles and recorded the running time of maximal painting. The equipment we used is an Intel® Core™ i9-7900X 3.30 GHz with 64 GB RAM. Most of the experiments in this paper were performed on the same device except Table 5. After using the freedom parameter, we obtained more than 4 times acceleration. The implementation of the solver without using the freedom parameter in Table 1 is based on the more efficient method in Appendix of Wu et al. (2013) which has a time complexity of
Speed comparison with and without freedom
Speed comparison with and without freedom

After maximal painting, it may still have many unknown pixels on the board. The Davis-Putnam-Logemann-Loveland (DPLL) algorithm is a backtracking algorithm for examining all possibilities of unknown pixels. However, the time complexity for the DPLL algorithm is
Recently, Huang et al. (2018) sought to reduce some instructions in FP1, but this replicated the work of Wu et al. (Chen et al., 2015) in their program LalaFrogKK. Chen and Huang (2018) investigated the Group-Base Fully Probing (GP) to reveal the relations between unknown pixels. Instead of guessing the colors of unknown pixels one by one, GP can determine more pixels simultaneously during backtracking. Unfortunately, this causes their (GP) algorithm to incur excessive overhead, and thus it was not used in tournaments.
Comparison of FP1, FP2, and FP3
Comparison of FP1, FP2, and FP3

FP1 scheme.
In many puzzles, the FP1 method cannot correctly determine all pixels in the grid, and good performance still requires backtracking to paint the undetermined cells by carefully choosing the next pixel to paint. Seven choose-pixel heuristics were tried and compared in Wu et al. (2013). Among these, the Min-logd heuristic performed the fastest in all testing cases (Wu et al., 2013).
Here we replace it with new Min-logd which has one less log calculation but with similar effect.
A performance comparison between the original Min-logd and the new Min-logd heuristics is done by executing 100,000,000 times for each computation. The original Min-logd takes 4517 milliseconds while our new one takes only 2168 milliseconds.
When we apply the modified Min-logd to choose the next pixel to paint in the backtracking method, we use a loop to check each undetermined cell and choose the one with the maximal Min-logd value. The process usually compares the value with the “>” operation, but “⩾” is also available. The difference between “>” and “⩾” is that the former will choose the first pixel with the largest Min-logd value, and the latter will choose the last pixel with the largest Min-logd value in case multiple pixels have the same largest value. As shown in Tables 3 and 4, “⩾” occasionally has an advantage for total computation time for the 1000 puzzles of TAAI 2016. All of them used the same machine as Table 1. In other words, employing “⩾” in the condition for selecting unknown cells can improve the execution time for 30% Nonogram problems. Hence, we use “⩾” in our program although we don’t know the exact reason. The similar situation has also occurred in other studies. Huang et al. (2018) tried four different orders in Fully Probing and got different effects. We can regard “⩾” as “>” in a different order.
The speed comparison of using “⩾” and “>” for the 1000 puzzles in TAAI 2016
The speed comparison of using “⩾” and “>” for the 1000 puzzles in TAAI 2016
The run time comparison of using “⩾” and “>” for the 1000 puzzles in TAAI 2016
In this section, we present some high performance data structures for dealing with the
First, each cell in each line of the board has three possible states: black (01)2, white (10)2, and unknown (11)2 and this state needs to be represented by at least 2 bits. An intuitive approach is to use two consecutive bits to represent a cell and a 64-bit long integer can sufficiently represent a line of 25 cells, as shown in Fig. 5(a). To retrieve the state of the nth cell, we need a calculation like “
Furthermore, we use the remaining pattern (00)2 to express c, the conflicted result for the H and

Two representations of a line.
To represent the clue of a line, we use a 32-bit unsigned integer. Each number in the clue is denoted by consecutive 0s followed by a 1. Figure 6 illustrates an example of the clue
For the example in Fig. 6,

Data structure for the clue
This means the number of free slots for putting extra 0s is 15 in a line with 26 cells. Note that we have added an extra 0 at the left of the original line with 25 cells. In the calculation,
We collected all the Nonogram competition records of ICGA, TAAI, and TCGA from 2011 to 2018 and ran our program to solve all 1000 puzzles for each competition. Not all computers used in the competition are identical, most of them had
Furthermore, our program Requiem won the TAAI 2017 Tournament, the ICGA 2018 Tournament and TAAI 2018 Tournament (see Tables 6, 7 and 8). ICGA 2018 featured 7 participants, most of which successfully applied the method proposed by Wu et al. to solve all the 1000 puzzles. However, Requiem was the fastest.
Comparison of Requiem and previous gold medal programs (Wu, 2011)
Comparison of Requiem and previous gold medal programs (Wu, 2011)
Competition results of TAAI 2017
Competition results of ICGA 2018
Competition results of TAAI 2018
This paper investigates a new free parameter f in the dynamic programming approach to finish maximal painting earlier without significant overhead expense. Compared with previous approaches, maximal painting with the free parameter achieves more than 4 times speedup. Combined with high performance data structures, this approach substantially reduces memory usage for puzzle clues.
However, Fully Probing (FP) 1 still maintains lots of data. Fully Probing 1 embedded with the concept of the free parameter is only
Footnotes
Acknowledgement
This research was supported in part by a grant MOST 106-2221-E-003-027-MY2 from the Ministry of Science and Technology, R.O.C.
