Resizing cardinality constraints for MaxSAT

Abstract

In this paper we describe several variations of the incremental MSU3 and MSU4 algorithms for the MaxSAT problem, and show that some of these improve performance. Among the variations considered are new cardinality constraint encodings which enable incrementally updating the constraint, and have smaller worst-case size than those encodings previously considered. The new cardinality encodings are based on the well-known sorting networks. The incremental approach is also extended, in a novel way, inspired by the idea behind resizing arrays. Best performance was achieved when the totalizer encoding was used in conjunction with sorting networks; unlike other implementations of such combinations in the literature we found that to get best performance, sorting networks should be used very sparingly. We submitted a solver using a version of the methods described in this paper to the 2017 MaxSAT evaluation where it placed fourth out of 8 solvers participating in the unweighted category.

Keywords

MaxSAT cardinality encoding

1. Introduction

MaxSAT solvers have an impressive list of applications ranging from software package upgrades (Argelich et al. [6]) to hardware design (Chen et al. [11]). MaxSAT is the problem of finding an assignment that maximizes the number of satisfied clauses of a boolean formula in Conjunctive Normal Form (CNF). There are several approaches to solving MaxSAT, for instance, the solver MaxSatz (Li et al. [19]) extends the inference rules of SAT to MaxSAT, and the solver MaxHS (Davies and Bacchus [13]) combines Mixed Integer Programming and SAT solving. In this paper we are primarily concerned with MaxSAT algorithms based on satisfiability testing (for an introduction to MaxSAT techniques see for instance Li and Manyà [18]).

MaxSAT algorithms based on satisfiability testing, such as MSU3 (Marques-Silva and Planes [21]) and MSU4 (Marques-Silva and Planes [20]), call a SAT solver, inspect the result to refine upper and lower bounds, and then repeat the call to the solver with a modified formula. The formula is modified by adding constraints that bound the number of unsatisfied clauses, all encoded as a CNF formula.

Several SAT solvers facilitate such a use, and are therefore called incremental. An incremental SAT solver allows certain modifications to the formula before it can be rechecked for satisfiability, reusing as much as possible of the runs with the previous formulas. There are several such solvers, and the SAT-Race 2015 competition featured an incremental SAT solver track where 5 solvers entered.

Martins et al. [23] showed that leveraging the incremental aspect of SAT solvers is crucial for the MSU3 algorithm. To do so they devised a way of reusing a cardinality encoding, called totalizer (Bailleux and Boufkhad [9]), between calls to the solver. In this paper we extend this work to the cardinality network encoding (Asín et al. [7]), the mixed encoding of Abío et al. [3], and to the MSU4 algorithm. Our experiments show that the simple combination of the mixed encoding and the incremental MSU4 algorithm gives a competitive MaxSAT solver. However, unlike in (Martins [22]) and (Abío et al. [3]), best performance is achieved when sorting networks are used sparingly in the mixed encoding. This might be due to the size of the cardinality constraint being less important when the latter is reused.

There are three main varieties of MaxSAT: unweighted, partial, and weighted. All varieties can be solved using satisfiability testing with cardinality constraints (see for instance Ansótegui et al. [5]). However, for simplicity, we only discuss the unweighted case. For our experiments we use benchmarks that are partial MaxSAT from the 2016 MaxSAT evaluation (2016 MaxSAT evaluation [1]), which includes a large collection of industrial partial MaxSAT benchmarks.

In Section 2 we introduce notation. In Section 3 we review the incremental MSU3 algorithm and present a straightforward extension to the MSU4 algorithm for incremental SAT solvers. In Section 4 we review the method used by Martins et al. [23] of reusing the totalizer encoding between calls to the solver and extend it to cardinality networks and the mixed encoding. In Section 5 we present experimental results, and conclude in Section 6.

2. Preliminaries

A literal l is either a variable x or its negation $\overline{x}$, sometimes denoted $\neg x$. The rule of double negation applies when a literal is negated: $\overline{\overline{x}} = x$. A clause c is a disjunction of literals, $l_{1} \lor l_{2} \lor \dots \lor l_{n}$, and a CNF formula F is a conjunction of clauses: $c_{1} \land c_{2} \land \dots \land c_{m}$. A (partial) assignment ϕ is a (partial) function from the set of variables to $\{0, 1\}$. The evaluation of an assignment extends to a formula by the usual laws of boolean operators. If there is an assignment ϕ such that $ϕ (F) = 1$ we say that F is satisfiable, and unsatisfiable otherwise. The SAT problem is to find a satisfying assignment to a CNF formula, or to determine that it is unsatisfiable. The MaxSAT problem is to find an assignment that satisfies as many clauses of a formula as possible. In a partial MaxSAT problem, some clauses are hard and must be satisfied. We say that clause c is a consequence of F, denoted $F ⊢ c$, if every assignment satisfying F also satisfies c. We will assume that there is a constant true literal ⊤, for which $\neg ⊤$ is denoted ⊥. This is only for convenience of notation as any clause containing ⊤ can be erased; ⊥ can be removed from any clause it occurs in.

Unit propagation. Most modern SAT solvers make use of the unit propagation rule to decide satisfiability of a CNF formula. In this paper we will use unit propagation as a preprocessing step. The procedure is centered around unit clauses, clauses containing only one literal. A partial assignment ϕ to the variables in the formula F is built by applying the following steps (in any order) until a fixpoint is reached:

If F contains a clause c containing exactly one literal l and l is unassigned then the assignment is extended so that $ϕ (l) = 1$ . We say that l is propagated.

If a clause contains a literal l with $ϕ (l) = 0$ then that literal is removed from the clause.

If a clause contains a literal l with $ϕ (l) = 1$ then that clause is removed from the formula.

Assume that the formula $F^{up}$ is the result of preprocessing F with unit propagation, and that unit propagation produces the partial assignment ϕ. If $F^{up}$ contains the empty clause then F is unsatisfiable. Otherwise, any satisfying assignment for $F^{up}$ can be extended with ϕ to produce a satisfying assignment for F. Conversely, any satisfying assignment for F extends ϕ.
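The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the preprocessing used in our solver: the function name `unit_propagate` and the representation of literals as non-zero integers (a negative number denoting a negated variable) are assumptions made for the example.

```python
def unit_propagate(clauses):
    """Preprocess a CNF formula with unit propagation.

    Clauses are lists of non-zero integers (assumed representation).
    Returns (reduced_clauses, assignment); an empty clause in the
    result means the input formula is unsatisfiable.
    """
    clauses = [set(c) for c in clauses]
    assignment = {}  # variable -> 0 or 1

    def value(lit):
        v = assignment.get(abs(lit))
        if v is None:
            return None
        return v if lit > 0 else 1 - v

    changed = True
    while changed:
        changed = False
        # Step 1: propagate the literal of every unit clause.
        for c in clauses:
            if len(c) == 1:
                (lit,) = c
                if abs(lit) not in assignment:
                    assignment[abs(lit)] = 1 if lit > 0 else 0
                    changed = True
        reduced = []
        for c in clauses:
            # Step 3: drop clauses containing a true literal.
            if any(value(l) == 1 for l in c):
                changed = True
                continue
            # Step 2: drop false literals from the remaining clauses.
            kept = {l for l in c if value(l) != 0}
            if len(kept) != len(c):
                changed = True
            reduced.append(kept)
        clauses = reduced
    return [sorted(c) for c in clauses], assignment
```

For instance, on the formula $x_{1} \land (\neg x_{1} \lor x_{2}) \land (\neg x_{2} \lor x_{3} \lor x_{4})$ the sketch propagates $x_{1}$ and $x_{2}$ and leaves the single clause $x_{3} \lor x_{4}$.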

Cardinality constraints. In addition to clauses, we will need to make use of cardinality constraints. Cardinality constraints extend the expressiveness of a clause slightly. A clause $l_{1} \lor l_{2} \lor \dots \lor l_{n}$ requires that at least one literal $l_{i}$ must be assigned 1, but a cardinality constraint $l_{1} + l_{2} + \dots + l_{n} ⩾ k$ requires that at least k literals are assigned 1. One can also easily require that at most k literals are assigned 1 with the cardinality constraint $\neg l_{1} + \neg l_{2} + \dots + \neg l_{n} ⩾ n - k$ , and the combination of the two above constraints requires that exactly k literals are assigned 1.

For the most part, in this paper, we will need cardinality constraints that limit the number of true literals to at most k. For convenience, we denote such a cardinality constraint as $l_{1} + l_{2} + \dots + l_{n} ⩽ k$.

A SAT solver only handles CNF formulas so the cardinality constraint will have to be realised by a CNF formula. We will use the general term constraint to mean either a clause or a cardinality constraint.

Indicator variables. We will need to relax a constraint, and we do this with an indicator variable. Given a constraint ψ and a CNF formula $F_{ψ}$ we say $i_{ψ}$ is an indicator variable for ψ if $F_{ψ}$ is satisfied by all assignments ϕ with $ϕ (i_{ψ}) = 1$. Otherwise, when $ϕ (i_{ψ}) = 0$, $F_{ψ}$ should be satisfied iff ψ is satisfied. We say that $F_{ψ}$ is an indicator-encoding of ψ. For a clause c we have the indicator-encoding $i_{c} \lor c$. We will explore several indicator-encodings of cardinality constraints in Section 4.

Incremental SAT solvers. We will exploit that many SAT solvers can function incrementally, that is, solve a similar formula reusing as much as possible of the run with the original formula. In addition to being incremental, we also require that the SAT solver provides an unsatisfiable core if the formula is unsatisfiable, that is, a (preferably small) subset of the clauses which is unsatisfiable.

We assume that an incremental SAT solver provides the following functionality:

add-clause( c )

The SAT solver maintains a CNF formula, and we may at any time add a clause c to it.

add-assumption( l )

Assume literal l to be assigned true, that is, when considering satisfiability, only assignments where $ϕ (l) = 1$ are allowed.

remove-assumption( l )

If the literal l was assumed, remove the assumption.

check-sat

Check if the current formula is satisfiable with the current assumptions. Provide a core if the formula is found to be unsatisfiable.
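To make the interface concrete, the following sketch mimics the four operations with a brute-force search. It is a stand-in for illustration only: the class name `ToyIncrementalSolver` and the integer literal representation are assumptions, nothing is reused between calls, and instead of a small core the toy simply reports all current assumptions.

```python
from itertools import product


class ToyIncrementalSolver:
    """Brute-force stand-in for an incremental SAT solver (illustration only)."""

    def __init__(self):
        self.clauses = []
        self.assumptions = set()

    def add_clause(self, clause):
        self.clauses.append(list(clause))

    def add_assumption(self, lit):
        self.assumptions.add(lit)

    def remove_assumption(self, lit):
        self.assumptions.discard(lit)

    def check_sat(self):
        """Return (True, model) or (False, core)."""
        variables = sorted({abs(l) for c in self.clauses for l in c}
                           | {abs(l) for l in self.assumptions})
        for bits in product([0, 1], repeat=len(variables)):
            model = dict(zip(variables, bits))

            def holds(lit):
                return model[abs(lit)] == (1 if lit > 0 else 0)

            if (all(holds(l) for l in self.assumptions)
                    and all(any(holds(l) for l in c) for c in self.clauses)):
                return True, model
        # A real solver would try to return a small core; the toy returns
        # every assumption, which is a valid but uninformative core.
        return False, set(self.assumptions)
```

With the clauses $i_{1} \lor x$ and $i_{2} \lor \neg x$ (variables 3, 1 and 4 below), assuming both indicators false is unsatisfiable, while dropping one assumption makes the formula satisfiable again.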

3. Incremental algorithms for MaxSAT

In this section we will review the MSU3 algorithm of Marques-Silva and Planes [21], focusing on the improvement made by Martins et al. [23] in leveraging incremental SAT solvers. This is then extended in a straightforward manner to the MSU4 algorithm. The exact etymology of the MSU acronym is unclear, but Marques-Silva and Planes [21] suggest that it refers to the Fu and Malik [16] MaxSAT algorithm based on unsatisfiable subformulas.

3.1. Linear search algorithm

First we review a simple linear search algorithm that the MSU3 algorithm is based on. A lower bound on the number of unsatisfied clauses is kept. Each clause is given an indicator. By using a cardinality constraint the lower bound is refined by bounding the number of active indicator variables and repeatedly asking the solver to verify increasing lower bounds. In this way, the linear search algorithm incrementally relaxes the formula, initially requiring that all clauses are satisfied and increasing the bound by one for each step, until we reach an optimal solution.

Algorithm 1

Simple linear search algorithm
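The loop described in Section 3.1 can be sketched as follows. This is a hedged illustration rather than the listing of Algorithm 1: the function name `linear_search_maxsat` is hypothetical, satisfiability is decided by brute force instead of a SAT solver, and the cardinality constraint on the indicators is checked directly instead of being encoded in CNF.

```python
from itertools import product


def linear_search_maxsat(clauses):
    """Return (minimum number of unsatisfied clauses, model).

    Tries the bounds k = 0, 1, 2, ... in turn, as in linear search.
    Clauses are lists of non-zero integers (assumed representation).
    """
    variables = sorted({abs(l) for c in clauses for l in c})

    def violated(model, clause):
        return not any(model[abs(l)] == (l > 0) for l in clause)

    for k in range(len(clauses) + 1):
        for bits in product([False, True], repeat=len(variables)):
            model = dict(zip(variables, bits))
            # The cheapest choice of indicators activates exactly the
            # violated clauses, so we count those against the bound k.
            active = sum(violated(model, c) for c in clauses)
            if active <= k:
                return k, model
    raise AssertionError("unreachable: k = len(clauses) always succeeds")
```

On the formula $x_{1} \land \neg x_{1} \land (x_{1} \lor x_{2}) \land \neg x_{2}$ the bound $k = 0$ fails and $k = 1$ succeeds, so one clause must remain unsatisfied.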

3.2. The incremental MSU3 algorithm

A major inefficiency in Algorithm 1 is the size of β in the cardinality constraint. To remedy this shortcoming the incremental MSU3 algorithm initially assumes all indicators to be false; we say that such indicators are inactive. The cardinality constraint only contains active indicators. When the formula is shown to be unsatisfiable, the unsatisfiable core is examined to find indicators that need to be activated. Any indicator of a clause from the input formula that occurs in an unsatisfiable core must be activated.

The algorithm works in two phases: first we seek only to make the formula satisfiable by removing assumptions, and keep track of a lower bound on the number of unsatisfied clauses. Each time the solver answers unsatisfiable, we increase the bound, and activate all indicators in the core. This leads to one (possibly suboptimal) solution.

In phase two, a cardinality constraint is assumed that requires the solution to be no worse than the current lower bound. Iteratively, the formula is checked for satisfiability, the lower bound is refined, and the cardinality constraint updated until the formula is satisfiable. We refer to (Marques-Silva and Planes [21]) for the proof of correctness of this algorithm.

Algorithm 2

MSU3 (Marques-Silva and Planes [21])

In Algorithm 2 the set α contains the original clauses with the added indicator variables. This set is used to check for indicators needing activation in the unsatisfiable core. Unlike in the linear algorithm, β is initially empty and indicators will enter β only once they are shown to be contained in an unsatisfiable core. If unlucky, β quickly contains all indicators, in which case the MSU3 algorithm defaults to linear search.

The procedure requires the following property from the cardinality encoding: it should be possible to efficiently change the constraint from $x_{1} + \dots + x_{n} ⩽ k$ to $x_{1} + \dots + x_{m} ⩽ k + 1$ with $m > n$ . Such an encoding is said to be incremental.

3.3. The incremental MSU4 algorithm

The MSU4 (Marques-Silva and Planes [20]) algorithm is a slight variation on the MSU3 algorithm which has been shown to sometimes increase performance. The essential difference is that in Phase(2) the algorithm will attempt to iteratively lower an upper bound in addition to increasing a lower bound.

Phase(1) coincides exactly with the MSU3 algorithm, but in Phase(2) the cardinality encoding bounds the number of unsatisfied clauses to be at least as good as the current upper bound. If a satisfying assignment is found, it must satisfy more clauses and hence refine the upper bound. If no satisfying assignment can be found, the lower bound can be incremented and if no new inactive indicators are found in the core, the incumbent assignment is optimal. Proof of the correctness of this algorithm can be found in (Marques-Silva and Planes [20]).

Utilising an incremental SAT solver for the MSU4 algorithm is analogous to the MSU3 algorithm, see Algorithm 3.

Algorithm 3

MSU4 (Marques-Silva and Planes [20])

4. Incremental cardinality encodings

In this section we review two cardinality encodings: the totalizer encoding (Bailleux and Boufkhad [9]) and odd-even sorting networks (Batcher [10]). Recall that, as described in Section 3.2, an incremental encoding of a cardinality constraint $x_{1} + \dots + x_{n} ⩽ k$ can be updated by increasing the bound k on the right-hand side, and also by adding more literals to the left-hand side.

The totalizer cardinality encoding has $O (n \log k)$ many variables and $O (n k)$ many clauses, while odd-even sorting networks have $O (n \log^{2} k)$ many clauses and variables. Martins et al. [23] show that the totalizer encoding can be made incremental while reusing large portions of the original formula. However, in their approach many small updates to the encoding may lead to the encoding using more than $O (n k)$ many variables. We propose a method which avoids this pitfall, and keeps the number of variables to $O (n \log k)$ and the number of clauses to $O (n k)$. We also show that sorting networks can be used to provide an incremental encoding with size $O (n \log^{2} k)$ in combination with totalizer when the mixed encoding of Abío et al. [3] is used.

In this section we first introduce delayed variables, the method which we use to create our incremental encodings. Then the totalizer encoding, sorting networks encoding, and mixed encoding are introduced. Thereafter we explain how delayed variables are applied to achieve the aforementioned size bounds. Finally we discuss implementation details of delayed variables.

4.1. Delayed variables

To simplify the description of these incremental encodings we introduce delayed variables, denoted $\tilde{x}$ . Delayed variables are not introduced to the SAT solver until we undelay them. Similarly, a delayed literal is either a delayed variable or its negation, and a delayed clause is any clause containing a delayed literal. Undelaying a variable causes literals with this variable to be undelayed; clauses that no longer contain delayed literals become undelayed as well. A delayed clause will not be given to the SAT solver until it is undelayed.

Example 1.
Let us say we have the formula $(\tilde{x} \lor \neg y) \land (y \lor z) \land (\neg \tilde{x} \lor z)$ . Since x is delayed the SAT solver is only notified of one clause: $y \lor z$ . The SAT solver is called and gives the satisfying assignment $y = 1$ and $z = 0$ .

Now x is undelayed so the SAT solver is notified of two new clauses: $x \lor \neg y$ and $\neg x \lor z$ . Now a call to the SAT solver gives the satisfying assignment $x = 1$ , $y = 1$ and $z = 1$ .

In Example 1, after the variable x is undelayed the previous assignment can not be extended to satisfy the undelayed clauses. This shows that satisfying assignments can change after variables are undelayed. Indeed, undelaying can even lead to unsatisfiability.
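Example 1 can be replayed with a small sketch. The class `DelayedFormula` and the helper `models` are hypothetical names introduced for illustration; clauses are lists of non-zero integers, with x, y, z numbered 1, 2, 3, and satisfiability is checked by enumeration rather than by a SAT solver.

```python
from itertools import product


class DelayedFormula:
    """Holds clauses and hides those that mention a delayed variable."""

    def __init__(self, clauses, delayed):
        self.clauses = [list(c) for c in clauses]
        self.delayed = set(delayed)  # variables not yet given to the solver

    def visible_clauses(self):
        # Only clauses without delayed literals are passed on.
        return [c for c in self.clauses
                if not any(abs(l) in self.delayed for l in c)]

    def undelay(self, var):
        # Undelaying is one-way: a variable is never delayed again.
        self.delayed.discard(var)


def models(clauses):
    """Enumerate all satisfying assignments of a small clause set."""
    variables = sorted({abs(l) for c in clauses for l in c})
    found = []
    for bits in product([0, 1], repeat=len(variables)):
        m = dict(zip(variables, bits))
        if all(any(m[abs(l)] == (1 if l > 0 else 0) for l in c)
               for c in clauses):
            found.append(m)
    return found


# x = 1, y = 2, z = 3; x starts out delayed.
formula = DelayedFormula([[1, -2], [2, 3], [-1, 3]], delayed={1})
before = models(formula.visible_clauses())
formula.undelay(1)
after = models(formula.visible_clauses())
```

Before undelaying, only $y \lor z$ is visible and $y = 1$, $z = 0$ is among the models; afterwards all three clauses are visible and no model has $y = 1$ and $z = 0$.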

To avoid changes to clauses already given to the SAT solver, one cannot delay a variable that has already been undelayed. Therefore variables can be introduced as being delayed, and then only undelayed once, and never redelayed. Since no clause containing a delayed variable has been given to the SAT solver yet, we are able to substitute literals in delayed clauses without having to change the formula. In Section 4.7 we show how delayed variables can be implemented efficiently for the purpose of this paper.

4.2. Incremental sorting networks

A sorting network applies a predetermined series of compare-and-swap operations to sort an input sequence. Several well-studied varieties exist: odd-even sorting networks (oe-sort) (Batcher [10]) are known to have optimal size for inputs with fewer than 9 elements, and have asymptotic size $O (n \log^{2} n)$. Sorting networks of size $O (n \log n)$ are known to exist (Ajtai et al. [4]), but the hidden constant in the big-O notation makes them impractical. Pairwise sorting networks (Parberry [25]) are sorting networks having the same size as oe-sort. Codish and Zazon-Ivry [12] showed that pairwise sorting networks have some desirable properties for SAT solving. Minisat+ (Eén and Sörensson [15]) uses oe-sort, and other encodings, to solve pseudo-Boolean formulas.

Asín et al. [7] note that when sorting networks are used to encode cardinality constraints, one only needs the kth output of the sorting network and they devise a method of only providing the k lowest values of the output in sorted order using no more than $O (n \log^{2} k)$ compare-and-swap operations. We would like an incremental encoding that has the same size, but with oe-sort alone we only achieve $O (n \log n + n \log^{2} k)$ while preserving incrementality. To achieve $O (n \log^{2} k)$ we have to combine oe-sort with the totalizer encoding as is done in Abío et al. [3], and we prove that this gives us the desired bound in Section 4.6.

Codish and Zazon-Ivry [12] provide a method for achieving size $O (n \log^{2} k)$ cardinality networks with pairwise sorting networks by adding additional clauses to the formula and then reducing the formula via propagation and variable elimination. The additional clauses enhance unit propagation, which is desirable for SAT solving where propagation plays a critical role. Here we adapt this approach to oe-sort, and combine it with delayed variables to give an incremental encoding.

4.3. The merge-sort network

Both the totalizer and oe-sort are based on a merge-sort pattern. Unlike oe-sort, the totalizer encoding is not based on compare-and-swap operations, but implements merging in another way. The merge-sort pattern that the two encodings share is a function which we define in the following way:

Definition 1 (Sort).

The function sort takes a list of input variables and returns a pair of a list of output variables and a set of clauses S, such that S enforces that the outputs are the inputs in sorted order (1s before 0s).

The base case is $sort ([i_{1}]) = ([i_{1}], \emptyset)$ and for input longer than 1, we define sort in the following way: $sort ([i_{1} i_{2} \dots i_{n}]) = ([c_{1} c_{2} \dots c_{n}], S_{1} \cup S_{2} \cup S_{merge})$ where $\begin{array}{rcl} ([a_{1} a_{2} \dots a_{n / 2}], S_{1}) & = & sort ([i_{1} i_{2} \dots i_{n / 2}]); \\ ([b_{1} b_{2} \dots b_{n / 2}], S_{2}) & = & sort ([i_{n / 2 + 1} i_{n / 2 + 2} \dots i_{n}]); \\ ([c_{1} c_{2} \dots c_{n}], S_{merge}) & = & merge ([a_{1} a_{2} \dots a_{n / 2}], [b_{1} b_{2} \dots b_{n / 2}]). \end{array}$

As mentioned, we will use this pattern for both totalizer and oe-sort, the only difference will be how merge is defined. We should therefore take care to describe what we require from merge.

Definition 2.
Let M be the network $([o_{1} \dots o_{2 n}], S) = merge ([a_{1} \dots a_{n}], [b_{1} \dots b_{n}])$ , and ϕ an assignment satisfying S that assigns 1 to $a_{1} \dots a_{k_{2}}$ and $b_{1} \dots b_{k_{1}}$ , and 0 to the remaining inputs. M is said to be a unidirectional merge if ϕ assigns 1 to $o_{i}$ if $i ⩽ k_{1} + k_{2}$ . We say that it is a bidirectional merge if ϕ assigns 1 to $o_{i}$ if and only if $i ⩽ k_{1} + k_{2}$ .

Note that, with unidirectional merge the network does not sort in a strict sense. However, it is still a correct encoding of the cardinality constraint: all input 1’s are propagated.

As noted by Asín et al. [7], although bidirectional merge is useful for encoding cardinality constraints of the form $x_{1} + \dots + x_{n} = k$ , unidirectional merge is sufficient for cardinality constraints of the form $x_{1} + \dots + x_{n} ⩽ k$ . We get an indicator encoding of $x_{1} + \dots + x_{n} ⩽ k$ by sorting the left hand side: $([o_{1} \dots o_{n}], S) = sort ([x_{1} \dots x_{n}])$ using unidirectional merge and letting $o_{k + 1}$ be the required indicator.

Note that we need no more than the $k + 1$ first output variables in order to give a cardinality encoding. If the merge steps can provide the $k + 1$ first outputs when given only the $k + 1$ first inputs, we may be able to reduce the total size. This is called k-simplification (Koshimura et al. [17]). As per the requirements of incrementality, we will also need to be able to produce the remaining output variables. To achieve both k-simplification and incrementality we will need to forgo providing a bidirectional merge.

An encoding based on merge-sort lends itself immediately to incrementality: new literals can be added to the list by merging. We generalize this by using delayed variables.

We can obtain k-simplification with delayed variables: we rely on the sort function delaying all input and output variables of merge operations with index larger than k which in turn will reduce the size of the merge operation. How much the size is reduced depends on the implementation of merge which we will discuss later.

Let us say we want to encode $x_{1} + \dots + x_{n} + x_{n + 1} + \dots + x_{m} ⩽ k$, and we already have $([o_{1} \dots o_{n}], S) = sort ([x_{1} \dots x_{n}])$ encoded. To add the variables $x_{n + 1}, \dots, x_{m}$ to the left-hand side, we simply sort the new variables: $([o_{n + 1} \dots o_{m}], S^{'}) = sort ([x_{n + 1} \dots x_{m}])$ and apply merge: $merge ([o_{1} \dots o_{n}], [o_{n + 1} \dots o_{m}])$.

To increase k to $k + 1$ when k-simplification is applied, it is necessary to undelay the appropriate literals and undelay corresponding clauses.

However, the above setup ignores the fact that the number of new variables, $m - n$, might be much smaller than n, so that we might have to apply a large number of merges, each time increasing the number of sorted variables by very little. As a result, the aforementioned size bounds on the totalizer and oe-sort encodings are not achieved.

Our proposed remedy is to use an amortized update scheme. We instead update the encoding to $x_{1} + \dots + x_{n} + \tilde{y_{n + 1}} + \dots + \tilde{y_{2 n}} ⩽ k + 1$, where $y_{n + 1}, \dots, y_{2 n}$ are fresh “placeholder” variables. The placeholder variables are “matched” with the new variables $x_{n + 1}, \dots, x_{m}$ by replacing $y_{i}$ with $x_{i}$ and then undelaying this variable. Note that we can substitute delayed variables without changing any of the clauses already given to the SAT solver.

The variable $y_{i}$ is not matched with a variable if $i > m$, so it remains delayed. The next time we need to add a variable $x_{j}$, we first check if we can match it with an existing unmatched $y_{j}$. If not, we update the constraint to add more delayed $y_{j}$ variables, doubling the number of variables sorted.
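The bookkeeping behind this scheme resembles a resizing array and can be sketched as follows. The class `PlaceholderPool` and its fields are hypothetical names used only for this illustration; the sketch tracks how many inputs the encoding sorts and how often it has to be extended, but does not build any clauses.

```python
class PlaceholderPool:
    """Tracks matched inputs and delayed placeholders of a growing constraint."""

    def __init__(self):
        self.matched = []   # real variables already in the constraint
        self.capacity = 0   # matched variables plus unmatched placeholders
        self.rebuilds = 0   # how often the encoding had to be extended

    def add(self, var):
        if len(self.matched) == self.capacity:
            # No unmatched placeholder is left: extend the encoding with
            # fresh delayed placeholders, doubling the number of inputs.
            self.capacity = max(1, 2 * self.capacity)
            self.rebuilds += 1
        # Match var with the next placeholder (which is then undelayed).
        self.matched.append(var)


pool = PlaceholderPool()
for v in range(1, 101):
    pool.add(v)
```

Adding 100 variables one at a time triggers only 8 extensions (capacities 1, 2, 4, ..., 128), instead of one merge per added variable.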

4.4. The totalizer encoding

We now review the totalizer encoding (Bailleux and Boufkhad [9]), and the k-simplification of Koshimura et al. [17]. We describe an incremental version of this encoding based on delayed variables, which is equivalent to the incremental version described by Martins et al. [23]. We will describe the totalizer merge function, and rely on the sort function (Definition 1) to provide the remaining elements of the cardinality encoding.

Definition 3 (Totalizer merge function).

$tot\text{-}merge ([a_{1} \dots a_{n}], [b_{1} \dots b_{m}]) = ([c_{1} \dots c_{n + m}], S_{1} \cup S_{2})$ where $\begin{array}{l} S_{1} = \bigwedge_{\substack{0 ⩽ i ⩽ n \\ 0 ⩽ j ⩽ m}} \neg a_{i} \lor \neg b_{j} \lor c_{i + j}; \\ S_{2} = \bigwedge_{\substack{0 ⩽ i ⩽ n \\ 0 ⩽ j ⩽ m}} a_{i + 1} \lor b_{j + 1} \lor \neg c_{i + j + 1}; \\ a_{0} = b_{0} = c_{0} = ⊤; \\ a_{n + 1} = b_{m + 1} = c_{n + m + 1} = ⊥. \end{array}$

Since tot-merge creates $O (n + m)$ new variables and $O (n m)$ many clauses, we get an encoding with $O (n \log n)$ new variables and $O (n^{2})$ new clauses. Note that $S_{1}$ is enough for unidirectional merge, while adding $S_{2}$ provides bidirectional merge.
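Definitions 1 and 3 can be checked on small inputs with the following sketch. It is an illustration under assumptions, not our implementation: variables are positive integers, the markers `TOP` and `BOT` stand for ⊤ and ⊥, the helper names are hypothetical, no variables are delayed, and the check enumerates all assignments by brute force.

```python
from itertools import count, product

TOP, BOT = "T", "F"  # hypothetical markers for the constants ⊤ and ⊥


def tot_merge(a, b, fresh):
    """Totalizer merge: returns (outputs, clauses) with S_1 and S_2."""
    n, m = len(a), len(b)
    c = [next(fresh) for _ in range(n + m)]
    A, B, C = [TOP] + a + [BOT], [TOP] + b + [BOT], [TOP] + c + [BOT]
    clauses = []

    def add(neg, pos):
        # Variables in neg occur negated, those in pos unnegated.
        if BOT in neg or TOP in pos:
            return  # the clause contains a true literal: erase it
        clauses.append([-v for v in neg if v != TOP]
                       + [v for v in pos if v != BOT])

    for i in range(n + 1):
        for j in range(m + 1):
            add([A[i], B[j]], [C[i + j]])              # S_1
            add([C[i + j + 1]], [A[i + 1], B[j + 1]])  # S_2
    return c, clauses


def sort(inputs, fresh):
    """Merge-sort pattern of Definition 1, instantiated with tot_merge."""
    if len(inputs) == 1:
        return list(inputs), []
    half = len(inputs) // 2
    a, s1 = sort(inputs[:half], fresh)
    b, s2 = sort(inputs[half:], fresh)
    c, s_merge = tot_merge(a, b, fresh)
    return c, s1 + s2 + s_merge


def outputs_are_sorted_inputs(n):
    """Check that every model lists the inputs in sorted order (1s first)."""
    fresh = count(n + 1)
    outputs, clauses = sort(list(range(1, n + 1)), fresh)
    num_vars = next(fresh) - 1
    for bits in product([0, 1], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (1 if l > 0 else 0) for l in c)
               for c in clauses):
            ones = sum(bits[:n])
            expected = [1] * ones + [0] * (n - ones)
            if [bits[o - 1] for o in outputs] != expected:
                return False
    return True
```

With both $S_{1}$ and $S_{2}$ present, `outputs_are_sorted_inputs(4)` confirms that the outputs are forced to equal the inputs in sorted order.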

Lemma 1.
By delaying all input and output variables with index larger than k in each merge operation, sorting with tot-merge only produces $O (n k)$ undelayed clauses and $O (n \log k)$ undelayed variables.
Proof.
First we observe that each merge produces only $O (k^{2})$ many undelayed clauses, and $O (k)$ many new variables. Let M be the set of tot-merges where the length of the output sequence is larger than k in the network.

The size of M is $O (n / k)$ and each element of M gives $O (k^{2})$ many undelayed clauses. The inputs of merges in M are provided by a set S of $O (n / k)$ many sorting networks where the length of the output sequence is at most k, so each gives at most $O (k^{2})$ undelayed clauses. These two sets of networks make up the entire sorting network, both sets have size $O (n / k)$ and each element gives at most $O (k^{2})$ new clauses. Hence the total number of clauses is $2 O (\frac{n}{k}) O (k^{2}) = O (n k)$ .

S produces $O (\frac{n}{k}) O (k \log k) = O (n \log k)$ many new variables and M produces $O (\frac{n}{k}) O (k) = O (n)$ many new variables. □

Although we achieve smaller encodings using this approach, under some conditions, tot-merge is not bidirectional when k-simplification is applied:

Lemma 2.
In $[c_{1} \dots c_{n + m}] = tot\text{-}merge ([a_{1} \dots a_{n}], [\tilde{b_{1}} \dots \tilde{b_{m}}])$, where all input variables of one input list are delayed, tot-merge is not bidirectional even when using the $S_{2}$ clauses.
Proof.
The only undelayed clauses containing $a_{1}$ are $\neg a_{1} \lor c_{1}$ and $a_{1} \lor \neg c_{m + 1}$ , therefore it is possible for $c_{1}$ to be true and $a_{1}$ to be false. □

Note that this cannot be easily repaired as we require the clause $a_{1} \lor \neg c_{1}$ which is not in the original clause set. The formula contains the delayed clause $a_{1} \lor \tilde{b_{1}} \lor \neg c_{1}$ which would ensure bidirectionality if no variables were delayed. This scenario can indeed occur in the MSU3 and MSU4 algorithms if many placeholder variables remain unmatched.

4.5. Odd-even merge network

In contrast to tot-merge, oe-sort merging is done using compare-and-swap operations. A compare-and-swap operation $swap (i_{1}, i_{2}, o_{1}, o_{2})$, where $i_{1}$, $i_{2}$ are inputs and $o_{1}$, $o_{2}$ are outputs, compares the two inputs and outputs the values in sorted order such that $o_{1} ⩾ o_{2}$. A sorting network S is a set of compare-and-swap operations such that when these operations are applied to a sequence $[i_{1} i_{2} \dots i_{n}]$ then the resulting sequence $[o_{1} o_{2} \dots o_{n}]$ is sorted, that is $o_{i} ⩾ o_{i + 1}$ for $1 ⩽ i < n$. Note that the set of swap operations is fixed regardless of input.

In order to use sorting networks to encode $x_{1} + \dots + x_{n} ⩽ k$, a set of clauses S that simulates the application of the sorting network to $[x_{1} \dots x_{n}]$ is constructed. That is, for every $swap (i_{1}, i_{2}, o_{1}, o_{2})$ in the sorting network, the following clauses are added to S: $\begin{array}{ll} (1) & i_{1} \lor i_{2} \lor \overline{o_{1}} \\ (2) & i_{1} \lor \overline{o_{2}} \\ (3) & i_{2} \lor \overline{o_{2}} \\ (4) & \overline{i_{1}} \lor \overline{i_{2}} \lor o_{2} \\ (5) & \overline{i_{1}} \lor o_{1} \\ (6) & \overline{i_{2}} \lor o_{1} \end{array}$

Clauses (4)–(6) are what is required for unidirectional merge; adding clauses (1)–(3) as well gives bidirectional merging.
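Enumerating the models of clauses (1)–(6) shows what a single swap enforces. The sketch below is for illustration only; the helper names `swap_clauses` and `satisfying` are hypothetical and the four variables $i_{1}, i_{2}, o_{1}, o_{2}$ are numbered 1 to 4.

```python
from itertools import product


def swap_clauses(i1, i2, o1, o2, bidirectional=True):
    """CNF for one compare-and-swap, literals as non-zero integers."""
    uni = [[-i1, -i2, o2], [-i1, o1], [-i2, o1]]   # clauses (4)-(6)
    extra = [[i1, i2, -o1], [i1, -o2], [i2, -o2]]  # clauses (1)-(3)
    return uni + extra if bidirectional else uni


def satisfying(clauses, num_vars):
    """All assignments (as bit tuples) satisfying the clause set."""
    return [bits for bits in product([0, 1], repeat=num_vars)
            if all(any(bits[abs(l) - 1] == (1 if l > 0 else 0) for l in c)
                   for c in clauses)]


both = satisfying(swap_clauses(1, 2, 3, 4), 4)
one_way = satisfying(swap_clauses(1, 2, 3, 4, bidirectional=False), 4)
```

With all six clauses there are exactly four models, one per input pair, and in each of them $o_{1}$ is the larger and $o_{2}$ the smaller input. With clauses (4)–(6) alone there are nine models, because the outputs may be 1 even when the inputs do not force them.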

We will now describe the odd-even merge network. We refer to Asín et al. [7] for correctness of the construction as it will be similar to the definition given there.

Definition 4 (Odd even merge network).

For sequences of length 1 the odd-even merge function is defined in the following way: $oe\text{-}merge ([a_{1}], [b_{1}]) = ([c_{1}, c_{2}], \{swap (a_{1}, b_{1}, c_{1}, c_{2})\})$. When $n > 1$ we have $\begin{array}{lrcl} (7) & oe\text{-}merge ([a_{1} \dots a_{n}], [b_{1} \dots b_{n}]) & = & ([o_{1} c_{2} \dots c_{2 n - 1} e_{n}], S_{o} \cup S_{e} \cup S_{w}) \end{array}$ where $\begin{array}{lrcl} (8) & oe\text{-}merge ([a_{1} a_{3} \dots a_{n - 1}], [b_{1} b_{3} \dots b_{n - 1}]) & = & ([o_{1} o_{2} \dots o_{n}], S_{o}) \\ (9) & oe\text{-}merge ([a_{2} a_{4} \dots a_{n}], [b_{2} b_{4} \dots b_{n}]) & = & ([e_{1} e_{2} \dots e_{n}], S_{e}) \\ (10) & S_{w} & = & \bigcup_{i = 1}^{n - 1} \{swap (e_{i}, o_{i + 1}, c_{2 i}, c_{2 i + 1})\} \end{array}$

To simplify the following proofs, we will divide an oe-merge operation into recursion levels. We say that a “call” to oe-merge by sort has recursion depth 0, while any subsequent call has depth 1+r where r is the recursion depth of the merge “calling” it. A recursion level is the collection of merges with the same depth stemming from the same merge operation. The output (input) variables of a level are the output (input) variables of all merge operations in that level.
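The recursion of Definition 4 can be illustrated by evaluating the network directly on 0/1 values instead of generating clauses. This is a sketch under assumptions: the function name `oe_merge` is hypothetical, both inputs must have the same power-of-two length and be sorted with 1s before 0s, and each swap is computed as a maximum and a minimum.

```python
def oe_merge(a, b):
    """Odd-even merge of two sorted 0/1 lists (1s before 0s).

    Both lists must have the same power-of-two length.
    Returns (merged list, number of compare-and-swap operations).
    """
    n = len(a)
    if n == 1:
        # Base case: a single swap(a_1, b_1, c_1, c_2).
        return [max(a[0], b[0]), min(a[0], b[0])], 1
    o, swaps_o = oe_merge(a[0::2], b[0::2])  # odd positions a_1, a_3, ...
    e, swaps_e = oe_merge(a[1::2], b[1::2])  # even positions a_2, a_4, ...
    out = [o[0]]
    for i in range(n - 1):
        # swap(e_i, o_{i+1}, c_{2i}, c_{2i+1}) in the 1-based notation.
        out += [max(e[i], o[i + 1]), min(e[i], o[i + 1])]
    out.append(e[n - 1])
    return out, swaps_o + swaps_e + n - 1


merged, swaps = oe_merge([1, 1, 0, 0], [1, 0, 0, 0])
```

For the two inputs above the result is `[1, 1, 1, 0, 0, 0, 0, 0]`, obtained with 9 compare-and-swap operations.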

4.5.1. Delaying auxiliary variables

The sort network will ensure that the appropriate outputs and inputs are delayed. However, unlike with totalizer, oe-merge will create auxiliary variables in the two recursive calls to oe-merge. To lower the number of undelayed clauses we have to delay auxiliary variables. This is done by propagating delayed variables through swaps using the following rules: $\begin{array}{ll} (11) & swap (\tilde{x_{1}}, x_{2}, y_{1}, y_{2}) \Rightarrow swap (\tilde{x_{1}}, x_{2}, y_{1}, \tilde{y_{2}}) \\ (12) & swap (x_{1}, \tilde{x_{2}}, y_{1}, y_{2}) \Rightarrow swap (x_{1}, \tilde{x_{2}}, y_{1}, \tilde{y_{2}}) \\ (13) & swap (\tilde{x_{1}}, \tilde{x_{2}}, y_{1}, y_{2}) \Rightarrow swap (\tilde{x_{1}}, \tilde{x_{2}}, \tilde{y_{1}}, \tilde{y_{2}}) \\ (14) & swap (x_{1}, x_{2}, \tilde{y_{1}}, \tilde{y_{2}}) \Rightarrow swap (\tilde{x_{1}}, \tilde{x_{2}}, \tilde{y_{1}}, \tilde{y_{2}}) \end{array}$

We will explain why these rules correctly delay unused clauses and variables. To do so we first need to formalize the well-known fact that monochromatic variables, that is, variables that only occur negated or only unnegated in the formula, have little impact on the set of models.

Fact 1.
Let p be a variable and $S_{p}$ a set of clauses in which p only occurs positively (negatively). Let S be obtained from $S_{p}$ by leaving out all clauses in which p occurs. Then models of S can easily be extended to models of $S_{p}$ by putting $p = 1$ ( $p = 0$ ). Conversely, each model of $S_{p}$ can be restricted to a model of S by leaving out the assignment to p.

Rule (11) can be understood as follows. Let the swaps be unidirectionally encoded by (4)–(6). Delaying $x_{1}$ means delaying (4) and (5). As a consequence, $y_{2}$ only occurs negatively, as an input variable of another swap. By Fact 1 we can put $y_{2} = 0$, or equivalently, delay $y_{2}$. The rules (12)–(14) can be explained similarly.
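
Applying the rules (11)–(14) repeatedly until no further variable becomes delayed can be sketched as follows. This is a naive fixpoint loop intended only to illustrate the rules; it is not the implementation discussed in Section 4.7. The function name `propagate_delays` and the representation of a swap as a tuple `(x1, x2, y1, y2)` of variable names are assumptions made for this example.

```python
def propagate_delays(swaps, delayed):
    """Return the set of delayed variables after applying rules (11)-(14).

    swaps   -- iterable of tuples (x1, x2, y1, y2): inputs x1, x2, outputs y1, y2
    delayed -- initially delayed variables (e.g. unused inputs and outputs)
    """
    delayed = set(delayed)
    changed = True
    while changed:
        changed = False
        for x1, x2, y1, y2 in swaps:
            new = set()
            dx1, dx2 = x1 in delayed, x2 in delayed
            if dx1 and dx2:
                new |= {y1, y2}      # rule (13): both inputs delayed
            elif dx1 or dx2:
                new.add(y2)          # rules (11) and (12): one input delayed
            if y1 in delayed and y2 in delayed:
                new |= {x1, x2}      # rule (14): both outputs delayed
            if not new <= delayed:
                delayed |= new
                changed = True
    return delayed


# Two chained swaps: output m2 of the first swap is an input of the second.
network = [("a", "b", "m1", "m2"), ("m2", "c", "o1", "o2")]
after_forward = propagate_delays(network, {"a"})
after_backward = propagate_delays(network, {"o1", "o2"})
```

In the first call, delaying `a` delays `m2` by rule (11), which in turn delays `o2`. In the second call, rule (14) delays `m2` and `c`.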

We will have to take care with these rules when variables are undelayed. In essence, when inputs or outputs to oe-merge are undelayed, some auxiliary variables will need to be undelayed as a consequence. In Section 4.7 we will discuss how to efficiently implement delayed variables for sorting networks, but for now it is sufficient to say that any auxiliary variable a is delayed iff the above propagation rules, started with only delayed inputs and outputs, result in a being delayed. This means that the rules will have to be reexamined whenever inputs or outputs are undelayed to find undelayed auxiliary variables.
Lemma 3.
Using the above rules, the number of undelayed output variables of each level is bounded by the number of undelayed inputs to that level.
Proof.
The rules ensure that no swap operation has more undelayed outputs than undelayed inputs. A merge operation only produces swap operations in such a way that each variable occurs at most once as an input in a swap. Any variable also occurs only once as an output. Therefore each undelayed output can be mapped one-to-one to an undelayed input. □

The propagation rules (11) and (12) do not preserve bidirectionality: for rule (11), the only undelayed clause that remains is $\neg x_{2} \lor y_{1}$. Therefore it is possible for $y_{1}$ to be assigned true while $x_{2}$ is assigned false. We could achieve bidirectionality if we substitute variables instead of delaying clauses: $\begin{array}{ll} (15) & \mathrm{swap} (\tilde{x}_{1}, x_{2}, y_{1}, y_{2}) \Rightarrow x_{2} = y_{1} \\ (16) & \mathrm{swap} (x_{1}, \tilde{x}_{2}, y_{1}, y_{2}) \Rightarrow x_{1} = y_{1} \end{array}$

However, this would mean that we could no longer update the constraint. We will show that by using the rules (15) and (16) the merge network simplifies to size $O (k \log (k))$, and so one gets sorting networks with size $O (n {\log}^{2} (k))$. With rules (11) and (12), in contrast, the merge network will for now have the slightly larger size $O (k \log (n))$.
Lemma 4.
Using the rules (15) and (16) reduces $\text{oe-merge} ([a_{1} \dots a_{n}], [b_{1} \dots b_{n}])$ to size $O (k \log (k))$ when $a_{k}, \dots, a_{n}$ and $b_{k}, \dots, b_{n}$ are delayed.
Proof.
At some point before the depth of recursion has reached $O (\log (k))$, there will be no more than two undelayed variables in the input to merge, one in each list. In subsequent recursion steps, except possibly for the base case, all swaps are eliminated by (13), (15), and (16).

Using Lemma 3, at each level of recursion, there are at most $O (k)$ undelayed input variables and so at most $O (k)$ many swaps that are not eliminated, so in total there are $O (k \log (k))$ many swaps. □

We can relate the above propagation rules to those commonly applied in SAT solving. Seeing a delayed variable as one that is set to false, which is admissible when encoding $x_{1} + \dots + x_{n} ⩽ k$, the rules (11)–(14) are equivalent to unit propagation, while the rules (15) and (16) are equivalent to eliminating equivalent variables, which could be performed by, for instance, the simplification routine “variable elimination by substitution” found in SatElite (Eén and Biere [14]). However, in order to provide incrementality, we cannot set delayed variables to false and must settle for a slightly larger size.
Lemma 5.
When delaying all input and output variables with index larger than k in an oe-merge network, after propagating delayed variables, only $O (k \log (n))$ undelayed clauses remain.
Proof.
If only the first input variable in each list of an oe-merge is undelayed, then that merge only produces one swap with undelayed clauses, and that swap is generated in the base case of oe-merge.

Only in the first $O (\log (k))$ recursion levels are there any merge operations with undelayed variables which are not the first element of the input lists. By Lemma 3 each of those recursion levels has at most $O (k)$ swaps.

The number of swaps with undelayed clauses is therefore at most the $O (k)$ swaps generated in the base case plus $O (k)$ swaps in each of the first $O (\log (k))$ recursion levels. □
Theorem 1.
By delaying all input and output variables with index larger than k in all merge operations, sorting with oe-merge produces $O (n \log (n) + n {\log}^{2} (k))$ undelayed clauses.
Proof.
In the sorting formula there are only $O (n / k)$ many merges where the length of the output sequence is larger than k, each of which has size $O (k \log (n))$ by Lemma 5. The remaining formula consists of $O (n / k)$ many sorts where the length of the output sequence is at most k, so each has size at most $O (k {\log}^{2} (k))$. Hence the total size is $O (\frac{n}{k} k {\log}^{2} (k) + \frac{n}{k} k \log (n)) = O (n {\log}^{2} (k) + n \log (n))$. □
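
To give a feeling for the bound in Theorem 1, the following sketch evaluates the two asymptotic expressions numerically. It ignores all constants hidden by the O-notation, so the numbers indicate growth only and are not clause counts of any actual encoding. The function names are hypothetical.

```python
import math


def undelayed_bound(n, k):
    """n log(n) + n log^2(k), the expression from Theorem 1 (constants ignored)."""
    return n * math.log2(n) + n * math.log2(k) ** 2


def full_network_bound(n):
    """n log^2(n), the size of a full odd-even sorting network (constants ignored)."""
    return n * math.log2(n) ** 2


n, k = 1024, 16
with_delay = undelayed_bound(n, k)      # 1024 * 10 + 1024 * 16
without_delay = full_network_bound(n)   # 1024 * 100
```

For these values the expression with delayed variables is smaller by a factor of roughly 3.8, and the gap widens as n grows while k stays fixed.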

4.6. Mixed encoding

It is possible to combine the sorting network encoding with the totalizer encoding, as noted by Abío et al. [3]. The reason for doing so is that totalizer, in addition to creating fewer variables, also uses fewer clauses for short input sequences.

Surprisingly, it also provides us with a way of getting asymptotically smaller incremental networks. As noted in the proof of Lemma 4, after $O (\log (k))$ recursion steps there are only 2 undelayed input variables left. Therefore, applying tot-merge to these inputs adds a constant number of new clauses and variables. By doing this we will show that we have an incremental cardinality encoding of size $O (n {\log}^{2} (k))$.

As noted in Abío et al. [3], it might be beneficial to use tot-merge in order to optimize the ratio between the number of clauses and the number of variables. We have observed that totalizer might use not just fewer variables but also fewer clauses when the number of undelayed inputs is at most 8.

In the following λ is a parameter controlling how to weigh the number of clauses versus the number of variables when minimizing the formula. The ${mixed}_{λ}$-merge encoding is constructed recursively as oe-merge. However, it will “call” tot-merge instead of oe-merge if $C_{tot} + λ V_{tot} ⩽ C_{oe} + λ V_{oe},$ where $C_{tot}$ ($V_{tot}$) and $C_{oe}$ ($V_{oe}$) are the number of clauses (variables) created if the totalizer or the odd-even network is used to merge. Note that when calculating $V_{oe}$ and $C_{oe}$, we take into account that the mixed network is used to sort the odd and even recursive cases.

The idea is to minimize the ratio between the number of clauses and the number of variables, and so λ determines how heavily the number of variables should be weighed.
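
The selection criterion amounts to a single comparison, as in the following minimal sketch. The function name `prefer_totalizer` and the clause and variable counts used in the example are made up for illustration; in an actual encoder the counts would be computed for the merge at hand.

```python
def prefer_totalizer(c_tot, v_tot, c_oe, v_oe, lam):
    """True if tot-merge should be "called" instead of oe-merge.

    c_tot, v_tot -- clauses and variables created by the totalizer merge
    c_oe, v_oe   -- clauses and variables created by the odd-even merge
    lam          -- the weight λ given to the number of variables
    """
    return c_tot + lam * v_tot <= c_oe + lam * v_oe


# Made-up counts: the totalizer needs more clauses but fewer variables.
counts = dict(c_tot=12, v_tot=4, c_oe=9, v_oe=6)
choice_clauses_only = prefer_totalizer(lam=0, **counts)   # 12 <= 9
choice_weighted = prefer_totalizer(lam=5, **counts)       # 32 <= 39
```

With λ = 0 only clauses count and the odd-even merge is kept in this example, while with λ = 5 the saving in variables outweighs the extra clauses and the totalizer merge is chosen.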

Lemma 6.
${mixed}_{λ}$-merge has size $O (k \log (k))$ when λ is set to a non-negative constant and the number of undelayed input variables is $O (k)$.
Proof.
We prove this for $λ = 0$ as the general case follows a similar line of argument.

After at most $O (\log (k))$ many recursion steps there are no more than two undelayed variables in each input list and so tot-merge is used. The number of such tot-merge operations is at most $O (k)$ and each has constant size. Therefore the recursion depth is bounded by $O (\log (k))$, and by Lemma 3 each level has at most $O (k)$ swaps. □

Abío et al. [3] note that $λ = 5$ seems to give the best performance. In addition to such a setup we also attempted a scheme where the number of additional clauses is kept within a threshold T while minimizing the number of variables used. This is done by first generating the mixed network with $λ = 0$. This gives the network $N_{\min}$ with the minimum number of clauses $C_{\min}$. We now have a budget of $T - C_{\min}$ clauses to spend in such a way that as many variables as possible are removed. We therefore calculate for every oe-merge in $N_{\min}$ the number of variables removed, $V_{red}$, and the number of clauses added, $C_{add}$, if it were to be replaced with a tot-merge. Now the oe-merges are processed in order of $V_{red} / C_{add}$ and replaced with a tot-merge as long as the threshold T is not exceeded. We call this approach mixed-limit.
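
The budgeted replacement step of mixed-limit can be sketched as a greedy selection. The sketch rests on two assumptions that the description above leaves open: the oe-merges are processed in decreasing order of $V_{red} / C_{add}$, and a merge that does not fit into the remaining budget is skipped rather than ending the process. The names `Candidate` and `choose_replacements` and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str    # identifies one oe-merge in N_min
    v_red: int   # variables removed if replaced with a tot-merge
    c_add: int   # clauses added if replaced with a tot-merge


def choose_replacements(candidates, c_min, threshold):
    """Greedily pick oe-merges to replace with tot-merges within the budget."""
    budget = threshold - c_min

    def ratio(c):
        # A replacement that adds no clauses is treated as free.
        return float("inf") if c.c_add == 0 else c.v_red / c.c_add

    chosen = []
    # Assumption: decreasing ratio, skipping candidates that do not fit.
    for c in sorted(candidates, key=ratio, reverse=True):
        if c.v_red > 0 and c.c_add <= budget:
            chosen.append(c.name)
            budget -= c.c_add
    return chosen


merges = [Candidate("A", 10, 5), Candidate("B", 4, 1), Candidate("C", 6, 6)]
replaced = choose_replacements(merges, c_min=100, threshold=107)
```

With a budget of 7 clauses, the merge `B` (ratio 4) is replaced first, then `A` (ratio 2), and `C` no longer fits.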

Fig. 1. Number of 2016 MaxSAT competition industrial benchmarks solved over time.

Fig. 2. Number of 2016 MaxSAT competition crafted benchmarks solved over time.

4.7. Delayed variable implementation details

We will now describe an efficient implementation of delayed variables suitable for the totalizer, odd-even merge sort and the mixed encoding.

Keeping every delayed clause in memory is not ideal since there can be many delayed clauses, and many of them will never be used. Also, keeping track of the consequences of undelaying a variable is not simple. Instead, in our implementation, indicator variables are undelayed when found in a core, and we store additional information in order to discover which variables must be undelayed as a consequence. We also store information for each such variable so that, when it is discovered, the clauses that contain it are generated.

Recall that when encoding $x_{1} + \dots + x_{n} ⩽ k$, in each merge, only the first k outputs are used. For a variable v we say that its out-index is the smallest k for which the variable is used.

By keeping track of the recursive calls to oe-merge and the out-index of each variable, we are able to find which variables to undelay and therefore efficiently generate undelayed clauses. Details of this rather involved procedure are beyond the scope of this paper.

5. Results

We chose to use the partial MaxSAT benchmarks from the 2016 MaxSAT competition (2016 MaxSAT evaluation [1]). Both industrial and crafted instances were considered. There are versions of the incremental MSU3 and MSU4 algorithms that work not just with the partial benchmarks but also with the unweighted and weighted ones, see for instance Ansótegui et al. [5]. However, we restrict our experiments to the partial benchmarks as we are only interested in the cardinality encoding aspect of these algorithms.

Figures 1 and 2 show the number of benchmarks solved using MSU3 and MSU4 in conjunction with the encodings odd-even sorting network, totalizer, mixed (labeled mixed-ratio in the plots), and mixed-limit. Additionally, we compare these results with the MaxSAT solver Open-WBO 1.3.0 (OWBO, Martins et al. [24]), which ranked second on the crafted partial benchmarks in the 2016 MaxSAT competition (2016 MaxSAT evaluation [1]). Open-WBO uses the incremental MSU3 algorithm with the totalizer encoding. Overall, our implementation of MSU3 with the totalizer encoding seems to have similar performance to their implementation, and outperforms Open-WBO on the industrial benchmarks. The incremental interface of glucose-syrup (Audemard and Simon [8]) was used in the implementation of all experiments.

5.1. Choice of λ and threshold

As a preliminary experiment, we selected 30 of the benchmarks from the MaxSAT competition to experiment with varying threshold T in the mixed-limit encoding and λ in the mixed-ratio encoding. We chose to set T to a multiple τ of the number of input clauses. Results are shown in Fig. 3.

For τ, performance seems roughly the same for values larger than 2. When $τ = 10$ , the encoding uses totalizer almost exclusively. For mixed-ratio, performance varies unpredictably, but trends lower for $λ > 4$ .

We chose to use $λ = 5$ as recommended by Abío et al. [3]. For τ we chose to set it at 8. This is large enough to almost always use the totalizer encoding while small enough that the clauses will fit into memory.

5.2. Experiments with the 2016 MaxSAT partial benchmark set

Overall it seems that MSU3 works best for the crafted instances, while MSU4 is better for the industrial instances. For encodings, the benchmarks heavily favor the totalizer encoding. The mixed-limit approach solves two more instances of the crafted benchmarks than the totalizer encoding, which is already quite good, and one more of the industrial benchmarks. All benchmarks solved with the totalizer encoding are also solved using the mixed-limit encoding. This behavior seems to be caused by the mixed-limit encoding avoiding an explosion in the number of clauses on some benchmarks while using totalizer in most other cases.

Fig. 3. Experiments with 30 selected benchmarks, varying λ and τ.

For both MSU3 and MSU4, using the odd-even sorting networks alone performed very poorly. Additionally, the mixed encoding performed better when odd-even merges were used very sparingly. This is contrary to the findings of Martins [22], in whose experiments the totalizer encoding performs less favorably than odd-even sorting networks. However, those experiments were done with a non-incremental setup. It therefore seems that the totalizer encoding gains an advantage when there is no need to regenerate the entire formula.

A version of the solver used for these experiments was submitted to the 2017 MaxSAT Evaluation (2017 MaxSAT evaluation [2]), where it placed 4th out of 8 solvers in the unweighted category. In addition to using the mixed-limit incremental encoding, it combined the MSU3 and MSU4 algorithms, since the union of the instances solved by these algorithms is larger than their intersection.

6. Conclusion

In this paper we investigated an incremental setup of the MSU3 and MSU4 algorithms together with several schemes for encoding the cardinality constraints. We found that a combination of the totalizer encoding and the sorting networks encoding can increase performance slightly if the totalizer encoding is favored heavily. Most notably, unlike in other works where both the totalizer encoding and the sorting networks encoding are used (Abío et al. [3], Martins [22]), sorting networks alone seem to have poorer performance than the totalizer encoding. This might hint at some unidentified advantage of the totalizer encoding which might be explored further.

References

1. 2016 MaxSAT evaluation, http://maxsat.ia.udl.cat/, 2016, Accessed: 2017-10-11.

2. 2017 MaxSAT evaluation, http://mse17.cs.helsinki.fi/, 2017, Accessed: 2017-10-11.

3. Abío, Nieuwenhuis, Oliveras and Rodríguez-Carbonell, A parametric approach for smaller and better encodings of cardinality constraints, in: Principles and Practice of Constraint Programming, Springer, 2013, pp. 80–96. doi:10.1007/978-3-642-40627-0_9.

4. Ajtai, Komlós and Szemerédi, An O(n log n) sorting network, in: Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing, ACM, 1983, pp. 1–9.

5. Ansótegui, M.L. Bonet and Levy, Solving (weighted) partial MaxSAT through satisfiability testing, in: Theory and Applications of Satisfiability Testing – SAT 2009, Springer, 2009, pp. 427–440. doi:10.1007/978-3-642-02777-2_39.

6. Argelich, Le Berre, Lynce, Marques-Silva and Rapicault, Solving Linux upgradeability problems using Boolean optimization, in: Proceedings First International Workshop on Logics for Component Configuration, EPTCS, 2010, pp. 11–23.

7. Asín, Nieuwenhuis, Oliveras and Rodríguez-Carbonell, Cardinality networks: A theoretical and empirical study, Constraints 16(2) (2011), 195–221. doi:10.1007/s10601-010-9105-0.

8. Audemard and Simon, Predicting learnt clauses quality in modern SAT solvers, in: Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence, Vol. 3, IJCAI, 2009, pp. 399–404.

9. Bailleux and Boufkhad, Efficient CNF encoding of Boolean cardinality constraints, in: Principles and Practice of Constraint Programming, Springer, 2003, pp. 108–122.

10. K.E. Batcher, Sorting networks and their applications, in: Proceedings of the April 30–May 2, 1968, Spring Joint Computer Conference, ACM, 1968, pp. 307–314.

11. Chen, Safarpour, Veneris and Marques-Silva, Spatial and temporal design debug using partial MaxSAT, in: Proceedings of the 19th ACM Great Lakes Symposium on VLSI, ACM, 2009, pp. 345–350.

12. Codish and Zazon-Ivry, Pairwise cardinality networks, in: Logic for Programming, Artificial Intelligence, and Reasoning, Springer, 2010, pp. 154–172. doi:10.1007/978-3-642-17511-4_10.

13. Davies and Bacchus, Exploiting the power of MIP solvers in MAXSAT, in: Theory and Applications of Satisfiability Testing – SAT 2013, Springer, 2013, pp. 166–181. doi:10.1007/978-3-642-39071-5_13.

14. Eén and Biere, Effective preprocessing in SAT through variable and clause elimination, in: Theory and Applications of Satisfiability Testing – SAT 2005, Springer, 2005, pp. 61–75.

15. Eén and Sörensson, Translating pseudo-Boolean constraints into SAT, Journal on Satisfiability, Boolean Modeling and Computation 2(3–4) (2006), 1–25.

16. Fu and Malik, On solving the partial MAX-SAT problem, in: Theory and Applications of Satisfiability Testing – SAT 2006, Springer, 2006, pp. 252–265. doi:10.1007/11814948_25.

17. Koshimura, Zhang, Fujita and R. Hasegawa, QMaxSAT: A partial Max-SAT solver, Journal on Satisfiability, Boolean Modeling and Computation 8 (2012), 95–100.

18. C.M. Li and Manyà, MaxSAT, hard and soft constraints, in: Handbook of Satisfiability, Biere, Heule and van Maaren, eds, Vol. 185, IOS Press, 2009.

19. C.M. Li, Manyà and Planes, New inference rules for Max-SAT, Journal of Artificial Intelligence Research (JAIR) 30(1) (2007), 321–359.

20. Marques-Silva and Planes, Algorithms for maximum satisfiability using unsatisfiable cores, in: Advanced Techniques in Logic Synthesis, Optimizations and Applications, Springer, 2011, pp. 171–182. doi:10.1007/978-1-4419-7518-8_10.

21. Marques-Silva and Planes, On using unsatisfiability for solving maximum satisfiability, Preprint, arXiv:0712.1097, 2007.

22. Martins, Parallel search for maximum satisfiability, PhD thesis, Instituto Superior Técnico, 2013.

23. Martins, Joshi, Manquinho and Lynce, Incremental cardinality constraints for MaxSAT, in: Principles and Practice of Constraint Programming, Springer, 2014, pp. 531–548.

24. Martins, Manquinho and Lynce, Open-WBO: A modular MaxSAT solver, in: Theory and Applications of Satisfiability Testing – SAT 2014, Springer, 2014, pp. 438–445.

25. Parberry, The pairwise sorting network, Parallel Processing Letters 2 (1992), 205–211. doi:10.1142/S0129626492000337.