solvers for non-clausal representations
(for example, NFLSAT18).
Most practically successful SAT solvers are based on an approach called
systematic search. Figure 2 depicts the
search space of a formula. The search
space is a tree with each vertex representing a variable and the out edges
representing the two decision choices
for this variable. For a formula with n
variables, there are 2n leaves in the tree.
Each path from the root to a leaf corresponds to a possible assignment to the
n variables. The formula may evaluate
to 1 or 0 at a leaf (colored green and red
respectively). Systematic search, as the
name implies, systematically searches
the tree and tries to find a green leaf or
prove that none exists.
The NP-completeness of the problem indicates that we will likely need to
visit an exponential number of vertices
in the worst case. The only hope for a
practical solver is that by being smart
in the search, almost all of the tree can
be pruned away and only a minuscule
fraction is actually visited in most cases. For an instance with a million variables, which is considered within the
reach of modern solvers, the tree has
210^ 6 leaves, and in reasonable computation time (about a day), we may be
able to visit a billion (about 230) vertices
as part of the search—a numerically insignificant fraction of the tree size!
Most search-based SAT solvers are
based on the so called DPLL approach
proposed by Davis, Logemann, and
Loveland in a seminal Communications
paper published in 1962.9 (This research builds on the work by Davis and
Putnam10 and thus Putnam is often
given shared credit for it.). Given a CNF
formula, the DPLL algorithm first heuristically chooses an unassigned variable and assigns it a value: either 1 or 0.
This is called branching or the decision
step. The solver then tries to deduce the
consequences of the variable assignment using deduction rules. The most
widely used deduction rule is the
unit-clause rule, which states that if a clause
in the formula has all but one of its literals assigned 0 and the remaining one
is unassigned, then the only way for the
clause to evaluate to true, and thus the
formula to evaluate to true, is for this
last unassigned literal to be assigned to
1. Such clauses are called unit clauses
and the forced assignments are called
x4 = 1
implications. This rule is applied iteratively until no unit clause exists. Note
that this deduction is enabled by the
CNF representation and is the main
reason for SAT solvers preferring this
form.
If at some point there is a clause in
the formula with all of its literals evaluating to 0, then the formula cannot be
true under the current assignment.
This is called a conflict and this clause
is referred to as a conflicting clause. A
conflict indicates that some of the earlier decision choices cannot lead to a
satisfying solution and the solver has
to backtrack and try a different branch
value. It accomplishes this by finding
the most recent decision variable for
which both branches have not been
taken, flip its value, undo all variable
assignments after that decision, and
run the deduction process again. Otherwise, if no such conflicting clause exists, the solver continues by branching
on another unassigned variable. The
search stops either when all variables
are assigned a value, in which case we
have hit a green leaf and the formula is
satisfiable, or when a conflicting clause
exists when all branches have been explored, in which case the formula is unsatisfiable.
Consider the application of the
algorithm to the formula shown in
Figure 2. At the beginning the solver
branches on variable x1 with value 1. After branching, the first clause becomes
unit and the remaining free literal Øx2
is implied to 1, which means x2 must
be 0. Now the second clause becomes
unit and Øx3 is implied to 1. Then Øx4 is
implied to 1 due to the third clause. At
this point the formula is satisfied, and
the satisfying assignment corresponds
to the 8th leaf node from the left in the
search tree. (This path is marked in
bold in the figure.) As we can see, by
applying the unit-clause rule, a single
branching leads directly to the satisfying solution.
Many significant improvements in
the basic DPLL algorithm have been
figure 2. search space of a formula.
(Øx1 ∨ Øx2) ∧ (Øx1 ∨ x2 ∨ Øx3) ∧ (Øx1 ∨ x3 ∨ Øx4) ∧ (x1 ∨ x4)
unknown
True ( 1)
False (0)
x1 = 1
x1 = 0
x2 = 1
x2 = 0
x2 = 1
x2 = 0
x3 = 1
x3 = 0
x3 = 1
x3 = 0
x3 = 1
x3 = 0
x3 = 1
x3 = 0
figure 3. conflict-driven learning and non-chronological backtracking.
x4 = 1
x9 = 1
x3 = 1
x1 ∨ x4
x1 ∨ Øx3 ∨ Øx8
x1 ∨ x8 ∨ x12
x2 ∨ x11
Øx7 ∨ Øx3 ∨ x9
Øx7 ∨ x8 ∨ Øx9
x7 ∨ x8 ∨ Øx10
x7 ∨ x10 ∨ Øx12
x1 x1=0,x4= 1
x3 = 1, x8 = 0, x12 = 1x3
x1 = 0
x7 = 1
x2 = 0, x11 = 1 x2
x7
x9 = 0
x7 = 1, x9 = 1, 0
x8 = 0
x11 = 1
x12 = 1
x2 = 0
x3 = 1 ∧ x7 = 1 ∧ x8 = 0 → conflict
(Øx3 ∨ Øx7 ∨ Øx8) is valid