returns x. Now the verifier checks if
setting f (x, y) = x satisfies the logical
specification; that is, it checks
the validity of the formula
(x ≥ x) ∧ (x ≥ y) ∧ (x ∈ {x, y}). This formula does not hold for all values of x
and y, and the verifier returns one
such counterexample, say, (x = 0, y =
1). This input is added to the set
Examples, and the learner now needs
to find a solution for f that satisfies
the specification at least for this
input. The learner can possibly return
the expression y, and when the verifier checks the validity of this answer
it returns (x = 1, y = 0) as a counterexample. Figure 2 shows the expressions learnt and the corresponding
counterexamples produced by the
verifier in successive iterations. For
instance, in iteration 4, the learner
attempts to find a candidate solution
that satisfies the specification for the
inputs (0, 1), (1, 0), and (0, 0), and f (x,
y) = x + y is indeed such a plausible
answer. The verifier then checks the
validity of (x + y ≥ x) ∧ (x + y ≥ y) ∧ (x +
y ∈ {x, y}) and returns (x = 1, y = 1) as a
counterexample. In the subsequent
iteration, when the learner attempts
to find a solution that fits all the four
inputs currently in Examples, the
shortest expression possible is ITE((x ≤ y), y, x), which the verifier finds to be
a valid solution.
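This loop is easy to render in code. The sketch below, in Python with the Z3 SMT solver (package z3-solver), uses Z3 as the verifier and, in place of a real learner, simply picks the first expression consistent with the inputs collected so far from a small pool that roughly mirrors Figure 2; the particular counterexamples Z3 produces may differ from those shown above.

    from z3 import (Ints, IntVal, If, And, Or, Not, Solver,
                    substitute, simplify, is_true, sat)

    x, y = Ints("x y")

    def spec(e):
        # f(x, y) >= x  /\  f(x, y) >= y  /\  f(x, y) in {x, y}, with f(x, y) := e
        return And(e >= x, e >= y, Or(e == x, e == y))

    def holds_on(e, vx, vy):
        # Does candidate e satisfy the specification on the concrete input (vx, vy)?
        return is_true(simplify(substitute(spec(e), (x, IntVal(vx)), (y, IntVal(vy)))))

    def verify(e):
        # Spec[f/e] is valid iff its negation is unsatisfiable;
        # a model of the negation is a counterexample input.
        s = Solver()
        s.add(Not(spec(e)))
        if s.check() == sat:
            m = s.model()
            return (m.eval(x, model_completion=True).as_long(),
                    m.eval(y, model_completion=True).as_long())
        return None

    # Illustrative stand-in for the learner's search space (not a real learner).
    candidates = [("x", x), ("y", y), ("1", IntVal(1)),
                  ("x + y", x + y), ("ITE(x <= y, y, x)", If(x <= y, y, x))]

    examples = []                                   # the set Examples
    while True:
        # Learner: first candidate consistent with all inputs seen so far.
        name, e = next((n, c) for n, c in candidates
                       if all(holds_on(c, vx, vy) for vx, vy in examples))
        cex = verify(e)                             # Verifier
        if cex is None:
            print("valid solution:", name)
            break
        print("candidate", name, "refuted by input", cex)
        examples.append(cex)

The loop terminates here only because the pool happens to contain a correct implementation of max; the learners discussed below search the space of candidate expressions instead.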
Given a candidate solution for the
desired function, checking whether it
satisfies the logical specification for
all inputs is a standard verification
problem, and we can rely upon a
mature verification technology such
as SMT solvers for this purpose. 5
Learning an expression from the set
Exp of candidate expressions that satisfies the specification for the current
inputs in Examples is a new challenge,
and has been the focus of research in
design and implementation of SyGuS
solvers.
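For instance, checking the very first candidate f (x, y) = x reduces to a single query: the formula (x ≥ x) ∧ (x ≥ y) ∧ (x ∈ {x, y}) is valid exactly when its negation is unsatisfiable, and any satisfying assignment of the negation is a counterexample. A minimal sketch of this check with Z3's Python API (the specific counterexample returned may differ from the one used above):

    from z3 import Ints, If, And, Or, Not, Solver, sat

    x, y = Ints("x y")

    def check(e):
        # Valid iff the negated specification (with f(x, y) replaced by e) is unsat.
        s = Solver()
        s.add(Not(And(e >= x, e >= y, Or(e == x, e == y))))
        return s.model() if s.check() == sat else "valid"

    print(check(x))                   # some counterexample with y > x, such as x = 0, y = 1
    print(check(If(x <= y, y, x)))    # "valid": the negation is unsatisfiable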
Enumerative search. Given a set
Exp of candidate expressions specified by a (context-free) grammar, a
finite set Examples of inputs, and a
logical specification Spec, the learning problem is to find an expression e
in Exp such that Spec[f/e] is satisfied
for all inputs in Examples.
Furthermore, we want to find the simplest such expression.
The simplest solution to the
learning problem is based on enumerating all expressions in Exp one by
one in increasing order of size, and
checking for each one if it satisfies
Spec for all inputs in Examples. Since
the number of expressions grows
exponentially with the size, we need
some heuristics to prune the search
space. An optimization that turns out
to be effective is based on a notion of
equivalence among expressions with
respect to the given set of inputs. Let
us say that two expressions e1 and e2
are Examples-equivalent if for all
inputs in Examples, e1 and e2 evaluate
to the same value. Notice that if e is an
expression that contains e1 as a subexpression, and if we obtain e′ by substituting e1 by another expression e2 that
is Examples-equivalent to e1, then e′ is
guaranteed to be Examples-equivalent
to e. As a result, the enumeration algorithm maintains a list of only inequivalent expressions. To construct the
next expression, it uses only the
expressions from this list as potential
subexpressions, and as a new expression is constructed, it first checks if it
is equivalent to one already in the list,
and if so, discards it.
To illustrate the algorithm, suppose the logical specification is Spec2,
the set of expressions is Exp1, and the
current set of Examples contains a
single input (x = 0, y = 1) (as noted earlier). The job of the learning algorithm
is to find an expression e that satisfies
Spec2 for x = 0 and y = 1, that is, e(0, 1)
> 1. Two expressions e1 and e2 are equivalent in this case if e1(0, 1) = e2(0, 1). The enumerator starts by listing expressions
of size 1 one by one. The first expression considered is x. It is added to the
list, and since it does not satisfy the
specification, the search continues.
The next expression is y, which is
inequivalent to x and does not satisfy
the specification, so is added to the
list and the search continues. The
next expression is 0, which turns out
to be equivalent to x (both evaluate to
0 for the input (0, 1)), and is hence discarded. The next expression is 1, which is also discarded as it is equivalent to y. Next the algorithm considers
the expressions generated by the
application of the rule E + E. The algorithm considers only x and y as potential subexpressions at this step, and thus examines only x + x, x + y, y + x, and y + y. The first three of these are Examples-equivalent to x or y and are discarded, while y + y evaluates to 2 on the input (0, 1), satisfies the specification, and is returned.
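The algorithm just described can be sketched in a few dozen lines of Python; the grammar E ::= x | y | 0 | 1 | E + E and the helper names (learn, spec_on_input) below are illustrative rather than taken from any particular SyGuS solver, and the point-wise specification check encodes the constraint e(0, 1) > 1 from the walk-through above.

    def learn(examples, spec_on_input, max_size=6):
        # spec_on_input(v, x, y): does output v on input (x, y) satisfy the spec?
        leaves = [("x", lambda x, y: x), ("y", lambda x, y: y),
                  ("0", lambda x, y: 0), ("1", lambda x, y: 1)]
        pool = {}      # size -> [(text, fn)] for kept (inequivalent) expressions
        seen = set()   # evaluation signatures of kept expressions on Examples

        def consider(size, text, fn):
            sig = tuple(fn(x, y) for (x, y) in examples)
            if sig in seen:            # Examples-equivalent to a kept expression
                return None
            seen.add(sig)
            pool.setdefault(size, []).append((text, fn))
            if all(spec_on_input(v, x, y) for v, (x, y) in zip(sig, examples)):
                return text            # first, hence smallest, satisfying expression
            return None

        for size in range(1, max_size + 1):
            if size == 1:
                for text, fn in leaves:
                    if (found := consider(1, text, fn)):
                        return found
            for lsize in range(1, size - 1):       # rule E + E
                rsize = size - 1 - lsize
                for lt, lf in pool.get(lsize, []):
                    for rt, rf in pool.get(rsize, []):
                        expr = lambda x, y, lf=lf, rf=rf: lf(x, y) + rf(x, y)
                        if (found := consider(size, f"({lt} + {rt})", expr)):
                            return found
        return None

    # A single input (x = 0, y = 1), on which Spec2 requires the output to exceed 1.
    print(learn([(0, 1)], lambda v, x, y: v > 1))   # prints (y + y)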
We conclude the discussion of the
enumerative search algorithm with a
few observations. First, if the set Exp
is unbounded, the algorithm may
simply keep enumerating expressions of larger and larger size without
ever finding one that satisfies the
specification. Second, if we know (or
impose) a bound k on the depth of the
expression we are looking for, the
number of possible expressions is
exponential in k. The equivalence-
based pruning leads to significant
savings, but the exponential dependence remains. Finally, to translate
the idea described above to an actual
algorithm that works for the set of
expressions described by a context-
free grammar some fine-tuning is
needed. For example, consider the grammar for the set Exp2 of linear expressions with conditionals: the algorithm needs to enumerate
(inequivalent) expressions generated
by both non-terminals T and C con-
currently by employing a dynamic
programming strategy (see Alur et al. 2
and Udupa et al. 32).
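A rough sketch of such a concurrent, size-indexed enumeration is shown below; the productions (T for terms, C for conditions) are illustrative stand-ins for Exp2, and the pruning again keeps only Examples-inequivalent expressions per non-terminal.

    from itertools import product

    # T ::= x | y | 0 | 1 | T + T | ITE(C, T, T)        C ::= T <= T
    def grow_pools(examples, max_size):
        pools = {"T": {}, "C": {}}    # non-terminal -> size -> [(signature, text)]
        seen = {"T": set(), "C": set()}

        def add(nt, size, sig, text):
            sig = tuple(sig)
            if sig in seen[nt]:       # Examples-equivalent expression already kept
                return
            seen[nt].add(sig)
            pools[nt].setdefault(size, []).append((sig, text))

        for name, fn in [("x", lambda x, y: x), ("y", lambda x, y: y),
                         ("0", lambda x, y: 0), ("1", lambda x, y: 1)]:
            add("T", 1, [fn(x, y) for (x, y) in examples], name)

        for size in range(2, max_size + 1):
            for ls in range(1, size - 1):          # binary rules T + T and T <= T
                rs = size - 1 - ls
                for (le, lt), (re, rt) in product(pools["T"].get(ls, []),
                                                  pools["T"].get(rs, [])):
                    add("T", size, (a + b for a, b in zip(le, re)), f"({lt} + {rt})")
                    add("C", size, (a <= b for a, b in zip(le, re)), f"({lt} <= {rt})")
            for cs in range(1, size - 2):          # ternary rule ITE(C, T, T)
                for ts in range(1, size - 1 - cs):
                    es = size - 1 - cs - ts
                    for (ce, ct), (te, tt), (ee, et) in product(
                            pools["C"].get(cs, []), pools["T"].get(ts, []),
                            pools["T"].get(es, [])):
                        add("T", size, (t if c else e for c, t, e in zip(ce, te, ee)),
                            f"ITE({ct}, {tt}, {et})")
        return pools

    # With the four inputs from the earlier walk-through, the expression
    # ITE((x <= y), y, x) appears among the inequivalent terms of size 6.
    pools = grow_pools([(0, 1), (1, 0), (0, 0), (1, 1)], 6)
    print(any(t == "ITE((x <= y), y, x)" for _, t in pools["T"].get(6, [])))   # True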
An Infrastructure for Solvers
In the world of constraint solving, the
standardization of the input format,
collection and categorization of a
large number of benchmarks, access
to open-source computational infrastructure, and organization of an
annual competition of solvers had a
transformative impact on both the
development of powerful computational techniques and the practical
applications to diverse problems. 5