Figure 4 illustrates the results of our broader experience with EHMs by highlighting three different solvers, each from a different domain. In each case we give plots for both quadratic ridge regression and random forests to show the impact of the learning algorithm. First (column 1), we considered the prominent SAT solver Minisat 2.0 [9], running on a very heterogeneous mix of instances from the international SAT competition. Whereas the competition subdivides instances into categories (“industrial/application,” “handmade/crafted,” and “random”), we merged all instances together.
Likely because of the heterogeneity of the resulting set, quadratic regression performed relatively poorly here. Random forests yielded much more reliable estimates; notably, they can partition the feature space into qualitatively different parts, and they never predict runtimes larger or smaller than the extrema observed in the training data. However, observe that even the less accurate quadratic regression models were usually accurate enough to differentiate between fast and slow runs in this domain; see the sidebar on SATzilla. Second (column 2), we studied the performance of the leading complete TSP solver, Concorde [2], on a widely used suite of rather homogeneous, randomly generated TSP instances [23]. We again see good performance, now for both quadratic regression and random forests. Third (columns 3 and 4), to show the effect of changing only the instance distribution, we consider one solver on
two different distributions. The solver in this case was IBM ILOG CPLEX, a widely used commercial mixed integer programming (MIP) solver. More broadly, we have built EHMs for many other instance distributions (for example, MIP instances encoding problems ranging from machine job allocation to wildlife conservation planning; TSP instances representing drilling circuit boards and traveling between real cities). We have
also studied more than 50 state-of-the-art solvers, both open source projects and proprietary tools developed by industry. Our solvers were both deterministic and randomized, and both complete (that is, guaranteed to find a solution if one exists) and incomplete.
In many cases, we only had access to an
executable of the solver, and in no case
did we make use of knowledge about a
solver’s inner workings. As mentioned earlier, we have also gone beyond quadratic basis function regression to study more than a dozen other statistical modeling techniques, including lasso regression, multivariate adaptive regression splines, support vector machine regression, neural networks, Gaussian processes, regression trees, and random forests (see Leyton-Brown et al. [29] and Hutter et al. [21]). We omit the details here, but state the conclusion: we now prefer random forests of regression trees [6], particularly when the instance distribution is heterogeneous. We briefly describe this model class for completeness, but refer readers to the literature for details [6, 7]. Regression trees are very similar to decision trees (which we used here for predicting satisfiability status). However, as a classification method, decision trees associate categorical labels with each leaf (for example, “satisfiable,” “unsatisfiable”), while regression trees associate a real-valued prediction with each leaf. Random forests report an average over the predictions made by each of an ensemble of regression trees; these trees are made to differ by randomizing the training process.
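
As a concrete illustration, the following minimal Python sketch (using scikit-learn, with synthetic data standing in for real instance features and measured runtimes) fits both model classes discussed above; the feature set and data-generating function are invented purely for illustration and are not the models from our experiments.

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins: 500 "instances" with 4 features each, and a
# target playing the role of (log) runtime.
X = rng.uniform(size=(500, 4))
y = X[:, 0] * X[:, 1] + np.sin(3 * X[:, 2]) + 0.1 * rng.normal(size=500)

# Quadratic ridge regression: expand the raw features to all degree-2
# terms, then fit a linear model with an L2 penalty.
quad = make_pipeline(PolynomialFeatures(degree=2), Ridge(alpha=1.0))
quad.fit(X, y)

# Random forest of regression trees: each tree is trained on a
# randomized version of the problem (bootstrap sample, random split
# candidates), and the forest averages the trees' predictions. Since
# each leaf stores an average of observed targets, predictions can
# never fall outside the range seen in training.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, y)

x_new = rng.uniform(size=(1, 4))
print("quadratic ridge prediction:", quad.predict(x_new)[0])
print("random forest prediction: ", forest.predict(x_new)[0])

The bounded predictions of the forest are exactly the property noted above that makes it robust on heterogeneous instance sets.
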
Application 3. Algorithm Configuration

Imagine that each time the designer of a heuristic algorithm faced a choice about a
given design element, she simply encoded it as a free parameter of a single solver. In the
end, her problem of designing a good algorithm for a given problem domain would be
reduced to the stochastic optimization problem of finding a configuration that achieved
good performance [12, 14, 24].
We have applied automated methods for solving this problem to identify novel
algorithm configurations that have yielded orders of magnitude speedups in a broad
range of domains, including SAT-based formal verification [15], MIP solving [17], and
automated planning [37]. One state-of-the-art method for solving this problem is based on
EHMs: sequential model-based algorithm configuration (SMAC) [19] iterates between using
an EHM to select promising configurations to explore next, executing the algorithm
with these configurations, and updating the model with the resulting information.
SMAC is freely available; please see http://aclib.net/SMAC. EHMs can also be used to
select configurations on a per-instance basis [16].
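
The following minimal Python sketch shows the shape of such a model-based loop; it is not SMAC itself. The hypothetical target algorithm, its two numeric parameters, and the selection rule (taking the candidate with the lowest predicted runtime, where SMAC instead optimizes an expected-improvement criterion over its model) are all simplifications for illustration.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def run_algorithm(config):
    # Hypothetical target algorithm: returns a measured runtime for a
    # configuration. A real setup would launch the solver on benchmark
    # instances and time it.
    return (config[0] - 0.3) ** 2 + 0.5 * abs(config[1] - 0.7) \
        + 0.05 * rng.normal()

dim = 2                              # two numeric parameters in [0, 1]
configs, runtimes = [], []

# Seed the model with a few random configurations.
for _ in range(5):
    c = rng.uniform(size=dim)
    configs.append(c)
    runtimes.append(run_algorithm(c))

for _ in range(25):
    # 1. Fit an EHM mapping configuration -> runtime.
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(np.array(configs), np.array(runtimes))

    # 2. Use the model to pick a promising configuration: here, simply
    #    the random candidate with the lowest predicted runtime.
    candidates = rng.uniform(size=(1000, dim))
    best = candidates[np.argmin(model.predict(candidates))]

    # 3. Run the algorithm and update the data the model is fit on.
    configs.append(best)
    runtimes.append(run_algorithm(best))

incumbent = configs[int(np.argmin(runtimes))]
print("best configuration found:", incumbent)
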
Application 2. Generating Hard Benchmarks

Realistic, hard benchmark distributions are important because they are used as an
objective measure of success in algorithm development. However, it can sometimes be
just as difficult to find new, hard benchmarks as it is to find new strategies for solving
previously hard benchmarks. To fill this gap, EHMs can be used to automatically adjust
existing instance generators so they produce instances that are harder for a given set of
algorithms [26, 29].
We start with an instance generator that has parameters p. Such generators are
often simply used with default parameter settings; however, to search for harder
instances we instead sample each parameter’s value uniformly from a fixed range.
Call the resulting distribution over instances D. Our goal is to sample from a new
distribution D′ over the same instances that weights instances by their hardness for a
given algorithm A. (Think of A as having the best average runtime among all algorithms
in a given set, or as being a SATzilla-style selector among such algorithms.) We can
do this via a form of importance sampling. We construct an EHM for A on D using our standard instance features f; for an instance x, call this model’s prediction Hf(x). We would like to generate a large set of instances from D, weight each instance x in proportion to Hf(x), and then sample a single instance from the set in proportion to the weights. This approach works, but requires a very large number of samples when hard instances are rare in D. To improve performance, we learn a quadratic EHM Hp that uses only the generator parameters p as features. We can then sample instances x in proportion to D(x) ∙ Hp(x) rather than sampling from D (by sampling parameter settings in proportion to the polynomial Hp, and then running the instance generator with the resulting parameters), and then weight each sampled instance x by Hf(x) / Hp(x). Hp thus guides
our search towards harder instances without biasing the weights. In experiments with
the Combinatorial Auction Test Suite [30], this approach increased the mean hardness of generated instances by up to a factor of 100 [26, 29], and often created instances much
harder than we had ever observed using the generators’ default parameters.
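
To make the sampling scheme concrete, here is a minimal Python sketch with a hypothetical one-parameter generator; the generator, the stand-in models Hf and Hp, and all constants are invented, whereas in practice both models would be EHMs fit to measured runtimes.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instance generator: one parameter p in [0, 1] produces
# an instance, which we represent by its feature vector f(x).
def generate_instance(p):
    return np.array([p, p ** 2 + 0.05 * rng.normal()])

# Stand-ins for the two learned EHMs: Hf predicts hardness from
# instance features; Hp is a cheap quadratic model over generator
# parameters only.
def Hf(features):
    return 1.0 + 10.0 * features[1] ** 2

def Hp(p):
    return 1.0 + 8.0 * p ** 2

# Sample parameters in proportion to D(p) * Hp(p), where D is uniform,
# via rejection sampling against an upper bound on Hp (here Hp <= 9).
def sample_param(hp_max=9.0):
    while True:
        p = rng.uniform()
        if rng.uniform() < Hp(p) / hp_max:
            return p

# Draw a pool of instances biased toward predicted-hard regions, then
# reweight by Hf/Hp so the bias does not distort the target weights.
pool = []
for _ in range(1000):
    p = sample_param()
    x = generate_instance(p)
    pool.append((x, Hf(x) / Hp(p)))

weights = np.array([w for _, w in pool])
idx = rng.choice(len(pool), p=weights / weights.sum())
print("selected (hopefully hard) instance features:", pool[idx][0])

Rejection sampling against an upper bound on Hp implements sampling in proportion to D(x) ∙ Hp(x), and the final Hf/Hp weights remove the bias that Hp introduced.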