exhibited a difference between A and B,
and we changed either A or B (or both),
depending on which response to σ
was considered unsatisfactory behavior. To keep the approach scalable, the implementations
were learned and checked iteratively with increasing sets of stimuli.
Issues were found in both the refactored and the legacy implementation
in an early stage, before the component was integrated. In this way, costly
rework in a later phase of the development was avoided.
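The iterative compare-and-refine loop described here can be sketched in a few lines of Python. The two "implementations" and their learned models are invented one-state Mealy machines, and a bounded-depth search stands in for the real equivalence checker used in the project; all names are ours.

```python
from itertools import product

# Toy "implementations" A, B and their learned models MA, MB, encoded as
# Mealy machines: {(state, input): (output, next_state)}. Here learning is
# assumed to have produced exact models, so the loop ends in one round.
A  = {(0, "a"): ("0", 0), (0, "b"): ("0", 0)}
B  = {(0, "a"): ("0", 0), (0, "b"): ("1", 0)}
MA = dict(A)
MB = dict(B)

def run(m, word):
    """Run an input word through machine m and return the output trace."""
    s, trace = 0, []
    for c in word:
        o, s = m[(s, c)]
        trace.append(o)
    return tuple(trace)

def find_counterexample(m1, m2, alphabet=("a", "b"), depth=3):
    """Bounded-depth stand-in for a real equivalence checker."""
    for n in range(1, depth + 1):
        for w in product(alphabet, repeat=n):
            if run(m1, w) != run(m2, w):
                return w
    return None

sigma = find_counterexample(MA, MB)
if sigma is None:
    verdict = "models equivalent"
elif run(A, sigma) == run(MA, sigma) and run(B, sigma) == run(MB, sigma):
    # Both models are accurate on sigma, so sigma exposes a genuine
    # behavioral difference between the two implementations.
    verdict = f"real difference between A and B on {sigma}"
else:
    # A model disagrees with its implementation: refine it using sigma.
    verdict = f"model inaccurate; refine with counterexample {sigma}"
```

In this toy run the checker finds the counterexample ("b",), both models agree with their implementations on it, and the loop reports a real difference between A and B.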
Recent Advances
In recent years, significant progress has been made on algorithms for
model learning, which is crucial for
scaling the application of these techniques to larger systems.
Basic algorithms. Since 1987, Angluin's
L* algorithm6 has been considerably improved. The original L*
performs a membership query for each
entry in the observation table. This is
often redundant, given that the sole
purpose of membership queries is the
distinction of states (rows). Therefore,
Kearns and Vazirani28 replaced the observation table of the L* algorithm by a
so-called discrimination tree, which
is essentially a decision tree for deciding equivalence of states.
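To illustrate the idea, here is a minimal Python sketch of sifting an access string through such a tree. The hidden machine, the tree shape, and all names are invented for illustration: each inner node holds a distinguishing suffix, and classifying a state requires only the output queries along one root-to-leaf path, rather than one membership query per table entry.

```python
# A toy "hidden" Mealy machine over inputs "a","b": transition and output maps.
trans = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 2, (1, "b"): 0,
         (2, "a"): 2, (2, "b"): 1}
out = {(0, "a"): "x", (0, "b"): "y", (1, "a"): "x", (1, "b"): "x",
       (2, "a"): "y", (2, "b"): "x"}

def query(word):
    """Output query: feed `word` to the machine, return the last output."""
    s, o = 0, None
    for c in word:
        o, s = out[(s, c)], trans[(s, c)]
    return o

# Discrimination tree: inner nodes are (suffix, children-by-output) pairs,
# leaves are the access strings of the states discovered so far.
tree = ("a", {"x": ("b", {"y": "",      # "" reaches state 0
                          "x": "a"}),   # "a" reaches state 1
              "y": "aa"})               # "aa" reaches state 2

def sift(access):
    """Classify the state reached by `access` with queries along one path."""
    node = tree
    while isinstance(node, tuple):
        suffix, children = node
        node = children[query(access + suffix)]
    return node
```

Sifting "ab", for example, asks only two queries ("aba" and "abb") and lands on the leaf "", correctly identifying the initial state.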
Another inefficiency of L* is that all
prefixes of a counterexample are added
as rows to the table. Counterexamples
obtained through conformance testing or runtime monitoring may be
extremely long and are rarely minimal,
which results in numerous redundant membership queries. Rivest and
Schapire33 observed that, instead of
adding all prefixes of a counterexample
as rows to the table, it suffices to add a
single, well-chosen suffix as a column.
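This observation can be sketched concretely (toy teacher, hypothesis, and all names invented; the hypothesis merges two target states): replacing ever-longer prefixes of the counterexample by the access string of the hypothesis state they reach must flip the teacher's answer somewhere, and a binary search over that flip point yields one distinguishing suffix with only a logarithmic number of queries.

```python
# Toy target ("teacher") Mealy machine over inputs "a","b".
trans = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 2, (1, "b"): 0,
         (2, "a"): 2, (2, "b"): 1}
out = {(0, "a"): "x", (0, "b"): "y", (1, "a"): "x", (1, "b"): "x",
       (2, "a"): "y", (2, "b"): "x"}

def query(word):
    """Output query: feed `word` to the target, return the last output."""
    s, o = 0, None
    for c in word:
        o, s = out[(s, c)], trans[(s, c)]
    return o

# A two-state hypothesis that wrongly merges target states 1 and 2;
# `access` maps each hypothesis state to its access string.
h_trans = {("h0", "a"): "h1", ("h0", "b"): "h0",
           ("h1", "a"): "h1", ("h1", "b"): "h0"}
access = {"h0": "", "h1": "a"}

def hyp_state(word):
    s = "h0"
    for c in word:
        s = h_trans[(s, c)]
    return s

def distinguishing_suffix(w):
    """Rivest-Schapire: binary-search a genuine counterexample w for one
    suffix to add as a column, instead of adding all prefixes as rows."""
    def d(i):  # teacher's answer after replacing w[:i] by an access string
        return query(access[hyp_state(w[:i])] + w[i:])
    lo, hi = 0, len(w) - 1  # d(lo) and d(hi) differ since w is a counterexample
    while hi - lo > 1:
        mid = (lo + hi) // 2
        lo, hi = (mid, hi) if d(mid) == d(lo) else (lo, mid)
    return w[hi:]
```

For the counterexample "aaa" (target answers "y", hypothesis predicts "x"), the search returns the single suffix "a", which indeed separates the two target states the hypothesis had merged.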
The new TTT algorithm of Isberner
et al. 24, 25 is currently the most efficient
algorithm for active learning. The algo-
rithm builds on the ideas of Kearns and
Vazirani28 and Rivest and Schapire33 but
eliminates overly long discrimination
trees, which may arise when processing
long counterexamples, by cleaning up
the internal data structures and reorga-
nizing the discrimination tree. Suppose
that a Mealy machine M has n states and
k inputs, and that the length of the longest
counterexample returned by the teacher
is m. Then, in the worst case, TTT requires
O(n) equivalence queries and O(kn² + n log m) membership queries.
Legacy systems have been defined as large software systems
"that we do not know how to cope with
but that are vital to our organization."7
Typically, these systems are based on
obsolete technologies, documentation
is limited, and the original developers
are no longer available. In addition,
existing regression tests are typically limited.
Given these characteristics, innovations
that require changes of legacy compo-
nents are risky. Several techniques have
been developed to extract the crucial
business information hidden in legacy
components, and to support the con-
struction of refactored implementa-
tions. Margaria et al. 30 were the first to
point out that model learning may help
to increase confidence that a legacy
component and a refactored imple-
mentation have the same behavior.
Schuts et al., 35 for instance, used
model learning to support the rejuvenation of legacy embedded software
in a development project at Philips.
The project concerned the introduc-
tion of a new hardware component,
the Power Control Component (PCC),
which is used to start up and shut
down an interventional radiology
system. All computers in the system
have a software component, the Power
Control Service (PCS), which communicates with the PCC over an internal
control network during the execution
of start-up and shutdown scenarios.
To deal with the new hardware of the
PCC, which has a different interface,
a new implementation of the PCS was
needed. Since different configurations
had to be supported, with old and new
PCC hardware, the old and new PCS
software needed to have exactly the
same external behavior. Figure 9 illus-
trates the approach that was followed.
From both the legacy implementation
A and the refactored implementation
B, Mealy machine models MA and MB,
respectively, were obtained using model
learning. These models were then compared
using an equivalence checker. When
the equivalence checker found a counterexample σ, we checked whether
A and MA behaved the same on input
σ and whether B and MB behaved the
same on input σ. If there was a discrep-
ancy between A and MA, or between B
and MB, then we asked the learner to
construct an improved model based
on counterexample σ. Otherwise σ
Figure 9. Approach to compare legacy component and refactored implementation (diagram
courtesy of Schuts et al. 35).
Figure 10. A register automaton.
[Diagram: locations l0 (initial), l1, and l2, with registers υ and w; transitions Push(in)/OK store in (υ:=in or w:=in), Pop/Out(out) returns out = υ (setting υ:=w in l2), guards compare in with υ (in ≠ υ vs. in = υ), and Push/KO and Pop/KO mark rejected operations.]
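Reading off the transition labels, the automaton of Figure 10 appears to model a FIFO-set with capacity two: Push stores a value only if it is fresh and there is room, and Pop returns the oldest stored value. A minimal behavioral sketch in Python (the class and method names are ours):

```python
class FifoSet:
    """FIFO-set of bounded capacity: duplicates and overflow are rejected
    with "KO"; Pop returns the oldest value, mirroring register υ."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.store = []  # plays the role of registers υ and w

    def push(self, value):
        if value in self.store or len(self.store) == self.capacity:
            return "KO"   # duplicate or full: Push(in)/KO
        self.store.append(value)
        return "OK"       # Push(in)/OK, value stored in a register

    def pop(self):
        if not self.store:
            return "KO"   # empty store: Pop/KO
        return f"Out({self.store.pop(0)})"  # oldest value first
```

For example, pushing 5, 5, 7, 9 yields OK, KO, OK, KO, after which two Pops return Out(5) and Out(7) and a third fails with KO.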