some situations. This was because the
researchers rotated all the objects in
the environment, but not the sun, unexpectedly causing a shadow to fall on
the landing pad in some orientations,
revealing issues in the drone’s vision
system. The researchers solved this
with a more robust vision sensor that
was less sensitive to lighting changes.
Researchers from the University of
Virginia and from Columbia University
tested three different deep neural net-
work (DNN) models for autonomous
driving.
21 The inputs to the models
were pictures from a camera, and the
outputs were steering angles. To verify
the correctness of the outputs, the re-
searchers used a set of MRs based on
the software. Such relations are called
“metamorphic relations” (MRs) and
are necessary properties of the intend-
ed program’s functionality. If, for cer-
tain test cases, an MR is violated, then
the software must be faulty. Consider,
for example, the testing of a search
engine. Suppose the tester entered a
search criterion C1 and the search en-
gine returned 50,000 results. It may not
be easy to verify the accuracy and com-
pleteness of these 50,000 results. Nev-
ertheless, an MR can be identified as
follows: The search results for C1 must
include those for C1 AND C2, where C2
can be any additional condition (such
as a string or a filter). If the actual
search results violate this relation, the
search engine must be at fault. Here,
the search criterion “C1 AND C2” is a
new test case that can be constructed
automatically based on the source test
case “C1” (whereas C2 could be gener-
ated automatically and randomly) and
the satisfaction of the MR can also be
verified automatically through a test-
ing program.
A growing body of research from
both industry and academia has examined the MT concept and proved it
highly effective.
4, 7–9, 13, 18–20 The increasing interest in MT is not only due to it
being able to address the oracle problem and automate test generation but
also the perspective of MT has seldom
been used in previous testing strategies and, as a result, has detected a
large number of previously unknown
faults in many mature systems (such as
the GCC and LLVM compilers),
10, 17 the
Web search engines Google and Bing,
25
and code obfuscators.
5
MT for Testing
Autonomous Machinery
Several research groups have begun to
apply MT to alleviate the difficulties in
testing autonomous systems, yielding
encouraging results.
For example, researchers from the
Fraunhofer Center for Experimental
Software Engineering in College Park,
MD, developed a simulated environment in which the control software of
autonomous drones was tested using
MT.
14 The MRs made use of geometric transformations (such as rotation
and translation) in combination with
different formations of obstacles in
the flying scenarios of the drone. The
Fraunhofer researchers looked for behavioral differences of the drone when
it was flying under these different (
supposedly equivalent) scenarios. Their
MRs required the drone should have
consistent behavior, while finding that
in some situations the drone behaved
inconsistently, revealing multiple software defects. For example, one of the
bugs was in the sense-and-avoid algorithm, making the algorithm sensitive
to certain numerical values and hence
misbehavior under certain conditions,
causing the drone to crash. The researchers detected another bug after
running hundreds of tests using different rotations of the environment:
The drone had landing problems in
Figure 1. Input pictures used in a metamorphic test revealing inconsistent and erroneous
behavior of a DNN ( https://deeplearningtest.github.io/deep Test).
21
(a) original (b) with added rain
Figure 2. MT detected a real-life bug in Google Maps;
3, 20 the origin and destination of the route
were almost at the same point, but Google Maps generated an “optimal” route of 4. 9 miles.