Proceedings of the IEEE/ACM First International
Workshop on Metamorphic Testing, in conjunction
with the 38th International Conference on Software
Engineering (Austin, TX, May 16). ACM Press, New
10. Le, V., Afshari, M., and Su, Z. Compiler validation via
equivalence modulo inputs. In Proceedings of the 35th
ACM SIGPLAN Conference on Programming Language
Design and Implementation (Edinburgh, U.K., June
9–11). ACM Press, New York, 2014, 216–226.
11. Lee, D. Sensor firm Velodyne ‘baffled’ by Uber self-driving death. BBC News (Mar. 23, 2018); http://www.
12. Levin, S. Uber crash shows ‘catastrophic failure’ of
self-driving technology, experts say. The Guardian
(Mar. 23, 2018); https://www.theguardian.com/
13. Lindvall, M., Ganesan, D., Árdal, R., and Wiegand, R.E.
Metamorphic model-based testing applied on NASA
DAT — An experience report. In Proceedings of the
37th IEEE/ACM International Conference on Software
Engineering (Firenze, Italy, May 16–24). IEEE, 2015,
14. Lindvall, M., Porter, A., Magnusson, G., and Schulze,
C. Metamorphic model-based testing of autonomous
systems. In Proceedings of the Second IEEE/ACM
International Workshop on Metamorphic Testing, in
conjunction with the 39th International Conference on
Software Engineering (Buenos Aires, Argentina, May
22). IEEE, 2017, 35–41.
15. Ohnsman, A. LiDAR maker Velodyne ‘baffled’ by
self-driving Uber’s failure to avoid pedestrian. Forbes
(Mar. 23, 2018); https://www.forbes.com/sites/
16. Posky, M. LiDAR supplier defends hardware, blames
Uber for fatal crash. The Truth About Cars (Mar. 23,
17. Regehr, J. Finding Compiler Bugs by Removing Dead
Code. Blog, June 20, 2014; http://blog.regehr.org/
18. Segura, S., Fraser, G., Sanchez, A.B., and Ruiz-Cortés, A. A survey on metamorphic testing. IEEE
Transactions on Software Engineering 42, 9 (Sept.
19. Segura, S. and Zhou, Z.Q. Metamorphic testing:
Introduction and applications. ACM SIGSOFT
webinar, Sept. 27, 2017; https://event.on24.com/wcc/r/
20. Segura, S. and Zhou, Z.Q. Metamorphic testing 20
years later: A hands-on introduction. In Proceedings
of the 40th IEEE/ACM International Conference on
Software Engineering (Gothenburg, Sweden, May 27–
June 3, 2018). ACM Press, New York, 2018.
21. Tian, Y., Pei, K., Jana, S., and Ray, B. DeepTest:
Automated testing of deep neural network-driven
autonomous cars. In Proceedings of the 40th
IEEE/ACM International Conference on Software
Engineering (Gothenburg, Sweden, May 27–June 3,
2018). ACM Press, New York, 2018.
22. Vassilev, A. and Celi, C. Avoiding cyberspace
catastrophes through smarter testing. Computer 47,
10 (Oct. 2014), 102–106.
23. Velodyne, Velodyne’s HDL-64E: A High-Definition
LiDAR Sensor for 3-D Applications, White Paper, 2007;
24. Zhou, Z.Q., Towey, D., Poon, P.-L., and Tse, T.H.
Introduction to the special issue on test oracles.
Journal of Systems and Software 136 (Feb. 2018), 187;
25. Zhou, Z.Q., Xiang, S., and Chen, T.Y. Metamorphic
testing for software quality assessment: A study
of search engines. IEEE Transactions on Software
Engineering 42, 3 (Mar. 2016), 264–284.
Zhi Quan Zhou (firstname.lastname@example.org) is an associate
professor in software engineering at the School of
Computing and Information Technology, University of
Wollongong, Wollongong, NSW, Australia.
Liqun Sun (email@example.com) is pursuing an
M.Phil. degree in computer science at the University of
Wollongong, Wollongong, NSW, Australia, and is a software
engineer at Itree, Wollongong, Australia.
© 2019 ACM 0001-0782/19/3
html). Figure 3b to Figure 3e summarize the test results of these categories,
and Figure 3a shows the overall results
corresponding to the Table.
Each vertical column in Figure 3 includes a subsection in blue,
corresponding to MR1 violations. They are labeled with the actual
numbers of |O| > |O′| cases. We observed that all these numbers
were greater than 0, indicating critical errors in the perception of
all four types of obstacles: car, pedestrian, cyclist, and unknown.
Relatively speaking, the error rate of the “car” category was
greatest, followed by “pedestrian,” “cyclist,” and “unknown.”
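
The check behind these counts can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the rectangular `ROI` representation, the `perceive` callable standing in for the obstacle perception module, and the `extent` parameter are all assumptions made for the example. It builds a follow-up frame by adding n random points outside the ROI and counts the frames for which |O| > |O′|.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]       # (x, y, z) coordinates
ROI = Tuple[float, float, float, float]  # assumed box: (x_min, x_max, y_min, y_max)


def inside_roi(p: Point, roi: ROI) -> bool:
    x_min, x_max, y_min, y_max = roi
    return x_min <= p[0] <= x_max and y_min <= p[1] <= y_max


def add_points_outside_roi(frame: Sequence[Point], roi: ROI, n: int,
                           rng: random.Random, extent: float = 100.0) -> List[Point]:
    """Return the source frame plus n random points that lie outside the ROI.

    Assumes the ROI is strictly smaller than the [-extent, extent] square,
    otherwise no outside point could ever be sampled.
    """
    extra: List[Point] = []
    while len(extra) < n:
        p = (rng.uniform(-extent, extent),
             rng.uniform(-extent, extent),
             rng.uniform(0.0, 3.0))
        if not inside_roi(p, roi):
            extra.append(p)
    return list(frame) + extra


def count_mr1_violations(frames: Sequence[Sequence[Point]],
                         perceive: Callable[[Sequence[Point]], list],
                         roi: ROI, n: int, seed: int = 0) -> int:
    """Count frames where the follow-up output has fewer obstacles: |O| > |O'|."""
    rng = random.Random(seed)
    violations = 0
    for frame in frames:
        source_obstacles = perceive(frame)                      # O
        follow_up = add_points_outside_roi(frame, roi, n, rng)
        follow_up_obstacles = perceive(follow_up)               # O'
        if len(source_obstacles) > len(follow_up_obstacles):
            violations += 1
    return violations
```

No oracle for the correct obstacle list is needed here: the check only compares the two outputs of the system under test, which is what makes the relation usable on unlabeled frames.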
Figure 4a and Figure 4b show a real-world example revealed by our
test, whereby three cars inside the ROI could not be detected after
we added 1,000 random points outside the ROI. Figure 4c and
Figure 4d show another example, whereby a pedestrian inside the
ROI (the Apollo system depicted this pedestrian with the small pink
mark in Figure 4c) could not be detected after we added only 10
random points outside the ROI; as shown in Figure 4d, the small
pink mark was missing. As mentioned earlier, we reported the bug
to the Baidu Apollo self-driving car team on March 10, 2018. On
March 19, 2018, the Apollo team confirmed the error by
acknowledging “It might happen” and suggested “For cases like
that, models can be fine tuned using data augmentation”; data
augmentation is a technique that alleviates the problem of lack of
training data in machine learning by inflating the training set
through transformations of the existing data. Our failure-causing
metamorphic test cases (those with the random points) could thus
serve this purpose.
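
As a rough sketch of how failure-causing test cases could feed data augmentation, the Python below appends each failing follow-up frame to a training set, reusing the labels of the frame it was derived from. The data layout (`Example` pairs, string labels, index-based `failing_cases`) is hypothetical, and the label reuse rests on the assumption that points added outside the ROI leave the obstacles inside it unchanged. It does not describe how the Apollo team trains its models.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Frame = List[Point]
Labels = List[str]               # hypothetical label format, e.g. ["car", "pedestrian"]
Example = Tuple[Frame, Labels]


def augment_with_failing_cases(
    training_set: Sequence[Example],
    failing_cases: Sequence[Tuple[int, Frame]],
) -> List[Example]:
    """Return a new training set that also contains the failing follow-up frames.

    Each failing case is (index of its source example, follow-up frame).
    The follow-up frame inherits the source labels (assumption: noise outside
    the ROI does not change the ground truth inside it).
    """
    augmented = list(training_set)
    for source_index, follow_up_frame in failing_cases:
        _, labels = training_set[source_index]
        augmented.append((follow_up_frame, list(labels)))
    return augmented
```

The original training set is left untouched, so the augmented set can be compared against it when retraining.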
The 2018 Uber fatal crash in Tempe, AZ, revealed the inadequacy
of conventional testing approaches for mission-critical
autonomous systems. We have shown MT can help address this
limitation and enable automatic detection of fatal errors in
self-driving vehicles that operate on either conventional
algorithms or deep learning models. We have introduced an
innovative testing strategy that combines MT with fuzzing,
reporting how we used it to detect previously unknown fatal errors
in the real-life LiDAR obstacle perception system of Baidu’s
Apollo self-driving software.
The scope of our study was limited
to LiDAR obstacle perception. Apart
from LiDAR, an autonomous vehicle
may also be equipped with radar. According to the Apollo website
(http://data.apollo.auto), “Radar could precisely estimate the velocity of moving
obstacles, while LiDAR point cloud
could give a better description of object shape and position.” Moreover,
there can also be cameras, which
are particularly useful for detecting
visual features (such as the color of
traffic lights). Our testing technique
can be applied to radar, camera, and
other types of sensor data, as well as
obstacle-fusion algorithms involving
multiple sensors. In future research,
we plan to collaborate with industry to
develop MT-based testing techniques,
combined with existing verification
and validation methods, to make driverless vehicles safer.
This work was supported in part
by a linkage grant from the Australian Research Council, project ID:
LP160101691. We would like to thank
Suzhou Insight Cloud Information
Technology Co., Ltd., for supporting
1. Baidu, Inc. Apollo Reference Hardware, Mar. 2018;
2. Barr, E.T., Harman, M., McMinn, P., Shahbaz, M., and
Yoo, S. The oracle problem in software testing: A
survey. IEEE Transactions on Software Engineering
41, 5 (May 2015), 507–525.
3. Brown, J., Zhou, Z.Q., and Chow, Y.-W. Metamorphic
testing of navigation software: A pilot study with
Google Maps. In Proceedings of the 51st Annual Hawaii
International Conference on System Sciences (Big
Island, HI, Jan. 3–6, 2018) 5687–5696; http://hdl.
4. Chen, T.Y., Kuo, F.-C., Liu, H., Poon, P.-L., Towey, D., Tse,
T.H., and Zhou, Z.Q. Metamorphic testing: A review
of challenges and opportunities. ACM Computing
Surveys 51, 1 (Jan. 2018), 4:1–4:27.
5. Chen, T.Y., Kuo, F.-C., Ma, W., Susilo, W., Towey, D.,
Voas, J., and Zhou, Z.Q. Metamorphic testing for
cybersecurity. Computer 49, 6 (June 2016), 48–55.
6. Chen, T.Y., Tse, T.H., and Zhou, Z.Q. Fault-based testing
without the need of oracles. Information and Software
Technology 45, 1 (2003), 1–9.
7. Donaldson, A.F., Evrard, H., Lascu, A., and Thomson,
P. Automated testing of graphics shader compilers.
Proceedings of the ACM on Programming Languages 1
(2017), 93:1–93:29.
8. Jarman, D.C., Zhou, Z.Q., and Chen, T.Y. Metamorphic
testing for Adobe data analytics software. In
Proceedings of the IEEE/ACM Second International
Workshop on Metamorphic Testing, in conjunction
with the 39th International Conference on Software
Engineering (Buenos Aires, Argentina, May 22). IEEE,
2017, 21–27; https://doi.org/10.1109/MET.2017.1
9. Kanewala, U., Pullum, L.L., Segura, S., Towey, D., and
Zhou, Z.Q. Message from the workshop chairs. In