research-article

Open access

Semantic image fuzzing of AI perception systems

Authors:

Sebastian Elbaum,

Kevin Sullivan

ICSE '22: Proceedings of the 44th International Conference on Software Engineering

Pages 1958 - 1969

https://doi.org/10.1145/3510003.3510212

Published: 05 July 2022

Abstract

Perception systems enable autonomous systems to interpret raw sensor readings of the physical world. Testing of perception systems aims to reveal misinterpretations that could cause system failures. Current testing methods, however, are inadequate. The cost of human interpretation and annotation of real-world input data is high, so manual test suites tend to be small. The simulation-reality gap reduces the validity of test results based on simulated worlds. And methods for synthesizing test inputs do not provide corresponding expected interpretations. To address these limitations, we developed semSensFuzz, a new approach to fuzz testing of perception systems based on semantic mutation of test cases that pair real-world sensor readings with their ground-truth interpretations. We implemented our approach to assess its feasibility and potential to improve software testing for perception systems. We used it to generate 150,000 semantically mutated image inputs for five state-of-the-art perception systems. We found that it synthesized tests with novel and subjectively realistic image inputs, and that it discovered inputs that revealed significant inconsistencies between the specified and computed interpretations. We also found that it produced such test cases at a cost that was very low compared to that of manual semantic annotation of real-world images.
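The idea of semantic mutation described in the abstract can be illustrated with a minimal sketch. It is not the semSensFuzz implementation: the `LabeledImage` type, the `add_entity` mutation, the `pixel_agreement` score, and the toy 3x3 grids are all assumptions made for illustration. The point it shows is that a mutation applied to both the image and its ground-truth labels yields a new input that still carries an expected interpretation, which can then be compared with a system's output.

```python
from dataclasses import dataclass
from typing import List

Grid = List[List[int]]

ROAD, CAR = 0, 1  # hypothetical class ids for this toy example


@dataclass
class LabeledImage:
    """A test case: a sensor reading paired with its ground-truth labels."""
    pixels: Grid  # toy grayscale intensities
    labels: Grid  # per-pixel class ids


def add_entity(donor: LabeledImage, target: LabeledImage, class_id: int) -> LabeledImage:
    """Copy every donor pixel labeled class_id into a copy of target.

    Pixels and labels are updated together, so the mutant keeps a
    consistent expected interpretation. Assumes equal grid sizes.
    """
    pixels = [row[:] for row in target.pixels]
    labels = [row[:] for row in target.labels]
    for r, row in enumerate(donor.labels):
        for c, label in enumerate(row):
            if label == class_id:
                pixels[r][c] = donor.pixels[r][c]
                labels[r][c] = class_id
    return LabeledImage(pixels, labels)


def pixel_agreement(predicted: Grid, expected: Grid) -> float:
    """Fraction of pixels where the predicted label equals the expected one."""
    total = sum(len(row) for row in expected)
    agree = sum(
        p == e
        for pred_row, exp_row in zip(predicted, expected)
        for p, e in zip(pred_row, exp_row)
    )
    return agree / total


def naive_segmenter(pixels: Grid) -> Grid:
    """Stand-in system under test that labels every pixel as road."""
    return [[ROAD for _ in row] for row in pixels]


donor = LabeledImage(
    pixels=[[10, 10, 10], [10, 200, 200], [10, 10, 10]],
    labels=[[ROAD, ROAD, ROAD], [ROAD, CAR, CAR], [ROAD, ROAD, ROAD]],
)
target = LabeledImage(
    pixels=[[12] * 3 for _ in range(3)],
    labels=[[ROAD] * 3 for _ in range(3)],
)

mutant = add_entity(donor, target, CAR)
score = pixel_agreement(naive_segmenter(mutant.pixels), mutant.labels)
print(f"agreement with mutated ground truth: {score:.2f}")
```

In this sketch a low agreement score flags an inconsistency between the specified and computed interpretations. A realistic mutation would also need to handle placement, scale, and lighting so that the mutated image stays plausible.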

Published In

ICSE '22: Proceedings of the 44th International Conference on Software Engineering

May 2022

2508 pages

ISBN:9781450392211

DOI:10.1145/3510003

General Chair: Matthew B Dwyer (University of Virginia)

Program Chairs: Daniela Damian (University of Victoria, Canada), Andreas Zeller (CISPA, Germany)

Copyright © 2022 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

NSF (National Science Foundation)

Conference

ICSE '22: 44th International Conference on Software Engineering

May 21 - 29, 2022

Pittsburgh, Pennsylvania

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Bibliometrics & Citations

Article Metrics

Total Citations: 9
Total Downloads: 962
Downloads (Last 12 months): 333
Downloads (Last 6 weeks): 57

Reflects downloads up to 11 Dec 2024

Citations

Cited By

Yu X, Liu L, Hu X, Keung J, Xia X, Lo D, Christakis M, Pradel M (2024). Practitioners’ Expectations on Automated Test Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1618-1630. Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3680386

Zhang J, Keung J, Xiao Y, Liao Y, Li Y, Ma X (2024). UniAda: Universal Adaptive Multiobjective Adversarial Attack for End-to-End Autonomous Driving Systems. IEEE Transactions on Reliability 73:4, 1892-1906. Online publication date: Dec-2024.
https://doi.org/10.1109/TR.2024.3394894

Toledo F, Shriver D, Elbaum S, Dwyer M, Chandra S, Blincoe K, Tonella P (2023). Deeper Notions of Correctness in Image-Based DNNs: Lifting Properties from Pixel to Entities. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2122-2126. Online publication date: 30-Nov-2023.
https://dl.acm.org/doi/10.1145/3611643.3613079

Tang S, Zhang Z, Zhang Y, Zhou J, Guo Y, Liu S, Guo S, Li Y, Ma L, Xue Y, Liu Y (2023). A Survey on Automated Driving System Testing: Landscapes and Trends. ACM Transactions on Software Engineering and Methodology 32:5, 1-62. Online publication date: 24-Jul-2023.
https://dl.acm.org/doi/10.1145/3579642

Plöger S, Meier M, Smith M (2023). A Usability Evaluation of AFL and libFuzzer with CS Students. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-18. Online publication date: 19-Apr-2023.
https://dl.acm.org/doi/10.1145/3544548.3581178

von Stein M, Shriver D, Elbaum S (2023). DeepManeuver: Adversarial Test Generation for Trajectory Manipulation of Autonomous Vehicles. IEEE Transactions on Software Engineering 49:10, 4496-4509. Online publication date: 8-Aug-2023.
https://dl.acm.org/doi/10.1109/TSE.2023.3301443

V R, Dhabliya D, Mathur M, Das S, Kumar R, Rao S (2023). Ameliorating Semantic Search Through Advanced AI Techniques. 2023 3rd International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON), 1-6. Online publication date: 29-Dec-2023.
https://doi.org/10.1109/SMARTGENCON60755.2023.10442780

Christian G, Woodlief T, Elbaum S, Grundy J, Pollock L, Penta M (2023). Generating Realistic and Diverse Tests for LiDAR-Based Perception Systems. Proceedings of the 45th International Conference on Software Engineering, 2604-2616. Online publication date: 14-May-2023.
https://dl.acm.org/doi/10.1109/ICSE48619.2023.00217

Wu M, Lu M, Cui H, Chen J, Zhang Y, Zhang L, Grundy J, Pollock L, Penta M (2023). JITfuzz: Coverage-Guided Fuzzing for JVM Just-in-Time Compilers. Proceedings of the 45th International Conference on Software Engineering, 56-68. Online publication date: 14-May-2023.
https://dl.acm.org/doi/10.1109/ICSE48619.2023.00017
