DOI: 10.1145/3510003.3510212
research-article
Open access

Semantic image fuzzing of AI perception systems

Published: 05 July 2022

Abstract

Perception systems enable autonomous systems to interpret raw sensor readings of the physical world. Testing of perception systems aims to reveal misinterpretations that could cause system failures. Current testing methods, however, are inadequate. The cost of human interpretation and annotation of real-world input data is high, so manual test suites tend to be small. The simulation-reality gap reduces the validity of test results based on simulated worlds. And methods for synthesizing test inputs do not provide corresponding expected interpretations. To address these limitations, we developed semSensFuzz, a new approach to fuzz testing of perception systems based on semantic mutation of test cases that pair real-world sensor readings with their ground-truth interpretations. We implemented our approach to assess its feasibility and potential to improve software testing for perception systems. We used it to generate 150,000 semantically mutated image inputs for five state-of-the-art perception systems. We found that it synthesized tests with novel and subjectively realistic image inputs, and that it discovered inputs that revealed significant inconsistencies between the specified and computed interpretations. We also found that it produced such test cases at a cost that was very low compared to that of manual semantic annotation of real-world images.
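
The core idea in the abstract, mutating a sensor reading and its ground-truth interpretation together so that the mutant still carries an expected result, can be illustrated with a minimal sketch. This is not the semSensFuzz implementation: the grayscale grid, the class names "car" and "road", the `insert_object` mutation, and the thresholding "perception system" are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]   # toy grayscale image (assumption: real inputs are camera images)
Labels = List[List[str]]  # per-pixel ground-truth class names


@dataclass
class TestCase:
    image: Image
    labels: Labels


def insert_object(target: TestCase, donor: TestCase, cls: str, dx: int = 0) -> TestCase:
    """Copy every donor pixel labelled `cls` into a copy of `target`,
    shifted by `dx` columns, updating image and labels in lockstep."""
    image = [row[:] for row in target.image]
    labels = [row[:] for row in target.labels]
    for y, row in enumerate(donor.labels):
        for x, name in enumerate(row):
            nx = x + dx
            if name == cls and y < len(image) and 0 <= nx < len(image[y]):
                image[y][nx] = donor.image[y][x]
                labels[y][nx] = cls
    return TestCase(image, labels)


def agreement(expected: Labels, predicted: Labels) -> float:
    """Fraction of pixels where the computed interpretation matches the specified one."""
    total = sum(len(row) for row in expected)
    same = sum(e == p for er, pr in zip(expected, predicted) for e, p in zip(er, pr))
    return same / total


def threshold_perception(image: Image) -> Labels:
    """Hypothetical system under test: bright pixels are 'car', the rest 'road'."""
    return [["car" if v > 128 else "road" for v in row] for row in image]


# An empty road scene and a donor scene containing a two-pixel "car".
target = TestCase([[10, 10, 10], [10, 10, 10]],
                  [["road", "road", "road"], ["road", "road", "road"]])
donor = TestCase([[10, 200, 10], [10, 100, 10]],
                 [["road", "car", "road"], ["road", "car", "road"]])

mutant = insert_object(target, donor, "car", dx=1)
score = agreement(mutant.labels, threshold_perception(mutant.image))
print(f"agreement with mutated ground truth: {score:.3f}")
```

Because the mutation edits the labels alongside the pixels, no human has to annotate the mutant; a low agreement score flags an input where the computed interpretation diverges from the specified one. Here the toy system misses the darker car pixel, so the score falls below 1.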

Published In

ICSE '22: Proceedings of the 44th International Conference on Software Engineering
May 2022
2508 pages
ISBN: 9781450392211
DOI: 10.1145/3510003
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. autonomous systems
  2. perception
  3. semantic fuzzing

Qualifiers

  • Research-article

Conference

ICSE '22
Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 333
  • Downloads (Last 6 weeks): 57
Reflects downloads up to 11 Dec 2024

Cited By

  • (2024) Practitioners’ Expectations on Automated Test Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 1618-1630. DOI: 10.1145/3650212.3680386. Online publication date: 11-Sep-2024
  • (2024) UniAda: Universal Adaptive Multiobjective Adversarial Attack for End-to-End Autonomous Driving Systems. IEEE Transactions on Reliability 73:4, pp. 1892-1906. DOI: 10.1109/TR.2024.3394894. Online publication date: Dec-2024
  • (2023) Deeper Notions of Correctness in Image-Based DNNs: Lifting Properties from Pixel to Entities. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 2122-2126. DOI: 10.1145/3611643.3613079. Online publication date: 30-Nov-2023
  • (2023) A Survey on Automated Driving System Testing: Landscapes and Trends. ACM Transactions on Software Engineering and Methodology 32:5, pp. 1-62. DOI: 10.1145/3579642. Online publication date: 24-Jul-2023
  • (2023) A Usability Evaluation of AFL and libFuzzer with CS Students. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-18. DOI: 10.1145/3544548.3581178. Online publication date: 19-Apr-2023
  • (2023) DeepManeuver: Adversarial Test Generation for Trajectory Manipulation of Autonomous Vehicles. IEEE Transactions on Software Engineering 49:10, pp. 4496-4509. DOI: 10.1109/TSE.2023.3301443. Online publication date: 8-Aug-2023
  • (2023) Ameliorating Semantic Search Through Advanced AI Techniques. 2023 3rd International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON), pp. 1-6. DOI: 10.1109/SMARTGENCON60755.2023.10442780. Online publication date: 29-Dec-2023
  • (2023) Generating Realistic and Diverse Tests for LiDAR-Based Perception Systems. Proceedings of the 45th International Conference on Software Engineering, pp. 2604-2616. DOI: 10.1109/ICSE48619.2023.00217. Online publication date: 14-May-2023
  • (2023) JITfuzz: Coverage-Guided Fuzzing for JVM Just-in-Time Compilers. Proceedings of the 45th International Conference on Software Engineering, pp. 56-68. DOI: 10.1109/ICSE48619.2023.00017. Online publication date: 14-May-2023
