FUDGE: fuzz driver generation at scale

Authors:

Domagoj Babić,

Franjo Ivančić,

Caroline Lemieux,

László Szekeres,

Wei Wang

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Pages 975 - 985

https://doi.org/10.1145/3338906.3340456

Published: 12 August 2019

Abstract

At Google we have found tens of thousands of security and robustness bugs by fuzzing C and C++ libraries. To fuzz a library, a fuzzer requires a fuzz driver—which exercises some library code—to which it can pass inputs. Unfortunately, writing fuzz drivers remains a primarily manual exercise, a major hindrance to the widespread adoption of fuzzing. In this paper, we address this major hindrance by introducing the Fudge system for automated fuzz driver generation. Fudge automatically generates fuzz driver candidates for libraries based on existing client code. We have used Fudge to generate thousands of new drivers for a wide variety of libraries. Each generated driver includes a synthesized C/C++ program and a corresponding build script, and is automatically analyzed for quality. Developers have integrated over 200 of these generated drivers into continuous fuzzing services and have committed to address reported security bugs. Further, several of these fuzz drivers have been upstreamed to open source projects and integrated into the OSS-Fuzz fuzzing infrastructure. Running these fuzz drivers has resulted in over 150 bug fixes, including the elimination of numerous exploitable security vulnerabilities.
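To make the term concrete, here is a minimal sketch of what a hand-written fuzz driver can look like, assuming the libFuzzer entry-point convention (`LLVMFuzzerTestOneInput`). The library function `ParseKeyValue` is a hypothetical stand-in invented for this illustration. It is not taken from the paper and is not output generated by Fudge.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical library API under test (an assumption for illustration only):
// parses "key=value" and returns the length of the value, or -1 if the text
// has no '=' or has an empty key.
int ParseKeyValue(const std::string& text) {
  const std::size_t eq = text.find('=');
  if (eq == std::string::npos || eq == 0) return -1;
  return static_cast<int>(text.size() - eq - 1);
}

// libFuzzer entry point: the fuzzer calls this repeatedly, passing a new
// mutated byte buffer on each call.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data,
                                      std::size_t size) {
  // Convert the raw bytes into the type the library expects.
  const std::string input(reinterpret_cast<const char*>(data), size);
  // Exercise the library code; crashes and sanitizer reports are the signal.
  ParseKeyValue(input);
  return 0;  // Non-crashing inputs return 0.
}
```

With a Clang toolchain, a driver like this is typically built with something like `clang++ -g -fsanitize=fuzzer,address driver.cc`, which links in the fuzzing engine that supplies `main`.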

Index Terms

FUDGE: fuzz driver generation at scale
1. Security and privacy
  1. Software and application security
    1. Software security engineering
2. Software and its engineering
  1. Software creation and management
    1. Software post-development issues
      1. Software evolution
    2. Software verification and validation
      1. Software defect analysis
        Software testing and debugging
  2. Software organization and properties
    1. Software functional properties
      1. Formal methods
        Automated static analysis

Information & Contributors

Information

Published In

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

August 2019

1264 pages

ISBN:9781450355728

DOI:10.1145/3338906

General Chairs:
Marlon Dumas (University of Tartu, Estonia)
Dietmar Pfahl (University of Tartu, Estonia)

Program Chairs:
Sven Apel (Saarland University, Germany)
Alessandra Russo (Imperial College, UK)

Copyright © 2019 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives International 4.0 License.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Best Paper

Qualifiers

Research-article

Conference

ESEC/FSE '19

Sponsor:

SIGSOFT

ESEC/FSE '19: 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

August 26 - 30, 2019

Tallinn, Estonia

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 62
Total Downloads: 1,197
Downloads (Last 12 months): 279
Downloads (Last 6 weeks): 32

Reflects downloads up to 13 Dec 2024

Citations

Cited By

Kim D, Shon T (2024). WinEco: Semi-automatic Harness Generation based on Windows Time Travel Debugging. Journal of Digital Contents Society 25:7 (1873-1881). Online publication date: 31-Jul-2024.
https://doi.org/10.9728/dcs.2024.25.7.1873

Peng S, Zhang Y, Dai J, Gu Y, Shen Z, Liu J, Wang L, Chen Y, Qin Y, Ai L, Lu X, Yang M, Filkov V, Ray B, Zhou M (2024). Applying Fuzz Driver Generation to Native C/C++ Libraries of OEM Android Framework: Obstacles and Solutions. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (2035-2040). Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3695266

Takashima Y, Cho C, Martins R, Jia L, Păsăreanu C (2024). Crabtree: Rust API Test Synthesis Guided by Coverage and Type. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (618-647). Online publication date: 8-Oct-2024.
https://dl.acm.org/doi/10.1145/3689733

Sun Y (2024). Automated Generation and Compilation of Fuzz Driver Based on Large Language Models. Proceedings of the 2024 9th International Conference on Cyber Security and Information Engineering (461-468). Online publication date: 15-Sep-2024.
https://dl.acm.org/doi/10.1145/3689236.3689272

Lyu Y, Xie Y, Chen P, Chen H, Luo B, Liao X, Xu J, Kirda E, Lie D (2024). Prompt Fuzzing for Fuzz Driver Generation. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (3793-3807). Online publication date: 2-Dec-2024.
https://dl.acm.org/doi/10.1145/3658644.3670396

Zhang C, Zheng Y, Bai M, Li Y, Ma W, Xie X, Li Y, Sun L, Liu Y, Christakis M, Pradel M (2024). How Effective Are They? Exploring Large Language Model Based Fuzz Driver Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (1223-1235). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3680355

Yin X, Feng Y, Shi Q, Liu Z, Liu H, Xu B, Christakis M, Pradel M (2024). FRIES: Fuzzing Rust Library Interactions via Efficient Ecosystem-Guided Target Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (1137-1148). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3680348

Xiong H, Dai Q, Chang R, Qiu M, Wang R, Shen W, Zhou Y, Christakis M, Pradel M (2024). Atlas: Automating Cross-Language Fuzzing on Android Closed-Source Libraries. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (350-362). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3652133

Hu T, Ye G, Tang Z, Tan S, Wang H, Li M, Wang Z, Christakis M, Pradel M (2024). UPBEAT: Test Input Checks of Q# Quantum Libraries. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (186-198). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3652120

Shen Y, Liu J, Xu Y, Sun H, Wang M, Guan N, Shi H, Jiang Y, Christakis M, Pradel M (2024). Enhancing ROS System Fuzzing through Callback Tracing. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (76-87). Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3652111
