DOI: 10.1145/3338906.3340456
research-article

FUDGE: fuzz driver generation at scale

Published: 12 August 2019

Abstract

At Google we have found tens of thousands of security and robustness bugs by fuzzing C and C++ libraries. To fuzz a library, a fuzzer requires a fuzz driver—which exercises some library code—to which it can pass inputs. Unfortunately, writing fuzz drivers remains a primarily manual exercise, a major hindrance to the widespread adoption of fuzzing. In this paper, we address this major hindrance by introducing the Fudge system for automated fuzz driver generation. Fudge automatically generates fuzz driver candidates for libraries based on existing client code. We have used Fudge to generate thousands of new drivers for a wide variety of libraries. Each generated driver includes a synthesized C/C++ program and a corresponding build script, and is automatically analyzed for quality. Developers have integrated over 200 of these generated drivers into continuous fuzzing services and have committed to address reported security bugs. Further, several of these fuzz drivers have been upstreamed to open source projects and integrated into the OSS-Fuzz fuzzing infrastructure. Running these fuzz drivers has resulted in over 150 bug fixes, including the elimination of numerous exploitable security vulnerabilities.
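To make the term concrete, the following is a minimal sketch of what a hand-written fuzz driver can look like, assuming the libFuzzer entry point `LLVMFuzzerTestOneInput`. It is not output from Fudge. The function `CountKeyValuePairs` is a hypothetical stand-in for library code under test and does not come from the paper.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a library API under test (an assumption for
// illustration): counts the fields containing '=' in a ';'-delimited string.
int CountKeyValuePairs(const std::string& text) {
  int pairs = 0;
  std::size_t start = 0;
  while (start <= text.size()) {
    std::size_t end = text.find(';', start);
    if (end == std::string::npos) end = text.size();
    const std::string field = text.substr(start, end - start);
    if (field.find('=') != std::string::npos) ++pairs;
    start = end + 1;
  }
  return pairs;
}

// The fuzz driver: the fuzzer calls this entry point repeatedly, each time
// with a different byte buffer, and the driver forwards it to the library.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  const std::string input(reinterpret_cast<const char*>(data), size);
  CountKeyValuePairs(input);
  return 0;  // libFuzzer expects 0 from the fuzz target.
}
```

With Clang, a driver of this shape is typically built with `-fsanitize=fuzzer,address`, which links in the libFuzzer runtime that supplies `main` and the generated inputs.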

Published In

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
August 2019
1264 pages
ISBN: 9781450355728
DOI: 10.1145/3338906
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

  • Best Paper

Author Tags

  1. automated test generation
  2. code synthesis
  3. fuzz testing
  4. fuzzing
  5. program slicing
  6. software security
  7. testing

Qualifiers

  • Research-article

Conference

ESEC/FSE '19
Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Article Metrics

  • Downloads (last 12 months): 279
  • Downloads (last 6 weeks): 32
Reflects downloads up to 13 Dec 2024

Cited By

  • (2024) WinEco: Semi-automatic Harness Generation based on Windows Time Travel Debugging. Journal of Digital Contents Society 25:7 (1873-1881). DOI: 10.9728/dcs.2024.25.7.1873. Online publication date: 31-Jul-2024
  • (2024) Applying Fuzz Driver Generation to Native C/C++ Libraries of OEM Android Framework: Obstacles and Solutions. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (2035-2040). DOI: 10.1145/3691620.3695266. Online publication date: 27-Oct-2024
  • (2024) Crabtree: Rust API Test Synthesis Guided by Coverage and Type. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (618-647). DOI: 10.1145/3689733. Online publication date: 8-Oct-2024
  • (2024) Automated Generation and Compilation of Fuzz Driver Based on Large Language Models. Proceedings of the 2024 9th International Conference on Cyber Security and Information Engineering (461-468). DOI: 10.1145/3689236.3689272. Online publication date: 15-Sep-2024
  • (2024) Prompt Fuzzing for Fuzz Driver Generation. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (3793-3807). DOI: 10.1145/3658644.3670396. Online publication date: 2-Dec-2024
  • (2024) How Effective Are They? Exploring Large Language Model Based Fuzz Driver Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (1223-1235). DOI: 10.1145/3650212.3680355. Online publication date: 11-Sep-2024
  • (2024) FRIES: Fuzzing Rust Library Interactions via Efficient Ecosystem-Guided Target Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (1137-1148). DOI: 10.1145/3650212.3680348. Online publication date: 11-Sep-2024
  • (2024) Atlas: Automating Cross-Language Fuzzing on Android Closed-Source Libraries. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (350-362). DOI: 10.1145/3650212.3652133. Online publication date: 11-Sep-2024
  • (2024) UPBEAT: Test Input Checks of Q# Quantum Libraries. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (186-198). DOI: 10.1145/3650212.3652120. Online publication date: 11-Sep-2024
  • (2024) Enhancing ROS System Fuzzing through Callback Tracing. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (76-87). DOI: 10.1145/3650212.3652111. Online publication date: 11-Sep-2024
