[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article
Open access

Accelerating Fuzzing through Prefix-Guided Execution

Published: 06 April 2023 Publication History

Abstract

Coverage-guided fuzzing is one of the most effective approaches for discovering software defects and vulnerabilities. It executes all mutated tests from seed inputs to expose coverage-increasing tests. However, executing all mutated tests incurs significant performance penalties---most of the mutated tests are discarded because they do not increase code coverage. Thus, determining if a test increases code coverage without actually executing it is beneficial, but a paradoxical challenge. In this paper, we introduce the notion of prefix-guided execution (PGE) to tackle this challenge. PGE leverages two key observations: (1) Only a tiny fraction of the mutated tests increase coverage, thus requiring full execution; and (2) whether a test increases coverage may be accurately inferred from its partial execution. PGE monitors the execution of a test and applies early termination when the execution prefix indicates that the test is unlikely to increase coverage.
To demonstrate the potential of PGE, we implement a prototype on top of AFL++, which we call AFL++-PGE. We evaluate AFL++-PGE on MAGMA, a ground-truth benchmark set that consists of 21 programs from nine popular real-world projects. Our results show that, after 48 hours of fuzzing, AFL++-PGE finds more bugs, discovers bugs faster, and achieves higher coverage. Prefix-guided execution is general and can benefit the AFL-based family of fuzzers.

References

[1]
Mike Aizatsky, Kostya Serebryany, Oliver Chang, Abhishek Arya, and Meredith Whittaker. 2016. Announcing OSS-Fuzz: Continuous fuzzing for open source software. Google Testing Blog.
[2]
Austin Appleby. 2016. MurmurHash3. https://github.com/aappleby/smhasher/wiki/MurmurHash3
[3]
Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. 2019. REDQUEEN: Fuzzing with Input-to-State Correspondence. In Network and Distributed System Security. 19, 1–15.
[4]
Marcel Böhme and Brandon Falk. 2020. Fuzzing: On the exponential cost of vulnerability discovery. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 713–724.
[5]
Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 2017. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2329–2344.
[6]
Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2017. Coverage-based greybox fuzzing as markov chain. IEEE Transactions on Software Engineering, 45, 5 (2017), 489–506.
[7]
Hongxu Chen, Yinxing Xue, Yuekang Li, Bihuan Chen, Xiaofei Xie, Xiuheng Wu, and Yang Liu. 2018. Hawkeye: Towards a desired directed grey-box fuzzer. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2095–2108.
[8]
Peng Chen and Hao Chen. 2018. Angora: Efficient fuzzing by principled search. In IEEE Symposium on Security and Privacy (SP). 711–725.
[9]
Peng Chen, Jianzhong Liu, and Hao Chen. 2019. Matryoshka: fuzzing deeply nested branches. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 499–513.
[10]
Yaohui Chen, Mansour Ahmadi, Boyu Wang, and Long Lu. 2020. MEUZZ: Smart Seed Scheduling for Hybrid Fuzzing. In 23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020). 77–92.
[11]
Yaohui Chen, Peng Li, Jun Xu, Shengjian Guo, Rundong Zhou, Yulong Zhang, Tao Wei, and Long Lu. 2020. Savior: Towards bug-driven hybrid testing. In 2020 IEEE Symposium on Security and Privacy (SP). 1580–1596.
[12]
Andrea Fioraldi, Daniele Cono D’Elia, and Davide Balzarotti. 2021. The Use of Likely Invariants as Feedback for Fuzzers. In 30th USENIX Security Symposium (USENIX Security 21). 2829–2846.
[13]
Andrea Fioraldi, Dominik Maier, Heiko Eiß feldt, and Marc Heuse. 2020. $AFL++$: Combining Incremental Steps of Fuzzing Research. In 14th USENIX Workshop on Offensive Technologies (WOOT 20).
[14]
Andrea Fioraldi, Dominik Maier, Heiko Eiß feldt, and Marc Heuse. 2022. American Fuzzy Lop plus plus (AFL++). https://github.com/AFLplusplus/AFLplusplus/releases/tag/4.01c
[15]
Shuitao Gan, Chao Zhang, Peng Chen, Bodong Zhao, Xiaojun Qin, Dong Wu, and Zuoning Chen. 2020. GREYONE: Data flow sensitive fuzzing. In 29th USENIX Security Symposium (USENIX Security 20). 2577–2594.
[16]
Patrice Godefroid, Hila Peleg, and Rishabh Singh. 2017. Learn&fuzz: Machine learning for input fuzzing. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). 50–59.
[17]
Ahmad Hazimeh, Adrian Herrera, and Mathias Payer. 2020. Magma: A Ground-Truth Fuzzing Benchmark. Proc. ACM Meas. Anal. Comput. Syst., 4, 3 (2020), Article 49, Dec., 29 pages. https://doi.org/10.1145/3428334
[18]
Adrian Herrera, Hendra Gunadi, Shane Magrath, Michael Norrish, Mathias Payer, and Antony L Hosking. 2021. Seed selection for successful fuzzing. In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis. 230–243.
[19]
Chin-Chia Hsu, Che-Yu Wu, Hsu-Chun Hsiao, and Shih-Kun Huang. 2018. Instrim: Lightweight instrumentation for coverage-guided fuzzing. In Symposium on Network and Distributed System Security, Workshop on Binary Analysis Research.
[20]
Heqing Huang, Peisen Yao, Rongxin Wu, Qingkai Shi, and Charles Zhang. 2020. Pangolin: Incremental hybrid fuzzing with polyhedral path abstraction. In 2020 IEEE Symposium on Security and Privacy (SP). 1613–1627.
[21]
Edward L Kaplan and Paul Meier. 1958. Nonparametric estimation from incomplete observations. Journal of the American statistical association, 53, 282 (1958), 457–481.
[22]
George Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and Michael Hicks. 2018. Evaluating fuzz testing. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2123–2138.
[23]
Caroline Lemieux and Koushik Sen. 2018. Fairfuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 475–485.
[24]
Shaohua Li and Zhendong Su. 2023. Accelerating Fuzzing through Prefix-Guided Execution. https://doi.org/10.5281/zenodo.7727577
[25]
Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu. 2017. Steelix: program-state based binary fuzzing. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. 627–637.
[26]
Yuwei Li, Shouling Ji, Yuan Chen, Sizhuang Liang, Wei-Han Lee, Yueyao Chen, Chenyang Lyu, Chunming Wu, Raheem Beyah, and Peng Cheng. 2021. UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers. In 30th USENIX Security Symposium (USENIX Security 21).
[27]
LLVM team. 2021. SanitizerCoverage. https://releases.llvm.org/11.0.1/tools/clang/docs/SanitizerCoverage.html
[28]
Chenyang Lyu, Shouling Ji, Chao Zhang, Yuwei Li, Wei-Han Lee, Yu Song, and Raheem Beyah. 2019. $MOPT$: Optimized mutation scheduling for fuzzers. In 28th USENIX Security Symposium (USENIX Security 19). 1949–1966.
[29]
Valentin JM Manès, Soomin Kim, and Sang Kil Cha. 2020. Ankou: Guiding grey-box fuzzing towards combinatorial difference. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering. 1024–1036.
[30]
Valentin Jean Marie Manès, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J Schwartz, and Maverick Woo. 2019. The art, science, and engineering of fuzzing: A survey. IEEE Transactions on Software Engineering.
[31]
Nathan Mantel. 1966. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother Rep, 50, 3 (1966), 163–170.
[32]
Björn Mathis, Rahul Gopinath, and Andreas Zeller. 2020. Learning input tokens for effective fuzzing. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 27–37.
[33]
Stefan Nagy and Matthew Hicks. 2019. Full-speed fuzzing: Reducing fuzzing overhead through coverage-guided tracing. In 2019 IEEE Symposium on Security and Privacy (SP). 787–802.
[34]
Stefan Nagy, Anh Nguyen-Tuong, Jason D Hiser, Jack W Davidson, and Matthew Hicks. 2021. Same Coverage, Less Bloat: Accelerating Binary-only Fuzzing with Coverage-preserving Coverage-guided Tracing. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. 351–365.
[35]
Yannic Noller, Rody Kersten, and Corina S Păsăreanu. 2018. Badger: complexity analysis with fuzzing and symbolic execution. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. 322–332.
[36]
Yannic Noller, Corina S Păsăreanu, Marcel Böhme, Youcheng Sun, Hoang Lam Nguyen, and Lars Grunske. 2020. HyDiff: Hybrid differential software analysis. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). 1273–1285.
[37]
Hui Peng, Yan Shoshitaishvili, and Mathias Payer. 2018. T-Fuzz: fuzzing by program transformation. In 2018 IEEE Symposium on Security and Privacy (SP). 697–710.
[38]
Van-Thuan Pham, Marcel Böhme, and Abhik Roychoudhury. 2020. AFLNet: a greybox fuzzer for network protocols. In 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST). 460–465.
[39]
Mohit Rajpal, William Blum, and Rishabh Singh. 2017. Not all bytes are equal: Neural byte sieve for fuzzing. arXiv preprint arXiv:1711.04596.
[40]
Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, and Herbert Bos. 2017. VUzzer: Application-aware Evolutionary Fuzzing. In Network and Distributed System Security. 17, 1–14.
[41]
David A Schum. 2001. The evidential foundations of probabilistic reasoning. Northwestern University Press.
[42]
Dongdong She, Rahul Krishna, Lu Yan, Suman Jana, and Baishakhi Ray. 2020. MTFuzz: fuzzing with a multi-task neural network. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 737–749.
[43]
Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana. 2019. Neuzz: Efficient fuzzing with neural program smoothing. In 2019 IEEE Symposium on Security and Privacy (SP). 803–817.
[44]
Dokyung Song, Felicitas Hetzelt, Jonghwan Kim, Brent Byunghoon Kang, Jean-Pierre Seifert, and Michael Franz. 2020. Agamotto: Accelerating kernel driver fuzzing with lightweight virtual machine checkpoints. In 29th USENIX Security Symposium (USENIX Security 20). 2541–2557.
[45]
Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna. 2016. Driller: Augmenting fuzzing through selective symbolic execution. In Network and Distributed System Security. 16, 1–16.
[46]
Jonas Benedict Wagner. 2017. Elastic program transformations: automatically optimizing the reliability/performance trade-off in systems software. Ph. D. Dissertation. Ecole Polytechnique Fédérale de Lausanne.
[47]
Haijun Wang, Xiaofei Xie, Yi Li, Cheng Wen, Yuekang Li, Yang Liu, Shengchao Qin, Hongxu Chen, and Yulei Sui. 2020. Typestate-guided fuzzer for discovering use-after-free vulnerabilities. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). 999–1010.
[48]
Junjie Wang, Bihuan Chen, Lei Wei, and Yang Liu. 2017. Skyfire: Data-driven seed generation for fuzzing. In 2017 IEEE Symposium on Security and Privacy (SP). 579–594.
[49]
Mingzhe Wang, Jie Liang, Yuanliang Chen, Yu Jiang, Xun Jiao, Han Liu, Xibin Zhao, and Jiaguang Sun. 2018. SAFL: increasing and accelerating testing coverage with symbolic execution and guided fuzzing. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceeedings. 61–64.
[50]
Valentin Wüstholz and Maria Christakis. 2020. Targeted greybox fuzzing with static lookahead analysis. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). 789–800.
[51]
Wen Xu, Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. 2017. Designing new operating primitives to improve fuzzing performance. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2313–2328.
[52]
Insu Yun, Sangho Lee, Meng Xu, Yeongjin Jang, and Taesoo Kim. 2018. $QSYM$: A practical concolic execution engine tailored for hybrid fuzzing. In 27th USENIX Security Symposium (USENIX Security 18). 745–761.
[53]
Michał Zalewski. 2014. American fuzzy lop. https://lcamtuf.coredump.cx/afl/
[54]
Chijin Zhou, Mingzhe Wang, Jie Liang, Zhe Liu, and Yu Jiang. 2020. Zeror: Speed up fuzzing with coverage-sensitive tracing and scheduling. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. 858–870.
[55]
Peiyuan Zong, Tao Lv, Dawei Wang, Zizhuang Deng, Ruigang Liang, and Kai Chen. 2020. Fuzzguard: Filtering out unreachable inputs in directed grey-box fuzzing through deep learning. In 29th USENIX Security Symposium (USENIX Security 20).

Cited By

View all
  • (2024)PolyJuice: Detecting Mis-compilation Bugs in Tensor Compilers with Equality Saturation Based RewritingProceedings of the ACM on Programming Languages10.1145/36897578:OOPSLA2(1309-1335)Online publication date: 8-Oct-2024
  • (2024)Towards Understanding and Characterizing the Arbitrage Bot Scam In the WildACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365508852:1(89-90)Online publication date: 13-Jun-2024
  • (2024)Learning the Optimal Control for Evolving Systems with Converging DynamicsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/36560078:2(1-39)Online publication date: 29-May-2024
  • Show More Cited By

Index Terms

  1. Accelerating Fuzzing through Prefix-Guided Execution

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the ACM on Programming Languages
    Proceedings of the ACM on Programming Languages  Volume 7, Issue OOPSLA1
    April 2023
    901 pages
    EISSN:2475-1421
    DOI:10.1145/3554309
    Issue’s Table of Contents
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 April 2023
    Published in PACMPL Volume 7, Issue OOPSLA1

    Permissions

    Request permissions for this article.

    Check for updates

    Badges

    Author Tags

    1. code coverage
    2. fuzzing
    3. software testing

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)812
    • Downloads (Last 6 weeks)42
    Reflects downloads up to 14 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)PolyJuice: Detecting Mis-compilation Bugs in Tensor Compilers with Equality Saturation Based RewritingProceedings of the ACM on Programming Languages10.1145/36897578:OOPSLA2(1309-1335)Online publication date: 8-Oct-2024
    • (2024)Towards Understanding and Characterizing the Arbitrage Bot Scam In the WildACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365508852:1(89-90)Online publication date: 13-Jun-2024
    • (2024)Learning the Optimal Control for Evolving Systems with Converging DynamicsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/36560078:2(1-39)Online publication date: 29-May-2024
    • (2024)DiPri: Distance-based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/3654440Online publication date: 26-Mar-2024
    • (2024)Towards Understanding and Characterizing the Arbitrage Bot Scam In the WildAbstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems10.1145/3652963.3655088(89-90)Online publication date: 10-Jun-2024
    • (2024)Logos: Log Guided Fuzzing for Protocol ImplementationsProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680394(1720-1732)Online publication date: 11-Sep-2024
    • (2024)Machine Translation Testing via Syntactic Tree PruningACM Transactions on Software Engineering and Methodology10.1145/364032933:5(1-39)Online publication date: 4-Jun-2024
    • (2024)Instiller: Toward Efficient and Realistic RTL FuzzingIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2024.336031843:7(2177-2190)Online publication date: Jul-2024
    • (2024)Directed Fuzzing with Adaptive Path Guidance and Weighted Distribution2024 IEEE 24th International Conference on Software Quality, Reliability, and Security Companion (QRS-C)10.1109/QRS-C63300.2024.00112(839-848)Online publication date: 1-Jul-2024
    • (2023)DiPri: Distance-Based Seed Prioritization for Greybox Fuzzing (Registered Report)Proceedings of the 2nd International Fuzzing Workshop10.1145/3605157.3605172(21-30)Online publication date: 17-Jul-2023

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Full Access

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media