More Web Proxy on the site http://driver.im/

research-article

Open access

Accelerating Fuzzing through Prefix-Guided Execution

Authors:

Zhendong SuAuthors Info & Claims

Proceedings of the ACM on Programming Languages, Volume 7, Issue OOPSLA1

Article No.: 75, Pages 1 - 27

https://doi.org/10.1145/3586027

Published: 06 April 2023 Publication History

Abstract

Coverage-guided fuzzing is one of the most effective approaches for discovering software defects and vulnerabilities. It executes all mutated tests from seed inputs to expose coverage-increasing tests. However, executing all mutated tests incurs significant performance penalties---most of the mutated tests are discarded because they do not increase code coverage. Thus, determining if a test increases code coverage without actually executing it is beneficial, but a paradoxical challenge. In this paper, we introduce the notion of prefix-guided execution (PGE) to tackle this challenge. PGE leverages two key observations: (1) Only a tiny fraction of the mutated tests increase coverage, thus requiring full execution; and (2) whether a test increases coverage may be accurately inferred from its partial execution. PGE monitors the execution of a test and applies early termination when the execution prefix indicates that the test is unlikely to increase coverage.

To demonstrate the potential of PGE, we implement a prototype on top of AFL++, which we call AFL++-PGE. We evaluate AFL++-PGE on MAGMA, a ground-truth benchmark set that consists of 21 programs from nine popular real-world projects. Our results show that, after 48 hours of fuzzing, AFL++-PGE finds more bugs, discovers bugs faster, and achieves higher coverage. Prefix-guided execution is general and can benefit the AFL-based family of fuzzers.

References

[1]

Mike Aizatsky, Kostya Serebryany, Oliver Chang, Abhishek Arya, and Meredith Whittaker. 2016. Announcing OSS-Fuzz: Continuous fuzzing for open source software. Google Testing Blog.

[2]

Austin Appleby. 2016. MurmurHash3. https://github.com/aappleby/smhasher/wiki/MurmurHash3

[3]

Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. 2019. REDQUEEN: Fuzzing with Input-to-State Correspondence. In Network and Distributed System Security. 19, 1–15.

[4]

Marcel Böhme and Brandon Falk. 2020. Fuzzing: On the exponential cost of vulnerability discovery. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 713–724.

Digital Library

[5]

Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 2017. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2329–2344.

Digital Library

[6]

Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2017. Coverage-based greybox fuzzing as markov chain. IEEE Transactions on Software Engineering, 45, 5 (2017), 489–506.

[7]

Hongxu Chen, Yinxing Xue, Yuekang Li, Bihuan Chen, Xiaofei Xie, Xiuheng Wu, and Yang Liu. 2018. Hawkeye: Towards a desired directed grey-box fuzzer. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2095–2108.

Digital Library

[8]

Peng Chen and Hao Chen. 2018. Angora: Efficient fuzzing by principled search. In IEEE Symposium on Security and Privacy (SP). 711–725.

[9]

Peng Chen, Jianzhong Liu, and Hao Chen. 2019. Matryoshka: fuzzing deeply nested branches. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 499–513.

Digital Library

[10]

Yaohui Chen, Mansour Ahmadi, Boyu Wang, and Long Lu. 2020. MEUZZ: Smart Seed Scheduling for Hybrid Fuzzing. In 23rd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2020). 77–92.

[11]

Yaohui Chen, Peng Li, Jun Xu, Shengjian Guo, Rundong Zhou, Yulong Zhang, Tao Wei, and Long Lu. 2020. Savior: Towards bug-driven hybrid testing. In 2020 IEEE Symposium on Security and Privacy (SP). 1580–1596.

[12]

Andrea Fioraldi, Daniele Cono D’Elia, and Davide Balzarotti. 2021. The Use of Likely Invariants as Feedback for Fuzzers. In 30th USENIX Security Symposium (USENIX Security 21). 2829–2846.

[13]

Andrea Fioraldi, Dominik Maier, Heiko Eiß feldt, and Marc Heuse. 2020. $AFL++$: Combining Incremental Steps of Fuzzing Research. In 14th USENIX Workshop on Offensive Technologies (WOOT 20).

[14]

Andrea Fioraldi, Dominik Maier, Heiko Eiß feldt, and Marc Heuse. 2022. American Fuzzy Lop plus plus (AFL++). https://github.com/AFLplusplus/AFLplusplus/releases/tag/4.01c

[15]

Shuitao Gan, Chao Zhang, Peng Chen, Bodong Zhao, Xiaojun Qin, Dong Wu, and Zuoning Chen. 2020. GREYONE: Data flow sensitive fuzzing. In 29th USENIX Security Symposium (USENIX Security 20). 2577–2594.

Digital Library

[16]

Patrice Godefroid, Hila Peleg, and Rishabh Singh. 2017. Learn&fuzz: Machine learning for input fuzzing. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE). 50–59.

[17]

Ahmad Hazimeh, Adrian Herrera, and Mathias Payer. 2020. Magma: A Ground-Truth Fuzzing Benchmark. Proc. ACM Meas. Anal. Comput. Syst., 4, 3 (2020), Article 49, Dec., 29 pages. https://doi.org/10.1145/3428334

Digital Library

[18]

Adrian Herrera, Hendra Gunadi, Shane Magrath, Michael Norrish, Mathias Payer, and Antony L Hosking. 2021. Seed selection for successful fuzzing. In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis. 230–243.

Digital Library

[19]

Chin-Chia Hsu, Che-Yu Wu, Hsu-Chun Hsiao, and Shih-Kun Huang. 2018. Instrim: Lightweight instrumentation for coverage-guided fuzzing. In Symposium on Network and Distributed System Security, Workshop on Binary Analysis Research.

[20]

Heqing Huang, Peisen Yao, Rongxin Wu, Qingkai Shi, and Charles Zhang. 2020. Pangolin: Incremental hybrid fuzzing with polyhedral path abstraction. In 2020 IEEE Symposium on Security and Privacy (SP). 1613–1627.

[21]

Edward L Kaplan and Paul Meier. 1958. Nonparametric estimation from incomplete observations. Journal of the American statistical association, 53, 282 (1958), 457–481.

[22]

George Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and Michael Hicks. 2018. Evaluating fuzz testing. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2123–2138.

Digital Library

[23]

Caroline Lemieux and Koushik Sen. 2018. Fairfuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 475–485.

Digital Library

[24]

Shaohua Li and Zhendong Su. 2023. Accelerating Fuzzing through Prefix-Guided Execution. https://doi.org/10.5281/zenodo.7727577

Digital Library

[25]

Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu. 2017. Steelix: program-state based binary fuzzing. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. 627–637.

Digital Library

[26]

Yuwei Li, Shouling Ji, Yuan Chen, Sizhuang Liang, Wei-Han Lee, Yueyao Chen, Chenyang Lyu, Chunming Wu, Raheem Beyah, and Peng Cheng. 2021. UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers. In 30th USENIX Security Symposium (USENIX Security 21).

[27]

LLVM team. 2021. SanitizerCoverage. https://releases.llvm.org/11.0.1/tools/clang/docs/SanitizerCoverage.html

[28]

Chenyang Lyu, Shouling Ji, Chao Zhang, Yuwei Li, Wei-Han Lee, Yu Song, and Raheem Beyah. 2019. $MOPT$: Optimized mutation scheduling for fuzzers. In 28th USENIX Security Symposium (USENIX Security 19). 1949–1966.

[29]

Valentin JM Manès, Soomin Kim, and Sang Kil Cha. 2020. Ankou: Guiding grey-box fuzzing towards combinatorial difference. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering. 1024–1036.

Digital Library

[30]

Valentin Jean Marie Manès, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J Schwartz, and Maverick Woo. 2019. The art, science, and engineering of fuzzing: A survey. IEEE Transactions on Software Engineering.

[31]

Nathan Mantel. 1966. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother Rep, 50, 3 (1966), 163–170.

[32]

Björn Mathis, Rahul Gopinath, and Andreas Zeller. 2020. Learning input tokens for effective fuzzing. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 27–37.

Digital Library

[33]

Stefan Nagy and Matthew Hicks. 2019. Full-speed fuzzing: Reducing fuzzing overhead through coverage-guided tracing. In 2019 IEEE Symposium on Security and Privacy (SP). 787–802.

[34]

Stefan Nagy, Anh Nguyen-Tuong, Jason D Hiser, Jack W Davidson, and Matthew Hicks. 2021. Same Coverage, Less Bloat: Accelerating Binary-only Fuzzing with Coverage-preserving Coverage-guided Tracing. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. 351–365.

Digital Library

[35]

Yannic Noller, Rody Kersten, and Corina S Păsăreanu. 2018. Badger: complexity analysis with fuzzing and symbolic execution. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. 322–332.

Digital Library

[36]

Yannic Noller, Corina S Păsăreanu, Marcel Böhme, Youcheng Sun, Hoang Lam Nguyen, and Lars Grunske. 2020. HyDiff: Hybrid differential software analysis. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). 1273–1285.

Digital Library

[37]

Hui Peng, Yan Shoshitaishvili, and Mathias Payer. 2018. T-Fuzz: fuzzing by program transformation. In 2018 IEEE Symposium on Security and Privacy (SP). 697–710.

[38]

Van-Thuan Pham, Marcel Böhme, and Abhik Roychoudhury. 2020. AFLNet: a greybox fuzzer for network protocols. In 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST). 460–465.

[39]

Mohit Rajpal, William Blum, and Rishabh Singh. 2017. Not all bytes are equal: Neural byte sieve for fuzzing. arXiv preprint arXiv:1711.04596.

[40]

Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, and Herbert Bos. 2017. VUzzer: Application-aware Evolutionary Fuzzing. In Network and Distributed System Security. 17, 1–14.

[41]

David A Schum. 2001. The evidential foundations of probabilistic reasoning. Northwestern University Press.

[42]

Dongdong She, Rahul Krishna, Lu Yan, Suman Jana, and Baishakhi Ray. 2020. MTFuzz: fuzzing with a multi-task neural network. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 737–749.

Digital Library

[43]

Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana. 2019. Neuzz: Efficient fuzzing with neural program smoothing. In 2019 IEEE Symposium on Security and Privacy (SP). 803–817.

[44]

Dokyung Song, Felicitas Hetzelt, Jonghwan Kim, Brent Byunghoon Kang, Jean-Pierre Seifert, and Michael Franz. 2020. Agamotto: Accelerating kernel driver fuzzing with lightweight virtual machine checkpoints. In 29th USENIX Security Symposium (USENIX Security 20). 2541–2557.

[45]

Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna. 2016. Driller: Augmenting fuzzing through selective symbolic execution. In Network and Distributed System Security. 16, 1–16.

[46]

Jonas Benedict Wagner. 2017. Elastic program transformations: automatically optimizing the reliability/performance trade-off in systems software. Ph. D. Dissertation. Ecole Polytechnique Fédérale de Lausanne.

[47]

Haijun Wang, Xiaofei Xie, Yi Li, Cheng Wen, Yuekang Li, Yang Liu, Shengchao Qin, Hongxu Chen, and Yulei Sui. 2020. Typestate-guided fuzzer for discovering use-after-free vulnerabilities. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). 999–1010.

Digital Library

[48]

Junjie Wang, Bihuan Chen, Lei Wei, and Yang Liu. 2017. Skyfire: Data-driven seed generation for fuzzing. In 2017 IEEE Symposium on Security and Privacy (SP). 579–594.

[49]

Mingzhe Wang, Jie Liang, Yuanliang Chen, Yu Jiang, Xun Jiao, Han Liu, Xibin Zhao, and Jiaguang Sun. 2018. SAFL: increasing and accelerating testing coverage with symbolic execution and guided fuzzing. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceeedings. 61–64.

Digital Library

[50]

Valentin Wüstholz and Maria Christakis. 2020. Targeted greybox fuzzing with static lookahead analysis. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). 789–800.

Digital Library

[51]

Wen Xu, Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. 2017. Designing new operating primitives to improve fuzzing performance. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2313–2328.

Digital Library

[52]

Insu Yun, Sangho Lee, Meng Xu, Yeongjin Jang, and Taesoo Kim. 2018. $QSYM$: A practical concolic execution engine tailored for hybrid fuzzing. In 27th USENIX Security Symposium (USENIX Security 18). 745–761.

[53]

Michał Zalewski. 2014. American fuzzy lop. https://lcamtuf.coredump.cx/afl/

[54]

Chijin Zhou, Mingzhe Wang, Jie Liang, Zhe Liu, and Yu Jiang. 2020. Zeror: Speed up fuzzing with coverage-sensitive tracing and scheduling. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. 858–870.

Digital Library

[55]

Peiyuan Zong, Tao Lv, Dawei Wang, Zizhuang Deng, Ruigang Liang, and Kai Chen. 2020. Fuzzguard: Filtering out unreachable inputs in directed grey-box fuzzing through deep learning. In 29th USENIX Security Symposium (USENIX Security 20).

Cited By

Zhou CQian BGo GZhang QLi SJiang Y(2024)PolyJuice: Detecting Mis-compilation Bugs in Tensor Compilers with Equality Saturation Based RewritingProceedings of the ACM on Programming Languages10.1145/36897578:OOPSLA2(1309-1335)Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689757
Li KGuan SLee D(2024)Towards Understanding and Characterizing the Arbitrage Bot Scam In the WildACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365508852:1(89-90)Online publication date: 13-Jun-2024
https://dl.acm.org/doi/10.1145/3673660.3655088
Liu QFang Z(2024)Learning the Optimal Control for Evolving Systems with Converging DynamicsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/36560078:2(1-39)Online publication date: 29-May-2024
https://dl.acm.org/doi/10.1145/3656007
Show More Cited By

Index Terms

Accelerating Fuzzing through Prefix-Guided Execution
1. Security and privacy
  1. Software and application security
    1. Software security engineering

Recommendations

Same Coverage, Less Bloat: Accelerating Binary-only Fuzzing with Coverage-preserving Coverage-guided Tracing
CCS '21: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security

Coverage-guided fuzzing's aggressive, high-volume testing has helped reveal tens of thousands of software security flaws. While executing billions of test cases mandates fast code coverage tracing, the nature of binary-only targets leads to reduced ...
DeFlaker: automatically detecting flaky tests
ICSE '18: Proceedings of the 40th International Conference on Software Engineering

Developers often run tests to check that their latest changes to a code repository did not break any previously working functionality. Ideally, any new test failures would indicate regressions caused by the latest changes. However, some test failures may ...
Grey-box concolic testing on binary code
ICSE '19: Proceedings of the 41st International Conference on Software Engineering

We present grey-box concolic testing, a novel path-based test case generation method that combines the best of both white-box and grey-box fuzzing. At a high level, our technique systematically explores execution paths of a program under test as in white-...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages

Proceedings of the ACM on Programming Languages Volume 7, Issue OOPSLA1

April 2023

901 pages

EISSN:2475-1421

DOI:10.1145/3554309

Editor:
Michael Hicks
Amazon, USA

Issue’s Table of Contents

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 April 2023

Published in PACMPL Volume 7, Issue OOPSLA1

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Badges

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

10
Total Citations
View Citations
1,567
Total Downloads

Downloads (Last 12 months)812
Downloads (Last 6 weeks)42

Reflects downloads up to 14 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Zhou CQian BGo GZhang QLi SJiang Y(2024)PolyJuice: Detecting Mis-compilation Bugs in Tensor Compilers with Equality Saturation Based RewritingProceedings of the ACM on Programming Languages10.1145/36897578:OOPSLA2(1309-1335)Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689757
Li KGuan SLee D(2024)Towards Understanding and Characterizing the Arbitrage Bot Scam In the WildACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365508852:1(89-90)Online publication date: 13-Jun-2024
https://dl.acm.org/doi/10.1145/3673660.3655088
Liu QFang Z(2024)Learning the Optimal Control for Evolving Systems with Converging DynamicsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/36560078:2(1-39)Online publication date: 29-May-2024
https://dl.acm.org/doi/10.1145/3656007
Qian RZhang QFang CYang DLi SLi BChen Z(2024)DiPri: Distance-based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/3654440Online publication date: 26-Mar-2024
https://dl.acm.org/doi/10.1145/3654440
Li KGuan SLee DGaretto MMarin ACiucu FFanti GRighter R(2024)Towards Understanding and Characterizing the Arbitrage Bot Scam In the WildAbstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems10.1145/3652963.3655088(89-90)Online publication date: 10-Jun-2024
https://dl.acm.org/doi/10.1145/3652963.3655088
Wu FLuo ZZhao YDu QYu JPeng RShi HJiang YChristakis MPradel M(2024)Logos: Log Guided Fuzzing for Protocol ImplementationsProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680394(1720-1732)Online publication date: 11-Sep-2024
https://dl.acm.org/doi/10.1145/3650212.3680394
Zhang QZhai JFang CLiu JSun WHu HWang Q(2024)Machine Translation Testing via Syntactic Tree PruningACM Transactions on Software Engineering and Methodology10.1145/364032933:5(1-39)Online publication date: 4-Jun-2024
https://dl.acm.org/doi/10.1145/3640329
Zhang GWang PYue TLiu DGuo YLu K(2024)Instiller: Toward Efficient and Realistic RTL FuzzingIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2024.336031843:7(2177-2190)Online publication date: Jul-2024
https://doi.org/10.1109/TCAD.2024.3360318
Ding KWang ZZhang W(2024)Directed Fuzzing with Adaptive Path Guidance and Weighted Distribution2024 IEEE 24th International Conference on Software Quality, Reliability, and Security Companion (QRS-C)10.1109/QRS-C63300.2024.00112(839-848)Online publication date: 1-Jul-2024
https://doi.org/10.1109/QRS-C63300.2024.00112
Qian RZhang QFang CChen ZBöhme MNoller YRay BSzekeres L(2023)DiPri: Distance-Based Seed Prioritization for Greybox Fuzzing (Registered Report)Proceedings of the 2nd International Fuzzing Workshop10.1145/3605157.3605172(21-30)Online publication date: 17-Jul-2023
https://dl.acm.org/doi/10.1145/3605157.3605172

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Media

Figures

Other

Tables

View Issue’s Table of Contents