More Web Proxy on the site http://driver.im/

research-article

Alphuzz: Monte Carlo Search on Seed-Mutation Tree for Coverage-Guided Fuzzing

Authors:

Yueqiang Cheng,

Heng YinAuthors Info & Claims

ACSAC '22: Proceedings of the 38th Annual Computer Security Applications Conference

Pages 534 - 547

https://doi.org/10.1145/3564625.3564660

Published: 05 December 2022 Publication History

Abstract

Coverage-based greybox fuzzing (CGF) has been approved to be effective in finding security vulnerabilities. Seed scheduling, the process of selecting an input as the seed from the seed pool for the next fuzzing iteration, plays a central role in CGF. Although numerous seed scheduling strategies have been proposed, most of them treat these seeds independently and do not explicitly consider the relationships among seeds.

In this study, we make a key observation that the relationships among seeds are valuable for seed scheduling. We design and propose a “seed mutation tree” by investigating and leveraging the mutation relationships among seeds. With the “seed mutation tree”, we further model the seed scheduling problem as a Monte-Carlo Tree Search (MCTS) problem. That is, we select the next seed for fuzzing by walking this “seed mutation tree” through an optimal path, based on the estimation of MCTS. We implement two prototypes, Alphuzz on top of AFL and Alphuzz++ on top of AFL++. The evaluation results on three datasets (the UniFuzz dataset, the CGC binaries, and 12 real-world binaries) show that Alphuzz and Alphuzz++ outperform state-of-the-art fuzzers with higher code coverage and more discovered vulnerabilities. In particular, Alphuzz discovers 3 new vulnerabilities with CVEs.

References

[1]

Akash Arora, Robert Fitch, and Salah Sukkarieh. 2017. An approach to autonomous science by modeling geological knowledge in a Bayesian framework. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Vancouver, BC, Canada, September 24-28, 2017. IEEE, 3803–3810.

Digital Library

[2]

Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. 2019. REDQUEEN: Fuzzing with Input-to-State Correspondence. In NDSS, Vol. 19. 1–15.

[3]

Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine learning 47, 2-3 (2002), 235–256.

Digital Library

[4]

Hendrik Baier and Michael Kaisers. 2020. Guiding Multiplayer MCTS by Focusing on Yourself. In IEEE Conference on Games, CoG 2020, Osaka, Japan, August 24-27, 2020. 550–557.

[5]

Marcel Böhme, Valentin J. M. Manès, and Sang Kil Cha. 2020. Boosting Fuzzer Efficiency: An Information Theoretic Perspective. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Virtual Event, USA) (ESEC/FSE 2020). Association for Computing Machinery, New York, NY, USA, 678–689.

Digital Library

[6]

Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2017. Coverage-based greybox fuzzing as markov chain. IEEE Transactions on Software Engineering 45, 5 (2017), 489–506.

[7]

Cameron B Browne, Edward Powley, Daniel Whitehouse, Simon M Lucas, Peter I Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis, and Simon Colton. 2012. A survey of monte carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in games 4, 1(2012), 1–43.

[8]

Peng Chen and Hao Chen. 2018. Angora: Efficient Fuzzing by Principled Search. In 2018 IEEE Symposium on Security and Privacy, SP 2018, Proceedings, 21-23 May 2018, San Francisco, California, USA. 711–725.

[9]

Yaohui Chen, Mansour Ahmadi, Reza Mirzazade Farkhani, Boyu Wang, and Long Lu. 2020. MEUZZ: Smart Seed Scheduling for Hybrid Fuzzing. CoRR abs/2002.08568(2020).

[10]

Vitaly Chipounov, Volodymyr Kuznetsov, and George Candea. 2011. S2E: a platform for in-vivo multi-path analysis of software systems. In Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2011, Newport Beach, CA, USA, March 5-11, 2011, Rajiv Gupta and Todd C. Mowry (Eds.). ACM, 265–278.

Digital Library

[11]

Micah Corah and Nathan Michael. 2017. Efficient Online Multi-robot Exploration via Distributed Sequential Greedy Assignment. In Robotics: Science and Systems, Vol. 13.

[12]

Aleksander Czechowski and Frans A. Oliehoek. 2020. Decentralized MCTS via Learned Teammate Models. CoRR abs/2003.08727(2020).

[13]

DARPA. 2017. Cyber Grand Challenge Challenge Repository. http://www.lungetech.com/cgc-corpus/.

[14]

Simon Demediuk, Marco Tamassia, Xiaodong Li, and William L. Raffe. 2019. Challenging AI: Evaluating the Effect of MCTS-Driven Dynamic Difficulty Adjustment on Player Enjoyment. In Proceedings of the Australasian Computer Science Week Multiconference, ACSW 2019, Sydney, NSW, Australia, January 29-31, 2019. 43:1–43:7.

Digital Library

[15]

Andrea Fioraldi, Dominik Maier, Heiko Eißfeldt, and Marc Heuse. 2020. AFL++: Combining Incremental Steps of Fuzzing Research. In 14th USENIX Workshop on Offensive Technologies (WOOT 20). USENIX Association.

[16]

Jonathan Foote. 2018. The exploitable GDB plugin. https://github.com/jfoote/exploitable.

[17]

Shuitao Gan, Chao Zhang, Peng Chen, Bodong Zhao, Xiaojun Qin, Dong Wu, and Zuoning Chen. 2020. GREYONE: Data Flow Sensitive Fuzzing. In 29th USENIX Security Symposium (USENIX Security 20). USENIX Association, Boston, MA.

[18]

Shuitao Gan, Chao Zhang, Xiaojun Qin, Xuwen Tu, Kang Li, Zhongyu Pei, and Zuoning Chen. 2018. CollAFL: Path Sensitive Fuzzing. In 2018 IEEE Symposium on Security and Privacy, SP 2018, Proceedings, 21-23 May 2018, San Francisco, California, USA. 679–696.

[19]

GDB. 2019. GDB: The GNU Project Debugger. https://www.gnu.org/software/gdb/.

[20]

Benjamin Hefferan, Oliver M Cliff, and Robert Fitch. 2016. Adversarial patrolling with reactive point processes. In Proceedings of the ARAA Australasian Conference on Robotics and Automation (ARAA, 2016). 39–46.

[21]

George Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and Michael Hicks. 2018. Evaluating Fuzz Testing. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, Toronto, ON, Canada, October 15-19, 2018. 2123–2138.

Digital Library

[22]

Levente Kocsis, Csaba Szepesvári, and Jan Willemson. 2006. Improved monte-carlo search. Univ. Tartu, Estonia, Tech. Rep 1 (2006).

[23]

Mikko Lauri and Risto Ritala. 2016. Planning for robotic exploration based on forward simulation. Robotics and Autonomous Systems 83 (2016), 15–31.

Digital Library

[24]

Caroline Lemieux and Koushik Sen. 2018. Fairfuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 475–485.

Digital Library

[25]

Yuwei Li, Shouling Ji, Yuan Chen, Sizhuang Liang, Wei-Han Lee, Yueyao Chen, Chenyang Lyu, Chunming Wu, Raheem Beyah, Peng Cheng, Kangjie Lu, and Ting Wang. 2020. UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers. CoRR abs/2010.01785(2020).

[26]

An-Jen Liu, Ti-Rong Wu, I-Chen Wu, Hung Guei, and Ting-Han Wei. 2020. Strength Adjustment and Assessment for MCTS-Based Programs [Research Frontier]. IEEE Comput. Intell. Mag. 15, 3 (2020), 60–73.

[27]

Dongge Liu, Gidon Ernst, Toby Murray, and Benjamin I. P. Rubinstein. 2020. Legion: Best-First Concolic Testing. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. Association for Computing Machinery, New York, NY, USA, 54–65.

[28]

Chenyang Lyu, Shouling Ji, Chao Zhang, Yuwei Li, Wei-Han Lee, Yu Song, and Raheem Beyah. 2019. MOPT: Optimized mutation scheduling for fuzzers. In 28th USENIX Security Symposium (USENIX Security 19). 1949–1966.

Digital Library

[29]

Hector D. Menendez and David Clark. 2021. Hashing Fuzzing: Introducing Input Diversity to Improve Crash Detection. IEEE Transactions on Software Engineering(2021), 1–1.

[30]

Stefan Nagy and Matthew Hicks. 2019. Full-Speed Fuzzing: Reducing Fuzzing Overhead through Coverage-Guided Tracing. In 2019 IEEE Symposium on Security and Privacy, SP 2019, San Francisco, CA, USA, May 19-23, 2019. 787–802.

[31]

Trail of Bits. 2016. DARPA Challenges Sets for Linux, Windows, and macOS. Accessed on September 10th 2020. https://github.com/trailofbits/cb-multios.

[32]

Abdessamed Ouessai, Mohammed Salem, and Antonio Miguel Mora. 2020. Parametric action pre-selection for MCTS in real-time strategy games. In Proceedings of the VI Congreso de la Sociedad Española para las Ciencias del Videojuego, On-line, October 7-8, 2020. 104–115.

[33]

p value. 2022. p value. https://en.wikipedia.org/wiki/P-value.

[34]

Timothy Patten, Wolfram Martens, and Robert Fitch. 2018. Monte Carlo planning for active object classification. Autonomous Robots 42, 2 (2018), 391–421.

[35]

Mohit Rajpal, William Blum, and Rishabh Singh. 2017. Not all bytes are equal: Neural byte sieve for fuzzing. CoRR abs/1711.04596(2017).

[36]

Kostya Serebryany. 2015. libFuzzer–a library for coverage-guided fuzz testing. LLVM project (2015).

[37]

Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitriy Vyukov. 2017. Addresssanitizer. https://github.com/google/sanitizers/wiki.

[38]

Eren Sezener and Peter Dayan. 2020. Static and Dynamic Values of Computation in MCTS. CoRR abs/2002.04335(2020). https://arxiv.org/abs/2002.04335

[39]

Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana. 2019. NEUZZ: Efficient Fuzzing with Neural Program Smoothing. In 2019 IEEE Symposium on Security and Privacy, SP 2019, San Francisco, CA, USA, May 19-23, 2019. 803–817.

[40]

Robert Swiecki. 2017. Honggfuzz: A general-purpose, easy-to-use fuzzer with interesting analysis options. URl: https://github. com/google/honggfuzz (visited on 06/21/2017) (2017).

[41]

The Clang Team. 2017. Addresssanitizer. http://clang.llvm.org/docs/AddressSanitizer.html.

[42]

Junjie Wang, Bihuan Chen, Lei Wei, and Yang Liu. 2017. Skyfire: Data-Driven Seed Generation for Fuzzing. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017. 579–594.

[43]

Jinghan Wang, Yue Duan, Wei Song, Heng Yin, and Chengyu Song. 2019. Be Sensitive and Collaborative: Analyzing Impact of Coverage Metrics in Greybox Fuzzing. In 22nd International Symposium on Research in Attacks, Intrusions and Defenses, RAID 2019, Chaoyang District, Beijing, China, September 23-25, 2019.

[44]

Jinghan Wang, Chengyu Song, and Heng Yin. 2021. Reinforcement Learning-based Hierarchical Seed Scheduling for Greybox Fuzzing. In 28th Annual Network and Distributed System Security Symposium, NDSS 2021, virtually, February 21-25, 2021. The Internet Society.

[45]

Wen Xu, Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. 2017. Designing New Operating Primitives to Improve Fuzzing Performance. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017. 2313–2328.

Digital Library

[46]

Tai Yue, Pengfei Wang, Yong Tang, Enze Wang, Bo Yu, Kai Lu, and Xu Zhou. 2020. EcoFuzz: Adaptive Energy-Saving Greybox Fuzzing as a Variant of the Adversarial Multi-Armed Bandit. In 29th USENIX Security Symposium (USENIX Security 20).

[47]

M. Zalewski. 2017. American Fuzzy Lop. http://lcamtuf.coredump.cx/afl/.

[48]

Gen Zhang, Xu Zhou, Yingqi Luo, Xugang Wu, and Erxue Min. 2018. PTfuzz: Guided Fuzzing With Processor Trace Feedback. IEEE Access 6(2018), 37302–37313.

[49]

Lei Zhao, Yue Duan, Heng Yin, and Jifeng Xuan. 2019. Send Hardest Problems My Way: Probabilistic Path Prioritization for Hybrid Fuzzing. In 26th Annual Network and Distributed System Security Symposium, NDSS 2019, San Diego, California, USA, February 24-27, 2019. The Internet Society.

Cited By

Cheng MZhu KChen YLu YChen CYu J(2024)Reinforcement Learning-Based Multi-Phase Seed Scheduling for Network Protocol FuzzingElectronics10.3390/electronics1324496213:24(4962)Online publication date: 17-Dec-2024
https://doi.org/10.3390/electronics13244962
Qian RZhang QFang CYang DLi SLi BChen Z(2024)DiPri: Distance-Based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/365444034:1(1-39)Online publication date: 19-Dec-2024
https://dl.acm.org/doi/10.1145/3654440
Huang HChiu HShi QYao PZhang C(2024)Balance Seed Scheduling via Monte Carlo PlanningIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2023.328529321:3(1469-1483)Online publication date: May-2024
https://doi.org/10.1109/TDSC.2023.3285293
Show More Cited By

Index Terms

Alphuzz: Monte Carlo Search on Seed-Mutation Tree for Coverage-Guided Fuzzing
1. Security and privacy
  1. Software and application security
    1. Software security engineering

Recommendations

Seed selection for successful fuzzing
ISSTA 2021: Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis

Mutation-based greybox fuzzing---unquestionably the most widely-used fuzzing technique---relies on a set of non-crashing seed inputs (a corpus) to bootstrap the bug-finding process. When evaluating a fuzzer, common approaches for constructing this ...
Coverage-based Greybox Fuzzing as Markov Chain
CCS '16: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security

Coverage-based Greybox Fuzzing (CGF) is a random testing approach that requires no program analysis. A new test is generated by slightly mutating a seed input. If the test exercises a new and interesting path, it is added to the set of seeds; otherwise, ...
Graphuzz: Data-driven Seed Scheduling for Coverage-guided Greybox Fuzzing
Seed scheduling is a critical step of greybox fuzzing, which assigns different weights to seed test cases during seed selection, and significantly impacts the efficiency of fuzzing. Existing seed scheduling strategies rely on manually designed models to ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Other conferences

ACSAC '22: Proceedings of the 38th Annual Computer Security Applications Conference

December 2022

1021 pages

ISBN:9781450397599

DOI:10.1145/3564625

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 December 2022

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Badges

Artifacts Evaluated & Reusable / v1.1

Author Tags

Qualifiers

Research-article
Research
Refereed limited

Conference

ACSAC

ACSAC: Annual Computer Security Applications Conference

December 5 - 9, 2022

TX, Austin, USA

Acceptance Rates

Overall Acceptance Rate 104 of 497 submissions, 21%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

6
Total Citations
View Citations
388
Total Downloads

Downloads (Last 12 months)139
Downloads (Last 6 weeks)8

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

Cited By

Cheng MZhu KChen YLu YChen CYu J(2024)Reinforcement Learning-Based Multi-Phase Seed Scheduling for Network Protocol FuzzingElectronics10.3390/electronics1324496213:24(4962)Online publication date: 17-Dec-2024
https://doi.org/10.3390/electronics13244962
Qian RZhang QFang CYang DLi SLi BChen Z(2024)DiPri: Distance-Based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/365444034:1(1-39)Online publication date: 19-Dec-2024
https://dl.acm.org/doi/10.1145/3654440
Huang HChiu HShi QYao PZhang C(2024)Balance Seed Scheduling via Monte Carlo PlanningIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2023.328529321:3(1469-1483)Online publication date: May-2024
https://doi.org/10.1109/TDSC.2023.3285293
Ding KWang ZZhang W(2024)Directed Fuzzing with Adaptive Path Guidance and Weighted Distribution2024 IEEE 24th International Conference on Software Quality, Reliability, and Security Companion (QRS-C)10.1109/QRS-C63300.2024.00112(839-848)Online publication date: 1-Jul-2024
https://doi.org/10.1109/QRS-C63300.2024.00112
Golla MGodboley S(2024)Automated SC-MCC test case generation using coverage-guided fuzzingSoftware Quality Journal10.1007/s11219-024-09667-332:3(849-880)Online publication date: 1-Sep-2024
https://dl.acm.org/doi/10.1007/s11219-024-09667-3
Xie CJia PYang PHu CKuang HYe GHong X(2023)Not All Seeds Are Important: Fuzzing Guided by Untouched EdgesApplied Sciences10.3390/app13241317213:24(13172)Online publication date: 12-Dec-2023
https://doi.org/10.3390/app132413172

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

HTML Format

View this article in HTML Format.

Figures

Tables

Media

View Table of Conten