[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3564625.3564660acmotherconferencesArticle/Chapter ViewAbstractPublication PagesacsacConference Proceedingsconference-collections
research-article

Alphuzz: Monte Carlo Search on Seed-Mutation Tree for Coverage-Guided Fuzzing

Published: 05 December 2022 Publication History

Abstract

Coverage-based greybox fuzzing (CGF) has been approved to be effective in finding security vulnerabilities. Seed scheduling, the process of selecting an input as the seed from the seed pool for the next fuzzing iteration, plays a central role in CGF. Although numerous seed scheduling strategies have been proposed, most of them treat these seeds independently and do not explicitly consider the relationships among seeds.
In this study, we make a key observation that the relationships among seeds are valuable for seed scheduling. We design and propose a “seed mutation tree” by investigating and leveraging the mutation relationships among seeds. With the “seed mutation tree”, we further model the seed scheduling problem as a Monte-Carlo Tree Search (MCTS) problem. That is, we select the next seed for fuzzing by walking this “seed mutation tree” through an optimal path, based on the estimation of MCTS. We implement two prototypes, Alphuzz on top of AFL and Alphuzz++ on top of AFL++. The evaluation results on three datasets (the UniFuzz dataset, the CGC binaries, and 12 real-world binaries) show that Alphuzz and Alphuzz++ outperform state-of-the-art fuzzers with higher code coverage and more discovered vulnerabilities. In particular, Alphuzz discovers 3 new vulnerabilities with CVEs.

References

[1]
Akash Arora, Robert Fitch, and Salah Sukkarieh. 2017. An approach to autonomous science by modeling geological knowledge in a Bayesian framework. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Vancouver, BC, Canada, September 24-28, 2017. IEEE, 3803–3810.
[2]
Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. 2019. REDQUEEN: Fuzzing with Input-to-State Correspondence. In NDSS, Vol. 19. 1–15.
[3]
Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine learning 47, 2-3 (2002), 235–256.
[4]
Hendrik Baier and Michael Kaisers. 2020. Guiding Multiplayer MCTS by Focusing on Yourself. In IEEE Conference on Games, CoG 2020, Osaka, Japan, August 24-27, 2020. 550–557.
[5]
Marcel Böhme, Valentin J. M. Manès, and Sang Kil Cha. 2020. Boosting Fuzzer Efficiency: An Information Theoretic Perspective. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Virtual Event, USA) (ESEC/FSE 2020). Association for Computing Machinery, New York, NY, USA, 678–689.
[6]
Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2017. Coverage-based greybox fuzzing as markov chain. IEEE Transactions on Software Engineering 45, 5 (2017), 489–506.
[7]
Cameron B Browne, Edward Powley, Daniel Whitehouse, Simon M Lucas, Peter I Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis, and Simon Colton. 2012. A survey of monte carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in games 4, 1(2012), 1–43.
[8]
Peng Chen and Hao Chen. 2018. Angora: Efficient Fuzzing by Principled Search. In 2018 IEEE Symposium on Security and Privacy, SP 2018, Proceedings, 21-23 May 2018, San Francisco, California, USA. 711–725.
[9]
Yaohui Chen, Mansour Ahmadi, Reza Mirzazade Farkhani, Boyu Wang, and Long Lu. 2020. MEUZZ: Smart Seed Scheduling for Hybrid Fuzzing. CoRR abs/2002.08568(2020).
[10]
Vitaly Chipounov, Volodymyr Kuznetsov, and George Candea. 2011. S2E: a platform for in-vivo multi-path analysis of software systems. In Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2011, Newport Beach, CA, USA, March 5-11, 2011, Rajiv Gupta and Todd C. Mowry (Eds.). ACM, 265–278.
[11]
Micah Corah and Nathan Michael. 2017. Efficient Online Multi-robot Exploration via Distributed Sequential Greedy Assignment. In Robotics: Science and Systems, Vol. 13.
[12]
Aleksander Czechowski and Frans A. Oliehoek. 2020. Decentralized MCTS via Learned Teammate Models. CoRR abs/2003.08727(2020).
[13]
DARPA. 2017. Cyber Grand Challenge Challenge Repository. http://www.lungetech.com/cgc-corpus/.
[14]
Simon Demediuk, Marco Tamassia, Xiaodong Li, and William L. Raffe. 2019. Challenging AI: Evaluating the Effect of MCTS-Driven Dynamic Difficulty Adjustment on Player Enjoyment. In Proceedings of the Australasian Computer Science Week Multiconference, ACSW 2019, Sydney, NSW, Australia, January 29-31, 2019. 43:1–43:7.
[15]
Andrea Fioraldi, Dominik Maier, Heiko Eißfeldt, and Marc Heuse. 2020. AFL++: Combining Incremental Steps of Fuzzing Research. In 14th USENIX Workshop on Offensive Technologies (WOOT 20). USENIX Association.
[16]
Jonathan Foote. 2018. The exploitable GDB plugin. https://github.com/jfoote/exploitable.
[17]
Shuitao Gan, Chao Zhang, Peng Chen, Bodong Zhao, Xiaojun Qin, Dong Wu, and Zuoning Chen. 2020. GREYONE: Data Flow Sensitive Fuzzing. In 29th USENIX Security Symposium (USENIX Security 20). USENIX Association, Boston, MA.
[18]
Shuitao Gan, Chao Zhang, Xiaojun Qin, Xuwen Tu, Kang Li, Zhongyu Pei, and Zuoning Chen. 2018. CollAFL: Path Sensitive Fuzzing. In 2018 IEEE Symposium on Security and Privacy, SP 2018, Proceedings, 21-23 May 2018, San Francisco, California, USA. 679–696.
[19]
GDB. 2019. GDB: The GNU Project Debugger. https://www.gnu.org/software/gdb/.
[20]
Benjamin Hefferan, Oliver M Cliff, and Robert Fitch. 2016. Adversarial patrolling with reactive point processes. In Proceedings of the ARAA Australasian Conference on Robotics and Automation (ARAA, 2016). 39–46.
[21]
George Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and Michael Hicks. 2018. Evaluating Fuzz Testing. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, Toronto, ON, Canada, October 15-19, 2018. 2123–2138.
[22]
Levente Kocsis, Csaba Szepesvári, and Jan Willemson. 2006. Improved monte-carlo search. Univ. Tartu, Estonia, Tech. Rep 1 (2006).
[23]
Mikko Lauri and Risto Ritala. 2016. Planning for robotic exploration based on forward simulation. Robotics and Autonomous Systems 83 (2016), 15–31.
[24]
Caroline Lemieux and Koushik Sen. 2018. Fairfuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 475–485.
[25]
Yuwei Li, Shouling Ji, Yuan Chen, Sizhuang Liang, Wei-Han Lee, Yueyao Chen, Chenyang Lyu, Chunming Wu, Raheem Beyah, Peng Cheng, Kangjie Lu, and Ting Wang. 2020. UNIFUZZ: A Holistic and Pragmatic Metrics-Driven Platform for Evaluating Fuzzers. CoRR abs/2010.01785(2020).
[26]
An-Jen Liu, Ti-Rong Wu, I-Chen Wu, Hung Guei, and Ting-Han Wei. 2020. Strength Adjustment and Assessment for MCTS-Based Programs [Research Frontier]. IEEE Comput. Intell. Mag. 15, 3 (2020), 60–73.
[27]
Dongge Liu, Gidon Ernst, Toby Murray, and Benjamin I. P. Rubinstein. 2020. Legion: Best-First Concolic Testing. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. Association for Computing Machinery, New York, NY, USA, 54–65.
[28]
Chenyang Lyu, Shouling Ji, Chao Zhang, Yuwei Li, Wei-Han Lee, Yu Song, and Raheem Beyah. 2019. MOPT: Optimized mutation scheduling for fuzzers. In 28th USENIX Security Symposium (USENIX Security 19). 1949–1966.
[29]
Hector D. Menendez and David Clark. 2021. Hashing Fuzzing: Introducing Input Diversity to Improve Crash Detection. IEEE Transactions on Software Engineering(2021), 1–1.
[30]
Stefan Nagy and Matthew Hicks. 2019. Full-Speed Fuzzing: Reducing Fuzzing Overhead through Coverage-Guided Tracing. In 2019 IEEE Symposium on Security and Privacy, SP 2019, San Francisco, CA, USA, May 19-23, 2019. 787–802.
[31]
Trail of Bits. 2016. DARPA Challenges Sets for Linux, Windows, and macOS. Accessed on September 10th 2020. https://github.com/trailofbits/cb-multios.
[32]
Abdessamed Ouessai, Mohammed Salem, and Antonio Miguel Mora. 2020. Parametric action pre-selection for MCTS in real-time strategy games. In Proceedings of the VI Congreso de la Sociedad Española para las Ciencias del Videojuego, On-line, October 7-8, 2020. 104–115.
[33]
p value. 2022. p value. https://en.wikipedia.org/wiki/P-value.
[34]
Timothy Patten, Wolfram Martens, and Robert Fitch. 2018. Monte Carlo planning for active object classification. Autonomous Robots 42, 2 (2018), 391–421.
[35]
Mohit Rajpal, William Blum, and Rishabh Singh. 2017. Not all bytes are equal: Neural byte sieve for fuzzing. CoRR abs/1711.04596(2017).
[36]
Kostya Serebryany. 2015. libFuzzer–a library for coverage-guided fuzz testing. LLVM project (2015).
[37]
Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitriy Vyukov. 2017. Addresssanitizer. https://github.com/google/sanitizers/wiki.
[38]
Eren Sezener and Peter Dayan. 2020. Static and Dynamic Values of Computation in MCTS. CoRR abs/2002.04335(2020). https://arxiv.org/abs/2002.04335
[39]
Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana. 2019. NEUZZ: Efficient Fuzzing with Neural Program Smoothing. In 2019 IEEE Symposium on Security and Privacy, SP 2019, San Francisco, CA, USA, May 19-23, 2019. 803–817.
[40]
Robert Swiecki. 2017. Honggfuzz: A general-purpose, easy-to-use fuzzer with interesting analysis options. URl: https://github. com/google/honggfuzz (visited on 06/21/2017) (2017).
[41]
The Clang Team. 2017. Addresssanitizer. http://clang.llvm.org/docs/AddressSanitizer.html.
[42]
Junjie Wang, Bihuan Chen, Lei Wei, and Yang Liu. 2017. Skyfire: Data-Driven Seed Generation for Fuzzing. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017. 579–594.
[43]
Jinghan Wang, Yue Duan, Wei Song, Heng Yin, and Chengyu Song. 2019. Be Sensitive and Collaborative: Analyzing Impact of Coverage Metrics in Greybox Fuzzing. In 22nd International Symposium on Research in Attacks, Intrusions and Defenses, RAID 2019, Chaoyang District, Beijing, China, September 23-25, 2019.
[44]
Jinghan Wang, Chengyu Song, and Heng Yin. 2021. Reinforcement Learning-based Hierarchical Seed Scheduling for Greybox Fuzzing. In 28th Annual Network and Distributed System Security Symposium, NDSS 2021, virtually, February 21-25, 2021. The Internet Society.
[45]
Wen Xu, Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. 2017. Designing New Operating Primitives to Improve Fuzzing Performance. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017. 2313–2328.
[46]
Tai Yue, Pengfei Wang, Yong Tang, Enze Wang, Bo Yu, Kai Lu, and Xu Zhou. 2020. EcoFuzz: Adaptive Energy-Saving Greybox Fuzzing as a Variant of the Adversarial Multi-Armed Bandit. In 29th USENIX Security Symposium (USENIX Security 20).
[47]
M. Zalewski. 2017. American Fuzzy Lop. http://lcamtuf.coredump.cx/afl/.
[48]
Gen Zhang, Xu Zhou, Yingqi Luo, Xugang Wu, and Erxue Min. 2018. PTfuzz: Guided Fuzzing With Processor Trace Feedback. IEEE Access 6(2018), 37302–37313.
[49]
Lei Zhao, Yue Duan, Heng Yin, and Jifeng Xuan. 2019. Send Hardest Problems My Way: Probabilistic Path Prioritization for Hybrid Fuzzing. In 26th Annual Network and Distributed System Security Symposium, NDSS 2019, San Diego, California, USA, February 24-27, 2019. The Internet Society.

Cited By

View all
  • (2024)Reinforcement Learning-Based Multi-Phase Seed Scheduling for Network Protocol FuzzingElectronics10.3390/electronics1324496213:24(4962)Online publication date: 17-Dec-2024
  • (2024)DiPri: Distance-based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/3654440Online publication date: 26-Mar-2024
  • (2024)Balance Seed Scheduling via Monte Carlo PlanningIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2023.328529321:3(1469-1483)Online publication date: May-2024
  • Show More Cited By

Index Terms

  1. Alphuzz: Monte Carlo Search on Seed-Mutation Tree for Coverage-Guided Fuzzing

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    ACSAC '22: Proceedings of the 38th Annual Computer Security Applications Conference
    December 2022
    1021 pages
    ISBN:9781450397599
    DOI:10.1145/3564625
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 05 December 2022

    Permissions

    Request permissions for this article.

    Check for updates

    Badges

    Author Tags

    1. Fuzzing
    2. Seed scheduling strategy
    3. Vulnerability detection

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ACSAC

    Acceptance Rates

    Overall Acceptance Rate 104 of 497 submissions, 21%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)145
    • Downloads (Last 6 weeks)18
    Reflects downloads up to 14 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Reinforcement Learning-Based Multi-Phase Seed Scheduling for Network Protocol FuzzingElectronics10.3390/electronics1324496213:24(4962)Online publication date: 17-Dec-2024
    • (2024)DiPri: Distance-based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/3654440Online publication date: 26-Mar-2024
    • (2024)Balance Seed Scheduling via Monte Carlo PlanningIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2023.328529321:3(1469-1483)Online publication date: May-2024
    • (2024)Directed Fuzzing with Adaptive Path Guidance and Weighted Distribution2024 IEEE 24th International Conference on Software Quality, Reliability, and Security Companion (QRS-C)10.1109/QRS-C63300.2024.00112(839-848)Online publication date: 1-Jul-2024
    • (2024)Automated SC-MCC test case generation using coverage-guided fuzzingSoftware Quality Journal10.1007/s11219-024-09667-332:3(849-880)Online publication date: 1-Sep-2024
    • (2023)Not All Seeds Are Important: Fuzzing Guided by Untouched EdgesApplied Sciences10.3390/app13241317213:24(13172)Online publication date: 12-Dec-2023

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    HTML Format

    View this article in HTML Format.

    HTML Format

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media