DOI: 10.1109/ICSE.2019.00081
research-article

Superion: grammar-aware greybox fuzzing

Published: 25 May 2019

Abstract

In recent years, coverage-based greybox fuzzing has proven itself to be one of the most effective techniques for finding security bugs in practice. In particular, American Fuzzy Lop (AFL) has been very successful at fuzzing relatively simple test inputs. However, on structured test inputs such as XML and JavaScript, AFL's grammar-blind trimming and mutation strategies hinder both its effectiveness and its efficiency.
To this end, we propose a grammar-aware coverage-based greybox fuzzing approach to fuzz programs that process structured inputs. Given the grammar of the test inputs (which is often publicly available), we introduce a grammar-aware trimming strategy that trims test inputs at the tree level using the abstract syntax trees (ASTs) of parsed test inputs. Further, we introduce two grammar-aware mutation strategies: enhanced dictionary-based mutation and tree-based mutation. Specifically, tree-based mutation works by replacing subtrees in the ASTs of parsed test inputs. Equipped with grammar-awareness, our approach can extend the fuzzing exploration in both width and depth.
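The two tree-level ideas in the paragraph above can be illustrated with a minimal Python sketch. It is not Superion's implementation: the `Node` class, the tiny XML subset, the `keeps_behavior` oracle (standing in for whatever check the fuzzer uses to decide that a trimmed input still behaves the same), and the same-kind restriction on replacement subtrees are all assumptions made for illustration. A real setup would obtain the tree from a grammar-based parser rather than build it by hand.

```python
import copy
import random
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Node:
    """Toy AST node for a tiny XML subset (hypothetical, for illustration only)."""
    kind: str                      # "elem" or "text"
    value: str                     # tag name for "elem", character data for "text"
    children: List["Node"] = field(default_factory=list)

    def unparse(self) -> str:
        """Serialize the tree back into an input string."""
        if self.kind == "text":
            return self.value
        inner = "".join(child.unparse() for child in self.children)
        return f"<{self.value}>{inner}</{self.value}>"


def subtrees(root: Node) -> Iterator[Node]:
    """Yield every node of the tree in pre-order."""
    yield root
    for child in root.children:
        yield from subtrees(child)


def trim(root: Node, keeps_behavior: Callable[[str], bool]) -> Node:
    """Greedily delete whole subtrees while the oracle still accepts the input."""
    root = copy.deepcopy(root)
    shrunk = True
    while shrunk:
        shrunk = False
        for parent in list(subtrees(root)):
            for i in range(len(parent.children)):
                saved = parent.children
                parent.children = saved[:i] + saved[i + 1:]
                if keeps_behavior(root.unparse()):
                    shrunk = True
                    break
                parent.children = saved   # deletion changed behavior: undo it
            if shrunk:
                break                     # restart the scan on the smaller tree
    return root


def tree_mutate(target: Node, donor: Node, rng: random.Random) -> str:
    """Replace one random subtree of `target` with a same-kind subtree of `donor`."""
    mutant = copy.deepcopy(target)
    site = rng.choice(list(subtrees(mutant)))
    candidates = [n for n in subtrees(donor) if n.kind == site.kind]
    if candidates:
        pick = copy.deepcopy(rng.choice(candidates))
        site.value, site.children = pick.value, pick.children
    return mutant.unparse()


doc = Node("elem", "a", [Node("elem", "b", [Node("text", "x")]),
                         Node("elem", "c", [Node("text", "y")])])
# Assumed oracle: the "interesting" behavior only needs a <c> element.
print(trim(doc, lambda s: "<c>" in s).unparse())   # <a><c></c></a>
```

Because both operations edit whole subtrees and then unparse, every output of this sketch is still a well-formed document in the toy grammar, which is the property that byte-level trimming and mutation cannot guarantee.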
We implemented our approach as an extension to AFL, named Superion, and evaluated its effectiveness on large-scale programs (an XML engine, libplist, and three JavaScript engines: WebKit, Jerryscript, and ChakraCore). Our results demonstrate that Superion improves code coverage (by 16.7% in line coverage and 8.8% in function coverage) and bug-finding capability (34 new bugs, among which 22 are new vulnerabilities, with 19 CVEs assigned and 3.2K USD in bug bounty rewards received) over AFL and jsfunfuzz.

Information & Contributors

Information

Published In

ICSE '19: Proceedings of the 41st International Conference on Software Engineering
May 2019
1318 pages

Publisher

IEEE Press

Author Tags

  1. ASTs
  2. greybox fuzzing
  3. structured inputs

Qualifiers

  • Research-article

Conference

ICSE '19

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 27
  • Downloads (Last 6 weeks): 8
Reflects downloads up to 13 Dec 2024

Cited By

  • (2024) Magneto: A Step-Wise Approach to Exploit Vulnerabilities in Dependent Libraries via LLM-Empowered Directed Fuzzing. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 1633-1644. DOI: 10.1145/3691620.3695531. Online publication date: 27-Oct-2024
  • (2024) Incremental Context-free Grammar Inference in Black Box Settings. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 1171-1182. DOI: 10.1145/3691620.3695494. Online publication date: 27-Oct-2024
  • (2024) Visualizing and Understanding the Internals of Fuzzing. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 2199-2204. DOI: 10.1145/3691620.3695284. Online publication date: 27-Oct-2024
  • (2024) The Havoc Paradox in Generator-Based Fuzzing (Registered Report). Proceedings of the 3rd ACM International Fuzzing Workshop, pp. 3-12. DOI: 10.1145/3678722.3685529. Online publication date: 13-Sep-2024
  • (2024) Fuzzing JavaScript Engines with a Graph-based IR. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pp. 3734-3748. DOI: 10.1145/3658644.3690336. Online publication date: 2-Dec-2024
  • (2024) Test Suites Guided Vulnerability Validation for Node.js Applications. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pp. 570-584. DOI: 10.1145/3658644.3690332. Online publication date: 2-Dec-2024
  • (2024) Prompt Fuzzing for Fuzz Driver Generation. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pp. 3793-3807. DOI: 10.1145/3658644.3670396. Online publication date: 2-Dec-2024
  • (2024) Fuzzing JavaScript Interpreters with Coverage-Guided Reinforcement Learning for LLM-Based Mutation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 1656-1668. DOI: 10.1145/3650212.3680389. Online publication date: 11-Sep-2024
  • (2024) An Empirical Examination of Fuzzer Mutator Performance. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 1631-1642. DOI: 10.1145/3650212.3680387. Online publication date: 11-Sep-2024
  • (2024) Fuzzing MLIR Compiler Infrastructure via Operation Dependency Analysis. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 1287-1299. DOI: 10.1145/3650212.3680360. Online publication date: 11-Sep-2024
