[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3213846.3213848acmconferencesArticle/Chapter ViewAbstractPublication PagesisstaConference Proceedingsconference-collections
research-article

Compiler fuzzing through deep learning

Published: 12 July 2018 Publication History

Abstract

Random program generation — fuzzing — is an effective technique for discovering bugs in compilers but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested.
We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers. We apply our approach to the OpenCL programming language, automatically exposing bugs with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports. Our test cases are on average two orders of magnitude smaller than the state-of-the-art, require 3.03× less time to generate and evaluate, and expose bugs which the state-of-the-art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL versus the state-of-the-art taking 9 man months to port from a generator for C and 50,000 lines of code. With 18 lines of code we extended our program generator to a second language, uncovering crashes in Solidity compilers in 12 hours of automated testing.

References

[1]
O. Bastani, R. Sharma, A. Aiken, and P. Liang. Synthesizing Program Input Grammars. In PLDI, 2017.
[2]
A. Betts, N. Chong, and A. Donaldson. GPUVerify: A Verifier for GPU Kernels. In OOPSLA. ACM, 2012.
[3]
A. S. Boujarwah and K. Saleh. Compiler Test Case Generation Methods: A Survey and Assessment. Information and Software Technology, 39(9), 1997.
[4]
S. R. Bowman, L. Vilnis, O. Vinyals, A. M. Dai, R. Jozefowicz, and S. Bengio. Generating Sentences from a Continuous Space. arXiv:1511.06349, 2015.
[5]
J. Chen, Y. Bai, D. Hao, Y. Xiong, H. Zhang, and B. Xie. Learning to Prioritize Test Programs for Compiler Testing. In ICSE, 2017.
[6]
J. Chen, W. Hu, D. Hao, Y. Xiong, H. Zhang, L. Zhang, and B. Xie. An Empirical Comparison of Compiler Testing Techniques. In ICSE, 2016.
[7]
Y. Chen, A. Groce, C. Zhang, W. Wong, X. Fern, E. Eide, and J. Regehr. Taming Compiler Fuzzers. PLDI, 2013.
[8]
M. Choi, S. Jeong, H. Oh, and J. Choo. End-to-End Prediction of Buffer Overruns from Raw Source Code via Neural Memory Networks. arXiv:1703.02458, 2017.
[9]
C. Cummins, P. Petoumenos, Z. Wang, and H. Leather. Endto-end Deep Learning of Optimization Heuristics. In PACT. IEEE, 2017.
[10]
P. Godefroid, H. Peleg, and R. Singh. Learn&Fuzz: Machine Learning for Input Fuzzing. In ASE, 2017.
[11]
K. Heo, H. Oh, and K. Yi. Machine-Learning-Guided Selectively Unsound Static Analysis. In ICSE, 2017.
[12]
S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8), 1997.
[13]
C. Holler, K. Herzig, and A. Zeller. Fuzzing with Code Fragments. Usenix, 2012.
[14]
X. Huo, M. Li, and Z. Zhou. Learning Unified Features from Natural and Programming Languages for Locating Buggy Source Code. In IJCAI, 2016.
[15]
U. Koc, P. Saadatpanah, J. S. Foster, and A. A. Porter. Learning a Classifier for False Positive Error Reports Emitted by Static Code Analysis Tools. In MAPL, 2017.
[16]
A. S. Kossatchev and M. A. Posypkin. Survey of Compiler Testing Methods. Programming and Computer Software, 31(1), 2005.
[17]
M. Koukoutos, M. Raghothaman, E. Kneuss, and V. Kuncak. On Repair with Probabilistic Attribute Grammars. arXiv:1707.04148, 2017.
[18]
A. N. Lam, A. T. Nguyen, H. A. Nguyen, and T. N. Nguyen. Combining Deep Learning with Information Retrieval to Localize Buggy Files for Bug Reports. In ASE, 2015.
[19]
V. Le, M. Afshari, and Z. Su. Compiler Validation via Equivalence Modulo Inputs. In PLDI, 2014.
[20]
C. Lidbury, A. Lascu, N. Chong, and A. Donaldson. Many-Core Compiler Fuzzing. In PLDI, 2015.
[21]
Z. C. Lipton, J. Berkowitz, and C. Elkan. A Critical Review of Recurrent Neural Networks for Sequence Learning. arXiv:1506.00019, 2015.
[22]
W. M. McKeeman. Differential Testing for Software. DTJ, 10(1), 1998.
[23]
E. Nagai, A. Hashimoto, and N. Ishiura. Scaling up Size and Number of Expressions in Random Testing of Arithmetic Optimization of C Compilers. In SASIMI, 2013.
[24]
R. Pacanu, T. Mikolov, and Y. Bengio. On the Difficulties of Training Recurrent Neural Networks. In ICML, 2013.
[25]
M. Pflanzer, A. Donaldson, and A. Lascu. Automatic Test Case Reduction for OpenCL. In IWOCL, 2016.
[26]
J. Price and S. Mcintosh-Smith. Oclgrind: An Extensible OpenCL Device Simulator. In IWOCL. ACM, 2015.
[27]
A. Radford, R. Jozefowicz, and I. Sutskever. Learning to Generate Reviews and Discovering Sentiment. arXiv:1704.01444, 2017.
[28]
M. Raghu, B. Poole, J. Kleinberg, S. Ganguli, and J. Sohl-Dickstein. On the expressive power of deep neural networks. arXiv:1606.05336, 2016.
[29]
C. Sun, V. Le, and Z. Su. Finding Compiler Bugs via Live Code Mutation. In OOPSLA, 2016.
[30]
J. Wang, B. Chen, L. Wei, and Y. Liu. Skyfire: Data-Driven Seed Generation for Fuzzing. In S&P, 2017.
[31]
M. White, M. Tufano, M. Martínez, M. Monperrus, and D. Poshyvanyk. Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities. arXiv:1707.04742.
[32]
X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and Understanding Bugs in C Compilers. In PLDI, 2011.
[33]
M. Zalewski. American Fuzzy Lop.
[34]
Q. Zhang, C. Sun, and Z. Su. Skeletal Program Enumeration for Rigorous Compiler Testing. In PLDI, 2017.

Cited By

View all
  • (2024)Rustlantis: Randomized Differential Testing of the Rust CompilerProceedings of the ACM on Programming Languages10.1145/36897808:OOPSLA2(1955-1981)Online publication date: 8-Oct-2024
  • (2024)SMT2Test: From SMT Formulas to Effective Test CasesProceedings of the ACM on Programming Languages10.1145/36897198:OOPSLA2(222-245)Online publication date: 8-Oct-2024
  • (2024)State Reconciliation Defects in Infrastructure as CodeProceedings of the ACM on Software Engineering10.1145/36607901:FSE(1865-1888)Online publication date: 12-Jul-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ISSTA 2018: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2018
379 pages
ISBN:9781450356992
DOI:10.1145/3213846
  • General Chair:
  • Frank Tip,
  • Program Chair:
  • Eric Bodden
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 July 2018

Permissions

Request permissions for this article.

Check for updates

Badges

  • Distinguished Paper

Author Tags

  1. Compiler Fuzzing
  2. Deep Learning
  3. Differential Testing

Qualifiers

  • Research-article

Funding Sources

  • EPSRC

Conference

ISSTA '18
Sponsor:

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Upcoming Conference

ISSTA '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)160
  • Downloads (Last 6 weeks)18
Reflects downloads up to 20 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Rustlantis: Randomized Differential Testing of the Rust CompilerProceedings of the ACM on Programming Languages10.1145/36897808:OOPSLA2(1955-1981)Online publication date: 8-Oct-2024
  • (2024)SMT2Test: From SMT Formulas to Effective Test CasesProceedings of the ACM on Programming Languages10.1145/36897198:OOPSLA2(222-245)Online publication date: 8-Oct-2024
  • (2024)State Reconciliation Defects in Infrastructure as CodeProceedings of the ACM on Software Engineering10.1145/36607901:FSE(1865-1888)Online publication date: 12-Jul-2024
  • (2024)Boosting Compiler Testing by Injecting Real-World CodeProceedings of the ACM on Programming Languages10.1145/36563868:PLDI(223-245)Online publication date: 20-Jun-2024
  • (2024)Fuzzing JavaScript Interpreters with Coverage-Guided Reinforcement Learning for LLM-Based MutationProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680389(1656-1668)Online publication date: 11-Sep-2024
  • (2024)Enumerating Valid Non-Alpha-Equivalent Programs for Interpreter TestingACM Transactions on Software Engineering and Methodology10.1145/364799433:5(1-31)Online publication date: 4-Jun-2024
  • (2024)Fast Deterministic Black-box Context-free Grammar InferenceProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639214(1-12)Online publication date: 20-May-2024
  • (2024)Fuzz4All: Universal Fuzzing with Large Language ModelsProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639121(1-13)Online publication date: 20-May-2024
  • (2024)Large Language Models are Edge-Case Generators: Crafting Unusual Programs for Fuzzing Deep Learning LibrariesProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3623343(1-13)Online publication date: 20-May-2024
  • (2024)A Testing Program and Pragma Combination Selection Based Framework for High-Level Synthesis Tool Pragma-Related Bug DetectionIEEE Transactions on Software Engineering10.1109/TSE.2024.336855350:4(937-955)Online publication date: 22-Feb-2024
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media