More Web Proxy on the site http://driver.im/

research-article

Compiler fuzzing through deep learning

Authors:

Pavlos Petoumenos,

Alastair Murray,

Hugh LeatherAuthors Info & Claims

ISSTA 2018: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis

Pages 95 - 105

https://doi.org/10.1145/3213846.3213848

Published: 12 July 2018 Publication History

Abstract

Random program generation — fuzzing — is an effective technique for discovering bugs in compilers but successful fuzzers require extensive development effort for every language supported by the compiler, and often leave parts of the language space untested.

We introduce DeepSmith, a novel machine learning approach to accelerating compiler validation through the inference of generative models for compiler inputs. Our approach infers a learned model of the structure of real world code based on a large corpus of open source code. Then, it uses the model to automatically generate tens of thousands of realistic programs. Finally, we apply established differential testing methodologies on them to expose bugs in compilers. We apply our approach to the OpenCL programming language, automatically exposing bugs with little effort on our side. In 1,000 hours of automated testing of commercial and open source compilers, we discover bugs in all of them, submitting 67 bug reports. Our test cases are on average two orders of magnitude smaller than the state-of-the-art, require 3.03× less time to generate and evaluate, and expose bugs which the state-of-the-art cannot. Our random program generator, comprising only 500 lines of code, took 12 hours to train for OpenCL versus the state-of-the-art taking 9 man months to port from a generator for C and 50,000 lines of code. With 18 lines of code we extended our program generator to a second language, uncovering crashes in Solidity compilers in 12 hours of automated testing.

References

[1]

O. Bastani, R. Sharma, A. Aiken, and P. Liang. Synthesizing Program Input Grammars. In PLDI, 2017.

Digital Library

[2]

A. Betts, N. Chong, and A. Donaldson. GPUVerify: A Verifier for GPU Kernels. In OOPSLA. ACM, 2012.

Digital Library

[3]

A. S. Boujarwah and K. Saleh. Compiler Test Case Generation Methods: A Survey and Assessment. Information and Software Technology, 39(9), 1997.

[4]

S. R. Bowman, L. Vilnis, O. Vinyals, A. M. Dai, R. Jozefowicz, and S. Bengio. Generating Sentences from a Continuous Space. arXiv:1511.06349, 2015.

[5]

J. Chen, Y. Bai, D. Hao, Y. Xiong, H. Zhang, and B. Xie. Learning to Prioritize Test Programs for Compiler Testing. In ICSE, 2017.

Digital Library

[6]

J. Chen, W. Hu, D. Hao, Y. Xiong, H. Zhang, L. Zhang, and B. Xie. An Empirical Comparison of Compiler Testing Techniques. In ICSE, 2016.

Digital Library

[7]

Y. Chen, A. Groce, C. Zhang, W. Wong, X. Fern, E. Eide, and J. Regehr. Taming Compiler Fuzzers. PLDI, 2013.

Digital Library

[8]

M. Choi, S. Jeong, H. Oh, and J. Choo. End-to-End Prediction of Buffer Overruns from Raw Source Code via Neural Memory Networks. arXiv:1703.02458, 2017.

[9]

C. Cummins, P. Petoumenos, Z. Wang, and H. Leather. Endto-end Deep Learning of Optimization Heuristics. In PACT. IEEE, 2017.

[10]

P. Godefroid, H. Peleg, and R. Singh. Learn&Fuzz: Machine Learning for Input Fuzzing. In ASE, 2017.

Digital Library

[11]

K. Heo, H. Oh, and K. Yi. Machine-Learning-Guided Selectively Unsound Static Analysis. In ICSE, 2017.

Digital Library

[12]

S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8), 1997.

Digital Library

[13]

C. Holler, K. Herzig, and A. Zeller. Fuzzing with Code Fragments. Usenix, 2012.

Digital Library

[14]

X. Huo, M. Li, and Z. Zhou. Learning Unified Features from Natural and Programming Languages for Locating Buggy Source Code. In IJCAI, 2016.

Digital Library

[15]

U. Koc, P. Saadatpanah, J. S. Foster, and A. A. Porter. Learning a Classifier for False Positive Error Reports Emitted by Static Code Analysis Tools. In MAPL, 2017.

Digital Library

[16]

A. S. Kossatchev and M. A. Posypkin. Survey of Compiler Testing Methods. Programming and Computer Software, 31(1), 2005.

Digital Library

[17]

M. Koukoutos, M. Raghothaman, E. Kneuss, and V. Kuncak. On Repair with Probabilistic Attribute Grammars. arXiv:1707.04148, 2017.

[18]

A. N. Lam, A. T. Nguyen, H. A. Nguyen, and T. N. Nguyen. Combining Deep Learning with Information Retrieval to Localize Buggy Files for Bug Reports. In ASE, 2015.

Digital Library

[19]

V. Le, M. Afshari, and Z. Su. Compiler Validation via Equivalence Modulo Inputs. In PLDI, 2014.

Digital Library

[20]

C. Lidbury, A. Lascu, N. Chong, and A. Donaldson. Many-Core Compiler Fuzzing. In PLDI, 2015.

Digital Library

[21]

Z. C. Lipton, J. Berkowitz, and C. Elkan. A Critical Review of Recurrent Neural Networks for Sequence Learning. arXiv:1506.00019, 2015.

[22]

W. M. McKeeman. Differential Testing for Software. DTJ, 10(1), 1998.

[23]

E. Nagai, A. Hashimoto, and N. Ishiura. Scaling up Size and Number of Expressions in Random Testing of Arithmetic Optimization of C Compilers. In SASIMI, 2013.

[24]

R. Pacanu, T. Mikolov, and Y. Bengio. On the Difficulties of Training Recurrent Neural Networks. In ICML, 2013.

Digital Library

[25]

M. Pflanzer, A. Donaldson, and A. Lascu. Automatic Test Case Reduction for OpenCL. In IWOCL, 2016.

Digital Library

[26]

J. Price and S. Mcintosh-Smith. Oclgrind: An Extensible OpenCL Device Simulator. In IWOCL. ACM, 2015.

Digital Library

[27]

A. Radford, R. Jozefowicz, and I. Sutskever. Learning to Generate Reviews and Discovering Sentiment. arXiv:1704.01444, 2017.

[28]

M. Raghu, B. Poole, J. Kleinberg, S. Ganguli, and J. Sohl-Dickstein. On the expressive power of deep neural networks. arXiv:1606.05336, 2016.

[29]

C. Sun, V. Le, and Z. Su. Finding Compiler Bugs via Live Code Mutation. In OOPSLA, 2016.

Digital Library

[30]

J. Wang, B. Chen, L. Wei, and Y. Liu. Skyfire: Data-Driven Seed Generation for Fuzzing. In S&P, 2017.

[31]

M. White, M. Tufano, M. Martínez, M. Monperrus, and D. Poshyvanyk. Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities. arXiv:1707.04742.

[32]

X. Yang, Y. Chen, E. Eide, and J. Regehr. Finding and Understanding Bugs in C Compilers. In PLDI, 2011.

Digital Library

[33]

M. Zalewski. American Fuzzy Lop.

[34]

Q. Zhang, C. Sun, and Z. Su. Skeletal Program Enumeration for Rigorous Compiler Testing. In PLDI, 2017.

Digital Library

Cited By

Wang QJung R(2024)Rustlantis: Randomized Differential Testing of the Rust CompilerProceedings of the ACM on Programming Languages10.1145/36897808:OOPSLA2(1955-1981)Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689780
Zhang CSu Z(2024)SMT2Test: From SMT Formulas to Effective Test CasesProceedings of the ACM on Programming Languages10.1145/36897198:OOPSLA2(222-245)Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689719
Hassan MSalvador JSantu SRahman A(2024)State Reconciliation Defects in Infrastructure as CodeProceedings of the ACM on Software Engineering10.1145/36607901:FSE(1865-1888)Online publication date: 12-Jul-2024
https://dl.acm.org/doi/10.1145/3660790
Show More Cited By

Index Terms

Compiler fuzzing through deep learning
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Recommendations

A comprehensive study of deep learning compiler bugs
ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

There are increasing uses of deep learning (DL) compilers to generate optimized code, boosting the runtime performance of DL models on specific hardware. Like their traditional counterparts, DL compilers can generate incorrect code, resulting in ...
Automated conformance testing for JavaScript engines via deep compiler fuzzing
PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation

JavaScript (JS) is a popular, platform-independent programming language. To ensure the interoperability of JS programs across different platforms, the implementation of a JS engine should conform to the ECMAScript standard. However, doing so is ...
Compiler fuzzing: how much does it matter?

Despite much recent interest in randomised testing (fuzzing) of compilers, the practical impact of fuzzer-found compiler bugs on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

ISSTA 2018: Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis

July 2018

379 pages

ISBN:9781450356992

DOI:10.1145/3213846

General Chair:
Frank Tip
Northeastern University, USA
,
Program Chair:
Eric Bodden
University of Paderborn, Germany / Fraunhofer IEM, Germany

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 July 2018

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Badges

Distinguished Paper

Author Tags

Qualifiers

Research-article

Funding Sources

EPSRC

Conference

ISSTA '18

Sponsor:

SIGSOFT

ISSTA '18: International Symposium on Software Testing and Analysis

July 16 - 21, 2018

Amsterdam, Netherlands

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Upcoming Conference

ISSTA '25

Sponsor:
sigsoft

34th ACM SIGSOFT International Symposium on Software Testing and Analysis

June 25 - 28, 2025

Trondheim , Norway

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

94
Total Citations
View Citations
1,608
Total Downloads

Downloads (Last 12 months)160
Downloads (Last 6 weeks)18

Reflects downloads up to 20 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Wang QJung R(2024)Rustlantis: Randomized Differential Testing of the Rust CompilerProceedings of the ACM on Programming Languages10.1145/36897808:OOPSLA2(1955-1981)Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689780
Zhang CSu Z(2024)SMT2Test: From SMT Formulas to Effective Test CasesProceedings of the ACM on Programming Languages10.1145/36897198:OOPSLA2(222-245)Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689719
Hassan MSalvador JSantu SRahman A(2024)State Reconciliation Defects in Infrastructure as CodeProceedings of the ACM on Software Engineering10.1145/36607901:FSE(1865-1888)Online publication date: 12-Jul-2024
https://dl.acm.org/doi/10.1145/3660790
Li STheodoridis TSu Z(2024)Boosting Compiler Testing by Injecting Real-World CodeProceedings of the ACM on Programming Languages10.1145/36563868:PLDI(223-245)Online publication date: 20-Jun-2024
https://dl.acm.org/doi/10.1145/3656386
Eom JJeong SKwon TChristakis MPradel M(2024)Fuzzing JavaScript Interpreters with Coverage-Guided Reinforcement Learning for LLM-Based MutationProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680389(1656-1668)Online publication date: 11-Sep-2024
https://dl.acm.org/doi/10.1145/3650212.3680389
Xia XFeng YShi QJones JZhang XXu B(2024)Enumerating Valid Non-Alpha-Equivalent Programs for Interpreter TestingACM Transactions on Software Engineering and Methodology10.1145/364799433:5(1-31)Online publication date: 4-Jun-2024
https://dl.acm.org/doi/10.1145/3647994
Arefin MShetiya SWang ZCsallner CRoychoudhury APaiva AAbreu RStorey M(2024)Fast Deterministic Black-box Context-free Grammar InferenceProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639214(1-12)Online publication date: 20-May-2024
https://dl.acm.org/doi/10.1145/3597503.3639214
Xia CPaltenghi MLe Tian JPradel MZhang LRoychoudhury APaiva AAbreu RStorey M(2024)Fuzz4All: Universal Fuzzing with Large Language ModelsProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639121(1-13)Online publication date: 20-May-2024
https://dl.acm.org/doi/10.1145/3597503.3639121
Deng YXia CYang CZhang SYang SZhang LRoychoudhury APaiva AAbreu RStorey M(2024)Large Language Models are Edge-Case Generators: Crafting Unusual Programs for Fuzzing Deep Learning LibrariesProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3623343(1-13)Online publication date: 20-May-2024
https://dl.acm.org/doi/10.1145/3597503.3623343
Jiang HWang ZZhou ZLi XGuo SSun WZhang T(2024)A Testing Program and Pragma Combination Selection Based Framework for High-Level Synthesis Tool Pragma-Related Bug DetectionIEEE Transactions on Software Engineering10.1109/TSE.2024.336855350:4(937-955)Online publication date: 22-Feb-2024
https://dl.acm.org/doi/10.1109/TSE.2024.3368553
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents