research-article

Finding Unstable Code via Compiler-Driven Differential Testing

Authors:

Zhendong Su

ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3

Pages 238 - 251

https://doi.org/10.1145/3582016.3582053

Published: 25 March 2023

Abstract

Unstable code is code whose run-time semantics are inconsistent because the program contains undefined behavior (UB). Compilers exploit UB by assuming that it never occurs, which lets them generate efficient but potentially semantically inconsistent binaries. Practitioners have invested substantial research and engineering effort in dynamic tools, such as sanitizers, that target frequently occurring UBs. However, detecting UBs that lie beyond the reach of current techniques remains a major challenge.

In this paper, we introduce compiler-driven differential testing (CompDiff), a simple yet effective approach for finding unstable code in C/C++ programs. CompDiff relies on the fact that, when compiling unstable code, different compiler implementations may produce semantically inconsistent binaries. The core idea is to run the different binaries on the same input and compare their outputs: a discrepancy may signal the existence of unstable code. To detect unstable code in real-world programs, we also integrate CompDiff into AFL++, the most widely used and actively maintained general-purpose fuzzer.
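The output-comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CompDiff's actual implementation: the helper names (`run_variant`, `group_by_behavior`, `has_discrepancy`) are hypothetical, each "variant" is assumed to be a command line for one build of the same program that reads its input from stdin, and only the exit code and stdout are compared.

```python
import subprocess
import sys
from collections import defaultdict


def run_variant(argv, data, timeout=5.0):
    """Run one build of the program on `data` and return (exit code, stdout).

    A hang longer than `timeout` seconds raises subprocess.TimeoutExpired.
    """
    proc = subprocess.run(
        argv,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


def group_by_behavior(variants, data):
    """Map each observed (exit code, stdout) pair to the variants that produced it."""
    groups = defaultdict(list)
    for name, argv in variants.items():
        groups[run_variant(argv, data)].append(name)
    return dict(groups)


def has_discrepancy(variants, data):
    """Return True if at least two variants behave differently on the same input."""
    return len(group_by_behavior(variants, data)) > 1


if __name__ == "__main__":
    # Stand-ins for real binaries so the sketch runs anywhere. In practice each
    # entry would be the same C/C++ source built by a different compiler or at a
    # different optimization level (hypothetical variant names below).
    echo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
    upper = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    variants = {"build-a": echo, "build-b": echo, "build-c": upper}
    print(group_by_behavior(variants, b"seed"))
```

Grouping variants by behavior, rather than returning a bare yes/no, shows which builds disagree, which is usually the first thing one wants to know when triaging a discrepancy. A discrepancy is only a hint: sources of nondeterminism unrelated to UB (timestamps, randomness, addresses printed to stdout) would need to be filtered out before treating it as evidence of unstable code.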

Despite its simplicity, CompDiff is effective in practice: on the Juliet benchmark programs, CompDiff detected 1,409 bugs that sanitizers missed; on 23 popular open-source C/C++ projects, CompDiff-AFL++ uncovered 78 new bugs, 52 of which have been fixed by developers and 36 of which cannot be detected by sanitizers. Our evaluation also shows that CompDiff is not a replacement for current UB detectors but a complement to them.

Index Terms

Finding Unstable Code via Compiler-Driven Differential Testing
1. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Runtime environments
    2. Formal language definitions
      1. Semantics
  2. Software organization and properties
    1. Software functional properties
      1. Correctness
        Consistency
        Functionality

Information & Contributors

Information

Published In

ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3

March 2023

820 pages

ISBN:9781450399180

DOI:10.1145/3582016

General Chair: Tor M. Aamodt (University of British Columbia, Canada)

Program Chairs: Natalie Enright Jerger (University of Toronto, Canada), Michael Swift (University of Wisconsin-Madison, USA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

SIGBED: ACM Special Interest Group on Embedded Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ASPLOS '23

ASPLOS '23: 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3

March 25 - 29, 2023

Vancouver, BC, Canada

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 4

Total Downloads: 667

Downloads (Last 12 months): 203
Downloads (Last 6 weeks): 19

Reflects downloads up to 16 Dec 2024

Citations

Cited By

Yang W, Gao C, Liu X, Li Y, Xue Y, Filkov V, Ray B, Zhou M. (2024). Rust-twins: Automatic Rust Compiler Testing through Program Mutation and Dual Macros Generation. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 631-642. Online publication date: 27-Oct-2024. https://dl.acm.org/doi/10.1145/3691620.3695059

Kim H, Kim S, Lee J, Cha S, Christakis M, Pradel M. (2024). AsFuzzer: Differential Testing of Assemblers with Error-Driven Grammar Inference. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1099-1111. Online publication date: 11-Sep-2024. https://dl.acm.org/doi/10.1145/3650212.3680345

Liu Y, Song W, Christakis M, Pradel M. (2024). FunRedisp: Reordering Function Dispatch in Smart Contract to Reduce Invocation Gas Fees. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 516-527. Online publication date: 11-Sep-2024. https://dl.acm.org/doi/10.1145/3650212.3652146

Lu H, Liu Z, Wang S, Zhang F. (2024). DTD: Comprehensive and Scalable Testing for Debuggers. Proceedings of the ACM on Software Engineering, 1:FSE, 1172-1193. Online publication date: 12-Jul-2024. https://dl.acm.org/doi/10.1145/3643779
