[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3582016.3582053acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article

Finding Unstable Code via Compiler-Driven Differential Testing

Published: 25 March 2023 Publication History

Abstract

Unstable code refers to code that has inconsistent or unstable run-time semantics due to undefined behavior (UB) in the program. Compilers exploit UB by assuming that UB never occurs, which allows them to generate efficient but potentially semantically inconsistent binaries. Practitioners have put great research and engineering effort into designing dynamic tools such as sanitizers for frequently occurring UBs. However, it remains a big challenge how to detect UBs that are beyond the reach of current techniques.
In this paper, we introduce compiler-driven differential testing (CompDiff), a simple yet effective approach for finding unstable code in C/C++ programs. CompDiff relies on the fact that when compiling unstable code, different compiler implementations may produce semantically inconsistent binaries. Our main approach is to examine the outputs of different binaries on the same input. Discrepancies in outputs may signify the existence of unstable code. To detect unstable code in real-world programs, we also integrate CompDiff into AFL++, the most widely-used and actively-maintained general-purpose fuzzer.
Despite its simplicity, CompDiff is effective in practice: on the Juliet benchmark programs, CompDiff uniquely detected 1,409 bugs compared to sanitizers; on 23 popular open-source C/C++ projects, CompDiff-AFL++ uncovered 78 new bugs, 52 of which have been fixed by developers and 36 cannot be detected by sanitizers. Our evaluation also reveals the fact that CompDiff is not designed to replace current UB detectors but to complement them.

References

[1]
Austin Appleby. 2016. MurmurHash3. https://github.com/aappleby/smhasher/wiki/MurmurHash3 Accessed: March 7, 2022
[2]
Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. 2019. REDQUEEN: Fuzzing with Input-to-State Correspondence. In NDSS. 19, 1–15.
[3]
Marcel Böhme, Valentin JM Manès, and Sang Kil Cha. 2020. Boosting fuzzer efficiency: An information theoretic perspective. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 678–689.
[4]
Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2017. Coverage-based greybox fuzzing as markov chain. IEEE Transactions on Software Engineering, 45, 5 (2017), 489–506.
[5]
Marcel Böhme, Ezekiel O Soremekun, Sudipta Chattopadhyay, Emamurho Ugherughe, and Andreas Zeller. 2017. Where is the bug and how is it fixed? an experiment with practitioners. In Proceedings of the 2017 11th joint meeting on foundations of software engineering. 117–128.
[6]
Derek Bruening and Qin Zhao. 2011. Practical memory checking with Dr. Memory. In International Symposium on Code Generation and Optimization (CGO 2011). 213–223.
[7]
Liming Chen and Algirdas Avizienis. 1978. N-version Programming: A Fault-tolerance Approach to Reliability of Software Operation. In Proceedings of the 8th IEEE International Symposium on Fault-Tolerant Computing (FTCS-8). 1, 3–9.
[8]
Yuting Chen, Ting Su, Chengnian Sun, Zhendong Su, and Jianjun Zhao. 2016. Coverage-directed differential testing of JVM implementations. In proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation. 85–99.
[9]
JTC1/SC22/WG14 The C Standards Committee. 2018. ISO/IEC 9899:2018, Programming languages — C. https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf
[10]
JTC1/SC22/WG21 The C++ Standards Committee. 2020. Standard for Programming Language C++. https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/n4849.pdf
[11]
Mila Dalla Preda, Roberto Giacobazzi, Arun Lakhotia, and Isabella Mastroeni. 2015. Abstract symbolic automata: Mixed syntactic/semantic similarity analysis of executables. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. 329–341.
[12]
CppCheck developers. 2022. A Tool for Static C/C++ Code Analysis. http://cppcheck.sourceforge.net/
[13]
GCC developers. 2022. Options That Control Optimization. https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html Accessed: March 7, 2022
[14]
LLVM developers. 2022. Clang - the Clang C, C++, and Objective-C compiler. https://clang.llvm.org/docs/CommandGuide/clang.html Accessed: March 7, 2022
[15]
LLVM developers. 2022. UndefinedBehaviorSanitizer. https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html Accessed: March 7, 2022
[16]
Sushant Dinesh, Nathan Burow, Dongyan Xu, and Mathias Payer. 2020. Retrowrite: Statically instrumenting cots binaries for fuzzing and sanitization. In 2020 IEEE Symposium on Security and Privacy (SP). 1497–1511.
[17]
Andrea Fioraldi, Dominik Maier, Heiko Eiß feldt, and Marc Heuse. 2020. $AFL++$: Combining Incremental Steps of Fuzzing Research. In 14th USENIX Workshop on Offensive Technologies (WOOT 20).
[18]
Patrice Godefroid, Daniel Lehmann, and Marina Polishchuk. 2020. Differential regression testing for REST APIs. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 312–323.
[19]
Google. 2022. Honggfuzz. https://honggfuzz.dev/ Accessed: March 7, 2022
[20]
Google. 2022. OSS-Fuzz: Continuous Fuzzing for Open Source Software. https://github.com/google/oss-fuzz Accessed: March 7, 2022
[21]
The Tcpdump Group. 2022. TCPDUMP. https://github.com/the-tcpdump-group/tcpdump Accessed: March 7, 2022
[22]
Petr Hosek and Cristian Cadar. 2015. VARAN the Unbelievable: An Efficient N-version Execution Framework. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’15). 339–353.
[23]
Heqing Huang, Peisen Yao, Rongxin Wu, Qingkai Shi, and Charles Zhang. 2020. Pangolin: Incremental hybrid fuzzing with polyhedral path abstraction. In 2020 IEEE Symposium on Security and Privacy (SP). 1613–1627.
[24]
Jaewon Hur, Suhwan Song, Dongup Kwon, Eunjin Baek, Jangwoo Kim, and Byoungyoung Lee. 2021. DIFUZZRTL: Differential Fuzz Testing to Find CPU Bugs. In 2021 IEEE Symposium on Security and Privacy (SP). 1286–1303.
[25]
Nasif Imtiaz, Brendan Murphy, and Laurie Williams. 2019. How do developers act on static analysis alerts? an empirical study of coverity usage. In 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE). 323–333.
[26]
lcamtuf. 2014. Fuzzing random programs without execve(). https://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html Accessed: March 7, 2022
[27]
Vu Le, Mehrdad Afshari, and Zhendong Su. 2014. Compiler Validation via Equivalence modulo Inputs. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). 216–226.
[28]
Stephan Lipp, Sebastian Banescu, and Alexander Pretschner. 2022. An Empirical Study on the Effectiveness of Static C Code Analyzers for Vulnerability Detection. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’22), July 18–22, 2022, Virtual, South Korea.
[29]
Chenyang Lyu, Shouling Ji, Chao Zhang, Yuwei Li, Wei-Han Lee, Yu Song, and Raheem Beyah. 2019. $MOPT$: Optimized mutation scheduling for fuzzers. In 28th USENIX Security Symposium (USENIX Security 19). 1949–1966.
[30]
Michaël Marcozzi, Qiyi Tang, Alastair F Donaldson, and Cristian Cadar. 2019. Compiler fuzzing: How much does it matter? Proceedings of the ACM on Programming Languages, 3, OOPSLA (2019), 1–29.
[31]
Meta. 2022. A Tool to Detect Bugs in Java and C/C++/Objective-c Code. https://fbinfer.com/
[32]
Microsoft. 2022. One Fuzz: A self-hosted Fuzzing-As-A-Service platform. https://github.com/microsoft/onefuzz Accessed: March 7, 2022
[33]
Nicholas Nethercote and Julian Seward. 2007. Valgrind: A Framework for Heavyweight Dynamic Binary Instrumentation. PLDI ’07. 89–100.
[34]
Shirin Nilizadeh, Yannic Noller, and Corina S Pasareanu. 2019. DifFuzz: differential fuzzing for side-channel analysis. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). 176–187.
[35]
NIST. 2017. Juliet Test Suite for C/C++ 1.3. https://samate.nist.gov/SARD/test-suites/112
[36]
Spencer Pearson, José Campos, René Just, Gordon Fraser, Rui Abreu, Michael D Ernst, Deric Pang, and Benjamin Keller. 2017. Evaluating and improving fault localization. In 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE). 609–620.
[37]
Theofilos Petsios, Adrian Tang, Salvatore Stolfo, Angelos D Keromytis, and Suman Jana. 2017. Nezha: Efficient domain-independent differential testing. In 2017 IEEE Symposium on security and privacy (SP). 615–632.
[38]
Xiaolei Ren, Michael Ho, Jiang Ming, Yu Lei, and Li Li. 2021. Unleashing the hidden power of compiler optimization on binary code difference: An empirical study. In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation. 142–157.
[39]
Manuel Rigger and Zhendong Su. 2020. Detecting optimization bugs in database engines via non-optimizing reference engine construction. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1140–1152.
[40]
Coverity Scan. 2022. Coverity Scan: Find and Fix Defects in Your Java, C/C++, C#, JavaScript, Ruby, or Python Open Source Project for Free. https://scan.coverity.com/
[41]
Mozilla Security. 2021. Fuzzdata. https://github.com/MozillaSecurity/fuzzdata Accessed: March 7, 2022
[42]
Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitriy Vyukov. 2012. AddressSanitizer: A Fast Address Sanity Checker. In 2012 USENIX Annual Technical Conference (USENIX ATC 12). 309–318.
[43]
Evgeniy Stepanov and Konstantin Serebryany. 2015. MemorySanitizer: fast detector of uninitialized memory use in C++. In 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). 46–55.
[44]
Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna. 2016. Driller: Augmenting fuzzing through selective symbolic execution. In NDSS. 16, 1–16.
[45]
The LLVM team. 2022. Honggfuzz. https://llvm.org/docs/LibFuzzer.html Accessed: March 7, 2022
[46]
Kaushik Veeraraghavan, Peter M Chen, Jason Flinn, and Satish Narayanasamy. 2011. Detecting and surviving data races using complementary schedules. In Proceedings of the twenty-third ACM symposium on operating systems principles. 369–384.
[47]
Xi Wang, Haogang Chen, Alvin Cheung, Zhihao Jia, Nickolai Zeldovich, and M Frans Kaashoek. 2012. Undefined behavior: what happened to my code? In Proceedings of the Asia-Pacific Workshop on Systems. 1–7.
[48]
Xi Wang, Nickolai Zeldovich, M Frans Kaashoek, and Armando Solar-Lezama. 2013. Towards optimization-safe systems: Analyzing the impact of undefined behavior. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. 260–275.
[49]
W Eric Wong, Ruizhi Gao, Yihao Li, Rui Abreu, and Franz Wotawa. 2016. A survey on software fault localization. IEEE Transactions on Software Engineering, 42, 8 (2016), 707–740.
[50]
Michal Zalewski. 2014. American fuzzy lop. https://lcamtuf.coredump.cx/afl/ Accessed: March 7, 2022
[51]
Qian Zhang, Jiyuan Wang, and Miryung Kim. 2021. Heterofuzz: Fuzz testing to detect platform dependent divergence for heterogeneous applications. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 242–254.

Cited By

View all
  • (2024)Rust-twins: Automatic Rust Compiler Testing through Program Mutation and Dual Macros GenerationProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695059(631-642)Online publication date: 27-Oct-2024
  • (2024)AsFuzzer: Differential Testing of Assemblers with Error-Driven Grammar InferenceProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680345(1099-1111)Online publication date: 11-Sep-2024
  • (2024)FunRedisp: Reordering Function Dispatch in Smart Contract to Reduce Invocation Gas FeesProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652146(516-527)Online publication date: 11-Sep-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3
March 2023
820 pages
ISBN:9781450399180
DOI:10.1145/3582016
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 March 2023

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. compiler
  2. undefined behavior
  3. unstable code

Qualifiers

  • Research-article

Conference

ASPLOS '23

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)203
  • Downloads (Last 6 weeks)19
Reflects downloads up to 16 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Rust-twins: Automatic Rust Compiler Testing through Program Mutation and Dual Macros GenerationProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695059(631-642)Online publication date: 27-Oct-2024
  • (2024)AsFuzzer: Differential Testing of Assemblers with Error-Driven Grammar InferenceProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680345(1099-1111)Online publication date: 11-Sep-2024
  • (2024)FunRedisp: Reordering Function Dispatch in Smart Contract to Reduce Invocation Gas FeesProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652146(516-527)Online publication date: 11-Sep-2024
  • (2024)DTD: Comprehensive and Scalable Testing for DebuggersProceedings of the ACM on Software Engineering10.1145/36437791:FSE(1172-1193)Online publication date: 12-Jul-2024

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media