DOI: 10.1145/3503222.3507764
research-article

Finding missed optimizations through the lens of dead code elimination

Published: 22 February 2022

Abstract

Compilers are foundational software development tools and incorporate increasingly sophisticated optimizations. Due to their complexity, it is difficult to systematically identify opportunities for improving them. Indeed, the automatic discovery of missed optimizations has been an important and significant challenge. The few existing approaches either cannot accurately pinpoint missed optimizations or target only specific analyses. This paper tackles this challenge by introducing a novel, effective approach that --- in a simple and general manner --- automatically identifies a wide range of missed optimizations. Our core insight is to leverage dead code elimination (DCE) to both analyze how well compilers optimize code and identify missed optimizations: (1) insert "optimization markers" in the basic blocks of a given program, (2) compute the program's live/dead basic blocks using the "optimization markers", and (3) identify missed optimizations from how well compilers eliminate dead blocks. We essentially exploit that, since DCE heavily depends on the rest of the optimization pipeline, through the lens of DCE, one can systematically quantify how well compilers optimize code. We conduct an extensive analysis of GCC and LLVM using our approach, which (1) provides quantitative and qualitative insights regarding their optimization capabilities, and (2) uncovers a diverse set of missed optimizations. Our results also lead to 84 bug reports for GCC and LLVM, of which 62 have already been confirmed or fixed, demonstrating our work's strong practical utility. We expect that the simplicity and generality of our approach will make it widely applicable for understanding compiler performance and finding missed optimizations. This work opens and initiates this promising direction.
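
To make the three steps above concrete, here is a minimal sketch of what an "optimization marker" can look like at the source level (the marker name DCEMarker0_ and the exact workflow are illustrative assumptions, not reproduced from the paper's artifact): a statically dead basic block is instrumented with a call to an externally declared function, so the call can only vanish from the compiled output if the compiler proves the block dead and eliminates it.

    /* Minimal sketch (C). The marker is declared but never defined, so the
       compiler cannot inline or discard the call on its own; it disappears
       only if the entire enclosing block is proven dead and eliminated. */
    void DCEMarker0_(void);  /* hypothetical marker name */

    int f(int a) {
        /* 2*a == 2*a + 1 can never hold, so this block is dead. */
        if (a * 2 == a + a + 1) {
            DCEMarker0_();  /* remains in the output iff the block is NOT eliminated */
            return -1;
        }
        return a;
    }

Compiling the instrumented file (for example with gcc -O3 -S or clang -O3 -S) and checking whether a call to DCEMarker0_ survives in the emitted assembly then shows whether the dead block was eliminated; a marker that one compiler removes but another keeps points to a potential missed optimization.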

    Published In

    ASPLOS '22: Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
    February 2022
    1164 pages
    ISBN:9781450392051
    DOI:10.1145/3503222
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 February 2022

    Author Tags

    1. compilers
    2. missed optimizations
    3. testing

    Qualifiers

    • Research-article

    Conference

    ASPLOS '22

    Acceptance Rates

    Overall Acceptance Rate 535 of 2,713 submissions, 20%

    Article Metrics

    • Downloads (Last 12 months): 221
    • Downloads (Last 6 weeks): 12
    Reflects downloads up to 09 Jan 2025

    Cited By

    • (2024) Comparing semantic graph representations of source code: The case of automatic feedback on programming assignments. Computer Science and Information Systems 21:1, 117-142. https://doi.org/10.2298/CSIS230615004P. Online publication date: 2024.
    • (2024) Shoot Yourself in the Foot — Efficient Code Causes Inefficiency in Compiler Optimizations. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 1846-1857. https://doi.org/10.1145/3691620.3695548. Online publication date: 27-Oct-2024.
    • (2024) Efficient Code Causes Inefficiency in Compiler Optimizations. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2426-2428. https://doi.org/10.1145/3691620.3695369. Online publication date: 27-Oct-2024.
    • (2024) PolyJuice: Detecting Mis-compilation Bugs in Tensor Compilers with Equality Saturation Based Rewriting. Proceedings of the ACM on Programming Languages 8:OOPSLA2, 1309-1335. https://doi.org/10.1145/3689757. Online publication date: 8-Oct-2024.
    • (2024) Revealing the Unseen: AI Chain on LLMs for Predicting Implicit Dataflows to Generate Dataflow Graphs in Dynamically Typed Code. ACM Transactions on Software Engineering and Methodology 33:7, 1-29. https://doi.org/10.1145/3672458. Online publication date: 12-Jun-2024.
    • (2024) Refined Input, Degraded Output: The Counterintuitive World of Compiler Behavior. Proceedings of the ACM on Programming Languages 8:PLDI, 671-691. https://doi.org/10.1145/3656404. Online publication date: 20-Jun-2024.
    • (2024) DTD: Comprehensive and Scalable Testing for Debuggers. Proceedings of the ACM on Software Engineering 1:FSE, 1172-1193. https://doi.org/10.1145/3643779. Online publication date: 12-Jul-2024.
    • (2024) JOG: Java JIT Peephole Optimizations and Tests from Patterns. Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, 11-15. https://doi.org/10.1145/3639478.3640040. Online publication date: 14-Apr-2024.
    • (2024) Beyond a Joke: Dead Code Elimination Can Delete Live Code. Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results, 32-36. https://doi.org/10.1145/3639476.3639763. Online publication date: 14-Apr-2024.
    • (2024) UBFuzz: Finding Bugs in Sanitizer Implementations. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, 435-449. https://doi.org/10.1145/3617232.3624874. Online publication date: 27-Apr-2024.
    • Show More Cited By
