It is with great pleasure that we welcome you to the ACM SIGPLAN 2022 International Conference on Compiler Construction (CC), the 31st edition of this long-standing conference. For the seventh year now, we continue the tradition of holding CC jointly with the Symposium on Principles and Practice of Parallel Programming (PPoPP), the International Symposium on Code Generation and Optimization (CGO), and the International Symposium on High-Performance Computer Architecture (HPCA). Co-locating these four conferences presents a unique opportunity to bring together researchers from different backgrounds and to catalyze interdisciplinary research in the areas of computer architecture, compilation, and parallel programming.
Writing and verifying a Quantum optimizing compiler (keynote)
As quantum computing hardware evolves, it will continue to face four key limitations: low qubit counts, limited connectivity, high error rates, and short coherence times. Quantum compilers play a key role in addressing these issues, reducing the number ...
QSSA: an SSA-based IR for Quantum computing
Quantum computing hardware has progressed rapidly. Simultaneously, there has been a proliferation of programming languages and program optimization tools for quantum computing. Existing quantum compilers use intermediate representations (IRs) where ...
QRANE: lifting QASM programs to an affine IR
This paper introduces QRANE, a tool that produces the affine intermediate representation (IR) from a quantum program expressed in a quantum assembly language such as OpenQASM. QRANE finds subsets of quantum gates prescribed by the same operation type and ...
A polynomial time exact solution to the bit-aware register binding problem
Finding the minimum register bank is an optimization problem related to the synthesis of hardware. Given a program, the problem asks for the minimum number of registers plus their minimum size, in bits, that suffices to compile said program. This ...
Graph transformations for register-pressure-aware instruction scheduling
This paper presents graph transformation algorithms for register-pressure-aware instruction scheduling. The proposed transformations add edges to the data dependence graph (DDG) to eliminate solutions that are either redundant or sub-optimal. Register-...
Caviar: an e-graph based TRS for automatic code optimization
- Smail Kourta,
- Adel Abderahmane Namani,
- Fatima Benbouzid-Si Tayeb,
- Kim Hazelwood,
- Chris Cummins,
- Hugh Leather,
- Riyadh Baghdadi
Term Rewriting Systems (TRSs) are used in compilers to simplify and prove expressions. State-of-the-art TRSs in compilers use a greedy algorithm that applies a set of rewriting rules in a predefined order (where some of the rules are not axiomatic). ...
On the computation of interprocedural weak control closure
Many program analysis techniques depend on capturing the control dependencies of the program. Most existing control dependence algorithms either compute intraprocedural control dependencies only, or they compute control dependence relations that are not ...
Seamless deductive inference via macros
We present an approach to integrating state-of-the-art bottom-up logic programming within the Rust ecosystem, demonstrating it with Ascent, an extension of Datalog that performs well against comparable systems. Rust’s powerful macro system permits Ascent to ...
One-shot tuner for deep learning compilers
Auto-tuning DL compilers are gaining ground as an optimizing back-end for DL frameworks. While existing work can generate deep learning models that exceed the performance of hand-tuned libraries, it still suffers from prohibitively long auto-tuning time ...
Training of deep learning pipelines on memory-constrained GPUs via segmented fused-tiled execution
Training models with massive inputs is a significant challenge in the development of Deep Learning pipelines to process very large digital image datasets as required by Whole Slide Imaging (WSI) in computational pathology and analysis of brain fMRI ...
MLIR-based code generation for GPU tensor cores
The state-of-the-art in high-performance deep learning today is primarily driven by manually developed libraries optimized and highly tuned by expert programmers using low-level abstractions with significant effort. This effort is often repeated for ...
Automating reinforcement learning architecture design for code optimization
Reinforcement learning (RL) is emerging as a powerful technique for solving complex code optimization tasks with an ample search space. While promising, existing solutions require a painstaking manual process to tune the right task-specific RL ...
Memory access scheduling to reduce thread migrations
It has been widely observed that data movement is emerging as the primary bottleneck to scalability and energy efficiency in future hardware, especially for applications and algorithms that are not cache-friendly and achieve below 1% of peak performance ...
Performant portable OpenMP
Accelerated computing has increased the need to specialize how a program is parallelized depending on the target. Fully exploiting a highly parallel accelerator, such as a GPU, demands more parallelism and sometimes more levels of parallelism than a ...
BinPointer: towards precise, sound, and scalable binary-level pointer analysis
Binary-level pointer analysis is critical to binary-level applications such as reverse engineering and binary debloating. In this paper, we propose BinPointer, a new binary-level interprocedural pointer analysis that relies on an offset-sensitive value-...
Cape: compiler-aided program transformation for HTM-based cache side-channel defense
Cache side-channel attacks pose real threats to computer system security. Prior work called Cloak leverages commodity hardware transactional memory (HTM) to protect sensitive data and code from cache side-channel attacks. However, Cloak requires tedious ...
Making no-fuss compiler fuzzing effective
Developing a bug-free compiler is difficult; modern optimizing compilers are among the most complex software systems humans build. Fuzzing is one way to identify subtle compiler bugs that are hard to find with human-constructed tests. Grammar-based ...
Loner: utilizing the CPU vector datapath to process scalar integer data
Modern CPUs utilize SIMD vector instructions and hardware extensions to accelerate code with data-level parallelism. This allows for high performance gains in select application domains such as image and signal processing. However, general purpose code ...
Mapping parallelism in a functional IR through constraint satisfaction: a case study on convolution for mobile GPUs
Graphics Processing Units (GPUs) are notoriously hard to optimize for manually. What is needed are good automatic code generators and optimizers. Accelerate, Futhark and Lift demonstrated that a functional approach is well suited for this challenge. ...
Software pre-execution for irregular memory accesses in the HBM era
The introduction of High Bandwidth Memory (HBM) necessitates the use of intelligent software prefetching in irregular applications to utilize the surplus bandwidth. In this work, we propose Software Pre-execution (SPE), a technique that relies on pre-...
Efficient profile-guided size optimization for native mobile applications
Positive user experience of mobile apps demands they not only launch fast and run fluidly, but are also small in order to reduce network bandwidth from regular updates. Conventional optimizations often trade off size regressions for performance wins, ...
Index Terms
- Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction