No abstract available.
Proceeding Downloads
Thread-level speculation with kernel support
Runtime systems for speculative parallelization can be substantially sped up by implementing them with kernel support. We describe a novel implementation of a thread-level speculation (TLS) system using virtual memory to isolate speculative state, ...
Reducing memory buffering overhead in software thread-level speculation
Software-based, automatic parallelization through Thread-Level Speculation (TLS) has significant practical potential, but also high overhead costs. Traditional "lazy" buffering mechanisms enable strong isolation of speculative threads, but imply large ...
Performance implications of transient loop-carried data dependences in automatically parallelized loops
Recent approaches to automatic parallelization have taken advantage of the low-latency on-chip interconnect provided in modern multicore processors, demonstrating significant speedups, even for complex workloads. Although these techniques can already ...
Safe and flexible adaptation via alternate data structure representations
The choice of data structures is crucial for achieving high performance. For applications that are long-running and/or operate on large data sets, the best choice for main data structures can change multiple times over the course of a single execution. ...
Relaxed dependence tracking for parallel runtime support
It is notoriously difficult to achieve both correctness and scalability for many shared-memory parallel programs. To improve correctness and scalability, researchers have developed various kinds of parallel runtime support such as multithreaded record & ...
Kindergarten cop: dynamic nursery resizing for GHC
Generational garbage collectors are among the most popular garbage collectors used in programming language runtime systems. Their performance is known to depend heavily on choosing the appropriate size of the area where new objects are allocated (the ...
Verified construction of static single assignment form
Modern compilers use intermediate representations in static single assignment (SSA) form, which simplifies many optimizations. However, the high implementation complexity of efficient SSA construction algorithms poses a challenge to verified compilers. ...
Mechanizing conventional SSA for a verified destruction with coalescing
Modern optimizing compilers rely on the Static Single Assignment (SSA) form to make optimizations fast and simpler to implement. From a semantic perspective, the SSA form is nowadays fairly well understood, as witnessed by recent advances in the field ...
Reachability and error diagnosis in LR(1) parsers
Given an LR(1) automaton, what are the states in which an error can be detected? For each such "error state", what is a minimal input sentence that causes an error in this state? We propose an algorithm that answers these questions. This allows building ...
Automatic fault location for data structures
Specification-based data structure verification is a powerful debugging technique. In this work we combine specification-based data structure verification with automatic detection of faulty program statements that corrupt data structures. The user ...
Sparse representation of implicit flows with applications to side-channel detection
Information flow analyses traditionally use the Program Dependence Graph (PDG) as a supporting data-structure. This graph relies on Ferrante et al.'s notion of control dependences to represent implicit flows of information. A limitation of this approach ...
Multiversioned decoupled access-execute: the key to energy-efficient compilation of general-purpose programs
- Konstantinos Koukos,
- Per Ekemark,
- Georgios Zacharopoulos,
- Vasileios Spiliopoulos,
- Stefanos Kaxiras,
- Alexandra Jimborean
Computer architecture design faces an era of great challenges in an attempt to simultaneously improve performance and energy efficiency. Previous hardware techniques for energy management become severely limited, and thus, compilers play an essential ...
Heap bounds protection with low fat pointers
Heap buffer overflow (underflow) errors are a common source of security vulnerabilities. One prevention mechanism is to add object bounds meta information and to instrument the program with explicit bounds checks for all memory access. The so-called "...
Register allocation and promotion through combined instruction scheduling and loop unrolling
Register allocation is a much studied problem. A particularly important context for optimizing register allocation is within loops, since a significant fraction of the execution time of programs is often inside loop code. A variety of algorithms have ...
On fusing recursive traversals of K-d trees
- Samyam Rajbhandari,
- Jinsung Kim,
- Sriram Krishnamoorthy,
- Louis-Noël Pouchet,
- Fabrice Rastello,
- Robert J. Harrison,
- P. Sadayappan
Loop fusion is a key program transformation for data locality optimization that is implemented in production compilers. But optimizing compilers for imperative languages currently cannot ex- ploit fusion opportunities across a set of recursive tree ...
Restrictification of function arguments
- Victor Hugo Sperle Campos,
- Péricles Rafael Alves,
- Henrique Nazaré Santos,
- Fernando Magno Quintão Pereira
Pointer aliasing still hinders compiler optimizations, in spite of years of research on pointer disambiguation. Because the automatic disambiguation of pointers is a difficult endeavor, several programming languages offer programmers mechanisms to ...
Static deadlock detection for concurrent go by global session graph synthesis
Go is a programming language developed at Google, with channel-based concurrent features based on CSP. Go can detect global communication deadlocks at runtime when all threads of execution are blocked, but deadlocks in other paths of execution could be ...
Static detection of energy defect patterns in Android applications
For static analysis researchers, Android software presents a wide variety of interesting challenges. The target of our work is static detection of energy-drain defects in Android applications. The management of energy-intensive resources (e.g., GPS) ...
On fast large-scale program analysis in Datalog
Designing and crafting a static program analysis is challenging due to the complexity of the task at hand. Among the challenges are modelling the semantics of the input language, finding suitable abstractions for the analysis, and handwriting efficient ...
Improved MHP Analysis
May-Happen-in-Parallel (MHP) analysis is becoming the backbone of many of the parallel analyses and optimizations. In this paper, we present new approaches to do MHP analysis for X10-like languages that support async-finish-atomic parallelism. We ...
Extended lattice-based memory allocation
This work extends lattice-based memory allocation, an earlier work on memory reuse through array contraction. Such an optimization is used for optimizing high-level programming languages where storage mapping may be abstracted away from programmers and ...
Mapping deviation: a technique to adapt or to guard loop transformation intuitions for legality
Parallel architectures are now omnipresent in mainstream electronic devices and exploiting them efficiently is a challenge for all developers. Hence, they need the support of languages, libraries and tools to assist them in the optimization or ...
Automatic data layout generation and kernel mapping for CPU+GPU architectures
The ubiquity of hybrid CPU+GPU architectures has led to renewed interest in automatic data layout generation owing to the fact that data layouts have a large impact on performance, and that different data layouts yield the best performance on CPUs vs. ...
Input space splitting for OpenCL
The performance of OpenCL programs suffers from memory and control flow divergence. Therefore, OpenCL compilers employ static analyses to identify non-divergent control flow and memory accesses in order to produce faster code. However, divergence is ...
GreenThumb: superoptimizer construction framework
Developing an optimizing compiler backend remains a laborious process, especially for nontraditional ISAs that have been appearing recently. Superoptimization sidesteps the need for many code transformations by searching for the most optimal instruction ...
Register allocation and instruction scheduling in Unison
This paper describes Unison, a simple, flexible, and potentially optimal software tool that performs register allocation and instruction scheduling in integration using combinatorial optimization. The tool can be used as an alternative or as a ...
SVF: interprocedural static value-flow analysis in LLVM
This paper presents SVF, a tool that enables scalable and precise interprocedural Static Value-Flow analysis for C programs by leveraging recent advances in sparse analysis. SVF, which is fully implemented in LLVM, allows value-flow construction and ...
Iguana: a practical data-dependent parsing framework
Data-dependent grammars extend context-free grammars with arbitrary computation, variable binding, and constraints. These features provide the user with the freedom and power to express syntactic constructs outside the realm of context-free grammars, ...
SYCO: a systematic testing tool for concurrent objects
We present the concepts, usage and prototypical implementation of SYCO: a SYstematic testing tool for Concurrent Objects. The system receives as input a program, a selection of method to be tested, and a set of initial values for its parameters. SYCO ...
Index Terms
- Proceedings of the 25th International Conference on Compiler Construction