Practical data breakpoints: design and implementation
A data breakpoint associates debugging actions with programmer-specified conditions on the memory state of an executing program. Data breakpoints provide a means for discovering program bugs that are tedious or impossible to isolate using control ...
Detection and recovery of endangered variables caused by instruction scheduling
Instruction scheduling re-orders and interleaves instruction sequences from different source statements. This impacts the task of a symbolic debugger, which attempts to present the user a picture of program execution that matches the source program. At ...
Efficient accommodation of may-alias information in SSA form
We present an algorithm for incrementally including may-alias information into Static Single Assignment form by computing a sequence of increasingly precise (and correspondingly larger) partial SSA forms. Our experiments show significant speedup of our ...
Abstract debugging of higher-order imperative languages
Abstract interpretation is a formal method that enables the static determination (i.e. at compile-time) of the dynamic properties (i.e. at run-time) of programs. We present an abstract interpretation-based method, called abstract debugging, which ...
Interprocedural modification side effect analysis with pointer aliasing
We present a new interprocedural modification side effects algorithm for C programs, that can discern side effects through general-purpose pointer usage. Ours is the first complete design and implementation of such an algorithm. Preliminary performance ...
A practical data flow framework for array reference analysis and its use in optimizations
Data flow analysis techniques have traditionally been restricted to the analysis of scalar variables. This retriction, however, imposes a limitation on the kinds of optimizations that can be performed in loops containing array references. We present a ...
Dependence-based program analysis
Program analysis and optimization can be speeded up through the use of the dependence flow graph (DFG), a representation of program dependences which generalizes def-use chains and static single assignment (SSA) form. In this paper, we give a simple ...
Interprocedural constant propagation: a study of jump function implementation
An implementation of interprocedural constant propagation must model the transmission of values through each procedure. In the framework proposed by Callahan, Cooper, Kennedy, and Torczon in 1986, this intraprocedural propagation is modeled with a jump ...
Orchestrating interactions among parallel computations
Many parallel programs contain multiple sub-computations, each with distinct communication and load balancing requirements. The traditional approach to compiling such programs is to impose a processor synchronization barrier between sub-computations, ...
Communication optimization and code generation for distributed memory machines
This paper presents several algorithms to solve code generation and optimization problems specific to machines with distributed address spaces. Given a description of how the computation is to be partitioned across the processors in a machine, our ...
First-class data-type representations in SCHEMEXEROX
In most programming language implementations, the compiler has detailed knowledge of the representations of and operations on primitive data typed and data-type constructors. In SCHEMEXEROX, this knowledge is almost entirely external to the compiler, in ...
Handling control
Non-local control transfer and exception handling have a long tradition in higher-order programming languages such as Common Lisp, Scheme and ML. However, each language stops short of providing a full and complementary approach—control handling is ...
Programmable syntax macros
Lisp has shown that a programmable syntax macro system acts as an adjunct to the compiler that gives the programmer important and powerful abstraction facilities not provided by the language. Unlike simple token substitution macros, such as are provided ...
Compiling real-time programs into schedulable code
We present a programming language with first-class timing constructs, whose semantics is based on time-constrained relationships between observable events. Since a system specification postulates timing relationships between events, realizing the ...
Improving the cache locality of memory allocation
The allocation and disposal of memory is a ubiquitous operation in most programs. Rarely do programmers concern themselves with details of memory allocators; most assume that memory allocators provided by the system perform well. This paper presents a ...
Using lifetime predictors to improve memory allocation performance
Dynamic storage allocation is used heavily in many application areas including interpreters, simulators, optimizers, and translators. We describe research that can improve all aspects of the performance of dynamic storage allocation by predicting the ...
Space efficient conservative garbage collection
We call a garbage collector conservative if it has only partial information about the location of pointers, and is thus forced to treat arbitrary bit patterns as though they might be pointers, in at least some cases. We show that some very inexpensive, ...
Guardians in a generation-based garbage collector
This paper describes a new language feature that allows dynamically allocated objects to be saved from deallocation by an automatic storage management system so that clean-up or other actions can be performed using the data stored within the objects. ...
Real-time replication garbage collection
We have implemented the first copying garbage collector that permits continuous unimpeded mutator access to the original objects during copying. The garbage collector incrementally replicates all accessible objects and uses a mutation log to bring the ...
Implementing type classes
We describe the implementation of a type checker for the functional programming language Haskell that supports the use of type classes. This extends the type system of ML to support overloading (ad-hoc polymorphism) and can be used to implement features ...
The essence of compiling with continuations
In order to simplify the compilation process, many compilers for higher-order languages use the continuation-passing style (CPS) transformation in a first phase to generate an intermediate representation of the source program. The salient aspect of this ...
Register allocation with instruction scheduling
We present a new framework in which considerations of both register allocation and instruction scheduling can be applied uniformly and simultaneously. In this framework an optimal coloring of a graph, called the parallel interference graph, provides an ...
Lifetime-sensitive modulo scheduling
This paper shows how to software pipeline a loop for minimal register pressure without sacrificing the loop's minimum execution time. This novel bidirectional slack-scheduling method has been implemented in a FORTRAN compiler and tested on many ...
Load/store range analysis for global register allocation
Live range splitting techniques improve global register allocation by splitting the live ranges of variables into segments that are individually allocated registers. Load/store range analysis is a new technique for live range splitting that is based on ...
Balanced scheduling: instruction scheduling when memory latency is uncertain
Traditional list schedulers order instructions based on an optimistic estimate of the load delay imposed by the implementation. Therefore they cannot respond to variations in load latencies (due to cache hits or misses, congestion in the memory ...
Reverse If-Conversion
In this paper we present a set of isomorphic control transformations that allow the compiler to apply local scheduling techniques to acyclic subgraphs of the control flow graph. Thus, the code motion complexities of global scheduling are eliminated. ...
Branch prediction for free
Many compilers rely on branch prediction to improve program performance by identifying frequently executed regions and by aiding in scheduling instructions.Profile-based predictors require a time-consuming and inconvenient compile-profile-compile cycle ...