Compiler-Directed Energy Savings in Superscalar Processors
Date
07/2006
Author
Jones, Timothy M
Abstract
Superscalar processors contain large, complex structures to hold data and instructions as
they wait to be executed. However, many of these structures consume large amounts of energy,
making them hotspots requiring sophisticated cooling systems. With the trend towards larger,
more complex processors, this will become more of a problem, having important implications
for future technology.
This thesis uses compiler-based optimisation schemes to target the issue queue and register
file. These are two of the most energy-consuming structures in the processor. The algorithms
and hardware techniques developed in this work dynamically adapt the processor's resources
to the changing program phases, turning off parts of each structure when they are unused to
save dynamic and static energy.
To optimise the issue queue, the compiler analysis tracks data dependences through each
program procedure. It identifies the critical path through each program region and informs the
hardware of the minimum number of queue entries required to prevent that path slowing down. This
reduces the occupancy of the queue and increases the opportunities to save energy. With just a
1.3% performance loss, 26% dynamic and 32% static energy savings are achieved.
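As a rough illustration of the idea, and not the analysis used in the thesis, the sketch below treats a program region as a list of instructions in program order with known latencies and data dependences. It computes an as-soon-as-possible issue cycle for each instruction, the resulting critical path length, and an estimate of how many queue entries must be resident for that schedule to be achievable. The function names, the latency model, and the assumptions of in-order dispatch and unlimited issue width are all hypothetical simplifications.

```python
from typing import Dict, List


def asap_issue_cycles(latency: List[int], deps: Dict[int, List[int]]) -> List[int]:
    """Earliest issue cycle per instruction (index order = program order).

    deps[i] lists the earlier instructions whose results instruction i reads.
    Assumption: unlimited issue width, so only data dependences delay issue.
    """
    issue = [0] * len(latency)
    for i in range(len(latency)):
        for p in deps.get(i, []):
            # i cannot issue until producer p has issued and completed.
            issue[i] = max(issue[i], issue[p] + latency[p])
    return issue


def critical_path_length(latency: List[int], deps: Dict[int, List[int]]) -> int:
    """Length in cycles of the longest dependence chain in the region."""
    issue = asap_issue_cycles(latency, deps)
    return max(c + l for c, l in zip(issue, latency))


def min_queue_entries(latency: List[int], deps: Dict[int, List[int]]) -> int:
    """Estimated queue entries needed so no instruction issues late.

    Assumption: dispatch is in order, so when instruction i issues, every
    older instruction that has not issued earlier still occupies an entry.
    """
    issue = asap_issue_cycles(latency, deps)
    need = 0
    for i, cycle in enumerate(issue):
        resident = sum(1 for j in range(i + 1) if issue[j] >= cycle)
        need = max(need, resident)
    return need


# Hypothetical region: i1 depends on i0 (3-cycle latency), i3 depends on i2.
latency = [3, 1, 1, 1]
deps = {1: [0], 3: [2]}
print(critical_path_length(latency, deps))  # 4
print(min_queue_entries(latency, deps))     # 3
```

In this toy region three entries suffice: a larger queue would not shorten the four-cycle critical path, so under these assumptions the remaining entries could be turned off.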
Registers can be idle for many cycles after they are last read, before they are released and
put back on the free-list to be reused by another instruction. Alternatively, they can be turned
off for energy savings. Early register releasing can be used to perform this operation sooner
than usual, but hardware schemes must wait for the instruction redefining the relevant logical
register to enter the pipeline. This thesis presents an exploration of compiler-directed early
register releasing. The compiler can exactly identify the last use of each register and pass the
information to the hardware, based on simple data-flow and liveness analysis. The best scheme
achieves 15% dynamic and 19% static energy savings.
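The last-use identification described above can be pictured with a standard backward liveness scan. The sketch below is a minimal version restricted to a single basic block and to logical register names. The `last_uses` function and the instruction encoding are hypothetical, and the thesis's own analysis and its mechanism for passing the information to the hardware are not reproduced here.

```python
from typing import List, Optional, Set, Tuple

Instr = Tuple[Optional[str], List[str]]  # (destination register, source registers)


def last_uses(block: List[Instr], live_out: Set[str]) -> List[Set[str]]:
    """For each instruction, the source registers whose value dies there.

    Scans the block backwards, maintaining the set of registers that are
    live after the current instruction. A source that is not live afterwards
    is being read for the last time, so its register could be released early.
    """
    live = set(live_out)
    dying: List[Set[str]] = [set() for _ in block]
    for i in range(len(block) - 1, -1, -1):
        dest, srcs = block[i]
        if dest is not None:
            # The old value of dest is overwritten here, so it is not live above.
            live.discard(dest)
        for s in srcs:
            if s not in live:
                dying[i].add(s)
                live.add(s)  # the value is live before this instruction
    return dying


# Hypothetical block:
#   r1 = r2 op r3
#   r4 = r1 op r2
#   r5 = r4 op r1      (only r5 is live on exit)
block = [("r1", ["r2", "r3"]), ("r4", ["r1", "r2"]), ("r5", ["r4", "r1"])]
for instr, dead in zip(block, last_uses(block, {"r5"})):
    print(instr, sorted(dead))
```

Here `r3` dies at the first instruction, `r2` at the second, and `r1` and `r4` at the third, without waiting for a later instruction to redefine any of them.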
Finally, the issue queue limiting and early register releasing schemes are combined for energy
savings in both processor structures. Four different configurations are evaluated, bringing
25% to 31% dynamic and 19% to 34% static issue queue energy savings and reductions of 18%
to 25% dynamic and 20% to 21% static energy in the register file.