Temperature-aware GPU design

JW Sheaffer, K Skadron, DP Luebke - ACM SIGGRAPH 2004 Posters, 2004 - dl.acm.org
Cooling for graphics processors is becoming prohibitively expensive. Even for GPUs not intended for high-performance markets, cooling is a serious issue due to the low profit margins in these market segments. Much of the heat originates from the processor core itself. This paper argues for a runtime approach to cooling, reducing the need for bulky and expensive thermal packages and fans. Today’s cooling solutions are designed for worst-case behavior. First, localized heating occurs much faster than chip-wide heating; since power dissipation is spatially non-uniform across the chip, this leads to “hot spots” and spatial gradients that can cause timing errors or even physical damage. Reducing these hot spots, whether through changes in circuit design, microarchitecture, or software, will help reduce cooling requirements. Second, the package should be designed for the worst typical application. True worst-case behavior is rare, and a solution designed for worst case is in fact overdesigned for most typical operating conditions. However, a package designed for typical behavior could be overcome by some unusual application, and so should engage dynamic thermal management (DTM). These techniques throttle back the chip’s power dissipation (and possibly performance) until the thermal stress has passed. DTM has recently been the subject of considerable research in the general-purpose computer architecture community, and it is used in commercial chips like the Pentium 4, Pentium M, and Transmeta Crusoe.
It is important to note that runtime thermal management cannot be achieved merely by designing the chip for greater energy efficiency. Thermal behavior evolves over time scales of hundreds of microseconds or milliseconds. This means that power-management techniques, in order to be used for thermal management, must directly target the spatial and temporal behavior of operating temperature. In fact, many low-power techniques have insufficient effect on operating temperature, because they do not reduce power density in hot spots, or because they only reclaim slack and do not reduce power and temperature when no slack is present.
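The throttling behavior described above can be illustrated with a toy model. The sketch below is not taken from the paper: it treats a single hot spot as one lumped thermal RC node and applies a simple threshold-based DTM policy with hysteresis. The function name `simulate_dtm` and every parameter value (thermal resistance, capacitance, ambient temperature, trigger and release thresholds, throttle factor) are assumptions chosen only so that the thermal time constant falls in the millisecond range mentioned in the text.

```python
def simulate_dtm(power_w, dt_s=1e-4, r_th=2.0, c_th=1e-3, t_amb=45.0,
                 t_trigger=85.0, t_release=80.0, throttle=0.5):
    """Simulate one hot spot as a single thermal RC node with threshold DTM.

    power_w   : requested power per time step, in watts
    dt_s      : time step in seconds (assumed 0.1 ms)
    r_th,c_th : assumed thermal resistance (K/W) and capacitance (J/K);
                their product is the thermal time constant (2 ms here)
    t_trigger : temperature at which throttling engages (assumed)
    t_release : lower temperature at which throttling is lifted (assumed)
    throttle  : fraction of requested power dissipated while throttled

    Returns (temperatures, throttled_flags), one entry per time step.
    """
    temp = t_amb
    throttled = False
    temps, flags = [], []
    for p in power_w:
        # Hysteresis: engage at the trigger, release only below t_release.
        if temp >= t_trigger:
            throttled = True
        elif temp <= t_release:
            throttled = False
        p_eff = p * throttle if throttled else p
        # Forward-Euler step of  C * dT/dt = P - (T - T_amb) / R
        temp += dt_s * (p_eff - (temp - t_amb) / r_th) / c_th
        temps.append(temp)
        flags.append(throttled)
    return temps, flags
```

Under these assumed values, a sustained 30 W load with throttling disabled (for example by passing `t_trigger=float("inf")`) settles near 105 °C, while the same load with the default policy stays within about a degree of the 85 °C trigger. A 15 W "typical" load settles at 75 °C and never engages throttling, which mirrors the argument that a package sized for typical behavior needs DTM only for unusual workloads.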