article

Dynamic allocation for scratch-pad memory using compile-time decisions

Authors:

Sumesh Udayakumaran,

Angel Dominguez,

Rajeev BaruaAuthors Info & Claims

ACM Transactions on Embedded Computing Systems (TECS), Volume 5, Issue 2

Pages 472 - 511

https://doi.org/10.1145/1151074.1151085

Published: 01 May 2006 Publication History

Get Access

Abstract

In this research, we propose a highly predictable, low overhead, and, yet, dynamic, memory-allocation strategy for embedded systems with scratch pad memory. A scratch pad is a fast compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees versus cache and by its significantly lower overheads in energy consumption, area, and overall runtime, even with a simple allocation scheme. Primarily scratch pad allocation methods are of two types. First, software-caching schemes emulate the workings of a hardware cache in software. Instructions are inserted before each load/store to check the software-maintained cache tags. Such methods incur large overheads in runtime, code size, energy consumption, and SRAM space for tags and deliver poor real-time guarantees just like hardware caches. A second category of algorithms partitions variables at compile-time into the two banks. However, a drawback of such static allocation schemes is that they do not account for dynamic program behavior. It is easy to see why a data allocation that never changes at runtime cannot achieve the full locality benefits of a cache. We propose a dynamic allocation methodology for global and stack data and program code that; (i) accounts for changing program requirements at runtime, (ii) has no software-caching tags, (iii) requires no runtime checks, (iv) has extremely low overheads, and (v) yields 100% predictable memory access times. In this method, data that is about to be accessed frequently is copied into the scratch pad using compiler-inserted code at fixed and infrequent points in the program. Earlier data is evicted if necessary. When compared to a provably optimal static allocation, results show that our scheme reduces runtime by up to 39.8% and energy by up to 31.3%, on average, for our benchmarks, depending on the SRAM size used. The actual gain depends on the SRAM size, but our results show that close to the maximum benefit in runtime and energy is achieved for a substantial range of small SRAM sizes commonly found in embedded systems. Our comparison with a direct mapped cache shows that our method performs roughly as well as a cached architecture.

References

[1]

Ahmed, N., Mateev, N., and Pingali, K. 2000. Tiling imperfectly-nested loop nests. In Proceedings of the 2000 ACM/IEEE Conference on Supercomputing (CDROM). IEEE Computer Society, 31.]]

Abstract

References

Cited By

Index Terms

Recommendations

Compiler-decided dynamic memory allocation for scratch-pad based embedded systems

Memory allocation for embedded systems with a compile-time-unknown scratch-pad size

Recursive function data allocation to scratch-pad memory

Comments

Information

Published In

Publisher

Journal Family

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Login options

Full Access

View options

PDF

eReader

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations