Memory allocation for embedded systems with a compile-time-unknown scratch-pad size

N Nguyen, A Dominguez, R Barua - ACM Transactions on Embedded …, 2009 - dl.acm.org
This article presents the first memory allocation scheme for embedded systems having a scratch-pad memory whose size is unknown at compile time. A scratch-pad memory (SPM) is a fast compiler-managed SRAM that replaces the hardware-managed cache. All existing memory allocation schemes for SPM require the SPM size to be known at compile time. Unfortunately, because of this constraint, the resulting executable is tied to that size of SPM and is not portable to other processor implementations having a different SPM size. Size-portable code is valuable when programs are downloaded during deployment either via a network or portable media. Code downloads are used for fixing bugs or for enhancing functionality. The presence of different SPM sizes in different devices is common because of the evolution in VLSI technology across years. The result is that SPM cannot be used in such situations with downloaded codes.
To overcome this limitation, our work presents a compiler method whose resulting executable is portable across SPMs of any size. Our technique employs a customized installer program, which decides the SPM allocation just before the program's first run, since the SPM size can be discovered at that time. Based on the decided allocation, the installer then modifies the program executable accordingly. The resulting executable places frequently used objects in SPM, considering both code and data for placement. To keep the overhead low, much of the preprocessing for the allocation is done at compile time. Results show that our benchmarks average a 41% speedup versus an all-DRAM allocation, while the optimal static allocation scheme, which knows the SPM size at compile time and is thus an unachievable upper bound, is only slightly faster (45% faster than all-DRAM). Results also show that the overhead from our customized installer averages about 1.5% in code size, 2% in runtime, and 3% in compile time for our benchmarks.
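The core installer-time decision can be sketched as a greedy placement problem: given compile-time-profiled access frequencies for each code and data object, and an SPM size discovered only at install time, pick the objects that yield the most accesses per byte of SPM. The sketch below is a minimal illustration under that assumption; the `MemObject` structure, the frequency-density heuristic, and the function names are hypothetical and are not taken from the paper, whose actual installer additionally rewrites addresses inside the executable.

```c
/* Hypothetical sketch of installer-time SPM allocation.
 * Assumption: the compiler has already embedded, for each program
 * object, its size and a profiled access count; only the SPM size
 * is unknown until install time. */
#include <stdlib.h>

typedef struct {
    const char *name;   /* code or data object */
    unsigned    size;   /* bytes */
    unsigned    freq;   /* profiled access count (compile-time preprocessing) */
    int         in_spm; /* set by the installer: 1 if placed in SPM */
} MemObject;

/* Order by descending accesses-per-byte so hot, small objects win. */
static int by_density(const void *a, const void *b) {
    const MemObject *x = a, *y = b;
    double dx = (double)x->freq / x->size;
    double dy = (double)y->freq / y->size;
    return (dy > dx) - (dy < dx);
}

/* Greedily place objects into SPM; returns bytes of SPM used.
 * spm_size is discovered at install time, not compile time. */
unsigned allocate_spm(MemObject *objs, size_t n, unsigned spm_size) {
    unsigned used = 0;
    qsort(objs, n, sizeof *objs, by_density);
    for (size_t i = 0; i < n; i++) {
        if (used + objs[i].size <= spm_size) {
            objs[i].in_spm = 1;
            used += objs[i].size;
        }
    }
    return used;
}
```

Because the sort and the per-object bookkeeping are cheap, an installer running a pass like this once, just before the first run, is consistent with the low (about 2% runtime) overhead the article reports.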