[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article
Open access

Dynamic Shared SPM Reuse for Real-Time Multicore Embedded Systems

Published: 11 May 2015 Publication History

Abstract

Allocating the scratchpad memory (SPM) space to tasks is a challenging problem in real-time multicore embedded systems that use shared SPM. Proper SPM space allocation is important, as it considerably influences the application worst-case execution time (WCET), which is of great importance in real-time applications. To address this problem, in this article we present a dynamic SPM reuse scheme, where SPM space can be reused by other tasks during runtime without requiring any static SPM partitioning. Although the proposed scheme is applied dynamically at runtime, the required decision making is fairly complex and hence cannot be performed at runtime. We have developed techniques to perform the decision making offline at design time in the form of optimization problems combined with task scheduling/mapping. The proposed work is unlike previous works that either exploit static schemes for SPM space allocation or perform task scheduling/mapping and SPM space allocation incoherently. The experimental results show that our dynamic SPM reuse scheme can reduce WCET by up to 55% as compared to recent previous works on SPM allocation in real-time multicore embedded systems.

References

[1]
Imtiaz Ahmad and Muhammad K. Dhodhi. 1996. Multiprocessor scheduling in a genetic paradigm. Parallel Computing 22, 3, 395--406.
[2]
Atmel. 2012. 32-bit Atmel AVR Microcontroller. Retrieved March 27, 2015, from http://www.atmel.com/Images/doc32099.pdf.
[3]
Rajeshwari Banakar, Stefan Steinke, Bo-Sik Lee, M. Balakrishnan, and Peter Marwedel. 2002. Scratchpad memory: Design alternative for cache on-chip memory in embedded systems. In Proceedings of the 10th International Symposium on Hardware/Software Codesign (CODES’02). ACM, New York, NY, 73--78.
[4]
Anthony Brooke, David Kendrick, Alexander Meeraus, and Ramesh Raman. 2003. GAMS/CPLEX 9.0. GAMS Development Corporation, Washington, DC.
[5]
Doug Burger and Todd M. Austin. 1997. The SimpleScalar tool set, version 2.0. ACM SIGARCH Computer Architecture News 25, 3, 13--25.
[6]
Giorgio C. Buttazzo. 2011. Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications. Vol. 24. Springer.
[7]
Lance Chambers. 1995. Practical Handbook of Genetic Algorithms. CRC Press, Boca Raton, FL.
[8]
Che-Wei Chang, Jian-Jia Chen, Tei-Wei Kuo, and Heiko Falk. 2013. Real-time partitioned scheduling on multi-core systems with local and global memories. In Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC’13). IEEE, Los Alamitos, CA, 467--472.
[9]
Che-Wei Chang, Jian-Jia Chen, Tei-Wei Kuo, and Heiko Falk. 2014. Real-time task scheduling on island-based multi-core platforms. IEEE Transactions on Parallel and Distributed Systems PP, 99, 1.
[10]
Che-Wei Chang, Jian-Jia Chen, Waqaas Munawar, Tei-Wei Kuo, and Heiko Falk. 2012. Partitioned scheduling for real-time tasks on multiprocessor embedded systems with programmable shared SRAMs. In Proceedings of the 10th ACM International Conference on Embedded Software (EMSOFT’12). ACM, New York, NY, 153--162.
[11]
Karam S. Chatha and Ranga Vemuri. 2002. Hardware-software partitioning and pipelined scheduling of transformative applications. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 10, 3, 193--208.
[12]
Sudipta Chattopadhyay and Abhik Roychoudhury. 2011. Static bus schedule aware scratchpad allocation in multiprocessors. ACM SIGPLAN Notices 46, 5, 11--20.
[13]
Weijia Che and Karam S. Chatha. 2011. Scheduling of stream programs onto SPM enhanced processors with code overlay. In Proceedings of the 9th IEEE Symposium on Embedded Systems for Real-Time Multimedia (ESTIMedia’11). IEEE, Los Alamitos, CA, 9--18.
[14]
ILOG Cplex. 2007. 11.0 Users Manual. IBM. Available at http://www.ilog.com/products/cplex/.
[15]
Vinay Devadas and Hakan Aydin. 2012. On the interplay of voltage/frequency scaling and device power management for frame-based real-time embedded applications. IEEE Transactions on Computers 61, 1, 31--44.
[16]
Huping Ding, Yun Liang, and Tulika Mitra. 2012. WCET-centric partial instruction cache locking. In Proceedings of the 49th ACM/EDAC/IEEE Design Automation Conference (DAC’12). IEEE, Los Alamitos, CA, 412--420.
[17]
Heiko Falk and Jan C. Kleinsorge. 2009. Optimal static WCET-aware scratchpad allocation of program code. In Proceedings of the 46th Annual Design Automation Conference (DAC’09). ACM, New York, NY, 732--737.
[18]
Marco E. T. Gerards and Jan Kuper. 2013. Optimal DPM and DVFS for frame-based real-time systems. ACM Transactions on Architecture and Code Optimization 9, 4, Article No. 41.
[19]
Rony Ghattas, Gregory Parsons, and Alexander G. Dean. 2007. Optimal unified data allocation and task scheduling for real-time multi-tasking systems. In Proceedings of the 13th IEEE Real Time and Embedded Technology and Applications Symposium (RTAS’07). IEEE, Los Alamitos, CA, 168--182.
[20]
Martin Grajcar. 1999. Genetic list scheduling algorithm for scheduling and allocation on a loosely coupled heterogeneous multiprocessor system. In Proceedings of the 36th Design Automation Conference (DAC’99). IEEE, Los Alamitos, CA, 280--285.
[21]
Shouzhen Gu, Qingfeng Zhuge, Jingtong Hu, Juan Yi, and Edwin H.-M. Sha. 2013. Efficient task assignment and scheduling for MPSoC DSPS with VS-SPM considering concurrent accesses through data allocation. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’13). IEEE, Los Alamitos, CA, 2615--2619.
[22]
Shouzhen Gu, Qingfeng Zhuge, Juan Yi, Jingtong Hu, and Edwin H.-M. Sha. 2014. Optimizing task and data assignment on multi-core systems with multi-port SPMs. IEEE Transactions on Parallel and Distributed Systems PP, 99, 1.
[23]
Yibo Guo, Qingfeng Zhuge, Jingtong Hu, Meikang Qiu, and Edwin H.-M. Sha. 2011. Optimal data allocation for scratch-pad memory on embedded multi-core systems. In Proceedings of the International Conference on Parallel Processing (ICPP’11). IEEE, Los Alamitos, CA, 464--471.
[24]
Yibo Guo, Qingfeng Zhuge, Jingtong Hu, Juan Yi, Meikang Qiu, and Edwin H.-M. Sha. 2013. Data placement and duplication for embedded multicore systems with scratch pad memory. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 32, 6, 809--817.
[25]
Jan Gustafsson, Adam Betts, Andreas Ermedahl, and Björn Lisper. 2010. The Mälardalen WCET benchmarks—past, present and future. In Proceedings of the 10th International Workshop on Worst-Case Execution Time Analysis. 137--147.
[26]
Matthew R. Guthaus, Jeffrey S. Ringenberg, Daniel Ernst, Todd M. Austin, Trevor Mudge, and Richard B. Brown. 2001. MiBench: A free, commercially representative embedded benchmark suite. In Proceedings of the IEEE International Workshop on Workload Characterization (WWC-4). IEEE, Los Alamitos, CA, 3--14.
[27]
Randy L. Haupt and Sue Ellen Haupt. 2004. Practical Genetic Algorithms. John Wiley and Sons, Hoboken, NJ.
[28]
Mahmut Kandemir and Alok Choudhary. 2002. Compiler-directed scratch pad memory hierarchy design and management. In Proceedings of the 39th Annual Design Automation Conference (DAC’02). ACM, New York, NY, 628--633.
[29]
Mahmut Kandemir, Ozcan Ozturk, and Mustafa Karakoy. 2004. Dynamic on-chip memory management for chip multiprocessors. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’04). ACM, New York, NY, 14--23.
[30]
Mahmut Kandemir, Jagannathan Ramanujam, and Alok Choudhary. 2002. Exploiting shared scratch pad memory space in embedded multiprocessor systems. In Proceedings of the 39th Annual Design Automation Conference (DAC’02). ACM, New York, NY, 219--224.
[31]
Sangyeol Kang and Alexander G. Dean. 2010. DARTS: Techniques and tools for predictably fast memory using integrated data allocation and real-time task scheduling. In Proceedings of the 16th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS’10). IEEE, Los Alamitos, CA, 333--342.
[32]
Namyun Kim, Minsoo Ryu, Seongsoo Hong, Manas Saksena, Chong-Ho Choi, and Heonshik Shin. 1996. Visual assessment of a real-time system design: A case study on a CNC controller. In Proceedings of the 17th IEEE Real-Time Systems Symposium. IEEE, Los Alamitos, CA, 300--310.
[33]
Hermann Kopetz. 2011. Real-Time Systems: Design Principles for Distributed Embedded Applications. Springer, New York, NY.
[34]
Yu-Kwong Kwok and Ishfaq Ahmad. 1997. Efficient scheduling of arbitrary task graphs to multiprocessors using a parallel genetic algorithm. Journal of Parallel and Distributed Computing 47, 1, 58--77.
[35]
Dawei Li and Jie Wu. 2014. Minimizing energy consumption for frame-based tasks on heterogeneous multiprocessor platforms. IEEE Transactions on Parallel and Distributed Systems PP, 99, 1.
[36]
Xianfeng Li, Yun Liang, Tulika Mitra, and Abhik Roychoudury. 2007. Chronos: A timing analyzer for embedded software. Science of Computer Programming 69, 1--3, 56--67. http://www.comp.nus.edu.sg/∼rpembed/chronos.
[37]
Tiantian Liu, Yingchao Zhao, Minming Li, and Chun Jason Xue. 2011. Joint task assignment and cache partitioning with cache locking for {WCET} minimization on {MPSoC}. Journal of Parallel and Distributed Computing 71, 11, 1473--1483.
[38]
Yu Liu and Wei Zhang. 2012. Exploiting multi-level scratchpad memories for time-predictable multicore computing. In Proceedings of the 30th IEEE International Conference on Computer Design (ICCD’12). IEEE, Los Alamitos, CA, 61--66.
[39]
Peter Marwedel. 2006. Embedded System Design. Vol. 1. Springer, New York, NY.
[40]
Nghi Nguyen, Angel Dominguez, and Rajeev Barua. 2005. Memory allocation for embedded systems with a compile-time-unknown scratch-pad size. In Proceedings of the International Conference on Compilers, Architectures, and Synthesis for Embedded Systems (CASES’05). ACM, New York, NY, 115--125.
[41]
Fatma A. Omara and Mona M. Arafa. 2010. Genetic algorithms for task scheduling problem. Journal of Parallel and Distributed Computing 70, 1, 13--22.
[42]
Ozcan Ozturk, Guangyu Chen, Mahmut Kandemir, and Mustafa Karakoy. 2006a. An integer linear programming based approach to simultaneous memory space partitioning and data allocation for chip multiprocessors. In Proceedings of the IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures. IEEE, Los Alamitos, CA.
[43]
Ozcan Ozturk, Mahmut Kandemir, Guangyu Chen, Mary J. Irwin, and Mustafa Karakoy. 2005. Customized on-chip memories for embedded chip multiprocessors. In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC’05). ACM, New York, NY, 743--748.
[44]
Ozcan Ozturk, Mahmut Kandemir, and Ibrahim Kolcu. 2006b. Shared scratch-pad memory space management. In Proceedings of the 7th International Symposium on Quality Electronic Design (ISQED’06). IEEE, Los Alamitos, CA.
[45]
Michael Pinedo. 2012. Scheduling: Theory, Algorithms, and Systems. Springer, New York, NY.
[46]
Sascha Plazar, Jan C. Kleinsorge, Peter Marwedel, and Heiko Falk. 2012. WCET-aware static locking of instruction caches. In Proceedings of the 10th International Symposium on Code Generation and Optimization (CGO’12). ACM, New York, NY, 44--52.
[47]
Cosmin Rusu, Rami Melhem, and Daniel Mossé. 2003. Maximizing rewards for real-time applications with energy constraints. ACM Transactions on Embedded Computing Systems 2, 4, 537--559.
[48]
Hassan Salamy and Jagannathan Ramanujam. 2012. An effective solution to task scheduling and memory partitioning for multiprocessor system-on-chip. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 31, 5, 717--725.
[49]
Vivy Suhendra, Tulika Mitra, Abhik Roychoudhury, and Ting Chen. 2005. WCET centric data allocation to scratchpad memory. In Proceedings of the 26th IEEE International Real-Time Systems Symposium (RTSS’05). IEEE, Los Alamitos, CA, 223--232.
[50]
Vivy Suhendra, Chandrashekar Raghavan, and Tulika Mitra. 2006. Integrated scratchpad memory optimization and task scheduling for MPSoC architectures. In Proceedings of the 2006 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES’06). ACM, New York, NY, 401--410.
[51]
Vivy Suhendra, Abhik Roychoudhury, and Tulika Mitra. 2010. Scratchpad allocation for concurrent embedded software. ACM Transactions on Programming Languages and Systems 32, 4, Article No. 13.
[52]
Texas Instruments. 2011. TMS320C6472 Fixed-Point Digital Signal Processor. Retrieved March 27, 2015, from http://www.ti.com/lit/ds/symlink/tms320c6472.pdf
[53]
Sumesh Udayakumaran, Angel Dominguez, and Rajeev Barua. 2006. Dynamic allocation for scratch-pad memory using compile-time decisions. ACM Transactions on Embedded Computing Systems 5, 2, 472--511.
[54]
Qing Wan, Hui Wu, and Jingling Xue. 2013. Scratchpad memory aware task scheduling with minimum number of preemptions on a single processor. In Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC’13). IEEE, Los Alamitos, CA, 741--748.
[55]
Jack Whitham, Neil C. Audsley, and Robert I. Davis. 2014. Explicit reservation of cache memory in a predictable, preemptive multitasking real-time system. ACM Transactions on Embedded Computing Systems 13, 4s, Article No. 120.
[56]
XMOS. 2013. XS1-L4A-64-TQ48 Datasheet. Retrieved March 27, 2015, from https://www.xmos.com/download/public/XS1-L4A-64-TQ48-Datasheet(1.2).pdf.
[57]
Lei Zhang, Meikang Qiu, Wei-Che Tseng, and Edwin H.-M. Sha. 2010. Variable partitioning and scheduling for MPSoC with virtually shared scratch pad memory. Journal of Signal Processing Systems 58, 2, 247--265.
[58]
Qingfeng Zhuge, Yibo Guo, Jingtong Hu, Wei-Che Tseng, Chun J. Xue, and Edwin H.-M. Sha. 2012. Minimizing access cost for multiple types of memory units in embedded systems through data allocation and scheduling. IEEE Transactions on Signal Processing 60, 6, 3253--3263.
[59]
Sergey Zhuravlev, Juan Carlos Saez, Sergey Blagodurov, Alexandra Fedorova, and Manuel Prieto. 2012. Survey of scheduling techniques for addressing shared resources in multicore processors. ACM Computing Surveys 45, 1, Article No. 4.

Cited By

View all
  • (2023)Mapi-Pro: An Energy Efficient Memory Mapping Technique for Intermittent ComputingACM Transactions on Architecture and Code Optimization10.1145/362952420:4(1-25)Online publication date: 20-Oct-2023
  • (2022)MASTER: Reclamation of Hybrid Scratchpad Memory to Maximize Energy Saving in Multi-Core Edge SystemsIEEE Transactions on Sustainable Computing10.1109/TSUSC.2021.30494477:4(749-760)Online publication date: 1-Oct-2022
  • (2020)Interference-Aware Memory Allocation for Real-Time Multi-Core Systems2020 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS)10.1109/RTAS48715.2020.00-10(148-159)Online publication date: Apr-2020
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Architecture and Code Optimization
ACM Transactions on Architecture and Code Optimization  Volume 12, Issue 2
July 2015
410 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2775085
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 May 2015
Accepted: 01 February 2015
Revised: 01 January 2015
Received: 01 November 2014
Published in TACO Volume 12, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Scratchpad memory
  2. embedded real-time systems
  3. multicore processors
  4. scheduling
  5. shared memory

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)79
  • Downloads (Last 6 weeks)6
Reflects downloads up to 12 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Mapi-Pro: An Energy Efficient Memory Mapping Technique for Intermittent ComputingACM Transactions on Architecture and Code Optimization10.1145/362952420:4(1-25)Online publication date: 20-Oct-2023
  • (2022)MASTER: Reclamation of Hybrid Scratchpad Memory to Maximize Energy Saving in Multi-Core Edge SystemsIEEE Transactions on Sustainable Computing10.1109/TSUSC.2021.30494477:4(749-760)Online publication date: 1-Oct-2022
  • (2020)Interference-Aware Memory Allocation for Real-Time Multi-Core Systems2020 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS)10.1109/RTAS48715.2020.00-10(148-159)Online publication date: Apr-2020
  • (2020)WCET Aware Cache Locking and Task Scheduling in Single Core Embedded SystemsJournal of Physics: Conference Series10.1088/1742-6596/1684/1/0120871684(012087)Online publication date: 1-Dec-2020
  • (2019)A Survey of Timing Verification Techniques for Multi-Core Real-Time SystemsACM Computing Surveys10.1145/332321252:3(1-38)Online publication date: 18-Jun-2019
  • (2018)Slack clustering for scheduling frame-based tasks on multicore embedded systemsMicroelectronics Journal10.1016/j.mejo.2018.09.00281(144-153)Online publication date: Nov-2018
  • (2017)NPAM: NVM-Aware Page Allocation for Multi-Core Embedded SystemsIEEE Transactions on Computers10.1109/TC.2017.270382466:10(1703-1716)Online publication date: 1-Oct-2017

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media