[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1878921.1878942acmconferencesArticle/Chapter ViewAbstractPublication PagesesweekConference Proceedingsconference-collections
research-article

Improving scratchpad allocation with demand-driven data tiling

Published: 24 October 2010 Publication History

Abstract

Existing scratchpad memory (SPM) allocation algorithms for arrays, whether they rely on well-crafted heuristics or resort to integer linear programming (ILP) techniques, typically assume that every array is small enough to fit directly into the SPM. As a result, some arrays have to be spilled entirely to the off-chip memory in order to make room for other arrays to stay in the SPM, resulting in sometimes poor SPM utilization.
In this paper, we introduce a new comparability graph coloring allocator that integrates for the first time data tiling and SPM allocation for arrays by tiling arrays on-demand to improve utilization of the SPM. The novelty lies in repeatedly identifying the heaviest path in an array interference graph and then reducing its weight by tiling certain arrays on the path appropriately with respect to the size of the SPM. The effectiveness of our allocator, which is presently restricted to tiling 1-D arrays, is validated by using a number of selected benchmarks for which existing allocators are ineffective.

References

[1]
Rajeshwari Banakar, Stefan Steinke, Bo-Sik Lee, M. Balakrishnan, and Peter Marwedel. Scratchpad memory: design alternative for cache on-chip memory in embedded systems. In CODES '02: Proceedings of the tenth international symposium on Hardware/software codesign, pages 73--78. ACM, 2002.
[2]
Kevin J. Barker, Kei Davis, Adolfy Hoisie, Darren J. Kerbyson, Mike Lang, Scott Pakin, and Jose C. Sancho. Entering the petaflop era: the architecture and performance of roadrunner. In SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing, pages 1--11. IEEE Press, 2008.
[3]
G. J. Chaitin. Register allocation & spilling via graph coloring. In SIGPLAN '82: Proceedings of the 1982 SIGPLAN symposium on Compiler construction, pages 98--101. ACM Press, 1982.
[4]
J.d. Cuvillo, W. Zhu, H.u. Ziang, and G.R. Gao. Fast: A functionally accurate simulation toolset for the cyclops64 cellular architecture. In MoBS2005: Workshop on Modeling, Benchmarking, and Simulation, pages 11--20. ACM Press, 2005.
[5]
William J. Dally, Francois Labonte, Abhishek Das, Patrick Hanrahan, and Jung-Ho Ahn et al. Merrimac: Supercomputing with streams. In SC '03: Proceedings of the 2003 ACM/IEEE conference on Supercomputing, pages 35--42, 2003.
[6]
Janet Fabri. Automatic storage optimization. SIGPLAN Not., 14(8):83--91, 1979.
[7]
Martin Charles Golumbic. Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics, Vol 57). North-Holland Publishing Co., 2004.
[8]
S. Louis Hakimi, Edward F. Schmeichel, and Neal E. Young. Orienting graphs to optimize reachability. Inf. Process. Lett., 63(5):229--235, 1997.
[9]
Pinar Heggernes, Federico Mancini, and Charis Papadopoulos. Minimal comparability completions of arbitrary graphs. Discrete Appl. Math., 156(5):705--718, 2008.
[10]
Qingguang Huang, Jingling Xue, and Xavier Vera. Code tiling for improving the cache performance of pde solvers. In ICPP, pages 615--624, 2003.
[11]
M. Kandemir, J. Ramanujam, J. Irwin, N. Vijaykrishnan, I. Kadayif, and A. Parikh. Dynamic management of scratch-pad memory space. In DAC '01: Proceedings of the 38th conference on Design automation, pages 690--695, New York, NY, USA, 2001. ACM.
[12]
Induprakas Kodukula, Nawaaz Ahmed, and Keshav Pingali. Data-centric multi-level blocking. SIGPLAN Not., 32(5):346--357, 1997.
[13]
Lian Li, Hui Feng, and Jingling Xue. Compiler-directed scratchpad memory management via graph coloring. ACM Trans. Archit. Code Optim., 6(3):1--17, 2009.
[14]
Lian Li, Lin Gao, and Jingling Xue. Memory coloring: A compiler approach for scratchpad memory management. In PACT '05: Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques, pages 329--338, 2005.
[15]
Lian Li, Quan Hoang Nguyen, and Jingling Xue. Scratchpad allocation for data aggregates in superperfect graphs. In Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems, pages 207--216. ACM, 2007.
[16]
Lian Li, Hui Wu, Hui Feng, and Jingling Xue. Towards data tiling for whole programs in scratchpad memory allocation. In ACSAC'07: Proceedings of the 12th Asia-Pacific Computer Systems Architecture Conference, pages 63--74, 2007.
[17]
J. Makino, K. Hiraki, and M. Inaba. Grape-dr: 2-pflops massively-parallel computer with 512-core, 512-gflops processor chips for scientific computing. In SC '07: Proceedings of the 2007 ACM/IEEE conference on Supercomputing, pages 1--11. ACM, 2007.
[18]
Jinpyo Park and Soo-Mook Moon. Optimistic register coalescing. ACM Trans. Program. Lang. Syst., 26(4):735--765, 2004
[19]
Sumesh Udayakumaran and Rajeev Barua. Compiler-decided dynamic memory allocation for scratch-pad based embedded systems. In CASES '03: Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems, pages 276--286, New York, NY, USA, 2003. ACM.
[20]
Sumesh Udayakumaran, Angel Dominguez, and Rajeev Barua. Dynamic allocation for scratch-pad memory using compile-time decisions. ACM Trans. Embed. Comput. Syst., 5(2):472--511, 2006.
[21]
Manish Verma, Lars Wehmeyer, and Peter Marwedel. Dynamic overlay of scratchpad memory for energy minimization. In CODES+ISSS '04: Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, pages 104--109. ACM, 2004.
[22]
Xuejun Yang, Li Wang, Jingling Xue, Yu Deng, and Ying Zhang. Comparability graph coloring for optimizing utilization of stream register files in stream processors. In PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 111--120, 2009.
[23]
Chunhui Zhang and Fadi Kurdahi. On combining iteration space tiling with data space tiling for scratch-pad memory systems. In ASP-DAC '05: Proceedings of the 2005 Asia and South Pacific Design Automation Conference, pages 973--976. ACM, 2005.

Cited By

View all
  • (2017)Optimal Tiling Strategy for Memory Bandwidth Reduction for CNNsAdvanced Concepts for Intelligent Vision Systems10.1007/978-3-319-70353-4_8(89-100)Online publication date: 23-Nov-2017
  • (2013)SPM-SieveProceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems10.5555/2555729.2555750(1-10)Online publication date: 29-Sep-2013
  • (2013)SPM-Sieve: A framework for assisting data partitioning in scratch pad memory based systems2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES)10.1109/CASES.2013.6662527(1-10)Online publication date: Sep-2013
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
CASES '10: Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
October 2010
276 pages
ISBN:9781605589039
DOI:10.1145/1878921
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • CEDA
  • IEEE CAS
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 October 2010

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. comparability graph coloring
  2. data tiling
  3. loop tiling
  4. scratchpad memory
  5. software-managed cache

Qualifiers

  • Research-article

Conference

ESWeek '10
ESWeek '10: Sixth Embedded Systems Week
October 24 - 29, 2010
Arizona, Scottsdale, USA

Acceptance Rates

Overall Acceptance Rate 52 of 230 submissions, 23%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)3
  • Downloads (Last 6 weeks)1
Reflects downloads up to 12 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2017)Optimal Tiling Strategy for Memory Bandwidth Reduction for CNNsAdvanced Concepts for Intelligent Vision Systems10.1007/978-3-319-70353-4_8(89-100)Online publication date: 23-Nov-2017
  • (2013)SPM-SieveProceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems10.5555/2555729.2555750(1-10)Online publication date: 29-Sep-2013
  • (2013)SPM-Sieve: A framework for assisting data partitioning in scratch pad memory based systems2013 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES)10.1109/CASES.2013.6662527(1-10)Online publication date: Sep-2013
  • (2011)A semi-automatic scratchpad memory management framework for CMPProceedings of the 9th international conference on Advanced parallel processing technologies10.5555/2042522.2042528(73-87)Online publication date: 26-Sep-2011
  • (2011)A Semi-automatic Scratchpad Memory Management Framework for CMPAdvanced Parallel Processing Technologies10.1007/978-3-642-24151-2_6(73-87)Online publication date: 2011

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media