DOI: 10.1145/3422575.3422796
research-article

Efficient Generation of Application Specific Memory Controllers

Published: 21 March 2021

Abstract

The increasing gap between the bandwidth requirements of modern Systems on Chip (SoCs) and the I/O data rate delivered by Dynamic Random Access Memory (DRAM), known as the Memory Wall, limits the performance of today’s data-intensive applications. General-purpose memory controllers use online scheduling techniques to increase memory bandwidth, but because of their limited buffer depth they have only a local view of the executed application. However, numerous applications, especially in the embedded systems domain, have regular or fixed memory access patterns, which are not yet exploited to overcome the Memory Wall. In this paper, we present a new methodology to generate the configuration for an Application-Specific Memory Controller (ASMC), which has a global view of the application and uses application knowledge to decrease energy consumption and increase bandwidth. To this end, we analyze the DRAM access pattern of the application offline by solving an instance of the Min-k-Union problem and generate a configuration for a reconfigurable address mapper. For several applications we show an improvement of up to 8.5× in energy efficiency and 8.9× in sustainable bandwidth.
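
For readers unfamiliar with the Min-k-Union problem named in the abstract: given a collection of sets and a number k, the task is to pick k of the sets so that their union is as small as possible. The Python sketch below solves a tiny instance by exhaustive search. It is only an illustration of the problem statement, not the authors' algorithm; the function name `min_k_union` and the toy data are hypothetical, and the interpretation of each set as "address bits that toggle between two consecutive accesses" is an assumption, since the abstract does not say how the instance is built from the access pattern.

```python
from itertools import combinations


def min_k_union(sets, k):
    """Pick k of the given sets so that their union is as small as possible.

    Exhaustive search over all k-subsets; only practical for tiny inputs.
    Returns (indices of the chosen sets, their union).
    """
    if not 0 < k <= len(sets):
        raise ValueError("k must be between 1 and len(sets)")
    best_idx, best_union = None, None
    for idx in combinations(range(len(sets)), k):
        union = set().union(*(sets[i] for i in idx))
        # Strict comparison keeps the first optimum found.
        if best_union is None or len(union) < len(best_union):
            best_idx, best_union = idx, union
    return best_idx, best_union


# Hypothetical toy instance (assumption, not data from the paper):
# each set lists address-bit positions that toggle between two accesses.
toggles = [{0, 1}, {1, 2}, {0, 1, 2}, {5, 6}, {0, 2}]
chosen, bits = min_k_union(toggles, 3)
print(chosen, sorted(bits))
```

Exhaustive search grows combinatorially with the number of sets, so a realistic access trace would need a heuristic or an optimization solver; which approach the paper takes is not stated in the abstract.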

Cited By

  • (2024) Design and Develop Low-Power Memory Controller for Gain Cell-Embedded Dynamic Random-Access Memory Cell Using Intelligent Clock Gating. Telecommunications and Radio Engineering 83:8 (83-94). DOI: 10.1615/TelecomRadEng.2024049973. Online publication date: 2024

Published In

ACM Other conferences
MEMSYS '20: Proceedings of the International Symposium on Memory Systems
September 2020
362 pages
ISBN: 9781450388993
DOI: 10.1145/3422575

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Address Mapping
  2. Application Specific Memory Controller
  3. Combinatorics
  4. DRAM
  5. Embedded Systems
  6. Min-k-Union
  7. Optimization

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MEMSYS 2020
MEMSYS 2020: The International Symposium on Memory Systems
September 28 - October 1, 2020
Washington, DC, USA
