DOI: 10.1145/3287624.3287646

CuckooPIM: an efficient and less-blocking coherence mechanism for processing-in-memory systems

Published: 21 January 2019

Abstract

As in-memory processing logic grows more capable, data sharing and coherence between processors and in-memory logic play an increasingly important role in Processing-in-Memory (PIM) systems. Unfortunately, existing state-of-the-art coarse-grained PIM coherence solutions suffer from unnecessary data movements and stalls caused by a data ping-pong issue. This work proposes CuckooPIM, a criticality-aware and less-blocking coherence mechanism that effectively avoids these unnecessary data movements and stalls. Experiments show that CuckooPIM achieves a 1.68x speedup on average compared with coarse-grained PIM coherence.

Cited By

  • (2023) DrPIM: An Adaptive and Less-blocking Data Replication Framework for Processing-in-Memory Architecture. Proceedings of the Great Lakes Symposium on VLSI 2023, pp. 385-389. DOI: 10.1145/3583781.3590294. Online publication date: 5-Jun-2023
  • (2022) CASPHAr: Cache-Managed Accelerator Staging and Pipelining in Heterogeneous System Architectures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41:11, pp. 4325-4336. DOI: 10.1109/TCAD.2022.3197535. Online publication date: Nov-2022
  • (2021) FePIM. Proceedings of the 26th Asia and South Pacific Design Automation Conference, pp. 114-119. DOI: 10.1145/3394885.3431530. Online publication date: 18-Jan-2021

Published In

ASPDAC '19: Proceedings of the 24th Asia and South Pacific Design Automation Conference
January 2019
794 pages
ISBN: 9781450360074
DOI: 10.1145/3287624
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • IEICE ESS: Institute of Electronics, Information and Communication Engineers, Engineering Sciences Society
  • IEEE CAS
  • IEEE CEDA
  • IPSJ SIG-SLDM: Information Processing Society of Japan, SIG System LSI Design Methodology

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

ASPDAC '19
Acceptance Rates

Overall Acceptance Rate 466 of 1,454 submissions, 32%
