DOI:10.1145/3075564.3075584
short-paper

Selective off-loading to Memory: Task Partitioning and Mapping for PIM-enabled Heterogeneous Systems

Published: 15 May 2017

Abstract

Processing-in-Memory (PIM) is returning as a promising solution to the memory wall as computing systems step into the big data era. Researchers have proposed various PIM architectures that combine novel memory devices or 3D integration technology, but a universal task scheduling method for these new heterogeneous platforms is still lacking. In this paper, we propose a formalized model to quantify the performance and energy of a PIM+CPU heterogeneous parallel system. In addition, we are the first to build a task partitioning and mapping framework that exploits different PIM engines. In this framework, an application is divided into subtasks, which are mapped onto appropriate execution units by the proposed PIM-oriented Earliest-Finish-Time (PEFT) algorithm to maximize the performance gains brought by PIM. Experimental evaluations show that our PIM-aware framework significantly improves system performance compared to conventional processor architectures.
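
The abstract does not give the details of PEFT, so the following is only a generic illustration of earliest-finish-time mapping, not the authors' algorithm. It assumes a task graph with known per-unit execution times and a single fixed transfer penalty between units; the function name `eft_schedule`, the unit names `cpu` and `pim`, and all cost values are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def eft_schedule(deps, cost, transfer=0.0):
    """Greedy earliest-finish-time mapping of a task DAG onto execution units.

    deps:     {task: set of predecessor tasks}
    cost:     {task: {unit: execution time on that unit}}  (assumed known)
    transfer: fixed penalty when a predecessor ran on a different unit
    Returns {task: (unit, start, finish)}.
    """
    unit_free = {}  # unit -> time at which it becomes idle
    placed = {}     # task -> (unit, start, finish)
    # Visit tasks so that every predecessor is placed before its successors.
    for task in TopologicalSorter(deps).static_order():
        best = None
        for unit, run_time in cost[task].items():
            # Input from each predecessor is ready at its finish time, plus
            # the transfer penalty if it was produced on another unit.
            ready = max(
                (placed[p][2] + (transfer if placed[p][0] != unit else 0.0)
                 for p in deps.get(task, ())),
                default=0.0,
            )
            start = max(ready, unit_free.get(unit, 0.0))
            finish = start + run_time
            # Keep the unit on which this task would finish earliest.
            if best is None or finish < best[2]:
                best = (unit, start, finish)
        placed[task] = best
        unit_free[best[0]] = best[2]
    return placed


# Hypothetical three-stage pipeline: the memory-bound "scan" stage is
# assumed to run faster on the PIM unit, the others on the CPU.
deps = {"load": set(), "scan": {"load"}, "reduce": {"scan"}}
cost = {
    "load": {"cpu": 2.0, "pim": 4.0},
    "scan": {"cpu": 10.0, "pim": 3.0},
    "reduce": {"cpu": 1.0, "pim": 5.0},
}
plan = eft_schedule(deps, cost, transfer=1.0)
for task, (unit, start, finish) in plan.items():
    print(f"{task}: {unit} [{start}, {finish}]")
```

With these made-up costs, only the `scan` stage is off-loaded to `pim`, since the transfer penalty is smaller than the time saved there. A scheduler in the style the abstract describes would also need a task priority order and an energy term, both of which this sketch leaves out.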

Cited By

  • (2021) Towards efficient allocation of graph convolutional networks on hybrid computation-in-memory architecture. Science China Information Sciences 64:6. DOI:10.1007/s11432-020-3248-y. Online publication date: 10-May-2021

Published In

CF'17: Proceedings of the Computing Frontiers Conference
May 2017
450 pages
ISBN:9781450344876
DOI:10.1145/3075564

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Architecture
  2. Mapping
  3. Memory Wall
  4. PIM

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

CF '17: Computing Frontiers Conference
May 15 - 17, 2017
Siena, Italy

Acceptance Rates

CF '17 Paper Acceptance Rate: 43 of 87 submissions, 49%
Overall Acceptance Rate: 273 of 785 submissions, 35%
