short-paper

Selective off-loading to Memory: Task Partitioning and Mapping for PIM-enabled Heterogeneous Systems

Authors: Dawen Xu, Yi Liao, Ying Wang, Huawei Li, Xiaowei Li

CF'17: Proceedings of the Computing Frontiers Conference

Pages 255 - 258

https://doi.org/10.1145/3075564.3075584

Published: 15 May 2017

Abstract

Processing-in-Memory (PIM) is returning as a promising solution to the memory wall problem as computing systems step into the big data era. Researchers have continually proposed PIM architectures that combine novel memory devices or 3D integration technology, but a universal task scheduling method for these new heterogeneous platforms is still lacking. In this paper, we propose a formalized model to quantify the performance and energy of a PIM+CPU heterogeneous parallel system. In addition, we are the first to build a task partitioning and mapping framework that exploits different PIM engines. In this framework, an application is divided into subtasks, which are mapped onto appropriate execution units by the proposed PIM-oriented Earliest-Finish-Time (PEFT) algorithm to maximize the performance gains brought by PIM. Experimental evaluations show that our PIM-aware framework significantly improves system performance compared to conventional processor architectures.
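The mapping step described in the abstract can be illustrated with a generic earliest-finish-time list scheduler. The sketch below is not the authors' PEFT algorithm, whose details are not given on this page. It is a simplified HEFT-style heuristic under stated assumptions: every subtask has a known execution time on every unit, a fixed transfer delay `comm` applies when a producer and consumer run on different units, and each unit runs one subtask at a time. The task names, unit names, and all cost values are hypothetical.

```python
from functools import lru_cache


def eft_schedule(cost, succ, comm=0.0):
    """Map each subtask of a DAG to the unit that finishes it earliest.

    cost: {task: {unit: execution time}}, assumed complete for every unit
    succ: {task: [successor tasks]}
    comm: assumed fixed delay when producer and consumer are on different units
    Returns {task: (unit, start, finish)}.
    """
    units = sorted({u for c in cost.values() for u in c})
    pred = {t: [] for t in cost}
    for t, children in succ.items():
        for c in children:
            pred[c].append(t)

    @lru_cache(maxsize=None)
    def rank(t):
        # Upward rank: average cost of t plus the longest path to an exit task.
        avg = sum(cost[t].values()) / len(cost[t])
        return avg + max((comm + rank(c) for c in succ.get(t, [])), default=0.0)

    unit_free = {u: 0.0 for u in units}  # time at which each unit becomes idle
    placed = {}
    # With positive costs, decreasing rank visits producers before consumers.
    for t in sorted(cost, key=rank, reverse=True):
        best = None
        for u in units:
            ready = max(
                (placed[p][2] + (comm if placed[p][0] != u else 0.0)
                 for p in pred[t]),
                default=0.0,
            )
            start = max(ready, unit_free[u])
            finish = start + cost[t][u]
            if best is None or finish < best[2]:
                best = (u, start, finish)
        placed[t] = best
        unit_free[best[0]] = best[2]
    return placed


# Hypothetical three-stage pipeline: two memory-bound stages, one compute-bound.
cost = {
    "load": {"cpu": 4, "pim": 2},
    "filter": {"cpu": 6, "pim": 2},
    "reduce": {"cpu": 3, "pim": 5},
}
succ = {"load": ["filter"], "filter": ["reduce"]}
placed = eft_schedule(cost, succ, comm=1)
makespan = max(finish for _, _, finish in placed.values())
```

With these made-up costs the scheduler keeps the two memory-bound stages on the PIM unit and moves the compute-bound stage to the CPU, paying the transfer delay once. A production scheduler would also need an energy term and a more detailed communication model, which this sketch omits.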

Cited By

Chen J, Lin G, Chen J, Wang Y (2021). Towards efficient allocation of graph convolutional networks on hybrid computation-in-memory architecture. Science China Information Sciences, 64:6. Online publication date: 10-May-2021. https://doi.org/10.1007/s11432-020-3248-y

Information & Contributors

Information

Published In

CF'17: Proceedings of the Computing Frontiers Conference

May 2017

450 pages

ISBN:9781450344876

DOI:10.1145/3075564

General Chair: Roberto Giorgi (University of Siena, IT)

Program Chairs: Michela Becchi (North Carolina State University), Francesca Palumbo (University of Sassari, IT)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Short-paper
Research
Refereed limited

Conference

CF '17

Sponsor:

SIGMICRO

CF '17: Computing Frontiers Conference

May 15 - 17, 2017

Siena, Italy

Acceptance Rates

CF'17 Paper Acceptance Rate 43 of 87 submissions, 49%;

Overall Acceptance Rate 273 of 785 submissions, 35%

Bibliometrics

Total Citations: 1
Total Downloads: 231
Downloads (Last 12 months): 11
Downloads (Last 6 weeks): 0

Reflects downloads up to 18 Jan 2025
