poster

Effective communication and computation overlap with hybrid MPI/SMPSs

Authors: Vladimir Marjanovic, Jesús Labarta, Eduard Ayguadé, Mateo Valero

ACM SIGPLAN Notices, Volume 45, Issue 5

Pages 337 - 338

https://doi.org/10.1145/1837853.1693502

Published: 09 January 2010

Abstract

Communication overhead is one of the dominant factors affecting performance in high-performance computing systems. To reduce its negative impact, programmers overlap communication and computation by using asynchronous communication primitives. This increases code complexity, requiring more development effort and making programs less readable. This paper presents the hybrid use of MPI and SMPSs (SMP superscalar, a task-based shared-memory programming model), which allows the programmer to easily introduce the asynchrony necessary to overlap communication and computation. We demonstrate the hybrid use of MPI/SMPSs with the high-performance LINPACK benchmark (HPL) and compare it to the pure MPI implementation, which uses the look-ahead technique to overlap communication and computation. The hybrid MPI/SMPSs version significantly outperforms the pure MPI version, getting close to the asymptotic performance at medium problem sizes and still delivering significant benefits at small and large problem sizes.
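
The general idea of overlapping communication with computation can be illustrated with a minimal sketch. It uses neither MPI nor SMPSs: the "communication" is simulated by a sleeping function run on a helper thread, and all names (`fake_receive`, `local_work`, `overlapped_step`) are hypothetical, introduced only for illustration. It shows the pattern the abstract describes: start the transfer asynchronously, do independent work while it is in flight, and synchronize only where the incoming data is needed.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_receive(delay_s, payload):
    """Stand-in for a blocking receive (assumption: not a real MPI call)."""
    time.sleep(delay_s)  # simulated network latency
    return payload


def local_work(values):
    """Computation that does not depend on the incoming message."""
    return sum(v * v for v in values)


def overlapped_step(local_values, delay_s=0.2, payload=(1, 2, 3)):
    """Run the simulated receive and the independent computation concurrently."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the "communication" as an asynchronous task.
        pending = pool.submit(fake_receive, delay_s, payload)
        # Independent computation proceeds while the transfer is in flight.
        local_result = local_work(local_values)
        # Synchronize only at the point where the received data is required.
        received = pending.result()
    return local_result + sum(received)


if __name__ == "__main__":
    print(overlapped_step([1, 2, 3]))
```

In a task-based model the explicit `submit`/`result` bookkeeping shown here would instead be derived by the runtime from declared task dependencies, which is the source of the programmability benefit the abstract claims.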



Published In

ACM SIGPLAN Notices Volume 45, Issue 5

PPoPP '10

May 2010

346 pages

ISSN:0362-1340

EISSN:1558-1160

DOI:10.1145/1837853

PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
January 2010
372 pages
ISBN:9781605588773
DOI:10.1145/1693453
General Chairs: R. Govindarajan (Indian Institute of Science), David Padua (UIUC)
Program Chair: Mary Hall (University of Utah)

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster
