DOI: 10.1109/HOTI.2010.22
Article

Design and Evaluation of Generalized Collective Communication Primitives with Overlap Using ConnectX-2 Offload Engine

Published: 18 August 2010

Abstract

Collective communication operations provided by the Message Passing Interface (MPI) are heavily used by scientific applications at large scale. The current MPI standard, MPI-2.2, defines only blocking collective communication calls, which do not allow simultaneous computation and communication. It is expected that MPI-3 will allow for non-blocking collective communication. The newly introduced ConnectX-2 InfiniBand adapter from Mellanox features an offload mechanism that enables the Network Interface Card (NIC) to perform a series of communication and reduction operations without the involvement of the host processor. Current-generation MPI stacks implement each collective operation using point-to-point operations. To take advantage of the offload feature for all MPI collectives in a rapidly changing architectural environment, the collectives must be redesigned using flexible and generalized primitives. The primitives can then be used to compose various collective algorithms. The primitives must provide increased overlap on adapters that support offload capabilities, across varying collective group sizes and communication message sizes. In this paper, we take on the challenge of designing collective communication primitives with good overlap characteristics and evaluate their performance using the ConnectX-2 offload feature. We also show how collectives such as Barrier can be designed using our communication primitives. Our evaluation reveals that we can achieve near-perfect (94% to 100%) overlap of computation and communication by using our primitives. Additionally, we observe a performance improvement of up to 5% using the Recv-Replicate primitive for data transfer.
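The abstract reports overlap as a percentage but does not say how it is computed. As a rough illustration only, the sketch below uses one commonly used definition: the share of the collective's standalone time that is hidden when the same collective runs concurrently with computation. The function name, the parameter names, and the formula itself are assumptions made for this example; they are not taken from the paper.

```python
def overlap_percent(t_comm: float, t_compute: float, t_total: float) -> float:
    """Percentage of communication time hidden behind computation.

    Assumed (hypothetical) inputs, all in the same time unit:
      t_comm    -- time of the collective run alone, with no computation
      t_compute -- time of the computation run alone, with no communication
      t_total   -- time when the collective is started, the computation
                   runs, and the collective is then waited on
    """
    if t_comm <= 0:
        raise ValueError("t_comm must be positive")
    # Communication time that was NOT hidden: whatever the combined run
    # costs beyond the computation alone (never negative).
    exposed = max(t_total - t_compute, 0.0)
    # Communication time that WAS hidden behind computation.
    hidden = max(t_comm - exposed, 0.0)
    return 100.0 * hidden / t_comm


if __name__ == "__main__":
    # Made-up timings, chosen only to exercise the formula.
    print(overlap_percent(t_comm=8.0, t_compute=64.0, t_total=64.5))
```

Under this definition, a combined run that takes no longer than the computation alone scores 100%, and a combined run that takes as long as computation plus communication back to back scores 0%.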




Information

Published In

HOTI '10: Proceedings of the 2010 18th IEEE Symposium on High Performance Interconnects
August 2010
131 pages
ISBN: 9780769542089

Publisher

IEEE Computer Society

United States


Qualifiers

  • Article


Citations

Cited By

  • (2020) Communication-Aware Hardware-Assisted MPI Overlap Engine. High Performance Computing, pp. 517-535. DOI: 10.1007/978-3-030-50743-5_26. Online publication date: 22-Jun-2020
  • (2015) Non-blocking PMI extensions for fast MPI startup. Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, pp. 131-140. DOI: 10.1109/CCGrid.2015.151. Online publication date: 4-May-2015
  • (2014) The TH Express high performance interconnect networks. Frontiers of Computer Science: Selected Publications from Chinese Universities, 8:3, pp. 357-366. DOI: 10.1007/s11704-014-3500-9. Online publication date: 1-Jun-2014
  • (2012) Composable, non-blocking collective operations on power7 IH. Proceedings of the 26th ACM international conference on Supercomputing, pp. 215-224. DOI: 10.1145/2304576.2304605. Online publication date: 25-Jun-2012
