Poster
DOI: 10.1145/3588195.3595955

Accelerating MPI Collectives with Process-in-Process-based Multi-object Techniques

Published: 07 August 2023

Abstract

In the exascale computing era, optimizing MPI collective performance in high-performance computing (HPC) applications is critical. Current algorithms suffer performance degradation from system call overhead, page faults, and extra data copies, limiting the efficiency and scalability of HPC applications. To address these issues, we propose PiP-MColl, a Process-in-Process-based Multi-object Inter-process MPI Collective design that maximizes small-message MPI collective performance at scale. PiP-MColl features efficient collective algorithms with multiple senders and receivers, and it leverages Process-in-Process shared-memory techniques to eliminate unnecessary system calls, page faults, and extra data copies, improving intra- and inter-node message rate and throughput. Our design also boosts performance for larger messages, yielding comprehensive improvement across a wide range of message sizes. Experimental results show that PiP-MColl outperforms popular MPI libraries, including OpenMPI, MVAPICH2, and Intel MPI, by up to 4.6X for MPI collectives such as MPI_Scatter and MPI_Allgather.
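
The mechanism named in the abstract can be illustrated briefly. Conventional intra-node collectives move data either through a staging buffer in a shared-memory segment (costing an extra copy) or through cross-memory-attach system calls such as process_vm_readv (costing a system call per transfer, plus page faults on first touch of remote mappings). Because Process-in-Process maps all ranks into a single address space, a rank can instead read a peer's buffer with a plain memcpy. The C sketch below is a minimal illustration of that idea under stated assumptions, not the PiP-MColl implementation: plain POSIX threads stand in for PiP tasks (both share one address space), the exchange shown is an allgather-style pattern in which every rank acts as both sender and receiver, and all names (NRANKS, CHUNK, contrib, rank_main) are hypothetical.

/*
 * Minimal sketch: direct allgather over a shared address space.
 * PiP maps MPI ranks into one address space; here plain pthreads
 * stand in for PiP tasks, so every "rank" can read a peer's buffer
 * with an ordinary memcpy -- no shared-memory staging copy, no
 * cross-memory-attach system call, no page fault on a remote
 * mapping. All names are hypothetical.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NRANKS 4
#define CHUNK  8                    /* ints contributed per rank */

static int *contrib[NRANKS];        /* buffer pointers published by each rank */
static pthread_barrier_t barrier;

static void *rank_main(void *arg) {
    int me = (int)(long)arg;

    /* Each rank fills and publishes its own contribution. */
    int *mine = malloc(CHUNK * sizeof *mine);
    for (int i = 0; i < CHUNK; i++) mine[i] = me * 100 + i;
    contrib[me] = mine;

    /* Make every contribution visible before anyone reads. */
    pthread_barrier_wait(&barrier);

    /* "Multi-object" phase: every rank is simultaneously a sender
     * (its buffer is read by all peers) and a receiver (it copies
     * from all peers), so no single root serializes the exchange. */
    int *recv = malloc(NRANKS * CHUNK * sizeof *recv);
    for (int peer = 0; peer < NRANKS; peer++)
        memcpy(recv + peer * CHUNK, contrib[peer], CHUNK * sizeof *recv);

    pthread_barrier_wait(&barrier);  /* all reads done; safe to free */
    printf("rank %d: recv[last] = %d\n", me, recv[NRANKS * CHUNK - 1]);
    free(mine);
    free(recv);
    return NULL;
}

int main(void) {
    pthread_t tid[NRANKS];
    pthread_barrier_init(&barrier, NULL, NRANKS);
    for (long r = 0; r < NRANKS; r++)
        pthread_create(&tid[r], NULL, rank_main, (void *)r);
    for (int r = 0; r < NRANKS; r++)
        pthread_join(tid[r], NULL);
    pthread_barrier_destroy(&barrier);
    return 0;
}

Under actual PiP the same pattern would run across real processes spawned into one address space, so each rank keeps privatized global variables while peer buffer pointers remain directly dereferenceable.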


Cited By

  • (2024) POSTER: Optimizing Collective Communications with Error-bounded Lossy Compression for GPU Clusters. In Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 454-456. DOI: 10.1145/3627535.3638467. Online publication date: 2-Mar-2024.
  • (2023) PiP-MColl: Process-in-Process-based Multi-object MPI Collectives. In 2023 IEEE International Conference on Cluster Computing (CLUSTER), 354-364. DOI: 10.1109/CLUSTER52292.2023.00037. Online publication date: 31-Oct-2023.

    Published In

    HPDC '23: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing
    August 2023, 350 pages
    ISBN: 9798400701559
    DOI: 10.1145/3588195
    General Chair: Ali R. Butt
    Program Chairs: Ningfang Mi, Kyle Chard
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. distributed systems
    2. message passing interface
    3. mpi collective
    4. parallel algorithms
    5. process-in-process

    Qualifiers

    • Poster

    Funding Sources

    • US Department of Energy

    Conference

    HPDC '23

    Acceptance Rates

    Overall Acceptance Rate 166 of 966 submissions, 17%
