DOI: 10.1145/3615318.3615326
research-article

Evaluating the Viability of LogGP for Modeling MPI Performance with Non-contiguous Datatypes on Modern Architectures

Published: 21 September 2023

Abstract

Modern architectures and communication systems software include complex hardware, communication abstractions, and optimizations that make their performance difficult to measure, model, and understand. This paper examines whether modified versions of the existing Netgauge communication performance measurement tool and the LogGOPS performance model can accurately characterize the communication behavior of modern hardware, MPI abstractions, and MPI implementations. This includes analyzing their ability to model GPU-aware communication in different MPI implementations and to quantify the performance characteristics of different approaches to non-contiguous data communication on modern GPU systems. The paper also applies these techniques to quantify the performance of different implementations and optimization approaches to non-contiguous data communication on a variety of systems, demonstrating that modern communication system design approaches can result in widely varying and difficult-to-predict performance, even within the same combination of hardware and communication software.
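
For orientation, the LogGP model named in the title estimates the end-to-end time of a single k-byte message as o + (k - 1)G + L + o, where L is the network latency, o the per-message CPU overhead at each endpoint, and G the gap per byte for long messages. The sketch below only evaluates that standard formula. It is not taken from the paper or from Netgauge; the names `LogGPParams` and `message_time` and the parameter values are illustrative assumptions, not measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogGPParams:
    """LogGP parameters, all in seconds (G in seconds per byte).

    Hypothetical container for illustration; values must come from
    measurement on a real system.
    """
    L: float  # network latency
    o: float  # CPU overhead per message (paid once by sender, once by receiver)
    g: float  # minimum gap between consecutive messages (unused for one message)
    G: float  # gap per byte for long messages


def message_time(p: LogGPParams, k: int) -> float:
    """Estimated end-to-end time of one k-byte message: o + (k - 1)G + L + o."""
    if k < 1:
        raise ValueError("message size k must be at least 1 byte")
    return p.o + (k - 1) * p.G + p.L + p.o


if __name__ == "__main__":
    # Made-up parameter values, chosen only to exercise the formula.
    params = LogGPParams(L=1e-6, o=5e-7, g=1e-6, G=1e-9)
    for size in (1, 1024, 1024 * 1024):
        print(f"{size:>8} bytes: {message_time(params, size) * 1e6:.2f} us")
```

The formula treats the message as one contiguous buffer; any additional cost of packing or unpacking non-contiguous data is not captured by these four parameters, which is the kind of gap the abstract refers to.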

Published In

EuroMPI '23: Proceedings of the 30th European MPI Users' Group Meeting
September 2023
123 pages
ISBN: 9798400709135
DOI: 10.1145/3615318
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

EuroMPI '23: 30th European MPI Users' Group Meeting
September 11 - 13, 2023
Bristol, United Kingdom

Acceptance Rates

Overall Acceptance Rate: 66 of 139 submissions, 47%
