
PMIx: process management for exascale environments

Published: 25 September 2017

Abstract

High-Performance Computing (HPC) applications have historically executed in static resource allocations, using programming models that ran independently from the resident system management stack (SMS). Achieving exascale performance that is both cost-effective and fits within site-level environmental constraints will, however, require that the application and SMS collaboratively orchestrate the flow of work to optimize resource utilization and compensate for on-the-fly faults. The Process Management Interface - Exascale (PMIx) community is committed to establishing scalable workflow orchestration by defining an abstract set of interfaces by which not only applications and tools can interact with the resident SMS, but also the various SMS components can interact with each other. This paper presents a high-level overview of the goals and current state of the PMIx standard, and lays out a roadmap for future directions.
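
As a rough illustration of the kind of interface the abstract describes (a sketch, not code drawn from the paper), the following C fragment shows the general shape of a PMIx client interaction, assuming a PMIx v2-style API: the process connects to the local PMIx server hosted by the SMS, queries job-level information, and synchronizes with its peers. The variable names and the choice of the PMIX_JOB_SIZE key are illustrative.

    /*
     * Minimal PMIx client sketch (assumes a PMIx v2-style API).
     * Build against the PMIx library, e.g.:  cc pmix_client.c -lpmix
     */
    #include <stdio.h>
    #include <string.h>
    #include <pmix.h>

    int main(void)
    {
        pmix_proc_t myproc, wildcard;
        pmix_value_t *val = NULL;
        pmix_status_t rc;

        /* Connect to the local PMIx server provided by the SMS */
        rc = PMIx_Init(&myproc, NULL, 0);
        if (PMIX_SUCCESS != rc) {
            fprintf(stderr, "PMIx_Init failed: %d\n", rc);
            return 1;
        }

        /* Ask the SMS for job-level information, e.g. the job size */
        PMIX_PROC_CONSTRUCT(&wildcard);
        (void)strncpy(wildcard.nspace, myproc.nspace, PMIX_MAX_NSLEN);
        wildcard.rank = PMIX_RANK_WILDCARD;
        rc = PMIx_Get(&wildcard, PMIX_JOB_SIZE, NULL, 0, &val);
        if (PMIX_SUCCESS == rc) {
            printf("rank %u of %u processes\n",
                   myproc.rank, val->data.uint32);
            PMIX_VALUE_RELEASE(val);
        }

        /* Synchronize with the other processes in this namespace */
        rc = PMIx_Fence(&wildcard, 1, NULL, 0);
        if (PMIX_SUCCESS != rc) {
            fprintf(stderr, "PMIx_Fence failed: %d\n", rc);
        }

        (void)PMIx_Finalize(NULL, 0);
        return 0;
    }

Tools and other SMS components follow broadly the same pattern through the corresponding PMIx tool and server interfaces.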



Information

Published In

EuroMPI '17: Proceedings of the 24th European MPI Users' Group Meeting
September 2017
169 pages
ISBN:9781450348492
DOI:10.1145/3127024
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

  • Mellanox Technologies
  • Intel

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 September 2017

Qualifiers

  • Research-article

Conference

EuroMPI/USA '17: 24th European MPI Users' Group Meeting
Sponsors: Mellanox, Intel
September 25 - 28, 2017
Chicago, Illinois

Acceptance Rates

EuroMPI '17 paper acceptance rate: 17 of 37 submissions (46%)
Overall acceptance rate: 66 of 139 submissions (47%)

Contributors

Article Metrics

  • Downloads (last 12 months): 427
  • Downloads (last 6 weeks): 61

Reflects downloads up to 11 Dec 2024

Cited By

  • (2024) Shared Memory Access Optimization Analysis System for PMIx Standard Implementation. 2024 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON), 457-461. DOI: 10.1109/SIBIRCON63777.2024.10758451. Online publication date: 30-Sep-2024.
  • (2024) HPC challenges and opportunities of industrial-scale reactive fluidized bed simulation using meshes of several billion cells on the route of Exascale. Powder Technology, 444, 120018. DOI: 10.1016/j.powtec.2024.120018. Online publication date: Aug-2024.
  • (2024) A transmission optimization method for MPI communications. The Journal of Supercomputing, 80:5, 6240-6263. DOI: 10.1007/s11227-023-05699-x. Online publication date: 1-Mar-2024.
  • (2024) Bringing HPE Slingshot 11 support to Open MPI. Concurrency and Computation: Practice and Experience, 36:22. DOI: 10.1002/cpe.8203. Online publication date: 18-Jul-2024.
  • (2023) MPI Application Binary Interface Standardization. Proceedings of the 30th European MPI Users' Group Meeting, 1-12. DOI: 10.1145/3615318.3615319. Online publication date: 11-Sep-2023.
  • (2023) MPIGDB: A Flexible Debugging Infrastructure for MPI Programs. Proceedings of the 13th Workshop on AI and Scientific Computing at Scale using Flexible Computing, 11-18. DOI: 10.1145/3589013.3596675. Online publication date: 10-Aug-2023.
  • (2023) Portable Containerized MPI Application Using UCX Replacement Method. Advances on P2P, Parallel, Grid, Cloud and Internet Computing, 222-234. DOI: 10.1007/978-3-031-46970-1_21. Online publication date: 29-Oct-2023.
  • (2022) Algorithms for Optimizing the Execution of Parallel Programs on High-Performance Systems When Solving Problems of Modeling Physical Processes. Optoelectronics, Instrumentation and Data Processing, 57:5, 552-560. DOI: 10.3103/S8756699021050113. Online publication date: 18-Mar-2022.
  • (2021) Key-Value Database Access Optimization For PMIx Standard Implementation. 2021 Ural Symposium on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT), 0362-0366. DOI: 10.1109/USBEREIT51232.2021.9455075. Online publication date: 13-May-2021.
  • (2020) Customizable Scale-Out Key-Value Stores. IEEE Transactions on Parallel and Distributed Systems, 31:9, 2081-2096. DOI: 10.1109/TPDS.2020.2982640. Online publication date: 1-Sep-2020.
