DOI: 10.1145/3636480.3637284
research-article

NVIDIA Grace Superchip Early Evaluation for HPC Applications

Published: 11 January 2024

Abstract

Arm-based systems have been a reality in HPC for more than a decade. However, the arrival of a new chip on the market always brings challenges, not only at the ISA level but also in the SoC integration, the memory subsystem, the board integration, the node interconnection, and finally the OS and all layers of the system software (compiler and libraries). Guided by the procurement of an NVIDIA Grace HPC cluster within the deployment of MareNostrum 5, and emulating the approach of a scientist who needs to migrate their research to a new HPC system, we evaluated five complex scientific applications on engineering sample nodes of the NVIDIA Grace CPU Superchip and the NVIDIA Grace Hopper Superchip (CPU-only). We report intra-node and inter-node scalability and early performance results showing a speed-up between 1.3× and 4.28× for all codes when compared to the current generation, MareNostrum 4, powered by Intel Skylake CPUs.
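
As a reading aid, here is a minimal sketch of the two quantities the abstract reports. It assumes the speed-up is the ratio of execution time on the baseline system to execution time on the new system, and that scalability is summarized as strong-scaling parallel efficiency. The abstract does not state the exact definitions the authors used. The function names are hypothetical, and no timings from the paper are reproduced here.

```python
def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """Speed-up of the new system over the baseline (a value > 1 means faster).

    Assumed definition: baseline time divided by new time.
    """
    if baseline_seconds <= 0 or new_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / new_seconds


def parallel_efficiency(t_ref: float, n_ref: int, t_n: float, n: int) -> float:
    """Strong-scaling efficiency of a run on n cores (or nodes),
    relative to a reference run on n_ref cores (or nodes).

    Assumed definition: (t_ref * n_ref) / (t_n * n); 1.0 is ideal scaling.
    """
    if min(t_ref, t_n) <= 0 or min(n_ref, n) <= 0:
        raise ValueError("timings and core counts must be positive")
    return (t_ref * n_ref) / (t_n * n)
```

Under these assumptions, a code that takes half as long on the new system has a speed-up of 2, and a run that is only 3.2 times faster on 4 times the cores has an efficiency of 0.8.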

Cited By

  • (2024) Studying CPU and memory utilization of applications on Fujitsu A64FX and Nvidia Grace Superchip. Proceedings of the International Symposium on Memory Systems, 198–207. https://doi.org/10.1145/3695794.3695813. Online publication date: 30-Sep-2024.
  • (2024) Harnessing Integrated CPU-GPU System Memory for HPC: a first look into Grace Hopper. Proceedings of the 53rd International Conference on Parallel Processing, 199–209. https://doi.org/10.1145/3673038.3673110. Online publication date: 12-Aug-2024.
  • (2024) Evaluating and optimising compiler code generation for NVIDIA Grace. Proceedings of the 53rd International Conference on Parallel Processing, 691–700. https://doi.org/10.1145/3673038.3673104. Online publication date: 12-Aug-2024.

Information & Contributors

Information

Published In

HPCAsia '24 Workshops: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops
January 2024
134 pages
ISBN: 9798400716522
DOI: 10.1145/3636480
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Arm
  2. Cluster
  3. Evaluation
  4. HPC
  5. NVIDIA Grace
  6. NVIDIA Superchip
  7. Performance
  8. Scalability

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

HPCAsiaWS 2024

Acceptance Rates

Overall Acceptance Rate 69 of 143 submissions, 48%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 711
  • Downloads (Last 6 weeks): 23
Reflects downloads up to 11 Dec 2024
