Are we ready for broader adoption of ARM in the HPC community: Performance and Energy Efficiency Analysis of Benchmarks and Applications Executed on High-End ARM Systems

Published: 27 February 2023

Abstract

A set of benchmarks, including numerical libraries and real-world scientific applications, was run on several modern ARM systems (Amazon Graviton 3/2, Fujitsu A64FX, Ampere Altra, ThunderX2) and compared to x86 systems (Intel and AMD) as well as to hybrid Intel x86/NVIDIA GPU systems. Benchmarking was automated with the application kernel module of XDMoD, a comprehensive suite for HPC resource utilization and performance monitoring. The application kernel module enables continuous performance monitoring of HPC resources through the regular execution of user applications, and it has been used on the Ookami system, one of the first USA-based Fujitsu A64FX (ARM, 512-bit SVE) systems. The applications used for this study span a variety of computational paradigms: HPCC (a collection of HPC benchmarks), NWChem (ab initio chemistry), OpenFOAM (partial differential equation solver), GROMACS (biomolecular simulation), AI Benchmark Alpha (AI benchmark), and Enzo (adaptive mesh refinement). ARM performance, while generally slower, was in many cases comparable to current x86 counterparts and often exceeded previous generations of x86 CPUs. In terms of energy efficiency, which accounts for both power consumption and execution time, ARM was in most cases more energy efficient than the x86 processors. In cases where GPU performance was tested, the GPU systems showed the fastest execution and the highest energy efficiency. Given the high core count per node, comparable performance, and competitive pricing, current high-end ARM CPUs are already a valid choice as a primary HPC system processor.
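As a minimal illustration of the energy-efficiency comparison described in the abstract, the sketch below computes energy-to-solution (average node power multiplied by execution time) and a relative efficiency ratio between two runs. The paper does not publish its measurement scripts; the class names, function names, and numbers here are hypothetical placeholders, not values or code from the study.

```python
# Minimal sketch: energy-to-solution = average node power (W) x execution time (s).
# All numbers and identifiers below are hypothetical, not results from the paper.

from dataclasses import dataclass


@dataclass
class RunResult:
    system: str
    runtime_s: float      # measured wall-clock time of the benchmark run
    avg_power_w: float    # average node power draw during the run

    @property
    def energy_j(self) -> float:
        """Energy-to-solution in joules, approximated as average power times runtime."""
        return self.avg_power_w * self.runtime_s


def relative_efficiency(baseline: RunResult, other: RunResult) -> float:
    """How many times less energy `other` uses than `baseline` for the same job.
    Values greater than 1 mean `other` is more energy efficient."""
    return baseline.energy_j / other.energy_j


if __name__ == "__main__":
    # Hypothetical example: one x86 node vs one ARM node running the same benchmark.
    x86 = RunResult("x86 node", runtime_s=1000.0, avg_power_w=350.0)
    arm = RunResult("ARM node", runtime_s=1200.0, avg_power_w=200.0)
    for r in (x86, arm):
        print(f"{r.system}: {r.energy_j / 1e3:.1f} kJ to solution")
    print(f"ARM vs x86 energy efficiency: {relative_efficiency(x86, arm):.2f}x")
```

Note that a slower system can still come out ahead on this metric when its lower power draw more than offsets the longer runtime, which is the pattern the abstract reports for the ARM systems.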




        Published In

        HPCAsia '23 Workshops: Proceedings of the HPC Asia 2023 Workshops
        February 2023
        101 pages
        ISBN: 9781450399890
        DOI: 10.1145/3581576

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. ARM
        2. GPU
        3. HPC
        4. benchmarks
        5. energy efficiency
        6. x86

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        HPCAsia2023 Workshop

        Acceptance Rates

        HPCAsia '23 Workshops paper acceptance rate: 9 of 10 submissions (90%)
        Overall acceptance rate: 69 of 143 submissions (48%)


        Cited By

        • (2024) Evaluating ARM and RISC-V Architectures for High-Performance Computing with Docker and Kubernetes. Electronics 13(17), 3494. https://doi.org/10.3390/electronics13173494. Online publication date: 3-Sep-2024.
        • (2024) First Impressions of the Sapphire Rapids Processor with HBM for Scientific Workloads. SN Computer Science 5(5). https://doi.org/10.1007/s42979-024-02958-3. Online publication date: 7-Jun-2024.
        • (2024) Comprehensive analysis of energy efficiency and performance of ARM and RISC-V SoCs. The Journal of Supercomputing 80(9), 12771-12789. https://doi.org/10.1007/s11227-024-05946-9. Online publication date: 20-Feb-2024.
        • (2024) Prediction of Fps Using Ensembling Approach for Benchmarking Gaming Systems. Machine Vision and Augmented Intelligence, 365-378. https://doi.org/10.1007/978-981-97-4359-9_36. Online publication date: 15-Dec-2024.
        • (2023) ACCESS: Advancing Innovation. Practice and Experience in Advanced Research Computing 2023: Computing for the Common Good, 173-176. https://doi.org/10.1145/3569951.3597559. Online publication date: 23-Jul-2023.
