DOI: 10.1145/3624062.3624197
research-article
Open access

SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids Infiniband Clusters: A Performance and Energy Case Study

Published: 12 November 2023

Abstract

In this work, fundamental performance, power, and energy characteristics of the full SPEChpc 2021 benchmark suite are assessed on two different clusters based on Intel Ice Lake and Sapphire Rapids CPUs using the MPI-only variants of the codes. We use memory bandwidth, data volume, and scalability metrics to categorize the benchmarks and pinpoint relevant performance and scalability bottlenecks on the node and cluster levels. Common patterns such as memory bandwidth limitation, dominating communication and synchronization overhead, MPI serialization, superlinear scaling, and alignment issues were identified, in isolation or in combination, showing that SPEChpc 2021 is representative of many HPC workloads. Power dissipation and energy measurements indicate that modern Intel server CPUs have such a high idle power level that race-to-idle is the paramount strategy for minimizing energy to solution and energy-delay product. On the chip level, only memory-bound code shows a clear advantage of Sapphire Rapids over Ice Lake in terms of energy to solution.
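
The race-to-idle argument in the abstract can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes a hypothetical chip whose power is a frequency-independent baseline plus a dynamic part that grows quadratically with clock frequency, and core-bound code whose runtime is inversely proportional to frequency. All function names and numbers are made up for illustration only.

```python
# Toy model (an assumption, not the paper's model or data):
#   power(f)   = p_base + a * f**2      [W]
#   runtime(f) = work / f               [s], core-bound code
# Units: work in giga-cycles, f in GHz, a in W/GHz^2.

def runtime(work, freq):
    """Runtime in seconds for `work` giga-cycles at `freq` GHz."""
    return work / freq

def power(freq, p_base, a):
    """Chip power in watts: baseline (idle) part plus dynamic part."""
    return p_base + a * freq ** 2

def energy_to_solution(work, freq, p_base, a):
    """Energy in joules: power multiplied by runtime."""
    return power(freq, p_base, a) * runtime(work, freq)

def energy_delay_product(work, freq, p_base, a):
    """Energy-delay product in joule-seconds: energy multiplied by runtime."""
    return energy_to_solution(work, freq, p_base, a) * runtime(work, freq)

def best_frequency(freqs, metric):
    """Return the frequency that minimizes the given metric."""
    return min(freqs, key=metric)

# Hypothetical values chosen only to show the two regimes.
FREQS = [1.0, 1.5, 2.0, 2.5, 3.0]   # available clock speeds in GHz
WORK = 100.0                        # giga-cycles
A = 10.0                            # dynamic power coefficient in W/GHz^2

if __name__ == "__main__":
    for p_base in (200.0, 20.0):    # high versus low baseline power in W
        f_energy = best_frequency(
            FREQS, lambda f: energy_to_solution(WORK, f, p_base, A))
        f_edp = best_frequency(
            FREQS, lambda f: energy_delay_product(WORK, f, p_base, A))
        print(f"baseline {p_base:5.0f} W: "
              f"min energy at {f_energy} GHz, min EDP at {f_edp} GHz")
```

In this model the energy is work × (p_base / f + a·f), which has its minimum at f = sqrt(p_base / a). With a high baseline that optimum lies above the fastest available clock, so running as fast as possible and then idling minimizes energy to solution. With a low baseline an intermediate clock speed wins. The energy-delay product, work² × (p_base / f² + a), decreases with frequency in both cases.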

Supplemental Material

MP4 File
Recording of "SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids Infiniband Clusters: A Performance and Energy Case Study" presentation at PMBS23.
ZIP File
We provide the appendix for the reproducibility initiative (Artifact Description, Artifact Evaluation, or Computational Results Analysis). To allow a third party to duplicate the findings, this article provides our extensive performance data artifact and describes further details of the software environments, experimental design, and methodology employed for the results shown in the paper "SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids Infiniband Clusters: A Performance and Energy Case Study". The computational artifacts will enable experienced performance engineers to reproduce and correctly interpret the data shown in the paper and to follow the conclusions we draw from it.

Cited By

  • (2024) First Impressions of the Sapphire Rapids Processor with HBM for Scientific Workloads. SN Computer Science 5:5. DOI: 10.1007/s42979-024-02958-3. Online publication date: 7-Jun-2024

Information

Published In

SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis
November 2023
2180 pages
ISBN: 9798400707858
DOI: 10.1145/3624062
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Competence Network for Scientific High-Performance Computing in Bavaria (KONWIHR)

Conference

SC-W 2023
