DOI: 10.1145/3332186.3332237
research-article

Characterizing Performance Improvement of GPUs

Published: 28 July 2019

Abstract

Evaluating supercomputer performance through benchmarking is an integral part of operating an HPC center. Benchmarking can guide system selection during hardware acquisition, validate that the selected hardware performs as expected, and ensure that the system continues to perform reasonably throughout its lifetime. In this work we quantify the performance improvement of our recent GPU architectures, GeForce RTX 2080 Ti, Titan V, and Tesla V100, over an older architecture, GeForce GTX 1080 Ti. We perform a workload analysis to select a suite of applications to benchmark. Using the benchmark results, we then assign a single performance number to each GPU system under test. Taking hardware price into consideration, our results show the consumer-grade GeForce RTX 2080 Ti to be the most cost-competitive GPU platform.
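
The step of reducing a suite of benchmark results to a single number per GPU, and then weighing that number against price, can be sketched as follows. This is a minimal illustration and not the paper's method: the abstract does not say how the single number is computed, so the geometric mean of per-application speedups over the baseline is an assumption here. The function names, the application names (app_a, app_b), and all throughput values are hypothetical placeholders, not measurements from the paper.

```python
import math


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, and all values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def performance_score(throughput, baseline):
    """Collapse per-application speedups over the baseline into one number.

    ASSUMPTION: the aggregate is a geometric mean of speedups. The abstract
    does not name the aggregation that the authors actually used.
    """
    return geometric_mean(throughput[app] / baseline[app] for app in baseline)


def score_per_dollar(score, price_usd):
    """Normalize a performance score by hardware price."""
    if price_usd <= 0:
        raise ValueError("price must be positive")
    return score / price_usd


# HYPOTHETICAL placeholder throughputs (higher is better), one entry per
# benchmarked application. These are not results from the paper.
baseline_gpu = {"app_a": 100.0, "app_b": 50.0}
candidate_gpu = {"app_a": 200.0, "app_b": 100.0}

baseline_score = performance_score(baseline_gpu, baseline_gpu)
candidate_score = performance_score(candidate_gpu, baseline_gpu)

# HYPOTHETICAL price, used only to show the normalization step.
candidate_value = score_per_dollar(candidate_score, 1000.0)

print(f"baseline score:  {baseline_score:.3f}")
print(f"candidate score: {candidate_score:.3f}")
print(f"candidate score per dollar: {candidate_value:.6f}")
```

A geometric mean is a common choice for this kind of aggregate because a speedup on one application cannot dominate the result simply by having a larger absolute throughput. An arithmetic mean or a weighted variant would fit the same structure.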

        Information & Contributors

        Information

        Published In

        ACM Other conferences
        PEARC '19: Practice and Experience in Advanced Research Computing 2019: Rise of the Machines (learning)
        July 2019
        775 pages
        ISBN:9781450372275
        DOI:10.1145/3332186
        • General Chair: Tom Furlani

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Benchmarking
        2. GPU
        3. Performance Improvement
        4. Workload Analysis

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        PEARC '19

        Acceptance Rates

        Overall Acceptance Rate 133 of 202 submissions, 66%

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • Total Citations: 0
        • Total Downloads: 266
        • Downloads (Last 12 months): 19
        • Downloads (Last 6 weeks): 2
        Reflects downloads up to 09 Jan 2025
