Research article
DOI: 10.1145/3635035.3635046

Evaluation of POSIT Arithmetic with Accelerators

Published: 19 January 2024

Abstract

We present an evaluation of 32-bit POSIT arithmetic through its implementation as accelerators on FPGAs and GPUs. POSIT is a floating-point number format that adaptively changes the size of its fraction field. We developed hardware designs for FPGAs and software for GPUs to accelerate linear algebra operations in Posit(32,2) arithmetic. Both the FPGA- and GPU-based accelerators significantly accelerated Cholesky and LU decomposition of dense matrices. In terms of numerical accuracy, Posit(32,2) arithmetic is approximately 0.5-1.0 decimal digits more accurate than the standard 32-bit format, especially when the magnitudes of the input matrix elements are close to 1. Evaluating power consumption, we observed that the power efficiency of the accelerators ranged from 0.043 to 0.076 Gflops/watt for LU decomposition in Posit(32,2) arithmetic. As accelerators of Posit(32,2) arithmetic, the latest GPUs are more power-efficient than the evaluated FPGA chip.
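To make the accuracy comparison concrete, the sketch below accumulates a dot product of elements close to 1 in both IEEE binary32 and Posit(32,2) and reports each result's relative error against a double-precision reference. It is a minimal illustration written against the SoftPosit reference library (https://gitlab.com/cerlane/SoftPosit), not against the paper's FPGA or GPU accelerators; the file name, build line, and error metric are assumptions for illustration only.

/*
 * Sketch: Posit(32,2) vs. IEEE binary32 accuracy on a dot product
 * whose elements are close to 1, using the SoftPosit library.
 * Assumed build (SoftPosit headers and library already installed):
 *   gcc -O2 dot_posit.c -lsoftposit -lm -o dot_posit
 */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include "softposit.h"   /* posit32_t is a 32-bit posit with es = 2 */

int main(void)
{
    const int n = 1000;
    double    ref = 0.0;                       /* double-precision reference */
    float     s32 = 0.0f;                      /* IEEE binary32 accumulator  */
    posit32_t p32 = convertDoubleToP32(0.0);   /* Posit(32,2) accumulator    */

    srand(42);
    for (int i = 0; i < n; ++i) {
        /* Elements drawn near 1, where posits carry maximal precision. */
        double a = 1.0 + 1e-3 * ((double)rand() / RAND_MAX - 0.5);
        double b = 1.0 + 1e-3 * ((double)rand() / RAND_MAX - 0.5);

        ref += a * b;
        s32 += (float)a * (float)b;
        p32  = p32_add(p32, p32_mul(convertDoubleToP32(a),
                                    convertDoubleToP32(b)));
    }

    double err_f32 = fabs((double)s32 - ref) / fabs(ref);
    double err_p32 = fabs(convertP32ToDouble(p32) - ref) / fabs(ref);

    printf("relative error, binary32   : %.3e (%.2f digits)\n",
           err_f32, -log10(err_f32));
    printf("relative error, Posit(32,2): %.3e (%.2f digits)\n",
           err_p32, -log10(err_p32));
    return 0;
}

Near 1, a 32-bit posit with es = 2 spends only a 2-bit regime, leaving 27 explicit fraction bits versus binary32's 23, which is the operating region in which the 0.5-1.0 digit accuracy advantage quoted in the abstract arises.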



Published In

HPCAsia '24: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region
January 2024, 185 pages
ISBN: 9798400708893
DOI: 10.1145/3635035

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• Research article
• Refereed limited

Conference

HPCAsia 2024. Overall acceptance rate: 69 of 143 submissions (48%).
