Research article
DOI: 10.1145/3635035.3635046

Evaluation of POSIT Arithmetic with Accelerators

Published: 19 January 2024

Abstract

We present an evaluation of 32-bit POSIT arithmetic through its implementation as accelerators on FPGAs and GPUs. POSIT is a floating-point number format that adaptively changes the size of its fraction field. We developed hardware designs for FPGAs and software for GPUs to accelerate linear algebra operations in Posit(32,2) arithmetic. Both the FPGA- and GPU-based accelerators significantly accelerated Cholesky and LU decomposition of dense matrices. In terms of numerical accuracy, Posit(32,2) arithmetic is approximately 0.5-1.0 decimal digits more accurate than the standard 32-bit format, especially when the magnitudes of the input matrix elements are close to 1. Evaluating power consumption, we observed that the power efficiency of the accelerators ranged from 0.043 to 0.076 Gflops/watt for LU decomposition in Posit(32,2) arithmetic. As accelerators of Posit(32,2) arithmetic, the latest GPUs are more power-efficient than the evaluated FPGA chip.
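To make the accuracy comparison concrete, the sketch below accumulates a dot product of elements close to 1 in both IEEE binary32 and Posit(32,2) and reports each result's relative error against a double-precision reference. It is a minimal illustration written against the SoftPosit reference library (https://gitlab.com/cerlane/SoftPosit), not against the paper's FPGA or GPU accelerators; the file name, build line, and error metric are assumptions for illustration only.

/*
 * Sketch: Posit(32,2) vs. IEEE binary32 accuracy on a dot product
 * whose elements are close to 1, using the SoftPosit library.
 * Assumed build (SoftPosit headers and library already installed):
 *   gcc -O2 dot_posit.c -lsoftposit -lm -o dot_posit
 */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include "softposit.h"   /* posit32_t is a 32-bit posit with es = 2 */

int main(void)
{
    const int n = 1000;
    double    ref = 0.0;                       /* double-precision reference */
    float     s32 = 0.0f;                      /* IEEE binary32 accumulator  */
    posit32_t p32 = convertDoubleToP32(0.0);   /* Posit(32,2) accumulator    */

    srand(42);
    for (int i = 0; i < n; ++i) {
        /* Elements drawn near 1, where posits carry maximal precision. */
        double a = 1.0 + 1e-3 * ((double)rand() / RAND_MAX - 0.5);
        double b = 1.0 + 1e-3 * ((double)rand() / RAND_MAX - 0.5);

        ref += a * b;
        s32 += (float)a * (float)b;
        p32  = p32_add(p32, p32_mul(convertDoubleToP32(a),
                                    convertDoubleToP32(b)));
    }

    double err_f32 = fabs((double)s32 - ref) / fabs(ref);
    double err_p32 = fabs(convertP32ToDouble(p32) - ref) / fabs(ref);

    printf("relative error, binary32   : %.3e (%.2f digits)\n",
           err_f32, -log10(err_f32));
    printf("relative error, Posit(32,2): %.3e (%.2f digits)\n",
           err_p32, -log10(err_p32));
    return 0;
}

Near 1, a 32-bit posit with es = 2 spends only a 2-bit regime, leaving 27 explicit fraction bits versus binary32's 23, which is the operating region in which the 0.5-1.0 digit accuracy advantage quoted in the abstract arises.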



Published In

HPCAsia '24: Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region
January 2024, 185 pages
ISBN: 9798400708893
DOI: 10.1145/3635035

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• Research article
• Refereed limited

Conference

HPCAsia 2024. Overall acceptance rate: 69 of 143 submissions (48%).
