Article

Accelerating SpMV on FPGAs by Compressing Nonzero Values

Authors:

Paul Grigoras,

Pavel Burovskiy,

Eddie Hung,

Wayne Luk

FCCM '15: Proceedings of the 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines

Pages 64 - 67

https://doi.org/10.1109/FCCM.2015.30

Published: 02 May 2015

Abstract

Sparse matrix-vector multiplication (SpMV) is an important kernel in many areas of scientific computing, especially as a building block for iterative linear system solvers. We study how lossless compression of nonzero values can be used to overcome memory bandwidth limitations in FPGA-based SpMV implementations. We introduce a dictionary-based compression algorithm that reduces redundancy among nonzero values, improving effective memory bandwidth without reducing computational efficiency by making use of spare FPGA resources. We show how a sparse matrix in the CSR format can be converted to the proposed storage format on the CPU, and that average compression ratios of 1.14 to 1.40, and up to 2.65, can be achieved over CSR for relevant matrices in our benchmarks.
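The abstract does not give the details of the proposed storage format, so the following is only a minimal sketch of the general idea of dictionary-based compression of CSR nonzero values. It is not the authors' algorithm. All names (`compress_values`, `decompress_values`, `csr_spmv`, `value_compression_ratio`) and the 64-bit value width are assumptions made for illustration, and the ratio computed here covers the value array only, whereas the figures in the abstract are measured over the whole CSR structure.

```python
import struct
from math import ceil, log2

def compress_values(values):
    """Replace each nonzero by an index into a table of unique values.

    Values are keyed by their exact 64-bit pattern so that the round trip
    is lossless (an illustrative choice, not taken from the paper).
    """
    dictionary, index_of, indices = [], {}, []
    for v in values:
        key = struct.pack("<d", v)
        if key not in index_of:
            index_of[key] = len(dictionary)
            dictionary.append(v)
        indices.append(index_of[key])
    return dictionary, indices

def decompress_values(dictionary, indices):
    """Recover the original value array by table lookup."""
    return [dictionary[i] for i in indices]

def csr_spmv(row_ptr, col_idx, values, x):
    """Reference y = A x for a matrix stored in CSR form."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

def value_compression_ratio(values, dictionary, value_bits=64):
    """Ratio of raw value storage to dictionary plus index storage."""
    if not values:
        return 1.0
    index_bits = max(1, ceil(log2(len(dictionary))))
    original = len(values) * value_bits
    compressed = len(dictionary) * value_bits + len(values) * index_bits
    return original / compressed

# A 3x3 example: [[2, 0, 1], [0, 2, 0], [1, 0, 2]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [2.0, 1.0, 2.0, 1.0, 2.0]
x = [1.0, 2.0, 3.0]

dictionary, indices = compress_values(values)
restored = decompress_values(dictionary, indices)
y = csr_spmv(row_ptr, col_idx, restored, x)
ratio = value_compression_ratio(values, dictionary)
print(dictionary, indices, y)
```

In this sketch the gain depends entirely on how often values repeat: a matrix whose nonzeros are all distinct would see the dictionary grow as large as the value array itself, and the index stream would then be pure overhead.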



Published In

FCCM '15: Proceedings of the 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines

May 2015, 239 pages

ISBN: 9781479999699

Publisher: IEEE Computer Society, United States


