poster

Sparse-Matrix Compression Primitives with OpenCL Framework to Support Halide

Authors:

Ming-Yu Hung

IWOCL '19: Proceedings of the International Workshop on OpenCL

Article No.: 24, Pages 1 - 2

https://doi.org/10.1145/3318170.3318179

Published: 13 May 2019

Abstract

Halide and OpenCL now play important roles in heterogeneous multi-core computing. OpenCL provides vendor-level support, while Halide provides domain-specific support for areas such as vision processing and AI models (TVM Halide IR). Halide also provides flexible scheduling of applications on target machines, with OpenCL playing a supporting role in Halide environments. In this work, we investigate the research issues in supporting sparse computation with Halide and the corresponding OpenCL support. We present sparse-matrix compression primitives in Halide for sparse matrix-matrix multiplication (SpMM) with the OpenCL framework. Halide is a programming language designed for image and array processing; it combines algorithms with scheduling primitives to achieve state-of-the-art performance, including on SIMD and heterogeneous targets. This paper proposes an implementation of sparse-matrix compression as Halide scheduling primitives, covering the COO, CSR, and hybrid CSR formats. The experiments cover Halide primitives for sparse-matrix compression and for matrix computation. The experimental results show that computing on compressed matrices improves performance by up to 85% compared with the baseline without compression.
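For readers unfamiliar with the formats named in the abstract, the following is a minimal plain-Python sketch of COO and CSR compression and of a CSR-based SpMM. It is not the authors' Halide primitives or their OpenCL code: the function names `dense_to_coo`, `dense_to_csr`, and `csr_spmm` are hypothetical, the right-hand operand is assumed to be dense (the abstract does not specify this), and the hybrid CSR format is omitted because the abstract does not define it.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def dense_to_coo(a: Matrix) -> Tuple[List[int], List[int], List[float]]:
    """COO: store one (row, column, value) triple per nonzero entry."""
    rows, cols, vals = [], [], []
    for i, row in enumerate(a):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals


def dense_to_csr(a: Matrix) -> Tuple[List[int], List[int], List[float]]:
    """CSR: nonzeros of row i live at positions row_ptr[i] .. row_ptr[i+1]-1."""
    row_ptr, col_idx, vals = [0], [], []
    for row in a:
        for j, v in enumerate(row):
            if v != 0:
                col_idx.append(j)
                vals.append(v)
        # After each row, record how many nonzeros have been stored so far.
        row_ptr.append(len(vals))
    return row_ptr, col_idx, vals


def csr_spmm(row_ptr: List[int], col_idx: List[int], vals: List[float],
             b: Matrix) -> Matrix:
    """Multiply a CSR matrix by a dense matrix b, visiting only stored nonzeros."""
    n_rows = len(row_ptr) - 1
    n_cols = len(b[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a_ik, j = vals[k], col_idx[k]
            for c in range(n_cols):
                out[i][c] += a_ik * b[j][c]
    return out
```

The inner loops of `csr_spmm` run once per stored nonzero rather than once per matrix entry, which is the source of the savings that compression aims for on sparse inputs; how much is actually gained depends on the sparsity pattern and the target hardware.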

Cited By

Liao H, Lee C, Lee J, Lai W, Hung M, Huang C. (2021). Support Convolution of CNN with Compression Sparse Matrix Multiplication Flow in TVM. 50th International Conference on Parallel Processing Workshop, 1-7. Online publication date: 9-Aug-2021. https://dl.acm.org/doi/10.1145/3458744.3473352

Chao C, Chu W, Lee C, Lee J, Hung M, Sung H. (2020). Devise Sparse Compression Schedulers to Enhance FastText Methods. Workshop Proceedings of the 49th International Conference on Parallel Processing, 1-8. Online publication date: 17-Aug-2020. https://dl.acm.org/doi/10.1145/3409390.3409394

Yu M, Chen T, Lee J. (2020). Accelerating NNEF Framework on OpenCL Devices Using clDNN. Proceedings of the International Workshop on OpenCL, 1-2. Online publication date: 27-Apr-2020. https://dl.acm.org/doi/10.1145/3388333.3388655

Index Terms

Sparse-Matrix Compression Primitives with OpenCL Framework to Support Halide
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Heterogeneous (hybrid) systems
      2. Neural networks
    2. Parallel architectures
      1. Single instruction, multiple data

Information & Contributors

Information

Published In

IWOCL '19: Proceedings of the International Workshop on OpenCL

May 2019

102 pages

ISBN:9781450362306

DOI:10.1145/3318170

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

In-Cooperation

Khronos: Khronos Group
Northeastern University
Codeplay: Codeplay Software Ltd.
Intel: Intel
The University of Bristol: The University of Bristol

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster
Research
Refereed limited

Conference

IWOCL'19

IWOCL'19: International Workshop on OpenCL

May 13 - 15, 2019

Boston, MA, USA

Acceptance Rates

IWOCL '19 paper acceptance rate: 13 of 33 submissions, 39%

Overall acceptance rate: 84 of 152 submissions, 55%

Bibliometrics

Total citations: 3
Total downloads: 110
Downloads (last 12 months): 6
Downloads (last 6 weeks): 1

Reflects downloads up to 20 Dec 2024
