Short paper, DOI: 10.1145/3648115.3648130

Evaluation of SYCL’s Different Data Parallel Kernels

Published: 08 April 2024

Abstract

SYCL provides programmers with four, and in the case of AdaptiveCpp even five, ways for calling and writing a device kernel. This paper analyzes the performance of these diverse kernel invocation types for DPC++ and AdaptiveCpp as SYCL implementations on an NVIDIA A100 GPU, an AMD Instinct MI210 GPU, and a dual-socket AMD EPYC 9274F CPU. Using the example of a kernel matrix assembly, we show why the performance can differ by a factor of 100 in the worst case on the same hardware for the same problem using different SYCL implementations and kernel invocation types.
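To make the benchmark problem concrete, the following is a minimal serial sketch of a kernel matrix assembly in plain C++, not the paper's SYCL code. The abstract does not say which kernel function is used, so the linear kernel here is an assumption, and the names `linear_kernel` and `assemble_kernel_matrix` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (assumption): linear kernel k(x, y) = sum_d x[d] * y[d].
// Both vectors are assumed to have the same length.
double linear_kernel(const std::vector<double>& x, const std::vector<double>& y) {
    double sum = 0.0;
    for (std::size_t d = 0; d < x.size(); ++d) {
        sum += x[d] * y[d];
    }
    return sum;
}

// Assemble the dense n x n kernel matrix in row-major order,
// where entry (i, j) holds k(points[i], points[j]).
std::vector<double> assemble_kernel_matrix(const std::vector<std::vector<double>>& points) {
    const std::size_t n = points.size();
    std::vector<double> K(n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // Each (i, j) entry depends only on the two input points,
            // so all entries can be computed independently of each other.
            K[i * n + j] = linear_kernel(points[i], points[j]);
        }
    }
    return K;
}
```

Because every entry is independent, the two loops could be mapped onto a two-dimensional data parallel index space on a device. How that mapping is expressed is where the kernel invocation types compared in the paper differ.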



Information

Published In

IWOCL '24: Proceedings of the 12th International Workshop on OpenCL and SYCL
April 2024
124 pages
ISBN: 9798400717901
DOI: 10.1145/3648115
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Badges

  • Best Short Paper

Author Tags

  1. CPU
  2. GPU
  3. Performance Evaluation
  4. SVM
  5. SYCL

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Funding Sources

  • EXC2075
  • AISA

Conference

IWOCL '24

Acceptance Rates

Overall Acceptance Rate 84 of 152 submissions, 55%

