An OpenCL micro-benchmark suite for GPUs and CPUs

Xin Yan¹, Xiaohua Shi¹, Lina Wang¹ & Haiyan Yang¹

Abstract

Open Computing Language (OpenCL) is a new industry standard for task-parallel and data-parallel heterogeneous computing on a variety of modern CPUs, GPUs, DSPs, and other microprocessor designs. Because OpenCL is vendor independent, it is not specialized for any particular compute device. Developing efficient OpenCL applications for a particular platform therefore still requires a deeper understanding of the architectural features of the OpenCL model and of the underlying compute devices. For this purpose, we design and implement an OpenCL micro-benchmark suite for GPUs and CPUs. In this paper, we describe the implementation of our OpenCL micro-benchmarks and present measurements of hardware and software features, such as the performance of mathematical operations, bus bandwidths, memory architectures, branch synchronizations, and scalability, on two multi-core CPUs (AMD Athlon II X2 250 and Intel Pentium Dual-Core E5400) and two GPUs (NVIDIA GeForce GTX 460se and AMD Radeon HD 6850). We also compare these measurements with existing benchmarks to demonstrate the reasonableness and correctness of our benchmark suite.
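As an illustration only, the general shape of a bandwidth micro-benchmark can be sketched as follows. This is not the authors' code and it does not use OpenCL: it times a plain host-memory copy in Python to show the usual pattern of untimed warm-up runs, repeated timed runs, keeping the best time, and dividing bytes moved by elapsed time. The function names (`best_time_seconds`, `bandwidth_gb_per_s`, `copy_benchmark`) and the default buffer size are assumptions made for this sketch.

```python
import time


def best_time_seconds(fn, repeats=5, warmup=1):
    """Run fn() several times and return the shortest wall-clock time."""
    for _ in range(warmup):
        fn()  # warm-up runs are not timed (caches, lazy allocation)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def bandwidth_gb_per_s(bytes_moved, seconds):
    """Convert bytes moved in a given time to GB/s (1 GB = 1e9 bytes)."""
    return bytes_moved / seconds / 1e9


def copy_benchmark(size_bytes=64 * 1024 * 1024, repeats=5):
    """Time a host-memory buffer copy (hypothetical stand-in for a device copy)."""
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)

    def copy_once():
        dst[:] = src  # one full read of src and one full write of dst

    seconds = best_time_seconds(copy_once, repeats=repeats)
    # Count both the bytes read and the bytes written.
    return bandwidth_gb_per_s(2 * size_bytes, seconds)


if __name__ == "__main__":
    print(f"host copy bandwidth: {copy_benchmark():.2f} GB/s")
```

Taking the best of several runs reduces noise from other activity on the machine. A real OpenCL benchmark would replace `copy_once` with an enqueued kernel or buffer transfer and would typically rely on the runtime's own event timing rather than host wall-clock time.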

Acknowledgments

This material is based upon work supported by the National Natural Science Foundation of China (No. 61073010 and No. 61272166), the National Science and Technology Major Project of China (No. 2012ZX01039-004), and the State Key Laboratory of Software Development Environment of China (No. SKLSDE-2012ZX-02).

Author information

Authors and Affiliations

State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing, China
Xin Yan, Xiaohua Shi, Lina Wang & Haiyan Yang

Corresponding author

Correspondence to Xiaohua Shi.

About this article

Cite this article

Yan, X., Shi, X., Wang, L. et al. An OpenCL micro-benchmark suite for GPUs and CPUs. J Supercomput 69, 693–713 (2014). https://doi.org/10.1007/s11227-014-1112-2

Published: 28 January 2014
Issue Date: August 2014
DOI: https://doi.org/10.1007/s11227-014-1112-2
