Abstract
In this paper, we propose an OpenCL framework for GPU clusters. The target cluster architecture consists of a single host node and multiple compute nodes. They are connected by an interconnection network, such as Gigabit and InfiniBand switches. Each compute node consists of multiple GPUs. Each GPU becomes an OpenCL compute device. The host node executes the host program in an OpenCL application. Our OpenCL framework provides an illusion of a single system for the user. It allows the application to utilize GPUs in a compute node as if they were in the host node. No communication API, such as the MPI library, is required in the application source. We show that the original OpenCL semantics naturally fits to the GPU cluster environment, and the framework achieves both high performance and ease of programming. We implement the OpenCL framework and evaluate its performance on a GPU cluster that consists of one host and eight compute nodes using six OpenCL benchmark applications.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
AMD: AMD Accelerated Parallel Processing SDK v2.3, http://developer.amd.com/gpu/AMDAPPSDK/Pages/default.aspx
AMD: AMD Accelerated Parallel Processing (APP) SDK With OpenCL 1.1 Support (2011), http://developer.amd.com/gpu/atistreamsdk/pages/default.aspx
Amza, C., Cox, A.L., Dwarkadas, S., Keleher, P., Lu, H., Rajamony, R., Yu, W., Zwaenepoel, W.: TreadMarks: Shared Memory Computing on Networks of Workstations. Computer 29, 18–28 (1996)
Barak, A., Ben-nun, T., Levy, E., Shiloh, A.: A Package for OpenCL Based Heterogeneous Computing on Clusters with Many GPU Devices. In: Proceedings of the Workshop on Parallel Programming and Applications on Accelerator Clusters, PPAAC 2010 (2010)
Bienia, C., Kumar, S., Singh, J.P., Li, K.: The PARSEC benchmark suite: characterization and architectural implications. In: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, PACT 2008, pp. 72–81 (2008)
Chen, L., Liu, L., Tang, S., Huang, L., Jing, Z., Xu, S., Zhang, D., Shou, B.: Unified Parallel C for GPU Clusters: Language Extensions and Compiler Implementation. In: Cooper, K., Mellor-Crummey, J., Sarkar, V. (eds.) LCPC 2010. LNCS, vol. 6548, pp. 151–165. Springer, Heidelberg (2011)
Chen, Y., Cui, X., Mei, H.: Large-scale FFT on GPU clusters. In: Proceedings of the 24th ACM International Conference on Supercomputing, ICS 2010, pp. 315–324 (2010)
Fan, Z., Qiu, F., Kaufman, A., Yoakum-Stover, S.: GPU cluster for high performance computing. In: Proceedings of the 2004 ACM/IEEE Conference on Supercomputing, SC 2004, pp. 47–58 (2004)
IBM: OpenCL Development Kit for Linux on Power (2011), http://www.alphaworks.ibm.com/tech/opencl
Intel: Intel OpenCL SDK (2011), http://software.intel.com/en-us/articles/intel-opencl-sdk/
Khronos OpenCL Working Group: The OpenCL Specification Version 1.1 (2010), http://www.khronos.org/opencl
Kim, J., Kim, H., Lee, J.H., Lee, J.: Achieving a single compute device image in OpenCL for multiple GPUs. In: Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, PPoPP 2011, pp. 277–288 (2011)
Lattner, C., Adve, V.: LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In: Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization, CGO 2004, pp. 75–86 (2004)
NASA Advanced Supercomputing Division: NAS Parallel Benchmarks version 3.2, http://www.nas.nasa.gov/Resources/Software/npb.html
NVIDIA: NVIDIA CUDA Toolkit 3.2, http://developer.nvidia.com/cuda-toolkit-32-downloads
NVIDIA: NVIDIA CUDA C Programming Guide 3.2 (2010)
NVIDIA: NVIDIA GPU Computing Developer Home Page (2011), http://developer.nvidia.com/object/gpucomputing.html
Phillips, J.C., Stone, J.E., Schulten, K.: Adapting a message-driven parallel application to GPU-accelerated clusters. In: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC 2008, pp. 8:1–8:9 (2008)
Seoul National University and Samsung: SNU-SAMSUNG OpenCL Framework (2010), http://opencl.snu.ac.kr
The IMPACT Research Group: Parboil Benchmark suite, http://impact.crhc.illinois.edu/parboil.php
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, J., Seo, S., Lee, J., Nah, J., Jo, G., Lee, J. (2013). OpenCL as a Programming Model for GPU Clusters. In: Rajopadhye, S., Mills Strout, M. (eds) Languages and Compilers for Parallel Computing. LCPC 2011. Lecture Notes in Computer Science, vol 7146. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36036-7_6
Download citation
DOI: https://doi.org/10.1007/978-3-642-36036-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36035-0
Online ISBN: 978-3-642-36036-7
eBook Packages: Computer ScienceComputer Science (R0)