Abstract
CartSolver is a widely used three-dimensional Euler solver for Cartesian grids. In this paper, we accelerate it on recent many-core accelerators, namely the NVIDIA Fermi C2050, the NVIDIA Kepler K20, and Intel MIC, and achieve the expected speedup over the serial solver. On the GPU platform, two versions of the accelerated CartSolver are implemented and optimized. For MIC, we apply various optimization methods, guided by an open-source performance analysis tool, to achieve the best performance. The differences in architecture and programming model between GPU and MIC are also discussed. In the experiments, the correctness and accuracy of the solvers are validated, and the optimization methods are shown to have a substantial effect. Finally, a new criterion for measuring the workload is proposed, and several recommendations on selecting suitable accelerators for CFD engineering software are given based on a comparison of the criteria.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Liu, Y., Deng, L. (2015). Acceleration of CFD Engineering Software on GPU and MIC. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9532. Springer, Cham. https://doi.org/10.1007/978-3-319-27161-3_77
DOI: https://doi.org/10.1007/978-3-319-27161-3_77
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27160-6
Online ISBN: 978-3-319-27161-3
eBook Packages: Computer Science (R0)