Abstract
Hierarchical parallel computing is rapidly becoming ubiquitous in high performance computing (HPC) systems. Programming models commonly used in turbomachinery and other engineering simulation codes have traditionally relied upon distributed-memory parallelism with MPI and have ignored thread and data parallelism. This paper presents methods for programming multi-block codes for concurrent computation on host multicore CPUs and many-core accelerators such as graphics processing units (GPUs). Portable, standardized language directives are used to expose data and thread parallelism within the hybrid shared- and distributed-memory simulation system. A single-source/multiple-object strategy is used to simplify code management and allow for heterogeneous computing. Automated load balancing is implemented to determine what portions of the domain are computed by the multicore CPUs and GPUs. Preliminary results indicate that a moderate overall speed-up is possible by taking advantage of all processors and accelerators on a given HPC node.
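The abstract does not say how the automated load balancing works, so the following is only an illustrative sketch of one common approach, not the authors' implementation. It assumes each grid block has a known cell count and each device has a measured throughput in cells per second. The function name `balance_blocks`, the device labels, and all numbers are hypothetical. Blocks are assigned greedily, largest first, to whichever device would finish that block earliest.

```python
def balance_blocks(block_sizes, rates):
    """Greedy assignment of grid blocks to devices (illustrative only).

    block_sizes: list of cell counts, one per block (assumed known).
    rates: dict of device name -> measured throughput in cells/second
           (assumed to come from a short timing run).
    Returns (assignment, loads): block indices per device and the
    estimated compute time per device in seconds.
    """
    assignment = {name: [] for name in rates}
    loads = {name: 0.0 for name in rates}

    # Place the largest blocks first so the small ones can even out the tail.
    order = sorted(range(len(block_sizes)),
                   key=lambda i: block_sizes[i], reverse=True)

    for i in order:
        # Choose the device with the earliest estimated finish time for block i.
        best = min(rates, key=lambda name: loads[name] + block_sizes[i] / rates[name])
        loads[best] += block_sizes[i] / rates[best]
        assignment[best].append(i)

    return assignment, loads


if __name__ == "__main__":
    # Hypothetical node: one multicore CPU and one GPU that is 3x faster.
    rates = {"cpu": 1.0e6, "gpu": 3.0e6}
    blocks = [400_000, 300_000, 300_000, 200_000, 200_000, 100_000, 100_000]
    plan, est = balance_blocks(blocks, rates)
    for name in rates:
        print(name, plan[name], f"{est[name]:.3f} s")
```

In a time-marching solver, the throughput estimates could be refreshed from per-iteration timings and the assignment recomputed periodically. Whether the paper does this is not stated in the abstract.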
Notes
1. This assumes that all operations can be computed within a vector logic unit capable of processing eight values simultaneously compared to a scalar logic unit.
2. The legacy !dir$ ivdep directive can often be used where OpenMP v4 is not supported.
Acknowledgements
This material is based upon work supported by, or in part by, the Department of Defense High Performance Computing Modernization Program (HPCMP) under User Productivity Enhancement, Technology Transfer, and Training (PETTT) contract number GS04T09DBC0017.
US Department of Defense (DoD) Distribution Statement A: Approved for public release. Distribution is unlimited.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Stone, C.P., Davis, R.L., Lee, D.Y. (2018). Concurrent Parallel Processing on Graphics and Multicore Processors with OpenACC and OpenMP. In: Chandrasekaran, S., Juckeland, G. (eds) Accelerator Programming Using Directives. WACCPD 2017. Lecture Notes in Computer Science, vol 10732. Springer, Cham. https://doi.org/10.1007/978-3-319-74896-2_6
DOI: https://doi.org/10.1007/978-3-319-74896-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74895-5
Online ISBN: 978-3-319-74896-2
eBook Packages: Computer Science, Computer Science (R0)