Concurrent Parallel Processing on Graphics and Multicore Processors with OpenACC and OpenMP

Christopher P. Stone,
Roger L. Davis &
Daryl Y. Lee

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10732)

Included in the following conference series:

International Workshop on Accelerator Programming Using Directives

Abstract

Hierarchical parallel computing is rapidly becoming ubiquitous in high performance computing (HPC) systems. Programming models commonly used in turbomachinery and other engineering simulation codes have traditionally relied upon distributed-memory parallelism with MPI and have ignored thread and data parallelism. This paper presents methods for programming multi-block codes for concurrent computation on host multicore CPUs and many-core accelerators such as graphics processing units (GPUs). Portable, standardized language directives are used to expose data and thread parallelism within the hybrid shared- and distributed-memory simulation system. A single-source/multiple-object strategy is used to simplify code management and allow for heterogeneous computing. Automated load balancing is implemented to determine which portions of the domain are computed by the multicore CPUs and which by the GPUs. Preliminary results indicate that a moderate overall speed-up is possible by taking advantage of all processors and accelerators on a given HPC node.
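The load-balancing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `gpu_block_count` and `ideal_speedup_over_gpu` are hypothetical, the grid blocks are assumed to be equal in size, and the CPU and GPU throughputs are assumed to have been measured beforehand in the same arbitrary units. The sketch splits the blocks in proportion to throughput and estimates the best-case gain from running both processors concurrently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the paper): number of equally sized
 * blocks to assign to the GPU, proportional to measured throughput.
 * The remaining n_blocks - result blocks go to the multicore CPU. */
size_t gpu_block_count(size_t n_blocks, double cpu_rate, double gpu_rate)
{
    assert(cpu_rate >= 0.0 && gpu_rate >= 0.0);
    assert(cpu_rate + gpu_rate > 0.0);

    double gpu_share = gpu_rate / (cpu_rate + gpu_rate);
    /* Round to the nearest whole block. */
    size_t n_gpu = (size_t)(gpu_share * (double)n_blocks + 0.5);
    if (n_gpu > n_blocks) {
        n_gpu = n_blocks;
    }
    return n_gpu;
}

/* Hypothetical helper: best-case speed-up of CPU+GPU over GPU alone,
 * assuming perfect overlap and no transfer or synchronization cost. */
double ideal_speedup_over_gpu(double cpu_rate, double gpu_rate)
{
    assert(cpu_rate >= 0.0 && gpu_rate > 0.0);
    return (cpu_rate + gpu_rate) / gpu_rate;
}
```

For example, with 10 blocks, a CPU rate of 1 and a GPU rate of 4, the sketch assigns 8 blocks to the GPU, and the best-case gain over a GPU-only run is 1.25. Under these assumptions the gain from adding the CPU is bounded by the ratio of the two throughputs, so it stays small whenever the GPU is much faster than the CPU.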

Notes

1. This assumes that all operations can be computed within a vector logic unit capable of processing eight values simultaneously compared to a scalar logic unit.
2. The legacy !dir$ ivdep directive can often be used where OpenMP v4 is not supported.
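As an illustration of the data-parallel directive that note 2 refers to, the following minimal sketch marks a loop for vectorization with the OpenMP 4.0 `simd` construct. It is written in C rather than the Fortran implied by `!dir$ ivdep`, and the function `axpy` is a hypothetical example, not code from the paper. The directive is guarded so that the loop still compiles as plain scalar code when OpenMP 4.0 is unavailable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y[i] += a * x[i].
 * The restrict qualifiers promise the compiler that x and y do not
 * overlap, so iterations are independent and safe to vectorize. */
void axpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);

/* _OPENMP >= 201307 corresponds to OpenMP 4.0, which introduced simd. */
#if defined(_OPENMP) && _OPENMP >= 201307
    #pragma omp simd
#endif
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Where OpenMP 4.0 is not supported, a compiler-specific hint such as the `ivdep` directive mentioned in note 2 plays a similar role by telling the compiler to assume there are no loop-carried dependences.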

Acknowledgements

This material is based upon work supported by, or in part by, the Department of Defense High Performance Computing Modernization Program (HPCMP) under User Productivity, Technology Transfer and Training (PETTT) contract number GS04T09DBC0017.

US Department of Defense (DoD) Distribution Statement A: Approved for public release. Distribution is unlimited.

Author information

Authors and Affiliations

Computational Science and Engineering, LLC, Athens, GA, 30605, USA
Christopher P. Stone
Department of Mechanical and Aerospace Engineering, University of California Davis, Davis, CA, 95616, USA
Roger L. Davis & Daryl Y. Lee

Corresponding author

Correspondence to Christopher P. Stone.

Editor information

Editors and Affiliations

University of Delaware, Newark, Delaware, USA
Sunita Chandrasekaran
Helmholtz-Zentrum Dresden-Rossendorf e.V., Dresden, Germany
Guido Juckeland

About this paper

Cite this paper

Stone, C.P., Davis, R.L., Lee, D.Y. (2018). Concurrent Parallel Processing on Graphics and Multicore Processors with OpenACC and OpenMP. In: Chandrasekaran, S., Juckeland, G. (eds) Accelerator Programming Using Directives. WACCPD 2017. Lecture Notes in Computer Science, vol 10732. Springer, Cham. https://doi.org/10.1007/978-3-319-74896-2_6

DOI: https://doi.org/10.1007/978-3-319-74896-2_6
Published: 31 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74895-5
Online ISBN: 978-3-319-74896-2
eBook Packages: Computer Science, Computer Science (R0)
