
Nested parallelization with OpenMP

Published: 01 October 2007

Abstract

OpenMP is widely accepted as a de facto standard for shared-memory parallel programming in Fortran, C, and C++. Nested parallelization was included in the first OpenMP specification, but it took a few years until the first commercially available compilers supported this optional part of the specification. We employed nested parallelization using OpenMP in three production codes: a C++ code for content-based image retrieval, a C++ code for the computation of critical points in multi-block CFD datasets, and a multi-block Navier-Stokes solver written in Fortran 90. In this paper we discuss both the opportunities and the deficiencies of the nested parallelization support in OpenMP.
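
As an illustration of what nested parallelization looks like in practice, the following is a minimal C sketch, not code from the paper. It assumes a multi-block layout in which an outer parallel loop runs over blocks and an inner parallel loop runs over the cells of each block. The function name `nested_block_sum`, the flat array layout, and the thread-count parameters are all hypothetical. The sketch also assumes an OpenMP 3.0 or later runtime for `omp_set_max_active_levels`; with the OpenMP 2.5 specification, nesting would instead be enabled through `omp_set_nested` or the `OMP_NESTED` environment variable.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical example: sum every cell of a blocks-by-cells array.
 * The outer team works on blocks, and each outer thread opens an
 * inner team that works on the cells of its current block. */
double nested_block_sum(const double *data, int n_blocks, int n_cells,
                        int outer_threads, int inner_threads)
{
    double total = 0.0;

#ifdef _OPENMP
    /* Allow two active levels of parallelism (OpenMP 3.0 and later).
     * Without this, the inner region is typically serialized. */
    omp_set_max_active_levels(2);
#endif

    #pragma omp parallel for num_threads(outer_threads) reduction(+:total)
    for (int b = 0; b < n_blocks; ++b) {
        double block_sum = 0.0;  /* private to the outer thread */

        #pragma omp parallel for num_threads(inner_threads) reduction(+:block_sum)
        for (int c = 0; c < n_cells; ++c) {
            block_sum += data[(size_t)b * (size_t)n_cells + (size_t)c];
        }

        total += block_sum;
    }
    return total;
}
```

If the compiler is run without OpenMP support, the pragmas are ignored and the function computes the same result serially. How many threads to give each level is a tuning decision that depends on the machine and the workload, so the two counts are left as parameters here.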

Published In

International Journal of Parallel Programming  Volume 35, Issue 5
October 2007
67 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 October 2007
Accepted: 26 January 2007
Received: 06 November 2006

Author Tags

  1. OpenMP
  2. ccNUMA
  3. nested parallelization
  4. shared memory parallelization

Qualifiers

  • Article

Cited By

  • (2012) Task-parallel programming on NUMA architectures. Proceedings of the 18th international conference on Parallel Processing, pp. 638-649. DOI: 10.1007/978-3-642-32820-6_63. Online publication date: 27-Aug-2012
  • (2009) Exploiting fine-grain thread parallelism on multicore architectures. Scientific Programming 17:4, pp. 309-323. DOI: 10.1155/2009/249651. Online publication date: 1-Dec-2009
  • (2009) Parallelism and scalability in an image processing application. International Journal of Parallel Programming 37:3, pp. 306-323. DOI: 10.1007/s10766-009-0098-5. Online publication date: 1-Jun-2009
  • (2008) Scheduling dynamic OpenMP applications over multicore architectures. Proceedings of the 4th international conference on OpenMP in a new era of parallelism, pp. 170-180. DOI: 10.5555/1789826.1789845. Online publication date: 12-May-2008
  • (2008) A microbenchmark study of OpenMP overheads under nested parallelism. Proceedings of the 4th international conference on OpenMP in a new era of parallelism, pp. 1-12. DOI: 10.5555/1789826.1789828. Online publication date: 12-May-2008
  • (2008) Data and thread affinity in OpenMP programs. Proceedings of the 2008 workshop on Memory access on future processors: a solved problem?, pp. 377-384. DOI: 10.1145/1366219.1366222. Online publication date: 5-May-2008
