Automatic Scaling of OpenMP Beyond Shared Memory

Okwan Kwon, Fahed Jubair, Seung-Jai Min, Hansang Bae, Rudolf Eigenmann & Samuel P. Midkiff

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7146)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

OpenMP is an explicit parallel programming model that offers reasonable productivity. Its memory model assumes a shared address space, and hence the direct translation, as done by common OpenMP compilers, requires an underlying shared-memory architecture. Many lab machines include tens of processors built from commodity components and thus have distributed address spaces. Despite many efforts to provide higher productivity for these platforms, the most common programming model uses message passing, which is substantially more tedious to program than shared-address-space models. This paper presents a compiler/runtime system that translates OpenMP programs into message passing variants and executes them on clusters of up to 64 processors. We build on previous work that provided a proof of concept of such translation. The present paper describes compiler algorithms and runtime techniques that provide the automatic translation of a first class of OpenMP applications: those that exhibit regular write array subscripts and repetitive communication. We evaluate the translator on representative benchmarks of this class and compare their performance against hand-written MPI variants. In all but one case, our translated versions perform close to the hand-written variants.
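
To make the idea of translating a loop with regular write subscripts more concrete, the following is a minimal sketch and not the paper's actual algorithm. It simulates, in plain Python so that it runs without an OpenMP or MPI installation, how a shared-memory loop `a[i] = 2.0 * b[i]` could be split across ranks that each own a private copy of `a`, followed by an exchange of the written blocks. The block partitioning scheme, the all-gather style exchange, and the names `block_bounds`, `run_shared`, and `run_message_passing` are all assumptions made for illustration.

```python
from typing import List, Tuple


def block_bounds(n: int, nprocs: int, rank: int) -> Tuple[int, int]:
    """Half-open iteration range [lo, hi) owned by `rank` (hypothetical helper).

    Assumes a simple block partitioning; the first n % nprocs ranks get one
    extra iteration.
    """
    base, rem = divmod(n, nprocs)
    lo = rank * base + min(rank, rem)
    hi = lo + base + (1 if rank < rem else 0)
    return lo, hi


def run_shared(b: List[float]) -> List[float]:
    """Reference result of the shared-memory loop being modelled:

        #pragma omp parallel for
        for (i = 0; i < n; i++) a[i] = 2.0 * b[i];
    """
    return [2.0 * x for x in b]


def run_message_passing(b: List[float], nprocs: int) -> List[float]:
    """Simulate a message passing variant of the same loop.

    Each rank has its own copy of `a` (a distributed address space), writes
    only the block it owns, and then receives the blocks written by others.
    """
    n = len(b)
    private = [[0.0] * n for _ in range(nprocs)]  # one copy of `a` per rank

    # Compute phase: the write subscript is simply `i`, so the region each
    # rank writes is exactly its iteration range.
    for rank in range(nprocs):
        lo, hi = block_bounds(n, nprocs, rank)
        for i in range(lo, hi):
            private[rank][i] = 2.0 * b[i]

    # Communication phase: stands in for an all-gather among the ranks.
    for src in range(nprocs):
        lo, hi = block_bounds(n, nprocs, src)
        for dst in range(nprocs):
            if dst != src:
                private[dst][lo:hi] = private[src][lo:hi]

    # After the exchange every rank should hold the same, complete array.
    assert all(copy == private[0] for copy in private)
    return private[0]
```

Calling `run_message_passing(b, 4)` on any list `b` should return the same values as `run_shared(b)`. The point of the sketch is only that a regular write subscript lets the written region be derived from the loop bounds, which is what makes the communication pattern predictable.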

This work was supported, in part, by the National Science Foundation under grants No. 0751153-CNS, 0707931-CNS, 0833115-CCF, and 0916817-CCF.

Author information

Authors and Affiliations

Purdue University, USA
Okwan Kwon, Fahed Jubair, Hansang Bae, Rudolf Eigenmann & Samuel P. Midkiff
Lawrence Berkeley National Laboratory, USA
Seung-Jai Min

Editor information

Editors and Affiliations

Computer Science Department, Colorado State University, 80523-1873, Fort Collins, CO, USA
Sanjay Rajopadhye & Michelle Mills Strout

About this paper

Cite this paper

Kwon, O., Jubair, F., Min, SJ., Bae, H., Eigenmann, R., Midkiff, S.P. (2013). Automatic Scaling of OpenMP Beyond Shared Memory. In: Rajopadhye, S., Mills Strout, M. (eds) Languages and Compilers for Parallel Computing. LCPC 2011. Lecture Notes in Computer Science, vol 7146. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36036-7_1

DOI: https://doi.org/10.1007/978-3-642-36036-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36035-0
Online ISBN: 978-3-642-36036-7
eBook Packages: Computer Science, Computer Science (R0)
