[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/2636228.2636229acmconferencesArticle/Chapter ViewAbstractPublication PagesicfpConference Proceedingsconference-collections
research-article

Group communication patterns for high performance computing in scala

Published: 03 September 2014 Publication History

Abstract

We developed a Functional Object-Oriented Parallel framework (FooPar) for high-level high-performance computing in Scala. Central to this framework are Distributed Memory Parallel Data structures (DPDs), i.e., collections of data distributed in a shared nothing system together with parallel operations on these data.
In this paper, we first present FooPar's architecture and the idea of DPDs and group communications. Then, we show how DPDs can be implemented elegantly and efficiently in Scala based on the Traversable/Builder pattern, unifying Functional and Object-Oriented Programming.
We prove the correctness and safety of one communication algorithm and show how specification testing (via ScalaCheck) can be used to bridge the gap between proof and implementation. Furthermore, we show that the group communication operations of FooPar outperform those of the MPJ Express open source MPI-bindings for Java, both asymptotically and empirically.
FooPar has already been shown to be capable of achieving close-to-optimal performance for dense matrix-matrix multiplication via JNI. In this article, we present results on a parallel implementation of the Floyd-Warshall algorithm in FooPar, achieving more than 94% efficiency compared to the serial version on a cluster using 100 cores for matrices of dimension 38000 x 38000.

References

[1]
Carver. http://nersc.gov/users/computational-systems/carver/.
[2]
E. Allen, D. Chase, J. Hallett, V. Luchangco, J.-W. Maessen, S. Ryu, G. L. Steele Jr., and S. Tobin-Hochstadt. The Fortress Language Specification. Technical report, Sun Microsystems, Inc., 2007.
[3]
D. A. Bader, K. Madduri, J. R. Gilbert, V. Shah, J. Kepner, T. Meuse, and A. Krishnamurthy. Designing scalable synthetic compact applications for benchmarking high productivity computing systems. CTWatch Quarterly, 2(48), November 2006.
[4]
A. Buss, Harshvardhan, I. Papadopoulos, O. Pearce, T. Smith, G. Tanase, N. Thomas, X. Xu, M. Bianco, N. M. Amato, and L. Rauchwerger. Stapl: Standard template adaptive parallel library. In Proceedings of the 3rd Annual Haifa Experimental Systems Conference, SYSTOR '10, pages 14:1--14:10, New York, NY, USA, 2010. ACM. ISBN 978-1-60558-908-4. URL http://doi.acm.org/10.1145/1815695. 1815713.
[5]
B. L. Chamberlain, D. Callahan, and H. P. Zima. Parallel programmability and the chapel language. IJHPCA, 21(3):291--312, 2007.
[6]
N. Chen, R. K. Karmani, A. Shali, B.-Y. Su, and R. Johnson. Collective communication patterns. In ParaPLOP, 2009.
[7]
C. Coarfa, Y. Dotsenko, J. Mellor-Crummey, F. Cantonnet, T. El-Ghazawi, A. Mohanti, Y. Yao, and D. Chavarría-Miranda. An evaluation of global address space languages: Co-Array Fortran and Unified Parallel C. In PPoPP, pages 36--47. ACM, 2005.
[8]
F. Darema. The SPMD model : Past, present and future. In EuroPVM/MPI Conference, LNCS 2131, page 1. Springer, 2001.
[9]
E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra, J. M. Squyres, V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine, R. H. Castain, D. J. Daniel, R. L. Graham, and T. S. Woodall. Open MPI: Goals, concept, and design of a next generation MPI implementation. In European PVM/MPI Users' Group Meeting, pages 97--104, 2004.
[10]
A. Grama, G. Karypis, V. Kumar, and A. Gupta. Introduction to Parallel Computing. Pearson, Addison Wesley, 2003.
[11]
C. V. Hall, K. Hammond, S. L. Peyton Jones, and P. L. Wadler. Type classes in haskell. ACM TOPLAS, 18(2):109--138, 1996.
[12]
F. P. Hargreaves and D. Merkle. FooPar: A Functional Object Oriented Parallel Framework in Scala. In PPAM, number 8385 in LNCS, pages 118--129, 2014. preprint:http://arxiv.org/abs/1304.2550.
[13]
K. Karimi, N. G. Dickson, and F. Hamze. A performance comparison of CUDA and OpenCL. CoRR, abs/1005.2581, 2010.
[14]
V. Kumar and V. Singh. Scalability of parallel algorithms for the allpairs shortest path problem. JPDC, 13(2):124--138, 1991.
[15]
R. Loogen, Y. Ortega-Mallén, and R. Peña. Parallel functional programming in Eden. JFP, 15:431--475, 2005.
[16]
E. Lusk and K. Yelick. Languages for high-productivity computing: the DARPA HPCS language project. PPL, 89(17), 2007.
[17]
J. Milthorpe, V. Ganesh, A. Rendell, and D. Grove. X10 as a parallel language for scientific computation: Practice and experience. Parallel and Distributed Processing Symposium, pages 1080--1088, 2011.
[18]
J. Nickolls, I. Buck, M. Garland, and K. Skadron. Scalable parallel programming with cuda. Queue, 6(2):40--53, 2008.
[19]
M. Odersky. Contracts for Scala. In Runtime Verification (RV), LNCS 6418, pages 51--57. Springer, 2010.
[20]
M. Odersky. The Scala language specification, 2011.
[21]
M. Odersky and A. Moors. Fighting bit rot with types (experience report: Scala collections). In FSTTCS, LIPIcs 4, pages 427--451, 2009.
[22]
A. Prokopec, P. Bagwell, T. Rompf, and M. Odersky. A generic parallel collection framework. In Euro-Par 2011 Parallel Processing, LNCS 6853, pages 136--147. Springer, 2011.
[23]
R. Roestenburg and R. Bakker. Akka in Action. Manning, 2012.
[24]
T. Rompf, A. K. Sujeeth, H. Lee, K. J. Brown, H. Chafi, M. Odersky, and K. Olukotun. Building-blocks for performance oriented dsls. In O. Danvy and C. chieh Shan, editors, DSL, volume 66 of EPTCS, pages 93--117, 2011.
[25]
T. Rompf, A. K. Sujeeth, N. Amin, K. J. Brown, V. Jovanovic, H. Lee, M. Jonnalagedda, K. Olukotun, and M. Odersky. Optimizing data structures in high-level programs: New directions for extensible compilers based on staging. SIGPLAN Not., 48(1):497--510, Jan. 2013. ISSN 0362-1340. . URL http://doi.acm.org/10.1145/2480359. 2429128.
[26]
A. Shafi and J. Manzoor. Towards efficient shared memory communications in MPJ express. In IPDPS, pages 1--7, 2009.
[27]
G. L. Taboada, J. Touriño, and R. Doallo. F-MPJ: Scalable Java Message-passing Communications on Parallel Systems. Journal of Supercomputing, 60(1):117--140, 2012.
[28]
P. Wadler. The essence of functional programming. In Principles of Programming Languages, pages 1--14. ACM, 1992.
[29]
P. Wadler. Why no one uses functional languages. SIGPLAN Not., 33:23--27, Aug. 1998. ISSN 0362--1340. . URL http://doi.acm.org/ 10.1145/286385.286387.
[30]
M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: cluster computing with working sets. In USENIX conference on Hot topics in cloud computing, page 10. USENIX Association, 2010.

Cited By

View all
  • (2016)Topology-aware performance optimization and modeling of adaptive mesh refinement codes for exascaleProceedings of the First Workshop on Optimization of Communication in HPC10.5555/3018058.3018061(17-28)Online publication date: 13-Nov-2016
  • (2016)Topology-Aware Performance Optimization and Modeling of Adaptive Mesh Refinement Codes for Exascale2016 First International Workshop on Communication Optimizations in HPC (COMHPC)10.1109/COMHPC.2016.008(17-28)Online publication date: Nov-2016

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
FHPC '14: Proceedings of the 3rd ACM SIGPLAN workshop on Functional high-performance computing
September 2014
116 pages
ISBN:9781450330404
DOI:10.1145/2636228
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 September 2014

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. design
  2. high-performance-computing
  3. languages
  4. performance
  5. scala

Qualifiers

  • Research-article

Conference

ICFP'14
Sponsor:

Acceptance Rates

FHPC '14 Paper Acceptance Rate 10 of 11 submissions, 91%;
Overall Acceptance Rate 18 of 25 submissions, 72%

Upcoming Conference

ICFP '25
ACM SIGPLAN International Conference on Functional Programming
October 12 - 18, 2025
Singapore , Singapore

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)2
  • Downloads (Last 6 weeks)0
Reflects downloads up to 10 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2016)Topology-aware performance optimization and modeling of adaptive mesh refinement codes for exascaleProceedings of the First Workshop on Optimization of Communication in HPC10.5555/3018058.3018061(17-28)Online publication date: 13-Nov-2016
  • (2016)Topology-Aware Performance Optimization and Modeling of Adaptive Mesh Refinement Codes for Exascale2016 First International Workshop on Communication Optimizations in HPC (COMHPC)10.1109/COMHPC.2016.008(17-28)Online publication date: Nov-2016

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media