More Web Proxy on the site http://driver.im/

research-article

Group communication patterns for high performance computing in scala

Authors:

Felix P. Hargreaves,

Peter Schneider-KampAuthors Info & Claims

FHPC '14: Proceedings of the 3rd ACM SIGPLAN workshop on Functional high-performance computing

Pages 75 - 85

https://doi.org/10.1145/2636228.2636229

Published: 03 September 2014 Publication History

Abstract

We developed a Functional Object-Oriented Parallel framework (FooPar) for high-level high-performance computing in Scala. Central to this framework are Distributed Memory Parallel Data structures (DPDs), i.e., collections of data distributed in a shared nothing system together with parallel operations on these data.

In this paper, we first present FooPar's architecture and the idea of DPDs and group communications. Then, we show how DPDs can be implemented elegantly and efficiently in Scala based on the Traversable/Builder pattern, unifying Functional and Object-Oriented Programming.

We prove the correctness and safety of one communication algorithm and show how specification testing (via ScalaCheck) can be used to bridge the gap between proof and implementation. Furthermore, we show that the group communication operations of FooPar outperform those of the MPJ Express open source MPI-bindings for Java, both asymptotically and empirically.

FooPar has already been shown to be capable of achieving close-to-optimal performance for dense matrix-matrix multiplication via JNI. In this article, we present results on a parallel implementation of the Floyd-Warshall algorithm in FooPar, achieving more than 94% efficiency compared to the serial version on a cluster using 100 cores for matrices of dimension 38000 x 38000.

References

[1]

Carver. http://nersc.gov/users/computational-systems/carver/.

[2]

E. Allen, D. Chase, J. Hallett, V. Luchangco, J.-W. Maessen, S. Ryu, G. L. Steele Jr., and S. Tobin-Hochstadt. The Fortress Language Specification. Technical report, Sun Microsystems, Inc., 2007.

[3]

D. A. Bader, K. Madduri, J. R. Gilbert, V. Shah, J. Kepner, T. Meuse, and A. Krishnamurthy. Designing scalable synthetic compact applications for benchmarking high productivity computing systems. CTWatch Quarterly, 2(48), November 2006.

[4]

A. Buss, Harshvardhan, I. Papadopoulos, O. Pearce, T. Smith, G. Tanase, N. Thomas, X. Xu, M. Bianco, N. M. Amato, and L. Rauchwerger. Stapl: Standard template adaptive parallel library. In Proceedings of the 3rd Annual Haifa Experimental Systems Conference, SYSTOR '10, pages 14:1--14:10, New York, NY, USA, 2010. ACM. ISBN 978-1-60558-908-4. URL http://doi.acm.org/10.1145/1815695. 1815713.

Digital Library

[5]

B. L. Chamberlain, D. Callahan, and H. P. Zima. Parallel programmability and the chapel language. IJHPCA, 21(3):291--312, 2007.

Digital Library

[6]

N. Chen, R. K. Karmani, A. Shali, B.-Y. Su, and R. Johnson. Collective communication patterns. In ParaPLOP, 2009.

[7]

C. Coarfa, Y. Dotsenko, J. Mellor-Crummey, F. Cantonnet, T. El-Ghazawi, A. Mohanti, Y. Yao, and D. Chavarría-Miranda. An evaluation of global address space languages: Co-Array Fortran and Unified Parallel C. In PPoPP, pages 36--47. ACM, 2005.

Digital Library

[8]

F. Darema. The SPMD model : Past, present and future. In EuroPVM/MPI Conference, LNCS 2131, page 1. Springer, 2001.

Digital Library

[9]

E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra, J. M. Squyres, V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine, R. H. Castain, D. J. Daniel, R. L. Graham, and T. S. Woodall. Open MPI: Goals, concept, and design of a next generation MPI implementation. In European PVM/MPI Users' Group Meeting, pages 97--104, 2004.

[10]

A. Grama, G. Karypis, V. Kumar, and A. Gupta. Introduction to Parallel Computing. Pearson, Addison Wesley, 2003.

[11]

C. V. Hall, K. Hammond, S. L. Peyton Jones, and P. L. Wadler. Type classes in haskell. ACM TOPLAS, 18(2):109--138, 1996.

Digital Library

[12]

F. P. Hargreaves and D. Merkle. FooPar: A Functional Object Oriented Parallel Framework in Scala. In PPAM, number 8385 in LNCS, pages 118--129, 2014. preprint:http://arxiv.org/abs/1304.2550.

[13]

K. Karimi, N. G. Dickson, and F. Hamze. A performance comparison of CUDA and OpenCL. CoRR, abs/1005.2581, 2010.

[14]

V. Kumar and V. Singh. Scalability of parallel algorithms for the allpairs shortest path problem. JPDC, 13(2):124--138, 1991.

Digital Library

[15]

R. Loogen, Y. Ortega-Mallén, and R. Peña. Parallel functional programming in Eden. JFP, 15:431--475, 2005.

Digital Library

[16]

E. Lusk and K. Yelick. Languages for high-productivity computing: the DARPA HPCS language project. PPL, 89(17), 2007.

[17]

J. Milthorpe, V. Ganesh, A. Rendell, and D. Grove. X10 as a parallel language for scientific computation: Practice and experience. Parallel and Distributed Processing Symposium, pages 1080--1088, 2011.

Digital Library

[18]

J. Nickolls, I. Buck, M. Garland, and K. Skadron. Scalable parallel programming with cuda. Queue, 6(2):40--53, 2008.

Digital Library

[19]

M. Odersky. Contracts for Scala. In Runtime Verification (RV), LNCS 6418, pages 51--57. Springer, 2010.

Digital Library

[20]

M. Odersky. The Scala language specification, 2011.

[21]

M. Odersky and A. Moors. Fighting bit rot with types (experience report: Scala collections). In FSTTCS, LIPIcs 4, pages 427--451, 2009.

[22]

A. Prokopec, P. Bagwell, T. Rompf, and M. Odersky. A generic parallel collection framework. In Euro-Par 2011 Parallel Processing, LNCS 6853, pages 136--147. Springer, 2011.

Digital Library

[23]

R. Roestenburg and R. Bakker. Akka in Action. Manning, 2012.

[24]

T. Rompf, A. K. Sujeeth, H. Lee, K. J. Brown, H. Chafi, M. Odersky, and K. Olukotun. Building-blocks for performance oriented dsls. In O. Danvy and C. chieh Shan, editors, DSL, volume 66 of EPTCS, pages 93--117, 2011.

[25]

T. Rompf, A. K. Sujeeth, N. Amin, K. J. Brown, V. Jovanovic, H. Lee, M. Jonnalagedda, K. Olukotun, and M. Odersky. Optimizing data structures in high-level programs: New directions for extensible compilers based on staging. SIGPLAN Not., 48(1):497--510, Jan. 2013. ISSN 0362-1340. . URL http://doi.acm.org/10.1145/2480359. 2429128.

Digital Library

[26]

A. Shafi and J. Manzoor. Towards efficient shared memory communications in MPJ express. In IPDPS, pages 1--7, 2009.

Digital Library

[27]

G. L. Taboada, J. Touriño, and R. Doallo. F-MPJ: Scalable Java Message-passing Communications on Parallel Systems. Journal of Supercomputing, 60(1):117--140, 2012.

Digital Library

[28]

P. Wadler. The essence of functional programming. In Principles of Programming Languages, pages 1--14. ACM, 1992.

Digital Library

[29]

P. Wadler. Why no one uses functional languages. SIGPLAN Not., 33:23--27, Aug. 1998. ISSN 0362--1340. . URL http://doi.acm.org/ 10.1145/286385.286387.

Digital Library

[30]

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: cluster computing with working sets. In USENIX conference on Hot topics in cloud computing, page 10. USENIX Association, 2010.

Digital Library

Cited By

Chan CBachan JKenny JWilke JBeckner VAlmgren ABell J(2016)Topology-aware performance optimization and modeling of adaptive mesh refinement codes for exascaleProceedings of the First Workshop on Optimization of Communication in HPC10.5555/3018058.3018061(17-28)Online publication date: 13-Nov-2016
https://dl.acm.org/doi/10.5555/3018058.3018061
Chan CBachan JKenny JWilke JBeckner VAlmgren ABell J(2016)Topology-Aware Performance Optimization and Modeling of Adaptive Mesh Refinement Codes for Exascale2016 First International Workshop on Communication Optimizations in HPC (COMHPC)10.1109/COMHPC.2016.008(17-28)Online publication date: Nov-2016
https://doi.org/10.1109/COMHPC.2016.008

Index Terms

Group communication patterns for high performance computing in scala
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language types
        Functional languages
        Parallel programming languages

Recommendations

High performance computing for vision on distributed-memory machines
A high-performance MPI implementation on a shared-memory vector supercomputer
Interpreting Parallel Processor Performance Measurements

This paper discusses execution time versus number of simultaneous operations in parallel computing systems. The main focus is on shared memory multiprocessors. A model for execution time as a function of the number of processes used in a computation is ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

FHPC '14: Proceedings of the 3rd ACM SIGPLAN workshop on Functional high-performance computing

September 2014

116 pages

ISBN:9781450330404

DOI:10.1145/2636228

General Chair:
Jost Berthold
University of Copenhagen
,
Program Chairs:
Mary Sheeran
Chalmers University of Technology
,
Ryan Newton
Indiana University

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 September 2014

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

ICFP'14

Sponsor:

SIGPLAN

ICFP'14: ACM SIGPLAN International Conference on Functional Programming

September 4, 2014

Gothenburg, Sweden

Acceptance Rates

FHPC '14 Paper Acceptance Rate 10 of 11 submissions, 91%;

Overall Acceptance Rate 18 of 25 submissions, 72%

Upcoming Conference

ICFP '25

Sponsor:
sigplan

ACM SIGPLAN International Conference on Functional Programming

October 12 - 18, 2025

Singapore , Singapore

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

2
Total Citations
View Citations
101
Total Downloads

Downloads (Last 12 months)2
Downloads (Last 6 weeks)0

Reflects downloads up to 10 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Chan CBachan JKenny JWilke JBeckner VAlmgren ABell J(2016)Topology-aware performance optimization and modeling of adaptive mesh refinement codes for exascaleProceedings of the First Workshop on Optimization of Communication in HPC10.5555/3018058.3018061(17-28)Online publication date: 13-Nov-2016
https://dl.acm.org/doi/10.5555/3018058.3018061
Chan CBachan JKenny JWilke JBeckner VAlmgren ABell J(2016)Topology-Aware Performance Optimization and Modeling of Adaptive Mesh Refinement Codes for Exascale2016 First International Workshop on Communication Optimizations in HPC (COMHPC)10.1109/COMHPC.2016.008(17-28)Online publication date: Nov-2016
https://doi.org/10.1109/COMHPC.2016.008

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents