DOI: 10.1145/2636228.2636234
research-article

Lazy data-oriented evaluation strategies

Published: 03 September 2014

Abstract

This paper presents a number of flexible parallelism control mechanisms in the form of evaluation strategies for tree-like data structures, implemented in Glasgow parallel Haskell. We achieve additional flexibility by using laziness and circular programs in the coordination code. Heuristics-based parameter selection auto-tunes these strategies for improved performance on a shared-memory machine, without programmer-specified parameters. In particular, for unbalanced trees we demonstrate improved performance on a state-of-the-art multi-core server: a speedup of up to 37.5 on 48 cores for a constructed test program, and up to 15 for two other non-trivial applications that use these strategies, namely a Barnes-Hut implementation of the n-body problem and a sparse matrix multiplication implementation.
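To make the idea of a parallelism-controlling strategy over a tree concrete, here is a minimal sketch of depth-bounded parallel evaluation of a quad-tree. It is an illustration only and not the paper's strategies: the names `QTree`, `forceTree`, `parForceTree`, `parMapTree`, and `sumTree` are hypothetical, the depth bound is passed in by hand rather than chosen by heuristics, and the code uses only `par` and `pseq` from `GHC.Conc` in `base` instead of the `Control.Parallel.Strategies` library.

```haskell
import GHC.Conc (par, pseq)

-- Hypothetical quad-tree: leaves carry a payload, inner nodes have four children.
data QTree a
  = Leaf a
  | Node (QTree a) (QTree a) (QTree a) (QTree a)

-- Lazy map: builds the result tree without forcing any leaf.
mapTree :: (a -> b) -> QTree a -> QTree b
mapTree f (Leaf x)       = Leaf (f x)
mapTree f (Node a b c d) = Node (mapTree f a) (mapTree f b) (mapTree f c) (mapTree f d)

-- Sequentially force every leaf to weak head normal form.
forceTree :: QTree a -> ()
forceTree (Leaf x)       = x `seq` ()
forceTree (Node a b c d) =
  forceTree a `seq` forceTree b `seq` forceTree c `seq` forceTree d

-- Depth-bounded parallel forcing: spark three children, evaluate the fourth
-- in the current thread, and stop creating sparks once the bound reaches 0.
parForceTree :: Int -> QTree a -> ()
parForceTree depth t
  | depth <= 0 = forceTree t
parForceTree _ (Leaf x) = x `seq` ()
parForceTree depth (Node a b c d) =
  fa `par` (fb `par` (fc `par` (fd `pseq` (fa `seq` fb `seq` fc `seq` ()))))
  where
    go = parForceTree (depth - 1)
    fa = go a
    fb = go b
    fc = go c
    fd = go d

-- Coordination is kept separate from the computation: the lazy map describes
-- what to compute, parForceTree describes how much parallelism to create.
parMapTree :: Int -> (a -> b) -> QTree a -> QTree b
parMapTree depth f t = parForceTree depth t' `pseq` t'
  where
    t' = mapTree f t

-- Sum all leaves (used only to observe the result).
sumTree :: QTree Int -> Int
sumTree (Leaf x)       = x
sumTree (Node a b c d) = sumTree a + sumTree b + sumTree c + sumTree d
```

A fixed depth bound like this works reasonably for balanced trees but can leave cores idle on unbalanced ones, which is the case the abstract singles out. Sparks only run in parallel when the program is built with GHC's `-threaded` flag and run with `+RTS -N`; without them the code still produces the same result sequentially.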



Published In

FHPC '14: Proceedings of the 3rd ACM SIGPLAN workshop on Functional high-performance computing
September 2014
116 pages
ISBN:9781450330404
DOI:10.1145/2636228

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. dynamic parallelism control
  2. parallel Haskell
  3. quad-trees

Qualifiers

  • Research-article

Conference

ICFP'14

Acceptance Rates

FHPC '14 Paper Acceptance Rate 10 of 11 submissions, 91%;
Overall Acceptance Rate 18 of 25 submissions, 72%
