[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1481839.1481843acmconferencesArticle/Chapter ViewAbstractPublication PagespoplConference Proceedingsconference-collections
research-article

Low-pain, high-gain multicore programming in Haskell: coordinating irregular symbolic computations on multicore architectures

Published: 20 January 2009 Publication History

Abstract

With the emergence of commodity multicore architectures, exploiting tightly-coupled parallelism has become increasingly important. Functional programming languages, such as Haskell, are, in principle, well placed to take advantage of this trend, offering the ability to easily identify large amounts of fine-grained parallelism. Unfortunately, obtaining real performance benefits has often proved hard to realise in practice.
This paper reports on a new approach using middleware that has been constructed using the Eden parallel dialect of Haskell. Our approach is ``low pain'' in the sense that the programmer constructs a parallel program by inserting a small number of higher-order algorithmic skeletons at key points in the program. It is ``high gain'' in the sense that we are able to get good parallel speedups. Our approach is unusual in that we do not attempt to use shared memory directly, but rather coordinate parallel computations using a message-passing implementation. This approach has a number of advantages. Firstly, coordination, i.e. locking and communication, is both confined to limited shared memory areas, essentially the communication buffers, and is also isolated within well-understood libraries. Secondly, the coarse thread granularity that we obtain reduces coordination overheads, so locks are normally needed only on (relatively large) messages, and not on individual data items, as is often the case for simple shared-memory implementations. Finally, cache coherency requirements are reduced since individual tasks do not share caches, and can garbage collect independently.
We report results for two representative computational algebra problems. Computational algebra is a challenging application area that has not been widely studied in the general parallelism community. Computational algebra applications have high computational demands, and are, in principle, often suitable for parallel execution, but usually display a high degree of irregularity in terms of both task and data structure. This makes it difficult to construct parallel applications that perform well in practice. Using our system, we are able to obtain both extremely good processor utilisation (97%) and very good absolute speedups (up to 7.7) on an eight-core machine.

References

[1]
A-R Adl-Tabatabai, B.T. Lewis, V. Menon, B.R. Murphy, B. Saha, and T. Shpeisman. Compiler and Runtime Support for Efficient Software Transactional Memory. In PLDI '06: Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation, pages 26--37, New York, NY, USA, 2006. ACM.
[2]
A. Al Zain, P. Trinder, K. Hammond, and S. Linton. Orchestrating Computational Algebra Components into a High-Performance Parallel System. In Proc. 2008 IEEE International Symposium on Parallel and Distributed Computing with Applications (ISPA '08), 2008.
[3]
A. Al Zain, P. Trinder, H-W. Loidl, and G. Michaelson. Evaluating a High-Level Parallel Language (GpH) for Computational GRIDs. IEEE Transactions on Parallel and Distributed Systems, 19(2):219--233, 2008.
[4]
B. Amrheim, O. Gloor, and W. Küchlin. A Case Study of Multithreaded Gröbner Basis Completion. In Proc. ISSAC '96: International Symposium on Symbolic and Algebraic Computation, pages 95--102. ACM Press, 1996.
[5]
C.K. Anand and W. Kahl. A Domain-Specific Language for the Generation of Optimized SIMD-Parallel Assembly Code. SQRL Report 43, Software Quality Research Laboratory, McMaster University, May 2007. available from http://sqrl.mcmaster.ca/sqrl_reports.html.
[6]
J. Armstrong, R. Virding, C. Wikström, and M. Williams. Concurrent Programming in Erlang. Prentice Hall, 2nd edition, 1996.
[7]
L. Bernardin. Maple on a Massively Parallel, Distributed Memory Machine. In Proc. PASCO '97: Intl. Symp. on Parallel Symbolic Computation, pages 217--222. ACM Press, 1997.
[8]
J. Berthold, M. Dieterle, R. Loogen, and S. Priebe. Hierarchical Master-Worker Skeletons. In Paul Hudak and David Scott Warren, editors, PADL, volume 4902 of Lecture Notes in Computer Science, pages 248--264. Springer, 2008.
[9]
J. Berthold, S. Marlow, A. Al Zain, and K. Hammond. Comparing and Optimising Parallel Haskell Implementation. In Sven-Bodo Scholz, editor, IFL'08 -- Implementation and Application of Functional Languages 20th International Symposium, Draft Proceedings, pages 223--240, Hetfield, Hertfordshire, UK, September 2008. Technical Report No. 474.
[10]
S. Breitinger, R. Loogen, Y. Ortega Mallén, and R. Peña Marí. Eden -- The Paradise of Functional Concurrent Programming. In EuroPar'96 -- European Conf. on Parallel Processing, LNCS 1123, pages 710--713, Lyon, France, 1996. Springer.
[11]
M. Bridges, N. Vachharajani, Y. Zhang, T. Jablin, and D. August. Revisiting the Sequential Programming Model for Multi-Core. In MICRO '07: Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, pages 69--84, Washington, DC, USA, 2007. IEEE Computer Society.
[12]
Broadcom Corp. BCM1250 Multiprocessor. Technical report, Broadcom Corporation, April 2002.
[13]
R. Bündgen, M. Göbel, and W. Küchlin. Multi-Threaded AC Term Re-writing. In Proc. PASCO'94: Intl. Symp. on Parallel Symbolic Computation, volume 5, pages 84--93. World Scientific, 1994.
[14]
B. Carlstrom, A. McDonald, H. Chafi, J. Chung, C. Cao Minh, C. Kozyrakis, and K. Olukotun. The atomos transactional programming language. In ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation. Jun 2006.
[15]
M.M.T. Chakravarty, R. Leshchinskiy, S.L. Peyton Jones, G. Keller, and S. Marlow. Data Parallel Haskell: a Status Report. In DAMP'07: Workshop on Declarative Aspects of Multicore Programming), Nice, France, 2007. ACM Press.
[16]
B. Chapman, G. Jost, and R. van der Pas. Using OpenMP: Portable Shared Memory Parallel Programming (Scientific and Engineering Computation). The MIT Press, 2007.
[17]
M.I. Cole. Algorithmic Skeletons: Structured Management of Parallel Computation. The MIT Press, Cambridge, MA, 1989.
[18]
M.I. Cole. Algorithmic Skeletons. In K. Hammond and G. Michaelson, editors, Research Directions in Parallel Functional Programming, chapter 13, pages 289--304. Springer-Verlag, 1999.
[19]
G. Cooperman. STAR/MPI: Binding a Parallel Library to Interactive Symbolic Algebra Systems. In Proc. ISSAC '95: International Symposium on Symbolic and Algebraic Computation, volume 249 of Lecture Notes in Control and Information Sciences, pages 126--132. ACM Press, 1995.
[20]
G. Cooperman. GAP/MPI: Facilitating Parallelism. In Proc. DIMACS Workshop on Groups and Computation II, volume 28 of DIMACS Series in Discrete Maths. and Theoretical Comp. Sci., pages 69--84. AMS, 1997.
[21]
J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Commun. ACM, 51(1):107--113, 2008.
[22]
A. Duran, M. Gonzàlez, and J. Corbalán. Automatic Thread Distribution for Nested Parallelism in OpenMP. In 19th Annual International Conference on Supercomputing (ICS '05), pages 121--130, New York, NY, USA, 2005. ACM.
[23]
Ericsson Utvecklings AB. Erlang Home Page.
[24]
M. Fluet, M. Rainey, J. Reppy, A. Shaw, and Y. Xiao. Manticore: A Heterogeneous Parallel Language. In DAMP'07: Workshop on Declarative Aspects of Multicore Programming, Nice, France, 2007.
[25]
M. Frigo, C.E. Leiserson, and K.H. Randall. The Implementation of the Cilk-5 Multithreaded Language. In PLDI'98 -- Conf. on Programming Language, Design and Implementation, volume 33 of ACM SIGPLAN Notices, pages 212--223. ACM Press, 1998.
[26]
GHC. http://www.haskell.org/ghc/.
[27]
M.I. Gordon, W. Thies, M. Karczmarek, J. Lin, A.S. Meli, A.A. Lamb, C. Leger, J. Wong, H. Hoffmann, D. Maze, and S. Amarasinghe. A Stream Compiler for Communication-Exposed Architectures. In ASPLOS-X: 10th international Conference on Architectural Support for Programming Languages and Operating Systems, pages 291--303, New York, NY, USA, 2002. ACM.
[28]
The GHC-Maple Interface, http://www.risc .uni-linz.ac.at/software/ghc-maple/.
[29]
D. Grossman, J. Manson, and W. Pugh. What do High-Level Memory Models mean for Transactions? In MSPC '06: 2006 workshop on Memory System Performance and Correctness, pages 62--69, New York, NY, USA, 2006. ACM.
[30]
The GAP Group. GAP -- Groups, Algorithms, and Programming, 2007. http://www.gap-system.org.
[31]
K. Hammond and G. Michaelson. Research Directions in Parallel Functional Programming, chapter Introduction. Springer-Verlag, 1999.
[32]
T. Harris and K. Fraser. Language Support for Lightweight Transactions. SIGPLAN Not., 38(11):388--402, 2003.
[33]
T. Harris, S. Marlow, and S. Peyton Jones. Composable Memory Transactions. In PPoPP 2005: Principles and Practice of Parallel Programming.
[34]
T. Harris, S. Marlow, and S. Peyton Jones. Haskell on a Shared-Memory Multiprocessor. In Proc. Haskell '05: 2005 ACM SIGPLAN Workshop on Haskell, pages 49--61. ACM Press, September 2005.
[35]
T. Harris and S. Singh. Feedback Directed Implicit Parallelism. In ICFP '07: 2007 ACM SIGPLAN International Conference on Functional Programming, pages 251--264, New York, NY, USA, 2007. ACM.
[36]
R. Lämmel. Google's MapReduce Programming Model -- Revisited. Sci. Comput. Program, 70(1):1--30, 2008.
[37]
B. Lewis and D.J. Berg. Multithreaded Programming with Pthreads. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1998.
[38]
R. Loogen, Y. Ortega-Mallén, and R. Peña-Marí. Parallel Functional Programming in Eden. Journal of Functional Programming, 15(3):431--475, 2005.
[39]
R. Martínez and R. Pena. Building an Interface Between Eden and Maple. In Proc. IFL 2003, Springer-Verlag LNCS 3145, pages 135--151, 2004.
[40]
G. O. Michler. High Performance Computations in Group Representation Theory. Preprint, Institut für Experimentelle Mathematik, Univerisität GH Essen, 1998.
[41]
The OpenMath Standard, Version 2.0, http://www.openmath.org/.
[42]
W. Partain. The Benchmark Suite of Haskell Programs. In Proc. 1992 Glasgow Workshop on Functional Programming, pages 195--202, London, UK, 1993. Springer-Verlag.
[43]
S.L. Peyton Jones, C. Clack, J. Salkild, and M. Hardie. GRIP -- a High-Performance Architecture for Parallel Graph Reduction. In Intl. Conf. on Functional Programming Languages and Computer Architecture (FPCA '87), LNCS 274, pages 98--112, Portland, Oregon, September 1987. Springer-Verlag.
[44]
S.L. Peyton Jones, C.V. Hall, K. Hammond, W.D. Partain, and P.L. Wadler. The Glasgow Haskell Compiler: a Technical Overview. In Proc. JFIT (Joint Framework for Information Technology) Technical Conference, pages 249--257, Keele, UK, March 1993.
[45]
C Ranger, R. Raghuraman, A. Penmetsa, G.R. Bradski, and C. Kozyrakis. Evaluating MapReduce for Multi-core and Multiprocessor Systems. In HPCA '07: 2007 IEEE 13th International Symposium on High Performance Computer Architecture, pages 13--24. IEEE Computer Society, 2007.
[46]
L. Roch and G. Villard. Parallel computer algebra. In Proc. ISSAC '97: International Symposium on Symbolic and Algebraic Computation. Preprint IMAG Grenoble France, 1997.
[47]
H. Sutter and J. Larus. Software and the concurrency revolution. Queue, 3(7):54--62, 2005.
[48]
Tian Tian and Chiu-Pi Shih. Software Techniques for Shared-Cache Multi-Core Systems, 2007. Online article in Intel developer community.
[49]
P. Trinder, K. Hammond, J.S. Mattson Jr., A.S Partridge, and S.L. Peyton Jones. GUM: a Portable Parallel Implementation of Haskell. In Proc. PLDI'96, pages 79--88, Philadelphia, PA, USA, May 1996.
[50]
P.W. Trinder, K. Hammond, H.-W. Loidl, and S.L. Peyton Jones. Algorithm Strategy = Parallelism. J. Functional Programming, 8(1):23--60, January 1998.
[51]
C. Zilles and D. Flint. Challenges to Providing Performance Isolation in Transactional Memories. In Workshop on Duplicating, Deconstructing, and Debunking. 2005.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
DAMP '09: Proceedings of the 4th workshop on Declarative aspects of multicore programming
January 2009
76 pages
ISBN:9781605584171
DOI:10.1145/1481839
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 January 2009

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. algorithmic skeletons
  2. eden
  3. haskell
  4. multicore parallelism

Qualifiers

  • Research-article

Conference

POPL09
Sponsor:

Upcoming Conference

POPL '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)3
  • Downloads (Last 6 weeks)3
Reflects downloads up to 18 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2018)The Peter Landin prizeHigher-Order and Symbolic Computation10.1007/s10990-010-9055-722:4(305-312)Online publication date: 14-Dec-2018
  • (2014)pHood: Tool Description, Analysis Techniques, and Case StudiesNew Generation Computing10.1007/s00354-014-0103-432:1(59-91)Online publication date: 18-Feb-2014
  • (2013)Reliable scalable symbolic computationProceedings of the 28th Annual ACM Symposium on Applied Computing10.1145/2480362.2480677(1674-1681)Online publication date: 18-Mar-2013
  • (2012)Supervised Workpools for Reliable Massively Parallel ComputingProceedings of the 2012 Conference on Trends in Functional Programming - Volume 782910.1007/978-3-642-40447-4_16(247-262)Online publication date: 12-Jun-2012

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media