[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1810479.1810517acmconferencesArticle/Chapter ViewAbstractPublication PagesspaaConference Proceedingsconference-collections
short-paper

Brief announcement: serial-parallel reciprocity in dynamic multithreaded languages

Published: 13 June 2010 Publication History

Abstract

In dynamically multithreaded platforms that employ work stealing, there appears to be a fundamental tradeoff between providing provably good time and space bounds and supporting SP-reciprocity, the property of allowing arbitrary calling between parallel and serial code, including legacy serial binaries. Many known dynamically multithreaded platforms either fail to support SP-reciprocity or sacrifice on the provable time and space bounds that an efficient work-stealing scheduler could otherwise guarantee.
We describe PR-Cilk, a design of a runtime system that supports SP-reciprocity in PR-Cilk and provides provable bounds on time and space. In order to maintain the space bound, PR-Cilk uses subtree-restricted work stealing. We show that with subtree-restricted work stealing, PR-Cilk provides the same guarantee on stack space usage as ordinary Cilk. The completion time guaranteed by PR-Cilk is slightly worse than ordinary Cilk. Nevertheless, if the number of times a C function calls a Cilk function is small, or if each Cilk function called by a C function is sufficiently parallel, PR-Cilk still guarantees linear speedup.

References

[1]
Kunal Agrawal, Charles E. Leiserson, and Jim Sukha. Helper locks for fork-join parallel programming. In PPoPP '10, January 2010.
[2]
Eric Allen, David Chase, Joe Hallett, Victor Luchangco, Jan-Willem Maessen, Sukyoung Ryu, Guy L. Steele Jr., and Sam Tobin-Hochstadt The Fortress Language Specification Version 1.0. Sun Microsystems, Inc., March 2008.
[3]
Nimar S. Arora, Robert D. Blumofe, and C. Greg Plaxton Thread scheduling for multiprogrammed multiprocessors. In SPAA '98, pages 119--129, June 1998.
[4]
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. Cilk: An efficient multithreaded runtime system. Journal of Parallel and Distributed Computing, 37(1):55--69, August 1996.
[5]
Robert D. Blumofe and Charles E. Leiserson. Scheduling multithreaded computations by work stealing. Journal of the ACM, 46(5):720--748, September 1999.
[6]
Robert D. Blumofe and Dionisios Papadopoulos. Hood: A user-level threads library for multiprogrammed multiprocessors. Technical Report, University of Texas at Austin, 1999.
[7]
F. Warren Burton and M. Ronan Sleep. Executing functional programs on a virtual tree of processors. In FPCA '81, pages 187--194, October 1981.
[8]
Philippe Charles, Christian Grothoff, Vijay Saraswat, Christopher Donawa, Allan Kielstra, Kemal Ebcioglu, Christoph von Praun, and Vivek Sarkar. X10: An object-oriented approach to non-uniform cluster computing. In OOPSLA '05, pages 519--538. ACM, 2005.
[9]
Vincent W. Freeh, David K. Lowenthal, and Gregory R. Andrews. Distributed Filaments: Efficient fine-grain parallelism on a cluster of workstations. In OSDI '94, pages 201--213, November 1994.
[10]
Matteo Frigo, Charles E. Leiserson, and Keith H. Randall. The implementation of the Cilk-5 multithreaded language. In PLDI '98, pages 212--223, 1998.
[11]
Robert H. Halstead, Jr. Multilisp: A language for concurrent symbolic computation. ACM TOPLAS, 7(4):501--538, October 1985.
[12]
E. A. Hauck and B. A. Dent. Burroughs' B6500/B7500 stack mechanism. Proceedings of the AFIPS Spring Joint Computer Conference, pages 245--251, 1968.
[13]
Brian W. Kernighan and Dennis M. Ritchie. The C Programming Language. Prentice Hall, Inc., second edition, 1988.
[14]
David A. Kranz, Robert H. Halstead, Jr., and Eric Mohr. Mul-T: A high-performance parallel Lisp. In PLDI '89, pages 81--90, June 1989.
[15]
Doug Lea. A Java fork/join framework. In Java Grande Conference, pages 36--43, 2000.
[16]
I-Ting Angelina Lee, Silas Boyd-Wickizer, Zhiyi Huang, and Charles E. Leiserson. Using thread-local memory mapping to support cactus stacks in work-stealing runtime systems. Submitted for publication.
[17]
Daan Leijen, Wolfram Schulte, and Sebastian Burckhardt. The design of a task parallel library. In OOPSLA '09, pages 227--242, 2009.
[18]
Charles E. Leiserson. The Cilk++ concurrency platform. In 46th Design Automation Conference. ACM, July 2009.
[19]
Rishiyur S. Nikhil. Cid: A parallel, shared-memory C for distributed-memory machines. In LCPC '94, August 1994.
[20]
James Reinders. Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism. O'Reilly Media, Inc., 2007.
[21]
Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley, Boston, MA, third edition, 2000.
[22]
Jim Sukha. Brief announcement: A lower bound for depth-restricted work stealing. In SPAA '09, August 2009.
[23]
Mark T. Vandevoorde and Eric S. Roberts. WorkCrews: An abstraction for controlling parallelism.International Journal of Parallel Programming, 17(4):347--366, August 1988.

Cited By

View all
  • (2016)A Practical Solution to the Cactus Stack ProblemProceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/2935764.2935787(61-70)Online publication date: 11-Jul-2016

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SPAA '10: Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
June 2010
378 pages
ISBN:9781450300797
DOI:10.1145/1810479

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 June 2010

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. cilk
  2. dynamic multithreading
  3. intel threading building blocks
  4. scheduling
  5. serial-parallel reciprocity
  6. work stealing

Qualifiers

  • Short-paper

Conference

SPAA 10

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Upcoming Conference

SPAA '25
37th ACM Symposium on Parallelism in Algorithms and Architectures
July 28 - August 1, 2025
Portland , OR , USA

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 02 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2016)A Practical Solution to the Cactus Stack ProblemProceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/2935764.2935787(61-70)Online publication date: 11-Jul-2016

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media