Brief Announcement: Open Cilk

Authors: Tao B. Schardl, I-Ting Angelina Lee, Charles E. Leiserson

SPAA '18: Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures

Pages 351 - 353

https://doi.org/10.1145/3210377.3210658

Published: 11 July 2018

Abstract

Open Cilk is a new open-source platform to support Cilk multithreaded programming, especially for researchers and teachers. Open Cilk aims to provide a full-featured implementation of Cilk that is easy to modify and extend. Based on the award-winning Tapir/LLVM compiler, Open Cilk will provide a streamlined runtime system and feature comprehensive static instrumentation for dynamic-analysis tools. As a community-infrastructure project, Open Cilk encourages contributions from researchers in the areas of languages, compilers, runtime systems, tools, libraries, and benchmarks.

Published In

SPAA '18: Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures

July 2018

437 pages

ISBN: 9781450357999

DOI: 10.1145/3210377

General Chair: Christian Scheideler, University of Paderborn, Germany

Program Chair: Jeremy Fineman, Georgetown University, USA

Copyright © 2018 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGARCH: ACM Special Interest Group on Computer Architecture
EATCS: European Association for Theoretical Computer Science

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Announcement

Funding Sources

National Science Foundation

Conference

SPAA '18

SPAA '18: 30th ACM Symposium on Parallelism in Algorithms and Architectures

July 16 - 18, 2018

Vienna, Austria

Acceptance Rates

SPAA '18 Paper Acceptance Rate: 36 of 120 submissions, 30%

Overall Acceptance Rate: 447 of 1,461 submissions, 31%
