[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1378533.1378575acmconferencesArticle/Chapter ViewAbstractPublication PagesspaaConference Proceedingsconference-collections
research-article

Scheduling strategies for optimistic parallel execution of irregular programs

Published: 14 June 2008 Publication History

Abstract

Recent application studies have shown that many irregular applications have a generalized data parallelism that manifests itself as iterative computations over worklists of different kinds. In general, there are complex dependencies between iterations. These dependencies cannot be elucidated statically because they depend on the inputs to the program; thus, optimistic parallel execution is the only tractable approach to parallelizing these applications.
We have built a system called Galois that supports this style of parallel execution. Its main features are (i) set iterators for expressing worklist-based data parallelism, and (ii) a runtime system that performs optimistic parallelization of these iterators, detecting conflicts and rolling back computations as needed.
Our work builds on the Galois system, and it addresses the problem of scheduling iterations of set iterators on multiple cores. The policy used by the base Galois system is to assign an iteration to a core whenever it needs work to do, but we show in this paper that this policy is not optimal for many applications. We also argue that OpenMP-style DO-ALL loop scheduling directives such as chunked and guided self-scheduling are too simplistic for irregular programs. These difficulties led us to develop a general scheduling framework for irregular problems; OpenMP-style scheduling strategies are special cases of this general approach. We also provide hooks into our framework, allowing the programmer to leverage application knowledge to further tune a schedule for a particular application.
To evaluate this framework, we implemented it as an extension of the Galois system. We then tested the system using five real-world, irregular, data-parallel applications. Our results show that (i) the optimal scheduling policy can be different for different applications and often leverages application-specific knowledge and (ii) implementing these schedules in the Galois system is relatively straightforward.

References

[1]
Yuri Boykov and Vladimir Kolmogorov. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. International Journal of Computer Vision (IJCV), 70(2):109--131, 2006.
[2]
Shimin Chen, Phillip B. Gibbons, Michael Kozuch, Vasileios Liaskovitis, Anastassia Ailamaki, Guy E. Blelloch, Babak Falsafi, Limor Fix, Nikos Hardavellas, Todd C. Mowry, and Chris Wilkerson. Scheduling threads for constructive cache sharing on cmps. In SPAA'07: Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, pages 105--115, New York, NY, USA, 2007. ACM Press.
[3]
L. Paul Chew. Guaranteed-quality mesh generation for curved surfaces. In SCG'93: Proceedings of the ninth annual symposium on Computational geometry, 1993.
[4]
Andrew V. Goldberg and Robert E. Tarjan. A new approach to the maximum-flow problem. J. ACM, 35(4):921--940, 1988.
[5]
Leonidas J. Guibas, Donald E. Knuth, and Micha Sharir. Randomized incremental construction of delaunay and voronoi diagrams. Algorithmica, 7(1):381--413, December 1992.
[6]
L. Hendren and A. Nicolau. Parallelizing programs with recursive data structures. IEEE Transactions on Parallel and Distributed Systems, 1(1):35--47, January 1990.
[7]
B. W. Kernighan and S. Lin. An effective heuristic procedure for partitioning graphs. The Bell System Technical Journal, pages 291--308, February 1970.
[8]
Charles H. Koelbel, David B. Loveman, Robert S. Schreiber, Jr. Guy L. Steele, and Mary E. Zosel. The high performance Fortran handbook. MIT Press, Cambridge, MA, USA, 1994.
[9]
Venkata Krishnan and Josep Torrellas. A chip-multiprocessor architecture with speculative multithreading. IEEE Trans. Comput., 48(9):866--880, 1999.
[10]
Milind Kulkarni, Keshav Pingali, Ganesh Ramanarayanan, Bruce Walter, Kavita Bala, and L. Paul Chew. Optimistic parallelism benefits from data partitioning. SIGARCH Comput. Archit. News, 36(1):233--243, 2008.
[11]
Milind Kulkarni, Keshav Pingali, Bruce Walter, Ganesh Ramanarayanan, Kavita Bala, and L. Paul Chew. Optimistic parallelism requires abstractions. SIGPLAN Not. (Proceedings of PLDI 2007), 42(6):211--222, 2007.
[12]
Jim Larus and Ravi Rajwar. Transactional Memory (Synthesis Lectures on Computer Architecture). Morgan & Claypool Publishers, 2007.
[13]
OpenMP. http://www.openmp.org.
[14]
Guilherme Ottoni, Ram Rangan, Adam Stoler, and David I. August. Automatic thread extraction with decoupled software pipelining. In MICRO 38: Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture, pages 105--118, Washington, DC, USA, 2005. IEEE Computer Society.
[15]
James Philbin, Jan Edler, Otto J. Anshus, Craig C. Douglas, and Kai Li. Thread scheduling for cache locality. In Architectural Support for Programming Languages and Operating Systems, pages 60--71, 1996.
[16]
Lawrence Rauchwerger and David A. Padua. The LRPD test: Speculative run-time parallelization of loops with privatization and reduction parallelization. IEEE Trans. Parallel Distrib. Syst., 10(2):160--180, 1999.
[17]
A. Rogers and K. Pingali. Process decomposition through locality of reference. In PLDI'89: Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation, pages 69--80, New York, NY, USA, 1989. ACM.
[18]
M. Sagiv, T. Reps, and R. Wilhelm. Solving shape-analysis problems in languages with destructive updating. ACM Transactions on Programming Languages and Systems, 20(1):1--50, January 1998.
[19]
William N. Scherer, III and Michael L. Scott. Advanced contention management for dynamic software transactional memory. In PODC'05: Proceedings of the twenty-fourth annual ACM symposium on Principles of distributed computing, pages 240--248, New York, NY, USA, 2005. ACM.
[20]
Jonathan Richard Shewchuk. Triangle: Engineering a 2D Quality Mesh Generator and Delaunay Triangulator. In Applied Computational Geometry: Towards Geometric Engineering, volume 1148 of Lecture Notes in Computer Science, pages 203--222. Springer-Verlag, 1996.
[21]
Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, editors. Introduction to Data Mining. Pearson Addison Wesley, 2005.
[22]
Neil Vachharajani, Ram Rangan, Easwaran Raman, Matthew J. Bridges, Guilherme Ottoni, and David I. August. Speculative decoupled software pipelining. In PACT'07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), pages 49--59, Washington, DC, USA, 2007. IEEE Computer Society.
[23]
Bruce Walter, Sebastian Fernandez, Adam Arbree, Kavita Bala, Michael Donikian, and Donald Greenberg. Lightcuts: a scalable approach to illumination. ACM Transactions on Graphics (SIGGRAPH), 24(3):1098--1107, July 2005.

Cited By

View all
  • (2023)ASPENProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669124(68625-68638)Online publication date: 10-Dec-2023
  • (2018)Oversubscribed Command Queues in GPUsProceedings of the 11th Workshop on General Purpose GPUs10.1145/3180270.3180271(50-60)Online publication date: 24-Feb-2018
  • (2018)Scheduling Parallel Computations by Work StealingInternational Journal of Parallel Programming10.1007/s10766-016-0484-846:2(173-197)Online publication date: 1-Apr-2018
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SPAA '08: Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
June 2008
380 pages
ISBN:9781595939739
DOI:10.1145/1378533
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 June 2008

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. irregular programs
  2. optimistic parallelism
  3. scheduling

Qualifiers

  • Research-article

Conference

SPAA08

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)6
  • Downloads (Last 6 weeks)1
Reflects downloads up to 09 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2023)ASPENProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669124(68625-68638)Online publication date: 10-Dec-2023
  • (2018)Oversubscribed Command Queues in GPUsProceedings of the 11th Workshop on General Purpose GPUs10.1145/3180270.3180271(50-60)Online publication date: 24-Feb-2018
  • (2018)Scheduling Parallel Computations by Work StealingInternational Journal of Parallel Programming10.1007/s10766-016-0484-846:2(173-197)Online publication date: 1-Apr-2018
  • (2017)SyncPerfProceedings of the Twelfth European Conference on Computer Systems10.1145/3064176.3064186(298-313)Online publication date: 23-Apr-2017
  • (2016)Data-centric execution of speculative parallel programsThe 49th Annual IEEE/ACM International Symposium on Microarchitecture10.5555/3195638.3195644(1-13)Online publication date: 15-Oct-2016
  • (2016)First-class effect reflection for effect-guided programmingACM SIGPLAN Notices10.1145/3022671.298403751:10(820-837)Online publication date: 19-Oct-2016
  • (2016)First-class effect reflection for effect-guided programmingProceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications10.1145/2983990.2984037(820-837)Online publication date: 19-Oct-2016
  • (2016)A Survey on Thread-Level Speculation TechniquesACM Computing Surveys10.1145/293836949:2(1-39)Online publication date: 30-Jun-2016
  • (2016)Data-centric execution of speculative parallel programs2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)10.1109/MICRO.2016.7783708(1-13)Online publication date: Oct-2016
  • (2016)New Data Structures to Handle Speculative Parallelization at RuntimeInternational Journal of Parallel Programming10.1007/s10766-014-0347-044:3(407-426)Online publication date: 1-Jun-2016
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media