More Web Proxy on the site http://driver.im/

research-article

Taskminer: automatic identification of tasks

Authors:

Guilherme Leobas,

Fernando Magno Q. PereiraAuthors Info & Claims

SBLP '18: Proceedings of the XXII Brazilian Symposium on Programming Languages

Pages 11 - 18

https://doi.org/10.1145/3264637.3264639

Published: 20 September 2018 Publication History

Abstract

This paper presents TaskMiner, a tool that automatically finds task parallelism in C code. TaskMiner solves classic problems of irregular parallelism, such as finding the memory ranges accessed by tasks, removing spurious static dependencies, detecting data-racing conditions and, most importantly, assessing the cost of parallelism for a given task or program. TaskMiner implements a suite of optimizations in LLVM's intermediate representation, which in the end insert OpenMP annotations to the code automatically, exposing task parallelism. TaskMiner finds parallelism that was previously manually appointed in benchmarks such as BSC-Bots, producing programs that are as fast or even faster than their sequential and manually annotated counterparts. TaskMiner also finds hidden parallelism in sequential programs, without human intervention.

References

[1]

K. Agrawal, C. E. Leiserson, and J. Sukha. 2010. Executing task graphs using work-stealing. In IPDPS. IEEE, Atlanga, Georgia, US, 1--12.

[2]

Eduard Ayguadé, Rosa M. Badia, Daniel Jiménez, José R. Herrero, Jesús Labarta, Vladimir Subotic, and Gladys Utrera. 2015. Tareador: A Tool to Unveil Parallelization Strategies at Undergraduate Level. In WCAE. ACM, New York, NY, USA, 1:1--1:8.

Digital Library

[3]

Eduard Ayguadé, Nawal Copty, Alejandro Duran, Jay Hoeflinger, Yuan Lin, Federico Massaioli, Xavier Teruel, Priya Unnikrishnan, and Guansong Zhang. 2009. The Design of OpenMP Tasks. Trans. Parallel Distrib. Syst. 20, 3 (2009), 404--418.

Digital Library

[4]

Javier Bueno, Luis Martinell, Alejandro Duran, Montse Farreras, Xavier Martorell, Rosa M. Badia, Eduard Ayguadé, and Jesús Labarta. 2011. Productive Cluster Programming with OmpSs. In Euro-Par. Springer-Verlag, Berlin, Heidelberg, 555--566.

Digital Library

[5]

J. Castrillon, R. Leupers, and G. Ascheid. 2013. MAPS: Mapping Concurrent Dataflow Applications to Heterogeneous MPSoCs. IEEE Transactions on Industrial Informatics 9, 1 (2013), 527--545.

[6]

Barcelona Supercomputing Center. 2018. Paraver: a flexible performance analysis tool. https://tools.bsc.es/paraver. (2018).

[7]

Andi Drebes, Antoniu Pop, Karine Heydemann, Albert Cohen, and Nathalie Drach. 2014. Aftermath: A graphical tool for performance analysis and debugging of fine-grained task-parallel programs and run-time systems. In MULTIPROG. ACM, Vienna, Austria, 80--88.

[8]

Alejandro Duran, Julita Corbalán, and Eduard Ayguadé. 2008. An Adaptive Cut-off for Task Parallelism. In SC. IEEE, Piscataway, NJ, USA, 36:1--36:11.

Digital Library

[9]

Alejandro Duran, Xavier Teruel, Roger Ferrer, Xavier Martorell, and Eduard Ayguadé. 2009. Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP. In ICPP. IEEE, Washington, DC, USA, 124--131.

Digital Library

[10]

Jeanne Ferrante, Karl J. Ottenstein, and Joe D. Warren. 1987. The Program Dependence Graph and Its Use in Optimization. Trans. Program. Lang. Syst. 9, 3 (1987), 319--349.

Digital Library

[11]

Tobias Grosser, Armin Größlinger, and Christian Lengauer. 2012. Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation. Parallel Processing Letters 22, 4 (2012), 28 pages.

[12]

Gagan Gupta and Gurindar S. Sohi. 2011. Dataflow Execution of Sequential Imperative Programs on Multicore Architectures. In Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture. ACM, New York, NY, USA, 59--70.

Digital Library

[13]

An Huynh, Douglas Thain, Miquel Pericàs, and Kenjiro Taura. 2015. DAGViz: A DAG Visualization Tool for Analyzing Task-parallel Program Traces. In Proceedings of the 2nd Workshop on Visual Performance Analysis. ACM, New York, NY, USA, 3:1--3:8.

Digital Library

[14]

Kazuhisa Ishizaka, Motoki Obata, and Hironori Kasahara. 2000. Coarse-grain Task Parallel Processing Using the OpenMP Backend of the OSCAR Multigrain Parallelizing Compiler. In HPC, Mateo Valero, Kazuki Joe, Masaru Kitsuregawa, and Hidehiko Tanaka (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 457--470.

Digital Library

[15]

Shintaro Iwasaki and Kenjiro Taura. 2016. Autotuning of a Cut-Off for Task Parallel Programs. In MCSoC. IEEE, Piscataway, NJ, USA, 353--360.

[16]

Julien Jaeger, Patrick Carribault, and Marc Pérache. 2015. Fine-grain Data Management Directory for OpenMP 4.0 and OpenACC. Concurr. Comput.: Pract. Exper. 27, 6 (2015), 1528--1539.

Digital Library

[17]

Gary A. Kildall. 1973. A Unified Approach to Global Program Optimization. In POPL. ACM, New York, NY, USA, 194--206.

Digital Library

[18]

Zhen Li, Rohit Atre, Zia Huda, Ali Jannesari, and Felix Wolf. 2016. Unveiling Parallelization Opportunities in Sequential Programs. J. Syst. Softw. 117, C (2016), 282--295.

Digital Library

[19]

Gleison Souza Diniz Mendonça, Breno Campos Ferreira Guimarães, Péricles Rafael Oliveira Alves, Fernando Magno Quintão Pereira, Márcio Machado Pereira, and Guido Araújo. 2016. Automatic Insertion of Copy Annotation in Data-Parallel Programs. In SBAC-PAD. IEEE, New York, 34--41.

[20]

Gleison Mendonça, Breno Guimarães, Péricles Alves, Márcio Pereira, Guido Araújo, and Fernando Magno Quintão Pereira. 2017. DawnCC: Automatic Annotation for Data Parallelism and Offloading. ACM Trans. Archit. Code Optim. 14, 2 (2017), 13:1--13:25.

Digital Library

[21]

Rubens E.A. Moreira, Sylvain Collange, and Fernando Magno Quintão Pereira. 2017. Function Call Re-Vectorization. In PPoPP. ACM, New York, NY, USA, 313--326.

Digital Library

[22]

Keshav Pingali, Donald Nguyen, Milind Kulkarni, Martin Burtscher, M. Amber Hassaan, Rashid Kaleem, Tsung-Hsien Lee, Andrew Lenharth, Roman Manevich, Mario Méndez-Lojo, Dimitrios Prountzos, and Xin Sui. 2011. The Tao of Parallelism in Algorithms. In PLDI. ACM, New York, NY, USA, 12--25.

Digital Library

[23]

Judit Planas, Rosa M. Badia, Eduard Ayguadé, and Jesus Labarta. 2009. Hierarchical Task-Based Programming With StarSs. Int. J. High Perform. Comput. Appl. 23, 3 (2009), 284--299.

Digital Library

[24]

Gabriel Poesia, Breno Guimarães, Fabrício Ferracioli, and Fernando Magno Quintão Pereira. 2017. Static Placement of Computation on Heterogeneous Devices. Proc. ACM Program. Lang. 1, OOPSLA (2017), 50:1--50:28.

Digital Library

[25]

Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan. 2014. Automatic Parallelization of a Class of Irregular Loops for Distributed Memory Systems. ACM Trans. Parallel Comput. 1, 1 (2014), 7:1--7:37.

Digital Library

[26]

H. G. Rice. 1953. Classes of Recursively Enumerable Sets and Their Decision Problems. Trans. Amer. Math. Soc. 74, 2 (1953), 358--366.

[27]

Laurence Rideau, Bernard Paul Serpete, and Xavier Leroy. 2008. Tilting at Windmills with Coq: Formal Verification of a Compilation Algorithm for Parallel Moves. J. Autom. Reason. 40, 4 (2008), 307--326.

Digital Library

[28]

OpenACC Standard. 2013. The OpenACC Programming Interface. Technical Report. CAPs.

[29]

The High Performance Computing Center Stuttgart. 2018. TEMANEJO Project Website. https://www.hlrs.de/de/solutions-services/service-portfolio/programming/hpc-development-tools/temanejo/. (2018).

[30]

Georgios Tournavitis, Zheng Wang, Björn Franke, and Michael F.P. O'Boyle. 2009. Towards a Holistic Approach to Auto-parallelization: Integrating Profile-driven Parallelism Detection and Machine-learning Based Mapping. SIGPLAN Not. 44, 6 (2009), 177--187.

Digital Library

[31]

Hans Vandierendonck, Polyvios Pratikakis, and Dimitrios S. Nikolopoulos. 2011. Parallel Programming of General-purpose Programs Using Task-based Programming Models. In Proceedings of the 3rd USENIX Conference on Hot Topic in Parallelism. USENIX Association, Berkeley, CA, USA, 13--13.

Digital Library

[32]

Youfeng Wu and James R. Larus. 1994. Static Branch Frequency and Program Profile Analysis. In MICRO. ACM, New York, NY, USA, 1--11.

Digital Library

Index Terms

Taskminer: automatic identification of tasks
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Source code generation

Recommendations

Automatic annotation of tasks in structured code
PACT '18: Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques

This paper describes the design and implementation of a suit of static analyses and code generation techniques to annotate programs with OpenMP pragmas for task parallelism. These techniques approximate the ranges covered by memory regions, bound ...
CS in parallel: modules for adding parallel computing to CS courses, from CS2 to theory of computation (abstract only)
SIGCSE '12: Proceedings of the 43rd ACM technical symposium on Computer Science Education

Parallel computing with more and more cores is here to stay. This workshop presents four independent, class-tested, primarily hands-on modules for incrementally adding parallelism in undergraduate CS courses, each requiring 1 to 3 class days and ...
Task-Based execution of nested OpenMP loops
IWOMP'12: Proceedings of the 8th international conference on OpenMP in a Heterogeneous World

In this work we propose a novel technique to reduce the overheads related to nested parallel loops in OpenMP programs. In particular we show that in many cases it is possible to replace the code of a nested parallel-for loop with equivalent code that ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Other conferences

SBLP '18: Proceedings of the XXII Brazilian Symposium on Programming Languages

September 2018

108 pages

ISBN:9781450364805

DOI:10.1145/3264637

Program Chairs:
Carlos Camarão
Universidade Federal de Minas Gerais
,
Martin Sulzmann
Hochschule Karlsruhe - Technik und Wirtschaft

Copyright © 2018 ACM.

© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 September 2018

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

SBLP 2018

SBLP 2018: XXII Brazilian Symposium on Programming Languages

September 20 - 21, 2018

Sao Carlos, Brazil

Acceptance Rates

SBLP '18 Paper Acceptance Rate 12 of 29 submissions, 41%;

Overall Acceptance Rate 22 of 50 submissions, 44%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
64
Total Downloads

Downloads (Last 12 months)2
Downloads (Last 6 weeks)1

Reflects downloads up to 23 Jan 2025

Other Metrics

View Author Metrics

Citations

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents