[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3264637.3264639acmotherconferencesArticle/Chapter ViewAbstractPublication PagessblpConference Proceedingsconference-collections
research-article

Taskminer: automatic identification of tasks

Published: 20 September 2018 Publication History

Abstract

This paper presents TaskMiner, a tool that automatically finds task parallelism in C code. TaskMiner solves classic problems of irregular parallelism, such as finding the memory ranges accessed by tasks, removing spurious static dependencies, detecting data-racing conditions and, most importantly, assessing the cost of parallelism for a given task or program. TaskMiner implements a suite of optimizations in LLVM's intermediate representation, which in the end insert OpenMP annotations to the code automatically, exposing task parallelism. TaskMiner finds parallelism that was previously manually appointed in benchmarks such as BSC-Bots, producing programs that are as fast or even faster than their sequential and manually annotated counterparts. TaskMiner also finds hidden parallelism in sequential programs, without human intervention.

References

[1]
K. Agrawal, C. E. Leiserson, and J. Sukha. 2010. Executing task graphs using work-stealing. In IPDPS. IEEE, Atlanga, Georgia, US, 1--12.
[2]
Eduard Ayguadé, Rosa M. Badia, Daniel Jiménez, José R. Herrero, Jesús Labarta, Vladimir Subotic, and Gladys Utrera. 2015. Tareador: A Tool to Unveil Parallelization Strategies at Undergraduate Level. In WCAE. ACM, New York, NY, USA, 1:1--1:8.
[3]
Eduard Ayguadé, Nawal Copty, Alejandro Duran, Jay Hoeflinger, Yuan Lin, Federico Massaioli, Xavier Teruel, Priya Unnikrishnan, and Guansong Zhang. 2009. The Design of OpenMP Tasks. Trans. Parallel Distrib. Syst. 20, 3 (2009), 404--418.
[4]
Javier Bueno, Luis Martinell, Alejandro Duran, Montse Farreras, Xavier Martorell, Rosa M. Badia, Eduard Ayguadé, and Jesús Labarta. 2011. Productive Cluster Programming with OmpSs. In Euro-Par. Springer-Verlag, Berlin, Heidelberg, 555--566.
[5]
J. Castrillon, R. Leupers, and G. Ascheid. 2013. MAPS: Mapping Concurrent Dataflow Applications to Heterogeneous MPSoCs. IEEE Transactions on Industrial Informatics 9, 1 (2013), 527--545.
[6]
Barcelona Supercomputing Center. 2018. Paraver: a flexible performance analysis tool. https://tools.bsc.es/paraver. (2018).
[7]
Andi Drebes, Antoniu Pop, Karine Heydemann, Albert Cohen, and Nathalie Drach. 2014. Aftermath: A graphical tool for performance analysis and debugging of fine-grained task-parallel programs and run-time systems. In MULTIPROG. ACM, Vienna, Austria, 80--88.
[8]
Alejandro Duran, Julita Corbalán, and Eduard Ayguadé. 2008. An Adaptive Cut-off for Task Parallelism. In SC. IEEE, Piscataway, NJ, USA, 36:1--36:11.
[9]
Alejandro Duran, Xavier Teruel, Roger Ferrer, Xavier Martorell, and Eduard Ayguadé. 2009. Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP. In ICPP. IEEE, Washington, DC, USA, 124--131.
[10]
Jeanne Ferrante, Karl J. Ottenstein, and Joe D. Warren. 1987. The Program Dependence Graph and Its Use in Optimization. Trans. Program. Lang. Syst. 9, 3 (1987), 319--349.
[11]
Tobias Grosser, Armin Größlinger, and Christian Lengauer. 2012. Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation. Parallel Processing Letters 22, 4 (2012), 28 pages.
[12]
Gagan Gupta and Gurindar S. Sohi. 2011. Dataflow Execution of Sequential Imperative Programs on Multicore Architectures. In Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture. ACM, New York, NY, USA, 59--70.
[13]
An Huynh, Douglas Thain, Miquel Pericàs, and Kenjiro Taura. 2015. DAGViz: A DAG Visualization Tool for Analyzing Task-parallel Program Traces. In Proceedings of the 2nd Workshop on Visual Performance Analysis. ACM, New York, NY, USA, 3:1--3:8.
[14]
Kazuhisa Ishizaka, Motoki Obata, and Hironori Kasahara. 2000. Coarse-grain Task Parallel Processing Using the OpenMP Backend of the OSCAR Multigrain Parallelizing Compiler. In HPC, Mateo Valero, Kazuki Joe, Masaru Kitsuregawa, and Hidehiko Tanaka (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 457--470.
[15]
Shintaro Iwasaki and Kenjiro Taura. 2016. Autotuning of a Cut-Off for Task Parallel Programs. In MCSoC. IEEE, Piscataway, NJ, USA, 353--360.
[16]
Julien Jaeger, Patrick Carribault, and Marc Pérache. 2015. Fine-grain Data Management Directory for OpenMP 4.0 and OpenACC. Concurr. Comput.: Pract. Exper. 27, 6 (2015), 1528--1539.
[17]
Gary A. Kildall. 1973. A Unified Approach to Global Program Optimization. In POPL. ACM, New York, NY, USA, 194--206.
[18]
Zhen Li, Rohit Atre, Zia Huda, Ali Jannesari, and Felix Wolf. 2016. Unveiling Parallelization Opportunities in Sequential Programs. J. Syst. Softw. 117, C (2016), 282--295.
[19]
Gleison Souza Diniz Mendonça, Breno Campos Ferreira Guimarães, Péricles Rafael Oliveira Alves, Fernando Magno Quintão Pereira, Márcio Machado Pereira, and Guido Araújo. 2016. Automatic Insertion of Copy Annotation in Data-Parallel Programs. In SBAC-PAD. IEEE, New York, 34--41.
[20]
Gleison Mendonça, Breno Guimarães, Péricles Alves, Márcio Pereira, Guido Araújo, and Fernando Magno Quintão Pereira. 2017. DawnCC: Automatic Annotation for Data Parallelism and Offloading. ACM Trans. Archit. Code Optim. 14, 2 (2017), 13:1--13:25.
[21]
Rubens E.A. Moreira, Sylvain Collange, and Fernando Magno Quintão Pereira. 2017. Function Call Re-Vectorization. In PPoPP. ACM, New York, NY, USA, 313--326.
[22]
Keshav Pingali, Donald Nguyen, Milind Kulkarni, Martin Burtscher, M. Amber Hassaan, Rashid Kaleem, Tsung-Hsien Lee, Andrew Lenharth, Roman Manevich, Mario Méndez-Lojo, Dimitrios Prountzos, and Xin Sui. 2011. The Tao of Parallelism in Algorithms. In PLDI. ACM, New York, NY, USA, 12--25.
[23]
Judit Planas, Rosa M. Badia, Eduard Ayguadé, and Jesus Labarta. 2009. Hierarchical Task-Based Programming With StarSs. Int. J. High Perform. Comput. Appl. 23, 3 (2009), 284--299.
[24]
Gabriel Poesia, Breno Guimarães, Fabrício Ferracioli, and Fernando Magno Quintão Pereira. 2017. Static Placement of Computation on Heterogeneous Devices. Proc. ACM Program. Lang. 1, OOPSLA (2017), 50:1--50:28.
[25]
Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan. 2014. Automatic Parallelization of a Class of Irregular Loops for Distributed Memory Systems. ACM Trans. Parallel Comput. 1, 1 (2014), 7:1--7:37.
[26]
H. G. Rice. 1953. Classes of Recursively Enumerable Sets and Their Decision Problems. Trans. Amer. Math. Soc. 74, 2 (1953), 358--366.
[27]
Laurence Rideau, Bernard Paul Serpete, and Xavier Leroy. 2008. Tilting at Windmills with Coq: Formal Verification of a Compilation Algorithm for Parallel Moves. J. Autom. Reason. 40, 4 (2008), 307--326.
[28]
OpenACC Standard. 2013. The OpenACC Programming Interface. Technical Report. CAPs.
[29]
The High Performance Computing Center Stuttgart. 2018. TEMANEJO Project Website. https://www.hlrs.de/de/solutions-services/service-portfolio/programming/hpc-development-tools/temanejo/. (2018).
[30]
Georgios Tournavitis, Zheng Wang, Björn Franke, and Michael F.P. O'Boyle. 2009. Towards a Holistic Approach to Auto-parallelization: Integrating Profile-driven Parallelism Detection and Machine-learning Based Mapping. SIGPLAN Not. 44, 6 (2009), 177--187.
[31]
Hans Vandierendonck, Polyvios Pratikakis, and Dimitrios S. Nikolopoulos. 2011. Parallel Programming of General-purpose Programs Using Task-based Programming Models. In Proceedings of the 3rd USENIX Conference on Hot Topic in Parallelism. USENIX Association, Berkeley, CA, USA, 13--13.
[32]
Youfeng Wu and James R. Larus. 1994. Static Branch Frequency and Program Profile Analysis. In MICRO. ACM, New York, NY, USA, 1--11.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Other conferences
SBLP '18: Proceedings of the XXII Brazilian Symposium on Programming Languages
September 2018
108 pages
ISBN:9781450364805
DOI:10.1145/3264637
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 September 2018

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. openMP
  2. parallelism
  3. tasks

Qualifiers

  • Research-article

Conference

SBLP 2018
SBLP 2018: XXII Brazilian Symposium on Programming Languages
September 20 - 21, 2018
Sao Carlos, Brazil

Acceptance Rates

SBLP '18 Paper Acceptance Rate 12 of 29 submissions, 41%;
Overall Acceptance Rate 22 of 50 submissions, 44%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 64
    Total Downloads
  • Downloads (Last 12 months)2
  • Downloads (Last 6 weeks)1
Reflects downloads up to 23 Jan 2025

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media