DOI: 10.1145/2835238.2835241
research-article

DAGViz: a DAG visualization tool for analyzing task-parallel program traces

Published: 15 November 2015

Abstract

In task-based parallel programming, programmers expose the logical parallelism of their programs by creating fine-grained tasks at arbitrary places in their code. All other burdens of executing these tasks in parallel, such as thread management, task scheduling, and load balancing, are handled automatically by the runtime system. This programming model is regarded as a promising paradigm that brings intricate parallel programming techniques to a larger audience of programmers because of its high programmability. Many languages (e.g., OpenMP, Cilk Plus) and libraries (e.g., Intel TBB, Qthreads, MassiveThreads) support task parallelism. However, the nondeterministic nature of task-parallel execution, which hides runtime scheduling mechanisms from programmers, makes it difficult for programmers to understand the cause of suboptimal performance in their programs. To tackle this problem, and also to clarify differences between task-parallel runtime systems, we have developed a toolset that captures and visualizes the trace of an execution of a task-parallel program in the form of a directed acyclic graph (DAG). The computation DAG of a task-parallel program's run is extracted automatically by our lightweight, portable wrapper around all five systems named above, which requires no modification of the target systems' code. The DAG is stored in a file and then visualized to analyze performance. We leverage the hierarchical structure of the DAG to enhance the DAG file format and the DAG visualization, keeping them manageable even for a huge DAG with an arbitrarily large number of nodes. This DAG visualization provides a task-centric view of the program, which differs from other popular visualizations such as thread-centric timeline visualization and code-centric hotspot analysis.
In addition, DAGViz provides a timeline visualization that is constructed from the individual nodes of the DAG and is useful for directing the user's attention to low-parallelism areas of the DAG. We demonstrate the usefulness of our DAG visualizations in several case studies. In future work, we expect to build other kinds of effective visualizations on this computation DAG, making DAGViz an effective tool for analyzing task-parallel performance and for developing scheduling algorithms for task-parallel runtime schedulers.
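
To make the idea of a timeline built from individual DAG nodes concrete, the following is a minimal sketch, not DAGViz's actual implementation or file format. It assumes each executed node is recorded with a start time, an end time, and the worker that ran it; the `Node` record and the functions `parallelism_profile` and `low_parallelism` are hypothetical names introduced only for this illustration. The sketch counts how many nodes are running in each time segment and reports the segments that fall below a chosen threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One executed DAG node (hypothetical record, not DAGViz's format)."""
    name: str
    start: int   # time at which the node began running
    end: int     # time at which the node finished
    worker: int  # worker thread that executed the node


def parallelism_profile(nodes):
    """Return (t0, t1, running) segments: nodes running during [t0, t1)."""
    events = []
    for n in nodes:
        events.append((n.start, 1))   # node begins: one more running
        events.append((n.end, -1))    # node ends: one fewer running
    events.sort()                     # ties: ends (-1) sort before starts (+1)

    profile = []
    running = 0
    prev_t = None
    for t, delta in events:
        # Emit a segment only when time has actually advanced.
        if prev_t is not None and t > prev_t:
            profile.append((prev_t, t, running))
        running += delta
        prev_t = t
    return profile


def low_parallelism(profile, threshold):
    """Keep only the segments in which fewer than `threshold` nodes run."""
    return [seg for seg in profile if seg[2] < threshold]


if __name__ == "__main__":
    # A tiny fork-join run: A forks B and C, and D runs after the join.
    trace = [
        Node("A", 0, 2, worker=0),
        Node("B", 2, 5, worker=0),
        Node("C", 2, 4, worker=1),
        Node("D", 5, 6, worker=0),
    ]
    for t0, t1, k in low_parallelism(parallelism_profile(trace), threshold=2):
        print(f"[{t0}, {t1}): {k} node(s) running")
```

In a real tool, the low-parallelism segments would be mapped back to the DAG nodes that overlap them, so that the user can see which tasks were on the critical path while other workers were idle.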

    Published In

    VPA '15: Proceedings of the 2nd Workshop on Visual Performance Analysis
    November 2015
    44 pages
    ISBN:9781450340137
    DOI:10.1145/2835238
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. DAG visualization
    2. performance analysis
    3. profiler
    4. task parallel
    5. tracer

    Qualifiers

    • Research-article

    Conference

    SC15

    Acceptance Rates

    VPA '15 Paper Acceptance Rate 5 of 6 submissions, 83%;
    Overall Acceptance Rate 5 of 6 submissions, 83%

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 28
    • Downloads (last 6 weeks): 3
    Reflects downloads up to 24 Jan 2025

    Cited By

    • (2024) Evaluating Communication Pattern Representations in Execution Trace Gantt Charts. 2024 IEEE Working Conference on Software Visualization (VISSOFT). DOI: 10.1109/VISSOFT64034.2024.00011, pp. 1-11. Online publication date: 6-Oct-2024
    • (2024) Visualizing Correctness Issues in OpenMP Programs. Advancing OpenMP for Future Accelerators. DOI: 10.1007/978-3-031-72567-8_11, pp. 161-175. Online publication date: 16-Sep-2024
    • (2023) Traveler: Navigating Task Parallel Traces for Performance Analysis. IEEE Transactions on Visualization and Computer Graphics 29:1. DOI: 10.1109/TVCG.2022.3209375, pp. 788-797. Online publication date: Jan-2023
    • (2023) Summarizing task-based applications behavior over many nodes through progression clustering. 2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). DOI: 10.1109/PDP59025.2023.00014, pp. 35-42. Online publication date: Mar-2023
    • (2021) Providing In-depth Performance Analysis for Heterogeneous Task-based Applications with StarVZ. 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). DOI: 10.1109/IPDPSW52791.2021.00013, pp. 16-25. Online publication date: Jun-2021
    • (2020) Visualizing Distributed System Executions. ACM Transactions on Software Engineering and Methodology 29:2. DOI: 10.1145/3375633, pp. 1-38. Online publication date: 4-Mar-2020
    • (2020) Analyzing the Performance Trade-Off in Implementing User-Level Threads. IEEE Transactions on Parallel and Distributed Systems 31:8. DOI: 10.1109/TPDS.2020.2976057, pp. 1859-1877. Online publication date: 1-Aug-2020
    • (2019) Visualizing a Moving Target: A Design Study on Task Parallel Programs in the Presence of Evolving Data and Concerns. IEEE Transactions on Visualization and Computer Graphics. DOI: 10.1109/TVCG.2019.2934285, pp. 1-1. Online publication date: 2019
    • (2018) Taskminer. Proceedings of the XXII Brazilian Symposium on Programming Languages. DOI: 10.1145/3264637.3264639, pp. 11-18. Online publication date: 20-Sep-2018
    • (2018) Automatic annotation of tasks in structured code. Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques. DOI: 10.1145/3243176.3243200, pp. 1-13. Online publication date: 1-Nov-2018
