research-article
DOI: 10.1145/3332186.3333051

openDIEL: A Parallel Workflow Engine and Data Analytics Framework

Published: 28 July 2019

Abstract

openDIEL is a workflow engine that aims to give researchers and HPC users an efficient way to coordinate, organize, and interconnect many disparate modules of computation so that HPC resources are allocated and used effectively [13]. A GUI has been developed to aid in creating workflows; it allows for the specification of data science jobs, including neural network architectures, data processing, and hyperparameter tuning. Existing machine learning tools can be readily used in openDIEL, allowing for easy experimentation with various models and approaches.

References

[1]
Enis Afgan, Jeremy Goecks, Dannon Baker, Nate Coraor, Anton Nekrutenko, James Taylor, Galaxy Team, et al. 2011. Galaxy: A gateway to tools in e-science. In Guide to e-Science. Springer, 145--177.
[2]
Tal Ben-Nun and Torsten Hoefler. 2018. Demystifying parallel and distributed deep learning: An in-depth concurrency analysis. arXiv preprint arXiv:1802.09941 (2018).
[3]
Marc Claesen, Jaak Simm, Dusan Popovic, Yves Moreau, and Bart De Moor. 2014. Easy hyperparameter search using optunity. arXiv preprint arXiv:1412.1114 (2014).
[4]
Sanjay Surendranath Girija. 2016. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. Software available from tensorflow.org (2016).
[5]
Aaron Klein, Stefan Falkner, Jost Tobias Springenberg, and Frank Hutter. 2016. Learning curve prediction with Bayesian neural networks. (2016).
[6]
Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar. 2016. Hyperband: A novel bandit-based approach to hyperparameter optimization. arXiv preprint arXiv:1603.06560 (2016).
[7]
Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E Gonzalez, and Ion Stoica. 2018. Tune: A research platform for distributed model selection and training. arXiv preprint arXiv:1807.05118 (2018).
[8]
Suresh Marru, Lahiru Gunathilake, Chathura Herath, Patanachai Tangchaisin, Marlon Pierce, Chris Mattmann, Raminder Singh, Thilina Gunarathne, Eran Chinthaka, Ross Gardler, Aleksander Slominski, Ate Douma, Srinath Perera, and Sanjiva Weerawarana. 2011. Apache Airavata: A Framework for Distributed Applications and Computational Workflows. In Proceedings of the 2011 ACM Workshop on Gateway Computing Environments (GCE '11). ACM, New York, NY, USA, 21--28.
[9]
Ruben Martinez-Cantin. 2014. Bayesopt: A bayesian optimization library for nonlinear optimization, experimental design and bandits. The Journal of Machine Learning Research 15, 1 (2014), 3735--3739.
[10]
Lucien Ng, Kwai Wong, Azzam Haidar, Stanimire Tomov, and Jack Dongarra. 2017. Magmadnn high-performance data analytics for manycore gpus and cpus. In magma DNN, 2017 Summer Research Experiences for Undergraduate (REU).
[11]
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011), 2825--2830.
[12]
Jasper Snoek, Hugo Larochelle, and Ryan P Adams. 2012. Practical bayesian optimization of machine learning algorithms. In Advances in neural information processing systems. 2951--2959.
[13]
Kwai Wong, Logan Brown, Jason Coan, and David White. 2014. Distributive Interoperable Executive Library (DIEL) for Systems of Multi-physics Simulation. In 2014 15th International Conference on Parallel and Distributed Computing, Applications and Technologies. IEEE, 49--55.

Cited By

  • (2022) Spine Toolbox: A flexible open-source workflow management system with scenario and data management. SoftwareX 17 (100967). DOI: 10.1016/j.softx.2021.100967. Online publication date: Jan-2022
  • (2020) Integrating Deep Learning in Domain Sciences at Exascale. Driving Scientific and Engineering Discoveries Through the Convergence of HPC, Big Data and AI (35-50). DOI: 10.1007/978-3-030-63393-6_3. Online publication date: 18-Dec-2020
  • (2019) Hands-On Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning for Emerging Environments. High Performance Computing (643-655). DOI: 10.1007/978-3-030-34356-9_49. Online publication date: 16-Jun-2019

    Published In

    PEARC '19: Practice and Experience in Advanced Research Computing 2019: Rise of the Machines (learning)
    July 2019
    775 pages
    ISBN:9781450372275
    DOI:10.1145/3332186
    General Chair: Tom Furlani
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data science
    2. graphical user interface
    3. neural network
    4. workflow engine

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    PEARC '19

    Acceptance Rates

    Overall Acceptance Rate 133 of 202 submissions, 66%
