research-article

HyperFlow

Published: 01 February 2016

Abstract

This paper presents HyperFlow: a model of computation, programming approach and enactment engine for scientific workflows. Workflow programming in HyperFlow combines a simple declarative description of the workflow structure with low-level implementation of workflow activities in a mainstream scripting language. The aim of this approach is to increase the productivity of workflow developers who are skilled programmers and want a programming experience similar to that offered by a mature programming ecosystem. Combining a declarative description with low-level programming eliminates shim nodes from the workflow graph, which considerably simplifies workflow implementations. The workflow description is based on a formal model of computation (Process Networks) and has a simple, concise syntax built on just three key abstractions: processes, signals and functions. These abstractions are nevertheless sufficient to express complex workflow patterns in a simple way. The model of computation, as implemented in the HyperFlow workflow engine, enables fully distributed and decentralized workflow enactment. The paper describes HyperFlow from the perspective of its workflow programming capabilities, the adopted model of computation, and the enactment engine, in particular its distributed workflow enactment capability. The provenance model and logging features are also presented. Several workflow examples derived from other workflow systems and reimplemented in HyperFlow are discussed in detail.

  • A model of computation and system for scientific workflows, HyperFlow, is proposed.
  • HyperFlow aims at high development productivity of skilled programmers.
  • The HyperFlow Model of Computation combines simplicity with high expressiveness.
  • Complex workflow patterns can be implemented using a simple syntax.
  • HyperFlow enables a fully distributed and decentralized workflow execution.
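The three abstractions named in the abstract (processes, signals and functions) can be illustrated with a toy sketch. The snippet below is not HyperFlow's actual description format or engine. The dictionary layout, the names `workflow`, `FUNCTIONS` and `run`, and the firing rule (a process fires once every one of its input signals has a value waiting) are assumptions made for illustration only. It shows the general idea of a declarative graph paired with ordinary functions, written in Python rather than in the scripting language used by the paper.

```python
from collections import deque

# Hypothetical declarative description of the workflow structure.
# This layout is an assumption for illustration, not the HyperFlow format.
workflow = {
    "signals": ["number", "squared", "report"],
    "processes": [
        {"name": "Square", "function": "square",
         "ins": ["number"], "outs": ["squared"]},
        {"name": "Format", "function": "fmt",
         "ins": ["squared"], "outs": ["report"]},
    ],
}


# Functions: the low-level implementation of each workflow activity.
def square(ins):
    return {"squared": ins["number"] ** 2}


def fmt(ins):
    return {"report": f"result={ins['squared']}"}


FUNCTIONS = {"square": square, "fmt": fmt}


def run(workflow, initial):
    """Toy enactment loop: fire a process whenever all its inputs are ready.

    Assumes every process has at least one input signal and that each
    signal is consumed by at most one process.
    """
    queues = {name: deque() for name in workflow["signals"]}
    for name, values in initial.items():
        queues[name].extend(values)

    progress = True
    while progress:
        progress = False
        for proc in workflow["processes"]:
            # A process is ready when each input queue holds a value.
            if all(queues[s] for s in proc["ins"]):
                ins = {s: queues[s].popleft() for s in proc["ins"]}
                outs = FUNCTIONS[proc["function"]](ins)
                for s in proc["outs"]:
                    queues[s].append(outs[s])
                progress = True

    # Whatever is left unconsumed is the workflow output.
    return {name: list(q) for name, q in queues.items() if q}


result = run(workflow, {"number": [2, 3]})
print(result)
```

Keeping the graph as plain data and the activities as plain functions means data conversions can live inside a function body instead of appearing as extra nodes in the graph, which is the kind of simplification the abstract attributes to eliminating shim nodes.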

Published In

Future Generation Computer Systems, Volume 55, Issue C
February 2016
547 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Process networks
  2. Scientific workflows
  3. Workflow enactment
  4. Workflow patterns
  5. Workflow programming

Qualifiers

  • Research-article

Cited By

  • (2024) Peeking Behind the Serverless Implementations and Deployments of the Montage Workflow. Companion of the 15th ACM/SPEC International Conference on Performance Engineering, pp. 196-203. DOI: 10.1145/3629527.3651420. Online publication date: 7-May-2024
  • (2024) Scientific workflow execution in the cloud using a dynamic runtime model. Software and Systems Modeling (SoSyM), 23:1, pp. 163-193. DOI: 10.1007/s10270-023-01112-6. Online publication date: 1-Feb-2024
  • (2024) A decentralized prediction-based workflow load balancing architecture for cloud/fog/IoT environments. Computing, 106:1, pp. 201-239. DOI: 10.1007/s00607-023-01216-3. Online publication date: 1-Jan-2024
  • (2023) DataFlower: Exploiting the Data-flow Paradigm for Serverless Workflow Orchestration. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4, pp. 57-72. DOI: 10.1145/3623278.3624755. Online publication date: 25-Mar-2023
  • (2022) The Serverless Computing Survey: A Technical Primer for Design Architecture. ACM Computing Surveys, 54:10s, pp. 1-34. DOI: 10.1145/3508360. Online publication date: 13-Sep-2022
  • (2022) FaaSFlow: enable efficient workflow execution for function-as-a-service. Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 782-796. DOI: 10.1145/3503222.3507717. Online publication date: 28-Feb-2022
  • (2022) Evaluation of Machine Learning Techniques for Predicting Run Times of Scientific Workflow Jobs. Parallel Processing and Applied Mathematics, pp. 197-208. DOI: 10.1007/978-3-031-30442-2_15. Online publication date: 11-Sep-2022
  • (2022) Auto-scaling of Scientific Workflows in Kubernetes. Computational Science – ICCS 2022, pp. 33-40. DOI: 10.1007/978-3-031-08754-7_5. Online publication date: 21-Jun-2022
  • (2021) Executing cyclic scientific workflows in the cloud. Journal of Cloud Computing: Advances, Systems and Applications, 10:1. DOI: 10.1186/s13677-021-00229-7. Online publication date: 6-Apr-2021
  • (2021) Study-based Systematic Mapping Analysis of Cloud Technologies for Leveraging IT Resource and Service Management: The Case Study of the Science Gateway Approach. Journal of Grid Computing, 19:4. DOI: 10.1007/s10723-021-09587-7. Online publication date: 1-Dec-2021
