[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to main content

\(\textsf {CircuitFlow}\): A Domain Specific Language for Dataflow Programming

  • Conference paper
  • First Online:
Practical Aspects of Declarative Languages (PADL 2022)

Abstract

Dataflow applications, such as machine learning algorithms, can run for days, making it desirable to have assurances that they will work correctly. Current tools are not good enough: too often the interactions between tasks are not type-safe, leading to undesirable runtime errors. This paper presents a new declarative Haskell Embedded DSL (eDSL) for dataflow programming: \(\textsf {CircuitFlow}\). Defined as a Symmetric Monoidal Preorder (SMP) on data that models dependencies in the workflow, it has a strong mathematical basis, refocusing on how data flows through an application, resulting in a more expressive solution that not only catches errors statically, but also achieves competitive run-time performance. In our preliminary evaluation, \(\textsf {CircuitFlow}\) outperforms the industry-leading Luigi library of Spotify by scaling better with the number of inputs. The innovative creation of \(\textsf {CircuitFlow}\) is also of note, exemplifying how to create a modular eDSL whose semantics necessitates effects, and where storing complex type information for program correctness is paramount.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
GBP 19.95
Price includes VAT (United Kingdom)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
GBP 43.99
Price includes VAT (United Kingdom)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
GBP 54.99
Price includes VAT (United Kingdom)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016), pp. 265–283. USENIX Association, Savannah, November 2016

    Google Scholar 

  2. Apache: Airflow. http://airflow.apache.org

  3. Bahr, P., Hvitved, T.: Compositional data types. In: Proceedings of the Seventh ACM SIGPLAN Workshop on Generic Programming, WGP 2011, pp. 83–94. Association for Computing Machinery, New York (2011)

    Google Scholar 

  4. Bernardy, J.P., Spiwack, A.: Evaluating linear functions to symmetric monoidal categories. In: Proceedings of the 14th ACM SIGPLAN International Symposium on Haskell, Haskell 2021, pp. 14–26. Association for Computing Machinery, New York (2021)

    Google Scholar 

  5. Chambers, C., et al.: Easy, efficient data-parallel pipelines. In: ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2 Penn Plaza, Suite 701 New York, NY 10121–070, pp. 363–3751 (2010)

    Google Scholar 

  6. Coecke, B., Fritz, T., Spekkens, R.W.: A mathematical theory of resources. Inf. Comput. 250, 59–86 (2016)

    Article  MathSciNet  Google Scholar 

  7. Dennis, J.B., Misunas, D.P.: A preliminary architecture for a basic data-flow processor. In: Proceedings of the 2nd Annual Symposium on Computer Architecture, ISCA 1975, pp. 126–132. Association for Computing Machinery, New York (1974)

    Google Scholar 

  8. Eisenberg, R.A., Vytiniotis, D., Peyton Jones, S., Weirich, S.: Closed type families with overlapping equations. In: Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2014, pp. 671–683. Association for Computing Machinery, New York (2014)

    Google Scholar 

  9. Eisenberg, R.A., Weirich, S.: Dependently typed programming with singletons. In: Proceedings of the 2012 Haskell Symposium, Haskell 2012, pp. 117–130. Association for Computing Machinery, New York (2012)

    Google Scholar 

  10. Erdmann, M., Fischer, B., Fischer, R., Rieger, M.: Design and execution of make-like, distributed analyses based on spotify’s pipelining package Luigi. J. Phys. Conf. Ser. 898, 072047 (2017)

    Google Scholar 

  11. Evans, R., Frohlich, S., Wang, M.: CircuitFlow: a domain specific language for dataflow programming (with appendices) (2021)

    Google Scholar 

  12. Fokkinga, M.: Monadic maps and folds for arbitrary datatypes. Memoranda Informatica (94–28), June 1994. Imported from EWI/DB PMS [db-utwente:tech:0000003538]

    Google Scholar 

  13. Gibbons, J., Wu, N.: Folding domain-specific languages: deep and shallow embeddings (functional pearl). In: Proceedings of the ACM SIGPLAN International Conference on Functional Programming, ICFP 49, August 2014

    Google Scholar 

  14. Gonzalez, G.: Pipes. https://hackage.haskell.org/package/pipes

  15. Hils, D.D.: Visual languages and computing survey: data flow visual programming languages. J. Vis. Lang. Comput. 3, 69–101 (1992)

    Article  Google Scholar 

  16. Hughes, J.: Generalising monads to arrows. Sci. Comput. Program. 37(1), 67–111 (2000)

    Article  MathSciNet  Google Scholar 

  17. Inc, A.: Quartz composer user guide, July 2007

    Google Scholar 

  18. Johann, P., Ghani, N.: Foundations for structured programming with GADTs. In: Proceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2008, pp. 297–308. Association for Computing Machinery, New York (2008)

    Google Scholar 

  19. Kiselyov, O., Lämmel, R., Schupke, K.: Strongly typed heterogeneous collections. In: Proceedings of the 2004 ACM SIGPLAN Workshop on Haskell, Haskell 2004, pp. 96–107. Association for Computing Machinery, New York (2004)

    Google Scholar 

  20. Kotliar, M., Kartashov, A.V., Barski, A.: CWL-Airflow: a lightweight pipeline manager supporting Common Workflow Language. GigaScience 8(7), giz084 (2019)

    Google Scholar 

  21. Lampa, S., Dahlö, M., Alvarsson, J., Spjuth, O.: SciPipe: a workflow library for agile development of complex and dynamic bioinformatics pipelines. GigaScience 8(5), giz044 (2019)

    Google Scholar 

  22. Lee, E.A., Parks, T.M.: Dataflow process networks. Proc. IEEE 83(5), 773–801 (1995)

    Article  Google Scholar 

  23. Maries, I.C.: Time. https://pypi.org/project/pytest-benchmark/

  24. Marlow, S., Brandy, L., Coens, J., Purdy, J.: There is no fork: an abstraction for efficient, concurrent, and concise data access. In: Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming, ICFP 2014, pp. 325–337. Association for Computing Machinery, New York (2014)

    Google Scholar 

  25. Marlow, S., Peyton Jones, S., Kmett, E., Mokhov, A.: Desugaring Haskell’s do-notation into applicative operations. SIGPLAN Not. 51(12), 92–104 (2016)

    Article  Google Scholar 

  26. McBride, C.: Functional pearl: Kleisli arrows of outrageous fortune. J. Funct. Program (2011, accepted for publication)

    Google Scholar 

  27. Mcbride, C., Paterson, R.: Applicative programming with effects. J. Funct. Program. 18(1), 1–13 (2008)

    Article  Google Scholar 

  28. Murray, D., McSherry, F., Isaacs, R., Isard, M., Barham, P., Abadi, M.: Naiad: a timely dataflow system. In: Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP), pp. 439–455. ACM, November 2013

    Google Scholar 

  29. O’Sullivan, B.: Criterion. http://www.serpentine.com/criterion/

  30. Parès, Y., Bernardy, J.P., Eisenberg, R.A.: Composing effects into tasks and workflows. In: Proceedings of the 13th ACM SIGPLAN International Symposium on Haskell, Haskell 2020, pp. 80–94. Association for Computing Machinery, New York (2020)

    Google Scholar 

  31. Paterson, R.: A new notation for arrows. In: International Conference on Functional Programming, pp. 229–240. ACM Press, September 2001

    Google Scholar 

  32. Schrijvers, T., Peyton Jones, S., Chakravarty, M., Sulzmann, M.: Type checking with open type functions. In: Proceedings of the 13th ACM SIGPLAN International Conference on Functional Programming, ICFP 2008, pp. 51–62. Association for Computing Machinery, New York (2008)

    Google Scholar 

  33. Spotify: Spotify: Luigi. https://github.com/spotify/luigi

  34. Spotify: Tasks, April 2020. https://luigi.readthedocs.io/en/stable/tasks.html

  35. Svenningsson, J., Axelsson, E.: Combining deep and shallow embedding of domain-specific languages. Comput. Lang. Syst. Struct. 44, 143–165 (2015). sI: TFP 2011/12

    Google Scholar 

  36. Swierstra, W.: Data types á la carte. J. Funct. Program. 18(4), 423–436 (2008)

    Article  MathSciNet  Google Scholar 

  37. Wadler, P.: The expression problem, November 1998

    Google Scholar 

  38. Willis, J., Wu, N., Pickering, M.: Staged selective parser combinators. Proc. ACM Program. Lang. 4(ICFP), 1–30 (2020)

    Google Scholar 

  39. Yorgey, B.A., Weirich, S., Cretin, J., Peyton Jones, S., Vytiniotis, D., Magalhães, J.P.: Giving haskell a promotion. In: Proceedings of the 8th ACM SIGPLAN Workshop on Types in Language Design and Implementation, TLDI 2012, pp. 53–66. Association for Computing Machinery, New York (2012)

    Google Scholar 

  40. Yu, Y., et al.: DryadLINQ: a system for general-purpose distributed data-parallel computing using a high-level language. In: Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI 2008, pp. 1–14. USENIX Association, USA (2008)

    Google Scholar 

  41. Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I.: Spark: cluster computing with working sets. In: Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, HotCloud 2010, p. 10. USENIX Association, USA (2010)

    Google Scholar 

Download references

Acknowledgements

The authors would like to thank Jamie Willis for his insights while creating \(\textsf {CircuitFlow}\) and the anonymous reviewers for their constructive and helpful comments.

The work is partly supported by EPSRC Grant EXHIBIT: Expressive High-Level Languages for Bidirectional Transformations (EP/T008911/1) and Royal Society Grant Bidirectional Compiler for Software Evolution (IESR3170104).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Meng Wang .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Evans, R., Frohlich, S., Wang, M. (2022). \(\textsf {CircuitFlow}\): A Domain Specific Language for Dataflow Programming. In: Cheney, J., Perri, S. (eds) Practical Aspects of Declarative Languages. PADL 2022. Lecture Notes in Computer Science(), vol 13165. Springer, Cham. https://doi.org/10.1007/978-3-030-94479-7_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-94479-7_6

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94478-0

  • Online ISBN: 978-3-030-94479-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics