research-article
Open access

Coarsening optimization for differentiable programming

Published: 15 October 2021

Abstract

This paper presents a novel optimization for differentiable programming named coarsening optimization. It offers a systematic way to synergize symbolic differentiation and algorithmic differentiation (AD). Through it, the granularity of the computations differentiated by each step in AD can become much larger than a single operation, which substantially reduces the runtime computations and data allocations in AD. To circumvent the difficulties that control flow creates for symbolic differentiation in coarsening, this work introduces 𝜙-calculus, a novel method that allows symbolic reasoning about, and differentiation of, computations that involve branches and loops. It further avoids "expression swell" in symbolic differentiation and balances reuse and coarsening through the design of reuse-centric segment-of-interest identification. Experiments on a collection of real-world applications show that coarsening optimization is effective in speeding up AD, producing speedups that range from several times to two orders of magnitude.
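The granularity argument can be illustrated with a small, hypothetical sketch. This is not the paper's implementation: the `Tape` class, the example function f(x) = x·sin(x) + x², and the hand-derived closed-form derivative are all assumptions made for illustration. The sketch records a reverse-mode tape once per primitive operation and once per coarsened segment, where the segment's derivative is supplied in closed form (as symbolic differentiation would produce).

```python
import math

class Var:
    """One tape entry: a value plus (parent, local partial) links."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents
        self.grad = 0.0

class Tape:
    """Minimal reverse-mode AD tape (illustrative, not from the paper)."""
    def __init__(self):
        self.nodes = []

    def new(self, value, parents=()):
        v = Var(value, parents)
        self.nodes.append(v)  # creation order is a topological order
        return v

    # Fine-grained primitives: each call allocates one tape node.
    def add(self, a, b):
        return self.new(a.value + b.value, ((a, 1.0), (b, 1.0)))

    def mul(self, a, b):
        return self.new(a.value * b.value, ((a, b.value), (b, a.value)))

    def sin(self, a):
        return self.new(math.sin(a.value), ((a, math.cos(a.value)),))

    def backward(self, out):
        out.grad = 1.0
        for node in reversed(self.nodes):
            for parent, partial in node.parents:
                parent.grad += partial * node.grad

def fine_grained(tape, x):
    """f(x) = x*sin(x) + x*x, differentiated one operation at a time."""
    s = tape.sin(x)
    return tape.add(tape.mul(x, s), tape.mul(x, x))

def coarsened(tape, x):
    """Same f, recorded as ONE node with a closed-form derivative.

    f'(x) = sin(x) + x*cos(x) + 2x was derived by hand here; it stands in
    for the output of a symbolic differentiation step.
    """
    v = x.value
    value = v * math.sin(v) + v * v
    dvalue = math.sin(v) + v * math.cos(v) + 2.0 * v
    return tape.new(value, ((x, dvalue),))

def grad_and_tape_size(build, x0):
    """Return (df/dx at x0, number of tape nodes allocated)."""
    tape = Tape()
    x = tape.new(x0)
    tape.backward(build(tape, x))
    return x.grad, len(tape.nodes)
```

Calling `grad_and_tape_size(fine_grained, 1.5)` and `grad_and_tape_size(coarsened, 1.5)` yields the same gradient, but the fine-grained version allocates five tape nodes (one input and four operations) while the coarsened version allocates two. The sketch covers only straight-line code; handling branches and loops is what the paper attributes to 𝜙-calculus.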

Supplementary Material

Auxiliary Presentation Video (oopsla21main-p142-p-video.mp4)
This is the presentation video of our OOPSLA 2021 paper, "Coarsening Optimization for Differentiable Programming". The paper offers a systematic way to synergize symbolic differentiation and algorithmic differentiation (AD). Through it, the granularity of the computations differentiated by each step in AD can become much larger than a single operation, leading to speedups of up to two orders of magnitude. To circumvent the difficulties that control flow creates for symbolic differentiation in coarsening, this work introduces 𝜙-calculus, a novel method that allows symbolic reasoning about, and differentiation of, computations that involve branches and loops. It further avoids "expression swell" in symbolic differentiation and balances reuse and coarsening through the design of reuse-centric segment-of-interest identification.

Published In

Proceedings of the ACM on Programming Languages, Volume 5, Issue OOPSLA
October 2021
2001 pages
EISSN: 2475-1421
DOI: 10.1145/3492349
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. SSA
2. calculus
3. compiler
4. differentiable programming
5. program optimizations
