
Smoothness Analysis for Probabilistic Programs with Application to Optimised Variational Inference

Published: 11 January 2023

Abstract

We present a static analysis for discovering differentiable or, more generally, smooth parts of a given probabilistic program, and show how the analysis can be used to improve the pathwise gradient estimator, one of the most popular methods for posterior inference and model learning. Our improvement extends the scope of the estimator from differentiable models to non-differentiable ones without requiring manual intervention by the user; the improved estimator automatically identifies differentiable parts of a given probabilistic program using our static analysis, and applies the pathwise gradient estimator to the identified parts while using a more general but less efficient estimator, called the score estimator, for the rest of the program. Our analysis has a surprisingly subtle soundness argument, partly due to the misbehaviours of some target smoothness properties when viewed from the perspective of program analysis designers. For instance, some smoothness properties, such as partial differentiability and partial continuity, are not preserved by function composition, and this makes it difficult to analyse sequential composition soundly without heavily sacrificing precision. We formulate five assumptions on a target smoothness property, prove the soundness of our analysis under those assumptions, and show that our leading examples satisfy these assumptions. We also show that by using information from our analysis instantiated for differentiability, our improved gradient estimator satisfies an important differentiability requirement and thus computes the correct estimate on average (i.e., returns an unbiased estimate) under a regularity condition. Our experiments with representative probabilistic programs in the Pyro language show that our static analysis identifies smooth parts of those programs accurately, and enables our improved pathwise gradient estimator to exploit all the opportunities for high performance in those programs.
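The two estimators the abstract contrasts can be illustrated on a toy objective. The sketch below is not the paper's implementation; the objective f(x) = x², the parameter value `theta`, and the sample count `n` are chosen purely for illustration. It estimates d/dθ E_{x∼N(θ,1)}[x²] = 2θ both ways: the pathwise estimator differentiates through the reparameterisation x = θ + ε (and therefore needs f to be differentiable), while the score estimator weights f(x) by ∂_θ log p(x; θ) and needs no smoothness of f.

```python
import random

# Toy illustration (not the paper's method): estimate
#   d/dtheta E_{x ~ N(theta, 1)}[x^2] = 2*theta
# with the two gradient estimators contrasted in the abstract.
random.seed(0)
theta, n = 1.5, 200_000
samples = [theta + random.gauss(0.0, 1.0) for _ in range(n)]  # x = theta + eps

# Pathwise (reparameterisation) estimator: differentiate through the
# sample, d/dtheta f(theta + eps) = f'(x) = 2x. Requires f differentiable.
pathwise = sum(2.0 * x for x in samples) / n

# Score (REINFORCE) estimator: E[f(x) * d/dtheta log N(x; theta, 1)]
#   = E[x^2 * (x - theta)]. Works for non-smooth f, but is noisier.
score = sum(x * x * (x - theta) for x in samples) / n

print(f"true {2 * theta:.3f}  pathwise {pathwise:.3f}  score {score:.3f}")
```

The pathwise estimate typically has much lower variance than the score estimate, which is why extending its scope to the smooth parts of a program, as the paper's hybrid estimator does, pays off.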


Cited By

  • (2025) Inference Plans for Hybrid Particle Filtering. Proceedings of the ACM on Programming Languages 9(POPL), 271–299. https://doi.org/10.1145/3704846
  • (2024) Probabilistic Programming with Programmable Variational Inference. Proceedings of the ACM on Programming Languages 8(PLDI), 2123–2147. https://doi.org/10.1145/3656463
  • (2023) ωPAP Spaces: Reasoning Denotationally About Higher-Order, Recursive Probabilistic and Differentiable Programs. 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 1–14. https://doi.org/10.1109/LICS56636.2023.10175739
Published In

Proceedings of the ACM on Programming Languages, Volume 7, Issue POPL
January 2023, 2196 pages
EISSN: 2475-1421
DOI: 10.1145/3554308

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. probabilistic programming
  2. smoothness
  3. static analysis
  4. variational inference

Qualifiers

  • Research-article

