[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article
Open access

Symbolic Disintegration with a Variety of Base Measures

Published: 19 May 2020 Publication History

Abstract

Disintegration is a relation on measures and a transformation on probabilistic programs that generalizes density calculation and conditioning, two operations widely used for exact and approximate inference. Existing program transformations that find a disintegration or density automatically are limited to a fixed base measure that is an independent product of Lebesgue and counting measures, so they are of no help in practical cases that require tricky reasoning about other base measures. We present the first disintegrator that handles variable base measures, including discrete-continuous mixtures, dependent products, and disjoint sums. By analogy with type inference, our disintegrator can check a given base measure as well as infer an unknown one that is principal. We derive the disintegrator and prove it sound by equational reasoning from semantic specifications. It succeeds in a variety of applications where disintegration and density calculation had not been previously mechanized.

References

[1]
Nathanael L. Ackerman, Cameron E. Freer, and Daniel M. Roy. 2011. Noncomputable conditional distributions. In Proceedings of the 26th Symposium on Logic in Computer Science (LICS’11). IEEE Computer Society Press, 107--116.
[2]
Nathanael L. Ackerman, Cameron E. Freer, and Daniel M. Roy. 2017. On computability and disintegration. Math. Struct. Comput. Sci. 27, 8 (2017), 1287--1314.
[3]
Hadi Mohasel Afshar, Scott Sanner, and Christfried Webers. 2016. Closed-form Gibbs sampling for graphical models with algebraic constraints. In Proceedings of the 30th AAAI Conference on Artificial Intelligence. AAAI Press.
[4]
Amal Ahmed and Matthias Blume. 2011. An equivalence-preserving CPS translation via multi-language semantics. In Proceedings of the ACM SIGPLAN International Conference on Functional Programming, Manuel M. T. Chakravarty, Zhenjiang Hu, and Olivier Danvy (Eds.). ACM Press, 431--444.
[5]
Sooraj Bhat, Ashish Agarwal, Richard Vuduc, and Alexander Gray. 2012. A type theory for probability density functions. In Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, 545--556.
[6]
Sooraj Bhat, Johannes Borgström, Andrew D. Gordon, and Claudio V. Russo. 2013. Deriving probability density functions from probabilistic functional programs. In Proceedings of the 19th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’13), Nir Piterman and Scott A. Smolka (Eds.). Springer, 508--522.
[7]
Richard S. Bird and Oege de Moor. 1996. Algebra of Programming. Prentice-Hall.
[8]
Anders Bondorf. 1992. Improving binding times without explicit CPS-conversion. In Proceedings of the ACM Conference on Lisp and Functional Programming (Lisp Pointers), William D. Clinger (Ed.), Vol. V(1). ACM Press, 1--10.
[9]
Johannes Borgström, Ugo Dal Lago, Andrew D. Gordon, and Marcin Szymczak. 2016. A lambda-calculus foundation for universal probabilistic programming. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, Jacques Garrigue, Gabriele Keller, and Eijiro Sumii (Eds.). ACM Press, 33--46.
[10]
Wray L. Buntine. 1994. Operations for learning with graphical models. J. Artific. Intell. Res. 2 (1994), 159--225.
[11]
Jacques Carette and Chung-chieh Shan. 2016. Simplifying probabilistic programs using computer algebra. In Proceedings of the 18th International Symposium on Practical Aspects of Declarative Languages (PADL’16) (Lecture Notes in Computer Science), Marco Gavanelli and John H. Reppy (Eds.). Springer, 135--152.
[12]
Bob Carpenter, Andrew Gelman, Matthew Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. Stan: A probabilistic programming language. J. Stat. Softw. 76, 1 (2017), 1--32.
[13]
Joseph T. Chang and David Pollard. 1997. Conditioning as disintegration. Statistica Neerlandica 51, 3 (1997), 287--317.
[14]
Thomas M. Cover and Joy A. Thomas. 2006. Elements of Information Theory (2nd ed.). Wiley.
[15]
Ryan Culpepper and Andrew Cobb. 2017. Contextual equivalence for probabilistic programs with continuous random variables and scoring. In Proceedings of the 26th European Symposium on Programming Languages and Systems (ESOP’17) (Lecture Notes in Computer Science), Yang Hongseok (Ed.). Springer, 368--392.
[16]
Marco F. Cusumano-Towner, Feras A. Saad, Alexander K. Lew, and Vikash K. Mansinghka. 2019. Gen: A general-purpose probabilistic programming system with programmable inference. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, Kathryn S. McKinley and Kathleen Fisher (Eds.). ACM Press, 221--236.
[17]
Olivier Danvy and Andrzej Filinski. 1990. Abstracting control. In Proceedings of the ACM Conference on Lisp and Functional Programming. ACM Press, 151--160.
[18]
Bruno de Finetti. 1970. Teoria delle Probabilità: Sintesi Introduttiva con Appendice Critica. Vol. 1. Giulio Einaudi, Torino. Translated as de Finetti 1974.
[19]
Bruno de Finetti. 1972. Probability, Induction, and Statistics. Wiley.
[20]
Bruno de Finetti. 1974. Theory of Probability: A Critical Introductory Treatment. Vol. 1. Wiley.
[21]
Luc Devroye. 1986. Non-Uniform Random Variate Generation. Springer.
[22]
Jean Dieudonné. 1947–1948. Sur le Théorème de Lebesgue-Nikodym (III). Annales de l’université de Grenoble 23 (1947–1948), 25--53. http://eudml.org/doc/84619.
[23]
Sebastian Fischer, Josep Silva, Salvador Tamarit, and Germán Vidal. 2008. Preserving sharing in the partial evaluation of lazy functional programs. In Proceedings of the 17th International Symposium on Logic-based Program Synthesis and Transformation (LOPSTR’07) (Lecture Notes in Computer Science), Andy King (Ed.). Springer, 74--89.
[24]
Weihao Gao, Sreeram Kannan, Sewoong Oh, and Pramod Viswanath. 2017. Estimating mutual information for discrete-continuous mixtures. In Advances in Neural Information Processing Systems, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). Curran Associates, 5986--5997.
[25]
Timon Gehr, Sasa Misailovic, and Martin T. Vechev. 2016. PSI: Exact symbolic inference for probabilistic programs. In Proceedings of the 28th International Conference on Computer Aided Verification, Part I (Lecture Notes in Computer Science), Swarat Chaudhuri and Azadeh Farzan (Eds.). Springer, 62--83.
[26]
Alan E. Gelfand, Adrian F. M. Smith, and Tai-Ming Lee. 1992. Bayesian analysis of constrained parameter and truncated data problems using Gibbs sampling. J. Amer. Statist. Assoc. 87, 418 (1992), 523--532.
[27]
Stuart Geman and Donald Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6, 6 (1984), 721--741.
[28]
Michèle Giry. 1982. A categorical approach to probability theory. In Proceedings of an International Conference on Categorical Aspects of Topology and Analysis, Bernhard Banaschewski (Ed.). Springer, 68--85.
[29]
Noah D. Goodman, Vikash K. Mansinghka, Daniel Roy, Keith Bonawitz, and Joshua B. Tenenbaum. 2008. Church: A language for generative models. In Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, David Allen McAllester and Petri Myllymäki (Eds.). AUAI Press, 220--229.
[30]
N. J. Gordon, D. J. Salmond, and A. F. M. Smith. 1993. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F (Radar and Signal Processing) 140, 2 (1993), 107--113.
[31]
Peter J. Green. 1995. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 4 (1995), 711--732.
[32]
W. Keith Hastings. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 1 (1970), 97--109.
[33]
Chris Heunen, Ohad Kammar, Sam Staton, and Hongseok Yang. 2017. A convenient category for higher-order probability theory. In Proceedings of the 32nd Symposium on Logic in Computer Science (LICS’17). IEEE Computer Society Press, 1--12.
[34]
Gérard Huet. 1976. Résolution d’Équations dans des Langages d’Ordre . Thèse de doctorat es sciences mathématiques. Université Paris VII.
[35]
Gérard Huet and Bernard Lang. 1978. Proving and applying program transformations expressed with second-order patterns. Acta Informatica 11, 1 (1978), 31--55.
[36]
John Hughes. 1995. The design of a pretty-printing library. In Proceedings of the 1st International Spring School on Advanced Functional Programming Techniques, Johan Jeuring and Erik Meijer (Eds.). Number 925 in Lecture Notes in Computer Science. Springer, 53--96.
[37]
Graham Hutton and Erik Meijer. 1996. Back to basics: Deriving representation changers functionally. J. Funct. Program. 6, 1 (1996), 181--188.
[38]
Neil D. Jones, Carsten K. Gomard, and Peter Sestoft. 1993. Partial Evaluation and Automatic Program Generation. Prentice-Hall.
[39]
Jesper Jørgensen. 1992. Generating a compiler for a lazy language by partial evaluation. In Proceedings of the 19th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Ravi Sethi (Ed.). ACM Press, 258--268.
[40]
Herman Kahn and T. E. Harris. 1951. Estimation of particle transmission by random sampling. Nat. Bureau Stand. Appl. Math. Ser. 12 (1951), 27--30.
[41]
Anders Kock. 2012. Commutative monads as a theory of distributions. Theory Appl. Cat. 26, 4 (2012), 97--131.
[42]
Julia L. Lawall and Olivier Danvy. 1994. Continuation-based partial evaluation. In Proceedings of the ACM Conference on Lisp and Functional Programming. ACM Press, 227--238.
[43]
David J. Lunn, Andrew Thomas, Nicky Best, and David Spiegelhalter. 2000. WinBUGS—A Bayesian modelling framework: Concepts, structure, and extensibility. Statist. Comput. 10, 4 (2000), 325--337.
[44]
David J. C. MacKay. 1998. Introduction to Monte Carlo methods. In Learning and Inference in Graphical Models, Michael I. Jordan (Ed.). Kluwer.
[45]
Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. 1953. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 6 (1953), 1087--1092.
[46]
Wazim Mohammed Ismail and Chung-chieh Shan. 2016. Deriving a probability density calculator (functional pearl). In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, Jacques Garrigue, Gabriele Keller, and Eijiro Sumii (Eds.). ACM Press, 47--59.
[47]
Praveen Narayanan, Jacques Carette, Wren Romano, Chung-chieh Shan, and Robert Zinkov. 2016. Probabilistic inference by program transformation in Hakaru (system description). In Proceedings of the 13th International Symposium on Functional and Logic Programming (FLOPS’16) (Lecture Notes in Computer Science), Oleg Kiselyov and Andy King (Eds.). Springer, 62--79.
[48]
Praveen Narayanan and Chung-chieh Shan. 2017. Symbolic conditioning of arrays in probabilistic programs. Proc. ACM Program. Lang. 1, ICFP (2017), 11:1–11:25.
[49]
Otton Nikodym. 1930. Sur une Généralisation des Intégrales de M. J. Radon. Fundamenta Mathematicae 15 (1930), 131--179.
[50]
Avi Pfeffer. 2009. CTPPL: A continuous time probabilistic programming language. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, Craig Boutilier (Ed.). 1943--1950.
[51]
Benjamin C. Pierce. 2002. Types and Programming Languages. MIT Press.
[52]
Gordon Plotkin and John Power. 2003. Algebraic operations and generic effects. Appl. Categor. Struct. 11, 1 (2003), 69--94.
[53]
Gordon D. Plotkin. 1975. Call-by-name, call-by-value and the -calculus. Theoret. Comput. Sci. 1, 2 (1975), 125--159.
[54]
David Pollard. 2001. A User’s Guide to Measure Theoretic Probability. Cambridge University Press.
[55]
Norman Ramsey and Avi Pfeffer. 2002. Stochastic lambda calculus and monads of probability distributions. In Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, 154--165.
[56]
John C. Reynolds. 1972. Definitional interpreters for higher-order programming languages. In Proceedings of the ACM National Conference, Vol. 2. ACM Press, 717--740. Reprinted with introduction in Higher-Order and Symbolic Computation 11, 4 (1998), 363–397.
[57]
David A. Roberts, Marcus Gallagher, and Thomas Taimre. 2019. Reversible jump probabilistic programming. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS’19) (Proceedings of Machine Learning Research), Kamalika Chaudhuri and Masashi Sugiyama (Eds.). 634--643.
[58]
Halsey L. Royden. 1988. Real Analysis (3rd ed.). Prentice-Hall.
[59]
Tetsuya Sato, Alejandro Aguirre, Gilles Barthe, Marco Gaboardi, Deepak Garg, and Justin Hsu. 2019. Formal verification of higher-order probabilistic programs: Reasoning about approximation, convergence, Bayesian inference, and optimization. Proc. ACM Program. Lang. 3, POPL (2019), 38:1–38:30.
[60]
Adam Ścibior, Ohad Kammar, Matthijs Vákár, Sam Staton, Hongseok Yang, Yufei Cai, Klaus Ostermann, Sean K. Moss, Chris Heunen, and Zoubin Ghahramani. 2018. Denotational validation of higher-order Bayesian inference. Proc. ACM Program. Lang. 2, POPL (2018), 60:1–60:29.
[61]
Chung-chieh Shan and Norman Ramsey. 2017. Exact Bayesian inference by symbolic disintegration. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, Giuseppe Castagna and Andrew D. Gordon (Eds.). ACM Press, 130--144.
[62]
Robert J. Shiller. 1999. The ET interview: Professor James Tobin. Econometric Theory 15, 6 (1999), 867--900.
[63]
Sam Staton. 2017. Commutative semantics for probabilistic programming. In Proceedings of the 26th European Symposium on Programming Languages and Systems (ESOP’17) (Lecture Notes in Computer Science), Yang Hongseok (Ed.). Springer, 855--879.
[64]
Robert D. Tennent. 1973. Mathematical semantics of SNOBOL 4. In Proceedings of the ACM Symposium on Principles of Programming Languages, Patrick C. Fischer and Jeffrey D. Ullman (Eds.). ACM Press, 95--107.
[65]
Hayo Thielecke. 2003. From control effects to typed continuation passing. In Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, 139--149.
[66]
Luke Tierney. 1998. A note on Metropolis-Hastings kernels for general state spaces. Ann. Appl. Probabil. 8, 1 (1998), 1--9.
[67]
James Tobin. 1958. Estimation of relationships for limited dependent variables. Econometrica 26, 1 (1958), 24--36.
[68]
Matthijs Vákár, Ohad Kammar, and Sam Staton. 2019. A domain theory for statistical probabilistic programming. Proc. ACM Program. Lang. 3, POPL (2019), 36:1–36:29.
[69]
Matthijs Vákár and Luke Ong. 2018. On s-finite measures and kernels. e-Print 1810.01837. Retrieved from https://arxiv.org/abs/1810.01837
[70]
Rajan Walia, Praveen Narayanan, Jacques Carette, Sam Tobin-Hochstadt, and Chung-chieh Shan. 2019. From high-level inference algorithms to efficient code. Proc. ACM Program. Lang. 3, ICFP (2019), 98:1–98:30.
[71]
Mitchell Wand. 1987a. Complete type inference for simple objects. In Proceedings of the Symposium on Logic in Computer Science (LICS’87). IEEE Computer Society Press, 37--44.
[72]
Mitchell Wand. 1987b. A simple algorithm and proof for type inference. Fundamenta Informaticae 10, 2 (1987), 115--122.
[73]
Mitchell Wand, Ryan Culpepper, Theophilos Giannakopoulos, and Andrew Cobb. 2018. Contextual equivalence for a probabilistic language with continuous random variables and recursion. Proc. ACM Program. Lang. 2, ICFP (2018), 87:1–87:30.
[74]
David Wingate, Andreas Stuhlmüller, and Noah D. Goodman. 2011. Lightweight implementations of probabilistic programming languages via transformational compilation. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS’11) (JMLR Workshop and Conference Proceedings), Geoffrey Gordon, David Dunson, and Miroslav Dudík (Eds.). 770--778.
[75]
Frank Wood, Jan Willem van de Meent, and Vikash Mansinghka. 2014. A new approach to probabilistic programming inference. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS’14) (JMLR Workshop and Conference Proceedings). 1024--1032.
[76]
Yi Wu, Siddharth Srivastava, Nicholas Hay, Simon Du, and Stuart Russell. 2018. Discrete-continuous mixtures in probabilistic programming: Generalized semantics and inference algorithms. In Proceedings of the 35th International Conference on Machine Learning (Proceedings of Machine Learning Research), Jennifer Dy and Andreas Krause (Eds.), Vol. 80. 5339--5348.
[77]
Robert Zinkov and Chung-chieh Shan. 2017. Composing inference algorithms as program transformations. In Proceedings of the Conference on Uncertainty in Artificial Intelligence.

Cited By

View all
  • (2023)Probabilistic Programming with Stochastic ProbabilitiesProceedings of the ACM on Programming Languages10.1145/35912907:PLDI(1708-1732)Online publication date: 6-Jun-2023
  • (2023)Affine Monads and Lazy Structures for Bayesian ProgrammingProceedings of the ACM on Programming Languages10.1145/35712397:POPL(1338-1368)Online publication date: 11-Jan-2023
  • (2023)ωPAP Spaces: Reasoning Denotationally About Higher-Order, Recursive Probabilistic and Differentiable Programs2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS)10.1109/LICS56636.2023.10175739(1-14)Online publication date: 26-Jun-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Programming Languages and Systems
ACM Transactions on Programming Languages and Systems  Volume 42, Issue 2
June 2020
286 pages
ISSN:0164-0925
EISSN:1558-4593
DOI:10.1145/3395960
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 May 2020
Online AM: 07 May 2020
Accepted: 01 November 2019
Revised: 01 July 2019
Received: 01 February 2019
Published in TOPLAS Volume 42, Issue 2

Check for updates

Author Tags

  1. Probabilistic programs
  2. conditional distributions
  3. density functions
  4. measure kernels

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • DARPA

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)318
  • Downloads (Last 6 weeks)50
Reflects downloads up to 24 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Probabilistic Programming with Stochastic ProbabilitiesProceedings of the ACM on Programming Languages10.1145/35912907:PLDI(1708-1732)Online publication date: 6-Jun-2023
  • (2023)Affine Monads and Lazy Structures for Bayesian ProgrammingProceedings of the ACM on Programming Languages10.1145/35712397:POPL(1338-1368)Online publication date: 11-Jan-2023
  • (2023)ωPAP Spaces: Reasoning Denotationally About Higher-Order, Recursive Probabilistic and Differentiable Programs2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS)10.1109/LICS56636.2023.10175739(1-14)Online publication date: 26-Jun-2023
  • (2021)SPPL: probabilistic programming with fast exact symbolic inferenceProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3453483.3454078(804-819)Online publication date: 19-Jun-2021
  • (2021)Densities of Almost Surely Terminating Probabilistic Programs are Differentiable Almost EverywhereProgramming Languages and Systems10.1007/978-3-030-72019-3_16(432-461)Online publication date: 27-Mar-2021
  • (2020)Continualization of Probabilistic Programs With CorrectionProgramming Languages and Systems10.1007/978-3-030-44914-8_14(366-393)Online publication date: 27-Apr-2020

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media