DOI: 10.5555/3524938.3525280

Automatic reparameterisation of probabilistic programs

Published: 13 July 2020

Abstract

Probabilistic programming has emerged as a powerful paradigm in statistics, applied science, and machine learning: by decoupling modelling from inference, it promises to allow modellers to directly reason about the processes generating data. However, the performance of inference algorithms can be dramatically affected by the parameterisation used to express a model, requiring users to transform their programs in non-intuitive ways. We argue for automating these transformations, and demonstrate that mechanisms available in recent modelling frameworks can implement noncentring and related reparameterisations. This enables new inference algorithms, and we propose two: a simple approach using interleaved sampling and a novel variational formulation that searches over a continuous space of parameterisations. We show that these approaches enable robust inference across a range of models, and can yield more efficient samplers than the best fixed parameterisation.
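As a rough, generic illustration (not the paper's implementation, which relies on mechanisms in recent modelling frameworks), the non-centred reparameterisation replaces a direct draw θ ~ N(μ, τ) with a standard-normal auxiliary draw that is then shifted and scaled deterministically. The NumPy sketch below uses illustrative names in a toy hierarchical setting:

```python
# Minimal sketch of centred vs. non-centred parameterisation of a
# hierarchical-model latent; names and setting are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def centred_sample(mu, log_tau, n):
    """Centred: draw theta ~ N(mu, tau) directly."""
    return rng.normal(mu, np.exp(log_tau), size=n)

def noncentred_sample(mu, log_tau, n):
    """Non-centred: draw a standard-normal auxiliary variable,
    then shift and scale it deterministically."""
    theta_raw = rng.normal(0.0, 1.0, size=n)    # theta_raw ~ N(0, 1)
    return mu + np.exp(log_tau) * theta_raw     # theta = mu + tau * theta_raw

# Both functions target the same marginal distribution over theta.
mu, log_tau = 0.0, -2.0
print(centred_sample(mu, log_tau, 5))
print(noncentred_sample(mu, log_tau, 5))
```

The two parameterisations define the same model, but the geometry an inference algorithm sees differs, which is why the choice between them (or a continuous interpolation between them, as in the paper's variational formulation) can strongly affect sampler efficiency.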

Supplementary Material

Supplemental material: 3524938.3525280_supp.pdf



Published In

ICML'20: Proceedings of the 37th International Conference on Machine Learning
July 2020
11702 pages

Publisher

JMLR.org
