DOI: 10.5555/3618408.3618687
research-article

Image generation with shortest path diffusion

Published: 23 July 2023

Abstract

The field of image generation has made significant progress thanks to the introduction of Diffusion Models, which learn to progressively reverse a given image corruption. Recently, a few studies introduced alternative ways of corrupting images in Diffusion Models, with an emphasis on blurring. However, these studies are purely empirical, and it remains unclear what the optimal procedure for corrupting an image is. In this work, we hypothesize that the optimal procedure minimizes the length of the path taken when corrupting an image towards a given final state. We propose the Fisher metric for the path length, measured in the space of probability distributions. We compute the shortest path according to this metric, and we show that it corresponds to a combination of image sharpening, rather than blurring, and noise deblurring. Whereas the corruption was chosen arbitrarily in previous work, our Shortest Path Diffusion (SPD) uniquely determines the entire spatiotemporal structure of the corruption. We show that SPD improves on strong baselines without any hyperparameter tuning and outperforms all previous Diffusion Models based on image blurring. Furthermore, any small deviation from the shortest path leads to worse performance, suggesting that SPD provides the optimal procedure for corrupting images. Our work sheds new light on observations made in recent works and provides a new approach to improving diffusion models on images and other types of data.
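To make the contrast concrete, the sketch below compares a standard scalar-schedule forward corruption (the DDPM form, q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I)) with a frequency-dependent corruption applied in Fourier space. The per-frequency weighting here is a hypothetical illustration of the general idea that different spatial frequencies can be corrupted at different rates; it is not the schedule derived by SPD, which is fixed by minimizing the Fisher path length.

```python
import numpy as np

rng = np.random.default_rng(0)

def ddpm_corrupt(x0, abar_t):
    """Scalar-schedule corruption: every spatial frequency is scaled
    by the same factor sqrt(abar_t) before Gaussian noise is added."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps

def frequency_weighted_corrupt(x0, abar_t):
    """Frequency-dependent corruption sketch: the signal is scaled in
    Fourier space by a weight that varies with spatial frequency.
    The weight below (attenuating low frequencies relative to high
    ones, which resembles sharpening rather than blurring) is an
    assumption made for illustration only."""
    X = np.fft.fft2(x0)
    # Radial spatial frequency for each 2D Fourier coefficient.
    freqs = np.sqrt(np.add.outer(np.fft.fftfreq(x0.shape[0]) ** 2,
                                 np.fft.fftfreq(x0.shape[1]) ** 2))
    weight = np.sqrt(abar_t) * (1.0 + freqs) / (1.0 + freqs.max())
    eps = rng.standard_normal(x0.shape)
    return np.real(np.fft.ifft2(weight * X)) + np.sqrt(1.0 - abar_t) * eps

x0 = rng.standard_normal((8, 8))
print(ddpm_corrupt(x0, 0.5).shape)                # scalar schedule
print(frequency_weighted_corrupt(x0, 0.5).shape)  # per-frequency schedule
```

The key design point is that a scalar schedule is a single knob per timestep, while a frequency-dependent schedule is a function of spatial frequency per timestep; SPD's contribution is that this function need not be chosen by hand, since the shortest-path criterion determines it uniquely.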



Published In

ICML'23: Proceedings of the 40th International Conference on Machine Learning
July 2023
43479 pages

Publisher

JMLR.org
