Computer Science > Computer Vision and Pattern Recognition

arXiv:2211.17106 (cs)

[Submitted on 27 Nov 2022]

Title:Diffusion Probabilistic Model Made Slim

Authors:Xingyi Yang, Daquan Zhou, Jiashi Feng, Xinchao Wang

View PDF

Abstract:Despite the recent visually-pleasing results achieved, the massive computational cost has been a long-standing flaw for diffusion probabilistic models (DPMs), which, in turn, greatly limits their applications on resource-limited platforms. Prior methods towards efficient DPM, however, have largely focused on accelerating the testing yet overlooked their huge complexity and sizes. In this paper, we make a dedicated attempt to lighten DPM while striving to preserve its favourable performance. We start by training a small-sized latent diffusion model (LDM) from scratch, but observe a significant fidelity drop in the synthetic images. Through a thorough assessment, we find that DPM is intrinsically biased against high-frequency generation, and learns to recover different frequency components at different time-steps. These properties make compact networks unable to represent frequency dynamics with accurate high-frequency estimation. Towards this end, we introduce a customized design for slim DPM, which we term as Spectral Diffusion (SD), for light-weight image synthesis. SD incorporates wavelet gating in its architecture to enable frequency dynamic feature extraction at every reverse steps, and conducts spectrum-aware distillation to promote high-frequency recovery by inverse weighting the objective based on spectrum magni tudes. Experimental results demonstrate that, SD achieves 8-18x computational complexity reduction as compared to the latent diffusion models on a series of conditional and unconditional image generation tasks while retaining competitive image fidelity.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Cite as:	arXiv:2211.17106 [cs.CV]
	(or arXiv:2211.17106v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2211.17106

Submission history

From: Xingyi Yang [view email]
[v1] Sun, 27 Nov 2022 16:27:28 UTC (16,866 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:Diffusion Probabilistic Model Made Slim

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:Diffusion Probabilistic Model Made Slim

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators