Computer Science > Computer Vision and Pattern Recognition

arXiv:2311.17137 (cs)

[Submitted on 28 Nov 2023 (v1), last revised 16 Oct 2024 (this version, v3)]

Title: Generative Models: What Do They Know? Do They Know Things? Let's Find Out!

Authors: Xiaodan Du, Nicholas Kolkin, Greg Shakhnarovich, Anand Bhattad

Abstract: Generative models excel at mimicking real scenes, suggesting they might inherently encode important intrinsic scene properties. In this paper, we aim to explore the following key questions: (1) What intrinsic knowledge do generative models like GANs, Autoregressive models, and Diffusion models encode? (2) Can we establish a general framework to recover intrinsic representations from these models, regardless of their architecture or model type? (3) How minimal can the required learnable parameters and labeled data be to successfully recover this knowledge? (4) Is there a direct link between the quality of a generative model and the accuracy of the recovered scene intrinsics?
Our findings indicate that small Low-Rank Adapters (LoRA) can recover intrinsic images (depth, normals, albedo and shading) across different generators (Autoregressive, GANs and Diffusion) while using the same decoder head that generates the image. As LoRA is lightweight, we introduce very few learnable parameters (as few as 0.04% of Stable Diffusion model weights for a rank of 2), and we find that as few as 250 labeled images are enough to generate intrinsic images with these LoRA modules. Finally, we also show a positive correlation between the generative model's quality and the accuracy of the recovered intrinsics through control experiments.
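
To illustrate why a low-rank adapter adds so few learnable parameters, here is a minimal sketch of the generic LoRA idea in plain Python. It is not the authors' implementation: the function names, the 768-wide layer, and the `scale` factor are assumptions chosen for illustration, and the single-layer fraction it computes is not meant to reproduce the 0.04% figure quoted above, which refers to the whole Stable Diffusion model.

```python
# Generic LoRA sketch (illustrative only, not the paper's code).
# A frozen weight W (d_out x d_in) is adapted as W + scale * B @ A,
# where A is (rank x d_in) and B is (d_out x rank). Only A and B are trained.

def lora_param_count(d_in, d_out, rank):
    """Number of trainable parameters in one LoRA pair (A and B)."""
    return rank * (d_in + d_out)

def lora_fraction(d_in, d_out, rank):
    """Trainable LoRA parameters relative to the frozen layer's weights."""
    return lora_param_count(d_in, d_out, rank) / (d_in * d_out)

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, scale=1.0):
    """Frozen layer output plus the low-rank correction B(Ax)."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]

if __name__ == "__main__":
    # Hypothetical 768 x 768 layer with rank 2 (sizes are assumptions).
    print(lora_param_count(768, 768, 2))
    print(lora_fraction(768, 768, 2))
```

In this sketch, setting `B` to all zeros leaves the frozen layer's output unchanged, which is the usual starting point for LoRA training; the adapter then learns only the small correction.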

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
Cite as:	arXiv:2311.17137 [cs.CV]
	(or arXiv:2311.17137v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.17137

Submission history

From: Xiaodan Du
[v1] Tue, 28 Nov 2023 18:59:02 UTC (47,865 KB)
[v2] Mon, 24 Jun 2024 01:42:55 UTC (24,304 KB)
[v3] Wed, 16 Oct 2024 07:08:57 UTC (11,201 KB)
