DAAM with mu #40
Hey, thanks. I may be wrong, as I'm not too familiar with the InstructPix2Pix architecture, but I think focusing on the cross-attention heads between the text-derived key embeddings and the usual latent embeddings could work. If the attention key vectors are instead a concatenation of text embeddings and, say, image embeddings, then you could look at cross-attention restricted to the text dimensions. If the text and image embeddings are inseparable (e.g., multimodal fusion), then that would likely fall outside the scope of DAAM/cross-attention and require a separate set of techniques.
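The idea above (restricting attention to the text portion of concatenated keys) might be sketched roughly as follows. This is a hypothetical illustration, not code from DAAM or InstructPix2Pix; all names and shapes here are made up for the example.

```python
import torch

def text_restricted_attention_map(queries, text_keys, image_keys):
    """Attend over concatenated [text; image] keys, then keep only the
    attention mass attributed to the text tokens.

    queries:    (B, n_latent, d)  latent query vectors
    text_keys:  (B, n_text, d)    keys from the text embeddings
    image_keys: (B, n_image, d)   keys from the image embeddings
    returns:    (B, n_latent, n_text) text-restricted attention map
    """
    keys = torch.cat([text_keys, image_keys], dim=1)    # (B, n_text + n_image, d)
    d = queries.shape[-1]
    scores = queries @ keys.transpose(1, 2) / d ** 0.5  # scaled dot-product scores
    attn = scores.softmax(dim=-1)                       # softmax over ALL key positions
    n_text = text_keys.shape[1]
    return attn[:, :, :n_text]                          # slice out the text dimensions

# Toy shapes: batch 1, 64 latent positions, 8 text tokens, 16 image tokens, dim 32.
q = torch.randn(1, 64, 32)
kt = torch.randn(1, 8, 32)
ki = torch.randn(1, 16, 32)
maps = text_restricted_attention_map(q, kt, ki)
print(maps.shape)  # torch.Size([1, 64, 8])
```

Note that the softmax is taken over all key positions before slicing, so each latent position's text-attention weights sum to less than one; the remainder is the mass assigned to the image tokens.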
@andreemic Please let me know if you were able to generate cross-attention maps for IP2P or ControlNet. I am trying to visualize cross-attention maps for the Stable Diffusion image-to-image pipeline and am running into the same errors.
Hey! Great job on this repo! Very clean documentation and a useful idea.