Re: Full-Rank Training for Sana 2k model on Human Image Dataset · Issue #120 · NVlabs/Sana · GitHub
Re: Full-Rank Training for Sana 2k model on Human Image Dataset #120

Closed
gitanon112 opened this issue Dec 27, 2024 · 6 comments
Labels: Answered (Answered the question)

Comments

@gitanon112

Hello,

Thank you for your great work with SANA, it's incredible. I just had a couple of questions:

1) If my goal is to make the SANA 2K model (which currently struggles with human image generation) better at generating human images, is it as simple as full-rank training on a dataset of high-quality human image-text pairs? Or is it more complex than this?
2) Am I correct in believing that https://github.com/NVlabs/Sana/blob/main/train_scripts/train.py and https://github.com/NVlabs/Sana/blob/main/train_scripts/train.sh are the full-rank training scripts for SANA, while https://github.com/NVlabs/Sana/blob/main/train_scripts/train_dreambooth_lora_sana.py and https://github.com/NVlabs/Sana/blob/main/train_scripts/train_lora.sh are for DreamBooth LoRA training?
3) SimpleTuner has a guide for full-rank training of the Sana model (https://github.com/bghira/SimpleTuner/blob/main/documentation/quickstart/SANA.md). Should we use it over the scripts you have provided? Will training be faster/"better" if we use the SimpleTuner approach?
4) When training the 2K model, does this mean the dataset/images we train on should be of 2K resolution only? Sorry if it's a dumb question.

Thanks so much for the help, really appreciate all your hard work :)

@shaun-ba

There is no way a model should even exist if it cannot generate humans. If this isn't fixed in the 4K model and all future revisions of the lower-res models, then nobody will use it. In my opinion they will fix this; they just need time.

@nitinmukesh
nitinmukesh commented Dec 28, 2024

It is already mentioned in the roadmap: version 1.5 will have better human image generation.
https://github.com/NVlabs/Sana#to-do-list

Regarding question 4, it is answered in #112.

@lawrence-cj
Collaborator

> 1) If my goal is to make SANA 2k model (which currently struggles with human image generation) capable/better at generating human images, is it as simple as full-rank training on dataset with high-quality human image-text pairs? Or is it more complex than this?
> 2) Am I correct in believing that https://github.com/NVlabs/Sana/blob/main/train_scripts/train.py and https://github.com/NVlabs/Sana/blob/main/train_scripts/train.sh are the full-rank training scripts for SANA, while https://github.com/NVlabs/Sana/blob/main/train_scripts/train_dreambooth_lora_sana.py and https://github.com/NVlabs/Sana/blob/main/train_scripts/train_lora.sh are for dreambooth lora training?

Correct. Training at 2K is exactly the same as at other resolutions. The only difference is that it needs more GPU memory.
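As a rough illustration of the memory point, here is a minimal sketch of how the number of latent tokens grows with image resolution. It assumes the 32x autoencoder downsampling and patch size of 1 described for Sana; neither number is stated in this thread, and `latent_tokens` is a hypothetical helper, not part of the Sana codebase.

```python
def latent_tokens(resolution: int, ae_downsample: int = 32, patch_size: int = 1) -> int:
    """Approximate token count for a square image of the given side length.

    ae_downsample and patch_size are assumed values, not taken from this thread.
    """
    side = resolution // (ae_downsample * patch_size)
    return side * side


if __name__ == "__main__":
    # Compare a few square resolutions; 2048 stands in for "2K" here (an assumption).
    for res in (512, 1024, 2048):
        print(f"{res}px -> {latent_tokens(res)} tokens")
```

Under these assumptions, doubling the side length quadruples the token count, which is why the same training script needs noticeably more GPU memory at 2K.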

@lawrence-cj
Collaborator

> SimpleTuner has a guide for full-rank training Sana model (https://github.com/bghira/SimpleTuner/blob/main/documentation/quickstart/SANA.md). Should we use these over the scripts you have provided. Will training be faster/"better" if we use SimpleTuner approach?

I haven't tested the training script from SimpleTuner personally, so I can't comment on its performance.

@lawrence-cj
Collaborator

> When training the 2K model, does this mean the dataset/images we train on should be of 2k resolution only? Sorry if it's a dumb question

Correct. Better to use 2K images for better performance.
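One way to act on this advice is to drop images that are smaller than the target resolution before training. The sketch below is only an illustration: the 2048-pixel threshold for "2K" is an assumption, `keep_for_2k` is a hypothetical helper, and the (name, width, height) records would in practice come from an image library such as Pillow (`Image.open(path).size`).

```python
from typing import Iterable, List, Tuple

MIN_SIDE = 2048  # assumed meaning of "2K"; adjust to the actual training config


def keep_for_2k(
    records: Iterable[Tuple[str, int, int]], min_side: int = MIN_SIDE
) -> List[str]:
    """Return names of images whose shorter side is at least min_side pixels."""
    return [name for name, width, height in records if min(width, height) >= min_side]


if __name__ == "__main__":
    # Hypothetical records: (file name, width, height).
    sample = [
        ("portrait_a.jpg", 2048, 3072),
        ("portrait_b.jpg", 1024, 1024),
        ("portrait_c.jpg", 4096, 2048),
    ]
    print(keep_for_2k(sample))
```

Filtering on the shorter side avoids upscaling low-resolution images to 2K, which would add no real detail for the model to learn from.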

lawrence-cj added the Answered label on Jan 2, 2025
@gitanon112
Author

Thank you!!
