Re: Full-Rank Training for Sana 2k model on Human Image Dataset #120
Comments
There is no way a model should even exist if it cannot generate humans. If this isn't fixed in the 4K model and in all future revisions of the lower-resolution models, nobody will use it. In my opinion they will fix this; they just need time.
It is already mentioned in the roadmap: version 1.5 will have better human image generation. Regarding # 4, it is answered here
Correct. Training at 2K is exactly the same as at other resolutions. The only difference is that it needs more GPU memory.
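To give a rough sense of why 2K needs more memory, here is a minimal sketch that counts latent tokens per image. It assumes a 32x autoencoder downsampling factor and a patch size of 1; both values are assumptions for illustration and should be checked against the config of the checkpoint you train. The function name `latent_tokens` is hypothetical and not part of the Sana codebase.

```python
def latent_tokens(image_side: int, downsample: int = 32, patch: int = 1) -> int:
    """Number of latent tokens for a square image of the given side length.

    downsample and patch are assumed values, not read from any Sana config.
    """
    side = image_side // (downsample * patch)
    return side * side


tokens_1k = latent_tokens(1024)
tokens_2k = latent_tokens(2048)

# Activation memory grows with the number of tokens per image, so the
# ratio below is a first-order estimate of the extra cost, not a measurement.
print(tokens_1k, tokens_2k, tokens_2k / tokens_1k)
```

Under these assumptions, doubling the side length quadruples the token count, so expect to lower the batch size or use gradient checkpointing when moving from 1K to 2K.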
I haven't tested the training script from SimpleTuner personally. No comment about the performance.
Correct. Better to use 2K images for better performance.
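One way to act on this when preparing a dataset is to drop images that are too small before training. The sketch below is only an illustration: the helper `keep_high_res` and the "shorter side at least 2048 pixels" rule are assumptions, not something the Sana training scripts define or require.

```python
from typing import Iterable, List, Tuple


def keep_high_res(
    sizes: Iterable[Tuple[str, int, int]], min_side: int = 2048
) -> List[str]:
    """Return the names of images whose shorter side is at least min_side.

    Each entry is (file name, width, height). The threshold is an assumed
    rule of thumb for a 2K dataset, not an official requirement.
    """
    return [name for name, width, height in sizes if min(width, height) >= min_side]


# Hypothetical file names and sizes, for illustration only.
dataset = [
    ("portrait_a.jpg", 2048, 3072),
    ("portrait_b.jpg", 1024, 1536),
    ("portrait_c.jpg", 4096, 2160),
]
print(keep_high_res(dataset))
```

In practice the width and height would come from reading each file's header, for example with an image library of your choice.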
Thank you!!
Hello,
Thank you for your great work with SANA, it's incredible. I just had a couple of questions:
1) If my goal is to make the SANA 2K model (which currently struggles with human image generation) better at generating human images, is it as simple as full-rank training on a dataset of high-quality human image-text pairs? Or is it more complex than this?
2) Am I correct in believing that https://github.com/NVlabs/Sana/blob/main/train_scripts/train.py and https://github.com/NVlabs/Sana/blob/main/train_scripts/train.sh are the full-rank training scripts for SANA, while https://github.com/NVlabs/Sana/blob/main/train_scripts/train_dreambooth_lora_sana.py and https://github.com/NVlabs/Sana/blob/main/train_scripts/train_lora.sh are for DreamBooth LoRA training?
3) SimpleTuner has a guide for full-rank training of the Sana model (https://github.com/bghira/SimpleTuner/blob/main/documentation/quickstart/SANA.md). Should we use it over the scripts you have provided? Will training be faster/"better" if we use the SimpleTuner approach?
4) When training the 2K model, does this mean the images we train on should be of 2K resolution only? Sorry if it's a dumb question.
Thanks so much for the help, really appreciate all your hard work :)