Calling get_model_state_dict/set_model_state_dict requires forward pass for _lazy_init #125170
Comments
@pytorchbot label "oncall: distributed"
@mvpatel2000 I'm a little confused by the issue: does the above code fail, or is the issue that _lazy_init is not called on a submodule of the model?
@LucasLLC sorry I was not clear -- I am suggesting _lazy_init should be inside
@mvpatel2000 Can you show the error message? I thought FSDP.state_dict and FSDP.load_state_dict called the
@mvpatel2000 The issue has been fixed in #121544. Can you check if this PR solves the issue?
@fegin yep, that looks good to me! It would be nice to include it in 2.3.1.
🐛 Describe the bug
The new distributed APIs get_model_state_dict/set_model_state_dict require running at least one forward pass in order to call _lazy_init. I believe get/set_model_state_dict (and maybe get/set_optim_state_dict) should call _lazy_init as well?
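The pattern being requested can be sketched with a toy, pure-Python stand-in. This is not FSDP code: the class LazyWrapper, its _is_initialized flag, and both helper functions are hypothetical names invented for illustration, and the RuntimeError is an assumed failure mode (the issue does not show the actual error). The sketch only illustrates a state-dict helper triggering lazy initialization itself instead of relying on an earlier forward pass.

```python
class LazyWrapper:
    """Toy stand-in for a module wrapper that defers setup (hypothetical, not FSDP)."""

    def __init__(self, weights):
        self._weights = dict(weights)
        self._is_initialized = False

    def _lazy_init(self):
        # Idempotent: deferred setup runs once, later calls are no-ops.
        if self._is_initialized:
            return
        self._is_initialized = True

    def forward(self, x):
        # The forward pass is normally what triggers lazy initialization.
        self._lazy_init()
        return sum(w * x for w in self._weights.values())

    def state_dict(self):
        # Assumed failure mode for this sketch: state is unavailable before init.
        if not self._is_initialized:
            raise RuntimeError("state_dict() called before _lazy_init()")
        return dict(self._weights)


def get_state_dict_without_init(module):
    # Works only if a forward pass has already run.
    return module.state_dict()


def get_state_dict_with_init(module):
    # Suggested behavior: the helper triggers lazy init on its own.
    module._lazy_init()
    return module.state_dict()


if __name__ == "__main__":
    fresh = LazyWrapper({"w": 2.0})
    try:
        get_state_dict_without_init(fresh)
    except RuntimeError as err:
        print("without init:", err)
    print("with init:", get_state_dict_with_init(fresh))
```

Because _lazy_init is idempotent in this sketch, calling it from the helper is harmless when a forward pass has already happened.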
Versions
Torch 2.3
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k