In FSDP1, model setup consists of the following consecutive steps:
Model instantiation (CPU memory) -> weight initialization -> FSDP1 Wrapping / Sharding (GPU memory)
The order in FSDP2 is different (weight init and wrapping are swapped):
Model instantiation (meta device or CPU memory) -> FSDP2 Wrapping / Sharding (meta device or CPU memory) -> weight initialization (GPU memory)
A test should verify that the weight initialization (which shares one implementation between FSDP1 and FSDP2) produces weights that follow the specified distributions.
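As a rough sketch of what the statistical part of such a test could look like, the helper below checks whether a flat list of weight values is consistent with a normal distribution of a given mean and standard deviation. The name `matches_normal`, the `num_sigmas` tolerance, and the example std of 0.02 are assumptions for illustration, not part of the modalities code base; the samples drawn with `random.gauss` only stand in for real model weights.

```python
import math
import random
import statistics


def matches_normal(values, mean, std, num_sigmas=5.0):
    """Return True if sample mean and std lie within num_sigmas standard errors of the targets.

    Hypothetical helper: uses the standard error std/sqrt(n) for the mean and the
    large-sample approximation std/sqrt(2n) for the sample standard deviation.
    """
    n = len(values)
    sample_mean = statistics.fmean(values)
    sample_std = statistics.stdev(values)
    mean_tol = num_sigmas * std / math.sqrt(n)
    std_tol = num_sigmas * std / math.sqrt(2 * n)
    return abs(sample_mean - mean) <= mean_tol and abs(sample_std - std) <= std_tol


# Stand-in data: in a real test these values would be the flattened weights
# of the initialized model (assumed std of 0.02 chosen only for this example).
rng = random.Random(0)
good = [rng.gauss(0.0, 0.02) for _ in range(10_000)]
bad = [rng.gauss(0.0, 0.04) for _ in range(10_000)]  # wrong std on purpose

print(matches_normal(good, mean=0.0, std=0.02))
print(matches_normal(bad, mean=0.0, std=0.02))
```

In an actual test, the values would presumably be collected per parameter group after the FSDP1 and FSDP2 setup paths have each run, so that the same check covers both orderings.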
modalities/src/modalities/models/model_factory.py
Line 188 in b620f43