Enhanced Feedback for load_state_dict with strict=False

@albanD

🚀 The feature, motivation and pitch

When developers use load_state_dict with strict=False, they often face significant challenges, especially around debugging weight loading issues. The current behavior does not provide sufficient visibility into what weights were successfully loaded and which keys were unexpected or missing. This creates confusion, as many users unknowingly end up with mismatched or unloaded weights, leading to silent failures and hours of debugging.
This is especially useful for people like me who load partial weights, especially for fine-tuning tasks and domain adaptation. I end up writing my own script to do this all the time and I believe other researchers like me will find it very useful.

Why This is Critical

Lack of Feedback:
- Developers using strict=False are left in the dark about what was actually loaded.
- There’s no direct mechanism to verify mismatches or missing weights without additional manual inspection.
Silent Errors:
- If the weights are not loaded as intended, there’s no clear indication. Many users proceed without realizing their model isn’t correctly initialized.
Common Pitfall:
- This is a recurring issue for many developers. A seemingly harmless choice to use strict=False often results in broken workflows, especially for those less familiar with PyTorch’s internals.

Alternatives

Introduce a argument to log detailed information when load_state_dict is called with strict=False. This feedback can include:

Input

state_dict = torch.load("checkpoint.pth", map_location="cpu")
model.load_state_dict(state_dict, strict=False, verbose=True) # default verbose=False
print('Model Loaded')

Output

Loaded Keys: ['layer1.weight', 'layer1.bias', 'layer2.weight']
Unexpected Keys in Checkpoint: ['deprecated_layer.weight', 'old_bias']
Missing Keys in Model: ['layer3.weight', 'layer3.bias']
Model Loaded

Verbosity: Provide a verbosity flag (e.g., verbose=True) to enable or disable this detailed output, ensuring backward compatibility for users who prefer silence.
Loaded Keys: Explicitly list the keys from the checkpoint that were successfully matched and loaded.
Unexpected Keys:Clearly display keys in the checkpoint that were not found in the model.
Missing Keys: Highlight model keys that have no corresponding weights in the checkpoint.

Additional context

Benefits of This Feature

Transparency: Users can immediately identify loading issues without digging through checkpoint files or model definitions.
Efficiency: Reduces debugging time significantly, especially for models with complex architectures or experimental workflows.
Better Debugging: Makes it clear whether the mismatch is due to model changes, checkpoint errors, or other reasons.
Improved User Experience: Aligns with the needs of both beginner and advanced users, ensuring everyone can leverage load_state_dict effectively.

I would be happy to contribute these changes to Pytorch and send a PR

cc @albanD @mruberry @jbschlosser @walterddr @mikaylagawarecki

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

🚀 The feature, motivation and pitch

Alternatives

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Description

🚀 The feature, motivation and pitch

Alternatives

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions