Revamp `Conjunction` #379

ValerianRey · 2025-05-31T14:25:07Z

This PR completely reimplements Conjunction to make it much more efficient.

First, we can notice that all transforms that the Conjunction holds map to the same output type (_B). We thus don't have to do several costly calls to _least_common_ancestor! We can simply call type(self.transforms[0](tensor_dict)) to get the class of TensorDict that we want the final output to be.

This (the [0] indexing), however, would require having at least one transform. Until now, we also allowed empty conjunctions. We have two choices:

1. Make the output_type be EmptyTensorDict when the Conjunction is empty. This is a bit confusing for the type checker, because when the Conjunction is empty, _B is not very well-defined (in this case it is, in fact, EmptyTensorDict, but mypy doesn't seem to infer this).
2. Stop allowing empty Conjunctions. We do not allow mtl_backward with no loss anyway, so there's no way for a user to ever need an empty Conjunction at the moment.

I selected the second choice, because it's much simpler to implement. If we ever really need empty Conjunctions (which I doubt we will, because we can always replace them with a trivial transform returning the EmptyTensorDict), we can always go back on this choice.

Another implementation would have been:

tensor_dicts = [transform(tensor_dict) for transform in self.transforms]
union: dict[Tensor, Tensor] = {}
for td in tensor_dicts:
    union |= td
return type(tensor_dicts[0])(union)

This is shorter, but I'm scared it could use a bit more memory (in fact, since only references should be stored, this is probably not significant at all, so we could arguably use this implementation instead).
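
To make the shorter variant above concrete, here is a minimal, self-contained sketch. It is not the code of this PR: `TensorDict` and `Gradients` are simplified stand-ins for the project's classes, keys are plain strings instead of tensors so that no torch import is needed, and the `ValueError` raised for an empty Conjunction is an assumed choice.

```python
from typing import Callable


class TensorDict(dict):
    """Stand-in for the project's TensorDict (the real class also checks its keys)."""


class Gradients(TensorDict):
    """Stand-in subclass, used only to show that the output class is preserved."""


class Conjunction:
    def __init__(self, transforms: list[Callable[[TensorDict], TensorDict]]):
        # Assumed behavior: reject empty conjunctions, as the description proposes.
        if len(transforms) == 0:
            raise ValueError("A Conjunction needs at least one transform.")
        self.transforms = transforms

    def __call__(self, tensor_dict: TensorDict) -> TensorDict:
        outputs = [transform(tensor_dict) for transform in self.transforms]
        # Accumulate in a plain dict, so no TensorDict is ever mutated.
        union: dict = {}
        for output in outputs:
            union |= output
        # Instantiate the output class only once, at the very end.
        return type(outputs[0])(union)


def double_a(td: TensorDict) -> Gradients:
    return Gradients({"a": 2 * td["a"]})


def negate_b(td: TensorDict) -> Gradients:
    return Gradients({"b": -td["b"]})


result = Conjunction([double_a, negate_b])(TensorDict({"a": 1.0, "b": 3.0}))
```

Since `union` only stores references to the values already held by each output, the extra memory is one list of outputs plus one dict of references.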

Lastly, this fixes another issue that we had in the previous implementation: TensorDicts are supposed to be immutable, but we called |= (the __ior__ method) on EmptyTensorDict. Now, we only call |= on union, which is not a TensorDict (but rather a simple dict[Tensor, Tensor]). The instantiation of the TensorDict is done only at the end, with return output_type(union).

This allows us to assign _raise_immutable_error to TensorDict.__ior__, as we should already have done.
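
A minimal sketch of what this looks like, under stated assumptions: the class below is a stand-in and not the project's TensorDict, and the exception type and signature of `_raise_immutable_error` are guesses made for illustration (only the name comes from the description above).

```python
class TensorDict(dict):
    """Stand-in immutable mapping; the real class has more checks."""

    def _raise_immutable_error(self, *args, **kwargs):
        # Assumed exception type; the project may raise something else.
        raise TypeError(f"{type(self).__name__} is immutable.")

    # Every mutating entry point is routed to the same error.
    __setitem__ = _raise_immutable_error
    __delitem__ = _raise_immutable_error
    __ior__ = _raise_immutable_error
    update = _raise_immutable_error


td = TensorDict({"a": 1.0})

try:
    td |= {"b": 2.0}  # in-place union now fails instead of mutating td
    in_place_union_failed = False
except TypeError:
    in_place_union_failed = True

# The union is built in a plain dict, and the TensorDict is created last.
union: dict = {}
union |= td
union |= {"b": 2.0}
merged = TensorDict(union)
```

Construction still works because `dict.__init__` does not go through the overridden `__setitem__` or `update`.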

Revamp Conjunction.__call__
Remove _least_common_ancestor
Disable __ior__ in TensorDict (for immutability)

* Remove _least_common_ancestor
* Disallow empty conjunctions

codecov · 2025-05-31T14:25:57Z

Codecov Report

All modified and coverable lines are covered by tests ✅

| Files with missing lines | Coverage Δ |
|---|---|
| src/torchjd/_autojac/_transform/_base.py | `100.00% <100.00%> (ø)` |
| src/torchjd/_autojac/_transform/_tensor_dict.py | `100.00% <100.00%> (ø)` |

PierreQuinton · 2025-06-02T07:03:59Z

> First, we can notice that all transforms that the Conjunction holds map to the same output type (_B). We thus don't have to do several costly calls to _least_common_ancestor! We can simply call type(self.transforms[0](tensor_dict)) to get the class of TensorDict that we want the final output to be.

I think that a Transform[EmptyTensorDict] is also a Transform[_B], so if the first transform's output type is a strict subtype of _B, then this is wrong. In my opinion, TensorDicts are weird now. It feels like we would like to remove the checks, but then they would do nothing except tell the developer whether a composition (or stack, or conjunction) of Transforms is valid. So I think that they are basically only useful to verify that:

The transforms are correct, i.e. they output the expected TDs.
The transforms are correctly composed (or stacked, or conjoined).
It feels like TDs could be annotated types.
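
The objection can be reproduced with a small sketch. Everything here is a simplified stand-in: the class hierarchy (with `EmptyTensorDict` as a subclass of `Gradients`) is assumed for illustration, string keys replace tensors, and `conjunction_by_first_type` is a hypothetical helper that mimics the `type(outputs[0])` strategy.

```python
class TensorDict(dict):
    """Stand-in base class."""


class Gradients(TensorDict):
    """Stand-in for a concrete output type (_B in the discussion)."""


class EmptyTensorDict(Gradients):
    """Stand-in strict subtype of Gradients (assumed hierarchy)."""


def conjunction_by_first_type(transforms, tensor_dict):
    # Hypothetical helper: output class is taken from the first transform's output.
    outputs = [transform(tensor_dict) for transform in transforms]
    union: dict = {}
    for output in outputs:
        union |= output
    return type(outputs[0])(union)


def returns_empty(td):
    return EmptyTensorDict()


def returns_gradients(td):
    return Gradients({"a": 1.0})


# Both transforms are valid "transforms to Gradients", yet the result class
# depends on which one happens to come first.
wrong = conjunction_by_first_type([returns_empty, returns_gradients], TensorDict())
right = conjunction_by_first_type([returns_gradients, returns_empty], TensorDict())
```

In the first ordering the result is labeled `EmptyTensorDict` although it holds an entry, which is exactly the case where taking the class of the first output breaks down.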

ValerianRey · 2025-06-02T11:26:16Z

> I think that a Transform[EmptyTensorDict] is a Transform[_B], so if the first transform is a strict sub-type, then this is wrong.

Yes, my bad. This makes this PR not good enough.

> It feels like TDs could be annotated types.

I thought about this too, but we still need their check method for our tests (arguably we could implement these checks differently) and, more importantly, we can only have one "parent" with typing.Annotated, which wouldn't work with EmptyTensorDict. This could work once type intersection is added to Python, which may not happen for a long time.

About TensorDicts, I think #380 is good.

About Conjunction, I think we could go back to having some kind of least_common_ancestor function, but optimized to work directly on a list of objects rather than looking at objects two by two.

EDIT: this would work, but it would still be slower than #382 with practically no benefits.
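
For reference, one possible shape of such a list-based function is sketched below. This is an illustration and not code from the repository: the function name, the stand-in classes and the multiple-inheritance layout of `EmptyTensorDict` are all assumptions. It walks the MRO of the first object's class and returns the first class that every object is an instance of.

```python
class TensorDict(dict):
    """Stand-in base class."""


class Gradients(TensorDict):
    pass


class Jacobians(TensorDict):
    pass


class EmptyTensorDict(Gradients, Jacobians):
    """Assumed to inherit from several TensorDict classes at once."""


def least_common_ancestor(objects: list) -> type:
    """Return the most specific class (in the first object's MRO) shared by all objects."""
    if not objects:
        raise ValueError("At least one object is required.")
    # The MRO goes from most specific to least specific, so the first match wins.
    for candidate in type(objects[0]).__mro__:
        if all(isinstance(obj, candidate) for obj in objects):
            return candidate
    raise AssertionError("unreachable: every object is an instance of `object`")


lca_with_empty = least_common_ancestor([EmptyTensorDict(), Gradients()])
lca_siblings = least_common_ancestor([Gradients(), Jacobians()])
```

This needs a single pass over one MRO instead of pairwise comparisons. With multiple inheritance there can be several incomparable common ancestors; this sketch simply picks the first one in the MRO of the first object.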

ValerianRey · 2025-06-03T13:50:54Z

closing in favor of #382

ValerianRey · 2025-06-03T22:22:53Z

A fix that I came up with is to just return a TensorDict at the output of Conjunction and use Any as the return type hint.
It's like saying to mypy: this TensorDict will be compatible with everything, don't worry. So in the end this doesn't protect us at all from wrongly using the output of a Conjunction, but at least it preserves the genericity of transforms (which makes them clearer IMO).

Still not sure whether this is better than #382 or not. It's a bit of a hack (which I really don't like), but #382 makes transforms a bit harder to use IMO (gotta read the docstrings and really understand them rather than relying on the type checker and check_keys to ensure we're not doing something stupid).
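
A small sketch of the trade-off, with stand-in classes and hypothetical function names (`conjunction_call`, `count_jacobians`) that are not taken from the repository. Because the return hint is `Any`, a static checker accepts the call at the bottom even though the runtime object is a plain `TensorDict`.

```python
from typing import Any


class TensorDict(dict):
    """Stand-in base class."""


class Jacobians(TensorDict):
    """Stand-in for a more specific TensorDict."""


def conjunction_call(outputs: list[TensorDict]) -> Any:
    # Hypothetical free-function version of Conjunction.__call__.
    union: dict = {}
    for output in outputs:
        union |= output
    # Always a plain TensorDict at runtime; the Any hint hides that from the checker.
    return TensorDict(union)


def count_jacobians(jacobians: Jacobians) -> int:
    return len(jacobians)


result = conjunction_call([TensorDict({"a": 1.0})])
# A type checker does not complain here, since Any is compatible with Jacobians,
# but nothing guarantees that result really is a Jacobians.
count = count_jacobians(result)
```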

ValerianRey · 2025-06-03T22:39:40Z

In the end we do all of this for safety, so that it's easy to make combined transforms (because we're sure we cannot break anything). This would not ensure safety whenever we use a Conjunction, so this feeling of safety (which is what makes the creation of transforms easy) would disappear. So I think it's useless, and #382 is better. #382 is also more open to adding another solution one day that will be more satisfying.

ValerianRey added 2 commits May 31, 2025 16:06

Revamp Conjunction.__call__

dab4e11

* Remove _least_common_ancestor
* Disallow empty conjunctions

Disable __ior__ in TensorDict (for immutability)

7a66ac1

ValerianRey added package: autojac refactor labels May 31, 2025

ValerianRey self-assigned this May 31, 2025

ValerianRey added the package: autojac label May 31, 2025

ValerianRey requested a review from PierreQuinton May 31, 2025 14:25

ValerianRey added the refactor label May 31, 2025

This was referenced May 31, 2025

Revamp Conjunction.__call__ #374

Closed

Add mypy #364

Closed

Fix _least_common_ancestor #373

Closed

ValerianRey and others added 2 commits June 2, 2025 13:20

Merge branch 'main' into revamp-conjunction

9c6ab33

Remove extra import

f5e6719

ValerianRey mentioned this pull request Jun 2, 2025

Remove TensorDict classes #382

Merged

3 tasks

ValerianRey closed this Jun 3, 2025

ValerianRey added 2 commits June 4, 2025 00:18

Use Any return type hint in Conjunction.__call__

be51e6e

Fixup __ior__

7e942df

ValerianRey reopened this Jun 3, 2025

ValerianRey closed this Jun 3, 2025

ValerianRey deleted the revamp-conjunction branch June 3, 2025 22:50
