Description
Motivation
Hoping that someone could build out an example that uses an RL algorithm (e.g. PPO) with LSTMs in a multi-agent environment. This would likely, and preferably, require building out a new MultiAgentLSTM module, which could follow the pattern of MultiAgentMLP (https://pytorch.org/rl/main/reference/generated/torchrl.modules.MultiAgentMLP.html); however, that pattern is not as straightforward to apply to LSTMs or other recurrent architectures.
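For reference, MultiAgentMLP is used roughly like this today (the shapes and hyperparameters below are illustrative):

```python
import torch
from torchrl.modules import MultiAgentMLP

# Per-agent feedforward network: inputs are (*batch, n_agents, n_agent_inputs).
mlp = MultiAgentMLP(
    n_agent_inputs=6,
    n_agent_outputs=2,
    n_agents=3,
    centralised=False,  # each agent only sees its own observation
    share_params=True,  # a single set of weights shared across agents
    depth=2,
    num_cells=64,
)
obs = torch.randn(8, 3, 6)  # (batch, n_agents, n_agent_inputs)
out = mlp(obs)              # (batch, n_agents, n_agent_outputs) == (8, 3, 2)
```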
Solution
Build a MultiAgentLSTM module that mirrors the construction and use of the MultiAgentMLP module -- for the most part a drop-in replacement.
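A minimal sketch of what such a module could look like (hypothetical, not existing torchrl API; a real implementation would also need TensorDict/TensorDictModule integration, is_init handling, and a centralised option):

```python
import torch
from torch import nn

class MultiAgentLSTM(nn.Module):
    """Hypothetical sketch mirroring the MultiAgentMLP constructor.

    Expects inputs of shape (batch, time, n_agents, n_agent_inputs).
    """

    def __init__(self, n_agent_inputs, n_agent_outputs, n_agents,
                 share_params=True, num_layers=1):
        super().__init__()
        self.n_agents = n_agents
        self.share_params = share_params
        n_models = 1 if share_params else n_agents
        self.lstms = nn.ModuleList([
            nn.LSTM(n_agent_inputs, n_agent_outputs,
                    num_layers=num_layers, batch_first=True)
            for _ in range(n_models)
        ])

    def forward(self, x, hidden=None):
        # x: (batch, time, n_agents, n_agent_inputs)
        outs, new_hidden = [], []
        for i in range(self.n_agents):
            lstm = self.lstms[0 if self.share_params else i]
            h_i = None if hidden is None else hidden[i]
            # Slice out agent i's sequence: (batch, time, n_agent_inputs)
            out_i, h_i = lstm(x[..., i, :].contiguous(), h_i)
            outs.append(out_i)
            new_hidden.append(h_i)
        # Restack to (batch, time, n_agents, n_agent_outputs)
        return torch.stack(outs, dim=-2), new_hidden
```

In an actual training loop the recurrent states would additionally need to be reset wherever is_init is True, which is exactly where the InitTracker incompatibility described under Additional context comes in.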
Alternatives
An example that uses plain LSTM blocks (or the LSTMModule) instead of a MultiAgentLSTM, but still in a multi-agent setting; see the sketch below.
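The wiring for this alternative might look something like the following (the VMAS environment and key names are illustrative assumptions; as noted under Additional context, this currently breaks on the is_init key):

```python
from torchrl.envs import InitTracker, TransformedEnv
from torchrl.envs.libs.vmas import VmasEnv
from torchrl.modules import LSTMModule

# Illustrative multi-agent env; observations live under ("agents", "observation").
env = TransformedEnv(VmasEnv(scenario="balance", num_envs=4), InitTracker())
lstm = LSTMModule(
    input_size=env.observation_spec["agents", "observation"].shape[-1],
    hidden_size=64,
    in_key=("agents", "observation"),
    out_key=("agents", "features"),
)
# Pain point: InitTracker writes a single flat "is_init" entry at the root,
# while resetting per-agent recurrent states would need it under "agents".
```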
Additional context
Note that this may require changes to LSTMModule (https://pytorch.org/rl/main/reference/generated/torchrl.modules.LSTMModule.html) if it is to be used as a component of MultiAgentLSTM, since the environment InitTracker transform (https://pytorch.org/rl/main/reference/generated/torchrl.envs.transforms.InitTracker.html) is currently incompatible with LSTMModule for multi-agent environments/agents. One option would be to change both LSTMModule and InitTracker to accept/place the "is_init" TensorDict key in different locations in a multi-agent environment, i.e. have per-agent or per-group is_init keys instead of the single global is_init key per timestep that LSTMModule currently expects. A sketch of that layout follows.
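As a concrete illustration of the proposed layout (hypothetical; the key names and shapes below are assumptions), a per-group is_init would sit next to the group's observations rather than at the root:

```python
import torch
from tensordict import TensorDict

batch, n_agents = 4, 3
td = TensorDict(
    {
        "agents": TensorDict(
            {
                "observation": torch.randn(batch, n_agents, 6),
                # Proposed: one reset flag per agent per env.
                "is_init": torch.zeros(batch, n_agents, 1, dtype=torch.bool),
            },
            batch_size=[batch, n_agents],
        ),
        # Today, InitTracker instead writes a single root-level key:
        "is_init": torch.zeros(batch, 1, dtype=torch.bool),
    },
    batch_size=[batch],
)
# LSTMModule would then reset the recurrent state of agent j in env i
# wherever td["agents", "is_init"][i, j] is True.
```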
Checklist
- I have checked that there is no similar issue in the repo (required)