NotImplementedError when computing JVP of Attention · Issue #154226 · pytorch/pytorch · GitHub
Open

@limsanky

Description

Hi!

I am trying to compute the JVP (forward-mode Jacobian-vector product) of attention (FlashAttention), but I get the following NotImplementedError:

NotImplementedError: Trying to use forward AD with _scaled_dot_product_efficient_attention that does not support it because it has not been implemented yet.
Please file an issue to PyTorch at https://github.com/pytorch/pytorch/issues/new?template=feature-request.yml so that we can prioritize its implementation

I would greatly appreciate it if this could be implemented! :)

Thank you so much! Looking forward to hearing back!

cc @ezyang @albanD @gqchen @nikitaved @soulitzer @Varal7 @xmfan

Metadata

Assignees

No one assigned

    Labels

    actionable
    module: autograd (Related to torch.autograd, and the autograd engine in general)
    module: forward ad
    module: sdpa (All things related to torch.nn.functional.scaled_dot_product_attention)
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
