Dispatch inhibiting fusion · Issue #20812 · iree-org/iree · GitHub
Open
pashu123 opened this issue May 14, 2025 · 8 comments
Labels
codegen/rocm ROCm code generation compiler backend (HIP/HSA) codegen Shared code generation infrastructure and dialects

Comments

@pashu123
Contributor
pashu123 commented May 14, 2025

Dispatch:
https://gist.github.com/pashu123/93b6076aa80d9ceebbc63b8b3a210ae0

In the dispatch above, it's unclear why the first linalg.generic is fused into the same dispatch, given that its output tensor is only stored at the very end. My reasoning is that it should be outside of the dispatch, since that might unblock other dispatches that need its output. Also, this kind of fusion isn't supported by the forall fusion. @MaheshRavishankar @IanWood1

Download 8b_fp8.mlir: https://gist.github.com/pashu123/b07a3988248cc8bb94139a9f2069153c

Compile command: iree-compile 8b_fp8.mlir --iree-hal-target-backends=rocm --iree-hip-target=gfx942 --iree-hal-target-device=hip --iree-opt-level=O3 --iree-hal-indirect-command-buffers=true --iree-stream-resource-memory-model=discrete --iree-hal-memoization=true -o tmp.vmfb --iree-hal-dump-executable-files-to=dump/

OR

Clone: https://github.com/nod-ai/iree-model-benchmark/tree/main/llama3 (There are standard compilation instructions there.)
./compile-8b-fp8.sh gfx942

@pashu123 pashu123 added codegen Shared code generation infrastructure and dialects codegen/rocm ROCm code generation compiler backend (HIP/HSA) labels May 14, 2025
@benvanik
Collaborator

(please post full reproducers as the bug templates suggest - we need command lines, the original IR provided to IREE, etc)

@pashu123
Contributor Author

> (please post full reproducers as the bug templates suggest - we need command lines, the original IR provided to IREE, etc)

This wasn't a bug template. I just wanted to discuss the dispatch more.

@benvanik
Collaborator

Post the command lines and input IR. How a dispatch is formed varies based on which flags are provided and the best way to investigate how something got to the state it is in is to be able to get it into that state. You're showing us a pile of broken glass and asking why it's broken while hiding the hammer behind your back ;)

@IanWood1
Contributor

> Post the command lines and input IR. How a dispatch is formed varies based on which flags are provided and the best way to investigate how something got to the state it is in is to be able to get it into that state. You're showing us a pile of broken glass and asking why it's broken while hiding the hammer behind your back ;)

Agreed. It would be helpful to have the IR and to know whether you are using --iree-dispatch-creation-propagate-collapse-across-expands=true or not.
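
One way to check whether that flag affects how this dispatch is formed would be to compile the same input with it on and off and compare the dumped executables. The following is only a minimal sketch under assumptions: it reuses the flags from the compile command in the first comment, the helper name `build_cmd` and the `tmp_*` / `dump_*` output names are hypothetical, and it prints the two command lines rather than running them.

```shell
#!/usr/bin/env bash
# Sketch: print two iree-compile invocations that differ only in
# --iree-dispatch-creation-propagate-collapse-across-expands.
# build_cmd and the tmp_* / dump_* names are hypothetical, for illustration.

build_cmd() {
  local propagate="$1"   # "true" or "false"
  local tag="$2"         # suffix used for the output file and dump directory
  # echo prints the command on a single line instead of executing it.
  echo iree-compile 8b_fp8.mlir \
    --iree-hal-target-backends=rocm \
    --iree-hip-target=gfx942 \
    --iree-hal-target-device=hip \
    --iree-opt-level=O3 \
    --iree-hal-indirect-command-buffers=true \
    --iree-stream-resource-memory-model=discrete \
    --iree-hal-memoization=true \
    "--iree-dispatch-creation-propagate-collapse-across-expands=${propagate}" \
    -o "tmp_${tag}.vmfb" \
    "--iree-hal-dump-executable-files-to=dump_${tag}/"
}

build_cmd true  with_propagate
build_cmd false without_propagate
```

Running the two printed commands and then comparing the `dump_with_propagate/` and `dump_without_propagate/` directories should show whether the fused linalg.generic appears in both cases.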

@benvanik
Collaborator

^ yep - or iree-preprocessing-make-single-dispatch, or lots of other stuff.

@pashu123
Contributor Author

> Post the command lines and input IR. How a dispatch is formed varies based on which flags are provided and the best way to investigate how something got to the state it is in is to be able to get it into that state. You're showing us a pile of broken glass and asking why it's broken while hiding the hammer behind your back ;)
>
> Agreed. It would be helpful to have the IR and to know whether you are using --iree-dispatch-creation-propagate-collapse-across-expands=true or not.

Clone: https://github.com/nod-ai/iree-model-benchmark/tree/main/llama3
./compile-8b-fp8.sh gfx942

@benvanik
Collaborator

Please, post the command lines and input IR. You have them.

As a tip: you're going to get more engagement on issues if you don't ask a lot of people to engage. More often than not someone can spot something and help you out right away, you just have to respect their time.

I went and looked at your repo and dug through your scripts and saw --iree-dispatch-creation-propagate-collapse-across-expands=true that @IanWood1 mentioned above - had you posted your command line you'd have saved us the time. I'm not going to clone your code and build it, though, and I don't expect anyone else to either.

@pashu123
Contributor Author

> Please, post the command lines and input IR. You have them.
>
> As a tip: you're going to get more engagement on issues if you don't ask a lot of people to engage. More often than not someone can spot something and help you out right away, you just have to respect their time.
>
> I went and looked at your repo and dug through your scripts and saw --iree-dispatch-creation-propagate-collapse-across-expands=true that @IanWood1 mentioned above - had you posted your command line you'd have saved us the time. I'm not going to clone your code and build it, though, and I don't expect anyone else to either.

I'm really sorry. I was lazy. I will make sure I respect everyone's time. Again, I'm really sorry.
