Dispatch inhibiting fusion #20812
Comments
(Please post full reproducers as the bug templates suggest - we need command lines, the original IR provided to IREE, etc.)
This wasn't a bug template. I just wanted to discuss the dispatch more.
Post the command lines and input IR. How a dispatch is formed varies based on which flags are provided, and the best way to investigate how something got into its current state is to be able to reproduce that state. You're showing us a pile of broken glass and asking why it's broken while hiding the hammer behind your back ;)
Agreed, it would be helpful to have the IR and know if you are using …
^ yep - or …
Clone: https://github.com/nod-ai/iree-model-benchmark/tree/main/llama3 |
Please post the command lines and input IR. You have them. As a tip: you'll get more engagement on issues if you don't ask a lot of people to engage. More often than not someone can spot something and help you out right away; you just have to respect their time. I went and looked at your repo and dug through your scripts and saw …
I'm really sorry. I was lazy. I will make sure I respect everyone's time. Again, I'm really sorry.
Dispatch:
https://gist.github.com/pashu123/93b6076aa80d9ceebbc63b8b3a210ae0
In the above dispatch, it's unclear why the first linalg.generic is fused into the same dispatch, given that its output tensor is stored at the very end. My reasoning is that it should live outside the dispatch, since that might unblock other dispatches that need its output. Also, this kind of fusion isn't supported by the forall fusion. @MaheshRavishankar @IanWood1
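For readers without the gist open, the shape of the pattern being questioned is roughly the following. This is a hypothetical sketch, not the actual dispatch from the gist: shapes, indexing maps, and op region bodies are elided, and `%a`/`%init0`/`%init1` are placeholder names.

```mlir
// Hypothetical sketch of the fusion pattern (details elided).
%r:2 = flow.dispatch.region -> (tensor<8xf32>, tensor<8xf32>) {
  // First generic: %0 is consumed below AND returned from the region,
  // i.e. its full tensor is stored at the end of the dispatch.
  %0 = linalg.generic ... ins(%a : tensor<8xf32>) outs(%init0 : tensor<8xf32>) ...
  // Second generic consumes %0 inside the same dispatch.
  %1 = linalg.generic ... ins(%0 : tensor<8xf32>) outs(%init1 : tensor<8xf32>) ...
  flow.return %0, %1 : tensor<8xf32>, tensor<8xf32>
}
// If the first generic were its own dispatch, other dispatches that
// consume %0 would not have to wait for %1 to be computed.
```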
Download 8b_fp8.mlir: https://gist.github.com/pashu123/b07a3988248cc8bb94139a9f2069153c
Compile command:
iree-compile 8b_fp8.mlir --iree-hal-target-backends=rocm --iree-hip-target=gfx942 --iree-hal-target-device=hip --iree-opt-level=O3 --iree-hal-indirect-command-buffers=true --iree-stream-resource-memory-model=discrete --iree-hal-memoization=true -o tmp.vmfb --iree-hal-dump-executable-files-to=dump/
OR
Clone: https://github.com/nod-ai/iree-model-benchmark/tree/main/llama3 (There are standard compilation instructions there.)
./compile-8b-fp8.sh gfx942