[EPIC][GPU][DT] Bring up GPU data-tiling with reasonable performance · Issue #17181 · iree-org/iree · GitHub


Open
6 of 10 tasks
hanhanW opened this issue Apr 25, 2024 · 2 comments
Labels
codegen/rocm ROCm code generation compiler backend (HIP/HSA) codegen Shared code generation infrastructure and dialects

Comments

@hanhanW
Contributor
hanhanW commented Apr 25, 2024

Overview

This is the umbrella issue that collects the tasks for phase 1. In phase 1, we aim to provide a functional data-tiling GPU path with reasonable performance. We are not chasing optimal performance in this phase; instead, we want to enable the path for all e2e tracking models.

Reasonable performance means that we should be able to vectorize data-tiling ops (i.e., pack/unpack/mmt4d-like ops) and apply vector distribution to them.
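For readers new to the data-tiling ops named above, here is a minimal pure-Python sketch of what pack, an mmt4d-like op, and unpack compute. It is not IREE code. The tile sizes (`m0`, `n0`, `k0`) and the function names are made up for illustration, and the layout assumes the usual mmt4d convention: LHS as `[M1][K1][M0][K0]`, transposed RHS as `[N1][K1][N0][K0]`, result as `[M1][N1][M0][N0]`.

```python
def pack(matrix, tile_rows, tile_cols):
    """Pack a 2-D matrix into [outer_rows][outer_cols][tile_rows][tile_cols].

    Edge tiles are zero-padded, so the shape need not divide evenly.
    """
    rows, cols = len(matrix), len(matrix[0])
    outer_rows = -(-rows // tile_rows)  # ceiling division
    outer_cols = -(-cols // tile_cols)
    packed = [[[[0] * tile_cols for _ in range(tile_rows)]
               for _ in range(outer_cols)]
              for _ in range(outer_rows)]
    for r in range(rows):
        for c in range(cols):
            packed[r // tile_rows][c // tile_cols][r % tile_rows][c % tile_cols] = matrix[r][c]
    return packed


def mmt4d(lhs, rhs):
    """Tiled matmul on packed operands; the RHS is packed from the transposed matrix."""
    M1, K1, M0, K0 = len(lhs), len(lhs[0]), len(lhs[0][0]), len(lhs[0][0][0])
    N1, N0 = len(rhs), len(rhs[0][0])
    acc = [[[[0] * N0 for _ in range(M0)] for _ in range(N1)] for _ in range(M1)]
    for m1 in range(M1):
        for n1 in range(N1):
            for k1 in range(K1):
                for m0 in range(M0):
                    for n0 in range(N0):
                        for k0 in range(K0):
                            acc[m1][n1][m0][n0] += lhs[m1][k1][m0][k0] * rhs[n1][k1][n0][k0]
    return acc


def unpack(packed, rows, cols):
    """Inverse of pack: drop the padding and return a plain rows x cols matrix."""
    tile_rows, tile_cols = len(packed[0][0]), len(packed[0][0][0])
    return [[packed[r // tile_rows][c // tile_cols][r % tile_rows][c % tile_cols]
             for c in range(cols)]
            for r in range(rows)]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def data_tiled_matmul(a, b, m0=2, n0=2, k0=2):
    """matmul(a, b) expressed as pack -> mmt4d -> unpack (hypothetical tile sizes)."""
    lhs = pack(a, m0, k0)
    rhs = pack(transpose(b), n0, k0)
    return unpack(mmt4d(lhs, rhs), len(a), len(b[0]))


def reference_matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]
```

The six-deep loop nest in `mmt4d` is the part that the issue wants vectorized and distributed; the inner three loops over `m0`, `n0`, `k0` have fixed, small trip counts, which is what makes the packed layout a good fit for that.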

ETA: ~1 month

Milestone 1 - enable data-tiling in tests/e2e/matmul test suite

The scope is to compile and execute a linalg.matmul and to enable the e2e tests. Additionally, we want to extract a few matmul ops (potentially with dequant ops) from the sdxl and llama models and focus on them. To achieve the milestone, the major tasks are:

@bjacob let's share the above tasks between you and me. I'll convert the tasks into issues soon.

Milestone 2 - enable at least one e2e model on benchmark CI

This milestone mainly focuses on fusion codegen, which allows us to compile and execute ML workloads. For now, the targets are sdxl and sd3.

Major tasks:

  • Support tiling interface to fuse consumers
  • Build the pipeline for pack fusion using the "tile producer and fuse consumer" approach.
  • Be able to codegen unpack fusion.
  • Enable sdxl/sd3 models on e2e benchmark suite.
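As a rough illustration of tiling a producer and fusing its consumer, the sketch below tiles a matmul (the producer) by rows and applies the consumer to each tile inside the same loop, so the full intermediate result is never materialized. This is plain Python, not IREE's TilingInterface. The function names and the tile size are hypothetical, and an elementwise ReLU stands in for the consumer to keep the example short; in the task list above the consumer of interest is a pack op.

```python
def matmul_rows(a, b, row_start, row_end):
    """Producer: compute rows [row_start, row_end) of a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(row_start, row_end)]


def relu_tile(tile):
    """Consumer: an elementwise op applied to one tile (stand-in for pack)."""
    return [[max(value, 0) for value in row] for row in tile]


def unfused(a, b):
    """Baseline: materialize the whole producer result, then run the consumer."""
    full = matmul_rows(a, b, 0, len(a))
    return relu_tile(full)


def tiled_and_fused(a, b, tile_rows=2):
    """Tile the producer by rows and fuse the consumer into the tile loop."""
    out = []
    for start in range(0, len(a), tile_rows):
        end = min(start + tile_rows, len(a))
        producer_tile = matmul_rows(a, b, start, end)  # one tile of the producer
        out.extend(relu_tile(producer_tile))           # consumer runs on that tile
    return out
```

Both functions return the same matrix; the fused version only ever holds one producer tile at a time, which is the property the fusion work is after.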

Assigning @MaheshRavishankar as the contact point for milestone 2, because he is tracking the TilingInterface support. I can jump into some tasks when there is a need.

@hanhanW hanhanW added codegen Shared code generation infrastructure and dialects codegen/rocm ROCm code generation compiler backend (HIP/HSA) labels Apr 25, 2024
@hanhanW
Contributor Author
hanhanW commented Apr 25, 2024

@hanhanW
Contributor Author
hanhanW commented Apr 25, 2024

@bjacob I've volunteered you for #17185 and #17188 for now, but feel free to pick whichever tasks you're interested in. Also, feel free to update the issues, because I may have missed something. I'll pick up the pack/unpack codegen because I wrote some patches for it a long, long time ago; I'll try to revamp them.

3 participants