Plumb scaling_mfma through to IREE · Issue #20701 · iree-org/iree · GitHub

Plumb scaling_mfma through to IREE #20701


Open
krzysz00 opened this issue May 1, 2025 · 0 comments

krzysz00 commented May 1, 2025

Once amdgpu.scaling_mfma has landed in IREE:

  1. Define some sort of MMA kind attribute (or extend the existing ones) to represent these scaled MFMAs, which take a 32x[small float] vector and an i8 scale (really, a vector<4xi8> and a selector, for when you do your own unrolling). There are 16x16x128 and 32x32x64 versions of the intrinsic, and any combination from `{f4E2M1FN, f6E2M3FN, f6E3M2FN, f8E4M3FN, f8E5M2}^2` works as input element types. These intrinsics follow the usual MFMA layout.
  2. Have some rewrite from a linalg representation of such a scaled MFMA into the relevant iree_gpu.multi_mma.
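
To make the size of the attribute space in item 1 concrete, here is a minimal Python sketch that enumerates the variants described above. It is only an illustration, not IREE code: `ScaledMfmaKind`, `enumerate_kinds`, and `WAVE_SIZE` are hypothetical names, and the 64-lane wavefront is an assumption used to sanity-check the "32x[small float]" per-lane operand size.

```python
from dataclasses import dataclass
from itertools import product

# Element types and shapes as listed in the issue text.
ELEMENT_TYPES = ("f4E2M1FN", "f6E2M3FN", "f6E3M2FN", "f8E4M3FN", "f8E5M2")
SHAPES = ((16, 16, 128), (32, 32, 64))  # (M, N, K)

# Assumption: operands are spread evenly over a 64-lane wavefront.
WAVE_SIZE = 64


@dataclass(frozen=True)
class ScaledMfmaKind:
    """Hypothetical stand-in for one scaled-MFMA MMA kind."""
    m: int
    n: int
    k: int
    lhs_type: str
    rhs_type: str

    def lhs_elems_per_lane(self) -> int:
        # The M x K operand divided evenly across the lanes.
        return self.m * self.k // WAVE_SIZE

    def rhs_elems_per_lane(self) -> int:
        # The K x N operand divided evenly across the lanes.
        return self.k * self.n // WAVE_SIZE


def enumerate_kinds() -> list[ScaledMfmaKind]:
    """Every (shape, lhs type, rhs type) combination from the issue."""
    return [
        ScaledMfmaKind(m, n, k, lhs, rhs)
        for (m, n, k) in SHAPES
        for lhs, rhs in product(ELEMENT_TYPES, repeat=2)
    ]


if __name__ == "__main__":
    kinds = enumerate_kinds()
    print(len(kinds), "kinds")
    for kind in kinds[:3]:
        print(kind, kind.lhs_elems_per_lane(), kind.rhs_elems_per_lane())
```

With two shapes and five element types on each side, that is a fairly large set of combinations, which may argue for parameterizing the attribute by shape and element types rather than listing one enum case per variant.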