Plumb scaling_mfma through to IREE #20701

krzysz00 · 2025-05-01T21:33:25Z

Once amdgpu.scaling_mfma has landed in IREE, we should:

1. Define some sort of MMA kind attribute (or extend the existing ones) to represent these scaled MFMAs, which take a 32x[small float] input and an i8 scale (really, a vector<4xi8> and a selector, for if you do your own unrolling). There are 16x16x128 and 32x32x64 versions of the intrinsic, and any combination from {f4E2M1FN, f6E2M3FN, f6E3M2FN, f8E4M3FN, f8E5M2}^2 works as the input element types. These intrinsics follow the usual MFMA layout.
2. Add a rewrite from a linalg representation of such a scaled MFMA into the relevant iree_gpu.multi_mma.
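
To give a sense of how many cases the new MMA kind attribute would need to cover, here is a rough sketch that enumerates the shape and element-type combinations listed above. It is plain Python for illustration only and is not IREE or MLIR code: the function name `scaled_mfma_variants` is hypothetical, and reading the shapes as (M, N, K) is an assumption based on the usual MFMA naming.

```python
from itertools import product

# Intrinsic shapes named in this issue, read as (M, N, K) (assumption).
SHAPES = [(16, 16, 128), (32, 32, 64)]

# Input element types named in this issue.
ELEM_TYPES = ["f4E2M1FN", "f6E2M3FN", "f6E3M2FN", "f8E4M3FN", "f8E5M2"]


def scaled_mfma_variants():
    """Yield (m, n, k, lhs_type, rhs_type) for each shape and type pair.

    Hypothetical helper: it only enumerates the combinations described
    in the issue text and does not model scales or layouts.
    """
    type_pairs = product(ELEM_TYPES, repeat=2)  # the {...}^2 set
    for (m, n, k), (lhs, rhs) in product(SHAPES, type_pairs):
        yield (m, n, k, lhs, rhs)


variants = list(scaled_mfma_variants())
print(len(variants))  # 2 shapes * 5 * 5 type pairs
```

With 2 shapes and 25 type pairs this comes to 50 combinations, which may be an argument for parameterizing the attribute on shape and element types rather than adding one enum case per variant.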
