Notes on lowering `arith.scaling_extf` and `arith.scaling_truncf` to AMDGPU · Issue #20821 · iree-org/iree
Open · krzysz00 opened this issue May 15, 2025 · 1 comment
@krzysz00 (Contributor):

From me sketching this out in Umang's DMs:

  1. Normalize to splat form:
    a. If the op is operating on scalars or 0-D vectors, promote it to `vector<1xT>`.
    b. Look at the scale input: if it isn't a broadcast or a splat (that is, if a single scalar value doesn't fill every element), unroll along leading or trailing dimensions (depending on what any `vector.broadcast` feeding the scale is doing) until all the resulting ops have that form; see the first sketch after this list.
  2. Now you've got `scaling_{ext,trunc}f` ops of the form `%out = scaling_op %in, splat(%scale) : vector<...xT>, vector<...xU>, vector<...xV>`.
  3. Once we're in this form, the scale can be treated as a scalar from here on.
  4. Then, determine the instruction-level block size of the intrinsic: 32, unless we'll be targeting those 2x16xf32 intrinsics, in which case it's 16.
  5. Extract [blocksize] elements at a time, padding with 0s if needed, and pass each block to an `amdgpu.cvt_packed_scale`-type op (second sketch below).
    a. While there, the conversion from f8E8M0 to f32 gets to be a special case: unlike in the general case, we can just shift left and don't have to check for NaN (third sketch below).
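
To make step 1b concrete, here's a minimal MLIR sketch of the unrolling for a 2-D op whose scale varies along the leading dimension (the `arith.scaling_extf` assembly below is illustrative and may not match the final syntax):

```mlir
// Before: the scale is stretched along dim 1 only, so it is not a splat.
%s = vector.broadcast %row_scales : vector<2x1xf8E8M0FNU> to vector<2x8xf8E8M0FNU>
%r = arith.scaling_extf %x, %s : vector<2x8xf4E2M1FN>, vector<2x8xf8E8M0FNU> to vector<2x8xf32>

// After: one op per row, each fed a splat scale (row 0 shown; row 1 is analogous).
%s0 = vector.extract %row_scales[0, 0] : f8E8M0FNU from vector<2x1xf8E8M0FNU>
%x0 = vector.extract %x[0] : vector<8xf4E2M1FN> from vector<2x8xf4E2M1FN>
%sv0 = vector.broadcast %s0 : f8E8M0FNU to vector<8xf8E8M0FNU>
%r0 = arith.scaling_extf %x0, %sv0 : vector<8xf4E2M1FN>, vector<8xf8E8M0FNU> to vector<8xf32>
// ... then vector.insert the per-row results back into a vector<2x8xf32>.
```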
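For steps 4 and 5, a sketch of the blockwise extraction and zero-padding on a splat-scale op whose length isn't a multiple of the block size. `amdgpu.cvt_packed_scale` is the placeholder name from step 5, written in generic form since no such op exists yet:

```mlir
// 40 elements at block size 32: one full block plus an 8-element tail.
%blk0 = vector.extract_strided_slice %in {offsets = [0], sizes = [32], strides = [1]}
    : vector<40xf8E4M3FN> to vector<32xf8E4M3FN>
%cvt0 = "amdgpu.cvt_packed_scale"(%blk0, %scale)
    : (vector<32xf8E4M3FN>, f8E8M0FNU) -> vector<32xf32>

// Tail: extract the remaining 8 elements and pad with zeros up to the block size.
%tail = vector.extract_strided_slice %in {offsets = [32], sizes = [8], strides = [1]}
    : vector<40xf8E4M3FN> to vector<8xf8E4M3FN>
%zeros = arith.constant dense<0.0> : vector<32xf8E4M3FN>
%padded = vector.insert_strided_slice %tail, %zeros {offsets = [0], strides = [1]}
    : vector<8xf8E4M3FN> into vector<32xf8E4M3FN>
%cvt1 = "amdgpu.cvt_packed_scale"(%padded, %scale)
    : (vector<32xf8E4M3FN>, f8E8M0FNU) -> vector<32xf32>
// Only the first 8 lanes of %cvt1 are meaningful when assembling the final result.
```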
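And the step-5a special case: an f8E8M0 value is exactly an f32 biased exponent, so the extension is a shift into the f32 exponent field. A hypothetical scalar helper, relying on the observation above that no NaN check is needed here:

```mlir
func.func @e8m0_to_f32(%scale: f8E8M0FNU) -> f32 {
  // Reinterpret the 8 exponent bits as an integer...
  %bits = arith.bitcast %scale : f8E8M0FNU to i8
  %wide = arith.extui %bits : i8 to i32
  // ...and shift them into the f32 exponent field (bits 23-30).
  // Note: a biased exponent of 0 (i.e. 2^-127) shifts to +0.0 rather than
  // the f32 denormal, so that edge may still need separate handling.
  %c23 = arith.constant 23 : i32
  %shifted = arith.shli %wide, %c23 : i32
  %fp = arith.bitcast %shifted : i32 to f32
  return %fp : f32
}
```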
@krzysz00 (Contributor, Author):

... On further review, it looks like there are also scalar (or quasi-scalar) round-down and round-up instructions that got added while I wasn't looking, so a lot of this gets easier.
