-
Notifications
You must be signed in to change notification settings - Fork 700
[GPU] Update the cache info for padding resolver in cloneWithSimplifiedConfig. #20371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GPU] Update the cache info for padding resolver in cloneWithSimplifiedConfig. #20371
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this code motion makes sense in the current form. We shouldn't move arch-specific information outside of KnownTargets. What's the issue with keeping the code as-is?
We do not need such information, which is sort of duplicated config, in any other lowering.
gpu_pad_layout
is the only place where this information is materialized, so I'm not sure where any duplication happens?
// GPUPadLayoutAttr is only enabled for CDNA2 and CDNA3 for the time being. | ||
// TODO(kuhar): Enable for other HIP targets. | ||
if (!llvm::is_contained({"gfx90a", "gfx942"}, gpuTarget.getArch())) { | ||
return IREE::Encoding::IdentityEncodingAttr::get(ctx); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should move this code here -- I'd rather have it in KnownTargets together with all the other arch-specific information. We could materialize it in the target chip attribute if necessary, but I don't think nothing else needs it
The arch-specific information is described in KnownTargets, and how we use the information can be moved to other places IMO. Like other data-tiling resolvers, we have the cpu_features on CPU side and available intrinsic on GPU side. They all look at target configuration to filter out needed information during specialization and materialization. I honestly don't know how we figure the cache line bytes and cache sets as constants, and I think they are inferable from the GPUTargetAttr? Otherwise, they should be encoded in GPUTargetAttr and the GPUPadLayoutAttr can get the information from it. (I'm reworking the MaterializeEncodingIntoPadding pass, and the end goal is to merge it into the generic pass. How we set up the attribute today is a little different from the data-tiling one, which motivates me to raise the PR. I can probably use |
These depend solely on the gpu architecture and the value will be different across gfx versions. This information could go inside the GPUTargetAttr, but nothing else it the compiler needs it, so it seems excessive to me preserve it throughout the whole compilation pipeline. |
I see, it makes sense to me. FYI that I gave it a shot in my other PR. It ends up with making the pass depend on |
I think that's not to bad since we already depend on |
It sounds better to me. I think it is good to have a place to describe target details, and there are utilities that help us retrieve the information. |
The revision creates a generic GPUPadLayoutAttr in the first place, and materialize the cache config in the interface method. The cache config is only used in encoding specialization and materialization. We do not need such information, which is sort of duplicated config, in any other lowering. Thus, we do not need to hardcode the cache config in the first place. They can be resolved when we need it. Signed-off-by: hanhanW <hanhan0912@gmail.com>
Signed-off-by: hanhanW <hanhan0912@gmail.com>
0e9a36e
to
1524ada
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
compiler/src/iree/compiler/Codegen/Dialect/GPU/TargetUtils/KnownTargets.cpp
Outdated
Show resolved
Hide resolved
Signed-off-by: hanhanW <hanhan0912@gmail.com>
…edConfig. (iree-org#20371) The revision creates an empty GPUPadLayoutAttr in the first place, and materialize the cache config in the interface method. The cache config is only used in encoding specialization and materialization. We do not need such information, which is sort of duplicated config, in any other lowering. Thus, we do not need to hardcode the cache config in the first place. They can be resolved by the new utility (i.e., `getL1CacheInfo`) when we need it. --------- Signed-off-by: hanhanW <hanhan0912@gmail.com>
The revision creates an empty GPUPadLayoutAttr in the first place, and materialize the cache config in the interface method. The cache config is only used in encoding specialization and materialization. We do not need such information, which is sort of duplicated config, in any other lowering. Thus, we do not need to hardcode the cache config in the first place. They can be resolved by the new utility (i.e.,
getL1CacheInfo
) when we need it.