-
Notifications
You must be signed in to change notification settings - Fork 701
[Stream] Avoid clone on single user in CloneToConsumers #20667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Jorn Tuyls <jorn.tuyls@gmail.com>
@benvanik Could you help review this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not cloning when not required seems like a good change! Independently, you are correct that behavior is undefined if you use mixed affinities - if you're producing IR like that you'll want to fix it.
Noted in the review is RE the pass option to print and the test using it - please remove that (debug printing would still be useful, but not outs and not via an option) and have the test check the condition by testing for the presence/absence of IR instead.
if (printIterations) { | ||
llvm::outs() << "iterationCount: " << iterationCount << "\n"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These kind of things go under LLVM_DEBUG blocks instead - then a user can pass -debug-> and see the output. This file does declare the DEBUG_TYPE but has no uses yet; you'd add this as:
LLVM_DEBUG(llvm::dbgs() << "[clone-to-consumers] iteration " << iterationCount << "/" << maxIterationCount << "\n");
|
||
// Tests that splats with only one use are never cloned. | ||
|
||
// CHECK: iterationCount: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't rely on debug prints in tests - instead, check what you're actually testing. Here, if you don't expect the splat to be cloned then test that the splat is not present twice.
// CHECK: %[[SPLAT_A:.+]] = flow.tensor.splat | ||
// CHECK: %[[SPLAT_B:.+]] = flow.tensor.splat | ||
// CHECK-NEXT: %[[SPLAT_A:.+]] = flow.tensor.splat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these orderings seem brittle and backwards - why did you do B followed by A?
There's a test case in CI that hits the maxIterationCount limit in
CloneToConsumers
, see following snippet from link:This can be reproduced by running the following command on clone_to_consumers_remark.mlir.txt:
I reduced this further to the following test case:
I am actually not entirely sure whether a dispatch with params that have different affinity is correct or whether this should never happen? It seems from the remark message that it can happen:
If this is correct, adding a
hasOneUse
check should break up the indefinite cloning of the single producer.