-
Notifications
You must be signed in to change notification settings - Fork 24.5k
Add linalg.lu_solve #77634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add linalg.lu_solve #77634
Conversation
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA when calling the batched MAGMA backend with trans=True. We work around that by solving the system solving two triangular systems. We also update the heuristics for this function, as they were fairly updated. We found that cuSolver is king, so luckily we do not need to rely on the buggy backend from magma for this function. We added tests testing this function left and right. We also added tests for the different backends. We also activated the tests for AMD, as those should work as well. Fixes #61657 [ghstack-poisoned]
🔗 Helpful links
✅ No Failures (0 Pending)As of commit fd63fd8 (more details on the Dr. CI page): Expand to see more💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Please report bugs/suggestions to the (internal) Dr. CI Users group. |
@malfet here's the new stack. Although let's wait first for the CI to run, because I had to perform a fairly non-trivial merge. |
Do we need to run this on ci/trunk? Why was it reverted previously? |
It broke an internal build. See #72935 (comment) #72935 (comment) #72935 (comment) #72935 (comment) |
Ok, so what next? We can't just reland it because it'll break internal build again? |
According to #72935 (comment), I'd assume that @malfet already corrected the internal error? |
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
I'm still getting internal build failures on this one
so we cannot land it. Coming from
|
@malfet any updates on this one? |
Sorry, completely slipped my mind. Do you mind rebasing it against latest viable/strict and let me try the import again... |
Relanding #72935 Differential Revision: [D36793144](https://our.internmc.facebook.com/intern/diff/D36793144) [ghstack-poisoned]
@malfet rebased. |
@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Ok, I finally understand what is going on(torch_cuda in ideal world should be oblivious of supported CPU architectures, which is enforced internally but not in OSS build, so have an idea how to fix it internally): pytorch/aten/src/ATen/native/DispatchStub.h Lines 75 to 80 in 32461ed
Link to internal fix: D36981363 (essentially propagates |
@pytorchbot merge |
@pytorchbot successfully started a merge job. Check the current status here |
Summary: This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA when calling the batched MAGMA backend with trans=True. We work around that by solving the system solving two triangular systems. We also update the heuristics for this function, as they were fairly updated. We found that cuSolver is king, so luckily we do not need to rely on the buggy backend from magma for this function. We added tests testing this function left and right. We also added tests for the different backends. We also activated the tests for AMD, as those should work as well. Fixes #61657 Pull Request resolved: #77634 Approved by: https://github.com/malfet Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/c7d6cec078bcd0b0652ba10d1d55931b27be9f36 Reviewed By: osalpekar Differential Revision: D36793144 Pulled By: osalpekar fbshipit-source-id: fd15e677ff625ef53b42acf3217b9bba443a66d0
} else { | ||
return c10::MaybeOwned<Tensor>::borrowed(pivots); | ||
} | ||
} | ||
|
||
} // anonymous namespace | ||
static void lu_solve_kernel(const Tensor& LU, const Tensor& pivots, const Tensor& B, TransposeType trans) { | ||
// Trivial case. Remove it once `torch.solve` is removed, as linalg.solve already shortcuts this case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lezcano is it time to remove this now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's rather minor, but sure, happy to approve a PR that removes this provided that CI passes
Stack from ghstack:
Relanding #72935
Differential Revision: D36793144