Add linalg.lu_solve by lezcano · Pull Request #77634 · pytorch/pytorch · GitHub

Add linalg.lu_solve #77634


Closed
wants to merge 10 commits into from

Conversation

lezcano
Copy link
Collaborator
@lezcano lezcano commented May 17, 2022

This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
it by solving the system via two triangular solves instead.

We also update the heuristics for this function, as they were fairly
outdated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy MAGMA backend for this function.

We added thorough tests for this function, as well as tests for the
different backends. We also activated the tests for AMD, as those
should work as well.

Fixes #61657

[ghstack-poisoned]
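The workaround mentioned above replaces a single transposed LU solve with two triangular solves. As a minimal plain-Python illustration (a sketch, not the actual ATen kernel; pivoting is omitted for brevity): given A = L U with unit lower-triangular L and upper-triangular U, A x = b is solved by forward substitution L y = b followed by back substitution U x = y.

```python
# Minimal sketch of solving A x = b from LU factors via two triangular
# solves (pivoting omitted; the real kernel also applies the pivots).

def forward_sub(L, b):
    # Solve L y = b for unit lower-triangular L (forward substitution).
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

def back_sub(U, y):
    # Solve U x = y for upper-triangular U (back substitution).
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[1.0, 0.0], [0.5, 1.0]]   # unit lower-triangular factor
U = [[4.0, 2.0], [0.0, 3.0]]   # upper-triangular factor
b = [6.0, 9.0]
x = back_sub(U, forward_sub(L, b))   # solves (L U) x = b; x == [0.5, 2.0]
```

In the new API, this computation is exposed as `torch.linalg.lu_solve(LU, pivots, B)` applied to the output of `torch.linalg.lu_factor`.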
@facebook-github-bot
Copy link
Contributor
facebook-github-bot commented May 17, 2022


✅ No Failures (0 Pending)

As of commit fd63fd8 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@lezcano
Copy link
Collaborator Author
lezcano commented May 17, 2022

@malfet here's the new stack. Although let's wait first for the CI to run, because I had to perform a fairly non-trivial merge.

@lezcano lezcano requested review from malfet and removed request for nikitaved, albanD, soulitzer, ngimel, bdhirsh, IvanYashchuk, anjali411 and mruberry May 17, 2022 08:52
@lezcano lezcano added module: linear algebra Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul release notes: linalg_frontend release notes category labels May 17, 2022
lezcano added 3 commits May 17, 2022 14:28
Relanding #72935

[ghstack-poisoned]
Relanding #72935

[ghstack-poisoned]
Relanding #72935

[ghstack-poisoned]
lezcano added 3 commits May 25, 2022 16:30
Relanding #72935

[ghstack-poisoned]
Relanding #72935

[ghstack-poisoned]
Relanding #72935

[ghstack-poisoned]
@ngimel
Copy link
Collaborator
ngimel commented May 26, 2022

Do we need to run this on ci/trunk? Why was it reverted previously?

@lezcano
Copy link
Collaborator Author
lezcano commented May 27, 2022

It broke an internal build. See the comments on #72935.

@ngimel
Copy link
Collaborator
ngimel commented May 27, 2022

Ok, so what next? We can't just reland it because it'll break the internal build again?

@lezcano
Copy link
Collaborator Author
lezcano commented May 27, 2022

According to #72935 (comment), I'd assume that @malfet already corrected the internal error?

@ngimel
Copy link
Collaborator
ngimel commented May 31, 2022

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ngimel
Copy link
Collaborator
ngimel commented May 31, 2022

I'm still getting internal build failures on this one

Summary: 
ld.lld: error: undefined symbol: at::native::DispatchStubImpl::get_call_ptr(c10::DeviceType, void*)
stderr: ld.lld: error: undefined symbol: at::native::DispatchStubImpl::get_call_ptr(c10::DeviceType, void*)

so we cannot land it. Coming from unpack_pivots_stub:

/aten/src/ATen/native/cuda/linalg/BatchLinearAlgebra.cpp.o:(at::native::DispatchStub<void (*)(at::TensorIterator&, long), at::native::unpack_pivots_stub>::get_call_ptr(c10::DeviceType))
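For context on the error above: `DispatchStub` is ATen's per-device kernel dispatch table, and the undefined symbol means `torch_cuda` references the stub's lookup helper without the matching definition being linked in. A loose Python analogy (hypothetical names, not the actual C++ machinery) of a stub whose lookup fails because no implementation was ever provided:

```python
# Hypothetical analogy of a DispatchStub-style registry: kernels are
# registered per device type, and lookup fails when the registration
# (like a symbol definition at link time) is missing.

class DispatchStub:
    def __init__(self, name):
        self.name = name
        self.impls = {}                  # device type -> kernel function

    def register(self, device, fn):
        self.impls[device] = fn

    def __call__(self, device, *args):
        kernel = self.impls.get(device)
        if kernel is None:
            # analogous to the linker's "undefined symbol": the stub is
            # declared, but no implementation was ever provided
            raise RuntimeError(f"no kernel registered for {self.name} on {device}")
        return kernel(*args)

unpack_pivots = DispatchStub("unpack_pivots_stub")
unpack_pivots.register("cpu", lambda pivots: list(pivots))
unpack_pivots("cpu", (2, 3, 1))          # works: returns [2, 3, 1]
```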

@lezcano
Copy link
Collaborator Author
lezcano commented Jun 7, 2022

@malfet any updates on this one?

@malfet
Copy link
Contributor
malfet commented Jun 7, 2022

@malfet any updates on this one?

Sorry, it completely slipped my mind. Do you mind rebasing it against the latest viable/strict and letting me try the import again?

Relanding #72935

Differential Revision: [D36793144](https://our.internmc.facebook.com/intern/diff/D36793144)

[ghstack-poisoned]
@lezcano
Copy link
Collaborator Author
lezcano commented Jun 7, 2022

@malfet rebased.
I needed to add a missing import to a file unrelated to this PR (not sure why, but it was not building locally...)

@malfet
Copy link
Contributor
malfet commented Jun 7, 2022

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@malfet
Copy link
Contributor
malfet commented Jun 7, 2022

Ok, I finally understand what is going on (in an ideal world, torch_cuda should be oblivious to the supported CPU architectures; this is enforced internally but not in the OSS build, so I have an idea of how to fix it internally):

, void *DEFAULT
#ifdef HAVE_AVX512_CPU_DEFINITION
, void *AVX512
#endif
#ifdef HAVE_AVX2_CPU_DEFINITION
, void *AVX2
#endif

Link to internal fix: D36981363 (essentially propagates the HAVE_AVX2_CPU_DEFINITION flag to both torch_cpu and torch_cuda)

@lezcano
Copy link
Collaborator Author
lezcano commented Jun 7, 2022

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

facebook-github-bot pushed a commit that referenced this pull request Jun 8, 2022
Summary:
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
it by solving the system via two triangular solves instead.

We also update the heuristics for this function, as they were fairly
outdated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy MAGMA backend for this function.

We added thorough tests for this function, as well as tests for the
different backends. We also activated the tests for AMD, as those
should work as well.

Fixes #61657

Pull Request resolved: #77634

Approved by: https://github.com/malfet

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/c7d6cec078bcd0b0652ba10d1d55931b27be9f36

Reviewed By: osalpekar

Differential Revision: D36793144

Pulled By: osalpekar

fbshipit-source-id: fd15e677ff625ef53b42acf3217b9bba443a66d0
@facebook-github-bot facebook-github-bot deleted the gh/Lezcano/72/head branch June 11, 2022 14:17
} else {
return c10::MaybeOwned<Tensor>::borrowed(pivots);
}
}

} // anonymous namespace
static void lu_solve_kernel(const Tensor& LU, const Tensor& pivots, const Tensor& B, TransposeType trans) {
// Trivial case. Remove it once `torch.solve` is removed, as linalg.solve already shortcuts this case
Copy link
Contributor


@lezcano is it time to remove this now?

Copy link
Collaborator Author


It's rather minor, but sure, happy to approve a PR that removes this provided that CI passes

Labels
cla signed Merged module: linear algebra Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul open source release notes: linalg_frontend release notes category topic: new features topic category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants