Add linalg.lu_solve by lezcano · Pull Request #77634 · pytorch/pytorch · GitHub

Add linalg.lu_solve #77634


Closed
wants to merge 10 commits into from

Conversation

lezcano
Copy link
Collaborator
@lezcano lezcano commented May 17, 2022

This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
it by solving the system via two triangular solves instead.

We also update the heuristics for this function, as they were fairly
outdated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy MAGMA backend for this function.

We added thorough tests for this function, as well as tests for the
different backends. We also activated the tests for AMD, as those
should work as well.

Fixes #61657

[ghstack-poisoned]
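The workaround mentioned above replaces a single transposed LU solve with two triangular solves. As a minimal plain-Python illustration (a sketch, not the actual ATen kernel; pivoting is omitted for brevity): given A = L U with unit lower-triangular L and upper-triangular U, A x = b is solved by forward substitution L y = b followed by back substitution U x = y.

```python
# Minimal sketch of solving A x = b from LU factors via two triangular
# solves (pivoting omitted; the real kernel also applies the pivots).

def forward_sub(L, b):
    # Solve L y = b for unit lower-triangular L (forward substitution).
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y

def back_sub(U, y):
    # Solve U x = y for upper-triangular U (back substitution).
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[1.0, 0.0], [0.5, 1.0]]   # unit lower-triangular factor
U = [[4.0, 2.0], [0.0, 3.0]]   # upper-triangular factor
b = [6.0, 9.0]
x = back_sub(U, forward_sub(L, b))   # solves (L U) x = b; x == [0.5, 2.0]
```

In the new API, this computation is exposed as `torch.linalg.lu_solve(LU, pivots, B)` applied to the output of `torch.linalg.lu_factor`.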
@facebook-github-bot
Copy link
Contributor
facebook-github-bot commented May 17, 2022


✅ No Failures (0 Pending)

As of commit fd63fd8 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@lezcano
Copy link
Collaborator Author
lezcano commented May 17, 2022

@malfet here's the new stack. Although let's wait first for the CI to run, because I had to perform a fairly non-trivial merge.

@lezcano lezcano requested review from malfet and removed request for nikitaved, albanD, soulitzer, ngimel, bdhirsh, IvanYashchuk, anjali411 and mruberry May 17, 2022 08:52
@lezcano lezcano added module: linear algebra Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul release notes: linalg_frontend release notes category labels May 17, 2022
lezcano added 3 commits May 17, 2022 14:28
Relanding #72935

[ghstack-poisoned]
Relanding #72935

[ghstack-poisoned]
Relanding #72935

[ghstack-poisoned]
lezcano added 3 commits May 25, 2022 16:30
Relanding #72935

[ghstack-poisoned]
Relanding #72935

[ghstack-poisoned]
Relanding #72935

[ghstack-poisoned]
@ngimel
Copy link
Collaborator
ngimel commented May 26, 2022

Do we need to run this on ci/trunk? Why was it reverted previously?

@lezcano
Copy link
Collaborator Author
lezcano commented May 27, 2022

It broke an internal build. See the comments on #72935.

@ngimel
Copy link
Collaborator
ngimel commented May 27, 2022

Ok, so what next? We can't just reland it because it'll break the internal build again?

@lezcano
Copy link
Collaborator Author
lezcano commented May 27, 2022

According to #72935 (comment), I'd assume that @malfet already corrected the internal error?

@ngimel
Copy link
Collaborator
ngimel commented May 31, 2022

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ngimel
Copy link
Collaborator
ngimel commented May 31, 2022

I'm still getting internal build failures on this one

Summary: 
ld.lld: error: undefined symbol: at::native::DispatchStubImpl::get_call_ptr(c10::DeviceType, void*)
stderr: ld.lld: error: undefined symbol: at::native::DispatchStubImpl::get_call_ptr(c10::DeviceType, void*)

so we cannot land it. Coming from unpack_pivots_stub:

/aten/src/ATen/native/cuda/linalg/BatchLinearAlgebra.cpp.o:(at::native::DispatchStub<void (*)(at::TensorIterator&, long), at::native::unpack_pivots_stub>::get_call_ptr(c10::DeviceType))
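For context on the error above: `DispatchStub` is ATen's per-device kernel dispatch table, and the undefined symbol means `torch_cuda` references the stub's lookup helper without the matching definition being linked in. A loose Python analogy (hypothetical names, not the actual C++ machinery) of a stub whose lookup fails because no implementation was ever provided:

```python
# Hypothetical analogy of a DispatchStub-style registry: kernels are
# registered per device type, and lookup fails when the registration
# (like a symbol definition at link time) is missing.

class DispatchStub:
    def __init__(self, name):
        self.name = name
        self.impls = {}                  # device type -> kernel function

    def register(self, device, fn):
        self.impls[device] = fn

    def __call__(self, device, *args):
        kernel = self.impls.get(device)
        if kernel is None:
            # analogous to the linker's "undefined symbol": the stub is
            # declared, but no implementation was ever provided
            raise RuntimeError(f"no kernel registered for {self.name} on {device}")
        return kernel(*args)

unpack_pivots = DispatchStub("unpack_pivots_stub")
unpack_pivots.register("cpu", lambda pivots: list(pivots))
unpack_pivots("cpu", (2, 3, 1))          # works: returns [2, 3, 1]
```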

@lezcano
Copy link
Collaborator Author
lezcano commented Jun 7, 2022

@malfet any updates on this one?

@malfet
Copy link
Contributor
malfet commented Jun 7, 2022

@malfet any updates on this one?

Sorry, it completely slipped my mind. Do you mind rebasing it against the latest viable/strict and letting me try the import again?

Relanding #72935

Differential Revision: [D36793144](https://our.internmc.facebook.com/intern/diff/D36793144)

[ghstack-poisoned]
@lezcano
Copy link
Collaborator Author
lezcano commented Jun 7, 2022

@malfet rebased.
I needed to add a missing import to a file unrelated to this PR (not sure why, but it was not building locally...)

@malfet
Copy link
Contributor
malfet commented Jun 7, 2022

@malfet has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@malfet
Copy link
Contributor
malfet commented Jun 7, 2022

Ok, I finally understand what is going on (in an ideal world, torch_cuda should be oblivious to the supported CPU architectures; this is enforced internally but not in the OSS build, so I have an idea of how to fix it internally):

, void *DEFAULT
#ifdef HAVE_AVX512_CPU_DEFINITION
, void *AVX512
#endif
#ifdef HAVE_AVX2_CPU_DEFINITION
, void *AVX2
#endif

Link to internal fix: D36981363 (essentially propagates the HAVE_AVX2_CPU_DEFINITION flag to both torch_cpu and torch_cuda)

@lezcano
Copy link
Collaborator Author
lezcano commented Jun 7, 2022

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

facebook-github-bot pushed a commit that referenced this pull request Jun 8, 2022
Summary:
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
it by solving the system via two triangular solves instead.

We also update the heuristics for this function, as they were fairly
outdated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy MAGMA backend for this function.

We added thorough tests for this function, as well as tests for the
different backends. We also activated the tests for AMD, as those
should work as well.

Fixes #61657

Pull Request resolved: #77634

Approved by: https://github.com/malfet

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/c7d6cec078bcd0b0652ba10d1d55931b27be9f36

Reviewed By: osalpekar

Differential Revision: D36793144

Pulled By: osalpekar

fbshipit-source-id: fd15e677ff625ef53b42acf3217b9bba443a66d0
@facebook-github-bot facebook-github-bot deleted the gh/Lezcano/72/head branch June 11, 2022 14:17
} else {
return c10::MaybeOwned<Tensor>::borrowed(pivots);
}
}

} // anonymous namespace
static void lu_solve_kernel(const Tensor& LU, const Tensor& pivots, const Tensor& B, TransposeType trans) {
// Trivial case. Remove it once `torch.solve` is removed, as linalg.solve already shortcuts this case
Copy link
Contributor


@lezcano is it time to remove this now?

Copy link
Collaborator Author


It's rather minor, but sure, happy to approve a PR that removes this provided that CI passes

Labels
cla signed Merged module: linear algebra Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul open source release notes: linalg_frontend release notes category topic: new features topic category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants