Expose lu_factor_batched_cublas by lezcano · Pull Request #73877 · pytorch/pytorch · GitHub

Expose lu_factor_batched_cublas #73877


Closed
wants to merge 43 commits into from

Conversation

lezcano
Collaborator
@lezcano lezcano commented Mar 7, 2022

Stack from ghstack:

We had bindings for this function, but they were not exposed as a
function inside ATen. This PR exposes them.

This function is tested in the next PR of the stack, where it is used in
linalg.lu_factor.

@pytorch-bot
pytorch-bot bot commented Mar 7, 2022
CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/610834b351fbc669c5fd905a8b699c1244ecf143/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-manywheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4-mobile-lightweight-dispatch-build ciflow/all, ciflow/cpu, ciflow/default, ciflow/libtorch, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-libtorch-debug ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-libtorch-release ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla 🚫 skipped

@facebook-github-bot
Contributor
facebook-github-bot commented Mar 7, 2022

🔗 Helpful links

✅ No Failures (0 Pending)

As of commit cbb7fdf (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚



@lezcano lezcano added the module: linear algebra Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul label Mar 8, 2022
lezcano added 2 commits March 8, 2022 18:16
@IvanYashchuk
Collaborator

getrfBatched is used in linalg.inv:

// Heuristic: For small batch size or large matrix size, we use for-loop to iterate over the batches instead of
// calling the batched cublas routine.
if (batch_size <= 8 || /* batch_size > 8 && */ n >= 512) {

Also, the description of the PR that added this function (#42403) says that getrfBatched does not perform well.
Let's see what your performance comparison reveals.
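
For readers following along, the condition quoted above can be isolated as a small predicate. This is only a sketch: the helper name `prefer_looped_getrf` is hypothetical, and the thresholds (8 and 512) are copied from the quoted `linalg.inv` heuristic. Whether the same cutoffs suit `lu_factor` is exactly what the benchmark would need to show.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of ATen): mirrors the condition quoted from
// linalg.inv. Returns true when looping over the batch one matrix at a time
// is preferred over a single batched cuBLAS getrfBatched call.
inline bool prefer_looped_getrf(std::int64_t batch_size, std::int64_t n) {
  assert(batch_size >= 0 && n >= 0);
  // Small batches or large matrices: take the per-matrix loop.
  return batch_size <= 8 || n >= 512;
}
```

Under this predicate, a batch of 64 matrices of size 32x32 would go to the batched routine, while a batch of 4 such matrices, or any batch of 512x512 matrices, would take the looped path.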

@lezcano
Collaborator Author
lezcano commented Mar 10, 2022

For what it's worth, that PR only tested batches of up to 4 matrices, which is not very representative either.

lezcano added 5 commits March 10, 2022 19:19
lezcano added 2 commits March 30, 2022 21:38
@lezcano lezcano added release notes: cuda release notes category topic: not user facing topic category labels May 5, 2022
lezcano added 2 commits May 17, 2022 10:31
Comment on lines +109 to +112
auto a_ptr_array_data = reinterpret_cast<scalar_t**>(a_ptr_array.data_ptr());

at::cuda::blas::getrfBatched(n, a_ptr_array_data, lda, pivots_data, infos_data, batch_size);
#endif
Collaborator
@nikitaved nikitaved May 17, 2022

This one handles non-contiguous inputs, right? Or is it done one level above in the stack?

Collaborator Author

Non-contiguous inputs are handled in the parent function, that is, in linalg_lu_factor.
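
As background for the snippet under review: the batched cuBLAS routines take an array of pointers, one per matrix, rather than a single base pointer, which is why the layout has to be settled before this call. The sketch below shows the idea on the host only. The helper name `make_pointer_array` and its parameters are hypothetical, it assumes the matrices are stored back to back at a fixed stride, and in the PR the pointer array is a tensor (`a_ptr_array`) rather than a `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side helper (not the ATen implementation): returns one
// pointer per matrix for a batch whose matrices are laid out back to back,
// `matrix_stride` elements apart. Assumes each matrix is itself contiguous.
template <typename scalar_t>
std::vector<scalar_t*> make_pointer_array(scalar_t* base,
                                          std::int64_t batch_size,
                                          std::int64_t matrix_stride) {
  assert(base != nullptr && batch_size >= 0 && matrix_stride > 0);
  std::vector<scalar_t*> ptrs(static_cast<std::size_t>(batch_size));
  for (std::int64_t i = 0; i < batch_size; ++i) {
    // Pointer to the first element of the i-th matrix in the batch.
    ptrs[static_cast<std::size_t>(i)] = base + i * matrix_stride;
  }
  return ptrs;
}
```

With a fixed stride the i-th entry is simply `base + i * matrix_stride`, so a batch of three 2x2 matrices stored contiguously would yield pointers at element offsets 0, 4 and 8.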

Collaborator
@nikitaved nikitaved left a comment

LGTM! Thanks!

lezcano added 4 commits May 17, 2022 14:30
lezcano added 8 commits May 18, 2022 19:03
Collaborator
@mruberry mruberry left a comment

Stamped!

@facebook-github-bot facebook-github-bot deleted the gh/Lezcano/51/head branch June 14, 2022 14:16
facebook-github-bot pushed a commit that referenced this pull request Jun 14, 2022
Summary:
We had bindings for this function, but they were not exposed as a
function inside ATen. This PR exposes them.

This function is tested in the next PR of the stack, where it is used in
linalg.lu_solve.

Pull Request resolved: #73877

Approved by: https://github.com/nikitaved, https://github.com/IvanYashchuk, https://github.com/mruberry

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/1880d293b64badac1b74e9c6bb0152e3950d3f7e

Reviewed By: osalpekar

Differential Revision: D37089123

Pulled By: osalpekar

fbshipit-source-id: 55989cba1b2f8e0cbb1b7dae326bf2e52e3b0fb5
Labels
cla signed, module: linear algebra (Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul), open source, release notes: cuda (release notes category), topic: not user facing (topic category)
7 participants