8000 [ROCm] Enable sort operator BF16 support by pruthvistony · Pull Request #72854 · pytorch/pytorch · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

[ROCm] Enable sort operator BF16 support #72854

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from

Conversation

pruthvistony
Copy link
Collaborator
@pruthvistony pruthvistony commented Feb 15, 2022

The changes add support for dtype BF16 for sort operator in ROCm.

Relates - #58196

Relanding the change - #71226

@jeffdaily @jithunnair-amd @dllehr-amd Please review this PR.

@pytorch-bot
Copy link
pytorch-bot bot commented Feb 15, 2022
CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/ROCmSoftwarePlatform/pytorch/blob/92c55976ce0d49d32720d2a79fdff37fc00dc602/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-manywheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk, ciflow/xla ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped

@facebook-github-bot
Copy link
Contributor
facebook-github-bot commented Feb 15, 2022

🔗 Helpful links

💊 CI failures summary and remediations

As of commit fb65be0 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@pruthvistony
Copy link
Collaborator Author
pruthvistony commented Feb 15, 2022

Copied Jithun's comments

@pruthvistony @dllehr-amd Can you please elaborate on the breakage? The argsort unit test was enabled for float16 on ROCm by this PR: https://github.com/pytorch/pytorch/pull/67706/files#diff-18a0c3e7cd07089b34760028c387d39de1aa393172f1d5f123692106b3a3d144R15390

but CI passed for that PR. And I'm not sure how PR #71226 impacted it since it was for bfloat16?

--- There are 2 possible solutions add BF16 to the argsort ROCm dtype or removing the dtype specialization for ROCm.

Dependency is due to

def sample_inputs_argsort(*args, **kwargs):

Copy link
Contributor
@dllehr-amd dllehr-amd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good minus a few comments with dtypesIfROCM

@H-Huang H-Huang added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Feb 15, 2022
Copy link
Collaborator Author
@pruthvistony pruthvistony left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes are updated.

Copy link
Collaborator Author
@pruthvistony pruthvistony left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review comments are taken care.

Copy link
Contributor
@dllehr-amd dllehr-amd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@facebook-github-bot
Copy link
Contributor

@malfet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Feb 23, 2022
Summary:
The changes add support for dtype BF16 for sort operator in ROCm.

Relates - #58196

Relanding the change - #71226

jeffdaily jithunnair-amd dllehr-amd Please review this PR.

Pull Request resolved: #72854

Reviewed By: zou3519

Differential Revision: D34284313

Pulled By: malfet

fbshipit-source-id: abcfea84ea53874008d56416425849e990ebf15b
@github-actions
Copy link
Contributor

Hey @pruthvistony.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

@jithunnair-amd jithunnair-amd deleted the new_bf16_sort branch February 28, 2022 18:24
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 3, 2022
Summary:
The changes add support for dtype BF16 for sort operator in ROCm.

Relates - pytorch/pytorch#58196

Relanding the change - pytorch/pytorch#71226

jeffdaily jithunnair-amd dllehr-amd Please review this PR.

Pull Request resolved: pytorch/pytorch#72854

Reviewed By: zou3519

Differential Revision: D34284313

Pulled By: malfet

fbshipit-source-id: abcfea84ea53874008d56416425849e990ebf15b
(cherry picked from commit e9e7e3e0472b726ff2fd5f115962d3c835fb33db)
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 3, 2022
Summary:
The changes add support for dtype BF16 for sort operator in ROCm.

Relates - pytorch/pytorch#58196

Relanding the change - pytorch/pytorch#71226

jeffdaily jithunnair-amd dllehr-amd Please review this PR.

Pull Request resolved: pytorch/pytorch#72854

Reviewed By: zou3519

Differential Revision: D34284313

Pulled By: malfet

fbshipit-source-id: abcfea84ea53874008d56416425849e990ebf15b
(cherry picked from commit e9e7e3e0472b726ff2fd5f115962d3c835fb33db)
pytorchmergebot pushed a commit that referenced this pull request Mar 28, 2022
Support for PR #[59977](#59977) in ROCm
Fixes #[56176](#56176) in ROCm since ROCm 5.0 supports dtype bfloat16 for sorting.

Dependency - PR #[72854](#72854) to be merged before

CC Please review @jithunnair-amd @jeffdaily
Pull Request resolved: #71913
Approved by: https://github.com/jithunnair-amd, https://github.com/osalpekar
facebook-github-bot pushed a commit that referenced this pull request Mar 30, 2022
Summary:
Support for PR #[59977](#59977) in ROCm
Fixes #[56176](#56176) in ROCm since ROCm 5.0 supports dtype bfloat16 for sorting.

Dependency - PR #[72854](#72854) to be merged before

CC Please review jithunnair-amd jeffdaily
Pull Request resolved: #71913
Approved by: https://github.com/jithunnair-amd, https://github.com/osalpekar

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/6675f1e6970087cfc5ba67e7cf7690768e326464

Reviewed By: malfet

Differential Revision: D35205235

fbshipit-source-id: 906575b43aa57d8674a774d812ebfcd0037f4617
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla signed module: rocm AMD GPU support for Pytorch open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants
0