Back out "Revert D34524207: [pytorch][PR] remove _s_where" by ngimel · Pull Request #73579 · pytorch/pytorch · GitHub

Back out "Revert D34524207: [pytorch][PR] remove _s_where" #73579


Closed
wants to merge 1 commit into from

Conversation

@ngimel (Collaborator) commented Mar 1, 2022

Summary:
Original commit changeset: 87b1220d851c

Original Phabricator Diff: D34524207 (4eb2482)

Test Plan: OSS tests

Differential Revision: D34554432

fbshipit-source-id: 4366db10349289ef447f95a2f0615b2b9447a633
@pytorch-bot (bot) commented Mar 1, 2022
CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/ngimel/pytorch/blob/896f3edf73fa941d84394f0d7b65e885ba1b79db/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows, their ciflow labels, and status:
Triggered Workflows
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
linux-binary-manywheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
windows-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla 🚫 skipped

@facebook-github-bot (Contributor) commented Mar 1, 2022

💊 CI failures summary and remediations

As of commit 896f3ed (more details on the Dr. CI page):



🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build linux-xenial-py3.7-gcc5.4 / test (backwards_compat, 1, 1, linux.2xlarge) (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-07T17:45:22.2801972Z The PR is introduc...m to confirm whether this change is wanted or not.
2022-03-07T17:45:22.2784829Z processing existing schema:  text(__torch__.torch.classes.profiling.SourceRef _0) -> (str _0)
2022-03-07T17:45:22.2786889Z processing existing schema:  count(__torch__.torch.classes.profiling.InstructionStats _0) -> (int _0)
2022-03-07T17:45:22.2787950Z processing existing schema:  duration_ns(__torch__.torch.classes.profiling.InstructionStats _0) -> (int _0)
2022-03-07T17:45:22.2789579Z processing existing schema:  source(__torch__.torch.classes.profiling.SourceStats _0) -> (__torch__.torch.classes.profiling.SourceRef _0)
2022-03-07T17:45:22.2792547Z processing existing schema:  line_map(__torch__.torch.classes.profiling.SourceStats _0) -> (Dict(int, __torch__.torch.classes.profiling.InstructionStats) _0)
2022-03-07T17:45:22.2793679Z processing existing schema:  __init__(__torch__.torch.classes.profiling._ScriptProfile _0) -> (NoneType _0)
2022-03-07T17:45:22.2795240Z processing existing schema:  enable(__torch__.torch.classes.profiling._ScriptProfile _0) -> (NoneType _0)
2022-03-07T17:45:22.2797353Z processing existing schema:  disable(__torch__.torch.classes.profiling._ScriptProfile _0) -> (NoneType _0)
2022-03-07T17:45:22.2799978Z processing existing schema:  _dump_stats(__torch__.torch.classes.profiling._ScriptProfile _0) -> (__torch__.torch.classes.profiling.SourceStats[] _0)
2022-03-07T17:45:22.2801326Z processing existing schema:  __init__(__torch__.torch.classes.dist_rpc.WorkerInfo _0, str _1, int _2) -> (NoneType _0)
2022-03-07T17:45:22.2801972Z The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not. 
2022-03-07T17:45:22.2802353Z 
2022-03-07T17:45:22.2802455Z Broken ops: [
2022-03-07T17:45:22.2803180Z 	aten::_nested_tensor(Tensor[] list, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None) -> (Tensor)
2022-03-07T17:45:22.2804090Z 	quantized::conv2d_relu_cudnn(Tensor act, Tensor weight, Tensor? bias, int[] stride, int[] padding, int[] dilation, int groups, float output_scale, int output_zero_point) -> (Tensor)
2022-03-07T17:45:22.2804561Z ]
2022-03-07T17:45:22.3594051Z + cleanup
2022-03-07T17:45:22.3594279Z + retcode=1
2022-03-07T17:45:22.3594476Z + set +x
2022-03-07T17:45:22.3630877Z ##[error]Process completed with exit code 1.
2022-03-07T17:45:22.3660492Z ##[group]Run # Ensure the working directory gets chowned back to the current user

🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch.

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D34554432

@ngimel (Collaborator, Author) commented Mar 1, 2022

@JackCaoG, can you please prepare the xla patch for this? The failure on the original PR: https://ossci-raw-job-status.s3.amazonaws.com/log/5371985105

2022-03-01T08:17:39.5711444Z Traceback (most recent call last):
2022-03-01T08:17:39.5711979Z   File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
2022-03-01T08:17:39.5712305Z     "__main__", mod_spec)
2022-03-01T08:17:39.5712611Z   File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
2022-03-01T08:17:39.5713059Z     exec(code, run_globals)
2022-03-01T08:17:39.5713391Z   File "/var/lib/jenkins/workspace/tools/codegen/gen_backend_stubs.py", line 318, in <module>
2022-03-01T08:17:39.5713697Z     main()
2022-03-01T08:17:39.5714038Z   File "/var/lib/jenkins/workspace/tools/codegen/gen_backend_stubs.py", line 201, in main
2022-03-01T08:17:39.5714819Z     run(options.source_yaml, options.output_dir, options.dry_run, options.impl_path)
2022-03-01T08:17:39.5715301Z   File "/var/lib/jenkins/workspace/tools/codegen/gen_backend_stubs.py", line 292, in run
2022-03-01T08:17:39.5716046Z     parsed_backend_yaml = parse_backend_yaml(source_yaml, grouped_native_functions, backend_indices)
2022-03-01T08:17:39.5716650Z   File "/var/lib/jenkins/workspace/tools/codegen/gen_backend_stubs.py", line 100, in parse_backend_yaml
2022-03-01T08:17:39.5717000Z     supported, backend_key, use_out_as_primary=use_out_as_primary, use_device_guard=use_device_guard)
2022-03-01T08:17:39.5717350Z   File "/var/lib/jenkins/workspace/tools/codegen/gen_backend_stubs.py", line 81, in create_backend_index
2022-03-01T08:17:39.5717670Z     assert op_name in native_functions_map, f"Found an invalid operator name: {op_name}"
2022-03-01T08:17:39.5717959Z AssertionError: Found an invalid operator name: _s_where
2022-03-01T08:17:39.6227504Z Failed to generate ATEN bindings: ['/var/lib/jenkins/workspace/xla/scripts/generate_code.sh']

This PR removes the _s_where function; where is used instead where needed.

@JackCaoG (Collaborator) commented Mar 1, 2022

@ngimel Will do. Any chance we can merge #62084 first? I have a pending xla PR to update where for pytorch/xla, and that will likely cause some merge conflicts for me.

@ngimel (Collaborator, Author) commented Mar 1, 2022

Looks like #62084 is stalled and won't be landed any time soon.

@JackCaoG (Collaborator) commented Mar 1, 2022

@ngimel OK I can work on this fix now then

@ngimel (Collaborator, Author) commented Mar 3, 2022

Thanks @JackCaoG, I'll try to land this PR tomorrow; I'll let you know when it lands.

@JackCaoG (Collaborator) commented Mar 3, 2022

@ngimel Sounds good, I need to fix some errors on the xla side but it should be ready by tomorrow.

@ngimel (Collaborator, Author) commented Mar 3, 2022

Thanks, let me know if you want to wait till Monday.

@JackCaoG (Collaborator) commented Mar 4, 2022

@ngimel I'm finding an additional bug regarding where broadcasting; can we push the merge to Monday?

@JackCaoG (Collaborator) commented Mar 4, 2022

Is there any difference between _s_where and where.self? I pretty much just renamed the op and am now seeing some unexpected test failures.

@ngimel (Collaborator, Author) commented Mar 4, 2022

Ah OK, it just landed but I'll revert. No, there is no difference between _s_where and where.self; what is the error you are getting? where supports broadcasting, and _s_where nominally didn't: it was supposed to get inputs of the same size (but the way it was written, it could still support broadcasting).
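
For illustration, a minimal example of the broadcasting behavior where provides (shapes chosen for illustration only; this snippet is not from the PR):

import torch

cond = torch.rand(8) > 0.5   # condition of shape [8]
a = torch.randn(8, 8)        # "self" of shape [8, 8]
b = torch.randn(8, 8)        # "other" of shape [8, 8]

# where broadcasts all three inputs to a common shape ([8] -> [8, 8] here)
# and then selects elementwise: a where cond is True, b elsewhere.
out = torch.where(cond, a, b)
print(out.shape)  # torch.Size([8, 8])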

@JackCaoG (Collaborator) commented Mar 4, 2022

I got some xla error

non_scalar_shape.value().dimensions() == shape->dimensions() Unimplemented implicit broadcast.

Pytorch/XLA's where under the hood calls xla::Select, which seems like it doesn't support broadcasting (at least not the cases that where supported). Let me look at the test case and see if I can add the broadcasting logic in the pytorch/xla layer.
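
A minimal sketch of that idea in plain PyTorch (illustrative only; the actual pytorch/xla lowering is C++, and the helper name here is hypothetical): expand all three inputs to one common shape up front, so the underlying select only ever sees equal shapes.

import torch

def where_with_explicit_broadcast(cond, a, b):
    # Expand condition, self and other to a single common shape first,
    # so a backend select that requires equal shapes can be used afterwards.
    cond_e, a_e, b_e = torch.broadcast_tensors(cond, a, b)
    return torch.where(cond_e, a_e, b_e)

cond = torch.rand(8) > 0.5            # [8]
a = torch.randn(8, 8)                 # [8, 8]
b = torch.randn(8, 8)                 # [8, 8]
print(where_with_explicit_broadcast(cond, a, b).shape)  # torch.Size([8, 8])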

facebook-github-bot pushed a commit that referenced this pull request Mar 4, 2022
Summary:
Pull Request resolved: #73579

Original commit changeset: 87b1220d851c

Original Phabricator Diff: D34524207 (4eb2482)

Test Plan: OSS tests

Reviewed By: malfet

Differential Revision: D34554432

fbshipit-source-id: 2f3601d3d4261ebcebb05b4b1aec0c9a8a00ea04
@facebook-github-bot (Contributor)

This pull request has been reverted by 5552563. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert, and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).

@JackCaoG (Collaborator) commented Mar 4, 2022

@ngimel Where can I find the where broadcasting logic? I ran into issues like

condition: [8] self: [8, 8] other: [8, 8]

which are easy to fix, but cases like

where([10,1, 10], [10, 10], [10, 10, 1]) -> [10, 10, 10]

are more confusing.

@ngimel (Collaborator, Author) commented Mar 4, 2022

Broadcasting semantics are described in https://pytorch.org/docs/stable/notes/broadcasting.html?highlight=broadcasting: you line the tensors up at the last dimension and expand dimensions of size 1 starting from there. Previously, all the tensors were sent to expand_outplace to make sure they were expanded to the same size according to these rules (https://github.com/pytorch/pytorch/pull/73579/files#diff-586c0d10a44eaff10f3ef1ccf8e647ee60cc962ef73c549889f12de27600ccebL339), and the expanded tensors were then sent to _s_where.
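
As a concrete check of those rules on the case above (illustrative snippet, not part of the PR):

import torch

# Align shapes at the last dimension; size-1 (or missing) dims are expanded:
#   condition: [10,  1, 10]
#   self:          [10, 10]   (treated as [ 1, 10, 10])
#   other:     [10, 10,  1]
print(torch.broadcast_shapes((10, 1, 10), (10, 10), (10, 10, 1)))  # torch.Size([10, 10, 10])

cond = torch.rand(10, 1, 10) > 0.5
a = torch.randn(10, 10)
b = torch.randn(10, 10, 1)
print(torch.where(cond, a, b).shape)  # torch.Size([10, 10, 10])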

@JackCaoG (Collaborator) commented Mar 4, 2022

I was trying to reverse engineer expand_outplace, but then I realized I should just call it in the pt/xla code. Tests are passing locally; I will submit and let CI verify.

@ngimel ngimel reopened this Mar 7, 2022
@facebook-github-bot (Contributor)

@ngimel has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@JackCaoG (Collaborator) commented Mar 7, 2022

@ngimel The xla PR is ready; I think we are ready to merge.

@ngimel (Collaborator, Author) commented Mar 8, 2022

@JackCaoG diff is landing

facebook-github-bot pushed a commit that referenced this pull request Mar 8, 2022
Summary:
Original commit changeset: 87b1220d851c

Original Phabricator Diff: D34524207 (4eb2482)

Pull Request resolved: #73579

Test Plan:
OSS tests
tested with canary https://www.internalfb.com/intern/ads/canary/441912928798660873

Reviewed By: ezyang

Differential Revision: D34688237

Pulled By: ngimel

fbshipit-source-id: 32f3a0046053ef52e95ab45a26bfc1de17e7e061
@github-actions (bot) commented Mar 8, 2022

Hey @ngimel.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.
