opinfo : nn.functional.embedding by kshitij12345 · Pull Request #66622 · pytorch/pytorch · GitHub

opinfo : nn.functional.embedding #66622


Closed

Conversation

kshitij12345
Collaborator

Adds an OpInfo entry for nn.functional.embedding.
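For reference, the entry added to torch/testing/_internal/common_methods_invocations.py has roughly the following shape. This is an illustrative fragment, not the literal diff: OpInfo, floating_types_and, and sample_inputs_embedding are names defined in (or imported by) that file, and the exact dtypes and arguments in the PR may differ.

# Illustrative OpInfo database entry (fragment; lives inside the op_db list
# in common_methods_invocations.py, so no imports are needed there).
OpInfo(
    'nn.functional.embedding',
    dtypes=floating_types_and(torch.bfloat16, torch.float16),
    sample_inputs_func=sample_inputs_embedding,
    supports_out=False,
)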

A few cases where the numerical gradient doesn't match the analytical one (gradcheck fails):

import torch

# padding_idx: the forward output still depends on the padded row, but the
# backward pass deliberately skips accumulating its gradient, so the
# analytical gradient is zero where the numerical one is not.
try:
    t = torch.randn(2, 1, dtype=torch.float64, requires_grad=True)
    idx = torch.tensor([0, 1])
    torch.autograd.gradcheck(lambda idx, t: torch.nn.functional.embedding(idx, t, padding_idx=1), (idx, t))
except Exception as e:
    print("PADDING IDX:", e)

# max_norm: the referenced rows of t may be renormalized in place during the
# forward pass, a mutation that gradcheck's comparison does not account for.
try:
    t = torch.ones(2, 1, dtype=torch.float64, requires_grad=True)
    idx = torch.tensor([0, 1])
    torch.autograd.gradcheck(lambda idx, t: torch.nn.functional.embedding(idx, t, max_norm=1.), (idx, t))
except Exception as e:
    print("MAX NORM:", e)

# scale_grad_by_freq: the backward pass scales each row's gradient by the
# inverse frequency of its index in idx (index 1 appears twice here), so the
# analytical gradient intentionally differs from the numerical Jacobian.
try:
    t = torch.randn(2, 1, dtype=torch.float64, requires_grad=True)
    idx = torch.tensor([0, 1, 1])
    torch.autograd.gradcheck(lambda idx, t: torch.nn.functional.embedding(idx, t, scale_grad_by_freq=True), (idx, t))
except Exception as e:
    print("SCALE GRAD BY FREQUENCY:", e)

# sparse=True: the backward pass returns a sparse gradient, which gradcheck
# does not handle with its default settings.
try:
    t = torch.randn(2, 1, dtype=torch.float64, requires_grad=True)
    idx = torch.tensor([0, 1])
    torch.autograd.gradcheck(lambda idx, t: torch.nn.functional.embedding(idx, t, sparse=True), (idx, t))
except Exception as e:
    print("SPARSE:", e)

@pytorch-probot
pytorch-probot bot commented Oct 14, 2021
CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/kshitij12345/pytorch/blob/fce376af7814c65757fb406a488ee48ec97c23da/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Triggered Workflows

| Workflow | Labels | Status |
| --- | --- | --- |
| linux-bionic-py3.6-clang9 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/xla | ✅ triggered |
| linux-vulkan-bionic-py3.6-clang9 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/vulkan | ✅ triggered |
| linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-clang7-asan | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers | ✅ triggered |
| linux-xenial-py3.6-clang7-onnx | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx | ✅ triggered |
| linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-gcc7-bazel-test | ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| win-vs2019-cpu-py3 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/win | ✅ triggered |
| win-vs2019-cuda11.3-py3 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/win | ✅ triggered |

Skipped Workflows

| Workflow | Labels | Status |
| --- | --- | --- |
| libtorch-linux-xenial-cuda10.2-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| libtorch-linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| linux-bionic-cuda10.2-py3.9-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow | 🚫 skipped |
| linux-xenial-cuda10.2-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow | 🚫 skipped |
| parallelnative-linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux | 🚫 skipped |
| periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck | 🚫 skipped |
| periodic-linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-win-vs2019-cuda11.1-py3 | ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win | 🚫 skipped |
| puretorch-linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux | 🚫 skipped |

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and triggering the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

@facebook-github-bot
Contributor
facebook-github-bot commented Oct 14, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit fce376a (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-scanned failure(s)

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI.

@kshitij12345 kshitij12345 requested a review from zou3519 October 14, 2021 13:47
@kshitij12345 kshitij12345 marked this pull request as ready for review October 14, 2021 13:48
@soulitzer soulitzer added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Oct 15, 2021
@kshitij12345
Collaborator Author

@zou3519 gentle ping :)

@zou3519
Contributor
zou3519 commented Oct 18, 2021

A few cases where the numerical gradient doesn't match the analytical one (gradcheck fails)

These all make sense; those flags on embedding cause weird things to happen to the gradients.

yield SampleInput(make_input((M, S)), args=(idx,),)

if not requires_grad:
    # Following inputs return different gradient from the numerical gradient.
@zou3519
Contributor
zou3519 commented Oct 18, 2021

nit: we should make it clear that these are expected
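One way to make that explicit, as a hedged sketch (the generator body, make_input, idx, and the M/S shape constants follow the fragment above; the exact kwargs in the PR's diff may differ):

if not requires_grad:
    # These samples are *expected* to produce an analytical gradient that
    # differs from the numerical one (see the PR description), so they are
    # generated only for forward-only testing.
    yield SampleInput(make_input((M, S)), args=(idx,), kwargs=dict(padding_idx=0))
    yield SampleInput(make_input((M, S)), args=(idx,), kwargs=dict(max_norm=1.))
    yield SampleInput(make_input((M, S)), args=(idx,), kwargs=dict(scale_grad_by_freq=True))
    yield SampleInput(make_input((M, S)), args=(idx,), kwargs=dict(sparse=True))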

@zou3519
Contributor
zou3519 left a comment

LGTM! One really minor nit

@facebook-github-bot
Contributor

@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@zou3519 merged this pull request in ed5633d.

@soulitzer
Contributor

Sorry, I will have to revert this as it is breaking master:

======================================================================
FAIL [0.053s]: test_neg_view_nn_functional_embedding_cuda_float64 (__main__.TestMathBitsCUDA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 1408, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 1408, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py", line 371, in instantiated_test
    result = test(self, **param_kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py", line 737, in test_wrapper
    return test(*args, **kwargs)
  File "/var/lib/jenkins/workspace/test/test_ops.py", line 1064, in test_neg_view
    self._test_math_view(device, dtype, op, _requires_grad, math_op_physical, math_op_view, is_bit_set,
  File "/var/lib/jenkins/workspace/test/test_ops.py", line 1005, in _test_math_view
    self.assertEqual(expected_forward, forward_with_mathview)
  File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 1889, in assertEqual
    super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=1e-07 and atol=1e-07, found 10 element(s) (out of 20) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.3243844201848294 (0.006377774802576542 vs. 0.33076219498740594), which occurred at index (0, 1, 0).
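For context, test_neg_view checks that an op returns the same result whether its input is an ordinary tensor or a logically equal "negative view" of its negation. Roughly the invariant being enforced, as a sketch (torch._neg_view is the internal helper the test suite uses; shapes here are illustrative):

import torch

t = torch.randn(2, 1, dtype=torch.float64)
idx = torch.tensor([0, 1])
# torch._neg_view(t.neg()) is numerically equal to t, so embedding should
# produce identical outputs for both representations.
expected = torch.nn.functional.embedding(idx, t)
via_view = torch.nn.functional.embedding(idx, torch._neg_view(t.neg()))
assert torch.allclose(expected, via_view)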

@facebook-github-bot
Contributor

This pull request has been reverted by 067365fa87516cdaa9d6df3984f4ec13583016a7. To re-land this change, follow these steps.

@facebook-github-bot
Contributor

This pull request has been reverted by 94f4b22. To re-land this change, follow these steps.

@zou3519
Contributor
zou3519 commented Oct 20, 2021

I guess we should have rebased first. @kshitij12345 could you open a new PR please?

@facebook-github-bot
Contributor

This pull request has been reverted by 94f4b22. To re-land this change, follow these steps.

@kshitij12345
Collaborator Author

@soulitzer why was this reverted? Thanks

@soulitzer
Contributor

It didn't get reverted (again). I think this is just an issue with facebook-github-bot replaying the revert comment from earlier, so we probably don't need to worry. You can check that the revert commit has the same commit hash as in #66622 (comment).

@kshitij12345
Collaborator Author

@soulitzer Ah, I see. Apologies for the noise. Thanks!

Labels
cla signed, Merged, open source, Reverted, triaged