Scaffolding for meta tensor crossref testing by ezyang · Pull Request #75994 · pytorch/pytorch · GitHub

Scaffolding for meta tensor crossref testing #75994

Closed
ezyang wants to merge 27 commits

Conversation

ezyang
Contributor
@ezyang ezyang commented Apr 18, 2022

Stack from ghstack (oldest at bottom):

There's some utility stuff I could have stacked separately but haven't done so, shout if you want me to.

  • torch.overrides.resolve_name: this takes a public Torch API function and returns a string name corresponding to it. I found this pretty useful for giving good messages because the repr on most of our function objects is pretty useless (NB: this should be fixed). It's missing a little bit of functionality: direct calls to torch.ops, torch._VF and torch._C._nn don't work yet. This needed a bit of surgery on _get_overridable_functions, and a nicer refactor would be good, though I'm not exactly sure how to do it.
  • I added two utility functions that I needed for test scaffolding: torch._C._set_storage_via_tensor, which is like set_ but takes a tensor rather than a storage (I need this because meta storages don't work), and torch._C._is_batched, which lets me test whether a tensor is batched (as far as I can tell, there's no way to test this in userland).
  • I added a new testing envvar PYTORCH_TEST_WITH_COVERAGE_DB which will write out a sqlite3 database recording which tests called which torch API functions. This is useful if you're working on a particular torch function and want to quickly execute all tests that exercise that function (not just the obvious ones). From experimentation, I observed that it's best to have a fairly normalized database representation, which reduces the disk size and makes inserts to the database go faster; I define a view to recover the 'naive' viewpoint on the database.

How does crossref testing work? When PYTORCH_TEST_WITH_CROSSREF=1, we install a torch function mode which will attempt to run the equivalent of any torch API call with all meta arguments. For now, we just run it and don't check if any of the results are right, but that is the next logical step.
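
The control flow described above can be illustrated with a torch-free toy model. This is only a sketch: ToyTensor, to_meta, add, call_api, and crossref_log are hypothetical names, and call_api stands in for the interception that a real torch function mode performs. Only the PYTORCH_TEST_WITH_CROSSREF variable comes from the description.

```python
import os


class ToyTensor:
    """Hypothetical stand-in for a tensor: a shape plus optional data."""

    def __init__(self, shape, data=None):
        self.shape = tuple(shape)
        self.data = data  # None models a "meta" tensor: metadata only, no data

    @property
    def is_meta(self):
        return self.data is None


def to_meta(value):
    """Replace toy tensors by data-free copies; pass everything else through."""
    if isinstance(value, ToyTensor):
        return ToyTensor(value.shape)
    return value


def add(a, b):
    """Toy elementwise add that also supports meta inputs (shape-only result)."""
    if a.shape != b.shape:
        raise ValueError("shape mismatch")
    if a.is_meta or b.is_meta:
        return ToyTensor(a.shape)
    return ToyTensor(a.shape, [x + y for x, y in zip(a.data, b.data)])


crossref_log = []  # names of functions that were also run with meta arguments


def crossref_enabled():
    return os.environ.get("PYTORCH_TEST_WITH_CROSSREF") == "1"


def call_api(func, *args, **kwargs):
    """Hypothetical interception point standing in for a torch function mode."""
    if crossref_enabled():
        meta_args = [to_meta(a) for a in args]
        meta_kwargs = {k: to_meta(v) for k, v in kwargs.items()}
        # Run the meta equivalent. As in the description, the result is
        # not compared against the real result yet.
        func(*meta_args, **meta_kwargs)
        crossref_log.append(func.__name__)
    # The real call always runs and its result is what the caller sees.
    return func(*args, **kwargs)
```

The point of the sketch is that the meta run is a side computation: the caller always receives the result of the real call, so enabling the mode should not change test behavior unless the meta run itself raises.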

Doing the same operation, but with meta arguments, is quite involved.

  • Input tensors may share the same storage, and this must be preserved so that the corresponding meta tensors share storage too; meta_storage ensures that if we see the same storage multiple times, we map it to the same meta storage. However, meta tensors don't actually support Python storage bindings right now, so I instead return a tensor and use _set_storage_via_tensor to install it later.
  • The obvious things we have to preserve in the meta tensor are dtype and size. However, here are some more non-obvious things we have to preserve: inference mode, leaf-ness, view-ness, strides, storage offset, conj bit, neg bit.
  • Many features we do not support as meta tensors: sparse CSR, sparse, MKLDNN, quantized, nested, functionalization, negative bit, conjugate bit, complex tensors. These don't get transformed, and if a function has an argument that is any of these, we skip the cross-reference test.
  • I played around with what to do with tensor subclasses, but in the end I decided that it was too difficult to understand what any given subclass was doing, so we skip cross-ref tests if the tensor in question is not exactly a Tensor or Parameter.
  • There is a giant blocklist of functions which are known not to work with meta tensors. Previously, I automatically determined this by looking for a NotImplementedError, but this made the test suite run very slowly (because C++/Python exceptions are slow). So instead I ran the test suite once slowly, collecting all the functions that failed, and then put them into the blocklist. When you add meta function support, you just delete functions from this list.
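
The bookkeeping in the list above (memoized storage mapping, preserved metadata, exact-type check, unsupported features, blocklist) can be sketched without torch. Everything here is a toy model under stated assumptions: ToyStorage, ToyTensor, ToySubclass, MetaConverter, should_crossref, and the set contents are hypothetical, and only a subset of the metadata named above is modeled.

```python
from dataclasses import dataclass


class ToyStorage:
    """Hypothetical stand-in for a tensor storage (a flat buffer)."""

    def __init__(self, nbytes, device="cpu"):
        self.nbytes = nbytes
        self.device = device


@dataclass
class ToyTensor:
    """Hypothetical stand-in for a tensor: a strided view onto a storage."""

    storage: ToyStorage
    size: tuple
    stride: tuple
    storage_offset: int = 0
    dtype: str = "float32"
    layout: str = "strided"


class ToySubclass(ToyTensor):
    """Models a tensor subclass whose behavior we cannot predict."""


UNSUPPORTED_LAYOUTS = {"sparse", "sparse_csr", "mkldnn"}  # illustrative subset
META_BLOCKLIST = {"toy.unsupported_op"}  # hypothetical blocklist entry


class MetaConverter:
    def __init__(self):
        # id(real storage) -> (real storage, meta storage). Holding the real
        # storage keeps it alive so its id cannot be reused by another object.
        self._storage_memo = {}

    def meta_storage(self, storage):
        """Map each distinct real storage to exactly one meta storage."""
        key = id(storage)
        if key not in self._storage_memo:
            meta = ToyStorage(storage.nbytes, device="meta")
            self._storage_memo[key] = (storage, meta)
        return self._storage_memo[key][1]

    def can_convert(self, t):
        # Exact type check: subclasses are deliberately rejected.
        return type(t) is ToyTensor and t.layout not in UNSUPPORTED_LAYOUTS

    def __call__(self, t):
        # Preserve size, stride, offset, and dtype; only the storage changes.
        return ToyTensor(
            self.meta_storage(t.storage),
            t.size,
            t.stride,
            t.storage_offset,
            t.dtype,
            t.layout,
        )


def should_crossref(func_name, tensors, converter):
    """Skip blocklisted functions and calls with unconvertible arguments."""
    if func_name in META_BLOCKLIST:
        return False
    return all(converter.can_convert(t) for t in tensors)
```

Because the memo is keyed on the storage rather than the tensor, two views of one buffer convert to two meta tensors that still alias each other, which is the property the first bullet asks for.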

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
@facebook-github-bot
Contributor
facebook-github-bot commented Apr 18, 2022

💊 CI failures summary and remediations

As of commit a1810e9 (more details on the Dr. CI page):

  • 12/12 failures introduced in this PR

🕵️ 12 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / win-vs2019-cpu-py3 / test (default, 1, 2, windows.4xlarge) (1/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T19:54:09.6356044Z FAIL [0.016s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T19:54:09.6354708Z Traceback (most recent call last):
2022-05-05T19:54:09.6355030Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 376, in instantiated_test
2022-05-05T19:54:09.6355121Z     result = test(self, **param_kwargs)
2022-05-05T19:54:09.6355386Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 808, in dep_fn
2022-05-05T19:54:09.6355472Z     return fn(slf, *args, **kwargs)
2022-05-05T19:54:09.6355598Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T19:54:09.6355678Z     self.fail('Doctest failed')
2022-05-05T19:54:09.6355769Z AssertionError: Doctest failed
2022-05-05T19:54:09.6355775Z 
2022-05-05T19:54:09.6355868Z ======================================================================
2022-05-05T19:54:09.6356044Z FAIL [0.016s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T19:54:09.6356188Z ----------------------------------------------------------------------
2022-05-05T19:54:09.6356283Z Traceback (most recent call last):
2022-05-05T19:54:09.6356569Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 376, in instantiated_test
2022-05-05T19:54:09.6356661Z     result = test(self, **param_kwargs)
2022-05-05T19:54:09.6357008Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 808, in dep_fn
2022-05-05T19:54:09.6357097Z     return fn(slf, *args, **kwargs)
2022-05-05T19:54:09.6357226Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T19:54:09.6357304Z     self.fail('Doctest failed')
2022-05-05T19:54:09.6357394Z AssertionError: Doctest failed
2022-05-05T19:54:09.6357399Z 

See GitHub Actions build Lint / lintrunner (2/12)

Step: "Run lintrunner on PR files" (full log | diagnosis details | 🔁 rerun)

2022-05-05T18:24:12.8342830Z ##[error]Module has no attribute "_debug_tensor"
2022-05-05T18:23:56.2544027Z # in JSON mode and use jq to massage the output into GitHub Actions
2022-05-05T18:23:56.2544326Z # workflow commands.
2022-05-05T18:23:56.2544637Z lintrunner --merge-base-with "${PR_BASE_SHA}" --output=json | \
2022-05-05T18:23:56.2545290Z   jq --raw-output '"::\(if .severity == "advice" or .severity == "disabled" then "warning" else .severity end) file=\(.path),line=\(.line),col=\(.char),title=\(.code) \(.name)::" + (.description | gsub("\\n"; "%0A"))'
2022-05-05T18:23:56.2586339Z shell: /bin/bash -e {0}
2022-05-05T18:23:56.2586553Z env:
2022-05-05T18:23:56.2586837Z   pythonLocation: /opt/hostedtoolcache/Python/3.8.12/x64
2022-05-05T18:23:56.2587183Z   LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.8.12/x64/lib
2022-05-05T18:23:56.2587539Z   PR_BASE_SHA: aa578d8b209aa7e70ce38320364ebb4bc7cde54f
2022-05-05T18:23:56.2587806Z ##[endgroup]
2022-05-05T18:24:12.8342830Z ##[error]Module has no attribute "_debug_tensor" 
2022-05-05T18:24:12.8380100Z Post job cleanup.
2022-05-05T18:24:12.8414060Z Post job cleanup.
2022-05-05T18:24:12.9470676Z [command]/usr/bin/git version
2022-05-05T18:24:12.9532642Z git version 2.36.0
2022-05-05T18:24:12.9587022Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand
2022-05-05T18:24:12.9663158Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :
2022-05-05T18:24:12.9944556Z Entering 'android/libs/fbjni'
2022-05-05T18:24:12.9989321Z Entering 'third_party/FP16'
2022-05-05T18:24:13.0034746Z Entering 'third_party/FXdiv'
2022-05-05T18:24:13.0077603Z Entering 'third_party/NNPACK'

See GitHub Actions build pull / linux-xenial-py3.7-gcc5.4 / test (default, 1, 2, linux.2xlarge) (3/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T18:49:41.1983918Z FAIL [0.003s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T18:49:41.1982631Z Traceback (most recent call last):
2022-05-05T18:49:41.1982913Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-05-05T18:49:41.1983024Z     result = test(self, **param_kwargs)
2022-05-05T18:49:41.1983289Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 808, in dep_fn
2022-05-05T18:49:41.1983374Z     return fn(slf, *args, **kwargs)
2022-05-05T18:49:41.1983478Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T18:49:41.1983598Z     self.fail('Doctest failed')
2022-05-05T18:49:41.1983688Z AssertionError: Doctest failed
2022-05-05T18:49:41.1983692Z 
2022-05-05T18:49:41.1983786Z ======================================================================
2022-05-05T18:49:41.1983918Z FAIL [0.003s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T18:49:41.1984098Z ----------------------------------------------------------------------
2022-05-05T18:49:41.1984188Z Traceback (most recent call last):
2022-05-05T18:49:41.1984472Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-05-05T18:49:41.1984564Z     result = test(self, **param_kwargs)
2022-05-05T18:49:41.1984831Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 808, in dep_fn
2022-05-05T18:49:41.1984916Z     return fn(slf, *args, **kwargs)
2022-05-05T18:49:41.1985022Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T18:49:41.1985131Z     self.fail('Doctest failed')
2022-05-05T18:49:41.1985221Z AssertionError: Doctest failed
2022-05-05T18:49:41.1985226Z 

See GitHub Actions build pull / linux-bionic-py3.7-clang9 / test (crossref, 1, 2, linux.2xlarge) (4/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T19:02:27.5827508Z RuntimeError: test_ops failed!
2022-05-05T19:02:26.3012007Z 
2022-05-05T19:02:26.3012092Z Generating XML reports...
2022-05-05T19:02:26.8492201Z Generated XML report: test-reports/python-unittest/test_ops/TEST-TestCommonCPU-20220505183514.xml
2022-05-05T19:02:27.0742336Z Generated XML report: test-reports/python-unittest/test_ops/TEST-TestCompositeComplianceCPU-20220505183514.xml
2022-05-05T19:02:27.1582341Z Generated XML report: test-reports/python-unittest/test_ops/TEST-TestMathBitsCPU-20220505183514.xml
2022-05-05T19:02:27.5822911Z Traceback (most recent call last):
2022-05-05T19:02:27.5823399Z   File "test/run_test.py", line 1070, in <module>
2022-05-05T19:02:27.5825049Z     main()
2022-05-05T19:02:27.5825397Z   File "test/run_test.py", line 1048, in main
2022-05-05T19:02:27.5827102Z     raise RuntimeError(err_message)
2022-05-05T19:02:27.5827508Z RuntimeError: test_ops failed!
2022-05-05T19:02:27.8223054Z 
2022-05-05T19:02:27.8223319Z real	27m18.027s
2022-05-05T19:02:27.8223516Z user	64m0.227s
2022-05-05T19:02:27.8224579Z sys	1m47.557s
2022-05-05T19:02:27.8224933Z + cleanup
2022-05-05T19:02:27.8225165Z + retcode=1
2022-05-05T19:02:27.8225326Z + set +x
2022-05-05T19:02:27.8256642Z ##[error]Process completed with exit code 1.
2022-05-05T19:02:27.8335099Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master
2022-05-05T19:02:27.8335337Z with:

See GitHub Actions build pull / linux-bionic-py3.7-clang9 / test (default, 2, 2, linux.2xlarge) (5/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T18:36:25.7541336Z FAIL [0.003s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T18:36:25.7538914Z Traceback (most recent call last):
2022-05-05T18:36:25.7539718Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-05-05T18:36:25.7539866Z     result = test(self, **param_kwargs)
2022-05-05T18:36:25.7540310Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 808, in dep_fn
2022-05-05T18:36:25.7540452Z     return fn(slf, *args, **kwargs)
2022-05-05T18:36:25.7540622Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T18:36:25.7540797Z     self.fail('Doctest failed')
2022-05-05T18:36:25.7540946Z AssertionError: Doctest failed
2022-05-05T18:36:25.7540956Z 
2022-05-05T18:36:25.7541104Z ======================================================================
2022-05-05T18:36:25.7541336Z FAIL [0.003s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T18:36:25.7541683Z ----------------------------------------------------------------------
2022-05-05T18:36:25.7541850Z Traceback (most recent call last):
2022-05-05T18:36:25.7542359Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-05-05T18:36:25.7542506Z     result = test(self, **param_kwargs)
2022-05-05T18:36:25.7543063Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 808, in dep_fn
2022-05-05T18:36:25.7543232Z     return fn(slf, *args, **kwargs)
2022-05-05T18:36:25.7543435Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T18:36:25.7543664Z     self.fail('Doctest failed')
2022-05-05T18:36:25.7543841Z AssertionError: Doctest failed
2022-05-05T18:36:25.7543850Z 

See GitHub Actions build pull / win-vs2019-cuda11.3-py3 / test (default, 1, 2, windows.8xlarge.nvidia.gpu) (6/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T22:00:35.6523070Z FAIL [0.005s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T22:00:35.6521447Z Traceback (most recent call last):
2022-05-05T22:00:35.6521811Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 376, in instantiated_test
2022-05-05T22:00:35.6521935Z     result = test(self, **param_kwargs)
2022-05-05T22:00:35.6522271Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 808, in dep_fn
2022-05-05T22:00:35.6522384Z     return fn(slf, *args, **kwargs)
2022-05-05T22:00:35.6522547Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T22:00:35.6522647Z     self.fail('Doctest failed')
2022-05-05T22:00:35.6522765Z AssertionError: Doctest failed
2022-05-05T22:00:35.6522772Z 
2022-05-05T22:00:35.6522893Z ======================================================================
2022-05-05T22:00:35.6523070Z FAIL [0.005s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T22:00:35.6523351Z ----------------------------------------------------------------------
2022-05-05T22:00:35.6524395Z Traceback (most recent call last):
2022-05-05T22:00:35.6524836Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 376, in instantiated_test
2022-05-05T22:00:35.6524960Z     result = test(self, **param_kwargs)
2022-05-05T22:00:35.6525298Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 808, in dep_fn
2022-05-05T22:00:35.6525411Z     return fn(slf, *args, **kwargs)
2022-05-05T22:00:35.6525577Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T22:00:35.6525676Z     self.fail('Doctest failed')
2022-05-05T22:00:35.6525794Z AssertionError: Doctest failed
2022-05-05T22:00:35.6525802Z 

See GitHub Actions build pull / linux-xenial-py3.7-gcc7 / test (default, 1, 2, linux.2xlarge) (7/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T18:35:21.9142588Z FAIL [0.002s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T18:35:21.9139853Z Traceback (most recent call last):
2022-05-05T18:35:21.9140484Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-05-05T18:35:21.9140671Z     result = test(self, **param_kwargs)
2022-05-05T18:35:21.9141266Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 808, in dep_fn
2022-05-05T18:35:21.9141448Z     return fn(slf, *args, **kwargs)
2022-05-05T18:35:21.9141674Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T18:35:21.9141929Z     self.fail('Doctest failed')
2022-05-05T18:35:21.9142109Z AssertionError: Doctest failed
2022-05-05T18:35:21.9142136Z 
2022-05-05T18:35:21.9142310Z ======================================================================
2022-05-05T18:35:21.9142588Z FAIL [0.002s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T18:35:21.9142997Z ----------------------------------------------------------------------
2022-05-05T18:35:21.9143193Z Traceback (most recent call last):
2022-05-05T18:35:21.9143812Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-05-05T18:35:21.9143998Z     result = test(self, **param_kwargs)
2022-05-05T18:35:21.9144577Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 808, in dep_fn
2022-05-05T18:35:21.9144754Z     return fn(slf, *args, **kwargs)
2022-05-05T18:35:21.9144965Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T18:35:21.9145215Z     self.fail('Doctest failed')
2022-05-05T18:35:21.9145405Z AssertionError: Doctest failed
2022-05-05T18:35:21.9145470Z 

See GitHub Actions build pull / pytorch-xla-linux-bionic-py3.7-clang8 / test (xla, 1, 1, linux.2xlarge) (8/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T20:37:08.6794670Z [  FAILED  ] AtenXlaTensorTest.TestMaxUnpool3DBackward
2022-05-05T20:37:08.6790161Z [----------] 1 test from XlaUtilCacheTest (0 ms total)
2022-05-05T20:37:08.6790364Z 
2022-05-05T20:37:08.6790695Z [----------] Global test environment tear-down
2022-05-05T20:37:08.6791256Z [==========] 623 tests from 8 test suites ran. (500332 ms total)
2022-05-05T20:37:08.6791740Z [  PASSED  ] 619 tests.
2022-05-05T20:37:08.6792188Z [  SKIPPED ] 1 test, listed below:
2022-05-05T20:37:08.6792796Z [  SKIPPED ] AtenXlaTensorTest.TestGroupNormBackward
2022-05-05T20:37:08.6793370Z [  FAILED  ] 3 tests, listed below:
2022-05-05T20:37:08.6793785Z [  FAILED  ] AtenXlaTensorTest.TestMatmul_1x2
2022-05-05T20:37:08.6794291Z [  FAILED  ] AtenXlaTensorTest.TestMaxUnpool2DBackward
2022-05-05T20:37:08.6794670Z [  FAILED  ] AtenXlaTensorTest.TestMaxUnpool3DBackward
2022-05-05T20:37:08.6794853Z 
2022-05-05T20:37:08.6794924Z  3 FAILED TESTS
2022-05-05T20:37:08.8698855Z + cleanup
2022-05-05T20:37:08.8699065Z + retcode=1
2022-05-05T20:37:08.8699217Z + set +x
2022-05-05T20:37:08.8945368Z ##[error]Process completed with exit code 1.
2022-05-05T20:37:08.9149138Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master
2022-05-05T20:37:08.9149377Z with:
2022-05-05T20:37:08.9149719Z   github-token: ***
2022-05-05T20:37:08.9149890Z env:

See GitHub Actions build pull / linux-bionic-py3.7-clang9 / test (crossref, 2, 2, linux.2xlarge) (9/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T18:35:33.4892769Z RuntimeError: test_torch failed!
2022-05-05T18:35:33.2518473Z Generated XML report: test-reports/python-unittest/test_torch/TEST-TestTorchDeviceTypeCPU-20220505183455.xml
2022-05-05T18:35:33.2521532Z Generated XML report: test-reports/python-unittest/test_torch/TEST-TestVitalSignsCudaCPU-20220505183455.xml
2022-05-05T18:35:33.4689045Z [TORCH_VITAL] Dataloader.enabled		 True
2022-05-05T18:35:33.4689447Z [TORCH_VITAL] Dataloader.basic_unit_test		 TEST_VALUE_STRING
2022-05-05T18:35:33.4689786Z [TORCH_VITAL] CUDA.used		 False
2022-05-05T18:35:33.4886342Z Traceback (most recent call last):
2022-05-05T18:35:33.4886707Z   File "test/run_test.py", line 1070, in <module>
2022-05-05T18:35:33.4888981Z     main()
2022-05-05T18:35:33.4889214Z   File "test/run_test.py", line 1048, in main
2022-05-05T18:35:33.4892431Z     raise RuntimeError(err_message)
2022-05-05T18:35:33.4892769Z RuntimeError: test_torch failed!
2022-05-05T18:35:33.7569336Z 
2022-05-05T18:35:33.7569978Z real	0m48.272s
2022-05-05T18:35:33.7570358Z user	1m29.279s
2022-05-05T18:35:33.7570688Z sys	0m3.765s
2022-05-05T18:35:33.7570935Z + cleanup
2022-05-05T18:35:33.7571115Z + retcode=1
2022-05-05T18:35:33.7571292Z + set +x
2022-05-05T18:35:33.7609013Z ##[error]Process completed with exit code 1.
2022-05-05T18:35:33.7651216Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master
2022-05-05T18:35:33.7651471Z with:

See GitHub Actions build pull / linux-xenial-py3.7-gcc5.4 / test (docs_test, 1, 1, linux.2xlarge) (10/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T18:33:24.8838754Z ##[error]Process completed with exit code 2.
2022-05-05T18:33:24.3542406Z     0 failures in cleanup code
2022-05-05T18:33:24.3542762Z build finished with problems.
2022-05-05T18:33:24.8795630Z Makefile:42: recipe for target 'doctest' failed
2022-05-05T18:33:24.8795894Z make: *** [doctest] Error 1
2022-05-05T18:33:24.8796331Z + cleanup
2022-05-05T18:33:24.8796503Z + retcode=2
2022-05-05T18:33:24.8796751Z + set +x
2022-05-05T18:33:24.8800091Z + cleanup
2022-05-05T18:33:24.8800386Z + retcode=2
2022-05-05T18:33:24.8800649Z + set +x
2022-05-05T18:33:24.8838754Z ##[error]Process completed with exit code 2.
2022-05-05T18:33:24.8878959Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master
2022-05-05T18:33:24.8879202Z with:
2022-05-05T18:33:24.8879645Z   github-token: ***
2022-05-05T18:33:24.8879829Z env:
2022-05-05T18:33:24.8879969Z   IN_CI: 1
2022-05-05T18:33:24.8880133Z   IS_GHA: 1
2022-05-05T18:33:24.8880315Z   GIT_DEFAULT_BRANCH: master
2022-05-05T18:33:24.8880485Z ##[endgroup]
2022-05-05T18:33:24.8907549Z ##[group]Run nick-fields/retry@71062288b76e2b6214ebde0e673ce0de1755740a
2022-05-05T18:33:24.8907785Z with:

See GitHub Actions build pull / linux-xenial-py3.7-clang7-asan / test (default, 4, 4, linux.2xlarge) (11/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T19:06:56.3419090Z FAIL [0.006s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T19:06:56.3416585Z Traceback (most recent call last):
2022-05-05T19:06:56.3417153Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-05-05T19:06:56.3417329Z     result = test(self, **param_kwargs)
2022-05-05T19:06:56.3417949Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 808, in dep_fn
2022-05-05T19:06:56.3418111Z     return fn(slf, *args, **kwargs)
2022-05-05T19:06:56.3418295Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T19:06:56.3418532Z     self.fail('Doctest failed')
2022-05-05T19:06:56.3418689Z AssertionError: Doctest failed
2022-05-05T19:06:56.3418699Z 
2022-05-05T19:06:56.3418855Z ======================================================================
2022-05-05T19:06:56.3419090Z FAIL [0.006s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T19:06:56.3419447Z ----------------------------------------------------------------------
2022-05-05T19:06:56.3419608Z Traceback (most recent call last):
2022-05-05T19:06:56.3420120Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-05-05T19:06:56.3420282Z     result = test(self, **param_kwargs)
2022-05-05T19:06:56.3420807Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 808, in dep_fn
2022-05-05T19:06:56.3420968Z     return fn(slf, *args, **kwargs)
2022-05-05T19:06:56.3421303Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T19:06:56.3421532Z     self.fail('Doctest failed')
2022-05-05T19:06:56.3421696Z AssertionError: Doctest failed
2022-05-05T19:06:56.3421706Z 

See GitHub Actions build pull / win-vs2019-cuda11.3-py3 / test (force_on_cpu, 1, 1, windows.4xlarge) (12/12)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-05T21:12:21.6850124Z FAIL [0.000s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T21:12:21.6848793Z Traceback (most recent call last):
2022-05-05T21:12:21.6849067Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 376, in instantiated_test
2022-05-05T21:12:21.6849218Z     result = test(self, **param_kwargs)
2022-05-05T21:12:21.6849477Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 808, in dep_fn
2022-05-05T21:12:21.6849566Z     return fn(slf, *args, **kwargs)
2022-05-05T21:12:21.6849695Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T21:12:21.6849793Z     self.fail('Doctest failed')
2022-05-05T21:12:21.6849885Z AssertionError: Doctest failed
2022-05-05T21:12:21.6849890Z 
2022-05-05T21:12:21.6849984Z ======================================================================
2022-05-05T21:12:21.6850124Z FAIL [0.000s]: test_rfftfreq_cpu (__main__.TestFFTDocExamplesCPU)
2022-05-05T21:12:21.6850268Z ----------------------------------------------------------------------
2022-05-05T21:12:21.6850347Z Traceback (most recent call last):
2022-05-05T21:12:21.6850623Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 376, in instantiated_test
2022-05-05T21:12:21.6850722Z     result = test(self, **param_kwargs)
2022-05-05T21:12:21.6850974Z   File "C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\testing\_internal\common_device_type.py", line 808, in dep_fn
2022-05-05T21:12:21.6851067Z     return fn(slf, *args, **kwargs)
2022-05-05T21:12:21.6851824Z   File "test_spectral_ops.py", line 1527, in test
2022-05-05T21:12:21.6851935Z     self.fail('Doctest failed')
2022-05-05T21:12:21.6852030Z AssertionError: Doctest failed
2022-05-05T21:12:21.6852036Z 

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.


ezyang added a commit that referenced this pull request Apr 18, 2022
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 98de1bc
Pull Request resolved: #75994
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Apr 19, 2022
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: d7e96a8
Pull Request resolved: #75994
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Apr 19, 2022
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 1461c55
Pull Request resolved: #75994
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Apr 19, 2022
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 5d1458b
Pull Request resolved: #75994
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Apr 20, 2022
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 82d9286
Pull Request resolved: #75994
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Apr 20, 2022
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 18fc0ab
Pull Request resolved: #75994
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

TODO: There are failures that correspond to known bugs and need to be
skipped.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: c127657
Pull Request resolved: #76905
ezyang added a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

TODO: There are failures that correspond to known bugs and need to be
skipped.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 900fca4
Pull Request resolved: #76905
ezyang added a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

TODO: There are failures that correspond to known bugs and need to be
skipped.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: b7272dd
Pull Request resolved: #76905
pytorchmergebot pushed a commit that referenced this pull request May 6, 2022
#75994 was taking too long to
ship so I extracted out the CrossRef gadget and had it run on a simple
OpInfo invocation only.

TODO: There are failures that correspond to known bugs and need to be
skipped.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: #76905

Approved by: https://github.com/anjali411, https://github.com/mruberry, https://github.com/albanD
ezyang added a commit that referenced this pull request May 6, 2022
PR #75994 was taking too long to ship so I extracted out the CrossRef gadget and
had it run on a simple OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 7, 2022
PR #75994 was taking too long to ship so I extracted out the CrossRef gadget and
had it run on a simple OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request May 7, 2022
PR #75994 was taking too long to ship so I extracted out the CrossRef gadget and
had it run on a simple OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 5488fed
Pull Request resolved: #77008
pytorchmergebot pushed a commit that referenced this pull request May 7, 2022
PR #75994 was taking too long to ship so I extracted out the CrossRef gadget and
had it run on a simple OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: #77008

Approved by: https://github.com/ngimel
facebook-github-bot pushed a commit that referenced this pull request May 13, 2022
Summary:
PR #75994 was taking too long to ship so I extracted out the CrossRef gadget and
had it run on a simple OpInfo invocation only.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: #77008

Approved by: https://github.com/ngimel

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/60f131fb6c2e3f4a23e64096a3e718a1e669215b

Reviewed By: malfet

Differential Revision: D36250515

fbshipit-source-id: 93cdc3cb9bf4c3375bd679aea8d5f59a09f65585
Contributor
@eellison eellison left a comment


What is the plan for this PR? For context: OpInfos don't have any mixed-device inputs, so they're not very useful for testing FakeTensors; you would need something like this instead.

PRAGMA locking_mode = EXCLUSIVE;
PRAGMA temp_store = MEMORY;

CREATE TABLE IF NOT EXISTS files ( file_id INTEGER NOT NULL PRIMARY KEY, file TEXT NOT NULL UNIQUE );
Contributor

never thought I'd see SQL in pytorch 😮

@ezyang
Contributor Author
ezyang commented May 21, 2022

I'm currently not sure how to prioritize it. You get very good coverage with it, but it's also very painful to do development on: it takes a few hours to run through the entirety of the test suite, and when you issue fixes it is somewhat difficult to understand which tests you should rerun to revalidate. Without fake tensors in mind, I was planning on getting the OpInfo test suite clean first, and then moving on to crossref testing.

Here is my suggestion, @eellison. You should create a variant mode that looks for tests that exercise mixed-device inputs and logs them all. Then (somehow) selectively apply cross-ref testing to this set (the "somehow" is because I don't know how to automatically apply something to a generated list of tests; you'll need to figure something out) and land that.
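
As a very rough illustration of the "log tests with mixed-device inputs" idea, here is a minimal sketch in plain Python. It is not code from this PR: `FakeArg`, `devices_of`, and `MixedDeviceLog` are hypothetical names, arguments are duck-typed on a `device` attribute so the sketch runs without PyTorch, and the wiring that would call `observe` for every torch API call (for example from a `__torch_function__`-based mode) is deliberately left out.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Set


@dataclass
class FakeArg:
    """Hypothetical stand-in for a tensor; only its `device` string matters."""
    device: str


def devices_of(args: Iterable[Any]) -> Set[str]:
    """Collect device types of top-level arguments that expose `.device`."""
    found = set()
    for arg in args:
        dev = getattr(arg, "device", None)
        if dev is not None:
            # Keep only the device type, e.g. "cuda:0" -> "cuda".
            found.add(str(dev).split(":")[0])
    return found


class MixedDeviceLog:
    """Remember which tests made at least one call with mixed device types."""

    def __init__(self) -> None:
        # Maps test name -> names of functions called with mixed devices.
        self.hits: Dict[str, Set[str]] = {}

    def observe(self, test_name: str, func_name: str, args: Iterable[Any]) -> None:
        """Log the call if its arguments span more than one device type."""
        if len(devices_of(args)) > 1:
            self.hits.setdefault(test_name, set()).add(func_name)

    def tests(self) -> Set[str]:
        """Names of tests that would be candidates for cross-ref testing."""
        return set(self.hits)
```

The set returned by `tests()` is the "generated list of tests" the suggestion refers to; how to feed that list back into the test runner is the open question raised in the comment and is not addressed here.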

@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Jul 20, 2022
@github-actions github-actions bot closed this Aug 19, 2022
@facebook-github-bot facebook-github-bot deleted the gh/ezyang/1130/head branch September 18, 2022 14:20