test failure on ubuntu 18.04 with cuda10 · Issue #14753 · pytorch/pytorch · GitHub

test failure on ubuntu 18.04 with cuda10 #14753


Closed
xonobo opened this issue Dec 4, 2018 · 3 comments
Labels: oncall: distributed

xonobo commented Dec 4, 2018

Running test_c10d ... [2018-12-04 16:24:19.898544]
sssssProcess process 0:
Process process 1:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "test_c10d.py", line 412, in _run
    getattr(self, self.id().split(".")[2])()
  File "test_c10d.py", line 378, in wrapper
    fn(self)
  File "test_c10d.py", line 412, in _run
    getattr(self, self.id().split(".")[2])()
  File "test_c10d.py", line 61, in wrapper
    return func(*args, **kwargs)
  File "test_c10d.py", line 1427, in test_queue_reduction
    devices)
  File "test_c10d.py", line 378, in wrapper
    fn(self)
  File "test_c10d.py", line 61, in wrapper
    return func(*args, **kwargs)
  File "test_c10d.py", line 1427, in test_queue_reduction
    devices)
RuntimeError: !gradsBatch.empty() ASSERT FAILED at /home/bozkalayci/github/pytorch/torch/csrc/distributed/c10d/ddp.cpp:132, please report a bug to PyTorch. (queueReduction at /home/bozkalayci/github/pytorch/torch/csrc/distributed/c10d/ddp.cpp:132)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6c (0x7f631e02b08c in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10d::queueReduction(c10d::ProcessGroup&, std::vector<std::vector<at::Tensor, std::allocator<at::Tensor> >, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor> > > >&, std::vector<long, std::allocator<long> > const&) + 0x35b9 (0x7f6333cae5a9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #2: <unknown function> + 0x6c86f4 (0x7f6333ca86f4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x17c955 (0x7f633375c955 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: _PyMethodDef_RawFastCallKeywords + 0x264 (0x55f24e7cd494 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #5: _PyCFunction_FastCallKeywords + 0x21 (0x55f24e7cd5b1 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #6: _PyEval_EvalFrameDefault + 0x4f32 (0x55f24e8295b2 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #7: _PyEval_EvalCodeWithName + 0xb99 (0x55f24e76a2d9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #8: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #9: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #10: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #11: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #12: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #13: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #14: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #15: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #16: _PyFunction_FastCallDict + 0x10b (0x55f24e76a9eb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #17: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #18: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #19: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #20: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #21: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #22: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #23: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #24: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #25: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #26: _PyFunction_FastCallDict + 0x10b (0x55f24e76a9eb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #27: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #28: <unknown function> + 0x171b4a (0x55f24e7c4b4a in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #29: _PyObject_FastCallKeywords + 0x128 (0x55f24e7cd768 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #30: _PyEval_EvalFrameDefault + 0x4ca6 (0x55f24e829326 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #31: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #32: _PyEval_EvalFrameDefault + 0x4b54 (0x55f24e8291d4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #33: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #34: _PyEval_EvalFrameDefault + 0x4b54 (0x55f24e8291d4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #35: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #36: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #37: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #38: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #39: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #40: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #41: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #42: _PyEval_EvalCodeWithName + 0xb99 (0x55f24e76a2d9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #43: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #44: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #45: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #46: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #47: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #48: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #49: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #50: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #51: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #52: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #53: <unknown function> + 0x171c0a (0x55f24e7c4c0a in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #54: _PyObject_FastCallKeywords + 0x4ab (0x55f24e7cdaeb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #55: _PyEval_EvalFrameDefault + 0x4ca6 (0x55f24e829326 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #56: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #57: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #58: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #59: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #60: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #61: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #62: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #63: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)

RuntimeError: !gradsBatch.empty() ASSERT FAILED at /home/bozkalayci/github/pytorch/torch/csrc/distributed/c10d/ddp.cpp:132, please report a bug to PyTorch. (queueReduction at /home/bozkalayci/github/pytorch/torch/csrc/distributed/c10d/ddp.cpp:132)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6c (0x7f631e02b08c in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10d::queueReduction(c10d::ProcessGroup&, std::vector<std::vector<at::Tensor, std::allocator<at::Tensor> >, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor> > > >&, std::vector<long, std::allocator<long> > const&) + 0x35b9 (0x7f6333cae5a9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #2: <unknown function> + 0x6c86f4 (0x7f6333ca86f4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x17c955 (0x7f633375c955 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: _PyMethodDef_RawFastCallKeywords + 0x264 (0x55f24e7cd494 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #5: _PyCFunction_FastCallKeywords + 0x21 (0x55f24e7cd5b1 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #6: _PyEval_EvalFrameDefault + 0x4f32 (0x55f24e8295b2 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #7: _PyEval_EvalCodeWithName + 0xb99 (0x55f24e76a2d9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #8: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #9: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #10: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #11: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #12: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #13: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #14: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #15: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #16: _PyFunction_FastCallDict + 0x10b (0x55f24e76a9eb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #17: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #18: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #19: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #20: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #21: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #22: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #23: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #24: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #25: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #26: _PyFunction_FastCallDict + 0x10b (0x55f24e76a9eb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #27: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #28: <unknown function> + 0x171b4a (0x55f24e7c4b4a in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #29: _PyObject_FastCallKeywords + 0x128 (0x55f24e7cd768 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #30: _PyEval_EvalFrameDefault + 0x4ca6 (0x55f24e829326 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #31: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #32: _PyEval_EvalFrameDefault + 0x4b54 (0x55f24e8291d4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #33: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #34: _PyEval_EvalFrameDefault + 0x4b54 (0x55f24e8291d4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #35: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #36: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #37: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #38: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #39: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #40: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #41: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #42: _PyEval_EvalCodeWithName + 0xb99 (0x55f24e76a2d9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #43: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #44: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #45: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #46: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #47: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #48: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #49: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #50: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #51: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #52: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #53: <unknown function> + 0x171c0a (0x55f24e7c4c0a in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #54: _PyObject_FastCallKeywords + 0x4ab (0x55f24e7cdaeb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #55: _PyEval_EvalFrameDefault + 0x4ca6 (0x55f24e829326 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #56: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #57: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #58: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #59: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #60: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #61: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #62: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #63: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
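
Both tracebacks above pass through `getattr(self, self.id().split(".")[2])()` in `_run`, which is how the child process picks the test method to execute. The following is a minimal sketch of that lookup, assuming the test file is run directly as a script so that the test id has the form `__main__.<Class>.<method>`. `DemoTest` is a hypothetical stand-in, not a class from test_c10d.py.

```python
import unittest


class DemoTest(unittest.TestCase):
    # Pin module and qualified name so the id has exactly three
    # dot-separated parts, as when a test file is run as a script
    # (assumption made for this illustration).
    __module__ = "__main__"
    __qualname__ = "DemoTest"

    def test_queue_reduction(self):
        pass


case = DemoTest("test_queue_reduction")
test_id = case.id()                       # "<module>.<class>.<method>"
method_name = test_id.split(".")[2]       # third component is the method name
bound_method = getattr(case, method_name)  # what _run then calls
print(test_id, "->", method_name)
```

Note that index `2` only yields the method name when the module name contains no dots; with a package-qualified module, `split(".")[-1]` would be the safer choice.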

FssProcess process 0:
Traceback (most recent call last):
  File "/home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "test_c10d.py", line 412, in _run
    getattr(self, self.id().split(".")[2])()
  File "test_c10d.py", line 378, in wrapper
    fn(self)
  File "test_c10d.py", line 61, in wrapper
    return func(*args, **kwargs)
  File "test_c10d.py", line 1453, in test_sync_reduction
    devices)
RuntimeError: !gradsBatch.empty() ASSERT FAILED at /home/bozkalayci/github/pytorch/torch/csrc/distributed/c10d/ddp.cpp:132, please report a bug to PyTorch. (queueReduction at /home/bozkalayci/github/pytorch/torch/csrc/distributed/c10d/ddp.cpp:132)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6c (0x7f631e02b08c in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10d::queueReduction(c10d::ProcessGroup&, std::vector<std::vector<at::Tensor, std::allocator<at::Tensor> >, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor> > > >&, std::vector<long, std::allocator<long> > const&) + 0x35b9 (0x7f6333cae5a9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #2: <unknown function> + 0x6c86f4 (0x7f6333ca86f4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x17c955 (0x7f633375c955 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: _PyMethodDef_RawFastCallKeywords + 0x264 (0x55f24e7cd494 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #5: _PyCFunction_FastCallKeywords + 0x21 (0x55f24e7cd5b1 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #6: _PyEval_EvalFrameDefault + 0x4f32 (0x55f24e8295b2 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #7: _PyEval_EvalCodeWithName + 0xb99 (0x55f24e76a2d9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #8: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #9: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #10: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #11: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #12: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #13: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #14: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #15: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #16: _PyFunction_FastCallDict + 0x10b (0x55f24e76a9eb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #17: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #18: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #19: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #20: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #21: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #22: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #23: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #24: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #25: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #26: _PyFunction_FastCallDict + 0x10b (0x55f24e76a9eb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #27: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #28: <unknown function> + 0x171b4a (0x55f24e7c4b4a in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #29: _PyObject_FastCallKeywords + 0x128 (0x55f24e7cd768 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #30: _PyEval_EvalFrameDefault + 0x4ca6 (0x55f24e829326 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #31: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #32: _PyEval_EvalFrameDefault + 0x4b54 (0x55f24e8291d4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #33: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #34: _PyEval_EvalFrameDefault + 0x4b54 (0x55f24e8291d4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #35: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #36: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #37: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #38: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #39: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #40: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #41: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #42: _PyEval_EvalCodeWithName + 0xb99 (0x55f24e76a2d9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #43: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #44: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #45: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #46: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #47: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #48: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #49: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #50: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #51: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #52: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #53: <unknown function> + 0x171c0a (0x55f24e7c4c0a in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #54: _PyObject_FastCallKeywords + 0x4ab (0x55f24e7cdaeb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #55: _PyEval_EvalFrameDefault + 0x4ca6 (0x55f24e829326 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #56: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #57: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #58: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #59: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #60: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #61: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #62: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #63: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
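
Most of the 64 frames in each backtrace above belong to the CPython interpreter (`.../bin/python`); only the first few point into PyTorch libraries. As a reading aid, here is a rough sketch of a filter that keeps the non-interpreter frames. The function name `non_interpreter_frames`, the regular expression, and the shortened `/env/...` paths in the sample are assumptions made for illustration and are not part of PyTorch or of the original log.

```python
import re

# Matches lines such as:
#   frame #4: symbol + 0x264 (0x55f24e7cd494 in /path/to/binary)
# The symbol may be empty when the frame could not be resolved.
FRAME_RE = re.compile(
    r"^frame #(\d+): (.*?) ?\+ 0x[0-9a-f]+ \(0x[0-9a-f]+ in (.+)\)$"
)


def non_interpreter_frames(lines):
    """Return (index, symbol, library) for frames outside bin/python."""
    kept = []
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # not a backtrace frame line
        index, symbol, binary = match.groups()
        if binary.endswith("/bin/python"):
            continue  # interpreter frame, skip
        kept.append((int(index), symbol, binary.rsplit("/", 1)[-1]))
    return kept


# Abbreviated sample: signature and paths shortened for readability.
sample = [
    "frame #1: c10d::queueReduction(c10d::ProcessGroup&) + 0x35b9 "
    "(0x7f6333cae5a9 in /env/lib/libtorch_python.so)",
    "frame #2: + 0x6c86f4 (0x7f6333ca86f4 in /env/lib/libtorch_python.so)",
    "frame #4: _PyMethodDef_RawFastCallKeywords + 0x264 "
    "(0x55f24e7cd494 in /env/bin/python)",
]
frames = non_interpreter_frames(sample)
for frame in frames:
    print(frame)
```

Applied to the full log, a filter like this would leave frames #0 to #3 of each backtrace, which is where the `queueReduction` assertion is raised.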

Process process 1:
Traceback (most recent call last):
  File "/home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "test_c10d.py", line 412, in _run
    getattr(self, self.id().split(".")[2])()
  File "test_c10d.py", line 378, in wrapper
    fn(self)
  File "test_c10d.py", line 61, in wrapper
    return func(*args, **kwargs)
  File "test_c10d.py", line 1453, in test_sync_reduction
    devices)
RuntimeError: !gradsBatch.empty() ASSERT FAILED at /home/bozkalayci/github/pytorch/torch/csrc/distributed/c10d/ddp.cpp:132, please report a bug to PyTorch. (queueReduction at /home/bozkalayci/github/pytorch/torch/csrc/distributed/c10d/ddp.cpp:132)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6c (0x7f631e02b08c in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10d::queueReduction(c10d::ProcessGroup&, std::vector<std::vector<at::Tensor, std::allocator<at::Tensor> >, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor> > > >&, std::vector<long, std::allocator<long> > const&) + 0x35b9 (0x7f6333cae5a9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #2: <unknown function> + 0x6c86f4 (0x7f6333ca86f4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x17c955 (0x7f633375c955 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: _PyMethodDef_RawFastCallKeywords + 0x264 (0x55f24e7cd494 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #5: _PyCFunction_FastCallKeywords + 0x21 (0x55f24e7cd5b1 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #6: _PyEval_EvalFrameDefault + 0x4f32 (0x55f24e8295b2 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #7: _PyEval_EvalCodeWithName + 0xb99 (0x55f24e76a2d9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #8: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #9: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #10: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #11: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #12: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #13: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #14: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #15: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #16: _PyFunction_FastCallDict + 0x10b (0x55f24e76a9eb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #17: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #18: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #19: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #20: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #21: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #22: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #23: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #24: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #25: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #26: _PyFunction_FastCallDict + 0x10b (0x55f24e76a9eb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #27: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #28: <unknown function> + 0x171b4a (0x55f24e7c4b4a in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #29: _PyObject_FastCallKeywords + 0x128 (0x55f24e7cd768 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #30: _PyEval_EvalFrameDefault + 0x4ca6 (0x55f24e829326 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #31: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #32: _PyEval_EvalFrameDefault + 0x4b54 (0x55f24e8291d4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #33: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #34: _PyEval_EvalFrameDefault + 0x4b54 (0x55f24e8291d4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #35: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #36: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #37: _PyFunction_FastCallKeywords + 0xfb (0x55f24e7cc53b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #38: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #39: _PyEval_EvalCodeWithName + 0x5db (0x55f24e769d1b in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #40: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #41: _PyEval_EvalFrameDefault + 0x466 (0x55f24e824ae6 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #42: _PyEval_EvalCodeWithName + 0xb99 (0x55f24e76a2d9 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #43: _PyFunction_FastCallKeywords + 0x387 (0x55f24e7cc7c7 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #44: _PyEval_EvalFrameDefault + 0x6f5 (0x55f24e824d75 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #45: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #46: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #47: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #48: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #49: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #50: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #51: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #52: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #53: <unknown function> + 0x171c0a (0x55f24e7c4c0a in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #54: _PyObject_FastCallKeywords + 0x4ab (0x55f24e7cdaeb in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #55: _PyEval_EvalFrameDefault + 0x4ca6 (0x55f24e829326 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #56: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #57: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #58: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #59: PyObject_Call + 0x6e (0x55f24e77668e in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #60: _PyEval_EvalFrameDefault + 0x1e4a (0x55f24e8264ca in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #61: _PyEval_EvalCodeWithName + 0x2f8 (0x55f24e769a38 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #62: _PyFunction_FastCallDict + 0x1d4 (0x55f24e76aab4 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)
frame #63: _PyObject_Call_Prepend + 0x63 (0x55f24e781a83 in /home/bozkalayci/.pyenv/versions/miniconda3-latest/envs/py37/bin/python)

F......s..s.s..s........sssss.........
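======================================================================
FAIL: test_queue_reduction (__main__.DistributedDataParallelTest)
----------------------------------------------------------------------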
Traceback (most recent call last):
File "test_c10d.py", line 376, in wrapper
self._join_processes(fn)
File "test_c10d.py", line 421, in _join_processes
self._check_return_codes(elapsed_time)
File "test_c10d.py", line 436, in _check_return_codes
self.assertEqual(first_process.exitcode, 0)
File "/home/bozkalayci/github/pytorch/test/common_utils.py", line 439, in assertEqual
super(TestCase, self).assertLessEqual(abs(x - y), prec, message)
AssertionError: 1 not less than or equal to 1e-05 :

======================================================================
FAIL: test_sync_reduction (__main__.DistributedDataParallelTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test_c10d.py", line 376, in wrapper
self._join_processes(fn)
File "test_c10d.py", line 421, in _join_processes
self._check_return_codes(elapsed_time)
File "test_c10d.py", line 436, in _check_return_codes
self.assertEqual(first_process.exitcode, 0)
File "/home/bozkalayci/github/pytorch/test/common_utils.py", line 439, in assertEqual
super(TestCase, self).assertLessEqual(abs(x - y), prec, message)
AssertionError: 1 not less than or equal to 1e-05 :


Ran 46 tests in 2.523s

FAILED (failures=2, skipped=16)
Traceback (most recent call last):
File "run_test.py", line 424, in <module>
main()
File "run_test.py", line 416, in main
raise RuntimeError(message)
RuntimeError: test_c10d failed!

@xonobo xonobo closed this as completed Dec 4, 2018
@xonobo xonobo reopened this Dec 4, 2018
@zou3519 zou3519 added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Dec 10, 2018
@teng-li
Contributor
teng-li commented Jan 2, 2019

I think this was fixed two or three weeks ago.

@teng-li teng-li closed this as completed Jan 2, 2019
@teng-li teng-li self-assigned this Jan 2, 2019
@teng-li
Contributor
teng-li commented Jan 2, 2019

Fix is here: #14452

@Dobatymo
Dobatymo commented Mar 6, 2019

I use v1.0.1, which should include the above fix, but I still get this error.

Running test_c10d ... [2019-03-06 08:40:13.969132]
sssssProcess process 0:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "test_c10d.py", line 470, in _run
    getattr(self, self.id().split(".")[2])()
  File "test_c10d.py", line 436, in wrapper
    fn(self)
  File "test_c10d.py", line 61, in wrapper
    return func(*args, **kwargs)
  File "test_c10d.py", line 1564, in test_queue_reduction
    devices)
RuntimeError: !gradsBatch.empty() ASSERT FAILED at /pytorch/torch/csrc/distributed/c10d/ddp.cpp:132, please report a bug to PyTorch. (queueReduction at /pytorch/torch/csrc/distributed/c10d/ddp.cpp:132)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f264d3c2021 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f264d3c18ea in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #2: c10d::queueReduction(c10d::ProcessGroup&, std::vector<std::vector<at::Tensor, std::allocator<at::Tensor> >, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor> > > >&, std::vector<long, std::allocator<long> > const&) + 0xbf4 (0x7f26449e5954 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x609ea6 (0x7f26449e1ea6 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x11642e (0x7f26444ee42e in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #5: /usr/bin/python3() [0x5030d5]
frame #6: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #7: /usr/bin/python3() [0x504c28]
frame #8: /usr/bin/python3() [0x58650d]
frame #9: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #10: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)
frame #11: /usr/bin/python3() [0x504c28]
frame #12: /usr/bin/python3() [0x502540]
frame #13: /usr/bin/python3() [0x502f3d]
frame #14: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #15: /usr/bin/python3() [0x504c28]
frame #16: /usr/bin/python3() [0x502540]
frame #17: /usr/bin/python3() [0x502f3d]
frame #18: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #19: _PyFunction_FastCallDict + 0xf5 (0x501945 in /usr/bin/python3)
frame #20: /usr/bin/python3() [0x591461]
frame #21: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #22: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)
frame #23: /usr/bin/python3() [0x502209]
frame #24: /usr/bin/python3() [0x502f3d]
frame #25: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #26: /usr/bin/python3() [0x502209]
frame #27: /usr/bin/python3() [0x502f3d]
frame #28: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #29: /usr/bin/python3() [0x502209]
frame #30: /usr/bin/python3() [0x502f3d]
frame #31: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #32: _PyFunction_FastCallDict + 0xf5 (0x501945 in /usr/bin/python3)
frame #33: /usr/bin/python3() [0x591461]
frame #34: /usr/bin/python3() [0x54b813]
frame #35: /usr/bin/python3() [0x555421]
frame #36: _PyObject_FastCallKeywords + 0x19c (0x5a730c in /usr/bin/python3)
frame #37: /usr/bin/python3() [0x503073]
frame #38: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #39: /usr/bin/python3() [0x502209]
frame #40: /usr/bin/python3() [0x502f3d]
frame #41: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #42: /usr/bin/python3() [0x502209]
frame #43: /usr/bin/python3() [0x502f3d]
frame #44: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #45: /usr/bin/python3() [0x502209]
frame #46: /usr/bin/python3() [0x502f3d]
frame #47: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #48: /usr/bin/python3() [0x502209]
frame #49: /usr/bin/python3() [0x502f3d]
frame #50: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #51: /usr/bin/python3() [0x504c28]
frame #52: /usr/bin/python3() [0x502540]
frame #53: /usr/bin/python3() [0x502f3d]
frame #54: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #55: /usr/bin/python3() [0x504c28]
frame #56: /usr/bin/python3() [0x502540]
frame #57: /usr/bin/python3() [0x502f3d]
frame #58: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #59: /usr/bin/python3() [0x504c28]
frame #60: _PyFunction_FastCallDict + 0x2de (0x501b2e in /usr/bin/python3)
frame #61: /usr/bin/python3() [0x591461]
frame #62: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #63: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)

Process process 1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "test_c10d.py", line 470, in _run
    getattr(self, self.id().split(".")[2])()
  File "test_c10d.py", line 436, in wrapper
    fn(self)
  File "test_c10d.py", line 61, in wrapper
    return func(*args, **kwargs)
  File "test_c10d.py", line 1564, in test_queue_reduction
    devices)
RuntimeError: !gradsBatch.empty() ASSERT FAILED at /pytorch/torch/csrc/distributed/c10d/ddp.cpp:132, please report a bug to PyTorch. (queueReduction at /pytorch/torch/csrc/distributed/c10d/ddp.cpp:132)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f264d3c2021 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f264d3c18ea in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #2: c10d::queueReduction(c10d::ProcessGroup&, std::vector<std::vector<at::Tensor, std::allocator<at::Tensor> >, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor> > > >&, std::vector<long, std::allocator<long> > const&) + 0xbf4 (0x7f26449e5954 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x609ea6 (0x7f26449e1ea6 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x11642e (0x7f26444ee42e in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #5: /usr/bin/python3() [0x5030d5]
frame #6: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #7: /usr/bin/python3() [0x504c28]
frame #8: /usr/bin/python3() [0x58650d]
frame #9: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #10: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)
frame #11: /usr/bin/python3() [0x504c28]
frame #12: /usr/bin/python3() [0x502540]
frame #13: /usr/bin/python3() [0x502f3d]
frame #14: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #15: /usr/bin/python3() [0x504c28]
frame #16: /usr/bin/python3() [0x502540]
frame #17: /usr/bin/python3() [0x502f3d]
frame #18: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #19: _PyFunction_FastCallDict + 0xf5 (0x501945 in /usr/bin/python3)
frame #20: /usr/bin/python3() [0x591461]
frame #21: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #22: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)
frame #23: /usr/bin/python3() [0x502209]
frame #24: /usr/bin/python3() [0x502f3d]
frame #25: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #26: /usr/bin/python3() [0x502209]
frame #27: /usr/bin/python3() [0x502f3d]
frame #28: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #29: /usr/bin/python3() [0x502209]
frame #30: /usr/bin/python3() [0x502f3d]
frame #31: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #32: _PyFunction_FastCallDict + 0xf5 (0x501945 in /usr/bin/python3)
frame #33: /usr/bin/python3() [0x591461]
frame #34: /usr/bin/python3() [0x54b813]
frame #35: /usr/bin/python3() [0x555421]
frame #36: _PyObject_FastCallKeywords + 0x19c (0x5a730c in /usr/bin/python3)
frame #37: /usr/bin/python3() [0x503073]
frame #38: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #39: /usr/bin/python3() [0x502209]
frame #40: /usr/bin/python3() [0x502f3d]
frame #41: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #42: /usr/bin/python3() [0x502209]
frame #43: /usr/bin/python3() [0x502f3d]
frame #44: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #45: /usr/bin/python3() [0x502209]
frame #46: /usr/bin/python3() [0x502f3d]
frame #47: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #48: /usr/bin/python3() [0x502209]
frame #49: /usr/bin/python3() [0x502f3d]
frame #50: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #51: /usr/bin/python3() [0x504c28]
frame #52: /usr/bin/python3() [0x502540]
frame #53: /usr/bin/python3() [0x502f3d]
frame #54: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #55: /usr/bin/python3() [0x504c28]
frame #56: /usr/bin/python3() [0x502540]
frame #57: /usr/bin/python3() [0x502f3d]
frame #58: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #59: /usr/bin/python3() [0x504c28]
frame #60: _PyFunction_FastCallDict + 0x2de (0x501b2e in /usr/bin/python3)
frame #61: /usr/bin/python3() [0x591461]
frame #62: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #63: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)

FssProcess process 0:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "test_c10d.py", line 470, in _run
    getattr(self, self.id().split(".")[2])()
  File "test_c10d.py", line 436, in wrapper
    fn(self)
  File "test_c10d.py", line 61, in wrapper
    return func(*args, **kwargs)
  File "test_c10d.py", line 1590, in test_sync_reduction
    devices)
RuntimeError: !gradsBatch.empty() ASSERT FAILED at /pytorch/torch/csrc/distributed/c10d/ddp.cpp:132, please report a bug to PyTorch. (queueReduction at /pytorch/torch/csrc/distributed/c10d/ddp.cpp:132)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f264d3c2021 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f264d3c18ea in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #2: c10d::queueReduction(c10d::ProcessGroup&, std::vector<std::vector<at::Tensor, std::allocator<at::Tensor> >, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor> > > >&, std::vector<long, std::allocator<long> > const&) + 0xbf4 (0x7f26449e5954 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x609ea6 (0x7f26449e1ea6 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x11642e (0x7f26444ee42e in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #5: /usr/bin/python3() [0x5030d5]
frame #6: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #7: /usr/bin/python3() [0x504c28]
frame #8: /usr/bin/python3() [0x58650d]
frame #9: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #10: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)
frame #11: /usr/bin/python3() [0x504c28]
frame #12: /usr/bin/python3() [0x502540]
frame #13: /usr/bin/python3() [0x502f3d]
frame #14: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #15: /usr/bin/python3() [0x504c28]
frame #16: /usr/bin/python3() [0x502540]
frame #17: /usr/bin/python3() [0x502f3d]
frame #18: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #19: _PyFunction_FastCallDict + 0xf5 (0x501945 in /usr/bin/python3)
frame #20: /usr/bin/python3() [0x591461]
frame #21: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #22: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)
frame #23: /usr/bin/python3() [0x502209]
frame #24: /usr/bin/python3() [0x502f3d]
frame #25: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #26: /usr/bin/python3() [0x502209]
frame #27: /usr/bin/python3() [0x502f3d]
frame #28: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #29: /usr/bin/python3() [0x502209]
frame #30: /usr/bin/python3() [0x502f3d]
frame #31: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #32: _PyFunction_FastCallDict + 0xf5 (0x501945 in /usr/bin/python3)
frame #33: /usr/bin/python3() [0x591461]
frame #34: /usr/bin/python3() [0x54b813]
frame #35: /usr/bin/python3() [0x555421]
frame #36: _PyObject_FastCallKeywords + 0x19c (0x5a730c in /usr/bin/python3)
frame #37: /usr/bin/python3() [0x503073]
frame #38: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #39: /usr/bin/python3() [0x502209]
frame #40: /usr/bin/python3() [0x502f3d]
frame #41: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #42: /usr/bin/python3() [0x502209]
frame #43: /usr/bin/python3() [0x502f3d]
frame #44: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #45: /usr/bin/python3() [0x502209]
frame #46: /usr/bin/python3() [0x502f3d]
frame #47: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #48: /usr/bin/python3() [0x502209]
frame #49: /usr/bin/python3() [0x502f3d]
frame #50: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #51: /usr/bin/python3() [0x504c28]
frame #52: /usr/bin/python3() [0x502540]
frame #53: /usr/bin/python3() [0x502f3d]
frame #54: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #55: /usr/bin/python3() [0x504c28]
frame #56: /usr/bin/python3() [0x502540]
frame #57: /usr/bin/python3() [0x502f3d]
frame #58: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #59: /usr/bin/python3() [0x504c28]
frame #60: _PyFunction_FastCallDict + 0x2de (0x501b2e in /usr/bin/python3)
frame #61: /usr/bin/python3() [0x591461]
frame #62: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #63: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)

Process process 1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "test_c10d.py", line 470, in _run
    getattr(self, self.id().split(".")[2])()
  File "test_c10d.py", line 436, in wrapper
    fn(self)
  File "test_c10d.py", line 61, in wrapper
    return func(*args, **kwargs)
  File "test_c10d.py", line 1590, in test_sync_reduction
    devices)
RuntimeError: !gradsBatch.empty() ASSERT FAILED at /pytorch/torch/csrc/distributed/c10d/ddp.cpp:132, please report a bug to PyTorch. (queueReduction at /pytorch/torch/csrc/distributed/c10d/ddp.cpp:132)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f264d3c2021 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f264d3c18ea in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #2: c10d::queueReduction(c10d::ProcessGroup&, std::vector<std::vector<at::Tensor, std::allocator<at::Tensor> >, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor> > > >&, std::vector<long, std::allocator<long> > const&) + 0xbf4 (0x7f26449e5954 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x609ea6 (0x7f26449e1ea6 in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x11642e (0x7f26444ee42e in /usr/local/lib/python3.6/dist-packages/torch/lib/libtorch_python.so)
frame #5: /usr/bin/python3() [0x5030d5]
frame #6: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #7: /usr/bin/python3() [0x504c28]
frame #8: /usr/bin/python3() [0x58650d]
frame #9: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #10: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)
frame #11: /usr/bin/python3() [0x504c28]
frame #12: /usr/bin/python3() [0x502540]
frame #13: /usr/bin/python3() [0x502f3d]
frame #14: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #15: /usr/bin/python3() [0x504c28]
frame #16: /usr/bin/python3() [0x502540]
frame #17: /usr/bin/python3() [0x502f3d]
frame #18: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #19: _PyFunction_FastCallDict + 0xf5 (0x501945 in /usr/bin/python3)
frame #20: /usr/bin/python3() [0x591461]
frame #21: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #22: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)
frame #23: /usr/bin/python3() [0x502209]
frame #24: /usr/bin/python3() [0x502f3d]
frame #25: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #26: /usr/bin/python3() [0x502209]
frame #27: /usr/bin/python3() [0x502f3d]
frame #28: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #29: /usr/bin/python3() [0x502209]
frame #30: /usr/bin/python3() [0x502f3d]
frame #31: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #32: _PyFunction_FastCallDict + 0xf5 (0x501945 in /usr/bin/python3)
frame #33: /usr/bin/python3() [0x591461]
frame #34: /usr/bin/python3() [0x54b813]
frame #35: /usr/bin/python3() [0x555421]
frame #36: _PyObject_FastCallKeywords + 0x19c (0x5a730c in /usr/bin/python3)
frame #37: /usr/bin/python3() [0x503073]
frame #38: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #39: /usr/bin/python3() [0x502209]
frame #40: /usr/bin/python3() [0x502f3d]
frame #41: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #42: /usr/bin/python3() [0x502209]
frame #43: /usr/bin/python3() [0x502f3d]
frame #44: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #45: /usr/bin/python3() [0x502209]
frame #46: /usr/bin/python3() [0x502f3d]
frame #47: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #48: /usr/bin/python3() [0x502209]
frame #49: /usr/bin/python3() [0x502f3d]
frame #50: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #51: /usr/bin/python3() [0x504c28]
frame #52: /usr/bin/python3() [0x502540]
frame #53: /usr/bin/python3() [0x502f3d]
frame #54: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #55: /usr/bin/python3() [0x504c28]
frame #56: /usr/bin/python3() [0x502540]
frame #57: /usr/bin/python3() [0x502f3d]
frame #58: _PyEval_EvalFrameDefault + 0x449 (0x506859 in /usr/bin/python3)
frame #59: /usr/bin/python3() [0x504c28]
frame #60: _PyFunction_FastCallDict + 0x2de (0x501b2e in /usr/bin/python3)
frame #61: /usr/bin/python3() [0x591461]
frame #62: PyObject_Call + 0x3e (0x59ebbe in /usr/bin/python3)
frame #63: _PyEval_EvalFrameDefault + 0x1807 (0x507c17 in /usr/bin/python3)

F......s..s..s..s...s..s....sssss.........
======================================================================
FAIL: test_queue_reduction (__main__.DistributedDataParallelTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_c10d.py", line 434, in wrapper
    self._join_processes(fn)
  File "test_c10d.py", line 479, in _join_processes
    self._check_return_codes(elapsed_time)
  File "test_c10d.py", line 494, in _check_return_codes
    self.assertEqual(first_process.exitcode, 0)
  File "/home/xxx/pytorch/test/common_utils.py", line 439, in assertEqual
    super(TestCase, self).assertLessEqual(abs(x - y), prec, message)
AssertionError: 1 not less than or equal to 1e-05 :

======================================================================
FAIL: test_sync_reduction (__main__.DistributedDataParallelTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_c10d.py", line 434, in wrapper
    self._join_processes(fn)
  File "test_c10d.py", line 479, in _join_processes
    self._check_return_codes(elapsed_time)
  File "test_c10d.py", line 494, in _check_return_codes
    self.assertEqual(first_process.exitcode, 0)
  File "/home/xxx/pytorch/test/common_utils.py", line 439, in assertEqual
    super(TestCase, self).assertLessEqual(abs(x - y), prec, message)
AssertionError: 1 not less than or equal to 1e-05 :

----------------------------------------------------------------------
Ran 50 tests in 3.494s

FAILED (failures=2, skipped=18)
Traceback (most recent call last):
  File "test/run_test.py", line 431, in <module>
    main()
  File "test/run_test.py", line 423, in main
    raise RuntimeError(message)
RuntimeError: test_c10d failed!

Note that I use only one GPU.
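
If the single GPU does turn out to be the trigger (nothing in this thread confirms that), one possible workaround is to skip tests that expect several devices instead of letting them hit the assert. Below is a minimal sketch of such a guard. `require_gpus` is a hypothetical helper, not something that exists in `test_c10d.py`, and the device counter is passed in as a parameter so the sketch runs without PyTorch installed.

```python
import functools
import unittest


def require_gpus(required, count_gpus):
    """Skip the wrapped test unless count_gpus() reports at least `required` devices.

    Hypothetical helper for illustration; not part of test_c10d.py.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            available = count_gpus()
            if available < required:
                # unittest reports a raised SkipTest as "s" rather than "F".
                raise unittest.SkipTest(
                    "needs %d GPUs, found %d" % (required, available))
            return fn(*args, **kwargs)
        return wrapper
    return decorator


# With PyTorch installed, the counter could be torch.cuda.device_count:
#
#     @require_gpus(2, torch.cuda.device_count)
#     def test_queue_reduction(self):
#         ...
```

The required count of 2 in the usage comment is an assumption; the actual number of devices these two tests need is not stated in the logs above.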
