Segmentation fault with ITIMER_REAL · Issue #57185 · pytorch/pytorch · GitHub

Segmentation fault with ITIMER_REAL #57185


Open
sternj opened this issue Apr 28, 2021 · 8 comments
Labels
module: macos Mac OS related issues triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

@sternj
sternj commented Apr 28, 2021

🐛 Bug

PyTorch crashes with SIGSEGV when running alongside an ITIMER_REAL timer on macOS x86.

To Reproduce

Steps to reproduce the behavior:

  1. Run code located here on Mac x86

Here is the stack trace from the crashed thread:

Thread 6 Crashed:
0   ???                           	0x00007ffeee6d7138 0 + 140732898570552
1   libtorch_cpu.dylib            	0x000000010392478c at::TensorIteratorBase::serial_for_each(c10::function_ref<void (char**, long long const*, long long, long long)>, at::Range) const + 588
2   libtorch_cpu.dylib            	0x000000010390cdf2 std::__1::__function::__func<at::internal::_parallel_run(long long, long long, long long, std::__1::function<void (long long, long long, unsigned long)> const&)::$_1, std::__1::allocator<at::internal::_parallel_run(long long, long long, long long, std::__1::function<void (long long, long long, unsigned long)> const&)::$_1>, void (int, unsigned long)>::operator()(int&&, unsigned long&&) + 114
3   libtorch_cpu.dylib            	0x000000010390b7ca std::__1::__function::__func<at::(anonymous namespace)::_run_with_pool(std::__1::function<void (int, unsigned long)> const&, unsigned long)::$_3, std::__1::allocator<at::(anonymous namespace)::_run_with_pool(std::__1::function<void (int, unsigned long)> const&, unsigned long)::$_3>, void ()>::operator()() + 42
4   libc10.dylib                  	0x00000001020996c9 c10::ThreadPool::main_loop(unsigned long) + 569
5   libc10.dylib                  	0x0000000102099d43 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, c10::ThreadPool::ThreadPool(int, int, std::__1::function<void ()>)::$_0> >(void*) + 67
6   libsystem_pthread.dylib       	0x00007fff5a16a2eb _pthread_body + 126
7   libsystem_pthread.dylib       	0x00007fff5a16d249 _pthread_start + 66
8   libsystem_pthread.dylib       	0x00007fff5a16940d thread_start + 13

Expected behavior

The program should either run without issue or pass the SIGALRM up to the caller.
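
For readers unfamiliar with ITIMER_REAL, here is a minimal sketch of what "passing up the SIGALRM" looks like from Python. It is not the reproduction code linked above (that link is not preserved here); the helper names `run_with_real_timer` and `busy_work` are hypothetical and use only the standard library. The sketch installs a SIGALRM handler, arms a repeating ITIMER_REAL timer, runs a workload, and counts how many times the handler fired.

```python
import signal
import time


def run_with_real_timer(workload, interval=0.01):
    """Run workload() while a repeating ITIMER_REAL timer delivers SIGALRM.

    Returns (workload result, number of SIGALRM deliveries seen by Python).
    Must be called from the main thread, since signal.signal() requires it.
    """
    ticks = 0

    def on_alarm(signum, frame):
        # Python-level handler: runs in the main thread between bytecodes.
        nonlocal ticks
        ticks += 1

    previous = signal.signal(signal.SIGALRM, on_alarm)
    # First expiry after `interval` seconds, then every `interval` seconds.
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    try:
        result = workload()
    finally:
        # Disarm the timer before restoring the old handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        if previous is not None:
            signal.signal(signal.SIGALRM, previous)
    return result, ticks


def busy_work(duration=0.2):
    """Hypothetical stand-in workload: spin in pure Python for `duration` seconds."""
    deadline = time.monotonic() + duration
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


if __name__ == "__main__":
    result, ticks = run_with_real_timer(busy_work)
    print(f"workload finished ({result} iterations), SIGALRM delivered {ticks} times")
```

Under the expected behavior, swapping `busy_work` for a PyTorch computation (an assumption about the scenario, not something this sketch exercises) would finish normally with a nonzero tick count rather than ending in SIGSEGV.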

Environment

Collecting environment information...
PyTorch version: 1.8.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 10.14.6 (x86_64)
GCC version: Could not collect
Clang version: 11.0.0 (clang-1100.0.33.12)
CMake version: Could not collect

Python version: 3.9 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.1
[pip3] torch==1.8.1
[conda] Could not collect
@emeryberger

Reproduced on my machine as well. I get no segfaults when I run with version 1.5.1, but with 1.8.1, it segfaults on most executions.

Collecting environment information...
PyTorch version: 1.8.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 11.2.3 (x86_64)
GCC version: Could not collect
Clang version: 12.0.0 (clang-1200.0.32.29)
CMake version: version 3.19.1

Python version: 3.6 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.4
[pip3] torch==1.8.1
[conda] Could not collect

@gchanan gchanan added module: macos Mac OS related issues triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module labels Apr 29, 2021
@lamhoangtung
Contributor

I'm having the same problem on PyTorch 1.8.1 as well.

@ananyajoshi2301

Received "Scalene error: received signal SIGSEGV" when using TensorFlow.
Attaching the code for reference:

import tensorflow as tf
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train/255.0, x_test/255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

Version of tensorflow being used with Python 3.8:
tensorflow==2.6.0

@SolomidHero

same problem, any update?

@chrish42
Contributor

I could potentially have a look at this, but I don't have a lot of experience with the Pytorch codebase. It'd be lovely if someone with more experience there could point us in the right direction, at least.

@emeryberger
emeryberger commented Aug 18, 2022

FWIW this is now working for me.

Collecting environment information...
PyTorch version: 1.13.0.dev20220521
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 12.5 (arm64)
GCC version: Could not collect
Clang version: 13.1.6 (clang-1316.0.21.2.5)
CMake version: version 3.23.2
Libc version: N/A

Python version: 3.9.13 (main, May 24 2022, 21:13:51)  [Clang 13.1.6 (clang-1316.0.21.2)] (64-bit runtime)
Python platform: macOS-12.5-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy==0.920
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.23.1
[pip3] torch==1.13.0.dev20220521
[pip3] torchaudio==0.11.0
[pip3] torchvision==0.12.0

@thiagodaedalus

Same problem.

@ogencoglu
ogencoglu commented Nov 12, 2023

Same issue with PyTorch 2.0.0, Python 3.11 on Mac.

9 participants