Memory leak in C++ when running module in separate thread #24237
Comments
This is likely to be caused by some thread local state that isn't cleaned up. Could you try running without MKL and see what happens? |
Hi @pietern, thanks for the quick answer. Sure, do you mean in the Python part when tracing the model? I don't think I use MKL in the C++ part, unless it's inside the torch lib. |
I mean on the C++ side. PyTorch compiled with MKL support will transparently use it, I think. |
I'm not sure how to check if it uses it or run without it. Could you walk me through or point to some documentation? |
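One quick way to check this on the C++ side is the build-time flag that ATen exposes; the sketch below (assuming a standard libtorch CMake project) simply prints it. On the Python side, torch.backends.mkl.is_available() reports the same thing.

    // Prints whether this libtorch build was compiled with MKL support.
    #include <iostream>
    #include <torch/torch.h>

    int main() {
      std::cout << "Built with MKL: "
                << (at::hasMKL() ? "yes" : "no") << std::endl;
      return 0;
    }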
I found a way to not use std::thread in my application, so this is not a problem for me anymore. |
While this still might be an issue by itself, I'll close the issue since you found a workaround. |
I am facing a similar problem. |
I'm facing the same problem using the Rust bindings. |
Hi @pietern, I am facing a similar problem. The memory usage keeps going up when I do inference in separate threads, and without the MKL lib the memory usage is stable. I tried setting the env variable MKL_DISABLE_FAST_MM=1, but it did not work out. |
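If the growth really does come from MKL's per-thread buffer cache, another thing worth trying besides the MKL_DISABLE_FAST_MM environment variable is releasing those buffers explicitly before each worker thread exits. The sketch below rests on that assumption; it requires linking MKL directly so that mkl.h is available, and it is not a confirmed fix for this issue.

    // Sketch: run one inference, then release MKL's cached buffers before
    // the thread that did the work goes away. Requires mkl.h / linking MKL.
    #include <mkl.h>
    #include <torch/script.h>

    void run_inference_once(torch::jit::script::Module& module,
                            const torch::Tensor& input) {
      torch::NoGradGuard no_grad;
      at::Tensor output = module.forward({input}).toTensor();
      // ... consume `output` ...
      mkl_free_buffers();  // release memory held by MKL's internal allocator
                           // (mkl_thread_free_buffers() is the per-thread variant)
    }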
I am facing a similar problem. |
Still the same issue with libtorch 1.7. |
I'm having the exact same issue using libtorch called in a thread from Unity.

C++ script:

    torch::NoGradGuard no_grad;
    at::Tensor tensor_image = torch::from_blob(...);
    tensor_image.set_requires_grad(false);
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(tensor_image);
    at::Tensor output;
    output = model.forward(inputs).toTensor();
    ...

(Unity) C# script calling the libtorch script:

    void Update() {
        ...
        ThreadMl = new Thread(Action);
        ThreadMl.Start();
    }

    private void Action() { // launched in a thread
        ...
        ScriptML(...); // Memory being leaked
        ...
    }

libtorch version: 1.11 (CPU - Windows) |
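Since the usage reported in this thread grows with the number of threads that ever run forward(), one mitigation (not a fix for the underlying leak) is to keep a single long-lived inference thread and feed it work, rather than constructing a new Thread in every Update(). Below is a rough C++ sketch of that pattern; InferenceWorker, Submit, and the result handling are illustrative names, not libtorch API.

    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>
    #include <torch/script.h>

    // One long-lived thread owns the module and serves all requests, so any
    // per-thread allocations happen once instead of once per spawned thread.
    class InferenceWorker {
     public:
      explicit InferenceWorker(const std::string& model_path)
          : module_(torch::jit::load(model_path)),
            worker_([this] { Loop(); }) {}

      ~InferenceWorker() {
        {
          std::lock_guard<std::mutex> lock(mu_);
          stop_ = true;
        }
        cv_.notify_one();
        worker_.join();
      }

      // Queue an input tensor for inference on the worker thread.
      void Submit(torch::Tensor input) {
        {
          std::lock_guard<std::mutex> lock(mu_);
          pending_.push(std::move(input));
        }
        cv_.notify_one();
      }

     private:
      void Loop() {
        torch::NoGradGuard no_grad;  // inference only
        while (true) {
          torch::Tensor input;
          {
            std::unique_lock<std::mutex> lock(mu_);
            cv_.wait(lock, [this] { return stop_ || !pending_.empty(); });
            if (stop_ && pending_.empty()) return;
            input = std::move(pending_.front());
            pending_.pop();
          }
          at::Tensor output = module_.forward({input}).toTensor();
          // ... hand `output` back to the caller (callback, future, etc.) ...
        }
      }

      torch::jit::script::Module module_;
      std::mutex mu_;
      std::condition_variable cv_;
      std::queue<torch::Tensor> pending_;
      bool stop_ = false;
      std::thread worker_;
    };

On the Unity side, Update() would then only enqueue work (the C# analogue of Submit) instead of starting a new Thread each frame.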
This is very much still an active issue and should probably be reopened. |
I seem to be getting this - coming from Rust like @tiberiusferreira. In a multithreaded async environment, inference appears to leak memory. |
Any news on a fix for this issue?
Also having the same issue running a model for inference in a Thread class. |
Why is this closed??? To be clear: we cannot use torch any longer. I don't know how anyone does. Is it just for academics? |
@pietern can we get it reopened please? |
🐛 Bug
When calling the forward function of a Module, some memory is allocated that is not deallocated when the thread ends.
To Reproduce
Steps to reproduce the behavior:
Module scripted from Python as in the tutorial.
Loaded and run in C++ in a separate thread (a minimal sketch of this setup follows below).
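In the sketch (model path, input shape, and iteration count are placeholders), a module exported as in the "Loading a TorchScript Model in C++" tutorial is loaded once, and forward() is then called from a freshly created std::thread in a loop, which is where the reported growth shows up.

    #include <thread>
    #include <vector>
    #include <torch/script.h>

    int main() {
      // "traced_model.pt" is a placeholder for a module traced/scripted in Python.
      torch::jit::script::Module module = torch::jit::load("traced_model.pt");

      for (int i = 0; i < 1000; ++i) {
        std::thread t([&module] {
          torch::NoGradGuard no_grad;
          std::vector<torch::jit::IValue> inputs;
          inputs.push_back(torch::ones({1, 3, 224, 224}));  // placeholder input shape
          at::Tensor output = module.forward(inputs).toTensor();
          // output is discarded; only the allocation behavior matters here
        });
        t.join();  // memory grows with each short-lived thread
      }
      return 0;
    }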
Expected behavior
Inference runs in a separate thread with no increase in memory.
Environment
PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: None
OS: Microsoft Windows 10 Home
GCC version: Could not collect
CMake version: version 3.12.2
Python version: 3.6
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Versions of relevant libraries:
[pip] numpy==1.16.2
[pip] numpydoc==0.8.0
[pip] torch==1.2.0
[pip] torchvision==0.4.0
[conda] _tflow_1100_select 0.0.3 mkl
[conda] _tflow_select 2.3.0 mkl
[conda] blas 1.0 mkl
[conda] cpuonly 1.0 0 pytorch
[conda] libmklml 2019.0.3 0
[conda] mkl 2019.1 144
[conda] mkl-include 2019.1 144
[conda] mkl-service 1.1.2 py36hb782905_5
[conda] mkl_fft 1.0.10 py36h14836fe_0
[conda] mkl_random 1.0.2 py36h343c172_0
[conda] pytorch 1.2.0 py3.6_cpu_1 [cpuonly] pytorch
[conda] tensorflow-base 1.10.0 mkl_py36h81393da_0
[conda] torchvision 0.4.0 py36_cpu [cpuonly] pytorch
Additional context
When running on the main thread, the memory seems to be allocated once on the first call and then reused.
Python threading doesn't have this problem.