Latest version fails to launch command with "ERROR 0 at threadsync.cpp:176 void dmtcp::ThreadSync::wrapperExecutionLockLock(): Failed to acquire lock" #1158
Comments
Hi @NeuralModder, thanks for the feedback. I'm not able to reproduce this bug. For example, I'm using glibc-2.37 with Fedora 38, and I'm not seeing it. We are now also on DMTCP-3.1.2 (a recent fix for a regression). If you have the chance, could you try:
and then try:
And then paste the result here. Thanks.
Hi @gc00, Thanks for the response. I updated to 3.1.2 to be thorough, though this unsurprisingly did not resolve the issue. I ran the configure script with
Simply running Nevertheless, using the file and line number that gets printed, I can set a breakpoint:
Here, I can get a backtrace:
From here, I can step twice before the program terminates at the third step:
I can also set the breakpoint one stack frame earlier, and step through to the end:
So, taking a glance at that error in I hope some of this is useful. I understand remotely debugging an unreproducible issue by proxy can be frustrating...
@NeuralModder , Here are some general comments, in case it helps:
My best guess is that there might have been a change in some API from glibc-2.37 to glibc-2.40, but I'm not at all sure of that guess. If the standard But otherwise, the easiest thing for us (assuming it's only I hope this is reasonable.
This program seems like it could be useful to me. I've tried multiple tagged versions and branches so far; each seems to produce a different error, whether during compilation or when trying to launch a program.
On the latest version (3.1.0), the program builds successfully. However, if I try to launch any program, I get this error:
Getting the coordinator logs by running the coordinator in one terminal and
./dmtcp_launch -j -p 7779 cat
in another doesn't seem (to me) to provide particularly helpful information either:
I've tried looking at the named code location in threadsync.cpp, but I'm not very familiar with C++ and am rather puzzled.
Here's some system information that might be relevant:
Could my glibc version be too recent? Is there anything I can try to solve this problem, or any more information I could provide?
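For anyone comparing setups, the glibc version can be read programmatically and checked against the one version reported as working in this thread (glibc-2.37). The following is only a minimal sketch: the helper names (`parse_glibc_version`, `local_glibc_version`, `is_newer_than_known_good`) are hypothetical and not part of DMTCP, and a newer glibc is merely a guess at the cause, not a confirmed diagnosis.

```python
import os
import re

# Assumption: 2.37 is the only glibc version reported as working in this thread.
KNOWN_GOOD = (2, 37)


def parse_glibc_version(text):
    """Extract (major, minor) from a string such as 'glibc 2.40'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def local_glibc_version():
    """Return the running glibc version, or None if it cannot be determined."""
    try:
        # CS_GNU_LIBC_VERSION is a glibc-specific confstr name.
        return parse_glibc_version(os.confstr("CS_GNU_LIBC_VERSION"))
    except (AttributeError, ValueError, OSError, TypeError):
        return None


def is_newer_than_known_good(version, known_good=KNOWN_GOOD):
    """True if `version` is strictly newer than the known-good version."""
    return version > known_good


if __name__ == "__main__":
    version = local_glibc_version()
    if version is None:
        print("could not determine the glibc version")
    elif is_newer_than_known_good(version):
        print(f"glibc {version[0]}.{version[1]} is newer than the known-good 2.37")
    else:
        print(f"glibc {version[0]}.{version[1]} is not newer than the known-good 2.37")
```

Pasting the printed line next to the other system information makes it easier to tell whether the failure correlates with the glibc version.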