-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
syscall: misleading documentation for linux SysProcAttr.Pdeathsig #27505
Comments
If I am not mistaken, the current implementation of Go never kills threads, which is why you wouldn't be able to run into this bug in Go currently. At the same time, if Go ever does start killing threads, this might hit a lot of people at once. |
Looked around a bit, looks like go 1.10+ will kill a thread if you lock it and the goroutine exits without unlocking it: #20395 Here's an example that shows that: package main
import (
"flag"
"fmt"
"os/exec"
"runtime"
"syscall"
)
func main() {
kill := false
flag.BoolVar(&kill, "death", false, "Set deathsig")
flag.Parse()
cmd := exec.Command("sleep", "1000")
if kill == true {
fmt.Println("Death is coming")
cmd.SysProcAttr = &syscall.SysProcAttr{
Setpgid: true,
Pdeathsig: syscall.SIGKILL,
}
}
// force other goroutines to be spawned on a new thread
runtime.LockOSThread()
done := make(chan struct{})
go func() {
runtime.LockOSThread()
err := cmd.Start()
if err != nil {
fmt.Println(err)
}
close(done)
fmt.Println("Exit goroutine")
}()
<-done
fmt.Println("Waiting")
err := cmd.Wait()
fmt.Println("Done", err)
} In that example, if |
Circling back to this, I did eventually figure out the right way to use Pdeathsig, and I think the documentation should include some examples or discussion about this. Here's an example of what people usually want to do when using Pdeathsig -- ensure the child dies when the parent dies. This example ensures that golang won't accidentally kill the OS thread associated with the child process. package main
import (
"fmt"
"os/exec"
"runtime"
"syscall"
)
func main() {
done := make(chan struct{})
cmd := exec.Command("sleep", "1000")
cmd.SysProcAttr = &syscall.SysProcAttr{
Setpgid: true,
Pdeathsig: syscall.SIGKILL,
}
go func() {
// On Linux, pdeathsig will kill the child process when the thread dies,
// not when the process dies. runtime.LockOSThread ensures that as long
// as this function is executing that OS thread will still be around
runtime.LockOSThread()
defer runtime.UnlockOSThread()
err := cmd.Start()
if err != nil {
fmt.Println(err)
}
fmt.Println("Child is PID", cmd.Process.Pid)
err = cmd.Wait()
close(done)
fmt.Println("Child exited:", err)
}()
fmt.Println("Waiting for child to exit")
<-done
fmt.Println("Parent exiting")
} |
Change https://go.dev/cl/412114 mentions this issue: |
@ianlancetaylor On the CL you wrote
(The workaround being to use LockOSThread in the goroutine that creates the child process to ensure that that thread can't be killed.) Can you say more about why it might not work? |
The CL says " A workaround is to call LockOSThread just before starting the new process and UnlockOSThread after it finishes." But the thread can then later, before the child process exits, pick up another goroutine, that goroutine can call LockOSThread, and then that goroutine can exit. That will cause the thread to exit, and cause a signal to be sent to the child process, although the parent process overall is still running. Unless I misunderstand something. |
Isn't the point of LockOSThread that this cannot happen? |
The idea is that the goroutine locks to thread, and then just waits and does nothing until the process exits. That way, the thread cannot exit until the exec'd process exits go func() {
runtime.LockOSThread()
err := cmd.Start()
if err != nil {
...
}
// some channels to communicate with this goroutine or whatever are here...
// <-finished
err = cmd.Wait()
runtime.UnlockOSThread()
}() |
Thanks, I understand the comment now. I was reading "after it finishes" as meaning "after |
This is a rather large footgun, so let's mention that it sends the signal on thread termination and not process termination in the documentation. Updates #27505 Change-Id: I489cf7136e34a1a7896067ae24187b0d523d987e GitHub-Last-Rev: c8722b2 GitHub-Pull-Request: #53365 Reviewed-on: https://go-review.googlesource.com/c/go/+/412114 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
See golang/go#27505 for context. Pdeathsig isn't safe to set without locking to the current OS thread, because otherwise thread termination will send the signal, which isn't the desired behavior. I discovered this while troubleshooting a problem that turned out to be unrelated, but I think it's necessary for correctness. Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
See golang/go#27505 for context. Pdeathsig isn't safe to set without locking to the current OS thread, because otherwise thread termination will send the signal, which isn't the desired behavior. I discovered this while troubleshooting a problem that turned out to be unrelated, but I think it's necessary for correctness. Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
See golang/go#27505 for context. Pdeathsig isn't safe to set without locking to the current OS thread, because otherwise thread termination will send the signal, which isn't the desired behavior. I discovered this while troubleshooting a problem that turned out to be unrelated, but I think it's necessary for correctness. Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal will be sent to the process when the OS thread on which cmd.Start() was executed dies. The runtime terminates an OS thread when a goroutine exits after being wired to the thread with runtime.LockOSThread(). If other goroutines are allowed to be scheduled onto a thread which called cmd.Start(), an unrelated goroutine could cause the thread to be terminated and prematurely signal the command. See golang/go#27505 for more information. Prevent started subprocesses with Pdeathsig from getting signaled prematurely by wiring the starting goroutine to the OS thread until the subprocess has exited. No other goroutines can be scheduled onto a locked thread so it will remain alive until unlocked or the daemon process exits. Signed-off-by: Cory Snider <csnider@mirantis.com>
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal will be sent to the process when the OS thread on which cmd.Start() was executed dies. The runtime terminates an OS thread when a goroutine exits after being wired to the thread with runtime.LockOSThread(). If other goroutines are allowed to be scheduled onto a thread which called cmd.Start(), an unrelated goroutine could cause the thread to be terminated and prematurely signal the command. See golang/go#27505 for more information. Prevent started subprocesses with Pdeathsig from getting signaled prematurely by wiring the starting goroutine to the OS thread until the subprocess has exited. No other goroutines can be scheduled onto a locked thread so it will remain alive until unlocked or the daemon process exits. Signed-off-by: Cory Snider <csnider@mirantis.com>
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal will be sent to the process when the OS thread on which cmd.Start() was executed dies. The runtime terminates an OS thread when a goroutine exits after being wired to the thread with runtime.LockOSThread(). If other goroutines are allowed to be scheduled onto a thread which called cmd.Start(), an unrelated goroutine could cause the thread to be terminated and prematurely signal the command. See golang/go#27505 for more information. Prevent started subprocesses with Pdeathsig from getting signaled prematurely by wiring the starting goroutine to the OS thread until the subprocess has exited. No other goroutines can be scheduled onto a locked thread so it will remain alive until unlocked or the daemon process exits. Signed-off-by: Cory Snider <csnider@mirantis.com>
…eebsd When baur executes a task and the baur process gets killed, the task subprocess continues to run. This was reproduced on Linux, on other OSes it was not tested but they are probably also affected. Prevent that this can happen by setting Pdeathsig for the executed process. If the parent thread is killed, the specified signal (SIGKILL) will be sent to the child. Pdeathsig is sent when then parent thread dies, to prevent that thread on which the go-routine ran that started the process dies, runtime.LockOSThread is called[^1]. This fixes the issue only on Linux and FreeBSD. Windows & Darwin do not have Pdeathsig in their SysProcAttrs. To achieve the same on Windows support for job objects in Golang might be needed[^2]. [^1]: golang/go#27505 (comment) [^2]: golang/go#17608
When baur executes a task and the baur process gets killed, the task subprocess continues to run. This was reproduced on Linux, on other OSes it was not tested but they are probably also affected. Prevent that this can happen by setting Pdeathsig for the executed process. If the parent thread is killed, the specified signal (SIGKILL) will be sent to the child. Pdeathsig is sent when then parent thread dies, to prevent that thread on which the go-routine ran that started the process dies, runtime.LockOSThread is called[^1]. This fixes the issue only on Linux and FreeBSD. Windows & Darwin do not have Pdeathsig in their SysProcAttrs. To achieve the same on Windows support for job objects in Golang might be needed[^2]. [^1]: golang/go#27505 (comment) [^2]: golang/go#17608
When baur executes a task and the baur process gets killed, the task subprocess continues to run. This was reproduced on Linux, on other OSes it was not tested but they are probably also affected. Prevent that this can happen by setting Pdeathsig for the executed process. If the parent thread is killed, the specified signal (SIGKILL) will be sent to the child. Pdeathsig is sent when then parent thread dies, to prevent that thread on which the go-routine ran that started the process dies, runtime.LockOSThread is called[^1]. This fixes the issue only on Linux and FreeBSD. Windows & Darwin do not have Pdeathsig in their SysProcAttrs. To achieve the same on Windows support for job objects in Golang might be needed[^2]. [^1]: golang/go#27505 (comment) [^2]: golang/go#17608
When Pdeathsig is used, the child process will receive the signal whenever the parent *thread* is killed. In go, an os thread can be killed if it is locked to a goroutine and left locked when the goroutine finishes execution. The go runtime will schedule goroutines to OS threads as it sees fit, so this can result in random unexpected signals being sent to child processes with pdeathsig set. There's no currently *known* case where the shim code leaves goroutines locked to a thread, but given this could happen now or in the future in our code or in any dependency code, it's best to safeguard against this. The fix is to keep the goroutine that spawns and waits on the child process locked to its os thread. That way, we can be sure it will never be killed by the go runtime. See also: golang/go#27505 Signed-off-by: Erik Sipsma <erik@dagger.io>
When Pdeathsig is used, the child process will receive the signal whenever the parent *thread* is killed. In go, an os thread can be killed if it is locked to a goroutine and left locked when the goroutine finishes execution. The go runtime will schedule goroutines to OS threads as it sees fit, so this can result in random unexpected signals being sent to child processes with pdeathsig set. There's no currently *known* case where the shim code leaves goroutines locked to a thread, but given this could happen now or in the future in our code or in any dependency code, it's best to safeguard against this. The fix is to keep the goroutine that spawns and waits on the child process locked to its os thread. That way, we can be sure it will never be killed by the go runtime. See also: golang/go#27505 Signed-off-by: Erik Sipsma <erik@dagger.io>
Also don't check os.ProcessState.Exited() when reporting child deaths, I had previously misunderstood what it was for. In linux child processes are killed via pdeathsig if the *thread* that called fork() is ended, not the process. So this change makes sure that we do not allow go to kill thread. However apparently in current versions of go the threads are never killed anyway so this wouldn't have happened. This code is more formally correct though. See: golang/go#27505
On Linux, when (os/exec.Cmd).SysProcAttr.Pdeathsig is set, the signal will be sent to the process when the OS thread on which cmd.Start() was executed dies. The runtime terminates an OS thread when a goroutine exits after being wired to the thread with runtime.LockOSThread(). If other goroutines are allowed to be scheduled onto a thread which called cmd.Start(), an unrelated goroutine could cause the thread to be terminated and prematurely signal the command. See golang/go#27505 for more information. Prevent started subprocesses with Pdeathsig from getting signaled prematurely by wiring the starting goroutine to the OS thread until the subprocess has exited. No other goroutines can be scheduled onto a locked thread so it will remain alive until unlocked or the daemon process exits. Signed-off-by: Cory Snider <csnider@mirantis.com>
Currently, the documentation says:
However, according to the prctl man page:
I got bit by this in a python program -- started a new program on one thread, and tried to wait for it on another thread, and the child process kept dying and it took awhile to figure out what was going on. While I haven't ran into it in go yet, because in go threads and goroutines aren't one to one, I imagine if one ran into this sort of bug it would only occur intermittently.
Thinking about it, it seems like a user might want to call runtime.LockOSThread when using this?
It's not clear to me whether the docs should have a larger warning in them -- but I think at the minimum the documentation should be updated to say 'parent thread' or 'parent goroutine' instead of just 'parent'.
The text was updated successfully, but these errors were encountered: