Description
Systemd expects the process it execs to continue running and stay foregrounded, which is incompatible with the way that SIGUSR2 zero-downtime reloading works. This is because after sending the final signal in the series USR2 -> WINCH -> TERM, the old master exits, leaving the new master running as a new (and also detached) pid. At this point, because the foreground process has exited, systemd terminates everything in the process group (or cgroup) and you're left with nothing.
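For concreteness, here is a rough sketch of the signal series described above, as it might be driven from a small script. The function name `reload_sequence`, the `delay` parameter, and the fixed sleep between signals are all my own assumptions for illustration; a real setup would presumably wait for the new master to be ready rather than sleep for a fixed time.

```python
import os
import signal
import time

# The series from the description: USR2 starts a new master, WINCH asks the
# old master to stop its workers, TERM makes the old master exit.
RELOAD_SEQUENCE = (signal.SIGUSR2, signal.SIGWINCH, signal.SIGTERM)


def reload_sequence(old_master_pid, kill=os.kill, pause=time.sleep, delay=5.0):
    """Send USR2 -> WINCH -> TERM to the old master, pausing between signals.

    `kill` and `pause` are injectable so the sequence can be exercised
    without signalling a real process. `delay` is an arbitrary placeholder.
    """
    sent = []
    for index, sig in enumerate(RELOAD_SEQUENCE):
        kill(old_master_pid, sig)
        sent.append(sig)
        # No need to wait after the final TERM.
        if index < len(RELOAD_SEQUENCE) - 1:
            pause(delay)
    return sent
```

The last step is the one that causes the trouble: once TERM is delivered, the pid that systemd started is gone, even though the new master is still running.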
I had assumed that it was expected that these two things were incompatible (i.e. this is exactly why unicornherder exists), but in #1109 it seems to be implied that it would in fact work. I am not sure how this could be possible from a conceptual standpoint, though, and I have indeed been unable to make it work.
I suspect this issue is not enormous for many others because they have alternative solutions (unicornherder, kubernetes), or their application startup time is short enough that a HUP reload isn't a big deal. However, our startup time can be minutes long due to factors that can't easily be addressed, so zero downtime restarts are pretty important.
We aren't opposed to using unicornherder if this is the best solution, but it's unclear whether it's receiving active maintenance (we've had some minor PRs open on it for a while), so we wanted to explore other solutions. It seems like the SIGUSR2 process is not expected to work under systemd, but it would be great to get some confirmation of that, since the docs don't say. We'd also love to know if there are other solutions that are recommended for this.