8000 Stopping services which contain sub-routines is a hassle · Issue #2041 · cometbft/cometbft · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Stopping services which contain sub-routines is a hassle #2041

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
cason opened this issue Jan 15, 2024 · 0 comments
Open

Stopping services which contain sub-routines is a hassle #2041

cason opened this issue Jan 15, 2024 · 0 comments

Comments

@cason
Copy link
Contributor
cason commented Jan 15, 2024

This issue was motivated by the PR #2010, that addresses #646 and #652. The problem, however, is more general and possibly causes problems when stopping multiple services/reactors.

In short, when we call Stop() for an implementation of service.Service, the service.BaseService implementation of this method on service/service.go does the following:

  1. Sanity check: is the service started and not already stopped
  2. Changes the status of the service to stopped, causing IsRunning() to return false
  3. Calls OnStop(), which is supposed to be overridden by service implementations
  4. Closes the Quit() channel, which also releases blocking calls to Wait()

Implementations of the OnStop() method are supposed, in particular, to stop sub-routines, typically started upon OnStart(). The problem is how the routines know they have to leave, when the service was or is being stopped.

The typical approach is to use the Quit() channel for that, since once it is closed the service has been stopped. But this channel is only closed once closed once OnStop() returns (see above). While OnStop() implementations Are supposed to stop all sub-routines. We thus have short circuit here.

The solution adopted in #2010 consists in overriding the Stop() method, not the OnStop() method as expected by the service definition. The overridden method calls BaseService.Stop(), so that to, in particular, close the Quit() channel. The sub-routines should eventually, once realizing that this channel is closed, return and notify this to a synchronized abstraction (typically, a sync.WaitGroup). The overridden Stop() method will then block on the synchronized abstraction before returning.

While this solution, despite the added complexity, mostly works, we have an additional problem here. The Wait() call returns once the Quit() channel is closed. This channel is closed by BaseService.Stop(), once the OnStop() method returns. This is the right way to implement this functionality. But when we override the Stop() method, so that it waits for all sub-routines to quit, we first call BaseService.Stop(), as above described. Which means that the Wait() method is released and returns. While the service is not yet completely stopped, as the overridden version of Stop() is still waiting for sub-routines to quit. Moreover, the same problem arises if another component or service is using Quit() to identify whether this service has been stopped.

In summary, we can get rid of the short circuit created by OnStop() unblocking Quit(), while Quit() being unblocked is a condition for sub-routines to quit, which is what OnStop() is supposed to implement. But with this work-around we break the definition of the Wait() and Quit() methods, which do not actually wait until the service is fully stopped.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant
0