any way to graceful exit tornado application? #1791
Here is one example using that method: https://gist.github.com/nicky-zs/6304878. Sometimes `main_ioloop._timeouts` keeps getting new `_Timeout` instances, so we have to wait a long time for the `_timeouts` queue to empty.
I came here today to ask the exact same question. After much reading on SO, I came up with this example:
It has two problems:
.. which I think means that the second problem is with `AsyncHandler`, where the request is killed and the client receives a "Remote Disconnected" error. It would be very nice to get an …
You definitely shouldn't be looking at … Or you can do what I do and just stop the IOLoop 5 seconds after the stop is requested. This will be enough for any regular request to finish, and if a request is taking longer than 5 seconds you probably want to let it fail anyway. It's not worth making a more precise measurement just to stop in less than 5 seconds.
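A minimal sketch of this fixed-grace-period stop, using plain `asyncio` (which Tornado 5+ runs on) rather than any Tornado-specific API; the call to `http_server.stop()` mentioned in the comment is shown only as a comment, since `http_server` is application-specific:

```python
import asyncio
import signal

def install_graceful_exit(loop, grace_seconds=5.0):
    """On SIGTERM/SIGINT, stop taking new work, then stop the loop after
    a fixed grace period instead of tracking each pending request."""
    def request_stop():
        # In a real Tornado app you would call http_server.stop() here
        # so no new connections are accepted during the grace period.
        loop.call_later(grace_seconds, loop.stop)

    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, request_stop)
```

The point of the fixed delay is exactly the trade-off described above: any in-flight request gets up to `grace_seconds` to finish, and there is no bookkeeping of how many requests remain.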
This seems like a question that is asked frequently enough that an example solution should be included in the user's guide section of the docs. I would consider adding something to either the "Running and deploying" section or the "Structure" section. Ben, which do you think makes the most sense?
Set a timeout for stop.
Yeah, I think adding something to "Running and deploying" makes sense. There are two tricky things in writing this up: A) the right way to do this depends heavily on your deployment environment (how your load balancers do health checks and/or retries), and B) convincing people who want to wait for all operations to finish that it's better to just use a timeout.

On that latter point: there must be some maximum time that you're willing to wait for an operation to finish, even if it's several minutes. At that point you have to be willing to cut them off or they'll keep consuming resources indefinitely. Once you've decided how long you're willing to wait for the last client operation to finish, you've already committed the resources to keep the old server processes around for that long, so why not leave them running for that duration in every case? Trying to track the number of operations remaining and stopping the server precisely when the count reaches zero is not worth the trouble IMHO.
You are right that there are plenty of caveats to how to really do a graceful shutdown depending on the deployment strategy. But for the guide, maybe it would suffice to show a simple example case (such as what you mentioned where you stop the IOLoop 5 seconds later to give requests time to finish) while explaining that this is not the only way, but that it at least works for simple setups. I'll take a stab at this in the next few days and maybe we can try to iterate through a good solution via a pull request. |
It would be nice if we could specify what Tornado should do on a SIGTERM. This would make it quite easy to gracefully stop it inside Kubernetes, by having a health handler which starts to report "faulty" as soon as .inShutdown() returns true. With this I can ensure that I take the service out of load balancing without killing any active connections (provided they don't take too long), and without losing requests because Kubernetes does not take the service out of the LB instantaneously (by design, so that one faulty response doesn't kick the service out).
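One way to sketch that idea without pinning down any Tornado handler code: a small shutdown flag flipped by SIGTERM, which a hypothetical `/health` endpoint would translate into a 503 so the load balancer drains the instance before the process exits. The `ShutdownState` name and the wiring into a handler are illustrative, not an existing API:

```python
import signal

class ShutdownState:
    """Flips to 'in shutdown' on SIGTERM. A health-check endpoint can then
    start returning 503 so the load balancer removes this instance while
    active connections are allowed to finish."""
    def __init__(self):
        self.in_shutdown = False

    def install(self):
        # Register the SIGTERM handler; safe because it only sets a flag.
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        self.in_shutdown = True

    def health_status(self):
        # A Tornado health handler could call this and pass the result
        # to self.set_status(); 503 tells the LB to stop sending traffic.
        return 503 if self.in_shutdown else 200
```

After the load balancer has drained traffic, the process can stop its IOLoop with a timeout as discussed above.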
We use Ubuntu's upstart to run our services. While terminating the service, I expect the behavior to be consistent across most *nix systems. It would be nice if Tornado could support these two signals by default, along with a custom set of signals.
Tornado works correctly under the upstart configuration you described - it dies by default when it gets SIGTERM. What would you like it to do differently? This always seems to be application- and deployment-dependent (especially with respect to how your load balancers do health checks and retries), but whatever behavior you want can be obtained by defining your own signal handlers. |
@bdarnell Thanks for your quick response. I think the reason it didn't work well for us is because we are using an older version of Tornado (2.1.1). Once I implemented my own signal handler for SIGTERM, things started working. |
How to shutdown tornado 5.1 gracefully? |
Here's how I do it (currently with tornado-4.5.3, but I expect it will work the same with tornado-5.1):

```python
import signal

from tornado import gen, ioloop

# periodic_task, http_server and ws_clients are defined elsewhere in the app

async def shutdown():
    periodic_task.stop()
    http_server.stop()
    for client in ws_clients.values():
        client['handler'].close()
    await gen.sleep(1)
    ioloop.IOLoop.current().stop()

def exit_handler(sig, frame):
    ioloop.IOLoop.instance().add_callback_from_signal(shutdown)

...

if __name__ == '__main__':
    signal.signal(signal.SIGTERM, exit_handler)
    signal.signal(signal.SIGINT, exit_handler)
    ...
```

(instead of just …)
Yeah, something like @ploxiln's approach is what I'd do. I don't think there have been any noteworthy changes between Tornado 4 and 5 here. It all depends on what "gracefully" means for your application. (One missing piece in the snippet above is that you probably want to signal to your load balancer somehow to stop the incoming traffic). |
@ploxiln Hi mate, would you mind describing what …
Sure:

```python
ws_clients = [
    {
        'handler': WebSocketHandler(...),
        'tags': ...,
    },
    ...
]
```

I had a WebSocketHandler which, for each websocket client that connected, would add "self" to this list. On close, it would find and remove "self" from the list. On shutdown, all still-open websocket connections could be cleanly closed in this way (in addition to other plain HTTP requests having a second to finish up, while no new plain HTTP or websocket requests are possible).
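A variant of that bookkeeping sketched as a dict keyed by connection id, which matches the `ws_clients.values()` loop in the earlier `shutdown()` snippet. `TrackedWebSocket` is a stand-in for a real `tornado.websocket.WebSocketHandler` subclass, not actual Tornado code:

```python
import uuid

ws_clients = {}  # conn_id -> {'handler': handler, 'tags': [...]}

class TrackedWebSocket:
    """Stand-in for a WebSocketHandler subclass: Tornado calls open()
    when a client connects and on_close() when the socket goes away."""
    def open(self, tags=None):
        self.conn_id = uuid.uuid4().hex
        ws_clients[self.conn_id] = {'handler': self, 'tags': tags or []}

    def on_close(self):
        # Forget the connection whether it closed cleanly or not.
        ws_clients.pop(self.conn_id, None)

    def close(self):
        # A real WebSocketHandler.close() sends the close frame and
        # Tornado then invokes on_close(); we call it directly here.
        self.on_close()
```

On shutdown, iterating over `list(ws_clients.values())` and calling `close()` on each handler drains the registry cleanly.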
Linking to this gist in case anyone comes across this issue: https://gist.github.com/wonderbeyond/d38cd85243befe863cdde54b84505784 This works for me. |
This is what I'm doing on Firenado: https://github.com/candango/firenado/blob/develop/firenado/launcher.py#L159-L324 |
We managed to gracefully shut down Tornado by implementing the following steps:
See https://github.com/svaponi/tornado-graceful-shutdown/blob/main/server.py |
There are many gists showing how to gracefully exit a Tornado application, like this:
Is it safe to test io_loop._timeouts or _callbacks? Or can _timeouts be ignored?