Description
Description of the problem
Sometimes after a consumer crash we notice tasks that are stuck in the SPAWNED state: they're no longer in the consumer queue (they have been BLPOPed), but they were never marked as STARTED.
We can't GC SPAWNED tasks that are not in any queue because we have no information about when they were popped from the queue: last_update points at the time when the task was added to the queue (enqueued with the SPAWNED state). Any state change requires JSON deserialization, an update of the relevant fields and re-serialization, so it can't be done atomically with standard Redis commands.
It would be best if a task were marked as STARTED as soon as it's BLPOPed from the consumer queue. Of course, changing both last_update and status requires the same JSON (de)serialization.
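To make the (de)serialization step concrete, here is a minimal sketch of the update a consumer has to perform on the client side. The function name mark_started, the example payload and the literal "STARTED"/"SPAWNED" strings are assumptions for illustration; only the status and last_update field names come from the description above.

```python
import json
import time
from typing import Optional


def mark_started(raw_task: str, now: Optional[float] = None) -> str:
    """Return the serialized task with status and last_update refreshed.

    Hypothetical helper: the stored layout is assumed to be a flat JSON object.
    """
    task = json.loads(raw_task)            # deserialize the stored payload
    task["status"] = "STARTED"             # assumed string value of the state
    task["last_update"] = time.time() if now is None else now
    return json.dumps(task)                # serialize it back for storage


# Example payload (made up for this sketch)
spawned = json.dumps({"uid": "example-task", "status": "SPAWNED", "last_update": 1.0})
started = json.loads(mark_started(spawned, now=2.0))
print(started["status"], started["last_update"])
```

Because this read-modify-write happens in the consumer process, a crash between the pop and the write leaves the task in the SPAWNED state with a stale last_update.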
Proposed solution
That operation could be a bit more "atomic" if we use Redis functions:
Like all other operations in Redis, the execution of a function is atomic. A function's execution blocks all server activities during its entire time, similarly to the semantics of transactions. These semantics mean that all of the script's effects either have yet to happen or had already happened. The blocking semantics of an executed function apply to all connected clients at all times. Because running a function blocks the Redis server, functions are meant to finish executing quickly, so you should avoid using long-running functions.
https://redis.io/docs/latest/develop/interact/programmability/functions-intro/
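A rough sketch of how the pop and the state change could be combined on the server side. For brevity it uses a Lua script sent with EVAL rather than a registered function; scripts run atomically in the same way. The storage layout is an assumption: the queue is taken to hold task UIDs, and each task is taken to be a JSON string stored under a hypothetical key prefix followed by its UID. The helper name pop_and_mark_started and the "STARTED" value are illustrative as well.

```python
# Lua source executed atomically by Redis. KEYS[1] is the consumer queue;
# ARGV[1] is the assumed task key prefix, ARGV[2] the new status,
# ARGV[3] the timestamp to store in last_update.
POP_AND_MARK_STARTED = """
local uid = redis.call('LPOP', KEYS[1])
if not uid then
    return false
end
local key = ARGV[1] .. uid
local raw = redis.call('GET', key)
if raw then
    local task = cjson.decode(raw)
    task['status'] = ARGV[2]
    task['last_update'] = tonumber(ARGV[3])
    redis.call('SET', key, cjson.encode(task))
end
return uid
"""


def pop_and_mark_started(client, queue: str, task_prefix: str, now: float):
    """Pop one task UID and mark the task as started in a single script call.

    `client` is expected to be a redis-py style client exposing
    eval(script, numkeys, *keys_and_args). Returns None if the queue is empty.
    """
    return client.eval(POP_AND_MARK_STARTED, 1, queue, task_prefix, "STARTED", str(now))
```

Note that the sketch uses LPOP rather than BLPOP, since blocking commands do not block when called from a script; a consumer built this way would have to poll. The script also builds the task key dynamically instead of declaring it in KEYS, which is fine for a single Redis instance but would not be safe in a cluster.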
It's not an ideal solution though, because the Redis Lua cjson module may not be compatible with Python's json module. Serialization and deserialization using cjson may unintentionally transform some values of the payload (e.g. big numbers may be converted to floats).
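The big-number concern can be illustrated in pure Python. The sketch below does not run cjson; it emulates a decoder that stores every number as a double (as Lua 5.1, the Lua version embedded in Redis, does) by passing parse_int=float to json.loads. The payload is made up for the example.

```python
import json

# 2**53 + 1 is the smallest positive integer a double cannot represent exactly.
big = 2**53 + 1
payload = json.dumps({"value": big})

# Python's json module keeps arbitrary-precision integers intact.
exact = json.loads(payload)["value"]

# Emulated double-based decoder: every integer literal becomes a float.
lossy = json.loads(payload, parse_int=float)["value"]

print(exact == big)        # the integer survives a pure-Python round trip
print(int(lossy) == big)   # the double-based round trip changes the value
```

Any payload field holding a 64-bit identifier, a large counter or a high-precision timestamp would be at risk of this kind of silent change after a round trip through a double-based JSON codec.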