Background
This is under the following epic for potential improvements to the state store:
This is related to but different from maintaining an up-to-date view of the state for queries:
Description
We'd like an option to increase the throughput of transactions across multiple tables by having a single long-running process responsible for updating the transaction log for all Sleeper tables, instead of processing state store updates in lambdas.
Internally, this process could pull a message off the FIFO queue and pass it to a thread that is responsible for a particular Sleeper table. This would increase throughput by removing the switching problem, where one lambda instance receives an update for a table and the next update for that table goes to a different lambda instance, which must first bring its state up to date from the transaction log.
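
To make the per-table threading idea concrete, here is a minimal sketch rather than a proposed implementation. The names `TableCommitDispatcher`, `CommitMessage` and `CommitHandler` are hypothetical and do not exist in Sleeper. It assumes each message carries a table ID, and it gives every table its own single-thread executor so that commits for one table are always applied in order by the same thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: routes each commit to a single thread owned by its table. */
public class TableCommitDispatcher {

    /** Hypothetical message shape: a table ID plus an opaque commit payload. */
    public static final class CommitMessage {
        private final String tableId;
        private final String body;

        public CommitMessage(String tableId, String body) {
            this.tableId = tableId;
            this.body = body;
        }

        public String tableId() {
            return tableId;
        }

        public String body() {
            return body;
        }
    }

    /** Hypothetical callback that applies one commit for one table. */
    public interface CommitHandler {
        void apply(CommitMessage message);
    }

    private final Map<String, ExecutorService> executorByTable = new ConcurrentHashMap<>();
    private final CommitHandler handler;

    public TableCommitDispatcher(CommitHandler handler) {
        this.handler = handler;
    }

    /** Hands the message to the table's own single-thread executor, preserving order per table. */
    public void dispatch(CommitMessage message) {
        executorByTable
                .computeIfAbsent(message.tableId(), id -> Executors.newSingleThreadExecutor())
                .submit(() -> handler.apply(message));
    }

    /** Stops accepting work and waits for already queued commits to finish. */
    public void shutdownAndWait(long timeoutSeconds) {
        executorByTable.values().forEach(ExecutorService::shutdown);
        try {
            for (ExecutorService executor : executorByTable.values()) {
                executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the caller can react to it.
            Thread.currentThread().interrupt();
        }
    }
}
```

One thread per table is the simplest way to show the idea. With many tables, a real process would more likely map tables onto a bounded pool of threads while still keeping each table on one thread at a time.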
Analysis
We can have an instance-level option that determines whether commits to the state store are performed by a lambda (as they are now) or by a single long-running ECS task (or a service with a count of 1). If the latter is chosen, we can deploy an ECS service with only one container running, and remove the lambda that is triggered by the SQS FIFO commit queue. The ECS task needs to pull messages off the queue, apply them to internal state representing the table, and commit them to the transaction log in DynamoDB.
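
The work the ECS task does for one table could look roughly like the sketch below. Everything in it is an assumption made for illustration: `TableCommitLoop`, `CommitQueue` and `TransactionLog` are hypothetical names, the two interfaces stand in for the SQS FIFO commit queue and the DynamoDB transaction log, and commits are treated as opaque strings. The point it illustrates is that the in-memory state and last transaction number are kept between commits, so the log does not need to be re-read before each one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of the commit loop for one table inside the long-running task. */
public class TableCommitLoop {

    /** Stand-in for receiving from the SQS FIFO commit queue; empty when nothing is waiting. */
    public interface CommitQueue {
        Optional<String> poll();
    }

    /** Stand-in for appending a transaction to the log held in DynamoDB. */
    public interface TransactionLog {
        void append(long transactionNumber, String commit);
    }

    private final CommitQueue queue;
    private final TransactionLog log;
    // In-memory view of the table, retained between commits.
    private final List<String> appliedCommits = new ArrayList<>();
    private long lastTransactionNumber;

    public TableCommitLoop(CommitQueue queue, TransactionLog log, long lastTransactionNumber) {
        this.queue = queue;
        this.log = log;
        this.lastTransactionNumber = lastTransactionNumber;
    }

    /** Applies every waiting commit in order and returns how many were applied. */
    public int drain() {
        int count = 0;
        Optional<String> next = queue.poll();
        while (next.isPresent()) {
            long transactionNumber = lastTransactionNumber + 1;
            // Write to the log first, then update local state only once that succeeded.
            log.append(transactionNumber, next.get());
            appliedCommits.add(next.get());
            lastTransactionNumber = transactionNumber;
            count++;
            next = queue.poll();
        }
        return count;
    }

    public long lastTransactionNumber() {
        return lastTransactionNumber;
    }

    public List<String> appliedCommits() {
        return Collections.unmodifiableList(appliedCommits);
    }
}
```

This sketch leaves out failure handling, such as what happens if appending to the log fails or if another writer has already used the next transaction number.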