Description
I'm not sure if #3962 is completely different, but this is something that I've been noticing for a while.
It seems like sending out presence updates whenever my presence changes causes 100% CPU usage for a while on my small VPS.
An easy way to trigger it is to use the riot-ios app. Just opening the app and then sending it to the background would cause it to hit `/_matrix/client/r0/presence/{userId}/status` with `PUT` requests (either updating presence to `online` or to `unavailable`.. and after some more inactivity, to `offline` by a Synapse background task, it seems).
Doing that would cause 100% CPU for a while. I imagine Synapse tries to notify many other servers about the presence change. While this (non-important thing) is going on, Synapse would be slow to respond to other requests.
If I just keep alternating between backgrounding and foregrounding the riot-ios app, I can effectively keep my homeserver at 100% CPU.
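
For anyone who wants to reproduce this without an iOS device, here is a minimal sketch of the request the app appears to be making. It only builds the `PUT` request and does not send it. The helper name `build_presence_request`, the homeserver URL, the user ID and the token are all placeholders I made up for illustration; the endpoint path and the `presence` body field are the ones from the client-server API.

```python
import json
import urllib.parse
import urllib.request

VALID_STATES = ("online", "unavailable", "offline")


def build_presence_request(base_url, user_id, access_token, presence):
    """Build (but do not send) the PUT request that sets a user's presence.

    Hypothetical helper for illustration; not part of any Matrix SDK.
    """
    if presence not in VALID_STATES:
        raise ValueError(f"unknown presence state: {presence!r}")
    # The user ID contains '@' and ':', so percent-encode it for the path.
    path = "/_matrix/client/r0/presence/{}/status".format(
        urllib.parse.quote(user_id, safe="")
    )
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps({"presence": presence}).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own homeserver, user and token.
    req = build_presence_request(
        "https://matrix.example.org", "@alice:example.org", "TOKEN", "unavailable"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Alternating between `online` and `unavailable` with something like this every few seconds should approximate the backgrounding/foregrounding pattern described above, assuming the server-side behaviour is the same for any client.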
Normally though, a few seconds after backgrounding the app (which sets my presence as `unavailable`), due to a subsequent foregrounding of the app or due to a `/sync` by another client of mine (on desktop or something), my presence status would change back to `online` and cause the same thing once again.
Maybe a few things could be investigated:

- whether riot-ios should try to set presence as `unavailable` at all, especially given that other clients may be syncing at the same time and telling Synapse I'm `online`
- even if a given client says `unavailable`, whether the server should accept that, given that other clients (devices) may be syncing and setting another status at the same time
- whether the server should be so quick to accept and propagate a presence status, when said status might change once again a couple of seconds later. `/sync` is usually called by clients with a long-polling timeout of 30 seconds, so there may well be something that re-sets the presence status after as little as 30 seconds. Do federated clients care about sub-30-second granularity? Perhaps presence changes can be debounced for 30-60 seconds before actually kicking in
- whether propagating presence to other servers should be so heavy. Perhaps the code can be optimized or deprioritized, such that it won't disturb other server operations
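
To make the second and third suggestions more concrete, here is a rough sketch of what I have in mind. It is not how Synapse works internally (I haven't checked), and every name in it (`effective_presence`, `PresenceDebouncer`, the 30-second default) is made up for illustration. The idea is to first collapse per-device states into the most active one, and then only publish a change once it has stayed the same for the whole delay.

```python
import time

# Higher rank means "more present"; used to pick one status across devices.
_RANK = {"offline": 0, "unavailable": 1, "online": 2}


def effective_presence(device_states):
    """Return the most active state reported by any of a user's devices."""
    return max(device_states, key=_RANK.__getitem__, default="offline")


class PresenceDebouncer:
    """Hold back a presence change until it has been stable for `delay` seconds."""

    def __init__(self, delay=30.0, initial="offline", clock=time.monotonic):
        self.delay = delay
        self.clock = clock
        self.published = initial    # last state sent to other servers
        self._pending = None        # candidate state waiting to settle
        self._pending_since = None  # when the candidate was first seen

    def update(self, state):
        """Record the latest effective state; nothing is published here."""
        if state == self.published:
            # Flapped back to what others already see: drop the candidate.
            self._pending = None
            self._pending_since = None
        elif state != self._pending:
            # New candidate: start (or restart) the settling timer.
            self._pending = state
            self._pending_since = self.clock()

    def poll(self):
        """Return a state to publish if the candidate has settled, else None."""
        if self._pending is None:
            return None
        if self.clock() - self._pending_since < self.delay:
            return None
        self.published = self._pending
        self._pending = None
        self._pending_since = None
        return self.published
```

With something like this, a quick `online` → `unavailable` → `online` flap within the delay would never leave the server, since `update()` discards the pending change as soon as the state returns to the published one. The trade-off is that genuine changes reach other servers up to `delay` seconds late, which I would be fine with.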
Using `use_presence = false` eliminates the problem, at the expense of presence not working.
I do like having presence, but I don't care about it being so accurate and so fast to propagate (at the expense of other operations).