perf: timestampCache causes GC-related slowdown #16796
Labels: C-performance (Perf of queries or internals. Solution not expected to change functional behavior.)
Running a read-only workload on a single-node cluster shows performance degrading from ~19k ops/sec to a steady state of ~13.5k ops/sec. Investigation eventually pointed to the `timestampCache`. Specifically, if `timestampCache.AddRequest` is disabled, steady-state performance is ~18k ops/sec. The workload under investigation is:

Interestingly, the manipulation of the `timestampCache.requests` btree by `AddRequest` doesn't seem to be the problem. If `AddRequest` is tweaked to add and immediately delete the request, performance is good.

Part of the problem seems to be the btree. Replacing the btree with a fixed-size ring buffer of approximately the same size results in steady-state performance of ~15k ops/sec. The ring buffer experiment isn't a drop-in replacement. More work would be required to flesh out its functionality, and that additional functionality might end up negating the modest improvement it provides.
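
For readers unfamiliar with the idea, here is a minimal sketch of a fixed-size ring buffer of requests. It is not the code used in the experiment: the `request` type, its fields, and the `ringBuffer` API are hypothetical stand-ins, and the capacity is assumed to be greater than zero.

```go
package main

import "fmt"

// request is a hypothetical stand-in for cacheRequest; the real type
// carries more fields.
type request struct {
	start, end []byte // span covered by the request
	timestamp  int64
}

// ringBuffer keeps the most recent len(buf) requests and overwrites the
// oldest entry once it is full. Capacity is assumed to be > 0.
type ringBuffer struct {
	buf  []request
	head int // index of the next slot to write
	size int // number of live entries
}

func newRingBuffer(capacity int) *ringBuffer {
	return &ringBuffer{buf: make([]request, capacity)}
}

// add stores r. If the buffer is full, the oldest entry is overwritten and
// returned so the caller can process it (for example, fold it into another
// structure).
func (rb *ringBuffer) add(r request) (evicted request, ok bool) {
	if rb.size == len(rb.buf) {
		// When full, head points at the oldest entry.
		evicted, ok = rb.buf[rb.head], true
	} else {
		rb.size++
	}
	rb.buf[rb.head] = r
	rb.head = (rb.head + 1) % len(rb.buf)
	return evicted, ok
}

func main() {
	rb := newRingBuffer(3)
	for ts := int64(1); ts <= 4; ts++ {
		if old, ok := rb.add(request{timestamp: ts}); ok {
			fmt.Println("evicted request with timestamp", old.timestamp)
		}
	}
}
```

The backing slice is allocated once and never grows, so inserts do not allocate tree nodes. Each slot still holds whatever key slices the request carries, which may be why the improvement is only modest.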
Another experiment was to zero the `cacheRequest` before inserting it. This brought steady-state performance back up to ~17k ops/sec.

My current suspicion is that the non-zeroed `cacheRequests` are causing additional GC pressure. Enabling `GODEBUG=gctrace=1` shows:

Note that there are two fields in `cacheRequest` that zeroing affects with this workload: `cacheRequest.span` and `cacheRequest.reads`. I'm not sure what to do here. We need the spans in order to later expand the requests into the interval tree. It is surprising to me that merely holding on to these keys is causing additional GC CPU usage. Suggestions are welcome.
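
To make the suspicion concrete, here is a rough sketch of how retained key slices keep extra heap objects reachable. The `span` and `cacheRequest` types are simplified, hypothetical versions of the real ones, and `liveHeapObjects`, `fill`, and `measure` are helpers invented for this illustration. It only counts live heap objects after a forced collection. It does not reproduce the workload or measure GC CPU time.

```go
package main

import (
	"fmt"
	"runtime"
)

// Simplified, hypothetical versions of the real types.
type span struct{ key, endKey []byte }

type cacheRequest struct {
	span      span
	reads     []span
	timestamp int64
}

// liveHeapObjects forces a collection and reports how many heap objects
// remain allocated afterwards.
func liveHeapObjects() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapObjects
}

// fill builds n requests, each holding two separately allocated keys.
func fill(n int) []cacheRequest {
	reqs := make([]cacheRequest, n)
	for i := range reqs {
		reqs[i].span = span{key: make([]byte, 32), endKey: make([]byte, 32)}
		reqs[i].timestamp = int64(i)
	}
	return reqs
}

// measure returns the live object count while the requests hold their keys
// and again after the pointer-carrying fields have been zeroed.
func measure(n int) (held, zeroed uint64) {
	reqs := fill(n)
	held = liveHeapObjects()
	for i := range reqs {
		reqs[i].span = span{} // drop references to the key slices
		reqs[i].reads = nil
	}
	zeroed = liveHeapObjects()
	runtime.KeepAlive(reqs) // keep the request slice itself live throughout
	return held, zeroed
}

func main() {
	held, zeroed := measure(100000)
	fmt.Println("live objects with keys held:", held)
	fmt.Println("live objects after zeroing: ", zeroed)
}
```

With the keys held, the request slice pins two extra allocations per request that the collector must keep alive on every cycle. After zeroing, each entry holds no pointers into the heap. Whether this accounts for the observed slowdown is exactly the open question here.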