ingest-storage: Use pooled timeseries slices when deserializing #11393
base: main
Conversation
@@ -395,6 +397,10 @@ func labelAdaptersHash(b []byte, ls []mimirpb.LabelAdapter) ([]byte, uint64) {
// PushToStorage ignores SkipLabelNameValidation because that field is only used in the distributor and not in the ingester.
// PushToStorage aborts the request if it encounters an error.
func (p *parallelStorageShards) PushToStorage(ctx context.Context, request *mimirpb.WriteRequest) error {
	// We're moving Timeseries into batches, each batch has a fresh timeseries slice.
	// We're done with the slice in the request here, the contents will live on in the batch and be freed when the batch is freed.
	defer mimirpb.ReuseTimeseriesSliceDangerous(request.Timeseries)
Consume is responsible for allocating from the pool, then PushToStorage frees back to the pool with the assumption that Consume is the caller. Not super familiar with the details, but can Consume do the freeing for symmetry? Like right after it calls c.pushToStorage(...).
It would definitely be cleaner to just free where it was allocated. Let's see if I can get it into that form or if I run into issues.
The trickiest part I see so far is that we still have some implementation-specific behavior that makes this hard to factor out. The parallelStoragePusher might collapse N requests into 1 batch, so you have N slices to free in that path, but only 1 slice to free in the sequential path.
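For illustration, here is a minimal sketch of the "free where it was allocated" shape being discussed, assuming a hypothetical consumeRecord helper on pusherConsumer and an illustrative pool accessor name; the real Mimir code and pool helpers may differ:

```go
// Sketch only: the consumer both takes the slice from the pool and returns it,
// instead of PushToStorage doing the free. consumeRecord and the pool accessor
// name are illustrative stand-ins, not the actual code in this PR.
func (c *pusherConsumer) consumeRecord(ctx context.Context, data []byte) error {
	// Taken from the pool by the consumer (the "allocator")...
	req := &mimirpb.WriteRequest{Timeseries: mimirpb.PreallocTimeseriesSliceFromPool()}
	if err := req.Unmarshal(data); err != nil {
		mimirpb.ReuseTimeseriesSliceDangerous(req.Timeseries)
		return err
	}

	err := c.pushToStorage(ctx, req)

	// ...and handed back by the consumer right after the push, instead of inside
	// PushToStorage. This keeps alloc/free in one place, but assumes every pusher
	// is done with the slice once pushToStorage returns; the batching pusher
	// collapsing N requests into one batch is what makes that harder to factor
	// out, as discussed above.
	mimirpb.ReuseTimeseriesSliceDangerous(req.Timeseries)
	return err
}
```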
What this PR does
The pusherConsumer deserializes directly into a &mimirpb.WriteRequest{} by calling the generated Unmarshal. This path fetches individual PreallocTimeseries from the pool, but not the timeseries slice itself, so appending to that slice incurs the usual extra allocations. This PR uses a slice from the pool in addition to the pooled objects it contains.
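As a rough illustration of that change (not the PR's exact code; the helper function and the pool accessor name below are assumptions), pre-seeding the request with a pooled slice lets the generated Unmarshal append into pooled capacity:

```go
// Illustrative only: deserialize a record into a WriteRequest whose Timeseries
// slice comes from a pool. The accessor name is assumed; the real helper in
// mimirpb may differ. ReuseTimeseriesSliceDangerous is the matching free shown
// in the diff above.
func unmarshalWriteRequest(data []byte) (*mimirpb.WriteRequest, error) {
	req := &mimirpb.WriteRequest{
		// Zero-length slice with pooled capacity; the generated Unmarshal appends
		// each series, so it reuses this capacity instead of growing a nil slice.
		// The individual PreallocTimeseries entries are already pooled by Unmarshal.
		Timeseries: mimirpb.PreallocTimeseriesSliceFromPool(),
	}
	if err := req.Unmarshal(data); err != nil {
		// On error, hand the slice straight back to the pool.
		mimirpb.ReuseTimeseriesSliceDangerous(req.Timeseries)
		return nil, err
	}
	return req, nil
}
```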
There are 2 possible paths after we take from the pool, depending on configuration.
The main purpose of this PR is more of a refactor than an optimization. The block builder already uses a timeseries slice from the pool here. By aligning what we fetch from pools, I can move toward using the same deserialization logic in both services, which I'd like to do for record format V2. The reduced allocations are just a bonus.
TL;DR fewer allocations, supports a future change
Before:
After:
Which issue(s) this PR fixes or relates to
Rel https://github.com/grafana/mimir-squad/issues/2253
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
about-versioning.md updated with experimental features.