-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Introduce a new Object Storage WAL format. #13253
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me - the chunks and index readers/writers look correct and I've verified that the tests look sensible, though its definitely possible I missed something!
In general I prefer slightly longer variable names (e.g. idxWriter over idxw, buf over b) as it makes the code slightly easier to understand & review but not a major issue as I could still follow everything.
// NewWalSegmentWriter creates a new WalSegmentWriter. | ||
func NewWalSegmentWriter() *SegmentWriter { | ||
return &SegmentWriter{ | ||
streams: swiss.NewMap[streamID, *streamSegment](64), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you choose a swiss map over something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know yet if this will make a difference but swissmap are reusable possibly we'll change to normal map.
} | ||
|
||
// Labels are passed a string `{foo="bar",baz="qux"}` `{foo="foo",baz="foo"}`. labels.Labels => Symbols foo, baz , qux | ||
func (b *SegmentWriter) Append(tenantID, labelsString string, lbls labels.Labels, entries []*logproto.Entry) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intended to be called directly from a Push method, or is the client still expecting to group entries under instances/streams and then pass them to Append only when creating the WAL object to be stored?
If it's going to be called from Push directly, it'll vastly simplify that API!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think entries are already grouped by stream in ingester requests so for each stream we'll push into one selected segment.
What this PR does / why we need it:
This PR introduces a new file format for storing WAL segments directly on object storage. This is part of an experiment to replace the ring replication factor and delegate this responsibility to the object storage.
The format is documented at https://github.com/grafana/loki/blob/32b1d2c4acf3955801f7be9a0a43bef396e15507/pkg/storage/wal/README.md
I had some hesitationw with the format, particularly with tenants, may be storing an index per tenant is better, in case each tenant has big indexes. We will learn more once we run things.
see https://github.com/grafana/loki-private/issues/1001