-
Notifications
You must be signed in to change notification settings - Fork 714
Multitenant: merge incoming reports in collector #3780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
acdefaa
to
6f0dea0
Compare
So the main Add() function isn't so long.
Doesn't do anything at present, but will be used later. Change the signature on BillingEmitter.Close() to match. Note we didn't use the error returned.
This means we store fewer, bigger, reports, which reduces cost of storage and time to render when data is viewed.
Writes to DynamoDB and S3 can be done in parallel, which will reduce the overall flush time.
6f0dea0
to
9a739fd
Compare
buf []byte | ||
} | ||
queue := make(chan queueEntry) | ||
const numParallel = 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you decide on the value here? Do you think it would useful to make it an arg param?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it would be more flexible, but the way flags are plumbed through from main()
makes it annoying to add more.
There is a balancing act to get all serialisation plus IO completed within the tick interval, and I don't really know what might need adjusting. I'm happy changing the code as I learn from running it; more than adding all possible configuration parameters then never using most of them.
if err != nil { | ||
// NOTE: We don't abort here because failing to store in memcache | ||
// doesn't actually break anything else -- it's just an | ||
// optimization. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not keep this comment in persistReport
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added it back, also another comment on the case where a shortcut report can't be stored.
entry.Unlock() | ||
|
||
if count > 0 { | ||
// serialise reports on one goroutine to limit CPU usage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you think of cases (hardware configurations) where it might be desirable to branch out here instead? I'm thinking whether there would be any value in making this an optional arg as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible, but the collector can be run across multiple processes (pods) to get more CPUs working on the task.
See comments above about the pain of adding more options. Plus it raises the testing burden.
Conceptually LGTM! I only added a few minor comments concerning the config robustness. I just looked at the code though (didn't go as far as trying to test) - how well do you think the current unit tests cover the logic modified here? |
The current unit tests don't cover any of the DynamoDB/S3 code. |
This means we store fewer, bigger, reports, which reduces cost of storage and time to render when data is viewed.