8000 Multitenant: merge incoming reports in collector by bboreham · Pull Request #3780 · weaveworks/scope · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Multitenant: merge incoming reports in collector #3780

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 15, 2020
Merged

Conversation

bboreham
Copy link
Collaborator

This means we store fewer, bigger, reports, which reduces cost of storage and time to render when data is viewed.

@bboreham bboreham requested review from qiell and satyamz as code owners April 13, 2020 11:01
@bboreham bboreham force-pushed the merge-in-collector branch 3 times, most recently from acdefaa to 6f0dea0 Compare April 13, 2020 19:00
Doesn't do anything at present, but will be used later.

Change the signature on BillingEmitter.Close() to match. Note we didn't use the error returned.
This means we store fewer, bigger, reports, which reduces cost of
storage and time to render when data is viewed.
Writes to DynamoDB and S3 can be done in parallel, which will reduce
the overall flush time.
@bboreham bboreham force-pushed the merge-in-collector branch from 6f0dea0 to 9a739fd Compare April 13, 2020 19:15
8000
buf []byte
}
queue := make(chan queueEntry)
const numParallel = 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you decide on the value here? Do you think it would useful to make it an arg param?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it would be more flexible, but the way flags are plumbed through from main() makes it annoying to add more.
There is a balancing act to get all serialisation plus IO completed within the tick interval, and I don't really know what might need adjusting. I'm happy changing the code as I learn from running it; more than adding all possible configuration parameters then never using most of them.

if err != nil {
// NOTE: We don't abort here because failing to store in memcache
// doesn't actually break anything else -- it's just an
// optimization.
Copy link
Contributor
@fbarl fbarl Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not keep this comment in persistReport?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added it back, also another comment on the case where a shortcut report can't be stored.

entry.Unlock()

if count > 0 {
// serialise reports on one goroutine to limit CPU usage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you think of cases (hardware configurations) where it might be desirable to branch out here instead? I'm thinking whether there would be any value in making this an optional arg as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible, but the collector can be run across multiple processes (pods) to get more CPUs working on the task.
See comments above about the pain of adding more options. Plus it raises the testing burden.

@fbarl
Copy link
Contributor
fbarl commented Apr 14, 2020

Conceptually LGTM! I only added a few minor comments concerning the config robustness.

I just looked at the code though (didn't go as far as trying to test) - how well do you think the current unit tests cover the logic modified here?

@bboreham
Copy link
Collaborator Author

The current unit tests don't cover any of the DynamoDB/S3 code.
I think if I was going to add tests I'd do it as integration tests, running mock DynamoDB and S3 containers.

@bboreham bboreham merged commit 24847e8 into master Apr 15, 2020
@bboreham bboreham deleted the merge-in-collector branch April 15, 2020 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0