default caching #1527

mtcherni95 · 2022-03-03T08:28:11Z

No description provided.

mtcherni95 · 2022-03-06T07:52:00Z

pkg/events/queue/queue_mem_list.go

+		return amountInGB * 1024
+	}
+	AmountOfEvents := func(amountInMB int) int {
+		return MBtoKB(KBtoB(amountInMB)) / eventSize


@rafaeldtinoco FYI shouldn't this be KBtoB(MBtoKB(amountInMB))?

Btw, I'm asking you because this was part of your commits.

It could be, if eventSize were 1 and not 1024 (if I got what you mean correctly).

If I did that, then eventSize would always have to be a multiple of 1024 (1k, 2k, 4k). The way it is, the number can be 768 or 1280 (which makes a big difference in the end).

The most correct thing to do here would be to calculate eventSize based on an existing number (the size of trace.Event), BUT I'm afraid that it is more complex than just that (considering garbage collection and other runtime overhead).

I measured different cache sizes and found that 1024 was close to a constant for relating cache size to the number of events (this won't hold if we start having other event types in the future).

What I am saying is that we are calling KBtoB(amountInMB) and passing it something that is in units of MB. I don't see any combination of inputs where this could be correct. I think it should simply be the opposite:
KBtoB(MBtoKB(amountInMB))
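To make the point above concrete, here is a minimal sketch. It assumes that KBtoB and MBtoKB each simply multiply by 1024 (like the helper visible in the diff) and that eventSize is 1024 bytes, as mentioned in the thread; the two wrapper function names are hypothetical. Under those assumptions both orders return the same number, so the question is one of unit naming rather than of the computed result.

```go
package main

import "fmt"

// eventSize is the per-event estimate in bytes discussed in the thread.
const eventSize = 1024

// Assumption: both helpers multiply by 1024, like the helper shown in the diff.
func KBtoB(amountInKB int) int  { return amountInKB * 1024 }
func MBtoKB(amountInMB int) int { return amountInMB * 1024 }

// amountOfEventsAsWritten mirrors the call order in the diff:
// a value in MB is passed to KBtoB first.
func amountOfEventsAsWritten(amountInMB int) int {
	return MBtoKB(KBtoB(amountInMB)) / eventSize
}

// amountOfEventsUnitOrder applies the conversions in unit order:
// MB to KB, then KB to bytes.
func amountOfEventsUnitOrder(amountInMB int) int {
	return KBtoB(MBtoKB(amountInMB)) / eventSize
}

func main() {
	for _, mb := range []int{1, 256, 1024} {
		fmt.Println(mb, amountOfEventsAsWritten(mb), amountOfEventsUnitOrder(mb))
	}
}
```

If the helpers ever stopped being symmetric (for example, one of them rounding), only the unit-ordered version would stay correct.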

mtcherni95 · 2022-03-06T08:15:42Z

pkg/events/queue/queue_mem_list.go

+		return AmountOfEvents(q.eventsCacheMemSizeMB)
+	}
+
+	switch {


We get here only if q.eventsCacheMemSizeMB <= 0, so the only switch case that can ever match is the first one (case q.eventsCacheMemSizeMB <= GBtoMB(1)). What am I missing? Again, I'm asking here because this was part of your commits @rafaeldtinoco, and maybe I am missing something important in the logic.

If you provide a cache size ("mem-cache-size" cmdline option), then you know what you are doing... q.eventsCacheMemSizeMB won't be 0 (default) and I'll return the amount of events for the given cache size (no automatic cache size calculation is needed).

Now... if you ONLY provided "cache-type", but no "mem-cache-size", then q.eventsCacheMemSizeMB will be 0 AND tracee calculates the amount of events for a specific cache size. Cache size depends on the amount of existing host/node memory.

Not sure we understood each other. The logic before the switch checks:

if q.eventsCacheMemSizeMB > 0 { return AmountOfEvents(q.eventsCacheMemSizeMB) }

If execution doesn't enter that if statement, it will ALWAYS enter the case q.eventsCacheMemSizeMB <= GBtoMB(1): (as q.eventsCacheMemSizeMB will ALWAYS be <= 0).

so what's the point of the other switch conditions?
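The control flow being questioned can be sketched as follows. This is an illustration of the reviewer's reading of the diff, not the actual tracee code: the tier names, the 4 GB threshold, and the hostMemMB parameter are hypothetical. The second function shows one way the other cases could become reachable, by switching on the host memory rather than on the configured value.

```go
package main

import "fmt"

func GBtoMB(amountInGB int) int { return amountInGB * 1024 }

// tierByConfig switches on the configured cache size itself.
// Once the explicit-size branch is skipped, the value is <= 0,
// so only the first case can ever match.
func tierByConfig(cacheMemSizeMB int) string {
	if cacheMemSizeMB > 0 {
		return "explicit"
	}
	switch {
	case cacheMemSizeMB <= GBtoMB(1):
		return "tier-1"
	case cacheMemSizeMB <= GBtoMB(4): // unreachable
		return "tier-2"
	default: // unreachable
		return "tier-3"
	}
}

// tierByHostMem is a hypothetical alternative: when no size is configured,
// the tier is chosen from the amount of host memory instead.
func tierByHostMem(cacheMemSizeMB, hostMemMB int) string {
	if cacheMemSizeMB > 0 {
		return "explicit"
	}
	switch {
	case hostMemMB <= GBtoMB(1):
		return "tier-1"
	case hostMemMB <= GBtoMB(4):
		return "tier-2"
	default:
		return "tier-3"
	}
}

func main() {
	fmt.Println(tierByConfig(0), tierByConfig(-1))
	fmt.Println(tierByHostMem(0, GBtoMB(2)), tierByHostMem(0, GBtoMB(8)))
}
```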

Under some circumstances, tracee-rules might be slower to consume events than tracee-ebpf is capable of generating them. This requires tracee-ebpf to deal with this possible lag, but, at the same time, perf-buffer consumption can't be left behind (or important events coming from the kernel might be lost, causing detection misses).

There are 3 variables connected to this issue:

1) The perf buffer could be increased to hold a very large number of memory pages. The problem with this approach is that the space requested for the perf-buffer, through libbpf, has to be contiguous, and it is almost impossible to get very big contiguous allocations through mmap after a node has been running for some time.

2) The events channel buffer could be raised to hold a very large number of events. The problem with this approach is that the overhead of dealing with that many buffers, in a golang channel, causes event losses as well. It means this is not enough to relieve the pressure from kernel events on the perf-buffer.

3) An internal buffer, local to tracee-ebpf, could be created based on the node size.

This commit introduces (3): a generic interface for caching events in the pipeline. For now, there is a single interface implementation: an in-memory caching mechanism with cmdline settings for cache type and possible cache options. The needed tests were added to guarantee that cmdline options are always right.

In the near future, we may add new caching backends, such as offloading events to the filesystem, to a broker, etc.
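As a rough illustration of option (3), the sketch below shows what a generic caching interface with a bounded in-memory implementation could look like. All names here (Event, CacheQueue, memList, newMemList) are hypothetical and are not taken from the tracee source; the real implementation may differ in its API and in how it handles a full cache.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// Event is a hypothetical stand-in for the pipeline's event type.
type Event struct{ ID int }

// CacheQueue is a hypothetical generic interface for event caches.
type CacheQueue interface {
	Enqueue(e *Event)
	Dequeue() *Event
	Len() int
}

// memList is an in-memory implementation bounded by a maximum event count.
type memList struct {
	mu    sync.Mutex
	cond  *sync.Cond
	items *list.List
	max   int
}

func newMemList(maxEvents int) *memList {
	q := &memList{items: list.New(), max: maxEvents}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// Enqueue blocks while the cache is full.
func (q *memList) Enqueue(e *Event) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for q.items.Len() >= q.max {
		q.cond.Wait()
	}
	q.items.PushBack(e)
	q.cond.Broadcast()
}

// Dequeue blocks while the cache is empty and returns the oldest event.
func (q *memList) Dequeue() *Event {
	q.mu.Lock()
	defer q.mu.Unlock()
	for q.items.Len() == 0 {
		q.cond.Wait()
	}
	e := q.items.Remove(q.items.Front()).(*Event)
	q.cond.Broadcast()
	return e
}

func (q *memList) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return q.items.Len()
}

func main() {
	var q CacheQueue = newMemList(4)
	go func() {
		for i := 0; i < 8; i++ {
			q.Enqueue(&Event{ID: i})
		}
	}()
	for i := 0; i < 8; i++ {
		fmt.Println(q.Dequeue().ID)
	}
}
```

The maximum event count passed to the constructor is where a value derived from the cache size in MB would plug in.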

rafaeldtinoco

Your changes look good to me. If you're satisfied with my comments then feel free to merge. If not, then we can discuss things further.

pkg/events/queue/queue_mem_list.go

rafaeldtinoco · 2022-03-14T19:11:53Z

pkg/events/queue/queue_mem_list.go

+		return AmountOfEvents(q.eventsCacheMemSizeMB)
+	}
+
+	switch {


If you provide a cache size ("mem-cache-size" cmdline option), then you know what you are doing... q.eventsCacheMemSizeMB won't be 0 (default) and I'll return the amount of events for the given cache size (no automatic cache size calculation is needed).

Now... if you ONLY provided "cache-type", but no "mem-cache-size", then q.eventsCacheMemSizeMB will be 0 AND tracee calculates the amount of events for a specific cache size. Cache size depends on the amount of existing host/node memory.

rafaeldtinoco · 2022-03-14T19:41:18Z

pkg/events/queue/queue_mem_list.go

+		return amountInGB * 1024
+	}
+	AmountOfEvents := func(amountInMB int) int {
+		return MBtoKB(KBtoB(amountInMB)) / eventSize


It could be, if eventSize were 1 and not 1024 (if I got what you mean correctly).

If I did that, then eventSize would always have to be a multiple of 1024 (1k, 2k, 4k). The way it is, the number can be 768 or 1280 (which makes a big difference in the end).

The most correct thing to do here would be to calculate eventSize based on an existing number (the size of trace.Event), BUT I'm afraid that it is more complex than just that (considering garbage collection and other runtime overhead).

I measured different cache sizes and found that 1024 was close to a constant for relating cache size to the number of events (this won't hold if we start having other event types in the future).

pkg/events/queue/queue_mem_list_test.go

rafaeldtinoco · 2022-03-14T19:46:53Z

BTW, @mtcherni95, whenever you write git commit messages, please follow these conventions:

Limit the subject line to 50 characters.
Capitalize only the first letter in the subject line.
Don't put a period at the end of the subject line.
Put a blank line between the subject line and the body.
Wrap the body at 72 characters.
Use the imperative mood.
Describe what was done and why, but not how.

Cheers!

grantseltzer · 2022-03-17T17:55:14Z

@mtcherni95 Is this ready to merge? It seems like there's still discussion.

mtcherni95 · 2022-03-17T18:15:51Z

> @mtcherni95 Is this ready to merge? It seems like there's still discussion.

Hi, I'd prefer to wait for Rafael's answer first.

mtcherni95 changed the title ~~Michael default caching~~ default caching Mar 3, 2022

mtcherni95 mentioned this pull request Mar 3, 2022

events_pipeline: add queueEvents to cache pipeline events #1488

Closed

mtcherni95 force-pushed the michael-default-caching branch 5 times, most recently from 24ee1ee to 11eb9e8 Compare March 3, 2022 09:51

mtcherni95 commented Mar 6, 2022


mtcherni95 force-pushed the michael-default-caching branch from 1b7226c to 7b82e01 Compare March 6, 2022 08:32

rafaeldtinoco and others added 6 commits March 7, 2022 13:23

main: avoid calling stringSlice multiple times

53237e4

signatures/golang: gofmt simple syntax fixes

850d635

using empty struct for enqueue/dequeue done notification

5be260a

removing setup func

818f325

adding enqueue/dequeue test

f7ec5c5

mtcherni95 force-pushed the michael-default-caching branch from 7b82e01 to f7ec5c5 Compare March 7, 2022 13:23

rafaeldtinoco approved these changes Mar 14, 2022


rafaeldtinoco mentioned this pull request Mar 15, 2022

Flaky smoke tests #1456

Closed

rafaeldtinoco merged commit af9e2dc into aquasecurity:main Mar 21, 2022
