Extended support for managing shared memory #972

eugeneia · 2016-07-21T13:58:35Z

This changes quite a few things around core.shm, mainly to remove statistics counter boilerplate, and to conveniently support multiple types of shared memory objects (as opposed to just counters). I am not super happy with the shm.{create,open,delete}_frame abstraction but it does the job. Summary of changes:

changes the runtime directory layout: there are now directories for apps/, links/, engine/, pci/. Link structs are no longer exposed. All memory mapped files now have extensions that designate their type, for e.g. snabb top to determine how to open them.
snabb top now has -l instead of -a, which supports dumping engine/link counters as well
apps no longer create/delete their shared memory directly but use the field myapp.shm to declare objects.
documentation of shared memory libraries was moved from inline to markdown (and updated)
per-PCI-address shared memory is created under pci/<pciaddr> using shm.create_frame
Some unused functions fromcore.shm were removed for simplicity: alias was unused, the whole path relative logic was removed (a single / now means fully qualified, I found the path qualified logic to be error prone and unnecessary)
modules that implement share memory types (core.counter, core.histogram) now use the new shm.register to register themselves as types with core.shm. Its a bit magical but enables convenient use in apps and e.g. lets them implement tonumber so they can be printed by snabb top -l
to facilitate uniform treatment of shared memory types, core.counter and core.histogram now implement a common interface: create, open and optionally delete

Cc @alexandergall @lukego

…code... - removed “alias” and “path” from core.shm - added “register” and “{create,open,delete}_frame” to core.shm - made core.counter API match create/open/delete interface - registered core.counter and core.histogram with core.shm - moved inline documentation of core.{shm,counter,histogram} to README.md - update modules to use shm frame abstraction where sensible

lukego · 2016-07-26T10:19:37Z

Cool!

I like the idea of associating types and modules with shm objects. The objects are becoming richer over time, not just counters but also e.g. histograms and timeline logs, and I would really like to have this framework for associating useful functionality with those objects e.g. pretty-printing functions.

This is something I have been lacking in terms of a UI for timeline objects. I already have a timeline.dump() function for pretty-printing the log but haven't had an obvious way to access that. Calling it via snabb snsh -e timeline.dump(...) seems too ad-hoc, a dedicated program like snabb timeline ... seems like overkill, so having an easy way for programs like snabb top (etc) to find the dumper seems a nice compromise.

lukego · 2016-07-27T10:25:37Z

One related thought on the topic of counters...

I would really like to have counters stored directly where they are accessed e.g. directly in struct link. Having pointers to counters in struct link is uncomfortable to me because it means more instructions (chasing pointers) and potentially cache misses too. Since counters have recently been affecting overall performance I think we need to start taking care of these details.

One idea would be for counter.commit() to be more flexible about how it sources counters. For example, perhaps it could determine the current value of each counter by calling a function. This way we could gather counter values from where-ever the master value lives e.g. as a uint64_t in struct link. This way we would move the logic of chasing pointers to find counters from the fast path (link.transmit etc) to the slow path (counter.commit()).

What do you think? (Likely this is something we should consider separately to this PR - unless there is a performance impact on this branch that we need to mitigate.)

Fixed merge conflicts from new debug vars

kbara · 2016-08-01T11:10:23Z

I second everything Luke said, including about reducing the number of pointers being a task for another PR, so I've upstreamed this at #982 .

eugeneia · 2016-08-01T11:12:38Z

@lukego @kbara actually I think this might not be ready yet, there is another performance regression in here that I haven't figured out yet.

lukego · 2016-08-01T11:18:11Z

@eugeneia need a [wip] tag next time :-)

@kbara if you want to drop these changes with a rebase that is fine with me. I think we would need to toggle the "protected" flag off from kbara-next for a moment to make that possible.

Otherwise git revert works too and is just a bit noisier (and would require a revert of the revert before merging this branch again). That would be the appropriate solution if other people had already merged these changes from kbara-next but I doubt that is the case right this moment.

kbara · 2016-08-01T12:03:07Z

I've run git reset --hard HEAD^ and updated kbara-next on my fork; I'll update it on the main repo if/when the protected flag comes off.

eugeneia · 2016-08-01T12:13:38Z

uhm... false alarm? ^^' I got things mixed up in my testing, shm_frame is actually OK. Can we still paddle back? I'm super sorry for the misinformation.

kbara · 2016-08-01T12:25:25Z

I'm happy to peddle back. I have done a second push to the kbara fork re-instating the changes, and hadn't updated my kbara-next branch on the main repo. I've re-opened #982 as well. Glad to hear it was a false alarm - it looks like a nice set of changes. :-)

lukego · 2016-08-02T10:40:10Z

Relatedly: I have also been starting to think about maintaining next as a pair of branches:

lukego/next would be an experimental working branch. Here I merge changes that look good to me. I test this branch as much as I like using a private Hydra jobset. If I find a problem then I will rewrite this branch with a rebase (which does not cause problems because it is a private development branch that I don't send PRs from and other people are not supposed to merge from).
snabbco/next is where I push changes once I am satisfied with them. If I make a mistake here then people can pull that and I have to follow it up with a fix e.g. revert commit. This is okay because the rate of mistakes should be lower due to earlier testing on the dev branch.

The basic intention would be to avoid merging changes onto a well-known branch before I know whether it causes a performance regression or other test failure that can only be detected by CI.

kbara · 2016-08-03T12:21:55Z

Sounds good, especially if there's an easy way for me to run a similar Hydra jobset on kbara-next.

Update benchdata/ packets to match test configurations

eugeneia added 2 commits July 21, 2016 14:43

snabb top: remove obsolete unlinking of runtime files (supervisor).

70ad231

eugeneia mentioned this pull request Jul 25, 2016

Timeline with engine instrumentation #973

Open

kbara self-assigned this Aug 1, 2016

kbara pushed a commit to kbara/snabb that referenced this pull request Aug 1, 2016

Merge PR snabbco#972 (Extended shm management support) into kbara-next

e55afad

Fixed merge conflicts from new debug vars

eugeneia mentioned this pull request Aug 1, 2016

Merge Max's "Extended support for managing shared memory #972" #982

Merged

eugeneia merged commit 70ad231 into snabbco:master Aug 3, 2016

takikawa pushed a commit to takikawa/snabb that referenced this pull request Sep 27, 2017

Merge pull request snabbco#972 from Igalia/fix-benchmarking-pcaps

ef0034b

Update benchdata/ packets to match test configurations

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extended support for managing shared memory #972

Extended support for managing shared memory #972

Extended support for managing shared memory #972

Extended support for managing shared memory #972

Conversation