-
Notifications
You must be signed in to change notification settings - Fork 297
Merge Igalia/lwaftr upstream #1133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This change makes the follower apply some pacing so that it doesn't stall packet processing. Also, the follower now waits for a "commit" action to actually apply its changes. The code that creates actions has been changed to add "commit" actions where appropriate.
Removed README. prefix from docs
Before, each ctable creation would result in a fresh struct allocation. This would eventually cause LuaJIT's 65536-entry ctype table to overflow.
This lowers the effective rate that the follower processes messages to Hz*4, which is really Hz*2 if you consider commit messages. The goal was to reduce packet drop.
Wait for complete change sets before committing in follower
Print out PID name, if defined, on a second column.
Remove useless pcap files, clarify docs and code.
Hard word wrap at 80 chars for filter perf doc
Sketch out Igalia's testing strategy
Changelog v3.1.5
Minor improvements to snabb config listen
Print out worker info in 'snabb ps'
Add basic error reporting to snabb-softwire-v1
- Get rid of `screen` command. - Connect to VM via telnet instead of ssh. - Change syntax to "verb command". - Remove "lwaftr" command. - Enable extra tap network interface to have external internet access from the guest.
Also run selftest in dasl files
@wingo: OK, sounds good to me. I overlooked the results of lib_pmu tests, although I still got the error I pasted above if I run the test in my laptop (4.X kernel, AVX2 support). Since CI passes that's all we should care about. I'd look deeper into the failing SnabbVMX test (program.snabbvmx.tests.selftest.sh). It could be the case that a Linux tool is missing in the VM. That would be an easy fix. But if that's not the cause, fixing the test could take longer. I'm OK with disabling it temporarily. To disable it, I believe the easiest thing to do is to rename the test from "selftest.sh" to "selftest.disabled.sh", for example. This way the test won't be run on "make test". |
The snabbvmx selftest currently passes only when run on a system with a 3.x kernel on the host. We will fix this but in the interests of getting the lwaftr in a mergeable state, temporarily disabling this test by renaming it.
Temporarily disable snabbvmx selftest
@eugeneia the snabb-bot run has been going overnight, since around 18h yesterday. Is that normal? |
@wingo No it died :-/ Still working on sorting out davos. |
Hydra seems happy! Looks like this branch has improved some configurations/setups of the iperf benchmark for snabbnfv. Could be relatively obscure cases (e.g. running a patched QEMU with a larger vring) but always nice to move forward anyway. |
@wingo Do you want to do more work on this branch before merge? |
@lukego Sorry for the delay; I was on a short holiday from when you wrote. I would like to fix snabb-bot before merging but I have no other functional changes queued. I think the snabb-bot failures are related to configuration rather than to functionality, fwiw. |
Actually lookin at things again, I think the |
Just pushed a trivial 8000 commit to force the bot to run again. |
My guess is that the intel_mp selftest is somehow affected by the changes to core, given that it seems reproducible. Sorry for the SnabbBot delays, seems like we hit some new docker related bugs since I redeployed davos. Quite the pandora’s box I opened there… |
Well! That was horrible. I spent days ignorantly trying to understand things from a constructive perspective and then gave up and did a bisect. It's wasn't quite a real bisect as the intel_mp driver only came in with v2017.04, but at each step between master and |
See snabbco#1145 for more discussion.
Disable bus-mastering cleanup for intel_mp
|
||
-- Run the engine with continuous configuration updates | ||
local current_config | ||
local child_path = "group/config/..name" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the " in the wrong place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, it certainly looks like it. Perhaps given that Luke is proposing #1021 separately, we comment there?
The bot ran, yay! From what I can tell, the failure noted in the bot run was a perf regression:
However a previous run on this branch before this unrelated bus-mastering patch succeeded with 1.0, so I am inclined to think, no problem. |
…ultiproc-lwaftr
Thank you for the review and the merge, yay!!!! |
Leader process apparently running on wrong NUMA node
Hello kind Snabb hackers!
This is a PR that is manifestly too large, however it's what I got. It merges hacking on the lwaftr to master, which hasn't been done for a while now (6 months or more). Notably it merges in the YANG support. Specific changes:
Add support for YANG schemas, configuration, and state data, allowing users to configure Snabb programs with pleasant textual schema-constrained configuration formats and have Snabb automatically translate that configuration into efficient native Lua data that is usable directly by a data plane. See the
lib.yang
documentation for full details.New lwAFTR configuration format based on YANG. To migrate old configurations, run
snabb lwaftr migrate-configuration old.conf
on the old configuration. See thesnabb-softwire-v1.yang
schema or configuration.md for full details on the new configuration format.Add ability to reconfigure an entire lwAFTR process while it is running, including changes that can add or remove ingresss or egress filters, change NIC settings, or the like. Pass the
--reconfigurable
argument tosnabb lwaftr run
, then interact with the lwAFTR instance viasnabb config
. This--reconfigurable
facility is also available for other Snabb programs to use. See theapps.config
documentationAdd
snabb config
set of commands, to replacesnabb lwaftr control
. These applications can query state or configuration of a running Snabb instance as well as update that configuration in an efficient way. See thesnabb config
documentation for full details.Add support for named Snabb instances (and specifically named lwAFTR instances). Pass
--name foo
to thesnabb lwaftr run
command to have it claim a name on a machine.snabb config
can identify the remote Snabb instance by name, which is often much more convenient than using the instance's PID. Thesnabb lwaftr query
,snabb lwaftr monitor
,snabbvmx query
,snabbvmx top
, andsnabb top
tools can locate Snabb instances by name, andsnabb ps
can show all Snabb processes with their names.Add support to the lwAFTR for fast binding-table updates to individual binding-table entries without causing the whole table to reload, via generic
snabb config
commands likesnabb config add /softwire-config/binding-table/softwire
. Seesnabb config
documentation for more on how to usesnabb config add
andsnabb config remove
. This facility is also generic and can be used by other Snabb data planes.Add support to lwAFTR for
ietf-softwire
alternate YANG schema. Pass the-s ietf-softwire
tosnabb config
invocations to use this schema. Data planes expose a primary YANG model but there is a generic facility to allow the data plane to define transformations from other models to the primary model.Adapt
snabbvmx
to configuration format changes.Incorporate all PodHashMap functionality into
lib.ctable
and switch the lwaftr over to uselib.ctable
, removing PodHashMap.Incorporate multi-process work from @lukego in Engine support for multiprocess orchestration #1021.
Review is very welcome. My personal opinion is that all of this is "safe" and good to merge with the exception of the changes to
src/core/app.lua
: master...Igalia:lwaftr-2017-04-24#diff-e2bf611b7cf61fb539b42bf778b8fc9c. These changes were designed and implemented here: #1067. I think that's the hard review. I would suggest that we do perf testing on this merge and see if the regression found on that PR is still present in this one, and just see what happens.