nexthop: fix disappearing addresses and routes after expiration #193
After a fixed period of time (20 minutes), grout marks non-static nexthops that were previously resolved (with ARP or NDP) as STALE and probes them to refresh their MAC addresses.
After 6 probes have been sent (3 unicast using the previously known MAC address of the nexthop, and 3 broadcast/multicast) without any reply, the nexthop is marked as FAILED.
After 1 minute in the FAILED state, the nexthop is "freed" using the per-family callback specified in nexthop_ops (at the moment, only IPv4 and IPv6).
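The aging sequence described above can be sketched as a small state machine. This is only an illustration, not grout's actual code: every name here (nh_fsm, nh_fsm_tick, the NH_* constants) is hypothetical, and it assumes one probe per tick with the unicast probes sent before the broadcast/multicast ones, neither of which is stated in this description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Timings from the description; all identifiers are hypothetical. */
#define NH_REACHABLE_SECS (20 * 60) /* resolved -> STALE after 20 minutes */
#define NH_UCAST_PROBES 3           /* probes to the last known MAC */
#define NH_BCAST_PROBES 3           /* broadcast/multicast probes */
#define NH_FAILED_HOLD_SECS 60      /* time in FAILED before being freed */

enum nh_state { NH_REACHABLE, NH_STALE, NH_FAILED, NH_FREED };

struct nh_fsm {
	enum nh_state state;
	bool is_static;        /* static nexthops never age out */
	uint64_t last_reply;   /* seconds: last ARP/NDP reply */
	uint64_t failed_at;    /* seconds: when FAILED was entered */
	unsigned probes_sent;  /* probes sent since becoming STALE */
};

/* Advance by one tick. Returns true when a probe must be sent now.
 * Assumption: one probe per tick (the real probe interval is not given). */
static bool nh_fsm_tick(struct nh_fsm *nh, uint64_t now) {
	assert(nh != NULL);
	if (nh->is_static)
		return false;
	switch (nh->state) {
	case NH_REACHABLE:
		if (now - nh->last_reply >= NH_REACHABLE_SECS) {
			nh->state = NH_STALE;
			nh->probes_sent = 0;
		}
		return false;
	case NH_STALE:
		if (nh->probes_sent >= NH_UCAST_PROBES + NH_BCAST_PROBES) {
			nh->state = NH_FAILED;
			nh->failed_at = now;
			return false;
		}
		nh->probes_sent++;
		return true;
	case NH_FAILED:
		if (now - nh->failed_at >= NH_FAILED_HOLD_SECS)
			nh->state = NH_FREED; /* the per-family free callback runs here */
		return false;
	default:
		return false;
	}
}

/* Valid right after nh_fsm_tick() returned true.
 * Assumption: the unicast probes are sent first. */
static bool nh_fsm_probe_is_unicast(const struct nh_fsm *nh) {
	return nh->probes_sent <= NH_UCAST_PROBES;
}

/* Any reply from the nexthop makes it reachable again. */
static void nh_fsm_reply(struct nh_fsm *nh, uint64_t now) {
	nh->state = NH_REACHABLE;
	nh->last_reply = now;
	nh->probes_sent = 0;
}
```

With these assumptions, a nexthop that stops answering reaches FAILED once the sixth probe has gone unanswered and is handed to the free callback one minute later.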
The current callbacks are rib4_cleanup and rib6_cleanup, which are very aggressive: they effectively delete everything related to the nexthop, including any local addresses and routes.
Replace these callbacks with simpler ones that only delete /32 and /128 routes and decrement the reference counter on the nexthop. If the nexthop still has routes referencing it (for example, it is used as a gateway for a route), scrub its flags and MAC address fields so that
it can be reused later.
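The intent of the replacement callbacks can be sketched as follows. This is a simplified model under stated assumptions, not the code in this change: nh_entry, route_entry, nh_free_sketch and the NH_F_* flags are hypothetical names, the route table is a flat array, and each route is assumed to hold exactly one reference on its nexthop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical resolution-state flags. */
#define NH_F_REACHABLE (1u << 0)
#define NH_F_STALE (1u << 1)
#define NH_F_FAILED (1u << 2)
#define NH_F_STATE_MASK (NH_F_REACHABLE | NH_F_STALE | NH_F_FAILED)

struct nh_entry {
	uint32_t ref_count; /* assumption: one reference per route using it */
	uint32_t flags;
	uint8_t mac[6];
};

struct route_entry {
	bool in_use;
	uint8_t prefix_len;
	struct nh_entry *nh;
};

/*
 * Free callback sketch. host_len is 32 for IPv4 and 128 for IPv6.
 * Only host routes pointing at the nexthop are deleted; gateway routes
 * and local addresses are left alone.
 * Returns true when nothing references the nexthop any more, so the
 * caller may release it.
 */
static bool nh_free_sketch(struct nh_entry *nh, struct route_entry *routes,
			   size_t n_routes, uint8_t host_len) {
	for (size_t i = 0; i < n_routes; i++) {
		struct route_entry *r = &routes[i];
		if (r->in_use && r->nh == nh && r->prefix_len == host_len) {
			r->in_use = false;
			r->nh = NULL;
			if (nh->ref_count > 0)
				nh->ref_count--;
		}
	}
	if (nh->ref_count == 0)
		return true;

	/* Still used (e.g. as a gateway): scrub it so that it can be
	 * resolved again later instead of deleting the routes. */
	nh->flags &= ~NH_F_STATE_MASK;
	memset(nh->mac, 0, sizeof(nh->mac));
	return false;
}
```

In this model, a nexthop that is still a gateway for a /24 route keeps that route after expiration and simply returns to an unresolved state.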
Fixes: fe3e408 ("route: rework route cleanup routine")