Stephen Worley [Tue, 15 Feb 2022 23:42:41 +0000 (18:42 -0500)]
zebra: include old reason in evpn-mh bond update
Ensure we include the old reason when we are updating the reason
code for a evpn-mh bond member. Now that this is a common API
it could include things external to EVPN in this reason code
bitfield (ex: vrrp).
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Stephen Worley [Tue, 15 Feb 2022 22:56:50 +0000 (17:56 -0500)]
zebra: make netlink protodown func more readable
Make the netlink protodown static function for checking
if the only bit set for protodown reason is FRR's more
easily readable to someone not familiar with the code.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Stephen Worley [Tue, 15 Feb 2022 17:36:18 +0000 (12:36 -0500)]
zebra: evpn-mh bonds protodown check for set
When we are processing a bond member's protodown we get from
the dataplane, check to make sure we haven't already queued
up a set. If we have, it's likely this is just a notification
we get from the kernel after we set protodown and before we have
processed the result in our dplane pthread.
This change is needed now that we set protodown via the dplane
pthread.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Stephen Worley [Tue, 15 Feb 2022 17:33:06 +0000 (12:33 -0500)]
zebra: extern setting protodown reason directly
Extern the api for setting the protodown reason code
bitfield directly. Some places may want to completely update the
bitfield with more than one reason at a time.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Stephen Worley [Mon, 31 Jan 2022 21:12:01 +0000 (16:12 -0500)]
zebra: only clear pd_reason on shutdown/sweep
Only clear protodown reason on shutdown/sweep, retain protodown
state.
This is to retain traditional and expected behavior with daemons
like vrrpd setting protodown. They expet it to be set on shutdown
and retained on bring up to prevent traffic from being dropped.
We must cleanup our reason code though to prevent us from blocking
others.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Stephen Worley [Tue, 25 Jan 2022 18:49:05 +0000 (13:49 -0500)]
zebra: clear protodown_rc on shutdown and sweep
Add functionality to clear any reason code set on shutdown
of zebra when we are freeing the interface, in case a bad
client didn't tell us to clear it when the shutdown.
Also, in case of a crash or failure to do the above, clear reason
on startup if it is set.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Stephen Worley [Fri, 21 Jan 2022 09:03:01 +0000 (04:03 -0500)]
zebra: add enum set/unset states for queing
Add enums for set/unset of prodown state to handle the mainthread
knowing an update is already queued without actually marking it
as complete.
This is to make the logic confirm a bit more with other parts of the code
where we queue dplane updates and not update our internal structs until
success callback is received.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
These patches handle all our netlink code for setting the reason.
For protodown reason we only set `frr` as the reason externally
but internally we have more descriptive reasoning available via
`show interface IFNAME`. The kernel only provides a bitwidth of 32
that all userspace programs have to share so this makes the most sense.
Since this is new functionality, it needs to be added to the dplane
pthread instead. So these patches, also move the protodown setting we
were doing before into the dplane pthread. For this, we abstract it a
bit more to make it a general interface LINK update dplane API. This
API can be expanded to support gernal link creation/updating when/if
someone ever adds that code.
We also move a more common entrypoint for evpn-mh and from zapi clients
like vrrpd. They both call common code now to set our internal flags
for protodown and protodown reason.
Also add debugging code for dumping netlink packets with
protodown/protodown_reason.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
anlan_cs [Mon, 31 Jan 2022 00:44:35 +0000 (19:44 -0500)]
zebra: let /32 host route with same IP cross VRF
Contraints of host routes are too strict in current code:
Host routes with same destination address and nexthop address are forbidden
even when cross VRFs.
Currently host routes with different destination and nexthop address can cross
VRFs, it is ok. But host routes with same addresses are forbidden to cross VRFs,
it is wrong.
Since different VRFs can have the same addresses, leak specific host route with
the same nexthop address ( it means destination address is same to nexthop
address ) to other VRFs is a normal case.
This commit relaxes that contraints. Host routes with same destination address
and nexthop address are forbidden only when not cross VRFs.
Added this api to fill all multicast group address based on IP version.
For PIMv4 its 224.0.0.0/4, for PIMv6 its FF00::0/8.
Changed the code where its being used currently.
pim6d: Modify pim_*_cmd_worker api passing pim_addr parameter
Pass pim_addr as parameter for rp address to accomodate ipv6.
Modifying pim_rp_cmd_worker and pim_no_rp_cmd_worker function
parameters from in_addr to pim_addr.
Changes in the caller functions are done as well to make it work
for IPv6.
pim6d: Return type and parameter changes for api pim_rp_del_config
1. Return value of this function pim_rp_del_config is nowhere used.
So made it as a void function.
2. Paramater const char *rp is first converted to string from prefix
in the caller and then back to prefix in this api pim_rp_del_config.
Fixed it by directly passing the address instead of string.
Modified the bgp_clear_stale_route function to have
better indentation, but in the process changed some
`continue;` statements to `break;` which modified
the looping and caused stale paths to not always be
removed upon an update.
To reproduce: A ---- B, setup with addpath and GR
One side has a prefix with nhop1 and nhop2, kill one
side and then resend the same prefix with nhop3,
paths nhop1 and 2 become stale and never removed.
Code inspection clearly shows that that `continue`
statements became `break` statements causing the
loop over all paths to stop prematurely.
The fix is to change the break back to continue
statements so the loop can continue instead of
stopping.
Rafael Zalamena [Mon, 21 Feb 2022 11:28:11 +0000 (06:28 -0500)]
lib: tweak northbound gRPC default timeout
Don't let open sockets hang for too long. This will fix an issue where a
improperly coded client (e.g. socat) could exaust the amount of open
file descriptors.
lib: Route-map failed for OSPF routes even for matching prefixes
This issue is applicable to other protocols as well.
When user has used route-map, even though the prefixes are falling
under the permit rule, the prefixes were denied and were shown
as inactive route in zebra.
Reason being the parameter which is of type enum was passed to the api
route_map_get_index and was typecasted to uint8_t *.
This problem is visible in case of Big Endian systems because we are
accessing the most significant byte.
'match_ret' field is an enum in the caller and so it is of 4 bytes,
the typecasting it to 1 byte and passing it to the api made
the api to put the value in the most significant byte
which was already zero previously. Therefore the actual value
RMAP_NOMATCH which was 1 never gets reset in this case.
Therefore the api always returns 'RMAP_NOMATCH' and hence
the prefixes are always denied.
Chirag Shah [Thu, 4 Jun 2020 16:41:31 +0000 (09:41 -0700)]
zebra: fix crash in evpn neigh cleanup all
zebra crash is seen during shutdown (frr restart).
During shutdown, remote neigh and remote mac clean up
is triggered first, followed by per vni all neigh
(including local) and macs cleanup is triggered.
The crash occurs when a remote mac is cleaned up first
and its reference is remained in local neigh.
When local neigh attempt removes itself from its associated
mac's neigh_list it triggers inaccessible memory crash.
The fix is during mac deletion if its neigh_list is non-empty
then retain the MAC in AUTO state.
This can arise when MAC and neigh duo are in different state
(remote/local). Otherwise, the order of cleanup operation
is neighs followed by macs.
The auto mac will be cleaned up when per vni all neighs and macs
are cleaned up.
Donald Sharp [Fri, 2 Oct 2020 18:49:09 +0000 (14:49 -0400)]
zebra: Prevent installation of connected multiple times
With recent changes to interface up mechanics in if_netlink.c
FRR was receiving as many as 4 up events for an interface
on ifdown/ifup events. This was causing timing issues
in FRR based upon some fun timings. Remove this from
happening.
Ticket: CM-31623 Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Chirag Shah [Thu, 28 May 2020 18:40:29 +0000 (11:40 -0700)]
zebra: handle protodown netlink for vxlan device
Frr need to handle protocol down event for vxlan
interface.
In MLAG scenario, one of the pair switch can put
vxlan port to protodown state, followed by
tunnel-ip change from anycast IP to individual IP.
In absence of protodown handling, evpn end up
advertising locally learn EVPN (MAC-IP) routes
with individual IP as nexthop.
This leads an issue of overwriting locally learn
entries as remote on MLAG pair.
In EVPN deployment, restart one of the MLAG
daemon, which puts vxlan interfaces in protodown state.
FRR treats protodown as oper down for vxlan interfaces.
VNI down cleans up/withdraws locally learn routes.
Followed by vxlan device UP event, re-advertise
locally learn routes.