Daniel Walton [Fri, 11 Dec 2015 21:12:56 +0000 (21:12 +0000)]
Quagga: make check is broken with addpath changes
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-8472
The first as-path is what the original as-path for the test, the second
is what the as-path should look like once all confed SETs and SEQs have
been removed. At the time the test was written quagga did not correctly
remove confed SETs and SEQs but whoever wrote the test didn't notice
this and assumed that the behavior they were seeing was correct so used
that output to populate the second as-path.
Now that we do correctly remove the confed parts these tests fail. So
the fix is to update the second as-path for these two tests so that they
no longer contain any confed SETs/SEQs.
vivek [Wed, 9 Dec 2015 19:01:21 +0000 (11:01 -0800)]
zebra: Reorganize NHT code
NextHop Tracking (NHT) is a significant function introduced into Quagga
by Cumulus. Initially intended for tracking BGP nexthops, this has been
extended subsequently to also cater to nexthops for static routes, BGP
peer reachability tracking and BGP route tracking for routes to be
imported into BGP.
This patch reorganizes the code a bit to make it easier to follow and
maintain. No functional changes introduced.
vivek [Wed, 9 Dec 2015 00:55:43 +0000 (16:55 -0800)]
Zebra: Schedule RIB processing based on trigger event
Currently, when RIB processing is initiated (i.e., by calling rib_update()),
all routes are queued for processing. This is not desirable in all situations
because, sometimes the protocol may have an alternate path. In addition,
with NHT tracking nexthops, there are situations when NHT should be kicked
off first and that can trigger subsequent RIB processing.
This patch addresses this by introducing the notion of a trigger event. This
is only for the situation when the entire RIB is walked. The current triggers
- based on when rib_update() is invoked - are "interface change" and "route-
map change". In the former case, only the relevant routes are walked and
scheduled, in the latter case, currently all routes are scheduled for
processing.
Note: The initial defect in this area was CM-7420. This was addressed in
2.5.4 with an interim change that only walked static routes upon interface
down. The change was considered a bit risky to do for interface up etc. Also,
this did not address scenarios like CM-7662. The current fix addresses CM-7662.
vivek [Tue, 8 Dec 2015 23:04:48 +0000 (15:04 -0800)]
Zebra: Eliminate unnecessary del-add upon static route addition
When static routes are added, they get processed and potentially installed
in the RIB once. Subsequently, NHT is invoked and ends up scheduling the
route for processing again because this is the first time the nexthop is
resolved for NHT. This used to result in a del-add earlier (as noted in
the defect), but is a replace now. This change eliminates the unnecessary
replace by ensuring NHT is invoked first if the static route has a nexthop
that will be tracked by NHT.
Donald Sharp [Tue, 8 Dec 2015 17:08:46 +0000 (17:08 +0000)]
bgpd: Modify maxpaths cli's to use MULTIPATH_NUM for range
Modify the various maxpath commands to use MULTIPATH_NUM
as the upper limit of allowed max paths in BGP. There
is no point in allowing a number of maximum paths greater
than what Quagga is compiled for.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Fri, 4 Dec 2015 17:34:42 +0000 (12:34 -0500)]
zebra: Remove STATIC_XXX_IFNAME and use _IFINDEX
When we get a static route through an interface convert the interface
name to an ifindex and pass it through to zebra_rib.c. zebra_rib.c
should not care about the ifname.
This code change will allow us to collapse some of the NEXTHOP_XXX types.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Daniel Walton [Mon, 30 Nov 2015 21:19:39 +0000 (21:19 +0000)]
BGP: 'neighbor swpX interface peer-group FOO' is needed to simplify swpX
configuration
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Ticket: CM-8321
This allows the user to configure the peer-group as an option for the
"neighbor swpX interface" command.
{code}
!
router bgp 100
neighbor FOO peer-group
neighbor FOO remote-as external
neighbor swp1 interface
neighbor swp2 interface v6only
neighbor swp3 interface peer-group FOO
neighbor swp4 interface v6only peer-group FOO
!
address-family ipv4 unicast
neighbor FOO activate
neighbor swp1 activate
neighbor swp2 activate
exit-address-family
!
{code}
Note that if the user configures
{code}
neighbor swp5 interface
neighbor swp5 peer-group FOO
{code}
We will display that as "neighbor swp5 interface peer-group FOO". It
did not seem worth tracking that the peer-group was entered via two
lines instead of one.
Donald Sharp [Fri, 20 Nov 2015 19:36:46 +0000 (11:36 -0800)]
Quagga: vrf_id not being set correctly
Several routing protocols use the zapi_ipv[4|6] api to talk to
zebra. There are some instances where the api.vrf_id was not
being set. Since the practice is to declare the api structure
on the stack, the data inside is not being set to 0. As
such random vrf_id values were being passed to zebrad
causing rage and confusion.
Ticket: CM-8287 Reviewed-by: CCR-3841
Testing: Test suites no longer crashing and burning
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Daniel Walton [Fri, 20 Nov 2015 18:53:50 +0000 (18:53 +0000)]
BGP: Remove the requirement to rebind a peer to its peer-group under the address-family.
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-3868
NOTE: Many of the ibgp peers are not up in the 'show ip bgp summ' output below. This
is because ospf was disabled at the time. The main thing to look for is whether or
not all of the correct peers are listed based on their 'activate' status.
superm-redxp-05# show ip bgp summ
BGP router identifier 10.0.0.1, local AS number 10
BGP table version 4200
RIB entries 2399, using 281 KiB of memory
Peers 8, using 129 KiB of memory
Peer groups 2, using 112 bytes of memory
Example where the peer-group is inactive but a member of the peer-group is active:
==================================================================================
superm-redxp-05(config)# router bgp 10
superm-redxp-05(config-router)# address-family ipv4 unicast
superm-redxp-05(config-router-af)# neighbor 10.0.0.3 activate
superm-redxp-05(config-router-af)# no neighbor IBGP activate
superm-redxp-05(config-router-af)#
superm-redxp-05(config-router-af)# neighbor 10.0.0.4 activate
superm-redxp-05(config-router-af)# end
superm-redxp-05# show run
router bgp 10
bgp router-id 10.0.0.1
no bgp default ipv4-unicast
no bgp bestpath as-path multipath-relax
bgp bestpath compare-routerid
bgp route-map delay-timer 1
neighbor EBGP peer-group
neighbor EBGP advertisement-interval 1
neighbor IBGP peer-group
neighbor IBGP remote-as 10
neighbor IBGP update-source 10.0.0.1
neighbor IBGP advertisement-interval 1
neighbor 10.0.0.2 peer-group IBGP
neighbor 10.0.0.3 peer-group IBGP
neighbor 10.0.0.4 peer-group IBGP
neighbor 20.1.1.6 remote-as 20
neighbor 20.1.1.6 peer-group EBGP
neighbor 20.1.1.7 remote-as 20
neighbor 20.1.1.7 peer-group EBGP
neighbor 40.1.1.2 remote-as 40
neighbor 40.1.1.2 peer-group EBGP
neighbor 40.1.1.6 remote-as 40
neighbor 40.1.1.6 peer-group EBGP
neighbor 40.1.1.10 remote-as 40
neighbor 40.1.1.10 peer-group EBGP
!
address-family ipv4 unicast
neighbor EBGP activate
neighbor IBGP next-hop-self
neighbor 10.0.0.4 activate
exit-address-family
!
superm-redxp-05# show ip bgp summ
BGP router identifier 10.0.0.1, local AS number 10
BGP table version 4200
RIB entries 2399, using 281 KiB of memory
Peers 8, using 129 KiB of memory
Peer groups 2, using 112 bytes of memory
Daniel Walton [Fri, 20 Nov 2015 18:43:33 +0000 (18:43 +0000)]
BGP crash in group_announce_route_walkcb
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-8295
This is a regression from the addpath-tx commit. We loop over the adj_out list
but the adj can be freed within the body of the loop so we need to note adj->next
prior to the free.
vivek [Fri, 20 Nov 2015 18:34:50 +0000 (10:34 -0800)]
Zebra: Handle IPv6 address status during initialization
At init time, Zebra queries the kernel for all interfaces. At this time,
an IPv6 address may exist on an interface but IPv6 DAD has not completed,
or it could be that DAD has failed. Zebra should examine the flags on
the address and act accordingly. Otherwise, it may end up with addresses
and routes which are not actually valid in the kernel, and this may lead
to, for example, BGP attempting neighbor connections on an interface on
which the source IPv6 address is not yet valid.
vivek [Fri, 20 Nov 2015 16:48:32 +0000 (08:48 -0800)]
Zebra: Cleanup RIB debugs
Some of the changes include:
- ensuring IPv6 addresses are printed correctly
- say 'updating' or 'deleting' etc. only when that is actually done
- say 'queuing' or 'dequeuing' only when that is actually done
- print useful info for 'detailed' debug - that now subsumes 'rib queue'
- delete various useless logs
- VRF-specific - print VRF id in RIB debugs prior to prefix
(e.g., 4:37.1.1.0/28)
Donald Sharp [Fri, 20 Nov 2015 15:10:47 +0000 (07:10 -0800)]
Quagga: Fixup decision about what an unnumbered interface is
This Change modifies what zebra thinks is an unnumbered interface.
If the interface is not a loopback and the prefixlength for the
interface is 32 than consider this an unnumbered interface.
Ticket: CM-8016
Reviewed by: CCR-3827
Testing: Full Regression Suites
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
vivek [Thu, 19 Nov 2015 20:48:02 +0000 (12:48 -0800)]
Zebra: Fix replace route for uninstall scenario
When a Quagga route that is currently installed is superceded by a
kernel route (connected or static route to same destination), the
Quagga route is not uninstalled from the kernel. Fix by ensuring
this case is handled correctly.
vivek [Thu, 19 Nov 2015 20:22:55 +0000 (12:22 -0800)]
Zebra: Implement route replace for IPv6
Zebra currently performs a delete followed by add when a route needs to be
modified. Change this to use the replace semantics of netlink so that the
operation can possibly be atomic.
Note: This patch handles IPv6 routes, IPv4 already performs a replace.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Ticket: CM-5597
Reviewed By: CCR-3407
Testing Done: Manual testing of various scearnios (Vivek, Satish)
Note: This is an import of patch zebra-ipv6-route-replace.patch from 2.5-br.
Donald Sharp [Wed, 18 Nov 2015 23:36:04 +0000 (15:36 -0800)]
Quagga: Fixup cli and json keyword
The json keyword was being read incorrectly.
Basically some commands read a variable # of arguments
and in ospf the command values were being placed into
argc and argv. With a variable # of arguments their
existed a possibility that less arguments would be read
from the cli than were being tested for in the command function
handler. This caused core dumps in some situations.
All code to read to decide to use the json keyword has
been centralized through a function and all code
converted to use it, irrelevant if it exhibited the bug
Ticket: CM-8278
Reviewed by: CCR-3830
Testing: OSPF no longer crashes and all other test suites still run
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
vivek [Tue, 17 Nov 2015 21:57:56 +0000 (13:57 -0800)]
BGP: Handle router-id correctly in config and display.
BGP is currently displaying the in-use router-id in the config. This is
conditional on a CONFIG flag, however, that flag is set even when there
is no configured router-id and the router-id learnt from Zebra is in-use.
The CONFIG flag for router-id is redundant since there is a separate
variable for the configured value, so use that and deprecate the CONFIG
flag. This also makes BGP behave like OSPF.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Ticket: CM-8077, CM-8220
Reviewed By: CCR-3793
Testing Done: Manual verification (in 2.5-br)
Note: Imported from 2.5-br patch bgpd-fix-router-id-config-display.patch
Donald Sharp [Tue, 17 Nov 2015 16:13:23 +0000 (08:13 -0800)]
Quagga: Set MULTIPATH_NUM to 64 when user specifies 0 from cli
The code has tests to see if the MULTIPATH_NUM == 0 and to
treat it like the user has entered 'Maximum PATHS'.
This 0 is treated as 64 internally. Remove this dependency
and setup MULTIPATH_NUM to 64 when --enable-multipath=0 from
the configure cli.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Daniel Walton [Tue, 17 Nov 2015 02:09:57 +0000 (02:09 +0000)]
BGP: crash in update_subgroup_merge()
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Ticket: CM-8191
On my hard node I have a route to 10.0.0.0/22 via eth0, I then learn
10.0.0.8/32 from peer 10.0.0.8 with a nexthop of 10.0.0.8:
superm-redxp-05# show ip bgp 10.0.0.8/32
BGP routing table entry for 10.0.0.8/32
Paths: (1 available, no best path)
Not advertised to any peer
80
10.0.0.8 from r8(swp6) (10.0.0.8)
Origin IGP, metric 0, localpref 100, valid, external
AddPath ID: RX 0, TX 9
Last update: Thu Nov 12 14:00:00 2015
superm-redxp-05#
I do a lookup for the nexthop and see that 10.0.0.8 is reachable via my
eth0 10.0.0.22 so I select a bestpath and install the route. At this
point my route to 10.0.0.8 is a /32 that resolves via itself, NHT sees
that this is illegal and flags the nexthop as inaccessible.
superm-redxp-05# show ip bgp 10.0.0.8/32
BGP routing table entry for 10.0.0.8/32
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Advertised to non peer-group peers:
r6(swp4) r7(swp5) r8(swp6) r2(10.1.2.2) r3(10.1.3.2) r4(10.1.4.2)
80
10.0.0.8 (inaccessible) from r8(swp6) (10.0.0.8)
Origin IGP, metric 0, localpref 100, invalid, external, bestpath-from-AS 80, best
AddPath ID: RX 0, TX 9
Last update: Thu Nov 12 14:00:00 2015
superm-redxp-05#
at which point we withdraw the route, things churn, we relearn it and go
through the whole process over and over again. We end up advertising and
withdrawing this route about 9 times a second!!
This exposed a crash in the update-group code where we try to merge two sub-groups
but the assert on adj_count fails because the timing worked out where one had
advertised 10.0.0.8/32 but the other had not.
NOTE: the race condition described above will be resolved via a separate patch.
Donald Sharp [Mon, 16 Nov 2015 20:48:07 +0000 (12:48 -0800)]
Zebra: Remove reliance on NEXTHOP_TYPE_IPV4_ONLINK
Zebra already knows if an interface is unnumbered or not. This
is communicated to OSPF.
OSPF would only send a NEXTHOP_TYPE_IPV4_ONLINK *if* the path
was unnumbered, which it learns from Zebra.
As such, Have OSPF use the normal NEXTHOP_TYPE_IPV4_IFINDEX
type for unnumbered paths. In Zebra, if the ifindex recieved
is unnumbered then assume that the link is NEXTHOP_FLAG_ONLINK.
Ticket: CM-8145 Reviewed-by: CCR-3771
Testing: See bug
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
vivek [Sun, 15 Nov 2015 18:21:12 +0000 (10:21 -0800)]
BGP: Fix nexthop registration churn
When a BGP nexthop is registered for resolution, if it is learnt from an
EBGP peer and other conditions warrant (non-multihop peer and connected check
is not disabled), the registration includes a flag that indicates that the
nexthop must be resolved only if it is directly connected. In peculiar
situations - e.g., third-party nexthop or policy configuration - the same
nexthop could be learnt from an IBGP peer, and in general, nexthops learnt
from IBGP peers can be resolved over any route. This scenario was causing
a churn in the nexthop registration with the 'must-be-connected' flag being
repeatedly toggled as routes are received from both peers. The registrations
would in turn trigger significant processing.
The fix is to treat 'must-be-connected' as an overriding condition.
The repeated registration and related processing was also causing heavy memory
usage by BGP - for memory buffers used to hold registration information. This
fix will ensure that is no longer the case.
Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-8005, CM-8013
Reviewed By: CCR-3772
Testing Done: Manual, bgpsmoke (on 2.5-br)
vivek [Sun, 15 Nov 2015 17:57:34 +0000 (09:57 -0800)]
BGP: Handle change to nexthop correctly
When a nexthop change is received and processed, the change flags are not
examined correctly and route change flags not updated correctly. Fix to
ensure correct handling.
vivek [Sun, 15 Nov 2015 15:36:50 +0000 (07:36 -0800)]
Zebra: Ensure correct route is used for redistribute delete.
After the optimization introduced by patch zebra-redist-update-fix.patch
which implements "replace" semantics for redistributed routes instead of
a delete followed by add, the code was passing an incorrect route for
redistribute deletion in one case. This is mainly inconsequential as of
now as the deletion process primarily cares about only the destination, but
the code needs to be corrected and that is done here.
vivek [Sun, 15 Nov 2015 15:17:47 +0000 (07:17 -0800)]
BGP: Fix the setting of link-local nexthops in some situations
This patch addresses three main issues:
a. Passing along the global IPv6 nexthop received from the EBGP peer to
IBGP peers but setting the link-local IPv6 nexthop to ourselves when
advertising EBGP-learnt routes to IBGP peers (in the absence of outbound
route-map or other overrides). The fix is to not send a link-local IPv6
nexthop in this case.
b. Passing along the link-local IPv6 nexthop received from one peer to
another peer which is (or may be) on a different subnet. This violates the
semantics of link-local IPv6 address. The fix is to set the nexthop to
ourselves in the situation where the nexthop normally has to be passed
but is a link-local IPv6 address.
c. Different behavior wrt nexthop advertisement for BGP unnumbered peering
if it is setup using link-local IPv6 address versus IPv4 /30 or /31. The
fix is to make the behavior consistent as long as the interface config is
the same in both cases.
Daniel Walton [Thu, 12 Nov 2015 20:30:22 +0000 (20:30 +0000)]
bgp may add multiple path entries with the same nexthop
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-8129
- We have 14 paths for each prefix, 7 are from ipv4 peers and 7 are from ipv6 peers
- There are 7 unique nexthops
- When comparing the exact same path from an v4 peer vs. a v6 peer the path from
the v4 peer wins. This is due to the "lowest neighbor IP" check in the decision
algorithm. For example below we learn NEXTHOP 210.2.4.2 from 210.2.4.2 and
2001:20:4::2 but only the one from the v4 peer is flagged as multipath.
- The problem is when our bestpath is from a v6 peer, 2001:20:2::2 in this case
(see line 85). 2001:20:2::2 sent us 210.2.2.2 so we will install that nexthop
because it is from our bestpath, the problem is we flag the path from 210.2.2.2
(line 37) as multipath which causes us to install two paths with nexthop 210.2.2.2
Here you can see the two paths with nexthop 210.2.2.2
superm-redxp-05# show ip route 2.23.24.192/28
Routing entry for 2.23.24.192/28
Known via "bgp", distance 20, metric 0, best
Last update 00:32:12 ago
* 210.2.2.2, via swp3
* 210.2.0.2, via swp1
* 210.2.1.2, via swp2
* 210.2.2.2, via swp3
* 210.2.3.2, via swp4
* 210.2.4.2, via swp5
* 210.2.5.2, via swp6
* 210.2.6.2, via swp7
superm-redxp-05#
superm-redxp-05#
The fix is to not flag a path as multipath if it has the same nexthop as the bestpath
Daniel Walton [Tue, 10 Nov 2015 15:29:12 +0000 (15:29 +0000)]
BGP: route-server will now use addpath...chop the _rsclient code
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com>
Ticket: CM-8122
per draft-ietf-idr-ix-bgp-route-server-09:
2.3.2.2.2. BGP ADD-PATH Approach
The [I-D.ietf-idr-add-paths] Internet draft proposes a different
approach to multiple path propagation, by allowing a BGP speaker to
forward multiple paths for the same prefix on a single BGP session.
As [RFC4271] specifies that a BGP listener must implement an implicit
withdraw when it receives an UPDATE message for a prefix which
already exists in its Adj-RIB-In, this approach requires explicit
support for the feature both on the route server and on its clients.
If the ADD-PATH capability is negotiated bidirectionally between the
route server and a route server client, and the route server client
propagates multiple paths for the same prefix to the route server,
then this could potentially cause the propagation of inactive,
invalid or suboptimal paths to the route server, thereby causing loss
of reachability to other route server clients. For this reason, ADD-
PATH implementations on a route server should enforce send-only mode
with the route server clients, which would result in negotiating
receive-only mode from the client to the route server.
This allows us to delete all of the following code:
- All XXXX_rsclient() functions
- peer->rib
- BGP_TABLE_MAIN and BGP_TABLE_RSCLIENT
- RMAP_IMPORT and RMAP_EXPORT
Daniel Walton [Thu, 5 Nov 2015 17:29:43 +0000 (17:29 +0000)]
BGP: support for addpath TX
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Vivek Venkataraman <vivek@cumulusnetworks.com
Ticket: CM-8014
This implements addpath TX with the first feature to use it
being "neighbor x.x.x.x addpath-tx-all-paths".
One change to show output is 'show ip bgp x.x.x.x'. If no addpath-tx
features are configured for any peers then everything looks the same
as it is today in that "Advertised to" is at the top and refers to
which peers the bestpath was advertise to.
root@superm-redxp-05[quagga-stash5]# vtysh -c 'show ip bgp 1.1.1.1'
BGP routing table entry for 1.1.1.1/32
Paths: (6 available, best #6, table Default-IP-Routing-Table)
Advertised to non peer-group peers:
r1(10.0.0.1) r2(10.0.0.2) r3(10.0.0.3) r4(10.0.0.4) r5(10.0.0.5) r6(10.0.0.6) r8(10.0.0.8)
Local, (Received from a RR-client)
12.12.12.12 (metric 20) from r2(10.0.0.2) (10.0.0.2)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 8
Last update: Fri Oct 30 18:26:44 2015
[snip]
but once you enable an addpath feature we must display "Advertised to" on a path-by-path basis:
superm-redxp-05# show ip bgp 1.1.1.1/32
BGP routing table entry for 1.1.1.1/32
Paths: (6 available, best #6, table Default-IP-Routing-Table)
Local, (Received from a RR-client)
12.12.12.12 (metric 20) from r2(10.0.0.2) (10.0.0.2)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 8
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:44 2015
Local, (Received from a RR-client)
34.34.34.34 (metric 20) from r3(10.0.0.3) (10.0.0.3)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 7
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:39 2015
Local, (Received from a RR-client)
56.56.56.56 (metric 20) from r6(10.0.0.6) (10.0.0.6)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 6
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:39 2015
Local, (Received from a RR-client)
56.56.56.56 (metric 20) from r5(10.0.0.5) (10.0.0.5)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 5
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:39 2015
Local, (Received from a RR-client)
34.34.34.34 (metric 20) from r4(10.0.0.4) (10.0.0.4)
Origin IGP, metric 0, localpref 100, valid, internal
AddPath ID: RX 0, TX 4
Advertised to: r8(10.0.0.8)
Last update: Fri Oct 30 18:26:39 2015
Local, (Received from a RR-client)
12.12.12.12 (metric 20) from r1(10.0.0.1) (10.0.0.1)
Origin IGP, metric 0, localpref 100, valid, internal, best
AddPath ID: RX 0, TX 3
Advertised to: r1(10.0.0.1) r2(10.0.0.2) r3(10.0.0.3) r4(10.0.0.4) r5(10.0.0.5) r6(10.0.0.6) r8(10.0.0.8)
Last update: Fri Oct 30 18:26:34 2015
Feng Lu [Thu, 16 Oct 2014 01:52:36 +0000 (09:52 +0800)]
*: add VRF ID in the API message header
The API messages are used by zebra to exchange the interfaces, addresses,
routes and router-id information with its clients. To distinguish which
VRF the information belongs to, a new field "VRF ID" is added in the
message header. And hence the message version is increased to 3.
* The new field "VRF ID" in the message header:
Length (2 bytes)
Marker (1 byte)
Version (1 byte)
VRF ID (2 bytes, newly added)
Command (2 bytes)
- Client side:
- zclient_create_header() adds the VRF ID in the message header.
- zclient_read() extracts and validates the VRF ID from the header,
and passes the VRF ID to the callback functions registered to
the API messages.
- All relative functions are appended with a new parameter "vrf_id",
including all the callback functions.
- "vrf_id" is also added to "struct zapi_ipv4" and "struct zapi_ipv6".
Clients need to correctly set the VRF ID when using the API
functions zapi_ipv4_route() and zapi_ipv6_route().
- Till now all messages sent from a client have the default VRF ID
"0" in the header.
- The HELLO message is special, which is used as the heart-beat of
a client, and has no relation with VRF. The VRF ID in the HELLO
message header will always be 0 and ignored by zebra.
- Zebra side:
- zserv_create_header() adds the VRF ID in the message header.
- zebra_client_read() extracts and validates the VRF ID from the
header, and passes the VRF ID to the functions which process
the received messages.
- All relative functions are appended with a new parameter "vrf_id".
* Suppress the messages in a VRF which a client does not care:
Some clients may not care about the information in the VRF X, and
zebra should not send the messages in the VRF X to those clients.
Extra flags are used to indicate which VRF is registered by a client,
and a new message ZEBRA_VRF_UNREGISTER is introduced to let a client
can unregister a VRF when it does not need any information in that
VRF.
A client sends any message other than ZEBRA_VRF_UNREGISTER in a VRF
will automatically register to that VRF.
- lib/vrf:
A new utility "VRF bit-map" is provided to manage the flags for
VRFs, one bit per VRF ID.
- Use vrf_bitmap_init()/vrf_bitmap_free() to initialize/free a
bit-map;
- Use vrf_bitmap_set()/vrf_bitmap_unset() to set/unset a flag
in the given bit-map, corresponding to the given VRF ID;
- Use vrf_bitmap_check() to test whether the flag, in the given
bit-map and for the given VRF ID, is set.
- Client side:
- In "struct zclient", the following flags are changed from
"u_char" to "vrf_bitmap_t":
redist[ZEBRA_ROUTE_MAX]
default_information
These flags are extended for each VRF, and controlled by the
clients themselves (or with the help of zclient_redistribute()
and zclient_redistribute_default()).
- Zebra side:
- In "struct zserv", the following flags are changed from
"u_char" to "vrf_bitmap_t":
redist[ZEBRA_ROUTE_MAX]
redist_default
ifinfo
ridinfo
These flags are extended for each VRF, as the VRF registration
flags. They are maintained on receiving a ZEBRA_XXX_ADD or
ZEBRA_XXX_DELETE message.
When sending an interface/address/route/router-id message in
a VRF to a client, if the corresponding VRF registration flag
is not set, this message will not be dropped by zebra.
- A new function zread_vrf_unregister() is introduced to process
the new command ZEBRA_VRF_UNREGISTER. All the VRF registration
flags are cleared for the requested VRF.
Those clients, who support only the default VRF, will never receive
a message in a non-default VRF, thanks to the filter in zebra.
* New callback for the event of successful connection to zebra:
- zclient_start() is splitted, keeping only the code of connecting
to zebra.
- Now zclient_init()=>zclient_connect()=>zclient_start() operations
are purely dealing with the connection to zbera.
- Once zebra is successfully connected, at the end of zclient_start(),
a new callback is used to inform the client about connection.
- Till now, in the callback of connect-to-zebra event, all clients
send messages to zebra to request the router-id/interface/routes
information in the default VRF.
Of corse in future the client can do anything it wants in this
callback. For example, it may send requests for both default VRF
and some non-default VRFs.
Signed-off-by: Feng Lu <lu.feng@6wind.com> Reviewed-by: Alain Ritoux <alain.ritoux@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
Conflicts:
lib/zclient.h
lib/zebra.h
zebra/zserv.c
zebra/zserv.h
David Lamparter [Mon, 13 Apr 2015 08:21:37 +0000 (10:21 +0200)]
lib: optimise prefix list setup
- duplicate prefix check can use the trie structure
- appending with a seq# beyond the end of the list can shortcut
Configuration load is now bottlenecked by cmd_element_match() and
strcmp(). For a real-world routeserver prefix list configuration
(38668 lines of config for multiple prefix lists):
before: 4.73s
after: 1.92s x 2.46
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
David Lamparter [Mon, 13 Apr 2015 08:21:36 +0000 (10:21 +0200)]
lib: use trie structure for prefix list matching
Prefix lists were implemented with a simple linear list that is scanned
sequentially. This is, of course, extremely inefficient as it scales by
O(n). This patch adds a trie-ish data structure that allows quickly
descending based on the prefix.
Note that the trie structure used here is designed for real-world use,
hence it uses a relatively crude fixed-size bytewise table instead of
some fancy balancing scheme. It is quite cacheline efficient.
Using real-world routeserver prefix lists, matching against a fulltable
dump:
David Lamparter [Mon, 13 Apr 2015 08:21:35 +0000 (10:21 +0200)]
lib: straighten out ORF prefix list support
BGP ORF prefix lists are in a separate namespace; this was previously
hooked up with a special-purpose AFI value. This is a little kludgy for
extension, hence this splits it off.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Feng Lu [Thu, 3 Jul 2014 10:23:09 +0000 (18:23 +0800)]
zebra, lib/memtypes.c: the netlink sockets work per VRF
This patch lets the netlink sockets work per VRF.
* The definition of "struct nlsock" is moved into zebra/rib.h.
* The previous global variables "netlink" and "netlink_cmd" now
become the members of "struct zebra_vrf", and are initialized
in zebra_vrf_alloc().
* All relative functions now work for a specific VRF, by adding
a new parameter which specifies the working VRF, except those
functions in which the VRF ID can be obtained from the interface.
* kernel_init(), interface_list() and route_read() are now also
working per VRF, and moved from main() to zebra_vrf_enable().
* A new function kernel_terminate() is added to release the
netlink sockets. It is called from zebra_vrf_disable().
* Correct VRF ID, instead of the previous VRF_DEFAULT, are now
passed to the functions of processing interfaces or route
entries.
Signed-off-by: Feng Lu <lu.feng@6wind.com> Reviewed-by: Alain Ritoux <alain.ritoux@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Donald Sharp <sharpd@cumulusnetworks.com>
Conflicts:
lib/memtypes.c
zebra/rib.h
zebra/rt_netlink.c
Feng Lu [Fri, 22 May 2015 09:40:10 +0000 (11:40 +0200)]
zebra: maintain RTADV per VRF
This moves the global variable "rtadv" into the "struct zebra_vrf",
so that RTADV feature can work per VRF.
* rtadv.c/rtadv.h:
Add a proper parameter to the functions so that the entity of the
"struct zebra_vrf" and interfaces can be obtained from the specified
VRF.
The old rtadv_init() is splitted into:
- rtadv_cmd_init(): it installs the RTADV commands; is called from
main();
- new rtadv_init(): it creates the socket; is called from
zebra_vrf_enable().
rtadv_terminate() is added to stop the threads, close the socket and
clear the counters. It is called from zebra_vrf_disable().
rtadv_make_socket() now calls vrf_socket() to create a socket in
the VRF.
* interface.h and rib.h: define the macro RTADV.
* main.c: according changes, refer to rtadv.c.
Signed-off-by: Feng Lu <lu.feng@6wind.com> Reviewed-by: Alain Ritoux <alain.ritoux@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Vincent JARDIN <vincent.jardin@6wind.com> Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Conflicts:
zebra/interface.h
zebra/rib.h
zebra/rtadv.c
zebra/rtadv.h
Feng Lu [Fri, 22 May 2015 09:40:09 +0000 (11:40 +0200)]
zebra: add hooks upon enabling / disabling a VRF
zebra_vrf_enable() is the callback for VRF_ENABLE_HOOK.
It presently needs do nothing.
zebra_vrf_disable() is the callback for VRF_DISABLE_HOOK.
It presently withdraws routes, shuts down interfaces, and
clears the router-id candidates in that VRF.
Signed-off-by: Feng Lu <lu.feng@6wind.com> Reviewed-by: Alain Ritoux <alain.ritoux@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Vincent JARDIN <vincent.jardin@6wind.com> Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Feng Lu [Fri, 22 May 2015 09:40:08 +0000 (11:40 +0200)]
lib/vrf: enable / disable a VRF
A new API vrf_is_enabled() is defined to check whether a VRF is ready
to use, that is, to allocate resources in that VRF. Currently there's
only one type of resource: socket.
Two new hooks VRF_ENABLE_HOOK/VRF_DISABLE_HOOK are introduced to tell
the user when a VRF gets ready or to be unavailable.
The VRF_ENABLE_HOOK callback is called in the new function vrf_enable(),
which is used to let the VRF be ready to use. Till now, only the default
VRF can be enabled, and we need do nothing to enable the default, except
calling the hook.
The VRF_DISABLE_HOOK callback is called in the new function
vrf_disable(), which is used to let the VRF be unusable. Till now,
it is called only when the VRF is to be deleted.
A new utility vrf_socket() is defined to provide a socket in a given
VRF to the user.
Till now before introducing a way of VRF realization, only the default
VRF is enabled since its birth, and vrf_socket() creates socket for
only the default VRF.
This patch defines the framework of the VRF APIs. The way they serve
the users is:
- vrf_is_enabled() is used to tell the user whether a VRF is usable;
- users are informed by the VRF_ENABLE_HOOK that a VRF gets usable;
they can allocate resources after that;
- users are informed by the VRF_DISABLE_HOOK that a VRF is to be
unavailable, and they must release the resources instantly;
- vrf_socket() is used to provide a socket in a given VRF.
Signed-off-by: Feng Lu <lu.feng@6wind.com> Reviewed-by: Alain Ritoux <alain.ritoux@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Vincent JARDIN <vincent.jardin@6wind.com> Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Feng Lu [Fri, 22 May 2015 09:40:07 +0000 (11:40 +0200)]
zebra: maintain the router-id per VRF
A router may need different identifier among the VRFs. So move the
maintenance of router-id per VRF.
* rib.h:
Move the previous global variables in router-id.c into the
"struct zebra_vrf":
- struct list _rid_all_sorted_list/*rid_all_sorted_list
- struct list _rid_lo_sorted_list/*rid_lo_sorted_list
- struct prefix rid_user_assigned
* router-id.c/router-id.h:
A new parameter "vrf_id" is added to all the router-id APIs.
Their operations are done only within the specified VRF.
A new command "router-id A.B.C.D vrf N" is added to allow
manual router-id for any VRF.
The old router_id_init() function is splitted into two:
- router_id_cmd_init(): it only installs the commands
- router_id_init(): this new one initializes the variables for
a specified VRF
* zebra_rib.c: Add new functions zebra_vrf_get/lookup() called
from router-id.c.
* main.c: Replace router_id_init() with router_id_cmd_init() and
call the new router_id_init() in zebra_vrf_new().
Signed-off-by: Feng Lu <lu.feng@6wind.com> Reviewed-by: Alain Ritoux <alain.ritoux@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Vincent JARDIN <vincent.jardin@6wind.com> Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
Conflicts:
zebra/rib.h