git.proxmox.com Git - mirror_frr.git/log
6 years ago Merge pull request #1496 from donaldsharp/install_failure
Renato Westphal [Mon, 4 Dec 2017 20:25:16 +0000 (18:25 -0200)]
Merge pull request #1496 from donaldsharp/install_failure

Additional Southbound API changes

6 years ago Merge pull request #1507 from donaldsharp/bgp_af_open
Renato Westphal [Mon, 4 Dec 2017 19:34:19 +0000 (17:34 -0200)]
Merge pull request #1507 from donaldsharp/bgp_af_open

bgpd: Allow Address-Family activation to work in certain states

6 years ago Merge pull request #1500 from opensourcerouting/ldpd-fixes
Donald Sharp [Mon, 4 Dec 2017 14:06:09 +0000 (09:06 -0500)]
Merge pull request #1500 from opensourcerouting/ldpd-fixes

ldpd: small improvements

6 years ago Merge pull request #1508 from qlyoung/bgpd-fix-lock
Rafael Zalamena [Mon, 4 Dec 2017 13:16:45 +0000 (11:16 -0200)]
Merge pull request #1508 from qlyoung/bgpd-fix-lock

bgpd: fix potential deadlock

6 years ago Merge pull request #1472 from opensourcerouting/lintian-warning
Donald Sharp [Mon, 4 Dec 2017 13:02:16 +0000 (08:02 -0500)]
Merge pull request #1472 from opensourcerouting/lintian-warning

debianpkg: Suppress frr-dbg debug-file-with-no-debug-symbols warning

6 years ago Merge pull request #1510 from qlyoung/ospf-gitignore-clippy
Lou Berger [Fri, 1 Dec 2017 22:06:37 +0000 (06:06 +0800)]
Merge pull request #1510 from qlyoung/ospf-gitignore-clippy

ospfd: remove clippy file, fix .gitignore

6 years ago Merge pull request #1433 from qlyoung/remove-deprecated-stream-macros
Rafael Zalamena [Fri, 1 Dec 2017 19:46:02 +0000 (17:46 -0200)]
Merge pull request #1433 from qlyoung/remove-deprecated-stream-macros

*: don't use deprecated stream.h macros

6 years ago ospfd: remove clippy file, fix .gitignore
Quentin Young [Fri, 1 Dec 2017 19:24:30 +0000 (14:24 -0500)]
ospfd: remove clippy file, fix .gitignore

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago *: don't use deprecated stream.h macros
Quentin Young [Wed, 8 Nov 2017 17:51:16 +0000 (12:51 -0500)]
*: don't use deprecated stream.h macros

Some of the deprecated stream.h macros see so little use that we may
as well just remove them and use the non-deprecated macros.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: fix potential deadlock
Quentin Young [Fri, 1 Dec 2017 18:41:27 +0000 (13:41 -0500)]
bgpd: fix potential deadlock

With the way things are set up, this bit of code would never actually
cause a deadlock today, but would be highly likely to cause one in the
future.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: Allow Address-Family activation to work in certain states
Donald Sharp [Fri, 1 Dec 2017 16:49:13 +0000 (11:49 -0500)]
bgpd: Allow Address-Family activation to work in certain states

If we are in OpenSent or OpenConfirm peer state and we receive a new
address-family activation, we would end up ignoring the new activation
and not tell our peer about it.  You could notice this when a
'show bgp neighbor' command returns 'Not in any update group'
for a particular family.

This modifies the code such that we now notice that we are in
either OpenSent or OpenConfirm state and reset the peer to
allow us to send them the new capability.

Ticket: CM-19021
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
6 years ago ospfd: fix NSSA LSA translation (BZ#493) (BZ#250)
Svata Dedic [Thu, 22 Dec 2011 14:07:15 +0000 (18:07 +0400)]
ospfd: fix NSSA LSA translation (BZ#493) (BZ#250)

6 years ago Merge pull request #1145 from qlyoung/bgpd-pthreads-frr
Martin Winter [Fri, 1 Dec 2017 07:35:51 +0000 (23:35 -0800)]
Merge pull request #1145 from qlyoung/bgpd-pthreads-frr

Multithreaded BGPD

6 years ago bgpd: small optimization with UPDATE generation
Quentin Young [Thu, 30 Nov 2017 22:16:37 +0000 (17:16 -0500)]
bgpd: small optimization with UPDATE generation

After a batch of generated UPDATEs, call bgp_writes_on() once instead of
after generating each packet.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: use FOREACH_AFI_SAFI()
Quentin Young [Thu, 30 Nov 2017 21:58:37 +0000 (16:58 -0500)]
bgpd: use FOREACH_AFI_SAFI()

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: intelligently adjust coalesce timer
Quentin Young [Thu, 30 Nov 2017 19:11:12 +0000 (14:11 -0500)]
bgpd: intelligently adjust coalesce timer

The subgroup coalesce timer controls how long updates to a particular
subgroup are delayed in order to allow additional peers to join the
subgroup. Presently the timer value is 200 ms. Increase it to 1 second
and adjust up as peers are configured, with an upper cap at 10s.

This cuts convergence time by a factor of 3 at large scale (300+ peers,
1000+ prefixes per peer).

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
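
The timer adjustment described in the commit above could be sketched roughly as follows. This is an illustrative Python model, not the bgpd code: the 1 second base and the 10 second cap come from the commit message, while `PER_PEER_MS` and the linear growth are assumptions, since the message does not say how the value scales with the number of configured peers.

```python
BASE_MS = 1000       # new starting value named in the commit message
CAP_MS = 10_000      # upper cap named in the commit message
PER_PEER_MS = 50     # hypothetical slope; not stated in the commit


def coalesce_time_ms(configured_peers: int) -> int:
    """Return a coalesce delay that grows with peer count up to the cap."""
    peers = max(configured_peers, 0)
    return min(CAP_MS, BASE_MS + PER_PEER_MS * peers)
```

With these assumed constants, a handful of peers stays close to the base value and several hundred peers reach the cap.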
6 years ago tests: neuter fuzzing frontend for now
Quentin Young [Thu, 30 Nov 2017 20:07:29 +0000 (15:07 -0500)]
tests: neuter fuzzing frontend for now

Fuzzing hook for BGP packet processing does not map to MT-BGPD. Removing
the offending call for now; additional work is needed to fix this in the
future.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: turn off keepalives when sending NOTIFY
Quentin Young [Mon, 13 Nov 2017 22:59:04 +0000 (17:59 -0500)]
bgpd: turn off keepalives when sending NOTIFY

This is necessary because otherwise between the time we wipe the output
buffer and the time we push the NOTIFY onto it, the KA generation thread
could have pushed a KEEPALIVE in the middle.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: yield more when generating UPDATEs
Quentin Young [Mon, 13 Nov 2017 08:18:49 +0000 (03:18 -0500)]
bgpd: yield more when generating UPDATEs

In the same vein as the round-robin input commit, this re-adds logic for
limiting the amount of time spent generating UPDATEs per generation
cycle. Missed this when shifting around wpkt_quanta; prior to MT it
limited both calls to write() as well as UPDATE generation.

6 years ago bgpd: schedule UPDATE generation smarter
Quentin Young [Fri, 10 Nov 2017 22:03:58 +0000 (17:03 -0500)]
bgpd: schedule UPDATE generation smarter

No need to schedule a job to generate more packets until we're done with
the ones we've got. Shaves a few percent off convergence time.

6 years ago bgpd: restore packet input limit
Quentin Young [Fri, 10 Nov 2017 21:42:49 +0000 (16:42 -0500)]
bgpd: restore packet input limit

Unfortunately, batching input processing severely impacts BGP initial
convergence times. As a consequence of the way update-groups were
implemented, advancing the state of the routing table based on prefixes
learned from one peer prior to all (or at least most) peers establishing
connections will cause us to start generating outbound UPDATEs, which is
a very expensive operation at present. This intensive processing starves
out bgp_accept(), delaying connection of additional peers. When
additional peers do connect the problem gets worse and worse, yielding
approximately exponential growth in convergence time dependent on both
peering and prefix counts. This behavior is present pre-multithreading
as well, but batched input exacerbates it.

Round-robin input processing marginally harms convergence times for
small topologies but should allow much larger topologies to function
within reasonable performance thresholds.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: schedule process packet as timer
Quentin Young [Tue, 7 Nov 2017 07:49:54 +0000 (02:49 -0500)]
bgpd: schedule process packet as timer

Different places scheduling the same thread should use the same
semantics and thread type. Additionally providing the back reference
here makes sure we only schedule the job once and avoids flooding the
event queue with jobs to process an empty buffer.

6 years ago bgpd: re-add write trigger logic
Quentin Young [Mon, 6 Nov 2017 19:15:36 +0000 (14:15 -0500)]
bgpd: re-add write trigger logic

Apparently I didn't fully understand how subgroup packets make their way
out to individual peers. Turns out (on the base branch) we just busy
poll while waiting for packets to make their way onto subgroup queues.
While this needs to be fixed in the future, for now re-adding this logic
fixes performance issues with convergence.

6 years ago bgpd: properly set peer->last_update
Quentin Young [Mon, 6 Nov 2017 06:41:27 +0000 (01:41 -0500)]
bgpd: properly set peer->last_update

Instead of checking whether the post-write number of updates sent was
greater than the pre-write number of updates sent, it was comparing post
to zero. In effect this meant every time we wrote a packet it was
counted as an update for route advertisement timer purposes.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
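
The comparison bug described in the commit above can be illustrated with a small Python sketch. The function names and arguments are hypothetical stand-ins for the update counters sampled before and after a write; only the "compare post to pre, not post to zero" logic is taken from the commit message.

```python
def update_sent_buggy(sent_before: int, sent_after: int) -> bool:
    # Old behavior as described: compares the post-write counter to zero,
    # so any write counts once at least one UPDATE has ever been sent.
    return sent_after > 0


def update_sent_fixed(sent_before: int, sent_after: int) -> bool:
    # Corrected behavior: only a write that actually sent an UPDATE counts.
    return sent_after > sent_before
```

For a write that sent only a KEEPALIVE on a session with five earlier UPDATEs, the first function reports an update and the second does not.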
6 years ago bgpd: schedule packet job after connection xfer
Quentin Young [Mon, 6 Nov 2017 05:33:46 +0000 (00:33 -0500)]
bgpd: schedule packet job after connection xfer

During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input and a
reorganized packet processor, the following race condition manifests:

1. Remote peer initiates a connection. After exchanging OPEN messages,
   we send them a KEEPALIVE. They send us a KEEPALIVE followed by
   10,000 UPDATE messages. The I/O thread pushes these onto our local
   peer's input buffer and schedules a packet processing job on the
   main thread.
2. The packet job runs and processes the KEEPALIVE, which completes the
   handshake on our end. As part of transferring to ESTABLISHED we
   transfer all peer state to a new struct, as mentioned. Upon returning
   from the KEEPALIVE processing routine, the peer context we had has
   now been destroyed. We notice this and stop processing. Meanwhile
   10k UPDATE messages are sitting on the input buffer.
3. N seconds later, the remote peer sends us a KEEPALIVE. The I/O thread
   schedules another process job, which finds 10k UPDATEs waiting for
   it. Convergence is achieved, but has been delayed by the value of the
   KEEPALIVE timer.

The racy part is that if the remote peer takes a little bit of time to
send UPDATEs after KEEPALIVEs -- somewhere on the order of a few hundred
milliseconds -- we complete the transfer successfully and the packet
processing job is scheduled on the new peer upon arrival of the UPDATE
messages. Yuck.

The solution is to schedule a packet processing job on the new peer
struct after transferring state.

Lengthy commit message in case someone has to debug similar problems in
the future...

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
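
The fix described in the commit above can be modeled in a few lines of Python. This is a simplified sketch with hypothetical names (`Peer`, `transfer_connection`, `job_scheduled`); it assumes the buffered input moves to the new peer struct as part of the transfer and shows only why a processing job must be scheduled on the new struct afterwards.

```python
from collections import deque


class Peer:
    """Hypothetical stand-in for a peer struct with an input buffer."""

    def __init__(self, name):
        self.name = name
        self.ibuf = deque()
        self.job_scheduled = False


def transfer_connection(old, new, schedule_job=True):
    """Move buffered input to the new peer and optionally schedule processing."""
    new.ibuf.extend(old.ibuf)
    old.ibuf.clear()
    if schedule_job and new.ibuf:
        # Without this, buffered UPDATEs wait until the next packet arrives.
        new.job_scheduled = True
    return new


def demo_transfer(schedule_job):
    stub, real = Peer("stub"), Peer("real")
    stub.ibuf.extend(["UPDATE"] * 3)  # packets left behind after the KEEPALIVE
    transfer_connection(stub, real, schedule_job)
    return len(real.ibuf), real.job_scheduled
```

Calling `demo_transfer(False)` reproduces the stalled case: the packets are on the new peer but nothing is scheduled to process them.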
6 years ago bgpd: transfer raw input buffer to new peer
Quentin Young [Fri, 3 Nov 2017 18:47:56 +0000 (14:47 -0400)]
bgpd: transfer raw input buffer to new peer

During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input, I forgot
to transfer the raw input buffer to the new peer. This resulted in
infrequent failures during session handshaking whereby half of a packet
would be thrown away in the middle of a read causing us to send a NOTIFY
for an unsynchronized header. Usually the transfer coincided with a
clean input buffer, hence why it only showed up once in a while.

6 years ago bgpd: fix bgp active open
Quentin Young [Mon, 25 Sep 2017 02:18:15 +0000 (22:18 -0400)]
bgpd: fix bgp active open

At some point when rearranging FSM code, bgpd lost the ability to
perform active opens because it was only paying attention to POLLIN and
not POLLOUT, when the latter is used to signify a successful connection
in the active case.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
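
The role of POLLOUT in an active open, as described in the commit above, can be shown with a short Python sketch. It is not bgpd code; the helper names are hypothetical, the demo assumes a loopback interface is available, and it relies on `select.poll`, which is not available on every platform.

```python
import select
import socket


def connect_succeeded(sock, timeout_ms=1000):
    """Wait for a non-blocking connect() to finish and report the outcome."""
    poller = select.poll()
    poller.register(sock, select.POLLOUT)  # writability signals completion
    for _fd, events in poller.poll(timeout_ms):
        if events & select.POLLOUT:
            # SO_ERROR is 0 on success, or the errno of the failed connect.
            return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
    return False


def demo_active_open():
    """Connect to a local listener and check completion through POLLOUT."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        client.setblocking(False)
        client.connect_ex(listener.getsockname())  # returns before completion
        return connect_succeeded(client)
    finally:
        client.close()
        listener.close()
```

Polling only for POLLIN here would not report the completed connect, because the listener never sends any data.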
6 years ago bgpd: use correct byte order for notify data
Quentin Young [Wed, 20 Sep 2017 15:11:30 +0000 (11:11 -0400)]
bgpd: use correct byte order for notify data

Broke this when rewriting header validation.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago tests: add name to test_mp_attr threadmaster
Quentin Young [Fri, 8 Sep 2017 16:58:59 +0000 (12:58 -0400)]
tests: add name to test_mp_attr threadmaster

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd, tests: comment formatting
Quentin Young [Fri, 8 Sep 2017 15:51:12 +0000 (11:51 -0400)]
bgpd, tests: comment formatting

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: fix some formatting in bgp_io.c
Quentin Young [Fri, 4 Aug 2017 18:27:42 +0000 (14:27 -0400)]
bgpd: fix some formatting in bgp_io.c

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: update atomic memory orders
Quentin Young [Wed, 5 Jul 2017 15:38:57 +0000 (11:38 -0400)]
bgpd: update atomic memory orders

Use best-performing memory orders where appropriate.
Also update some style and add missing comments.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: rebase onto master
Quentin Young [Fri, 30 Jun 2017 18:04:32 +0000 (18:04 +0000)]
bgpd: rebase onto master

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: static bgp_pthreads_init()
Quentin Young [Mon, 26 Jun 2017 16:29:20 +0000 (16:29 +0000)]
bgpd: static bgp_pthreads_init()

got un-static'd at some point

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: fix uninitialized result code
Quentin Young [Mon, 26 Jun 2017 15:50:35 +0000 (15:50 +0000)]
bgpd: fix uninitialized result code

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: sleep in poll()
Quentin Young [Fri, 16 Jun 2017 20:15:31 +0000 (20:15 +0000)]
bgpd: sleep in poll()

poll won't sleep if there are no file descriptors! gotta sleep!

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: lift read-quanta restriction
Quentin Young [Tue, 13 Jun 2017 19:06:51 +0000 (19:06 +0000)]
bgpd: lift read-quanta restriction

Per previous work to ensure all FSM state is updated after processing
each message, read-quanta should be safe to set > 1.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: remove unused extern from bgp_io.h
Quentin Young [Tue, 13 Jun 2017 01:58:39 +0000 (01:58 +0000)]
bgpd: remove unused extern from bgp_io.h

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: be more promiscuous with updgrp packets
Quentin Young [Mon, 12 Jun 2017 21:16:40 +0000 (21:16 +0000)]
bgpd: be more promiscuous with updgrp packets

Slightly incorrect trigger for generating update group packets. In order
to match semantics of previous bgp_write() we need to trigger
update-group packet generation after every write operation, even if no
packets were written. Of course if we're tearing down the session we can
still skip this operation.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: re-add update-group write triggers
Quentin Young [Mon, 12 Jun 2017 20:20:50 +0000 (20:20 +0000)]
bgpd: re-add update-group write triggers

Removed in earlier version where the I/O pthread busy-waited for packets
to be posted to an output queue. Now that it's poll()-based, it's
necessary once again. Although this time we can say what we're actually
doing instead of a side effect of a write job.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago tests: update tests for bgp_packet changes
Quentin Young [Mon, 12 Jun 2017 17:35:47 +0000 (17:35 +0000)]
tests: update tests for bgp_packet changes

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: free notify packet after writing
Quentin Young [Mon, 12 Jun 2017 06:46:56 +0000 (06:46 +0000)]
bgpd: free notify packet after writing

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: misc fsm fixes
Quentin Young [Mon, 12 Jun 2017 02:53:42 +0000 (02:53 +0000)]
bgpd: misc fsm fixes

* Keepalive on/off calls are necessary in certain cases due to screwy
  fsm flow not turning them on after transferring a passive peer
  connection in peer_xfer_conn

* Missed a case in bgp_event_update() that resulted in a return code of -1
  instead of BGP_Stop, which confuses the packet processing routine

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: fix bgp_packet.c / bgp_fsm.c organization
Quentin Young [Sat, 10 Jun 2017 01:01:56 +0000 (01:01 +0000)]
bgpd: fix bgp_packet.c / bgp_fsm.c organization

Despaghettification of bgp_packet.c and bgp_fsm.c

Sometimes we call bgp_event_update() inline during packet parsing.
Sometimes we post events instead.
Sometimes we increment packet counters in the FSM.
Sometimes we do it in packet routines.
Sometimes we update EOR's in FSM.
Sometimes we do it in packet routines.

Fix the madness.

bgp_process_packet() is now the centralized place to:
- Update message counters
- Execute FSM events in response to incoming packets

FSM events are now executed directly from this function instead of being
queued on the thread_master. This is to ensure that the FSM contains the
proper state after each packet is parsed. Otherwise there could be race
conditions where two packets are parsed in succession without the
appropriate FSM update in between, leading to session closure due to
receiving inappropriate messages for the current FSM state.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
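
The ordering problem described in the commit above can be shown with a toy state machine in Python. This is a deliberately reduced model, not the bgpd FSM: the transition table covers only a few states and messages, and both function names are hypothetical. It contrasts running the FSM event after each packet with validating a whole batch against the state seen at the start.

```python
# Minimal transition table: (state, message) -> next state.
TRANSITIONS = {
    ("OpenSent", "OPEN"): "OpenConfirm",
    ("OpenConfirm", "KEEPALIVE"): "Established",
    ("Established", "KEEPALIVE"): "Established",
    ("Established", "UPDATE"): "Established",
}


def process_inline(state, packets):
    """Apply the FSM event for each packet before parsing the next one."""
    for message in packets:
        next_state = TRANSITIONS.get((state, message))
        if next_state is None:
            return "Idle"  # unexpected message for this state: session closes
        state = next_state
    return state


def process_deferred(state, packets):
    """Parse every packet against a stale state and apply events afterwards."""
    pending = []
    for message in packets:
        if (state, message) not in TRANSITIONS:
            return "Idle"  # looks inappropriate only because state is stale
        pending.append(message)
    for message in pending:
        state = TRANSITIONS[(state, message)]
    return state
```

Feeding `["OPEN", "KEEPALIVE"]` to both functions from `OpenSent` shows the difference: the inline version reaches `Established`, the deferred one closes the session.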
6 years ago bgpd: fix includes for bgp_keepalives.c
Quentin Young [Fri, 9 Jun 2017 19:34:29 +0000 (19:34 +0000)]
bgpd: fix includes for bgp_keepalives.c

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: restyle bgp_keepalives.[ch]
Quentin Young [Fri, 9 Jun 2017 19:22:34 +0000 (19:22 +0000)]
bgpd: restyle bgp_keepalives.[ch]

And update copyright header.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: use stop event instead of pthread_kill()
Quentin Young [Fri, 9 Jun 2017 18:10:59 +0000 (18:10 +0000)]
bgpd: use stop event instead of pthread_kill()

When terminating I/O thread, just schedule an event to do any necessary
cleanup and gracefully exit instead of using a signal.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: update I/O docs
Quentin Young [Thu, 8 Jun 2017 21:47:33 +0000 (21:47 +0000)]
bgpd: update I/O docs

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: restyle
Quentin Young [Thu, 8 Jun 2017 21:25:23 +0000 (21:25 +0000)]
bgpd: restyle

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: small i/o threading improvements
Quentin Young [Thu, 8 Jun 2017 21:14:18 +0000 (21:14 +0000)]
bgpd: small i/o threading improvements

* Start bit flags at 1, not 2
* Make run-flags atomic for i/o thread
* Remove work_cond mutex, it should no longer be necessary
* Add asserts to ensure proper ordering in bgp_connect()
* Use true/false with booleans, not 1/0

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: bye bye THREAD_BACKGROUND
Quentin Young [Thu, 8 Jun 2017 20:41:21 +0000 (20:41 +0000)]
bgpd: bye bye THREAD_BACKGROUND

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: use mt-safe thread_cancel()
Quentin Young [Wed, 7 Jun 2017 21:29:48 +0000 (21:29 +0000)]
bgpd: use mt-safe thread_cancel()

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: set thread_master owner appropriately
Quentin Young [Wed, 7 Jun 2017 21:09:59 +0000 (21:09 +0000)]
bgpd: set thread_master owner appropriately

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: atomize write-quanta, add read-quanta
Quentin Young [Mon, 5 Jun 2017 20:14:47 +0000 (20:14 +0000)]
bgpd: atomize write-quanta, add read-quanta

bgpd supports setting a write-quanta that serves as a hint on how many
packets to write per I/O cycle. Now that input is buffered, it makes
sense to add the equivalent parameter for how many packets are processed
per cycle. This is *not* how many packets are read off the wire per I/O
cycle; rather it is how many packets are processed from the input buffer
in a given cycle after having been read off the wire and sanitized.

Since these values must be used from multiple threads, they have also
been made atomic.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
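
The read-quanta idea from the commit above, a bound on how many already-buffered packets are processed per cycle, can be sketched in Python as follows. The function name and the use of a `deque` as the input buffer are assumptions made for illustration; the atomic access mentioned in the commit is not modeled.

```python
from collections import deque


def process_input_cycle(ibuf: deque, read_quanta: int) -> list:
    """Process at most read_quanta packets from the input buffer.

    Packets are assumed to have been read off the wire and validated
    already; anything beyond the quanta stays queued for a later cycle.
    """
    processed = []
    while ibuf and len(processed) < read_quanta:
        processed.append(ibuf.popleft())
    return processed
```

A small quanta makes each cycle short so other work can run in between, at the cost of more cycles to drain a full buffer.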
6 years ago bgpd: batched i/o
Quentin Young [Fri, 2 Jun 2017 01:52:39 +0000 (01:52 +0000)]
bgpd: batched i/o

Instead of reading a packet header and the rest of the packet in two
separate i/o cycles, read a chunk of data at one time and then
parse as many packets as possible out of the chunk.

Also changes bgp_packet.c to batch process packets.

To avoid thrashing on useless mutex locks, the scheduling call for
bgp_process_packet has been changed to always succeed at the cost of no
longer being cancel-able. In this case this is acceptable; following the
pattern of other event-based callbacks, an additional check in
bgp_process_packet to ignore stray events is sufficient. Before deleting
the peer all events are cleared which provides the requisite ordering.

XXX: chunk hardcoded to 5, should use something similar to wpkt_quanta

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
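
Parsing several packets out of one chunk, as the commit above describes, could look roughly like the Python sketch below. It is not the bgp_io.c implementation; `parse_packets` is a hypothetical helper. The header layout it assumes (16-byte all-ones marker, 2-byte length, 1-byte type, lengths between 19 and 4096) is the standard BGP-4 message header.

```python
import struct

MARKER = b"\xff" * 16   # BGP-4 header marker
HEADER_LEN = 19         # marker (16) + length (2) + type (1)
MAX_PACKET_LEN = 4096


def parse_packets(buf: bytearray) -> list:
    """Remove and return every complete packet at the front of buf.

    A trailing partial packet is left in buf for the next read.
    """
    packets = []
    while len(buf) >= HEADER_LEN:
        if bytes(buf[:16]) != MARKER:
            raise ValueError("unsynchronized header")
        (length,) = struct.unpack_from("!H", buf, 16)
        if not HEADER_LEN <= length <= MAX_PACKET_LEN:
            raise ValueError("bad message length")
        if len(buf) < length:
            break  # wait for the rest of this packet
        packets.append(bytes(buf[:length]))
        del buf[:length]
    return packets
```

Keeping the partial tail in the buffer is what lets one read() call feed many packets without a separate header read per packet.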
6 years ago bgpd: fix includes for bgp_io.c
Quentin Young [Thu, 1 Jun 2017 16:44:02 +0000 (16:44 +0000)]
bgpd: fix includes for bgp_io.c

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: style for bgp i/o
Quentin Young [Thu, 1 Jun 2017 16:26:49 +0000 (16:26 +0000)]
bgpd: style for bgp i/o

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: use memcmp to check bgp marker
Quentin Young [Thu, 1 Jun 2017 16:20:58 +0000 (16:20 +0000)]
bgpd: use memcmp to check bgp marker

performance

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: copyright style
Quentin Young [Wed, 17 May 2017 17:17:18 +0000 (17:17 +0000)]
bgpd: copyright style

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: rename peer_keepalives* --> bgp_keepalives*
Quentin Young [Fri, 12 May 2017 03:54:18 +0000 (03:54 +0000)]
bgpd: rename peer_keepalives* --> bgp_keepalives*

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: implement buffered reads
Quentin Young [Tue, 2 May 2017 00:37:45 +0000 (00:37 +0000)]
bgpd: implement buffered reads

* Move and modify all network input related code to bgp_io.c
* Add a real input buffer to `struct peer`
* Move connection initialization to its own thread.c task instead of
  piggybacking off of bgp_read()
* Tons of little fixups

Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch].
Changes made elsewhere are almost exclusively refactoring peer->ibuf to
peer->curr since peer->ibuf is now the true FIFO packet input buffer
while peer->curr represents the packet currently being processed by the
main pthread.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: move bgp i/o to a separate source file
Quentin Young [Tue, 18 Apr 2017 18:11:43 +0000 (18:11 +0000)]
bgpd: move bgp i/o to a separate source file

After implementing threading, bgp_packet.c was serving the double purpose
of consolidating packet parsing functionality and handling actual I/O
operations. This is somewhat messy and difficult to understand. I've
thus moved all code and data structures for handling threaded packet
writes to bgp_io.[ch].

Although bgp_io.[ch] only handles writes at the moment to keep the noise
on this commit series down, for organization purposes, it's probably
best to move bgp_read() and its trappings into here as well and
restructure that code so that read()'s happen in the pthread and packet
processing happens on the main thread.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: use new threading infra
Quentin Young [Sun, 16 Apr 2017 05:18:07 +0000 (05:18 +0000)]
bgpd: use new threading infra

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: use hash table for bgp_keepalives.c
Quentin Young [Wed, 12 Apr 2017 17:17:30 +0000 (17:17 +0000)]
bgpd: use hash table for bgp_keepalives.c

Large numbers of peers make insertion and removal time for a linked
list non-negligible.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: correctly schedule select() at session startup
Quentin Young [Thu, 6 Apr 2017 23:45:57 +0000 (23:45 +0000)]
bgpd: correctly schedule select() at session startup

On TCP connection failure during session setup, bgp_stop() checks
whether peer->t_read is non-null to know whether or not to unschedule
select() on peer->fd before calling close() on it. Using the API exposed
by thread.c instead of bgpd's wrapper macro BGP_READ_ON() results in
this thread value never being set, which causes bgp_stop() to skip the
cancellation of select() before calling close(). Subsequent calls to
select() on that fd crash the daemon.

Use the macro instead.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: transfer packets from peer stub to actual peer
Quentin Young [Thu, 6 Apr 2017 01:09:33 +0000 (01:09 +0000)]
bgpd: transfer packets from peer stub to actual peer

During transition from OpenConfirm -> Established, we wipe the peer stub's
output buffer. Because thread.c prioritizes I/O operations over regular
background threads and events, in a single threaded environment this ordering
meant that the output buffer would be happily empty at wipe time.  In MT-land,
this convenient coincidence is no longer true; thus we need to make sure that
any packets remaining on the peer stub get transferred over to the peer proper.

Also removes misleading comment indicating that bgp_establish() sends a
keepalive packet. It does not.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: stop pseudo-blocking in bgp_write
Quentin Young [Fri, 31 Mar 2017 16:55:52 +0000 (16:55 +0000)]
bgpd: stop pseudo-blocking in bgp_write

If write() indicates that we should retry, just move along to the next
peer and come back later. No need to burn write() in a loop.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
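
The "move along and come back later" behavior in the commit above can be sketched in Python as a single pass over the peers. The names `write_round` and `try_write` are hypothetical; `try_write` stands in for a non-blocking write that raises `BlockingIOError` when the socket would block.

```python
def write_round(peers, try_write):
    """Attempt one write per peer and return the peers that must be retried.

    A peer whose socket would block is skipped for this round instead of
    being retried in a tight loop.
    """
    retry_later = []
    for peer in peers:
        try:
            try_write(peer)
        except BlockingIOError:
            retry_later.append(peer)
    return retry_later
```

The returned list can be fed into the next round, so a slow peer delays only itself.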
6 years ago bgpd: dynamically allocate synchronization primitives
Quentin Young [Wed, 29 Mar 2017 19:16:28 +0000 (19:16 +0000)]
bgpd: dynamically allocate synchronization primitives

Changes all synchronization primitives to be dynamically allocated. This
should help catch any subtle errors in pthread lifecycles.

This change also pre-initializes synchronization primitives before
threads begin to run, eliminating a potential race condition that
probably would have caused a segfault on startup on a very fast box.

Also changes mutex and condition variable allocations to use
MTYPE_PTHREAD and updates tests to do the proper initializations.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: remove unused `struct thread` from peer
Quentin Young [Mon, 27 Mar 2017 19:47:23 +0000 (19:47 +0000)]
bgpd: remove unused `struct thread` from peer

* Remove t_write
* Remove t_keepalive

These have been replaced by pthreads and are no longer needed. Since
some code looks at these values to determine if the threads are
scheduled, also add a new bitfield to store the same information.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago lib, bgpd: implement pthread lifecycle management
Quentin Young [Wed, 8 Mar 2017 23:16:15 +0000 (23:16 +0000)]
lib, bgpd: implement pthread lifecycle management

Removes the WiP shim and implements proper thread lifecycle management.

* Declare necessary pthread_t's in bgp_master
* Define new MTYPE in lib/thread.c for pthreads
* Allocate and free BGP's pthreads appropriately
* Terminate and join threads appropriately

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: put BGP keepalives in a pthread
Quentin Young [Thu, 5 Jan 2017 23:13:16 +0000 (23:13 +0000)]
bgpd: put BGP keepalives in a pthread

This patch, in tandem with moving packet writes into a dedicated kernel
thread, fixes session flaps caused by long-running internal operations
starving the (old) userspace write thread.

BGP keepalives are now produced by a kernel thread and placed onto the
peer's output queue. These are then consumed by the write thread. Both
of these tasks are concurrent with the rest of bgpd, obviating the
session flaps described above.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
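
The producer/consumer arrangement described in the commit above can be modeled with two Python threads and a queue. This is only a sketch of the pattern: `run_keepalive_demo`, the `None` sentinel used to stop the writer, and the tiny interval are all assumptions for illustration, and a real keepalive timer would be per peer.

```python
import queue
import threading


def run_keepalive_demo(count=3, interval=0.01):
    """Produce keepalives on one thread and consume them on another."""
    obuf = queue.Queue()          # stands in for the peer's output queue
    written = []
    pause = threading.Event()     # used only as an interruptible sleep

    def keepalive_thread():
        for _ in range(count):
            obuf.put("KEEPALIVE")
            pause.wait(interval)
        obuf.put(None)            # sentinel: tell the writer to stop

    def write_thread():
        while (packet := obuf.get()) is not None:
            written.append(packet)  # a real writer would call write() here

    threads = [
        threading.Thread(target=keepalive_thread),
        threading.Thread(target=write_thread),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return written
```

Because neither thread depends on the caller doing any work, a busy main thread cannot delay the keepalives in this model.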
6 years ago bgpd: move bgp_connect_check() to bgp_fsm.c
Quentin Young [Fri, 24 Mar 2017 19:05:56 +0000 (19:05 +0000)]
bgpd: move bgp_connect_check() to bgp_fsm.c

Prior to this change, after initiating a nonblocking connection to the
remote peer bgpd would call both BGP_READ_ON and BGP_WRITE_ON on the
peer's socket. This resulted in a call to select(), so that when some
event (either a connection success or failure) occurred on the socket,
one of bgp_read() or bgp_write() would run. At the beginning of each of
those functions was a hook into bgp_connect_check(), which checked the
socket status and issued the correct connection event onto the BGP FSM.

This code is better suited for bgp_fsm.c. Placing it there avoids
scheduling packet reads or writes when we don't know if the socket has
established a connection yet, and the specific functionality is a better
fit for the responsibility scope of this unit.

This change also helps isolate the responsibilities of the
packet-writing kernel thread.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: move update group processing to main thread
Quentin Young [Wed, 22 Mar 2017 17:13:23 +0000 (17:13 +0000)]
bgpd: move update group processing to main thread

Prior to this change, packets generated for update groups were taken off
of the (independent) buffer for the update group, reformatted for the
specific peer under question and sent off inline with bgp_write(). Since
the operations of this code path can include the merging and pruning of
subgroups and are too large to safely synchronize, this change moves
that logic to execute after each tick of the write thread.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago bgpd: move packet writes into dedicated pthread
Quentin Young [Mon, 6 Feb 2017 23:39:06 +0000 (23:39 +0000)]
bgpd: move packet writes into dedicated pthread

* BGP_WRITE_ON() removed
* BGP_WRITE_OFF() removed
* peer_writes_on() added
* peer_writes_off() added
* bgp_write_proceed_actions() removed

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years ago Merge pull request #1498 from donaldsharp/thread_pqueue
Lou Berger [Thu, 30 Nov 2017 01:25:57 +0000 (09:25 +0800)]
Merge pull request #1498 from donaldsharp/thread_pqueue

lib: Fix thread removal from a pqueue

6 years ago ldpd: improve processing of redistributed routes
Renato Westphal [Wed, 29 Nov 2017 20:30:26 +0000 (18:30 -0200)]
ldpd: improve processing of redistributed routes

ldpd should ignore blackhole routes and any other route that doesn't
have a nexthop address (connected routes being an exception).

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
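
The filtering rule stated in the commit above can be written as a small predicate. This Python sketch uses hypothetical parameter names and a string route type; it only encodes the rule as the message states it (ignore blackhole routes and routes without a nexthop address, except connected routes).

```python
from typing import Optional


def ldpd_should_process(route_type: str, nexthop: Optional[str],
                        is_blackhole: bool) -> bool:
    """Decide whether a redistributed route is processed or ignored."""
    if is_blackhole:
        return False
    if nexthop is None and route_type != "connected":
        return False
    return True
```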
6 years ago ldpd: add a few warning messages to aid in troubleshooting
Renato Westphal [Wed, 29 Nov 2017 19:11:28 +0000 (17:11 -0200)]
ldpd: add a few warning messages to aid in troubleshooting

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
6 years ago zebra, ldpd: fix display of pseudowire status
Renato Westphal [Wed, 29 Nov 2017 18:22:08 +0000 (16:22 -0200)]
zebra, ldpd: fix display of pseudowire status

In some circumstances zebra and ldpd would display a pseudowire as UP
when in reality it's not (example: MTU mismatch between the two ends). Fix
this to avoid confusion.

Reported-by: ßingen <bingen@voltanet.io>
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
6 years ago lib: Fix thread removal from a pqueue
Donald Sharp [Wed, 29 Nov 2017 19:26:44 +0000 (14:26 -0500)]
lib: Fix thread removal from a pqueue

When we remove a thread from a pqueue, use the saved
index to go to the correct spot immediately instead of
having to search the whole queue for it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
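
The saved-index removal described in the pqueue commit above can be sketched as a small binary min-heap in which every element remembers its own slot. This is only an illustration under stated assumptions: the names (struct task, struct heap, heap_push, heap_remove) are hypothetical and are not FRR's pqueue API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not FRR's pqueue API. */
#define HEAP_MAX 64

struct task {
	long when; /* priority key: smallest value sits at the root */
	int index; /* current slot in heap->slots, or -1 if not queued */
};

struct heap {
	struct task *slots[HEAP_MAX];
	int size;
};

/* Store t in slot i and keep its saved index in sync. */
static void heap_place(struct heap *h, int i, struct task *t)
{
	h->slots[i] = t;
	t->index = i;
}

static void sift_up(struct heap *h, int i)
{
	struct task *t = h->slots[i];

	while (i > 0) {
		int parent = (i - 1) / 2;

		if (h->slots[parent]->when <= t->when)
			break;
		heap_place(h, i, h->slots[parent]);
		i = parent;
	}
	heap_place(h, i, t);
}

static void sift_down(struct heap *h, int i)
{
	struct task *t = h->slots[i];

	for (;;) {
		int child = 2 * i + 1;

		if (child >= h->size)
			break;
		if (child + 1 < h->size
		    && h->slots[child + 1]->when < h->slots[child]->when)
			child++;
		if (t->when <= h->slots[child]->when)
			break;
		heap_place(h, i, h->slots[child]);
		i = child;
	}
	heap_place(h, i, t);
}

static int heap_push(struct heap *h, struct task *t)
{
	if (h->size >= HEAP_MAX)
		return -1;
	heap_place(h, h->size, t);
	h->size++;
	sift_up(h, t->index);
	return 0;
}

/* Remove t by jumping to its saved slot instead of scanning the queue. */
static void heap_remove(struct heap *h, struct task *t)
{
	int i = t->index;
	struct task *moved;

	assert(i >= 0 && i < h->size && h->slots[i] == t);
	t->index = -1;
	h->size--;
	if (i == h->size)
		return; /* t occupied the last slot; nothing to repair */

	/* Fill the hole with the last element, then restore heap order. */
	moved = h->slots[h->size];
	heap_place(h, i, moved);
	sift_up(h, i);
	sift_down(h, moved->index);
}
```

With the index stored on the element, locating it costs O(1) and the repair costs O(log n), whereas a linear search for the element costs O(n) before the repair even starts.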
6 years agozebra: Fix route replace flags
Donald Sharp [Wed, 29 Nov 2017 16:54:27 +0000 (11:54 -0500)]
zebra: Fix route replace flags

When doing a route replace on OpenBSD, we were not
marking the old LSP as no longer installed, while
on Linux we were.  Move the abstraction up a layer.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
6 years agozebra: Fix lsp add/del from kernel using SETFLAG
Donald Sharp [Wed, 29 Nov 2017 13:53:33 +0000 (08:53 -0500)]
zebra: Fix lsp add/del from kernel using SETFLAG

Set up an interface such that the add/del of LSPs from
the kernel can have a callback for success/failure.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
6 years agozebra: Implement call back for route install/delete success/fail
Donald Sharp [Tue, 14 Nov 2017 14:57:37 +0000 (09:57 -0500)]
zebra: Implement call back for route install/delete success/fail

When a route is installed into or deleted from the kernel,
allow a callback mechanism to handle the success or failure
of the kernel call.

This separation is to allow us to do these things:

1) In the future create a true pthread to handle route
install/deletes.  This way we can schedule these
events in a smarter fashion

2) Allow us to use a common southbound api for route
install and deletion.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
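
The success/failure callback idea in the commit above can be sketched roughly as follows. Everything here is a hypothetical stand-in (struct route, fake_kernel_add, route_install, mark_installed); it is not zebra's actual southbound API, and the "kernel" is a stub that rejects prefix 0 so that both outcomes can be exercised.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; not zebra's real data structures. */
enum install_result { INSTALL_OK, INSTALL_FAILED };

struct route {
	unsigned int prefix;
	bool installed;
};

typedef void (*install_cb)(struct route *r, enum install_result res);

/* Stub standing in for the platform-specific kernel call. */
static int fake_kernel_add(const struct route *r)
{
	return r->prefix != 0 ? 0 : -1;
}

/* Common entry point: perform the kernel call, then report the outcome
 * through the callback instead of updating route state inline. */
static void route_install(struct route *r, install_cb done)
{
	enum install_result res =
		fake_kernel_add(r) == 0 ? INSTALL_OK : INSTALL_FAILED;

	if (done != NULL)
		done(r, res);
}

/* Example callback: record whether the route made it into the kernel. */
static void mark_installed(struct route *r, enum install_result res)
{
	r->installed = (res == INSTALL_OK);
}
```

Because the caller only learns the result through the callback, the kernel call could later be moved onto another thread without changing the code that reacts to success or failure.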
6 years agoMerge pull request #1493 from donaldsharp/plist_stuff
Rafael Zalamena [Wed, 29 Nov 2017 16:03:56 +0000 (14:03 -0200)]
Merge pull request #1493 from donaldsharp/plist_stuff

lib: Fix prefix-list where le is == prefixlen

6 years agoMerge pull request #1476 from qlyoung/null0-hack
Russ White [Wed, 29 Nov 2017 12:49:04 +0000 (07:49 -0500)]
Merge pull request #1476 from qlyoung/null0-hack

zebra: add back support for nUlL0

6 years agoMerge pull request #1484 from chiragshah6/ospfv3_dev
Russ White [Wed, 29 Nov 2017 12:45:04 +0000 (07:45 -0500)]
Merge pull request #1484 from chiragshah6/ospfv3_dev

ospfd: Display NSSA in show running-config

6 years agoMerge pull request #1482 from chiragshah6/mdev1
Russ White [Wed, 29 Nov 2017 12:44:39 +0000 (07:44 -0500)]
Merge pull request #1482 from chiragshah6/mdev1

ospfd: Running-config display ospf (non active) vrf config, OSPF Route json support

6 years agoMerge pull request #1477 from chiragshah6/ospf_vrf_dev
Russ White [Wed, 29 Nov 2017 12:39:35 +0000 (07:39 -0500)]
Merge pull request #1477 from chiragshah6/ospf_vrf_dev

ospfd: Forward reference ospf area config

6 years agoMerge pull request #1464 from chiragshah6/mdev
Russ White [Wed, 29 Nov 2017 12:37:58 +0000 (07:37 -0500)]
Merge pull request #1464 from chiragshah6/mdev

ospf6d: SPF consider all Router LSAs

6 years agolib: Fix prefix-list where le is == prefixlen
Donald Sharp [Wed, 29 Nov 2017 00:55:07 +0000 (19:55 -0500)]
lib: Fix prefix-list where le is == prefixlen

This should be allowed:

robot(config)# ip prefix-list outbound_asp_routes seq 33 permit 1.1.1.0/24 le 24
% Invalid prefix range for 1.1.1.0/24, make sure: len < ge-value <= le-value

This commit fixes the issue:

robot(config)# ip prefix-list outbound_asp_routes seq 33 permit 1.1.1.0/24 le 23
% Invalid prefix range for 1.1.1.0/24, make sure: len < ge-value <= le-value
robot(config)# ip prefix-list outbound_asp_routes seq 33 permit 1.1.1.0/24 le 24
robot(config)# ip prefix-list outbound_asp_routes seq 33 permit 1.1.1.0/24 le 25
robot(config)#

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
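
The accepted and rejected inputs shown in the prefix-list commit above are consistent with a range check along these lines. This is a sketch, not the code from the commit: prefix_range_ok is a hypothetical name, 0 is assumed to mean "not specified", and the ge rule is an assumption taken from the wording of the error message rather than from the transcript.

```c
#include <stdbool.h>

/* Hypothetical helper; ge or le equal to 0 means "not specified". */
static bool prefix_range_ok(int prefixlen, int ge, int le)
{
	/* Assumption from the error text: ge must exceed the prefix length. */
	if (ge != 0 && ge <= prefixlen)
		return false;

	/* Behaviour shown in the transcript: le may equal the prefix length
	 * but may not be shorter than it. */
	if (le != 0 && le < prefixlen)
		return false;

	/* When both bounds are given, they must form a non-empty range. */
	if (ge != 0 && le != 0 && ge > le)
		return false;

	return true;
}
```

For 1.1.1.0/24 this rejects le 23 and accepts le 24 and le 25, matching the three commands in the commit message.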
6 years agoMerge pull request #1448 from qlyoung/fix-peer-group-name
Lou Berger [Tue, 28 Nov 2017 19:37:48 +0000 (03:37 +0800)]
Merge pull request #1448 from qlyoung/fix-peer-group-name

bgpd: fix `show bgp peer-group NAME`

6 years agoMerge pull request #1491 from qlyoung/vrf-cli-fix
Lou Berger [Tue, 28 Nov 2017 19:32:14 +0000 (03:32 +0800)]
Merge pull request #1491 from qlyoung/vrf-cli-fix

bgpd: fix some vrf related cli

6 years agozebra: add back support for nUlL0
Quentin Young [Wed, 22 Nov 2017 20:45:41 +0000 (15:45 -0500)]
zebra: add back support for nUlL0

Re-add support for typos when specifying a null route.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years agobgpd: fix some vrf related cli
Quentin Young [Tue, 28 Nov 2017 17:46:58 +0000 (12:46 -0500)]
bgpd: fix some vrf related cli

argv_find() searching for wrong thing

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
6 years agoMerge pull request #1445 from donaldsharp/rpki_vtysh
Renato Westphal [Tue, 28 Nov 2017 16:44:39 +0000 (14:44 -0200)]
Merge pull request #1445 from donaldsharp/rpki_vtysh

vtysh: If RPKI is not compiled in don't let vtysh think it is.

6 years agoMerge pull request #1438 from donaldsharp/zapi_notify_install
Renato Westphal [Mon, 27 Nov 2017 22:31:20 +0000 (20:31 -0200)]
Merge pull request #1438 from donaldsharp/zapi_notify_install

Zapi notify install

6 years agoMerge pull request #1360 from donaldsharp/show_advertised_routes
Renato Westphal [Mon, 27 Nov 2017 22:14:15 +0000 (20:14 -0200)]
Merge pull request #1360 from donaldsharp/show_advertised_routes

Show advertised routes

6 years agozebra: Allow zebra_find_client to match on instance as well
Donald Sharp [Mon, 27 Nov 2017 14:25:32 +0000 (09:25 -0500)]
zebra: Allow zebra_find_client to match on instance as well

zebra_find_client needs to match on instance as well so
protocols like ospfd will work correctly for notification.

Modify the zebra_find_client code to accept the instance
number and to pass it in appropriately.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
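
Matching on both protocol and instance, as the zebra_find_client commit above describes, might look like the lookup below. The struct layout and the name find_client are hypothetical; the real function operates on zebra's own client list.

```c
#include <stddef.h>

/* Hypothetical client record; not zebra's struct zserv. */
struct client {
	int proto;               /* routing protocol identifier */
	unsigned short instance; /* protocol instance number */
};

/* Return the first client whose protocol AND instance both match,
 * or NULL if there is none. */
static struct client *find_client(struct client *clients, size_t count,
				  int proto, unsigned short instance)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (clients[i].proto == proto
		    && clients[i].instance == instance)
			return &clients[i];
	}
	return NULL;
}
```

Matching on protocol alone would return whichever instance happens to come first, so a notification could reach the wrong daemon instance when several run side by side.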
6 years agosharpd: Add Super Happy Advanced Routing Protocol
Donald Sharp [Fri, 10 Nov 2017 17:55:16 +0000 (12:55 -0500)]
sharpd: Add Super Happy Advanced Routing Protocol

Add a daemon that will allow us to test the zapi
as well as test route install/removal times from
the kernel.

The current commands are:

install route <starting ip address> nexthop <nexthop> (1-1000000)

This command installs (1-1000000) routes, starting at
<starting ip address>/32 and auto-incrementing the address
by 1 for each route.  The installation start and finish
times are noted in the log.

remove routes <starting ip address> (1-1000000)

This command removes (1-1000000) routes created by the
install route command, starting at <starting ip address>/32.

This code can be considered experimental and *is not*
something that should be run in a production environment.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
6 years agoeigrpd: Start conversion to use route install failure callback
Donald Sharp [Fri, 10 Nov 2017 01:46:11 +0000 (20:46 -0500)]
eigrpd: Start conversion to use route install failure callback

EIGRP must not advertise routes that have failed to install.
This commit turns on the notification for EIGRP.  We still
need to start handling this correctly.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
6 years agozebra: Add notification for Route Install events
Donald Sharp [Thu, 9 Nov 2017 19:42:50 +0000 (14:42 -0500)]
zebra: Add notification for Route Install events

When we are installing into the kernel, note the
change points for notification to a higher level
protocol and make it happen.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>