git.proxmox.com Git - mirror_kronosnet.git/log
4 years ago Merge pull request #285 from kronosnet/kill-latency master
Fabio M. Di Nitto [Sun, 1 Mar 2020 06:54:20 +0000 (07:54 +0100)]
Merge pull request #285 from kronosnet/kill-latency

[links] kill redundant latency in link status and move it to stats

4 years ago [links] kill redundant latency in link status and move it to stats
Fabio M. Di Nitto [Mon, 3 Feb 2020 09:40:05 +0000 (10:40 +0100)]
[links] kill redundant latency in link status and move it to stats

this is an ABI breakage. Soname change was already done a while back for master

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #286 from kronosnet/pmtud-run
Fabio M. Di Nitto [Fri, 28 Feb 2020 10:26:51 +0000 (11:26 +0100)]
Merge pull request #286 from kronosnet/pmtud-run

Pmtud run

4 years ago [stats] allow knet_link_get_status to operate in readlock context
Fabio M. Di Nitto [Thu, 20 Feb 2020 08:04:59 +0000 (09:04 +0100)]
[stats] allow knet_link_get_status to operate in readlock context

- add per link stats mutex
- use per link stats mutex across the board

note: some threads need to lock for a slightly longer period of time than
strictly necessary to avoid reverse-order locking with other mutexes.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [stats] allow knet_handle_get_stats to operate in a readlock context
Fabio M. Di Nitto [Wed, 5 Feb 2020 10:04:38 +0000 (11:04 +0100)]
[stats] allow knet_handle_get_stats to operate in a readlock context

- add global stat mutex lock to protect stats updates
- use global stat mutex lock across all the threads
- fix up some minor bugs:
  - update RX crypto stats only when crypto is enabled
  - update compress and crypto stats in a consistent fashion

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [rx] kill unused variable
Fabio M. Di Nitto [Wed, 5 Feb 2020 09:26:49 +0000 (10:26 +0100)]
[rx] kill unused variable

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #291 from kronosnet/udp-fixes
Fabio M. Di Nitto [Fri, 28 Feb 2020 05:46:32 +0000 (06:46 +0100)]
Merge pull request #291 from kronosnet/udp-fixes

Udp fixes

4 years ago [tests] rework test suite link port allocation
Fabio M. Di Nitto [Thu, 27 Feb 2020 14:12:30 +0000 (15:12 +0100)]
[tests] rework test suite link port allocation

Logic is to try to configure a link with port X and if it fails, try the next
port. This avoids port collisions between services and knet test suite.
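
The "try port X, then the next one" logic can be sketched as below. This is a minimal illustration, not the code in test-common.c: the helper name bind_first_free_port, the use of a plain UDP socket on 127.0.0.1 and the `tries` limit are all assumptions made for the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a UDP socket on 127.0.0.1, starting at
 * first_port and moving to the next port each time bind() fails.
 * Returns the bound fd (and the port via *bound_port), or -1 if none
 * of the `tries` candidate ports could be bound.
 */
int bind_first_free_port(uint16_t first_port, unsigned int tries,
			 uint16_t *bound_port)
{
	unsigned int i;

	for (i = 0; i < tries && (unsigned int)first_port + i <= 65535; i++) {
		struct sockaddr_in addr;
		int fd = socket(AF_INET, SOCK_DGRAM, 0);

		if (fd < 0) {
			return -1;
		}

		memset(&addr, 0, sizeof(addr));
		addr.sin_family = AF_INET;
		addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
		addr.sin_port = htons((uint16_t)(first_port + i));

		if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
			*bound_port = (uint16_t)(first_port + i);
			return fd;
		}

		/* port is taken (or otherwise unusable): try the next one */
		close(fd);
	}

	return -1;
}
```

Calling it twice with the same starting port while the first socket is still open yields two different ports, which is the collision-avoidance behaviour described above.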

Please note that the implementation in test-common.c is NOT super clean.
There is still some redundant code in there that is left on purpose.

There is another branch, not yet merged, that implements a functional testing
framework that makes heavy use of those functions.

We will clean test-common.c as we port the functional testing branch and make
it ready for merging.

For now, this is good enough to have a more stable test suite.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [transports] use SO_REUSEADDR only for sctp
Fabio M. Di Nitto [Tue, 25 Feb 2020 07:18:53 +0000 (08:18 +0100)]
[transports] use SO_REUSEADDR only for sctp

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #290 from jfriesse/fix-prio-description2
Fabio M. Di Nitto [Wed, 26 Feb 2020 05:15:04 +0000 (06:15 +0100)]
Merge pull request #290 from jfriesse/fix-prio-description2

Enhance prio description of POLICY_PASSIVE

4 years ago [man] Enhance prio description of POLICY_PASSIVE
Jan Friesse [Tue, 25 Feb 2020 14:09:19 +0000 (15:09 +0100)]
[man] Enhance prio description of POLICY_PASSIVE

Some users found the description of the POLICY_PASSIVE priority confusing
(probably because the word "priority" is too overloaded), so add
some redundancy to make the description unambiguous.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
4 years ago Merge pull request #244 from kronosnet/doxycov
Fabio M. Di Nitto [Fri, 21 Feb 2020 05:38:13 +0000 (06:38 +0100)]
Merge pull request #244 from kronosnet/doxycov

man: Fix covscan reports in doxyxml.c

4 years ago man: Change strcat to strncat
Christine Caulfield [Thu, 8 Aug 2019 12:26:54 +0000 (13:26 +0100)]
man: Change strcat to strncat

Oddly, covscan doesn't complain about the use of strcat, but
I'm going to pre-empt it, just in case it decides to.

4 years ago man: Fix covscan reports in doxyxml.c
Christine Caulfield [Mon, 5 Aug 2019 11:41:06 +0000 (12:41 +0100)]
man: Fix covscan reports in doxyxml.c

This fixes most of the remaining covscan errors in doxyxml.c.
The ones that remain are caused by malloced structures being
stored in qb_hashtable_maps.

These still cause unfreed memory, because the contents of the maps
are never explicitly freed, but as they are used until the very end of
the program (when the OS will free everything) I'm dubious as to
whether it's worth doing it in the code - or whether covscan will
work out what's going on anyway.

4 years ago Merge pull request #284 from kronosnet/sctp-fixes
Fabio M. Di Nitto [Fri, 31 Jan 2020 08:51:49 +0000 (09:51 +0100)]
Merge pull request #284 from kronosnet/sctp-fixes

Sctp fixes

4 years ago [global] Update copyright across the board
Fabio M. Di Nitto [Thu, 30 Jan 2020 11:40:31 +0000 (12:40 +0100)]
[global] Update copyright across the board

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [sctp] major surgery to use only SCTP events to determine socket status
Fabio M. Di Nitto [Thu, 30 Jan 2020 11:36:18 +0000 (12:36 +0100)]
[sctp] major surgery to use only SCTP events to determine socket status

- drop concept of on_connected_epoll to determine if socket is ready or not
- provide much better debugging output at all levels
- incorporate fix from Xin Long <lxin@redhat.com> to gather socket status
  at the right time
- deal with a recent kernel change on SCTP socket that broke knet (from rhel7):
  [net] sctp: allow delivering notifications after receiving SHUTDOWN

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [rx] use defines to determine RX data types vs random numbers
Fabio M. Di Nitto [Fri, 24 Jan 2020 04:08:26 +0000 (05:08 +0100)]
[rx] use defines to determine RX data types vs random numbers

also extend a bit to make ready for SCTP extra return codes

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [tx] Don't Clear out msghdr for all transports.
Christine Caulfield [Wed, 22 Jan 2020 15:15:49 +0000 (15:15 +0000)]
[tx] Don't Clear out msghdr for all transports.

When sending a message to multiple links, if one of those links
is not connection-oriented then msg_name & msg_namelen would be cleared,
thus breaking the send to any subsequent non-connection-oriented links.

So now, if we need to clear out msg_name & msg_namelen, we take a copy of the
msghdr and edit that instead.
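
The copy-then-edit idea can be sketched as follows. This is a minimal illustration rather than the libknet code: hdr_for_link and its clear_name flag are hypothetical names, and the decision of when the address has to be dropped is left to the caller.

```c
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return the msghdr to hand to sendmsg() for one link.
 * When the destination address must not be passed, the caller's header is
 * copied into *scratch and only the copy is edited, so the original keeps
 * msg_name/msg_namelen for the links that still need them.
 */
const struct msghdr *hdr_for_link(const struct msghdr *orig,
				  struct msghdr *scratch,
				  int clear_name)
{
	if (!clear_name) {
		return orig;
	}

	*scratch = *orig;		/* shallow copy: the iovecs are shared */
	scratch->msg_name = NULL;
	scratch->msg_namelen = 0;
	return scratch;
}
```

Because only the scratch copy is modified, the order in which links are visited no longer matters.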

4 years ago [rx] Don't return 512 EOF messages from _recvmmsg
Christine Caulfield [Thu, 16 Jan 2020 09:18:19 +0000 (09:18 +0000)]
[rx] Don't return 512 EOF messages from _recvmmsg

If recvmsg() returns 0 for EOF then it's going to do so
until the error is rectified or read with getsockopt(). But
the _recvmmsg() wrapper keeps reading until the vector is full
thus returning a block of 512 EOF messages all of which the caller
has to plough through.

This patch causes _recvmmsg() to return as soon as it has got
the first EOF so the caller can deal with it in good time
and not spin looking at the same thing over and over again.
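
A minimal sketch of the early-return behaviour, assuming a simplified stand-in for the wrapper: recv_batch, its flat buffer layout and the lens[] array are hypothetical and do not mirror the real _recvmmsg() signature.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for a recvmmsg()-style wrapper: fill up to `max`
 * buffers of `buf_len` bytes each from `fd`, recording each length in
 * lens[]. Returns the number of entries filled, or -1 if the very first
 * read fails. A 0-byte read (EOF) is recorded once and ends the batch.
 */
int recv_batch(int fd, char *bufs, size_t buf_len, ssize_t *lens, int max)
{
	int n = 0;

	while (n < max) {
		ssize_t r = recv(fd, bufs + (size_t)n * buf_len, buf_len,
				 MSG_DONTWAIT);

		if (r < 0) {
			/* nothing more to read right now (or an error) */
			return n > 0 ? n : -1;
		}

		lens[n++] = r;

		if (r == 0) {
			/* EOF would repeat forever: report it once and stop */
			break;
		}
	}

	return n;
}
```

With one pending message followed by a closed peer, the caller gets two entries (the data and a single EOF) instead of a vector padded with EOFs.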

I've also fixed a couple of typos in related comments

4 years ago [rx] send reply packets only when transport is connected
Fabio M. Di Nitto [Fri, 31 Jan 2020 05:28:56 +0000 (06:28 +0100)]
[rx] send reply packets only when transport is connected

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #283 from kronosnet/latency-fix
Fabio M. Di Nitto [Thu, 30 Jan 2020 14:53:13 +0000 (15:53 +0100)]
Merge pull request #283 from kronosnet/latency-fix

[rx] unify latency values to a capped value to link precision

4 years ago [rx] unify latency values to a capped value to link precision
Fabio M. Di Nitto [Thu, 30 Jan 2020 14:23:39 +0000 (15:23 +0100)]
[rx] unify latency values to a capped value to link precision

keep the patch simple to avoid API/ABI breakage for now for easy backporting

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #281 from kronosnet/latency-fix
Fabio M. Di Nitto [Thu, 30 Jan 2020 11:30:46 +0000 (12:30 +0100)]
Merge pull request #281 from kronosnet/latency-fix

[latency] fix incorrect math that could lead to bad latency calculation

4 years ago [latency] fix incorrect math that could lead to bad latency calculation
Fabio M. Di Nitto [Wed, 29 Jan 2020 15:02:46 +0000 (16:02 +0100)]
[latency] fix incorrect math that could lead to bad latency calculation

Also, document a bit better how latency is calculated

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #282 from kronosnet/better-eunreach-patch
Fabio M. Di Nitto [Thu, 30 Jan 2020 05:33:18 +0000 (06:33 +0100)]
Merge pull request #282 from kronosnet/better-eunreach-patch

[udp] Better fix for -ENETUNREACH

4 years ago [udp] Better fix for -ENETUNREACH
Christine Caulfield [Wed, 29 Jan 2020 15:52:26 +0000 (15:52 +0000)]
[udp] Better fix for -ENETUNREACH

This fix for the ENETUNREACH problem works better than the last one
in that it also works with Linux kernels > 5.0.0 (which return
-ENETUNREACH) if an interface is brought down, and also on FreeBSD,
which returns ENETDOWN.
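
One way to picture the handling is an errno classifier used after a failed send. This is a sketch under assumptions, not the actual UDP transport code: the enum, the function name and the choice of EAGAIN/EINTR as the retryable cases are all hypothetical.

```c
#include <errno.h>

enum tx_action {
	TX_RETRY,	/* transient: try the send again */
	TX_SKIP_LINK,	/* network/interface gone: stop retrying on this link */
	TX_FAIL		/* anything else: report the error */
};

/* Hypothetical classifier for the errno left by a failed sendto()/sendmsg() */
enum tx_action classify_tx_errno(int err)
{
	switch (err) {
	case EAGAIN:
	case EINTR:
		return TX_RETRY;
	case ENETUNREACH:	/* Linux, when the interface is down */
	case ENETDOWN:		/* FreeBSD reports this instead */
		return TX_SKIP_LINK;
	default:
		return TX_FAIL;
	}
}
```

The point of the separate TX_SKIP_LINK case is that the TX thread moves on instead of looping on the same send.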

4 years ago Merge pull request #279 from kronosnet/nitpick
Fabio M. Di Nitto [Mon, 27 Jan 2020 09:08:37 +0000 (10:08 +0100)]
Merge pull request #279 from kronosnet/nitpick

[udp] simplify code (same logic)

4 years ago [udp] simplify code (same logic)
Fabio M. Di Nitto [Sat, 25 Jan 2020 05:26:28 +0000 (06:26 +0100)]
[udp] simplify code (same logic)

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #278 from kronosnet/dont-spin-enetunreach
Fabio M. Di Nitto [Fri, 24 Jan 2020 12:17:55 +0000 (13:17 +0100)]
Merge pull request #278 from kronosnet/dont-spin-enetunreach

[udp] don't make socket spin if a network I/F is down

4 years ago [udp] don't make socket spin if a network I/F is down
Christine Caulfield [Fri, 24 Jan 2020 09:33:50 +0000 (09:33 +0000)]
[udp] don't make socket spin if a network I/F is down

UDP treats ENETUNREACH as a temporary error and just retries,
but this causes the TX thread to spin just doing sendto() therefore
blocking all other traffic.

(To reproduce this try starting corosync with 2 links configured in
corosync.conf but only one of them configured to the 'right' address
- it will spin in a tight loop and need to be killed with -9)

SCTP does not seem to suffer from this.

4 years ago Merge pull request #277 from kronosnet/gcc10
Fabio M. Di Nitto [Wed, 22 Jan 2020 07:48:49 +0000 (08:48 +0100)]
Merge pull request #277 from kronosnet/gcc10

Fix errors detected by gcc10

4 years ago [nozzle] use interface name size consistently and drop strncpy in favour of memmove
Fabio M. Di Nitto [Wed, 22 Jan 2020 04:39:17 +0000 (05:39 +0100)]
[nozzle] use interface name size consistently and drop strncpy in favour of memmove

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [host] use KNET_MAX_HOST_LEN consistently
Fabio M. Di Nitto [Wed, 22 Jan 2020 04:17:39 +0000 (05:17 +0100)]
[host] use KNET_MAX_HOST_LEN consistently

detected by gcc10

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #274 from kronosnet/ppc-clang
Fabio M. Di Nitto [Wed, 20 Nov 2019 10:42:51 +0000 (11:42 +0100)]
Merge pull request #274 from kronosnet/ppc-clang

[tests] mark array as static

4 years ago [tests] mark array as static
Fabio M. Di Nitto [Wed, 20 Nov 2019 08:47:39 +0000 (09:47 +0100)]
[tests] mark array as static

fixes an odd segfault when running the test on ppc when built with clang

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #273 from kronosnet/wferi-patch-1
Fabio M. Di Nitto [Mon, 4 Nov 2019 07:35:16 +0000 (08:35 +0100)]
Merge pull request #273 from kronosnet/wferi-patch-1

[handle] fix typo in error log message

4 years ago [handle] fix typo in error log message
wferi [Sun, 3 Nov 2019 08:22:38 +0000 (09:22 +0100)]
[handle] fix typo in error log message

4 years ago Merge pull request #272 from kronosnet/cov-scan-errors
Fabio M. Di Nitto [Tue, 29 Oct 2019 13:25:03 +0000 (14:25 +0100)]
Merge pull request #272 from kronosnet/cov-scan-errors

[handle] make sure to unlock config handle on failure

4 years ago [handle] make sure to unlock config handle on failure
Fabio M. Di Nitto [Tue, 29 Oct 2019 12:20:55 +0000 (13:20 +0100)]
[handle] make sure to unlock config handle on failure

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #270 from kronosnet/bsd-build-fix
Fabio M. Di Nitto [Sun, 27 Oct 2019 14:08:00 +0000 (15:08 +0100)]
Merge pull request #270 from kronosnet/bsd-build-fix

[build] fix openssl version detection when not using pkg-config

4 years ago [build] fix openssl version detection when not using pkg-config
Fabio M. Di Nitto [Sun, 27 Oct 2019 05:42:54 +0000 (06:42 +0100)]
[build] fix openssl version detection when not using pkg-config

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #268 from kronosnet/netload-fixes
Fabio M. Di Nitto [Wed, 23 Oct 2019 13:40:27 +0000 (15:40 +0200)]
Merge pull request #268 from kronosnet/netload-fixes

Netload fixes

4 years ago [RX] silence defrag buffer expiration debug error
Fabio M. Di Nitto [Sat, 19 Oct 2019 07:05:16 +0000 (09:05 +0200)]
[RX] silence defrag buffer expiration debug error

when using active-active links, it is simply too noisy and
doesn't provide very useful information.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [RX] handle short write to the application properly
Fabio M. Di Nitto [Sat, 19 Oct 2019 06:47:27 +0000 (08:47 +0200)]
[RX] handle short write to the application properly

this change affects only applications that are not using knet
generated socketpairs to deliver/receive data to/from knet.

If an application uses a fd that is not SOCK_SEQPACKET (basically
streaming), we have to handle short writes accordingly, and knet
will continue delivering as long as there is progress.

The application is responsible for verifying that the data packet
is complete, as the delivery is not guaranteed to be complete.
The application can either embed the size of the packet in their
data structure or use the socket error notification callback
that will be invoked in case of errors or 0 data delivery.
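
The "continue as long as there is progress" rule for streaming fds can be sketched like this. It is an illustration only: write_with_progress is a hypothetical helper, not the function libknet uses, and it reports partial delivery through its return value rather than through a notification callback.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes to a streaming fd, continuing
 * after short writes for as long as each call makes progress.
 * Returns the number of bytes actually delivered (possibly < len),
 * or -1 if nothing could be written because of an error.
 */
ssize_t write_with_progress(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t done = 0;

	while (done < len) {
		ssize_t w = write(fd, p + done, len - done);

		if (w < 0 && errno == EINTR) {
			continue;	/* interrupted before writing: retry */
		}
		if (w <= 0) {
			break;		/* error or no progress: stop here */
		}
		done += (size_t)w;
	}

	if (done == 0 && len > 0) {
		return -1;
	}
	return (ssize_t)done;
}
```

A return value smaller than len is exactly the incomplete-delivery case the receiving application has to detect, for example via a length field embedded in its own data.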

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [TX] discard too big packets when reading from socketpairs
Fabio M. Di Nitto [Fri, 18 Oct 2019 09:17:57 +0000 (11:17 +0200)]
[TX] discard too big packets when reading from socketpairs

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [RX] Discard incoming packets if knet cannot reply back.
Fabio M. Di Nitto [Fri, 18 Oct 2019 08:41:49 +0000 (10:41 +0200)]
[RX] Discard incoming packets if knet cannot reply back.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #266 from kronosnet/wferi/newline
Fabio M. Di Nitto [Fri, 18 Oct 2019 08:20:32 +0000 (10:20 +0200)]
Merge pull request #266 from kronosnet/wferi/newline

[test] append newline to knet_send timeout message

4 years ago [test] append newline to knet_send timeout message
Ferenc Wágner [Fri, 18 Oct 2019 06:38:04 +0000 (08:38 +0200)]
[test] append newline to knet_send timeout message

4 years ago Merge pull request #265 from kronosnet/netload-fixes
Fabio M. Di Nitto [Wed, 16 Oct 2019 07:04:08 +0000 (09:04 +0200)]
Merge pull request #265 from kronosnet/netload-fixes

[test] add packet verification option to knet_bench

4 years ago [test] add packet verification option to knet_bench
Fabio M. Di Nitto [Wed, 16 Oct 2019 06:10:23 +0000 (08:10 +0200)]
[test] add packet verification option to knet_bench

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #264 from kronosnet/netload-fixes
Fabio M. Di Nitto [Tue, 15 Oct 2019 14:17:18 +0000 (16:17 +0200)]
Merge pull request #264 from kronosnet/netload-fixes

Netload fixes

4 years ago [PMTUd] invalidate MTU for a link if the value is lower than minimum
Fabio M. Di Nitto [Tue, 15 Oct 2019 09:53:56 +0000 (11:53 +0200)]
[PMTUd] invalidate MTU for a link if the value is lower than minimum

Under heavy network load and packet loss, the calculated MTU can be
too small. In that case we need to invalidate the link MTU, which
removes the link from the rotation (and traffic) and
gives PMTUd time to get the right MTU in the next round.
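
A minimal sketch of the invalidation rule, with hypothetical names throughout. The floor of 576 bytes is an assumption chosen for the example (it is the datagram size every IPv4 host must accept); the real minimum used by libknet may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed floor for this sketch only */
#define SKETCH_MIN_MTU 576u

struct sketch_link {
	uint32_t mtu;		/* 0 means "no valid MTU yet" */
	bool has_valid_mtu;	/* links without a valid MTU carry no traffic */
};

/* Record the result of one PMTUd run; reject values below the floor */
void sketch_set_link_mtu(struct sketch_link *l, uint32_t discovered)
{
	if (discovered < SKETCH_MIN_MTU) {
		l->mtu = 0;
		l->has_valid_mtu = false;	/* wait for the next PMTUd round */
		return;
	}

	l->mtu = discovered;
	l->has_valid_mtu = true;
}
```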

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [test] add ability to knet_bench to specify a fixed packet size for perf test
Fabio M. Di Nitto [Tue, 15 Oct 2019 05:16:22 +0000 (07:16 +0200)]
[test] add ability to knet_bench to specify a fixed packet size for perf test

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [rx] copy data into the defrag buffer only if we know the size of the frame
Fabio M. Di Nitto [Tue, 15 Oct 2019 05:02:05 +0000 (07:02 +0200)]
[rx] copy data into the defrag buffer only if we know the size of the frame

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [host] fix defrag buffers reclaim logic
Fabio M. Di Nitto [Tue, 15 Oct 2019 04:53:24 +0000 (06:53 +0200)]
[host] fix defrag buffers reclaim logic

The problem:

- let's assume a 2 nodes (A and B) cluster setup
- node A sends fragmented packets to node B and there is
  packet loss on the network.
- node B receives all those fragments and attempts to
  reassemble them.
- node A sends packet seq_num X in Y fragments.
- node B receives only part of the fragments and stores
  them in a defrag buf.
- packet loss stops.
- node A continues to send packets and a seq_num
  roll-over takes place.
- node A sends a new packet seq_num X in Y fragments.
- node B gets confused here because the parts of the old
  packet seq_num X are still stored and the buffer
  has not been reclaimed.
- node B continues to rebuild packet seq_num X with
  old stale data and new data from after the roll-over.
- node B completes reassembling the packet and delivers
  junk to the application.

The solution:

Add a much stronger buffer reclaim logic that will apply
on each received packet and not only when defrag buffers
are needed, as there might be a mix of fragmented and not
fragmented packets in-flight.

The new logic creates a window of N packets that can be
handled at the same time (based on the number of buffers)
and clears everything else.
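
The window idea can be sketched with wrap-around arithmetic on sequence numbers. This is a simplified illustration, assuming 16-bit sequence numbers; the struct and function names are hypothetical and the real defrag buffers hold more state.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: is `seq` one of the `window` most recent sequence
 * numbers ending at `newest`? The subtraction is truncated to 16 bits,
 * so the distance stays correct across a roll-over.
 */
bool seq_in_window(uint16_t newest, uint16_t seq, uint16_t window)
{
	uint16_t behind = (uint16_t)(newest - seq);

	return behind < window;
}

struct sketch_defrag_buf {
	bool in_use;
	uint16_t seq;	/* sequence number being reassembled */
};

/* Run on every received packet: drop buffers that fell out of the window */
unsigned int reclaim_stale(struct sketch_defrag_buf *bufs, unsigned int count,
			   uint16_t newest, uint16_t window)
{
	unsigned int i, freed = 0;

	for (i = 0; i < count; i++) {
		if (bufs[i].in_use &&
		    !seq_in_window(newest, bufs[i].seq, window)) {
			bufs[i].in_use = false;
			freed++;
		}
	}

	return freed;
}
```

Since stale buffers are reclaimed as soon as they leave the window, a partially received packet cannot survive until its sequence number comes around again.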

Fixes https://github.com/kronosnet/kronosnet/issues/261

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [host] rename variables to make it easier to read the code
Fabio M. Di Nitto [Tue, 15 Oct 2019 04:46:36 +0000 (06:46 +0200)]
[host] rename variables to make it easier to read the code

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #263 from kronosnet/runtime-debug
Fabio M. Di Nitto [Wed, 9 Oct 2019 10:45:56 +0000 (12:45 +0200)]
Merge pull request #263 from kronosnet/runtime-debug

[build] add --with-sanitizers= option for sanitizer builds

4 years ago [build] add --with-sanitizers= option for sanitizer builds
Fabio M. Di Nitto [Wed, 9 Oct 2019 08:28:14 +0000 (10:28 +0200)]
[build] add --with-sanitizers= option for sanitizer builds

this option is strictly meant for runtime debugging purposes.
do NOT use in production.

check gcc/clang man pages on how to use ASAN/UBSAN/TSAN.

Also allow users to specify SANITIZERS_CFLAGS and SANITIZERS_LDFLAGS
for advanced use.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #262 from ThomasLamprecht/fix-doxyxml-overflow
Fabio M. Di Nitto [Wed, 9 Oct 2019 05:35:55 +0000 (07:35 +0200)]
Merge pull request #262 from ThomasLamprecht/fix-doxyxml-overflow

doxyxml: print_param: fix heap-buffer-overflow on read

4 years ago doxyxml: print_param: fix heap-buffer-overflow on read
Thomas Lamprecht [Tue, 8 Oct 2019 15:09:07 +0000 (17:09 +0200)]
doxyxml: print_param: fix heap-buffer-overflow on read

In read_struct, pi->paramtype can be assigned with:
> pi->paramtype = type?strdup(type):strdup("");

In print_param we then always check the last character by taking
the strlen and subtracting one. But when either type was
NULL and we assigned an empty string, or type wasn't NULL but
pointed to an empty string, we ran into a heap-buffer-overflow on read:
strlen is zero here, so the first if branch evaluated to
> if (pi->paramtype[-1] == '*') {
which isn't valid. Depending on the OS, and on how the OS or the
compiler protects the surrounding memory, this can crash the program.

The next check, for double pointers, had a similar issue,
here for all strings with strlen < 2.

To solve this, get the strlen early and check that we cannot underflow
before doing the real read.
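
The guarded check can be sketched as below. This is an illustration in the spirit of the fix, not the code in doxyxml.c: trailing_stars is a hypothetical helper that only reports how many '*' characters (at most two) end the type string.

```c
#include <string.h>

/*
 * Hypothetical helper: count how many trailing '*' characters (at most
 * two) a type string ends with, checking the length before indexing
 * from the end so that short or empty strings never underflow.
 */
int trailing_stars(const char *type)
{
	size_t len = type ? strlen(type) : 0;

	if (len >= 2 && type[len - 1] == '*' && type[len - 2] == '*') {
		return 2;
	}
	if (len >= 1 && type[len - 1] == '*') {
		return 1;
	}
	return 0;
}
```

The length tests come first in each condition, so the indexed reads are only evaluated when they are in bounds.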

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
4 years ago Merge pull request #260 from kronosnet/test-suite
Fabio M. Di Nitto [Thu, 26 Sep 2019 10:17:36 +0000 (12:17 +0200)]
Merge pull request #260 from kronosnet/test-suite

[tests] add common function to sleep based on how the test suite is r…

4 years ago [tests] add common function to sleep based on how the test suite is running
Fabio M. Di Nitto [Thu, 26 Sep 2019 05:18:46 +0000 (07:18 +0200)]
[tests] add common function to sleep based on how the test suite is running

Address issue while waiting for host to be up and PMTUd first run.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #258 from kronosnet/wferi/fixes
Fabio M. Di Nitto [Wed, 25 Sep 2019 07:21:59 +0000 (09:21 +0200)]
Merge pull request #258 from kronosnet/wferi/fixes

Assorted small fixups

4 years ago Fix typo: trasport -> transport
Ferenc Wágner [Wed, 29 May 2019 09:42:08 +0000 (11:42 +0200)]
Fix typo: trasport -> transport

Signed-off-by: Ferenc Wágner <wferi@debian.org>
4 years ago tests: skip the SCTP test if SCTP is not supported by the kernel
Ferenc Wágner [Wed, 3 Apr 2019 08:26:11 +0000 (10:26 +0200)]
tests: skip the SCTP test if SCTP is not supported by the kernel

For example, module loading is disabled on Debian build daemons.
(In the vein of c5aa1c3343703455b480cef5c173f471e1bb020f.)

Signed-off-by: Ferenc Wágner <wferi@debian.org>
4 years ago Merge pull request #257 from kronosnet/netload-fixes
Fabio M. Di Nitto [Thu, 19 Sep 2019 11:32:02 +0000 (13:32 +0200)]
Merge pull request #257 from kronosnet/netload-fixes

[links] fix memory corryption of link structure

4 years ago [links] fix memory corryption of link structure
Fabio M. Di Nitto [Thu, 19 Sep 2019 07:02:44 +0000 (09:02 +0200)]
[links] fix memory corryption of link structure

the index would overflow the buffer and overwrite data in the link
structure. Depending on what was written, the cluster could fall
apart in many ways, from crashes to hangs.

Fixes: https://github.com/kronosnet/kronosnet/issues/255
thanks to the proxmox developers and community for reporting the issue
and for all the help reproducing / debugging the problem.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #254 from kronosnet/test-fixes
Fabio M. Di Nitto [Fri, 13 Sep 2019 11:09:00 +0000 (13:09 +0200)]
Merge pull request #254 from kronosnet/test-fixes

Test fixes

4 years ago [tests] give PMTUd more time to redetect MTU
Fabio M. Di Nitto [Fri, 13 Sep 2019 05:30:06 +0000 (07:30 +0200)]
[tests] give PMTUd more time to redetect MTU

The ideal fix would be to use the PMTUd callback, but that requires a lot of
extra test infrastructure. For now just work around the problem.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [tests] fix ip generation boundaries
Fabio M. Di Nitto [Fri, 13 Sep 2019 05:28:55 +0000 (07:28 +0200)]
[tests] fix ip generation boundaries

https://ci.kronosnet.org/job/knet-build-all-voting/1450/knet-build-all-voting=rhel80z-s390x/console

and similar: when pid = 255, the secondary IP would hit 256, which is of course invalid.

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #253 from kronosnet/bsd-fixes
Fabio M. Di Nitto [Thu, 12 Sep 2019 15:55:03 +0000 (17:55 +0200)]
Merge pull request #253 from kronosnet/bsd-fixes

[nozzle] fix tapX range on newer FreeBSD

4 years ago [nozzle] fix tapX range on newer FreeBSD
Fabio M. Di Nitto [Thu, 12 Sep 2019 04:01:38 +0000 (06:01 +0200)]
[nozzle] fix tapX range on newer FreeBSD

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #252 from kronosnet/lock-fix
Fabio M. Di Nitto [Tue, 10 Sep 2019 07:19:43 +0000 (09:19 +0200)]
Merge pull request #252 from kronosnet/lock-fix

[pmtud] switch to use async version of dstcache update due to locking…

4 years ago [pmtud] switch to use async version of dstcache update due to locking context (read...
Fabio M. Di Nitto [Mon, 9 Sep 2019 13:11:25 +0000 (15:11 +0200)]
[pmtud] switch to use async version of dstcache update due to locking context (read vs write)

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #251 from kronosnet/latency-fixes
Fabio M. Di Nitto [Mon, 9 Sep 2019 03:35:59 +0000 (05:35 +0200)]
Merge pull request #251 from kronosnet/latency-fixes

[links] stabilize latency calculation when nodes are not responsive

4 years ago [links] stabilize latency calculation when nodes are not responsive
Fabio M. Di Nitto [Fri, 6 Sep 2019 05:05:19 +0000 (07:05 +0200)]
[links] stabilize latency calculation when nodes are not responsive

The following scenario is more of a corner case than the norm, but
this change allows knet to deal better with this situation:

1) 2 nodes cluster (corosync) (node A and node B)
2) kill -stop $(pidof corosync) on node A
3) node B will continue to send ping packets to node A
4) node A is accumulating those ping packets in the kernel network socket
5) wait some seconds and unpause node A
6) node A will start processing the ping packets in the queue
   and send pong replies to node B
7) node B will see an extreme increase of latency due
   to those "obsoleted" ping/pong packets
8) node B, as latency increases, will take longer and longer
   to notice that node A is down due to the pong_timeout adjustment
   for latency (required for initial cluster spike).

the solution:

1) Use average latency to calculate pong_timeout_adj vs latency_max.
   Average latency will go down again in time, while latency_max is never
   reset.

2) RX thread will filter out all pong packets that have higher latency
   than the currently configured pong_timeout. This barrier should have
   been in place even before.

this solution reduces the latency spike on node B to a perfectly
reasonable level and it will all eventually stabilize over time
as latency samples accumulate and latency decreases.

Please be aware that using a pong_timeout smaller than latency will
simply mark the link down now.
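
The two parts of the solution can be sketched together. Everything here is an assumption made for illustration: the struct and function names are hypothetical, a plain cumulative mean stands in for whatever averaging libknet really uses, and "timeout plus average" is only one possible way to derive an adjusted timeout.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_lat {
	uint64_t avg_us;	/* running average latency */
	uint64_t samples;	/* number of accepted samples */
};

/*
 * Feed one measured ping/pong round trip. Returns false (sample dropped)
 * when the pong took longer than pong_timeout_us.
 */
bool sketch_lat_update(struct sketch_lat *l, uint64_t sample_us,
		       uint64_t pong_timeout_us)
{
	if (sample_us > pong_timeout_us) {
		return false;	/* stale pong: keep it out of the average */
	}

	l->avg_us = (l->avg_us * l->samples + sample_us) / (l->samples + 1);
	l->samples++;
	return true;
}

/* Adjusted timeout derived from the average, not the all-time maximum */
uint64_t sketch_pong_timeout_adj(const struct sketch_lat *l,
				 uint64_t pong_timeout_us)
{
	return pong_timeout_us + l->avg_us;
}
```

Because the late pongs from the paused node are dropped before they reach the average, the adjusted timeout on node B does not keep growing.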

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago Merge pull request #250 from jfriesse/musl_fix
Fabio M. Di Nitto [Tue, 3 Sep 2019 10:12:28 +0000 (12:12 +0200)]
Merge pull request #250 from jfriesse/musl_fix

Fix compilation and running on Linux distribution with musl libc

4 years ago [handle] Set thread stack size on create
Jan Friesse [Mon, 2 Sep 2019 11:56:34 +0000 (13:56 +0200)]
[handle] Set thread stack size on create

Musl libc has small stack size for threads. Knet needs ~300KiB (tested
at the time when this patch was created). Glibc seems to use ~8MiB. As a
compromise, 1MiB is used.
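
Setting an explicit stack size goes through a pthread attribute object. A minimal sketch, assuming a hypothetical helper thread_attr_init_stack; only the standard pthread_attr_* calls are real API.

```c
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical helper: initialise *attr with an explicit stack size
 * instead of relying on the libc default. Returns 0 on success or a
 * pthread error number; on failure *attr is left destroyed.
 */
int thread_attr_init_stack(pthread_attr_t *attr, size_t stack_size)
{
	int err = pthread_attr_init(attr);

	if (err) {
		return err;
	}

	err = pthread_attr_setstacksize(attr, stack_size);
	if (err) {
		pthread_attr_destroy(attr);
	}

	return err;
}
```

The initialised attribute is then passed as the second argument of pthread_create() and released with pthread_attr_destroy() once the thread has been started. With the 1MiB compromise above, stack_size would be 1024 * 1024.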

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
4 years ago [common] Conditionalize RTLD_DI_ORIGIN
Jan Friesse [Mon, 2 Sep 2019 09:11:27 +0000 (11:11 +0200)]
[common] Conditionalize RTLD_DI_ORIGIN

RTLD_DI_ORIGIN is used to get the absolute path of a plugin. It is used only
for logging useful info and is not strictly needed, so use it only when it
is defined (musl is the only libc known to the author of the patch that does not define it)

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
4 years ago [common] Include correct errno.h
Jan Friesse [Mon, 2 Sep 2019 08:05:18 +0000 (10:05 +0200)]
[common] Include correct errno.h

sys/errno.h is a system-specific path and errno.h should be used instead.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
4 years ago Merge pull request #248 from jfriesse/fix-prio-description
Fabio M. Di Nitto [Tue, 27 Aug 2019 07:25:20 +0000 (09:25 +0200)]
Merge pull request #248 from jfriesse/fix-prio-description

[man] Fix priority description of POLICY_PASSIVE

4 years ago [man] Fix priority description of POLICY_PASSIVE
Jan Friesse [Mon, 26 Aug 2019 13:41:23 +0000 (15:41 +0200)]
[man] Fix priority description of POLICY_PASSIVE

... to match source code.

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
4 years ago Merge pull request #245 from kronosnet/pmtud-fixes
Fabio M. Di Nitto [Wed, 21 Aug 2019 04:17:33 +0000 (06:17 +0200)]
Merge pull request #245 from kronosnet/pmtud-fixes

[PMTUd] rework the whole math to calculate MTU

4 years ago [PMTUd] add ability to manually override MTU and disable PMTUd
Fabio M. Di Nitto [Tue, 20 Aug 2019 04:57:45 +0000 (06:57 +0200)]
[PMTUd] add ability to manually override MTU and disable PMTUd

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago [PMTUd] add dynamic pong timeout when using crypto
Fabio M. Di Nitto [Tue, 13 Aug 2019 04:41:32 +0000 (06:41 +0200)]
[PMTUd] add dynamic pong timeout when using crypto

problem originally reported by proxmox community, users
observed that under pressure the MTU would flap back and forth
between 2 values due to other node response timeout.

implement a dynamic timeout multiplier when using crypto that
should solve the problem in a more flexible fashion.

When a timeout hits, those new logs will show:

[knet]: [info] host: host: 1 (passive) best link: 0 (pri: 0)
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (4) for host 1 link: 0
[knet]: [info] pmtud: PMTUD link change for host: 1 link: 0 from 469 to 65429
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
[knet]: [info] pmtud: Global data MTU changed to: 65429
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (8) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (16) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (32) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (64) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (128) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429

and when the latency reduces and it is safe to be more responsive again:

[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Decreasing PMTUd response timeout multiplier to (64) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429

....
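
The doubling and halving visible in the logs above can be sketched as two small helpers. The function names and both bounds are assumptions for this example; the logs only show the multiplier moving through powers of two between 4 and 128.

```c
#include <stdint.h>

/* Bounds are assumptions for this sketch, not values taken from libknet */
#define SKETCH_MULT_MIN 2u
#define SKETCH_MULT_MAX 1024u

/* Called when a PMTUd probe times out: back off by doubling, up to the cap */
uint32_t sketch_mult_on_timeout(uint32_t mult)
{
	return (mult >= SKETCH_MULT_MAX / 2) ? SKETCH_MULT_MAX : mult * 2;
}

/* Called when replies arrive in time again: halve, down to the floor */
uint32_t sketch_mult_on_success(uint32_t mult)
{
	return (mult / 2 < SKETCH_MULT_MIN) ? SKETCH_MULT_MIN : mult / 2;
}
```

The effective timeout would then be the base timeout multiplied by the current multiplier, as in the test diff further down.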

testing this patch on normal hosts is a bit challenging though.

Patch was tested by hardcoding a super low timeout here:

diff --git a/libknet/threads_pmtud.c b/libknet/threads_pmtud.c
index 4f0ba0f..5e2b89b 100644
--- a/libknet/threads_pmtud.c
+++ b/libknet/threads_pmtud.c
@@ -261,7 +271,8 @@ retry:
                        /*
                         * crypto, under pressure, is a royal PITA
                         */
-                       pong_timeout_adj_tmp = dst_link->pong_timeout_adj * 2;
+                       //pong_timeout_adj_tmp = dst_link->pong_timeout_adj * dst_link->pmtud_crypto_timeout_multiplier;
+                       pong_timeout_adj_tmp = 30 * dst_link->pmtud_crypto_timeout_multiplier;
                } else {
                        pong_timeout_adj_tmp = dst_link->pong_timeout_adj;
                }

and by using a long-running version of api_knet_send_crypto_test with a short PMTUd setfreq (10 seconds).

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[PMTUd] rework the whole math to calculate MTU
Fabio M. Di Nitto [Mon, 12 Aug 2019 14:52:59 +0000 (16:52 +0200)]
[PMTUd] rework the whole math to calculate MTU

internal changes:
- drop the concept of sec_header_size that was completely wrong
  and unnecessary
- bump crypto API to version 3 due to the above change
- clarify the difference between link->proto_overhead and
  link->status->proto_overhead. We cannot rename the status
  one as it would also change ABI.
- add onwire.c with documentation on the packet format
  and what various len(s) mean in context.
- add 3 new functions to calculate MTUs back and forth
  and use them around, hopefully with enough clarification
  on why things are done in a given way.
- heavily change threads_pmtud.c to use those new facilities.
- fix major calculation issues when using crypto (non-crypto
  was not affected by the problem).
- fix checks around to make sure they match the new math.
- fix padding calculation.
- add functional PMTUd crypto test.
  This test can take several hours (12+) and should be executed
  in a controlled environment, since it automatically changes
  the loopback MTU to run its tests.
- fix the way the lowest MTU is calculated during a PMTUd run
  to avoid spurious double notifications.
- drop redundant checks.

user visible changes:
- Global MTU is now calculated properly when using crypto.
  Values will generally be bigger than before, because the
  previous implementation calculated padding incorrectly.
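
To illustrate why padding makes the MTU math more than a plain subtraction, here is a minimal sketch of converting between data length and on-wire length. It is not the code from onwire.c: the struct, the function names, the packet layout (salt, then padded header plus data, then hash) and the assumption that the cipher always adds between 1 and block_size bytes of padding are all hypothetical.

```c
#include <stddef.h>

/* All fields are illustrative assumptions, not knet's real layout. */
struct mtu_params {
	size_t proto_overhead; /* e.g. IPv4 + UDP headers: 20 + 8 bytes */
	size_t header_len;     /* fixed packet header, encrypted with the data */
	size_t block_size;     /* cipher block size, 0 when crypto is off */
	size_t salt_len;       /* per-packet salt, sent in clear */
	size_t hash_len;       /* trailing authentication hash */
};

/* On-wire size of a packet carrying data_len bytes of user data. */
size_t wire_len_from_data(const struct mtu_params *p, size_t data_len)
{
	size_t plain = p->header_len + data_len;

	if (p->block_size == 0)
		return p->proto_overhead + plain;

	/* Padding is always added: 1..block_size bytes. */
	size_t padded = (plain / p->block_size + 1) * p->block_size;
	return p->proto_overhead + p->salt_len + padded + p->hash_len;
}

/* Largest data_len whose on-wire size still fits in link_mtu (0 if none). */
size_t data_len_from_mtu(const struct mtu_params *p, size_t link_mtu)
{
	size_t fixed = p->proto_overhead;

	if (p->block_size != 0)
		fixed += p->salt_len + p->hash_len;
	if (link_mtu <= fixed)
		return 0;

	size_t room = link_mtu - fixed;
	if (p->block_size != 0) {
		room = (room / p->block_size) * p->block_size; /* whole blocks only */
		if (room == 0)
			return 0;
		room -= 1; /* reserve the mandatory padding byte */
	}
	if (room <= p->header_len)
		return 0;
	return room - p->header_len;
}
```

Under these assumptions the two functions are inverses at the boundary: the length returned by data_len_from_mtu fits in the link MTU, and one byte more spills into an extra cipher block and no longer fits.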

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years agoMerge pull request #242 from kronosnet/pmtud-fixes
Fabio M. Di Nitto [Fri, 2 Aug 2019 11:22:45 +0000 (13:22 +0200)]
Merge pull request #242 from kronosnet/pmtud-fixes

Pmtud fixes

4 years ago[PMTUd] fix MTU calculation when using crypto and add docs
Fabio M. Di Nitto [Fri, 2 Aug 2019 08:44:23 +0000 (10:44 +0200)]
[PMTUd] fix MTU calculation when using crypto and add docs

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[docs] add knet packet layout
Fabio M. Di Nitto [Fri, 2 Aug 2019 08:43:09 +0000 (10:43 +0200)]
[docs] add knet packet layout

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[udp] log information about detected kernel MTU
Fabio M. Di Nitto [Wed, 31 Jul 2019 12:15:07 +0000 (14:15 +0200)]
[udp] log information about detected kernel MTU

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[crypto] fix log information
Fabio M. Di Nitto [Tue, 30 Jul 2019 09:18:33 +0000 (11:18 +0200)]
[crypto] fix log information

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years agoMerge pull request #240 from kronosnet/cov-scan
Fabio M. Di Nitto [Fri, 26 Jul 2019 13:07:32 +0000 (15:07 +0200)]
Merge pull request #240 from kronosnet/cov-scan

coverity scan fixes

4 years ago[sctp] retry locking in case of failure
Fabio M. Di Nitto [Fri, 26 Jul 2019 07:58:05 +0000 (09:58 +0200)]
[sctp] retry locking in case of failure

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[tx] clean up channel management code for internal communications
Fabio M. Di Nitto [Thu, 25 Jul 2019 09:18:19 +0000 (11:18 +0200)]
[tx] clean up channel management code for internal communications

the code is still not in use, but it is clearer and no longer triggers
a memory overrun

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[nozzle] fix a few coverity errors in the test suite
Fabio M. Di Nitto [Thu, 25 Jul 2019 07:24:26 +0000 (09:24 +0200)]
[nozzle] fix a few coverity errors in the test suite

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[tx] drop unnecessary usleep when sending to localhost
Fabio M. Di Nitto [Thu, 25 Jul 2019 06:28:34 +0000 (08:28 +0200)]
[tx] drop unnecessary usleep when sending to localhost

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[common] make sure string is null terminated
Fabio M. Di Nitto [Wed, 24 Jul 2019 11:59:47 +0000 (13:59 +0200)]
[common] make sure string is null terminated

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[test] simplify flush log
Fabio M. Di Nitto [Wed, 24 Jul 2019 11:46:51 +0000 (13:46 +0200)]
[test] simplify flush log

allocate on the stack only once and make sure strings are null terminated.
drop the useless read loop, since log messages are always smaller than
PAGE_SIZE and reads are atomic at that level

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>
4 years ago[nozzle] avoid tons of possible buffer overruns
Fabio M. Di Nitto [Wed, 24 Jul 2019 09:00:00 +0000 (11:00 +0200)]
[nozzle] avoid tons of possible buffer overruns

Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>