as reported in https://forum.proxmox.com/threads/sudden-reboot-of-multiple-nodes-while-adding-a-new-node.116714/
this patch just fixes a particular issue where a node joins (as in
quorum membership change, not limited to PVE cluster join) an existing
cluster, but has a lower MTU than the existing links to the already
joined part of the cluster.
For example:
Node A: MTU 9000
Node B: MTU 9000
Node C: MTU 1500
A & B are already up and running and have established that they can talk
to each other with MTU 9000 (minus overhead). Now C joins as well - without
the reset and re-schedule of MTU discovery in this patch, A and B will
use MTU 9000 when talking to C, but those packets might never arrive
(depending on network hardware and configuration). Since the heartbeat
packets used to detect the link status are always small, they are able
to arrive at C without any problems. If the network along the way
doesn't reject the packets, but just drops them, the MTU discovery is
also severely delayed (up to tens of minutes until the actual, low MTU
is correctly detected!).
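The failure mode described above can be illustrated with a toy model. This is
only a sketch, not libknet code: the function names, the heartbeat size and the
fallback MTU are assumptions made up for the example.

```python
# Toy model, not libknet code: all names and numbers are illustrative.
HEARTBEAT_SIZE = 100    # assumed size of a small link-status packet, in bytes
FALLBACK_MTU = 1280     # assumed conservative value used after a reset

def delivered(packet_size, path_mtu):
    """A blackholing network silently drops anything above the path MTU."""
    return packet_size <= path_mtu

def data_mtu_after_join(established_mtu, reset_on_join):
    """MTU used for data packets right after a membership change."""
    return FALLBACK_MTU if reset_on_join else established_mtu

path_mtu_to_c = 1500    # node C's link
established = 9000      # what A and B settled on before C joined

# Heartbeats always fit, so the link to C is reported as up either way.
link_up = delivered(HEARTBEAT_SIZE, path_mtu_to_c)

# Without the reset, full-size data packets to C are lost.
without_reset = delivered(data_mtu_after_join(established, False), path_mtu_to_c)

# With the reset, data packets fit until discovery finds the real MTU.
with_reset = delivered(data_mtu_after_join(established, True), path_mtu_to_c)

print(link_up, without_reset, with_reset)  # True False True
```

The link looks healthy in both cases because only the small heartbeats are
checked, which is why the problem is not caught by link status detection.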
In the regular case, the reset will be immediately followed by detecting
the correct MTU for the new link (and, if it's lower than that of the
other links, lowering the global MTU used for fragmenting by knet), and
the window with additional overhead (smaller MTU => more fragmentation
=> more packets) should be fairly small. In case of a network blackhole
negatively affecting MTU discovery, the window might be big, but without
this patch, the result is a complete outage of the whole cluster, which
is even less desirable than a cluster running with degraded performance.
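To give a rough feel for the overhead mentioned above, the following sketch
counts fragments for one payload at different MTUs. It assumes the global MTU is
the lowest MTU of all links and ignores per-packet header overhead; the helper
names and the payload size are hypothetical, not taken from knet.

```python
import math

def global_mtu(link_mtus):
    """Assumption: fragmentation uses the lowest MTU of all links."""
    return min(link_mtus)

def fragments(payload_size, mtu):
    """Packets needed for one payload, ignoring header overhead."""
    return math.ceil(payload_size / mtu)

payload = 60000  # illustrative payload size in bytes

before_join = global_mtu([9000, 9000])        # only A and B
after_join = global_mtu([9000, 9000, 1500])   # C has joined

print(fragments(payload, before_join))  # 7
print(fragments(payload, after_join))   # 40
```

More fragments mean more packets on the wire, but every one of them can reach
node C.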
Upstream is working on further improving similar failure scenarios, such as:
- improved handling of MTU being lowered at runtime (either at the link
level, or somewhere along the network path)
- improving MTU discovery timeouts and intervals to speed up recovery
even with blackholing networks
These other changes are still work in progress and will follow at a
later date.
This patch is cherry-picked from upstream branch stable1-proposed
(slated for inclusion in the next stable 1.x release of libknet).