git.proxmox.com Git - mirror_ubuntu-eoan-kernel.git/commitdiff
tipc: reduce sensitive to retransmit failures
author Hoang Le <hoang.h.le@dektech.com.au>
Wed, 6 Nov 2019 06:26:10 +0000 (13:26 +0700)
committer Khalid Elmously <khalid.elmously@canonical.com>
Fri, 6 Mar 2020 07:25:26 +0000 (02:25 -0500)
BugLink: https://bugs.launchpad.net/bugs/1864060
commit 426071f1f3995d7e9603246bffdcbf344cd31719 upstream.

In a huge cluster (e.g. >200 nodes), the flow
gap -> retransmit packet -> acked takes a long time when STATE_MSG packets
are dropped or delayed because of heavy traffic. As a result, the 1.5 sec
tolerance value criterion makes a link fail too easily, around the 2nd or
3rd failed retransmission attempt.

Instead of re-introducing the criterion of 99 failed retransmissions to fix
the issue, we increase the failure detection timer to ten times the
tolerance value.

Fixes: 77cf8edbc0e7 ("tipc: simplify stale link failure criteria")
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
net/tipc/link.c

index 3fa02796a47d598992e49e24a5681bf9fd36f9fa..95908fa01583cfe271d83d0b18f48517f6c7a759 100644
@@ -1078,7 +1078,7 @@ static bool link_retransmit_failure(struct tipc_link *l, struct tipc_link *r,
                return false;
 
        if (!time_after(jiffies, TIPC_SKB_CB(skb)->retr_stamp +
-                       msecs_to_jiffies(r->tolerance)))
+                       msecs_to_jiffies(r->tolerance * 10)))
                return false;
 
        hdr = buf_msg(skb);
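
To illustrate the effect of the changed check in the hunk above, here is a minimal userspace sketch, not the kernel code itself. All names prefixed with `sketch_` are hypothetical, and the sketch assumes 1 jiffy equals 1 ms (HZ = 1000) and a 32-bit jiffies counter. It models a wrap-safe "time after" comparison and lets the tolerance multiplier be passed in, so the old (factor 1) and new (factor 10) behavior can be compared for the 1.5 sec tolerance mentioned in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 1 jiffy == 1 ms (HZ = 1000). Real kernels may use other HZ values. */
#define SKETCH_HZ 1000u

/* Assumption: a 32-bit tick counter, used here only to keep the model small. */
typedef uint32_t sketch_jiffies_t;

/* Wrap-safe "a is strictly after b", modeled on the idea behind time_after(). */
static bool sketch_time_after(sketch_jiffies_t a, sketch_jiffies_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Convert milliseconds to ticks under the HZ assumption above. */
static sketch_jiffies_t sketch_msecs_to_jiffies(uint32_t ms)
{
    return (sketch_jiffies_t)(((uint64_t)ms * SKETCH_HZ) / 1000u);
}

/*
 * Returns true once more than (tolerance_ms * factor) has elapsed since the
 * packet's retransmit timestamp. factor == 1 models the check before the
 * patch, factor == 10 models the check after it.
 */
static bool sketch_retransmit_expired(sketch_jiffies_t now,
                                      sketch_jiffies_t retr_stamp,
                                      uint32_t tolerance_ms,
                                      uint32_t factor)
{
    assert(factor > 0);
    return sketch_time_after(now, retr_stamp +
                             sketch_msecs_to_jiffies(tolerance_ms * factor));
}
```

Under these assumptions, a packet first retransmitted at tick 0 with a 1500 ms tolerance counts as expired at tick 2000 with factor 1, but only after tick 15000 with factor 10, which gives a congested cluster roughly 15 seconds to acknowledge the retransmission before the link is declared failed.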