--- /dev/null
+From d88270eef4b56bd7973841dd1fed387ccfa83709 Mon Sep 17 00:00:00 2001
+From: Neal Cardwell <ncardwell@google.com>
+Date: Mon, 25 Jan 2016 14:01:53 -0800
+Subject: [PATCH] tcp: fix tcp_mark_head_lost to check skb len before
+ fragmenting
+
+This commit fixes a corner case in tcp_mark_head_lost() which was
+causing the WARN_ON(len > skb->len) in tcp_fragment() to fire.
+
+tcp_mark_head_lost() was assuming that if a packet has
+tcp_skb_pcount(skb) of N, then it's safe to fragment off a prefix of
+M*mss bytes, for any M < N. But with the tricky way TCP pcounts are
+maintained, this is not always true.
+
+For example, suppose the sender sends 4 1-byte packets and have the
+last 3 packet sacked. It will merge the last 3 packets in the write
+queue into an skb with pcount = 3 and len = 3 bytes. If another
+recovery happens after a sack reneging event, tcp_mark_head_lost()
+may attempt to split the skb assuming it has more than 2*MSS bytes.
+
+This sounds very counterintuitive, but as the commit description for
+the related commit c0638c247f55 ("tcp: don't fragment SACKed skbs in
+tcp_mark_head_lost()") notes, this is because tcp_shifted_skb()
+coalesces adjacent regions of SACKed skbs, and when doing this it
+preserves the sum of their packet counts in order to reflect the
+real-world dynamics on the wire. The c0638c247f55 commit tried to
+avoid problems by not fragmenting SACKed skbs, since SACKed skbs are
+where the non-proportionality between pcount and skb->len/mss is known
+to be possible. However, that commit did not handle the case where
+during a reneging event one of these weird SACKed skbs becomes an
+un-SACKed skb, which tcp_mark_head_lost() can then try to fragment.
+
+The fix is to simply mark the entire skb lost when this happens.
+This makes the recovery slightly more aggressive in such corner
+cases before we detect reordering. But once we detect reordering
+this code path is by-passed because FACK is disabled.
+
+Signed-off-by: Neal Cardwell <ncardwell@google.com>
+Signed-off-by: Yuchung Cheng <ycheng@google.com>
+Signed-off-by: Eric Dumazet <edumazet@google.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+Cherry-picked-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
+---
+ net/ipv4/tcp_input.c | 10 +++++-----
+ 1 file changed, 5 insertions(+), 5 deletions(-)
+
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index 0003d40..d2ad433 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -2164,8 +2164,7 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int mark_head)
+ {
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct sk_buff *skb;
+- int cnt, oldcnt;
+- int err;
++ int cnt, oldcnt, lost;
+ unsigned int mss;
+ /* Use SACK to deduce losses of new sequences sent during recovery */
+ const u32 loss_high = tcp_is_sack(tp) ? tp->snd_nxt : tp->high_seq;
+@@ -2205,9 +2204,10 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int mark_head)
+ break;
+
+ mss = tcp_skb_mss(skb);
+- err = tcp_fragment(sk, skb, (packets - oldcnt) * mss,
+- mss, GFP_ATOMIC);
+- if (err < 0)
++ /* If needed, chop off the prefix to mark as lost. */
++ lost = (packets - oldcnt) * mss;
++ if (lost < skb->len &&
++ tcp_fragment(sk, skb, lost, mss, GFP_ATOMIC) < 0)
+ break;
+ cnt = packets;
+ }
+--
+2.1.4
+