Patch "tcp: fix indefinite deferral of RTO with SACK reneging" has been added to the 4.9-stable tree

Sasha Levin <sashal@xxxxxxxxxx> · Mon, 31 Oct 2022 11:25:37 -0400

This is a note to let you know that I've just added the patch titled

    tcp: fix indefinite deferral of RTO with SACK reneging

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     tcp-fix-indefinite-deferral-of-rto-with-sack-renegin.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit c21d901ffaf4ad393484be166a3cfd5d0e30d5aa
Author: Neal Cardwell <ncardwell@xxxxxxxxxx>
Date:   Fri Oct 21 17:08:21 2022 +0000

    tcp: fix indefinite deferral of RTO with SACK reneging
    
    [ Upstream commit 3d2af9cce3133b3bc596a9d065c6f9d93419ccfb ]
    
    This commit fixes a bug that can cause a TCP data sender to repeatedly
    defer RTOs when encountering SACK reneging.
    
    The bug is that when we're in fast recovery in a scenario with SACK
    reneging, every time we get an ACK we call tcp_check_sack_reneging()
    and it can note the apparent SACK reneging and rearm the RTO timer for
    srtt/2 into the future. In some SACK reneging scenarios that can
    happen repeatedly until the receive window fills up, at which point
    the sender can't send any more, the ACKs stop arriving, and the RTO
    fires at srtt/2 after the last ACK. But that can take far too long
    (O(10 secs)), since the connection is stuck in fast recovery with a
    low cwnd that cannot grow beyond ssthresh, even if more bandwidth is
    available.
    
    This fix changes the logic in tcp_check_sack_reneging() to only rearm
    the RTO timer if data is cumulatively ACKed, indicating forward
    progress. This avoids this kind of nearly infinite loop of RTO timer
    re-arming. In addition, this meets the goals of
    tcp_check_sack_reneging() in handling Windows TCP behavior that looks
    temporarily like SACK reneging but is not really.
    
    Many thanks to Jakub Kicinski and Neil Spring, who reported this issue
    and provided critical packet traces that enabled root-causing this
    issue. Also, many thanks to Jakub Kicinski for testing this fix.
    
    Fixes: 5ae344c949e7 ("tcp: reduce spurious retransmits due to transient SACK reneging")
    Reported-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Reported-by: Neil Spring <ntspring@xxxxxx>
    Signed-off-by: Neal Cardwell <ncardwell@xxxxxxxxxx>
    Reviewed-by: Eric Dumazet <edumazet@xxxxxxxxxx>
    Cc: Yuchung Cheng <ycheng@xxxxxxxxxx>
    Tested-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20221021170821.1093930-1-ncardwell.kernel@xxxxxxxxx
    Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9c7f716aab44..98ff1e34e04f 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2053,7 +2053,8 @@ void tcp_enter_loss(struct sock *sk)
  */
 static bool tcp_check_sack_reneging(struct sock *sk, int flag)
 {
-	if (flag & FLAG_SACK_RENEGING) {
+	if (flag & FLAG_SACK_RENEGING &&
+	    flag & FLAG_SND_UNA_ADVANCED) {
 		struct tcp_sock *tp = tcp_sk(sk);
 		unsigned long delay = max(usecs_to_jiffies(tp->srtt_us >> 4),
 					  msecs_to_jiffies(10));