Patch "tcp: avoid too many retransmit packets" has been added to the 6.9-stable tree

Sasha Levin <sashal@xxxxxxxxxx> · Sat, 13 Jul 2024 09:28:51 -0400

This is a note to let you know that I've just added the patch titled

    tcp: avoid too many retransmit packets

to the 6.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     tcp-avoid-too-many-retransmit-packets.patch
and it can be found in the queue-6.9 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 6d0a55397594271363d675d5a0b9845b022f87fb
Author: Eric Dumazet <edumazet@xxxxxxxxxx>
Date:   Wed Jul 10 00:14:01 2024 +0000

    tcp: avoid too many retransmit packets
    
    [ Upstream commit 97a9063518f198ec0adb2ecb89789de342bb8283 ]
    
    If a TCP socket is using TCP_USER_TIMEOUT, and the other peer
    retracted its window to zero, tcp_retransmit_timer() can
    retransmit a packet every two jiffies (2 ms for HZ=1000),
    for about 4 minutes after TCP_USER_TIMEOUT has 'expired'.
    
    The fix is to make sure tcp_rtx_probe0_timed_out() takes
    icsk->icsk_user_timeout into account.
    
    Before the blamed commit, the socket would not time out after
    icsk->icsk_user_timeout, but would use standard exponential
    backoff for the retransmits.
    
    Also worth noting that before commit e89688e3e978 ("net: tcp:
    fix unexcepted socket die when snd_wnd is 0"), the issue
    would last 2 minutes instead of 4.
    
    Fixes: b701a99e431d ("tcp: Add tcp_clamp_rto_to_user_timeout() helper to improve accuracy")
    Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx>
    Cc: Neal Cardwell <ncardwell@xxxxxxxxxx>
    Reviewed-by: Jason Xing <kerneljasonxing@xxxxxxxxx>
    Reviewed-by: Jon Maxwell <jmaxwell37@xxxxxxxxx>
    Reviewed-by: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
    Link: https://patch.msgid.link/20240710001402.2758273-1-edumazet@xxxxxxxxxx
    Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
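
The decision described in the commit message can be approximated in user
space as follows. This is only a sketch, not the kernel code: every model_*
name is hypothetical, HZ=1000 is assumed so that one jiffy equals 1 ms, and
the final comparison in model_rtx_probe0_timed_out() is an assumption because
the hunk below does not show the end of the function. The last helper works
out the rough upper bound implied by "a packet every two jiffies for about
4 minutes".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this model: HZ=1000, so one jiffy is 1 ms. */
#define MODEL_HZ 1000u
/* Mirrors the kernel's TCP_RTO_MAX of 120 seconds, expressed in jiffies. */
#define MODEL_TCP_RTO_MAX (120u * MODEL_HZ)

/* Hypothetical stand-in for msecs_to_jiffies(); rounds up. */
static uint32_t model_msecs_to_jiffies(uint32_t ms)
{
	return (uint32_t)(((uint64_t)ms * MODEL_HZ + 999u) / 1000u);
}

/*
 * user_timeout_ms: TCP_USER_TIMEOUT value in ms, 0 if unset.
 * rtx_delta_ms:    ms elapsed since the first retransmit.
 * rcv_delta:       jiffies between the last received packet and the
 *                  timer expiry; may be negative.
 */
static bool model_rtx_probe0_timed_out(uint32_t user_timeout_ms,
				       uint32_t rtx_delta_ms,
				       int32_t rcv_delta)
{
	int32_t timeout = (int32_t)(MODEL_TCP_RTO_MAX * 2u);

	if (user_timeout_ms) {
		uint32_t ut;

		/* New early exit: zero-window packets from the peer no
		 * longer postpone the user timeout.
		 */
		if (rtx_delta_ms > user_timeout_ms)
			return true;

		/* Clamp the default 2 * TCP_RTO_MAX to the user timeout. */
		ut = model_msecs_to_jiffies(user_timeout_ms);
		if (ut < (uint32_t)timeout)
			timeout = (int32_t)ut;
	}

	/* The peer was heard from recently enough: keep the socket alive. */
	if (rcv_delta <= timeout)
		return false;

	/* Assumption: the part of the function not shown in the hunk
	 * compares the elapsed retransmit time against the same timeout.
	 */
	return model_msecs_to_jiffies(rtx_delta_ms) > (uint32_t)timeout;
}

/* Upper bound on retransmits sent at a fixed period within a window. */
static uint32_t model_max_retransmits(uint32_t window_ms, uint32_t period_ms)
{
	return period_ms ? window_ms / period_ms : 0;
}
```

With a 5 second user timeout, model_rtx_probe0_timed_out(5000, 6000, 10)
reports a timeout even though the peer was heard from 10 jiffies before the
timer expired, which is the behaviour the patch adds. For scale,
model_max_retransmits(240000, 2) gives the order of magnitude of packets
that could be sent in the 4 minute window before the fix.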

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 22d25f63858b9..cceb4fabd4c85 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -481,15 +481,26 @@ static bool tcp_rtx_probe0_timed_out(const struct sock *sk,
 				     const struct sk_buff *skb,
 				     u32 rtx_delta)
 {
+	const struct inet_connection_sock *icsk = inet_csk(sk);
+	u32 user_timeout = READ_ONCE(icsk->icsk_user_timeout);
 	const struct tcp_sock *tp = tcp_sk(sk);
-	const int timeout = TCP_RTO_MAX * 2;
+	int timeout = TCP_RTO_MAX * 2;
 	s32 rcv_delta;
 
+	if (user_timeout) {
+		/* If user application specified a TCP_USER_TIMEOUT,
+		 * it does not want win 0 packets to 'reset the timer'
+		 * while retransmits are not making progress.
+		 */
+		if (rtx_delta > user_timeout)
+			return true;
+		timeout = min_t(u32, timeout, msecs_to_jiffies(user_timeout));
+	}
 	/* Note: timer interrupt might have been delayed by at least one jiffy,
 	 * and tp->rcv_tstamp might very well have been written recently.
 	 * rcv_delta can thus be negative.
 	 */
-	rcv_delta = inet_csk(sk)->icsk_timeout - tp->rcv_tstamp;
+	rcv_delta = icsk->icsk_timeout - tp->rcv_tstamp;
 	if (rcv_delta <= timeout)
 		return false;