From: Eric Dumazet <edumazet@xxxxxxxxxx> commit 97a9063518f198ec0adb2ecb89789de342bb8283 upstream. If a TCP socket is using TCP_USER_TIMEOUT, and the other peer retracted its window to zero, tcp_retransmit_timer() can retransmit a packet every two jiffies (2 ms for HZ=1000), for about 4 minutes after TCP_USER_TIMEOUT has 'expired'. The fix is to make sure tcp_rtx_probe0_timed_out() takes icsk->icsk_user_timeout into account. Before blamed commit, the socket would not timeout after icsk->icsk_user_timeout, but would use standard exponential backoff for the retransmits. Also worth noting that before commit e89688e3e978 ("net: tcp: fix unexcepted socket die when snd_wnd is 0"), the issue would last 2 minutes instead of 4. Fixes: b701a99e431d ("tcp: Add tcp_clamp_rto_to_user_timeout() helper to improve accuracy") Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx> Cc: Neal Cardwell <ncardwell@xxxxxxxxxx> Reviewed-by: Jason Xing <kerneljasonxing@xxxxxxxxx> Reviewed-by: Jon Maxwell <jmaxwell37@xxxxxxxxx> Reviewed-by: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx> Link: https://patch.msgid.link/20240710001402.2758273-1-edumazet@xxxxxxxxxx Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- net/ipv4/tcp_timer.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -457,16 +457,28 @@ static void tcp_fastopen_synack_timer(st static bool tcp_rtx_probe0_timed_out(const struct sock *sk, const struct sk_buff *skb) { + const struct inet_connection_sock *icsk = inet_csk(sk); + u32 user_timeout = READ_ONCE(icsk->icsk_user_timeout); const struct tcp_sock *tp = tcp_sk(sk); - const int timeout = TCP_RTO_MAX * 2; + int timeout = TCP_RTO_MAX * 2; u32 rtx_delta; s32 rcv_delta; + if (user_timeout) { + /* If user application specified a TCP_USER_TIMEOUT, + * it does not want win 0 packets to 'reset the timer' + * while retransmits are not making progress. + */ + if (rtx_delta > user_timeout) + return true; + timeout = min_t(u32, timeout, msecs_to_jiffies(user_timeout)); + } + /* Note: timer interrupt might have been delayed by at least one jiffy, * and tp->rcv_tstamp might very well have been written recently. * rcv_delta can thus be negative. */ - rcv_delta = inet_csk(sk)->icsk_timeout - tp->rcv_tstamp; + rcv_delta = icsk->icsk_timeout - tp->rcv_tstamp; if (rcv_delta <= timeout) return false; Patches currently in stable-queue which might be from edumazet@xxxxxxxxxx are queue-6.6/udp-set-sock_rcu_free-earlier-in-udp_lib_get_port.patch queue-6.6/tcp-avoid-too-many-retransmit-packets.patch queue-6.6/tcp-use-signed-arithmetic-in-tcp_rtx_probe0_timed_out.patch queue-6.6/tcp-fix-incorrect-undo-caused-by-dsack-of-tlp-retran.patch