6.10-stable review patch. If anyone has any objections, please let me know. ------------------ From: Xueming Feng <kuro@xxxxxxxx> [ Upstream commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4 ] We have some problem closing zero-window fin-wait-1 tcp sockets in our environment. This patch come from the investigation. Previously tcp_abort only sends out reset and calls tcp_done when the socket is not SOCK_DEAD, aka orphan. For orphan socket, it will only purging the write queue, but not close the socket and left it to the timer. While purging the write queue, tp->packets_out and sk->sk_write_queue is cleared along the way. However tcp_retransmit_timer have early return based on !tp->packets_out and tcp_probe_timer have early return based on !sk->sk_write_queue. This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being resched and socket not being killed by the timers, converting a zero-windowed orphan into a forever orphan. This patch removes the SOCK_DEAD check in tcp_abort, making it send reset to peer and close the socket accordingly. Preventing the timer-less orphan from happening. According to Lorenzo's email in the v1 thread, the check was there to prevent force-closing the same socket twice. That situation is handled by testing for TCP_CLOSE inside lock, and returning -ENOENT if it is already closed. The -ENOENT code comes from the associate patch Lorenzo made for iproute2-ss; link attached below, which also conform to RFC 9293. At the end of the patch, tcp_write_queue_purge(sk) is removed because it was already called in tcp_done_with_error(). p.s. This is the same patch with v2. Resent due to mis-labeled "changes requested" on patchwork.kernel.org. Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@xxxxxxxxxx/ Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.") Signed-off-by: Xueming Feng <kuro@xxxxxxxx> Tested-by: Lorenzo Colitti <lorenzo@xxxxxxxxxx> Reviewed-by: Jason Xing <kerneljasonxing@xxxxxxxxx> Reviewed-by: Eric Dumazet <edumazet@xxxxxxxxxx> Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@xxxxxxxx Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> --- net/ipv4/tcp.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index ec6911034138f..2edbd5b181e29 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4573,6 +4573,13 @@ int tcp_abort(struct sock *sk, int err) /* Don't race with userspace socket closes such as tcp_close. */ lock_sock(sk); + /* Avoid closing the same socket twice. */ + if (sk->sk_state == TCP_CLOSE) { + if (!has_current_bpf_ctx()) + release_sock(sk); + return -ENOENT; + } + if (sk->sk_state == TCP_LISTEN) { tcp_set_state(sk, TCP_CLOSE); inet_csk_listen_stop(sk); @@ -4582,16 +4589,13 @@ int tcp_abort(struct sock *sk, int err) local_bh_disable(); bh_lock_sock(sk); - if (!sock_flag(sk, SOCK_DEAD)) { - if (tcp_need_reset(sk->sk_state)) - tcp_send_active_reset(sk, GFP_ATOMIC, - SK_RST_REASON_NOT_SPECIFIED); - tcp_done_with_error(sk, err); - } + if (tcp_need_reset(sk->sk_state)) + tcp_send_active_reset(sk, GFP_ATOMIC, + SK_RST_REASON_NOT_SPECIFIED); + tcp_done_with_error(sk, err); bh_unlock_sock(sk); local_bh_enable(); - tcp_write_queue_purge(sk); if (!has_current_bpf_ctx()) release_sock(sk); return 0; -- 2.43.0