On Sat, Apr 9, 2022 at 9:46 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote: > > On Sat, Apr 9, 2022 at 1:19 AM Tetsuo Handa > <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: > > > > Hello, bpf developers. > > > > syzbot is reporting use-after-free increment at __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPTIMEOUTS). > > > Try removing NFS from your kernel .config ? If your repro still works, > then another user of kernel TCP socket needs some care. > > NFS maintainers and other folks are already working on fixing this issue, > which is partly caused by fs/file_table.c being able to delay fput(), > look at code in fput_many() > > Kernel TCP sockets are tricky, they (for good reasons) do not take a > reference on the net namespace. > > This also means that users of such sockets need to make sure the > various tcp timers have been completed, > as sk_stop_timer() is not using del_timer_sync() > > Even after a synchronous fput(), there is no guarantee that another > cpu is not running some of the socket timers functions. So please add to your tree the NFS fix: commit f00432063db1a0db484e85193eccc6845435b80e Author: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> Date: Sun Apr 3 15:58:11 2022 -0400 SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() We must ensure that all sockets are closed before we call xprt_free() and release the reference to the net namespace. The problem is that calling fput() will defer closing the socket until delayed_fput() gets called. Let's fix the situation by allowing rpciod and the transport teardown code (which runs on the system wq) to call __fput_sync(), and directly close the socket. Reported-by: Felix Fu <foyjog@xxxxxxxxx> Acked-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx> Fixes: a73881c96d73 ("SUNRPC: Fix an Oops in udp_poll()") Cc: stable@xxxxxxxxxxxxxxx # 5.1.x: 3be232f11a3c: SUNRPC: Prevent immediate close+reconnect Cc: stable@xxxxxxxxxxxxxxx # 5.1.x: 89f42494f92f: SUNRPC: Don't call connect() more than once on a TCP socket Cc: stable@xxxxxxxxxxxxxxx # 5.1.x Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> Then on top of that, add the following fix (I will formally submit this one once back to work, Monday morning) diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index 3908296d103fd2de9284adea64dba94fe6b8720f..e2c856ae4fdbef5bd3c7728e376786b804e2d4f1 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -171,6 +171,7 @@ void inet_csk_init_xmit_timers(struct sock *sk, void (*delack_handler)(struct timer_list *), void (*keepalive_handler)(struct timer_list *)); void inet_csk_clear_xmit_timers(struct sock *sk); +void inet_csk_clear_xmit_timers_sync(struct sock *sk); static inline void inet_csk_schedule_ack(struct sock *sk) { diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 1e5b53c2bb2670fc90b789e853458f5c86a00c27..aab83b766014d0a091a73bdc13376d9cdae99b27 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -581,6 +581,17 @@ void inet_csk_clear_xmit_timers(struct sock *sk) } EXPORT_SYMBOL(inet_csk_clear_xmit_timers); +void inet_csk_clear_xmit_timers_sync(struct sock *sk) +{ + struct inet_connection_sock *icsk = inet_csk(sk); + + icsk->icsk_pending = icsk->icsk_ack.pending = 0; + + sk_stop_timer_sync(sk, &icsk->icsk_retransmit_timer); + sk_stop_timer_sync(sk, &icsk->icsk_delack_timer); + sk_stop_timer_sync(sk, &sk->sk_timer); +} + void inet_csk_delete_keepalive_timer(struct sock *sk) { sk_stop_timer(sk, &sk->sk_timer); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index e31cf137c6140f76f838b4a0dcddf9f104ad653b..3dacd202bf2af43c55ffe820c08316150d2018ea 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2928,6 +2928,8 @@ void tcp_close(struct sock *sk, long timeout) lock_sock(sk); __tcp_close(sk, timeout); release_sock(sk); + if (!sk->sk_net_refcnt) + inet_csk_clear_xmit_timers_sync(sk); sock_put(sk); } EXPORT_SYMBOL(tcp_close);