On Wed, Jul 03, 2024 at 11:47:46AM +0800, Wen Gu wrote:
> From: Wang Yufen <wangyufen@xxxxxxxxxx>
>
> [ Upstream commit d8616ee2affcff37c5d315310da557a694a3303d ]
>
> During TCP sockmap redirect pressure test, the following warning is triggered:
>
> WARNING: CPU: 3 PID: 2145 at net/core/stream.c:205 sk_stream_kill_queues+0xbc/0xd0
> CPU: 3 PID: 2145 Comm: iperf Kdump: loaded Tainted: G W 5.10.0+ #9
> Call Trace:
> inet_csk_destroy_sock+0x55/0x110
> inet_csk_listen_stop+0xbb/0x380
> tcp_close+0x41b/0x480
> inet_release+0x42/0x80
> __sock_release+0x3d/0xa0
> sock_close+0x11/0x20
> __fput+0x9d/0x240
> task_work_run+0x62/0x90
> exit_to_user_mode_prepare+0x110/0x120
> syscall_exit_to_user_mode+0x27/0x190
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> The reason we observed is that:
>
> When the listener is closing, a connection may have completed the three-way
> handshake but not accepted, and the client has sent some packets. The child
> sks in accept queue release by inet_child_forget()->inet_csk_destroy_sock(),
> but psocks of child sks have not released.
>
> To fix, add sock_map_destroy to release psocks.
>
> Signed-off-by: Wang Yufen <wangyufen@xxxxxxxxxx>
> Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> Acked-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx>
> Acked-by: John Fastabend <john.fastabend@xxxxxxxxx>
> Link: https://lore.kernel.org/bpf/20220524075311.649153-1-wangyufen@xxxxxxxxxx
> Stable-dep-of: 8bbabb3fddcd ("bpf, sock_map: Move cancel_work_sync() out of sock lock")
> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> [Conflict in include/linux/bpf.h due to function declaration position
> and remove non-existed sk_psock_stop helper from sock_map_destroy.]
> Signed-off-by: Wen Gu <guwen@xxxxxxxxxxxxxxxxx>
> ---
> background:
> Link: https://lore.kernel.org/stable/d11bc7e6-a2c7-445a-8561-3599eafb07b0@xxxxxxxxxxxxxxxxx/
>
> @stable team:
> This backport has 2 changes compared to the original patch:
> - fix conflict due to sock_map_destroy declaration position in include/linux/bpf.h;
> - remove the non-existed sk_psock_stop helper from sock_map_destroy. This helper is
>   introduced by 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()") after
>   v5.10, it is not a fix and hard to backport. Considering that what did in
>   sk_psock_stop is done in sk_psock_drop and neither sock_map_close nor sock_map_unhash
>   in v5.10 introduces sk_psock_stop, I removed it from sock_map_destroy too.
> I tested it in my environment, the regression was gone.

Now queued up, thanks.

greg k-h