On Sat, May 28, 2022 at 09:54 AM +08, wangyufen wrote: > 在 2022/5/28 5:37, Cong Wang 写道: >> On Tue, May 24, 2022 at 03:53:11PM +0800, Wang Yufen wrote: >>> During TCP sockmap redirect pressure test, the following warning is triggered: >>> WARNING: CPU: 3 PID: 2145 at net/core/stream.c:205 sk_stream_kill_queues+0xbc/0xd0 >>> CPU: 3 PID: 2145 Comm: iperf Kdump: loaded Tainted: G W 5.10.0+ #9 >>> Call Trace: >>> inet_csk_destroy_sock+0x55/0x110 >>> inet_csk_listen_stop+0xbb/0x380 >>> tcp_close+0x41b/0x480 >>> inet_release+0x42/0x80 >>> __sock_release+0x3d/0xa0 >>> sock_close+0x11/0x20 >>> __fput+0x9d/0x240 >>> task_work_run+0x62/0x90 >>> exit_to_user_mode_prepare+0x110/0x120 >>> syscall_exit_to_user_mode+0x27/0x190 >>> entry_SYSCALL_64_after_hwframe+0x44/0xa9 >>> >>> The reason we observed is that: >>> When the listener is closing, a connection may have completed the three-way >>> handshake but not accepted, and the client has sent some packets. The child >>> sks in accept queue release by inet_child_forget()->inet_csk_destroy_sock(), >>> but psocks of child sks have not released. >>> >> Hm, in this scenario, how does the child socket end up in the sockmap? >> Clearly user-space does not have a chance to get an fd yet. >> >> And, how does your patch work? Since the child sock does not even inheirt >> the sock proto after clone (see the comments above tcp_bpf_clone()) at >> all? >> >> Thanks. >> . > My test cases are as follows: > > __section("sockops") > int bpf_sockmap(struct bpf_sock_ops *skops) > { > switch (skops->op) { > case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB: > case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB: > ... > bpf_sock_hash_update(skops, &sock_ops_map, &key, BPF_NOEXIST); > break; > ... > } Right, when processing the final ACK in tcp_rcv_state_process(), we invoke the BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB BPF callback. This gives a chance to install sockmap sk_prot callbacks. An accept() without ever calling accept() ;-) [...]