On 3/19/25 23:18, Cong Wang wrote: > On Mon, Mar 17, 2025 at 10:52:25AM +0100, Michal Luczaj wrote: >> Signal delivery during connect() may lead to a disconnect of an already >> established socket. That involves removing socket from any sockmap and >> resetting state to SS_UNCONNECTED. While it correctly restores socket's >> proto, a call to vsock_bpf_recvmsg() might have been already under way in >> another thread. If the connect()ing thread reassigns the vsock transport to >> NULL, the recvmsg()ing thread may trigger a WARN_ON_ONCE. >> *THREAD 1* *THREAD 2* >> connect >> / state = SS_CONNECTED / >> sock_map_update_elem >> vsock_bpf_recvmsg >> psock = sk_psock_get() >> lock sk >> if signal_pending >> unhash >> sock_map_remove_links > > So vsock's ->recvmsg() should be restored after this, right? Then how is > vsock_bpf_recvmsg() called afterward? I'm not sure I understand the question, so I've added a header above: those are 2 parallel flows of execution. vsock_bpf_recvmsg() wasn't called afterwards. It was called before sock_map_remove_links(). Note that at the time of sock_map_remove_links() (in T1), vsock_bpf_recvmsg() is still executing (in T2). >> state = SS_UNCONNECTED >> release sk >> >> connect >> transport = NULL >> lock sk >> WARN_ON_ONCE(!vsk->transport) >> > > And I am wondering why we need to WARN here since we can handle this error > case correctly? The WARN and transport check are here for defensive measures, and to state a contract. But I think I get your point. If we accept for a fact of life that BPF code should be able to handle transport disappearing - then WARN can be removed (while keeping the check) and this patch can be dropped. My aim, instead, was to keep things consistent. By which I mean sticking to the conditions expressed in vsock_bpf_update_proto() as invariants; so that vsock with a psock is guaranteed to have transport assigned.