On Sat, Mar 13, 2021 at 06:32 PM CET, Cong Wang wrote:
> On Fri, Mar 12, 2021 at 4:02 AM Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> wrote:
>>
>> On Wed, Mar 10, 2021 at 06:32 AM CET, Cong Wang wrote:
>> > diff --git a/net/core/sock_map.c b/net/core/sock_map.c
>> > index dd53a7771d7e..26ba47b099f1 100644
>> > --- a/net/core/sock_map.c
>> > +++ b/net/core/sock_map.c
>> > @@ -1540,6 +1540,7 @@ void sock_map_close(struct sock *sk, long timeout)
>> >  	saved_close = psock->saved_close;
>> >  	sock_map_remove_links(sk, psock);
>> >  	rcu_read_unlock();
>> > +	sk_psock_purge(psock);
>> >  	release_sock(sk);
>> >  	saved_close(sk, timeout);
>> >  }
>>
>> Nothing stops sk_psock_backlog from running after sk_psock_purge:
>>
>> CPU 1                                         CPU 2
>>
>> sk_psock_skb_redirect()
>>   sk_psock(sk_other)
>>   sock_flag(sk_other, SOCK_DEAD)
>>   sk_psock_test_state(psock_other,
>>                       SK_PSOCK_TX_ENABLED)
>>                                               sk_psock_purge()
>>   skb_queue_tail(&psock_other->ingress_skb, skb)
>>   schedule_work(&psock_other->work)
>>
>> And sock_orphan can run while we're in sendmsg/sendpage_unlocked:
>>
>> CPU 1                                         CPU 2
>>
>> sk_psock_backlog
>>   ...
>>   sendmsg_unlocked
>>     sock = sk->sk_socket
>>                                               tcp_close
>>                                                 __tcp_close
>>                                                   sock_orphan
>>     kernel_sendmsg(sock, msg, vec, num, size)
>>
>> So, after this change, without lock_sock in sk_psock_backlog, we will
>> not block tcp_close from running.
>>
>> This makes me think that the process socket can get released from under
>> us, before kernel_sendmsg/sendpage runs.
>
> I think you are right, I thought socket is orphaned in inet_release(), clearly
> I was wrong. But, I'd argue in the above scenario, the packet should not
> be even queued in the first place, as SK_PSOCK_TX_ENABLED is going
> to be cleared, so I think the right fix is probably to make clearing psock
> state and queuing the packet under a spinlock.

Sounds like a good idea. The goal, as I understand it, is to guarantee
that psock holds a ref count on the process socket for the duration of
the sk_psock_backlog() run.
That would let us get rid of not only lock_sock(), in favor of
finer-grained queue locks, but also the sock_flag(psock->sk, SOCK_DEAD)
check.