Re: [Patch bpf] sock_map: convert cancel_work_sync() to cancel_work()

Jakub Sitnicki wrote:
> On Fri, Oct 28, 2022 at 12:16 PM -07, Cong Wang wrote:
> > On Mon, Oct 24, 2022 at 03:33:13PM +0200, Jakub Sitnicki wrote:
> >> On Tue, Oct 18, 2022 at 11:13 AM -07, sdf@xxxxxxxxxx wrote:
> >> > On 10/17, Cong Wang wrote:
> >> >> From: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> >> >
> >> >> Technically we don't need lock the sock in the psock work, but we
> >> >> need to prevent this work running in parallel with sock_map_close().
> >> >
> >> >> With this, we no longer need to wait for the psock->work synchronously,
> >> >> because when we reach here, either this work is still pending, or
> >> >> blocking on the lock_sock(), or it is completed. We only need to cancel
> >> >> the first case asynchronously, and we need to bail out the second case
> >> >> quickly by checking SK_PSOCK_TX_ENABLED bit.
> >> >
> >> >> Fixes: 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
> >> >> Reported-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> >> >> Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> >> >> Cc: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx>
> >> >> Signed-off-by: Cong Wang <cong.wang@xxxxxxxxxxxxx>
> >> >
> >> > This seems to remove the splat for me:
> >> >
> >> > Tested-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> >> >
> >> > The patch looks good, but I'll leave the review to Jakub/John.
> >> 
> >> I can't poke any holes in it either.
> >> 
> >> However, it is harder for me to follow than the initial idea [1].
> >> So I'm wondering if there was anything wrong with it?
> >
> > It caused a warning in sk_stream_kill_queues() when I actually tested
> > it (after posting).
> 
> We must have seen the same warnings. They seemed unrelated so I went
> digging. We have a fix for these [1]. They were present since 5.18-rc1.
> 
> >> This seems like a step back when comes to simplifying locking in
> >> sk_psock_backlog() that was done in 799aa7f98d53.
> >
> > Kinda, but it is still true that this sock lock is not for sk_socket
> > (merely for closing this race condition).
> 
> I really think the initial idea [2] is much nicer. I can turn it into a
> patch, if you are short on time.
> 
> With [1] and [2] applied, the dead lock and memory accounting warnings
> are gone, when running `test_sockmap`.
> 
> Thanks,
> Jakub
> 
> [1] https://lore.kernel.org/netdev/1667000674-13237-1-git-send-email-wangyufen@xxxxxxxxxx/
> [2] https://lore.kernel.org/netdev/Y0xJUc%2FLRu8K%2FAf8@pop-os.localdomain/

Cong, what do you think? I tend to agree that [2] looks nicer.

@Jakub,

Also I think we could simply drop the proposed cancel_work_sync in
sock_map_close()?

 }
@@ -1619,9 +1619,10 @@ void sock_map_close(struct sock *sk, long timeout)
 	saved_close = psock->saved_close;
 	sock_map_remove_links(sk, psock);
 	rcu_read_unlock();
-	sk_psock_stop(psock, true);
-	sk_psock_put(sk, psock);
+	sk_psock_stop(psock);
 	release_sock(sk);
+	cancel_work_sync(&psock->work);
+	sk_psock_put(sk, psock);
 	saved_close(sk, timeout);
 }

The sk_psock_put is going to cancel the work before destroying the psock,

 sk_psock_put()
   sk_psock_drop()
     queue_rcu_work(system_wq, psock->rwork)

and then in the callback we call

  sk_psock_destroy()
    cancel_work_sync(psock->work)

although it might be nice to have the work cancelled earlier rather than
later.
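
To make the ordering above concrete, here is a minimal userspace sketch
in C. It is a toy model, not kernel code: every model_* name and the
struct layout are made up for illustration, and the RCU grace period
plus workqueue are collapsed into a single explicit call. It only shows
that dropping the last reference queues the destroy callback, and that
the work stays pending until that callback runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a psock; not the kernel's struct sk_psock. */
struct model_psock {
	int refcnt;          /* stands in for the psock reference count */
	bool work_pending;   /* stands in for a queued psock->work */
	bool destroy_queued; /* stands in for the queued psock->rwork */
	bool destroyed;      /* set once the destroy callback has run */
};

/* Models cancel_work_sync(): on return the work is no longer pending. */
static void model_cancel_work_sync(struct model_psock *p)
{
	p->work_pending = false;
}

/* Models sk_psock_destroy(), the deferred callback. */
static void model_psock_destroy(struct model_psock *p)
{
	model_cancel_work_sync(p);
	p->destroyed = true;
}

/* Models sk_psock_drop(): it only queues the destroy callback. */
static void model_psock_drop(struct model_psock *p)
{
	p->destroy_queued = true;
}

/* Models sk_psock_put(): the last reference triggers the drop. */
static void model_psock_put(struct model_psock *p)
{
	if (--p->refcnt == 0)
		model_psock_drop(p);
}

/* Models the grace period ending and the workqueue running the callback. */
static void model_run_deferred(struct model_psock *p)
{
	if (p->destroy_queued) {
		p->destroy_queued = false;
		model_psock_destroy(p);
	}
}
```

Starting from refcnt == 1 with work_pending set, model_psock_put()
leaves the work pending, and only model_run_deferred() clears it.
Calling model_cancel_work_sync() before the put, as the hunk above does
with the real cancel_work_sync(), closes that window in the model.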

Thanks,
John


