On Fri, Feb 28, 2025 at 06:27:59PM -0800, Jakub Kicinski wrote: > On Thu, 27 Feb 2025 18:50:13 +0000 Joe Damato wrote: > > @@ -2870,9 +2883,15 @@ static void refill_work(struct work_struct *work) > > for (i = 0; i < vi->curr_queue_pairs; i++) { > > struct receive_queue *rq = &vi->rq[i]; > > > > + rtnl_lock(); > > virtnet_napi_disable(rq); > > + rtnl_unlock(); > > + > > still_empty = !try_fill_recv(vi, rq, GFP_KERNEL); > > + > > + rtnl_lock(); > > virtnet_napi_enable(rq); > > + rtnl_unlock(); > > Looks to me like refill_work is cancelled _sync while holding rtnl_lock > from the close path. I think this could deadlock? Good catch, thank you! It looks like this is also the case in the failure path on virtnet_open. Jason: do you have any suggestions? It looks like in both open and close disable_delayed_refill is called first, before the cancel_delayed_work_sync. Would something like this solve the problem? diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 76dcd65ec0f2..457115300f05 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2880,6 +2880,13 @@ static void refill_work(struct work_struct *work) bool still_empty; int i; + spin_lock(&vi->refill_lock); + if (!vi->refill_enabled) { + spin_unlock(&vi->refill_lock); + return; + } + spin_unlock(&vi->refill_lock); + for (i = 0; i < vi->curr_queue_pairs; i++) { struct receive_queue *rq = &vi->rq[i];