On Tue, 20 Dec 2011 21:45:19 +0200, "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote: > On Tue, Dec 20, 2011 at 11:31:54AM -0800, Tejun Heo wrote: > > On Tue, Dec 20, 2011 at 09:30:55PM +0200, Michael S. Tsirkin wrote: > > > Hmm, in that case it looks like a nasty race could get > > > triggered, with try_fill_recv run on multiple CPUs in parallel, > > > corrupting the linked list within the vq. > > > > > > Using the mutex as my patch did will fix that naturally, as well. > > > > Don't know the code but just use nrt wq. There's even a system one > > called system_nrq_wq. > > > > Thanks. > > We can, but we need the mutex for other reasons, anyway. Well, here's the alternate approach. What do you think? Finding two wq issues makes you justifiably cautious, but it almost feels like giving up to simply wrap it in a lock. The APIs are designed to let us do it without a lock; I was just using them wrong. Two patches in one mail. It's gauche, but it's an RFC only (untested). Cheers, Rusty. virtio_net: set/cancel work on ndo_open/ndo_stop Michael S. Tsirkin noticed that we could run the refill work after ndo_close, which can re-enable napi - we don't disable it until virtnet_remove. This is clearly wrong, so move the workqueue control to ndo_open and ndo_stop (aka. virtnet_open and virtnet_close). One subtle point: virtnet_probe() could simply fail if it couldn't allocate a receive buffer, but that's less polite in virtnet_open() so we schedule a refill as we do in the normal receive path if we run out of memory. diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -719,6 +719,10 @@ static int virtnet_open(struct net_devic { struct virtnet_info *vi = netdev_priv(dev); + /* Make sure we have some buffersl if oom use wq. */ + if (!try_fill_recv(vi, GFP_KERNEL)) + schedule_delayed_work(&vi->refill, 0); + virtnet_napi_enable(vi); return 0; } @@ -772,6 +776,8 @@ static int virtnet_close(struct net_devi { struct virtnet_info *vi = netdev_priv(dev); + /* Make sure refill_work doesn't re-enable napi! */ + cancel_delayed_work_sync(&vi->refill); napi_disable(&vi->napi); return 0; @@ -1082,7 +1088,6 @@ static int virtnet_probe(struct virtio_d unregister: unregister_netdev(dev); - cancel_delayed_work_sync(&vi->refill); free_vqs: vdev->config->del_vqs(vdev); free_stats: @@ -1121,9 +1126,7 @@ static void __devexit virtnet_remove(str /* Stop all the virtqueues. */ vdev->config->reset(vdev); - unregister_netdev(vi->dev); - cancel_delayed_work_sync(&vi->refill); /* Free unused buffers in both send and recv, if any. */ free_unused_bufs(vi); virtio_net: use non-reentrant workqueue. Michael S. Tsirkin also noticed that we could run the refill work multiple CPUs: if we kick off a refill on one CPU and then on another, they would both manipulate the queue at the same time (they use napi_disable to avoid racing against the receive handler itself). Tejun points out that this is what the WQ_NON_REENTRANT flag is for, and that there is a convenient system kthread we can use. diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -501,7 +501,7 @@ static void refill_work(struct work_stru /* In theory, this can happen: if we don't get any buffers in * we will *never* try to fill again. */ if (still_empty) - schedule_delayed_work(&vi->refill, HZ/2); + queue_delayed_work(system_nrt_wq, &vi->refill, HZ/2); } static int virtnet_poll(struct napi_struct *napi, int budget) @@ -520,7 +520,7 @@ again: if (vi->num < vi->max / 2) { if (!try_fill_recv(vi, GFP_ATOMIC)) - schedule_delayed_work(&vi->refill, 0); + queue_delayed_work(system_nrt_wq, &vi->refill, 0); } /* Out of packets? */ @@ -721,7 +721,7 @@ static int virtnet_open(struct net_devic /* Make sure we have some buffersl if oom use wq. */ if (!try_fill_recv(vi, GFP_KERNEL)) - schedule_delayed_work(&vi->refill, 0); + queue_delayed_work(&system_nrt_wq, &vi->refill, 0); virtnet_napi_enable(vi); return 0; _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization