On 12/27/2013 01:46 PM, Eric Dumazet wrote: > On Fri, 2013-12-27 at 12:55 +0800, Jason Wang wrote: >> On 12/27/2013 05:56 AM, Eric Dumazet wrote: >>> On Thu, 2013-12-26 at 13:28 -0800, Michael Dalton wrote: >>>> On Mon, Dec 23, 2013 at 11:37 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: >>>>> So there isn't a conflict with respect to locking. >>>>> >>>>> Is it problematic to use same page_frag with both GFP_ATOMIC and with >>>>> GFP_KERNEL? If yes why? >>>> I believe it is safe to use the same page_frag and I will send out a >>>> followup patchset using just the per-receive page_frags. For future >>>> consideration, Eric noted that disabling NAPI before GFP_KERNEL >>>> allocs can potentially inhibit virtio-net network processing for some >>>> time (e.g., during a blocking memory allocation or preemption). >>> Yep, using napi_disable() in the refill process looks quite inefficient >>> to me, it not buggy. >>> >>> napi_disable() is a big hammer, while whole idea of having a process to >>> block on GFP_KERNEL allocations is to allow some asynchronous behavior. >>> >>> I have hard time to convince myself virtio_net is safe anyway with this >>> work queue thing. >>> >>> virtnet_open() seems racy for example : >>> >>> for (i = 0; i < vi->max_queue_pairs; i++) { >>> if (i < vi->curr_queue_pairs) >>> /* Make sure we have some buffers: if oom use wq. */ >>> if (!try_fill_recv(&vi->rq[i], GFP_KERNEL)) >>> schedule_delayed_work(&vi->refill, 0); >>> virtnet_napi_enable(&vi->rq[i]); >>> >>> >>> What if the workqueue is scheduled _before_ the call to virtnet_napi_enable(&vi->rq[i]) ? >> Then napi_disable() in refill_work() will busy wait until napi is >> enabled by virtnet_napi_enable() which looks safe. Looks like the real >> issue is in virtnet_restore() who calls try_fill_recv() in neither napi >> context nor napi disabled context. > I think you don't really get the race. > > The issue is the following : > > CPU0 CPU1 > > schedule_delayed_work() > napi_disable(&rq->napi); > try_fill_recv(rq, GFP_KERNEL); If I didn't miss anything. In this case, for a specific rq, napi_disable() won't return immediately since NAPI_STATE_SCHED bit was still set. It will busy wait until NAPI_STATE_SCHED bit was clear by virtnet_napi_enable(), and then reset the bit. So try_fill_recv() should be called after its napi was enabled by virtnet_napi_enable() in CPU0. So the following order were guaranteed: - try_fill_recv(rq, GFP_ATOMIC) in CPU0 - virtnet_napi_enable(&vi->rq[i]) in CPU0 - napi_disable(&rq->napi) returned in CPU1 - try_fill_recv(rq) in CPU1 ... > > virtnet_napi_enable(&vi->rq[i]); > ... > try_fill_recv(rq, GFP_ATOMIC); > > napi_enable();// crash on : > BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state)); > > > > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization