On Wed, Jun 27, 2012 at 12:19 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote: > On Wed, Jun 27, 2012 at 08:41:49AM +0100, Stefan Hajnoczi wrote: >> On Wed, Jun 27, 2012 at 8:39 AM, Stefan Hajnoczi <stefanha@xxxxxxxxx> wrote: >> > On Tue, Jun 26, 2012 at 8:34 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote: >> >> On Sat, Jun 23, 2012 at 12:55:49AM +0200, Jan Kiszka wrote: >> >> net.txt >> >> -------- >> >> >> >> >> >> iothread flow >> >> ============= >> >> >> >> 1) Skip-work-if-device-locked >> >> >> >> select(tap fd ready) >> >> tap_send >> >> if (trylock(TAPState->NetClientState->dev)) >> >> proceed; >> >> else >> >> continue; (data remains in queue) >> >> tap_read_packet >> >> qemu_send_packet_async >> >> >> >> DRAWBACK: lock intensive. >> >> DRAWBACK: VCPU threads can starve IOTHREAD (can be fixed with >> >> a scheme such as trylock() marks a flag saying "iothread wants lock". >> > >> > One alternative is to say the lock cannot be held across system calls >> > or other long-running operations (similar to interrupt handler >> > spinlocks in the kernel). But QEMU threads are still at the mercy of >> > the scheduler or page faults, so perhaps this is not a good idea >> > because waiting on the lock could tie up the iothread. >> > >> >> 2) Queue-work-to-vcpu-context >> >> >> >> select(tap fd ready) >> >> tap_send >> >> if (trylock(TAPState->NetClientState->dev)) { >> >> proceed; >> >> } else { >> >> AddToDeviceWorkQueue(work); >> >> } >> >> tap_read_packet >> >> qemu_send_packet_async >> >> >> >> DRAWBACK: lock intensive >> >> DRAWBACK: cache effects >> > >> > This almost suggests we should invert packet reception for NICs. tap >> > fd ready should add a work item and the guest device calls into >> > net/tap.c to pull out any packets from the work function: >> > >> > tap_send >> > dev_add_work(work); >> > >> > virtio_net_rx_work_fn >> > while ((req = virtqueue_pop(rx_queue))) { >> > if (!net_receive_packet(netclient, req->buf)) { >> > break; /* no more packets available */ >> > } >> > /* ...mark req complete and push onto virtqueue... */ >> > } >> > virtqueue_notify(rx_queue); >> > >> > The idea is to avoid copies on the rx path and make the locking simple >> > by always deferring to a work function (which can be invoked >> > immediately if the device isn't locked). >> >> I just realized this approach bypasses net/queue.c. I think it works >> really well for the peer model (no "VLANs"). Basically flow control >> is dictated by the peer (receiver) and we don't need to use NetQueue >> anymore. Whether this works elegantly for slirp and other backends, >> I'm not sure yet. >> >> Stefan > > Right. The advantage is only backends where performance matters need to > be converted (and only net hw drivers that matter performance wise need > to be converted). > > Apparently you guys have very different ideas on the higher level > model here. It would be good to agree on one structure (including > modifications on the one above) to follow and share the work on that > direction. "You guys" == Ping Fan + Jan + Anthony + Avi? I'm not working directly on QBL removal myself. Stefan -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html