Anthony Liguori <anthony@xxxxxxxxxxxxx> writes:
> "Michael S. Tsirkin" <mst@xxxxxxxxxx> writes:
>
>> On Thu, May 30, 2013 at 08:40:47AM -0500, Anthony Liguori wrote:
>>> Stefan Hajnoczi <stefanha@xxxxxxxxx> writes:
>>>
>>> > On Thu, May 30, 2013 at 7:23 AM, Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:
>>> >> Anthony Liguori <anthony@xxxxxxxxxxxxx> writes:
>>> >>> Rusty Russell <rusty@xxxxxxxxxxxxxxx> writes:
>>> >>>> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
>>> >>>>> FWIW, I think what's more interesting is using vhost-net as a networking
>>> >>>>> backend with virtio-net in QEMU being what's guest facing.
>>> >>>>>
>>> >>>>> In theory, this gives you the best of both worlds: QEMU acts as a first
>>> >>>>> line of defense against a malicious guest while still getting the
>>> >>>>> performance advantages of vhost-net (zero-copy).
>>> >>>>>
>>> >>>> It would be an interesting idea if we didn't already have the vhost
>>> >>>> model where we don't need the userspace bounce.
>>> >>>
>>> >>> The model is very interesting for QEMU because then we can use vhost as
>>> >>> a backend for other types of network adapters (like vmxnet3 or even
>>> >>> e1000).
>>> >>>
>>> >>> It also helps for things like fault tolerance where we need to be able
>>> >>> to control packet flow within QEMU.
>>> >>
>>> >> (CC's reduced, context added, Dmitry Fleytman added for vmxnet3 thoughts).
>>> >>
>>> >> Then I'm really confused as to what this would look like. A zero copy
>>> >> sendmsg? We should be able to implement that today.
>>> >>
>>> >> On the receive side, what can we do better than readv? If we need to
>>> >> return to userspace to tell the guest that we've got a new packet, we
>>> >> don't win on latency. We might reduce syscall overhead with a
>>> >> multi-dimensional readv to read multiple packets at once?
>>> >
>>> > Sounds like recvmmsg(2).
>>>
>>> Could we map this to mergable rx buffers though?
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>
>> Yes because we don't have to complete buffers in order.
>
> What I meant though was for GRO, we don't know how large the received
> packet is going to be. Mergable rx buffers lets us allocate a pool of
> data for all incoming packets instead of allocating max packet size *
> max packets.
>
> recvmmsg expects an array of msghdrs and I presume each needs to be
> given a fixed size. So this seems incompatible with mergable rx
> buffers.

Good point. You'd need to build 64k buffers to pass to recvmmsg, then
reuse the parts it didn't touch on the next call.

This limits us to about a 16th of what we could do with an interface
which understood buffer merging, but I don't know how much that would
matter in practice. We'd need some benchmarks....

Cheers,
Rusty.
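
For illustration, here is a rough sketch of the buffer accounting behind the "16th" figure. It rests on assumptions that are not stated in the thread: guest rx buffers are taken to be 4 KiB each, the largest GRO packet is taken to be 64 KiB, and the per-packet virtio-net header is ignored. All names (GUEST_BUF_SIZE, fill_slot, and so on) are hypothetical and come from neither QEMU nor the kernel. The sketch only builds the iovec layout that each message slot of a recvmmsg(2) call would need; it does not call recvmmsg itself.

```c
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

enum {
    GUEST_BUF_SIZE = 4096,   /* ASSUMPTION: size of one guest rx buffer */
    MAX_PKT_SIZE   = 65536,  /* the "64k buffers" mentioned above */
    BUFS_PER_MSG   = MAX_PKT_SIZE / GUEST_BUF_SIZE
};

/* Messages we can offer per call when every message slot must be able
 * to hold a maximum-size packet, as a fixed-size msghdr array requires. */
static size_t max_msgs_fixed(size_t nbufs)
{
    return nbufs / BUFS_PER_MSG;
}

/* Upper bound for an interface that understands buffer merging:
 * a small packet consumes a single guest buffer. */
static size_t max_msgs_merged(size_t nbufs)
{
    return nbufs;
}

/* Guest buffers actually consumed by a packet of len bytes
 * (len is assumed to be at most MAX_PKT_SIZE). */
static size_t bufs_used(size_t len)
{
    return (len + GUEST_BUF_SIZE - 1) / GUEST_BUF_SIZE;
}

/* Buffers in a slot that the receive did not touch, and that could be
 * handed to the next call instead of being completed to the guest. */
static size_t bufs_reusable(size_t msg_len)
{
    return BUFS_PER_MSG - bufs_used(msg_len);
}

/* Describe one message slot as BUFS_PER_MSG iovecs, starting at guest
 * buffer index first_buf inside pool. Returns the iovec count that
 * would go into msg_iovlen for that slot. */
static size_t fill_slot(struct iovec *iov, char *pool, size_t first_buf)
{
    for (size_t i = 0; i < BUFS_PER_MSG; i++) {
        iov[i].iov_base = pool + (first_buf + i) * GUEST_BUF_SIZE;
        iov[i].iov_len  = GUEST_BUF_SIZE;
    }
    return BUFS_PER_MSG;
}
```

Under these assumptions, 256 posted guest buffers allow only 16 message slots per call, against up to 256 small packets for a merging-aware interface. A 1500-byte packet would leave 15 of its slot's 16 buffers available for reuse on the next call.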