On Thursday 17 December 2009, Anthony Liguori wrote:
> When invoking qemu directly, for the first go about, I'd expect -net
> vhost,dev=eth0 for a raw device and -net vhost,mode=tap,tap-arguments.
>
> Long term, there are so many possible ways to layer things, that I'd
> really like to see:
>
> -net vepa,dev=eth0
>
> Which ends up invoking /usr/libexec/qemu-net-helper-vepa --arg-dev=eth0
> --socketpair=X --try-vhost.
>
> qemu-net-helper-vepa would do all of the fancy stuff of creating a
> macvtap device, trying to hook that up with vhost, sending us an fd over
> the socketpair telling us which interface it's using and what features
> were enabled.

We need to make sure not to hardcode the dependency from VEPA to macvtap
in your example, so I'm not sure if a VEPA specific helper is helpful.

We really have a tuple of policy, kernel implementation and qemu
implementation, with many possible combinations, currently at least
(ignoring UDP, TCP and VDE modes):

  nat-socket-user
  nat-bridge-tap
  nat-bridge-tap+vhost
  route-none-tap
  route-none-tap+vhost
  route-veth+macvlan-tap
  route-veth+macvlan-tap+vhost
  route-veth+macvlan-socket
  route-veth+macvlan-socket+vhost
  veb-bridge-tap
  veb-bridge-tap+vhost
  veb-macvlan-tap
  veb-macvlan-tap+vhost
  veb-macvlan-socket
  veb-macvlan-socket+vhost
  veb-sriov-socket
  veb-sriov-socket+vhost
  vepa-macvlan-tap
  vepa-macvlan-tap+vhost
  vepa-macvlan-socket
  vepa-macvlan-socket+vhost
  vepa-sriov-socket
  vepa-sriov-socket+vhost
  private-macvlan-tap
  private-macvlan-tap+vhost
  private-macvlan-socket
  private-macvlan-socket+vhost
  private-sriov-socket
  private-sriov-socket+vhost
  private-physdev-socket
  private-physdev-socket+vhost

If my plans for extending macvlan for SR-IOV work out, we will also have

  bridge-sriov-tap
  bridge-sriov-tap+vhost
  vepa-sriov-tap
  vepa-sriov-tap+vhost
  private-sriov-tap
  private-sriov-tap+vhost

As you can see, the policy is mostly independent from the qemu
implementation and even from the kernel implementation.
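
To illustrate that independence, the names in the list can be split
mechanically into policy, kernel implementation and qemu implementation,
plus an optional vhost suffix. The following is only a rough Python
sketch: Combo and parse_combo are hypothetical names made up for this
illustration and do not exist in qemu or libvirt, and the parsing
assumes the naming scheme used in the list above.

```python
from collections import namedtuple

# Hypothetical record type, for illustration only (not a qemu/libvirt type).
Combo = namedtuple("Combo", "policy kernel backend vhost")

def parse_combo(name):
    """Split e.g. 'vepa-macvlan-tap+vhost' into its three parts plus vhost."""
    # Only a trailing '+vhost' is the accelerator; a '+' elsewhere
    # (as in 'veth+macvlan') belongs to the kernel implementation.
    vhost = name.endswith("+vhost")
    base = name[:-len("+vhost")] if vhost else name
    parts = base.split("-")
    if len(parts) != 3:
        raise ValueError("expected policy-kernel-backend, got %r" % name)
    policy, kernel, backend = parts
    return Combo(policy, kernel, backend, vhost)

names = [
    "vepa-macvlan-tap",
    "vepa-macvlan-socket+vhost",
    "vepa-sriov-socket",
    "veb-macvlan-tap+vhost",
]
combos = [parse_combo(n) for n in names]

# The same policy (vepa) shows up on more than one kernel implementation.
vepa_kernels = sorted({c.kernel for c in combos if c.policy == "vepa"})
print(vepa_kernels)  # ['macvlan', 'sriov']
```

Keeping the three fields separate is what lets a caller ask for a policy
without naming a kernel mechanism or a qemu backend.
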
Naming the macvtap code in qemu '-net vepa' would completely mix up
things for people that want to use vepa with an SR-IOV card, or macvtap
in bridge mode!

The concept with the callout to an external program to deal with the
enormous number of variations absolutely makes sense, but the naming
needs to get better. In particular, I think that the policy should be
known only between the helper and libvirt (or the user), but not show up
anywhere in qemu, which can just pass all the options to the helper, and
let that one decide what to do.

E.g. "qemu -net host,mode=vepa,dev=eth0" can result in calling
"/usr/libexec/qemu-net-helper --mode=vepa --dev=eth0 --socketpair=X
--protocols=tap,socket,vhost". Then qemu-net-helper tries to find the
best way to set up a vepa on eth0, given the choice of tap, socket,
tap+vhost or socket+vhost, the system capabilities (sr-iov, macvlan,
macvtap driver) and the user permissions it is running with.

> That lets people infinitely extend qemu's networking support while allow
> us to focus on just implementing backends for the interfaces we're
> exposed to. AFAICT, that's just /dev/vhost, /dev/net/tun, and a normal
> socket. The later two can be reduced to a single read/write interface
> honestly.

Well, I think you are still required to use sendmsg/recvmsg with the
raw socket, not write/read, but aside from that I agree.

> No, net/ would essentially become a series of helper programs. What's
> nice about this approach is that libvirt could potentially use helpers
> too which would allow people to run qemu directly based on the output of
> ps -ef. Would certainly make debugging easier.

Right. Also, if we put the helpers into netcf or a similar library, more
applications that are unrelated to qemu could use them, e.g.
user-mode-linux, if they are interested.

> > Nope, not at all ;-)
> >
> > We do need to know if a VF is available or not (and if a PF has any of
> > its VFs used).
>
> "We need to know" or "it would be nice to know"?
>
> You can make the same argument about a physical network interface.

The difference to what we have today is that you can add an arbitrary
number of taps to a bridge, so you don't need to know if any other
guests are running when you add another one. But when you add a guest to
a VF, you need to be sure that no other guest uses the same VF, so this
needs system-wide coordination. libvirt can keep the state if it manages
all guests, but if you want to run guests without libvirt, you need
something like lock-files.

Arnd
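
The lock-file coordination mentioned in the last paragraph could look
roughly like the Python sketch below. It is an assumption-laden
illustration, not an existing interface: claim_vf, release_vf, the lock
directory and the "<pf>-vf<N>.lock" file naming are all hypothetical,
and it relies on flock(2) advisory locks, which only work if every
program that hands out VFs follows the same convention.

```python
import fcntl
import os

def claim_vf(lock_dir, pf, vf_index):
    """Try to claim one VF; return an fd holding the lock, or None if busy.

    Hypothetical helper: the lock file name is an assumed convention.
    """
    path = os.path.join(lock_dir, "%s-vf%d.lock" % (pf, vf_index))
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Exclusive, non-blocking: fail at once if someone else holds it.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd

def release_vf(fd):
    """Give the VF back; closing the descriptor drops the flock."""
    os.close(fd)
```

The caller would keep the returned descriptor open for as long as the
guest uses the VF. Because the kernel drops the lock when the descriptor
is closed, a crashed guest does not leave a stale claim behind.
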