Re: [Qemu-devel] [RFC 0/7] Live Migration with Pass-through Devices proposal

* Michael S. Tsirkin (mst@xxxxxxxxxx) wrote:
> On Tue, May 19, 2015 at 04:45:03PM +0100, Daniel P. Berrange wrote:
> > On Tue, May 19, 2015 at 05:39:05PM +0200, Michael S. Tsirkin wrote:
> > > On Tue, May 19, 2015 at 04:35:08PM +0100, Daniel P. Berrange wrote:
> > > > On Tue, May 19, 2015 at 04:03:04PM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Daniel P. Berrange (berrange@xxxxxxxxxx) wrote:
> > > > > > On Tue, May 19, 2015 at 10:15:17AM -0400, Laine Stump wrote:
> > > > > > > On 05/19/2015 05:07 AM, Michael S. Tsirkin wrote:
> > > > > > > > On Wed, Apr 22, 2015 at 10:23:04AM +0100, Daniel P. Berrange wrote:
> > > > > > > >> On Fri, Apr 17, 2015 at 04:53:02PM +0800, Chen Fan wrote:
> > > > > > > >>> background:
> > > > > > > >>> Live migration is one of the most important features of virtualization technology.
> > > > > > > >>> With recent virtualization techniques, the performance of network I/O is critical.
> > > > > > > >>> Current network I/O virtualization (e.g. para-virtualized I/O, VMDq) has a significant
> > > > > > > >>> performance gap compared with native network I/O. Pass-through network devices offer
> > > > > > > >>> near-native performance; however, they have so far prevented live migration. No existing
> > > > > > > >>> method solves the problem of live migration with pass-through devices completely.
> > > > > > > >>>
> > > > > > > >>> There was an idea to solve the problem, described in the following paper:
> > > > > > > >>> https://www.kernel.org/doc/ols/2008/ols2008v2-pages-261-267.pdf
> > > > > > > >>> Please refer to the above document for detailed information.
> > > > > > > >>>
> > > > > > > >>> So I think this problem could perhaps be solved by combining existing
> > > > > > > >>> technologies. The following are the steps we are considering implementing:
> > > > > > > >>>
> > > > > > > >>> -  before booting the VM, we expect to specify in the XML two NICs for creating a bonding
> > > > > > > >>>    device (one pass-through and one virtual NIC). Here we can specify the NICs' MAC addresses
> > > > > > > >>>    in the XML, which helps qemu-guest-agent find the network interfaces in the guest.
> > > > > > > >>>
> > > > > > > >>> -  when qemu-guest-agent starts up in the guest, it would send a notification to libvirt,
> > > > > > > >>>    and libvirt would then call the previously registered initialization callbacks. Through
> > > > > > > >>>    those callback functions we can create the bonding device according to the XML
> > > > > > > >>>    configuration; here we use the netcf tool, which makes it easy to create the bonding
> > > > > > > >>>    device.
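For illustration only, here is a rough sketch of what the two paired NIC
definitions with known MAC addresses might look like, held as Python strings of
standard domain XML fragments; the MAC values, network name and PCI address are
placeholders, and the pairing semantics proposed in this thread do not exist in
libvirt today:

    # Placeholder domain XML fragments for the paired NICs.  The fixed, known MAC
    # addresses are what would let the guest agent match the interfaces inside the
    # guest; all concrete values below are made up for illustration.
    virtio_nic_xml = """
    <interface type='network'>
      <source network='default'/>
      <mac address='52:54:00:00:00:01'/>
      <model type='virtio'/>
    </interface>
    """

    passthrough_nic_xml = """
    <interface type='hostdev' managed='yes'>
      <source>
        <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
      </source>
      <mac address='52:54:00:00:00:02'/>
    </interface>
    """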
> > > > > > > >> I'm not really clear on why libvirt/guest agent needs to be involved in this.
> > > > > > > >> I think configuration of networking is really something that must be left to
> > > > > > > >> the guest OS admin to control. I don't think the guest agent should be trying
> > > > > > > >> to reconfigure guest networking itself, as that is inevitably going to conflict
> > > > > > > >> with configuration attempted by things in the guest like NetworkManager or
> > > > > > > >> systemd-networkd.
> > > > > > > > There should not be a conflict.
> > > > > > > > The guest agent should just give NM the information, and have NM do
> > > > > > > > the right thing.
> > > > > > > 
> > > > > > > That assumes the guest will have NM running. Unless you want to severely
> > > > > > > limit the scope of usefulness, you also need to handle systems that have
> > > > > > > NM disabled, and among those the different styles of system network
> > > > > > > config. It gets messy very fast.
> > > > > > 
> > > > > > Also, OpenStack already has a way to pass guests information about the
> > > > > > required network setup, via cloud-init, so it would not be interested
> > > > > > in anything that used the QEMU guest agent to configure
> > > > > > NetworkManager. This is really just another example of why this does not
> > > > > > belong anywhere in libvirt or lower.  The decision to use NM is a
> > > > > > policy decision that will always be wrong for a non-negligible set
> > > > > > of use cases and as such does not belong in libvirt or QEMU. It is
> > > > > > the job of higher level apps to make that kind of policy decision.
> > > > > 
> > > > > This is exactly my worry though; why should every higher-level management
> > > > > system have its own way of communicating network config for hotpluggable
> > > > > devices?  You shouldn't need to reconfigure a VM to move it between them.
> > > > > 
> > > > > This just makes it hard to move it between management layers; there needs
> > > > > to be some standardisation (or abstraction) of this;  if libvirt isn't the place
> > > > > to do it, then what is?
> > > > 
> > > > NB, openstack isn't really defining a custom thing for networking here. It
> > > > is actually integrating with the standard cloud-init guest tools for this
> > > > task. Also note that OpenStack has defined a mechanism that works for
> > > > guest images regardless of what hypervisor they are running on - ie does
> > > > not rely on any QEMU or libvirt specific functionality here.
> > > 
> > > I'm not sure what the implication is.  No new functionality should be
> > > implemented unless we also add it to VMware?  People who don't want
> > > KVM-specific functionality won't use it.
> > 
> > I'm saying that standardization of virtualization policy in libvirt is the
> > wrong solution, because different applications will have different viewpoints
> > as to what "standardization" is useful / appropriate. Creating a standardized
> > policy in libvirt for KVM does not help OpenStack; it may help people who only
> > care about KVM, but that is not the entire ecosystem. OpenStack has a
> > standardized solution for guest configuration information that works across
> > all the hypervisors it targets.  This is just yet another example of exactly
> > why libvirt aims to design its APIs such that they expose direct mechanisms
> > and leave usage policy decisions up to the management applications. Libvirt
> > is not best placed to decide which policy all these mgmt apps must use for
> > this task.
> > 
> > Regards,
> > Daniel
> 
> 
> I don't think we are pushing policy in libvirt here.
> 
> What we want is a mechanism that lets users specify in the XML:
> interface X is a fallback for pass-through device Y.
> Then, when requesting migration, specify that it should use
> device Z on the destination as a replacement for Y.
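As a purely hypothetical sketch of how such a pairing might be written (no such
element exists in libvirt today; the <fallback> element and its 'device'
attribute are invented here for illustration, and the alias and MAC are
placeholders):

    # Hypothetical only: the <fallback> element below is NOT real libvirt syntax;
    # it just illustrates the "interface X is a fallback for pass-through device Y"
    # idea discussed in this thread.
    paired_virtio_xml = """
    <interface type='network'>
      <mac address='52:54:00:00:00:01'/>
      <model type='virtio'/>
      <fallback device='hostdev0'/>  <!-- invented element: Y's device alias -->
    </interface>
    """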
> 
> We are asking libvirt to automatically
> 1.- when migration is requested, request unplug of Y
> 2.- wait until Y is deleted
> 3.- start migration
> 4.- wait until migration is completed
> 5.- plug device Z on destination
> 
> I don't see any policy above: libvirt is in control of migration and
> seems best placed to implement this.
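A minimal sketch of that five-step sequence using the libvirt Python bindings,
assuming placeholder host/domain names and placeholder hostdev XML; waiting for
the guest to actually release Y (step 2) is only noted in a comment, and error
handling is omitted:

    import libvirt

    # Placeholder XML for the pass-through NIC on the source (Y) and its
    # replacement on the destination (Z); the PCI addresses are made up.
    passthrough_nic_xml = """<interface type='hostdev' managed='yes'>
      <source>
        <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
      </source>
    </interface>"""
    replacement_nic_xml = """<interface type='hostdev' managed='yes'>
      <source>
        <address type='pci' domain='0x0000' bus='0x05' slot='0x10' function='0x0'/>
      </source>
    </interface>"""

    src = libvirt.open('qemu:///system')
    dst = libvirt.open('qemu+ssh://dest-host/system')  # placeholder URI
    dom = src.lookupByName('guest')                    # placeholder domain name

    # 1./2. request unplug of Y; a real implementation would wait for the
    # device-removed event before proceeding, since detach needs guest cooperation
    dom.detachDeviceFlags(passthrough_nic_xml, libvirt.VIR_DOMAIN_AFFECT_LIVE)

    # 3./4. live-migrate; migrate() returns once the migration has completed
    new_dom = dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)

    # 5. plug the replacement pass-through NIC (Z) on the destination
    new_dom.attachDeviceFlags(replacement_nic_xml, libvirt.VIR_DOMAIN_AFFECT_LIVE)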

The steps that list is missing are:
  0. Tell the guest that *this* virtio NIC (X) and *this* real NIC (Y) are a bond pair
  6. Tell the guest that *this* virtio NIC (X) and *this* real NIC (Z) are a bond pair

  0 has to happen both at startup and at hotplug of a new pair;  I'm not clear
whether 6 is actually needed; that depends on whether it can be done based on what was in 0.
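To make step 0 concrete, a rough guest-side sketch (assumptions: the MACs are
the placeholder values used above, interface names are discovered at runtime,
and a real agent would use netcf / NetworkManager / distro tooling rather than
raw `ip` commands):

    import subprocess

    def iface_by_mac(mac):
        # Find the interface whose MAC matches; `ip -o link` prints one interface
        # per line as "N: name: <flags> ... link/ether aa:bb:cc:dd:ee:ff ...".
        out = subprocess.check_output(['ip', '-o', 'link'], text=True)
        for line in out.splitlines():
            if mac.lower() in line.lower():
                return line.split(':')[1].strip()
        raise LookupError(mac)

    virtio  = iface_by_mac('52:54:00:00:00:01')   # X, the virtio fallback NIC
    hostdev = iface_by_mac('52:54:00:00:00:02')   # Y, the pass-through NIC

    # Create an active-backup bond and enslave both NICs (slaves must be down
    # before they can be added to the bond).
    subprocess.check_call(['ip', 'link', 'add', 'bond0', 'type', 'bond',
                           'mode', 'active-backup'])
    for slave in (virtio, hostdev):
        subprocess.check_call(['ip', 'link', 'set', slave, 'down'])
        subprocess.check_call(['ip', 'link', 'set', slave, 'master', 'bond0'])
    subprocess.check_call(['ip', 'link', 'set', 'bond0', 'up'])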

Dave

> 
> 
> 
> > -- 
> > |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> > |: http://libvirt.org              -o-             http://virt-manager.org :|
> > |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> > |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
--
Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list



