* Daniel P. Berrange (berrange@xxxxxxxxxx) wrote:
> On Wed, May 13, 2015 at 10:00:42AM +0100, Dr. David Alan Gilbert wrote:
> > * Peter Krempa (pkrempa@xxxxxxxxxx) wrote:
> > > On Wed, May 13, 2015 at 09:40:23 +0100, Dr. David Alan Gilbert wrote:
> > > > * Peter Krempa (pkrempa@xxxxxxxxxx) wrote:
> > > > > On Wed, May 13, 2015 at 09:08:39 +0100, Dr. David Alan Gilbert wrote:
> > > > > > * Peter Krempa (pkrempa@xxxxxxxxxx) wrote:
> > > > > > > On Wed, May 13, 2015 at 11:36:26 +0800, Chen Fan wrote:
> > > > > > > > my main goal is to add support for migration with host NIC
> > > > > > > > passthrough devices and keep the network connectivity.
> > > > > > > >
> > > > > > > > this patch series is based on Shradha's patches on
> > > > > > > > https://www.redhat.com/archives/libvir-list/2012-November/msg01324.html
> > > > > > > > which add migration support for host passthrough devices.
> > > > > > > >
> > > > > > > > 1) unplug the ephemeral devices before migration
> > > > > > > >
> > > > > > > > 2) do native migration
> > > > > > > >
> > > > > > > > 3) when migration finished, hotplug the ephemeral devices
> > > > > > >
> > > > > > > IMHO this algorithm is something that an upper layer management app
> > > > > > > should do. The device unplug operation is complex and it might not
> > > > > > > succeed which will make the current migration thread hang or fail in an
> > > > > > > intermediate state that will not be recoverable.
> > > > > >
> > > > > > However you wouldn't want each of the upper layer management apps implementing
> > > > > > their own hacks for this; so something somewhere needs to standardise
> > > > > > what the guest sees.
> > > > >
> > > > > The guest still will see a PCI device unplug request and will have to
> > > > > respond to it, then will be paused and after resume a new PCI device
> > > > > will appear. This is standardised.
> > > > > The nonstandardised part (which can't
> > > > > really be standardised) is how the bonding or other guest-dependent
> > > > > stuff will be handled, but that is up to the guest OS to handle.
> > > >
> > > > Why can't that be standardised? Don't we need to provide the information
> > > > on what to bond to the guest and that this process is happening? The previous
> > > > suggestion was to use guest-agent for this.
> > >
> > > Well, since only in linux you've got multiple ways to do that including
> > > legacy init scripts on various distros, the systemd-networkd thingie or
> > > how it's called or network manager, standardising this part won't be
> > > that easy. Not speaking of possible different OSes.
> >
> > Right - so we need to standardise on the messaging we send to the guest to
> > tell it that we've got this bonded hotplug setup, and then the different
> > OSs can implement what they need off using that information.
> >
> > > > > From libvirt's perspective this is only something that will trigger the
> > > > > device unplug and plug the devices back. And there are a lot of issues
> > > > > here:
> > > > >
> > > > > 1) the destination of the migration might not have the desired devices
> > > > >
> > > > > This will trigger a lot of problems as we will not be able to guarantee
> > > > > that the devices reappear on the destination and if we'd wanted to check
> > > > > we'd need a new migration protocol AFAIK.
> > > >
> > > > But if it's using the bonding trick then that isn't fatal; it would still
> > > > be able to have the bonded virtio device.
> > > >
> > > > > 2) The guest OS might refuse to detach the PCI device (it might be stuck
> > > > > before PCI code is loaded)
> > > > >
> > > > > In that case the migration will be stuck forever and abort attempts
> > > > > will make the domain state basically undefined depending on the
> > > > > phase where it failed.
> > > > >
> > > > > Since we can't guarantee that the unplug of the PCI host devices will be
> > > > > atomic or that it will succeed we basically can't guarantee in any way
> > > > > in which state the VM will end up later after (a possibly failed)
> > > > > migration. To recover such state there are too many options that could be
> > > > > desired by the user that would be hard to implement in a way that would
> > > > > be flexible enough.
> > > >
> > > > I don't understand why this is any different to any other PCI device hot-unplug.
> > >
> > > It's the same, but once libvirt would be doing multiple PCI unplug
> > > requests along with the migration code, things might not go well. If you
> > > then couple this with different user expectations what should happen in
> > > various error cases it gets even more messy.
> >
> > Well, since we've got the bond it shouldn't get quite that bad; the
> > error cases don't sound that bad:
> > 1) If we can't hot-unplug then we don't migrate/cancel migration.
> > We warn the user, if we're unlucky we're left running on the bond.
> > 2) If we can't hot-plug at the end, then we've still got the bond in,
> > so the guest carries on running (albeit with reduced performance).
> > We need to flag this to the user somehow.
>
> If there are multiple PCI devices attached to the guest, we may end up
> with some PCI devices removed and some still present, and some for which
> we don't know if they are removed or present at all as the guest may simply
> not have responded to us yet. Further there are devices which are not just
> bonded NICs, so I'm really not happy for us to design a policy that works
> for bonded NICs but which is quite possibly going to be useless for other
> types of PCI device people will inevitably want to deal with later.

This is only trying to address the problem for devices that can have
the equivalent of a bond; so it's not NIC specific; the same should work
for storage devices with multipath.

Dave

>
> Regards,
> Daniel
> --
> |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org -o- http://virt-manager.org :|
> |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
--
Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
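
For readers following the thread, the three-step flow (unplug, migrate, hotplug) and the two error cases discussed above can be sketched roughly as below. This is only an illustrative sketch, not libvirt or QEMU code: the function name, the result type, and the `unplug`, `migrate` and `hotplug` callbacks are all hypothetical stand-ins, each assumed to return True on success. Trying to re-plug already-removed devices after a failed unplug or a failed migration is also an assumption, not something the thread settles on. It deliberately ignores the "guest has not responded yet" state that Daniel raises.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MigrationResult:
    migrated: bool   # True if the native migration step completed
    degraded: bool   # True if a device is missing, so the guest runs on the bond
    warnings: List[str] = field(default_factory=list)


def _replug(devices: List[str], hotplug: Callable[[str], bool],
            warnings: List[str]) -> bool:
    """Try to hotplug each device; return True if any could not be restored."""
    degraded = False
    for dev in devices:
        if not hotplug(dev):
            degraded = True
            warnings.append(f"hotplug of {dev} failed; guest stays on the bond")
    return degraded


def migrate_with_ephemeral_devices(
    devices: List[str],
    unplug: Callable[[str], bool],   # hypothetical: request guest unplug
    migrate: Callable[[], bool],     # hypothetical: run the native migration
    hotplug: Callable[[str], bool],  # hypothetical: plug the device back in
) -> MigrationResult:
    warnings: List[str] = []
    removed: List[str] = []

    # Step 1: unplug the ephemeral devices before migration.
    for dev in devices:
        if unplug(dev):
            removed.append(dev)
            continue
        # Error case 1: cannot hot-unplug, so do not migrate and warn the user.
        warnings.append(f"unplug of {dev} failed; migration not started")
        # Assumption: try to give back whatever was already removed.
        degraded = _replug(removed, hotplug, warnings)
        return MigrationResult(False, degraded, warnings)

    # Step 2: native migration.
    migrated = migrate()
    if not migrated:
        warnings.append("migration failed; restoring devices on the source")

    # Step 3: hotplug the devices again. Error case 2: if this fails the
    # guest carries on over the bond and the failure is flagged.
    degraded = _replug(removed, hotplug, warnings)
    return MigrationResult(migrated, degraded, warnings)
```

With callbacks that always succeed the result is a clean migration with no warnings; if only `hotplug` fails, the result reports a completed migration with `degraded` set and one warning per missing device.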