* Yang Zhang (yang.zhang.wz@xxxxxxxxx) wrote:
> On 2015/12/10 18:18, Dr. David Alan Gilbert wrote:
> >* Lan, Tianyu (tianyu.lan@xxxxxxxxx) wrote:
> >>On 12/8/2015 12:50 AM, Michael S. Tsirkin wrote:
> >>>I thought about what this is doing at the high level, and I do see
> >>>some value in what you are trying to do, but I also think we need to
> >>>clarify the motivation a bit more. What you are saying is not really
> >>>what the patches are doing.
> >>>
> >>>And with that clearer understanding of the motivation in mind
> >>>(assuming it actually captures a real need), I would also like to
> >>>suggest some changes.
> >>
> >>Motivation:
> >>Most current solutions for migration with a passthrough device are
> >>based on PCI hotplug, but hotplug has side effects and doesn't work
> >>for all devices.
> >>
> >>For NIC devices:
> >>The PCI hotplug solution can work around network device migration
> >>by switching between the VF and the PV interface.
> >>
> >>But switching network interfaces introduces service downtime.
> >>
> >>I tested the service downtime by putting the VF and PV interfaces
> >>into a bonded interface and pinging the bonded interface while
> >>plugging and unplugging the VF:
> >>1) About 100ms when adding the VF
> >>2) About 30ms when removing the VF
> >>
> >>It also requires the guest to do the switch configuration. These
> >>steps are hard for our customers to manage and deploy. To maintain
> >>PV performance during migration, the host side also needs to assign
> >>a VF to the PV device. This affects scalability.
> >>
> >>These factors block SR-IOV NIC passthrough usage in cloud services
> >>and OPNFV, which require high network performance and stability.
> >
> >Right, I'll agree that it's hard to do migration of a VM which uses
> >an SR-IOV device; and while I think it should be possible to bond a
> >virtio device to a VF for networking and then hotplug the SR-IOV
> >device, I agree it's hard to manage.
> >
> >>For other kinds of devices, it's hard to make this work.
> >>We are also adding migration support for the QAT (QuickAssist
> >>Technology) device.
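The downtime numbers above come from pinging the bonded interface while the VF is added or removed: the outage is the span covered by consecutive failed probes. A minimal sketch of that computation (a hypothetical helper for illustration, not code from the patch series):

```python
def downtime_ms(samples):
    """Estimate service downtime from (timestamp_ms, ok) ping samples.

    An outage runs from the first failed probe to the first successful
    probe that follows it; total downtime sums all such windows.
    """
    total = 0
    outage_start = None
    for ts, ok in samples:
        if not ok and outage_start is None:
            outage_start = ts            # outage begins
        elif ok and outage_start is not None:
            total += ts - outage_start   # outage ends
            outage_start = None
    return total

# e.g. a probe every 10ms while the VF is added to the bond:
samples = [(0, True), (10, True), (20, False), (30, False), (40, False),
           (50, False), (60, False), (70, False), (80, False), (90, False),
           (100, False), (110, False), (120, True), (130, True)]
print(downtime_ms(samples))  # 100, matching the ~100ms "add VF" case above
```

The probe interval bounds the measurement resolution, so a 10ms ping interval can only resolve downtime to within about 10ms.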
> >>
> >>QAT device use case introduction:
> >>Server, networking, big data, and storage applications use
> >>QuickAssist Technology to offload compute-intensive operations from
> >>servers, such as:
> >>1) Symmetric cryptography functions, including cipher operations and
> >>authentication operations
> >>2) Public key functions, including RSA, Diffie-Hellman, and elliptic
> >>curve cryptography
> >>3) Compression and decompression functions, including DEFLATE and LZS
> >>
> >>PCI hotplug will not work for such devices during migration: these
> >>operations will fail when the device is unplugged.
> >
> >I don't understand that QAT argument; if the device is purely an
> >offload engine for performance, then why can't you fall back to doing
> >the same operations in the VM or in QEMU if the card is unavailable?
> >The tricky bit is dealing with outstanding operations.
> >
> >>So we are trying to implement a new solution which really migrates
> >>device state to the target machine and won't affect the user during
> >>migration, with low service downtime.
> >
> >Right, that's a good aim - the only question is how to do it.
> >
> >It looks like this is always going to need some device-specific code;
> >the question I see is whether that's in:
> >  1) qemu
> >  2) the host kernel
> >  3) the guest kernel driver
> >
> >The objections to this series seem to be that it needs changes to (3);
> >I can see the worry that the guest kernel driver might not get a
> >chance to run at the right time during migration, and it's painful
> >having to change every guest driver (although your change is small).
> >
> >My question is: at what stage of the migration process do you expect
> >to tell the guest kernel driver to do this?
> >
> >  If you do it at the start of the migration, and quiesce the device,
> >  the migration might take a long time (say 30 minutes) - are you
> >  intending the device to be quiesced for this long? And where are
> >  you going to send the traffic?
> >  If you are, then do you need to do it via this PCI trick, or could
> >  you just do it via something higher level to quiesce the device?
> >
> >  Or are you intending to do it just near the end of the migration?
> >  But then how do we know how long it will take the guest driver to
> >  respond?
>
> Ideally, we would be able to leave the guest driver unmodified, but
> that requires the hypervisor or qemu to be aware of the device, which
> means we may need a driver in the hypervisor or qemu to handle the
> device on behalf of the guest driver.

Can you answer the question of when you use your code - at the start of
migration or just before the end?

> >It would be great if we could avoid changing the guest; but at least
> >your guest driver changes don't actually seem to be that hardware
> >specific; could your changes actually be moved to the generic PCI
> >level so they could be made to work for lots of drivers?
>
> It is impossible to use one common solution for all devices unless the
> PCIe spec documents it clearly, and I think one day it will be there.
> But before that, we need some workarounds in the guest driver to make
> it work, even if it looks ugly.

Dave

> --
> best regards
> yang
--
Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK
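On the software-fallback idea raised for QAT earlier in the thread: for a pure offload engine, new requests could in principle be routed to a software implementation while the card is quiesced during migration. A hypothetical sketch of that pattern, using zlib's DEFLATE as the stand-in software path (the class and its methods are illustrative, not from any real QAT driver; as noted in the thread, the hard part in practice is draining outstanding operations, which this sketch ignores):

```python
import zlib


class CompressionEngine:
    """Offload-with-fallback sketch: use the hardware path when present,
    fall back to software when the device is quiesced (e.g. during
    migration). `hw_compress` stands in for an accelerator handle."""

    def __init__(self, hw_compress=None):
        self.hw_compress = hw_compress   # None => device unavailable

    def quiesce(self):
        """Stop sending new work to the card (migration starting)."""
        self.hw_compress = None

    def compress(self, data: bytes) -> bytes:
        if self.hw_compress is not None:
            return self.hw_compress(data)   # hardware offload path
        return zlib.compress(data)          # software DEFLATE fallback


# The caller sees no failure, only (possibly) lower throughput:
engine = CompressionEngine()
engine.quiesce()                            # migration in progress
out = engine.compress(b"payload" * 100)
assert zlib.decompress(out) == b"payload" * 100
```

This only covers requests issued after the quiesce point; requests already in flight on the card still need device-specific draining or replay, which is exactly the state the series tries to migrate.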