On Thu, Dec 10, 2015 at 6:38 AM, Lan, Tianyu <tianyu.lan@xxxxxxxxx> wrote:
>
> On 12/10/2015 7:41 PM, Dr. David Alan Gilbert wrote:
>>>
>>> Ideally, we would be able to leave the guest driver unmodified, but that
>>> requires the hypervisor or qemu to be aware of the device, which means
>>> we may need a driver in the hypervisor or qemu to handle the device on
>>> behalf of the guest driver.
>>
>> Can you answer the question of when do you use your code -
>> at the start of migration or
>> just before the end?
>
> Just before stopping the VCPU in this version, and we inject a VF mailbox
> irq to notify the driver if the irq handler is installed.
> The Qemu side will also check this via the faked PCI migration capability,
> and the driver will set the status during its device open() or resume()
> callback.

The VF mailbox interrupt is a very bad idea.  Really the device should be
in a reset state on the other side of a migration.  It doesn't make sense
to have the interrupt firing if the device is not configured.  This is one
of the things that is preventing you from being able to migrate the device
while the interface is administratively down or the VF driver is not
loaded.

My thought on all this is that it might make sense to move this
functionality into a PCI-to-PCI bridge device and make it a requirement
that all direct-assigned devices have to exist behind that device in order
to support migration.  That way you would be working with a directly
emulated device that would likely already be supporting hot-plug anyway.
Then it would just be a matter of coming up with a few Qemu-specific
extensions that you would need to add to the device itself.  The same
approach would likely be portable enough that you could achieve it with
PCIe as well, via the same configuration space being present on the
upstream side of a PCIe port or maybe a PCIe switch of some sort.

It would then be possible to signal via your vendor-specific PCI
capability on that device that all devices behind the bridge require DMA
page dirtying.  You could use that configuration space, in addition to the
interrupt already provided for hot-plug, to signal things like when you
are starting migration, and possibly even extend the shpc functionality so
that if this capability is present you have the option to pause/resume
instead of remove/probe the device for certain hot-plug events.  The fact
is there may be some use for a pause/resume type approach for PCIe
hot-plug in the near future anyway.  From the sounds of it Apple has
required it for all Thunderbolt device drivers so that they can halt the
device in order to shuffle resources around; perhaps we should look at
something similar for Linux.

The other advantage of grouping functions behind one bridge is things like
reset domains.  The PCI error handling logic will want to be able to reset
any devices that experienced an error in the event of something such as a
surprise removal.  By grouping all of the devices you could
disable/reset/enable them as one logical group in the event of something
such as the "bad path" approach Michael has mentioned.
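Just to make the reset domain part a little more concrete, below is a rough
sketch of what taking down and bringing back everything behind such a
bridge could look like on the kernel side.  The function name and the
point where it would be called are made up for the example; only the
bridge control bits and the PCI helpers are existing ones, and a real
implementation would obviously need locking and coordination with any
bound drivers:

#include <linux/pci.h>
#include <linux/delay.h>

/*
 * Sketch only: reset every function behind an emulated PCI-to-PCI bridge
 * as one logical group via a secondary bus reset on the bridge itself.
 */
static void reset_group_behind_bridge(struct pci_dev *bridge)
{
	struct pci_dev *dev;
	u16 ctrl;

	/* Save config state for every function behind the bridge. */
	list_for_each_entry(dev, &bridge->subordinate->devices, bus_list)
		pci_save_state(dev);

	/* Assert, then deassert, secondary bus reset on the bridge. */
	pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &ctrl);
	pci_write_config_word(bridge, PCI_BRIDGE_CONTROL,
			      ctrl | PCI_BRIDGE_CTL_BUS_RESET);
	msleep(2);
	pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, ctrl);

	/* Give the devices time to come back before touching them. */
	msleep(1000);

	/* Restore config state so the group comes back as one unit. */
	list_for_each_entry(dev, &bridge->subordinate->devices, bus_list)
		pci_restore_state(dev);
}

The same walk over the bridge's subordinate bus is what would let you
treat "everything behind this bridge" as a single unit for DMA page
dirtying or for a pause/resume cycle during migration.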
>>
>>>> It would be great if we could avoid changing the guest; but at least
>>>> your guest driver changes don't actually seem to be that hardware
>>>> specific; could your changes actually be moved to the generic PCI level
>>>> so they could be made to work for lots of drivers?
>>>
>>> It is impossible to use one common solution for all devices unless the
>>> PCIe spec documents it clearly, and I think one day it will be there.
>>> But before that, we need some workarounds in the guest driver to make
>>> it work, even if it looks ugly.
>
> Yes, so far there is no hardware migration support and it's hard to modify
> bus-level code.  It also will block implementation on Windows.

Please don't assume things.  Unless you have hard data from Microsoft that
says they want it this way, let's just try to figure out what works best
for us for now; we can start worrying about third-party implementations
after we have figured out a solution that actually works.

- Alex