On Thu, Dec 10, 2015 at 8:11 AM, Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> On Thu, Dec 10, 2015 at 10:38:32PM +0800, Lan, Tianyu wrote:
>>
>>
>> On 12/10/2015 7:41 PM, Dr. David Alan Gilbert wrote:
>> >>Ideally, it is able to leave guest driver unmodified but it requires the
>> >>>hypervisor or qemu to aware the device which means we may need a driver in
>> >>>hypervisor or qemu to handle the device on behalf of guest driver.
>> >Can you answer the question of when do you use your code -
>> > at the start of migration or
>> > just before the end?
>>
>> Just before stopping VCPU in this version and inject VF mailbox irq to
>> notify the driver if the irq handler is installed.
>> Qemu side also will check this via the faked PCI migration capability
>> and driver will set the status during device open() or resume() callback.
>
> Right, this is the "good path" optimization. Whether this buys anything
> as compared to just sending reset to the device when VCPU is stopped
> needs to be measured. In any case, we probably do need a way to
> interrupt driver on destination to make it reconfigure the device -
> otherwise it might take seconds for it to notice. And a way to make
> sure driver can handle this surprise reset so we can block migration if
> it can't.

The question is how do we handle the "bad path"? From what I can tell,
it seems like we would have to have the dirty page tracking for DMA
handled in the host in order to support that. Otherwise we risk
corrupting guest memory, since a few stale pages would end up left
behind in the guest.

Probably the easiest way to flag a "bad path" migration would be to
emulate a Manually-operated Retention Latch (MRL) being opened and then
re-closed on the device's slot. That might even accommodate the desire
to support some means of pause/resume, since that would just be a
hot-plug event in which the latch was never actually opened.
Basically, if the retention latch is released and then re-closed, it can
be assumed that the device has lost power and has therefore been reset.
A normal hot-plug controller would have to reconfigure the device after
such an event. The key bit is that once power has been cycled on the
port, the assumption is that the device has lost any existing state, and
we should emulate that as well by clearing any state Qemu might be
carrying, such as its shadow of the MSI-X table. In addition, we could
signal whether the host supports dirty page tracking via the IOMMU, so
that if needed the guest could trigger some sort of memory exception
handling to cover the risk of memory corruption.

I would argue that we don't necessarily have to provide a means of
guaranteeing that the driver can support a surprise removal/reset. The
worst case scenario is equivalent to somebody pulling the plug on an
externally connected PCIe cage on a physical host. I know the Intel
Ethernet drivers have already had to add support for surprise removal,
since that scenario can occur on Thunderbolt-enabled platforms. Since it
is acceptable for physical hosts to see such an event, I think we could
support the same type of failure for direct-assigned devices in guests.
That would be the one spot where I would say it is up to the drivers to
figure out how to deal with it, since this is something that can occur
for any given driver on any given OS, assuming the device can be plugged
into an externally removable cage.

- Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html