On Tue, Jul 25, 2023 at 02:40:24PM -0700, Brett Creeley wrote:
> It's possible that the device firmware crashes and is able to recover
> due to some configuration and/or other issue. If a live migration
> is in progress while the firmware crashes, the live migration will
> fail. However, the VF PCI device should still be functional post
> crash recovery and subsequent migrations should go through as
> expected.
>
> When the pds_core device notices that firmware crashes it sends an
> event to all its client drivers. When the pds_vfio driver receives
> this event while migration is in progress it will request a deferred
> reset on the next migration state transition. This state transition
> will report failure as well as any subsequent state transition
> requests from the VMM/VFIO. Based on uapi/vfio.h the only way out of
> VFIO_DEVICE_STATE_ERROR is by issuing VFIO_DEVICE_RESET. Once this
> reset is done, the migration state will be reset to
> VFIO_DEVICE_STATE_RUNNING and migration can be performed.

Have you actually tested this? Does the qemu side respond properly if
this happens during a migration?

Jason