Re: [Qemu-devel] [RFC PATCH v1 1/4] VFIO KABI for migration interface

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Kirti Wankhede (kwankhede@xxxxxxxxxx) wrote:
> - Added vfio_device_migration_info structure to use interact with vendor
>   driver.
> - Different flags are used to get or set migration related information
>   from/to vendor driver.
> Flag VFIO_MIGRATION_PROBE: To query if feature is supported
> Flag VFIO_MIGRATION_GET_REGION: To get migration region info
> Flag VFIO_MIGRATION_SET_STATE: To convey device state in vendor driver
> Flag VFIO_MIGRATION_GET_PENDING: To get pending bytes yet to be migrated
>   from vendor driver
> Flag VFIO_MIGRATION_GET_BUFFER: On this flag, vendor driver should write
>   data to migration region and return number of bytes written in the region
> Flag VFIO_MIGRATION_SET_BUFFER: In migration resume path, user space app
>   writes to migration region and communicates it to vendor driver with
>   this ioctl with this flag.
> Flag VFIO_MIGRATION_GET_DIRTY_PFNS: Get bitmap of dirty pages from vendor
>   driver from given start address
> 
> - Added enum for possible device states.
> 
> Signed-off-by: Kirti Wankhede <kwankhede@xxxxxxxxxx>
> Reviewed-by: Neo Jia <cjia@xxxxxxxxxx>
> ---
>  linux-headers/linux/vfio.h | 91 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 91 insertions(+)
> 
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 3615a269d378..8e9045ed9aa8 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -602,6 +602,97 @@ struct vfio_device_ioeventfd {
>  
>  #define VFIO_DEVICE_IOEVENTFD		_IO(VFIO_TYPE, VFIO_BASE + 16)
>  
> +/**
> + * VFIO_DEVICE_MIGRATION_INFO - _IOW(VFIO_TYPE, VFIO_BASE + 17,
> + *                              struct vfio_device_migration_info)

This is a little odd; what you really have here is 7 different
operations that can be performed; they're independent operations all
relating to part of migration; so instead of a 'flag' it's more of an
operation type, and there's no reason to make them individual bit flags
- for edxample there's no reason you'd want to OR together
MIGRATION_GET_REGION and MIGRATION_GET_PENDING - they're just
independent.
(Similarly you could just make them 7 ioctls)

I guess what you're trying to do here is a direct 1-1 mapping of qemu's
struct SaveVMHandlers interface to devices.
That's not necessarily a bad idea, but remember it's not a stable
interface, so QEMU will change it over time and you'll have to then
figure out how to shim these interfaces to it, so it's worth keeping
that in mind, just in case you can make these interfaces more general.

> + * Flag VFIO_MIGRATION_PROBE:
> + *      To query if feature is supported
> + *
> + * Flag VFIO_MIGRATION_GET_REGION:
> + *      To get migration region info
> + *      region_index [output] : region index to be used for migration region
> + *      size [output]: size of migration region
> + *
> + * Flag VFIO_MIGRATION_SET_STATE:
> + *      To set device state in vendor driver
> + *      device_state [input] : User space app sends device state to vendor
> + *           driver on state change
> + *
> + * Flag VFIO_MIGRATION_GET_PENDING:
> + *      To get pending bytes yet to be migrated from vendor driver
> + *      threshold_size [Input] : threshold of buffer in User space app.
> + *      pending_precopy_only [output] : pending data which must be migrated in
> + *          precopy phase or in stopped state, in other words - before target
> + *          vm start
> + *      pending_compatible [output] : pending data which may be migrated in any
> + *           phase
> + *      pending_postcopy_only [output] : pending data which must be migrated in
> + *           postcopy phase or in stopped state, in other words - after source
> + *           vm stop
> + *      Sum of pending_precopy_only, pending_compatible and
> + *      pending_postcopy_only is the whole amount of pending data.
> + *
> + * Flag VFIO_MIGRATION_GET_BUFFER:
> + *      On this flag, vendor driver should write data to migration region and
> + *      return number of bytes written in the region.
> + *      bytes_written [output] : number of bytes written in migration buffer by
> + *          vendor driver
> + *
> + * Flag VFIO_MIGRATION_SET_BUFFER
> + *      In migration resume path, user space app writes to migration region and
> + *      communicates it to vendor driver with this ioctl with this flag.
> + *      bytes_written [Input] : number of bytes written in migration buffer by
> + *          user space app.

I guess we call getbuffer/setbuffer multiple times.  Is there anything
that needs to be passed to get the data to line up (e.g. offset into the
data)

> + * Flag VFIO_MIGRATION_GET_DIRTY_PFNS
> + *      Get bitmap of dirty pages from vendor driver from given start address.
> + *      start_addr [Input] : start address
> + *      pfn_count [Input] : Total pfn count from start_addr for which dirty
> + *          bitmap is requested
> + *      dirty_bitmap [Output] : bitmap memory allocated by user space
> + *           application, vendor driver should return the bitmap with bits set
> + *           only for pages to be marked dirty.
> + * Return: 0 on success, -errno on failure.
> + */
> +
> +struct vfio_device_migration_info {
> +	__u32 argsz;
> +	__u32 flags;
> +#define VFIO_MIGRATION_PROBE            (1 << 0)
> +#define VFIO_MIGRATION_GET_REGION       (1 << 1)
> +#define VFIO_MIGRATION_SET_STATE        (1 << 2)
> +#define VFIO_MIGRATION_GET_PENDING      (1 << 3)
> +#define VFIO_MIGRATION_GET_BUFFER       (1 << 4)
> +#define VFIO_MIGRATION_SET_BUFFER       (1 << 5)
> +#define VFIO_MIGRATION_GET_DIRTY_PFNS   (1 << 6)
> +        __u32 region_index;	            /* region index */
> +        __u64 size;	                    /* size */
> +        __u32 device_state;                 /* VFIO device state */
> +        __u64 pending_precopy_only;
> +        __u64 pending_compatible;
> +        __u64 pending_postcopy_only;
> +        __u64 threshold_size;
> +        __u64 bytes_written;
> +        __u64 start_addr;
> +        __u64 pfn_count;
> +	__u8  dirty_bitmap[];
> +};

This feels like it should be separate types for the different calls.

Also, is there no state to tell the userspace what version of migration
data we have?  What happens if I try and load a migration state
from an older driver into a newer one, or the other way around,
or into a destination card that's not the same as the source.

> +enum {
> +    VFIO_DEVICE_STATE_NONE,
> +    VFIO_DEVICE_STATE_RUNNING,
> +    VFIO_DEVICE_STATE_MIGRATION_SETUP,
> +    VFIO_DEVICE_STATE_MIGRATION_PRECOPY_ACTIVE,
> +    VFIO_DEVICE_STATE_MIGRATION_STOPNCOPY_ACTIVE,
> +    VFIO_DEVICE_STATE_MIGRATION_SAVE_COMPLETED,
> +    VFIO_DEVICE_STATE_MIGRATION_RESUME,
> +    VFIO_DEVICE_STATE_MIGRATION_RESUME_COMPLETED,
> +    VFIO_DEVICE_STATE_MIGRATION_FAILED,
> +    VFIO_DEVICE_STATE_MIGRATION_CANCELLED,
> +};

That's an interesting merge of QEMU's run-state and it's migration
state.

You probably need to define which transitions you take as legal.

Dave

> +
> +#define VFIO_DEVICE_MIGRATION_INFO _IO(VFIO_TYPE, VFIO_BASE + 17)
> +
>  /* -------- API for Type1 VFIO IOMMU -------- */
>  
>  /**
> -- 
> 2.7.0
> 
> 
--
Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux