Re: [PATCH RFC] vfio: Introduce DMA logging uAPIs for VFIO device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 1 May 2022 15:33:00 +0300
Yishai Hadas <yishaih@xxxxxxxxxx> wrote:

> DMA logging allows a device to internally record what DMAs the device is
> initiation and report them back to userspace.
> 
> It is part of the VFIO migration infrastructure that allows implementing
> dirty page tracking during the pre-copy phase of live migration.
> 
> Only DMA WRITEs are logged, and this API is not connected to
> VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE.
> 
> This RFC patch shows the expected usage of the DMA logging involved
> uAPIs for VFIO device-tracker.
> 
> It uses the FEATURE ioctl with its GET/SET/PROBE options as of below.
> 
> It exposes a PROBE option to detect if the device supports DMA logging.
> 
> It exposes a SET option to start device DMA logging in given of IOVA
> ranges.
> 
> It exposes a SET option to stop device DMA logging that was previously
> started.
> 
> It exposes a GET option to read back and clear the device DMA log.
> 
> Extra details exist as part of vfio.h per a specific option in this RFC
> patch.
> 
> Note:
> To have IOMMU hardware support for dirty pages the below RFC [1] that
> was sent by Joao Martins can be referenced.
> 
> [1] https://lore.kernel.org/all/2d369e58-8ac0-f263-7b94-fe73917782e1@xxxxxxxxxxxxxxx/T/
> 
> Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxx>
> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> ---
>  include/uapi/linux/vfio.h | 80 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 80 insertions(+)
> 
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index fea86061b44e..9d0b7e73e999 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -986,6 +986,86 @@ enum vfio_device_mig_state {
>  	VFIO_DEVICE_STATE_RUNNING_P2P = 5,
>  };
>  
> +/*
> + * Upon VFIO_DEVICE_FEATURE_SET start device DMA logging.
> + * VFIO_DEVICE_FEATURE_PROBE can be used to detect if the device supports
> + * DMA logging.
> + *
> + * DMA logging allows a device to internally record what DMAs the device is
> + * initiation and report them back to userspace. It is part of the VFIO
> + * migration infrastructure that allows implementing dirty page tracking
> + * during the pre copy phase of live migration. Only DMA WRITEs are logged,
> + * and this API is not connected to VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE.
> + *
> + * When DMA logging is started a range of IOVAs to monitor is provided and the
> + * device can optimize its logging to cover only the IOVA range given. Each
> + * DMA that the device initiates inside the range will be logged by the device
> + * for later retrieval.
> + *
> + * page_size is an input that hints what tracking granularity the device
> + * should try to achieve. If the device cannot do the hinted page size then it
> + * should pick the next closest page size it supports. On output the device
> + * will return the page size it selected.
> + *
> + * ranges is a pointer to an array of
> + * struct vfio_device_feature_dma_logging_range.
> + */
> +struct vfio_device_feature_dma_logging_control {
> +	__aligned_u64 page_size;
> +	__u32 num_ranges;
> +	__u32 __reserved;
> +	__aligned_u64 ranges;
> +};
> +
> +struct vfio_device_feature_dma_logging_range {
> +	__aligned_u64 iova;
> +	__aligned_u64 length;
> +};
> +
> +#define VFIO_DEVICE_FEATURE_DMA_LOGGING_START 3
> +
> +
> +/*
> + * Upon VFIO_DEVICE_FEATURE_SET stop device DMA logging that was started
> + * by VFIO_DEVICE_FEATURE_DMA_LOGGING_START
> + */
> +#define VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP 4

This seems difficult to use from a QEMU perspective, where a vfio
device typically operates on a MemoryListener and we only have
visibility to one range at a time.  I don't see any indication that
LOGGING_START is meant to be cumulative such that userspace could
incrementally add ranges to be watched, nor clearly does LOGGING_STOP
appear to have any sort of IOVA range granularity.  Is userspace
intended to pass the full vCPU physical address range here, and if so
would a single min/max IOVA be sufficient?  I'm not sure how else we
could support memory hotplug while this was enabled.

How does this work with IOMMU based tracking, I assume that if devices
share an IOAS we wouldn't be able to exclude devices supporting
device-level tracking from the IOAS log.

> +
> +/*
> + * Upon VFIO_DEVICE_FEATURE_GET read back and clear the device DMA log
> + *
> + * Query the device's DMA log for written pages within the given IOVA range.
> + * During querying the log is cleared for the IOVA range.
> + *
> + * bitmap is a pointer to an array of u64s that will hold the output bitmap
> + * with 1 bit reporting a page_size unit of IOVA. The mapping of IOVA to bits
> + * is given by:
> + *  bitmap[(addr - iova)/page_size] & (1ULL << (addr % 64))
> + *
> + * The input page_size can be any power of two value and does not have to
> + * match the value given to VFIO_DEVICE_FEATURE_DMA_LOGGING_START. The driver
> + * will format its internal logging to match the reporting page size, possibly
> + * by replicating bits if the internal page size is lower than requested.

Or setting multiple bits if the internal page size is larger than
requested.

Is there a bitmap size limit?  We've minimally needed to impose limits
to reflect limitations of the bitmap code internally in the past.
Userspace needs a means to learn such limits.  Thanks,

Alex

> + *
> + * Bits will be updated in bitmap using atomic or to allow userspace to
> + * combine bitmaps from multiple trackers together. Therefore userspace must
> + * zero the bitmap before doing any reports.
> + *
> + * If any error is returned userspace should assume that the dirty log is
> + * corrupted and restart.
> + *
> + * If DMA logging is not enabled, an error will be returned.
> + *
> + */
> +struct vfio_device_feature_dma_logging_report {
> +	__aligned_u64 iova;
> +	__aligned_u64 length;
> +	__aligned_u64 page_size;
> +	__aligned_u64 bitmap;
> +};
> +
> +#define VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT 5
> +
>  /* -------- API for Type1 VFIO IOMMU -------- */
>  
>  /**




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux