> From: Yishai Hadas <yishaih@xxxxxxxxxx>
> Sent: Thursday, July 14, 2022 4:13 PM
>
> DMA logging allows a device to internally record what DMAs the device is
> initiating and report them back to userspace. It is part of the VFIO
> migration infrastructure that allows implementing dirty page tracking
> during the pre copy phase of live migration. Only DMA WRITEs are logged,
> and this API is not connected to VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE.
>
> This patch introduces the DMA logging involved uAPIs.
>
> It uses the FEATURE ioctl with its GET/SET/PROBE options as of below.
>
> It exposes a PROBE option to detect if the device supports DMA logging.
> It exposes a SET option to start device DMA logging in given IOVAs
> ranges.
> It exposes a SET option to stop device DMA logging that was previously
> started.
> It exposes a GET option to read back and clear the device DMA log.
>
> Extra details exist as part of vfio.h per a specific option.
>
> Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxx>
> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> ---
>  include/uapi/linux/vfio.h | 79 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 79 insertions(+)
>
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 733a1cddde30..81475c3e7c92 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -986,6 +986,85 @@ enum vfio_device_mig_state {
>  	VFIO_DEVICE_STATE_RUNNING_P2P = 5,
>  };
>
> +/*
> + * Upon VFIO_DEVICE_FEATURE_SET start device DMA logging.

Both 'start' and 'stop' go through VFIO_DEVICE_FEATURE_SET.

> + * VFIO_DEVICE_FEATURE_PROBE can be used to detect if the device supports
> + * DMA logging.
> + *
> + * DMA logging allows a device to internally record what DMAs the device is
> + * initiating and report them back to userspace. It is part of the VFIO
> + * migration infrastructure that allows implementing dirty page tracking
> + * during the pre copy phase of live migration.
> + * Only DMA WRITEs are logged,

Then 'DMA dirty logging' might be a more accurate name throughout this series.

> + * and this API is not connected to VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE.

I didn't get the point of this explanation.

> + *
> + * When DMA logging is started a range of IOVAs to monitor is provided and the
> + * device can optimize its logging to cover only the IOVA range given. Each
> + * DMA that the device initiates inside the range will be logged by the device
> + * for later retrieval.
> + *
> + * page_size is an input that hints what tracking granularity the device
> + * should try to achieve. If the device cannot do the hinted page size then it
> + * should pick the next closest page size it supports. On output the device

next closest 'smaller' page size?

> + * will return the page size it selected.
> + *
> + * ranges is a pointer to an array of
> + * struct vfio_device_feature_dma_logging_range.
> + */
> +struct vfio_device_feature_dma_logging_control {
> +	__aligned_u64 page_size;
> +	__u32 num_ranges;
> +	__u32 __reserved;
> +	__aligned_u64 ranges;
> +};

Should we move the definition of LOG_MAX_RANGES here so the user can
know the maximum number of tracked ranges?

> +
> +struct vfio_device_feature_dma_logging_range {
> +	__aligned_u64 iova;
> +	__aligned_u64 length;
> +};
> +
> +#define VFIO_DEVICE_FEATURE_DMA_LOGGING_START 3

Can the user update the range list by doing another START?

> +
> +/*
> + * Upon VFIO_DEVICE_FEATURE_SET stop device DMA logging that was started
> + * by VFIO_DEVICE_FEATURE_DMA_LOGGING_START
> + */
> +#define VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP 4

Is there value in allowing the user to stop tracking a specific range?

> +
> +/*
> + * Upon VFIO_DEVICE_FEATURE_GET read back and clear the device DMA log
> + *
> + * Query the device's DMA log for written pages within the given IOVA range.
> + * During querying the log is cleared for the IOVA range.
> + *
> + * bitmap is a pointer to an array of u64s that will hold the output bitmap
> + * with 1 bit reporting a page_size unit of IOVA. The mapping of IOVA to bits
> + * is given by:
> + *  bitmap[(addr - iova)/page_size] & (1ULL << (addr % 64))
> + *
> + * The input page_size can be any power of two value and does not have to
> + * match the value given to VFIO_DEVICE_FEATURE_DMA_LOGGING_START. The driver
> + * will format its internal logging to match the reporting page size, possibly
> + * by replicating bits if the internal page size is lower than requested.

What's the purpose of this? I didn't quite get why a user would want to
start tracking in one page size and then read back the dirty bitmap in
another page size...

> + *
> + * Bits will be updated in bitmap using atomic or to allow userspace to
> + * combine bitmaps from multiple trackers together. Therefore userspace must
> + * zero the bitmap before doing any reports.

I'm a bit lost here. Since we allow userspace to combine bitmaps from
multiple trackers, it's perfectly sane for userspace to leave the bitmap
with some 1's from one tracker when doing a report from another tracker.

> + *
> + * If any error is returned userspace should assume that the dirty log is
> + * corrupted and restart.
> + *
> + * If DMA logging is not enabled, an error will be returned.
> + *
> + */
> +struct vfio_device_feature_dma_logging_report {
> +	__aligned_u64 iova;
> +	__aligned_u64 length;
> +	__aligned_u64 page_size;
> +	__aligned_u64 bitmap;
> +};
> +
> +#define VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT 5
> +
>  /* -------- API for Type1 VFIO IOMMU -------- */
>
>  /**
> --
> 2.18.1
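
To make the REPORT bitmap questions above concrete, here is a minimal
userspace sketch in C. It is only an illustration under stated assumptions:
the helper names (log_bitmap_words, log_bitmap_set, log_bitmap_test,
log_bitmap_merge) are hypothetical and not part of the proposed uAPI, and it
assumes the conventional layout where page n of the reported range maps to
bit (n % 64) of u64 word (n / 64), which may not be exactly what the formula
in the patch comment intends. The merge helper uses a plain OR, whereas the
patch describes the kernel side as using an atomic OR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the proposed uAPI. */

/* Number of u64 words needed to report 'length' bytes at 'page_size' granularity. */
static inline size_t log_bitmap_words(uint64_t length, uint64_t page_size)
{
	/* The uAPI comment requires page_size to be a power of two. */
	assert(page_size && !(page_size & (page_size - 1)));

	uint64_t pages = length / page_size + (length % page_size != 0);

	return (size_t)(pages / 64 + (pages % 64 != 0));
}

/* Assumed layout: page n of the range lives in bit (n % 64) of word (n / 64). */
static inline void log_bitmap_set(uint64_t *bitmap, uint64_t iova,
				  uint64_t page_size, uint64_t addr)
{
	uint64_t n = (addr - iova) / page_size;

	bitmap[n / 64] |= 1ULL << (n % 64);
}

/* Returns 1 if the page containing 'addr' is marked dirty, 0 otherwise. */
static inline int log_bitmap_test(const uint64_t *bitmap, uint64_t iova,
				  uint64_t page_size, uint64_t addr)
{
	uint64_t n = (addr - iova) / page_size;

	return (int)((bitmap[n / 64] >> (n % 64)) & 1);
}

/* Combine one tracker's report into an accumulated bitmap (plain OR). */
static inline void log_bitmap_merge(uint64_t *dst, const uint64_t *src,
				    size_t words)
{
	for (size_t i = 0; i < words; i++)
		dst[i] |= src[i];
}
```

Under this reading, a 512 KiB range reported at 4 KiB granularity needs
log_bitmap_words(0x80000, 0x1000) words, and a caller that wants a combined
view of two trackers could either point both REPORT calls at the same buffer
or merge separate buffers with log_bitmap_merge().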