Re: [PATCH v12 Kernel 4/7] vfio iommu: Implementation of ioctl to for dirty pages tracking.

On Tue, Feb 11, 2020 at 11:45:43AM +0800, Alex Williamson wrote:
> On Mon, 10 Feb 2020 21:52:51 -0500
> Yan Zhao <yan.y.zhao@xxxxxxxxx> wrote:
> 
> > On Tue, Feb 11, 2020 at 03:44:54AM +0800, Alex Williamson wrote:
> > > On Mon, 10 Feb 2020 04:49:54 -0500
> > > Yan Zhao <yan.y.zhao@xxxxxxxxx> wrote:
> > >   
> > > > On Sat, Feb 08, 2020 at 03:42:31AM +0800, Kirti Wankhede wrote:  
> > > > > VFIO_IOMMU_DIRTY_PAGES ioctl performs three operations:
> > > > > - Start pinned and unpinned pages tracking while migration is active
> > > > > - Stop pinned and unpinned dirty pages tracking. This is also used to
> > > > >   stop dirty pages tracking if migration failed or cancelled.
> > > > > > - Get dirty pages bitmap. This ioctl returns bitmap of dirty pages; it is
> > > > > >   the user space application's responsibility to copy content of dirty
> > > > > >   pages from source to destination during migration.
> > > > > 
> > > > > To prevent DoS attack, memory for bitmap is allocated per vfio_dma
> > > > > structure. Bitmap size is calculated considering smallest supported page
> > > > > size. Bitmap is allocated when dirty logging is enabled for those
> > > > > vfio_dmas whose vpfn list is not empty or whole range is mapped, in
> > > > > case of pass-through device.
> > > > > 
> > > > > > There could be multiple options as to when bitmap should be populated:
> > > > > > * Populate bitmap for already pinned pages when bitmap is allocated for
> > > > > >   a vfio_dma with the smallest supported page size. Update bitmap from
> > > > > >   page pinning and unpinning functions. When user application queries
> > > > > >   bitmap, check if requested page size is same as page size used to
> > > > > >   populate bitmap. If it is equal, copy bitmap. But if not equal,
> > > > > >   re-populate bitmap according to requested page size and then copy to
> > > > > >   user.
> > > > >   Pros: Bitmap gets populated on the fly after dirty tracking has
> > > > >         started.
> > > > >   Cons: If requested page size is different than smallest supported
> > > > >         page size, then bitmap has to be re-populated again, with
> > > > >         additional overhead of allocating bitmap memory again for
> > > > >         re-population of bitmap.
> > > > > 
> > > > > * Populate bitmap when bitmap is queried by user application.
> > > > >   Pros: Bitmap is populated with requested page size. This eliminates
> > > > >         the need to re-populate bitmap if requested page size is
> > > > > >         different than smallest supported page size.
> > > > > >   Cons: There is a one-time processing cost when bitmap is queried.
> > > > > 
> > > > > > I prefer the latter option for its simple logic and to eliminate the
> > > > > > overhead of bitmap repopulation in case of different page sizes. The
> > > > > > latter option is implemented in this patch.
> > > > > 
> > > > > Signed-off-by: Kirti Wankhede <kwankhede@xxxxxxxxxx>
> > > > > Reviewed-by: Neo Jia <cjia@xxxxxxxxxx>
> > > > > ---
> > > > >  drivers/vfio/vfio_iommu_type1.c | 299 ++++++++++++++++++++++++++++++++++++++--
> > > > >  1 file changed, 287 insertions(+), 12 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> > > > > index d386461e5d11..df358dc1c85b 100644
> > > > > --- a/drivers/vfio/vfio_iommu_type1.c
> > > > > +++ b/drivers/vfio/vfio_iommu_type1.c  
> > > [snip]  
> > > > > @@ -830,6 +924,113 @@ static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
> > > > >  	return bitmap;
> > > > >  }
> > > > >  
> > > > > +static int vfio_iova_dirty_bitmap(struct vfio_iommu *iommu, dma_addr_t iova,
> > > > > +				  size_t size, uint64_t pgsize,
> > > > > +				  unsigned char __user *bitmap)
> > > > > +{
> > > > > +	struct vfio_dma *dma;
> > > > > +	dma_addr_t i = iova, iova_limit;
> > > > > +	unsigned int bsize, nbits = 0, l = 0;
> > > > > +	unsigned long pgshift = __ffs(pgsize);
> > > > > +
> > > > > +	while ((dma = vfio_find_dma(iommu, i, pgsize))) {
> > > > > +		int ret, j;
> > > > > +		unsigned int npages = 0, shift = 0;
> > > > > +		unsigned char temp = 0;
> > > > > +
> > > > > +		/* mark all pages dirty if all pages are pinned and mapped. */
> > > > > +		if (dma->iommu_mapped) {
> > > > > +			iova_limit = min(dma->iova + dma->size, iova + size);
> > > > > +			npages = iova_limit/pgsize;
> > > > > +			bitmap_set(dma->bitmap, 0, npages);    
> > > > for pass-through devices, it's not good to always return all pinned pages as
> > > > dirty. could it also call vfio_pin_pages to track dirty pages? or any
> > > > other interface provided to do that?  
> > > 
> > > See patch 7/7.  Thanks,
> > >  
> > hi Alex and Kirti,
> > for pass-through devices, patch 7/7 does enable the vendor driver to
> > set dirty pages by calling vfio_pin_pages, but its overhead is much
> > higher than the previous way of generating a bitmap directly to user.
> > It also requires the pass-through device vendor driver to track guest
> > operations to know when to call vfio_pin_pages.
> > There are still use cases where a pass-through device is able to track
> > dirty pages in its hardware buffer, so is there a way for it to pass its
> > dirty bitmap to user?
> 
> Not currently and this sounds like another argument in favor of using
> the dirty bitmap per vfio_dma to directly track dirty pages.
it may need an interface to get the max iova across all vfio_dma, so that
a hardware bitmap can be generated for the whole guest system memory.
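
To illustrate what such a query would compute, here is a rough user-space
sketch. It is not kernel code and not a proposed API: struct dma_range,
max_iova_end() and dirty_bitmap_bytes() are hypothetical names, and
dma_range only stands in for the iova/size pair of a vfio_dma. It assumes
one bit per page over [0, max iova end), rounded up to 64-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the iova/size pair of a vfio_dma entry. */
struct dma_range {
	uint64_t iova;
	uint64_t size;
};

/* Return the highest end address (iova + size) over all ranges, 0 if none. */
static uint64_t max_iova_end(const struct dma_range *r, size_t n)
{
	uint64_t end = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (r[i].iova + r[i].size > end)
			end = r[i].iova + r[i].size;
	return end;
}

/* Bytes for one bit per page over [0, end), rounded up to 64-bit words. */
static uint64_t dirty_bitmap_bytes(uint64_t end, uint64_t pgsize)
{
	uint64_t npages = (end + pgsize - 1) / pgsize;

	return ((npages + 63) / 64) * 8;
}
```

With this layout the bitmap size depends on the highest mapped iova rather
than on the amount of memory actually mapped, so a sparse iova space would
cost more bitmap memory than a per-vfio_dma bitmap.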

> Passthrough drivers could be provided an interface to set dirty bits
> which could be merged with pfn list entries when the user requests the
> bitmap, rather than requiring passthrough drivers to unnecessarily
> allocate pfn list entries directly.  Thanks,
yes, it's better.
and for devices with the ability to track dirty pages in hardware,
maybe an interface to let vfio know where the hardware bitmap is?
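
As a rough sketch of the merge step only, assuming the hardware bitmap and
the per-vfio_dma bitmap use the same page size and bit order:
merge_dirty_bitmap() is a hypothetical helper written as plain user-space
C, not an existing vfio function (in the kernel something like bitmap_or()
would likely be used instead).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * OR the first nbits bits of a device-provided dirty bitmap (hw) into the
 * tracking bitmap (dst). Bits at or beyond nbits in the last word of dst
 * are left untouched.
 */
static void merge_dirty_bitmap(uint64_t *dst, const uint64_t *hw, size_t nbits)
{
	size_t words = nbits / 64;	/* number of complete 64-bit words */
	size_t tail = nbits % 64;	/* valid bits in the final partial word */
	size_t i;

	for (i = 0; i < words; i++)
		dst[i] |= hw[i];
	if (tail)
		dst[words] |= hw[words] & ((UINT64_C(1) << tail) - 1);
}
```

A page then ends up dirty in the result if either the pfn list or the
hardware reported it, which matches merging at the time the user requests
the bitmap.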

Thanks
Yan


