Hi, Alex, Thanks for the explanation! > From: Alex Williamson <alex.williamson@xxxxxxxxxx> > Sent: Tuesday, August 23, 2022 2:36 AM > > On Mon, 22 Aug 2022 04:39:45 +0000 > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote: > > > > From: Jason Gunthorpe <jgg@xxxxxxxxxx> > > > Sent: Wednesday, August 17, 2022 3:13 AM > > > > > > + * > > > + * A driver may only call this function if the vfio_device was created > > > + * by vfio_register_emulated_iommu_dev(). > > > */ > > > int vfio_pin_pages(struct vfio_device *device, dma_addr_t iova, > > > int npage, int prot, struct page **pages) > > > > Though I agree with the code change, I'm still unclear about this > > comment. > > > > First this comment is not equivalent to the old code which only > > checks dev_counter (implying a physical device in a singleton > > group can use this API too). though I didn't get why the singleton > > restriction was imposed in the first place... > > It's related to the dirty page scope. Use of the pinned page interface > is essentially a contract with the IOMMU back-end that only pinned pages > will be considered for the dirty page scope. However, type1 operates > on groups, therefore we needed to create a restriction that only > singleton groups could make use of page pinning such that the dirty > page scope could be attached to the group. I get the former part but not the latter. Can you elaborate why multi-device group can not be attached by the dirty page scope? It's kind of sharing the scope by all devices in the group. > > > Second I also didn't get why such a pinning API is tied to emulated > > iommu now. Though not required for any physical device today, what > > would be the actual problem of allowing a variant driver to make > > such call? > > In fact I do recall such discussions. An IOMMU backed mdev (defunct) > or vfio-pci variant driver could gratuitously pin pages in order to > limit the dirty page scope. We don't have anything in-tree that relies > on this. It also seems we're heading more in the direction of device > level DMA dirty tracking as Yishai proposes in the series for mlx5. > These interfaces are far more efficient for this use case, but perhaps > you have another use case in mind where we couldn't use the dma_rw > interface? One potential scenario is when I/O page fault is supported VFIO can enable on-demand paging in stage-2 mappings. In case a device cannot tolerate faults in all paths then a variant driver could use this interface to pin down structures which don't expect faults. > > I think the assumption is that devices that can perform DMA through an > IOMMU generally wouldn't need to twiddle guest DMA targets on a regular > basis otherwise, therefore limiting this to emulated IOMMU devices is > reasonable. Thanks, > IMHO if functionally this function only works for emulated case then we should add code to detect and fail if it's called otherwise. Otherwise it doesn't make much sense to add a comment to explicitly limit it to an existing use case. Thanks Kevin