On Wed, Feb 02, 2022 at 07:05:02PM +0000, Joao Martins wrote:
> On 2/2/22 17:03, Jason Gunthorpe wrote:
> > how to integrate that with the iommufd work, which I hope will allow
> > that series, and the other IOMMU drivers that can support this to be
> > merged..
>
> The iommu-fd thread wasn't particularly obvious on how dirty tracking is done
> there, but TBH I am not up to speed on iommu-fd yet so I missed something
> obvious for sure. When you say 'integrate that with the iommufd' can you
> expand on that?

The general idea is that iommufd is the place to put all the iommu
driver uAPI for consumption by userspace. The IOMMU feature of dirty
tracking would belong there.

So, some kind of API needs to be designed to meet the needs of the
IOMMU drivers.

> Did you meant to use interface in the link, or perhaps VFIO would use an iommufd
> /internally/ but still export the same UAPI as VFIO dirty tracking ioctls() (even if it's
> not that efficient with a lot of bitmap copying). And optionally give a iommu_fd for the
> VMM to scan iommu pagetables itself and see what was marked dirty or
> not?

iommufd and VFIO containers don't co-exist: either iommufd is
providing the IOMMU interface, or the current type1 code - not both
together.

iommufd's current approach presents the same ABI as the type1 container
for compatibility, and it is a possible direction to provide the
iommu_domain stored dirty bits through that compat API.

But, as you say, it looks unnatural and inefficient when the domain
itself is storing the dirty bits inside the IOPTE.

It needs some study I haven't got into yet :)

> My over-simplistic/naive view was that the proposal in the link
> above sounded a lot simpler. While iommu-fd had more longevity for
> many other usecases outside dirty tracking, no?

I'd prefer we don't continue to hack on the type1 code if iommufd is
expected to take over in this role - especially for a multi-vendor
feature like dirty tracking.

It is actually a pretty complicated topic because migration capable
PCI devices also include their own dirty tracking HW, and all this
needs to be harmonized somehow. VFIO proposed to squash everything
into the container code, but I've been mulling over having iommufd
only do system iommu and push the PCI device internal tracking over to
VFIO.

> I have a PoC-ish using the interface in the link, with AMD IOMMU
> dirty bit supported (including Qemu emulated amd-iommu for folks
> lacking the hardware). Albeit the eager-spliting + collapsing of
> IOMMU hugepages is not yet done there, and I wanted to play around
> the emulated intel-iommu SLADS from specs looks quite similar. Happy
> to join existing effort anyways.

This sounds great, I would love to bring the AMD IOMMU along with a
dirty tracking implementation! Can you share some patches so we can
see what the HW implementation looks like?

Jason