> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Wednesday, May 3, 2023 2:22 AM
>
> On Sat, Apr 29, 2023 at 12:13:39AM +0800, Yi Liu wrote:
> > > Whoa, noiommu is inherently unsafe and only meant to expose the vfio
> > > device interface for userspace drivers that are going to do unsafe
> > > things regardless. Enabling noiommu to work with mdev, pin pages, or
> > > anything else should not be on our agenda. Userspaces relying on noiommu
> > > get the minimum viable interface and must impose a minuscule
> > > incremental maintenance burden. The only reason we're spending so much
> > > effort on it here is to make iommufd noiommu support equivalent to
> > > group/container noiommu support. We should stop at that. Thanks,
> >
> > btw. I asked a question in [1] to check if we should allow attach/detach
> > on noiommu devices. Jason has replied to it. If in future noiommu userspace
> > can pin page, then such userspace will need to attach/detach ioas. So I
> > made cdev series[2] to allow attach ioas on noiommu devices. Supporting
> > it from cdev day-1 may avoid probing if attach/detach is supported or
> > not for specific devices when adding pin page for noiommu userspace.
> >
> > But now, I think such support is not planned, is it? If so, will it
> > be better to disallow attach/detach on noiommu devices in patch [2]?
> >
> > [1] https://lore.kernel.org/kvm/ZEa+khH0tUFStRMW@xxxxxxxxxx/
> > [2] https://lore.kernel.org/kvm/20230426150321.454465-21-yi.l.liu@xxxxxxxxx/
>
> If we block it then userspace has to act quite differently, I think we
> should keep it.

Maybe the kernel can simply fail the attach/detach if it happens on noiommu
devices, and noiommu userspace should just know it would fail. @Alex,
what is your opinion?

> My general idea to complete the no-iommu feature is to add a new IOCTL
> to VFIO that is 'pin iova and return dma addr' that no-iommu userspace
> would call instead of trying to abuse mlock and /proc/ to do it.
> That ioctl would use the IOAS attached to the access just like a mdev would
> do, so it has a real IOVA, but it is not a mdev.

This new ioctl may be an IOMMUFD ioctl since its input is the IOAS and addr,
nothing related to the device. Is it?

> unmap callback just does nothing, as Alex says it is all still totally
> unsafe.

Sure. That's also why I added a noiommu test to avoid calling the unmap
callback, although it seems not possible to have an unmap callback as mdev
drivers would implement it.

>
> This just allows it to use the mm a little more properly and safely (e.g.
> mlock() doesn't set things like page_maybe_dma_pinned(), proc doesn't
> reject things like DAX and it currently doesn't make an adjustment for
> the PCI offset stuff..) So it would make DPDK a little more robust,
> portable and make the whole VFIO no-iommu feature much easier to use.

Thanks for the explanation.

> To do that we need an iommufd access, an access ID and we need to link
> the current IOAS to the special access, like mdev, but in any mdev
> code paths.
>
> That creating the access ID solves the reset problem as well is a nice
> side effect and is the only part of this you should focus on for now.

Yes. I get this part. We only need the access ID so far to fix the noiommu
gap in hot-reset.

Regards,
Yi Liu
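
For reference, the "mlock and /proc/" approach mentioned above generally
means locking a buffer with mlock() and then reading its 64-bit entry from
/proc/self/pagemap to derive a physical address. The sketch below only shows
the decoding arithmetic for one such entry, following the bit layout in the
kernel's pagemap documentation (bit 63 = page present, bits 0-54 = PFN). The
helper names are hypothetical and are not taken from DPDK or from any patch
in this thread. Reading a non-zero PFN also requires CAP_SYS_ADMIN on current
kernels, and nothing here pins the page for DMA, which is the weakness the
proposed ioctl is meant to address.

```c
#include <stdint.h>

/* One 64-bit /proc/<pid>/pagemap entry, per the kernel pagemap docs. */
#define PAGEMAP_PRESENT  (1ULL << 63)        /* bit 63: page is present in RAM */
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54: page frame number */

/* Hypothetical helper: file offset of the pagemap entry covering vaddr. */
static uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    return (vaddr / page_size) * sizeof(uint64_t);
}

/*
 * Hypothetical helper: turn a pagemap entry into a physical address.
 * Returns 0 on success, -1 if the page is not present or the PFN is
 * hidden (the kernel reports PFN 0 to callers without CAP_SYS_ADMIN).
 */
static int pagemap_entry_to_phys(uint64_t entry, uint64_t vaddr,
                                 uint64_t page_size, uint64_t *phys)
{
    uint64_t pfn = entry & PAGEMAP_PFN_MASK;

    if (!(entry & PAGEMAP_PRESENT) || pfn == 0)
        return -1;

    /* Physical address = frame base + offset of vaddr within its page. */
    *phys = pfn * page_size + (vaddr % page_size);
    return 0;
}
```

A caller would pread() eight bytes at pagemap_offset(vaddr, page_size) from
/proc/self/pagemap and pass the value to pagemap_entry_to_phys(). The result
is only a snapshot: mlock() keeps the page resident but does not mark it as
DMA-pinned.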