On Sat, Apr 29, 2023 at 12:13:39AM +0800, Yi Liu wrote:
> > Whoa, noiommu is inherently unsafe and only meant to expose the vfio
> > device interface for userspace drivers that are going to do unsafe
> > things regardless. Enabling noiommu to work with mdev, pin pages, or
> > anything else should not be on our agenda. Userspaces relying on noiommu
> > get the minimum viable interface and must impose a minuscule
> > incremental maintenance burden. The only reason we're spending so much
> > effort on it here is to make iommufd noiommu support equivalent to
> > group/container noiommu support. We should stop at that. Thanks,
>
> btw. I asked a question in [1] to check if we should allow attach/detach
> on noiommu devices. Jason has replied to it. If in future noiommu userspace
> can pin pages, then such userspace will need to attach/detach ioas. So I
> made cdev series[2] to allow attach ioas on noiommu devices. Supporting
> it from cdev day-1 may avoid probing if attach/detach is supported or
> not for specific devices when adding pin page for noiommu userspace.
>
> But now, I think such support is not planned, is it? If so, will it
> be better to disallow attach/detach on noiommu devices in patch [2]?
>
> [1] https://lore.kernel.org/kvm/ZEa+khH0tUFStRMW@xxxxxxxxxx/
> [2] https://lore.kernel.org/kvm/20230426150321.454465-21-yi.l.liu@xxxxxxxxx/

If we block it then userspace has to act quite differently, I think we
should keep it.

My general idea to complete the no-iommu feature is to add a new IOCTL
to VFIO that is 'pin iova and return dma addr' that no-iommu userspace
would call instead of trying to abuse mlock and /proc/ to do it. That
ioctl would use the IOAS attached to the access just like a mdev would
do, so it has a real IOVA, but it is not a mdev.

unmap callback just does nothing, as Alex says it is all still totally
unsafe.

This just allows it to use the mm a little more properly and safely (eg
mlock() doesn't set things like page_maybe_dma_pinned(), proc doesn't
reject things like DAX and it currently doesn't make an adjustment for
the PCI offset stuff..)

So it would make DPDK a little more robust, portable and make the whole
VFIO no-iommu feature much easier to use.

To do that we need an iommufd access, an access ID and we need to link
the current IOAS to the special access, like mdev, but not in any mdev
code paths.

That creating the access ID solves the reset problem as well is a nice
side effect and is the only part of this you should focus on for now.

Jason
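
For readers who have not seen the "mlock and /proc/" trick referred to
above, here is a rough sketch of what a no-iommu userspace driver does
today to find a DMA address. It is not taken from DPDK or from any patch
in this thread; the function names are made up for illustration, and the
only facts relied on are the documented /proc/self/pagemap entry layout
(PFN in bits 0-54, "present" in bit 63) and the standard mlock() and
pread() calls. It shows the weaknesses mentioned above: the page is
locked but not DMA-pinned, nothing rejects DAX mappings, and the result
is a CPU physical address with no PCI offset applied.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* /proc/<pid>/pagemap entry layout, per the kernel's pagemap documentation:
 * bits 0-54 hold the PFN when the page is present, bit 63 is "present". */
#define PM_PFN_MASK ((UINT64_C(1) << 55) - 1)
#define PM_PRESENT  (UINT64_C(1) << 63)

/* Hypothetical helper: decode one pagemap entry into a physical address.
 * Returns 0 when the page is not present or the PFN was hidden. */
static uint64_t pagemap_entry_to_phys(uint64_t entry, uint64_t page_size,
                                      uint64_t offset_in_page)
{
    uint64_t pfn = entry & PM_PFN_MASK;

    if (!(entry & PM_PRESENT) || pfn == 0)
        return 0;
    return pfn * page_size + offset_in_page;
}

/* Hypothetical helper: mlock() the page holding `va`, then read its
 * pagemap entry. Returns 0 on any failure, including unprivileged reads
 * where the kernel reports the PFN as zero. */
static uint64_t virt_to_phys_unsafe(const void *va)
{
    long ps = sysconf(_SC_PAGESIZE);
    uint64_t page_size, addr, entry = 0;
    int fd;

    if (ps <= 0)
        return 0;
    page_size = (uint64_t)ps;
    addr = (uint64_t)(uintptr_t)va;

    /* mlock() keeps the page resident, but it is not a DMA pin. */
    if (mlock((const void *)(uintptr_t)(addr & ~(page_size - 1)),
              (size_t)page_size) != 0)
        return 0;

    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return 0;
    /* One 64-bit entry per virtual page, indexed by virtual page number. */
    if (pread(fd, &entry, sizeof(entry),
              (off_t)(addr / page_size * sizeof(entry))) !=
        (ssize_t)sizeof(entry))
        entry = 0;
    close(fd);

    return pagemap_entry_to_phys(entry, page_size, addr & (page_size - 1));
}
```

A 'pin iova and return dma addr' ioctl as proposed above would replace
both helpers with a single call into the kernel, which could then pin
the page properly and return an address that is valid for the device.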