On Wed, May 24, 2023 at 09:31:42AM -0600, Alex Williamson wrote:
> If a user creates an ioas within an iommufd, attaches a device to that
> ioas and populates it with mappings, wouldn't the user expect the
> device to have access to and honor those mappings? I think that's the
> path we're headed down if we report a successful attach of a noiommu
> device to an ioas.

I understand we are going to drop no-iommu from this series, so this
below is not relevant. But to clarify my general design idea here
again:

The IOAS contains the mappings that userspace would like to use with
no-iommu. Userspace would use a new IOCTL to pin and return the DMA
addresses of those exact mappings.

So attaching a no-iommu device to an IOAS is a necessary operation
that should succeed. It doesn't make full API sense until we also get
an ioctl to return the dma_addr_t lists.

What is special about no-iommu is that the mappings have to go through
the special ioctl API to pin and translate; the IOVA cannot be used
natively as a dma_addr. The IOAS is still used and still related to
the device, it is just for pinning and dma_addr generation, not HW
isolation.

> We need to keep in mind that noiommu was meant to be a minimally
> intrusive mechanism to provide a dummy vfio IOMMU backend and satisfy
> the group requirements, solely for the purpose of making use of the
> vfio device interface and without providing any DMA mapping services or
> expectations.

Well, no-iommu turned into a total hack job as soon as it wrongly
relied on mlock() and /proc/ files to function. Even within its
defined limitations this is an incorrect way to use the mm and DMA
APIs. Memory under DMA must be locked using pin_user_pages(); mlock()
is not a substitute. I expect this is functionally broken these days,
under some workloads, on certain kernel configurations.

Even if we don't fully implement it, I prefer to imagine a design
where no-iommu is implemented correctly and orient things toward that.

> beyond the minimal code trickery of the legacy implementation. I hate
> to ask, but could we reiterate our requirements for noiommu as a part of
> the native iommufd interface for vfio? The nested userspace requirement
> is gone now that hypervisors have vIOMMU support, so my assumption is
> that this is only for bare metal systems without an IOMMU, which
> ideally are less and less prevalent.

I understood there was some desire for DPDK users to do this for
higher performance on some systems.

> that are actually going to adopt the noiommu cdev interface? What
> terrible things happen if noiommu only exists in the vfio group compat
> interface to iommufd and at some distant point in the future dies when
> that gets disabled?

I think it is fine, it is only for DPDK, and if DPDK people really
really care about this then they can implement it properly someday.

I'm quite happy if we say we will not put no-iommu into the device
cdev until it is put in fully correctly, without relying on
mlock/etc. Then the API construction would make a lot more sense.

Jason
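
[Editor's illustration, not part of the original mail: a rough sketch
of what the pin-and-translate ioctl described above could look like.
The IOMMU_IOAS_PIN_IOVA name, the command number, and the struct
layout are all hypothetical and do not exist in any posted series;
they only illustrate the "pin the IOAS mappings and hand back
dma_addr_t values" idea.]

/*
 * Hypothetical uAPI sketch, for illustration only.  The ioctl name,
 * command number and struct layout are made up and are not part of
 * any posted series.
 */
#include <linux/types.h>
#include <linux/ioctl.h>
#include <linux/iommufd.h>	/* for IOMMUFD_TYPE */

struct iommu_ioas_pin_iova {
	__u32 size;		/* sizeof(struct iommu_ioas_pin_iova) */
	__u32 ioas_id;		/* IOAS that already holds the mapping */
	__aligned_u64 iova;	/* start IOVA of an existing mapping */
	__aligned_u64 length;	/* length to pin, in bytes */
	/*
	 * Output: user pointer to an array of dma_addr entries (one per
	 * page or contiguous chunk) and the number of entries written.
	 * For no-iommu the kernel would pin the backing memory with
	 * pin_user_pages() (not rely on mlock()) and fill this array
	 * with the addresses the device must use instead of the IOVA.
	 */
	__aligned_u64 dma_addr_uptr;
	__aligned_u64 num_dma_addrs;
};

#define IOMMU_IOAS_PIN_IOVA _IO(IOMMUFD_TYPE, 0xff)	/* made-up number */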