On Fri, Apr 23, 2021 at 10:31:46AM +0000, Tian, Kevin wrote:

> So your proposal sort of moves the entire container/group/domain
> managment into /dev/ioasid and then leaves vfio only provide device
> specific uAPI. An ioasid represents a page table (address space), thus
> is equivalent to the scope of VFIO container. Having the device join
> an ioasid is equivalent to attaching a device to VFIO container, and
> here the group integrity must be enforced. Then /dev/ioasid anyway
> needs to manage group objects and their association with ioasid and
> underlying iommu domain thus it's pointless to keep same logic within
> VFIO. Is this understanding correct?

Yes, I haven't thought of a way to define /dev/ioasid in a way that is
useful to VDPA/etc without all these parts.. If you come up with a
better idea do share.

> btw one remaining open is whether you expect /dev/ioasid to be
> associated with a single iommu domain, or multiple. If only a single
> domain is allowed, the ioasid_fd is equivalent to the scope of VFIO
> container.

See the prior email for a more complete set of thoughts on this.

> It is supposed to have only one gpa_ioasid_id since one iommu domain
> can only have a single 2nd level pgtable. Then all other ioasids,
> once allocated, must be nested on this gpa_ioasid_id to fit in the
> same domain. if a legacy vIOMMU is exposed (which disallows
> nesting), the userspace has to open an ioasid_fd for every group.
> This is basically the VFIO way. On the other hand if multiple
> domains is allowed, there could be multiple ioasid_ids each holding
> a 2nd level pgtable and an iommu domain (or a list of pgtables and
> domains due to incompatibility issue as discussed in another
> thread), and can be nested by other ioasids respectively. The
> application only needs to open /dev/ioasid once regardless of
> whether vIOMMU allows nesting, and has a single interface for ioasid
> allocation. Which way do you prefer to?

I have a feeling we want to have a single IOASID be usable in as many
contexts as possible - as many domains, devices and groups as we can
get away with.

The IOASID is the expensive object: it is the pagetable, and it is
potentially a lot of memory. The API should be designed so we don't
have to have multiple copies of the same pagetable.

For this reason I think having multiple IOASIDs in a single
/dev/ioasid container is the way to go.

To my mind the /dev/ioasid should be linked to a HW page table format,
and any device/domain/group that uses that same HW page table format
can be joined to it. This implies we can have multiple domains under
/dev/ioasid, but there is a limitation on what domains can be grouped
together.

This probably does not match the exact IOMMU capability/domain model
we have today, so I present it as an aspirational goal.

The other tricky thing here will be to define small steps, e.g. V1 may
only allow one domain, but the uAPI will not reflect this, as we expect
V2 will allow multiple domains.

Jason
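
For illustration only, the container model described above could be
sketched in plain userspace C as below. Nothing here is a real kernel
interface: the struct, the enum values, the limits and the function
names are all hypothetical, chosen just to show one /dev/ioasid-like
container tied to a single HW page table format, holding several
IOASIDs, with several domains sharing one IOASID instead of each
keeping its own copy of the pagetable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout: a model of the idea, not a kernel uAPI. */
enum pgtable_format { PGTABLE_FMT_A, PGTABLE_FMT_B };

#define MAX_IOASIDS 8
#define MAX_DOMAINS 8

/* One container per HW page table format (assumption taken from the text). */
struct ioasid_container {
    enum pgtable_format fmt;        /* format every member must share */
    int nr_ioasids;                 /* number of page tables allocated */
    int nr_domains;                 /* number of domains joined */
    int domain_ioasid[MAX_DOMAINS]; /* which IOASID each domain uses */
};

void container_init(struct ioasid_container *c, enum pgtable_format fmt)
{
    assert(c != NULL);
    c->fmt = fmt;
    c->nr_ioasids = 0;
    c->nr_domains = 0;
}

/* Allocate a new IOASID (a page table). Returns its id, or -1 if full. */
int container_alloc_ioasid(struct ioasid_container *c)
{
    assert(c != NULL);
    if (c->nr_ioasids >= MAX_IOASIDS)
        return -1;
    return c->nr_ioasids++;
}

/*
 * Join a domain to an existing IOASID. The domain must use the same HW
 * page table format as the container; no new page table is created.
 * Returns the domain index, or -1 on format mismatch or bad arguments.
 */
int container_attach_domain(struct ioasid_container *c,
                            enum pgtable_format domain_fmt, int ioasid)
{
    assert(c != NULL);
    if (domain_fmt != c->fmt)
        return -1;
    if (ioasid < 0 || ioasid >= c->nr_ioasids)
        return -1;
    if (c->nr_domains >= MAX_DOMAINS)
        return -1;
    c->domain_ioasid[c->nr_domains] = ioasid;
    return c->nr_domains++;
}
```

In this sketch, attaching two domains of the matching format to the
same IOASID leaves nr_ioasids at 1, which is the "no duplicate
pagetables" property, while a domain with a different format is
refused. A V1 that only allows one domain could be modelled by setting
MAX_DOMAINS to 1 without changing any function signature.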