> From: Jason Gunthorpe
> Sent: Saturday, May 8, 2021 1:11 AM
>
> On Fri, May 07, 2021 at 11:06:14AM -0600, Alex Williamson wrote:
>
> > We had tossed around an idea of a super-container with vfio, it's maybe
> > something we'd want to incorporate into this design. For instance, if
> > memory could be pre-registered with a super container, which would
> > handle the locked memory accounting for that memory, then
> > sub-containers could all handle the IOMMU context of their sets of
> > devices relative to that common memory pool.
>
> This is where I suggested to David to use nesting of IOASIDs.
>
> Without HW support for nesting a SW nest is really just re-using the
> memory registration information stored in the parent when constructing
> the children
>

Yes, this sounds like a sensible thing to do. It also unifies the user
experience regardless of whether the underlying hw supports nesting,
e.g. when vIOMMU is present Qemu can always use the IOASID nesting uAPI.
In the SW nest case the kernel will merge the two-level translations
from the two IOASIDs into a one-level shadow page table (unlike today's
VFIO, which leaves it to userspace to manage the shadow-based mapping).

But I want to remark that nesting IOASIDs alone cannot solve this
accounting problem completely as long as a process is allowed to have
multiple ioasid FDs (unless there is a mechanism to allow nesting
IOASIDs across FDs). This is probably not a big issue, though. With all
the intended usages around the new interface, I think one ioasid FD
should be sufficient for most applications to meet their requirements
(multiple gpa_ioasids, ioasid nesting, etc.).

Thanks
Kevin
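
To make the SW nest case concrete, here is a toy sketch of what merging
two IOASIDs into a one-level shadow table could look like. It is only an
illustration under stated assumptions, not the proposed kernel code: the
page tables are modeled as plain dicts keyed by page frame number, the
names merge_nested() and translate() are hypothetical, and skipping child
entries that have no backing in the parent is an assumed policy (a real
implementation might reject such a mapping instead).

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


def merge_nested(child, parent):
    """Compose a child table (IOVA pfn -> GPA pfn) with a parent table
    (GPA pfn -> HPA pfn) into one shadow table (IOVA pfn -> HPA pfn).

    The parent holds the memory registration; the child only reuses it.
    Child entries pointing at a GPA the parent never registered are
    skipped (assumption, see the note above).
    """
    shadow = {}
    for iova_pfn, gpa_pfn in child.items():
        hpa_pfn = parent.get(gpa_pfn)
        if hpa_pfn is not None:
            shadow[iova_pfn] = hpa_pfn
    return shadow


def translate(shadow, iova):
    """Translate a byte address through the shadow table, or return
    None if the page is not mapped."""
    pfn, offset = divmod(iova, PAGE_SIZE)
    hpa_pfn = shadow.get(pfn)
    if hpa_pfn is None:
        return None
    return hpa_pfn * PAGE_SIZE + offset


# Parent IOASID: GPA pfn -> HPA pfn (the pre-registered memory)
parent = {0x100: 0x9000, 0x101: 0x9001, 0x102: 0x7ABC}
# Child IOASID: IOVA pfn -> GPA pfn (e.g. programmed through a vIOMMU)
child = {0x10: 0x101, 0x11: 0x102, 0x12: 0x500}

shadow = merge_nested(child, parent)
```

In this model the locked-memory accounting is done once, when the parent
is populated; building the shadow table for a child adds no new pinned
pages, which is the reuse of registration information described above.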