On Fri, May 28, 2021 at 10:24:56AM +0800, Jason Wang wrote:

> > IOASID nesting can be implemented in two ways: hardware nesting and
> > software nesting. With hardware support the child and parent I/O page
> > tables are walked consecutively by the IOMMU to form a nested translation.
> > When it's implemented in software, the ioasid driver
>
> Need to explain what did "ioasid driver" mean.

I think it means "drivers/iommu"

> And if yes, does it allow the device for software specific implementation:
>
> 1) swiotlb or

I think it is necessary to have a 'software page table' which is
required to do all the mdevs we have today.

> 2) device specific IOASID implementation

"drivers/iommu" is pluggable, so I guess it can exist? I've never seen
it done before though

If we'd want this to drive an on-device translation table is an
interesting question. I don't have an answer

> > I/O page tables routed through PASID are installed in a per-RID PASID
> > table structure.
>
> I'm not sure this is true for all archs.

It must be true. For security reasons access to a PASID must be
limited by RID.

RID_A assigned to guest A should not be able to access a PASID being
used by RID_B in guest B. Only a per-RID restriction can accomplish
this.

> I would like to know the reason for such indirection.
>
> It looks to me the ioasid fd is sufficient for performing any operations.
>
> Such allocation only work if as ioas fd can have multiple ioasid which seems
> not the case you describe here.

It is the case, read the examples section. One had 3 interrelated
IOASID objects inside the same FD.

> > 5.3. IOASID nesting (software)
> > +++++++++++++++++++++++++
> >
> > Same usage scenario as 5.2, with software-based IOASID nesting
> > available. In this mode it is the kernel instead of user to create the
> > shadow mapping.
> >
> > The flow before guest boots is same as 5.2, except one point. Because
> > giova_ioasid is nested on gpa_ioasid, locked accounting is only
> > conducted for gpa_ioasid. So it's not necessary to pre-register virtual
> > memory.
> >
> > To save space we only list the steps after boots (i.e. both dev1/dev2
> > have been attached to gpa_ioasid before guest boots):
> >
> > 	/* After boots */
> > 	/* Make GIOVA space nested on GPA space */
> > 	giova_ioasid = ioctl(ioasid_fd, IOASID_CREATE_NESTING,
> > 				gpa_ioasid);
> >
> > 	/* Attach dev2 to the new address space (child)
> > 	 * Note dev2 is still attached to gpa_ioasid (parent)
> > 	 */
> > 	at_data = { .ioasid = giova_ioasid};
> > 	ioctl(device_fd2, VFIO_ATTACH_IOASID, &at_data);
>
> For vDPA, we need something similar. And in the future, vDPA may allow
> multiple ioasid to be attached to a single device. It should work with the
> current design.

What do you imagine multiple IOASID's being used for in VDPA?

Jason
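
To illustrate the per-RID PASID table argument made earlier in this
mail, here is a minimal user-space sketch. It is not kernel code and
not any IOMMU driver's real data structure: struct io_pgtable,
struct pasid_table, pasid_install(), pasid_lookup() and the MAX_*
limits are all hypothetical names chosen for the example. The only
point it shows is that when the lookup is keyed by (RID, PASID), a
PASID installed for RID_B resolves to nothing when presented by RID_A.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RID   4   /* hypothetical limits, for illustration only */
#define MAX_PASID 8

/* Hypothetical stand-in for an I/O page table. */
struct io_pgtable {
	int id;
};

/* One PASID table per RID: the PASID only indexes within its own RID. */
struct pasid_table {
	struct io_pgtable *entry[MAX_PASID];
};

static struct pasid_table rid_tables[MAX_RID];

/* Install an I/O page table for (rid, pasid). Returns 0 or -1. */
int pasid_install(uint16_t rid, uint32_t pasid, struct io_pgtable *pgt)
{
	if (rid >= MAX_RID || pasid >= MAX_PASID)
		return -1;
	rid_tables[rid].entry[pasid] = pgt;
	return 0;
}

/* Translate lookup: only the table belonging to 'rid' is consulted. */
struct io_pgtable *pasid_lookup(uint16_t rid, uint32_t pasid)
{
	if (rid >= MAX_RID || pasid >= MAX_PASID)
		return NULL;
	return rid_tables[rid].entry[pasid];
}

/* Returns 1 if RID_A cannot reach a PASID installed only for RID_B. */
int pasid_isolation_holds(void)
{
	static struct io_pgtable guest_b_pgt = { .id = 1 };
	const uint16_t rid_a = 0, rid_b = 1;
	const uint32_t pasid = 5;

	assert(pasid_install(rid_b, pasid, &guest_b_pgt) == 0);

	return pasid_lookup(rid_b, pasid) == &guest_b_pgt &&
	       pasid_lookup(rid_a, pasid) == NULL;
}
```

A single global PASID table indexed by PASID alone would return the
same page table for both RIDs here, which is the cross-guest access
the per-RID structure is meant to prevent.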