On Tue, Jun 01, 2021 at 08:10:14AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Saturday, May 29, 2021 1:36 AM
> >
> > On Thu, May 27, 2021 at 07:58:12AM +0000, Tian, Kevin wrote:
> >
> > > IOASID nesting can be implemented in two ways: hardware nesting and
> > > software nesting. With hardware support the child and parent I/O page
> > > tables are walked consecutively by the IOMMU to form a nested translation.
> > > When it's implemented in software, the ioasid driver is responsible for
> > > merging the two-level mappings into a single-level shadow I/O page table.
> > > Software nesting requires both child/parent page tables operated through
> > > the dma mapping protocol, so any change in either level can be captured
> > > by the kernel to update the corresponding shadow mapping.
> >
> > Why? A SW emulation could do this synchronization during invalidation
> > processing if invalidation contained an IOVA range.
>
> In this proposal we differentiate between host-managed and user-managed
> I/O page tables. If host-managed, the user is expected to use
> map/unmap cmd explicitly upon any change required on the page table.
> If user-managed, the user first binds its page table to the IOMMU and
> then use invalidation cmd to flush iotlb when necessary (e.g. typically
> not required when changing a PTE from non-present to present).
>
> We expect user to use map+unmap and bind+invalidate respectively
> instead of mixing them together. Following this policy, map+unmap
> must be used in both levels for software nesting, so changes in either
> level are captured timely to synchronize the shadow mapping.

map+unmap or bind+invalidate is a policy of the IOASID itself set when
it is created. If you put two different types in a tree then each IOASID
must continue to use its own operation mode.

I don't see a reason to force all IOASIDs in a tree to be consistent??
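
As a rough illustration of "mode is a property of the IOASID, not of the
tree", here is a minimal sketch in C. Every name in it (ioasid_mode,
struct ioasid, ioasid_make, ioasid_map, ioasid_invalidate) is made up for
this example and is not the proposed uAPI. It only models which operation
each node accepts, not the actual page table handling:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operating modes; fixed when the IOASID is created. */
enum ioasid_mode {
    IOASID_MAP_UNMAP,        /* kernel-owned page table */
    IOASID_BIND_INVALIDATE,  /* user-owned page table bound to the IOMMU */
};

/* Hypothetical IOASID node; parent is NULL for the root of the tree. */
struct ioasid {
    int id;
    enum ioasid_mode mode;
    const struct ioasid *parent;
};

/* The mode is neither inherited from nor checked against the parent. */
static struct ioasid ioasid_make(int id, enum ioasid_mode mode,
                                 const struct ioasid *parent)
{
    struct ioasid node = { .id = id, .mode = mode, .parent = parent };
    return node;
}

/* map/unmap is accepted only on a map/unmap IOASID. */
static int ioasid_map(const struct ioasid *node)
{
    assert(node != NULL);
    return node->mode == IOASID_MAP_UNMAP ? 0 : -EINVAL;
}

/* invalidate is accepted only on a bind/invalidate IOASID. */
static int ioasid_invalidate(const struct ioasid *node)
{
    assert(node != NULL);
    return node->mode == IOASID_BIND_INVALIDATE ? 0 : -EINVAL;
}
```

With a map/unmap parent and a bind/invalidate child in the same tree, each
node accepts only its own operation and rejects the other with -EINVAL,
regardless of what its parent or child uses.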

A software emulated two-level page table where the leaf level is a
bound page table in guest memory should continue to use
bind/invalidate to maintain the guest page table IOASID even though it
is a SW construct.

The GPA level should use map/unmap because it is a kernel-owned page
table.

Though how to efficiently mix map/unmap on the GPA when there are SW
nested levels below it looks to be quite challenging.

Jason