> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Thursday, June 3, 2021 7:47 PM
>
> On Thu, Jun 03, 2021 at 06:49:20AM +0000, Tian, Kevin wrote:
> > > From: David Gibson
> > > Sent: Thursday, June 3, 2021 1:09 PM
> > [...]
> > > > > In this way the SW mode is the same as a HW mode with an infinite
> > > > > cache.
> > > > >
> > > > > The collapsed shadow page table is really just a cache.
> > > > >
> > > >
> > > > OK. One additional thing is that we may need a 'caching_mode'
> > > > thing reported by /dev/ioasid, indicating whether invalidation is
> > > > required when changing non-present to present. For hardware
> > > > nesting it's not reported as the hardware IOMMU will walk the
> > > > guest page table in case of an IOTLB miss. For software nesting
> > > > caching_mode is reported so the user must issue invalidation
> > > > upon any change in the guest page table so the kernel can update
> > > > the shadow page table in a timely manner.
> > >
> > > For the first cut, I'd have the API assume that invalidates are
> > > *always* required. Some bypass to avoid them in cases where they're
> > > not needed can be an additional extension.
> > >
> >
> > Isn't the typical TLB semantics that non-present entries are not
> > cached, thus invalidation is not required when making non-present
> > to present? It's true of both CPU TLB and IOMMU TLB. In reality
> > I feel there are more usages built on hardware nesting than software
> > nesting, thus making the default follow hardware TLB behavior makes
> > more sense...
>
> From a modelling perspective it makes sense to have the most general
> be the default and if an implementation can elide certain steps then
> describing those as additional behaviors on the universal baseline is
> cleaner
>
> I'm surprised to hear your remarks about the not-present though,
> how does the vIOMMU emulation work if there are not hypervisor
> invalidation traps for not-present/present transitions?
>

Such invalidation traps matter only for a shadow I/O page table (software
nesting). For hardware nesting no trap is required for the non-present to
present transition, since the physical IOTLB doesn't cache non-present
entries. The IOMMU will walk the guest I/O page table in case of an IOTLB
miss.

The vIOMMU should be constructed according to whether software or hardware
nesting is used. For Intel (and AMD iirc), a caching_mode capability
decides whether the guest needs to do invalidation for the non-present to
present transition. Such a vIOMMU should clear this bit for hardware
nesting or set it for software nesting.

ARM SMMU doesn't have this capability. Therefore a vSMMU can only work
with a hardware nested IOASID.

Thanks
Kevin
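
P.S. To make the rule above concrete, here is a rough sketch in C. All of
the names (nesting_mode, viommu_model, viommu_caching_mode,
viommu_supports) are hypothetical and exist only for illustration; this
is not an existing kernel, QEMU or /dev/ioasid interface. It encodes two
points from the text: caching_mode is advertised to the guest only for
software nesting, and a vIOMMU model without such a capability can only
be backed by hardware nesting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: how the IOASID backing the vIOMMU is nested. */
enum nesting_mode { NESTING_SOFTWARE, NESTING_HARDWARE };

/* Hypothetical description of a vIOMMU model. */
struct viommu_model {
    const char *name;
    /* Assumption from the mail: true for Intel VT-d, false for ARM SMMU. */
    bool has_caching_mode_cap;
};

/*
 * Should the vIOMMU advertise caching_mode to the guest?
 * Software nesting: yes, the guest must invalidate on non-present to
 * present changes so the shadow I/O page table can be updated.
 * Hardware nesting: no, the physical IOMMU walks the guest table on a miss.
 */
static bool viommu_caching_mode(enum nesting_mode mode)
{
    assert(mode == NESTING_SOFTWARE || mode == NESTING_HARDWARE);
    return mode == NESTING_SOFTWARE;
}

/*
 * Can this vIOMMU model run on top of the given nesting mode?
 * Software nesting requires a way to tell the guest to invalidate on
 * non-present to present transitions, i.e. a caching_mode capability.
 */
static bool viommu_supports(const struct viommu_model *m,
                            enum nesting_mode mode)
{
    if (mode == NESTING_HARDWARE)
        return true;
    return m->has_caching_mode_cap;
}
```

With this sketch, a model without the capability (the SMMU case) is
rejected for software nesting, while a model with it is accepted for
either mode and differs only in the bit it reports.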