Re: [RFC] /dev/ioasid uAPI proposal

Jason Gunthorpe <jgg@xxxxxxxxxx> · Thu, 3 Jun 2021 08:47:28 -0300

On Thu, Jun 03, 2021 at 06:49:20AM +0000, Tian, Kevin wrote:
> > From: David Gibson
> > Sent: Thursday, June 3, 2021 1:09 PM
> [...]
> > > > In this way the SW mode is the same as a HW mode with an infinite
> > > > cache.
> > > >
> > > > The collaposed shadow page table is really just a cache.
> > > >
> > >
> > > OK. One additional thing is that we may need a 'caching_mode"
> > > thing reported by /dev/ioasid, indicating whether invalidation is
> > > required when changing non-present to present. For hardware
> > > nesting it's not reported as the hardware IOMMU will walk the
> > > guest page table in cases of iotlb miss. For software nesting
> > > caching_mode is reported so the user must issue invalidation
> > > upon any change in guest page table so the kernel can update
> > > the shadow page table timely.
> > 
> > For the fist cut, I'd have the API assume that invalidates are
> > *always* required.  Some bypass to avoid them in cases where they're
> > not needed can be an additional extension.
> > 
> 
> Isn't a typical TLB semantics is that non-present entries are not
> cached thus invalidation is not required when making non-present
> to present? It's true to both CPU TLB and IOMMU TLB. In reality
> I feel there are more usages built on hardware nesting than software
> nesting thus making default following hardware TLB behavior makes
> more sense...

>From a modelling perspective it makes sense to have the most general
be the default and if an implementation can elide certain steps then
describing those as additional behaviors on the universal baseline is
cleaner

I'm surprised to hear your remarks about the not-present though,
how does the vIOMMU emulation work if there are not hypervisor
invalidation traps for not-present/present transitions?

Jason