> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Wednesday, April 24, 2024 8:12 AM
>
> On Tue, Apr 23, 2024 at 11:47:50PM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Tuesday, April 23, 2024 8:02 PM
> > >
> > > On Tue, Apr 23, 2024 at 07:43:27AM +0000, Tian, Kevin wrote:
> > > > I'm not sure how userspace can fully handle this w/o certain assistance
> > > > from the kernel.
> > > >
> > > > So I kind of agree that emulated PASID capability is probably the only
> > > > contract which the kernel should provide:
> > > > - mapped 1:1 at the physical location, or
> > > > - constructed at an offset according to DVSEC, or
> > > > - constructed at an offset according to a look-up table
> > > >
> > > > The VMM always scans the vfio pci config space to expose vPASID.
> > > >
> > > > Then the remaining open is what VMM could do when a VF supports
> > > > PASID but unfortunately it's not reported by vfio. W/o the capability
> > > > of inspecting the PASID state of PF, probably the only feasible option
> > > > is to maintain a look-up table in VMM itself and assumes the kernel
> > > > always enables the PASID cap on PF.
> > >
> > > I'm still not sure I like doing this in the kernel - we need to do the
> > > same sort of thing for ATS too, right?
> >
> > VF is allowed to implement ATS.
> >
> > PRI has the same problem as PASID.
>
> I'm surprised by this, I would have guessed ATS would be the device
> global one, PRI not being per-VF seems problematic??? How do you
> disable PRI generation to get a clean shutdown?

Here is what the PCIe spec says:

  For SR-IOV devices, a single Page Request Interface is permitted for
  the PF and is shared between the PF and its associated VFs, in which
  case the PF implements this capability and its VFs must not.

I'll let Baolu chime in on the potential impact to his PRI cleanup
effort, e.g. whether disabling PRI generation is mandatory if the IOMMU
side is already put in a mode that auto-responds with an error to new
PRI requests instead of reporting them to sw.

But I do see another problem for shared capabilities between PF/VFs.

Now those shared capabilities are enabled/disabled when the PF is
attached to/detached from a domain, w/o counting the shared usage
from VFs. Looks like we have a gap here.
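
To make the gap concrete, here is a rough sketch of what counting the shared usage could look like. It is a userspace model only, not existing kernel code: struct shared_cap, shared_cap_get() and shared_cap_put() are hypothetical names, hw_enabled just stands in for the enable bit in the PF's capability, and locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of a capability (e.g. PRI or PASID) that is
 * implemented only on the PF but relied upon by the PF and its VFs.
 */
struct shared_cap {
	unsigned int users;	/* PF + VFs currently attached to a domain */
	bool hw_enabled;	/* stands in for the enable bit on the PF */
};

/* Assumed to be called when the PF or any of its VFs attaches to a domain. */
static void shared_cap_get(struct shared_cap *cap)
{
	/* Only the first user actually turns the capability on. */
	if (cap->users++ == 0)
		cap->hw_enabled = true;
}

/* Assumed to be called when the PF or any of its VFs detaches. */
static void shared_cap_put(struct shared_cap *cap)
{
	/* An unbalanced put would be a caller bug. */
	assert(cap->users > 0);

	/* Only the last user turns the capability off. */
	if (--cap->users == 0)
		cap->hw_enabled = false;
}
```

With something along these lines, detaching the PF while a VF is still attached would leave the capability enabled, and it would only be disabled once the last user goes away. A real implementation would also need proper locking around the counter.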