On Wed, Apr 24, 2024 at 05:19:31AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Wednesday, April 24, 2024 8:12 AM
> >
> > On Tue, Apr 23, 2024 at 11:47:50PM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > > Sent: Tuesday, April 23, 2024 8:02 PM
> > > >
> > > > It feels simpler if the indicates if PASID and ATS can be supported
> > > > and userspace builds the capability blocks.
> > >
> > > this routes back to Alex's original question about using different
> > > interfaces (a device feature vs. PCI PASID cap) for VF and PF.
> >
> > I'm not sure it is different interfaces..
> >
> > The only reason to pass the PF's PASID cap is to give free space to
> > the VMM. If we are saying that gaps are free space (excluding a list
> > of bad devices) then we don't actually need to do that anymore.
> >
> > VMM will always create a synthetic PASID cap and kernel will always
> > suppress a real one.
>
> oh you suggest that there won't even be a 1:1 map for PF!

Right. No real need..

> kind of continue with the device_feature method as this series does.
> and it could include all VMM-emulated capabilities which are not
> enumerated properly from vfio pci config space.

1) VFIO creates the iommufd idev
2) VMM queries IOMMUFD_CMD_GET_HW_INFO to learn if PASID, PRI, etc,
   etc is supported
3) VMM locates empty space in the config space
4) VMM figures out where and what cap blocks to create (considering
   migration needs/etc)
5) VMM synthesizes the blocks and ties emulation to other iommufd things

This works generically for any synthetic vPCI function, including a
non-vfio-pci one.

Most likely, due to migration needs, the exact layout of the PCI config
space should be specified to the VMM as configuration, including the
location of any blocks copied from physical and any blocks synthesized.
This is the only way to be sure the config space is actually 100%
consistent.

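
Step 3 above (locating empty space) could be sketched roughly as below.
This is only an illustration under stated assumptions, not code from the
series: struct cfg_range and find_free_gap() are hypothetical names, and
the list of occupied ranges is assumed to be built by the VMM from the
extended capability chain plus its own per-capability length knowledge,
since the extended capability header does not encode a length. The 8-byte
size is that of the PASID extended capability (header plus the capability
and control registers).

```c
#include <stddef.h>
#include <stdint.h>

#define PCIE_EXT_CFG_START 0x100u   /* first extended capability offset */
#define PCIE_EXT_CFG_END   0x1000u  /* end of the 4K config space */
#define PASID_CAP_LEN      8u       /* PASID extended capability size */

/* Hypothetical: one occupied range in the extended config space. */
struct cfg_range {
	uint16_t start;
	uint16_t len;
};

/*
 * Return the offset of the first DWORD-aligned gap of at least 'need'
 * bytes, or 0 if there is none. 'used' must be sorted by start offset
 * and must not contain overlapping ranges.
 */
static uint16_t find_free_gap(const struct cfg_range *used, size_t n,
			      uint16_t need)
{
	uint32_t cursor = PCIE_EXT_CFG_START;
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t end;

		/* Is the hole in front of this range big enough? */
		if (used[i].start >= cursor && used[i].start - cursor >= need)
			return (uint16_t)cursor;

		/* Skip past the range, rounding up to a DWORD boundary. */
		end = (uint32_t)used[i].start + used[i].len;
		if (end > cursor)
			cursor = (end + 3u) & ~3u;
	}

	/* Check the tail between the last range and the end of the space. */
	if (cursor < PCIE_EXT_CFG_END && PCIE_EXT_CFG_END - cursor >= need)
		return (uint16_t)cursor;

	return 0;
}
```

A VMM would call something like find_free_gap(used, n, PASID_CAP_LEN) and
link the synthetic block into the chain at the returned offset. Note that
this treats every gap as free, which is exactly the assumption that does
not hold for devices hiding registers in unenumerated space.
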
For non-migration cases, to make it automatic, we can check the free
space via gaps. The broken devices that have problems with this can
either be told to use the explicit approach above, the VMM could consult
some text file, or vPASID/etc can be left disabled. IMHO the support of
PASID is so rare today this is probably fine.

Vendors should be *strongly encouraged* to wrap their special used
config space areas in DVSEC and not hide them in free space.

We may also want a DVSEC to indicate free space - but if vendors are
going to change their devices I'd rather them change to mark the used
space with DVSEC than mark the free space :)

Jason