> From: Jason Gunthorpe <jgg@xxxxxxxxxx> > Sent: Thursday, January 4, 2024 10:37 PM > > On Thu, Dec 14, 2023 at 07:26:39PM +0800, Yi Liu wrote: > > Per the prior discussion[1], we agreed to move the error reporting into the > > driver specific part. On Intel side, we want to report two devTLB > > invalidation errors: ICE (invalid completion error) and ITE (invalidation > > timeout error). Such errors have an additional SID information to tell > > which device failed the devTLB invalidation. I've got the below structure. > > IMHO all of this complexity is a consequence of the decision to hide > the devtlb invalidation from the VM.. > > On the other hand I guess you want to do this because of the SIOV > troubles where the vPCI function in the VM is entirely virtual and > can't be trivially mapped to a real PCI function for ATC invalidation > like ARM and AMD can do (but they also can't support SIOV because of > this). :( > > However it also makes it very confusing about how the VM would > perceive an error - eg if it invalidates an SIOV device single PASID > and that devtlb fails then the error should be connected back to the > vPCI function for the SIOV's specific PASID and not back to the > physical PCI function for the SIOV owner. > > As the iommu driver itself has no idea about the vPCI functions this > seems like it is going to get really confusing. The API I suggested in > the other email is not entirely going to work as the vPCI function for > SIOV cases will have to be identified by the (struct device, PASID) - > while it would be easy enough for the iommu driver to provide the > PASID, I'm not sure how the iommufd core will relate the PASID back to > the iommu_device to understand SIOV without actually being aware of > SIOV to some degree :\ we plan to add such awareness with a new binding helper: https://lore.kernel.org/lkml/20231009085123.463179-4-yi.l.liu@xxxxxxxxx/ and with metadata to track association between PASID's and iommufd vdev. but in reality the relation could be identified in an easy way due to a SIOV restriction which we discussed before - shared PASID space of PF disallows assigning sibling vdev's to a same VM (otherwise no way to identify which sibling vdev triggering an iopf when a pasid is used on both vdev's). That restriction implies that within an iommufd context every iommufd_device object should contain a unique struct device pointer. So PASID can be instead ignored in the lookup then just always do iommufd_get_dev_id() using struct device.