On Tue, 8 May 2018 16:10:19 -0600
Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:

> On 08/05/18 04:03 PM, Alex Williamson wrote:
> > If IOMMU grouping implies device assignment (because nobody else uses
> > it to the same extent as device assignment) then the build-time option
> > falls to pieces, we need a single kernel that can do both. I think we
> > need to get more clever about allowing the user to specify exactly at
> > which points in the topology they want to disable isolation. Thanks,
> >
> Yeah, so based on the discussion I'm leaning toward just having a
> command line option that takes a list of BDFs and disables ACS for them.
> (Essentially as Dan has suggested.) This avoids the shotgun.
>
> Then, the pci_p2pdma_distance command needs to check that ACS is
> disabled for all bridges between the two devices. If this is not the
> case, it returns -1. Future work can check if the EP has ATS support, in
> which case it has to check for the ACS direct translated bit.
>
> A user then needs to either disable the IOMMU and/or add the command
> line option to disable ACS for the specific downstream ports in the PCI
> hierarchy. This means the IOMMU groups will be less granular but
> presumably the person adding the command line argument understands this.
>
> We may also want to do some work so that there's informative dmesgs on
> which BDFs need to be specified on the command line so it's not so
> difficult for the user to figure out.

I'd advise caution with a user supplied BDF approach, we have no
guaranteed persistence for a device's PCI address. Adding a device
might renumber the buses, replacing a device with one that consumes
more/less bus numbers can renumber the buses, motherboard firmware
updates could renumber the buses, pci=assign-buses can renumber the
buses, etc.
This is why the VT-d spec makes use of device paths when describing
PCI hierarchies, firmware can't know what bus number will be assigned
to a device, but it does know the base bus number and the path of
devfns needed to get to it. I don't know how we come up with an option
that's easy enough for a user to understand, but reasonably robust
against hardware changes. Thanks,

Alex
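
To illustrate the device-path idea, here is a minimal Python sketch. It is
not kernel code and not the VT-d table format: the resolve_path helper, the
bridge table, and both example topologies are hypothetical, made up only to
show that a path of (device, function) hops from a known starting bus keeps
identifying the same slot while the resulting BDF changes after a bus
renumbering.

```python
from typing import Dict, List, Tuple

# Hypothetical bridge table: (bus, device, function) of a bridge
# mapped to that bridge's secondary bus number.
Topology = Dict[Tuple[int, int, int], int]


def resolve_path(start_bus: int, path: List[Tuple[int, int]],
                 bridges: Topology) -> str:
    """Walk a path of (device, function) hops and return the final BDF."""
    if not path:
        raise ValueError("path needs at least one hop")
    bus = start_bus
    # Every hop except the last must be a bridge; follow it downstream.
    for dev, fn in path[:-1]:
        bus = bridges[(bus, dev, fn)]
    dev, fn = path[-1]
    return f"{bus:02x}:{dev:02x}.{fn:x}"


# One physical location: bridge at 1c.0 on the starting bus, then
# device 00.0 behind it.
path = [(0x1C, 0), (0x00, 0)]

# Two made-up enumerations of the same hardware.
before = {(0x00, 0x1C, 0): 0x03}  # bridge's secondary bus is 03
after = {(0x00, 0x1C, 0): 0x05}   # same bridge after a renumbering

print(resolve_path(0, path, before))  # 03:00.0
print(resolve_path(0, path, after))   # 05:00.0
```

The path is identical in both calls; only the bus number in the resolved
BDF differs, which is the fragility a BDF-based command line option would
inherit.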