On Tue, 2 Feb 2021 19:06:04 -0400
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Tue, Feb 02, 2021 at 02:30:13PM -0700, Alex Williamson wrote:
>
> > The first set of users already fail this specification though, we can't
> > base it strictly on device and vendor IDs, we need wildcards, class
> > codes, revision IDs, etc., just like any other PCI driver.  We're not
> > going to maintain a set of specific device IDs for the IGD
> > extension,
>
> The Intel GPU driver already has an include/drm/i915_pciids.h that
> organizes all the PCI match table entries, no reason why VFIO IGD
> couldn't include that too and use the same match table as the real GPU
> driver. Same HW right?

vfio-pci-igd support knows very little about the device; we're
effectively just exposing a firmware table and some of the host bridge
config space (read-only).  So the idea that the host kernel needs to
have updated i915 support in order to expose the device to userspace
with these extra regions is a bit silly.

> Also how sure are you that this loose detection is going to work with
> future Intel discrete GPUs that likely won't need vfio_igd?

Not at all, which is one more reason we don't want to rely on i915's
device table, which would likely support those as well.  We might only
want to bind to GPUs on the root complex, or at address 0000:00:02.0.
Our "what to reject" algorithm might need to evolve as those arrive,
but I don't think that means we need to explicitly list every device ID
either.

> > nor I suspect the NVLINK support as that would require a kernel update
> > every time a new GPU is released that makes use of the same interface.
>
> The nvlink device that required this special vfio code was a one
> off. Current devices do not use it. Not having an exact PCI ID match
> in this case is a bug.

AIUI, the quirk is only activated when there's a firmware table to
support it.  No firmware table, no driver bind, no need to use explicit
IDs.  Vendor and class code should be enough.

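To illustrate what "vendor and class code" matching could look like,
here is a rough userspace sketch in Python, not kernel code.  It
assumes the usual sysfs modalias layout for PCI devices and glob-style
alias matching.  The alias string, the address check and the example
device IDs are illustrative assumptions, not an existing match table.

```python
from fnmatch import fnmatchcase


def pci_modalias(vendor, device, subvendor, subdevice, class_code):
    """Format a PCI modalias string in the layout sysfs reports."""
    base = (class_code >> 16) & 0xFF
    sub = (class_code >> 8) & 0xFF
    prog_if = class_code & 0xFF
    return "pci:v%08Xd%08Xsv%08Xsd%08Xbc%02Xsc%02Xi%02X" % (
        vendor, device, subvendor, subdevice, base, sub, prog_if)


# Hypothetical alias: any Intel (0x8086) device with display base class
# 0x03.  Device ID, subsystem IDs and sub-class are all wildcards.
IGD_ALIAS = "pci:v00008086d*sv*sd*bc03sc*i*"

# Assumption for this sketch: only accept the GPU at the traditional
# integrated graphics address, so discrete parts are rejected.
IGD_ADDRESS = "0000:00:02.0"


def would_bind(modalias, address):
    """Loose match on vendor and class, then reject by location."""
    return fnmatchcase(modalias, IGD_ALIAS) and address == IGD_ADDRESS


# Example values only; the IDs are not taken from any real match table.
igd = pci_modalias(0x8086, 0x3E92, 0x8086, 0x2212, 0x030000)
nic = pci_modalias(0x8086, 0x15B8, 0x8086, 0x0000, 0x020000)

print(would_bind(igd, "0000:00:02.0"))  # Intel display device at 00:02.0
print(would_bind(igd, "0000:03:00.0"))  # same IDs behind a root port
print(would_bind(nic, "0000:00:1f.6"))  # Intel, but not display class
```

The point of the sketch is that the "what to reject" step lives in one
small predicate that can evolve without enumerating device IDs.
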
> > As I understand Jason's reply, these vendor drivers would have an ids
> > table and a user could look at modalias for the device to compare to
> > the driver supported aliases for a match.  Does kmod already have this
> > as a utility outside of modprobe?
>
> I think this is worth exploring.
>
> One idea that fits nicely with the existing infrastructure is to add
> to driver core a 'device mode' string. It would be "default" or "vfio"
>
> devices in vfio mode only match vfio mode device_drivers.
>
> devices in vfio mode generate a unique modalias string that includes
> some additional 'mode=vfio' identifier
>
> drivers that run in vfio mode generate a module table string that
> includes the same mode=vfio
>
> The driver core can trigger driver auto loading solely based on the
> mode string, happens naturally.
>
> All the existing udev, depmod/etc tooling will transparently work.
>
> Like driver_override, but doesn't bypass all the ID and module loading
> parts of the driver core.
>
> (But let's not get too far down this path until we can agree that
> embracing the driver core like the RFC contemplates is the agreed
> direction)

I'm not sure I fully follow the mechanics of this.  I'm interpreting
this as something like a sub-class of drivers where, for example,
vfio-pci class drivers would have a vfio-pci: alias prefix rather than
pci:.  There might be some sysfs attribute for the device that would
allow the user to write an alias prefix; would that then trigger (for
example) the pci-core to send remove uevents for the pci: modalias
device and add uevents for the vfio-pci: modalias device?  Some
ordering rules would then allow vendor/device modules to precede
vfio-pci, which would have only a wildcard id table?

I need to churn on that for a while, but if driver-core folks are
interested, maybe it could be a good idea...  Thanks,

Alex
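
For reference, the alias-prefix interpretation above could be sketched
in userspace Python roughly as follows.  This is only a model of the
idea, not a proposal for how the driver core would implement it.  The
module alias table and the "fewest wildcards first" ordering rule are
assumptions made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical alias table, as depmod might collect it from modules.
MODULE_ALIASES = [
    ("i915", "pci:v00008086d*sv*sd*bc03sc*i*"),
    ("vfio-pci-igd", "vfio-pci:v00008086d*sv*sd*bc03sc*i*"),
    ("vfio-pci", "vfio-pci:v*d*sv*sd*bc*sc*i*"),
]


def modalias_for(prefix, ids):
    """Build the modalias a device would report in a given mode."""
    return prefix + ":" + ids


def candidate_modules(modalias):
    """Return matching modules, most specific alias first.

    Assumed ordering rule: an alias with fewer wildcards is more
    specific, so vendor modules precede the all-wildcard vfio-pci.
    """
    matches = [(name, alias) for name, alias in MODULE_ALIASES
               if fnmatchcase(modalias, alias)]
    matches.sort(key=lambda m: m[1].count("*"))
    return [name for name, _ in matches]


# Example IDs only: an Intel display device and an Intel network device.
IGD_IDS = "v00008086d00003E92sv00008086sd00002212bc03sc00i00"
NIC_IDS = "v00008086d000015B8sv00008086sd00000000bc02sc00i00"

# Default mode: only the native driver's pci: alias matches.
print(candidate_modules(modalias_for("pci", IGD_IDS)))

# After the user writes the "vfio-pci" prefix, the device is
# re-announced with a vfio-pci: modalias and only vfio drivers match.
print(candidate_modules(modalias_for("vfio-pci", IGD_IDS)))
print(candidate_modules(modalias_for("vfio-pci", NIC_IDS)))
```

In this model the prefix change alone decides which family of drivers
can match, which is how I read the "mode" string in the proposal.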