On Tue, 2 Feb 2021 14:50:17 -0400
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Tue, Feb 02, 2021 at 10:54:55AM -0700, Alex Williamson wrote:
>
> > As noted previously, if we start adding ids for vfio drivers then we
> > create conflicts with the native host driver. We cannot register a
> > vfio PCI driver that automatically claims devices.
>
> We can't do that in vfio_pci.ko, but a nvlink_vfio_pci.ko can, just
> like the RFC showed with the mlx5 example. The key thing is the module
> is not autoloadable and there is no modules.alias data for the PCI
> IDs.
>
> The admin must explicitly load the module, just like the admin must
> explicitly do "cat > new_id". "modprobe nvlink_vfio_pci" replaces
> "newid", and preloading the correct IDs into the module's driver makes
> the entire admin experience much more natural and safe.
>
> This could be improved with some simple work in the driver core:
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 2f32f38a11ed0b..dc3b088ad44d69 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -828,6 +828,9 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
>         bool async_allowed;
>         int ret;
>
> +       if (drv->flags & DRIVER_EXPLICIT_BIND_ONLY)
> +               continue;
> +
>         ret = driver_match_device(drv, dev);
>         if (ret == 0) {
>                 /* no match */
>
> Thus the match table could be properly specified, but only explicit
> manual bind would attach the driver. This would cleanly resolve the
> duplicate ID problem, and we could even set a wildcard PCI match table
> for vfio_pci and eliminate the new_id part of the sequence.
>
> However, I'd prefer to split any driver core work from VFIO parts - so
> I'd propose starting by splitting to vfio_pci_core.ko, vfio_pci.ko,
> nvlink_vfio_pci.ko, and igd_vfio_pci.ko working as above.

For the most part, this explicit bind interface is redundant to
driver_override, which already avoids the duplicate ID issue.
A user specifies a driver to use for a given device, which automatically
makes the driver match accept the device, and there are no conflicts
with native drivers.

The problem is still how the user knows to choose vfio-pci-igd for a
device versus vfio-pci-nvlink, other vendor drivers, or vfio-pci. A
driver id table doesn't really help for binding the device. Ultimately,
even if a device is in the id table, it might fail to probe due to
missing platform support that each of these igd and nvlink drivers
exposes, at which point the user would need to pick the next best
option. Are you somehow proposing the driver id table as a way for the
user to understand the possible drivers, even if that doesn't prioritize
them? I don't see that there's anything new here otherwise. Thanks,

Alex
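
For illustration, the driver_override matching behaviour described above
can be modelled in a few lines of userspace C. This is a simplified
sketch, not kernel code: the model_* types and function names are
hypothetical stand-ins, and it assumes that a set driver_override makes
only the named driver match, regardless of any id table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel's PCI structures. */
struct model_id {
	unsigned int vendor;
	unsigned int device;
};

struct model_driver {
	const char *name;
	const struct model_id *id_table;	/* may be NULL when n_ids == 0 */
	size_t n_ids;
};

struct model_device {
	unsigned int vendor;
	unsigned int device;
	const char *driver_override;		/* NULL when unset */
};

/* Plain id table lookup: true if the device's IDs appear in the table. */
static bool model_id_table_matches(const struct model_driver *drv,
				   const struct model_device *dev)
{
	for (size_t i = 0; i < drv->n_ids; i++) {
		if (drv->id_table[i].vendor == dev->vendor &&
		    drv->id_table[i].device == dev->device)
			return true;
	}
	return false;
}

/*
 * Model of the match decision. When driver_override is set, only the
 * driver with that exact name matches and the id table is ignored.
 * Otherwise the id table alone decides.
 */
bool model_match(const struct model_driver *drv,
		 const struct model_device *dev)
{
	assert(drv != NULL && dev != NULL && drv->name != NULL);

	if (dev->driver_override != NULL)
		return strcmp(dev->driver_override, drv->name) == 0;

	return model_id_table_matches(drv, dev);
}
```

In this model a native driver listing the device's IDs matches while
driver_override is unset, and stops matching once the override names
another driver such as vfio-pci, so the two never compete for the
device. The model says nothing about which driver name the user should
write into the override, which is the open question raised above.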