On Wed, 10 Mar 2021 14:57:57 +0200
Max Gurtovoy <mgurtovoy@xxxxxxxxxx> wrote:

> On 3/10/2021 8:39 AM, Alexey Kardashevskiy wrote:
> > On 09/03/2021 19:33, Max Gurtovoy wrote:
> >> +static const struct pci_device_id nvlink2gpu_vfio_pci_table[] = {
> >> +	{ PCI_VDEVICE(NVIDIA, 0x1DB1) }, /* GV100GL-A NVIDIA Tesla V100-SXM2-16GB */
> >> +	{ PCI_VDEVICE(NVIDIA, 0x1DB5) }, /* GV100GL-A NVIDIA Tesla V100-SXM2-32GB */
> >> +	{ PCI_VDEVICE(NVIDIA, 0x1DB8) }, /* GV100GL-A NVIDIA Tesla V100-SXM3-32GB */
> >> +	{ PCI_VDEVICE(NVIDIA, 0x1DF5) }, /* GV100GL-B NVIDIA Tesla V100-SXM2-16GB */
> >
> > Where is this list from?
> >
> > Also, how is this supposed to work at boot time? Will the kernel try
> > binding, let's say, this one and nouveau? Which one is going to win?
>
> At boot time the nouveau driver will win, since the vfio drivers don't
> declare MODULE_DEVICE_TABLE.

This still seems troublesome. AIUI, MODULE_DEVICE_TABLE is responsible
for creating aliases so that kmod can figure out which modules to load,
but what happens if all these vfio-pci modules are built into the kernel
or the modules are already loaded? In the former case, I think it boils
down to link order, while the latter is generally considered even less
deterministic since it depends on module load order. So if one of these
vfio modules should get loaded before the native driver, I think devices
could bind here first.

Are there tricks/extensions we could use in driver overrides, for
example maybe a compatibility alias such that one of these vfio-pci
variants could match "vfio-pci"? Perhaps that, along with some sort of
priority scheme to probe variants ahead of the base driver, though I'm
not sure how we'd get these variants loaded without something like
module aliases.

I know we're trying to avoid creating another level of driver matching,
but that's essentially what we have in the compat option enabled here,
and I'm not sure I see how userspace makes the leap to understand what
driver to use for a given device.

Thanks,
Alex
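
[For readers following the thread: MODULE_DEVICE_TABLE() is the piece
being discussed above. A minimal sketch, assuming the IDs from the
patch, of what the variant driver would look like if it *did* declare
the table; the patch deliberately omits the final macro so that no
modalias entries are generated and nouveau wins autoloading:]

	#include <linux/module.h>
	#include <linux/pci.h>

	static const struct pci_device_id nvlink2gpu_vfio_pci_table[] = {
		{ PCI_VDEVICE(NVIDIA, 0x1DB1) }, /* Tesla V100-SXM2-16GB */
		{ PCI_VDEVICE(NVIDIA, 0x1DB5) }, /* Tesla V100-SXM2-32GB */
		{ PCI_VDEVICE(NVIDIA, 0x1DB8) }, /* Tesla V100-SXM3-32GB */
		{ PCI_VDEVICE(NVIDIA, 0x1DF5) }, /* Tesla V100-SXM2-16GB */
		{ 0, }	/* terminating entry */
	};
	/*
	 * This macro emits aliases of the form
	 * "pci:v000010DEd00001DB1sv*sd*bc*sc*i*" into modules.alias,
	 * which is what lets udev/kmod autoload the module when a
	 * matching device appears -- and what would put this driver in
	 * direct competition with nouveau at probe time.
	 */
	MODULE_DEVICE_TABLE(pci, nvlink2gpu_vfio_pci_table);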
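
[The driver_override mechanism referenced above is the existing sysfs
knob: userspace writes a driver name into
/sys/bus/pci/devices/<addr>/driver_override and then triggers a probe,
and the PCI core matches that string against drv->name exactly. A
compatibility alias as speculated in the mail would need the match
extended along the lines below. This is a hypothetical sketch, not
current kernel code; the override_alias member does not exist in
struct pci_driver and is assumed purely for illustration:]

	#include <linux/pci.h>
	#include <linux/string.h>

	/*
	 * Hypothetical: let a variant driver also answer when userspace
	 * writes the generic "vfio-pci" name into driver_override.
	 */
	static bool pci_driver_override_match(struct pci_dev *pdev,
					      struct pci_driver *drv)
	{
		if (!pdev->driver_override)
			return false;
		/* today's behavior: exact match on the driver name */
		if (!strcmp(pdev->driver_override, drv->name))
			return true;
		/* compat alias: "vfio-pci" could also select a variant */
		return drv->override_alias &&
		       !strcmp(pdev->driver_override, drv->override_alias);
	}

[Even with such an alias, something still has to load the variant
module before the override can match, which circles back to the
module-alias and load-order problem raised in the mail.]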