On 8/1/22 7:58 AM, Erik Skultety wrote:
On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote:
Before a PCI device can be assigned to a guest with VFIO, that device
must be bound to the vfio-pci driver rather than to the device's
normal driver. The vfio-pci driver provides APIs that permit QEMU to
perform all the necessary operations to make the device accessible to
the guest.
There has been kernel work recently to support vendor/device-specific
VFIO drivers that provide the basic vfio-pci driver functionality
while adding support for device-specific operations (for example these
device-specific drivers are planned to support live migration of
certain devices). All that will be needed to make this functionality
available will be to bind the new vendor-specific driver to the device
(rather than the generic vfio-pci driver, which will continue to work
just without the extra functionality).
But until now libvirt has required that all PCI devices being assigned
to a guest with VFIO specifically have the "vfio-pci" driver bound to
the device. So even if the user manually binds a shiny new
vendor-specific driver to the device (and puts "managed='no'" in the
config to prevent libvirt from changing that), libvirt will just fail
during startup of the guest (or during hotplug) because the driver
bound to the device isn't named exactly "vfio-pci".
Fortunately these new vendor/device-specific drivers can be easily
identified as being "vfio-pci + extra stuff" - all that's needed is to
look at the output of the "modinfo $driver_name" command to see if
"vfio_pci" is in the alias list for the driver.
That's what this patch does. When libvirt checks the driver bound to a
device (either to decide if it needs to bind to a different driver or
perform some other operation, or if the current driver is acceptable
as-is), if the driver isn't specifically "vfio-pci", then it will look
at the output of modinfo for the driver that *is* bound to the device;
if modinfo shows vfio_pci as an alias for that device, then we'll
behave as if the driver was exactly "vfio-pci".
Since you say that the vendor/device-specific drivers does each of such drivers
implement the base vfio-pci functionality or they simply call into the base
driver? The reason why I'm asking is that if each of the vendor-specific
drivers depend on the vfio-pci module to be loaded as well, then reading
/proc/modules should suffice as vfio-pci should be listed right next to the
vendor-specific one. What am I missing?
I don't know the definitive answer to that, as I have no example of a
working vendor-specific driver to look at and only know about the kernel
work going on second-hand from Alex. It looks like even the vfio_pci
driver itself depends on other presumably lower level vfio-* modules (it
directly uses vfio_pci_core, which in turn uses vfio and vfio_vifqfd),
so possibly these new drivers would be depending on one or more of those
lower level modules rather than vfio_pci. Also I would imagine it would
be possible for other drivers to also depend on the vfio-pci driver
while not themselves being a vfio driver.
The 'alias' field is optional so do we have any support guarantees from the
vendors that the it will always be filled in correctly? I mean you surely
handle that case in the code, but once we start supporting this there's no way
back and we already know how painful it can be to convince the vendors to
follow some kind of standard so that we don't need to maintain several code
paths based on a vendor-matrix.
The aliases are what is used to determine the "best" vfio driver for a
particular device, so I don't think it would be possible for a driver to
not implement it, and the method I've used here to determine if a driver
is a vfio driver was recommended by Alex after a couple of discussions
on the subject.
...
+int
+virPCIDeviceGetDriverNameAndType(virPCIDevice *dev,
+ char **drvName,
+ virPCIStubDriver *drvType)
+{
+ g_autofree char *drvPath = NULL;
+ g_autoptr(virCommand) cmd = NULL;
+ g_autofree char *output = NULL;
+ g_autoptr(GRegex) regex = NULL;
+ g_autoptr(GError) err = NULL;
+ g_autoptr(GMatchInfo) info = NULL;
+ int exit;
+ int tmpType;
+
+ if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0)
+ return -1;
+
+ if (!*drvName) {
+ *drvType = VIR_PCI_STUB_DRIVER_NONE;
+ return 0;
+ }
+
+ tmpType = virPCIStubDriverTypeFromString(*drvName);
+
+ if (tmpType > VIR_PCI_STUB_DRIVER_NONE) {
+ *drvType = tmpType;
+ return 0; /* exact match of a known driver name (or no name) */
+ }
+
+ /* Check the output of "modinfo $drvName" to see if it has
+ * "vfio_pci" as an alias. If it does, then this driver should
+ * also be considered as a vfio-pci driver, because it implements
+ * all the functionality of the basic vfio-pci (plus additional
+ * device-specific stuff).
+ */
Instead of calling an external program and then grepping its output which
technically could change in the future, wouldn't it be better if we read
/lib/modules/`uname -r`/modules.alias and filtered whatever line had the
vfio-pci' substring and compared the module name with the user-provided device
driver?
Again, although I was hesistant about calling an external command, and
asked if there was something simpler, Alex still suggested modinfo, so
I'll let him answer that. Alex?
(Also, although the format of the output of "uname -r" is pretty much
written in stone, you're still running an external command :-))
If not then I think you should pass '-F alias' to the command to speed up the
regex just a tiny bit.
Sure. That sounds fine to me.