Historically libvirt has treated the concept of "loadable kernel module" and "device driver" as being effectively the same (at least in the case of the vfio-pci driver used for VFIO device assignment). The code assumed that a module named "vfio-pci" implemented a driver named "vfio-pci". In reality, the module is named "vfio_pci", and it implements a device driver named "vfio-pci" (note the difference in separator characters), and our code worked only because the modprobe utility we use to load the module will always "normalize" the module name it's given by replacing all "-" (dash) with "_" (underscore) (this has been verified in the modprobe source, which is in the kmod package - there are 3 separate functions that perform this same operation!). So even though we asked modprobe to load the "vfio-pci" module, it would actually load the "vfio_pci" module. After loading the module with modprobe, libvirt then looks for the desired *driver* to be present in sysfs, by looking for the directory /sys/bus/pci/drivers/${driverName}, which would succeed because we were still looking for the original "dash version" of the name ("vfio-pci"). When we recently gained the ability to manually specify a driver to bind to with virsh nodedev-detach, the fragility of this system became apparent - if a user gives the driver name as "vfio_pci", then we would modprobe the module, but then erroneously believe it hadn't been loaded because /sys/bus/pci/drivers/vfio_pci didn't exist. For manual specification of the driver name, this could be dealt with by telling the user "always use the correct name for the driver, don't assume that it is the same as the modulename", but it would still end up confusing people, especially since some drivers do use underscore in their name (e.g. the mlx5_vfio_pci driver/module). More more importantly, when we begin looking in the modules.alias file for the "best" VFIO variant driver for a particular device (in an upcoming patch), that lookup will provide us with the name of a *module*, not a driver, and 3 of the 4 examples of vfio-pci/variant drivers I have access to use underscore in the module name and dash in the driver, while the 4th uses underscore in both, so 3 out of 4 fail with the current code. To make the current code more forgiving (and yet more correct!), as well as to make the upcoming variant auto-selection actually useful, this patch follows the following steps: 1) if the requested driver (${driverName}) is present in sysfs drivers list (/sys/bus/pci/drivers/${driverName}) then use ${driverName} - DONE 2) if underscored(${driverName}) (call it ${underName} for short) is in sysfs drivers list then use ${underName} - DONE 3) if ${underName} *isn't* in the sysfs modules list (/sys/module/${underName}) then "modprobe ${underName}" to load it. 4) look for the PCI driver implemented by this module, it will be at /sys/module/${underName}/driver/pci:${newName}. 5) use ${newName} - DONE. A few notes: a) This will *always* replace dash with underscore in the name used to load the module, which seems it could be problematic, but I have verified this is what modprobe itself does, so I don't think there will ever be a module with "-" in its name as modprobe would be unable to load it (the same can't be said for drivers though!) b) The one case where the above steps wouldn't work would be if someone implemented a driver within a module where the driver and module had names that differed other than dash vs. underscore. I haven't seen an example of this, so the convention seems to be that they will "match" (modulo - & _). If this mismatch were to ever occur, though, it could be worked around by simply loading the module out-of-band during the host system startup, and then using the driver's name in the config. c) All of this is conveniently ignoring the possibility of a VFIO variant driver that is statically linked in the kernel. The entire design of variant driver auto-detection is based on doing a lookup in modules.alias, and that only lists *loadable modules* (not drivers), so unless I'm missing something, it would be impossible to auto-detect a VFIO variant driver that was statically linked. This is beyond libvirt's ability to fix; the best that could be done would be to manually specify the driver name in the libvirt config, which I suppose is better than nothing :-) Signed-off-by: Laine Stump <laine@xxxxxxxxxx> --- src/util/virpci.c | 195 +++++++++++++++++++++++++++++++++++++++------- 1 file changed, 165 insertions(+), 30 deletions(-) diff --git a/src/util/virpci.c b/src/util/virpci.c index baacde4c14..513ae948c0 100644 --- a/src/util/virpci.c +++ b/src/util/virpci.c @@ -221,6 +221,13 @@ virPCIDriverDir(const char *driver) } +static char * +virPCIModuleDir(const char *module) +{ + return g_strdup_printf("/sys/module/%s", module); +} + + static char * virPCIFile(const char *device, const char *file) { @@ -1153,44 +1160,168 @@ virPCIDeviceReset(virPCIDevice *dev, } +/** + * virPCINameDashToUnderscore: + * @path: the module/driver name or path - will be directly modified + * + * Replace all occurences of "-" with "_" in the name + * part of the path (everything after final "/" + * + * return true if any change was made, otherwise false. + */ static int -virPCIProbeDriver(const char *driverName) +virPCINameDashToUnderscore(char *path) { - g_autofree char *drvpath = NULL; + bool changed = false; + char *tmp = strrchr(path, '/'); + + if (!tmp) + tmp = path; + + while (*tmp) { + if (*tmp == '-') { + *tmp = '_'; + changed = true; + } + tmp++; + } + + return changed; +} + + +static int +virPCIProbeModule(const char *moduleName) +{ + g_autofree char *modulePath = NULL; g_autofree char *errbuf = NULL; - drvpath = virPCIDriverDir(driverName); + modulePath = virPCIModuleDir(moduleName); /* driver previously loaded, return */ - if (virFileExists(drvpath)) + if (virFileExists(modulePath)) return 0; - if ((errbuf = virKModLoad(driverName))) { - VIR_WARN("failed to load driver %s: %s", driverName, errbuf); - goto cleanup; + if ((errbuf = virKModLoad(moduleName))) { + /* If we know failure was because of admin config, let's report that; + * otherwise, report a more generic failure message + */ + if (virKModIsProhibited(moduleName)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to load PCI driver module %1$s: administratively prohibited"), + moduleName); + } else { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Failed to load PCI driver module %1$s: %2$s"), + moduleName, errbuf); + } + return -1; } /* driver loaded after probing */ - if (virFileExists(drvpath)) + if (virFileExists(modulePath)) return 0; - cleanup: - /* If we know failure was because of admin config, let's report that; - * otherwise, report a more generic failure message + virReportError(VIR_ERR_INTERNAL_ERROR, + _("modprobe reported success loading module '%1$s', but module is missing from /sys/module"), + moduleName); + return -1; +} + +/** + * virPCIDeviceFindDriver: + * @dev: initialized virPCIDevice, including desired stubDriverName + * + * Checks if there is a driver named @dev->stubDriverName already + * loaded. If there is, we're done. If not, look for a driver with + * that same name, except with dashes replaced with underscores. + + * If neither of the above is found, then look for/load the module of + * the underscored version of the name, and follow the links from + * /sys/module/$name/drivers/pci:* to the PCI driver associated with that + * module, and update @dev->stubDriverName with that name. + * + * On a successful return, @dev->stubDriverName will be updated with + * the proper name for the driver, and that driver will be loaded. + * + * returns 0 on success, -1 on failure + */ +static int +virPCIDeviceFindDriver(virPCIDevice *dev) +{ + g_autofree char *driverPath = virPCIDriverDir(dev->stubDriverName); + g_autofree char *errbuf = NULL; + g_autofree char *moduleName = NULL; + g_autofree char *moduleDriversDir = NULL; + g_autoptr(DIR) dir = NULL; + struct dirent *ent; + int direrr; + + /* unchanged stubDriverName */ + if (virFileExists(driverPath)) + return 0; + + /* try replacing "-" with "_" */ + if (virPCINameDashToUnderscore(driverPath) + && virFileExists(driverPath)) { + + /* update original name in dev */ + virPCINameDashToUnderscore(dev->stubDriverName); + return 0; + } + + /* look for a module with this name (but always replacing + * "-" with "_", since that's what modprobe does) */ - if (virKModIsProhibited(driverName)) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Failed to load PCI driver module %1$s: administratively prohibited"), - driverName); - } else { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Failed to load PCI driver module %1$s"), - driverName); + + moduleName = g_strdup(dev->stubDriverName); + virPCINameDashToUnderscore(moduleName); + + if (virPCIProbeModule(moduleName) < 0) + return -1; + + /* module was found/loaded. Now find the PCI driver it implements, + * linked to by /sys/module/$moduleName/drivers/pci:$driverName + */ + + moduleDriversDir = g_strdup_printf("/sys/module/%s/drivers", moduleName); + + if (virDirOpen(&dir, moduleDriversDir) < 0) + return -1; + + while ((direrr = virDirRead(dir, &ent, moduleDriversDir))) { + + if (STRPREFIX(ent->d_name, "pci:")) { + /* this is the link to the driver we want */ + + g_autofree char *drvName = NULL; + g_autofree char *drvPath = NULL; + + /* extract the driver name from the link name */ + drvName = g_strdup(ent->d_name + strlen("pci:")); + + /* make sure that driver is actually loaded */ + drvPath = virPCIDriverDir(drvName); + if (!virFileExists(drvPath)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("pci driver '%1$s' supposedly loaded by module '%2$s' not found in sysfs"), + drvName, moduleName); + return -1; + } + + g_free(dev->stubDriverName); + dev->stubDriverName = g_steal_pointer(&drvName); + return 0; + } } + virReportError(VIR_ERR_INTERNAL_ERROR, + _("module '%1$s' does not implement any pci driver"), + moduleName); return -1; } + int virPCIDeviceUnbind(virPCIDevice *dev) { @@ -1290,7 +1421,6 @@ virPCIDeviceUnbindFromStub(virPCIDevice *dev) static int virPCIDeviceBindToStub(virPCIDevice *dev) { - const char *stubDriverName = dev->stubDriverName; g_autofree char *stubDriverPath = NULL; g_autofree char *driverLink = NULL; @@ -1302,30 +1432,35 @@ virPCIDeviceBindToStub(virPCIDevice *dev) return -1; } - if (!stubDriverName - && !(stubDriverName = virPCIStubDriverTypeToString(dev->stubDriverType))) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unknown stub driver configured for PCI device %1$s"), - dev->name); - return -1; + if (!dev->stubDriverName) { + + const char *stubDriverName = NULL; + + if (!(stubDriverName = virPCIStubDriverTypeToString(dev->stubDriverType))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unknown stub driver configured for PCI device %1$s"), + dev->name); + return -1; + } + dev->stubDriverName = g_strdup(stubDriverName); } - if (virPCIProbeDriver(stubDriverName) < 0) + if (virPCIDeviceFindDriver(dev) < 0) return -1; - stubDriverPath = virPCIDriverDir(stubDriverName); + stubDriverPath = virPCIDriverDir(dev->stubDriverName); driverLink = virPCIFile(dev->name, "driver"); if (virFileExists(driverLink)) { if (virFileLinkPointsTo(driverLink, stubDriverPath)) { /* The device is already bound to the correct driver */ VIR_DEBUG("Device %s is already bound to %s", - dev->name, stubDriverName); + dev->name, dev->stubDriverName); return 0; } } - if (virPCIDeviceBindWithDriverOverride(dev, stubDriverName) < 0) + if (virPCIDeviceBindWithDriverOverride(dev, dev->stubDriverName) < 0) return -1; dev->unbind_from_stub = true; -- 2.41.0