On Fri, Mar 26, 2021 at 11:48:26AM -0500, Jonathon Jongsma wrote: > When calling virNodeDeviceDefineXML() to define a new mediated device, > we call virMdevctlDefine() and then wait for the new device to appear in > the driver's device list before returning. This caused long delays due > to the behavior of nodeDeviceFindNewMediatedDevice(). This function > checks to see if the device is in the list and then waits for 5s before > checking again. > > Because mdevctl is relatively slow to query the list of defined > devices[0], the newly-defined device was generally not in the device > list when we first checked. This results in libvirt almost always taking > at least 5s to complete this API call for mediated devices, which is > unacceptable. > > In order to avoid this long delay, we resort to a workaround. If the > call to virMdevctlDefine() was successful, we can assume that this new > device will exist the next time we query mdevctl for new devices. So we > simply add this provisional device definition directly to the nodedev > driver's device list and return from the function. At some point in the > future, the mdevctl handler will run and the "official" device will be > processed, which will update the provisional device if any new details > need to be added. > > The reason that this is not necessary for virNodeDeviceCreateXML() is > because detecting newly-created (not defined) mdevs happens through > udev instead of mdevctl. And nodeDeviceFindNewMediatedDevice() always > calls 'udevadm settle' before checking to see whether the device is in > the list. This allows us to wait just long enough for all udev events to > be processed, so the device is almost always in the list the first time > we check and so we almost never end up hitting the 5s sleep. > > [0] on my machine, 'mdevctl list --defined' took around 0.8s to > complete for only 3 defined mdevs. > --- > src/node_device/node_device_driver.c | 126 +++++++++++++++------------ > 1 file changed, 68 insertions(+), 58 deletions(-) > > diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c > index 35db24817a..a1b79d93f7 100644 > --- a/src/node_device/node_device_driver.c > +++ b/src/node_device/node_device_driver.c > @@ -1215,16 +1215,70 @@ nodeDeviceDestroy(virNodeDevicePtr device) > return ret; > } > > + > +/* takes ownership of @def and potentially frees it. @def should not be used > + * after returning from this function */ > +static int > +nodeDeviceUpdateMediatedDevice(virNodeDeviceDef *def) > +{ > + virNodeDeviceObj *obj; > + virObjectEvent *event; > + bool defined = false; > + g_autoptr(virNodeDeviceDef) owned = def; > + g_autofree char *name = g_strdup(owned->name); > + > + owned->driver = g_strdup("vfio_mdev"); > + > + if (!(obj = virNodeDeviceObjListFindByName(driver->devs, owned->name))) { > + virNodeDeviceDef *d = g_steal_pointer(&owned); > + if (!(obj = virNodeDeviceObjListAssignDef(driver->devs, d))) { > + virNodeDeviceDefFree(d); > + return -1; > + } > + } else { > + bool changed; > + virNodeDeviceDef *olddef = virNodeDeviceObjGetDef(obj); > + > + defined = virNodeDeviceObjIsPersistent(obj); > + /* Active devices contain some additional information (e.g. sysfs > + * path) that is not provided by mdevctl, so re-use the existing > + * definition and copy over new mdev data */ > + changed = nodeDeviceDefCopyFromMdevctl(olddef, owned); > + > + if (defined && !changed) { > + /* if this device was already defined and the definition > + * hasn't changed, there's nothing to do for this device */ > + virNodeDeviceObjEndAPI(&obj); > + return 0; > + } > + } > + > + /* all devices returned by virMdevctlListDefined() are persistent */ > + virNodeDeviceObjSetPersistent(obj, true); > + > + if (!defined) > + event = virNodeDeviceEventLifecycleNew(name, > + VIR_NODE_DEVICE_EVENT_DEFINED, > + 0); > + else > + event = virNodeDeviceEventUpdateNew(name); > + > + virNodeDeviceObjEndAPI(&obj); > + virObjectEventStateQueue(driver->nodeDeviceEventState, event); > + > + return 0; In 29/30, you extracted ^this code into a separate function. If you'd moved it to the right place as well, the diff in this patch would be smaller. > +} > + > virNodeDevice* > -nodeDeviceDefineXML(virConnect *conn, > +nodeDeviceDefineXML(virConnectPtr conn, I don't think you wanted ^this hunk > const char *xmlDesc, > unsigned int flags) > { > g_autoptr(virNodeDeviceDef) def = NULL; > - virNodeDevice *device = NULL; > const char *virt_type = NULL; > g_autofree char *uuid = NULL; > g_autofree char *errmsg = NULL; > + g_autofree char *name = NULL; > > virCheckFlags(0, NULL); > > @@ -1264,9 +1318,19 @@ nodeDeviceDefineXML(virConnect *conn, > } > > mdevGenerateDeviceName(def); > - device = nodeDeviceFindNewMediatedDevice(conn, def->caps->data.mdev.uuid); > + name = g_strdup(def->name); > + > + /* Normally we would call nodeDeviceFindNewMediatedDevice() here to wait > + * for the new device to appear. But mdevctl can take a while to query > + * devices, and if nodeDeviceFindNewMediatedDevice() doesn't find the new > + * device immediately it will wait at 5s before checking again. Since we > + * have already received the uuid from virMdevctlDefine(), we can simply > + * add the provisional device to the list and return it immediately and > + * avoid this long delay. */ > + if (nodeDeviceUpdateMediatedDevice(g_steal_pointer(&def)) < 0) > + return NULL; Let's just hope this workaround will not bite us in the back in the future. That said, I really didn't like the 5s delay :). After decreasing the diff of this patch: Reviewed-by: Erik Skultety <eskultet@xxxxxxxxxx>