On 07/26/2017 02:44 AM, Erik Skultety wrote: > Since @udev_monitor isn't immutable within the driver, we need to > protect every access to it by locking the driver, so that no spurious > action changing the pointer while one thread is actively using it might > occur. This patch just takes some precaution measures. > > Signed-off-by: Erik Skultety <eskultet@xxxxxxxxxx> > --- > src/node_device/node_device_udev.c | 16 ++++++++++++---- > 1 file changed, 12 insertions(+), 4 deletions(-) > > diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c > index 7ecb330df..0643a5845 100644 > --- a/src/node_device/node_device_udev.c > +++ b/src/node_device/node_device_udev.c > @@ -1623,9 +1623,13 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, > void *data ATTRIBUTE_UNUSED) > { > struct udev_device *device = NULL; > - struct udev_monitor *udev_monitor = DRV_STATE_UDEV_MONITOR(driver); > + struct udev_monitor *udev_monitor = NULL; > int udev_fd = -1; > > + nodeDeviceLock(); > + if (!(udev_monitor = DRV_STATE_UDEV_MONITOR(driver))) > + goto error; > + Originally I thought maybe we should provide some sort of error here, but I convinced myself perhaps not - although I wouldn't be opposed to a VIR_WARN or something... My second thought was should we add some sort of EventRemoveHandle type operation? But once we get the lock, the only way udev_monitor would be NULL is if nodeStateCleanup is run which should already have removed the handle, so probably not. But then it struck me, if that's the case, then the nodeDeviceLock could fail rather miserably because driver could be NULL... (e.g. virMutexLock(&driver->lock);) Still not quite clear how a path such as this could be triggered other than perhaps the last dying breaths of the daemon - unless something that the *unref does causes any events to be flushed. Maybe a "delayed" event - who knows. Still, if we wait on the lock, we could end up in a state where @driver doesn't exist or anything it was referencing has been freed... So I think we need : if (!driver) return; to follow how nodeStateCleanup will bail... then (considering my thoughts in patch 2) Add to node_device_udev.h (and use it in Cleanup too, so we don't have to create a local @priv variable): #define DRV_STATE_WATCH(ds) (((udevPrivate *)((ds)->privateData))->watch) nodeDeviceLock(); if (!driver || !driver->privateData || DRV_STATE_WATCH(driver) == -1 || !(udev_monitor = DRV_STATE_UDEV_MONITOR(driver))) { nodeDeviceUnlock(); return; } then the rest of the code. If you're "waiting" to get the lock while Cleanup is running, then remove the chance that something that Cleanup does could cause awful things to happen in this code. In cleanup, once nodeDeviceUnlock() is run, but before virMutexDestroy, we could grab the lock. That would cause virMutexDestroy (pthread_mutex_destroy) to fail, but we don't check that. That just becomes a side effect of using an event function and taking out the mutex I suppose. Oh and I should point out that it's possible that virMutexDestroy does run, thus our getting the lock does nothing, but one would think that the subsequent condition checks wouldn't be racing too hard against the VIR_FREE(driver) in the other thread. The impossible part of this race is that how does one tell if something is waiting on the mutex other than getting a failure when we destroy it, but since we don't manage failures for destroy, we have no way to be sure. Still in the long run, both threads are heading to a very quick death, so does it really matter. John (muttering to myself how much I hate lock races). > udev_fd = udev_monitor_get_fd(udev_monitor); > if (fd != udev_fd) { > udevPrivate *priv = driver->privateData; > @@ -1640,22 +1644,26 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, > * the same error multiple times > */ > virEventRemoveHandle(priv->watch); > - > - goto cleanup; > + goto error; > } > > device = udev_monitor_receive_device(udev_monitor); > if (device == NULL) { > virReportError(VIR_ERR_INTERNAL_ERROR, "%s", > _("udev_monitor_receive_device returned NULL")); > - goto cleanup; > + goto error; > } > > + nodeDeviceUnlock(); > udevHandleOneDevice(device); > > cleanup: > udev_device_unref(device); > return; > + > + error: > + nodeDeviceUnlock(); > + goto cleanup; > } > > > -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list