On Thu, Nov 01, 2018 at 01:48:39PM -0400, John Ferlan wrote: > Commit cdbe1332 neglected to document the API. So let's add some > details about the algorithm and why it was used to help future > readers understand the issues encountered. Based largely off a > description provided by Erik Skultety <eskultet@xxxxxxxxxx>. Oh, I missed ^this last sentence. Since you're blaming the commit already and the description based on a mailing list thread will not be part of the log, you should IMHO drop it. > > NB: Management of the processing udev device notification is a > delicate balance between the udev process, the scheduler, and when > exactly the data from/for the socket is received. The balance is > particularly important for environments when multiple devices are > added into the system more or less simultaneously such as is done > for mdev. "or SRIOV" Note: I can't really remember what I used for reproducer, mdevs or a SRIOV card. > In these cases scheduler blocking on the udev recv() call I don't think scheduler is blocking anywhere, it's rather old libudev blocking on the recv call ;) > is more noticeable. It's expected that future devices will follow > similar algorithms. Even though the algorithm does present some > challenges for older OS's (such as Centos 6), trying to rewrite > the algorithm to fit both models would be more complex and involve > pulling the monitor object out of the private data lockable object > and would need to be guarded by a separate lock. Devising such an > algorithm to work around issues with older OS's at the expense > of more modern OS algorithms in newer event processing code may > result in unexpected issues, so the choice is to encourage use > of newer OS's with newer udev event processing code. > > Signed-off-by: John Ferlan <jferlan@xxxxxxxxxx> > --- > > v1: https://www.redhat.com/archives/libvir-list/2018-October/msg01053.html > > Changes are from code review with some minor tweaks/editing as I > went along. Mistakes are my own! > > src/node_device/node_device_udev.c | 28 ++++++++++++++++++++++++++++ > 1 file changed, 28 insertions(+) > > diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c > index 22897591de..f2c2299d4d 100644 > --- a/src/node_device/node_device_udev.c > +++ b/src/node_device/node_device_udev.c > @@ -1591,6 +1591,26 @@ udevEventMonitorSanityCheck(udevEventDataPtr priv, > } > > > +/** > + * udevEventHandleThread > + * @opaque: unused > + * > + * Thread to handle the udevEventHandleCallback processing when udev > + * tells us there's a device change for us (add, modify, delete, etc). > + * > + * Once notified there is data to be processed, the actual @device > + * data retrieval by libudev may be delayed due to how threads are > + * scheduled. In fact, the event loop could be scheduled earlier than > + * the handler thread, thus potentially emitting the very same event > + * the handler thread is currently trying to process, simply because > + * the data hadn't been retrieved from the socket. > + * ... > + * NB: Usage of event based socket algorithm causes some issues with > + * older platforms such as CentOS 6 in which the data socket is opened ^Sounds a bit too generic, as an event based algorithm is not forced to open a socket without O_NONBLOCK - just put something like "older platforms' libudev opens sockets without the NONBLOCK flag which might cause issues with event based algorithm" or something alike in there. otherwise looks okay, I don't think a v3 is necessary: Reviewed-by: Erik Skultety <eskultet@xxxxxxxxxx> > + * without the NONBLOCK bit set. That can be avoided by moving the reset > + * priv->dataReady to just after the udev_monitor_receive_device; however, > + * scheduler issues would still come into play. > + */ > static void > udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED) > { > @@ -1637,6 +1657,9 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED) > return; > } > > + /* Trying to move the reset of the @priv->dataReady flag to > + * after the udev_monitor_receive_device wouldn't help much > + * due to event mgmt and scheduler timing. */ > virObjectLock(priv); > priv->dataReady = false; > virObjectUnlock(priv); > @@ -1646,6 +1669,11 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED) > > udevHandleOneDevice(device); > udev_device_unref(device); > + > + /* Instead of waiting for the next event after processing @device > + * data, let's keep reading from the udev monitor and only wait > + * for the next event once either a EAGAIN or a EWOULDBLOCK error > + * is encountered. */ > } > } > > -- > 2.17.2 > > -- > libvir-list mailing list > libvir-list@xxxxxxxxxx > https://www.redhat.com/mailman/listinfo/libvir-list -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list