On 11/1/18 7:36 AM, Erik Skultety wrote: > On Fri, Oct 19, 2018 at 08:04:43AM -0400, John Ferlan wrote: >> Add details of the algorithm and why it was used to help future >> readers understand the issues encountered. Based largely off a >> description provided by Erik Skultety <eskultet@xxxxxxxxxx>. > > Since the link to the archive would be missing from the commit message, I feel > like simply blaming commit cdbe13329a7 for lacking documentation/justification > of the algorithm used is the right way of composing the commit message. > OK - thanks for the commit link... > I appreciate the effort to document it properly, although I feel it's a bit > overwhelming, so the comments I have below is just a sheer attempt to reduce > the amount of text documenting a single function, you know, if a function needs > a comment like that, it suggests that there's something wrong with that > function... > Understood - I threw a lot of spaghetti on the wall and altering what's here is perfectly fine. I didn't want to leave something out that you may have found important... BTW: Another explanation for a lengthy function header - it may suggest that if you're going to post a patch changing it's logic, you had better know what you're doing ;-) >> >> Signed-off-by: John Ferlan <jferlan@xxxxxxxxxx> >> --- >> >> see: https://www.redhat.com/archives/libvir-list/2018-October/msg00915.html >> >> src/node_device/node_device_udev.c | 49 ++++++++++++++++++++++++++++++ >> 1 file changed, 49 insertions(+) >> >> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c >> index 22897591de..8ca81e65ad 100644 >> --- a/src/node_device/node_device_udev.c >> +++ b/src/node_device/node_device_udev.c >> @@ -1591,6 +1591,53 @@ udevEventMonitorSanityCheck(udevEventDataPtr priv, >> } >> >> >> +/** >> + * udevEventHandleThread >> + * @opaque: unused >> + * >> + * Thread to handle the udevEventHandleCallback processing when udev >> + * tells us there's a device change for us (add, modify, delete, etc). >> + * >> + * Management of the processing udev device notification is a delicate >> + * balance between the udev process, the scheduler, and when exactly the >> + * data from/for the socket is received. > > I'd drop ^this sentence, the issue with this unfortunate mix of factors is > explained further down, let's try keeping the amount of commentary down. > I'll move it to the commit message ;-) >> + * The udev processing code notifies >> + * the udevEventHandleCallback that it has data; however, the actual recv() >> + * done in the udev code to fill in the @device structure can be delayed >> + * due to how threads are scheduled. > > How about: > "When we get notified by udev that there are data to be processed, the actual > data retrieval by libudev may be delayed due to how threads are scheduled." > > + * >> + * As it turns out, the event loop could be scheduled (and it in fact >> + * was) earlier than the handler thread. What that essentially means is >> + * that by the time the thread actually handled the event and read the >> + * data from the monitor, the event loop fired the very same event, simply >> + * because the data hadn't been retrieved from the socket at that point yet. > > ^This doesn't need to be a separate paragraph, you can connect it to the > previous one. Also, some words don't feel right for commentary purposes (I know > they're mine :D). How about: > > "In fact, the event loop could be scheduled earlier than the handler thread, > thus potentially emitting the very same event the handler thread is currently > trying to process, simply because the data hadn't been retrieved from the > socket at that time yet." > >> + * >> + * Thus this algorithm is that once udevEventHandleCallback is notified >> + * there is data, > > ^this can be dropped... > >> + * this loop will process as much data as possible until >> + * udev_monitor_receive_device fails to get the @device. This may be >> + * because we've processed everything or because the scheduling thread >> + * hasn't been able to populate data in the socket. Once we're done >> + * processing everything we can, wait on the next event/notification. > > How about: > > "Instead of waiting for the next event after retrieving a device data we keep > reading from the udev monitor until we encounter one of EAGAIN or EWOULDBLOCK > errors, only then we'll wait for the next event." > > Also, I'd move that bit into the function body, so it's directly tied to > the piece of code responsible for it. > >> + * Although this may incur a slight delay if the socket data wasn't >> + * populated, the event/notifications are fairly consistent so there's >> + * no need to use virCondWaitUntil. > > ^This could be dropped. > >> + * >> + * NB: Usage of event based socket algorithm causes some issues with >> + * older platforms such as CentOS 6 in which the data socket is opened >> + * without the NONBLOCK bit set. Still even if the reset of priv->dataReady >> + * were moved to just after the udev_monitor_receive_device in order to >> + * avoid the NONBLOCK issue, the scheduler would still come into play > > "Trying to move the reset of the @priv->dataReady flag within the loop wouldn't > help much as the scheduler would still come into play." > > Everything written below could go to the commit message itself. > > > Would you agree to the proposed adjustments? yep, a v2 is on it's way. John > Erik > >> + * especially for environments when multiple devices are added into the >> + * system. Even those environments would still be blocked on the udev >> + * socket recv() call. The algorithm for handling that scenario would >> + * be more complex and involve pulling the monitor object out of the >> + * private data lockable object and would need to be guarded by a >> + * separate lock. Devising algorithms to work around issues with older >> + * OS's at the expense of more modern OS algorithms in newer event >> + * processing code may result in unexpected issues, so the choice is >> + * to encourage use of newer OS's with newer udev event processing code. >> + * Newer devices, such as mdevs make heavy usage of event processing >> + * and it's expected future devices would follow that model. >> + */ >> static void >> udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED) >> { >> @@ -1637,6 +1684,8 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED) >> return; >> } >> >> + /* See the above description why this needs to be here >> + * and not after the udev_monitor_receive_device. */ >> virObjectLock(priv); >> priv->dataReady = false; >> virObjectUnlock(priv); >> -- >> 2.17.2 >> >> -- >> libvir-list mailing list >> libvir-list@xxxxxxxxxx >> https://www.redhat.com/mailman/listinfo/libvir-list -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list