On Wed, Feb 13, 2019 at 03:03 PM +0100, John Ferlan <jferlan@xxxxxxxxxx> wrote: > On 2/13/19 4:34 AM, Marc Hartmayer wrote: >> On Tue, Feb 12, 2019 at 09:46 PM +0100, John Ferlan <jferlan@xxxxxxxxxx> wrote: >>> On 2/7/19 11:08 AM, Marc Hartmayer wrote: >>>> Commit "nodedev: Move device enumumeration out of nodeStateInitialize" >>>> (9f0ae0b18e3e620) has moved the heavy task of device enumeration into >>>> a separate thread. The problem with this commit is that there is a >>>> functionality change in the cleanup when udevEnumerateDevices >>>> fails. Before commit 9f0ae0b18e3e620, the entire driver was cleaned up >>>> by calling nodeStateCleanup (defined in node_device_udev.c) which >>>> resulted in libvirtd stopping with the error message >>>> 'daemonRunStateInit:800 : Driver state initialization failed'. With >>>> the commit 9f0ae0b18e3e620 there is only a signal to the udev thread >>>> that it must stop. This means that for example the watch handle isn't >>>> removed from the main loop and this can result in the main loop >>>> consuming 100% CPU time as soon as a new udev event occurs. >>>> >>>> This patch proposes a simple solution to the described problem. In >>>> case the udev thread stops the watch handle is removed from the main >>>> loop. >>>> >>>> Fixes: 9f0ae0b18e3e620 ("nodedev: Move device enumumeration out of nodeStateInitialize") >>>> Signed-off-by: Marc Hartmayer <mhartmay@xxxxxxxxxxxxx> >>>> --- >>>> >>>> Note: I'm not sure whether we should stop libvirtd (as it would have >>>> been done before) or if this patch is already sufficient. >>>> >>>> --- >>>> src/node_device/node_device_udev.c | 7 +++++++ >>>> 1 file changed, 7 insertions(+) >>>> >>> >>> What you have seems reasonable - although I wonder if it would be better >>> to remove the event handle in the error of >>> nodeStateInitializeEnumerate. >> >> Might be better, yes - I’ve no strong opinion about that. I’ve added the >> removal here because it doesn’t make sense to still have the watch >> handle registered in the main loop if the udev thread stops - at least I >> think so (just to be bulletproof). >> >> The more important point is before your change, the behavior was that >> libvirtd stops after this error. Now (with this patch) it only removes >> the watch handle and stops the udev thread. Is this change of behavior >> okay? >> > > This was mentioned during review of the patch - final response here: > > https://www.redhat.com/archives/libvir-list/2017-December/msg00531.html > > with the feeling that at least allowing other aspects of libvirt to > function even though udev processing is crippled wasn't a bad thing. Okay. > > Without the patch something would need to be done before starting > libvirtd anyway. The "missing piece" was actually having what you had > happen occur. Part of me wonders what fails such that enumeration fails > and is that something we should be chasing instead. IOW: Is there > something in udevEnumerateDevices failure that could just be ignored and > we continue to operate logging the failure (like you've done in the > patch you recently posted). Does that fix what caused this patch? Yep, it does - if you’re referring to my mail id:20190213123838.2826-1-mhartmay@xxxxxxxxxxxxx. > […snip] -- Kind regards / Beste Grüße Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Matthias Hartmann Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294