On 10/8/21 2:23 PM, Daniel P. Berrangé wrote:
> On Fri, Oct 08, 2021 at 01:56:24PM +0200, Michal Prívozník wrote:
>> On 10/8/21 1:36 PM, Daniel P. Berrangé wrote:
>>> On Fri, Oct 08, 2021 at 01:12:49PM +0200, Michal Privoznik wrote:
>>>> When a server decides to close a client, the
>>>> virNetServerClientCloseLocked() is called. In here various
>>>> cleanup steps are taken, but the most important part (from this
>>>> commit's POV at least) is the way that the socket is closed.
>>>> Firstly, removal of the socket associated with the client from
>>>> the event loop is signalized and then the socket is unrefed. The
>>>> socket is not closed just yet though, because the event loop
>>>> holds a reference to it. This reference will be freed as soon as
>>>> the event loop wakes up and starts issuing callbacks (in this
>>>> case virNetSocketEventFree()).
>>>>
>>>> So far, this is how things usually work. But if the daemon
>>>> reaches the number of opened files limit, things start to work
>>>> differently.
>>>>
>>>> If the RLIMIT_NOFILE limit is reached and there's a client that
>>>> wants to connect then the event loop wakes up, sees POLLIN on the
>>>> socket and calls virNetServerServiceAccept() which in turn calls
>>>> virNetSocketAccept(). But because of the limit, accept() fails
>>>> with EMFILE leaving the POLLIN event unhandled. The dispatch then
>>>> continues to next FDs with events on them. BUT, it will NOT call
>>>> the socket removal callback (virNetSocketEventFree()) because it
>>>> has low priority (G_PRIORITY_DEFAULT_IDLE). Per glib's
>>>> documentation:
>>>
>>> If virNetSocketAccept fails, there's no client socket to
>>> remove/close. AFAIK, we are not removing the server
>>> socket either.
>>>
>>> So I'm not following which socket is supposedly being
>>> removed in this scenario?
>>
>> Right. I didn't phrase it properly. Let's call
>> /var/run/libvirt/libvirt-sock SockA and the accepted (i.e. per
>> client) socket SockB.
>> Let the NOFILE limit be just the right value so that both SockA and
>> SockB can exist, but no more. Now, let's have another client trying
>> to connect(). The event loop wakes up, because there's an event on
>> SockA. But the connection can't be accept()-ed. Alright, so let the
>> already connected client disconnect (that is, SockB). This will
>> result in:
>>
>> g_idle_add(virEventGLibHandleRemoveIdle, SockB); /* I've made a
>> shortcut here, in reality there's a data variable that points to
>> SockB, but you get the idea */
>>
>> Now, because we still haven't handled the connection attempt on
>> SockA, the idle source won't ever be called because it has lower
>> priority. Therefore, the event loop is stuck in an endless loop
>> trying to accept() the connection on SockA.
>
> Ah, I see - "idle" callbacks aren't called because the event loop
> isn't idle.
>
> So with this change, we'll let the idle callback run, which will
> close the FD, which will let libvirtd accept the client, which gets
> us out of the 100% CPU burn loop..... unless there were two
> pending clients....
>
>
> So, ACK to this patch as a quick fix, but it feels like we
> ultimately have a more serious problem.
>
> I wonder if we should temporarily stop watching events on
> the incoming server if we get EMFILE first time on accept,
> and set a timer to re-enable events 60 seconds later. Rinse,
> repeat.

Yes, good idea. But I worry that 60 seconds may be too long, or
needlessly short, depending on usage. Nevertheless, let's start with
60 seconds and fine-tune later.
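For illustration, the glib-level shape of that back-off could be
roughly the following. This is only a sketch: the names, the bare
accept() call and the assumption that the listening FD was registered
with g_unix_fd_add(listen_fd, G_IO_IN, on_incoming,
GINT_TO_POINTER(listen_fd)) are mine, not the actual
virNetServerService code.

#include <errno.h>
#include <sys/socket.h>
#include <glib.h>
#include <glib-unix.h>

static gboolean on_incoming(gint fd, GIOCondition cond, gpointer opaque);

/* One-shot timer callback: re-arm the watch on the listening socket. */
static gboolean
reenable_listener(gpointer opaque)
{
    gint listen_fd = GPOINTER_TO_INT(opaque);

    g_unix_fd_add(listen_fd, G_IO_IN, on_incoming, opaque);
    return G_SOURCE_REMOVE;
}

static gboolean
on_incoming(gint fd, GIOCondition cond, gpointer opaque)
{
    int client = accept(fd, NULL, NULL);

    if (client < 0 && (errno == EMFILE || errno == ENFILE)) {
        /* Stop polling the listening socket so the loop no longer
         * spins on the pending POLLIN, and retry in 60 seconds. */
        g_timeout_add_seconds(60, reenable_listener, opaque);
        return G_SOURCE_REMOVE;
    }

    if (client >= 0) {
        /* ... hand the new connection off to the server ... */
    }
    return G_SOURCE_CONTINUE;
}

Returning G_SOURCE_REMOVE from the FD callback is what actually stops
the spinning; the interval would presumably end up as a tunable rather
than a hard-coded 60.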
So are you okay with me pushing this patch and sending this ^^ as a
follow-up?

>
>
>>>> * Each event source is assigned a priority. The default priority,
>>>> * %G_PRIORITY_DEFAULT, is 0. Values less than 0 denote higher priorities.
>>>> * Values greater than 0 denote lower priorities. Events from high priority
>>>> * sources are always processed before events from lower priority sources.
>>>>
>>>> and per g_idle_add() documentation:
>>>>
>>>> * Adds a function to be called whenever there are no higher priority
>>>> * events pending to the default main loop. The function is given the
>>>> * default idle priority, %G_PRIORITY_DEFAULT_IDLE.
>>>>
>>>> Now, because we did not accept() the client we are constantly
>>>> seeing POLLIN on the main socket and thus the removal of the
>>>> client socket won't ever happen.
>>>>
>>>> The fix is to set at least the same priority as other sources,
>>>> but since we want to just close an FD, let's give it the highest
>>>> priority and call it before handling other events.
>>>>
>>>> This issue can be easily reproduced, for instance:
>>>>
>>>> # ulimit -S -n 40 (tweak this number if needed)
>>>> # ./src/libvirtd
>>>>
>>>> from another terminal:
>>>>
>>>> # for ((i=0; i<100; i++)); do virsh list & done; virsh list
>>>>
>>>> The last `virsh list` must not get stuck.
>>>
>>> The bug below describes libvirt core dumps rather than
>>> hangs. Do you know why it dumped?
>>
>> Because of glib. We called some glib function which internally
>> wanted to open a pipe, which failed. Thus glib aborted. I don't
>> remember the exact function, but I can look it up if you want.
>
> Oh, so this patch is not actually fixing the core dump scenario
> then. Even after we fix the bug that prevents clients disconnecting,
> it will still dump if this API is called at a point where we are in
> an EMFILE state?

Yes. But I am afraid there's no way around it.
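Going back to the fix itself for a moment, the glib-level shape of the
priority change is roughly this (the helper name below is mine, for
illustration only - it is not the actual function in vireventglib.c):

#include <glib.h>

/* Attach the FD-removal callback as an idle source with a raised
 * priority, instead of plain g_idle_add() which pins it at
 * G_PRIORITY_DEFAULT_IDLE. */
static guint
schedule_fd_removal(GSourceFunc removal_cb, gpointer data)
{
    GSource *source = g_idle_source_new();
    guint id;

    /* G_PRIORITY_HIGH (-100) is dispatched before the default-priority
     * FD watches, so this runs even while SockA keeps reporting POLLIN. */
    g_source_set_priority(source, G_PRIORITY_HIGH);
    g_source_set_callback(source, removal_cb, data, NULL);

    id = g_source_attach(source, NULL);  /* default main context */
    g_source_unref(source);
    return id;
}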
Found the stack trace:

Stack trace of thread 52231:
#0  0x000003ffa90be9e4 raise (libc.so.6)
#1  0x000003ffa96590ee g_log_default_handler (libglib-2.0.so.0)
#2  0x000003ffa965935e g_logv (libglib-2.0.so.0)
#3  0x000003ffa965955a g_log (libglib-2.0.so.0)
#4  0x000003ffa969dd70 g_wakeup_new (libglib-2.0.so.0)
#5  0x000003ffa964eb38 g_main_context_new (libglib-2.0.so.0)
#6  0x000003ffa9946886 vir_event_thread_init (libvirt.so.0)
#7  0x000003ffa95b5ff8 g_type_create_instance (libgobject-2.0.so.0)
#8  0x000003ffa95978cc g_object_new_internal (libgobject-2.0.so.0)
#9  0x000003ffa95990de g_object_new_with_properties (libgobject-2.0.so.0)
#10 0x000003ffa9599ab6 g_object_new (libgobject-2.0.so.0)
#11 0x000003ffa9946b78 virEventThreadNew (libvirt.so.0)
#12 0x000003ff6c2c8bd8 qemuProcessQMPNew (libvirt_driver_qemu.so)
#13 0x000003ff6c1f57b8 virQEMUCapsInitQMPSingle (libvirt_driver_qemu.so)
#14 0x000003ff6c1f59ec virQEMUCapsNewForBinaryInternal (libvirt_driver_qemu.so)
#15 0x000003ff6c1f5c92 virQEMUCapsNewData (libvirt_driver_qemu.so)
#16 0x000003ffa99510f4 virFileCacheValidate (libvirt.so.0)
#17 0x000003ffa995133c virFileCacheLookup (libvirt.so.0)
#18 0x000003ff6c1f5e4a virQEMUCapsCacheLookup (libvirt_driver_qemu.so)
#19 0x000003ff6c1f621e virQEMUCapsCacheLookupCopy (libvirt_driver_qemu.so)
#20 0x000003ff6c2c4a44 qemuProcessInit (libvirt_driver_qemu.so)
#21 0x000003ff6c2c67f2 qemuProcessStart (libvirt_driver_qemu.so)
#22 0x000003ff6c26358c qemuDomainObjStart.constprop.55 (libvirt_driver_qemu.so)
#23 0x000003ff6c2639c8 qemuDomainCreateWithFlags (libvirt_driver_qemu.so)
#24 0x000003ffa9b77902 virDomainCreate (libvirt.so.0)
#25 0x000002aa34ad2d52 remoteDispatchDomainCreateHelper (libvirtd)
#26 0x000003ffa9a6094a virNetServerProgramDispatch (libvirt.so.0)
#27 0x000003ffa9a665b6 virNetServerHandleJob (libvirt.so.0)
#28 0x000003ffa99a0d32 virThreadPoolWorker (libvirt.so.0)
#29 0x000003ffa99a02d6 virThreadHelper (libvirt.so.0)
#30 0x000003ffa85880c6 start_thread (libpthread.so.0)
#31 0x000003ffa917bd36 thread_start (libc.so.6)

So my bad, it failed opening an eventfd (in frame #4, g_wakeup_new())
rather than a pipe, but anyway, I don't think we can mitigate this. I
think there was another stack trace attached to the BZ where it did
fail in glib in pipe(). It doesn't really matter though; it's glib -
we accepted its abort-on-failure policy when we adopted it.

Michal