On 10/8/21 2:23 PM, Daniel P. Berrangé wrote:
> On Fri, Oct 08, 2021 at 01:56:24PM +0200, Michal Prívozník wrote:
>> On 10/8/21 1:36 PM, Daniel P. Berrangé wrote:
>>> On Fri, Oct 08, 2021 at 01:12:49PM +0200, Michal Privoznik wrote:
>>>> When a server decides to close a client, the
>>>> virNetServerClientCloseLocked() is called. In here various
>>>> cleanup steps are taken, but the most important part (from this
>>>> commit's POV at least) is the way that the socket is closed.
>>>> Firstly, removal of the socket associated with the client from
>>>> the event loop is signalized and then the socket is unrefed. The
>>>> socket is not closed just yet though, because the event loop
>>>> holds a reference to it. This reference will be freed as soon as
>>>> the event loop wakes up and starts issuing callbacks (in this
>>>> case virNetSocketEventFree()).
>>>>
>>>> So far, this is how things usually work. But if the daemon
>>>> reaches the number of opened files limit, things start to work
>>>> differently.
>>>>
>>>> If the RLIMIT_NOFILE limit is reached and there's a client that
>>>> wants to connect then the event loop wakes up, sees POLLIN on the
>>>> socket and calls virNetServerServiceAccept() which in turn calls
>>>> virNetSocketAccept(). But because of the limit, accept() fails
>>>> with EMFILE leaving the POLLIN event unhandled. The dispatch then
>>>> continues to next FDs with events on them. BUT, it will NOT call
>>>> the socket removal callback (virNetSocketEventFree()) because it
>>>> has low priority (G_PRIORITY_DEFAULT_IDLE). Per glib's
>>>> documentation:
>>>
>>> If virNetSocketAccept fails, there's no client socket to
>>> remove/close. AFAIK, we are not removing the server
>>> socket either.
>>>
>>> So I'm not following which socket is supposedly being
>>> removed in this scenario?
>>
>> Right. I didn't phrase it properly. Let's call
>> /var/run/libvirt/libvirt-sock SockA and the accepted (i.e. per
>> client) socket SockB.
>> Let the NOFILE limit be just the right value so that both SockA and
>> SockB can exist, but no more. Now, let's have another client trying
>> to connect(). The event loop wakes up, because there's an event on
>> SockA. But the connection can't be accept()-ed. Alright, so let the
>> already connected client disconnect (that is, SockB). This will
>> result in:
>>
>> g_idle_add(virEventGLibHandleRemoveIdle, SockB); /* I've made a
>> shortcut here, in reality there's a data variable that points to
>> SockB, but you get the idea */
>>
>> Now, because we still haven't handled the connection attempt on
>> SockA, the idle source won't ever be called because it has lower
>> priority. Therefore, the event loop is stuck in an endless loop
>> trying to accept() the connection on SockA.
>
> Ah, I see - "idle" callbacks aren't called because the event loop
> isn't idle.
>
> So with this change, we'll let the idle callback run, which will
> close the FD, which will let libvirtd accept the client, which gets
> us out of the 100% CPU burn loop..... unless there were two
> pending clients....
>
>
> So, ACK to this patch as a quick fix, but it feels like we
> ultimately have a more serious problem.
>
> I wonder if we should temporarily stop watching events on
> the incoming server if we get EMFILE first time on accept,
> and set a timer to re-enable events 60 seconds later. Rinse,
> repeat.

Yes, good idea. But I worry that 60 seconds may be too long, or
needlessly short, depending on usage. Nevertheless, let's start with
60 seconds and fine-tune later.
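For illustration, the glib-level shape of that back-off could be
roughly the following. This is only a sketch: the names, the bare
accept() call and the assumption that the listening FD was registered
with g_unix_fd_add(listen_fd, G_IO_IN, on_incoming,
GINT_TO_POINTER(listen_fd)) are mine, not the actual
virNetServerService code.

#include <errno.h>
#include <sys/socket.h>
#include <glib.h>
#include <glib-unix.h>

static gboolean on_incoming(gint fd, GIOCondition cond, gpointer opaque);

/* One-shot timer callback: re-arm the watch on the listening socket. */
static gboolean
reenable_listener(gpointer opaque)
{
    gint listen_fd = GPOINTER_TO_INT(opaque);

    g_unix_fd_add(listen_fd, G_IO_IN, on_incoming, opaque);
    return G_SOURCE_REMOVE;
}

static gboolean
on_incoming(gint fd, GIOCondition cond, gpointer opaque)
{
    int client = accept(fd, NULL, NULL);

    if (client < 0 && (errno == EMFILE || errno == ENFILE)) {
        /* Stop polling the listening socket so the loop no longer
         * spins on the pending POLLIN, and retry in 60 seconds. */
        g_timeout_add_seconds(60, reenable_listener, opaque);
        return G_SOURCE_REMOVE;
    }

    if (client >= 0) {
        /* ... hand the new connection off to the server ... */
    }
    return G_SOURCE_CONTINUE;
}

Returning G_SOURCE_REMOVE from the FD callback is what actually stops
the spinning; the interval would presumably end up as a tunable rather
than a hard-coded 60.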
So are you okay with me pushing this patch and sending this ^^ as a
follow-up?

>
>
>>>> * Each event source is assigned a priority. The default priority,
>>>> * %G_PRIORITY_DEFAULT, is 0. Values less than 0 denote higher priorities.
>>>> * Values greater than 0 denote lower priorities. Events from high priority
>>>> * sources are always processed before events from lower priority sources.
>>>>
>>>> and per g_idle_add() documentation:
>>>>
>>>> * Adds a function to be called whenever there are no higher priority
>>>> * events pending to the default main loop. The function is given the
>>>> * default idle priority, %G_PRIORITY_DEFAULT_IDLE.
>>>>
>>>> Now, because we did not accept() the client we are constantly
>>>> seeing POLLIN on the main socket and thus the removal of the
>>>> client socket won't ever happen.
>>>>
>>>> The fix is to set at least the same priority as other sources,
>>>> but since we want to just close an FD, let's give it the highest
>>>> priority and call it before handling other events.
>>>>
>>>> This issue can be easily reproduced, for instance:
>>>>
>>>> # ulimit -S -n 40 (tweak this number if needed)
>>>> # ./src/libvirtd
>>>>
>>>> from another terminal:
>>>>
>>>> # for ((i=0; i<100; i++)); do virsh list & done; virsh list
>>>>
>>>> The last `virsh list` must not get stuck.
>>>
>>> The bug below describes libvirt core dumps rather than
>>> hangs. Do you know why it dumped?
>>
>> Because of glib. We called some glib function which internally
>> wanted to open a pipe, which failed. Thus glib aborted. I don't
>> remember the exact function, but I can look it up if you want.
>
> Oh, so this patch is not actually fixing the core dump scenario
> then. Even after we fix the bug that prevents clients disconnecting,
> it will still dump if this API is called at a point where we are in
> an EMFILE state?

Yes. But I am afraid there's no way around it.
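Going back to the fix itself for a moment, the glib-level shape of the
priority change is roughly this (the helper name below is mine, for
illustration only - it is not the actual function in vireventglib.c):

#include <glib.h>

/* Attach the FD-removal callback as an idle source with a raised
 * priority, instead of plain g_idle_add() which pins it at
 * G_PRIORITY_DEFAULT_IDLE. */
static guint
schedule_fd_removal(GSourceFunc removal_cb, gpointer data)
{
    GSource *source = g_idle_source_new();
    guint id;

    /* G_PRIORITY_HIGH (-100) is dispatched before the default-priority
     * FD watches, so this runs even while SockA keeps reporting POLLIN. */
    g_source_set_priority(source, G_PRIORITY_HIGH);
    g_source_set_callback(source, removal_cb, data, NULL);

    id = g_source_attach(source, NULL);  /* default main context */
    g_source_unref(source);
    return id;
}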
Found the stack trace:

Stack trace of thread 52231:
#0  0x000003ffa90be9e4 raise (libc.so.6)
#1  0x000003ffa96590ee g_log_default_handler (libglib-2.0.so.0)
#2  0x000003ffa965935e g_logv (libglib-2.0.so.0)
#3  0x000003ffa965955a g_log (libglib-2.0.so.0)
#4  0x000003ffa969dd70 g_wakeup_new (libglib-2.0.so.0)
#5  0x000003ffa964eb38 g_main_context_new (libglib-2.0.so.0)
#6  0x000003ffa9946886 vir_event_thread_init (libvirt.so.0)
#7  0x000003ffa95b5ff8 g_type_create_instance (libgobject-2.0.so.0)
#8  0x000003ffa95978cc g_object_new_internal (libgobject-2.0.so.0)
#9  0x000003ffa95990de g_object_new_with_properties (libgobject-2.0.so.0)
#10 0x000003ffa9599ab6 g_object_new (libgobject-2.0.so.0)
#11 0x000003ffa9946b78 virEventThreadNew (libvirt.so.0)
#12 0x000003ff6c2c8bd8 qemuProcessQMPNew (libvirt_driver_qemu.so)
#13 0x000003ff6c1f57b8 virQEMUCapsInitQMPSingle (libvirt_driver_qemu.so)
#14 0x000003ff6c1f59ec virQEMUCapsNewForBinaryInternal (libvirt_driver_qemu.so)
#15 0x000003ff6c1f5c92 virQEMUCapsNewData (libvirt_driver_qemu.so)
#16 0x000003ffa99510f4 virFileCacheValidate (libvirt.so.0)
#17 0x000003ffa995133c virFileCacheLookup (libvirt.so.0)
#18 0x000003ff6c1f5e4a virQEMUCapsCacheLookup (libvirt_driver_qemu.so)
#19 0x000003ff6c1f621e virQEMUCapsCacheLookupCopy (libvirt_driver_qemu.so)
#20 0x000003ff6c2c4a44 qemuProcessInit (libvirt_driver_qemu.so)
#21 0x000003ff6c2c67f2 qemuProcessStart (libvirt_driver_qemu.so)
#22 0x000003ff6c26358c qemuDomainObjStart.constprop.55 (libvirt_driver_qemu.so)
#23 0x000003ff6c2639c8 qemuDomainCreateWithFlags (libvirt_driver_qemu.so)
#24 0x000003ffa9b77902 virDomainCreate (libvirt.so.0)
#25 0x000002aa34ad2d52 remoteDispatchDomainCreateHelper (libvirtd)
#26 0x000003ffa9a6094a virNetServerProgramDispatch (libvirt.so.0)
#27 0x000003ffa9a665b6 virNetServerHandleJob (libvirt.so.0)
#28 0x000003ffa99a0d32 virThreadPoolWorker (libvirt.so.0)
#29 0x000003ffa99a02d6 virThreadHelper (libvirt.so.0)
#30 0x000003ffa85880c6 start_thread (libpthread.so.0)
#31 0x000003ffa917bd36 thread_start (libc.so.6)

So my bad, it failed opening an eventfd (in frame #4, g_wakeup_new())
rather than a pipe, but anyway, I don't think we can mitigate this. I
think there was another stack trace attached to the BZ where it did
fail in glib in pipe(). It doesn't really matter though; it's glib -
we accepted its abort-on-failure policy when we adopted it.

Michal