On 22.12.2017 17:13, John Ferlan wrote:
> [...]
>
>>>
>>> Still adding the "virHashRemoveAll(dmn->servers);" into
>>> virNetDaemonClose doesn't help the situation as I can still either crash
>>> randomly or hang, so I'm less convinced this would really fix anything.
>>> It does change the "nature" of the hung thread stack trace though, as
>>> the second thread is now:
>>
>> virHashRemoveAll is not enough now. Due to unref reordering, the last ref to @srv is
>> unrefed after virStateCleanup. So we need to virObjectUnref(srv|srvAdm) before
>> virStateCleanup. Or we can call virThreadPoolFree from virNetServerClose
>> (as in the first version of the patch and as Erik suggests) instead
>> of virHashRemoveAll.
>>
>
> Patches w/
>
> 1. Long pause before GetAllStats (without using [u]sleep)
> 2. Adjustment to call virNetServerServiceToggle in
>    virNetServerServiceClose (instead of virNetServerDispose)
> 3. Call virHashRemoveAll in virNetDaemonClose
> 4. Call virThreadPoolFree in virNetServerClose
> 5. Perform Unref (adminProgram, srvAdm, qemuProgram, lxcProgram,
>    remoteProgram, and srv) before virNetDaemonClose
>
> Still has the virCondWait's - so as Daniel points out there's quite a
> bit more work to be done. Like most Red Hat engineers - I will not be
> very active over the next week or so (until the New Year) as it's a
> holiday break/vacation for us.
>
> So unless you have the burning desire to put together some patches and
> do the work yourself, more thoughts/work will need to wait.
>
> John
>

I've checked what happens after applying the patches you described above
(although applying only 3 (or 4) and part of 5, plus the pause hunk, would
have been enough). I get hangs too, and hangs of this kind are fixed by the
second series, '[PATCH 0/4] libvirtd: fix hang on termination in qemu driver'.

That is, besides the hang in the thread freeing the thread pool that you
already mentioned, there is another hang with the following backtrace:

#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007ffff7335c58 in virCondWait (c=0x7fffc4000e18, m=0x7fffc4000df0)
    at util/virthread.c:154
#2  0x00007fffd9605983 in qemuMonitorSend (mon=0x7fffc4000de0, msg=0x7fffe70bd1f0)
    at qemu/qemu_monitor.c:1067
#3  0x00007fffd961b68f in qemuMonitorJSONCommandWithFd (mon=0x7fffc4000de0, cmd=0x7fffb0005310, scm_fd=-1, reply=0x7fffe70bd2d0)
    at qemu/qemu_monitor_json.c:300
#4  0x00007fffd961b7c1 in qemuMonitorJSONCommand (mon=0x7fffc4000de0, cmd=0x7fffb0005310, reply=0x7fffe70bd2d0)
    at qemu/qemu_monitor_json.c:330
#5  0x00007fffd9629f0b in qemuMonitorJSONGetObjectListPaths (mon=0x7fffc4000de0, path=0x7fffd96a7c96 "/machine/peripheral", paths=0x7fffe70bd380)
    at qemu/qemu_monitor_json.c:5715
#6  0x00007fffd962dcc4 in qemuMonitorJSONFindObjectPathByAlias (mon=0x7fffc4000de0, name=0x7fffd969f3cd "virtio-balloon-pci", alias=0x7fffcc1e8d30 "balloon0", path=0x7fffe70bd450)
    at qemu/qemu_monitor_json.c:7235
#7  0x00007fffd962e231 in qemuMonitorJSONFindLinkPath (mon=0x7fffc4000de0, name=0x7fffd969f3cd "virtio-balloon-pci", alias=0x7fffcc1e8d30 "balloon0", path=0x7fffe70bd450)
    at qemu/qemu_monitor_json.c:7349
#8  0x00007fffd9605bf7 in qemuMonitorInitBalloonObjectPath (mon=0x7fffc4000de0, balloon=0x7fffcc1e8e60)
    at qemu/qemu_monitor.c:1157
#9  0x00007fffd96098d3 in qemuMonitorGetMemoryStats (mon=0x7fffc4000de0, balloon=0x7fffcc1e8e60, stats=0x7fffe70bd5b0, nr_stats=10)
    at qemu/qemu_monitor.c:2133
#10 0x00007fffd964e70c in qemuDomainMemoryStatsInternal (driver=0x7fffcc1872a0, vm=0x7fffcc2737e0, stats=0x7fffe70bd5b0, nr_stats=10)
    at qemu/qemu_driver.c:11453
#11 0x00007fffd9667013 in qemuDomainGetStatsBalloon (driver=0x7fffcc1872a0, dom=0x7fffcc2737e0, record=0x7fffb00008c0, maxparams=0x7fffe70bd6b0, privflags=1)
    at qemu/qemu_driver.c:19478
#12 0x00007fffd9669597 in qemuDomainGetStats (conn=0x7fffb80030e0, dom=0x7fffcc2737e0, stats=127, record=0x7fffe70bd790, flags=1)
    at qemu/qemu_driver.c:20133
#13 0x00007fffd966997f in qemuConnectGetAllDomainStats (conn=0x7fffb80030e0, doms=0x7fffb0005220, ndoms=1, stats=127, retStats=0x7fffe70bd8e0, flags=0)
    at qemu/qemu_driver.c:20226
#14 0x00007ffff7424fd7 in virDomainListGetStats (doms=0x7fffb0005220, stats=0, retStats=0x7fffe70bd8e0, flags=0)
    at libvirt-domain.c:11595
#15 0x00005555555ac030 in remoteDispatchConnectGetAllDomainStats (server=0x55555612a3a0, client=0x555556151d10, msg=0x555556152540, rerr=0x7fffe70bda20, args=0x7fffb00036e0, ret=0x7fffb0002d20)
    at remote.c:6538

I'm writing this only to document my research, not to pull you back into
the work, and I don't expect a reply. It's the holidays :)

Nikolay
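
For illustration only, the wait in frame #1 of the backtrace above can be modelled with a small standalone sketch. This is not libvirt code: struct monitor, monitor_send, monitor_close and run_demo are made-up names, and I am not claiming this is how the second series fixes the hang. It only shows the generic pattern, assuming plain pthreads: a thread blocked in a condition wait for a reply stays blocked forever unless the shutdown path sets a flag under the lock and wakes it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a monitor object; not the libvirt struct. */
struct monitor {
    pthread_mutex_t lock;
    pthread_cond_t notify;
    bool reply_ready;   /* set when the peer has answered */
    bool closed;        /* set when the daemon shuts the monitor down */
};

/* Models the waiter: block until a reply arrives or the monitor is closed.
 * Returns 0 if a reply arrived, -1 if the monitor was closed first. */
static int monitor_send(struct monitor *mon)
{
    int ret;

    assert(mon != NULL);
    pthread_mutex_lock(&mon->lock);
    /* Checking 'closed' in the predicate is what lets shutdown proceed;
     * without it, this loop waits forever when no reply ever comes. */
    while (!mon->reply_ready && !mon->closed)
        pthread_cond_wait(&mon->notify, &mon->lock);
    ret = mon->reply_ready ? 0 : -1;
    pthread_mutex_unlock(&mon->lock);
    return ret;
}

/* Models the shutdown path: mark the monitor closed and wake all waiters. */
static void monitor_close(struct monitor *mon)
{
    assert(mon != NULL);
    pthread_mutex_lock(&mon->lock);
    mon->closed = true;
    pthread_cond_broadcast(&mon->notify);
    pthread_mutex_unlock(&mon->lock);
}

/* Worker thread body: passes the result of monitor_send back to the joiner. */
static void *worker(void *opaque)
{
    return (void *)(intptr_t)monitor_send(opaque);
}

/* Starts one waiter, closes the monitor, and returns the waiter's result.
 * Returns -2 if the thread could not be created. */
int run_demo(void)
{
    struct monitor mon = { .reply_ready = false, .closed = false };
    pthread_t thr;
    void *res = NULL;

    pthread_mutex_init(&mon.lock, NULL);
    pthread_cond_init(&mon.notify, NULL);

    if (pthread_create(&thr, NULL, worker, &mon) != 0)
        return -2;

    monitor_close(&mon);        /* no reply will ever arrive */
    pthread_join(thr, &res);    /* returns because the waiter was woken */

    pthread_cond_destroy(&mon.notify);
    pthread_mutex_destroy(&mon.lock);
    return (int)(intptr_t)res;
}
```

Calling run_demo() from a main() returns once the waiter has been released. Because the flag is tested under the lock, it does not matter whether the close happens before or after the worker reaches pthread_cond_wait.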