There were multiple race conditions that could lead to segmentation faults. The first precondition for this is qemuProcessLaunch must fail sometime shortly after starting the new QEMU process. The second precondition for the segmentation faults is that the new QEMU process dies - or to be more precise the QEMU monitor has to be closed irregularly. If both happens during qemuProcessStart (starting a domain) there are race windows between the thread with the event loop (T1) and the thread that is starting the domain (T2). First segmentation fault scenario: If qemuProcessLaunch fails during qemuProcessStart the code branches to the 'stop' path where 'qemuMonitorSetDomainLog(priv->mon, NULL, NULL, NULL)' will set the log function of the monitor to NULL (done in T2). In the meantime the event loop of T1 will wake up with an EOF event for the QEMU monitor because the QEMU process has died. The crash occurs if T1 has checked 'mon->logFunc != NULL' in qemuMonitorIO just before the logFunc was set to NULL by T2. If this situation occurs T1 will try to call mon->logFunc which leads to the segmentation fault. Solution: Require the monitor lock for setting the log function. Backtrace: 0 0x0000000000000000 in ?? () 1 0x000003ffe9e45316 in qemuMonitorIO (watch=<optimized out>, fd=<optimized out>, events=<optimized out>, opaque=0x3ffe08aa860) at ../../src/qemu/qemu_monitor.c:727 2 0x000003fffda2e1a4 in virEventPollDispatchHandles (nfds=<optimized out>, fds=0x2aa000fd980) at ../../src/util/vireventpoll.c:508 3 0x000003fffda2e398 in virEventPollRunOnce () at ../../src/util/vireventpoll.c:657 4 0x000003fffda2ca10 in virEventRunDefaultImpl () at ../../src/util/virevent.c:314 5 0x000003fffdba9366 in virNetDaemonRun (dmn=0x2aa000cc550) at ../../src/rpc/virnetdaemon.c:818 6 0x000002aa00024668 in main (argc=<optimized out>, argv=<optimized out>) at ../../daemon/libvirtd.c:1541 Second segmentation fault scenario: If qemuProcessLaunch fails it will unref the log context and with invoking qemuMonitorSetDomainLog(priv->mon, NULL, NULL, NULL) qemuDomainLogContextFree() will be invoked. qemuDomainLogContextFree() invokes virNetClientClose() to close the client and cleans everything up (including unref of _virLogManager.client) when virNetClientClose() returns. When T1 is now trying to report 'qemu unexpectedly closed the monitor' libvirtd will crash because the client has already been freed. Solution: As the critical section in qemuMonitorIO is protected with the monitor lock we can use the same solution as proposed for the first segmentation fault. Backtrace: 0 virClassIsDerivedFrom (klass=0x3100979797979797, parent=0x2aa000d92f0) at ../../src/util/virobject.c:169 1 0x000003fffda659e6 in virObjectIsClass (anyobj=<optimized out>, klass=<optimized out>) at ../../src/util/virobject.c:365 2 0x000003fffda65a24 in virObjectLock (anyobj=0x3ffe08c1db0) at ../../src/util/virobject.c:317 3 0x000003fffdba4688 in virNetClientIOEventLoop (client=client@entry=0x3ffe08c1db0, thiscall=thiscall@entry=0x2aa000fbfa0) at ../../src/rpc/virnetclient.c:1668 4 0x000003fffdba4b4c in virNetClientIO (client=client@entry=0x3ffe08c1db0, thiscall=0x2aa000fbfa0) at ../../src/rpc/virnetclient.c:1944 5 0x000003fffdba4d42 in virNetClientSendInternal (client=client@entry=0x3ffe08c1db0, msg=msg@entry=0x2aa000cc710, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../src/rpc/virnetclient.c:2116 6 0x000003fffdba6268 in virNetClientSendWithReply (client=0x3ffe08c1db0, msg=0x2aa000cc710) at ../../src/rpc/virnetclient.c:2144 7 0x000003fffdba6e8e in virNetClientProgramCall (prog=0x3ffe08c1120, client=<optimized out>, serial=<optimized out>, proc=<optimized out>, noutfds=<optimized out>, outfds=0x0, ninfds=0x0, infds=0x0, args_filter=0x3fffdb64440 <xdr_virLogManagerProtocolDomainReadLogFileArgs>, args=0x3ffffffe010, ret_filter=0x3fffdb644c0 <xdr_virLogManagerProtocolDomainReadLogFileRet>, ret=0x3ffffffe008) at ../../src/rpc/virnetclientprogram.c:329 8 0x000003fffdb64042 in virLogManagerDomainReadLogFile (mgr=<optimized out>, path=<optimized out>, inode=<optimized out>, offset=<optimized out>, maxlen=<optimized out>, flags=0) at ../../src/logging/log_manager.c:272 9 0x000003ffe9e0315c in qemuDomainLogContextRead (ctxt=0x3ffe08c2980, msg=0x3ffffffe1c0) at ../../src/qemu/qemu_domain.c:4422 10 0x000003ffe9e280a8 in qemuProcessReadLog (logCtxt=<optimized out>, msg=msg@entry=0x3ffffffe288) at ../../src/qemu/qemu_process.c:1800 11 0x000003ffe9e28206 in qemuProcessReportLogError (logCtxt=<optimized out>, msgprefix=0x3ffe9ec276a "qemu unexpectedly closed the monitor") at ../../src/qemu/qemu_process.c:1836 12 0x000003ffe9e28306 in qemuProcessMonitorReportLogError (mon=mon@entry=0x3ffe085cf10, msg=<optimized out>, opaque=<optimized out>) at ../../src/qemu/qemu_process.c:1856 13 0x000003ffe9e452b6 in qemuMonitorIO (watch=<optimized out>, fd=<optimized out>, events=<optimized out>, opaque=0x3ffe085cf10) at ../../src/qemu/qemu_monitor.c:726 14 0x000003fffda2e1a4 in virEventPollDispatchHandles (nfds=<optimized out>, fds=0x2aa000fd980) at ../../src/util/vireventpoll.c:508 15 0x000003fffda2e398 in virEventPollRunOnce () at ../../src/util/vireventpoll.c:657 16 0x000003fffda2ca10 in virEventRunDefaultImpl () at ../../src/util/virevent.c:314 17 0x000003fffdba9366 in virNetDaemonRun (dmn=0x2aa000cc550) at ../../src/rpc/virnetdaemon.c:818 18 0x000002aa00024668 in main (argc=<optimized out>, argv=<optimized out>) at ../../daemon/libvirtd.c:1541 Other code parts where the same problem was possible to occur are fixed as well (qemuMigrationFinish, qemuProcessStart, and qemuDomainSaveImageStartVM). Signed-off-by: Marc Hartmayer <mhartmay@xxxxxxxxxxxxxxxxxx> Reported-by: Sascha Silbe <silbe@xxxxxxxxxxxxxxxxxx> --- src/qemu/qemu_monitor.c | 44 ++++++++++++++++++++++++++++++++++---------- src/qemu/qemu_monitor.h | 4 ++++ 2 files changed, 38 insertions(+), 10 deletions(-) diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c index a4fa6ec..b41aaed 100644 --- a/src/qemu/qemu_monitor.c +++ b/src/qemu/qemu_monitor.c @@ -963,7 +963,7 @@ qemuMonitorClose(qemuMonitorPtr mon) PROBE(QEMU_MONITOR_CLOSE, "mon=%p refs=%d", mon, mon->parent.parent.u.s.refs); - qemuMonitorSetDomainLog(mon, NULL, NULL, NULL); + qemuMonitorSetDomainLogLocked(mon, NULL, NULL, NULL); if (mon->fd >= 0) { qemuMonitorUnregister(mon); @@ -4035,20 +4035,21 @@ qemuMonitorGetDeviceAliases(qemuMonitorPtr mon, /** - * qemuMonitorSetDomainLog: - * Set the file descriptor of the open VM log file to report potential - * early startup errors of qemu. - * - * @mon: Monitor object to set the log file reading on + * qemuMonitorSetDomainLogLocked: + * @mon: Locked monitor object to set the log file reading on * @func: the callback to report errors * @opaque: data to pass to @func * @destroy: optional callback to free @opaque + * + * Set the file descriptor of the open VM log file to report potential + * early startup errors of qemu. This function requires @mon to be + * locked already! */ void -qemuMonitorSetDomainLog(qemuMonitorPtr mon, - qemuMonitorReportDomainLogError func, - void *opaque, - virFreeCallback destroy) +qemuMonitorSetDomainLogLocked(qemuMonitorPtr mon, + qemuMonitorReportDomainLogError func, + void *opaque, + virFreeCallback destroy) { if (mon->logDestroy && mon->logOpaque) mon->logDestroy(mon->logOpaque); @@ -4060,6 +4061,29 @@ qemuMonitorSetDomainLog(qemuMonitorPtr mon, /** + * qemuMonitorSetDomainLog: + * @mon: Unlocked monitor object to set the log file reading on + * @func: the callback to report errors + * @opaque: data to pass to @func + * @destroy: optional callback to free @opaque + * + * Set the file descriptor of the open VM log file to report potential + * early startup errors of qemu. This functions requires @mon to be + * unlocked. + */ +void +qemuMonitorSetDomainLog(qemuMonitorPtr mon, + qemuMonitorReportDomainLogError func, + void *opaque, + virFreeCallback destroy) +{ + virObjectLock(mon); + qemuMonitorSetDomainLogLocked(mon, func, opaque, destroy); + virObjectUnlock(mon); +} + + +/** * qemuMonitorJSONGetGuestCPU: * @mon: Pointer to the monitor * @arch: arch of the guest diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h index 91ab905..2e42d16 100644 --- a/src/qemu/qemu_monitor.h +++ b/src/qemu/qemu_monitor.h @@ -1067,6 +1067,10 @@ int qemuMonitorGetDeviceAliases(qemuMonitorPtr mon, typedef void (*qemuMonitorReportDomainLogError)(qemuMonitorPtr mon, const char *msg, void *opaque); +void qemuMonitorSetDomainLogLocked(qemuMonitorPtr mon, + qemuMonitorReportDomainLogError func, + void *opaque, + virFreeCallback destroy); void qemuMonitorSetDomainLog(qemuMonitorPtr mon, qemuMonitorReportDomainLogError func, void *opaque, -- 2.5.5 -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list