On 10/08/2013 10:31 AM, Michal Privoznik wrote: > I've noticed a SIGSEGV-ing libvirtd on the destination when the qemu > died too quickly = in Prepare phase. What is happening here is: > > 1) [Thread 3493] We are in qemuMigrationPrepareAny() and calling > qemuProcessStart() which subsequently calls qemuProcessWaitForMonitor() > and qemuConnectMonitor(). So far so good. The qemuMonitorOpen() > succeeds, however switching monitor to QMP mode fails as qemu died > meanwhile. That is qemuMonitorSetCapabilities() returns -1. > > 2013-10-08 15:54:10.629+0000: 3493: debug : qemuMonitorSetCapabilities:1356 : mon=0x14a53da0 > 2013-10-08 15:54:10.630+0000: 3493: debug : qemuMonitorJSONCommandWithFd:262 : Send command '{"execute":"qmp_capabilities","id":"libvirt-1"}' for write with FD -1 > 2013-10-08 15:54:10.630+0000: 3493: debug : virEventPollUpdateHandle:147 : EVENT_POLL_UPDATE_HANDLE: watch=17 events=13 > ... > 2013-10-08 15:54:10.631+0000: 3493: debug : qemuMonitorSend:956 : QEMU_MONITOR_SEND_MSG: mon=0x14a53da0 msg={"execute":"qmp_capabilities","id":"libvirt-1"} > fd=-1 > 2013-10-08 15:54:10.631+0000: 3262: debug : virEventPollRunOnce:641 : Poll got 1 event(s) > > 2) [Thread 3262] The event loop is trying to do the talking to monitor. > However, qemu is dead already, remember? > > 2013-10-08 15:54:13.436+0000: 3262: error : qemuMonitorIORead:551 : Unable to read from monitor: Connection reset by peer > 2013-10-08 15:54:13.516+0000: 3262: debug : virFileClose:90 : Closed fd 25 > ... > 2013-10-08 15:54:13.533+0000: 3493: debug : qemuMonitorSend:968 : Send command resulted in error internal error: early end of file from monitor: possible problem: > > 3) [Thread 3493] qemuProcessStart() failed. No big deal. Go to the > 'endjob' label and subsequently to the 'cleanup'. Since the domain is > not persistent and ret is -1, the qemuDomainRemoveInactive() is called. > This has an (unpleasant) effect of virObjectUnref()-in the @vm object. > Unpleasant because the event loop which is about to trigger EOF callback > still holds a pointer to the @vm (not the reference). See the valgrind > output below. > > 4) [Thread 3262] So the even loop starts triggering EOF: s/even/event/ > > 2013-10-08 15:54:13.542+0000: 3262: debug : qemuMonitorIO:729 : Triggering EOF callback > 2013-10-08 15:54:13.543+0000: 3262: debug : qemuProcessHandleMonitorEOF:294 : Received EOF on 0x14549110 'migt10' > > And the monitor is cleaned up. This results in calling > qemuProcessHandleMonitorEOF with the @vm pointer passed. The pointer is > kept in qemuMonitor struct. > > ==3262== Thread 1: > ==3262== Invalid read of size 4 > > The mon->vm is set in qemuMonitorOpenInternal() which is the correct > place to increase @vm ref counter. The correct place to decrease the ref > counter is then qemuMonitorDispose(). However, in order to avoid cyclic > calling of dispose functions, we must use hack, and set priv->mon to > NULL. The qemuDomainObjPrivateFree won't call us again then. This > describes the last two chunks of the commit. The others are there for > handling virObjectRef() and virObjectUnref() correctly. So far it > doesn't work well with NULL virClassPtr :) > > Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx> > --- > > Yet another one of those patches, where chasing it down took a half of a day > and description what went wrong is longer than a few lines of fix. Yeah, we've had a few of those lately. > > src/qemu/qemu_capabilities.c | 14 ++++++++++---- > src/qemu/qemu_monitor.c | 12 ++++++++++++ > 2 files changed, 22 insertions(+), 4 deletions(-) > @@ -781,6 +792,7 @@ qemuMonitorOpenInternal(virDomainObjPtr vm, > } > mon->fd = fd; > mon->hasSendFD = hasSendFD; > + virObjectRef(vm); > mon->vm = vm; Could be written: mon->vm = virObjectRef(vm); Overall, looks fine to me, but I'd feel more comfortable if you also got Dan to review it before pushing. ACK. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list