On 08.04.2015 08:51, Michael Chapman wrote:
> qemuMigrationCookieAddNBD is usually called from within an async
> MIGRATION_OUT or MIGRATION_IN job, so it needs to start a nested job.
>
> (The one exception is during the Begin phase when change protection
> isn't enabled, but qemuDomainObjEnterMonitorAsync will behave the same
> as qemuDomainObjEnterMonitor in this case.)
>
> This bug was encountered with a libvirt client that repeatedly queries
> the disk mirroring block job info during a migration. If one of these
> queries occurs just as the Perform migration cookie is baked, libvirt
> crashes.
>
> Relevant logs are as follows:
>
>     6701: warning : qemuDomainObjEnterMonitorInternal:1544 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
> [1] 6701: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block","id":"libvirt-629"}
> [2] 6699: info : qemuMonitorIOWrite:503 : QEMU_MONITOR_IO_WRITE: mon=0x7fefdc004700 buf={"execute":"query-block","id":"libvirt-629"}
> [3] 6704: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block-jobs","id":"libvirt-630"}
> [4] 6699: info : qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_REPLY: mon=0x7fefdc004700 reply={"return": [...], "id": "libvirt-629"}
>     6699: error : qemuMonitorJSONIOProcessLine:211 : internal error: Unexpected JSON reply '{"return": [...], "id": "libvirt-629"}'
>
> At [1] qemuMonitorBlockStatsUpdateCapacity sends its request, then waits
> on mon->notify. At [2] the request is written out to the monitor socket.
> At [3] qemuMonitorBlockJobInfo sends its request, and also waits on
> mon->notify. The reply from the first request is received at [4].
> However, qemuMonitorJSONIOProcessLine is not expecting this reply since
> the second request hadn't completed sending. The reply is dropped and an
> error is returned.
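
The sequence at [1] to [4] can be replayed with a small toy model. This is only a sketch, not libvirt code: ToyMonitor, ToyMessage and all toy_* functions are hypothetical names, and the single-slot behaviour (one pending message per monitor, a reply accepted only once that message has been fully written) is an assumption inferred from the description above. It runs single-threaded, so it shows the ordering problem without the condition-variable wakeups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real libvirt structures. */
typedef struct {
    const char *id;   /* request id, e.g. "libvirt-629" */
    bool sent;        /* request fully written to the socket */
    bool finished;    /* a reply has been matched to this request */
} ToyMessage;

typedef struct {
    ToyMessage *msg;  /* the single in-flight request slot (assumption) */
} ToyMonitor;

/* Queue a request (steps [1] and [3]). With only one slot, a second
 * sender replaces whatever was pending. */
static void toy_send(ToyMonitor *mon, ToyMessage *msg)
{
    assert(mon != NULL && msg != NULL);
    mon->msg = msg;
}

/* Write the pending request out to the socket (step [2]). */
static void toy_io_write(ToyMonitor *mon)
{
    if (mon->msg != NULL)
        mon->msg->sent = true;
}

/* Handle an incoming reply (step [4]). Returns 0 if a fully sent request
 * was waiting for it, -1 for an "unexpected reply". */
static int toy_process_reply(ToyMonitor *mon)
{
    if (mon->msg == NULL || !mon->msg->sent)
        return -1;
    mon->msg->finished = true;
    mon->msg = NULL;
    return 0;
}

/* Interleaved ordering from the log: the second request lands in the
 * slot before the reply to the first one arrives. */
int toy_replay_race(void)
{
    ToyMonitor mon = { NULL };
    ToyMessage first = { "libvirt-629", false, false };
    ToyMessage second = { "libvirt-630", false, false };

    toy_send(&mon, &first);          /* [1] */
    toy_io_write(&mon);              /* [2] */
    toy_send(&mon, &second);         /* [3] slot now holds an unsent request */
    return toy_process_reply(&mon);  /* [4] reply has no recipient */
}

/* Serialized ordering: each request completes before the next is queued. */
int toy_replay_serialized(void)
{
    ToyMonitor mon = { NULL };
    ToyMessage first = { "libvirt-629", false, false };
    ToyMessage second = { "libvirt-630", false, false };

    toy_send(&mon, &first);
    toy_io_write(&mon);
    if (toy_process_reply(&mon) < 0)
        return -1;

    toy_send(&mon, &second);
    toy_io_write(&mon);
    if (toy_process_reply(&mon) < 0)
        return -1;

    return (first.finished && second.finished) ? 0 : -1;
}
```

In this model toy_replay_race() returns -1 and toy_replay_serialized() returns 0. The serialized ordering is what, as far as the commit message goes, starting a nested job is meant to guarantee for the async job owner.
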
>
> qemuMonitorIO signals mon->notify twice during its error handling,
> waking up both of the threads waiting on it. One of them clears mon->msg
> as it exits qemuMonitorSend; the other crashes:
>
>   qemuMonitorSend (mon=0x7fefdc004700, msg=<value optimized out>) at qemu/qemu_monitor.c:975
>   975         while (!mon->msg->finished) {
>   (gdb) print mon->msg
>   $1 = (qemuMonitorMessagePtr) 0x0
>
> Signed-off-by: Michael Chapman <mike@xxxxxxxxxxxxxxxxx>
> ---
>  src/qemu/qemu_migration.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>

ACKed and pushed.

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list