qemuMigrationCookieAddNBD is usually called from within an async MIGRATION_OUT or MIGRATION_IN job, so it needs to start a nested job. (The one exception is during the Begin phase when change protection isn't enabled, but qemuDomainObjEnterMonitorAsync will behave the same as qemuDomainObjEnterMonitor in this case.) This bug was encountered with a libvirt client that repeatedly queries the disk mirroring block job info during a migration. If one of these queries occurs just as the Perform migration cookie is baked, libvirt crashes. Relevant logs are as follows: 6701: warning : qemuDomainObjEnterMonitorInternal:1544 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous [1] 6701: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block","id":"libvirt-629"} [2] 6699: info : qemuMonitorIOWrite:503 : QEMU_MONITOR_IO_WRITE: mon=0x7fefdc004700 buf={"execute":"query-block","id":"libvirt-629"} [3] 6704: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block-jobs","id":"libvirt-630"} [4] 6699: info : qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_REPLY: mon=0x7fefdc004700 reply={"return": [...], "id": "libvirt-629"} 6699: error : qemuMonitorJSONIOProcessLine:211 : internal error: Unexpected JSON reply '{"return": [...], "id": "libvirt-629"}' At [1] qemuMonitorBlockStatsUpdateCapacity sends its request, then waits on mon->notify. At [2] the request is written out to the monitor socket. At [3] qemuMonitorBlockJobInfo sends its request, and also waits on mon->notify. The reply from the first request is received at [4]. However, qemuMonitorJSONIOProcessLine is not expecting this reply since the second request hadn't completed sending. The reply is dropped and an error is returned. qemuMonitorIO signals mon->notify twice during its error handling, waking up both of the threads waiting on it. One of them clears mon->msg as it exits qemuMonitorSend; the other crashes: qemuMonitorSend (mon=0x7fefdc004700, msg=<value optimized out>) at qemu/qemu_monitor.c:975 975 while (!mon->msg->finished) { (gdb) print mon->msg $1 = (qemuMonitorMessagePtr) 0x0 Signed-off-by: Michael Chapman <mike@xxxxxxxxxxxxxxxxx> --- src/qemu/qemu_migration.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 5607d1a..d770aea 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -571,7 +571,9 @@ qemuMigrationCookieAddNBD(qemuMigrationCookiePtr mig, if (!(stats = virHashCreate(10, virHashValueFree))) goto cleanup; - qemuDomainObjEnterMonitor(driver, vm); + if (qemuDomainObjEnterMonitorAsync(driver, vm, + priv->job.asyncJob) < 0) + goto cleanup; rc = qemuMonitorBlockStatsUpdateCapacity(priv->mon, stats, false); if (qemuDomainObjExitMonitor(driver, vm) < 0) goto cleanup; -- 2.1.0 -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list