On Fri, Apr 10, 2015 at 09:49:43 +0200, Michal Privoznik wrote: > On 10.04.2015 00:32, Xing Lin wrote: > > Hi, > > > > It seems that I found a bug in the qemuMigrationWaitForCompletion(). It > > will do a sleep of 50 ms, even when qemuMigrationUpdateJobStatus() detects > > that a migration has completed. > > > > > > Here is the original implementation of the while loop. > > > > > > while (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { > > /* Poll every 50ms for progress & to allow cancellation */ > > struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull > > }; > > > > if (qemuMigrationUpdateJobStatus(driver, vm, job, asyncJob) == -1) > > break; > > > > /* cancel migration if disk I/O error is emitted while migrating */ > > if (abort_on_error && > > virDomainObjGetState(vm, &pauseReason) == VIR_DOMAIN_PAUSED && > > pauseReason == VIR_DOMAIN_PAUSED_IOERROR) { > > virReportError(VIR_ERR_OPERATION_FAILED, > > _("%s: %s"), job, _("failed due to I/O error")); > > break; > > } > > > > if (dconn && virConnectIsAlive(dconn) <= 0) { > > virReportError(VIR_ERR_OPERATION_FAILED, "%s", > > _("Lost connection to destination host")); > > break; > > } > > > > virObjectUnlock(vm); > > > > nanosleep(&ts, NULL); > > > > virObjectLock(vm); > > } > > > > > > Even when qemuMigrationUpdateJobStatus() detects a migration has completed, > > the nanosleep() will be called again. When the while condition is checked, > > then the program breaks from the while loop. This introduces an unnecessary > > sleep of 50 ms which can be avoided by doing a sleep first. The code should > > likely be structured as following. > > > > while (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { > > /* Poll every 50ms for progress & to allow cancellation */ > > struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull > > }; > > > > // do sleep here. > > virObjectUnlock(vm); > > > > nanosleep(&ts, NULL); > > > > virObjectLock(vm); > > Yeah, I guess it's very unlikely that migration will complete within > first 50ms after issuing the command on the monitor (in which case we > would again wait pointlessly). The other approach would be to leave the > sleep() where it is, but enclose it in if (jobInfo->type == > VIR_DOMAIN_JOB_UNBOUNDED) {}. I have no preference over these two > versions though. So ACK to the patch of yours. However, I'll postpone > merging the patch for a while and let others express their feelings. Or change to loop to while (1) and add a break before the sleep if jobInfo->type is not VIR_DOMAIN_JOB_UNBOUNDED. Personally, I don't mind either way. The ultimate solution is to avoid any sleeps by waiting for an event. However, Juan did not implement the even in QEMU yet. Once he does that, I will make patches for libvirt that will use that instead of polling every 50ms. Jirka -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list