On Thu, Sep 01, 2016 at 02:11:34PM -0500, Eric Blake wrote: > On 09/01/2016 08:57 AM, Kashyap Chamarthy wrote: > > So, I'm trying to understand how libvirt reports the "cur" and "end" > > values. I've read the virDomainBlockJobInfo() struct, it wasn't crystal > > clear. It states: > > > > /* > > * The following fields provide an indication of block job progress. @cur > > * indicates the current position and will be between 0 and @end. @end is > > * the final cursor position for this operation and represents completion. > > * To approximate progress, divide @cur by @end. > > */ > > Libvirt is (more or less) reporting numbers from qemu. What's more, > @end need not be the same between calls; qemu is free to change the end > value as it comes up with more work to do (or sees that less work > remains than initially estimated), all that REALLY matters is that the > ratio between the two numbers converges, and is < 1 while busy, and == 1 > when complete. Except that there are cases in qemu where a block job > really has 0 work to do. > > Prior to qemu exposing the "ready":true/false flag, libvirt had to guess > whether equal numbers really meant done, or if it merely meant nearly > done. But now that we have "ready", I think the sanest course of action > for libvirt is to fudge the numbers from qemu. After all, we've already > documented (both in libvirt and in qemu) that @end is not fixed, so much > as a moving target (it just doesn't move very much on operations where > we have a good grasp on how much work remains from the start, like a > deep copy; but is more prone to move on operations like live commit that > are influenced by how active the guest is at writing data that we are > attempting to commit at the same time). [...] > > [kashyap] > > > > So, if the job hasn't started yet, what should libvirt report? > > > > [mprivozn] > > > > That's the question. We can't change the [virDomainBlockJobInfo] > > struct (otherwise we won't be ABI compatible), so we can't really > > add a bolean there 'bool job_started'. > > > > Or we can introduce new "job type" which wouldn't really be a job > > type, but we will fill status.job with it to say explicitly job > > hasn't started yet > > > > > > So, any other thoughts here, on how to proceed here? > > My preference would be: > > If qemu doesn't report anything (because the job is not started yet), > then libvirt should report cur=0, end=1 (the job still has 100% to go). > > If qemu reports 0/0 and "done":false, then libvirt should report cur=0, > end=1 (that is, we fudge the end to be larger, because the job is not > done yet). > > If qemu reports 0/0 and "done":true (because the job was really a > no-op), then libvirt should report cur=1, end=1 (the job is 100% complete). > > If qemu reports 0/0 and lacks "done" (older qemu), then libvirt just has > to guess. I'm not sure which guess is most appropriate; maybe libvirt > itself will have to set up a timer and report 0/1 the first time, and > only report 1/1 after a minimum time has elapsed, to make sure qemu has > had a chance to do something about the job. Or maybe we don't worry > about it, and just have libvirt report 0/0 because we really don't know > any better. > > If qemu reports a/b, where a < b and b > 0, use those numbers as is. We > don't even have to check "done". > > If qemu reports a/a, where a > 0, then also check "done". If > "done":false is present, report a-1/a (the job is not quite done); if > "done" is absent or "done":true is present, report a/a (the job is done). [...] Excellent, thanks for the detailed response, Eric. That clarifies a lot. I've filed this bug, and captured the discussion from this thread. https://bugzilla.redhat.com/show_bug.cgi?id=1372613 -- Improve live block device job status reporting via virDomainBlockJobInfo() * * * Michal has pointed to these two fixes related to the above bug, in his branch: https://github.com/zippy2/libvirt/commit/47dcb46 -- qemuDomainGetBlockJobInfo: Move info translation into separate func https://github.com/zippy2/libvirt/commit/25557dd -- virDomainGetBlockJobInfo: Fix corner case when qemu reports no info I don't have a proper reproducer to test the corner case. But I just tested the regular "deep copy" test case, with the above fixes $ git describe v2.1.0-231-g25557dd $ sudo ./run tools/virsh start cvm1 $ sudo ./run tools/virsh dumpxml \ --inactive cvm1 > /var/tmp/cvm1.xml $ sudo ./run tools/virsh undefine cvm1 Domain cvm1 has been undefined $ sudo ./run tools/virsh blockcopy cvm1 \ vda /export/copy-cirrvm1.qcow2 --wait --verbose Block Copy: [100 %] Now in mirroring phase $ sudo ./run tools/virsh blockjob cvm1 vda --raw type=Block Copy bandwidth=0 cur=41126400 end=41126400 $ sudo ./run tools/virsh blockjob cvm1 vda --abort -- /kashyap -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list