Re: [PATCH] blockjob: add virsh blockjob --wait

Eric Blake <eblake@xxxxxxxxxx> · Fri, 13 Apr 2012 10:58:58 -0600

On 04/13/2012 01:29 AM, Michal Privoznik wrote:
> On 12.04.2012 21:59, Eric Blake wrote:
>> I'm tired of shell-scripting to wait for completion of a block pull,
>> when virsh can be taught to do the same.  I couldn't quite reuse
>> vshWatchJob, as this is not a case of a long-running command where
>> a second thread must be used to probe job status (at least, not unless
>> I make virsh start doing blocking waits for an event to fire), but it
>> served as inspiration for my simpler single-threaded loop.  There is
>> up to a half-second delay between sending SIGINT and the job being
>> aborted, but I didn't think it worth the complexity of a second thread
>> and use of poll() just to minimize that delay.
>>
>> @@ -7295,8 +7295,9 @@ repoll:
>>          }
>>
>>          GETTIMEOFDAY(&curr);
>> -        if ( timeout && ((int)(curr.tv_sec - start.tv_sec)  * 1000 + \
>> -                         (int)(curr.tv_usec - start.tv_usec) / 1000) > timeout * 1000 ) {
>> +        if (timeout && (((int)(curr.tv_sec - start.tv_sec)  * 1000 +
>> +                         (int)(curr.tv_usec - start.tv_usec) / 1000) >
>> +                        timeout * 1000)) {
>>              /* suspend the domain when migration timeouts. */
>>              vshDebug(ctl, VSH_ERR_DEBUG, "%s timeout", label);
>>              if (timeout_func)
> 
> These two chunks are rather cosmetic but I'm okay with having them in
> this patch not a separate one.

I noticed them because I was copying and pasting from them.  Depending
on whether I spot other cleanups for my v2, I may split the cleanups
into a separate patch.

>> +    while (blocking) {
>> +        virDomainBlockJobInfo info;
>> +        int result = virDomainGetBlockJobInfo(dom, path, &info, 0);
>> +
>> +        if (result < 0) {
>> +            vshError(ctl, _("failed to query job for disk %s"), path);
>> +            goto cleanup;
>> +        }
>> +        if (result == 0)
>> +            break;
>> +
>> +        if (verbose)
>> +            print_job_progress(_("Block Pull"), info.end - info.cur, info.end);
>> +
>> +        GETTIMEOFDAY(&curr);
>> +        if (intCaught || (timeout &&
>> +                          (((int)(curr.tv_sec - start.tv_sec)  * 1000 +
>> +                            (int)(curr.tv_usec - start.tv_usec) / 1000) >
>> +                           timeout * 1000))) {
>> +            vshDebug(ctl, VSH_ERR_DEBUG,
>> +                     intCaught ? "interrupted" : "timeout");
>> +            intCaught = 0;
>> +            timeout = 0;
>> +            quit = true;
>> +            if (virDomainBlockJobAbort(dom, path, 0) < 0) {
>> +                vshError(ctl, _("failed to abort job for disk %s"), path);
>> +                goto cleanup;
>> +            }
> 
> Don't we need blocking = false; here? Otherwise we may spin another
> rounds and issue GetBlockJobInfo API over and over again. But on the
> other hand, I can imagine this is desired behavior. If BlockJobAbort
> succeeds and returns 0 (we are enforcing synchronous operation here) it
> should mean that block job has really been aborted. Therefore we may
> issue GetBlockJobInfo to ascertain ourselves.

Hmm, interesting question, especially in light of the new async cancel
flag.  I'm not sure how long a cancellation can take (it has to flush
all pending I/O, which may take a while), but you are correct that
passing in 0 for the abort flags means to wait until the cancel is done;
we may be waiting up to half a second to issue the abort from the time
you pressed Ctrl-C, but worse, we are now waiting for the entire
duration of the BlockJobAbort call, which might possibly be in the range
of seconds, and that can feel awfully slow to a user trying to hit
Ctrl-C to regain control of their terminal as fast as possible.

I think I need to add an --async flag that controls whether we return
control to the user as fast as possible after Ctrl-C (and fails if we
are talking to a too-old libvirtd), or omit the flag so that we
guarantee to wait until the job is cancelled, even if it takes a while
after the Ctrl-C.  I'll have to respin this patch, and since I want to
do the same for my new 'virsh blockcopy' patch, I'll fold it into my
block storage migration series.

-- 
Eric Blake   eblake@xxxxxxxxxx    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment:
signature.asc

Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list