On 04/13/2012 01:29 AM, Michal Privoznik wrote: > On 12.04.2012 21:59, Eric Blake wrote: >> I'm tired of shell-scripting to wait for completion of a block pull, >> when virsh can be taught to do the same. I couldn't quite reuse >> vshWatchJob, as this is not a case of a long-running command where >> a second thread must be used to probe job status (at least, not unless >> I make virsh start doing blocking waits for an event to fire), but it >> served as inspiration for my simpler single-threaded loop. There is >> up to a half-second delay between sending SIGINT and the job being >> aborted, but I didn't think it worth the complexity of a second thread >> and use of poll() just to minimize that delay. >> >> @@ -7295,8 +7295,9 @@ repoll: >> } >> >> GETTIMEOFDAY(&curr); >> - if ( timeout && ((int)(curr.tv_sec - start.tv_sec) * 1000 + \ >> - (int)(curr.tv_usec - start.tv_usec) / 1000) > timeout * 1000 ) { >> + if (timeout && (((int)(curr.tv_sec - start.tv_sec) * 1000 + >> + (int)(curr.tv_usec - start.tv_usec) / 1000) > >> + timeout * 1000)) { >> /* suspend the domain when migration timeouts. */ >> vshDebug(ctl, VSH_ERR_DEBUG, "%s timeout", label); >> if (timeout_func) > > These two chunks are rather cosmetic but I'm okay with having them in > this patch not a separate one. I noticed them because I was copying and pasting from them. Depending on whether I spot other cleanups for my v2, I may split the cleanups into a separate patch. >> + while (blocking) { >> + virDomainBlockJobInfo info; >> + int result = virDomainGetBlockJobInfo(dom, path, &info, 0); >> + >> + if (result < 0) { >> + vshError(ctl, _("failed to query job for disk %s"), path); >> + goto cleanup; >> + } >> + if (result == 0) >> + break; >> + >> + if (verbose) >> + print_job_progress(_("Block Pull"), info.end - info.cur, info.end); >> + >> + GETTIMEOFDAY(&curr); >> + if (intCaught || (timeout && >> + (((int)(curr.tv_sec - start.tv_sec) * 1000 + >> + (int)(curr.tv_usec - start.tv_usec) / 1000) > >> + timeout * 1000))) { >> + vshDebug(ctl, VSH_ERR_DEBUG, >> + intCaught ? "interrupted" : "timeout"); >> + intCaught = 0; >> + timeout = 0; >> + quit = true; >> + if (virDomainBlockJobAbort(dom, path, 0) < 0) { >> + vshError(ctl, _("failed to abort job for disk %s"), path); >> + goto cleanup; >> + } > > Don't we need blocking = false; here? Otherwise we may spin another > rounds and issue GetBlockJobInfo API over and over again. But on the > other hand, I can imagine this is desired behavior. If BlockJobAbort > succeeds and returns 0 (we are enforcing synchronous operation here) it > should mean that block job has really been aborted. Therefore we may > issue GetBlockJobInfo to ascertain ourselves. Hmm, interesting question, especially in light of the new async cancel flag. I'm not sure how long a cancellation can take (it has to flush all pending I/O, which may take a while), but you are correct that passing in 0 for the abort flags means to wait until the cancel is done; we may be waiting up to half a second to issue the abort from the time you pressed Ctrl-C, but worse, we are now waiting for the entire duration of the BlockJobAbort call, which might possibly be in the range of seconds, and that can feel awfully slow to a user trying to hit Ctrl-C to regain control of their terminal as fast as possible. I think I need to add an --async flag that controls whether we return control to the user as fast as possible after Ctrl-C (and fails if we are talking to a too-old libvirtd), or omit the flag so that we guarantee to wait until the job is cancelled, even if it takes a while after the Ctrl-C. I'll have to respin this patch, and since I want to do the same for my new 'virsh blockcopy' patch, I'll fold it into my block storage migration series. -- Eric Blake eblake@xxxxxxxxxx +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list