On 06/22/2011 11:05 AM, Jiri Denemark wrote:
> On Wed, Jun 22, 2011 at 16:47:18 +0100, Daniel P. Berrange wrote:
>> If the QEMU process has been stopped (kill -STOP/gdb), or the
>> QEMU process has live-locked itself, then we will never get a
>> reply from the monitor. We should not wait forever in this
>> case, but instead timeout after a reasonable amount of time.
>>
>> NB if the host has high CPU load, or a single monitor command
>> intentionally takes a long time, then this will cause bogus
>> failures. In the case of high CPU load, arguably the guest
>> should have been migrated elsewhere, since you can't effectively
>> manage guests on a host if QEMU is taking > 30 seconds to reply
>> to simple commands. Since we use background migration, there
>> should not be any commands which take significant time to
>> execute any more.
>
> The thing I'm most concerned about is that it is far too easy to get into such
> situations, especially since the disk cache subsystem in the Linux kernel is not
> the best thing in the world. While I agree that running guests on a loaded host
> is not very clever and guests should rather be migrated elsewhere, such a
> situation doesn't have to be intentional. In other words, in case of a
> malfunction of some kind (some processes go crazy, network disruptions, ...)
> QEMU may require more than the timeout to respond, and we will penalize an
> innocent QEMU process because we won't be able to control it anymore even
> after the issues get fixed.

Is there any way to measure the time spent by the child process, rather
than just relying on wall-time elapsed? That is, when libvirt hits 30
seconds of wall time while waiting for a monitor reply, can it then check
whether the child process has accumulated any execution time (likely hung)
vs. no execution time (likely a starved system situation), and only give
up in the former case?

-- 
Eric Blake   eblake@xxxxxxxxxx    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
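
To make the question above concrete, here is a rough sketch of such a check. It is not libvirt code, and the helper names (parse_cpu_ticks, cpu_seconds, should_give_up) and the 0.1 second threshold are made up for illustration. It assumes a Linux host, where /proc/<pid>/stat exposes the child's user and system CPU time in clock ticks (fields 14 and 15), sampled once when the monitor command is sent and again when the wall-clock timeout expires.

```python
import os


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The second field (comm) is parenthesised and may contain spaces,
    # so split on the last ')' before splitting on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime is field 14, stime is field 15.
    return int(rest[11]) + int(rest[12])


def cpu_seconds(pid: int) -> float:
    """CPU time consumed so far by the process, in seconds (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        ticks = parse_cpu_ticks(f.read())
    return ticks / os.sysconf("SC_CLK_TCK")


def should_give_up(cpu_before: float, cpu_after: float,
                   min_progress: float = 0.1) -> bool:
    """Heuristic from the message above, applied after the wall-clock timeout.

    CPU time accumulated during the wait -> treat the child as hung, give up.
    No CPU time accumulated -> treat the host as starved, keep waiting.
    min_progress is an arbitrary illustrative threshold in seconds.
    """
    return (cpu_after - cpu_before) >= min_progress
```

The caller would record cpu_seconds(pid) before issuing the monitor command and call should_give_up() once 30 seconds of wall time have passed. One caveat: a child stopped with kill -STOP also accumulates no CPU time, so this check on its own would not tell that case apart from a starved host.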
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list