On Wed, Jun 22, 2011 at 07:05:08PM +0200, Jiri Denemark wrote:
> On Wed, Jun 22, 2011 at 16:47:18 +0100, Daniel P. Berrange wrote:
> > If the QEMU process has been stopped (kill -STOP/gdb), or the
> > QEMU process has live-locked itself, then we will never get a
> > reply from the monitor. We should not wait forever in this
> > case, but instead time out after a reasonable amount of time.
> >
> > NB if the host has high CPU load, or a single monitor command
> > intentionally takes a long time, then this will cause bogus
> > failures. In the case of high CPU load, arguably the guest
> > should have been migrated elsewhere, since you can't effectively
> > manage guests on a host if QEMU is taking > 30 seconds to reply
> > to simple commands. Since we use background migration, there
> > should not be any commands which take significant time to
> > execute any more
>
> The thing I'm most concerned about is that it is far too easy to get into
> such situations, especially since the disk cache subsystem in the Linux
> kernel is not the best thing in the world. While I agree that running guests
> on a loaded host is not very clever and guests should rather be migrated
> elsewhere, such a situation doesn't have to be intentional. In other words,
> in case of a malfunction of some kind (some processes go crazy, network
> disruptions, ...) QEMU may require more than the timeout to respond, and we
> will penalize an innocent QEMU process because we won't be able to control
> it anymore even though the issues get fixed.

  It's clearly a trade-off, and that is the reason why it must be
configurable. 30s is a lot already. It's a first shot, and I'm sure feedback
will suggest adding more logic around this basic timeout-based error
detection. Right now the problem is that never failing the call is a
serious issue, and it can block the whole process too (like on daemon
restart when trying to reconnect to a stuck guest).
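  To make the trade-off concrete, here is a rough sketch of waiting for a
monitor reply with a configurable deadline instead of waiting forever. It is
not the patch under review and not libvirt code: it is written in Python
rather than C, and ReplyWaiter, MonitorTimeout, deliver() and
DEFAULT_TIMEOUT are all hypothetical names chosen for illustration. Only the
30 second figure comes from this thread.

```python
import threading
import time

# 30 seconds is the value discussed in the thread; making it a parameter
# is the "configurable" part. The name itself is hypothetical.
DEFAULT_TIMEOUT = 30.0


class MonitorTimeout(Exception):
    """Raised when no reply arrives before the deadline (hypothetical)."""


class ReplyWaiter:
    """Hypothetical stand-in for the code that waits on a monitor reply."""

    def __init__(self):
        self._cond = threading.Condition()
        self._reply = None
        self._have_reply = False

    def deliver(self, reply):
        # Called by whichever thread reads the reply from the monitor.
        with self._cond:
            self._reply = reply
            self._have_reply = True
            self._cond.notify_all()

    def wait(self, timeout=DEFAULT_TIMEOUT):
        # Use a monotonic deadline so spurious wakeups and wall-clock
        # adjustments cannot extend or shorten the total wait.
        deadline = time.monotonic() + timeout
        with self._cond:
            while not self._have_reply:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise MonitorTimeout(
                        "no monitor reply within %.1f seconds" % timeout)
                self._cond.wait(remaining)
            return self._reply


if __name__ == "__main__":
    # A waiter that never gets a reply models a stopped or live-locked
    # QEMU; a short timeout keeps the demonstration quick.
    stuck = ReplyWaiter()
    try:
        stuck.wait(timeout=0.2)
    except MonitorTimeout as err:
        print("gave up:", err)
```

  The caller gets an error after the deadline rather than blocking, which is
the behaviour wanted on daemon restart. What the sketch cannot do is tell a
stuck QEMU from a slow one on a loaded host, which is exactly the concern
raised above.
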
Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@xxxxxxxxxxxx  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/