On Wed, Aug 24, 2011 at 11:58:29PM +0800, Guannan Ren wrote:
> On 08/18/2011 05:55 AM, Dave Allan wrote:
> > So, after your patches, which have greatly improved the console
> > behavior, I find that I'm back to this hang, which by its nature I
> > can't reproduce with virsh console, as it only appears when I've
> > shut down and started a domain several times within the same
> > connection. The hang is 100% reproducible. Per our IRC conversation,
> > I'm attaching the RPC logs, as well as the python code for reference
> > and a backtrace of the python process at the time that it was hung.
> >
> > Dave
>
> I can reproduce the problem, so I did some research on it.
> According to the libvirtd log, it hangs because when the domain
> boots up for the second time, libvirtd sends a message to the
> python script due to the lifecycle_callback setting, while also
> setting the socket fd of the client to "mode=0", which means it is
> neither readable nor writable on the libvirtd side.
> So when the python script gets the lifecycle event and tries to
> call virDomainGetState() in the command for opening the console,
> it sends the message to libvirtd, then hangs and never gets the
> response.

The problem is that before decrementing 'client->nrequests' we check
what the message type/status is. The check is incorrect for streams,
because it fails to take account of the fact that some stream errors
may be asynchronous and thus untracked. This in turn causes the
'nrequests' variable to go negative.

A fix which worked for me is here:

  https://www.redhat.com/archives/libvir-list/2011-August/msg01518.html

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|