On Fri, Jun 17, 2011 at 10:55:43 +0100, Daniel P. Berrange wrote:
> On Thu, Jun 16, 2011 at 04:03:36PM -0400, Dave Allan wrote:
> > Dan, can you suggest some possible strategies here?  I don't have a
> > strong opinion on the implementation, although I agree with your
> > concern about spawning unlimited numbers of threads.
>
> As I mentioned, we need to make the QEMU monitor timeout after some
> period of time waiting, and ensure that the monitor for that VM cannot
> be used thereafter.

I'm not sure that's the best way to deal with this either. I have hated
this kind of timeout ever since I worked on Xen :-) The problem is that
no matter how big the timeout is, it is usually pretty easy to get into
a situation where it is not big enough. If anything in the system goes
crazy (the easiest way is just causing lots of disk writes), the monitor
command times out and you cannot do anything with the domain except
destroy it (or shut it down from inside), even after you have fixed the
issue and the system has returned to normal operation.

Another issue is that the threads don't have to be stuck in the QEMU
monitor at all; they can be doing migration, for example. Let's say one
client connects to libvirtd and starts 5 migrations. Then 15 other
clients connect and each issues 1 additional migration. So we have 16
clients connected and all 20 threads consumed. Even though a new client
can still connect to libvirtd, it can't do anything (not even cancel the
migrations) since no worker is free.

I know this is not a probable scenario, but I just wanted to show that
we need to think about more ways in which libvirtd can become
unresponsive. I'm afraid we won't find any perfect solution and we'll
just need to take the one that we think sucks least.

Jirka
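
P.S. To make the worker exhaustion scenario above concrete, here is a
rough sketch in Python. It is not libvirtd code: it uses a generic
fixed-size thread pool from the Python standard library, and the names
run_scenario, migrate and cancel are made up for illustration. The pool
size of 20 comes from the example above. Each "migration" simply blocks
on an event, standing in for a long-running job that holds a worker.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

POOL_SIZE = 20  # worker count taken from the scenario above


def run_scenario(pool_size=POOL_SIZE, migrations=20, wait=0.5):
    """Return (cancel_ran_while_busy, cancel_result) for a toy worker pool."""
    release = threading.Event()       # stands in for "migrations finished"
    started = threading.Semaphore(0)  # counts jobs that have taken a worker

    def migrate():
        started.release()
        release.wait()                # the worker stays occupied until released
        return "migrated"

    def cancel():
        return "cancelled"

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        try:
            for _ in range(migrations):
                pool.submit(migrate)
            # Wait until every migration that can get a worker has one.
            for _ in range(min(pool_size, migrations)):
                started.acquire()

            # A new request arrives while the migrations are running.
            cancel_job = pool.submit(cancel)
            try:
                cancel_job.result(timeout=wait)
                ran_while_busy = True
            except FutureTimeout:
                # No free worker: the request just sits in the queue.
                ran_while_busy = False
        finally:
            release.set()             # let the migrations complete
        return ran_while_busy, cancel_job.result()


if __name__ == "__main__":
    print(run_scenario(migrations=20))  # every worker is taken
    print(run_scenario(migrations=19))  # one worker is left over
```

With 20 migrations the cancel request is accepted but only runs once a
migration finishes; with 19 it runs right away. A timeout on the monitor
would not change this, since none of these workers is waiting on the
monitor.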