On Aug 1, 2011, at 18:01 , Jonathan Stoppani wrote:

>
> On Aug 1, 2011, at 16:50 , Jonathan Stoppani wrote:
>
>>
>> On Aug 1, 2011, at 16:33 , Eric Blake wrote:
>>
>>> [re-adding the list]
>>
>> Sorry about that, still not used to mailman lists which don't put the list address in the reply-to field. ;-)
>>
>>>> Thanks for the prompt answer Eric! Yes, nc has a q option:
>>>>
>>>> -q, --hold-timeout=SEC1[:SEC2] Set hold timeout(s) for local [and remote]
>>>
>>> Glad to hear that we found root cause to your problems, then.
>>>
>>>>
>>>> The bug specifically refers to ssh, does that mean that it should work over tcp?
>>>
>>> The problem is that libvirt is trying to start a remote nc session over ssh; but looking at http://libvirt.org/remote.html, it looks like ssh is the only protocol using nc in that manner (so yes, you can probably avoid the issue by using tcp or tls). Meanwhile, I think you can work around it without patching libvirt, by using this as your remote URI:
>>>
>>> qemu+ssh://user@remotehost/system?netcat=/path/to/nc-wrapper
>>>
>>> where nc-wrapper is an executable script installed on remotehost, looking like:
>>>
>>> #!/bin/sh
>>> exec /path/to/real/nc -q0 "$@"
>>
>> Just tried this, but still hangs; will try tcp and report the results.
>>
>> ~Jonathan
>
> Tested using qemu+tcp and it hangs the same. If I interrupt the migration (^C), the domain is correctly destroyed on the destination but left in the paused state on the source. If I try to start it manually, I obtain this error:
>
> # virsh resume 1
> error: Failed to resume domain 1
> error: Timed out during operation: cannot acquire state change lock
>
> Any insights?

Can someone shed some light on the libvirt locking possibilities? It seems to me that sanlock is not supported on gentoo (and libvirt is compiled using --without-sanlock); could this be the cause of the problem?

Is there some way to explicitly set the locking mechanism to a noop in the libvirt configuration?

~Jonathan