[libvirt] domain restore race condition

Laine Stump <laine@xxxxxxxxx> · Mon, 22 Feb 2010 13:25:34 -0500

As noted in another message, the problem I was seeing is a race
condition in qemudDomainRestore(), not in my modifications to
qemudDomainSave(). Here's some discussion of that problem from IRC,
with a question at the bottom:

<laine> Does anyone else see a failure of domain restore (immediately
after domain save)? I'm very definitely seeing it on my machine with
F12+updates testing and libvirt built from unpatched sources.
<laine> It's very reproducible - with virsh I do "save domain
filename", then "restore filename" and it pretty much always gives me
a black screen. Then I force shutdown the guest (with virt-manager)
and do "restore filename" again. Tada! It's restored and running!
[...]
<danpb> laine: possible race condition
<danpb> laine: try putting a sleep(10) before the qemuMonitorStartCPUs 
in qemuDomainRestore()

Dan's suggestion *did* eliminate the failures.

[...]
<danpb> laine: this sounds like the issue with libvirt prematurely
starting execution of the CPUs before QEMU has even started restoring
(or something like that)
<danpb> laine: search the archives for a mail from Charles Duffy on 
this subject some time ago

Here's the BZ filed by Charles Duffy:

https://bugzilla.redhat.com/show_bug.cgi?id=537938

It looks like he's dealing with a race condition earlier in the restore,
since his solution was to wait for the migration process to terminate
somewhere inside qemudStartVMDaemon(), rather than waiting until
qemudStartVMDaemon() has finished (which is what the code does now).
Since that wait has already completed by the time Dan's sleep(10) runs
in my test, I don't think Charles' patch would help in this situation.

So is there something that libvirt can wait on here to ensure proper 
start? Or is there a problem in qemu? (I'm still running 0.11. I'll also 
try upgrading to 0.12 and see if there are changes in behavior.)
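
To make the question concrete, here is a minimal sketch of what I mean
by "wait on something" instead of a fixed sleep(10): poll a readiness
check with a timeout before starting the CPUs. Everything named here is
hypothetical. wait_until_ready() and fake_restore_done() are not libvirt
or qemu functions, and which real condition the check should test is
exactly the open question above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical predicate: returns true once the awaited condition holds. */
typedef bool (*ready_fn)(void *opaque);

/*
 * Poll `ready` every interval_ms until it returns true or roughly
 * timeout_ms has elapsed. Returns 0 when ready, -1 on timeout.
 * Illustration only, not libvirt code.
 */
static int wait_until_ready(ready_fn ready, void *opaque,
                            long timeout_ms, long interval_ms)
{
    struct timespec pause;
    long waited = 0;

    assert(ready != NULL);
    if (interval_ms <= 0)
        interval_ms = 1;              /* avoid a busy loop that never ends */

    pause.tv_sec = interval_ms / 1000;
    pause.tv_nsec = (interval_ms % 1000) * 1000000L;

    for (;;) {
        if (ready(opaque))
            return 0;
        if (waited >= timeout_ms)
            return -1;
        nanosleep(&pause, NULL);
        waited += interval_ms;        /* approximate; ignores poll cost */
    }
}

/*
 * Stand-in for "has qemu finished loading the saved state?".
 * It reports not-ready for a fixed number of polls, then ready.
 */
static bool fake_restore_done(void *opaque)
{
    int *polls_left = opaque;

    if (*polls_left > 0) {
        (*polls_left)--;
        return false;
    }
    return true;
}
```

In the restore path this would sit where the sleep(10) is now, just
before qemuMonitorStartCPUs, with a real check in place of
fake_restore_done() and an error return on timeout rather than starting
the CPUs anyway.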
