Problems preserving lock state across suspend/resume

Hi folks,

I'm looking into a problem discussed back in January 2013
wherein lock/lease state isn't properly preserved across suspend/resume.

(This situation can lead to corruption if the guest's block storage is
modified elsewhere while the original guest is paused.)

For details see:

	https://www.redhat.com/archives/libvirt-users/2013-January/msg00109.html
	https://bugzilla.redhat.com/show_bug.cgi?id=906590

I'm using libvirt-1.2.0 with explicit Sanlock leases defined in the domain XML.

It appears the problematic behavior is due to virDomainLockProcessPause()
and virDomainLockProcessResume() being called twice during each
suspend/resume: once by the RPC worker thread running the suspend/resume
command, and once by the main thread in response to the QEMU events
triggered by the RPC worker's actions.

In libvirt-1.2.0, call paths for suspend are as follows:

qemuDomainObjBeginJob(suspend) -> 
	qemuDomainSuspend() -> 
		qemuProcessStopCPUs() -> 
			virDomainLockProcessPause()

qemuMonitorJSONIOProcessEvent:143 : handle STOP ->
	qemuProcessHandleStop -> 
		virDomainLockProcessPause()

Whichever call runs first (usually the one from qemuProcessHandleStop,
though the ordering may be racy) properly saves state and releases locks.

However, the second call queries lock status after the locks have
already been released, so it finds that no locks are held.  This results
in a null/blank lockState being saved in the domain object.
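
To make the sequence concrete, here is a minimal sketch of the failure
mode as I understand it.  It is a mock, not libvirt code: struct
mock_domain, mock_domain_init() and mock_lock_pause() are hypothetical
names invented for illustration, and the "query, save, release" order
inside mock_lock_pause() is my assumption about what a pause does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; none of these are libvirt APIs. */
#define MOCK_STATE_MAX 64

struct mock_domain {
    char held_leases[MOCK_STATE_MAX];      /* leases currently held */
    char saved_lock_state[MOCK_STATE_MAX]; /* models the saved lockState */
};

/* Start with the given leases held and nothing saved yet. */
static void mock_domain_init(struct mock_domain *dom, const char *leases)
{
    assert(dom != NULL && leases != NULL);
    snprintf(dom->held_leases, sizeof(dom->held_leases), "%s", leases);
    dom->saved_lock_state[0] = '\0';
}

/*
 * Assumed pause behavior: query what is held, store the answer as the
 * saved state, then release everything.  Nothing stops a second call
 * from repeating the query after the release.
 */
static void mock_lock_pause(struct mock_domain *dom)
{
    assert(dom != NULL);
    snprintf(dom->saved_lock_state, sizeof(dom->saved_lock_state),
             "%s", dom->held_leases);
    dom->held_leases[0] = '\0';
}
```

Calling mock_lock_pause() once on a domain initialized with a lease
leaves that lease recorded in saved_lock_state.  Calling it a second
time replaces the record with an empty string, because by then nothing
is held.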

Before I start working on a solution, are these multiple invocations
of virDomainLockProcessPause()/virDomainLockProcessResume() intentional?

Thanks,
Adam Tilghman
UC San Diego




