On Thu, Jun 26, 2014 at 12:42:52PM +0100, Daniel P. Berrange wrote:
On Thu, Jun 26, 2014 at 01:20:02PM +0200, Martin Kletzander wrote:If locking the domain failed, files were already labelled and thus we restored the previous label on them. Having disks on NFS means the domain having the lock already gets permission denial. This code moves the labelling part into the command hook since it's still privileged, and also moves the clearing of VIR_QEMU_PROCESS_STOP_NO_RELABEL from stop_flags right after the handshare after hook.This problem description / fix doesn't make much sense to me. IIUC the control flow is - Parent runs fork() - Parent waits for handshake notify - Child runs hook - Hook *only* registers with lock daemon - Child sends handshake notify to parent - Child waits for handshake response - Parent received handshake notify - Parent does labelling - Parent sends handshake response - Child execs QEMU - QEMU launches but CPUs are paused - Parent acquires disk locks - Parent tells QEMU to start CPUs Note that the hook does not acquire any locks - it merely connects to the lock daemon. Locks are not acquired until the CPUs are ready to be started. So I don't see how moving labelling into the hook solves anything.
Oh, my fault, I haven't realized, we're just registering there.
Note that the goal of the locking code as it is today, was only to prevent the content of the disk image being corrupted by 2 QEMUs running concurrently. The design as it is succeeds in this. Stopping changes to the labelling was not attempted. Yes, this will result in a running QEMU loosing access to a disk if another QEMU attempts to start and use those disks, but the content is protected in this way. It isn't actually possible to protect against concurrent changes to both the content and the labelling with a single lock because there are differing lock ordering & protection rules requires for these. To do that, we actually need to incorporate use of the lock manager into the security drivers using a separate lock space and use locking rules that apply explicitly to the needs of the labelling.
It occurred to me too that this might be either fixed or the fix eased after Michal's patches are applied (not my area, though): http://www.redhat.com/archives/libvir-list/2014-March/msg00826.html What I think is that it would "(almost) solve it" automatically, since it would restore the original label, even though there would be a small window when the first QEMU process doesn't have access to the disk. But definitely better result than now. Martin
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list