On 07/27/2018 11:01 AM, Daniel P. Berrangé wrote:
> On Fri, Jul 27, 2018 at 09:56:23AM +0200, Michal Privoznik wrote:
>> Dear list,
>>
>> we have this very old bug [1] that I just keep pushing in front of me. I
>> made several attempts to fix it. However, none of them made it into the
>> tree. I guess it's time to have a discussion about what to do with it.
>> IIRC, the algorithm that I implemented last was to keep the original
>> label in XATTRs (along with a ref counter) and the last one to restore
>> the label would look there and find the original label. There was a
>> problem with two libvirtds fighting over a file on a shared FS.
>
> IIRC there was a second problem around security of XATTRs. E.g. if we set
> an XATTR it was possible for QEMU to change it once we had given QEMU
> privs on the image. Thus a malicious QEMU could cause libvirtd to change
> the image to an arbitrary user on shutdown.
>
> I think it was possible to deal with that on Linux, because the kernel
> has some xattr namespacing that is reserved for root only. This protection
> is worthless though when using NFS images, as you can't trust the remote
> NFS client to honour the same restriction.
>
>> So I guess my question is: can we come up with a design that would work?
>> Or at least work to the extent that we're satisfied with?
>>
>> Personally, I like the XATTRs approach. And to resolve the NFS race we
>> can invent yet another lockspace to guard labelling - I also have a bug
>> for that [2] (although I'm not that familiar with lockspaces). I guess
>> doing disk metadata locking is not going to be trivial, is it?
>
> Yeah, for sure we need two distinct locking regimes. Our current locking
> aims to protect disk content, and the lifetime of the lock is the entire
> duration of the QEMU process. For XATTRs, we only need write protection
> for the short time window in which the security drivers execute, which
> is at start, during hotplug/unplug, during shutdown, and finally migration.
>
> I think we could achieve this with virtlockd if we make it use the
> same locking file, but a different byte offset within the file. We just
> need to make sure it doesn't clash with the byte offset QEMU has chosen,
> nor with the current offset we use.

So do we really need virtlockd for this? I mean, the whole purpose of it
is to hold locks so that libvirtd can restart independently, without
losing the locks it holds. However, since the metadata lock will be held
only for a fraction of a second and will not be held across more than a
single API call, aren't we just fine with libvirtd holding the lock?

The way I imagine this to work is the following:

  lock_for_metadata(vm->def);   // locks a unique byte in all the disks
  qemuSecuritySetAllLabel();
  ...
  qemuProcessStartCPUs();       // disk content lock is acquired here
  unlock_for_metadata(vm->def);

Another way to look at it is: it's libvirtd that relabels all the paths,
and the metadata lock should guard just that.

Speaking of which, we need these locks to wait for each other, not just
set-or-fail. Again, something that goes against virtlockd's view of
locking.

Frankly, I've already started implementing this with virtlockd, but the
changes I made so far simply do not feel right, e.g. I have to change
lock_protocol.x so that virLockSpaceProtocolAcquireResourceArgs has a
'type' argument that tells virtlockd which byte we want to lock.

Michal
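
For illustration only, here is a rough sketch of the xattr idea discussed
above, using plain POSIX calls rather than libvirt's security driver code;
the attribute name "trusted.example.origlabel" and the helper
remember_original_owner() are made up for this example. It stores the
original owner in the root-only trusted.* namespace, which is what keeps
QEMU from tampering with it on local filesystems (and, as Daniel notes,
does not help on NFS):

/* Sketch only, not libvirt code: record the original owner of an image
 * in the root-only "trusted." xattr namespace before QEMU gets access
 * to the file, so the owner can be restored on shutdown. */
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical attribute name, chosen only for this example. */
#define ORIG_LABEL_XATTR "trusted.example.origlabel"

static int
remember_original_owner(const char *path)
{
    struct stat sb;
    char buf[64];

    /* If a value is already stored (e.g. by another domain sharing the
     * disk), keep it - only the first setter records the original owner. */
    if (getxattr(path, ORIG_LABEL_XATTR, NULL, 0) >= 0)
        return 0;

    if (stat(path, &sb) < 0)
        return -1;

    snprintf(buf, sizeof(buf), "%ld:%ld", (long)sb.st_uid, (long)sb.st_gid);

    /* Only root may write trusted.* attributes on a local filesystem,
     * so QEMU itself cannot overwrite the stored value there. */
    return setxattr(path, ORIG_LABEL_XATTR, buf, strlen(buf), XATTR_CREATE);
}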
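
And a similarly rough sketch of the blocking "wait for each other" metadata
lock, assuming libvirtd takes it directly with fcntl() OFD byte-range locks
instead of going through virtlockd; the offset METADATA_LOCK_OFF and the
function names are made up, and a real offset would have to avoid the bytes
QEMU and virtlockd already lock:

/* Sketch only: serialize relabelling of a disk image by taking a
 * blocking byte-range lock on a dedicated offset in the image itself. */
#define _GNU_SOURCE             /* for F_OFD_SETLKW on Linux */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Made-up offset; it must not collide with the bytes QEMU locks or with
 * the offset used for content locking. */
#define METADATA_LOCK_OFF 200

static int
metadata_byte_lock(int fd, short type)   /* F_WRLCK to lock, F_UNLCK to unlock */
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = METADATA_LOCK_OFF;
    fl.l_len = 1;                        /* a single byte is enough */

    /* F_OFD_SETLKW blocks until the lock is granted - the "wait for each
     * other, not set-or-fail" behaviour needed while relabelling. */
    return fcntl(fd, F_OFD_SETLKW, &fl);
}

int
main(int argc, char **argv)
{
    int fd;

    if (argc != 2 || (fd = open(argv[1], O_RDWR)) < 0)
        return 1;

    if (metadata_byte_lock(fd, F_WRLCK) < 0)
        return 1;

    /* ... relabel the image here (qemuSecuritySetAllLabel() in libvirt) ... */

    metadata_byte_lock(fd, F_UNLCK);
    close(fd);
    return 0;
}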