On 4/15/19 1:09 PM, Michal Privoznik wrote:
> On 4/10/19 11:35 PM, Cole Robinson wrote:
>> On 3/28/19 11:04 AM, Michal Privoznik wrote:
>>> Here is the problem: If all disks had XATTRs (i.e. domains using
>>> them were started with owner remembering turned on) then
>>> refcounting implemented in XATTRs would work nicely and we could
>>> set the whole backing chain and restore it later. But the world is
>>> not that simple. As soon as there is one domain that was started
>>> with the feature turned off (or simply by older libvirt), the
>>> XATTR refcounting does not reflect the actual number of uses by
>>> running domains and therefore any attempt to restore might cut
>>> off the old domain.
>>>
>>> There is no simple way around this, except artificially turning
>>> the feature off for the rest of the backing chain.
>>>
>>
>> Is there a thread discussing the issues that led to disabling this code?
>> I looked but couldn't find one. I could use some more context on what
>> case this patch fixes, and the upcoming patches. I'm having trouble
>> grokking these comments.
>
> I don't think I discussed it on the list. But imagine there are two
> domains: vm1 and vm2. Let them have one disk each like this:
>
>   vm1: disk1.qcow2 (RW) -> base.qcow2 (RO)
>   vm2: disk2.qcow2 (RW) -> base.qcow2 (RO)
>
> (I never know which way to draw the arrows, but I'm sure you get the
> idea. base.qcow2 is shared between the domains.)
>
> Now, start only vm1. This means that both disk1.qcow2 and base.qcow2 are
> relabelled. And imagine that seclabel remembering is on. The paths then
> have some XATTRs on them where the original owner is stored. So far so
> good.
>
> But then vm2 is started with seclabel remembering turned off (e.g.
> it's on a different host and base.qcow2 is on shared NFS, or simply
> the sysadmin turned the feature off and restarted libvirtd).
>
> Okay, we have two domains running, and base.qcow2's refcount would be 1
> (as read from XATTRs) even though it's used by two domains. But leave
> that aside for a moment.
>
> Now, vm1 is shut down. The label restore is started. Because the domain
> had the feature on when starting it up (it remembers that in the status
> XML), the whole backing chain would be restored (btw turning the feature
> off affects only freshly started domains). So we start with disk1.qcow2.
> Its refcount is 1 and therefore the original owner is restored. Then we
> proceed to base.qcow2. Its refcount is again 1 (as read from XATTRs)
> and thus we restore its original owner. But this is problematic, because
> that operation possibly cuts off vm2's access.
>
> If the refcounter of base.qcow2 reflected the actual number of times
> the file is in use, then we'd have no problem: restore wouldn't be done
> there, merely a refcounter update. But the refcounter only shows how
> many times the file is in use by domains with the feature enabled.
>
> Hopefully, this makes it clearer.
>
> I can't think of a clever way around this, other than remembering
> only the top layer and leaving the rest of the backing chain alone. This
> feels like solving a cluster problem to me.
>

Thanks for the info, that's basically what I determined in my later
response to this mail.

To me it sounds like this logic to skip refcounting needs to be extended
to any plausibly shared resource, basically anything that doesn't get an
exclusive virt_image_t label. But if it's true that we only need this
'remembering' behavior for resources exclusively assigned to a single
VM, then I wonder if we need the refcounting at all.

Sidenote: Besides the long-term end-user annoyance that the lack of
label/DAC remembering has caused, is there a bug tracking this where I
can look for more info? If things like rhev or openstack are asking
about this I'd like to read the report.

Thanks,
Cole
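
For illustration only, the vm1/vm2 scenario quoted above can be modelled
with a small toy simulation. This is a sketch in Python, not libvirt code:
the File class, the "orig_owner"/"refcount" keys, the "root"/"qemu" owner
names and all function names are hypothetical, and it assumes that files
which were not tracked at startup are simply left alone on shutdown.

```python
# Toy model of XATTR-based owner remembering. All names are hypothetical;
# none of this is libvirt's API or its real XATTR layout.

class File:
    def __init__(self, path, owner):
        self.path = path
        self.owner = owner
        self.xattrs = {}  # stands in for the XATTRs stored on the file


def set_label(f, new_owner, remember):
    """Relabel a file; record original owner and refcount only if remembering."""
    if remember:
        if "refcount" not in f.xattrs:
            f.xattrs["orig_owner"] = f.owner
            f.xattrs["refcount"] = 0
        f.xattrs["refcount"] += 1
    f.owner = new_owner


def restore_label(f):
    """Drop one reference; restore the original owner when the count hits 0."""
    if "refcount" not in f.xattrs:
        return
    f.xattrs["refcount"] -= 1
    if f.xattrs["refcount"] == 0:
        f.owner = f.xattrs.pop("orig_owner")
        del f.xattrs["refcount"]


def start_domain(chain, remember, top_only=False):
    """Relabel the whole chain; return the files this domain is tracking."""
    tracked = []
    for depth, f in enumerate(chain):
        track = remember and (depth == 0 or not top_only)
        set_label(f, "qemu", track)
        if track:
            tracked.append(f)
    return tracked


def stop_domain(tracked):
    """Restore only the files this domain remembered at startup."""
    for f in tracked:
        restore_label(f)


def scenario(top_only):
    base = File("base.qcow2", "root")
    disk1 = File("disk1.qcow2", "root")
    disk2 = File("disk2.qcow2", "root")
    vm1 = start_domain([disk1, base], remember=True, top_only=top_only)
    start_domain([disk2, base], remember=False)  # vm2: feature turned off
    stop_domain(vm1)                             # vm2 is still running
    return disk1.owner, base.owner


print(scenario(top_only=False))  # whole chain remembered
print(scenario(top_only=True))   # only the top layer remembered
```

With the whole chain remembered, base.qcow2 goes back to its original
owner while vm2 still uses it. With only the top layer remembered,
base.qcow2 keeps the owner that vm2 needs.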