Re: virtlock - a VM goes read-only

Branimir Pejakovic <branimirp@xxxxxxxxx> · Mon, 20 Nov 2017 22:56:18 +0000

On Thu, Nov 16, 2017 at 12:48 PM, Daniel P. Berrange <berrange@xxxxxxxxxx> wrote:
On Wed, Nov 15, 2017 at 02:24:48PM +0000, Branimir Pejakovic wrote:

> Dear colleagues,
>
> I am facing a problem that has been troubling me for the last week and a half.
> Please help or offer some guidance if you are able to.
>
> I have a non-prod POC environment with 2 fully updated CentOS 7 hypervisors
> and an NFS filer that serves as VM image storage. The overall environment
> works exceptionally well. However, starting a few weeks ago I have been
> trying to implement virtlock in order to prevent a VM from running on 2
> hypervisors at the same time.

[snip]

> h2 # virsh start test09
> error: Failed to start domain test09
> error: resource busy: Lockspace resource
> '/storage_nfs/images_001/test09.qcow2' is locked

[snip]

> Now, I am pretty sure that I am missing something simple here, since this is
> a standard feature and should work out of the box if set up correctly, but so
> far I cannot see what I am missing.

So I think you are hitting the little surprise in the way our locking
works. Specifically, right now the locking only protects the image
file contents from concurrent writes. We don't have locking around
the file attributes (permissions, user/group ownership, SELinux label,
etc).

Unfortunately, with the current libvirt design, the security drivers run
before locking takes effect. So what happens is that you have your first
VM running normally. It has been granted the ability to write to the image
in terms of SELinux label & permissions/ownership. The lock manager is
holding locks protecting the image contents.

Now you try to start the second guest, and libvirt will apply the SELinux
label & permissions/ownership needed for that second guest, despite the
image being used by the first guest. Only then do we acquire the locks for the
disk image, and fail because the first guest holds the lock. We now
reset the permissions/ownership we just granted for the second guest,
which unfortunately blocks the first guest from using the image,
causing the I/O errors you mention.
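
To make the ordering concrete, here is a toy model in plain Python. It is only a sketch and not libvirt code: the Image class, start_guest(), can_write(), the host name "h1" and the use of a single owner string to stand in for ownership, permissions and SELinux label are all made up for illustration.

```python
# Toy model of the start-up ordering described above. Nothing here is
# libvirt code; every name is hypothetical and exists only for illustration.

class Image:
    def __init__(self, path, owner="root"):
        self.path = path
        self.owner = owner        # stands in for ownership/permissions/label
        self.lock_holder = None   # stands in for the lock manager's state


def start_guest(image, guest, host):
    """Mimic the current order: grant access first, take the lock second."""
    image.owner = "qemu"                  # 1. security driver grants access
    if image.lock_holder is not None:     # 2. lock acquisition fails
        image.owner = "root"              # 3. the grant is rolled back
        return False
    image.lock_holder = (host, guest)
    return True


def can_write(image):
    """In this model a guest can write only while the image is owned by qemu."""
    return image.owner == "qemu"


img = Image("/storage_nfs/images_001/test09.qcow2")
first = start_guest(img, "test09", "h1")    # first guest starts normally
second = start_guest(img, "test09", "h2")   # second start is refused by the lock
print(first, second, can_write(img))        # True False False
```

The lock does its job (the second start is refused), but the rollback in step 3 leaves the image in a state the still-running first guest can no longer write to.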

We *have* successfully prevented 2 guests from writing to the same
image at once, so your data is still safe. Unfortunately, though, the
first guest cannot write any further data, so that previously
running guest is now fubar :-(

I appreciate this is rather surprising & unhelpful in general. Just
console yourself with the fact that at least your disk image is not
corrupted.

Note, this should only happen with SELinux enforcing though - if it is
permissive, then I'd expect the first guest to carry on working.

We would like to improve our locking so that we can apply locks before
we even try to change ownership/permissions/SELinux, which would make
it far more useful. We've never successfully completed that work though.

Hi Daniel

Thank you very much for your answer. Apologies for the late reply.

I got it working, but I had to make a few modifications.

Usually, qemu-kvm runs as the qemu user, as configured in /etc/libvirt/qemu.conf (user/group parameters). My QCOW files were owned by root during this experiment (usually I set them to be owned by the qemu user). Once a VM starts, the ownership is changed to qemu and it stays that way until I try to start the same VM on another hypervisor and the lock kicks in. At that moment, the file ownership is changed to root again (observed via watch in a 2nd terminal) and the VM goes read-only.

As a workaround, the lock works normally (no read-only VM) if I do the following:
- set 0777 permissions on the QCOW image file
- change the user/group parameters in qemu.conf to root and restart libvirtd

I like to have a bit of security, so I searched through the qemu.conf file and found the dynamic_ownership option. It is set to 1 by default. I set it to 0, changed the ownership of the image files to qemu, set user/group in qemu.conf to qemu, restored normal permissions on the files and finally restarted libvirtd. After that, the lock works as expected.
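
With dynamic_ownership set to 0, libvirt no longer changes the ownership of the image files, so they have to be owned by the configured user before a VM starts. A minimal sketch of a pre-flight check for that, using only the Python standard library; the function name owner_can_read_write() is hypothetical, the group is deliberately left out of the check, and the demonstration runs on a throwaway temporary file rather than a real image:

```python
import os
import stat
import tempfile


def owner_can_read_write(path, uid):
    """Return True if `path` is owned by `uid` and the owner has rw bits set."""
    st = os.stat(path)
    rw = stat.S_IRUSR | stat.S_IWUSR
    return st.st_uid == uid and (st.st_mode & rw) == rw


# Demonstration on a temporary file owned by the current user.
with tempfile.NamedTemporaryFile(suffix=".qcow2") as tmp:
    os.chmod(tmp.name, 0o600)
    ok = owner_can_read_write(tmp.name, os.geteuid())
    os.chmod(tmp.name, 0o400)          # drop the owner's write bit
    read_only = owner_can_read_write(tmp.name, os.geteuid())

print(ok, read_only)                   # True False
```

On a hypervisor, the uid to pass would be the one of the user set in qemu.conf, for example pwd.getpwnam("qemu").pw_uid, and the path would be the image file on the NFS storage.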

The side effect is that if you want to do template-based provisioning via a python-libvirt based app, or from the shell by using virt-sysprep, virt-clone or some other libvirt/libguestfs based app, and you want to modify a VM after cloning (hostname/network/vCPU/etc.), it will throw an exception - permission denied on the image file. This one was solved by looking at https://access.redhat.com/solutions/2110391 and it works great once applied. During all this time, SELinux was disabled.

I would like to take this opportunity to personally thank you and the team at Red Hat for all the hard work on libvirt and libvirt-based tools. I heavily use the python-libvirt module, and what I can say is that if you don't have RHEV-M/oVirt as a single pane of glass for your virtualization layer, it helps you a great deal in managing and inspecting (and gathering statistics on!) a large pool of KVM hypervisors. Please keep up the good work!

Regards,

Branimir

_______________________________________________
libvirt-users mailing list
libvirt-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvirt-users