Re: Ceph rbd clients surrender exclusive lock in critical situation

On Wed, Jan 18, 2023 at 3:25 PM Frank Schilder <frans@xxxxxx> wrote:
>
> Hi Ilya,
>
> thanks a lot for the information. Yes, I was talking about the exclusive lock feature and was under the impression that only one rbd client can get write access on connect and will keep it until disconnect. The problem we are facing with multi-VM write access is that this will inevitably corrupt the file system created on the rbd if two instances can get write access. It's not a shared file system, it's just an XFS-formatted virtual disk.
>
> > There is a way to disable automatic lock transitions but I don't think
> > it's wired up in QEMU.
>
> Can you point me to some documentation about that? It sounds like this is what would be needed to avoid the file system corruption in our use case. The lock transition should be initiated from the outside and the lock should then stay fixed on the client holding it until it is instructed to give up the lock or it disconnects.

It looks like there is not much documentation on this specific aspect
beyond a few scattered notes which I'm pasting below:

> To disable transparent lock transitions between multiple clients, the
> client must acquire the lock by using the RBD_LOCK_MODE_EXCLUSIVE flag.

> Per mapping (block device) rbd device map options:
> [...]
> - exclusive - Disable automatic exclusive lock transitions.
>   Equivalent to --exclusive.

(Yes, both the flag and the option are also named "exclusive".  Don't
ask why...)

However, note that for krbd, --exclusive comes with some strings
attached.  For QEMU, there is no such option at all -- as already
mentioned, the RBD_LOCK_MODE_EXCLUSIVE flag is not wired up there.
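
Just to illustrate what the flag looks like from a client that talks to
librbd directly (i.e. not through QEMU), here is a minimal sketch using
the python-rbd bindings -- the pool name, image name and conffile path
are placeholders:

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('rbd')                 # placeholder pool
        try:
            with rbd.Image(ioctx, 'vm-disk-1') as image:  # placeholder image
                # Acquire the exclusive lock explicitly.  Held this way, it
                # is not transparently handed over to other clients.
                image.lock_acquire(rbd.RBD_LOCK_MODE_EXCLUSIVE)
                try:
                    pass  # ... do I/O ...
                finally:
                    image.lock_release()
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

With krbd, the rough equivalent is mapping with the option quoted above,
i.e. something like "rbd device map --exclusive rbd/vm-disk-1".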

Ultimately, it's the responsibility of the orchestration layer to
prevent situations like this from happening.  Ceph just provides
storage; it can't really be involved in managing one's VMs or deciding
whether multi-VM access is OK.  The orchestration layer may choose to
use some of the RBD primitives for this (whether exclusive locks or
advisory locks -- see "rbd lock add", "rbd lock ls" and "rbd lock rm"
commands), use something else, or do nothing at all...
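
In case it is useful: the advisory locks behind those commands are also
reachable from librbd, so an orchestration layer does not have to shell
out to the rbd CLI.  A rough python-rbd sketch (pool, image and cookie
names are placeholders; keep in mind these locks are purely advisory --
they record a claim on the image but don't block I/O by themselves):

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('rbd')                 # placeholder pool
        try:
            with rbd.Image(ioctx, 'vm-disk-1') as image:  # placeholder image
                try:
                    # Roughly "rbd lock add": take an advisory exclusive
                    # lock, identified by a cookie of our choosing.
                    image.lock_exclusive('vm-instance-42')
                except rbd.ImageBusy:
                    # Someone else holds a lock -- roughly "rbd lock ls".
                    print(image.list_lockers())
                    raise
                try:
                    pass  # ... start the VM, do I/O, etc. ...
                finally:
                    # Roughly "rbd lock rm" for our own lock.
                    image.unlock('vm-instance-42')
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()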

>
> >> Is this a known problem with libceph and libvirtd?
> > Not sure what you mean by libceph.
>
> I simply meant that it's not a krbd client. Libvirt uses libceph (or was it librbd?) to emulate virtual drives, not krbd.

libceph is actually one of the kernel modules.  libvirt/QEMU usually
use librbd but it's completely up to the user.  Nothing prevents you
from feeding some krbd devices to libvirt/QEMU, for example.

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


