On Thu, Mar 24, 2022 at 2:04 PM Budai Laszlo <laszlo.budai@xxxxxxxxx> wrote:
>
> Hi Ilya,
>
> Thank you for your answer!
>
> On 3/24/22 14:09, Ilya Dryomov wrote:
> > > How can we see whether a lock is exclusive or shared? The
> > > "rbd lock ls" command output looks identical for the two cases.
> >
> > You can't. The way --exclusive is implemented is the client simply
> > refuses to release the lock when it gets the request to do so. This
> > isn't tracked on the OSD side in any way so "rbd lock ls" doesn't
> > have that information.
>
> If I understand correctly, the lock itself is an OSD "flag", but
> whether it is treated as shared or exclusive is a local decision of
> the client. Is this correct?

Hi Laszlo,

Not entirely. There are two orthogonal concepts: shared vs exclusive
and managed vs unmanaged.

The distinction between shared and exclusive is what you would expect:
a shared lock can be held by multiple clients at the same time (as
long as they all use the same lock tag -- a free-form string). An
exclusive lock can only be held by a single client at a time.

Managed vs unmanaged refers to whether librbd is involved. For the
managed case, if an image is opened in read-write mode, librbd ensures
that a lock is taken before proceeding with any write (and in certain
cases before proceeding with any read as well). If the lock is owned
by another client at that time, it is transparently requested and,
unless the other client is in the poorly named --exclusive mode, the
lock is eventually transitioned behind the scenes. A managed lock
doesn't prevent two clients from writing to the same image: its sole
purpose is to prevent them from doing that at _exactly_ the same
moment in time. The use case is protecting an RBD image's internal
metadata, such as the object map, from concurrent modifications.

For the unmanaged case, everything is up to the user. The lock is
completely external to librbd, meaning that librbd would happily
scribble over the image if the user doesn't check on the lock before
mapping the image or starting some operation. The use case is
providing a building block for users building their own orchestration
on top of RBD.

The matrix is as follows (concrete CLI sketches follow below):

- unmanaged/exclusive: "rbd lock add"
- unmanaged/shared: "rbd lock add --shared"
- managed/exclusive with automatic transitions: exclusive-lock image
  feature
- managed/exclusive without automatic transitions: exclusive-lock
  image feature with --exclusive mapping option
- managed/shared: technically possible but not surfaced to the user

> If my previous understanding is correct, then I assume it would not
> be impossible to modify the client code so that it can be configured
> on the fly how to handle lock release requests.

Not impossible, but pretty hard...

> My use case would be an HA cluster where a VM is mapping an rbd
> image and then encounters some network issue. Another node of the HA
> cluster could start the VM and map the image again, but once the
> networking is fixed, the first VM would keep using the already
> mapped image. If I could instruct my second VM to treat the lock as
> exclusive after an automatic failover, then I would be protected
> against data corruption when the networking of the initial VM is
> fixed. But I assume that a STONITH kind of fencing could also do the
> job (if it can be implemented).

I would suggest using unmanaged locks here -- this is exactly what
they are for.

Thanks,

Ilya
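
To make the unmanaged rows of the matrix concrete, here is a minimal
CLI sketch (the image name rbd/test, the lock IDs, and the tag are
arbitrary examples, not taken from the thread):

    # unmanaged/exclusive: take an advisory exclusive lock; a second
    # "rbd lock add" on the same image fails while the lock is held
    $ rbd lock add rbd/test mylock

    # unmanaged/shared: several clients may hold the lock at once,
    # provided they all pass the same free-form tag
    $ rbd lock add --shared mytag rbd/test lock-node-a
    $ rbd lock add --shared mytag rbd/test lock-node-b

    # list the current lockers (locker entity, lock ID, address)
    $ rbd lock ls rbd/test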
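
The managed rows, under the same assumptions:

    # managed/exclusive with automatic transitions: just the
    # exclusive-lock image feature; librbd acquires the lock and
    # hands it over between clients behind the scenes
    $ rbd feature enable rbd/test exclusive-lock

    # managed/exclusive without automatic transitions: map with the
    # --exclusive option, so this client refuses to release the lock
    # when another client requests it
    $ rbd map --exclusive rbd/test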
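
For the HA scenario, one possible shape of an unmanaged-lock failover
flow on the surviving node (the locker name client.4235 and the
client address are hypothetical placeholders):

    # find out who holds the lock; the output lists the locker
    # entity and its address
    $ rbd lock ls rbd/test

    # fence the dead client first so that its in-flight writes are
    # rejected ("ceph osd blacklist add" on pre-Pacific releases)
    $ ceph osd blocklist add 192.168.0.10:0/3214212345

    # break the stale lock, take it over, and map the image
    $ rbd lock rm rbd/test mylock client.4235
    $ rbd lock add rbd/test mylock
    $ rbd map rbd/test

Without the blocklist step, the first VM could resume writing as soon
as its network recovers, even with its lock removed -- these locks are
advisory and do not fence by themselves.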