Re: Fwd: question on rbd locks

That makes sense. Thanks Ilya.

On Mon, Apr 13, 2020 at 4:10 AM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:

> As Paul said, a lock is typically broken by a new client trying
> to grab it.  As part of that the existing lock holder needs to be
> blacklisted, unless you fence using some type of STONITH.
>
> The question of whether the existing lock holder is dead can't be
> answered in isolation.  For example, the exclusive-lock feature
> (automatic cooperative lock transitions to ensure that only a single
> client is writing to the image at any given time) uses watches.  If the
> existing lock holder has a watch, it is considered alive and the lock
> is requested cooperatively.  Otherwise, it is considered dead and the
> lock is broken.  This is implemented with care to avoid various corner
> cases related to watches and blacklisting: the client will not grab
> the lock without having a watch established, the client will update
> the lock cookie if the watch is lost and reestablished, the client
> will not use pre-blacklist osdmaps for any post-blacklist I/O, etc.
>
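As an illustration of the above, the lock holder and the watch can both be inspected from the command line; a minimal sketch, assuming a pool named rbd and an image named myimage (both placeholder names):

    # List the lock holder(s) recorded on the image header
    rbd lock ls rbd/myimage

    # List the image's watchers -- a live exclusive-lock holder keeps a watch
    rbd status rbd/myimage

If the address reported by "rbd lock ls" no longer appears as a watcher in "rbd status", the holder matches the "dead" case described above.
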
> Since you are grabbing locks manually in the orchestration layer,
> it is up to the orchestration layer to decide when (and how) to break
> them.  rbd can't make that decision for you -- consider a case where
> the device is alive and ready to serve I/O, but the workload is stuck
> for some other reason.
>
> Thanks,
>
>                 Ilya
>
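To make the manual procedure concrete, the sequence an orchestration layer would run once it has decided a holder is dead looks roughly like the sketch below (pool, image, lock id and address are placeholders; newer releases spell "blacklist" as "blocklist"):

    # 1. Find out who holds the lock and from which address
    rbd lock ls rbd/myimage
    #    example output:  client.4115  mylock  192.168.1.12:0/1234567890

    # 2. Fence the holder so it can no longer write to the cluster
    ceph osd blacklist add 192.168.1.12:0/1234567890

    # 3. Break the lock, using the lock id and locker reported in step 1
    rbd lock rm rbd/myimage mylock client.4115

Without step 2 the old holder could still have in-flight writes, which is exactly the STONITH/blacklisting point made above.
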
> On Sun, Apr 12, 2020 at 8:42 PM Void Star Nill <void.star.nill@xxxxxxxxx>
> wrote:
> >
> > Paul, Ilya, others,
> >
> > Any inputs on this?
> >
> > Thanks,
> > Shridhar
> >
> >
> > On Thu, 9 Apr 2020 at 12:30, Void Star Nill <void.star.nill@xxxxxxxxx>
> wrote:
> >>
> >> Thanks Ilya, Paul.
> >>
> >> I don't have the panic traces, and they are probably not related to rbd.
> I was merely describing our use case.
> >>
> >> On the setup that we manage, we have a software layer similar to
> Kubernetes CSI that orchestrates volume map/unmap on behalf of the users.
> We are currently using volume locks to protect the volumes from inadvertent
> concurrent write mounts, which could lead to FS corruption since most of the
> volumes run ext3/4.
> >>
> >> So in our orchestration, we take a shared lock on volumes that are
> mounted read-only, so that we can allow multiple concurrent read-only
> mounts, and we take an exclusive lock for read-write mounts so that we can
> reject other RO/RW mounts while the first RW mount is in use.
> >>
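For reference, taking these locks with the rbd CLI looks roughly as follows (image name, lock ids and tag are placeholders; an orchestration layer may use librbd/librados directly instead):

    # Read-only mount: take a shared lock; any number of clients may hold
    # shared locks as long as they pass the same tag
    rbd lock add --shared ro rbd/myimage reader-node01

    # Read-write mount: take an exclusive lock; this fails while any other
    # lock (shared or exclusive) is held
    rbd lock add rbd/myimage writer-node01

    # On unmount, release the lock (the locker comes from "rbd lock ls")
    rbd lock rm rbd/myimage writer-node01 client.4115
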
> >> All this orchestration happens in a distributed manner across all our
> compute nodes, so it is not easy to determine when we should kick out dead
> connections and claim the lock. For now we have to intervene manually to
> resolve such issues, so I am looking for a way to do this deterministically.
> >>
> >> Thanks,
> >> Shridhar
> >>
> >>
> >> On Wed, 8 Apr 2020 at 02:48, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> >>>
> >>> On Tue, Apr 7, 2020 at 6:49 PM Void Star Nill <
> void.star.nill@xxxxxxxxx> wrote:
> >>> >
> >>> > Hello All,
> >>> >
> >>> > Is there a way to specify that a lock (shared or exclusive) on an rbd
> >>> > volume be released if the client machine becomes unreachable or
> >>> > unresponsive?
> >>> >
> >>> > In one of our clusters, we use rbd locks on volumes to provide a kind
> of shared or exclusive access - to make sure there are no writers when
> someone is reading and there are no readers when someone is writing.
> >>> >
> >>> > However, we often run into issues when one of the machines goes into a
> kernel panic or a similar failure and the whole pipeline gets stalled.
> >>>
> >>> What kind of kernel panics are you running into?  Do you have any panic
> >>> messages or stack traces captured?
> >>>
> >>> Thanks,
> >>>
> >>>                 Ilya
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


