That makes sense. Thanks Ilya.

On Mon, Apr 13, 2020 at 4:10 AM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> As Paul said, a lock is typically broken by a new client trying
> to grab it. As part of that the existing lock holder needs to be
> blacklisted, unless you fence using some type of STONITH.
>
> The question of whether the existing lock holder is dead can't be
> answered in isolation. For example, the exclusive-lock feature
> (automatic cooperative lock transitions to ensure that only a single
> client is writing to the image at any given time) uses watches. If the
> existing lock holder has a watch, it is considered alive and the lock
> is requested cooperatively. Otherwise, it is considered dead and the
> lock is broken. This is implemented with care to avoid various corner
> cases related to watches and blacklisting: the client will not grab
> the lock without having a watch established, the client will update
> the lock cookie if the watch is lost and reestablished, the client
> will not use pre-blacklist osdmaps for any post-blacklist I/O, etc.
>
> Since you are grabbing locks manually in the orchestration layer,
> it is up to the orchestration layer to decide when (and how) to break
> them. rbd can't make that decision for you -- consider a case where
> the device is alive and ready to serve I/O, but the workload is stuck
> for some other reason.
>
> Thanks,
>
> Ilya
>
> On Sun, Apr 12, 2020 at 8:42 PM Void Star Nill <void.star.nill@xxxxxxxxx> wrote:
> >
> > Paul, Ilya, others,
> >
> > Any inputs on this?
> >
> > Thanks,
> > Shridhar
> >
> >
> > On Thu, 9 Apr 2020 at 12:30, Void Star Nill <void.star.nill@xxxxxxxxx> wrote:
> >>
> >> Thanks Ilya, Paul.
> >>
> >> I don't have the panic traces, and they are probably not related to
> >> rbd. I was merely describing our use case.
> >>
> >> On the setup that we manage, we have a software layer similar to the
> >> Kubernetes CSI that orchestrates volume map/unmap on behalf of the
> >> users. We currently use volume locks to protect the volumes from
> >> inadvertent concurrent write mounts, which could lead to FS corruption
> >> since most of the volumes run ext3/4.
> >>
> >> In our orchestration, we take a shared lock on volumes that are
> >> mounted read-only, so multiple concurrent read-only mounts are
> >> allowed, and we take an exclusive lock for read-write mounts so that
> >> we can reject any other RO/RW mounts while the first RW mount is in
> >> use.
> >>
> >> All this orchestration happens in a distributed manner across all our
> >> compute nodes, so it is not easy to determine when we should kick out
> >> dead connections and claim the lock. For now we have to intervene
> >> manually to resolve such issues, so I am looking for a way to do this
> >> deterministically.
> >>
> >> Thanks,
> >> Shridhar
> >>
> >>
> >> On Wed, 8 Apr 2020 at 02:48, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> >>>
> >>> On Tue, Apr 7, 2020 at 6:49 PM Void Star Nill <void.star.nill@xxxxxxxxx> wrote:
> >>> >
> >>> > Hello All,
> >>> >
> >>> > Is there a way to specify that a lock (shared or exclusive) on an rbd
> >>> > volume be released if the client machine becomes unreachable or
> >>> > unresponsive?
> >>> >
> >>> > In one of our clusters, we use rbd locks on volumes to provide a kind
> >>> > of shared or exclusive access - to make sure there are no writers
> >>> > when someone is reading and there are no readers when someone is
> >>> > writing.
> >>> >
> >>> > However, we often run into issues when one of the machines gets into
> >>> > a kernel panic or a similar failure and the whole pipeline stalls.
> >>>
> >>> What kind of kernel panics are you running into? Do you have any panic
> >>> messages or stack traces captured?
> >>>
> >>> Thanks,
> >>>
> >>> Ilya
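For reference, here is a minimal sketch of the shared-for-RO / exclusive-for-RW
scheme described in the thread, using the python-rbd advisory lock calls
(lock_shared / lock_exclusive). The ceph.conf path, pool and image names, the
shared-lock tag and the hostname-based cookie are assumptions for illustration
only, not details from the thread:

# Sketch only: CONF, SHARED_TAG and the hostname cookie are placeholders for
# whatever the orchestration layer actually uses.
import socket

import rados
import rbd

CONF = "/etc/ceph/ceph.conf"
SHARED_TAG = "ro-mounts"   # every shared (read-only) locker must use the same tag


def try_lock(pool, image_name, read_only):
    """Return True if the mount may proceed, False if it must be rejected."""
    cluster = rados.Rados(conffile=CONF)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        image = rbd.Image(ioctx, image_name)
        cookie = socket.gethostname()   # lets operators see which node holds the lock
        try:
            if read_only:
                # Any number of clients may hold shared locks carrying the same tag.
                image.lock_shared(cookie, SHARED_TAG)
            else:
                # Refused if any other shared or exclusive lock is already held.
                image.lock_exclusive(cookie)
            return True
        except (rbd.ImageBusy, rbd.ImageExists):
            return False   # someone else holds a conflicting lock
        finally:
            image.close()
            ioctx.close()
    finally:
        cluster.shutdown()

The unmount path would release the lock with image.unlock() using the same
cookie.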
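And a similarly hedged sketch of the reclaim path Ilya outlines: the
orchestration layer decides on its own (via its health checks, not by asking
rbd) that the holder is dead, then breaks the stale lock. list_lockers() and
break_lock() are the python-rbd calls involved; whether breaking the lock also
blacklists the old holder depends on librbd's rbd_blacklist_on_break_lock
option (enabled by default), so that behaviour is noted as an assumption in
the comments:

# Sketch only: CONF and the dead_cookie lookup are placeholders.
import rados
import rbd

CONF = "/etc/ceph/ceph.conf"


def break_stale_lock(pool, image_name, dead_cookie):
    """Break the lock whose cookie identifies a node we have declared dead."""
    cluster = rados.Rados(conffile=CONF)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        image = rbd.Image(ioctx, image_name)
        try:
            info = image.list_lockers()
            # list_lockers() returns an empty result when nothing is locked,
            # otherwise a dict whose 'lockers' entry is a list of
            # (client, cookie, address) tuples.
            lockers = info["lockers"] if info else []
            for client, cookie, _addr in lockers:
                if cookie == dead_cookie:
                    # With rbd_blacklist_on_break_lock enabled (the librbd
                    # default, as far as I know), breaking the lock should also
                    # blacklist the old holder so a half-dead node cannot keep
                    # writing.
                    image.break_lock(client, cookie)
        finally:
            image.close()
            ioctx.close()
    finally:
        cluster.shutdown()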