Re: [PATCH] rbd: prevent busy loop when requesting exclusive lock

On Wed, Aug 2, 2023 at 8:36 AM Dongsheng Yang
<dongsheng.yang@xxxxxxxxxxxx> wrote:
>
> Hi Ilya
>
> On Wed, 2023/8/2 at 6:22 AM, Ilya Dryomov wrote:
> > Due to rbd_try_acquire_lock() effectively swallowing all but
> > EBLOCKLISTED error from rbd_try_lock() ("request lock anyway") and
> > rbd_request_lock() returning ETIMEDOUT error not only for an actual
> > notify timeout but also when the lock owner doesn't respond, a busy
> > loop inside of rbd_acquire_lock() between rbd_try_acquire_lock() and
> > rbd_request_lock() is possible.
> >
> > Requesting the lock on EBUSY error (returned by get_lock_owner_info()
> > if an incompatible lock or invalid lock owner is detected) makes very
> > little sense.  The same goes for ETIMEDOUT error (might pop up pretty
> > much anywhere if osd_request_timeout option is set) and many others.
> >
> > Just fail I/O requests on rbd_dev->acquiring_list immediately on any
> > error from rbd_try_lock().
> >
> > Cc: stable@xxxxxxxxxxxxxxx # 588159009d5b: rbd: retrieve and check lock owner twice before blocklisting
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Ilya Dryomov <idryomov@xxxxxxxxx>
> > ---
> >   drivers/block/rbd.c | 28 +++++++++++++++-------------
> >   1 file changed, 15 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> > index 24afcc93ac01..2328cc05be36 100644
> > --- a/drivers/block/rbd.c
> > +++ b/drivers/block/rbd.c
> > @@ -3675,7 +3675,7 @@ static int rbd_lock(struct rbd_device *rbd_dev)
> >       ret = ceph_cls_lock(osdc, &rbd_dev->header_oid, &rbd_dev->header_oloc,
> >                           RBD_LOCK_NAME, CEPH_CLS_LOCK_EXCLUSIVE, cookie,
> >                           RBD_LOCK_TAG, "", 0);
> > -     if (ret)
> > +     if (ret && ret != -EEXIST)
> >               return ret;
> >
> >       __rbd_lock(rbd_dev, cookie);
>
> If we get -EEXIST here, we will call __rbd_lock() and return 0. -EEXIST
> means the lock is already held by me; is it necessary to call
> __rbd_lock() in that case?

Hi Dongsheng,

Yes, because the reason rbd_lock() gets called in the first place is
that the kernel client doesn't "know" that it's still holding the lock
in RADOS.  This can happen if the unlock operation times out, for
example.

Notice

        WARN_ON(__rbd_is_lock_owner(rbd_dev) ||
                rbd_dev->lock_cookie[0] != '\0');

at the top of rbd_lock().

Thanks,

                Ilya
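
The scenario described in the reply above can be sketched as a small
user-space model. This is only an illustration, not the kernel code: every
model_* name and both structs are hypothetical, the cookie is assumed to
stay the same across the two lock attempts, and returning -EBUSY for
a different holder is an assumption of the model. It shows why treating
-EEXIST as success and still recording ownership locally (the __rbd_lock()
step) brings the client's view back in line with the lock state in RADOS.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define COOKIE_LEN 32

/* Hypothetical stand-in for the lock record stored in RADOS. */
struct model_cluster {
	char holder_cookie[COOKIE_LEN];	/* "" means nobody holds the lock */
};

/* Hypothetical stand-in for the client's local view (cf. rbd_dev->lock_cookie). */
struct model_client {
	char lock_cookie[COOKIE_LEN];	/* "" means "I believe I am unlocked" */
};

static int model_is_lock_owner(const struct model_client *cl)
{
	return cl->lock_cookie[0] != '\0';
}

/*
 * Models the lock call outcomes discussed in the thread: 0 on a fresh
 * acquisition, -EEXIST if this cookie already holds the lock, and (an
 * assumption of this model) -EBUSY if a different cookie holds it.
 */
static int model_cls_lock(struct model_cluster *c, const char *cookie)
{
	if (c->holder_cookie[0] == '\0') {
		snprintf(c->holder_cookie, sizeof(c->holder_cookie), "%s", cookie);
		return 0;
	}
	if (strcmp(c->holder_cookie, cookie) == 0)
		return -EEXIST;
	return -EBUSY;
}

/* Models rbd_lock() with the patched check: -EEXIST is treated as success. */
static int model_rbd_lock(struct model_cluster *c, struct model_client *cl,
			  const char *cookie)
{
	int ret;

	/* Same precondition as the WARN_ON; the kernel warns, this model aborts. */
	assert(!model_is_lock_owner(cl));

	ret = model_cls_lock(c, cookie);
	if (ret && ret != -EEXIST)
		return ret;

	/* Models __rbd_lock(): record ownership locally. */
	snprintf(cl->lock_cookie, sizeof(cl->lock_cookie), "%s", cookie);
	return 0;
}

/* Models an unlock that timed out: local state is cleared, RADOS state is not. */
static void model_unlock_timed_out(struct model_client *cl)
{
	cl->lock_cookie[0] = '\0';
}
```

In this model, locking, calling model_unlock_timed_out(), and locking again
with the same cookie hits the -EEXIST branch. Without the local bookkeeping
step the client would keep believing it is unlocked while the cluster still
lists it as the holder.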



