Re: [PATCH 0/3] rbd: support timeout in waiting state locked

On Wed, Mar 21, 2018 at 7:20 AM, Dongsheng Yang
<dongsheng.yang@xxxxxxxxxxxx> wrote:
> On 03/20/2018 11:13 PM, Ilya Dryomov wrote:
>>
>> On Tue, Mar 20, 2018 at 11:32 AM, Dongsheng Yang
>> <dongsheng.yang@xxxxxxxxxxxx> wrote:
>>>
>>> Currently, operations will hang in the cases below, even if we set
>>> osd_request_timeout.
>>>
>>> (1) We set osd_request_timeout, and a write is waiting in
>>> rbd_wait_state_locked() when the ceph cluster becomes unreachable.
>>> rbd_acquire_lock() will then call rbd_try_acquire_lock() again and
>>> again, but the waiter in rbd_wait_state_locked() will never be
>>> woken up.
>>>
>>> (2) There is a mapping with exclusive, so that device will refuse to
>>> release the lock. If there is another mapping without exclusive, any
>>> write to that device will be blocked until the exclusive mapping is
>>> unmapped.
>>>
>>> To avoid hanging in these cases, this patch introduces an option
>>> named state_lock_timeout. If this option is set, we get -ETIMEDOUT
>>> when the timeout is reached rather than waiting forever. If it is
>>> not set, everything works as before.
>>
>> Hi Dongsheng,
>>
>> I think we should reuse ceph_options::mount_timeout instead of adding
>> a new option.  I realize it is not a proper rbd option, but rbd uses it
>> in a couple of places: waiting for latest osdmap on "rbd map" and
>> unwatch request on "rbd unmap".  Waiting for exclusive-lock, especially
>> given that "rbd map --exclusive" attempts to acquire the lock, seems like
>> a good fit.
>
> Hi Ilya,
>
>     Thanks for your reply. Yes, mount_timeout is used for the unwatch
> request on "rbd unmap", because we need to call ceph_osdc_unwatch() in
> the libceph module.
>
> That makes it confusing. But in the rbd module itself, we only use it
> when waiting for the osdmap on "rbd map", which seems to fit. If we
> reuse mount_timeout in rbd_wait_state_locked(), I am a little worried
> that it will become more and more confusing.
>
>     I mean, yes, it is not ideal now, but I don't want to make it
> worse.

Hi Dongsheng,

Right, reusing mount_timeout would not work because it defaults to 60
seconds.  The default for exclusive lock timeout should be "no timeout".
Sorry for misleading you.

In order to get this into 4.17, let's drop the "show option" patch as
the infrastructure for that is in flux -- we will worry about it later.
The remaining two patches can be merged into a single patch.

Thanks,

                Ilya