On Tue, Mar 20, 2018 at 11:32 AM, Dongsheng Yang
<dongsheng.yang@xxxxxxxxxxxx> wrote:
> Currently, operations will hang in the cases below, even if we set
> osd_request_timeout.
>
> (1). We set osd_request_timeout, and a write is waiting in
> rbd_wait_state_locked(). At this moment, the ceph cluster is not
> reachable. Then rbd_acquire_lock() will call rbd_try_acquire_lock()
> again and again, but rbd_wait_state_locked() will never be woken up.
>
> (2). There is a mapping with exclusive, so this device will refuse to
> release the lock. If there is another mapping without exclusive, any
> write to this device will be blocked until the exclusive mapping is
> unmapped.
>
> To avoid the hang in these cases, this patch introduces an option
> named state_lock_timeout. If we set this option, we will get
> -ETIMEDOUT when we reach the timeout rather than waiting forever.
> If this option is not set, everything works as before.

Hi Dongsheng,

I think we should reuse ceph_options::mount_timeout instead of adding
a new option.  I realize it is not a proper rbd option, but rbd uses it
in a couple of places: waiting for latest osdmap on "rbd map" and
unwatch request on "rbd unmap".  Waiting for exclusive-lock, especially
given that "rbd map --exclusive" attempts to acquire the lock, seems
like a good fit.

Thanks,

                Ilya
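
For illustration only, here is a minimal user-space sketch of the behaviour
under discussion: a bounded wait for the "locked" state that returns
-ETIMEDOUT instead of blocking forever.  It is not the rbd code.  It uses
POSIX threads in place of the kernel wait primitives, and struct lock_waiter,
lock_waiter_set_locked() and wait_state_locked() are hypothetical names.
Treating a timeout of 0 as "wait forever" is an assumption of this sketch.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical user-space stand-in for the device's lock state. */
struct lock_waiter {
	pthread_mutex_t mutex;
	pthread_cond_t cond;
	bool locked;		/* true once the exclusive lock is held */
};

#define LOCK_WAITER_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Record that the lock was acquired and wake every waiter. */
static void lock_waiter_set_locked(struct lock_waiter *w)
{
	pthread_mutex_lock(&w->mutex);
	w->locked = true;
	pthread_cond_broadcast(&w->cond);
	pthread_mutex_unlock(&w->mutex);
}

/*
 * Wait until the lock is held.  Returns 0 on success, or a negative
 * errno (-ETIMEDOUT when timeout_ms elapses first).  timeout_ms == 0
 * means no timeout (assumption of this sketch).
 */
static int wait_state_locked(struct lock_waiter *w, long timeout_ms)
{
	struct timespec deadline;
	int rc = 0;

	assert(w != NULL && timeout_ms >= 0);

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME time. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec += 1;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&w->mutex);
	while (!w->locked) {
		if (timeout_ms == 0)
			rc = pthread_cond_wait(&w->cond, &w->mutex);
		else
			rc = pthread_cond_timedwait(&w->cond, &w->mutex,
						    &deadline);
		if (rc != 0)
			break;	/* ETIMEDOUT or another error */
	}
	/* Re-check the state: the lock may have arrived at the deadline. */
	rc = w->locked ? 0 : -rc;
	pthread_mutex_unlock(&w->mutex);

	return rc;
}
```

With a single timeout value feeding wait_state_locked(), a writer blocked
behind an unreachable cluster or an exclusive mapping would see an error
after the deadline rather than hanging, which is the effect both the
proposed option and the mount_timeout suggestion are after.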