Down OSDs blocking read requests.

Hi,

Following up on the suggestion to use any of the following options to
mitigate the time clients spend blocked on requests:

- client_mount_timeout
- rados_mon_op_timeout
- rados_osd_op_timeout

Is there really no other way around this?

If two OSDs that between them hold both copies of an object go down,
it would be nice to have clients fail *immediately*.  I've tried
reducing the rados_osd_op_timeout setting to 0.5, but when things go
wrong, the cluster still collapses and takes all reads from it down
as well.
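
For anyone wanting to reproduce this, here is a minimal sketch of how
the three options could be applied from a client, assuming the
python-rados bindings are installed.  The option names are the ones
listed above; the helper names (build_client_conf, read_object), the
default values other than 0.5, and the conffile path are illustrative
assumptions, not what is actually deployed here.

```python
# Sketch only: option names come from this thread; the helper names,
# the mon/mount timeout values and the conffile path are assumptions.

def build_client_conf(osd_op_timeout=0.5, mon_op_timeout=5.0, mount_timeout=5.0):
    """Return librados option overrides as strings, which is what conf= expects."""
    return {
        "rados_osd_op_timeout": str(osd_op_timeout),
        "rados_mon_op_timeout": str(mon_op_timeout),
        "client_mount_timeout": str(mount_timeout),
    }


def read_object(pool, name, length=4096, conffile="/etc/ceph/ceph.conf"):
    """Read one object, letting librados give up once the timeouts expire."""
    # Imported lazily so build_client_conf() is usable without the bindings.
    import rados

    cluster = rados.Rados(conffile=conffile, conf=build_client_conf())
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            # Raises a rados error if the OSD op exceeds rados_osd_op_timeout.
            return ioctx.read(name, length)
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

The overrides are kept in one small function so the timeout can be
varied per client (for example build_client_conf(osd_op_timeout=0.05))
without editing ceph.conf.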

Reducing rados_osd_op_timeout further, down to 0.05, seems like a sure
way to cause more false positives.  But in reality, if an OSD operation
can't be served within 150ms, it has missed the train by over an hour.

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


