Re: latency when OSD falls out of cluster


 



Hi,

On 07/12/13 19:57, Edwin Peer wrote:
> Seconds of down time is quite severe, especially when it is a planned shut down or rejoining. I can understand if an OSD just disappears, that some requests might be directed to the now gone node, but I see similar latency hiccups on scheduled shut downs and rejoins too?

Have you tried reducing "osd recovery max active" and "osd max backfills" from their defaults? There are also some options to lower the priority of recovery.

The defaults give priority 63 to client ops and 10 to recovery ops, but they also set "osd recovery max active" to 5, with 2 I/O threads per OSD. I noticed that reducing "osd recovery max active" to 1 reduced the I/O latency penalty while recovery is active.
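
If you want to try this on a running cluster without restarting the OSDs, something like the following might work. Treat it as a sketch only: the function name throttle_recovery and the CEPH override variable are made up for illustration, I am assuming "osd max backfills" is the backfill option in question, and a value of 1 may not suit every cluster. Older releases may need a different "tell" syntax.

```shell
# Hypothetical helper: lower recovery/backfill concurrency on all OSDs at runtime.
# Set CEPH=echo to print the command instead of running it against a cluster.
throttle_recovery() {
    "${CEPH:-ceph}" tell 'osd.*' injectargs \
        '--osd-recovery-max-active 1 --osd-max-backfills 1'
}
```

Values injected this way are lost when an OSD restarts. To keep them, the same options would go in the [osd] section of ceph.conf.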

Reweighting an OSD to 0.9 should be enough to let you see how your cluster performs under recovery.
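
A rough sketch of that test, assuming the standard "ceph osd reweight" command; the function name reweight_osd and the CEPH override are hypothetical, and the OSD id is whatever OSD you pick:

```shell
# Hypothetical helper: reweight one OSD so a small share of its data moves.
# $1 is the numeric OSD id, $2 the weight between 0 and 1 (defaults to 0.9).
# Set CEPH=echo to print the command instead of running it.
reweight_osd() {
    "${CEPH:-ceph}" osd reweight "$1" "${2:-0.9}"
}
```

For example, "reweight_osd 3" would reweight osd.3 to 0.9, and "reweight_osd 3 1" would restore it afterwards. Watching "ceph -w" while the data moves shows the recovery progress alongside your client latency.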



