Repair/Rebalance slows down

Hi Everyone!

I have a 16-node, 640-OSD (5 to 1 SSD) BlueStore cluster which is mainly used for RGW services. It has its own backend cluster network for I/O, separate from the customer network.

Whenever we add or remove an OSD, the rebalance or repair I/O starts off very fast (4 GB/s+), but it slows down continually over a week, and by the end it is moving at KB/s. As a result, each 16 TB OSD takes a week or more to repair or rebalance! I have not been able to identify any bottleneck or slow point; it just seems to be Ceph taking longer to do its thing.

Are there any settings I can check or change to keep the repair speed high through to completion? If we could stay in the GB/s range, we should be able to repair in a couple of days, not a week or more...
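
For a rough sense of scale, here is a back-of-the-envelope sketch of the numbers above. It assumes decimal units (1 TB = 10^12 bytes), that a full 16 TB actually has to move for one OSD, and that "a week" means exactly 7 days; the function names are just for illustration, and none of this is measured output from the cluster.

```python
# Back-of-the-envelope recovery arithmetic (illustrative assumptions only).

TB = 10**12  # decimal terabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes


def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Time to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s


def average_rate(size_bytes: float, elapsed_seconds: float) -> float:
    """Average rate implied by moving size_bytes in elapsed_seconds."""
    return size_bytes / elapsed_seconds


osd_size = 16 * TB           # one 16 TB OSD, assumed full
initial_rate = 4 * GB        # the ~4 GB/s seen at the start
one_week = 7 * 24 * 3600     # seconds in 7 days

# If the initial rate were sustained for the whole job:
hours_at_full_speed = transfer_seconds(osd_size, initial_rate) / 3600

# Average rate implied by the job actually taking a week:
implied_mb_per_s = average_rate(osd_size, one_week) / MB

print(f"At a sustained 4 GB/s: {hours_at_full_speed:.1f} hours")
print(f"Average rate over a week: {implied_mb_per_s:.1f} MB/s")
```

Under those assumptions, a sustained 4 GB/s would move 16 TB in a little over an hour, while a one-week run works out to an average of roughly 26 MB/s, so most of the elapsed time is spent in the slow tail rather than the fast start.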

Thank you,
Ray
