Re: maximum rebuild speed for erasure coding pool

Den tors 9 maj 2019 kl 17:46 skrev Feng Zhang <prod.feng@xxxxxxxxx>:
Thanks, guys.

I forgot the IOPS. Since I have 100 disks, the total
IOPS = 100 × 100 = 10K. For the 4+2 erasure code, when one disk fails,
each rebuild needs to read 4 surviving shards and write 1 reconstructed
shard, so the whole 100 disks can do 10K/5 = 2K rebuild actions per second.

While for the 100 × 6TB disks, suppose the object size is set to 4MB,
then one disk holds 6TB/4MB = 1.5 million objects' worth of data. Not
considering disk throughput or CPUs, a full rebuild takes:

1.5M/2K = 750 seconds?
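The back-of-envelope model above can be sketched as a few lines of Python. The parameter values (100 disks, 100 IOPS each, 4+2 EC, 6TB drives, 4MB objects) are the thread's assumptions, and the model assumes the minimum of k=4 shard reads per reconstruction; it is an idealized upper bound, not a prediction.

```python
# Hedged back-of-envelope model of the best-case EC rebuild time
# discussed in this thread; every parameter is an assumption.

def rebuild_time_s(n_disks=100, iops_per_disk=100,
                   k=4, disk_tb=6, object_mb=4):
    """Best-case seconds to rebuild one failed disk in a k+m EC pool.

    Assumes each rebuild action reads k surviving shards and writes
    1 reconstructed shard, and that every disk contributes its full
    random IOPS to recovery (which a real cluster never does).
    """
    total_iops = n_disks * iops_per_disk             # 100 * 100 = 10K
    iops_per_rebuild = k + 1                         # 4 reads + 1 write
    rebuilds_per_s = total_iops / iops_per_rebuild   # 10K / 5 = 2K
    objects_on_disk = disk_tb * 1_000_000 // object_mb  # 6TB / 4MB = 1.5M
    return objects_on_disk / rebuilds_per_s

print(rebuild_time_s())  # 750.0 seconds in the ideal case
```

Any of the inputs can be swapped for your own numbers; the point of the sketch is only how quickly the idle-cluster ideal is computed, not that a real cluster will hit it.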

I think you will _never_ see a full cluster all helping out at 100% to fix such an issue,
so while your math probably describes the absolute best case correctly, reality will
be somewhere below that.

Still, it is quite possible to create this situation deliberately and measure it
under exactly your own circumstances, since everyone's setup is slightly different.
Replacing broken drives is normal for any large storage system, and ceph will
prioritize client traffic over repairs most of the time, so that will add to the
total calendar time recovery takes, but it keeps your users happy while doing it.
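That client-versus-recovery trade-off is tunable. A sketch of the usual knobs, assuming a Luminous-or-later cluster (option names and defaults vary by release, so check your version's docs before applying):

```shell
# Trade recovery speed against client I/O impact (Luminous+ option names).

# Allow more concurrent backfill operations per OSD (default is low, e.g. 1):
ceph tell 'osd.*' injectargs '--osd_max_backfills 2'

# Allow more concurrent recovery ops per OSD:
ceph tell 'osd.*' injectargs '--osd_recovery_max_active 4'

# Reduce the throttling sleep between recovery ops
# (faster rebuild, more impact on client latency):
ceph tell 'osd.*' injectargs '--osd_recovery_sleep 0'
```

Raising these moves you closer to the best-case math at the cost of client latency; the defaults deliberately sit well below it.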


--
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
