Hi,
there are many threads discussing recovery throughput. Have you tried
any of the solutions? The first thing to try is to increase
osd_recovery_max_active and osd_max_backfills. What are the current
values in your cluster?
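
In case it helps, here is a minimal sketch of how the two options could be
read and raised through the ceph CLI. The target values (4 and 8) and the
helper names are illustrative assumptions only, not recommendations for your
cluster. The script defaults to a dry run and only prints the commands. Note
also that on Quincy the mClock scheduler may manage these settings itself, so
a change might not take effect on every point release; please verify that
against your version.

```python
import shutil
import subprocess

# Options named in this thread. The target values are illustrative
# assumptions, not tuned recommendations for any particular cluster.
RECOVERY_OPTIONS = {
    "osd_max_backfills": 4,
    "osd_recovery_max_active": 8,
}


def get_cmd(option):
    """Build the command that reads an OSD-level option from the config database."""
    return ["ceph", "config", "get", "osd", option]


def set_cmd(option, value):
    """Build the command that sets an OSD-level option for all OSDs."""
    return ["ceph", "config", "set", "osd", option, str(value)]


def run(cmd, dry_run=True):
    """Print the command; execute it only if dry_run is False and ceph is installed."""
    print("$ " + " ".join(cmd))
    if dry_run or shutil.which("ceph") is None:
        return None
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    for option, value in RECOVERY_OPTIONS.items():
        run(get_cmd(option))         # show the current value
        run(set_cmd(option, value))  # raise it (dry run by default)
```

Calling run(..., dry_run=False) on a node with an admin keyring would actually
apply the change. Raising the values in small steps while watching client
latency is probably safer than one large jump.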
Quoting Monish Selvaraj <monish@xxxxxxxxxxxxxxx>:
Hi,
Our Ceph cluster consists of 20 hosts and 240 OSDs.
We use an erasure-coded pool with a cache pool.
Some time ago 2 hosts went down and the PGs went into a degraded state. We got
the 2 hosts back up after some time. After that the PGs started recovering, but
it is taking a long time (months). While this was happening the cluster held
664.4 M objects and 987 TB of data. The recovery status has not changed;
88 PGs remain degraded.
During this period, we increased the number of PGs from 256 to 512 for the
data pool (the erasure-coded pool).
We also observed over one week that recovery is very slow; the current
recovery rate is around 750 MiB/s.
Is there any way to increase this recovery throughput?
Ceph version: Quincy
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx