Re: Increase the recovery throughput

Eugen Block <eblock@xxxxxx> · Mon, 12 Dec 2022 10:16:47 +0000

Hi,

there are many threads discussing recovery throughput; have you tried
any of the solutions? The first thing to try is to increase
osd_recovery_max_active and osd_max_backfills. What are the current
values in your cluster?
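
For reference, a minimal sketch of how the two options could be inspected
and raised through the `ceph config` CLI. The target values (4 and 8) are
illustrative assumptions only, not recommendations, and the helper names
are made up for this example. Note that on Quincy the default mClock
scheduler may manage these limits itself, so the values might not take
effect as-is; please check the documentation for your release.

```python
import shutil
import subprocess

# The two options named above. The values are illustrative assumptions,
# not tuned recommendations; pick values that suit your hardware.
RECOVERY_OPTIONS = {
    "osd_max_backfills": 4,
    "osd_recovery_max_active": 8,
}


def get_commands(options=RECOVERY_OPTIONS):
    """Build one `ceph config get osd <option>` command per option."""
    return [["ceph", "config", "get", "osd", name] for name in options]


def set_commands(options=RECOVERY_OPTIONS):
    """Build one `ceph config set osd <option> <value>` command per option."""
    return [
        ["ceph", "config", "set", "osd", name, str(value)]
        for name, value in options.items()
    ]


def run(commands):
    """Run the commands if the ceph CLI is on PATH, otherwise just print them."""
    if shutil.which("ceph") is None:
        for cmd in commands:
            print("would run:", " ".join(cmd))
        return
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(get_commands())    # show the current values first
    # run(set_commands())  # uncomment to apply the new values
```

Reading the current values first makes it easy to revert once the
recovery has finished.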

Quoting Monish Selvaraj <monish@xxxxxxxxxxxxxxx>:

Hi,

Our Ceph cluster consists of 20 hosts and 240 OSDs.

We use an erasure-coded pool together with a cache pool.

Some time ago 2 hosts went down and the PGs went into a degraded state. We
got the 2 hosts back up some time later. After that the PGs started
recovering, but recovery is taking a long time (months). While this was
happening the cluster held 664.4 M objects and 987 TB of data. The recovery
status has not changed; 88 PGs remain degraded.

During this period, we increased the number of PGs from 256 to 512 for the
data pool (the erasure-coded pool).

We have also observed (over one week) that recovery is very slow; the
current recovery rate is around 750 Mibs.

Is there any way to increase this recovery throughput?

*Ceph version: Quincy*


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx