Hi Monish,

You might also want to check the values of osd_recovery_sleep_* in case they are not at their defaults.

Regards,
Frédéric.

----- On 12 Dec 22, at 11:32, Monish Selvaraj monish@xxxxxxxxxxxxxxx wrote:

> Hi Eugen,
>
> We tried that already. osd_max_backfills is set to 24 and
> osd_recovery_max_active is set to 20.
>
> On Mon, Dec 12, 2022 at 3:47 PM Eugen Block <eblock@xxxxxx> wrote:
>
>> Hi,
>>
>> there are many threads discussing recovery throughput, have you tried
>> any of the solutions? The first thing to try is to increase
>> osd_recovery_max_active and osd_max_backfills. What are the current
>> values in your cluster?
>>
>>
>> Quoting Monish Selvaraj <monish@xxxxxxxxxxxxxxx>:
>>
>> > Hi,
>> >
>> > Our Ceph cluster consists of 20 hosts and 240 OSDs.
>> >
>> > We use an erasure-coded pool with a cache pool in front of it.
>> >
>> > Some time back, 2 hosts went down and the PGs became degraded. We got
>> > the 2 hosts back up after a while, and the PGs then started recovering,
>> > but it is taking a very long time (months). While this was happening the
>> > cluster held 664.4 M objects and 987 TB of data. The recovery status has
>> > not changed; it remains at 88 pgs degraded.
>> >
>> > During this period, we increased the pg_num of the data pool (the
>> > erasure-coded pool) from 256 to 512.
>> >
>> > We have also observed (over one week) that recovery is very slow,
>> > currently around 750 MiB/s.
>> >
>> > Is there any way to increase this recovery throughput?
>> >
>> > *Ceph version: Quincy*
>> >
>> > [image: image.png]
>>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
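
For reference, a rough sketch of how those values can be checked and adjusted at runtime with the ceph CLI (osd.0 below is just an example daemon; the exact defaults and the effect of these knobs should be verified against your release's documentation, and setting the sleep values to 0 removes the recovery throttle entirely, so keep an eye on client latency afterwards):

  # centrally configured values (defaults are 0 for SSD and small
  # non-zero values for HDD/hybrid; check the docs for your release)
  ceph config get osd osd_recovery_sleep_hdd
  ceph config get osd osd_recovery_sleep_ssd
  ceph config get osd osd_recovery_sleep_hybrid

  # what one OSD is actually running with right now
  ceph config show osd.0 | grep -E 'recovery_sleep|max_backfills|recovery_max_active'

  # relax the throttles cluster-wide (applies to all OSDs)
  ceph config set osd osd_recovery_sleep_hdd 0
  ceph config set osd osd_recovery_sleep_hybrid 0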