Re: Increase number of objects in flight during recovery

胡玮文 <huww98@xxxxxxxxxxx> · Thu, 3 Dec 2020 11:35:03 +0000

Hi,

There is an “OSD recovery priority” dialog box in the web dashboard. The configuration options it changes include:

osd_max_backfills
osd_recovery_max_active
osd_recovery_max_single_start
osd_recovery_sleep

Tuning these options may help. The “High” priority setting corresponds to the values 4, 4, 4, 0, respectively. Some of these options also have _ssd/_hdd variants.
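
For anyone who prefers the command line to the dashboard, here is a minimal sketch that prints the `ceph config set` commands for the “High” values listed above. The helper name `config_set_commands` and the target `osd` (all OSDs) are my own choices for illustration, and I am assuming a release with the centralized config database. The script only prints the commands; it does not run them.

```python
import shlex

# "High" profile values as quoted in this message (4, 4, 4, 0).
HIGH_PRIORITY = {
    "osd_max_backfills": 4,
    "osd_recovery_max_active": 4,
    "osd_recovery_max_single_start": 4,
    "osd_recovery_sleep": 0,
}


def config_set_commands(options, who="osd"):
    """Return one `ceph config set` command line per option.

    `who` is the config target; "osd" applies to all OSDs (an assumption,
    narrow it to e.g. "osd.12" to try the values on a single daemon first).
    """
    return [
        shlex.join(["ceph", "config", "set", who, name, str(value)])
        for name, value in options.items()
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in config_set_commands(HIGH_PRIORITY):
        print(cmd)
```

If a device-class variant such as osd_recovery_sleep_ssd is set on your release, it may take precedence over the generic option for OSDs of that class, so it is worth checking the effective value with `ceph config show osd.<id>` after the change.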

> On Dec 3, 2020, at 17:11, Frank Schilder <frans@xxxxxx> wrote:
> 
> Hi all,
> 
> I have the opposite problem to the one discussed in "slow down keys/s in recovery". I need to increase the number of objects in flight during rebalance. All remapped PGs are already in state backfilling, but it looks like no more than 8 objects/sec are transferred per PG at a time. The pool sits on high-performance SSDs and could easily handle a transfer of 100 or more objects/sec simultaneously. Is there any way to increase the number of transfers/sec or simultaneous transfers? Increasing the options osd_max_backfills and osd_recovery_max_active has no effect.
> 
> Background: The pool in question (con-fs2-meta2) is the default data pool of a ceph fs, which stores exclusively the kind of metadata that goes into this pool. Storage consumption is reported as 0, but the number of objects is huge:
> 
>    NAME                     ID     USED        %USED     MAX AVAIL     OBJECTS   
>    con-fs2-meta1            12     216 MiB      0.02       933 GiB      13311115 
>    con-fs2-meta2            13         0 B         0       933 GiB     118389897 
>    con-fs2-data             14     698 TiB     72.15       270 TiB     286826739 
> 
> Unfortunately, there were no recommendations on dimensioning PG numbers for this pool, so I used the same PG count for con-fs2-meta1 and con-fs2-meta2. In hindsight, this was potentially a bad idea: the meta2 pool should have a much higher PG count or a much more aggressive recovery policy.
> 
> I now need to rebalance PGs on meta2 and it is going way too slow compared with the performance of the SSDs it is located on. In a way, I would like to keep the PG count where it is, but increase the recovery rate for this pool by a factor of 10. Please let me know what options I have.
> 
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx