Re: backfilling kills rbd performance

Hi,

On 2022-11-19 17:32, Anthony D'Atri wrote:
> I’m not positive that the options work with hyphens in them.  Try
>
> ceph tell osd.* injectargs '--osd_max_backfills 1
> --osd_recovery_max_active 1 --osd_recovery_max_single_start 1
> --osd_recovery_op_priority=1'

Did so.
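
To double-check that the values actually took effect, a single daemon can be queried, for example (osd.0 is just an arbitrary example):

ceph tell osd.0 config get osd_max_backfills
ceph tell osd.0 config get osd_recovery_max_active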

> With Quincy the following should already be set, but to be sure:
>
> ceph tell osd.* config set osd_op_queue_cut_off high

Did so too, and I even restarted all OSDs as recommended.
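
For reference, the running value can be checked on a single daemon with something like (again, osd.0 just as an example):

ceph config show osd.0 osd_op_queue_cut_off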

I then stopped a single OSD in order to cause some backfilling.
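
Roughly along these lines, assuming the usual systemd units and with <id> standing in for the OSD in question (or one can simply wait for it to be marked out automatically):

systemctl stop ceph-osd@<id>
ceph osd out <id>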

> What is network saturation like on that 1GE replication network?

Typically 100% saturated.
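
Easy to confirm on the OSD nodes with e.g. sar -n DEV 1 or iftop on the replication interface.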

> Operations like yours that cause massive data movement could easily
> saturate a pipe that narrow.

Sure, but I am used to other setups where recovery can be slowed down in order to keep the RBDs usable.

To me it looks like all backfilling happens in parallel, without any pauses in between that would benefit the client traffic.

I would expect some of those PGs to be in the active+undersized+degraded+remapped+backfill_wait state instead of backfilling.

2022-11-19T16:58:50.139390+0000 mgr.pve-02 (mgr.18134134) 61735 : cluster [DBG] pgmap v60978: 576 pgs: 102 active+undersized+degraded+remapped+backfilling, 474 active+clean; 2.4 TiB data, 4.3 TiB used, 10 TiB / 15 TiB avail; 150 KiB/s wr, 10 op/s; 123337/1272524 objects degraded (9.692%); 228 MiB/s, 58 objects/s recovering
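
For completeness, the per-PG states can be counted directly, for example:

ceph pg dump pgs_brief | grep -c backfilling
ceph pg dump pgs_brief | grep -c backfill_wait

The pgmap line above already shows the same picture: 102 PGs backfilling at once and none waiting.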

Is this Quincy-specific?

Regards
--martin
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



