For reference, here are my settings:

  osd  class:hdd       advanced  osd_recovery_sleep       0.050000
  osd  class:rbd_data  advanced  osd_recovery_sleep       0.025000
  osd  class:rbd_meta  advanced  osd_recovery_sleep       0.002500
  osd  class:ssd       advanced  osd_recovery_sleep       0.002500
  osd                  advanced  osd_recovery_sleep       0.050000
  osd  class:hdd       advanced  osd_max_backfills        3
  osd  class:rbd_data  advanced  osd_max_backfills        6
  osd  class:rbd_meta  advanced  osd_max_backfills        12
  osd  class:ssd       advanced  osd_max_backfills        12
  osd                  advanced  osd_max_backfills        3
  osd  class:hdd       advanced  osd_recovery_max_active  8
  osd  class:rbd_data  advanced  osd_recovery_max_active  16
  osd  class:rbd_meta  advanced  osd_recovery_max_active  32
  osd  class:ssd       advanced  osd_recovery_max_active  32
  osd                  advanced  osd_recovery_max_active  8

rbd_data is a device class for low-performance SATA SSDs; ssd and rbd_meta are on small high-performance SSDs. For these, the sleep setting leaves about 2/3 of the performance to client IO. (A sketch of the commands for applying such per-class settings is at the bottom of this message.)

Interestingly, I have never seen the max_active setting have any effect. Can anyone else report on what this setting actually changes in practice (as opposed to what the documentation says)?

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Steven Pine <steven.pine@xxxxxxxxxx>
Sent: 12 August 2021 17:45:48
To: Peter Lieven
Cc: Frank Schilder; Nico Schottelius; Ceph Users
Subject: Re: Re: Very slow I/O during rebalance - options to tune?

Yes, osd_op_queue_cut_off makes a significant difference. And as mentioned, be sure to check your OSD recovery sleep settings; there are several, depending on your underlying drives:

  "osd_recovery_sleep": "0.000000",
  "osd_recovery_sleep_hdd": "0.050000",
  "osd_recovery_sleep_hybrid": "0.050000",
  "osd_recovery_sleep_ssd": "0.050000",

Adjusting these upwards will dramatically reduce recovery I/O, and the change takes effect immediately, at the cost of slowing rebalance/recovery.

Sincerely,

On Thu, Aug 12, 2021 at 11:36 AM Peter Lieven <pl@xxxxxxx> wrote:

On 12.08.21 at 17:25, Frank Schilder wrote:
>> Wow, that is impressive and sounds opposite of what we see around
>> here. Often rebalances directly and strongly impact client I/O.
> It might be the missing settings:
>
> osd_op_queue = wpq
> osd_op_queue_cut_off = high

Afaik, the default for osd_op_queue_cut_off was set to low by mistake prior to Octopus.

Peter

--
Steven Pine
E steven.pine@xxxxxxxxxx | P 516.938.4100 x
Webair | 501 Franklin Avenue Suite 200, Garden City NY, 11530
webair.com
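For completeness, the per-class values at the top of this message are applied through the centralized config database and its class: masks. Below is a minimal sketch, assuming a Mimic-or-later cluster; rbd_data and rbd_meta are custom device classes specific to the cluster above (only hdd/ssd/nvme are assigned automatically), and osd.0 is just an example id:

  # Unmasked defaults for all OSDs (the lines without a class: mask above):
  ceph config set osd osd_recovery_sleep 0.05
  ceph config set osd osd_max_backfills 3
  ceph config set osd osd_recovery_max_active 8

  # Per-device-class overrides; a masked value takes precedence over
  # the unmasked default for OSDs of that CRUSH device class:
  ceph config set osd/class:hdd osd_recovery_sleep 0.05
  ceph config set osd/class:ssd osd_recovery_sleep 0.0025
  ceph config set osd/class:ssd osd_max_backfills 12
  ceph config set osd/class:ssd osd_recovery_max_active 32

  # The queue settings from the thread above; afaik both only take
  # effect after an OSD restart:
  ceph config set osd osd_op_queue wpq
  ceph config set osd osd_op_queue_cut_off high

  # Verify what a given OSD actually uses:
  ceph config get osd.0 osd_recovery_sleep
  ceph daemon osd.0 config get osd_op_queue_cut_off   # run on the OSD's host

Because the class: mask matches each OSD's CRUSH device class, the masked values follow the drive type automatically as OSDs are added; `ceph config dump` then shows the effective table in the same layout as quoted at the top of this message.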
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx