Hi Denis,

We have had the same issue with mclock, and we now switch back to WPQ when draining nodes. I couldn't identify the cause of the slow ops. Even with custom mclock tuning and 1 backfill per OSD, it was still causing client IO issues.

From: Denis Polom <denispolom@xxxxxxxxx>
Date: Tuesday, 17 September 2024 at 8:16 pm
To: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: [EXT] mclock scheduler kills clients IOs

Hi guys,

we have a Ceph cluster on Quincy 17.2.7 where we are draining 3 hosts (one from each failure domain). Each host has 13 HDDs, and there are another 38 hosts of the same size in each failure domain, so there is plenty of free space.

I've set primary-affinity to 0 on the drained OSDs and marked the OSDs out. The mclock profile is:

  osd_mclock_profile = high_client_ops

but soon after that I got slow ops on some of the OSDs, and clients saw high iowait until I stopped the recovery. It's strange to me that clients are still trying to use OSDs whose primary-affinity is set to 0.

Here is the case of osd.15 with primary-affinity 0:

  REQUESTS 4 homeless 0
  8337634  osd15  16.35734e3c  16.e3cs0  [1100,1444,1509,429,540,1106]/1100  [15,1444,403,429,540,1106]/15  e1562924  100087a5278.00000004  0x400024  1  write
  8358040  osd15  16.ace50e3c  16.e3cs0  [1100,1444,1509,429,540,1106]/1100  [15,1444,403,429,540,1106]/15  e1562924  100087a5a4e.00000007  0x400024  1  write
  8438861  osd15  16.a7b80e3c  16.e3cs0  [1100,1444,1509,429,540,1106]/1100  [15,1444,403,429,540,1106]/15  e1562924  100087a7c6a.0000001f  0x400024  1  write
  8443242  osd15  16.de504e3c  16.e3cs0  [1100,1444,1509,429,540,1106]/1100  [15,1444,403,429,540,1106]/15  e1562924  100087a7d76.00000014  0x400024  1  write
  LINGER REQUESTS
  BACKOFFS

Recovery speed is also not high (about 4 GiB) and the network is not saturated.

I've also tried the following custom profile with some other custom settings:

  osd  advanced  osd_mclock_override_recovery_settings         true
  osd  advanced  osd_async_recovery_min_cost                   100
  osd  advanced  osd_max_backfills                             1
  osd  advanced  osd_mclock_profile                            custom
  osd  advanced  osd_mclock_scheduler_background_recovery_res  0.100000
  osd  advanced  osd_mclock_scheduler_client_res               0.900000
  osd  advanced  osd_mclock_scheduler_client_wgt               6
  osd  advanced  osd_recovery_max_active                       1
  osd  advanced  osd_recovery_max_chunk                        8388608
  osd  advanced  osd_recovery_op_priority                      1

but the result is the same: clients immediately see high iowait and can't do any IO. This didn't happen on this cluster before, when we were using the wpq scheduler.

My question is: are there other options I need to set? With wpq I was able to set osd_recovery_sleep, but it doesn't look like that option can be changed with mclock.

Any ideas, please?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
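
For anyone who wants to try the same workaround, switching the scheduler back to wpq is just the osd_op_queue option plus an OSD restart, since the scheduler is only read at OSD start-up. A minimal sketch; the restart command assumes a cephadm-managed cluster, so adjust it for your deployment:

  # switch the op queue scheduler back to wpq (mclock_scheduler is the default since Quincy)
  ceph config set osd osd_op_queue wpq
  ceph config get osd osd_op_queue

  # the setting only takes effect after the OSDs restart
  ceph orch restart osd    # cephadm-managed clusters; otherwise restart the ceph-osd services per host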
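
The draining preparation Denis describes would look roughly like this for a single OSD, e.g. osd.15 from the dump above (a sketch, not his exact commands):

  # stop the OSD from being chosen as primary, then mark it out so data migrates off it
  ceph osd primary-affinity osd.15 0
  ceph osd out osd.15

One likely reason osd.15 still shows up in the write requests above is that primary-affinity only influences which replica acts as primary; writes still have to reach every OSD in the acting set.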
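
For completeness, the custom settings listed above are normally applied with ceph config set; a sketch using the values from that listing (osd_mclock_override_recovery_settings must be true before mclock lets you change the backfill/recovery limits):

  ceph config set osd osd_mclock_profile custom
  ceph config set osd osd_mclock_override_recovery_settings true
  ceph config set osd osd_mclock_scheduler_client_res 0.9
  ceph config set osd osd_mclock_scheduler_client_wgt 6
  ceph config set osd osd_mclock_scheduler_background_recovery_res 0.1
  ceph config set osd osd_max_backfills 1
  ceph config set osd osd_recovery_max_active 1
  ceph config set osd osd_recovery_max_chunk 8388608
  ceph config set osd osd_recovery_op_priority 1
  ceph config set osd osd_async_recovery_min_cost 100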
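
Regarding the last point: the osd_recovery_sleep_* options are ignored while mclock_scheduler is active, so they only help after switching back to wpq. A sketch of the kind of throttling Denis mentions (the 0.1 value is just an example, not a recommendation):

  # per-op recovery sleep for HDD OSDs; only honoured by the wpq scheduler
  ceph config set osd osd_recovery_sleep_hdd 0.1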