Hi,
yes, the mclock scheduler doesn't look stable and ready for a production
Ceph cluster. I just switched back to wpq and everything runs smoothly.
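In case it helps anyone else, a minimal sketch of the switch (the restart
step depends on how the OSDs are deployed):

    ceph config set osd osd_op_queue wpq
    # then restart the OSD daemons, e.g. on package-based installs:
    #   systemctl restart ceph-osd.target    (one host at a time)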
thx!
On 17. 09. 24 13:14, Justin Mammarella wrote:
Hi Denis,
We have had the same issue with MClock, and we now switch back to WPQ
when draining nodes.
I couldn't identify the cause of the slow ops. Even with custom mclock
tuning and 1 backfill per OSD, it was still causing client I/O issues.
*From: *Denis Polom <denispolom@xxxxxxxxx>
*Date: *Tuesday, 17 September 2024 at 8:16 pm
*To: *ceph-users@xxxxxxx <ceph-users@xxxxxxx>
*Subject: *[EXT] mclock scheduler kills clients IOs
Hi guys,
we have a Ceph cluster running Quincy 17.2.7 where we are draining 3 hosts
(one from each failure domain). Each host has 13 HDDs and there are another
38 hosts of the same size in each failure domain. There is a lot of free
space.
I've set primary-affinity on the drained OSDs to 0 and marked the OSDs out.
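For completeness, per OSD that looks like this (osd.15 used as the example
id since it appears below; the same was done for all drained OSDs):

    ceph osd primary-affinity osd.15 0
    ceph osd out 15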
mclock profile is:
osd_mclock_profile = high_client_ops
but soon after that I got slow ops on some of the OSDs, and clients got
high iowaits until I stopped the recovery.
It's strange to me that clients are still trying to use OSDs whose
primary-affinity is set to 0.
Here is the case of osd.15 with primary-affinity 0:
REQUESTS 4 homeless 0
8337634 osd15 16.35734e3c 16.e3cs0 [1100,1444,1509,429,540,1106]/1100 [15,1444,403,429,540,1106]/15 e1562924 100087a5278.00000004 0x400024 1 write
8358040 osd15 16.ace50e3c 16.e3cs0 [1100,1444,1509,429,540,1106]/1100 [15,1444,403,429,540,1106]/15 e1562924 100087a5a4e.00000007 0x400024 1 write
8438861 osd15 16.a7b80e3c 16.e3cs0 [1100,1444,1509,429,540,1106]/1100 [15,1444,403,429,540,1106]/15 e1562924 100087a7c6a.0000001f 0x400024 1 write
8443242 osd15 16.de504e3c 16.e3cs0 [1100,1444,1509,429,540,1106]/1100 [15,1444,403,429,540,1106]/15 e1562924 100087a7d76.00000014 0x400024 1 write
LINGER REQUESTS
BACKOFFS
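(That listing looks like the kernel client's osdc debugfs output; assuming
a kernel-mounted client, it can be read with:

    cat /sys/kernel/debug/ceph/*/osdc
)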
Recovery speed is also not high (about 4 GiB) and the network is not saturated.
I've also tried the following custom profile with other custom settings:
osd advanced osd_mclock_override_recovery_settings true
osd advanced osd_async_recovery_min_cost 100
osd advanced osd_max_backfills 1
osd advanced osd_mclock_profile custom
osd advanced osd_mclock_scheduler_background_recovery_res 0.100000
osd advanced osd_mclock_scheduler_client_res 0.900000
osd advanced osd_mclock_scheduler_client_wgt 6
osd advanced osd_recovery_max_active 1
osd advanced osd_recovery_max_chunk 8388608
osd advanced osd_recovery_op_priority 1
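The same settings expressed as ceph config set commands, in case that form
is easier to read (values are exactly those from the dump above):

    ceph config set osd osd_mclock_profile custom
    ceph config set osd osd_mclock_override_recovery_settings true
    ceph config set osd osd_mclock_scheduler_client_res 0.900000
    ceph config set osd osd_mclock_scheduler_client_wgt 6
    ceph config set osd osd_mclock_scheduler_background_recovery_res 0.100000
    ceph config set osd osd_async_recovery_min_cost 100
    ceph config set osd osd_max_backfills 1
    ceph config set osd osd_recovery_max_active 1
    ceph config set osd osd_recovery_max_chunk 8388608
    ceph config set osd osd_recovery_op_priority 1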
but the result is the same. Clients immediately get high iowaits and can't
do any I/O.
This didn't happen on this cluster before, when we were using the wpq
scheduler.
My question is - are there some other options I need to set?
With wpq I was able to set osd_recovery_sleep, but this option does not
seem to be changeable with mclock.
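Under wpq that was simply, for example:

    ceph config set osd osd_recovery_sleep_hdd 0.1

whereas, as far as I understand, mclock disregards the osd_recovery_sleep_*
options entirely.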
Any ideas, please?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx