This is interesting, and it arrived minutes after I had replaced an HDD
OSD (with NVMe DB/WAL) in a small cluster. With the three mClock profiles I
was only seeing backfill rates of around 6-8 objects/second
(high_client_ops), 9-12 (balanced), and 12-15 (high_recovery_ops). There
was only a very light client load.
With a bit of ad-hoc experimenting I ended up setting
osd_mclock_cost_per_byte_usec_hdd to 0.4, which seemed to give about a
7-8 times increase in objects/second for each profile, without unduly
affecting client response when I used high_client_ops. Values lower than
0.4 sped up backfill further, but disk saturation started to become an issue.
These are only very rough-and-ready figures, but they do suggest
something rather wrong in the calculations.
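
For anyone wanting to try the same override, here is a minimal sketch of how the command could be assembled. It assumes the usual `ceph config set <who> <option> <value>` form and a release that still has this option. The helper name `mclock_override_cmd` is made up for illustration, and the script only prints the command rather than running it.

```python
import shlex


def mclock_override_cmd(cost_usec_per_byte: float, who: str = "osd") -> list[str]:
    """Build the argv for a `ceph config set` call that overrides the
    HDD per-byte cost. Hypothetical helper; it does not run anything."""
    if cost_usec_per_byte <= 0:
        raise ValueError("cost per byte must be positive")
    return [
        "ceph", "config", "set", who,
        "osd_mclock_cost_per_byte_usec_hdd", str(cost_usec_per_byte),
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(mclock_override_cmd(0.4)))
```

Printing first makes it easy to check the value before applying it, since settings much below 0.4 started to saturate the disk in my test.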
Chris
On 16/05/2023 19:07, 胡 玮文 wrote:
Hi Sake,
We are experiencing the same thing. I set “osd_mclock_cost_per_byte_usec_hdd” to 0.1 (the default is 2.6) and got about 15 times the backfill speed, without significantly affecting client IO. The default for this parameter seems to have been calculated wrongly: going by its description, 5e-3 would be a reasonable value for an HDD (corresponding to 200 MB/s). I noticed that the default was originally 5.2 and was later changed to 2.6 to increase recovery speed. So I suspect the original author simply got the unit conversion wrong, and may have intended 5.2e-3 but wrote 5.2 in the code.
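
The unit arithmetic behind that suspicion can be checked with a short sketch. It assumes the parameter is simply the time, in microseconds, needed to transfer one byte at the drive's sequential throughput, and that 1 MB = 10^6 bytes. The function names are illustrative and not part of Ceph.

```python
def cost_per_byte_usec(bandwidth_mb_per_s: float) -> float:
    """Microseconds needed to transfer one byte at the given throughput."""
    bytes_per_s = bandwidth_mb_per_s * 1e6
    return 1e6 / bytes_per_s  # (usec per second) / (bytes per second)


def implied_bandwidth_mb_per_s(cost_usec: float) -> float:
    """Throughput in MB/s implied by a given per-byte cost in microseconds."""
    bytes_per_s = 1e6 / cost_usec
    return bytes_per_s / 1e6


if __name__ == "__main__":
    print(f"200 MB/s -> {cost_per_byte_usec(200):.4f} usec/byte")
    # Values mentioned in this thread: old default, current default, overrides.
    for cost in (5.2, 2.6, 0.4, 0.1, 5.2e-3):
        print(f"{cost} usec/byte -> {implied_bandwidth_mb_per_s(cost):.3f} MB/s")
```

Under these assumptions, 200 MB/s gives 0.005 usec/byte, while the 2.6 default would imply a drive that moves well under 1 MB/s, which is consistent with a factor-of-1000 unit slip.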
But all of this may not matter in the next version: I see that the relevant code has been rewritten and this parameter has been removed.
The high_recovery_ops profile works very poorly for us. It increases the average latency of client IO from 50ms to about 1s.
Weiwen Hu
On 16 May 2023, at 19:16, Sake Paulusma <sake1989@xxxxxxxxxxx> wrote:
We noticed extremely slow performance when remapping is necessary. We didn't do anything special other than assigning the correct device_class (ssd). When checking ceph status (with watch -n 1 -c ceph status), we see that only around 17-25 objects per second are recovering.
How can we speed up the recovery process?
There isn't any client load, because we're going to migrate to this cluster in the future, so only an rsync is executed once in a while.
[ceph: root@pwsoel12998 /]# ceph status
  cluster:
    id:     da3ca2e4-ee5b-11ed-8096-0050569e8c3b
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set

  services:
    mon: 5 daemons, quorum pqsoel12997,pqsoel12996,pwsoel12994,pwsoel12998,prghygpl03 (age 3h)
    mgr: pwsoel12998.ylvjcb(active, since 3h), standbys: pqsoel12997.gagpbt
    mds: 4/4 daemons up, 2 standby
    osd: 32 osds: 32 up (since 73m), 32 in (since 6d); 10 remapped pgs
         flags noscrub,nodeep-scrub

  data:
    volumes: 2/2 healthy
    pools:   5 pools, 193 pgs
    objects: 13.97M objects, 853 GiB
    usage:   3.5 TiB used, 12 TiB / 16 TiB avail
    pgs:     755092/55882956 objects misplaced (1.351%)
             183 active+clean
             10  active+remapped+backfilling

  io:
    recovery: 2.3 MiB/s, 20 objects/s
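
To put those numbers in perspective, here is a rough back-of-the-envelope sketch of how long the backfill would take at the reported rate. It assumes the rate stays constant and that the "objects misplaced" count drains at the reported objects/s figure, which is only an approximation. The function name is illustrative.

```python
def backfill_eta_hours(misplaced_objects: int, objects_per_s: float) -> float:
    """Estimate remaining backfill time in hours at a constant recovery rate."""
    if objects_per_s <= 0:
        raise ValueError("recovery rate must be positive")
    return misplaced_objects / objects_per_s / 3600.0


if __name__ == "__main__":
    # Figures taken from the status output above.
    eta = backfill_eta_hours(755092, 20)
    print(f"Estimated time remaining: {eta:.1f} hours")
```

With 755092 misplaced objects at 20 objects/s this works out to roughly 10.5 hours, for a change that touches only 1.351% of the objects.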
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx