What you could try is to increase osd_delete_sleep on the OSDs:
ceph tell osd.* injectargs '--osd_delete_sleep 30'
I had a customer who ran into a similar issue: terrible performance during snapshot
removal. They ended up setting it to 30 to reduce the impact on performance.
You might start with a lower value and see how it affects your cluster, e.g. along the lines of the commands below.
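Just as a rough sketch (the value of 5 is only an illustration, not a recommendation), you could inject a smaller sleep first and then check what the daemon is actually running with via the admin socket on the OSD's host:

# inject a lower (hypothetical) starting value on all OSDs
ceph tell osd.* injectargs '--osd_delete_sleep 5'

# verify the running value on one OSD (run on the host where osd.0 lives)
ceph daemon osd.0 config get osd_delete_sleep

Note that injectargs only changes the runtime value; if it helps, you would still want to persist the setting in your configuration.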
On 5/30/22 19:47, Ml Ml wrote:
Hello,
I have this state right now (see below). If I remove the nosnaptrim
flag, my IO dies and I don't know why.
When I remove all of my snapshots, will the snaptrim then go away, too?
root@cluster5-node01:~# ceph -s
  cluster:
    id:     e1153ea5-bb07-4548-83a9-edd8bae3eeec
    health: HEALTH_WARN
            noout,nosnaptrim flag(s) set
            4 nearfull osd(s)
            1 pool(s) do not have an application enabled
            3 pool(s) nearfull

  services:
    mon: 3 daemons, quorum cluster5-node01,cluster5-node02,cluster5-node03 (age 25h)
    mgr: cluster5-node03(active, since 26h), standbys: cluster5-node02, cluster5-node01
    osd: 18 osds: 18 up (since 3h), 18 in (since 9M)
         flags noout,nosnaptrim

  data:
    pools:   3 pools, 1057 pgs
    objects: 5.11M objects, 17 TiB
    usage:   53 TiB used, 10 TiB / 63 TiB avail
    pgs:     1017 active+clean+snaptrim_wait
             40   active+clean

  io:
    client: 413 KiB/s rd, 3.9 MiB/s wr, 34 op/s rd, 364 op/s wr
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx