What you could try is to increase osd_delete_sleep on the OSDs:
ceph tell osd.* injectargs '--osd_delete_sleep 30'
I had a customer who ran into a similar issue: terrible performance during snapshot
removal. They ended up setting it to 30 to reduce the impact on performance.
You might start with a lower value and see how it affects your cluster, e.g. along the lines of the commands below.
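Just as a rough sketch (the value of 5 is only an illustration, not a recommendation), you could inject a smaller sleep first and then check what the daemon is actually running with via the admin socket on the OSD's host:

# inject a lower (hypothetical) starting value on all OSDs
ceph tell osd.* injectargs '--osd_delete_sleep 5'

# verify the running value on one OSD (run on the host where osd.0 lives)
ceph daemon osd.0 config get osd_delete_sleep

Note that injectargs only changes the runtime value; if it helps, you would still want to persist the setting in your configuration.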
On 5/30/22 19:47, Ml Ml wrote:
Hello,
I have this state right now (see below). If I remove the nosnaptrim
flag, my IO dies and I don't know why.
When I remove all of my snapshots, will the snaptrim then go away, too?
root@cluster5-node01:~# ceph -s
  cluster:
    id:     e1153ea5-bb07-4548-83a9-edd8bae3eeec
    health: HEALTH_WARN
            noout,nosnaptrim flag(s) set
            4 nearfull osd(s)
            1 pool(s) do not have an application enabled
            3 pool(s) nearfull

  services:
    mon: 3 daemons, quorum cluster5-node01,cluster5-node02,cluster5-node03 (age 25h)
    mgr: cluster5-node03(active, since 26h), standbys: cluster5-node02, cluster5-node01
    osd: 18 osds: 18 up (since 3h), 18 in (since 9M)
         flags noout,nosnaptrim

  data:
    pools:   3 pools, 1057 pgs
    objects: 5.11M objects, 17 TiB
    usage:   53 TiB used, 10 TiB / 63 TiB avail
    pgs:     1017 active+clean+snaptrim_wait
             40   active+clean

  io:
    client: 413 KiB/s rd, 3.9 MiB/s wr, 34 op/s rd, 364 op/s wr
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx