Re: PG_SLOW_SNAP_TRIMMING and possible storage leakage on 16.2.5

Hi,

Yes, restarting an OSD also works to re-peer and "kick" the
snaptrimming process.
(In the ticket we first noticed this because snap trimming restarted
after an unrelated OSD crashed/restarted).
Please feel free to add your experience to that ticket.
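
For anyone who wants to try it, the repeer approach looks roughly like
this (a sketch; the PG id 3.9b and osd.8 are just the examples from
David's mail):

# repeer a single PG directly (the command exists since Nautilus):
ceph pg repeer 3.9b
# or mark the PG's primary down; the OSD notices, reasserts itself, and
# re-peers, which avoids a full daemon restart:
ceph osd down osd.8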

> monitoring snaptrimq

This is from our local monitoring probes, based on `ceph pg dump -f json`.
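
Roughly, the probe just sums the per-PG queue lengths. A minimal sketch
with jq (the .pg_map.pg_stats path and the snaptrimq_len field are as
found in Pacific's pg dump; verify the layout on your release):

# total snaptrim queue length across the cluster:
ceph pg dump -f json 2>/dev/null \
    | jq '[.pg_map.pg_stats[].snaptrimq_len] | add'

# the ten PGs with the longest queues:
ceph pg dump -f json 2>/dev/null \
    | jq -r '.pg_map.pg_stats | sort_by(-.snaptrimq_len)[:10][]
             | "\(.pgid)\t\(.snaptrimq_len)"'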

-- Dan

On Mon, Jan 24, 2022 at 6:31 PM David Prude <david@xxxxxxxxxxxxxxxx> wrote:
>
> Dan,
>
>   Thank you for replying. Since I posted I did some more digging. It
> really seemed as if snaptrim simply wasn't being processed. The output
> of "ceph health detail" showed that PG 3.9b had the longest queue. I
> examined this PG and saw that its primary was osd.8, so I manually
> restarted that daemon. This seems to have kicked off snaptrim on some PGs:
>
> ----SNIP----
> 1513 pgs: 1 active+clean+scrubbing, 1 active+clean+scrubbing+snaptrim,
> 44 active+clean+snaptrim, 1 active+clean+scrubbing+deep+snaptrim_wait,
> 1406 active+clean, 2 active+clean+scrubbing+deep, 58
> active+clean+snaptrim_wait; 114 TiB data, 344 TiB used, 93 TiB / 437 TiB
> avail; 2.0 KiB/s rd, 64 KiB/s wr, 5 op/s
> ----SNIP----
>
> I can see the "snaptrimq_len" value decreasing for that PG now. I will
> look into the issue you posted as well as repeering the PGs. Does an OSD
> restart causing snaptrim to proceed seem consistent with the behavior
> you saw?
>
> I notice in the bug report you linked that you are somehow monitoring
> the snaptrimq with Grafana. Is this a global value that is readily
> available for monitoring, or are you calculating it somehow? If there is
> an easy way to access it, I would greatly appreciate instructions.
>
> Thank you,
>
> -David
>
> On 1/24/22 11:53 AM, Dan van der Ster wrote:
> > Hi David,
> >
> > We observed the same here: https://tracker.ceph.com/issues/52026
> > You can poke the trimming by repeering the PGs.
> >
> > Also, depending on your hardware, the defaults for osd_snap_trim_sleep
> > might be far too conservative.
> > We use osd_snap_trim_sleep = 0.1 on our mixed hdd block / ssd block.db OSDs.
> >
> > Cheers, Dan
> >
> > On Mon, Jan 24, 2022 at 4:54 PM David Prude <david@xxxxxxxxxxxxxxxx> wrote:
> >> Hello,
> >>
> >>    We have a 5-node, 30 hdd (6 hdds/node) cluster running 16.2.5. We
> >> utilize a snapshot scheme within cephfs that results in 24 hourly
> >> snapshots, 7 daily snapshots, and 2 weekly snapshots. This has been
> >> running without overt issues for several months. As of this weekend, we
> >> started receiving a PG_SLOW_SNAP_TRIMMING warning on a single PG. Over
> >> the last 24 hours we are now seeing that this warning is associated with
> >> 123 of our 1513 PGs. As recommended by the output of "ceph health
> >> detail" we have tried tuning the following from their default values:
> >>
> >> osd_pg_max_concurrent_snap_trims=4 (default 2)
> >> osd_snap_trim_sleep_hdd=3 (default 5)
> >> osd_snap_trim_sleep=0.5 (default 0, it was suggested somewhere in a
> >> search that 0 actually disables trim?)
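> >>
> >> For reference, these were applied at runtime via the central config,
> >> roughly as follows (a sketch; adjust the values to taste):
> >>
> >> ceph config set osd osd_pg_max_concurrent_snap_trims 4
> >> ceph config set osd osd_snap_trim_sleep_hdd 3
> >> ceph config set osd osd_snap_trim_sleep 0.5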
> >>
> >> I am uncertain how best to measure whether the above is having an
> >> effect on the trimming process, since I don't see a clear way to
> >> monitor the progress of snaptrim or even the total queue depth.
> >> Interestingly, "ceph pg stat" does not show any PGs in the snaptrim state:
> >>
> >> ----SNIP----
> >> 1513 pgs: 2 active+clean+scrubbing+deep, 1511 active+clean; 114 TiB
> >> data, 344 TiB used, 93 TiB / 437 TiB avail; 6.2 KiB/s rd, 2.2 MiB/s wr,
> >> 118 op/s
> >>
> >> ----SNIP----
> >>
> >> We have, for the time being, disabled our snapshots in the hopes that
> >> the cluster will catch up with the trimming process. Two potential
> >> things of note:
> >>
> >> 1. We are unaware of any particular action which would be associated
> >> with this happening now (there were no unusual deletions of either live
> >> data or snapshots).
> >> 2. For the past month or two there has been a steady, unchecked growth
> >> in storage utilization, as if snapshots were not actually being
> >> trimmed.
> >>
> >> Any assistance in determining what exactly has prompted this behavior or
> >> any guidance on how to evaluate the total snaptrim queue size to see if
> >> we are making progress would be much appreciated.
> >>
> >> Thank you,
> >>
> >> -David
> >>
> >> --
> >> David Prude
> >> Systems Administrator
> >> PGP Fingerprint: 1DAA 4418 7F7F B8AA F50C  6FDF C294 B58F A286 F847
> >> Democracy Now!
> >> www.democracynow.org
> >>
> >>
>
> --
> David Prude
> Systems Administrator
> PGP Fingerprint: 1DAA 4418 7F7F B8AA F50C  6FDF C294 B58F A286 F847
> Democracy Now!
> www.democracynow.org
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


