Hi Eugen,

On Fri, Aug 23, 2024 at 1:37 PM Eugen Block <eblock@xxxxxx> wrote:
> Hi again,
>
> I have a couple of questions about this.
> What exactly happened to the PGs? They were queued for snaptrimming,
> but we didn't see any progress. Let's assume the average object size
> in that pool was around 2 MB (I don't have the actual numbers). Does
> that mean if osd_snap_trim_cost (1M default) was too low, those too
> large objects weren't trimmed? And then we split the PGs, reducing the
> average object size to 1 MB, these objects could be trimmed then,
> obviously. Does this explanation make sense?
>

If you have the OSD logs, I can take a look and see why the snaptrim ops
did not make progress.

The cost is one factor contributing to the position of the op in the
queue. Therefore, even if the cost doesn't accurately represent the
actual average size of the objects in the PG, the op should still be
scheduled based on the set cost and the mClock profile allocations.

From the thread, I understand the OSDs are NVMe-based. Based on the
actions taken to resolve the situation (increasing pg_num to 64), I
suspect something else was going on in the cluster. On an NVMe-based
cluster, the current cost shouldn't cause snaptrim ops to stall. I'd
suggest raising an upstream tracker with your observations and the OSD
logs so this can be investigated further.

> I just browsed through the changes, if I understand the fix correctly,
> the average object size is now calculated automatically, right? Which
> makes a lot of sense to me, as an operator I don't want to care too
> much about the average object sizes since ceph should know them better
> than me. ;-)
>

Yes, that's correct. This fix was part of the effort to incrementally
bring background OSD operations under mClock scheduling.
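
To illustrate the idea, here is a rough Python sketch of deriving the
cost from the PG's stats rather than a fixed setting. It is not the
actual C++ code in the OSD; the names snap_trim_cost and
OSD_SNAP_TRIM_COST, and the clamping to the 1M default, are just for
illustration:

    # Sketch: derive the per-object snaptrim cost from the PG's own
    # statistics instead of relying only on a static setting.
    OSD_SNAP_TRIM_COST = 1 * 1024 * 1024  # 1M default, used as fallback

    def snap_trim_cost(pg_num_bytes: int, pg_num_objects: int) -> int:
        """Estimate the cost of trimming one object in this PG."""
        if pg_num_objects <= 0:
            return OSD_SNAP_TRIM_COST
        avg_obj_size = pg_num_bytes // pg_num_objects
        # Illustrative floor: don't let tiny objects make the op look free.
        return max(avg_obj_size, OSD_SNAP_TRIM_COST)

    # A PG with ~2 MB average objects gets a ~2 MB cost per trimmed
    # object; a PG with 512 KB average objects falls back to the default.
    print(snap_trim_cost(200 << 20, 100))  # 2097152 (~2 MB)
    print(snap_trim_cost(100 << 20, 200))  # 1048576 (floored to 1M)

With a cost derived this way, pools with larger objects naturally get a
higher per-op cost and the mClock scheduler positions the snaptrim ops
accordingly, without the operator having to tune the average object
size by hand.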