Initial indications show "osd_async_recovery_min_cost = 0" to be a huge win here. Some initial thoughts: were it not for the fact that the index (and other OMAP pools) are isolated to their own OSDs in this cluster, this tunable would seemingly cause data/blob objects from data pools to recover asynchronously when synchronous recovery might be better for those pools / that data. I can play around with how this affects the RGW data pools.

There was a Ceph code walkthrough video on this topic:
https://www.youtube.com/watch?v=waOtatCpnYs&t
It seems that osd_async_recovery_min_cost may previously have been named osd_async_recovery_min_pg_log_entries (both default to 100). For a pool with OMAP data where some or all OMAP objects are very large, this may not be a dynamic enough factor on which to base the decision.

Thanks for the feedback, everybody!

Respectfully,

*Wes Dillingham*
LinkedIn <http://www.linkedin.com/in/wesleydillingham>
wes@xxxxxxxxxxxxxxxxx


On Wed, Apr 3, 2024 at 1:38 PM Joshua Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:

> We've had success using osd_async_recovery_min_cost=0 to drastically
> reduce slow ops during index recovery.
>
> Josh
>
> On Wed, Apr 3, 2024 at 11:29 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx> wrote:
> >
> > I am fighting an issue on an 18.2.0 cluster where a restart of an OSD
> > which supports the RGW index pool causes crippling slow ops. If the OSD
> > is marked with a primary affinity of 0 prior to the restart, no slow ops
> > are observed; if the OSD has a primary affinity of 1, slow ops occur.
> > The slow ops only occur during the recovery period of the OMAP data,
> > and, further, only when client activity is allowed to pass to the
> > cluster. Luckily I am able to test this during periods when I can
> > disable all client activity at the upstream proxy.
> >
> > Given that the primary-affinity change prevents the slow ops, I think
> > this may be a case of recovery being more detrimental than backfill. I
> > am thinking that forcing backfill via a pg_temp acting set may be the
> > right way to mitigate the issue. [1]
> >
> > I believe that reducing the PG log entries for these OSDs would
> > accomplish that, but tuning osd_async_recovery_min_cost [2] may
> > accomplish something similar. I am not sure of the appropriate value
> > for that config at this point, or whether there is a better approach,
> > so I am seeking any input here.
> >
> > Further, if this issue sounds familiar, or sounds like another
> > condition within the OSD may be at play, I would be interested in
> > hearing your input or thoughts. Thanks!
> >
> > [1] https://docs.ceph.com/en/latest/dev/peering/#concepts
> > [2] https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/#confval-osd_async_recovery_min_cost
> >
> > Respectfully,
> >
> > *Wes Dillingham*
> > LinkedIn <http://www.linkedin.com/in/wesleydillingham>
> > wes@xxxxxxxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
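
For anyone wanting to try the tunable discussed above, a minimal sketch of applying and verifying it with the standard Ceph CLI. The cluster-wide "osd" scope and the osd.12 target are illustrative assumptions, not details from the thread:

    # With a cost threshold of 0, any eligible recovery becomes async recovery
    ceph config set osd osd_async_recovery_min_cost 0

    # Confirm the value a given OSD daemon is actually running with
    # (osd.12 is a hypothetical example OSD)
    ceph config show osd.12 osd_async_recovery_min_cost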
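Similarly, the primary-affinity workaround from the original post could look roughly like the following; osd.12 is again hypothetical, and the restart mechanism depends on your deployment (the systemd unit shown is the classic package-based name; cephadm deployments would use "ceph orch daemon restart osd.12" instead):

    # Stop directing primary reads/writes at the OSD before restarting it
    ceph osd primary-affinity osd.12 0

    # Restart the daemon (non-cephadm systemd unit shown)
    systemctl restart ceph-osd@12

    # Once recovery completes, restore the default primary affinity
    ceph osd primary-affinity osd.12 1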