Initial indications show "osd_async_recovery_min_cost = 0" to be a huge win here. Some initial thoughts: were it not for the fact that the index (and other OMAP pools) are isolated to their own OSDs in this cluster, this tunable would seemingly cause data/blob objects from data pools to recover asynchronously when synchronous recovery might be better for those pools / that data. I can play around with how this affects the RGW data pools.

There was a Ceph code walkthrough video on this topic:
https://www.youtube.com/watch?v=waOtatCpnYs&t
It seems that osd_async_recovery_min_cost may previously have been named osd_async_recovery_min_pg_log_entries (both default to 100). For a pool with OMAP data where some or all OMAP objects are very large, this may not be a dynamic enough factor on which to base the decision.

Thanks for the feedback, everybody!

Respectfully,

*Wes Dillingham*
LinkedIn <http://www.linkedin.com/in/wesleydillingham>
wes@xxxxxxxxxxxxxxxxx


On Wed, Apr 3, 2024 at 1:38 PM Joshua Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:

> We've had success using osd_async_recovery_min_cost=0 to drastically
> reduce slow ops during index recovery.
>
> Josh
>
> On Wed, Apr 3, 2024 at 11:29 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx> wrote:
> >
> > I am fighting an issue on an 18.2.0 cluster where a restart of an OSD
> > which supports the RGW index pool causes crippling slow ops. If the OSD
> > is marked with a primary affinity of 0 prior to the restart, no slow ops
> > are observed; if the OSD has a primary affinity of 1, slow ops occur.
> > The slow ops only occur during the recovery period of the OMAP data,
> > and, further, only when client activity is allowed to pass to the
> > cluster. Luckily I am able to test this during periods when I can
> > disable all client activity at the upstream proxy.
> >
> > Given that the primary-affinity change prevents the slow ops, I think
> > this may be a case of recovery being more detrimental than backfill. I
> > am thinking that forcing backfill via a pg_temp acting set may be the
> > right way to mitigate the issue. [1]
> >
> > I believe that reducing the PG log entries for these OSDs would
> > accomplish that, but tuning osd_async_recovery_min_cost [2] may
> > accomplish something similar. I am not sure of the appropriate value
> > for that config at this point, or whether there is a better approach,
> > so I am seeking any input here.
> >
> > Further, if this issue sounds familiar, or sounds like another
> > condition within the OSD may be at play, I would be interested in
> > hearing your input or thoughts. Thanks!
> >
> > [1] https://docs.ceph.com/en/latest/dev/peering/#concepts
> > [2] https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/#confval-osd_async_recovery_min_cost
> >
> > Respectfully,
> >
> > *Wes Dillingham*
> > LinkedIn <http://www.linkedin.com/in/wesleydillingham>
> > wes@xxxxxxxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
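
For anyone wanting to try the tunable discussed above, a minimal sketch of applying and verifying it with the standard Ceph CLI. The cluster-wide "osd" scope and the osd.12 target are illustrative assumptions, not details from the thread:

    # With a cost threshold of 0, any eligible recovery becomes async recovery
    ceph config set osd osd_async_recovery_min_cost 0

    # Confirm the value a given OSD daemon is actually running with
    # (osd.12 is a hypothetical example OSD)
    ceph config show osd.12 osd_async_recovery_min_cost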
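Similarly, the primary-affinity workaround from the original post could look roughly like the following; osd.12 is again hypothetical, and the restart mechanism depends on your deployment (the systemd unit shown is the classic package-based name; cephadm deployments would use "ceph orch daemon restart osd.12" instead):

    # Stop directing primary reads/writes at the OSD before restarting it
    ceph osd primary-affinity osd.12 0

    # Restart the daemon (non-cephadm systemd unit shown)
    systemctl restart ceph-osd@12

    # Once recovery completes, restore the default primary affinity
    ceph osd primary-affinity osd.12 1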