We currently have in src/common/options/global.yaml.in:

- name: osd_async_recovery_min_cost
  type: uint
  level: advanced
  desc: A mixture measure of number of current log entries difference and
    historical missing objects, above which we switch to use asynchronous
    recovery when appropriate
  default: 100
  flags:
  - runtime

I'd like to rephrase that description in a PR. Might you be able to share
your insight into the dynamics so I can craft a better description? And do
you have any thoughts on the default value? Might appropriate values vary by
pool type and/or media?

(For context, I've sketched the runtime commands being discussed below the
quoted thread.)

> On Apr 3, 2024, at 13:38, Joshua Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:
>
> We've had success using osd_async_recovery_min_cost=0 to drastically
> reduce slow ops during index recovery.
>
> Josh
>
> On Wed, Apr 3, 2024 at 11:29 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx> wrote:
>>
>> I am fighting an issue on an 18.2.0 cluster where a restart of an OSD
>> that serves the RGW index pool causes crippling slow ops. If the OSD is
>> marked with a primary affinity of 0 prior to the restart, no slow ops
>> are observed; if the OSD has a primary affinity of 1, slow ops occur.
>> The slow ops occur only during the recovery period of the omap data,
>> and, further, only when client activity is allowed to reach the
>> cluster. Luckily I am able to test this during periods when I can
>> disable all client activity at the upstream proxy.
>>
>> Given that the primary-affinity change prevents the slow ops, I think
>> this may be a case of recovery being more detrimental than backfill. I
>> am thinking that causing a pg_temp acting set by forcing backfill may
>> be the right way to mitigate the issue. [1]
>>
>> I believe that reducing the PG log entries for these OSDs would
>> accomplish that, but I am also thinking that tuning
>> osd_async_recovery_min_cost [2] may accomplish something similar. I am
>> not sure of the appropriate value for that option at this point, or
>> whether there is a better approach, so I am seeking any input here.
>>
>> Further, if this issue sounds familiar, or sounds like another
>> condition within the OSD may be at play, I would be interested in your
>> input or thoughts. Thanks!
>>
>> [1] https://docs.ceph.com/en/latest/dev/peering/#concepts
>> [2]
>> https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/#confval-osd_async_recovery_min_cost
>>
>> Respectfully,
>>
>> Wes Dillingham
>> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>> wes@xxxxxxxxxxxxxxxxx
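
PS: For anyone following along, here is a rough sketch of the runtime tuning
being discussed. osd.12 is just a placeholder id, and the value of 0 reflects
Joshua's report rather than a general recommendation:

    # Check the current value
    ceph config get osd osd_async_recovery_min_cost

    # Lower it cluster-wide at runtime (the option carries the "runtime"
    # flag, so no OSD restart is needed); 0 means async recovery is chosen
    # whenever it is applicable, regardless of how much is missing
    ceph config set osd osd_async_recovery_min_cost 0

    # Wes's workaround: drop primary affinity before restarting the OSD,
    # then restore it once recovery has finished
    ceph osd primary-affinity osd.12 0
    ceph osd primary-affinity osd.12 1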