Thanks. I'll PR up some doc updates reflecting this and run them by the RGW / RADOS folks.

> On Apr 3, 2024, at 16:34, Joshua Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:
>
> Hey Anthony,
>
> Like with many other options in Ceph, I think what's missing is the
> user-visible effect of what's being altered. I believe the reason why
> synchronous recovery is still used is that, assuming that per-object
> recovery is quick, it's faster to complete than asynchronous recovery,
> which has extra steps on either end of the recovery process. Of
> course, as you know, synchronous recovery blocks I/O, so when
> per-object recovery isn't quick, as with RGW index omap shards,
> particularly large shards, IMO we're better off always doing async
> recovery.
>
> I don't know enough about the overheads involved here to evaluate
> whether it's worth keeping synchronous recovery at all, but IMO RGW
> index/usage(/log/gc?) pools are always better off using asynchronous
> recovery.
>
> Josh
>
> On Wed, Apr 3, 2024 at 1:48 PM Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
>>
>> We currently have in src/common/options/global.yaml.in:
>>
>> - name: osd_async_recovery_min_cost
>>   type: uint
>>   level: advanced
>>   desc: A mixture measure of number of current log entries difference and historical
>>     missing objects, above which we switch to use asynchronous recovery when appropriate
>>   default: 100
>>   flags:
>>   - runtime
>>
>> I'd like to rephrase that description in a PR. Might you be able to share
>> your insight into the dynamics so I can craft a better description? And do
>> you have any thoughts on the default value? Might appropriate values vary
>> by pool type and/or media?
>>
>>> On Apr 3, 2024, at 13:38, Joshua Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:
>>>
>>> We've had success using osd_async_recovery_min_cost=0 to drastically
>>> reduce slow ops during index recovery.
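[A sketch of the decision the option description implies, for readers following along. This is my simplification, not the OSD's actual peering code: the per-PG "cost" mixes PG-log divergence with historical missing objects, and the OSD prefers asynchronous recovery when that cost exceeds the threshold. All numbers below are illustrative.]

```shell
# Simplified illustration of osd_async_recovery_min_cost (not the real
# PeeringState logic): cost = log-entry difference + missing objects.
log_entries_behind=30
missing_objects=20
min_cost=100          # the shipped default
cost=$((log_entries_behind + missing_objects))
if [ "$cost" -gt "$min_cost" ]; then
    echo "async recovery"
else
    echo "sync recovery"    # 50 <= 100, so the default takes this branch
fi
```

Under this reading, Josh's osd_async_recovery_min_cost=0 makes any nonzero cost take the async path; the option is flagged runtime, so `ceph config set osd osd_async_recovery_min_cost 0` applies without restarting OSDs.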
>>>
>>> Josh
>>>
>>> On Wed, Apr 3, 2024 at 11:29 AM Wesley Dillingham <wes@xxxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> I am fighting an issue on an 18.2.0 cluster where a restart of an OSD which
>>>> supports the RGW index pool causes crippling slow ops. If the OSD is marked
>>>> with primary-affinity of 0 prior to the OSD restart, no slow ops are
>>>> observed. If the OSD has a primary affinity of 1, slow ops occur. The slow
>>>> ops only occur during the recovery period of the OMAP data, and further only
>>>> occur when client activity is allowed to pass to the cluster. Luckily I am
>>>> able to test this during periods when I can disable all client activity at
>>>> the upstream proxy.
>>>>
>>>> Given that the primary-affinity change prevents the slow ops, I think this
>>>> may be a case of recovery being more detrimental than backfill. I am
>>>> thinking that causing a pg_temp acting set by forcing backfill may be the
>>>> right method to mitigate the issue. [1]
>>>>
>>>> I believe that reducing the PG log entries for these OSDs would accomplish
>>>> that, but I am also thinking that tuning osd_async_recovery_min_cost [2] may
>>>> accomplish something similar. I'm not sure of the appropriate tuning for that
>>>> config at this point, or whether there may be a better approach. Seeking any
>>>> input here.
>>>>
>>>> Further, if this issue sounds familiar or sounds like another condition
>>>> within the OSD may be at hand, I would be interested in hearing your input
>>>> or thoughts. Thanks!
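[Wes's primary-affinity workaround can be scripted for planned restarts. A sketch, assuming a bare-metal systemd deployment; `12` is a stand-in OSD id, and the `grep`-based wait is a crude check, not an official readiness signal.]

```shell
#!/bin/sh
# Drain the primary role from the OSD before restarting it, so clients
# are served by other replicas while its omap data recovers.
OSD_ID=12   # hypothetical id; substitute your own

ceph osd primary-affinity "${OSD_ID}" 0
systemctl restart "ceph-osd@${OSD_ID}"   # cephadm: ceph orch daemon restart osd.${OSD_ID}

# Crudely wait until no PGs report recovery before restoring the role.
while ceph pg stat | grep -q recover; do sleep 10; done
ceph osd primary-affinity "${OSD_ID}" 1
```

This only sidesteps the problem for planned restarts; an unplanned OSD crash would still recover with whatever primary affinity was in place.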
>>>>
>>>> [1] https://docs.ceph.com/en/latest/dev/peering/#concepts
>>>> [2] https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/#confval-osd_async_recovery_min_cost
>>>>
>>>> Respectfully,
>>>>
>>>> *Wes Dillingham*
>>>> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>>>> wes@xxxxxxxxxxxxxxxxx
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
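[For the PG-log-shortening approach Wes floats above, the relevant knobs are the PG log length bounds. A hedged sketch: the values are illustrative only, not recommendations, and a shorter PG log narrows the window in which log-based recovery is possible, pushing restarted peers toward backfill instead.]

```shell
# Illustrative values only; shrinking the PG log trades recovery for
# backfill after longer outages. Apply cluster-wide at runtime:
ceph config set osd osd_min_pg_log_entries 500
ceph config set osd osd_max_pg_log_entries 500
```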