On Fri, Sep 29, 2023, 9:40 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:

> Thanks for the suggestion, Tyler! Do you think switching the progress
> module off will have no material impact on the operation of the cluster?
>

It does not. It literally just tracks the completion rate of certain
actions so that it can render progress bars and ETAs in e.g. `ceph -s`
output.

> /Z
>
> On Fri, 29 Sept 2023 at 14:13, Tyler Stachecki <stachecki.tyler@xxxxxxxxx>
> wrote:
>
>> On Fri, Sep 29, 2023, 5:55 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx>
>> wrote:
>>
>>> Thank you, Eugen.
>>>
>>> Indeed it looks like the progress module had some stale events from the
>>> time when we added new OSDs and set a specific number of PGs for pools,
>>> while the autoscaler tried to scale them down. Somehow the scale-down
>>> events got stuck in the progress log, although these tasks finished a
>>> long time ago. Failing over to another MGR didn't help, so I have
>>> cleared the progress log.
>>>
>>> I also restarted both mgrs, but unfortunately the warnings are still
>>> being logged.
>>>
>>> /Z
>>
>>
>> I would recommend just turning off the progress module via `ceph progress
>> off`. It has historically been a source of bugs (like this...) and does
>> not do much in the grand scheme of things.
>>
>>
>>> On Fri, 29 Sept 2023 at 11:32, Eugen Block <eblock@xxxxxx> wrote:
>>>
>>> > Hi,
>>> >
>>> > this is from the mgr progress module [1]. I haven't played too much
>>> > with it yet. You can check out the output of 'ceph progress json';
>>> > maybe there are old events from a (failed) upgrade etc. You can reset
>>> > it with 'ceph progress clear'. You could also turn it off ('ceph
>>> > progress off'), but I don't know what impact that would have, so maybe
>>> > investigate first and then try just clearing it. Maybe a mgr failover
>>> > would do the same, not sure.
>>> >
>>> > Regards,
>>> > Eugen
>>> >
>>> > [1]
>>> > https://github.com/ceph/ceph/blob/1d10b71792f3be8887a7631e69851ac2df3585af/src/pybind/mgr/progress/module.py#L797
>>> >
>>> > Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>>> >
>>> > > Hi,
>>> > >
>>> > > Mgr of my cluster logs this every few seconds:
>>> > >
>>> > > [progress WARNING root] complete: ev 7de5bb74-790b-4fda-8838-e4af4af18c62 does not exist
>>> > > [progress WARNING root] complete: ev fff93fce-b630-4141-81ee-19e7a3e61483 does not exist
>>> > > [progress WARNING root] complete: ev a02f6966-5b9f-49e8-89c4-b4fb8e6f4423 does not exist
>>> > > [progress WARNING root] complete: ev 8d318560-ff1a-477f-9386-43f6b51080bf does not exist
>>> > > [progress WARNING root] complete: ev ff3740a9-6434-470a-808f-a2762fb542a0 does not exist
>>> > > [progress WARNING root] complete: ev 7d0589f1-545e-4970-867b-8482ce48d7f0 does not exist
>>> > > [progress WARNING root] complete: ev 78d57e43-5be5-43f0-8b1a-cdc60e410892 does not exist
>>> > >
>>> > > I would appreciate advice on what these warnings mean and how they
>>> > > can be resolved.
>>> > >
>>> > > Best regards,
>>> > > Zakhar
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
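
For anyone hitting the same warnings, the steps discussed above boil down
to roughly the following (a minimal sketch; the `ceph progress` subcommands
are the ones named in the thread, and `ceph mgr fail` with no argument
assumes a release that picks the active mgr automatically):

  # dump the progress module's current event list and look for stale entries
  ceph progress json

  # drop any stale/stuck events
  ceph progress clear

  # optionally force a mgr failover so the module state is rebuilt
  # (older releases need the daemon name: ceph mgr fail <mgr-name>)
  ceph mgr fail

  # if the warnings keep coming back, disable the module entirely;
  # it only drives the progress bars/ETAs shown in `ceph -s` output
  ceph progress off

  # re-enable later if desired
  ceph progress on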