On Fri, Sep 29, 2023, 9:40 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:

> Thanks for the suggestion, Tyler! Do you think switching the progress
> module off will have no material impact on the operation of the cluster?
>

It does not. It literally just tracks the completion rate of certain
actions so that it can render progress bars and ETAs in e.g. `ceph -s`
output.

> /Z
>
> On Fri, 29 Sept 2023 at 14:13, Tyler Stachecki <stachecki.tyler@xxxxxxxxx>
> wrote:
>
>> On Fri, Sep 29, 2023, 5:55 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx>
>> wrote:
>>
>>> Thank you, Eugen.
>>>
>>> Indeed it looks like the progress module had some stale events from the
>>> time when we added new OSDs and set a specific number of PGs for pools,
>>> while the autoscaler tried to scale them down. Somehow the scale-down
>>> events got stuck in the progress log, although these tasks finished a
>>> long time ago. Failing over to another MGR didn't help, so I have
>>> cleared the progress log.
>>>
>>> I also restarted both mgrs, but unfortunately the warnings are still
>>> being logged.
>>>
>>> /Z
>>
>>
>> I would recommend just turning off the progress module via `ceph progress
>> off`. It has historically been a source of bugs (like this...) and does
>> not do much in the grand scheme of things.
>>
>>
>>> On Fri, 29 Sept 2023 at 11:32, Eugen Block <eblock@xxxxxx> wrote:
>>>
>>> > Hi,
>>> >
>>> > this is from the mgr progress module [1]. I haven't played too much
>>> > with it yet. You can check out the output of 'ceph progress json';
>>> > maybe there are old events from a (failed) upgrade etc. You can reset
>>> > it with 'ceph progress clear'. You could also turn it off ('ceph
>>> > progress off'), but I don't know what impact that would have, so maybe
>>> > investigate first and then try just clearing it. Maybe a mgr failover
>>> > would do the same, not sure.
>>> >
>>> > Regards,
>>> > Eugen
>>> >
>>> > [1]
>>> > https://github.com/ceph/ceph/blob/1d10b71792f3be8887a7631e69851ac2df3585af/src/pybind/mgr/progress/module.py#L797
>>> >
>>> > Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>>> >
>>> > > Hi,
>>> > >
>>> > > Mgr of my cluster logs this every few seconds:
>>> > >
>>> > > [progress WARNING root] complete: ev 7de5bb74-790b-4fda-8838-e4af4af18c62 does not exist
>>> > > [progress WARNING root] complete: ev fff93fce-b630-4141-81ee-19e7a3e61483 does not exist
>>> > > [progress WARNING root] complete: ev a02f6966-5b9f-49e8-89c4-b4fb8e6f4423 does not exist
>>> > > [progress WARNING root] complete: ev 8d318560-ff1a-477f-9386-43f6b51080bf does not exist
>>> > > [progress WARNING root] complete: ev ff3740a9-6434-470a-808f-a2762fb542a0 does not exist
>>> > > [progress WARNING root] complete: ev 7d0589f1-545e-4970-867b-8482ce48d7f0 does not exist
>>> > > [progress WARNING root] complete: ev 78d57e43-5be5-43f0-8b1a-cdc60e410892 does not exist
>>> > >
>>> > > I would appreciate advice on what these warnings mean and how they
>>> > > can be resolved.
>>> > >
>>> > > Best regards,
>>> > > Zakhar
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
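
For anyone hitting the same warnings, the steps discussed above boil down
to roughly the following (a minimal sketch; the `ceph progress` subcommands
are the ones named in the thread, and `ceph mgr fail` with no argument
assumes a release that picks the active mgr automatically):

  # dump the progress module's current event list and look for stale entries
  ceph progress json

  # drop any stale/stuck events
  ceph progress clear

  # optionally force a mgr failover so the module state is rebuilt
  # (older releases need the daemon name: ceph mgr fail <mgr-name>)
  ceph mgr fail

  # if the warnings keep coming back, disable the module entirely;
  # it only drives the progress bars/ETAs shown in `ceph -s` output
  ceph progress off

  # re-enable later if desired
  ceph progress on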