I've been running into this for quite some time now. If you want a more
targeted solution, you just need to restart the MDS servers that are not
reporting metadata:

ceph mds metadata

I'm not sure why they sometimes come up blank, or why there isn't a simple
way to tell them to refresh without a full restart. That's always been low
enough on my priority list that "future Paul" will do it. I love that guy,
he fixes all my mistakes.

-paul

--
Paul Mezzanini
Platform Engineer III
Research Computing
Rochester Institute of Technology
"End users is a description, not a goal."

________________________________________
From: Eugen Block <eblock@xxxxxx>
Sent: Thursday, May 2, 2024 3:57 PM
To: ceph-users@xxxxxxx
Subject: Re: 'ceph fs status' no longer works?

Yeah, I knew it was something trivial; I just checked my notes but didn't
have anything written down. I agree it's not a big deal, but it shouldn't
be necessary.

Quoting Erich Weiler <weiler@xxxxxxxxxxxx>:

> Excellent! Restarting all the MDS daemons fixed it. Thank you.
>
> This kinda feels like a bug.
>
> -erich
>
> On 5/2/24 12:44 PM, Bandelow, Gunnar wrote:
>> Hi Erich,
>>
>> I'm not sure about this specific error message, but "ceph fs status"
>> sometimes did fail for me at the end of last year / the beginning of
>> this year.
>>
>> Restarting ALL mon, mgr AND mds fixed it at the time.
>>
>> Best regards,
>> Gunnar
>>
>> =======================================================
>>
>> Gunnar Bandelow (dipl. phys.)
>>
>> Universitätsrechenzentrum (URZ)
>> Universität Greifswald
>> Felix-Hausdorff-Straße 18
>> 17489 Greifswald
>> Germany
>>
>> --- Original Message ---
>> Subject: Re: 'ceph fs status' no longer works?
>> From: "Erich Weiler" <weiler@xxxxxxxxxxxx>
>> To: "Eugen Block" <eblock@xxxxxx>, ceph-users@xxxxxxx
>> Date: 02-05-2024 21:05
>>
>> Hi Eugen,
>>
>> Thanks for the tip! I just ran:
>>
>> ceph orch daemon restart mgr.pr-md-01.jemmdf
>>
>> (my specific mgr instance)
>>
>> That restarted my primary mgr daemon and, in the process, failed over
>> to my standby mgr daemon on another server. That went smoothly.
>>
>> Unfortunately, I still cannot get 'ceph fs status' to work (on any
>> node)...
>>
>> # ceph fs status
>> Error EINVAL: Traceback (most recent call last):
>>   File "/usr/share/ceph/mgr/mgr_module.py", line 1811, in _handle_command
>>     return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
>>   File "/usr/share/ceph/mgr/mgr_module.py", line 474, in call
>>     return self.func(mgr, **kwargs)
>>   File "/usr/share/ceph/mgr/status/module.py", line 109, in handle_fs_status
>>     assert metadata
>> AssertionError
>>
>> -erich
>>
>> On 5/2/24 11:07 AM, Eugen Block wrote:
>> > Yep, I've seen this a couple of times during upgrades. I'll have to
>> > check my notes to see if I wrote anything down for that. But try a
>> > mgr failover first; that could help.
>> >
>> > Quoting Erich Weiler <weiler@xxxxxxxxxxxx>:
>> >
>> >> Hi All,
>> >>
>> >> For a while now I've been using 'ceph fs status' to show the
>> >> current active MDS servers, filesystem status, etc. I recently took
>> >> down my MDS servers and added RAM to them (one by one, so the
>> >> filesystem stayed online). After doing that with my four MDS
>> >> servers (I had two active and two standby), all looks OK, and
>> >> 'ceph -s' reports HEALTH_OK.
>> >> But when I do 'ceph fs status' now, I get this:
>> >>
>> >> # ceph fs status
>> >> Error EINVAL: Traceback (most recent call last):
>> >>   File "/usr/share/ceph/mgr/mgr_module.py", line 1811, in _handle_command
>> >>     return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
>> >>   File "/usr/share/ceph/mgr/mgr_module.py", line 474, in call
>> >>     return self.func(mgr, **kwargs)
>> >>   File "/usr/share/ceph/mgr/status/module.py", line 109, in handle_fs_status
>> >>     assert metadata
>> >> AssertionError
>> >>
>> >> This is on Ceph 18.2.1 (Reef). This is very odd: can anyone think
>> >> of a reason why 'ceph fs status' would stop working after taking
>> >> each of the servers down for maintenance?
>> >>
>> >> The filesystem is online and working just fine, however. This Ceph
>> >> instance is deployed via the cephadm method on RHEL 9.3, so
>> >> everything is containerized in podman.
>> >>
>> >> Thanks again,
>> >> erich

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
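A minimal sketch of the targeted fix Paul describes above, assuming a
cephadm-managed cluster like Erich's; the MDS daemon name below is a made-up
placeholder, not one taken from this thread:

# List per-daemon MDS metadata and note which entries come back blank;
# Paul observes that they sometimes do, for reasons unknown:
ceph mds metadata

# Restart only the daemon(s) whose metadata is blank, rather than all of
# them. Look up the real daemon name with 'ceph orch ps --daemon-type mds'
# and substitute it for the placeholder:
ceph orch daemon restart mds.cephfs.pr-md-01.xxxxxx

# Alternatively, as Eugen suggests, try a mgr failover first; with no
# argument this fails over the active mgr to a standby:
ceph mgr fail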