Hi Adam,

In "cephadm ls" I found the following service, but I believe it was there
before as well:

{
    "style": "cephadm:v1",
    "name": "cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d",
    "fsid": "f270ad9e-1f6f-11ed-b6f8-a539d87379ea",
    "systemd_unit": "ceph-f270ad9e-1f6f-11ed-b6f8-a539d87379ea@cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d",
    "enabled": false,
    "state": "stopped",
    "container_id": null,
    "container_image_name": null,
    "container_image_id": null,
    "version": null,
    "started": null,
    "created": null,
    "deployed": null,
    "configured": null
},

Looks like the remove didn't work:

root@ceph1:~# ceph orch rm cephadm
Failed to remove service. <cephadm> was not found.

root@ceph1:~# ceph orch rm cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
Failed to remove service. <cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d> was not found.
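
What I'm thinking of trying next is treating it as a stray daemon rather
than a service. This is only a rough, untested sketch on my side; the name
and fsid are the ones from the "cephadm ls" output above:

# untested: try removing the stray entry as a daemon instead of a service
ceph orch daemon rm cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d --force

# untested: or clean it up directly on the host that reports it
cephadm rm-daemon \
    --fsid f270ad9e-1f6f-11ed-b6f8-a539d87379ea \
    --name cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d \
    --force

Does that sound reasonable, or is there a safer way to clear it?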

On Fri, Sep 2, 2022 at 8:27 AM Adam King <adking@xxxxxxxxxx> wrote:

> This looks like an old traceback you would get if you ended up with a
> service type that shouldn't be there somehow. The things I'd probably
> check are that "cephadm ls" on either host definitely doesn't report any
> strange things that aren't actually daemons in your cluster, such as
> "cephadm.<hash>". Another thing you could maybe try, as I believe the
> assertion it's giving is for an unknown service type here
> ("AssertionError: cephadm"), is just "ceph orch rm cephadm", which would
> maybe cause it to remove whatever it thinks is this "cephadm" service
> that it has deployed. Lastly, you could try having the mgr you manually
> deploy be a 16.2.10 one instead of 15.2.17 (I'm assuming here, but the
> line numbers in that traceback suggest Octopus). The 16.2.10 one is just
> much less likely to have a bug that causes something like this.
>
> On Fri, Sep 2, 2022 at 1:41 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>
>> Now when I run "ceph orch ps" it works, but the following command throws
>> an error. Trying to bring up a second mgr with the "ceph orch apply mgr"
>> command didn't help either.
>>
>> root@ceph1:/ceph-disk# ceph version
>> ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
>>
>> root@ceph1:/ceph-disk# ceph orch ls
>> Error EINVAL: Traceback (most recent call last):
>>   File "/usr/share/ceph/mgr/mgr_module.py", line 1212, in _handle_command
>>     return self.handle_command(inbuf, cmd)
>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 140, in handle_command
>>     return dispatch[cmd['prefix']].call(self, cmd, inbuf)
>>   File "/usr/share/ceph/mgr/mgr_module.py", line 320, in call
>>     return self.func(mgr, **kwargs)
>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 102, in <lambda>
>>     wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 91, in wrapper
>>     return func(*args, **kwargs)
>>   File "/usr/share/ceph/mgr/orchestrator/module.py", line 503, in _list_services
>>     raise_if_exception(completion)
>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 642, in raise_if_exception
>>     raise e
>> AssertionError: cephadm
>>
>> On Fri, Sep 2, 2022 at 1:32 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>
>> > Never mind, I found the doc related to that, and I am able to get 1 mgr up:
>> > https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#manually-deploying-a-mgr-daemon
>> >
>> > On Fri, Sep 2, 2022 at 1:21 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>> >
>> >> Folks,
>> >>
>> >> I am having a little fun time with cephadm, and it's very annoying to
>> >> deal with.
>> >>
>> >> I have deployed a ceph cluster using cephadm on two nodes. When I was
>> >> trying to upgrade, I noticed hiccups where it upgraded a single mgr to
>> >> 16.2.10 but not the other, so I started messing around and somehow
>> >> deleted both mgrs, thinking that cephadm would recreate them.
>> >>
>> >> Now I don't have a single mgr, so my ceph orch commands hang forever;
>> >> it looks like a chicken-and-egg issue.
>> >>
>> >> How do I recover from this? If I can't run the ceph orch command, I
>> >> won't be able to redeploy my mgr daemons.
>> >>
>> >> I am not able to find any mgr with the following command on either node:
>> >>
>> >> $ cephadm ls | grep mgr
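
For completeness, the recovery path I'm following from that troubleshooting
doc boils down to manually deploying one mgr with cephadm and then letting
the orchestrator take over again. Roughly like this; treat it as a sketch
rather than exact commands, since the image tag, the mgr daemon name, and
the config-json contents are placeholders:

# untested sketch: deploy a standalone mgr directly with cephadm, using a
# config-json file that contains the mgr keyring and a minimal config as
# described in the doc linked above (16.2.10 image per Adam's suggestion)
cephadm --image quay.io/ceph/ceph:v16.2.10 deploy \
    --fsid f270ad9e-1f6f-11ed-b6f8-a539d87379ea \
    --name mgr.ceph1.recovery \
    --config-json config-json.json

# once a mgr is running and "ceph orch" responds again, let the
# orchestrator place two mgrs across the hosts
ceph orch apply mgr 2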