Maybe also a "ceph orch ps --refresh"? It might still have the old cached
daemon inventory from before you removed the files.
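
For reference, a minimal sketch of the cleanup sequence discussed in the
quoted thread below, assuming the fsid and cephadm.<hash> values that appear
further down; substitute your own cluster's values:

    # Sketch only, using the values from this thread; adjust for your cluster.
    cephadm ls | grep 'cephadm\.'    # look for stray cephadm.<hash> entries that are not real daemons
    rm /var/lib/ceph/f270ad9e-1f6f-11ed-b6f8-a539d87379ea/cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
    ceph orch ps --refresh           # force the mgr to refresh its cached daemon inventory
    ceph orch ls                     # check whether the "AssertionError: cephadm" traceback is gone
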
On Fri, Sep 2, 2022 at 9:57 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:

> Hi Adam,
>
> I have deleted the file located here:
>
> rm /var/lib/ceph/f270ad9e-1f6f-11ed-b6f8-a539d87379ea/cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>
> But I am still getting the same error. Do I need to do anything else?
>
> On Fri, Sep 2, 2022 at 9:51 AM Adam King <adking@xxxxxxxxxx> wrote:
>
>> Okay, I'm wondering if this is an issue with a version mismatch:
>> having previously had a 16.2.10 mgr and now having a 15.2.17 one that
>> doesn't expect this sort of thing to be present. Either way, I'd think
>> just deleting this
>> cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>> file (and any others like it) would be the way forward to get orch ls
>> working again.
>>
>> On Fri, Sep 2, 2022 at 9:44 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>
>>> Hi Adam,
>>>
>>> In "cephadm ls" I found the following service, but I believe it was
>>> there before as well:
>>>
>>>     {
>>>         "style": "cephadm:v1",
>>>         "name": "cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d",
>>>         "fsid": "f270ad9e-1f6f-11ed-b6f8-a539d87379ea",
>>>         "systemd_unit": "ceph-f270ad9e-1f6f-11ed-b6f8-a539d87379ea@cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d",
>>>         "enabled": false,
>>>         "state": "stopped",
>>>         "container_id": null,
>>>         "container_image_name": null,
>>>         "container_image_id": null,
>>>         "version": null,
>>>         "started": null,
>>>         "created": null,
>>>         "deployed": null,
>>>         "configured": null
>>>     },
>>>
>>> Looks like the remove didn't work:
>>>
>>> root@ceph1:~# ceph orch rm cephadm
>>> Failed to remove service. <cephadm> was not found.
>>>
>>> root@ceph1:~# ceph orch rm cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>>> Failed to remove service.
>>> <cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d>
>>> was not found.
>>>
>>> On Fri, Sep 2, 2022 at 8:27 AM Adam King <adking@xxxxxxxxxx> wrote:
>>>
>>>> This looks like an old traceback you would get if you somehow ended
>>>> up with a service type that shouldn't be there. The first thing I'd
>>>> check is that "cephadm ls" on either host definitely doesn't report
>>>> any strange things that aren't actually daemons in your cluster,
>>>> such as "cephadm.<hash>". Another thing you could maybe try, since I
>>>> believe the assertion it's giving is for an unknown service type
>>>> here ("AssertionError: cephadm"), is just "ceph orch rm cephadm",
>>>> which might cause it to remove whatever it thinks this "cephadm"
>>>> service is that it has deployed. Lastly, you could try having the
>>>> mgr you manually deploy be a 16.2.10 one instead of 15.2.17 (I'm
>>>> assuming here, but the line numbers in that traceback suggest
>>>> Octopus). The 16.2.10 one is just much less likely to have a bug
>>>> that causes something like this.
>>>>
>>>> On Fri, Sep 2, 2022 at 1:41 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>>>
>>>>> Now when I run "ceph orch ps" it works, but the following command
>>>>> throws an error. Trying to bring up a second mgr using the "ceph
>>>>> orch apply mgr" command didn't help either.
>>>>>
>>>>> root@ceph1:/ceph-disk# ceph version
>>>>> ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
>>>>>
>>>>> root@ceph1:/ceph-disk# ceph orch ls
>>>>> Error EINVAL: Traceback (most recent call last):
>>>>>   File "/usr/share/ceph/mgr/mgr_module.py", line 1212, in _handle_command
>>>>>     return self.handle_command(inbuf, cmd)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 140, in handle_command
>>>>>     return dispatch[cmd['prefix']].call(self, cmd, inbuf)
>>>>>   File "/usr/share/ceph/mgr/mgr_module.py", line 320, in call
>>>>>     return self.func(mgr, **kwargs)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 102, in <lambda>
>>>>>     wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 91, in wrapper
>>>>>     return func(*args, **kwargs)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/module.py", line 503, in _list_services
>>>>>     raise_if_exception(completion)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 642, in raise_if_exception
>>>>>     raise e
>>>>> AssertionError: cephadm
>>>>>
>>>>> On Fri, Sep 2, 2022 at 1:32 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>>>>
>>>>> > Never mind, I found the doc related to that and I am able to get
>>>>> > one mgr up:
>>>>> > https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#manually-deploying-a-mgr-daemon
>>>>> >
>>>>> > On Fri, Sep 2, 2022 at 1:21 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>>>> >
>>>>> >> Folks,
>>>>> >>
>>>>> >> I am not having much fun with cephadm, and it's very annoying to
>>>>> >> deal with.
>>>>> >>
>>>>> >> I have deployed a Ceph cluster using cephadm on two nodes. When I
>>>>> >> was trying to upgrade, I noticed a hiccup where it upgraded only a
>>>>> >> single mgr to 16.2.10 but not the other, so I started messing
>>>>> >> around and somehow deleted both mgr daemons, thinking that cephadm
>>>>> >> would recreate them.
>>>>> >>
>>>>> >> Now I don't have a single mgr, so my ceph orch commands hang
>>>>> >> forever, and it looks like a chicken-and-egg issue.
>>>>> >>
>>>>> >> How do I recover from this? If I can't run the ceph orch command,
>>>>> >> I won't be able to redeploy my mgr daemons.
>>>>> >>
>>>>> >> I am not able to find any mgr with the following command on both
>>>>> >> nodes:
>>>>> >>
>>>>> >> $ cephadm ls | grep mgr
>>>>> >>
>>>>> >
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
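
Once a working mgr is back (for example via the manual-deployment steps in
the troubleshooting doc linked above), a rough sketch for restoring mgr
redundancy on this two-node cluster could look like the following; the
placement count of 2 and the 16.2.10 upgrade target are assumptions based on
this thread, not steps confirmed in it:

    # Sketch only; assumes the orchestrator is responsive again.
    ceph orch apply mgr 2        # ask cephadm to keep two mgr daemons
    ceph orch ps --refresh       # refresh the daemon inventory
    ceph orch ps | grep mgr      # confirm both mgr daemons are up
    # If both mgrs look healthy, the interrupted upgrade could then be resumed:
    ceph orch upgrade start --ceph-version 16.2.10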