Maybe also a "ceph orch ps --refresh"? It might still have the old cached
daemon inventory from before you removed the files.
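
For reference, a minimal sketch of the cleanup sequence discussed in the
quoted thread below, assuming the fsid and cephadm.<hash> values that appear
further down; substitute your own cluster's values:

    # Sketch only, using the values from this thread; adjust for your cluster.
    cephadm ls | grep 'cephadm\.'    # look for stray cephadm.<hash> entries that are not real daemons
    rm /var/lib/ceph/f270ad9e-1f6f-11ed-b6f8-a539d87379ea/cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
    ceph orch ps --refresh           # force the mgr to refresh its cached daemon inventory
    ceph orch ls                     # check whether the "AssertionError: cephadm" traceback is gone
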
On Fri, Sep 2, 2022 at 9:57 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:

> Hi Adam,
>
> I have deleted the file located here:
>
> rm /var/lib/ceph/f270ad9e-1f6f-11ed-b6f8-a539d87379ea/cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>
> But I am still getting the same error. Do I need to do anything else?
>
> On Fri, Sep 2, 2022 at 9:51 AM Adam King <adking@xxxxxxxxxx> wrote:
>
>> Okay, I'm wondering if this is an issue with a version mismatch:
>> having previously had a 16.2.10 mgr and now having a 15.2.17 one that
>> doesn't expect this sort of thing to be present. Either way, I'd think
>> just deleting this
>> cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>> file (and any others like it) would be the way forward to get orch ls
>> working again.
>>
>> On Fri, Sep 2, 2022 at 9:44 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>
>>> Hi Adam,
>>>
>>> In "cephadm ls" I found the following service, but I believe it was
>>> there before as well:
>>>
>>>     {
>>>         "style": "cephadm:v1",
>>>         "name": "cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d",
>>>         "fsid": "f270ad9e-1f6f-11ed-b6f8-a539d87379ea",
>>>         "systemd_unit": "ceph-f270ad9e-1f6f-11ed-b6f8-a539d87379ea@cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d",
>>>         "enabled": false,
>>>         "state": "stopped",
>>>         "container_id": null,
>>>         "container_image_name": null,
>>>         "container_image_id": null,
>>>         "version": null,
>>>         "started": null,
>>>         "created": null,
>>>         "deployed": null,
>>>         "configured": null
>>>     },
>>>
>>> Looks like the remove didn't work:
>>>
>>> root@ceph1:~# ceph orch rm cephadm
>>> Failed to remove service. <cephadm> was not found.
>>>
>>> root@ceph1:~# ceph orch rm cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
>>> Failed to remove service.
>>> <cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d>
>>> was not found.
>>>
>>> On Fri, Sep 2, 2022 at 8:27 AM Adam King <adking@xxxxxxxxxx> wrote:
>>>
>>>> This looks like an old traceback you would get if you somehow ended
>>>> up with a service type that shouldn't be there. The first thing I'd
>>>> check is that "cephadm ls" on either host definitely doesn't report
>>>> any strange things that aren't actually daemons in your cluster,
>>>> such as "cephadm.<hash>". Another thing you could maybe try, since I
>>>> believe the assertion it's giving is for an unknown service type
>>>> here ("AssertionError: cephadm"), is just "ceph orch rm cephadm",
>>>> which might cause it to remove whatever it thinks this "cephadm"
>>>> service is that it has deployed. Lastly, you could try having the
>>>> mgr you manually deploy be a 16.2.10 one instead of 15.2.17 (I'm
>>>> assuming here, but the line numbers in that traceback suggest
>>>> Octopus). The 16.2.10 one is just much less likely to have a bug
>>>> that causes something like this.
>>>>
>>>> On Fri, Sep 2, 2022 at 1:41 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>>>
>>>>> Now when I run "ceph orch ps" it works, but the following command
>>>>> throws an error. Trying to bring up a second mgr using the "ceph
>>>>> orch apply mgr" command didn't help either.
>>>>>
>>>>> root@ceph1:/ceph-disk# ceph version
>>>>> ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
>>>>>
>>>>> root@ceph1:/ceph-disk# ceph orch ls
>>>>> Error EINVAL: Traceback (most recent call last):
>>>>>   File "/usr/share/ceph/mgr/mgr_module.py", line 1212, in _handle_command
>>>>>     return self.handle_command(inbuf, cmd)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 140, in handle_command
>>>>>     return dispatch[cmd['prefix']].call(self, cmd, inbuf)
>>>>>   File "/usr/share/ceph/mgr/mgr_module.py", line 320, in call
>>>>>     return self.func(mgr, **kwargs)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 102, in <lambda>
>>>>>     wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 91, in wrapper
>>>>>     return func(*args, **kwargs)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/module.py", line 503, in _list_services
>>>>>     raise_if_exception(completion)
>>>>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 642, in raise_if_exception
>>>>>     raise e
>>>>> AssertionError: cephadm
>>>>>
>>>>> On Fri, Sep 2, 2022 at 1:32 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>>>>
>>>>> > Never mind, I found the doc related to that and I am able to get
>>>>> > one mgr up:
>>>>> > https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#manually-deploying-a-mgr-daemon
>>>>> >
>>>>> > On Fri, Sep 2, 2022 at 1:21 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>>>> >
>>>>> >> Folks,
>>>>> >>
>>>>> >> I am not having much fun with cephadm, and it's very annoying to
>>>>> >> deal with.
>>>>> >>
>>>>> >> I have deployed a Ceph cluster using cephadm on two nodes. When I
>>>>> >> was trying to upgrade, I noticed a hiccup where it upgraded only a
>>>>> >> single mgr to 16.2.10 but not the other, so I started messing
>>>>> >> around and somehow deleted both mgr daemons, thinking that cephadm
>>>>> >> would recreate them.
>>>>> >>
>>>>> >> Now I don't have a single mgr, so my ceph orch commands hang
>>>>> >> forever, and it looks like a chicken-and-egg issue.
>>>>> >>
>>>>> >> How do I recover from this? If I can't run the ceph orch command,
>>>>> >> I won't be able to redeploy my mgr daemons.
>>>>> >>
>>>>> >> I am not able to find any mgr with the following command on both
>>>>> >> nodes:
>>>>> >>
>>>>> >> $ cephadm ls | grep mgr
>>>>> >>
>>>>> >
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
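
Once a working mgr is back (for example via the manual-deployment steps in
the troubleshooting doc linked above), a rough sketch for restoring mgr
redundancy on this two-node cluster could look like the following; the
placement count of 2 and the 16.2.10 upgrade target are assumptions based on
this thread, not steps confirmed in it:

    # Sketch only; assumes the orchestrator is responsive again.
    ceph orch apply mgr 2        # ask cephadm to keep two mgr daemons
    ceph orch ps --refresh       # refresh the daemon inventory
    ceph orch ps | grep mgr      # confirm both mgr daemons are up
    # If both mgrs look healthy, the interrupted upgrade could then be resumed:
    ceph orch upgrade start --ceph-version 16.2.10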