Hi Adam,

In "cephadm ls" I found the following service, but I believe it was there
before as well:

{
    "style": "cephadm:v1",
    "name": "cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d",
    "fsid": "f270ad9e-1f6f-11ed-b6f8-a539d87379ea",
    "systemd_unit": "ceph-f270ad9e-1f6f-11ed-b6f8-a539d87379ea@cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d",
    "enabled": false,
    "state": "stopped",
    "container_id": null,
    "container_image_name": null,
    "container_image_id": null,
    "version": null,
    "started": null,
    "created": null,
    "deployed": null,
    "configured": null
},

Looks like the remove didn't work:

root@ceph1:~# ceph orch rm cephadm
Failed to remove service. <cephadm> was not found.

root@ceph1:~# ceph orch rm cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d
Failed to remove service. <cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d> was not found.
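
What I'm thinking of trying next is treating it as a stray daemon rather
than a service. This is only a rough, untested sketch on my side; the name
and fsid are the ones from the "cephadm ls" output above:

# untested: try removing the stray entry as a daemon instead of a service
ceph orch daemon rm cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d --force

# untested: or clean it up directly on the host that reports it
cephadm rm-daemon \
    --fsid f270ad9e-1f6f-11ed-b6f8-a539d87379ea \
    --name cephadm.7ce656a8721deb5054c37b0cfb90381522d521dde51fb0c5a2142314d663f63d \
    --force

Does that sound reasonable, or is there a safer way to clear it?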

On Fri, Sep 2, 2022 at 8:27 AM Adam King <adking@xxxxxxxxxx> wrote:

> This looks like an old traceback you would get if you ended up with a
> service type that shouldn't be there somehow. The things I'd probably
> check are that "cephadm ls" on either host definitely doesn't report any
> strange things that aren't actually daemons in your cluster, such as
> "cephadm.<hash>". Another thing you could maybe try, as I believe the
> assertion it's giving is for an unknown service type here
> ("AssertionError: cephadm"), is just "ceph orch rm cephadm", which would
> maybe cause it to remove whatever it thinks is this "cephadm" service
> that it has deployed. Lastly, you could try having the mgr you manually
> deploy be a 16.2.10 one instead of 15.2.17 (I'm assuming here, but the
> line numbers in that traceback suggest Octopus). The 16.2.10 one is just
> much less likely to have a bug that causes something like this.
>
> On Fri, Sep 2, 2022 at 1:41 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>
>> Now when I run "ceph orch ps" it works, but the following command throws
>> an error. Trying to bring up a second mgr with the "ceph orch apply mgr"
>> command didn't help either.
>>
>> root@ceph1:/ceph-disk# ceph version
>> ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
>>
>> root@ceph1:/ceph-disk# ceph orch ls
>> Error EINVAL: Traceback (most recent call last):
>>   File "/usr/share/ceph/mgr/mgr_module.py", line 1212, in _handle_command
>>     return self.handle_command(inbuf, cmd)
>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 140, in handle_command
>>     return dispatch[cmd['prefix']].call(self, cmd, inbuf)
>>   File "/usr/share/ceph/mgr/mgr_module.py", line 320, in call
>>     return self.func(mgr, **kwargs)
>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 102, in <lambda>
>>     wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 91, in wrapper
>>     return func(*args, **kwargs)
>>   File "/usr/share/ceph/mgr/orchestrator/module.py", line 503, in _list_services
>>     raise_if_exception(completion)
>>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 642, in raise_if_exception
>>     raise e
>> AssertionError: cephadm
>>
>> On Fri, Sep 2, 2022 at 1:32 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>>
>> > Never mind, I found the doc related to that, and I am able to get 1 mgr up:
>> > https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#manually-deploying-a-mgr-daemon
>> >
>> > On Fri, Sep 2, 2022 at 1:21 AM Satish Patel <satish.txt@xxxxxxxxx> wrote:
>> >
>> >> Folks,
>> >>
>> >> I am having a little fun time with cephadm, and it's very annoying to
>> >> deal with.
>> >>
>> >> I have deployed a ceph cluster using cephadm on two nodes. When I was
>> >> trying to upgrade, I noticed hiccups where it upgraded a single mgr to
>> >> 16.2.10 but not the other, so I started messing around and somehow
>> >> deleted both mgrs, thinking that cephadm would recreate them.
>> >>
>> >> Now I don't have a single mgr, so my ceph orch commands hang forever;
>> >> it looks like a chicken-and-egg issue.
>> >>
>> >> How do I recover from this? If I can't run the ceph orch command, I
>> >> won't be able to redeploy my mgr daemons.
>> >>
>> >> I am not able to find any mgr with the following command on either node:
>> >>
>> >> $ cephadm ls | grep mgr
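
For completeness, the recovery path I'm following from that troubleshooting
doc boils down to manually deploying one mgr with cephadm and then letting
the orchestrator take over again. Roughly like this; treat it as a sketch
rather than exact commands, since the image tag, the mgr daemon name, and
the config-json contents are placeholders:

# untested sketch: deploy a standalone mgr directly with cephadm, using a
# config-json file that contains the mgr keyring and a minimal config as
# described in the doc linked above (16.2.10 image per Adam's suggestion)
cephadm --image quay.io/ceph/ceph:v16.2.10 deploy \
    --fsid f270ad9e-1f6f-11ed-b6f8-a539d87379ea \
    --name mgr.ceph1.recovery \
    --config-json config-json.json

# once a mgr is running and "ceph orch" responds again, let the
# orchestrator place two mgrs across the hosts
ceph orch apply mgr 2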